CN1286727A - Neisserial antigens
- Publication number
- CN1286727A (application CN98812844A)
- Authority
- CN
- China
- Prior art keywords
- sequence
- people
- protein
- gene
- dna
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/195—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
- C07K14/22—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Neisseriaceae (F)
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P31/00—Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
- A61P31/04—Antibacterial agents
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K38/00—Medicinal preparations containing peptides
Abstract
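The invention provides proteins from Neisseria meningitidis (strains A and B) and from Neisseria gonorrhoeae, including amino acid sequences, the corresponding nucleotide sequences, expression data, and serological data. The proteins are useful antigens for vaccines, immunogenic compositions, and/or diagnostics.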
Description
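This invention relates to bacteria of the genus Neisseria.
Background art
Neisseria meningitidis and Neisseria gonorrhoeae are non-motile, Gram-negative diplococci that are pathogenic in humans. N. meningitidis colonises the pharynx and causes meningitis (and, occasionally, septicaemia in the absence of meningitis); N. gonorrhoeae colonises the genital tract and causes gonorrhoea. Although the two pathogens colonise different areas of the body and cause completely different diseases, they are closely related. One feature that clearly differentiates meningococcus from gonococcus, however, is the polysaccharide capsule present in all pathogenic meningococci.
Between 1983 and 1990, N. gonorrhoeae caused approximately 800,000 cases per year in the United States alone (Meitzner & Cohen, "Vaccines Against Gonococcal Infection", in New Generation Vaccines, 2nd edition, ed. Levine, Woodrow, Kaper & Cobon, Marcel Dekker, New York, 1997, pp. 817-842). The disease causes significant morbidity but limited mortality. Vaccination against N. gonorrhoeae would be highly desirable, but repeated attempts have failed. The main candidate antigens for such a vaccine are surface-exposed proteins such as pili, porins, opacity-associated proteins (Opas), and other surface-exposed proteins such as Lip, Laz, IgA1 protease and transferrin-binding proteins. Lipooligosaccharide (LOS) has also been suggested as a vaccine (Meitzner & Cohen, supra).
N. meningitidis causes both endemic and epidemic disease. In the United States the attack rate is 0.6-1 per 100,000 persons per year, and it can be much greater during outbreaks (see Lieberman et al. (1996) "Safety and Immunogenicity of a Serogroups A/C Neisseria meningitidis Oligosaccharide-Protein Conjugate Vaccine in Young Children", JAMA 275(19):1499-1503; Schuchat et al. (1997) "Bacterial Meningitis in the United States in 1995", N Engl J Med 337(14):970-976). In developing countries, endemic disease rates are much higher, and during epidemics incidence can reach 500 cases per 100,000 persons per year. Mortality is high: 10-20% in the United States, and much higher in developing countries. Following the introduction of the conjugate vaccine against Haemophilus influenzae, N. meningitidis is the major cause of bacterial meningitis at all ages in the United States (Schuchat et al. (1997) supra).
Twelve serogroups of N. meningitidis have been identified on the basis of the organism's capsular polysaccharide. Group A is the pathogen most often implicated in epidemic disease in sub-Saharan Africa. Serogroups B and C are responsible for the vast majority of cases in the United States and in most developed countries. Serogroups W135 and Y are responsible for the remaining cases in the United States and developed countries. The meningococcal vaccine currently in use is a tetravalent polysaccharide vaccine composed of serogroups A, C, Y and W135. Although efficacious in adolescents and adults, it induces a poor immune response and a short duration of protection, and cannot be used in infants [e.g. Morbidity and Mortality Weekly Report, Vol. 46, No. RR-5 (1997)]. This is because polysaccharides are T-cell-independent antigens that induce a weak immune response which cannot be boosted by repeated immunisation. Following the success of vaccination against H. influenzae, conjugate vaccines against serogroups A and C have been developed and are at the final stage of clinical testing (Zollinger WD "New and Improved Vaccines Against Meningococcal Disease", in: New Generation Vaccines, supra, pp. 469-488; Lieberman et al. (1996) supra; Costantino et al. (1992) "Development and phase I clinical testing of a conjugate vaccine against meningococcus A and C", Vaccine 10:691-698).
Meningococcus B remains a problem, however. This serogroup is currently responsible for approximately 50% of total meningitis in the United States, Europe, and South America. The polysaccharide approach cannot be used because the menB capsular polysaccharide is a polymer of α(2-8)-linked N-acetyl neuraminic acid that is also present in mammalian tissue. This results in tolerance to the antigen; indeed, if an immune response were elicited, it would be anti-self, and therefore undesirable. In order to avoid induction of autoimmunity and to induce a protective immune response, the capsular polysaccharide has, for instance, been chemically modified by substituting the N-acetyl groups with N-propionyl groups, leaving the specific antigenicity unaltered (Romero & Outschoorn (1994) "Current status of Meningococcal group B vaccine candidates: capsular or non-capsular?" Clin Microbiol Rev 7(4):559-575).
An alternative approach to menB vaccines has used complex mixtures of outer membrane proteins (OMPs), containing either the OMPs alone, or OMPs enriched in porins, or OMPs depleted of the class 4 OMPs that are believed to induce antibodies that block bactericidal activity. The vaccines produced by this approach are not well characterised. They are able to protect against the homologous strain, but are generally not effective where there are many antigenic variants of the outer membrane proteins. To overcome the antigenic variability, multivalent vaccines containing up to nine different porins have been constructed (e.g. Poolman JT (1992) "Development of a meningococcal vaccine" Infect. Agents Dis. 4:13-28). Additional proteins used in outer membrane vaccines have been the opa and opc proteins, but none of these approaches has been able to overcome the antigenic variability (e.g. Ala'Aldeen & Borriello (1996) "The meningococcal transferrin-binding proteins 1 and 2 are both surface exposed and generate bactericidal antibodies capable of killing homologous and heterologous strains" Vaccine 14(1):49-53).
A certain amount of sequence data is available for meningococcal and gonococcal genes and proteins (e.g. EP-A-0467714, WO96/29412), but this is by no means complete. The provision of further sequences could provide an opportunity to identify secreted or surface-exposed proteins that are presumed targets for the immune system and which are not antigenically variable. For instance, some of the identified proteins could be components of efficacious vaccines against meningococcus B, some could be components of vaccines against all meningococcal serotypes, and others could be components of vaccines against all pathogenic Neisseriae.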
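The invention
The invention provides proteins comprising the Neisserial amino acid sequences disclosed in the examples. These sequences relate to N. meningitidis or N. gonorrhoeae.
It also provides proteins comprising sequences homologous (i.e. having sequence identity) to the Neisserial amino acid sequences disclosed in the examples. Depending on the particular sequence, the degree of identity is preferably greater than 50% (e.g. 65%, 80%, 90%, or more). These homologous proteins include mutants and allelic variants of the sequences disclosed in the examples. Typically, 50% identity or more between two proteins is considered to be an indication of functional equivalence. Identity between the proteins is preferably determined by the Smith-Waterman homology search algorithm as implemented in the MPSRCH program (Oxford Molecular), using an affine gap search with parameters gap open penalty = 12 and gap extension penalty = 1.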
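The identity determination described above can be illustrated with a minimal sketch of a Smith-Waterman local alignment with affine gaps (Gotoh's formulation). This is not the MPSRCH implementation. Several choices here are assumptions made only for illustration: the flat match/mismatch scores of +5/-4 stand in for whatever substitution matrix the program uses, a gap of length k is assumed to cost `gap_open + (k - 1) * gap_extend`, and percent identity is taken over the aligned columns.

```python
def smith_waterman_affine(a, b, match=5, mismatch=-4, gap_open=12, gap_extend=1):
    """Local alignment with affine gaps; returns (score, aligned_a, aligned_b)."""
    NEG = float("-inf")
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]    # best local score ending at (i, j)
    E = [[NEG] * (m + 1) for _ in range(n + 1)]  # alignment ends with a gap in a
    F = [[NEG] * (m + 1) for _ in range(n + 1)]  # alignment ends with a gap in b
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(H[i][j - 1] - gap_open, E[i][j - 1] - gap_extend)
            F[i][j] = max(H[i - 1][j] - gap_open, F[i - 1][j] - gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s, E[i][j], F[i][j])
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell, tracking which matrix we are in.
    i, j = best_pos
    out_a, out_b, state = [], [], "H"
    while i > 0 and j > 0:
        if state == "H":
            if H[i][j] == 0:
                break
            s = match if a[i - 1] == b[j - 1] else mismatch
            if H[i][j] == H[i - 1][j - 1] + s:
                out_a.append(a[i - 1]); out_b.append(b[j - 1])
                i -= 1; j -= 1
            elif H[i][j] == E[i][j]:
                state = "E"
            else:
                state = "F"
        elif state == "E":
            out_a.append("-"); out_b.append(b[j - 1])
            if E[i][j] == H[i][j - 1] - gap_open:  # the gap was opened here
                state = "H"
            j -= 1
        else:
            out_a.append(a[i - 1]); out_b.append("-")
            if F[i][j] == H[i - 1][j] - gap_open:
                state = "H"
            i -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical, non-gap columns as a percentage of all aligned columns."""
    if not aligned_a:
        return 0.0
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)
```

With these placeholder scores, two hypothetical peptides that differ at one of nine positions align end to end at roughly 89% identity, which would exceed the 50% threshold stated above. A real comparison would use the program and parameters named in the text.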
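The invention further provides proteins comprising fragments of the Neisserial amino acid sequences disclosed in the examples. The fragments should comprise at least n consecutive amino acids from the sequences and, depending on the particular sequence, n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20 or more). Preferably the fragments comprise an epitope from the sequence.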
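The "at least n consecutive amino acids" criterion can be sketched as a simple window comparison. The function name and the example sequences below are hypothetical, and the check says nothing about whether the shared stretch is an epitope.

```python
def shares_fragment(reference, candidate, n=7):
    """True if candidate contains at least n consecutive residues of reference."""
    if n <= 0:
        raise ValueError("n must be positive")
    # Every length-n window of the reference sequence.
    windows = {reference[i:i + n] for i in range(len(reference) - n + 1)}
    # Any run of n or more shared residues contains a shared window of length n.
    return any(candidate[j:j + n] in windows
               for j in range(len(candidate) - n + 1))
```

The same function applies unchanged to nucleotide sequences, for which the text gives a minimum of n = 10.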
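The proteins of the invention can, of course, be prepared by various means (e.g. recombinant expression, purification from cell culture, chemical synthesis etc.) and in various forms (e.g. native, fusions etc.). They are preferably prepared in substantially pure or isolated form (i.e. substantially free from other Neisserial or host cell proteins).
According to a further aspect, the invention provides antibodies which bind to these proteins. These may be polyclonal or monoclonal and may be produced by any suitable means.
According to a further aspect, the invention provides nucleic acid comprising the Neisserial nucleotide sequences disclosed in the examples. In addition, the invention provides nucleic acid comprising sequences homologous (i.e. having sequence identity) to the Neisserial nucleotide sequences disclosed in the examples.
Furthermore, the invention provides nucleic acid which can hybridise to the Neisserial nucleic acid disclosed in the examples, preferably under "high stringency" conditions (65°C in a 0.1x SSC, 0.5% SDS solution).
Nucleic acid comprising fragments of these sequences is also provided. These should comprise at least n consecutive nucleotides from the Neisserial sequences and, depending on the particular sequence, n is 10 or more (e.g. 12, 14, 15, 18, 20, 25, 30, 35, 40 or more).
According to a further aspect, the invention provides nucleic acid encoding the proteins and protein fragments of the invention.
It should also be appreciated that the invention provides nucleic acid comprising sequences complementary to those described above (e.g. for antisense or probing purposes).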
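As a small illustration of a complementary sequence, the sketch below computes the reverse complement of a DNA strand, which is the sequence a probe or antisense strand would need in order to base-pair with it. It handles only the four unambiguous DNA bases; the function name is illustrative.

```python
# Translation table for Watson-Crick pairing of unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]
```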
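Nucleic acid according to the invention can, of course, be prepared in many ways (e.g. by chemical synthesis, from genomic or cDNA libraries, from the organism itself etc.) and can take various forms (e.g. single stranded, double stranded, vectors, probes etc.).
In addition, the term "nucleic acid" includes DNA and RNA, and also their analogues, such as those containing modified backbones, and also peptide nucleic acids (PNA) etc.
According to a further aspect, the invention provides vectors comprising nucleotide sequences of the invention (e.g. expression vectors) and host cells transformed with such vectors.
According to a further aspect, the invention provides compositions comprising protein, antibody, and/or nucleic acid according to the invention. These compositions may be suitable as vaccines, for instance, or as diagnostic reagents, or as immunogenic compositions.
The invention also provides nucleic acid, protein, or antibody according to the invention for use as medicaments (e.g. as vaccines) or as diagnostic reagents. It also provides the use of nucleic acid, protein, or antibody according to the invention in the manufacture of: (i) a medicament for treating or preventing infection due to Neisserial bacteria; (ii) a diagnostic reagent for detecting the presence of Neisserial bacteria or of antibodies raised against Neisserial bacteria; and/or (iii) a reagent which can raise antibodies against Neisserial bacteria. Said Neisserial bacteria may be any species or strain (such as N. gonorrhoeae, or any strain of N. meningitidis, such as strain A, strain B or strain C).
The invention also provides a method of treating a patient, comprising administering to the patient a therapeutically effective amount of nucleic acid, protein, and/or antibody according to the invention.
According to further aspects, the invention provides various processes.
A process for producing proteins of the invention is provided, comprising the step of culturing a host cell according to the invention under conditions which induce protein expression.
A process for producing protein or nucleic acid of the invention is provided, wherein the protein or nucleic acid is synthesised in part or in whole using chemical means.
A process for detecting polynucleotides of the invention is provided, comprising the steps of: (a) contacting a nucleic acid probe according to the invention with a biological sample under hybridising conditions to form duplexes; and (b) detecting said duplexes.
A process for detecting proteins of the invention is provided, comprising the steps of: (a) contacting an antibody according to the invention with a biological sample under conditions suitable for the formation of antibody-antigen complexes; and (b) detecting said complexes.
A summary of standard techniques and procedures which may be employed in order to perform the invention (e.g. to utilise the disclosed sequences for vaccination or diagnostic purposes) follows. This summary is not a limitation on the invention but, rather, gives examples that may be used, but are not required.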
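General
The practice of the present invention will employ, unless otherwise indicated, conventional techniques of molecular biology, microbiology, recombinant DNA, and immunology, which are within the skill of the art. Such techniques are explained fully in the literature, e.g. Sambrook, Molecular Cloning: A Laboratory Manual, Second Edition (1989); DNA Cloning, Volumes I and II (D.N. Glover ed. 1985); Oligonucleotide Synthesis (M.J. Gait ed. 1984); Nucleic Acid Hybridization (B.D. Hames & S.J. Higgins eds. 1984); Transcription and Translation (B.D. Hames & S.J. Higgins eds. 1984); Animal Cell Culture (R.I. Freshney ed. 1986); Immobilized Cells and Enzymes (IRL Press, 1986); B. Perbal, A Practical Guide to Molecular Cloning (1984); the Methods in Enzymology series (Academic Press, Inc.), especially volumes 154 and 155; Gene Transfer Vectors for Mammalian Cells (J.H. Miller and M.P. Calos eds. 1987, Cold Spring Harbor Laboratory); Mayer and Walker, eds. (1987), Immunochemical Methods in Cell and Molecular Biology (Academic Press, London); Scopes (1987) Protein Purification: Principles and Practice, Second Edition (Springer-Verlag, N.Y.); and Handbook of Experimental Immunology, Volumes I-IV (D.M. Weir and C.C. Blackwell eds. 1986).
Standard abbreviations for nucleotides and amino acids are used in this specification.
All publications, patents, and patent applications cited herein are incorporated by reference. In particular, the contents of UK patent applications 9723516.2, 9724190.5, 9724386.9, 9725158.1, 9726147.3, 9800759.4, and 9819016.8 are incorporated herein by reference.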
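Definitions
A composition containing X is "substantially free of" Y when at least 85% by weight of the total X+Y in the composition is X. Preferably, X comprises at least about 90% by weight of the total of X+Y in the composition, more preferably at least about 95% or even 99% by weight.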
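The weight-fraction definition above reduces to a one-line check, sketched below. The function name is illustrative, and the default threshold of 0.85 is the minimum stated in the definition; the preferred levels (0.90, 0.95, 0.99) can be passed instead.

```python
def substantially_free(weight_x, weight_y, threshold=0.85):
    """True if X makes up at least `threshold` of the combined X+Y weight."""
    total = weight_x + weight_y
    if weight_x < 0 or weight_y < 0 or total <= 0:
        raise ValueError("weights must be non-negative with a positive total")
    return weight_x / total >= threshold
```

For example, a preparation with 90 mg of X and 10 mg of Y meets the basic definition, while one with 80 mg of X and 20 mg of Y does not.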
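The term "comprising" means "including" as well as "consisting of", e.g. a composition "comprising" X may consist exclusively of X or may include something additional to X, such as X+Y.
The term "heterologous" refers to two biological components that are not found together in nature. The components may be host cells, genes, or regulatory regions, such as promoters. Although the heterologous components are not found together in nature, they can function together, as when a promoter heterologous to a gene is operably linked to the gene. Another example is where a Neisserial sequence is heterologous to a mouse host cell. A further example would be two epitopes from the same or different proteins which have been assembled in a single protein in an arrangement not found in nature.
An "origin of replication" is a polynucleotide sequence that initiates and regulates replication of polynucleotides, such as an expression vector. The origin of replication behaves as an autonomous unit of polynucleotide replication within a cell, capable of replication under its own control. An origin of replication may be needed for a vector to replicate in a particular host cell. With certain origins of replication, an expression vector can be reproduced at a high copy number in the presence of the appropriate proteins within the cell. Examples of origins are the autonomously replicating sequences, which are effective in yeast; and the viral T-antigen, effective in COS-7 cells.
A "mutant" sequence is defined as a DNA, RNA or amino acid sequence differing from but having sequence identity with the native or disclosed sequence. Depending on the particular sequence, the degree of sequence identity between the native or disclosed sequence and the mutant sequence is preferably greater than 50% (e.g. 60%, 70%, 80%, 90%, 95%, 99% or more, calculated using the Smith-Waterman algorithm as described above). As used herein, an "allelic variant" of a nucleic acid molecule, or region, for which nucleic acid sequence is provided herein is a nucleic acid molecule, or region, that occurs essentially at the same locus in the genome of another or second isolate, and that, due to natural variation caused by, for example, mutation or recombination, has a similar but not identical nucleic acid sequence. A coding region allelic variant typically encodes a protein having similar activity to that of the protein encoded by the gene to which it is being compared. An allelic variant can also comprise an alteration in the 5' or 3' untranslated regions of the gene, such as in regulatory control regions (e.g. see US patent 5,753,235).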
表达系统
奈瑟球菌核苷酸序列可在各种不同的表达系统中表达;例如和哺乳动物细胞、杆状病毒、植物、细菌和酵母一起使用的那些系统。
ⅰ.哺乳动物系统
哺乳动物表达系统是本领域中已知的。哺乳动物启动子是能结合哺乳动物RNA聚合酶并启动下游(3′)编码序列(如结构基因)转录成mRNA的任何DNA序列。启动子具有一个转录起始区,其通常邻近编码序列的5′端,还具有一个TATA盒,其通常位于转录起始位点上游25-30个碱基对(bp)处。认为TATA盒指导RNA聚合酶Ⅱ在正确位点开始RNA合成。哺乳动物启动子还含有一个上游启动子元件,其通常位于TATA盒上游100至200bp内。该上游启动子元件决定了转录启动的速度,并可在两个方向之一上起作用[Sambrook等人(1989)“克隆基因在哺乳动物细胞中的表达”《分子克隆实验指南》,第2版]。
哺乳动物病毒基因通常是高表达的,具有宽的宿主范围;因此,编码哺乳动物病毒基因的序列提供了特别有用的启动子序列。例子包括SV40早期启动子、小鼠乳房肿瘤病毒LTR启动子、腺病毒主要晚期启动子(Ad MLP)以及单纯疱疹病毒启动子。另外,从非病毒基因(如鼠金属硫蛋白基因)衍生的序列也提供了有用的启动子序列。表达可以是组成型的或受调控的(诱导的),这取决于该启动子能否在激素反应性细胞中用促糖皮质激素诱导。
增强元件(增强子)的存在,联合上述启动子元件通常会提高表达水平。增强子是这样一种调控性DNA序列,当其与同源或异源启动子相连,合成在正常的RNA起始位点开始时,它能刺激转录提高1000倍。当增强子位于转录起始位点的上游或下游,处于正常或翻转方向,或距离启动子1000个核苷酸以上的距离时,它均具有活性[Maniatis等人(1987)Science 236:1237;Alberts等人(1989)《细胞分子生物学》,第2版]。从病毒衍生获得的增强子元件可能是特别有用的,因为它们通常具有较宽的宿主范围。例子包括SV40早期基因增强子[Dijkema等人(1985)EMBO J.4:761]以及衍生自Rous肉瘤病毒的长末端重复序列(LTR)的增强子/启动子[Gorman等人(1982b)Proc.Natl.Acad.Sci.79:6777]以及来自人巨细胞病毒的增强子/启动子[Boshart等人(1985)Cell 41:521]。另外,一些增强子仅仅在诱导物(例如激素或金属离子)的存在下是可调节的并具有活性[Sassone-Corsi和Borelli(1986)Trends Genet.2:215;Maniatis等人(1987)Science 236:1237]。
DNA分子可在哺乳动物细胞中胞内表达。启动子序列可以和DNA分子直接相连,在这种情况下,重组蛋白的N端第一个氨基酸始终是甲硫氨酸,其由ATG起始密码子编码。如果需要,可通过和溴化氰体外培育来从蛋白上切下N端。
另外,外来蛋白也可从细胞中分泌到生长培养基中,方法是产生嵌合的DNA分子,该DNA分子编码的融合蛋白包括一前导序列片段,该片段在哺乳动物细胞中提供了外源蛋白的分泌。较佳的,在前导序列片段和外源基因之间可以有能在体内或体外断裂的加工位点。前导序列片段通常编码一种信号肽,该信号肽包含指导蛋白分泌出细胞的疏水性氨基酸。腺病毒三联前导序列是哺乳动物细胞中分泌外来蛋白的一个前导序列例子。
通常,哺乳动物细胞识别的转录终止和聚腺苷酸化序列是位于翻译终止密码子3′的调控区域,因此它和启动子元件一起连接在编码序列的侧面。成熟mRNA的3′端由定点的转录后断裂和聚腺苷酸化形成[Bimstiel等人(1985)Cell 41:349;Proudfoot和Whitelaw(1988)"真核RNA的终止和3′端加工"《转录和剪接》(B.D.Hames和D.M.Glover编辑);Proudfoot(1989)Trends Biochem.Sci.14:105]。这些序列指导mRNA的转录,mRNA能被翻译成该DNA编码的多肽。转录终止子/聚腺苷酸化信号的例子包括从SV40获得的那些[Sambrook等人(1989)“克隆基因在培养的哺乳动物细胞中的表达”《分子克隆实验指南》]。
通常,上述组件,包括启动子、聚腺苷酸化信号以及转录终止序列被一起放在表达构建物中。如果需要,该表达构建物中还包括增强子、具有功能性剪接供体体和受体位点的内含子以及前导序列。表达构建物通常以复制子形式维持,例如是能在宿主(如哺乳动物细胞或细菌)中稳定维持的染色体外元件(如质粒)。哺乳动物复制系统包括从动物病毒衍生的那些系统,其需要反式作用因子来进行复制。例如,含有乳多空病毒复制系统的质粒,如SV40[Gluzman(1981)Cell 23:175]或多瘤病毒,在合适的病毒T抗原存在下复制出极高的拷贝数。哺乳动物复制子的其它例子包括衍生自牛乳头瘤病毒和EB病毒的复制子。另外,复制子可以有两个复制系统,从而使其能维持在例如哺乳动物细胞中进行表达并能在原核宿主中克隆和扩增。这些哺乳动物细菌穿梭载体的例子包括pMT2[Kaufman等人(1989)Mol.Cell.Biol.9:946]和pHEBO[Shimizu等人(1986)Mol.Cell.Biol.6:1074]。
所用的转化程序取决于待转化的宿主。将异源多核苷酸导入哺乳动物细胞中的方法是本领域所知的,其包括葡聚糖介导的转染、磷酸钙沉淀、Polybrene(1,5-二甲基-1,5-二氮十一亚甲基聚甲溴化物)介导的转染、原生质体融合、电穿孔、将多核苷酸包裹在脂质体中以及将DNA直接显微注射到胞核中。
可作为宿主进行表达的哺乳动物细胞系是本领域中已知的,其包括许多从美国典型培养物保藏中心(ATCC)获得的无限增殖细胞系,包括但不局限于,中国仓鼠卵巢(CHO)细胞、海拉细胞、幼仓鼠肾(BHK)细胞、猴肾细胞(COS)、人肝细胞癌细胞(如Hep G2)和其它许多细胞系。
ⅱ.杆状病毒系统
编码蛋白质的多核苷酸也可插入合适的昆虫表达载体中,并与该载体中的控制元件操作性相连。载体构建采用本领域已知的技术。总地来说,表达系统的组分包括一种转移载体,通常是细菌质粒,其含有杆状病毒基因组片段以及便于插入待表达异源基因的限制性位点;野生型杆状病毒,其序列与转移载体中的杆状病毒特异性片段同源(这使得异源基因能同源重组到杆状病毒基因组中);以及合适的昆虫宿主细胞和生长培养基。
在将编码蛋白质的DNA序列插入转移载体中后,将载体和野生型病毒基因组转染到昆虫宿主细胞中,使载体和病毒基因组重组。表达包装的重组病毒,鉴定并纯化重组噬斑。杆状病毒/昆虫细胞表达系统材料及其方法,除别的以外,可以试剂盒形式购自Invitrogen,San Diego CA("MaxBac"试剂盒)。这些技术通常是本领域技术人员所知的,在Summers和Smith的Texas Agricultural Experiment Station Bulletin No.1555(1987)(后称“Summer和Smith的文章”)中有充分描述。
在将编码蛋白质的DNA序列插入杆状病毒基因组之前,通常将上述组件,包括启动子、前导序列(如果需要)、感兴趣的编码序列以及转录终止序列装配在中间置换型构建物(转移载体)中。该构建物可含有单个基因以及操作性相连的调控元件;多个基因,每个基因有其自己的操作性相连调控元件;或是由同一组调控元件调控的多个基因。中间置换型构建物通常保持在一个复制子中,例如能在宿主(如细菌)内稳定保持的染色体外元件(如质粒)。复制子将具有一个复制系统,从而使其能保持在合适的宿主中进行克隆和扩增。
目前,用来将外源基因导入AcNPV的最常用的转移载体是pAc373。还可设计本领域技术人员已知的其它许多载体。这些载体例如包括,pVL985(其将多角体蛋白的起始密码子从ATG变为ATT,在ATT下游32个碱基对处引入一个BamHⅠ克隆位点;见Luckow和Summers,Virology(1989)17:31)。
质粒通常还含有多角体蛋白聚腺苷酸化信号(Miller等人(1988)Ann.Rev.Microbiol.,42:177)以及用来在大肠杆菌中选择和繁殖的原核氨苄青霉素抗性(amp)基因和复制起点。
杆状病毒转移载体通常含有杆状病毒启动子。杆状病毒启动子是能结合杆状病毒RNA聚合酶并启动下游(5′到3′)编码序列(如结构基因)转录成mRNA的DNA序列。启动子具有一个转录起始区,该区通常邻近编码序列的5′端。该转录起始区通常包括一个RNA聚合酶结合位点以及一个转录起始位点。杆状病毒转移载体还可能有称为增强子的第二个区,如果该区域存在,它通常在结构基因的远端。表达可以是调控的或组成型的。
在病毒感染周期晚期大量转录的结构基因提供特别有用的启动子序列。例子包括从编码病毒多角体蛋白的基因衍生获得的序列,Friesen等人(1986)“杆状病毒基因表达的调控”《杆状病毒分子生物学》(Walter Doerfler编辑);EPO公开号127 839和155 476;以及编码p10蛋白的基因,Vlak等人(1988),J.Gen.Virol.69:765。
编码合适的信号序列的DNA可以衍生自分泌的昆虫或杆状病毒蛋白(如杆状病毒多角体蛋白基因)的基因(Carbonell等人,(1988)Gene,73:409)。另外,由于哺乳动物细胞翻译后修饰的信号(如信号肽断裂、蛋白水解断裂和磷酸化)看来可被昆虫细胞识别,且分泌和胞核积累所需的信号看来在非脊椎动物细胞和脊椎动物细胞之间是保守的,因此也可用非昆虫来源的前导序列来提供昆虫中的分泌,这些前导序列例如是从编码人α-干扰素(Maeda等人(1985),Nature 315:592)、人胃泌素释放的肽(Lebacq-Verheyden等人(1988),Molec.Cell.Biol.8:3129)、人IL-2(Smith等人(1985)PNAS,82:8404)、小鼠IL-3(Miyajima等人(1987)Gene 58:273)和人葡糖脑苷脂酶(Martin等人(1988)DNA,7:99)的基因衍生获得的。
重组多肽或多蛋白可以在胞内表达,或如果它和合适的调控序列一起表达,它可被分泌。非融合的外源蛋白的良好的胞内表达理想的通常需要具有短前导序列的异源基因在ATG起始信号前有合适的翻译起始信号。如果需要,可通过和溴化氰体外培育来从成熟蛋白上切下N端甲硫氨酸。
另外,可通过产生嵌合的DNA分子将非天然分泌的重组聚蛋白或蛋白从昆虫细胞中分泌出来,该嵌合的DNA分子所编码的融合蛋白包含一前导序列片段,该片段提供了昆虫中分泌外源蛋白的作用。该前导序列片段通常编码一种信号肽,该信号肽包含的疏水性氨基酸指导蛋白质转移到内质网中。
在插入了编码该蛋白表达产物前体的DNA序列和/或基因后,用转移载体的异源DNA和野生型杆状病毒的基因组DNA共同转化(通常是共转染)昆虫细胞宿主。构建物的启动子和转录终止序列通常包含2-5kb的杆状病毒基因组片段。将异源DNA引入杆状病毒中所需位点内的方法是本领域所知的。(见Summers和Smith的文章,同上;Ju等人(1987);Smith等人,Mol.Cell.Biol.(1983)3:2156;和Luckow和Summers(1989))。例如,插入可以是通过同源双交换重组来插入基因如多角体蛋白基因中;插入还可以是插入工程改造入所需杆状病毒基因内的限制性酶切位点中。Miller等人(1989),Bioessays 4:91。当DNA序列被克隆在表达载体多角体蛋白基因位置中后,其5′和3′均侧接了多角体蛋白特异性序列,并位于多角体蛋白启动子的下游。
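The newly formed baculovirus expression vector is subsequently packaged into an infectious recombinant baculovirus. Homologous recombination occurs at low frequency (between about 1% and about 5%); thus, the majority of the virus produced after cotransfection is still wild-type virus. Therefore, a method is necessary to identify recombinant viruses. An advantage of the expression system is a visual screen that allows recombinant viruses to be distinguished. The polyhedrin protein, which is produced by the native virus, is produced at very high levels in the nuclei of infected cells at late times after viral infection. Accumulated polyhedrin protein forms occlusion bodies that also contain embedded particles. These occlusion bodies, 15 μm in size, are highly refractile, giving them a bright shiny appearance that is readily visualised under the light microscope. Cells infected with recombinant viruses lack occlusion bodies. To distinguish recombinant virus from wild-type virus, the transfection supernatant is plaqued onto a monolayer of insect cells by techniques known to those skilled in the art. Namely, the plaques are screened under the light microscope for the presence (indicative of wild-type virus) or absence (indicative of recombinant virus) of occlusion bodies. "Current Protocols in Microbiology" Vol. 2 (Ausubel et al. eds) at 16.8 (Supp. 10, 1990); Summers and Smith, supra; Miller et al. (1989).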
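The 1-5% recombination frequency quoted above implies that a fair number of plaques must be screened before a recombinant is likely to be seen. The sketch below is a back-of-the-envelope estimate, not part of the described procedure. It assumes, for illustration only, that each plaque is independently recombinant with the stated frequency, and the 95% confidence level is an arbitrary choice.

```python
import math


def plaques_to_screen(recombinant_fraction, confidence=0.95):
    """Smallest number of plaques giving at least `confidence` probability
    of seeing one or more recombinants, assuming independent plaques."""
    if not 0 < recombinant_fraction < 1:
        raise ValueError("recombinant_fraction must be between 0 and 1")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    # P(no recombinant among k plaques) = (1 - p)^k must be <= 1 - confidence.
    return math.ceil(math.log(1 - confidence) / math.log(1 - recombinant_fraction))
```

Under these assumptions, `plaques_to_screen(0.01)` gives 299 and `plaques_to_screen(0.05)` gives 59, which illustrates why a quick visual screen for occlusion bodies is convenient.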
已经开发出感染进入几种昆虫细胞的重组杆状病毒表达载体。例如,已经开发出用于感染以下昆虫的细胞的重组杆状病毒:埃及伊蚊、苜蓿丫纹夜蛾、家蚕、黑尾果蝇、草地夜蛾和粉纹夜蛾(WO 89/046699;Carbonell等人(1985)J.Virol.56:153;Wright(1986)Nature 321:718;Smith等人(1983)Mol.Cell.Biol.3:2156;综述见Fraser等人(1989)Vitro Cell.Dev.Biol.25:225)。
可以购得细胞和细胞培养基用于在杆状病毒/表达系统中直接表达和融合表达异源多肽;细胞培养技术是本领域技术人员通常所知的。例如见Summers和Smith,同上。
然后,经修饰的昆虫细胞可以生长在合适的营养培养基中,该培养基能稳定地保持该质粒于修饰的昆虫宿主中。当表达产物基因处于可诱导的控制下时,可以使宿主生长至高密度,并诱导表达。另外,当表达是组成型表达时,产物将被连续表达到培养基中,营养性培养基必需不断循环,同时取出感兴趣的产物并补充消耗的营养物。产物可用以下这些技术来纯化:例如层析,如HPLC、亲和层析、离子交换层析等;电泳;密度梯度离心;溶剂抽提等。产物可按需作进一步纯化,以基本上除去所有也分泌到培养基中或由昆虫细胞裂解而产生的昆虫蛋白,以提供一种至少基本上不含宿主碎片如蛋白质、脂质和多糖的产物。
为了进行蛋白质表达,将从转化子衍生获得的重组宿主细胞培育在允许重组蛋白的编码序列表达的条件下。这些条件将随所选定的宿主细胞而变。然而,本领域技术人员容易根据本领域已知的知识来确定该条件。
ⅲ.植物系统
本领域中已知有许多植物细胞培养系统和全植物遗传表达系统。典型的植物细胞基因表达系统包括在以下专利中描述的那些,例如:US 5,693,506;US 5,659,122;和US 5,608,143。Zenk,Phytochemistry 30:3861-3863(1991)中描述了在植物细胞培养物中遗传表达的其它例子。除上述参考文献外,关于植物蛋白信号肽的描述还可在下列文献中找到:Vaulcombe等人,Mol.Gen.Genet.209:33-40(1987);Chandler等人,Plant Molecular Biology 3:407-418(1984);Rogers,J.Biol.Chem.260:3731-3738(1985);Rothstein等人,Gene 55:353-356(1987);Whittier等人,Nucleic Acids Research15:2515-2535(1987);Wirsel等人,Molecular Microbiology 3:3-14(1989);Yu等人,Gene 122:247-253(1992)。关于用植物激素、赤霉素酸和赤霉素酸诱导分泌的酶调节植物基因表达的描述可在R.L.Jones和J.MacMillin,Gibberellins,《植物生理学进展》,Malcolm B.Wilkins编辑,1984 Pitman Publishing Limited,London,21-52页中找到。描述其它调节代谢的基因的参考文献参见:Sheen,Plant Cell,2:1027-1038(1990);Maas等人,EMBO J.9:3447-3452(1990);Benkel和Hickey,Proc.Natl.Acad.Sci.84:1337-1339(1987)。
通常,利用本领域已知的技术,将所需的多核苷酸序列插入一表达盒中,该表达盒含有为在植物中操作而设计的基因调控元件。将该表达盒插入所需的表达载体中,表达盒的上游和下游有适合在植物宿主中表达的伴随序列。该伴随序列可来自质粒或病毒,并为载体提供所需的性质,以允许载体将DNA从起初的克隆宿主(如细菌)中移动到所需植物宿主中。基础的细菌/植物载体构建物最好能提供宽的宿主范围原核复制起点;原核可选择标记;以及,对于农杆菌转化而言,宜提供T DNA序列用于农杆菌介导转移至植物染色体。当异源基因不易检测时,该构建物最好还具有一个适用于确定植物细胞是否已经转化的可选择标记基因。关于合适标记(例如对于禾草类家族成员)的综述可在Wilmink和Dons,1993,Plant Mol.Biol.Reptr,11(2):165-185中找到。
还建议采用合适将异源序列整合到植物基因组中的序列。这些序列可能包括用于同源重组的转座子序列以及允许将异源表达盒随机插入植物基因组中的Ti序列。合适的原核可选择标记包括抗生素(如氨苄青霉素或四环素)抗性标记。编码其它功能的其它DNA序列也可存在于载体中,这是本领域所知的。
本发明的核酸分子可包括在一个表达盒中来表达感兴趣的蛋白质。通常只有一个表达盒,但是两个或多个表达盒也是可行的。除了编码异源蛋白的序列外,重组表达盒还含有下列元件:启动子区域、植物5′非翻译序列、起始密码子(根据结构基因原来是否具有而定)、以及转录和翻译终止序列。表达盒5′和3′端的独特限制性酶位点能使表达盒方便地插入预先存在的载体中。
异源编码序列可以用于任何与本发明有关的蛋白。编码感兴趣的蛋白的序列将编码出一个信号肽,该信号肽能适当地加工和转运蛋白质,并且通常缺少可能会导致本发明的所需蛋白与膜结合的序列。由于对于大部分来说,转录起始区将针对发芽期间表达和转运的基因,采用提供转运的信号肽,也可提供转运感兴趣的蛋白质。通过这种方式,感兴趣的蛋白将从表达该蛋白的细胞中转运出来,并能被有效地收获。通常,种子中的分泌是通过糊粉或小盾体上皮层进入种子的胚乳。尽管不需要使蛋白从产生该蛋白的细胞中分泌出来,但是这种分泌有利于重组蛋白的分离和纯化。
由于所需基因产物的最终表达将在真核细胞中进行,因此需要确定克隆的基因部分是否含有作为内含子被宿主剪接体机制加工的序列。如果是这样,需要对“内含子”区进行定点诱变,以防止一部分遗传信息作为错误的内含子密码而丧失,Reed和Maniatis,Cell 41:95-105,1985。
可用微量移液管以机械方式转移重组DNA,将载体直接显微注射到植物细胞中。Crossway,Mol.Gen.Genet,202:179-185。还可用聚乙二醇将遗传物质转移到植物细胞中,Krens等人,Nature,296,72-74,1982。导入核酸片段的另一种方法是用小颗粒进行高速弹道贯穿,在这些小珠或颗粒的基质中或表面上带有核酸,Klein等人,Nature,327,70-73,1987,Knudsen和Muller,1991,Planta,185:330-336提出用颗粒轰击大麦胚乳以产生转基因大麦。还有一种导入方法是使原生质体和其它实体(微细胞(minicell)、细胞、溶酶体或其它可融合的脂质表面体)融合,Fraley等人,Proc.Natl.Acad.Sci.USA,79,1859-1863,1982。
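The vector may also be introduced into plant cells by electroporation (Fromm et al., PNAS 82:5824, 1985). In this technique, plant protoplasts are electroporated in the presence of plasmids containing the gene construct. Electrical impulses of high field strength reversibly permeabilise biomembranes, allowing the introduction of the plasmids. Electroporated plant protoplasts reform the cell wall, divide, and form plant callus.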
本发明可转化所有的植物,从中能分离出原生质体并能培育成全再生植物,从而回收得到含有转基因的全植物。已经知道实际上可以从培育的细胞或组织再生所有的全植物,其包括但不局限于,甘蔗、甜菜、棉花、果实和其它树、豆科植物和蔬菜的所有主要种类。一些合适的植物包括,例如,草莓属、莲花属、苜蓿属、驴食豆属、三叶草属、胡卢巴属、豇豆属、柑橘属、亚麻属、老鹳草属、Manihot、Daucus、鼠耳芥属、芸苔属、萝卜属、白芥属、颠茄属、辣椒属、曼陀罗属、天仙子属、番茄属、烟草属、茄属、碧冬茄属、毛地黄属、Majorana、菊苣属、向日葵属、莴苣属、雀麦属、天门冬属、金鱼草属、龙骨角属、Nemesia、天竺葵属、稷属、狼尾草属、毛茛属、千里光属、Salpiglossis、香瓜属、Browaalia、大豆属、黑麦草属、玉蜀黍属、小麦、蜀黍属和曼陀罗属各种类。
各种植物的再生方式是不同的,但是通常是首先提供含有异源基因拷贝的转化的原生质体悬液。形成胼胝体组织,从胼胝体中诱生出枝条,随后是根。另外,从原生质体悬液可以诱生形成胚胎。这些胚胎象天然的胚胎那样发芽形成植物。培养基通常含有各种氨基酸和激素,如植物生长素和细胞分裂素。尤其是对于玉米和苜蓿属来说,在培养基中加入谷氨酸和脯氨酸也是很有利的。枝条和根通常同时发育。有效的再生取决于培养基、基因型以及培养史。如果控制了这三个变量,那么再生能完全再现和重复。
在一些植物细胞培养系统中,本发明所需的蛋白可能被排泄出来,或者蛋白可从全植物中提取出来。当本发明所需的蛋白被分泌到培养基中后,就可进行收集。或者,可以用机械方式破碎胚以及无胚-半种子或其它植物组织,以释放出分泌到细胞和组织之间的蛋白。将该混合物悬于缓冲液中,以提取可溶性蛋白。然后用常规的蛋白分离和纯化方法纯化重组蛋白。用常规方法调节时间、温度、pH、氧和体积等参数,以优化异源蛋白的表达和回收。
ⅳ.细菌系统
细菌表达技术是本领域已知的。细菌启动子是能结合细菌RNA聚合酶并启动下游(3′)编码序列(如结构基因)转录成mRNA的DNA序列。启动子具有一个转录起始区,其通常位于编码序列的5′端附近。该转录起始区通常包括RNA聚合酶结合位点以及一个转录起始位点。细菌启动子可能还有第二个功能区域称为操纵子,它可能与毗邻的RNA合成开始的RNA聚合酶结合位点重叠。该操纵子允许(可诱导)对转录的负调节,因为基因阻遏蛋白可能结合操纵子并因而抑制特定基因的转录。在负调节元件(如操纵子)不存在时,可能发生组成型表达。另外,正调节可通过基因激活蛋白结合序列来实现,如果有的话,它通常邻近RNA聚合酶结合序列(5′)。基因激活蛋白的例子是分解代谢物激活剂蛋白(CAP),它帮助启动大肠杆菌(E.coli)中的lac操纵子的转录[Raibaud等人(1984)Annu.Rev.Genet.18:173]。因此,表达调控可能是正作用或负作用,从而增强或减弱了转录。
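Sequences encoding metabolic pathway enzymes provide particularly useful promoter sequences. Examples include promoter sequences derived from sugar metabolising enzymes, such as galactose, lactose (lac) [Chang et al. (1977) Nature 198:1056], and maltose. Additional examples include promoter sequences derived from biosynthetic enzymes such as tryptophan (trp) [Goeddel et al. (1980) Nuc. Acids Res. 8:4057; Yelverton et al. (1981) Nucl. Acids Res. 9:731; US patent 4,738,921; EP-A-0036776 and EP-A-0121775]. The β-lactamase (bla) promoter system [Weissmann (1981) "The cloning of interferon and other mistakes", in Interferon 3 (ed. I. Gresser)], bacteriophage lambda PL [Shimatake et al. (1981) Nature 292:128] and T5 [US patent 4,689,406] promoter systems also provide useful promoter sequences.
In addition, synthetic promoters which do not occur in nature also function as bacterial promoters. For example, transcription activation sequences of one bacterial or bacteriophage promoter may be joined with the operon sequences of another bacterial or bacteriophage promoter, creating a synthetic hybrid promoter [US patent 4,551,433]. For example, the tac promoter is a hybrid trp-lac promoter comprising both trp promoter and lac operon sequences, and is regulated by the lac repressor [Amann et al. (1983) Gene 25:167; de Boer et al. (1983) Proc. Natl. Acad. Sci. 80:21]. Furthermore, a bacterial promoter can include naturally occurring promoters of non-bacterial origin that have the ability to bind bacterial RNA polymerase and initiate transcription. A naturally occurring promoter of non-bacterial origin can also be coupled with a compatible RNA polymerase to produce high levels of expression of some genes in prokaryotes [Studier et al. (1986) J. Mol. Biol. 189:113; Tabor et al. (1985) Proc. Natl. Acad. Sci. 82:1074]. In addition, a hybrid promoter can also comprise a bacteriophage promoter and an E. coli operator region (EPO-A-0 267 851).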
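In addition to a functioning promoter sequence, an efficient ribosome binding site is also useful for the expression of foreign genes in prokaryotes. In E. coli, the ribosome binding site is called the Shine-Dalgarno (SD) sequence and includes an initiation codon (ATG) and a sequence 3-9 nucleotides in length located 3-11 nucleotides upstream of the initiation codon [Shine et al. (1975) Nature 254:34]. The SD sequence is thought to promote binding of mRNA to the ribosome by base pairing between the SD sequence and the 3' end of E. coli 16S rRNA [Steitz et al. (1979) "Genetic signals and nucleotide sequences in messenger RNA", in Biological Regulation and Development: Gene Expression (ed. R.F. Goldberger)]. For the expression of eukaryotic genes and of prokaryotic genes with a weak ribosome binding site, see Sambrook et al. (1989) "Expression of cloned genes in Escherichia coli", in Molecular Cloning: A Laboratory Manual.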
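The spacing rule described above can be sketched as a scan for an SD-like motif upstream of each ATG. Several details are assumptions introduced for illustration: the consensus `AGGAGG` (the commonly cited E. coli core, not stated in the text), exact matching of sub-stretches of that consensus of at least 3 nucleotides, and measuring the 3-11 nucleotide distance as the gap between the end of the motif and the A of the ATG. Because the assumed consensus is six nucleotides long, motifs of 7-9 nucleotides are not considered.

```python
CONSENSUS = "AGGAGG"  # assumed SD core; the text gives only length and spacing


def find_sd_sites(mrna, min_len=3, min_gap=3, max_gap=11):
    """Return (atg_index, motif, gap) for each ATG preceded by an SD-like motif."""
    seq = mrna.upper().replace("U", "T")
    # All sub-stretches of the consensus with at least min_len bases, longest first.
    motifs = sorted(
        {CONSENSUS[i:j]
         for i in range(len(CONSENSUS))
         for j in range(i + min_len, len(CONSENSUS) + 1)},
        key=len, reverse=True)
    hits = []
    start = seq.find("ATG")
    while start != -1:
        best = None
        for gap in range(min_gap, max_gap + 1):
            end = start - gap  # index just past the candidate motif
            for motif in motifs:
                begin = end - len(motif)
                if begin >= 0 and seq[begin:end] == motif:
                    if best is None or len(motif) > len(best[0]):
                        best = (motif, gap)
                    break  # the longest motif ending here has been found
        if best is not None:
            hits.append((start, best[0], best[1]))
        start = seq.find("ATG", start + 1)
    return hits
```

A real ribosome binding site predictor would score partial complementarity to the 16S rRNA 3' end rather than require exact matches; the sketch only makes the length and spacing constraints concrete.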
DNA分子可以在胞内表达。启动子序列可以直接与DNA分子相连,在这种情况下,N端的第一个氨基酸始终是甲硫氨酸,其由ATG起始密码子编码。如果需要,可通过和溴化氰体外培育或通过和细菌甲硫氨酸N-端肽酶体内或体外培育,将N端的甲硫氨酸从蛋白质上切下(EPO-A-0 219 237)。
融合蛋白为直接表达提供了一种备选方案。通常,将编码内源细菌蛋白或其它稳定的蛋白之N端部分的DNA序列与异源编码序列的5′端融合。在表达时,该构建物将提供这两个氨基酸序列的融合物。例如,λ噬菌体细胞基因可以和外源基因的5′端相连并在细菌中表达。所得融合蛋白宜保留一个酶(因子Xa)加工位点,以便将噬菌体蛋白与外源基因切开[Nagai等人(1984)Nature 309:810]。融合蛋白也可用lacZ[Jia等人(1987)Gene 60:197],trpE[Allen等人(1987)J.Biotechnol.5:93;Makoff等人(1989),J.Gen.Microbiol.135:11]以及Chey[EP-A-0-324 647]基因的序列组成。两个氨基酸序列连接处的DNA序列可能编码或不编码可切割的位点。另一个例子是遍在蛋白融合蛋白。这种融合蛋白由遍在蛋白区域组成,该区域宜保留一个酶(例如遍在蛋白特异性加工蛋白酶)加工位点,以便将外源蛋白和遍在蛋白切开。通过这种方法,可以分离获得天然的外源蛋白[Miller等人(1989)Bio/Technology 7:698]。
另外,还可通过产生嵌合的DNA分子来将外源蛋白分泌出细胞,该嵌合的DNA分子编码的融合蛋白含有一个信号肽序列片段,该序列片段能使细菌中的外源蛋白分泌出来[美国专利4,336,336]。信号序列片段通常编码一个信号肽,该信号肽含有疏水性氨基酸,能指引蛋白分泌出细胞。蛋白质被分泌到生长培养基(革兰阳性菌)中或细胞内膜和外膜之间的周质间隙内(革兰阴性菌)。在编码的信号肽片段和外源基因之间宜具有能在体内或体外切割的加工位点。
编码合适信号序列的DNA可以从分泌性细菌蛋白的基因衍生获得,这些基因例如是大肠杆菌外膜蛋白基因(ompA)[Masui等人(1983),《基因表达的实验操作》;Ghrayeb等人(1984)EMBO J.3:2437]以及大肠杆菌碱性磷酸酶信号序列(phoA)[Oka等人(1985)Proc.Natl.Acad.Sci.82:7212]。另一个例子是,可采用各种芽孢杆菌菌株的α淀粉酶基因的信号序列将异源蛋白分泌出枯草芽孢杆菌[Palva等人(1982)Proe.Natl.Acad.Sci.USA 79:5582;EP-A-0 244 042]。
通常,细菌所识别的转录终止序列是位于翻译终止密码子3′的调控区,它和启动子一起侧接在编码序列的两侧。这些序列指导mRNA的转录,而mRNA能被翻译成该DNA所编码的多肽。转录终止序列通常包括约50个核苷酸的DNA序列,该序列能形成帮助终止转录的茎环结构。例子包括衍生自具有强启动子的基因(如大肠杆菌中的trp基因以及其它生物合成的基因)的转录终止序列。
上述组件,包括启动子、信号序列(如果需要的)、感兴趣的编码序列以及转录终止序列通常一起被放在表达构建物中。表达构建物通常以复制子的形式维持,例如能在宿主(如细菌)中稳定维持的染色体外元件(如质粒)。复制子具有一个复制系统,从而允许其维持在原核宿主中或进行表达或进行克隆和扩增。另外,复制子可以是高拷贝数或低拷贝数的质粒。高拷贝数质粒的拷贝数大致在约5至200之间,通常在约10至150之间。含有高拷贝数质粒的宿主宜含有至少约10个质粒,更佳的含有至少约20个质粒。根据载体以及外源蛋白对宿主的影响,可以选择高拷贝数或低拷贝数的载体。
另外,表达构建物可以和一个整合载体一起整合入细菌基因组中。整合载体通常含有至少一个序列与细菌染色体同源,从而允许该载体整合。整合看来是载体和细菌染色体中的同源DNA之间重组引起的。例如,用不同芽孢杆菌菌株的DNA构建的整合载体整合到芽孢杆菌染色体中(EP-A-0 127 328)。整合载体还可包含噬菌体或转座子序列。
通常,染色体外以及整合的表达构建物均含有可选择的标记,以便选择已经转化的菌株。可选择标记可在细菌宿主中表达,其包括赋予细菌对药物(如氨苄青霉素、氯霉素、红霉素、卡那霉素(新霉素)和四环素)抗性的基因[Davies等人(1978)Annu.Rev.Microbiol.32:469]。可选择标记还可包括生物合成性基因,如在组氨酸、色氨酸以及亮氨酸生物合成途径中的那些基因。
另外,上述某些组件可以一起放在转化载体中。转化载体通常包含一个可选择标记,如上所述,该载体以复制子形式维持或发展成一个整合载体。
已经开发出了用于转化到许多细菌中的表达和转化载体(无论是染色体外复制子还是整合载体)。例如,已经开发出了用于下列细菌的表达载体:枯草芽孢杆菌[Palva等人,(1982)Proc.Natl.Acad.Sci.USA 79:5582;EP-A-0 036 259和EP-A-0 063 953;WO 84/04541],大肠杆菌[Shimatake等人,(1981)Nature 292:128;Amann等人,(1985)Gene 40:183;Studier等人,(1986)J.Mol.Biol.189:113;EP-A-0 036 776,EP-A-0 136829和EP-A-0 136 907],酪链球菌[Powell等人,(1988)Appl.Environ.Microbiol.54:655];浅青紫链球菌[Powell等人,(1988)Appl.Environ.Microbiol.54:655],浅青紫链霉菌[US patent 4,745,056]。
将外源DNA导入细菌宿主的方法是本领域熟知的,通常包括用氯化钙或其它试剂(如二价阳离子和DMSO)处理对细菌进行转化。DNA还可通过电穿孔方法导入细菌细胞。转化程序通常因待转化的细菌种类而不同。例如参见[Masson等人,(1989)FEMS Microbiol.Lett.60:273;Palva等人,(1982)Proc.Natl.Acad.Sci. USA 79:5582;EP-A-0 036 259和EP-A-0 063 953;WO 84/04541,芽孢杆菌],[Miller等人,(1988)Proc.Natl.Acad.Sci.85:856;Wang等人,(1990)J.Bacteriol.172:949,弯曲杆菌],[Cohen等人,(1973)Proc.Natl.Acad.Sci.69:2110;Dower等人,(1988)Nucleic Acids Res.l6:6127;Kushner(1978)"用ColE1-衍生的质粒转化大肠杆菌的改进的方法"GeneticEngineering:Proceedings of the International Symposium on Genetic Engineering(H.W.Boyer和S.Nicosia编辑);Mandel等人,(1970)J.Mol.Biol.53:159;Taketo(1988)Biochim.Biophys,Acta 949:318;埃希氏菌],[Chassy等人,(1987)FEMS Microbiol.Lett.44:173乳酸杆菌];[Fiedler等人,(1988)Anal,Biochem 170:38,假单胞菌];[Augustin等人,(1990)FEMS Microbiol.Lett.66:203,葡萄球菌],[Barany等人,(1980)J.Bacteriol.l44:698;Harlander(1987)"用电穿孔转化链球菌产乳酸微生物"Streptococcal Genetics(J.Ferretti和R.Curtiss Ⅲ编辑);Perry等人,(1981)Infect.Immun.32:1295;Powell等人,(1988)Appl.Environ.Microbiol.54:655;Somkuti等人,(1987)Proc.4th Evr.Cong.Biotechnology 1:412,链球菌]。
ⅴ.酵母表达
酵母表达系统也是本领域技术人员所知的。酵母启动子是能结合酵母RNA聚合酶并启动下游(3′)编码序列(如结构基因)转录成mRNA的DNA序列。启动子具有一个转录起始区,它通常位于编码序列的5′端附近。该转录起始区通常包括RNA聚合酶结合位点("TATA"盒)以及一个转录起始位点。酵母启动子可能还有第二个功能区域称为上游激活序列(UAS),如果存在的话,它通常在结构基因的远端。UAS能调节表达(可诱导)。在UAS不存在时,发生组成型表达。表达的调控可能是正作用或负作用的,从而增强或减弱了转录。
酵母是一种发酵生物体,具有活泼的代谢途径,因此编码代谢途径中的酶的序列提供了特别有用的启动子序列。例子包括醇脱氢酶(ADH)(EP-A-0 284 044)、烯醇酶、葡萄糖激酶、葡萄糖-6-磷酸异构酶、甘油醛-3-磷酸-脱氢酶(GAP或GAPDH)、己糖激酶、磷酸果糖激酶、3-磷酸甘油酸变位酶、以及丙酮酸激酶(PyK)(EPO-A-0 329203)。编码酸性磷酸酶的酵母PHO5基因也提供了有用的启动子序列[Myanohara等人(1983)Proc.Natl.Acad.Sci.USA 80:1]。
另外,非天然存在的合成的启动子也可象酵母启动子一样起作用。例如,一种酵母启动子的UAS序列可以和另一种酵母启动子的转录激活区连接在一起,形成合成的杂合启动子。这种杂合启动子的例子包括与GAP转录激活区相连的ADH调控序列(美国专利No.4,876,197和4,880,734)。杂合启动子的其它例子包括由ADH2、GAL4、GAL 10或PHO5基因的调控序列组成的启动子与糖酵解酶基因如GAP或PyK的转录激活区组合(EP-A-0 164 556)。另外,酵母启动子可包括非酵母来源但能结合酵母RNA聚合酶并启动转录的天然存在的启动子。这些启动子的例子包括,尤其是,[Cohen等人,(1980)Proc.Natl.Acad.Sci.USA 77:1078;Henikoff等人,(1981)Nature 283:835;Hollenberg等人,(1981)Curr.Topics Microbiol.Immunol. 96:119;Hollenberg等人,(1979)"细菌抗生素抗性基因在酿酒酵母中的表达"Plasmids ofMedical,Environmental and Commercial Importance(K.N.Timmis和A.Puhler编辑);Mercerau-Puigalon等人,(1980)Gene ll:63;Panthier等人,(1980)Curr.Genet.2:109]。
DNA分子可以在酵母菌胞内表达。启动子序列可以直接与DNA分子相连,在这种情况下,重组蛋白N端的第一个氨基酸始终是甲硫氨酸,其由ATG起始密码子编码。如果需要,可通过和溴化氰体外培育将N端的甲硫氨酸从蛋白质上切下。
象在哺乳动物、杆状病毒以及细菌表达系统中一样,融合蛋白为酵母表达系统提供了一种备选方案。通常,将编码内源酵母蛋白或其它稳定的蛋白之N端部分的DNA序列与异源编码序列的5′端融合。在表达时,该构建物将提供这两个氨基酸序列的融合物。例如,酵母或人超氧化物歧化酶(SOD)基因可以和外源基因5′端相连并在酵母中表达。两个氨基酸序列连接处的DNA序列可能编码或不编码可切割的位点。例如参见EP-A-0 196 056。另一个例子是遍在蛋白融合蛋白。这种融合蛋白由遍在蛋白区域组成,该区域宜保留一个酶(例如遍在蛋白特异性加工蛋白酶)加工位点,以便将外源蛋白和遍在蛋白切开。因此,通过这种方法,可以分离获得天然的外源蛋白(例如WO88/024066)。
另外,还可通过产生嵌合的DNA分子来将外源蛋白从细胞分泌到生长培养基中,该嵌合的DNA分子编码的融合蛋白含有一个前导序列片段,该前导序列片段能使酵母中的外源蛋白分泌出来。较佳的,在编码的前导片段和外来基因之间宜具有能在体内或体外切割的加工位点。该前导序列片段通常编码了含有疏水性氨基酸的信号肽,其指导蛋白从细胞分泌出来。
编码合适信号序列的DNA可以从分泌性酵母蛋白的基因衍生获得,这些基因例如有酵母转化酶基因(EP-A-0 012 873;JPO.62,096,086)以及A-因子基因(美国专利4,588,684)。另外,非酵母来源的前导序列(如干扰素前导序列)的存在也能提供分泌出酵母的作用(EP-A-0 060 057)。
较佳的一类分泌前导序列采用了酵母α-因子基因的片段,其含有"pre"信号序列和"pro"区。可采用的α因子片段的类型包括全长pre-pro α因子前导序列(约83个氨基酸残基)以及截短的α-因子前导序列(通常约25至50个氨基酸残基)(美国专利4,546,083和4,870,008;EP-A-0 324 274)。采用α-因子前导片段提供分泌作用的其它前导序列包括杂合的α-因子前导序列,其由第一个酵母的pre序列以及第二个酵母α因子的pro区域组成(例如见WO 89/02463)。
通常,被酵母识别的转录终止序列是位于翻译终止密码子3′的调控区,其和启动子一起侧接在编码序列的两侧。这些序列指导mRNA的转录,而mRNA能被翻译成该DNA所编码的多肽。转录终止序列和其它酵母识别的终止序列的例子例如是编码糖酵解酶的那些转录终止序列。
上述组件,包括启动子、信号序列(如果需要的)、感兴趣的编码序列以及转录终止序列,通常被一起放在表达构建物中。表达构建物通常以复制子的形式保持,例如能在宿主(如酵母或细菌)中稳定保持的染色体外元件(如质粒)。复制子可能具有两个复制系统,从而允许其能维持在例如酵母中进行表达,并能维持在原核宿主进行克隆和扩增。这些酵母-细菌穿梭载体的例子包括YEp24[Botstein等人(1979)Gene8:17-24],pCL/1[Brake等人,(1984)Proc.Natl.Acad.Sci.USA 81:4642-4646]和YRp17[Stinchcomb等人(1982)J.Mol.Biol.158:157]。另外,复制子可以是高拷贝数或低拷贝数的质粒。高拷贝数质粒的拷贝数大致在约5至200之间,通常在约10至150之间。含有高拷贝数质粒的宿主宜含有至少约10个质粒,更佳的含有至少约20个质粒。根据载体以及外源蛋白对宿主的影响,可以选择高拷贝数或低拷贝数的载体。例如参见Brake等人,同上。
另外,表达构建物可以和一个整合载体一起整合入酵母基因组中。整合载体通常含有至少一个序列与酵母染色体同源,从而允许该载体整合,最好含有两个同源序列侧接该表达构建物。整合看来是载体和酵母染色体中同源DNA之间重组引起的[Orr-Weaver等人(1983)Methods in Enzymol.101:228-245]。通过选择合适的同源序列插入载体中,可以使整合载体针对酵母中某一特定的基因座。见Orr-Weaver等人,同上。可以整合入一个或多个表达构建物,这可能会影响重组蛋白产生的水平[Rine等人(1983)Proc.Natl.Acad.Sci.USA 80:6750]。载体中的染色体序列可以载体中的单个片段形式存在(从而导致整个载体的整合),或是与染色体中的相邻片段同源的两个片段,这两个片段在载体中侧接在表达构建物两侧,从而导致仅仅表达构建物稳定地整合。
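Usually, extrachromosomal and integrating expression constructs contain selectable markers to allow for the selection of yeast strains that have been transformed. Selectable markers may include biosynthetic genes that can be expressed in the yeast host, such as ADE2, HIS4, LEU2 and TRP1, as well as ALG7 and the G418 resistance gene, which confer resistance in yeast cells to tunicamycin and G418, respectively. In addition, a suitable selectable marker may also provide yeast with the ability to grow in the presence of toxic compounds, such as metals. For example, the presence of CUP1 allows yeast to grow in the presence of copper ions [Butt et al. (1987) Microbiol. Rev. 51:351].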
另外,上述某些组件可以一起放在转化载体中。转化载体通常包含一个可选择标记,如上所述,该载体以复制子形式维持或发展成一个整合载体。
已经开发出了用于转化入许多酵母中的表达和转化载体(无论是染色体外复制子还是整合载体)。例如,已经开发出用于下列酵母菌的表达载体:白假丝酵母[Kurtz,等人,(1986)Mol.Cell.Biol.6:142],麦芽糖念珠菌[Kunze,等人,(1985)J.BasicMicrobiol.25:141],多形汉逊酵母[Gleeson,等人,(1986)J.Gen.Microbiol.132:3459;Roggenkamp等人,(1986)Mol.Gen.Genet.202:302],脆壁克鲁维酵母[Das,等人,(1984)J.Bacteriol.158:1165],乳酸克鲁维酵母[De Louvencourt等人,(1983)J.Bacteriol.l54:737;Van den Berg等人,(1990)Bio/Technology 8:135],季也蒙毕赤酵母[Kunze等人,(1985)J.Basic Microbiol.25:141],巴斯德毕酵母[Cregg,等人,(1985)Mol.Cell.Biol.5:3376;美国专利No.4,837,148和4,929,555],酿酒酵母[Hinnen等人,(1978)Proc.Natl.Acad.Sci.USA 75:1929;Ito等人,(1983)J.Bacteriol.153:163],栗酒裂植酵母[Beach和Nurse(1981)Nature 300:706],以及Yarrowialipolytica[Davidow,等人,(1985)Curr.Genet.10:380471 Gaillardin,等人,(1985)Curr.Genet.10:49]。
将外源DNA导入酵母宿主的方法是本领域熟知的,通常包括用碱阳离子处理转化的原生质球或完整酵母细胞。转化程序通常因待转化的酵母种类而不同。例如参见,[Kurtz等人,(1986)Mol.Cell.Biol.6:142;Kunze等人,(1985)J.BasicMicrobiol.25:141;假丝酵母];[Gleeson等人,(1986)J.Gen.Microbiol.132:3459;Roggenkamp等人,(1986)Mol.Gen.Genet.202:302;汉逊酵母];[Das等人,(1984)J.Bacteriol.158:1165;De Louvencourt等人,(1983)J.Bacteriol.154:1165;Van denBerg等人,(1990)Bio/Technology 8:135;克鲁维酵母];[Cregg等人,(1985)Mol.Cell.Biol.5:3376;Kunze等人,(1985)J.Basic Microbiol.25:141;美国专利No.4,837,148和4,929,555;毕赤酵母];[Hinnen等人,(1978)Proc.Natl.Acad.Sci.USA 75;1929;Ito等人,(1983)J.Bacteriol.153:163酿酒酵母];[Beach和Nurse(1981)Nature300:706;裂殖酵母];[Davidow等人,(1985)Curr.Genet.10:39;Gaillardin等人,(1985)Curr.Genet.10:49;Yarrowia]。
抗体
本文所用的术语“抗体”指由至少一个抗体结合位点组成的一个或一组多肽。“抗体结合位点”是一个三维结合空间,其内表面形状和电荷分布与抗原表位的特征互补,从而使抗体与抗原结合。“抗体”例如包括,脊椎动物抗体、杂合抗体、嵌合抗体、人化抗体、经修饰的抗体、单价抗体、Fab蛋白以及单结构域抗体。
针对本发明蛋白的抗体可用于亲和层析、免疫试验以及区别/鉴定奈瑟球菌蛋白。
针对本发明蛋白的多克隆和单克隆抗体可用常规方法制得。通常,首先用蛋白来免疫合适的动物,较佳的是小鼠、大鼠、家兔或山羊。由于可获得的血清体积多,能获得标记的抗家兔和抗山羊抗体,因此对于制备多克隆抗血清来说,家兔和山羊是较佳的。免疫通常这样进行:将蛋白混合或乳化到盐水(较佳的是佐剂如Freund完全佐剂)中,然后肠胃外(通常是皮下或肌内)注射该混合物或乳剂。每次注射50-200微克的剂量就足够了。2-6周后用盐水(较佳的是用Freund不完全佐剂)配的蛋白质注射一次或多次以强化免疫。另外可以用本领域已知的方法进行体外免疫来产生抗体,从本发明的目的来看,认为其与体内免疫等效。将免疫后的动物血液抽取到玻璃或塑料容器中,25℃培育该血液1小时,然后4℃培育2-18小时,获得多克隆抗血清。离心(例如1000g 10分钟)回收血清。家兔每次取血可获得约20-50毫升。
用Kohler和Milstein的标准方法[Nature(1975)256:495-96]或其改进方法制得单克隆抗体。通常,如上所述对小鼠或大鼠免疫。然而,并非是对动物取血然后抽提血清,而是取出脾脏(以及任选地取出几个大的淋巴结),将其分离成单细胞。如果需要,可将细胞悬液(在除去非特异性粘附的细胞后)加入包被了蛋白质抗原的板或孔中,对脾细胞进行筛选。表达抗原特异性的膜结合免疫球蛋白的B细胞结合到板上,不象悬液其它物质那样被洗去。然后使所得B细胞或所有解离的脾细胞与骨髓瘤细胞融合形成杂交瘤,培养在选择性培养基(如次黄嘌呤、氨基蝶呤胸苷培养基,“HAT”)中。通过有限稀释接种所得杂交瘤,并测定特异性结合免疫抗原(且不结合无关抗原)的抗体的产生。然后,体外(例如在组织培养瓶或中空纤维反应器中)或体内(如小鼠腹水中)培养所选的分泌单克隆抗体的杂交瘤。
如果需要,抗体(无论是多克隆还是单克隆抗体)可用常规技术来标记。合适的标记包括荧光团、发色团、放射活性原子(具体是32P和125I)、密电子试剂、酶、以及具有特异性结合配偶的配体。酶通常靠其活性来检测。例如,辣根过氧化物酶通常是检测其将3,3′,5,5′-四甲基联苯胺(TMB)转变成蓝色的能力,可用分光光度计定量测定。“特异性结合配偶”指能以高特异性结合配体分子的蛋白质,例如抗原以及对其有特异性的单克隆抗体。其它特异性结合配偶包括生物素和亲和素或链亲和素,IgG和蛋白A,以及本领域已知的许多受体-配体对。应理解,上述内容并非要将各种标记分成不同的类,因为同一标记可在几种不同的模型中起作用。例如,125I可作为放射活性标记,或作为密电子试剂。HRP可作为酶或单抗的抗原。另外,一种物质可以和各种标记组合以获得所需的效果。例如,在实施本发明中,单抗和亲和素也需要标记,因此,可以用生物素标记单抗,并用标记了125I的亲和素检测其存在,或用标记HRP的抗生物素单抗检测其存在。其它替换和可能性对于本领域普通技术人员来说是显而易见的,所以应认作等价物属于本发明的范围。
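Pharmaceutical compositions
Pharmaceutical compositions can comprise polypeptides, antibodies, or nucleic acid of the invention. The pharmaceutical compositions will comprise a therapeutically effective amount of the polypeptides, antibodies, or polynucleotides of the invention.
The term "therapeutically effective amount" as used herein refers to an amount of a therapeutic agent that treats, ameliorates, or prevents a target disease or condition, or that exhibits a detectable therapeutic or preventative effect. The effect can be detected by, for example, chemical markers or antigen levels. Therapeutic effects also include reduction in physical symptoms, such as decreased body temperature. The precise effective amount for a subject will depend upon the subject's size and health, the nature and extent of the condition, and the therapeutics or combination of therapeutics selected for administration. Thus, it is not useful to specify an exact effective amount in advance. However, the effective amount for a given situation can be determined by routine experimentation and is within the judgement of the clinician.
For purposes of the present invention, an effective dose will be from about 0.01 mg/kg to 50 mg/kg, or 0.05 mg/kg to 10 mg/kg, of the DNA constructs in the individual to which it is administered.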
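The per-kilogram ranges above translate into absolute amounts by simple multiplication, as in the following arithmetic sketch. It is illustrative only and not dosing guidance; the function name and the example body weight are hypothetical.

```python
def dose_range_mg(body_weight_kg, low_mg_per_kg=0.01, high_mg_per_kg=50.0):
    """Return (low, high) total dose in mg for a given body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if low_mg_per_kg > high_mg_per_kg:
        raise ValueError("low bound exceeds high bound")
    return (low_mg_per_kg * body_weight_kg, high_mg_per_kg * body_weight_kg)
```

For a hypothetical 70 kg individual, the broader range corresponds to roughly 0.7 mg to 3500 mg of DNA construct, and the narrower range (`low_mg_per_kg=0.05, high_mg_per_kg=10.0`) to roughly 3.5 mg to 700 mg.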
药物组合物还可含有药学上可接受的载体。术语“药学上可接受的载体”指用于治疗剂(例如抗体、多肽、基因或其它治疗剂)给药的载体。该术语指这样一些药剂载体:它们本身不诱导产生对接受该组合物的个体有害的抗体,且给药后没有过分的毒性。合适的载体可能是大的、代谢缓慢的大分子,如蛋白质、多糖、聚乳酸(polylactic acid)、聚乙醇酸、氨基酸聚合物、氨基酸共聚物以及无活性的病毒颗粒。这些载体是本领域普通技术人员所熟知的。
本文可用的药学上可接受的盐例如有:无机酸盐,如盐酸盐、氢溴酸盐、磷酸盐、硫酸盐等;以及有机酸的盐,如乙酸盐、丙酸盐、丙二酸盐、苯甲酸盐等。在Remington′s Pharmaceutical Sciences(Mack Pub.Co.,N.J.1991)中可找到关于药学上可接受的赋形剂的充分讨论。
治疗性组合物中的药学上可接受的载体可含有液体,如水、盐水、甘油和乙醇。另外,这些载体中还可能存在辅助性的物质,如润湿剂或乳化剂、pH缓冲物质等。通常,可将治疗性组合物制成可注射剂,例如作为液体溶液或悬液;还可制成在注射前适合配入溶液或悬液中、液体载体的固体形式。脂质体也包括在药学上可接受的载体的定义中。
输药方法
一旦配成本发明的组合物,可将其直接给予对象。待治疗的对象可以是动物;尤其可以治疗人对象。
直接输送该组合物通常可通过皮下、腹膜内、静脉内或肌内注射或输送至组织间隙来实现。组合物也可输送至病灶区。其它给药方式包括口服和肺给药、栓剂和透皮或经皮肤应用(例如参见WO98/20734)、用针、基因枪或手持喷雾器(hypospray)。治疗剂量方案可以是单剂方案或多剂方案。
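Vaccines
Vaccines according to the invention may either be prophylactic (i.e. to prevent infection) or therapeutic (i.e. to treat disease after infection).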
这些疫苗包含免疫性抗原、免疫原、多肽、蛋白或核酸,通常与“药学上可接受的载体”组合,这些载体包括本身不诱导产生对接受该组合物的个体有害的抗体的任何载体。合适的载体通常是大的、代谢缓慢的大分子,如蛋白质、多糖、聚乳酸、聚乙醇酸、氨基酸聚合物、氨基酸共聚物、脂质凝集物(如油滴或脂质体)以及无活性的病毒颗粒。这些载体是本领域普通技术人员所熟知的。另外,这些载体可作为免疫刺激剂(“佐剂”)。另外,抗原或免疫原可以和细菌类毒素(如白喉、破伤风、霍乱、幽门螺杆菌等病原体的类毒素)偶联。
增强组合物效果的较佳的佐剂包括但不局限于:(1)铝盐(alum),如氢氧化铝、磷酸铝、硫酸铝等;(2)水包油的乳剂配方(有或没有其它特异性的免疫刺激剂,如胞壁酰肽(见下文)或细菌细胞壁成分),例如,例(a)MF59TM(WO 90/14837;《疫苗设计:亚基和佐剂方法》第10章,编者Powell和Newman,Plenum Press 1995),其含有5%鲨烯、0.5%吐温80和0.5%Span 85(任选地含有不同量的MTP-PE(见下文),虽然并不需要),用微量流化器(如110Y型微量流化器(Microfluidics,Newton,MA))制成亚微米级颗粒;(b)SAF,其含有10%鲨烯、0.4%吐温 80、5%普卢兰尼克(pluronic)嵌段聚合物L121以及thr-MDP(见下文),微量流化成亚微米级乳剂或涡流振荡产生粒径较大的乳剂,和(c)RibiTM佐剂系统(RAS)(Ribi Immunochem,Hamilton,MT),其含有2%鲨烯、0.2%吐温80以及取自单磷酰脂A(MPL)、二霉菌酸海藻糖酯(TDM)、和细胞壁骨架(CWS)的一种或多种细菌细胞壁组分,较佳的是MPL+CWS(DetoxTM);(3)皂素佐剂,例如可采用StimulonTM(Cambridge Bioscience,Worcester,MA)或从其产生的颗粒,如ISCOM(免疫刺激性复合物);(4)Freund完全佐剂(CFA)和Freund不完全佐剂(IFA);(5)细胞因子,如白介素(如IL-1、IL-2、IL-4、IL-5、IL-6、IL-7、IL-12等)、干扰素(如γ干扰素)、巨噬细胞集落刺激因子(M-CFS)、肿瘤坏死因子(TNF)等;以及(6)作为免疫刺激剂来增强组合物效果的其它物质。Alum和MF59TM是较佳的。
如上所述,胞壁酰肽包括但不局限于,N-乙酰-胞壁酰-L-苏氨酰-D-异谷氨酰胺(thr-MDP)、N-乙酰-去胞壁酰-L-丙氨酰-D-异谷氨酰胺(nor-MDP)、N-乙酰胞壁酰-L-丙氨酰-D-异谷氨酰氨酰基-L-丙氨酸-2-(1′-2′-二棕榈酰-sn-甘油-3-羟基磷酰氧)-乙胺(MTP-PE)等。
免疫原性组合物(如免疫用抗原/免疫原/多肽/蛋白质/核酸,药学上可接受的载体以及佐剂)通常含有稀释剂,如水,盐水,甘油,乙醇等。另外,辅助性物质,如润湿剂或乳化剂、pH缓冲物质等可存在于该赋形剂中。
通常,可将免疫原性组合物制成可注射剂,例如作为液体溶液或悬液;还可制成在注射前适合配入溶液或悬液、液体赋形剂的固体形式。该制剂还可乳化或包封在脂质体中,在上述药学上可接受的载体下增强佐剂效果。
用作疫苗的免疫原性组合物包含免疫学有效量的抗原性或免疫原性多肽,以及上述其它所需的组分。“免疫学有效量”指以单剂或连续剂一部分给予个体的量对治疗或预防是有效的。该用量根据所治疗个体的健康状况和生理状况、所治疗个体的类别(如非人灵长类等)、个体免疫系统合成抗体的能力、所需的保护程度、疫苗的配制、治疗医师对医疗状况的评估、及其它的相关因素而定。预计该用量将在相对较宽的范围内,可通过常规实验来确定。
传统方法是从肠胃外(皮下、肌内、或透皮/经皮肤(如WO98/20734))途径通过注射给予免疫原性组合物。适合其它给药方式的其它配方包括口服和肺制剂、栓剂和透皮应用。治疗剂量可以是单剂方案或多剂方案。疫苗可以结合其它免疫调节剂一起给予。
作为以蛋白质为基础的疫苗的备选方案是,可以采用DNA疫苗接种[例如,Robinson和Torres(1997)Seminars in Immunology 9:271-283;Donnelly等人(1997)Annu Rev.Immunol 15:617-648;见下文]。
基因输送载体
用于输送构建物的基因治疗载体可以口服或全身性给予,其中所述构建物包括本发明治疗剂的编码序列,将其输送至哺乳动物以便在哺乳动物体内表达。这些构建物可利用体内或活体外方式中的病毒或非病毒载体方法。这些编码序列的表达可用内源哺乳动物启动子或异源启动子诱导。编码序列的体内表达可以是组成型的或受调控的。
本发明包括能表达所涉及的核酸序列的基因输送载体。基因输送载体宜为病毒载体,更佳的是逆转录病毒、腺病毒、腺伴随病毒(AAV)、疱疹病毒或甲病毒载体。病毒载体还可以是星状病毒、冠状病毒、正粘病毒、乳多空病毒、副粘病毒、细小病毒、小核糖核酸病毒、痘病毒或披膜病毒的病毒载体。通常参见Jolly(1994)CancerGene Therapy 1:51-64;Kimura(1994)Human Gene Therapy 5:845-852;Connelly(1995)Human Gene Therapy 6:185-193;以及Kaplitt(1994)Nature Genetics 6:148-153。
逆转录病毒载体是本领域中熟知的,我们认为任何逆转录病毒基因治疗载体均可用于本发明,包括B、C和D型逆转录病毒、异嗜性逆转录病毒(例如NZB-X1、NZB-X2和NZB9-1(见O′Neill(1985)J.Virol.53:160)广食性逆转录病毒如MCF和MCF-MLV(见Kelly(1983)J.Virol 45:291)、泡沫病毒和慢病毒。见《RNA肿瘤病毒》第2版,Cold Spring Harbor Laboratory,1985。
逆转录病毒基因治疗载体的诸部分可从不同逆转录病毒衍生获得。例如,逆转录载体LTR可以从鼠肉瘤病毒衍生获得,tRNA结合位点可以从Rous肉瘤病毒衍生获得,包装信号从鼠白血病病毒获得,第二链的合成起点从禽类白血病病毒获得。
可将这些重组逆转录病毒导入合适的包装细胞系,用来产生转导感受态逆转录病毒载体颗粒(见美国专利5,591,624)。通过将嵌合性整合酶掺入逆转录病毒颗粒,构建逆转录病毒载体,以便将其定点整合到宿主细胞DNA中(见WO96/37626)。较佳的是重组病毒载体是复制缺陷型重组病毒。
适合与上述逆转录病毒载体一起使用的包装细胞系是本领域中熟知的,很容易制得(见WO95/30763和WO92/05266),并能用来产生能生产重组载体颗粒的生产型细胞系(也称为载体细胞系或“VCL”)。包装细胞系宜从人亲代细胞(如HTl080细胞)或貂亲代细胞系制取,以便消除人血清的灭活作用。
用来构建逆转录病毒基因治疗载体的较佳的逆转录病毒包括禽类白血病病毒、牛白血病病毒、鼠白血病病毒、水貂细胞灶诱导病毒、鼠肉瘤病毒、网状内皮组织增殖病毒和Rous肉瘤病毒。特别佳的鼠白血病病毒包括4070A和1504A(Hartley和Rowe(1976)J Virol 19:19-25),Abelson(ATCC No.VR-999),Friend(ATCCNo.VR-245),Graffi,Gross(ATCC Nol VR-590),Kirsten,Harvey肉瘤病毒和Rauscher(ATCC No.VR-998)以及莫洛尼鼠白血病病毒(ATCC No.VR-190)。这些逆转录病毒可以从保藏机构或保藏中心如Rockville,Maryland的美国典型培养物保藏中心(ATCC)获得,或用常用的技术从已知来源分离获得。
可用于本发明的典型的已知逆转录病毒基因治疗载体包括在以下专利申请中描述的那些载体:GB2200651,EP0415731,EP0345242,EP0334301,WO89/02468;WO89/05349,WO89/09271,WO90/02806,WO90/07936,WO94/03622,WO93/25698,WO93/25234,WO93/11230,WO93/10218,WO91/02805,WO91/02825,WO95/07994,US 5,219,740,US 4,405,712,US 4,861,719,US 4,980,289,US 4,777,127,US 5,591,624.另见Vile(1993)Cancer Res 53:3860-3864;Vile(1993) Cancer Res 53:962-967;Ram(1993)Cancer Res 53(1993)83-88;Takamiya(1992)J Neurosci Res 33:493-503;Baba(1993)JNeurosurg 79:729-735;Mann(1983)Cell 33:153;Cane(1984)Proc Natl AcadSci 81:6349;以及Miller(1990) Human Gene Therapy 1。
人腺病毒基因治疗载体也是本领域中已知的,并可用于本发明。例如参见Berkner(1988)Biotechniques 6:616和Rosenfeld(1991)Science 252:431,以及WO93/07283,WO93/06223和WO93/07282。用于本发明的典型的已知的腺病毒基因治疗载体包括在上述文献以及下述专利中描述的那些例子:WO94/12649,WO93/03769,WO93/19191,WO94/28938,WO95/11984,WO95/00655,WO95/27071,WO95/29993,WO95/34671,WO96/05320,WO94/08026,WO94/11506,WO93/06223,WO94/24299,WO95/14102,WO95/24297,WO95/02697,WO94/28152,WO94/24299,WO95/09241,WO95/25807,WO95/05835,WO94/18922和WO95/09654。另外,可以采用Curiel(1992)Hum.Gene Ther.3:147-154中描述的给予和已杀死腺病毒相连的DNA的方法。本发明的基因输递载体还包括腺病毒伴随病毒(AAV)载体。用于本发明的这种载体的主要且较佳的例子是Srivastava,WO93/09239中公开的AAV-2为基的载体。最佳的AAV载体包含两个AAV反向末端重复序列,其中通过替换核苷酸对天然D-序列进行修饰,使至少5-18个天然的核苷酸(较佳的至少10-18个天然核苷酸,最佳的10个天然核苷酸)被保留下来,而D-序列其余的核苷酸缺失或被非天然核苷酸取代。AAV末端反向重复序列的天然D-序列是每个AAV反向末端重复序列中不参与HP形成的20个串联核苷酸的序列(即每一端有一个序列)。非天然的替换核苷酸可以是天然D-序列该位置中所见核苷酸除外的任何核苷酸。其它可采用典型AAV载体是pWP-19、pWN-1,两者均公开在Nahreini(1993)Gene 124:257-262中。这样的AAV的另一个例子是psub201(见Samulski(1987)J.Virol.61:3096)。另一个典型的AAV载体是Double-D ITR载体。Double-D ITR载体的构建方案公开在美国专利5,478,745中。还有其它的载体是公开在Carter的美国专利4,797,368和Muzyczka的美国专利5,139,941、Chartejee的美国专利5,474,935和Kotin的WO94/288157中的载体。可用于本发明的另一个AAV载体例子是SSV9AFABTKneo,它含有AFP增强子和白蛋白启动子,并且主要指导肝内表达。其结构和构建方案公开在Su(1996)Human GeneTherapy 7:463-470中。其它的AAV基因治疗载体在美国专利5,354,678, 5,173,414,5,139,941, 5,252,479中有所描述。
本发明的基因治疗载体还包括疱疹载体。主要且较佳的例子是含有编码胸苷激酶多肽的序列的单纯疱疹病毒载体,如公开在US5,288,641和EP0176170(Roizman)中的那些。其它典型的单纯疱疹病毒载体包括WO95/04139中公开的HFEM/ICP6-LacZ(Wistar Institute)、Geller(1988)Science 241:1 667-1669以及WO90/09441和WO92/07945中公开的pHSVlac、Fink(1992)Human Gene Therapy 3:11-19中描述的HSV Us3∷pgC-lacZ、EP 0453242(Breakefield)中描述的HSV 7134、2 RH 105和GAL4以及保藏于ATCC、保藏号为ATCC VR-977和ATCC VR-260的那些病毒。
还考虑到甲病毒基因基因治疗载体也可用于本发明。较佳的甲病毒载体是新培斯病毒载体。披膜病毒、Semliki Forest病毒(ATCC VR-67;ATCC VR-1247)、Middleberg病毒(ATCC VR-370)、Ross River病毒(ATCC VR-373;ATCC VR-1246)、委内瑞拉马脑炎病毒(ATCC VR923;ATCC VR-1250;ATCC VR-1249;ATCC VR-532)、以及在美国专利5,091,309,5,217,879以及WO92/10578中描述的那些。更具体地说,可以采用1995年3月15日提交的美国申请08/405,627、WO94/21792、WO92/10578、WO95/07994、US 5,091,309和US 5,217,879中描述的那些甲病毒载体。这些甲病毒可以从保藏机构或保藏中心如Rockville,Maryland的美国典型培养物保藏中心(ATCC)获得,或用常用的技术从已知来源分离获得。较佳的是,采用细胞毒性减少的甲病毒载体(见USSN 08/679640)。
DNA载体系统,如真核分层的(layered)表达系统也可用于表达本发明的核酸。关于真核分层的表达系统详见WO95/07994。较佳的,本发明的真核分层表达系统宜从甲病毒载体衍生获得,更佳的从新培斯病毒载体衍生获得。
适用于本发明的其它病毒载体包括:从脊髓灰质炎病毒衍生的病毒,例如ATCCVR-58以及在Evans,Nature 339(1989)385和Sabin(1973)J.Biol.Standardization 1:115中描述的那些;鼻病毒,例如ATCC VR-1110以及在Arnold(1990)J Cell Biochem L401中描述的那些;痘病毒,如金黄色痘病毒或牛痘病毒,例如ATCC VR-111和ATCCVR-2010,以及在Fisher-Hoch(1989)Proc Natl AcadSci 86:317;Flexner(1989)Ann NYAcad Sci 569:86,Flexner (1990)Vaccine 8:17;US 4,603,112,US 4,769,330以及WO89/01973中描述的那些;SV40病毒,例如ATCC VR-305以及在Mulligan(1979)Nature 277:108和Madzak(1992)J Gen Virol 73:1533中描述的那些;流感病毒,例如ATCC VR-797以及用例如US 5,166,057和Enami(1990)Proc Natl Acad Sci87:3802-3805;Enami和Palese(1991)J Virol 65:2711-2713;Luytjes(1989)Cell 59:110中所述的反基因技术制得的重组流感病毒(另见McMichael(1983)NEJ Med 309:13,Yap(1978)Nature 273:238以及Nature(1979)277:108);EP-0386882和Buchschacher(1992)J.Virol. 66:2731中描述的人免疫缺陷病毒;麻疹病毒,例如ATCC VR-67和VR-1247,以及EP-0440219中描述的那些;奥拉病毒,例如ATCC VR-368;Bebaru病毒,例如ATCC VR-600和ATCC VR-1240;Cabassou病毒,例如ATCC VR-922;屈曲病毒,例如ATCC VR-64和ATCC VR-1241;Fort Morgan病毒,例如ATCCVR-924;Getah病毒,例如ATCC VR-369和ATCC VR-1243;Kyzylagach病毒,例如ATCC VR-927;Mayaro病毒,例如ATCC VR-66;Mucambo病毒,例如ATCC VR-580和ATCC VR-1244;Ndumu病毒,例如ATCC VR-371;Pixuna病毒,例如ATCC VR-372和ATCC VR-1245;Tonate病毒,例如ATCC VR-925;Triniti病毒,例如ATCC VR-469;Una病毒,例如ATCC VR-374;Whataroa病毒,例如ATCC VR-926;Y-62-33病毒,例如ATCC VR-375;O′Nyong病毒,东部脑炎病毒,例如ATCC VR-65和ATCCVR-1242;西部脑炎病毒,例如ATCC VR-70,ATCC VR-1251,ATCC VR-622和ATCCVR-1252;和冠状病毒,例如ATCC VR-740和在Hamre(1966)Proc Soc Exp Biol Med121:190中描述的那些。
将本发明的组合物输送至细胞内并不局限于上述病毒载体。还可采用其它输送方法和介质,例如核酸表达载体、与已被杀死的腺病毒相连或不相连的单独的聚阳离子凝缩的DNA(例如参见1994年12月30日美国申请No.08/366,787和Curiel(1992)Hum Gene Ther 3:147-154)、配体连接的DNA(例如参见Wu(1989)J.Biol.Chem.264:16985-16987)、真核细胞输送载体细胞(例如参见1994年5月9日提交的美国申请No.08/240,030以及美国申请No.08/404,796)、光聚合水凝胶材料的沉淀、手提式基因转移颗粒枪(如美国专利5,149,655所述)、电离辐射(如US5,206,152和WO92/11033所述)、核电荷中和或与细胞膜融合。其它方法在Philip(1994)Mol CellBiol 14:2411-2418以及Woffendin(1994)Proc.Natl.Acad.Sci.91:1581-1585中有所描述。
可以采用颗粒介导的基因转移,例如参见美国申请No.60/023,867。简言之,可将序列插入含有控制高水平表达的常规序列的常规载体中,然后和合成性基因转移分子一起培育,这些基因转移分子例如是聚合性DNA-结合阳离子(如聚赖氨酸、鱼精蛋白和白蛋白),其与细胞寻靶配体(如脱唾液酸血清类粘蛋白(如Wu和Wu(1987)J.Biol.Chem.262:4429-4432所述)、胰岛素(如Hucked(1990)Biochem Pharmacol40:253-263所述)、半乳糖(如Plank(1992)Bioconjugate Chem 3:533-539所述)、乳糖或运铁蛋白)相连。
还可使用裸露的DNA。典型的裸露DNA导入方法在WO 90/11902和US 5,580,859中有所描述。用可生物降解的乳胶珠可以改善摄取效果。在对珠粒的胞吞作用开始后,DNA包被的乳胶珠粒被有效地运输到细胞中。通过处理珠粒以提高其疏水性可进一步改进该方法,从而帮助破坏核内体和将DNA释放到细胞质中。
可作为基因输送载体的脂质体在US 5,422,120,WO95/13796,WO94/23697,WO91/14445和EP-524,968中有所描述。如USSN 60/023,867中所描述的,在非病毒输送时,可将编码多肽的核酸序列插入含有控制高水平表达的常规序列的常规载体中,然后和合成性基因转移分子一起培育,这些基因转移分子例如是聚合性DNA-结合阳离子(如聚赖氨酸、鱼精蛋白和白蛋白),其与细胞寻靶配体(如脱唾液酸血清类粘蛋白、胰岛素、半乳糖、乳糖或运铁蛋白)相连。其它输送系统包括采用脂质体来包裹DNA,该DNA所含基因在各种组织特异性或活性普遍存在的启动子控制下。适用的其它非病毒输送系统包括机械输送系统,如Woffendin等人(1994)Proc.Natl.Acad.Sci.USA 91(24):11581-11585中描述的方法。另外,该系统的编码序列和表达产物可以通过光聚合的水凝胶材料的沉淀来输送。可用来输送编码序列的其它基因输送常规方法例如包括,用手提式基因转移颗粒枪(如美国专利5,149,655所述);用电离辐射来激活转移的基因(如US 5,206,152和WO92/11033所述)。
典型的脂质体和聚阳离子基因输送载体在下列文献中有所描述:US 5,422,120和4,762,915;WO95/13796;WO94/23697;WO91/14445;EP-0524968;Stryer,Biochemistry,236-240页(1975)W.H.Freeman,San Francisco;Szoka(1980)BiochemBiophys Acta 600:1;Bayer(1979)Biochem Biophys Acta 550:464;Rivnay(1987)MethEnzymol 149:119;Wang(1987)Proc NatlAcad Sci 84:7851;Plant(1989)Anal Biochem176:420。
多核苷酸组合物可包含治疗有效量的基因治疗载体,其定义如上所述。出于本发明的目的,有效的剂量是给予个体约0.01毫克/千克至50毫克/千克或0.05毫克/千克至10毫克/千克的DNA构建物。
输送方法
一旦配制成后,本发明的多核苷酸组合物可以以下三种方式给予:(1)直接给予对象;(2)活体外输送至从对象衍生获得的细胞;或(3)体外表达重组蛋白。待处理的对象可以是哺乳动物或鸟类。另外,也可对人进行治疗。
直接输送该组合物通常可通过皮下、腹膜内、静脉内或肌内注射或输送至组织间隙来实现。组合物也可输送至病灶区。其它给药方式包括口服和肺给药、栓剂和透皮或经皮肤应用(例如参见WO98/20734)、用针、基因枪或手持喷雾器(hypospray)。治疗剂量方案可以是单剂方案或多剂方案。
活体外输送以及将转化的细胞重新植入对象体内的方法是本领域所熟知的,在例如WO93/14778中有所描述。用于活体外应用的细胞例子包括例如干细胞、尤其是造血细胞、淋巴细胞、巨噬细胞、树突细胞或肿瘤细胞。
通常,对于活体外和体外应用,核酸的输送可通过以下步骤来实现,例如有葡聚糖介导的转染、磷酸钙沉淀、Polybrene介导的转染、原生质体融合、电穿孔、将多核苷酸包囊在脂质体中以及将DNA直接显微注射到胞核中,所有这些均是本领域所熟知的。
多核苷酸和多肽药物组合物
除了上述的药学上可接受的载体和盐外,多核苷酸和多肽组合物中还可采用下列附加试剂。
A.多肽
一个例子是多肽,其包括但不局限于:脱唾液酸血清类粘蛋白(ASOR);运铁蛋白;脱唾液酸糖蛋白;抗体;抗体片段;铁蛋白;白介素;干扰素;粒细胞-巨噬细胞集落刺激因子(GM-CSF);粒细胞集落刺激因子(G-CSF)、巨噬细胞集落刺激因子(M-CSF)、干细胞因子和促红细胞生成素。还可使用病毒抗原,如包膜蛋白。另外,可用来自其它侵袭性生物的蛋白,例如疟原虫恶性疟疾的环孢子蛋白的17个氨基酸的肽(称为RⅡ)。
B.激素,维生素等
其它可包括的种类例如是:激素、类固醇、雄激素、雌激素、甲状腺激素或维生素、叶酸。
C.聚亚烷基、多糖等
另外,聚(亚烷基)二醇可以和所需的多核苷酸/多肽组合在一起。在一个较佳的实施方案中,聚(亚烷基)二醇是聚乙二醇。另外,可以加入单糖、二糖或多糖。在此方面的一个较佳实施方案中,多糖是葡聚糖或DEAE-葡聚糖。另外有脱乙酰壳多糖和聚交酯-聚乙醇酸内酯共聚物。
D.脂质和脂质体
所需的多核苷酸/多肽还可在输送给对象或对象衍生的细胞之前包裹在脂质中或包裹在脂质体中。
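Lipid encapsulation is generally accomplished using liposomes that are able to stably bind or entrap and retain nucleic acid. The ratio of condensed polynucleotide to lipid preparation can vary but will generally be around 1:1 (mg DNA:micromoles lipid), or more of lipid. For a review of the use of liposomes as carriers for delivery of nucleic acids, see Hug and Sleight (1991) Biochim. Biophys. Acta 1097:1-17; Straubinger (1983) Meth. Enzymol. 101:512-527.
Liposomal preparations for use in the present invention include cationic (positively charged), anionic (negatively charged) and neutral preparations. Cationic liposomes have been shown to mediate intracellular delivery, in functional form, of plasmid DNA (Felgner (1987) Proc. Natl. Acad. Sci. USA 84:7413-7416); mRNA (Malone (1989) Proc. Natl. Acad. Sci. USA 86:6077-6081); and purified transcription factors (Debs (1990) J. Biol. Chem. 265:10189-10192).
Cationic liposomes are readily available. For example, N-[1-(2,3-dioleyloxy)propyl]-N,N,N-trimethylammonium (DOTMA) liposomes are available under the trademark Lipofectin from GIBCO BRL, Grand Island, NY. (See also Felgner, supra.) Other commercially available liposomes include transfectace (DDAB/DOPE) and DOTAP/DOPE (Boehringer). Other cationic liposomes can be prepared from readily available materials using techniques well known in the art. See, e.g., Szoka (1978) PNAS 75:4194-4198 and WO90/11092 for a description of the synthesis of DOTAP (1,2-bis(oleoyloxy)-3-(trimethylammonio)propane) liposomes.
Similarly, anionic and neutral liposomes are readily available, for example from Avanti Polar Lipids (Birmingham, AL), or can be easily prepared using readily available materials. Such materials include phosphatidyl choline, cholesterol, phosphatidyl ethanolamine, dioleoylphosphatidyl choline (DOPC), dioleoylphosphatidyl glycerol (DOPG), and dioleoylphosphatidyl ethanolamine (DOPE). These materials can also be mixed with the DOTMA and DOTAP starting materials in appropriate ratios. Methods for making liposomes using these materials are well known in the art.
The liposomes can comprise multilamellar vesicles (MLVs), small unilamellar vesicles (SUVs), or large unilamellar vesicles (LUVs). The various liposome-nucleic acid complexes are prepared using methods known in the art. See e.g. Straubinger (1983) Meth. Enzymol. 101:512-527; Szoka (1978) Proc. Natl. Acad. Sci. USA 75:4194-4198; Papahadjopoulos (1975) Biochim. Biophys. Acta 394:483; Wilson (1979) Cell 17:77; Deamer and Bangham (1976) Biochim. Biophys. Acta 443:629; Ostro (1977) Biochem. Biophys. Res. Commun. 76:836; Fraley (1979) Proc. Natl. Acad. Sci. USA 76:3348; Enoch and Strittmatter (1979) Proc. Natl. Acad. Sci. USA 76:145; Fraley (1980) J. Biol. Chem. 255:10431; Szoka and Papahadjopoulos (1978) Proc. Natl. Acad. Sci. USA 75:145; and Schaefer-Ridder (1982) Science 215:166.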
E.脂蛋白
另外,脂蛋白也可加入待输送的多核苷酸/多肽中。采用的脂蛋白的例子包括:乳糜微粒、HDL、IDL、LDL和VLDL。还可采用这些蛋白的突变体、片段或融合物。另外,可采用天然存在的脂蛋白的修饰物,例如乙酰化的LDL。这些脂蛋白能使多核苷酸的输送指向表达脂蛋白受体的细胞。较佳的,如果待输送的多核苷酸中加入了脂蛋白,则组合物中不加入其它寻靶的配体。
天然存在的脂蛋白包含脂质和蛋白部分。蛋白部分称为脱辅基蛋白。目前,已经分离并鉴定出了脱辅基蛋白A、B、C、D和E。其中至少有两个含有几种蛋白,用罗马数字AⅠ、AⅡ、AⅣ;CⅠ、CⅡ、CⅢ命名。
脂蛋白可包含多个脱辅基蛋白。例如,天然存在的乳糜微粒包含A、B、C和E,随着时间的推移,这些脂蛋白失去A,得到C和E脱辅基蛋白。VLDL包含A、B、C、和E脱辅基蛋白,LDL包含脱辅基蛋白B;HDL包含脱辅基蛋白A、C和E。
这些脱辅基蛋白的氨基酸是已知的,并且在下列文献中有所描述:Breslow(1985)Annu Rev.Biochem 54:699;Law(1986)Adv.Exp Med.Biol.151:162;Chen(1986)J BiolChem 261:12918;Kane(1980)Proc Natl Acad Sci USA 77:2465;and Utermann(1984)Hum Genet 65:232。
脂蛋白含有各种脂质,包括甘油三酯、胆固醇(游离的和酯型)以及磷脂。天然存在的脂蛋白中脂质的组成是不同的。例如,乳糜微粒主要含甘油三酯。关于天然存在的脂蛋白的脂质含量更详细的描述可在例如Meth.Enzymol.128(1968)中找到。选择脂质的组成,以使脱辅基蛋白的构型与受体结合活性相符。还可选择脂质的组成,以促进与多核苷酸结合分子的疏水性相互作用和结合。
天然存在的脂蛋白可以用诸如超离心的方法从血清中分离出来。这些方法在Meth.Enzymol.(同上);Pitas(1980)J.BioChem.255:5454-5460以及Mahey(1979)J Clin.Invest 64:743-750中有所描述。脂蛋白还可在体外产生,或通过在所需宿主细胞中表达脱辅基蛋白基因的重组方法产生。例如参见Atkinson(1986)Annu Rev Biophys Chem15:403和Radding(1958)Biochim Biophys Acta 30:443。脂蛋白也可购自商业供应商,如Biomedical Techniologies,Inc.,Stoughton,Massachusetts,USA。关于脂蛋白的进一步描述可在Zuckermann等人的PCT/US97/14465中找到。
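F. Polycationic agents
Polycationic agents can be included, with or without lipoprotein, in a composition with the desired polynucleotide/polypeptide to be delivered.
Polycationic agents typically exhibit a net positive charge at physiologically relevant pH and are capable of neutralizing the electrical charge of nucleic acids to facilitate delivery to a desired location. These agents have in vitro, ex vivo, and in vivo applications. Polycationic agents can be used to deliver nucleic acids to a living subject intramuscularly, subcutaneously, etc.
The following are examples of polypeptides useful as polycationic agents: polylysine, polyarginine, polyornithine, and protamine. Other examples include histones, protamines, human serum albumin, DNA-binding proteins, non-histone chromosomal proteins, and coat proteins from DNA viruses, such as φX174. Transcription factors also contain domains that bind DNA and can therefore be used as nucleic acid condensing agents. Briefly, transcription factors such as C/CEBP, c-jun, c-fos, AP-1, AP-2, AP-3, CPF, Prot-1, Sp-1, Oct-1, Oct-2, CREP, and TFIID contain basic domains that bind DNA sequences.
Organic polycationic agents include: spermine, spermidine, and putrescine.
The dimensions and properties of a polycationic agent can be extrapolated from the list above, to construct other polypeptide polycationic agents or to produce synthetic polycationic agents.
Synthetic polycationic agents that can be used include, for example, DEAE-dextran and polybrene. Lipofectin™ and lipofectAMINE™ are monomers that form polycationic complexes when combined with polynucleotides/polypeptides.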
免疫诊断试验
本发明的奈瑟球菌抗原可用于免疫试验来检测抗体水平(或相反,可用抗奈瑟球菌抗体来检测抗原水平)。根据明确的免疫试验,可以开发出重组抗原,以代替侵入性诊断性方法。针对生物学样品(例如包括血液或血清样品)中的奈瑟球菌蛋白的抗体可以被检测出来。免疫试验的设计可作很大变化,其各种方案均是本领域中已知的。免疫试验的方案可采取例如竞争性、或直接反应或夹心型试验。方案例如还可采用固体支持物,或可以采用免疫沉淀法。大多数试验涉及采用有标记的抗体或多肽;该标记例如可以是荧光标记、化学发光标记、放射活性标记或染料分子。扩增探针信号的试验也是已知的;其例子是采用生物素和亲和素的试验,酶标记的和介导的免疫试验,如ELISA试验。
将合适的材料(包括本发明的组合物)以及进行试验所需的其它试剂和材料(例如合适的缓冲液、盐溶液等)和合适的试验说明包装到合适的容器中,构成适用于免疫诊断且含有适当标记的试剂的试剂盒。
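Nucleic acid hybridization
"Hybridization" refers to the association of two nucleic acid sequences with one another by hydrogen bonding. Typically, one sequence is fixed to a solid support and the other is free in solution. The two sequences are then placed in contact with one another under conditions that favor hydrogen bonding. Factors that affect this bonding include: the type and volume of solvent; reaction temperature; time of hybridization; agitation; agents to block the non-specific attachment of the liquid-phase sequence to the solid support (Denhardt's reagent or BLOTTO); concentration of the sequences; use of compounds to increase the rate of association of sequences (dextran sulfate or polyethylene glycol); and the stringency of the washing conditions following hybridization. See Sambrook et al. [supra] Volume 2, chapter 9, pages 9.47 to 9.57.
"Stringency" refers to conditions in a hybridization reaction that favor association of very similar sequences over sequences that differ. For example, the combination of temperature and salt concentration should be chosen so that the temperature is approximately 12 to 20°C below the calculated Tm of the hybrid under study. The temperature and salt conditions can often be determined empirically in preliminary experiments in which samples of genomic DNA immobilized on filters are hybridized to the sequence of interest and then washed under conditions of different stringencies. See Sambrook et al. at page 9.50.
When performing, for example, a Southern blot, the parameters to consider are (1) the complexity of the DNA being blotted and (2) the homology between the probe and the sequences being detected. The total amount of the fragment(s) to be studied can vary by an order of magnitude, from 0.1 to 1 µg for a plasmid or phage digest to 10^-9 to 10^-8 g for a single-copy gene in a highly complex eukaryotic genome. For lower-complexity polynucleotides, substantially shorter blotting, hybridization, and exposure times, a smaller amount of starting polynucleotide, and probes of lower specific activity can be used. For example, a single-copy yeast gene can be detected with an exposure time of only 1 hour, starting with 1 µg of yeast DNA, blotting for 2 hours, and hybridizing for 4-8 hours with a probe of 10^8 cpm/µg. For a single-copy mammalian gene, a conservative approach would start with 10 µg of DNA, blot overnight, and hybridize overnight in the presence of 10% dextran sulfate using a probe of greater than 10^8 cpm/µg, resulting in an exposure time of about 24 hours.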
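Several factors can affect the melting temperature (Tm) of a DNA-DNA hybrid between the probe and the fragment of interest and, consequently, the appropriate conditions for hybridization and washing. In many cases the probe is not 100% homologous to the fragment. Other commonly encountered variables include the length and total G+C content of the hybridizing sequences and the ionic strength and formamide content of the hybridization buffer. The effects of all of these factors can be approximated by a single equation:
Tm = 81 + 16.6(log10 Ci) + 0.4[%(G+C)] - 0.6(%formamide) - 600/n - 1.5(%mismatch)
where Ci is the salt concentration (monovalent ions) and n is the length of the hybrid in base pairs (slightly modified from Meinkoth and Wahl (1984) Anal. Biochem. 138:267-284).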
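The equation above can be evaluated directly. The following is a minimal sketch, not part of the original disclosure; the function name and argument names are illustrative assumptions, and the salt concentration is assumed to be given in mol/L.

```python
import math


def hybrid_tm(salt_molar, gc_percent, formamide_percent, length_bp,
              mismatch_percent=0.0):
    """Approximate Tm (deg C) of a DNA-DNA hybrid.

    Implements Tm = 81 + 16.6*log10(Ci) + 0.4*%(G+C) - 0.6*%formamide
                    - 600/n - 1.5*%mismatch
    salt_molar is Ci (monovalent ions, assumed mol/L); length_bp is n.
    """
    if salt_molar <= 0 or length_bp <= 0:
        raise ValueError("salt concentration and hybrid length must be positive")
    return (81.0
            + 16.6 * math.log10(salt_molar)
            + 0.4 * gc_percent
            - 0.6 * formamide_percent
            - 600.0 / length_bp
            - 1.5 * mismatch_percent)


# Example: 1 M monovalent salt, 50% G+C, 50% formamide, 600 bp hybrid,
# with and without 10% mismatch.
perfect = hybrid_tm(1.0, 50, 50, 600)
mismatched = hybrid_tm(1.0, 50, 50, 600, mismatch_percent=10)
```

Each percentage point of mismatch lowers the estimate by 1.5°C, which is why less homologous probes call for lower hybridization temperatures.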
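In designing a hybridization experiment, some factors affecting nucleic acid hybridization can be conveniently altered. The temperature of the hybridization and washes and the salt concentration during the washes are the simplest to adjust. As the temperature of the hybridization increases (i.e. stringency), it becomes less likely for hybridization to occur between strands that are non-homologous, and as a result, background decreases. If the radiolabeled probe is not completely homologous with the immobilized fragment (as is frequently the case in gene family and interspecies hybridization experiments), the hybridization temperature must be reduced, and background will increase. The temperature of the washes affects the intensity of the hybridizing band and the degree of background in a similar manner. The stringency of the washes also increases as the salt concentration decreases.
In general, convenient hybridization temperatures in the presence of 50% formamide are 42°C for a probe that is 95% to 100% homologous to the target fragment, 37°C for 90% to 95% homology, and 32°C for 85% to 90% homology. For lower homologies, the formamide content should be lowered and the temperature adjusted accordingly, using the equation above. If the homology between the probe and the target fragment is not known, the simplest approach is to start with both hybridization and wash conditions that are non-stringent. If non-specific bands or high background are observed after autoradiography, the filter can be washed at high stringency and re-exposed. If the time required for exposure makes this approach impractical, several hybridization and/or washing stringencies should be tested in parallel.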
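The temperature guideline for 50% formamide can be written as a small lookup. This is a sketch only: the function name is hypothetical, and because the stated ranges share their end points (95% and 90%), the code assumes a shared end point belongs to the higher-homology bracket.

```python
def hybridization_temp_50_formamide(homology_percent):
    """Suggested hybridization temperature (deg C) in 50% formamide.

    Returns None below 85% homology, where the text instead advises
    lowering the formamide content and recalculating from the Tm equation.
    """
    if not 0 <= homology_percent <= 100:
        raise ValueError("homology must be a percentage between 0 and 100")
    if homology_percent >= 95:
        return 42
    if homology_percent >= 90:
        return 37
    if homology_percent >= 85:
        return 32
    return None
```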
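Nucleic acid probe assays
Methods such as PCR, branched DNA probe assays, or blotting techniques that use nucleic acid probes according to the invention can determine the presence of cDNA or mRNA. A probe is said to "hybridize" with a sequence of the invention if it can form a duplex or double-stranded complex that is stable enough to be detected.
The nucleic acid probes will hybridize to the Neisserial nucleotide sequences of the invention (including both sense and antisense strands). Although many different nucleotide sequences encode the amino acid sequence, the native Neisserial sequence is preferred because it is the actual sequence present in cells. mRNA represents a coding sequence, so a probe should be complementary to the coding sequence; single-stranded cDNA is complementary to mRNA, so a cDNA probe should be complementary to the non-coding sequence.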
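Choosing the strand a probe must match comes down to taking a reverse complement. The sketch below is illustrative only (the helper name is an assumption); it takes the first 20 nucleotides of SEQ ID 3 as the coding strand and derives the complementary sequence that would hybridize to it.

```python
# Translation table for Watson-Crick complements (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


coding_20mer = "ATGAAACAGACAGTCAAATG"        # first 20 nt of SEQ ID 3
probe = reverse_complement(coding_20mer)     # complementary to the coding strand
```

Applying the function twice returns the original sequence, which is a quick sanity check on any probe derived this way.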
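The probe sequence need not be identical to the Neisserial sequence (or its complement). Some variation in the sequence and length can lead to increased assay sensitivity, provided the nucleic acid probe can form a detectable duplex with the target nucleotides. The nucleic acid probe can also include additional nucleotides to stabilize the formed duplex. Additional Neisserial sequence may also be helpful as a label to detect the formed duplex. For example, a non-complementary nucleotide sequence may be attached to the 5' end of the probe, with the remainder of the probe sequence being complementary to a Neisserial sequence. Alternatively, non-complementary bases or longer sequences can be interspersed into the probe, provided that the probe sequence has sufficient complementarity with the Neisserial sequence to hybridize with it and thereby form a detectable duplex.
The exact length and sequence of the probe will depend on the hybridization conditions, such as temperature, salt concentration and the like. For example, for diagnostic applications, depending on the complexity of the analyte sequence, the nucleic acid probe typically contains at least 10-20 nucleotides, preferably 15-25, and more preferably at least 30 nucleotides, although it may be shorter than this. Short primers generally require lower temperatures to form sufficiently stable hybrid complexes with the template.
Probes may be produced by synthetic procedures, such as the method of Matteucci et al. [J. Am. Chem. Soc. (1981) 103:3185] or the method of Urdea et al. [Proc. Natl. Acad. Sci. USA (1983) 80:7461], or using commercially available automated oligonucleotide synthesizers.
The chemical nature of the probe can be selected according to preference. For certain applications, DNA or RNA is appropriate. For other applications, modifications may be incorporated, for example backbone modifications such as phosphorothioates or methylphosphonates, which can be used to increase in vivo half-life, alter RNA affinity, increase nuclease resistance, etc. [e.g. see Agrawal and Iyer (1995) Curr Opin Biotechnol 6:12-19; Agrawal (1996) TIBTECH 14:376-387]; analogues such as peptide nucleic acids may also be used [e.g. see Corey (1997) TIBTECH 15:224-229; Buchardt et al. (1993) TIBTECH 11:384-386].
Alternatively, the polymerase chain reaction (PCR) is another well-known means for detecting small amounts of target nucleic acid. The assay is described in Mullis et al. [Meth. Enzymol. (1987) 155:335-350] and in US patents 4,683,195 and 4,683,202. Two "primer" nucleotides hybridize with the target nucleic acid and are used to prime the reaction. The primers can comprise sequence that does not hybridize to the sequence of the amplification target (or its complement), either to aid duplex stability or, for example, to incorporate a convenient restriction site. Typically, such sequence will flank the desired Neisserial sequence.
A thermostable polymerase creates copies of target nucleic acid from the primers, using the original target nucleic acid as a template. After a threshold amount of target nucleic acid has been generated by the polymerase, it can be detected by more traditional methods, such as Southern blots. When the Southern blot method is used, the labelled probe will hybridize to the Neisserial sequence (or its complement).
mRNA or cDNA can also be detected by the traditional blotting techniques described in Sambrook et al. [supra]. mRNA, or cDNA generated from mRNA using a polymerase enzyme, can be purified and separated using gel electrophoresis. The nucleic acids on the gel are then blotted onto a solid support, such as nitrocellulose. The solid support is exposed to a labelled probe and then washed to remove any unhybridized probe. Next, the duplexes containing the labelled probe are detected. Typically, the probe is labelled with a radioactive moiety.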
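Brief description of the drawings
Figures 1-20 show biochemical data and sequence analysis for ORFs 37, 5, 2, 15, 22, 28, 32, 4, 61, 76, 89, 97, 106, 138, 23, 25, 27, 79, 85 and 132 from the Examples. M1 and M2 are molecular weight markers. Arrows indicate the position of the main recombinant product or, in Western blots, the position of the main N. meningitidis immunoreactive band. TP indicates N. meningitidis total protein extract; OMV indicates N. meningitidis outer membrane vesicle preparation. In the bactericidal assay results, a diamond (◆) shows preimmune data, a triangle (▲) shows GST control data, and a circle (●) shows data with recombinant N. meningitidis protein. Computer analyses show a hydrophilicity plot (upper), an antigenic index plot (middle), and an AMPHI analysis (lower). The AMPHI program was used to predict T-cell epitopes [Gao et al. (1989) J. Immunol. 143:3007; Roberts et al. (1996) AIDS Res Hum Retrovir 12:593; Quakyi et al. (1992) Scand J Immunol suppl. 11:9] and is available in the Protean package of DNASTAR, Inc. (1228 South Park Street, Madison, Wisconsin 53715 USA).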
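Examples
The following examples describe nucleic acid sequences that have been identified in N. meningitidis and N. gonorrhoeae, along with their putative translation products. Not all of the nucleic acid sequences are complete, i.e. some encode less than the full-length wild-type protein.
The examples are generally in the following format:
● a nucleotide sequence that has been identified in N. meningitidis (strain B)
● the putative translation product of this sequence
● a computer analysis of the translation product based on database comparisons
● corresponding gene and protein sequences identified in N. meningitidis (strain A) and in N. gonorrhoeae
● a description of the characteristics of the proteins that indicate they might be suitably antigenic
● results of biochemical analysis (expression, purification, ELISA, FACS, etc.)
The examples typically include details of sequence identity between species and strains. Proteins that are similar in sequence are generally similar in both structure and function, and sequence identity often indicates a common evolutionary origin. Comparison with the sequences of proteins of known function is widely used as a guide for assigning a putative protein function to a new sequence and has proved particularly useful in whole-genome analyses.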
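Sequence comparisons were performed at NCBI (http://www.ncbi.nlm.nih.gov) using the algorithms BLAST, BLAST2, BLASTn, BLASTp, tBLASTn, BLASTx, and tBLASTx [e.g. see Altschul et al. (1997) "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs" Nucleic Acids Research 25:3389-3402]. Searches were performed against the following databases: non-redundant GenBank+EMBL+DDBJ+PDB sequences and non-redundant GenBank CDS translations+PDB+SwissProt+SPupdate+PIR sequences.
To compare meningococcal and gonococcal sequences, the tBLASTx algorithm was used, as implemented at http://www.genome.ou.edu/gono_blast.html. The FASTA algorithm (from the GCG Wisconsin Package, version 9.0) was also used to compare the ORFs.
Dots within nucleotide sequences (e.g. position 495 in SEQ ID 11) represent nucleotides that have been arbitrarily introduced in order to maintain a reading frame. In the same way, double-underlined nucleotides were removed. Lower-case letters (e.g. position 496 in SEQ ID 11) represent ambiguities that arose during alignment of independent sequencing reactions (some of the nucleotide sequences in the examples were obtained by combining the results of two or more experiments).
Nucleotide sequences were scanned in all six reading frames to predict the presence of hydrophobic domains, using an algorithm based on the statistical studies of Esposti et al. ["Critical evaluation of the hydropathy of membrane proteins" (1990) Eur J Biochem 190:207-219]. These domains represent potential transmembrane regions or hydrophobic leader sequences.
Open reading frames were predicted from fragmented nucleotide sequences using the program ORFFINDER (NCBI).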
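To make the idea of a six-frame scan concrete, here is a minimal sketch that lists ATG-to-stop open reading frames on both strands. It is not the ORFFINDER program or the Esposti-based hydropathy algorithm; the function names, the `min_codons` threshold, and the use of the standard stop codons are assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=30):
    """List ORFs as (strand, frame, start, end) tuples.

    An ORF runs from the first ATG in a frame to the next in-frame stop.
    Coordinates are 0-based on the scanned strand; end is exclusive and
    includes the stop codon. min_codons counts codons before the stop.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                      # open a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3))
                    start = None                   # close it at the stop
    return orfs


# Toy input: one short ORF (ATG AAA TTT TAG) in forward frame 2.
print(find_orfs("CCATGAAATTTTAGCC", min_codons=3))
```

Partial sequences such as those in the examples may lack a start or stop codon, so a scan of this kind would miss truncated ORFs unless it is extended to report frames that run off either end.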
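Underlined amino acid sequences indicate possible transmembrane domains or leader sequences in the ORFs, as predicted by the PSORT algorithm (http://www.psort.nibb.ac.jp). Functional domains were also predicted using the MOTIFS program (GCG Wisconsin and PROSITE).
Various tests can be used to assess the in vivo immunogenicity of the proteins identified in the examples. For example, the proteins can be expressed recombinantly and used to screen patient sera by immunoblot. A positive reaction between the protein and patient serum indicates that the patient has previously mounted an immune response to the protein in question, i.e. the protein is an immunogen. This method can also be used to identify immunodominant proteins.
The recombinant protein can also be conveniently used to prepare antibodies, e.g. in a mouse. These antibodies can be used for direct confirmation that a protein is located on the cell surface. Labelled antibody (e.g. fluorescently labelled for FACS) is incubated with intact bacteria, and the presence of label on the bacterial surface confirms the location of the protein.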
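In particular, the following methods (A) to (S) were used to express, purify and biochemically characterise the proteins of the invention:
A) Chromosomal DNA preparation
N. meningitidis strain 2996 was grown to exponential phase in 100 ml of GC medium, harvested by centrifugation, and resuspended in 5 ml buffer (20% sucrose, 50 mM Tris-HCl, 50 mM EDTA, pH 8). After 10 minutes of incubation on ice, the bacteria were lysed by adding 10 ml lysis solution (50 mM NaCl, 1% sodium lauroyl sarcosinate (Sarkosyl), 50 µg/ml proteinase K), and the suspension was incubated at 37°C for 2 hours. Two phenol extractions (equilibrated to pH 8) and one chloroform/isoamyl alcohol (24:1) extraction were performed. DNA was precipitated by addition of 0.3 M sodium acetate and 2 volumes of ethanol and was collected by centrifugation. The pellet was washed once with 70% ethanol and redissolved in 4 ml buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8). The DNA concentration was measured by reading the OD at 260 nm.
B) Oligonucleotide design
Synthetic oligonucleotide primers were designed on the basis of the coding sequence of each ORF, using (a) the meningococcus B sequence when available, or (b) the gonococcus/meningococcus A sequence, adapted to the codon preference usage of meningococcus as necessary. Any predicted signal peptides were omitted by deducing the 5'-end amplification primer sequence immediately downstream from the predicted leader sequence.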
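For most ORFs, the 5' primers included two restriction enzyme recognition sites (BamHI-NdeI, BamHI-NheI, or EcoRI-NheI, depending on the gene's own restriction pattern); the 3' primers included a XhoI restriction site. This procedure was established in order to direct the cloning of each amplification product (corresponding to each ORF) into two different expression systems: pGEX-KG (using either BamHI-XhoI or EcoRI-XhoI), and pET21b+ (using either NdeI-XhoI or NheI-XhoI).
5'-end primer tail: CGCGGATCCCATATG (BamHI-NdeI)
5'-end primer tail: CGCGGATCCGCTAGC (BamHI-NheI)
5'-end primer tail: CCGGAATTCTAGCTAGC (EcoRI-NheI)
3'-end primer tail: CCCGCTCGAG (XhoI)
For ORFs 5, 15, 17, 19, 20, 22, 27, 28, 65 and 69, two different amplifications were performed to clone each ORF in the two expression systems. Two different 5' primers were used for each ORF; the same 3' XhoI primer was used as before:
5'-end primer tail: GGAATTCCATATGGCCATGG (NdeI)
5'-end primer tail: CGGGATCC (BamHI)
ORF 76 was cloned in the pTRC expression vector and expressed as an amino-terminal His-tag fusion. In this particular case, the predicted signal peptide was included in the final product. NheI-BamHI restriction sites were incorporated using the following primers:
5'-end primer tail: GATCAGCTAGCCATATG (NheI)
3'-end primer tail: CGGGATCC (BamHI)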
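The primer tails listed above can be checked against the recognition sequences of the enzymes they are meant to introduce. The sketch below is an illustrative consistency check rather than part of the original method; the recognition sequences are the standard ones for these enzymes, the dictionary and function names are assumptions, and the ORF 76 5' tail is checked against the NheI site (GCTAGC) that it contains.

```python
# Standard recognition sequences (5'->3') of the enzymes named in the text.
SITES = {
    "BamHI": "GGATCC",
    "NdeI": "CATATG",
    "NheI": "GCTAGC",
    "EcoRI": "GAATTC",
    "XhoI": "CTCGAG",
}

# Primer tails from the text, with the sites each one should carry.
PRIMER_TAILS = {
    "CGCGGATCCCATATG": ("BamHI", "NdeI"),
    "CGCGGATCCGCTAGC": ("BamHI", "NheI"),
    "CCGGAATTCTAGCTAGC": ("EcoRI", "NheI"),
    "CCCGCTCGAG": ("XhoI",),
    "GGAATTCCATATGGCCATGG": ("NdeI",),
    "CGGGATCC": ("BamHI",),
    "GATCAGCTAGCCATATG": ("NheI",),
}


def missing_sites(tail, enzymes):
    """Return the enzymes whose recognition site is absent from the tail."""
    return [e for e in enzymes if SITES[e] not in tail]


for tail, enzymes in PRIMER_TAILS.items():
    print(tail, enzymes, "missing:", missing_sites(tail, enzymes))
```

A check of this kind only confirms that the site is present in the tail; whether the site is unique also depends on the sequence of the ORF itself, as the text notes.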
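As well as containing the restriction enzyme recognition sequences, the primers included nucleotides that hybridize to the sequence to be amplified. The number of hybridizing nucleotides depended on the melting temperature of the whole primer and was determined for each primer using the formulae:
Tm = 4(G+C) + 2(A+T) (tail excluded)
Tm = 64.9 + 0.41(%GC) - 600/N (whole primer)
The average melting temperature of the selected oligonucleotides was 65-70°C for the whole oligonucleotide and 50-55°C for the hybridizing region alone.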
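Both primer formulae are simple enough to compute by hand, but a short sketch makes the two-temperature design explicit. The function names are assumptions, and the example primer (a BamHI-NdeI tail joined to the first 20 nucleotides of SEQ ID 3) is hypothetical; it is not one of the primers actually used.

```python
def tm_hybridizing_region(seq):
    """Tm = 4(G+C) + 2(A+T), applied to the hybridizing region only."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return 4 * gc + 2 * at


def tm_whole_primer(seq):
    """Tm = 64.9 + 0.41(%GC) - 600/N, applied to the whole primer."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty primer")
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / n
    return 64.9 + 0.41 * gc_percent - 600.0 / n


tail = "CGCGGATCCCATATG"                 # BamHI-NdeI tail from the text
hybridizing = "ATGAAACAGACAGTCAAATG"     # hypothetical hybridizing region
print(tm_hybridizing_region(hybridizing), tm_whole_primer(tail + hybridizing))
```

The lower value corresponds to the hybridizing region alone and the higher value to the full-length primer, matching the two temperature windows quoted in the text.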
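Table 1 (page 511) shows the forward and reverse primers used for each amplification. In certain cases, it will be noted that the sequence of the primer does not exactly match the sequence in the ORF. When the initial amplifications were performed, the complete 5' and/or 3' sequences of some meningococcal ORFs were not known, although the corresponding sequences had been identified in gonococcus. For amplification, the gonococcal sequences could therefore be used as the basis for primer design, altered to take account of codon preference. In particular, the following codons were changed: ATA→ATT; TCG→TCT; CAG→CAA; AAG→AAA; GAG→GAA; CGA→CGC; CGG→CGC; GGG→GGC. Italicised nucleotides in Table 1 indicate such a change. It will be appreciated that, once the complete sequence has been identified, this approach is no longer necessary.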
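The codon changes listed above amount to a fixed substitution table applied codon by codon. The following sketch shows one way to apply it to an in-frame coding sequence; the function name is an assumption, and the input sequence is a made-up example rather than a real primer.

```python
# Codon substitutions quoted in the text (all are synonymous changes).
CODON_CHANGES = {
    "ATA": "ATT", "TCG": "TCT", "CAG": "CAA", "AAG": "AAA",
    "GAG": "GAA", "CGA": "CGC", "CGG": "CGC", "GGG": "GGC",
}


def adapt_codons(coding_seq):
    """Apply the substitutions to an in-frame coding sequence."""
    seq = coding_seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return "".join(CODON_CHANGES.get(codon, codon) for codon in codons)


print(adapt_codons("ATGATACAGGGGTAA"))  # ATG ATA CAG GGG TAA, made-up input
```

Because each replacement encodes the same amino acid as the codon it replaces, the encoded protein is unchanged; only the primer's match to the template differs.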
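Oligonucleotides were synthesized on a Perkin Elmer 394 DNA/RNA Synthesizer, eluted from the columns in 2 ml of ammonium hydroxide, and deprotected by 5 hours of incubation at 56°C. The oligonucleotides were precipitated by addition of 0.3 M sodium acetate and 2 volumes of ethanol. The samples were then centrifuged and the pellets resuspended in either 100 µl or 1 ml of water. The OD260 was determined using a Perkin Elmer Lambda Bio spectrophotometer, and the concentration was determined and adjusted to 2-10 pmol/µl.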
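C) Amplification
The standard PCR protocol was as follows: 50-200 ng of genomic DNA was used as a template in the presence of 20-40 µM of each oligonucleotide, 400-800 µM dNTP solution, 1x PCR buffer (including 1.5 mM MgCl2), and 2.5 units of Taq DNA polymerase (using Perkin-Elmer AmpliTaq, GIBCO Platinum, Pwo DNA polymerase, or Takara Shuzo Taq polymerase).
In some cases, PCR was optimised by the addition of 10 µl DMSO or 50 µl of 2 M betaine.
After a hot start (adding the polymerase during a preliminary 3-minute incubation of the whole mix at 95°C), each sample underwent a two-step amplification: the first 5 cycles were performed using, as the hybridization temperature, the melting temperature of the oligonucleotide excluding the restriction enzyme tail, followed by 30 cycles performed according to the hybridization temperature of the full-length oligonucleotide. The cycles were followed by a final 10-minute extension step at 72°C.
The standard cycles were as follows:
Cycles | Denaturation | Hybridization | Extension
First 5 cycles | 30 seconds at 95°C | 30 seconds at 50-55°C | 30-60 seconds at 72°C
Last 30 cycles | 30 seconds at 95°C | 30 seconds at 65-70°C | 30-60 seconds at 72°C
The extension time varied according to the length of the ORF to be amplified.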
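The two-step cycling scheme can be written down as a small data structure, which also makes it easy to estimate the programmed hold time. This is a sketch under stated assumptions: the function names are illustrative, the annealing temperatures and extension time passed in are example values within the quoted ranges, and ramp times between temperatures are ignored.

```python
def two_step_program(anneal_tail_free_c, anneal_full_c, extension_s):
    """Build the program as (label, cycles, [(temp_C, seconds), ...]) stages."""
    return [
        ("hot start", 1, [(95, 180)]),
        ("first cycles", 5,
         [(95, 30), (anneal_tail_free_c, 30), (72, extension_s)]),
        ("later cycles", 30,
         [(95, 30), (anneal_full_c, 30), (72, extension_s)]),
        ("final extension", 1, [(72, 600)]),
    ]


def total_hold_seconds(program):
    """Sum of all programmed holds, ignoring ramping between temperatures."""
    return sum(cycles * sum(seconds for _, seconds in steps)
               for _, cycles, steps in program)


program = two_step_program(anneal_tail_free_c=52, anneal_full_c=67,
                           extension_s=60)
print(total_hold_seconds(program) / 60.0, "minutes of programmed holds")
```

The first five cycles anneal at the lower temperature because only the hybridizing region of each primer matches the genomic template; once tailed products exist, the full primer can anneal at the higher temperature.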
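The amplifications were performed using either a 9600 or a 2400 Perkin Elmer GeneAmp PCR System. To check the results, 1/10 of the amplification volume was loaded onto a 1-1.5% agarose gel, and the size of each amplified fragment was compared with a DNA molecular weight marker.
The amplified DNA was either loaded directly on a 1% agarose gel or first precipitated with ethanol, resuspended in a suitable volume, and then loaded on a 1% agarose gel. The DNA fragment corresponding to the band of the correct size was then eluted and purified from the gel using the Qiagen Gel Extraction Kit, following the manufacturer's instructions. The final volume of the DNA fragment was 30 µl or 50 µl of either water or 10 mM Tris, pH 8.5.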
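D) Digestion of PCR fragments
The purified DNA corresponding to the amplified fragment was split into 2 aliquots and double-digested with:
- NdeI/XhoI or NheI/XhoI, for cloning into pET-21b+ and further expression of the protein as a C-terminal His-tag fusion
- BamHI/XhoI or EcoRI/XhoI, for cloning into pGEX-KG and further expression of the protein as an N-terminal GST fusion
- for ORF 76, NheI/BamHI, for cloning into the pTRC-HisA vector and further expression of the protein as an N-terminal His-tag fusion
- EcoRI/PstI, EcoRI/SalI, or SalI/PstI, for cloning into pGex-His and further expression of the protein as an N-terminal His-tag fusion
Each purified DNA fragment was incubated (37°C for 3 hours to overnight) with 20 units of each restriction enzyme (New England Biolabs) in a final volume of either 30 or 40 µl, in the presence of the appropriate buffer. The digestion product was then purified using the QIAquick PCR purification kit, following the manufacturer's instructions, and eluted in a final volume of 30 µl or 50 µl of either water or 10 mM Tris-HCl, pH 8.5. The final DNA concentration was determined by 1% agarose gel electrophoresis in the presence of a titrated molecular weight marker.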
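E) Digestion of the cloning vectors (pET22B, pGEX-KG, pTRC-His A and pGex-His)
10 µg of plasmid was double-digested with 50 units of each restriction enzyme in a 200 µl reaction volume, in the presence of the appropriate buffer, by overnight incubation at 37°C. After loading the whole digest on a 1% agarose gel, the band corresponding to the digested vector was purified from the gel using the Qiagen QIAquick Gel Extraction Kit, and the DNA was eluted in 50 µl of 10 mM Tris-HCl, pH 8.5. The DNA concentration was evaluated by measuring the OD260 of the sample and adjusted to 50 µg/µl. 1 µl of plasmid was used for each cloning procedure.
The vector pGEX-His is a modified pGEX-2T vector carrying a region encoding six histidine residues upstream of the thrombin cleavage site and containing the multiple cloning site of the vector pTRC99 (Pharmacia).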
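F) Cloning
The fragments corresponding to each ORF, previously digested and purified, were ligated into both pET22b and pGEX-KG. In a final volume of 20 µl, a molar ratio of 3:1 fragment/vector was ligated using 0.5 µl of NEB T4 DNA ligase (400 units/µl), in the presence of the buffer supplied by the manufacturer. The reaction was incubated at room temperature for 3 hours. In some experiments, ligation was performed using the Boehringer "Rapid Ligation Kit", following the manufacturer's instructions.
To introduce the recombinant plasmid into a suitable strain, 100 µl of E. coli DH5 competent cells were incubated with the ligase reaction solution for 40 minutes on ice, then at 37°C for 3 minutes, and then, after adding 800 µl LB broth, again at 37°C for 20 minutes. The cells were then centrifuged at maximum speed in an Eppendorf microfuge and resuspended in approximately 200 µl of the supernatant. The suspension was then plated on LB ampicillin (100 µg/ml).
Recombinant clones were screened by growing 5 randomly chosen colonies overnight at 37°C in either 2 ml (pGEX or pTRC clones) or 5 ml (pET clones) of LB broth + 100 µg/ml ampicillin. The cells were then pelleted and the DNA extracted using the Qiagen QIAprep Spin Miniprep Kit, following the manufacturer's instructions, to a final volume of 30 µl. 5 µl of each individual miniprep (approximately 1 µg) was digested with either NdeI/XhoI or BamHI/XhoI, and the whole digest was loaded onto a 1-1.5% agarose gel (depending on the expected insert size), in parallel with a molecular weight marker (1 kb DNA Ladder, GIBCO). Positive clones were selected on the basis of the correct insert size.
For the cloning of ORFs 110, 111, 113, 115, 119, 122, 125 and 130, the double-digested PCR product was ligated into double-digested vector using EcoRI-PstI cloning sites or, for ORFs 115 and 127, EcoRI-SalI sites or, for ORF 122, SalI-PstI sites. After cloning, the recombinant plasmids were introduced into the E. coli host W3110. Individual clones were grown overnight at 37°C in L-broth with 50 µg/ml ampicillin.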
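G) Expression
Each ORF cloned into the expression vector was transformed into a strain suitable for expression of the recombinant protein product. 1 µl of each construct was used to transform 30 µl of E. coli BL21 (pGEX vector), E. coli TOP10 (pTRC vector) or E. coli BL21-DE3 (pET vector), as described above. In the case of the pGEX-His vector, the same E. coli strain (W3110) was used for initial cloning and expression. Single recombinant colonies were inoculated into 2 ml LB+Amp (100 µg/ml), incubated at 37°C overnight, then diluted 1:30 in 20 ml of LB+Amp (100 µg/ml) in 100 ml flasks, making sure that the OD600 ranged between 0.1 and 0.15. The flasks were incubated at 30°C in gyratory water bath shakers until the OD indicated exponential growth suitable for induction of expression (OD 0.4-0.8 for pET and pTRC vectors; OD 0.8-1 for pGEX and pGEX-His vectors). For the pET, pTRC and pGEX-His vectors, protein expression was induced by addition of 1 mM IPTG, whereas in the case of the pGEX system the final concentration of IPTG was 0.2 mM. After 3 hours of incubation at 30°C, the final concentration of the sample was checked by OD. To check expression, 1 ml of each sample was removed and centrifuged in a microfuge, and the pellet was resuspended in PBS and analysed by 12% SDS-PAGE with Coomassie Blue staining. The whole sample was centrifuged at 6000 g and the pellet resuspended in PBS for further use.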
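H) GST-fusion protein large-scale purification
A single colony was grown overnight at 37°C on an LB+Amp agar plate. The bacteria were inoculated into 20 ml of LB+Amp liquid culture in a water bath shaker and grown overnight. The bacteria were diluted 1:30 into 600 ml of fresh medium and allowed to grow at the optimal temperature (20-37°C) to OD550 0.8-1. Protein expression was induced with 0.2 mM IPTG, followed by three hours of incubation. The culture was centrifuged at 8000 rpm at 4°C. The supernatant was discarded and the bacterial pellet was resuspended in 7.5 ml cold PBS. The cells were disrupted by sonication on ice for 30 seconds at 40 W using a Branson sonifier B-15, frozen and thawed twice, and centrifuged again. The supernatant was collected and mixed with 150 µl glutathione-Sepharose 4B resin (Pharmacia) (previously washed with PBS) and incubated at room temperature for 30 minutes. The sample was centrifuged at 700 g for 5 minutes at 4°C. The resin was washed twice with 10 ml cold PBS for 10 minutes, resuspended in 1 ml cold PBS, and loaded on a disposable column. The column was washed twice with 2 ml cold PBS until the flow-through reached an OD280 of 0.02-0.06. The GST-fusion protein was eluted by addition of 700 µl cold glutathione elution buffer (10 mM reduced glutathione, 50 mM Tris-HCl), and fractions were collected until the OD280 was 0.1. 21 µl of each fraction was loaded on a 12% SDS gel using either the Biorad SDS-PAGE Molecular Weight Standard broad range (M1) (200, 116.25, 97.4, 66.2, 45, 31, 21.5, 14.4, 6.5 kDa) or the Amersham Rainbow Marker (M2) (220, 66, 46, 30, 21.5, 14.3 kDa) as standards. As the MW of GST is 26 kDa, this value must be added to the MW of each GST-fusion protein.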
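I) His-fusion solubility analysis (ORFs 111-129)
To analyse the solubility of the His-fusion expression products, pellets of 3 ml cultures were resuspended in buffer M1 [500 µl PBS, pH 7.2]. 25 µl of lysozyme (10 mg/ml) was added and the bacteria were incubated for 15 minutes at 4°C. The pellets were sonicated for 30 seconds at 40 W using a Branson sonifier B-15, frozen and thawed twice, and then separated again into pellet and supernatant by centrifugation. The supernatant was collected and the pellet was resuspended in buffer M2 [8 M urea, 0.5 M NaCl, 20 mM imidazole and 0.1 M NaH2PO4] and incubated for 3 to 4 hours at 4°C. After centrifugation, the supernatant was collected and the pellet was resuspended in buffer M3 [6 M guanidinium-HCl, 0.5 M NaCl, 20 mM imidazole and 0.1 M NaH2PO4] overnight at 4°C. The supernatants from all steps were analysed by SDS-PAGE.
The proteins expressed from ORFs 113, 119 and 120 were found to be soluble in PBS, whereas those from ORFs 111, 122, 116 and 129 needed urea, and those from ORFs 125 and 127 needed guanidinium-HCl, for their solubilization.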
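J) His-fusion large-scale purification
A single colony was grown overnight at 37°C on an LB+Amp agar plate. The bacteria were inoculated into 20 ml of LB+Amp liquid culture and incubated overnight in a water bath shaker. The bacteria were diluted 1:30 into 600 ml of fresh medium and allowed to grow at the optimal temperature (20-37°C) to OD550 0.6-0.8. Protein expression was induced by addition of 1 mM IPTG and the culture was incubated for a further three hours. The culture was centrifuged at 8000 rpm at 4°C, the supernatant was discarded, and the bacterial pellet was resuspended in 7.5 ml of either (i) cold buffer A (300 mM NaCl, 50 mM phosphate buffer, 10 mM imidazole, pH 8) for soluble proteins or (ii) buffer B (8 M urea, 10 mM Tris-HCl, 100 mM phosphate buffer, pH 8.8) for insoluble proteins.
The cells were disrupted by sonication on ice for 30 seconds at 40 W using a Branson sonifier B-15, frozen and thawed twice, and centrifuged again.
For insoluble proteins, the supernatant was stored at -20°C, while the pellets were resuspended in 2 ml buffer C (6 M guanidine hydrochloride, 100 mM phosphate buffer, 10 mM Tris-HCl, pH 7.5) and treated in a homogenizer for 10 cycles. The product was centrifuged at 13000 rpm for 40 minutes.
The supernatant was collected, mixed with 150 µl Ni2+-resin (Pharmacia) (previously washed with either buffer A or buffer B, as appropriate), and incubated at room temperature with gentle agitation for 30 minutes. The sample was centrifuged at 700 g for 5 minutes at 4°C. The resin was washed twice with 10 ml buffer A or B for 10 minutes, resuspended in 1 ml buffer A or B, and loaded on a disposable column. The resin was washed either at 4°C with 2 ml cold buffer A or at room temperature with 2 ml buffer B, until the flow-through reached an OD280 of 0.02-0.06.
The resin was washed with either (i) 2 ml cold 20 mM imidazole buffer (300 mM NaCl, 50 mM phosphate buffer, 20 mM imidazole, pH 8) or (ii) buffer D (8 M urea, 10 mM Tris-HCl, 100 mM phosphate buffer, pH 6.3) until the flow-through reached an OD280 of 0.02-0.06. The His-fusion protein was eluted by addition of 700 µl of either (i) cold elution buffer A (300 mM NaCl, 50 mM phosphate buffer, 250 mM imidazole, pH 8) or (ii) elution buffer B (8 M urea, 10 mM Tris-HCl, 100 mM phosphate buffer, pH 4.5), and fractions were collected until the OD280 was 0.1. 21 µl of each fraction was loaded on a 12% SDS gel.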
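K) His-fusion protein renaturation
10% glycerol was added to the denatured proteins. The proteins were then diluted to 20 µg/ml using dialysis buffer I (10% glycerol, 0.5 M arginine, 50 mM phosphate buffer, 5 mM reduced glutathione, 0.5 mM oxidised glutathione, 2 M urea, pH 8.8) and dialysed against the same buffer at 4°C for 12-14 hours. The protein was further dialysed against dialysis buffer II (10% glycerol, 0.5 M arginine, 50 mM phosphate buffer, 5 mM reduced glutathione, 0.5 mM oxidised glutathione, pH 8.8) for 12-14 hours at 4°C. Protein concentration was evaluated using the formula:
Protein (mg/ml) = (1.55 × OD280) - (0.76 × OD260)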
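The concentration formula above translates directly into code. The sketch below is illustrative; the function name is an assumption, and the absorbance readings in the example are invented values, not measurements from this work.

```python
def protein_mg_per_ml(od280, od260):
    """Protein (mg/ml) = (1.55 x OD280) - (0.76 x OD260)."""
    return 1.55 * od280 - 0.76 * od260


# Invented readings for illustration only.
print(protein_mg_per_ml(od280=1.0, od260=0.5))
```

The OD260 term corrects for nucleic acid contamination, which absorbs strongly at 260 nm and would otherwise inflate the estimate based on OD280 alone.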
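L) His-fusion large-scale purification (ORFs 111-129)
500 ml of bacterial culture was induced and the fusion proteins were obtained soluble in buffer M1, M2 or M3 using the procedure described above. The crude bacterial extract was loaded onto a Ni-NTA superflow column (Qiagen) equilibrated with buffer M1, M2 or M3, depending on the solubilization buffer of the fusion protein. Unbound material was eluted by washing the column with the same buffer. The specific protein was eluted with the corresponding buffer containing 500 mM imidazole and dialysed against the corresponding buffer without imidazole. After each run, the column was sanitized by washing with at least two column volumes of 0.5 M sodium hydroxide and re-equilibrated before the next use.
M) Mice immunisations
20 µg of each purified protein was used to immunise mice intraperitoneally. In the case of ORFs 2, 4, 15, 22, 27, 28, 37, 76, 89 and 97, Balb-C mice were immunised with aluminum hydroxide as adjuvant on days 1, 21 and 42, and the immune response was monitored in samples taken on day 56. For ORFs 44, 106 and 132, CD1 mice were immunised using the same protocol. For ORFs 25 and 40, CD1 mice were immunised using Freund's adjuvant rather than aluminum hydroxide, with the same immunisation protocol, except that the immune response was measured on day 42 rather than day 56. Similarly, for ORFs 23, 32, 38 and 79, CD1 mice were immunised with Freund's adjuvant, but the immune response was measured on day 49.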
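N) ELISA assay (sera analysis)
The acapsulated MenB M7 strain was plated on chocolate agar plates and incubated overnight at 37°C. Bacterial colonies were collected from the agar plates using a sterile dacron swab and inoculated into 7 ml of Mueller-Hinton broth (Difco) containing 0.25% glucose. Bacterial growth was monitored every 30 minutes by following OD620. The bacteria were grown until the OD reached 0.3-0.4. The culture was centrifuged for 10 minutes at 10000 rpm. The supernatant was discarded and the bacteria were washed once with PBS, resuspended in PBS containing 0.025% formaldehyde, and incubated for 2 hours at room temperature and then overnight at 4°C with stirring. 100 µl of bacterial cells was added to each well of a 96-well Greiner plate and incubated overnight at 4°C. The wells were then washed three times with PBT washing buffer (0.1% Tween-20 in PBS). 200 µl of saturation buffer (2.7% polyvinylpyrrolidone 10 in water) was added to each well and the plates were incubated for 2 hours at 37°C. The wells were washed three times with PBT. 200 µl of diluted sera (dilution buffer: 1% BSA, 0.1% Tween-20, 0.1% NaN3 in PBS) was added to each well and the plates were incubated for 90 minutes at 37°C. The wells were washed three times with PBT. 100 µl of HRP-conjugated rabbit anti-mouse (Dako) serum, diluted 1:2000 in dilution buffer, was added to each well and the plates were incubated for 90 minutes at 37°C. The wells were washed three times with PBT buffer. 100 µl of substrate buffer for HRP (25 ml of citrate buffer pH 5, 10 mg of o-phenylenediamine and 10 µl of H2O2) was added to each well and the plates were left at room temperature for 20 minutes. 100 µl of sulfuric acid was added to each well and OD490 was followed. The ELISA was considered positive when the OD490 was 2.5 times that of the respective pre-immune serum.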
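The positivity criterion at the end of the ELISA protocol is a simple ratio test. The sketch below encodes it with illustrative names, and it assumes that "2.5 times" means "at least 2.5 times", which the text does not state explicitly.

```python
def elisa_positive(od490_immune, od490_preimmune, fold=2.5):
    """True when the immune serum OD490 is at least `fold` times the
    OD490 of the matching pre-immune serum (threshold direction assumed)."""
    if od490_preimmune <= 0:
        raise ValueError("pre-immune OD490 must be positive")
    return od490_immune >= fold * od490_preimmune


# Invented readings for illustration only.
print(elisa_positive(1.0, 0.3), elisa_positive(0.5, 0.3))
```

Comparing each serum with its own pre-immune sample, rather than with a fixed cut-off, controls for animal-to-animal differences in background binding.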
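O) FACScan bacteria binding assay procedure
The acapsulated MenB M7 strain was plated on chocolate agar plates and incubated overnight at 37°C. Bacterial colonies were collected from the agar plates using a sterile dacron swab and inoculated into 4 tubes, each containing 8 ml of Mueller-Hinton broth (Difco) with 0.25% glucose. Bacterial growth was monitored every 30 minutes by following OD620. The bacteria were grown until the OD reached 0.35-0.5. The culture was centrifuged for 10 minutes at 4000 rpm. The supernatant was discarded and the pellet was resuspended in blocking buffer (1% BSA, 0.4% NaN3) and centrifuged for 5 minutes at 4000 rpm. The cells were resuspended in blocking buffer to reach an OD620 of 0.07. 100 µl of bacterial cells was added to each well of a Costar 96-well plate. 100 µl of diluted (1:200) sera (in blocking buffer) was added to each well and the plates were incubated for 2 hours at 4°C. The cells were centrifuged for 5 minutes at 4000 rpm, the supernatant was aspirated, and the cells were washed by addition of 200 µl of blocking buffer to each well. R-Phycoerythrin-conjugated F(ab)2 goat anti-mouse antibody, diluted 1:100, was added to each well and the plates were incubated for 1 hour at 4°C. The cells were spun down by centrifugation at 4000 rpm for 5 minutes and washed by addition of 200 µl of blocking buffer to each well. The supernatant was aspirated and the cells were resuspended in 200 µl per well of PBS with 0.25% formaldehyde. The samples were transferred to FACScan tubes and read. The FACScan settings were: FL1 on, FL2 and FL3 off; FSC-H threshold: 92; FSC PMT voltage: E 02; SSC PMT: 474; Amp. Gains 7.1; FL-2 PMT: 539; compensation values: 0.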
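P) OMV preparations
Bacteria were grown overnight on 5 GC plates, harvested with a loop, and resuspended in 10 ml of 20 mM Tris-HCl. Heat inactivation was performed at 56°C for 30 minutes and the bacteria were disrupted by sonication for 10 minutes on ice (50% duty cycle, 50% output). Unbroken cells were removed by centrifugation at 5000 g for 10 minutes, and the total cell envelope fraction was recovered by centrifugation at 50000 g at 4°C for 75 minutes. To extract cytoplasmic membrane proteins from the crude outer membranes, the whole fraction was resuspended in 2% sarkosyl (Sigma) and incubated at room temperature for 20 minutes. The suspension was centrifuged at 10000 g for 10 minutes to remove aggregates, and the supernatant was further ultracentrifuged at 50000 g for 75 minutes to pellet the outer membranes. The outer membranes were resuspended in 10 mM Tris-HCl, pH 8, and the protein concentration was measured by the Bio-Rad Protein assay, using BSA as a standard.
Q) Whole extracts preparation
Bacteria were grown overnight on a GC plate, harvested with a loop, and resuspended in 1 ml of 20 mM Tris-HCl. Heat inactivation was performed at 56°C for 30 minutes.
R) Western blotting
Purified proteins (500 ng/lane), outer membrane vesicles (5 µg) and total cell extracts (25 µg) derived from MenB strain 2996 were loaded on 15% SDS-PAGE and transferred to a nitrocellulose membrane. The transfer was performed for 2 hours at 150 mA at 4°C in transfer buffer (0.3% Tris base, 1.44% glycine, 20% methanol). The membrane was saturated by overnight incubation at 4°C in saturation buffer (10% skimmed milk, 0.1% Triton X100 in PBS). The membrane was washed twice with washing buffer (3% skimmed milk, 0.1% Triton X100 in PBS) and incubated for 2 hours at 37°C with mouse sera diluted 1:200 in washing buffer. The membrane was washed twice and incubated for 90 minutes with a 1:2000 dilution of horseradish peroxidase-labelled anti-mouse Ig. The membrane was washed twice with 0.1% Triton X100 in PBS and developed with the Opti-4CN Substrate Kit (Bio-Rad). The reaction was stopped by adding water.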
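S) Bactericidal assay
The MC58 strain was grown overnight at 37°C on chocolate agar plates. 5-7 colonies were collected and used to inoculate 7 ml of Mueller-Hinton broth. The suspension was incubated at 37°C on a nutator and allowed to grow until the OD620 was 0.5-0.8. The culture was aliquoted into sterile 1.5 ml Eppendorf tubes and centrifuged for 20 minutes at maximum speed in a microfuge. The pellet was washed once in Gey's buffer (Gibco) and resuspended in the same buffer to an OD620 of 0.5, diluted 1:20000 in Gey's buffer, and stored at 25°C.
50 µl of Gey's buffer/1% BSA was added to each well of a 96-well tissue culture plate. 25 µl of diluted mouse sera (1:100 in Gey's buffer/0.2% BSA) was added to each well and the plate was incubated at 4°C. 25 µl of the previously described bacterial suspension was added to each well. 25 µl of either heat-inactivated (56°C water bath for 30 minutes) or normal baby rabbit complement was added to each well. Immediately after the addition of the baby rabbit complement, 22 µl of the sample from each well was plated on Mueller-Hinton agar plates (time 0). The 96-well plate was incubated for 1 hour at 37°C with rotation, and then 22 µl of the sample from each well was plated on Mueller-Hinton agar plates (time 1). After overnight incubation, the colonies corresponding to time 0 and time 1 were counted.
Table II (page 520) gives a summary of the cloning, expression and purification results.
Example 1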
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 1>:1 ATGAAACAGA CAGTCAA.AT GCTTGCCGCC GCCCTGATTG CCTTGGGCTT51 GAACCGACCG GTGTGGNCGG ATGACGTATC GGATTTTCGG GAAAACTTGC101 A.GCGGCAGC ACAGGGAAAT GCAGCAGCCC AATACAATTT GGGCGCAATG151 TAT.TACAAA GGACGCGCGT GCGCCGGGAT GATGCTGAAG CGGTCAGATG201 GTATCGGCAG CCGGCGGAAC AGGGGTTAGC CCAAGCCCAA TACAATTTGG251 GCTGGATGTA TGCCAACGGG CGCGC.GTGC GCCAAGATGA TACCGAAGCG301 GTCAGATGGT ATCGGCAGGC GGCAGCGCAG GGGGTTGTCC AAGCCCAATA351 CAATTTGGGC GTGATATATG CCGAAGGACG TGGAGTGCGC CAAGACGATG401 TCGAAGCGGT CAGATGGTTT CGGCAGGCGG CAGCGCAGGG GGTAGCCCAA451 GCCCAAAACA ATTTGGGCGT GATGTATGCC GAAAGANCGC GCGTGCGCCA501 AGACCG...它对应于氨基酸序列<SEQID 2;ORF37>:1 MKQTVXMLAA ALIALGLNRP VWXDDVSDFR ENLXAAAQGN AAAQYNLGAM51 YXQRTRVRRD DAEAVRWYRQ PAEQGLAQAQ YNLGWMYANG RXVRQDDTEA101 VRWYRQAAAQ GVVQAQYNLG VIYAEGRGVR QDDVEAVRWF RQAAAQGVAQ151 AQNNLGVMYA ERXRVRQD...进一步的工作揭示了完整的核苷酸序列<SEQ ID 3>:1 ATGAAACAGA CAGTCAAATG GCTTGCCGCC GCCCTGATTG CCTTGGGCTT51 GAACCGAGCG GTGTGGGCGG ATGACGTATC GGATTTTCGG GAAAACTTGC101 AGGCGGCAGC ACAGGGAAAT GCAGCAGCCC AATACAATTT GGGCGCAATG151 TATTACAAAG GACGCGGCGT GCGCCGGGAT GATGCTGAAG CGGTCAGATG201 GTATCGGCAG GCGGCGGAAC AGGGGTTAGC CCAAGCCCAA TACAATTTGG251 GCTGGATGTA TGCCAACGGG CGCGGCGTGC GCCAAGATGA TACCGAAGCG301 GTCAGATGGT ATCGGCAGGC GGCAGCGCAG GGGGTTGTCC AAGCCCAATA351 CAATTTGGGC GTGATATATG CCGAAGGACG TGGAGTGCGC CAAGACGATG401 TCGAAGCGGT CAGATGGTTT CGGCAGGCGG CAGCGCAGGG GGTAGCCCAA451 GCCCAAAACA ATTTGGGCGT GATGTATGCC GAAAGACGCG GCGTGCGCCA501 AGACCGCGCC CTTGCACAAG AATGGTTTGG CAAGGCTTGT CAAAACGGAG551 ACCAAGACGG CTGCGACAAT GACCAACGCC TGAAGGCGGG TTATTGA其对应于氨基酸序列<SEQID 4;ORF37-1>:1 MKQTVKWLAA ALIALGLNRA VWADDVSDFR ENLQAAAQGN AAAQYNLGAM51 YYKGRGVRRD DAEAVRWYRQ AAEQGLAQAQ YNLGWMYANG RGVRQDDTEA101 VRWYRQAAAQ GVVQAQYNLG VIYAEGRGVR QDDVEAVRWF RQAAAQGVAQ151 AQNNLGVMYA ERRGVRQDRA LAQEWFGKAC QNGDQDGCDN DQRLKAGY*进一步的工作鉴定了脑膜炎奈瑟球菌菌株A中对应的基因<SEQ ID 5>:1 ATGAAACAGA CAGTCAAATG GCTTGCCGCC GCCCTGATTG CCTTGGGCTT51 GAACCAAGCG GTGTGGGCGG ATGACGTATC GGATTTTCGG GAAAACTTGC101 AGGCGGCAGC ACAGGGAAAT 
GCAGCAGCCC AAAACAATTT GGGCGTGATG151 TATGCCGAAA GACGCGGCGT GCGCCAAGAC CGCGCCCTTG CACAAGAATG201 GCTTGGCAAG GCTTGTCAAA ACGGATACCA AGACAGCTGC GACAATGACC251 AACGCCTGAA AGCGGGTTAT TGA它编码的蛋白具有以下的氨基酸序列<SEQ ID 6;ORF37a>:1 MKQTVKWLAA ALIALGLNQA VWADDVSDFR ENLQAAAQGN AAAQNNLGVM51 YAERRGVRQD RALAQEWLGK ACQNGYQDSC DNDQRLKAGY *
The originally identified partial strain B sequence (ORF37) shows 68.0% identity over a 75 aa overlap with ORF37a:
10 20 30 40 50 60orf37.pep MKQTVXMLAAALIALGLNRPVWXDDVSDFRENLXAAAQGNAAAQYNLGAMYXQRTRVRRD
||||| |||||||||||: || |||||||||| |||||||||| |||:|| :| ||:|orf37a MKQTVKWLAAALIALGLNQAVWADDVSDFRENLQAAAQGNAAAQNNLGVMYAERRGVRQD
10 20 30 40 50 60
70 80 90 100 110 120orf37.pep DAEAVRWYRQPAEQGLAQAQYNLGWMYANGRXVRQDDTEAVRWYRQAAAQGVVQAQYNLG
| | :| : ::|orf37a RALAQEWLGKACQNGYQDSCDNDQRLKAGYX
70 80 90进一步的工作鉴定了淋病奈瑟球菌中的对应基因<SEQ ID 7>:1 ATGAAACAGA CAGTCAAATG GCTTGCCGCC GCCCTGATTG CCTTGGGCTT51 GAACCAAGCG GTGTGGGCGG GTGACGTATC GGATTTTCGG GAAAACTTGC101 AGgcggcaGA ACaggGAAAT GCAGCAGCCC AATTCAATTT GGGCGTGATG151 TATGAAAATG GACAAGGAGT TCGTCAAGAT TATGTACAGG CAGTGCAGTG201 GTATCGCAAG GCTTCAGAAC AAGGGGATGC CCAAGCCCAA TACAATTTGG251 GCTTGATGTA TTACGATGGA CGCGGCGTGC GCCAAGACCT TGCGCTCGCT301 CAACAATGGC TTGGCAAGGC TTGTCAAAAC GGAGACCAAA ACAGCTGCGA351 CAATGACCAA CGCCTGAAGG CGGGTTATTA A它编码的蛋白质具有以下的氨基酸序列<SEQ ID 8;ORF37ng>:1 MKQTVKWLAA ALIALGLNQA VWAGDVSDFR ENLQAAEQGN AAAQFNLGVM551 YENGQGVRQD YVQAVQWYRK ASEQGDAQAQ YNLGLMYYDG RGVRQDLALA101 QQWLGKACQN GDQNSCDNDQ RLKAGY*
最初鉴定的部分菌株B序列(ORF37)在与ORF37ng重叠的111个氨基酸内显示出64.9%的相同性:orf37.pep MKQTVXMLAAALIALGLNRPVWXDDVSDFRENLXAAAQGNAAAQYNLGAMYXQRTRVRRD 60
||||| |||||||||||: || ||||||||| || |||||||:|||:|| : ||:|orf37ng MKQTVKWLAAALIALGLNQAVWAGDVSDFRENLQAAEQGNAAAQFNLGVMYENGQGVRQD 60orf37.pep DAEAVRWYRQPAEQGLAQAQYNLGWMYANGRXVRQDDTEAVRWYRQAAAQGVVQAQYNLG 120
::||:|||: :||| |||||||| || :|| |||| : | :| :| :|orf37ng YVQAVQWYRKASEQGDAQAQYNLGIMYYDGRGVRQDLALAQQWLGKACQNGDQNSCDNDQ 120orf37.pep VIYAEGRGVRQDDVEAVRWFRQAAAQGVAQAQNNLGVMYAERXRVRQD 168orf37ng RLKAGY 126
The complete strain B sequence (ORF37-1) and ORF37ng show 51.5% identity over a 198 aa overlap:
10 20 30 40 50 60orf37-1.pep MKQTVKWLAAALIALGLNRAVWADDVSDFRENLQAAAQGNAAAQYNLGAMYYKGRGVRRD
||||||||||||||||||:|||| |||||||||||| |||||||:|||:|| :|:|||:|orf37ng MKQTVKWLAAALIALGLNQAVWAGDVSDFRENLQAAEQGNAAAQFNLGVMYENGQGVRQD
10 20 30 40 50 60
70 80 90 100 110 120orf37-1.pep DAEAVRWYRQAAEQGLAQAQYNLGWMYANGRGVRQDDTEAVRWYRQAAAQGVVQAQYNLG
::||:|||:|:||| |||||||| || :|||||||orf37ng YVQAVQWYRKASEQGDAQAQYNLGLMYYDGRGVRQD------------------------
70 80 90
130 140 150 160 170 180orf37-1.pep VIYAEGRGVRQDDVEAVRWFRQAAAQGVAQAQNNLGVMYAERRGVRQDRALAQEWFGKAC
||||:|:||||orf37ng ------------------------------------------------LALAQQWLGKAC
100
190 199orf37-1.pep QNGDQDGCDNDQRLKAGYX
|||||::||||||||||||orf37ng QNGDQNSCDNDQRLKAGYX
110 120
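Identity figures such as those quoted for these alignments are the fraction of aligned columns in the overlap that carry the same residue in both sequences. The sketch below shows one way to compute that from two gapped strings; it is not the FASTA program used in the original work, the function name is an assumption, and it assumes that columns with a gap in one sequence count toward the overlap.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Return (identical, overlap, percent) for two aligned sequences.

    Columns where both sequences have a gap are skipped; columns with a
    gap in only one sequence count toward the overlap but not identity.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = overlap = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        overlap += 1
        if a == b:
            identical += 1
    if overlap == 0:
        raise ValueError("no overlapping columns")
    return identical, overlap, 100.0 * identical / overlap


# First 60 aligned residues of ORF37-1 and ORF37ng, taken from the text.
row_orf37_1 = "MKQTVKWLAAALIALGLNRAVWADDVSDFRENLQAAAQGNAAAQYNLGAMYYKGRGVRRD"
row_orf37ng = "MKQTVKWLAAALIALGLNQAVWAGDVSDFRENLQAAEQGNAAAQFNLGVMYENGQGVRQD"
print(percent_identity(row_orf37_1, row_orf37ng))
```

Because the gapped region in the middle of the full alignment counts toward the overlap, the identity over the whole 198-residue overlap is much lower than the identity of the well-conserved N-terminal and C-terminal blocks taken alone.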
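Computer analysis of these amino acid sequences indicates a putative leader sequence, and it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
As described above, ORF37-1 (11 kDa) was cloned in pET and pGex vectors and expressed in E. coli. The products of protein expression and purification were analysed by SDS-PAGE. Figure 1A shows the results of affinity purification of the GST-fusion protein, and Figure 1B shows the results of expression of the His-fusion in E. coli. Purified GST-fusion protein was used to immunise mice, whose sera were used for ELISA (positive result), FACS analysis (Figure 1C), and a bactericidal assay (Figure 1D). These experiments confirm that ORF37-1 is a surface-exposed protein and that it is a useful immunogen.
Figure 1E shows plots of hydrophilicity, antigenic index, and AMPHI regions for ORF37-1.
Example 2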
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 9>:TTCGGCGA CATCGGCGGT TTGAAGGTCA ATGCCCCCGT CAAATCCGCAGGCGTATTGG TCGGGCGCGT CGGCGCTATC GGACTTGACC CGAAATCCTATCAGGCGAGG GTGCGCCTCG ATTTGGACGG CAAGTATCAG TTCAGCAGCGACGTTTCCGC GCAAATCCTG ACTTCsGGAC TTTTGGGCGA GCAGTACATCGGGCTGCAGC AGGGCGGCGA CACGGAAAAC CTTGCTGCCG GCGACACCATCTCCGTAACC AGTTCTGCAA TGGTTCTGGA AAACCTTATC GGCAAATTCATGACGAGTTT TGCCGAGAAA AATGCCGACG GCGGCAATGC GGAAAAAGCCGCCGAATAA它对应于氨基酸序列<SEQ ID 10>:1 FGDIGGLKVN APVKSAGVLV GRVGAIGLDP KSYQARVRLD LDGKYQFSSD51 VSAQILTSGL LGEQYIGLQQ GGDTENLAAG DTISVTSSAM VLENLIGKFM101 TSFAEKNADG GNAEKAAE*这些氨基酸序列的计算机分析给出了下列结果:与假设的流感嗜血菌蛋白(Vbrd.haein:登录号p45029)的同源性SEQ ID 9和ybrd.haein在122个重叠的氨基酸内显示出有48.4%的相同性:
20 30 40 50 60 70yrbd.h LGIGALVFLGLRVANVQGFAETKSYTVTATFDNIGGLKVRAPLKIGGVVIGRVSAITLDE
|::||||||:||:| :||::|||:||:||N.m FGDIGGLKVNAPVKSAGVLVGRVGAIGLDP
10 20 30
80 90 100 110 120 130yrbd.h KSYLPKVSIAINQEYNEIPENSSLSIKTSGLLGEQYIALTMGFDDGDTAMLKNGSQIQDT
||| ::|::::: :| ::::: | | ||||||||||:| | |||: | :|: | |N. m KSYQARVRLDLDGKY-QFSSDVSAQILTSGLLGEQYIGLQQG---GDTENLAAGDTISVT
40 50 60 70 80
140 150 160yrbd.h TSAMVLEDLIGQFL--YGSKKSDGNEKSESTEQ
:||||||:|||:|: :::|::||:: ::::|:N. m SSAMVLENLIGKFMTSFAEKNADGGNAEKAAEX
90 100 110 120
Homology with a predicted ORF from N. gonorrhoeae
SEQ ID 9 shows 99.2% identity over a 118 aa overlap with a predicted ORF from N. gonorrhoeae:
20 30 40 50 60 70yrbd GAAAVAFLAFRVAGGAAFGGSDKTYAVYADFGDIGGLKVNAPVKSAGVLVGRVGAIGLDP
||||||||||||||||||||||||||||||N. m FGDIGGLKVNAPVKSAGVLVGRVGAIGLDP
10 20 30
80 90 100 110 120 130yrbd KSYQARVRLDLDGKYQFSSDVSAQILTSGLLGEQYIGLQQGGDTENLAAGDTISVTSSAM
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||N. m KSYQARVRLDLDGKYQFSSDVSAQILTSGLLGEQYIGLQQGGDTENLAAGDTISVTSSAM
40 50 60 70 80 90
140 150 160yrbd VLENLIGKFMTSFAEKNAEGGNAEKAAEX
||||||||||||||||||:||||||||||N. m VLENLIGKFMTSFAEKNADGGNAEKAAEX
100 110 120
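The complete yrbd H. influenzae sequence has a leader sequence, and the full-length homologous N. meningitidis protein is expected to have one as well. This suggests that it may be a membrane protein, a secreted protein, or a surface protein, and that the protein, or one of its epitopes, could be a useful antigen for vaccines or diagnostics.
Example 3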
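The identity figures quoted for these alignments can be checked with a few lines of arithmetic. The sketch below is a minimal illustration, assuming that "identity" means identical columns divided by the total number of aligned columns (gap columns count toward the overlap); the function name `percent_identity` is hypothetical. The meningococcal input is SEQ ID 10, and the gonococcal input is reconstructed from the alignment above, which differs from it at a single position (NADGG versus NAEGG).

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same residue in both rows."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as identical only if both residues match and are not gaps.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


# SEQ ID 10 (118 residues), written in the listing's blocks of ten.
NM = "".join(
    "FGDIGGLKVN APVKSAGVLV GRVGAIGLDP KSYQARVRLD LDGKYQFSSD "
    "VSAQILTSGL LGEQYIGLQQ GGDTENLAAG DTISVTSSAM VLENLIGKFM "
    "TSFAEKNADG GNAEKAAE".split()
)
# Gonococcal ORF over the same overlap: D109 is replaced by E per the alignment.
NG = NM[:108] + "E" + NM[109:]

if __name__ == "__main__":
    print(f"{percent_identity(NM, NG):.1f}% identity in {len(NM)} aa overlap")
```

With 117 identical residues out of 118, the result rounds to the 99.2% stated in the text.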
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 11>: 1 ..ATTTTGATAT ACCTCATCCG CAAGAATCTA GGTTCGCCCG TCTTCTTCTT51 TCAGGAACGC CCCGGAAAGG ACGGAAAACC TTTTAAAATG GTCAAATTCC101 GTTCCATGCG CGACGGCTTG TATTCAGACG GCATTCCGCT GCCCGACGGA151 GAACGCCTGA CACCGTTCGG CAAAAAACTG CGTGCCGcCA GTwTGGACGA201 ACTGCCTGAA TTATGGAATA TCTTAAAAGG CGAGATGAGC CTGGTCGGCC251 CCCGCCCGCT GCTGATGCAA TATCTGCCGC TGTACGACAA CTTCCAAAAC301 CGCCGCCACG AAATGAAACC CGGCATTACC GGCTGGGCGC AGGTCAACGG351 GCGCAACGCg CTTTCGTGGG ACGAAAAATT CGCCTGCGAT GTTTGGTATA401 TCGACCACTT CAGCCTGTGC CTCGACATCA AAATCCTACT GCTGACGGTT451 AAAAAAGTAT TAATCAAGGA AGGGATTTCC GCACAGGGCG AACA.aCCAT501 GCCCCCTTTC ACAGGAAAAC GCAAACTCGC CGTCGTCGGT GCGGGCGGAC551 ACGGAAAAGT CGTTGCCGAC CTTGCCGCCG CACTCGGCCG GTACAGGGAA601 ATCGTTTTTC TGGACGACCG CGCACAAGGC AGCGTCAACG GCTTTTCCGT651 CATCGGCACG ACGCTGCTGC TTGAAAACAG TTTATCGCCC GAACAATACG701 ACGTCGCCGT CGCCGTCGGC AACAACCGCA TCCGCCGCCA AATCGCCGAA751 AAAGCCGCCG CGCTCGGCTT CGCCCTGCCC GTACTGGTTC ATCCGGACGC801 GACCGTCTCG CCTTCTGCAA CAGTCGGACA AGGCAGCGTC GTTATGGCGA851 AAGCGGTCG..它对应于氨基酸序列<SEQID 12;ORF3>:1 ..ILIYLIRKNL GSPVFFFQER PGKDGKPFKM VKFRSMRDGL YSDGIPLPDG51 ERLTPFGKKL RAASXDELPE LWNILKGEMS LVGPRPLLMQ YLPLYDNFQN101 RRHEMKPGIT GWAQVNGRNA LSWDEKFACD VWYIDHFSLC LDIKILLLTV151 KKVLIKEGIS AQGEXTMPPF TGKRKLAVVG AGGHGKVVAD LAAALGRYRE201 IVFLDDRAQG SVNGFSVIGT TLLLENSLSP EQYDVAVAVG NNRIRRQIAE251 KAAALGFALP VLVHPDATVS PSATVGQGSV VMAKAV..进一步的序列分析揭示了完整的核苷酸序列<SEQ ID 13>:1 ATGAGTAAAT TCTTCAAACG CCTGTTTGAC ATTGTTGCCT CCGCCTCGGG51 ACTGATTTTC CTCTCGCCAG TATTTTTGAT TTTGATATAC CTCATCCGCA101 AGAATCTAGG TTCGCCCGTC TTCTTCTTTC AGGAACGCCC CGGAAAGGAC151 GGAAAACCTT TTAAAATGGT CAAATTCCGT TCCATGCGCG ACGCGCTTGA201 TTCAGACGGC ATTCCGCTGC CCGACGGAGA ACGCCTGACA CCGTTCGGCA251 AAAAACTGCG TGCCGCCAGT TTGGACGAAC TGCCTGAATT ATGGAATATC301 TTAAAAGGCG AGATGAGCCT GGTCGGCCCC CGCCCGCTGC TGATGCAATA351 TCTGCCGCTG TACGACAACT TCCAAAACCG CCGCCACGAA ATGAAACCCG401 GCATTACCGG CTGGGCGCAG GTCAACGGGC GCAACGCGCT TTCGTGGGAC451 GAAAAATTCG CCTGCGATGT TTGGTATATC GACCACTTCA 
GCCTGTGCCT501 CGACATCAAA ATCCTACTGC TGACGGTTAA AAAAGTATTA ATCAAGGAAG551 GGATTTCCGC ACAGGGCGAA GCCACCATGC CCCCTTTCAC AGGAAAACGC601 AAACTCGCCG TCGTCGGTGC GGGCGGACAC GGAAAAGTCG TTGCCGACCT651 TGCCGCCGCA CTCGGCCGGT ACAGGGAAAT CGTTTTTCTG GACGACCGCG701 CACAAGGCAG CGTCAACGGC TTTTCCGTCA TCGGCACGAC GCTGCTGCTT751 GAAAACAGTT TATCGCCCGA ACAATACGAC GTCGCCGTCG CCGTCGGCAA801 CAACCGCATC CGCCGCCAAA TCGCCGAAAA AGCCGCCGCG CTCGGCTTCG851 CCCTGCCCGT TCTGGTTCAT CCGGACGCGA CCGTCTCGCC TTCTGCAACA901 GTCGGACAAG GCAGCGTCGT TATGGCGAAA GCCGTCGTAC AGGCAGGCAG951 CGTATTGAAA GACGGCGTGA TTGTGAACAC TGCCGCCACC GTCGATCACG1001 ACTGCCTGCT TAACGCTTTC GTCCACATCA GCCCAGGCGC GCACCTGTCG1051 GGCAACACGC ATATCGGCGA AGAAAGCTGG ATAGGCACGG GCGCGTGCAG1101 CCGCCAGCAG ATCCGTATCG GCAGCCGCGC AACCATTGGA GCGGGCGCAG1151 TCGTCGTACG CGACGTTTCA GACGGCATGA CCGTCGCGGG CAATCCGGCA1201 AAGCCGCTGC CGCGCAAAAA CCCCGAGACC TCGACAGCAT AA它对应于氨基酸序列<SEQ ID 14;ORF3-1>:1 MSKFFKRLFD IVASASGLIF LSPVFLILIY LIRKNLGSPV FFFQERPGKD51 GKPFKMVKFR SMRDALDSDG IPLPDGERLT PFGKKLRAAS LDELPELWNI101 LKGEMSLVGP RPLLMQYLPL YDNFQNRRHE MKPGITGWAQ VNGRNALSWD151 EKFACDVWYI DHFSLCLDIK ILLLTVKKVL IKEGISAQGE ATMPPFTGKR201 KLAVVGAGGH GKVVADLAAA LGRYREIVFL DDRAQGSVNG FSVIGTTLLL251 ENSLSPEQYD VAVAVGNNRI RRQIAEKAAA LGFALPVLVH PDATVSPSAT301 VGQGSVVMAK AVVQAGSVLK DGVIVNTAAT VDHDCLLNAF VHISPGAHLS351 GNTHIGEESW IGTGACSRQQ IRIGSRATIG AGAVVVRDVS DGMTVAGNPA401 KPLPRKNPET STA*
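The correspondence between a listed nucleotide sequence and its amino acid sequence can be spot-checked by translating codons with the standard genetic code. The following is a minimal sketch, assuming the standard code (NCBI table 1) and reading frame 1; the helper name `translate` is hypothetical. Codons containing ambiguity codes or unread bases (such as the lowercase `s` or `.` that appear in some partial sequences) are reported as `X`.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG ordering.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(dna):
    """Translate frame 1 of dna; whitespace and position numbers are ignored."""
    # Keep letters and '.', drop spaces and the digits used as position markers.
    cleaned = "".join(ch for ch in dna.upper() if ch.isalpha() or ch == ".")
    protein = []
    for i in range(0, len(cleaned) - 2, 3):
        # Any codon not made purely of A/C/G/T maps to 'X'.
        protein.append(CODON_TABLE.get(cleaned[i:i + 3], "X"))
    return "".join(protein)


if __name__ == "__main__":
    # First and last codons of SEQ ID 13 as listed above.
    print(translate("ATGAGTAAAT TCTTCAAACG C"))   # start of ORF3-1
    print(translate("GAGACC TCGACAGCAT AA"))      # end of ORF3-1
```

The two fragments translate to the first seven and last five residues of SEQ ID 14 (plus the stop), which is a quick way to confirm that a listing was copied in the right frame.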
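Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF3 shows 93.0% identity over a 286 aa overlap with an ORF (ORF3a) from strain A of N. meningitidis: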
10 20 30orf3.pep ILIYLIRKNLGSPVFFFQERPGKDGKPFKMVKFR
||||||||||||||||||||||||||||||||||orf3a MSKFFKRLFDIVASASGLIFLSPVFLILIYLIRKNLGSPVFFFQERPGKDGKPFKMVKFR
10 20 30 40 50 60
40 50 60 70 80 90orf3.pep SMRDGLYSDGIPLPDGERLTPFGKKLRAASXDELPELWNILKGEMSLVGPRPLLMQYLPL
||:|:| |||| |||||||||||||||||| ||||||||:|||:||||||||||||||||orf3a SMHDALDSDGILLPDGERLTPFGKKLRAASLDELPELWNVLKGDMSLVGPRPLLMQYLPL
70 80 90 100 110 120
100 110 120 130 140 150orf3.pep YDNFQNRRHEMKPGITGWAQVNGRNALSWDEKFACDVWYIDHFSLCLDIKILLLTVKKVL
|||||||||||||||||||||||||||||||:||||:|||||||||||||||||||||||orf3a YDNFQNRRHEMKPGITGWAQVNGRNALSWDERFACDIWYIDHFSLCLDIKILLLTVKKVL
130 140 150 160 170 180
160 170 180 190 200 210orf3.pep IKEGISAQGEXTMPPFTGKRKLAVVGAGGHGKVVADLAAALGRYREIVFLDDRAQGSVNG
|||||||||| ||||||||||||||||||||||||:|||||| | ||||||||:||||||orf3a IKEGISAQGEATMPPFTGKRKLAVVGAGGHGKVVAELAAALGTYGEIVFLDDRVQGSVNG
190 200 210 220 230 240
220 230 240 250 260 270orf3.pep FSVIGTTLLLENSLSPEQYDVAVAVGNNRIRRQIAEKAAALGFALPVLVHPDATVSPSAT
| ||||||||||||||||:|:|||||||||||||||||||||||||||:|||:|||||||orf3a FPVIGTTLLLENSLSPEQFDIAVAVGNNRIRRQIAEKAAALGFALPVLIHPDSTVSPSAT
250 260 270 280 290 300
280orf3.pep VGQGSVVMAKAV
||||:|||||||orf3a VGQGGVVMAKAVVQADSVLKDGVIVNTAATVDHDCLLDAFVHISPGAHLSGNTRIGEESW
310 320 330 340 350 360全长ORF3a核苷酸序列<SEQ ID 15>是:1 ATGAGTAAAT TCTTCAAACG CCTGTTTGAC ATTGTTGCCT CCGCCTCGGG51 ACTGATTTTC CTCTCGCCAG TATTTTTGAT TTTGATATAC CTCATCCGCA101 AGAATCTGGG TTCGCCCGTC TTCTTCTTTC AGGAACGCCC CGGAAAGGAC151 GGAAAACCTT TTAAAATGGT CAAATTCCGT TCCATGCACG ACGCGCTTGA201 TTCAGACGGC ATTCTGCTGC CCGACGGAGA ACGCCTGACA CCGTTCGGCA251 AAAAACTGCG TGCCGCCAGT TTGGACGAAC TGCCCGAACT GTGGAACGTC301 CTCAAAGGCG ACATGAGCCT GGTCGGCCCC CGCCCGCTGC TGATGCAATA351 TCTGCCGCTG TACGACAACT TCCAAAACCG CCGCCACGAA ATGAAACCGG401 GCATTACCGG CTGGGCGCAG GTCAACGGGC GCAACGCGCT TTCGTGGGAC451 GAACGCTTCG CATGCGACAT CTGGTATATC GACCACTTCA GCCTGTGCCT501 CGACATCAAA ATCCTACTGC TGACGGTTAA AAAAGTATTA ATCAAAGAAG551 GGATTTCCGC ACAGGGCGAA GCCACCATGC CCCCTTTCAC AGGAAAACGC601 AAACTTGCCG TCGTCGGTGC GGGCGGACAC GGCAAAGTCG TTGCCGAGCT651 TGCCGCCGCA CTCGGCACAT ACGGCGAAAT CGTTTTTCTG GACGACCGCG701 TCCAAGGCAG CGTCAACGGC TTCCCCGTCA TCGGCACGAC GCTGCTGCTT 751 GAAAACAGTT TATCGCCCGA ACAATTCGAC ATCGCCGTCG CCGTCGGCAA801 CAACCGCATC CGCCGCCAAA TCGCCGAAAA AGCCGCCGCG CTCGGCTTCG851 CCCTGCCCGT CCTGATTCAT CCGGACTCGA CCGTCTCGCC TTCTGCAACA901 GTCGGACAAG GCGGCGTCGT TATGGCGAAA GCCGTCGTAC AGGCTGACAG951 CGTATTGAAA GACGGCGTAA TTGTGAACAC TGCCGCCACC GTCGATCACG1001 ATTGCCTGCT TGATGCTTTC GTCCACATCA GCCCGGGCGC GCACCTGTCG1051 GGCAACACGC GTATCGGCGA AGAAAGCTGG ATAGGCACAG GCGCGTGCAG1101 CCGCCAGCAG ATCCGTATCG GCAGCCGCGC AACCATTGGA GCGGGCGCAG1151 TCGTCGTGCG CGACGTTTCA GACGGCATGA CCGTCGCGGG CAACCCGGCA1201 AAACCATTGG CAGGCAAAAA TACCGAGACC CTGCGGTCGT AA预计它编码的蛋白具有下列氨基酸序列<SEQ ID 16>:1 MSKFFKRLFD IVASASGLIF LSPVFLILIY LIRKNLGSPV FFFQERPGKD51 GKPFKMVKFR SMHDALDSDG ILLPDGERLT PFGKKLRAAS LDELPELWNV101 LKGDMSLVGP RPLLMQYLPL YDNFQNRRHE MKPGITGWAQ VNGRNALSWD151 ERFACDIWYI DHFSLCLDIK ILLLTVKKVL IKEGISAQGE ATMPPFTGKR201 KLAVVGAGGH GKVVAELAAA LGTYGEIVFL DDRVQGSVNG FPVIGTTLLL251 ENSLSPEQFD IAVAVGNNRI RRQIAEKAAA LGFALPVLIH PDSTVSPSAT301 VGQGGVVMAK AVVQADSVLK DGVIVNTAAT VDHDCLLDAF VHISPGAHLS351 GNTRIGEESW IGTGACSRQQ IRIGSRATIG AGAVVVRDVS DGMTVAGNPA401 KPLAGKNTET 
LRS*
The two transmembrane domains are underlined.
ORF3-1 and ORF3a show 94.6% identity in 410 aa overlap:
10 20 30 40 50 60orf3a.pep MSKFFKRLFDIVASASGLIFLSPVFLILIYLIRKNLGSPVFFFQERPGKDGKPFKMVKFR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf3-1 MSKFFKRLFDIVASASGLIFLSPVFLILIYLIRKNLGSPVFFFQERPGKDGKPFKMVKFR
10 20 30 40 50 60
70 80 90 100 110 120orf3a.pep SMHDALDSDGILLPDGERLTPFGKKLRAASLDELPELWNVLKGDMSLVGPRPLLMQYLPL
||:|||||||| |||||||||||||||||||||||||||:|||:||||||||||||||||orf3-1 SMRDALDSDGIPLPDGERLTPFGKKLRAASLDELPELWNILKGEMSLVGPRPLLMQYLPL
70 80 90 100 110 120
130 140 150 160 170 180orf3a.pep YDNFQNRRHEMKPGITGWAQVNGRNALSWDERFACDIWYIDHFSLCLDIKILLLTVKKVL
|||||||||||||||||||||||||||||||:||||:|||||||||||||||||||||||orf3-1 YDNFQNRRHEMKPGITGWAQVNGRNALSWDEKFACDVWYIDHFSLCLDIKILLLTVKKVL
130 140 150 160 170 180
190 200 210 220 230 240orf3a.pep IKEGISAQGEATMPPFTGKRKLAVVGAGGHGKVVAELAAALGTYGEIVFLDDRVQGSVNG
|||||||||||||||||||||||||||||||||||:|||||| | ||||||||:||||||orf3-1 IKEGISAQGEATMPPFTGKRKLAVVGAGGHGKVVADLAAALGRYREIVFLDDRAQGSVNG
190 200 210 220 230 240
250 260 270 280 290 300orf3a.pep FPVIGTTLLLENSLSPEQFDIAVAVGNNRIRRQIAEKAAALGFALPVLIHPDSTVSPSAT
| ||||||||||||||||:|:|||||||||||||||||||||||||||:|||:|||||||orf3-1 FSVIGTTLLLENSLSPEQYDVAVAVGNNRIRRQIAEKAAALGFALPVLVHPDATVSPSAT
250 260 270 280 290 300
310 320 330 340 350 360orf3a.pep VGQGGVVMAKAVVQADSVLKDGVIVNTAATVDHDCLLDAFVHISPGAHLSGNTRIGEESW
||||:|||||||||| |||||||||||||||||||||:|||||||||||||||:||||||orf3-1 VGQGSVVMAKAVVQAGSVLKDGVIVNTAATVDHDCLLNAFVHISPGAHLSGNTHIGEESW
310 320 330 340 350 360
370 380 390 400 410orf3a.pep IGTGACSRQQIRIGSRATIGAGAVVVRDVSDGMTVAGNPAKPLAGKNTETLRSX
||||||||||||||||||||||||||||||||||||||||||| || ||orf3-1 IGTGACSRQQIRIGSRATIGAGAVVVRDVSDGMTVAGNPAKPLPRKNPETSTAX
370 380 390 400 410
Homology with a hypothetical protein encoded by the yvfc gene (accession Z71928) of B. subtilis
ORF3 and the YVFC protein show 55% aa identity in a 170 aa overlap (BLASTp):
ORF3 3 IYLIRKNLGSPVFFFQERPGKDGKPFKMVKFRSMRDGLYSDGIPLPDGERLTPFGKKLRA 62
I ++R +GSPVFF Q RPG GKPF + KFR+M D S G LPD RLT G+ +R
yvfc 27 IAVVRLKIGSPVFFKQVRPGLHGKPFTLYKFRTMTDERDSKGNLLPDEVRLTKTGRLIRK 86
ORF3 63 ASXDELPELWNILKGEMSLVGPRPLLMQYLPLYDNFQNRRHEMKPGITGWAQVNGRNALS 122
S DELP+L N+LKG++SLVGPRPLLM YLPLY Q RRHE+KPGITGWAQ+NGRNA+S
yvfc 87 LSIDELPQLLNVLKGDLSLVGPRPLLMDYLPLYTEKQARRHEVKPGITGWAQINGRNAIS 146
ORF3 123 WDEKFACDVWYIDHFSLCLDXXXXXXXXXXXXXXEGISAQGEXTMPPFTG 172
W++KF DVWY+D++S LD EGI T FTG
yvfc 147 WEKKFELDVWYVDNWSFFLDLKILCLTVRKVLVSEGIQQTNHVTAERFTG 196
Homology with a predicted ORF from N. gonorrhoeae
ORF3 shows 86.3% identity over a 286 aa overlap with a predicted ORF (ORF3.ng) from N. gonorrhoeae:
orf3 ILIYLIRKNLGSPVFFFQERPGKDGKPFKMVKFR 34
:|||||||| ||||||::||||||||||||||||
orf3ng MSKAVKRLFDIIASASGLIVLSPVFLVLIYLIRKNKGSPVFFIRERPGKDGKPFKMVKFR 60
orf3 SMRDGLYSDGIPLPDGERLTPFGKKLRAASXDELPELWNILKGEMSLVGPRPLLMQYLPL 94
||||:| ||||||||:|||| |||||||:| ||||||||:||||||||||||||||||||
orf3ng SMRDALDSDGIPLPDSERLTDFGKKLRATSLDELPELWNVLKGEMSLVGPRPLLMQYLPL 120
orf3 YDNFQNRRHEMKPGITGWAQVNGRNALSWDEKFACDVWYIDHFSLCLDIKILLLTVKKVL 154
|::||||||||||||||||||||||||||||||:||||| |:||: ||:|||:|||||||
orf3ng YNKFQNRRHEMKPGITGWAQVNGRNALSWDEKFSCDVWYTDNFSFWLDMKILFLTVKKVL 180
orf3 IKEGISAQGEXTMPPFTGKRKLAVVGAGGHGKVVADLAAALGRYREIVFLDDRAQGSVNG 214
|||||||||| |||||:|:|||||:||||||||||:|||||| | ||||||||:||||||
orf3ng IKEGISAQGEATMPPFAGNRKLAVIGAGGHGKVVAELAAALGTYGEIVFLDDRTQGSVNG 240
orf3 FSVIGTTLLLENSLSPEQYDVAVAVGNNRIRRQIAEKAAALGFALPVLVHPDATVSPSAT 274
| ||||||||||||||||:|::||||||||||||:|:|||||| ||||:||||||||||
orf3ng FPVIGTTLLLENSLSPEQFDITVAVGNNRIRRQITENAAALGFKLPVLIHPDATVSPSAI 300
orf3 VGQGSVVMAKAV 286
:|||||||||||orf3ng IGQGSVVMAKAVVQAGSVLKDGVIVNTAATVDHICLLDAFVHISPGAHLSGNTRIGEESR 360全长ORF3ng核苷酸序列<SEQ ID 17>是:1 ATGAGTAAAG CCGTCAAACG CCTGTTCGAC ATCATCGCAT CCGCATCGGG51 GCTGATTGTC CTGTCGCCCG TGTTTTTGGT TTTAATATAC CTCATCCGCA101 AAAACTTAGG TTCGCCCGTC TTCTTCattC GGGAACGCCc cgGAAAGGAc151 ggaaaacCTT TTAAAATGGT CAAATTCCGT TCCAtgcgcg acgcgcttGA201 TTCAGACGGC ATTCCGCTGC CCGATAGCGA ACGCCTGACC GATTTCGGCA251 AAAAATTACG CGCCACCAGT TTGGACGAAC TTCCTGAATT ATGGAATGTC301 CTCAAAGGCG AGATGAGCCT GGTCGGCCCC CGCCCGCTTT TGATGCAGTA351 TCTGCCGCTT TACAACAAAT TTCAAAACCG CCGCCACGAA ATGAAACCGG401 GCATTACCGG CTGGGCGCAG GTCAACGGGC GCAACGCGCT TTCGTGGGAC451 GAAAAGTTCT CCTGCGATGT TTGGTACACC GACAATTTCA GCTTTTGGCT 501 GGATATGAAA ATCCTGTTTC TGACAGTCAA AAAAGTCTTG ATTAAAGAAG551 GCATTTCGGC GCAAGGGGAA GCCACCATGC CCCCTTTCGC GGGGAATCGC601 AAACTCGCCG TTATCGGCGC GGGCGGACAC GGCAAAGTCG TTGCCGAGCT651 TGCCGCCGCA CTCGGCACAT ACGGCGAAAT CGTTTTTCTG GACGACCGCA701 CCCAAGGCAG CGTCAACGGC TTCCCCGTCA TCGGCACGAC GCTGCTGCTT751 GAAAACAGTT TATCGCCCGA ACAATTCGAC ATCACCGTCG CCGTCGGCAA801 CAACCGCATC CGCCGCCAAA TCACCGAAAA CGCCGCCGCG CTCGGCTTCA851 AACTGCCCGT TCTGATTCAT CCCGACGCGA CCGTCTCGCC TTCTGCAATA901 ATCGGACAAG GCAGCGTCGT AATGGCGAAA GCCGTCGTAC AGGCCGGCAG951 CGTATTGAAA GACGGCGTGA TTGTGAACAC TGCCGCCACC GTCGATCACG1001 ACTGCCTGCT TGACGCTTTC GtccaCATCA GCCCGGGCGC GCACCTGTCG1051 GGCAACACGC GTATCGGCGA AGAAAGCCGG ATAGGCACGG GCGCGTGCAG1101 CCGCCAGCAG ACAACCGTCG GCAGCGGGGT TACCgccgGT GCAGGGgcGG1151 TTATCGTATG CGACATCCCG GACGGCATGA CCGTCGCGGG CAACCCGGCA1201 AAGCCCCTTA CGGGCAAAAA CCCCAAGACC GGGACGGCAT AA它编码的蛋白质具有下列氨基酸序列<SEQ ID 18>:1 MSKAVKRLFD IIASASGLIV LSPVFLVLIY LIRKNLGSPV FFIRERPGKD51 GKPFKMVKFR SMRDALDSDG IPLPDSERLT DFGKKLRATS LDELPELWNV101 LKGEMSLVGP RPLLMQYLPL YNKFQNRRHE MKPGITGWAQ VNGRNALSWD151 EKFSCDVWYT DNFSFWLDMK ILFLTVKKVL IKEGISAQGE ATMPPFAGNR201 KLAVIGAGGH GKVVAELAAA LGTYGEIVFL DDRTQGSVNG FPVIGTTLLL251 ENSLSPEQFD ITVAVGNNRI RRQITENAAA LGFKLPVLIH PDATVSPSAI301 IGQGSVVMAK AVVQAGSVLK DGVIVNTAAT VDHDCLLDAF VHISPGAHLS351 GNTRIGEESR 
IGTGACSRQQ TTVGSGVTAG AGAVIVCDIP DGMTVAGNPA
401 KPLTGKNPKT GTA*
This protein shows 86.9% identity over a 413 aa overlap with ORF3-1:
10 20 30 40 50 60orf3-1.pep MSKFFKRLFDIVASASGLIFLSPVFLILIYLIRKNLGSPVFFFQERPGKDGKPFKMVKFR
||| ||||||:||||||| ||||||:|||||||||||||||::||||||||||||||||orf3ng MSKAVKRLFDIIASASGLIVLSPVFLVLIYLIRKNLGSPVFFIRERPGKDGKPFKMVKFR
10 20 30 40 50 60
70 80 90 100 110 120orf3-1.pep SMRDALDSDGIPLPDGERLTPFGKKLRAASLDELPELWNILKGEMSLVGPRPLLMQYLPL
|||||||||||||||:|||| |||||||:||||||||||:||||||||||||||||||||orf3ng SMRDALDSDGIPLPDSERLTDFGKKLRATSLDELPELWNVLKGEMSLVGPRPLLMQYLPL
70 80 90 100 110 120
130 140 150 160 170 180orf3-1.pep YDNFQNRRHEMKPGITGWAQVNGRNALSWDEKFACDVWYIDHFSLCLDIKILLLTVKKVL
|::||||||||||||||||||||||||||||||:||||| |:||: ||:|||:|||||||orf3ng YNKFQNRRHEMKPGITGWAQVNGRNALSWDEKFSCDVWYTDNFSFWLDMKILFLTVKKVL
130 140 150 160 170 180
190 200 210 220 230 240orf3-1.pep IKEGISAQGEATMPPFTGKRKLAVVGAGGHGKVVADLAAALGRYREIVFLDDRAQGSVNG
||||||||||||||||:|:|||||:||||||||||:|||||| | ||||||||:||||||orf3ng IKEGISAQGEATMPPFAGNRKLAVIGAGGHGKVVAELAAALGTYGEIVFLDDRTQGSVNG
190 200 210 220 230 240
250 260 270 280 290 300orf3-1.pep FSVIGTTLLLENSLSPEQYDVAVAVGNNRIRRQIAEKAAALGFALPVLVHPDATVSPSAT
| ||||||||||||||||:|::||||||||||||:|:|||||| ||||:|||| ||||||orf3ng FPVIGTTLLLENSLSPEQFDITVAVGNNRIRRQITENAAALGFKLPVLIHPDATVSPSAI
250 260 270 280 290 300
310 320 330 340 350 360orf3-1.pep VGQGSVVMAKAVVQAGSVLKDGVIVNTAATVDHDCLLNAFVHISPGAHLSGNTHIGEESW
:||||||||||||||||||||||||||||||||||||:|||||||||||||||:|||||orf3ng IGQGSVVMAKAVVQAGSVLKDGVIVNTAATVDHDCLLDAFVHISPGAHLSGNTRIGEESR
310 320 330 340 350 360
370 380 390 400 410orf3-1.pep IGTGACSRQQIRIGSRATIGAGAVVVRDVSDGMTVAGNPAKPLPRKNPETSTAX
|||||||||| :|| :| |||||:| |: ||||||||||||| |||:|:|||orf3ng IGTGACSRQQTTVGSGVTAGAGAVIVCDIPDGMTVAGNPAKPLTGKNPKTGTAX
370 380 390 400 410
In addition, ORF3ng shows significant homology with a hypothetical protein from B. subtilis:
gnl|PID|e238668 (Z71928) hypothetical protein [Bacillus subtilis]
>gi|1945702|gnl|PID|e313004 (Z94043) hypothetical protein [Bacillus subtilis]
>gi|2635938|gnl|PID|e1186113 (Z99121) similar to capsular polysaccharide biosynthesis [Bacillus subtilis]
Length = 202
Score = 235 bits (594), Expect = 3e-61
Identities = 114/195 (58%), Positives = 142/195 (72%)
Query: 5 VKRLFDIIASASGLIVLSPVFLVLIYLIRKNLGSPVFFIRERPGKDGKPFKMVKFRSMRD 64
+KRLFD+ A+ L S + L I ++R +GSPVFF + RPG GKPF + KFR+M D
Sbjct: 3 LKRLFDLTAAIFLLCCTSVIILFTIAVVRLKIGSPVFFKQVRPGLHGKPFTLYKFRTMTD 62
Query: 65 ALDSDGIPLPDSERLTDFGKKLRATSLDELPELWNVLKGEMSLVGPRPLLMQYLPLYNKF 124
DS G LPD RLT G+ +R S+DELP+L NVLKG++SLVGPRPLLM YLPLY +
Sbjct: 63 ERDSKGNLLPDEVRLTKTGRLIRKLSIDELPQLLNVLKGDLSLVGPRPLLMDYLPLYTEK 122
Query: 125 QNRRHEMKPGITGWAQVNGRNALSWDEKFSCDVWYTDNFSFWLDMKILFLTVKKVLIKEG 184
Q RRHE+KPGITGWAQ+NGRNA+SW++KF DVWY DN+SF+LD+KIL LTV+KVL+ EG
Sbjct: 123 QARRHEVKPGITGWAQINGRNAISWEKKFELDVWYVDNWSFFLDLKILCLTVRKVLVSEG 182
Query: 185 ISAQGEATMPPFAGN 199
I T F G+
Sbjct: 183 IQQTNHVTAERFTGS 197
The hypothetical product of the yvfc gene shows similarity to EXOY of R. meliloti, an exopolysaccharide production protein. Based on this and on the two predicted transmembrane regions in the homologous N. gonorrhoeae sequence, it is predicted that these proteins, or their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 4
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 19>:1 ..AACCATATGG CGATTGTCAT CGACGAATAC GGCGGCACAT CCGGCTTGGT51 CACCTTTGAA GACATCATCG AGCAAATCGT CGGCGAAATC GAAGACGAGT101 TTGACGAAGA CGATAGCGCC GACAATATCC ATGCCGTTTC TTCAGACACG151 TGGCGCATCC ATGCAGCTAC CGAAATCGAA GACATCAACA CCTTCTTCGG201 CACGGAATAC AGCATCGAAG AAGCCGACAC CATT.GGCGG CCTGGTCATT251 CAAGAGTTGG GACATCTGCC CGTGCGCGGC GAAAAAGTCC TTATCGGCGG301 TTTGCAGTTC ACCGTCGCAC GCGCCGACAA CCGCCCCCTG CATACGCTGA351 TGGCGACCCG CGTGAAGTAA GC........ .....ACCGC CGTTTCTGCA401 CAGTTTAG它对应于氨基酸序列<SEQ ID 20:ORF5>:1 ..NHMAIVIDEY GGTSGLVTFE DIIEQIVGEI EDEFDEDDSA DNIHAVSSDT51 WRIHAATEIE DINTFFGTEY SIEEADTIXR PGHSRVGTSA RARRKSPYRR101 FAVHRRTRRQ PPPAYADGDP REVS....XR RFCTV*进一步的序列分析揭示了完整的DNA序列是<SEQ ID 21>:1 ATGGACGGCG CACAACCGAA AACGAATTTT TTTGAACGCC TGATTGCCCG 51 ACTCGCCCGC GAACCCGATT CCGCCGAAGA CGTATTAAAC CTGCTTCGGC101 AGGCGCACGA GCAGGAAGTT TTTGATGCGG ATACGCTTTT AAGATTGGAA151 AAAGTCCTCG ATTTTTCCGA TTTGGAAGTG CGCGACGCGA TGATTACGCG201 CAGCCGTATG AACGTTTTAA AAGAAAACGA CAGCATCGAG CGCATCACCG251 CCTACGTTAT CGATACCGCC CATTCGCGCT TCCCCGTCAT CGGCGAAGAC301 AAAGACGAAG TTTTGGGCAT TTTGCACGCC AAAGACCTGC TCAAATATAT351 GTTTAACCCC GAGCAGTTCC ACCTCAAATC CATTCTCCGC CCCGCCGTCT401 TCGTCCCCGA AGGCAAATCG CTGACCGCCC TTTTAAAAGA GTTCCGCGAA451 CAGCGCAACC ATATGGCGAT TGTCATCGAC GAATACGGCG GCACATCCGG501 CTTGGTCACC TTTGAAGACA TCATCGAGCA AATCGTCGGC GAAATCGAAG551 ACGAGTTTGA CGAAGACGAT AGCGCCGACA ATATCCATGC CGTTTCTTCC601 GAACGCTGGC GCATCCATGC AGCTACCGAA ATCGAAGACA TCAACACCTT651 CTTCGGCACG GAATACAGCA GCGAAGAAGC CGACACCATT CGGCCTGGTC701 ATTCAAGAGT TGGGACATCT GCCCGTGCGC GGCGAAAAAG TCCTTATCGG751 CGGTTTGCAG TTCACCGTCG CACGCGCCGA CAACCGCCGC CTGCATACGC801 TGATGGCGAC CCGCGTGAAG TAAGCACCGC CGTTTCTGCA CAGTTTAGGA851 TGACGGTACG GGCGTTTTCT GTTTCAATCC GCCCCATCCG CCAAACATAA它对应于氨基酸序列<SEQ ID 22;ORF5-1>:1 MDGAQPKTNF FERLIARLAR EPDSAEDVLN LLRQAHEQEV FDADTLLRLE51 KVLDFSDLEV RDAMITRSRM NVLKENDSIE RITAYVIDTA HSRFPVIGED101 KDEVLGILHA KDLLKYMFNP EQFHLKSILR PAVFVPEGKS LTALLKEFRE151 QRNHMAIVID EYGGTSGLVT 
FEDIIEQIVG EIEDEFDEDD SADNIHAVSS201 ERWRIHAATE IEDINTFFGT EYSSEEADTI RPGHSRVGTS ARARRKSPYR251 RFAVHRRTRR QPPPAYADGD PREVSTAVSA QFRMTVRAFS VSIRPIRQT*进一步的工作鉴定了脑膜炎奈瑟球菌菌株A中对应的基因<SEQ ID 23>:1 ATGGACGGCG CACAACCGAA AACAAATTTT TTNNAACGCC TGATTGCCCG51 ACTCGCCCGC GAACCCGATT CCGCCGAAGA CGTATTGACC CTGTTGCGCC101 AAGCGCACGA ACAGGAAGTA TTTGATGCGG ATACGCTTTT AAGATTGGAA151 AAAGTCCTCG ATTTTTCTGA TTTGGAAGTG CGCGACGCGA TGATTACGCG201 CAGCCGTATG AACGTTTTAA AAGAAAACGA CAGCATCGAA CGCATCACCG251 CCTACGTTAT CGATACCGCC CATTCGCGCT TCCCCGTCAT CGGTGAAGAC301 AAAGACGAAG TTTTGGGTAT TTTGCACGCC AAAGACCTGC TCAAATATAT351 GTTCAACCCC GAGCAGTTCC ACCTCAAATC GATATTGCGC CCTGCCGTCT401 TCGTCCCCGA AGGCAAATCG CTGACCGCCC TTTTAAAAGA GTTCCGCGAA451 CAGCGCAACC ATATGGCAAT CGTCATCGAC GAATACGGCG GCACGTCGGG501 TTTGGTAACT TTTGAAGACA TCATCGAGCA AATCGTCGGC GACATCGAAG551 ATGAGTTTGA CGAAGACGAA AGCGCGGACA ACATCCACGC CGTTTCCGCC601 GAACGCTGGC GCATCCACGC GGCTACCGAA ATCGAAGACA TCAACGCCTT651 TTTCGGCACG GAATACAGCA GCGAAGAAGC CGACACCATC GGCGGCCNTG701 GTCATTCAGG AATTGGNACA CCTGCCCGTG CGCGGCGAAA AAGTCNTTAT751 CGGCGNNTTG CANTTCACNG TCGCCNGCGC NGACAACCGC CGCCTGCATA801 CGCTGATGGC GACCCGCGTG AAGTAAGCTC CGCCGTTTCT GTACAGTTTA851 GGATGACGGT ACGGGCGTTT TCTGTTTCAA TCCGCCCCAT CCGCCANACA901 TAA它编码的蛋白具有以下的氨基酸序列<SEQ ID 24;ORF5a>:1 MDGAQPKTNF XXRLIARLAR EPDSAEDVLT LLRQAHEQEV FDADTLLRLE51 KVLDFSDLEV RDAMITRSRM NVLKENDSIE RITAYVIDTA HSRFPVIGED101 KDEVLGILHA KDLLKYMFNP EQFHLKSILR PAVFVPEGKS LTALLKEFRE151 QRNHMAIVID EYGGTSGLVT FEDIIEQIVG DIEDEFDEDE SADNIHAVSA201 ERWRIHAATE IEDINAFFGT EYSSEEADTI GGXGHSGIGT PARARRKSXY251 RRXAXHXRXR XQPPPAYADG DPREVSSAVS VQFRMTVRAF SVSIRPIRXT301 *
The originally identified partial strain B sequence (ORF5) shows 54.7% identity over a 124 aa overlap with ORF5a:
10 20 30orf5.pep NHMAIVIDEYGGTSGLVTFEDIIEQIVGEI
||||||||||||||||||||||||||||:|orf5a FHLKSILRPAVFVPEGKSLTALLKEFREQRNHMAIVIDEYGGTSGLVTFEDIIEQIVGDI
130 140 150 160 170 180
40 50 60 70 80 90orf5.pep EDEFDEDDSADNIHAVSSDTWRIHAATEIEDINTFFGTEYSIEEADTIXRPGHSRVGTSA
|||||||:|||||||||:: |||||||||||||:||||||| |||||| ||| :|| |orf5a EDEFDEDESADNIHAVSAERWRIHAATEIEDINAFFGTEYSSEEADTIGGXGHSGIGTPA
190 200 210 220 230 240
100 110 120 130orf5.pep RARRKSPYRRFAVHRRTRRQPPPAYADGDPREVSXXXXXRRFCTV
|||||| ||| | | |:| |||||||||||||||orf5a RARRKSXYRRXAXHXRXRXQPPPAYADGDPREVSSAVSVQFRMTVRAFSVSIRPIRXTX
250 260 270 280 290 300
The complete strain B sequence (ORF5-1) and ORF5a show 92.7% identity in 300 aa overlap:
10 20 30 40 50 60orf5a.pep MDGAQPKTNFXXRLIARLAREPDSAEDVLTLLRQAHEQEVFDADTLLRLEKVLDFSDLEV
|||||||||| |||||||||||||||||:||||||||||||||||||||||||||||||orf5-1 MDGAQPKTNFFERLIARLAREPDSAEDVLNLLRQAHEQEVFDADTLLRLEKVLDFSDLEV
10 20 30 40 50 60
70 80 90 100 110 120orf5a.pep RDAMITRSRMNVLKENDSIERITAYVIDTAHSRFPVIGEDKDEVLGILHAKDLLKYMFNP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf5-1 RDAMITRSRMNVLKENDSIERITAYVIDTAHSRFPVIGEDKDEVLGILHAKDLLKYMFNP
70 80 90 100 110 120
130 140 150 160 170 180orf5a.pep EQFHLKSILRPAVFVPEGKSLTALLKEFREQRNHMAIVIDEYGGTSGLVTFEDIIEQIVG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf5-1 EQFHLKSILRPAVFVPEGKSLTALLKEFREQRNHMAIVIDEYGGTSGLVTFEDIIEQIVG
130 140 150 160 170 180
190 200 210 220 230 240orf5a.pep DIEDEFDEDESADNIHAVSAERWRIHAATEIEDINAFFGTEYSSEEADTIGGXGHSGIGT
:||||||||:|||||||||:|||||||||||||||:|||||||||||||| ||| :||orf5-1 EIEDEFDEDDSADNIHAVSSERWRIHAATEIEDINTFFGTEYSSEEADTIRP-GHSRVGT
190 200 210 220 230
250 260 270 280 290 300orf5a.pep PARARRKSXYRRXAXHXRXRXQPPPAYADGDPREVSSAVSVQFRMTVRAFSVSIRPIRXT
||||||| ||| | | |:| |||||||||||||||:|||:||||||||||||||||| |orf5-1 SARARRKSPYRRFAVHRRTRRQPPPAYADGDPREVSTAVSAQFRMTVRAFSVSIRPIRQT
240 250 260 270 280 290
进一步的工作鉴定了淋病奈瑟球菌中的部分DNA序列<SEQ ID 25>,它编码的蛋白质具有氨基酸序列<SEQ ID 26;ORF5ng>.1 MDGAQPKTNF FERLIARLAR EPDSAEDVLN LLRQAHEQEV FDADTLTRLE51 KVLDFAELEV RDAMITRSRM NVLKENDSIE RITAYVIDTA HSRFPVIGED101 KDEVLGILHA KDLLKYMFNP EQFHLKSVLR PAVFVPEGKS LTALLKEFRE151 QRNHMAIVID EYGGTSGLVT FEDIIEQIVG DIEDEFDEDE SADDIHSVSA201 ERWRIHAATE IEDINAFFGT EYGSEEADTI RRLGHSGIGT PARARRKSPY251 RRFAVHRRPR RQPPPAHADG DPREVSRACP HRRFCTV*进一步的分析揭示了完整的淋球菌核苷酸序列<SEQ ID 27>是:1 ATGGACGGCG CACAACCGAA AACAAATTTT TTTGAACGCC TGATTGCCCG51 ACTCGCCCGC GAACCCGATT CCGCCGAAGA CGTATTAAAC CTGCTTCGGC101 AGGCGCACGA ACAGGAAGTT TTTGATGCCG ACACACTGAC CCGGCTGGAA151 AAAGTATTGG ACTTTGCCGA GCTGGAAGTG CGCGATGCGA TGATTACGCG201 CAGCCGCATG AACGTATTGA AAGAAAACGA CAGCATCGAA CGCATCACCG251 CCTACGTCAT CGATACCGCC CATTCGCGCT TCCCCGTCAT CGGCGAAGAC301 AAAGACGAAG TTTTGGGCAT TTTGCACGCC AAAGACCTGC TCAAATATAT351 GTTCAACCCC GAGCAGTTCC ACCTGAAATC CGTCTTGCGC CCTGCCGTTT401 TCGTGCCCGA AGGCAAATCT TTGACCGCCC TTTTAAAAGA GTTCCGCGAA451 CAGCGCAACC ATATGGCAAT CGTCATCGAC GAATACGGCG GCACGTCGGG501 TTTGGTCACC TTTGAAGACA TCATCGAGCA AATCGTCGGT GACATCGAAG551 ACGAGTTTGA CGAAGACGAA AGCGccgacg acatCCACTC cgTTTccgCC601 GAACGCTGGC GCATCCacgc ggctaCCGAA ATCGAAGaca TCAACGCCTT651 TTTCGGTACG GAatacggca gcgaagaagc cgacaccatc cggcggctTG701 GTCATTCAGG AATTGGGACA CCTGCCCGTG CGCGGCGAAA AAGTCCTTAt751 cggcgGTTTG Cagttcaccg tCGCCCGCGC CGACAACCGC CGCCTGCACA801 CGCTGATGGC GACCCGCGTG AAGTAAGCAG AGCCTGCCcg AccgccgttT851 CTGCacAGTT TAGGatgACG gtaCGGTCGT TTTCTGTTTC AATCCGCCCC901 ATCCGCCAAA CATAA它编码的蛋白质具有氨基酸序列<SEQID 28;ORF5ng-1>:1 MDGAQPKTNF FERLIARLAR EPDSAEDVLN LLRQAHFQEV FDADTLTRLE51 KVLDFAELEV RDAMITRSRM NVLKENDSIE RITAYVIDTA HSRFPVIGED101 KDEVLGILHA KDLLKYMFNP FQFHLKSVLR PAVFVPEGKS LTALLKEFRE151 QRNHMAIVID EYGGTSGLVT FEDIIEQIVG DIEDEFDEDE SADDIHSVSA201 ERWRIHAATE IEDINAFFGT EYGSEEADTI RRLGHSGIGT PARARRKSPY251 RRFAVHRRPR RQPPPAHADG DPREVSRACP TAVSAQFRMT VRSFSVSIRP301 IRQT*
The originally identified partial strain B sequence (ORF5) shows 83.1% identity over a 135 aa overlap with the partial gonococcal sequence (ORF5ng):
orf5 NHMAIVIDEYGGTSGLVTFEDIIEQIVGEI 30
||||||||||||||||||||||||||||:|orf5ng FHLKSVLRPAVFVPEGKSLTALLKEFREQRNHMAIVIDEYGGTSGLVTFEDIIEQIVGDI 182orf5 EDEFDEDDSADNIHAVSSDTWRIHAATEIEDINTFFGTEYSIEEADTIXRPGHSRVGTSA 90
|||||||:|||:||:||:: |||||||||||||:||||||: |||||| | ||| :|| |orf5ng EDEFDEDESADDIHSVSAERWRIHAATEIEDINAFFGTEYGSEEADTIRRLGHSGIGTPA 242orf5 RARRKSPYRRFAVHRRTRRQPPPAYADGDPREVSX----RRFCTV 131
|||||||||||||||| |||||||:||||||||| ||||||orf5ng RARRKSPYRRFAVHRRPRRQPPPAHADGDPREVSRACPHRRFCTV 287
The complete strain B and gonococcal sequences (ORF5-1 and ORF5ng-1) show 92.4% identity in 304 aa overlap:
10 20 30 40 50 60orf5ng-1.pep MDGAQPKTNFFERLIARLAREPDSAEDVLNLLRQAHEQEVFDADTLTRLEKVLDFAELEV
|||||||||||||||||||||||||||||||||||||||||||||| ||||||||::|||orf5-1 MDGAQPKTNFFERLIARLAREPDSAEDVLNLLRQAHEQEVFDADTLLRLEKVLDFSDLEV
10 20 30 40 50 60
70 80 90 100 110 120orf5ng-1.pep RDAMITRSRMNVLKENDSIERITAYVIDTAHSRFPVIGEDKDEVLGILHAKDLLKYMFNP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf5-1 RDAMITRSRMNVLKENDSIERITAYVIDTAHSRFPVIGEDKDEVLGILHAKDLLKYMFNP
70 80 90 100 110 120
130 140 150 160 170 180orf5ng-1.pep EQFHLKSVLRPAVFVPEGKSLTALLKEFREQRNHMAIVIDEYGGTSGLVTFEDIIEQIVG
|||||||:||||||||||||||||||||||||||||||||||||||||||||||||||||orf5-1 EQFHLKSILRPAVFVPEGKSLTALLKEFREQRNHMAIVIDEYGGTSGLVTFEDIIEQIVG
130 140 150 160 170 180
190 200 210 220 230 240orf5ng-1.pep DIEDEFDEDESADDIHSVSAERWRIHAATEIEDINAFFGTEYGSEEADTIRRLGHSGIGT
:||||||||:|||:||:||:|||||||||||||||:||||||:|||||||| ||| :||orf5-1 EIEDEFDEDDSADNIHAVSSERWRIHAATEIEDINTFFGTEYSSEEADTIRP-GHSRVGT
190 200 210 220 230
250 260 270 280 290 300orf5ng-1.pep PARARRKSPYRRFAVHRRPRRQPPPAHADGDPREVSRACPTAVSAQFRMTVRSFSVSIRP
||||||||||||||||| |||||||:||||||||| ||||||||||||:|||||||orf5-1 SARARRKSPYRRFAVHRRTRRQPPPAYADGDPREVS----TAVSAQFRMTVRAFSVSIRP
240 250 260 270 280 290orf5ng-1.pep IRQTX
|||||orf5-1 IRQTX
300
Computer analysis of these amino acid sequences indicated a putative leader sequence and identified the following homologies:
Homology with the hemolysin homolog TlyC (accession U32716) of H. influenzae
ORF5 and the TlyC protein show 58% aa identity in 77 aa overlap (BLASTp).
ORF5 2 HMAIVIDEYGGTSGLVTFEDIIEQIVGEIEDEFDEDDSADNIHAVSSDTWRIHAATEIED 61
HMAIV+DE+G SGLVT EDI+EQIVG+IEDEFDE++ AD I +S T+ + A T+I+D
TlyC 166 HMAIVVDEFGAVSGLVTIEDILEQIVGDIEDEFDEEEIAD-IRQLSRHTYAVRALTDIDD 224
ORF5 62 INTFFGTEYSIEEADTI 78
N F T++ EE DTI
TlyC 225 FNAQFNTDFDDEEVDTI 241
ORF5ng-1 also shows significant homology with TlyC:
Scores Init1: 301 Initn: 419 Opt: 668
Smith-Waterman score: 668; 45.9% identity in 242 aa overlap
10 20 30 40 50orf5ng-1.pep MDGAQPKTNFFERLIARLAR-EPDSAEDVLNLLRQAHEQEVFDADTLTRLEK
| ||: |::|: : | : |::::::|::::::::| :| :|tlyc_haein MNDEQQNSNQSENTKKPFFQSLFGRFFQGELKNREELVEVIRDSEQNDLIDQNTREMIEG
10 20 30 40 50 60
60 70 80 90 100 109orf5ng-1.pep VLDFAELEVRDAMITRSRMNVLKENDSIERITAYVIDTAHSRFPVIGE--DKDEVLGILH
|:::|||:||| || ||:: :::::::: :|::||||||||:: |:|:::||||tlyc_haein VMEIAELRVRDIMIPRSQIIFIEDQQDLNTCLNTIIESAHSRFPVIADADDRDNIVGILH
70 80 90 100 110 120
110 120 130 140 150 160orf5ng-1.pep AKDLLKYMF-NPEQFHLKSVLRPAVFVPEGKSLTALLKEFREQRNHMAIVIDEYGGTSGL
||||||:: : | | |:|:|||:|:|||:| : :||:|| :| |||||:||:|::|||tlyc_haein AKDLLKFLREDAEVFDLSSLLRPVVIVPESKRVDRMLKDFRSERFHMAIVVDEFGAVSGL
130 140 150 160 170 180
170 180 190 200 210 220orf5ng-1.pep VTFEDIIEQIVGDIEDEFDEDESADDIHSVSAERWRIHAATEIEDINAFFGTEYGSEEAD
||:|||:|||||||||||||:| || |:::| : : ::| |:|:|:|| |:|:: :||:|tlyc_haein VTIEDILEQIVGDIEDEFDEEEIAD-IRQLSRHTYAVRALTDIDDFNAQFNTDFDDEEVD
190 200 210 220 230
230 240 250 260 270 280orf5ng-1.pep TIRRLGHSGIG-TPARARRKSPYRRFAVHRRPRRQPPPAHADGDPREVSRACPTAVSAQF
|| | : :| | |:tlyc_haein TIGGLIMQTFGYLPKRGEEIILKNLQFKVTSADSRRLIQLRVTVPDEHLAEMNNVDEKSE
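240 250 260 270 280 290
Homology with a hypothetical secreted protein from E. coli:
ORF5 shows homology to a hypothetical secreted protein from E. coli:
sp|P77392|YBEX_ECOLI hypothetical 33.3 kD protein in CUTE-ASNB intergenic region
>gi|1778577 (U82598) similar to H. influenzae [Escherichia coli]
>gi|1786879 (AE000170) f292; this 292 aa ORF is 23% identical (9 gaps) to 272 residues of an approx. 440 aa protein YTFL_HAEIN SW: P44717 [Escherichia coli]
Length = 292
Score = 212 bits (533), Expect = 3e-54
Identities = 112/230 (48%), Positives = 149/230 (64%), Gaps = 3/230 (1%)
Query: 2 DGAQPKTNFXXRLIARLAR-EPDSAEDVLTLLRQAHEQEVFDADTLLRLEKVLDFSDLEV 60
D K F L+++L EP + +++L L+R + + ++ D DT LE V+D +D V
Sbjct: 10 DTISNKKGFFSLLLSQLFHGEPKNRDELLALIRDSGQNDLIDEDTRDMLEGVMDIADQRV 69
Query: 61 RDAMITRSRMNVLKENDSIERITAYVIDTAHSRFPVIGEDKDEVLGILHAKDLLKYM-FN 119
RD MI RS+M LK N +++ +I++AHSRFPVI EDKD + GIL AKDLL +M +
Sbjct: 70 RDIMIPRSQMITLKRNQTLDECLDVIIESAHSRFPVISEDKDHIEGILMAKDLLPFMRSD 129
Query: 120 PEQFHLKSILRPAVFVPEGKSLTALLKEFREQRNHMAIVIDEYGGTSGLVTFEDIIEQIV 179
E F + +LR AV VPE K + +LKEFR QR HMAIVIDE+GG SGLVT EDI+E IV
Sbjct: 130 AEAFSMDKVLRQAVVVPESKRVDRMLKEFRSQRYHMAIVIDEFGGVSGLVTIEDILELIV 189
Query: 180 GDIEDEFDEDESADNIHAVSAERWRIHAATEIEDINAFFGTEYSSEEADT 229
G+IEDE+DE++ D +S W + A IED N FGT +S EE DT
Sbjct: 190 GEIEDEYDEEDDID-FRQLSRHTWTVRALASIEDFNEAFGTHFSDEEVDT 238
Based on this analysis, including the amino acid homology to the TlyC hemolysin homologue from H. influenzae (hemolysins are secreted proteins), it is predicted that this protein from N. meningitidis and N. gonorrhoeae is secreted and could therefore be a useful antigen for vaccines or diagnostics.
As described above, ORF5-1 (30.7 kDa) was cloned in pGex and expressed in E. coli. The products of protein expression and purification were analysed by SDS-PAGE. Figure 2A shows the results of affinity purification of the GST-fusion protein. Purified GST-fusion protein was used to immunise mice, whose sera were used for Western blot analysis (Figure 1B). These experiments confirm that ORF5-1 is a surface-exposed protein and a useful immunogen.
Example 5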
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQID 29>:1 ATGCGCGGCG GCAGGCCGGA TTCCGTTACC GTGCAGATTA TCGAAGGTTC51 GCGTTTTTCG CATATGAGGA AAGTCATCGA CGCAACGCCC GACATCGGAC101 ACGACACCAA AGGCTGGAGC AATGAAAAAC TGATGGCGGA AGTTGCGCCC151 GATGCCTTCA GCGGCAATCC TGAAgGGCAG TTTTTCCCCG ACAGCTACGA201 AATCGATGCG GGCGGCAGTG ATTTGCAGAT TTACCAAACC GCCTACAAgG251 GCGATGCAAC GCCGCCTGAA TGAgGGCATG GGAAAGCAGG CAGGACGGGC301 TGCCTTATAA AAACCCTTAT GAAATGCTGA TTATGGCGAr CCTGGTCGAA351 AAGGAAACAG GGCATGAAGC CGAsCsCGAC CATGTcGCTT CCGTCTTCGT401 CAACCGCCTG AAAATCGGTA TGCGCCTGCA AACCgAssCG TCCGTGATTT451 ACGGCATGGG TGCGGCATAC AAGGGCAAAA TCCGTAAAGC CGACCTGCGC501 CGCGACACGC CGTACAACAC CTACACGCGC GGCGGTCTGC CGCCAACCCC551 GATTGCGCTG CCC..它对应于氨基酸序列<SEQ ID 30;ORF7>: 1 MRGGRPDSYT VQIIEGSRFS HMRKVIDATP DIGHDTKGWS NEKLMAEYAP51 DAFSGNPEGQ FFPDSYEIDA GGSDLQIYQT AYKAMQRRLN EAWESRQDGL101 PYKNPYEMLI MAXLVEKETG HEAXXDHVAS VFVNRLKIGM RLQTXXSVIY151 GMGAAYKGKI RKADLRRDTP YNTYTRGGLP PTPIALP..进一步的序列分析揭示了完整的DNA序列<SEQ ID 31>:1 ATGTTGAGAA AATTGTTGAA ATGGTCTGCC GTTTTTTTGA CCGTGTCGGC51 AGCCGTTTTC GCCGCGCTGC TTTTTGTTCC TAAGGATAAC GGCAGGGCAT101 ACCGAATCAA AATTGCCAAA AACCAGGGTA TTTCGTCGGT CGGCAGGAAA151 CTTGCCGAAG ACCGCATCGT GTTCAGCAGG CATGTTTTGA CGGCGGCGGC201 CTACGTTTTG GGTGTGCACA ACAGGCTGCA TACGGGGACG TACAGATTGC251 CTTCGGAAGT GTCTGCTTGG GATATCTTGC AGAAAATGCG CGGCGGCAGG301 CCGGATTCCG TTACCGTGCA GATTATCGAA GGTTCGCGTT TTTCGCATAT351 GAGGAAAGTC ATCGACGCAA CGCCCGACAT CGGACACGAC ACCAAAGGCT401 GGAGCAATGA AAAACTGATG GCGGAAGTTG CGCCCGATGC CTTCAGCGGC451 AATCCTGAAG GGCAGTTTTT CCCCGACAGC TACGAAATCG ATGCGGGCGG501 CAGTGATTTG CAGATTTACC AAACCGCCTA CAAGGCGATG CAACGCCGCC551 TGAATGAGGC ATGGGAAAGC AGGCAGGACG GGCTGCCTTA TAAAAACCCT601 TATGAAATGC TGATTATGGC GAGCCTGGTC GAAAAGGAAA CAGGGCATGA651 AGCCGACCGC GACCATGTCG CTTCCGTCTT CGTCAACCGC CTGAAAATCG701 GTATGCGCCT GCAAACCGAC CCGTCCGTGA TTTACGGCAT GGGTGCGGCA751 TACAAGGGCA AAATCCGTAA AGCCGACCTG CGCCGCGACA CGCCGTACAA801 CACCTACACG CGCGGCGGTC TGCCGCCAAC CCCGATTGCG CTGCCCGGCA851 AGGCGGCACT CGATGCCGCC GCCCATCCGT CCGGCGAAAA 
ATACCTGTAT901 TTCGTGTCCA AAATGGACGG CACGGGCTTG AGCCAGTTCA GCCATGATTT951 GACCGAACAC AATGCCGCCG TCCGCAAATA TATTTTGAAA AAATAA它对应于氨基酸序列<SEQ ID 32;ORF7-1>:1 MLRKLLKWSA VFLTVSAAVF AALLFVPKDN GRAYRIKIAK NQGISSVGRK51 LAEDRIVFSR HVLTAAAYVL GVHNRLHTGT YRLPSEVSAW DILQKMRGGR101 PDSVTVQIIE GSRFSHMRKV IDATPDIGHD TKGWSNEKLM AEVAPDAFSG151 NPEGQFFPDS YEIDAGGSDL QIYQTAYKAM QRRLNEAWES RQDGLPYKNP201 YEMLIMASLV EKETGHEADR DHVASVFVNR LKIGMRLQTD PSVIYGMGAA251 YKGKIRKADL RRDTPYNTYT RGGLPPTPIA LPGKAALDAA AHPSGEKYLY301 FVSKMDGTGL SQFSHDLTEH NAAVRKYILK K*该氨基酸序列的计算机分析给出了下列结果:与流感嗜血菌的vceg基因(登录号为P44270)编码的假设蛋白的同源性ORF7和yceg蛋白在重叠的192个氨基酸内显示出有44%的氨基酸相同性:ORF7 1 MRGGRPDSVTVQIIEGSRFSHMRKVIDATPDIGHDTKGWSNEKLMA-----EVAPDAFSG 55
+ G+ V+ IEG F RK ++ P + K SNE++ A ++ +yceg 102 LNSGKEVQFNVKWIEGKTFKDWRKDLENAPHLVQTLKDKSNEEIFALLDLPDIGQNLELK 161ORF7 56 NPEGQFFPDSYEIDAGGSDLQIYQTAYKAMQRRLNEAWESRQDGLPYKNPYEMLIMAXLV 115
N EG +PD+Y +DL++ + + + M++ LN+AW R + LP NPYEMLI+A +Vyceg 162 NVEGWLYPDTYNYTPKSTDLELLKRSAERMKKALNKAWNERDEDLPLANPYEMLILASIV 221ORF7 116 EKETGHEAXXDHVASVFVNRLKIGMRLQTXXSVIYGMGAAYKGKIRKADLRRDTPYNTYT 175
EKETG VASVF+NRLK M+LQT +VIYGMG Y G IRK DL TPYNTYyceg 222 EKETGIANERAKVASVFINRLKAKMKLQTDPTVIYGMGENYNGNIRKKDLETKTPYNTYV 281ORF7 176 RGGLPPTPIALP 187
GLPPTPIA+Pyceg 282 IDGLPPTPIAMP 293全长YCEG蛋白具有以下序列:1 MKKFLIAILL LILILAGVAS FSYYKMTEFV KTPVNVQADE LLTIERGTTS51 SKLATLFEQE KLIADGKLLP YLLKLKPELN KIKAGTYSLE NVKTVQDLLD101 LLNSGKEVQF NVKWIEGKTF KDWRKDLENA PHLVQTLKDK SNEEIFALLD151 LPDIGQNLEL KNVEGWLYPD TYNYTPKSTD LELLKRSAER MKKALNKAWN201 ERDEDLPLAN PYEMLILASI VEKETGIANE RAKVASVFIN RLKAKMKLQT251 DPTVIYGMGE NYNGNIRKKD LETKTPYNTY VIDGLPPTPI AMPSESSLQA301 VANPEKTDFY YFVADGSGGH KFTRNLNEHN KAVQEYLRWY RSQKNAK
Homology with a predicted ORF from N. meningitidis (strain A)
ORF7 shows 95.2% identity over a 187 aa overlap with an ORF (ORF7a) from strain A of N. meningitidis:
10 20 30orf7.pep MRGGRPDSVTVQIIEGSRFSHMRKVIDATP
||||||||||||||||||||||||||||||orf7a AAYVLGVHNRLHTGTYRLPSEVSAWDILQKMRGGRPDSVTVQIIEGSRFSHMRKVIDATP
70 80 90 100 110 120
40 50 60 70 80 90orf7.pep DIGHDTKGWSNEKLMAEVAPDAFSGNPEGQFFPDSYEIDAGGSDLQIYQTAYKAMQRRLN
|| ||||||||||||||||||||||||||||||||||||||||||:||| ||||||||||orf7a DIEHDTKGWSNEKLMAEVAPDAFSGNPEGQFFPDSYEIDAGGSDLRIYQIAYKAMQRRLN
130 140 150 160 170 180
100 110 120 130 140 150orf7.pep EAWESRQDGLPYKNPYEMLIMAXLVEKETGHEAXXDHVASVFVNRLKIGMRLQTXXSVIY
|||||||||||||||||||||| |:|||||||| ||||||||||||||||||| ||||orf7a EAWESRQDGLPYKNPYEMLIMASLIEKETGHEADRDHVASVFVNRLKIGMRLQTDPSVIY
190 200 210 220 230 240
160 170 180orf7.pep GMGAAYKGKIRKADLRRDTPYNTYTRGGLPPTPIALP
||||||||||||||||||||||||||||||||||||||orf7a GMGAAYKGKIRKADLRRDTPYNTYTRGGLPPTPIALPGKAALDAAAHPSGEKYLYFVSKM
250 260 270 280 290 300orf7a DGTGLSQFSHDLTEHNAAVRKYILKKX
310 320 330全长ORF7a核苷酸序列<SEQ ID 33>是:1 ATGTTGAGAA AATTGTTGAA ATGGTCTGCC GTTTTTTTGA CCGTATCGGC51 AGCCGTTTTC GCCGCGCTGC TTTTCGTCCC TAAAGACAAC GGCAGGGCAT101 ACAGGATTAA AATTGCCAAA AACCAGGGTA TTTCGTCGGT CGGCAGGAAA151 CTTGCCGAAG ACCGCATCGT GTTCAGCAGG CATGTTTTGA CGGCGGCGGC201 CTACGTTTTG GGTGTGCACA ACAGGCTGCA TACGGGGACG TACAGACTGC251 CTTCGGAAGT GTCTGCTTGG GATATCTTGC AGAAAATGCG CGGCGGCAGG301 CCGGATTCCG TTACCGTGCA GATTATCGAA GGTTCGCGTT TTTCGCATAT351 GAGGAAAGTC ATCGACGCAA CGCCCGACAT CGAACACGAC ACCAAAGGCT401 GGAGCAATGA AAAACTGATG GCGGAAGTTG CCCCTGATGC CTTCAGCGGC451 AATCCTGAAG GGCAGTTTTT CCCCGACAGC TACGAAATCG ATGCGGGCGG501 CAGCGATTTA CGGATTTACC AAATCGCCTA CAAGGCGATG CAACGCCGAC551 TGAATGAGGC ATGGGAAAGC AGGCAGGACG GGCTGCCTTA TAAAAACCCT601 TATGAAATGC TGATTATGGC GAGCCTGATC GAAAAGGAAA CAGGGCATGA651 AGCCGACCGC GACCATGTCG CTTCCGTCTT CGTCAACCGC CTGAAAATCG701 GTATGCGCCT GCAAACCGAC CCGTCCGTGA TTTACGGCAT GGGTGCGGCA751 TACAAGGGCA AAATCCGTAA AGCCGACCTG CGCCGCGACA CGCCGTACAA801 CACCTACACG CGCGGCGGTC TGCCGCCAAC CCCGATCGCG CTGCCCGGCA851 AGGCGGCACT CGATGCCGCC GCCCATCCGT CCGGTGAAAA ATACCTGTAT901 TTCGTGTCCA AAATGGACGG TACGGGCTTG AGCCAGTTCA GCCATGATTT951 GACCGAACAC AACGCCGCCG TTCGCAAATA TATTTTGAAA AAATAA预计它编码的蛋白质具有氨基酸序列<SEQ ID 34>:1 MLRKLLKWSA VFLTVSAAVF AALLFVPKDN GRAYRIKIAK NQGISSVGRN51 LAEDRIVFSR HVLTAAAYVL GVHNRLHTGT YRLPSEVSAW DILQKMRGGR101 PDSVTVQIIE GSRFSHMRKV IDATPDIEHD TKGWSNEKLM AEVAPDAFSG151 NPEGQFFPDS YEIDAGGSDL RIYQIAYKAM QRRLNEAWES RQDGLPYKNP201 YEMLIMASLI EKETGHEADR DHVASVFVNR LKIGMRLQTD PSVIYGMGAA251 YKGKIRKADL RRDTPYNTYT RGGLPPTPIA LPGKAALDAA AHPSGEKYLY301 FVSKMDGTGL SQFSHDLTEH NAAVRKYILK K*
The leader peptide is underlined.
ORF7a and ORF7-1 show 98.1% identity over a 133 aa overlap:
10 20 30 40 50 60
orf7a.pep MLRKLLKWSAVFLTVSAAVFAALLFVPKDNGRAYRIKIAKNQGISSVGRKLAEDRIVFSR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf7-1 MLRKLLKWSAVFLTVSAAVFAALLFVPKDNGRAYRIKIAKNQGISSVGRKLAEDRIVFSR
10 20 30 40 50 60
70 80 90 100 110 120
orf7a.pep HVLTAAAYVLGVHNRLHTGTYRLPSEVSAWDILQKMRGGRPDSVTVQIIEGSRFSHMRKV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf7-1 HVLTAAAYVLGVHNRLHTGTYRLPSEVSAWDILQKMRGGRPDSVTVQIIEGSRFSHMRKV
70 80 90 100 110 120
130 140 150 160 170 180
orf7a.pep IDATPDIEHDTKGWSNEKLMAEVAPDAFSGNPEGQFFPDSYEIDAGGSDLRIYQIAYKAM
||||||| ||||||||||||||||||||||||||||||||||||||||||:||| |||||
orf7-1 IDATPDIGHDTKGWSNEKLMAEVAPDAFSGNPEGQFFPDSYEIDAGGSDLQIYQTAYKAM
130 140 150 160 170 180
190 200 210 220 230 240
orf7a.pep QRRLNEAWESRQDGLPYKNPYEMLIMASLIEKETGHEADRDHVASVFVNRLKIGMRLQTD
|||||||||||||||||||||||||||||:||||||||||||||||||||||||||||||
orf7-1 QRRLNEAWESRQDGLPYKNPYEMLIMASLVEKETGHEADRDHVASVFVNRLKIGMRLQTD
190 200 210 220 230 240
250 260 270 280 290 300
orf7a.pep PSVIYGMGAAYKGKIRKADLRRDTPYNTYTRGGLPPTPIALPGKAALDAAAHPSGEKYLY
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf7-1 PSVIYGMGAAYKGKIRKADLRRDTPYNTYTRGGLPPTPIALPGKAALDAAAHPSGEKYLY
250 260 270 280 290 300
310 320 330
orf7a.pep FVSKMDGTGLSQFSHDLTEHNAAVRKYILKKX
||||||||||||||||||||||||||||||||
orf7-1 FVSKMDGTGLSQFSHDLTEHNAAVRKYILKKX
310 320 330
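The percent-identity figures quoted for these alignments can be illustrated with a short calculation. The following is a minimal sketch, not the program used to produce the figures in this document: the function name `percent_identity` is hypothetical, and it assumes that identity is the number of identical, non-gap columns divided by the length of the aligned overlap. Alignment programs may use a slightly different denominator. The two rows are residues 121-180 of the ORF7a / ORF7-1 alignment above.

```python
def percent_identity(row_a: str, row_b: str, gap: str = "-") -> tuple[float, int]:
    """Return (percent identity, overlap length) for two aligned rows.

    Both rows must already be aligned and of equal length. A column counts
    as identical only when both residues match and neither is a gap.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    if not row_a:
        raise ValueError("aligned rows must not be empty")
    matches = sum(1 for a, b in zip(row_a, row_b) if a == b and a != gap)
    return 100.0 * matches / len(row_a), len(row_a)


# One 60-residue row of the ORF7a / ORF7-1 alignment (residues 121-180).
orf7a_row = "IDATPDIEHDTKGWSNEKLMAEVAPDAFSGNPEGQFFPDSYEIDAGGSDLRIYQIAYKAM"
orf7_1_row = "IDATPDIGHDTKGWSNEKLMAEVAPDAFSGNPEGQFFPDSYEIDAGGSDLQIYQTAYKAM"

identity, overlap = percent_identity(orf7a_row, orf7_1_row)
print(f"{identity:.1f}% identity in {overlap} aa overlap")
```

This row contains three mismatched columns (E/G, R/Q and I/T), so the sketch reports 57 identical positions out of 60. Applying the same function to the concatenated rows of a full alignment gives the identity over the whole overlap.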
Homology with a predicted ORF from N. gonorrhoeae
ORF7 shows 94.7% identity over a 187 aa overlap with a predicted ORF (ORF7.ng) from N. gonorrhoeae:
orf7 MRGGRPDSVTVQIIEGSRFSHMRKVIDATPDIGHITKGWSNEKLMAEVAPDAFSGNPEGQ 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf7ng MRGGRPDSVTVQIIEGSRFSHMRKVIDATPDIGHDTKGWSNEKLMAEVAPDAFSGNPEGQ 60
orf7 FFPDSYEIDAGGSDLQIYQTAYKAMQRRLNEAWESRQDGLPYKNPYEMLIMAXLVEKETG 120
||||||||||||||||||||||||||||||||| :||||||||||||||||| |:|||||
orf7ng FFPDSYEIDAGGSDLQIYQTAYKAMQRRLNEAWAGRQDGLPYKNPYEMLIMASLIEKETG 120
orf7 HEAXXDHVASVFVNRLKIGMRLQTXXSVIYGMGAAYKGKIRKADLRRDTPYNTYTRGGLP 180
||| ||||||||||||||||||| ||||||||||||||||||||||||||||| ||||
orf7ng HEADRDHVASVFVNRLKIGMRLQTDPSVIYGMGAAYKGKIRKADLRRDTPYNTYTGGGLP 180
orf7 PTPIALP 187
|| ||||
orf7ng PTRIALPGKAAMDAAAHPSGEKYLYFVSKMDGTGLSQFSHDLTEHNAAYRKYILKK 236
预计ORF7ng核苷酸序列<SEQ ID 35>编码的蛋白质具有氨基酸序列<SEQ ID36>:1 MRGGRPDSVT VQIIEGSRFS HMRKVIDATP DIGHDTKGWS NEKLMAEVAP51 DAFSGNPEGQ FFPDSYEIDA GGSDLQIYQT AYKAMQRRLN EAWAGRQDGL101 PYKNPYFMLI MASLIEKETG HEADRDHVAS VFVNRLKIGM RLQTDPSVIY151 GMGAAYKGKI RKADLRRDTP YNTYTGGGLP PTRIALPGKA AMDAAAHPSG201 EKYLYFVSKM DGTGLSQFSH DLTEHNAAVR KYILKK*进一步的序列分析揭示了ORF7ng的部分DNA序列<SEQ ID 37>:1 ..taccgaatca AGATTGCCAA AAATCAGGGT ATTTCGTCGG TCGGCAGGAA51 ACTTGCcgaA GACCGCATCG TGTTCAGCAG GCATGTTTTG ACAGCGGCGG101 CCTACGTTTT GGGTGTGCAC AACAGGCTGC ATACGGGGAC gTACAGATTG151 CCTTCGGAAG TGTCTGCTTG GGATATCTTG CAGAAAATGC GCGGCGGCAG201 GCCGGATTCC GTTACCGTGC AGATTATCGA AGGTTCGCGT TTTTCGCATA251 TGAGGAAAGT CATCGACGCA ACGCCCGACA TCGGACACGA CACCAAAGGC301 TGGAGCAATG AAAAACTGAT GGCGGAAGTT GCGCCCGATG CCTTCAGCGG351 CAATCCTGAA GGGCAGTTTT TTCCCGACAG CTACGAAATC GATGCGGGCG401 GCAGCGATTT GCAGATTTAC CAAACCGCCT ACAAGGCGAT GCAACGCCGC451 CTGAACGAGG CATGGGCAGG CAGGCAGGAC GGGCTGCCTT ATAAAAACCC501 TTATGAAATG CTGATTATGG CGAGCCTGAT CGAAAAGGAA ACGGGGCATG551 AGGCCGACCG CGACCATGTC GCTTCCGTCT TCGTCAACCG CCTGAAAATC601 GGTATGCGCC TGCAAACCGA CCCGTCCGTG ATTTACGGCA TGGGTGCGGC651 ATACAAGGGC AAAATCCGTA AAGCCGACCT GCGCCGCGAC ACGCCGTACA701 aCAccTAtac gggcgggggc ttgccgccaa cccggattgc gctgcccggC751 Aaggcggcaa tggatgccgc cgcccacccg tccggcgaAa aatacctgTa801 tttcgtgtcC AAAATGGACG GCACGGGCTT GAGCCAGTTC AGCCATGATT851 TGACCGAACA CAACGCCGCc gTcCGCAAAT ATATTTTGAA AAAATAA它对应于氨基酸序列<SEQID 38:ORF7ng-1>:1 ..YRIKIAKNQG ISSVGRKLAE DRIVFSRHVL TAAAYVLGVH NRLHTGTYRL51 PSEVSAWDIL QKMRGGRPDS VTVQIIEGSR FSHMRKVIDA TPDIGHDTKG101 WSNEKLMAEV APDAFSGNPE GQFFPDSYEI DAGGSDLQIY QTAYKAMQRR151 LNEAWAGRQD GLPYKNPYEM LIMASLIEKE TGHEADRDHV ASVFVNRLKI201 GMRLQTDPSV IYGMGAAYKG KIRKADLRRD TPYNTYTGGG LPPTRIALPG251 KAAMDAAAHP SGEKYLYFVS KMDGTGLSQF SHDLTEHNAA VRKYILKK*ORF7ng-1和ORF7-1在重叠的298个氨基酸内有98.0%的相同性:
10 20 30 40 50 60orf7-1.pep KLLKWSAVFLTVSAAVFAALLFVPKDNGRAYRIKIAKNQGISSVGRKLAEDRIVFSRHVL
||||||||||||||||||||||||||||||orf7ng-1 YRIKIAKNQGISSYGRKLAEDRIVFSRHVL
10 20 30
70 80 90 100 110 120orf7-1.pep TAAAYVLGVHNRLHTGTYRLPSEVSAWDILQKMRGGRPDSVTVQIIEGSRFSHMRKVIDA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf7ng-1 TAAAYVLGVHNRLHTGTYRLPSEVSAWDILQKMRGGRPDSVTVQIIEGSRFSHMRKVIDA
40 50 60 70 80 90
130 140 150 160 170 180orf7-1.pep TPDIGHDTKGWSNEKLMAEVAPDAFSGNPEGQFFPDSYEIDAGGSDLQIYQTAYKAMQRR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf7ng-1 TPDIGHDTKGWSNEKLMAEVAPDAFSGNPEGQFFPDSYEIDAGGSDLQIYQTAYKAMQRR
100 110 120 130 140 150
190 200 210 220 230 240orf7-1.pep LNEAWESRQDGLPYKNPYEMLIMASLVEKETGHEADRDHVASVFVNRLKIGMRLQTDPSV
||||| :|||||||||||||||||||:|||||||||||||||||||||||||||||||||orf7ng-1 LNEAWAGRQDGLPYKNPYEMLIMASLIEKETGHEADRDHVASVFVNRLKIGMRLQTDPSV
160 170 180 190 200 210
250 260 270 280 290 300orf7-1.pep IYGMGAAYKGKIRKADLRRDTPYNTYTRGGLPPTPIALPGKAALDAAAHPSGEKYLYFVS
||||||||||||||||||||||||||| |||||| ||||||||:||||||||||||||||orf7ng-1 IYGMGAAYKGKIRKADLRRDTPYNTYTGGGLPPTRIALPGKAAMDAAAHPSGEKYLYFVS
220 230 240 250 260 270
310 320 330orf7-1.pep KMDGTGLSQFSHDLTEHNAAVRKYILKKX
|||||||||||||||||||||||||||||orf7ng-1 KMDGTGLSQFSHDLTEHNAAVRKYILKKX
280 290另外,ORF7ng-1显示出与一种假设的大肠杆菌蛋白明显同源:sp|P28306|YCEG_ECOLI PABC-HOLB基因间区域中假设的38.2KD蛋白gi|1787339(AE000210)o340;与YCEG_ECOLI片段100%相同SW:P28306但有97个附加的C端残基[大肠杆菌]长度=340评分=79(36.2位),估计值=5.0e-57,Sum P(2)=5.0e-57相同性=20/87(22%),阳性=40/87(45%)询问: 10 GISSVGRKLAEDRIVFSRHVLTAAAYVLGVHNRLHTGTYRLPSEVSAWDILQKMRGGRPD 69
G ++G +L D+I+ V + + GTYR +++ ++L+ + G+目标: 49 GRLALGEQLYADKIINRPRVFQWLLRIEPDLSHFKAGTYRFTPQMTVREMLKLLESGKEA 108询问: 70 SVTVQIIEGSRFSHMRKVIDATPDIGH 96
++++EG R S K + P I H目标: 109 QFPLRLVEGMRLSDYLKQLREAPYIKH 135评分=438(200.7位),估计值=5.0e-57,Sum P(2)=5.0e-57相同性=84/155(54%),阳性=111/155(71%)询问:120 EGQFFPDSYEIDAGGSDLQIYQTAYKAMQRRLNEAWAGRQDGLPYKNPYEMLIMASLIEK 179
EG F+PD++ A +D+ + + A+K M + ++ AW GR DGLPYK+ +++ MAS+IEK目标:158 EGWFWPDTWMYTANTTDVALLKRAHKKMVKAVDSAVEGRADGLPYKDKNQLVTMASIIEK 217询问:180 ETGHEADRDHVASVFVNRLKIGMRLQTDPSVIYGMGAAYKGKIRKADLRRDTPYNTYTGG 239
ET ++RD VASVF+NRL+IGMRLQTDP+VIYGMG Y GK+ +ADL T YNTYT目标:218 ETAVASERDKVASVFINRLRIGMRLQTDPTVIYGMGERYNGKLSRADLETPTAYNTYTIT 277询问:240 GLPPTRIALPGKAAMDAAAHPSGEKYLYFVSKMDG 274
GLPP IA PG ++ AAAHP+ YLYFV+ G目标:278 GLPPGAIATPGADSLKAAAHPAKTPYLYFVADGKG 312
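Based on this analysis, including the fact that the H. influenzae YCEG protein possesses a possible leader sequence, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 6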
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 39>:1 CGTTTCAAAA TGTTAACTGT GTTGACGGCA ACCTTGATTG CCGGACAGGT51 ATCTGCCGCC GGAGGCGGTG CGGGGGATAT GAAACAGCCG AAGGAAGTCG101 GAAAGGTTTT CAGAAAGCAG CAGCGTTACA GCGAGGAAGA AATCAAAAAC151 GAACGCGCAC GGCTTGCGGC AGTGGGCGAG CGGGTTAATC AGATATTTAC201 GTTGCTGGGA GGGGAAACCG CCTTGCAAAA GGGGCAGGCG GGAACGGCTC251 TGGCAACCTA TATGCTGATG TTGGAACGCA CAAAATCCCC CGAAGTCGCC301 GAACGCGCCT TGGAAATGGC CGTGTCGCTG AACGCGTTTG AACAGGCGGA351 AATGATTTAT CAGAAATGGC GGCAGATTGA GCCTATACCG GGTAAGGCGC401 AAAAACGGGC GGGGTGGCTG CGGAACGTGC TGAGGGAAAG AGGAAATCAG451 CATCTGGACG GACGGGAAGA AGTGCTGGCT CAGGCGGACG AAGGACAG它对应于氨基酸序列<SEQ ID 40;ORF9>:1 ..RFKMLTVLTA TLIAGQVSAA GGGAGDMKQP KEVGKVFRKQ QRYSEEEIKN51 ERARLAAVGE RVNQIFTLLG GETALQKGQA GTALATYMLM LERTKSPEVA101 ERALEMAVSL NAFEQAEMIY QKWRQIEPIP GKAQKRAGWL RNVLRERGNQ151 HLDGREEVLA QADEGQ进一步的序列分析揭示了完整的DNA序列<SEQ ID 41>:1 ATGTTACCTA ACCGTTTCAA AATGTTAACT GTGTTGACGG CAACCTTGAT51 TGCCGGACAG GTATCTGCCG CCGGAGGCGG TGCGGGGGAT ATGAAACAGC101 CGAAGGAAGT CGGAAAGGTT TTCAGAAAGC AGCAGCGTTA CAGCGAGGAA151 GAAATCAAAA ACGAACGCGC ACGGCTTGCG GCAGTGGGCG AGCGGGTTAA201 TCAGATATTT ACGTTGCTGG GAGGGGAAAC CGCCTTGCAA AAGGGGCAGG251 CGGGAACGGC TCTGGCAACC TATATGCTGA TGTTGGAACG CACAAAATCC301 CCCGAAGTCG CCGAACGCGC CTTGGAAATG GCCGTGTCGC TGAACGCGTT351 TGAACAGGCG GAAATGATTT ATCAGAAATG GCGGCAGATT GAGCCTATAC401 CGGGTAAGGC GCAAAAACGG GCGGGGTGGC TGCGGAACGT GCTGAGGGAA451 AGAGGAAATC AGCATCTGGA CGGACTGGAA GAAGTGCTGG CTCAGGCGGA501 CGAAGGACAG AACCGCAGGG TGTTTTTATT GTTGGCACAA GCCGCCGTGC551 AACAGGACGG GTTGGCGCAA AAAGCATCGA AAGCGGTTCG CCGCGCGGCG601 TTGAAATATG AACATCTGCC CGAAGCGGCG GTTGCCGATG TGGTGTTCAG651 CGTACAGGGA CGCGAAAAGG AAAAGGCAAT CGGAGCTTTG CAGCGTTTGG701 CGAAGCTCGA TACGGAAATA TTGCCCCCCA CTTTAATGAC GTTGCGTCTG751 ACTGCACGCA AATATCCCGA AATACTCGAC GGCTTTTTCG AGCAGACAGA801 CACCCAAAAC CTTTCGGCCG TCTGGCAGGA AATGGAAATT ATGAATCTGG851 TTTCCCTGCA CAGGCTGGAT GATGCCTATG CGCGTTTGAA CGTGCTGTTG901 GAACGCAATC CGAATGCAGA CCTGTATATT CAGGCAGCGA TATTGGCGGC951 AAACCGAAAA GAAGGTGCTT CCGTTATCGA 
CGGCTACGCC GAAAAGGCAT1001 ACGGCAGGGG GACGGAGGAA CAGCGGAGCA GGGCGGCGCT AACGGCGGCG1051 ATGATGTATG CCGACCGCAG GGATTACGCC AAAGTCAGGC AGTGGCTGAA1101 AAAAGTATCC GCGCCGGAAT ACCTGTTCGA CAAAGGTGTG CTGGCGGCTG1151 CGGCGGCTGT CGAGTTGGAC GGCGGCAGGG CGGCTTTGCG GCAGATCGGC1201 AGGGTGCGGA AACTTCCCGA ACAGCAGGGG CGGTATTTTA CGGCAGACAA1251 TTTGTCCAAA ATACAGATGC TCGCCCTGTC GAAGCTGCCC GATAAACGGG1301 AGGCTTTGAG GGGGTTGGAC AAGATTATCG AAAAACCGCC TGCCGGCAGT1351 AATACAGAGT TACAGGCAGA GGCATTGGTA CAGCGGTCAG TTGTTTACGA1401 TCGGCTTGGC AAGCGGAAAA AAATGATTTC AGATCTTGAA AGGGCGTTCA1451 GGCTTGCACC CGATAACGCT CAGATTATGA ATAATCTGGG CTACAGCCTG1501 CTGACCGATT CCAAACGTTT GGACGAAGGT TTCGCCCTGC TTCAGACGGC1551 ATACCAAATC AACCCGGACG ATACCGCTGT CAACGACAGC ATAGGCTGGG1601 CGTATTACCT GAAAGGCGAC GCGGAAAGCG CGCTGCCGTA TCTGCGGTAT1651 TCGTTTGAAA ACGACCCCGA GCCCGAAGTT GCCGCCCATT TGGGCGAAGT1701 GTTGTGGGCA TTGGGCGAAC GCGATCAGGC GGTTGACGTA TGGACGCAGG1751 CGGCACACCT TACGGGAGAC AAGAAAATAT GGCGGGAAAC GCTCAAACGT1801 CACGGCATCG CATTGCCCCA ACCTTCCCGA AAACCTCGGA AATAA它对应于氨基酸序列<SEQ ID 42;ORF9-1>:1 MLPNRFKMLT VLTATLIAGQ VSAAGGGAGD MKQPKEVGKV FRKQQRYSEE51 EIKNERARLA AVGERVNQIF TLLGGETALQ KGQAGTALAT YMLMLERTKS101 PEVAERALEM AVSLNAFEQA EMIYQKWRQI EPIPGKAQKR AGWLRNVLRE151 RGNQHLDGLE EVLAQADEGQ NRRVFLLLAQ AAVQQDGLAQ KASKAVRRAA201 LKYEHLPEAA VADVVFSYQG REKEKAIGAL QRLAKLDTEI LPPTLMTLRL251 TARKYPEILD GFFEQTDTQN LSAVWQEMEI MNLVSLHRLD DAYARLNVLL301 ERNPNADLYI QAAILAANRK EGASVIDGYA EKAYGRGTEE QRSRAALTAA351 MMYADRRDYA KVRQWLKKVS APEYLFDKGV LAAAAAVELD GGRAALRQIG401 RVRKLPEQQG RYFTADNLSK IQMLALSKLP DKREALRGLD KIIEKPPAGS451 NTELQAEALV QRSVVYDRLG KRKKMISDLE RAFRLAPDNA QIMNNLGYSL501 LTDSKRLDEG FALLQTAYQI NPDDTAVNDS IGWAYYLKGD AESALPYLRY551 SFENDPEPEV AAHLGEVLWA LGERDQAVDV WTQAAHLTGD KKIWRETLKR
601 HGIALPQPSR KPRK*
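The correspondence between a DNA sequence and its deduced amino acid sequence can be checked by conceptual translation. The sketch below is illustrative only and is not the software used for the analysis in this document: the names `CODON_TABLE` and `translate` are hypothetical, and it assumes the standard genetic code read in the first reading frame. Codons containing an ambiguous base (for example N) are reported as X, and translation stops at the first stop codon, written `*`.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1, ignoring spaces and position numbers."""
    seq = "".join(ch for ch in dna if ch.isalpha()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # unknown codon -> X
        protein.append(aa)
        if aa == "*":  # stop codon ends the open reading frame
            break
    return "".join(protein)


# First 30 nucleotides of the complete ORF9 DNA sequence <SEQ ID 41>.
print(translate("ATGTTACCTA ACCGTTTCAA AATGTTAACT"))
```

The ten codons translate to MLPNRFKMLT, which agrees with the first ten residues of ORF9-1 <SEQ ID 42>.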
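Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF9 shows 89.8% identity over a 166 aa overlap with an ORF (ORF9a) from strain A of N. meningitidis: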
10 20 30 40 50
orf9.pep RFKMLTVLTATLIAGQVSAAGGGAGDMKQPKEVGKVFRKQQRYSEEEIKNERARLA
|| :|:||:|:|:|||: || ||:| | |||||||||||||||||||||||||||
orf9a MLPARFTILSVLAAALLAGQAYAA--GAADAKPPKEVGKVFRKQQRYSEEEIKNERARLA
10 20 30 40 50
60 70 80 90 100 110
orf9.pep AVGERVNQIFTLLGGETALQKGQAGTALATYMLMLERTKSPEVAERALEMAVSLNAFEQA
|||||||||||||| |||||||||||||||||||||||||||||||||||||||||||||
orf9a AVGERVNQIFTLLGXETALQKGQAGTALATYMLMLERTKSPEVAERALEMAVSLNAFEQA
60 70 80 90 100 110
120 130 140 150 160
orf9.pep EMIYQKWRQIEPIPGKAQKRAGWLRNVLRERGNQHLDGREEVLAQADEGQ
|||||||||||||||||||||||||||||||||||||| || |||||| |
orf9a EMIYQKWRQIEPIPGKAQKRAGWLRNVLRERGNQHLDGLEEXLAQADEXQNRRVFLLLAQ
120 130 140 150 160 170
orf9a AAVQQDGLAQKASKAVRRAALRYEHLPEAAVADVVFSVQXREKEKAIGALQRLAKLDTEI
180 190 200 210 220 230全长ORF9a核苷酸序列<SEQ ID 43>是:1 ATGTTACCCG CCCGTTTCAC CATTTTATCT GTGCTCGCGG CAGCCCTGCT51 TGCCGGGCAG GCGTATGCCG CCGGCGCGGC GGATGCGAAG CCGCCGAAGG101 AAGTCGGAAA GGTTTTCAGA AAGCAGCAGC GTTACAGCGA GGAAGAAATC151 AAAAACGAAC GCGCACGGCT TGCGGCAGTG GGCGAGCGGG TTAATCAGAT201 ATTTACGTTG CTGGGANGGG AAACCGCCTT GCAAAAGGGG CAGGCGGGAA251 CGGCTCTGGC AACCTATATG CTGATGTTGG AACGCACAAA ATCCCCCGAA301 GTCGCCGAAC GCGCCTTGGA AATGGCCGTG TCNCTGAACG CGTTTGAACA351 GGCGGAAATG ATTTATCAGA AATGGCGGCA GATTGAGCCT ATACCGGGTA401 AGGCGCAAAA ACGGGCGGGG TGGCTGCGGA ACGTGCTGAG GGAAAGAGGA451 AATCAGCATC TAGACGGACT GGAAGAANTG CTGGCTCAGG CGGACGAANG501 ACAGAACCGC AGGGTGTTTT TATTGTTGGC ACAAGCCGCC GTGCAACAGG551 ACGGGTTGGC GCAAAAAGCA TCGAAAGCGG TTCGCCGCGC GGCGTTGAGA601 TATGAACATC TGCCCGAAGC GGCGGTTGCC GATGTGGTGT TCAGCGTACA651 GGNACGCGAA AAGGAAAAGG CAATCGGAGC TTTGCAGCGT TTGGCGAAGC701 TCGATACGGA AATATTGCCC CCCACTTTAA TGACGTTGCG TCTGACTGCA751 CGCAAATATC CCGAAATACT CGACGGCTTT TTCGAGCAGA CAGACACCCA801 AAACCTTTCG GCCGTCTGGC AGGAAATGGA AATTATGAAT CTGGTTTCCC851 TGCACAGGCT GGATGATGCC TATGCGCGTT TGAACGTGCT GTTGGAACGC901 AATCCGAATG CAGACCTGTA TATTCAGGCA GCGATATTGG CGGCAAACCG951 AAAAGAANGT GCTTCCGTTA TCGACGGCTA CGCCGAAAAG GCATACGGCA1001 GGGGGACGGG GGAACAGCGG GGCAGGGCGG CAATGACGGC GGCGATGATA1051 TATGCCGACC GAAGGGATTA CACCAAAGTC AGGCAGTGGT TGAAAAAAGT1101 GTCCGCGCCG GAATACCTGT TCGACAAAGG TGTGCTGGCG GCTGCGGCGG1151 CTGTCGAGTT GGACNGCGGC AGGGCGGCTT TGCGGCAGAT CGGCAGGGTG1201 CGGAAACTTC CCGAACAGCA GGGGCGGTAT TTTACGGCAG ACAATTTGTC1251 CAAAATACAG ATGTTCGCCC TGTCGAAGCT GCCCGACAAA CGGGAGGCTT1301 TGAGGGGGTT GGACAAGATT ATCGAAAAAC CGCCTGCCGG CAGTAATACA1351 GAGTTACAGG CAGAGGCATT GGTACAGCGG TCAGTTGTTT ACGATCGGCT1401 TGGCAAGCGG AAAAAAATGA TTTCAGATCT TGAAAGGGCG TTCAGGCTTG1451 CACCCGATAA CGCTCAGATT ATGAATAATC TGGGCTACAG CCTGCTTTCC1501 GATTCCAAAC GTTTGGACGA AGGCTTCGCC CTGCTTCAGA CGGCATACCA1551 AATCAACCCG GACGATACCG CTGTCAACGA CAGCATAGGC TGGGCGTATT1601 ACCTGAAANG CGACGCGGAA AGCGCGCTGC CGTATCTGCG GTATTCGTTT1651 GAAAACGACC CCGAGCCCGA 
AGTTGCCGCC CATTTGGGCG AAGTGTTGTG1701 GGCATTGGGC GAACGCGATC AGGCGGTTGA CGTATGGACG CAGGCGGCAC1751 ACCTTACGGG AGACAAGAAA ATATGGCGGG AAACGCTCAA ACGTCACGGC1801 ATCGCATTGC CCCAACCTTC CCGAAAACCT CGGAAATAA它编码的蛋白质具有氨基酸序列<SEQ ID 44>:1 MLPARFTILS VLAAALLAGQ AYAAGAADAK PPKEVGKVFR KQQRYSEEEI51 KNERARLAAV GERVNQIFTL LGXETALQKG QAGTALATYM LMLERTKSPE101 VAERALEMAV SLNAFEQAEM IYQKWRQIEP IPGKAQKRAG WLRNVLRERG151 NQHLDGLEEX LAQADEXQNR RVFLLLAQAA VQQDGLAQKA SKAVRRAALR201 YEHLPEAAVA DVVFSVQXRE KEKAIGALQR LAKLDTEILP PTLMTLRLTA251 RKYPEILDGF FFQTDTQNLS AVWQEMEIMN LVSLHRLDDA YARLNVLLER301 NPNADLYIQA AILAANRKEX ASVIDGYAEK AYGRGTGEQR GRAAMTAAMI351 YADRRDYTKV RQWLKKVSAP EYLFDKGVLA AAAAVELDXG RAALRQIGRV401 RKLPEQQGRY FTADNLSKIQ MFALSKLPDK REALRGLDKI IEKPPAGSNT451 ELQAEALVQR SVVYDRLGKR KKMISDLERA FRLAPDNAQI MNNLGYSLLS501 DSKRLDEGFA LLQTAYQINP DDTAVNDSIG WAYYLKXDAE SALPYLRYSF551 ENDPEPEVAA HLGEYLWALG ERIQAVDVWT QAAHLTGDKK IWRETLKRHG601 IALPQPSRKP RK*ORF9a和ORF9-1在614个氨基酸的重叠区域内显示出有95.3%的相同性:
10 20 30 40 50orf9a.pep MLPARFTILSVLAAALLAGQAYAAG--AADAKPPKEVGKVFRKQQRYSEEEIKNERARLA
||| || :|:||:|:|:|||: ||| |:| | |||||||||||||||||||||||||||orf9-1 MLPNRFKMLTVLTATLIAGQVSAAGGGAGDMKQPKEVGKVFRKQQRYSEEEIKNERARLA
10 20 30 40 50 60
60 70 80 90 100 110orf9a.pep AVGERVNQIFTLLGXETALQKGQAGTALATYMLMLERTKSPEVAERALEMAVSLNAFEQA
|||||||||||||| |||||||||||||||||||||||||||||||||||||||||||||orf9-1 AVGERVNQIFTLLGGETALQKGQAGTALATYMLMLERTKSPEVAERALEMAVSLNAFEQA
70 80 90 100 110 120
120 130 140 150 160 170orf9a.pep EMIYQKWRQIEPIPGKAQKRAGWLRNVLRERGNQHLDGLEEXLAQADEXQNRRVFLLLAQ
||||||||||||||||||||||||||||||||||||||||| |||||| |||||||||||orf9-1 EMIYQKWRQIEPIPGKAQKRAGWLRNVLRERGNQHLDGLEEVLAQADEGQNRRVFLLLAQ
130 140 150 160 170 180
180 190 200 210 220 230orf9a.pep AAVQQDGLAQKASKAVRRAALRYEHLPEAAVADVVFSVQXREKEKAIGALQRLAKLDTEI
|||||||||||||||||||||:||||||||||||||||| ||||||||||||||||||||orf9-1 AAVQQDGLAQKASKAVRRAALKYEHLPEAAVADVVFSVQGREKEKAIGALQRLAKLDTEI
190 200 210 220 230 240
240 250 260 270 280 290orf9a.pep LPPTLMTLRLTARKYPEILDGFFEQTDTQNLSAVWQEMEIMNLVSLHRLDDAYARLNVLL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf9-1 LPPTLMTLRLTARKYPEILDGFFEQTDTQNLSAVWQEMEIMNLVSLHRLDDAYARLNVLL
250 260 270 280 290 300
300 310 320 330 340 350orf9a.pep ERNPNADLYIQAAILAANRKEXASVIDGYAEKAYGRGTGEQRGRAAMTAAMIYADRRDYT
||||||||||||||||||||| |||||||||||||||| |||:|||:||||:|||||||:orf9-1 ERNPNADLYIQAAILAANRKEGASVIDGYAEKAYGRGTEEQRSRAALTAAMMYADRRDYA
310 320 330 340 350 360
360 370 380 390 400 410orf9a.pep KVRQWLKKVSAPEYLFDKGVLAAAAAVELDXGRAALRQIGRVRKLPEQQGRYFTADNLSK
|||||||||||||||||||||||||||||| |||||||||||||||||||||||||||||
orf9-1 KVRQWLKKVSAPEYLFDKGVLAAAAAVELDGGRAALRQIGRVRKLPEQQGRYFTADNLSK
370 380 390 400 410 420
420 430 440 450 460 470
orf9a.pep IQMFALSKLPDKREALRGLDKIIEKPPAGSNTELQAEALVQRSVVYDRLGKRKKMISDLE
|||:||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf9-1 IQMLALSKLPDKREALRGLDKIIEKPPAGSNTELQAEALVQRSVVYDRLGKRKKMISDLE
430 440 450 460 470 480
480 490 500 510 520 530
orf9a.pep RAFRLAPDNAQIMNNLGYSLLSDSKRLDEGFALLQTAYQINPDDTAVNDSIGWAYYLKXD
|||||||||||||||||||||:|||||||||||||||||||||||||||||||||||| |
orf9-1 RAFRLAPDNAQIMNNLGYSLLTDSKRLDEGFALLQTAYQINPDDTAVNDSIGWAYYLKGD
490 500 510 520 530 540
540 550 560 570 580 590
orf9a.pep AESALPYLRYSFENDPEPEVAAHLGEVLWALGERDQAVDVWTQAAHLTGDKKIWRETLKR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf9-1 AESALPYLRYSFENDPEPEVAAHLGEVLWALGERDQAVDVWTQAAHLTGDKKIWRETLKR
550 560 570 580 590 600
600 610
orf9a.pep HGIALPQPSRKPRKX
|||||||||||||||
orf9-1 HGIALPQPSRKPRKX
610
Homology with a predicted ORF from N. gonorrhoeae
ORF9 shows 82.8% identity over a 163 aa overlap with a predicted ORF (ORF9.ng) from N. gonorrhoeae:
orf9 RFKMLTVLTATLIAGQVSAAGGGAGDMKQPKEVGKVFRKQQRYSEEEIKNERAR 54
|| :|:||:|:|:|||: || ||:|:: |||||||:||::|||||||||||||
orf9ng MIMLPARFTILSVLAAALLAGQAYAA--GAADVELPKEVGKVLRKHRRYSEEEIKNERAR 58
orf9 LAAVGERVNQIFTLLGGETALQKGQAGTALATYMLMLERTKSPEVAERALEMAVSLNAFE 114
|||||||||::|||||||||||||||||||||||||||||||||||||||||||||||||
orf9ng LAAVGERVNRVFTLLGGETALQKGQAGTALATYMLMLERTKSPEVAERALEMAVSLNAFE 118
orf9 QAEMIYQKWRQIEPIPGKAQKRAGWLRNVLRERGNQHLDGREEVLAQADEGQ 166
|||||||||||||||||:||| ||||||||:| || ||| ||| ||:|
orf9ng QAEMIYQKWRQIEPIPGEAQKPAGWLRNVLKEGGNPHLDRLEEVPAQSDYVHQPMIFLLL 178
预计ORF9ng核苷酸序列<SEQ ID 45>编码的蛋白质具有氨基酸序列<SEQ ID46>:1 MIMLPARFTI LSVLAAALLA GQAYAAGAAD VELPKEVGKV LRKHRRYSEE51 EIKNERARLA AVGERVNRVF TLLGGETALQ KGQAGTALAT YMLMLERTKS101 PEVAERALEM AVSLNAFEQA EMIYQKWRQI EPIPGEAQKP AGWLRNVLKE151 GGVPHLDRLE EVPAQSDYVH QPMIFLLLVQ AAVQHGGVAQ KPSKAVRPAA201 YNYEVLPETA GADAVFCVQG PQYEKAIQSF PPCGRNPQTE NIAPPFNELF251 RPTARPISPK LLQRFFRTEP NLAKPFRPPG PEMETYQTGF PRPLTRNNPT氨基酸1-28是推定的前导序列,173-189预计是跨膜结构域。进一步的序列分析揭示了全长ORF9ng DNA序列<SEQ ID 47>:1 ATGTTACCCG CCCGTTTCAC TATTTTATCT GTCCTCGCAG CAGCCCTGCT51 TGCCGGACAG GCGTATGCTG CCGGCGCGGC GGATGTGGAG CTGCCGAAGG101 AAGTCGGAAA GGTTTTAAGG AAACATCGGC GTTACAGCGA GGAAGAAATC151 AAAAACGAAC GCGCACGGCT TGCGGCAGTG GGCGAACGGG TCAACAGGGT 201 GTTTACGCTG TTGGGCGGTG AAACGGCTTT GCAGAAAGGG CAGGCGGGAA251 CGGCTCTGGC AACCTATATG CTGATGTTGG AACGCACAAA ATCCCCCGAA301 GTCGCCGAAC GCGCCTTGGA AATGGCCGTG TCGCTGAACG CGTTTGAACA351 GGCGGAAATG ATTTATCAGA AATGgcggca gatcgagcct ataCcgggtg401 aggcgcaaaa accgGcgggG tggctgcgga acgtattgaa ggaagggGGa451 aaTCAGCATC TGGAcgggtt gaaagaggTG CtggcgcaAT cggacgatGT501 GCAAAAAcgc aggaTATTTT TGCTGCTGGT GCAAGCCGCC GTGCagcagg551 gTGGGGTGGC TCAAAAAGCA TCGAAAGCGG TTCGCcgtgc GGcgttgaAG601 TATGAACATC TGCCcgaagc ggcggTTGCC GATGcggTGT TCGGCGTACA651 GGGACGCGAA AAGGAAAagg caaTCGAAGC TTTGCAGCGT TTGGCGAAGC701 TCGATACGGA AATATTGCCC CCCACTTTAA TGACGTTGCG TCTGACTGCA751 CGCAAATATC CCGAAATACT CGACGGCTTT TTCGAGCAGA CAGACACCCA801 AAACCTTTCG GCCGTCTGGC AGGAAATGGA AATTATGAAT CTGGTTTCCC851 TGCGTAAGCC GGATGATGCC TATGCGCGTT TGAACGTGCT GTTGGAACAC901 AACCCGAATG CAAACCTGTA TATTCAGGCG GCGATATTGG CGGCAAACCG951 AAAAGAAGGT GCGTCCGTTA TCGACGGCTA CGCCGAAAAG GCATACGGCA1001 GGGGGACGGG GGAACAGCGG GGCagggcgg cAATgacggc GGCGATGATA1051 TATGCCGACC GCAGGGATTA CGCCAAAGTC AGGCAGTGGT TGAAAAAAGT1101 GTCCGCGCCG GAATACCTGT TCGACAAAGG CGTGCTGGCG GCTGCGGCGG1151 CTGCCGAATT GGACGGAGGC CGGGCGGCTT TGCGGCAGAT CGGCAGGGTG1201 CGGAAACTTC CCGAACAGCA GGGGCGGTAT TTTACGGCAG ACAATTTGTC1251 CAAAATACAG ATGCTCGCCC TGTCGAAGCT GCCCGACAAA CGGGAAGCCC1301 TGATCGGGCT 
GAACAACATC ATCGCCAAAC TTTCGGCGGC GGGAAGCACG1351 GAACCTTTGG CGGAAGCATT GGCACAGCGT TCCATTATTT ACGaacAGTT1401 cggCAAACGG GGAAAAATGA TTGCCGACCT tgaAACcgcg CTCAAACTTA1451 CGCCCGATAA TGCACAAATT ATGAATAATC TGGGCTACAG CCTGCTTTCC1501 GATTCCAAAC GTTTGGACGA GGGTTTCGCC CTGCTTCAGA CGGCATACCA1551 AATCAACCCG GACGATACCG CCGTTAACGA CAGCATAGGC TGGGCGTATT1601 ACCTGAAAGG CGACgcggaA AGCGCGCTGC CGTATCTGcg gtattcgttt1651 gAAAACGACC CCGAGCCCGA AGTTGCCGCC CATTTGGGCG AAGTGTTGTG1701 GGCATTGGGC GAACGCGATC AGGCGGTTGA CGTATGGACG CAGGCGGCAC1751 ACCTTAGGGG AGACAAGAAA ATATGGCGGG AGACGCTCAA ACGCTACGGA1801 ATCGCCTTGC CCGAGCCTTC CCGAAAACCC CGGAAATAA它编码的蛋白质具有氨基酸序列<SEQ ID 48>:1 MLPARFTILS VLAAALLAGQ AYAAGAADVE LPKEVGKVLR KHRRYSEEEI51 KNERARLAAV GERVNRVFTL LGGETALQKG QAGTALATYM LMLERTKSPE101 VAERALEMAV SLNAFEQAEM IYQKWRQIEP IPGEAQKPAG WLRNVLKEGG151 NQHLDGLKEV LAQSDDVQKR RIFLLLVQAA VQQGGVAQKA SKAVRRAALK201 YEHLPEAAVA DAVFGVQGRE KEKAIEALQR LAKLDTEILP PTLMTLRLTA251 RKYPEILDGF FEQTDTQNLS AVWQEMEIMN LVSLRKPDDA YARLNVLLEH301 NPNANLYIQA AILAANRKEG ASVIDGYAEK AYGRGTGEQR GRAAMTAAMI351 YADRRDYAKV RQWLKKVSAP EYLFDKGVLA AAAAAELDGG RAALRQIGRV401 RKLPEQQGRY FTADNLSKIQ MLALSKLPDK REALIGLNNI IAKLSAAGST451 EPLAEALAQR SIIYEQFGKR GKMIADLETA LKLTPDNAQI MNNLGYSLLS501 DSKRLDEGFA LLQTAYQINP DDTAVNDSIG WAYYLKGDAE SALPYLRYSF551 ENDPEPEVAA HLGEVLWALG ERDQAVDVWT QAAHLRGDKK IWRETLKRYG601 IALPEPSRKP RK*ORF9ng和ORF9-1在614个氨基酸的重叠区域内显示出有88.1%的相同性:
10 20 30 40 50 60orf9-1.pep MLPNRFKMLTVLTATLIAGQVSAAGGGAGDMKQPKEYGKVFRKQQRYSEEEIKNERARLA
||| || :|:||:|:|:|||: ||| |:|:: |||||||:||::|||||||||||||||orf9ng-1 MLPARFTILSVLAAALLAGQAYAAG--AADVELPKEYGKVLRKHRRYSEEEIKNERARLA
10 20 30 40 50
70 80 90 100 110 120orf9-1.pep AVGERVNQIFTLLGGETALQKGQAGTALATYMLMLERTKSPEVAERALEMAVSLNAFEQA
|||||||::|||||||||||||||||||||||||||||||||||||||||||||||||||orf9ng-1 AVGERVNRVFTLLGGETALQKGQAGTALATYMLMLERTKSPEVAERALEMAVSLNAFEQA
60 70 80 90 100 110
130 140 150 160 170 180orf9-1.pep EMIYQKWRQIEPIPGKAQKRAGWLRNVLRERGNQHLDGLEEVLAQADEGQNRRVFLLLAQ
|||||||||||||||:||| ||||||||:| ||||||||:|||||:|: |:||:||||:|orf9ng-1 EMIYQKWRQIEPIPGEAQKPAGWLRNVLKEGGNQHLDGLKEVLAQSDDVQKRRIFLLLVQ
120 130 140 150 160 170
190 200 210 220 230 240orf9-1.pep AAVQQDGLAQKASKAVRRAALKYEHLPEAAVADVVFSVQGREKEKAIGALQRLAKLDTEI
||||| |:|||||||||||||||||||||||||:||:|||||||||| ||||||||||||orf9ng-1 AAVQQGGVAQKASKAVRRAALKYEHLPEAAVADAVFGVQGREKEKAIEALQRLAKLDTEI
180 190 200 210 220 230
250 260 270 280 290 300orf9-1.pep LPPTLMTLRLTARKYPEILDGFFEQTDTQNLSAVWQEMEIMNLVSLHRLDDAYARLNVLL
||||||||||||||||||||||||||||||||||||||||||||||:: |||||||||||orf9ng-1 LPPTLMTLRLTARKYPEILDGFFEQTDTQNLSAVWQFMEIMNLVSLRKPDDAYARLNVLL
240 250 260 270 280 290
310 320 330 340 350 360orf9-1.pep ERNPNADLYIQAAILAANRKEGASVIDGYAEKAYGRGTEEQRSRAALTAAMMYADRRDYA
|:||||:||||||||||||||||||||||||||||||| |||:|||:||||:||||||||orf9ng-1 EHNPNANLYIQAAILAANRKEGASVIDGYAEKAYGRGTGEQRGRAAMTAAMIYADRRDYA
300 310 320 330 340 350
370 380 390 400 410 420orf9-1.pep KVRQWLKKVSAPEYLFDKGVLAAAAAVELDGGRAALRQIGRVRKLPEQQGRYFTADNLSK
||||||||||||||||||||||||||:|||||||||||||||||||||||||||||||||orf9ng-1 KVRQWLKKVSAPEYLFDKGVLAAAAAAELDGGRAALRQIGRVRKLPEQQGRYFTADNLSK
360 370 380 390 400 410
430 440 450 460 470 480orf9-1.pep IQMLALSKLPDKREALRGLDKIIEKPPAGSNTELQAEALVQRSVVYDRLGKRKKMISDLE
|||||||||||||||| ||::|| | |:::|| ||||:|||::|:::||| |||:|||orf9ng-1 IQMLALSKLPDKREALIGLNNIIAKLSAAGSTEPLAEALAQRSIIYEQFGKRGKWIADLE
420 430 440 450 460 470
490 500 510 520 530 540orf9-1.pep RAFRLAPDNAQIMNNLGYSLLTDSKRLDEGFALLQTAYQINPDDTAYNDSIGWAYYLKGD
|::|:|||||||||||||||:||||||||||||||||||||||||||||||||||||||orf9ng-1 TALKLTPDNAQIMNNLGYSLLSDSKRLDEGFALLQTAYQINPDDTAVNDSIGWAYYLKGD
480 490 500 510 520 530
550 560 570 580 590 600orf9-1.pep AESALPYLRYSFENDPEPEVAAHLGEVLWALGERDQAVDVWTQAAHLTGDKKIWRETLKR
||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||orf9ng-1 AESALPYLRYSFENDPEPEVAAHLGEVLWALGERDQAVDVWTQAAHLRGDKKIWRETLKR
540 550 560 570 580 590
610orf9-1.pep HGIALPQPSRKPRKX
:|||||:||||||||orf9ng-1 YGIALPEPSRKPRKX
600 610另外,ORF9ng显示出与绿脓杆菌的一种假设蛋白明显同源:sp|P42810|YHE3_PSEAE HEMM-HEMA基因间区域中的假设的64.8 KD蛋白(ORF3)>gi|1072999|pir||S49376假设蛋白3-绿脓杆菌>gi|557259(X82071)orf3[绿脓杆菌]长度=576评分=128位(318),估计值=1e-28相同性=138/587(23%),阳性=228/587(38%),空隙=125/587(21%)询问:67 VFTLLGGETALQKGQAGTALATYMLMLERTKSPEVAERALEMAVSLNAFEQAEMIYQKWR 126
+++LL E A Q+ + AL+ Y++ ++T+ P V+ERA +A L A ++A W目标:53 LYSLLVAELAGQRNRFDIALSNYVVQAQKTRDPGVSERAFRIAEYLGADQEALDTSLLWA 112询问:127 QIEPIPGEAQKPAG--------------WLRNVLKEGGNQHLDGLKEVLAQSDDVQKRRI 172
+ P +AQ+ A ++ VL G+ H D L A++D + +目标:113 RSAPDNLDAQRAAAIQLARAGRYEESMVYMEKVLNGQGDTHFDFLALSAAETDPDTRAGL 172询问:173 FXXXXXXXXXXXXXXXKASKAVRRAALKYEHLPEAAVADAVFGVQGREKEKAIEALQRLA 232
++ KY + + A+ Q ++A+ L+ +目标:173 L------------------QSFDHLLKKYPNNGQLLFGKALLLQQDGRPDEALTLLEDNS 214询问:233 KLDTEILPPTLMTLRLTARK-----YPEILDGFFEQTDTQNLSAVWQEMEIMNLVSLRKP 287
E+ P L + L + K P + G E D + + + + LV +目标:215 ASRHEVAPLLLRSRLLQSMKRSDEALPLLKAGIKEHPDDKRVRLAYARL----LVEQNRL 270询问:288 DDAYARLNVLLEHNPN---------------------ANLYIQAAI-------------- 312
DDA A L++ P+ A +Y++ +目标:271 DDAKAEFAGLVQQFPDDDDDLRFSLALVCLEAQAWDEARIYLEELVERDSHVDAAHFNLG 330询问:313 -LAANRKEGASVIDGYAEKAYGRGTGEQRGRAAMTAAMIYADRRDYAKVRQWLKKVSAPE 371
LA +K+ A +D YA+ G G + T ++ A R D A R + P+目标:331 RLAEEQKDTARALDEYAQ--VGPGNDFLPAQLRQTDVLLKAGRVDEAAQRLDKARSEQPD 388询问:372 YLFDKXXXXXXXXXXXXXXXXXXRQIGRVRKLPEQQGRYFTADNLSKIQMLALSKLPDKR 431
Y A L I+ ALS +目标:389 Y----------------------------------------AIQLYLIEAEALSNNDQQE 408询问:432 EALIGLNNIIAKLSAAGSTEPLAEALAQRSIIYEQFGKRGKMIADLETALKLTPDNAQIM 491
+A + + + E L L RS++ E+ +M DL + PDNA +目标:409 KAWQAIQEGLKQYP-----EDL-NLLYTRSMLAEKRNDLAQMEKDLRFVIAREPDNAMAL 462询问:492 NNLGYSLLSDSKRLDEGFALLQTAYQINPDDTAVNDSIGWAYYLKGDAESALPYLRYSFE 551
N LGY+L + R E L+ A+++NPDD A+ DS+GW Y +G A YLR + +目标:463 NALGYTLADRTTRYGEARELILKAHKLNPDDPAILDSMGWINYRQGKLADAERYLRQALQ 522询问:552 NDPEPEVAAHLGEVLWALGERDQAVDVWTQAAHLRGDKKIWRETLKR 598
P+ EVAAHLGEVLWA G + A +W + + D + R T+KR目标:523 RYPDHEVAAHLGEVLWAQGRQGDARAIWREYLDKQPDSDVLRRTIKR 569gi|2983399(AE000710)假设蛋白[Aquifex aeolicus]长度=545评分=81.5位(198),估计值=1e-14相同性=61/198(30%),阳性=98/198(48%),空隙=19/198(9%)询问:408 GRYFTADNL-SKIQMLALSKLPDKREALIGLNNIIAKLSAAGSTEPLAEALAQ------- 459
G Y A L K ++LA PDK+E L + +K + + L +目标:335 GNYEDAKRLIEKAKVLA----PDKKEILFLEADYYSKTKQYDKALEILKKLEKDYPNDSR 390询问:460 ----RSIIYEQFGKRGKMIADLETALKLTPDNAQIMNNLGYSLLS--DSKRLDEGFALLQ 513
+I+Y+ G L A++L P+N N LGYSLL +R++E L++目标:391 VYFMEAIVYDNLGDIKNAEKALRKAIELDPENPDYYNYLGYSLLLWYGKERVEEAEELIK 450询问:514 TAYQINPDDTAVNDSIGWAYYLKGDAESALPYLRYSF-ENDPEPEVAAHLGEVLWALGER 572
A + +P++ A DS+GW YYLKGD E A+ YL + E +P V H+G+VL +G +目标:451 KALEKDPENPAYIDSMGWVYYLKGDYERAMQYLLKALREAYDDPVVNEHVGDVLLKMGYK 510询问:573 DQAVDVWTQAAHLRGDKK 590
++A + + +A L + K目标:511 EEARNYYERALKLLEEGK 528
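Based on this analysis, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 7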
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 49>:1 AACCTCTACG CCGGCCCGCA GACCACATCC GTCATCGCAA ACATCGCCGA51 CAACCTGCAA CTGGCCAAAG ACTACGGCAA AGTACACTGG TTCGCCTCCC101 CGCTCTTCTG GCTCCTGAAC CAACTGCACA ACATCATCGG CAACTGGGGC151 TGGGCGATTA TCGTTTTAAC CATCATCGTC AAAGCCGTAC TGTATCCATT201 GACCAACGCC TCTTACCGCT CTATGGCGAA AATGCGTGCC GCCGCACCCA251 AACTGCAAGC CATCAAAGAG AAATACGGCG ACGACCGTAT GGCGCAACAA301 CAGGCGATGA TGCAGCTTTA CACAGACGAG AAAATCAACC CGaCTGGGCG351 GCTGCCTGCC TATGCTGTTG CAAATCCCCG TCTTCATCGG ATTGTATTGG401 GCATTGTTCG CCTCCGTAGA ATTGCGCCAG GCACCTTGGC TGGGTTGGAT451 TACCGACCTC AGCCGCGCCG ACCCCTACTA CATCCTGCCC ATCATTATGG501 CGGCAACGAT GTTCGCCCAA ACTTATCTGA ACCCGCCGCC GAcCGACCCG551 ATGCagGCGA AAATGATGAA AATCATGCCG TTGGTTTTCT CsGwCrTGTT601 CTTCTTCTTC CCTGCCGGks TGGTATTGTA CTGGGTAGTC AACAACCTCC651 TGACCATCGC CCAGCAATGG CACATCAACC GCAGCATCGA AAAACAACGC701 GCCCAAGGCG AAGTCGTTTC CTAA它对应于氨基酸序列<SEQ ID 50:ORF11>:1 ..NLYAGPQTTS VIANIADNLQ LAKDYGKVHW FASPLFWLLN QLHNIIGNWG51 WAIIVLTIIV KAVLYPLTNA SYRSMAKMRA AAPKLQAIKE KYGDDRMAQQ101 QAMMQLYTDE KINPLGGCLP MLLQIPVFIG LYWALFASVE LRQAPWLGWI151 TDLSRADPYY ILPIIMAATM FAQTYLNPPP TDPMQAKMMK IMPLVFSXXF201 FFFPAGXVLY WVVNNLLTIA QQWHINRSIE KQRAQGEVVS *进一步的序列分析揭示了全部的DNA序列<SEQ ID 51>:1 ATGGATTTTA AAAGACTCAC GGCGTTTTTC GCCATCGCGC TGGTGATTAT51 GATCGGCTGG GAAAAGATGT TCCCCACTCC GAAGCCAGTC CCCGCGCCCC101 AACAGGCAGC ACAACAACAG GCCGTAACCG CTTCCGCCGA AGCCGCGCTC151 GCGCCCGCAA CGCCGATTAC CGTAACGACC GACACGGTTC AAGCCGTCAT201 TGATGAAAAA AGCGGCGACC TGCGCCGGCT GACCCTGCTC AAATACAAAG251 CAACCGGCGA CGAAAATAAA CCGTTCATCC TGTTTGGCGA CGGCAAAGAA301 TACACCTACG TCGCCCAATC CGAACTTTTG GACGCGCAGG GCAACAACAT351 TCTAAAAGGC ATCGGCTTTA GCGCACCGAA AAAACAGTAC AGCTTGGAAG401 GCGACAAAGT TGAAGTCCGC CTGAGCGCGC CTGAAACACG CGGTCTGAAA451 ATCGACAAAG TTTATACTTT CACCAAAGGC AGCTATCTGG TCAACGTCCG501 CTTCGACATC GCCAACGGCA GCGGTCAAAC CGCCAACCTG AGCGCGGACT551 ACCGCATCGT CCGCGACCAC AGCGAACCCG AGGGTCAAGG TTACTTTACC601 CACTCTTACG TCGGCCCTGT TGTTTATACC CCTGAAGGCA ACTTCCAAAA651 AGTCAGCTTT TCCGACTTGG ACGACGATGC 
CAAATCCGGC AAATCCGAGG701 CCGAATACAT CCGCAAAACC CCGACCGGCT GGCTCGGCAT GATTGAACAC751 CACTTCATGT CCACCTGGAT TCTCCAACCT AAAGGCAGAC AAAGCGTTTG801 CGCCGCAGGC GAGTGCAACA TCGACATCAA ACGCCGCAAC GACAAGCTGT851 ACAGCACCAG CGTCAGCGTG CCTTTAGCCG CCATCCAAAA CGGCGCGAAA901 GCCGAAGCCT CCATCAACCT CTACGCCGGC CCGCAGACCA CATCCGTCAT951 CGCAAACATC GCCGACAACC TGCAACTGGC CAAAGACTAC GGCAAAGTAC1001 ACTGGTTCGC CTCCCCGCTC TTCTGGCTCC TGAACCAACT GCACAACATC1051 ATCGGCAACT GGGGCTGGGC GATTATCGTT TTAACCATCA TCGTCAAAGC1101 CGTACTGTAT CCATTGACCA ACGCCTCTTA CCGCTCTATG GCGAAAATGC1151 GTGCCGCCGC ACCCAAACTG CAAGCCATCA AAGAGAAATA CGGCGACGAC1201 CGTATGGCGC AACAACAGGC GATGATGCAG CTTTACACAG ACGAGAAAAT1251 CAACCCGCTG GGCGGCTGCC TGCCTATGCT GTTGCAAATC CCCGTCTTCA1301 TCGGATTGTA TTGGGCATTG TTCGCCTCCG TAGAATTGCG CCAGGCACCT1351 TGGCTGGGTT GGATTACCGA CCTCAGCCGC GCCGACCCCT ACTACATCCT1401 GCCCATCATT ATGGCGGCAA CGATGTTCGC CCAAACTTAT CTGAACCCGC1451 CGCCGACCGA CCCGATGCAG GCGAAAATGA TGAAAATCAT GCCGTTGGTT1501 TTCTCCGTCA TGTTCTTCTT CTTCCCTGCC GGTCTGGTAT TGTACTGGGT1551 AGTCAACAAC CTCCTGACCA TCGCCCAGCA ATGGCACATC AACCGCAGCA1601 TCGAAAAACA ACGCGCCCAA GGCGAAGTCG TTTCCTAA它对应于氨基酸序列<SEQ ID 52;ORF11-1>:1 MDFKRLTAFF AIALVIMIGW EKMFPTPKPV PAPQQAAQQQ AVTASAEAAL51 APATPITVTT DTVQAVIDEK SGDLRRLTLL KYKATGDENK PFILFGDGKE101 YTYVAQSELL DAQGNNILKG IGFSAPKKQY SLEGDKVEYR LSAPETRGLK151 IDKVYTFTKG SYLVNYRFDI ANGSGQTANL SADYRIVRDH SEPEGQGYFT201 HSYVGPVVYT PEGNFQKVSF SDLDDDAKSG KSEAEYIRKT PTGWLGMIEH251 HFMSTWILQP KGRQSVCAAG ECNIDIKRRN DKLYSTSVSV PLAAIQNGAK301 AEASINLYAG PQTTSVIANI ADNLQLAKDY GKVHWFASPL FWLLNQLHNI351 IGNWGWAIIV LTIIVKAVLY PLTNASYRSM AKMRAAAPKL QAIKEKYGDD401 RMAQQQAMMQ LYTDEKINPL GGCLPMLLQI PVFIGLYWAL FASVELRQAP451 WLGWITDLSR ADPYYILPII MAATMFAQTY LNPPPTDPMQ AKMMKIMPLV501 FSVMFFFFPA GLVLYWVVNN LLTIAQQWHI NRSIEKQRAQ GEVVS*
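Computer analysis of this amino acid sequence gave the following results:
Homology with a 60kDa inner-membrane protein (accession P25754) of Pseudomonas putida
ORF11 and the 60kDa protein show 58% aa identity over a 229 aa overlap (BLASTp).
ORF11 2 LYAGPQTTSVIANIADNLQLAKDYGKVHWFASPLFWLLNQLHNIIGNWGWAIIVLTIIVK 61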
LYAGP+ S + ++ L+L DYG + + A P+FWLL +H+++GNWGW+IIVLT+++K60K 324 LYAGPKIQSKLKELSPGLELTVDYGFLWFIAQPIFWLLQHIHSLLGNWGWSIIVLTMLIK 383ORF11 52 AVLYPLTNASYRSMAKMRAAAPKLQAIKEKYGDDRXXXXXXXXXLYTDEKINPLGGCLPM 121
+ +PL+ ASYRSMA+MRA APKL A+KE++GDDR LY EKINPLGGCLP+60K 384 GLFFPLSAASYRSMARMRAVAPKLAALKERFGDDRQKMSQAMMELYKKEKINPLGGCLPI 443ORF11 122 LLQIPVFIGLYWALFASVELRQAPWLGWITDLSRADPYYILPIIMAATMFAQTYLNPPPT 181
L+Q+PVF+ LYW L SVE+RQAPW+ WITDLS DP++ILPIIM ATMF Q LNP P60K 444 LVQMPVFLALYWVLLESVEMRQAPWILWITDLSIKDPFFILPIIMGATMFIQQRLNPTPP 503ORF11 182 DPMQAKMMKIMPLVXXXXXXXXPAGXVLYWVVNNLLTIAQQWHINRSIE 230
DPMQAK+MK+MP++ PAG VLYWVVNN L+I+QQW+I R IE60K 504 DPMQAKVMKMMPIIFTFFFLWFPAGLVLYWVVNNCLSISQQWYITRRIE 552
Homology with a predicted ORF from Neisseria meningitidis (strain A)
ORF11 shows 97.9% identity over a 240-amino-acid overlap with an ORF (ORF11a) from N. meningitidis strain A:
10 20 30orf11. pep NLYAGPQTTSVIANIADNLQLAKDYGKVHW
||||||||||||||||||||| ||||||||orf11a IKRRNDKLYSTSVSVPLAAIQNGAKSXASINLYAGPQTTSVIANIADNLQLXKDYGKVHW
280 290 300 310 320 330
40 50 60 70 80 90orf11.pep FASPLFWLLNQLHNIIGNWGWAIIVLTIIVKAVLYPLTNASYRSMAKMRAAAPKLQAIKE
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf11a FASPLFWLLNQLHNIIGNWGWAIIVLTIIVKAVLYPLTNASYRSMAKMRAAAPKLQAIKE
340 350 360 370 380 390
100 110 120 130 140 150orf11.pep KYGDDRMAQQQAMMQLYTDEKINPLGGCLPMLLQIPVFIGLYWALFASVELRQAPWLGWI
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf11a KYGDDRMAQQQAMMQLYTDEKINPLGGCLPMLLQIPVFIGLYWALFASVELRQAPWLGWI
400 410 420 430 440 450
160 170 180 190 200 210orf11. pep TDLSRADPYYILPIIMAATMFAQTYLNPPPTDPMQAKMMKIMPLVFSXXFFFFPAGXVLY
||||||||||||||||||||||||||||||||||||||||||||| ||||| |||| |||orf11a TDLSRADPYYILPIIMAATMFAQTYLNPPPTDPMQAKMMKIMPLVXSXXFFXFPAGLVLY
460 470 480 490 500 510
220 230 240orf11.pep WVVNNLLTIAQQWHINRSIEKQRAQGEVVSX
||:||||||||||||||||||||||||||||orf11a WVINNLLTIAQQWHINRSIEKQRAQGEVVSX
520 530 540全长ORF11核苷酸序列<SEQ ID 53>是:1 ANGGATTTTA AAAGACTCAC NGNGTTTTTC GCCATCGCAC TGGTGATTAT51 GATCGGATNG NAAANGATGT TCCCCACTCC GAAGCCCGTC CCCGCGCCCC101 AACAGACGGC ACAACAACAG GCCGTAANCG CTTCCGCCGA AGCCGCGCTC151 GCGCCCGNAN CGCCGATTAC CGTAACGACC GACACGGTTC AAGCCGTCAT201 TGATGAAAAA AGCGGCGACC TGCGCCGGCT GACCCTGCTC AAATACAAAG251 CAACCGGCGA CNAAAATAAA CCGTTCATCC TGTTTGGCGA CGGCAAANAA301 TACACCTACN TCGCCCANTC CGAACTTTTG GACGCGCAGG GCAACAACAT351 TCTAAAAGGC ATCGGCTTTA GCGCACCGAA AAAACAGTAC AGCTTGGAAG401 GCGACAAAGT TGAAGTCCGC CTGAGCGCAC CTGAAACACG CGGTCTGAAA451 ATCGACAAAG TTTATACTTT CACCAAAGGC AGCTATCTGG TCAACGTCCG501 CTTCGACATC GCCAACGGCA GCGGTCAAAC CGCCAACCTG AGCGCGGACT551 ACCGCATCGT CCGCGACCAC AGCGAACCCG AGGGTCAAGG CTACTTTACC601 CACTCTTACG TCGGCCCTGT TGTTTATACC CCTGAAGGCA ACTTCCAAAA651 AGTCAGCTTC TCCGACTTGG ACGACGATGC CAANTCCGGN AAATCCGAGG701 CCGAATACAT CCGCAAAACC CNGACCGGCT GGCTCGGCAT GATTGAACAC751 CACTTCATGT CCACCTGGAT CCTCCAACCC AAAGGCGGAC AAAGCGTTTG801 CGCCGCTGGC GACTGCNGTA TNGACATCAA ACGCCGCAAC GACAAGCTGT851 ACAGCACCAG CGTCAGCGTG CCTTTAGCCG CTATCCAAAA CGGTGCGAAA901 TCCNAAGCCT CCATCAACCT CTACGCCGGC CCACAGACCA CATCNGTTAT951 CGCAAACATC GCCGACAACC TGCAACTGGN CAAAGACTAC GGCAAAGTAC1001 ACTGGTTCGC CTCCCCCCTC TTTTGGCTTT TGAACCAACT GCACAACATC1051 ATCGGCAACT GGGGCTGGGC GATTATCGTT TTAACCATCA TCGTCAAAGC1101 CGTACTGTAT CCATTGACCA ACGCCTCTTA CCGTTCGATG GCGAAAATGC1151 GTGCCGCCGC GCCCAAACTG CAAGCCATCA AAGAGAAATA CGGCGACGAC1201 CGTATGGCGC AGCAACAAGC CATGATGCAG CTTTACACAG ACGAGAAAAT1251 CAACCCGCTG GGCGGCTGCC TGCCTATGCT GTTGCAAATC CCCGTCTTCA1301 TCGGATTGTA TTGGGCATTG TTCGCCTCCG TAGAATTGCG CCAGGCACCT1351 TGGCTGGGTT GGATTACCGA CCTCAGCCGC GCCGACCCNT ACTACATCCT1401 GCCCATCATT ATGGCGGCAA CGATGTTCGC CCAAACCTAT CTGAACCCGC1451 CGCCGACCGA CCCGATGCAG GCGAAAATGA TGAAAATCAT GCCTTTGGTT1501 NTNTCNNNNA NGTTCTTCNN CTTCCCTGCC GGTCTGGTAT TGTACTGGGT1551 GATCAACAAC CTCCTGACCA TCGCCCAGCA ATGGCACATC AACCGCAGCA1601 TCGAAAAACA ACGCGCCCAA GGCGAAGTCG TTTCCTAA它编码的蛋白质具有氨基酸序列<SEQ ID 54>:1 XDFKRLTXFF AIALVIMIGX 
XXMFPTPKPV PAPQQTAQQQ AVXASAEAAL51 APXXPITVTT DTVQAVIDEK SGDLRRLTLL KYKATGDXNK PFILFGDGKX101 YTYXAXSELL DAQGNNILKG IGFSAPKKQY SLEGDKVEVR LSAPETRGLK151 IDKVYTFTKG SYLVNVRFDI ANGSGQTANL SADYRIVRDH SEPEGQGYFT201 HSYVGPVVYT PEGNFQKVSF SDLDDDAXSG KSEAEYIRKT XTGWLGMIEH251 HFMSTWILQP KGGQSVCAAG DCXXDIKRRN DKLYSTSVSV PLAAIQNGAK301 SXASINLYAG PQTTSVIANI ADNLQLXKDY GKVHWFASPL FWLLNQLHNI351 IGNWGWAIIV LTIIVKAVLY PLTNASYRSM AKMRAAAPKL QAIKEKYGDD401 RMAQQQAMMQ LYTDEKINPL GGCLPMLLQI PVFIGLYWAL FASVELRQAP451 WLGWITDLSR ADPYYILPII MAATMFAQTY LNPPPTDPMQ AKMMKIMPLV501 XSXXFFXFPA GLVLYWVINN LLTIAQQWHI NRSIEKQRAQ GEVVS*ORF11和ORF11-1在544个氨基酸重叠区域内显示出有95.2%的相同性:
10 20 30 40 50 60orf11a.pep XDFKRLTXFFAIALVIMIGXXXMFPTPKPVPAPQQTAQQQAVXASAEAALAPXXPITVTT
|||||| ||||||||||| |||||||||||||:||||||:||||||||| :||||||orf11-1 MDFKRLTAFFAIALVIMIGWEKMFPTPKPVPAPQQAAQQQAVTASAEAALAPATPITVTT
10 20 30 40 50 60
70 80 90 100 110 120orf11a.pep DTVQAVIDEKSGDLRRLTLLKYKATGDXNKPFILFGDGKXYTYXAXSELLDAQGNNILKG
||||||||||||||||||||||||||| ||||||||||| ||| | ||||||||||||||orf11-1 DTVQAVIDEKSGDLRRLTLLKYKATGDENKPFILFGDGKEYTYVAQSELLDAQGNNILKG
70 80 90 100 110 120
130 140 150 160 170 180orf11a.pep IGFSAPKKQYSLEGDKVEVRLSAPETRGLKIDKVYTFTKGSYLVNVRFDIANGSGQTANL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf11-1 IGFSAPKKQYSLEGDKVEVRLSAPETRGLKIDKVYTFTKGSYLVNVRFDIANGSGQTANL
130 140 150 160 170 180
190 200 210 220 230 240orf11a.pep SADYRIVRDHSEPEGQGYFTHSYVGPVVYTPEGNFQKVSFSDLDDDAXSGKSEAEYIRKT
||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||orf11-1 SADYRIVRDHSEPEGQGYFTHSYVGPVVYTPEGNFQKVSFSDLDDDAKSGKSEAEYIRKT
190 200 210 220 230 240
250 260 270 280 290 300orf11a.pep XTGWLGMIEHHFMSTWILQPKGGQSVCAAGDCXXDIKRRNDKLYSTSVSVPLAAIQNGAK
||||||||||||||||||||| |||||||:| ||||||||||||||||||||||||||orf11-1 PTGWLGMIEHHFMSTWILQPKGRQSVCAAGECNIDIKRRNDKLYSTSVSVPLAAIQNGAK
250 260 270 280 290 300
310 320 330 340 350 360orf11a.pep SXASINLYAGPQTTSVIANIADNLQLXKDYGKVHWFASPLFWLLNQLHNIIGNWGWAIIV
: |||||||||||||||||||||||| |||||||||||||||||||||||||||||||||orf11-1 AEASINLYAGPQTTSVIANIADNLQLAKDYGKVHWFASPLFWLLNQLHNIIGNWGWAIIV
310 320 330 340 350 360
370 380 390 400 410 420orf11a.pep LTIIVKAVLYPLTNASYRSMAKMRAAAPKLQAIKEKYGDDRMAQQQAMMQLYTDEKINPL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf11-1 LTIIVKAVLYPLTNASYRSMAKMRAAAPKLQAIKEKYGDDRMAQQQAMMQLYTDEKINPL
370 380 390 400 410 420
430 440 450 460 470 480orf11a.pep GGCLPMLLQIPVFIGLYWALFASVELRQAPWLGWITDLSRADPYYILPIIMAATMFAQTY
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf11-1 GGCLPMLLQIPVFIGLYWALFASVELRQAPWLGWITDLSRADPYYILPIIMAATMFAQTY
430 440 450 460 470 480
490 500 510 520 530 540orf11a.pep LNPPPTDPMQAKMMKIMPLVXSXXFFXFPAGLVLYWVINNLLTIAQQWHINRSIEKQRAQ
|||||||||||||||||||| | || ||||||||||:||||||||||||||||||||||orf11-1 LNPPPTDPMQAKMMKIMPLVFSVMFFFFPAGLVLYWVVNNLLTIAQQWHINRSIEKQRAQ
490 500 510 520 530 540orf11a.pep GEVVSX
||||||orf11-1 GEVVSX
Homology with a predicted ORF from Neisseria gonorrhoeae
ORF11 shows 93.6% identity over a 240-amino-acid overlap with a predicted ORF (ORF11.ng) from N. gonorrhoeae:
orf11 NLYAGPQTTSVIANIADNLQLAKDYGKVHWFASPLFWLLNQLHNIIGNWGWAIIVLT 57
|||||||||||||||||||||||||||||||||||||||||||||||||||||:|||
orf11ng MAVNLYAGPQTTSVIANIADNLQLAKDYGKVHWFASPLFWLLNQLHNIIGNWGWAIVVLT 60
orf11 IIVKAVLYPLTNASYRSMAKMRAAAPKLQAIKEKYGDDRMAQQQAMMQLYTDEKINPLGG 117
||||||||||||||||||||||||||:||:|||||||||||||||||||: ||:||||||
orf11ng IIVKAVLYPLTNASYRSMAKMRAAAPELQTIKEKYGDDRMAQQQAMMQLFEDEEINPLGG 120
orf11 CLPMLLQIPVFIGLYWALFASVELRQAPWLGWITDLSRADPYYILPIIMAATMFAQTYLN 177
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf11ng CLPMLLQIPVFIGLYWALFASVELRQAPWLGWITDLSRADPYYILPIIMAATMFAQTYLN 180
orf11 PPPTDPMQAKMMKIMPLVFSXXFFFFPAGXVLYWVVNNLLTIAQQWHINRSIEKQRAQGE 237
||||||||||||||||||||  ||||||| ||||||||||||||||||||||||||||||
orf11ng PPPTDPMQAKMMKIMPLVFSVMFFFFPAGLVLYWVVNNLLTIAQQWHINRSIEKQRAQGE 240
orf11 VVS 240
|||
orf11ng VVS 243
预计ORF11ng核苷酸序列<SEQ ID 55>编码的蛋白质具有氨基酸序列<SEQ ID56>:1 MAVNLYAGPQ TTSVIANIAD NLQLAKDYGK VHWFASPLFW LLNQLHNIIG51 NWGWAIVVLT IIVKAVLYPL TNASYRSMAK MRAAAPELQT IKEKYGDDRM101 AQQQAMMQLF EDEEINPLGG CLPMLLQIPV FIGLYWALFA SVELRQAPWL151 GWITDLSRAD PYYILPIIMA ATMFAQTYLN PPPTDPMQAK MMKIMPLVFS201 VMFFFFPAGL VLYWVVNNLL TIAQQWHINR SIEKQRAQGE VVS*进一步的序列分析揭示了全部的淋球菌DNA序列<SEQ ID 57>是:1 ATGGATTTTA AAAGACTCAC GGCGTTTTTC GCCATCGCGC TGGTGATTAT51 GATCGGCTGG GAAAAAATGT TCCCCACCCC GAAACCCGTC CCCGCGCCCC101 AACAGGCGGC ACAAAAACAG GCAGCAACCG CTTCCGCCGA AGCCGCGCTC151 GCGCCCGCAA CGCCGATTAC CGTAACGACC GACACGGTTC AAGCCGTTAT201 TGATGAAAAA AGTGGCGACC TGCGCCGGCT GACCCTGCTC AAATACAAAG251 CAACCGGCGA CGAAAACAAA CCGTTCGTCC TGTTTGGCGA CGGCAAAGAA301 TACACCTACG TCGCCCAATC CGAACTTTTG GACGCGCAGG GCAACAACAT351 TCTGAAAGGC ATCGGCTTTA GCGCACCGAA AAAACAGTAC ACCCTCAACG401 GCGACACAGT CGAAGTCCGC CTGAGCGCGC CCGAAACCAA CGGACTGAAA451 ATCGACAAAG TCTATACCTT TACCAAAGAC AGCTATCTGG TCAACGTCCG501 CTTCGACATC GCCAACGGCA GCGGTCAAAC CGCCAACCTG AGCGCGGACT551 ACCGCATCGT CCGCGACCAC AGCGAACCCG AGGGTCAAGG CTACTTTACC601 CACTCTTACG TCGGCCCTGT TGTTTATACC CCTGAAGGCA ACTTCCAAAA651 AGTCAGCTTC TCCgacTTgg acgACGATGC gaaaTccggc aaATccgagg701 ccgaatacaT CCGCAAAACC ccgaccggtt ggctcggcat gattgaacac751 cacttcatgt ccacctggat cctccAAcct aaaggcggcc aaaacgtttg801 cgcccaggga gactgccgta tcgacattaa aCgccgcaac gacaagctgt851 acagcgcaag cgtcagcgtg cctttaaccg ctatcccaac ccgggggcca901 aaaccgaaaa tggcggTCAA CCTGTATGCC GGTCCGCAAA CCACATCCGT951 TATCGCAAAC ATCGCcgacA ACCTGCAACT GGCAAAAGAC TACGGTAAAG1001 TACACTGGTT CGCATCGCCG CTCTTCTGGC TCCTGAACCA ACTGCACAAC1051 ATTATCGGCA ACTGGGGCTG GGCAATCGTC GTTTTGACCA TCATCGTCAA1101 AGCCGTACTG TATCCATTGA CCAACGcctc ctACCGTTCG ATGGCGAAAA1151 TGCGTGccgc cgcacCcaaA CTGCAGACCA TCAAAGAAAA ATAcgGCGAC1201 GACCGTATGG CGCAACAGCA AGCGATGATG CAGCTTTACA AAgacgAGAA1251 AATCAACCCG CTGGGCGGCT GTctgcctat gctgttgCAA ATCCCCGTCT1301 TCATCGGCTT GTACTGGGCA TTGTTCGCCT CCGTAGAATT GCGCCAGGCA1351 CCTTGGCTGG GCTGGATTAC CGACCTCAGC CGCGCCGACC CCTACTACAT1401 
CCTGCCCATC ATTATGGCGG CAACGATGTT CGCCCAAACC TATCTGAACC1451 CGCCGCCGAC CGACCCGATG CAGGCGAAAA TGATGAAAAT CATGCCGTTG1501 GTTTTCTCCG TCATGTTCTT CTTCTTCCCT GCCGGTTTGG TTCTCTACTG1551 GGTGGTCAAC AACCTCCTGA CCATCGCCCA GCAGTGGCAC ATCAACCGCA1601 GCATCGAAAA ACAACGCGCC CAAGGCGAAG TCGTTTCCTA A它编码的蛋白质具有氨基酸序列<SEQ ID 58;ORF11ng-1>:1 MDFKRLTAFF AIALVIMIGW EKMFPTPKPV PAPQQAAQKQ AATASAEAAL51 APATPITVTT DTVQAVIDEK SGDLRRLTLL KYKATGDENK PFVLFGDGKE101 YTYVAQSELL DAQGNNILKG IGFSAPKKQY TLNGDTVEVR LSAPETNGLK151 IDKVYTFTKD SYLVNVRFDI ANGSGQTANL SADYRIVRDH SEPEGQGYFT201 HSYVGPVVYT PEGNFQKVSF SDLDDDAKSG KSEAEYIRKT PTGWLGMIEH251 HFMSTWILQP KGGQNVCAQG DCRIDIKRRN DKLYSASVSV PLTAIPTRGP301 KPKMAVNLYA GPQTTSVIAN IADNLQLAKD YGKVHWFASP LFWLLNQLHN351 IIGNWGWAIV VLTIIVKAVL YPLTNASYRS MAKMRAAAPK LQTIKEKYGD401 DRMAQQQAMM QLYKDEKINP LGGCLPMLLQ IPVFIGLYWA LFASVELRQA451 PWLGWITDLS RADPYYILPI IMAATMFAQT YLNPPPTDPM QAKMMKIMPL501 VFSVMFFFFP AGLVLYWVVN NLLTIAQQWH INRSIEKQRA QGEVVS*ORF11ng-1和ORF11-1在546个氨基酸的重叠区域内显示出有95.1%的相同性:
10 20 30 40 50 60orf11ng-1.pep MDFKRLTAFFAIALVIMIGWEKMFPTPKPVPAPQQAAQKQAATASAEAALAPATPITVTT
||||||||||||||||||||||||||||||||||||||:||:||||||||||||||||||orf11-1 MDFKRLTAFFAIALVIMIGWEKMFPTPKPVPAPQQAAQQQAVTASAEAALAPATPITVTT
10 20 30 40 50 60
70 80 90 100 110 120orf11ng-1.pep DTVQAVIDEKSGDLRRLTLLKYKATGDENKPFVLFGDGKEYTYVAQSELLDAQGNNILKG
||||||||||||||||||||||||||||||||:|||||||||||||||||||||||||||orf11-1 DTVQAVIDEKSGDLRRLTLLKYKATGDENKPFILFGDGKEYTYVAQSELLDAQGNNILKG
70 80 90 100 110 120
130 140 150 160 170 180orf11ng-1.pep IGFSAPKKQYTLNGDTVEVRLSAPETNGLKIDKVYTFTKDSYLVNVRFDIANGSGQTANL
||||||||||:|:|| |||||||||| |||||||||||| ||||||||||||||||||||orf11-1 IGFSAPKKQYSLEGDKVEVRLSAPETRGLKIDKVYTFTKGSYLVNVRFDIANGSGQTANL
130 140 150 160 170 180
190 200 210 220 230 240orf11ng-1.pep SADYRIVRDHSEPEGQGYFTHSYVGPVVYTPEGNFQKVSFSDLDDDAKSGKSEAEYIRKT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf11-1 SADYRIVRDHSEPEGQGYFTHSYVGPVVYTPEGNFQKVSFSDLDDDAKSGKSEAEYIRKT
190 200 210 220 230 240
250 260 270 280 290 300orf11ng-1.pep PTGWLGMIEHHFMSTWILQPKGGQNVCAQGDCRIDIKRRNDKLYSASVSVPLTAIPTRGP
|||||||||||||||||||||| |:||| |:| ||||||||||||:||||||:|| : |orf11-1 PTGWLGMIEHHFMSTWILQPKGRQSVCAAGECNIDIKRRNDKLYSTSVSVPLAAIQN-GA
250 260 270 280 290
310 320 330 340 350 360orf11ng-1.pep KPKMAVNLYAGPQTTSVIANIADNLQLAKDYGKVHWFASPLFWLLNQLHNIIGNWGWAIV
| : ::|||||||||||||||||||||||||||||||||||||||||||||||||||||:orf11-1 KAEASINLYAGPQTTSVIANIADNLQLAKDYGKVHWFASPLFWLLNQLHNIIGNWGWAII
300 310 320 330 340 350
370 380 390 400 410 420orf11ng-1.pep VLTIIVKAVLYPLTNASYRSMAKMRAAAPKLQTIKEKYGDDRMAQQQAMMQLYKDEKINP
||||||||||||||||||||||||||||||||:|||||||||||||||||||| ||||||orf11-1 VLTIIVKAVLYPLTNASYRSMAKMRAAAPKLQAIKEKYGDDRMAQQQAMMQLYTDEKINP
360 370 380 390 400 410
430 440 450 460 470 480orf11ng-1.pep LGGCLPMLLQIPVFIGLYWALFASVELRQAPWLGWITDLSRADPYYILPIIMAATMFAQT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf11-1 LGGCLPMLLQIPVFIGLYWALFASVELRQAPWLGWITDLSRADPYYILPIIMAATMFAQT
420 430 440 450 460 470
490 500 510 520 530 540orf11ng-1.pep YLNPPPTDPMQAKMMKIMPLVFSVMFFFFPAGLVLYWVVNNLLTIAQQWHINRSIEKQRA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf11-1 YLNPPPTDPMQAKMMKIMPLVFSVMFFFFPAGLVLYWVVNNLLTIAQQWHINRSIEKQRA
480 490 500 510 520 530orf11ng-1.pep QGEVVSX
|||||||orf11-1 QGEVVSX
540另外,ORF11ng-1显示出与数据库中的内膜蛋白(登录号为p25754)明显同源:ID 60IM_PSEPU STANDARD; PRT; 560 AA.AC P25754;DT 01-MAY-1992(REL.22,产生)DT 01-MAY-1992(REL.22,序列的最后更新)DT 01-NOV-1995(REL.32,注解的最后更新)DE 60 KD内膜蛋白....SCORES Initl:1074 Initn:1293 Opt:1103Smith-Waterman评分:1406; 574个氨基酸重叠区内有41.5%的相同性
10 20 30 40orf11ng-1.pep MDFKR---LTAFFAIALVIMIGW-----EKMFPT------------PKPVPAPQQAAQKQ
||:|| ::|: ::: |::: | : :|| | ||| :::|: :p25754 MDIKRTILIAALAVVSYVMVLKWNDDYGQAALPTQNTAASTVAPGLPDGVPAGNNGASAD
10 20 30 40 50 60
50 60 70 80 90orf11ng-1.pep AATASAEAALAPATPIT-------VTTDTVQAVIDEKSGDLRRLTLLKYKATGDE-NKPF
: :|:||:: | :|:: | ||::: :|| :||: :|:| || |: | ||p25754 VPSANAESSPAELAPVALSKDLIRVKTDVLELAIDPVGGDIVQLNLPKYPRRQDHPNIPF
70 80 90 100 110 120
100 110 120 130 140orf11ng-1.pep VLFGDGKEYTYVAQSELLDAQGNNILKGIG---FSAPKKQYTL-NGD---TVEVRLSAPE
|| :| | :|:||| | ::| : :: | ::| :|:| | :|: :|::::|p25754 QLFDNGGERVYLAQSGLTGTDGPDA-RASGRPLYAAEQKSYQLADGQEQLVVDLKFS---
130 140 150 160 170
150 160 170 180 190 200orf11ng-1.pep TNGLKIDKVYTFTKDSYLVNVRFDIANGSGQTANLSADYRIVRDHS-EPEGQGYF-THSY
||:: | ::| : | :|| : | | |||: | : :: || | :| :: | :|p25754 DNGVNYIKRFSFKRGEYDLNVSYLIDNQSGQAWNGNMFAQLKRDASGDPSSSTATGTATY
180 190 200 210 220 230
210 220 230 240 250 260orf11ng-1.pep VGPVVYTPEGNFQKVSFSDLDDDAKSGKSEAEYIRKTPTGWLGMIEHHFMSTWILQPKGG
:| :::| ::|||::|:| |:: :| :: ||:: ::|:|:::|| |:p25754 LGAALWTASEPYKKVSMKDID---KGSLKE-----NVSGGWVAWLQHYFVTAWI-PAKSD
240 250 260 270 280
270 280 290 300 310 320orf11ng-1.pep QNVCAQGDCRIDIKRRNDKLYSASVSVPLTAIPTRGPKPKMAVNLYAGPQTTSVIANIAD
:|| :: :: :: | : : |: ::|: | | : :: |||||: | : :::p25754 NNV-------VQTRKDSQGNYIIGYTGPVISVPA-GGKVETSALLYAGPKIQSKLKELSP
290 300 310 320 330
330 340 350 360 370 380orf11ng-1.pep NLQLAKDYGKVHWF-ASPLFWLLNQLHNIIGNWGWAIVVLTIIVKAVLYPLTNASYRSMA
:|:|: ||| : || |:|:||||:::|:::|||||:|:|||:::|::::||: |||||||p25754 GLELTVDYGFL-WFIAQPIFWLLQHIHSLLGNWGWSIIVLTMLIKGLFFPLSAASYRSMA
340 350 360 370 380 390
390 400 410 420 430 440orf11ng-1.pep KMRAAAPKLQTIKEKYGDDRMAQQQAMMQLYKDEKINPLGGCLPMLLQIPVFIGLYWALF
:|||:|||| ::||::||||: ::||||:||| |||||||||||:|:|:|||::|||:|:p25754 RMRAVAPKLAALKERFGDDRQKMSQAMMELYKKEKINPLGGCLPILVQMPVFLALYWVLL
400 410 420 430 440 450
450 460 470 480 490 500orf11ng-1.pep ASVELRQAPWLGWITDLSRADPYYILPIIMAATMFAQTYLNPPPTDPMQAKMMKIMPLVF
|||:|||||: |||||| ||::||||||:|||| | ||| | ||||||:||:||::|p25754 ESVEMRQAPWILWITDLSIKDPFFILPIIMGATMFIQQRLNPTPPDPMQAKVMKMMPIIF
460 470 480 490 500 510
510 520 530 540orf11ng-1.pep SVMFFFFPAGLVLYWVVNNLLTIAQQWHINRSIEKQRAQGEVVSX
: :|::||||||||||||| |:|:|||:|:| ||p25754 TFFFLWFPAGLVLYWVVNNCLSISQQWYITRRIEAATKKAAA
520 530 540 550 560
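Based on this analysis, including the homology with the inner-membrane protein from Pseudomonas putida and the predicted transmembrane domains (seen in both the meningococcal and gonococcal proteins), it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 8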
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 1>:1 ..GCCGTCTTAA TCATCGAATT ATTGACGGGA ACGGTTTATC TTTTGGTTGT51 NAGCGCGGCT TTGGCGGGTT CGGGCATTGC TTACGGGCTG ACCGGCAGTA101 CGCCTGCCGC CGTCTTGACC GNCGCTCTGC TTTCCGCGCT GGGTATTTNG151 TTCGTACACG CCAAAACCGC CGTTAGAAAA GTTGAAACGG ATTCATATCA201 GGATTTGGAT GCCGGACAAT ATGTCGAAAT CCTCCGNCAC ACAGGCGGCA251 ACCGTTACGA AGTT.TTTAT CGCGGTACG. ACTGGCAGGC TCAAAATACG301 GGGCAAGAAG AGCTTGAACC AGGAACTCGC GCCCTCATTG TCCGCAAGGA351 AGGCAACCTT CTTATTATCA CACACCCTTA A它对应于氨基酸序列<SEQ ID 2;ORF13>:1 ..AVLIIELLTG TVYLLVVSAA LAGSGIAYGL TGSTPAAVLT XALLSALGIX51 FVHAKTAYRK VETDSYQDLD AGQYVEILRH TGGNRYEVXY RGTXWQAQNT101 GQEELEPGTR ALIVRKEGNL LIITHP*进一步的序列分析稍稍详细地描述了DNA序列<SEQ ID 3>:1 ..GCCGTCTTAA TCATCGAATT ATTGACGGGA ACGGTTTATC TTTTGGTTGT51 nAGCGCGGCT TTGGCGGGTT CGGGCATTGC TTACGGGCTG ACCGGCAGTA101 CGCCTGCCGC CGTCTTGACC GnCGCTCTGC TTTCCGCGCT GGGTATTTnG151 TTCGTACACG CCAAAACCGC CGTTAGAAAA GTTGAAACGG ATTCATATCA201 GGATTTGGAT GCCGGACAAT ATGTCGAAAT CCTCCGACAC ACAGGCGGCA251 ACCGTTACGA AGTTTTtTAT CGCGGTACGc ACTGGCAGGC TCAAAATACG301 GGGCAAGAAG AGCTTGAACC AGGAACTCGC GCCCTCATTG TCCGCAAGGA351 AGGCAACCTT CTTATTATCA CACACCCTTA A它对应于氨基酸序列<SEQ ID 4;ORF13-1>:1 ..AVLIIELLTG TVYLLVVSAA LAGSGIAYGL TGSTPAAVLT XALLSALGIX51 FVHAKTAVRK VETDSYQDLD AGQYVEILRH TGGNRYEVFY RGTHWQAQNT101 GQEELEPGTR ALIVRKEGNL LIITHP*
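Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF13 shows 92.9% identity over a 126-amino-acid overlap with an ORF (ORF13a) from N. meningitidis strain A: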
10 20 30 40 50orf13.pep AVLIIELLTGTVYLLVVSAALAGSGIAYGLTGSTPAAVLTXALLSALGIXF
|||||||||||||||||||||||||||||||||||||||| |||||||| |orf13a MTVWFVAAVAVLIIELLTGTVYLLVVSAALAGSGIAYGLTGSTPAAVLTAALLSALGIWF
10 20 30 40 50 60
60 70 80 90 100 110orf13.pep VHAKTAVRKVETDSYQDLDAGQYVEILRHTGGNRYEVXYRGTXWQAQNTGQEELEPGTRA
||||||| |||||||||||||||:|||||:||||||| |||| |||||||||||||||||orf13a VHAKTAVGKVETDSYQDLDAGQYAEILRHAGGNRYEVFYRGTHWQAQNTGQEELEPGTRA
70 80 90 100 110 120
120orf13.pep LIVRKEGNLLIITHPX
||||||||||||::||orf13a LIVRKEGNLLIIAKPX
130
The full-length ORF13a nucleotide sequence <SEQ ID 5> is:
1 ATGACTGTAT GGTTTGTTGC CGCTGTTGCC GTCTTAATCA TCGAATTATT
51 GACGGGAACG GTTTATCTTT TGGTTGTCAG CGCGGCTTTG GCGGGTTCGG
101 GCATTGCTTA CGGGCTGACC GGCAGCACGC CTGCCGCCGT CTTGACCGCC
151 GCTCTGCTTT CCGCGCTGGG TATTTGGTTC GTACACGCCA AAACCGCCGT
201 GGGAAAAGTT GAAACGGATT CATATCAGGA TTTGGATGCC GGGCAATATG
251 CCGAAATCCT CCGGCACGCA GGCGGCAACC GTTACGAAGT TTTTTATCGC
301 GGTACGCACT GGCAGGCTCA AAATACGGGG CAAGAAGAGC TTGAACCAGG
351 AACGCGCGCC CTAATCGTCC GCAAGGAAGG CAACCTTCTT ATCATCGCAA
401 AACCTTAA
It encodes a protein having the amino acid sequence <SEQ ID 6>:
1 MTVWFVAAVA VLIIELLTGT VYLLVVSAAL AGSGIAYGLT GSTPAAVLTA
51 ALLSALGIWF VHAKTAVGKV ETDSYQDLDA GQYAEILRHA GGNRYEVFYR
101 GTHWQAQNTG QEELEPGTRA LIVRKEGNLL IIAKP*
ORF13a and ORF13-1 show 94.4% identity over a 126-amino-acid overlap:
10 20 30 40 50 60orf13a.pep MTVWFVAAVAVLIIELLTGTVYLLVVSAALAGSGIAYGLTGSTPAAVLTAALLSALGIWF
|||||||||||||||||||||||||||||||||||||||| |||||||| |orf13-1 AVLIIELLTGTVYLLVVSAALAGSGIAYGLTGSTPAAVLTXALLSALGIXF
10 20 30 40 50
70 80 90 100 110 120orf13a.pep VHAKTAVGKVETDSYQDLDAGQYAEILRHAGGNRYEVFYRGTHWQAQNTGQEELEPGTRA
||||||| |||||||||||||||:|||||:||||||||||||||||||||||||||||||orf13-1 VHAKTAVRKVETDSYQDLDAGQYVEILRHTGGNRYEVFYRGTHWQAQNTGQEELEPGTRA
60 70 80 90 100 110
orf13a.pep LIVRKEGNLLIIAKPX
||||||||||||::||
orf13-1 LIVRKEGNLLIITHPX
120
Homology with a predicted ORF from N. gonorrhoeae
ORF13 shows 89.7% identity over a 126-amino-acid overlap with a predicted ORF (ORF13.ng) from N. gonorrhoeae:
orf13 AVLIIELLTGTVYLLVVSAALAGSGIAYGLTGSTPAAVLTXALLSALGIXF 51
|||||||||||||||||||||||||||||||||||||||| |||||||| |
orf13ng MTVWFVAAVAVLIIELLTGTVYLLVVSAALAGSGIAYGLTGSTPAAVLTAALLSALGIWF 60
orf13 VHAKTAVRKVETDSYQDLDAGQYVEILRHTGGNRYEVXYRGTXWQAQNTGQEELEPGTRA 111
||||||| |||||||||||:|:|:||||:|||||||| |||| ||||||||| :||||||
orf13ng VHAKTAVGKVETDSYQDLDTGKYAEILRYTGGNRYEVFYRGTHWQAQNTGQEVFEPGTRA 120
orf13 LIVRKEGNLLIITHP 126
||||||||||||::|
orf13ng LIVRKEGNLLIIANP 135
The full-length ORF13ng nucleotide sequence <SEQ ID 7> is:
1 ATGACTGTAT GGTTTGTTGC CGCTGTTGCC GTCTTAATCA TCGAATTATT
51 GACGGGAACG GTTTATCTTT TGGTTGTCAG CGCGGCTTTG GCGGGTTCGG
101 GCATTGCCTA CGGGCTGACT GGCAGCACGC CTGCCGCCGT CTTGACCGCC
151 GCACTGCTTT CCGCGCTGGG CATTTGGTTC GTACATGCCA AAACCGCCGT
201 GGGAAAAGTT GAAACGGATT CATATCAGGA TTTGGATACC GGAAAATATG
251 CCGAAATCCT CCGATACACA GGCGGCAACC GTTACGAAGT TTTTTATCGC
301 GGTACGCACT GGCAGGCGCA AAATACGGGG CAGGAAGTGT TTGAACCGGG
351 AACGCGCGCC CTCATCGTCC GCAAAGAAGG TAACCTTCTT ATCATCGCAA
401 ACCCTTAA
It encodes a protein having the amino acid sequence <SEQ ID 8>:
1 MTVWFVAAVA VLIIELLTGT VYLLVVSAAL AGSGIAYGLT GSTPAAVLTA
51 ALLSALGIWF VHAKTAVGKV ETDSYQDLDT GKYAEILRYT GGNRYEVFYR
101 GTHWQAQNTG QEVFEPGTRA LIVRKEGNLL IIANP*
ORF13ng and ORF13-1 show 91.3% identity over a 126-amino-acid overlap:
10 20 30 40 50
orf13-1.pep AVLIIELLTGTVYLLVVSAALAGSGIAYGLTGSTPAAVLTXALLSALGIXF
|||||||||||||||||||||||||||||||||||||||| |||||||| |
orf13ng MTVWFVAAVAVLIIELLTGTVYLLVVSAALAGSGIAYGLTGSTPAAVLTAALLSALGIWF
10 20 30 40 50 60
60 70 80 90 100 110
orf13-1.pep VHAKTAVRKVETDSYQDLDAGQYVEILRHTGGNRYEVFYRGTHWQAQNTGQEELEPGTRA
||||||| |||||||||||:|:|:||||:||||||||||||||||||||||| :||||||
orf13ng VHAKTAVGKVETDSYQDLDTGKYAEILRYTGGNRYEVFYRGTHWQAQNTGQEVFEPGTRA
70 80 90 100 110 120
120
orf13-1.pep LIVRKEGNLLIITHPX
||||||||||||::||
orf13ng LIVRKEGNLLIIANPX
130
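The identity figures quoted with each alignment can be checked by counting matching columns and dividing by the overlap length. The following Python sketch does this for the ORF13-1/ORF13ng overlap shown above. The function name `percent_identity` and the rule that an unresolved `X` never counts as a match are assumptions made for illustration; the source does not state how its alignment program scored such positions.

```python
def percent_identity(row_a: str, row_b: str) -> tuple[int, int, float]:
    """Return (matches, overlap, percent) for two pre-aligned rows.

    The rows must already be aligned and of equal length. Gap characters
    ('-') and unresolved residues ('X') are never counted as matches
    (an assumption mirroring the blank midline positions in the text).
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    matches = sum(
        1 for a, b in zip(row_a, row_b)
        if a == b and a not in "-X"
    )
    overlap = len(row_a)
    return matches, overlap, 100.0 * matches / overlap


# The 126-residue overlap, copied row by row from the alignment
# (ORF13ng residues 10-135; the trailing stop position is omitted).
ORF13_1 = (
    "AVLIIELLTGTVYLLVVSAALAGSGIAYGLTGSTPAAVLTXALLSALGIXF"
    "VHAKTAVRKVETDSYQDLDAGQYVEILRHTGGNRYEVFYRGTHWQAQNTGQEELEPGTRA"
    "LIVRKEGNLLIITHP"
)
ORF13NG = (
    "AVLIIELLTGTVYLLVVSAALAGSGIAYGLTGSTPAAVLTAALLSALGIWF"
    "VHAKTAVGKVETDSYQDLDTGKYAEILRYTGGNRYEVFYRGTHWQAQNTGQEVFEPGTRA"
    "LIVRKEGNLLIIANP"
)

matches, overlap, pct = percent_identity(ORF13_1, ORF13NG)
print(f"{matches}/{overlap} identical = {pct:.1f}%")
```

Under these assumptions the count reproduces the 91.3% figure stated for this pair; the same function applies to any other pair of rows once the gapped rows are concatenated.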
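Based on this analysis, including the extended leader sequence in this protein, it is predicted that ORF13 and ORF13ng are likely to be outer-membrane proteins. It is therefore predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 9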
在脑膜炎奈瑟球菌中鉴定出下列DNA序列<SEQ ID 9>:1 ATGTwTGATT TCGGTTTrGG CGArCTGGTT TTTGTCGGCA TTATCGCCCT51 GATwGtCCTC GGCCCCGAAC GCsTGCCCGA GGCCGCCCGC AyCGCCGGAC101 GGcTCATCGG CAGGCTGCAA CGCTTTGTCG GcAGCGTCAA ACAGGAATTT151 GACACTCAAA TCGAACTGGA AGAACTGAGG AAGGCAAAGC AGGAATTTGA201 AGCTGCCGcC GCTCAGGTTC GAGACAGCCT CAAAGAAACC GGTACGGATA251 TGGAAGGCAA TCTGCACGAC ATTTCCGACG GTCTGAAGCC TTGGGAAAAA301 CTGCCCGAAC AGCGGACACC TGCCGATTTC GGTGTCGATG AAAACGGCAA351 TCCGCT.TCC CGATGCGGCA AACACCCTAT CAGACGGCAT TTCCGACGTT401 ATGCCGTC..它对应于氨基酸序列<SEQ ID 10;ORF2>:1 MXDFGLGELV FVGIIALIVL GPERXPEAAR XAGRLIGRLQ RFVGSVKQEF51 DTQIELEELR KAKQEFEAAA AQVRDSLKET GTDMEGNLHD ISDGLKPWEK101 LPEQRTPADF GVDENGNPXS RCGKHPIRRH FRRYAV..进一步的工作揭示了完整的核苷酸序列<SEQ ID 11>:1 ATGTTTGATT TCGGTTTGGG CGAGCTGGTT TTTGTCGGCA TTATCGCCCT51 GATTGTCCTC GGCCCCGAAC GCCTGCCCGA GGCCGCCCGC ACCGCCGGAC101 GGCTCATCGG CAGGCTGCAA CGCTTTGTCG GCAGCGTCAA ACAGGAATTT151 GACACTCAAA TCGAACTGGA AGAACTGAGG AAGGCAAAGC AGGAATTTGA201 AGCTGCCGCC GCTCAGGTTC GAGACAGCCT CAAAGAAACC GGTACGGATA251 TGGAAGGCAA TCTGCACGAC ATTTCCGACG GTCTGAAGCC TTGGGAAAAA301 CTGCCCGAAC AGCGGACACC TGCCGATTTC GGTGTCGATG AAAACGGCAA351 TCCGCTTCCC GATGCGGCAA ACACCCTATC AGACGGCATT TCCGACGTTA401 TGCCGTCCGA ACGTTCCTAC GCTTCCGCCG AAACCCTTGG GGACAGCGGG451 CAAACCGGCA GTACAGCCGA ACCCGCGGAA ACCGACCAAG ACCGCGCATG501 GCGGGAATAC CTGACTGCTT CTGCCGCCGC ACCCGTCGTA CAGACCGTCG551 AAGTCAGCTA TATCGATACT GCTGTTGAAA CGCCTGTTCC GCACACCACT601 TCCCTGCGCA AACAGGCAAT AAGCCGCAAA CGCGATTTTC GTCCGAAACA651 CCGCGCCAAA CCTAAATTGC GCGTCCGTAA ATCATAA它对应于氨基酸序列<SEQ ID 12;ORF2-1>:1 MFDFGLGELV FVGIIALIVL GPERLPEAAR TAGRLIGRLQ RFVGSVKQEF51 DTQIELEELR KAKQEFEAAA AQVRDSLKET GTDMEGNLHD ISDGLKPWEK101 LPEQRTPADF GVDENGNPLP DAANTLSDGI SDVMPSERSY ASAETLGDSG151 QTGSTAEPAE TDQDRAWREY LTASAAAPVV QTVEVSYIDT AVETPVPHTT201 SLRKQAISRK RDFRPKHRAK PKLRVRKS*进一步的工作鉴定了脑膜炎奈瑟球菌菌株A中对应的基因<SEQ ID 13>:1 ATGTTTGATT TCGGTTTGGG CGAGCTGGTT TTTGTCGGCA TTATCGCCCT51 GATTGTCCTC GGCCCCGAAC GCCTGCCCGA GGCCGCCCGC ACCGCCGGAC101 GGCTCATCGG CAGGCTGCAA CGCTTTGTCG 
GCAGCGTCAA ACAGGAATTT151 GACACGCAAA TCGAACTGGA AGAACTAAGG AAGGCAAAGC AGGAATTTGA201 AGCTGCCGCT GCTCAGGTTC GAGACAGCCT CAAAGAAACC GGTACGGATA251 TGGAGGGTAA TCTGCACGAC ATTTCCGACG GTCTGAAGCC TTGGGAAAAA301 CTGCCCGAAC AGCGCACGCC TGCTGATTTC GGTGTCGATG AAAACGGCAA351 TCCCTTTCCC GATGCGGCAA ACACCCTATT AGACGGCATT TCCGACGTTA401 TGCCGTCCGA ACGTTCCTAC GCTTCCGCCG AAACCCTTGG GGACAGCGGG451 CAAACCGGCA GTACAGCCGA ACCCGCGGAA ACCGACCAAG ACCGTGCATG501 GCGGGAATAC CTGACTGCTT CTGCCGCCGC ACCCGTCGTA CAGACCGTCG551 AAGTCAGCTA TATCGATACC GCTGTTGAAA CCCCTGTTCC GCATACCACT601 TCGCTGCGTA AACAGGCAAT AAGCCGCAAA CGCGATTTGC GTCCTAAATC651 CCGCGCCAAA CCTAAATTGC GCGTCCGTAA ATCATAA它编码的蛋白质具有氨基酸序列<SEQ ID 14;ORF2a>: 1 MFDFGLGELV FVGIIALIVL GPERLPEAAR TAGRLIGRLQ RFVGSVKQEF51 DTQIELEELR KAKQEFEAAA AQVRDSLKET GTDMEGNLHD ISDGLKPWEK101 LPEQRTPADF GVDENGNPFP DAANTLLDGI SDVMPSERSY ASAETLGDSG151 QTGSTAEPAE TDQDRAWREY LTASAAAPVV QTVEVSYIDT AVETPVPHTT201 SLRKQAISRK RDLRPKSRAK PKLRVRKS*
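The relationship between the nucleotide listings and the amino acid listings above can be illustrated with a short translation sketch. It uses the standard genetic code and the IUPAC ambiguity codes (the lower-case `w`, `r`, `s`, `y` in SEQ ID 9). The helper names `translate_codon` and `translate`, and the rule "emit X unless every possible codon encodes the same residue", are assumptions chosen for illustration; the source does not say which software produced its translations.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate_codon(codon: str) -> str:
    """Translate one codon; return 'X' when the residue is not determined."""
    try:
        options = [IUPAC[base] for base in codon.upper()]
    except KeyError:
        return "X"  # e.g. a '.' placeholder for a missing base
    residues = {CODON_TABLE["".join(c)] for c in product(*options)}
    return residues.pop() if len(residues) == 1 else "X"


def translate(dna: str) -> str:
    """Translate a listing in frame 1, ignoring blanks between blocks."""
    dna = "".join(dna.split())
    return "".join(
        translate_codon(dna[i:i + 3]) for i in range(0, len(dna) - 2, 3)
    )


# First 30 bases of SEQ ID 9 (partial) and SEQ ID 11 (complete).
print(translate("ATGTwTGATT TCGGTTTrGG CGArCTGGTT"))
print(translate("ATGTTTGATT TCGGTTTGGG CGAGCTGGTT"))
```

With this rule the two calls print `MXDFGLGELV` and `MFDFGLGELV`, matching the first ten residues listed for ORF2 and ORF2-1: `TwT` could encode Tyr or Phe and so becomes `X`, whereas `TTr` and `GAr` each encode a single residue.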
The originally identified partial strain B sequence (ORF2) and ORF2a show 97.5% identity over a 118-amino-acid overlap:
10 20 30 40 50 60
orf2.pep MXDFGLGELVFVGIIALIVLGPERXPEAARXAGRLIGRLQRFVGSVKQEFDTQIELEELR
| |||||||||||||||||||||| |||||:|||||||||||||||||||||||||||||
orf2a MFDFGLGELVFVGIIALIVLGPERLPEAARTAGRLIGRLQRFVGSVKQEFDTQIELEELR
10 20 30 40 50 60
70 80 90 100 110 120
orf2.pep KAKQEFEAAAAQVRDSLKETGTDMEGNLHDISDGLKPWEKLPEQRTPADFGVDENGNPXS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf2a KAKQEFEAAAAQVRDSLKETGTDMEGNLHDISDGLKPWEKLPEQRTPADFGVDENGNPFP
70 80 90 100 110 120
130
orf2.pep RCGKHPIRRHFRRYAV
orf2a DAANTLLDGISDVMPSERSYASAETLGDSGQTGSTAEPAETDQDRAWREYLTASAAAPVV
130 140 150 160 170 180
The complete strain B sequence (ORF2-1) and ORF2a show 98.2% identity over a 228-amino-acid overlap:
orf2a.pep MFDFGLGELVFVGIIALIVLGPERLPEAARTAGRLIGRLQRFVGSVKQEFDTQIELEELR 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf2-1 MFDFGLGELVFVGIIALIVLGPERLPEAARTAGRLIGRLQRFVGSVKQEFDTQIELEELR 60
orf2a.pep KAKQEFEAAAAQVRDSLKETGTDMEGNLHDISDGLKPWEKLPEQRTPADFGVDENGNPFP 120
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||:|
orf2-1 KAKQEFEAAAAQVRDSLKETGTDMEGNLHDISDGLKPWEKLPEQRTPADFGVDENGNPLP 120
orf2a.pep DAANTLLDGISDVMPSERSYASAETLGDSGQTGSTAEPAETDQDRAWREYLTASAAAPVV 180
|||||| |||||||||||||||||||||||||||||||||||||||||||||||||||||
orf2-1 DAANTLSDGISDVMPSERSYASAETLGDSGQTGSTAEPAETDQDRAWREYLTASAAAPVV 180
orf2a.pep QTVEVSYIDTAVETPVPHTTSLRKQAISRKRDLRPKSRAKPKLRVRKSX 229
||||||||||||||||||||||||||||||||:||| ||||||||||||
orf2-1 QTVEVSYIDTAVETPVPHTTSLRKQAISRKRDFRPKHRAKPKLRVRKSX 229
进一步的工作鉴定了淋病奈瑟球菌中的部分DNA序列<SEQ ID 15>,它编码下列氨基酸序列<SEQID 16;ORF2ng>:1 MFDFGLGELI FVGIIALIVL GPERLPEAAR TAGRLIGRLQ RFVGSVKQEL51 DTQIELEELR KVKQAFFAAA AQVRDSLKET DTDMQNSLHD ISDGLKPWEK101 LPEQRTPADF GVDEKGNSLS RYGKHRIRRH FRRYAV*进一步的工作鉴定了完整的淋球菌基因序列<SEQ ID 17>:1 ATGTTTGATT TCGGTTTGGG CGAGCTGATT TTTGTCGGCA TTATCGCCCT51 GATTGTCCTT GGTCCAGAAC GCCTGCCCGA AGCCGCCCGC ACTGCCGGAC101 GGCTTATCGG CAGGCTGCAA CGCTTTGTAG GAAGCGTCAA ACAAGAACTT151 GACACTCAAA TCGAACTGGA AGAGCTGAGG AAGGTCAAGC AGGCATTCGA201 AGCTGCCGCC GCTCAGGTTC GAGACAGCCT CAAAGAAACC GATACGGATA251 TGCAGAACAG TCTGCACGAC ATTTCCGACG GTCTGAAGCC TTGGGAAAAA301 CTGCCCGAAC AGCGCACGCc tgccgatttc gGTGTCGATg AAAacggcaa351 tccccttccc gATACGGCAA ACACCGTATC AGACGGCATT TCCGACGTTA401 TGCCGTCTGA ACGTTCCGAT ACTtccgcCG AAACCCTTGG GGACGACAGG451 CAAACCGGCA GTACAGCCGA ACCTGCGGAA ACCGACAAAG ACCGCGCATG501 GCGGGAATAC CTGactgctt ctgccgccgc acctgtcgta Cagagggccg551 tcgaagtcag ctaTATCGAT ACTGCTGTTG AAacgcctgT tccgcaCacc601 acttccctgc gcaAACAGGC AATAAACCGC AAACGCGATT TttgtccgaA651 ACACCGCGCc aAACCGAAat tgcgcgtcCG TAAATCATAA它编码的蛋白质具有氨基酸序列<SEQ ID 18;ORF2ng-1>:1 MFDFGLGELI FVGIIALIVL GPERLPEAAR TAGRLIGRLQ RFVGSVKQEL51 DTQIELEELR KVKQAFEAAA AQVRDSLKET DTDMQNSLHD ISDGLKPWEK101 LPEQRTPADF GVDENGNPLP DTANTVSDGI SDVMPSERSD TSAETLGDDR151 QTGSTAEPAE TDKDRAWREY LTASAAAPVV QRAVEVSYID TAVETPVPHT201 TSLRKQAINR KRDFCPKHRA KPKLRVRKS*
The originally identified partial strain B sequence (ORF2) and ORF2ng show 87.5% identity over a 136-amino-acid overlap:
orf2.pep MXDFGLGELVFVGIIALIVLGPERXPEAARXAGRLIGRLQRFVGSVKQEFDTQIELEELR 60
| |||||||:|||||||||||||| |||||:||||||||||||||||||:||||||||||
orf2ng MFDFGLGELIFVGIIALIVLGPERLPEAARTAGRLIGRLQRFVGSVKQELDTQIELEELR 60
orf2.pep KAKQEFEAAAAQVRDSLKETGTDMEGNLHDISDGLKPWEKLPEQRTPADFGVDENGNPXS 120
|:|| ||||||||||||||| |||:::|||||||||||||||||||||||||||:||
orf2ng KVKQAFEAAAAQVRDSLKETDTDMQNSLHDISDGLKPWEKLPEQRTPADFGVDEKGNSLP 120
orf2.pep RCGKHPIRRHFRRYAV 136
| ||| ||||||||||
orf2ng RYGKHRIRRHFRRYAV 136
The complete strain B and gonococcal sequences (ORF2-1 and ORF2ng-1) show 91.7% identity over a 229-amino-acid overlap:
10 20 30 40 50 60
orf2-1.pep MFDFGLGELVFVGIIALIVLGPERLPEAARTAGRLIGRLQRFVGSVKQEFDTQIELEELR
|||||||||:|||||||||||||||||||||||||||||||||||||||:||||||||||
orf2ng-1 MFDFGLGELIFVGIIALIVLGPERLPEAARTAGRLIGRLQRFVGSVKQELDTQIELEELR
10 20 30 40 50 60
70 80 90 100 110 120
orf2-1.pep KAKQEFEAAAAQVRDSLKETGTDMEGNLHDISDGLKPWEKLPEQRTPADFGVDENGNPLP
|:|| ||||||||||||||| |||:::|||||||||||||||||||||||||||||||||
orf2ng-1 KVKQAFEAAAAQVRDSLKETDTDMQNSLHDISDGLKPWEKLPEQRTPADFGVDENGNPLP
70 80 90 100 110 120
130 140 150 160 170 180
orf2-1.pep DAANTLSDGISDVMPSERSYASAETLGDSGQTGSTAEPAETDQDRAWREYLTASAAAPVV
|:|||:||||||||||||| :|||||||: ||||||||||||:|||||||||||||||||
orf2ng-1 DTANTVSDGISDVMPSERSDTSAETLGDDRQTGSTAEPAETDKDRAWREYLTASAAAPVV
130 140 150 160 170 180
190 200 210 220 229
orf2-1.pep Q-TVEVSYIDTAVETPVPHTTSLRKQAISRKRDFRPKHRAKPKLRVRKSX
| :|||||||||||||||||||||||||:||||| |||||||||||||||
orf2ng-1 QRAVEVSYIDTAVETPVPHTTSLRKQAINRKRDFCPKHRAKPKLRVRKSX
190 200 210 220 230
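Computer analysis of these amino acid sequences indicated a transmembrane region (underlined) and also revealed homology between the gonococcal sequence and the TatB protein of E. coli (34% identity and 59% positives in the BLAST output):
gnl|PID|e1292181 (AJ005830) TatB protein [Escherichia coli]
Length = 171
Score = 56.6 bits (134), Expect = 1e-07
Identities = 30/88 (34%), Positives = 52/88 (59%), Gaps = 1/88 (1%)
Query: 1 MFDFGLGELIFVGIIALIVLGPERLPEAARTAGRLIGRLQRFVGSVKQELDTQIELEELR 60
MFD G EL+ V II L+VLGP+RLP A +T I L+ +V+ EL +++L+E +
Sbjct: 1 MFDIGFSELLLVFIIGLVVLGPQRLPVAVKTVAGWIRALRSLATTVQNELTQELKLQEFQ 60
Query: 61 -KVKQAFEAAAAQVRDSLKETDTDMQNS 87
+K+ +A+ + LK + +++ +
Sbjct: 61 DSLKKVEKASLTNLTPELKASMDELRQA 88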
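Based on this analysis, it is predicted that ORF2, ORF2a and ORF2ng are likely to be membrane proteins, and therefore that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.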
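The text does not name the program used to predict the transmembrane region. As a stand-in, the sketch below applies a Kyte-Doolittle sliding-window hydropathy scan, a different but widely used technique, to the first 50 residues of ORF2-1. The function names, the 19-residue window, and the treatment of unknown residues as neutral (score 0) are illustrative assumptions.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 19) -> list[float]:
    """Mean hydropathy of every full window along the sequence."""
    # Unknown residues such as 'X' are scored 0.0 (assumption).
    scores = [KD.get(aa, 0.0) for aa in protein]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def most_hydrophobic_window(protein: str, window: int = 19) -> tuple[int, float]:
    """Return (0-based start, mean score) of the highest-scoring window."""
    profile = hydropathy_profile(protein, window)
    start = max(range(len(profile)), key=profile.__getitem__)
    return start, profile[start]


# Residues 1-50 of ORF2-1 <SEQ ID 12>.
ORF2_1_NTERM = "MFDFGLGELVFVGIIALIVLGPERLPEAARTAGRLIGRLQRFVGSVKQEF"

start, score = most_hydrophobic_window(ORF2_1_NTERM)
print(f"best window starts at residue {start + 1}, mean hydropathy {score:.2f}")
```

A window mean above roughly 1.6 is a common rule of thumb for a membrane-spanning segment; the N-terminal stretch of ORF2-1 scores well above that, which is consistent with the membrane localisation proposed in the text but is not a substitute for the original analysis.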
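As described above, ORF2-1 (16 kDa) was cloned into pET and pGEX vectors and expressed in E. coli. The products of protein expression and purification were analysed by SDS-PAGE. Figure 3A shows the results of affinity purification of the GST-fusion protein, and Figure 3B shows the results of expression of the His-fusion in E. coli. Purified GST-fusion protein was used to immunise mice, and the mouse sera were used for Western blot (Figure 3C), ELISA (positive result) and FACS analysis (Figure 3D). These experiments confirm that ORF2-1 is a surface-exposed protein and that it is a useful immunogen.
Example 10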
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 19>:1 ATGCAAGCAC GGCTGCTGAT ACCTATTCTT TTTTCAGTTT TTATTTTATC51 CGC.TGCGGG ACACTGACAG GTATTCCATC GCATGGCGgA GkTAAACgCT101 TTgCGGTCGA ACAAGAACTT GTGGCCGCTT CTGCCAGAGC TGCCGTTAAA151 GACATGGATT TACAGGCATT ACACGGACGA AAAGTTGCAT TGTACATTGC201 CACTATGGGC GACCAAGGTT CAGGcAGTTT GACAGGGGGG TCGCTACTCC251 ATTGATGCAC kGrTwCsTGG CGAATACATA AACAGCCCTG CCGTCCGTAC301 CGATTACACC TATCCACGTT ACGAAACCAC CGCTGAAACA ACATCAGGCG351 GTTTGACAGG TTTAACCACT TCTTTATCTA CACTTAATGC CCCTGCACTC401 TCTCGCACCC AATCAGACGG TAGCGGAAGT AAAAGCAGTC TGGGCTTAAA451 TATTGGCGGG ATGGGGGATT ATCGAAATGA AACCTTGACG ACTAACCCGC501 GCGACACTGC CTTTCTTTCC CACTTGGTAC AGACCGTATT TTTCCTGCGC551 GGCATAGACG TTGTTTCTCC TGCCAATGCC GATACAGATG TGTTTATTAA601 CATCGACGTA TTCGGAACGA TACGCAACAG AACCGAAATG..它对应于氨基酸序列<SEQ ID 20;ORF15>:1 MQARLLIPIL FSVFILSACG TLTGIPSHGG XKRFAVEQEL VAASARAAVK51 DMDLQALHGR KVALYIATMG DQGSGSLTGG RYSIDAXXXG EYINSPAVRT101 DYTYPRYETT AETTSGGLTG LTTSLSTLNA PALSRTQSDG SGSKSSLGLN151 IGGMGDYRNE TLTTNPRDTA FLSHLVQTVF FLRGIDVVSP ANADTDVFIN201 IDVFGTIRNR TEM..进一步的工作揭示了完整的核苷酸序列<SEQ ID 21>:1 ATGCAAGCAC GGCTGCTGAT ACCTATTCTT TTTTCAGTTT TTATTTTATC51 CGCCTGCGGG ACACTGACAG GTATTCCATC GCATGGCGGA GGTAAACGCT101 TTGCGGTCGA ACAAGAACTT GTGGCCGCTT CTGCCAGAGC TGCCGTTAAA151 GACATGGATT TACAGGCATT ACACGGACGA AAAGTTGCAT TGTACATTGC201 CACTATGGGC GACCAAGGTT CAGGCAGTTT GACAGGGGGT CGCTACTCCA251 TTGATGCACT GATTCGTGGC GAATACATAA ACAGCCCTGC CGTCCGTACC301 GATTACACCT ATCCACGTTA CGAAACCACC GCTGAAACAA CATCAGGCGG351 TTTGACAGGT TTAACCACTT CTTTATCTAC ACTTAATGCC CCTGCACTCT401 CTCGCACCCA ATCAGACGGT AGCGGAAGTA AAAGCAGTCT GGGCTTAAAT451 ATTGGCGGGA TGGGGGATTA TCGAAATGAA ACCTTGACGA CTAACCCGCG501 CGACACTGCC TTTCTTTCCC ACTTGGTACA GACCGTATTT TTCCTGCGCG551 GCATAGACGT TGTTTCTCCT GCCAATGCCG ATACAGATGT GTTTATTAAC601 ATCGACGTAT TCGGAACGAT ACGCAACAGA ACCGAAATGC ACCTATACAA651 TGCCGAAACA CTGAAAGCCC AAACAAAACT GGAATATTTC GCAGTAGACA701 GAACCAATAA AAAATTGCTC ATCAAACCAA AAACCAATGC GTTTGAAGCT751 GCCTATAAAG AAAATTACGC ATTGTGGATG GGGCCGTATA 
AAGTAAGCAA801 AGGAATTAAA CCGACGGAAG GATTAATGGT CGATTTCTCC GATATCCGAC851 CATACGGCAA TCATACGGGT AACTCCGCCC CATCCGTAGA GGCTGATAAC901 AGTCATGAGG GGTATGGATA CAGCGATGAA GTAGTGCGAC AACATAGACA951 AGGACAACCT TGA它对应于氨基酸序列<SEQ ID 22;ORF15-1>:1 MQARLLIPIL FSVFILSACG TLTGIPSHGG GKRFAVEQEL VAASARAAVK51 DMDLQALHGR KVALYIATMG DQGSGSLTGG RYSIDALIRG EYINSPAVRT101 DYTYPRYETT AETTSGGLTG LTTSLSTLNA PALSRTQSDG SGSKSSLGLN151 IGGMGDYRNE TLTTNPRDTA FLSHLVQTVF FLRGIDVVSP ANADTDVFIN201 IDVFGTIRNR TEMHLYNAET LKAQTKLEYF AVDRTNKKLL IKPKTNAFEA251 AYKENYALWM GPYKVSKGIK PTEGLMVDFS DIRPYGNHTG NSAPSVEADN301 SHEGYGYSDE VVRQHRQGQP *进一步的工作鉴定了脑膜炎奈瑟球菌菌株A中对应的基因<SEQ ID 23>:1 ATGCAAGCAC GGCTGCTGAT ACCTATTCTT TTTTCAGTTT TTATTTTATC51 CGCCTGCGGG ACACTGACAG GTATTCCATC GCATGGCGGA GGTAAACGCT101 TTGCGGTCGA ACAAGAACTT GTGGCCGCTT CTGCCAGAGC TGCCGTTAAA151 GACATGGATT TACAGGCATT ACACGGACGA AAAGTTGCAT TGTACATTGC201 AACTATGGGC GACCAAGGTT CAGGCAGTTT GACAGGGGGT CGCTACTCCA251 TTGATGCACT GATTCGTGGC GAATACATAA ACAGCCCTGC CGTCCGTACC301 GATTACACCT ATCCACGTTA CGAAACCACC GCTGAAACAA CATCAGGCGG351 TTTGACAGGT TTAACCACTT CTTTATCTAC ACTTAATGCC CCTGCACTCT401 CGCGCACCCA ATCAGACGGT AGCGGAAGTA AAAGCAGTCT GGGCTTAAAT451 ATTGGCGGGA TGGGGGATTA TCGAAATGAA ACCTTGACGA CTAACCCGCG501 CGACACTGCC TTTCTTTCCC ACTTGGTACA GACCGTATTT TTCCTGCGCG551 GCATAGACGT TGTTTCTCCT GCCAATGCCG ATACGGATGT GTTTATTAAC601 ATCGACGTAT TCGGAACGAT ACGCAACAGA ACCGAAATGC ACCTATACAA651 TGCCGAAACA CTGAAAGCCC AAACAAAACT GGAATATTTC GCAGTAGACA701 GAACCAATAA AAAATTGCTC ATCAAACCAA AAACCAATGC GTTTGAAGCT751 GCCTATAAAG AAAATTACGC ATTGTGGATG GGACCGTATA AAGTAAGCAA801 AGGAATTAAA CCGACAGAAG GATTAATGGT CGATTTCTCC GATATCCAAC851 CATACGGCAA TCATATGGGT AACTCTGCCC CATCCGTAGA GGCTGATAAC901 AGTCATGAGG GGTATGGATA CAGCGATGAA GCAGTGCGAC GACATAGACA951 AGGGCAACCT TGA它编码的蛋白质具有氨基酸序列<SEQ ID 24;ORF15a>:1 MQARLLIPIL FSVFILSACG TLTGIPSHGG GKRFAVEQEL VAASARAAVK51 DMDLQALHGR KVALYIATMG DQGSGSLTGG RYSIDALIRG EYINSPAVRT101 DYTYPRYETT AETTSGGLTG LTTSLSTLNA PALSRTQSDG SGSKSSLGLN151 IGGMGDYRNE TLTTNPRDTA FLSHLVQTVF 
FLRGIDVVSP ANADTDVFIN201 IDVFGTIRNR TEMHLYNAET LKAQTKLEYF AVDRTNKKLL IKPKTNAFEA251 AYKENYALWM GPYKVSKGIK PTEGLMVDFS DIQPYGNHMG NSAPSVEADN301 SHEGYGYSDE AVRRHRQGQP *
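As a cross-check on the nucleotide/protein pairs listed above, a DNA fragment can be translated with the standard genetic code. The following is a minimal sketch, not the procedure used by the applicants. The function name `translate` is hypothetical, and the rule that any codon containing an ambiguous or unread base (for example `k`, `r`, `N` or `.`) is reported as `X` is an assumption made for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; unresolved codons become 'X', stops '*'."""
    seq = "".join(dna.split()).upper()  # drop the spacing used in the listings
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        protein.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(protein)


# First seven codons of SEQ ID 21, then a fragment of SEQ ID 19 with an ambiguous base.
print(translate("ATGCAAGCAC GGCTGCTGAT A"))  # MQARLLI
print(translate("GGAGkTAAA"))                # GXK
```

The second call illustrates why the partial sequence ORF15 carries an `X` at residue 31: the codon `GkT` cannot be resolved to a single amino acid.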
The originally identified partial strain B sequence (ORF15) shows 98.1% identity over a 213 aa overlap with ORF15a:
10 20 30 40 50 60
orf15.pep MQARLLIPILFSVFILSACGTLTGIPSHGGXKRFAVEQELVAASARAAVKDMDLQALHGR
|||||||||||||||||||||||||||||| |||||||||||||||||||||||||||||
orf15a MQARLLIPILFSVFILSACGTLTGIPSHGGGKRFAVEQELVAASARAAVKDMDLQALHGR
10 20 30 40 50 60
70 80 90 100 110 120
orf15.pep KVALYIATMGDQGSGSLTGGRYSIDAXXXGEYINSPAVRTDYTYPRYETTAETTSGGLTG
|||||||||||||||||||||||||| |||||||||||||||||||||||||||||||
orf15a KVALYIATMGDQGSGSLTGGRYSIDALIRGEYINSPAVRTDYTYPRYETTAETTSGGLTG
70 80 90 100 110 120
130 140 150 160 170 180
orf15.pep LTTSLSTLNAPALSRTQSDGSGSKSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf15a LTTSLSTLNAPALSRTQSDGSGSKSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF
130 140 150 160 170 180
190 200 210
orf15.pep FLRGIDVVSPANADTDVFINIDVFGTIRNRTEM
|||||||||||||||||||||||||||||||||
orf15a FLRGIDVVSPANADTDVFINIDVFGTIRNRTEMHLYNAETLKAQTKLEYFAVDRTNKKLL
190 200 210 220 230 240
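The identity figure quoted for this alignment can be reproduced by counting matching columns. The sketch below is illustrative only: the function name `percent_identity` is hypothetical, the alignment is assumed to be gap-free, and an unresolved residue (`X`) is assumed to count as a mismatch, which is consistent with the 98.1% reported for ORF15 against ORF15a.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity of two gap-free, equal-length aligned sequences.

    Assumption: a column containing 'X' (unresolved residue) is a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "X")
    return 100.0 * matches / len(a)


# ORF15 (partial strain B sequence, SEQ ID 20), 213 residues.
ORF15 = (
    "MQARLLIPILFSVFILSACGTLTGIPSHGGXKRFAVEQELVAASARAAVKDMDLQALHGR"
    "KVALYIATMGDQGSGSLTGGRYSIDAXXXGEYINSPAVRTDYTYPRYETTAETTSGGLTG"
    "LTTSLSTLNAPALSRTQSDGSGSKSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF"
    "FLRGIDVVSPANADTDVFINIDVFGTIRNRTEM"
)
# In the overlap, ORF15a differs from ORF15 only where ORF15 is unresolved.
ORF15A_OVERLAP = ORF15.replace("GGXKRF", "GGGKRF").replace("DAXXXG", "DALIRG")

print(f"{percent_identity(ORF15, ORF15A_OVERLAP):.1f}%")  # 98.1%
```

Four unresolved positions out of 213 give 209/213, which rounds to the stated 98.1%.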
The complete strain B sequence (ORF15-1) and ORF15a show 98.8% identity over a 320 aa overlap:
10 20 30 40 50 60
orf15a.pep MQARLLIPILFSVFILSACGTLTGIPSHGGGKRFAVEQELVAASARAAVKDMDLQALHGR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf15-1 MQARLLIPILFSVFILSACGTLTGIPSHGGGKRFAVEQELVAASARAAVKDMDLQALHGR
10 20 30 40 50 60
70 80 90 100 110 120
orf15a.pep KVALYIATMGDQGSGSLTGGRYSIDALIRGEYINSPAVRTDYTYPRYETTAETTSGGLTG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf15-1 KVALYIATMGDQGSGSLTGGRYSIDALIRGEYINSPAVRTDYTYPRYETTAETTSGGLTG
70 80 90 100 110 120
130 140 150 160 170 180
orf15a.pep LTTSLSTLNAPALSRTQSDGSGSKSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf15-1 LTTSLSTLNAPALSRTQSDGSGSKSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF
130 140 150 160 170 180
190 200 210 220 230 240
orf15a.pep FLRGIDVVSPANADTDVFINIDVFGTIRNRTEMHLYNAETLKAQTKLEYFAVDRTNKKLL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf15-1 FLRGIDVVSPANADTDVFINIDVFGTIRNRTEMHLYNAETLKAQTKLEYFAVDRTNKKLL
190 200 210 220 230 240
250 260 270 280 290 300
orf15a.pep IKPKTNAFEAAYKENYALWMGPYKVSKGIKPTEGLMVDFSDIQPYGNHMGNSAPSVEADN
||||||||||||||||||||||||||||||||||||||||||:||||| |||||||||||
orf15-1 IKPKTNAFEAAYKENYALWMGPYKVSKGIKPTEGLMVDFSDIRPYGNHTGNSAPSVEADN
250 260 270 280 290 300
310 320
orf15a.pep SHEGYGYSDEAVRRHRQGQPX
||||||||||:||:|||||||
orf15-1 SHEGYGYSDEVVRQHRQGQPX
310 320
进一步的工作鉴定了淋病奈瑟球菌中对应的基因<SEQ ID 25>:1 ATGCGGGCAC GGCTGCTGAT ACCTATTCTT TTTTCAGTTT TTATTTTATC51 CGCCTGCGGG ACACTGACAG GTATTCCATC GCATGGCGGA GGCAAACGCT101 TCGCGGTCGA ACAAGAACTT GTGGCCGCTT CTGCCAGAGC TGCCGTTAAA151 GACATGGATT TACAGGCATT ACACGGACGA AAAGTTGCAT TGTACATTGC201 AACTATGGGC GACCAAGGTT CAGGCAGTTT GACAGGGGGT CGCTACTCCA251 TTGATGCACT GATTCGCGGC GAATACATAA ACAGCCCTGC CGTCCGCACC301 GATTACACCT ATCCGCGTTA CGAAACCACC GCTGAAACAA CATCAGGCGG351 TTTGACGGGT TTAACCACTT CTTTATCTAC ACTTAATGCC CCTGCACTCT401 CGCGCACCCA ATCAGACGGT AGCGGAAGTA GGAGCAGTCT GGGCTTAAAT451 ATTGGCGGGA TGGGGGATTA TCGAAATGAA ACCTTGACGA CCAACCCGCG501 CGACACTGCC TTTCTTTCCC ACTTGGTGCA GACCGTATTT TTCCTGCGCG551 GCATAGACGT TGTTTCTCCT GCCAATGCCG ATACAGATGT GTTTATTAAC601 ATCGACGTAT TCGGAACGAT ACGCAACAGA ACCGAAATGC ACCTATACAA651 TGCCGAAACA CTGAAAGCCC AAACAAAACT GGAATATTTC GCAGTAGACA701 GAACCAATAA AAAATTGCTC ATCAAACCCA AAACCAATGC GTTTGAAGCT751 GCCTATAAAG AAAATTACGC ATTGTGGATG GGGCCGTATA AAGTAAGCAA801 AGGAATCAAA CCGACGGAAG GATTGATGGT CGATTTCTCC GATATCCAAC851 CATACGGCAA TCATACGGGT AACTCCGCCC CATCCGTAGA GGCTGATAAC901 AGTCATGAGG GGTATGGATA CAGCGATGAA GCAGTGCGAC AACATAGACA951 AGGGCAACCT TGA它编码的蛋白质具有氨基酸序列<SEQ ID 26;ORF15ng>:1 MRARLLIPIL FSVFILSACG TLTGIPSHGG GKRFAVEQEL VAASARAAVK51 DMDLQALHGR KVALYIATMG DQGSGSLTGG RYSIDALIRG EYINSPAVRT101 DYTYPRYETT AETTSGGLTG LTTSLSTLNA PALSRTQSDG SGSRSSLGLN151 IGGMGDYRNE TLTTNPRDTA FLSHLVQTVF FLRGIDVVSP ANADTDVFIN201 IDVFGTIRNR TEMHLYNAET LKAQTKLEYF AVDRTNKKLL IKPKTNAFEA251 AYKENYALWM GPYKVSKGIK PTEGLMVDFS DIQPYGNHTG NSAPSVEADN301 SHEGYGYSDE AVRQHRQGQP *
The originally identified partial strain B sequence (ORF15) shows 97.2% identity over a 213 aa overlap with ORF15ng:
orf15.pep MQARLLIPILFSVFILSACGTLTGIPSHGGXKRFAVEQELVAASARAAVKDMDLQALHGR 60
|:|||||||||||||||||||||||||||| |||||||||||||||||||||||||||||
orf15ng MRARLLIPILFSVFILSACGTLTGIPSHGGGKRFAVEQELVAASARAAVKDMDLQALHGR 60
orf15.pep KVALYIATMGDQGSGSLTGGRYSIDAXXXGEYINSPAVRTDYTYPRYETTAETTSGGLTG 120
|||||||||||||||||||||||||| |||||||||||||||||||||||||||||||
orf15ng KVALYIATMGDQGSGSLTGGRYSIDALIRGEYINSPAVRTDYTYPRYETTAETTSGGLTG 120
orf15.pep LTTSLSTLNAPALSRTQSDGSGSKSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF 180
|||||||||||||||||||||||:||||||||||||||||||||||||||||||||||||
orf15ng LTTSLSTLNAPALSRTQSDGSGSRSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF 180
orf15.pep FLRGIDVVSPANADTDVFINIDVFGTIRNRTEM 213
|||||||||||||||||||||||||||||||||
orf15ng FLRGIDVVSPANADTDVFINIDVFGTIRNRTEMHLYNAETLKAQTKLEYFAVDRTNKKLL 240
The complete strain B sequence (ORF15-1) and ORF15ng show 98.8% identity over a 320 aa overlap:
10 20 30 40 50 60
orf15-1.pep MQARLLIPILFSVFILSACGTLTGIPSHGGGKRFAVEQELVAASARAAVKDMDLQALHGR
|:||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf15ng MRARLLIPILFSVFILSACGTLTGIPSHGGGKRFAVEQELVAASARAAVKDMDLQALHGR
10 20 30 40 50 60
70 80 90 100 110 120
orf15-1.pep KVALYIATMGDQGSGSLTGGRYSIDALIRGEYINSPAVRTDYTYPRYETTAETTSGGLTG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf15ng KVALYIATMGDQGSGSLTGGRYSIDALIRGEYINSPAVRTDYTYPRYETTAETTSGGLTG
70 80 90 100 110 120
130 140 150 160 170 180
orf15-1.pep LTTSLSTLNAPALSRTQSDGSGSKSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF
|||||||||||||||||||||||:||||||||||||||||||||||||||||||||||||
orf15ng LTTSLSTLNAPALSRTQSDGSGSRSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF
130 140 150 160 170 180
190 200 210 220 230 240
orf15-1.pep FLRGIDVVSPANADTDVFINIDVFGTIRNRTEMHLYNAETLKAQTKLEYFAVDRTNKKLL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf15ng FLRGIDVVSPANADTDVFINIDVFGTIRNRTEMHLYNAETLKAQTKLEYFAVDRTNKKLL
190 200 210 220 230 240
250 260 270 280 290 300
orf15-1.pep IKPKTNAFEAAYKENYALWMGPYKVSKGIKPTEGLMVDFSDIRPYGNHTGNSAPSVEADN
||||||||||||||||||||||||||||||||||||||||||:|||||||||||||||||
orf15ng IKPKTNAFEAAYKENYALWMGPYKVSKGIKPTEGLMVDFSDIQPYGNHTGNSAPSVEADN
250 260 270 280 290 300
310 320
orf15-1.pep SHEGYGYSDEVVRQHRQGQPX
||||||||||:||||||||||
orf15ng SHEGYGYSDEAVRQHRQGQPX
310 320
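Computer analysis of these amino acid sequences reveals an ILSAC motif (a putative membrane lipoprotein lipid attachment site, as predicted by the MOTIFS program), which suggests a putative leader sequence. On this basis, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.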
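The position of the ILSAC motif can be located with a short script. This is a sketch under stated assumptions: the simplified "lipobox" pattern `[LVI][ASTVI][GAS]C` is an illustrative stand-in and is not the rule applied by the MOTIFS program, which is not reproduced in this document. The function name `find_lipobox_cysteine` and the 40-residue search window are likewise hypothetical.

```python
import re

# Assumption: a simplified "lipobox" pattern, [LVI][ASTVI][GAS]C, stands in for
# the MOTIFS program's rule, which is not reproduced in this document.
LIPOBOX = re.compile(r"[LVI][ASTVI][GAS]C")


def find_lipobox_cysteine(protein: str, search_window: int = 40):
    """Return the 1-based position of the lipobox cysteine, or None."""
    match = LIPOBOX.search(protein[:search_window])
    return match.end() if match else None


# N-terminus shared by ORF15-1 and ORF15a (SEQ ID 22 and SEQ ID 24).
n_terminus = "MQARLLIPILFSVFILSACGTLTGIPSHGGGKRFAVEQEL"
print(n_terminus.find("ILSAC") + 1)       # 15: start of the ILSAC motif
print(find_lipobox_cysteine(n_terminus))  # 19: the cysteine of the motif
```

A cysteine at residue 19 preceded by a hydrophobic stretch is what one would expect if the first 18 residues act as a leader sequence, in line with the interpretation given in the text.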
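ORF15-1 (31.7 kDa) was cloned in the pET and pGEX vectors and expressed in E. coli, as described above. The products of protein expression and purification were analysed by SDS-PAGE. Figure 4A shows the results of affinity purification of the GST-fusion protein, and Figure 4B shows the results of expression of the His-fusion in E. coli. Purified GST-fusion protein was used to immunise mice, whose sera were used for Western blot (Figure 4C) and ELISA (positive result). These results confirm that ORF15-1 is a surface-exposed protein and a useful immunogen.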
Example 11
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 27>:1 ..GG.CAGCACA AAAAACAGGC GGTTGAACGG AAAAACCGTA TTTACGATGA51 TGCCGGGTAT GATATTCGGC GTATTCACGG GCGCATTCTC CGCAAAATAT101 ATCCCCGCGT TCGGGCTTCA AATTTTCTTC ATCCTGTTTT TAACCGCCGT151 CGCATTCAAA ACACTGCATA CCGACCCTCA GACGGCATCC CGCCCGCTGC201 CCGGACTGCC CrGACTGACT GCGGTTTCCA CACTGTTCGG CACAATGTCG251 AGCTGGGTCG GCATAGGCGG CGGTTCACTT TCCGTCCCCT TCTTAATCCA301 CTGCGGCTTC CCCGCCCATA AAGCCATCGG CACATCATCC GGCCTTGCCT351 GGCCGATTGC ACTCTCCGGC GCAATATCGT ATCTGCTCAA CGGCCTGAAT401 ATTGCAGGAT TGCCCGAAGG GTCACTGGGC TTCCTTTACC TGCCCGCCGT451 CGCCGTCCTC AGCGCGGCAA CCATTGCCTT TGCCCCGCTC GGTGTCAAAA501 CCGCCCACAA ACTTTCTTCT GCCAAACTCA AAAAATC.TT CGGCATTATG551 TTGCTTTTGA TTGCCGGAAA AATGCTGTAC AACCTGCTTT AA它对应于氨基酸序列<SEQ ID 28;ORF17>:1 ..GQHKKQAVNG KTVFTMMPGM IFGVFTGAFS AKYIPAFGLQ IFFILFLTAV51 AFKTLHTDPQ TASRPLPGLP XLTAVSTLFG TMSSWVGIGG GSLSVPFLIH101 CGFPAHKAIG TSSGLAWPIA LSGAISYLLN GLNIAGLPEG SLGFLYLPAV151 AVLSAATIAF APLGVKTAHK LSSAKLKKSF GIMLLLIAGK MLYNLL*进一步的工作揭示了完整的核苷酸序列<SEQ ID 29>:1 ATGTGGCATT GGGACATTAT CTTAATCCTG CTTGCCGTAG GCAGTGCGGC 51 AGGTTTTATT GCCGGCCTGT TCGGCGTAGG CGGCGGCACG CTGATTGTCC101 CTGTCGTTTT ATGGGTGCTT GATTTGCAGG GTTTGGCACA ACATCCTTAC151 GCGCAACACC TCGCCGTCGG CACATCCTTC GCCGTCATGG TCTTCACCGC201 CTTTTCCAGT ATGCTGGGGC AGCACAAAAA ACAGGCGGTC GACTGGAAAA251 CCGTATTTAC GATGATGCCG GGTATGATAT TCGGCGTATT CACGGGCGCA301 CTCTCCGCAA AATATATCCC CGCGTTCGGG CTTCAAATTT TCTTCATCCT351 GTTTTTAACC GCCGTCGCAT TCAAAACACT GCATACCGAC CCTCAGACGG401 CATCCCGCCC GCTGCCCGGA CTGCCCGGAC TGACTGCGGT TTCCACACTG451 TTCGGCACAA TGTCGAGCTG GGTCGGCATA GGCGGCGGTT CACTTTCCGT501 CCCCTTCTTA ATCCACTGCG GCTTCCCCGC CCATAAAGCC ATCGGCACAT551 CATCCGGCCT TGCCTGGCCG ATTGCACTCT CCGGCGCAAT ATCGTATCTG601 CTCAACGGCC TGAATATTGC AGGATTGCCC GAAGGGTCAC TGGGCTTCCT651 TTACCTGCCC GCCGTCGCCG TCCTCAGCGC GGCAACCATT GCCTTTGCCC701 CGCTCGGTGT CAAAACCGCC CACAAACTTT CTTCTGCCAA ACTCAAAAAA751 Tc.TTCGGCA TTATGTTGCT TTTGATTGCC GGAAAAATGC TGTACAACCT801 GCTTTAA它对应于氨基酸序列<SEQ ID 30;ORF17-1>:1 MWHWDIILIL LAVGSAAGFI 
AGLFGVGGGT LIVPVVLWVL DLQGLAQHPY51 AQHLAVGTSF AVMVFTAFSS MLGQHKKQAV DWKTVFTMMP GMIFGVFTGA101 LSAKYIPAFG LQIFFILFLT AVAFKTLHTD PQTASRPLPG LPGLTAVSTL151 FGTMSSWVGI GGGSLSVPFL IHCGFPAHKA IGTSSGLAWP IALSGAISYL201 LNGLNIAGLP EGSLGFLYLP AVAVLSAATI AFAPLGVKTA HKLSSAKLKK251 XFGIMLLLIA GKMLYNLL*
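Computer analysis of this amino acid sequence gave the following results:
Homology with the hypothetical transmembrane protein HI0902 of H. influenzae (accession number P44070):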
ORF17 and the HI0902 protein show 28% aa identity over a 192 aa overlap:
ORF17 3 HKKQAVNGKTVFTMMPGMIFGVFT-GAFSAKYIPAFGLQIF--FILFLTAVAFKTLHTDP 59
HK + + V + P ++ VF G F + +IF +++L ++ D
HI0902 72 HKLGNIVWQAVRILAPVIMLSVFICGLFIGRLDREISAKIFACLVVYLATKMVLSIKKD- 130
ORF17 60 QTASRPLPGLPXLTAVSTLFGTMSSWVGIGGGSLSVPFLIHCGFPAHKAIGTSSGLAWPI 119
Q ++ L L + L G SS GIGGG VPFL G +AIG+S+ +
HI0902 131 QVTTKSLTPLSSVIG-GILIGMASSAAGIGGGGFIVPFLTARGINIKQAIGSSAECGMLL 189
ORF17 120 ALSGAISYLLNGLNIAGLPEGSLGFLYLPAVAVLSAATIAFAPLGVXXXXXXXXXXXXXX 179
+SG S++++G +PE SLG++YLPAV ++A + + LG
HI0902 190 GISGMFSFIVSGWGNPLMPEYSLGYIYLPAVLGITATSFFTSKLGASATAKLPVSTLKKG 249
ORF17 180 FGIMLLLIAGKM 191
F + L+++A M
HI0902 250 FALELIVVAINM 261
Homology with a predicted ORF from N. meningitidis (strain A)
ORF17 shows 96.9% identity over a 196 aa overlap with an ORF (ORF17a) from strain A of N. meningitidis:
10 20 30
orf17.pep GQHKKQAVNGKTVFTMMPGMIFGVFTGAFS
||||||||: ||||||||||:||||:||:|
orf17a QGLAQHPYAQHLAVGTSFAVMVFTAFSSMLGQHKKQAVDWKTVFTMMPGMVFGVFAGALS
50 60 70 80 90 100
40 50 60 70 80 90
orf17.pep AKYIPAFGLQIFFILFLTAVAFKTLHTDPQTASRPLPGLPXLTAVSTLFGTMSSWVGIGG
|||||||||||||||||||||||||||||||||||||||| |||||||||||||||||||
orf17a AKYIPAFGLQIFFILFLTAVAFKTLHTDPQTASRPLPGLPGLTAVSTLFGTMSSWVGIGG
110 120 130 140 150 160
100 110 120 130 140 150
orf17.pep GSLSVPFLIHCGFPAHKAIGTSSGLAWPIALSGAISYLLNGLNIAGLPEGSLGFLYLPAV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf17a GSLSVPFLIHCGFPAHKAIGTSSGLAWPIALSGAISYLLNGLNIAGLPEGSLGFLYLPAV
170 180 190 200 210 220
160 170 180 190
orf17.pep AVLSAATIAFAPLGVKTAHKLSSAKLKKSFGIMLLLIAGKMLYNLLX
|||||||||||||||||||||||||||||||||||||||||||||||
orf17a AVLSAATIAFAPLGVKTAHKLSSAKLKKSFGIMLLLIAGKMLYNLLX
230 240 250 260全长ORF17a核苷酸序列<SEQ ID 31>是:1 ATGTGGCATT GGGACATTAT CTTAATCCTG CTrGCCGTAG GCAGTGCGGC51 AGGTTTTATT GCCGGCCTGT TCGGCGTAGG CGGCGGCACG CTGATTGTCC101 CTGTCGTTTT ATGGGTGCTT GATTTGCAGG GTTTGGCACA ACATCCTTAC151 GCGCAACACC TCGCCGTCGG CACATCCTTC GCCGTCATGG TCTTCACCGC201 CTTTTCCAGT ATGCTGGGGC AGCACAAAAA ACAGGCGGTC GACTGGAAAA251 CCGTATTTAC GATGATGCCG GGTATGGTAT TCGGCGTATT CGCTGGCGCA301 CTCTCCGCAA AATATATCCC AGCGTTCGGG CTTCAAATTT TCTTCATCCT351 GTTTTTAACC GCCGTCGCAT TCAAAACACT GCATACCGAC CCTCAGACGG401 CATCCCGCCC GCTGCCCGGA CTGCCCGGAC TGACTGCGGT TTCCACACTG451 TTCGGCACAA TGTCGAGCTG GGTCGGCATA GGCGGCGGTT CACTTTCCGT501 CCCCTTCTTA ATCCACTGCG GCTTCCCCGC CCATAAAGCC ATCGGCACAT551 CATCCGGCCT TGCCTGGCCG ATTGCACTCT CCGGCGCAAT ATCGTATCTG601 CTCAACGGCC TGAATATTGC AGGATTGCCC GAAGGGTCAC TGGGCTTCCT651 TTACCTGCCC GCCGTCGCCG TCCTCAGCGC GGCAACCATT GCCTTTGCCC701 CGCTCGGTGT CAAAACCGCC CACAAACTTT CTTCTGCCAA ACTCAAAAAA751 TCCTTCGGCA TTATGTTGCT TTTGATTGCC GGAAAAATGC TGTACAACCT801 GCTTTAA它编码的蛋白质具有氨基酸序列<SEQ ID 32>:1 MWHWDIILIL LAVGSAAGFI AGLFGVGGGT LIVPVVLWVL DLQGLAQHPY51 AQHLAVGTSF AVMVFTAFSS MLGQHKKQAV DWKTVFTMMP GMVFGVFAGA101 LSAKYIPAFG LQIFFILFLT AVAFKTLHTD PQTASRPLPG LPGLTAVSTL151 FGTMSSWVGI GGGSLSVPFL IHCGFPAHKA IGTSSGLAWP IALSGAISYL201 LNGLNIAGLP EGSLGFLYLP AVAVLSAATI AFAPLGVKTA HKLSSAKLKK251 SFGIMLLLIA GKMLYNLL*ORF17a和ORF17-1在268个氨基酸的重叠区内显示出有98.9%的相同性:
10 20 30 40 50 60
orf17a.pep MWHWDIILILLAVGSAAGFIAGLFGVGGGTLIVPVVLWVLDLQGLAQHPYAQHLAVGTSF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf17-1 MWHWDIILILLAVGSAAGFIAGLFGVGGGTLIVPVVLWVLDLQGLAQHPYAQHLAVGTSF
10 20 30 40 50 60
70 80 90 100 110 120
orf17a.pep AVMVFTAFSSMLGQHKKQAVDWKTVFTMMPGMVFGVFAGALSAKYIPAFGLQIFFILFLT
||||||||||||||||||||||||||||||||:||||:||||||||||||||||||||||
orf17-1 AVMVFTAFSSMLGQHKKQAVDWKTVFTMMPGMIFGVFTGALSAKYIPAFGLQIFFILFLT
70 80 90 100 110 120
130 140 150 160 170 180
orf17a.pep AVAFKTLHTDPQTASRPLPGLPGLTAVSTLFGTMSSWVGIGGGSLSVPFLIHCGFPAHKA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf17-1 AVAFKTLHTDPQTASRPLPGLPGLTAVSTLFGTMSSWVGIGGGSLSVPFLIHCGFPAHKA
130 140 150 160 170 180
190 200 210 220 230 240
orf17a.pep IGTSSGLAWPIALSGAISYLLNGLNIAGLPEGSLGFLYLPAVAVLSAATIAFAPLGVKTA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf17-1 IGTSSGLAWPIALSGAISYLLNGLNIAGLPEGSLGFLYLPAVAVLSAATIAFAPLGVKTA
190 200 210 220 230 240
250 260 269
orf17a.pep HKLSSAKLKKSFGIMLLLIAGKMLYNLLX
|||||||||| ||||||||||||||||||
orf17-1 HKLSSAKLKKXFGIMLLLIAGKMLYNLLX
250 260
Homology with a predicted ORF from N. gonorrhoeae
ORF17 shows 93.9% identity over a 196 aa overlap with a predicted ORF (ORF17.ng) from N. gonorrhoeae:
orf17.pep GQHKKQAVNGKTVFTMMPGMIFGVFTGAFS 30
||||||||: ||:|:||||||||||:||:|
orf17ng QGLAQHPYAQHLAVGTSFAVMVFTAFSSMLGQHKKQAVDWKTIFAMMPGMIFGVFAGALS 102
orf17.pep AKYIPAFGLQIFFILFLTAVAFKTLHTDPQTASRPLPGLPXLTAVSTLFGTMSSWVGIGG 90
||||||||||||||||||||||||||| ||||||||||| |||||||||:|||||||||
orf17ng AKYIPAFGLQIFFILFLTAVAFKTLHTGRQTASRPLPGLPGLTAVSTLFGAMSSWVGIGG 162
orf17.pep GSLSVPFLIHCGFPAHKAIGTSSGLAWPIALSGAISYLLNGLNIAGLPEGSLGFLYLPAV 150
||||||||||||||||||||||||||||||||||||||:|||||||||||||||||||||
orf17ng GSLSVPFLIHCGFPAHKAIGTSSGLAWPIALSGAISYLVNGLNIAGLPEGSLGFLYLPAV 202
orf17.pep AVLSAATIAFAPLGVKTAHKLSSAKLKKSFGIMLLLIAGKMLYNLL 196
|||||||||||||||||||||||||||:||||||||||||||||||
orf17ng AVLSAATIAFAPLGVKTAHKLSSAKLKESFGIMLLLIAGKMLYNLL 268
预计ORF17ng核苷酸序列<SEQ ID 33>编码的蛋白质具有氨基酸序列<SEQ ID34>:1 MWHWDIILIL LAVGSAAGFI AGLFGVGGGT LIVPVVLWVL DLQGLAQHPY51 AQHLAVGTSF AVMVFTAFSS MLGQHKKQAV DWKTIFAMMP GMIFGVFAGA101 LSAKYIPAFG LQIFFILFLT AVAFKTLHTG RQTASRPLPG LPGLTAVSTL151 FGAMSSWVGI GGGSLSVPFL IHCGFPAHKA IGTSSGLAWP IALSGAISYL201 VNGLNIAGLP EGSLGFLYLP AVAVLSAATI AFAPLGVKTA HKLSSAKLKE251 SFGIMLLLIA GKMLYNLL*进一步的工作揭示了该完整的淋球菌DNA序列<SEQ ID 35>:1 ATGTGGCATT GGGACATTAT CTTAATCCTG CTTGCcgtag gcAGTGCGGC51 AGGTTTTATT GCCGGCCTGT Tcggtgtagg cggcgGTACG CTGATTGTCC101 CTGTCGTTTT ATGGGTGCTT GATTTGCAGG GTTTGGCACA ACATCCTTAC151 GCGCAACACC TCGCCGTCGG CAcaTccttc gcCGTCATGG TCTTCACCGC201 CTTTTCCAGT ATGTTGGGGC AGCACAAAAA ACAGGCGGTC GACTGGAAAA251 CCATATTTGC GATGATGCCG GGTATGATAT TCGGCGTATT CGCTGGCGCA301 CTCTCCGCAA AATATATCCC CGCGTTCGGG CTTCAAATTT TCTTCATCCT351 GTTTTTAACC GCCGTCGCAT TCAAAACACT GCATACCGGT CGTCAGACGG401 CATCCCGCCC GCTGCCCGGG CTGCCCGGAC TGACTGCGGT TTCCACACTG451 TTCGGCGCAA TGTCGAGCTG GGTCGGCATA GGCGGCGGTT CACTTTCCGT501 CCCCTTCTTA ATCCACTGCG GCTTCCCCGC CCATAAAGCC ATCGGCACAT551 CATCCGGCCT TGCCTGGCCG ATTGCACTCT CCGGCGCAAT ATCGTATCTG601 GTCAACGGTC TGAATATTGC AGGATTGCCC GAAGGGTCGC TGGGCTTCCT651 TTACCTGCCC GCCGTCGCCG TCCTCAGCGC GGCAACCATT GCCTTTGCCC701 CGCTCGGTGT CAAAACCGCC CACAAACTTT CTTCTGCCAA ACTCAAAGAA751 TCCTTCGGCA TTATGTTGCT TTTGATTGCC GGAAAAATGC TGTACAACCT801 GCTTTAA它对应于氨基酸序列<SEQ ID 36;ORF17ng-1>:1 MWHWDIILIL LAVGSAAGFI AGLFGVGGGT LIVPVVLWVL DLQGLAQHPY51 AQHLAVGTSF AVMVFTAFSS MLGQHKKQAV DWKTIFAMMP GMIFGVFAGA101 LSAKYIPAFG LQIFFILFLT AVAFKTLHTG RQTASRPLPG LPGLTAVSTL151 FGAMSSWVGI GGGSLSVPFL IHCGFPAHKA IGTSSGLAWP IALSGAISYL201 VNGLNIAGLP EGSLGFLYLP AVAVLSAATI AFAPLGVKTA HKLSSAKLKE251 SFGIMLLLIA GKMLYNLL*ORF17ng-1和ORF17-1在268个氨基酸的重叠区内显示出有96.6%的相同性:
10 20 30 40 50 60
orf17-1.pep MWHWDIILILLAVGSAAGFIAGLFGVGGGTLIVPVVLWVLDLQGLAQHPYAQHLAVGTSF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf17ng-1 MWHWDIILILLAVGSAAGFIAGLFGVGGGTLIVPVVLWVLDLQGLAQHPYAQHLAVGTSF
10 20 30 40 50 60
70 80 90 100 110 120
orf17-1.pep AVMVFTAFSSMLGQHKKQAVDWKTVFTMMPGMIFGVFTGALSAKYIPAFGLQIFFILFLT
||||||||||||||||||||||||:|:||||||||||:||||||||||||||||||||||
orf17ng-1 AVMVFTAFSSMLGQHKKQAVDWKTIFAMMPGMIFGVFAGALSAKYIPAFGLQIFFILFLT
70 80 90 100 110 120
130 140 150 160 170 180
orf17-1.pep AVAFKTLHTDPQTASRPLPGLPGLTAVSTLFGTMSSWVGIGGGSLSVPFLIHCGFPAHKA
||||||||| |||||||||||||||||||||:|||||||||||||||||||||||||||
orf17ng-1 AVAFKTLHTGRQTASRPLPGLPGLTAVSTLFGAMSSWVGIGGGSLSVPFLIHCGFPAHKA
130 140 150 160 170 180
190 200 210 220 230 240
orf17-1.pep IGTSSGLAWPIALSGAISYLLNGLNIAGLPEGSLGFLYLPAVAVLSAATIAFAPLGVKTA
||||||||||||||||||||:|||||||||||||||||||||||||||||||||||||||
orf17ng-1 IGTSSGLAWPIALSGAISYLVNGLNIAGLPEGSLGFLYLPAVAVLSAATIAFAPLGVKTA
190 200 210 220 230 240
250 260 269
orf17-1.pep HKLSSAKLKKXFGIMLLLIAGKMLYNLLX
|||||||||: ||||||||||||||||||
orf17ng-1 HKLSSAKLKESFGIMLLLIAGKMLYNLLX
250 260
In addition, ORF17ng-1 shows homology with a hypothetical H. influenzae protein:
sp|P44070|Y902_HAEIN hypothetical protein HI0902 pir||G64015 hypothetical protein HI0902 - Haemophilus influenzae (strain Rd KW20) gi|1573922 (U32772) H. influenzae predicted coding region HI0902 [Haemophilus influenzae] Length = 264
Score = 74 (34.9 bits), Expect = 1.6e-23, Sum P(2) = 1.6e-23
Identities = 15/43 (34%), Positives = 23/43 (53%)
Query: 55 AVGTSFAVMVFTAFSSMLGQHKKQAVDWKTIFAMMPGMIFGVF 97
A+GTSFA +V T S HK + W+ + + P ++ VF
Sbjct: 52 ALGTSFATIVITGIGSAQRHHKLGNIVWQAVRILAPVIMLSVF 94
Score = 195 (91.9 bits), Expect = 1.6e-23, Sum P(2) = 1.6e-23
Identities = 44/114 (38%), Positives = 65/114 (57%)
Query: 150 LFGAMSSWVGIGGGSLSVPFLIHCGFPAHKAIGTSSGLAWPIALSGAISYLVNGLNIAGL 209
L G SS GIGGG VPFL G +AIG+S+ + +SG S++V+G +
Sbjct: 148 LIGMASSAAGIGGGGFIVPFLTARGINIKQAIGSSAFCGMLLGISGMFSFIVSGWGNPLM 207
Query: 210 PEGSLGFLYLPAVAVLSAATIAFAPLGVKTAHKLSSAKLKESFGIMLLLIAGKM 263
PE SLG++YLPAV ++A + + LG KL + LK+ F + L+++A M
Sbjct: 208 PEYSLGYIYLPAVLGITATSFFTSKLGASATAKLPVSTLKKGFALFLIVVAINM 261
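This analysis, including the homology with a hypothetical H. influenzae transmembrane protein, suggests that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 12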
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 37>:1 ..GGAAACGGAT GGCAGGCAGA CCCCGAACAT CCGCTGCTCG GGCTTTTTGC51 CGTCAGTAAT GTATCGATGA CGCTTGCTTT TGTCGGAATA TGTGCGTTGG101 TGCATTATTG CTTTTCGGGA ACGGTTCAAG TGTTTGTGTT TGCGGCACTG151 CTCAAACTTT ATGCGCTGAA GCCGGTTTAT TGGTTCGTGT TGCAGTTTGT201 GCTGATGGCG GTTGCCTATG TCCACCGCTG CGGTATAGAC CGGCAGCCGC251 CGTCAACGTT CGGCGGCTCG CAGCTGCGAC TCGGCGGGTT GACGGCAGCG301 TTGATGCAGG TCTCGGTACT GGTGCTGCTG CTTTCAGAAA TTGGAAGATA351 A它对应于氨基酸序列<SEQ ID 38;ORF18>:1 ..GNGWQADPEH PLLGLFAVSN VSMTLAFVGI CALVHYCFSG TVQVFVFAAL51 LKLYALKPVY WFVLQFVLMA VAYVHRCGID RQPPSTFGGS QLRLGGLTAA101 LMQVSVLVLL LSEIGR*进一步的工作揭示了完整的核苷酸序列<SEQ ID 39>:1 ATGATTTTGC TGCATTTGGA TTTTTTGTCT GCCTTACTGT ATGCGGCGGT51 TTTTCTGTTT CTGATATTCC GCGCAGGAAT GTTGCAATGG TTTTGGGCGA101 GTATTATGCT GTGGCTGGGC ATATCGGTTT TGGGGGCAAA GCTGATGCCC151 GGCATATGGG GAATGACCCG CGCCGCGCCC TTGTTCATCC CCCATTTTTA201 CCTGACTTTG GGCAGCATAT TTTTTTTCAT CGGGCATTGG AACCGGAAAA251 CAGATGGAAA CGGATGGCAG GCAGACCCCG AACATCCGCT GCTCGGGCTT301 TTTGCCGTCA GTAATGTATC GATGACGCTT GCTTTTGTCG GAATATGTGC351 GTTGGTGCAT TATTGCTTTT CGGGAACGGT TCAAGTGTTT GTGTTTGCGG401 CACTGCTCAA ACTTTATGCG CTGAAGCCGG TTTATTGGTT CGTGTTGCAG451 TTTGTGCTGA TGGCGGTTGC CTATGTCCAC CGCTGCGGTA TAGACCGGCA501 GCCGCCGTCA ACGTTCGGCG GCTCGCAGCT GCGACTCGGC GGGTTGACGG551 CAGCGTTGAT GCAGGTCTCG GTACTGGTGC TGCTGCTTTC AGAAATTGGA601 AGATAA它对应于氨基酸序列<SEQ ID 40;ORF18-1>:1 MILLHLDFLS ALLYAAVFLF LIFRAGMLQW FWASIMLWLG ISVLGAKLMP51 GIWGMTRAAP LFIPHFYLTL GSIFFFIGHW NRKTDGNGWQ ADPEHPLLGL101 FAVSNVSMTL AFVGICALVH YCFSGTVQVF VFAALLKLYA LKPVYWFVLQ151 FVLMAVAYVH RCGIDRQPPS TFGGSQLRLG GLTAALMQVS VLVLLLSEIG201 R*该氨基酸序列的计算机分析给出了下列结果:与脑膜炎奈瑟球菌(菌株A)的预计ORF的同源性
ORF18 shows 98.3% identity over a 116 aa overlap with an ORF (ORF18a) from strain A of N. meningitidis:
10 20 30
orf18.pep GNGWQADPEHPLLGLFAVSNVSMTLAFVGI
||||||||||||||||||||||||||||||
orf18a TRAAPLFIPHFYLTLGSIFFFIGHWNRKTDGNGWQADPEHPLLGLFAVSNVSMTLAFVGI
60 70 80 90 100 110
40 50 60 70 80 90
orf18.pep CALVHYCFSGTVQVFVFAALLKLYALKPVYWFVLQFVLMAVAYVHRCGIDRQPPSTFGGS
||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||
orf18a CALVHYCFSXTVQVFVFAALLKLYALKPVYWFVLQFVLMAVAYVHRCGIDRQPPSTFGGS
120 130 140 150 160 170
100 110
orf18.pep QLRLGGLTAALMQVSVLVLLLSEIGRX
||||||||||||| |||||||||||||
orf18a QLRLGGLTAALMQXSVLVLLLSEIGRX
180 190 200全长ORF18a核苷酸序列<SEQ ID 41>是:1 ATGATTTTGC TGCATTTGGA TTTTTTGTCT GCCTTACTGT ATGCGGCGGT51 TTTTCTGTTT CTGATATTCC GCGCAGGAAT GTTGCAATGG TTTTGGGCGA101 GTATTATGCT GTGGCTGGGC ATATCGGTTT TGGGGGCAAA GCTGATGCCC151 GGCATATGGG GAATGACCCG CGCCGCGCCC TTGTTCATCC CCCATTTTTA201 CCTGACTTTG GGCAGCATAT TTTTTTTCAT CGGGCATTGG AACCGGAAAA251 CGGATGGAAA CGGATGGCAG GCAGACCCCG AACATCCTCT GCTCGGGCTG301 TTTGCCGTCA GTAATGTATC GATGACGCTT GCTTTTGTCG GAATATGTGC351 GTTGGTGCAT TATTGCTTTT CGNGAACGGT TCAAGTGTTT GTGTTTGCGG401 CACTGCTCAA ACTTTATGCG CTGAAGCCGG TTTATTGGTT CGTGTTGCAG451 TTTGTGCTGA TGGCGGTTGC CTATGTCCAC CGCTGCGGTA TAGACCGGCA501 GCCGCCGTCA ACGTTCGGCG GNTCGCAGCT GCGACTCGGC GGGTTGACGG551 CAGCGTTGAT GCAGNTCTCG GTACTGGTGC TGCTGCTTTC AGAAATTGGA601 AGATAA它编码的蛋白质具有氨基酸序列<SEQ ID 42>:1 MILLHLDFLS ALLYAAVFLF LIFRAGMLQW FWASIMLWLG ISVLGAKLMP51 GIWGMTRAAP LFIPHFYLTL GSIFFFIGHW NRKTDGNGWQ ADPEHPLLGL101 FAVSNVSMTL AFVGICALVH YCFSXTVQVF VFAALLKLYA LKPVYWFVLQ151 FVLMAVAYVH RCGIDRQPPS TFGGSQLRLG GLTAALMQXS VLVLLLSEIG201 R*ORF18a和ORF18-1在201个氨基酸的重叠区内显示出有99.0%的相同性:
10 20 30 40 50 60
orf18a.pep MILLHLDFLSALLYAAVFLFLIFRAGMLQWFWASIMLWLGISVLGAKLMPGIWGMTRAAP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf18-1 MILLHLDFLSALLYAAVFLFLIFRAGMLQWFWASIMLWLGISVLGAKLMPGIWGMTRAAP
10 20 30 40 50 60
70 80 90 100 110 120
orf18a.pep LFIPHFYLTLGSIFFFIGHWNRKTDGNGWQADPEHPLLGLFAVSNVSMTLAFVGICALVH
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf18-1 LFIPHFYLTLGSIFFFIGHWNRKTDGNGWQADPEHPLLGLFAVSNVSMTLAFVGICALVH
70 80 90 100 110 120
130 140 150 160 170 180
orf18a.pep YCFSXTVQVFVFAALLKLYALKPVYWFVLQFVLMAVAYVHRCGIDRQPPSTFGGSQLRLG
|||| |||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf18-1 YCFSGTVQVFVFAALLKLYALKPVYWFVLQFVLMAVAYVHRCGIDRQPPSTFGGSQLRLG
130 140 150 160 170 180
190 200
orf18a.pep GLTAALMQXSVLVLLLSEIGRX
|||||||| |||||||||||||
orf18-1 GLTAALMQVSVLVLLLSEIGRX
190 200
Homology with a predicted ORF from N. gonorrhoeae
ORF18 shows 93.1% identity over a 116 aa overlap with a predicted ORF (ORF18.ng) from N. gonorrhoeae:
orf18.pep GNGWQADPEHPLLGLFAVSNVSMTLAFVGI 30
||||||||||||||||||||||||||||||
orf18ng TRAAPLFIPHFYLTLGSIFFFIGYWNRKTDGNGWQADPEHPLLGLFAVSNVSMTLAFVGI 115
orf18.pep CALVHYCFSGTVQVFVFAALLKLYALKPVYWFVLQFVLMAVAYVHRCGIDRQPPSTFGGS 90
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf18ng CALVHYCFSGTVQVFVFAALLKLYALKPVYWFVLQFVLMAVAYVHRCGIDRQPPSTFGGS 175
orf18.pep QLRLGGLTAALMQVSVLVLLLSEIGR 116
||||| |:| ||||:| ::||:||||
orf18ng QLRLGVLAAMLMQVAVTAMLLAEIGR 201
全长ORF18ng核苷酸序列是<SEQ ID 43>:1 ATGATTTTGC TGCATTTGGA TTTTTTGTCT GCCTTACTGt aTGCGGcggt51 tttTctgTTT CTGATATTCC GCGCAGGAAT GTTGCAATGG TTTTGGGCGA101 GTATTGCGTT GTGGCTCGGC ATCTCGGTTT TAGGGGTAAA GCTGATGCCG151 GGGATGTGGG GAATGACCCG CGCCGCGCCT TTGTTCATCC CCCATTTTTA201 CCTGACTTTG GGCAGCATAT TTTTTTTCAT CGGGTATTGG AACCGGAAAA251 CAGATGGAAA CGGATGGCAG GCAGACCCCG AACATCCGCT GCTCGGGCTT301 TTTGCCGTCA GTAATGTATC GATGACGCTT GCTTTTGTCG GAATATGTGC351 GTTGGTGCAT TATTGCTTTT CGGGAACGGT TCAAGTGTTT GTGTTTGCGG401 CATTGCTCAA ACTTTATGCG CTGAAGCCGG TTTATTGGTT CGTGTTGCAG451 TTTGTATTGA TGGCGGttgC CTATGTCCAC CGCTGCGGTA TAGACCGGCA501 GCCGCCGTCA ACGTTCGGCG GTTCGCAGCT GCGACTCGGC GTGTTGGCGG551 CGATGTTGAT GCAGGTTGCG GTAACGGCGA TGCTGCTTGC CGAAATCGGC601 AGATGA它编码的蛋白质具有氨基酸序列<SEQ ID 44>:1 MILLHLDFLS ALLYAAVFLF LIFRAGMLQW FWASIALWLG ISVLGVKLMP51 GMWGMTRAAP LFIPHFYLTL GSIFFFIGYW NRKTDGNGWQ ADPEHPLLGL101 FAVSNVSMTL AFVGICALVH YCFSGTVQVF VFAALLKLYA LKPVYWFVLQ151 FVLMAVAYVH RCGIDRQPPS TFGGSQLRLG VLAAMLMQVA VTAMLLAEIG201 R*
This ORF18ng protein sequence shows 94.0% identity over a 201 aa overlap with ORF18-1:
10 20 30 40 50 60
orf18-1.pep MILLHLDFLSALLYAAVFLFLIFRAGMLQWFWASIMLWLGISVLGAKLMPGIWGMTRAAP
||||||||||||||||||||||||||||||||||| |||||||||:|||||:||||||||
orf18ng MILLHLDFLSALLYAAVFLFLIFRAGMLQWFWASIALWLGISVLGVKLMPGMWGMTRAAP
10 20 30 40 50 60
70 80 90 100 110 120
orf18-1.pep LFIPHFYLTLGSIFFFIGHWNRKTDGNGWQADPEHPLLGLFAVSNVSMTLAFVGICALVH
||||||||||||||||||:|||||||||||||||||||||||||||||||||||||||||
orf18ng LFIPHFYLTLGSIFFFIGYWNRKTDGNGWQADPEHPLLGLFAVSNVSMTLAFVGICALVH
70 80 90 100 110 120
130 140 150 160 170 180
orf18-1.pep YCFSGTVQVFVFAALLKLYALKPVYWFVLQFVLMAVAYVHRCGIDRQPPSTFGGSQLRLG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf18ng YCFSGTVQVFVFAALLKLYALKPVYWFVLQFVLMAVAYVHRCGIDRQPPSTFGGSQLRLG
130 140 150 160 170 180
190 200
orf18-1.pep GLTAALMQVSVLVLLLSEIGRX
|:| ||||:| ::||:|||||
orf18ng VLAAMLMQVAVTAMLLAEIGRX
190 200
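Based on this analysis, including the presence of several putative transmembrane domains in the gonococcal protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.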
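The text does not state how the putative transmembrane domains were predicted. One common approach, shown here purely as an illustrative sketch, is a sliding-window hydropathy scan using the Kyte-Doolittle scale. The function name `hydrophobic_windows`, the 19-residue window and the 1.6 threshold are assumptions chosen for the example, not parameters taken from this document.

```python
# Kyte-Doolittle hydropathy values (Kyte & Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}


def hydrophobic_windows(protein, window=19, threshold=1.6):
    """Return (1-based start, mean hydropathy) for windows at or above threshold."""
    hits = []
    for i in range(len(protein) - window + 1):
        segment = protein[i:i + window]
        # Assumption: residues missing from the scale (e.g. 'X') score 0.
        mean = sum(KD.get(aa, 0.0) for aa in segment) / window
        if mean >= threshold:
            hits.append((i + 1, round(mean, 2)))
    return hits


# N-terminal region of ORF18ng (SEQ ID 44).
orf18ng_start = "MILLHLDFLSALLYAAVFLFLIFRAGMLQWFWASIALWLGISVLGVKLMP"
for start, score in hydrophobic_windows(orf18ng_start):
    print(start, score)
```

Runs of consecutive windows above the threshold would mark candidate membrane-spanning segments. A dedicated topology predictor would be needed to confirm them.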
Example 13
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 45>:1 ATGAAAACCC CACTCCTCAA GCCTCTGCTN ATTACCTCGC TTCCCGTTTT51 CGCCAGTGTT TTTACCGCCG CCTCCATCGT CTGGCAGCTA GGCGAACCCA101 AGCTCGCCAT GCCCTTCGTA CTCGGCATCA TCGCCGGCGG CCTTGTCGAT151 TTGGACAACC NCNTGACCGG ACGGCTNAAA AACATCATCA CCACCGTCGC201 CCTGTTCACC CTCTCCTCGC TCACGGCACA AAGCACCCTC GGCACAGGGC251 TGCCCTTCAT CCTCGCCATG ACCCTGATGA CTT.CG.CTT CACCATTTTA301 GGCGCGGNCG ...它对应于氨基酸序列<SEQ ID 46;ORF19>:1 MKTPLLKPLL ITSLPVFASV FTAASIVWQL GEPKLAMPFV LGIIAGGLVDS1 LDNXXTGRLK NIITTVALFT LSSLTAQSTL GTGLPFILAM TLMTXXFTIL101 GAX...进一步的工作揭示了完整的核苷酸序列<SEQ ID 47>:1 ATGAAAACCC CACTCCTCAA GCCTCTGCTC ATTACCTCGC TTCCCGTTTT51 CGCCAGTGTT TTTACCGCCG CCTCCATCGT CTGGCAGCTA GGCGAACCCA101 AGCTCGCCAT GCCCTTCGTA CTCGGCATCA TCGCCGGCGG CCTTGTCGAT151 TTGGACAACC GCCTGACCGG ACGGCTGAAA AACATCATCA CCACCGTCGC201 CCTGTTCACC CTCTCCTCGC TCACGGCACA AAGCACCCTC GGCACAGGGC251 TGCCCTTCAT CCTCGCCATG ACCCTGATGA CCTTCGGCTT CACCATTTTA301 GGCGCGGTCG GGCTCAAATA CCGCACCTTC GCCTTCGGTG CACTCGCCGT351 CGCCACCTAC ACCACACTTA CCTACACCCC CGAAACCTAC TGGCTGACCA401 ACCCCTTCAT GATTTTATGC GGCACCGTAC TGTACAGCAC CGCCATCCTC451 CTGTTCCAAA TCGTCCTGCC CCACCGCCCC GTCCAAGAAA GCGTCGCCAA501 CGCCTACGAC GCACTCGGCG GCTACCTCGA AGCCAAAGCC GACTTCTTCG551 ACCCCGATGA GGCAGCCTGG ATAGGCAACC GCCACATCGA CCTCGCCATG601 AGCAACACCG GCGTCATCAC CGCCTTCAAC CAATGCCGTT CCGCCCTGTT651 TTACCGCCTT CGCGGCAAAC ACCGCCACCC GCGCACCGCC AAAATGCTGC701 GTTACTACTT TGCCGCCCAA GACATACACG AACGCATCAG CTCCGCCCAC751 GTCGATTATC AGGAAATGTC CGAAAAATTC AAAAACACCG ACATCATCTT801 CCGCATCCAC CGCCTGCTCG AAATGCAGGG ACAAGCCTGC CGCAACACCG851 CCCAAGCCCT GCGCGCAAGC AAAGACTACG TTTACAGCAA ACGCCTCGGC901 CGCGCCATCG AAGGCTGCCG CCAATCGCTG CGCCTCCTTT CAGACAGCAA951 CGACAGTCCC GACATCCGCC ACCTGCGCCG CCTTCTCGAC AACCTCGGCA1001 GCGTCGACCA GCAGTTCCGC CAACTCCAGC ACAACGGCCT GCAGGCAGAA1051 AACGACCGCA TGGGCGACAC CCGCATCGCC GCCCTCGAAA CCAGCAGCCT1101 CAAAAACACC TGGCAGGCAA TCCGTCCGCA GCTAAACCTC GAATCAGGCG1151 TATTCCGCCA TGCCGTCCGC CTGTCCCTCG TCGTTGCCGC CGCCTGCACC1201 ATCGTCGAAG CCCTCAACCT 
CAACCTCGGC TACTGGATAC TACTGACCGC1251 CCTTTTCGTC TGCCAACCCA ACTACACCGC CACCAAAAGC CGCGTCCGCC1301 AGCGCATCGC CGGCACCGTA CTCGGCGTAA TCGTCGGCTC GCTCGTCCCC1351 TACTTCACCC CGTCTGTCGA AACCAAACTC TGGATTGTCA TCGCCAGTAC1401 CACCCTCTTT TTCATGACCC GCACCTACAA ATACAGTTTC TCCACCTTCT1451 TCATTACCAT TCAAGCCCTG ACCAGCCTCT CCCTCGCAGG TTTGGACGTA1501 TACGCCGCCA TGCCCGTACG CATCATCGAC ACCATTATCG GCGCATCCCT1551 TGCCTGGGCG GCAGTCAGCT ACCTGTGGCC AGACTGGAAA TACCTCACGC1601 TCGAACGCAC CGCCGCCCTT GCCGTATGCA GCAACGGTGC CTATCTCGAA1651 AAAATCACCG AACGCCTCAA AAGCGGCGAA ACCGGCGACG ACGTCGAATA1701 CCGCGCCACC CGCCGCCGCG CCCACGAACA CACCGCCGCC CTCAGCAGCA1751 CCCTTTCCGA CATGAGCAGC GAACCCGCAA AATTCGCCGA CAGCCTGCAA1801 CCCGGCTTTA CCCTGCTCAA AACCGGCTAC GCCCTGACCG GCTACATCTC1851 CGCCCTCGGC GCATACCGCA GCGAAATGCA CGAAGAATGC AGCCCCGACT1901 TTACCGCACA GTTCCACCTC GCCGCCGAAC ACACCGCCCA CATCTTCCAA1951 CACCTGCCCG AAACCGAACC CGACGACTTT CAGACAGCAC TGGATACACT2001 GCGCGGCGAA CTCGACACCC TCCGCACCCA CAGCAGCGGA ACACAAAGCC2051 ACATCCTCCT CCAACAGCTC CAACTCATCG CCCGACAGCT CGAACCCTAC2101 TACCGCGCCT ACCGCCAAAT TCCGCACAGG CAGCCCCAAA ATGCAGCCTG2151 A它对应于氨基酸序列<SEQ ID 48;ORF19-1>:1 MKTPLLKPLL ITSLPVFASV FTAASIVWQL GEPKLAMPFV LGIIAGGLVD51 LDNRLTGRLK NIITTVALFT LSSLTAQSTL GTGLPFILAM TLMTFGFTIL101 GAVGLKYRTF AFGALAVATY TTLTYTPETY WLTNPFMILC GTVLYSTAIL151 LFQIVLPHRP VQESVANAYD ALGGYLEAKA DFFDPDEAAW IGNRHIDLAM201 SNTGVITAFN QCRSALFYRL RGKHRHPRTA KMLRYYFAAQ DIHERISSAH251 VDYQEMSEKF KNTDIIFRIH RLLEMQGQAC RNTAQALRAS KDYVYSKRLG301 RAIEGCRQSL RLLSDSNDSP DIRHLRRLLD NLGSVDQQFR QLQHNGLQAE351 NDRMGDTRIA ALETSSLKNT WQAIRPQLNL ESGVFRHAVR LSLVVAAACT401 IVEALNLNLG YWILLTALFV CQPNYTATKS RVRQRIAGTV LGVIVGSLVP451 YFTPSVETKL WIVIASTTLF FMTRTYKYSF STFFITIQAL TSLSLAGLDV501 YAAMPVRIID TIIGASLAWA AVSYLWPDWK YLTLERTAAL AVCSNGAYLE551 KITERLKSGE TGDDVEYRAT RRRAHEHTAA LSSTLSDMSS EPAKFADSLQ601 PGFTLLKTGY ALTGYISALG AYRSEMHEEC SPDFTAQFHL AAEHTAHIFQ651 HLPETEPDDF QTALDTLRGE LDTLRTHSSG TQSHILLQQL QLIARQLEPY701 YRAYRQIPHR 
QPQNAA*
Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted transmembrane protein YHFK of H. influenzae (accession number P44289)
ORF19 and the YHFK protein show 45% aa identity over a 97 aa overlap:
orf19 6 LKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNXXTGRLKNIITT 65
L +I+++PVF +V AA +W +MP +LGIIAGGLVDLDN TGRLKN+ T
YHFK 5 LNAKVISTIPVFIAVNIAAVGIWFFDISSQSMPLILGIIAGGLVDLDNRLTGRLKNVFFT 64
orf19 66 VALFTLSSLTAQSTLGTGLPFILAMTLMTXXFTILGA 102
+ F++SS Q +G + +I+ MT++T FT++GA
YHFK 65 LIAFSISSFIVQLHIGKPIQYIVLMTVLTFIFTMIGA 101
Homology with a predicted ORF from N. meningitidis (strain A)
ORF19 shows 92.2% identity over a 102 aa overlap with an ORF (ORF19a) from strain A of N. meningitidis:
10 20 30 40 50 60
orf19.pep MKTPLLKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNXXTGRLK
|||| |||||||||||||||||||||||||||||||||||||||||||||||| |||||
orf19a MKTPPLKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNRLTGRLK
10 20 30 40 50 60
70 80 90 100
orf19.pep NIITTVALFTLSSLTAQSTLGTGLPFILAMTLMTXXFTILGAX
|||:||||||||||:||||||||||||||||||| |||:||
orf19a NIIATVALFTLSSLVAQSTLGTGLPFILAMTLMTFGFTIMGAVGLKYRTFAFGALAVATY
70 80 90 100 110 120orf19a TTLTYTPETYWLTNPFMILCGTVLYSTAIILFQIILPHRPVQENVANAYEALGSYLEAKA
130 140 150 160 170 180全长ORF19a核苷酸序列<SEQ ID 49>是:1 ATGAAAACCC CACCCCTCAA GCCTCTGCTC ATTACCTCGC TTCCCGTTTT51 CGCCAGTGTC TTTACCGCCG CCTCCATCGT CTGGCAGCTG GGCGAACCCA101 AGCTCGCCAT GCCCTTCGTA CTCGGCATCA TCGCTGGCGG CCTGGTCGAT151 TTGGACAACC GCCTGACCGG ACGGCTGAAA AACATCATCG CCACCGTCGC201 CCTGTTCACC CTCTCCTCAC TTGTCGCGCA AAGCACCCTC GGCACAGGTT251 TGCCATTCAT CCTCGCCATG ACCCTGATGA CTTTCGGCTT TACCATCATG301 GGCGCGGTCG GGCTGAAATA CCGCACCTTC GCCTTCGGCG CACTCGCCGT351 CGCCACCTAC ACCACACTTA CCTACACCCC CGAAACCTAC TGGCTGACCA401 ACCCCTTTAT GATTCTGTGC GGAACCGTAC TGTACAGCAC CGCCATCATC451 CTGTTCCAAA TCATCCTGCC CCACCGCCCC GTTCAAGAAA ACGTCGCCAA501 CGCCTACGAA GCACTCGGCA GCTACCTCGA AGCCAAAGCC GACTTTTTCG551 ATCCCGACGA AGCCGAATGG ATAGGCAACC GCCACATCGA CCTCGCCATG601 AGCAACACCG GCGTCATCAC CGCCTTCAAC CAATGCCGTT CCGCCCTGTT651 TTACCGCCTT CGCGGCAAAC ACCGCCACCC GCGCACCGCC AAAATGCTGC701 GCTACTACTT CGCCGCCCAA GACATACACG AACGCATCAG CTCCG CCCAC751 GTCGACTACC AAGAGATGTC CGAAAAATTC AAAAACACCG ACATCATCTT801 CCGCATCCAC CGCCTGCTCG AAATGCAGGG ACAAGCCTGC CGCAACACCG851 CCCAAGCCCT GCGCGCAAGC AAAGACTACG TTTACAGCAA ACGCCTCGGC901 CGCGCCATCG AAGGCTGCCG CCAATCGCTG CGCCTCCTTT CAGACAGCAA951 CGACAATCCC GACATCCGCC ACCTGCGCCG CCTTCTCGAC AACCTCGGCA1001 GCGTCGACCA GCAGTTCCGC CAACTCCAGC ACAACGGCCT GCAGGCAGAA1051 AACGACCGCA TGGGCGACAC CCGCATCGCC GCCCTCGAAA CCGGCAGCCT1101 CAAAAACACC TGGCAGGCAA TCCGTCCGCA GCTAAACCTC GAATCAGGCG1151 TATTCCGCCA TGCCGTCCGC CTGTCCCTTG TCGTTGCCGC CGCCTGCACC1201 ATCGTCGAAG CCCTCAACCT CAACCTCGGC TACTGGATAC TACTGACCGC1251 CCTTTTCGTC TGCCAACCCA ACTACACCGC CACCAAAAGC CGCGTCCGCC1301 AGCGCATCGC CGGCACCGTA CTCGGCGTAA TCGTCGGCTC GCTCGTCCCC1351 TACTTTACCC CCTCCGTCGA AACCAAACTC TGGATCGTCA TCGCCAGTAC1401 CACCCTCTTT TTCATGACCC GCACCTACAA ATACAGCTTC TCGACATTTT1451 TCATCACCAT TCAAGCCCTG ACCAGCCTCT CCCTCGCAGG GTTGGACGTA1501 TACGCCGCCA TGCCCGTACG CATCATCGAC ACCATTATCG GCGCATCCCT1551 TGCCTGGGCG GCAGTCAGCT ACCTGTGGCC AGACTGGAAA TACCTCACGC1601 TCGAACGCAC CGCCGCCCTT GCCGTATGCA GCAACGGCGC CTATCTCGAA1651 AAAATCACCG 
AACGCCTCAA AAGCGGCGAA ACCGGCGACG ACGTCGAATA1701 CCGCGCCACC CGCCGCCGCG CCCACGAACA CACCGCCGCC CTCAGCAGCA1751 CCCTTTCCGA CATGAGCAGC GAACCCGCAA AATTCGCCGA CAGCCTGCAA1801 CCCGGCTTTA CCCTGCTCAA AACCGGCTAC GCCCTGACCG GCTACATCTC1851 CGCCCTCGGC GCATACCGCA GCGAAATGCA CGAAGAATGC AGCCCCGACT1901 TTACCGCACA GTTCCACCTC GCCGCCGAAC ACACCGCCCA CATCTTCCAA1951 CACCTGCCCG AAACCGAACC CGACGACTTT CAGACAGCAC TGGATACACT2001 GCGCGGCGAA CTCGACACCC TCCGCACCCA CAGCAGCGGA ACACAAAGCC2051 ACATCCTCCT CCAACAGCTC CAACTCATCG CCCGGCAGCT CGAACCCTAC2101 TACCGCGCCT ACCGACAAAT TCCGCACAGG CAGCCCCAAA ACGCAGCCTG2151 A它编码的蛋白质具有氨基酸序列<SEQ ID 50>:1 MKTPPLKPLL ITSLPVFASV FTAASIVWQL GEPKLAMPFV LGIIAGGLVD51 LDNRLTGRLK NIIATVALFT LSSLVAQSTL GTGLPFILAM TLMTFGFTIM101 GAVGLKYRTF AFGALAVATY TTLTYTPETY WLTNPFMILC GTVLYSTAII151 LFQIILPHRP VQENVANAYE ALGSYLEAKA DFFDPDEAEW IGNRHIDLAM201 SNTGVITAFN QCRSALFYRL RGKHRHPRTA KMLRYYFAAQ DIHERISSAH251 VDYQEMSEKF KNTDIIFRIH RLLEMQGQAC RNTAQALRAS KDYVYSKRLG301 RAIEGCRQSL RLLSDSNDNP DIRHLRRLLD NLGSVDQQFR QLQHNGLQAE351 NDRMGDTRIA ALETGSLKNT WQAIRPQLNL ESGVFRHAVR LSLVVAAACT401 IVEALNLNLG YWILLTALFV CQPNYTATKS RVRQRIAGTV LGVIVGSLVP451 YFTPSVETKL WIVIASTTLF FMTRTYKYSF STFFITIQAL TSLSLAGLDV501 YAAMPYRIID TIIGASLAWA AVSYLWPDWK YLTLERTAAL AVCSNGAYLE551 KITERLXSGE TGDDVEYRAT RRRAHEHTAA LSSTLSDMSS EPAKFADSLQ601 PGFTLLKTGY ALTGYISALG AYRSEMHEEC SPDFTAQFHL AAEHTAHIFQ651 HLPETEPDDF QTALDTLRGE LDTLRTHSSG TQSHILLQQL QLIARQLEPY701 YRAYRQIPHR QPQNAA*ORF19a和ORF19-1显示在716个氨基酸的重叠区内有98.3%的相同性:
10 20 30 40 50 60orf19a.pep MKTPPLKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNRLTGRLK
|||| |||||||||||||||||||||||||||||||||||||||||||||||||||||||orf19-1 MKTPLLKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNRLTGRLK
10 20 30 40 50 60
70 80 90 100 110 120orf19a.pep NIIATVALFTLSSLVAQSTLGTGLPFILAMTLMTFGFTIMGAVGLKYRTFAFGALAVATY
|||:||||||||||:||||||||||||||||||||||||:||||||||||||||||||||orf19-1 NIITTVALFTLSSLTAQSTLGTGLPFILAMTLMTFGFTILGAVGLKYRTFAFGALAVATY
70 80 90 100 110 120
130 140 150 160 170 180orf19a.pep TTLTYTPETYWLTNPFMILCGTVLYSTAIILFQIILPHRPVQENVANAYEALGSYLEAKA
|||||||||||||||||||||||||||||:||||:||||||||:|||||:|||:||||||orf19-1 TTLTYTPETYWLTNPFMILCGTVLYSTAILLFQIVLPHRPVQESVANAYDALGGYLEAKA
130 140 150 160 170 180
190 200 210 220 230 240orf19a.pep DFFDPDEAEWIGNRHIDLAMSNTGVITAFNQCRSALFYRLRGKHRHPRTAKMLRYYFAAQ
|||||||| |||||||||||||||||||||||||||||||||||||||||||||||||||orf19-1 DFFDPDEAAWIGNRHIDLAMSNTGVITAFNQCRSALFYRLRGKHRHPRTAKMLRYYFAAQ
190 200 210 220 230 240
250 260 270 280 290 300orf19a.pep DIHERISSAHVDYQEMSEKFKNTDIIFRIHRLLEMQGQACRNTAQALRASKDYVYSKRLG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf19-1 DIHERISSAHVDYQEMSEKFKNTDIIFRIHRLLEMQGQACRNTAQALRASKDYVYSKRLG
250 260 270 280 290 300
310 320 330 340 350 360orf19a.pep RAIEGCRQSLRLLSDSNDNPDIRHLRRLLDNLGSVDQQFRQLQHNGLQAENDRMGDTRIA
||||||||||||||||||:|||||||||||||||||||||||||||||||||||||||||orf19-1 RAIEGCRQSLRLLSDSNDSPDIRHLRRLLDNLGSVDQQFRQLQHNGLQAENDRMGDTRIA
310 320 330 340 350 360
370 380 390 400 410 420orf19a.pep ALETGSLKNTWQAIRPQLNLESGVFRHAVRLSLVVAAACTIVEALNLNLGYWILLTALFV
||||:|||||||||||||||||||||||||||||||||||||||||||||||||||||||orf19-1 ALETSSLKNTWQAIRPQLNLESGVFRHAVRLSLVVAAACTIVEALNLNLGYWILLTALFV
370 380 390 400 410 420
430 440 450 460 470 480orf19a.pep CQPNYTATKSRVRQRIAGTVLGVIVGSLVPYFTPSVETKLWIVIASTTLFFMTRTYKYSF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf19-1 CQPNYTATKSRVRQRIAGTVLGVIVGSLVPYFTPSVETKLWIVIASTTLFFMTRTYKYSF
430 440 450 460 470 480
490 500 510 520 530 540orf19a.pep STFFITIQALTSLSLAGLDVYAAMPVRIIDTIIGASLAWAAVSYLWPDWKYLTLERTAAL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf19-1 STFFITIQALTSLSLAGLDVYAAMPVRIIDTIIGASLAWAAVSYLWPDWKYLTLERTAAL
490 500 510 520 530 540
550 560 570 580 590 600
orf19a.pep AVCSNGAYLEKITERLKSGETGDDVEYRATRRRAHEHTAALSSTLSDMSSEPAKFADSLQ
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf19-1 AVCSNGAYLEKITERLKSGETGDDVEYRATRRRAHEHTAALSSTLSDMSSEPAKFADSLQ
550 560 570 580 590 600
610 620 630 640 650 660
orf19a.pep PGFTLLKTGYALTGYISALGAYRSEMHEECSPDFTAQFHLAAEHTAHIFQHLPETEPDDF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf19-1 PGFTLLKTGYALTGYISALGAYRSEMHEECSPDFTAQFHLAAEHTAHIFQHLPETEPDDF
610 620 630 640 650 660
670 680 690 700 710
orf19a.pep QTALDTLRGELDTLRTHSSGTQSHILLQQLQLIARQLEPYYRAYRQIPHRQPQNAAX
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf19-1 QTALDTLRGELDTLRTHSSGTQSHILLQQLQLIARQLEPYYRAYRQIPHRQPQNAAX
670 680 690 700 710
Homology with a predicted ORF from N. gonorrhoeae
ORF19 shows 95.1% identity over a 102 aa overlap with a predicted ORF (ORF19.ng) from N. gonorrhoeae:
orf19.pep MKTPLLKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNXXTGRLK 60
||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||
orf19ng MKTPLLKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNRLTGRLK 60
orf19.pep NIITTVALFTLSSLTAQSTLGTGLPFILAMTLMTXXFTILGAX 103
|||:|||||||||||||||||||||||||||||| ||||||
orf19ng NIIATVALFTLSSLTAQSTLGTGLPFILAMTLMTFGFTILGAVGLKYRTFAFGALAVATY 120
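The identity figure quoted for the alignment above can be checked by counting matching columns. The following is a minimal Python sketch, not the program that produced the original figures. The function name `percent_identity` is hypothetical, and the sketch assumes that positions marked X (undetermined residues) count as mismatches and that the terminal X of orf19.pep lies outside the 102 aa overlap.

```python
def percent_identity(seq_a, seq_b):
    """Return (identical, overlap, percent) for two gap-free aligned sequences.

    Only the overlapping region (the length of the shorter sequence) is
    compared. An 'X' (undetermined residue) never counts as a match; this
    is an assumption made for the illustration.
    """
    overlap = min(len(seq_a), len(seq_b))
    identical = sum(
        1
        for a, b in zip(seq_a[:overlap], seq_b[:overlap])
        if a == b and a != "X"
    )
    return identical, overlap, 100.0 * identical / overlap


# First 102 residues of ORF19 and ORF19.ng, copied from the alignment above
# in blocks of ten residues.
ORF19 = (
    "MKTPLLKPLL" "ITSLPVFASV" "FTAASIVWQL" "GEPKLAMPFV" "LGIIAGGLVD"
    "LDNXXTGRLK" "NIITTVALFT" "LSSLTAQSTL" "GTGLPFILAM" "TLMTXXFTIL" "GA"
)
ORF19NG = (
    "MKTPLLKPLL" "ITSLPVFASV" "FTAASIVWQL" "GEPKLAMPFV" "LGIIAGGLVD"
    "LDNRLTGRLK" "NIIATVALFT" "LSSLTAQSTL" "GTGLPFILAM" "TLMTFGFTIL" "GA"
)

identical, overlap, pct = percent_identity(ORF19, ORF19NG)
print(f"{identical}/{overlap} identical residues = {pct:.1f}%")
```

Under these assumptions the five differing positions are the four X residues in the partial ORF19 sequence and the T/A substitution at residue 64.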
预计ORF19ng核苷酸序列<SEQ ID 51>编码的蛋白质具有氨基酸序列<SEQ ID52>:1 MKTPLLKPLL ITSLPVFASV FTAASIVWQL GEPKLAMPFV LGIIAGGLVD51 LDNRLTGRLK NIIATVALFT LSSLTAQSTL GTGLPFILAM TLMTFGFTIL101 GAVGLKYRTF AFGALAVATY TTLTYTPETY WLTNPFMILC GTVLYSTAII151 LFQIILPHRP VQESVANAYE ALGGYLEAKA DFFDPDEAAW IGNRHIDLAM201 SNTGVITAFN QCRSALFYRL RGKHRHPRTA KMLRYYFAAQ DIHERISSAH251 VDYQEMSEKF KNTDIIFRIR RLLEMQGQAC RNTAQAIRSG KDYVYSKRLG301 RAIEGCRQSL RLLSDGNDSP DIRHLSRLLD NLGSVDQQFR QLRHSDSPAE351 NDRMGDTRIA ALETGSFKNT *进一步的工作揭示了完整的核苷酸序列<SEQ ID 53>:1 ATGAAAACCC CACTCCTCAA GCCTCTGCTC ATTACCTCGC TTCCCGTTTT51 CGCCAGTGTC TTTACCGCCG CCTCCATCGT CTGGCAGCTA GGCGAACCCA101 AGCTCGCCAT GCCCTTCGTA CTCGGCATCA TCGCCGGCGG CCTGGTCGAT151 TTGGACAACC GCCTGACCGG ACGGCTGAAA AACATCATCG CCACCGTCGC201 CCTGTTTACC CTCTCCTCGC TCACGGCGCA AAGCACCCTC GGCACAGGGC251 TGCCCTTCAT CCTCGCCATG ACCCTGATGA CCTTCGGCTT TACCATTTTA301 GGCGCGGTCG GGCTGAAATA CCGCACCTTC GCCTTCGGCG CACTCGCCGT351 CGCCACCTAC ACCACGCTTA CCTACACCCC CGAAACCTAC TGGCTGACCA401 ACCCCTTCAT GATTTTATGC GGCACCGTAC TGTACAGCAC CGCCATCATC451 CTGTTCCAAA TCATCCTGCC CCACCGCCCC GTCCAAGAAA GCGTCGCCAA501 TGCCTACGAA GCACTCGGCG GCTACCTCGA AGCCAAAGCC GACTTCTTCG551 ACCCCGATGA GGCAGCCTGG ATAGGCAACC GCCACATCGA CCTCGCCATG601 AGCAACACCG GCGTCATCAC CGCCTTCAAC CAATGCCGTT CCGCCCTGTT651 TTACCGTTTG CGCGGCAAAC ACCGCCACCC GCGCACCGCC AAAATGCTGC701 GCTACTACTT CGCCGCCCAA GACATCCACG AACGCATCAG CTCCGCCCAC751 GTCGACTACC AAGAGATGTC CGAAAAATTC AAAAACACCG ACATCATCTT801 CCGCATCCGC CGCCTGCTCG AAATGCAGGG GCAGGCGTGC CGCAACACCG 851 CCCAAGCCAT CCGGTCGGGC AAAGACTAcg tTTACAGCAA ACGCCTCGGA901 CGCGCCATcg aaggctgCCG CCAGTCGCtg cgcctCCTTt cagacggcaA951 CGACAGTCCC GACATCCGCC ACCTGAGccg CCTTCTCGAC AACCTCGgca1001 GCGTcgacca gcagtTCcgc caactCCGAC ACAgcgactC CCCCGCcgaa1051 Aacgaccgca tgggcgacaC CCGCATCGCC GCCCtcgaaa ccggcagctT1101 caaaaaCAcc tggcaggCAA TCCGTCCGCa gctgaaCCTC GAATCatgCG1151 TATTCCGCCA TGCCGTCCGC CTGTCCCTCG TCGTTGCCGC CGCCTGCACC1201 ATCGTCgaag cCCTCAACCT CAACCTCGGC TACTGGATAC TGCTGACCGC1251 CCTTTTCGTC TGCCAACCCA ACTACACCGC 
CACCAAAAGC CGCGTGTACC1301 AACGCATCGC CGGCACCGTA CTCGGCGTAA TCGTCGGCTC GCTCGTCCCC1351 TACTTCACCC CCTCCGTCGA AACCAAACTC TGGATTGTCA TCGCCGGTAC1401 CACCCTGTTC TTCATGACCC GCACCTACAA ATACAGTTTC TCCACCTTCT1451 TCATCACCAT TCAGGCACTG ACCAGCCTCT CCCTCGCAGG TTTGGACGTA1501 TACGCCGCCA TGCCCGTGCG CATCATcgaC ACCATTATCG GCGCATCCCT1551 TGCCTGGGCG GCGGTCAGCT ACCTGTGGCC AGACTGGAAA TACCTCACGC1601 TCGAACGCAC CGCCGCCCTT GCCGTATGCA GCAGCGGCAC ATACCTCCAA1651 AAAATTGCCG AACGCCTCAA AACCGGCGAA ACCGGCGACG ACATAGAATA1701 CCGCATCACC CGCCGCCGCG CCCACGAACA CACCGCCGCC CTCAGCAGCA1751 CCCTTTCCGA CATGAGCAGC GAACCCGCAA AATTCGCCGA CAGCCTGCAA1801 CCCGGCTTTA CCCTGCTCAA AACCGGCTAC GCCCTGACCG GCTACATCTC1851 CGCCCTCGGC GCATACCGCA GCGAAATGCA CGAAGAATGC AGCCCCGACT1901 TTACCGCACA GTTCCACCTT GCCGCCGAAC ACACCGCCCA CATCTTCCAA1951 CACCTGCCCG ACATGGGACC CGACGACTTT CAGACGGCAT TGGATACACT2001 GCGCGGCGAA CTCGGCACCC TCCCCACCCG CAGCAGCGGA ACACAAAGCC2051 ACATCCTCCT CCAACAGCTC CAACTCATCG CccgGCAACT CGAACCCTAC2101 TACCGCGCCT ACCGACAAAT TCCGCACAGG CAGCCCCAAA ACGCAGCCTG2151 A它对应于氨基酸序列<SEQ ID 54;ORF19ng-1>:1 MKTPLLKPLL ITSLPVFASV FTAASIVWQL GEPKLAMPFV LGIIAGGLVD51 LDNRLTGRLK NIIATVALFT LSSLTAQSTL GTGLPFILAM TLMTFGFTIL101 GAVGLKYRTF AFGALAVATY TTLTYTPETY WLTNPFMILC GTVLYSTAII151 LFQIILPHRP VQESVANAYE ALGGYLEAKA DFFDPDEAAW IGNRHIDLAM201 SNTGVITAFN QCRSALFYRL RGKHRHPRTA KMLRYYFAAQ DIHERISSAH251 VDYQEMSEKF KNTDIIFRIR RLLEMQGQAC RNTAQAIRSG KDYVYSKRLG301 RAIEGCRQSL RLLSDGNDSP DIRHLSRLLD NLGSVDQQFR QLRHSDSPAE351 NDRMGDTRIA ALETGSFKNT WQAIRPQLNL ESCVFRHAVR LSLVYAAACT401 IVEALNLNLG YWILLTALFV CQPNYTATKS RVYQRIAGTV LGVIVGSLVP451 YFTPSVETKL WIVIAGTTLF FMTRTYKYSF STFFITIQAL TSLSLAGLDV501 YAAMPYRIID TIIGASLAWA AVSYLWPDWK YLTLERTAAL AVCSSGTYLQ551 KIAERLKTGE TGDDIEYRIT RRRAHEHTAA LSSTLSDMSS EPAKFADSLQ601 PGFTLLKTGY ALTGYISALG AYRSEMHEEC SPDFTAQFHL AAEHTAHIFQ651 HLPDMGPDDF QTALDTLRGE LGTLRTRSSG TQSHILLQQL QLIARQLEPY701 YRAYRQIPHR QPQNAA*ORF19ng-1和ORF19-1在716个氨基酸的重叠区内显示出有95.5%的相同性:
10 20 30 40 50 60orf19-1.pep MKTPLLKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNRLTGRLK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf19ng-1 MKTPLLKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNRLTGRLK
10 20 30 40 50 60
70 80 90 100 110 120orf19-1.pep NIITTVALFTLSSLTAQSTLGTGLPFILAMTLMTFGFTILGAVGLKYRTFAFGALAVATY
|||:||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf19ng-1 NIIATVALFTLSSLTAQSTLGTGLPFILAMTLMTFGFTILGAVGLKYRTFAFGALAVATY
70 80 90 100 110 120
130 140 150 160 170 180orf19-1.pep TTLTYTPETYWLTNPFMILCGTVLYSTAILLFQIVLPHRPVQESVANAYDALGGYLEAKA
|||||||||||||||||||||||||||||:||||:||||||||||||||:||||||||||
orf19ng-1 TTLTYTPETYWLTNPFMILCGTVLYSTAIILFQIILPHRPVQESVANAYEALGGYLEAKA
130 140 150 160 170 180
190 200 210 220 230 240
orf19-1.pep DFFDPDEAAWIGNRHIDLAMSNTGVITAFNQCRSALFYRLRGKHRHPRTAKMLRYYFAAQ
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf19ng-1 DFFDPDEAAWIGNRHIDLAMSNTGVITAFNQCRSALFYRLRGKHRHPRTAKMLRYYFAAQ
190 200 210 220 230 240
250 260 270 280 290 300
orf19-1.pep DIHERISSAHVDYQEMSEKFKNTDIIFRIHRLLEMQGQACRNTAQALRASKDYVYSKRLG
|||||||||||||||||||||||||||||:||||||||||||||||:|::||||||||||
orf19ng-1 DIHERISSAHVDYQEMSEKFKNTDIIFRIRRLLEMQGQACRNTAQAIRSGKDYVYSKRLG
250 260 270 280 290 300
310 320 330 340 350 360
orf19-1.pep RAIEGCRQSLRLLSDSNDSPDIRHLRRLLDNLGSVDQQFRQLQHNGLQAENDRMGDTRIA
|||||||||||||||:||||||||| ||||||||||||||||:|: ||||||||||||
orf19ng-1 RAIEGCRQSLRLLSDGNDSPDIRHLSRLLDNLGSVDQQFRQLRHSDSPAENDRMGDTRIA
310 320 330 340 350 360
370 380 390 400 410 420
orf19-1.pep ALETSSLKNTWQAIRPQLNLESGVFRHAVRLSLVVAAACTIVEALNLNLGYWILLTALFV
||||:|:||||||||||||||| |||||||||||||||||||||||||||||||||||||
orf19ng-1 ALETGSFKNTWQAIRPQLNLESCVFRHAVRLSLVVAAACTIVEALNLNLGYWILLTALFV
370 380 390 400 410 420
430 440 450 460 470 480
orf19-1.pep CQPNYTATKSRVRQRIAGTVLGVIVGSLVPYFTPSVETKLWIVIASTTLFFMTRTYKYSF
|||||||||||| ||||||||||||||||||||||||||||||||:||||||||||||||
orf19ng-1 CQPNYTATKSRVYQRIAGTVLGVIVGSLVPYFTPSVETKLWIVIAGTTLFFMTRTYKYSF
430 440 450 460 470 480
490 500 510 520 530 540
orf19-1.pep STFFITIQALTSLSLAGLDVYAAMPVRIIDTIIGASLAWAAVSYLWPDWKYLTLERTAAL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf19ng-1 STFFITIQALTSLSLAGLDVYAAMPVRIIDTIIGASLAWAAVSYLWPDWKYLTLERTAAL
490 500 510 520 530 540
550 560 570 580 590 600
orf19-1.pep AVCSNGAYLEKITERLKSGETGDDVEYRATRRRAHEHTAALSSTLSDMSSEPAKFADSLQ
||||:|:||:||:||||:||||||:||| |||||||||||||||||||||||||||||||
orf19ng-1 AVCSSGTYLQKIAERLKTGETGDDIEYRITRRRAHEHTAALSSTLSDMSSEPAKFADSLQ
550 560 570 580 590 600
610 620 630 640 650 660
orf19-1.pep PGFTLLKTGYALTGYISALGAYRSEMHEECSPDFTAQFHLAAEHTAHIFQHLPETEPDDF
|||||||||||||||||||||||||||||||||||||||||||||||||||||: ||||
orf19ng-1 PGFTLLKTGYALTGYISALGAYRSEMHEECSPDFTAQFHLAAEHTAHIFQHLPDMGPDDF
610 620 630 640 650 660
670 680 690 700 710
orf19-1.pep QTALDTLRGELDTLRTHSSGTQSHILLQQLQLIARQLEPYYRAYRQIPHRQPQNAAX
||||||||||| ||||:||||||||||||||||||||||||||||||||||||||||
orf19ng-1 QTALDTLRGELGTLRTRSSGTQSHILLQQLQLIARQLEPYYRAYRQIPHRQPQNAAX
670 680 690 700 710
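The 716 aa overlap reported above is consistent with the length of the complete nucleotide sequences, whose last numbered position is 2151. A minimal sketch of the arithmetic follows; the function name `orf_protein_length` is hypothetical, and it assumes the nucleotide count includes the terminal stop codon, as it does for the sequences listed here.

```python
def orf_protein_length(nt_length, has_stop_codon=True):
    """Number of amino acids encoded by an ORF of nt_length nucleotides."""
    if nt_length % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    codons = nt_length // 3
    # The stop codon terminates translation and encodes no residue.
    return codons - 1 if has_stop_codon else codons


print(orf_protein_length(2151))  # complete ORF19 sequences
print(orf_protein_length(1539))  # complete ORF20 sequences
```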
In addition, ORF19ng-1 shows significant homology with a hypothetical gonococcal protein previously entered in the databases:
sp|033369|YOR2_NEIGO HYPOTHETICAL 45.5 KD PROTEIN (ORF2) gnl|PID|e1154438 (AJ002423) hypothetical protein [Neisseria gonorrhoeae] Length = 417
Score = 1512 (705.6 bits), Expect = 5.3e-203, P = 5.3e-203
Identities = 301/326 (92%), Positives = 306/326 (93%)
Query: 307 RQSLRLLSDGNDSPDIRHLSRLLDNLGSVDQQFRQLRHSDSPAENDRMGDTRIAALETGS 366
           RQSLRLLSDGNDS DIRHLSRLLDNLGSVDQQFRQLRHSDSPAENDRMGDTRIAALETGS
Sbjct:   1 RQSLRLLSDGNDSXDIRHLSRLLDNLGSVDQQFRQLRHSDSPAENDRMGDTRIAALETGS 60
Query: 367 FKNTWQAIRPQLNLESCVFRHAVRLSLVVAAACTIVEALNLNLGYWILLTALFVCQPNYT 426
           FKNTWQAIRPQLNLES VFRHAVRLSLVVAAACTIVEALNLNLGYWILLT LFVCQPNYT
Sbjct:  61 FKNTWQAIRPQLNLESGVFRHAVRLSLVVAAACTIVEALNLNLGYWILLTRLFVCQPNYT 120
Query: 427 ATKSRVYQRIAGTVLGVIVGSLVPYFTPSVETKLWIVIAGTTLFFMTRTYKYSFSTFFIT 486
           ATKSRVYQRIAGTVLGVIVGSLVPYFTPSVETKLWIVIAGTTLFFMTRTYKYSFSTFFIT
Sbjct: 121 ATKSRVYQRIAGTVLGVIVGSLVPYFTPSVETKLWIVIAGTTLFFMTRTYKYSFSTFFIT 180
Query: 487 IQALTSLSLAGLDVYAAMPVRIIDTIIGASLAWAAVSYLWPDWKYLTLERTAALAVCSSG 546
           IQALTSLSLAGLDVYAAMPVRIIDTIIGASLAWAAVSYLWPDWKYLTLERTAALAVCSSG
Sbjct: 181 IQALTSLSLAGLDVYAAMPVRIIDTIIGASLAWAAVSYLWPDWKYLTLERTAALAVCSSG 240
Query: 547 TYLQKIAERLKTGETGDDIEYRITRRRAHEHTAALSSTLSDMSSEPAKFADSLQPGFTLL 606
           TYLQKIAERLKTGETGDDIEYRITRRRAHEHTAALSSTLSDMSSEPAKFAD+  P
Sbjct: 241 TYLQKIAERLKTGETGDDIEYRITRRRAHEHTAALSSTLSDMSSEPAKFADTCNPALPCS 300
Query: 607 KTGYALTGYISALGAYRSEMHEECSP 632
           K   ALTGYISALG   ++  +  +P
Sbjct: 301 KPATALTGYISALGHTAAKCTKNAAP 326
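Based on this analysis, including the presence of several putative transmembrane domains in the gonococcal protein (the first of which is also seen in the meningococcal protein) and homology with the YHFK protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.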
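The text does not state how the putative transmembrane domains were predicted. As an illustration only, a sliding-window hydropathy scan is one common way to flag candidate membrane-spanning stretches. The sketch below uses the Kyte-Doolittle scale with a 19-residue window and a 1.6 cut-off; the scale, window, cut-off and function names are all assumptions for this example and are not taken from the source. Residues absent from the scale (such as X) are scored as 0.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window along seq (empty if seq is short)."""
    scores = [KD.get(aa, 0.0) for aa in seq]  # unknown residues score 0 (assumption)
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophobic_windows(seq, window=19, threshold=1.6):
    """1-based start positions of windows whose mean hydropathy exceeds threshold."""
    profile = hydropathy_profile(seq, window)
    return [i + 1 for i, score in enumerate(profile) if score > threshold]


# N-terminal 60 residues of ORF19-1, as shown in the alignments above.
ORF19_NTERM = "MKTPLLKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNRLTGRLK"
print(hydrophobic_windows(ORF19_NTERM))
```

Windows flagged by such a scan are only candidates; dedicated topology predictors would be needed to support any structural claim.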
Example 14
在脑膜炎奈瑟球菌中鉴定出下列认为是完整的DNA序列<SEQ ID 55>:1 ATGAATATGC TGGGAGCTTT GGCAAAAGTC GGCAGCCTGA CGATGGTGTC51 GCGCGTTTTG GGATTTGTGC GCGATACGGT CATTGCGCGG GCATTCGGCG101 CGGGTATGGC GACGGATGCG TTTTTTGTCG CGTTCAAACT GCCCAACCTG151 CTTCGCCGCG TGTTTGCGGA GGGGGCGTTT GCCCAAGCGT TTGTGCCGAT201 TTTGGCGGAA TACAAGGAAA CGCGTTCAAA AGAGGCGG.C GAAGCCTTTA251 TCCGCCATGT GGCGGGGATG CTGTCGTTTG TACTGGTTAT CGTTACCGCG301 CTGGGCATAC TTGCCGCGCC TTGGGTGATT TATGTTTCCG CACCCGAGTT351 TTGCCCAAGA TGCCGACAAA TTTCAGCTCT CCATCGATTT GCTGCGGATT401 ACGTTTCCTT ATATATTATT GATTTCCCTG TCTTCATTTG TCGGCTCGGT451 ACTCAATTCT TATCATAAGT TCGGCATTCC GGCGTTTACG CCAC.GTTTC501 TGAACGTGTC GTTTATCGTA TTCGCGCTGT TTTTCGTGCC GTATTTCGAT551 CCGCCCGTTA CCGCGCyGGC GTGGGCGGTC TTTGTCGGCG GCATTTTGCA601 ACTCGrmTTC CAACTGCCCT GGCTGGCGAA ACTGGGCTTT TTGAAACTGC651 CCAAACtGAG TTTCAAAGAT GCGGCGGTCA ACCGCGTGAT GAAACAGATG701 GCGCCTGCgA TTTTgGGCGT GAgCGTGGCG CAGGTTTCTT TGGTGATCAA751 CACGATTTTc GCGTCTTATC TGCAATCGGG CAGCGTTTCA TGGATGTATT801 ACGCCGACCG CATGATGGAG CTGCCCAGCG GCGTGCTGGG GGCGGCACTC851 GGTACGATTT TGCTGCCGAC TTTGTCCAAA CACTCGGCAA ACCaAGATAC901 GGaACAGTTT TCCGCCCTGC TCGACTGGGG TTTGCGCCTG TGCATGCtgc951 TGACGCTGCC GGCGgcGGTC GGACTGGCGG TGTTGTCGTT cCCgCtGGTG1001 GCGACGCTGT TTATGTACCG CGwATTTACG CTGTTTGACG CGCAGATGAC1051 GCAACACGCG CTGATTGCCT ATTCTTTCGG TTTAATCGGC TTAATCATGA1101 TTAAAGTGTT GGCACCCGGC TTCTATGCGC GGCAAAACAT CAAwAmGCCC1151 GTCAAAATCG CCATCTTCAC GCTCATCTGC mCGCAGTTGA TGAACCTTGs1201 CTTTAyCGGC CCACTrrAAC rCasTCGGAC TTTCGCTTGC CATCGGTCTG1251 GGCGCGTGTA TCAATGCCGG ATTGTTGTTT TACCTGTTGC GCAGACACGG1301 TATTTACCAA CCTGG.CAAG GGTTGGGCAG CGTTCTT.AG CAAAAATGCT1351 GcTCTCGCTC GCCGTGA它对应于氨基酸序列<SEQ ID 56;ORF20>:1 MNMLGALAKV GSLTMVSRVL GFVRDTVIAR AFGAGMATDA FFVAFKLPNL51 LRRVFAEGAF AQAFVPILAE YKETRSKEAX EAFIRHVAGM LSFVLVIVTA101 LGILAAPWVI YVSAPSFAQD ADKFQLSIDL LRITFPYILL ISLSSFVGSV151 LNSYHKFGIP AFTPXFLNVS FIVFALFFVP YFDPPVTAXA WAVFVGGILQ201 LXFQLPWLAK LGFLKLPKLS FKDAAVNRVM KQMAPAILGV SVAQVSLVIN251 TIFASYLQSG SVSWMYYADR MMELPSGVLG AALGTILLPT 
LSKHSANQDT301 EQFSALLDWG LRLCMLLTLP AAVGLAVLSF PLVATLFMYR XFTLFDAQMT351 QHALIAYSFG LIGLIMIKVL APGFYARQNI XXPVKIAIFT LICXQLMNLX401 FXGPLXXIGL SLAIGLGACI NAGLLFYLLR RHGIYQPXQG LGSVLXQKCC451 SRSP*详细描述这些序列,其完整的DNA序列<SEQ ID 57>是:1 ATGAATATGC TGGGAGCTTT GGCAAAAGTC GGCAGCCTGA CGATGGTGTC51 GCGCGTTTTG GGATTTGTGC GCGATACGGT CATTGCGCGG GCATTCGGCG101 CGGGTATGGC GACGGATGCG TTTTTTGTCG CGTTCAAACT GCCCAACCTG151 CTTCGCCGCG TGTTTGCGGA GGGGGCGTTT GCCCAAGCGT TTGTGCCGAT201 TTTGGCGGAA TACAAGGAAA CGCGTTCAAA AGAGGCGGCG GAGGCTTTTA251 TCCGCCATGT GGCGGGGATG CTGTCGTTTG TACTGGTTAT CGTTACCGCG301 CTGGGCATAC TTGCCGCGCC TTGGGTGATT TATGTTTCCG CACCCGGTTT351 TGCCCAAGAT GCCGACAAAT TTCAGCTCTC CATCGATTTG CTGCGGATTA401 CGTTTCCTTA TATATTATTG ATTTCCCTGT CTTCATTTGT CGGCTCGGTA451 CTCAATTCTT ATCATAAGTT CGGCATTCCG GCGTTTACGC CCACGTTTCT501 GAACGTGTCG TTTATCGTAT TCGCGCTGTT TTTCGTGCCG TATTTCGATC551 CGCCCGTTAC CGCGCTGGCG TGGGCGGTCT TTGTCGGCGG CATTTTGCAA601 CTCGGCTTCC AACTGCCCTG GCTGGCGAAA CTGGGCTTTT TGAAACTGCC651 CAAACTGAGT TTCAAAGATG CGGCGGTCAA CCGCGTGATG AAACAGATGG701 CGCCTGCGAT TTTGGGCGTG AGCGTGGCGC AGGTTTCTTT GGTGATCAAC751 ACGATTTTCG CGTCTTATCT GCAATCGGGC AGCGTTTCAT GGATGTATTA801 CGCCGACCGC ATGATGGAGC TGCCCAGCGG CGTGCTGGGG GCGGCACTCG851 GTACGATTTT GCTGCCGACT TTGTCCAAAC ACTCGGCAAA CCAAGATACG901 GAACAGTTTT CCGCCCTGCT CGACTGGGGT TTGCGCCTGT GCATGCTGCT951 GACGCTGCCG GCGGCGGTCG GACTGGCGGT GTTGTCGTTC CCGCTGGTGG1001 CGACGCTGTT TATGTACCGC GAATTTACGC TGTTTGACGC GCAGATGACG1051 CAACACGCGC TGATTGCCTA TTCTTTCGGT TTAATCGGCT TAATCATGAT1101 TAAAGTGTTG GCACCCGGCT TCTATGCGCG GCAAAACATC AAAACGCCCG1151 TCAAAATCGC CATCTTCACG CTCATCTGCA CGCAGTTGAT GAACCTTGCC1201 TTTATCGGCC CACTGAAACA CGTCGGACTT TCGCTTGCCA TCGGTCTGGG1251 CGCGTGTATC AATGCCGGAT TGTTGTTTTA CCTGTTGCGC AGACACGGTA1301 TTTACCAACC TGGCAAGGGT TGGGCAGCGT TCTTAGCAAA AATGCTGCTC1351 TCGCTCGCCG TGATGTGCGG CGGACTGTGG GCAGCGCAGG CTTACCTGCC1401 GTTTGAATGG GCGCACGCCG GCGGAATGCG GAAAGCGGGG CAGCTCTGCA1451 TCCTGATTGC CGTCGGCGGC GGACTGTATT TCGCATCACT GGCGGCTTTG1501 GGCTTCCGTC CGCGCCATTT 
CAAACGCGTG GAAAACTGA
This corresponds to the amino acid sequence <SEQ ID 58; ORF20-1>:
1 MNMLGALAKV GSLTMVSRVL GFVRDTVIAR AFGAGMATDA FFVAFKLPNL
51 LRRVFAEGAF AQAFVPILAE YKETRSKEAA EAFIRHVAGM LSFVLVIVTA
101 LGILAAPWVI YVSAPGFAQD ADKFQLSIDL LRITFPYILL ISLSSFVGSV
151 LNSYHKFGIP AFTPTFLNVS FIVFALFFVP YFDPPVTALA WAVFVGGILQ
201 LGFQLPWLAK LGFLKLPKLS FKDAAVNRVM KQMAPAILGV SVAQVSLVIN
251 TIFASYLQSG SVSWMYYADR MMELPSGVLG AALGTILLPT LSKHSANQDT
301 EQFSALLDWG LRLCMLLTLP AAVGLAVLSF PLVATLFMYR EFTLFDAQMT
351 QHALIAYSFG LIGLIMIKVL APGFYARQNI KTPVKIAIFT LICTQLMNLA
401 FIGPLKHVGL SLAIGLGACI NAGLLFYLLR RHGIYQPGKG WAAFLAKMLL
451 SLAVMCGGLW AAQAYLPFEW AHAGGMRKAG QLCILIAVGG GLYFASLAAL
501 GFRPRHFKRV EN*
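The correspondence between a nucleotide sequence and the amino acid sequence it encodes, as stated above for SEQ ID 57 and SEQ ID 58, can be illustrated with the standard genetic code. The following is a minimal sketch; the names `CODON_TABLE` and `translate` are hypothetical. It assumes translation starts at the first nucleotide in frame 1, and that any codon containing an undetermined or ambiguous base (shown as "." or an IUPAC ambiguity letter in the partial sequences) is reported as X, which matches how the partial ORF20 protein is written.

```python
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string in frame 1; '*' marks a stop codon.

    Whitespace is ignored and lowercase bases are accepted. Codons that
    are not made of A, C, G and T only are translated as 'X'.
    """
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


# First 30 nucleotides of SEQ ID 57 and its final row (positions 1501-1539).
print(translate("ATGAATATGC TGGGAGCTTT GGCAAAAGTC"))
print(translate("GGCTTCCGTC CGCGCCATTT CAAACGCGTG GAAAACTGA"))
```

The two calls reproduce the first ten residues (MNMLGALAKV) and the last row (GFRPRHFKRV EN*) of SEQ ID 58.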
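Computer analysis of this amino acid sequence gave the following results:
Homology with the MviN virulence factor of S. typhimurium (accession number P37169)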
ORF20 and the MviN protein show 63% amino acid identity over a 440 aa overlap:
Orf20 1 MNMLGALAKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF 60
MN+L +LA V S+TM SRVLGF RD ++AR FGAGMATDAFFVAFKLPNLLRR+FAEGAFMviN 14 MNLLKSLAAVSSMTMFSRVLGFARDAIVARIFGAGMATDAFFVAFKLPNLLRRIFAEGAF 73Orf20 61 AQAFVPILAEYKETRSKEAXEAFIRHVAGMLSFVLVIVTALGILAAPWVIYVSAPSFAQD 120
+QAFVPILAEYK + +EA F+ +V+G+L+ L +VT G+LAAPWVI V+AP FAMviN 74 SQAFVPILAEYKSKQGEEATRIFVAYVSGLLTLALAVVTVAGMLAAPWVIMVTAPGFADT 133Orf20 121 ADKFQLSIDLLRITFPYILLISLSSFVGSVLNSYHKFGIPAFTPXFLNVSFIVFALFFVP 180
ADKF L+ LLRITFPYILLISL+S VG++LN++++F IPAF P FLN+S I FALF PMviN 134 ADKFALTTQLLRITFPYILLISLASLVGAILNTWNRFSIPAFAPTFLNISMIGFALFAAP 193Orf20 181 YFDPPVTAXAWAVFVGGILQLXFQLPWLAKLGFLKLPKLSFKDAAVNRVMKQMAPAILGV 240
YF+PPV A AWAV VGG+LQL +QLP+L K+G L LP+++F+D RV+KQM PAILGVMviN 194 YFNPPVLALAWAVTVGGVLQLVYQLPYLKKIGMLVLPRINFRDTGAMRVVKQMGPAILGV 253Orf20 241 SVAQVSLVINTIFASYLQSGSVSWMYYADRMMELPSGVLGAALGTILLPTLSKHSANQDT 300
SV+Q+SL+INTIFAS+L SGSVSWMYYADR+ME PSGVLG ALGTILLP+LSK A+ +MviN 254 SVSQISLIINTIFASFLASGSVSWMYYADRLMEFPSGVLGVALGTILLPSLSKSFASGNH 313Orf20 301 EQFSALLDWGLRLCMLLTLPAAVGLAVLSFPLVATLFMYRXFTLFDAQMTQHALIAYSFG 360
+++ L+DWGLRLC LL LP+AV L +L+ PL +LF Y FT FDA MTQ ALIAYS GMviN 314 DEYCRLMDWGLRLCFLLALPSAVALGILAKPLTVSLFQYGKFTAFDAAMTQRALIAYSVG 373Orf20 361 LIGLIMIKVLAPGFYARQNIXXPVKIAIFTLICXQLMNLXFXXXXXXXXXXXXXXXXXCI 420
LIGLI++KVLAPGFY+RQ+I PVKIAI TLI QLMNL F C+MviN 374 LIGLIVVKVLAPGFYSRQDIKTPVKIAIVTLIMTQLMNLAFIGPLKHAGLSLSIGLAACL 433Orf20 421 NAGLLFYLLRRHGIYQPXQG 440
NA LL++ LR+ I+ P GMviN 434 NASLLYWQLRKQNIFTPQPG 453
Homology with a predicted ORF from N. meningitidis (strain A)
ORF20 shows 93.5% identity over a 447 aa overlap with an ORF (ORF20a) from strain A of N. meningitidis:
10 20 30 40 50 60
orf20.pep MNMLGALAKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF
|||||||:||||||||||||||||||||||||||||||||||||||||||||||||||||
orf20a MNMLGALVKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF
10 20 30 40 50 60
70 80 90 100 110 120
orf20.pep AQAFVPILAEYKETRSKEAXEAFIRHVAGMLSFVLVIVTALGILAAPWVIYVSAPSFAQD
|||||||||||||||||||:|||||||||||||||||||||||||||||||||||:||:|
orf20a AQAFVPILAEYKETRSKEATEAFIRHVAGMLSFVLVIVTALGILAAPWVIYVSAPGFAKD
70 80 90 100 110 120
130 140 150 160 170 180orf20.pep ADKFQLSIDLLRITFPYILLISLSSFVGSVLNSYHKFGIPAFTPXFLNVSFIVFALFFVF
|||||||||||||||||||||||||||||||||||||:||||||:|||||||||||||||orf20a ADKFQLSIDLLRITFPYILLISLSSFVGSVLNSYHKFSIPAFTPTFLNVSFIVFALFFVF
130 140 150 160 170 180
190 200 210 220 230 240orf20.pep YFDPPVTAXAWAVFVGGILQLXFQLPWLAKLGFLKLPKLSFKDAAVNRVMKQMAPAILGV
|||||||| |||||||||||| ||||||||||||||||||||||||||||||||||||||orf20a YFDPPVTALAWAVFVGGILQLGFQLPWLAKLGFLKLPKLSFKDAAVNRVMKQMAPAILGV
190 200 210 220 230 240
250 260 270 280 290 300orf20.pep SVAQVSLVINTIFASYLQSGSVSWMYYADRMMELPSGVLGAALGTILLPTLSKHSANQDT
||||:||||||||||||||||||||||||||||||:||||||||||||||||||||||||orf20a SVAQISLVINTIFASYLQSGSVSWMYYADRMMELPGGVLGAALGTILLPTLSKHSANQDT
250 260 270 280 290 300
310 320 330 340 350 360orf20.pep EQFSALLDWGLRLCMLLTLPAAVGLAVLSFPLVATLFMYRXFTLFDAQMTQHALIAYSFG
|||||||||||| |||||||||||:||||||||||||||| |||||||||||||||||||orf20a EQFSALLDWGLRXCMLLTLPAAVGMAVLSFPLVATLFMYREFTLFDAQMTQHALIAYSFG
310 320 330 340 350 360
370 380 390 400 410 420orf20.pep LIGLIMIKVLAPGFYARQNIXXPVKIAIFTLICXQLMNLXFXGPLXXIGLSLAIGLGACI
|||||||||||||||||||| :|||||||||||:||||| | ||| :||||||||||||orf20a LIGLIMIKVLAPGFYARQNIKTPVKIAIFTLICTQLMNLAFIGPLKHVGLSLAIGLGACI
370 380 390 400 410 420
430 440 450orf20.pep NAGLLFYLLRRHGIYQPXQGLGSVLXQKCCSRSPX
||||||||||||||||| :| :: | :orf20a NAGLLFYLLRRHGIYQPGKGWAAFLAKMLLSLAVMGGGLYAAQIWLPFDWAHAGGMQKAA
430 440 450 460 470 480全长ORF20a核苷酸序列<SEQ ID 59>是:1 ATGAATATGC TGGGAGCTTT GGTAAAAGTC GGCAGCCTGA CGATGGTGTC51 GCGCGTTTTG GGATTTGTGC GCGATACGGT CATTGCGCGC GCATTCGGCG101 CAGGCATGGC GACGGATGCG TTCTTTGTCG CGTTCAAACT GCCCAACCTG151 CTTCGCCGCG TGTTTGCGGA GGGGGCGTTT GCCCAAGCGT TTGTGCGGAT201 TTTGGCGGAA TATAAGGAAA CGCGTTCTAA AGAGGCGACG GAGGCTTTTA251 TCCGCCATGT GGCGGGGATG CTGTCGTTTG TACTGGTCAT CGTTACCGCG301 CTGGGCATAC TTGCCGCGCC TTGGGTGATT TATGTTTCCG CACCCGGTTT351 TGCCAAAGAT GCCGACAAAT TTCAGCTCTC TATCGATTTG CTGCGGATTA401 CGTTTCCTTA TATCTTATTG ATTTCACTTT CCTCTTTTGT CGGCTCGGTA451 CTCAATTCCT ATCATAAATT CAGCATTCCT GCGTTTACGC CCACGTTCCT501 GAACGTGTCG TTTATCGTAT TCGCGCTGTT TTTCGTGCCG TATTTCGATC551 CTCCCGTTAC CGCGCTGGCT TGGGCGGTTT TTGTCGGCGG CATTTTGCAA601 CTCGGCTTCC AACTGCCCTG GCTGGCGAAA CTGGGTTTTT TGAAACTGCC651 CAAACTGAGT TTCAAAGATG CGGCGGTCAA CCGCGTGATG AAACAGATGG701 CGCCTGCGAT TTTGGGCGTG AGCGTGGCGC AGATTTCTTT GGTGATCAAC751 ACGATTTTCG CGTCTTATCT GCAATCGGGC AGCGTTTCAT GGATGTATTA801 CGCCGACCGC ATGATGGAAC TGCCCGGCGG CGTGCTGGGG GCGGCACTCG851 GTACGATTTT GCTGCCGACT TTGTCCAAAC ACTCGGCAAA CCAAGATACG901 GAACAGTTTT CCGCCCTGCT CGACTGGGGT TTGCGCNTGT GCATGCTGCT951 GACGCTGCCG GCGGCGGTCG GAATGGCGGT GTTGTCGTTC CCGCTGGTGG1001 CAACCTTGTT TATGTACCGA GAATTCACGC TGTTTGACGC GCAGATGACG1051 CAACACGCGC TGATTGCCTA TTCTTTCGGT TTAATCGGTT TAATCATGAT1101 TAAAGTGTTG GCGCCCGGCT TTTATGCGCG GCAAAACATC AAAACGCCCG1151 TCAAAATCGC CATCTTCACG CTCATTTGCA CGCAGTTGAT GAACCTTGCC1201 TTTATCGGCC CACTGAAACA CGTCGGACTT TCGCTTGCCA TCGGTCTGGG1251 CGCGTGTATC AATGCCGGAT TGTTGTTTTA CCTGTTGCGC AGACACGGTA1301 TTTACCAACC TGGCAAGGGT TGGGCAGCGT TCTTGGCAAA AATGCTGCTC1351 TCGCTCGCCG TGATGGGAGG CGGCCTGTAT GCCGCCCAAA TCTGGCTGCC1401 GTTCGACTGG GCACACGCCG GCGGAATGCA AAAGGCCGCC CGGCTCTTCA1451 TCCTGATTGC CGTCGGCGGC GGACTGTATT TCGCATCACT GGCGGCTTTG1501 GGCTTCCGTC CGCGCCATTT CAAACGCGTG GAAAGCTGA它编码的蛋白质具有氨基酸序列<SEQ ID 60>:1 MNMLGALVKV GSLTMVSRVL GFVRDTVIAR AFGAGMATDA FFVAFKLPNL51 LRRVFAEGAF AQAFVPILAE YKETRSKEAT EAFIRHVAGM LSFVLVIVTA101 LGILAAPWVI 
YVSAPGFAKD ADKFQLSIDL LRITFPYILL ISLSSFVGSV151 LNSYHKFSIP AFTPTFLNVS FIVFALFFVP YFDPPVTALA WAVFVGGILQ201 LGFQLPWLAK LGFLKLPKLS FKDAAVNRVM KQMAPAILGV SVAQISLVIN251 TIFASYLQSG SVSWMYYADR MMELPGGVLG AALGTILLPT LSKHSANQDT301 EQFSALLDWG LRXCMLLTLP AAVGMAVLSF PLVATLFMYR EFTLFDAQMT351 QHALIAYSEG LIGLIMIKVL APGFYARQNI KTPVKIAIFT LICTQLMNLA401 FIGPLKHVGL SLAIGLGACI NAGLLFYLLR RHGIYQPGKG WAAFLAKMLL451 SLAVMGGGLY AAQIWLPFDW AHAGGMQKAA RLFILIAVGG GLYFASLAAL501 GFRPRHFKRV ES*ORF20a和ORF20-1在512个氨基酸的重叠区内显示出有96.5%的相同性:
10 20 30 40 50 60orf20a.pep MNMLGALVKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF
|||||||:||||||||||||||||||||||||||||||||||||||||||||||||||||orf20-1 MNMLGALAKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF
10 20 30 40 50 60
70 80 90 100 110 120orf20a.pep AQAFVPILAEYKETRSKEATEAFIRHVAGMLSFVLVIVTALGILAAPWVIYVSAPGFAKD
|||||||||||||||||||:||||||||||||||||||||||||||||||||||||||:|orf20-1 AQAFVPILAEYKETRSKEAAEAFIRHVAGMLSFVLVIVTALGILAAPWVIYVSAPGFAQD
70 80 90 100 110 120
130 140 150 160 170 180orf20a.pep ADKFQLSIDLLRITFPYILLISLSSFVGSVLNSYHKFSIPAFTPTFLNVSFIVFALFFVP
|||||||||||||||||||||||||||||||||||||:||||||||||||||||||||||orf20-1 ADKFQLSIDLLRITFPYILLISLSSFVGSVLNSYHKFGIPAFTPTFLNVSFIVFALFFVP
130 140 150 160 170 180
190 200 210 220 230 240orf20a.pep YFDPPVTALAWAVFVGGILQLGFQLPWLAKLGFLKLPKLSFKDAAVNRVMKQMAPAILGV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf20-1 YFDPPVTALAWAVFVGGILQLGFQLPWLAKLGFLKLPKLSFKDAAVNRVMKQMAPAILGV
190 200 210 220 230 240
250 260 270 280 290 300orf20a.pep SVAQISLVINTIFASYLQSGSVSWMYYADRMMELPGGVLGAALGTILLPTLSKHSANQDT
||||:||||||||||||||||||||||||||||||:||||||||||||||||||||||||orf20-1 SVAQVSLVINTIFASYLQSGSVSWMYYADRMMELPSGVLGAALGTILLPTLSKHSANQDT
250 260 270 280 290 300
310 320 330 340 350 360orf20a.pep EQFSALLDWGLRXCMLLTLPAAVGMAVLSFPLVATLFMYREFTLFDAQMTQHALIAYSFG
|||||||||||| |||||||||||:|||||||||||||||||||||||||||||||||||orf20-1 EQFSALLDWGLRLCMLLTLPAAVGLAVLSFPLVATLFMYREFTLFDAQMTQHALIAYSFG
310 320 330 340 350 360
370 380 390 400 410 420orf20a.pep LIGLIMIKVLAPGFYARQNIKTPVKIAIFTLICTQLMNLAFIGPLKHVGLSLAIGLGACI
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf20-1 LIGLIMIKVLAPGFYARQNIKTPVKIAIFTLICTQLMNLAFIGPLKHVGLSLAIGLGACI
370 380 390 400 410 420
430 440 450 460 470 480
orf20a.pep NAGLLFYLLRRHGIYQPGKGWAAFLAKMLLSLAVMGGGLYAAQIWLPFDWAHAGGMQKAA
||||||||||||||||||||||||||||||||||| |||:||| :|||:|||||||:||:
orf20-1 NAGLLFYLLRRHGIYQPGKGWAAFLAKMLLSLAVMCGGLWAAQAYLPFEWAHAGGMRKAG
430 440 450 460 470 480
490 500 510
orf20a.pep RLFILIAVGGGLYFASLAALGFRPRHFKRVESX
:| ||||||||||||||||||||||||||||:|
orf20-1 QLCILIAVGGGLYFASLAALGFRPRHFKRVENX
490 500 510
Homology with a predicted ORF from N. gonorrhoeae
ORF20 shows 92.1% identity over a 454 aa overlap with a predicted ORF (ORF20ng) from N. gonorrhoeae:
orf20.pep MNMLGALAKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf20ng MNMLGALAKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF 60
orf20.pep AQAFVPILAEYKETRSKEAXEAFIRHVAGMLSFVLVIVTALGILAAPWVIYVSAPSFAQD 120
|||||||||||||||||||:|||||||||||||||::||||||||||||||||||:|::|
orf20ng AQAFVPILAEYKETRSKEATEAFIRHVAGMLSFVLIVVTALGILAAPWVIYVSAPGFTKD 120
orf20.pep ADKFQLSIDLLRITFPYILLISLSSFVGSVLNSYHKFGIPAFTPXFLNVSFIVFALFFVP 180
||||||||:||||||||||||||||||||:||||||||||||||:|||:|||||||||||
orf20ng ADKFQLSISLLRITFPYILLISLSSFVGSILNSYHKFGIPAFTPTFLNISFIVFALFFVP 180
orf20.pep YFDPPVTAXAWAVFVGGILQLXFQLPWLAKLGFLKLPKLSFKDAAVNRVMKQMAPAILGV 240
|||||||| |||||||||||| |||||||||||||||||:||||||||||||||||||||
orf20ng YFDPPVTALAWAVFVGGILQLGFQLPWLAKLGFLKLPKLNFKDAAVNRVMKQMAPAILGV 240
orf20.pep SVAQVSLVINTIFASYLQSGSVSWMYYADRMMELPSGVLGAALGTILLPTLSKHSANQDT 300
||||:||||||||||||||||||||||||||||||:||||||||||||||||||||||||
orf20ng SVAQISLVINTIFASYLQSGSVSWMYYADRMMELPGGVLGAALGTILLPTLSKHSANQDT 300
orf20.pep EQFSALLDWGLRLCMLLTLPAAVGLAVLSFPLVATLFMYRXFTLFDAQMTQHALIAYSFG 360
||||||||||||||||||||||:||||||||||||||||| |||||||||||||||||||
orf20ng EQFSALLDWGLRLCMLLTLPAAAGLAVLSFPLVATLFMYREFTLFDAQMTQHALIAYSFG 360
orf20.pep LIGLIMIKVLAPGFYARQNIXXPVKIAIFTLICXQLMNLXFXGPLXXIGLSLAIGLGACI 420
||||||||||| |||||||| :|||||||||||:||||| | ||| ||||||||||||
orf20ng LIGLIMIKVLASGFYARQNIKTPVKIAIFTLICTQLMNLAFIGPLKHAGLSLAIGLGACI 420
orf20.pep NAGLLFYLLRRHGIYQPXQGLGSVLXQKCCSRSP 454
||||||:|:|:||||:| ||||: :|||||||
orf20ng NAGLLFFLFRKHGIYRPGQGLGQPSWRKCCSRSP 454
预计ORF20ng核苷酸序列<SEQ ID 61>编码的蛋白质具有氨基酸序列<SEQ ID62>:1 MNMLGALAKV GSLTMVSRYL GFVRDTVIAR AFGAGMATDA FFVAFKLPNL51 LRRVFAEGAF AQAFVPILAE YKETRSKEAT EAFIRHVAGM LSFVLIVYTA101 LGILAAPWVI YVSAPGFTKD ADKFQLSISL LRITFPYILL ISLSSFVGSI151 LNSYHKFGIP AFTPTFLNIS FIYFALFFYP YFDPPYTALA WAVFYGGILQ201 LGFQLPWLAK LGFLKLPKLN FKDAAVNRVM KQMAPAILGV SVAQISLVIN251 TIFASYLQSG SVSWMYYADR MMELPGGVLG AALGTILLPT LSKHSANQDT301 EQFSALLDWG LRLCMLLTLP AAAGLAVLSF PLVATLFMYR EFTLFDAQMT351 QHALIAYSFG LIGLIMIKVL ASGFYARQNI KTPVKIAIFT LICTQLMNLA401 FIGPLKHAGL SLAIGLGACI NAGLLFFLFR KHGIYRPGQG LGQPSWRKCC451 SRSP*进一步的DNA分析揭示了下列DNA序列<SEQ ID 63>:1 ATGAATATGC TTGGAGCTTT GGCAAAAGTC GGCAGCCTGA CGATGGTGTC51 GCGCGTTTTG GGATTTGTGC GCGATACGGT CATTGCGCGG GCATTCGGCG101 CGGGTATGGC GACGGATGCG TTTTTTGTCG CGTTCAAACT GCCCAACCTG151 CTTCGCCGCG TGTTTGCGGA GGGGGCGTTT GCCCAAGCGT TTGTGCCGAT201 TTTGGCGGAA TATAAGGAAA CGCGTTCTAA AGAGGCGAcg gAGGCTTTTA251 TCCGCCACGt tgcgggAatg CTGTCGTTTG TGCTGATcgt cGttacCGCG301 CTGGGCATAC TTGCCGCgcc tTGGGTGATT TATGTTtccg CgcccGGCTT351 TACCAAAGAC GCGGACAAGT TCCAACTTTC CATCAGCCTG CTGCGGATTA401 CGTTTCCTTA TATATTATTG ATTTCTTTGT CTTCTTTTGT CGGCTCGATA451 CTCAATTCCT ACCATAAGTT CGGCATTCCC GCGTTTACGC CCACGTTTTT501 AAACATCTCT TTTATCGTAT TCGCACTGTT TTTCGTGCCG TATTTCGATC551 CGCCCGTTAC CGCGCTGGCG TGGGCGGTTT TTGTCGGCGG TATTTTGCAG601 CTCGGTTTCC AACTGCCGTG GCTGGCGAAA CTGGGCTTTT TGAAACTGCC651 CAAACTGAAT TTCAAAGATG CGGCGGTCAA CCGCGTCATG AAACAGATGG701 CGCCTGCGAT TTTGGGCGTG agcgTGGCGC AAATTTCTTT GgttATCAAC751 ACGATTTTCG CGTCTTATCT GCAATCGGGC AGCGTTTCAT GGATGTatta801 cgCCGACCGC ATGATGGAGc tgcgccGGGG CGTGCTGGGG GCTGCACTCG851 GTACAATTTT GCTGCCGACT TTGTCCAAAC ACTCGGCAAA CCAAGATACG901 GAACAGTTTT CCGCCCTGCT CGACTGGGGT TTGCGCCTGT GCATGCTGCT951 GACGCTGCCG GCGGCGGccg GACTGGCGGT ATTGTCGTTC CCGCTGGTGG1001 CGACGCTGTT TATGTACCGA GAATTCACGC TGTTTGACGC ACAAATGACG1051 CAACACGCGC TGATTGCCTA TTCTTTCGGT TTAATCGGTT TAATTATGAT1101 TAAAGTGTTG GCATCCGGCT TTTATGCGCG GCAAAACATC AAAACGCCCG1151 TCAAAATCGC CATCTTCACG CTCATCTGCA CGCAGTTGAT 
GAACCTCGCC1201 TTTATCGGTC CGTTGAAACA CGCCGGGCTT TCGCTCGCCA TCGGCCTGGG1251 CGCGTGCATC AACGCCGGAT TGTTGTTCTT CCTGTTGCGC AAACACGGTA1301 TTTACCGGCC cggcaggggt tgggcggcgt TCTTGGCGAA AATGCTGCTC1351 GCGCTCGCCG TGATGTGCGG CGGACTGTGG GCGGCGCAGG CTTGCCTGCC1401 GTTCGAATGG GCGCACGCCG GCGGAATGCG GAAAGCGGGG CAGCTCTGCA1451 TCCTGATTGC CGTCGGCGGC GGACTGTATT TCGCATCTCT GGCGGCTTTG1501 GGCTTCCGTC CGCGCCATTT CAAACGCGTG GAAAGCTGA它编码下列氨基酸序列<SEQ ID 64;ORF20ng-1>:1 MNMLGALAKV GSLTMVSEVL GFVRDTVIAR AFGAGMATDA FFVAFKLPNL51 LRRVFAEGAF AQAFVPILAE YKETRSKEAT EAFIRHVAGM LSFVLIVVTA101 LGILAAPWVI YVSAPGFTKD ADKFQLSISL LRITFPYILL ISLSSFVGSI151 LNSYHKFGIP AFTPTFLNIS FIVFALFFVP YFDPPVTALA WAVFVGGILQ201 LGFQLPWLAK LGFLKLPKLN FKDAAVNRVM KQMAPAILGV SVAQISLVIN251 TIFASYLQSG SVSWMYYADR MMELRRGVLG AALGTILLPT LSKHSANQDT301 EQFSALLDWG LRLCMLLTLP AAAGLAVLSF PLVATLFMYR EFTLFDAQMT351 QHALIAYSFG LIGLIMIKVL ASGFYARQNI KTPVKIAIFT LICTQLMNLA401 FIGPLKHAGL SLAIGLGACI NAGLLFFLLR KHGIYRPGRG WAAFLAKMLL451 ALAVMCGGLW AAQACLPFEW AHAGGMRKAG QLCILIAVGG GLYFASLAAL501 GFRPRHFKRV ES*ORF20ng-1和ORF20-1在512个氨基酸的重叠区内显示出有95.7%的相同性:
10 20 30 40 50 60orf20-1.pep MNMLGALAKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf20ng-1 MNMLGALAKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF
10 20 30 40 50 60
70 80 90 100 110 120orf20-1.pep AQAFVPILAEYKETRSKEAAEAFIRHVAGMLSFVLVIVTALGILAAPWVIYVSAPGFAQD
|||||||||||||||||||:|||||||||||||||::||||||||||||||||||||::|orf20ng-1 AQAFVPILAEYKETRSKEATEAFIRHVAGMLSFVLIVVTALGILAAPWVIYVSAPGFTKD
70 80 90 100 110 120
130 140 150 160 170 180orf20-1.pep ADKFQLSIDLLRITFPYILLISLSSFVGSVLNSYHKFGIPAFTPTFLNVSFIVFALFFVP
||||||||:||||||||||||||||||||:||||||||||||||||||:|||||||||||orf20ng-1 ADKFQLSISLLRITFPYILLISLSSFVGSILNSYHKFGIPAFTPTFLNISFIVFALFFVP
130 140 150 160 170 180
190 200 210 220 230 240orf20-1.pep YFDPPVTALAWAVFVGGILQLGFQLPWLAKLGFLKLPKLSFKDAAVNRVMKQMAPAILGV
|||||||||||||||||||||||||||||||||||||||:||||||||||||||||||||orf20ng-1 YFDPPVTALAWAVFVGGILQLGFQLPWLAKLGFLKLPKLNFKDAAVNRVMKQMAPAILGV
190 200 210 220 230 240
250 260 270 280 290 300orf20-1.pep SVAQVSLVINTIFASYLQSGSVSWMYYADRMMELPSGVLGAALGTILLPTLSKHSANQDT
||||:||||||||||||||||||||||||||||| ||||||||||||||||||||||||orf20ng-1 SVAQISLVINTIFASYLQSGSVSWMYYADRMMELRRGVLGAALGTILLPTLSKHSANQDT
250 260 270 280 290 300
310 320 330 340 350 360orf20-1.pep EQFSALLDWGLRLCMLLTLPAAVGLAVLSFPLVATLFMYREFTLFDAQMTQHALIAYSFG
||||||||||||||||||||||:|||||||||||||||||||||||||||||||||||||orf20ng-1 EQFSALLDWGLRLCMLLTLPAAAGLAVLSFPLVATLFMYREFTLFDAQMTQHALIAYSFG
310 320 330 340 350 360
370 380 390 400 410 420orf20-1.pep LIGLIMIKVLAPGFYARQNIKTPVKIAIFTLICTQLMNLAFIGPLKHVGLSLAIGLGACI
||||||||||| |||||||||||||||||||||||||||||||||||:||||||||||||orf20ng-1 LIGLIMIKVLASGFYARQNIKTPVKIAIFTLICTQLMNLAFIGPLKHAGLSLAIGLGACI
370 380 390 400 410 420
430 440 450 460 470 480orf20-1.pep NAGLLFYLLRRHGIYQPGKGWAAFLAKMLLSLAVMCGGLWAAQAYLPFEWAHAGGMRKAG
||||||:|||:||||:||:|||||||||||:||||||||||||| |||||||||||||||orf20ng-1 NAGLLFFLLRKHGIYRPGRGWAAFLAKMLLALAVMCGGLWAAQACLPFEWAHAGGMRKAG
430 440 450 460 470 480
490 500 510orf20-1.pep QLCILIAVGGGLYFASLAALGFRPRHFKRVENX
|||||||||||||||||||||||||||||||:|orf20ng-1 QLCILIAVGGGLYFASLAALGFRPRHFKRVESX
490 500 510
In addition, ORF20ng-1 shows significant homology with a virulence factor of S. typhimurium:
sp|P37169|MVIN_SALTY VIRULENCE FACTOR MVIN pir||S40271 mviN protein - Salmonella typhimurium gi|438252 (Z26133) mviB gene product [Salmonella typhimurium] gnl|PID|d1005521 (D25292) ORF2 [Salmonella typhimurium] Length = 524
Score = 1573 (750.1 bits), Expect = 1.1e-220, Sum P(2) = 1.1e-220
Identities = 309/467 (66%), Positives = 368/467 (78%)
Query: 1 MNMLGALAKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF 60
MN+L +LA V S+TM SRVLGF RD ++AR FGAGMATDAFFVAFKLPNLLRR+FAEGAF
Sbjct: 14 MNLLKSLAAVSSMTMFSRVLGFARDAIVARIFGAGMATDAFFVAFKLPNLLRRIFAEGAF 73
Query: 61 AQAFVPILAEYKETRSKEATEAFIRHVAGMLSFVLIVVTALGILAAPWVIYVSAPGFTKD 120
+QAFVPILAEYK + +EAT F+ +V+G+L+ L VVT G+LAAPWVI V+APGF
Sbjct: 74 SQAFVPILAEYKSKQGEEATRIFVAYVSGLLTLALAVVTVAGMLAAPWVIMVTAPGFADT 133
Query: 121 ADKFQLSISLLRITFPYILLISLSSFVGSILNSYHKFGIPAFTPTFLNISFIVFALFFVP 180
ADKF L+ LLRITFPYILLISL+S VG+ILN++++F IPAF PTFLNIS I FALF P
Sbjct: 134 ADKFALTTQLLRITFPYILLISLASLVGAILNTWNRFSIPAFAPTFLNISMIGFALFAAP 193
Query: 181 YFDPPVTALAWAVFVGGILQLGFQLPWLAKLGFLKLPKLNFKDAAVNRVMKQMAPAILGV 240
YF+PPV ALAWAV VGG+LQL +QLP+L K+G L LP++NF+D RV+KQM PAILGV
Sbjct: 194 YFNPPVLALAWAVTVGGVLQLVYQLPYLKKIGMLVLPRINFRDTGAMRVVKQMGPAILGV 253
Query: 241 SVAQISLVINTIFASYLQSGSVSWMYYADRMMELRRGVLGAALGTILLPTLSKHSANQDT 300
SV+QISL+INTIFAS+L SGSVSWMYYADR+ME GVLG ALGTILLP+LSK A+ +
Sbjct: 254 SVSQISLIINTIFASFLASGSVSWMYYADRLMEFPSGVLGVALGTILLPSLSKSFASGNH 313
Query: 301 EQFSALLDWGLRLCMLLTLPAAAGLAVLSFPLVATLFMYREFTLFDAQMTQHALIAYSFG 360
+++ L+DWGLRLC LL LP+A L +L+ PL +LF Y +FT FDA MTQ ALIAYS G
Sbjct: 314 DEYCRLMDWGLRLCFLLALPSAVALGILAKPLTVSLFQYGKFTAFDAAMTQRALIAYSVG 373
Query: 361 LIGLIMIKVLASGFYARQNIKTPVKIAIFTLICTQLMNLAFIGPLKHAGLSLAIGLGACI 420
LIGLI++KVLA GFY+RQ+IKTPVKIAI TLI TQLMNLAFIGPLKHAGLSL+IGL AC+
Sbjct: 374 LIGLIVVKVLAPGFYSRQDIKTPVKIAIVTLIMTQLMNLAFIGPLKHAGLSLSIGLAACL 433
Query: 421 NAGLLFFLLRKHGIYRPGRGWXXXXXXXXXXXXVMCGGLWAAQACLP 467
NA LL++ LRK I+ P GW VM L+ +P
Sbjct: 434 NASLLYWQLRKQNIFTPQPGWMWFLMRLIISVLVMAAVLFGVLHIMP 480
Score = 70 (33.4 bits), Expect = 1.1e-220, Sum P(2) = 1.1e-220
Identities = 14/41 (34%), Positives = 23/41 (56%)
Query: 469 EWAHAGGMRKAGQLCILIAVGGGLYFASLAALGFRPRHFKR 509
EW+ + + +L ++ G YFA+LA LGF+ + F R
Sbjct: 481 EWSQGSMLWRLLRLMAVVIAGIAAYFAALAVLGFKVKEFVR 521
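The percentages in the two summaries above follow directly from the reported counts. As a rough check, the helper below recomputes them. It assumes the search program truncates rather than rounds, which is an inference from the figures in this document and not a documented property. `truncated_percent` and the `REPORTED` table are illustrative names, not part of any search tool.

```python
def truncated_percent(count: int, length: int) -> int:
    """Integer percentage of count/length, truncated toward zero."""
    return 100 * count // length

# (label, count, alignment length, percentage as printed in the summary)
REPORTED = [
    ("identities, first segment", 309, 467, 66),
    ("positives, first segment", 368, 467, 78),
    ("identities, second segment", 14, 41, 34),
    ("positives, second segment", 23, 41, 56),
]

for label, count, length, printed in REPORTED:
    recomputed = truncated_percent(count, length)
    status = "consistent" if recomputed == printed else "differs"
    print(f"{label}: {count}/{length} -> {recomputed}% (printed {printed}%, {status})")
```

Under that truncation assumption, all four printed values agree with the counts.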
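Based on this analysis, including the homology with a virulence factor of Salmonella typhimurium, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 15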
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 65>:
1 atGATTAAAA TCAAAAAAGG TCTAAACCTG CCCATCGCGG GCAGACCGGA
51 GCAAGCCGTT tACGACGGCC CGGCCaTTAC CGAAGtCGCG TTGCTTGGCG
101 AAGAATATGC CGGTATGCGC CCCTCGATGA AAGTCAAGGA AGGCGATGCC
151 GTcAAAAAAG GCCAAGTGCT GTTTGAAGAC AAAAAGAATC CGGGCGTGGT
201 GTTTACTGCG CCGGCTTCAG GcAAAATCGC CGCGATTCAC CGTGGCGAAA
251 AGCGCGTACT TCAGTCAGTC GTGATTGCCG TTGAArGCAA CGACGAAATC
301 GAGTTTGAAC GCTACGCACC TGAAGCGCTG GCAAACTTAA GCGGCGAAGA
351 AGTGCGCCGC AACCTGATCC AATCCGGTTT GTGGACTGCG CTGCGCACCC
401 GTCCGTTCAG CAAAATTCCT GCCGTCGATG CCGAGCCGTT CGCCATCTTC
451 GTCAATGCGA tGGACACCAA TCCG..
This corresponds to the amino acid sequence <SEQ ID 66; ORF22>:
1 MIKIKKGLNL PIAGRPEQAV YDGPAITEVA LLGEEYAGMR PSMKVKEGDA
51 VKKGQVLFED KKNPGVVFTA PASGKIAAIH RGEKRVLQSV VIAVEXNDEI
101 EFERYAPEAL ANLSGEEVRR NLIQSGLWTA LRTRPFSKIP AVDAEPFAIF
151 VNAMDTNP..
Further work revealed the complete nucleotide sequence <SEQ ID 67>:
1 ATGATTAAAA TCAAAAAAGG TCTAAACCTG CCCATCGCGG GCAGACCGGA
51 GCAAGCCGTT TACGACGGCC CGGCCATTAC CGAAGTCGCG TTGCTTGGCG
101 AAGAATATGC CGGTATGCGC CCCTCGATGA AAGTCAAGGA AGGCGATGCC
151 GTCAAAAAAG GCCAAGTGCT GTTTGAAGAC AAAAAGAATC CGGGCGTGGT
201 GTTTACTGCG CCGGCTTCAG GCAAAATCGC CGCGATTCAC CGTGGCGAAA
251 AGCGCGTACT TCAGTCAGTC GTGATTGCCG TTGAAGGCAA CGACGAAATC
301 GAGTTTGAAC GCTACGCACC TGAAGCGCTG GCAAACTTAA GCGGCGAAGA
351 AGTGCGCCGC AACCTGATCC AATCCGGTTT GTGGACTGCG CTGCGCACCC
401 GTCCGTTCAG CAAAATTCCT GCCGTCGATG CCGAGCCGTT CGCCATCTTC
451 GTCAATGCGA TGGACACCAA TCCGCTGGCT GCCGACCCTA CGGTCATTAT
501 CAAAGAAGCC GCCGAGGATT TCAAACGCGG CCTGTTGGTA TTGAGCCGTT
551 TGACCGAACG CAAAATCCAT GTTTGTAAGG CAGCTGGCGC AGACGTGCCG
601 TCTGAAAATG CTGCCAACAT CGAAACACAT GAATTCGGCG GCCCGCATCC
651 TGCCGGTTTG AGTGGCACGC ACATTCATTT CATCGAGCCG GTCGGCGCGA
701 ATAAAACCGT GTGGACCATC AATTATCAAG ATGTAATTAC CATTGGCCGT
751 TTGTTTGCAA CAGGCCGTCT GAACACCGAG CGCGTGATTG CCCTAGGTGG
801 TTCTCAAGTC AACAAACCGC GCCTCTTGCG TACCGTTTTG GGTGCGAAAG
851 TATCGCAAAT TACTGCGGGC GAATTGGTTG ACACAGACAA CCGCGTGATT
901 TCCGGTTCGG TATTGAACGG CGCGATTACA CAAGGCGCGC ACGATTATTT
951 GGGACGCTAC CACAATCAGA TTTCCGTTAT CGAAGAAGGC CGCAGCAAAG
1001 AGCTGTTCGG
CTGGGTTGCG CCGCAGCCGG ACAAATACTC CATCACGCGT
1051 ACAACCCTCG GCCATTTCCT GAAAAACAAA CTCTTCAAGT TCAACACAGC
1101 CGTCAACGGC GGCGACCGCG CCATGGTGCC GATTGGTACT TACGAGCGCG
1151 TGATGCCCTT GGATATCCTG CCCACCCTGC TTTTGCGCGA TTTAATCGTC
1201 GGCGATACCG ACAGCGCGCA GGCATTGGGT TGCTTGGAAT TGGACGAAGA
1251 AGACCTCGCT TTGTGCAGCT TCGTCTGCCC GGGCAAATAC GAATACGGCC
1301 CGCTGTTGCG CAAAGTGCTG GAAACCATTG AGAAGGAAGG CTGA
This corresponds to the amino acid sequence <SEQ ID 68; ORF22-1>:
1 MIKIKKGLNL PIAGRPEQAV YDGPAITEVA LLGEEYAGMR PSMKVKEGDA
51 VKKGQVLFED KKNPGVVFTA PASGKIAAIH RGEKRVLQSV VIAVEGNDEI
101 EFERYAPEAL ANLSGEEVRR NLIQSGLWTA LRTRPFSKIP AVDAEPFAIF
151 VNAMDTNPLA ADPTVIIKEA AEDFKRGLLV LSRLTERKIH VCKAAGADVP
201 SENAANIETH EFGGPHPAGL SGTHIHFIEP VGANKTVWTI NYQDVITIGR
251 LFATGRLNTE RVIALGGSQV NKPRLLRTVL GAKVSQITAG ELVDTDNRVI
301 SGSVLNGAIT QGAHDYLGRY HNQISVIEEG RSKELFGWVA PQPDKYSITR
351 TTLGHFLKNK LFKFNTAVNG GDRAMVPIGT YERVMPLDIL PTLLLRDLIV
401 GDTDSAQALG CLELDEEDLA LCSFVCPGKY EYGPLLRKVL ETIEKEG*
Further work identified the corresponding gene in N. meningitidis strain A <SEQ ID 69>:
1 ATGATTAAAA TCAAAAAAGG TCTAAACCTG CCCATCGCGG GCAGACCGGA
51 GCAAGTCATT TATGACGGGC CCGTCATTAC CGAAGTCGCG TTGCTTGGCG
101 AAGAATATGC CGGTATGCGC CCCTNGATGA AAGTCAAGGA AGGCGATGCC
151 GTCAAAAAAG GCCAAGTGCT GTTTGAAGAC AAAAAGNATC CGGGCGTGGT
201 GTTTACCGCG CCNGTTTCAG GCAAAATCGC CGCCATCCAT CGCGGCGAAA
251 AGCGCGTACT TCAGTCGGTC GTGATTGCCG TTGAAGGCAA CGACGAAATC
301 GAGTTCGAAC GCTACGCGCC CGAAGCGTTG GCAAACTTAA GCGGCGANGA
351 ANTNNGNNGC AATCTGATCC AATCCGGTTT GTGGACTGCG CTGCGTANCC
401 GTCCGTTCAG CAAAATCCCT GCCGTCGATG CCGAGCCGTT CGCCATCTTC
451 GTCAATGCGA TGGACACCAA TCCGC NGCG GCAGACCCTG TGGTTGTGAT
501 CAAAGAAGCC GNCGANGATT TCAGACGANG TNTGCTGGTA TTGAGCCGTT
551 TGACCGAGCG TAAAATCCAT GTGTGTAAGG CAGCTGGCGC AGACGTGCCG
601 TCTGAAAATG CTGCCAACAT CGAAACACAT GAATTCGGCG GCCCGCATCC
651 GGCCGGTTTG AGTGGCACGC ACATTCATTT CATTGAGCCG GTCGGTGCAA
701 ACAAAACCGT TTGGACCATC AATTATCAAG ATGTAATTGC CATCGGACGT
751 TTGTTTGCAA CAGGCCGTCT GAACACCGAG CGCGTGATTG CTTTGGGTGG
801 TTCTCAAGTC AACAAACCAC GCCTCTTGCG TACCGTTTTG GGTGCGAAAG
851 TATCGCAAAT TACTGCGGGC GAATTGGTTG
ACGCAGACAA CCGCGTGATT
901 TCCGGTTCGG TATTGAACGG CGCGATTACA CAAGGCGCGC ACGATTATTT
951 GGGACGCTAC CACAATCAGA TTTCCGTTAT CGAAGAAGGC CGCAGCAAAG
1001 AGCTGTTCGG CTGGGTTGCG CCGCAGCCGG ACAAATACTC CATCACGCGT
1051 ACGACCCTCG GCCATTTCCT GAAAAACAAA CTCTTCAAGT TCACGACAGC
1101 CGTCAACGGT GGCGACCGCG CCATGGTGCC GATTGGTACT TACGAGCGCG
1151 TAATGCCGCT AGACATCCTG CCTACCCTGC TTTTGCGCGA TTTAATCGTC
1201 GGCGATACCG ACAGCGCGCA AGCATTGGGT TGCTTGGAAT TGGACGAAGA
1251 AGACCTCGCT TTGTGCAGCT TCGTCTGCCC GGGCAAATAC GAATANGGCC
1301 CGCTGTTGCG TAAGGTGCTG GAAACCNTTG AGAAGGAAGG CTGA
This encodes a protein having the amino acid sequence <SEQ ID 70; ORF22a>:
1 MIKIKKGLNL PIAGRPEQVI YDGPVITEVA LLGEEYAGMR PXMKVKEGDA
51 VKKGQVLFED KKXPGVVFTA PVSGKIAAIH RGEKRVLQSV VIAVEGNDEI
101 EFERYAPEAL ANLSGXEXXX NLIQSGLWTA LRXRPFSKIP AVDAEPFAIF
151 VNAMDTNPLA ADPVVVIKEA XXDFRRXXLV LSRLTERKIH VCKAAGADVP
201 SENAANIETH EFGGPHPAGL SGTHIHFIEP VGANKTVWTI NYQDVIAIGR
251 LFATGRLNTE RVIALGGSQV NKPRLLRTVL GAKVSQITAG ELVDADNRVI
301 SGSVLNGAIT QGAHDYLGRY HNQISVIEEG RSKELFGWVA PQPDKYSITR
351 TTLGHFLKNK LFKFTTAVNG GDRAMVPIGT YERVMPLDIL PTLLLRDLIV
401 GDTDSAQALG CLELDEEDLA LCSFVCPGKY EXGPLLRKVL ETXEKEG*
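The correspondence between the nucleotide listings and the amino acid listings in this example can be checked by translating in frame 1. The sketch below is illustrative only and is not the software used to derive the listed sequences. It assumes the standard genetic code and maps any codon containing an ambiguity code (such as the `r` in row 251 of SEQ ID 65) to `X`; `translate` and `CODON_TABLE` are hypothetical names.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; unresolved codons become 'X'."""
    seq = "".join(dna.split()).upper()  # drop the spaces between 10-mers
    usable = len(seq) - len(seq) % 3    # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )

# First 30 nucleotides of SEQ ID 67 (codons 1-10).
print(translate("ATGATTAAAA TCAAAAAAGG TCTAAACCTG"))
# Nucleotides 271-300 of SEQ ID 65 (codons 91-100), including the ambiguous 'r'.
print(translate("GTGATTGCCG TTGAArGCAA CGACGAAATC"))
```

The first call reproduces the opening residues of ORF22-1, and the second shows how an ambiguous base in the partial sequence gives rise to the `X` at residue 96 of ORF22.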
The originally identified partial strain B sequence (ORF22) shows 94.2% identity over a 158 aa overlap with ORF22a:
10 20 30 40 50 60
orf22.pep MIKIKKGLNLPIAGRPEQAVYDGPAITEVALLGEEYAGMRPSMKVKEGDAVKKGQVLFED
||||||||||||||||||::||||:|||||||||||||||| ||||||||||||||||||
orf22a MIKIKKGLNLPIAGRPEQVIYDGPVITEVALLGEEYAGMRPXMKVKEGDAVKKGQVLFED
10 20 30 40 50 60
70 80 90 100 110 120
orf22.pep KKNPGVVFTAPASGKIAAIHRGEKRVLQSVVIAVEXNDEIEFERYAPEALANLSGEEVRR
|| ||||||||:||||||||||||||||||||||| ||||||||||||||||||| |
orf22a KKXPGVVFTAPVSGKIAAIHRGEKRVLQSVVIAVEGNDEIEFERYAPEALANLSGXEXXX
70 80 90 100 110 120
130 140 150
orf22.pep NLIQSGLWTALRTRPFSKIPAVDAEPFAIFVNAMDTNP
||||||||||||:|||||||||||||||||||||||||
orf22a NLIQSGLWTALRXRPFSKIPAVDAEPFAIFVNAMDTNPLAADPVVVIKEAXXDFRRXXLV
130 140 150 160 170 180
The complete strain B sequence (ORF22-1) and ORF22a show 94.9% identity over a 447 aa overlap:
10 20 30 40 50 60
orf22a.pep MIKIKKGLNLPIAGRPEQVIYDGPVITEVALLGEEYAGMRPXMKVKEGDAVKKGQVLFED
||||||||||||||||||::||||:|||||||||||||||| ||||||||||||||||||
orf22-1 MIKIKKGLNLPIAGRPEQAVYDGPAITEVALLGEEYAGMRPSMKVKEGDAVKKGQVLFED
10 20 30 40 50 60
70 80 90 100 110 120
orf22a.pep KKXPGVVFTAPVSGKIAAIHRGEKRVLQSVVIAVEGNDEIEFERYAPEALANLSGXEXXX
|| ||||||||:||||||||||||||||||||||||||||||||||||||||||| |
orf22-1 KKNPGVVFTAPASGKIAAIHRGEKRVLQSVVIAVEGNDEIEFERYAPEALANLSGEEVRR
70 80 90 100 110 120
130 140 150 160 170 180
orf22a.pep NLIQSGLWTALRXRPFSKIPAVDAEPFAIFVNAMDTNPLAADPVVVIKEAXXDFRRXXLV
||||||||||||:||||||||||||||||||||||||||||||:|:|||| ||:| ||
orf22-1 NLIQSGLWTALRTRPFSKIPAVDAEPFAIFVNAMDTNPLAADPTVIIKEAAEDFKRGLLV
130 140 150 160 170 180
190 200 210 220 230 240
orf22a.pep LSRLTERKIHVCKAAGADVPSENAANIETHEFGGPHPAGLSGTHIHFIEPVGANKTVWTI
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf22-1 LSRLTERKIHVCKAAGADVPSENAANIETHEFGGPHPAGLSGTHIHFIEPVGANKTVWTI
190 200 210 220 230 240
250 260 270 280 290 300
orf22a.pep NYQDVIAIGRLFATGRLNTERVIALGGSQVNKPRLLRTVLGAKVSQITAGELVDADNRVI
||||||:|||||||||||||||||||||||||||||||||||||||||||||||:|||||
orf22-1 NYQDVITIGRLFATGRLNTERVIALGGSQVNKPRLLRTVLGAKVSQITAGELVDTDNRVI
250 260 270 280 290 300
310 320 330 340 350 360
orf22a.pep SGSVLNGAITQGAHDYLGRYHNQISVIEEGRSKELFGWVAPQPDKYSITRTTLGHFLKNK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf22-1 SGSVLNGAITQGAHDYLGRYHNQISVIEEGRSKELFGWVAPQPDKYSITRTTLGHFLKNK
310 320 330 340 350 360
370 380 390 400 410 420
orf22a.pep LFKFTTAVNGGDRAMVPIGTYERVMPLDILPTLLLRDLIVGDTDSAQALGCLELDEEDLA
||||:|||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf22-1 LFKFNTAVNGGDRAMVPIGTYERVMPLDILPTLLLRDLIVGDTDSAQALGCLELDEEDLA
370 380 390 400 410 420
430 440
orf22a.pep LCSFVCPGKYEXGPLLRKVLETXEKEGX
||||||||||| |||||||||| |||||
orf22-1 LCSFVCPGKYEYGPLLRKVLETIEKEGX
430 440
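An identity figure of this kind can be recomputed from the aligned rows themselves. The sketch below is a minimal illustration, not the alignment program that produced the comparison above. The function name `percent_identity` is hypothetical, the rows are assumed to be already aligned and of equal length, and treating an undetermined residue (`X`) as a mismatch is an assumption. The two strings are the residue 241-300 rows of orf22a.pep and orf22-1 copied from the alignment.

```python
def percent_identity(row_a: str, row_b: str) -> tuple[int, int, float]:
    """Return (identical positions, compared positions, percent identity)."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    # Skip columns where both rows carry a gap; everything else is compared.
    pairs = [(a, b) for a, b in zip(row_a, row_b) if not (a == "-" and b == "-")]
    # An undetermined residue (X) or a gap never counts as identical (assumption).
    identical = sum(1 for a, b in pairs if a == b and a not in "X-")
    return identical, len(pairs), 100.0 * identical / len(pairs)

ORF22A_241_300 = "NYQDVIAIGRLFATGRLNTERVIALGGSQVNKPRLLRTVLGAKVSQITAGELVDADNRVI"
ORF22_1_241_300 = "NYQDVITIGRLFATGRLNTERVIALGGSQVNKPRLLRTVLGAKVSQITAGELVDTDNRVI"

identical, compared, pct = percent_identity(ORF22A_241_300, ORF22_1_241_300)
print(f"{identical}/{compared} identical ({pct:.1f}%)")
```

Applied row by row and summed over the whole overlap, the same count gives an overall figure, although the result depends on how `X` positions are scored.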
Further work identified a partial gene sequence <SEQ ID 71> from N. gonorrhoeae, which encodes the following amino acid sequence <SEQ ID 72; ORF22ng>:
1 MIKIKKGLNL PIAGRPEQVI YDGPAITEVA LLGEEYVGMR PSMKIKEGEA
51 VKKGQVLFED KKNPGVVFTA PASGKIAAIH RGEKRVLQSV VIAVEGNDEI
101 EFERYVPEAL AKLSSEKVRR NLIQSGLWTA LRTRPFSKIP AVDAEPFAIF
151 VNAMDTNPLA ADPTVIIKEA AEDFKRGLLV LSRLTERKIH VCKAAGADVP
201 SENAANIETH EFGGPHPAGL SGTHIHFIEP VGANKTVWTI NYQDVIAIGR
251 LFVTGRLNTE RVVALGGLQV NKPRLLRTVL GAKVSQLTAG ELVDADNRVI
301 SGSVLNGAIA QGAHDYLGRY HN*
Further work identified the complete gonococcal gene <SEQ ID 73>:
1 ATGATTAAAA TCAAAAAAGG TCTAAATCTG CCCATCGCGG GCAGACCGGA
51 GCAAGTCATT TATGACGGCC CGGCCATTAC CGAAGTCGCG TTGCTTGGCG
101 AAGAATATGT CGGCATGCGC CCCTCGATGA AAATCAAGGA AGGTGAAGCC
151 GTCAAAAAAG GCCAAGTGCT GTTTGAAGAC AAAAAGAATC CGGGCGTAGT
201 ATTTACTGCG CCGGCTTCAG GCAAAATCGC CGCTATTCAC CGTGGCGAAA
251 AGCGCGTACT TCAGTCAGTC GTGATTGCCG TTGAAGGCAA CGACGAAATC
301 GAGTTCGAAC GCTACGTACC TGAAGCGCTG GCAAAATTGA GCAGCGAAAA
351 AGTGCGCCGC AACCTGATTC AATCAGGCTT ATGGACTGCG CTTCGCACCC
401 GTCCGTTCAG CAAAATCCCT GCCGTAGATG CCGAGCCGTT CGCCATCTTC
451 GTCAATGCGA TGGACACCAA TCCGCTGGCT GCCGACCCTA CGGTCATCAT
501 CAAAGAAGCC GCCGAAGACT TCAAACGCGG CCTGTTGGTA TTGAGCCGCC
551 TGACCGAACG TAAAATCCAT GTGTGTAAAG CAGCAGGCGC AGACGTGCCG
601 TCTGAAAATG CTGCCAATAT CGAAACACAT GAATTTGGCG GCCCGCATCC
651 TGCCGGCTTG AGTGGCACGC ACATTCATTT CATCGAGCCA GTCGGCGCGA
701 ATAAAACCGT GTGGACCATC AATTATCAAG ACGTGATTGC TATCGGACGT
751 TTGTTCGTAA CAGGCCGTCT GAATACCGAG CGCGTGGTTG CCTTGGGCGG
801 CCTGCAAGTC AACAAACCGC GCCTCTTGCG TACCGTTTTG GGTGCGAAGG
851 TGTCTCAACT TACCGCCGGC GAATTGGTTG ACGCGGACAA CCGCGTGATT
901 TCCGGTTCGG TATTGAACGG TGCGATTGCA CAAGGCGCGC ATGATTATTT
951 GGGACGCTAC CACAATCAGA TTTCCGTTAT CGAAGAAGGC CGCAGCAAAG
1001 AGCTGTTCGG CTGGGTTGCG CCGCAGCCGG ACAAATACTC CATCACGCGC
1051 ACCACTCTCG GCCATTTCCT AAAAAACAAA CTCTTCAAGT TCACGACAGC
1101 CGTCAACGGC GGCGACCGCG CCATGGTACC GATCGGCACT TATGAGCGCG
1151 TAATGCCGTT GGACATCCTG CCTACCTTGC TTTTGCGCGA TTTAATCGTC
1201 GGCGATACCG ACAGCGCGCA GGCTTTGGGT TGCTTGGAAT TGGACGAAGA
1251 AGACCTCGCT TTGTGCAGCT TCGTCTGCCC GGGCAAATAC GAATACGGCC
1301 CGCTGTTGCG
CAAAGTGCTG GAAACCATTG AGAAGGAAGG CTGA
This encodes a protein having the amino acid sequence <SEQ ID 74; ORF22ng-1>:
1 MIKIKKGLNL PIAGRPEQVI YDGPAITEVA LLGEEYVGMR PSMKIKEGEA
51 VKKGQVLFED KKNPGVVFTA PASGKIAAIH RGEKRVLQSV VIAVEGNDEI
101 EFERYVPEAL AKLSSEKVRR NLIQSGLWTA LRTRPFSKIP AVDAEPFAIF
151 VNAMDTNPLA ADPTVIIKEA AEDFKRGLLV LSRLTERKIH VCKAAGADVP
201 SENAANIETH EFGGPHPAGL SGTHIHFIEP VGANKTVWTI NYQDVIAIGR
251 LFVTGRLNTE RVVALGGLQV NKPRLLRTVL GAKVSQLTAG ELVDADNRVI
301 SGSVLNGAIA QGAHDYLGRY HNQISVIEEG RSKELFGWVA PQPDKYSITR
351 TTLGHFLKNK LFKFTTAVNG GDRAMVPIGT YERVMPLDIL PTLLLRDLIV
401 GDTDSAQALG CLELDEEDLA LCSFVCPGKY EYGPLLRKVL ETIEKEG*
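The listings in this document are written in blocks of ten with a running position number at the start of each row. To work with a sequence as one continuous string, those numbers and spaces have to be stripped first. The helper below is a small sketch of that step; `parse_listing` is a hypothetical name, and the handling of the `..` marker that flags a partial sequence is an assumption about how such listings should be cleaned.

```python
import re

def parse_listing(text: str) -> str:
    """Join a numbered sequence listing into one continuous string."""
    # Remove row numbers, whitespace and the '..' partial-sequence marker;
    # letters and the '*' stop symbol are kept.
    return re.sub(r"[\d\s.]+", "", text)

listing = "1 MIKIKKGLNL PIAGRPEQVI YDGPAITEVA LLGEEYVGMR PSMKIKEGEA51 VKKGQVLFED"
sequence = parse_listing(listing)
print(len(sequence), sequence)
```

The same function works on the nucleotide listings, since lower-case and ambiguity letters are left untouched.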
The originally identified partial strain B sequence (ORF22) shows 93.7% identity over a 158 aa overlap with ORF22ng:
orf22.pep MIKIKKGLNLPIAGRPEQAVYDGPAITEVALLGEEYAGMRPSMKVKEGDAVKKGQVLFED 60
||||||||||||||||||::||||||||||||||||:|||||||:|||:|||||||||||
orf22ng MIKIKKGLNLPIAGRPEQVIYDGPAITEVALLGEEYVGMRPSMKIKEGEAVKKGQVLFED 60
orf22.pep KKNPGVVFTAPASGKIAAIHRGEKRVLQSVVIAVEXNDEIEFERYAPEALANLSGEEVRR 120
||||||||||||||||||||||||||||||||||| |||||||||:|||||:||:|:|||
orf22ng KKNPGVVFTAPASGKIAAIHRGEKRVLQSVVIAVEGNDEIEFERYVPEALAKLSSEKVRR 120
orf22.pep NLIQSGLWTALRTRPFSKIPAVDAEPFAIFVNAMDTNP 158
||||||||||||||||||||||||||||||||||||||
orf22ng NLIQSGLWTALRTRPFSKIPAVDAEPFAIFVNAMDTNPLAADPTVIIKEAAEDFKRGLLV 180
The complete sequences from strain B (ORF22-1) and gonococcus (ORF22ng) show 96.2% identity over a 447 aa overlap:
10 20 30 40 50 60
orf22-1.pep MIKIKKGLNLPIAGRPEQAVYDGPAITEVALLGEEYAGMRPSMKVKEGDAVKKGQVLFED
||||||||||||||||||::||||||||||||||||:|||||||:|||:|||||||||||
orf22ng-1 MIKIKKGLNLPIAGRPEQVIYDGPAITEVALLGEEYVGMRPSMKIKEGEAVKKGQVLFED
10 20 30 40 50 60
70 80 90 100 110 120
orf22-1.pep KKNPGVVFTAPASGKIAAIHRGEKRVLQSVVIAVEGNDEIEFERYAPEALANLSGEEVRR
|||||||||||||||||||||||||||||||||||||||||||||:|||||:||:|:|||
orf22ng-1 KKNPGVVFTAPASGKIAAIHRGEKRVLQSVVIAVEGNDEIEFERYVPEALAKLSSEKVRR
70 80 90 100 110 120
130 140 150 160 170 180
orf22-1.pep NLIQSGLWTALRTRPFSKIPAVDAEPFAIFVNAMDTNPLAADPTVIIKEAAEDFKRGLLV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf22ng-1 NLIQSGLWTALRTRPFSKIPAVDAEPFAIFVNAMDTNPLAADPTVIIKEAAEDFKRGLLV
130 140 150 160 170 180
190 200 210 220 230 240
orf22-1.pep LSRLTERKIHVCKAAGADVPSENAANIETHEFGGPHPAGLSGTHIHFIEPVGANKTVWTI
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf22ng-1 LSRLTERKIHVCKAAGADVPSENAANIETHEFGGPHPAGLSGTHIHFIEPVGANKTVWTI
190 200 210 220 230 240
250 260 270 280 290 300
orf22-1.pep NYQDVITIGRLFATGRLNTERVIALGGSQVNKPRLLRTVLGAKVSQITAGELVDTDNRVI
||||||:|||||:|||||||||:|||| ||||||||||||||||||:|||||||:|||||
orf22ng-1 NYQDVIAIGRLFVTGRLNTERVVALGGLQVNKPRLLRTVLGAKVSQLTAGELVDADNRVI
250 260 270 280 290 300
310 320 330 340 350 360
orf22-1.pep SGSVLNGAITQGAHDYLGRYHNQISVIEEGRSKELFGWVAPQPDKYSITRTTLGHFLKNK
|||||||||:||||||||||||||||||||||||||||||||||||||||||||||||||
orf22ng-1 SGSVLNGAIAQGAHDYLGRYHNQISVIEEGRSKELFGWVAPQPDKYSITRTTLGHFLKNK
310 320 330 340 350 360
370 380 390 400 410 420
orf22-1.pep LFKFNTAVNGGDRAMVPIGTYERVMPLDILPTLLLRDLIVGDTDSAQALGCLELDEEDLA
||||:|||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf22ng-1 LFKFTTAVNGGDRAMVPIGTYERVMPLDILPTLLLRDLIVGDTDSAQALGCLELDEEDLA
370 380 390 400 410 420
430 440
orf22-1.pep LCSFVCPGKYEYGPLLRKVLETIEKEGX
||||||||||||||||||||||||||||
orf22ng-1 LCSFVCPGKYEYGPLLRKVLETIEKEGX
430 440
Computer analysis of these sequences gave the following results:
Homology with the 48 kDa outer membrane protein of Actinobacillus pleuropneumoniae (accession number U24492)
ORF22 and this 48 kDa protein show 72% aa identity over a 158 aa overlap:
Orf22 1 MIKIKKGLNLPIAGRPEQAVYDGPAITEVALLGEEYAGMRPSMKVKEGDAVKKGQVLFED 60
MI IKKGL+LPIAG P Q +++G + EVA+LGEEY GMRPSMKV+EGD VKKGQVLFED
48kDa 1 MITIKKGLDLPIAGTPAQVIHNGNTVNEVAMLGEEYVGMRPSMKVREGDVVKKGQVLFED 60
Orf22 61 KKNPGVVFTAPASGKIAAIHRGEKRVLQSVVIAVEXNDEIEFERYAPEALANLSGEEVRR 120
KKNPGVVFTAPASG + I+RGEKRVLQSVVI VE +++I F RY LA+LS E+V++
48kDa 61 KKNPGVVFTAPASGTVVTINRGEKRVLQSVVIKVEGDEQITFTRYEAAQLASLSAEQVKQ 120
Orf22 121 NLIQSGLWTALRTRPFSKIPAVDAEPFAIFVNAMDTNP 158
NLI+SGLWTA RTRPFSK+PA+DA P +IFVNAMDTNP
48kDa 121 NLIESGLWTAFRTRPFSKVPALDAIPSSIFVNAMDTNP 158
ORF22a also shows homology to the 48 kDa Actinobacillus pleuropneumoniae protein:
gi|1185395 (U24492) 48 kDa outer membrane protein [Actinobacillus pleuropneumoniae] Length = 449
Score = 530 bits (1351), Expect = e-150
Identities = 274/450 (60%), Positives = 323/450 (70%), Gaps = 4/450 (0%)
Query: 1 MIKIKKGLNLPIAGRPEQVIYDGPVITEVALLGEEYAGMRPXMKVKEGDAVKKGQVLFED 60
MI IKKGL+LPIAG P QVI++G + EVA+LGEEY GMRP MKV+EGD VKKGQVLFED
Sbjct: 1 MITIKKGLDLPIAGTPAQVIHNGNTVNEVAMLGEEYVGMRPSMKVREGDVVKKGQVLFED 60
Query: 61 KKXPGVVFTAPVSGKIAAIHRGEKRVLQSVVIAVEGNDEIEFERYAPEALANLSGXEXXX 120
KK PGVVFTAP SG + I+RGEKRVLQSVVI VEG+++I F RY LA+LS +
Sbjct: 61 KKNPGVVFTAPASGTVVTINRGEKRVLQSVVIKVEGDEQITFTRYEAAQLASLSAEQVKQ 120
Query: 121 NLIQSGLWTALRXRPFSKIPAVDAEPFAIFVNAMDTNPLAADPVVVIKEAXXDFRRXXLV 180
NLI+SGLWTA R RPFSK+PA+DA P +IFVNAMDTNPLAADP VV+KE DF+ V
Sbjct: 121 NLIESGLWTAFRTRPFSKVPALDAIPSSIFVNAMDTNPLAADPEVVLKEYETDFKDGLTV 180
Query: 181 LSRL--TERKIHVCKAAGADVP-SENAANIETHEFGGPHPAGLSGTHIHFIEPVGANKTV 237
L+RL ++ +++CK A +++P S I F G HPAGL GTHIHF++PVGA K V
Sbjct: 181 LTRLFNGQKPVYLCKDADSNIPLSPAIEGITIKSFSGVHPAGLVGTHIHFVDPVGATKQV 240
Query: 238 WTINYQDVIAIGRLFATGRLNTERVIALGGSQVNKPRLLRTVLGAKVSQITAGELVDADN 297
W +NYQDVIAIG+LF TG L T+R+I+L G QV PRL+RT LGA +SQ+TA EL +N
Sbjct: 241 WHLNYQDVIAIGKLFTTGELFTDRIISLAGPQVKNPRLVRTRLGANLSQLTANELNAGEN 300
Query: 298 RVISGSVLNGAITQGAHDYLGRYHNQISVIEEGRSKELFGWVAPQPDKYSITRTTLGHFL 357
RVISGSVL+GA G DYLGRY Q+SV+ EGR KELFGW+ P DK+SITRT LGHF
Sbjct: 301 RVISGSVLSGATAAGPVDYLGRYALQVSVLAEGREKELFGWIMPGSDKFSITRTVLGHFG 360
Query: 358 KNKLFKFTTAVNGGDRAMVPIGTYERVMXXXXXXXXXXXXXXVGDTDSAQXXXXXXXXXX 417
K KLF FTTAV+GG+RAMVPIG YERVM GDTDSAQ
Sbjct: 361 K-KLFNFTTAVHGGERAMVPIGAYERVMPLDIIPTLLLRDLAAGDTDSAQNLGCLELDEE 419
Query: 418 XXXXXSFVCPGKYEXGPLLRKVLETXEKEG 447
++VCPGK GP+LR LE EKEG
ORF22ng-1 also shows homology with the OMP from Actinobacillus pleuropneumoniae:
gi|1185395 (U24492) 48 kDa outer membrane protein [Actinobacillus pleuropneumoniae] Length = 449
Score = 555 bits (1414), Expect = e-157
Identities = 284/450 (63%), Positives = 337/450 (74%), Gaps = 4/450 (0%)
Query: 27 MIKIKKGLNLPIAGRPEQVIYDGPAITEVALLGEEYVGMRPSMKIKEGEAVKKGQVLFED 86
MI IKKGL+LPIAG P QVI++G + EVA+LGEEYVGMRPSMK++EG+ VKKGQVLFED
Sbjct: 1 MITIKKGLDLPIAGTPAQVIHNGNTVNEVAMLGEEYVGMRPSMKVREGDVVKKGQVLFED 60
Query: 87 KKNPGVVFTAPASGKIAAIHRGEKRVLQSVVIAVEGNDEIEFERYVPEALAKLSSEKVRR 146
KKNPGVVFTAPASG + I+RGEKRVLQSVVI VEG+++I F RY LA LS+E+V++
Sbjct: 61 KKNPGVVFTAPASGTVVTINRGEKRVLQSVVIKVEGDEQITFTRYEAAQLASLSAEQVKQ 120
Query: 147 NLIQSGLWTALRTRPFSKIPAVDAEPFAIFVNAMDTNPLAADPTVIIKEAAEDFKRGLLV 206
NLI+SGLWTA RTRPFSK+PA+DA P +IFVNAMDTNPLAADP V++KE DFK GL V
Sbjct: 121 NLIESGLWTAFRTRPFSKVPALDAIPSSIFVNAMDTNPLAADPEVVLKEYETDFKDGLTV 180
Query: 207 LSRL--TERKIHVCKAAGADVP-SENAANIETHEFGGPHPAGLSGTHIHFIEPVGANKTV 263
L+RL ++ +++CK A +++P S I F G HPAGL GTHIHF++PVGA K V
Sbjct: 181 LTRLFNGQKPVYLCKDADSNIPLSPAIEGITIKSFSGVHPAGLVGTHIHFVDPVGATKQV 240
Query: 264 WTINYQDVIAIGRLFVTGRLNTERVVALGGLQVNKPRLLRTVLGAKVSQLTAGELVDADN 323
W +NYQDVIAIG+LF TG L T+R+++L G QV PRL+RT LGA +SQLTA EL +N
Sbjct: 241 WHLNYQDVIAIGKLFTTGELFTDRIISLAGPQVKNPRLVRTRLGANLSQLTANELNAGEN 300
Query: 324 RVISGSVLNGAIAQGAHDYLGRYHNQISVIEEGRSKELFGWVAPQPDKYSITRTTLGHFL 383
RVISGSVL+GA A G DYLGRY Q+SV+ EGR KELFGW+ P DK+SITRT LGHF
Sbjct: 301 RVISGSVLSGATAAGPVDYLGRYALQVSVLAEGREKELFGWIMPGSDKFSITRTVLGHFG 360
Query: 384 KNKLFKFTTAVNGGDRAMVPIGTYERVMXXXXXXXXXXXXXXVGDTDSAQXXXXXXXXXX 443
K KLF FTTAV+GG+RAMVPIG YERVM GDTDSAQ
Sbjct: 361 K-KLFNFTTAVHGGERAMVPIGAYERVMPLDIIPTLLLRDLAAGDTDSAQNLGCLELDEE 419
Query: 444 XXXXXSFVCPGKYEYGPLLRKVLETIEKEG 473
++VCPGK YGP+LR LE IEKEG
Sbjct: 420 DLALCTYVCPGKNNYGPMLRAALEKIEKEG 449
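Based on this analysis, including the homology with the outer membrane protein of Actinobacillus pleuropneumoniae, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
ORF22-1 (35.4 kDa) was cloned into pET and pGEX vectors and expressed in E. coli, as described above. The products of protein expression and purification were analysed by SDS-PAGE. Figure 5A shows the results of affinity purification of the GST-fusion protein, and Figure 5B shows the results of expression of the His-fusion in E. coli. Purified GST-fusion protein was used to immunise mice, and the mouse sera were used for ELISA (positive result) and FACS analysis (Figure 5C). These results confirm that ORF22-1 is a surface-exposed protein and a useful immunogen.
Example 16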
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 75>:
1 ..GCGnCGnAAA TCATCCATCC CC..nACGTC GTAGGCCCTG AAGCCAACTG
51 GTTTTTTATG GTAGCCAGTA CGTTTGTGAT TGCTTTGATT GGTTATTTTG
101 TTACTGAAAA AATCGTCGAA CCGCAATTGG GCCCTTATCA ATCAGATTTG
151 TCACAAGAAG AAAAAGACAT TCGGCATTCC AATGAAATCA CGCCTTTGGA
201 ATATAAAGGA TTAATTTGGG CTGGCGTGGT GTTTGTTGCC TTATCCGCCC
251 TATTGGCTTG GAGCATCGTC CCTGCCGACG GTATTTTGCG TCATCCTGAA
301 ACAGGATTGG TTTCCGGTTC GCCGTTTTTA AAATCGATTG TTGTTTTTAT
351 TTTCTTGTTG TTTGCACTGC CGGGCATTGT TTATGGCCGG GTAACCCGAA
401 GTTTGCGCGG CGAACAGGAA GTCGTTAATG CGmyGGCCGA ATCGATGAGT
451 ACTCTGGsGC TTTmTTTGsw CAkcATCTTT TTTGCCGCAC AGTTTGTCGC
501 ATTTTTTAAT TGGACGAATA TTGGGCAATA TATTGCCGTT AAAGGGGCGA
551 CGTTCTTAAA AGAAGTCGGC TTGGGCGGCA GCGTGTTGTT TATCGGTTTT
601 ATTTTAATTT GTGCTTTTAT CAATCTGATG ATAGGCTCCG CCTCCGCGCA
651 ATGGGCGGTA ACTGCGCCGA TTTTCGTCCC TATGCTGATG TTGGCCGGCT
701 ACGCGCCCGA AGTCATTCAA GCCGCTTACC GCATCGGTGA TTCCGTTACC
751 AATATTATTA CGCCGATGAT GAGTTATTTC GGGCTGATTA TGGCGACGGT
801 GrkCmmmTAC AAAAAAGATG CGGGCGTGGG TaCGcTGATT wCTATGATGT
851 TGCCGTATTC CGCTTTCTTC TTGATTGCgT GGATTGCCTT ATTCTGCATT
901 TGGGTATTTg TTTTGGGCCT GCCCGTCGGT CCCGGCGCGC CCACATTCTA
951 TCCCGCACCT TAA
This corresponds to the amino acid sequence <SEQ ID 76; ORF12>:
1 ..AXXIIHPXXV VGPEANWFFM VASTFVIALI GYFVTEKIVE PQLGPYQSDL
51 SQEEKDIRHS NEITPLEYKG LIWAGVVFVA LSALLAWSIV PADGILRHPE
101 TGLVSGSPFL KSIVVFIFLL FALPGIVYGR VTRSLRGEQE VVNAXAESMS
151 TLXLXLXXIF FAAQFVAFFN WTNIGQYIAV KGATFLKEVG LGGSVLFIGF
201 ILICAFINLM IGSASAQWAV TAPIFVPMLM LAGYAPEVIQ AAYRIGDSVT
251 NIITPMMSYF GLIMATVXXY KKDAGVGTLI XMMLPYSAFF LIAWIALFCI
301 WVFVLGLPVG PGAPTFYPAP *
Further sequence analysis revealed the complete DNA sequence <SEQ ID 77>:
1 ATGAGTCAAA CCGATACGCA ACGGGACGGA CGATTTTTAC GCACAGTCGA
51 ATGGCTGGGC AATATGTTGC CGCATCCGGT TACGCTTTTT ATTATTTTCA
101 TTGTGTTATT GCTGATTGCC TCTGCCGTCG GTGCGTATTT CGGACTATCC
151 GTCCCCGATC CGCGCCCTGT TGGTGCGAAA GGACGTGCCG ATGACGGTTT
201 GATTTACATT GTCAGCCTGC TCAATGCCGA CGGTTTTATC AAAATCCTGA
251 CGCATACCGT TAAAAATTTC ACCGGTTTCG CGCCGTTGGG AACGGTGTTG
301 GTTTCTTTAT TGGGCGTGGG GATTGCGGAA AAATCGGGCT TGATTTCCGC
351
ATTAATGCGC TTATTGCTCA CAAAATCGCC ACGCAAACTC ACTACTTTTA
401 TGGTTGTTTT TACAGGGATT TTATCTAATA CCGCTTCTGA ATTGGGCTAT
451 GTCGTCCTAA TCCCTTTGTC CGCCATCATC TTTCATTCCC TCGGCCGCCA
501 TCCGCTTGCC GGTCTGGCTG CGGCTTTCGC CGGCGTTTCG GGCGGTTATT
551 CGGCCAATCT GTTCTTAGGC ACAATCGATC CGCTCTTGGC AGGCATCACC
601 CAACAGGCGG CGCAAATCAT CCATCCCGAC TACGTCGTAG GCCCTGAAGC
651 CAACTGGTTT TTTATGGTAG CCAGTACGTT TGTGATTGCT TTGATTGGTT
701 ATTTTGTTAC TGAAAAAATC GTCGAACCGC AATTGGGCCC TTATCAATCA
751 GATTTGTCAC AAGAAGAAAA AGACATTCGG CATTCCAATG AAATCACGCC
801 TTTGGAATAT AAAGGATTAA TTTGGGCTGG CGTGGTGTTT GTTGCCTTAT
851 CCGCCCTATT GGCTTGGAGC ATCGTCCCTG CCGACGGTAT TTTGCGTCAT
901 CCTGAAACAG GATTGGTTTC CGGTTCGCCG TTTTTAAAAT CGATTGTTGT
951 TTTTATTTTC TTGTTGTTTG CACTGCCGGG CATTGTTTAT GGCCGGGTAA
1001 CCCGAAGTTT GCGCGGCGAA CAGGAAGTCG TTAATGCGAT GGCCGAATCG
1051 ATGAGTACTC TGGGGCTTTA TTTGGTCATC ATCTTTTTTG CCGCACAGTT
1101 TGTCGCATTT TTTAATTGGA CGAATATTGG GCAATATATT GCCGTTAAAG
1151 GGGCGACGTT CTTAAAAGAA GTCGGCTTGG GCGGCAGCGT GTTGTTTATC
1201 GGTTTTATTT TAATTTGTGC TTTTATCAAT CTGATGATAG GCTCCGCCTC
1251 CGCGCAATGG GCGGTAACTG CGCCGATTTT CGTCCCTATG CTGATGTTGG
1301 CCGGCTACGC GCCCGAAGTC ATTCAAGCCG CTTACCGCAT CGGTGATTCC
1351 GTTACCAATA TTATTACGCC GATGATGAGT TATTTCGGGC TGATTATGGC
1401 GACGGTGATC AAATACAAAA AAGATGCGGG CGTGGGTACG CTGATTTCTA
1451 TGATGTTGCC GTATTCCGCT TTCTTCTTGA TTGCGTGGAT TGCCTTATTC
1501 TGCATTTGGG TATTTGTTTT GGGCCTGCCC GTCGGTCCCG GCGCGCCCAC
1551 ATTCTATCCC GCACCTTAA
This corresponds to the amino acid sequence <SEQ ID 78; ORF12-1>:
1 MSQTDTQRDG RFLRTVEWLG NMLPHPVTLF IIFIVLLLIA SAVGAYFGLS
51 VPDPRPVGAK GRADDGLIYI VSLLNADGFI KILTHTVKNF TGFAPLGTVL
101 VSLLGVGIAE KSGLISALMR LLLTKSPRKL TTFMVVFTGI LSNTASELGY
151 VVLIPLSAII FHSLGRHPLA GLAAAFAGVS GGYSANLFLG TIDPLLAGIT
201 QQAAQIIHPD YVVGPEANWF FMVASTFVIA LIGYFVTEKI VEPQLGPYQS
251 DLSQEEKDIR HSNEITPLEY KGLIWAGVVF VALSALLAWS IVPADGILRH
301 PETGLVSGSP FLKSIVVFIF LLFALPGIVY GRVTRSLRGE QEVVNAMAES
351 MSTLGLYLVI IFFAAQFVAF FNWTNIGQYI AVKGATFLKE VGLGGSVLFI
401 GFILICAFIN LMIGSASAQW AVTAPIFVPM LMLAGYAPEV IQAAYRIGDS
451 VTNIITPMMS YFGLIMATVI KYKKDAGVGT LISMMLPYSA FFLIAWIALF
501 CIWVFVLGLP VGPGAPTFYP AP*
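Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF12 shows 96.3% identity over a 320 aa overlap with an ORF (ORF12a) from strain A of N. meningitidis: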
10 20 30
orf12.pep AXXIIHPXXVVGPEANWFFMVASTFVIALI
| |||| |||||||||||||||||||||
orf12a AAAFAGVSGGYSANLFLGTIDPLLAGITQQAAQIIHPDYVVGPEANWFFMVASTFVIALI
180 190 200 210 220 230
40 50 60 70 80 90
orf12.pep GYFVTEKIVEPQLGPYQSDLSQEEKDIRHSNEITPLEYKGLIWAGVVFVALSALLAWSIV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf12a GYFVTEKIVEPQLGPYQSDLSQEEKDIRHSNEITPLEYKGLIWAGVVFVALSALLAWSIV
240 250 260 270 280 290
100 110 120 130 140 150
orf12.pep PADGILRHPETGLVSGSPFLKSIVVFIFLLFALPGIVYGRVTRSLRGEQEVVNAXAESMS
|||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||
orf12a PADGILRHPETGLVSGSPFLKSIVVFIFLLFALPGIVYGRVTRSLRGEQEVVNAMAESMS
300 310 320 330 340 350
160 170 180 190 200 210
orf12.pep TLXLXLXXIFFAAQFVAFFNWTNIGQYIAVKGATFLKEVGLGGSVLFIGFILICAFINLM
|| | | ||||||||||||||||||||||||||||||||||||||||||||||||||||
orf12a TLGLYLVIIFFAAQFVAFFNWTNIGQYIAVKGATFLKEVGLGGSVLFIGFILICAFINLM
360 370 380 390 400 410
220 230 240 250 260 270
orf12.pep IGSASAQWAVTAPIFVPMLMLAGYAPEVIQAAYRIGDSVTNIITPMMSYFGLIMATVXXY
||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
orf12a IGSASAQWAVTAPIFVPMLMLAGYAPEVIQAAYRIGDSVTNIITPMMSYFGLIMATVIKY
420 430 440 450 460 470
280 290 300 310 320
orf12.pep KKDAGVGTLIXMMLPYSAFFLIAWIALFCIWVFVLGLPVGPGAPTFYPAPX
|||||||||| ||||||||||||||||||||||||||||||||||||||||
orf12a KKDAGVGTLISMMLPYSAFFLIAWIALFCIWVFVLGLPVGPGAPTFYPAPX
480 490 500 510 520
The complete length ORF12a nucleotide sequence <SEQ ID 79> is:
1 ATGAGTCAAA CCGATACGCA ACGGGACGGA CGATTTTTAC GCACAGTCGA
51 ATGGCTGGGC AATATGTTGC CGCACCCGGT TACGCTTTTT ATTATTTTCA
101 TTGTGTTATT GCTGATTGCC TCTGCCGCCG GTGCGTATTT CGGACTATCC
151 GTGCCCGATC CGCGCCCTGT TGGTGCGAAA GGACGTGCCG ATGACGGTTT
201 GATTCACGTT GTCAGCCTGC TCGATGCTGA CGGTTTGATC AAAATCCTGA
251 CGCATACCGT TAAAAATTTC ACCGGTTTCG CGCCGTTGGG AACGGTGTTG
301 GTTTCTTTAT TGGGCGTGGG GATTGCGGAA AAATCGGGCT TGATTTCCGC
351 ATTAATGCGC TTATTGCTCA CAAAATCTCC ACGCAAACTC ACTACTTTTA
401 TGGTTGTTTT TACAGGGATT TTATCTAATA CCGCTTCTGA ATTGGGCTAT
451 GTCGTCCTAA TCCCTTTGTC CGCCATCATC TTTCATTCCC TCGGCCGCCA
501 TCCGCTTGCC GGTCTGGCTG CGGCTTTCGC CGGCGTTTCG GGCGGTTATT
551 CGGCCAATCT GTTCTTAGGC ACAATCGATC CGCTCTTGGC AGGCATCACC
601 CAACAGGCGG CGCAAATCAT CCATCCCGAC TACGTCGTAG GCCCTGAAGC
651 CAACTGGTTT TTTATGGTAG CCAGTACGTT TGTGATTGCT TTGATTGGTT
701 ATTTTGTTAC TGAAAAAATC GTCGAACCGC AATTGGGCCC TTATCAATCA
751 GATTTGTCAC AAGAAGAAAA AGACATTCGA CATTCCAATG AAATCACGCC
801 TTTGGAATAT AAAGGATTAA TTTGGGCTGG CGTGGTGTTT GTTGCCTTAT
851 CCGCCCTATT GGCTTGGAGC ATCGTCCCTG CCGACGGTAT TTTGCGTCAT
901 CCTGAAACAG GATTGGTTTC CGGTTCGCCG TTTTTAAAAT CAATTGTTGT
951 TTTTATTTTC TTGTTGTTTG CACTGCCGGG CATTGTTTAT GGCCGGGTAA
1001 CCCGAAGTTT GCGCGGCGAA CAGGAAGTCG TTAATGCGAT GGCCGAATCG
1051 ATGAGTACTC TGGGGCTTTA TTTGGTCATC ATCTTTTTTG CCGCACAGTT
1101 TGTCGCATTT TTTAATTGGA CGAATATTGG GCAATATATT GCCGTTAAAG
1151 GGGCGACGTT CTTAAAAGAA GTCGGCTTGG GCGGCAGCGT GTTGTTTATC
1201 GGTTTTATTT TAATTTGTGC TTTTATCAAT CTGATGATAG GCTCCGCCTC
1251 CGCGCAATGG GCGGTAACTG CGCCGATTTT CGTCCCTATG CTGATGTTGG
1301 CCGGCTACGC GCCCGAAGTC ATTCAAGCCG CTTACCGCAT CGGTGATTCC
1351 GTTACCAATA TTATTACGCC GATGATGAGT TATTTCGGGC TGATTATGGC
1401 GACGGTGATC AAATACAAAA AAGATGCGGG CGTGGGTACG CTGATTTCTA
1451 TGATGTTGCC GTATTCCGCT TTCTTCTTGA TTGCGTGGAT TGCCTTATTC
1501 TGCATTTGGG TATTTGTTTT GGGCCTGCCC GTCGGTCCCG GCGCGCCCAC
1551 ATTCTATCCC GCACCTTAA
This encodes a protein having the amino acid sequence <SEQ ID 80>:
1 MSQTDTQRDG RFLRTVEWLG NMLPHPVTLF IIFIVLLLIA SAAGAYFGLS
51 VPDPRPVGAK GRADDGLIHV VSLLDADGLI
KILTHTVKNF TGFAPLGTVL
101 VSLLGVGIAE KSGLISALMR LLLTKSPRKL TTFMVVFTGI LSNTASELGY
151 VVLIPLSAII FHSLGRHPLA GLAAAFAGVS GGYSANLFLG TIDPLLAGIT
201 QQAAQIIHPD YVVGPEANWF FMVASTFVIA LIGYFVTEKI VEPQLGPYQS
251 DLSQEEKDIR HSNEITPLEY KGLIWAGVVF VALSALLAWS IVPADGILRH
301 PETGLVSGSP FLKSIVVFIF LLFALPGIVY GRVTRSLRGE QEVVNAMAES
351 MSTLGLYLVI IFFAAQFVAF FNWTNIGQYI AVKGATFLKE VGLGGSVLFI
401 GFILICAFIN LMIGSASAQW AVTAPIFVPM LMLAGYAPEV IQAAYRIGDS
451 VTNIITPMMS YFGLIMATVI KYKKDAGVGT LISMMLPYSA FFLIAWIALF
501 CIWVFVLGLP VGPGAPTFYP AP*
ORF12a and ORF12-1 show 99.0% identity over a 522 aa overlap:
10 20 30 40 50 60
orf12a.pep MSQTDTQRDGRFLRTVEWLGNMLPHPVTLFIIFIVLLLIASAAGAYFGLSVPDPRPVGAK
||||||||||||||||||||||||||||||||||||||||||:|||||||||||||||||
orf12-1 MSQTDTQRDGRFLRTVEWLGNMLPHPVTLFIIFIVLLLIASAVGAYFGLSVPDPRPVGAK
10 20 30 40 50 60
70 80 90 100 110 120
orf12a.pep GRADDGLIHVVSLLDADGLIKILTHTVKNFTGFAPLGTVLVSLLGVGIAEKSGLISALMR
||||||||::||||:|||:|||||||||||||||||||||||||||||||||||||||||
orf12-1 GRADDGLIYIVSLLNADGFIKILTHTVKNFTGFAPLGTVLVSLLGVGIAEKSGLISALMR
70 80 90 100 110 120
130 140 150 160 170 180
orf12a.pep LLLTKSPRKLTTFMVVFTGILSNTASELGYVVLIPLSAIIFHSLGRHPLAGLAAAFAGVS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf12-1 LLLTKSPRKLTTFMVVFTGILSNTASELGYVVLIPLSAIIFHSLGRHPLAGLAAAFAGVS
130 140 150 160 170 180
190 200 210 220 230 240
orf12a.pep GGYSANLFLGTIDPLLAGITQQAAQIIHPDYVVGPEANWFFMVASTFVIALIGYFVTEKI
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf12-1 GGYSANLFLGTIDPLLAGITQQAAQIIHPDYVVGPEANWFFMVASTFVIALIGYFVTEKI
190 200 210 220 230 240
250 260 270 280 290 300
orf12a.pep VEPQLGPYQSDLSQEEKDIRHSNEITPLEYKGLIWAGVVFVALSALLAWSIVPADGILRH
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf12-1 VEPQLGPYQSDLSQEEKDIRHSNEITPLEYKGLIWAGVVFVALSALLAWSIVPADGILRH
250 260 270 280 290 300
310 320 330 340 350 360
orf12a.pep PETGLVSGSPFLKSIVVFIFLLFALPGIVYGRVTRSLRGEQEVVNAMAESMSTLGLYLVI
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf12-1 PETGLVSGSPFLKSIVVFIFLLFALPGIVYGRVTRSLRGEQEVVNAMAESMSTLGLYLVI
310 320 330 340 350 360
370 380 390 400 410 420
orf12a.pep IFFAAQFVAFFNWTNIGQYIAVKGATFLKEVGLGGSVLFIGFILICAFINLMIGSASAQW
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf12-1 IFFAAQFVAFFNWTNIGQYIAVKGATFLKEVGLGGSVLFIGFILICAFINLMIGSASAQW
370 380 390 400 410 420
430 440 450 460 470 480
orf12a.pep AVTAPIFVPMLMLAGYAPEVIQAAYRIGDSVTNIITPMMSYFGLIMATVIKYKKDAGVGT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf12-1 AVTAPIFVPMLMLAGYAPEVIQAAYRIGDSVTNIITPMMSYFGLIMATVIKYKKDAGVGT
430 440 450 460 470 480
490 500 510 520
orf12a.pep LISMMLPYSAFFLIAWIALFCIWVFVLGLPVGPGAPTFYPAPX
|||||||||||||||||||||||||||||||||||||||||||
orf12-1 LISMMLPYSAFFLIAWIALFCIWVFVLGLPVGPGAPTFYPAPX
490 500 510 520
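The individual substitutions behind an identity figure can be listed position by position from the aligned rows. The sketch below does this for the residue 61-120 rows of orf12a.pep and orf12-1 shown above. `list_differences` is a hypothetical helper, and the rows are assumed to be gap-free and already aligned, as they are in this comparison.

```python
def list_differences(row_a: str, row_b: str, start: int = 1) -> list[tuple[int, str, str]]:
    """Return (position, residue_a, residue_b) for every mismatched column."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    return [
        (start + offset, a, b)
        for offset, (a, b) in enumerate(zip(row_a, row_b))
        if a != b
    ]

ORF12A_61_120 = "GRADDGLIHVVSLLDADGLIKILTHTVKNFTGFAPLGTVLVSLLGVGIAEKSGLISALMR"
ORF12_1_61_120 = "GRADDGLIYIVSLLNADGFIKILTHTVKNFTGFAPLGTVLVSLLGVGIAEKSGLISALMR"

# The row starts at residue 61, so positions are reported in sequence numbering.
for position, strain_a, strain_b in list_differences(ORF12A_61_120, ORF12_1_61_120, start=61):
    print(f"{strain_a}{position}{strain_b}")
```

Within this row the two sequences differ at four positions; the other rows can be scanned in the same way.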
Homology with a predicted ORF from N. gonorrhoeae
ORF12 shows 92.5% identity over a 320 aa overlap with a predicted ORF (ORF12.ng) from N. gonorrhoeae:
orf12.pep AXXIIHPXXVVGPEANWFFMVASTFVIALI 30
| |||| |||||||||||:|||||||||
orf12ng AAAFAGVSGGYSANLFLGTIDPLLAGITQQAAQIIHPDYVVGPEANWFFMAASTFVIALI 232
orf12.pep GYFVTEKIVEPQLGPYQSDLSQEEKDIRHSNEITPLEYKGLIWAGVVFVALSALLAWSIV 90
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf12ng GYFVTEKIVEPQLGPYQSDLSQEEKDIRHSNEITPLEYKGLIWAGVVFVALSALLAWSIV 292orf12.pep PADGILRHPETGLVSGSPFLKSIVVFIFLLFALPGIVYGRVTRSLRGEQEVVNAXAESMS 150
||||||||||||||:|||||||||||||||||||||||||:|||||||:||||| |||||orf12ng PADGILRHPETGLVAGSPFLKSIVVFIFLLFALPGIVYGRITRSLRGEREVVNAMAESMS 352orf12.pep TLXLXLXXIFFAAQFVAFFNWTNIGQYIAVKGATFLKEVGLGGSVLFIGFILICAFINLM 210
|| | | |||||||||||||||||||||||||:|||: ||||||||||||||||||||orf12ng TLGLYLVIIFFAAQFVAFFNWTNIGQYIAVKGAVFLKKFRLGGSVLFIGFILICAFINLM 412orf12.pep IGSASAQWAVTAPIFVPMLMLAGYAPEVIQAAYRIGDSVTNIITPMMSYFGLIMATVXXY 270
||||||||||||||||||||||| ||:|||||||||||||||||||||||||||||| |orf12ng IGSASAQWAVTAPIFVPMLMLAGNAPQVIQAAYRIGDSVTNIITPMMSYFGLIMATVIKY 472orf12.pep KKDAGVGTLIXMMLPYSAFFLIAWIALFCIWVFVLGLPVGPGAPTFYPAP 320
|||||||||| |||||||||||||||||||||||||||||||:|||||:|orf12ng KKDAGVGTLISMMLPYSAFFLIAWIALFCIWVFVLGLPVGPGTPTFYPVP 522全长ORF12ng核苷酸序列<SEQ ID 81>是:1 ATGAGTCAAA CCGACGCGCG TCGTAGCGGA CGATTTTTAC GCACAGTCGA51 ATGGCTGGGC AATATGTTGC CGCACCCGGT TACGCTTTTT ATTATTTTCA101 TTGTGTTATT GCTGATTGcc tctgCCGTCG GTGCGTATTT CGGACTATCC151 GTCCCCGATC CGCGTCCTGT TGGGGCGAAA GGACGTGCCG ATGACGGTTT201 GATTCACGTT GTCAGCCTGC TCGATGCCGA CGGTTTGATC AAAATCCTGA251 CGCATACCGT TAAAAATTTC ACCGGTTTCG CGCCGTTGGG AACGGTGTTG301 GTTTCTTTAT TGGGCGTGGG GATTGCGGAA AAATCGGGCT TGATTTCCGC351 ATTAATGCGC TTATTGCTCA CAAAATCCCC ACGCAAACTC ACTACTTTTA401 TGGTTGTTTT TACAGGGATT TTATCCAATA CGGCTTCTGA ATTGGGCTAT451 GTCGTCCTAA TCCCTTTGTC CGCCGTCATC TTTCATTCGC TCGGCCGCCA501 TCCGCTTGCC GGTTTGGCTG CGGCTTTCGC CGGCGTTTCG GGCGGTTATT551 CGGCCAATCT GTTCTTAGGC ACAATCGATC CGCTCTTGGC AGGCATCACC601 CAACAGGCGG CGCAAATCAT CCATCCCGAC TACGTCGTAG GCCCTGAAGC651 CAACTGGTTT TTTATGGCAG CCAGTACGTT TGTGATTGCT TTGATTGGTT701 ATTTTGTTAC TGAAAAAATC GTCGAACCGC AATTGGGCCC TTATCAATCA751 GATTTGTCAC AAGAAGAAAA AGACATTCGG CATTCCAATG AAATCACGCC801 TTTGGAATAT AAAGGATTAA TTTGGGCAGG CGTGGTGTTT GTTGCCTTAT851 CCGCCCTATT GGCTTGGAGC ATCGTCCCTG CCGACGGTAT TTTGCGTCAT901 CCTGAAACAG GATTGGTTGC CGGTTCGCCG TTTTTAAAAT CGATTGTTGT951 TTTTATTTTC TTGTTGTTTG CGCTGCCGGG CATTGTTTAT GGCCGGATAA1001 CCCGAAGTTT GCGCGGCGAA CGGGAAGTCG TTAATGCGAT GGCCGAATCG1051 ATGAGTACTT TGGGACTTTA TTTGGTCATC ATCTTTTTTG CCGCACAGTT1101 TGTCGCATTT TTTAATTGGA CGAATATTGG GCAATATATT GCCGTTAAAG1151 GGGCGGTGTT CTTAAAAGAA GTCGGCTTGG GCGGCAGTGT GTTGTTTATC1201 GGTTTTATTT TAATTTGTGC TTTTATCAAT CTGATGATAG GCTCCGCCTC1251 CGCGCAATGG GCGGTAACTG CGCCGATTTT CGTCCCTATG CTGATGTTGG1301 CCGGCTACGC GCCCGAAGTC ATTCAAGCCG CTTACCGCAT CGGTGATTCC1351 GTTACCAATA TTATTACGCC GATGATGAGT TATTTCGGGC TGATTATGGC1401 GACGGTAATC AAATACAAAA AAGATGCGGG CGTAGGCACG CTGATTTCTA1451 TGATGTTGCC GTATTCCGCT TTCTTCTTAA TTGCATGGAT CGCCTTATTC1501 TGCATTTGGG TATTTGTTTT GGGTCTGCCC GTCGGTCCCG GCACACCCAC1551 ATTCTATCCG GTGCCTTAA它编码的蛋白质具有氨基酸序列<SEQ ID 82>:1 
MSQTDARRSG RFLRTVEWLG NMLPHPVTLF IIFIVLLLIA SAVGAYFGLS51 VPDPRPVGAK GRADDGLIHV VSLLDADGLI KILTHTVKNF TGFAPLGTVL101 VSLLGVGIAE KSGLISALMR LLLTKSPRKL TTFMVVFTGI LSNTASELGY151 VVLIPLSAVI FHSLGRHPLA GLAAAFAGVS GGYSANLFLG TIDPLLAGIT201 QQAAQIIHPD YVVGPEANWF FMAASTFVIA LIGYFVTEKI VEPQLGPYQS251 DLSQEEKDIR HSNEITPLEY KGLIWAGVVF VALSALLAWS IVPADGILRH301 PETGLVAGSP FLKSIVVFIF LLFALPGIVY GRITRSLRGE REVVNAMAES351 MSTLGLYLVI IFFAAQFVAF FNWTNIGQYI AVKGAVFLKK FRLGGSVLFI401 GFILICAFIN LMIGSASAQW AVTAPIFVPM LMLAGNAPQV IQAAYRIGDS451 VTNIITPMMS YFGLIMATVI KYKKDAGVGT LISMMLPYSA FFLIAWIALF
501 CIWVFVLGLP VGPGTPTFYP VP*
ORF12ng and ORF12-1 show 97.1% identity in a 522 aa overlap:
10 20 30 40 50 60orf12-1.pep MSQTDTQRDGRFLRTVEWLGNMLPHPVTLFIIFIVLLLIASAVGAYFGLSVPDPRPVGAK
|||||::|:|||||||||||||||||||||||||||||||||||||||||||||||||||orf12ng MSQTDARRSGRFLRTVEWLGNMLPHPVTLFIIFIVLLLIASAVGAYFGLSVPDPRPVGAK
10 20 30 40 50 60
70 80 90 100 110 120orf12-1.pep GRADDGLIYIVSLLNADGFIKILTHTVKNFTGFAPLGTVLVSLLGVGIAEKSGLISALMR
||||||||::||||:|||:|||||||||||||||||||||||||||||||||||||||||orf12ng GRADDGLIHVVSLLDADGLIKILTHTVKNFTGFAPLGTVLVSLLGVGIAEKSGLISALMR
70 80 90 100 110 120
130 140 150 160 170 180orf12-1.pep LLLTKSPRKLTTFMVVFTGILSNTASELGYVVLIPLSAIIFHSLGRHPLAGLAAAFAGVS
||||||||||||||||||||||||||||||||||||||:|||||||||||||||||||||orf12ng LLLTKSPRKLTTFMVVFTGILSNTASELGYVVLIPLSAVIFHSLGRHPLAGLAAAFAGVS
130 140 150 160 170 180
190 200 210 220 230 240orf12-1.pep GGYSANLFLGTIDPLLAGITQQAAQIIHPDYVVGPEANWFFMVASTFVIALIGYFVTEKI
||||||||||||||||||||||||||||||||||||||||||:|||||||||||||||||orf12ng GGYSANLFLGTIDPLLAGITQQAAQIIHPDYVVGPEANWFFMAASTFVIALIGYFVTEKI
190 200 210 220 230 240
250 260 270 280 290 300orf12-1.pep VEPQLGPYQSDLSQEEKDIRHSNEITPLEYKGLIWAGVVFVALSALLAWSIVPADGILRH
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf12ng VEPQLGPYQSDLSQEEKDIRHSNEITPLEYKGLIWAGVVFVALSALLAWSIVPADGILRH
250 260 270 280 290 300
310 320 330 340 350 360orf12-1.pep PETGLVSGSPFLKSIVVFIFLLFALPGIVYGRVTRSLRGEQEVVNAMAESMSTLGLYLVI
||||||:|||||||||||||||||||||||||:|||||||:|||||||||||||||||||orf12ng PETGLVAGSPFLKSIVVFIFLLFALPGIVYGRITRSLRGEREVVNAMAESMSTLGLYLVI
310 320 330 340 350 360
370 380 390 400 410 420orf12-1.pep IFFAAQFVAFFNWTNIGQYIAVKGATFLKEVGLGGSVLFIGFILICAFINLMIGSASAQW
|||||||||||||||||||||||||:||||||||||||||||||||||||||||||||||orf12ng IFFAAQFVAFFNWTNIGQYIAVKGAVFLKEVGLGGSVLFIGFILICAFINLMIGSASAQW
370 380 390 400 410 420
430 440 450 460 470 480orf12-1.pep AVTAPIFVPMLMLAGYAPEVIQAAYRIGDSVTNIITPMMSYFGLIMATVIKYKKDAGVGT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf12ng AVTAPIFVPMLMLAGYAPEVIQAAYRIGDSVTNIITPMMSYFGLIMATVIKYKKDAGVGT
430 440 450 460 470 480
490 500 510 520orf12-1.pep LISMMLPYSAFFLIAWIALFCIWVFVLGLPVGPGAPTFYPAPX
||||||||||||||||||||||||||||||||||:|||||:||orf12ng LISMMLPYSAFFLIAWIALFCIWVFVLGLPVGPGTPTFYPVPX
490 500 510 520
In addition, ORF12ng shows significant homology with a hypothetical protein from E. coli:
sp|P46133|YDAH_ECOLI HYPOTHETICAL 55.1 KD PROTEIN IN OGT-DBPA INTERGENIC REGION >gi|1787597 (AE000231) hypothetical protein in the 5' region [Escherichia coli] Length = 510
Score = 329 bits (835), Expect = 2e-89
Identities = 178/507 (35%), Positives = 281/507 (55%), Gaps = 15/507 (2%)
Query: 8 RSGRFLRTVEWLGNMLPHPVTXXXXXXXXXXXASAVGAYFGLSVPDPRPVGAKGRADDGL 67
+SG+ VE +GN +PHP +A+ + FG+S +P D
Sbjct: 13 QSGKLYGWVERIGNKVPHPFLLFIYLIIVLMVTTAILSAFGVSAKNP--------TDGTP 64
Query: 68 IHVVSLLDADGLIKILTHTVKNFTGFAPXXXXXXXXXXXXIAEKSGLISALMRLLLTKSP 127
+ V +LL +GL L + +KNF+GFAP +AE+ GL+ ALM + +
Sbjct: 65 VVVKNLLSVEGLHWFLPNVIKNFSGFAPLGAILALVLGAGLAERVGLLPALMVKMASHVN 124
Query: 128 RKLTTFMVVFTGILSNTASELGYVVLIPLSAVIFHSLGRHPLAGLAAAFAGVSGGYSANL 187
+ ++MV+F S+ +S+ V++ P+ A+IF ++GRHP+AGL AA AGV G++ANL
Sbjct: 125 ARYASYMVLFIAFFSHISSDAALVIMPPMGALIFLAVGRHPVAGLLAAIAGVGCGFTANL 184
Query: 188 FLGTIDPLLAGITQQAAQIIHPDYVVGPEANWFFMAASTFVIALIGYFVTEKIVEPQLGP 247
+ T D LL+GI+ +AA +P V NW+FMA+S V+ ++G +T+KI+EP+LG
Sbjct: 185 LIVTTDVLLSGISTEAAAAFNPQMHVSVIDNWYFMASSVVVLTIVGGLITDKIIEPRLGQ 244
Query: 248 YQSDLSQEEKDIRHSNEITPLEYKGLIWAGVVFVALSALLAWSIVPADGILRHPETGLVA 307
+Q + ++ + + S GL AGVV + A +A ++P +GILR P V
Sbjct: 245 WQGNSDEKLQTLTESQRF------GLRIAGVVSLLFIAAIALMVIPQNGILRDPINHTVM 298
Query: 308 GSPFLKSIVVFIFLLFALPGIVYGRITRSLRGEREVVNAMAESMSTLGLYLXXXXXXXXX 367
SPF+K IV I L F + + YG TR++R + ++ + M E M + ++
Sbjct: 299 PSPFIKGIVPLIILFFFVVSLAYGIATRTIRRQADLPHLMIEPMKEMAGFIVMVFPLAQF 358
Query: 368 XXXXNWTNIGQYIAVKGAVFLKEVGLGGSVLFIGFILICAFINLMIGSASAQWAVTAPIF 427
NW+N+G++IAV L+ GL G F+G L+ +F+ + I S SA W++ APIF
Sbjct: 359 VAMFNWSNMGKFIAVGLTDILESSGLSGIPAFVGLALLSSFLCMFIASGSAIWSILAPIF 418
Query: 428 VPMLMLAGYAPEVIQAAYRIGDSVTNIITPMMSYFGLIMATVIKYKKDAGVGTLISMMLP 487
VPM ML G+ P Q +RI DS + P+ + L + + +YK DA +GT S++LP
Sbjct: 419 VPMFMLLGFHPAFAQILFRIADSSVLPLAPVSPFVPLFLGFLQRYKPDAKLGTYYSLVLP 478
Query: 488 YSAFFLIAWIALFCIWVFVLGLPVGPG 514
Y FL+ W+ + W +++GLP+GPG
Sbjct: 479 YPLIFLVVWLLMLLAW-YLVGLPIGPG 504
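Based on this analysis, including the presence of several putative transmembrane domains and a predicted actinin-type actin-binding domain signature (shown in bold) in the gonococcal protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 17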
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 83>:
1 ..ACAGCCGGCG CAGCAGGTTn CnCGGTCTTC GTTTTCGTAA CGGACAGTCA
51 GGTGGAGGTG TTCGGGAACA TCCAGACCGC AGTGGAAACA GGTTTTTTTC
101 ATGGCATTTC GGTTTCGTCT GTGTTTGGTG CGGCGGCACA AGACTCGGCA
151 ATgGCTTCGC GCAGTGCGTC TATACCGGTA TTTTCAGCAA CGGAAATGCG
201 GACGGcGgCA ATTTTTCCCG CAGCGTCGCG CCATATGCCC GTGTTTTgTT
251 CTTCAGACGG CAGCAGGTCG GTTTTGTTGT ACACCTTgAT GCACGGAaTA
301 TCGCCGGCAT GGATTTCTTG CAGTACGTTT TCCACGTCTT CAATCTGCTG
351 TCCGCTGTTC GGAGCGGCGG CATCGACGAC GTGCAGCAGC ACATCgGcTT
401 gCGCGGTTTC TTCCAGCGTG GCgGAAAAGG CGGAAATCAG TTTgTGCGGC
451 agATyGCTnA CGAATCCGAC GGTATCGGTC AGGATAATGC TGCATTCGGG
501 ACT..
This corresponds to the amino acid sequence <SEQ ID 84; ORF14>:
1 ..TAGAAGXXVF VFVTDSQVEV FGNIQTAVET GFFHGISVSS VFGAAAQDSA
51 MASRSASIPV FSATEMRTAA IFPAASRHMP VFCSSDGSRS VLLYTLMHGI
101 SPAWISCSTF STSSICCPLF GAAASTTCSS TSACAVSSSV AEKAEISLCG
151 RXLTNPTVSV RIMLHSG..
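Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF14 shows 94.0% identity over a 167 aa overlap with an ORF (ORF14a) from strain A of N. meningitidis: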
10 20 30
orf14.pep TAGAAGXXVFVFVTDSQVEVFGNIQTAVET
|:|||| |||||||:|::||||:| ||||
orf14a GRQLGFLRVGGALFVITAQARVNNALCDCLTTGAAGFAVFVFVTDGQMQVFGNVQPAVET
150 160 170 180 190 200
40 50 60 70 80 90
orf14.pep GFFHGISVSSVFGAAAQDSAMASRSASIPVFSATEMRTAAIFPAASRHMPVFCSSDGSRS
||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
orf14a GFFHGISVSSVFGAAAQYSAMASRSASIPVFSATEMRTAAIFPAASRHMPVFCSSDGSRS
210 220 230 240 250 260
100 110 120 130 140 150
orf14.pep VLLYTLMHGISPAWISCSTFSTSSICCPLFGAAASTTCSSTSACAVSSSVAEKAEISLCG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf14a VLLYTLMHGISPAWISCSTFSTSSICCPLFGAAASTTCSSTSACAVSSSVAEKAEISLCG
270 280 290 300 310 320
160
orf14.pep RXLTNPTVSVRIMLHSG
| |||||||||||||||
orf14a RSLTNPTVSVRIMLHSGLMYSRRAVVSSVAKSWSFAYMPDLVSRLNRLDLPTLVX
330 340 350 360 370 380
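The identity figures quoted for these alignments can be checked by counting matching columns between the two aligned rows. The following is a minimal sketch in Python, not the program used in the original analysis. The function name `percent_identity` is hypothetical, and treating the ambiguous residue `X` as a mismatch is an assumption (it is consistent with the 94.0% quoted for ORF14/ORF14a, where 10 of the 167 columns differ). The example uses only the first 30-column block of that alignment.

```python
def percent_identity(row_a: str, row_b: str, ambiguous: str = "X") -> tuple[int, int, float]:
    """Return (identical, compared, percent) for two equal-length alignment rows.

    Assumptions: '-' marks a gap, and any residue listed in `ambiguous`
    is never counted as identical.
    """
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    identical = 0
    compared = 0
    for a, b in zip(row_a.upper(), row_b.upper()):
        if a == "-" and b == "-":
            continue  # a column that is a gap in both rows carries no information
        compared += 1
        if a == b and a != "-" and a not in ambiguous:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned columns to compare")
    return identical, compared, 100.0 * identical / compared


# First 30 aligned columns of ORF14 against ORF14a, copied from the alignment above.
orf14_block = "TAGAAGXXVFVFVTDSQVEVFGNIQTAVET"
orf14a_block = "TTGAAGFAVFVFVTDGQMQVFGNVQPAVET"

identical, compared, pct = percent_identity(orf14_block, orf14a_block)
print(f"{identical}/{compared} identical ({pct:.1f}%)")  # 22/30 identical (73.3%)
```

This first block holds most of the differences; the remaining columns of the alignment are almost entirely identical, which is why the figure over the full overlap is much higher.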
全长ORF14a核苷酸序列<SEQ ID 85>是:1 ATGGAGGATT TGCAGGAAAT CGGGTTCGAT GTCGCCGCCG TAAAGGTAGG51 TCGGCAGCGC GAACATCATC GTCTGCATCA TCCCCAGCCC GGCAACGGCG101 AGGCGGACGA TGTATTGTTT GCGTTCTTTT TGGTTGGCGG CTTCGATTTT151 TTGCGCGTCA TAGGGTGCGG CGGTGTAGCC TATCTGCCTG ATTTTCAACA201 GAATGTCGGA AAGGCGGATT TTGCCGTCGT CCCAGACGAC GCGGCAGCGG251 TGCGTGCTGT AATTGAGGTC GATGCGGACG ATGCCGTCTG TACGCAAAAG301 CTGCTGTTCG ATCAGCCAGA CGCAGGCGGC GCAGGTGATG CCGCCGAGCA351 TTAAAACCGC CTCGCGCGTG CCGCCGTGGG TTTCCACAAA GTCGGACTGG401 ACTTCGGGCA GGTCGTACAG GCGGATTTGG TCGAGGATTT CTTGGGGCGG451 CAGCTCGGTT TTTTGCGCGT CGGCGGTGCG TTGTTTGTAA TAACTGCCCA501 AGCCCGCGTC AATAATGCTT TGTGCGACTG CCTGACAACC GGCGCAGCAG551 GTTTCGCGGT CTTCGTTTTC GTAACGGACG GTCAGATGCA GGTTTTCGGG601 AACGTCCAGC CCGCAGTGGA AACAGGTTTT TTTCATGGCA TTTCGGTTTC651 GTCTGTGTTT GGTGCGGCGG CACAATACTC GGCAATGGCT TCGCGCAGTG701 CGTCTATACC GGTATTTTCA GCAACGGAAA TGCGGACGGC GGCAATTTTT751 CCCGCAGCGT CGCGCCATAT GCCCGTGTTT TGTTCTTCAG ACGGCAGCAG801 GTCGGTTTTG TTGTACACCT TGATGCACGG AATATCGCCG GCATGGATTT851 CTTGCAGTAC GTTTTCCACG TCTTCAATCT GCTGTCCGCT GTTCGGAGCG901 GCGGCATCGA CGACGTGCAG CAGCACATCG GCTTGCGCGG TTTCTTCCAG951 CGTGGCGGAA AAGGCGGAAA TCAGTTTGTG CGGCAGATCG CTGACGAATC1001 CGACGGTATC GGTCAGGATA ATGCTGCATT CGGGACTGAT GTACAGCCGC1051 CGCGCCGTCG TGTCGAGTGT GGCGAAAAGC TGGTCTTTCG CATATATGCC1101 CGACTTGGTC AGCCGGTTGA ACAGACTGGA TTTGCCGACA TTGGTATAG它编码的蛋白质具有氨基酸序列<SEQ ID 86>:1 MEDLQEIGFD VAAVKVGRQR EHHRLHHPQP GNGEADDVLF AFFLVGGFDF51 LRVIGCGGVA YLPDFQQNVG KADFAVVPDD AAAVRAVIEV DADDAVCTQK101 LLFDQPDAGG AGDAAEH*NR LARAAVGFHK VGLDFGQVVQ ADLYEDFLGR151 QLGFLRVGGA LFVITAQARV NNALCDCLTT GAAGFAVFVF VTDGQMQVFG201 NVQPAVETGF FHGISVSSVF GAAAQYSAMA SRSASIPVFS ATEMRTAAIF251 PAASRHMPVF CSSDGSRSVL LYTLMHGISP AWISCSTFST SSICCPLFGA301 AASTTCSSTS ACAVSSSVAE KAEISLCGRS LTNPTVSVRI MLHSGLMYSR351 RAVVSSVAKS WSFAYMPDLV SRLNRLDLPT LV*
It should be noted that this sequence includes a stop codon at position 118.
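The position of the internal stop can be confirmed directly from the listed protein sequence, where `*` marks a stop codon. The following is a minimal sketch in Python; the function name `stop_positions` is hypothetical, and only the first 120 residues of <SEQ ID 86> are included.

```python
def stop_positions(protein: str, stop: str = "*") -> list[int]:
    """Return 1-based positions of stop symbols in a protein listing.

    Whitespace (the 10-residue block separators and line breaks) is ignored.
    """
    residues = "".join(protein.split())
    return [i for i, aa in enumerate(residues, start=1) if aa == stop]


# Residues 1-120 of the ORF14a protein <SEQ ID 86>, in the listing's 10-residue blocks.
orf14a_start = """
MEDLQEIGFD VAAVKVGRQR EHHRLHHPQP GNGEADDVLF AFFLVGGFDF
LRVIGCGGVA YLPDFQQNVG KADFAVVPDD AAAVRAVIEV DADDAVCTQK
LLFDQPDAGG AGDAAEH*NR
"""

print(stop_positions(orf14a_start))  # [118]
```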
Homology with a predicted ORF from N. gonorrhoeae
ORF14 shows 89.8% identity over a 167 aa overlap with a predicted ORF (ORF14.ng) from N. gonorrhoeae:
orf14.pep TAGAAGXXVFVFVTDSQVEVFGNIQTAVET 30
|| ||| ||:||:|:|::||||:| ||||
orf14ng GRQFGFFRVGGASFVITAQAGIDDALCDCLTADAAGFAVFAFVADGQMQVFGNVQPAVET 208
orf14.pep GFFHGISVSSVFGAAAQDSAMASRSASIPVFSATEMRTAAIFPAASRHMPVFCSSDGSRS 90
||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
orf14ng GFFHGISVSSVFGAAAQYSAMASRSASIPVFSATEMRTAAIFPAASRHMPVFCSSDGSRS 268
orf14.pep VLLYTLMHGISPAWISCSTFSTSSICCPLFGAAASTTCSSTSACAVSSSVAEKAEISLCG 150
||||||||||| |||||||||||||||||| |||||||||||||:|||:|||||||||||
orf14ng VLLYTLMHGISWAWISCSTFSTSSICCPLFRAAASTTCSSTSACTVSSKVAEKAEISLCG 328
orf14.pep RXLTNPTVSVRIMLHSG 167
| |||||||||||||:|
orf14ng RSLTNPTVSVRIMLHAGLMYSRRAVVSRVAKSWSFAYMPDLVSRLNRLDLPTLV 382
The complete length ORF14ng nucleotide sequence <SEQ ID 87> is predicted to encode a protein having the amino acid sequence <SEQ ID 88>:
1 MEDLQEIGFD VAAVKVGRQR EHHRLHHTQS GNGKADDVLF AFFLVGGFDF
51 LRVIGCGGVA CLPDFQQNVG EADFAVVPDD AAAVRAVIEV DADDAVCAQK
101 LLFDQPDAGG AGNAAEHQHC FVRAIMGFHK VGLDFGQVVQ ADLVEDFLGR
151 QFGFFRVGGA SFVITAQAGI DDALCDCLTA DAAGFAVFAF VADGQMQVFG
201 NVQPAVETGF FHGISVSSVF GAAAQYSAMA SRSASIPVFS ATEMRTAAIF
251 PAASRHMPVF CSSDGSRSVL LYTLMHGISW AWISCSTFST SSICCPLFRA
301 AASTTCSSTS ACTVSSKVAE KAEISLCGRS LTNPTVSVRI MLHAGLMYSR
351 RAVVSRVAKS WSFAYMPDLV SRLNRLDLPT LV*
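Based on the presence of a putative transmembrane domain in the gonococcal protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 18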
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 89>:1 ..GGCCATTACT CCGACCGCAC TTGGAAGCCG CGTTTGGNCG GCCGCCGTCT51 GCCGTATCTG CTTTATGGCA CGCTGATTGC GGTTATTGTG ATGATTTTGA101 TGCCGAACTC GGGCAGCTTC GGTTTCGGCT ATGCGTCGCT GGCGGCTTTG151 TCGTTCGGCG CGCTGATGAT TGCGCTGTTA GACGTGTCGT CAAATATGGC201 GATGCAGCCG TTTAAGATGA TGGTCGGCGA CATGGTCAAC GAGGAGCAGA251 AAA.NTACGC CTACGGGATT CAAAGTTTCT TAGCAAATAC GGGCGCGGTC301 GTGGCGGCGA TTCTGCCGTT TGTGTTTGCG TATATCGGTT TGGCGAACAC351 CGCCGANAAA GGCGTTGTGC CGCAGACCGT GGTCGTGGCG TTTTATGTGG401 GTGCGGCGTT GCTGGTGATT ACCAGCGCGT TCACGATTTT CAAAGTGAAG451 GAATACGANC CGGAAACCTA CGCCCGTTAC CACGGCATCG ATGTCGCCGC501 GAATCAGGAA AAAGCCAACT GGATCGCACT CTTAAAA.CC GCGC..它对应于氨基酸序列<SEQ ID 90;ORF16>:1 ..GHYSDRTWKP RLXGRRLPYL LYGTLIAVIV MILMPNSGSF GFGYASLAAL51 SFGALMIALL DVSSNMAMQP FKMMVGDMVN EEQKXYAYGI QSFLANTGAV101 VAAILPFVFA YIGLANTAXK GVVPQTVVVA FYVGAALLVI TSAFTIFKVK151 EYXPETYARY HGIDVAANQE KANWIALLKX A..进一步的工作揭示了完整的核苷酸序列<SEQ ID 91>:1 ATGTCGGAAT ATACGCCTCA AACAGCAAAA CAAGGTTTGC CCGCGCTGGC51 AAAAAGCACG ATTTGGATGC TCAGTTTCGG CTTTCTCGGC GTTCAGACGG101 CCTTTACCCT GCAAAGCTCG CAAATGAGCC GCATTTTTCA AACGCTAGGC151 GCAGACCCGC ACAATTTGGG CTGGTTTTTC ATCCTGCCGC CGCTGGCGGG201 GATGCTGGTG CAGCCGATTG TCGGCCATTA CTCCGACCGC ACTTGGAAGC251 CGCGTTTGGG CGGCCGCCGT CTGCCGTATC TGCTTTATGG CACGCTGATT301 GCGGTTATTG TGATGATTTT GATGCCGAAC TCGGGCAGCT TCGGTTTCGG351 CTATGCGTCG CTGGCGGCTT TGTCGTTCGG CGCGCTGATG ATTGCGCTGT401 TAGACGTGTC GTCAAATATG GCGATGCAGC CGTTTAAGAT GATGGTCGGC451 GACATGGTCA ACGAGGAGCA GAAAGGCTAC GCCTACGGGA TTCAAAGTTT501 CTTAGCAAAT ACGGGCGCGG TCGTGGCGGC GATTCTGCCG TTTGTGTTTG551 CGTATATCGG TTTGGCGAAC ACCGCCGAGA AAGGCGTTGT GCCGCAGACC601 GTGGTCGTGG CGTTTTATGT GGGTGCGGCG TTGCTGGTGA TTACCAGCGC651 GTTCACGATT TTCAAAGTGA AGGAATACGA TCCGGAAACC TACGCCCGTT701 ACCACGGCAT CGATGTCGCC GCGAATCAGG AAAAAGCCAA CTGGATCGAA751 CTCTTGAAAA CCGCGCCTAA GGCGTTTTGG ACGGTTACTT TGGTGCAATT801 CTTCTGCTGG TTCGCCTTCC AATATATGTG GACTTACTCG GCAGGCGCGA851 TTGCGGAAAA CGTCTGGCAC ACCACCGATG CGTCTTCCGT AGGTTATCAG901 GAGGCGGGTA 
ACTGGTACGG CGTTTTGGCG GCGGTGCAGT CGGTTGCGGC951 GGTGATTTGT TCGTTTGTAT TGGCGAAAGT GCCGAATAAA TACCATAAGG1001 CGGGTTATTT CGGCTGTTTG GCTTTGGGCG CGCTCGGCTT TTTCTCCGTT1051 TTCTTCATCG GCAACCAATA CGCGCTGGTG TTGTCTTATA CCTTAATCGG1101 CATCGCTTGG GCGGGCATTA TCACTTATCC GCTGACGATT GTGACCAACG1151 CCTTGTCGGG CAAGCATATG GGCACTTACT TGGGCTTGTT TAACGGCTCT1201 ATCTGTATGC CTCAAATCGT CGCTTCGCTG TTGAGTTTCG TGCTTTTCCC1251 TATGCTGGGC GGCTTGCAGG CCACTATGTT CTTGGTAGGG GGCGTCGTCC1301 TGCTGCTGGG CGCGTTTTCC GTGTTCCTGA TTAAAGAAAC ACACGGCGGG1351 GTTTGA它对应于氨基酸序列<SEQ ID 92;ORF16-1>:1 MSEYTPQTAK QGLPALAKST IWMLSFGFLG VQTAFTLQSS QMSRIFQTLG51 ADPHNLGWFF ILPPLAGMLV QPIVGHYSDR TWKPRLGGRR LPYLLYGTLI101 AVIVMILMPN SGSFGFGYAS LAALSFGALM IALLDVSSNM AMQPFKMMVG151 DMVNEEQKGY AYGIQSFLAN TGAVVAAILP FVFAYIGLAN TAEKGVVPQT201 VVVAFYVGAA LLVITSAFTI FKVKEYDPET YARYHGIDVA ANQEKANWIE251 LLKTAPKAFW TVTLVQFFCW FAFQYMWTYS AGAIAENVWH TTDASSVGYQ301 EAGNWYGVLA AVQSVAAVIC SFVLAKVPNK YHKAGYFGCL ALGALGFFSV351 FFIGNQYALV LSYTLIGIAW AGIITYPLTI VTNALSGKHM GTYLGLFNGS401 ICMPQIVASL LSFVLFPMLG GLQATMFLVG GVVLLLGAFS VFLIKETHGG451 V*
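Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF16 shows 96.7% identity over a 181 aa overlap with an ORF (ORF16a) from strain A of N. meningitidis: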
10 20 30
orf16.pep GHYSDRTWKPRLXGRRLPYLLYGTLIAVIV
|||||||||||| |||||||||||||||||orf16a IFQTLGADPHSLGWFFILPPLAGMLVQPIVGHYSDRTWKPRLGGRRLPYLLYGTLIAVIV
50 60 70 80 90 100
40 50 60 70 80 90orf16.pep MILMPNSGSFGFGYASLAALSFGALMIALLDVSSNMAMQPFKMMVGDMVNEEQKXYAYGI
|||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||orf16a MILMPNSGSFGFGYASLAALSFGALMIALLDVSSNMAMQPFKMMVGDMVNEEQKGYAYGI
110 120 130 140 150 160
100 110 120 130 140 150orf16.pep QSFLANTGAVVAAILPFVFAYIGLANTAXKGVVPQTVVVAFYVGAALLVITSAFTIFKVK
|||||||||||||||||||||||||||| |||||||||||||||||||||||||||||||orf16a QSFLANTGAVVAAILPFVFAYIGLANTAEKGVVPQTVVVAFYVGAALLVITSAFTIFKVK
170 180 190 200 210 220
160 170 180orf16.pep EYXPETYARYHGIDVAANQEKANWIALLKXA
|| |||||||||||||||||||||| |||:|orf16a EYNPETYARYHGIDVAANQEKANWIELLKTAPKAFWTVTLVQFFCWFAFQYMWTYSAGAI
230 240 250 260 270 280orf16a AENVWHTTDASSVGYQEAGNWYGVLAAVQSVAAVICSFVLAKVPNKYHKAGYFGCLALGA
290 300 310 320 330 340全长ORF16a核苷酸序列<SEQ ID 93>是:1 ATGTCGGAAT ATACGCCTCA AACAGCAAAA CAAGGTTTGC CCGCGCTGGC51 AAAAAGCACG ATTTGGATGC TCAGTTTCGG CTTTCTCGGC GTTCAGACGG101 CCTTTACCCT GCAAAGCTCG CAGATGAGCC GCATCTTCCA GACGCTCGGT151 GCCGATCCGC ACAGCCTCGG CTGGTTCTTT ATCCTGCCGC CGCTGGCGGG201 GATGCTGGTG CAGCCGATTG TCGGCCATTA CTCCGACCGC ACTTGGAAGC251 CGCGTTTGGG CGGCCGCCGT CTGCCGTATC TGCTTTATGG CACGCTGATT301 GCGGTTATTG TGATGATTTT GATGCCGAAC TCGGGCAGCT TCGGTTTCGG351 CTATGCGTCG CTGGCGGCTT TGTCGTTCGG CGCGCTGATG ATTGCGCTGT401 TAGACGTGTC GTCAAATATG GCGATGCAGC CGTTTAAGAT GATGGTCGGC451 GACATGGTCA ACGAGGAGCA GAAAGGCTAC GCCTACGGGA TTCAAAGTTT501 CTTAGCGAAT ACGGGCGCGG TCGTGGCGGC GATTCTGCCG TTTGTGTTTG551 CGTATATCGG TTTGGCGAAC ACCGCCGAGA AAGGCGTTGT GCCGCAGACC601 GTGGTCGTGG CGTTTTATGT GGGTGCGGCG TTGCTGGTGA TTACCAGCGC651 GTTCACGATT TTCAAAGTGA AGGAATACAA TCCGGAAACC TACGCCCGTT701 ACCACGGCAT CGATGTCGCC GCGAATCAGG AAAAAGCCAA CTGGATCGAA751 CTCTTGAAAA CCGCGCCTAA GGCGTTTTGG ACGGTTACTT TGGTGCAATT801 CTTCTGCTGG TTCGCCTTCC AATATATGTG GACTTACTCG GCAGGCGCGA851 TTGCGGAAAA CGTCTGGCAC ACCACCGATG CGTCTTCCGT AGGTTATCAG901 GAGGCGGGTA ACTGGTACGG CGTTTTGGCG GCGGTGCAGT CGGTTGCGGC951 GGTGATTTGT TCGTTTGTAT TGGCGAAAGT GCCGAATAAA TACCATAAGG1001 CGGGTTATTT CGGCTGTTTG GCTTTGGGCG CGCTCGGCTT TTTCTCCGTT1051 TTCTTCATCG GCAACCAATA CGCGCTGGTG TTGTCTTATA CCTTAATCGG1101 CATCGCTTGG GCGGGCATTA TCACTTATCC GCTGACGATT GTGACCAACG1151 CCTTGTCGGG CAAGCATATG GGCACTTACT TGGGCCTGTT TAACGGCTCT1201 ATCTGTATGC CGCAAATCGT CGCTTCGCTG TTGAGTTTCG TGCTTTTCCC1251 TATGCTGGGC GGCTTGCAGG CCACTATGTT CTTGGTAGGG GGCGTCGTCC1301 TGCTGCTGGG CGCGTTTTCC GTGTTCCTGA TTAAAGAAAC ACACGGCGGG1351 GTTTGA它编码的蛋白质具有氨基酸序列<SEQ ID 94>:1 MSEYTPQTAK QGLPALAKST IWMLSFGFLG VQTAFTLQSS QMSRIFQTLG51 ADPHSLGWFF ILPPLAGMLV QPIVGHYSDR TWKPRLGGRR LPYLLYGTLI101 AVIVMILMPN SGSFGFGYAS LAALSFGALM IALLDVSSNM AMQPFKMMVG151 DMVNEEQKGY AYGIQSFLAN TGAVVAAILP FVFAYIGLAN TAEKGVVPQT201 VVVAFYVGAA LLVITSAFTI FKVKEYNPET YARYHGIDVA ANQEKANWIE251 LLKTAPKAFT TVTLVQFFCW FAFQYMWTYS AGAIAENVWH 
TTDASSVGYQ301 EAGNWYGVLA AVQSVAAVIC SFVLAKVPNK YHKAGYFGCL ALGALGFFSV351 FFIGNQYALV LSYTLIGIAW AGIITYPLTI VTNALSGKHM GTYLGLFNGS401 ICMPQIVASL LSFVLFPMLG GLQATMFLVG GVVLLLGAFS VFLIKETHGG451 V*ORF16a和ORF16-1在451个氨基酸的重叠区内显示出有99.6%的相同性:
10 20 30 40 50 60orf16a.pep MSEYTPQTAKQGLPALAKSTIWMLSFGFLGVQTAFTLQSSQMSRIFQTLGADPHSLGWFF
||||||||||||||||||||||||||||||||||||||||||||||||||||||:|||||orf16-1 MSEYTPQTAKQGLPALAKSTIWMLSFGFLGVQTAFTLQSSQMSRIFQTLGADPHNLGWFF
10 20 30 40 50 60
70 80 90 100 110 120orf16a.pep ILPPLAGMLVQPIVGHYSDRTWKPRLGGRRLPYLLYGTLIAVIVMILMPNSGSFGFGYAS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf16-1 ILPPLAGMLVQPIVGHYSDRTWKPRLGGRRLPYLLYGTLIAVIVMILMPNSGSFGFGYAS
70 80 90 100 110 120
130 140 150 160 170 180orf16a.pep LAALSFGALMIALLDVSSNMAMQPFKMMVGDMVNEEQKGYAYGIQSFLANTGAVVAAILP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf16-1 LAALSFGALMIALLDVSSNMAMQPFKMMVGDMVNEEQKGYAYGIQSFLANTGAVVAAILP
130 140 150 160 170 180
190 200 210 220 230 240orf16a.pep FVFAYIGLANTAEKGVVPQTVVVAFYVGAALLVITSAFTIFKVKEYNPETYARYHGIDVA
||||||||||||||||||||||||||||||||||||||||||||||:|||||||||||||orf16-1 FVFAYIGLANTAEKGVVPQTVVVAFYVGAALLVITSAFTIFKYKEYDPETYARYHGIDVA
190 200 210 220 230 240
250 260 270 280 290 300orf16a.pep ANQEKANWIELLKTAPKAFWTVTLVQFFCWFAFQYMWTYSAGAIAENVWHTTDASSVGYQ
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf16-1 ANQEKANWIELLKTAPKAFWTVTLVQFFCWFAFQYMWTYSAGAIAENVWHTTDASSVGYQ
250 260 270 280 290 300
310 320 330 340 350 360orf16a.pep EAGNWYGVLAAVQSVAAVICSFVLAKVPNKYHKAGYFGCLALGALGFFSVFFIGNQYALV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf16-1 EAGNWYGVLAAVQSVAAVICSFVLAKVPNKYHKAGYFGCLALGALGFFSVFFIGNQYALV
310 320 330 340 350 360
370 380 390 400 410 420orf16a.pep LSYTLIGIAWAGIITYPLTIVTNALSGKHMGTYLGLFNGSICMPQIVASLLSFVLFPMLG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf16-1 LSYTLIGIAWAGIITYPLTIVTNALSGKHMGTYLGLFNGSICMPQIVASLLSFVLFPMLG
370 380 390 400 410 420
430 440 450orf16a.pep GLQATMFLVGGVVLLLGAFSVFLIKETHGGVX
||||||||||||||||||||||||||||||||orf16-1 GLQATMFLVGGVVLLLGAFSVFLIKETHGGVX
430 440 450
Homology with a predicted ORF from N. gonorrhoeae
ORF16 shows 93.9% identity over a 181 aa overlap with a predicted ORF (ORF16.ng) from N. gonorrhoeae:
orf16.pep GHYSDRTWKPRLXGRRLPYLLYGTLIAVIV 30
|:|||||||||| |||||||||||||||||orf16ng HFSNARRRPAQFGLVFHPAAAGGDAGSADSGYYSDRTWKPRLGGRRLPYLLYGTLIAVIV 131orf16.pep MILMPNSGSFGFGYASLAALSFGALMIALLDVSSNMAMQPFKMMVGDMVNEEQKXYAYGI 90
|||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||orf16ng MILMPNSGSFGFGYASLAALSFGALMIALLDVSSNMAMQPFKMMVGDMVNEEQKSYAYGI 191orf16.pep QSFLANTGAVVAAILPFVFAYIGLANTAXKGVVPQTVVVAFYVGAALLVITSAFTIFKVK 150
||||||| |||||||||||||||||||| |||||||||||||||||||:||||||| |||orf16ng QSFLANTDAVVAAILPFVFAYIGLANTAEKGVVPQTVVVAFYVGAALLIITSAFTISKVK 251orf16.pep EYXPETYARYHGIDVAANQEKANWIALLKXA 181
|| |||||||||||||||||||||: |||:|orf16ng EYDPETYARYHGIDVAANQEKANWFELLKTAPKVFWTVTPVQFFCWFAFRYMWTYSAGAI 311全长ORF16ng核苷酸序列<SEQ ID 95>是:1 ATGATAGGGG ATCGCCGCGC CGGCAACCAT TTCGGATTTT CCAAAGCAAA51 TACTTTTCAA ATCAAAAAAA AGGATTTACT TTATGTCGGA ATATACGCCT101 CAAACAGCAA AACAAGGTTT GCCCGCGCCG GCAAAAAGCA CGATTTGGAT151 GTTGAGCTTC GGCTATCTCG GCGTTCAGAC GGCCTTTACC CTGCAAAGCT201 CGCAGATGAG CCGCATTTTT CAAACGCTAG GCGCAGACCC GCACAATTTG251 GGCTGGTTTT TCATCCTGCC GCCGCTGGCG GGGATGCTGG TTCAGCCGAT301 AGTGGCTACT ACTCAGACCG CACTTGGAAG CCGCGCTTGG GCGGCCGCCG351 CCTGCCGTAT CTGCTTTACG GCACGCTGAT TGCGGTCATC GTGATGATTT401 TGATGCCGAA CTCGGGCAGC TTCGGTTTCG GCTATGCGTC GCTGGCGGCC451 TTGTCGTTCG GCGCGCTGAT GATTGCGCTG TTGGACGTGT CGTCGAATAT501 GGCGATGCAG CCGTTTAAGA TGATGGTCGG CGATATGGTC AACGAGGAGC551 AGAAAAGCTA CGCCTACGGG ATTCAAAGTT TCTTAGCGAA TACGGACGCG601 GTTGTGGCAG CGATTCTGCC GTTTGTGTTC GCGTATATCG GTTTGGCGAA651 CACTGCCGAG AAAGGCGTTG TGCCACAAAC CGTGGTCGTA GCATTCTATG701 TGGGTGCGGC GTTACTGATT ATTACCAGTG CGTTCACAAT CTCCAAAGTC751 AAAGAATACG ACCCGGAAAC CTACGCCCGT TACCACGGCA TCGATGTCGC801 CGCGAATCAG GAAAAAGCCA ACTGGTTCGA ACTCTTAAAA ACCGCGCCTA851 AAGTGTTTTG GACGGTTACT CCGGTACAGT TTTTCTGCTG GTTCGCCTTC901 CGGTATATGT GGACTTACTC GGCAGGCGCG ATTGCAGAAA ACGTCTGGCA951 CACTACCGAT GCGTCTTCCG TAGGCCATCA GGAGGCGGGC AACCGGTACG1001 GCGTTTTGGC GGCGGTGTAG它编码的蛋白质具有氨基酸序列<SEQ ID 96>:1 MIGDRRAGNH FGFSKANTFQ IKKKDLLYVG IYASNSKTRF ARAGKKHDLD51 VELRLSRRSD GLYPAKLADE PHFSNARRRP AQFGLVFHPA AAGGDAGSAD101 SGYYSDRTWK PRLGGRRLPY LLYGTLIAVI VMILMPNSGS FGFGYASLAA151 LSFGALMIAL LDVSSNMAMQ PFKMMVGDMV NEEQKSYAYG IQSFLANTDA201 VVAAILPFVF AYIGLANTAE KGVVPQTVVV AFYVGAALLI ITSAFTISKV251 KEYDPETYAR YHGIDVAANQ EKANWFELLK TAPKVFWTVT PVQFFCWFAF301 RYMWTYSAGA IAENVWHTTD ASSVGHQEAG NRYGVLAAV*ORF16ng和ORE16-1在261个氨基酸的重叠区内显示出有89.3%的相同性:
30 40 50 60 70 80orf16-1.pep MLSFGFLGVQTAFTLQSSQMSRIFQTLGADPHNLGWFFILPPLAGMLVQPI-VGHYSDRT
| ::| | | || : |:|||||orf16ng DVELRLSRRSDGLYPAKLADEPHFSNARRRPAQFGLVF-HPAAAGGDAGSADSGYYSDRT
50 60 70 80 90 100
90 100 110 120 130 140orf16-1.pep WKPRLGGRRLPYLLYGTLIAVIVMILMPNSGSFGFGYASLAALSFGALMIALLDVSSNMA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf16ng WKPRLGGRRLPYLLYGTLIAVIVMILMPNSGSFGFGYASLAALSFGALMIALLDVSSNMA
110 120 130 140 150 160
150 160 170 180 190 200orf16-1.pep MQPFKMMVGDMVNEEQKGYAYGIQSFLANTGAVVAAILPFVFAYIGLANTAEKGVVPQTV
|||||||||||||||||:|||||||||||| |||||||||||||||||||||||||||||
orf16ng MQPFKMMVGDMVNEEQKSYAYGIQSFLANTDAVVAAILPFVFAYIGLANTAEKGVVPQTV
170 180 190 200 210 220
210 220 230 240 250 260
orf16-1.pep VVAFYVGAALLVITSAFTIFKVKEYDPETYARYHGIDVAANQEKANWIELLKTAPKAFWT
|||||||||||:||||||| |||||||||||||||||||||||||||:||||||||:|||
orf16ng VVAFYVGAALLIITSAFTISKVKEYDPETYARYHGIDVAANQEKANWFELLKTAPKVFWT
230 240 250 260 270 280
270 280 290 300 310 320
orf16-1.pep VTLVQFFCWFAFQYMWTYSAGAIAENVWHTTDASSVGYQEAGNWYGVLAAVQSVAAVICS
|| |||||||||:||||||||||||||||||||||||:||||| |||||||
orf16ng VTPVQFFCWFAFRYMWTYSAGAIAENVWHTTDASSVGHQEAGNRYGVLAAVX
290 300 310 320 330 340
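Based on this analysis, including the presence of several putative transmembrane domains in the gonococcal protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 19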
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 97>:1 ATGTTGTTCC GTAAAACGAC CGCCGCCGTT TTGGCGCATA CCTTGATGCT51 GAACGGCTGT ACGTTGATGT TGTGGGGAAT GAACAACCCG GTCAGCGAAA101 CAATCACCCG NAAACACGTT GNCAAAGACC AAATCCGNGN CTTCGGTGTG151 GTTGCCGAAG ACAATGCCCA ATTGGAAAAG GGCAGCCTGG TGATGATGGG201 CGGAAAATAC TGGTTCGTCG TCAATCCCGA AGATTCGGCG AA.NTGACGG251 GNATTTTGAN GGCAGGGCTG GACAAACCCT TCCAAATAGT TNAGGATACC301 CCGAGCTATG C.TGCCACCA AGCCCTGCCG GTCAAACTCG GATCGNCTGG351 CAGCCAGAAT...它对应于氨基酸序列<SEQ ID 98;ORF28>:
1 MLFRKTTAAV LAHTLMLNGC TLMLWGMNNP VSETITRKHV XKDQIRXFGV51 VAEDNAQLEK GSLVMMGGKY WFVVNPEDSA XXTGILXAGL DKPFQIVXDT101 PSYXCHQALP VKLGSXGSQN...进一步的工作揭示了完整的核苷酸序列<SEQ ID 99>:1 ATGTTGTTCC GTAAAACGAC CGCCGCCGTT TTGGCGGCAA CCTTGATGCT51 GAACGGCTGT ACGTTGATGT TGTGGGGAAT GAACAACCCG GTCAGCGAAA101 CAATCACCCG CAAACACGTT GACAAAGACC AAATCCGCGC CTTCGGTGTG151 GTTGCCGAAG ACAATGCCCA ATTGGAAAAG GGCAGCCTGG TGATGATGGG201 CGGAAAATAC TGGTTCGTCG TCAATCCCGA AGATTCGGCG AAGCTGACGG251 GCATTTTGAA GGCAGGGCTG GACAAACCCT TCCAAATAGT TGAGGATACC301 CCGAGCTATG CTCGCCACCA AGCCCTGCCG GTCAAACTCG AATCGCCTGG351 CAGCCAGAAT TTCAGTACCG AAGGCCTTTG CCTGCGCTAC GATACCGACA401 AGCCTGCCGA CATCGCCAAG CTGAAACAGC TCGGGTTTGA AGCGGTCAAA451 CTCGACAATC GGACCATTTA CACGCGCTGC GTATCCGCCA AAGGCAAATA501 CTACGCCACA CCGCAAAAAC TGAACGCCGA TTACCATTTT GAGCAAAGTG551 TGCCTGCCGA TATTTATTAC ACGGTTACTG AAGAACATAC CGACAAATCC601 AAGCTGTTTG CAAATATCTT ATATACGCCC CCCTTTTTGA TACTGGATGC651 GGCGGGCGCG GTACTGGCCT TGCCTGCGGC GGCTCTGGGT GCGGTCGTGG701 ATGCCGCCCG CAAATGA它对应于氨基酸序列<SEQ ID 100;ORF28-1>:1 MLFRKTTAAV LAATLMLNGC TLMLWGMNNP VSETITRKHV DKDQIRAFGV51 VAEDNAQLEK GSLVMMGGKY WFVVNPEDSA KLTGILKAGL DKPFQIVEDT101 PSYARHQALP VKLESPGSQN FSTEGLCLRY DTDKPADIAK LKQLGFEAVK
151 LDNRTIYTRC VSAKGKYYAT PQKLNADYHF EQSVPADIYY TVTEEHTDKS
201 KLFANILYTP PFLILDAAGA VLALPAAALG AVVDAARK*
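Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF28 shows 79.2% identity over a 120 aa overlap with an ORF (ORF28a) from strain A of N. meningitidis: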
10 20 30 40 50 60
orf28.pep MLFRKTTAAVLAHTLMLNGCTLMLWGMNNPVSETITRKHVXKDQIRXFGVVAEDNAQLEK
|||||||||||| ||||||||:|:||||:| ||| :|||| ||||| |||||||||||||
orf28a MLFRKTTAAVLAATLMLNGCTVMMWGMNSPFSETTARKHVDKDQIRAFGVVAEDNAQLEK
10 20 30 40 50 60
70 80 90 100 110 120
orf28.pep GSLVMMGGKYWFVVNPEDSAXXTGILXAGLDKPFQIVXDTPSYXCHQALPVKLGSXGSQN
|||||||||||||||||||| |||| ||||| ||:| :| : :||||||| | :|||
orf28a GSLVMMGGKYWFVVNPEDSAKLTGILKAGLDKQFQMVEPNPRFA-YQALPVKLESPASQN
70 80 90 100 110
orf28a FSTEGLCLRYDTDRPADIAKLKQLEFEAVELDNRTIYTRCVSAKGKYYATPQKLNADYHF
120 130 140 150 160 170
全长ORF28a核苷酸序列<SEQ ID 101>是:1 ATGTTGTTCC GTAAAACGAC CGCCGCCGTT TTGGCGGCAA CCTTGATGTT51 GAACGGCTGT ACGGTAATGA TGTGGGGTAT GAACAGCCCG TTCAGCGAAA101 CGACCGCCCG CAAACACGTT GACAAGGACC AAATCCGCGC CTTCGGTGTG151 GTTGCCGAAG ACAATGCCCA ATTGGAAAAG GGCAGCCTGG TGATGATGGG201 CGGGAAATAC TGGTTCGTCG TCAATCCTGA AGATTCGGCG AAGCTGACGG251 GCATTTTGAA GGCCGGGTTG GACAAGCAGT TTCAAATGGT TGAGCCCAAC301 CCGCGCTTTG CCTACCAAGC CCTGCCGGTC AAACTCGAAT CGCCCGCCAG351 CCAGAATTTC AGTACCGAAG GCCTTTGCCT GCGCTACGAT ACCGACAGAC401 CTGCCGACAT CGCCAAGCTG AAACAGCTTG AGTTTGAAGC GGTCGAACTC451 GACAATCGGA CCATTTACAC GCGCTGCGTC TCCGCCAAAG GCAAATACTA501 CGCCACACCG CAAAAACTGA ACGCCGATTA TCATTTTGAG CAAAGTGTGC551 CTGCCGATAT TTATTACACG GTTACGAAAA AACATACCGA CAAATCCAAG601 TTGTTTGAAA ATATTGCATA TACGCCCACC ACGTTGATAC TGGATGCGGT651 GGGCGCGGTG CTGGCCTTGC CTGTCGCGGC GTTGATTGCA GCCACGAATT701 CCTCAGACAA ATGA它编码的蛋白质具有氨基酸序列<SEQ ID 102>:1 MLFRKTTAAV LAATLMLNGC TVMMWGMNSP FSETTARKHV DKDQIRAFGV51 VAEDNAQLEK GSLVMMGGKY WFVVNPEDSA KLTGILKAGL DKQFQMVEPN101 PRFAYQALPV KLESPASQNF STEGLCLRY1 TDRPADIAKL KQLEFEAVEL151 DNRTIYTRCV SAKGKYYATP QKLNADYHFE QSVPADIYYT VTKKHTDKSK201 LFENIAYTPT TLILDAVGAV LALPVAALIA ATNSSDK*ORF28a和ORF28-1在238个氨基酸的重叠区内显示出有86.1%的相同性:
10 20 30 40 50 60orf28a.pep MLFRKTTAAVLAATLMLNGCTVMMWGMNSPFSETTARKHVDKDQIRAFGVVAEDNAQLEK
|||||||||||||||||||||:|:||||:| ||| :||||||||||||||||||||||||orf28-1 MLFRKTTAAVLAATLMLNGCTLMLWGMNNPVSETITRKHVDKDQIRAFGVVAEDNAQLEK
10 20 30 40 50 60
70 80 90 100 110 119orf28a.pep GSLVMMGGKYWFVVNPEDSAKLTGILKAGLDKQFQMVEPNPRFA-YQALPVKLESPASQN
|||||||||||||||||||||||||||||||| ||:|| :| :|||||||||||||:|||orf28-1 GSLVMMGGKYWFVVNPEDSAKLTGILKAGLDKPFQIVEDTPSYARHQALPVKLESPGSQN
70 80 90 100 110 120
120 130 140 150 160 170 179
orf28a.pep FSTEGLCLRYDTDRPADIAKLKQLEFEAVELDNRTIYTRCVSAKGKYYATPQKLNADYHF
|||||||||||||:|||||||||| ||||:||||||||||||||||||||||||||||||
orf28-1 FSTEGLCLRYDTDKPADIAKLKQLGFEAVKLDNRTIYTRCVSAKGKYYATPQKLNADYHF
130 140 150 160 170 180
180 190 200 210 220 230
orf28a.pep EQSVPADIYYTVTKKHTDKSKLFENIAYTPTTLILDAVGAVLALPVAALIAATNSSDKX
|||||||||||||::|||||||| || ||| |||||:|||||||:||| |::::: ||
orf28-1 EQSVPADIYYTVTEEHTDKSKLFANILYTPPFLILDAAGAVLALPAAALGAVVDAARKX
190 200 210 220 230
Homology with a predicted ORF from N. gonorrhoeae
ORF28 shows 84.2% identity over a 120 aa overlap with a predicted ORF (ORF28.ng) from N. gonorrhoeae:
orf28.pep MLFRKTTAAVLAHTLMLNGCTLMLWGMNNPVSETITRKHVXKDQIRXFGVVAEDNAQLEK 60
|||||||||||| ||:|||||:|| |||||||:||||||| ||||| |||||||||||||
orf28ng MLFRKTTAAVLAATLILNGCTMMLRGMNNPVSQTITRKHVDKDQIRAFGVVAEDNAQLEK 60
orf28.pep GSLVMMGGKYWFVVNPEDSAXXTGILXAGLDKPFQIVXDTPSYXCHQALPVKLGSXGSQN 120
||||||||||||:||||||| ||:| |||||||||| ||||| |||||||: : ||||
orf28ng GSLVMMGGKYWFAVNPEDSAKLTGLLKAGLDKPFQIVEDTPSYARHQALPVKFEAPGSQN 120
全长ORF28ng核苷酸序列<SEQ ID 103>是1 ATGTTGTTCC GTAAAACGAC CGCCGCCGTT TTGGCGGCAA CCTTGATACT51 GAACGGCTGT ACGATGATGT TGCGGGGGAT GAACAACCCG GTCAGCCAAA101 CAATCACCCG CAAACACGTT GACAAAGACC AAATCCGCGC CTTCGGTGTG151 GTTGCCGAAG ACAATGCCCA ATTGGAAAAG GGCAGCCTGG TGATGATGGG201 CGGGAAATAC TGGTTCGCCG TCAATCCCGA AGATTCGGCG AAGCTGACGG251 GCCTTTTGAA GGCCGGGTTG GACAAGCCCT TCCAAATAGT TGAGGATACC301 CCGAGCTATG CCCGCCACCA AGCCCTGCCG GTCAAATTCG AAGCGCCCGG351 CAGCCAGAAT TTCAGTACCG GAGGTCTTTG CCTGCGCTAT GATACCGGCA401 GACCTGACGA CATCGCCAAG CTGAAACAGC TTGAGTTTAA AGCGGTCAAA451 CTCGACAATC GGACCATTTA CACGCGCTGC GTATCCGCCA AAGGCAAATA501 CTACGCCACG CCGCAAAAAC TGAACGCCGA TTATCATTTT GAGCAAAGTG551 TGCCCGCCGA TATTTATTAT ACGGTTACTG AAAAACATAC CGACAAATCC601 AAGCTGTTTG GAAATATCTT ATATACGCCC CCCTTGTTGA TATTGGATGC651 GGCGGCCGCG GTGCTGGTCT TGCCTATGGC TCTGATTGCA GCCGCGAATT701 CCTCAGACAA ATGA它编码的蛋白质具有氨基酸序列<SEQ ID 104>:1 MLFRKTTAAV LAATLILNGC TMMLRGMNNP VSQTITRKHV DKDQIRAFGV51 VAEDNAQLEK GSLVMMGGKY WFAVNPEDSA KLTGLLKAGL DKPFQIVEDT101 PSYARHQALP VKFEAPGSQN FSTGGLCLRY DTGRPDDIAK LKQLEFKAVK151 LDNRTIYTRC VSAKGKYYAT PQKLNADYHF EQSVPADIYY TVTEKHTDKS201 KLFGNILYTP PLLILDAAAA VLVLPMALIA AANSSDK*ORF28ng和ORF28-1在231个氨基酸的重叠区内显示有90.0%的相同性:
10 20 30 40 50 60orf28-1.pep MLFRKTTAAVLAATLMLNGCTLMLWGMNNPVSETITRKHVDKDQIRAFGVVAEDNAQLEK
|||||||||||||||:|||||:|| |||||||:|||||||||||||||||||||||||||orf28ng MLFRKTTAAVLAATLILNGCTMMLRGMNNPVSQTITRKHVDKDQIRAFGVVAEDNAQLEK
10 20 30 40 50 60
70 80 90 100 110 120orf28-1.pep GSLVMMGGKYWFVVNPEDSAKLTGILKAGLDKPFQIVEDTPSYARHQALPVKLESPGSQN
||||||||||||:|||||||||||:|||||||||||||||||||||||||||:|:|||||orf28ng GSLVMMGGKYWFAVNPEDSAKLTGLLKAGLDKPFQIVEDTPSYARHQALPVKFEAPGSQN
70 80 90 100 110 120
130 140 150 160 170 180
orf28-1.pep FSTEGLCLRYDTDKPADIAKLKQLGFEAVKLDNRTIYTRCVSAKGKYYATPQKLNADYHF
||| |||||||| :| |||||||| |:|||||||||||||||||||||||||||||||||
orf28ng FSTGGLCLRYDTGRPDDIAKLKQLEFKAVKLDNRTIYTRCVSAKGKYYATPQKLNADYHF
130 140 150 160 170 180
190 200 210 220 230 239
orf28-1.pep EQSVPADIYYTVTEEHTDKSKLFANILYTPPFLILDAAGAVLALPAAALGAVVDAARKX
||||||||||||||:||||||||:|||||||:||||||:|||:|| | ::|:
orf28ng EQSVPADIYYTVTEKHTDKSKLFGNILYTPPLLILDAAAAVLVLPMALIAAANSSDKX
190 200 210 220 230
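Based on this analysis (including the presence of a putative transmembrane domain in the gonococcal protein), it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
ORF28-1 (24 kDa) was cloned in pET and pGEX vectors and expressed in E. coli, as described above. The products of protein expression and purification were analysed by SDS-PAGE. Figure 6A shows the results of affinity purification of the GST-fusion protein, and Figure 6B shows the results of expression of the His-fusion in E. coli. Purified GST-fusion protein was used to immunise mice, and the mouse sera gave a positive result in ELISA. These results confirm that ORF28-1 is a surface-exposed protein and that it may be a useful immunogen.
Example 20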
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 105>:1 ..GTCAGTCCTG TACTGCCTAT TACACACGAA CGGACAGGGT TTGAAGGTGT51 TATCGGTTAT GAAACCCATT TTTCAGGGCA CGGACATGAA GTACACAGTC101 CGTTCGATCA TCATGATTCA AAAAGCACTT CTGATTTCAG CGGCGGTGTA151 GACGGCGGTT TTACTGTTTA CCAACTTCAT CGAACATGGT CGGAAATCCA201 TCCGGAGGAT GAATATGACG GGCCGCAAGC AGCG.ATTAT CCGCCCCCCG251 GAGGAGCAAG GGATATATAC AGCTATTATG TCAAAGGAAC TTCAACAAAA301 ACAAAGACTA GTATTGTCCC TCAAGCCCCA TTTTCAGACC GTTGGCTAGA351 AGAAAATGCC GGTGCCGCCT CTGGT..它对应于氨基酸序列<SEQ ID 106;ORF29>:1 ..VSPVLPITHE RTGFEGVIGY ETHFSGHGHE VHSPFDHHDS KSTSDFSGGV51 DGGFTVYQLH RTWSEIHPED EYDGPQAAXY PPPGGARDIY SYYVKGTSTK101 TKTSIVPQAP FSDRWLEENA GAASG..进一步的工作揭示了完整的核苷酸序列<SEQ ID 107>:1 ATGAATTTGC CTATTCAAAA ATTCATGATG CTGTTTGCAG CAGCAATATC51 GTTGCTGCAA ATCCCCATTA GTCATGCGAA CGGTTTGGAT GCCCGTTTGC101 GCGATGATAT GCAGGCAAAA CACTACGAAC CGGGTGGTAA ATACCATCTG151 TTTGGTAATG CTCGCGGCAG TGTTAAAAAG CGGGTTTACG CCGTCCAGAC201 ATTTGATGCA ACTGCGGTCA GTCCTGTACT GCCTATTACA CACGAACGGA251 CAGGGTTTGA AGGTGTTATC GGTTATGAAA CCCATTTTTC AGGGCACGGA301 CATGAAGTAC ACAGTCCGTT CGATCATCAT GATTCAAAAA GCACTTCTGA351 TTTCAGCGGC GGTGTAGACG GCGGTTTTAC TGTTTACCAA CTTCATCGAA401 CAGGGTCGGA AATCCATCCG GAGGATGGAT ATGACGGGCC GCAAGGCAGC451 GATTATCCGC CCCCCGGAGG AGCAAGGGAT ATATACAGCT ATTATGTCAA501 AGGAACTTCA ACAAAAACAA AGACTAATAT TGTCCCTCAA GCCCCATTTT551 CAGACCGTTG GCTAAAAGAA AATGCCGGTG CCGCCTCTGG TTTTTTCAGC 601 CGTGGGGATG AAGCAGGAAA ACTGATATGG GAAAGCGACC CCAATAAAAA651 TTGGTGGGCT AACCGTATGG ATGATGTTCG CGGCATCGTC CAAGGTGCGG701 TTAATCCTTT TTTAATGGGT TTTCAAGGAG TAGGGATTGG GGCAATTACA751 GACAGTGCAG TAAGCCCGGT CACAGATACA GCCGCGCAGC AGACTCTACA801 AGGTATTAAT GATTTAGGAA AATTAAGTCC GGAAGCACAA CTTGCTGCCG851 CGAGCCTATT ACAGGACAGT GCTTTTGCGG TAAAAGACGG TATCAACTCT901 GCCAAACAAT GGGCTGATGC CCATCCAAAT ATAACAGCTA CTGCCCAAAC951 TGCCCTTTCC GCAGCAGAGG CCGCAGGTAC GGTTTGGAGA GGTAAAAAAG1001 TAGAACTTAA CCCGACTAAA TGGGATTGGG TTAAAAATAC CGGTTATAAA1051 AAACCTGCTG CCCGCCATAT GCAGACTTTA GATGGGGAGA TGGCAGGTGG1101 GAATAAACCT ATTAAATCTT TACCAAACAG 
TGCCGCTGAA AAAAGAAAAC1151 AAAATTTTGA GAAGTTTAAT AGTAACTGGA GTTCAGCAAG TTTTGATTCA1201 GTGCACAAAA CACTAACTCC CAATGCACCT GGTATTTTAA GTCCTGATAA1251 AGTTAAAACT CGATACACTA GTTTAGATGG AAAAATTACA ATTATAAAAG1301 ATAACGAAAA CAACTATTTT AGAATCCATG ATAATTCACG AAAACAGTAT1351 CTTGATTCAA ATGGTAATGC TGTGAAAACC GGTAATTTAC AAGGTAAGCA1401 AGCAAAAGAT TATTTACAAC AACAAACTCA TATCAGGAAC TTAGACAAAT1451 GA它对应于氨基酸序列<SEQ ID 108;ORF29-1>:1 MNLPIQKFMM LFAAAISLLQ IPISHANGLD ARLRDDMQAK HYEPGGKYHL51 FGNARGSVKK RVYAVQTFDA TAVSPVLPIT HERTGFEGVI GYETHFSGHG101 HEVHSPFDHH DSKSTSDFSG GVDGGFTVYQ LHRTGSEIHP EDGYDGPQGS151 DYPPPGGARD IYSYYVKGTS TKTKTNIVPQ APFSDRWLKE NAGAASGFFS201 RADEAGKLIW ESDPNKNWWA NRMDDVRGIV QGAVNPFLMG FQGVGIGAIT251 DSAVSPVTDT AAQQTLQGIN DLGKLSPEAQ LAAASLLQDS AFAVKDGINS301 AKQWADAHPN ITATAQTALS AAEAAGTVWR GKKVELNPTK WDWVKNTGYK351 KPAARHMQTL DGEMAGGNKP IKSLPNSAAE KRKQNFEKFN SNWSSASFDS401 VHKTLTPNAP GILSPDKVKT RYTSLDGKIT IIKDNENNYF RIHDNSRKQY451 LDSNGNAVKT GNLQGKQAKD YLQQQTHIRN LDK*
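Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF29 shows 88.0% identity over a 125 aa overlap with an ORF (ORF29a) from strain A of N. meningitidis: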
10 20 30
orf29.pep VSPVLPITHERTGFEGVIGYETHFSGHGHE
|:|:||||||||||||:|||||||||||||
orf29a EPGGKYHLFGNARGSVKNRVYAVQTFDATAVGPILPITHERTGFEGIIGYETHFSGHGHE
50 60 70 80 90 100
40 50 60 70 80 90
orf29.pep VHSPFDHHDSKSTSDFSGGVDGGFTVYQLHRTWSEIHPEDEYDGPQAAXYPPPGGARDIY
||||||:||||||||||||||||||||||||| ||||||| |||||:: |||||||||||
orf29a VHSPFDNHDSKSTSDFSGGVDGGFTVYQLHRTGSEIHPEDGYDGPQGSDYPPPGGARDIY
110 120 130 140 150 160
100 110 120
orf29.pep SYYVKGTSTKTKTSIVPQAPFSDRWLEENAGAASG
||||||||||::|||:||||||||:||||||||
orf29a XXYVKGTSTKTKSNIVPRAPFSDRWLKENAGAASGFFSRADEAGKLIWESDPNKNWWANR
170 180 190 200 210 220
orf29a MDDIRGIVQGAVNPFLMGFQGVGIGAITDSAVSPVTDTAAQQTLQGXNHLGXLSPEAQLA
230 240 250 260 270 280全长ORF29a核苷酸序列<SEQ ID 109>是:1 ATGAATTNGC CTATTCAAAA ATTCATGATG CTGTTTGCAG CAGCAATATC51 GTNGCTGCAA ATCCCNATTA GTCATGCGAA CGGTTTGGAT GCCCGTTTGC 101 GCGATGATAT GCAGGCAAAA CACTACGAAC CGGGTGGTAA ATACCATCTG151 TTTGGTAATG CTCGCGGCAG TGTTAAAAAT CGGGTTTACG CCGTCCAAAC201 ATTTGATGCA ACTGCGGTCG GCCCCATACT GCCTATTACA CACGAACGGA251 CAGGATTTGA AGGCATTATC GGTTATGAAA CCCATTTTTC AGGACATGGA301 CATGAAGTAC ACAGTCCGTT CGATAATCAT GATTCAAAAA GCACTTCTGA351 TTTCAGCGGC GGCGTAGACG GTGGTTTTAC CGTTTACCAA CTTCATCGGA401 CAGGGTCGGA AATCCATCCG GAGGATGGAT ATGACGGGCC GCAAGGCAGC451 GATTATCCGC CCCCCGGAGG AGCAAGGGAT ATATACANNT ANTATGTCAA501 AGGAACTTCA ACAAAAACAA AGAGTAATAT TGTTCCCCGA GCCCCATTTT551 CAGACCGCTG GCTAAAAGAA AATGCCGGTG CCGCCTCTGG TTTTTTCAGC601 CGTGCTGATG AAGCAGGAAA ACTGATATGG GAAAGCGACC CCAATAAAAA651 TTGGTGGGCT AACCGTATGG ATGATATTCG CGGCATCGTC CAAGGTGCGG701 TTAATCCTTT TTTAATGGGT TTTCAAGGAG TAGGGATTGG GGCAATTACA751 GACAGTGCAG TAAGCCCGGT CACAGATACA GCCGCGCAGC AGACTCTACA801 AGGTATNAAT CATTTAGGAA ANTTAAGTCC CGAAGCACAA CTTGCGGCTG851 CAACCGCATT ACAAGACAGT GCTTTTGCGG TAAAAGACGG TATCAATTCC901 GCCAGACAAT GGGCTGATGC CCATCCGAAT ATAACTGCAA CAGCCCAAAC951 TGCCCTTGCC GTAGCAGANG CCGCAACTAC GGTTTGGGGC GGTAAAAAAG1001 TAGAACTTAA CCCGACCAAA TGGGATTGGG TTAAAAATAC NGGCTATAAN1051 ACACCTGCTG TTCGCACCAT GCATACTTTG GATGGGGAAA TGGCCGGTGG1101 GAATAGACCG CCTAAATCTA TAACGTCCAA CAGCAAAGCA GATGCTTCCA1151 CACAACCGTC TTTACAAGCG CAACTAATTG GAGAACAAAT TANNNNNGGG1201 CATGCTTATA ACAAGCATGT CATAAGACAA CAAGAATTTA CGGATTTAAA1251 TATCAATTCA CCAGCAGATT TTGCTCGGCA TATTGAAAAT ATTGTTAGCC1301 ATCCANCAAA TATGAAAGAG TTACCTCGCG GTAGAACTGC GTATTGGGAT1351 NATAAAACAG GGACNATAGT TATCCGAGAT AAAAATTCTG ACGATGGAGG1401 TACAGCATTT AGACCAACAT CAGGTAAAAA ATATTATGAT GATTTATAG它编码的蛋白质具有氨基酸序列<SEQ ID 110>:1 MNXPIQKFMM LFAAAISXLQ IPISHANGLD ARLRDDMQAK HYEPGGKYHL51 FGNARGSV1N RVYAVQTFDA TAYGPILPIT HERTGFEGII GYETHFSGHG101 HEVHSPFDNH DSKSTSDFSG GVDGGFTVYQ LHRTGSEIHP EDGYDGPQGS151 DYPPPGGARD IYXXYVKGTS TKTKSNIVPR APFSDRWLKE NAGAASGFFS201 
RADEAGKLIW ESDPNKNWWA NRMDDIRGIV QGAVNPFLMG FQGVGIGAIT251 DSAVSPVTDT AAQQTLQGXN HLGXLSPEAQ LAAATALQDS AFAVKDGINS301 ARQWADAHPN ITATAQTALA VAXAATTVWG GKKVELNPTK WDWVKNTGYX351 TPAVRTMHTL DGEMAGGNRP PKSITSNSKA DASTQPSLQA QLIGEQIXXG401 HAYNKHVIRQ QEFTDLNINS PADFARHIEN IVSHPXNMKE LPRGRTAYWD451 XKTGTIVIRD KNSDDGGTAF RPTSGKKYYD DL*ORF29a和ORF29-1在385个氨基酸的重叠区内显示出有90.1%的相同性:
10 20 30 40 50 60orf29a.pep MNXPIQKFMMLFAAAISXLQIPISHANGLDARLRDDMQAKHYEPGGKYHLFGNARGSVKN
|| |||||||||||||| |||||||||||||||||||||||||||||||||||||||||:orf29-1 MNLPIQKFMMLFAAAISLLQIPISHANGLDARLRDDMQAKHYEPGGKYHLFGNARGSVKK
10 20 30 40 50 60
70 80 90 100 110 120orf29a.pep RVYAVQTFDATAVGPILPITHERTGFEGIIGYETHFSGHGHEVHSPFDNHDSKSTSDFSG
|||||||||||||:|:||||||||||||:|||||||||||||||||||:|||||||||||orf29-1 RVYAVQTFDATAVSPVLPITHERTGFEGVIGYETHFSGHGHEVHSPFDHHDSKSTSDFSG
70 80 90 100 110 120
130 140 150 160 170 180orf29a.pep GVDGGFTVYQLHRTGSEIHPEDGYDGPQGSDYPPPGGARDIYXXYVKGTSTKTKSNIVPR
|||||||||||||||||||||||||||||||||||||||||| ||||||||||:||||:orf29-1 GVDGGFTVYQLHRTGSEIHPEDGYDGPQGSDYPPPGGARDIYSYYVKGTSTKTKTNIVPQ
130 140 150 160 170 180
190 200 210 220 230 240orf29a.pep APFSDRWLKENAGAASGFFSRADEAGKLIWESDPNKNWWANRMDDIRGIVQGAVNPFLMG
|||||||||||||||||||||||||||||||||||||||||||||:||||||||||||||
orf29-1 APFSDRWLKENAGAASGFFSRADEAGKLIWESDPNKNWWANRMDDVRGIVQGAVNPFLMG
190 200 210 220 230 240
250 260 270 280 290 300
orf29a.pep FQGVGIGAITDSAVSPVTDTAAQQTLQGXNHLGXLSPEAQLAAATALQDSAFAVKDGINS
|||||||||||||||||||||||||||| | || ||||||||||: ||||||||||||||
orf29-1 FQGVGIGAITDSAVSPVTDTAAQQTLQGINDLGKLSPEAQLAAASLLQDSAFAVKDGINS
250 260 270 280 290 300
310 320 330 340 350 360
orf29a.pep ARQWADAHPNITATAQTALAVAXAATTVWGGKKVELNPTKWDWVKNTGYXTPAVRTMHTL
|:|||||||||||||||||::| || ||| ||||||||||||||||||| ||:| |:||
orf29-1 AKQWADAHPNITATAQTALSAAEAAGTVWRGKKVELNPTKWDWVKNTGYKKPAARHMQTL
310 320 330 340 350 360
370 380 390 400 410 420
orf29a.pep DGEMAGGNRPPKSITSNSKADASTQPSLQAQLIGEQIXXGHAYNKHVIRQQEFTDLNINS
||||||||:| ||: || |: |
orf29-1 DGEMAGGNKPIKSLP-NSAAEKRKQNFEKFNSNWSSASFDSVHKTLTPNAPGILSPDKVK
370 380 390 400 410
Homology with a predicted ORF from N. gonorrhoeae
ORF29 shows 88.8% identity over a 125 aa overlap with a predicted ORF (ORF29.ng) from N. gonorrhoeae:
orf29.pep VSPVLPITHERTGFEGVIGYETHFSGHGHE 30
|:|:||||||||||||||||||||||||||
orf29ng EPGGKYHLFGNARGSVKNRVCAVQTFDATAVGPILPITHERTGFEGVIGYETHFSGHGHE 102
orf29.pep VHSPFDHHDSKSTSDFSGGVDGGFTVYQLHRTWSEIHPEDEYDGPQAAXYPPPGGARDIY 90
||||||:||||||||||||||||||||||||| ||||||| |||||:: |||||||||||
orf29ng VHSPFDNHDSKSTSDFSGGVDGGFTVYQLHRTGSEIHPEDGYDGPQGGGYPPPGGARDIY 162
orf29.pep SYYVKGTSTKTKTSIVPQAPFSDRWLEENAGAASG 125
||::|||||||| : |||||||||||:||||||||
orf29ng SYHIKGTSTKTKINTVPQAPFSDRWLKENAGAASGFLSRADEAGKLIWENDPDKNWRANR 222
预计全长ORF29ng核苷酸序列<SEQ ID 111>编码的蛋白质具有氨基酸序列<SEQ ID 112>:1 MNLPIQKFMM LFAAAISLLQ IPISHANGLD ARLRDDMQAK HYEPGGKYHL51 FGNARGSVKN RVCAVQTFDA TAVGPILPIT HERTGFEGVI GYETHFSGHG101 HEVHSPFDNH DSKSTSDFSG GVDGGFTVYQ LHRTGSEIHP EDGYDGPQGG151 GYPPPGGARD IYSYHIKGTS TKTKINTVPQ APFSDRWLKE NAGAASGFLS201 RADEAGKLIW ENDPDKNWRA NRMDDIRGIV QGAVNPFLTG FQGLGVGAIT251 DSAVSPYTYA AARKTLQGIH NLGNLSPEAQ LAATALQDS AFAVKDSINS301 ARQWADAHPN ITATAQTALA VTEAATTVWG GKKVELNPAK WDWVKNTGYK351 KPAARHMQTV DGEMAGGNKP LESKNTVTTN NFFENTGYTE KVLRQASNGD401 YHGFPQSVDA FSENGTVIQI VGGDNIVRHK LYIPGSYKGK DGNFEYIREA451 DGKINHRLFV PNQQLPEK*在第二个实验中,鉴定出下列DNA序列<SEQ ID 113>:1 atgAATTTGC CTATTCAAAA ATTCATGATG ctgttggcAg cggcaatatc51 gatgctGCat ATCCCCATTA GTCATGCGAA CGGTTTGGAT GCCCGTTTGC101 GCGATGATAT GCAGGCAAAA CACTACGAAC CGGGTGGCAA ATACCATCTG151 TTTGGTAATG CTCGCGGCAG TGTTAAAAAT CGGGTTTGCG CCGTCCAAAC201 ATTTGATGCA ACTGCGGTCG GCCCCATACT GCCTATTACA CACGAACGGA251 CAGGATTTGA AGGTGTTATC GGCTATGAAA CCCATTTTTC AGGACACGGA301 CACGAAGTAC ACAGTCCGTT CGATAATCAT GATTCAAAAA GCACTTCTGA 351 TTTCAGCGGC GGCGTAGACG GCGGTTTTAC CGTTTACCAA CTTCATCGGA401 CAGGGTCGGA AATACATCCC GCAGACGGAT ATGACGGGCC TCAAGGCGGC451 GGTTATCCGG AACCACAAGG GGCAAGGGAT ATATACAGCT ACCATATCAA501 AGGAACTTCA ACCAAAACAA AGATAAACAC TGTTCCGCAA GCCCCTTTTT551 CAGACCGCTG GCTAAAAGAA AATGCCGGTG CCGCTTCCGG TTTTCTCAGC601 CGTGCGGATG AAGCAGGAAA ACTGATATGG GAAAACGACC CCGATAAAAA651 TTGGCGGGCT AACCGTATGG ATGATATTCG CGGCATCGTC CAAGGTGCGG701 TTAATCCTTT TTTAACGGGT TTTCAAGGGG TAGGGATTGG GGCAATTACA751 GACAGTGCGG TAAGCCCGGT CACAGATACA GCCGCTCAGC AGACTCTACA801 AGGTATTAAT GATTTAGGAA ATTTAAGTCC GGAAGCACAA CTTGCCGCCG851 CGAGCCTATT ACAGGACAGT GCCTTTGCGG TAAAAGACGG CATCAATTCC901 GCCAGACAAT GGGCTGATGC CCATCCGAAT ATAACAGCAA CAGCCCAAAC951 TGCCCTTGCC GTAGCAGAGG CCGCAGGTAC GGTTTGGCGC GGTAAAAAAG1001 TAGAACTTAA CCCGACCAAA TGGGATTGGG TTAAAAATAC CGGCTATAAA1051 AAACCTGCTG CCCGCCATAT GCAGACTGTA GATGGGGAGA TGGCAGGGGG1101 GAATAGACCG CCTAAATCTA TAACGTCGGA AGGAAAAGCT AATGCTGCAA1151 CCTATCCTAA GTTGGTTAAT CAGCTAAATG 
AGCAAAACTT AAATAACATT1201 GCGGCTCAAG ATCCAAGATT GAGTCTAGCT ATTCATGAGG GTAAAAAAAA1251 TTTTCCAATA GGAACTGCAA CTTATGAAGA GGCAGATAGA CTAGGTAAAA1301 TTTGGGTTGG TGAGGGTGCA AGACAAACTA GTGGAGGCGG ATGGTTAAGT1351 AGAGATGGCA CTCGACAATA TCGGCCACCA ACAGAAAAAA AATCACAATT1401 TGCAACTACA GGTATTCAAG CAAATTTTGA AACTTATACT ATTGATTCAA1451 ATGAAAAAAG AAATAAAATT AAAAATGGAC ATTTAAATAT TAGGTAA它编码的蛋白质具有氨基酸序列<SEQ ID 114;ORF29ng-1>:1 MNLPIQKFMM LLAAAISMLH IPISHANGLD ARLRDDMQAK HYEPGGKYHL51 FGNARGSVKN RVCAVQTFDA TAVGPILPIT HERTGFEGVI GYETHFSGHG101 HEVHSPFDNH DSKSTSDFSG GVDGGFTVYQ LHRTGSEIHP ADGYDGPQGG151 GYPEPQGARD IYSYHIKGTS TKTKINTVPQ APFSDRWLKE NAGAASGFLS201 RADEAGKLIW ENDPDKNWRA NRMDDIRGIV QGAVNPFLTG FQGVGIGAIT251 DSAVSPVTDT AAQQTLQGIN DLGNLSPEAQ LAAASLLQDS AFAVKDGINS301 ARQWADAHPN ITATAQTALA VAEAAGTVWR GKKVELNPTK WDWVKNTGYK351 KPAARHMQTV DGEMAGGNRP PKSITSEGKA NAATYPKLVN QLNEQNLNNI401 AAQDPRLSLA IHEGKKNFPI GTATYEEADR LGKIWVGEGA RQTSGGGWLS451 RDGTRQYRPP TEKKSQFATT GIQANFETYT IDSNEKRNKI KNGHLNIR*ORF29ng-1和ORF29-1在401个氨基酸的重叠区内显示出有86.0%的相同性:
10 20 30 40 50 60orf29ng-1.pep MNLPIQKFMMLLAAAISMLHIPISHANGLDARLRDDMQAKHYEPGGKYHLFGNARGSVKN
|||||||||||:|||||:|:|||||||||||||||||||||||||||||||||||||||:orf29-1 MNLPIQKFMMLFAAAISLLQIPISHANGLDARLRDDMQAKHYEPGGKYHLFGNARGSVKK
10 20 30 40 50 60
70 80 90 100 110 120
orf29ng-1.pep RVCAVQTFDATAVGPILPITHERTGFEGVIGYETHFSGHGHEVHSPFDNHDSKSTSDFSG
|| ||||||||||:|:||||||||||||||||||||||||||||||||:|||||||||||
orf29-1 RVYAVQTFDATAVSPVLPITHERTGFEGVIGYETHFSGHGHEVHSPFDHHDSKSTSDFSG
70 80 90 100 110 120
130 140 150 160 170 180orf29ng-1.pep GVDGGFTVYQLHRTGSEIHPADGYDGPQGGGYPEPQGARDIYSYHIKGTSTKTKINTVPQ
|||||||||||||||||||| ||||||||: || | ||||||||::|||||||| | |||orf29-1 GVDGGFTVYQLHRTGSEIHPEDGYDGPQGSDYPPPGGARDIYSYYVKGTSTKTKTNIVPQ
130 140 150 160 170 180
190 200 210 220 230 240orf29ng-1.pep APFSDRWLKENAGAASGFLSRADEAGKLIWENDPDKNWRANRMDDIRGIVQGAVNPFLTG
||||||||||||||||||:||||||||||||:||:||| ||||||:|||||||||||| |orf29-1 APFSDRWLKENAGAASGFFSRADEAGKLIWESDPNKNWWANRMDDVRGIVQGAVNPFLMG
190 200 210 220 230 240
250 260 270 280 290 300
orf29ng-1.pep FQGVGIGAITDSAVSPVTDTAAQQTLQGINDLGNLSPEAQLAAASLLQDSAFAVKDGINS
|||||||||||||||||||||||||||||||||:||||||||||||||||||||||||||
orf29-1 FQGVGIGAITDSAVSPVTDTAAQQTLQGINDLGKLSPEAQLAAASLLQDSAFAVKDGINS
250 260 270 280 290 300
310 320 330 340 350 360
orf29ng-1.pep ARQWADAHPNITATAQTALAVAEAAGTVWRGKKVELNPTKWDWVKNTGYKKPAARHMQTV
|:|||||||||||||||||::||||||||||||||||||||||||||||||||||||||:
orf29-1 AKQWADAHPNITATAQTALSAAEAAGTVWRGKKVELNPTKWDWVKNTGYKKPAARHMQTL
310 320 330 340 350 360
370 380 390 400 410 419
orf29ng-1.pep DGEMAGGNRPPKSI-TSEGKANAATYPKLVNQLNEQNLNNIAAQDPRLSLAIHEGKKNFP
||||||||:| ||: :| :: :: |: :: : :::::
orf29-1 DGEMAGGNKPIKSLPNSAAEKRKQNFEKFNSNWSSASFDSVHKTLTPNAPGILSPDKVKT
370 380 390 400 410 420
420 430 440 450 460 470 479
orf29ng-1.pep IGTATYEEADRLGKIWVGEGARQTSGGGWLSRDGTRQYRPPTEKKSQFATTGIQANFETY
orf29-1 RYTSLDGKITIIKDNENNYFRIHDNSRKQYLDSNGNAVKTGNLQGKQAKDYLQQQTHIRN
430 440 450 460 470 480
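Based on this analysis, including the presence of a putative leader sequence in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 21
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 115>:
1 ATGAAAAAAC AAATCACCGC AGCCGTAATG ATGCTGTCTA TGATTGCCCC
51 CGCAATGGCA AACGGCTTGG ACAATCAGGC ATTTGAAGAC CAAATGTTCC
101 ACACGCGGGC AGATGCACCG ATGCAG...
This corresponds to the amino acid sequence <SEQ ID 116; ORF30>: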
1 MKKQITAAVM MLSMIAPAMA NGLDNQAFED QMFHTRADAP MQ..
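The correspondence between the partial DNA sequence <SEQ ID 115> and the ORF30 amino acid sequence can be checked by translating the first reading frame with the standard genetic code. The following is only a minimal sketch in Python: the function name `translate` and the table layout are illustrative assumptions and are not part of the original disclosure. Only the first 30 nucleotides of <SEQ ID 115> are used, and any codon containing an ambiguous base is reported as X, which mirrors the way ambiguous residues are written in the listings.

```python
# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; codons not in the table become 'X'."""
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3  # ignore a trailing incomplete codon
    codons = (dna[i:i + 3] for i in range(0, usable, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# First 30 nucleotides of <SEQ ID 115>, with spaces as printed in the listing.
fragment = "ATGAAAAAAC AAATCACCGC AGCCGTAATG"
print(translate(fragment))
```

The printed residues can be compared with the first ten residues of <SEQ ID 116>.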
Further work revealed the complete nucleotide sequence <SEQ ID 117>:
1 ATGAAAAAAC AAATCACCGC AGCCGTAATG ATGCTGTCTA TGATTGCCCC
51 CGCAATGGCA AACGGCTTGG ACAATCAGGC ATTTGAAGAC CAAGTGTTCC
101 ACACGCGGGC AGATGCACCG ATGCAGTTGG CGGAGCTTTC TCAAAAGGAG
151 ATGAAGGAGA CAGAGGGGGC GTTTCTTCCA TTGGCTATCT TGGGTGGTGC
201 TGCCATTGGT ATGTGGACAC AGCATGGTTT TAGTTATGCA ACGACAGGCA
251 GACCAGCTTC TGTTAGAGAT GTTGCTATTG CTGGCGGATT AGGCGCAATT
301 CCTGGTGGTG TAGGCGCCGC AGGAAAGGTT GTTTCCTTTG CTAAATATGG
351 ACGTGAGATT AAAATCGGCA ATAATATGCG GATAGCCCCT TTCGGTAATA
401 GAACAGGTCA TCCTATTGGA AAATTTCCCC ATTATCATCG TCGAGTTACG
451 GATAATACGG GCAAGACTTT GCCTGGACAG GGAATTGGTC GTCATCGCCC
501 TTGGGAATCA AAATCTACGG ACAGATCATG GAAAAACCGC TTCTAA
This corresponds to the amino acid sequence <SEQ ID 118; ORF30-1>:
1 MKKQITAAVM MLSMIAPAMA NGLDNQAFED QVFHTRADAP MQLAELSQKE
51 MKETEGAFLP LAILGGAAIG MWTQHGFSYA TTGRPASVRD VAIAGGLGAI
101 PGGVGAAGKV VSFAKYGREI KIGNNMRIAP FGNRTGHPIG KFPHYHRRVT
151 DNTGKTLPGQ GIGRHRPWES KSTDRSWKNR F*
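Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF30 shows 97.6% identity over a 42 aa overlap with an ORF (ORF30a) from strain A of N. meningitidis: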
10 20 30 40
orf30.pep MKKQITAAVMMLSMIAPAMANGLDNQAFEDQMFHTRADAPMQ
|||||||||||||||||||||||||||||||:||||||||||
orf30a MKKQITAAVMMLSMIAPAMANGLDNQAFEDQVFHTRADAPMQLAELSQKEMKXTXGAFLP
10 20 30 40 50 60
orf30a LXILGGAAIGMWTQHGFSYATTGRPASVRDVAIAGGLGAIPGXVGAAGKVVSFAKYGREI
70 80 90 100 110 120
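The identity figures quoted for these overlaps can be reproduced by counting matching columns in the aligned region. The sketch below is an illustration only: `percent_identity` is a hypothetical helper, and it assumes that identity is the number of identical, non-gap columns divided by the number of aligned columns, which may differ in detail from the alignment program used for the original analysis. The two strings are the 42-residue overlap of ORF30 and ORF30a shown above.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as identical only if the residues match and neither is a gap.
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


orf30 = "MKKQITAAVMMLSMIAPAMANGLDNQAFEDQMFHTRADAPMQ"
orf30a = "MKKQITAAVMMLSMIAPAMANGLDNQAFEDQVFHTRADAPMQ"  # first 42 residues of ORF30a

print(f"{percent_identity(orf30, orf30a):.1f}% identity over {len(orf30)} aa")
```

The two fragments differ at a single position (M versus V at residue 32), so 41 of 42 columns match.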
全长ORF30a核苷酸序列<SEQ ID 119>是:1 ATGAAAAAAC AAATCACCGC AGCCGTAATG ATGCTGTCTA TGATTGCCCC51 CGCAATGGCA AACGGCTTGG ACAATCAGGC ATTTGAAGAC CAAGTGTTCC101 ACACGCGGGC AGATGCACCG ATGCAGTTGG CGGAGCTTTC TCAAAAGGAG151 ATGAAGGANA CAGNGGGGGC GTTTCTTCCA TTGGNTATCT TGGGTGGTGC201 TGCCATTGGT ATGTGGACAC AGCATGGTTT TAGTTATGCA ACGACAGGCA251 GACCAGCTTC TGTTAGAGAT GTTGCTATTG CTGGCGGATT AGGCGCAATT301 CCTGGTGNTG TAGGCGCCGC AGGAAAGGTT GTTTCCTTTG CTAAATATGG351 ACGTGAGATT AAAATCGGCA ATAATATGCG GATAGCCCCT TTCGGTAATA401 GAACAGGTCA TCCTATTGGN AAATTTCCCC ATTATCATCG TCGAGTTACG451 GATAATACGG GCAAGACTTT GCCTGGACAG GGAATTGGTC GTCATCGCCC501 TTGGGAATCA AAATCTACGG ACAGATCATG GAAAAACCGC TTCTAA它编码的蛋白质具有氨基酸序列<SEQ ID 120>:1 MKKQITAAVM MLSMIAPAMA NGLDNQAFED QVFHTRADAP MQLAELSQKE51 MKXTXGAFLP LXILGGAAIG MWTQHGFSYA TTGRPASVRD VAIAGGLGAI101 PGXVGAAGKV VSFAKYGREI KIGNNMRIAP FGNRTGHPIG KFPHYHRRVT151 DNTGKTLPGQ GIGRHRPWES KSTDRSWKNR F*
ORF30a and ORF30-1 show 97.8% identity in a 181 aa overlap:
orf30a.pep MKKQITAAVMMLSMIAPAMANGLDNQAFEDQVFHTRADAPMQLAELSQKEMKXTXGAFLP 60
|||||||||||||||||||||||||||||||||||||||||||||||||||| | |||||
orf30-1 MKKQITAAVMMLSMIAPAMANGLDNQAFEDQVFHTRADAPMQLAELSQKEMKETEGAFLP 60
orf30a.pep LXILGGAAIGMWTQHGFSYATTGRPASVRDVAIAGGLGAIPGXVGAAGKVVSFAKYGREI 120
| |||||||||||||||||||||||||||||||||||||||| |||||||||||||||||
orf30-1 LAILGGAAIGMWTQHGFSYATTGRPASVRDVAIAGGLGAIPGGVGAAGKVVSFAKYGREI 120
orf30a.pep KIGNNMRIAPFGNRTGHPIGKFPHYHRRVTDNTGKTLPGQGIGRHRPWESKSTDRSWKNR 180
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf30-1 KIGNNMRIAPFGNRTGHPIGKFPHYHRRVTDNTGKTLPGQGIGRHRPWESKSTDRSWKNR 180
orf30a.pep FX
||
orf30-1 FX
Homology with a predicted ORF from N. gonorrhoeae
ORF30 shows 97.6% identity over a 42 aa overlap with a predicted ORF (ORF30.ng) from N. gonorrhoeae:
orf30.pep MKKQITAAVMMLSMIAPAMANGLDNQAFEDQMFHTRADAPMQ 42
|||||||||||||||||||||||||||||||:||||||||||
orf30ng MKKQITAAVMMLSMIAPAMANGLDNQAFEDQVFHTRADAPMQLAELSQKEMKETEGAFLP 60全长ORF30ng核苷酸序列<SEQ ID 121>是1 ATGAAAAAAC AAATCACCGC AGCCGTAATG ATGCTGTCTA TGATCGCCCC51 CGCAATGGCA AACGGATTGG ACAATCAGGC ATTTGAAGAC CAAGTGTTCC101 ACACGCGGGC AGATGCGCCG ATGCAGTTGG CGGAGCTTTC TCAGAAGGAG151 ATGAAGGAGA CTGAAGGGGC TTTTCTTCCA TTGGCTATCT TGGGTGGTGC201 TGCCATTGGT ATGTGGACAC AGCATGGTTT TAGTTATGCA ACGACAGGCA251 GACCAGCTTC TGTTAGAGAT GTTGCTGGCG GATTAGGCGC AATTCCTGGT301 GATGTAGGTG CTGCAGGAAA GGTTGTTTCC TTTGCTAAAT ATGGACGTGA351 GATTAAAATC GGCAATAATA TGCGGATAGC CCCTTTCGGT AATAGAACAG401 GTCATCCTAT TGGAAAATTT CCCCATTATC ATCGTCGAGT TACGGATAAT451 ACGGGCAAGA CTTTGCCTGG ACAGGGAATT GGTCGTCATC GCCCTTGGGA501 ATCAAAATCT ACGGACAGAT CATGGAAAAA CCGCTTCTAA它编码的蛋白质具有氨基酸序列<SEQ ID 122>:1 MKKQITAAVM MLSMIAPAMA NGLDNQAFED QVFHTRADAP MQLAELSQKE51 MKETEGAFLP LAILGGAAIG MWTQHGFSYA TTGRPASVRD VAGGLGAIPG101 DVGAAGKVVS FAKYGREIKI GNNMRIAPFG NRTGHPIGKF PHYHRRVTDN151 TGKTLPGQGI GRHRPWESKS TDRSWKNRF*
ORF30ng and ORF30-1 show 98.3% identity in a 181 aa overlap:
10 20 30 40 50 60
orf30ng.pep MKKQITAAVMMLSMIAPAMANGLDNQAFEDQVFHTRADAPMQLAELSQKEMKETEGAFLP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf30-1 MKKQITAAVMMLSMIAPAMANGLDNQAFEDQVFHTRADAPMQLAELSQKEMKETEGAFLP
10 20 30 40 50 60
70 80 90 100 110
orf30ng.pep LAILGGAAIGMWTQHGFSYATTGRPASVRDVA--GGLGAIPGDVGAAGKVVSFAKYGREI
|||||||||||||||||||||||||||||||| |||||||| |||||||||||||||||
orf30-1 LAILGGAAIGMWTQHGFSYATTGRPASVRDVAIAGGLGAIPGGVGAAGKVVSFAKYGREI
70 80 90 100 110 120
120 130 140 150 160 170
orf30ng.pep KIGNNMRIAPFGNRTGHPIGKFPHYHRRVTDNTGKTLPGQGIGRHRPWESKSTDRSWKNR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf30-1 KIGNNMRIAPFGNRTGHPIGKFPHYHRRVTDNTGKTLPGQGIGRHRPWESKSTDRSWKNR
130 140 150 160 170 180
180
orf30ng.pep FX
||
orf30-1 FX
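Based on this analysis, including the presence of a putative leader sequence in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 22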
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 123>:
1 ATGAATAAAA CTCTCTATCG TGTAATTTTC AACCGCAAAC GTGGGGCTGT
51 GrTAGCCGTT GCTGAAACTA CCAAGCGCGA AGGTAAAAGC TGTGCCGATA
101 GTGATTCAGG CAGCGCTCAT GTGAAATCTG TTCCTTTTGG TACTACTCAT
151 GCACCTGTTT GTg.CGTTaC AAATATCTTT TCTTTTTCTT TATTGGGCTT
201 TTCTTTATGT TTGGCTGTAG GtacGGyCAA TATTGCTTTT GCTGATGGCA
251 TT..
This corresponds to the amino acid sequence <SEQ ID 124; ORF31>:
1 MNKTLYRVIF NRKRGAVXAV AETTKREGKS CADSDSGSAH VKSVPFGTTH
51 APVCXVTNIF SFSLLGFSLC LAVGTXNIAF ADGI..
Further work revealed a further partial nucleotide sequence <SEQ ID 125>:
1 ATGAATAAAA CTCTCTATCG TGTAATTTTC AACCGCAAAC GTGGGGCTGT
51 GGTAGCCGTT GCTGAAACTA CCAAGCGCGA AGGTAAAAGC TGTGCCGATA
101 GTGATTCAGG CAGCGCTCAT GTGAAATCTG TTCCTTTTGG TACTACTCAT
151 GCACCTGTTT GTCGTTCAAA TATCTTTTCT TTTTCTTTAT TGGGCTTTTC
201 TTTATGTTTG GCTGTAGGTA CGGCCAATAT TGCTTTTGCT GATGGCATT..
This corresponds to the amino acid sequence <SEQ ID 126; ORF31-1>:
1 MNKTLYRVIF NRKRGAVVAV AETTKREGKS CADSDSGSAH VKSVPFGTTH
51 APVCRSNIFS FSLLGFSLCL AVGTANIAFA DGI..
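Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. gonorrhoeae
ORF31 shows 76.2% identity over an 84 aa overlap with a predicted ORF (ORF31.ng) from N. gonorrhoeae: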
orf31.pep MNKTLYRVIFNRKRGAVXAVAETTKREGKSCADSDSGSAHVKSVPFGTTHAPVCXVTNIF 60
||||||||||||||||| |||||||||||||||| |||::|||| | || :: |
orf31ng MNKTLYRVIFNRKRGAVVAVAETTKREGKSCADSGSGSVYVKSVSFIPTH------SKAF 54
orf31.pep SFSLLGFSLCLAVGTXNIAFADGI 84
|| ||||||||:|| ||||||||
orf31ng CFSALGFSLCLALGTVNIAFADGIITDKAAPKTQQATILQTGNGIPQVNIQTPTSAGVSV 114
全长ORF31ng核苷酸序列<SEQ ID 127>是:1 ATGAACAAAA CCCTCTATCG TGTGATTTTC AACCGCAAAC GCGGTGCTGT51 GGTAGCTGTT GCCGAAACCA CCAAGCGCGA AGGTAAAAGC TGTGCCGATA101 GTGGTTCGGG CAGCGTTTAT GTGAAATCCG TTTCTTTCAT TCCTACTCAT151 TCCAAAGCCT TTTGTTTTTC TGCATTAGGC TTTTCTTTAT GTTTGGCTTT201 GGGTACGGTC AATATTGCTT TTGCTGACGG CATTATTACT GATAAAGCTG251 CTCCTAAAAC CCAACAAGCC ACGATTCTGC AAACAGGTaa cGGCATACCG301 CAAGTCAATA TTCAAACCCC TACTTCGGCA GGGGTTTCTG TTAATCAATA351 TGCCCAGTTT GATGTGGGTA ATCGCGGGGC GATTTTAAAC AACAGTCGCA401 GCAACACCCA AACACAGCTA GGCGGTTGGA TTCAAGGCAA TCCTTGGTTG451 ACAAGGGGCG AAGCACGTGT GGTTGTAAAC CAAATCAACA GCAGCCATCC501 TTCACAACTG AATGGCTATA TTGAAGTGGG TGGACGACGT GCAGAAGTCG551 TTATTGCCAA TCCGGCAGGG ATTGCAGTCA ATGGTGGTGG TTTTATCAAT601 GCTTCCCGTG CCACTTTGAC GACAGGCCAA CCGCAATATC AAGCAGGAGA651 CTTTAGCGGC TTTAAGATAA GGCAAGGCAA TGCTGTAATC GCCGGACACG701 GTTTGGATGC CCGTGATACC GATTTCACAC GTATTCTTGT ATGCCAACAA751 AATCACCTTG ATCAGTACGG CCGAACAAGC AGGCATTCGT AA它编码的蛋白质具有氨基酸序列<SEQ ID 128>:1 MNKTLYRVIF NRKRGAVVAV AETTKREGKS CADSGSGSVY VKSVSFIPTH51 SKAFCFSALG FSLCLALGTV NIAFADGIIT DKAAPKTQQA TILQTGNGIP101 QVNIQTPTSA GVSVNQYAQF DVGNRGAILN NSRSNTQTQL GGWIQGNPWL151 TRGEARVVVN QINSSHPSQL NGYIEVGGRR AEVVIANPAG IAVNGGGFIN201 ASRATLTTGQ PQYQAGDFSG FKIRQGNAVI AGHGLDARDT DFTRILVCQQ251 NHLDQYGRTS RHS*
This gonococcal protein shows 50% identity over a 149 aa overlap with the pore-forming hemolysin-like HecA protein from Erwinia chrysanthemi (accession number L39897):
orf31ng 96 GNGIPQVNIQTPTSAGVSVNQYAQFDVGNRGAILNNSRSN-TQTQLGGWIQGNPWLTRGE 154
GNG+P VNI TP ++G+S N+Y F+V NRG ILNN + T +QLGG IQ NP L
HecA 45 GNGVPVVNIATPDASGLSHNRYHDFNVDNRGLILNNGTARLTPSQLGGLIQNNPNLNGRA 104
orf31ng 155 ARVVVNQINSSHPSQLNGYIEVGGRRAEVVIANPAGIAVNGGGFINASRATLTTGQPQYQ 214
A ++N++ S + S+L GY+EV G+ A VV+ANP GI +G GF+N R TLTTG PQ+
HecA 105 AAAILNEVVSPNRSRLAGYLEYAGQAANVVVANPYGITCSGCGFLNTPRLTLTTGTPQFD 164
orf31ng 215 -AGDFSGFKIRQGNAVIAGHGLDARDTDF 242
AG SG +R G+ +I G GLDA +D+
HecA 165 AAGGLSGLDVRGGDILIDGAGLDASRSDY 193
In addition, ORF31ng and ORF31-1 show 79.5% identity in an 83 aa overlap:
10 20 30 40 50 60
orf31-1.pep MNKTLYRVIFNRKRGAVVAVAETTKREGKSCADSDSGSAHVKSVPFGTTHAPVCRSNIFS
|||||||||||||||||||||||||||||||||| |||::|||| | || |: |
orf31ng MNKTLYRVIFNRKRGAVVAVAETTKREGKSCADSGSGSVYVKSVSFIPTH-----SKAFC
10 20 30 40 50
70 80
orf31-1.pep FSLLGFSLCLAVGTANIAFADGI
|| ||||||||:||:||||||||
orf31ng FSALGFSLCLALGTVNIAFADGIITDKAAPKTQQATILQTGNGIPQVNIQTPTSAGVSVN
60 70 80 90 100 110
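Based on this finding, including the homology with hemolysins and adhesins, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 23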
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 129>:
1 ATGAATACTC CTCCTTTTGT CTGTTGGATT TTTTGCAAGG TCATCGACAA
51 TTTCGGCGAC ATCGGCGTTT CGTGGCGGCT CGCCCGTGTT TTGCACCGCG
101 AACTCGGTTG GCAGGTGCAT TTGTGGACGG ACGATGTGTC CGCCTTGCGT
151 GCGCTTTGCC CTGATTTGCC CGATGTTCCC TGCGTTCATC AGGATATTCA
201 TGTCCGCACT TGGCATTCCG ATGCGGCAGA TATTGATACC GCG..
This corresponds to the amino acid sequence <SEQ ID 130; ORF32>:
1 MNTPPFVCWI FCKVIDNFGD IGVSWRLARV LHRELGWQVH LWTDDVSALR
51 ALCPDLPDVP CVHQDIHVRT WHSDAADIDT A..进一步的工作揭示了完整的核苷酸序列<SEQ ID 131>:1 ATGAATACTC CTCCTTTTGT CTGTTGGATT TTTTGCAAGG TCATCGACAA51 TTTCGGCGAC ATCGGCGTTT CGTGGCGGCT CGCCCGTGTT TTGCACCGCG101 AACTCGGTTG GCAGGTGCAT TTGTGGACGG ACGATGTGTC CGCCTTGCGT151 GCGCTTTGCC CTGATTTGCC CGATGTTCCC TGCGTTCATC AGGATATTCA201 TGTCCGCACT TGGCATTCCG ATGCGGCAGA TATTGATACC GCGCCTGTTC251 CCGATGTCGT CATCGAAACT TTTGCCTGCG ACCTGCCCGA AAATGTGCTG301 CACATTATCC GCCGACACAA GCCGCTTTGG CTGAATTGGG AATATTTGAG351 CGCGGAGGAA AGCAATGAAA GGCTGCATCT GATGCCTTCG CCGCAGGAGG401 GTGTTCAAAA ATATTTTTGG TTTATGGGTT TCAGCGAAAA AAGCGGCGGG451 TTGATACGCG AACGTGATTA CTGCGAAGCC GTCCGTTTCG ATACTGAAGC501 CCTGCGAGAG CGGCTGATGC TGCCCGAAAA AAACGCCTCC GAATGGCTGC551 TTTTCGGCTA TCGGAGCGAT GTTTGGGCAA AGTGGCTGGA AATGTGGCGA601 CAGGCAGGCA GCCCGATGAC ACTGTTGCTG GCGGGGACGC AAATCATCGA651 CAGCCTCAAA CAAAGCGGCG TTATTCCGCA AGATGCCCTG CAAAACGACG701 GCGATGTTTT TCAGACGGCA TCCGTCCGCC TCGTCAAAAT CCCTTTCGTG751 CCGCAACAGG ACTTCGACCA ACTGCTGCAC CTTGCCGACT GCGCCGTCAT801 CCGCGGCGAA GACAGTTTCG TGCGCGCCCA GCTTGCGGGC AAACCCTTCT 851 TTTGGCACAT CTACCCGCAA GACGAGAATG TCCATCTCGA CAAACTCCAC901 GCCTTTTGGG ATAAGGCACA CGGTTTCTAC ACGCCCGAAA CCGTGTCGGC951 ACACCGCCGT CTTTCGGACG ACCTCAACGG CGGAGAGGCT TTATCCGCAA1001 CACAACGCCT CGAATGTTGG CAAACCCTGC AACAACATCA AAACGGCTGG1051 CGGCAAGGCG CGGAGGATTG GAGCCGTTAT CTTTTCGGGC AGCCGTCAGC1101 TCCTGAAAAA CTCGCTGCCT TTGTTTCAAA GCATCAAAAA ATACGCTAG它对应于氨基酸序列<SEQ ID 132;ORF32-1>:1 MNTPPFVCWI FCKVIDNFGD IGVSWRLARV LHRELGWQVH LWTDDVSALR51 ALCPDLPDVP CVHQDIHVRT WHSDAADIDT APVPDVVIET FACDLPENVL101 HIIRRHKPLW LNWEYLSAEE SNERLHLMPS PQEGVQKYFW FMGFSEKSGG151 LIRERDYCEA VRFDTFALRE RLMLPEKNAS EWLLFGYRSD VWAKWLEMWR201 QAGSPMTLLL AGTQIIDSLK QSGVIPQDAL QNDGDVFQTA SVRLVKIPFV251 PQQDFDQLLH LADCAVIRGE DSFVRAQLAG KPFFWHIYPQ DENVHLDKLH301 AFWDKAHGFY TPETVSAHRR LSDDLNGGEA LSATQRLECW QTLQQHQNGW351 RQGAEDWSRY LFGQPSAPEK LAAFVSKHQK IR*w
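Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF32 shows 93.8% identity over an 81 aa overlap with an ORF (ORF32a) from strain A of N. meningitidis: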
10 20 30 40 50 60
orf32.pep MNTPPFVCWIFCKVIDNFGDIGVSWRLARVLHRELGWQVHLWTDDVSALRALCPDLPDVP
|||||| |||||||||||||||||||||||||||||||||||||||||||||||||
orf32a MNTPPFSAGXFCKVIDNFGDIGVSWRLARVLHRELGWQVHLWTDDVSALRALCPDLPDVX
10 20 30 40 50 60
70 80
orf32.pep CVHQDIHVRTWHSDAADIDTA
|||||||||||||||||||||
orf32a CVHQDIHVRTWHSDAADIDTAPVXDVVIETFACDLPENVLHIIRRHKPLWLXWEYLSAEX
70 80 90 100 110 120
全长ORF32a核苷酸序列<SEQ ID 133>是:1 ATGAATACTC CTCCTTTTTC TGCTGGANTT TTTTGCAAGG TCATCGACAA51 TTTCGGCGAC ATCGGCGTTT CGTGGCGGCT TGCCCGTGTT TTGCACCGCG101 AACTCGGTTG GCAGGTGCAT TTGTGGACGG ACGATGTGTC CGCCTTGCGT151 GCGCTTTGCC CTGATTTGCC CGATGTTCNC TGCGTTCATC AGGATATTCA201 TGTCCGCACT TGGCATTCCG ATGCGGCAGA TATTGATACC GCGCCTGTTC251 NCGATGTCGT CATCGAAACT TTTGCCTGCG ACCTGCCCGA AAATGTGCTG301 CACATCATCC GCCGACACAA GCCGCTTTGG CTGAANTGGG AATATTTGAG351 CGCGGAGGAN AGCAATGAAA GGCTGCACNT GATGCCTTCG CCGCAGGAGA401 GTGTTCNAAA ATANTTTTGG TTTATGGGTT TCAGCGAANN NAGCGGCGGA451 CTGATACGCG AACGCGATTA CTGCGAAGCC GTCCGTTTCG ATAGCGGAGC501 CTTGCGCAAG AGGCTGATGC TTCCCGAAAA AAACGNCCCC GAATGGCTGC551 TTTTCGGCTA TCGGAGCGAT GTTTGGGCAA AGTGGCTGGA AATGTGGCGA601 CAGGCAGGCA GTCCGTTGAC ACTTTTGCTG GCNGGGGCGC ANATTATCGA651 CAGCCTCAAA CAAAACGGCG TTATTCCGCA AGATGCCCTG CAAAACGACG701 GCGATGTTTT TCAGACGGCA TCCGTCCGCC TCGTCAAAAT CCCTTTCGTG751 CCGCAACAGG ACTTCGACAA ACTGCTGCAC CTTGCCGACT GCGCCGTCAT801 CCGCGGCGAA GACAGTTTCG TGCGCGCCCA GCTTGCGGGC AAACCCTTCT851 TTTGGCACAT CTACCCGCAA GATGAGAATG TCCATCTCGA CAAACTCCAC901 GCCTTTTGGG ATAAGGCACA CGGTTTCTAC ACGCCCGAAA CCGCATCGGC951 ACACCGCCGC CTTTCAGACG ACCTCAACGG CGGAGAGGCT TTATCCGCAA1001 CACAACGCCT CGAATGTTGG CAAATCCTGC AACAACATCA AAACGGCTGG1051 CGGCAAGGCG CGGAGGATTG GAGCCGTTAT CTTTTTGGGC AGCCTTCCGC1101 ATCCGAAAAA CTCGCCGCCT TTGTTTCAAA GCATCAAAAA ATACGCTAG它编码的蛋白质具有氨基酸序列<SEQ ID 134>: 1 MNTPPFSAGX FCKVIDNFGD IGVSWRLARV LHRELGWQVH LWTDDVSALR51 ALCPDLPDYX CVHQDIHVRT WHSDAADIDT APYXDVVIET FACDLPENVL101 HIIRRHKPLW LXWEYLSAEX SNERLHXMPS PQESVXKXFW FMGFSEXSGG151 LIRERDYCEA VRFDSGALRK RLMLPEKNXP EWLLFGYRSD VWAKWLFMWR201 QAGSPLTLLL AGAXIIDSLK QNGVIPQDAL QNDGDVFQTA SVRLVKIPFV251 PQQDFDKLLH LADCAVIRGE DSFVRAQLAG KPFFWHIYPQ DENVHLDKLH301 AFWDKAHGFY TPETASAHRR LSDDLNGGEA LSATQRLECW QILQQHQNGW351 RQGAEDWSRY LFGQPSASEK LAAFVSKHQK IR*
ORF32a and ORF32-1 show 93.2% identity in a 382 aa overlap:
10 20 30 40 50 60
orf32-1.pep MNTPPFVCWIFCKVIDNFGDIGVSWRLARVLHRELGWQVHLWTDDVSALRALCPDLPDVP
|||||| |||||||||||||||||||||||||||||||||||||||||||||||||
orf32a MNTPPFSAGXFCKVIDNFGDIGVSWRLARVLHRELGWQVHLWTDDVSALRALCPDLPDVX
10 20 30 40 50 60
70 80 90 100 110 120
orf32-1.pep CVHQDIHVRTWHSDAADIDTAPVPDVVIETFACDLPENVLHIIRRHKPLWLNWEYLSAEE
||||||||||||||||||||||| ||||||||||||||||||||||||||| |||||||
orf32a CVHQDIHVRTWHSDAADIDTAPVXDVVIETFACDLPENVLHIIRRHKPLWLXWEYLSAEX
70 80 90 100 110 120
130 140 150 160 170 180
orf32-1.pep SNERLHLMPSPQEGVQKYFWFMGFSEKSGGLIRERDYCEAVRFDTEALRERLMLPEKNAS
|||||| ||||||:| | |||||||| |||||||||||||||||: |||:||||||||
orf32a SNERLHXMPSPQESVXKXFWFMGFSEXSGGLIRERDYCEAVRFDSGALRKRLMLPEKNXP
130 140 150 160 170 180
190 200 210 220 230 240
orf32-1.pep EWLLFGYRSDVWAKWLEMWRQAGSPMTLLLAGTQIIDSLKQSGVIPQDALQNDGDVFQTA
|||||||||||||||||||||||||:||||||: |||||||:||||||||||||||||||
orf32a EWLLFGYRSDVWAKWLEMWRQAGSPLTLLLAGAXIIDSLKQNGVIPQDALQNDGDVFQTA
190 200 210 220 230 240
250 260 270 280 290 300
orf32-1.pep SVRLVKIPFVPQQDFDQLLHLADCAVIRGEDSFVRAQLAGKPFFWHIYPQDENVHLDKLH
||||||||||||||||:|||||||||||||||||||||||||||||||||||||||||||
orf32a SVRLVKIPFVPQQDFDKLLHLADCAVIRGEDSFVRAQLAGKPFFWHIYPQDENVHLDKLH
250 260 270 280 290 300
310 320 330 340 350 360
orf32-1.pep AFWDKAHGFYTPETVSAHRRLSDDLNGGEALSATQRLECWQTLQQHQNGWRQGAEDWSRY
||||||||||||||:|||||||||||||||||||||||||| ||||||||||||||||||
orf32a AFWDKAHGFYTPETASAHRRLSDDLNGGEALSATQRLECWQILQQHQNGWRQGAEDWSRY
310 320 330 340 350 360
370 380
orf32-1.pep LFGQPSAPEKLAAFVSKHQKIRX
||||||| |||||||||||||||
orf32a LFGQPSASEKLAAFVSKHQKIRX
370 380
Homology with a predicted ORF from N. gonorrhoeae
ORF32 shows 95.1% identity over an 82 aa overlap with a predicted ORF (ORF32.ng) from N. gonorrhoeae:
orf32.pep MNTPPF-VCWIFCKVIDNFGDIGVSWRLARVLHRELGWQVHLWTDDVSALRALCPDLP 57
||| | |||||||||||||||||||||||||||||||||||||||||||||||||||
orf32ng MVMNTYAFPVCWIFCKVIDNFGDIGVSWRLARVLHRELGWQVHLWTDDVSALRALCPDLP 60
orf32.pep DVPCVHQDIHVRTWHSDAADIDTA 81
||| ||||||||||||||||||||
orf32ng DVPFVHQDIHVRTWHSDAADIDTAPVPDAVIETFACDLPENVLNIIRRHKPLWLNWEYLS 120
预计ORF32ng核苷酸序列<SEQ ID 135>编码的蛋白质具有氨基酸序列<SEQ ID136>:1 MVMNTYAFPV CWIFCKVIDN FGDIGVSWRL ARVLHRELGW QVHLWTDDVS51 ALRALCPDLP DVPFVHQDIH VRTWHSDAAD IDTAPVPDAV IETFACDLPE101 NVLNIIRRHK PLWLNWEYLS AEESNERLHL MPSPQEGVQK YFWFMGFSEK151 SGGLIRERDY REAVRFDTEA LRRRLVLPEK NAPEWLLFGY RGDVWAKWLD201 MWQQAGSLMT LLLAGAQIIQ SLKQSGVIPQ NALQNEGGVF QTASVRLVKI251 PFVPQQDFDK LLHLADCAVI RGEDSFVRTQ LAGKPFFWHI YPQDENVHLD301 KLHAFWDKAY GFYTPETASV HRLLSDDLNG GEALSATQRL ECGVL*进一步的测序揭示了下列DNA序列<SEQ ID 137>:1 ATGAATACAT ACGCTTTTCC TGTCTGTTGG ATTTTTTGCA AGGTCATCGA51 CAATTTCGGC GACATCGGCG TTTCGTGGCG GCTCGCCCGT GTTTTGCACC101 GCGAACTCGG TTGGCAGGTG CATTTGTGGA CGGACGACGT GTCCGCCTTG151 CGCGCGCTTT GTCCCGATTT GCCCGATGTT CCCTTCGTTC ATCAGGATAT201 TCATGTCCGC ACTTGGCATT CCGATGCGGC AGACATTGAT ACCGCGCCCG251 TTCCCGATGC CGTTATCGAA ACTTTTGCCT GCGACCTGCC CGAAAATGTG301 CTGAACATCA TCCGCCGACA CAAACCGCTT TGGCTGAATT GGGAATATTT351 GAGCGCGGAG GAAAGCAATG AAAGGCTGCA CCTGATGCCT TCGCCGCAGG401 AGGGCGTTCA AAAATATTTT TGGTTTATGG GTTTCAGCGA AAAAAGCGGC451 GGGTTGATAC GCGAACGCGA TTACCGCGAA GCCGTCCGTT TCGATACCGA501 AGCCCTGCGC CGGCGGCTGG TGCTGCCCGA AAAAAACGCC CCCGAATGGC551 TGCTTTTCGG CTATCGGGGC GATGTTTGGG CAAAGTGGCT GGACATGTGG601 CAACAGGCAG GCAGCCTGAT GACCCTACTG CTGGCGGGGG CGCAAATTAT651 CGACAGCCTC AAACAAAGCG GCGTTATTCC GCAAAACGCC CTGCAAAAtg701 aaggcgGTGT CTTTCagacG gcatccgTcC gccttGTCAA AAtcCCGTTC751 GTGCcGCAAC AGGAcTTCGA CAAATTGCTG CAcctcgcCG ACTGCGCCGT801 GATACGCGGC GAAGACAGTT TCGTGCGTAC CCAGCTTGCC GGAAAACCCT851 TTTTTTGGCA CATCTACCCG CAAGACGAGA ATGTCCATCT CGACAAACTC901 CACGCCTTTT GGGATAAGGC ATACGGCTTC TACACGCCCG AAACCGCATC951 GGTGCACCGC CTCCTTTCGG ACGACCTCAA CGGCGGAGAG GCTTTATCCG1001 CAACACAACG CCTCGAATGT TGGCAAACCC TGCAACAACA TCAAAACGGC1051 TGGCGGCAAG GCGCGGAGGA TTGGAGCCGT TATCTTTTCG GGCAGCCTTC1101 CGCATCCGAA AAACTCGCCG CCTTTGTTTC AAAGCATCAA AAAATACGCT1151 AG它编码的蛋白质具有氨基酸序列<SEQ ID 138;ORF32ng-1>:1 MNTYAFPVCW IFCKVIDNFG DIGVSWRLAR VLHRELGWQV HLWTDDVSAL51 RALCPDLPDV PFVHQDIHVR TWHSDAADID TAPVPDAVIE TFACDLPENV101 LNIIRRHKPL WLNWEYLSAE 
ESNERLHLMP SPQEGVQKYF WFMGFSEKSG151 GLIRERDYRE AVRFDTEALR RRLVLPEKNA PEWLLFGYRG DVWAKWLDMW201 QQAGSLMTLL LAGAQIIDSL KQSGVIPQNA LQNEGGVFQT ASVRLVKIPF251 VPQQDFDKLL HLADCAVIRG EDSFVRTQLA GKPFFWHIYP QDENVHLDKL301 HAFWDKAYGF YTPETASVHR LLSDDLNGGE ALSATQRLEC WQTLQQHQNG351 WRQGAEDWSR YLFGQPSASE KLAAFVSKHQ KIR*ORF32ng-1和ORF32-1在383个氨基酸的重叠区内显示出有93.5%的相同性:
10 20 30 40 50 59orf32-1.pep MNTPPF-VCWIFCKVIDNFGDIGVSWRLARVLHRELGWQVHLWTDDVSALRALCPDLPDV
||| | |||||||||||||||||||||||||||||||||||||||||||||||||||||orf32ng-1 MNTYAFPVCWIFCKVIDNFGDIGVSWRLARVLHRELGWQVHLWTDDVSALRALCPDLPDV
10 20 30 40 50 60
60 70 80 90 100 110 119orf32-1.pep PCVHQDIHVRTWHSDAADIDTAPVPDVVIETFACDLPENVLHIIRRHKPLWLNWEYLSAE
| ||||||||||||||||||||||||:||||||||||||||:||||||||||||||||||orf32ng-1 PFVHQDIHVRTWHSDAADIDTAPVPDAVIETFACDLPENVLNIIRRHKPLWLNWEYLSAE
70 80 90 100 110 120
120 130 140 150 160 170 179
orf32-1.pep ESNERLHLMPSPQEGVQKYFWFMGFSEKSGGLIRERDYCEAVRFDTEALRERLMLPEKNA
|||||||||||||||||||||||||||||||||||||| |||||||||||:||:||||||
orf32ng-1 ESNERLHLMPSPQEGVQKYFWFMGFSEKSGGLIRERDYREAVRFDTEALRRRLVLPEKNA
130 140 150 160 170 180
180 190 200 210 220 230 239
orf32-1.pep SEWLLFGYRSDVWAKWLEMWRQAGSPMTLLLAGTQIIDSLKQSGVIPQDALQNDGDVFQT
||||||||:|||||||:||:|||| |||||||:||||||||||||||:||||:| ||||
orf32ng-1 PEWLLFGYRGDVWAKWLDMWQQAGSLMTLLLAGAQIIDSLKQSGVIPQNALQNEGGVFQT
190 200 210 220 230 240
240 250 260 270 280 290 299
orf32-1.pep ASVRLVKIPFVPQQDFDQLLHLADCAVIRGEDSFVRAQLAGKPFFWHIYPQDENVHLDKL
|||||||||||||||||:||||||||||||||||||:|||||||||||||||||||||||
orf32ng-1 ASVRLVKIPFVPQQDFDKLLHLADCAVIRGEDSFVRTQLAGKPFFWHIYPQDENVHLDKL
250 260 270 280 290 300
300 310 320 330 340 350 359
orf32-1.pep HAFWDKAHGFYTPETVSAHRRLSDDLNGGEALSATQRLECWQTLQQHQNGWRQGAEDWSR
|||||||:|||||||:|:|| |||||||||||||||||||||||||||||||||||||||
orf32ng-1 HAFWDKAYGFYTPETASVHRLLSDDLNGGEALSATQRLECWQTLQQHQNGWRQGAEDWSR
310 320 330 340 350 360
360 370 380
orf32-1.pep YLFGQPSAPEKLAAFVSKHQKIRX
|||||||| |||||||||||||||
orf32ng-1 YLFGQPSASEKLAAFVSKHQKIRX
370 380
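Based on this analysis, including the presence in the gonococcal protein of an RGD sequence, which is typical of adhesins, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.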
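The RGD sequence mentioned for the gonococcal protein can be located with a simple substring scan. The following is a sketch under stated assumptions: `find_motif` is a hypothetical helper, and the two fragments are taken from the listings of ORF32ng-1 (<SEQ ID 138>, residues 181 to 200) and ORF32-1 (<SEQ ID 132>, residues 180 to 199), with the residue numbering read off the printed position markers.

```python
def find_motif(protein: str, motif: str = "RGD", offset: int = 1) -> list:
    """Return the 1-based start positions of every occurrence of the motif.

    `offset` is the residue number of the first character of `protein`,
    so that positions in a fragment map back to the full-length sequence.
    """
    positions = []
    start = protein.find(motif)
    while start != -1:
        positions.append(start + offset)
        start = protein.find(motif, start + 1)  # allow overlapping matches
    return positions


orf32ng_1_fragment = "PEWLLFGYRGDVWAKWLDMW"  # gonococcal, residues 181-200
orf32_1_fragment = "SEWLLFGYRSDVWAKWLEMW"    # meningococcal, residues 180-199

print(find_motif(orf32ng_1_fragment, offset=181))
print(find_motif(orf32_1_fragment, offset=180))
```

In these fragments the motif is found only in the gonococcal sequence; the meningococcal fragment carries RSD at the corresponding position.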
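As described above, ORF32-1 (42 kDa) was cloned into pET and pGEX vectors and expressed in E. coli. The products of protein expression and purification were analyzed by SDS-PAGE. Figure 7A shows the results of affinity purification of the His-fusion protein, and Figure 7B shows the results of expression of the GST-fusion in E. coli. Purified His-fusion protein was used to immunize mice, and the mouse sera gave a positive result in ELISA. These results confirm that ORF32-1 is a surface-exposed protein and a useful immunogen.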
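The 42 kDa figure quoted for ORF32-1 is consistent with a rough estimate from the length of the protein (382 residues in <SEQ ID 132>). The sketch below uses the common rule of thumb of about 110 Da per amino acid residue; that average is an assumption introduced here for illustration and does not come from the original text, and the helper name `approx_mass_kda` is hypothetical.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average, an assumption


def approx_mass_kda(n_residues: int) -> float:
    """Rough molecular mass in kDa from the number of amino acid residues."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


orf32_1_length = 382  # residues in ORF32-1, excluding the stop codon
print(f"approx. {approx_mass_kda(orf32_1_length):.0f} kDa")
```

A composition-based calculation would give a more precise value; the estimate here is only meant to show that the quoted size matches the length of the open reading frame.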
Example 24
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 139>:1 ..TTGTTCCTGC GTGTNAAAGT GGGGCGTTTT TTCAGCAGTC CGGCGACGTG51 GTTTCGGGNC AAAGACCCTG TAAATCAGGC GGTGTTGCGG CTGTATNCGG101 ACGAGTGGCG GCA.ACTTCG GTACGTTGGA AAATAGNCGC AACGTCGCAC151 AGCCTGTGGC TCTGCACGCT GCTCGGAATG CTGGTGTCGG TATTGTTGCT201 GCTTTTGGTG CGGCAATATA CGTTCAACTG GGAAAGCACG CTGTTGAGCA251 ATGCCGCTTC GGTACGCGCG GTGGAAATGT TGGCATGGCT GCCGTCGAAA301 CTCGGTTTCC CTGTCCCCGA TGCGCGGTCG GTCATCGAAG GCCGTCTGAA351 CGGCAATATT GCCGATGCGC GGGCTTGGTC GGGGCTGCTG GTCGNCAGTA
401 TCGCCTGCTA NGGCATCCTG CCGCGCCTG..它对应于氨基酸序列<SEQ ID 140;ORF33>:1 ..LFLRVKVGRF FSSPATWFRX KDPVNQAVLR LYXDEWRXTS VRWKIXATSH51 SLWLCTLLGM LVSVLLLLLV RQYTFNWEST LLSNAASVRA VEMLAWLPSK101 LGFPVPDARS VIEGRLNGNI ADARAWSGLL VXSIACXGIL PRL..进一步的工作揭示了完整的核苷酸序列<SEQ ID 141>:1 ATGTTGAATC CATCCCGAAA ACTGGTTGAG CTGGTCCGTA TTTTGGACGA51 AGGCGGTTTT ATTTTCAGCG GCGATCCCGT ACAGGCGACG GAGGCTTGC101 GCCGCGTGGA CGGCAGTACG GAGGAAAAAA TCATCCGTCG GGCGGAGATG151 ATTGACAGGA ACCGTATGCT GCGGGAGACG TTGGAACGTG TGCGTGCGGG201 GTCGTTCTGG TTGTGGGTGG TGGCGGCGAC GTTTGCATTT TTTACCGGTT251 TTTCAGTCAC TTATCTTCTA ATGGACAATC AGGGTCTGAA TTTCTTTTTG301 GTTTTGGCGG GCGTGTTGGG CATGAATACG CTGATGCTGG CAGTATGGTT351 GGCAATGTTG TTCCTGCGTG TGAAAGTGGG GCGTTTTTTC AGCAGTCCGG401 CGACGTGGTT TCGGGGCAAA GACCCTGTAA ATCAGGCGGT GTTGCGGCTG451 TATGCGGACG AGTGGCGGCA ACCTTCGGTA CGTTGGAAAA TAGGCGCAAC501 GTCGCACAGC CTGTGGCTCT GCACGCTGCT CGGAATGCTG GTGTCGGTAT551 TGTTGCTGCT TTTGGTGCGG CAATATACGT TCAACTGGGA AAGCACGCTG601 TTGAGCAATG CCGCTTCGGT ACGCGCGGTG GAAATGTTGG CATGGCTGCC651 GTCGAAACTC GGTTTCCCTG TCCCCGATGC GCGGGCGGTC ATCGAAGGCC701 GTCTGAACGG CAATATTGCC GATGCGCGGG CTTGGTCGGG GCTGCTGGTC751 GGCAGTATCG CCTGCTACGG CATCCTGCCG CGCCTGCTGG CTTGGGTAGT801 GTGTAAAATC CTTTTGAAAA CAAGCGAAAA CGGATTGGAT TTGGAAAAGC851 CCTATTATCA GGCGGTCATC CGCCGCTGGC AGAACAAAAT CACCGATGCG901 GATACGCGTC GGGAAACCGT GTCCGCCGTT TCACCGAAAA TCATCTTGAA951 CGATGCGCCG AAATGGGCGG TCATGCTGGA GACCGAGTGG CAGGACGGCG1001 AATGGTTCGA GGGCAGGCTG GCGCAGGAAT GGCTGGATAA GGGCGTTGCC1051 ACCAATCGGG AACAGGTTGC CGCGCTGGAG ACAGAGCTGA AGCAGAAACC1101 GGCGCAACTG CTTATCGGCG TGCGCGCCCA AACTGTGCCG GACCGCGGCG1151 TGTTGCGGCA GATTGTCCGA CTCTCGGAAG CGGCGCAGGG CGGCGCGGTG1201 GTGCAGCTTT TGGCGGAACA GGGGCTTTCA GACGACCTTT CGGAAAAGCT1251 GGAACATTGG CGTAACGCGC TGGCCGAATG CGGCGCGGCG TGGCTTGAGC1301 CTGACAGGGC GGCGCAGGAA GGGCGTTTGA AAGACCAATA A它对应于氨基酸序列<SEQ ID 142;ORF33-1>:1 MLNPSRKLVE LVRILDEGGF IFSGDPVQAT EALRRVDGST EEKIIRRAEM51 IDRNRMLRET LERVRAGSFW LWVVAATFAF FTGFSVTYLL MDNQGLNFFL101 VLAGVLGMNT LMLAVWLAML 
FLRVKVGRFF SSPATWFRGK DPVNQAVLRL151 YADEWRQPSV RWKIGATSHS LWLCTLLGML VSVLLLLLVR QYTFNWESTL201 LSNAASVRAV EMLAWLPSKL GFPVPDARAV IEGRLNGNIA DARAWSGLLV251 GSIACYGILP RLLAWVVCKI LLKTSENGLD LEKPYYQAVI RRWQNKITDA301 DTRRETVSAV SPKIILNDAP KWAVMLETEW QDGEWFEGRL AQEWLDKGVA351 TNREQVAALE TELKQKPAQL LIGVRAQTVP DRGVLRQIVR LSEAAQGGAV401 VQLLAEQGLS DDLSEKLEHW RNALAECGAA WLEPDRAAQE GRLKDQ*
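The correspondence between a nucleotide sequence such as <SEQ ID 141> and its encoded protein <SEQ ID 142> can be checked by translating the DNA codon by codon. The following is a minimal Python sketch, assuming the standard genetic code and a reading frame that starts at the first base. The function name `translate` and the convention of reporting any codon with an ambiguous base as X are illustrative assumptions, not part of the original disclosure.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"   # codons starting with T
         "LLLLPPPPHHQQRRRR"   # codons starting with C
         "IIIMTTTTNNKKSSRR"   # codons starting with A
         "VVVVAAAADDEEGGGG")  # codons starting with G
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string in frame 1; '*' marks a stop codon.

    Spaces, digits and other non-letter characters (for example the
    position numbers of a sequence listing) are dropped before
    translation. Codons containing anything other than A, C, G or T
    (for example N) are reported as 'X'.
    """
    seq = "".join(ch for ch in dna.upper() if ch.isalpha())
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        protein.append(CODON_TABLE.get(seq[pos:pos + 3], "X"))
    return "".join(protein)


if __name__ == "__main__":
    # First 30 bases of <SEQ ID 141>, copied with their spacing.
    print(translate("ATGTTGAATC CATCCCGAAA ACTGGTTGAG"))
```

Applied to the first 30 bases of <SEQ ID 141>, the sketch gives MLNPSRKLVE, the first ten residues of <SEQ ID 142>. Gap placeholders written as dots in the partial sequences are not handled and would need to be resolved before translation.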
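Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF33 shows 90.9% identity over a 143 aa overlap with an ORF (ORF33a) from strain A of N. meningitidis: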
10 20 30
orf33.pep LFLRVKVGRFFSSPATWFRXKDPVNQAVLR
||||||||||||||||||| ||||||||||
orf33a LMDNQGLNFFLVLAGVXGMNTLMLAVWLAMLFLRVKVGRFFSSPATWFRGKDPVNQAVLR
90 100 110 120 130 140
40 50 60 70 80 90
orf33.pep LYXDEWRXTSVRWKIXATSHSLWLCTLLGMLVSVLLLLLVRQYTFNWESTLLSNAASVRA
|| ||||| |||||| ||||||||||||||||||||||||||||||||||||::::|||orf33a LYADEWRXPSVRWKIGATSHSLWLCTLLGMLVSVLLLLLVRQYTFNWESTLLGDSSSVRL
150 160 170 180 190 200
100 110 120 130 140orf33.pep VEMLAWLPSKLGFPVPDARSVIEGRLNGNIADARAWSGLLVXSIACXGILPRL
||||||||:||||||||||:||||||||||||||||||||| |||| ||||||orf33a VEMLAWLPAKLGFPVPDARAVIEGRLNGNIADARAWSGLLVGSTACYGILPRLLAWAVCK
210 220 230 240 250 260orf33a ILXXTSENGLDLEKXXXXXXIRRWQNKITDADTRRETVSAVSPKIVLNDAPKWAVMLETE
270 280 290 300 310 320全长ORF33a核苷酸序列<SEQ ID 143>是:1 ATGTTGAATC CATCCCGAAA ACTGGTTGAG CTGGTCCGTA TTTTGGAAGA51 AGGCGGCTTT ATTTTCAGCG GCGATCCCGT GCAGGCGACG GAGGCTTTGC101 GCCGCGTGGA CGGCAGTACG GAGGAAAAAA TCATCCGTCG GGCGAAGATG151 ATCGACAGGA ACCGTATGCT GCGGGAGACG TTGGAACGTG TGCGTGCGGG201 GTCGTTCTGG TTGTGGGTGG CGGCGGCGAC GTTTGCGTTT NTTACCGNTT251 TTTCAGTTAC TTATCTTCTA ATGGACAATC AGGGTCTGAA TTTCTTTTTG301 GTTTTGGCGG GCGTGNTGGG CATGAATACG CTGATGCTGG CAGTATGGTT351 GGCAATGTTG TTCCTGCGCG TGAAAGTGGG GCGTTTTTTC AGCAGTCCGG401 CGACGTGGTT TCGGGGCAAA GACCCTGTCA ATCAGGCGGT GTTGCGGCTG451 TATGCGGACG AGTGGCGGCN ACCTTCGGTA CGTTGGAAAA TAGGCGCAAC501 GTCGCACAGC CTGTGGCTCT GCACGCTGCT CGGAATGCTG GTGTCGGTAT551 TGTTGCTGCT TTTGGTGCGG CAATATACGT TCAACTGGGA AAGCACGCTG601 TTGGGCGATT CGTCTTCGGT ACGGCTGGTG GAAATGTTGG CATGGCTGCC651 TGCGAAACTG GGTTTTCCCG TGCCTGATGC GCGGGCGGTC ATCGAAGGTC701 GTCTGAACGG CAATATTGCC GATGCGCGGG CTTGGTCGGG GCTGCTGGTC751 GGCAGTATCG CCTGCTACGG CATCCTGCCG CGCCTCTTGG CTTGGGCGGT801 ATGCAAAATC CTTNTGNAAA CAAGCGAAAA CGGCTTGGAT TTGGAAAAGC851 NCNNNNNTCN NNCGNTCATC CGCCGCTGGC AGAACAAAAT CACCGATGCG901 GATACGCGTC GGGAAACCGT GTCCGCCGTT TCGCCGAAAA TCGTCTTGAA951 CGATGCGCCG AAATGGGCGG TCATGCTGGA GACCGAATGG CAGGACGGCG1001 AATGGTTCGA GGGCAGGCTG GCGCAGGAAT GGCTGGATAA AGGGCGTTGCC1051 GCCAATCGGG AACAGGTTGC CGCGCTGGAG ACAGAGCTGA AGCAGAAACC1101 GGCGCAACTG CTTATCGGCG TGCGCGCCCA AACTGTGCCC GACCGCGGCG1151 TGTTGCGGCA GATCGTCCGA CTTTCGGAAG CGGCGCAGGG CGGCGCGGTG1201 GTGCANCTTT TGGCGGAACA GGGGCTTTCA GACGACCTTT CGGAAAAGCT1251 GGAACATTGG CGTAACGCGC TGACCGAATG CGGCGCGGCG TGGCTGGAAC1301 CCGACAGAGC GGCGCAGGAA GGCCGTCTGA AAACCAACGA CCGCACTTGA它编码的蛋白质具有氨基酸序列<SEQ ID 144>:1 MLNPSRKLVE LVRILEEGGF IFSGDPVQAT EALRRVDGST EEKIIRRAKM51 IDRNRMLRET LERVRAGSFW LWVAAATFAF XTXFSVTYLL MDNQGLNFFL101 VLAGVXGMNT LMLAVWLAML FLRVKVGRFF SSPATWFRGK DPVNGAVLRL151 YADEWRXPSV RWKIGATSHS LWLCTLLGML VSVLLLLLVR QYTFNWESTL201 LGDSSSVRLV EMLAWLPAKL GFPVPDARAV IEGRLNGNIA DARAWSGLLV251 GSIACYGILP RLLAWAVCKI LXXTSENGLD LEKXXXXXXI RRWQNKITDA301 
DTRRETVSAV SPKIVLNDAP KWAVMLETEW QDGEWFEGRL AQEWLDKGVA351 ANREQVAALE TELKQKPAQL LIGVRAQTVP DRGVLRQIVR LSEAAQGGAV401 VXLLAEQGLS DDLSEKLEHW RNALTECGAA WLEPDRAAQE GRLKTNDRT*ORF33a和ORF33-1在444个氨基酸的重叠区内显示出有94.1%的相同性:
10 20 30 40 50 60
orf33a.pep MLNPSRKLVELVRILEEGGFIFSGDPVQATEALRRVDGSTEEKIIRRAKMIDRNRMLRET
|||||||||||||||:||||||||||||||||||||||||||||||||:|||||||||||
orf33-1 MLNPSRKLVELVRILDEGGFIFSGDPVQATEALRRVDGSTEEKIIRRAEMIDRNRMLRET
10 20 30 40 50 60
70 80 90 100 110 120orf33a.pep LERVRAGSFWLWVAAATFAFXTXFSVTYLLMDNQGLNFFLVLAGVXGMNTLMLAVWLAML
|||||||||||||:|||||| | |||||||||||||||||||||| ||||||||||||||
orf33-1 LERVRAGSFWLWVVAATFAFFTGFSVTYLLMDNQGLNFFLVLAGVLGMNTLMLAVWLAML
70 80 90 100 110 120
130 140 150 160 170 180
orf33a.pep FLRVKVGRFFSSPATWFRGKDPVNQAVLRLYADEWRXPSVRWKIGATSHSLWLCTLLGML
|||||||||||||||||||||||||||||||||||| |||||||||||||||||||||||
orf33-1 FLRVKVGRFFSSPATWFRGKDPVNQAVLRLYADEWRQPSVRWKIGATSHSLWLCTLLGML
130 140 150 160 170 180
190 200 210 220 230 240
orf33a.pep VSVLLLLLVRQYTFNWESTLLGDSSSVRLVEMLAWLPAKLGFPVPDARAVIEGRLNGNIA
|||||||||||||||||||||::::||| ||||||||:||||||||||||||||||||||
orf33-1 VSVLLLLLVRQYTFNWESTLLSNAASVRAVEMLAWLPSKLGFPVPDARAVIEGRLNGNIA
190 200 210 220 230 240
250 260 270 280 290 300
orf33a.pep DARAWSGLLVGSIACYGILPRLLAWAVCKILXXTSENGLDLEKXXXXXXIRRWQNKITDA
|||||||||||||||||||||||||:||||| |||||||||| |||||||||||
orf33-1 DARAWSGLLVGSIACYGILPRLLAWVVCKILLKTSENGLDLEKPYYQAVIRRWQNKITDA
250 260 270 280 290 300
310 320 330 340 350 360
orf33a.pep DTRRETVSAVSPKIVLNDAPKWAVMLETEWQDGEWFEGRLAQEWLDKGVAANREQVAALE
||||||||||||||:|||||||||||||||||||||||||||||||||||:|||||||||
orf33-1 DTRRETVSAVSPKIILNDAPKWAVMLETEWQDGEWFEGRLAQEWLDKGVATNREQVAALE
310 320 330 340 350 360
370 380 390 400 410 420
orf33a.pep TELKQKPAQLLIGVRAQTVPDRGVLRQIVRLSEAAQGGAVVXLLAEQGLSDDLSEKLEHW
||||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||
orf33-1 TELKQKPAQLLIGVRAQTVPDRGVLRQIVRLSEAAQGGAVVQLLAEQGLSDDLSEKLEHW
370 380 390 400 410 420
430 440 450
orf33a.pep RNALTECGAAWLEPDRAAQEGRLKTNDRTX
||||:|||||||||||||||||||
orf33-1 RNALAECGAAWLEPDRAAQEGRLKDQX
430 440
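The identity figures quoted for these alignments are the fraction of aligned columns in which both sequences carry the same residue. A minimal Python sketch of that calculation follows. It assumes the two sequences are supplied already aligned, with `-` as the gap character, and it counts every column containing at least one residue as part of the overlap; alignment programs may define the overlap length slightly differently, so the function `percent_identity` is illustrative only.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Return (identical, overlap, percent) for two aligned sequences.

    Both strings must have the same length. Columns where both
    sequences have a gap are ignored; a column with a gap in only one
    sequence counts towards the overlap but never as an identity.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = 0
    overlap = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue
        overlap += 1
        if x == y:
            identical += 1
    if overlap == 0:
        raise ValueError("no overlapping columns")
    return identical, overlap, 100.0 * identical / overlap


if __name__ == "__main__":
    # Residues 421-444 of ORF33a and ORF33-1, taken from the last
    # block of the alignment above.
    orf33a = "RNALTECGAAWLEPDRAAQEGRLK"
    orf33_1 = "RNALAECGAAWLEPDRAAQEGRLK"
    identical, overlap, pct = percent_identity(orf33a, orf33_1)
    print(f"{identical}/{overlap} identical ({pct:.1f}%)")
```

Residues reported as X in a draft sequence only count as identical here when both sequences show X in the same column, which is another point on which real alignment tools can differ.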
Homology with a predicted ORF from N. gonorrhoeae
ORF33 shows 91.6% identity over a 143 aa overlap with a predicted ORF (ORF33.ng) from N. gonorrhoeae:
orf33.pep LFLRVKVGRFFSSPATWFRXKDPVNQAVLR 30
||||||||||||||||||| | ||||||||
orf33ng LMDNQGLNFFLVLAGVLGMNTLMLAVWLATLFLRVKVGRFFSSPATWFRGKGPVNQAVLR 100
orf33.pep LYXDEWRXTSVRWKIXATSHSLWLCTLLGMLVSVLLLLLVRQYTFNWESTLLSNAASVRA 90
|| |:|| |||||| ||:|||||||||||||||||||||||||||||||||||||||||
orf33ng LYADQWRQPSVRWKIGATAHSLWLCTLLGMLVSVLLLLLVRQYTFNWESTLLSNAASVRA 160
orf33.pep VEMLAWLPSKLGFPVPDARSVIEGRLNGNIADARAWSGLLVXSIACXGILPRL 143
|||||||||||||||||||:||||||||||||||||||||| ||:| ||||||
orf33ng VEMLAWLPSKLGFPVPDARAVIEGRLNGNIADARAWSGLLVGSIVCYGILPRLLAWVVCK 220
The predicted ORF33ng nucleotide sequence <SEQ ID 145> encodes a protein having the amino acid sequence <SEQ ID 146>:
1 MIDRDRMLRD TLERVRAGSF WLWVVVASMM FTAGFSGTYL LMDNQGLNFF 51 LVLAGVLGMN TLMLAVWLAT LFLRVKVGRF FSSPATWFRG KGPVNQAVLR101 LYADQWRQPS VRWKIGATAH SLWLCTLLGM LVSVLLLLLV RQYTFNWEST151 LLSNAASVRA VEMLAWLPSK LGFPVPDARA VIEGRLNGNI ADARAWSGLL201 VGSIVCYGIL PRLLAWVVCK ILLKTSENGL DLEKTYYQAV IRRWQNKITD251 ADTRRETVSA VSPKIVLNDA PKWALMLETE WQDGQWFEGR LAQEWLDKGV301 AANREQVAAL ETELKQKPAQ LLIGVRAQTV PDRGVLRQIV RLSEAAQGGA351 VVQLLAEQGL SDDLSEKLEH WRNALTECGA AWLEPDRVAQ EGRLKDQ*进一步的序列分析揭示了下列DNA序列<SEQ ID 147>:1 ATGTTGaatC CATCCCgaAA ACTGgttgag ctGgTCCgtA Ttttgaataa51 agggggtTTT attttcagcg gcgatcctgt gcaggcgacg gaggctttgc101 gccgcgtgga cggcAGTACG GAggAaaaaa tcttccgtcg GGCGGAGAtg151 atcgACAGGg accgtatgtt gcgggACaCg TtggaacGTG TGCGTGCggg201 gtcgtTctgG TTATGGGTGG TggtggCAtC gATGATGTtt aCCGCCGGAT251 TTTCAGgcac ttatCttCTG ATGGACaatC AGGGGCtGAA TtTCTTTTTA301 GTTTTggcgG GAGTGTtggG CATGaatacG ctgATGCTGG CAGTATGGtt351 gGCAACGTTG TTCCTGCGCG TGAAAGTGGG ACGGTTTTTC AGCAGTCCGG401 CGACGTGGTT TCGGGGCAAA GGCCCTGTAA ATCAGGCGGT GTTGCGGCTG451 TATGCGGACC AGTGGCGGCA ACCTTCGGTA CGATGGAAAA TAGGCGCAAC501 GGCGCACAGC TTGTGGCTCT GCACGCTGCT CGGAATGCTG GTGTCGGTAT551 TGCTGCTGCT TTTGGTGCGG CAATATACGT TCAACTGGGA AAGCACGCTG601 TTGAGCAATG CCGCTTCGGT ACGCGCGGTG GAAATGTTGG CATGGCTGCC651 GTCGAAACTC GGTTTCCCTG TCCCCGATGC GCGGGCGGTC ATCGAAGGTC701 GTCTGAACGG CAATATTGCC GATGCGCGGG CTTGGTCGGG GCTGCTGGTC751 GGCAGTATCG TCTGCTACGG CATCCTGCCG CGCCTCTTGG CTTGGGTAGT801 GTGTAAAATC CTTTTGAAAA CAAGCGAAAA CGGattgGAT TTGGAAAAAA851 CCTATTATCA GGCGGTCATC CGCCGCTGGC AGAACAAAAT CACCGATGCG901 GATACGCGTC GGGAAACCGT GTCCGCCGTT TCGCcgaAAA TCGTCTTGAA951 CGATGCGCCG AAATGGGCGC TCATGCTGGA GACCGAGTGG CAGGACGGCC1001 AATGGTTCGA GGGCAGGCTG GCGCAGGAAT GGCTGGATAA GGGCGTTGCC1051 GCCAATCGGG AACAGGTTGC CGCGCTGGAG ACAGAGCTGA AGCAGAAACC1101 GGCGCAACTG CTTATCGGCG TACGCGCCCA AACTGTGCCG GACCGGGGCG1151 TGCTGCGGCA GATTGTGCGG CTTTCGGAAG CGGCGCAGGG CGGCGCGGTG1201 GTGCAGCTTT TGGCGGAACA GGGGCTTTCA GACGACCTTT CGGAAAAGCT1251 GGAACATTGG CGTAACGCGC TGACCGAATG CGGCGCGGCG 
TGGCTTGAGC1301 CTGACAGGGT GGCGCAGGAA GGCCGTTTGA AAGACCAATA A它编码的蛋白质具有氨基酸序列<SEQ ID 148;ORF33ng-1>:1 MLNPSRKLVE LVRILNKGGF IFSGDPVQAT EALRRVDGST EEKIFRRAEM51 IDRDRMLRDT LERVRAGSFW LWVVVASMMF TAGFSGTYLL MDNQGLNFFL101 VLAGVLGMNT LMLAVWLATL FLRVKVGRFF SSPATWFRGK GPVNQAVLRL151 YADQWRQPSV RWKIGATAHS LWLCTLLGML VSVLLLLLVR QYTFNWESTL201 LSNAASVRAV EMLAWLPSKL GFPVPDARAV IEGRLNGNIA DARAWSGLLV251 GSIVCYGILP RLLAWVVCKI LLKTSENGLD LEKTYYQAVI RRWQNKITDA301 DTRRETVSAV SPKIVLNDAP KWALMLETEW QDGQWFEGRL AQEWLDKGVA351 ANREQVAALE TELKQKPAQL LIGVRAQTVP DRGVLRQIVR LSEAAQGGAV401 VQLLAEQGLS DDLSEKLEHW RNALTECGAA WLEPDRVAQE GRLKDQ*ORF33ng-1和ORF33-1在446个氨基酸的重叠区内显示出有94.6%的相同性:
10 20 30 40 50 60orf33-1.pep MLNPSRKLVELVRILDEGGFIFSGDPVQATEALRRVDGSTEEKIIRRAEMIDRNRMLRET
|||||||||||||||::|||||||||||||||||||||||||||:||||||||:||||:|orf33ng-1 MLNPSRKLVELVRILNKGGFIFSGDPVQATEALRRVDGSTEEKIFRRAEMIDRDRMLRDT
10 20 30 40 50 60
70 80 90 100 110 120orf33-1.pep LERVRAGSFWLWVVAATFAFFTGFSVTYLLMDNQGLNFFLVLAGVLGMNTLMLAVWLAML
||||||||||||||:|:: | :||| |||||||||||||||||||||||||||||||| |orf33ng-1 LERVRAGSFWLWVVVASMMFTAGFSGTYLLMDNQGLNFFLVLAGVLGMNTLMLAVWLATL
70 80 90 100 110 120
130 140 150 160 170 180
orf33-1.pep FLRVKVGRFFSSPATWFRGKDPVNQAVLRLYADEWRQPSVRWKIGATSHSLWLCTLLGML
|||||||||||||||||||| ||||||||||||:|||||||||||||:||||||||||||
orf33ng-1 FLRVKVGRFFSSPATWFRGKGPVNQAVLRLYADQWRQPSVRWKIGATAHSLWLCTLLGML
130 140 150 160 170 180
190 200 210 220 230 240
orf33-1.pep VSVLLLLLVRQYTFNWESTLLSNAASVRAVEMLAWLPSKLGFPVPDARAVIEGRLNGNIA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf33ng-1 VSVLLLLLVRQYTFNWESTLLSNAASVRAVEMLAWLPSKLGFPVPDARAVIEGRLNGNIA
190 200 210 220 230 240
250 260 270 280 290 300
orf33-1.pep DARAWSGLLVGSIACYGILPRLLAWVVCKILLKTSENGLDLEKPYYQAVIRRWQNKITDA
|||||||||||||:||||||||||||||||||||||||||||| ||||||||||||||||
orf33ng-1 DARAWSGLLVGSIVCYGILPRLLAWVVCKILLKTSENGLDLEKTYYQAVIRRWQNKITDA
250 260 270 280 290 300
310 320 330 340 350 360
orf33-1.pep DTRRETVSAVSPKIILNDAPKWAVMLETEWQDGEWFEGRLAQEWLDKGVATNREQVAALE
||||||||||||||:||||||||:|||||||||:||||||||||||||||:|||||||||
orf33ng-1 DTRRETVSAVSPKIVLNDAPKWALMLETEWQDGQWFEGRLAQEWLDKGVAANREQVAALE
310 320 330 340 350 360
370 380 390 400 410 420
orf33-1.pep TELKQKPAQLLIGVRAQTVPDRGVLRQIVRLSEAAQGGAVVQLLAEQGLSDDLSEKLEHW
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf33ng-1 TELKQKPAQLLIGVRAQTVPDRGVLRQIVRLSEAAQGGAVVQLLAEQGLSDDLSEKLEHW
370 380 390 400 410 420
430 440
orf33-1.pep RNALAECGAAWLEPDRAAQEGRLKDQX
||||:|||||||||||:||||||||||
orf33ng-1 RNALTECGAAWLEPDRVAQEGRLKDQX
430 440
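Based on the presence of several putative transmembrane domains in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.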
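The text does not state how the putative transmembrane domains were predicted. One common heuristic, shown here purely as an illustration, is a sliding-window hydropathy profile on the Kyte-Doolittle scale, in which windows of about 19 residues with a mean score above roughly 1.6 are flagged as candidate membrane-spanning segments. The window size, the threshold and the function names below are assumptions, not the method used in the source.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=19):
    """Mean hydropathy of every full window along the protein.

    Unknown residues (for example X) contribute 0.0. The result has
    len(protein) - window + 1 entries, or is empty if the protein is
    shorter than the window.
    """
    scores = [KYTE_DOOLITTLE.get(aa, 0.0) for aa in protein.upper()]
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(scores) - window + 1)
    ]


def candidate_segments(protein, window=19, threshold=1.6):
    """1-based start positions of windows whose mean exceeds threshold."""
    profile = hydropathy_profile(protein, window)
    return [i + 1 for i, mean in enumerate(profile) if mean > threshold]


if __name__ == "__main__":
    # Residues 171-189 of ORF33-1 <SEQ ID 142>, a leucine-rich stretch.
    segment = "LWLCTLLGMLVSVLLLLLV"
    print(hydropathy_profile(segment))
    print(candidate_segments(segment))
```

A profile of this kind only suggests candidate regions; dedicated topology predictors use more elaborate models.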
Example 25
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 149>:
1 ..CAGAAGAGTT TGTCGAGAAT TTCTTTATGG GGTTTGGGCG GCGTGTTTTT
51 CGGGGTGTCC GGTCTGGTAT GGTTTTCTTT GGGCGTTTCT TT.GAGTGCG
101 CCTGTTTTTC GGGTGTTTCT TTTCGGGGTT CGGGACGGGG GACGTTTGTG
151 GGCAGTACGG GGGTTTCTTT GAGTGTGTTT TCAGCTTGTG TTCC.GGCGT
201 CGTCCGGCTG CCTGTCGGTT TGAGCTGTGT CGGCAGGTTG CG..GTTTGA
251 CCCGGTTTTT CTTGGGTGCG GCAGGGGACG TCATTCTCCT GCCGCTTTCG
301 TCTGTGCCGT CCGGCTGTGC GGGTTCGGAT GAGGCGGCGT GGTGGTGTTC
351 GGGTTGGGCG GCATCTTGTT CCGACTACGC CGTTTGGCAG CCAGAATTCG
401 GTTTCGCGGG GGCTGTCGGT GTGTTGCGGT TCGGCTTGAA GGGTTTTGTC
451 GTCC..
This corresponds to the amino acid sequence <SEQ ID 150; ORF34>:
1 ..QKSLSRISLW GLGGVFFGVS GLVWFSLGVS XECACFSGVS FRGSGRGTFV51 GSTGVSLSVF SACVXGVVRL PVGLSCVGRL XXLTRFFLGA AGDVILLPLS101 SVPSGCAGSD EAAWWCSGWA ASCPTTPFGS QNSVSRGLSV CCGSA*RVLS151 S..进一步的工作揭示了完整的核苷酸序列<SEQ ID 151>:1 ATGATGATGC CGTTCATAAT GCTTCCTTGG ATTGCkGGTG TGCCTGCCGT51 GCCGGGTCAG AATAGGTTGT CCAGAATTTC TTTATGGGGT TTGGGCGGCG101 TGTTTTTCGG GGTGTCCGGT TTGGTATGGT TTTCTTTGGG CGTTTCTTTG151 GGCTGCGCCT GTTTTTCGGG TGTTTCTTTT CGGGGTTCGG GACGGGGGAC201 GTTTGTGGGC AGTACGGGGG TTTCTTTGAG TGTGTTTTCA GCTTGTGTTC251 CGGCGTCGTC CGGCTGCCTG TCGGTTTGAG CTGTGTCGGC AGGTTGCGGT301 TTGACCCGGT TTTTCTTGGG TGCGGCAGGG GACGGCAGTC CGCTGCCGCT351 TTCGTCTGTG CCGTCCGGCT GTGCGGGTTC GGATGAGGCG GCGTGGTGGT401 GTTCGGGTTG GGCGGCATCT TGTCCGACTA CGCCGTTTGG CAGCCAGAAT451 TCGGTTTCGC GGGGGCTGTC GGTGTGTTGC GGTTCGGCTT GAAGGGTTTT501 GTCGCCGTTC GGGTTGAATG TGCTGACGAT GCCTATTGCC AATGCGCCGA551 TGGCGGCGAT ACAGATGAGC AATACGGCGC GTATCAGGAG TTTGGGGGTC601 AGCCTGAAGG GTTTGTTCGG TTTTTTTGCC ATTTTGATTG TGCTTTTGGG651 GTGTCGGGCA ATGCCGTCTG AAGGCGGTTC AGACGGCATT GCCGAGTCAG701 CGTTGGACGT AGTTTTGGTA GAGGGTGATG ACTTTTTGTA CGCCGACGGT751 GGTGCTGACT TTTTGGGTAA TCTGCGCCTG TTCTTCGGGG GTGAGGATGC801 CCATAACGTA GGTTACGTTG CCGTAGGTAA CGATTTTGAC GCGCGCCTGT851 GTGGCGGGGC TGATGCCCAA CAGCGTGGCG CGGACTTTGG ATGTGTTCCA901 AGTGTCGCCG GCGATGTCGC CGGCAGTGCG CGGCAGGGAG GCGACGGTAA951 TATAGTTGTA CACGCCTTCG GCGGCCTGTT CGGAACGTGC AATCTGACCG1001 ACGAACTGTT TTTCGCCTTC GGTGGCGACT TGTCCGAGCA GCAGCAGGTG1051 GCGGTTGTAG CCGACGACGG AGATTTGGGG CGTGTAGCCT TTGGTTTGGT1101 TGTTTTGGCG CAGATAGGAA CGGGCGGTGG TTTCGATACG CAACGCCATA1151 ACGTTGTCGT CGGTTTGCGC GCCGGTGGTT CGGCGGTCGA CGGCGGATTT1201 CGCGCCGACG GCGGCGCTTC CGATTACTGC GCTGACGCAG CCGCTAAGGG1251 CAAGGCTGAA AATGGCGGCA ATCAGGGTGC GGACGGTGTG CGGTTTGGGT1301 TTCATCGGGT GCTTCCTTTC TTGGGCGTTT CAGACGGCAT TGCTTTGCGC1351 CATGCCGTCT GA它对应于氨基酸序列<SEQ ID 152;ORF34-1>:1 MMMPFIMLPW IAGVPAVPGQ NRLSRISLWG LGGVFFGVSG LVWFSLGVSL51 GCACFSGVSF RGSGRGTFVG STGVSLSVFS ACVPASSGCL SV*AVSAGCG101 LTRFFLGAAG DGSPLPLSSV PSGCAGSDEA AWWCSGWAAS 
CPTTPFGSQN151 SVSRGLSVCC GSA*RVLSPF GLNVLTMPIA NAPMAAIQMS NTARIRSLGV201 SLKGLFGFFA ILIVLLGCRA MPSEGGSDGI AESALDVVLV EGDDFLYADG251 GADFLGNLRL FFGGEDAHNV GYVAVGNDFD ARLCGGADAQ QRGADFGCVP301 SVAGDVAGSA RQGGDGNIVV HAFGGLFGTC NLTDELFFAF GGDLSEQQQV351 AVVADDGDLG RVAFGLVVLA QIGTGGGFDT QRHNVVVGLR AGGSAVDGGF401 RADGGASDYC ADAAAKGKAE NGGNQGADGV RFGFHRVLPF LGVSDGIALR451 HAV*
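Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF34 shows 73.3% identity over a 161 aa overlap with an ORF (ORF34a) from strain A of N. meningitidis: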
10 20 30
orf34.pep QKSLSRISLWGLGGVFFGVSGLVWFSLGVSXE------CAC
|| ||| ||||||| |||||||||||||||| |||
orf34a MMXPXIMLPWIAGVPAVPGQKRLSRXSLWGLGGXFFGVSGLVWFSLGVSXSLGVSXGCAC
10 20 30 40 50 60
40 50 60 70 80 90
orf34.pep FSGVSFRGSGRGTFVGSTGVSLSVFSACVXGVVRLPVGLSCVGRLXX-----LTRFFLGA
||||||||||||||||||||||||||||: |:: :|:: ||| | ||
orf34a FSGVSFRGSGRGTFVGSTGVSLSVFSACA------PASSGCLSVXAVSAGCGLTRXFXGA
70 80 90 100 110
100 110 120 130 140 150orf34.pep AGDVILLPLSSVPSGCAGSDEAAWWCSGWAASCPTTPFGSQNSVSRGLSVCCGSAXRVLS
||| ||||||||||||:|| | |||||||||||||||||||||||||||||: ||||orf34a AGDGSPLPLSSVPSGCAGADEEAXXCSGWAASCPTTPFGSQNSVSRGLSVCCGSVWRVLS
120 130 140 150 160 170orf34.pep Sorf34a PFGXNVLTMPIANAPMAVIQMSNTARIRSLGVSLKGLFXFFAILIVLLGCRAMPSEGGSD
180 190 200 210 220 230全长ORF34a核苷酸序列<SEQ ID 153>是:1 ATGATGATNC CGTTNATAAT GCTTCCTTGG ATTGCGGGTG TGCCTGCCGT51 GCCGGGTCAG AAGAGGTTGT CGAGAANTTC TTTATGGGGT TTAGGCGGCN101 TGTTTTTCGG GGTGTCCGGT TTGGTATGGT TTTCTTTGGG CGTTTCTNTT151 TCTTTGGGTG TTTCTNTGGG CTGTGCCTGT TTTTCGGGTG TTTCTTTTCG201 GGGTTCGGGA CGGGGGACGT TTGTGGGCAG TACNGGGGTT TCTTTGAGTG251 TGTTTTCAGC TTGTGCTCCG GCGTCGTCCG GCTGCCTGTC GGTTTNAGCT301 GTGTCGGCAG GTTGCGGTTT GACCCGGNTT TTCTTNGGTG CGGCAGGGGA351 CGGCAGTCCG CTGCCGCTTT CGTCTGTGCC GTCCGGCTGT GCGGGTGCGG401 ATGAGGAGGC GTNGTNGTGT TCGGGTTGGG CGGCATCTTG TCCGACTACG451 CCGTTTGGCA GCCAGAATTC GGTTTCGCGG GGGCTGTCGG TGTGTTGCGG501 TTCGGTNTGG AGGGTTTTGT CNCCGTTCGG GTNGAATGTG CTGACGATGC551 CTATTGCCAA TGCGCCGATG GCGGTGATAC AGATGAGCAA TACGGCGCGT601 ATCAGGAGTT TGGGGGTCAG CCTGAAGGGT TTGTTCNGTT TTTTTGCCAT651 TTTGATTGTG CTTTTGGGGT GTCGGGCAAT GCCGTCTGAA GGCGGTTCAG701 ACGGCATTGC CGAGTCAGCG TTGGACGTAG TTTNGGTAGA GGGTGATGAC751 TTTTTGTACG CCGACGGTGG TGCTGACTTT TTGGGTAATC TGCGCCTGTT801 CTTCGGGGGT GAGGATGCCC ATAACGTAGG TTACGTTGCC GTAGGTAACG851 ATTTTGACGC GCGCCTGTGT GGCGGGGCTG ATGCCCAACA GCGTGGCGCG901 GACTTTGGAT GTGTTCCAAG TGTCGCCGGC GATGTCGCCG GCAGTGCGCG951 GCAGGGAGGC GACGGTAATG TANTTGTACA CGCCTTCGGC GGCCTGTTCG1001 GAACGTGCAA TCTGACCGAC GAACTGTTTC TCGCCTTCGG TGGCGACTTG1051 TCCGAGCAGC AGCAGGTGGC GGTTGTAGCC GACAACGGAG ATTTGGGGCG1101 TGTANCCTTT GGTTTGGTTG TTTTGGCGCA GATAGGAGCG GGCGGTGGTT1151 TCGATACGCA GCGCCATTAC GTTGTCGTCG GTTNGCGCGC CGGTGGTTCG1201 GCGGTCGACG GCGGATTTCG CGCCGACCGC CGCGCCGCCG ACGACTGCGC1251 TGACGCAGCC GCCGAGGGCA AGGCTGAGGA CGGCGGCAGT CAGGGTGCGG1301 ACGGTGTGCG GTTTGGGTTT CATCGGGTGC TTCCTTTCTT GGGCGTTTCA1351 GACGGCATTG CTTTGCGCCA TGCCGTCTGA它编码的蛋白质具有氨基酸序列<SEQ ID 154>:1 MMXPXIMLPW IAGVPAVPGQ KRLSRXSLWG LGGXFFGVSG LVWFSLGVSX51 SLGVSXGCAC FSGVSFRGSG RGTFVGSTGV SLSVFSACAP ASSGCLSVXA101 VSAGCGLTRX FXGAAGDGSP LPLSSVPSGC AGADEEAXXC SGWAASCPTT151 PFGSQNSVSR GLSVCCGSVW RVLSPFGXNV LTMPIANAPM AVIQMSNTAR201 IRSLGVSLKG LFXFFAILIV LLGCRAMPSE GGSDGIAESA LDVVXVEGDD251 FLYADGGADF LGNLRLFFGG 
EDAHNVGYVA VGNDFDARLC GGADAQQRGA301 DFGCVPSVAG DVAGSARQGG DGNVXVHAFG GLFGTCNLTD ELFLAFGGDL351 SEQQQVAVVA DNGDLGRVXF GLVVLAQIGA GGGFDTQRHY VVVGXRAGGS401 AVDGGFRADR RAADDCADAA AEGKAEDGGS QGADGVRFGF HRVLPFLGVS451 DGIALRHAV*ORF34a和ORF34-1在459个氨基酸的重叠区内显示出有91.3%的相同性:
10 20 30 40 50 60orf34a.pep MMXPXIMLPWIAGVPAVPGQKRLSRXSLWGLGGXFFGVSGLVWFSLGVSXSLGVSXGCAC
|| | |||||||||||||||:|||| ||||||| ||||||||||||||| ||||orf34-1 MMMPFIMLPWIAGVPAVPGQNRLSRISLWGLGGVFFGVSGLVWFSLGVSL------GCAC
10 20 30 40 50
70 80 90 100 110 120
orf34a.pep FSGVSFRGSGRGTFVGSTGVSLSVFSACAPASSGCLSVXAVSAGCGLTRXFXGAAGDGSP
||||||||||||||||||||||||||||:|||||||||||||||||||| | ||||||||
orf34-1 FSGVSFRGSGRGTFVGSTGVSLSVFSACVPASSGCLSVXAVSAGCGLTRFFLGAAGDGSP
60 70 80 90 100 110
130 140 150 160 170 180
orf34a.pep LPLSSVPSGCAGADEEAXXCSGWAASCPTTPFGSQNSVSRGLSVCCGSVWRVLSPFGXNV
||||||||||||:|| | |||||||||||||||||||||||||||||: ||||||| ||
orf34-1 LPLSSVPSGCAGSDEAAWWCSGWAASCPTTPFGSQNSVSRGLSVCCGSAXRVLSPFGLNV
120 130 140 150 160 170
190 200 210 220 230 240
orf34a.pep LTMPIANAPMAVIQMSNTARIRSLGVSLKGLFXFFAILIVLLGCRAMPSEGGSDGIAESA
|||||||||||:|||||||||||||||||||| |||||||||||||||||||||||||||
orf34-1 LTMPIANAPMAAIQMSNTARIRSLGVSLKGLFGFFAILIVLLGCRAMPSEGGSDGIAESA
180 190 200 210 220 230
250 260 270 280 290 300
orf34a.pep LDVVXVEGDDFLYADGGADFLGNLRLFFGGEDAHNVGYVAVGNDFDARLCGGADAQQRGA
|||| |||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf34-1 LDVVLVEGDDFLYADGGADFLGNLRLFFGGEDAHNVGYVAVGNDFDARLCGGADAQQRGA
240 250 260 270 280 290
310 320 330 340 350 360
orf34a.pep DFGCVPSVAGDVAGSARQGGDGNVXVHAFGGLFGTCNLTDELFLAFGGDLSEQQQVAVVA
|||||||||||||||||||||||: ||||||||||||||||||:||||||||||||||||
orf34-1 DFGCVPSVAGDVAGSARQGGDGNIVVHAFGGLFGTCNLTDELFFAFGGDLSEQQQVAVVA
300 310 320 330 340 350
370 380 390 400 410 420
orf34a.pep DNGDLGRVXFGLVVLAQIGAGGGFDTQRHYVVVGXRAGGSAVDGGFRADRRAADDCADAA
|:|||||| ||||||||||:||||||||| |||| |||||||||||||| |:| |||||
orf34-1 DDGDLGRVAFGLVVLAQIGTGGGFDTQRHNVVVGLRAGGSAVDGGFRADGGASDYCADAA
360 370 380 390 400 410
430 440 450 460
orf34a.pep AEGKAEDGGSQGADGVRFGFHRVLPFLGVSDGIALRHAVX
|:||||:||:||||||||||||||||||||||||||||||
orf34-1 AKGKAENGGNQGADGVRFGFHRVLPFLGVSDGIALRHAVX
420 430 440 450
Homology with a predicted ORF from N. gonorrhoeae
ORF34 shows 77.6% identity over a 161 aa overlap with a predicted ORF (ORF34.ng) from N. gonorrhoeae:
orf34.pep QKSLSRISLWGLGGVFFGVSGLVWFSLGVSXE------CAC 35
|| |||||||||:||||||||||||||||| |||
orf34ng MMMPFIMLPWIAGVPAVPGQKRLSRISLWGLAGVFFGVSGLVWFSLGVSFSLGVSLGCAC 60
orf34.pep FSGVSFRGSGRGTFVGSTGVSLSVFSACVXGVVRLPVGLSCV-----GRLXXLTRFFLGA 90
|||||||||| |:||||||||||||||| :||: | : || ||||||||
orf34ng FSGVSFRGSGWGAFVGSTGVSLSVFSACVP----VPVNESAARAASEGR--GLTRFFLGA 114
orf34.pep AGDVILLPLSSVPSGCAGSDEAAWWCSGWAASCPTTPFGSQNSVSRGLSVCCGSAXRVLS 150
||| |||||||||||||||||||||||||||||:||||||||||||||||||: ||||
orf34ng AGDGSPLPLSSVPSGCAGSDEAAWWCSGWAASCPTAPFGSQNSVSRGLSVCCGSVWRVLS 174
orf34.pep S 175
orf34ng PFGLNVLTMPTANAPMAVIQMSNTARIRSLGVSLKGLFGFFAILIVLLGCRAMPSEGGSD 234全长ORF34ng核苷酸序列<SEQ ID 155>是:1 ATGATGATGC CGTTCATAAT GCTTCCTTGG ATTGCGGGTG TGCCTGCCGT51 GCCGGGTCAA AAGAGGTTGT CGAGAATCTC TTTATGGGGT TTGGCCGGCG101 TGTTTTTCGG GGTGTCCGGT TTGGTATGGT TTTCTTTGGG CGTTTCTTTT151 TCTTTGGGTG TTTCTTTGGG CTGCGCCTGT TTTTCGGGTG TTTCTTTTCG201 GGGTTCGGGA TGGGGGGCGT TTGTGGGCAG TACGGGGGTT TCTTTGAGTG251 TGTTTTCAGC TTGTGTTCCG GTGCCGGTTA ACGAATCGGC TGCCCGGGCC301 GCATCCGAAG GGCGCGGTTT gACCCGGTTT TTCTTGGGTG CGGCAGGGGA351 CGGCAGTCCG CTGCCGCTTT CTTCTGTGCC GTCCGGCTGT GCGGGTTCGG401 ATGAGGCGGC GTGGTGGTGT TCGGGTTGGG CGGCATCTTG TCCGACGGCG451 CCGTTTGGCA GCCAGAATTC GGTTTCGCGG GGGCTGTCGG TGTGTTGCGG501 TTCGGTTTGG AGGGTTTTGT CGCCGTTCGG GTTGAATGTG CTGACGATGC551 CTACTGCCAA TGCGCCGATG GCGGTGATAC AGATGAGCAA TACGGCGCGT601 ATCAGGAGTT TGGGGGTCAG CCTGAAGGGT TTGTTCGGTT TTTTTGCCAT651 TTTGATTGTG CTTTTGGGGT GTCGGGCAAT GCCGTCTGAA GGCGGTTCAG701 ACGGCATTGC CGAGTCAGCG TTGGACGTAG TTTTGGTAGA GGGTAATGAC751 TTTTTGTACG CCGAcggTGG TGCTGACTTT TTGGGTAATC TGCGCCTGTT801 CTTCGGGGGT GAGGATGCCC ATAACGTAGG TTACATTGCC GTAGGTAATG851 ATTTTGACGC GCGCCTGTGT AGCGGGGCTG ATGCCCAGCA GcgtgGCGCG901 GACTTTGGAC GTGTTCCAAG TGTCGCCGGC GATGTCGCCC GCAGTGCGCG951 GCAGGGAGGC GACGGTAATG TAGTTGTATA CGCCTTCGGC GGCCTGTTCG1001 GAACGTGCAA TCTGACCGAC GAACTGTTTT TCGCCTTCGG TGGCGACTTG1051 TCCGAGCAGC AGCAGGTGGC GGTTGTAGCC GACGACGGAG ATTTGGGGCG1101 TGTAGCCTTT GGTTTGGTTG TTTTGGCGCA GGTAGGAACG GGCGGTGGTT1151 TCGATACGCA ACGCCATAAC GTtgtCATCG GTTtgcgcgc CGGTGGTTcg1201 gCGGTCGATG ACGGATTTTG CGCCGACGGC GGCCCCGCCG ACGACTGCGC1251 TGAAGCAGCC GCCGAGGGCA AGGCTGAGGA CGGCGGCAAT CAGGGTGCGG1301 ACGGTGTGTG GTTTGGGTTT CATCGGGGAC TTCCTTTCTT GGGCGTTTCA1351 GACGGCATTG CTTTGCGCCA TGCCGTCTGA它编码的蛋白质具有氨基酸序列<SEQ ID 156>:1 MMMPFIMLPW IAGVPAVPGQ KRLSRISLWG LAGVFFGVSG LVWFSLGVSF51 SLGVSLGCAC FSGVSFRGSG WGAFVGSTGV SLSVFSACVP VPVNESAARA101 ASEGRGLTRF FLGAAGDGSP LPLSSVPSGC AGSDEAAWWC SGWAASCPTA151 PFGSQNSVSR GLSVCCGSVW RVLSPFGLNV LTMPTANAPM AVIQMSNTAR201 IRSLGVSLKG LFGFFAILIV LLGCRAMPSE 
GGSDGIAESA LDVVLVEGND251 FLYADGGADF LGNLRLFFGG EDAHNVGYIA VGNDFDARLC SGADAQQRGA301 DFGRVPSVAG DVARSARQGG DGNVVVYAFG GLFGTCNLTD ELFFAFGGDL351 SEQQQVAYVA DDGDLGRVAF GLVVLAQVGT GGGFDTQRHN VVIGLRAGGS401 AVDDGFCADG GPADDCAEAA AEGKAEDGGN QGADGVWFGF HRGLPFLGVS451 DGIALRHAV*ORF34ng和ORF34-1在459个氨基酸的重叠区内显示出有90.0%的相同性:
10 20 30 40 4 50orf34-1.pep MMMPFIMLPWIAGVPAVPGQNRLSRISLWGLGGVFFGVSGLVWFSLGVS------LGCAC
||||||||||||||||||||:||||||||||:||||||||||||||||| |||||orf34ng MMMPFIMLPWIAGVPAVPGQKRLSRISLWGLAGVFFGVSGLVWFSLGVSFSLGVSLGCAC
10 20 30 40 50 60
60 70 80 90 100 110orf34-1.pep FSGVSFRGSGRGTFVGSTGVSLSVFSACVPASSGCLSVXAVSAGCGLTRFFLGAAGDGSP
|||||||||| |:|||||||||||||||||: : :: |:| | |||||||||||||||orf34ng FSGVSFRGSGWGAFVGSTGVSLSVFSACVPVPVNESAARAASEGRGLTRFFLGAAGDGSP
70 80 90 100 110 120
120 130 140 150 160 170orf34-1.pep LPLSSVPSGCAGSDEAAWWCSGWAASCPTTPFGSQNSVSRGLSVCCGSAXRVLSPFGLNV
|||||||||||||||||||||||||||||:||||||||||||||||||: ||||||||||orf34ng LPLSSVPSGCAGSDEAAWWCSGWAASCPTAPFGSQNSVSRGLSVCCGSVWRVLSPFGLNV
130 140 150 160 170 180
180 190 200 210 220 230
orf34-1.pep LTMPIANAPMAAIQMSNTARIRSLGVSLKGLFGFFAILIVLLGCRAMPSEGGSDGIAESA
|||| ||||||:||||||||||||||||||||||||||||||||||||||||||||||||
orf34ng LTMPTANAPMAVIQMSNTARIRSLGVSLKGLFGFFAILIVLLGCRAMPSEGGSDGIAESA
190 200 210 220 230 240
240 250 260 270 280 290
orf34-1.pep LDVVLVEGDDFLYADGGADFLGNLRLFFGGEDAHNVGYVAVGNDFDARLCGGADAQQRGA
||||||||:|||||||||||||||||||||||||||||:|||||||||||:|||||||||
orf34ng LDVVLVEGNDFLYADGGADFLGNLRLFFGGEDAHNVGYIAVGNDFDARLCSGADAQQRGA
250 260 270 280 290 300
300 310 320 330 340 350
orf34-1.pep DFGCVPSVAGDVAGSARQGGDGNIVVHAFGGLFGTCNLTDELFFAFGGDLSEQQQVAVVA
||| ||||||||| |||||||||:||:|||||||||||||||||||||||||||||||||
orf34ng DFGRVPSVAGDVARSARQGGDGNVVVYAFGGLFGTCNLTDELFFAFGGDLSEQQQVAVVA
310 320 330 340 350 360
360 370 380 390 400 410
orf34-1.pep DDGDLGRVAFGLVVLAQIGTGGGFDTQRHNVVVGLRAGGSAVDGGFRADGGASDYCADAA
|||||||||||||||||:||||||||||||||:|||||||||| || |||| :| ||:||
orf34ng DDGDLGRVAFGLVVLAQVGTGGGFDTQRHNVVIGLRAGGSAVDDGFCADGGPADDCAEAA
370 380 390 400 410 420
420 430 440 450
orf34-1.pep AKGKAENGGNQGADGVRFGFHRVLPFLGVSDGIALRHAVX
|:||||:||||||||| ||||| |||||||||||||||||
orf34ng AEGKAEDGGNQGADGVWFGFHRGLPFLGVSDGIALRHAVX
430 440 450 460
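Based on this analysis, including the presence of a putative leader sequence (double-underlined) and several putative transmembrane domains (single-underlined) in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 26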
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 157>:
1 ATGAAAACCT TCTTCAAAAC CCTTTCCGCC GCCGCACTCG CGCTCATCCT
51 CGCCGCCTGC GGATT.CAAA AAGACAGCGC GCCCGCCGCA TCCGCTTCTG
101 CCGCCGCCGA CAACGGCGCG GCGTAAAAAA GAAATCGTCT TCGGCACGAC
151 CGTCGGCGAC TTCGGCGATA TGGTCAAAGA ACAAATCCAA GCCGAGCTGG
201 AGAAAAAAGG CTACACCGTC AAACTGGTCG AGTTTACCGA CTATGTACGC
251 CCGAATCTGG CATTGGCTGA GGGCGAGTTG
This corresponds to the amino acid sequence <SEQ ID 158; ORF4>:
1 MKTFFKTLSA AALALILAAC G.QKDSAPAA SASAAADNGA AKKEIVFGTT51 VGDFGDMVKE QIQAELEKKG YTVKLVEFTD YVRPNLALAE GEL进一步的序列分析揭示了完整的核苷酸序列<SEQ ID 159>:1 ATGAAAACCT TCTTCAAAAC CCTTTCCGCC GCCGCACTCG CGCTCATCCT51 CGCCGCCTGC GGCGGTCAAA AAGACAGCGC GCCCGCCGCA TCCGCTTCTG101 CCGCCGCCGA CAACGGCGCG GCGAAAAAAG AAATCGTCTT CGGCACGACC151 GTCGGCGACT TCGGCGATAT GGTCAAAGAA CAAATCCAAG CCGAGCTGGA201 GAAAAAAGGC TACACCGTCA AACTGGTCGA GTTTACCGAC TATGTACGCC251 CGAATCTGGC ATTGGCTGAG GGCGAGTTGG ACATCAACGT CTTCCAACAC301 AAACCCTATC TTGACGACTT CAAAAAAGAA CACAATCTGG ACATCACCGA351 AGTCTTCCAA GTGCCGACCG CGCCTTTGGG ACTGTACCCG GGCAAGCTGA401 AATCGCTGGA AGAAGTCAAA GACGGCAGCA CCGTATCCGC GCCCAACGAC451 CCGTCCAACT TCGCCCGCGT CTTGGTGATG CTCGACGAAC TGGGTTGGAT501 CAAACTCAAA GACGGCATCA ATCCGTTGAC CGCATCCAAA GCGGACATCG551 CCGAGAACCT GAAAAACATC AAAATCGTCG AGCTTGAAGC CGCGCAACTG601 CCGCGTAGCC GCGCCGACGT GGATTTTGCC GTCGTCAACG GCAACTACGC651 CATAAGCAGC GGCATGAAGC TGACCGAAGC CCTGTTCCAA GAACCGAGCT701 TTGCCTATGT CAACTGGTCT GCCGTCAAAA CCGCCGACAA AGACAGCCAA751 TGGCTTAAAG ACGTAACCGA GGCCTATAAC TCCGACGCGT TCAAAGCCTA801 CGCGCACAAA CGCTTCGAGG GCTACAAATC CCCTGCCGCA TGGAATGAAG851 GCGCAGCCAA ATAA它对应于氨基酸序列<SEQ ID 160;ORF4-1>:1 MKTFFKTLSA AALALILAAC GGQKDSAPAA SASAAADNGA AKKEIVFGTT51 VGDFGDMVKE QIQAELEKKG YTVKLVEFTD YVRPNLALAE GELDINVFQH101 KPYLDDFKKE HNLDITEVFQ VPTAPLGLYP GKLKSLEEVK DGSTVSAPND151 PSNFARVLVM LDELGWIKLK DGINPLTASK ADIAENLKNI KIVELEAAQL201 PRSRADVDFA VVNGNYAISS GMKLTEALFQ EPSFAYVNWS AVKTADKDSQ251 WLKDVTEAYN SDAFKAYAHK RFEGYKSPAA WNEGAAK*
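Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF4 shows 93.5% identity over a 93 aa overlap with an ORF (ORF4a) from strain A of N. meningitidis: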
10 20 30 40 50 59
orf4.pep MKTFFKTLSAAALALILAACG-QKDSAPAASASAAADNGAAKKEIVFGTTVGDFGDMVKE
||||||||||||||||||||| ||||||||||||||||||| ||||||||||||||||||
orf4a MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAAXKEIVFGTTVGDFGDMVKE
10 20 30 40 50 60
60 70 80 90
orf4.pep QIQAELEKKGYTVKLVEFTDYVRPNLALAEGEL
|| ||||||||||||| ||||| |||||||||
orf4a XIQPELEKKGYTVKLVEXTDYVRXNLALAEGELDINVXQHXXYLDDXKKXHNLDITXVXQ
70 80 90 100 110 120
orf4a VPTAPLGLYPGKLKSLXXVKXGSTVSAPNDPXXFXRVLVMLDELGXIKLKDXIXXXXXXX
130 140 150 160 170 180
全长ORF4a核苷酸序列<SEQ ID 161>是:1 ATGAAAACCT TCTTCAAAAC CCTTTCCGCC GCCGCACTCG CGCTCATCCT51 CGCCGCCTGC GGCGGTCAAA AAGATAGCGC GCCCGCCGCA TCCGCTTCTG101 CCGCCGCCGA CAACGGCGCG GCGAANAAAG AAATCGTCTT CGGCACGACC151 GTCGGCGACT TCGGCGATAT GGTCAAAGAA CANATCCAAC CCGAGCTGGA201 GAAAAAAGGC TACACCGTCA AACTGGTCGA GTNTACCGAC TATGTGCGCN251 CGAATCTGGC ATTGGCTGAG GGCGAGTTGG ACATCAACGT CTTNCAACAC301 ANACNCTATC TTGACGACTN CAAAAAANAA CACAATCTGG ACATCACCNN351 AGTCTTNCAA GTGCCGACCG CGCCTTTGGG ACTGTACCCG GGCAAGCTGA401 AATCGCTGGA NNAAGTCAAA GANGGCAGCA CCGTATCCGC GCCCAACGAC451 CCGTNNNACT TCGNCCGCGT CTTGGTGATG CTCGACGAAC TGGGTTNGAT501 CAAACTCAAA GACNGCATCA NNNNGNNGNN NNNANCNANA NNNGANANNN551 NNNNANNNNT NNNNNNNNNN NNNNNCNNCG NNNNNNNANN NNNNNNNNNN601 NCGNNTNNNN NNGCNNNNNT NNANNNTNNN NNCNNCNNNN NNNNNTNNNN651 NANNANNAGC GGCATGAAGC TGACCGAAGC CCTGTTCCAA GAACCGAGCT701 TTGCCTATGT CAACTGGTCT GCCGTCAAAA CCGCCGACAA AGACAGCCAA751 TGGCTTAAAG ACGTAACCGA GGCCTATAAC TCCGACGCGT TCAAAGCCTA801 CGCGCACAAA CGCTTCGAGG GCTACAAATC CCCTGCCGCA TGGAATGAAG851 GCGCAGCCAA ATAA预计编码的蛋白质具有氨基酸序列<SEQ ID 162>:
1 MKTFFKTLSA AALALILAAC GGQKDSAPAA SASAAADNGA AXKEIVFGTT 51 VGDFGDMVKE XIQPELEKKG YTVKLVEXTD YVRXNLALAE GELDINVXQH101 XXYLDDXKKX HNLDITXVXQ VPTAPLGLYP GKLKSLXXVK XGSTVSAPND151 PXXFXRVLVM LDELGXIKLK DXIXXXXXXX XXXXXXXXXX XXXXXXXXXX201 XXXXAXXXXX XXXXXXXXXS GMKLTEALFQ EPSFAYVNWS AVKTADKDSQ251 WLKDVTEAYN SDAFKAYAHK RFEGYKSPAA WNEGAAK*前导肽用下划线表示。对这些菌株A序列作进一步的分析,揭示了完整的DNA序列<SEQ ID 163>:1 ATGAAAACCT TCTTCAAAAC CCTTTCCGCC GCCGCACTCG CGCTCATCCT51 CGCCGCCTGC GGCGGTCAAA AAGATAGCGC GCCCGCCGCA TCCGCTTCTG101 CCGCCGCCGA CAACGGCGCG GCGAAAAAAG AAATCGTCTT CGGCACGACC151 GTCGGCGACT TCGGCGATAT GGTCAAAGAA CAAATCCAAC CCGACCTGGA201 GAAAAAAGGC TACACCGTCA AACTGGTCGA GTTTACCGAC TATGTGCGCC251 CGAATCTGGC ATTGGCTGAG GGCGAGTTGG ACATCAACGT CTTCCAACAC301 AAACCCTATC TTGACGACTT CAAAAAAGAA CACAATCTGG ACATCACCGA351 AGTCTTCCAA GTGCCGACCG CGCCTTTGGG ACTGTACCCG GGCAAGCTGA401 AATCGCTGGA AGAAGTCAAA GACGGCAGCA CCGTATCCGC GCCCAACGAC451 CCGTCCAACT TCGCCCGCGT CTTGGTGATG CTCGACGAAC TGGGTTGGAT501 CAAACTCAAA GACGGCATCA ATCCGCTGAC CGCATCCAAA GCGGACATTG551 CCGAAAACCT GAAAAACATC AAAATCGTCG AGCTTGAAGC CGCGCAACTG601 CCGCGTAGCC GCGCCGACGT GGATTTTGCC GTCGTCAACG GCAACTACGC651 CATAAGCAGC GGCATGAAGC TGACCGAAGC CCTGTTCCAA GAACCGAGCT701 TTGCCTATGT CAACTGGTCT GCCGTCAAAA CCGCCGACAA AGACAGCCAA751 TGGCTTAAAG ACGTAACCGA GGCCTATAAC TCCGACGCGT TCAAAGCCTA801 CGCGCACAAA CGCTTCGAGG GCTACAAATC CCCTGCCGCA TGGAATGAAG851 GCGCAGCCAA ATAA它编码的蛋白质具有氨基酸序列<SEQ ID 164;ORF4a-1>:1 MKTFFKTLSA AALALILAAC GGQKDSAPAA SASAAADNGA AKKEIVFGTT51 VGDFGDMVKE QIQPELEKKG YTVKLVEFTD YVRPNLALAE GELDINYFQH101 KPYLDDFKKE HNLDITEVFQ VPTAPLGLYP GKLKSLEEVK DGSTVSAPND151 PSNFARVLVM LDELGWIKLK DGINPLTASK ADIAENLKNI KIVELEAAQL201 PRSRADVDFA VVNGNYAISS GMKLTEALFQ EPSFAYVNWS AVKTADKDSQ251 WLKDVTEAYN SDAFKAYAHK RFEGYKSPAA WNEGAAK*ORF4a-1和ORF4-1在287个氨基酸的重叠区内显示出有99.7%的相同性:
10 20 30 40 50 60orf4a-1 MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAAKKEIVFGTTVGDFGDMVKE
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf4-1 MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAAKKEIVFGTTVGDFGDMVKE
10 20 30 40 50 60
70 80 90 100 110 120orf4a-1 QIQPELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf4-1 QIQAELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
70 80 90 100 110 120
130 140 150 160 170 180orf4a-1 VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf4-1 VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
130 140 150 160 170 180
190 200 210 220 230 240orf4a-1 ADIAENLKNIKIVELEAAQLPRSRADVDFAVVNGNYAISSGMKLTEALFQEPSFAYVNWS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf4-1 ADIAENLKNIKIVELEAAQLPRSRADVDFAVVNGNYAISSGMKLTEALFQEPSFAYVNWS
190 200 210 220 230 240
250 260 270 280
orf4a-1 AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAKX
||||||||||||||||||||||||||||||||||||||||||||||||
orf4-1 AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAKX
250 260 270 280
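Homology with an outer membrane protein of Pasteurella haemolytica (accession number q08869)
ORF4 and this outer membrane protein show 33% aa identity in a 91 aa overlap: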
10 20
lip2.pasha MNFKKLLGVALVSALALTACKDEKAQAP----
|| | ::|| || |:|| :|: |
ORF4 VXTPNPDGRTPCPSFLFETATTSGENMKTFFKTLSAAAL--ALILAACGFKKTARPPHPL
110 120 130 140 150
30 40 50 60 70 80
lip2.pasha -ATTAKTENKAPLKVGVMTGPEAQMTEVAVKIAKEKYGLDVELVQFTEYTQPNAALHSKD
: :: | : |: :| ::|:: :: || | |:||:||:|::|| || :
ORF4 LPPPTTARRKKEIVFGTTVGDFGDMVKEQIQAELEKKGYTVKLVEFTDYVRPNLALAEGE
160 170 180 190 200 210
90 100 110 120 130 140
lip2.pasha LDANAFQTVPYLEQEVKDRGYKLAIIGNTLVWPIAAYSKKIKNISELKDGATVAIPNNAS
|
ORF4 L.....
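Local alignments of the kind shown above are typically scored with a Smith-Waterman-type dynamic programme. The following minimal Python sketch computes only the best local alignment score. It assumes a flat match/mismatch score and a linear gap penalty in place of an amino acid substitution matrix, so its scores are not comparable with those reported by database search programs; the function name and the parameter values are illustrative assumptions.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of strings a and b.

    Uses the Smith-Waterman recurrence with a linear gap penalty and
    keeps only two rows of the score matrix in memory.
    """
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            current[j] = max(
                0,                                # start a new local alignment
                previous[j - 1] + substitution,   # align a[i-1] with b[j-1]
                previous[j] + gap,                # gap in b
                current[j - 1] + gap,             # gap in a
            )
            best = max(best, current[j])
        previous = current
    return best


if __name__ == "__main__":
    # Short stretch from the ORF4 / lip2.pasha alignment above.
    print(smith_waterman_score("LVEFTDY", "LVQFTEY"))
```

Recovering the aligned residues themselves, rather than just the score, would additionally require keeping the full matrix and tracing back from the highest-scoring cell.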
Homology with a predicted ORF from N. gonorrhoeae
ORF4 shows 93.6% identity over a 94 aa overlap with a predicted ORF (ORF4.ng) from N. gonorrhoeae:
10 20 30
orf4nm.pep MKTFFKTLSAAALALILAACGXQKDSAPAA
|||||||||:|:||||||||| ||||||||
orf4ng RANAVXTPNPDGRTPCLSFLFETATTSGENMKTFFKTLSTASLALILAACGGQKDSAPAA
200 210 220 230 240 250
40 50 60 70 80 89
orf4nm.pep SASA-AADNGAAKKEIVFGTTVGDFGDMVKEQIQAELEKKGYTVKLVEFTDYVRPNLALA
||:| :||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf4ng SAAAPSADNGAAKKEIVFGTTVGDFGDMVKEQIQAELEKKGYTVKLVEFTDYVRPNLALA
260 270 280 290 300 310
90
orf4nm.pep EGEL
||||
orf4ng EGELDINVFQHKPYLDDFKKEHNLDITEAFQVPTAPLGLYPGKLKSLEEVKDGSTVSAPN
320 330 340 350 360 370
预计全长ORF4ng核苷酸序列<SEQ ID 165>编码的蛋白质具有氨基酸序列<SEQID 166>:1 MKTFFKTLST ASLALILAAC GGQKDSAPAA SAAAPSADNG AAKKEIYFGT51 TVGDFGDMVK EQIQAELEKK GYTYKLVEFT DYVRPNLALA EGELDINVFQ101 HKPYLDDFKK EHNLDITEAF QVPTAPLGLY PGKLKSLEEV KDGSTVSAPN151 DPSNFARALV MLNELGWIKL KDGINPLTAS KADIAENLKN IKIVELEAAQ201 LPRSRADVDF AVVNGNYAIS SGMKLTEALF QEPSFAYVNW SAVKTADKDS251 QWLKDVTEAY NSDAFKAYAH KRFEGYKYPA AWNEGAAK*进一步的分析揭示了全长ORF4ngDNA序列<SEQ ID 167>是:1 atgAAAACCT TCTTCAAAAC cctttccgcc gccgcaCTCG CGCTCATCCT 51 CGCAGCCTGc ggCggtcaAA AAGACAGCGC GCCCgcagcc tctgcCGCCG101 CCCCTTCTGC CGATAACGgc gCgGCGAAAA AAGAAAtcgt ctTCGGCACG151 Accgtgggcg acttcggcgA TAtggTCAAA GAACAAATCC AagcCGAgct201 gGAGAAAAAA GgctACACcg tcAAattggt cgaatttacc gactatgtGC251 gCCCGAATCT GGCATTGGCG GAGGGCGAGT TGGACATCAA CGTCTTCCAA301 CACAAACCCT ATCTTGACGA TTTCAAAAAA GAACACAACC TGGACATCAC351 CGAAGCCTTC CAAGTGCCGA CCGCGCCTTT GGGACTGTAT CCGGGCAAAC401 TGAAATCGCT GGAAGAAGTC AAAGACGGCA GCACCGTATC CGCGCCCAac451 gACccgTCCA ACTTCGCACG CGCCTTGGTG ATGCTGAACG AACTGGGTTG501 GATCAAACTC AAAGACGGCA TCAATCCGCT GACCGCATCC AAAGCCGACA551 TCGCGGAAAA CCTGAAAAAC ATCAAAATCG TCGAGCTTGA AGCCGCACAA601 CTGCCGCGCA GCCGCGCCGA CGTGGATTTT GCCGTCGTCA ACGGCAACTA651 CGCCATAAGC AGCGGCATGA AGCTGACCGA AGCCCTGTTC CAAGAGCCGA701 GCTTTGCCTA TGTCAACTGG TCTGCCgtcA AAACCGCCGA CAAAGACAGC751 CAATGGCTTA AAGACGTAAC CGAGGCCTAT AACTCCGACG CGTTCAAAGC801 CTACGCGCAC AAACGCTTCG AGGGCTACAA ATACCCTGCC GCATGGAATG851 AAGGCGCAGC CAAATAA它编码的蛋白质具有氨基酸序列<SEQ ID 168;ORF4ng-1>:1 MKTFFKTLSA AALALILAAC GGQKDSAPAA SAAAPSADNG AAKKEIVFGT51 TVGDFGDMVK EQIQAELEKK GYTVKLVEFT DYVRPNLALA EGELDINVFQ101 HKPYLDDFKK EHNLDITEAF QVPTAPLGLY PGKLKSLEEV KDGSTVSAPN151 DPSNFARALV MLNELGWIKL KDGINPLTAS KADIAENLKN IKIVELEAAQ201 LPRSRADVDF AVVNGNYAIS SGMKLTEALF QEPSFAYVNW SAVKTADKDS251 QWLKDVTEAY NSDAFKAYAH KRFEGYKYPA AWNEGAAK*它与ORF4-1在重叠的288个氨基酸内显示出有97.6%的相同性:
10 20 30 40 50 59orf4-1.pep MKTFFKTLSAAALALILAACGGQKDSAPAASASA-AADNGAAKKEIVFGTTVGDFGDMVK
||||||||||||||||||||||||||||||||:| :||||||||||||||||||||||||orf4ng-1 MKTFFKTLSAAALALILAACGGQKDSAPAASAAAPSADNGAAKKEIVFGTTVGDFGDMVK
10 20 30 40 50 60
60 70 80 90 100 110 119orf4-1.pep EQIQAELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||:|orf4ng-1 EQIQAELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEAF
70 80 90 100 110 120
120 130 140 150 160 170 179orf4-1.pep QVPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTAS
|||||||||||||||||||||||||||||||||||||:||||:|||||||||||||||||orf4ng-1 QVPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARALVMLNELGWIKLKDGINPLTAS
130 140 150 160 170 180
180 190 200 210 220 230 239orf4-1.pep KADIAENLKNIKIVELEAAQLPRSRADVDFAVVNGNYAISSGMKLTEALFQEPSFAYVNW
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf4ng-1 KADIAENLKNIKIVELEAAQLPRSRADVDFAVVNGNYAISSGMKLTEALFQEPSFAYVNW
190 200 210 220 230 240
240 250 260 270 280orf4-1.pep SAVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAKX
||||||||||||||||||||||||||||||||||||| |||||||||||orf4ng-1 SAVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKYPAAWNEGAAKX
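250 260 270 280
In addition, ORF4ng-1 shows significant homology with an outer membrane protein from the database:
ID LIP2_PASHA STANDARD; PRT; 276 AA.
AC Q08869;
DT 01-NOV-1995 (REL. 32, CREATED)
DT 01-NOV-1995 (REL. 32, LAST SEQUENCE UPDATE)
DT 01-NOV-1995 (REL. 32, LAST ANNOTATION UPDATE)
DE 28.2 KD OUTER MEMBRANE PROTEIN PRECURSOR. . . .
SCORES Init1: 279 Initn: 416 Opt: 494
Smith-Waterman score: 494; 36.0% identity in 275 aa overlap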
10 20 30 40 50orf4ng-1.pep MKTFFKTLSAAAL--ALILAACGGQKDSAPAASAAAPSADNGAAKKEIVFGTTVGDFGDM
|| | ::|| || |:|| :| :|||::| :::| | | |: :| ::|lip2_pasha MNFKKLLGVALVSALALTACKDEKAQAPATTA---KTENKAPLK---VGVMTGPEAQM
10 20 30 40 50
60 70 80 90 100 110orf4ng-1.pep VKEQIQAELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITE
:: :: || | |:||:||:|::|| || :|| |:|| |||:: |::: ::lip2_pasha TEVAVKIAKEKYGLDVELVQFTEYTQPNAALHSKDLDANAFQTVPYLEQEVKDRGYKLAI
60 70 80 90 100 110
120 130 140 150 160 170orf4ng-1.pep AFQVPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARALVMLNELGWIKLKDGINPLT
:: : |:: | |:|:: |:|||:||: ||: || ||||::|: | :|||| | :lip2_pasha IGNTLVWPIAAYSKKIKNISELKDGATVAIPNNASNTARALLLLQAHGLLKLKDPKN-VF
120 130 140 150 160 170
180 190 200 210 220 230orf4ng-1.pep ASKADIAENLKNIKIVELEAAQLPRSRADVDFAVVNGNYAISSGMKLTE--ALFQEPSFA
|:: || || ||||||: ::: | | ||::||:|::|| ::|:: : : : :lip2_pasha ATENDIIENPKNIKIVQADTSLLTRMLDDVELAVINNTYAGQAGLSPDKDGIIVESKDSP
180 190 200 210 220 230
240 250 260 270 280 289orf4ng-1.pep YVNWSAVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKYPAAWNEGAAKX
||| : : :||: |: ::::::: | | |:|lip2_pasha YVNLVVSREDNKDDPRLQTFVKSFQTEEVFQEALKLFNGGVVKGW
240 250 260 270
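Based on this analysis, including the homology with the outer membrane protein of Pasteurella haemolytica and the presence of a putative prokaryotic membrane lipoprotein lipid attachment site in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
As described above, ORF4-1 (30 kDa) was cloned in the pET and pGex vectors and expressed in E. coli. The products of protein expression and purification were analyzed by SDS-PAGE. Figures 8A and 8B show the results of affinity purification of the His-fusion protein and of the GST-fusion protein, respectively. Purified His-fusion protein was used to immunise mice, whose sera were used for ELISA (positive result), Western blot (Figure 8C), FACS analysis (Figure 8D), and a bactericidal assay (Figure 8E). These results confirm that ORF4-1 is a surface-exposed protein and that it is a useful immunogen.
Figure 8F shows plots of hydrophilicity, antigenic index and AMPHI regions for ORF4-1.
Example 27
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 169>: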
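The putative lipoprotein lipid attachment site referred to in this conclusion can be illustrated with a simple pattern search. The following is a minimal sketch, assuming the PROSITE-style consensus {DERK}(6)-[LIVMFWSTAG](2)-[LIVMFYSTAGCQ]-[AGS]-C; the pattern as written, the function name `find_lipid_attachment_cys` and the choice of the first 30 residues of ORF4ng-1 are illustrative assumptions, not the analysis actually performed for this document.

```python
import re

# Assumed consensus for a prokaryotic membrane lipoprotein lipid attachment
# site (PROSITE-style pattern, reproduced here as an assumption):
# six residues that are not D/E/R/K, two hydrophobic/small residues,
# one residue from [LIVMFYSTAGCQ], one of A/G/S, then the lipidated Cys.
LIPOBOX = re.compile(r"[^DERK]{6}[LIVMFWSTAG]{2}[LIVMFYSTAGCQ][AGS]C")


def find_lipid_attachment_cys(protein):
    """Return the 1-based position of the candidate lipidated Cys, or None."""
    match = LIPOBOX.search(protein.upper())
    # match.end() is the exclusive 0-based end index, which equals the
    # 1-based position of the final Cys in the match.
    return match.end() if match else None


# Residues 1-30 of ORF4ng-1 <SEQ ID 168>
orf4ng_1_nterm = "MKTFFKTLSAAALALILAACGGQKDSAPAA"
print(find_lipid_attachment_cys(orf4ng_1_nterm))
```

A full check would also apply positional rules (for example, that the Cys lies near the N-terminus); those are omitted here to keep the sketch short.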
1 CCTCGTCGTC CTCGGCATGC TCCAGTTTCA AGGGGCGATT TACTCCAAGG51 CGGTGGAACG TATGCTCGGC ACGGTCATCG GGCTGGGCGC GGGTTTGGGC101 GTTTTATGGC TGAACCAGCA TTATTTCCAC GGCAACCTCC TCTTCTACCT151 CACCGTCGGC ACGGCAAGCG CACTGGCCGG CTGGGCGGCG GTCGGCAAAA201 ACGGCTACGT CCCTmTGCTG GCAGGGCTGA CGATGTGTAT GCTCATCGGC251 GACAACGGCA GCGAATGGCT CGACAGCGGA CTCATGCGCG CCATGAACGT301 CCTCATCGGC GyGGCCATCG CCATCGCCGC CGCCAAACTG CTGCCGCTGA351 AATCCACACT GATGTGGCGT TTCATGCTTG CCGACAACCT GGCCGACTGC401 AGCAAAATGA TTGCCGAAAT CAGCAACGGC AGGCGCATGA CCCGCGAACG451 CCTCGAGGAG AACATGGCGA AAATGCGCCA AATCAACGCA CGCATGGTCA501 AAAGCCGCAG CCATCTCGCC GCCACATCGG GCGAAAGCTG CATCAGCCCC551 GCCATGATGG AAGCCATGCA GCACGCCCAC CGTAAAATCG TCAACACCAC601 CGAGCTGCTC CTGACCACCG CCGCCAAGCT GCAATCTCCC AAACTCAACG651 GCAGCGAAAT CCGGCTGCTT GACCGCCACT TCACACTGCT CCAAAC....701 .......... .......... ........GC AGACACGCCC GCCGCATCCG751 CATCGACACC GCCATCAACC CCGAACTGGA AGCCCTCGCC GAACACCTCC801 ACTACCAATG GCAGGGCTTC CTCTGGCTCA GCACCGATAT GCGTCAGGAA851 ATTTCCGCCC TCGTCATCCT GCTGCAACGC ACCCGCCGCA AATGGCTGGA901 TGCCCACGAA CGCCAACACC TGCGCCAAAG CCTGCTTGA它对应于氨基酸序列<SEQ ID 170;ORF8>:1 ......PRRP RHAPVSRGDL LQGGGTYARH GHRAGRGFGR FMAEPALFPR51 QPPLLPHRRH GKRTGRLGGG RQKRLRPXAG RADDVYAHRR QRQRMARQRT101 HARHERPHRR GHRHRRRQTA AAEIHTDVAF HACRQPGRLQ QNDCRNQQRQ151 AHDPRTPRGE HGENAPNQRT HGQKPQPSER HIGRKLHQPR HDGSHAARPP201 XNRQHHRAAP DHRRQAAISQ TQRQRNPAAX PPLHTAPN.. .........Q251 TRPPHPHRHR HQPRTGSPRR TPPLPMAGLP LAQHRYASGN FRPRHPAATH301 PPQMAGCPRT PTPAPKPA*
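Computer analysis of this amino acid sequence gave the following results:
Sequence motifs
ORF8 is proline-rich, and the distribution of its proline residues is consistent with a surface localization. Furthermore, the presence of an RGD motif may indicate a possible role in bacterial adhesion events.
Homology with a predicted ORF from N. gonorrhoeae
ORF8 shows 86.5% identity over a 312 aa overlap with a predicted ORF (ORF8.ng) from N. gonorrhoeae: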
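The two observations above (proline richness and an RGD motif) are easy to check on a sequence. The sketch below is illustrative only: the helper names `proline_fraction` and `motif_positions` are hypothetical, and the input is just the first 44 determined residues of ORF8 as they appear in the alignment with ORF8.ng, not the full <SEQ ID 170>.

```python
def proline_fraction(protein):
    """Fraction of residues that are proline (non-letter characters are ignored)."""
    residues = [aa for aa in protein.upper() if aa.isalpha()]
    return residues.count("P") / len(residues)


def motif_positions(protein, motif="RGD"):
    """Return the 1-based start positions of every occurrence of `motif`."""
    protein = protein.upper()
    positions = []
    start = protein.find(motif)
    while start != -1:
        positions.append(start + 1)
        start = protein.find(motif, start + 1)
    return positions


# First 44 determined residues of ORF8 (orf8.pep residues 1-44 in the alignment)
orf8_fragment = "PRRPRHAPVSRGDLLQGGGTYARHGHRAGRGFGRFMAEPALFPR"

print(f"proline fraction: {proline_fraction(orf8_fragment):.3f}")
print(f"RGD start positions: {motif_positions(orf8_fragment)}")
```

Positions are reported relative to the fragment; the undetermined residues at the start of the partial sequence are not counted.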
orf8ng 1 MDRDDRLRRPRHAPVPRRDLLQRGGTYARYGHRAGRGFGRFMAEPALFPR 50
|||||||| | |||| ||||||:||||||||||||||||||||
orf8.pep 1 ......PRRPRHAPVSRGDLLQGGGTYARHGHRAGRGFGRFMAEPALFPR 44
orf8ng 51 QPPLLPDHRHGKRTGRLGGGRQKRLRPYVGGADDVHAHRRQRQRMARQRP 100
|||||| ||||||||||||||||||| | ||||:|||||||||||||
orf8.pep 45 QPPLLPHRRHGKRTGRLGGGRQKRLRPXAGRADDVYAHRRQRQRMARQRT 94
orf8ng 101 DARDERPHRRRHRHCRRQTAAAEIHTDVAFHACRQPGRLQQNDCRNQQRQ 150
|| |||||| ||| ||||||||||||||||||||||| |||||||||||
orf8.pep 95 HARHERPHRRGHRHRRRQTAAAEIHTDVAFHACRQPGRMQQNDCRNQQRQ 144
orf8ng 151 AYDARTFGAEYGQNAPNQRTHGQKPQPPRRHIGRKPHQPLHDGSHAARPP 200
|:| || |:|:|||||||||||||| ||||||| ||| ||||||||||
orf8.pep 145 AHDPRTPRGEHGENAPNQRTHGQKPQPSRRHIGRKLHQPRHDGSHAARPP 194
orf8ng 201 QNRQHHRAAPDHRRQAAISQTQRQRNPAARPPLHTAPNRPATNRRPHQRQ 250
|||||||||||||||||||||||||||| |||||||| |
orf8.pep 195 XNRQHHRAAPDHRRQAAISQTQRQRNPAAXPPLHTAPN...........Q 244
orf8ng 251 TRPPHPHRHRHQPRTGSPRRTPPLPMAGFPLAQHQYASGNFRPRHPPATH 300
|||||||||||||||||||||||||||| |||||.||||||||||| |||
orf8.pep 245 TRPPHPHRHRHQPRTGSPRRTPPLPMAGLPLAQHRYASGNFRPRHPAATH 294
orf8ng 301 PPQMAGCPRTPTPAPKPA* 319
|||||||||||||||||||
orf8.pep 295 PPQMAGCPRTPTPAPKPA* 313
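The identity figures quoted for alignments like the one above can be recomputed from the aligned rows. This is a minimal sketch, assuming identity is counted over columns in which neither row has a gap; alignment programs differ in how they choose the denominator, so the function `percent_identity` is an illustration and not necessarily the calculation used to obtain the values reported here. The two strings are the rows for positions 251-300 (orf8ng) and 245-294 (orf8.pep).

```python
def percent_identity(aligned_a, aligned_b, gap_chars="-."):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned rows must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a not in gap_chars and b not in gap_chars
    ]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


orf8ng_row = "TRPPHPHRHRHQPRTGSPRRTPPLPMAGFPLAQHQYASGNFRPRHPPATH"
orf8_row = "TRPPHPHRHRHQPRTGSPRRTPPLPMAGLPLAQHRYASGNFRPRHPAATH"

print(f"{percent_identity(orf8ng_row, orf8_row):.1f}% identity over this row")
```

Applying the same function to the concatenated rows of a whole alignment gives an overall figure that can be compared with the reported percentage.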
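The complete length ORF8ng nucleotide sequence <SEQ ID 171> is predicted to encode a protein having amino acid sequence <SEQ ID 172>:
1 MDRDDRLRRP RHAPVPRRDL LQRGGTYARY GHRAGRGFGR FMAEPALFPR
51 QPPLLPDHRH GKRTGRLGGG RQKRLRPYVG GADDVHAHRR QRQRMARQRP
101 DARDERPHRR RHRHCRRQTA AAEIHTDVAF HACRQPGRLQ QNDCRNQQRQ
151 AYDARTFGAE YGQNAPNQRT HGQKPQPPRR HIGRKPHQPL HDGSHAARPP
201 QNRQHHRAAP DHRRQAAISQ TQRQRNPAAR PPLHTAPNRP ATNRRPHQRQ
251 TRPPHPHRHR HQPRTGSPRR TPPLPMAGFP LAQHQYASGN FRPRHPPATH
301 PPQMAGCPRT PTPAPKPA*
Based on the sequence motifs in these proteins, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 28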
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 173>:1 ..GAAATCAGCC TGCGGTCCGA CNACAGGCCG GTTTCCGTGN CGAAGCGGCG51 GGATTCGGAA CGTTTTCTGC TGTTGGACGG CGGCAACAGC CGGCTCAAGT101 GGGCGTGGGT GGAAAACGGC ACGTTCGCAA CCGTCGGTAG CGCGCCGTAC151 CGCGATTTGT CGCCTTTGGG CGCGGAGTGG GCGGAAAAGG CGGATGGAAA201 TGTCCGCATC GTCGGTTGCG CTGTGTGCGG AGAATTCAAA AAGGCACAAG251 TGCAGGAACA GCTCGCCCGA AAAATCGAGT GGCTGCCGTC TTCCGCACAG301 GCTTT.GGCA TACGCAACCA CTACCGCCAC CCCGAAGAAC ACGGTTCCGA351 CCGCTGGTTC AACGCCTTGG GCAGCCGCCG CTTCAGCCGC AACGCCTGCG401 TCGTCGTCAG TTGCGGCACG GCGGTAACGG TTGACGCGCT CACCGATGAC451 GGACATTATC TCGGAGA.GG AACCATCATG CCCGGTTTCC ACCTGATGAA501 AGAATCGCTC GCCGTCCGAA CCGCCAACCT CAACCGGCAC GCCGGTAAGC551 GTTATCCTTT CCCGACCGG..它对应于氨基酸序列<SEQ ID 174;ORF61>:1 ..EISLRSDXRP VSYXKRRDSE RFLLLDGGNS RLKWAWVENG TFATVGSAPY51 RDLSPLGAEW AEKADGNVRI VGCAVCGEFK KAQVQEQLAR KIEWLPSSAQ101 AXGIRNHYRH PEEHGSDRWF NALGSRRFSR NACVVVSCGT AVTVDALTDD151 GHYLGXGTIM PGFHLMKESL AVRTANLNRH AGKRYPFPT..进一步的工作揭示了完整的核苷酸序列<SEQ ID 175>:1 ATGACGGTTT TGAAGCTTTC GCACTGGCGG GTGTTGGCGG AGCTTGCCGA51 CGGTTTGCCG CAACACGTCT CGCAACTGGC GCGTATGGCG GATATGAAGC101 CGCAGCAGCT CAACGGTTTT TGGCAGCAGA TGCCGGCGCA CATACGCGGG151 CTGTTGCGCC AACACGACGG CTATTGGCGG CTGGTGCGCC CATTGGCGGT201 TTTCGATGCC GAAGGTTTGC GCGAGCTGGG GGAAAGGTCG GGTTTTCAGA251 CGGCATTGAA GCACGAGTGC GCGTCCAGCA ACGACGAGAT ACTGGAATTG301 GCGCGGATTG CGCCGGACAA GGCGCACAAA ACCATATGCG TGACCCACCT351 GCAAAGTAAG GGCAGGGGGC GGCAGGGGCG GAAGTGGTCG CACCGTTTGG401 GCGAGTGTCT GATGTTCAGT TTTGGCTGGG TGTTTGACCG GCCGCAGTAT451 GAGTTGGGTT CGCTGTCGCC TGTTGCGGCA GTGGCGTGTC GGCGCGCCTT501 GTCGCGTTTA GGTTTGGATG TGCAGATTAA GTGGCCCAAT GATTTGGTTG551 TCGGACGCGA CAAATTGGGC GGCATTCTGA TTGAAACGGT CAGGACGGGC601 GGCAAAACGG TTGCCGTGGT CGGTATCGGC ATCAATTTTG TCCTGCCCAA651 GGAAGTAGAA AATGCCGCTT CCGTGCAATC GCTGTTTCAG ACGGCATCGC701 GGCGGGGCAA TGCCGATGCC GCCGTGCTGC TGGAAACGCT GTTGGTGGAA751 CTGGACGCGG TGTTGTTGCA ATATGCGCGG GACGGATTTG CGCCTTTTGT801 GGCGGAATAT CAGGCTGCCA ACCGCGACCA CGGCAAGGCG GTATTGCTGT 851 TGCGCGACGG CGAAACCGTG TTCGAAGGCA 
CGGTTAAAGG CGTGGACGGA901 CAAGGCGTTT TGCACTTGGA AACGGCAGAG GGCAAACAGA CGGTCGTCAG951 CGGCGAAATC AGCCTGCGGT CCGACGACAG GCCGGTTTCC GTGCCGAAGC1001 GGCGGGATTC GGAACGTTTT CTGCTGTTGG ACGGCGGCAA CAGCCGGCTC1051 AAGTGGGCGT GGGTGGAAAA CGGCACGTTC GCAACCGTCG GTAGCGCGCC1101 GTACCGCGAT TTGTCGCCTT TGGGCGCGGA GTGGGCGGAA AAGGCGGATG1151 GAAATGTCCG CATCGTCGGT TGCGCTGTGT GCGGAGAATT CAAAAAGGCA1201 CAAGTGCAGG AACAGCTCGC CCGAAAAATC GAGTGGCTGC CGTCTTCCGC1251 ACAGGCTTTG GGCATACGCA ACCACTACCG CCACCCCGAA GAACACGGTT1301 CCGACCGCTG GTTCAACGCC TTGGGCAGCC GCCGCTTCAG CCGCAACGCC1351 TGCGTCGTCG TCAGTTGCGG CACGGCGGTA ACGGTTGACG CGCTCACCGA1401 TGACGGACAT TATCTCGGGG GAACCATCAT GCCCGGTTTC CACCTGATGA1451 AAGAATCGCT CGCCGTCCGA ACCGCCAACC TCAACCGGCA CGCCGGTAAG1501 CGTTATCCTT TCCCGACCAC AACGGGCAAT GCCGTCGCCA GCGGCATGAT1551 GGATGCGGTT TGCGGCTCGG TTATGATGAT GCACGGGCGT TTGAAAGAAA1601 AAACCGGGGC GGGCAAGCCT GTCGATGTCA TCATTACCGG CGGCGGCGCG1651 GCAAAAGTTG CCGAAGCCCT GCCGCCTGCA TTTTTGGCGG AAAATACCGT1701 GCGCGTGGCG GACAACCTCG TCATTTACGG GTTGTTGAAC ATGATTGCCG1751 CCGAAGGCAG GGAATATGAA CATATTTAA它对应于氨基酸序列<SEQ ID 176;ORF61-1>:1 MTVLKLSHWR VLAELADGLP QHVSQLARMA DMKPQQLNGF WQQMPAHIRG51 LLRQHDGYWR LVRPLAVFDA EGLRELGERS GFQTALKHEC ASSNDEILEL101 ARIAPDKAHK TICVTHLQSK GRGRQGRKWS HRLGECLMFS FGWVFDRPQY151 ELGSLSPVAA VACRRALSRL GLDVQIKWPN DLVVGRDKLG GILIETVRTG201 GKTVAVVGIG INFVLPKEVE NAASVQSLFQ TASRRGNADA AVLLETLLVE251 LDAVLLQYAR DGFAPFVAEY QAANRDHGKA VLLLRDGETV FEGTVKGVDG301 QGVLHLETAE GKQTVVSGEI SLRSDDRPVS VPKRRDSERF LLLDGGNSRL351 KWAWVENGTF ATVGSAPYRD LSPLGAEWAE KADGNVRIVG CAVCGEFKKA401 QVQEQLARKI EWLPSSAQAL GIRNHYRHPE EHGSDRWFNA LGSRRFSRNA451 CVVVSCGTAV TVDALTDDGH YLGGTIMPGF HLMKESLAVR TANLNRHAGK501 RYPFPTTTGN AVASGMMDAV CGSVMMMHGR LKEKTGAGKP VDVIITGGGA551 AKVAEALPPA FLAENTVRVA DNLVIYGLLN MIAAEGREYE HI*
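The correspondence between a nucleotide sequence and its encoded amino acid sequence, as listed above, can be reproduced with a short translation routine. This is a minimal sketch using the standard genetic code; the function name `translate` and the convention of writing X for any codon containing an ambiguous or undetermined base are assumptions made for illustration. The example input is the first 30 nucleotides of <SEQ ID 175>.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string in frame 1; unknown codons become 'X', stops '*'."""
    # Keep letters only, so position numbers and spaces from a listing are dropped.
    dna = "".join(base for base in dna.upper() if base.isalpha())
    protein = []
    for i in range(0, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


# First 30 nucleotides of the complete ORF61-1 sequence <SEQ ID 175>
print(translate("ATGACGGTTT TGAAGCTTTC GCACTGGCGG"))
```

Translating a complete listed sequence in the same way and comparing the result with the listed protein is a quick way to spot transcription errors in either sequence.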
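Figure 9 shows plots of hydrophilicity, antigenic index and AMPHI regions for ORF61-1. Further computer analysis of this amino acid sequence gave the following results:
Homology with the baf protein of B. parapertussis (accession number U12020)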
ORF61 and the baf protein show 33% aa identity over a 166 aa overlap:
orf61 23 LLLDGGNSRLKWAWVE-NGTFATVGSAPYR----DLSPLGAEWAEKADGNVRIVGCAVCG 77
+L+D GNSRLK W + + A AP DL LG A R +G V G
baf 3 ILIDSGNSRLKVGWFDPDAPQAAREPAPVAFDNLDLDALGRWLATLPRRPQRALGVNVAG 62
orf61 78 EFKKAQVQEQLAR---KIEWLPSSAQAXGIRNHYRHPEEHGSDRW---FNALGSRRFSRN 131
+ + L I WL + A G+RN YR+P++ G+DRW L +
baf 63 LARGEAIAATLRAGGCDIRWLRAQPLAMGLRNGYRNPDQLGADRWACMVGVLARQPSVHP 122
orf61 132 ACVVVSCGTAVTVDALTDDGHYLGXGTIMPGFHLMKESLAVRTANL 177
+V S GTA T+D + D + G G I+PG +M+ +LA TA+L
baf 123 PLLVASFGTATTLDTIGPDNVFPG-GLILPGPAMMRGALAYGTAHL 167
Homology with a predicted ORF from N. meningitidis (strain A)
ORF61 shows 97.4% identity over a 189 aa overlap with an ORF (ORF61a) from strain A of N. meningitidis:
10 20 30
orf61.pep EISLRSDXRPVSVXKRRDSERFLLLDGGNS
||||||| ||||| ||||||||||||||||orf61a TVFEGTVKGVDGQGVLHLETAEGKQTVVSGEISLRSDDRPVSVPKRRDSERFLLLDGGNS
290 300 310 320 330 340
40 50 60 70 80 90orf61.pep RLKWAWVENGTFATVGSAPYRDLSPLGAEWAEKADGNVRIVGCAVCGEFKKAQVQEQLAR
|||||||||||||||||||||||||||||||||:||||||||||||||||||||||||||orf61a RLKWAWVENGTFATVGSAPYRDLSPLGAEWAEKVDGNVRIVGCAVCGEFKKAQVQEQLAR
350 360 370 380 390 400
100 110 120 130 140 150orf61.pep KIEWLPSSAQAXGIRNHYRHPEEHGSDRWFNALGSRRFSRNACVVVSCGTAVTVDALTDD
||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||orf61a KIEWLPSSAQALGIRNHYRHPEEHGSDRWFNALGSRRFSRNACVVVSCGTAVTVDALTDD
410 420 430 440 450 460
160 170 180 189orf61.pep GHYLGXGTIMPGFHLMKESLAVRTANLNRHAGKRYPFPT
||||| |||||||||||||||||||||||||||||||||orf61a GHYLG-GTIMPGFHLMKESLAVRTANLNRHAGKRYPFPTTTGNAVASGMMDAVCGSVMMM
470 480 490 500 510 520orf61a HGRLKEKTGAGKPVDVIITGGGAAKVAEALPPAFLAENTVRVADNLVIHGLLNLIAAEGG
530 540 550 560 570 580全长ORF61a核苷酸序列<SEQ ID 177>是:1 ATGACGGTTT TGAAGCCTTC GCACTGGCGG GTGTTGGCGG AGCTTGCCGA51 CGGTTTGCCG CAACACGTCT CGCAACTGGC GCGTATGGCG GATATGAAGC101 CGCAGCAGCT CAACGGTTTT TGGCAGCAGA TGCCGGCGCA CATACGCGGG151 CTGTTGCGCC AACACGACGG CTATTGGCGG CTGGTGCGCC CATTGGCGGT201 TTTCGATGCC GAAGGTTTGC GCGAGCTGGG GGAAAGGTCG GGTTTTCAGA251 CGGCATTGAA GCACGAGTGC GCGTCCAGCA ACGACGAGAT ACTGGAATTG301 GCGCGGATTG CGCCGGACAA GGCGCACAAA ACCATATGTG TGACCCACCT351 GCAAAGTAAG GGCAGGGGGC GGCAGGGGCG GAAGTGGTCG CACCGTTTGG401 GCGAGTGTCT GATGTTCAGT TTTGGCTGGG TGTTTGACCG GCCGCAGTAT451 GAGTTGGGTT CGCTGTCGCC TGTTGCGGCA GTGGCGTGCC GGCGCGCCTT501 GTCGCGTTTG GGTTTGAAAA CGCAAATCAA GTGGCCAAAC GATTTGGTCG551 TCGGACGCGA CAAATTGGGC GGCATTCTGA TTGAAACGGT CAGGACGGGC601 GGCAAAACGG TTGCCGTGGT CGGTATCGGC ATCAATTTCG TGCTGCCCAA651 GGAAGTGGAA AACGCCGCTT CCGTGCAATC GCTGTTTCAG ACGGCATCGC701 GGCGGGGAAA TGCCGATGCC GCCGTGTTGC TGGAAACGCT GTTGGCGGAA751 CTTGATGCGG TGTTGTTGCA ATATGCGCGG GACGGATTTG CGCCTTTTGT801 GGCGGAATAT CAGGCTGCCA ACCGCGACCA CGGCAAGGCG GTATTGCTGT851 TGCGCGACGG CGAAACCGTG TTCGAAGGCA CGGTTAAAGG CGTGGACGGA901 CAAGGCGTTC TGCACTTGGA AACGGCAGAG GGCAAACAGA CGGTCGTCAG951 CGGCGAAATC AGCCTGCGGT CCGACGACAG GCCGGTTTCC GTGCCGAAGC1001 GGCGGGATTC GGAACGTTTT CTGCTGTTGG ACGGCGGCAA CAGCCGGCTC1051 AAGTGGGCGT GGGTGGAAAA CGGCACGTTC GCAACCGTCG GTAGCGCGCC1101 GTACCGCGAT TTGTCGCCTT TGGGCGCGGA GTGGGCGGAA AAGGTGGATG1151 GAAATGTCCG CATCGTCGGT TGCGCCGTGT GCGGAGAATT CAAAAAGGCA1201 CAAGTGCAGG AACAGCTCGC CCGAAAAATC GAGTGGCTGC CGTCTTCCGC1251 ACAGGCTTTG GGCATACGCA ACCACTACCG CCACCCCGAA GAACACGGTT1301 CCGACCGCTG GTTCAACGCC TTGGGCAGCC GCCGCTTCAG CCGCAACGCC1351 TGCGTCGTCG TCAGTTGCGG CACGGCGGTA ACGGTTGACG CGCTCACCGA1401 TGACGGACAT TATCTCGGGG GAACCATCAT GCCCGGTTTC CACCTGATGA1451 AAGAATCGCT CGCCGTCCGA ACCGCCAACC TCAACCGGCA CGCCGGTAAG1501 CGTTATCCTT TCCCGACCAC AACGGGCAAT GCCGTCGCCA GCGGCATGAT1551 GGATGCGGTT TGCGGCTCGG TTATGATGAT GCACGGGCGT TTGAAAGAAA1601 AAACCGGGGC GGGCAAGCCT GTCGATGTCA TCATTACCGG CGGCGGCGCG1651 GCAAAAGTTG 
CCGAAGCCCT GCCGCCTGCA TTTTTGGCGG AAAATACCGT1701 GCGCGTGGCG GACAACCTCG TCATTCACGG GCTGCTGAAC CTGATTGCCG1751 CCGAAGGCGG GGAATCGGAA CATACTTAA它编码的蛋白质具有氨基酸序列<SEQ ID 178>:1 MTVLKPSHWR VLAELADGLP QHVSQLARMA DMKPQQLNGF WQQMPAHIRG51 LLRQHDGYWR LVRPLAVFDA EGLRELGERS GFQTALKHEC ASSNDEILEL101 ARIAPDKAHK TICVTHLQSK GRGRQGRKWS HRLGECLMFS FGWVFDRPQY151 ELGSLSPVAA VACRRALSRL GLKTQIKWPN DLVVGRDKLG GILIETVRTG201 GKTVAVVGIG INFVLPKEVE NAASVQSLFQ TASRRGNADA AVLLETLLAE251 LDAVLLQYAR DGFAPFVAEY QAANRDHGKA VLLLRDGETV FEGTVKGVDG301 QGVLHLETAE GKQTVVSGEI SLRSDDRPVS VPKRRDSERF LLLDGGNSRL351 KWAWVENGTF ATVGSAPYRD LSPLGAEWAE KVDGNVRIVG CAVCGEFKKA401 QVQEQLARKI EWLPSSAQAL GIRNHYRHPE EHGSDRWFNA LGSRRFSRNA451 CVVVSCGTAV TVDALTDDGH YLGGTIMPGF HLMKESLAVR TANLNRHAGK501 RYPFPTTTGN AVASGMMDAV CGSVMMMHGR LKEKTGAGKP VDVIITGGGA551 AKVAEALPPA FLAENTVRVA DNLVIHGLLN LIAAEGGESE HT*ORF61a和ORF61-1在591个氨基酸的重叠区内有98.5%的相同性:
10 20 30 40 50 60orf61a.pep MTVLKPSHWRVLAELADGLPQHVSQLARMADMKPQQLNGFWQQMPAHIRGLLRQHDGYWR
||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||orf61-1 MTVLKLSHWRVLAELADGLPQHVSQLARMADMKPQQLNGFWQQMPAHIRGLLRQHDGYWR
10 20 30 40 50 60
70 80 90 100 110 120orf61a.pep LVRPLAVFDAEGLRELGERSGFQTALKHECASSNDEILELARIAPDKAHKTICVTHLQSK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf61-1 LVRPLAVFDAEGLRELGERSGFQTALKHECASSNDEILELARIAPDKAHKTICVTHLQSK
70 80 90 100 110 120
130 140 150 160 170 180orf61a.pep GRGRQGRKWSHRLGECLMFSFGWVFDRPQYELGSLSPVAAVACRRALSRLGLKTQIKWPN
|||||||||||||||||||||||||||||||||||||||||||||||||||| :||||||orf61-1 GRGRQGRKWSHRLGECLMFSFGWVFDRPQYELGSLSPVAAVACRRALSRLGLDVQIKWPN
130 140 150 160 170 180
190 200 210 220 230 240orf61a.pep DLVVGRDKLGGILIETVRTGGKTVAVVGIGINFVLPKEVENAASVQSLFQTASRRGNADA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf61-1 DLVVGRDKLGGILIETVRTGGKTVAVVGIGINFVLPKEVENAASVQSLFQTASRRGNADA
190 200 210 220 230 240
250 260 270 280 290 300orf61a.pep AVLLETLLAELDAVLLQYARDGFAPFVAEYQAANRDHGKAVLLLRDGETVFEGTVKGVDG
||||||||:|||||||||||||||||||||||||||||||||||||||||||||||||||orf61-1 AVLLETLLVELDAVLLQYARDGFAPFVAEYQAANRDHGKAVLLLRDGETVFEGTVKGVDG
250 260 270 280 290 300
310 320 330 340 350 360orf61a.pep QGVLHLETAEGKQTVVSGEISLRSDDRPVSVPKRRDSERFLLLDGGNSRLKWAWVENGTF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf61-1 QGVLHLETAEGKQTVVSGEISLRSDDRPVSVPKRRDSERFLLLDGGNSRLKWAWVENGTF
310 320 330 340 350 360
370 380 390 400 410 420orf61a.pep ATVGSAPYRDLSPLGAEWAEKVDGNVRIVGCAVCGEFKKAQVQEQLARKIEWLPSSAQAL
|||||||||||||||||||||:||||||||||||||||||||||||||||||||||||||orf61-1 ATVGSAPYRDLSPLGAEWAEKADGNVRIVGCAVCGEFKKAQVQEQLARKIEWLPSSAQAL
370 380 390 400 410 420
430 440 450 460 470 480orf61a.pep GIRNHYRHPEEHGSDRWFNALGSRRFSRNACVVVSCGTAVTVDALTDDGHYLGGTIMPGF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf61-1 GIRNHYRHPEEHGSDRWFNALGSRRFSRNACVVVSCGTAVTVDALTDDGHYLGGTIMPGF
430 440 450 460 470 480
490 500 510 520 530 540
orf61a.pep HLMKESLAVRTANLNRHAGKRYPFPTTTGNAVASGMMDAVCGSVMMMHGRLKEKTGAGKP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf61-1 HLMKESLAVRTANLNRHAGKRYPFPTTTGNAVASGMMDAVCGSVMMMHGRLKEKTGAGKP
490 500 510 520 530 540
550 560 570 580 590
orf61a.pep VDVIITGGGAAKVAEALPPAFLAENTVRVADNLVIHGLLNLIAAEGGESEHTX
|||||||||||||||||||||||||||||||||||:||||:||||| | ||
orf61-1 VDVIITGGGAAKVAEALPPAFLAENTVRVADNLVIYGLLNMIAAEGREYEHIX
550 560 570 580 590
Homology with a predicted ORF from N. gonorrhoeae
ORF61 shows 94.2% identity over a 189 aa overlap with a predicted ORF (ORF61.ng) from N. gonorrhoeae:
orf61.pep EISLRSDXRPVSVXKRRDSERFLLLDGGNS 30
||||| | | ||| || ||||||||:||||
orf61ng TVCEGTVKGVDGRGVLHLETAEGEQTVVSGEISLRPDNRSVSVPKRPDSERFLLLEGGNS 211
orf61.pep RLKWAWVENGTFATVGSAPYRDLSPLGAEWAEKADGNVRIVGCAVCGEFKKAQVQEQLAR 90
|||||||||||||||||||||||||||||||||||||||||||||||| |||||:|||||
orf61ng RLKWAWVENGTFATVGSAPYRDLSPLGAEWAEKADGNVRIVGCAVCGESKKAQVKEQLAR 271
orf61.pep KIEWLPSSAQAXGIRNHYRHPEEHGSDRWFNALGSRRFSRNACVVVSCGTAVTVDALTDD 150
||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||
orf61ng KIEWLPSSAQALGIRNHYRHPEEHGSDRWFNALGSRRFSRNACVVVSCGTAVTVDALTDD 331
orf61.pep GHYLGXGTIMPGFHLMKESLAVRTANLNRHAGKRYPFPT 189
||||| ||||||||||||||||||||||| |||||||||
orf61ng GHYLG-GTIMPGFHLMKESLAVRTANLNRPAGKRYPFPTTTGNAVASGMMDAVCGSIMMM 390
预计ORF61ng核苷酸序列<SEQ ID 179>编码的蛋白质具有氨基酸序列<SEQ ID180>:1 MFSFGWAFDR PQYELGSLSP VAALACRRAL GCLGLETQIK WPNDLVVGRD51 KLGGILIETV RAGGKTVAVV GIGINFVLPK EVENAASVQS LFQTASRRGN101 ADAAVLLETL LAELGAVLEQ YAEEGFAPFL NEYETANRDH GKAVLLLRDG151 ETVCEGTVKG VDGRGVLHLE TAEGEQTVVS GEISLRPDNR SVSVPKRPDS201 ERFLLLEGGN SRLKWAWVEN GTFATVGSAP YRDLSPLGAE WAEKADGNVR251 IVGCAVCGES KKAQVKEQLA RKIEWLPSSA QALGIRNHYR HPEEHGSDRW301 FNALGSRRFS RNACVVVSCG TAVTVDALTD DGHYLGGTIM PGFHLMKESL351 AVRTANLNRP AGKRYPFPTT TGNAVASGMM DAVCGSIMMM HGRLKEKNGA401 GKPVDVIITG GGAAKVAEAL PPAFLAENTV RVADNLVIHG LLNLIAAEGG451 ESEHA*进一步的分析揭示完整的淋球菌DNA序列<SEQ ID 181>是:1 ATGACGGTTT TGAAGCCTTC GCATTGGCGG GTGTTGGCGG AGCTTGCCGA51 CGGTTTGCCG CAACACGTAT CGCAATTGGC GCGTGAGGCG GACATGAAGC101 CGCAGCAGCT CAACGGTTTT TGGCAGCAGA TGCCGGCGCA TATACGCGGG151 CTGTTGCGCC AACACGACGG CTATTGGCGG CTGGTGCGCC CCTTGGCGGT201 TTTCGATGCC GAAGGTTTGC GCGATCTGGG GGAAAGGTCG GGTTTTCAGA251 CGGCATTGAA GCACGAGTGC GCGTCCAGCA ACGACGAGAT ACTGGAATTG301 GCGCGGATTG CGCCGGACAA GGCGCACAAA ACCATATGCG TGACCCACCT351 GCAAAGTAAG GGCAGGGGGC GGCAGGGGCG GAAGTGGTCG CACCGTTTGG401 GCGAGTGCCT GATGTTCAGT TTCGGCTGGG CGTTTGACCG GCCGCAGTAT451 GAGTTGGGTT CGCTGTCGCC TGTTGCGGCA CTTGCGTGCC GGCGCGCTTT 501 GGGGTGTTTG GGTTTGGAAA CGCAAATCAA GTGGCCAAAC GATTTGGTCG551 TCGGACGCGA CAAATTGGGC GGCATTCTGA TTGAAACAGT CAGGGCGGGC601 GGTAAAACGG TTGCCGTGGT CGGTATCGGC ATCAATTTCG TGCTGCCCAA651 GGAAGTGGAA AACGCCGCTT CCGTGCAGTC GCTGTTTCAG ACGGCATCGC701 GGCGGGGCAA TGCCGATGCC GCCGTATTGC TGGAAACATT GCTTGCGGAA751 CTGGGCGCGG TGTTGGAACA ATATGCGGAA GAAGGGTTCG CGCCATTTTT801 AAATGAGTAT GAAACGGCCA ACCGCGACCA CGGCAAGGCG GTATTGCTGT851 TGCGCGACGG CGAAACCGTG TGCGAAGGCA CGGTTAAAGG CGTGGACGGA901 CGAGGCGTTC TGCACTTGGA AACGGCAgaa ggcgaACAGa cggtcgtcag951 cggcgaaaTC AGcctGCggc ccgacaacaG GTCGGtttcc gtgccgaagc1001 ggccggatTC GgaacgtTTT tTGCtgttgg aaggcgggaa cagccgGCTC1051 AAGTGGGCGT GggtggAAAa cggcacgttc gcaaccgtgg gcagcgcgCc1101 gtaCCGCGAT TTGTCGCCTT TGGGCGCGGA GTGGGCGGAA AAGGCGGATG1151 GAAATGTCCG CATCGTCGGT TGCGCCGTGT GCGGAGAATC 
CAAAAAGGCA1201 CAAGTGAAGG AACAGCTCGC CCGAAAAATC GAGTGGCTGC CGTCTTCCGC12S1 ACAGGCTTTG GGCATACGCA ACCACTACCG CCACCCCGAA GAACACGGTT1301 CCGACCGTTG GTTCAACGCC TTGGGCAGCC GCCGCTTCAG CCGCAACGCC1351 TGCGTCGTCG TCAGTTGCGG CACGGCGGTA ACGGTTGACG CGCTCACCGA1401 TGACGGACAT TATCTCGGCG GAACCATCAT GCCCGGCTTC CACCTGATGA1451 AAGAATCGCT CGCCGTCCGA ACCGCCAACC TCAACCGCCC CGCCGGCAAA1501 CGTTACCCTT TCCCGACCAC AACGGGCAAC GCCGTCGCAA GCGGCATGAT1551 GGACGCGGTT TGCGGCTCGA TAATGATGAT GCACGGCCGT TTGAAAGAAA1601 AAAACGGCGC GGGCAAGCCT GTCGATGTCA TCATTACCGG CGGCGGCGCG1651 GCGAAAGTCG CCGAAGCCCT GCCGCCTGCA TTTTTGGCGG AAAATACCGT1701 GCGCGTGGCG GACAACCTCG TCATCCACGG GCTGCTGAAC CTGATTGCCG1751 CCGAAGGCGG GGAATCGGAA CACGCTTAA它对应于氨基酸序列<SEQ ID 182;ORF61ng-1>:1 MTVLKPSHWR VLAELADGLP QHVSQLAREA DMKPQQLNGF WQQMPAHIRG51 LLRQHDGYWR LVRPLAVFDA EGLRDLGERS GFQTALKHEC ASSNDEILEL101 ARIAPDKAHK TICVTHLQSK GRGRQGRKWS HRLGECLMFS FGWAFDRPQY151 ELGSLSPVAA LACRRALGCL GLETQIKWPN DLVVGRDKLG GILIETVRAG201 GKTVAVVGIG INFVLPKEVE NAASVQSLFQ TASRRGNADA AVLLETLLAE251 LGAVLEQYAE EGFAPFLNEY ETANRDHGKA VLLLRDGETV CEGTVKGVDG301 RGVLHLETAE GEQTVVSGEI SLRPDNRSVS VPKRPDSERF LLLEGGNSRL351 KWAWVENGTF ATVGSAPYRD LSPLGAEWAE KADGNVRIVG CAVCGESKKA401 QVKEQLARKI EWLPSSAQAL GIRNHYRHPE EHGSDRWFNA LGSRRFSRNA451 CVVVSCGTAV TVDALTDDGH YLGGTIMPGF HLMKESLAVR TANLNRPAGK501 RYPFPTTTGN AVASGMMDAV CGSIMMMHGR LKEKNGAGKP VDVIITGGGA551 AKVAEALPPA FLAENTVRVA DNLVIHGLLN LIAAEGGESE HA*ORF61ng-1和ORF61-1在591个氨基酸的重叠区内有93.9%的相同性:orf61ng-1.pep MTVLKPSHWRVLAELADGLPQHVSQLAREADMKPQQLNGFWQQMPAHIRGLLRQHDGYWR 60
||||| |||||||||||||||||||||| |||||||||||||||||||||||||||||||orf61-1 MTVLKLSHWRVLAELADGLPQHVSQLARMADMKPQQLNGFWQQMPAHIRGLLRQHDGYWR 60orf61ng-1.pep LVRPLAVFDAEGLRDLGERSGFQTALKHECASSNDEILELARIAPDKAHKTICVTHLQSK 120
||||||||||||||:|||||||||||||||||||||||||||||||||||||||||||||orf61-1 LVRPLAVFDAEGLRELGERSGFQTALKHECASSNDEILELARIAPDKAHKTICVTHLQSK 120orf61ng-1.pep GRGRQGRKWSHRLGECLMFSFGWAFDRPQYELGSLSPVAALACRRALGCLGLETQIKWPN 180
|||||||||||||||||||||||:||||||||||||||||:||||||: |||::||||||orf61-1 GRGRQGRKWSHRLGECLMFSFGWVFDRPQYELGSLSPVAAVACRRALSRLGLDVQIKWPN 180orf61ng-1.pep DLVVGRDKLGGILIETVRAGGKTVAVVGIGINFVLPKEVENAASVQSLFQTASRRGNADA 240
||||||||||||||||||:|||||||||||||||||||||||||||||||||||||||||orf61-1 DLVVGRDKLGGILIETVRTGGKTVAVVGIGINFVLPKEVENAASVQSLFQTASRRGNADA 240orf61ng-1.pep AVLLETLLAELGAVLEQYAEEGFAPFLNEYETANRDHGKAVLLLRDGETVCEGTVKGVDG 300
||||||||:|| ||| |||::|||||: ||::|||||||||||||||||| |||||||||orf61-1 AVLLETLLVELDAVLLQYARDGFAPFVAEYQAANRDHGKAVLLLRDGETVFEGTVKGVDG 300
orf61ng-1.pep RGVLHLETAEGEQTVVSGEISLRPDNRSVSVPKRPDSERFLLLEGGNSRLKWAWVENGTF 360
:||||||||||:||||||||||| |:| |||||| ||||||||:||||||||||||||||
orf61-1 QGVLHLETAEGKQTVVSGEISLRSDDRPVSVPKRRDSERFLLLDGGNSRLKWAWVENGTF 360
orf61ng-1.pep ATVGSAPYRDLSPLGAEWAEKADGNVRIVGCAVCGESKKAQVKEQLARKIEWLPSSAQAL 420
|||||||||||||||||||||||||||||||||||| |||||:|||||||||||||||||
orf61-1 ATVGSAPYRDLSPLGAEWAEKADGNVRIVGCAVCGEFKKAQVQEQLARKIEWLPSSAQAL 420
orf61ng-1.pep GIRNHYRHPEEHGSDRWFNALGSRRFSRNACVVVSCGTAVTVDALTDDGHYLGGTIMPGF 480
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf61-1 GIRNHYRHPEEHGSDRWFNALGSRRFSRNACVVVSCGTAVTVDALTDDGHYLGGTIMPGF 480
orf61ng-1.pep HLMKESLAVRTANLNRPAGKRYPFPTTTGNAVASGMMDAVCGSIMMMHGRLKEKNGAGKP 540
|||||||||||||||| ||||||||||||||||||||||||||:||||||||||:|||||
orf61-1 HLMKESLAVRTANLNRHAGKRYPFPTTTGNAVASGMMDAVCGSVMMMHGRLKEKTGAGKP 540
orf61ng-1.pep VDVIITGGGAAKVAEALPPAFLAENTVRVADNLVIHGLLNLIAAEGGESEHAX 593
|||||||||||||||||||||||||||||||||||:||||:||||| | ||
orf61-1 VDVIITGGGAAKVAEALPPAFLAENTVRVADNLVIYGLLNMIAAEGREYEHIX 593
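Based on this analysis, including the homology with the baf protein of B. parapertussis and the presence of a putative prokaryotic membrane lipoprotein lipid attachment site, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 29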
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 183>:1 ATGTTTTACC AAATCCTTGC CCTGATTATC TGGAGCAGCT CGTTTATTGC51 CGCCAAATAT GTCTATGGCG GCATCGATCC CGCATTGATG GTCGGCGTGC101 GCCTGCTAAT TGCCGCGCTG CCTGCACTGC CCGCCTGCCG CCGTCATGTC151 GGCAAGATTC CGCGTGAGGA ATGGAAGCCG TTGCTGATTG TGTCGTTCGT201 CAACTATGTG CTGACCCTGC TGCTTCAGTT TGTCGGGTTG AAATACACTT251 CCGCCGCCAG CGCATCGGTC ATTGTCGGAC TCGAGCCGCT GCTGATGGTG301 TTTGTCGGAC ACTTTTTCTT CAACGACAAA GCGCGTGCCT ACCACTGGAT351 ATGCGGCGCG GCGGCATTTG CCGGTGTCGC GCTGCTGATG GCGGGCGGTG401 CGGaAGAGGG CGGCGaAGTC GGCTGGTTCG GCTGCCTGCT GGTGTTGTTG451 GCGGGCGCGG GCTTTTGTGC CGCTATGCGT CCGACGCAAA GGCTGATTGC501 ACGCATCGGC GCACCGGCAT TCACATCTGT TTCCATTGCC GCCGCATCGT551 TGATGTGCCT GCCGTTTTCG CTTGCTTTGG CGCAAAGTTA TACCGTGGAC601 TGGAGCGTCG GGATGGTATT GTCGCTGCTG TATTTGGGTT TGGGGTGC..它对应于氨基酸序列<SEQ ID 184;ORF62>:1 MFYQILALII WSSSFIAAKY VYGGIDPALM VGVRLLIAAL PALPACRRHV51 GKIPREEWKP LLIVSFVNYV LTLLLQFVGL KYTSAASASV IVGLEPLLMV101 FVGHFFFNDK ARAYHWICGA AAFAGVALLM AGGAEEGGEV GWFGCLLVLL151 AGAGFCAAMR PTQRLIARIG APAFTSVSIA AASLMCLPFS LALAQSYTVD201 WSVGMVLSLL YLGLGC..进一步的工作揭示了完整的核苷酸序列<SEQ ID 185>:1 ATGTTTTACC AAATCCTTGC CCTGATTATC TGGAGCAGCT CGTTTATTGC51 CGCCAAATAT GTCTATGGCG GCATCGATCC CGCATTGATG GTCGGCGTGC101 GCCTGCTAAT TGCCGCGCTG CCTGCACTGC CCGCCTGCCG CCGTCATGTC151 GGCAAGATTC CGCGTGAGGA ATGGAAGCCG TTGCTGATTG TGTCGTTCGT201 CAACTATGTG CTGACCCTGC TGCTTCAGTT TGTCGGGTTG AAATACACTT251 CCGCCGCCAG CGCATCGGTC ATTGTCGGAC TCGAGCCGCT GCTGATGGTG301 TTTGTCGGAC ACTTTTTCTT CAACGACAAA GCGCGTGCCT ACCACTGGAT351 ATGCGGCGCG GCGGCATTTG CCGGTGTCGC GCTGCTGATG GCGGGCGGTG401 CGGAAGAGGG CGGCGAAGTC GGCTGGTTCG GCTGCCTGCT GGTGTTGTTG451 GCGGGCGCGG GCTTTTGTGC CGCTATGCGT CCGACGCAAA GGCTGATTGC501 ACGCATCGGC GCACCGGCAT TCACATCTGT TTCCATTGCC GCCGCATCGT551 TGATGTGCCT GCCGTTTTCG CTTGCTTTGG CGCAAAGTTA TACCGTGGAC601 TGGAGCGTCG GGATGGTATT GTCGCTGCTG TATTTGGGTT TGGGGTGCGG651 CTGGTACGCC TATTGGCTGT GGAACAAGGG GATGAGCCGT GTTCCTGCCA701 ATGTTTCGGG ACTGTTGATT TCGCTCGAAC CCGTCGTCGG CGTGCTGCTG751 GCGGTTTTGA TTTTGGGCGA ACACCTGTCG 
CCCGTGTCCG CCTTGGGCGT801 GTTTGTCGTC ATCGCCGCCA CCTTGGTTGC CGGCCGGCTG TCGCATCAAA851 AATAA它对应于氨基酸序列<SEQ ID 186;ORF62-1>:1 MFYQILALII WSSSFIAAKY VYGGIDPALM VGVRLLIAAL PALPACRRHV51 GKIPREEWKP LLIVSFYNYV LTLLLQFVGL KYTSAASASV IVGLEPLLMV101 FVGHFFFNDK ARAYHWICGA AAFAGVALLM AGGAEFGGEV GWFGCLLVLL151 AGAGFCAAMR PTQRLIARIG APAFTSVSIA AASLMCLPFS LALAQSYTVD201 WSVGMVLSLL YLGLGCGMYA YWLWNKGMSR VPANVSGLLI SLEPVVGVLL251 AVLILGEHLS PVSALGVFVV IAATLVAGRL SHQK*该氨基酸序列的计算机分析给出了下列结果:与流感嗜血菌的假设跨膜蛋白HI0976(登录号为O57147)的同源性ORF62和HI0976在114的氨基酸的重叠区内有50%的氨基酸相同性:Orf62 1 MFYQILALIIWSSSFIAAKYVYGGIDPALMVGVRXXXXXXXXXXXCRRHVGKIPREEWKP 60
M YQILAL+IWSSS I K Y +DP L+V VR R KI + K
HI0976 1 MLYQILALLIWSSSLIVGKLTYSMMDPVLVVQVRLIIAMIIVMPLFLRRWKKIDKPMRKQ 60
Orf62 61 LLIVSFVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAY 114
L ++F NY LLQF+GLKYTSA+SA ++GLEPLL+VFVGHFFF K +
HI0976 61 LWWLAFFNYTAVFLLQFIGLKYTSASSAVTMIGLEPLLVVFVGHFFFKTKQNGF 114
Homology with a predicted ORF from N. meningitidis (strain A)
ORF62 shows 99.5% identity over a 216 aa overlap with an ORF (ORF62a) from strain A of N. meningitidis:
10 20 30 40 50 60
orf62.pep MFYQILALIIWSSSFIAAKYVYGGIDPALMVGVRLLIAALPALPACRRHVGKIPREEWKP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf62a MFYQILALIIWSSSFIAAKYVYGGIDPALMVGVRLLIAALPALPACRRHVGKIPREEWKP
10 20 30 40 50 60
70 80 90 100 110 120
orf62.pep LLIVSFVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAYHWICGA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf62a LLIVSFVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAYHWICGA
70 80 90 100 110 120
130 140 150 160 170 180
orf62.pep AAFAGVALLMAGGAEEGGEVGWFGCLLVLLAGAGFCAAMRPTQRLIARIGAPAFTSVSIA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf62a AAFAGVALLMAGGAEEGGEVGWFGCLLVLLAGAGFCAAMRPTQRLIARIGAPAFTSVSIA
130 140 150 160 170 180
190 200 210
orf62.pep AASLMCLPFSLALAQSYTVDWSVGMVLSLLYLGLGC
|||||||||||||||||||||||||||||||||:||
orf62a AASLMCLPFSLALAQSYTVDWSVGMVLSLLYLGVGCSWYAYWLWNKGMSRVPANVSGLLI
190 200 210 220 230 240orf62a SLEPVVGVLLAVLILGEHLSPVSVLGVFVVIAATLVAGRLSHQKX
250 260 270 280全长ORF62a核苷酸序列<SEQ ID 187>是:1 ATGTTTTACC AAATCCTTGC CCTGATTATC TGGAGCAGCT CGTTTATTGC51 CGCCAAATAT GTCTATGGCG GCATCGATCC CGCATTGATG GTCGGCGTGC101 GCCTGCTGAT TGCTGCGCTG CCTGCACTGC CCGCCTGCCG CCGTCATGTC151 GGCAAGATTC CGCGTGAGGA ATGGAAGCCG TTGCTGATTG TGTCGTTCGT201 CAACTATGTG CTGACCCTGC TACTTCAGTT TGTCGGGTTG AAATACACTT251 CCGCCGCCAG CGCATCGGTC ATTGTCGGAC TCGAGCCACT GCTGATGGTG301 TTTGTCGGAC ACTTTTTCTT CAACGACAAA GCGCGTGCCT ACCACTGGAT351 ATGCGGCGCG GCGGCATTTG CCGGTGTCGC GCTGCTGATG GCGGGCGGTG401 CGGAAGAGGG CGGCGAAGTC GGCTGGTTCG GCTGCCTGCT GGTGTTGTTG451 GCGGGCGCGG GCTTTTGTGC CGCTATGCGT CCGACGCAAA GGCTGATTGC501 ACGCATCGGC GCACCGGCAT TCACATCTGT TTCCATTGCC GCCGCATCGT551 TGATGTGCCT GCCGTTTTCG CTTGCTTTGG CGCAAAGTTA TACCGTGGAC601 TGGAGCGTCG GAATGGTATT GTCGCTGCTG TATTTGGGCG TGGGGTGCAG651 CTGGTACGCC TATTGGCTGT GGAACAAGGG GATGAGCCGT GTTCCTGCCA701 ACGTTTCGGG ACTGTTGATT TCGCTCGAAC CCGTCGTCGG CGTGCTGCTG751 GCGGTTTTGA TTTTGGGCGA ACACCTGTCG CCCGTGTCCG TCTTGGGCGT801 GTTTGTCGTC ATCGCCGCCA CCTTGGTTGC CGGCCGGCTG TCGCATCAAA851 AATAA它编码的蛋白质具有氨基酸序列<SEQ ID 188>:1 MFYQILALII WSSSFIAAKY VYGGIDPALM VGVRLLIAAL PALPACRRHV51 GKIPREEWKP LLIVSFVNYV LTLLLQFVGL KYTSAASASV IVGLEPLLMV101 FVGHFFFNDK ARAYHWICGA AAFAGVALLM AGGAEEGGEV GWFGCLLVLL151 AGAGFCAAMR PTQRLIARIG APAFTSVSIA AASLMCLPFS LALAQSYTVD201 WSVGMVLSLL YLGVGCSWYA YWLWNKGMSR VPANVSGLLI SLEPVVGVLL251 AVLILGEHLS PVSVLGVFVV IAATLVAGRL SHQK*
ORF62a and ORF62-1 show 98.9% identity over a 284 aa overlap:
orf62a.pep MFYQILALIIWSSSFIAAKYVYGGIDPALMVGVRLLIAALPALPACRRHVGKIPREEWKP 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf62-1 MFYQILALIIWSSSFIAAKYVYGGIDPALMVGVRLLIAALPALPACRRHVGKIPREEWKP 60
orf62a.pep LLIVSFVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAYHWICGA 120
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf62-1 LLIVSFVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAYHWICGA 120
orf62a.pep AAFAGVALLMAGGAEEGGEVGWFGCLLVLLAGAGFCAAMRPTQRLIARIGAPAFTSVSIA 180
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf62-1 AAFAGVALLMAGGAEEGGEVGWFGCLLVLLAGAGFCAAMRPTQRLIARIGAPAFTSVSIA 180
orf62a.pep AASLMCLPFSLALAQSYTVDWSVGMVLSLLYLGVGCSWYAYWLWNKGMSRVPANVSGLLI 240
|||||||||||||||||||||||||||||||||:||:|||||||||||||||||||||||
orf62-1 AASLMCLPFSLALAQSYTVDWSVGMVLSLLYLGLGCGWYAYWLWNKGMSRVPANVSGLLI 240
orf62a.pep SLEPVVGVLLAVLILGEHLSPVSVLGVFVVIAATLVAGRLSHQKX 285
|||||||||||||||||||||||:|||||||||||||||||||||
orf62-1 SLEPVVGVLLAVLILGEHLSPVSALGVFVVIAATLVAGRLSHQKX 285
Homology with a predicted ORF from N. gonorrhoeae
ORF62 shows 99.5% identity over a 216 aa overlap with a predicted ORF (ORF62.ng) from N. gonorrhoeae:
orf62.pep MFYQILALIIWSSSFIAAKYVYGGIDPALMVGVRLLIAALPALPACRRHVGKIPREEWKP 60
|||||||||||:||||||||||||||||||||||||||||||||||||||||||||||||
orf62ng MFYQILALIIWGSSFIAAKYVYGGIDPALMVGVRLLIAALPALPACRRHVGKIPREEWKP 60orf62.pep LLIVSFVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAYHWICGA 120
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf62ng LLIVSFVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAYHWICGA 120orf62.pep AAFAGVALLMAGGAEEGGEVGWFGCLLVLLAGAGFCAAMRPTQRLIARIGAPAFTSVSIA 180
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf62ng AAFAGVALLMAGGAEEGGEVGWFGCLLVLLAGAGFCAAMRPTQRLIARIGAPAFTSVSIA 180orf62.pep AASLMCLPFSLALAQSYTVDWSVGMVLSLLYLGLGC 216
||||||||||||||||||||||||||||||||||||orf62ng AASLMCLPFSLALAQSYTVDWSVGMVLSLLYLGLGCGWYAYWLWNKGMSRVPANASGLLI 240全长ORF62ng核苷酸序列<SEQ ID 189>是:1 ATGTTTTACC AAATCCTTGC CCTGATTATC TGGGGCAGCT CGTTTATTGC51 CGCCAAATAT GTCTATGGCG GCATCGATCC CGCATTGATG GTCGGCGTGC101 GCCTGCTGAT TGCCGCGCTG CCTGCACTGC CCGCCTGCCG CCGTCATGTC151 GGCAAGATTC CGCGTGAGGA ATGGAAGCCG TTGCTGATTG TGTCGTTCGT201 CAACTATGTG CTGACCCTGC TGCTTCAGTT TGTCGGGTTG AAATACACTT251 CCGCCGCCAG CGCATCGGTC ATTGTCGGAC TCGAGCCGCT GCTGATGGTG301 TTTGTCGGAC ACTTTTTCTT CAACGACAAA GCGCGTGCCT ACCACTGGAT351 ATGCGGCGCG GCGGCATTTG CCGGTGTCGC GCTGCTGATG GCGGGCGGTG401 CGGAAGAGGG CGGCGAAGTC GGCTGGTTCG GCTGCCTGCT GGTGTTGTTG451 GCGGGCGCGG GCTTTTGTGC CGCTATGCGT CCGACGCAAA GGCTGATTGC501 CCGCATCGGC GCACCGGCAT TCACATCTGT TTCCATTGCC GCCGCATCGT551 TGATGTGCCT GCCGTTTTCG CTTGCTTTGG CGCAAAGTTA TACCGTGGAC601 TGGAGCGTCG GGATGGTATT GTCGCTGTTG TATTTGGGTT TGGGGTGCGG651 CTGGTACGCC TATTGGCTGT GGAACAAGGG GATGAGCCGT GTTCCTGCCA701 ACGCGTCGGG ACTGTTGATT TCGCTCGAAC CCGTCGTCGG CGTGCTGTTG751 GCGGTTTTGA TTTTGGGCGA ACATTTATCG CCCGTGTCCG CCTTGGGCGT801 GTTTGTCGTC ATCGCCGCCA CTTTCGCCGC CGGCCGGCTG TCGCGCAGGG851 ACGCGCAAAA CGGCAATGCC GTCTGA它编码的蛋白质具有氨基酸序列<SEQ ID 190>:1 MFYQILALII WGSSFIAAKY VYGGIDPALM VGVRLLIAAL PALPACRRHV51 GKIPREEWKP LLIVSFVNYV LTLLLQFVGL KYTSAASASV IVGLEPLLMV101 FVGHFFFNDK ARAYHWICGA AAFAGVALLM AGGAEEGGEV GWFGCLLVLL151 AGAGFCAAMR PTQRLIARIG APAFTSVSIA AASLMCLPFS LALAQSYTVD201 WSVGMVLSLL YLGLGCGWYA YWLWNKGMSR VPANASGLLI SLEPVVGVLL251 AVLILGEHLS PVSALGVFVV IAATFAAGRL SRRDAQNGNA V*ORF62ng和ORF62-1在283个氨基酸的重叠区内有97.9%的相同性:
10 20 30 40 50 60orf62ng.pep MFYQILALIIWGSSFIAAKYVYGGIDPALMVGVRLLIAALPALPACRRHVGKIPREEWKP
|||||||||||:||||||||||||||||||||||||||||||||||||||||||||||||orf62-1 MFYQILALIIWSSSFIAAKYVYGGIDPALMVGVRLLIAALPALPACRRHVGKIPREEWKP
10 20 30 40 50 60
70 80 90 100 110 120orf62ng.pep LLIVSFVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAYHWICGA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf62-1 LLIVSFVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAYHWICGA
70 80 90 100 110 120
130 140 150 160 170 180orf62ng.pep AAFAGVALLMAGGAEEGGEVGWFGCLLVLLAGAGFCAAMRPTQRLIARIGAPAFTSVSIA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf62-1 AAFAGVALLMAGGAEEGGEVGWFGCLLVLLAGAGFCAAMRPTQRLIARIGAPAFTSVSIA
130 140 150 160 170 180
190 200 210 220 230 240
orf62ng.pep AASLMCLPFSLALAQSYTVDWSVGMVLSLLYLGLGCGWYAYWLWNKGMSRVPANASGLLI
||||||||||||||||||||||||||||||||||||||||||||||||||||||:|||||
orf62-1 AASLMCLPFSLALAQSYTVDWSVGMVLSLLYLGLGCGWYAYWLWNKGMSRVPANVSGLLI
190 200 210 220 230 240
250 260 270 280 290
orf62ng.pep SLEPVVGVLLAVLILGEHLSPVSALGVFVVIAATFAAGRLSRRDAQNGNAVX
||||||||||||||||||||||||||||||||||::|||||::
orf62-1 SLEPVVGVLLAVLILGEHLSPVSALGVFVVIAATLVAGRLSHQKX
250 260 270 280
Furthermore, ORF62ng shows significant homology to a hypothetical H. influenzae protein:
sp|Q57147|Y975_HAEIN hypothetical protein HI0976 >gi|1074589|pir||B64163 hypothetical protein HI0976 - Haemophilus influenzae (strain Rd KW20)
>gi|1574004 (U32778) hypothetical [Haemophilus influenzae] Length = 128
Score = 106 bits (262), Expect = 2e-22
Identities = 56/114 (49%), Positives = 68/114 (59%)
Query: 1 MFYQILALIIWGSSFIAAKYVYGGIDPALMVGVRXXXXXXXXXXXCRRHVGKIPREEWKP 60
M YQILAL+IW SS I K Y +DP L+V VR R KI + K
Sbjct: 1 MLYQILALLIWSSSLIVGKLTYSMMDPVLVVQVRLIIAMIIVMPLFLRRWKKIDKPMRKQ 60
Query: 61 LLIVSFVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAY 114
L ++F NY LLQF+GLKYTSA+SA ++GLEPLL+VFVGHFFF K +
Sbjct: 61 LWWLAFFNYTAVFLLQFIGLKYTSASSAVTMIGLEPLLVVFVGHFFFKTKQNGF 114
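The percentages in a BLAST summary line follow directly from the residue counts. The sketch below recomputes them for the hit above. The function name `blast_percent` is illustrative, and truncating to a whole percent is an assumption that happens to reproduce the figures printed here (56/114 and 68/114).

```python
def blast_percent(count: int, aligned_length: int) -> int:
    """Return count / aligned_length as a whole percent.

    Truncation (rather than rounding) is an assumed convention; it matches
    the percentages shown in the summary lines of this document.
    """
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return (100 * count) // aligned_length


# Counts taken from the summary line of the H. influenzae hit.
identities = blast_percent(56, 114)
positives = blast_percent(68, 114)
print(f"Identities = 56/114 ({identities}%), Positives = 68/114 ({positives}%)")
```

The same helper can be applied to any other "count/length" pair reported in a summary line, including the gap count when one is given.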
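Based on this analysis, including the homology with a transmembrane protein from H. influenzae and the presence of a putative leader sequence and several transmembrane domains in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 30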
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 191>:1 ATGCGCCGTT TTCTACCGAT CGCAGCCATA TGCGCmGwms TCCTGkkGTA51 sGGACTGACG GCGGCAACCG GCAGCACCAG TTCGCTGGCG GATTATTTCT101 GGTGGATTGT TGCGTTCAGC GCAATGCTGC TGCTGGTGTT GTCCGCCGTT151 TTGGCACGTT ATGTCATATT GCTGTTGAAA GACAGGCGCG ACGGCGTATT201 CGGTTCGCtA srTyGCCAAA gsGCCTgkks TGGG.ATGTT TACGCTGGTT251 GCCGkACTGC CCGGCGTGTT TCTGTTCGGC TTTCCCGCAC AGTTCATCAA301 CGGCACGATT AATTCGTGGT TCGGCAACGA TACCCACGAG GCGCTTGAAC351 GCAGCCTCAA TTTGAGCAAG TCCGCATTGA ATTTGGCGGC AGACAACGCC401 CTCGGCAACG CCGTCCCCGT GCAGATAGAC CTCATCGGCG CGGCTTCCCT451 GCCCGGGGAT ATGGGCAGGG TGCTGGAACA TTACGCCGGC AGCGGTTTTG501 CCCAGCTTGC CCTGTACAAy ksCGCAAGCG GCAAAATCGA AAAAAGCATC551 AACCCGCACA AGCTCGATCA GCCGTTTCCA GGTAAGGCGC GTTGGGAaAa601 AATCCaACGG GCGGGTTCGG TCAGGGATTT GGAAAGCATA GGCGGCGTAT651 TGTaCGCGCA GGGCTGGCTG TCGGCGGGTA CGCACwACGG GCGCGATTAC701 GCCTTGTTTT TCCGTCAGCC GGTTCCCAAA GGCGTGGCAG AGGATGCCGT751 yTTAATCGAA AAGGCAAGGG CGAAATATGC TGAGTTGAGT TACAGCAAAA801 AAGGTTTGCA GACCTTTTTC CTGGCAACCC TGCTGATTGC CTCGCTGCTG851 TCGATTTTTC TTGCACTGGT CATGGCACTG TATTTCGCCC GCCGTTTCGT901 CGAACCCGTC CTATCGCTTG CCGAGGGGGC GAAGGCGGTG GCGCAAGGCG951 ATTTCAGCCA GACGCGCCCC GTGTTGCGCA ACGACGAGTT CGGACGCTTG1001 ACCArGTTGT TCAACCACAT GACCGAGCAG CTTTCCATCG CCAAAGATGC1051 AGACGAGCGC AACCGCCGGC GCGAGGAAGC CGCCAGGCAT TATCTTGAAT1101 GCGTGTTGGA GGGGCTGACC ACGGGCGTGG TGGTGTTTGA CGAACAAGGC 1151 TGTCTGAAAA CCTTCAACAA AGCGGCGGGT ACC..它对应于氨基酸序列<SEQ ID 192;ORF64>:1 MRRFLPIAAI CAXXLXXGLT AATGSTSSLA DYFWWIVAFS AMLLLVLSAV51 LARYVILLLK DRRDGVFGSX XAKXPXXXMF TLVAXLPGVF LFGFPAQFIN101 GTINSWFGND THEALERSLN LSKSALNLAA DNALGNAVPV QIDLIGAASL151 PGDMGRVLEH YAGSGFAQLA LYNXASGKIE KSINPHKLDQ PFPGKARWEK201 IQRAGSVRDL ESIGGVLYAQ GWLSAGTHXG RDYALFFRQP VPKGVAEDAV251 LIEKARAKYA ELSYSKKGLQ TFFLATLLIA SLLSIFLALV MALYFARRFV301 EPVLSLAEGA KAVAQGDFSQ TRPVLRNDEF GRLTXLFNHM TEQLSIAKDA351 DERNRRREEA ARHYLECVLE GLTTGVVVFD EQGCLKTFNK AAGT..进一步的工作揭示了完整的核苷酸序列<SEQ ID 193>:1 ATGCGCCGTT TTCTACCGAT CGCAGCCATA TGCGCCGTCG TCCTGTTGTA51 CGGACTGACG 
GCGGCAACCG GCAGCACCAG TTCGCTGGCG GATTATTTCT101 GGTGGATTGT TGCGTTCAGC GCAATGCTGC TGCTGGTGTT GTCCGCCGTT151 TTGGCACGTT ATGTCATATT GCTGTTGAAA GACAGGCGCG ACGGCGTATT201 CGGTTCGCAG ATTGCCAAAC GCCTTTCTGG GATGTTTACG CTGGTTGCCG251 TACTGCCCGG CGTGTTTCTG TTCGGCGTTT CCGCACAGTT CATCAACGGC301 ACGATTAATT CGTGGTTCGG CAACGATACC CACGAGGCGC TTGAACGCAG351 CCTCAATTTG AGCAAGTCCG CATTGAATTT GGCGGCAGAC AACGCCCTCG401 GCAACGCCGT CCCCGTGCAG ATAGACCTCA TCGGCGCGGC TTCCCTGCCC451 GGGGATATGG GCAGGGTGCT GGAACATTAC GCCGGCAGCG GTTTTGCCCA501 GCTTGCCCTG TACAATGCCG CAAGCGGCAA AATCGAAAAA AGCATCAACC551 CGCACAAGCT CGATCAGCCG TTTCCAGGTA AGGCGCGTTG GGAAAAAATC601 CAACGGGCGG GTTCGGTCAG GGATTTGGAA AGCATAGGCG GCGTATTGTA651 CGCGCAGGGC TGGCTGTCGG CGGGTACGCA CAACGGGCGC GATTACGCCT701 TGTTTTTCCG TCAGCCGGTT CCCAAAGGCG TGGCAGAGGA TGCCGTCTTA751 ATCGAAAAGG CAAGGGCGAA ATATGCTGAG TTGAGTTACA GCAAAAAAGG801 TTTGCAGACC TTTTTCCTGG CAACCCTGCT GATTGCCTCG CTGCTGTCGA851 TTTTTCTTGC ACTGGTCATG GCACTGTATT TCGCCCGCCG TTTCGTCGAA901 CCCGTCCTAT CGCTTGCCGA GGGGGCGAAG GCGGTGGCGC AAGGCGATTT951 CAGCCAGACG CGCCCCGTGT TGCGCAACGA CGAGTTCGGA CGCTTGACCA1001 AGTTGTTCAA CCACATGACC GAGCAGCTTT CCATCGCCAA AGAAGCAGAC1051 GAGCGCAACC GCCGGCGCGA GGAAGCCGCC AGGCATTATC TTGAATGCGT1101 GTTGGAGGGG CTGACCACGG GCGTGGTGGT GTTTGACGAA CAAGGCTGTC1151 TGAAAACCTT CAACAAAGCG GCGGAACAGA TTTTGGGGAT GCCGCTTACC1201 CCCCTGTGGG GCAGCAGCCG GCACGGTTGG CACGGCGTTT CGGCGCAGCA1251 GTCCCTGCTT GCCGAAGTGT TTGCCGCCAT CGGCGCGGCG GCAGGTACGG1301 ACAAACCGGT CCATGTGAAA TATGCCGCGC CGGACGATGC CAAAATCCTG1351 CTGGGCAAGG CAACCGTCCT GCCCGAAGAC AACGGCAACG GCGTGGTAAT1401 GGTGATTGAC GACATCACCG TTTTGATACA CGCGCAAAAA GAAGCCGCGT1451 GGGGCGAAGT GGCGAAGCGG CTGGCACACG AAATCCGCAA TCCGCTCACG1501 CCCATCCAGC TTTCCGCCGA ACGGCTGGCG TGGAAATTGG GCGGGAAGCT1551 GGATGAGCAG GATGCGCAAA TCCTGACGCG TTCGACCGAC ACCATCGTCA1601 AACAGGTGGC GGCATTGAAG GAAATGGTCG AAGCATTCCG CAATTATGCG1651 CGTTCCCCTT CGCTCAAATT GGAAAATCAG GATTTGAACG CCTTAATCGG1701 CGATGTGTTG GCATTGTATG AAGCCGGTCC GTGCCGGTTT GCGGCGGAGC1751 TTGCCGGCGA ACCGCTGACG 
GTGGCGGCGG ATACGACCGC CATGCGGCAG1801 GTGCTGCACA ATATTTTCAA AAATGCCGCC GAAGCGGCGG AAGAAGCCGA1851 TGTGCCCGAA GTCAGGGTAA AATCGGAAAC AGGGCAGGAC GGTCGGATTG1901 TCCTGACGGT TTGCGACAAC GGCAAAGGGT TCGGCAGGGA AATGCTGCAC1951 AACGCCTTCG AGCCGTATGT AACGGACAAA CCGGCGGGAA CGGGATTGGG2001 TCTGCCTGTG GTGAAAAAAA TCATTGAAGA ACACGGCGGC CGCATCAGCC2051 TGAGCAATCA GGATGCGGGT GGCGCGTGTG TCAGAATCAT CTTGCCAAAA2101 ACGGTAAAAA CTTATGCGTA G它对应于氨基酸序列<SEQ ID 194;ORF64-1>:1 MRRFLPIAAI CAVVLLYGLT AATGSTSSLA DYFWWIVAFS AMLLLVLSAV51 LARYVILLLK DRRDGVFGSQ IAKRLSGMFT LVAVLPGVFL FGVSAQFING101 TINSWFGNDT HEALERSLNL SKSALNLAAD NALGNAVPVQ IDLIGAASLP151 GDMGRVLEHY AGSGFAQLAL YNAASGKIEK SINPHKLDQP FPGKARWEKI201 QRAGSVRDLE SIGGVLYAQG WLSAGTHNGR DYALFFRQPV PKGVAEDAVL251 IEKARAKYAE LSYSKKGLQT FFLATLLIAS LLSIFLALVM ALYFARRFVE301 PVLSLAEGAK AVAQGDFSQT RPVLRNDEFG RLTKLFNIMT EQLSIAKEAD351 ERNRRREEAA RHYLECVLEG LTTGVVVFDE QGCLKTFNKA AEQILGMPLT401 PLWGSSRHGW HGVSAQQSLL AEVFAAIGAA AGTDKPVHVK YAAPDDAKIL451 LGKATVLPED NGNGVVMVID DITVLIHAQK EAAWGEVAKR LAHEIRNPLT501 PIQLSAERLA WKLGGKLDEQ DAQILTRSTD TIVKQVAALK EMVEAFRNYA551 RSPSLKLENQ DLNALIGDVL ALYEAGPCRF AAELAGEPLT VAADTTAMRQ601 VLHNIFKNAA EAAEEADVPE VRVKSETGQD GRIVLTVCDN GKGFGREMLH651 NAFEPYVTDK PAGTGLGLPV VKKIIEEHGG RISLSNQDAG GACVRIILPK701 TVKTYA*
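The partial DNA sequence <SEQ ID 191> contains ambiguous base calls (m, w, s, k, y, r), and the corresponding positions in ORF64 <SEQ ID 192> appear as X unless every possible reading of the codon gives the same residue. The following is a minimal sketch of such a translation, assuming the standard genetic code and the IUPAC ambiguity codes. The function name `translate` and the choice to treat any unrecognised symbol (such as `.`) as "any base" are assumptions for illustration, not the method used in the source.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; ambiguous codons become X unless unambiguous."""
    # Drop whitespace and position numbers from a formatted listing.
    seq = "".join(ch for ch in dna if not (ch.isspace() or ch.isdigit())).upper()
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        codon = seq[pos:pos + 3]
        # Unknown symbols are treated as "any base" (assumption).
        options = [IUPAC.get(base, "ACGT") for base in codon]
        residues = {CODON_TABLE["".join(c)] for c in product(*options)}
        protein.append(residues.pop() if len(residues) == 1 else "X")
    return "".join(protein)


# First 60 bases of the partial sequence <SEQ ID 191>.
partial = "ATGCGCCGTT TTCTACCGAT CGCAGCCATA TGCGCmGwms TCCTGkkGTA sGGACTGACG"
print(translate(partial))  # prints MRRFLPIAAICAXXLXXGLT
```

The codon `GCm` resolves to alanine because both GCA and GCC encode it, whereas `Gwm`, `sTC`, `kkG` and `TAs` each have more than one possible reading and are reported as X, which agrees with the first 20 residues listed for ORF64.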
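Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF64 shows 92.6% identity over a 392-amino-acid overlap with an ORF (ORF64a) from strain A of N. meningitidis: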
10 20 30 40 50 60orf64.pep MRRFLPIAAICAXXLXXGLTAATGSTSSLADYFWWIVAFSAMLLLVLSAVLARYVILLLK
|||||||||||| | |||||||||||||||||||||||||||||||||||||||||||orf64a MRRFLPIAAICAVVLLYGLTAATGSTSSLADYFWWIVAFSAMLLLVLSAVLARYVILLLK
10 20 30 40 50 60
70 80 90 100 110 120orf64.pep DRRDGVFGSXXAKXPXXXMFTLVAXLPGVFLFGFPAQFINGTINSWFGNDTHEALERSLN
||||||||| || |||||| |||||||| |||||||||||||||||||||||||orf64a DRRDGVFGSQIAKR-LSGMFTLVAVLPGVFLFGVSAQFINGTINSWFGNDTHEALERSLN
70 80 90 100 110
130 140 150 160 170 180orf64.pep LSKSALNLAADNALGNAVPVQIDLIGAASLPGDMGRVLEHYAGSGFAQLALYNXASGKIE
|||||||||||||||||:||||| ||||||| ||||||||||||||||||||| ||||||orf64a LSKSALNLAADNALGNAIPVQIDXIGAASLPXDMGRVLEHYAGSGFAQLALYNAASGKIE
120 130 140 150 160 170
190 200 210 220 230 240orf64.pep KSINPHKLDQPFPGKARWEKIQRAGSVRDLESIGGVLYAQGWLSAGTHXGRDYALFFRQP
||||||||||||||||||||||:|||||| ||||||||| ||||| || |||||||||||orf64a KSINPHKLDQPFPGKARWEKIQQAGSVRDXESIGGVLYAXGWLSAXTHNGRDYALFFRQP
180 190 200 210 220 230
250 260 270 280 290 300orf64.pep VPKGVAEDAVLIEKARAKYAELSYSKKGLQTFFLATLLIASLLSIFLALVMALYFARRFV
||||||||||||||||| |||||||||||||||||||||||||||||||||||||||orf64a VPKGVAEDAVLIEKARAXXXXLSYSKKGLQTFFLATLLIASLLSIFLALVMALYFARRFV
240 250 260 270 280 290
310 320 330 340 350 360orf64.pep EPVLSLAEGAKAVAQGDFSQTRPVLRNDEFGRLTXLFNHMTEQLSIAKDADERNRRREEA
|||||||||||||||||||||||||||||||||| |||||||||||||:|||||||||||orf64a EPVLSLAEGAKAVAQGDFSQTRPVLRNDEFGRLTKLFNHMTEQLSIAKEADERNRRREEA
300 310 320 330 340 350
370 380 390orf64.pep ARHYLECVLEGLTTGVVVFDEQGCLKTFNKAAGT
||||||||||||||||||||||||||||||||orf64a ARHYLECVLEGLTTGVVVFDEQGCLKTFNKAAEQILGMPLTPLWGSSRHGWHGVSAQQSL
360 370 380 390 400 410orf64a LAEVFAAIGAAAGTDKPVHVKYAAPDDAKILLGKATVLPEDNXNGVVMVIDDITVLIHAQ
420 430 440 450 460 470全长ORF64a核苷酸序列<SEQ ID 195>是:1 ATGCGCCGTT TTCTACCGAT CGCAGCCATA TGCGCCGTCG TCCTGTTGTA51 CGGACTGACG GCGGCAACCG GCAGCACCAG TTCGCTGGCG GATTATTTCT101 GGTGGATTGT TGCGTTCAGC GCAATGCTGC TGCTGGTGTT GTCCGCCGTT151 TTGGCACGTT ATGTCATATT GCTGTTGAAA GACAGGCGCG ACGGCGTATT201 CGGTTCGCAG ATTGCCAAAC GCCTTTCCGG GATGTTTACG CTGGTTGCCG251 TACTGCCCGG CGTGTTTCTG TTCGGCGTTT CCGCACAGTT TATCAACGGC301 ACGATTAATT CGTGGTTCGG CAACGATACC CACGAGGCGC TTGAACGCAG351 CCTCAATTTG AGCAAGTCCG CATTGAATCT GGCGGCAGAC AACGCCCTTG401 GCAACGCCAT CCCCGTGCAG ATAGACNTCA TCGGCGCGGC TTCCCTGCCC451 NGGGATATGG GCAGGGTGCT GGAACATTAC GCCGGCAGCG GTTTTGCCCA501 GCTTGCCCTG TACAATGCCG CAAGCGGCAA AATCGAAAAA AGCATCAACC551 CGCACAAGCT CGATCAGCCG TTTCCAGGTA AGGCGCGTTG GGAAAAAATC601 CAACAGGCGG GTTCGGTCAG GGATNNGGAA AGCATAGGCG GCGTATTGTA651 CGCGCANGGC TGGCTGTCGG CAGNNACGCA CAACGGGCGC GATTACGCCT701 TGTTTTTCCG TCAGCCGGTT CCCAAAGGCG TGGCAGAGGA TGCCGTCTTA751 ATCGAAAAGG CAAGGGCGNA ANANNNTNAG TTGAGTTACA GCAAAAAAGG801 TTTGCAGACC TTTTTCCTNG CAACCCTGCT GATTGCCTCN CTGCTGTCGA851 TTTTTCTTGC ACTGGTCATG GCACTGTATT TCGCCCGCCG TTTCGTCGAA901 CCCGTCCTAT CGCTTGCCGA GGGGGCGAAG GCGGTGGCGC AAGGCGATTT951 CAGCCAGACG CGCCCCGTGT TGCGCAACGA CGAGTTCGGA CGCTTGACCA1001 AGTTGTTCAA CCACATGACC GAGCAGCTTT CCATCGCCAA AGAAGCAGAC1051 GAGCGCAACC GCCGGCGCGA GGAAGCCGCC AGACATTATC TCGAATGCGT1101 GTTGGAGGGG CTGACCACGG GCGTGGTGGT GTTTGACGAA CAAGGCTGTC1151 TGAAAACCTT CAACAAAGCG GCGGAACAGA TTTTGGGGAT GCCGCTTACC1201 CCCCTGTGGG GCAGCAGCCG GCACGGTTGG CACGGCGTTT CGGCGCAGCA1251 GTCCCTGCTT GCCGAAGTGT TTGCCGCCAT CGGCGCGGCG GCAGGTACGG1301 ACAAACCGGT CCATGTGAAA TATGCCGCGC CGGACGATGC CAAAATCCTG1351 CTGGGCAAGG CAACCGTCCT GCCCGAAGAC AACNGCAACG GCGTGGTAAT1401 GGTGATTGAC GACATCACCG TTTTGATACA CGCGCAAAAA GAAGCCGCGT1451 GGGGCGAAGT GGCAAAACGG CTGGCACACG AAATCCGCAA TCCGCTCACG1501 CCCATCCAGC TTTCTGCCGA ACGGCTGGCG TGGAAATTGG GCGGGAAGCT1551 GGACGAGCAN GACGCGCAAA TCCTGACACG TTCGACCGAC ACCATCATCA1601 AACAAGTGGC GGCATTAAAA GAAATGGTCG AGGCATTCCG CAATTACNCG1651 CGTTCCCCTT 
CGNCTCAATT GGAAAATCAG GATTTGAACG CCTTAATCGG1701 CGATGTGTTG GCATTGTACG AAGCTGGTCC GTGCCGGTTT GCGGCGGAAC1751 TTGCCGGCGA ACCGCTGATG ATGGCGGCGG ATACGACCGC CATGCGGCAG1801 GTGCTGCACA ATATTTTCAA AAATGCCGCC GAAGCGGCGG AAGAAGCCGA1851 TGTGCCCGAA GTCAGGGTAA AATCGGAAGC GGGGCAGGAC GGACGGATTG1901 TCCTGACAGT TTGCGACAAC GGCAAGGGGT TCGGCAGGGA AATGCTGCAC1951 AATGCCTTCG AGCCGTATGT AACGGACAAA CCGGCTGGAA CGGGATTGNG2001 ACTGCCCGTG GTGAAAAAAA TCATTGAAGA ACACGGCGGC CNCATCAGCC2051 TGAGCAATCA GGATGCGGGC GGCGCGTNTG TCAGAATCAT CTTGCCAAAA2101 ACGGTAGAAA CTTATGCGTA G它编码的蛋白质具有氨基酸序列<SEQ ID 196>:1 MRRFLPIAAI CAVVLLYGLT AATGSTSSLA DYFWWIVAFS AMLLLVLSAV51 LARYVILLLK DRRDGVFGSQ IAKRLSGMFT LVAVLPGYFL FGVSAQFING101 TINSWFGNDT HEALERSLNL SKSALNLAAD NALGNAIPYQ IDXIGAASLP151 XDMGRVLEHY AGSGFAQLAL YNAASGKIEK SINPHKLDQP FPGKARWEKI201 QQAGSVRDXE SIGGVLYAXG WLSAXTHNGR DYALFFRQPV PKGVAEDAVL251 IEKARAXXXX LSYSKKGLQT FFLATLLIAS LLSIFLALVM ALYFARRFVE301 PVLSLAEGAK AVAQGDFSQT RPVLRNDEFG RLTKLFNHMT EQLSIAKEAD351 ERNRRREEAA RHYLECVLEG LTTGVVVFDE QGCLKTFNKA AEQILGMPLT401 PLWGSSRHGW HGVSAQQSLL AEVFAAIGAA AGTDKPVHVK YAAPDDAKIL451 LGKATVLPED NXNGVVMVID DITVLIHAQK EAAWGEVAKR LAHEIRNPLT501 PIQLSAERLA WKLGGKLDEX DAQILTRSTD TIIKQVAALK EMVEAFRNYX551 RSPSXQLENQ DLNALIGDVL ALYEAGPCRF AAELAGEPLM MAADTTAMRQ601 VLHNIFKNAA EAAEEADVPE VRVKSEAGQD GRIVLTVCDN GKGFGREMLH
651 NAFEPYVTDK PAGTGLXLPV VKKIIEEHGG XISLSNQDAG GAXVRIILPK
701 TVETYA*
ORF64a and ORF64-1 show 96.6% identity over a 706-amino-acid overlap:
10 20 30 40 50 60orf64a.pep MRRFLPIAAICAVVLLYGLTAATGSTSSLADYFWWIVAFSAMLLLVLSAVLARYVILLLK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf64-1 MRRFLPIAAICAVVLLYGLTAATGSTSSLADYFWWIVAFSAMLLLVLSAVLARYVILLLK
10 20 30 40 50 60
70 80 90 100 110 120orf64a.pep DRRDGVFGSQIAKRLSGMFTLVAVLPGVFLFGVSAQFINGTINSWFGNDTHEALERSLNL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf64-1 DRRDGVFGSQIAKRLSGMFTLVAVLPGVFLFGVSAQFINGTINSWFGNDTHEALERSLNL
70 80 90 100 110 120
130 140 150 160 170 180orf64a.pep SKSALNLAADNALGNAIPVQIDXIGAASLPXDMGRVLEHYAGSGFAQLALYNAASGKIEK
||||||||||||||||:||||| ||||||| |||||||||||||||||||||||||||||orf64-1 SKSALNLAADNALGNAVPVQIDLIGAASLPGDMGRVLEHYAGSGFAQLALYNAASGKIEK
130 140 150 160 170 180
190 200 210 220 230 240orf64a.pep SINPHKLDQPFPGKARWEKIQQAGSVRDXESIGGVLYAXGWLSAXTHNGRDYALFFRQPV
|||||||||||||||||||||:|||||| ||||||||| ||||| |||||||||||||||orf64-1 SINPHKLDQPFPGKARWEKIQRAGSVRDLESIGGVLYAQGWLSAGTHNGRDYALFFRQPV
190 200 210 220 230 240
250 260 270 280 290 300orf64a.pep PKGVAEDAVLIEKARAXXXXLSYSKKGLQTFFLATLLIASLLSIFLALVMALYFARRFVE
|||||||||||||||| ||||||||||||||||||||||||||||||||||||||||orf64-1 PKGVAEDAVLIEKARAKYAELSYSKKGLQTFFLATLLIASLLSIFLALVMALYFARRFVE
250 260 270 280 290 300
310 320 330 340 350 360orf64a.pep PVLSLAEGAKAVAQGDFSQTRPVLRNDEFGRLTKLFNHMTEQLSIAKEADERNRRREEAA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf64-1 PVLSLAEGAKAVAQGDFSQTRPVLRNDEFGRLTKLFNHMTEQLSIAKEADERNRRREEAA
310 320 330 340 350 360
370 380 390 400 410 420orf64a.pep RHYLECVLEGLTTGVVVFDEQGCLKTFNKAAEQILGMPLTPLWGSSRHGWHGVSAQQSLL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf64-1 RHYLECVLEGLTTGVVVFDEQGCLKTFNKAAEQILGMPLTPLWGSSRHGWHGVSAQQSLL
370 380 390 400 410 420
430 440 450 460 470 480orf64a.pep AEVFAAIGAAAGTDKPVHVKYAAPDDAKILLGKATVLPEDNXNGVVMVIDDITVLIHAQK
||||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||orf64-1 AEVFAAIGAAAGTDKPVHVKYAAPDDAKILLGKATVLPEDNGNGVVMVIDDITVLIHAQK
430 440 450 460 470 480
490 500 510 520 530 540orf64a.pep EAAWGEVAKRLAHEIRNPLTPIQLSAERLAWKLGGKLDEXDAQILTRSTDTIIKQVAALK
||||||||||||||||||||||||||||||||||||||| ||||||||||||:|||||||orf64-1 EAAWGEVAKRLAHEIRNPLTPIQLSAERLAWKLGGKLDEQDAQILTRSTDTIVKQVAALK
490 500 510 520 530 540
550 560 570 580 590 600orf64a.pep EMVEAFRNYXRSPSXQLENQDLNALIGDVLALYEAGPCRFAAELAGEPLMMAADTTAMRQ
||||||||| |||| :||||||||||||||||||||||||||||||||| :|||||||||orf64-1 EMVEAFRNYARSPSLKLENQDLNALIGDVLALYEAGPCRFAAELAGEPLTVAADTTAMRQ
550 560 570 580 590 600
610 620 630 640 650 660orf64a.pep VLHNIFKNAAEAAEEADVPEVRVKSEAGQDGRIVLTVCDNGKGFGREMLHNAFEPYVTDK
||||||||||||||||||||||||||:|||||||||||||||||||||||||||||||||orf64-1 VLHNIFKNAAEAAEEADVPEVRVKSETGQDGRIVLTVCDNGKGFGREMLHNAFEPYVTDK
610 620 630 640 650 660
670 680 690 700orf64a.pep PAGTGLXLPVVKKIIEEHGGXISLSNQDAGGAXVRIILPKTVETYAX
|||||| ||||||||||||| ||||||||||| |||||||||:||||orf64-1 PAGTGLGLPVVKKIIEEHGGRISLSNQDAGGACVRIILPKTVKTYAX
670 680 690 700
Homology with a predicted ORF from N. gonorrhoeae
ORF64 shows 86.6% identity over a 387-amino-acid overlap with a predicted ORF (ORF64.ng) from N. gonorrhoeae:
orf64.pep MRRFLPIAAICAXXLXXGLTAATGSTSSLADYFWWIVAFSAMLLLVLSAVLARYVILLLK 60
|||||||||||| | ||||||||||||||||||||:||||||||||||||||||||||orf64ng MRRFLPIAAICAVVLLYGLTAATGSTSSLADYFWWIVSFSAMLLLVLSAVLARYVILLLK 60orf64.pep DRRDGVFGSXXAKXPXXXMFTLVAXLPGVFLFGFPAQFINGTINSWFGNDTHEALERSLN 120
|||:||||| || |||||| |||:||||: |||||||||||||||||||||||||orf64ng DRRNGVFGSQIAKR-LSGMFTLVAVLPGLFLFGISAQFINGTINSWFGNDTHEALERSLN 119orf64.pep LSKSALNLAADNALGNAVPVQIDLIGAASLPGDMGRVLEHYAGSGFAQLALYNXASGKIE 180
||||||:||||||::|||||||||||:||| |:|| ||||||||||||||||| ||||||orf64ng LSKSALDLAADNAVSNAVPVQIDLIGTASLSGNMGSVLEHYAGSGFAQLALYNAASGKIE 179orf64.pep KSINPHKLDQPFPGKARWEKIQRAGSVRDLESIGGVLYAQGWLSAGTHXGRDYALFFRQP 240
||||||::|||:| | :||:||::||||:||||||||||||||||||| |||||||||||orf64ng KSINPHQFDQPLPDKEHWEQIQQTGSVRSLESIGGVLYAQGWLSAGTHNGRDYALFFRQP 239orf64.pep VPKGVAEDAVLIEKARAKYAELSYSKKGLQTFFLATLLIASLLSIFLALVMALYFARRFV 300
:|::||:|||||||||||||||||||||||||||:|||||||||||||||||||||||||orf64ng IPENVAQDAVLIEKARAKYAELSYSKKGLQTFFLVTLLIASLLSIFLALVMALYFARRFV 299orf64.pep EPVLSLAEGAKAVAQGDFSQTRPVLRNDEFGRLTXLFNHMTEQLSIAKDADERNRRREEA 360
||:||||||||||||||||||||||||||||||| |||||||||||||:|||||||||||orf64ng EPILSLAEGAKAVAQGDFSQTRPVLRNDEFGRLTKLFNHMTEQLSIAKEADERNRRREEA 359orf64.pep ARHYLECVLEGLTTGVVVFDEQGCLKTFNKAAGT 394
|||||||||:|||||||| :| :|orf64ng ARHYLECVLDGLTTGVVVSYPLSCCRTAVFSTCHSSPLSYF 400
预计ORF64ng核苷酸序列<SEQ ID 197>编码的蛋白质具有氨基酸序列<SEQ ID198>:1 MRRFLPIAAI CAVVLLYGLT AATGSTSSLA DYFWWIVSFS AMLLLVLSAV51 LARYVILLLK DRRNGVFGSQ IAKRLSGMFT LVAVLPGLFL FGISAQFING101 TINSWFGNDT HEALERSLNL SKSALDLAAD NAVSNAVPVQ IDLIGTASLS151 GNMGSVLEHY AGSGFAQLAL YNA4SGKIEK SINPHQFDQP LPDKEHWEQI201 QQTGSVRSLE SIGGVLYAQG WLSAGTHNGR DYALFFRQPI PENVAQDAVL251 IEKARAKYAE LSYSKKGLQT FFLVTLLIAS LLSIFLALVM ALYFARRFVE301 PILSLAEGAK AVAQGDFSQT RPVLRNDEFG RLTKLFNHMT EQLSIAKEAD351 ERNRRREEAA RHYLECVLDG LTTGVVVSYP LSCCRTAVFS TCHSSPLSYF*进一步的工作揭示了完整的淋球菌DNA序列<SEQ ID 199>: 1 ATGCGCCGCT TCCTACCGAT CGCAGCCATA TGCGCCGTCG TCCTGCTGTA51 CGGATTGACG GCGGCGACCG GCAGCACCAG TTCGCTGGCG GATTATTTCT101 GGTGGATAGT CTCGTTCAGC GCAATGCTGC TGCTGGTGTT GTCCGCCGTT151 TTGGCACGTT ATGTCATATT GCTGTTGAAA GACAGGCGCA ACGGCGTGTT201 CGGTTCGCAG ATTGCCAAAC GCCTTTCCGG GATGTTCACG CTGGTCGCCG251 TACTGCCCGG CTTGTTCCTG TTCGGCATTT CCGCGCAGTT TATCAACGGC301 ACGATTAATT CGTGGTTCGG CAACGACACC CACGAAGCCC TCGAACGCAG351 CCTTAATTTG AGCAAGTCCG CACTGGATTT GGCGGCAGAC AATGCCGTCA401 GCAACGCCGT TCCCGTACAG ATAGACCTCA TCGGCACCGC CTCCCTGTCG451 GGCAATATGG GCAGTGTGCT GGAACACTAC GCCGGCAGCG GTTTTGCCCA501 GCTTGCCCTG TACAATGCCG CAAGCGGGAA AATCGAAAAA AGCATCAATC551 CGCACCAATT CGACCAGCCG CTTCCCGACA AAGAACATTG GGAACAGATT601 CAGCAGACCG GTTCGGTTCG GAGTTTGGAA AGCATAGGCG GCGTATTGTA651 CGCGCAGGGA TGGTTGTCGG CAGGTACGCA CAACGGGCGC GATTACGCGC701 TGTTCTTCCG CCAGCCGATT CCCGAAAATG TGGCACAGGA TGCCGTTCTG751 ATTGAAAAGG CGCGGGCGAA ATATGCCGAA TTGAGTTACA GCAAAAAAGG801 TTTGCAGACC TTTTTTCTGG TAACCCTGCT GATTGCCTCG CTGCTGTCGA851 TTTTTCTTGC GCTGGTAATG GCACTGTATT TTGCCCGCCG TTTCGTCGAA901 CCCATTCTGT CGCTTGCCGA GGGCGCAAAG GCGGTGGCGC AGGGTGATTT951 CAGCCAGACG CGCCCCGTAT TGCGCAACGA CGAGTTCGGA CGTTTGACCA1001 AGCTGTTCAA CCATATGACC GAGCAGCTTT CCATCGCCAA AGAAGCAGAC1051 GAACGCAACC GCCGGCGCGA GGAAGCCGCC CGTCACTACC TCGAGTGCGT1101 GTTGGATGGG TTGACTACCG GTGTGGTGGT GTTTGACGAA AAAGGCCGTT1151 TGAAAACCTT CAACAAGGCG GCGGAACAGA TTTTGGGGAT GCCGCTCGCC1201 CCCCTGTGGG GCAGCAGCCG GCACGGTTGG CACGGCGTTT 
CGGCGCAGCA1251 GTCCCTGCTT GCCGAAGTGT TtgccgccAT CGGTGCGGCG GCAGGTACGG1301 ACAAACCGGT CCAGGTGGAA TATGCCGCGC CGGACGATGC CAAAATCCTG1351 CTGGGCAAGG CGACGGTATT GCCCGAAGAC AACGGCAACG GCGTGGTGAT1401 GGTGATTGAC GACATCACCG TGCTGATACG CGCGCAAAAA GAAGCCGCGT1451 GGGGTGAAGT GGCGAAGCGG CTGGCACACG AAATCCGCAA TCCGCTCACG1501 CCCATCCAGC TTTCCGCCGA ACGGCTGGCG TGGAAATTGG GCGGGAAGCT1551 GGACGATCAG GACGCGCAAA TCCTGACGCG TtcgACCGAC ACCATCATCA1601 AACAGgtggc gGCGTTAAAA GAAATGGTCG AGGCATTCCG CAATTACGCG1651 CGCGCCCCTT CGCTCAAACT GGAAAATCAG GATTTGAACG CCTTAATCGG1701 CGATGTTTTG GCCCTGTACG AAGCCGGCCC GTGCCGGTTT GAGGCGGAAC1751 TTGCCGGCGA ACCGCTGATG ATGGCGGCGG ATACGACCGC CATGCGGCAG1801 GTGCTGCACA ATATTTTCAA AAATGCCGCC GAAGCGGCGG AAGAAGCCGA1851 TATGCCCGAA GTCAGGGTAA AATCGGAAAC GGGGCAGGAC GGACGGATTG1901 TCCTGACGGT TTGCGACAAC GGCAAGGGAT TCGGCAAGGA AATGCTGCAC1951 AATGCFTTCG AGCCGTATGT GACGGATAAG CCGGCGGGAA CGGGACTGGG2001 TCTGCCTGTA GTGAAAAAAA TCATTGGAGA ACACGGCGGC CGCATCAGCC2051 TGAGCAATCA GGATGCGGGT GGGGCGTGTG TCAGAATCAT CTTGCCAAAA2101 ACGGTAGAAA CTTATGCGTA G它对应于氨基酸序列<SEQ ID 200;ORF64ng-1>:1 MRRFLPIAAI CAVVLLYGLT AATGSTSSLA DYFWWIVSFS AMLLLVLSAV51 LARYVILLLK DRRNGVFGSQ IAKRLSGMFT LVAVLPGLFL FGISAQFING101 TINSWFGNDT HEALERSLNL SKSALDLAAD NAVSNAVPVQ IDLIGTASLS151 GNMGSVLEHY AGSGFAQLAL YNAASGKIEK SINPHQFDQP LPDKEHWEQI201 QQTGSVRSLE SIGGVLYAQG WLSAGTHNGR DYALFFRQPI PENVAQDAVL251 IEKARAKYAE LSYSKKGLQT FFLVTLLIAS LLSIFLALVM ALYFARRFVE301 PILSLAEGAK AVAQGDFSQT RPVLRNDEFG RLTKLFNHMT EQLSIAKEAD351 ERNRRREEAA RHYLECVLDG LTTGVVVFDE KGRLKTFNKA AEQILGMPLA401 PLWGSSRHGW HGVSAQQSLL AEVFAAIGAA AGTDKPVQVE YAAPDDAKIL451 LGKATVLPED NGNGVVMVID DITVLIRAQK EAAWGEVAKR LAHEIRNPLT501 PIQLSAERLA WKLGGKLDDQ DAQILTRSTD TIIKQVAALK EMVEAFRNYA551 RAPSLKLENQ DLNALIGDVL ALYEAGPCRF EAELAGEPLM MAADTTAMRQ601 VLHNIFKNAA EAAEEADMPE VRVKSETGQD GRIVLTVCDN GKGFGKEMLH651 NAFEPYVTDK PAGTGLGLPV VKKIIGEHGG RISLSNQDAG GACVRIILPK701 TVETYA*ORF64ng-1和ORF64-1在706个氨基酸的重叠区内有93.8%的相同性:
10 20 30 40 50 60orf64ng-1.pep MRRFLPIAAICAVVLLYGLTAATGSTSSLADYFWWIVSFSAMLLLVLSAVLARYVILLLK
|||||||||||||||||||||||||||||||||||||:||||||||||||||||||||||orf64-1 MRRFLPIAAICAVVLLYGLTAATGSTSSLADYFWWIVAFSAMLLLVLSAVLARYVILLLK
10 20 30 40 50 60
70 80 90 100 110 120orf64ng-1.pep DRRNGVFGSQIAKRLSGMFTLVAVLPGLFLFGISAQFINGTINSWFGNDTHEALERSLNL
|||:|||||||||||||||||||||||:||||:|||||||||||||||||||||||||||orf64-1 DRRDGVFGSQIAKRLSGMFTLVAVLPGVFLFGVSAQFINGTINSWFGNDTHEALERSLNL
70 80 90 100 110 120
130 140 150 160 170 180orf64ng-1.pep SKSALDLAADNAVSNAVPVQIDLIGTASLSGNMGSVLEHYAGSGFAQLALYNAASGKIEK
|||||:||||||::|||||||||||:||| |:|| |||||||||||||||||||||||||orf64-1 SKSALNLAADNALGNAVPVQIDLIGAASLPGDMGRVLEHYAGSGFAQLALYNAASGKIEK
130 140 150 160 170 180
190 200 210 220 230 240orf64ng-1.pep SINPHQFDQPLPDKEHWEQIQQTGSVRSLESIGGVLYAQGWLSAGTHNGRDYALFFRQPI
|||||::|||:| | :||:||::||||:|||||||||||||||||||||||||||||||:orf64-1 SINPHKLDQPFPGKARWEKIQRAGSVRDLESIGGVLYAQGWLSAGTHNGRDYALFFRQPV
190 200 210 220 230 240
250 260 270 280 290 300orf64ng-1.pep PENVAQDAVLIEKARAKYAELSYSKKGLQTFFLVTLLIASLLSIFLALVMALYFARRFVE
|::||:|||||||||||||||||||||||||||:||||||||||||||||||||||||||orf64-1 PKGVAEDAVLIEKARAKYAELSYSKKGLQTFFLATLLIASLLSIFLALVMALYFARRFVE
250 260 270 280 290 300
310 320 330 340 350 360orf64ng-1.pep PILSLAEGAKAVAQGDFSQTRPVLRNDEFGRLTKLFNHMTEQLSIAKEADERNRRREEAA
|:||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf64-1 PVLSLAEGAKAVAQGDFSQTRPVLRNDEFGRLTKLFNHMTEQLSIAKEADERNRRREEAA
310 320 330 340 350 360
370 380 390 400 410 420orf64ng-1.pep RHYLECVLDGLTTGVVVFDEKGRLKTFNKAAEQILGMPLAPLWGSSRHGWHGVSAQQSLL
||||||||:|||||||||||:| ||||||||||||||||:||||||||||||||||||||orf64-1 RHYLECVLEGLTTGVVVFDEQGCLKTFNKAAEQILGMPLTPLWGSSRHGWHGVSAQQSLL
370 380 390 400 410 420
430 440 450 460 470 480orf64ng-1.pep AEVFAAIGAAAGTDKPVQVEYAAPDDAKILLGKATVLPEDNGNGVVMVIDDITVLIRAQK
|||||||||||||||||:|:||||||||||||||||||||||||||||||||||||:|||orf64-1 AEVFAAIGAAAGTDKPVHVKYAAPDDAKILLGKATVLPEDNGNGVVMVIDDITVLIHAQK
430 440 450 460 470 480
490 500 510 520 530 540orf64ng-1.pep EAAWGEVAKRLAHEIRNPLTPIQLSAERLAWKLGGKLDDQDAQILTRSTDTIIKQVAALK
||||||||||||||||||||||||||||||||||||||:|||||||||||||:|||||||orf64-1 EAAWGEVAKRLAHEIRNPLTPIQLSAERLAWKLGGKLDEQDAQILTRSTDTIVKQVAALK
490 500 510 520 530 540
550 560 570 580 590 600orf64ng-1.pep EMVEAFRNYARAPSLKLENQDLNALIGDVLALYEAGPCRFEAELAGEPLMMAADTTAMRQ
|||||||||||:|||||||||||||||||||||||||||| |||||||| :|||||||||orf64-1 EMVEAFRNYARSPSLKLENQDLNALIGDVLALYEAGPCRFAAELAGEPLTVAADTTAMRQ
550 560 570 580 590 600
610 620 630 640 650 660
orf64ng-1.pep VLHNIFKNAAEAAEEADMPEVRVKSETGQDGRIVLTVCDNGKGFGKEMLHNAFEPYVTDK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf64-1 VLHNIFKNAAEAAEEADVPEVRVKSETGQDGRIVLTVCDNGKGFGREMLHNAFEPYVTDK
610 620 630 640 650 660
670 680 690 700
orf64ng-1.pep PAGTGLGLPVVKKIIGEHGGRISLSNQDAGGACVRIILPKTVETYAX
||||||||||||||| ||||||||||||||||||||||||||:||||
orf64-1 PAGTGLGLPVVKKIIEEHGGRISLSNQDAGGACVRIILPKTVKTYAX
670 680 690 700
Furthermore, ORF64ng-1 shows significant homology to a protein from Azorhizobium caulinodans:
sp|Q04850|NTRY_AZOCA nitrogen regulation protein NTRY >gi|77479|pir||S18624 ntrY protein - Azorhizobium caulinodans >gi|38737 (X63841) NtrY gene product [Azorhizobium caulinodans] Length = 771
Score = 218 bits (550), Expect = 7e-56
Identities = 195/720 (27%), Positives = 320/720 (44%), Gaps = 58/720 (8%)
Query: 7 IAAICAVVLLYGLTAATGSTSSLADYFWWIXXXXXXXXXXXXXXXXRYVILLLKDRRNGV 66
I+A+ ++L GLT + + + R + + K R G
Sbjct: 35 ISALATFLILMGLTPVVPTHQVVIS----VLLVNAAAVLILSAMVGREIWRIAKARARGR 90
Query: 67 FGSQIAKRLSGMFTLVAVLPGLFLFGISAQFINGTINSWFGNDTHEALERSLNLSKSALD 126
+++ R+ G+F +V+V+P + + +++ ++ ++ WF T E + S++++++ +
Sbjct: 91 AAARLHIRIVGLFAVVSVVPAILVAVVASLTLDRGLDRWFSMRTQEIVASSVSVAQTYVR 150
Query: 127 LAADNAVSNAVPVQIDLIGTASLSGNMGSVLEHYAG--SGFAQLALYNAASGKIEKSINP 184
A N + + + DL S+ Y G S F Q+ AA + ++
Sbjct: 151 EHALNIRGDILAMSADLTRLKSV----------YEGDRSRFNQILTAQAALRNLPGAMLI 200
Query: 185 HQFDQPLPDKEHWEQIQQTGSVRSLESIGGVLYAQGWLSAGTHNGRDYA----------- 233
+ D + ++ + I + V + +IG Q + N DY
Sbjct: 201 RR-DLSVVERAN-VNIGREFIVPANLAIGDATPDQPVIYLP--NDADYVAAVVPLKDYDD 256
Query: 234 --LFFRQPIPENVAQDAVLIEKARAKYAELSYSKKGLQTFFLVTXXXXXXXXXXXXXVMA 291
L+ + I V ++ A Y L + G+Q F + +
Sbjct: 257 LYLYVARLIDPRVIGYLKTTQETLADYRSLEERRFGVQVAFALMYAVITLIVLLSAVWLG 316
Query: 292 LYFARRFVEPILSLAEGAKAVAQGDFSQTRPVLRND-EFGRLTKLFNHMTEQLSIXXXXX 350
L F++ V PI L A VA+G+ P+ R + + L + FN MT +L
Sbjct: 317 LNFSKWLVAPIRRLMSAADHVAEGNLDVRVPIYRAEGDLASLAETFNKMTHELRSQREAI 376
Query: 351 XXXXXXXXXXXHYLECVLDGLTTGVVVFDEKGRLKTFNKAAEQILGMPLAPLWGSSRHGW 410
+ E VL G+ GV+ D + R+ N++AE++LG L+ + RH
Sbjct: 377 LTARDQIDSRRRFTEAVLSGVGAGVIGLDSQERITILNRSAERLLG--LSEVEALHRHLA 434
Query: 411 HGVSAQQSLLAEVFXXXXXXXXTDKPVQVEYAAPDDAKILLGKATVLPEDNG---NGVVM 467
V LL E + VQ D + + V E + +G V+
Sbjct: 435 EVVPETAGLLEEA------EHARQRSVQGNITLTRDGRERVFAVRVTTEQSPEAEHGWVV 488
Query: 468 VIDDITVLIRAQKEAAWGEVAKRLAHEIRNPLTPIQLSAERLAWKLGGKLDDQDAQILTR 527
+DDIT LI AQ+ +AW +VA+R+AHEI+NPLTPIQLSAERL K G + QD +I +
Sbjct: 489 TLDDITELISAQRTSAWADVARRIAHEIKNPLTPIQLSAERLKRKFGRHV-TQDREIFDQ 547
Query: 528 STDTIIKQVAALKEMVEAFRNYARAPSLKLENQDLNALIGDVLALYEAGPCRFEAELAGE 587
TDTII+QV + MV+ F ++AR P +++QD++ +I + L G +
Sbjct: 548 CTDTIIRQVGDIGRMVDEFSSFARMPKPVVDSQDMSEIIRQTVFLMRVGHPEVVFDSEVP 607
Query: 588 PLMMAA-DTTAMRQVLHNIFKNXXXXXXXXDMPEVRVK-------SETGQDGRIVLTVCD 639
P M A D + Q L NI KN P+VR + + G+D +V+ + D
Sbjct: 608 PAMPARFDRRLVSQALTNILKNAAEAIEAVP-PDVRGQGRIRVSANRVGED--LVIDIID 664
Query: 640 NGKGFGKEMLHNAFEPYVTDKPAGTGLGLPVVKKIIGEHGGRISLSNQDAG-GACVRIIL 698
NG G +E + EPYVT + GTGLGL +V KI+ EHGG I L++ G GA +R+ L
Sbjct: 665 NGTGLPQESRNRLLEPYVTTREKGTGLGLAIVGKIMEEHGGGIELNDAPEGRGAWIRLTL 724
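Based on this analysis, including the presence of a putative leader sequence (double-underlined) and several putative transmembrane domains (single-underlined) in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.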
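Putative transmembrane domains of the kind mentioned above are commonly screened for with a sliding-window hydropathy average. The source does not say which prediction method was used, so the following is only an illustrative sketch. It assumes the Kyte-Doolittle hydropathy scale, a 19-residue window and a cutoff of 1.6, none of which are stated in the text, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not named in the source).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 19) -> list[float]:
    """Mean hydropathy of every full window along the sequence."""
    # Unknown residues (for example X) contribute 0.0.
    scores = [KYTE_DOOLITTLE.get(res, 0.0) for res in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophobic_windows(protein: str, window: int = 19, cutoff: float = 1.6) -> list[int]:
    """1-based start positions of windows whose mean hydropathy exceeds cutoff."""
    profile = hydropathy_profile(protein, window)
    return [i + 1 for i, mean in enumerate(profile) if mean > cutoff]


# Residues 33-52 of ORF64ng-1 <SEQ ID 200>, used here only as sample input.
segment = "FWWIVSFSAMLLLVLSAVLA"
print(hydrophobic_windows(segment))
```

Windows that exceed the cutoff are only candidates for membrane-spanning segments; a dedicated predictor would be needed to confirm topology.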
Example 31
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 201>:1 ATGTACGCAT TTACCGCCGC ACAGCAACAG AAGGCACTCT TCCGGCTGGT51 GCTTTTTCAT ATCCTCATCA TCGCCGCCAG CAACTATCTG GTGCAGTTCC101 CTTTCCAAAT TTTCGGCATC CACACCACTT GGGGCGCATT TTCCTTTCCC151 TTCATCTTCC TTGCCACCGA CCTGACCGTC CGCATTTTCG GTTCTCACTT201 GGCACGGCGG ATTATCTTTT GGGTGATGTT CCCCGCCCTT TTGCTTTCCT251 ACGTCTTTTC CGTTTTGTTC CACAACGGCA GTTGGACAGG CTTGGGCGCG301 CTGTCCGAAT TCAACACCTT TGTCGGACGC ATCGCCTTAG CCAGCTTTGC351 CGCCTACGCG ATCGGACAAA TCCTTGATAT TTTTGTATTC AACAAATTAC401 GCCGTCTGAA AGCGTGGTGG ATTGCACCGA ACGCATCAAC CGTCATCGGG451 CACGCGTTGG ATACG...它对应于氨基酸序列<SEQ ID 202;ORF66>:1 MYAFTAAQQQ KALFRLVLFH ILIIAASNYL VQFPFQIFGI HTTWGAFSFP51 FIFLATDLTV RIFGSHLARR IIFWVMFPAL LLSYVFSVLF HNGSWTGLGA101 LSEFNTFVGR IALASFAAYA IGQILDIFVF NKLRRLKAWW IAPNASTVIG151 HALDT...进一步的工作揭示了完整的核苷酸序列<SEQ ID 203>:1 ATGTACGCAT TTACCGCCGC ACAGCAACAG AAGGCACTCT TCCGGCTGGT51 GCTTTTTCAT ATCCTCATCA TCCCCGCCAG CAACTATCTG GTGCAGTTCC101 CTTTCCAAAT TTTCGGCATC CACACCACTT GGGGCGCATT TTCCTTTCCC151 TTCATCTTCC TTGCCACCGA CCTGACCGTC CGCATTTTCG GTTCTCACTT201 GGCACGGCGG ATTATCTTTT GGGTGATGTT CCCCGCCCTT TTGCTTTCCT251 ACGTCTTTTC CGTTTTGTTC CACAACGGCA GTTGGACAGG CTTGGGCGCG301 CTGTCCGAAT TCAACACCTT TGTCGGACGC ATCGCCTTAG CCAGCTTTGC351 CGCCTACGCG ATCGGACAAA TCCTTGATAT TTTTGTATTC AACAAATTAC401 GCCGTCTGAA AGCGTGGTGG ATTGCACCGA CCGCATCAAC CGTCATCGGC451 AACGCCTTGG ATACGCTGGT ATTTTTCGCC GTTGCCTTCT ACGCAAGCAG501 CGATGGATTT ATGGCGGCAA ACTGGCAGGG CATCGCTTTT GTCGATTACC551 TGTTCAAACT TACCGTCTGC ACCCTCTTCT TCCTGCCCGC CTACGGCGTG601 ATACTGAATC TGCTGACGAA AAAACTGACA ACCCTGCAAA CCAAACAGGC651 GCAAGACCGC CCCGCGCCCT CGCTGCAAAA TCCGTAA它对应于氨基酸序列<SEQ ID 204;ORF66-1>:1 MYAFTAAQQQ KALFRLVLFH ILIIAASNYL VQFPFQIFGI HTTWGAFSFP51 FIFLATDLTV RIFGSHLARR IIFWVMFPAL LLSYVFSVLF HNGSWTGLGA101 LSEFNTFVGR IALASFAAYA IGQILDIFVF NKLRRLKAWW IAPTASTVIG151 NALDTLVFFA VAFYASSDGF MAANWQGIAF VDYLFKLTVC TLFFLPAYGV201 ILNLLTKKLT TLQTKQAQDR PAPSLQNP*
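Computer analysis of this amino acid sequence gave the following results:
Homology with hypothetical protein o221 of E. coli (accession number P37619)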
ORF66 and the o221 protein show 67% amino acid identity over a 155-amino-acid overlap:
orf66 1 MYAFTAAQQQKALFRLVLFHILIIAASNYLVQFPFQIFGIHTTWGAFSFPFIFLATDLTV 60
M F+ Q+ KALF L LFH+L+I +SNYLVQ P I G HTTWGAFSFPFIFLATDLTV
o221 1 MNVFSQTQRYKALFWLSLFHLLVITSSNYLVQLPVSILGFHTTWGAFSFPFIFLATDLTV 60
orf66 61 RIFGSHLARRIIFWVMFPALLLSYVFSVLFHNGSWTGLGALSEFNTFVGRIALASFAAYA 120
RIFG+ LARRIIF VM PALL+SYV S LF+ GSW G GAL+ FN FV RIA ASF AYA
o221 61 RIFGAPLARRIIFAVMIPALLISYVISSLFYMGSWQGFGALAHFNLFVARIATASFMAYA 120
orf66 121 IGQILDIFVFNKLRRLKAWWIAPNASTVIGHALDT 155
+GQILD+ VFN+LR+ + WW+AP AST+ G+ DT
o221 121 LGQILDVHVFNRLRQSRRWWLAPTASTLFGNVSDT 155
Homology with a predicted ORF from N. meningitidis (strain A)
ORF66 shows 96.1% identity over a 155-amino-acid overlap with an ORF (ORF66a) from strain A of N. meningitidis:
10 20 30 40 50 60
orf66.pep MYAFTAAQQQKALFRLVLFHILIIAASNYLVQFPFQIFGIHTTWGAFSFPFIFLATDLTV
|||||||||||||| |||||||||||||||||||||| ||||||||||||||||||||||orf66a MYAFTAAQQQKALFWLVLFHILIIAASNYLVQFPFQISGIHTTWGAFSFPFIFLATDLTV
10 20 30 40 50 60
70 80 90 100 110 120orf66.pep RIFGSHLARRIIFWVMFPALLLSYVFSVLFHNGSWTGLGALSEFNTFVGRIALASFAAYA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf66a RIFGSHLARRIIFWVMFPALLLSYVFSVLFHNGSWTGLGALSEFNTFVGRIALASFAAYA
70 80 90 100 110 120
130 140 150orf66.pep IGQILDIFVFNKLRRLKAWWIAPNASTVIGHALDT
:|||||||||||||||||||:||:||||||:||||orf66a LGQILDIFVFNKLRRLKAWWVAPTASTVIGNALDTLVFFAVAFYASSDGFMAANWQGIAF
130 140 150 160 170 180orf66a VDYLFKLTVCGLFFLPAYGVILNLLTKKLTTLQTKQAQDRPAPSLQNPX
190 200 210 220全长ORF66a核苷酸序列<SEQ ID 205>是:1 ATGTACGCAT TTACCGCCGC ACAGCAACAG AAGGCACTCT TCTGGCTGGT51 GCTTTTTCAT ATCCTCATCA TCGCCGCCAG CAACTATCTG GTGCAGTTCC101 CCTTCCAAAT TTCCGGCATC CACACCACTT GGGGCGCGTT TTCCTTTCCC151 TTCATCTTCC TCGCCACCGA CCTGACCGTC CGCATTTTCG GTTCGCACTT201 GGCACGGCGG ATTATCTTTT GGGTCATGTT CCCCGCCCTT TTGCTTTCCT251 ACGTCTTTTC CGTTTTGTTC CACAACGGCA GTTGGACGGG CTTGGGCGCG301 CTGTCCGAAT TCAACACCTT TGTCGGACGC ATCGCGCTGG CAAGTTTTGC351 CGCCTACGCG CTCGGACAAA TCCTTGATAT TTTTGTGTTC AACAAATTAC401 GCCGTCTGAA AGCGTGGTGG GTTGCCCCGA CTGCATCAAC CGTCATCGGC451 AACGCCTTAG ATACGTTGGT ATTTTTCGCC GTTGCCTTCT ACGCAAGCAG501 CGATGGATTT ATGGCGGCAA ACTGGCAGGG CATCGCTTTT GTCGATTACC551 TGTTCAAACT CACCGTCTGC GGTCTGTTTT TCCTGCCCGC CTACGGCGTG601 ATTCTGAATC TGCTGACGAA AAAACTGACG ACCCTGCAAA CCAAACAGGC651 GCAAGACCGC CCCGCGCCCT CGCTGCAAAA TCCGTAA它编码的蛋白质具有氨基酸序列<SEQ ID 206>:1 MYAFTAAQQQ KALFWLVLFH ILIIAASNYL VQFPFQISGI HTTWGAFSFP51 FIFLATDLTV RIFGSHLARR IIFWVMFPAL LLSYVFSVLF HNGSWTGLGA101 LSEFNTFVGR IALASFAAYA LGQILDIFVF NKLRRLKAWW VAPTASTVIG151 NALDTLVFFA VAFYASSDGF MAANWQGIAF VDYLFKLTVC GLFFLPAYGV201 ILNLLTKKLT TLQTKQAQDR PAPSLQNP*
ORF66a and ORF66-1 show 97.8% identity over a 228-amino-acid overlap:
10 20 30 40 50 60
orf66a.pep MYAFTAAQQQKALFWLVLFHILIIAASNYLVQFPFQISGIHTTWGAFSFPFIFLATDLTV
|||||||||||||| |||||||||||||||||||||| ||||||||||||||||||||||
orf66-1 MYAFTAAQQQKALFRLVLFHILIIAASNYLVQFPFQIFGIHTTWGAFSFPFIFLATDLTV
10 20 30 40 50 60
70 80 90 100 110 120
orf66a.pep RIFGSHLARRIIFWVMFPALLLSYVFSVLFHNGSWTGLGALSEFNTFVGRIALASFAAYA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf66-1 RIFGSHLARRIIFWVMFPALLLSYVFSVLFHNGSWTGLGALSEFNTFVGRIALASFAAYA
70 80 90 100 110 120
130 140 150 160 170 180
orf66a.pep LGQILDIFVFNKLRRLKAWWVAPTASTVIGNALDTLVFFAVAFYASSDGFMAANWQGIAF
:|||||||||||||||||||:|||||||||||||||||||||||||||||||||||||||
orf66-1 IGQILDIFVFNKLRRLKAWWIAPTASTVIGNALDTLVFFAVAFYASSDGFMAANWQGIAF
130 140 150 160 170 180
190 200 210 220 229
orf66a.pep VDYLFKLTVCGLFFLPAYGVILNLLTKKLTTLQTKQAQDRPAPSLQNPX
|||||||||| ||||||||||||||||||||||||||||||||||||||
orf66-1 VDYLFKLTVCTLFFLPAYGVILNLLTKKLTTLQTKQAQDRPAPSLQNPX
190 200 210 220
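The identity figure quoted for an alignment such as the one above is the number of columns with the same residue divided by the number of aligned columns. The sketch below computes it for two pre-aligned strings. The function name `percent_identity`, the `-` gap symbol and the choice to count single-gap columns in the overlap are assumptions, since the alignment program's exact convention is not given in the source.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-"):
    """Return (identical, overlap, percent) for two aligned sequences.

    Columns where both sequences carry a gap are skipped; columns with a gap
    in only one sequence count toward the overlap (assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = 0
    overlap = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        overlap += 1
        if a == b and a != gap:
            identical += 1
    if overlap == 0:
        raise ValueError("no aligned columns")
    return identical, overlap, 100.0 * identical / overlap


# Last row of the alignment (residues 181-228), without the terminal X.
orf66a = "VDYLFKLTVCGLFFLPAYGVILNLLTKKLTTLQTKQAQDRPAPSLQNP"
orf66_1 = "VDYLFKLTVCTLFFLPAYGVILNLLTKKLTTLQTKQAQDRPAPSLQNP"
identical, overlap, percent = percent_identity(orf66a, orf66_1)
print(f"{identical}/{overlap} identical ({percent:.1f}%)")
```

Over the full alignment the match lines mark five non-identical columns, so the same count gives 223 of 228 positions, which is consistent with the stated 97.8%.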
Homology with a predicted ORF from N. gonorrhoeae
ORF66 shows 94.2% identity over a 155-amino-acid overlap with a predicted ORF (ORF66.ng) from N. gonorrhoeae:
orf66.pep MYAFTAAQQQKALFRLVLFHILIIAASNYLVQFPFQIFGIHTTWGAFSFPFIFLATDLTV 60
|||:|||||||||||||||||||||||||||||||:||||||||||||||||||||||||
orf66ng MYALTAAQQQKALFRLVLFHILIIAASNYLVQFPFRIFGIHTTWGAFSFPFIFLATDLTV 60
orf66.pep RIFGSHLARRIIFWVMFPALLLSYVFSVLFHNGSWTGLGALSEFNTFVGRIALASFAAYA 120
|||||||||||||||||||| ||||||||||||||||||| |:|||||||||||||||||
orf66ng RIFGSHLARRIIFWVMFPALSLSYVFSVLFHNGSWTGLGAPSQFNTFVGRIALASFAAYA 120
orf66.pep IGQILDIFVFNKLRRLKAWWIAPNASTVIGHALDT 155
:|||||||||:|||||||||||| ||||||:||||
orf66ng LGQILDIFVFDKLRRLKAWWIAPAASTVIGNALDTLVFFAVAFYASSDEFMAANWQGIAF 180全长ORF66ng核苷酸序列<SEQ ID 207>是:1 ATGTACGCAT TGACCGCCGC ACAGCAACAG AAGGCACTCT TCCGGCTGGT51 GCTTTTCCAT ATCCTCATCA TCGCCGCCAG CAACTATCTG GTGCAGTTCC101 CCTTCCGGAT TTTCGGCATC CACACCACTT GGGGCGCGTT TTCCTTTCCC151 TTCATCTTCC TCGCCACCGA CCTGACCGTC CGCATTTTCG GTTCGCACTT201 GGCGCGGCGG ATTATCTTTT GGGTGATGTT CCCCGCCCTT ttgCTTTcat251 aCGTCTTTTC CGTTTTGTTC CACAACGGCA GTTGGACGGG CTTGGGCGCG301 ctgTCCCAAT TCAACACCTT TGTCGGACGC ATCGCGCTGG CAAGTTTTGC351 CGCCTACGCG CTCGGACAAA TCCTTGATAT TTTCGTATTC GACAAATTAC401 GCCGTCTGAA AGCGTGGTGG ATTGCCCCGG CCGCATCAAC CGTCATCGGC451 AATGCACTGG ACACGTTAGT ATTTTTTGCC GTTGCCTTTT ACGCAAGCAG501 CGATGAATTT ATGGCGGCAA ACTGGCAGGG CATCGCTTTT GTCGATTACC551 TGTTCAAACT TACCGTCTGC ACCCTCTTCT TCCTGCCCGC CTACGGCGTG601 ATACTGAATC TGCTGACGAA AAAACTGACG GCCCTGCAAA CCAAACAGGC651 GCAAGACCGC CCCGTGCCCT CGCTGCAAAA TCCGTAA它编码的蛋白质具有氨基酸序列<SEQ ID 208>:
1 MYALTAAQQQ KALFRLVLFH ILIIAASNYL VQFPFRIFGI HTTWGAFSFP
51 FIFLATDLTV RIFGSHLARR IIFWVMFPAL SLSYVFSVLF HNGSWTGLGA
101 PSQFNTFVGR IALASFAAYA LGQILDIFVF DKLRRLKAWW IAPAASTVIG
151 NALDTLVFFA VAFYASSDEF MAANWQGIAF VDYLFKLTVC TLFFLPAYGV
201 ILNLLTKKLT ALQTKQAQDR PVPSLQNP*
Another annotated sequence is:
1 MYALTAAQQQ KALFRLVLFH ILIIAASNYL VQFPFRIFGI HTTWGAFSFP
51 FIFLATDLTV RIFGSHLARR IIFWVMFPAL LLSYVFSVLF HNGSWTGLGA
101 LSQFNTFVGR IALASFAAYA LGQILDIFVF DKLRRLKAWW IAPAASTVIG
151 NALDTLVFFA VAFYASSDEF MAANWQGIAF VDYLFKLTVC TLFFLPAYGV
201 ILNLLTKKLT ALQTKQAQDR PVPSLQNP*
ORF66ng and ORF66-1 show 96.1% identity over a 228-amino-acid overlap:
orf66-1.pep MYAFTAAQQQKALFRLVLFHILIIAASNYLVQFPFQIFGIHTTWGAFSFPFIFLATDLTV 60
|||:|||||||||||||||||||||||||||||||:||||||||||||||||||||||||
orf66ng MYALTAAQQQKALFRLVLFHILIIAASNYLVQFPFRIFGIHTTWGAFSFPFIFLATDLTV 60
orf66-1.pep RIFGSHLARRIIFWVMFPALLLSYVFSVLFHNGSWTGLGALSEFNTFVGRIALASFAAYA 120
||||||||||||||||||||||||||||||||||||||||||:|||||||||||||||||
orf66ng RIFGSHLARRIIFWVMFPALLLSYVFSVLFHNGSWTGLGALSQFNTFVGRIALASFAAYA 120
orf66-1.pep IGQILDIFVFNKLRRLKAWWIAPTASTVIGNALDTLVFFAVAFYASSDGFMAANWQGIAF 180
:|||||||||:||||||||||||:|||||||||||||||||||||||| |||||||||||
orf66ng LGQILDIFVFDKLRRLKAWWIAPAASTVIGNALDTLVFFAVAFYASSDEFMAANWQGIAF 180
orf66-1.pep VDYLFKLTVCTLFFLPAYGVILNLLTKKLTTLQTKQAQDRPAPSLQNPX 229
||||||||||||||||||||||||||||||:||||||||||:|||||||
orf66ng VDYLFKLTVCTLFFLPAYGVILNLLTKKLTALQTKQAQDRPVPSLQNPX 229
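In addition, ORF66ng shows significant homology with an ORF from E. coli:
sp|P37619|YHHQ_ECOLI HYPOTHETICAL 25.3 KD PROTEIN IN FTSY-NIKA INTERGENIC REGION (O221)
>gi|1073495|pir||S47690 hypothetical protein o221 - Escherichia coli >gi|466607 (U00039) No definition line found [Escherichia coli] >gi|1789882 (AE000423) hypothetical 25.3 kD protein in ftsY-nikA intergenic region [Escherichia coli] Length = 221
Score = 273 bits (692), Expect = 5e-73
Identities = 132/203 (65%), Positives = 155/203 (76%)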
Query: 1 MYALTAAQQQKALFRLVLFHILIIAASNYLVQFPFRIFGIHTTWGAFSFPFIFLATDLTV 60
M + Q+ KALF L LFH+L+I +SNYLVQ P I G HTTWGAFSFPFIFLATDLTV
Sbjct: 1 MNVFSQTQRYKALFWLSLFHLLVITSSNYLVQLPVSILGFHTTWGAFSFPFIFLATDLTV 60
Query: 61 RIFGSHLARRIIFWVMFPALLLSYVFSVLFHNGSWTGLGALSQFNTFVGRIALASFAAYA 120
RIFG+ LARRIIF VM PALL+SYV S LF+ GSW G GAL+ FN FV RIA ASF AYA
Sbjct: 61 RIFGAPLARRIIFAVMIPALLISYVISSLFYMGSWQGFGALAHFNLFVARIATASFMAYA 120
Query: 121 LGQILDIFVFDKLRRLKAWWIAPAASTVIGNALDTLVFFAVAFYASSDEFMAANWQGIAF 180
LGQILD+ VF++LR+ + WW+AP AST+ GN DTL FF +AF+ S D FMA +W IA
Sbjct: 121 LGQILDVHVFNRLRQSRRWWLAPTASTLFGNVSDTLAFFFIAFWRSPDAFMAEHWMEIAL 180
Query: 181 VDYLFKLTVCTLFFLPAYGVILN 203
VDY FK+ + +FFLP YGV+LN
Sbjct: 181 VDYCFKVLISIVFFLPMYGVLLN 203
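In a BLAST-style report the middle line of each block carries the comparison: a letter marks an identical residue and `+` marks a conservative substitution, with positives counted as identities plus conservative substitutions. The sketch below tallies one midline and recomputes the summary percentages; the helper names `midline_counts` and `percent` are hypothetical, and rounding to the nearest whole percent is an assumption about how the report was formatted.

```python
def midline_counts(midline: str) -> tuple[int, int]:
    """Return (identities, positives) for one BLAST midline."""
    identities = sum(1 for ch in midline if ch.isalpha())  # letters = identical residues
    conservative = midline.count("+")                      # '+' = conservative substitution
    return identities, identities + conservative


def percent(count: int, length: int) -> int:
    """Whole-number percentage (rounding mode is an assumption)."""
    return round(100 * count / length)


# Midline of the last alignment block above (query/subject residues 181-203).
LAST_MIDLINE = "VDY FK+ + +FFLP YGV+LN"

print(midline_counts(LAST_MIDLINE))          # identities and positives in this block
print(percent(132, 203), percent(155, 203))  # summary-line percentages
```

Because only letters and `+` signs are counted, the tally is unaffected by the collapsed spacing of the midlines in this text.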
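Based on this analysis (including the homology with the E. coli protein and the presence of several putative transmembrane domains in the gonococcal protein), it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 32
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 209>: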
1 ATGGTCATAA AATATACAAA TTTGAATTTT GCGAAATTGT CGATAATTGC
51 AATTTTGATG ATGTATTCGT TTGAAGCGAA TGCAAAyGCA GTmwrAATAT
101 CTGAAACTGT TTCAGTTGAT ACCGGACAAG GTGCGAAAAT TCATAAGTTT
151 GTACCTAAAA ATAGTAAAAC TTATTCATCT GATTTAATAA AAACGGTAGA
201 TTTAACACAC AyyCCTACGG GCGCAAAAGC CCGAATCAAC GCCAAAATAA
251 CCGCCAGCGT ATCCCGCGCC GGCGTATTGG CGGGGGTCGG CAAACTTGCC
301 CGCTTAGgCG CGAAATTCAG CACAAGGGCG GTtCCCTATG TCGGAACAGC
351 CcTTTTAGCC CACGACGTAT ACGAAAcTTT CAAAGAAGAC ATACAGGCAC
401 GAGGCTACCA ATACGACCCC GAAACCGACA AATTTGTAAA AGGCTACGAA
451 TATAGTAATT GCCTTTGGTA CGAAGACAAA AGACGTATTA ATAGAACCTA
501 TGGCTGCTAC GGCGTTGAT..
This corresponds to the amino acid sequence <SEQ ID 210; ORF72>:
1 MVIKYTNLNF AKLSIIAILM MYSFEANANA VXISETVSVD TGQGAKIHKF
51 VPKNSKTYSS DLIKTVDLTH XPTGAKARIN AKITASVSRA GVLAGVGKLA
101 RLGAKFSTRA VPYVGTALLA HDVYETFKED IQARGYQYDP ETDKFVKGYE
151 YSNCLWYEDK RRINRTYGCY GVD..
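The partial DNA sequence contains IUPAC ambiguity codes (y, m, w, r), and the corresponding protein shows X only where the ambiguity changes the encoded residue. The sketch below illustrates that behaviour under stated assumptions: it uses the standard codon assignments (which bacterial translation table 11 shares), the function name `translate` is hypothetical, and it is not the software used to generate the listing.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

# IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
         "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT"}


def translate(dna: str) -> str:
    """Translate in frame 1; emit X when an ambiguous codon has several meanings."""
    dna = dna.upper()
    protein = []
    for pos in range(0, len(dna) - len(dna) % 3, 3):
        codon = dna[pos:pos + 3]
        # Expand every ambiguity code (unknown symbols are treated like N).
        expansions = product(*(IUPAC.get(base, "ACGT") for base in codon))
        residues = {CODON_TABLE["".join(c)] for c in expansions}
        protein.append(residues.pop() if len(residues) == 1 else "X")
    return "".join(protein)


# Nucleotides 85-102 of SEQ ID 209 (codons 29-34), including y, m, w and r.
print(translate("AAyGCAGTmwrAATATCT"))
```

Here `AAy` and `GTm` still resolve to single residues (N and V), whereas `wrA` does not, which matches the `NAVXIS` stretch of SEQ ID 210.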
Further work revealed the complete nucleotide sequence <SEQ ID 211>:
1 ATGGTCATAA AATATACAAA TTTGAATTTT GCGAAATTGT CGATAATTGC
51 AATTTTGATG ATGTATTCGT TTGAAGCGAA TGCAAATGCA GTAAAAATAT
101 CTGAAACTGT TTCAGTTGAT ACCGGACAAG GTGCGAAAAT TCATAAGTTT
151 GTACCTAAAA ATAGTAAAAC TTATTCATCT GATTTAATAA AAACGGTAGA
201 TTTAACACAC ATCCCTACGG GCGCAAAAGC CCGAATCAAC GCCAAAATAA
251 CCGCCAGCGT ATCCCGCGCC GGCGTATTGG CGGGGGTCGG CAAACTTGCC
301 CGCTTAGGCG CGAAATTCAG CACAAGGGCG GTTCCCTATG TCGGAACAGC
351 CCTTTTAGCC CACGACGTAT ACGAAACTTT CAAAGAAGAC ATACAGGCAC
401 GAGGCTACCA ATACGACCCC GAAACCGACA AATTTGCAAA GGTCTCAGGC
451 TAA
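The listings in this document are printed in rows of 50 residues, grouped in tens and prefixed with a position number. To reuse a listing it has to be joined back into one string. The following is a small sketch of that step; `parse_listing` is a hypothetical helper that assumes its input contains only listing rows (no sequence labels or prose).

```python
import re


def parse_listing(text: str) -> str:
    """Join the residue groups of a numbered sequence listing into one string."""
    # Keep runs of letters (and '*' for a stop); position numbers, spaces,
    # line breaks and the '.' padding of partial sequences are dropped.
    return "".join(re.findall(r"[A-Za-z*]+", text))


# Last two rows of SEQ ID 211 as printed above.
LISTING_TAIL = """401 GAGGCTACCA ATACGACCCC GAAACCGACA AATTTGCAAA GGTCTCAGGC
451 TAA"""

seq = parse_listing(LISTING_TAIL)
print(len(seq), seq[-3:])
```

The same pattern also copes with rows where a position number is fused onto the preceding group, since digits are simply skipped.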
This corresponds to the amino acid sequence <SEQ ID 212; ORF72-1>:
1 MVIKYTNLNF AKLSIIAILM MYSFEANANA VKISETVSVD TGQGAKIHKF
51 VPKNSKTYSS DLIKTVDLTH IPTGAKARIN AKITASVSRA GVLAGVGKLA
101 RLGAKFSTRA VPYVGTALLA HDVYETFKED IQARGYQYDP ETDKFAKVSG
151 *
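Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF72 shows 98.0% identity over a 147-amino-acid overlap with an ORF (ORF72a) from strain A of N. meningitidis: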
10 20 30 40 50 60
orf72.pep MVIKYTNLNFAKLSIIAILMMYSFEANANAVXISETVSVDTGQGAKIHKFVPKNSKTYSS
||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||||
orf72a MVIKYTNLNFAKLSIIAILMMYSFEANANAVKISETVSVDTGQGAKIHKFVPKNSKTYSS
10 20 30 40 50 60
70 80 90 100 110 120
orf72.pep DLIKTVDLTHXPTGAKARINAKITASVSRAGVLAGVGKLARLGAKFSTRAVPYVGTALLA
|||||||||| |||||||||||||||||||||||||||||||||||||||||||||||||
orf72a DLIKTVDLTHIPTGAKARINAKITASVSRAGVLAGVGKLARLGAKFSTRAVPYVGTALLA
70 80 90 100 110 120
130 140 150 160 170
orf72.pep HDVYETFKEDIQARGYQYDPETDKFVKGYEYSNCLWYEDKRRINRTYGCYGVD
|||||||||||||||||||||||||:|
orf72a HDVYETFKEDIQARGYQYDPETDKFAKVSGX
130 140 150
The complete length ORF72a nucleotide sequence <SEQ ID 213> is:
1 ATGGTCATAA AATATACAAA TTTGAATTTT GCGAAATTGT CGATAATTGC
51 AATTTTGATG ATGTATTCGT TTGAAGCGAA TGCAAATGCA GTAAAAATAT
101 CTGAAACTGT TTCAGTTGAT ACCGGACAAG GTGCGAAAAT TCATAAGTTT
151 GTACCTAAAA ATAGTAAAAC TTATTCATCT GATTTAATAA AAACGGTAGA
201 TTTAACACAC ATCCCTACGG GCGCAAAAGC CCGAATCAAC GCCAAAATAA
251 CCGCCAGCGT ATCCCGCGCC GGCGTATTGG CGGGGGTCGG CAAACTTGCC
301 CGCTTAGGCG CGAAATTCAG CACAAGGGCG GTTCCCTATG TCGGAACAGC
351 CCTTTTAGCC CACGACGTAT ACGAAACTTT CAAAGAAGAC ATACAGGCAC
401 GAGGCTACCA ATACGACCCC GAAACCGACA AATTTGCAAA GGTCTCAGGC
451 TAA
This encodes a protein having the amino acid sequence <SEQ ID 214>:
1 MVIKYTNLNF AKLSIIAILM MYSFEANANA VKISETVSVD TGQGAKIHKF
51 VPKNSKTYSS DLIKTVDLTH IPTGAKARIN AKITASVSRA GVLAGVGKLA
101 RLGAKFSTRA VPYVGTALLA HDVYETFKED IQARGYQYDP ETDKFAKVSG
151 *
ORF72a and ORF72-1 show 100.0% identity over a 150-amino-acid overlap:
10 20 30 40 50 60
orf72a.pep MVIKYTNLNFAKLSIIAILMMYSFEANANAVKISETVSVDTGQGAKIHKFVPKNSKTYSS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf72-1 MVIKYTNLNFAKLSIIAILMMYSFEANANAVKISETVSVDTGQGAKIHKFVPKNSKTYSS
10 20 30 40 50 60
70 80 90 100 110 120
orf72a.pep DLIKTVDLTHIPTGAKARINAKITASVSRAGVLAGVGKLARLGAKFSTRAVPYVGTALLA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf72-1 DLIKTVDLTHIPTGAKARINAKITASVSRAGVLAGVGKLARLGAKFSTRAVPYVGTALLA
70 80 90 100 110 120
130 140 150
orf72a.pep HDVYETFKEDIQARGYQYDPETDKFAKVSGX
|||||||||||||||||||||||||||||||
orf72-1 HDVYETFKEDIQARGYQYDPETDKFAKVSGX
130 140 150
Homology with a predicted ORF from N. gonorrhoeae
ORF72 shows 89% identity over a 173-amino-acid overlap with a predicted ORF (ORF72.ng) from N. gonorrhoeae:
orf72.pep MVIKYTNLNFAKLSIIAILMMYSFEANANAVXISETVSVDTGQGAKIHKFVPKNSKTYSS 60
|| |:|||||||||||||||||||||||||| ||||:|||||||||:||||||:|: |||
orf72ng MVTKHTNLNFAKLSIIAILMMYSFEANANAVKISETLSVDTGQGAKVHKFVPKSSNIYSS 60
orf72.pep DLIKTVDLTHXPTGAKARINAKITASVSRAGVLAGVGKLARLGAKFSTRAVPYVGTALLA 120
|| |:||||| ||||||||||||||||||||||:|||||:| ||||:|||||||||||||
orf72ng DLTKAVDLTHIPTGAKARINAKITASVSRAGVLSGVGKLVRQGAKFGTRAVPYVGTALLA 120
orf72.pep HDVYETFKEDIQARGYQYDPETDKFVKGYEYSNCLWYEDKRRINRTYGCYGVD 173
||||||||||||||| :||||||||||||||:|||||||:|||||||||||||
orf72ng HDVYETFKEDIQARGCRYDPETDKFVKGYEYANCLWYEDERRINRTYGCYGVDSSIMRLM 180
预计ORF72ng核苷酸序列<SEQ ID 215>编码的蛋白质具有氨基酸序列<SEQ ID216>: 1 MVTKHTNLNF AKLSIIAILM MYSFEANANA VKISETLSVD TGQGAKVHKF51 VPKSSNIYSS DLTKAVDLTH IPTGAKARIN AKITASVSRA GVLSGVGKLV101 RQGAKFGTRA VPYVGTALLA HDVYETFKED IQARGCRYDP ETDKFVKGYE151 YANCLWYEDE RRINRTYGCY GVDSSIMRLM PDRSRFPEVK QLMESQMYRL201 ARPFWNWRKE ELNKLSSLDW NNFVLNRCTF DWNGGGCAVN KGDDFRAGAS251 FSLGRNPKYK EEMDAKKPEE ILSLKVDADP DKYIEATGYP GYSEKVEVAP301 GTKVNMGPVT DRNGNPVQVA ATFGRDAQGN TTADVQVIPR PDLTPASAEA351 PHAQPLPEVS PAENPANNPD PDENPGTRPN PEPDPDLNPD ANPDTDGQPG401 TSPDSPAVPD RPNGRHRKER KEGEDGGLSC DYFPEILACQ EMGKPSDRMF451 HDISIPQVTD DKTWSSHNFL PSNGVCPQPK TFHVFGRQYR ASYEPLCVFA501 EKIRFAVLLA FIIMSAFVVF GSLGGE*在进一步分析后,鉴定出下列淋球菌DNA序列<SEQ ID 217>:1 ATGGTCACAA AACATACAAA TTTGAATTTT GCGAAATTGT CGATAATTGC51 AATTTTGATG ATGTATTCGT TTGAAGCGAA TGCAAATGCA GTAAAAATAT101 CTGAAACTCT TTCGGTTGAT ACCGGACAAG GCGCGAAAGT TCATAAGTTC151 GTTCCTAAAT CAAGTAATAT TTATTCATCT GATTTAACAA AAGCGGTAGA201 TTTAACGCAT ATCCCCACGG GCGCAAAAGC CCGAATCAAC GCCAAAATAA251 CCGCCAGCGT ATCCCGCGCC GGCGTATTGT CGGGGGTCGG CAAACTTGTC301 CGCCAAGGCG CGAAATTCGG CACAAGGGCG GTTCCCTATG TCGGAACAGC351 CCTTTTAGCC CACGACGTAT ACGAAACTTT CAAAGAAGAC ATACAGGCAC401 GAGGCTGCCG ATACGATCCC GAAACCGACA AATTT它对应于氨基酸序列<SEQ ID 218;ORF72ng-1>:1 MVTKHTNLNF AKLSIIAILM MYSFEANANA VKISETLSVD TGQGAKVHKF51 VPKSSNIYSS DLTKAVDLTH IPTGAKARIN AKITASVSRA GVLSGVGKLV101 RQGAKFGTRA VPYVGTALLA HDVYETFKED IQARGCRYDP ETDKF
ORF72ng-1 and ORF72-1 show 89.7% identity over a 145-amino-acid overlap:
10 20 30 40 50 60
orf72ng-1.pe MVTKHTNLNFAKLSIIAILMMYSFEANANAVKISETLSVDTGQGAKVHKFVPKSSNIYSS
|| |:|||||||||||||||||||||||||||||||:|||||||||:||||||:|: |||
orf72-1 MVIKYTNLNFAKLSIIAILMMYSFEANANAVKISETVSVDTGQGAKIHKFVPKNSKTYSS
10 20 30 40 50 60
70 80 90 100 110 120
orf72ng-1.pe DLTKAVDLTHIPTGAKARINAKITASVSRAGVLSGVGKLVRQGAKFGTRAVPYVGTALLA
|| |:|||||||| || ||||||||||||||||:|||||:| ||||:|||||||||||||
orf72-1 DLIKTVDLTHIPTGAKARINAKITASVSRAGVLAGVGKLARLGAKFSTRAVPYVGTALLA
70 80 90 100 110 120
130 140
orf72ng-1.pe HDVYETFKEDIQARGCRYDPETDKF
||||||||||||||| :||||||||
orf72-1 HDVYETFKEDIQARGYQYDPETDKFAKVSGX
130 140 150
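Based on this analysis (including the presence of a putative leader sequence and several transmembrane domains in the gonococcal protein), it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 33
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 219>: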
1 ATGAGATTTT TCGGTATCGG TTTTTTGGTG CTGCTGTTTT TGGAGATTAT
51 GTCGATTGTG TGGGTTGCCG ATTGGCTGGG CGGCGGCTGG ACGTTGTTTT
101 TGATGGCGGC AGGTTTTGCC GCCGGCGTGC TGATGCTCAG GCAAACCGGG
151 GCTGACCGGT CTTTTATTGG CGGGCGGGGC AATGAGAAGC GGCGGGAAGG
201 TATCCGTTTA TCAGATGTTG TGGCCTATC..
This corresponds to the amino acid sequence <SEQ ID 220; ORF73>:
1 MRFFGIGFLV LLFLEIMSIV WVADWLGGGW TLFLMAAGFA AGVLMLRQTG
51 LTGLLLAGAA MRSGGKVSVY QMLWPI..
Further work revealed the complete nucleotide sequence <SEQ ID 221>:
1 ATGAGATTTT TCGGTATCGG TTTTTTGGTG CTGCTGTTTT TGGAGATTAT
51 GTCGATTGTG TGGGTTGCCG ATTGGCTGGG CGGCGGCTGG ACGTTGTTTT
101 TGATGGCGGC AGGTTTTGCC GCCGGCGTGC TGATGCTCAG GCATACGGGG
151 CTGTCCGGTC TTTTATTGGC GGGCGCGGCA ATGAGAAGCG GCGGGAGGGT
201 ATCCGTTTAT CAGATGTTGT GGCCTATCCG TTATACGGTG GCGGCTGTGT
251 GTCTGATGAG TCCGGGATTC GTATCCTCGG TGTTGGCGGT ATTGCTGCTG
301 CTGCCGTTTA AGGGAGGGGC AGTGTTGCAG GCAGGAGGTG CGGAAAATTT
351 TTTCAACATG AACCAATCGG GCAGAAAAGA GGGCTTTTCC CGCGATGACG
401 ATATTATCGA GGGAGAATAT ACGGTTGAAG AGCCTTACGG CGGCAATCGT
451 TCCCGAAACG CCATCGAACA CAAAAAAGAC GAATAA
This corresponds to the amino acid sequence <SEQ ID 222; ORF73-1>:
1 MRFFGIGFLV LLFLEIMSIV WVADWLGGGW TLFLMAAGFA AGVLMLRHTG
51 LSGLLLAGAA MRSGGRVSVY QMLWPIRYTV AAVCLMSPGF VSSVLAVLLL
101 LPFKGGAVLQ AGGAENFFNM NQSGRKEGFS RDDDIIEGEY TVEEPYGGNR
151 SRNAIEHKKD E*
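Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF73 shows 90.8% identity over a 76-amino-acid overlap with an ORF (ORF73a) from strain A of N. meningitidis: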
10 20 30 40 50 60
orf73.pep MRFFGIGFLVLLFLEIMSIVWVADWLGGGWTLFLMAAGFAAGVLMLRQTGLTGLLLAGAA
||||||||||||||||||||||||||||||||||||| |||||:|||:|||:||||||||
orf73a MRFFGIGFLVLLFLEIMSIVWVADWLGGGWTLFLMAATFAAGVVMLRHTGLSGLLLAGAA
10 20 30 40 50 60
70
orf73.pep MRSGGKVSVYQMLWPI
|||||:|||| ||| |
orf73a MRSGGRVSVYXMLWXIRYTVAAVCXMSPGFVSSVXAVLLXLPFKGGAVLQAGGAENFFNM
The complete length ORF73a nucleotide sequence <SEQ ID 223> is:
1 ATGAGATTTT TCGGTATCGG TTTTTTGGTG CTGCTGTTTT TGGAGATTAT
51 GTCGATTGTG TGGGTTGCCG ATTGGTTGGG CGGCGGTTGG ACGCTGTTTC
101 TAATGGCGGC AACCTTTGCC GCCGGCGTGG TGATGCTCAG GCATACGGGG
151 CTGTCCGGTC TTTTATTGGC GGGCGCGGCA ATGAGAAGCG GCGGGAGGGT
201 ATCCGTTTAT CANATGTTGT GGCNTATCCG TTATACGGTG GCGGCGGTGT
251 GTCNGATGAG TCCGGGATTC GTATCCTCGG TGTNGGCGGT ATTGCTGNTG
301 CTNCCGTTTA AGGGAGGTGC AGTGTTGCAG GCAGGAGGTG CGGAAAATTT
351 TTTCAACATG AACCANTCGG GCAGAAAAGA NGGCNTTTCC CGCGATGACG
401 ATATTATCGA GGGGGAATAT ACGGTTGAAG ANCCTTACGG CGGCANTCGT
451 TTCCGAAACG CCNTNGAACA CAAAAAAGAC GAATAA
This encodes a protein having the amino acid sequence <SEQ ID 224>:
1 MRFFGIGFLV LLFLEIMSIV WVADWLGGGW TLFLMAATFA AGVVMLRHTG
51 LSGLLLAGAA MRSGGRVSVY XMLWXIRYTV AAVCXMSPGF VSSVXAVLLX
101 LPFKGGAVLQ AGGAENFFNM NXSGRKXGXS RDDDIIEGEY TVEXPYGGXR
151 FRNAXEHKKD E*
ORF73a and ORF73-1 show 91.3% identity over a 161-amino-acid overlap:
180 190 200 210 220 230
220 230 240 250 260orf1.pep ----RESSYH----IA-----SGSPMFIYDAQKQKWLINGVLQTGNPYIGKSNGFQLVRK
|::: : || ||||||||| ::|||:||||||| || |: |||||:||orf1a SGDVRHANDYGPMPIAGAAGDSGSPMFIYDKTNNKWLLNGVLQTGYPYSGRENGFQLIRK
240 250 260 270 280 290
270 280 290 300 310 320orf1.pep DWFYDEIFAGDTHSVFYEPRQNGKYSFNDDNNGTGKINAKHEHNSLPNRLKTRTVQLFNV
|||||:|: ||||:| :|||:||::||:::||||| :: :|: | | :||::||:||:orf1a DWFYDDIYRGDTHTVXFEPRSNGHFSFTSNNNGTGTVTETNEKVSNP-KLKVQTVRLFDE
300 310 320 330 340 350
330 340 350 360 370 380orf1.pep SLSETAREPVYHAAGGVNSYRPRLNNGENISFIDEGKGELILTSNINQGAGGLYFQGDFT
||:|| :|||| ||||:|:||||||||||:|||| |:|:|||::|||||||||||:||||orf1a SLNETDKEPVY-AAGGVNQYRPRLNNGENLSFIDYGNGKLILSNNINQGAGGLYFEGDFT
360 370 380 390 400 410
390 400 410 420 430orf1.pep VSPENNETWQGAGVHISEDSTVTWKVNGVANDRLSKIGKGTL------------------
||||||||||||||||||||||||||||||||||||||||||orf1a VSPENNETWQGAGVHISEDSTVTWKVNGVANDRLSKIGKGTLHVQAKGENQGSISVGDGT
420 430 440 450 460 470orf1pep ------------------------------------------------------------orf1a VILDQQADDKGKKQAFSEIGLXSGRGTVQLNADNQFNPDKLYFGFRGGRLDLNGHSLSFH
480 490 500 510 520 530orf1.pep ------------------------------------------------------------orf1a RIQNTDEGAMIXXHNATTTSTVTITGNESITQPSGKNINRLNYSKEIAYNGWFGEKDTTK
540 550 560 570 580 590orf1.pep ------------------------------------------------------------orf1a TNGRLNLVYQPAAEDRTXLLSGGTNLNGNITQTNGKLFFSGRPTPHAYNHLGSGWSKMEG
600 610 620 630 640 650orf1.pep -----------------------------------------------------------orf1a IPQGEIVWDNDWIXRTFKAENFHIQGGQAVISRNVAKVEGDXHLSNHAQAVFGVAPHQSH
660 670 680 690 700 710
440 450 460 470 480orf1.pep ----------------XXXXXDKVTASLTKTDISGNVDLADHAHLNLTGLATLNGNLSAN
: || : ||| ||||||| || | : |:| ||||||orf1a TICTRSDWTGLTNCVEXXITDDKVIASLTKTDXSGXVXLXXXXXXXLXGXAXLXGNLSAN
720 730 740 750 760 770
490 500 510 520 530 540orf1.pep GDTRYTVSHNATQNGNXSLVXNAQATFNQATLNGNTSASGNASFNLSDHAVQNGSLTLSG
|||||||||||||||| ||| ||||||||||||||:| |||||||||::|:|||||||||orf1a GDTRYTVSHNATQNGNLSLVGNAQATFNQATLNGNXSXSGNASFNLSNNAAQNGSLTLSD
780 790 800 810 820 830
10 20 30 40 50 60
orf73a.pep MRFFGIGFLVLLFLEIMSIVWVADWLGGGWTLFLMAATFAAGVVMLRHTGLSGLLLAGAA
||||||||||||||||||||||||||||||||||||| |||||:||||||||||||||||
orf73-1 MRFFGIGFLVLLFLEIMSIVWVADWLGGGWTLFLMAAGFAAGVLMLRHTGLSGLLLAGAA
10 20 30 40 50 60
70 80 90 100 110 120
orf73a.pep MRSGGRVSVYXMLWXIRYTVAAVCXMSPGFVSSVXAVLLXLPFKGGAVLQAGGAENFFNM
|||||||||| ||| ||||||||| ||||||||| |||| ||||||||||||||||||||
orf73-1 MRSGGRVSVYQMLWPIRYTVAAVCLMSPGFVSSVLAVLLLLPFKGGAVLQAGGAENFFNM
70 80 90 100 110 120
130 140 150 160
orf73a.pep NXSGRKXGXSRDDDIIEGEYTVEXPYGGXRFRNAXEHKKDEX
| |||| | |||||||||||||| |||| | ||| |||||||
orf73-1 NQSGRKEGFSRDDDIIEGEYTVEEPYGGNRSRNAIEHKKDEX
130 140 150 160
Homology with a predicted ORF from N. gonorrhoeae
ORF73 shows 92.1% identity over a 76-amino-acid overlap with a predicted ORF (ORF73.ng) from N. gonorrhoeae:
orf73.pep MRFFGIGFLVLLFLEIMSIVWVADWLGGGWTLFLMAAGFAAGVLMLRQTGLTGLLLAGAA 60
||||||||||||||||||||||||||||||||||||| |||||||||:|||:||||||||
orf73ng MRFFGIGFLVLLFLEIMSIVWVADWLGGGWTLFLMAATFAAGVLMLRHTGLSGLLLAGAA 60
orf73.pep MRSGGKVSVYQMLWPI 76
::|:||||||||||||
orf73ng VKSSGKVSVYQMLWPIRYTVAAVCLMSPGFVSSVLAVLLLLPFKGGAVLQAGGAENFFNM 120
The complete length ORF73ng nucleotide sequence <SEQ ID 225> is:
1 ATGAGATTTT TCGGTATCGG TTTTTTGGTG CTGCTGTTTT TGGAAATTAT
51 GTCGATTGTG TGGGTTGCCG ATTGGCTGGG CGGCGGTTGG AcgcTGTTTC
101 TAATGGCGGC AACCTTTGCC GCCGGTGTGC TGATGCTCAG GCATAcggGG
151 CTGTCCGGTC TTTTATTGGC TGGCGCGGCG GTAAAAagta gtgGGAAGGT
201 ATCTGTTTAT CagatgtTGT GGCCTATCCG TTATAcggtg gcggcggtgT
251 GTCTGatgag tCcggGATTC GTATCCTccg tgttggCGGT ATTGCTGCTG
301 CTGCcgttta aggGaggGgc agtgttgcag gcaggaggtg cggaaaATTT
351 TTTCAACATg aaCcaatcgg gcagaaAaga gggatttttc cacgatgacg
401 atattatcga gggagaatat acggttgaaa aacctgacgg cggcaatcgt
451 tcccgaAAcg ccatcgaaca cgaaaAagac gaataA
This encodes a protein having the amino acid sequence <SEQ ID 226>:
1 MRFFGIGFLV LLFLEIMSIV WVADWLGGGW TLFLMAATFA AGVLMLRHTG
51 LSGLLLAGAA VKSSGKVSVY QMLWPIRYTV AAVCLMSPGF VSSVLAVLLL
101 LPFKGGAVLQ AGGAENFFNM NQSGRKEGFF HDDDIIEGEY TVEKPDGGNR
151 SRNAIEHEKD E*
ORF73ng and ORF73-1 show 93.8% identity over a 161-amino-acid overlap:
10 20 30 40 50 60orf73-1.pep MRFFGIGFLVLLFLEIMSIVWVADWLGGGWTLFLMAAGFAAGVLMLRHTGLSGLLLAGAA
||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||||orf73ng MRFFGIGFLVLLFLEIMSIVWVADWLGGGWTLFLMAATFAAGVLMLRHTGLSGLLLAGAA
10 20 30 40 50 60
70 80 90 100 110 120orf73-1.pep MRSGGRVSVYQMLWPIRYTVAAVCLMSPGFVSSVLAVLLLLPFKGGAVLQAGGAENFFNM
::|:|:||||||||||||||||||||||||||||||||||||||||||||||||||||||orf73ng VKSSGKVSVYQMLWPIRYTVAAVCLMSPGFVSSVLAVLLLLPFKGGAVLQAGGAENFFNM
70 80 90 100 110 120
130 140 150 160
orf73-1.pep NQSGRKEGFSRDDDIIEGEYTVEEPYGGNRSRNAIEHKKDEX
||||||||| :||||||||||||:| |||||||||||:||||
orf73ng NQSGRKEGFFHDDDIIEGEYTVEKPDGGNRSRNAIEHEKDEX
130 140 150 160
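Based on this analysis (including the presence of a putative leader sequence and a putative transmembrane domain in the gonococcal protein), it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.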
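The text does not state which program predicted the leader sequence and transmembrane domain. As an illustration only, the sketch below substitutes a Kyte-Doolittle sliding-window hydropathy average, a common first-pass screen for membrane-spanning segments. The function names, the 19-residue window and the 1.6 threshold are assumptions following the usual Kyte-Doolittle convention, not parameters taken from this document.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 19) -> list[float]:
    """Mean hydropathy of every full window along the sequence."""
    # Unknown residues (e.g. X) contribute a neutral score of 0.0.
    scores = [KYTE_DOOLITTLE.get(res, 0.0) for res in protein]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def has_hydrophobic_segment(protein: str, window: int = 19,
                            threshold: float = 1.6) -> bool:
    """True if any window average reaches the threshold."""
    return any(v >= threshold for v in hydropathy_profile(protein, window))


N_TERMINUS = "MRFFGIGFLVLLFLEIMSIV"          # residues 1-20 of ORF73-1
HYDROPHILIC_REGION = "NQSGRKEGFSRDDDIIEGEY"  # residues 121-140 of ORF73-1

print(has_hydrophobic_segment(N_TERMINUS))
print(has_hydrophobic_segment(HYDROPHILIC_REGION))
```

A strongly hydrophobic N-terminal window of this kind is compatible with either a signal peptide or a membrane anchor; distinguishing the two needs a dedicated predictor.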
Example 34
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 227>:1 ATGTTTGTTT TTCAGACGGC ATTCTT.ATG TTTCAGAAAC ATTTGCAGAA51 AGCCTCCGAC AGCGTCGTCG GAGGGACATT ATACGTGGTT GCCACGCCCA101 TCGGCAATTT GGCGGACATT ACCCTGCGCG CTTTGGCGGT ATTGCAAAAG151 GCG....... .....GCCGA AGACACGCGC GTTACCGCAC AGCTTTTGAG201 CGCGTACGGC ATTCAGGGCA AACTCGTCAG TGTGCGCGAA CACAACGAAC251 GGCAGATGGC GGACAAGATT GTCGGCTATC TTTCAGACGG CATGGTTGTG301 GCACAGGTTT CCGATGCGGG TACGCCGGCC GTGTGCGACC CGGGCGCGAA351 ACTCGCCCGC CGCGTGCGTG AGGCCGGGTT TAAAGTCGTT CCCGTCGTGG401 GCGCAAC.GC GGTGATGGCG GCTTTGAGCG TGGCCGGTGT GGAAGGATCC451 GATTTTTATT TCAACGGTTT TGTACCGCCG AAATCGGGAG AACGCAGGAA501 ACTGTTTGCC AAATGGGTGC GGGCGGCGTT TCCTATCGTC ATGTTTGAAA551 CGCCGCACCG CATCGGTGCA GCGCTTGCCG ATATGGCGGA ACTGTTCCCC601 GAACGCCGAT TAATGCTGGC GCGCGAAATT ACGAAAACGT TTGAAACGTT651 CTTAAGCGGC ACGGTTGGGG AAATTCAGAC GGCATTGTCT GCCGACGGCG701 ACCAATCGCG CGGCGAGATG GTGTTGGTGC TTTATCCGGC GCAGGATGAA751 AAACACGAAG GCTTGTCCGA GTCCGCGCAA AACATCATGA AAATCCTCAC801 AGCCGAGCTG CCGACCAAAC AGGCGGCGGA GCTTGCTGCC AAAATCACGG851 GCGAGGGAAA GAAAGCTTTG TACGAT..它对应于氨基酸序列<SEQ ID 228;ORF75>:1 MFVFQTAFXM FQKHLQKASD SVVGGTLYVV ATPIGNLADI TLRALAVLQK51 A....AEDTR VTAQLLSAYG IQGKLVSVRE HNERQMADKI VGYLSDGMVV101 AQVSDAGTPA VCDPGAKLAR RVREAGFKVV PVVGAXAVMA ALSVAGVEGS151 DFYFNGFVPP KSGERRKLFA KWVRAAFPIV MFETPHRIGA ALADMAELFP201 ERRLMLAREI TKTFETFLSG TVGEIQTALS ADGDQSRGEM VLVLYPAQDE251 KHEGLSESAQ NIMKILTAEL PTKQAAELAA KITGEGKKAL YD..进一步的工作揭示了完整的核苷酸序列<SEQ ID 229>:1 ATGTTTCAGA AACATTTGCA GAAAGCCTCC GACAGCGTCG TCGGAGGGAC51 ATTATACGTG GTTGCCACGC CCATCGGCAA TTTGGCGGAC ATTACCCTGC101 GCGCTTTGGC GGTATTGCAA AAGGCGGACA TCATCTGTGC CGAAGACACG151 CGCGTTACCG CACAGCTTTT GAGCGCGTAC GGCATTCAGG GCAAACTCGT201 CAGTGTGCGC GAACACAACG AACGGCAGAT GGCGGACAAG ATTGTCGGCT251 ATCTTTCAGA CGGCATGGTT GTGGCACAGG TTTCCGATGC GGGTACGCCG301 GCCGTGTGCG ACCCGGGCGC GAAACTCGCC CGCCGCGTGC GTGAGGCCGG351 GTTTAAAGTC GTTCCCGTCG TGGGCGCAAG CGCGGTGATG GCGGCTTTGA401 GCGTGGCCGG TGTGGAAGGA TCCGATTTTT ATTTCAACGG TTTTGTACCG451 CCGAAATCGG GAGAACGCAG 
GAAACTGTTT GCCAAATGGG TGCGGGCGGC501 GTTTCCTATC GTCATGTTTG AAACGCCGCA CCGCATCGGT GCGACGCTTG551 CCGATATGGC GGAACTGTTC CCCGAACGCC GATTAATGCT GGCGCGCGAA601 ATTACGAAAA CGTTTGAAAC GTTCTTAAGC GGCACGGTTG GGGAAATTCA651 GACGGCATTG TCTGCCGACG GCAACCAATC GCGCGGCGAG ATGGTGTTGG701 TGCTTTATCC GGCGCAGGAT GAAAAACACG AAGGCTTGTC CGAGTCCGCG751 CAAAACATCA TGAAAATCCT CACAGCCGAG CTGCCGACCA AACAGGCGGC801 GGAGCTTGCT GCCAAAATCA CGGGCGAGGG AAAGAAAGCT TTGTACGATC851 TGGCTCTGTC TTGGAAAAAC AAATAG它对应于氨基酸序列<SEQ ID 230;ORF75-1>:1 MFQKHLQKAS DSVVGGTLYV VATPIGNLAD ITLRALAVLQ KADIICAEDT51 RVTAQLLSAY GIQGKLVSVR EHNERQMADK IVGYLSDGMV VAQVSDAGTP101 AVCDPGAKLA RRVREAGFKV VPVVGASAVM AALSVAGVEG SDFYFNGFVP151 PKSGERRKLF AKWVRAAFPI VMFETPHRIG ATLADMAELF PERRLMLARE201 ITKTFETFLS GTVGEIQTAL SADGNQSRGE MVLVLYPAQD EKHEGLSESA251 QNIMKILTAE LPTKQAAELA AKITGEGKKA LYDLALSWKN K*
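Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF75 shows 95.8% identity over a 283-amino-acid overlap with an ORF (ORF75a) from strain A of N. meningitidis: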
10 20 30 40 50 60orf75.pep MFVFQTAFXMFQKHLQKASDSVVGGTLYVVATPIGNLADITLRALAVLQKAXXXXAEDTR
|||||||||||||||||||||||||||||||||||||||||| |||||orf75a MFQKHLQKASDSVVGGTLYVVATPIGNLADITLRALAVLQKADIICAEDTR
10 20 30 40 50
70 80 90 100 110 120orf75.pep VTAQLLSAYGIQGKLVSVREHNERQMADKIVGYLSDGMVVAQVSDAGTPAVCDPGAKLAR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf75a VTAQLLSAYGIQGKLVSVREHNERQMADKIVGYLSDGMVVAQVSDAGTPAVCDPGAKLAR
60 70 80 90 100 110
130 140 150 160 170 180orf75.pep RVREAGFKVVPVVGAXAVMAALSVAGVEGSDFYFNGFVPPKSGERRKLFAKWVRAAFPIV
||||:|||||||||| ||||||||||| ||||||||||||||||||||||||||:|||:|orf75a RVREVGFKVVPVVGASAVMAALSVAGVAGSDFYFNGFVPPKSGERRKLFAKWVRVAFPVV
120 130 140 150 160 170
190 200 210 220 230 240orf75.pep MFETPHRIGAALADMAELFPERRLMLAREITKTFETFLSGTVGEIQTALSADGDQSRGEM
||||||||||:||||||||||||||||||||||||||||||||||||||:|||:||||||orf75a MFETPHRIGATLADMAELFPERRLMLAREITKTFETFLSGTVGEIQTALAADGNQSRGEM
180 190 200 210 220 230
250 260 270 280 290orf75.pep VLVLYPAQDEKHEGLSESAQNIMKILTAELPTKQAAELAAKITGEGKKALYD
||||||||||||||||||||||||||||||||||||||||||||||||||||orf75a VLVLYPAQDEKHEGLSESAQNIMKILTAELPTKQAAELAAKITGEGKKALYDLALSWKNK
240 250 260 270 280 290orf75a X全长ORF75a核苷酸序列<SEQ ID 231>是:1 ATGTTTCAGA AACATTTGCA GAAAGCCTCC GACAGCGTCG TCGGAGGGAC51 ATTATACGTG GTTGCCACGC CCATCGGCAA TTTGGCGGAC ATTACCCTGC101 GCGCTTTGGC GGTATTGCAA AAGGCGGACA TCATCTGTGC CGAAGACACG151 CGCGTTACCG CGCAGCTTTT GAGCGCGTAC GGCATTCAGG GCAAACTCGT201 CAGCGTGCGC GAACACAACG AACGGCAGAT GGCGGACAAG ATTGTCGGCT251 ATCTTTCAGA CGGCATGGTT GTGGCACAGG TTTCCGATGC GGGTACGCCG301 GCCGTGTGCG ACCCGGGCGC GAAACTCGCC CGCCGCGTGC GTGAGGTCGG351 GTTTAAAGTT GTCCCTGTTG TCGGCGCAAG CGCGGTGATG GCGGCTTTGA401 GTGTGGCTGG TGTGGCGGGA TCCGATTTTT ATTTCAACGG TTTTGTACCG451 CCGAAATCGG GCGAACGTAG GAAATTGTTT GCCAAATGGG TGCGGGTGGC501 GTTTCCCGTC GTGATGTTTG AAACGCCGCA CCGCATCGGG GCGACGCTTG551 CCGATATGGC GGAACTGTTC CCCGAACGCC GATTAATGCT GGCGCGCGAA601 ATCACGAAAA CGTTTGAAAC GTTCTTAAGC GGCACGGTTG GGGAAATTCA651 GACGGCATTG GCGGCGGACG GCAACCAATC GCGCGGCGAG ATGGTGTTGG701 TGCTTTATCC GGCGCAGGAT GAAAAACACG AAGGCTTGTC CGAGTCCGCG751 CAAAACATCA TGAAAATCCT CACAGCCGAG CTGCCGACCA AACAGGCGGC801 GGAGCTTGCC GCCAAAATCA CGGGCGAGGG AAAAAAAGCT TTGTACGATC851 TGGCACTGTC TTGGAAAAAC AAATGA它编码的蛋白质具有氨基酸序列<SEQ ID 232>:1 MFQKHLQKAS DSVVGGTLYV VATPIGNLAD ITLRALAVLQ KADIICAEDT51 RVTAQLLSAY GIQGKLVSVR EHNERQMADK IVGYLSDGMV VAQVSDAGTP101 AVCDPGAKLA RRVREVGFKV VPVVGASAVM AALSVAGVAG SDFYFNGFVP151 PKSGERRKLF AKWVRVAFPV VMFETPHRIG ATLADMAELF PERRLMLARE201 ITKTFETFLS GTVGEIQTAL AADGNQSRGE MVLVLYPAQD EKHEGLSESA251 QNIMKILTAE LPTKQAAELA AKITGEGKKA LYDLALSWKN K*ORF75a和ORF75-1在291个氨基酸的重叠区内有98.3%的相同性:
10 20 30 40 50 60orf75a.pep MFQKHLQKASDSVVGGTLYVVATPIGNLADITLRALAVLQKADIICAEDTRVTAQLLSAY
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf75-1 MFQKHLQKASDSVVGGTLYVVATPIGNLADITLRALAVLQKADIICAEDTRVTAQLLSAY
10 20 30 40 50 60
70 80 90 100 110 120orf75a.pep GIQGKLVSVREHNERQMADKIVGYLSDGMVVAQVSDAGTPAVCDPGAKLARRVREVGFKV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||:|||orf75-1 GIQGKLVSVREHNERQMADKIVGYLSDGMVVAQVSDAGTPAVCDPGAKLARRVREAGFKV
70 80 90 100 110 120
130 140 150 160 170 180orf75a.pep VPVVGASAVMAALSVAGVAGSDFYFNGFVPPKSGERRKLFAKWVRVAFPVVMFETPHRIG
|||||||||||||||||||||||||||||||||||||||||||||:|||:||||||||||orf75-1 VPVVGASAVMAALSVAGVEGSDFYFNGFVPPKSGERRKLFAKWVRAAFPIVMFETPHRIG
130 140 150 160 170 180
190 200 210 220 230 240
orf75a.pep ATLADMAELFPERRLMLAREITKTFETFLSGTVGEIQTALAADGNQSRGEMVLVLYPAQD
||||||||||||||||||||||||||||||||||||||||:|||||||||||||||||||orf75-1 ATLADMAELFPERRLMLAREITKTFETFLSGTVGEIQTALSADGNQSRGEMVLVLYPAQD
190 200 210 220 230 240
250 260 270 280 290orf75a.pep EKHEGLSESAQNIMKILTAELPTKQAAELAAKITGEGKKALYDLALSWKNKX
||||||||||||||||||||||||||||||||||||||||||||||||||||orf75-1 EKHEGLSESAQNIMKILTAELPTKQAAELAAKITGEGKKALYDLALSWKNKX
250 260 270 280 290
Homology with a predicted ORF from N. gonorrhoeae
ORF75与淋病奈瑟球菌的预计ORF(ORF75.ng)在重叠的292个氨基酸内有93.2%的相同性:orf75.pep MFVFQTAFXMFQKHLQKASDSVVGGTLYVVATPIGNLADITLRALAVLQKA----AEDTR 56
||||||||||||||||||||||||||||||||||||||||||||||||||| |||||orf75ng MSVFQTAFFMFQKHLQKASDSVVGGTLYVVATPIGNLADITLRALAVLQKADIICAEDTR 60orf75.pep VTAQLLSAYGIQGKLVSVREHNERQMADKIVGYLSDGMVVAQVSDAGTPAVCDPGAKLAR 116
|||||||||||||:|||||||||||||||::|:||||:||||||||||||||||||||||orf75ng VTAQLLSAYGIQGRLVSVREHNERQMADKVIGFLSDGLVVAQVSDAGTPAVCDPGAKLAR 120orf75.pep RVREAGFKVVPVVGAXAVMAALSVAGVEGSDFYFNGFVPPKSGERRKLFAKWVRAAFPIV 176
||||||||||||||| ||||||||||| |||||||||||||||||||||||||||||:|orf75ng RVREAGFKVVPVVGASAVMAALSVAGVAESDFYFNGFVPPKSGERRKLFAKWVRAAFPVV 180orf75.pep MFETPHRIGAALADMAELFPERRLMLAREITKTFETFLSGTVGEIQTALSADGDQSRGEM 236
||||||||||:||||||||||||||||||||||||||||||||||||||:|||:||||||orf75ng MFETPHRIGATLADMAELFPERRLMLAREITKTFETFLSGTVGEIQTALAADGNQSRGEM 240orf75.pep VLVLYPAQDEKHEGLSESAQNIMKILTAELPTKQAAELAAKITGEGKKALYD 288
||||||:|||||||||||||| ||||:|||||||||||||||||||||||||orf75ng VLVLYPAQDEKHEGLSESAQNAMKILAAELPTKQAAELAAKITGEGKKALYDLALSWKNK 300
预计ORF75ng核苷酸序列<SEQ ID 233>编码的蛋白质具有氨基酸序列<SEQ ID234>:1 MSVFQTAFFM FQKHLQKASD SVVGGTLYVV ATPIGNLADI TLRALAVLQK51 ADIICAEDTR VTAQLLSAYG IQGRLVSVRE HNERQMADKV IGFLSDGLVV101 AQVSDAGTPA VCDPGAKLAR RVREAGFKVV PVVGASAVMA ALSVAGVAES151 DFYFNGFVPP KSGERRKLFA KWVRAAFPVV MFETPHRIGA TLADMAELFP201 ERRLMLAREI TKTFETFLSG TVGEIQTALA ADGNQSRGEM VLVLYPAQDE251 KHEGLSESAQ NAMKILAAEL PTKQAAELAA KITGEGKKAL YDLALSWKNK301 *
在进一步分析后,鉴定出下列淋球菌DNA序列<SEQ ID 235>:1 ATGTTTCAGA AACACTTGCA GAAAGCCTCC GACAGCGTCG TCGGAGGGAC51 ATTATACGTG GTTGCCACGC CCATCGGCAA TTTGGCAGAC ATTACCCTGC101 GCGCTTTGGC GGTATTGCAA AAGGCGGACA TCATTTGTGC CGAAGACACG151 CGCGTTACTG CGCAGCTTTT GAGCGCGTAC GGCATTCAGG GCAGGTTGGT201 CAGTGTGCGC GAACACAACG AGCGGCAGAT GGCGGACAAG GTAATCGGTT251 TCCTTTCAGA CGGCCTGGTT GTGGCGCAGG TTTCCGATGC GGGTACGCCG301 GCCGTGTGCG ACCCGGGCGC GAAACTCGCC CGCCGCGTGC GCGAAGCAGG351 GTTCAAAGTC GTTCCCGTCG TGGGCGCAAG CGCGGTAATG GCGGCGTTGA401 GTGTGGCCGG TGTGGCGGAA TCCGATTTTT ATTTCAACGG TTTTGTACCG451 CCGAAATCGG GCGAACGTAG GAAATTGTTT GCCAAATGGG TGCGGGCGGC501 ATTTCCTGTC GTCATGTTTG AAACGCCGCA CCGAATCGGG GCAACGCTTG551 CCGATATGGC GGAATTGTTC CCCGAACGCC GTCTGATGCT GGCGCGCGAA601 ATCACGAAAA CGTTTGAAAC GTTCTTAAGC GGCACGGTTG GGGAAATTCA651 GACGGCATTG GCGGCGGACG GCAACCAATC GCGCGGCGAG ATGGTGTTGG701 TGCTTTATCC GGCGCAGGAT GAAAAACACG AAGGCTTGTC CGAGTCTGCG751 CAAAATGCGA TGAAAATCCT TGCGGCCGAG CTGCCGACCA AGCAGGCGGC801 GGAGCTTGCC GCCAAGATTA CAGGTGAGGG CAAAAAGGCT TTGTACGATT851 TGGCACTGTC GTGGAAAAAC AAATGA它对应于氨基酸序列<SEQ ID 236;ORF75ng-1>:1 MFQKHLQKAS DSVVGGTLYV VATPIGNLAD ITLRALAVLQ KADIICAEDT51 RVTAQLLSAY GIQGRLVSVR EHNERQMADK VIGFLSDGLV VAQVSDAGTP101 AVCDPGAKLA RRVREAGFKV VPVVGASAVM AALSVAGVAE SDFYFNGFVP151 PKSGERRKLF AKWVRAAFPV VMFETPHRIG ATLADMAELF PERRLMLARE201 ITKTFETFLS GTVGEIQTAL AADGNQSRGE MVLVLYPAQD EKHEGLSESA251 QNAMKILAAE LPTKQAAELA AKITGEGKKA LYDLALSWIN K*ORF75ng-1和ORF75-1在291个氨基酸的重叠区内有96.2%的相同性:
10 20 30 40 50 60orf75-1.pep MFQKHLQKASDSVVGGTLYVVATPIGNLADITLRALAVLQKADIICAEDTRVTAQLLSAY
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf75ng-1 MFQKHLQKASDSVVGGTLYVVATPIGNLADITLRALAVLQKADIICAEDTRVTAQLLSAY
10 20 30 40 50 60
70 80 90 100 110 120orf75-1.pep GIQGKLVSVREHNERQMADKIVGYLSDGMVVAQVSDAGTPAVCDPGAKLARRVREAGFKV
||||:|||||||||||||||::|:||||:|||||||||||||||||||||||||||||||orf75ng-1 GIQGRLVSVREHNERQMADKVIGFLSDGLVVAQVSDAGTPAVCDPGAKLARRVQREAGFKV
70 80 90 100 110 120
130 140 150 160 170 180orf75-1.pep VPVVGASAVMAALSVAGVEGSDFYFNGFVPPKSGERRKLFAKWVRAAFPIVMFETPHRIG
|||||||||||||||||| |||||||||||||||||||||||||||||:||||||||||orf75ng-1 VPVVGASAVMAALSVAGVAESDFYFNGFVPPKSGERRKLFAKWVRAAFPVVMFETPHRIG
130 140 150 160 170 180
190 200 210 220 230 240orf75-1.pep ATLADMAELFPERRLMLAREITKTFETFLSGTVGEIQTALSADGNQSRGEMVLVLYPAQD
||||||||||||||||||||||||||||||||||||||||:|||||||||||||||||||orf75ng-1 ATLADMAELFPERRLMLAREITKTFETFLSGTVGEIQTALAADGNQSRGEMVLVLYPAQD
190 200 210 220 230 240
250 260 270 280 290orf75-1.pep EKHEGLSESAQNIMKILTAELPTKQAAELAAKITGEGKKALYDLALSWKNKX
|||||||||||| |||||:|||||||||||:|||||||||||||||||||||orf75ng-1 EKHEGLSESAQNAMKILAAELPTKQAAELAAKITGEGKKALYDLALSWKNKX
250 260 270 280 290
In addition, ORF75ng-1 shows significant homology with a hypothetical E. coli protein:
sp|P45528|YRAL_ECOLI HYPOTHETICAL 31.3 KD PROTEIN IN AGAI-MTR INTERGENIC REGION (F286)
>gi|606086 (U18997) ORF_f286 [Escherichia coli] >gi|1789535 (AE000395) hypothetical 31.3 kD protein in agai-mtr intergenic region [Escherichia coli] Length = 286
Score = 218 bits (550), Expect = 3e-56
Identities = 128/284 (45%), Positives = 171/284 (60%), Gaps = 4/284 (1%)
Query: 4 KHLQKASDSVVGGTLYVVATPIGNLADITLRALAVLQKADIICAEDTRVTAQLLSAYGIQ 63
K Q A +S G LY+V TPIGNLADIT RAL VLQ D+I AEDTR T LL +GI
Sbjct: 2 KQHQSADNSQ--GQLYIVPTPIGNLADITQRALEVLQAVDLIAAEDTRHTGLLLQHFGIN 59
Query: 64 GRLVSVREHNERQMADKVIGFLSDGLVVAQVSDAGTPAVCDPGAKLARRVREAGFKVVPV 123
RL ++ +HNE+Q A+ ++ L +G +A VSDAGTP + DPG L R REAG +VVP+
Sbjct: 60 ARLFALHDHNEQQKAETLLAKLQEGQNIALVSDAGTPLINDPGYHLVRTCREAGIRVVPL 119
Query: 124 VGASAVMAALSVAGVAESDFYFNGFVPPKSGERRKLFAKWVRAAFPVVMFETPHRIGATL 183
G A + ALS AG+ F + GF+P KS RR ++ +E+ HR+ +L
Sbjct: 120 PGPCAAITALSAAGLPSDRFCYEGFLPAKSKGRRDALKAIEAEPRTLIFYESTHRLLDSL 179
Query: 184 ADMAELFPERR-LMLAREITKTFETFLSGTVGEIQTALAADGNQSRGEMVLVLYPAQDEK 242
D+ + E R ++LARE+TKT+ET VGE+ + D N+ +GEMVL++ +
Sbjct: 180 EDIVAVLGESRYVVLARELTKTWETIHGAPVGELLAWVKEDENRRKGEMVLIV-EGHKAQ 238
Query: 243 HEGLSESAQNAMKILAAELPTKQAAELAAKITGEGKKALYDLAL 286
E L A + +L AELP K+AA LAA+I G K ALY AL
Sbjct: 239 EEDLPADALRTLALLQAELPLKKAAALAAEIHGVKKNALYKYAL 282
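Based on this analysis (including the presence of a putative transmembrane domain in the gonococcal protein), it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 35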
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 237>:1 ATGAAACAGA AAAAAACCGC TGCCGCAGTT ATTGCTGCCAA TGTTGGCAGG51 TTTTGCGGCA GC AAAGCAC CCGAAATCGA CCCGGCCTTTG ..........
//651 .......... ...GAGTTGG TCAGAAACCA GTTGGAGCAG GGTTTGAGAC701 AGGAAAAAGC CCGCTTGAAA ATCGATGCCC TTTTGGAAGA AAACGGTGTC751 AAACCGTAA它对应于氨基酸序列<SEQ ID 238;ORF76>:1 MKQKKTAAAV IAAMLAGFAA XKAPEIDPAL .......... ..........
//201 .......... .......... ELYRNQLEQG LRQEKARLKI DALLEENGYK251 P*进一步的工作揭示了完整的核苷酸序列<SEQ ID 239>:1 ATGAAACAGA AAAAAACCGC TGCCGCAGTT ATTGCTGCAA TGTTGGCAGG51 TTTTGCGGCA GCCAAAGCAC CCGAAATCGA CCCGGCTTTG GTGGATACGC101 TGGTGGCGCA GATCATGCAG CAGGCAGACC GGCATGCGGA GCAGTCCCAA151 AAACCGGACG GGCAGGCAAT CCGAAACGAT GCCGTCCGCC GGCTACAAAC201 TTTGGAAGTT TTGAAAAACA GGGCATTGAA GGAAGGTTTG GATAAGGATA251 AGGATGTCCA AAACCGCTTT AAAATCGCCG AAGCGTCTTT TTATGCCGAG301 GAGTACGTCC GTTTTCTGGA ACGTTCGGAA ACGGTTTCCG AAGACGAGCT351 GCACAAGTTT TACGAACAGC AAATCCGCAT GATCAAATTG CAGCAGGTCA401 GCTTCGCAAC CGAAGAGGAG GCGCGTCAGG CGCAGCAGCT CCTGCTCAAA451 GGGCTGTCTT TTGAAGGGCT GATGAAGCGT TATCCGAACG ACGAGCAGGC501 TTTTGACGGT TTCATTATGG CGCAGCAGCT TCCCGAGCCG CTGGCTTCGC551 AGTTTGCCGC GATGAATCGG GGCGACGTTA CCCGCGATCC GGTCAAATTG601 GGCGAACGCT ATTATCTGTT CAAACTCAGC GAGGTCGGGA AAAACCCCGA651 CGCGCAGCCT TTCGAGTTGG TCAGAAACCA GTTGGAGCAG GGTTTGAGAC701 AGGAAAAAGC CCGCTTGAAA ATCGATGCCC TTTTGGAAGA AAACGGTGTC751 AAACCGTAA它对应于氨基酸序列<SEQ ID 240;ORF76-1>:1 MKQKKTAAAV IAAMLAGFAA AKAPEIDPAL VDTLVAQIMQ QADRHAEQSQ51 KPDGQAIRND AVRRLQTLEV LKNRALKEGL DKDKDVQNRF KIAEASFYAE101 EYVRFLERSE TVSEDELHKF YEQQIRMIKL QQVSFATEEE ARQAQQLLLK151 GLSFEGLMKR YPNDEQAFDG FIMAQQLPEP LASQFAAMNR GDVTRDPVKL201 GERYYLFKLS EVGKNPDAQP FELVRNQLEQ GLRQEKARLK IDALLEENGV251 KP*
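Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF76 shows 96.7% identity over a 30-amino-acid overlap and 96.8% identity over a 31-amino-acid overlap with an ORF (ORF76a) from strain A of N. meningitidis: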
10 20 30orf76.pep MKQKKTAAAVIAAMLAGFAAXKAPEIDPAL
|||||||||||||||||||| |||||||||orf76a MKQKKTAAAVIAAMLAGFAAAAKAPEIDPALVDTLVAQIMQADRHAEQSQKPDGQAIRND
10 20 30 40 50 60
//
70 80 90orf76.pep XELVRNQLEQGLRQEKARLKIDALLEENGVKPX
|||||||||||||||||||||||:|||||||||orf76a DVTRDPVKLGERYYLFKLSEVGKNPDAQPFELVRNQLEQGLRQEKARLKIDAILEENGVKPX
200 210 220 230 240 250全长ORF76a核苷酸序列<SEQ ID 241>是:1 ATGAAACAGA AAAAAACCGC TGCCGCAGTT ATTGCTGCAA TGTTGGCAGG51 TTTTGCGGCA GCCAAAGCAC CCGAAATCGA CCCGGCTTTG GTGGATACGC101 TGGTGGCGCA GATCATGCAG CAGGCAGACC GGCATGCGGA GCAGTCCCAA151 AAACCGGACG GGCAGGCAAT CCGAAACGAT GCCGTCCGTC GGCTGCAAAC201 TTTGGAAGTT TTGAAAAACA GGGCATTGAA GGAAGGTTTG GATAAGGATA251 AGGATGTCCA AAACCGCTTT AAAATCGCCG AAGCGTCTTT TTATGCCGAG301 GAGTACGTCC GTTTTCTGGA ACGTTCGGAA ACGGTTTCCG AAAGCGCACT351 GCGTCAGTTT TATGAGCGGC AAATCCGCAT GATCAAATTG CAGCAGGTCA401 GCTTCGCAAC CGAAGAGGAG GCGCGTCAGG CGCAGCAGCT CCTGCTCAAA451 GGGCTGTCTT TTGAAGGGCT GATGAAGCGT TATCCGAACG ACGAGCAGGC501 TTTTGACGGT TTCATTATGG CGCAGCAGCT TCCCGAGCCG CTGGCTTCGC551 AGTTTGCAGC GATGAATCGG GGCGACGTTA CCCGCGATCC GGTCAAATTG601 GGCGAACGCT ATTATCTGTT CAAACTCAGC GAGGTCGGGA AAAACCCCGA651 CGCGCAGCCT TTCGAGTTGG TCAGAAACCA GTTGGAACAA GGTTTGAGAC701 AGGAAAAAGC CCGCTTGAAA ATCGATGCCA TTTTGGAAGA AAACGGTGTC751 AAACCGTAA
This encodes a protein having the amino acid sequence <SEQ ID 242>:
1 MKQKKTAAAV IAAMLAGFAA AKAPEIDPAL VDTLVAQIMQ QADRHAEQSQ
51 KPDGQAIRND AVRRLQTLEV LKNRALKEGL DKDKDVQNRF KIAEASFYAE
101 EYVRFLERSE TVSESALRQF YERQIRMIKL QQVSFATEEE ARQAQQLLLK
151 GLSFEGLMKR YPNDEQAFDG FIMAQQLPEP LASQFAAMNR GDVTRDPVKL
201 GERYYLFKLS EVGKNPDAQP FELVRNQLEQ GLRQEKARLK IDAILEENGV
251 KP*
ORF76a and ORF76-1 show 97.6% identity over a 252 amino acid overlap:
10 20 30 40 50 60orf76a.pep MKQKKTAAAVIAAMLAGFAAAKAPEIDPALVDTLVAQIMQQADRHAEQSQKPDGQAIRND
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf76-1 MKQKKTAAAVIAAMLAGFAAAKAPEIDPALVDTLVAQIMQQADRHAEQSQKPDGQAIRND
10 20 30 40 50 60
70 80 90 100 110 120orf76a.pep AVRRLQTLEVLKNRALKEGLDKDKDVQNRFKIAEASFYAEEYVRFLERSETVSESALRQF
||||||||||||||||||||||||||||||||||||||||||||||||||||||: |::|orf76-1 AVRRLQTLEVLKNRALKEGLDKDKDVQNRFKIAEASFYAEEYVRFLERSETVSEDELHKF
70 80 90 100 110 120
130 140 150 160 170 180orf76a.pep YERQIRMIKLQQVSFATEEEARQAQQLLLKGLSFEGLMKRYPNDEQAFDGFIMAQQLPEP
||:|||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf76-1 YEQQIRMIKLQQVSFATEEEARQAQQLLLKGLSFEGLMKRYPNDEQAFDGFIMAQQLPEP
130 140 150 160 170 180
190 200 210 220 230 240orf76a.pep LASQFAAMNRGDVTRDPVKLGERYYLFKLSEVGKNPDAQPFELVRNQLEQGLRQEKARLK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf76-1 LASQFAAMNRGDVTRDPVKLGERYYLFKLSEVGKNPDAQPFELVRNQLEQGLRQEKARLK
190 200 210 220 230 240
250orf76a.pep IDAILEENGVKPX
|||:|||||||||orf76-1 IDALLEENGVKPX
250
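The identity figures quoted with each alignment can be checked by counting matching columns. The following is a minimal sketch, assuming the two sequences are already aligned to equal length; the function name `percent_identity` is illustrative and not part of the original analysis, which does not state the exact formula its alignment program used. The example uses only residues 101-130 of ORF76a and ORF76-1, where most of the differences between the two proteins fall.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns where both sequences carry the same residue.

    Both inputs must already be aligned (equal length, '-' marks a gap).
    Gap columns count towards the overlap length but never as identities.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * identical / len(aligned_a)


# Residues 101-130 of ORF76a and ORF76-1, copied from the sequences above.
orf76a_excerpt = "EYVRFLERSETVSESALRQFYERQIRMIKL"
orf76_1_excerpt = "EYVRFLERSETVSEDELHKFYEQQIRMIKL"

excerpt_identity = percent_identity(orf76a_excerpt, orf76_1_excerpt)
print(f"{excerpt_identity:.1f}% identity over {len(orf76a_excerpt)} columns")
```

Over the full 252-residue overlap the same count gives 246 identical positions, consistent with the 97.6% reported above.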
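Homology with a predicted ORF from N. gonorrhoeae
Alignment of the N-terminal and C-terminal amino acid sequences of ORF76 with a predicted ORF (ORF76.ng) from N. gonorrhoeae shows 96.7% and 100% identity over overlaps of 30 and 31 amino acids, respectively:
orf76.pep MKQKKTAAAVIAAMLAGFAAXKAPEIDPAL 30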
|||||||||||||||||||| |||||||||orf76ng MKQKKTAAAVIAAMLAGFAAAKAPEIDPALVDTLVAQIMQQADRHAEQSQRPDGQAIRND 60
//orf76.pep ELVRNQLEQGLRQEKARLKIDALLEENGVKP 251
|||||||||||||||||||||||||||||||orf76ng VTRNPVKLGERYYLFKLGAVGKNPDAQPFELVRNQLEQGLRQEKARLKIDALLEENGVKP 251全长ORF76ng核苷酸序列<SEQ ID 243>是:1 ATGAAACAGA AAAAGACCGC TGCCGCAGTT ATTGCTGCAA TGTTGGCAGG51 TTTTGCGGCA GCCAAAGCAC CCGAAATCGA CCCGGCTTTG GTGGATACGC101 TGGTGGCGCA GATCATGCAG CAGGCAGACC GGCATGCGGA GCAGTCCCAA151 AGACCGGACG GGCAGGCAAT CCGAAACGAT GCCGTCCGCC GGCTGCAAAC201 TTTGGAAGTT TTGAAAAACA GGGCATTGAA GGAAGGTTTG GATAAGGATA251 AGGATGTCCA AAACCGCTTT AAAATCGCCG AAGCGTCTTT TTATGCCGAG301 GAGTACGTCC GTTTTCTGGA ACGTTCGGAA ACGGTTTCCG AAAGCGCACT351 GCGTCAGTTT TATGAGCGGC AAATCCGCAT GATCAAATTG CAGCAGGTCA401 GCTTCGCAAC CGAAGAGGAG GCGCGTCAGG CGCAGCAGCT CCTGCTCAAA451 GGGCTGTCTT TTGAAGGGCT GATGAAGCGT TATCCGAACG ACGAGCAGGC501 GTTCGACGGT TTCATTATGG CGCAGCAGCT TCCCGAGCCG CTGGCTTcgc551 agtttgCCGG TATGAACCGT GGCGACGTTA CCCGCAATCC GGTCAAATTG601 GGCGAACGCT ATTACCTGTT CAAACTCGGC GCGGTCGGGA AAAACCCCGA651 CGCGCAGCCT TTCGAGTTGG TCAGAAACCA GTTGGAACAA GGTTTGAGGC701 AGGAAAAAGC CCGCTTGAAA ATCGATGCCC TTTTGGAaga Aaacggtgtc751 AaacCGTAA它编码的蛋白质具有氨基酸序列<SEQ ID 244>:1 MKQKKTAAAV IAAMLAGFAA AKAPEIDPAL VDTLVAQIMQ QADRHAEQSQ51 RPDGQAIRND AVRRLQTLEV LKNRALKEGL DKDKDVQNRF KIAEASFYAE101 EYVRFLERSE TVSESALRQF YERQIRMIKL QQVSFATEEE ARQAQQLLLK151 GLSFEGLMKR YPNDEQAFDG FIMAQQLPEP LASQFAGMNR GDVTRNPVKL201 GERYYLFKLG AVGKNPDAQP FELVRNQLEQ GLRQEKARLK IDALLEENGV251 KP*ORF76ng和ORF76-1在252个氨基酸的重叠区内有96.0%的相同性
10 20 30 40 50 60orf76-1.pep MKQKKTAAAVIAAMLAGFAAAKAPEIDPALVDTLVAQIMQQADRHAEQSQKPDGQAIRND
||||||||||||||||||||||||||||||||||||||||||||||||||:|||||||||orf76ng MKQKKTAAAVIAAMLAGFAAAKAPEIDPALVDTLVAQIMQQADPHAEQSQRPDGQAIRND
10 20 30 40 50 60
70 80 90 100 110 120orf76-1.pep AVRRLQTLEVLKNRALKEGLDKDKDVQNRFKIAEASFYAEEYVRFLERSETVSEDELHKF
||||||||||||||||||||||||||||||||||||||||||||||||||||||: |::|orf76ng AVRRLQTLEVLKNRALKEGLDKDKDVQNRFKIAEASFYAEEYVRFLERSETVSESALRQF
70 80 90 100 110 120
130 140 150 160 170 180orf76-1.pep YEQQIRMIKLQQVSFATEEEARQAQQLLLKGLSFEGLMKRYPNDEQAFDGFIMAQQLPEP
||:|||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf76ng YERQIRMIKLQQVSFATEEEARQAQQLLLKGLSFEGLMKRYPNDEQAFDGFIMAQQLPEP
130 140 150 160 170 180
190 200 210 220 230 240orf76-1.pep LASQFAAMNRGDVTRDPVKLGERYYLFKLSEVGKNPDAQPFELVRNQLEQGLRQEKARLK
||||||:||||||||:|||||||||||||: |||||||||||||||||||||||||||||orf76ng LASQFAGMNRGDVTRNPVKLGERYYLFKLGAVGKNPDAQPFELVRNQLEQGLRQEKARLK
190 200 210 220 230 240
250orf76-1.pep IDALLEENGVKPX
|||||||||||||orf76ng IDALLEENGVKPX
250
In addition, ORF76ng shows significant homology with a B. subtilis export protein precursor:
sp|P24327|PRSA_BACSU PROTEIN EXPORT PROTEIN PRSA PRECURSOR >gi|98227|pir||S15269 33K lipoprotein - Bacillus subtilis >gi|39782 (X57271) 33kDa lipoprotein [Bacillus subtilis]
>gi|2226124|gnl|PID|e325181 (Y14077) 33kDa lipoprotein [Bacillus subtilis] >gi|2633331|gnl|PID|e1182997 (Z99109) molecular chaperone [Bacillus subtilis] Length = 292
Score = 50.4 bits (118), Expect = 1e-05
Identities = 48/199 (24%), Positives = 82/199 (41%), Gaps = 32/199 (16%)
Query: 70 VLKNRALKEGLDK-----DKDVQNRFKIAEASF----------YAEEYVRFLERSETVSE 114
VL ++ LDK DK++ N+ K + Y ++Y++ + E +++
Sbjct: 53 VLTQLVQEKYLDKKYKVSDKEIDNKLKEYKTQLGDQYTALEKQYGKDYLKEQVKYELLTQ 112
Query: 115 SA-----------LRQFYERQIRMIKLQQVSFATEEEARQAQQLLLKGLSFEGLMKRYPN 163
A +++++E I+ + A ++ A + ++ L KG FE L K Y
Sbjct: 113 KAAKDNIKVTDADIKEYWEGLKGKIRASHILVADKKTAEEVEKKLKKGEKFEDLAKEYST 172
Query: 164 DEQAFDG-----FIMAQQLPEPLASQFAAMNRGDVTRDPVKLGERYYLFKLSEVGKNPDA 218
D A G F Q+ E + + G+V+ DPVK Y++ K +E D
Sbjct: 173 DSSASKGGDLGWFAKEGQMDETFSKAAFKLKTGEVS-DPVKTQYGYHIIKKTEERGKYDD 231
Query: 219 QPFELVRNQLEQGLRQEKA 237
EL LEQ L A
Sbjct: 232 MKKELKSEVLEQKLNDNAA 250
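Based on this analysis, including the presence of a putative leader sequence and an RGD motif in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.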
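The RGD motif mentioned here can be located with a plain substring scan. The sketch below is illustrative only: the helper name `find_motif` and its `offset` parameter are assumptions, and the input is limited to residues 181-200 of the gonococcal protein (SEQ ID 244) rather than the full sequence.

```python
def find_motif(sequence: str, motif: str, offset: int = 1) -> list[int]:
    """Return the start position of every (possibly overlapping) motif hit.

    `offset` is the residue number of the first character in `sequence`,
    so excerpts can be reported in full-length coordinates.
    """
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + offset)
        start = sequence.find(motif, start + 1)
    return positions


# Residues 181-200 of ORF76ng (SEQ ID 244).
orf76ng_excerpt = "LASQFAGMNRGDVTRNPVKL"
rgd_positions = find_motif(orf76ng_excerpt, "RGD", offset=181)
print(rgd_positions)  # [190]
```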
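As described above, ORF76-1 (27.8 kDa) was cloned into a pET vector and expressed in E. coli. The products of protein expression and purification were analysed by SDS-PAGE. Figure 10A shows the results of affinity purification of the His-fusion protein. The purified His-fusion protein was used to immunise mice, and the mouse sera were used for Western blot (Figure 10B), ELISA (positive result), and FACS analysis (Figure 10C). These experiments confirm that ORF76-1 is a surface-exposed protein and a useful immunogen.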
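The text does not say how the 27.8 kDa figure was derived. One common way to estimate a protein mass from its sequence is to sum average residue masses and add one water molecule for the free termini; the sketch below follows that approach. The function name `molecular_weight_kda` is hypothetical, and the result for a given construct also depends on details not modelled here (leader peptide, His tag, modifications).

```python
# Average residue masses in daltons (amino acid minus one water molecule).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528


def molecular_weight_kda(protein: str) -> float:
    """Estimate the average molecular mass of an unmodified protein in kDa."""
    residues = protein.rstrip("*")  # drop a trailing stop symbol if present
    if not residues:
        raise ValueError("empty sequence")
    mass = sum(RESIDUE_MASS[residue] for residue in residues) + WATER_MASS
    return mass / 1000.0


# First ten residues of ORF76-1, used only to show the call.
peptide_mass = molecular_weight_kda("MKQKKTAAAV")
print(f"{peptide_mass:.3f} kDa")
```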
Example 36
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 245>:
1 ATGAAAAAAT CTTTCCTTAC GCTTGTTCTG TATTCGTCTT TACTTACCGC
51 CAGCGAAATT GCCTTACCCC TTGGAATTGG GGATTGAAAC CTTACCGGCG
101 GCAAAAATTG CGGAAACGTT TGCGCTGACA TTTGTGATTG CTGCGCTGTA
151 TCTGTTTGCG CGTAATAAGG TGACGCGTTT GTTGATTGCG GTGTTTTTTG
201 CGTTCAGCAT TATTGCCAAC AATGTGCATT ACGCGGATTA TCAAAGCTGG
251 ATGACG.... .......... .......... .......... ..........
//1201 .......... CAAACCGTAT TCGAGCAGCT GCAAAAGACT CCTGACGGCA1251 ACTGGCTGTT TGCCTATACC TCCGATCATG GCCAGTATGT TCGCCAAGAT1301 ATCTACAATC AAGGCACGGT GCAGCCCGAC AGCTATCTCG TGCCGCTAGT1351 GTTGTACAGC CCGGATAAGG CCGTGCAACA GGCTGCCAAC CAGGCTTTTG1401 CGCCTTGCGA GATTGCCTTC CATCAGCAGC TTTCAACGTT CCTGATTCAC1451 ACGTTGGGCT ACGATATGCC GGTTTCAGGT TGTCGCGAAG GCTCGGTAAC1501 GGGCAACCTG ATTACGGGTG ATGCAGGCAG CTTGAACATT CGCGACGGCA1551 AGGCGGAATA TGTTTATCCG CAATGA
This corresponds to the amino acid sequence <SEQ ID 246; ORF81>:
1 MKKSFLTLVL YSSLLTASEI AYPLELGIET LPAAKIAETF ALTFVIAALY
51 LFARNKVTRL LIAVFFAFSI IANNVHYADY QSWMT..... ..........
//401 ...QTVFEQL QKTPDGNWLF AYTSDHGQYV RQDIYNQGTV QPDSYLVPLV451 LYSPDKAVQQ AANQAFAPCE IAFHQQLSTF LIHTLGYDMP VSGCREGSVT501 GNLITGDAGS LNIRDGKAEY VYPQ*
进一步的工作揭示了完整的核苷酸序列<SEQ ID 247>:1 ATGAAAAAAT CTTTCCTTAC GCTTGTTCTG TATTCGTCTT TACTTACCGC51 CAGCGAAATT GCCTATCGCT TTGTATTTGG GATTGAAACC TTACCGGCGG101 CAAAAATTGC GGAAACGTTT GCGCTGACAT TTGTGATTGC TGCGCTGTAT151 CTGTTTGCGC GTTATAAGGT GACGCGTTTG TTGATTGCGG TGTTTTTTGC201 GTTCAGCATT ATTGCCAACA ATGTGCATTA CGCGGTTTAT CAAAGCTGGA251 TGACGGGCAT CAATTATTGG CTGATGCTGA AAGAGGTTAC CGAAGTCGGC301 AGCGCGGGTG CGTCGATGTT GGATAAGTTG TGGCTGCCTG TGTTGTGGGG351 CGTGTTGGAA GTCATGTTGT TTTGCAGCCT TGCCAAGTTC CGCCGTAAGA401 CGCATTTTTC TGCCGATATA CTGTTTGCCT TCCTAATGCT GATGATTTTC451 GTGCGTTCGT TCGACACGAA ACAAGAGCAC GGTATTTCGC CCAAACCGAC501 ATACAGCCGC ATCAAAGCCA ATTATTTCAG CTTCGGTTAT TTTGTCGGAC551 GCGTGTTGCC GTATCAGTTG TTTGATTTAA GCAGGATTCC CGCCTTTAAG601 CAGCCTGCTC CAAGCAAAAT CGGGCAGGGC AGTGTTCAAA ATATCGTCCT651 GATTATGGGC GAAAGCGAAA GCGCGGCGCA TTTGAAGCTG TTTGGCTACG701 GACGCGAAAC TTCGCCGTTT TTAACCCGGC TGTCGCAAGC CGATTTTAAG751 CCGATTGTGA AACAAAGTTA TTCCGCAGGC TTTATGACTG CAGTGTCCCT801 GCCCAGTTTT TTCAATGCGA TACCGCACGC CAACGGCTTG GAACAAATCA851 GCGGCGGCGA TACCAATATG TTCCGCCTCG CCAAAGAGCA GGGCTATGAA901 ACGTATTTTT ACAGCGCGCA GGCGGAAAAC GAGATGGCGA TTTTGAACTT951 AATCGGTAAG AAATGGATAG ACCATCTGAT TCAGCCGACG CAACTTGGCT1001 ACGGCAACGG CGACAATATG CCCGATGAGA AGCTGCTGCC GTTGTTCGAC1051 AAAATCAATT TGCAGCAGGG CAAGCATTTT ATCGTGTTGC ACCAACGCGG1101 TTCGCACGCC CCATACGGCG CATTGTTGCA GCCTCAAGAT AAAGTATTCG1151 GCGAAGCEGA TATTGTGGAT AAGTACGACA ACACCATCCA CAAAACCGAC1201 CAAATGATTC AAACCGTATT CGAGCAGCTG CAAAAGCAGC CTGACGGCAA1251 CTGGCTGTTT GCCTATACCT CCGATCATGG CCAGTATGTT CGCCAAGATA1301 TCTACAATCA AGGCACGGTG CAGCCCGACA GCTATCTCGT GCCGCTAGTG1351 TTGTACAGCC CGGATAAGGC CGTGCAACAG GCTGCCAACC AGGCTTTTGC1401 GCCTTGCGAG ATTGCCTTCC ATCAGCAGCT TTCAACGTTC CTGATTCACA1451 CGTTGGGCTA CGATATGCCG GTTTCAGGTT GTCGCGAAGG CTCGGTAACG1501 GGCAACCTGA TTACGGGTGA TGCAGGCAGC TTGAACATTC GCGACGGCAA1551 GGCGGAATAT GTTTATCCGC AATGA
This corresponds to the amino acid sequence <SEQ ID 248; ORF81-1>:
1 MKKSFLTLVL YSSLLTASEI AYRFVFGIET LPAAKIAETF ALTFVIAALY
51 LFARYKVTRL LIAVFFAFSI IANNVHYAVY QSWMTGINYW LMLKEVTEVG
101 SAGASMLDKL WLPVLWGVLE VMLFCSLAKF RRKTHFSADI LFAFLMLMIF
151 VRSFDTKQEH GISPKPTYSR IKANYFSFGY FVGRVLPYQL FDLSRIPAFK
201 QPAPSKIGQG SVQNIVLIMG ESESAAHLKL FGYGRETSPF LTRLSQADFK
251 PIVKQSYSAG FMTAVSLPSF FNAIPHANGL EQISGGDTNM FRLAKEQGYE
301 TYFYSAQAEN EMAILNLIGK KWIDHLIQPT QLGYGNGDNM PDEKLLPLFD
351 KINLQQGKHF IVLHQRGSHA PYGALLQPQD KVFGEADIVD KYDNTIHKTD
401 QMIQTVFEQL QKQPDGNWLF AYTSDHGQYV RQDIYNQGTV QPDSYLVPLV
451 LYSPDKAVQQ AANQAFAPCE IAFHQQLSTF LIHTLGYDMP VSGCREGSVT
501 GNLITGDAGS LNIRDGKAEY VYPQ*
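The correspondence between a nucleotide sequence and its amino acid sequence can be reproduced by translating in frame 1 with the standard genetic code. The following is a minimal sketch under that assumption; the names `CODON_TABLE` and `translate` are illustrative, and ambiguous or incomplete codons are reported as `X`. The input is the first 30 nucleotides of the complete ORF81-1 sequence <SEQ ID 247>.

```python
# Standard genetic code, codons enumerated in TCAG order for each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate reading frame 1; unknown or ambiguous codons become 'X'."""
    cleaned = "".join(dna.split()).upper()  # drop the blanks between 10-mers
    protein = []
    for pos in range(0, len(cleaned) - 2, 3):
        protein.append(CODON_TABLE.get(cleaned[pos:pos + 3], "X"))
    return "".join(protein)


# First 30 nucleotides of SEQ ID 247 (ORF81-1).
orf81_1_start = "ATGAAAAAAT CTTTCCTTAC GCTTGTTCTG"
print(translate(orf81_1_start))  # MKKSFLTLVL
```

The output matches the first ten residues of SEQ ID 248. Stop codons are emitted as `*`, the same symbol used in the sequence listings.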
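Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF81 shows 84.7% identity over an 85 amino acid overlap and 99.2% identity over a 121 amino acid overlap with an ORF (ORF81a) from strain A of N. meningitidis: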
10 20 30 40 50 60orf81.pep MKKSFLTLVLYSSLLTASEIAYPLELGIETLPAAKIAETFALTFVIAALYLFARNKVTRL
||||:::||||||||||||||| : :|||||||||:|||||||||||||||||| |:|||orf81a MKKSLFVLFLYSSLLTASEIAYRFVFGIETLPAAKMAETFALTFVIAALYLFARYKATRL
10 20 30 40 50 60
70 80orf81.pep LIAVFFAFSIIANNVHYADYQSWMT
|||||||||||||||||| ||||:|orf81a LIAVFFAFSIIANNVHYAVYQSWITGINYWLMLKEITEVGGAGASMLDKLWLPALWGVLE
70 80 90 100 110 120
//
120 130 140orf81.pep QTVFEQLQKTPDGNWLFAYTSDHGQYVRQD
||||||||| ||||||||||||||||||||orf81a IPHANGLEQISGGDIVDKYDNTIHKTDQMIQTVFEQLQKQPDGNWLFAYTSDHGQYVRQD
280 290 300 310 320 330
150 160 170 180 190 200orf81.pep IYNQGTVQPDSYLVPLVLYSPDKAVQQAANQAFAPCEIAFHQQLSTFLIHTLGYDMPVSG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf81a IYNQGTVQPDSYLVPLVLYSPDKAVQQAANQAFAPCEIAFHQQLSTFLIHTLGYDMPVSG
340 350 360 370 380 390
210 220 230orf81.pep CREGSVTGNLITGDAGSLNIRDGKAEYVYPQX
||||||||||||||||||||||||||||||||orf81a CREGSVTGNLITGDAGSLNIRDGKAEYVYPQX
400 410 420全长ORF81a核苷酸序列<SEQ ID 249>是:1 ATGAAAAAAT CCCTTTTCGT TCTCTTTCTG TATTCGTCCC TACTTACTGC51 CAGCGAAATT GCTTATCGCT TTGTATTCGG AATTGAAACC TTACCGGCTG101 CAAAAATGGC AGAAACGTTT GCGCTGACAT TTGTGATTGC TGCGCTGTAT151 CTGTTTGCGC GTTATAAGGC AACGCGTTTG TTGATTGCGG TGTTTTTCGC201 GTTCAGCATT ATTGCCAACA ATGTGCATTA CGCGGTTTAT CAAAGCTGGA251 TAACGGGCAT TAATTATTGG CTGATGCTGA AAGAGATTAC CGAAGTTGGC301 GGCGCAGGGG CGTCGATGTT GGATAAGTTG TGGCTGCCTG CGTTGTGGGG351 CGTGTTGGAA GTCATGTTGT TTTGCAGCCT TGCCAAGTTC CGCCGTAAGA401 CGCATTTTTC TGCCGATATA CTGTTTGCCT TCCTAATGCT GATGATTTTC451 GTGCGTTCGT TCGACACGAA ACAAGAACAC GGTATTTCGC CCAAACCGAC501 ATACAGCCGC ATCAAAGCCA ATTATTTCAG CTTCGGTTAT TTTGTCGGAC551 GCGTGTTGCC GTATCAGTTG TTTGATTTAA GCAAGATTCC TGTGTTCAAA601 CAGCCTGCTC CAAGCAGAAT CGGGCAAGGC AGTATTCAAA ATATCGTCCT651 GATTATGGGC GAAAGCGAAA GCGCGGCGCA TTTGAAATTG TTTGGCTACG701 GGCGCGAAAC TTCGCCGTTT TTGACCCAGC TTTCGCAAGC CGATTTTAAG751 CCGATTGTGA AACAAAGTTA TTCCGCAGGC TTTATGACGG CAGTATCCCT801 GCCCAGTTTC TTTAACGTCA TACCGCATGC CAACGGCTTG GAACAAATCA851 GCGGCGGCGA TATTGTGGAT AAGTACGACA ACACCATCCA CAAAACCGAC901 CAAATGATTC AAACCGTATT CGAGCAGCTG CAAAAGCAGC CTGACGGCAA951 CTGGCTGTTT GCCTATACCT CCGATCATGG CCAGTATGTT CGCCAAGATA1001 TCTACAATCA AGGCACGGTG CAGCCCGACA GCTATCTCGT GCCGCTGGTG1051 TTGTACAGCC CGGATAAGGC CGTGCAACAG GCTGCCAACC AGGCTTTTGC1101 GCCTTGCGAG ATTGCCTTCC ATCAGCAGCT TTCAACGTTC CTGATTCACA1151 CGTTGGGCTA CGATATGCCG GTTTCAGGTT GTCGCGAAGG CTCGGTAACG1201 GGCAACCTGA TTACGGGTGA TGCAGGCAGC TTGAACATTC GCGACGCCAA1251 GGCGGAATAT GTTTATCCGC AATGA它编码的蛋白质具有氨基酸序列<SEQ ID 250>:1 MKKSLFVLFL YSSLLTASEI AYRFVFGIET LPAAKMAFTF ALTFVIAALY51 LFARYKATRL LIAVFFAFSI IANNVHYAVY QSWITGINYW LMLKEITEVG101 GAGASMLDKL WLPALWGVLE VMLFCSLAKF RRKTHFSADI LFAFLMLMIF151 VRSFDTKQEH GISPKPTYSR IKANYFSFGY FVGRVLPYQL FDLSKIPVFK201 QPAPSRIGQG SIQNIVLIMG ESESAAHLKL FGYGRETSPF LTQLSQADFK251 PIVKQSYSAG FMTAVSLPSF FNVIPHANGL EQISGGDIVD KYDNTIHKTD301 QMIQTVFEQL QKQPDGNWLF AYTSDHGQYV RQDIYNQGTV QPDSYLVPLV351 LYSPDKAVQQ AANQAFAPCE IAFHQQLSTF LIHTLGYDMP 
VSGCREGSVT
401 GNLITGDAGS LNIRDGKAEY VYPQ*
ORF81a and ORF81-1 show 77.9% identity over a 524 amino acid overlap:
10 20 30 40 50 60orf81a.pep MKKSLFVLFLYSSLLTASEIAYRFVFGIETLPAAKMAETFALTFVIAALYLFARYKATRL
||||:::| ||||||||||||||||||||||||||:||||||||||||||||||||:|||orf81-1 MKKSFLTLVLYSSLLTASEIAYRFVFGIETLPAAKIAETFALTFVIAALYLFARYKVTRL
10 20 30 40 50 60
70 80 90 100 110 120orf81a.pep LIAVFFAFSIIANNVHYAVYQSWITGINYWLMLKEITEVGGAGASMLDKLWLPALWGVLE
|||||||||||||||||||||||:|||||||||||:||||:||||||||||||:||||||orf81-1 LIAVFFAFSIIANNVHYAVYQSWMTGINYWLMLKEVTEVGSAGASMLDKLWLPVLWGVLE
70 80 90 100 110 120
130 140 150 160 170 180orf81a.pep VMLFCSLAKFRRKTHFSADILFAFLMLMIFVRSFDTKQEHGISPKPTYSRIKANYFSFGY
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf81-1 VMLFCSLAKFRRKTHFSADILFAFLMLMIFVRSFDTKQEHGISPKPTYSRIKANYFSFGY
130 140 150 160 170 180
190 200 210 220 230 240orf81a.pep FVGRVLPYQLFDLSKIPVFKQPAPSRIGQGSIQNIVLIMGESESAAHLKLFGYGRETSPF
||||||||||||||:||:|||||||:|||||:||||||||||||||||||||||||||||orf81-1 FVGRVLPYQLFDLSRIPAFKQPAPSKIGQGSVQNIVLIMGESESAAHLKLFGYGRETSPF
190 200 210 220 230 240
250 260 270 280orf81a.pep LTQLSQADFKPIVKQSYSAGFMTAVSLPSFFNVIPHANGLEQISGGD-------------
||:|||||||||||||||||||||||||||||:||||||||||||||orf81-1 LTRLSQADFKPIVKQSYSAGFMTAVSLPSFFNAIPHANGLEQISGGDTNMFRLAKEQGYE
250 260 270 280 290 300orf81a.pep ------------------------------------------------------------orf81-1 TYFYSAQAENEMAILNLIGKKWIDHLIQPTQLGYGNGDNMPDEKLLPLFDKINLQQGKHF
310 320 330 340 350 360
290 300 310 320orf81a.pep ---------------------------IVDKYDNTIHKTDQMIQTVFEQLQKQPDGNWLF
|||||||||||||||||||||||||||||||||orf81-1 IVLHQRGSHAPYGALLQPQDKVFGEADIVDKYDNTIHKTDQMIQTVFEQLQKQPDGNWLF
370 380 390 400 410 420
330 340 350 360 370 380orf81a.pep AYTSDHGQYVRQDIYNQGTVQPDSYLVPLVLYSPDKAVQQAANQAFAPCEIAFHQQLSTF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf81-1 AYTSDHGQYVRQDIYNQGTVQPDSYLVPLVLYSPDKAVQQAANQAFAPCEIAFHQQLSTF
430 440 450 460 470 480
390 400 410 420orf81a.pep LIHTLGYDMPVSGCREGSVTGNLITGDAGSLNIRDGKAEYVYPQX
|||||||||||||||||||||||||||||||||||||||||||||orf81-1 LIHTLGYDMPVSGCREGSVTGNLITGDAGSLNIRDGKAEYVYPQX
490 500 510 520
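Homology with a predicted ORF from N. gonorrhoeae
Alignment of the N-terminal and C-terminal amino acid sequences of ORF81 with a predicted ORF (ORF81.ng) from N. gonorrhoeae shows 82.4% and 97.5% identity over overlaps of 85 and 121 amino acids, respectively:
orf81.pep MKKSFLTLVLYSSLLTASEIAYPLELGIETLPAAKIAETFALTFVIAALYLFARNKVTRL 60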
||||:::| ||||||||||||| : :|||||||||:||||||||:||||||||| |::||orf81ng MKKSLFVLFLYSSLLTASEIAYRFVFGIETLPAAKMAETFALTFMIAALYLFARYKASRL 60orf81.pep LIAVFFAFSIIANNVHYADYQSWMT 85
|||||||||:|||||||| ||||||orf81ng LIAVFFAFSMIANNVHYAVYQSWMTGINYWLMLKEVTEVGSAGASMLDKLWLPALWGVAE 120
//orf81.pep QTVFEQLQKTPDGNWLFAYTSDHGQYVRQD 433
||||||||| ||||||||||||||||||||orf81ng ALLQPQDKVFGEADIVDKYDNTIHKTDQMIQTVFEQLQKQPDGNWLFAYTSDHGQYVRQD 433orf81.pep IYNQGTVQPDSYLVPLVLYSPDKAVQQAANQAFAPCEIAFHQQLSTFLIHTLGYDMPVSG 493
|||||||||||||||:||||||||||||||||||||||||||||||||||||||||||||orf81ng IYNQCTYQPDSYIVPLVLYSPDKAVQQAANQAFAPCEIAFHQQKSTFLIHTLGYDMPVSG 493orf81.pep CREGSVTGNLITGDAGSLNIRDGKAEYVYPQ 524
|||||||||||||||||||||:|||||||||orf81ng CREGSVTGNLITGDAGSLNIRNGKAEYVYPQ 524全长ORF81ng核苷酸序列<SEQ ID 251>是:1 ATGAAAAAAT CCCTTTTCGT TCTCTTTCTG TATTCATCCC TACTTACCGC51 CAGCGAAATC GCCTATCGCT TTGTATTCGG AATTGAAACC TTACCGGCTG101 CAAAAATGGC GGAAACGTTT GCGCTGACAT TTATGATTGC TGCGCTGTAT151 CTGTTTGCGC GTTATAAGGC TTCGCGGCTG CTGATTGCGG TGTTTTTCGC201 GTTCAGCATG ATTGCCAACA ATGTGCATTA CGCGGTTTAT CAAAGCTGGA251 TGACGGGTAT TAACTATTGG CTGATGCTGA AAGAGGTTAC CGAAGTCGGC301 AGCGCGGGCG CGTCGATGTT GGATAAGTTG TGGCTGCCTG CTTTGTGGGG351 CGTGGCGGAA GTCATGTTGT TTTGCAGCCT TGCCAAGTTC CGCCGTAAGA401 CGCATTTTTC TGCCGATATA CTGTTTGCCT TCCTAATGCT GATGATTTTC451 GTGCGTTCGT TCGACACGAA ACAAGAGCAC GGTATTTCGC CCAAACCGAC501 ATACAGCCGC ATCAAAGCCA ATTATTTCAG CTTCGGTTAT TTTGTCGGGC551 GCGTGTTGCC GTATCAGTTG TTTGATTTAA GCAAGATCCC TGTGTTCAAA601 CAGCCTGCTC CAAGCAAAAT CGGGCAAGGC AGTATTCAAA ATATCGTCCT651 GATTATGGGC GAAAGCGAAA GCGCGGCGCA TTTGAAATTG TTTGGTTACG701 GGCGCGAAAC TTCGCCGTTT TTAACCCGGC TGTCGCAAGC CGATTTTAAG751 CCGATTGTGA AACAAAGTTA TTCCGCAGGC TTTATGACGG CAGTATCCCT801 GCCCAGTTTC TTTAACGTCA TACCGCACGC CAACGGCTTG GAACAAATCA851 GCGGCGGCGA TACCAATATG TTCCGCCTCG CCAAAGAGCA GGGCTATGAA901 ACGTATTTTT ACAGTGCCCA GGCTGAAAAC CAAATGGCAA TTTTGAACTT951 AATCGGTAAG AAATGGATAG ACCATCTGAT TCAGCCGACG CAACTTGGCT1001 ACGGCAACGG CGACAATATG CCCGATGAGA ACCTGCTGCC GTTGTTCGAC1051 AAAATCAATT TGCAGCAGGG CAGGCATTTT ATCGTGTTGC ACCAACGCGG1101 TTCGCACGCC CCATACGGCG CATTGTTGCA GCCTCAAGAT AAAGTATTCG1151 GCGAAGCCGA TATTGTGGAT AAGTACGACA ACACCATCCA CAAAACCGAC1201 CAAATGATTC AAACCGTATT CGAGCAGCTG CAAAAGCAGC CTGACGGCAA1251 CTGGCTGTTT GCCTATACCT CCGATCATGG CCAGTATGTG CGCCAAGATA1301 TCTACAATCA AGGCACGGTG CAGCCCGACA GCTATATTGT GCCTCTGGTT1351 TTGTACAGCC CGGATAAGGC CGTGCAACAG GCTGCCAACC AGGCCTTTTGC1401 GCCTTGCGAG ATTGCCTTCC ATCAGCAGCT TTCAACGTTC CTGATTCACA1451 CGTTGGGCTA CGATATGCCG GTTTCAGGTT GTCGCGAAGG CTCGGTAACA1501 GGCAACCTGA TTACGGGCGA TGCAGGCAGC TTGAACATTC GCAACGGCAA1551 GGCGGAATAT GTTTATCCGC ATAA它编码的蛋白质具有氨基酸序列<SEQ ID 252>: 1 MKKSLFVLFL YSSLLTASEI 
AYRFVFGIET LPAAKMAETF ALTFMIAALY
51 LFARYKASRL LIAVFFAFSM IANNVHYAVY QSWMTGINYW LMLKEVTEVG
101 SAGASMLDKL WLPALWGVAE VMLFCSLAKF RRKTHFSADI LFAFLMLMIF
151 VRSFDTKQEH GISPKPTYSR IKANYFSFGY FVGRVLPYQL FDLSKIPVFK
201 QPAPSKIGQG SIQNIVLIMG ESESAAHLKL FGYGRETSPF LTRLSQADFK
251 PIVKQSYSAG FMTAVSLPSF FNVIPHANGL EQISGGDTNM FRLAKEQGYE
301 TYFYSAQAEN QMAILNLIGK KWIDHLIQPT QLGYGNGDNM PDEKLLPLFD
351 KINLQQGRHF IVLHQRGSHA PYGALLQPQD KVFGEADIVD KYDNTIHKTD
401 QMIQTVFEQL QKQPDGNWLF AYTSDHGQYV RQDIYNQGTV QPDSYIVPLV
451 LYSPDKAVQQ AANQAFAPCE IAFHQQLSTF LIHTLGYDMP VSGCREGSVT
501 GNLITGDAGS LNIRNGKAEY VYPQ*
ORF81ng and ORF81-1 show 96.4% identity over a 524 amino acid overlap:
10 20 30 40 50 60orf81ng-1.pep MKKSLFVLFLYSSLLTASEIAYRFVFGIETLPAAKMAETFALTFMIAALYLFARYKASRL
||||:::||||||||||||||||||||||||||||:||||||||:|||||||||||::||orf81-1 MKKSFLTLVLYSSILTASEIAYRFVFGIETLPAAKIAETFALTFVIAALYLFARYKVTRL
10 20 30 40 50 60
70 80 90 100 110 120orf81ng-1.pep LIAVFFAFSMIANNVHYAVYQSWMTGINYWLMLKEVTEVGSAGASMLDKLWLPALWGVAE
|||||||||:|||||||||||||||||||||||||||||||||||||||||||:||||||orf81-1 LIAVFFAFSIIANNVHYAVYQSWMTGINYWLMLKEVTEVGSAGASMLDKLWLPVLWGVLE
70 80 90 100 110 120
130 140 150 160 170 180orf81ng-1.pep VMLFCSLAKFRRKTHFSADILFAFLMLMIFVRSFDTKQEHGISPKPTYSRIKANYFSFGY
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf81-1 VMLFCSLAKFRRKTHFSADILFAFLMLMIFVRSFDTKQEHGISPKPTYSRIKANYFSFGY
130 140 150 160 170 180
190 200 210 220 230 240orf81ng-1.pep FVGRVLPYQLFDLSKIPVFKQPAPSKIGQGSIQNIVLIMGESESAAHLKLFGYGRETSPF
||||||||||||||:||:|||||||||||||:||||||||||||||||||||||||||||orf81-1 FVGRVLPYQLFDLSRIPAFKQPAPSKIGQGSVQNIVLIMGESESAAHLKLFGYGRETSPF
190 200 210 220 230 240
250 260 270 280 290 300orf81ng-l.pep LTRLSQADFIPIVKQSYSAGFMTAVSLPSFFNVIPHANGLEQISGGDTNMFRLAKEQGYE
||||||||||||||||||||||||||||||||:|||||||||||||||||||||||||||orf81-1 LTRLSQADFKPIVKQSYSAGFMTAVSLPSFFNAIPHANGLEQISGGDTNMFRLAKEQGYE
250 260 270 280 290 300
310 320 330 340 350 360orf81ng-1.pep TYFYSAQAENQMAILNLIGKKWIDHLIQPTQLGYGNGDNMPDEKLLPLFDKINLQQGRHF
||||||||||:||||||||||||||||||||||||||||||||||||||||||||||:||orf81-1 TYFYSAQAENEMAILNLIGKKWIDHLIQPTQLGYGNGDNMPDEKLLPLFDKINLQQGKHF
310 320 330 340 350 360
370 380 390 400 410 420orf81ng-1.pep IVLHQRGSHAPYGALLQPQDKVFGEADIVDKYDNTIHKTDQMIQTVFEQLQKQPDGNWLF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf81-1 IVLHQRGSHAPYGALLQPQDKVFGEADIVDKYDNTIHKTDQMIQTVFEQLQKQPDGNWLF
370 380 390 400 410 420
430 440 450 460 470 480orf81ng-1.pep AYTSDHGQYVRQDIYNQGTVQPDSYIVPLVLYSPIKAVQQAANQAFAPCEIAFHQQLSTF
|||||||||||||:||||||||||||||||||||||||||||||||||||||||||||||orf81-1 AYTSDHGQYVRQDIYNQGTVQPDSYLVPLVLYSPDKAVQQAANQAFAPCEIAFHQQLSTF
430 440 450 460 470 480
490 500 510 520orf81ng-1.pep LIHTLGYDMPVSGCREGSVTGNLITGDAGSLNIRNGKAEYVYPQX
||||||||||||||||||||||||||||||||||:||||||||||orf81-1 LIHTLGYDMPVSGCREGSVTGNLITGDAGSLNIRDGKAEYVYPQX
490 500 510 520另外,ORF81ng显示出与大肠杆菌的OMP明显同源:gi|1256380(U50906)结合外膜蛋白粘附蛋白的蛋白[E.coli]长度=547评分=87.4位(213),估计值=2e-16相同性=122/468(26%),阳性=198/468(42%),空隙=70/468(14%)询问:25 VFGIETLPAAKMAETFA-LTFMIAALYLFARYKAS--RLLIAVFFAFSMIANNVHYAVYQ 81
VFGI L A+ A L F + + + R + RLL+A F + A ++ ++Y目标:29 VFGITNLVASSGAHMVQRLLFFVLTILVVKRISSLPLRLLVAAPFVL-LTAADMSISLY- 86询问:82 SWMT-------GINYWLMLKEVTEVGSAGASMLDKLWLPALWGVAEVMLFCSLAKFRRKT 134
SW T G ++ + EV A ML ++ P L A + L +目标:87 SWCTFGTTFNDGFAISVLQSDPDEV----AKMLG-MYSPYLCAFAFLSLLFLAVIIKYDV 141询问:135 HFSADILFAFLMLMIFVRSF---------DTKQEHGISPKPTYSRIKAN--YFSFGYFVG 183
+ L+L++ S D K ++ SP SR +F+ YF目标:142 SLPTKKVTGILLLIVISGSLFSACQFAYKDAKNKNAFSPYILASRFATYTPFFNLNYFAL 201询问:184 RVLPYQ--LFDLSKIPVFKQPAPSKIGQGSIQNIVLIMGESESAAHLKLFGYGRETSPFL 241
+Q L + +P F+ + I VLI+GES ++ L+GY R T+P +目标:202 AAKEHQRLLSIANTVPYFQL----SVRDTGIDTYVLIVGESVRVDNMSLYGYTRSTTPQV 257询问:242 TRLSQADFKPIVKQSYSAGFMTAVSLP---SFFNVIPHANGLEQISGGDTNMFRLAKEQG 298
+Q + Q+ S TA+S+P + +V+ H I N+ +A + G目标:258 E--AQRKQIKLFNQAISGAPYTALSVPLSLTADSVLSH-----DIHNYPDNIINMANQAG 310询问:299 YETYFYSAQA---ENQMAILNLIGKKWIDHLIQPTQLGYGNGDNMPDEKLLPLFDKINLQ 355
++T++ S+Q+ +N A+ ++ ++ + Y G DE LLP + Q目标:311 FQTFWLSSQSAFRQNGTAVTSI--------AMRAMETVYVRGF---DELLLPHLSQALQQ 359询问:356 --QGRHFIVLHQRGSHAPYGALLQPQDKVFGEADIVDK-YDNTIHKTDQMIQTVFEQLQK 412
Q + IVLH GSH P + VF D D YDN+IH TD ++ VFE L+目标:360 NTQQKKLIVLHLNGSHEPACSAYPQSSAVFQPQDDQDACYDNSIHYTDSLLGQVFELLK- 418询问:413 QPDGNWLFAYTSDHG---QYVRQDIYNQG--TVQPDSYIVPL-VLYSP 454
D Y +DHG ++++Y G +Y VP+ + YSP目标:419 --DRRASVMYFADHGLERDPTKKNVYFHGGREASQAYHVPMFIWWYSP 464
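Based on this analysis, including the presence of a putative leader sequence (double-underlined) and several putative transmembrane domains (single-underlined) in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.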
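The text does not name the program used to predict the transmembrane domains. As a rough illustration of how such candidates can be flagged, the sketch below computes a sliding-window Kyte-Doolittle hydropathy profile; a window of 19 residues and a mean above 1.6 are the values conventionally suggested for membrane-spanning segments. The function names are hypothetical, and this heuristic is not claimed to be the method behind the annotations in this document.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 19) -> list[float]:
    """Mean hydropathy of every full window along the sequence."""
    residues = protein.rstrip("*")
    if window < 1 or window > len(residues):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in residues]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophobic_windows(protein: str, window: int = 19,
                        threshold: float = 1.6) -> list[int]:
    """1-based start positions of windows whose mean exceeds the threshold."""
    profile = hydropathy_profile(protein, window)
    return [i + 1 for i, mean in enumerate(profile) if mean > threshold]


# Residues 41-90 of ORF81ng (SEQ ID 252), used only to show the call.
orf81ng_excerpt = "ALTFMIAALYLFARYKASRLLIAVFFAFSMIANNVHYAVYQSWMTGINYW"
candidate_starts = [pos + 40 for pos in hydrophobic_windows(orf81ng_excerpt)]
print(candidate_starts)
```

Overlapping windows above the threshold would normally be merged into one candidate segment before comparison with other evidence.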
Example 37
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 253>:
1 ...ACCCTGCTCC TCTTCATCCC CCTCGTCCTC ACAC.GTGCG GCACACTGAC
51 CGGCATACTC GCCCaCGGCG GCGGCAAACG CTTTGCCGTC GAACAAGAAC
101 TCGTCGCCGC ATCGTCCCGC GCCGCCGTCA AAGAAATGGA TTTGTCCGCC
151 yTAAAAGGAC GCAAAGCCGC CyTTTACGTC TCCGTTATGG GCGACCAAGG
201 TTCGGGCAAC ATAAGCGGCG GACGCTACTC TATCGACGCA CTGATACGCG
251 GCGGCTACCA CAACAACCCC GAAAGTGCCA CCCAATACAG CTACCCCGCC
301 TACGACACTA CCGCCACCAC CAAATCCGAC GCGCTCTCCA GCGTAACCAC
351 TTCCACATCG CTTTTGAACG CCCCCGCCGC CGyCyTGACG AAAAACAGCG
401 GACGCAAAGG CGAACGcTCC GCCGGACTGT CCGTCAACGG CACGGGCGAC
451 TACCGCAACG AAACCCTGCT CGCCAACCCC CGCGACGTTT CCTTCCTGAC
501 CAACCTCATC CAAACCGTCT TCTACCTGCG CGGCATCGAA GTCgTACCGC
551 CCGrATACGC CGACACCGAC GTATTCGTAA CCGTCGACGT A...
This corresponds to the amino acid sequence <SEQ ID 254; ORF83>:
1 ..TLLLFIPLVL TXCGTLTGIL AHGGGKRFAV EQELVAASSR AAVKEMDLSA
51 LKGRKAAXYV SVMGDQGSGN ISGGRYSIDA LIRGGYHNNP ESATQYSYPA
101 YDTTATTKSD ALSSVTTSTS LLNAPAAXLT KNSGRKGERS AGLSVNGTGD
151 YRNETLLANP RDVSFLTNLI QTVFYLRGIE VVPPXYADTD VFVTVDV..
进一步的工作揭示了完整的核苷酸序列<SEQ ID 255>:1 ATGAAAACCC TGCTCCTCCT CATCCCCCTC GTCCTCACAG CCTGCGGCAC51 ACTGACCGGC ATACCCGCCC ACGGCGGCGG CAAACGCTTT GCCGTCGAAC101 AAGAACTCGT CGCCGCATCG TCCCGCGCCG CCGTCAAAGA AATGGATTTG151 TCCGCCCTAA AAGGACGCAA AGCCGCCCTT TACGTCTCCG TTATGGGCGA201 CCAAGGTTCG GGCAACATAA GCGGCGGACG CTACTCTATC GACGCACTGA251 TACGCGGCGG CTACCACAAC AACCCCGAAA GTGCCACCCA ATACAGCTAC301 CCCGCCTACG ACACTACCGC CACCACCAAA TCCGACGCGC TCTCCAGCGT351 AACCACTTCC ACATCGCTTT TGAACGCCCC CGCCGCCGCC CTGACGAAAA401 ACAGCGGACG CAAAGGCGAA CGCTCCGCCG GACTGTCCGT CAACGGCACG451 GGCGACTACC GCAACGAAAC CCTGCTCGCC AACCCCCGCG ACGTTTCCTT501 CCTGACCAAC CTCATCCAAA CCGTCTTCTA CCTGCGCGGC ATCGAAGTCG551 TACCGCCCGA ATACGCCGAC ACCGACGTAT TCGTAACCGT CGACGTATTC601 GGCACCGTCC GCAGCCGTAC CGAACTGCAC CTCTACAACG CCGAAACCCT651 TAAAGCCCAA ACCAAGCTCG AATATTTCGC CGTTGACCGC GACAGCCGGA701 AACTGCTGAT TACCCCTAAA ACCGCCGCCT ACGAATCCCA ATACCAAGAA751 CAATACGCCC TTTGGACCGG CCCTTACAAA GTCAGCAAAA CCGTCAAAGC801 CTCAGACCGC CTGATGGTCG ATTTCTCCGA CATTACCCCC TACGGCGACA851 CAACCGCCCA AAACCGTCCC GACTTCAAAC AAAACAACGG TAAAAAACCC901 GATGTCGGCA ACGAAGTCAT CCGCCGCCGC AAAGGAGGAT AA
This corresponds to the amino acid sequence <SEQ ID 256; ORF83-1>:
1 MKTLLLLIPL VLTACGTLTG IPAHGGGKRF AVEQELVAAS SRAAVKEMDL
51 SALKGRKAAL YVSVMGDQGS GNISGGRYSI DALIRGGYHN NPESATQYSY
101 PAYDTTATTK SDALSSVTTS TSLLNAPAAA LTKNSGRKGE RSAGLSVNGT
151 GDYRNETLLA NPRDVSFLTN LIQTVFYLRG IEVVPPEYAD TDVFVTVDVF
201 GTVRSRTELH LYNAETLKAQ TKLEYFAVDR DSRKLLITPK TAAYESQYQE
251 QYALWTGPYK VSKTVKASDR LMVDFSDITP YGDTTAQNRP DFKQNNGKKP
301 DVGNEVIRRR KGG*
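Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF83 shows 96.4% identity over a 197 amino acid overlap with an ORF (ORF83a) from strain A of N. meningitidis: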
10 20 30 40 50orf83.pep TLLLFIPLVLTXCGTLTGILAHGGGKRFAVEQELVAASSRAAVKEMDLSALKGRKAAX
||| :|||||| ||||||| ||||||||||||||||||||||||||||||||||||||orf83a MKTLLXLIPLVLTACGTLTGIPAHGGGKRFAVEQELVAASSRAAVKEMDLSALKGRKAAL
10 20 30 40 50 60
60 70 80 90 100 110orf83.pep YVSVMGDQGSGNISGGRYSIDALIRGGYHNNPESATQYSYPAYDTTATTKSDALSSVTTS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf83a YVSVMGDQGSGNISGGRYSIDALIRGGYHNNPESATQYSYPAYDTTATTKSDALSSVTTS
70 80 90 100 110 120
120 130 140 150 160 170orf83.pep TSLLNAPAAXLTKNSGRKGERSAGLSVNGTGDYRNETLLANPRDVSFLTNLIQTVFYLRG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf83a TSLLNAPAAALTKNSGRKGERSAGLSVNGTGDYRNETLLANPRDVSFLTNLIQTVFYLRG
130 140 150 160 170 180
180 190orf83.pep IEVVPPXYADTDVFVTVDV
|||||| ||||||||||||orf83a IEVVPPEYADTDYFYTVDVFGTVRSRTELHLYNAETLKAQTKLEYFAVDRDSRKLLIAPK
190 200 210 220 230 240全长ORF83a核苷酸序列<SEQ ID 257>是:1 ATGAAAACCC TGCTCNTCCT CATCCCCCTC GTCCTCACAG CCTGCGGCAC51 ACTGACCGGC ATACCCGCCC ACGGCGGCGG CAAACGCTTT GCCGTCGAAC101 AAGAACTCGT CGCCGCATCG TCCCGCGCCG CCGTCAAAGA AATGGACTTG151 TCCGCCCTGA AAGGACGCAA AGCCGCCCTT TACGTCTCCG TTATGGGCGA201 CCAAGGTTCG GGCAACATAA GCGGCGGACG CTACTCTATC GACGCACTGA251 TACGCGGCGG CTACCACAAC AACCCCGAAA GTGCCACCCA ATACAGCTAC301 CCCGCCTACG ACACTACCGC CACCACCAAA TCCGACGCGC TCTCCAGCGT351 AACCACTTCC ACATCGCTTT TGAACGCCCC CGCCGCCGCC CTGACGAAAA401 ACAGCGGACG CAAAGGCGAA CGCTCCGCCG GACTGTCCGT CAACGGCACG451 GGCGACTACC GCAACGAAAC CCTGCTCGCC AACCCCCGCG ACGTTTCCTT501 CCTGACCAAC CTCATCCAAA CCGTCTTCTA CCTGCGCGGC ATCGAAGTCG551 TACCGCCCGA ATACGCCGAC ACCGACGTAT TCGTAACCGT CGACGTATTC601 GGCACCGTCC GCAGCCGCAC CGAACTGCAC CTCTACAACG CCGAAACCCT651 TAAAGCCCAA ACCAAGCTCG AATATTTCGC CGTTGACCGC GACAGCCGGA701 AACTGCTGAT TGCCCCTAAA ACCGCCGCCT ACGAATCCCA ATACCAAGAA751 CAATACGCCC TCTGGATGGG ACCTTACAGC GTCGGCAAAA CCGTCAAAGC801 CTCAGACCGC CTGATGGTCG ATTTCTCCGA CATCACCCCC TACGGCGACA851 CAACCGCCCA AAACCGTCCC GACTTCAAAC AAAACAACGG TAAAAAACCC901 GATGTCGGCA ACGAAGTCAT CCGCCGCCGC AAAGGAGGAT AA它编码的蛋白质具有氨基酸序列<SEQ ID 258>:1 MKTLLXLIPL VLTACGTLTG IPAHGGGKRF AVEQELVAAS SRAAVKEMDL51 SALKGRKAAL YVSVMGDQGS GNISGGRYSI DALIRGGYHN NPESATQYSY101 PAYDTTATTK SDALSSVTTS TSLLNAPAAA LTKNSGRKGE RSAGLSVNGT151 GDYRNETLLA NPRDVSFLTN LIQTVFYLRG IEVVPPEYAD TDVFVTVDVF201 GTVRSRTEIH LYNAETLKAQ TKLEYFAVDR DSRKLLIAPK TAAYESQYQE251 QYALWMGPYS VGKTVKASDR LMVDFSDITP YGDTTAQNRP DFKQNNGKKP301 DVGNEVIRRR KGG*ORF83a和ORF83-1在313个氨基酸的重叠区内有98.4%的相同性:
10 20 30 40 50 60orf83a.pep MKTLLXLIPLVLTACGTLTGIPAHGGGKRFAVEQELVAASSRAAVKEMDLSALKGRKAAL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf83-1 MKTLLLLIPLVLTACGTLTGIPAHGGGKRFAVEQELVAASSRAAVKEMDLSALKGRKAAL
10 20 30 40 50 60
70 80 90 100 110 120orf83a.pep YVSVMGDQGSGNISGGRYSIDALIRGGYHNNPESATQYSYPAYDTTATTKSDALSSVTTS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf83-1 YVSVMGDQGSGNISGGRYSIDALIRGGYHNNPESATQYSYPAYDTTATTKSDALSSVTTS
70 80 90 100 110 120
130 140 150 160 170 180orf83a.pep TSLLNAPAAALTKNSGRKGERSAGLSVNGTGDYRNETLLANPRDVSFLTNLIQTVFYLRG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf83-1 TSLLNAPAAALTKNSGRKGERSAGLSVNGTGDYRNETLLANPRDVSFLTNLIQTVFYLRG
130 140 150 160 170 180
190 200 210 220 230 240orf83a.pep IEVVPPEYADTDVFVTVDVFGTVRSRTELHLYNAETLKAQTKLEYFAVDRDSRKLLIAPK
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||:||orf83-1 IEVVPPEYADTDVFVTVDVFGTVRSRTELHLYNAETLKAQTKLEYFAVDRDSRKLLITPK
190 200 210 220 230 240
250 260 270 280 290 300orf83a.pep TAATESQYQEQYALWMGPYSVGKTVKASDRLMVDFSDITPYGDTTAQNRPDFKQNNGKKP
|||||||||||||| ||||:|:||||||||||||||||||||||||||||||||||||||orf83-1 TAAYESQYQEQYALWTGPYKVSKTVKASDRLMVDFSDITPYGDTTAQNRPDFKQNNGKKP
250 260 270 280 290 300
310orf83a.pep DVGNEVIRRRKGGX
||||||||||||||orf83-1 DVGNEVIRRRKGGX
310
Homology with a predicted ORF from N. gonorrhoeae
ORF83 shows 94.9% identity over a 197 amino acid overlap with a predicted ORF (ORF83.ng) from N. gonorrhoeae:
orf83.pep TLLLFIPLVLTXCGTLTGILAHGGGKRFAVEQELVAASSRAAVKEMDLSALKGRKAAX 58
||||:|||||| ||||||| ||||||||||||||||||||||||||||||||||||||orf83ng MKTLLLLIPLVLTACGTLTGIPAHGGGKRFAVEQELVAASSRAAVKEMDLSALKGRKAAL 60orf83.pep YVSVMGDQGSGNISGGRYSIDALIRGGYHNNPESATQYSYPAYDTTATTKSDALSSVTTS 118
||||||||||||||||||||||||||||||||:|||:||||||||||||||||||:||||orf83ng YVSVMGDQGSGNISGGRYSIDALIRGGYHNNPDSATRYSYPAYDTTATTKSDALSGVTTS 120orf83.pep TSLLNAPAAXLTKNSGRKGERSAGLSVNGTGDYRNETLLANPRDVSFLTNLIQTVFYLRG 178
||||||||| ||||:|||||||||||||||||||||||||||||||||||||||||||||orf83ng TSLLNAPAAALTKNNGRKGERSAGLSVNGTGDYRNETLLANPRDVSFLTNLIQTVFYLRG 180orf83.pep IEVVPPXYADTDVFVTVDV 197
|||||| ||||||||||||orf83ng IEVVPPEYADTDYFVTVDVFGTVRSRTELHLYNAETLKAQTKLEYFAVDRDSRKLLIAPK 240全长ORF83ng核苷酸序列<SEQ ID 259>是:1 ATGAAAACCC TGCTCCTCCT CATCCCCCTC GTACTCACCG CCTGCGGCAC51 ACTGACCGGC ATACCCGCCC ACGGCGGCGG CAAACGCTTT GCCGTCGAAC101 AGGAACTCGT CGCCGCATCG TCCCGCGCCG CCGTCAAAGA AATGGACTTG151 TCCGCCCTGA AAGGACGCAA AGCCGCCCTT TACGTCTCCG TTATGGGCGA201 CCAAGGTTCG GGCAACATAA GCGGCGGACG CTACTCCATC GACGCACTGA251 TACGCGGCGG CTACCACAAC AACCCCGACA GCGCCACCCG ATACAGCTAC301 CCCGCCTATG ACACTACCGC CACCACCAAA TCCGACGCGC TCTCCGGCGT351 AACCACTTCC ACATCGCTTT TGAACGCCCC CGCCGCCGCC CTGACGAAAA401 ACAACGGACG CAAAGGCGAA CGCTCCGCCG GACTGTCCGT CAACGGCACG451 GGCGACTACC GCAACGAAAC CCTGCTCGCC AACCCCCGCG ACGTTTCCTT501 CCTGACCAAC CTCATCCAAA CCGTCTTCTA CCTGCGCGGC ATCGAAGTCG551 TACCGCCCGA ATACGCCGAC ACCGACGTAT TCGTAACCGT CGACGTATTC601 GGCACCGTCC GCAGCCGTAC CGAACTGCAC CTCTACAACG CCGAAACCCT651 TAAAGCCCAA ACCAAGCTCG AATATTTCGC CGTCGACCGC GACAGCCGGA701 AACTGCTGAT TGCCCCTAAA ACCGCCGCCT ACGAATCCCA ATACCAAGAA751 CAATACGCCC TCTGGATGGG ACCTTACAGC GTCGGCAAAA CCGTCAAAGC801 CTCAGACCGC CTGATGGTCG ATTTCTCCGA CATCACCCCC TACGGCGACA851 CAACCGCCCA AAACCGTCCC GACTTCAAAC AAAACAACGG TAAAAACCCC901 GATGTCGGCA ACGAAGTCAT CCGCCGCCGC AAAGGAGGAT AA
This encodes a protein having the amino acid sequence <SEQ ID 260>:
1 MKTLLLLIPL VLTACGTLTG IPAHGGGKRF AVEQELVAAS SRAAVKEMDL
51 SALKGRKAAL YVSVMGDQGS GNISGGRYSI DALIRGGYHN NPDSATRYSY
101 PAYDTTATTK SDALSGVTTS TSLLNAPAAA LTKNNGRKGE RSAGLSVNGT
151 GDYRNETLLA NPRDVSFLTN LIQTVFYLRG IEVVPPEYAD TDVFVTVDVF
201 GTVRSRTELH LYNAETLKAQ TKLEYFAVDR DSRKLLIAPK TAAYESQYQE
251 QYALWMGPYS VGKTVKASDR LMVDFSDITP YGDTTAQNRP DFKQNNGKNP
301 DVGNEVIRRR KGG*
ORF83ng and ORF83-1 show 97.1% identity over a 313 amino acid overlap:
10 20 30 40 50 60orf83-1.pep MKTLLLLIPLVLTACGTLTGIPAHGGGKRFAVEQELVAASSRAAVKEMDLSALKGRKAAL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf83ng MKTLLLLIPLVLTACGTLTGIPAHGGGKRFAVEQELVAASSRAAVKEMDLSALKGRKAAL
10 20 30 40 50 60
70 80 90 100 110 120orf83-1.pep YVSVMGDQGSGNISGGRYSIDALIRGGYHNNPESATQYSYPAYDTTATTKSDALSSVTTS
||||||||||||||||||||||||||||||||:|||:||||||||||||||||||:||||orf83ng YVSVMGDQGSGNISGGRYSIDALIRGGYHNNPDSATRYSYPAYDTTATTKSDALSGVTTS
70 80 90 100 110 120
130 140 150 160 170 180orf83-1.pep TSLLNAPAAALTKNSGRKGERSAGLSVNGTGDYRNETLLANPRDVSFLTNLIQTVFYLRG
||||||||||||||:|||||||||||||||||||||||||||||||||||||||||||||orf83ng TSLLNAPAAALTKNNGRKGERSAGLSVNGTGDYRNETLLANPRDVSFLTNLIQTVFYLRG
130 140 150 160 170 180
190 200 210 220 230 240orf83-1.pep IEVVPPEYADTDVFVTVDVFGTVRSRTELHLYNAETLKAQTKLEYFAVDRDSRKLLITPK
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||:||orf83ng IEVVPPEYADTDVFVTVDVFGTVRSRTELHLYNAETLKAQTKLEYFAVDRDSRKLLIAPK
190 200 210 220 230 240
250 260 270 280 290 300orf83-1.pep TAAYESQYQEQYALWTGPYKVSKTVKASDRLMVDFSDITPYGDTTAQNRPDFKQNNGKKP
||||||||||||||| |||:|:||||||||||||||||||||||||||||||||||||:|orf83ng TAAYESQYQEQYALWMGPYSVGKTVKASDRLMVDFSDITPYGDTTAQNRPDFKQNNGKNP
250 260 270 280 290 300
310orf83-1.pep DVGNEVIRRRKGGX
||||||||||||||orf83ng DVGNEVIRRRKGGX
310
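Based on this analysis, including the presence of a putative ATP/GTP-binding site motif A (P-loop) (double-underlined) and a putative prokaryotic membrane lipoprotein lipid attachment site (single-underlined) in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.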
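The P-loop motif is conventionally described by the PROSITE pattern PS00017, [AG]-x(4)-G-K-[ST], which maps directly onto a regular expression. The sketch below is a hedged illustration: the helper name `find_p_loops` is hypothetical, the scan uses only residues 251-270 of each protein, and the underlining that marked the motif in the original document has not survived, so the position reported here comes from the pattern alone.

```python
import re

# PROSITE PS00017 (ATP/GTP-binding site motif A): [AG]-x(4)-G-K-[ST].
# The lookahead lets overlapping occurrences be reported as well.
P_LOOP = re.compile(r"(?=([AG].{4}GK[ST]))")


def find_p_loops(protein: str, offset: int = 1) -> list[tuple[int, str]]:
    """Return (start position, matched residues) for every P-loop hit."""
    return [
        (match.start() + offset, match.group(1))
        for match in P_LOOP.finditer(protein)
    ]


# Residues 251-270 of the gonococcal and meningococcal proteins.
orf83ng_excerpt = "QYALWMGPYSVGKTVKASDR"  # SEQ ID 260
orf83_1_excerpt = "QYALWTGPYKVSKTVKASDR"  # SEQ ID 256

print(find_p_loops(orf83ng_excerpt, offset=251))  # [(257, 'GPYSVGKT')]
print(find_p_loops(orf83_1_excerpt, offset=251))  # []
```

Within this excerpt the pattern matches only the gonococcal sequence, where Gly-262 replaces the Ser found in ORF83-1.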
Example 38
在脑膜炎奈瑟球菌中鉴定出下列认为是完整的DNA序列<SEQ ID 261>:1 ATGGCAGAGA TCTGTTTGAT AACCGGCACG CCCGGTTCAG GGAAAACATT51 AAAAATGGTT TCCATGATGG CGAATGATGA AATGTTTAAG CCTGATGAAA101 AAGCCATACG CCGTAAAGTA TTTACGAACA TAAAAGGCTT GAAAATACCG151 CACACCTACA TAGAAACGGA CGCAAAAAAG CTGCCGAAAT CGACAGATGA 201 GCAGCTTTCG GCGCATGATA TGTACGAATG GATAAAGAAG CCCGAAAATA251 TCGGGTCTAT TGTCATTGTA GATGAAGCTC AAGACGTATG GCCGGCACGC301 TCGGCAGGTT CAAAAATCCC TGAAAATGTC CAATGGCTGA ATACGCACAG351 ACATCAGGGC ATTGATATAT TTGTTTTGAC TCAAGGTCCT AAGCTTCTAG401 ATCAAAATCT TAGAACGCTT GTACGGAAAC ATTACCACAT CGCTTCAAAC451 AAGATGGGTA TGCGTACGCT TTTAGAATGG AAAATATGCG CGGACGATCC501 CGTAAAAATG GCATCAAGCG CATTCTCCAG TATCTATACA CTGGATAAAA551 AAGTTTATGA CTTGTAysrr TmmGCGGAAG TTCATACCGT AAATAAGGTC601 AAGCGGTCAA AGTGGTTTTA CACTCTGCCa GTAATAGTAT TGCTGATTCC651 CGTGTTTGTC GGCCTGTCCT ATAAAATGTT GagCaGTTAC GGAAAAAAAC701 aGGAAGAACC CGCAGCACAA GAATCGGCGG CAACAGAACA GCAGGCAGTA751 CTTCCGGATA AAACAGAAGG CGAGCCGGTA AATAACGGCA ACCTTACCGC801 AGATATGTTT GTTCCGACAT TGTCCGAaAA ACCCGrAAGC AAGCcgaTTT851 ATAACGGTGT AAGGCAGGTA AGAACCTTTG AATATATAGC AGGCTGTATA901 GAAGGCGGAA GAACCGGATG CGCCTGCTAT TCGCaTCAAG GGACGGCATt951 gaAAGAAGTG ACGGaGTTGA TGTGccaAgG aCTATGTaAA AAacGGCTTG1001 CCGTTTAACC CaTACAAAGA AGAAAGCCAA GGGCAGGAAG TTCAGCAAAG1051 CGCGCAgCAA CATTCGGACA GGGCGcCAAG TTGCCACATT GGGCGGAAAA1101 CCGTAGCAGA ACCTAATGTA CGATAATTGG GAAGAACGCG GGAAACCGTT1151 TGAAGGAATC GGaCGGGGGC GTGGTCGGAT CGGCAAACTG A它对应于氨基酸序列<SEQ ID 262;ORF84>:1 MAEICLITGT PGSGKTLKMV SMMANDEMFK PDEKAIRRKV FTNIKGLKIP51 HTYIETDAKK LPKSTDEQLS AHDMYEWIKK PENIGSIVIV DEAQDVWPAR101 SAGSKIPENV QWLNTHRHQG IDIFVLTQGP KLLDQNLRTL VRKHYHIASN151 KMGMRTLLEW KICADDPVKM ASSAFSSIYT LDKKVYDLYX XAEVHTVNKV201 KRSKWFYTLP VIVLLIPVFV GLSYKMLSSY GKKQEEPAAQ ESAATEQAV251 LPDKTEGEPV NNGNLTADMF VPTLSEKPXS KPIYNGVRQV RTFEYIAGCI301 EGGRTGCACY SHQGTALKEV TELMCKDYVK NGLPFNPYKE ESQGQEVQQS351 AQQHSDRAQV ATLGGKPXQN LMYDNWEERG KPFEGIGGGV VGSAN*进一步的工作揭示了完整的核苷酸序列<SEQ ID 263>:1 ATGGCAGAGA TCTGTTTGAT AACCGGCACG CCCGGTTCAG GGAAAACATT51 
AAAAATGGTT TCCATGATGG CGAATGATGA AATGTTTAAG CCTGATGAAA101 ACGGCATACG CCGTAAAGTA TTTACGAACA TAAAAGGCTT GAAAATACCG151 CACACCTACA TAGAAACGGA CGCAAAAAAG CTGCCGAAAT CGACAGATGA201 GCAGCTTTCG GCGCATGATA TGTACGAATG GATAAAGAAG CCCGAAAATA251 TCGGGTCTAT TGTCATTGTA GATGAAGCTC AAGACGTATG GCCGGCACGC301 TCGGCAGGTT CAAAAATCCC TGAAAATGTC CAATGGCTGA ATACGCACAG351 ACATCAGGGC ATTGATATAT TTGTTTTGAC TCAAGGTCCT AAGCTTCTAG401 ATCAAAATCT TAGAACGCTT GTACGGAAAC ATTACCACAT CGCTTCAAAC451 AAGATGGGTA TGCGTACGCT TTTAGAATGG AAAATATGCG CGGACGATCC501 CGTAAAAATG GCATCAAGCG CATTCTCCAG TATCTATACA CTGGATAAAA551 AAGTTTATGA CTTGTACGAA TCAGCGGAAG TTCATACCGT AAATAAGGTC601 AAGCGGTCAA AGTGGTTTTA CACTCTGCCA GTAATAGTAT TGCTGATTCC651 CGTGTTTGTC GGCCTGTCCT ATAAAATGTT GAGCAGTTAC GGAAAAAAAC701 AGGAAGAACC CGCAGCACAA GAATCGGCGG CAACAGAACA GCAGGCAGTA751 CTTCCGGATA AAACAGAAGG CGAGCCGGTA AATAACGGCA ACCTTACCGC801 AGATATGTTT GTTCCGACAT TGTCCGAAAA ACCCGAAAGC AAGCCGATTT851 ATAACGGTGT AAGGCAGGTA AGAACCTTTG AATATATAGC AGGCTGTATA901 GAAGGCGGAA GAACCGGATG CGCCTGCTAT TCGCATCAAG GGACGGCATT951 GAAAGAAGTG ACGGAGTTGA TGTGCAAGGA CTATGTAAAA AACGGCTTGC1001 CGTTTAACCC ATACAAAGAA GAAAGCCAAG GGCAGGAAGT TCAGCAAAGC1051 GCGCAGCAAC ATTCGGACAG GGCGCAAGTT GCCACATTGG GCGGAAAACC1101 GTAGCAGAAC CTAATGTACG ATAATTGGGA AGAACGCGGG AAACCGTTTG1151 AAGGAATCGG CGGGGGCGTG GTCGGATCGG CAAACTGA它对应于氨基酸序列<SEO ID 264;ORF84-1>:1 MAEICLITGT PGSGKTLKMV SMMANDFMFK PDENGIRRKV FTNIKGLKIP51 HTYIETDAKK LPKSTDEQLS AHDMYEWIKK PENIGSIVIV DEAQDVWPAR101 SAGSKIPENV QWLNTHRHQG IDIFVLTQGP KLLDQNLRTL VRKHYHIASN151 KMGMRTLLEW KICADDPVKM ASSAFSSIYT LDKKVYDLYE SAEVHTVNKV201 KRSKWFYTLP VIVLLIPVFV GLSYKMLSSY GKKQEEPAAQ ESAATEQQAV251 LPDKTEGEPV NNGNLTADMF VPTLSEKPES KPIYNGVRQV RTFEYIAGCI301 EGGRTGCACY SHQGTALKEV TELMCKDYVK NGLPFNPYKE ESQGQEVQQS351 AQQHSDRAQV ATLGGKP*QN LMYDNWEERG KPFEGIGGGV VGSAN*
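The correspondence between the nucleotide listings and the amino acid listings above can be checked with a short translation routine. The following is a minimal sketch, assuming the standard genetic code; the function names `clean` and `translate` are illustrative and do not come from the specification. Codons that contain ambiguity codes or placeholder dots are reported as `X`, and stop codons as `*`, mirroring the notation of the listings.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def clean(seq):
    """Keep letters only, dropping position numbers and spaces from a listing."""
    return "".join(ch for ch in seq.upper() if ch.isalpha())


def translate(dna):
    """Translate frame 1 of a DNA string; unknown or ambiguous codons give 'X'."""
    dna = clean(dna)
    protein = []
    for i in range(0, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


# First 30 nucleotides of SEQ ID 263 as printed in the listing.
print(translate("ATGGCAGAGA TCTGTTTGAT AACCGGCACG"))
```

Because `clean` removes every non-letter, the dot placeholders used in the partial sequences would be dropped rather than kept in frame; a listing with gaps needs the dots replaced by `N` before translation.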
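Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF84 shows 93.9% identity over a 395 aa overlap with an ORF (ORF84a) from strain A of N. meningitidis: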
10 20 30 40 50 60orf84.pep MAEICLITGTPGSGKTLKMVSMMANDEMFKPDEKAIRRKVFTNIKGLKIPHTYIETDAKK
|||||||||||||||||||||||||||||||||::|||||||||||||||||||||||||orf84a MAEICLITGTPGSGKTLKMVSMMANDEMFKPDENGIRRKVFTNIKGLKIPHTYIETDAKK
10 20 30 40 50 60
70 80 90 100 110 120orf84.pep LPKSTDEQLSAHDMYEWIKKPENIGSIVIVDEAQDVWPARSAGSKIPENVQWLNTHRHQG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf84a LPKSTDEQLSAHDMYEWIKKPENIGSIVIVDEAQDVWPARSAGSKIPENVQWLNTHRHQG
70 80 90 100 110 120
130 140 150 160 170 180orf84.pep IDIFVLTQGPKLLDQNLRTLVRKHYHIASNKMGMRTLLEWKICADDPVKMASSAFSSIYT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf84a IDIFVLTQGSKLLDQNLRTLVRKHYHIASNKMGMRTLLEWKICADDPVKMASSAFSSIYT
130 140 150 160 170 180
190 200 210 220 230 240orf84.pep LDKKVYDLYXXAEVHTVNKVKRSKWFYTLPVIVLLIPVFVGLSYKMLSSYGKKQEEPAAQ
||||||||| |||||||||||||||||||||:|||||||||||||||||||||||||||orf84a LDKKVYDLYESAEVHTVNKVKRSKWFYTLPVIILLIPVFVGLSYKMLSSYGKKQEEPAAQ
190 200 210 220 230 240
250 260 270 280 290 300orf84.pep ESAATEQQAVLPDKTEGEPVNNGNLTADMFVPTLSEKPXSKPIYNGVRQVRTFEYIAGCI
||||||:|||:||||||||||||||||||||||||||| ||||||||||||||||||||:orf84a ESAATEHQAVFQDKTEGEPVNNGNLTADMFVPTLSEKPESKPIYNGVRQVRTFEYIAGCV
250 260 270 280 290 300
310 320 330 340 350 360orf84.pep EGGRTGCACYSHQGTALKEVTELMCKDYVKNGLPFNPYKEESQGQEVQQSAQQHSDRAQV
|||||||:|||||||||||:|: |||||::||||||||||||||::|||| |:|||| ||orf84a EGGRTGCTCYSHQGTALKEITKEMCKDYARNGLPFNPYKEESQGRDVQQSEQHHSDRPQV
310 320 330 340 350 360
370 380 390orf84.pep ATLGGKPXQNLMYDNWEERGKPFEGIGGGVVGSANX
||||||| ||||||||:|||||||||||||||||||orf84a ATLGGKPWQNLMYDNWQERGKPFEGIGGGVVGSANX
370 380 390全长ORF84a核苷酸序列<SEQ ID 265>是:1 ATGGCAGAGA TCTGTTTGAT AACCGGCACG CCCGGTTCAG GGAAAACATT51 AAAAATGGTT TCCATGATGG CAAACGATGA AATGTTTAAG CCGGATGAAA101 ACGGCATACG CCGTAAAGTA TTTACGAACA TCAAAGGCTT GAAGATACCG151 CACACCTACA TAGAAACGGA CGCGAAAAAG CTGCCGAAAT CGACAGATGA201 GCAGCTTTCG GCGCATGATA TGTACGAATG GATAAAGAAG CCCGAAAATA251 TCGGGTCTAT TGTCATTGTA GATGAAGCTC AAGACGTATG GCCGGCACGC301 TCGGCAGGTT CAAAAATCCC TGAAAATGTC CAATGGCTGA ATACGCACAG 351 ACATCAGGGC ATTGATATAT TTGTTTTGAC TCAAGGCTCT AAGCTTCTAG401 ATCAAAATCT TAGAACGCTT GTACGGAAAC ATTACCACAT CGCTTCAAAC451 AAGATGGGTA TGCGTACGCT TTTAGAATGG AAAATATGCG CGGACGATCC501 CGTAAAAATG GCATCAAGCG CATTCTCCAG TATCTATACA CTGGATAAAA551 AAGTTTATGA CTTGTACGAA TCAGCGGAAG TTCATACCGT AAATAAGGTC601 AAGCGGTCAA AATGGTTTTA TACTCTGCCA GTAATAATAT TGCTGATTCC651 CGTTTTTGTC GGCCTGTCCT ATAAAATGTT AAGTAGTTAT GGAAAAAAAC701 AGGAAGAACC CGCAGCACAA GAATCGGCGG CAACAGAACA TCAGGCAGTA751 TTTCAGGATA AAACAGAAGG CGAGCCGGTA AACAACGGTA ACCTTACCGC801 AGATATGTTT GTTCCGACAT TGTCCGAAAA ACCCGAAAGC AAGCCGATTT851 ATAACGGTGT AAGGCAGGTA AGAACCTTTG AATATATAGC AGGCTGTGTA901 GAAGGCGGAA GAACCGGATG CACATGCTAT TCGCATCAAG GGACGGCATT951 GAAAGAAATT ACAAAGGAAA TGTGCAAGGA TTACGCAAGA AACGGATTGC1001 CGTTTAACCC ATATAAAGAA GAAAGCCAAG GGCGGGATGT CCAGCAAAGT1051 GAGCAGCACC ATTCGGACAG ACCGCAAGTT GCCACGTTGG GCGGAAAGCC1101 GTGGCAAAAT CTTATGTATG ATAATTGGCA GGAGCGCGGA AAACCGTTTG1151 AAGGAATCGG CGGGGGCGTG GTCGGATCGG CAAACTGA它编码的蛋白质具有氨基酸序列<SEQ ID 266>:1 MAEICLITGT PGSGKTLKMV SMMANDEMFK PDENGIRRKV FTNIKGLKIP51 HTYIETDAKK LPKSTDEQLS AHDMYEWIKK PENIGSIVIV DEAQDVWPAR101 SAGSKIPENV QWLNTHRHQG IDIFVLTQGS KLLDQNLRTL VRKHYHIASN151 KMGMRTLLEW KICADDPVKM ASSAFSSIYT LDKKVYDLYE SAEVHTVNKV201 KRSKWFYTLP VIILLIPVFV GLSYKMLSSY GKKQEEPAAQ ESAATEHQAV251 FQDKTEGEPV NNGNLTADMF VPTLSEKPES KPIYNGVRQV RTFEYIAGCV301 EGGRTGCTCY SHQGTALKEI TKEMCKDYAR NGLPFNPYKE ESQGRDVQQS351 EQHHSDRPQV ATLGGKPWQN LMYDNWQERG KPFEGIGGGV VGSAN*ORF84a和ORF84-1在395个氨基酸的重叠区内有95.2%的相同性:
10 20 30 40 50 60orf84a.pep MAEICLITGTPGSGKTLKMVSMMANDEMFKPDENGIRRKVFTNIKGLKIPHTYIETDAKK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf84-1 MAEICLITGTPGSGKTLKMVSMMANDEMFKPDENGIRRKVFTNIKGLKIPHTYIETDAKK
10 20 30 40 50 60
70 80 90 100 110 120orf84a.pep LPKSTDEQLSAHDMYEWIKKPENIGSIVIVDEAQDVWPARSAGSKIPENVQWLNTHRHQG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf84-1 LPKSTDEQLSAHDMYEWIKKPENIGSIVIVDEAQDVWPARSAGSKIPENVQWLNTHRHQG
70 80 90 100 110 120
130 140 150 160 170 180orf84a.pep IDIFVLTQGSKLLDQNLRTLVRKHYHIASNKMGMRTLLEWKICADDPVKMASSAFSSIYT
||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||orf84-1 IDIFVLTQGPKLLDQNLRTLVRKHYHIASNKMGMRTLLEWKICADDPVKMASSAFSSIYT
130 140 150 160 170 180
190 200 210 220 230 240orf84a.pep LDKKVYDLYESAEVHTVNKVKRSKWFYTLPVIILLIPVFVGLSYKMLSSYGKKQEEPAAQ
||||||||||||||||||||||||||||||||:|||||||||||||||||||||||||||orf84-1 LDKKVYDLYESAEVHTVNKVKRSKWFYTLPVIVLLIPVFVGLSYKMLSSYGKKQEEPAAQ
190 200 210 220 230 240
250 260 270 280 290 300orf84a.pep ESAATEHQAVFQDKTEGEPVNNGNLTADMFVPTLSEKPESKPIYNGVRQVRTFEYIAGCV
||||||:|||: |||||||||||||||||||||||||||||||||||||||||||||||:orf84-1 ESAATEQQAVLPDKTEGEPVNNGNLTADMFVPTLSEKPESKPIYNGVRQVRTFEYIAGCI
250 260 270 280 290 300
310 320 330 340 350 360orf84a.pep EGGRTGCTCYSHQGTALKEITKEMCKDYARNGLPFNPYKEESQGRDVQQSEQHHSDRPQV
|||||||:|||||||||||:|: |||||::||||||||||||||::|||| |:|||| ||orf84-1 EGGRTGCACYSHQGTALKEVTELMCKDYVKNGLPFNPYKEESQGQEVQQSAQQHSDRAQV
310 320 330 340 350 360
370 380 390orf84a.pep ATLGGKPWQNLMYDNWQERGKPFEGIGGGVVGSANX
||||||| ||||||||:|||||||||||||||||||orf84-1 ATLGGKPXQNLMYDNWEERGKPFEGIGGGVVGSANX
370 380 390
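The identity figures quoted for these alignments are ratios of identical positions to aligned positions. The sketch below assumes a simple definition (identical columns divided by all aligned columns, gaps included in the denominator); the alignment program used for the specification may count gap columns differently, and the function name `percent_identity` is illustrative only. The two strings are residues 241 to 300 of ORF84a and ORF84-1 as printed in the alignment above.

```python
def percent_identity(a, b):
    """Return (identical columns, aligned columns, percent) for two aligned rows."""
    if len(a) != len(b):
        raise ValueError("aligned rows must have the same length")
    if not a:
        raise ValueError("aligned rows must not be empty")
    # A column counts as identical only when both rows carry the same residue.
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return matches, len(a), 100.0 * matches / len(a)


orf84a = "ESAATEHQAVFQDKTEGEPVNNGNLTADMFVPTLSEKPESKPIYNGVRQVRTFEYIAGCV"
orf84_1 = "ESAATEQQAVLPDKTEGEPVNNGNLTADMFVPTLSEKPESKPIYNGVRQVRTFEYIAGCI"

matches, columns, pct = percent_identity(orf84a, orf84_1)
print(f"{matches}/{columns} identical ({pct:.1f}%)")
```

Applying the same function to each 60-residue block and summing the counts gives the identity over the full overlap.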
Homology with a predicted ORF from N. gonorrhoeae
ORF84 shows 94.2% identity over a 395 aa overlap with a predicted ORF (ORF84.ng) from N. gonorrhoeae:
orf84.pep MAEICLITGTPGSGKTLKMVSMMANDEMFKPDEKAIRRKVFTNIKGLKIPHTYIETDAKK 60
|||||||||||||||||||||||||||||||||:::||||||||||||||||:|||||||orf84ng MAEICLITGTPGSGKTLKMVSMMANDEMFKPDENGVRRKVFTNIKGLKIPHTHIETDAKK 60orf84.pep LPKSTDEQLSAHDMYEWIKKPENIGSIVIVDEAQDVWPARSAGSKIPENVQWLNTHRHQG 120
|||||||||||||||||||||||:|:||||||||||||||||||||||||||||||||||orf84ng LPKSTDEQLSAHDMYEWIKKPENVGAIVIVDEAQDVWPARSAGSKIPENVQWLNTHRHQG 120orf84.pep IDIFVLTQGPKLLDQNLRTLVRKHYHIASNKMGMRTLLEWKICADDPVKMASSAFSSIYT 180
|||||||||||||||||||||::|||||:||||:|||||||:||||||||||||||||||orf84ng IDIFVLTQGPKLLDQNLRTLVKRHYHIAANKMGLRTLLEWKVCADDPVKMASSAFSSIYT 180orf84.pep LDKKVYDLYXXAEVHTVNKVKRSKWFYTLPVIVLLIPVFVGLSYKMLSSYGKKQEEPAAQ 240
||||||||| ||:|||||||||||||:||||:||||:|||||||||:||||||||||||orf84ng LDKKVYDLYESAEIHTVNKVKRSKWFYALPVIILLIPLFVGLSYKMLGSYGKKQEEPAAQ 240orf84.pep ESAATEQQAVLPDKTEGEPVNNGNLTADMFVPTLSEKPXSKPIYNGVRQVRTFEYIAGCI 300
|||||||||||||||||| ||||||||||||||| ||| |||||||||||||||||||||orf84ng ESAATEQQAVLPDKTEGESVNNGNLTADMFVPTLPEKPESKPIYNGVRQVRTFEYIAGCI 300orf84.pep EGGRTGCACYSHQGTALKEVTELMCKDYVKNGLPFNPYKEESQGQEVQQSAQQHSDRAQV 360
|||||||:||||||||||||||||||||||||||||||||||||||||||||||||||||orf84ng EGGRTGCTCYSHQGTALKEVTELMCKDYVKNGLPFNPYKEESQGQEVQQSAQQHSDRAQV 360orf84.pep ATLGGKPXQNLMYDNWEERGKPFEGIGGGVVGSAN 395
||||||| |||||||||||||||||||||||||||orf84ng ATLGGKPQQNLMYDNWEERGKPFEGIGGGVVGSAN 395全长ORF84ng核苷酸序列<SEQ ID 267>是:1 ATGGCAGAAA TCTGTTTGAT AACCGGCACG CCCGGTTCAG GGAAAACATT51 AAAAATGGTT TCCATGATGG CAAACGATGA AATGTTTAAG CCAGATGAAA101 ACGGCGTACG CCGTAAAGTA TTTACGAACA TCAAAGGTTT GAAGATACCG151 CACACCCACA TAGAAACAGA CGCAAAGAAG CTGCCGAAAT CAACCGATGA201 ACAGCTTTCG GCGCATGATA TGTATGAATG GATCAAGAAG CCTGAAAacg251 tcggcgCAAT CGTTATTGTC GATGAGGCGC AAGACGTATG GCCCGCACGC301 TccgCAGGTT CGAAAATCCC CGAAAACGTC CAATGGCTGA ACACACACAG351 GCATCAGGGC ATAGATATAT TTGTATTGAC ACAAGGTCCT AAACTCTTAG401 ATCAGAACTT GCGAACATTG GTTAAAAGAC ATTACCACAT TGCGGCCAAC451 AAAATGGGTT TGCGTACCCT GCTTGAATGG AAAGTATGCG CGGATGACCC501 GGTAAAAATG GCATCAAGTG CATTTTCCAG TATCTACACA CTGGATAAAA551 AAGTTTATGA CTTGTACGAA TCCGCAGAAA TTCACACGGT AAACAAAGTC601 AAGCGTTCAA AATGGTTTTA TGCATTGCCC GTCATCATAT TATTGATTCC651 GCTATTTGTC GGTTTGTCTT ACAAAATGTT GGGCAGTTAC GGAAAAAAAC701 AGGAAGAACC CGCAGCACAA GAATCGGCGG CAACAGAACA GCAGGCAGTA751 CTTCCGGATA AAACAGAAGG AGAATCGGTG AATAACGGAA ACCTTACGGC801 AGATATGTTT GTTCCGACAT TGCCCGAAAA ACCCGAAAGC AAGCCGATTT 851 ATAACGGTGT AAGGCAGGTA AGGACCTTTG AATATATAGC AGGCTGTATA901 GAAGGCGGAA GAACCGGATG CACCTGCTAT TCGCATCAAG GGACGGCATT951 GAAAGAAGTG ACGGAGTTGA TGTGCAAGGA CTATGTAAAA AACGGCTTGC1001 CGTTTAACCC ATACAAAGAA GAAAGCCAAG GGCAGGAAGT TCAGCAAAGC1051 GCGCAGCAAC ATTCGGACAG GGCGCAAGTT GCCACCTTGG GCGGAAAACC1101 GCAGCAGAAC CTAATGTACG ACAATTGGGA AGAACGCGGG AAACCGTTTG1151 AAGGAATCGG CGGGGGCGTG GTCGGATCGG CAAACTGA它编码的蛋白质具有氨基酸序列<SEQ ID 268>:1 MAEICLITGT PGSGKTLKMV SMMANDEMFK PDENGVRRKV FTNIKGLKIP51 HTHIETDAKK LPKSTDEQLS AHIMYEWIKK PENVGAIVIV DEAQDYWPAR101 SAGSKIPENV QWLNTHRHQG IDIFVLTQGP KLLDQNLRTL VKRHYHIAAN151 KMGLRTLLEW KVCADDPVKM ASSAFSSIYT LDKKVYDLYE SAEIHTVNKV201 KRSKWFYALP VIILLIPLFV GLSYKMLGSY GKKQEEPAAQ ESAATEQQAV251 LPDKTEGESV NNGNLTADMF VPTLPEKPES KPIYNGVRQV RTFEYIAGCI301 EGGRTGCTCY SHQGTALKEV TELMCKDYVK NGLPFNPYKE ESQGQEVQQS351 AQQHSDRAQV ATLGGKPQQN LMYDNWEERG KPFEGIGGGV 
VGSAN*
ORF84ng and ORF84-1 show 95.4% identity over a 395 aa overlap:
10 20 30 40 50 60orf84-1.pep MAEICLITGTPGSGKTLKMVSMMANDEMFKPDENGIRRKVFTNIKGLKIPHTYIETDAKK
|||||||||||||||||||||||||||||||||||:||||||||||||||||:|||||||orf84ng MAEICLITGTPGSGKTLKMVSMMANDEMFKPDENGVRRKVFTNIKGLKIPHTHIETDAKK
10 20 30 40 50 60
70 80 90 100 110 120orf84-1.pep LPKSTDEQLSAHDMYEWIKKPENIGSIVIVDEAQDVWPARSAGSKIPENVQWLNTHRHQG
|||||||||||||||||||||||:|:||||||||||||||||||||||||||||||||||orf84ng LPKSTDEQLSAHDMYEWIKKPENVGAIVIVDEAQDVWPARSAGSKIPENVQWLNTHRHQG
70 80 90 100 110 120
130 140 150 160 170 180orf84-1.pep IDIFVLTQGPKLLDQNLRTLVRKHYHIASNKMGMRTLLEWKICADDPVKMASSAFSSIYT
|||||||||||||||||||||::|||||:||||:|||||||:||||||||||||||||||orf84ng IDIFVLTQGPKLLDQNLRTLVKRHYHIAANKMGLRTLLEWKVCADDPVKMASSAFSSIYT
130 140 150 160 170 180
190 200 210 220 230 240orf84-1.pep LDKKVYDLYESAEVHTVNKVKRSKWFYTLPVIVLLIPVFVGLSYKMLSSYGKKQEEPAAQ
|||||||||||||:|||||||||||||:||||:||||:|||||||||:||||||||||||orf84ng LDKKVYDLYESAEIHTVNKVKRSKWFYALPVIILLIPLFVGLSYKMLGSYGKKQEEPAAQ
190 200 210 220 230 240
250 260 270 280 290 300orf84-1.pep ESAATEQQAVLPDKTEGEPVNNGNLTADMFVPTLSEKPESKPIYNGVRQVRTFEYIAGCI
|||||||||||||||||| |||||||||||||||||||||||||||||||||||||||||orf84ng ESAATEQQAVLPDKTEGESVNNGNLTADMFVPTLPEKPESKPIYNGVRQVRTFEYIAGCI
250 260 270 280 290 300
310 320 330 340 350 360orf84-1.pep EGGRTGCACYSHQGTALKEVTELMCKDYVKNGLPFNPYKEESQGQEVQQSAQQHSDRAQV
|||||||:||||||||||||||||||||||||||||||||||||||||||||||||||||orf84ng EGGRTGCTCYSHQGTALKEVTELMCKDYVKNGLPFNPYKEESQGQEVQQSAQQHSDRAQV
310 320 330 340 350 360
370 380 390orf84-1.pep ATLGGKPXQNLMYDNWEERGKPFEGIGGGVVGSANX
||||||| ||||||||||||||||||||||||||||orf84ng ATLGGKPQQNLMYDNWEERGKPFEGIGGGVVGSANX
370 380 390
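Based on this analysis, including the presence of a putative transmembrane domain (single-underlined) in the gonococcal protein and a putative ATP/GTP-binding site motif A (P-loop, double-underlined), it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.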
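The ATP/GTP-binding site motif A (P-loop) mentioned above is commonly described by the PROSITE pattern PS00017, `[AG]-x(4)-G-K-[ST]`. The sketch below scans a protein for that pattern with a regular expression. It is an illustration only: the specification does not state which tool was used, and the function name `find_p_loops` is hypothetical. The test string is the first 20 residues of ORF84-1 as listed above.

```python
import re

# PROSITE PS00017, [AG]-x(4)-G-K-[ST], written as a lookahead so that
# overlapping occurrences are also reported.
P_LOOP = re.compile(r"(?=([AG].{4}GK[ST]))")


def find_p_loops(protein):
    """Return (1-based start position, matched residues) for each P-loop hit."""
    return [(m.start() + 1, m.group(1)) for m in P_LOOP.finditer(protein.upper())]


n_terminus = "MAEICLITGTPGSGKTLKMV"  # residues 1-20 of ORF84-1
for start, motif in find_p_loops(n_terminus):
    print(start, motif)
```

A short pattern like this one matches many unrelated proteins by chance, so a hit is a hint for further analysis and not evidence of nucleotide binding on its own.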
Example 39
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 269>:1 GTGGTTTTCC TGAATGCCGA CAACGGGATA TTGGTTCAGG ACTTGCCTTT51 TGAAGTCAAA CTGAAAAAAT TCCATATCGA TTTTTACAAT ACGGGTATGC101 CGCGTGATTT CGCCAGCGAT ATTGAAGTGA CGGACAAGGC AACCGGTGAG151 AAACTCGAGC GCACCATCCG CGTGAACCAT CCTTTGACCT TGCACGGCAT201 CACGATTTAT CAGGCGAGTT TTGCCGACGG CGGTTCGGAT TTGACATTCA251 AGGCGTGGAA TTTGGGTGAT GCTTCGCGCG AGCCTGTCGG GTTGAAGGCA301 ACATCCATAC ACCAGTTTCC GTTGGAAATT GGCAAACACA AATATCGTCT351 TGAGTTCGAT CAGTTCACTT CTATGAATGT GGAGGACATG AGCGAGGGCG401 CGGAACGGGA AAAAAGCCTG AAATCCACGC TGCCCGATGT CCGCGCCGTT451 ACTCAGGAAG GTCACAAATA CACCAAT... .......... .....TACCG501 TATCCGTGAT GCGCCAGGCC AGGCGGTCGA ATATAAAAAC TATATGCTGC551 CGGTTTTGCA GGAACAGGAT TATTTTTGGA TTACCGGCAC GCGCAGCGC.601 TTGCAGCAGC AATACCGCTG GCTGCGTATC CCCTTGGACA AGCAGTTGAA651 AGCGGACACC TTTATGGCAT TGCGTGAGTT TTTGAAAGAT GGGGAAGGGC701 GCAAACGTCT .GTTGCCGAC GCAACCAAAG GCGCACCTGC CGAAATCCGC751 GAACAATTCA TGCTGGCTGC GGAAAACACG CTGAACATCT TTGCACAAAA801 AGGCTATTTG GGATTGGACG AATTTATTAC GTCCAATATC CCGAAAGAGC851 AGCAGGATAA GATGCAGGGC TATTTCTACG AAATGCTTTA CGGCGTGATG901 AACGCTGCTT TGGATGAAAC CAT.ACCCGG TACGGCTTGC CCGAATGGCA951 GCAGGATGAA GCGCGGAATC GTTTCCTGCT GCACAGTATG GATGCGTACA1001 CGGGTTTGAC CGAATATCCC GCGCCTATGC TGCTGCAACT TGATGGGTTT1051 TCCGAGGTGC GTTCGTCGGG TTTGCAGATG ACCCGTTCCC C.GGTCCGCT1101 TTTGGTCTAT CTC...它对应于氨基酸序列<SEQ ID 270;ORF88>:1 MVFLNADNGI LVQDLPFEVK LKKFHIDFYN TGMPRDFASD IEVTDKATGE51 KLERTIRVNH PLTLHGITIY QASFADGGSD LTFKAWNLGD ASREPVVLKA101 TSIHQFPLEI GKHKYRLEFD QFTSMNVEDM SEGAEREKSL KSTLPDVRAV151 TQEGHKYTNX XXXXXYRIRD APGQAVEYKN YMLPVLQEQD YFWITGTRSX201 LQQQYRWLRI PLDKQLKADT FMALREFLKD GEGRKRXYAD ATKGAPAEIR251 EQFMLAAENT LNIFAQKGYL GLDEFITSNI PKEQQDKMQG YFYEMLYGVM301 NAALDETXTR YGLPEWQQDE ARNRFLLHSM DAYTGLTEYP APMLLQLDGF351 SEVRSSGLQM TRSXGPLLVY L...进一步的工作揭示了完整的核苷酸序列<SEQ ID 271>:1 ATGAGTAAAT CCCGTAGATC TCCCCCACTT CTTTCCCGTC CGTGGTTCGC51 TTTTTTCAGC TCCATGCGCT TTGCAGTCGC TTTGCTCAGT CTGCTGGGTA101 TTGCATCGGT TATCGGTACG GTGTTGCAGC AAAACCAGCC GCAGACGGAT151 
TATTTGGTCA AATTCGGATC GTTTTGGGCG CAGATTTTTG GTTTTCTGGG201 ACTGTATGAC GTCTATGCTT CGGCATGGTT TGTCGTTATC ATGATGTTTT251 TGGTGGTTTC TACCAGTTTG TGCCTGATTC GCAATGTGCC GCCGTTCTGG301 CGCGAAATGA AGTCTTTTCG GGAAAAGGTT AAAGAAAAAT CTCTGGCGGC351 GATGCGCCAT TCTTCGCTGT TGGATGTAAA AATTGCGCCC GAGGTTGCCA401 AACGTTATCT GGAAGTACAA GGTTTTCAGG GAAAAACCAT TAACCGTGAA451 GACGGGTCGG TTCTGATTGC CGCCAAAAAA GGCACAATGA ACAAATGGGG501 CTATATCTTT GCCCATGTTG CTTTGATTGT CATTTGCCTG GGCGGGTTGA551 TAGACAGTAA CCTGCTGTTG AAACTGGGTA TGCTGACCGG TCGGATTGTT601 CCGGACAATC AGGCGGTTTA TGCCAAGGAT TTCAAGCCCG AAAGTATTTT651 GGGTGCGTCC AATCTCTCAT TTAGGGGCAA CGTCAATATT TCCGAGGGGC701 AGAGTGCGGA TGTGGTTTTC CTGAATGCCG ACAACGGGAT ATTGGTTCAG751 GACTTGCCTT TTGAAGTCAA ACTGAAAAAA TTCCATATCG ATTTTTACAA801 TACGGGTATG CCGCGTGATT TCGCCAGCGA TATTGAAGTG ACGGACAAGG 851 CAACCGGTGA GAAACTCGAG CGCACCATCC GCGTGAACCA TCCTTTGACC901 TTGCACGGCA TCACGATTTA TCAGGCGAGT TTTGCCGACG GCGGTTCGGA951 TTTGACATTC AAGGCGTGGA ATTTGGGTGA TGCTTCGCGC GAGCCTGTCG1001 TGTTGAAGGC AACATCCATA CACCAGTTTC CGTTGGAAAT TGGCAAACAC1051 AAATATCGTC TTGAGTTCGA TCAGTTCACT TCTATGAATG TGGAGGACAT1101 GAGCGAGGGC GCGGAACGGG AAAAAAGCCT GAAATCCACG CTGAACGATG1151 TCCGCGCCGT TACTCAGGAA GGTAAAAAAT ACACCAATAT CGGCCCTTCC1201 ATTGTTTACC GTATCCGTGA TGCGGCAGGG CAGGCGGTCG AATATAAAAA1251 CTATATGCTG CCGGTTTTGC AGGAACAGGA TTATTTTTGG ATTACCGGCA1301 CGCGCAGCGG CTTGCAGCAG CAATACCGCT GGCTGCGTAT CCCCTTGGAC1351 AAGCAGTTGA AAGCGGACAC CTTTATGGCA TTGCGTGAGT TTTTGAAAGA1401 TGGGGAAGGG CGCAAACGTC TGGTTGCCGA CGCAACCAAA GGCGCACCTG1451 CCGAAATCCG CGAACAATTC ATGCTGGCTG CGGAAAACAC GCTGAACATC1501 TTTGCACAAA AAGGCTATTT GGGATTGGAC GAATTTATTA CGTCCAATAT1551 CCCGAAAGAG CAGCAGGATA AGATGCAGGG CTATTTCTAC GAAATGCTTT1601 ACGGCGTGAT GAACGCTGCT TTGGATGAAA CCATACGCCG GTACGGCTTG1651 CCCGAATGGC AGCAGGATGA AGCGCGGAAT CGTTTCCTGC TGCACAGTAT1701 GGATGCGTAC ACGGGTTTGA CCGAATATCC CGCGCCTATG CTGCTGCAAC1751 TTGATGGGTT TTCCGAGGTG CGTTCGTCGG GTTTGCAGAT GACCCGTTCC1801 CCGGGTGCGC TTTTGGTCTA TCTCGGCTCG GTGCTGTTGG TATTGGGTAC1851 
GGTATTGATG TTTTATGTGC GCGAAAAACG GGCGTGGGTA TTGTTTTCAG1901 ACGGCAAAAT CCGTTTTGCC ATGTCTTCGG CCCGCAGCGA ACGGGATTTG1951 CAGAAGGAAT TTCCAAAACA CGTCGAGAGT CTGCAACGGC TCGGCAAGGA2001 CTTGAATCAT GACTGA它对应于氨基酸序列<SEQ ID 272;ORF88-1>:1 MSKSRRSPPL LSRPWFAFFS SMRFAVALLS LLGIASVIGT VLQQNQPQTD51 YLVKFGSFWA QIFGFLGLYD VYASAWFVVI MMFLVVSTSL CLIRNVPPFW101 REMKSFREKV KEKSLAAMRH SSLLDVKIAP EVAKRYLEVQ GFQGKTINRE151 DGSVLIAAKK GTMNKWGYIF AHVALIVICL GGLIDSNLLL KLGMLTGRIV201 PDNQAVYAKD FKPESILGAS NLSFRGNVNI SEGQSADVVF LNADNGILVQ251 DLPFEVKLKK FHIDFYNTGM PRDFASDIEV TDKATGEKLE RTIRVNHPLT301 LHGITIYQAS FADGGSDLTF KAWNLGDASR EPVVLKATSI HQFPLEIGKH351 KYRLEFDQFT SMNVEDMSEG AEREKSLKST LNDVRAVTQE GKKYTNIGPS401 IVYRIRDAAG QAVEYKNYML PVLQEQDYFW ITGTRSGLQQ QYRWLRIPLD451 KQLKADTFMA LREFLKDGEG RKRLVADATK GAPAEIREQF MLAAENTLNI501 FAQKGYLGLD EFITSNIPKE QQDKMQGYFY EMLYGVMNAA LDETIRRYGL551 PEWQQDEARN RFLLHSMDAY TGLTEYPAPM LLQLDGFSEV RSSGLQMTRS601 PGALLVYLGS VLLVLGTVLM FYVREKRAWV LFSDGKIRFA MSSARSERDL651 QKEFPKHVES LQRLGKDLNH D*
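Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF88 shows 95.7% identity over a 371 aa overlap with an ORF (ORF88a) from strain A of N. meningitidis: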
10 20 30orf88.pep MVFLNADNGILVQDLPFEVKLKKFHIDFYN
:|||||||||||||||||||||||||||||orf88a AKDFKPESILGASNLSFRGNVNISEGQSADVVFLNADNGILVQDLPFEVKLKKFHIDFYN
210 220 230 240 250 260
40 50 60 70 80 90orf88.pep TGMPRDFASDIEVTDKATGEKLERTIRVNHPLTLHGITIYQASFADGGSDLTFKAWNLGD
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf88a TGMPRDFASDIEVTDKATGEKLERTIRVNHPLTLHGITIYQASFADGGSDLTFKAWNLGD
270 280 290 300 310 320
100 110 120 130 140 150orf88.pep ASREPVVLKATSIHQFPLEIGKHKYRLEFDQFTSMNVEDMSEGAEREKSLKSTLPDVRAV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf88a ASREPVVLKATSIHQFPLEIGKHKYRLEFDQFTSMNVEDMSEGAEREKSLKSTLNDVRAV
330 340 350 360 370 380
160 170 180 190 200 210orf88.pep TQEGHKYTNXXXXXXYRIRDAPGQAVEYKNYMLPVLQEQDYFWITGTRSXLQQQYRWLRI
||||:|||| |||||| ||||||||||||||||||||||||||| ||||||||||orf88a TQEGKKYTNIGPSIVYRIRDAAGQAVEYKNYMLPVLQEQDYFWITGTRSGLQQQYRWLRI
390 400 410 420 430 440
220 230 240 250 260 270orf88.pep PLDKQLKADTFMALREFLKDGEGRKRXVADATKGAPAEIREQFMLAAENTLNIFAQKGYL
|||||||||||||||||||||||||| |||||||||||||||||||||||||||||||||orf88a PLDKQLKADTFMALREFLKDGEGRKRLVADATKGAPAEIREQFMLAAENTLNIFAQKGYL
450 460 470 480 490 500
280 290 300 310 320 330orf88.pep GLDEFITSNIPKEQQDKMQGYFYEMLYGVMNAALDETXTRYGLPEWQQDEARNRFLLHSM
||||||||||||||||||||||||||||||||||||| |||||||||||||||||||||orf88a GLDEFITSNIPKEQQDKMQGYFYEMLYGVMNAALDETIRRYGLPEWQQDEARNRFLLHSM
510 520 530 540 550 560
340 350 360 370orf88.pep DAYTGLTEYPAPMLLQLDGFSEVRSSGLQMTRSXGPLLVYL
||||||||||||||||||||||||||||||||| | |||||orf88a DAYTGLTEYPAPMLLQLDGFSEVRSSGLQMTRSPGALLVYLGSVLLVLGTVLMFYVREKR
570 580 590 600 610 620orf88a AWVLFSDGKIRFAMSSARSERDLQKEFPKHVESLQRLGKDLNHDX
630 640 650 660 670全长ORF88a核苷酸序列<SEQ ID 273>是:1 ATGAGTAAAT CCCGTAGATC TCCCCCACTT CTTTCCCGTC CGTGGTTCGC51 TTTTTTCAGC TCCATGCGCT TTGCGGTCGC TTTGCTCAGT CTGCTGGGTA101 TTGCATCGGT TATCGGTACG GTGTTGCAGC AAAACCAGCC GCAGACGGAT151 TATTTGGTCA AATTCGGATC GTTTTGGGCG CAGATTTTTG GTTTTCTGGG201 ACTGTATGAC GTCTATGCTT CGGCATGGTT TGTCGTTATC ATGATGTTTT251 TGGTGGTTTC TACCAGTTTG TGCCTGATTC GCAATGTGCC GCCGTTCTGG301 CGCGAAATGA AGTCTTTTCG GGAAAAGGTT AAAGAAAAAT CTCTGGCGGC351 GATGCGCCAT TCTTCGCTGT TGGATGTAAA AATTGCGCCC GAGGTTGCCA401 AACGTTATCT GGAAGTACAA GGTTTTCAGG GAAAAACCAT TAACCGTGAA451 GACGGGTCGG TTCTGATTGC CGCCAAAAAA GGCACAATGA ACAAATGGGG501 CTATATCTTT GCCCATGTTG CTTTGATTGT CATTTGCCTG GGCGGGTTGA551 TAGACAGTAA CCTGCTGTTG AAACTGGGTA TGCTGACCGG TCGGATTGTT601 CCGGACAATC AGGCGGTTTA TGCCAAGGAT TTCAAGCCCG AAAGTATTTT651 GGGTGCGTCC AATCTCTCAT TTAGGGGCAA CGTCAATATT TCCGAGGGGC701 AGAGTGCGGA TGTGGTTTTC CTGAATGCCG ACAACGGGAT ATTGGTTCAG751 GACTTGCCTT TTGAAGTCAA ACTGAAAAAA TTCCATATCG ATTTTTACAA801 TACGGGTATG CCGCGCGATT TTGCCAGTGA TATTGA4GTA ACGGATAAGG851 CAACCGGTGA GAAACTCGAG CGCACCATCC GCGTGAACCA TCCTTTGACC901 TTGCACGGCA TCACGATTTA TCAGGCGAGT TTTGCCGACG GCGGTTCGGA951 TTTGACATTC AAGGCGTGGA ATTTGGGTGA TGCTTCGCGC GAGCCTGTCG1001 TGTTGAAGGC AACATCCATA CACCAGTTTC CGTTGGAAAT TGGCAAACAC1051 AAATATCGTC TTGAGTTCGA TCAGTTTACT TCTATGAATG TGGAGGACAT1101 GACCGAGGGC GCGGAACGGG AAAAAAGCCT GAAATCCACG CTGAACGATG1151 TCCGCGCCGT TACTCAGGAA GGTAAAAAAT ACACCAATAT CGGCCCTTCC1201 ATTGTTTACC GTATCCGTGA TGCGGCAGGG CAGGCGGTCG AATATAAAAA1251 CTATATGCTG CCGGTTTTGC AGGAACAGGA TTATTTTTGG ATTACCGGCA1301 CGCGCAGCGG CTTGCAGCAG CAATACCGCT GGCTGCGTAT CCCCTTGGAC1351 AAGCAGTTGA AAGCGGACAC CTTTATGGCA TTGCGTGAGT TTTTGAAAGA1401 TGGGGAAGGG CGCAAACGTC TGGTTGCCGA CGCAACCAAA GGCGCACCTG1451 CCGAAATCCG CGAACAATTC ATGCTGGCTG CGGAAAACAC GCTGAACATC1501 TTTGCACAAA AAGGCTATTT GGGATTGGAC GAATTTATTA CGTCCAATAT1551 CCCGAAAGAG CAGCAGGATA AGATGCAGGG CTATTTCTAC GAAATGCTTT1601 ACGGCGTGAT GAACGCTGCT TTGGATGAAA CCATACGCCG GTACGGCTTG1651 CCCGAATGGC AGCAGGATGA 
AGCGCGGAAT CGTTTCCTGC TGCACAGTAT1701 GGATGCGTAC ACGGGTTTGA CCGAATATCC CGCGCCTATG CTGCTGCAAC1751 TTGATGGGTT TTCCGAGGTG CGTTCGTCGG GTTTGCAGAT GACCCGTTCC1801 CCGGGTGCGC TTTTGGTCTA TCTCGGCTCG GTGCTGTTGG TATTGGGTAC1851 GGTATTGATG TTTTATGTGC GCGAAAAACG GGCGTGGGTA TTGTTTTCAG1901 ACGGCAAAAT CCGTTTTGCC ATGTCTTCGG CCCGCAGCGA ACGGGATTTG1951 CAGAAGGAAT TTCCAAAACA CGTCGAGAGT CTGCAACGGC TCGGCAAGGA2001 CTTGAATCAT GACTGA它编码的蛋白质具有氨基酸序列<SEQ ID 274>:1 MSKSRRSPPL LSRPWFAFFS SMRFAVALLS LLGIASVIGT VLQQNQPQTD51 YLVKFGSFWA QIFGFLGLYD VYASAQFVVT MMFLVVSTSL CLIRNVPPFW101 REMKSFREKV KEKSLAAMRH SSLLDVKIAP EVAKRYLEVQ GFQGKTINRE151 DGSVLIAAKK GTMNKWGYIF AHVALIVICL GGLIDSNLLL KLGMLTGRIV201 PDNQAVYAKD FKPESILGAS NLSFRGNVNI SEGQSADVVF LNADNGILVQ251 DLPFEVKLKK FHIDFYNTGM PRDFASDIEV TDKATGEKLE RTIRVNHPLT301 LHGITIYQAS FADGGSDLTF KAWNLGDASR EPVVLKATSI HQFPLEIGKH351 KYRLEFDQFT SMNVEDMSEG AEREKSLKST LNDVRAVTQE GKKYTNIGPS401 IVYRIRDAAG QAVEYKNYML PVLQEQDYFW ITGTRSGLQQ QYRWLRIPLD451 KQLKADTFMA LREFLKDGEG RKRLVADATK GAPAEIREQF MLAAENTLNI501 FAQKGYLGLD EFITSNIPKE QQDKMQGYFY EMLYGVMNAA LDETIRRYGL551 PEWQQDEARN RFLLHSMDAY TGLTEYPAPM LLQLDGFSEV RSSGLQMTRS601 PGALLVYLGS VLLVLGTVLM FYVREKRAWV LFSDGKIRFA MSSARSERDL651 QKEFPKHVES LQRLGKDLNH D*ORF88a和ORF88-1在671个氨基酸的重叠区内有100.0%的相同性:orf88a.pep MSKSRRSPPLLSRPWFAFFSSMRFAVALLSLLGIASVIGTVLQQNQPQTDYLVKFGSFWA 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf88-1 MSKSRRSPPLLSRPWFAFFSSMRFAVALLSLLGIASVIGTVLQQNQPQTDYLVKFGSFWA 60orf88a.pep QIFGFLGLYDVYASAWFVVIMMFLVVSTSLCLIRNVPPFWREMKSFREKVKEKSLAAMRH 120
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf88-1 QIFGFLGLYDVYASAWFVVIMMFLVVSTSLCLIRNVPPFWREMKSFREKVKEKSLAAMRH 120orf88a.pep SSLLDVKIAPEVAKRYLEVQGFQGKTINREDGSVLIAAKKGTMNKWGYIFAHVALIVICL 180
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf88-1 SSLLDVKIAPEVAKRYLEVQGFQGKTINREDGSVLIAAKKGTMNKWGYIFAHVALIVICL 180orf88a.pep GGLIDSNLLLKLGMLTGRIVPDNQAVYAKDFKPESILGASNLSFRGNVNISEGQSADVVF 240
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf88-1 GGLIDSNLLLKLGMLTGRIVPDNQAVYAKDFKPESILGASNLSFRGNVNISEGQSADVVF 240orf88a.pep LNADNGILVQDLPFEVKLKKFHIDFYNTGMPRDFASDIEVTDKATGEKLERTIRVNHPLT 300
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf88-1 LNADNGILVQDLPFEVKLKKFHIDFYNTGMPRDFASDIEVTDKATGEKLERTIRVNHPLT 300orf88a.pep LHGITIYQASFADGGSDLTFKAWNLGDASREPVVLKATSIHQFPLEIGKHKYRLEFDQFT 360
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf88-1 LHGITIYQASFADGGSDLTFKAWNLGDASREPVVLKATSIHQFPLEIGKHKYRLEFDQFT 360orf88a.pep SMNVEDMSEGAEREKSLKSTLNDVRAVTQEGKKYTNIGPSIVYRIRDAAGQAVEYKNYML 420
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf88-1 SMNVEDMSEGAEREKSLKSTLNDVRAVTQEGKKYTNIGPSIVYRIRDAAGQAVEYKNYML 420orf88a.pep PVLQEQDYFWITGTRSGLQQQYRWLRIPLDKQLKADTFMALREFLKDGEGRKRLVADATK 480
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf88-1 PVLQEQDYFWITGTRSGLQQQYRWLRIPLDKQLKADTFMALREFLKDGEGRKRLVADATK 480orf88a.pep GAPAEIREQFMLAAENTLNIFAQKGYLGLDEFITSNIPKEQQDKMQGYFYEMLYGVMNAA 540
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf88-1 GAPAEIREQFMLAAENTLNIFAQKGYLGLDEFITSNIPKEQQDKMQGYFYEMLYGVMNAA 540orf88a.pep LDETIRRYGLPEWQQDEARNRFLLHSMDAYTGLTEYPAPMLLQLDGFSEVRSSGLQMTRS 600
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf88-1 LDETIRRYGLPEWQQDEARNRFLLHSMDAYTGLTEYPAPMLLQLDGFSEVRSSGLQMTRS 600orf88a.pep PGALLVYLGSVLLVLGTVLMFYVREKRAWVLFSDGKIRFAMSSARSERDLQKEFPKHVES 660
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf88-1 PGALLVYLGSVLLVLGTVLMFYVREKRAWVLFSDGKIRFAMSSARSERDLQKEFPKHVES 660orf88a.pep LQRLGKDLNHD 672
|||||||||||orf88-1 LQRLGKDLNHD 672
Homology with a predicted ORF from N. gonorrhoeae
ORF88 shows 93.8% identity over a 371 aa overlap with a predicted ORF (ORF88.ng) from N. gonorrhoeae:
orf88.pep MVFLNADNGILVQDLPFEVKLKKFHIDFYNTGMPRDFASDIEVTDKATGEKLERTIRVNH 60
|||||||||:||||||||||||||||||||||||||||||||||||||||||||||||||orf88ng MVFLNADNGMLVQDLPFEVKLKKFHIDFYNTGMPRDFASDIEVTDKATGEKLERTIRVNH 60orf88.pep PLTLHGITIYQASFADGGSDLTFKAWNLGDASREPVVLKATSIHQFPLEIGKHKYRLEFD 120
|||||||||||||||||||||||||||| |||||||||||||||||||||||||||||||orf88ng PLTLHGITIYQASFADGGSDLTFKAWNLRDASREPVVLKATSIHQFPLEIGKHKYRLEFD 120orf88.pep QFTSMNVEDMSEGAEREKSLKSTLPDVRAVTQEGHKYTNXXXXXXYRIRDAPGQAVEYKN 180
|||||||||||||||||||||||| |||||||||:|||| |||||| ||||||||orf88ng QFTSMNVEDMSEGAEREKSLKSTLNDVRAVTQEGKKYTNIGPSIVYRIRDAAGQAVEYKN 180orf88.pep YMLPVLQEQDYFWITGTRSXLQQQYRWLRIPLDKQLKADTFMALREFLKDGEGRKRXVAD 240
||||:||::||||:||||| |||||||||||||||||||||||||||||||||||| |||orf88ng YMLPILQDKDYFWLTGTRSGLQQQYRWLRIPLDKQLKADTFMALREFLKDGEGRKRLVAD 240orf88.pep ATKGAPAEIREQFMLAAENTLNIFAQKGYLGLDEFITSNIPKEQQDKMQGYFYEMLYGVM 300
|||||||||||||||||||||||||||||||||||||||||| |||||||||||||||||orf88ng ATKDAPAEIREQFMLAAENTLNIFAQKGYLGLDEFITSNIPKGQQDKMQGYFYEMLYGVM 300orf88.pep NAALDETXTRYGLPEWQQDEARNRFLLHSMDAYTGLTEYPAPMLLQLDGFSEVRSSGLQM 360
||||||| |||||||||||||||||||||||||||||||||||||||||||||||||||orf88ng NAALDETIRRYGLPEWQQDEARNRFLLHSMDAYTGLTEYPAPMLLQLDGFSEVRSSGLQM 360orf88.pep TRSXGPLLVYL 371
||| | |||||orf88ng TRSPGALLVYLGSVLLVLGTVFMFYVPKKRAWVLFSNXKIRFAMSSARSERDLQKEFPKH 420
预计ORF88ng核苷酸序列<SEQ ID 275>编码的蛋白质具有氨基酸序列<SEQ ID276>:1 MVFLNADNGM LVQDLPFEVK LKKFHIDFYN TGMPRDFASD IEVTDKATGE51 KLERTIRVNH PLTLHGITIY QASFADGGSD LTFKAWNLRD ASREPVVLKA101 TSIHQFPLEI GKHKYRLEFD QFTSMNVEDM SEGAEREKSL KSTLNDVRAV151 TQEGKKYTNI GPSIVYRIRD AAGQAVEYKN YMLPILQDKD YFWLTGTRSG201 LQQQYRWLRI PLDKQLKADT FMALREFLKD GEGRKRLVAD ATKDAPAEIR251 EQFMLAAENT LNIFAQKGYL GLDEFITSNI PKGQQIKMQG YFYEMLYGVM301 NAALDETIRR YGLPEWQQDE ARNRFLLHSM DAYTGLTEYP APMLLQLDGF351 SEVRSSGLQM TRSPGALLVY LGSVLLVLGT VFMFYVPKKR AWVLFSNXKI 401 RFNMSSARSE RDLQKEFPKH VESLQRLGKD LNHD*进一步的工作揭示了完整的淋球菌DNA序列<SEQ ID 277>:1 ATGAGTAAAT CCCGTATATC TCCCACACTT CTTTCCCGTC CGTGGTTCGC51 TTTTTTCAGC TCCATGCGCT TTGCGGTCGC TTTGCTCAGT CTGCTGGGTA101 TTGCATCGGT TATCGGCACG GTGTTACAGC AAAACCAGCC GCAGACGGAT151 TATTTGGTCA AATTCGGACC GTTTTGGACT CGGATTTTTG ATTTTTTGGG201 TTTGTATGAT GTCTATGCTT CGGCATGGTT TGTCGTTATC ATGATGTTTC251 TGGTGGTTTC TACCAGTTTG TGTTTAATCC GTAACGTTCC GCCGTTTTGG301 CGCGAAATGA AGTCTTTCCG GGAAAAGGTT AAAGAAAAAT CTCTGGCGGC351 GATGCGCCAT TCTTCGCTGT TGGATGTAAA AATTGCCCCC GAAGTTGCCA401 AACGTTATCT GGAGGTGCGG GGTTTTCAGG GAAAAACCGT CAGCCGTGAG451 GACGGGTCGG TTCTGATTGC CGCCAAAAAA GGCAcaatga acaaATGGGG501 CTATATCTTT GCccaagtag ctTTGATTGT CATTTGCCTG GGCGGGTTGA551 TAGACAGTAA CCTGCTGCTG AACCTGGGTA TGCTGGCCGG TCGGATTGTT601 CCGGACAATC AGGCGGTTTA TGCCAAGGAT TTCAAGCCCG AAAGTATTTT651 GGGTGCGTCC AATCTCTCAT TTAGGGGCAA CGTCAATATT TCCGAGGGGC701 AAAGTGCGGA TGTGGTTTTC CTGAATGCCG ACAACGGGAT GTTGGTTCAG751 GACTTGCCTT TTGAAGTCAA ACTGAAAAAA TTCCATATCG ATTTTTACAA801 TACGGGTATG CCGCGCGATT TTGCCAGCGA TATTGAAGTA ACGGACAAGG851 CAACCGGTGA GAAACTCGAG CGCACCATCC GCGTGAACCA TCCTTTGACC901 TTGCACGGCA TCACGATTTA TCAGGCGAGT TTTGCCGACG GCGGTTCGGA951 TTTGACATTC AAGGCGTGGA ATTTGAGGGA TGCTTCGCGC GAACCTGTCG1001 TGTTGAAGGC AACCTCCATA CACCAGTTTC CGTTGGAAAT CGGCAAACAC1051 AAATATCGTC TTGAGTTCGA TCAGTTCACT TCTATGAATG TGGAGGACAT1101 GAGCGAGGGT GCGGAACGGG AAAAAAGCCT GAAATCCACT CTGAACGATG1151 TCCGCGCCGT TACTCAGGAA GGTAAAAAAT ACACCAATAT CGGCCCTTCC1201 ATCGTGTACC 
GCATCCGTGA TGcggCAGGG CAGGCGGTCG AATATAAAAA1251 CTATATGCTG CCGATTTTGC AGGACAAAGA TTATTTTTGG CTGACCGGCA1301 CGCGCAGCGG CTTGCAGCAG CAATACCGCT GGCTGCGTAT CCCCTTGGAC1351 AAGCAGTTGA AAGCGGACAC CTTTATGGCA TTGCGTGAGT TTTTGAAAGA1401 TGGGGAAGGG CGCAAACGTC TGGTTGCCGA CGCAACCAAA GACGCACCTG1451 CCGAAATCCG CGAACAATTC ATGCTGGCTG CGGAAAACAC GCTGAATATC1501 TTTGCGCAAA AAGGCTATTT GGGATTGGAC GAATTTATTA CGTCCAATAT1551 CCCGAAAGGG CAGCAGGATA AGATGCAGGG CTATTTCTAC GAAATGCTTT1601 ACGGCGTGAT GAACGCTGCT TTGGATGAAA CCATACGCCG GTACGGCTTG1651 CCCGAATGGC AGCAGGATGA AGCGCGGAAC CGTTTCCTGC TGCACAGTAT1701 GGATGCCTAT ACGGGGCTGA CGGAATATCC CGCGCCTATG CTGCTCCAGC1751 TTGACGGGTT TTCCGAGGTG CGTTCCTCAG GTTTGCAGAT GACCCGTTCG1801 CCGGGTGCGC TTTTGGTCTA TCtcggctcg gtattgttgg TTTTGGgtac1851 ggtaTttatg tTTTATGTGC GCGAAAAACG GGCGTGGgta tTGTTTTCag1901 aCGGCAAAAT CCGTTTTGCT ATGtCTTcgg CCcgcagcga ACGGGATTTG1951 cAGAaggaaT TTCCAAAACA CGtcgAGAGC CTGCAACggc tcggcaaggA2001 CttgaaTCAT GACTga它对应于氨基酸序列<SEQ ID 278;ORF88ng-1>:1 MSKSRISPTL LSRPWFAFFS SMRFAVALLS LLGIASVIGT VLQQNQPQTD51 YLVKFGPFWT RIFDFLGLYD VYASAWFVVT MMFLVVSTSL CLTRNVPPFW101 REMKSFREKV KEKSLAAMRH SSLLDVKIAP EVAKRYLEVR GFQGKTVSRE151 DGSVLIAAKK GTMNKWGYIF AQVALIVICL GGLIDSNLLL KLGMLAGRIV201 PDNQAVYAKD FKPESILGAS NLSFRGNVNI SEGQSADVVF LNADNGMLVQ251 DLPFEVKLKK FHIDFYNTGM PRDFASDIEV TDKATGEKLE RTIRVNHPLT301 LHGITIYQAS FADGGSDLTF KAWNLRDASR EPVVLKATSI HQFPLEIGKH351 KYRLEFDQFT SMNVEDMSEG AEREKSLKST LNDVRAVTQE GKKYTNIGPS401 IVYRIRDAAG QAVEYKNYML PILQDKDYFW LTGTRSGLQQ QYRWLRIPLD451 KQLKADTFMA LREFLKDGEG RKRLVADATK DAPAEIREQF MLAAENTLNI501 FAQKGYLGLD EFITSNIPKG QQDKMQGYFY EMLYGVMNAA LDETIRRYGL551 PEWQQDEARN RFLLHSMDAY TGLTEYPAPM LLQLDGFSEV RSSGLQMTRS601 PGALLVYLGS VLLVLGTVFM FYVREKRAWV LFSDGKIRFA MSSARSERDL651 QKEFPKHVES LQRLGKDLNH D*ORF88ng-1和ORF88-1在671个氨基酸的重叠区内有97.0%的相同性:orf88-1.pep MSKSRRSPPLLSRPWFAFFSSMRFAVALLSLLGIASVIGTVLQQNQPQTDYLVKFGSFWA 60
||||| |||||||||||||||||||||||||||||||||||||||||||||||||| ||:orf88ng-1 MSKSRISPTLLSRPWFAFFSSMRFAVALLSLLGIASVIGTVLQQNQPQTDYLVKFGPFWT 60orf88-1.pep QIFGFLGLYDVYASAWFVVIMMFLVVSTSLCLIRNVPPFWREMKSFREKVKEKSLAAMRH 120
:|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf88ng-1 RIFDFLGLYDVYASAWFVVIMMFLVVSTSLCLIRNVPPFWREMKSFREKVKEKSLAAMRH 120orf88-1.pep SSLLDVKIAPEVAKRYLEVQGFQGKTINREDGSVLIAAKKGTMNKWGYIFAHVALIVICL 180
|||||||||||||||||||:||||||::|||||||||||||||||||||||:||||||||orf88ng-1 SSLLDVKIAPEVAKRYLEVRGFQGKTVSREDGSVLIAAKKGTMNKWGYIFAQVALIVICL 180orf88-1.pep GGLIDSNLLLKLGMLTGRIVPDNQAVYAKDFKPESILGASNLSFRGNVNISEGQSADVVF 240
|||||||||||||||:||||||||||||||||||||||||||||||||||||||||||||orf88ng-1 GGLIDSNLLLKLGMLAGRIVPDNQAVYAKDFKPESILGASNLSFRGNVNISEGQSADVVF 240orf88-1.pep LNADNGILVQDLPFEVKLKKFHIDFYNTGMPRDFASDIEVTDKATGEKLERTIRVNHPLT 300
||||||:|||||||||||||||||||||||||||||||||||||||||||||||||||||orf88ng-1 LNADNGMLVQDLPFEVKLKKFHIDFYNTGMPRDFASDIEVTDKATGEKLERTIRVNHPLT 300orf88-1.pep LHGITIYQASFADGGSDLTFKAWNLGDASREPVVLKATSIHQFPLEIGKHKYRLEFDQFT 360
||||||||||||||||||||||||| ||||||||||||||||||||||||||||||||||orf88ng-1 LHGITIYQASFADGGSDLTFKAWNLRDASREPVVLKATSIHQFPLEIGKHKYRLEFDQFT 360orf88-1.pep SMNVEDMSEGAEREKSLKSTLNDVRAVTQEGKKYTNIGPSIVYRIRDAAGQAVEYKNYML 420
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf88ng-1 SMNVEDMSEGAEREKSLKSTLNDVRAVTQEGKKYTNIGPSIVYRIRDAAGQAVEYKNYML 420orf88-1.pep PVLQEQDYFWITGTRSGLQQQYRWLRIPLDKQLKADTFMALREFLKDGEGRKRLVADATK 480
|:||::||:|||||||||||||||||||||||||||||||||||||||||||||||||||orf88ng-1 PILQDKDYFWLTGTRSGLQQQYRWLRIPLDKQLKADTFMALREFLKDGEGRKRLVADATK 480orf88-1.pep GAPAEIREQFMLAAENTLNIFAQKGYLGLDEFITSNIPKEQQDKMQGYFYEMLYGVMNAA 540
||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||orf88ng-1 DAPAEIREQFMLAAENTLNIFAQKGYLGLDEFITSNIPKGQQDKMQGYFYEMLYGVMNAA 540orf88-1.pep LDETIRRYGLPEWQQDEARNRFLLHSMDAYTGLTEYPAPMLLQLDGFSEVRSSGLQMTRS 600
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf88ng-1 LDETIRRYGLPEWQQDEARNRFLLHSMDAYTGLTEYPAPMLLQLDGFSEVRSSGLQMTRS 600orf88-1.pep PGALLVYLGSVLLVLGTVLMFYVREKRAWVLFSDGKIRFAMSSARSERDLQKEFPKHVES 660
||||||||||||||||||:|||||||||||||||||||||||||||||||||||||||||orf88ng-1 PGALLVYLGSVLLVLGTVFMFYVREKRAWVLFSDGKIRFAMSSARSERDLQKEFPKHVES 660orf88-1.pep LQRLGKDLNHD 671
|||||||||||
orf88ng-1 LQRLGKDLNHD 671
In addition, ORF88ng-1 shows homology with a hypothetical protein from Aquifex aeolicus:
gi|2984296 (AE000771) hypothetical protein [Aquifex aeolicus] Length = 537
Score = 94.4 bits (231), Expect = 2e-18
Identities = 91/334 (27%), Positives = 159/334 (47%), Gaps = 59/334 (17%)
Query: 16 FAFFSSMRFAVALLSLLGIASVIG-TVLQQNQPQTDYLVKFGPFWTRIFDFLGLYDVYAS 74
+ F +S++ A+ ++ +LGI S++G T ++QNQ YL +FG L L DV+ S
Sbjct: 80 YDFLASLKLAIFIMLVLGILSMLGSTYIKQNQSFEWYLDQFGYDVGIWIWKLWLNDVFHS 139
Query: 75 AWFVVIMMFLVVSTSLCLIRNVPPFWREMKSFREKVKEKSLAAMRHSSLLDVKIAPEVAK 134
++++ ++ L V+ C I+ +P W++ S +E++ + A +H + VKI P+ K
Sbjct: 140 WYYILFIVLLAVNLIFCSIKRLPRVWKQAFS-KERILKLDEHAEKHLKPITVKI-PDKDK 197
Query: 135 -RYLEVRGFQGKTVSREDGSVLIAAKKGTMNKWGYIFAQVALIVICLGGLIDSNLLLKL 192
++L +GF+ V E + + A+KG ++ G +AL+VI G LID
Sbjct: 198 VLKFLLKKGFK-VFVEEEGNKLYVFAEKGRFSRLGVYITHIALLVIMAGALID------ 249
Query: 193 GMLAGRIVPDNQAVYAKDFKPESILGASNLSFRGNVNISEGQSADVVFLNADNGMLVQDL 252
+I+G RG++ ++EG + DV+ + A+ L
Sbjct: 250 -----------------------AIVGV-----RGSLIVAEGDTNDVMLVGAE--QKPYKL 280
Query: 253 PFEVKLKKFHIDFY---NTGMPRDFA-------SDIEVTDKATGEKLER--TIRVNHPLT 300
PF V L F I Y N + + FA SDIE+ + G K+E T++VN P
Sbjct: 281 PFAVHLIDFRIKTYAEENPNVDKRFAQAVSSYESDIEIIN---GGKVEAKGTVKVNEPFD 337
Query: 301 LHGITIYQASFA--DGGSDLTFKAWNLRDASREP 332
++QA++ DG S + + + A +P
Sbjct: 338 FGRYRLFQATYGILDGTSGMGVIVVDRKKAHEDP 371
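Based on this analysis (including the presence of a putative transmembrane domain in the gonococcal protein), it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 40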
The following DNA sequence, believed to be complete, was identified in N. meningitidis <SEQ ID 279>:
1 ATGATGAGTA ATAmAATGGm ACAAAAAGGG TTTACATTGA TTGmGmTGAT
51 GATAGTCGTC GCGATAGTCG GCATTATCAG CGTCATTGCC ATACCTTCTT
101 ATCmAAGTTA TATTGAAAAA GGCTATCAGT CCCAGCTTTA TACGGAGATG
151 GyCGGTATCA ACAATATTTC CAAACAGTTT ATTTTGAAAA ATCCCCTGGA
201 CGATAATCAG ACCATCGAGA ACAAACTGGA AATATTTGTC TCAGGCTATA
251 AGATGAATCC GAAAATTGCC AAAAAaTATA GTGTTTCGGT AAAGTTTGTC
301 GATAAGGAAA AATCAAGGGC ATACAGGTTG GTCGGCGTTC CGAAGGCGGG
351 GACGGGTTAT ACTTTGTCGG TATGGATGAA CAGCGTGGGC GACGGATACA
401 AATGCCGTGA TGCCGCTTCT GCCCAAGCCC ATTTGGAGAC CTTGTCCTCA
451 GATGTCGGCT GTGAAGCCTT CTCTAATCGT AAAAAATAA
This corresponds to the amino acid sequence <SEQ ID 280; ORF89>:
1 MMSNXMXQKG FTLIXXMIVV AILGIISVIA IPSYXSYIEK GYQSQLYTEM
51 XGINNISKQF ILKNPLDDNQ TIENKLEIFV SGYKMNPKIA KKYSVSVKFV
101 DKEKSRAYRL VGVPKAGTGY TLSVWMNSVG DGYKCRDAAS AQAHLETLSS
151 DVGCEAFSNR KK*
Further work revealed the complete nucleotide sequence <SEQ ID 281>:
1 ATGATGAGTA ATAAAATGGA AGAAAAAGGG TTTACATTGA TTGAGATGAT
51 GATAGTCGTC GCGATACTCG GCATTATCAG CGTCATTGCC ATACCTTCTT
101 ATCAAAGTTA TATTGAAAAA GGCTATCAGT CCCAGCTTTA TACGGAGATG
151 GTCGGTATCA ACAATATTTC CAAACAGTTT ATTTTGAAAA ATCCCCTGGA
201 CGATAATCAG ACCATCGAGA ACAAACTGGA AATATTTGTC TCAGGCTATA
251 AGATGAATCC GAAAATTGCC AAAAAATATA GTGTTTCGGT AAAGTTTGTC
301 GATAAGGAAA AATCAAGGGC ATACAGGTTG GTCGGCGTTC CGAAGGCGGG
351 GACGGGTTAT ACTTTGTCGG TATGGATGAA CAGCGTGGGC GACGGATACA
401 AATGCCGTGA TGCCGCTTCT GCCCAAGCCC ATTTGGAGAC CTTGTCCTCA
451 GATGTCGGCT GTGAAGCCTT CTCTAATCGT AAAAAATAA
This corresponds to the amino acid sequence <SEQ ID 282; ORF89-1>:
1 MMSNKMEQKG FTLIEMMIVV AILGIISVIA IPSYQSYIEK GYQSQLYTEM
51 VGINNISKQF ILKNPLDDNQ TIENKLEIFV SGYKMNPKIA KKYSVSVKFV
101 DKEKSRAYRL VGVPKAGTGY TLSVWMNSVG DGYKCRDAAS AQAHLETLSS
151 DVGCEAFSNR KK*
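Computer analysis of this amino acid sequence gave the following results:
Homology with PilE of N. gonorrhoeae (accession number Z69260)
ORF89 and the PilE protein show 30% amino acid identity over a 120 aa overlap: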
orf89 8 QKGFTLIXXMIVVAILGIISVIAIPSYXSYIEKGYQSQLYTEMXGINNISKQFILKNPL- 66
QKGFTLI MIV+AI+GI++ +A+P+Y Y + S+ G + ++ L + +
PilE 5 QKGFTLIELMIVIAIVGILAAVALPAYQDYTARAQVSEAILLAEGQKSAVTEYYLNHGIW 64
orf89 67 -DDNQTIENKLEIFVSGYKMNPKIAKKYSVSVKFVDKEKSRAYRLVGVPKAGTGYTLSVW 125
DN + +G + KI KY SV + GV K G LS+W
PilE 65 PKDNTS---------AGVASSDKIKGKYVQSVTVAKGVVTAEMASTGVNKEIQGKKLSLW 115
Homology with a predicted ORF from N. meningitidis (strain A)
ORF89 shows 83.3% identity over a 162 aa overlap with an ORF (ORF89a) from strain A of N. meningitidis:
10 20 30 40 50 60orf89.pep MMSNXMXQKGFTLIXXMIVVAILGIISVIAIPSYXSYIEKGYQSQLYTEMXGINNISKQF
|||| ||||||||||| || ||| ||||||||||||||||| |||||||||orf89a MMSNKMEQKGFTLIXXXXXXAIXXXXSVIXXXXYXSYIEKGYQSQLYTEMVGINNISKQX
10 20 30 40 50 60
70 80 90 100 110 120orf89.pep ILKNPLDDNQTIENKLEIFVSGYKMNPKIAKKYSVSVKFVDKEKSRAYRLVGVPKAGTGY
||||||||||||::||||||||||||||||:||:|||:||::|| ||| ||||||:||||orf89a ILKNPLDDNQTIKSKLEIFVSGYKMNPKIAEKYNVSVHFVNEEKPRAYSLVGVPKTGTGY
70 80 90 100 110 120
130 140 150 160orf89.pep TLSVWMNSVGDGYKCRDAASAQAHLETLSSDVGCEAFSNRKKX
|||||||||||||||||||||:|||||||||||||||||||||orf89a TLSVWMNSVGDGYKCRDAASARAHLETLSSDVGCEAFSNRKKX
130 140 150 160全长ORF89a核苷酸序列<SEQ ID 283>是:1 ATGATGAGTA ATAAAATGGA ACAAAAAGGG TTTACATTGA TTGNGANGNT51 NATNGNCNTC GCGATACNCN GCNTTANCAG CGTCATTNCN ATNNNTNCNT101 ATCNNAGTTA TATTGAAAAA GGCTATCAGT CCCAGCTTTA TACGGAGATG151 GTCGGTATCA ACAATATTTC CAAACAGTNT ATTTTGAAAA ATCCCCTGGA201 CGATAATCAG ACCATCAAGA GCAAACTGGA AATATTTGTC TCAGGCTATA251 AGATGAATCC GAAAATTGCC GAAAAATATA ATGTTTCGGT GCATTTTGTC301 AATGAGGAAA AACCNAGGGC ATACAGCTTG GTCGGCGTTC CAAAGACGGG351 GAGGGGTTAT ACTTTGTCGG TATGGATGAA CAGCGTGGGC GACGGATACA401 AATGCCGTGA TGCCGCTTCT GCCCGAGCCC ATTTGGAGAC CTTGTCCTCA451 GATGTCGGCT GTGAAGCCTT CTCTAATCGT AAAAAATAG它编码的蛋白质具有氨基酸序列<SEQ ID 284>:1 MMSNKMEQKG FTLIXXXXXX AIXXXXSVIX XXXYXSYIEK GYQSQLYTEM51 VGINNISKQX ILKNPLDDNQ TIKSKLEIFV SGYKMNPKIA EKYNVSVHFV101 NEEKPRAYSL VGVPKTGTGY TLSVWMNSVG DGYKCRDAAS ARAHLETLSS151 DVGCEAFSNR KK*ORF89a和ORF89-1显示在162个氨基酸的重叠区内有83.3%的相同性:
10 20 30 40 50 60orf89a.pep MMSNKMEQKGFTLIXXXXXXAIXXXXSVIXXXXYXSYIEKGYQSQLYTEMVGINNISKQX
|||||||||||||| || || |||||||||||||||||||||||||||orf89-1 MMSNKMEQKGFTLIEMMIVVAILGIISVIAIPSYQSYIEKGYQSQLYTEMVGINNISKQF
10 20 30 40 50 60
70 80 90 100 110 120orf89a.pep ILKNPLDDNQTIKSKLEIFVSGYKMNPKIAEKYNVSVHFVNEEKPRAYSLVGVPKTGTGY
||||||||||||::||||||||||||||||:||:|||:||::|| ||| ||||||:||||orf89-1 ILKNPLDDNQTIENKLEIFVSGYKMNPKIAKKYSVSVKFVDKEKSRAYRLVGVPKAGTGY
70 80 90 100 110 120
130 140 150 160orf89a.pep TLSVWMNSVGDGYKCRDAASARAHLETLSSDVGCEAFSNRKKX
|||||||||||||||||||||:|||||||||||||||||||||orf89-1 TLSVWMNSVGDGYKCRDAASAQAHLETLSSDVGCEAFSNRKKX
130 140 150 160
Homology with a predicted ORF from N. gonorrhoeae
ORF89 shows 84.6% identity over a 162 aa overlap with a predicted ORF (ORF89.ng) from N. gonorrhoeae:
orf89 MMSNXMXQKGFTLIXXMIVVAILGIISVIAIPSYXSYIEKGYQSQLYTEMXGINNISKQF 60
|||| | ||||||| ||||:||||||||||||| ||||||||||||||| ||||: |||orf89ng MMSNKMEQKGFTLIEMMIVVTILGIISVIAIPSYQSYIEKGYQSQLYTEMVGINNVLLQF 60orf89 ILKNPLDDNQTIENKLEIFVSGYKMNPKIAKKYSVSVKFVDKEKSRAYRLVGVPKAGTGY 120
||||| |||:|:::||:||||||||||||||||||||:||| || |||||||||:|||||orf89ng ILKNPQDDNDTLKSKLKIFVSGYKMNPKIAKKYSVSVRFVDAEKPRAYRLVGVPNAGTGY 120orf89 TLSVWMNSVGDGYKCRDAASAQAHLETLSSDVGCEAFSNRKK 162
||||||||||||||||||:||||: :|:||| ||||||||||orf89ng TLSVWMNSVGDGYKCRDATSAQAYSDTLSADSGCEAFSNRKK 162全长ORF89ng核苷酸序列<SEQ ID 285>是:1 aTGATGAGCA ATAAAATGGA ACAAAAAGGG TTTACATTGA TTGAGATGAT51 GATAGTTGTC ACGATACTCG GCATCATCAG CGTCATTGCC ATACCTTCTTI01 ATCAGAGTTA TATTGAAAAA GGCTATCAGT CCCAGCTTTA TACGGAGATG151 GTCGGTATCA ACAATGTTCT CAAACAGTTT ATTTTGAAAA ATCCCCAGGA201 CGATAATGAT ACCCTCAAGA GCAAACTGAA AATATTTGTC TCAGGCTATA251 AGATGAATCC GAAAAttgCC AAAAAATATA GTGTTTCGGt aaggtttGTC301 gatGCGGAAA AACCAAGGGC ATACAGGTTG GTCGGCGTTC CGAACGCGGG351 GACGGGTTAT ACTTTGTCGG TATGGATGAA CAGCGTGGGC GACGGATACA401 AATGCCGTGA TGCCACTTCT GCCCAGGCCT ATTCGGACAC CTTGTCCGCA451 GATAGCGGCT GTGAAGCTTT CTCTAATCGT AAAAAATAG它编码的蛋白质具有氨基酸序列<SEQ ID 286>:1 MMSNKMEQKG FTLIEMMIVV TILGIISVIA IPSYQSYIEK GYQSQLYTEM51 VGINNVLKQF ILKNPQDDND TLKSKLKIFV SGYKMNPKIA KKYSVSVRFV101 DAEKPRAYRL VGVPNAGTGY TLSVWMNSVG DGYKCRDATS AQAYSDTLSA151 DSGCEAFSNR KK*
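This gonococcal protein has a putative leader sequence (underlined) and an N-terminal methylation site (NMePhe or type-4 pili, double-underlined). In addition, ORF89ng and ORF89-1 show 88.3% identity over a 162 aa overlap: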
10 20 30 40 50 60orf89-1.pep MMSNKMEQKGFTLIEMMIVVAILGIISVIAIPSYQSYIEKGYQSQLYTEMVGINNISKQF
||||||||||||||||||||:||||||||||||||||||||||||||||||||||: |||orf89ng MMSNKMEQKGFTLIEMMIVVTILGIISVIAIPSYQSYIEKGYQSQLYTEMVGINNVLKQF
10 20 30 40 50 60
70 80 90 100 110 120orf89-1.pep ILKNPLDDNQTIENKLEIFVSGYKMNPKIAKKYSVSVKFVDKEKSRAYRLVGVPKAGTGY
||||| |||:|:::||:||||||||||||||||||||:||| ||||||||:|||:|||||orf89ng ILKNPQDDNDTLKSKLKIFVSGYKMNPKIAKKYSVSVRFVDAEKPRAYRLVGVPNAGTGY
70 80 90 100 110 120
130 140 150 160orf89-1.pep TLSVWMNSVGDGYKCRDAASAQAHLETLSSDVGCEAFSNRKKX
||||||||||||||||||:||||: :|||:| |||||||||||orf89ng TLSVWMNSVGDGYKCRDATSAQAYSDTLSADSGCEAFSNRKKX
130 140 150 160
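The percent-identity figures quoted for these pairwise alignments can be reproduced from two already-aligned sequences with a short Python sketch. The function name `percent_identity` is hypothetical, and two conventions here are assumptions rather than anything stated in the text: gaps are written as `-`, and a position holding an unresolved residue `X` is never counted as identical.

```python
def percent_identity(a: str, b: str):
    """Return (identical, overlap, percent) for two aligned sequences.

    Both strings must already be aligned and therefore of equal length.
    Columns where both sequences have a gap are skipped; a column with
    a gap or an unresolved 'X' is counted in the overlap but not as a match.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    overlap = 0
    identical = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue
        overlap += 1
        if x == y and x not in "-X":
            identical += 1
    if overlap == 0:
        raise ValueError("no overlapping columns")
    return identical, overlap, 100.0 * identical / overlap


# C-terminal 42 residues of ORF89-1 and ORF89ng, taken from the alignment above.
orf89_1 = "TLSVWMNSVGDGYKCRDAASAQAHLETLSSDVGCEAFSNRKK"
orf89ng = "TLSVWMNSVGDGYKCRDATSAQAYSDTLSADSGCEAFSNRKK"
same, length, pct = percent_identity(orf89_1, orf89ng)
print(f"{same}/{length} = {pct:.1f}%")
```

Run over the full 162-residue alignment instead of this fragment, the same count gives the overall identity for the pair.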
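Based on this analysis (including the gonococcal motif and the homology with the known PilE protein), it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
As described above, ORF89-1 (13.6 kDa) was cloned into the pGEX vector and expressed in E. coli. The products of protein expression and purification were analysed by SDS-PAGE. Figure 11A shows the results of affinity purification of the GST-fusion protein. Purified GST-fusion protein was used to immunise mice, whose sera gave a positive result in the ELISA test, confirming that ORF89-1 is a surface-exposed protein and that it is a useful immunogen.
Example 41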
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 287>:1 ATGAAAAAAT CCTCCCTCAT CAGCGCATTG GGCATCGGTA TTTTGAGCAT51 CGGCATGGCA TTTGCCGCCC CTGCCGACGC GGTAAGCCAA ATCCGTCAAA101 ACGCCACTCA AGTATTGAGC ATCTTAAAAA ACGGCGATGC CAACACCGCT151 CGCCAAAAAG CCGAAGCCTA TGCGATTCCC TATTTCGATT TCCAACGTAT201 GACCGCATTG GCGGTCGGCA ACCCTTGGsG CACCG.GTCC GACG.GCAAA251 AACAAGCGTT GGCCn.AGAA TTTCAACCC...它对应于氨基酸序列<SEQ ID 288;ORF91>:1 MKKSSLISAL GIGILSIGMA FAAPADAVSQ IRQNATQVLS ILKNGDANTA51 RQKAEAYAIP YFDFQRMTAL AVGNPWXTXS DXQKQALAXE FQP...进一步的工作揭示了完整的核苷酸序列<SEQ ID 289>:1 ATGAAAAAAT CCTCCCTCAT CAGCGCATTG GGCATCGGTA TTTTGAGCAT51 CGGCATGGCA TTTGCCGCCC CTGCCGACGC GGTAAGCCAA ATCCGTCAAA101 ACGCCACTCA AGTATTGAGC ATCTTAAAAA ACGGCGATGC CAACACCGCT151 CGCCAAAAAG CCGAAGCCTA TGCGATTCCC TATTTCGATT TCCAACGTAT201 GACCGCATTG GCGGTCGGCA ACCCTTGGCG CACCGCGTCC GACGCGCAAA251 AACAAGCGTT GGCCAAAGAA TTTCAAACCC TGCTGATCCG CACCTATTCC301 GGCACGATGC TGAAATTAAA AAACGCCAAC GTCAACGTCA AAGACAATCC351 CATCGTCAAT AAAGGCGGCA AAGAAATCAT CGTCCGCGCC GAAGTCGGCG401 TACCCGGGCA AAAACCCGTC AACATGGACT TCACCACCTA CCAAAGCGGC451 GGTAAATACC GTACCTACAA CGTCGCCATC GAAGGCGCGA GCCTGGTTAC501 CGTGTACCGC AACCAATTCG GCGAAATTAT CAAAGCGAAA GGCGTGGACG551 GACTGATTGC CGAGTTGAAA GCCAAAAACG GCGGCAAATA A它对应于氨基酸序列<SEQ ID 290;ORF91-1>:1 MKKSSLISAL GIGILSIGMA FAAPADAVSQ IRQNATQVLS ILKNGDANTA51 RQKAEAYAIP YFDFQPMTAL AVGNPWRTAS DAQKQALAKE FQTLLIRTYS101 GTMLKLKNAN VNVKDNPIVN KGGKEIIVRA EVGVPGQKPV NMDFTTYQSG151 GKYRTYNVAI EGASLVTVYR NQFGEIIKAK GVDGLIAELK AKNGGK*该氨基酸序列的计算机分析给出了下列结果:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF91 shows 92.4% identity over a 92 aa overlap with an ORF (ORF91a) from strain A of N. meningitidis:
10 20 30 40 50 60orf91.pep MKKSSLISALGIGILSIGMAFAAPADAVSQIRQNATQVLSILKNGDANTARQKAEAYAIP
|||||:||||||||||||||||||||||:||||||||||||||:||||||||||||||||orf91a MKKSSFISALGIGILSIGMAFAAPADAVNQIRQNATQVLSILKSGDANTARQKAEAYAIP
10 20 30 40 50 60
70 80 90orf91.pep YFDFQRMTALAVGNPWXTXSDXQKQALAXEFQP
|||||||||||||||| | || |||||| |||orf91a YFDFQRMTALAVGNPWRTASDAQKQALAKEFQTLLIRTYSGTMLKLKNANVNVKDNPIVN
70 80 90 100 110 120orf91a KGGKEIIVRAEVGVPGQKPVNMDFTTYQSGGKYRTYNVAIEGASLVTVYRNQFGEIIKAK
130 140 150 160 170 180全长ORF91a核苷酸序列<SEQ ID 291>是:1 ATGAAAAAAT CCTCCTTCAT CAGCGCATTG GGCATCGGTA TTTTGAGCAT51 CGGCATGGCA TTTGCCGCCC CTGCCGACGC GGTAAACCAA ATCCGTCAAA101 ACGCCACTCA AGTATTGAGC ATCTTAAAAA GCGGTGATGC CAACACCGCC151 CGCCAAAAAG CCGAAGCCTA TGCGATTCCC TATTTCGATT TCCAACGTAT201 GACCGCATTG GCGGTCGGCA ACCCTTGGCG CACCGCGTCC GACGCGCAAA251 AACAAGCGTT GGCCAAAGAA TTTCAAACCC TGCTGATCCG CACCTATTCC301 GGCACGATGC TGAAATTAAA AAACGCCAAC GTCAACGTCA AAGACAATCC351 CATCGTCAAT AAAGGCGGCA AAGAAATCAT CGTCCGCGCC GAAGTCGGCG401 TACCCGGGCA AAAACCCGTC AACATGGACT TCACCACCTA CCAAAGCGGC451 GGTAAATACC GTACCTACAA CGTCGCCATC GAAGGCCCGA GCCTGGTTAC501 CGTGTACCGC AACCAATTCG GCGAAATTAT CAAAGCGAAA GGCGTGGACG551 GACTGATTGC CGAGTTGAAG GCTAAAAACG GCAGCAAGTA A它编码的蛋白质具有氨基酸序列<SEQ ID 292>:1 MKKSSFISAL GIGILSIGMA FAAPADAVNQ IRQNATQVLS ILKSGDANTA51 RQKAEAYAIP YFDFQRMTAL AVGNPWRTAS DAQKQALAKE FQTLLIRTYS101 GTMLKLKNAN VNVKDNPIVN KGGKEIIVRA EVGVPGQKPV NMDFTTYQSG151 GKYRTYNVAI EGASLVTVYR NQFGEIIKAK GVDGLIAELK AKNGSK*ORF91a和ORF91-1显示在196个氨基酸的重叠区内有98.0%的相同性:
10 20 30 40 50 60orf91a.pep MKKSSFISALGIGILSIGMAFAAPADAVNQIRQNATQVLSILKSGDANTARQKAEAYAIP
|||||:||||||||||||||||||||||:||||||||||||||:||||||||||||||||orf91-1 MKKSSLISALGIGILSIGMAFAAPADAVSQIRQNATQVLSILKNGDANTARQKAEAYAIP
10 20 30 40 50 60
70 80 90 100 110 120orf91a.pep YFDFQRMTALAVGNPWRTASDAQKQALAKEFQTLLIRTYSGTMLKLKNANVNVKDNPIVN
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf91-1 YFDFQRMTALAVGNPWRTASDAQKQALAKEFQTLLIRTYSGTMLKLKNANVNVKDNPIVN
70 80 90 100 110 120
130 140 150 160 170 180orf91a.pep KGGKEIIVRAEVGVPGQKPVNMDFTTYQSGGKYRTYNVAIEGASLVTVYRNQFGEIIKAK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf91-1 KGGKEIIVRAEVGVPGQKPVNMDFTTYQSGGKYRTYNVAIEGASLVTVYRNQFGEIIKAK
130 140 150 160 170 180
190orf91a.pep GVDGLIAELKAKNGSKX
||||||||||||||:||orf91-1 GVDGLIAELKAKNGGKX
190
Homology with a predicted ORF from N. gonorrhoeae
ORF91 shows 84.8% identity over a 92 aa overlap with a predicted ORF (ORF91.ng) from N. gonorrhoeae:
orf91.pep MKKSSLISALGIGILSIGMAFAAPADAVSQIRQNATQVLSILKNGDANTARQKAEAYAIP 60
:||||:||||||||||||||||:|||||:||||||||||:|||:||| :|| ||||||:|orf91ng VKKSSFISALGIGILSIGMAFASPADAVGQIRQNATQVLTILKSGDAASARPKAEAYAVP 60orf91.pep YFDFQRMTALAVGNPWXTXSDXQKQALAXEFQP 93
|||||||||||||||| | ||||||||| |||orf91ng YFDFQRMTALAVGNPWRTASDAQKQALAKEFQTLLIRTYSGTMLKFKNATVNVKDNPIVN 120
预计全长ORF91ng核苷酸序列<SEQ ID 293>编码的蛋白质具有氨基酸序列<SEQ ID 294>:1 VKKSSFISAL GIGILSIGMA FASPADAVGQ IRQNATQVLT ILKSGDAASA51 RPKAEAYAVP YFDFQRMTAL AVGNPWRTAS DAQKQALAKE FQTLLIRTYS101 GTMLKFKNAT VNVKDNPIVN KGGKEIVVRA EYGIPGQKPV NMDFTTYQSG151 GKYRTYNVAI EGTSLVTVYR NQFGEIIKAK GIDGLIAELK AKNGGK*进一步的工作揭示了完整的核苷酸序列<SEQ ID 295>:1 ATGAAAAAAT CCTCCTTCAT CAGCGCATTG GGCATCGGTA TTTTGAGCAT51 CGGCATGGCA TTTGCCTCCC CGGCCGACGC AGTGGGACAA ATCCGCCAAA101 ACGCCACACA GGTTTTGACC ATCCTCAAAA GCGGCGACGC GGCTTCTGCA151 CGCCCAAAAG CCGAAGCCTA TGCGGTTCCC TATTTCGATT TCCAACGTAT201 GACCGCATTG GCGGTCGGCA ACCCTTGGCG TACCGCGTCC GACGCGCAAA251 AACAAGCGTT GGCCAAAGAA TTTCAAACCC TGCTGATCCG CACCTATTCC301 GGCACGATGC TGAAATTCAA AAACGCGACC GTCAACGTCA AAGACAATCC351 CATCGTCAAT AAGGGCGGCA AGGAAATCGT CGTCCGTGCC GAAGTCGGCA401 TCCCCGGTCA GAAGCCCGTC AATATGGACT TTACCACCTA CCAAAGCGGC451 GGCAAATACC GTACCTACAA CGTCGCCATC GAAGGCACGA GCCTGGTTAC501 CGTGTACCGC AACCAATTCG GCGAAATCAT CAAAGCCAAA GGCATCGACG551 GGCTGATTGC CGAGTTGAAA GCCAAAAACG GCGGCAAATA A它对应于氨基酸序列<SEQ ID 296;ORF91ng-1>:1 MKKSSFISAL GIGILSIGMA FASPADAVGQ IRQNATQVLT ILKSGDAASA51 RPKAEAYAVP YFDFQRMTAL AYGNPWRTAS DAQKQALAKE FQTLLIRTYS101 GTMLKFKNAT VNVKDNPIVN KGGKEIVVRA EVGIPGQKPV NMDFTTYQSG151 GKYRTYNVAI EGTSLVTVYR NQFGEIIKAK GIDGLIAELK AKNGGK*ORF91ng-1和ORF91-1显示在196个氨基酸的重叠区内有92.3%的相同性:
10 20 30 40 50 60orf91-1.pep MKKSSLISALGIGILSIGMAFAAPADAVSQIRQNATQVLSILKNGDANTARQKAEAYAIP
|||||:||||||||||||||||:|||||:||||||||||:|||:||| :|| ||||||:|orf91ng-1 MKKSSFISALGIGILSIGMAFASPADAVGQIRQNATQVLTILKSGDAASARPKAEAYAVP
10 20 30 40 50 60
70 80 90 100 110 120orf91-1.pep YFDFQRMTALAVGNPWRTASDAQKQALAKEFQTLLIRTYSGTMLKLKNANVNVKDNPIVN
|||||||||||||||||||||||||||||||||||||||||||||:|||:||||||||||orf91ng-1 YFDFQRMTALAVGNPWRTASDAQKQALAKEFQTLLIRTYSGTMLKFKNATVNVKDNPIVN
70 80 90 100 110 120
130 140 150 160 170 180orf91-1.pep KGGKEIIVRAEVGVPGQKPVNMDFTTYQSGGKYRTYNVAIEGASLVTVYRNQFGEIIKAK
||||||:||||||:||||||||||||||||||||||||||||:|||||||||||||||||orf91ng-1 KGGKEIVVRAEVGIPGQKPVNMDFTTYQSGGKYRTYNVAIEGTSLVTVYRNQFGEIIKAK
130 140 150 160 170 180
190orf91-1.pep GVDGLIAELKAKNGGKX
|:|||||||||||||||orf91ng-1 GIDGLIAELKAKNGGKX
190
In addition, ORF91ng-1 shows homology with a hypothetical E. coli protein:
sp|P45390|YRBC_ECOLI hypothetical 24.0 kD protein in MURA-RPON intergenic region precursor (F211) >gi|606130 (U18997) ORF_f211 [Escherichia coli] >gi|1789583 (AE000399) hypothetical 24.0 kD protein in murZ-rpoN intergenic region [Escherichia coli] Length = 211
Score = 70.6 bits (170), Expect = 6e-12
Identities = 42/137 (30%), Positives = 76/137 (54%), Gaps = 6/137 (4%)
Query: 59 VPYFDFQRMTALAVGNPWRTASDAQKQALAKEFQTLLIRTYSGTMLKFKNATVNVKDNPI 118
+PY + AL +G +++A+ AQ++A F+ L + Y + + T + P
Sbjct: 65 LPYVQVKYAGALVLGQYYKSATPAQREAYFAAFREYLKQAYGQALAMYHGQTYQIA--PE 122
Query: 119 VNKGGKEIV-VRAEVGIP-GQKPVNMDFTTYQSG--GKYRTYNVAIEGTSLVTVYRNQFG 174
G K IV +R + P G+ PV +DF ++ G ++ Y++ EG S++T +N++G
Sbjct: 123 QPLGDKTIVPIRVTIIDPNGRPPVRLDFQWRKNSQTGNWQAYDMIAEGVSMITTKQNEWG 182
Query: 175 EIIKAKGIDGLIAELKA 191
+++ KGIDGL A+LK+
Sbjct: 183 TLLRTKGIDGLTAQLKS 199
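Based on this analysis, including the presence of a putative leader sequence in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 42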
在脑膜炎奈瑟球菌中鉴定出下列DNA序列<SEQ ID 297>:1 ATGAAACACA TACTCCCCCT GATTGCCGCA TCCGCACTCT GCATTTCAAC51 CGCTTCGGCA CATCCTGCCA GCGAACCGTC CACTCAAAAC GAAACCGCTA101 TGATCACGCA TACCCTCATC TCAAAATACA GTTTTGGnnn nnnnnnnnnn151 nnnnnnnnnn nnGCCATAAA AAGCAAAGGG ATGGACATTT TTGCCGTCAT201 CGACCATCAG GAAGCCGCAC GCCGAAACGG CTTAACGATG CAGCCGGCAA251 AAGTCATCGT CTTCGGCACG CCCAAAGCCG GCACGCCGCT GATGGTCAAA301 GACCCCGCCT TCGCCCTGCA ACTGCCCCTA CGCGTCCTCG TTACCGAAAC351 GGACGGCAAA GTACGCGCCG CCTATACCGA TACGCGCGCC CTCATCGCCG401 GCAGCCGCAT CGGTTTCGAC GAAGTGGCAA ACACTTTGGC AAACGCCGAA451 AAACTGATAC AAAAAACCGT AGGCGAATAA它对应于氨基酸序列<SEQ ID 298;ORF97>:1 MKHILPLIAA SALCISTASA HPASEPSTQN ETAMITHTLI SKYSFGXXXX51 XXXXAIKSKG MDIFAVIDHQ EAARRNGLTM QPAKVIVFGT PKAGTPLMVK101 DPAFALQLPL RVLVTETDGK VRAAYTDTRA LIAGSRIGFD EYANTLANAE151 KLIQKTVGE*进一步的工作揭示了完整的核苷酸序列<SEQ ID 299>:1 ATGAAACACA TACTCCCCCT GATTGCCGCA TCCGCACTCT GCATTTCAAC51 CGCTTCGGCA CATCCTGCCA GCGAACCGTC CACCCAAAAC GAAACCGCTA101 TGACCACGCA TACCCTCACC TCAAAATACA GTTTTGACGA AACCGTCAGC151 CGCCTTGAAA CCGCCATAAA AAGCAAAGGG ATGGACATTT TTGCCGTCAT201 CGACCATCAG GAAGCCGCCC GCCGAAACGG CTTAACGATG CAGCCGGCAA251 AAGTCATCGT CTTCGGCACG CCCAAAGCCG GCACGCCGCT GATGGTCAAA301 GACCCCGCCT TCGCCCTGCA ACTGCCCCTA CGCGTCCTCG TTACCGAAAC351 GGACGGCAAA GTACGCGCCG CCTATACCGA TACGCGCGCC CTCATCGCCG401 GCAGCCGCAT CGGTTTCGAC GAAGTGGCAA ACACTTTGGC AAACGCCGAA451 AAACTGATAC AAAAAACCGT AGGCGAATAA它对应于氨基酸序列<SEQ ID 300;ORF97-1>:1 MKHILPLIAA SALCISTASA HPASEPSTQN ETAMTTHTLT SKYSFDETVS51 RLETAIKSKG MDIFAVIDHQ EAARRNGLTM QPAKVIVFGT PKAGTPLMVK101 DPAFALQLPL RVLVTETDGK VRAAYTDTRA LIAGSRIGFD EVANTLANAE151 KLIQKTYGE*
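Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF97 shows 88.7% identity over a 159 aa overlap with an ORF (ORF97a) from strain A of N. meningitidis: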
10 20 30 40 50 60orf97.pep MKHILPLIAASALCISTASAHPASEPSTQNETAMITHTLISKYSFGXXXXXXXXAIKSKG
||||||| |||||||||| ||||||:||||||| |||||||||| : :||||||orf97a MXHILPLXXASALCISTASXHPASEPQTQNETAMTTHTLTSKYSFDETVSRLETAIKSKG
10 20 30 40 50 60
70 80 90 100 110 120orf97.pep MDIFAVIDHQEAARRNGLTMQPAKVIVFGTPKAGTPLMVKDPAFALQLPLRVLVTETDGK
|||||||||||||||||||||||||||||||||||||||||||||||||||| |||||||orf97a MDIFAVIDHQEAARRNGLTMQPAKVIVFGTPKAGTPLMVKDPAFALQLPLRVXVTETDGK
70 80 90 100 110 120
130 140 150 160orf97.pep VRAAYTDTRALIAGSRIGFDEVANTLANAEKLIQKTVGEX
||||||||||||||||||||||||||||||||||||:|||orf97a VRAAYTDTRALIAGSRIGFDEVANTLANAEKLIQKTIGEX
130 140 150 160全长ORF97a核苷酸序列<SEQ ID 301>是:1 ATGANACACA TACTCCCCCT GANTGNCGCA TCCGCACTCT GCATTTCAAC51 CGCTTCGGNN CATCCTGCCA GCGAACCGCA AACCCAAAAC GAAACCGCTA101 TGACCACGCA TACCCTCACC TCAAAATACA GTTTTGACGA AACCGTCAGC151 CGCCTTGAAA CCGCCATAAA AAGCAAAGGG ATGGACATTT TTGCCGTCAT201 CGACCATCAG GAAGCCGCCC GCCGAAACGG CTTAACGATG CAGCCGGCAA251 AAGTCATCGT CTTCGGCACG CCCAAAGCCG GTACGCCGCT GATGGTCAAA301 GACCCCGCCT TCGCCCTGCA ACTGCCCCTG CGCGTCNTCG TTACCGAAAC351 GGACGGCAAA GTACGCGCCG CCTATACCGA TACGCGCGCC CTCATCGCCG401 GCAGCCGCAT CGGTTTCGAC GAAGTGGCAA ACACTTTGGC AAACGCCGAA451 AAACTGATAC AAAAAACCAT AGGCGAATAA它编码的蛋白质具有氨基酸序列<SEQ ID 302>:1 MXHILPLXXA SALCISTASX HPASEPQTQN ETAMTTHTLT SKYSFDETVS51 RLETAIKSKG MDIFAVIDHQ EAARRNGLTM QPAKVIVFGT PKAGTPLMVK101 DPAFALQLPL RVXVTETDGK VRAAYTDTRA LIAGSRIGFD EVANTLANAE151 KLIQKTIGE*ORF97a和ORF97-1显示在159个氨基酸的重叠区内有95.6%的相同性:
10 20 30 40 50 60orf97a.pep MXHILPLXXASALCISTASXHPASEPQTQNETAMTTHTLTSKYSFDETVSRLETAIKSKG
| ||||| |||||||||| ||||||:|||||||||||||||||||||||||||||||||orf97-1 MKHILPLIAASALCISTASAHPASEPSTQNETAMTTHTLTSKYSFDETVSRLETAIKSKG
10 20 30 40 50 60
70 80 90 100 110 120orf97a.pep MDIFAVIDHQEAARRNGLTMQPAKVIVFGTPKAGTPLMVKDPAFALQLPLRVXVTETDGK
|||||||||||||||||||||||||||||||||||||||||||||||||||| |||||||orf97-1 MDIFAVIDHQEAARRNGLTMQPAKVIVFGTPKAGTPLMVKDPAFALQLPLRVLVTETDGK
70 80 90 100 110 120
130 140 150 160orf97a.pep VRAAYTDTRALIAGSRIGFDEVANTLANAEKLIQKTIGEX
||||||||||||||||||||||||||||||||||||:|||orf97-1 VRAAYTDTRALIAGSRIGFDEVANTLANAEKLIQKTVGEX
130 140 150 160
Homology with a predicted ORF from N. gonorrhoeae
ORF97 shows 88.1% identity over a 159 aa overlap with a predicted ORF (ORF97.ng) from N. gonorrhoeae:
orf97.pep MKHILPLIAASALCISTASAHPASEPSTQNETAMITHTLISKYSFGXXXXXXXXAIKSKG 60
|||||| |||||:||||||||||::| ||||||| |||| |||| : :||||||orf97ng MKHILPPIAASAFCISTASAHPAGKPPTQNETAMTTHTLTSKYSFDETVSRLETAIKSKG 60orf97.pep MDIFAVIDHQEAARRNGLTMQPAKVIVFGTPKAGTPLMVKDPAFALQLPLRVLVTETDGK 120
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf97ng MDIFAVIDHQEAARRNGLTMQPAKVIVFGTPKAGTPLMVKDPAFALQLPLRVLVTETDGK 120orf97.pep VRAAYTDTRALIAGSRIGFDEVANTLANAEKLIQKTVGE 159
||:|||||||||:||||:|||||||||||||||||||||orf97ng VRTAYTDTRALIVGSRISFDEVANTLANAEKLIQKTVGE 159
预计全长ORF97ng核苷酸序列<SEQ ID 303>编码的蛋白质具有氨基酸序列<SEQ ID 304>:1 MKHILPPIAA SAFCISTASA HPAGKPPTQN ETAMTTHTLT SKYSFDETVS51 RLETAIKSKG MDIFAVIDHQ EAARRNGLTM QPAKVIVFGT PKAGTPLMVK101 DPAFALQLPL RVLVTETDGK VRTAYTDTRA LIVGSRISFD EVANTLANAE151 KLIQKTVGE*进一步的工作揭示了完整的核苷酸序列<SEQ ID 305>:1 ATGAAACACA TACTCCCcct gatcgccgca TccgcactCT GCATTTCAAC51 CGCTTCGGCA CACCCTGCCG GCAAACCGCC CACCCAAAAC GAAACCGCTA101 TGACCACGCA CACCCTCACC TCGAAATACA GTTTTGACGA AACCGTCAGC151 CGCCTTGAAA CCGCCATAAA AAGCAAAGGG ATGGACATTT TTGCCGTCAT201 CGACCATCAG GAAGCGGCAC GCCGAAACGG CCTGACCATG CAGCCGGCAA251 AAGTCATCGT CTTCGGCACG CCCAAGGCCG GTACGCCgct GATGGTCAAA301 GACCCCGCCT TCGCCCTGCA ACTGCCCCTG CGCGTCCTCG TTACCGAAAC351 GGACGGCAAA GTACGCACCG CCTATACCGA TACGCGCGCC CTCATCGTCG401 GCAGCCGCAT CAGTTTCGAC GAAGTGGCAA ACACTTTGGC AAACGCCGAA451 AAACTGATAC AAAAAACCGT AGGCGAATAA它对应于氨基酸序列<SEQ ID 306;ORF97ng-1>:1 MKHILPLIAA SALCISTASA HPAGKPPTQN ETAMTTHTLT SKYSFDETVS51 RLETAIKSKG MDIFAVIDHQ EAARRNGLTM QPAKVIVFGT PKAGTPLMVK101 DPAFALQLPL RVLVTETDGK VRTAYTDTRA LIVGSRISFD EVANTLANAE151 KLIQKTVGE*ORF97ng-1和ORF97-1显示在159个氨基酸的重叠区内有96.2%的相同性:
10 20 30 40 50 60orf97-1.pep MKHILPLIAASALCISTASAHPASEPSTQNETAMTTHTLTSKYSFDETVSRLETAIKSKG
|||||||||||||||||||||||::| |||||||||||||||||||||||||||||||||orf97ng-1 MKHILPLIAASALCISTASAHPAGKPPTQNETAMTTHTLTSKYSFDETVSRLETAIKSKG
10 20 30 40 50 60
70 80 90 100 110 120orf97-1.pep MDIFAVIDHQEAARRNGLTMQPAKVIVFGTPKAGTPLMVKDPAFALQLPLRVLVTETDGK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf97ng-1 MDIFAVIDHQEAARRNGLTMQPAKVIVFGTPKAGTPLMVKDPAFALQLPLRVLVTETDGK
70 80 90 100 110 120
130 140 150 160orf97-1.pep VRAAYTDTRALIAGSRIGFDEVANTLANAEKLIQKTVGEX
||:|||||||||:||||:||||||||||||||||||||||orf97ng-1 VRTAYTDTRALIVGSRISFDEVANTLANAEKLIQKTVGEX
130 140 150 160
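Based on this analysis, including the presence of a putative leader sequence in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
As described above, ORF97-1 (15.3 kDa) was cloned into pET and pGEX vectors and expressed in E. coli. The products of protein expression and purification were analysed by SDS-PAGE. Figures 12A and 12B show the results of affinity purification of the GST-fusion protein and the His-fusion protein, respectively. Purified GST-fusion protein was used to immunise mice, whose sera were used for Western blot (Figure 12C), ELISA (positive result), and FACS analysis (Figure 12D). These experiments confirm that ORF97-1 is a surface-exposed protein and that it is a useful immunogen.
Figure 12E shows plots of hydrophilicity, antigenic index, and AMPHI regions for ORF97-1.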
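A hydrophilicity curve of the kind plotted in Figure 12E is typically a sliding-window average of per-residue scale values. The sketch below is only an illustration: the text does not say which scale or window was used, so the Hopp-Woods scale, the window of 6 residues, and the function name `hydrophilicity_profile` are all assumptions. The antigenic index and AMPHI calculations are not attempted here.

```python
# Hopp-Woods hydrophilicity values (assumed scale; not named in the text).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def hydrophilicity_profile(protein: str, window: int = 6):
    """Return one averaged score per window position along the sequence.

    Residues missing from the scale (for example 'X' or '*') score 0.0.
    """
    seq = protein.upper()
    scores = [HOPP_WOODS.get(aa, 0.0) for aa in seq]
    profile = []
    for start in range(len(scores) - window + 1):
        profile.append(sum(scores[start:start + window]) / window)
    return profile


# N-terminal 30 residues of ORF97-1 (SEQ ID 300).
for index, value in enumerate(hydrophilicity_profile("MKHILPLIAASALCISTASAHPASEPSTQN")):
    print(index + 1, round(value, 2))
```

Peaks in such a profile mark stretches that are more likely to be solvent-exposed; the window size trades resolution against noise.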
Example 43
在脑膜炎奈瑟球菌中鉴定出下列认为是完整的DNA<SEQ ID 307>:1 ATGGCTTTTA TTACGCGCTT ATTCAAAAGC AGTAAATGGC TGATTGTGCC51 GCTGATGCTC CCCGCCTTTC AGAATGTGGC GGCGGAGGGG ATAGATGTGA101 GCCGTGCCGA AGCGAGGATA ACCGACGGCG GGCAGCTTTC CATCAGCAGC151 CGCTTCCAAA CCGAGCTGCC CGACCAGCTC CAACAGGCGT TGCGCCGGGg201 CGTGCCGCTC AACTTTACCT TAAGCTGGCA GCTTTCCGCC CCGATAATCG251 CTTCTTATGG GTTTAAATTG GGGCAACTGA TTGGCGATGA CGACaATATT301 GACTACAAAC TGAGTTTCCA TCCGCTGACc AaACGCTACC GCGTTACCgT351 CGgCGCGTTT TCGACAGACT ACGACACCTT GGATGCGGCA TTGCGCGCGA401 CCGGCGCGGT TGCCAACTGG AAAGTCCTGA ACAAAGGCGC GCTGTCCGGT451 GCGGAAGCAG GGGAAACCAA GGCGGAAATC CGCCTGACGC TGTCCACTTC501 AAAACTGCCC AAGCCTTTTC AAATCAATGC ATTGACTTCT CAAAACTGGC551 ATTTGGATTC GGGTTGGAAA CCTCTAAACA TCATCGGGAA CAAATAA它对应于氨基酸序列<SEQ ID 308;ORF106>:1 MAFITRLFKS SKWLIVPLML PAFQNVAAEG IDVSRAEARI TDGGQLSISS51 RFQTELPDQL QQALRRGVPL NFTLSWQLSA PIIASYRFKL GQLIGDDDNI101 DYKLSFHPLT KRYRVTVGAF STDYDTLDAA LRATGAVANW KVLNKGALSG151 AEAGETKAEI RLTLSTSKLP KPFQINALTS QNWHLDSGWK PLNIIGNK*进一步的工作揭示了下列DNA序列<SEQ ID 309>:1 ATGGCTTTTA TTACGCGCTT ATTCAAAAGC AGTAAATGGC TGATTGTGCC51 GCTGATGCTC CCCGCCTTTC AGAATGTGGC GGCGGAGGGG ATAGATGTGA101 GCCGTGCCGA AGCGAGGATA ACCGACGGCG GGCAGCTTTC CATCAGCAGC151 CGCTTCCAAA CCGAGCTGCC CGACCAGCTC CAACAGGCGT TGCGCCGGGG201 CGTGCCGCTC AACTTTACCT TAAGCTGGCA GCTTTCCGCC CCGATAATCG251 CTTCTTATCG GTTTAAATTG GGGCAACTGA TTGGCGATGA CGACAATATT301 GACTACAAAC TGAGTTTCCA TCCGCTGACC AACCGCTACC GCGTTACCGT351 CGGCGCGTTT TCGACAGACT ACGACACCTT GGATGCGGCA TTGCGCGCGA401 CCGGCGCGGT TGCCAACTGG AAAGTCCTGA ACAAAGGCGC GCTGTCCGGT451 GCGGAAGCAG GGGAAACCAA GGCGGAAATC CGCCTGACGC TGTCCACTTC501 AAAACTGCCC AAGCCTTTTC AAATCAATGC ATTGACTTCT CAAAACTGGC551 ATTTGGATTC GGGTTGGAAA CCTCTAAACA TCATCGGGAA CAAATAA它对应于氨基酸序列<SEQ ID 310;ORF106-1>:1 MAFITRLFKS SKWLIVPLML PAFQNVAAEG IDVSRAEARI TDGGQLSISS51 RFQTELPDQL QQALRRGYPL NFTLSWQLSA PIIASYRFKL GQLIGDDDNI101 DYKLSFHPLT ARYRVTVGAF STDYDTLDAA LRATGAVANW KVLNKGALSG151 AEAGETKAEI RLTLSTSKLP KPFQINALTS QNWHLDSGWK PLNIIGNK*
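Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF106 shows 87.4% identity over a 199 aa overlap with an ORF (ORF106a) from strain A of N. meningitidis: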
10 20 30 40 50 59orf106.pep MAFITRLFKSSK-WLIVPLMLPAFQNVAAEGIDVSRAEARITDGGQLSISSRFQTELPDQ
|||||||||| | ||:: || :: ::||||||:|||||||:|||||| ||||||||||orf106a MAFITRLFKSIKQWLVLLPMLSVLPDAAAEGIDVSRAEARIXDGGQLSXXSRFQTELPDQ
10 20 30 40 50 60
60 70 80 90 100 110 119orf106.pep LQQALRRGVPLNFTLSWQLSAPIIASYRFKLGQLIGDDDNIDYKLSFHPLTKRYRVTVGA
|| | ||| || || ||||||||||||| ||||||||| |||||||||||:||||||||orf106a LQXAXXRGVXLNXTLXWQLSAPIIASYRFXLGQLIGDDDXIDYKLSFHPLTNRYRVTVGA
70 80 90 100 110 120
120 130 140 150 160 170 179orf106.pep FSTDYDTLDAALRATGAVANWKVLNKGALSGAEAGETKAEIRLTLSTSKLPKPFQINALT
||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf106a FSTXYDTLDAALRATGAVANWKVLNKGALSGAEAGETKAEIRLTLSTSKLPKPFQINALT
130 140 150 160 170 180
180 190 199orf106.pep SQNWHLDSGWKPLNIIGNKX
||||||||||||||||||||orf106a SQNWHLDSGWKPLNIIGNKX
190 200
Because the K at residue 111 is replaced by N, the homology between ORF106a and ORF106-1 over the same 199 aa overlap is 87.9%.
全长ORF106a核苷酸序列<SEQ ID 311>是:1 ATGGCTTTTA TTACGCGCTT ATTCAAAAGC ATTAAACAAT GGCTTGTGCT51 GCTGCCGATG CTTTCCGTTT TGCCGGACGC GGCGGCGGAG GGGATAGATG101 TGAGCCGCGC CGAAGCGAGG ATAANCGACG GCGGGCAGCT TTCCATNAGN151 AGCCGCTTCC AAACCGAGCT GCCCGACCAG CTCCAANNNG CGNNGNGCCG201 GGGCGTGNCG CTCAACTNTA CCTTAAGNTG GCAGCTTTCC GCCCCGATAA251 TCGCTTCTTA TCGGTTTNAA TTGGGGCAAC TGATTGGCGA TGACGACNAT301 ATTGACTACA AACTGAGTTT CCATCCGCTG ACCAACCGCT ACCGCGTTAC351 CGTCGGCGCG TTTTCGACAG ANTACGACAC CTTGGATGCG GCATTGCGCG401 CGACCGGCGC GGTTGCCAAC TGGAAAGTCC TGAACAAAGG CGCGCTGTCC451 GGTGCGGAAG CAGGGGAAAC CAAGGCGGAA ATCCGCCTGA CGCTGTCCAC501 TTCAAAACTG CCCAAGCCTT TTCAAATCAA TGCATTGACT TCTCAAAACT551 GGCATTTGGA TTCGGGTTGG AAACCTCTAA ACATCATCGG GAACAAATAA它编码的蛋白质具有氨基酸序列<SEQ ID 312>:1 MAFITRLFKS IKQWLVLLPM LSVLPDAAAE GIDVSRAEAR IXDGGQLSXX51 SRFQTELPDQ LQXAXXRGVX LNXTLXWQLS APIIASYRFX LGQLIGDDDX101 IDYKLSFHPL TNRYRVTVGA FSTXYDTLDA ALRATGAVAN WKVLNKGALS151 GAEAGETKAE IRLTLSTSKL PKPFQINALT SQNWHLDSGW KPLNIIGNK*
Homology with a predicted ORF from N. gonorrhoeae
ORF106 shows 90.5% identity over a 199 aa overlap with a predicted ORF (ORF106.ng) from N. gonorrhoeae:
orf106.pep MAFITRLFKSSK-WLIVPLMLPAFQNVAAEGIDVSRAEARITDGGQLSISSRFQTELPDQ 59
|||||||||| | ||:: :| :: ::||||| ::||||||||||:||||||||||||||orf106ng MAFITRLFXSIKQWLVLLPILSVLPDAAAEGIAATRAEARITDGGRLSISSRFQTELPDQ 60orf106.pep LQQALRRGVPLNFTLSWQLSAPIIASYRFKLGQLIGDDDNIDYKLSFHPLTKRYRVTVGA 119
|||||||||||||||||||||||||||||||||||||||||||||||||||:||||||||orf106ng LQQALRRGVPLNFTLSWQLSAPTIASYRFKLGQLIGDDDNIDYKLSFHPLTNRYRVTVGA 120orf106.pep FSTDYDTLDAALRATGAVANWKVLNKGALSGAEAGETKAEIRLTLSTSKLPKPFQINALT 179
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf106ng FSTDYDTLDAALRATGAVANWKVLNKGALSGAEAGETKAEIRLTLSTSKLPKPFQINALT 180orf106.pep SQNWHLDSGWKPLNIIGNK 198
|||||||||||||||||||orf106ng SQNWHLDSGWKPLNIIGNK 199
Because the K at residue 111 is replaced by N, the homology between ORF106ng and ORF106-1 over the same 199 aa overlap is 91.0%.
全长ORF106ng核苷酸序列<SEQ ID 313>是:1 ATGGCTTTTA TTACGCGCTT ATTCAAAAGD ATTAAACAAT GGCTTGTGCT51 GTTGCCGATA CTCTCCGTTT TGCCGGACGC GGCGGCGGAG GGCATTGCCG101 CGACCCGCGC CGAAGCGAGG ATAACCGACG GCGGGCGGCT TTCCATCAGC151 AGCCGCTTCC AAACCGAGCT GCCCGACCAG CTCCAACAGG CGTTGCGCCG201 GGGCGTACCG CTCAACTTTA CCTTAAGCTG GCAGCTTTCC GCCCCGACAA251 TCGCTTCTTA TCGGTTTAAA TTGGGGCAAC TGATTGGCGA TGACGACAAT301 ATTGACTACA AACTAAGTTT CCATCCGCTG ACCAACCGCT ACCGCGTTAC351 CGTCGGCGCA TTTTCCACCG ATTACGACAC TTTGGATGCG GCATTGCGCG401 CGACCGGCGC GGTTGCCAAC TGGAAAGTCC TGAACAAAGG CGCGTTGTCC451 GGTGCGGAAG CAGGGGAAAC CAAGGCGGAA ATCCGCCTGA CGCTGTCCAC501 TTCAAAACTG CCCAAGCCTT TCGAAATCAA CGCATTGACT TCTCAAAACT551 GGCATTTGGA TTCGGGTTGG AAACCTCTAA ACATCATCGG GAACAAATAA它编码的蛋白质具有氨基酸序列<SEQ ID 314>:1 MAFITRLFKS IKQWLVLLPI LSVLPDAAAE GIAATRAEAR ITDGGRLSIS51 SRFQTELPDQ LQQALRRGVP LNFTLSWQLS APTIASYRFK LGQLIGDDDN101 IDYKLSFHPL TNRYRVTVGA FSTDYDTLDA ALRATGAVAN WKVLNKGALS151 GAEAGETKAE IRLTLSTSKL PKPFQINALT SQNWHLDSGW KPLNIIGNK*
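Based on this analysis, including the presence of a putative leader sequence in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
As described above, ORF106-1 (18 kDa) was cloned into pET and pGEX vectors and expressed in E. coli. The products of protein expression and purification were analysed by SDS-PAGE. Figure 13A shows the results of affinity purification of the His-fusion protein, and Figure 13B shows the results of expression of the GST-fusion in E. coli. Purified His-fusion protein was used to immunise mice, whose sera were used for FACS analysis (Figure 13C). These experiments confirm that ORF106-1 is a surface-exposed protein and that it is a useful immunogen.
Example 44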
在脑膜炎奈瑟球菌中鉴定出下列认为是完整的DNA序列<SEQ ID 315>:1 ATGGACACAA AAGAAATCCT CGG.TACGCG GcAGGcTCGA TCGGCAGCGC51 GGTTTTAGCC GTCATCATCc TGCCGCTGCT GTCGTGGTAT TTCCCCGCCG101 ACGACATCGG GCGCATCGTG CTGATGCAGA CGGCGGCGGG GCTgACGGTG151 TCGGTGTTGT GCCTCGGGCT GGATCAGGCA TACGTCCGCG AATACTATGC201 CACCGCCGAC AAAGACAcCT TGTTCAAAAC CCTGTTCCTG CCGCCGCTGC251 TGTCTGCCGC CGCGATAGCC GCCCTGCTGG TTTCCCGCCC GTCCCTGCGG301 TCTGAAATCC TGTTTTCACT CGACGATGCC gCCGCCGGCa TCGGGCTGGT351 GCTGTTTGAA CtGAGCTTCC TGCCCATCCG cTTTCTCTTA CTGGTTTTGC401 GTATGGAAGG ACGCGCCcTT GCCTTTTCGT CCGCGCAACT CGTGCcCAAG451 CTCGCCATCC TGCTGCTG.T GCCGCTGACG GTGGGGCTGC TGCACTTTCC501 AGCGAACACC GCCGTCCTGA CCGCCGTTTA CGCGCTGGCA AACCTTGCCG551 CCGCCGCCTT TTTGCTGTTT CAAAACCGAT GCCGTCTGAA GGCCGTCCGG601 CACGCACCGT TTTCGCCCGC CGTCCTGCAC CGGGGG.TGC GCTACGGCAT651 ACCGATCGCA CTGAGCAGCA TCGCCTATTG GGGGCTGGCA TCCGCCGACC701 GTTTGTTCCT GAAAAAATAT GCCGGCCTGG AACAGCTCGG CGTTTATTCG751 ATGGGTATTT CGTTCGGCGG GGCGGCATTA TTGTTCCAAA GCATCTTTTC801 AACGGTCTGG ACACCGTATA TTTTCCGCGC AATCGAAGAA AACGCCCCGC851 CCGCTCGCCT CTCGGCAACG GCAGAATCCG CCGCCGCCCT GCTTGCCTCC901 GCCCTCTGC. 
TGACCGGCAT TTTCTCGCCC CTTGCCTCCC TCCTGCTGCC951 GGAAAACTAC GCCGCCGTCC GGTTTATCGT CGTATCGTGT ATG.TGCCGC1001 CGCTGTTTTG CACGCTGGCG GAAATCAGCG GCATCGGTTT GAACGTCGTT1051 CGGAAAACGC GCCCGATCGC GCTCGCCACC TTGGGCGCGC TGGCGGCAAA1101 CCTGCTGCTG CTGGGGCTTG ACCGTGCCGT ACCGGCGAGG CCGCC.GGCG1151 CGGCGGTTGC CTGTGCCGCC TCATTCTGGC TGTTTTTTGC CTTCAAGACC1201 GAAAGCTCyT GCCGCCTGTG GCAGCCGCTC AAACGCCTGC CGCTTTATCT1251 GCACACATTG TTCTGCCTGA CCTCCTCGGC GGCCTACACC TGCTTCGGCA1301 CGCGGGCAAA CTATCCCCTG TTTGCCGGCG TATGGGCGGC ATATCTGGCA1351 GGCTGCATCC TGCGCCACCG GAAAGATTTG CACAAACTGT TTCATTATTT1401 GAAAAAACAA GGTTTCCCAT TATGA它对应于氨基酸序列<SEQ ID 316;ORF10>:1 MDTKEILXYA AGSIGSAVLA VIILPLLSWY FPADDIGRIV LMQTAAGLTV51 SVLCLGLDQA YVREYYATAD KDTLFKTLFL PPLLSAAAIA ALLLSRPSLP101 SEILFSLDDA AAGIGLVLFE LSFLPIRFLL LVLRMEGRAL AFSSAQLVPK151 LAILLLXPLT VGLLHFPANT AVLTAVYALA NLAAAAFLLF QNRCRLKAVR201 HAPFSPAVLH RGXRYGIPIA LSSIAYWGLA SADRLFLKKY AGLEQLGVYS251 MGISFGGAAL LFQSIFSTVW TPYIFRAIEE NAPPARLSAT AESAAALLAS301 ALCXTGIFSP LASLLLPENY AAVRFIVVSC MXPPLFCTLA EISGIGLNVV351 RKTRPIALAT LGALAANLLL LGLDRAVPAR PXGAAVACAA SFWLFFAFKT401 ESSCRLWQPL KRLPLYLHTL FCLTSSAAYT CFGTPANYPL FAGVWAAYLA451 GCILRHRKDL HKLFHYLKKQ GFPL*进一步的序列分析揭示了完整的DNA序列<SEQ ID 317>是:1 ATGGACACAA AAGAAATCCT CGGCTACGCG GCAGGCTCGA TCGGCAGCGC51 GGTTTTAGCC GTCATCATCC TGCCGCTGCT GTCGTGGTAT TTCCCCGCCG101 ACGACATCGG GCGCATCGTG CTGATGCAGA CGGCGGCGGG GCTGACGGTG151 TCGGTGTTGT GCCTCGGGCT GGATCAGGCA TACGTCCGCG AATACTATGC201 CACCGCCGAC AAAGACACCT TGTTCAAAAC CCTGTTCCTG CCGCCGCTGC251 TGTCTGCCGC CGCGATAGCC GCCCTGCTGC TTTCCCGCCC GTCCCTGCCG 301 TCTGAAATCC TGTTTTCACT CGACGATGCC GCCGCCGGCA TCGGGCTGGT351 GCTGTTTGAA CTGAGCTTCC TGCCCATCCG CTTTCTCTTA CTGGTTTTGC401 GTATGGAAGG ACGCGCCCTT GCCTTTTCGT CCGCGCAACT CGTGCCCAAG451 CTCGCCATCC TGCTGCTGCT GCCGCTGACG GTCGGGCTGC TGCACTTTCC501 AGCGAACACC GCCGTCCTGA CCGCCGTTTA CGCGCTGGCA AACCTTGCCG551 CCGCCGCCTT TTTGCTGTTT CAAAACCGAT GCCGTCTGAA GGCCGTCCGG601 CACGCACCGT TTTCGCCCGC CGTCCTGCAC CGGGGGCTGC GCTACGGCAT651 ACCGATCGCA CTGAGCAGCA 
TCGCCTATTG GGGGCTGGCA TCCGCCGACC701 GTTTGTTCCT GAAAAAATAT GCCGGCCTGG AACAGCTCGG CGTTTATTCG751 ATGGGTATTT CGTTCGGCGG GGCGGCATTA TTGTTCCAAA GCATCTTTTC801 AACGGTCTGG ACACCGTATA TTTTCCGCGC AATCGAAGAA AACGCCCCGC851 CCGCCCGCCT CTCGGCAACG GCAGAATCCG CCGCTTGCCT GCTTGCCTCC901 GCCCTCTGCC TGACCGGCAT TTTCTCGCCC CTTGCCTCCC TCCTGCTGCC951 GGAAAACTAC GCCGCCGTCC GGTTTATCGT CGTATCGTGT ATGCTGCCGC1001 CGCTGTTTTG CACGCTGGCG GAAATCAGCG GCATCGGTTT GAACGTCGTC1051 CGCAAAACGC GCCCGATCGC GCTCGCCACC TTGGGCGCGC TGGCGGCAAA1101 CCTGCTGCTG CTGGGGCTTG CCGTGCCGTC CGGCGGCGCG CGCGGGCGCG1151 CGGTTGCCTG TGCCGCCTCA TTCTGGCTGT TTTTTGCCTT CAAGACCGAA1201 AGCTCCTGCC GCCTGTGGCA GCCGCTCAAA CGCCTGCCGC TTTATCTGCA1251 CACATTGTTC TGCCTGACCT CCTCGGCGGC CTACACCTGC TTCGGCACGC1301 CGGCAAACTA TCCCCTGTTT GCCGGCGTAT GGGCGGCATA TCTGGCAGGC1351 TGCATCCTGC GCCACCGGAA AGATTTGCAC AAACTGTTTC ATTATTTGAA1401 AAAACAAGGT TTCCCATTAT GA它对应于氨基酸序列<SEQ ID 318;ORF10-1>:1 MDTKEILGYA AGSIGSAVLA VTILPLLSWY FPADDIGRIV LMQTAAGLTV51 SVLCLGLDQA YVREYYATAD KDTLFKTLFL PPLLSAAAIA ALLLSRPSLP101 SEILFSLDDA AAGIGLVLFF LSFLPIRFLL LVLRMEGRAL AFSSAQLVPK151 LAILLLLPLT VGLLHFPANT AVLTAVYALA NLAAAAFLLF QNRCRLKAVR201 HAPFSPAVLH RGLRYGIPIA LSSIAYWGLA SADRLFLKKY AGLEQLGVYS251 MGISFGGAAL LFQSIFSTVW TPYIFRAIEE NAPPARLSAT AESAAALLAS301 ALCLTGIFSP LASLLLPENY AAVRFIVVSC MLPPLFCTLA EISGIGLNVV351 RKTRPIALAT LGALAANLLL LGLAVPSGGA RGAAVACAAS FWLFFAFKTE401 SSCRLWQPLK RLPLYLHTLF CLTSSAAYTC FGTPANYPLF AGVWAAYLAG451 CILRHRKDLH KLFHYLKKQG FPL*
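Computer analysis of this amino acid sequence gave the following results:
Prediction
ORF10-1 is predicted to be the precursor of an integral membrane protein, since it comprises several (12-13) potential transmembrane segments and a probable cleavable signal peptide.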
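A prediction of this kind rests on finding hydrophobic stretches long enough to span a membrane. The source does not say which program produced it, so the following is only a minimal sketch of one common approach, a Kyte-Doolittle sliding-window hydropathy scan. The function name `hydrophobic_windows`, the 19-residue window and the 1.6 cutoff are assumptions made for illustration, not the method used in this document.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_windows(seq, window=19, cutoff=1.6):
    """Return (start, mean) for every window whose mean hydropathy >= cutoff.

    `start` is 1-based. Window size and cutoff are illustrative assumptions.
    """
    hits = []
    for i in range(len(seq) - window + 1):
        chunk = seq[i:i + window]
        # Unknown residues such as X score 0.0 (a neutral assumption).
        mean = sum(KD.get(aa, 0.0) for aa in chunk) / window
        if mean >= cutoff:
            hits.append((i + 1, round(mean, 2)))
    return hits
```

Runs of overlapping hits would then be merged into candidate segments; how many segments result depends heavily on the window and cutoff chosen.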
Homology with EpsM of Streptococcus salivarius subsp. thermophilus (accession number U40830)
ORF10 shows homology with the epsM gene of Streptococcus salivarius subsp. thermophilus, which encodes a protein of similar size to ORF10 and is involved in exopolysaccharide synthesis. It also shows other homologies with prokaryotic membrane proteins:
Query: 213 LRYGIPLALSSLAYWGLASADRLFLKKYAGLEQLGVYSMGISFGGAALLLQSIFSTVW 270
L Y +PL SS+ +W L ++ R F+ + G G+ ++ + +IF+ W
Sbjct: 210 LYYALPLIPSSILWWLLNASSRYFVLFFLGAGANGLLAVATKIPSIISIFNTIFTQAW 267
Identities = 15/57 (26%), Positives = 31/57 (54%)
Query: 7 LGYAAGSIGSAVLAVIILPLLSWYFPADDIGRIVLMQTAAGLTVSVLCLGLDQAYVR 63
L + G++GS +L +++PL ++ + G L QT A L + ++ + + A +R
Sbjct: 12 LVFTIGNLGSKLLVFLLVPLYTYAMTPQEYGMADLYQTTANLLLPLIT~NVFDATLR 68
Identities = 16/96 (16%), Positives = 36/96 (37%)
Query: 307 IFSPLASLLLPENYAAVRFTVVSCMLPPLFYTLTEISGIGLNVVRKTRPIXXXXXXXXXX 366
+ P+ ++ +YA+ V ML LF + ++ G ++T+ +
Sbjct: 305 VLKPIVEKVVSSDYASSWQYVPFFMLSMLFSSFSDFFGTNYIAAKQTKGVFMTSIYGTIV 364
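The identity and positive counts reported with these alignments can be read off the middle line, where a letter marks an identical residue and `+` marks a conservative substitution. The sketch below tallies both; the function name `tally_midline` is hypothetical, and the aligned length has to be supplied separately because runs of spaces in the middle line were collapsed when this text was extracted.

```python
def tally_midline(midline, aligned_length):
    """Count identities and positives from a BLAST-style middle line.

    Letters are identities; '+' marks a positive (conservative) substitution.
    Percentages are rounded to the nearest integer (an assumption about how
    the reported figures were rounded).
    """
    identities = sum(ch.isalpha() for ch in midline)
    positives = identities + midline.count("+")
    pct_id = round(100 * identities / aligned_length)
    pct_pos = round(100 * positives / aligned_length)
    return identities, positives, pct_id, pct_pos


# Middle line of the 57-residue alignment of query residues 7-63.
midline = "L + G++GS +L +++PL ++ + G L QT A L + ++ + + A +R"
print(tally_midline(midline, 57))  # (15, 31, 26, 54)
```

The result matches the 15/57 (26%) identities and 31/57 (54%) positives quoted for that alignment.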
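Homology with a predicted ORF from Neisseria meningitidis (strain A)
ORF10 shows 95.4% identity over a 475 aa overlap with an ORF (ORF10a) from strain A of Neisseria meningitidis: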
10 20 30 40 50 60orf10.pep MDTKEILXYAAGSIGSAVLAVIILPLLSWYFPADDIGRIVLMQTAAGLTVSVLCLGLDQA
||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||orf10a MDTKEILGYAAGSIGSAVLAVIILPLLSWYFPADDIGRIVLMQTAAGLTVSVLCLGLDQA
10 20 30 40 50 60
70 80 90 100 110 120orf10.pep YVREYYATADKDTLFKTLFLPPLLSAAAIAALLLSRPSLPSEILFSLDDAAAGIGLVLFE
|||||||:||||||||||||||||||||||||||||||||||||||||||||||||||||orf10a YVREYYAAADKDTLFKTLFLPPLLSAAAIAALLLSRPSLPSEILFSLDDAAAGIGLVLFE
70 80 90 100 110 120
130 140 150 160 170 180orf10.pep LSFLPIRFLLLVLRMEGRALAFSSAQLVPKLAILLLXPLTVGLLHFPANTAVLTAVYALA
|||||||||||||||||||||||||||| ||||||| |||||||||||||||||||||||orf10a LSFLPIRFLLLVLRMEGRALAFSSAQLVSKLAILLLLPLTVGLLHFPANTAVLTAVYALA
130 140 150 160 170 180
190 200 210 220 230 240orf10.pep NLAAAAFLLFQNRCRLKAVRHAPFSPAVLHRGXRYGIPIALSSIAYWGLASADRLFLKKY
||||||||||||||||||||:|||| |||||| |||||||||||||||||||||||||||orf10a NLAAAAFLLFQNRCRLKAVRRAPFSSAVLHRGLRYGIPIALSSIAYWGLASADRLFLKKY
190 200 210 220 230 240
250 260 270 280 290 300orf10.pep AGLEQLGVYSMGISFGGAALLFQSIFSTVWTPYIFRAIEENAPPARLSATAESAAALLAS
||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||orf10a AGLEQLGVYSMGISFGGAALLFQSIFSTVWTPYIFRAIEANAPPARLSATAESAAALLAS
250 260 270 280 290 300
310 320 330 340 350 360orf10.pep ALCXTGIFSPLASLLLPENYAAVRFIVVSCMXPPLFCTLAEISGIGLNVVRKTRPIALAT
||| ||||||||||||||||||||||||||| |||||||:||||||||||||||||||||orf10a ALCLTGIFSPLASLLLPENYAAVRFIVVSCMLPPLFCTLVEISGIGLNVVRKTRPIALAT
310 320 330 340 350 360
370 380 390 400 410 419orf10.pep LGALAANLLLLGLDRAVPAR-PXGAAVACAASFWLFFAFKTESSCRLWQPLKRLPLYLHT
||||||||||||| |||: ||||||||||||||:|||||||||||||||||||:||orf10a LGALAANLLLLGL--AVPSGGARGAAVACAASFWLFFVFKTESSCRLWQPLKRLPLYMHT
370 380 390 400 410
420 430 440 450 460 470orf10.pep LFCLTSSAAYTCFGTPANYPLFAGVWAAYLAGCILRHRKDLHKLFHYLKKQGFPLX
||||:||||||||||||||||||||||:||||||||||||||||||||||||||||orf10a LFCLASSAAYTCFGTPANYPLFAGVWAVYLAGCILRHRKDLHKLFHYLKKQGFPLX
420 430 440 450 460 470
The complete length ORF10a nucleotide sequence <SEQ ID 319> is:
1 ATGGACACAA AAGAAATCCT CGGCTACGCG GCAGGCTCGA TCGGCAGCGC 51 GGTTTTAGCC GTCATCATCC TGCCGCTGCT GTCGTGGTAT TTCCCTGCCG101 ACGACATCGG ACGCATCGTG CTGATGCAGA CGGCGGCGGG GCTGACGGTG151 TCGGTGTTGT GCCTCGGGCT GGATCAGGCA TACGTCCGCG AATACTATGC201 CGCCGCCGAC AAAGACACTT TGTTCAAAAC CCTGTTCCTG CCGCCGCTGC251 TGTCTGCCGC CGCGATAGCC GCCCTGCTGC TTTCCCGCCC ATCCCTGCCG301 TCTGAAATCC TGTTTTCGCT CGACGATGCC GCCGCCGGCA TCGGGCTGGT351 GCTGTTTGAA CTGAGCTTCC TGCCCATCCG CTTTCTCTTA CTGGTTTTGC401 GTATGGAAGG ACGCGCCCTT GCCTTTTCGT CCGCGCAACT CGTGTCCAAG451 CTCGCCATCC TGCTGCTGCT GCCGCTGACG GTCGGGCTGC TGCACTTTCC501 GGCGAACACC GCCGTCCTGA CCGCCGTTTA CGCGCTGGCA AACCTTGCCG551 CCGCCGCCTT TTTGCTGTTT CAAAACCGAT GCCGTCTGAA GGCCGTCCGG601 CGCGCACCGT TTTCATCCGC CGTCCTGCAT CGCGGCCTGC GCTACGGCAT651 ACCGATCGCA CTAAGCAGCA TCGCCTATTG GGGGCTGGCA TCCGCCGACC701 GTTTGTTCCT GAAAAAATAT GCCGGCCTAG AACAGCTCGG CGTTTATTCG751 ATGGGTATTT CGTTCGGCGG AGCGGCATTA TTGTTCCAAA GCATCTTTTC801 AACGGTCTGG ACACCGTATA TTTTCCGCGC AATCGAAGCA AACGCCCCGC851 CCGCCCGCCT CTCGGCAACG GCAGAATCCG CCGCCGCCCT GCTTGCCTCC901 GCCCTCTGCC TGACCGGCAT TTTCTCGCCC CTCGCCTCCC TCCTGCTGCC951 GGAAAACTAC GCCGCCGTCC GGTTTATCGT CGTATCGTGT ATGCTGCCTC1001 CGCTGTTTTG CACGCTGGTA GAAATCAGCG GCATCGGTTT GAACGTCGTC1051 CGAAAAACAC GCCCGATCGC GCTCGCCACC TTGGGCGCGC TGGCGGCAAA1101 CCTGCTGCTG CTGGGGCTTG CCGTACCGTC CGGCGGCGCG CGCGGCGCGG1151 CGGTTGCCTG TGCCGCCTCA TTTTGGCTGT TTTTTGTTTT CAAGACCGAA1201 AGCTCCTGCC GCCTGTGGCA GCCGCTCAAA CGCCTGCCGC TTTATATGCA1251 CACATTGTTC TGCCTGGCCT CCTCGGCGGC CTACACCTGC TTCGGCACTC1301 CGGCAAACTA CCCCCTGTTT GCCGGCGTAT GGGCGGTATA TCTGGCAGGC1351 TGCATCCTGC GCCACCGGAA AGATTTGCAC AAACTGTTTC ATTATTTGAA1401 AAAACAAGGT TTCCCATTAT GA它编码的蛋白质具有氨基酸序列<SEQ ID 320>:1 MDTKEILGYA AGSIGSAVLA VIILPLLSWY FPADDIGRIV LMQTAAGLTV51 SVLCLGLDQA YVREYYAAAD KDTLFKTLFL PPLLSAAAIA ALLLSRPSLP101 SEILFSLDDA AAGIGLVLFE LSFLPIRFLL LVLRMEGRAL AFSSAQLVSK151 LAILLLLPLT VGLLHFPANT AVLTAVYALA NLAAAAFLLF QNRCRLKAVR201 RAPFSSAVLH RGLRYGIPIA LSSIAYWGLA SADRLFLKKY AGLEQLGVYS251 MGISFGGAAL LFQSIFSTVW 
TPYIFRAIEA NAPPARLSAT AESAAALLAS
301 ALCLTGIFSP LASLLLPENY AAVRFIVVSC MLPPLFCTLV EISGIGLNVV
351 RKTRPIALAT LGALAANLLL LGLAVPSGGA RGAAVACAAS FWLFFVFKTE
401 SSCRLWQPLK RLPLYMHTLF CLASSAAYTC FGTPANYPLF AGVWAVYLAG
451 CILRHRKDLH KLFHYLKKQG FPL*
ORF10a and ORF10-1 show 95.4% identity over a 475 aa overlap:
10 20 30 40 50 60
orf10-1.pep MDTKEILXYAAGSIGSAVLAVIILPLLSWYFPADDIGRIVLMQTAAGLTVSVLCLGLDQA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf10a MDTKEILGYAAGSIGSAVLAVIILPLLSWYFPADDIGRIVLMQTAAGLTVSVLCLGLDQA
10 20 30 40 50 60
70 80 90 100 110 120orf10-1.pep YVREYYATADKDTLFKTLFLPPLLSAAAIAALLLSRPSLPSEILFSLDDAAAGIGLVLFE
|||||||:||||||||||||||||||||||||||||||||||||||||||||||||||||orf10a YVREYYAAADKDTLFKTLFLPPLLSAAAIAALLLSRPSLPSEILFSLDDAAAGIGLVLFE
70 80 90 100 110 120
130 140 150 160 170 180orf10-1.pep LSFLPIRFLLLVLRMEGRALAFSSAQLVPKLAILLLXPLTVGLLHFPANTAVLTAVYALA
|||||||||||||||||||||||||||| ||||||| |||||||||||||||||||||||orf10a LSFLPIRFLLLVLRMEGRALAFSSAQLVSKLAILLLLPLTVGLLHFPANTAVLTAVYALA
130 140 150 160 170 180
190 200 210 220 230 240orf10-1.pep NLAAAAFLLFQNRCRLKAVRHAPFSPAVLHRGXRYGIPIALSSIAYWGLASADRLFLKKY
||||||||||||||||||||:|||| |||||| |||| ||||||||||| ||||||||||orf10a NLAAAAFLLFQNRCRLKAVRRAPFSSAVLHRGLRYGIPIALSSIAYWGLASADRLFLKKY
190 200 210 220 230 240
250 260 270 280 290 300orf10-1.pep AGLEQLGVYSMGISFGGAALLFQSIFSTVWTPYIFRAIEENAPPARLSATAESAAALLAS
||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||orf10a AGLEQLGVYSMGISFGGAALLFQSIFSTVWTPYIFRAIEANAPPARLSATAESAAALLAS
250 260 270 280 290 300
310 320 330 340 350 360orf10-1.pep ALCXTGIFSPLASLLLPENYAAVRFIVVSCMXPPLFCTLAEISGIGLNVVRKTRPIALAT
||| ||||||||||||||||||||||||||| |||||||:||||||||||||||||||||orf10a ALCLTGIFSPLASLLLPENYAAVRFIVVSCMLPPLFCTLVEISGIGLNVVRKTRPIALAT
310 320 330 340 350 360
370 380 390 400 410 419orf10-1.pep LGALAANLLLLGLDRAVPAR-PXGAAVACAASFWLFFAFKTESSCRLWQPLKRLPLYLHT
||||||||||||| |||: ||||||||||||||:|||||||||||||||||||:||orf10a LGALAANLLLLGL--AVPSGGARGAAVACAASFWLFFVFKTESSCRLWQPLKRLPLYMHT
370 380 390 400 410
420 430 440 450 460 470orf10-1.pep LFCLTSSAAYTCFGTPANYPLFAGVWAAYLAGCILRHRKDLHKLFHYLKKQGFPLX
||||:||||||||||||||||||||||:||||||||||||||||||||||||||||orf10a LFCLASSAAYTCFGTPANYPLFAGVWAVYLAGCILRHRKDLHKLFHYLKKQGFPLX
420 430 440 450 460 470
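The identity percentages quoted for these pairwise alignments are ratios of identical columns to overlap length. As a minimal sketch, assuming the two aligned rows are available as equal-length strings with `-` for gaps, the figure can be recomputed as follows. The function name `percent_identity` is hypothetical, and counting gap columns in the denominator is an assumption about the alignment program's convention. Ambiguous residues (`X`) are treated as mismatches unless both rows carry them.

```python
def percent_identity(row_a, row_b, gap="-"):
    """Return (matches, columns, percent) for two aligned rows."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    matches = columns = 0
    for a, b in zip(row_a, row_b):
        if a == gap and b == gap:
            continue  # skip columns that are gaps in both rows
        columns += 1
        if a == b and a != gap:
            matches += 1
    if columns == 0:
        raise ValueError("no aligned columns")
    return matches, columns, 100.0 * matches / columns


# Residues 61-120 of ORF10 and ORF10a, as shown in the alignment above.
orf10 = "YVREYYATADKDTLFKTLFLPPLLSAAAIAALLLSRPSLPSEILFSLDDAAAGIGLVLFE"
orf10a = "YVREYYAAADKDTLFKTLFLPPLLSAAAIAALLLSRPSLPSEILFSLDDAAAGIGLVLFE"
m, n, pct = percent_identity(orf10, orf10a)
print(m, n, round(pct, 1))  # 59 60 98.3
```

Applied to the full aligned rows rather than one 60-residue block, the same calculation yields the overall figure for the overlap.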
Homology with a predicted ORF from Neisseria gonorrhoeae
ORF10 shows 94.1% identity over a 475 aa overlap with a predicted ORF (ORF10.ng) from Neisseria gonorrhoeae:
orf10ng.pep MDTKEILGYAAGSIGSAVLAVIILPLLSWYFPADDIGRIVLMQTAAGLTVSVLCLGLDQA 60
||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||orf10nm MDTKEILXYAAGSIGSAVLAVIILPLLSWYFPADDIGRIVLMQTAAGLTVSVLCLGLDQA 60orf10ng.pep YVREYYAAADKDTLFKTLFLPPLLFSAAIAALLLSRPSLPSEILFSLDDAAAGIGLVLFE 120
|||||||:|||||||||||||||| :||||||||||||||||||||||||||||||||||orf10nm YVREYYATADKDTLFKTLFLPPLLSAAAIAALLLSRPSLPSEILFSLDDAAAGIGLVLFE 120orf10ng.pep LSFLPIRFLLLVLRMEGRALAFSSAQLVPKLAILLLLPLTVGLLHFPANTSVLTAVYALA 180
|||||||||||||||||||||||||||||||||||| |||||||||||||:|||||||||orf10nm LSFLPIRFLLLVLRMEGRALAFSSAQLVPKLAILLLXPLTVGLLHFPANTAVLTAVYALA 180orf10ng.pep NLAAAAFLLFQNRCRLKAVRRAPFSPAVLHRGLRYGIPLALSSLAYWGLASADRLFLKKY 240
||||||||||||||||||||:||||||||||| |||||:||||:||||||||||||||||orf10nm NLAAAAFLLFQNRCRLKAVRHAPFSPAVLHRGXRYGIPIALSSIAYWGLASADRLFLKKY 240orf10ng.pep AGLEQLGVYSMGISFGGAALLLQSIFSTVWTPYIFRAIEENATPARLSATAESAAALLAS 300
|||||||||||||||||||||:|||||||||||||||||||| |||||||||||||||||orf10nm AGLEQLGVYSMGISFGGAALLFQSIFSTVWTPYIFRAIEENAPPARLSATAESAAALLAS 300orf10ng.pep ALCLTGIFSPLASLLLPENYAAVRFTVVSCMLPPLFYTLTEISGIGLNVVRKTRPIALAT 360
||| ||||||||||||||||||||| ||||| |||| ||:||||||||||||||||||||orf10nm ALCXTGIFSPLASLLLPENYAAVRFIVVSCMXPPLFCTLAEISGIGLNVVRKTRPIALAT 360
370 380 390 400 410orf10ng.pep LGALAANLLLLGL--AVPSGGTRGAAVACAASFWLFFVFKTESSCRLWQPLKRLPLYMHT
||||||||||||| |||: ||||||||||||||:|||||||||||||||||||:||orf10nm LGALAANLLLLGLDRAVPAR-PXGAAVACAASFWLFFAFKTESSCRLWQPLKRLPLYLHT
370 380 390 400 410
420 430 440 450 460 470orf10ng.pep LFCLASSAAYTCFGTPANYPLFAGVWAAYLAGCILRHRKNLHKLFHYLKKQGFPLX
||||:||||||||||||||||||||||||||||||||||:||||||||||||||||orf10nm LFCLTSSAAYTCFGTPANYPLFAGVWAAYLAGCILRHRKDLHKLFHYLKKQGFPLX
420 430 440 450 460 470全长ORF10ng核苷酸序列<SEQ ID 321>是:1 ATGGACACAA AAGAAATCCT CGGCTACGCG GCAGGCTCGA TCGGCAGCGC51 GGTTTTAGCC GTCATCATCC TGCCGCTGCT GTCGTGGTAT TTCcccgCCG101 ACGACATCGG GCGCATCGTG CTGATGCAGA CGGCGGCGGG ACTGACGGTG151 TCGGTATTGT GCCTCGGGCT GGATCAGGCA TACGTCCGCG AATACTATGC201 CGCCGCCGAC AAAGACACTT TGTTCAAAAC CCTGTTCCTG CCGCCGCTGC251 TGTTTTCCGC CGCGATAGCC GCCCTGCTGC TTTCCCGCCC GTCCCTGCCG301 TCTGAAATCC TGTTTTCGCT CGACGATGCC GCCGCCGGCA TCGGGCTGGT351 GCTGTTTGAA CTGAGCTTCC TGCCCATCCG CTTTCTCTTA CTGGTTTTGC401 GTATGGAAGG GCGCGCCCTT GCCTTTTCGT CCGCGCAACT CGTGCCCAAA451 CTCGCCATTC TGCTGCTGTT GCCGCTGACG GTCGGGCTGC TGCACTTTCC501 GGCGAACACC TCCGTCCTGA CCGCCGTTTA CGCGCTGGCA AACCTTGCCG551 CCGCCGCCTT TTTGCTGTTT CAAAACCGAT GCCGTCTGAA GGCCGTCCGG601 CGCGCGCCGT TTTCGCCCGC CGTCCTGCAC CGGGGGCTGC GCTACGGCAT651 ACCGCTCGCA CTGAGCAGCC TTGCCTATTG GGGGCTGGCA TCCGCCGACC701 GTTTGTTCCT GAAAAAATAT GCGGGCCTGG AACAGCTCGG CGTTTATTCG751 ATGGGTATTT CGTTCGGCGG GGCGGCATTA TTGCTCCAAA GCATCTTTTC801 AACGGTCTGG ACACCGTATA TTTTCCGTGC AATCGAAGAA AACGCCACGC851 CCGCCCGCCT CTCGGCAACG GCAGAATCCG CCGCCGCCCT GCTTGCCTCC901 GCCCTCTGCC TGACCGGAAT TTTCTCGCCC CTCGCCTCCC TCCTGCTGCC951 GGAAAACTAC GCCGCCGTCC GGTTTACCGT CGTATCGTGT ATGCTGccgc1001 cgctGTTTTA CACGCTGACC GAAATCAGCG GCATCGGTTT GAACGTCGTC1051 CGCAAAACGC GTCCGATCGC GCTTGCCACC TTGGGCGCGC TGGCGGCAAA1101 CCTGCTGCTG CTGGGGCTTG CCGTACCGTC CGGCGGCACG CGCGGCGCGG1151 CGGTTGCCTG TGCCGCCTCA TTCTGGTTGT TTTTTGTTTT CAAGACAGAA1201 AGCTCCTGCC GCCTGTGGCA GCCGCTCAAA CGCCTGCCGC TTTATATGCA1251 CACATTGTTC TGCCTgGCCT CCTCGGCGGC CTACACCTGC TTCGGCACAC1301 CGGCAAACTA CCCcctgttt gccggcgtAT GGGCGGCATA TCTGGCAGGC1351 TGCATCCTGC GCCACCGGAA AAATTTGCAC AAACTGTTTC ATTATTTGAA1401 AAAACAAGGT TTCCCATTAT GA它编码的蛋白质具有氨基酸序列<SEQ ID 322>:1 MDTKEILGYA AGSIGSAVLA VIILPLLSWY FPADDIGRIV LMQTAAGLTV51 SVLCLGLDQA YVREYYAAAD KDTLFKTLFL PPLLFSAAIA ALLLSRPSLP101 SEILFSLDDA AAGIGLVLFE LSFLPIRFLL LVLRMEGRAL AFSSAQLVPK151 LAILLLLPLT VGLLHFPANT SVLTAVYALA NLAAAAFLLF QNRCRLKAVR201 RAPFSPAVLH RGLRYGIPLA 
LSSLAYWGLA SADRLFLKKY AGLEQLGVYS
251 MGISFGGAAL LLQSIFSTVW TPYIFRAIEE NATPARLSAT AESAAALLAS
301 ALCLTGIFSP LASLLLPENY AAVRFTVVSC MLPPLFYTLT EISGIGLNVV
351 RKTRPIALAT LGALAANLLL LGLAVPSGGT RGAAVACAAS FWLFFVFKTE
401 SSCRLWQPLK RLPLYMHTLF CLASSAAYTC FGTPANYPLF AGVWAAYLAG
451 CILRHRKNLH KLFHYLKKQG FPL*
ORF10ng and ORF10-1 show 96.4% identity over a 473 aa overlap:
10 20 30 40 50 60orf10-1.pep MDTKEILGYAAGSIGSAVLAVIILPLLSWYFPADDIGRIVLMQTAAGLTVSVLCLGLDQA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf10ng-1 MDTKEILGYAAGSIGSAVLAVIILPLLSWYFPADDIGRIVLMQTAAGLTVSVLCLGLDQA
10 20 30 40 50 60
70 80 90 100 110 120orf10-1.pep YVREYYATADKDTLFKTLFLPPLLSAAAIAALLLSRPSLPSEILFSLDDAAAGIGLVLFE
|||||||:|||||||||||||||| :||||||||||||||||||||||||||||||||||orf10ng-1 YVREYYAAADKDTLFKTLFLPPLLFSAAIAALLLSRPSLPSEILFSLDDAAAGIGLVLFE
70 80 90 100 110 120
130 140 150 160 170 180orf10-1.pep LSFLPIRFLLLVLRMEGRALAFSSAQLVPKLAILLLLPLTVGLLHFPANTAVLTAVYALA
||||||||||||||||||||||||||||||||||||||||||||||||||:|||||||||orf10ng-1 LSFLPIRFLLLVLRMEGRALAFSSAQLVPKLAILLLLPLTVGLLHFPANTSVLTAVYALA
130 140 150 160 170 180
190 200 210 220 230 240orf10-1.pep NLAAAAFLLFQNRCRLKAVRHAPFSPAVLHRGLRYGIPIALSSIAYWGLASADRLFLKKY
||||||||||||||||||||:|||||||||||||||||:||||:||||||||||||||||orf10ng-1 NLAAAAFLLFQNRCRLKAVRRAPFSPAVLHRGLRYGIPLALSSLAYWGLASADRLFLKKY
190 200 210 220 230 240
250 260 270 280 290 300orf10-1.pep AGLEQLGVYSMGISFGGAALLFQSIFSTVWTPYIFRAIEENAPPARLSATAESAAALLAS
|||||||||||||||||||||:|||||||||||||||||||| |||||||||||||||||orf10ng-1 AGLEQLGVYSMGISFGGAALLLQSIFSTVWTPYIFRAIEENATPARLSATAESAAALLAS
250 260 270 280 290 300
310 320 330 340 350 360orf10-1.pep ALCLTGIFSPLASLLLPENYAAVRFIVVSCMLPPLFCTLAEISGIGLNVVRKTRPIALAT
||||||||||||||||||||||||| |||||||||| ||:||||||||||||||||||||orf10ng-1 ALCLTGIFSPLASLLLPENYAAVRFTVVSCMLPPLFYTLTEISGIGLNVVRKTRPIALAT
310 320 330 340 350 360
370 380 390 400 410 420orf10-1.pep LGALAANLLLLGLAVPSGGARGAAVACAASFWLFFAFKTESSCRLWQPLKRLPLYLHTLF
|||||||||||||||||||:|||||||||||||||:|||||||||||||||||||:||||orf10ng-1 LGALAANLLLLGLAVPSGGTRGAAVACAASFWLFFVFKTESSCRLWQPLKRLPLYMHTLF
370 380 390 400 410 420
430 440 450 460 470
orf10-1.pep CLTSSAAYTCFGTPANYPLFAGVWAAYLAGCILRHRKDLHKLFHYLKKQGFPLX
||:||||||||||||||||||||||||||||||||||:||||||||||||||||orf10ng-1 CLASSAAYTCFGTPANYPLFAGVWAAYLAGCILRHRKNLHKLFHYLKKQGFPLX
430 440 450 460 470
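Based on this analysis, including the presence of a putative leader peptide and several transmembrane segments and the presence of a leucine-zipper motif (4 Leu residues spaced 6 amino acids apart, shown in bold), it is predicted that these proteins from Neisseria meningitidis and Neisseria gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.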
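The leucine-zipper motif mentioned here, four Leu residues each separated by six other residues, can be located with a simple pattern search. The sketch below assumes the motif is written as L-x(6)-L-x(6)-L-x(6)-L; the function name `find_leucine_zippers` is hypothetical, and the source does not state which tool was actually used to find the motif.

```python
import re

# Lookahead so that overlapping occurrences are all reported.
ZIPPER = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")


def find_leucine_zippers(seq):
    """Return (1-based start, matched 22-residue span) for each motif hit."""
    return [(m.start() + 1, m.group(1)) for m in ZIPPER.finditer(seq)]


# Synthetic example (not a sequence from this document).
demo = "MK" + "LAAAAAA" * 3 + "L" + "GG"
print(find_leucine_zippers(demo))
```

A match on this pattern alone is only a sequence-level hint; it does not show that the region forms a coiled coil.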
Example 45
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 323>:1 ..ATCCTGAAAC CGCATAACCA GCTTAAGGAA GACATCCAAC CTGATCCGGC51 CGATCAAAAC GCCTTGTCCG AACCGGATGC TGCGACAGAG GCAGAGCAGT101 CGGATGCGGA AAATGCTGCC GACAAGCAGC CCGTTGCCGA TAAAGCCGAC151 GAGGTTGAAG AAAAGGCGGG CGAGCCGGAA CGGGAAGAGC CGGACGGACA201 GGCAGTGCGT AAGAAAGCGC TGACGGAAGA GCGTGAACAA ACCGTCAGGG251 AAAAAGCGCA GAAGAAAGAT GCCGAAACGG TTAAAATACA AGCGGTAAAA301 CCGTCTAAAG AAACAGAGAA AAAAGCTTCA AAAGAAGAGA AAAAGGCGGC351 GAAGGAAAAA GTTGCACCCA AACCAACCCC GGAACAAATC CTCAACAGCG401 GCAgCATCGA AAAmGCGCGC AgTGCCGCCG CCAAAGAAGT GCAGAAAATG451 AA.AACGTCC GACAAGGCGG AAGC.AACGC ATTATCTGCA AATGGGCGCG501 TATGCCGACC GTCAGAGCGC GGAAGGGCAG CGTGCCAAAC TGGCAATCTT551 GGGCATATCT TCCAAGGTGG TCGGTTATCA GGCGGGACAT AAAACGCTTT601 ACCGGGTGCA AAGCGGCAAT ATGTCTGCCG ATGCGGTGA它对应于氨基酸序列<SEQ ID 324;ORF65>:1..ILKPHNQLKE DIQPDPADQN ALSEPDAATE AEQSDAENAA DKQPVADKAD51 EVEEKAGEPE REEPDGQAVR KKALTEEREQ TVREKAQKKD AETVKIQAVK101 PSKETEKKAS KEEKKAAKEK VAPKPTPEQI LNSGSIEXAR SAAAKEVQKM151 XNVRQGGSXR IICKWARMPT VRARKGSVPN WQSWAYLPRW SVIRRDIKRF201 TGCKAAICLP MR*进一步的工作揭示了完整的核苷酸序列<SEQ ID 325>:1 ATGTTTATGA ACAAATTTTC CCAATCCGGA AAAGGTCTGT CCGGTTTTTT51 CTTCGGTTTG ATACTGGCGA CGGTCATTAT TGCCGGTATT TTGTTTTATC101 TGAACCAGAG CGGTCAAAAT GCGTTCAAAA TCCCGGCTTC GTCGAAGCAG151 CCTGCAGAAA CGGAAATCCT GAAACCGAAA AACCAGCCTA AGGAAGACAT201 CCAACCTGAA CCGGCCGATC AAAACGCCTT GTCCGAACCG GATGCTGCGA251 CAGAGGCAGA GCAGTCGGAT GCGGAAAAAG CTGCCGACAA GCAGCCCGTT301 GCCGATAAAG CCGACGAGGT TGAAGAAAAG GCGGGCGAGC CGGAACGGGA351 AGAGCCGGAC GGACAGGCAG TGCGTAAGAA AGCGCTGACG GAAGAGCGTG401 AACAAACCGT CAGGGAAAAA GCGCAGAAGA AAGATGCCGA AACGGTTAAA451 AAACAAGCGG TAAAACCGTC TAAAGAAACA GAGAAAAAAG CTTCAAAAGA501 AGAGAAAAAG GCGGCGAAGG AAAAAGTTGC ACCCAAACCA ACCCCGGAAC551 AAATCCTCAA CAGCGGCAGC ATCGAAAAAG CGCGCAGTGC CGCCGCCAAA601 GAAGTGCAGA AAATGAAAAC GTCCGACAAG GCGGAAGCAA CGCATTATCT651 GCAAATGGGC GCGTATGCCG ACCGTCAGAG CGCGGAAGGG CAGCGTGCCA701 AACTGGCAAT CTTGGGCATA TCTTCCAAGG TGGTCGGTTA TCAGGCGGGA751 CATAAAACGC TTTACCGGGT GCAAAGCGGC AATATGTCTG 
CCGATGCGGT
801 GAAAAAAATG CAGGACGAGT TGAAAAAACA TGAAGTCGCC AGCCTGATCC
851 GTYCTATCGA AAGCAAATAA
This corresponds to the amino acid sequence <SEQ ID 326; ORF65-1>:
1 MFMNKFSQSG KGLSGFFFGL ILATVIIAGI LFYLNQSGQN AFKIPASSKQ
51 PAETEILKPK NQPKEDIQPE PADQNALSEP DAATEAEQSD AEKAADKQPV
101 ADKADEVEEK AGEPEREEPD GQAVRKKALT EEREQTVREK AQKKDAETVK
151 KQAVKPSKET EKKASKEEKK AAKEKVAPKP TPEQILNSGS IEKARSAAAK
201 EVQKMKTSDK AEATHYLQMG AYADRQSAEG QRAKLAILGI SSKVVGYQAG
251 HKTLYRVQSG NMSADAVKKM QDELKKHEVA SLIRSIESK*
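The correspondence between each DNA sequence and its amino acid sequence is a translation with the standard genetic code. As a minimal sketch (the function name `translate` is hypothetical), the following reads codons in frame from the first base, writes `*` for a stop codon, and writes `X` for any codon containing an ambiguous or unreadable base such as `Y`, `N` or `.`.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate DNA in frame 1; ambiguous codons become 'X', stops '*'."""
    dna = dna.upper().replace(" ", "")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


# First 30 bases of SEQ ID 325, in the 10-base blocks used by the listing.
print(translate("ATGTTTATGA ACAAATTTTC CCAATCCGGA"))  # MFMNKFSQSG
```

The output agrees with the first ten residues of SEQ ID 326. Position numbers fused into the listing (for example `51` or `101`) would have to be stripped before a whole sequence is passed in.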
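Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from Neisseria meningitidis (strain A)
ORF65 shows 92.0% identity over a 150 aa overlap with an ORF (ORF65a) from strain A of Neisseria meningitidis: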
10 20 30orf65.pep ILKPHNQLKEDIQPDPADQNALSEPDAATE
||||:|| ||:|||:|||||||||||||||orf65a IIAGILFYLNQSGQNAFKIPVPSKQPAETEILKPKNQPKEDIQPEPADQNALSEPDAAKE
30 40 50 60 70 80
40 50 60 70 80 90orf65.pep AEQSDAENAADKQPVADKADEVEEKAGEPEREEPDGQAVRKKALTEEREQTVREKAQKKD
|||||||:|||||||||||||||||| |||||: |||||||||||||||||| |||||||orf65a AEQSDAEKAADKQPVADKADEVEEKADEPEREKSDGQAVRKKALTEEREQTVGEKAQKKD
90 100 110 120 130 140
100 110 120 130 140 150orf65.pep AETVKIQAVKPSKETEKKASKEEKKAAKEKVAPKPTPEQILNSGSIEXARSAAAKEVQKM
||||| |||||||||||||||||||| |||||||||||||||||||| ||||||||||||
orf65a AETVKKQAVKPSKETEKKASKEEKKAEKEKVAPKPTPEQILNSGSIEKARSAAAKEVQKM
150 160 170 180 190 200
160 170 180 190 200 210orf65.pep XNVRQGGSXRIICKWARMPTVRARKGSVPNWQSWAYLPRWSVIRRDIKRFTGCKAAICLPorf65a KTPDKAEATHYLQMGAYADRRSAEGQRAKLAILGISSKVVGYQAGHKTLYRVQSGNMSAD
210 220 230 240 250 260全长ORF65a核苷酸序列<SEQ ID 327>是:1 ATGTTTATGA ACAAATTTTC CCAATGCGGA AAAGGTCTGT CCGGTTTTTT51 CTTCGGTTTG ATACTGGCGA CGGTCATTAT TGCCGGTATT TTGTTTTATC101 TGAACCAGAG CGGTCAAAAT GCGTTCAAAA TCCCGGTTCC GTCGAAGCAG151 CCTGCAGAAA CGGAAATCCT GAAACCGAAA AACCAGCCTA AGGAAGACAT201 CCAACCTGAA CCGGCCGATC AAAACGCCTT GTCCGAACCG GATGCTGCGA251 AAGAGGCAGA GCAGTCGGAT GCGGAAAAAG CTGCCGACAA GCAGCCCGTT301 GCCGACAAAG CCGACGAGGT TGAGGAAAAG GCGGACGAGC CGGAGCGGGA351 AAAGTCGGAC GGACAGGCAG TGCGCAAGAA AGCACTGACG GAAGAGCGTG401 AACAAACCGT CGGGGAAAAA GCGCAGAAGA AAGATGCCGA AACGGTTAAA451 AAACAAGCGG TAAAACCATC TAAAGAAACA GAGAAAAAAG CTTCAAAAGA501 AGAGAAAAAG GCGGAGAAGG AAAAAGTTGC ACCCAAACCG ACCCCGGAAC551 AAATCCTCAA CAGCGGCAGC ATCGAAAAAG CGCGCAGTGC CGCTGCCAAA601 GAAGTGCAGA AAATGAAAAC GCCCGACAAG GCGGAAGCAA CGCATTATCT651 GCAAATGGGC GCGTATGCCG ACCGCCGGAG CGCGGAAGGG CAGCGTGCCA701 AACTGGCAAT CTTGGGCATA TCTTCCAAGG TGGTCGGTTA TCAGGCGGGA751 CATAAAACGC TTTACCGGGT GCAAAGCGGC AATATGTCTG CCGATGCGGT801 GAAAAAAATG CAGGACGAGT TGAAAAAACA TGAAGTCGCC AGCCTGATCC851 GTTCTATCGA AAGCAAATAA它编码的蛋白质具有氨基酸序列<SEQ ID 328>:1 MFMNKFSQSG KGLSGFFFGL ILATVIIAGI LFYLNQSGQN AFKIPVPSKQ51 PAETEILKPK NQPKEDIQPE PADQNALSEP DAAKEAEQSD AEKAADKQPV101 ADKADEVEEK ADEPEREKSD GQAVRKKALT EEREQTVGEK AQKKDAETVK151 KQAVKPSKET EKKASKEEKK AEKEKVAPKP TPEQILNSGS IEKARSAAAK201 EVQKMKTPDK AEATHYLQMG AYADRRSAEG QRAKLAILGI SSKVVGYQAG251 HKTLYRVQSG NMSADAVKKM QDELKKHEVA SLIRSIESK*ORF65a和ORF65-1显示在289个氨基酸的重叠区内有96.5%的相同性:
10 20 30 40 50 60orf65a.pep MFMNKFSQSGKGLSGFFFGLILATVIIAGILFYLNQSGQNAFKIPVPSKQPAETEILKPK
|||||||||||||||||||||||||||||||||||||||||||||: |||||||||||||orf65-1 MFMNKFSQSGKGLSGFFFGLILATVIIAGILFYLNQSGQNAFKIPASSKQPAETEILKPK
10 20 30 40 50 60
70 80 90 100 110 120orf65a.pep NQPKEDIQPEPADQNALSEPDAAKEAEQSDAEKAADKQPVADKADEVEEKADEPEREKSD
||||||||||||||||||||||| ||||||||||||||||||||||||||| |||||: |orf65-1 NQPKEDIQPEPADQNALSEPDAATEAEQSDAEKAADKQPVADKADEVEEKAGEPEREEPD
70 80 90 100 110 120
130 140 150 160 170 180orf65a.pep GQAVRKKALTEEREQTVGEKAQKKDAETVKKQAVKPSKETEKKASKEEKKAEKEKVAPKP
||||||||||||||||| ||||||||||||||||||||||||||||||||| ||||||||orf65-1 GQAVRKKALTEEREQTVREKAQKKDAETVKKQAVKPSKETEKKASKEEKKAAKEKVAPKP
130 140 150 160 170 180
190 200 210 220 230 240orf65a.pep TPEQILNSGSIEKARSAAAKEVQKMKTPDKAEATHYLQMGAYADRRSAEGQRAKLAILGI
||||||||||||||||||||||||||| |||||||||||||||||:||||||||||||||
orf65-1 TPEQILNSGSIEKARSAAAKEVQKMKTSDKAEATHYLQMGAYADRQSAEGQRAKLAILGI
190 200 210 220 230 240
250 260 270 280 290
orf65a.pep SSKVVGYQAGHKTLYRVQSGNMSADAVKKMQDELKKHEVASLIRSIESKX
||||||||||||||||||||||||||||||||||||||||||||||||||
orf65-1 SSKVVGYQAGHKTLYRVQSGNMSADAVKKMQDELKKHEVASLIRSIESKX
250 260 270 280 290
Homology with a predicted ORF from Neisseria gonorrhoeae
ORF65 shows 89.6% identity over a 212 aa overlap with a predicted ORF (ORF65.ng) from Neisseria gonorrhoeae:
30 40 50 60 70 80
ORF65ng IIAGILLYLNQGGQNAFKIPAPSKQPAETEILKLKNQPKEDIQPEPADQNALSEPDVAKE
||| :||| |||||:|||||||||||:| |
ORF65 ILKPHNQLKEDIQPDPADQNALSEPDAATE
10 20 30
90 100 110 120 130 140
ORF65ng AEQSDAEKAADKQPVADKADEVEEKAGEPEREEPDGQAVRKKALTEEREQTVREKAQKKD
|||||||:||||||||||||||||||||||||||||||||||||||||||||||||||||
ORF65 AEQSDAENAADKQPVADKADEVEEKAGEPEREEPDGQAVRKKALTEEREQTVREKAQKKD
40 50 60 70 80 90
150 160 170 180 190 200
ORF65ng AETVKKKAVKPSKETEKKASKEEKKAAKEKVAPKPTPEQILNSRSIEKARSAAAKEVQKM
|||||: ||||||||||||||||||||||||||||||||||||| ||| ||||||||||||
ORF65 AETVKIQAVKPSKETEKKASKEEKKAAKEKVAPKPTPEQILNSGSIEXARSAAAKEVQKM
100 110 120 130 140 150
210 220 230 240 250 260
ORF65ng KNFGQGGSQRIICKWARMPNPGARKGSVPNWQSWAYLPKWSAIRRDIKRFTACKAAICPP
| |||| ||||||||||: ||||||||||||||||:||:||||||||||:||||| |
ORF65 XNVRQGGSXRIICKWARMPTVRARKGSVPNWQSWAYLPRWSVIRRDIKRFTGCKAAICLP
160 170 180 190 200 210
ORF65ng MR
||
ORF65 MR
预计An ORF65ng核苷酸序列<SEQ ID 329>编码的蛋白质具有氨基酸序列<SEQID 330>:1 MFMNKFSQSG KGLSGFFFGL ILATVIIAGI LLYLNQGGQN AFKIPAPSKQ51 PAETEILKLK NQPKEDIQPE PADQNALSEP DVAKEAEQSD AEKAADKQPV101 ADKADEVEEK AGEPEREEPD GQAVRKKALT EEREQTVREK AQKKDAETVK1S1 KKAVKPSKET EKKASKEEKK AAKEKVAPKP TPEQILNSRS IEKARSAAAK201 EVQ1MKNFGQ GGSQRIICKW ARMPNPGARK GSVPNWQSWA YLPKWSAIRR251 DIKRFTACKA AICPPMR*进一步分析后,发现此完整的淋球菌DNA序列<SEQ ID 331>是:1 ATGTTTATGA ACAAATTTTC CCAATCCGGA AAAGGTCTGT CCGGTTTCTT51 CTTCGGTTTG ATACTGGCAA CGGTCATTAT TGCCGGTATT TTGCTTTATC101 TGAACCAGGG CGGTCAAAAT GCGTTCAAAA TCCCGGCTCC GTCGAAGCAG151 CCTGCAGAAA CGGAAATCCT GAAACTGAAA AACCAGCCTA AGGAAGACAT201 CCAACCTGAA CCGGCCGATC AAAACGCCTT GTCCGAACCG GATGTTGCGA251 AAGAGGCAGA GCAGTCGGAT GCGGAAAAAG CTGCCGACAAG CAGCCCGTT301 GCCGACAAag ccgacgAGGT TGAAGAAAag GcGGgcgAgc cggaACGGga351 aGAGCCGGAC ggACAGGCAG TGCGCAAGAA AGCACTGAcg gAAGAgcGTG401 AACAAACcgt cagggAAAAA GCGCagaaga AAGATGCCGA AACGgTTAAA451 AAacaaCCgg tAaaaccgtc tAAAGAAACa gagaaaaaag cTtcaaaaga501 agagaaaaag gcggcgaaag aaaAAGttgc acccaaaccg accccggaaC551 aaatcctcaa cagccgCagc atcgaaaaag cgcgtagtgc cgctgccaaa601 gaAgtgcaGA AAatgaaaaa ctTtgggcaa ggcgGaagcc aacgcattaT651 CTGcaaatgg gcgcgtatgc cgaccgtccg gagcgcggaA gggcagcgtg701 ccaaACtggc aAtcttgGgc atatctTccg aagtggtcgG CTATCAGGCG751 GGACATAAAA CGCTTTACCG CGTGCAAagc GGCAatatgt ccgccgatgc801 gGTGAAAAAA AGTTGAAAAA GCATGGGGtt gcCAGCCTGA851 TCCGTGcgAT TGAAGGCAAA TAA它编码下列氨基酸序列<SEQ ID 332>:1 MFMNKFSQSG KGLSGFFFGL ILATVIIAGI LLYLNQGGQN AFKIPAPSKQ51 PAETEILKLK NQPKEDIQPE PADQNALSEP DYAKEAEQSD AEKAADKQPV101 ADKADEVEEK AGEPEREEPD GQAYRKKALT EEREQTVREK AQKKDAETVK151 KQAVKPSKET EKKASKEEKK AAKEKVAPKP TPEQILNSRS IEKARSAAAK201 EVQKMKNFGQ GGSQRIICKW ARMPTVRSAE GQRAKLAILG ISSEVVGYQA251 GHKTLYRVQS GNMSADAVKK MQDELKKHGV ASLIRAIEGK *ORF65ng-1和ORF65-1显示在290个氨基酸的重叠区内有89.0%的相同性:
10 20 30 40 50 60orf65-1.pep MFMNKFSQSGKGLSGFFFGLILATVIIAGILFYLNQSGQNAFKIPASSKQPAETEILKPK
|||||||||||||||||||||||||||||||:||||:||||||||| ||||||||||| |orf65ng-1 MFMNKFSQSGKGLSGFFFGLILATVIIAGILLYLNQGGQNAFKIPAPSKQPAETEILKLK
10 20 30 40 50 60
70 80 90 100 110 120orf65-1.pep NQPKEDIQPEPADQNALSEPDAATEAEQSDAEKAADKQPVADKADEVEEKAGEPEREEPD
|||||||||||||||||||||:| ||||||||||||||||||||||||||||||||||||orf65ng-1 NQPKEDIQPEPADQNALSEPDVAKEAEQSDAEKAADKQPVADKADEVEEKAGEPEREEPD
70 80 90 100 110 120
130 140 150 160 170 180orf65-1.pep GQAVRKKALTEEREQTVREKAQKKDAETVKKQAVKPSKETEKKASKEEKKAAKEKVAPKP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf65ng-1 GQAVRKKALTEEREQTVREKAQKKDAETVKKQAVKPSKETEKKASKEEKKAAKEKVAPKP
130 140 150 160 170 180
190 200 210 220 230 239orf65-1.pep TPEQILNSGSIEKARSAAAKEVQKMKTSDKAEATHYL-QMGAYADRQSAEGQRAKLAILG
||||||||||||||||||||||||||: :: : : : : : : :|||||||||||||||orf65ng-1 TPEQILNSRSIEKARSAAAKEVQKMKNFGQGGSQRIICKWARMPTVRSAEGQRAKLAILG
190 200 210 220 230 240
240 250 260 270 280 290orf65-1.pep ISSKVVGYQAGHKTLYRVQSGNMSADAVKKMQDELKKHEVASLIRSIESKX
|||:|||||||||||||||||||||||||||||||||| ||||||:||:||orf65ng-1 ISSEVVGYQAGHKTLYRVQSGNMSADAVKKMQDELKKHGVASLIRAIEGKX
250 260 270 280 290
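Based on this result, including the presence of a putative transmembrane domain in the gonococcal protein, it is predicted that this protein from Neisseria meningitidis and Neisseria gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 46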
在脑膜炎奈瑟球菌中鉴定出下列认为是完整的DNA序列<SEQ ID 333>:1 ATGAACCACG ACATCACTTT CCTCACCCTG TTCCTACTCG GTkTCTTCGG51 CGGAAcGCAC TGCATCGGTA TGTGCGGCGG ATTAAGCAGC GcGTTTGs.s101 TCCAACTCCC CCCGCATATC AACCGCTTTT GGCTGATCCT GCTGCTTAAC151 ACAGGACGGG TAAGCAGCTA TACGGCAAtC GGCCTGATAC TCGGATTAAT201 CGGACAGGTC GGCGTTTCAC TCGAcCAaAC CCGCGTCCTG CAGAATATTT251 TATACACGGC CGCCAACCTC CTGCTGCTCT TTTTAGGCTT ATACTTGAGC301 GGTATTTCTT CCTTGGCGGC AAAAATCGAG AAaATCGGCA AACCGATATG351 GCGGAACCTG AACCCGATAC TCAACCGGCT GTTACCCATA AAATCCATAC401 CCGCCTGCCT tGCGgTCGGA ATATTATGGG GCTGGCTGCC GTGCGGACTG451 GTTTACAGCG CGTCGCTTTA CGCGCTGGGA AgCGGTAGTG CGGCAACGGG501 CGGGTTATAT ATGCTTGCCT TTGCACTGGG TACGCTGCCC AATCTTtTAG551 CAATCGGCAT TTTtTCCCTG CAACTGAAwA AAATCATGCA AAACCGATAT601 ATCCGCCTGT GTACGGGATT ATCCGTATCA TTATGGGCAT TATGGAAACT651 TGCCGTCCTG TGGCTGTAA它对应于氨基酸序列<SEQ ID 334;ORF103>:1 MNHDITFLTL FLLGXFGGTH CIGMCGGLSS AFXXQLPPHI NRFWLILLLN51 TGRVSSYTAI GLILGLIGQV GVSLDQTRVL QNILYTAANL LLLFLGLYLS101 GISSLAAKIE KIGKPIWRNL NPILNRLLPI KSIPACLAVG ILWGWLPCGL151 VYSASLYALG SGSAATCGLY MLAFALGTLP NLLAIGIFSL QLXKIMQNRY201 IRLCTGLSVS LWALWKLAVL WL*进一步的工作详细描述了该DNA序列<SEQ ID 335>:1 ATGAACCACG ACATCACTTT CCTCACCCTG TTCCTACTCG GTTTCTTCGG51 CGGAACGCAC TGCATCGGTA TGTGCGGCGG ATTAAGCAGC GCGTTTGCGC101 TCCAACTCCC CCCGCATATC AACCGCTTTT GGCTGATCCT GCTGCTTAAC151 ACAGGACGGG TAAGCAGCTA TACGGCAATC GGCCTGATAC TCGGATTAAT201 CGGACAGGTC GGCGTTTCAC TCGACCAAAC CCGCGTCCTG CAGAATATTT251 TATACACGGC CGCCAACCTC CTGCTGCTCT TTTTAGGCTT ATACTTGAGC301 GGTATTTCTT CCTTGGCGGC AAAAATCGAG AAAATCGGCA AACCGATATG351 GCGGAACCTG AACCCGATAC TCAACCGGCT GTTACCCATA AAATCCATAC401 CCGCCTGCCT TGCGGTCGGA ATATTATGGG GCTGGCTGCC GTGCGGACTG451 GTTTACAGCG CGTCGCTTTA CGCGCTGGGA AGCGGTAGTG CGGCAACGGG501 CGGGTTATAT ATGCTTGCCT TTGCACTGGG TACGCTGCCC AATCTTTTAG551 CAATCGGCAT TTTTTCCCTG CAACTGAAAA AAATCATGCA AAACCGATAT601 ATCCGCCTGT GTACGGGATT ATCCGTATCA TTATGGGCAT TATGGAAACT651 TGCCGTCCTG TGGCTGTAA它对应于氨基酸序列<SEQ ID 336;ORF103-1>:1 MNHDITFLTL FLLGFFGGTH CIGMCGGLSS AFALQLPPHI NRFWLILLLN51 
TGRVSSYTAI GLILGLIGQV GVSLDQTRVL QNILYTAANL LLLFLGLYLS101 GISSLAAKIE KIGKPIWRNL NPILNRLLPI KSIPACLAVG ILWGWLPCGL151 VYSASLYALG SGSAATGGLY MLAFALGTLP NLLAIGIFSL QLKKIMQNRY201 IRLCTGLSVS LWALWKLAVL WL*
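Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from Neisseria meningitidis (strain A)
ORF103 shows 93.8% identity over a 222 aa overlap with an ORF (ORF103.a) from strain A of Neisseria meningitidis: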
10 20 30 40 50 60orf103.pep MNHDITFLTLFLLGXFGGTHCIGMCGGLSSAFXXQLPPHINRFWLILLLNTGRVSSYTAI
|| ||||||||||| ||||||||||||||||| |||||||| |||||||||||||||||orf103a MNXDITFLTLFLLGFFGGTHCIGMCGGLSSAFALQLPPHINRXWLILLLNTGRVSSYTAI
10 20 30 40 50 60
70 80 90 100 110 120orf103.pep GLILGLIGQVGVSLDQTRVLQNILYTAANLLLLFLGLYLSGISSLAAKIEKIGKPIWRNL
||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||orf103a GLILGLIGQVGVSLDQTRVXQNILYTAANLLLLFLGLYLSGISSLAAKIEKIGKPIWRNL
70 80 90 100 110 120
130 140 150 160 170 180orf103.pep NPILNRLLPIKSIPACLAVGILWGWLPCGLVYSASLYALGSGSAATGGLYMLAFALGTLP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf103a NPILNRLLPIKSIPACLAVGILWGWLPCGLVYSASLYALGSGSAATGGLYMLAFALGTLP
130 140 150 160 170 180
190 200 210 220orf103.pep NLLAIGIFSLQLXKIMQNRYIRLCTGLSVSLWALWKLAVLWLX
|| ||||||||||||||||||||||||||||||||||||||||orf103a NLXAIGIFSLQLXKIMQNRYIRLCTGLSVSLWALWKLAVLWLX
190 200 210 220全长ORF103a核苷酸序列<SEQ ID 337>是:1 ATGAACCANG ACATCACTTT CCTCACCCTG TTCCTACTCG GTTTCTTCGG51 CGGAACGCAC TGCATCGGTA TGTGCGGCGG ATTAAGCAGC GCGTTTGCGC101 TCCAACTCCC CCCGCATATC AACCGCTTNT GGCTGATCCT GCTGCTTAAC151 ACAGGACGGG TAAGCAGCTA TACGGCAATC GGCCTGATAC TCGGATTAAT201 CGGACAGGTC GGCGTTTCAC TCGACCAAAC CCGCGTCNTG CAGAATATTT251 TATACACGGC CGCCAACCTC CTGCTGCTCT TTTTAGGCTT ATACTTGAGC301 GGTATTTCTT CCTTGGCGGC AAAAATCGAG AAAATCGGCA AACCGATATG351 GCGGAACCTG AACCCGATAC TCAACCGGCT GTTACCCATA AAATCCATAC401 CCGCCTGCCT TGCGGTCGGA ATATTATGGG GCTGGCTGCC GTGCGGACTA451 GTTTACAGCG CGTCGCTTTA CGCGCTGGGA AGCGGTAGTG CGGCAACGGG501 CGGGTTATAT ATGCTTGCCT TTGCACTGGG TACGCTGCCC AATCTTTNGG551 CAATCGGCAT TTTTTCCCTG CAACTGNAAA AAATCATGCA AAACCGATAT601 ATCCGCCTGT GTACGGGATT ATCCGTATCA TTATGGGCAT TATGGAAACT651 TGCCGTCCTG TGGCTGTAA它编码的蛋白质具有氨基酸序列<SEQ ID 338>:1 MNXDITFLTL FLLGFFGGTH CIGMCGGLSS AFALQLPPHI NRXWLILLLN51 TGRVSSYTAI GLILGLIGQV GVSLDQTRVX QNILYTAANL LLLFLGLYLS101 GISSLAAKIE KIGKPIWRNL NPILNRLLPI KSIPACLAVG ILWGWLPCGL151 VYSASLYALG SGSAATGGLY MLAFALGTLP NLXAIGIFSL QLXKIMQNRY201 IRLCTGLSVS LWALWKLAVL WL*ORF103a和ORF103-1显示在222个氨基酸的重叠区内有97.7%的相同性:
10 20 30 40 50 60orf103a.pep MNXDITFLTLFLLGFFGGTHCIGMCGGLSSAFALQLPPHINRXWLILLLNTGRVSSYTAI
|| ||||||||||||||||||||||||||||||||||||||| |||||||||||||||||orf103-1 MNHDITFLTLFLLGFFGGTHCIGMCGGLSSAFALQLPPHINRFWLILLLNTGRVSSYTAI
10 20 30 40 50 60
70 80 90 100 110 120
orf103a.pep GLILGLIGQVGVSLDQTRVXQNILYTAANLLLLFLGLYLSGISSLAAKIEKIGKPIWRNL
||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||orf103-1 GLILGLIGQVGVSLDQTRVLQNILYTAANLLLLFLGLYLSGISSLAAKIEKIGKPIWRNL
70 80 90 100 110 120
130 140 150 160 170 180orf103a.pep NPILNRLLPIKSIPACLAVGILWGWLPCGLVYSASLYALGSGSAATGGLYMLAFALGTLP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf103-1 NPILNRLLPIKSIPACLAVGILWGWLPCGLVYSASLYALGSGSAATGGLYMLAFALGTLP
130 140 150 160 170 180
190 200 210 220orf103a.pep NLXAIGIFSLQLXKIMQNRYIRLCTGLSVSLWALWKLAVLWLX
|| ||||||||| ||||||||||||||||||||||||||||||orf103-1 NLLAIGIFSLQLKKIMQNRYIRLCTGLSVSLWALWKLAVLWLX
190 200 210 220
Homology with a predicted ORF from Neisseria gonorrhoeae
ORF103显示与淋病奈瑟球菌的预计ORF(ORF103.ng)在重叠的222个氨基酸内有95.5%的相同性:orf103.pep MNHDITFLTLFLLGXFGGTHCIGMCGGLSSAFXXQLPPHINRFWLILLLNTGRVSSYTAI 60
|||||||||||||| ||||||||||||||||| ||||||||||||||||||||:||||||orf103ng MNHDITFLTLFLLGFFGGTHCIGMCGGLSSAFALQLPPHINRFWLILLLMTGRISSYTAI 60orf103.pep GLILGLIGQVGVSLDQTRVLQNILYTAANLLLLFLGLYLSGISSLAAKIEKIGKPIWRNL 120
||:||||||:|:|||||||||||||||:||||||||||||||||||||||||||||||||orf103ng GLMLGLIGQLGISLDQTRVLQNILYTASNLLLLFLGLYLSGISSLAAKIEKIGKPIWRNL 120orf103.pep NPILNRLLPIKSIPACLAVGILWGWLPCGLVYSASLYALGSGSAATGGLYMLAFALGTLP 180
||||||||||||||||||||||||||||||||||||||||||||:|||||||||||||||orf103ng NPILNRLLPIKSIPACLAVGILWGWLPCGLVYSASLYALGSGSATTGGLYMLAFALGTLP 180orf103.pep NLLAIGIFSLQLXKIMQNRYIRLCTGLSVSLWALWKLAVLWL 222
|||||||||||| |||||||||||||||||||||||||||||orf103ng NLLAIGIFSLQLKKIMQNRYIRLCTGLSVSLWALWKLAVLWL 222全长ORF103ng核苷酸序列<SEQ ID 339>是:1 ATGAACCACG ACATCACTTT CCTCACCCTG TTCCTGCTCG GTTTCTTCGG51 CGGAACTCAC TGCATCGGTA TGTGCGGCGG ATTAAGCAGC GCGTTTGCGC101 TCCAACTCCC CCCGCATATC AACCGCTTTT GGCTGATTCT GCTGCTTAAC151 ACAGGACGGA TAAGCAGCTA TACGGCAATC GGCCTGATGC TCGGATTAAT201 CGGACAACTC GGCATTTCAC TCGACCAAAc ccgcgTCCTG CAAAATATTT251 tatacacagc ctccaaCCTC CTGCTGCTCT TTTTAGGCTT ATACTTGAGC301 GGTATTTCTT CCTTGGCGGC AAAAATCGAG AAAATCGGCA AACCGATATG351 GCGCAACCTG AACCCGATAC TCAACCGGCT GCTGCCCATA AAATCCATAC401 CCGCCTGCCT TGCTGTCGGA ATATTATGGG GCTGGCTGCC GTGCGGACTG451 GTTTACAGCG CATCACTTTA CGCGCTGGGA AGCGGTAGTG CGACAACCGG501 CGGACTGTAT ATGCTTGCCT TTGCACTGGG TACGCTGCCC AATCTTTTGG551 CAATCGGCAT TTTTTCCCTG CAACTGAAAA AAATCATGCA AAACCGATAT601 ATCCGCCTGT GTACAGGATT ATCCGTATCA TTATGGGCAT TATGGAAGCT651 TGCCGTCCTG TGGCTGTAA它编码的蛋白质具有氨基酸序列<SEQ ID 340>:1 MNHDITFLTL FLLGFFGGTH CIGMCGGLSS AFALQLPPHI NRFWLILLLN51 TGRISSYTAI GLMLGLIGQL GISLDQTRVL QNILYTASNL LLLFLGLYLS101 GISSLAAKIE KIGKPIWRNL NPLNRLLPI KSIPACLAVG ILWGWLPCGL151 VYSASLYALG SGSATTGGLY MLAFALGTLP NLLAIGIFSL QLKKIMQNRY201 IRLCTGLSVS LWALWKLAVL WL*
In addition, ORF103ng and ORF103-1 show 97.3% identity over a 222 aa overlap:
10 20 30 40 50 60orf103-1.pep MNHDITFLTLFLLGFFGGTHCIGMCGGLSSAFALQLPPHINRFWLILLLNTGRVSSYTAI
|||||||||||||||||||||||||||||||||||||||||||||||||||||:||||||orf103ng MNHDITFLTLFLLGFFGGTHCIGMCGGLSSAFALQLPPHINRFWLILLLNTGRISSYTAI
10 20 30 40 50 60
70 80 90 100 110 120orf103-1.pep GLILGLIGQVGVSLDQTRVLQNILYTAANLLLLFLGLYLSGISSLAAKIEKIGKPIWRNL
||:||||||:|:|||||||||||||||:||||||||||||||||||||||||||||||||orf103ng GLMLGLIGQLGISLDQTRVLQNILYTASNLLLLFLGLYLSGISSLAAKIEKIGKPIWRNL
70 80 90 100 110 120
130 140 150 160 170 180orf103-1.pep NPILNRLLPIKSIPACLAVGILWGWLPCGLVYSASLYALGSGSAATGGLYMLAFALGTLP
||||||||||||||||||||||||||||||||||||||||||||:|||||||||||||||orf103ng NPILNRLLPIKSIPACLAVGILWGWLPCGLVYSASLYALGSGSATTGGLYMLAFALGTLP
130 140 150 160 170 180
190 200 210 220orf103-1.pep NLLAIGIFSLQLKKIMQNRYIRLCTGLSVSLWALWKLAVLWLX
|||||||||||||||||||||||||||||||||||||||||||orf103ng NLLAIGIFSLQLKKIMQNRYIRLCTGLSVSLWALWKLAVLWLX
190 200 210 220
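Based on this analysis, including the presence in the gonococcal protein of a putative leader sequence (double-underlined) and several putative transmembrane domains (single-underlined), it is predicted that this protein from Neisseria meningitidis and Neisseria gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 47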
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 341>:1 ATGGAAAACC AAAGGCCGCT CCTAGGCTTT CGCTTGGCAC TTTTGGCGGC51 GATGACGTGG GGAACGCTGC CGAT.TCCGT GCGGCAGGTA TTGAAGTTTG101 TCGATGCGCC GACGCTGGTG TGGGTGCGTT TTACCGTGGC GGCGGCGGTA151 TTGTTTGTTT TGCTGGCACT GGGCGGGCGG CTGCcGAAGC GGCGaGGATT201 TTTCTTGGTG CTCATTCAGG CTGCTGCTGC TCGGCGTGGC GGGCATTTCG251 GCAAACTTTG TGCTGATTGC CCAAGGGCTG CATTATATTT CGCCGACCAC301 GACGCAGGTT TTGTGGCAGA TTTCGCCGTT TACGATGATT GTwGTCGGTG351 TGTTGGTGTT TAAAGACCGG ATGACTGCCG CTCAGAAAAT CGGCTTGGTT401 TTGCTGCTTG CCGGTTTGCT TATGTATTTT AACGATAAAT TCGGCGAGTT451 GTCGGGTTTG GGCGCGTATG C.AAGGGCGT GTTGCTGTGT GCGGCAGGCA501 GTATGGCATG GGTGTGTAAT GCCGTGGCGC AAAAGCTGCT GTCGGCGCAA551 TTCGGGCCGC AACAGATTCT GCTGTTGATT TATGCGGCAA GTGCCGCCGT601 GTTCCTGCCG TTTGCCGAAC CGGCACACAT CGGAAGTATG GACGGTACGT651 TGGCGTGGGT ATGTATTGCG TATTGCTGCT TGAATACGTT AATCGGTTAC701 GGCTCGTTCG GCGAGGCGTT GAAACATTGG GAGGCTTCCA AAGTCAGCGC751 GGTAACAACC TTGCTCCCCG TGTTTACCGT AATAAATACT TTGCTCGGGC801 ATTATGTGAT GCCTGAAACT TTTGCCGCGC CGGA..它对应于氨基酸序列<SEQ ID 342;ORF104>:1 MENQRPLLGF RLALLAAMTW GTLPXSVRQV LKFVDAPTLV WVRFTVAAAV51 LFVLLALGGR LPKRRDFSWC SFRLLLLGVA GISANFVLIA QGLHYISPTT101 TQVLWQISPF TMIVVGVLVF KDRMTAAQKI GLVLLLAGLL MYFNDKFGEL151 SGLGAYXKGV LLCAAGSMAW VCNAVAQKLL SAQFGPQQIL LLIYAASAAV201 FLPFAEPAHI GSMDGTLAWV CIAYCCLNTL IGYGSFGEAL KHWEASKVSA251 VTTLLPVFTV INTLLGHYVM PETFAAP...进一步的工作进一步揭示了部分DNA序列<SEQ ID 343>:1 ATGGAAAACC AAAGGCCGCT CCTAGGCTTC GCGTTGGCAC TTTTGGCGGC51 GATGACGTGG GGAACGCTGC CGATTGCCGT GCGGCAGGTA TTGAAGTTTG101 TCGATGCGCC GACGCTGGTG TGGGTGCGTT TTACCGTGGC GGCGGCGGTA151 TTGTTTGTTT TGCTGGCACT GGGCGGGCGG CTGCCGAAGC GGCGGGATTT201 TTCTTGGTGC TCATTCAGGC TGCTGCTGCT CGGCGTGGCG GGCATTTCGG251 CAAACTTTGT GCTGATTGCC CAAGGGCTGC ATTATATTTC GCCGACCACG301 ACGCAGGTTT TGTGGCAGAT TTCGCCGTTT ACGATGATTG TTGTCGGTGT351 GTTGGTGTTT AAAGACCGGA TGACTGCCGC TCAGAAAATC GGCTTGGTTT401 TGCTGCTTGC CGGTTTGCTT ATGTTTTTTA ACGATAAATT CGGCGAGTTG451 TCGGGTTTGG GCGCGTATGC GAAGGGCGTG TTGCTGTGTG CGGCAGGCAG501 TATGGCATGG GTGTGTTATG CCGTGGCGCA 
AAAGCTGCTG TCGGCGCAAT551 TCGGGCCGCA ACAGATTCTG CTGTTGATTT ATGCGGCAAG TGCCGCCGTG601 TTCCTGCCGT TTGCCGAACC GGCACACATC GGAAGTTTGG ACGGTACGTT651 GGCGTGGGTT TGTTTTGCGT ATTGCTGCTT GAATACGTTA ATCGGTTACG701 GCTCGTTCGG CGAGGCGTTG AAACATTGGG AGGCTTCCAA AGTCAGCGCG751 GTAACAACCT TGCTCCCCGT GTTTACGGTA ATAwTwwCTT TGCTCGGGCA801 TTATGTGATG CCTGAAACTT TTGCCGCGCC GGA...它对应于氨基酸序列<SEQ ID 344;ORF104-1>:1 MENQRPLLGF ALALLAAMTW GTLPIAVRQV LKFVDAPTLV WVRFTVAAAV51 LFVLLALGGR LPKRRDFSWC SFRLLLLGVA GISANFVLIA QGLHYISPTT101 TQVLWQISPF TMIVVGVLVF KDRMTAAQKI GLVLLLAGLL MFFNDKFGEL151 SGLGAYAKGV LLCAAGSMAW VCYAVAQKLL SAQFGPQQIL LLIYAASAAV201 FLPFAEPAHI GSLDGTLAWV CFAYCCLNTL IGYGSFGEAL KHWEASKVSA251 VTTLLPVFTV IXXLLGHYVM PETFAAP...该氨基酸序列的计算机分析给出了下列结果:与假设的流感嗜血菌HI0878蛋白(登录号U32769)的同源性ORF104和HI0878显示在277个氨基酸的重叠区内有40%的氨基酸相同性:orf104 4 QRPLLGFRLALLAAMTWGTLPXSVRQVLKFVDAPTLVWXXXXXXXXXXXXXXXXXXXXP- 62
Q+PLLGF AL+ AM WG+LP +++QVL ++A T+VW P
HI0878 3 QQPLLGFTFALITAMAWGSLPIALKQVLSVMNAQTIVWYRFIIAAVSLLALLAYKKQLPE 62
orf104 63 --KRRDFSWCSFRLLLLGVAGISANFVLIAQGLHYISPTTTQVLWQISPFTMIVVGVLVF 120
K R ++W ++L+GV G+++NF+L + L+YI P+ Q+ +S F M++ GVL+F
HI0878 63 LMKVRQYAW----IMLIGVIGLTSNFLLFSSSLNYIEPSVAQIFIHLSSFGMLICGVLIF 118
orf104 121 KDRMTAAQKIXXXXXXXXXXMYFNDKFGELSGLGAYXKGVLLCAAGSMAWVCNAVAQKLL 180
K+++ QKI ++FND+F +GL Y GV+L G++ WV +AQKL+
HI0878 119 KEKLGLHQKIGLFLLLIGLGLFFNDRFDAFAGLNQYSTGVILGVGGALIWVAYGMAQKLM 178
orf104 181 SAQFGPQQILLLIYAASAAVFLPFAEPAHIGSMDGTLAWVCIAYCCLNTLIGYGSFGEAL 240
+F QQILL++Y A F+P A+ + + + LA +C YCCLNTLIGYGS+ EAL
HI0878 179 LRKFNSQQILLMMYLGCAIAFMPMADFSQVQELT-PLALICFIYCCLNTLIGYGSYAEAL 237
orf104 241 KHWEASKVSAVTTLLPVFTVINTLLGHYVMPETFAAP 277
W+ SKVS V TL+P+FT++ + + HY P FAAP
HI0878 238 NRWDVSKVSVVITLVPLFTILFSHIAHYFSPADFAAP 274
Homology with a predicted ORF from N. meningitidis (strain A)
ORF104 shows 95.3% identity over a 277-amino-acid overlap with an ORF (ORF104a) from strain A of N. meningitidis:
10 20 30 40 50 60orf104.pep MENQRPLLGFRLALLAAMTWGTLPXSVRQVLKFVDAPTLVWVRFTVAAAVLFVLLALGGR
|||||||||| ||||||||||||| :||||||||||||||||||||||||||||||||||orf104a MENQRPLLGFALALLAAMTWGTLPIAVRQVLKFVDAPTLVWVRFTVAAAVLFVLLALGGR
10 20 30 40 50 60
70 80 90 100 110 120orf104.pep LPKRRDFSWCSFRLLLLGVAGISANFVLIAQGLHYISPTTTQVLWQISPFTMIVVGVLVF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf104a LPKWRDFSWCSFRLLLLGVAGISANFVLIAQGLHYISPTTTQVLWQISPFTMIVVGVLVF
70 80 90 100 110 120
130 140 150 160 170 180orf104.pep KDRMTAAQKIGLVLLLAGLLMYFNDKFGELSGLGAYXKGVLLCAAGSMAWVCNAVAQKLL
|||||||||||||||||||||:|||||||||||||| ||||||||||||||| |||||||orf104a KDRMTAAQKIGLVLLLAGLLMFFNDKFGELSGLGAYAKGVLLCAAGSMAWVCYAVAQKLL
130 140 150 160 170 180
190 200 210 220 230 240orf104.pep SAQFGPQQILLLIYAASAAVFLPFAEPAHIGSMDGTLAWVCIAYCCLNTLIGYGSFGEAL
|||||||||||||||||||||||||| |||||:||||||||:||||||||||||||||||orf104a SAQFGPQQILLLIYAASAAVFLPFAELAHIGSLDGTLAWVCFAYCCLNTLIGYGSFGEAL
190 200 210 220 230 240
250 260 270orf104.pep KHWEASKVSAVTTLLPVFTVINTLLGHYVMPETFAAP
||||||||||||||||||||| :||||||||:|||||orf104a KHWEASKVSAVTTLLPVFTVIFSLLGHYVMPDTFAAPDMNGLGYAGALVVVGGAVTAAVG
250 260 270 280 290 300全长ORF104a核苷酸序列<SEQ ID 345>是:1 ATGGAAAACC AAAGGCCGCT CCTAGGCTTC GCGTTGGCAC TTTTGGCGGC51 GATGACGTGG GGAACGCTGC CGATTGCCGT GCGGCAGGTA TTGAAGTTTG101 TCGATGCGCC GACGCTGGTG TGGGTGCGTT TTACCGTGGC GGCGGCGGTA151 TTGTTTGTTT TGCTGGCATT GGGCGGGCGG CTGCCGAAGT GGCGGGATTT201 TTCTTGGTGC TCATTCAGGC TGCTGCTGCT CGGCGTGGCG GGCATTTCGG251 CAAACTTTGT GCTGATTGCC CAAGGGCTGC ATTATATTTC GCCGACCACG301 ACGCAGGTTT TGTGGCAGAT TTCGCCGTTT ACGATGATTG TTGTCGGTGT351 GTTGGTGTTT AAAGACCGGA TGACTGCCGC TCAGAAAATC GGCTTGGTTT401 TGCTGCTTGC CGGTTTGCTT ATGTFTTTTA ACGATAAATT CGGCGAGTTG451 TCGGGTTTGG GCGCGTATGC GAAGGGCGTG TTGCTGTGTG CGGCAGGCAG501 TATGGCATGG GTGTGTTATG CCGTGGCGCA AAAGCTGCTG TCGGCGCAAT551 TCGGGCCGCA ACAGATTCTG CTGTTGATTT ATGCGGCAAG TGCCGCCGTG601 TTCCTGCCGT TTGCCGAACT GGCACACATC GGAAGTTTGG ACGGTACGTT651 GGCGTGGGTT TGTTTTGCGT ATTGCTGCTT GAATACGTTA ATCGGTTACG701 GCTCGTTCGG CGAGGCGTTG AAACATTGGG AGGCTTCCAA AGTCAGCGCG751 GTAACAACCT TGCTCCCCGT GTTTACCGTA ATATTTTCTT TGCTCGGGCA801 TTATGTGATG CCTGATACTT TTGCCGCGCC GGATATGAAC GGTTTGGGTT851 ATGCCGGCGC ACTGGTCGTG GTCGGGGGTG CGGTTACGGC GGCGGTGGGG901 GACAGGCTGT TCAAACGCCG CTAG它编码的蛋白质具有氨基酸序列<SEQ ID 346>:1 MENQRPLLGF ALALLAAMTW GTLPIAVRQV LKFVDAPTLV WVRFTVAAAV51 LFVLLALGGR LPKWRDFSWC SFRLLLLGVA GISANFVLIA QGLHYISPTT101 TQVLWQISPF TMIVVGYLVF KDRMTAAQKI GLVLLLAGLL MFFNDKFGEL151 SGLGAYAKGV LLCAAGSMAW VCYAVAQKLL SAQFGPQQIL LLIYAASAAV201 FLPFAELAHI GSLDGTLAWV CFAYCCLNTL IGYGSFGEAL KHWEASKVSA251 VTTLLPVFTV IFSLLGHYVM PDTFAAPDMN GLGYAGALVV VGGAVTAAVG301 DRLFKRR*ORF104a和ORF104-1显示在277个氨基酸的重叠区内有98.2%的相同性:
10 20 30 40 50 60orf104a.pep MENQRPLLGFALALLAAMTWGTLPIAVRQVLKFVDAPTLVWVRFTVAAAVLFVLLALGGR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf104-1 MENQRPLLGFALALLAAMTWGTLPIAVRQVLKFVDAPTLVWVRFTVAAAVLFVLLALGGR
10 20 30 40 50 60
70 80 90 100 110 120orf104a.pep LPKWRDFSWCSFRLLLLGVAGISANFVLIAQGLHYISPTTTQVLWQISPFTMIVVGVLVF
||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf104-1 LPKRRDFSWCSFRLLLLGVAGISANFVLIAQGLHYISPTTTQVLWQISPFTMIVVGVLVF
70 80 90 100 110 120
130 140 150 160 170 180orf104a.pep KDRMTAAQKIGLVLLLAGLLMFFNDKFGELSGLGAYAKGVLLCAAGSMAWVCYAVAQKLL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf104-1 KDRMTAAQKIGLVLLLAGLLMFFNDKFGELSGLGAYAKGVLLCAAGSMAWVCYAVAQKLL
130 140 150 160 170 180
190 200 210 220 230 240orf104a.pep SAQFGPQQILLLIYAASAAVFLPFAELAHIGSLDGTLAWVCFAYCCLNTLIGYGSFGEAL
|||||||||||||||||||||||||| |||||||||||||||||||||||||||||||||orf104-1 SAQFGPQQILLLIYAASAAVFLPFAEPAHIGSLDGTLAWVCFAYCCLNTLIGYGSFGEAL
190 200 210 220 230 240
250 260 270 280 290 300orf104a.pep KHWEASKVSAVTTLLPVFTVIFSLLGHYVMPDTFAAPDMNGLGYAGALVVVGGAVTAAVG
||||||||||||||||||||| ||||||||:||||||||||||||||||||||||||||orf104-1 KHWEASKVSAVTTLLPVFTVIXXLLGHYVMPETFAAP
250 260 270
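The identity figures quoted for these pairwise alignments can be reproduced by counting matching columns. The following is a minimal Python sketch, not the program used in the patent (which is not named here); the function name `percent_identity` is hypothetical, and the two strings are residues 61-120 of ORF104a and ORF104-1 as printed in the alignment above.

```python
def percent_identity(seq_a: str, seq_b: str) -> tuple[int, int, float]:
    """Return (identical, compared, percent) for two aligned, equal-length strings.

    Columns where both sequences carry the same residue count as identical;
    gap characters ('-') never count as identical.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    identical = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "-")
    compared = len(seq_a)
    return identical, compared, 100.0 * identical / compared


# Residues 61-120 of the ORF104a / ORF104-1 alignment (one difference: W vs R).
orf104a_61_120 = "LPKWRDFSWCSFRLLLLGVAGISANFVLIAQGLHYISPTTTQVLWQISPFTMIVVGVLVF"
orf104_1_61_120 = "LPKRRDFSWCSFRLLLLGVAGISANFVLIAQGLHYISPTTTQVLWQISPFTMIVVGVLVF"

identical, compared, pct = percent_identity(orf104a_61_120, orf104_1_61_120)
print(f"{identical}/{compared} identical ({pct:.1f}%)")  # 59/60 identical (98.3%)
```

Over the whole 277-residue overlap the two sequences differ at five positions (counting the two X residues of ORF104-1 as mismatches), so the same calculation gives 272/277, which rounds to the 98.2% stated above.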
Homology with a predicted ORF from N. gonorrhoeae
ORF104 shows 93.9% identity over a 277-amino-acid overlap with a predicted ORF (ORF104.ng) from N. gonorrhoeae:
orf104.pep MENQRPLLGFRLALLAAMTWGTLPXSVRQVLKFVDAPTLVWVRFTVAAAVLFVLLALGGR 60
|||||||||| ||||||||||||| :||||||||||||||||||||||||||||||||||
orf104ng MENQRPLLGFALALLAAMTWGTLPIAVRQVLKFVDAPTLVWVRFTVAAAVLFVLLALGGR 60
orf104.pep LPKRRDFSWCSFRLLLLGVAGISANFVLIAQGLHYISPTTTQVLWQISPFTMIVVGVLVF 120
||||||||| |||||||||:||||||||||||||||||||||||||||||||||||||||
orf104ng LPKRRDFSWHSFRLLLLGVTGISANFVLIAQGLHYISPTTTQVLWQISPFTMIVVGVLVF 120
orf104.pep KDRMTAAQKIGLVLLLAGLLMYFNDKFGELSGLGAYXKGVLLCAAGSMAWVCNAVAQKLL 180
||||||||||||||||:||||:|||||||||||||| ||||||||||||||| |||||||
orf104ng KDRMTAAQKIGLVLLLVGLLMFFNDKFGELSGLGAYAKGVLLCAAGSMAWVCYAVAQKLL 180
orf104.pep SAQFGPQQILLLIYAASAAVFLPFAEPAHIGSMDGTLAWVCIAYCCLNTLIGYGSFGEAL 240
|||||||||||||||||||||| ||||||||:||||||||::|||||||||||||||||
orf104ng SAQFGPQQILLLIYAASAAVFLLXAEPAHIGSLDGTLAWVCFVYCCLNTLIGYGSFGEAL 240
orf104.pep KHWEASKVSAVTTLLPVFTVINTLLGHYVMPETFAAP 277
||||||||||||||||||||| :||||||||:|||||
orf104ng KHWEASKVSAVTTLLPVFTVIFSLLGHYVMPDTFAAPDMNGLGYVGALVVVGGAVTAAVG 300
预计全长ORF104ng核苷酸序列<SEQ ID 347>编码的蛋白质具有氨基酸序列<SEQ ID 348>:1 MENQRPLLGF ALALLAAMTW GTLPIAVRQV LKFVDAPTLV WVRFTVAAAV51 LFVLLALGGR LPKRRDFSWH SFRLLLLGVT GISANFVLIA QGLHYISPTT101 TQVLWQISPF TMIVVGVLVF KDRMTAAQKI GLYLLLVGLL MFFNDKFGEL151 SGLGAYAKGV LLCAAGSMAW VCYAVAQKLL SAQFGPQQIL LLIYAASAAV201 FLLXAEPAHI GSLDGTLAWV CFVYCCLNTL IGYGSFGEAL KHWEASKVSA251 VTTLLPVFTV IFSLLGHYYM PDTFAAPDMN GLGYVGALVV VGGAVTAAVG301 DRPFKRR*进一步的工作揭示了完整的淋球菌核苷酸序列<SEQ ID 349>:1 ATGGAAAACC AAAGGCCGCT CCTAGGCTTC GCGTTGGCAC TTTTGGCGGC51 GATGACGTGG GGGACGCTGC CGATTGCCGT GCGGCAGGTA TTGAAGTTTG101 TCGATGCGCC GACGCTGGTG TGGGTGCGTT TTACCGTGGC GGCGGCGGTA151 TTGTTTGTTT TGCTGGCATT GGGCGGGCGG CTGCCGAAGC GGCGGGATTT201 TTCTTGGCAT TCATTCAGGC TGCTGCTGCT CGGCGTGACG GGCATTTCGG251 CAAACTTTGT GCTGATTGCC CAAGGGCTGC ATTATATTTC GCCGACCACG301 ACGCAGGTTT TGTGGCAGAT TTCGCCGTTT ACGATGATTG TTGTCGGCGT351 GTTGGTGTTT AAAGACCGGA tgaCTGCCGC GCAGAAAATC GGTTTGGTTT401 TGCTGCttgT CGGTttgCTT ATGTTTTtta ACGACAAATT CGGCGAGTTG451 TCGGGTTTGG GCGCGTATGC GAAGGGCGTG TTGCTGTGTG CGGCAGGCAG501 TATGGCCTGG GTGTGTTATG CCGTGGCGCA AAAGCTGCTG TCGGCGCAAT551 TCGGGCCGCA ACAGATTCTG CTGTTGATTT ATGCGGcaag tgccgccGTG601 TTCCtgccgT TTGccgaaCC GGCACACATC GGAAGTTTgg aCGGTACGtt651 GGCGTGGGTT TGTTTTGTGT ATTGCTGCTT GAATACGTTA ATCGGTTACG701 GCTCGTTCGG CGAGGCGTTG AAACATTGGG AGGCTTCCAA AGTCAGCGCG751 GTAACAACCT TGCTCCCCGT GTTTACCGTA ATATTTTCTT TGCTCGGGCA801 TTATGTGATG CCTGATACTT TTGCCGCGCC GGATATGAAC GGTTTGGGTT851 ATGTCGGCGC ACTGGTCGTG GTCGGGGGTG CGGTTACGGC GGCGGTGGGG901 GACAGGCCGT TCAAACGCCG CTAG它对应于氨基酸序列<SEQ ID 350;ORF104ng-1>:1 MENQRPLLGF ALALLAAMTW GTLPIAVRQV LKFVDAPTLV WVRFTVAAAV51 LFVLLALGGR LPKRRDFSWH SFRLLLLGYT GISANFVLIA QGLHYISPTT101 TQVLWQISPF TMIVVGVLVF KDRMTAAQKI GLVLLLVGLL MFFNDKFGEL151 SGLGAYAKGV LLCAAGSMAW VCYAVAQKLL SAQFGPQQIL LLIYAASAAV201 FLPFAEPAHI GSLDGTLAWV CFVYCCLNTL IGYGSFGEAL KHWEASKVSA251 VTTLLPVFTV IFSLLGHYVM PDTFAAPDMN GLGYVGALVV VGGAVTAAVG301 DRPFKRR*ORF104ng-1和ORF104-1显示在277个氨基酸的重叠区内有97.5%的相同性:
10 20 30 40 50 60orf104-1.pep MENQRPLLGFALALLAAMTWGTLPIAVRQVLKFVDAPTLVWVRFTVAAAVLFVLLALGGR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf104ng-1 MENQRPLLGFALALLAAMTWGTLPIAVRQVLKFVDAPTLVWVRFTVAAAVLFVLLALGGR
10 20 30 40 50 60
70 80 90 100 110 120orf104-1.pep LPKRRDFSWCSFRLLLLGVAGISANFVLIAQGLHYISPTTTQVLWQISPFTMIVVGVLVF
||||||||| |||||||||:||||||||||||||||||||||||||||||||||||||||orf104ng-1 LPKRRDFSWHSFRLLLLGVTGISANFVLIAQGLHYISPTTTQVLWQISPFTMIVVGVLVF
70 80 90 100 110 120
130 140 150 160 170 180orf104-1.pep KDRMTAAQKIGLVLLLAGLLMFFNDKFGELSGLGAYAKGVLLCAAGSMAWVCYAVAQKLL
||||||||||||||||:|||||||||||||||||||||||||||||||||||||||||||orf104ng-1 KDRMTAAQKIGLVLLLVGLLMFFNDKFGELSGLGAYAKGVLLCAAGSMAWVCYAVAQKLL
130 140 150 160 170 180
190 200 210 220 230 240orf104-1.pep SAQFGPQQILLLIYAASAAVFLPFAEPAHIGSLDGTLAWVCFAYCCLNTLIGYGSFGEAL
||||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||orf104ng-1 SAQFGPQQILLLIYAASAAVFLPFAEPAHIGSLDGTLAWVCFVYCCLNTLIGYGSFGEAL
190 200 210 220 230 240
250 260 270orf104-1.pep KHWEASKVSAVTTLLPVFTVIXXLLGHYVMPETFAAP
||||||||||||||||||||| ||||||||:|||||orf104ng-1 KHWEASKVSAVTTLLPVFTVIFSLLGHYVMPDTFAAPDMNGLGYVGALVVVGGAVTAAVG
250 260 270 280 290 300
In addition, ORF104ng-1 shows significant homology with a hypothetical H. influenzae protein:
gi|1573895 (U32769) hypothetical [Haemophilus influenzae] Length = 306
Score = 237 bits (598), Expect = 8e-62
Identities = 114/280 (40%), Positives = 168/280 (59%), Gaps = 8/280 (2%)
Query: 30 QRPXXXXXXXXXXXMTWGTLPIAVRQVLKFVDAPTLVWXXXXXXXXXXXXXXXXXXXXP- 88
Q+P M WG+LPIA++QVL ++A T+VW P
Sbjct: 3 QQPLLGFTFALITAMAWGSLPIALKQVLSVMNAQTIVWYRFIIAAVSLLALLAYKKQLPE 62
Query: 89 --KRRDFSWHSFRLLLLGVTGISANFVLIAQGLHYISPTTTQVLWQISPFTMIVVGVLVF 146
K R ++W ++L+GV G+++NF+L + L+YI P+ Q+ +S F M++ GVL+F
Sbjct: 63 LKVRQYAW----IMLIGVIGLTSNFLLFSSSLNYIEPSVAQIFIHLSSFGMLICGVLIF 118
Query: 147 KDRMTAAQKIXXXXXXXXXXMFFNDKFGELSGLGAYAKGVLLCAAGSMAWVCYAVAQKLL 206
K+++ QKI +FFND+F +GL Y+ GV+L G++ WV Y +AQKL+
Sbjct: 119 KEKLGLHQKIGLFLLLIGLGLFFNDRFDAFAGLNQYSTGVILGVGGALIWVAYGMAQKLM 178
Query: 207 SAQFGPQQILLLIYAASAAVFLPFAEPAHIGSLDGTLAWVCFVYCCLNTLIGYGSFGEAL 266
+F QQILL++Y A F+P A+ + + L LA +CF+YCCLNTLIGYGS+ EAL
Sbjct: 179 LRKFNSQQILLMMYLGCAIAFMPMADFSQVQELT-PLALICFIYCCLNTLIGYGSYAEAL 237
Query: 267 KHWEASKVSAVTTLLPVFTVIFSLLGHYVMPDTFAAPDMN 306
W+ SKVS V TL+P+FT++FS + HY P FAAP++N
Sbjct: 238 NRWDVSKVSVVITLVPLFTILFSHIAHYFSPADFAAPELN 277
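Based on this analysis (including the presence of a putative leader sequence and several putative transmembrane domains in the gonococcal protein), it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 48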
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 351>:1 ATGGTAGCTC GTCGGGCTCA TAACCCGAAG GTCGTAGGTT CGAATCCTGT51 .CCCGCAACC TAATTTCAAA CCCCTCGGTT CAATGCCGAG GG.GTTTTGT101 T.TTGCCTGT TTCCTGTTTC CTGTTTCCTG CCGCCTCCGT TTTTTGCCGG151 ATTTTCCTTC CGGCCGCAAT ATCGGAACGG CAGACCGCCG TCTGTTTGCG201 GTTGCAAATT CAGGCAGTTT GGCTACAATC TTCCGCATTG TCTTCAAGAA251 AGCCAACCAT GCCGACCGTC CGTTTTACCG AATCCGTCAG CAAACAAGAC301 CTTGATGCTC TGTTCGAGTG GGCAAAAGCA AGTTACGGTG CAGAAAGTTG351 CTGGAAAACG CTGTATCTGA ACGGTCysCC TTTGGGCAAC CTGTCGCCGG401 AATGGGTGGA ACGCGTsmmA AAAGACTGGG AGGCAGGCTG CyCGGAGTCT451 TCAGACGGCA TTTTTCTGAA TgCGGACGGc TGgCctGATA TGGgCGGAcg501 cTTACAGCAC CTCGCCCTCG GTTGGCACTG TGCGGGGCTG TTGGACGgsT551 GGCGCAACGA 6TGTTTCGAC CTGACCGACG GCGGCGGCAA CCCCTTGTTC601 ACGCTCGaAc GCGCCGyTTT mCGTCCTkTC GGACTGCTCA GCCGCGCCGT651 CCATCTCAAC GGTCTGACCG AATCGGACGG CCGATGGCAT TTCTGGATAG701 GCAGGCGCAG TCCGCACAAA GCAGTCGATC CCAACAAACT CGACAATACT751 rCCGCCGGCG GTGTTTCCGG CGGCGAAATG CCGTCTGAAG CCGTGTGTCG801 CGAAAGCAGC GAAGAAGCCG GTTTGGATAA AACGCTGcTT CCGCTCATCC851 GCCCGGTATC GCAGCTGCAC AGCCTGCGCT CCGTCAGCCG GGGTGTACAC901 AATGAAATCC TGTATGTATT CGATGCCGTC CTGCCG...它对应于氨基酸序列<SEQ ID 352;ORF105>:1 MVARRAHNPK VVGSNPXPAT XFQTPRFNAE XVLXLPVSCF LFPAASVFCR51 IFLPAAISER QTAVCLRLQI QAVWLQSSAL SSRKPTMPTV RFTESVSKQD101 LDALFEWAKA SYGAESCWKT LYLNGXPLGN LSPEWVERVX KDWEAGCXES151 SDGIFLNADG WPDMGGRLQH LALGWHCAGL LDGWRNECFD LTDGGGNPLF201 TLERAXXRPX GLLSRAVHLN GLTESDGRWH FWIGRRSPHK AVDPNKLDNT251 XAGGVSGGEM PSEAVCRESS EEAGLDKTLL PLIRPVSQLH SLRSVSRGVH301 NEILYVFDAV LP...进一步的工作揭示了完整的核苷酸序列<SEQ ID 353>:1 ATGCCGACCG TCCGTTTTAC CGAATCCGTC AGCAAACAAG ACCTTGATGC51 TCTGTTCGAG TGGGCAAAAG CAAGTTACGG TGCAGAAAGT TGCTGGAAAA101 CGCTGTATCT GAACGGTCTG CCTTTGGGCA ACCTGTCCCC GGAATGGGTG151 GAACGCGTCA AAAAAGACTG GGAGGCAGGC TGCTCGGAGT CTTCAGACGG201 CATTTTTCTG AATGCGGACG GCTGGCCTGA TATGGGCGGA CGCTTACAGC251 ACCTCGCCCT CGGTTGGCAC TGTGCGGGGC TGTTGGACGG CTGGCGCAAC301 GAGTGTTTCG ACCTGACCGA CGGCGGCGGC AACCCCTTGT TCACGCTCGA351 ACGCGCCGCT TTCCGTCCTT TCGGACTGCT CAGCCGCGCC 
GTCCATCTCA401 ACGGTCTGAC CGAATCGGAC GGCCGATGGC ATTTCTGGAT AGGCAGGCGC451 AGTCCGCACA AAGCAGTCGA TCCCAACAAA CTCGACAATA CTGCCGCCGG501 CGGTGTTTCC GCCGGCGAAA TGCCGTCTGA AGCCGTGTGT CGCGAAAGCA551 GCGAAGAAGC CGGTTTGGAT AAAACGCTGC TTCCGCTCAT CCGCCCGGTA601 TCGCAGCTGC ACAGCCTGCG CTCCGTCAGC CGGGGTGTAC ACAATGAAAT651 CCTGTATGTA TTCGATGCCG TCCTGCCCGA AACCTTCCTG CCTGAAAATC701 AGGATGGCGA AGTGGCGGGT TTTGAGAAAA TGGACATCGG CGGTCTGTTG751 GATGCCATGT TGTCGGGAAA CATGATGCAC GACGCGCAAC TGGTTACGCT801 GGACGCGTTT TGCCGTTACG GTCTGATTGA TGCCGCCCAT CCGCTGTCCG851 AGTGGCTGGA CGGCATACGT TTATAG它对应于氨基酸序列<SEQ ID 354;ORF105-1>:1 MPTVRFTESV SKQDLDALFE WAKASYGAES CWKTLYLNGL PLGNLSPEWV51 ERVKKDWEAG CSESSDGIFL NADGWPDMGG RLQHLALGWH CAGLLDGWRN101 ECFDLTDGGG NPLFTLERAA FRPFGLLSRA VHLNGLTESD GRWHFWIGRR151 SPHKAVDPNK LDNTAAGGVS GGEMPSEAVC RESSEEAGLD KTLLPLIRPV201 SQLHSLRSVS RGVHNEILYV FDAVLPETFL PENQDGEVAG FEKMDIGGLL251 DAMLSGNMMH DAQLVTLDAF CRYGLIDAAH PLSEWLDGIR L*
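Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF105 shows 89.4% identity over a 226-amino-acid overlap with an ORF (ORF105a) from strain A of N. meningitidis: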
60 70 80 90 100 110orf105.pep ISERQTAVCLRLQIQAVWLQSSALSSRKPTMPTVRFTESVSKQDLDALFEWAKASYGAES
|||||||||||:||||||||||||||||||orf105a MPTVRFTESVSKHDLDALFEWAKASYGAES
10 20 30
120 130 140 150 160 170orf105.pep CWKTLYLNGXPLGNLSPEWVERVXKDWEAGCXESSDGIFLNADGWPDMGGRLQHLALGWH
||||||||| |||||||||:||| ||||||| ||||||||||||||||| |||||| |:orf105a CWKTLYLNGLPLGNLSPEWAERVKKDWEAGCSESSDGIFLNADGWPDMGRRLQHLARIWK
40 50 60 70 80 90
180 190 200 210 220 230orf105.pep CAGLLDGWRNECFDLTDGGGNPLFTLERAXXRPXGLLSRAVHLNGLTESDGRWHFWIGRR
|||| |||:|||||||||:||||:|||| || ||||||||||||:|||||||||||||orf105a EAGLLHGWRDECFDLTDGGSNPLFALERAAFRPFGLLSRAVHLNGLVESDGRWHFWIGRR
100 110 120 130 140 150
240 250 260 270 280 290orf105.pep SPHKAVDPNKLDNTXAGGVSGGEMPSEAVCRESSEEAGLDKTLLPLIRPVSQLHSLRSVS
||||||||:||||| |||||:||:|||:||||||||||||||||||||||||||||| ||orf105a SPHKAVDPDKLDNTAAGGVSSGELPSETVCRESSEEAGLDKTLLPLIRPVSQLHSLRPVS
160 170 180 190 200 210
300 310orf105.pep RGVHNEILYVFDAVLP
||||||||||||||||orf105a RGVHNEILYVFDAVLPETFLPENQDGEVAGFEKMDIGGLLAAMLSGNMMHDAQLVTLDAF
220 230 240 250 260 270全长ORF105a核苷酸序列<SEQ ID 355>是:1 ATGCCGACCG TCCGTTTTAC CGAATCCGTC AGCAAACACG ACCTTGATGC51 CCTATTCGAG TGGGCAAAGG CAAGTTACGG TGCGGAAAGT TGCTGGAAAA101 CGCTGTATCT GAACGGTCTG CCTTTGGGCA ATCTGTCGCC GGAATGGGCG151 GAGCGCGTCA AAAAAGACTG GGAGGCAGGC TGCTCGGAGT CTTCAGACGG201 CATTTTCCTG AATGCGGACG GCTGGCCAGA TATGGGCAGA CGCTTGCAGC251 ACCTCGCCCG AATATGGAAA GAAGCGGGAC TGCTTCACGG CTGGCGCGAC301 GAGTGTTTCG ACCTGACCGA CGGCGGCAGC AATCCCTTGT TCGCGCTCGA351 ACGCGCCGCT TTCCGTCCGT TCGGACTGCT CAGCCGCGCC GTCCATCTCA401 ACGGTTTGGT CGAATCGGAC GGCCGATGGC ATTTCTGGAT AGGCAGGCGC451 AGTCCGCACA AAGCAGTCGA TCCCGACAAA CTCGACAATA CTGCCGCCGG501 CGGTGTTTCC AGCGGTGAAT TGCCGTCTGA AACCGTGTGT CGCGAAAGCA551 GCGAAGAAGC CGGTTTGGAT AAAACGCTGC TTCCGCTCAT CCGCCCGGTA601 TCGCAGCTGC ACAGCCTGCG CCCCGTCAGC CGGGGTGTGC ACAATGAAAT651 CCTGTATGTA TTCGATGCCG TCCTGCCCGA AACCTTCCTG CCTGAAAATC701 AGGATGGCGA AGTGGCGGGT TTTGAGAAAA TGGACATCGG CGGTCTGTTG751 GCTGCCATGT TGTCGGGAAA CATGATGCAC GACGCGCAAC TGGTTACGCT801 GGACGCGTTT TGCCGTTACG GTCTGATTGA TGCCGCCCAT CCGCTGTCCG851 AGTGGCTGGA CGGCATACGT TTATAG它编码的蛋白质具有氨基酸序列<SEQ ID 356>:1 MPTVRFTESV SKHDLDALFE WAKASYGAES CWKTLYLNGL PLGNLSPEWA51 ERVKKDWEAG CSESSDGIFL NADGWPDMGR RLQHLARIWK EAGLLHGWRD101 ECFDLTDGGS NPLFALERAA FRPFGLLSRA VHLNGLYESD GRWHFWIGRR151 SPHKAVDPDK LDNTAAGGVS SGELPSETVC RESSEEAGLD KTLLPLIRPV201 SQLHSLRPVS RGVHNEILYV FDAVLPETFL PENQDGEVAG FEKMDIGGLL251 AAMLSGNMMH DAQLVTLDAF CRYGLIDAAH PLSEWLDGIR L*ORF105a和ORF105-1显示在291个氨基酸的重叠区内有93.8%的相同性:
10 20 30 40 50 60orf105a.pep MPTVRFTESVSKHDLDALFEWAKASYGAESCWKTLYLNGLPLGNLSPEWAERVKKDWEAG
||||||||||||:||||||||||||||||||||||||||||||||||||:||||||||||orf105-1 MPTVRFTESVSKQDLDALFEWAKASYGAESCWKTLYLNGLPLGNLSPEWVERVKKDWEAG
10 20 30 40 50 60
70 80 90 100 110 120orf105a.pep CSESSDGIFLNADGWPDMGRRLQHLARIWKEAGLLHGWRDECFDLTDGGSNPLFALERAA
||||||||||||||||||| |||||| |: |||| |||:|||:|||||:||||:|||||orf105-1 CSESSDGIFLNADGWPDMGGRLQHLALGWHCAGLLDGWRNECFDLTDGGGNPLFTLERAA
70 80 90 100 110 120
130 140 150 160 170 180orf105a.pep FRPFGLLSRAVHLNGLVESDGRWHFWIGRRSPHKAVDPDKLDNTAAGGVSSGELPSETVC
||||||||||||||||:|||||||||||||||||||||:|||||||||||:||:|||:||orf105-1 FRPFGLLSRAVHLNGLTESDGRWHFWIGRRSPHKAVDPNKLDNTAAGGVSGGEMPSEAVC
130 140 150 160 170 180
190 200 210 220 230 240orf105a.pep RESSEEAGLDKTLLPLIRPVSQLHSLRPVSRGVHNEILYVFDAVLPETFLPENQDGEVAG
||||||||||||||||||||||||||| ||||||||||||||||||||||||||||||||orf105-1 RESSEEAGLDKTLLPLIRPVSQLHSLRSVSRGVHNEILYVFDAVLPETFLPENQDGEVAG
190 200 210 220 230 240
250 260 270 280 290orf105a.pep FEKMDIGGLLAAMLSGNMMHDAQLVTLDAFCRYGLIDAAHPLSEWLDGIRLX
|||||||||| |||||||||||||||||||||||||||||||||||||||||orf105-1 FEKMDIGGLLDAMLSGNMMHDAQLVTLDAFCRYGLIDAAHPLSEWLDGIRLX
250 260 270 280 290
Homology with a predicted ORF from N. gonorrhoeae
ORF105 shows 87.5% identity over a 312-amino-acid overlap with a predicted ORF (ORF105.ng) from N. gonorrhoeae:
orf105.pep MVARRAHNPKVVGSNPXPATXFQTPRFNAEXVLXLPVSCFLFPAASVFCRIFLPAAISER 60
|||||||||||||||| ||| :|||||||| || |||||||||||||||||||||
orf105ng MVARRAHNPKVVGSNPAPATKYQTPRFNAEGVLF-----FLFPAASVFCRIFLPAAISER 55
orf105.pep QTAVCLRLQIQAVWLQSSALSSRKPTMPTVRFTESVSKQDLDALFEWAKASYGAESCWKT 120
|:|||||||||||||||||| |||||||||||||||||||||||| ||||||||||||||
orf105ng QAAVCLRLQIQAVWLQSSALCSRKPAMPTVRFTESVSKQDLDALFERAKASYGAESCWKT 115
orf105.pep LYLNGXPLGNLSPEWVERVXKDWEAGCXESSDGIFLNADGWPDMGGRLQHLALGWHCAGL 180
|||| |||||||||:||||||||||| |||:|||||||||||||||||||| |: |||
orf105ng LYLNRLPLGNLSPEWAERIKKDWEAGCSESSNGIFLNADGWPDMGGRLQHLARTWNKAGL 175
orf105.pep LDGWRNECFDLTDGGGNPLFTLERAXXRPXGLLSRAVHLNGLTESDGRWHFWIGRRSPHK 240
| ||||||||||||||||||||||| || ||| ||||||||:||:||||||||||||||
orf105ng LHGWRNECFDLTDGGGNPLFTLERAAFRPFGLLIRAVHLNGLVESNGRWHFWIGRRSPHK 235
orf105.pep AVDPNKLDNTXAGGVSGGEMPSEAVCRESSEEAGLDKTLLPLIRPVSQLHSLRSVSRGVH 300
||||:|||| :|||||||||||||||||||||||||||:|||||||:||||| ||||||
orf105ng AVDPGKLDNIAGGGVSGGEMPSEAVCRESSEEAGLDKTLFPLIRPVSRLHSLRPVSRGVH 295
orf105.pep NEILYVFDAVLP 312
||||||||||||
orf105ng NEILYVFDAVLPETFLPENQDGEVAGFEKMDIGGLLDAMLSKNMMHDAQLVTLDAFYRYG 355
预计全ORF105ng核苷酸序列<SEQ ID 357>编码的蛋白质具有氨基酸序列<SEQID 358>:1 MVARRAHNPK VVGSNPAPAT KYQTPRFNAE GVLFFLFPAA SVFCRIFLPA51 AISERQAAVC LRLQIQAVWL QSSALCSRKP AMPTVRFTES VSKQDLDALF101 ERAKASYGAE SCWKTLYLNR LPLGNLSPEW AERIKKDWEA GCSESSNGIF151 LNADGWPDMG GRLQHLARTW NKAGLLHGWR NECFDLTDGG GNPLFTLERA201 AFRPFGLLIR AVHLNGLVES NGRWHFWIGR RSPHKAVDPG KLDNIAGGGV251 SGGFMPSEAV CRESSEEAGL DKTLFPLIRP VSRLHSLRPV SRGYHNEILY301 VFDAVLPETF LPENQDGEVA GFEKMDIGGL LDAMLSKNMM HDAQLVTLDA351 FYRYGLIDAA HPLSEWLDGI RL*进一步的工作揭示了完整的核苷酸序列<SEQ ID 359>:1 ATGCCGACCG TCCGTTTTAC CGAATCCGTC AGCAAACAAG ACCTTGATGC51 CCTGTTCGAG CGGGCAAAAG CAAGTTACGG TGCCGAAAGT TGCTGGAAAA101 CGCTGTATCT GAACCGTCTT CCTTTGGGCA ATCTGTCGCC GGAATGGGCT151 GAGCGCATCA AAAAAGACTG GGAGGCAGGC TGCTCCGAGT CTTCAGACGG201 CATTTTTCTG AATGCGGACG GCTGGCCGGA TATGGGCGGA CGCTTGCAGC251 ACCTCGCCCG CACATGGAAC AAGGCGGGGC TGCTTCACGG ATGGCGCAAC301 GAGTGTTTCG ACCTGACCGA CGGCGGCGGC AACCCCTTGT TCACGCTCGA351 ACGCGCCGCT TTCCGTCCGT TCGGACTACT CAGCCGCGCC GTCCATCTCA401 ACGGTTTGGT CGAATCGAAC GGCAGATGGC ATTTTTGGAT AGGCAGGCGC451 AGTCCGCACA AAGCAGTCGa tcCCGGCAAG CTCGACAATA TTGCCGGCGG501 CGGTGTTTCC GGCGGCGAAA TGCCGTCTGA AGCCGTGTGC CGCGAAAGCA551 GCGAAGAAGC CGGTTTGGAT AAAACGCTGT TTCCGCTCAT CCGCCCAGTA601 TCGCGGCTGC ACAGCCTTCG CCCCGTCAGC CGAGGTGTGC ACAATGAAAT651 CCTGTATGTG TTCGATGCCG TCCTGCCCGA AACCTTCCTG CCTGAAAATC701 AGGATGGCGA GGTAGCGGGT TTTGAAAAGA TGGACATTGG CGGCCTATTG751 GATGCCATGT TGTCGAAAAA CATGATGCAC GACGCGCAAC TGGTTACGCT801 GGACGCGTTT TACCGTTACG GTCTGATTGA TGCCGCCCAT CCGCTGTCCG851 AGTGGCTGGA CGGCATACGT TTATAG它对应于氨基酸序列<SEQ ID 360;ORF105ng-1>:1 MPTVRFTESV SKQDLDALFE RAKASYGAES CWKTLYLNRL PLGNLSPEWA51 ERIKKDWEAG CSESSDGIFL NADGWPDMGG RLQHLARTWN KAGLLHGWRN101 ECFDLTDGGG NPLFTLERAA FRPFGLLSRA VHLNGLVESN GRWHFWIGRR151 SPHKAVDPGK LDNIAGGGVS GGEMPSEAVC RESSEEAGLD KTLFPLIRPV201 SRLHSLRPVS RGVHNEILYV FDAVLPETFL PENQDGEVAG FEMDIGGLL251 DAMLSKNMMH DAQLVTLDAF YRYGLIDAAH PLSEWLDGIR L*ORG105ng-1和ORF105-1显示在291个氨基酸的重叠区内有93.5%的相同性:
10 20 30 40 50 60orf105-1.pep MPTVRFTESVSKQDLDALFEWAKASYGAESCWKTLYLNGLPLGNLSPEWVERVKKDWEAG
||||||||||||||||||| ||||||||||||||||| ||||||||||:||:||||||||orf105ng-1 MPTVRFTESVSKQDLDALFERAKASYGAESCWKTLYLNRLPLGNLSPEWAERIKKDWEAG
10 20 30 40 50 60
70 80 90 100 110 120orf105-1.pep CSESSDGIFLNADGWPDMGGRLQHLALGWHCAGLLDGWRNECFDLTDGGGNPLFTLERAA
|||||||||||||||||||||||||| |: |||| |||||||||||||||||||||||orf105ng-1 CSESSDGIFLNADGWPDMGGRLQHLARTWNKAGLLHGWRNECFDLTDGGGNPLFTLERAA
70 80 90 100 110 120
130 140 150 160 170 180orf105-1.pep FRPFGLLSRAVHLNGLTESDGRWHFWIGRRSPHKAVDPNKLDNTAAGGVSGGEMPSEAVC
||||||||||||||||:||:||||||||||||||||||:|||| |:||||||||||||||orf105ng-1 FRPFGLLSRAVHLNGLVESNGRWHFWIGRRSPHKAVDPGKLDNIAGGGVSGGEMPSEAVC
130 140 150 160 170 180
190 200 210 220 230 240orf105-1.pep RESSEEAGLDKTLLPLIRPVSQLHSLRSVSRGVHNEILYVFDAVLPETFLPENQDGEVAG
|||||||||||||:|| ||||:|||||||||||||||||||||||||||||||||| |||orf105ng-1 RESSEEAGLDKTLFPLIRPVSRLHSLRPVSRGVHNEILYVFDAVLPETFLPENQDGEVAG
190 200 210 220 230 240
250 260 270 280 290orf105-1.pep FEKMDIGGLLDAMLSGNMMHDAQLVTLDAFCRYGLIDAAHPLSEWLDGIRLX
||||||||| ||||| |||||||||||||| |||||||||||||||||||||orf105ng-1 FEKMDIGGLLDAMLSKNMMHDAQLVTLDAFYRYGLIDAAHPLSEWLDGIRLX
250 260 270 280 290
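In addition, ORF105ng-1 shows homology with a yeast enzyme:
sp|P41888|TNR3_SCHPO thiamin pyrophosphokinase (TPK) (thiamin kinase)
>gi|1076928|pir||S52350 thiamin pyrophosphokinase (EC 2.7.6.2) - fission yeast (Schizosaccharomyces pombe) >gi|666111 (X84417) thiamin pyrophosphokinase [Schizosaccharomyces pombe] >gi|2330852|gnl|PID|e334056 (Z98533) thiamin pyrophosphokinase [Schizosaccharomyces pombe] Length = 569
Score = 105 bits (259), Expect = 4e-22
Identities = 64/192 (33%), Positives = 94/192 (48%), Gaps = 3/192 (1%)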
Query: 268 NKAGLLHGWRNECFDLTDGGGNPLFTLERAAFRPFGLLSRAVHLNGLVESNGRW--HFWI 441
N G+ WRNE + + P+ +ER F FG LS VH + + W+
Sbjct: 96 NTFGIADQWRNELYTVYGKSK1PVLAVERGGFWLFGFLSTGVHCTMYIPATKEHPLRIWV 155
Query: 442 GRRSPHKAVDPGKLDNIAGGGVSGGEMPSEAVCRESSEEAGLDKTLFPLIRPVSRLHSLR 621
RRSP K P LDN GG++ G+ + +E SEEA LD + LI P + ++
Sbjct: 156 PRRSPTKQTWPNYLDNSVAGGIAHGDSVIGTMIKEFSEEANLDVSSMNLI-PCGTVSYIK 214
Query: 622 PVSRG-VHNEILYVFDAVLPETFLPENQDGEVAGFEKMDIGGLLDAMLSKNMMHDAQLVT 798
R + E+ YVFD + + +P DGEVAGF + + +L + K+ + LV
Sbjct: 215 MEKRHWIQPELQYVFDLPVDDLVIPRINDGEVAGFSLLPLNQVLHELELKSFKPNCALVL 274
Query: 799 LDAFYRYGLIDAAHP 843
LD R+G+I HP
Sbjct: 275 LDFLIRHGIITPQHP 289
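The Query coordinates in this alignment (268 to 843) run past the 291-residue length of ORF105ng-1, so they appear to be nucleotide positions in the complete gonococcal sequence <SEQ ID 359> rather than residue numbers. This reading is an assumption: it fits the first row, where positions 268 to 441 span 174 nucleotides (58 codons) and the row shows 58 residues plus two gap characters. Under that assumption, a short Python sketch (the function name `nt_range_to_aa` is hypothetical) converts the coordinates to residue numbers in <SEQ ID 360>:

```python
def nt_range_to_aa(nt_start: int, nt_end: int) -> tuple[int, int]:
    """Map 1-based nucleotide coordinates (reading frame 1) to 1-based residue numbers."""
    first = (nt_start - 1) // 3 + 1
    last = (nt_end - 1) // 3 + 1
    return first, last


# Query coordinate pairs from the four alignment rows above.
for start, end in [(268, 441), (442, 621), (622, 798), (799, 843)]:
    print(start, end, nt_range_to_aa(start, end))
```

The first row then maps to residues 90 to 147, and residue 90 of ORF105ng-1 is the N that begins the aligned segment NKAGLL.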
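Based on this analysis (including the presence of a putative transmembrane domain in the gonococcal protein), it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 49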
The following DNA sequence, believed to be complete, was identified in N. meningitidis <SEQ ID 361>:
1 ATGAATAGAC CCAAGCAACC CTTCTTCCGT CCCGAAGTCG CCGTTGCCCG
51 CCAAACCAGC CTGACGGGTA AAGTGATTCT GACACGACCG TTGTCATTTT
101 CCCTATGGAC GACATTTGCA TCGATATCTG CGTTATTGAT TATCCTGTTT
151 TTGATATTTG GTAACTATAC GCGAAAGACA ACAGTGGAGG GACAAATTTT
201 ACCTGCATCG GGCGTAATCA GGGTGTATGC ACCGgATACG rGkACAATTA
251 CAGCGAAATT CGTGGAAGAT GGmsAAAAGG TTAAGGCTGG CGACAAGCTA
301 TTTGCGCTTT CGACCTCACG TTTCGGCGCA GGAGGTAGCG TGCAGCAGCA
351 GTTGAAAACG GAGGCAGTTT TGAAGAAAAC GTTGGCAGAA CAGGAACTGG
401 GTCGTCTGAA GCTGATACAC GGGAATGAAA CGCGCAgCcT TAAAGCAACT
451 GTCGAACGTT TGGAAAACCA GGAACTCCAT ATTTCGCAAC AGATAGACGG
501 TCAGAAAAGG CGCATTAGAC TTGCGGAAGA AATGTTGCAG AAATATCGTT
551 TCCTATCCGC .CAATGA
This corresponds to the amino acid sequence <SEQ ID 362; ORF107>:
1 MNRPKQPFFR PEVAVARQTS LTGKVILTRP LSFSLWTTFA SISALLIILF
51 LIFGNYTRKT TVEGQILPAS GVIRVYAPDT XTITAKFVED GXKVKAGDKL
101 FALSTSRFGA GGSVQQQLKT EAVLKKTLAE QELGRLKLIH GNETRSLKAT
151 VERLENQELH ISQQIDGQKR RIRLAEEMLQ KYRFLSXQ*
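Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF107 shows 97.8% identity over a 186-amino-acid overlap with an ORF (ORF107a) from strain A of N. meningitidis: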
10 20 30 40 50 60orf107.pep MNRPKQPFFRPEVAVARQTSLTGKVILTRPLSFSLWTTFASISALLIILFLIFGNYTRKT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf107a MNRPKQPFFRPEVAVARQTSLTGKVILTRPLSFSLWTTFASISALLIILFLIFGNYTRKT
10 20 30 40 50 60
70 80 90 100 110 120orf107.pep TVEGQILPASGVIRVYAPDTXTITAKFVEDGXKVKAGDKLFALSTSRFGAGGSVQQQLKT
|||||||||||||||||||| |||||| ||| ||||||||||||||||||| ||||||||orf107a TVEGQILPASGVIRVYAPDTGTITAKFXEDGEKVKAGDKLFALSTSRFGAGDSVQQQLKT
70 80 90 100 110 120
130 140 150 160 170 180orf107.pep EAVLKKTLAEQELGRLKLIHGNETRSLKATVERLENQELHISQQIDGQKRRIRLAEEMLQ
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf107a EAVLKKTLAEQELGRLKLIHGNETRSLKATVERLENQELHISQQIDGQKRRIRLAEEMLQ
130 140 150 160 170 180
189orf107.pep KYRFLSXQX
||||||orf107a KYRFLSANDAVPKQEMMNVKAELLEQKAKLDAYRREEVGLLQEIRTQNLTLXSLPQAAX
190 200 210 220 230全长ORF107a核苷酸序列<SEQ ID 363>是:1 ATGAATAGAC CCAAGCAACC NTTCTTCCGT CCCGAAGTCG CCGTTGCCCG51 CCAAACCAGC CTGACGGGTA AAGTGATTCT GACACGACCG TTGTCATTTT101 CCCTATGGAC GACATTTGCA TCGATATCTG CGTTATTGAT TATCCTGTTT151 TTGATATTTG GTAACTATAC GCGAAAGACA ACAGTGGAGG GACAAATTTT201 ACCTGCATCG GGCGTAATCA GGGTGTATGC ACCGGATACG GGGACAATTA251 CNGCGAAATT CNTGGAAGAT GGAGAAAAGG TTAAGGCTGG CGACAAGCTA301 TTTGCGCTTT CGACCTCACG TTTCGGCGCA GGAGATAGCG TGCAGCAGCA351 GTTGAAAACG GAGGCAGTTT TGAAGAAAAC GTTGGCAGAA CAGGAACTGG401 GTCGTCTGAA GCTGATACAC GGGAATGAAA CGCGCAGCCT TAAAGCAACT451 GTCGAACGTT TGGAAAACCA GGAACTCCAT ATTTCGCAAC AGATAGACGG501 TCAGAAAAGG CGCATTAGAC TTGCGGAAGA AATGTTGCAG AAATATCGTT551 TCCTATCCGC CAATGATGCA GTGCCAAAAC AAGAAATGAT GAATGTCAAG601 GCAGAGCTTT TAGAGCAGAA AGCCAAACTT GATGCCTACC GCCGAGAAGA651 AGTCGGGCTG CTTCAGGAAA TCCGCACGCA GAATCTGACA TTGGNNAGCC701 TCCCCCAAGC GGCATGA它编码的蛋白质具有氨基酸序列<SEQ ID 364>:1 MNRPKQPFFR PEVAVARQTS LTGKVILTRP LSFSLWTTFA SISALLIILF51 LIFGNYTRKT TVEGQILPAS GVIRVYAPDT GTITAKFXED GEKVKAGDKL101 FALSTSRFGA GDSVQQQLKT EAVLKKTLAE QELGRLKLIH GNETRSLKAT151 VERLENQELH ISQQIDGQKR RIRLAEEMLQ KYRFLSANDA VPKQEMMNVK201 AELLEQKAKL DAYRREEVGL LQEIRTQNLT LXSLPQAA*
Homology with a predicted ORF from N. gonorrhoeae
ORF107 shows 95.7% identity over a 188-amino-acid overlap with a predicted ORF (ORF107.ng) from N. gonorrhoeae:
orf107.pep MNRPKQPFFRPEVAVARQTSLTGKVILTRPLSFSLWTTFASISALLIILFLIFGNYTRKT 60
||||||||||||||:|||||||||||||||||||||||||||||||||||||||||||||
orf107ng MNRPKQPFFRPEVAIARQTSLTGKVILTRPLSFSLWTTFASISALLIILFLIFGNYTRKT 60
orf107.pep TVEGQILPASGVIRVYAPDTXTITAKFVEDGXKVKAGDKLFALSTSRFGAGGSVQQQLKT 120
|:|||||||||||||||||| |||||||||| ||||||||||||||||||||||||||||
orf107ng TMEGQILPASGVIRVYAPDTGTITAKFVEDGEKVKAGDKLFALSTSRFGAGGSVQQQLKT 120
orf107.pep EAVLKKTLAEQELGRLKLIHGNETRSLKATVERLENQELHISQQIDGQKRRIRLAEEMLQ 180
|||||||||||||||||||| ||||||||||||||:|||||||||||||||||||||||:
orf107ng EAVLKKTLAEQELGRLKLIHENETRSLKATVERLENQKLHISQQIDGQKRRIRLAEEMLR 180
orf107.pep KYRFLSXQ 188
|||||| |
orf107ng KYRFLSAQ 188
The predicted full-length ORF107ng nucleotide sequence <SEQ ID 365> encodes a protein having the amino acid sequence <SEQ ID 366>:
1 MNRPKQPFFR PEVAIARQTS LTGKVILTRP LSFSLWTTFA SISALLIILF
51 LIFGNYTRKT TMEGQILPAS GVIRVYAPDT GTITAKFVED GEKVKAGDKL
101 FALSTSRFGA GGSVQQQLKT EAVLKKTLAE QELGRLKLIH ENETRSLKAT
151 VERLENQKLH ISQQIDGQKR RIRLAEEMLR KYRFLSAQ*
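Based on the presence of a putative transmembrane domain in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.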
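The amino acid sequences in this example are conceptual translations of the listed DNA sequences. The following is a minimal Python sketch of that translation using the standard genetic code. It is an illustration, not the software used in the patent; the names `CODON_TABLE` and `translate` are hypothetical. As a simplification, any codon containing an ambiguity code or unread base (for example r, k, m, s or a dot) is reported as X, which happens to agree with the X at residue 81 of ORF107.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in reading frame 1; whitespace is ignored, '*' marks a stop.

    Codons containing anything other than A, C, G or T are returned as 'X'.
    """
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


# First 30 nucleotides of <SEQ ID 361>.
print(translate("ATGAATAGAC CCAAGCAACC CTTCTTCCGT"))  # MNRPKQPFFR
# Nucleotides 241-249 of <SEQ ID 361>, which start with the ambiguous codon rGk.
print(translate("rGkACAATT"))  # XTI
```

Both outputs match the corresponding stretches of <SEQ ID 362> (residues 1 to 10 and 81 to 83).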
Example 50
在脑膜炎奈瑟球菌中鉴定出下列认为是完整的DNA序列<SEQ ID 367>:1 ATGCTGAATA CTTTTTTTGC CGTATTGGGC GGCTGCCTGC TGCT.TTGCC51 GTGCGGCAAA TCCGTAAATA CGGCGGTACA GCCGCAAAAC GCGGTACAAA101 GCGCGCCGAA ACCGGTTTTC AAAGTCATAT ATATCGACAA TACGGCGATT151 GCCGGTTTGG ATTTGGGACA AAGCAGCGAA GGCAAAACCA ACGACGGCAA201 AAAACAAATC AGTTATCCGA TTAAAGGCTT GCCGGAACAA AATGTTATCC251 GACTGATCGG CAAGCATCCC GGCGACTTGG AAGCCGTCAG CGGCAAATGT301 ATGGAAACCG ATGATAAGGA CAGTCCGGCA GGTTGGGCAG AAAACGGCGT351 GTGCCATACC TTGTTTGCCA AACTGGTGGG CAATATCGCC GAAGACGGCG401 GCAAACTGAC GGATTACCTA GTTTCGCATG CCGCCCTGCA ACCCTATCAG451 GCAGGCAAAA GCGGCTATGC CGCCGTGCAG AACGGACGCT ATGTGCTGGA501 AATCGACAGC GAAGGGGCGT TTTATTTCCG CCGCCGCCAT TATTGA它对应于氨基酸序列<SEQ ID 368;ORF108>:1 MLNTFFAVLG GCLLXLPCGK SVNTAVQPQN AVQSAPKPVF KVIYIDNTAI51 AGLDLGQSSE GKTNDGKKQI SYPIKGLPEQ NVIRLIGKHP GDLEAVSGKC101 METDDKDSPA GWAENGVCHT LFAKLVGNIA EDGGKLTDYL VSHAALQPYQ151 AGKSGYAAVQ NGRYVLEIDS EGAFYFRRRH Y*进一步的工作揭示了下列DNA序列<SEQ ID 369>:1 ATGCTGAAAA CATCTTTTGC CGTATTGGGC GGCTGCCTGC TGCTTGCCGC51 CTGCGGCAAA TCCGAAAATA CGGCGGAACA GCCGCAAAAC GCGGTACAAA101 GCGCGCCGAA ACCGGTTTTC AAAGTCAAAT ATATCGACAA TACGGCGATT151 GCCGGTTTGG ATTTGGGACA AAGCAGCGAA GGCAAAACCA ACGACGGCAA201 AAAACAAATC AGTTATCCGA TTAAAGGCTT GCCGGAACAA AATGTTATCC251 GACTGATCGG CAAGCATCCC GGCGACTTGG AAGCCGTCAG CGGCAAATGT301 ATGGAAACCG ATGATAAGGA CAGTCCGGCA GGTTGGGCAG AAAACGGCGT351 GTGCCATACC TTGTTTGCCA AACTGGTGGG CAATATCGCC GAAGACGGCG401 GCAAACTGAC GGATTACCTA GTTTCGCATG CCGCCCTGCA ACCCTATCAG451 GCAGGCAAAA GCGGCTATGC CGCCGTGCAG AACGGACGCT ATGTGCTGGA501 AATCGACAGC GAAGGGGCGT TTTATTTCCG CCGCCGCCAT TATTGA它对应于氨基酸序列<SEQ ID 370;ORF108-1>:1 MLKTSFAVLG GCLLLAACGK SENTAEQPQN AVQSAPKPVF KVKYIDNTAI51 AGLDLGQSSE GKTNDGKKQI SYPIKGLPEQ NVIRLIGKHP GDLEAVSGKC101 METDDKDSPA GWAENGVCHT LFAKLVGNIA EDG GKLTDYL VSHAALQPYQ151 AGKSGYAAVQ NGRYVLEIDS EGAFYFRRRH Y*
Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. gonorrhoeae
ORF108 shows 88.4% identity over a 181-amino-acid overlap with a predicted ORF (ORF108.ng) from N. gonorrhoeae:
orf108.pep MLNTFFAVLGGCLLXLPCGKSVNTAVQPQNAVQSAPKPVFKVIYIDNTAIAGLDLGQSSE 60
||: ||||||||| |||| ||| |||||:|||||||||| |||||||||| ||||||
orf108ng MLKIPFAVLGGCLLLAACGKSENTAEQPQNAAQSAPKPVFKVKYIDNTAIAGLALGQSSE 60
orf108.pep GKTNDGKKQISYPIKGLPEQNVIRLIGKHPGDLEAVSGKCMETDDKDSPAGWAENGVCHT 120
|||||||||||||||||||||::|| ||||:||||| ||||||| ||:||||||||||||
orf108ng GKTNDGKKQISYPIKGLPEQNAVRLTGKHPNDLEAVVGKCMETDGKDAPSGWAENGVCHT 120
orf108.pep LFAKLVGNIAEDGGKLTDYLVSHAALQPYQAGKSGYAAVQNGRYVLEIDSEGAFYFRRRHY 181
||||||||||||||||||||:||||||||||||||||||||||||||||||||||||||||
orf108ng LFAKLVGNIAEDGGKLTDYLISHSALQPYQAGKSGYAAVQNGRYVLEIDSEGAFYFRRRHY 181
ORF108-1 and ORF108ng show 92.3% identity over the same 181-amino-acid overlap:
orf108-1.pep MLKTSFAVLGGCLLLAACGKSENTAEQPQNAVQSAPKPVFKVKYIDNTAIAGLDLGQSSE 60
||| ||||||||||||||||||||||||||:||||||||||||||||||||| ||||||
orf108ng-1 MLKIPFAVLGGCLLLAACGKSENTAEQPQNAAQSAPKPVFKVKYIDNTAIAGLALGQSSE 60
orf108-1.pep GKTNDGKKQISYPIKGLPEQNVIRLIGKHPGDLEAVSGKCMETDDKDSPAGWAENGVCHT 120
|||||||||||||||||||||::|| ||||:||||| ||||||| ||:|:||||||||||
orf108ng-1 GKTNDGKKQISYPIKGLPEQNAVRLTGKHPNDLEAVVGKCMETDGKDAPSGWAENGVCHT 120
orf108-1.pep LFAKLVGNIAEDGGKLTDYLVSHAALQPYQAGKSGYAAVQNGRYVLEIDSEGAFYFRRRHY 181
||||||||||||||||||||:||:|||||||||||||||||||||||||||||||||||||orf108ng-1 LFAKLVGNIAEDGGKLTDYLISHSALQPYQAGKSGYAAVQNGRYVLEIDSEGAFYFRRRHY 181全长ORF108ng核苷酸序列<SEQ ID 371>是:1 ATGCTGAAAa tacctTTTGC CGTGTtgggc ggCtgcctGC TGCTTGCCGC51 CTGCGGCAAA TCCGAAAATa cggcggaACA GCCGCAAAAT gcggCACAAA101 GCGCGCCGAA ACCGGTTTTC AAAGTCAAAT ACATCGACAA TACGGCGATT151 GCCGGTTTGG CTTTGGGACA AAGTAGCGAA GGCAAAACCA acgacgGCAA201 AAAACAAATC AGTTATccgA TTAAAGGCTT GCCGGAACAA Aacgccgtcc251 gGCTGACCGG AAAGCATCCC AACGACTTGG AagccgtcgT CGGCAAATGT301 ATGGAAACCG ACGGAAAGGA CGCGCCTTCG GGCTGGGCGG AAAACGGCGT351 GTGCCATACC TTGTTTGCCA AACTGGTGGG CAATATCGCC GAAGACGGCG401 GCAAACTGAC TGATTACCTG ATTTCGCATT CCGCCCTGCA ACCCTATCAG451 GCAGGCAAAA GCGGCTATGC CGCCGTGCAG AACGGACGCT ATGTGCTGGA501 AATCGACAGC GagggGGCGT TTTATttccg ccgccgccat tattgA它编码的蛋白质具有氨基酸序列<SEQ ID 372>:1 MLKIPFAVLG GCLLLAACGK SENTAEQPQN AAQSAPKPVF KVKYIDNTAI51 AGLALGQSSE GKTNDGKKQI SYPIKGLPEQ NAVRLTGKHP NDLEAVVGKC101 METDGKDAPS GWAENGVCHT LFAKLVGNIA EDGGKLTDYL ISHSALQPYQ151 AGKSGYAAVQ NGRYVLEIDS EGAFYFRRRH Y*
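Based on this analysis (including the presence in the gonococcal protein of a predicted prokaryotic membrane lipoprotein lipid attachment site (underlined) and a putative ATP/GTP-binding site motif A (P-loop, double-underlined)), it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.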
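The underlining that marked these motifs has not survived in this text. As a hedged illustration, the P-loop can be located again by scanning the gonococcal protein <SEQ ID 372> for the widely used ATP/GTP-binding site motif A consensus [AG]-x(4)-G-K-[ST]. That this consensus is the one applied in the original analysis is an assumption, and the names `P_LOOP` and `find_p_loops` are hypothetical.

```python
import re

# Motif A (P-loop) consensus [AG]-x(4)-G-K-[ST], written as a regular expression.
P_LOOP = re.compile(r"[AG].{4}GK[ST]")


def find_p_loops(protein: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched segment) for each P-loop match."""
    return [(m.start() + 1, m.group()) for m in P_LOOP.finditer(protein)]


# Residues 1-70 of ORF108ng <SEQ ID 372>.
orf108ng_1_70 = (
    "MLKIPFAVLGGCLLLAACGKSENTAEQPQNAAQSAPKPVFKVKYIDNTAIAGLALGQSSEGKTNDGKKQI"
)
print(find_p_loops(orf108ng_1_70))  # [(56, 'GQSSEGKT')]
```

Under this assumption the motif falls at residues 56 to 63 (GQSSEGKT) of the gonococcal sequence.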
Example 51
在脑膜炎奈瑟球菌中鉴定出下列DNA序列<SEQ ID 373>:1 ATGGAAGATT TATATATAAT ACTCGCTTTG GGTTTGGTTG CGATGATTGC51 CGgATTTATC GATgcgatTg cGggCGGGGG TGGTTTGATT ACGCTGCCCG101 CACTCTTGTT GGCAGGTATT CCTCCCGTGT CGGCAATTGC CACCAACAAG151 CTGCAAgCAG CCGCTGCTAC GTTTTCAGCT ACGGTTTCTT TTGCACGCAA201 AGGTTTGATT GATTGGAAGA AAGGTCTCCC GATTGCCGCA GCATCGTTTG251 TAGGCGGCGT GGcCGGTGCA TTATCGGTCA GCTTGGTTTC CAAAGATATT301 CTgCTgGCGG TCGTGCCGGT TTTGTTGATA TTTGTCGCAC TGTATTTTGT351 GTTTTCGCCC AAGCTCGACG GCAGTAAGGA AGGCAAAGCC AGAATGTCTT401 TTTTTCTGTT cGGGCTGACG GTCGC.ACCG CTTTTGGGTT TTTACGACGG451 TGTGTTCGGA CCGGGTGTCG GCTCGTTTTT TCTGATTGCC TTTATTGTTT501 TGCTCGGCTG CAAgCTGTTG AACGCGATGT CTTACACCAA ATTGGCGAAC551 GTTGCCTGCA ATCTTGGTTC GCTATCGGTA TTCCTGCTGC ACGGTTCGAT601 TATTTTCCCG ATTGCGGCAA CGaTGGCGGT CGGTGCGTTT GTCGGtGCGA651 ATTTAgGTGC GAGATTTGCC GTaCgctTCG GTTCGAAGCT GATTAA它对应于氨基酸序列<SEQ ID 374;ORF109>:1 MEDLYIILAL GLVAMIAGFI DAIAGGGGLI TLPALLLAGI PPVSAIATNA51 LQAAAATFSA TVSFARKGLI DWKKGLPIAA ASFVGGVAGA LSVSLVSKDI101 LLAVVPVLLI FVALYFVFSP KLDGSKEGKA RMSFFLFGLT VXTAFGFLRR151 CVRTGCRLVF SDCLYCFARL QAVERDVLHQ IGERCLQSWF AIGIPAARFD201 YFPDCGNDGG RCVCRCEFRC EICRTLRFEA D*进一步的工作揭示了下列DNA序列<SEQ ID 375>:1 ATGGAAGATT TATATATAAT ACTCGCTTTG GGTTTGGTTG CGATGATTGC51 CGGATTTATC GATGCGATTG CGGGCGGGGG TGGTTTGATT ACGCTGCCCG101 CACTCTTGTT GGCAGGTATT CCTCCCGTGT CGGCAATTGC CACCAACAAG151 CTGCAAGCAG CCGCTGCTAC GTTTTCAGCT ACGGTTTCTT TTGCACGCAA201 AGGTTTGATT GATTGGAAGA AAGGTCTCCC GATTGCCGCA GCATCGTTTG251 TAGGCGGCGT GGCCGGTGCA TTATCGGTCA GCTTGGTTTC CAAAGATATT301 CTGCTGGCGG TCGTGCCGGT TTTGTTGATA TTTGTCGCAC TGTATTTTGT351 GTTTTCGCCC AAGCTCGACG GCAGTAAGGA AGGCAAAGCC AGAATGTCTT401 TTTTTCTGTT CGGGCTGACG GTCGCACCGC TTTTGGGTTT TTACGACGGT451 GTGTTCGGAC CGGGTGTCGG CTCGTTTTTT CTGATTGCCT TTATTGTTTT501 GCTCGGCTGC AAGCTGTTGA ACGCGATGTC TTACACCAAA TTGGCGAACG551 TTGCCTGCAA TCTTGGTTCG CTATCGGTAT TCCTGCTGCA CGGTTCGATT601 ATTTTCCCGA TTGCGGCAAC GATGGCGGTC GGTGCGTTTG TCGGTGCGAA651 TTTAGGTGCG AGATTTGCCG TCCGCTTCGG TTCGAAGCTG ATTAAGCCGC701 TGCTGATTGT CATCAGCATT 
TCGATGGCTG TGAAATTGTT GATAGACGAG751 AGAAATCCGC TGTATCAGAT GATTGTTTCG ATGTTTTAA它对应于氨基酸序列<SEQ ID 376;ORF109-1>:1 MEDLYIILAL GLVAMIAGFT DAIAGGGGLI TLPALLLAGI PPVSAIATNK51 LQAAAATFSA TVSFARKGLI DWKKGLPIAA ASFVGGVAGA LSVSLVSKDI101 LLAVVPVLLI FVALYFVFSP KLDGSKEGKA RMSFFLFGLT VAPLLGFYDG151 VFGPGVGSFF LIAFIVLLGC KLLNAMSYTK LANVACNLGS LSVFLLHGSI201 IFPIAATMAV GAFVGANLGA RFAVRFGSKL IKPLLIVISI SMAVKLLIDE251 RNPLYQMIVS MF*
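The correspondence between a listed DNA sequence and its amino acid sequence can be checked by translating the reading frame with the standard genetic code. The following is a minimal sketch and not part of the original disclosure: the helper names `clean` and `translate` are hypothetical, and it assumes that any codon containing an unresolved base (written "." or as an ambiguity code in the listings) should be reported as X.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean(listing: str) -> str:
    """Keep ASCII letters and '.' placeholders; drop position numbers and spaces."""
    return "".join(
        ch for ch in listing.upper()
        if (ch.isascii() and ch.isalpha()) or ch == "."
    )


def translate(listing: str) -> str:
    """Translate frame 1 of a listing; unresolved codons become 'X' (assumption)."""
    dna = clean(listing)
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE.get(dna[pos:pos + 3], "X")
        protein.append(residue)
        if residue == "*":  # stop codon ends the ORF
            break
    return "".join(protein)


# First 30 bases of SEQ ID 375 as printed in the listing.
print(translate("1 ATGGAAGATT TATATATAAT ACTCGCTTTG"))
```

Applied to the first 30 bases of SEQ ID 375, this gives the first ten residues of SEQ ID 376 (MEDLYIILAL). Lowercase bases in the listings are treated here as ordinary bases, which is a simplification.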
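Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF109 shows 95.9% identity over a 147 aa overlap with an ORF (ORF109a) from strain A of N. meningitidis: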
10 20 30 40 50 60
orf109.pep MEDLYIILALGLVAMIAGFIDAIAGGGGLITLPALLLAGIPPVSAIATNKLQAAAATFSA
orf109a MEDLYIILALGLVAMIAGFIDAIAGGGGLITLPALLLAGIPPVSAIATNKLQAAAATFSA
10 20 30 40 50 60
70 80 90 100 110 120orf109.pep TVSFARKGLIDWKKGLPIAAASFVGGVAGALSVSLVSKDILLAVVPVLLIFVALYFVFSP
|||||||||||||||||||||||:|||:||||||||||||||||||||||||||||||||orf109a TVSFARKGLIDWKKGLPIAAASFAGGVVGALSVSLVSKDILLAVVPVLLIFVALYFVFSP
70 80 90 100 110 120
130 140 150 160 170 180orf109.pep KLDGSKEGKARMSFFLFGLTVXTAFGFLRRCVRTGCRLVFSDCLYCFARLQAVERDVLHQorf109a KLDGSKEGKARMSFFLFGLTVAPLLGFYDGVFGPGVGSFFLIAFIVLLGCKLLNAMSYTK
130 140 150 160 170 180全长ORF109a核苷酸序列<SEQ ID 377>是:1 ATGGAAGATT TATACATAAT ACTCGCTTTG GGTTTGGTTG CGATGATTGC51 CGGATTTATC GATGCGATTG CGGGTGGGGG TGGTTTGATT ACGCTGCCTG101 CACTCTTGTT GGCAGGTATT CCTCCCGTGT CGGCAATTGC CACCAACAAG151 CTGCAAGCAG CCGCTGCTAC GTTTTCGGCT ACGGTTTCTT TTGCACGCAA201 AGGTTTGATT GATTGGAAGA AAGGTCTCCC GATTGCGGCA GCATCGTTTG251 CAGGCGGCGT GGTCGGTGCA TTATCGGTCA GCTTGGTTTC CAAAGATATT301 CTGCTGGCGG TCGTGCCGGT TTTGTTGATA TTTGTCGCGC TGTATTTTGT351 GTTTTCGCCC AAGCTCGACG GCAGTAAGGA AGGCAAAGCC AGAATGTCTT401 TTTTTCTGTT CGGTCTGACG GTTGCACCAC TTTTGGGTTT TTACGACGGT451 GTGTTCGGAC CGGGTGTCGG CTCGTTTTTT CTGATTGCCT TTATTGTTTT501 GCTCGGCTGC AAGCTGTTGA ACGCGATGTC TTACACCAAA TTGGCGAACG551 TTGCCTGCAA TCTTGGTTCG CTATCGGTAT TCCTGCTGCA CGGTTCGATT601 ATTTTCCCGA TTGCGGCAAC GATGGCGGTC GGTGCGTTTG TCGGTGCGAA651 TTTAGGTGCG AGATTTGCCG TCCGCTTCGG TTCGAAGCTG ATTAAGCCGC701 TGCTGATTGT CATCAGCATT TCGATGGCTG TGAAATTGTT GATAGACGAG751 AGAAATCCGC TGTATCAGAT GATTGTTTCG ATGTTTTAA它编码的蛋白质具有氨基酸序列<SEQ ID 378>:1 MEDLYIILAL GLYAMIAGFI DAIAGGGGLI TLPALLLAGI PPVSAIATNK51 LQAAAATFSA TVSFARKGLI DWGKGLPIAA ASFAGGVVGA LSVSLVSKDI101 LLAVVPVLLI FVALYFVFSP KLDGSKEGKA RMSFFLFGLT VAPLLGFYDG151 VFGPGVGSFF LIAFIVLLGC KLLNAMSYTK LANVACNLGS LSVFLLHGSI201 IFPIAATMAV GAFVGANLGA RFAVRFGSKL IKPLLIVISI SMAVKLLIDE251 RNPLYQMIVS MF*ORF109a和ORF109-1显示在262个氨基酸的重叠区内有99.2%的相同性:
10 20 30 40 50 60orf109a.pep MEDLYIILALGLVAMIAGFIDAIAGGGGLITLPALLLAGIPPVSAIATNKLQAAAATFSA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf109-1 MEDLYIILALGLVAMIAGFIDAIAGGGGLITLPALLLAGIPPVSAIATNKLQAAAATFSA
10 20 30 40 50 60
70 80 90 100 110 120orf109a.pep TVSFARKGLIDWKKGLPIAAASFAGGVVGALSVSLVSKDILLAVVPVLLIFVALYFVFSP
|||||||||||||||||||||||:|||:||||||||||||||||||||||||||||||||
orf109-1 TVSFARKGLIDWKKGLPIAAASFVGGVAGALSVSLVSKDILLAVVPVLLIFVALYFVFSP
70 80 90 100 110 120
130 140 150 160 170 180orf109a.pep KLDGSKEGKARMSFFLFGLTVAPLLGFYDGVFGPGVGSFFLIAFIVLLGCKLLNAMSYTK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf109-1 KLDGSKEGKARMSFFLFGLTVAPLLGFYDGVFGPGVGSFFLIAFIVLLGCKLLNAMSYTK
130 140 150 160 170 180
190 200 210 220 230 240orf109a.pep LANVACNLGSLSVFLLHGSIIFPIAATMAVGAFVGANLGARFAVRFGSKLIKPLLIVISI
||||||||||||||||||||||||| ||||||||||||||||||||||||||||||||||orf109-1 LANVACNLGSLSVFLLHGSIIFPIAATMAVGAFVGANLGARFAVRFGSKLIKPLLIVISI
190 200 210 220 230 240
250 260orf109a.pep SMAVKLLIDERNPLYQMIVSMFX
|||||||||||||||||||||||orf109-1 SMAVKLLIDERNPLYQMIVSMFX
250 260
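The identity figures quoted for these overlaps can be reproduced by counting matching columns in an alignment. The sketch below is illustrative only: the function name `percent_identity` and the "-" gap symbol are assumptions, and the two rows are residues 61 to 120 of ORF109a and ORF109-1 as given in SEQ ID 378 and SEQ ID 376.

```python
def percent_identity(row_a: str, row_b: str, gap: str = "-"):
    """Return (matches, columns, percent) for two equal-length alignment rows."""
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have the same length")
    # Ignore columns where both rows carry a gap.
    columns = [(x, y) for x, y in zip(row_a, row_b) if not (x == gap and y == gap)]
    if not columns:
        raise ValueError("no aligned columns")
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return matches, len(columns), 100.0 * matches / len(columns)


# Residues 61-120 of the two proteins (they differ at positions 84 and 88).
ORF109A_61_120 = "TVSFARKGLIDWKKGLPIAAASFAGGVVGALSVSLVSKDILLAVVPVLLIFVALYFVFSP"
ORF109_1_61_120 = "TVSFARKGLIDWKKGLPIAAASFVGGVAGALSVSLVSKDILLAVVPVLLIFVALYFVFSP"

matches, columns, pct = percent_identity(ORF109A_61_120, ORF109_1_61_120)
print(f"{matches}/{columns} identical ({pct:.1f}%)")

# Two mismatches over the full 262 aa overlap give the quoted figure.
print(round(100 * 260 / 262, 1))
```

Over this 60-residue window the two rows differ at two positions. Assuming these are the only mismatches in the 262 aa overlap, 260/262 rounds to the 99.2% stated above.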
Homology with a predicted ORF from N. gonorrhoeae
ORF109 shows 98.3% identity over a 231 aa overlap with a predicted ORF (ORF109.ng) from N. gonorrhoeae:
orf109.pep MEDLYIILALGLVAMIAGFIDAIAGGGGLITLPALLLAGIPPVSAIATNKLQAAAATFSA 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf109ng MEDLYIILALGLVAMIAGFIDAIAGGGGLITLPALLLAGIPPVSAIATNKLQAAAATFSA 60orf109.pep TVSFARKGLIDWKKGLPIAAASFVGGVAGALSVSLVSKDILLAVVPVLLIFVALYFVFSP 120
|||||||||||||||||||||||:|||:||||||||||||||||||||||||||||||||orf109ng TVSFARKGLIDWKKGLPIAAASFAGGVVGALSVSLVSKDILLAVVPVLLIFVALYFVFSP 120orf109.pep KLDGSKEGKARMSFFLFGLTVXTAFGFLRRCVRTGCRLVFSDCLYCFARLQAVERDVLHQ 180
||||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||orf109ng KLDGSKEGKARMSFFLFGLTVATAFGFLRRCVRTGCRLVFSDCLYCFARLQAVERDVLHQ 180orf109.pep IGERCLQSWFAIGIPAARFDYFPDCGNDGGRCVCRCEFRCEICRTLRFEAD 231
|||||||||||||||||||||||||||||||||||||||||||| ||||||orf109ng IGERCLQSWFAIGIPAARFDYFPDCGNDGGRCVCRCEFRCEICRPLRFEAD 231
预计ORF109ng核苷酸序列<SEQ ID 379>编码的蛋白质具有氨基酸序列<SEQ ID380>:1 MEDLYIILAL GLVAMIAGFI DAIAGGGGLI TLPALLLAGI PPVSAIATNK51 LQAAAATFSA TVSFARKGLI DWKKGLPIAA ASFAGGVYGA LSVSLVSKDI101 LLAVVPVLLI FYALYFYFSP KLDGSKEGKA RMSFFLFGLT VATAFGFLRR151 CVRTGCRLVF SDCLYCFARL QAVERDVLHQ IGERCLQSWF AIGIPAARFD201 YFPDCGNDGG RCVCRCEFRC EICRPLRFEA D*进一步的工作揭示了下列淋球菌DNA序列<SEQ ID 381>:1 ATGGAAGATT TATACATAAT ACTCGCTTTG GGTTTGGTTG CGATGATCGC51 CGGATTTATC GATGCGATTG CGGGCGGGGG TGGTTTGATT ACGCTGCCTG101 CACTCTTGTT GGCAGGTATT CCTCCCGTGT CGGCAATTGC CACCAACAAG151 CTGCAAGCAG CCGCTGCTAC GTTTTCGGCT ACGGTTTCTT TTGCACGCAA201 AGGTTTGATT GATTGGAAGA AAGGTCTCCC GATTGCCGCA GCATCGTTTG251 CAGGCGGCGT GGTCGGTGCA TTATCGGTCA GCTTGGTTTC CAAAGATATT301 TTGCTGGCGG TCGTGCCGGT TTTGTTGATA TTTGTCGCGC TGTATTTTGT351 GTTTTCGCCC AAGCTCGACG GCAGTAAGGA AGGCAAAGCC AGAATGTCTT401 TTTTTCTATT CGGGCTGACG GTTGCACCGC TTTTGGGTTT TTACGACGGT451 GTGTTCGGAC CGGGTGTCGG CTCGTTTTTT CTGATTGCCT TTATTGTTTT501 GCTCGGCTGC AAGCTGTTGA ACGCGATGTC TTACACCAAA TTGGCGAACG551 TTGCTTGCAA TCTTGGTTCG CTATCGGTAT TCCTGCTGCA CGGTTCGATT601 ATTTTCCCGA TTGTGGCAAC GATGGCGGTC GGTGCGTTTG TCGGTGCGAA651 TTTAGGTGCG AGATTTGCCG TCCGCTTCGG TTCGAAGCTG ATTAAGCCGC701 TGCTGATTGT CATCAGCATT TCGATGGCTG TGAAATTGTT GATAGACGAG751 AGAAATCCGC TGTATCAGAT GATTGTTTCG ATGTTTTAA它对应于氨基酸序列<SEQ ID 382;ORF109ng-1>:1 MEDLYIILAL GLVAMIAGFI DAIAGGGGLI TLPALLLAGI PPVSAIATNK51 LQAAAATFSA TVSFARKGLI DWKKGLPIAA ASFAGGVVGA LSVSLVSKDI101 LLAVVPVLLI FVALYFVFSP KLDGSKEGKA RMSFFLFGLT VAPLLGFYDG151 VFGPGVGSFF LIAFIVLLGC KLLNAMSYTK LANVACNLGS LSVFLLHGSI201 IFPIVATMAV GAFVGANLGA RFAVRFGSKL IKPLLIVISI SMAVKLLIDE251 RNPLYQMIVS MF*ORF109ng-1和ORF109-1显示在262个氨基酸的重叠区内有98.9%的相同性:
10 20 30 40 50 60orf109ng-1.pep MEDLYIILALGLVAMIAGFIDAIAGGGGLITLPALLLAGIPPVSAIATNKLQAAAATFSA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf109-1 MEDLYIILALGLVAMIAGFIDAIAGGGGLITLPALLLAGIPPVSAIATNKLQAAAATFSA
10 20 30 40 50 60
70 80 90 100 110 120orf109ng-1.pep TVSFARKGLIDWKKGLPIAAASFAGGVVGALSVSLVSKDILLAVVPVLLIFVALYFVFSP
|||||||||||||||||||||||:|||:||||||||||||||||||||||||||||||||orf109-1 TVSFARKGLIDWKKGLPIAAASFVGGVAGALSVSLVSKDILLAVVPVLLIFVALYFVFSP
70 80 90 100 110 120
130 140 150 160 170 180orf109ng-1.pep KLDGSKEGKARMSFFLFGLTVAPLLGFYDGVFGPGVGSFFLIAFIVLLGCKLLNAMSYTK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf109-1 KLDGSKEGKARMSFFLFGLTVAPLLGFYDGVFGPGVGSFFLIAFIVLLGCKLLNAMSYTK
130 140 150 160 170 180
190 200 210 220 230 240orf109ng-1.pep LANVACNLGSLSVFLLHGSIIFPIVATMAVGAFVGANLGARFAVRFGSKLIKPLLIVISI
||||||||||||||||||||||||:|||||||||||||||||||||||||||||||||||orf109-1 LANVACNLGSLSVFLLHGSIIFPIAATMAVGAFVGANLGARFAVRFGSKLIKPLLIVISI
190 200 210 220 230 240
250 260orf109ng-1.pep SMAVKLLIDERNPLYQMIVSMFX
|||||||||||||||||||||||orf109-1 SMAVKLLIDERNPLYQMIVSMFX
250 260另外,ORF109ng-1显示出与一种假设的假单胞菌属蛋白同源:sp|P29942|YCB9_PSEDE COBO 3’区域中假设的27.4KD蛋白(ORF9)>gi|94984|pir||I38164假设蛋白9-假单胞菌属>gi|551929(M62866)ORF9[脱氮假单胞菌]长度=261评分=175位(439),估计值=3e-43相同性=83/214(38%),阳性=131/214(60%),空隙=1/214(0%)询问:41 PPVSAIATNKLQXXXXXXXXXXXXXRKGLIDWKKGLPIXXXXXXXXXXXXXXXXXXXKDI 100
PP+ + TNKLQ R+G ++ K+ LP+ D+目标:43 PPLQTLGTNKLQGLFGSGSATLSYARRGHVNLKEQLPMALMSAAGAVLGALLATIVPGDV 102询问:101 LLAVVPVLLIFVALYFVFSPKLDGSKEGKARMSFFLFGLTVAPLLGFYDGVFGPGVGSFF 160
L A++P LLI +ALYF P + G + +R++ F+F LT+ PL+GFYDGVFGPG GSFF目标:103 LKAILPFLLIAIALYFGLKPNM-GDVDQHSRVTPFVFTLTLVPLIGFYDGVFGPGTGSFF 161询问:161 LIAFIVLLGCKLLNAMSYTKLANVACNLGSLSVFLLHGSIIFPIVATMAVGAFVGANLGA 220
++ F+ L G +L A ++TK N N+G+ VFL G++++ + M +G F+GA +G+目标:162 MLGFVTLAGFGVLKATAHTKFLNFGSNVGAFGVFLFFGAVLWKVGLLMGLGQFLGAQVGS 221询问:221 RFAVRFGSKLIKPLLIVISISMAVKLLIDERNPL 254
R+A+ G+K+IKPLL+++SI++A++LL D +PL目标:222 RYAMAKGAKIIKPLLVIVSIALAIRLLADPTHPL 255
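Based on this analysis, including the presence of a putative leader sequence (double-underlined) and several putative transmembrane domains (single-underlined) in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 52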
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 383>:1 ..CTGCTAGGGT ATTGCATCGG TTATCGGTAC GGCTGTTGCA GCAAAACCAG51 CCGCAGACGG ATTATTTGGT CAAATTCGGA TCGTTTTGGG CGAG.ATTTT101 TGGTTTTCTG GGACTGTATG ACGTCTATGC TTCGGCATGG TTTGTCGTTA151 TCATGATGTT TTTGGTGGTT TCTACCAGTT TGTGCCTGAT TCGCAATGTG201 CCGCCGTTCT GGCGCGAAAT GAAGTCTTTT CGGGAAAAGG TTAAAGAAAA251 ATCTCTGGCG GCGATGCGCC ATTCTTCGCT GTTGGATGTA AAAATTGCGC301 CCGAGGTTGC CAAACGTTAT CTGGAAGTAC AAGGTTTTCA GGGGAAAACC351 ATTAACCGTG AAGACGGGTC GGTTCTGATT GCCGCCAAAA AAGGCACAAT401 GAACAAATGG GGCTATATCT TTGCCCATGT TGCTTTGATT GTCATTTGCC451 TGGGCGGGTT GATAGACAGT AACCTGCTGT TGAAACTGGG TATGCTGACC501 GGTCGGATTG TTCCGGACAA TCAGGCGGTT TATGCCAAGG ATTTC.AAGC551 CCGAAAGTAT .TTTGGGTGC gTCCAATCTC TCATTTAGGG GCAACGTCAA601 TATTTCCG.A GGGGCAGAgT GCGGATGTGG TTTTCCTGA它对应于氨基酸序列<SEQ ID 384 ORF110>:1 ..LLGIASVIGT LLQQNQPQTD YLVKFGSFWA XIFGFLGLYD VYASAWFVVI51 MMFLVVSTSL CLIRNVPPFW REMKSFREKV KEKSLAAMRH SSLLDVKIAP101 EVAKRYLEVQ GFQGKTINRE DGSVLIAAKK GTMNKQGYIF AHVALIVICL151 GGLIDSNLLL KLGMLTGRIF RTIRRFMPRI XKPESXFGCV QSLI*GQRQY201 FXRGRVRMWF S*该氨基酸序列的计算机分析给出了下列结果:
Homology with ORF88a from N. meningitidis (strain A)
ORF110 shows 91.5% identity over a 188 aa overlap with ORF88a from strain A of N. meningitidis:
10 20 30 40 50 60orf88a.pep MSKSRRSPPLLSRPWFAFFSSMRFAVALLSLLGIASVIGTVLQQNQPQTDYLVKFGSFWA
||||||||||:|||||||||||||||||||orf110 LLGIASVIGTLLQQNQPQTDYLVKFGSFWA
10 20 30
70 80 90 100 110 120orf88a.pep QIFGFLGLYDVYASAWFVVIMMFLVVSTSLCLIRNVPPFWREMKSFREKVKEKSLAAMRH
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf110 XIFGFLGLYDVYASAWFVVIMMFLVVSTSLCLIRNVPPFWREMKSFREKVKEKSLAAMRH
40 50 60 70 80 90
130 140 150 160 170 180orf88a.pep SSLLDVKIAPEVAKRYLEVGGFQGKTINREDGSVLIAAKKGTMNKWGYIFAHVALIVICL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf110 SSLLDVKIAPEVAKRYLEVQGFQGKTINREDGSVLIAAKKGTMNKWGYIFAHVALIVICL
100 110 120 130 140 150
190 200 210 220 230 240orf88a.pep GGLIDSNLLLKLGMLTGRIVPDNQAVYAKDFKPESILGASNLSFRGNVNISEGQSADVVF
||||||||||||||||||| : : : |||| :|orf110 GGLIDSNLLLKLGMLTGRIFRTIRRFMPRIXKPESXFGCVQSLIXGQRQYFXRGRVRMWF
160 170 180 190 200 210
250 260 270 280 290 300orf88a.pep LNADNGILVQDLPFEVKLKKFHIDFYNTGMPRDFASDIEVTDKATGEKLERTIRVNHPLTorf110 SX
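However, ORF88 and ORF110 do not align, because they represent two different fragments of the same protein.
Homology with a predicted ORF from N. gonorrhoeae
ORF110 shows 88.6% identity over a 211 aa overlap with a predicted ORF (ORF110.ng) from N. gonorrhoeae:
orf110.pep LLGIASVIGTLLQQNQPQTDYLVKFGSFWA 30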
||||||||||:||||||||||||||| ||:orf110ng MSKSRISPTLLSRPWFAFFSSMRFAVALLSLLGIASVIGTVLQQNQPQTDYLVKFGPFWT 60orf110.pep XIFGFLGLYDVYASAWFVVIMMFLVVSTSLCLIRNVPPFWREMKSFREKVKEKSLAAMRH 90
||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf110ng RIFDFLGLYDVYASAWFVVIMMFLVVSTSLCLIRNVPPFWREMKSFREKVKEKSLAAMRH 120orf110.pep SSLLDVKIAPEVAKRYLEVQGFQGKTINREDGSVLIAAKKGTMNKWGYIFAHVALIVICL 150
|||||||||||||||||||:||||||::||||||||||||||||||||| ||||||||||orf110ng SSLLDVKIAPEVAKRYLEVRGFQGKTVSREDGSVLIAAKKGTMNKWGYIXAHVALIVICL 180orf110.pep GGLIDSNLLLKLGMLTGRIFRTIRRFMPRIXKPESXFGCVQSLIXGQRQYFXRGRVRMWF 210
| ||: |||||||||:| |||: || |||| |||| :| ||||| |||||| ||:|||||orf110ng GRLINXNLLLKLGMLAGSIFRNNRRVMPRISKPESIWGGVQSLIKGQRQYFQRGKVRMWF 240
orf110.pep S 211
|
orf110ng S 241
预计全长ORF110ng核苷酸序列<SEQ ID 385>编码的蛋白质具有氨基酸序列<SEQ ID 386>:1 MSKSRISPTL LSRPWFAFFS SMRFAVALLS LLGIASVIGT VLQQNQPQTD51 YLVKFGPFWT RIFDFLGLYD VYASAWFVVI MMFLVVSTSL CLIRNVPPFW101 REMKSFREKV KEKSLAAMRH SSLLDYKIAP EVAKRYLEVR GFQGKTVSRE151 DGSVLIAAKK GTMNKWGYIX AHVALIVICL GRLINXNLLL KLGMLAGSIF201 RNNRRVMPRI SKPESIWGGV QSLIKGQRQY FQRGKVPMWF S*
根据淋球菌蛋白中存在推定的跨膜结构域的结果,预计脑膜炎奈瑟球菌和淋病奈瑟球菌的此蛋白质及其表位可用作疫苗或诊断用的抗原,或用来产生抗体。实施例53在脑膜炎奈瑟球菌中鉴定出下列DNA序列<SEQ ID 387>:1 ATGCCGTCTG AAACACGCCT GCGGAACTTT ATCCGCGTCT TGATATTTGC51 CCTGGGTTTC ATCTTCCTGA ACGCCTGTTC GGAACAAACC GCGCAAACCG101 TTACCCTGCA AGGCGAAACG ATGGGCACGA CCTATACCGT CAAATACCTT151 TCAAATAATC GGGACAAACT CCCCTCACCT GCCGAAATAC AAAAACGCAT201 CGATGACGCG CTTAAAGAAG TCAACCGGCA GATGTCCACC TATCAGCCCG251 ACTCCGAAAT CAGCCGGTTC AACCAACACA CAGCCGGCAA GCCCCTCCGC301 ATTTCAAGCG ACTTCGCACA CGTTACTGCC GAAGCCGTCC GCCTGAACCG351 CCTGACACAC GGCGCGCTGG ACGTAACCGT CGGCCCCTTG GTCAACCTTT401 GGGGATTCGG CCCCGACAAA TCCGTTACCC GTGAACCGTC GCCGGAACAA451 ATCAAACAGG CGGCATCTTA TACGGGCATA GACAAAATCA TTTTGAAACA501 AGGCAAAGAT TACGCTTCCT TGAGCAAAAC CCACCCCAAG GCCTATTTGG551 ATTTATCTTC GATTGCCAAA GGCTTCGGCG TTGATAAAGT TGCGGGCGAA601 CTGGAAAAAT ACGGCATTCA AAATTATCTG GTCGAAATCG GCGGCGAGTT651 GCACGGCAAA GGCAAAAACG CGCGCGGCGA ACCGTGGCGC ATCGGTATCG701 AGCAGCCCAA TATCGTCCAA GGCGGCAATA CGCAGATTAT CGTCCCGCTG751 AACAACCGTT CGCTTGCCAC TTCCGGCGAT TACCGTATTT TCCACGTCGA801 TAAAAACGGC AAACGCCTCT CCCATATCAT CAACCCGAAC AACAAACGAC851 CCATCAGCCA CAACCTCGCC TCCATCAGCG TGGTCGCAGA CAGTGCGATG901 ACGGCGGACG GCTTGTCCAC AGGATTATTC GTATTGGGCG AAACCGAAGC951 CTTAAAGCTG GCAGAGCGCG AAAAACTCGC TGTTTTCCTG ATTGTCAGGG1001 ATAAAGGCGG CTACCGCACC GCCATGTCTT CCGAATTTGA AAAACTGCTC1051 CGCTAA它对应于氨基酸序列<SEQ ID 388;ORF111>:1 MPSETRLPNF IRVLIFALGF IFLNACSEQT AQTVTLQGET MGTTYTVKYL51 SNNRDKLPSP AEIQKRIDDA LKEVNRQMST YQPDSEISRF NQHTAGKPLR101 ISSDFAHVTA EAVRLNRLTH GALDVTVGPL VNLWGFGPDK SVTREPSPEQ151 IKQAASYTGI DKIILKQGKD YASLSKTHPK AYLDLSSIAK GFGVDKVAGE201 LEKYGIQNYL VEIGGELHGK GKNARGEPWR IGIEQPNIVQ GGNTQIIVPL251 NNRSLATSGD YRIFHVDKNG KRLSHIINPN NKRPISHNLA SISVVADSAM301 TADGLSTGLF VLGETEALKL AEREKLAVFL IVRDKGGYRT AMSSEFEKLL351 R*
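Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF111 shows 96.9% identity over a 351 aa overlap with an ORF (ORF111a) from strain A of N. meningitidis: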
10 20 30 40 50 60orf111a.pep MPSETRLPNFIRTLIFALSFIFLNACSEQTAQTVTLQGETMGTTYTVKYLSNNRDXLPSP
||||||||||||:|||||:|||||||||||||||||||||||||||||||||||| ||||orf111 MPSETRLPNFIRVLIFALGFIFLNACSEQTAQTVTLQGETMGTTYTVKYLSNNRDKLPSP
10 20 30 40 50 60
70 80 90 100 110 120orf111a.pep AEIQXRIDDALKEVNRQMSTYQPDSEISRFNQHTAGKPLRISSDFAHVTAEAVHLNRLTH
|||| ||||||||||||||||||||||||||||||||||||||||||||||||:||||||orf111 AEIQKRIDDALKEVNRQMSTYQPDSEISRFNQHTAGKPLRISSDFAHVTAEAVRLNRLTH
70 80 90 100 110 120
130 140 150 160 170 180orf111a.pep GALDVTVGPLVNLWGFGPDKSVTREPSPEQIKQAASYTGIDKIILKQGKDYASLSKTHPK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf111 GALDVTVGPLVNLWGFGPDKSVTREPSPEQIKQAASYTGIDKIILKQGKDYASLSKTHPK
130 140 150 160 170 180
190 200 210 220 230 240orf111a.pep AYLDLSSIAKGFGVDXVAGELEKYGIQNYLVEIGGELHGKXKNARGEPWRIGIEQPNIVQ
||||||||||||||| |||||||||||||||||||||||| |||||||||||||||||||orf111 AYLDLSSIAKGFGVDKVAGELEKYGIQNYLVEIGGELHGKGKNARGEPWRIGIEQPNIVQ
190 200 210 220 230 240
250 260 270 280 290 300orf111a.pep GGNTQIIVPLNNRSXATSGDYRIFHVDKSGKRLSHIINPNNKRPISHNLASISVXADSAM
|||||||||||||| |||||||||||||:||||||||||||||||||||||||| |||||orf111 GGNTQIIVPLNNRSLATSGDYRIFHVDKNGKRLSHIINPNNKRPISHNLASISVVADSAM
250 260 270 280 290 300
310 320 330 340 350orf111a.pep TADGXSTGLFVLGETEALKLAEREKLAVFLIVRDKGGYRTAMSSEFEKLLRX
|||| |||||||||||||||||||||||||||||||||||||||||||||||orf111 TADGLSTGLFVLGETEALKLAEREKLAVFLIVRDKGGYRTAMSSEFEKLLRX
310 320 330 340 350全长ORF111a核苷酸序列<SEQ ID 389>是:1 ATGCCGTCTG AAACACGCCT GCCGAACTTT ATCCGCACCT TGATATTTGC51 CCTGAGTTTT ATCTTCCTGA ACGCCTGTTC GGAACAAACC GCGCAAACCG101 TTACCCTGCA AGGTGAAACG ATGGGCACGA CCTATACCGT CAAATACCTT151 TCAAATAATC GGGACNAACT CCCNTCACCT GCCGAAATAC AAAANCGCAT201 CGATGACGCG CTTAAAGAAG TCAACCGGCA GATGTCCACC TATCAGCCCG251 ACTCCGAAAT CAGCCGGTTC AACCAACACA CAGCCGGCAA GCCCCTCCGC301 ATTTCAAGCG ACTTCGCACA CGTTACTGCC GAAGCCGTCC ACCTGAACCG351 CCTGACACAC GGCGCGCTGG ACGTAACCGT CGGCCCCTTG GTCAACCTTT401 GGGGATTCGG CCCCGACAAA TCCGTTACCC GTGAACCGTC GCCGGAACAA451 ATCAAACAAG CAGCATCTTA TACGGGCATA GACAAAATCA TTTTGAAACA501 AGGCAAAGAT TACGCTTCCT TGAGCAAAAC CCACCCCAAG GCCTATTTGG551 ATTTATCTTC GATTGCCAAA GGCTTCGGCG TTGATNANGT TGCGGGCGAA601 CTGGAAAAAT ACGGCATTCA AAATTATCTG GTCGAAATCG GCGGNGAGTT651 GCACGGCAAA GNCAAAAACG CGCGCGGCGA ACCTTGGCGC ATCGGCATCG701 AACAGCCCAA CATCGTCCAA GGCGGCAATA CGCAGATTAT CGTCCCGCTG751 AACAACCGTT CGNTTGCCAC TTCCGGCGAT TACCGTATTT TCCACGTCGA801 TAAAAGCGGC AAACGCCTCT CCCATATCAT TAATCCGAAC AACAAACGAC851 CCATCAGCCA CAACCTCGCC TCCATCAGCG TGNTCGCAGA CAGTGCGATG901 ACGGCGGACG GCTTNTCCAC AGGATTATTC GTATTGGGCG AAACCGAAGC951 CTTAAAGCTG GCAGAGCGCG AAAAACTCGC TGTTTTCCTG ATTGTCAGGG1001 ATAAAGGCGG CTACCGCACC GCCATGTCTT CCGAATTTGA AAAACTGCTC1051 CGCTAA它编码的蛋白质具有氨基酸序列<SEQ ID 390>:
1 MPSETRLPNF IRTLIFALSF IFLNACSEQT AQTVTLQGET MGTTYTVKYL 51 SNNRDXLPSP AEIQXRIDDA LKEVNRQMST YQPDSEISRF NQHTAGKPLR101 ISSDFAHVTA EAVHLNRLTH GALDVTVGPL VNLWGFGPDK SVTREPSPEQ151 IKQAASYTGI DKIILKQGKD YASLSKTHPK AYLDLSSIAK GFGVDXVAGE201 LEKYGIQNYL VEIGGELHGK XKNARGEFWR IGIEQPNIYQ GGNTQIIVPL251 NNRSXATSGD YRIFHVDKSG KRLSHIINPN NKRPISHNLA SISVXADSAM301 TADGXSTGLF VLGETEALKL AEREKLAVFL IVRDKGGYRT AMSSEFEKLL351 R*
Homology with a predicted ORF from N. gonorrhoeae
ORF111 shows 96.6% identity over a 351 aa overlap with a predicted ORF (ORF111.ng) from N. gonorrhoeae:
10 20 30 40 50 60orf111ng MPSETRLPNLIRALIFALGFIFLNACSEQTAQTVTLQGETMGTTYTVKYLSNNRDKLPSP
|||||||||:||:|||||||||||||||||||||||||||||||||||||||||||||||orf111 MPSETRLPNFIRVLIFALGFIFLNACSEQTAQTVTLQGETMGTTYTVKYLSNNRDKLPSP
10 20 30 40 50 60
70 80 90 100 110 120
orf111ng AKIQKRIDDALKEVNRQMSTYQTDSEISRFNQHTAGKPLRISSDFAHVTAEAVRLNRLTH
|:|||||||||||||||||||| |||||||||||||||||||||||||||||||||||||orf111 AEIQKRIDDALKEVNRQMSTYQPDSEISRFNQHTAGKPLRISSDFAHVTAEAVRLNRLTH
70 80 90 100 110 120
130 140 150 160 170 180orf111ng GALDVTVGPLVNLWGFGPDKSVTREPSPEQIKQAASYTGIDKIILQQGKDYASLSKTHPK
|||||||||||||||||||||||||||||||||||||||||||||:||||||||||||||orf111 GALDVTVGPLVNLWGFGPDKSVTREPSPEQIKQAASYTGIDKIILKQGKDYASLSKTHPK
130 140 150 160 170 180
190 200 210 220 230 240
orf111ng AYLDLSSIAKGFGVDKVAGELEKYGIQNYLVEIGGELHGKGKNAHGEPWRIGIEQPNIIQ
||||||||||||||||||||||||||||||||||||||||||||:|||||||||||||:|orf111 AYLDLSSIAKGFGVDKVAGELEKYGIQNYLVEIGGELHGKGKNARGEPWRIGIEQPNIVQ
190 200 210 220 230 240
250 260 270 280 290 300orf111ng GGNTQIIVPLNNRSLATSGDYRIFHVDKNGKRLSHIINPNNKRPISHNLASISVVSDSAM
|||||||||||||||||||||||||||||||||||||||||||||||||||||||:||||orf111 GGNTQIIVPLNNRSLATSGDYRIFHVDKNGKRLSHIINPNNKRPISHNLASISVVADSAM
250 260 270 280 290 300
310 320 330 340 350orf111ng TADGLSTGLFVLGETEALRLAEQEKLAVFLIVRDKDGYRTAMSSEFAKLLRX
||||||||||||||||||:|||:|||||||||||| |||||||||| |||||orf111 TADGLSTGLFVLGETEALKLAEREKLAVFLIVRDKGGYRTAMSSEFEKLLRX
310 320 330 340 350全长ORF111ng核苷酸序列<SEQ ID 391>是:1 ATGCCGTCTG AAACACGCCT GCCGAACCTT ATCCGCGCCT TGATATTTGC51 CCTGGGTTTC ATCTTCCTGA ACGCCTGTTC GGaacaaacC GCGCAaaccg101 TTACCCTGCA AGGCGAAAcg aTGGGTACGA CCTATACCGT CAAATACCTT151 TCAAATAATC GGGACAAACT CCCCTCCCCT GCCAAAATAC AAAAGCGCAT201 TGATGATGCG CTTAAAGAAG TCAACCGGCA GATGTCCACC TACCAGACCG251 ATTCCGAAAT CAGCCGGTTC AACCAACACA CAGCCGGCAA GCCCCTCCGC301 ATTTCAAGCG ATTTCGCACA CGTTACCGCC GAAGCCGTCC GCCTGAACCG351 CCTGACTCAC GGCGCACTGG ACGTAACCGT CGGCCCTTTG GTCAACCTTT401 GGGGGTTCGG CCCCGACAAA TCCGTTACCC GTGAACCGTC GCCGGAACAA451 ATCAAACAGG CGGCATCTTA TACGGGCATA GACAAAATCA TTTTGCAACA501 AGGCAAAGAT TACGCTTCCT TGAGCAAAAC CCACCCCAAA GCCTATTTGG 551 ATTTATCTTC GATTGCCAAA GGCTTCGGCG TTGATAAAGT TGCGGGCGAA601 CTGGAAAAAT ACGGCATTCA AAATTATCTG GTCGAAAtcg gcggcGAGTT651 GCACGGCAAA GGCAAAAATG CGCACGGCGA ACCGTGGCGC ATCGGTATAG701 AGCAACCCAA TATCATCCAA GgcgGCAata CGCAGATTAt cgtcccgctg751 aaCaaccgtt cgctTGCCAC TTCCGGCGAT TAccgtaTTT tccacgtcgA801 TAAAAAcggc aaacgccttt cccacaTCAT CAATCCCaAC aacAAACgac851 ccATCAGcca caacctcgcc tccatcagcg tggtctcAGA CAGTGCAATG901 ACGGCGGACG GTTtatCCAC AGGATTATTT GTTTTAGGCG AAACCGAAGC951 CTTAAGGCTG GCAGAACAAG AAAAACTCGC TGTTTTCCTA ATTGTCCGGG1001 ATAAGGACGG CTACCGCACC GCCATGTCTT CCGAATTTGC CAAGCTGCTC1051 CGTAA它编码的蛋白质具有氨基酸序列<SEQ ID 392>:1 MPSETRLPNL IRALIFALGF IFLNACSEQT AQTVTLQGET MGTTYTVKYL51 SNNRDKLPSP AKIQKRIDDA LKEVNRQMST YQTDSEISRF NQHTAGKPLR101 ISSDFAHVTA EAVRLNRLTH GALDVTVGPL VNLWGFGPDK SVTREPSPEQ151 IKQAASYTGI DKIILQAGKD YASLSKTHPK AYLDLSSIAK GFGVDKVAGE201 LEKYGIQNYL VEIGGELHGK GKNAHGEPWR IGIEQPNIIQ GGNTQIIVPL251 NNRSLATSGD YRIFHVDKNG KRLSHIINPN NKRPISHNLA SISVVSDSAM301 TADGLSTGLF VLGETEALRL AEQEKLAVFL IVRDKDGYRT AMSSEFAKLL351 R*
This protein shows homology with a hypothetical lipoprotein precursor from H. influenzae:
sp|P44550|YOJL_HAEIN HYPOTHETICAL LIPOPROTEIN HI0172 PRECURSOR >gi|1074292|pir|4 hypothetical protein HI0172 - Haemophilus influenzae (strain Rd KW20) >gi|1573128 (U32702) hypothetical [Haemophilus influenzae] Length = 346
Score = 353 bits (896), Expect = 9e-97
Identities = 181/344 (52%), Positives = 247/344 (71%), Gaps = 4/344 (1%)
Query: 7 LPNLIRALIFALGFIFLNACSEQTAQTVTLQGETMGTTYTVKYLSNNRDKLPSPAKIQKR 66
+ LI +I + L AC ++T + ++L G+TMGTTY VKYL + S K +
Sbjct: 1 MKKLISGIIAVAMALSLAACQKET-KVISLSGKTMGTTYHVKYLDDGSITATSE-KTHEE 58
Query: 67 IDDALKEVNRQMSTYQTDSEISRFNQHT-AGKPLRISSDFAHVTAEAVRLNRLTHGALDV 125
I+ LK+VN +MSTY+ DSE+SRFNQ+T P+ IS+DFA V AEA+RLN++T GALDV
Sbjct: 59 IEAILKDVNAKMSTYKKDSELSRFNQNTQVNTPIEISADFAKVLAEAIRLNKVTEGALDV 118
Query: 126 TVGPLVNLWGFGPDKSVTREPSPEQIKQAASYTGIDKIILQQGKDYASLSKTHPKAYLDL 185
TVGP+VNLWGFGP+K ++P+PEQ+ + ++ GIDKI L K+ A+LSK P+ Y+DL
Sbjct: 119 TVGPVVNLWGFGPEKRPEKQPTPEQLAERQAWVGIDKITLDTNKEKATLSKALPQVYVDL 178
Query: 186 SSIAKGFGVDKVAGELEKYGIQNYLVEIGGELHGKGKNAHGEPWRIGIEQPNIIQGGNTQ 245
SSIAKGFGVD+VA +LE+ QNY+VEIGGE+ KGKN G+PW+I IE+P +
Sbjct: 179 SSIAKGFGVDQVAEKLEQLNAQNYMVEIGGEIRAKGKNIEGKPWQIAIEKPTTTGERAVE 238
Query: 246 IIVPLNNRSLATSGDYRIFHVDKNGKRLSHIINPNNKRPISHNLASISVVSDSAMTADGL 305
++ LNN +A+SGDYRI+ ++NGKR +H I+P PI H+LASI+V++ ++MTADGL
Sbjct: 239 AVIGLNNMGMASSGDYRIY-FEENGKRFAHEIDPKTGYPIQHHLASITVLAPTSMTADGL 297
Query: 306 STGLFVLGETEALRLAEQEKLAVFLIVRDKDGYRTAMSSEFAKL 349
STGLFVLGE +AL +AE+ LAV+LI+R +G+ T SS F KL
Sbjct: 298 STGLFVLGEDKALEVAEKNNLAVYLIIRTDNGFVTKSSSAFKKL 341
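The summary line of a BLAST report of this kind gives the identity, positive and gap counts as fractions of the alignment length. As a rough sketch (assuming the standard English wording of the statistics line; the function name `parse_stats` is hypothetical), the counts can be extracted and the percentages recomputed as follows.

```python
import re

# Matches e.g. "Identities = 181/344" in a BLAST statistics line.
STAT = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)")


def parse_stats(line: str) -> dict:
    """Map each statistic name (lowercased) to (count, alignment_length)."""
    return {name.lower(): (int(num), int(den)) for name, num, den in STAT.findall(line)}


line = "Identities = 181/344 (52%), Positives = 247/344 (71%), Gaps = 4/344 (1%)"
stats = parse_stats(line)
for name, (count, length) in stats.items():
    # Integer division reproduces the whole-number percentages shown in the report.
    print(name, f"{count}/{length}", f"{100 * count // length}%")
```

For the values reported here, rounding down reproduces the printed percentages (52%, 71% and 1%).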
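Based on this analysis, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 54
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 393>: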
1 ..CCGTGCCGCC GACAGGGCGA CGACGTGTAT GCGGCGCACG CGTCCCGTCA 51 AAAATTGTGG CTGCGCTTCA TCGGCGGCCG GTCGCATCAA AATATACGGG101 GCGGCGCGGC TGCGGACGGG TGGCGCAAAG GCGTGCAAAT CGGCGGCGAG151 GTGTTTGTAC GGCAAAATGA AGGCAGCCkA yTGGCAATCG GCGTGATGGG201 CGGCAGGGCC GGCCAGCACG CwTCAGTCAA CGGCAAAGGC GGTGCGGCAG251 gCAGTGATTT GTATGGTTAT GgCGGGGgTG TTTATGCTgC GTGGCATCAG301 TTGCGCGATA AACAAACGGG TgCGTATTTG GACGGCTGGT TGCAATACCA351 ACGTTTCAAA CACCGCATCA ATGATGAAAA CCGTGCGGAA CgCTACAAAA401 CCAAAGGTTG GACGGCTTCT GTCGAAGGCG GCTACAACGC GCTTGTGGCG451 GAAGGCATTG TCGGAAAAGG CAATAATGTG CGGTTTTACC TACAACCGCA501 GgCGCAGTTT ACCTACTTGG GCGTAAACGG CGGCTTTACC GACAGCGAGG551 GGACGGCGGT CGGACTGCTC GGCAGCGGTC AGTGGCAAAG CCGCGCCGGC601 AtTCGGGCAA AAACCCGTTT TGCTTTGCGT AACGGTGTCA ATCTTCAGCC651 TTTTGCCGCT TTTAATGTtt TGCACAGGTC AAAATCTTTC GGCGTGGAAA701 TGGACGGCGA AAAACAGACG CTGGCAGGCA GGACGGCACT CGAAGGGCGG751 TTCGGTATTG AAGCCGGTTG GAAAGGCCAT ATGTCCGCA..它对应于氨基酸序列<SEQ ID 394;ORF35>:1..PCRRQGDDVY AAHASRQKLW LRFIGGRSHQ NIRGGAAADG WRKGVQIGGE51 VFVRQNEGSX LAIGVMGGRA GQHASVNGKG GAAGSDLYGY GGGVYAAWHQ101 LRDKQTGAYL DGWLQYQRFK HRINDENRAE RYKTKGWTAS VEGGYNALVA151 EGIVGKGNNV RFYLQPQAQF TYLGVNGGFT DSEGTAVGLL GSGQWQSRAG201 IRAKTRFALR NGVNLQPFAA FNYLHRSKSF GVEMDGEKQT LAGRTALEGR251 FGIEAGWKGH MSA..该氨基酸序列的计算机分析给出了下列结果:与脑膜炎奈瑟球菌的推定分泌性VirG-同系物(登录号A32247)的同源性ORF和virg-h蛋白显示在261个氨基酸的重叠区内有51%的氨基酸相同性:Orf35 5 QGDDVYAAHASRQKLWLRFIGGRSHQNIRGGAA-ADGWRKGVQIGGEVFVRQNEGSXLAI 63
+ D++ R+ LWLR I G S+Q ++G A +G+RKGVQ+GGEVF QNE + L+Ivirg-h 396 KNSDIFDRTLPRKGLWLRVIDGHSNQWVQGKTAPVEGYRKGVQLGGEVFTWQNESNQLSI 455Orf35 64 GVMGGRAGQHASVNGKG--GAAGSDLYGYGGGVYAAWHQLRDKQTGAYLDGWLQYQRFKH 121
G+MGG+A Q ++ + ++ G+G GVYA WHQL+DKQTGAY D W+QYQRF+Hvirg-h 456 GLMGGQAEQRSTFHNPDTDNLTTGNVKGFGAGVYATWHQLQDKQTGAYADSWMQYQRFRH 515Orf35 122 RINDENRAERYKTKGWTASVEGGYNALVAEGIVGKGNNVRFYLQPQAQFTYLGVNGGFTD 181
RIN E+ ER+ +KG TAS+E GYNAL+AE KGN++R YLQPQAQ TYLGVNG F+Dvirg-h 516 RINTEDGTERFTSKGITASIEAGYNALLAEHFTKKGNSLRVYLQPQAQLTYLGVNGKFSD 575Orf35 182 SEGTAVGLLGSGQWQSRAGIRAKTRFALRNGVNLQPFAAFNVLHRSKSFGVEMDGEKQTL 241
SE V LLGS Q Q+R G++AK +F+L + ++PFAA N L+ +K FGVEMDGE++ +virg-h 576 SENAHVNLLGSRQLQTRVGVQAKAQFSLYKNIAIEPFAAVNALYHNKPFGVEMDGERRVI 635Orf35 242 AGRTALEGRFGIEAGWKGHMS 262
+TA+E + G+ K H++virg-h 636 NNKTAIESQLGVAVKIKSHLT 656
Homology with a predicted ORF from N. meningitidis (strain A)
ORF35 shows 96.9% identity over a 259 aa overlap with an ORF (ORF35a) from strain A of N. meningitidis:
10 20 30orf35. pep PCRRQGDDVYAAHASRQKLWLRFIGGRSHQNIRG
:||||||| ||||||||||||| ||||||orf35a QRLAIPEAEAVLYAQQAYAANTLFGLRAADRGDDVYAADPSRQKLWLRFIGGRSHQNIRG
310 320 330 340 350 360
40 50 60 70 80 90orf35.pep GAAADGWRKGVQIGGEVFVRQNEGSXLAIGVMGGRAGQHASVNGKGGAAGSDLYGYGGGV
|||||| |||||||||||||||||| ||||||||||||||||||||||||| |:||||||orf35a GAAADGRRKGVQIGGEVFVRQNEGSRLAIGVMGGRAGQHASVNGKGGAAGSYLHGYGGGV
370 380 390 400 410 420
100 110 120 130 140 150orf35.pep YAAWHQLRDKQTGAYLDGWLQYQRFKHRINDENRAERYKTKGWTASVEGGYNALVAEGIV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||:|orf35a YAAWHQLRDKQTGAYLDGWLQYQRFKHRINDENRAERYKTKGWTASVEGGYNALVAEGVV
430 440 450 460 470 480
160 170 180 190 200 210orf35.pep GKGNNVRFYLQPQAQFTYLGVNGGFTDSEGTAVGLLGSGQWQSRAGIRAKTRFALRNGVN
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf35a GKGNNVRFYLQPQAQFTYLGVNGGFTDSEGTAVGLLGSGQWQSRAGIRAKTRFALRNGVN
490 500 510 520 530 540
220 230 240 250 260orf35.pep LQPFAAFNVLHRSKSFGVEMDGEKQTLAGRTALEGRFGIEAGWKGHMSA
|||||||||||||||||||||||||||||||||||||||||||||||||orf35a LQPFAAFNVLHRSKSFGVEMDGEKQTLAGRTALEGRFGIEAGWKGHMSARIGYGKRTDGD
550 560 570 580 590 600orf35a KEAALSLKWLFX
610 620全长ORF35a核苷酸序列<SEQ ID 395>是:1 ATGTTCAGAG CTCAGCTTGG TTCAAATACT CGTTCTACCA AAATCGGCGA51 CGATGCCGAT TTTTCATTTT CAGACAAGCC GAAACCCGGC ACTTCCCATT101 ATTTTTCCAG CGGTAAAACC GATCAAAATT CATCCGAATA TGGGTATGAC151 GAAATCAATA TCCAAGGTAA AAACTACAAT AGCGGCATAC TCGCCGTCGA201 TAATATGCCC GTTGTTAAGA AATATATTAC AGATACTTAC GGGGATAATT251 TAAAGGATGC GGTTAAGAAG CAATTACAGG ATTTATACAA AACAAGACCC301 GAAGCTTGGG AAGAAAATAA AAAACGGACT GAGGAGGCGT ATATAGAACA351 GCTTGGACCA AAATTTAGTA TACTCAAACA GAAAAACCCC GATTTAATTA401 ATAAATTGGT AGAAGATTCC GTACTCACTC CTCATAGTAA TACATCACAG451 ACTAGTCTCA ACAACATCTT CAATAAAAAA TTACACGTCA AAATCGAAAA501 CAAATCCCAC GTCGCCGGAC AGGTGTTGGA ACTGACCAAG ATGACGCTGA551 AAGATTCCCT TTGGGAACCG CGCCGCCATT CCGACATCCA TATGCTGGAA601 ACTTCCGATA ATGCCCGCAT CCGCCTGAAC ACGAAAGATG AAAAACTGAC651 CGTCCATAAA GCGTATCAGG GCGGTGCGGA TTTCCTGTTC GGCTACGACG701 TGCGGGAGTC GGACAAACCC GCCCTGACCT TTGAAGAAAA AGTCAGCGGA751 CAATCCGGCG TGGTTTTGGA ACGCCGGCCG GAAAATCTGA AAACGCTCGA801 CGGGCGCAAA CTGATTGCGG CGGAAAAGGC AGACTCTAAT TCGTTTGCGT851 TTAAACAAAA TTACCGGCAG GGACTGTACG AATTATTGCT CAAGCAATGC901 GAAGGCGGAT TTTGCTTGGG CGTGCAGCGT TTGGCTATCC CCGAGGCGGA951 AGCGGTTTTA TATGCCCAAC AGGCTTATGC GGCAAATACT TTGTTCGGGC1001 TGCGTGCCGC CGACAGGGGC GACGACGTGT ATGCCGCCGA TCCGTCCCGT1051 CAAAAATTGT GGCTGCGCTT CATCGGCGGC CGGTCGCATC AAAATATACG1101 GGGCGGCGCG GCTGCGGACG GGCGGCGCAA AGGCGTGCAA ATCGGCGGCG1151 AGGTGTTTGT ACGGCAAAAT GAAGGCAGCC GGCTGGCAAT CGGCGTGATG1201 GGCGGCAGGG CTGGCCAGCA CGCATCAGTC AACGGCAAAG GCGGTGCGGC1251 AGGCAGTTAT TTGCATGGTT ATGGCGGGGG TGTTTATGCT GCGTGGCATC1301 AGTTGCGCGA TAAACAAACG GGTGCGTATT TGGACGGCTG GTTGCAATAC1351 CAACGTTTCA AACACCGCAT CAATGATGAA AACCGTGCGG AACGCTACAA1401 AACCAAAGGT TGGACGGCTT CTGTCGAAGG CGGCTACAAC GCGCTTGTGG1451 CGGAAGGCGT TGTCGGAAAA GGCAATAATG TGCGGTTTTA CCTGCAACCG1501 CAGGCGCAGT TTACCTACTT GGGCGTAAAC GGCGGCTTTA CCGACAGCGA1551 GGGGACGGCG GTCGGACTGC TCGGCAGCGG TCAGTGGCAA AGCCGCGCCG1601 GCATTCGGGC AAAAACCCGT TTTGCTTTGC GTAACGGTGT CAATCTTCAG1651 CCTTTTGCCG CTTTTAATGT TTTGCACAGG 
TCAAAATCTT TCGGCGTGGA1701 AATGGACGGC GAAAAACAGA CGCTGGCAGG CAGGACGGCG CTCGAAGGGC1751 GGTTCGGCAT TGAAGCCGGT TGGAAAGGCC ATATGTCCGC ACGCATCGGA1801 TACGGCAAAA GGACGGACGG CGACAAAGAA GCCGCATTGT CGCTCAAATG1851 GCTGTTTTGA它编码的蛋白质具有氨基酸序列<SEQ ID 396>:1 MFRAQLGSNT RSTKIGDDAD FSFSDKPKPG TSHYFSSGKT DQNSSEYGYD51 EINIQGKNYN SGILAVDNMP VVKKYITDTY GDNLKDAVKK QLQDLYKTRP101 EAWEENKKRT EEAYIEQLGP KFSILKQKNP DLINKLVEDS VLTPHSNTSQ151 TSLNNIFNKK LHVKIENKSH VAGQVLELTK MTLKDSLWEP RRHSDIHMLE201 TSDNARIRLN TKDEKLTVHK AYQGGADFLF GYDVRESDKP ALTFEEKVSG251 QSGVVLERRP ENLKTLDGRK LIAAEKADSN SFAFKQNYRQ GLYELLLKQC301 EGGFCLGVQR LAIPEAEAVL YAQQAYAANT LFGLRAADRG DDVYAADPSR351 QKLWLRFIGG RSHQNIRGGA AADRRKGVQ IGGEVFVRQN EGSRLAIGVM401 GGRAGQHASV NGKGGAAGSY LHGYGGGVYA AWHQLRDKQT GAYLDGWLQY451 QRFKHRINDE NRAERYKTKG WTASVEGGYN ALVAEGVVGK GNNVRFYLQP501 QAQFTYLGVN GGFTDSEGTA VGLLGSGQWQ SRAGIRAKTR FALRNGVNLQ551 PFAAFNVLHR SKSFGVEMDG EKQTLAGRTA LEGRFGIEAG WKGHQSARIG601 YGKRTDGDKE AALSLKWLF*
Homology with a predicted ORF from N. gonorrhoeae
ORF35 shows 51.7% identity over a 261 aa overlap with a predicted ORF (ORF35ngh) from N. gonorrhoeae:
orf35.pep PCRRQGDDVYAAHASRQKLWLRFIGGRSHQNIRG 34
:::|:: |: |||| | |:|:| ::|orf35ngh FTKVQERDDIAIYAQQAQAANTLFALRLNDKNSDIFDRTLPRKGLWLRVIDGHSNQWVQG 370orf35.pep GAA-ADGWRKGVQIGGEVFVRQNEGSXLAIGVMGGRAGQHASVNGKG--GAAGSDLYGYG 91
:| ::|:|||||:|||||: |||:: |:||:|||:| |::: : : : ::: |:|orf35ngh KTAPVEGYRKGVQLGGEVFTWQNESNQLSIGLMGGQAEQRSTFRNPDTDNLTTGNVKGFG 430orf35.pep GGVYAAWHQLRDKQTGAYLDGWLQYQRFKHRINDENRAERYKTKGWTASVEGGYNALVAE 151
:||||:||||:||||| |:|:|:||| |:|||| | :||: :|| |||:|:|||||:||orf35ngh AGVYATWHQLQDKQTGAYVDSWMQYQRFRHRINTEYATERFTSKGITASIEAGYNALLAE 490orf35.pep GIVGKGNNVRFYLQPQAQFTYLGVNGGFTDSEGTAVGLLGSGQWQSRAGIRAKTRFALRN 211
:: |||::| |||||||:||||||| |:|||:: |:|||| | |||:|::||::||: |orf35ngh HFTKKGNSLRVYLQPQAQLTYLGVNGKFSDSENAQVNLLGSRQLQSRVGVQAKAQFAFTN 550orf35.pep GVNLQPFAAFNVLHRSKSFGVEMDGEKQTLAGRTALEGRFGIEAGWKGHMSA 263
||::|||:| | ::::| ||||:||::::: ::|::| ::|: | |:|::orf35ngh GVTFQPFVAVNSIYQQKPFCVEIDGDRRVINNKTVIETQLGVAAKIKSHLTLQASFNRQT 610
预计部分ORF35ngh核苷酸序列<SEQ ID 397>编码的蛋白质具有部分氨基酸序列<SEQ ID 398>:1..KKLRDRNSEY WKEETYHIKS NGRTYPNIPA LFPKHPFDPF ENINNSKKIS51 FYDKEYTEDY LVGFARGFGV EKRNGEEEKP LRQYFKDCVN TENSNNDNCK101 ISSFGNYGPI LIKSDIFALA SQIKNSHINS EILSVGNYIE WLRPTLNKLT151 GWQEHLYAGL DPFHYIEVTD NSHVIGQTID LGALELTNSL WKPRWNSNID201 YLITKNAEIR FNTKNESLLV KEDYAGGARF RFAYDLKDKV PEIPVLTFEK251 NITGTSDIIF EGKALDNLKH LDGHQIVKVN DTADKDAFRL SSKYRKGIYT301 LSLQQRPEGF FTKVQERDDI AIYAQQAQAA NTLFALRLND KNSDIFDRTL351 PRKGLWLRVI DGHSNQWVQG KTAPVEGYRK GVQLGGEVFT WQNESNQLSI401 GLMGGQAEQR STFRNPDTDN LTTGNVKGFG AGVYATWHQL QDKQTGAYVD451 SWMQYQRFRH RINTEYATER FTSKGITASI EAGYNALLAE HFTKKGNSLR501 VYLQPQAQLT YLGVNGKFSD SENAQVNLLG SRQLQSRVGV QAKAQFAFTN551 GVTFQPFVAV NSIYQQKPFG VEIDGDRRVI NNKTVIETQL GVAAKIKSHL601 TLQASFNRQT SKHHHAKQGA LNLQWTF*
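Based on this prediction, these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 55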
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 399>:1 ..GCGGAATATG TTCAGTTCTC TATAGATTTG TTCAGTGTGG GTAAATCGGG51 GGGCGGTATA CCTAAGGCTA AGCCTGTGTT TGATGCGAAA CCGAGATGGG101 AGGTTGATAG GAAGCTTAAT AAATTGACAA CTCGTGAGCA GGTGGAGAAA151 AATGTTCAGG AAACGAGAAG AAGGAGTCAG AGTAGTCAGT TTAAAGCCCA201 TGCGCAACGA GAATGGGAAA ATAAAACAGG GTTAGATTTT AATCATTTTA251 TAGGTGGTGA TATCAATAAA AAAGGCACAG TAACAGGAGG GCATAGTCTA301 ACCCGTGGTG ATGTACGGGT GATACAACAA ACCTCGGCAC CTGATAAACA351 TGGGGT.TTA TCAAGCGACA GTGGAAATTN A它对应于氨基酸序列<SEQ ID 400;ORF46>:1 ..AEYVQFSIDL FSVGKSGGGI PKAKPVFDAK PRWEVDRKLN KLTTREQVEK51 NVQETRRRSQ SSQFKAHAQR EWENKTGLDF NHFIGGDINK KGTVTGGHSL101 TRGDVRVIQQ TSAPDKHGXL SSDSGNX进一步的工作进一步揭示了部分核苷酸序列<SEQ ID 401>:1 ..GCAGTGTGCC TnCCGATGCA TGCACACGCC TCAnATTTGG CAAACGATTC51 TTTTATCCGG CAGGTTCTCG ACCGTCAGCA TTTCGAACCC GACGGGAAAT101 ACCACCTATT CGGCAGCAGG GGGGAACTTG CCGAGCGCCA GTCTCATATC151 GGATTGGGAA AAATACAAAG CCATCAGTTG GGCAACCTGA TGATTCAACA201 GGCGGCCATT AAAGGAAATA TCGGCTACAT TGTCCGCTTT TCCGATCACG251 GGCACGAAGT CCATTCCCCs TTCGACAACC ATGCCTCACA TTCCGATTCT301 GATGAAGCCG GTAGTCCCGT TGACGGATTT AGCCTTT.ACC GCATCCATTG351 GGACGGATAC GAACACCATC CCGCCGACGG CTATGACGGG CCACAGGGCG401 GCGGCTATCC CGCTCCCAAA GGCGCGAGGG ATATATACAG TTACGACATA451 AAAGGCGTTG CCCAAAATAT CCGCCTCAAC CTGACCGACA ACCGCAGCAC501 CGGACAACGG CTTGCCGACC GTTTCCACAA TGCCGGTAGT ATGCTGACGC551 AAGGAGTAGG CGACGGATTC AAACGCGCCA CCCGATACAG CCCCGAGCTG601 GACAGATCGG GCAATGCCGC CGAAGCCTTC AACGGCACTG CAGATATCGT651 TAAAAACATC ATCGGCGCTG CAGGAGAAAT TGT它对应于氨基酸序列<SEQ ID 402;ORF46-1>:1 ..AVCLPMHAHA SXLANDSFIR QVLDRQHFEP DGKYHLFGSR GELAERQSHI51 GLGKIQSHQL GNLMIQQAAI KGNIGYIVRF SDHGHEVHSP FDNHASHSDS101 DEAGSPVDGF SLYRIHWDGY EHHPADGYDG PQGGGYPAPK GARDIYSYDI151 KGVAQNIRLN LTDNRSTGQR LADRFHNAGS MLTQGVGDGF KRATRYSPEL201 DRSGNAAEAF NGTADIVKNI IGAAGEI
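Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. gonorrhoeae
ORF46 shows 98.2% identity over a 111 aa overlap with a predicted ORF (ORF46ng) from N. gonorrhoeae:
orf46.pep AEYVQFSIDLFSVGKSGGGIPKAKPVFDAKPRWEVDRKLNKLTTR 45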
||||||||||||||||||||||||||| |orf46ng PKTGVPFDGKGFPNFEKHVKYDTKLDIQELSGGGIPKAKPVFDAKPRWEVDRKLNKLTTR 217orf46.pep EQVEKNVQETRRRSQSSQFKAHAQREWENKTGLDFNHFIGGDINKKGTVTGGHSLTRGDV 105
|||||||||||||||||||||||||||||||||||||||||||||||:||||||||||||orf46ng EQVEKNVQETRRRSQSSQFKAHAQREWENKTGLDFNHFIGGDINKKGAVTGGHSLTRGDV 277orf46.pep RVIQQTSAPDKHGXLSSDSGN 126
||||||||||||| |||||||orf46ng RVIQQTSAPDKHGVLSSDSGN 298
预计部分ORF46ng核苷酸序列<SEQ ID 403>编码的蛋白质具有部分氨基酸序列<SEOID 404>:1 ..RRLKHCCHAR LGSAFHRKQD GAHQRFGRYG ATQRLCRSSH PRLGSPKPQC51 RTRHRSRQQY LYGSHPHQRD WSCPGKIQLG RHHGTSCRAV ADXRDRICER101 EIRRQRQXCR CRLGKIPSLS IPKYPLKLEQ RYGKENITSS TVPPSNGKNV151 KLADQRHPKT GVPFDGKGFP NFEKHVKYDT KLDIQELSGG GIPKAKPVFD201 AKPRWEVDRK LNKLTTREQV EKNVQETRRR SQSSQFKAHA QREWENKTGL251 DFNHFIGGDI NKKGAVTGGH SLTRGDVRVI QQTSAPDKHG VLSSDSGN*进一步的工作揭示了该完整的淋球菌DNA序列<SEQ ID 405>:1 TTGGGCATTT CCCGCAAAAT ATCCCTTATT CTGTCCATAC TGGCAGTGTG51 CCTGCCGATG CATGCACACG CCTCAGATTT GGcaAACGAT CCCTTTATCC101 GgCaggttcT CGaccGTCAG CATTTCGaac ccgacggGAa ATACCaCCTA151 TTcggCaGCA GGGGGGAGCT TgccnagcGC aacggccATa tcggattggG201 aaacaTAcaa Agccatcagt tGggccacct gatgattcaa caggcggccg251 ttgaaggaaA TAtcgGctac attgtccgct tttccgatca cgggcacaaa301 ttccattcgc ccttcGAcaa ccaTGCCTCA CATTCCGATT CTGACGAAGC351 CGGTAGTCCC GTTGACGGAT TCAGCCTTTA CCGCATCCAT TGGGACGGAT401 ACGAACACCA TCCCGCCGAC GGCTATGACG GGCCACAGGG CGGCGGCTAT451 CCCGCTCCCA AAGGCGCGAG GGATATATAC AGCTACGACA TAAAAGGCGT501 TGCCCAAAAT ATCCGCCTCA ACCTGACCGA CAACCGCAGC ACCGGACAAC551 GGCTTGCCGA CCGTTTCCAC AATGCCGGCG CTATGCTGAC GCAAGGAGTA601 GGCGACGGAT TCAAACGCGC CACCCGATAC AGCCCCGAGC TGGACAGATC651 GGGCAATGCc gccGAAGCCT TCAACGGCAC TGCAGATATC GTCAAAAACA701 TCATCGGCGC GGCAGGAGAA ATTGTCGGCG CAGGCGRTGC CGTGCagGGT751 ATAAGCGAAG GCTCAAACAT TGCTGTCATG CACGGCTTGG GTCTGCTTTC801 CACCGAAAAC AAGATGGCGC GCATCAACGA TTTGGCAGAT ATGGCGCAAC851 TCAAAGACTA TGCCGCAGCA GCCATCCGCG ATTGGGCAGT CCAAAACCCC901 AATGCCGCAC AAGGCATAGA AGCCGTCAGC AATATCTTTA TGGCAGCCAT951 CCCCATCAAA GGGATTGGAG CTGTCCGGGG AAAATACGGC TTGGGCGGCA1001 TCACGGCACA TCCTGTCAAG CGGTCGCAGA TGGGCGCGAT CGCATTGCCG1051 AAAGGGAAAT CCGCCGTCAG CGACAATTTT GCCGATGCGG CATACGCCAA1101 ATACCCGTCC CCTTACCATT CCCGAAATAT CCGTTCAAAC TTGGAGCAGC1151 GTTACGGCAA AGAAAACATC ACCTCCTCAA CCGTGCCGCC GTCAAACGGC1201 AAAAATGTCA AACTGGCAGA CCAACGCCAC CCGAAGACAG GCGTACCGTT1251 TGACGGTAAA GGGTTTCCGA ATTTTGAGAA GCACGTGAAA TATGATACGA1301 AGCTCGATAT TCAAGAATTA TCGGGGGGCG GTATACCTAA 
GGCTAAGCCT1351 GTGTTTGATG CGAAACCGAG ATGGGAGGTT GATAGGAAGC TTAATAAATT1401 GACAACTCGT GAGCAGGTGG AGAAAAATGT TCAGGAAACG AGAAGAAGGA1451 GTCAGAGTAG TCAGTTTAAA GCCCATGCGC AACGAGAATG GGAAAATAAA1501 ACAGGGTTAG ATTTTAATCA TTTTATAGGT GGTGATATCA ATAAGAAAGG1551 CACAGTAACA GGAGGGCATA GTCTAACCCG TGGTGATGTA CGGGTGATAC1601 AACAAACCTC GGCACCTGAT AAACATGGGG TTTATCAAGC GACAGTGGAA1651 ATTAAAAAGC CTGATGGAAG TTGGGAGGTG AAAACGAAAA AAGGTGGGAA1701 AGTGATGACC AAGCACACCA TGTTCCCAAA AGATTGGGAT GAGGCTAGAA1751 TTAGGGCTGA AGTTACTTCG GCTTGGGAAA GTAGAATAAT GCTTAAGGAT1801 AATAAATGGC AGGGTACAAG TAAATCGGGT ATTAAAATAG AAGGATTTAC1851 CGAACCTAAT AGAACAGCAT ATCCCATTTA TGAATAG它对应于氨基酸序列<SEQ ID 406;ORF46ng-1>:1 LGISRKISLI LSILAVCLPM HAHASDLAND PFIRQVLDRQ HFEPDGKYHL51 FGSRGELAXR NGHIGLGNIQ SHQLGHLMIQ QAAVEGNIGY IVRFSDHGHK101 FHSPFDNHAS HSDSDEAGSP VDGFSLYRIH WDGYEHHPAD GYDGPQGGGY151 PAPKGARDIY SYDIKGVAQN IRLNLTDNRS TGQRLADRFH NAGAMLTQGV201 GDGFKRATRY SPELDRSGNA AEAFNGTADI VKNIIGAAGE IVGAGDAVQG251 ISEGSNIAVM HGLGLLSTEN KMARINDLAD MAQLKDYAAA AIRDWAVQNP301 NAAQGIEAVS NIFMAAIPIK GIGAVRGKYG LGGITAHPVK RSQMGAIALP351 KGKSAVSDNF ADAAYAKYPS PYHSRNIRSN LEQRYGKENI TSSTVPPSNG401 KNVKLADQRH PKTGVPFDGK GFPNFEKHVK YDTKLDIQEL SGGGIPKAKP451 VFDAKPRWEV DRKLNKLTTR EQVEKNVQET RRRSQSSQFK AHAQREWENK501 TGLDFNHFIG GDINKKGTVT GGHSLTRGDV RVIQQTSAPD KHGVYQATVE551 IKKPDGSWEV KTKKGGKVMT KHTMFPK1WD EARIRAEVTS AWESRIMLKD601 NKWQGTSKSG IKIEGFTEPN RTAYPIYE*ORF46ng-1和ORF46-1显示在227个氨基酸的重叠区内有94.7%的相同性:
10 20 30 40orf46-1.pep AVCLPMHAHASXLANDSFIRQVLDRQHFEPDGKYHLFGSRGELAER
||||||||||| |||| ||||||||||||||||||||||||||| |orf46ng-1 LGISRKISLILSILAVCLPMHAHASDLANDPFIRQVLDRQHFEPDGKYHLFGSRGELAXR
10 20 30 40 50 60
50 60 70 80 90 100orf46-1.pep QSHIGLGKIQSHQLGNLMIQQAAIKGNIGYIVRFSDHGHEVHSPFDNHASHSDSDEAGSP
::|||||:|||||||:|||||||::||||||||||||||: |||||||||||||||||||orf46ng-1 NGHIGLGNIQSHQLGHLMIQQAAVEGNIGYIVRFSDHGHKFHSPFDNHASHSDSDEAGSP
70 80 90 100 110 120
110 120 130 140 150 160orf46-1.pep VDGFSLYRIHWDGYEHHPADGYDGPQGGGYPAPKGARDIYSYDIKGVAQNIRLNLTDNRS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf46ng-1 VDGFSLYRIHWDGYEHHPADGYDGPQGGGYPAPKGARDIYSYDIKGVAQNIRLNLTDNRS
130 140 150 160 170 180
170 180 190 200 210 220orf46-1.pep TGQRLADRFHNAGSMLTQGVGDGFKRATRYSPELDRSGNAAEAFNGTADIVKNIIGAAGE
|||||||||||||:||||||||||||||||||||||||||||||||||||||||||||||orf46ng-1 TGQRLADRFHNAGAMLTQGVGDGFKRATRYSPELDRSGNAAEAFNGTADIVKNIIGAAGE
190 200 210 220 230 240orf46-1.pep I
|orf46ng-1 IVGAGDAVQGISEGSNIAVMHGLGLLSTENKMARINDLADMAQLKDYAAAAIRDWAVQNP
250 260 270 280 290 300
Homology with a predicted ORF from N. meningitidis (strain A)
ORF46ng-1 shows 87.4% identity over a 486 amino acid overlap with an ORF (ORF46a) from strain A of N. meningitidis:
10 20 30 40 50 60orf46a.pep LGISRKISLILSILAVCLPMHAHASDLANDSFIRQVLDRQHFEPDGKYHLFGSRGELAER
|||||||||||||||||||||||||||||| ||||||||||||||||||||||||||| |orf46ng-1 LGISRKISLILSILAVCLPMHAHASDLANDPFIRQVLDRQHFEPDGKYHLFGSRGELAXR
10 20 30 40 50 60
70 80 90 100 110 120orf46a.pep SGHIGLGNIQSHQLGNLFIQQAAIKGNIGYIVRFSDHGHEVHSPFDNHASHSDSDEAGSP
:||||||||||||||:|:|||||::|||||||||||||||: ||||||||||||||||||orf46ng-1 NGHIGLGNIQSHQLGHLMIQQAAVEGNIGYIVRFSDHGHKFHSPFDNHASHSDSDEAGSP
70 80 90 100 110 120
130 140 150 160 170 180orf46a.pep VDGFSLYRIHWDGYEHHPADGYDGPQGGGYPAPKGARDIYSYDIKGVAQNIRLNLTDNRS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf46ng-1 VDGFSLYRIHWDGYEHHPADGYDGPQGGGYPAPKGARDIYSYDIKGVAQNIRLNLTDNRS
130 140 150 160 170 180
190 200 210 220 230 240orf46a.pep TGQRLVDRFHNTGSMLTQGVGDGFKRATRYSPELDRSGNAAEAFNGTADIVKNIIGAAGE
|||||:|||||:|:||||||||||||||||||||||||||||||||||||||||||||||orf46ng-1 TGQRLADRFHNAGAMLTQGVGDGFKRATRYSPELDRSGNAAEAFNGTADIVKNIIGAAGE
190 200 210 220 230 240
250 260 270 280 290 300orf46a.pep IVGAGDAVQGISEGSNIAVMHGLGLLSTENKMARINDLADMAQLKDYAAAAIRDWAVQNP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf46ng-1 IVGAGDAVQGISEGSNIAVMHGLGLLSTENKMARINDLADMAQLKDYAAAAIRDWAVQNP
250 260 270 280 290 300
310 320 330 340 350 360orf46a.pep NAAQGIEAVSNIFTAVIPVKGIGAVRGKYGLGGITAHPVKRSQMGEIALPKGKSAVSDNF
||||||||||||| |:||:|||||||||||||||||||||||||| ||||||||||||||orf46ng-1 NAAQGIEAVSNIFMAAIPIKGIGAVRGKYGLGGITAHPVKRSQMGAIALPKGKSAVSDNF
310 320 330 340 350 360
370 380 390 400 410 420orf46a.pep ADAAYAKYPSPYHSRNIRSNLEQRYGKENITSSTVPPSNGKNVKLANKRHPKTKVPFDGK
||||||||||||||||||||||||||||||||||||||||||||||::||| ||||||||orf46ng-1 ADAAYAKYPSPYHSRNIRSNLEQRYGKENITSSTVPPSNGKNVKLADQRHPKTGVPFDGK
370 380 390 400 410 420
430 440 450 460 470orf46a.pep GFPNFEKDVKYDTRINTAVPQVN----PIDEPVFN--PKGSVGSAHSWSITARIQYAKLP
||||||| |||||::: : ::: | :|||: |: | : ::|:| | |orf46ng-1 GFPNFEKHVKYDTKLD--IQELSGGGIPKAKPVFDAKPRWEVDRKLN-KLTTREQVEKNV
430 440 450 460 470
480 490 500 510 520 530orf46a.pep RQGRIRYIPPKNYSPSAPLPKGPNNGYLDKFGNEWTKGPSRTKGQEFEWDVQLSKTGREQ
:: | |orf46ng-1 QETRRRSQSSQFKAHAQREWENKTGLDFNHFIGGDINKKGTVTGGHSLTRGDVRVIQQTS
480 490 500 510 520 530全长ORF46aDNA序列<SEQ ID 407>是:1 TTGGGCATTT CCCGCAAAAT ATCCCTTATT CTGTCCATAC TGGCAGTGTG51 CCTGCCGATG CATGCACACG CCTCAGATTT GGCAAACGAT TCTTTTATCC101 GGCAGGTTCT CGACCGTCAG CATTTCGAAC CCGACGGGAA ATACCACCTA151 TTCGGCAGCA GGGGGGAACT TGCCGAGCGC AGCGGTCATA TCGGATTGGG201 AAACATACAA AGCCATCAGT TGGGCAACCT GTTCATCCAG CAGGCGGCCA251 TTAAAGGAAA TATCGGCTAC ATTGTCCGCT TTTCCGATCA CGGGCACGAA301 GTCCATTCCC CCTTCGACAA CCATGCCTCA CATTCCGATT CTGATGAAGC351 CGGTAGTCCC GTTGACGGAT TCAGCCTTTA CCGCATCCAT TGGGACGGAT401 ACGAACACCA TCCCGCCGAC GGCTATGACG GGCCACAGGG CGGCGGCTAT451 CCCGCTCCCA AAGGCGCGAG GGATATATAC AGCTACGACA TAAAAGGCGT501 TGCCCAAAAT ATCCGCCTCA ACCTGACCGA CAACCGCAGC ACCGGACAAC551 GGCTTGTCGA CCGTTTCCAC AATACCGGTA GTATGCTGAC GCAAGGAGTA601 GGCGACGGAT TCAAACGCGC CACCCGATAC AGCCCCGAGC TGGACAGATC651 GGGCAATGCC GCCGAAGCTT TCAACGGCAC TGCAGATATC GTCAAAAACA701 TCATCGGCGC GGCAGGAGAA ATTGTCGGCG CAGGCGATGC CGTGCAGGGT751 ATAAGCGAAG GCTCAAACAT TGCTGTTATG CACGGCTTGG GTCTGCTTTC801 CACCGAAAAC AAGATGGCGC GCATCAACGA TTTGGCAGAT ATGGCGCAAC851 TCAAAGACTA TGCCGCAGCA GCCATCCGCG ATTGGGCAGT CCAAAACCCC901 AATGCCGCAC AAGGCATAGA AGCCGTCAGC AATATCTTTA CGGCAGTCAT951 CCCCGTCAAA GGGATTGGAG CTGTTCGGGG AAAATACGGC TTGGGCGGCA1001 TCACGGCACA TCCTGTCAAG CGGTCGCAGA TGGGCGAGAT CGCATTGCCG1051 AAAGGGAAAT CCGCCGTCAG CGACAATTTT GCCGATGCGG CATACGCCAA1101 ATACCCGTCC CCTTACCATT CCCGAAATAT CCGTTCAAAC TTGGAGCAGC1151 GTTACGGCAA AGAAAACATC ACCTCCTCAA CCGTGCCGCC GTCAAACGGA1201 AAGAATGTGA AACTGGCAAA CAAACGCCAC CCGAAGACCA AAGTGCCGTT1251 TGACGGTAAA GGGTTTCCGA ATTTTGAAAA AGACGTAAAA TACGATACGA1301 GAATTAATAC CGCTGTACCA CAAGTGAATC CTATAGATGA ACCCGTCTTT1351 AATCCTAAAG GTTCTGTCGG ATCGGCTCAT TCTTGGTCTA TAACTGCCAG1401 AATTCAATAC GCAAAATTAC CAAGGCAAGG TAGAATCAGA TATATCCCAC1451 CTAAAAATTA CTCTCCTTCA GCACCGCTAC CAAAAGGACC TAATAATGGA1501 TATTTGGATA AATTTGGTAA TGAATGGACT AAAGGTCCAT CAAGAACTAA1551 AGGTCAAGAA TTTGAATGGG ATGTTCAATT GTCTAAAACA GGAAGAGAGC1601 AACTTGGATG GGCTAGTAGG GATGGTAAGC ATTTAAATAT ATCAATTGAT1651 GGAAAGATTA 
CACACAAATG A
This corresponds to the amino acid sequence <SEQ ID 408>:
1 LGISRKISLI LSILAVCLPM HAHASDLAND SFIRQVLDRQ HFEPDGKYHL
51 FGSRGELAER SGHIGLGNIQ SHQLGNLFIQ QAAIKGNIGY IVRFSDHGHE
101 VHSPFDNHAS HSDSDEAGSP VDGFSLYRIH WDGYEHHPAD GYDGPQGGGY
151 PAPKGARDIY SYDIKGVAQN IRLNLTDNRS TGQRLVDRFH NTGSMLTQGV
201 GDGFKRATRY SPELDRSGNA AEAFNGTADI VKNIIGAAGE IVGAGDAVQG
251 ISEGSNIAVM HGLGLLSTEN KMARINDLAD MAQLKDYAAA AIRDWAVQNP
301 NAAQGIEAVS NIFTAVIPVK GIGAVRGKYG LGGITAHPVK RSQMGEIALP
351 KGKSAVSDNF ADAAYAKYPS PYHSRNIRSN LEQRYGKENI TSSTVPPSNG
401 KNVKLANKRH PKTKVPFDGK GFPNFEKDVK YDTRINTAVP QVNPIDEPVF
451 NPKGSVGSAH SWSITARIQY AKLPRQGRIR YIPPKNYSPS APLPKGPNNG
501 YLDKFGNEWT KGPSRTKGQE FEWDVQLSKT GREQLGWASR DGKHLNISID
551 GKITHK*
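Based on this analysis, including the presence in the gonococcal protein of an RGD sequence typical of adhesins, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.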
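The RGD observation above can be reproduced with a simple motif scan. The following is a minimal sketch, not the method used in the patent (which is not stated); the function name `find_motif` and the `offset` bookkeeping are illustrative assumptions. The fragment is residues 521-540 of SEQ ID 406 as printed in the listing.

```python
import re


def find_motif(protein: str, motif: str = "RGD") -> list[int]:
    """Return the 1-based start position of every occurrence of motif."""
    # A lookahead is used so that overlapping occurrences are reported too.
    pattern = f"(?={re.escape(motif)})"
    return [m.start() + 1 for m in re.finditer(pattern, protein)]


# Residues 521-540 of SEQ ID 406 (ORF46ng-1), copied from the listing.
fragment = "GGHSLTRGDVRVIQQTSAPD"
offset = 520  # number of residues that precede the fragment

# Convert fragment-relative positions to positions in the full sequence.
positions = [offset + p for p in find_motif(fragment)]
print(positions)  # [527]
```

The same function can be run on a whole cleaned sequence, in which case `offset` is zero.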
Example 56
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 409>:1 ATGAATATTC ACACCCTGCT CTCCAAACAA TGGACGCTGC CGCCATTCCT51 GCCGAAACGG CTGCTGCTGT CCCTGCTGAT ACTGCTTGCC CCCAATGCGG101 TGTTTTGGGT TTTGGCACTG CTGACCGCCA CCGCCCGCCC GATTGTCAAT151 TTGGACTATC TTCCCGCCGC GCTGCTGATC GCCCTGCCTT GGCGTTTCGT201 CAAAATTGCC GGCGTATTGG CGTTTTGGCT GGCGGTTTTG TTTGACGGGC251 TGATGATGGT GATCCAACTC TTCCCTTTTA TGGATCTCAT CGGCGCCATC301 AACCTCGTCC CCTTCATCCT GACCGCCCCC GCCCCTTATC AGATAATGAC351 CGGGCTG...它对应于氨基酸序列<SEQ ID 410;ORF48>:1 MNIHTLLSKQ WTLPPFLPKR LLLSLLILLA PNAVFWVLAL LTATARPIVN51 LDYLPAALLI ALPWRFVKIA GVLAFWLAVL FDGLMMVIQL FPFMDLIGAI101 NLVPFILTAP APYQIMTGL...进一步的工作揭示了完整的核苷酸序列<SEQ ID 411>:1 ATGAATATTC ACACCCTGCT CTCCAAACAA TGGACGCTGC CGCCATTCCT51 GCCGAAACGG CTGCTGCTGT CCCTGCTGAT ACTGCTTGCC CCCAATGCGG101 TGTTTTGGGT TTTGGCACTG CTGACCGCCA CCGCCCGCCC GATTGTCAAT151 TTGGACTATC TTCCCGCCGC GCTGCTGATC GCCCTGCCTT GGCGTTTCGT201 CAAAATTGCC GGCGTATTGG CGTTTTGGCT GGCGGTTTTG TTTGACGGGC251 TGATGATGGT GATCCAACTC TTCCCTTTTA TGGATCTCAT CGGCGCCATC301 AACCTCGTCC CCTTCATCCT GACCGCCCCC GCCCCTTATC AGATAATGAC351 CGGGCTGTTG CTGCTGTATA TGCTGGCGAT GCCGTTTGTG TTGCAGAAAG401 CCGCCGCCAA AACCGACTTC CGGCACATTG CCGTCTGCGC CGCCGTTGTG451 GCGGCAGCCG GCTATTTCAC CGGCCATTTG AGTTACTACG ACCGGGGTCG 501 GATGGCCAAT ATCTTCGGCG CAAACAACTT CTACTACGCC AAAAGTCAGG551 CGATGCTCTA CACCGTCAGC CAGAATGCCG ACTTTATTAC CGCCGGCCTG601 GTCGATCCCG TCTTCCTCCC CTTGGGCAAT CAACAGCGTG CCGCCACGCA651 TCTGAACGAG CCGAAATCTC AAAAAATCCT CTTTATCGTC GCCGAATCTT701 GGGGGCTGCC GGCCAATCCC GAACTTCAAA ACGCCACTTT TGCCAAACTG751 CTGGCGCAAA AAGACCGTTT TTCGGTTTGG GAAAGCGGCA GTTTTCCCTT801 CATCGGCGCG ACGGTCGAAG GCGAAATGCG CGAACTGTGT GCCTACGGCG851 GTTTGCGCGG GTTCGCACTG CGCCGCGCGC CCGACGAAAA ATTTGCCCGC901 TGCCTCCCCA ACCGTTTGAA ACAAGAAGGT TACGCCACCT TTGCGATGCA951 CGGCGCGGGC AGTTCGCTTT ACGACCGCTT CAGCTGGTAT CCGAGGGCGG1001 GCTTTCAAGA AATCAAAACC GCCGAAAACC TGATCGGTAA AAAAACCTGC1051 GCCATTTTCG GCGGCGTGTG CGACAGCGAG CTGTTCGGCG AAGTGTCGGC1101 ATTTTTCAAA AAACACGACA AGGGACTGTT TTACTGGATG ACGCTGACCA1151 GCCACGCCGA 
CTATCCCGAA TCCGACATTT TCAACCACAG GCTCAAATGC1201 ACCGAATATG GCCTGCCCGC CGAAACCGAC CTCTGCCGCA ATTTCAGCCT1251 GCACACCCAA TTCTTCGACC AACTGGCGGA TTTGATCCAA CGCCCCGAAA1301 TGAAAGGCAC GGAAGTCATC ATCGTCGGCG ACCATCCGCC GCCCGTCGGC1351 AACCTCAATG AAACCTTCCG CTACCTCAAA CAGGGGCACG TCGCCTGGCT1401 GAACTTCAAA ATCAAATAA它对应于氨基酸序列<SEQ ID 412;ORF48-1>:1 MNIHTLLSKQ WTLPPFLPKR LLLSLLILLA PNAVFWVLAL LTATARPIVN51 LDYLPAALLI ALPWRFVKIA GVLAFWLAVL FDGLMMVIQL FPFMDLIGAI101 NLVPFILTAP APYQIMTGLL LLYMLAMPFV LQKAAAKTDF RHIAVCAAVV151 AAAGYFTGHL SYYDRGRMAN IFGANNFYYA KSQAMLYTVS QNADFITAGL201 VDPVFLPLGN QQRAATHLNE PKSQKILFIV AESWGLPANP ELQNATFAKL251 LAQKDRFSVW ESGSFPFIGA TVEGEMRELC AYGGLRGFAL RRAPDEKFAR301 CLPNRLKQEG YATFAMHGAG SSLYDRFSWY PRAGFQEIKT AENLIGKKTC351 AIFGGVCDSE LFGEVSAFFK KHDKGLFYWM TLTSHADYPE SDIFNHRLKC401 TEYGLPAETD LCRNFSLHTQ FFDQLADLIQ RPEMKGTEVI IVGDHPPPVG451 NLNETFRYLK GGHVAWLNFK IK*
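The correspondence between a nucleotide listing and its amino acid sequence can be checked by translating the DNA codon by codon. The sketch below assumes the standard genetic code and reading frame 1; the names `CODON_TABLE` and `translate` are illustrative, and rendering codons that contain an ambiguous base as `X` is an assumption chosen to match the `X` residues seen in the listings.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; unknown or ambiguous codons become 'X'."""
    seq = "".join(dna.split()).upper()  # drop the spaces between 10-base groups
    protein = []
    for i in range(0, len(seq) - 2, 3):
        protein.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(protein)


# First 30 bases of SEQ ID 409 and the last 9 bases of SEQ ID 411.
print(translate("ATGAATATTC ACACCCTGCT CTCCAAACAA"))  # MNIHTLLSKQ
print(translate("ATCAAATAA"))                         # IK*
```

The output matches the start and end of SEQ ID 410 and SEQ ID 412 as printed. Position numbers fused into the listing text have to be stripped before a full sequence is passed in.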
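Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF48 shows 94.1% identity over a 119 amino acid overlap with an ORF (ORF48a) from strain A of N. meningitidis: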
10 20 30 40 50 60orf48.pep MNIHTLLSKQWTLPPFLPKRLLLSLLILLAPNAVFWVLALLTATARPIVNLDYLPAALLI
||||||||||||||||||||||||||||| ||||||||||||||||||||| ||||||||orf48a MNIHTLLSKQWTLPPFLPKRLLLSLLILLXPNAVFWVLALLTATARPIVNLXYLPAALLI
10 20 30 40 50 60
70 80 90 100 110 119orf48.pep ALPWRFVKIAGVLAFWLAVLFDGLMMVIQLFPFMDLIGAINLVPFILTAPAPYQIMTGL
||||| |||||||| ||||||||||||||||||||||||||||||| |||| |||||||orf48a ALPWRXVKIXGVLAXWLAVLFDGLMMVIQLFPFMDLIGAINLVPFIXTAPALYQIMTGLL
70 80 90 100 110 120orf48a LLYMLAMPFVLQKAAAKTDFRHIAACAAVVVAAGYFTGHLSXYDRGRMANIFGANNFYYA
130 140 150 160 170 180全长ORF48a核苷酸序列<SEQ ID 413>是:1 ATGAATATTC ACACCCTGCT CTCCAAACAA TGGACGCTGC CGCCATTCCT51 GCCGAAACGG CTGCTGCTGT CCCTGCTGAT ACTGCTNNCC CCCAATGCGG101 TGTTTTGGGT TTTGGCACTG CTGACCGCCA CCGCCCGCCC GATTGTCAAT151 TTGGANTACC TTCCCGCCGC GCTGCTGATC GCCCTGCCTT GGCGTNTCGT201 CAAAATTGNC GGCGTATTGG CGTNTTGGCT GGCGGTTTTG TTTGACGGGC251 TGATGATGGT GATCCAACTC TTCCCTTTTA TGGATCTCAT CGGCGCCATC301 AACCTCGTCC CCTTCATCNT GACCGCCCCC GCCCTTTATC AGATAATGAC 351 CGGGCTGTTA CTGCTGTATA TGCTGGCGAT GCCGTTTGTG TTGCAGAAAG401 CCGCCGCCAA AACCGACTTC CGACACATTG CCGCCTGTGC CGCCGTTGTG451 GTGGCAGCCG GCTATTTTAC CGGCCATTTG AGTTANTACG ACCGGGGGCG501 GATGGCCAAT ATCTTCGGCG CAAACAACTT CTATTACGCC AAAAGTCAGG551 CGATGCTCTA CACCGTCAGC CAGAATGCCG ACTTTATTAC CGCCGGCCTG601 GTCGATCCCG TCTTCCTCCC CTTGGGCAAT CAACAGCGTG CCGCCACGCA651 TCTGAACGAG CCGAAATCTC AAAAAATCCT CTTTATCGTC GCCGAATCTT701 GGGGGCTGCC GGCCAATCCC GAACTTCAAA ACGCCACTTT TGCCAAACTG751 CTGGCGCAAA AAGANCGTTT TTCGGTTTGG GAAAGCGGCA GTTTTCCCTT801 CATCGGCGCG ACGATCGAAG GCGAAATGCG CGAACTGTGT GCCTACGGCG851 GTTTGCGCGG GTTCGCACTG CGCCGCGCGC CCGACGAAAA ATTTGCCCGC901 TGCCTCCCCA ACCGTTTGAA ACAAGAAGGT TACGCCACCT TTGCGATGCA951 CGGCGCGGGC AGTTCGCTTT ACGACCGCTT CAGCTGGTAT CCGAGGGCGG1001 GCTTTCAAGA AATCAAAACC GCCGAAAACC TGATCGGTAA AAAAACCTGC1051 GCCATTTTCG GCGGCGTGTG CGACAGCGAG CTGTTCGGCG AAGTGTCGGC1101 ANTTTTCAAA AAACACGACA AGGGACTGTT TTACTGGATG ACGCTGACCA1151 GCCACGCCGA CTATCCCGAA TCNGACATTT TCAACCACAG GCTCAAATGC1201 ACCGAATATG GCCTGCCCGC CGAAACCGAC NTCTGCCGCA ATTTCAGCCT1251 GCACACCCAA TTCTTCGACC AACTGGCGGA TTTGATCCAA CGCCCCGAAA1301 TGAAAGGCAC GGAAGTCATC ATCGTCGGCG ACCATCCGCC GCCCGTCGGC1351 AACCTCAATG AAACCTTCCG CTACCTCAAA CAGGGGCACG TCGNCTGGCT1401 GAACTTCAAA ATCAAATAA它编码的蛋白质具有氨基酸序列<SEQ ID 414>:1 MNIHTLLSKQ WTLPPFLPKR LLLSLLILLX PNAVFWVLAL LTATARPIVN51 LXYLPAALLI ALPWRXVKIX GVLAXWLAVL FDGLMMVIQL FPFMDLIGAI101 NLVPFIXTAP ALYQIMTGLL LLYMLAMPFV LQKAAAKTDF RHIAACAAVV151 VAAGYFTGHL SXYDRGRMAN IFGANNFYYA KSQAMLYTVS QNADFITAGL201 VDPVFLPLGN QQRAATHLNE PKSQKILFIV 
AESWGLPANP ELQNATFAKL
251 LAQKXRFSVW ESGSFPFIGA TIEGEMRELC AYGGLRGFAL RRAPDEKFAR
301 CLPNRLKQEG YATFAMHGAG SSLYDRFSWY PRAGFQEIKT AENLIGKKTC
351 AIFGGVCDSE LFGEVSAXFK KHDKGLFYWM TLTSHADYPE SDIFNHRLKC
401 TEYGLPAETD XCRNFSLHTQ FFDQLADLIQ RPEMKGTEVI IVGDHPPPVG
451 NLNETFRYLK QGHVXWLNFK IK*
ORF48a and ORF48-1 show 96.8% identity over a 472 amino acid overlap:
10 20 30 40 50 60orf48a.pep MNIHTLLSKQWTLPPFLPKRLLLSLLILLXPNAVFWVLALLTATARPIVNLXYLPAALLI
||||||||||||||||||||||||||||| ||||||||||||||||||||| ||||||||orf48-1 MNIHTLLSKQWTLPPFLPKRLLLSLLILLAPNAVFWVLALLTATARPIVNLDYLPAALLI
10 20 30 40 50 60
70 80 90 100 110 120orf48a.pep ALPWRXVKIXGVLAXWLAVLFDGLMMVIQLFPFMDLIGAINLVPFIXTAPALYQIMTGLL
||||| ||| |||| ||||||||||||||||||||||||||||||| |||| ||||||||orf48-1 ALPWRFVKIAGVLAFWLAVLFDGLMMVIQLFPFMDLIGAINLVPFILTAPAPYQIMTGLL
70 80 90 100 110 120
130 140 150 160 170 180orf48a.pep LLYMLAMPFVLQKAAAKTDFRHIAACAAVVVAAGYFTGHLSXYDRGRMANIFGANNFYYA
||||||||||||||||||||||||:|||||:|||||||||| ||||||||||||||||||orf48-1 LLYMLAMPFVLQKAAAKTDFRHIAVCAAVVAAAGYFTGHLSYYDRGRMANIFGANNFYYA
130 140 150 160 170 180
190 200 210 220 230 240orf48a.pep KSQAMLYTVSQNADFITAGLVDPVFLPLGNQQRAATHLNEPKSQKILFIVAESWGLPANP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf48-1 KSQAMLYTVSQNADFITAGLVDPVFLPLGNQQRAATHLNEPKSQKILFIVAESWGLPANP
190 200 210 220 230 240
250 260 270 280 290 300orf48a.pep ELQNATFAKLLAQKXRFSVWESGSFPFIGATIEGEMRELCAYGGLRGFALRRAPDEKFAR
||||||||||||| ||||||||||||||||:||||||||||||||||||||||||||||orf48-1 ELQNATFAKLLAQKDRFSVWESGSFPFIGATVEGEMRELCAYGGLRGFALRRAPDEKFAR
250 260 270 280 290 300
310 320 330 340 350 360orf48a.pep CLPNRLKQEGYATFAMHGAGSSLYDRFSWYPRAGFQEIKTAENLIGKKTCAIFGGVCDSE
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf48-1 CLPNRLKQEGYATFAMHGAGSSLYDRFSWYPRAGFQEIKTAENLIGKKTCAIFGGVCDSE
310 320 330 340 350 360
370 380 390 400 410 420orf48a.pep LFGEVSAXFKKHDKGLFYWMTLTSHADYPESDIFNHRLKCTEYGLPAETDXCRNFSLHTQ
||||||| |||||||||||||||||||||||||||||||||||||||||| |||||||||orf48-1 LFGEVSAFFKKHDKGLFYWMTLTSHADYPESDIFNHRLKCTEYGLPAETDLCRNFSLHTQ
370 380 390 400 410 420
430 440 450 460 470orf48a.pep FFDQLADLIQRPEMKGTEVIIVGDHPPPVGNLNETFRYLKQGHVXWLNFKIKX
|||||||||||||||||||||||||||||||||||||||||||| ||||||||orf48-1 FFDQLADLIQRPEMKGTEVIIVGDHPPPVGNLNETFRYLKQGHVAWLNFKIKX
430 440 450 460 470
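The identity figures quoted for each overlap can be approximated by counting matching columns in a pairwise alignment. The sketch below is an assumption about the bookkeeping, not the program used in the patent: it skips columns containing a gap (`-`) and treats every other column, including ambiguous `X` residues, as a plain character comparison, so its result may differ slightly from the reported percentages. The two strings are the final block of the ORF48a / ORF48-1 alignment above.

```python
def percent_identity(a: str, b: str) -> tuple[int, int, float]:
    """Return (matches, compared columns, percent identity) for two aligned rows."""
    if len(a) != len(b):
        raise ValueError("aligned rows must have the same length")
    # Keep only columns where both rows carry a residue (no gap character).
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(x == y for x, y in columns)
    return matches, len(columns), 100.0 * matches / len(columns)


# Last alignment block (positions 421 onward) of orf48a.pep versus orf48-1.
orf48a_tail = "FFDQLADLIQRPEMKGTEVIIVGDHPPPVGNLNETFRYLKQGHVXWLNFKIKX"
orf48_1_tail = "FFDQLADLIQRPEMKGTEVIIVGDHPPPVGNLNETFRYLKQGHVAWLNFKIKX"

matches, compared, pct = percent_identity(orf48a_tail, orf48_1_tail)
print(f"{matches}/{compared} identical columns ({pct:.1f}%)")
```

To score a whole overlap, the sequence rows of each block are concatenated per sequence before calling the function.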
Homology with a predicted ORF from N. gonorrhoeae
ORF48 shows 97.5% identity over a 119 amino acid overlap with a predicted ORF (ORF48ng) from N. gonorrhoeae:
orf48.pep MNIHTLLSKQWTLPPFLPKRLLLSLLILLAPNAVFWVLALLTATARPIVNLDYLPAALLI 60
          ||||:|||:|||||||||||||||||||||||||||||||||||||||||||||||||||
orf48ng   MNIHALLSEQWTLPPFLPKRLLLSLLILLAPNAVFWVLALLTATARPIVNLDYLPAALLI 60
orf48.pep ALPWRFVKIAGVLAFWLAVLFDGLMMVIQLFPFMDLIGAINLVPFILTAPAPYQIMTGL 119
          |||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
orf48ng   ALPWRFVKIAGVLAFWPAVLFDGLMMVIQLFPFMDLIGAINLVPFILTAPAPYQIMTGLL 120
预计ORF48ng核苷酸序列<SEQ ID 415>编码的蛋白质具有氨基酸序列<SEQ ID416>:1 MNIHALLSEQ WTLPPFLPKR LLLSLLILLA PNAVFWVLAL LTATARPIVN51 LDYLPAALLI ALPWRFVKIA GVLAFWPAVL FDGLMMVIQL FPFMDLIGAI101 NLVPFILTAP APYQIMTGLL LLYMLAMPFV LQKAAVKTDF RHIAVCAAVV151 AAARYFTGPF ELLRTGGRWQ YVQHRRLLLS GSRASFRRRQ KADVLRRLGN201 PYASMGNGG..进一步的工作鉴定出完整的淋球菌DNA序列<SEQ ID 417>:1 ATGAATATTC ACGCCCTGCT CTCCGAACAA TGGACGCTGC CGCCATTCCT51 GCCGAAACGG CTGCTGCTGT CCCTGCTGAT ACTGCTGGCC CCCAATGCGG101 TGTTTTGGGT TTTGGCACTG CTGACCGCCA CCGCCCGCCC GATTGTCAAT151 TTGGACTACC TTCCCGCCGC GCTGCTGATC GCCCTGCCTT GGCGTTTCGT201 CAAAATTGCC GGCGTATTGG CGTTTTGGCC GGCGGTTTTG TTTGACGGGC251 TGATGATGGT GATCCAACTC TTCCCTTTTA TGGACCTCAT CGGCGCCATC301 AACCTCGTCC CCTTCATCCT GACCGCCCCC GCCCCTTATC AGATAATGAC351 CGGGCTGTTG CTGCTGTATA TGCTGGCGAT GCCGTTTGTG TTGCAAAAAG401 CCGCCGTCAA AACCGACTTC CGACACATTG CCGTCTGTGC CGCCGTTGTG451 GCGGCAGCCG GCTATTTCAC CGGCCATTTG AGTTACTACG ACCGGGGGCG501 GATGGCCAAT ATCTTCGGCG CAAACAACTT CTATTACGCc aAAAGTCAGG551 CGATGCTCTA CACCGTCAGC CAGAATGCCG ACTTTATTAC CGCCGgcctG601 GTCGACCCCG TCTTCCTCCC CTTGGGCAAT CAGCAGCGTG CCGCCACGCG651 GCTGAGTGAG CCGAAATCTC AAAAAATCCT CTTTATCGTC GCCGAATCTT701 GGGGGCTGCC GGGCAATCCC GAGCTTCAAA ACGCCACTTT TGCCAAACTG 751 CTGGCGCAAA AAGACCGTTT TTCGGTTTGG GAAAGCGGCA GTTTTCCCTT801 CATCGGCGCG ACGGTCGAAG GCGAAATGCG CGAATTGTGC GCCTACGGCG851 GTTTGCGCGG GTTCGCACTG CGCCGCGCGC CCGACGAAAA ATTTGCCCGC901 TGCCTCCCCA ACCGTTTGAA ACAAGAAGGT TACGCCACCT TTGCGATGCA951 CGGCGCGGGT AGTTCGCTTT ACGACCGCTT CAGCTGGTAT CCGAGGGCGG1001 GCTTTCAAAA AATCAAAACC GCCGAAAACC TGATCGGTAA AAAAACCTGC1051 GCCATTTTCG GCGGCGTGTG CGACAGCGAG CTGTTCGGCG AAGTGTCGGC1101 ATTTTTCAAA AAACACGACA AGGGACTGTT TTACTGGATG ACGCTGACCA1151 GCCACGCCGA CTATCCCGAA TCCGACATTT TCAACCACAG GCTCAAATGC1201 ACCGAATACG GCCTGCCCGC CGAAACCGAC CTCTGCCGCA ATTTCAGCCT1251 GCACACCCAA TtcttcgACC AACTGGCGGA TTTGATCCGA CGCCCCGAAA1301 TGAAAGGCAC GGAAGTCATC ATCGTCGGCG ACCATCCGCC GCCCGTCGGC1351 AACCTCAATG AAACCTTCCG CTACCTCAAA CAGGGACACG TCGCCTGGCT1401 GCACTTCAAA 
ATCAAATAA
It encodes a protein having the amino acid sequence <SEQ ID 418; ORF48ng-1>:
1 MNIHALLSEQ WTLPPFLPKR LLLSLLILLA PNAVFWVLAL LTATARPIVN
51 LDYLPAALLI ALPWRFVKIA GVLAFWPAVL FDGLMMVIQL FPFMDLIGAI
101 NLVPFILTAP APYQIMTGLL LLYMLAMPFV LQKAAVKTDF RHIAVCAAVV
151 AAAGYFTGHL SYYDRGRMAN IFGANNFYYA KSQAMLYTVS QNADFITAGL
201 VDPVFLPLGN QQRAATRLSE PKSQKILFIV AESWGLPGNP ELQNATFAKL
251 LAQKDRFSVW ESGSFPFIGA TVEGEMRELC AYGGLRGFAL RRAPDEKFAR
301 CLPNRLKQEG YATFAMHGAG SSLYDRFSWY PRAGFQKIKT AENLIGKKTC
351 AIFGGVCDSE LFGEVSAFFK KHDKGLFYWM TLTSHADYPE SDIFNHRLKC
401 TEYGLPAETD LCRNFSLHTQ FFDQLADLIR RPEMKGTEVI IVGDHPPPVG
451 NLNETFRYLK QGHVAWLHFK IK*
ORF48ng-1 and ORF48-1 show 97.9% identity over a 472 amino acid overlap:
10 20 30 40 50 60orf48-1.pep MNIHTLLSKQWTLPPFLPKRLLLSLLILLAPNAVFWVLALLTATARPIVNLDYLPAALLI
||||:|||:|||||||||||||||||||||||||||||||||||||||||||||||||||orf48ng-1 MNIHALLSEQWTLPPFLPKRLLLSLLILLAPNAVFWVLALLTATARPIVNLDYLPAALLI
10 20 30 40 50 60
70 80 90 100 110 120orf48-1.pep ALPWRFVKIAGVLAFWLAVLFDGLMMVIQLFPFMDLIGAINLVPFILTAPAPYQIMTGLL
|||||||||||| |||||||||||||||||||||||||||||||||||||||||||||||orf48ng-1 ALPWRFVKIAGVLAFWPAVLFDGLMMVIQLFPFMDLIGAINLVPFILTAPAPYQIMTGLL
70 80 90 100 110 120
130 140 150 160 170 180orf48-1.pep LLYMLAMPFVLQKAAAKTDFRHIAVCAAVVAAAGYFTGHLSYYDRGRMANIFGANNFYYA
|||||||||||||||:||||||||||||||||||||||||||||||||||||||||||||orf48ng-1 LLYMLAMPFVLQKAAVKTDFRHIAVCAAVVAAAGYFTGHLSYYDRGRMANIFGANNFYYA
130 140 150 160 170 180
190 200 210 220 230 240orf48-1.pep KSQAMLYTVSQNADFITAGLVDPVFLPLGNQQRAATHLNEPKSQKILFIVAESWGLPANP
||||||||||||||||||||||||||||||||||||:|:||||||||||||||||||:||orf48ng-1 KSQAMLYTVSQNADFITAGLVDPVFLPLGNQQRAATRLSEPKSQKILFIVAESWGLPGNP
190 200 210 220 230 240
250 260 270 280 290 300orf48-1.pep ELQNATFAKLLAQKDRFSVWESGSFPFIGATVEGEMRELCAYGGLRGFALRRAPDEKFAR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf48ng-1 ELQNATFAKLLAQKDRFSVWESGSFPFIGATVEGEMRELCAYGGLRGFALRRAPDEKFAR
250 260 270 280 290 300
310 320 330 340 350 360orf48-1.pep CLPNRLKQEGYATFAMHGAGSSLYDRFSWYPRAGFQEIKTAENLIGKKTCAIFGGVCDSE
||||||||||||||||||||||||||||||||||||:|||||||||||||||||||||||orf48ng-1 CLPNRLKQEGYATFAMHGAGSSLYDRFSWYPRAGFQKIKTAENLIGKKTCAIFGGVCDSE
310 320 330 340 350 360
370 380 390 400 410 420orf48-1.pep LFGEVSAFFKKHDKGLFYWMTLTSHADYPESDIFNHRLKCTEYGLPAETDLCRNFSLHTQ
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf48ng-1 LFGEVSAFFKKHDKGLFYWMTLTSHADYPESDIFNHRLKCTEYGLPAETDLCRNFSLHTQ
370 380 390 400 410 420
430 440 450 460 470orf48-1.pep FFDQLADLIQRPEMKGTEVIIVGDHPPPVGNLNETFRYLKQGHVAWLNFKIKX
|||||||||:|||||||||||||||||||||||||||||||||||||:|||||orf48ng-1 FFDQLADLIRRPEMKGTEVIIVGDHPPPVGNLNETFRYLKQGHVAWLHFKIKX
430 440 450 460 470
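Based on this analysis, including the presence in the gonococcal protein of a putative leader sequence (double-underlined) and two putative transmembrane domains (single-underlined), it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 57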
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 419>:1 ..GTGAGCGGAC GTTACCGCGC TTTGGATCGC GTTTCCAAAA TCATCATCGT51 TACTTTGAGT ATCGCCACGC TTGCCGCCGC CGGCATCGCT ATGTCGCGCG101 GTATGCAGAT GCAGTCCGAT TTTATCGAGC CGACACCGTG GACGCTTGCC151 GGTTTGGGCT TCCTGATCGC GCTGATGGGC TGGATGCCCG CGCCGATTGA201 AATTTCCGCC ATCAATTCTT TGTGGGTAAC CGAAAAACAA CGCATCAATC251 CTTCCGAATA CCGCGACGGG ATTTTTGAAT TCAACGTCGG TTATATCGCC301 AGTGCGGTTT TGGCTTTGGT TTTCCTTGCA CTGGGCGC.G TAGCGCCGAA351 CGGCAACGGC GA.ACAGTGC AGATGGCGGG CGGCAAATAT AACGGGCAAT401 TGATCAATAT GTACGCC..它对应于氨基酸序列<SEQ ID 420;ORF53>:1 ..VSGRYRALDR VSKIIIVTLS IATLAAAGIA MSRGMQMQSD FIEPTPWTLA51 GLGFLIALMG WMPAPIEISA INSLWVTEKQ RINPSEYRDG IFEFNVGYIA101 SAVLALVFLA LGXVAPNGNG XTVQMAGGKY NGQLINMYA..进一步的工作揭示了完整的核苷酸序列<SEQ ID 421>:1 ATGTCCGAAC AACATATTTC GACTTGGAAA AGTAAAATCA ACGCATTGGG51 TCCGGGGATC ATGATGGCTT CGGCGGCGGT CGGCGGTTCG CACCTGATTG101 CCTCGACGCA GGCGGGCGCG CTTTACGGCT GGCAGATCGC GCTCATCATC151 ATCCTGACCA ACCTCTTCAA ATACCCGTTT TTCCGCTTCA GCGCGCATTA201 CACGCTGGAC ACGGGCAAGA GCCTGATTGA AGGTTATGCC GAGAAAAGCC251 GCGTTTATTT GTGGGTATTC CTGATTTTGT GCATCCTCTC CGCCACGATT301 AACGCGGGCG CGGTCGCCAT TGTAACCGCC GCCATCGTCA AAATGGCGAT351 TCCCTCGCTG ATGTTTGATG CCGGCACGGT TGCCGCCTTG ATTATGGCAT401 CCTGCCTGAT TATTTTGGTG AGCGGACGTT ACCGCGCTTT GGATCGCGTT451 TCCAAAATCA TCATCGTTAC TTTGAGTATC GCCACGCTTG CCGCCGCCGG501 CATCGCTATG TCGCGCGGTA TGCAGATGCA GTCCGATTTT ATCGAGCCGA551 CACCGTGGAC GCTTGCCGGT TTGGGCTTCC TGATCGCGCT GATGGGCTGG601 ATGCCCGCGC CGATTGAAAT TTCCGCCATC AATTCTTTGT GGGTAACCGA651 AAAACAACGC ATCAATCCTT CCGAATACCG CGACGGGATT TTTGATTTCA701 ACGTCGGTTA TATCGCCAGT GCGGTTTTGG CTTTGGTTTT CCTTGCACTG751 GGCGCGTTTG TGCAATACGG CAACGGCGAA GCAGTGCAGA TGGCGGGCGG801 CAAATATATC GGGCAATTGA TCAATATGTA CGCCGTTACC ATCGGCGGCT851 GGTCGCGCCC GCTGGTGGCG TTTATCGCGT TTGCCTGTAT GTACGGCACG901 ACGATTACCG TCGTGGACGG CTATGCCCGT GCCATTGCCG AACCCGTGCG951 CCTGCTGCGC GGAAAAGACA AAACGGGCAA CGCCGAATTC TTTGCCTGGA1001 ATATTTGGGT GGCGGGCAGC GGTTTGGCGG TGATTTTCTG GTTTGACGGC1051 GTAATGGCGA ATCTGCTCAA ATTTGCGATG 
ATTGCCGCTT TTGTGTCCGC1101 CCCTGTGTTT GCCTGGCTGA ATTACCGTTT GGTTAAAGGT GATGAAAAAC1151 ACAAACTCAC ATCAGGTATG AATGCCCTTG CATTGGCAGG CTTGATTTAT1201 CTGACCGGTT TTACCGTTTT GTTCTTATTG AATTTGGCGG GAATGTTCAA1251 ATGA它对应于氨基酸序列<SEQ ID 422;ORF53-1>:1 MSEQHISTWK SKINALGPGI MMASAAVGGS HLIASTQAGA LYGWQIALII51 ILTNLFKYPF FRFSAHYTLD TGKSLIEGYA EKSRVYLWVF LILCILSATI101 NAGAVAIVTA AIVKMAIPSL MFDAGTVAAL IMASCLIILV SGRYRALDRV151 SKIIIVTLSI ATLAAAGIAM SRGMQMQSDF IEPTPWTLAG LGFLIALMGW201 MPAPIEISAI NSLWVTEKQR INPSEYRDGI FDFNVGYIAS AVLALVFLAL251 GAFVQYGNGE AVQMAGGKYI GQLINMYAVT IGGWSRPLVA FIAFACMYGT301 TITVVDGYAR AIAEPVRLLR GKDKTGNAEF FAWNIWVAGS GLAVIFWFDG351 VMANLLKFAM IAAFVSAPVF AWLNYRLVKG DEKHKLTSGM NALALAGLIY401 LTGFTVLELL NLAGMFK*
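The sequence listings here are printed in groups of ten with a position number at the start of each row of fifty, and in this text those numbers are fused onto the preceding group. A minimal sketch for recovering the bare sequence and printing it back in a readable layout follows; `clean_listing` and `format_listing` are hypothetical helper names. Note that the leading `..` that marks a partial sequence is discarded by this cleaning step.

```python
import re


def clean_listing(text: str) -> str:
    """Drop position numbers, spaces and dots; keep letters and the '*' stop mark."""
    return re.sub(r"[^A-Za-z*]", "", text)


def format_listing(seq: str, per_line: int = 50, group: int = 10) -> str:
    """Lay a sequence out as numbered rows of space-separated groups."""
    lines = []
    for start in range(0, len(seq), per_line):
        row = seq[start:start + per_line]
        groups = " ".join(row[i:i + group] for i in range(0, len(row), group))
        lines.append(f"{start + 1:>5} {groups}")
    return "\n".join(lines)


# Start of SEQ ID 422 (ORF53-1) exactly as it appears in the fused text.
raw = ("1 MSEQHISTWK SKINALGPGI MMASAAVGGS HLIASTQAGA LYGWQIALII51 "
       "ILTNLFKYPF FRFSAHYTLD")

seq = clean_listing(raw)
print(format_listing(seq))
```

Because digits never occur in a valid residue or base, stripping every non-letter is enough to separate the fused position numbers from the sequence itself.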
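Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF53 shows 93.5% identity over a 139 amino acid overlap with an ORF (ORF53a) from strain A of N. meningitidis: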
10 20 30orf53.pep VSGRYRALDRVSKIIIVTLSIATLAAAGIA
||||||||||||||||||||||||||||||orf53a AAIVKMAIPSLMFDAGTVAALIMASCLIILVSGRYRALDRVSKIIIVTLSIATLAAAGIA
110 120 130 140 150 160
40 50 60 70 80 90orf53.pep MSRGMQMQSDFIEPTPWTLAGLGFLIALMGWMPAPIEISAINSLWVTEKQRINPSEYRDG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf53a MSRGMQMQSDFIEPTPWTLAGLGFLIALMGWMPAPIEISAINSLWVTEKQRINPSEYRDG
170 180 190 200 210 220
100 110 120 130 139orf53.pep IFEFNVGYIASAVLALVFLALGXVAPNGNGXTVQMAGGKYNGQLINMYA
||:||||||||||||||||||| : ||| :|||||||| ||||||||orf53a IFDFNVGYIASAVLALVFLALGAFVQYGNGEAVQMAGGKYIGQLINMYAVTIGGWSRPLV
230 240 250 260 270 280orf53a AFIAFACMYGTTITVVDGYARAIAEPVRLLRGKDKTGNAEFFAWNIWVAGSGLAVIFWFD
290 300 310 320 330 340全长ORF53a核苷酸序列<SEQ ID 423>是:1 ATGTCCGAAC AACATATTTC GACTTGGAAA AGTAAAATCA ACGCATTGGG51 ACCGGGGATT ATGATGGCTT CGGCGGCGGT CGGCGGTTCG CACCTGATTG101 CCTCGACGCA GGCGGGCGCG CTTTACGGCT GGCAGATCGC GCTCATCATC151 ATCCTGACCA ACCTCTTCAA ATACCCGTTT TTCCGCTTCA GCGCGCATTA201 CACGCTGGAC ACGGGCAAGA GCCTGATTGA AGGTTATGCC GAGAAAAGCC251 GCGTTTATTT GTGGGTATTC CTGATTTTGT GCATCCTCTC CGCCACGATT301 AACGCGGGCG CGGTCGCCAT TGTAACCGCC GCCATCGTCA AAATGGCGAT351 TCCCTCGCTG ATGTTTGATG CCGGCACGGT TGCCGCCTTG ATTATGGCAT401 CCTGCCTGAT TATTTTGGTG AGCGGACGTT ACCGCGCTTT GGATCGCGTT451 TCCAAAATCA TCATCGTTAC TTTGAGTATC GCCACGCTTG CCGCCGCCGG501 CATCGCTATG TCGCGCGGTA TGCAGATGCA GTCCGATTTT ATCGAGCCGA551 CACCGTGGAC GCTTGCCGGT TTGGGCTTCC TGATCGCGCT GATGGGCTGG601 ATGCCCGCGC CGATTGAAAT TTCCGCCATC AATTCTTTGT GGGTAACCGA651 AAAACAACGC ATCAATCCTT CCGAATACCG CGACGGGATT TTTGATTTCA701 ACGTCGGTTA TATCGCCAGT GCGGTTTTGG CTTTGGTTTT CCTTGCACTG 751 GGCGCGTTTG TGCAATACGG CAACGGCGAA GCAGTGCAGA TGGCGGGCGG801 CAAATATATC GGGCAATTGA TCAATATGTA CGCCGTTACC ATCGGCGGCT851 GGTCGCGCCC GCTGGTGCCG TTTATCGCGT TTGCCTGTAT GTACGGCACG901 ACGATTACCG TTGTGGACGG CTATGCCCGT GCCATTGCCG AACCCGTGCG951 CCTGCTGCGC GGAAAAGACA AAACGGGCAA CGCCGAATTC TTTGCCTGGA1001 ATATTTGGGT GGCGGGCAGC GGTTTGGCGG TGATTTTCTG GTTTGACGGC1051 GTAATGGCGA ATCTGCTCAA ATTTGCGATG ATTGCCGCTT TTGTGTCCGC1101 CCCTGTGTTT GCCTGGCTGA ATTACCGTTT GGTCAAAGGT GATGAAAAAC1151 ACAAACTCAC ATCAGGTATG AATGCCCTTG CATTGGCAGG CTTGATTTAT1201 CTGACCGGTT TTACCGTTTT GTTCTTATTG AATTTGGCGG GAATGTTCAA1251 ATGA它编码的蛋白质具有氨基酸序列<SEQ ID 424>:1 MSEQHISTWK SKINALGPGI MMASAAVGGS HLIASTQAGA LYGWQIALII51 ILTNLFKYPF FRFSAHYTLD TGKSLIEGYA EKSRVYLWVF LILCILSATI101 NAGAVAIVTA AIVKMAIPSL MFDAGTVAAL IMASCLIILV SGRYRALDRV151 SKIIIVTLSI ATLAAAGIAM SRGMQMQSDF IEPTPWTLAG LGFLIALMGW201 MPAPIEISAI NSLWVTEKQR INPSEYRDGI FDFNVGYIAS AVLALVFLAL251 GAFVQYGNGE AVQMAGGKYI GQLINMYAVT IGGWSRPLVA FIAFACMYGT301 TITVVDGYAR AIAEPVRLLR GKDKTGNAEF FAWNIWVAGS GLAVIFWFDG351 VMANLLKFAM IAAFVSAPVF AWLNYRLVKG DEKHKLTSGM 
NALALAGLIY
401 LTGFTVLFLL NLAGMFK*
ORF53a and ORF53-1 show 100.0% identity over a 417 amino acid overlap:
10 20 30 40 50 60orf53a.pep MSEQHISTWKSKINALGPGIMMASAAVGGSHLIASTQAGALYGWQIALIIILTNLFKYPF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf53-1 MSEQHISTWKSKINALGPGIMMASAAVGGSHLIASTQAGALYGWQIALIIILTNLFKYPF
10 20 30 40 50 60
70 80 90 100 110 120orf53a.pep FRFSAHYTLDTGKSLIEGYAEKSRVYLWVFLILCILSATINAGAVAIVTAAIVKMAIPSL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf53-1 FRFSAHYTLDTGKSLIEGYAEKSRVYLWVFLILCILSATINAGAVAIVTAAIVKMAIPSL
70 80 90 100 110 120
130 140 150 160 170 180orf53a.pep MFDAGTVAALIMASCLIILVSGRYRALDRVSKIIIVTLSIATLAAAGIAMSRGMQMQSDF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf53-1 MFDAGTVAALIMASCLIILVSGRYRALDRVSKIIIVTLSIATLAAAGIAMSRGMQMQSDF
130 140 150 160 170 180
190 200 210 220 230 240orf53a.pep IEPTPWTLAGLGFLIALMGWMPAPIEISAINSLWVTEKQRINPSEYRDGIFDFNVGYIAS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf53-1 IEPTPWTLAGLGFLIALMGWMPAPIEISAINSLWVTEKQRINPSEYRDGIFDFNVGYIAS
190 200 210 220 230 240
250 260 270 280 290 300orf53a.pep AVLALVFLALGAFVQYGNGEAVQMAGGKYIGQLINMYAVTIGGWSRPLVAFIAFACMYGT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf53-1 AVLALVFLALGAFVQYGNGEAVQMAGGKYIGQLINMYAVTIGGWSRPLVAFIAFACMYGT
250 260 270 280 290 300
310 320 330 340 350 360orf53a.pep TITVVDGYARAIAEPVRLLRGKDKTGNAEFFAWNIWVAGSGLAVIFWFDGVMANLLKFAM
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf53-1 TITVVDGYARAIAEPVRLLRGKDKTGNAEFFAWNIWVAGSGLAVIFWFDGVMANLLKFAM
310 320 330 340 350 360
370 380 390 400 410orf53a.pep IAAFVSAPVFAWLNYRLVKGDEKHKLTSGMNALALAGLIYLTGFTVLFLLNLAGMFKX
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf53-1 IAAFVSAPVFAWLNYRLVKGDEKHKLTSGMNALALAGLIYLTGFTVLFLLNLAGMFKX
370 380 390 400 410
Homology with a predicted ORF from N. gonorrhoeae
ORF53 shows 92.1% identity over a 139 amino acid overlap with a predicted ORF (ORF53ng) from N. gonorrhoeae:
orf53.pep VSGRYRALDRVSKIIIVTLSIATLAAAGIA 30
||||||||||||||||||||||||||||||orf53ng AAIVKMAIPSLMFDAGTVAALIMASCLIILVSGRYRALDRVSKIIIVTLSIATLAAAGIA 91orf53.pep MSRGMQMQSDFIEPTPWTLAGLGFLIALMGWMPAPIEISAINSLWVTEKQRINPSEYRDG 90
|||||||| |||||||||||||||||||||||||||||||||||||||||||||||||||orf53ng MSRGMQMQPDFIEPTPWTLAGLGFLIALMGWMPAPIEISAINSLWVTEKQRINPSEYRDG 151orf53.pep IFEFNVGYIASAVLALVFLALGXVAPNGNGXTVQMAGGKYNGQLINMYA 139
||:||||||||||||||||||| : ||| :|||:|||| ||||||||orf53ng IFDFNVGYIASAVLALVFLALGAFVQYGNGEAVQMGGGKYIGQLINMYAVTIGGGSRPLV 211
预计ORF53ng核苷酸序列<SEQ ID 425>编码的蛋白质具有氨基酸序列<SEQ ID426>:1 MPKKSCVYLW VFLILCIASA TINAGAVAIV TAAIVKMAIP SLMFDAGTVA51 ALIMASCLII LVSGRYRALD RVSKIIIVTL SIATLAAAGI AMSRGMQMQP101 DFIEPTPWTL AGLGFLIALM GWMPAPIEIS AINSLWVTEK QRINPSEYRD151 GIFDFNVGYI ASAVLALVFL ALGAFVQYGN GEAVQMGGGK YIGQLINMYA201 VTIGGGSRPL VAFIAFACMY GAASTVVDGY ARAIAEPVRL LRGKDKTARP251 IVLLEKLGGR HRFGRDFLV*进一步的分析进一步揭示了淋球菌的该部分DNA序列<SEQ ID 427>:1 ..aagaAAAGCT GCGTTTATTT GTGGGTTTTT TTGATTTTGT GTATCGCCTC51 CGCCACGATT AACGCGGGCG CGGTCGCCAT TGTAACCGCC GCCATCGTCA101 AAATGGCGAT TCCCTCGCTG ATGTTTGATG CCGGCACGGT TGCCGCCTTG151 ATTATGGCAT CCTGCCTGAT TATTTTGGTG AGCGGACGTT ACCGCGCTTT201 GGATCGTGTT TCCAAAATCA TCATTGTTAC TTTGAGCATC GCCACGCTTG251 CCGCCGCCGG CATCGCTATG TCGCGCGGTA TGCAGATGCA GCCCGATTTT301 ATCGAGCCGA CACCGTGGAC GCTTGCCGGT TTGGGCTTCC TGATCGCGCT351 GATGGGCTGG ATGCCCGCGC CGATCGAAAT TTCCGCCATC AATTCTTTGT401 GGGTAACCGA AAAACAACGC ATCAATCCTT CTGAATACCG CGACGGGATT451 TTCGATTTCA ACGTCGGTTA TATCGCcagT GCGGTTTTGG CTTTGGTTTT501 CCTTGCACTG GGCGCGTTTG TGCAATACGG CAACGGCGAA GCAGTGCAGA551 TGGCGGGCGG CAAATATATC GGGCAATTGA TTAATATGTA TGCCGTAACC601 ATCGGCGGCT GGTCTCGTCC GCTGGTGGCG TTTATCGCGT TTGCCTGTAT651 GTACGGCACG ACGATTACCG TTGTGGACGG TTATGCGCGT GCCATTGCCG701 AACCCGTGCG CCTGCTGCGC GGCAGGGATA AAACCGGCAA CGCCGAGTTG751 TTtgccTGGA ATATTTGGGT GGCGGGCAGC GGTTTGGCGG TGATTTTCTG801 GTTTGACggc gcaaTGGCgG AACtgcTCAA ATTTGCGATG ATtgccgcCT851 TTGTGTCCGC CCCTGTGTTC GCCTGGCTCA ACTACCGCCT CGTCAAAGGG901 GACAAACGCC ACAGGCTTAC CGCCGGTATG AACGCCCTTG CCATTGTCGG951 CCTGCTCTAC CTGGCCGGGT TTGCCGTTTT GTTCCTGTTG AACCTTACCG1001 GACTTTTGGC ATAG它对应于氨基酸序列<SEQ ID 428;ORF53ng-1>:1 ..KKSCVYLWVF LILCIASATI NAGAVAIVTA AIVKMAIPSL MFDAGTVAAL51 IMASCLIILV SGRYRALDRV SKIIIVTLSI ATLAAAGIAM SRGMQMQPDF101 IEPTPWTLAG LGFLIALMGW MPAPIEISAI NSLWVTEKQR INPSEYRDGI151 FDFNVGYIAS AVLALVFLAL GAFVQYGNGE AVQMAGGKYI GQLINMYAVT201 IGGWSRPLVA FIAFACMYGT TITVVDGYAR AIAEPVRLLR GRDKTGNAEL251 FAWNIWVAGS GLAVIFWFDG AMAELLKFAM IAAFVSAPVF AWLNYRLVKG301 DKRHRLTAGM NALAIYGLLY LAGFAYLFLL 
NLTGLLA*
ORF53ng-1 and ORF53-1 show 94.0% identity over a 336 amino acid overlap:
60 70 80 90 100 110orf53-1.pep ILTNLFKYPFFRFSAHYTLDTGKSLIEGYAEKSRVYLWVFLILCILSATINAGAVAIVTA
:|| ||||||||||| ||||||||||||||orf53ng-1 KKSCVYLWVFLILCIASATINAGAVAIVTA
10 20 30
120 130 140 150 160 170orf53-1.pep AIVKMAIPSLMFDAGTVAALIMASCLIILVSGRYRALDRVSKIIIVTLSIATLAAAGIAM
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf53ng-1 AIVKMAIPSLMFDAGTVAALIMASCLIILVSGRYRALDRVSKIIIVTLSIATLAAAGIAM
40 50 60 70 80 90
180 190 200 210 220 230orf53-1.pep SRGMQMQSDFIEPTPWTLAGLGFLIALMGWMPAPIEISAINSLWVTEKQRINPSEYRDGI
||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||orf53ng-1 SRGMQMQPDFIEPTPWTLAGLGFLIALMGWMPAPIEISAINSLWVTEKQRINPSEYRDGI
100 110 120 130 140 150
240 250 260 270 280 290orf53-1.pep FDFNVGYIASAVLALVFLALGAFVQYGNGEAVQMAGGKYIGQLINMYAVTIGGWSRPLVA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf53ng-1 FDFNVGYIASAVLALVFLALGAFVQYGNGEAVQMAGGKYIGQLINMYAVTIGGWSRPLVA
160 170 180 190 200 210
300 310 320 330 340 350orf53-1.pep FIAFACMYGTTITVVDGYARAIAEPVRLLRGKDKTGNAEFFAWNIWVAGSGLAVIFWFDG
|||||||||||||||||||||||||||||||:|||||||:||||||||||||||||||||orf53ng-1 FIAFACMYGTTITVVDGYARAIAEPVRLLRGRDKTGNAELFAWNIWVAGSGLAVIFWFDG
220 230 240 250 260 270
360 370 380 390 400 410orf53-1.pep VMANLLKFAMIAAFVSAPVFAWLNYRLVKGDEKHKLTSGMNALALAGLIYLTGFTVLFLL
:||:|||||||||||||||||||||||||||::|:||:||||||::||:||:||:|||||orf53ng-1 AMAELLKFAMIAAFVSAPVFAWLNYRLVKGDKRHRLTAGMNALAIVGLLYLAGFAVLFLL
280 290 300 310 320 330orf53-1.pep NLAGMFKX
||:|::orf53ng-1 NLTGLLAX
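Based on this analysis, including the presence in the gonococcal protein of a putative leader sequence (double-underlined) and several putative transmembrane domains (single-underlined), it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.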
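Putative transmembrane domains such as those mentioned above are commonly screened for with a sliding-window hydropathy profile. The text does not say which program produced its predictions, so the following is only an illustrative sketch using the Kyte-Doolittle scale; the 19-residue window and the 1.6 cutoff are conventional heuristics for that scale, and scoring unknown residues as 0.0 is an assumption.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 19) -> list[tuple[int, float]]:
    """Return (1-based centre position, mean hydropathy) for every full window."""
    scores = [KD.get(aa, 0.0) for aa in protein.upper()]  # unknown residue -> 0.0
    profile = []
    for i in range(len(scores) - window + 1):
        mean = sum(scores[i:i + window]) / window
        profile.append((i + window // 2 + 1, mean))
    return profile


def hydrophobic_centres(protein: str, window: int = 19, cutoff: float = 1.6) -> list[int]:
    """Centre positions whose window mean exceeds the cutoff."""
    return [pos for pos, mean in hydropathy_profile(protein, window) if mean > cutoff]


# First 60 residues of SEQ ID 422 (ORF53-1), copied from the listing.
n_term = "MSEQHISTWKSKINALGPGIMMASAAVGGSHLIASTQAGALYGWQIALIIILTNLFKYPF"
print(hydrophobic_centres(n_term))
```

Runs of consecutive centre positions above the cutoff mark candidate membrane-spanning segments; such a profile is a screening aid rather than a substitute for a dedicated topology predictor.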
Example 58
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 429>:1 ..TTGCGGGAAA CGGCATATGT TTTGGATAGT TTTGATCGTT ATTTTGTTGT51 TGCGCTTGCC GGCTTGTTTT TTGTCCGCGC ACAATCCGAA CGCGAGTGGA101 TGCGCGAGGT TTCTGCGTGG CAGGAAAAGA AAGGGGAAAA ACAGGCGGAG151 CTGCCTGAAA TCAAAGACGG TATGCCCGAT TTTCCCGAAC TTGCCCTGAT201 GCTTTTCCAC GCCGTCAAAA CGGCAGTGTA TTGGCTGTTT GTCGGTGTCG251 TCCGTTTCTG CCGAAACTAT CTGGCGCACG AATCCGAACC GGACAGGCCC301 GTTCCGCCT..它对应于氨基酸序列<SEQ ID 430;ORF58>:1 ..LRETAYVLDS FDRYFVVALA GLFFVRAQSE REWMREVSAW QEKKGEKQAE51 LPEIKDGMPD FPELALMLFH AVKTAVYWLF VGVVRFCRNY LAHESEPDRP101 VPP..进一步的工作揭示了其完整的核苷酸序列<SEQ ID 431>:1 ATGTTTTGGA TAGTTTTGAT CGTTATTTTG TTGCTTGCGC TTGCCGGCTT51 GTTTTTTGTC CGCGCACAAT CCGAACGCGA GTGGATGCGC GAGGTTTCTG101 CGTGGCAGGA AAAGAAAGGG GAAAAACAGG CGGAGCTGCC TGAAATCAAA151 GACGGTATGC CCGATTTTCC CGAACTTGCC CTGATGCTTT TCCATGCCGT201 CAAAACGGCA GTGTATTGGC TGTTTGTCGG TGTCGTCCGT TTCTGCCGAA251 ACTATCTGGC GCACGAATCC GAACCGGACA GGCCCGTTCC GCCTGCTTCT301 GCAAACCGTG CGGATGTTCC GACCGCATCC GACGGATATT CAGACAGTGG351 AAACGGGACG GAAGAAGCGG AAACGGAAGA AGCAGAAGCT GCGGAGGAAG401 AGGCTGCCGA TACGGAAGAC ATTGCAACTG CCGTAATCGA CAACCGCCGC451 ATCCCATTCG ACCGGAGTAT TGCTGAAGGG TTGATGCCGT CTGAAAGCGA501 AATTTCGCCC GTCCGTCCGG TTTTTAAAGA AATCACTTTG GAAGAAGCAA551 CGCGTGCTTT AAACAGCGCG GCTTTAAGGG AAACGA4AAA ACGCTATATC601 GATGCATTTG AGAAAAACGA AACAGCGGTC CCCAAAGTCC GCGTGTCCGA651 TACCCCGATG GAAGGGCTGC AGATTATCGG TTTGGACGAC CCTGTGCTTC701 AACGCACGTA TTCCCATATG TTCGATGCGG ACAAAGAAGC GTTTTCCGAG751 TCTGCGGATT ACGGATTTGA GCCGTATTTT GAGAAGCAGC ATCCGTCTGC801 CTTTTCTGCA GTCAAAGCCG AAAATGCACG GAATGCGCCG TTCCACCGTC851 ATGCAGGGCA GRGGAAAGGG CAGGCGGAGG CAAA4TCCCC GGATGTTTCC901 CAAGGGCAGT CCGTTTCAGA CGGCACGGCC GTCCGCGATG CCCGCCGCCG951 CGTTTCCGTC AATTTGAAAG AACCGAACAA GGCAACGGTT TCTGCGGAGG1001 CGCGAATTTC TCGCCTGATT CCGGAAAGTC AGACGGTTGT CGGGAAACGG1051 GATGTCGAAA TGCCGTCTGA AACCGAAAAT GTTTTCACGG AAACCGTTTC1101 GTCTGTGGGA TACGGCGGTC CGGTTTATGA TGAAACTGCC GATATCCATA1151 TTGAAGAACC TGCCGCGCCC GATGCTTGGG TGGTCGAACC ACCCGAAGTG1201 CCGAAAGTTC CCATGACCGC 
AATCGATATT CAGCCGCCGC CTCCCGTATC1251 GGAAATCTAC AACCGTACCT ATGAACCGCC GTCAGGATTC GAGCAGGTGC1301 AACGCAGCCG CATTGCCGAG ACCGACCATC TTGCCGATGA TGTTTTGAAT1351 GGAGGTTGGC AGGAGGAAAC CGCCGCTATT GCGGATGACG GCAGTGAAGG1401 TGCGGCAGAG CGGTCAAGCG GGCAATATCT GTCGGAAACC GAAGCGTTCG1451 GGCATGACAG TCAGGCGGTT TGTCCGTTTG AAAATGTGCC GTCTGAACGC1501 CCGTCCTGCC GGGTATCGGA TACGGAAGCG GATGAAGGGG CGTTCCCATC1551 TGAAGAAACC GGTGCGGTAT CCGAACACCT GCCGACAACC GACCTGCTTC1601 TGCCTCCGCT GTTCAATCCC GAGGCGACGC AAACCGAAGA AGAACTGTTG1651 GAAAACAGCA TCACCATCGA AGAAAAATTG GCGGAGTTCA AAGTCAAGGT1701 CAAGGTTGTC GATTCTTATT CCGGCCCCGT AATTACGCGT TATGAAATCG1751 AACCCGATGT CGGCGTGCGC GGCAATTCCG TTCTGAATCT GGAAAAAGAT1801 TTGGCGCGTT CGCTCGGCGT GGCTTCCATC CGCGTTGTCG AAACCATCCC1851 CGGCAAAACC TGCATGGGTT TGGAACTTCC GAACCCGAAA CGCCAAATGA1901 TACGCCTGAG CGAAATCTTC AATTCGCCCG AGTTTGCCGA ATCCAAATCC1951 AAGCTGACGC TCGCGCTCGG TCAGGACATC ACCGGACAGC CCGTCGTAAC2001 CGACTTGGGA AAAGCACCGC ATTTGTTGGT TGCCGGCACG ACCGGTTCGG2051 GCAAATCGGT GGGTGTCAAC GCGATGATTC TGTCTATGCT TTTCAAAGCC2101 GCGCCGGAAG ACGTGCGTAT GATTATGATC GATCCGAAAA TGCTGGAATT2151 GAGCATTTAC GAAGGCATCC CGCACCTGCT CGCCCCTGTC GTTACCGATA2201 TGAAGCTGGC GGCAAACGCG CTGAACTGGT GTGTTAACGA AATGGAAAAA2251 CGCTACCGCC TGATGAGCTT TATGGGCGTG CGTAATCTTG CGGGCTTCAA2301 TCAAAAAATC GCCGAAGCCG CAGCAAGGGG AGAAAAAATC GGCAATCCGT2351 TCAGCCTCAC GCCCGACGAT CCCGAACCTT TGGAAAAACT GCCGTTTATC2401 GTGGTCGTGG TCGATGAGTT TGCCGACCTG ATGATGACGG CAGGCAAGAA2451 AATCGAAGAA CTGATTGCCC GCCTCGCCCA AAAAGCCCGC GCGGCAGGCA2501 TCCATTTGAT TCTTGCCACA CAACGCCCCA GCGTCGATGT CATCACGGGT2551 CTGATTAAGG CGAACATCCC GACGCGTATC GCGTTCCAAG TGTCCAGCAA2601 AATCGACAGC CGCACGATTC TCGACCAAAT GGGCGCGGAA AACCTGCTCG2651 GTCAGGGCGA TATGCTGTTC CTGCTGCCGG GTACTGCCTA TCCGCAGCGC2701 GTTCACGGCG CGTTTGCCTC GGATGAAGAG GTGCACCGCG TGGTCGAATA2751 TTTGAAACAG TTTGGCGAAC CGGACTATGT TGACGATATT TTGAGCGGCG2801 GCGGCAGCGA AGAGCTGCCC GGCATCGGGC GCAGCGGCGA CGACGAAACC2851 GATCCGATGT ACGACGAGGC CGTATCCGTT GTCCTGAAAA CGCGCAAAGC2901 CAGCATTTCG 
GGCGTACAGC GCGCCTTGCG TATCGGCTAC AACCGCGCCG2951 CGCGTCTGAT TGACCAGATG GAGGCGGAAG GCATTGTGTC CGCACCGGAA3001 CACAACGGCA ACCGTACGAT TCTCGTCCCC TTGGACAATG CTTGA它对应于氨基酸序列<SEQ ID 432;ORF58-1>:1 MFWIVLIVIL LLALAGLFFV RAQSEREWMR EVSAWQEKKG EKQAELPEIK51 DGMPDFPELA LMLFHAVKTA VYWLFVGVVR FCRNYLAHES EPDRPVPPAS101 ANRADVPTAS DGYSDSGNGT EEAETEEAEA AEEEAADTED IATAVIDNRR151 IPFDRSIAEG LMPSESEISP VRPVFKEITL EEATRALNSA ALRETKKRYI201 DAFEKNETAV PKVRVSDTPM EGLQIIGLDD PVLQRTYSHA FDADKEAFSE251 SADYGFEPYF EKQHPSAFSA VKAENARNAP FHRHAGQGKG QAEAKSPDVS301 QGQSVSDGTA VRDARRRVSV NLKEPNKATV SAEARISRLI PESQTVVGKR351 DVEMPSETEN VFTETVSSVG YGGPVYDETA DIHIEEPAAP DAWVVEPPEV401 PKVPMTAIDI QPPPPVSEIY NRTYEPPSGF EQVQRSRIAE TDHLADDVLN451 GGWQEETAAI ADDGSEGAAE RSSGQYLSET EAFGHDSQAV CPFENVPSER501 PSCRVSDTEA DEGAFPSEET GAVSEHLPTT DLLLPPLFNP EATQTEEELL551 ENSITIEEKL AEFKVKVKVV DSYSGPVITR YEIEPDVGVR GNSVLNLEKD601 LARSLGVASI RVVETIPGKT CMGLELPNPK RQMIRLSEIF NSPEFAESKS651 KLTLALGQDI TGQPVVTDLG KAPHLLVAGT TGSGKSVGVN AMILSMLFKA701 APEDVRMIMI DPKMLELSIY EGIPHLLAPV VTDMKLAANA LNWCVNEMEK751 RYRLMSFMGV RNLAGFNQKI AEAAARGEKI GNPFSLTPDD PEPLEKLPFI801 VVVVDEFADL MMTAGKKIEE LIARLAQKAR AAGIHLILAT QRPSVDVITG851 LIKANIPTRI AFQVSSKIDS RTILDQMGAE NLLGQGDMLF LLPGTAYPQR901 VHGAFASDEE VHRVVEYLKQ FGEPDYVDDI LSGGGSEELP GIGRSGDDET951 DPMYDEAVSV VLKTRKASIS GVQRALRIGY NRAARLIDQM EAEGIVSAPE1001 HNGNRTILVP LDNA*
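The correspondence between the nucleotide sequence <SEQ ID 431> and the amino acid sequence <SEQ ID 432> can be spot-checked by translating codons with the standard genetic code. The following is a minimal sketch, not the method used in the original work; the function name `translate` and the choice to report unknown codons as `X` are assumptions made for illustration.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; whitespace is ignored.

    Codons containing ambiguous or unreadable bases are reported as 'X'
    (an illustrative convention, not taken from the source).
    """
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3  # drop a trailing partial codon
    return "".join(
        CODON_TABLE.get(seq[pos:pos + 3], "X") for pos in range(0, usable, 3)
    )


# First 30 and last 15 nucleotides of <SEQ ID 431>, copied from the listing.
first_30 = "ATGTTTTGGA TAGTTTTGAT CGTTATTTTG"
last_15 = "TTGGACAATG CTTGA"

print(translate(first_30))  # N-terminal residues of ORF58-1
print(translate(last_15))   # C-terminal residues, '*' marks the stop codon
```

The two fragments translate to the first ten residues and the last four residues plus stop of <SEQ ID 432>, which is a quick consistency check on the reading frame.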
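Computer analysis of this amino acid sequence predicted the indicated transmembrane regions and gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF58 shows 96.6% identity over an 89 aa overlap with an ORF (ORF58a) from strain A of N. meningitidis: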
10 20 30 40 50 60orf58.pep LRETAYVLDSFDRYFVVALAGLFFVRAQSEREWMREVSAWQEKKGEKQAELPEIKDGMPD
:::|||||||||||||||||||||||||||||||||||||||||||orf58a MFWIVLIVILLLALAGLFFVRAQSEREWMREVSAWQEKKGEKQAELPEIKDGMPD
10 20 30 40 50
70 80 90 100orf58.pep FPELALMLFHAVKTAVYWLFVGVVRFCRNYLAHESEPDRPVPP
|||||||||||||||||||||||||||||||||||||||||||orf58a FPELALMLFHAVKTAVYWLFVGVVRFCRNYLAHESEPDRPVPPASANRADVPTASDGYSD
60 70 80 90 100 110全长ORF58a核苷酸序列<SEQ ID 433>是:1 ATGTTTTGGA TAGTTTTGAT CGTTATTTTG TTGCTTGCGC TTGCCGGCTT51 GTTTTTTGTC CGCGCACAAT CCGAACGCGA GTGGATGCGC GAGGTTTCTG101 CGTGGCAGGA AAAGAAAGGG GAAAAACAGG CGGAGCTGCC TGAAATCAAA151 GACGGTATGC CCGATTTTCC CGAACTTGCC CTGATGCTTT TCCATGCCGT201 CAAAACGGCA GTGTATTGGC TGTTTGTCGG TGTCGTCCGT TTCTGCCGAA251 ACTATCTGGC GCACGAATCC GAACCGGACA GGCCCGTTCC GCCTGCTTCT301 GCAAATCGTG CGGATGTTCC GACCGCATCC GACGGATATT CAGACAGTGG351 AAACGGGACG GAAGAAGCGG AAACGGAAGA AGCAGAAGCT GCGGAGGAAG 401 AGGCTGCCGA TACGGAAGAC ATTGCAACTG CCGTAATCGA CAACCGCCGC451 ATCCCATTCG ACCGGAGTAT TGCTGAAGGG TTGATGCCGT CTGAAAGCGA501 AATTTCGCCC GTCCGTCCGG TTTTTAAGGA AATCACTTTG GAAGAAGCAA551 CGCGTGCTTT AAACAGCGCG GCTTTAAGGG AAACGAAAAA ACGCTATATC601 GATGCATTTG AGAAAAACGA AACAGCGGTC CCCAAAGTCC GCGTGTCCGA651 TACCCCGATG GAAGGGCTGC AGATTATCGG TTTGGACGAC CCTGTGCTTC701 AACGCACGTA TTCCCGTATG TTCGATGCGG ACAAAGAAGC GTTTTCCGAG751 TCTGCGGATT ACGGATTTGA GCCGTATTTT GAGAAGCAGC ATCCGTCTGC801 CTTTTCTGCA GTCAAAGCCG AAAATGCACG GAATGCGCCG TTCCGCCGTC851 ATGCAGGGCA GGGNAAAGGG CAGGCGGAGG CNAAATCCCC GGATGTTTCC901 CAAGGGCAGT CCGTTTCAGA CGGCACAGCC GTCCGCGATG CCNGCCGCCG951 CGTTTCCGTC AATTTGAAAG AACCGAACAA GGCAACGGTT TCTGCGGAGG1001 CGCGGATTTC GCGCCTGATT CCGGAAAGTC GGACGGTTGT CGGGAAACGG1051 GATGTCGAAA TGCCGTCTGA AACCGAAAAT GTTTTCACGG AAANTGTTTC1101 GTCTGTGGGA TACGGCGNTC CGGTTTATGA TGAAACTGCC GATATCCATA1151 TTGAAGAACC TGCCGCGCCC GATGCTTGGG TGGTCGAACC ACCCGAAGTG1201 CCGAAAGTTC CCATGCCCGC AATNGATATT CCGCCGCCGC CTCCCGTATC1251 GGAAATCTAC AACCGTACCT ATGAACCGCC GGCAGGATTC GAGCAGGTGC1301 AACGCAGCCG CATTGCCGAA ACCGATCATC TTGCCGATGA TGTTTTGAAT1351 GGAGGTTGGC AGGAGGAAAC CGCCGCTATT GCGAATGACG GCAGTGAGGG1401 TGTGGCAGAG CGGTCAAGCG GGCAATATTT GTCGGAAACC GAAGCGTTCG1451 GGCATGACAG TCAGCCGGTT TGTCCGTTTG AAAATGTCCC GTCTGA4CGC1501 CCGTCCCGCC GGGCATNGGA TACGGAAGCG GATGAAGGGG CGTTCCAATC1551 TGAAGAAACC GGTGCGGTAT CCGAACACCT GCCGACAACC GACCTGCTTC1601 TGCCGCCGCT GTTCAATCCC GGGGCGACGC AAACCGAAGA AGANCTGTTG1651 GANAACAGCA TCACCATCGA 
AGAAAAATNG GCGGAGTTCA AAGTCAAGGT1701 CAAGGTTGTC GATTCTTATT CCGGCCCCGT GATTACGCGT TATGAAATCG1751 AACCCGATGT CGGCGTGCGC GGCAATTCCG TTCTAAATCT GGAAAAAGAN1801 TTGGCGCGTT CGCTCGGCGT GGCTTCCATC CGCGTTGTCG AAACCATCCT1851 CGGCAAAACC TGTATGGGTT TGGAACTTCC GAACCCGAAA CGCCAAATGA1901 TACGCCTGAG CGAAATCTTC AATTCGCCCG AGTTTGCCGA ATCCAAATCC1951 AAGCTGACGC TCGCGCTCGG TCAGGACATC ACCGGACAGC CCGTCGTAAC2001 CGACTTGGGC AAAGCACCGC ATTTGTTGGT TGCCGGCACG ACCGGTTCGG2051 GCAAATCGGT GGGTGTCAAC GCGATGATTC TGTCTATGCT TTTCAAAGCC2101 GCGCCGGAAG ACGTGCGTAT GATTATGATC GATCCGAAAA TGCTGGAATT2151 GAGCATTTAC GAAGGCATCC CGCACCTGCT CGCCCCTGTC GTTACCGATA2201 TGAAGCTGGC GGCAAACGCG CTGAACTGGT GTGTTAACGA AATGGAAAAA2251 CGCTACCGCC TGATGAGCTT TATGGGCGTG CGCAATCTTG CGGGTNTCAA2301 TCAAAAAATC GCCGAAGCCG CAGCAAGGGG GGAGAAAATC GGCAACCCGT2351 TCAGCCTCAC GCCCGACAAT CCCGAACCTT TGGANAAATT GCCGTTTATC2401 GTGGTCGTGG TTGATGAGTT TGCCGACCTG ATGATGACGG CAGGCAAGAA2451 AATCGAAGAA CTGATTGCCC GCCTCGCCCA AAAAGCCCGC GCGGCAGGCA2501 TCCATCTTAT CCTTGCCACA CAACGCCCCA GTGTCGATGT CATCACGGGT2551 CTGATTAAGG CGAACATCCC GACGCGTATC GCGTTCCAAG TGTCCAGCAA2601 AATCGACAGC CGCACGATTC TTGACCAAAT GGGTGCGGAA AACCTGCTCG2651 GGCAGGGCGA TATGCTGTTC CTGCCGCCGG GTACGGCCTA TCCGCAGCGC2701 GTTCACGGCG CGTTTGCCTC GGATGAAGAG GTGCACCGCG TGGTCGAATA2751 TCTGAAACAG TTTGGCGAAC CGGACTATGT TGACGATATN TTGAGCGGCG2801 GTATGTCCGA CGATTTGCTG GGAATCAGCC GGAGCGGCGA CGGCGAAACC2851 GATCCGATGT ACGACGAGGC CGTGTCNGTT GTTTTGAAAA CGCGCAAAGC2901 CAGCATTTCT GGCGTGCAGC GCGCATTGCG TATCGGCTAT AATCGCGCCG2951 CGCGTCTGAT TGACCAGATG GAGGCGGAAG GCATTGTGTC CGCACCGGAA3001 CACAACGGCA ACCGTACGAT TCTCGTCCCC TTNGACAATG CTTGA它编码的蛋白质具有氨基酸序列<SEQ ID 434>:1 MFWIVLIVIL LLALAGLFFV RAQSEREWMR EVSAWQEKKG EKQAELPEIK51 DGMPDFPELA LMLFHAVKTA VYWLFVGVVR FCRNYLAHES EPDRPVPPAS101 ANRADVPTAS DGYSDSGNGT EEAETEEAEA AEEEAADTED IATAVIDNRR151 IPFDRSIAEG LMPSESEISP VRPVFKEITL EEATRALNSA ALRETKKRYI201 DAFEKNETAV PKVRVSDTPM EGLQIIGLDD PVLQRTYSRM FDADKEAFSE251 SADYGFEPYF EKQHPSAFSA VKAENARNAP FRRHAGAGKG AAEAKSPDVS 301 
QGASVSDGTA VRDAXRRVSV NLKEPNKATV SAEARISRLI PESRTVVGKR351 DVEMPSETEN VFTEXVSSVG YGXPVYDETA DIHIEEPAAP wDAWVVEPPEV401 PKVPMPAXDI PPPPPVSEIY NRTYEPPAGF EQVQRSRIAE TDHLADDVLN451 GGWQEETAAI ANDGSEGYAE RSSGQYLSET EAFGHDSQAV CPFENVPSER501 PSRRAXDTEA DEGAFQSEET GAVSEHLPTT DLLLPPLFNP GATQTEEXLL551 XNSITIEEKX AEFKVKYKVV DSYSGPVITR YEIEPDVGVR GNSYLNLEKX601 LARSLGVASI RVYETILGKT CMGLELPNPK RQMIRLSEIF NSPEFAESKS651 KLTLALGQDI TGQPVVTDLG KAPHLLVAGT TGSGKSVGVN AMILSMLFKA701 APEDYRMIMI DPKMLELSIY EGIPHLLAPV VTDMKLAANA LNWCYNEMEK751 RYRLMSFMGV RNLAGXNQKI AEAAARGEKI GNPFSLTPDN PEPLXKLPFI801 VVVVDEFADL MMTAGKKIEE LIARLAQKAR AAGIRLILAT QRPSVDVITG851 LIKANIPTRI AFQVSSKIDS RTILIQMGAE NLLGQGDMLF LPPGTAYPQR901 VHGAFASDEE VHRVYEYLKQ FGEPDYVDDX LSGGMSDDLL GISRSGDGET951 DPMYDEAVSV VLKTRKASIS GVQRALRIGY NRAARLIDQM EAEGIVSAPE1001 HNGNRTILVP XDNA*ORF58a和ORF58-1显示在1014个氨基酸的重叠区内有96.6%的相同性:
10 20 30 40 50 60orf58a.pep MFWIVLIVILLLALAGLFFVRAQSEREWMREVSAWQEKKGEKQAELPEIKDGMPDFPELA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf58-1 MFWIVLIVILLLALAGLFFVRAQSEREWMREVSAWQEKKGEKQAELPEIKDGMPDFPELA
10 20 30 40 50 60
70 80 90 100 110 120orf58a.pep LMLFHAVKTAVYWLFVGVVRFCRNYLAHESEPDRPVPPASANRADVPTASDGYSDSGNGT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf58-1 LMLFHAVKTAVYWLFVGVVRFCRNYLAHESEPDRPVPPASANRADVPTASDGYSDSGNGT
70 80 90 100 110 120
130 140 150 160 170 180orf58a.pep EEAETEEAEAAEEEAADTEDIATAVIDNRRIPFDRSIAEGLMPSESEISPVRPVFKEITL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf58-1 EEAETEEAEAAEEEAADTEDIATAVIDNRRIPFDRSIAEGLMPSESEISPVRPVFKEITL
130 140 150 160 170 180
190 200 210 220 230 240orf58a.pep EEATRALNSAALRETKKRYIDAFEKNETAVPKVRVSDTPMEGLQIIGLDDPVLQRTYSRM
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||:|orf58-1 EEATRALNSAALRETKKRYIDAFEKNETAVPKVRVSDTPMEGLQIIGLDDPVLQRTYSHM
190 200 210 220 230 240
250 260 270 280 290 300orf58a.pep FDADKEAFSESADYGFEPYFEKQHPSAFSAVKAENARNAPFRRHAGQGKGQAEAKSPDVS
|||||||||||||||||||||||||||||||||||||||||:||||||||||||| ||||orf58-1 FDADKEAFSESADYGFEPYFEKQHPSAFSAVKAENARNAPFHRHAGQGKGQAEAKSPDVS
250 260 270 280 290 300
310 320 330 340 350 360orf58a.pep QGQSVSDGTAVRDAXRRVSVNLKEPNKATVSAEARISRLIPESRTVVGKRDVEMPSETEN
|||||||||||||| ||||||||||||||||||||||||||||:||||||||||||||||orf58-1 QGQSVSDGTAVRDARRRVSVNLKEPNKATVSAEARISRLIPESQTVVGKRDVEMPSETEN
310 320 330 340 350 360
370 380 390 400 410 420orf58a.pep VFTEXVSSVGYGXPVYDETADIHIEEPAAPDAWVVEPPEVPKVPMPAXDIPPPPPVSEIY
||||:||||||| |||||||||||||||||||||||||||||||| | ||||||||||||orf58-1 VFTETVSSVGYGGPVYDETADIHIEEPAAPDAWVVEPPEVPKVPMTAIDIQPPPPVSEIY
370 380 390 400 410 420
430 440 450 460 470 480orf58a.pep NRTYEPPAGFEQVQRSRIAETDHLADDVLNGGWQEETAAIANDGSEGVAERSSGQYLSET
|||||||:|||||||||||||||||||||||||||||||||:|||||:||||||||||||orf58-1 NRTYEPPSGFEQVQRSRIAETDHLADDVLNGGWQEETAAIADDGSEGAAERSSGQYLSET
430 440 450 460 470 480
490 500 510 520 530 540orf58a.pep EAFGHDSQAVCPFENVPSERPSRRAXDTEADEGAFQSEETGAVSEHLPTTDLLLPPLFNP
|||||||||||||||||||||| |: ||||||||| ||||||||||||||||||||||||orf58-1 EAFGHDSQAVCPFENVPSERPSCRVSDTEADEGAFPSEETGAVSEHLPTTDLLLPPLFNP
490 500 510 520 530 540
550 560 570 580 590 600orf58a.pep GATQTEEXLLXNSITIEEKXAEFKVKVKVVDSYSGPVITRYEIEPDVGVRGNSVLNLEKX
|||||| || |||||||| |||||||||||||||||||||||||||||||||||||||orf58-1 EATQTEEELLENSITIEEKLAEFKVKVKVVDSYSGPVITRYEIEPDVGVRGNSVLNLEKD
550 560 570 580 590 600
610 620 630 640 650 660orf58a.pep LARSLGVASIRVVETILGKTCMGLELPNPKRQMIRLSEIFNSPEFAESKSKLTLALGQDI
|||||||||||||||| |||||||||||||||||||||||||||||||||||||||||||orf58-1 LARSLGVASIRVVETIPGKTCMGLELPNPKRQMIRLSEIFNSPEFAESKSKLTLALGQDI
610 620 630 640 650 660
670 680 690 700 710 720orf58a.pep TGQPVVTDLGKAPHLLVAGTTGSGKSVGVNAMILSMLFKAAPEDVRMIMIDPKMLELSIY
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf58-1 TGQPVVTDLGKAPHLLVAGTTGSGKSVGVNAMILSMLFKAAPEDVRMIMIDPKMLELSIY
670 680 690 700 710 720
730 740 750 760 770 780orf58a.pep EGIPHLLAPVVTDMKLAANALNWCVNEMEKRYRLMSFMGVRNLAGXNQKIAEAAARGEKI
||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||||orf58-1 EGIPHLLAPVVTDMKLAANALNWCVNEMEKRYRLMSFMGVRNLAGFNQKIAEAAARGEKI
730 740 750 760 770 780
790 800 810 820 830 840orf58a.pep GNPFSLTPDNPEPLXKLPFIVVVVDEFADLMMTAGKKIEELIARLAQKARAAGIHLILAT
|||||||||:|||| |||||||||||||||||||||||||||||||||||||||||||||orf58-1 GNPFSLTPDDPEPLEKLPFIVVVVDEFADLMMTAGKKIEELIARLAQKARAAGIHLILAT
790 800 810 820 830 840
850 860 870 880 890 900orf58a.pep QRPSVDVITGLIKANIPTRIAFQVSSKIDSRTILDQMGAENLLGQGDMLFLPPGTAYPQR
||||||||||||||||||||||||||||||||||||||||||||||||||| ||||||||orf58-1 QRPSVDVITGLIKANIPTRIAFQVSSKIDSRTILDQMGAENLLGQGDMLFLLPGTAYPQR
850 860 870 880 890 900
910 920 930 940 950 960orf58a.pep VHGAFASDEEVHRVVEYLKQFGEPDYVDDXLSGGMSDDLLGISRSGDGETDPMYDEAVSV
||||||||||||||||||||||||||||| |||| |::| ||:|||| ||||||||||||orf58-1 VHGAFASDEEVHRVVEYLKQFGEPDYVDDILSGGGSEELPGIGRSGDDETDPMYDEAVSV
910 920 930 940 950 960
970 980 990 1000 1010orf58a.pep VLKTRKASISGVQRALRIGYNRAARLIDQMEAEGIVSAPEHNGNRTILVPXDNAX
|||||||||||||||||||||||||||||||||||||||||||||||||| ||||orf58-1 VLKTRKASISGVQRALRIGYNRAARLIDQMEAEGIVSAPEHNGNRTILVPLDNAX
970 980 990 1000 1010
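The identity figures quoted for these alignments are counts of matching columns divided by the aligned length. A minimal sketch of that calculation follows, applied to one 60-residue row (positions 361 to 420) of the ORF58a/ORF58-1 alignment above. The function name `percent_identity` is hypothetical, and treating an ambiguous residue `X` as a mismatch is an assumption; the alignment program used for the original figures may handle it differently.

```python
def percent_identity(a: str, b: str, gap: str = "-"):
    """Return (identical columns, aligned columns, percent identity).

    Both strings must already be aligned (same length). Columns in which
    both sequences have a gap are ignored. An 'X' paired with a real
    residue simply fails the equality test and counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(a, b) if not (x == gap and y == gap)]
    identical = sum(1 for x, y in columns if x == y and x != gap)
    return identical, len(columns), 100.0 * identical / len(columns)


# Row 361-420 of the alignment, copied from the text above.
orf58a = "VFTEXVSSVGYGXPVYDETADIHIEEPAAPDAWVVEPPEVPKVPMPAXDIPPPPPVSEIY"
orf58_1 = "VFTETVSSVGYGGPVYDETADIHIEEPAAPDAWVVEPPEVPKVPMTAIDIQPPPPVSEIY"

identical, length, pct = percent_identity(orf58a, orf58_1)
print(f"{identical}/{length} identical ({pct:.1f}%)")
```

This row is one of the more divergent stretches because it contains three ambiguous residues in ORF58a; the 96.6% figure refers to the whole 1014 aa overlap.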
Homology with a predicted ORF from N. gonorrhoeae
ORF58 shows 100% identity over a 9 aa overlap with a predicted ORF (ORF58ng) from N. gonorrhoeae:
orf58.pep ALMLFHAVKTAVYWLFVGVVRFCRNYLAHESEPDRPVPP 103
|||||||||
orf58ng SEPDRPVPPASANRADVPTASDGYSDSGNG 30
The predicted ORF58ng nucleotide sequence <SEQ ID 435> encodes a protein having the partial amino acid sequence <SEQ ID 436>:
1 ..SEPDRPVPPA SANRADVPTA SDGYSDSGNG TEEAETEAAE AAEEEAADTE
51 DIATAVIDNR RIPFDRSIAE GLMQSESKTS PVRPVFKEIT LEEATRALSS
101 AALRETKKRY IDAFEKNGTA VPKVRVSDTP MEGLQIIGLD DPVLQRTYSR
151 MFDADKEAFS ESADYGFEPY FEKQHPSAFS AVKAENARNA PFRRHAGQEK
201 GQAEAKSPDV SQGQSVSDGT AVRDARRRVS VNLKEPNKAT VSAEARISRL
251 IPESRTVVGK RDVEMPSETE NVFTETVSSV GYGGPVYDEA ADIHIEEPAA
301 PDAWVVEPPE VPEVAVPEID ILPPPPVSEI YNRTYEPPAG FEQAQRSRIA
351 ETDHLAADVL NGGWQEETAA IADDGSEGAA ERSSGQYLSE TEAFGHDSQA
401 VCPFEDVPSE RPSCRVSDTE ADEGAFQSEE TGAVSEHLPT TDLLLPPLFN
451 PEATQTEEEL LENSITIEEK LAEFKVKVKV VDSYSGPVIT RYEIEPDVGV
501 RGNSVLNLEK DLARSLGVAS IRVVETIPGK TCMGLELPNP KRQMIRLSEI
551 FNSPEFAESK SKLTLALGQD ITGQPVVTDL GKAPHLLVAG TTGSGKSVGV
601 NAMILSMLFK AAPEDVRMIM IDPKMLELSI YEGITHLLAP VVTDMKLAAN
651 ALNWCVNEME KRYRLMSFMG VRNLAGFNQK IAEAAARGEK IGNPFSLTPD
701 DPEPLEKLPF IVVVVDEFAD LMMTAGKKIE ELIARLAQKA RAAGIHLILA
751 TQRPSVDVIT GLIKANIPTR IAFQVSSKID SRTILDQMGA ENLLGQGDML
801 FLPPGTAYPQ RVHGAFASDE EVHRVVEYLK QFGEPDYVDD ILSGGGSEEL
851 PGIGRSGDGE TDPMYDEAVS VVLKTRKASI SGVQRALRIG YNRAARLIDQ
901 MEAEGIVSAP EHNGNRTILV PLDNA*
This partial gonococcal sequence contains a predicted transmembrane region and a predicted ATP/GTP-binding site motif A (P-loop; double-underlined). In addition, it has a domain homologous to the FtsK cell division protein of E. coli. Alignment of ORF58ng with FtsK (accession number P46889) shows 65% amino acid identity over a 459 aa overlap:
ORF58ng: 467 IEEKLAEFKVKVKVVDSYSGPVITRYEIEPDVGVRGNSVLNLEKDLARSLGVASIRVVET 526
+E +LA+F++K VV+ GPVITR+E+ GV+ + NL +DLARSL ++RVVE
FtsK: 868 VEARLADFRIKADVVNYSPGPVITRFELNLAPGVKAARISNLSRDLARSLSTVAVRVVEV 927
ORF58ng: 527 IPGKTCMGLELPNPKRQMIRLSEIFNSPEFAESKSKLTLALGQDITGQPVVTDLGKAPHL 586
IPGK +GLELPN KRQ + L E+ ++ +F ++ S LT+ LG+DI G+PVV DL K PHL
FtsK: 928 IPGKPYVGLELPNKKRQTVYLREVLDNAKFRDNPSPLTVVLGKDIAGEPVVADLAKMPHL 987
ORF58ng: 587 LVAGTTGSGKSVGVNAMILSMLFKAAPEDVRMIMIDPKMLELSIYEGITHLLAPVVTDMK 646
LVAGTTGSGKSVGVNAMILSML+KA PEDVR IMIDPKMLELS+YEGI HLL VVTDMK
FtsK: 988 LVAGTTGSGKSVGVNAMILSMLYKAQPEDVRFIMIDPKMLELSVYEGIPHLLTEVVTDMK 1047
ORF58ng: 647 LAANALNWCVNEMEKRYRLMSFMGVRNLAGFNQKIAEAAARGEKIGNPFSLTPDDPEP-- 704
AANAL WCVNEME+RY+LMS +GVRNLAG+N+KIAEA I +P+ D +
FtsK: 1048 DAANALRWCVNEMERRYKLMSALGVRNLAGYNEKIAEADRMMRPIPDPYWKPGDSMDAQH 1107
ORF58ng: 705 --LEKLPFIVVVVDEFADLMMTAGKKIEELIARLAQKARAAGIHLILATQRPSVDVITGL 762
L+K P+IVV+VDEFADLMMT GKK+EELIARLAQKARAAGIHL+LATQRPSVDVITGL
FtsK: 1108 PVLKKEPYIVVLVDEFADLMMTVGKKVEELIARLAQKARAAGIHLVLATQRPSVDVITGL 1167
ORF58ng: 763 IKANIPTRIAFQVSSKIDSRTILDQMGAENLLGQGDMLFLPPGTAYPQRVHGAFASDEEV 822
IKANIPTRIAF VSSKIDSRTILDQ GAE+LLG GDML+ P + P RVHGAF D+EV
FtsK: 1168 IKANIPTRIAFTVSSKIDSRTILDQAGAESLLGMGDMLYSGPNSTLPVRVHGAFVRDQEV 1227
ORF58ng: 823 HRVVEYLKQFGEPDYVDDILSGGGSEELPGIGRSGDGETDPMYDEAVSVVLKTRKASISG 882
H VV+ K G P YVD I S SE G G G E DP++D+AV V + RKASISG
FtsK: 1228 HAVVQDWKARGRPQYVDGITSDSESEGGAG-GFDGAEELDPLFDQAVQFVTEKRKASISG 1286
ORF58ng: 883 VQRALRIGYNRAARLIDQMEAEGIVSAPEHNGNRTILVP 921
VQR RIGYNRAAR+I+QMEA+GIVS HNGNR +L PFtsK: 1287 VQRQFRIGYNRAARIIEQMEAQGIVSEQGHNGNREVLAP 1325对ORF58ng作进一步工作揭示了其完整的淋球菌DNA序列<SEO ID 437>是:1 ATGTTTTGGA TAGTTTTGAT CGTTATtgtg TTGCTTGCGC TTGCCGGCCT51 GTTTTTTGTC CGCGCACAAT CCGAACGCGA GTGGATGCGC GAGGTTTCTG101 CGTGGCAGGA AAAGAAAGGG GAAAAACAGG CGGAGCTGCC TGAAATCAAA151 GACGGTATGC CCGATTTTCC CGAGTTTTCC CTGATGCTTT TCCATGCCGT201 CAAAACGGCA GTGTATTGGC TGTTTGTCGG TGTCGTCCGT TTCTGCCGAA251 ACTATCTGGC GCACGAATCC GAACCGGACA GGCCCGTTCC GCCTGCTTCT301 GCAAACCGTG CGGATGTTCC GACCGCATCC GACGGGTATT CAGACAGTGG351 AAACGGGACG GAAGAAGCGG AAACGGAAGC AGCAGAAGCT GCGGAGGAAG401 AGGCTGCCgA TACgGAAGAC ATTGCAACTG CCGTAATCGA CAACCGCCGC451 ATCCcatTCG ACCGGAGTAT TGCTGAAGGG TTGATGCAGT CTGAAACCAA501 AACTTCGCCC GTCCGTCCGG TTTTTAAGGA AATCACTTTG GAAGAAGCAA551 CGCGTGCTTT AAGCAGCGCG GCTTTAAGGG AAACGAAAAA ACGCTATATC601 GATGCATTTG AGAAAAACGG AACAGCCGTC CCCAAAGTAC GCGTGTCCGA651 TACCCCGATG GAAGGGCTGC AGATTATCGG TTTGGACGAC CCTGTGCTTC701 AACGCACGTA TTCCCGTATG TTTGATGCGG ACAAAGAAGC GTTTTCCGAG751 TCTGCGGATT ACGGATTTGA GCCGTATTTT GAGAAGCAGC ATCCGTCTGC801 CTTTTCTGCA GTCAAAGCCG AAAATGCACG GAATGCGCCG TTCCGCCGTC851 ATGCAGGGCA GGAGAAAGGG CAGGCGGAGG CAAAATCCCC GGATGTTTCC901 CAAGGGCAGT CCGTTTCAGA CGGCACAGCC GTCCGCGATG CCCGCCGCCG951 CGTTTCCGTC AATTTGAAAG AACCGAACAA GGCAACGGTT TCTGCGGAGG1001 CGCGGATTTC GCGCCTGATT CCGGAAAGTC GGACGGTTGT CGGGAAACGG1051 GATGTCGAAA TGCCGTCTGA AACCGAAAAT GTTTTCACGG AAACCGTTTC1101 GTCTGTGGGA TACGGCGGTC CGGTTTATGA TGAAGCTGCC GATATCCATA1151 TTGAAGAGCC TGCCGCGCCC GATGCTTGGG TGGTCGAACC ACCCGAAGTG1201 CCGGAGGTAG CCGTACCCGA AATCGATATT CTGCCGCCGC CTCCCGTATC1251 GGAAATCTAC AACCGTACCT ATGAGCCGCC GGCAGGATTC GAGCAGGCGC1301 AACGCAGCCG CATTGCCGAA ACCGACCATC TTGCCGCTGA TGTTTTGAAT1351 GGAGGTTGGC AGGAGGAAAC CGCCGCTATT GCAGATGACG GCAGTGAGGG1401 TGCGGCAGAG CGGTCAAGCG GGCAATATCT GTCGGAAACC GAAGCGTTCG1451 GGCATGACAG TCAGGCGGTT TGTCCGTTTG AAGATGTGCC GTCTGAACGC1501 CCGTCCTGCC GGGTATCGGA TACGGAAGCG GATGAAGGGG CGTTCCAATC1551 GGAAGAGACC GGTGCGGTAT CCGAACACCT GCCGACAACC 
GACCTGCTTC1601 TGCCTCCGCT GTTCAATCCC GAGGCGACGC AAACCGAAGA AGAACTGTTG1651 GAAAACAGCA TCACCATCGA AGAAAAATTG GCGGAGTTCA AAGTCAAGGT1701 CAAGGTTGTC GATTCTTATT CCGGCCCCGT GATTACGCGT TATGAAATCG1751 AACCCGATGT CGGCGTGCGC GGCAATTCCG TTCTGAATTT GGAAAAAGAC1801 TTGGCGCGTT CGCTCGGCGT GGCTTCCATC CGCGTTGTCG AAACCATCCC1851 CGGCAAAACC TGCATGGGTT TGGAACTTCC GAACCCGAAA CGCCAAATGA1901 TACGCCTGAG CGAAATTTTC AATTCGCCCG AGTTTGCCGA ATCCAAATCC1951 AAGCTGACGC TCGCGCTCGG TCAGGACATT ACCGGACAGC CCGTCGTAAC2001 CGACTTGGGC AAAGCACCGC ATTTGCTGGT TGCCGGCACG ACCGGTTCGG2051 GCAAATCGGT GGGTGTCAAC GCGATGATTC TGTCTATGCT TTTCAAAGCC2101 GCGCCGGAAG ACGTGCGTAT GATTATGATC GATCCGAAAA TGCTGGAATT2151 GAGCATTTAC GAAGGCATCA CGCACCTGCT CGCCCCTGTC GTTACCGATA2201 TGAAGCTGG GGCAAACGCG CTGAACTGGT GTGTTAACGA AATGGAAAAA2251 CGCTACCGCC TGATGAGCTT TATGGGCGTG CGCAATCTTG CGGGCTTCAA2301 CCAAAAAATC GCCGAAGCCG CAGCAAGGGG AGAAAAAATC GGCAATCCGT2351 TCAGCCTCAC GCCCGACGAT CCCGAACCTT TGGAAAAACT GCCGTTTATC2401 GTGGTCGTGG TCGATGAGTT TGCCGATTTG ATGATGACGG CAGGCAAGAA2451 AATCGAAGAA CTGATTGCGC GCCTCGCCCA AAAAGCCCGC GCGGCAGGCA2501 TCCACCTTAT CCTTGCCACA CAACGCCCCA GCGTCGATGT CATCACGGGT2551 CTGATTAAGG CGAACATCCC GACGCGTATC GCGTTCCAAG TGTCCAGCAA2601 AATCGACAGC CGCACGATTC TCGACCAAAT GGGCGCGGAA AACCTGCTCG2651 GTCAGGGCGA TATGCTGTTC CTGCCGCCGG GTACTGCCTA TCCGCAGCGC2701 GTTCACCGCG CGTTTGCCTC GGATGAAGAG GTGCACCGCG TGGTCGAATA2751 TCTGAAGCAG TTTGGCGAGC CGGACTATGT TGACGATATT TTGAGCGGCG2801 GCGGCAGCGA AGAGCTGCCC GGCATCGGGC GCAGCGGCGA CGGCGAAACC2851 GATCCGATGT ACGACGAGGC CGTATCCGTT GTCCTGAAAA CGCGCAAAGC2901 CAGCATTTCG GGCGTACAGC GCGCCTTGCG CATCGGCTAC AACCGCGCCG2951 CGCGTCTGAT TGACCAAATG GAAGCGGAAG GCATTGTGTC CGCACCGGAA3001 CACAACGGCA ACCGTACGAT TCTCGTCCCC TTGGACAATG CTTGA它对应于氨基酸序列<SEQ ID 438;ORF58ng-1>:1 MFWIVLIVIV LLALAGLFFV RAQSEREWMR EVSAWQEKKG EKQAELPEIK51 DGMPDFPEFS LMLFHAVKTA VYWLFVGVVR FCRNYLAHES EPDRPVPPAS101 ANRADVPTAS DGYSDSGNGT EEAETEAAEA AEEEAADTED IATAVIDNRR151 IPFDRSIAEG LMQSESKTSP VRPVFKEITL EEATRALSSA ALRETKKRYI201 DAFEKNGTAV 
PKVRVSDTPM EGLQIIGLDD PVLQRTYSRM FDADKEAFSE251 SADYGFEPYF EKQHPSAFSA VKAENARNAP FRRHAGQEKG QAEAKSPDVS301 QGQSVSDGTA VRDARRRVSV NLKEPNKATV SAEARISRLI PESRTVVGKR351 DVEMPSETEN VFTETVSSVG YGGPVYDEAA DIHIEEPAAP DAWVVEPPEV401 PEVAVPEIDI LPPPPVSEIY NRTYEPPAGF EQAQRSRIAE TDHLAADVLN451 GGWQEETAAI ADDGSEGAAE RSSGQYLSET EAFGHDSQAV CPFEDVPSER501 PSCRVSDTEA DEGAFQSEET GAVSEHLPTT DLLLPPLFNP EATQTEEELL551 ENSITIEEKL AEFKVKVKVV DSYSGPVITR YEIEPDVGVR GNSVLNLEKD601 LARSLGVASI RVVETIPGKT CMGLELPNPK RQMIRLSEIF NSPEFAESKS651 KLTLALGQDI TGQPVVTDLG KAPHLLVAGT TGSGKSVGVN AMILSMLFKA701 APEDVRMIMI DPKMLELSIY EGITHLLAPV VTDMKLAANA LNWCVNEMEK751 RYRLMSFMGV RNLAGFNQKI AEAAARGEKI GNPFSLTPDD PEPLEKLPFI801 VVVVDEFADL MMTAGKKIEE LIARLAQKAR AAGIHLILAT QRPSVDVITG851 LIKANIPTRI AFQVSSKIDS RTILDQMGAE NLLGQGDMLF LPPGTAYPQR901 VHGAFASDEE VHRVVEYLKQ FGEPDYVDDI LSGGGSEELP GIGRSGDGET951 DPMYDEAVSV VLKTRKASIS GVQRALRIGY NRAARLIDQM EAEGIVSAPE1001 HNGNRTILVP LDNA*ORF58ng-1和ORF58-1显示在1014个氨基酸的重叠区内有97.2%的相同性:
10 20 30 40 50 60orf58-1.pep MFWIVLIVILLLALAGLFFVRAQSEREWMREVSAWQEKKGEKQAELPEIKDGMPDFPELA
|||||||||:|||||||||||||||||||:||||||||||||||||||||||||||||::orf58ng-1 MFWIVLIVIVLLALAGLFFVRAQSEREWMREVSAWQEKKGEKQAELPEIKDGMPDFPEFS
10 20 30 40 50 60
70 80 90 100 110 120orf58-1.pep LMLFHAVKTAVYWLFVGVVRFCRNYLAHESEPDRPVPPASANRADVPTASDGYSDSGNGT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf58ng-1 LMLFHAVKTAVYWLFVGVVRFCRNYLAHESEPDRPVPPASANRADVPTASDGYSDSGNGT
70 80 90 100 110 120
130 140 150 160 170 180orf58-1.pep EEAETEEAEAAEEEAADTEDIATAVIDNRRIPFDRSIAEGLMPSESEISPVRPVFKEITL
|||||| ||||||||||||||||||||||||||||||||||| |||: ||||||||||||orf58ng-1 EEAETEAAEAAEEEAADTEDIATAVIDNRRIPFDRSIAEGLMQSESKTSPVRPVFKEITL
130 140 150 160 170 180
190 200 210 220 230 240orf58-1.pep EEATRALNSAALRETKKRYIDAFEKNETAVPKVRVSDTPMEGLQIIGLDDPVLQRTYSHM
|||||||:|||||||||||||||||| |||||||||||||||||||||||||||||||:|orf58ng-1 EEATRALSSAALRETKKRYIDAFEKNGTAVPKVRVSDTPMEGLQIIGLDDPVLQRTYSRM
190 200 210 220 230 240
250 260 270 280 290 300orf58-1.pep FDADKEAFSESADYGFEPYFEKQHPSAFSAVKAENARNAPFHRHAGQGKGQAEAKSPDVS
|||||||||||||||||||||||||||||||||||||||||:||||| ||||||||||||orf58ng-1 FDADKEAFSESADYGFEPYFEKQHPSAFSAVKAENARNAPFRRHAGQEKGQAEAKSPDVS
250 260 270 280 290 300
310 320 330 340 350 360orf58-1.pep QGQSVSDGTAVRDARRRVSVNLKEPNKATVSAEARISRLIPESQTVVGKRDVEMPSETEN
|||||||||||||||||||||||||||||||||||||||||||:||||||||||||||||orf58ng-1 QGQSVSDGTAVRDARRRVSVNLKEPNKATVSAEARISRLIPESRTVVGKRDVEMPSETEN
310 320 330 340 350 360
370 380 390 400 410 420orf58-1.pep VFTETVSSVGYGGPVYDETADIHIEEPAAPDAWVVEPPEVPKVPMTAIDIQPPPPVSEIY
||||||||||||||||||:||||||||||||||||||||||:| : |||||||||||||orf58ng-1 VFTETVSSVGYGGPVYDEAADIHIEEPAAPDAWVVEPPEVPEVAVPEIDILPPPPVSEIY
370 380 390 400 410 420
430 440 450 460 470 480orf58-1.pep NRTYEPPSGFEQVQRSRIAETDHLADDVLNGGWQEETAAIADDGSEGAAERSSGQYLSET
|||||||:||||:|||||||||||| ||||||||||||||||||||||||||||||||||orf58ng-1 NRTYEPPAGFEQAQRSRIAETDHLAADVLNGGWQEETAAIADDGSEGAAERSSGQYLSET
430 440 450 460 470 480
490 500 510 520 530 540orf58-1.pep EAFGHDSQAVCPFENVPSERPSCRVSDTEADEGAFPSEETGAVSEHLPTTDLLLPPLFNP
||||||||||||||:|||||||||||||||||||| ||||||||||||||||||||||||orf58ng-1 EAFGHDSQAVCPFEDVPSERPSCRVSDTEADEGAFQSEETGAVSEHLPTTDLLLPPLFNP
490 500 510 520 530 540
550 560 570 580 590 600orf58-1.pep EATQTEEELLENSITIEEKLAEFKVKVKVVDSYSGPVITRYEIEPDVGVRGNSVLNLEKD
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf58ng-1 EATQTEEELLENSITIEEKLAEFKVKVKVVDSYSGPVITRYEIEPDVGVRGNSVLNLEKD
550 560 570 580 590 600
610 620 630 640 650 660orf58-1.pep LARSLGVASIRVVETIPGKTCMGLELPNPKRQMIRLSEIFNSPEFAESKSKLTLALGQDI
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf58ng-1 LARSLGVASIRVVETIPGKTCMGLELPNPKRQMIRLSEIFNSPEFAESKSKLTLALGQDI
610 620 630 640 650 660
670 680 690 700 710 720orf58-1.pep TGQPVVTDLGKAPHLLVAGTTGSGKSVGVNAMILSMLFKAAPEDVRMIMIDPKMLELSIY
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf58ng-1 TGQPVVTDLGKAPHLLVAGTTGSGKSVGVNAMILSMLFKAAPEDVRMIMIDPKMLELSIY
670 680 690 700 710 720
730 740 750 760 770 780orf58-1.pep EGIPHLLAPVVTDMKLAANALNWCVNEMEKRYRLMSFMGVRNLAGFNQKIAEAAARGEKI
||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf58ng-1 EGITHLLAPVVTDMKLAANALNWCVNEMEKRYRLMSFMGVRNLAGFNQKIAEAAARGEKI
730 740 750 760 770 780
790 800 810 820 830 840orf58-1.pep GNPFSLTPDDPEPLEKLPFIVVVVDEFADLMMTAGKKIEELIARLAQKARAAGIHLILAT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf58ng-1 GNPFSLTPDDPEPLEKLPFIVVVVDEFADLMMTAGKKIEELIARLAQKARAAGIHLILAT
790 800 810 820 830 840
850 860 870 880 890 900orf58-1.pep QRPSVDVITGLIKANIPTRIAFQVSSKIDSRTILDQMGAENLLGQGDMLFLLPGTAYPQR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf58ng-1 QRPSVDVITGLIKANIPTRIAFQVSSKIDSRTILDQMGAENLLGQGDMLFLPPGTAYPQR
850 860 870 880 890 900
910 920 930 940 950 960orf58-1.pep VHGAFASDEEVHRVVEYLKQFGEPDYVDDILSGGGSEELPGIGRSGDDETDPMYDEAVSV
||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||orf58ng-1 VHGAFASDEEVHRVVEYLKQFGEPDYVDDILSGGGSEELPGIGRSGDGETDPMYDEAVSV
910 920 930 940 950 960
970 980 990 1000 1010orf58-1.pep VLKTRKASISGVQRALRIGYNRAARLIDQMEAEGIVSAPEHNGNRTILVPLDNAX
|||||||||||||||||||||||||||||||||||||||||||||||||||||||orf58ng-1 VLKTRKASISGVQRALRIGYNRAARLIDQMEAEGIVSAPEHNGNRTILVPLDNAX
970 980 990 1000 1010
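The ATP/GTP-binding site motif A (P-loop) predicted for the gonococcal sequence can be located with a simple pattern search. The sketch below uses the PROSITE consensus for this motif, [AG]-x(4)-G-K-[ST] (entry PS00017), written as a regular expression. The function name `find_p_loops` is hypothetical, and the fragment is residues 671 to 700 of ORF58ng-1 <SEQ ID 438> as numbered in the listing above; this is an illustration, not necessarily the tool used in the original analysis.

```python
import re

# PROSITE PS00017, ATP/GTP-binding site motif A (P-loop): [AG]-x(4)-G-K-[ST]
WALKER_A = re.compile(r"[AG].{4}GK[ST]")


def find_p_loops(protein: str, offset: int = 0):
    """Return (1-based start position, matched residues) for each hit.

    `offset` is added to every position so that a fragment can be
    reported in the numbering of the full-length protein.
    Note: re.finditer reports non-overlapping matches only.
    """
    return [(m.start() + 1 + offset, m.group())
            for m in WALKER_A.finditer(protein)]


# Residues 671-700 of ORF58ng-1, copied from the sequence listing.
FRAGMENT = "KAPHLLVAGTTGSGKSVGVNAMILSMLFKA"

print(find_p_loops(FRAGMENT))              # position within the fragment
print(find_p_loops(FRAGMENT, offset=670))  # position in full-length numbering
```

The hit falls in the stretch LVAGTTGSGKSVGVNAMILSML, which is also conserved residue for residue in the E. coli FtsK alignment.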
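In addition, ORF58ng-1 shows significant homology with the E. coli protein FtsK:
sp|P46889|FTSK_ECOLI CELL DIVISION PROTEIN FTSK >gi|1651412|gnl|PID|d1015290 (D1 division protein FtsK [Escherichia coli] >gi|1651418|gnl|PID|d1015296 (D90727) cell division protein FtsK [Escherichia coli] >gi|1787117 (AE000191) cell division protein FtsK [Escherichia coli] Length = 1329
Score = 576 bits (1469), Expect = e-163
Identities = 301/459 (65%), Positives = 353/459 (76%), Gaps = 5/459 (1%)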
Query: 556 IEEKLAEFKVKVKVVDSYSGPVITRYEIEPDVGVRGNSVLNLEKDLARSLGVASIRVVET 615
+E +LA+F++K VV+ GPVITR+E+ GV+ +NL +DLARSL ++RVVE
Sbjct: 868 VEARLADFRIKADVVNYSPGPVITRFELNLAPGVKAARISNLSRDLARSLSTVAVRVVEV 927
Query: 616 IPGKTCMGLELPNPKRQMIRLSEIFNSPEFAESKSKLTLALGQDITGQPVVTDLGKAPHL 675
IPGK +GLELPN KRQ + L E+ ++ +F ++ S LT+ LG+DI G+PVV DL K PHL
Sbjct: 928 IPGKPYVGLELPNKKRQTVYLREVLDNAKFRDNPSPLTVVLGKDIAGEPVVADLAKMPHL 987
Query: 676 LVAGTTGSGKSVGVNAMILSMLFKAAPEDVRMIMIDPKMLELSIYEGITHLLAPVVTDMK 735
LVAGTTGSGKSVGVNAMILSML+KA PEDVR IMIDPKMLELS+YEGI HLL VVTDMK
Sbjct: 988 LVAGTTGSGKSVGVNAMILSMLYKAQPEDVRFIMIDPKMLELSVYEGIPHLLTEVVTDMK 1047
Query: 736 LAANALNWCVNEMEKRYRLMSFMGVRNLAGFNQKIAEAAARGEKIGNPFSLTPDDPEP-- 793
AANAL WCVNEME+RY+LMS +GVRNLAG+N+KIAEA I +P+ D +
Sbjct: 1048 DAANALRWCVNEMERRYKLMSALGVRNLAGYNEKIAEADRMMRPIPDPYWKPGDSMDAQH 1107
Query: 794 --LEKLPFIVVVVDEFADLMMTAGKKIEELIARLAQKARAAGIHLILATQRPSVDVITGL 851
L+K P+IVV+VDEFADLMMT GKK+EELIARLAQKARAAGIHL+LATQRPSVDVITGL
Sbjct: 1108 PVLKKEPYIVVLVDEFADLMMTVGKKVEELIARLAQKARAAGIHLVLATQRPSVDVITGL 1167
Query: 852 IKANIPTRIAFQVSSKIDSRTILDQMGAENLLGQGDMLFLPPGTAYPQRVHGAFASDEEV 911
IKANIPTRIAF VSSKIDSRTILDQ GAE+LLG GDML+ P + P RVHGAF D+EV
Sbjct: 1168 IKANIPTRIAFTVSSKIDSRTILDQAGAESLLGMGDMLYSGPNSTLPVRVHGAFVRDQEV 1227
Query: 912 HRVVEYLKQFGEPDYVDDILSGGGSEELPGIGRSGDGETDPMYDEAVSVVLKTRKASISG 971
H VV+ K G P YVD I S SE G G G E DP++D+AV V + RKASISG
Sbjct: 1228 HAVVQDWKARGRPQYVDGITSDSESEGGAG-GFDGAEELDPLFDQAVQFVTEKRKASISG 1286
Query: 972 VQRALRIGYNRAARLIDQMEAEGIVSAPEHNGNRTILVP 1010
VQR RIGYNRAAR+I+QMEA+GIVS HNGNR +L P
Sbjct: 1287 VQRQFRIGYNRAARIIEQMEAQGIVSEQGHNGNREVLAP 1325
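The percentages in the search report above follow directly from the reported counts (301 identities, 353 positives and 5 gaps over 459 aligned columns). A short worked check is sketched below; the helper name `report_percentages` is hypothetical, and the use of integer truncation is an inference from the numbers, since conventional rounding would give 66% and 77% rather than the reported 65% and 76%.

```python
def report_percentages(counts: dict, aligned_length: int) -> dict:
    """Convert raw column counts into whole-number percentages.

    Integer (floor) division is used because it reproduces the figures
    in the report; this is an inference, not documented behaviour.
    """
    return {name: (100 * n) // aligned_length for name, n in counts.items()}


# Counts copied from the report: identities, positives, gaps over 459 columns.
counts = {"identities": 301, "positives": 353, "gaps": 5}
report = report_percentages(counts, aligned_length=459)

for name, pct in report.items():
    print(f"{name}: {counts[name]}/459 ({pct}%)")
```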
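Based on this analysis, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful as antigens for vaccines or diagnostics, or for raising antibodies.
Example 59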
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 439>:1 ATGATTTATC AAAGAAACCT CATCAAAGAA CTCTCTTTTA CCGCCGTCGG51 CATTTTCGTC GTCCTCTTGG CGGTATTGGT CTCCACGCAG GCAATCAACC101 TGCTCGGCCG TGCCGCCGAC GGGC..GTGA TCGCCATCGA TGCCGTGTTG151 GCATTGGTCG GCTTCTGGGT C......... .......... ..........
//901 .........A TTGCCATCGG TTTGTTTTTA ATTTACCAAA ACGGGCTGAC951 CCTGCTTTTT GAAGCCGTGG AAGACGGCAA AATCCATTTT TGGCTCGGAC1001 TGCTGCCTAT GCACATTATC ATGTTTGTCC TTGCACTCAT CCTGTTGCGC1051 GTCCGCAGTA TGCCCAGCCA GCCCTTCTGG CAGGCGGTTG GCAAAAGTCT1101 GACATTGAAA GGCGGAAAAT GA它对应于氨基酸序列<SEQ ID 440;ORF101>:1 MIYQRNLIKE LSFTAVGIFV VLLAVLVSTQ AINLLGRAAD GXVIAIDAVL51 ALVGFWV... .......... .......... .......... ..........
//301 ...IAIGLFL IYQNGLTLLF EAVEDGKIHF WLGLLPMHII MFVLALILLR351 VRSMPSQPFW QAVGKSLTLK GGK*进一步的工作揭示了完整的核苷酸序列<SEQ ID 441>:1 ATGATTTATC AAAGAAACCT CATCAAAGAA CTCTCTTTTA CCGCCGTCGG51 CATTTTCGTC GTCCTCTTGG CGGTATTGGT CTCCACGCAG GCAATCAACC101 TGCTCGGCCG TGCCGCCGAC GGGCGTGTCG CCATCGATGC CGTGTTGGCA151 TTGGTCGGCT TCTGGGTCAT CGGTATGACG CCGCTTTTGC TGGTGTTGAC201 CGCATTTATC AGTACGTTGA CCGTGTTGAC CCGCTACTGG CGCGACAGCG251 AAATGTCGGT CTGGCTATCC TGCGGATTGG CATTGAAACA ATGGATACGC301 CCGGTGATGC AGTTTGCCGT GCCGTTTGCC GTTTTGGTTG CCGTCATGCA351 GCTTTGGGTG ATACCGTGGG CAGAGCTACG CAGCCGCGAA TACGCTGAAA401 TCCTGAAGCA GAAGCAGGAA TTGTCTTTGG TGGAGGCAGG CGAGTTCAAC451 AGTTTGGGCA AGCGCAACGG CAGGGTTTAT TTTGTCGAAA CCTTCGATAC501 CGAATCCGGC ATCATGAAAA ACCTGTTCCT GCGCGAACAG GACAAAAACG551 GCGGCGACAA CATCATCTTC GCCAAAGAAG GTAACTTCTC GCTGAACGAC601 AACAAACGCA CGCTCGAATT GCGCCACGGC TACCGTTACA GCGGCACGCC651 CGGACGCGCC GACTACAATC AGGTTTCCTT CCAAAAACTC AACCTGATTA701 TCAGCACCAC GCCCAAACTC ATCGACCCCG TTTCCCACCG CCGTACCATT751 CCGACCGCCC AACTGATTGG CAGCAGCAAC CCGCAACATC AGGCGGAATT801 GATGTGGCGC ATCTCGCTGA CCGTCAGCGT CCTCCTACTC TGCCTGCTTG851 CCGTGCCGCT TTCCTATTTC AACCCGCGCA GCGGACATAC CTACAATATC901 TTGATTGCCA TCGGTTTGTT TTTAATTTAC CAAAACGGGC TGACCCTGCT951 TTTTGAAGCC GTGGAAGACG GCAAAATCCA TTTTTGGCTC GGACTGCTGC1001 CTATGCACAT TATCATGTTT GCCGTTGCAC TCATCCTGTT GCGCGTCCGC1051 AGTATGCCCA GCCAGCCCTT CTGGCAGGCG GTTGGCAAAA GTCTGACATT1101 GAAAGGCGGA AAATGA它对应于氨基酸序列<SEQ ID 442:ORF101-1>:1 MIYQRNLIKE LSFTAVGIFV VLLAVLVSTQ AINLLGRAAD GRVAIDAVLA51 LVGFWNIGMT PLLLVLTAFI STLTVLTRYW RDSEMSVWLS CGLALKQWIR101 PVMQFAVPFA VLVAVMQLWV IPQAELRSRE YAEILKQKQE LSLVEAGEFN151 SLGKRNGRVY FVETFDTESG IMKNLFLREQ DKNGGDNIIF AKEGNFSLND201 NKRTLELRHG YRYSGTPGRA DYNQVSFQKL NLIISTTPKL IDPVSHRRTI251 PTAQLIGSSN PQHQAELMWR ISLTVSVLLL CLLAVPLSYF NPRSGHTYNI301 LIAIGLFLIY QNGLTLLFEA VEDGKIHFWL GLLPMHIIMF AVALILLRVR351 SMPSQPFWQA VGKSLTLKGG K*该氨基酸序列的计算机分析给出了下列结果:
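Homology with a predicted ORF from N. meningitidis (strain A)
ORF101 shows 91.2% identity over a 57 aa overlap and 95.7% identity over a 69 aa overlap with an ORF (ORF101a) from strain A of N. meningitidis: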
10 20 30 40 50orf101.pep MIYQRNLIKELSFTAVGIFVVLLAVLVSTQAINLLGRAADGXVIAIDAVLALVGFWVX
|||||||||||||||||||||||||||||||||||| ||| ||||||||||||||orf101a MIYQRNLIKELSFTAVGIFVVLLAVLVSTQAINLLGXAADXRX-AIDAVLALVGFWVXXM
10 20 30 40 50
//
90 100 110orf101.pep .............................IAIGLFLIYQNGLTLLFEAVEDGKIHFWLGL
||||||||||||||||||||||||||||||orf101a LTVSVLLLCLLAVPLSYFNPRSGHTYNILXAIGLFLIYQNGLTLLFEAVEDGKIHFWLGL
280 290 300 310 320 330
120 130 140 150orf101.pep LPMHIIMFVLALILLRVRSMPSQPFWQAVGKSLTLKGGKX
|||||||||:|::|||||||||||||||||||||||||||orf101a LPMHIIMFVIAIVLLRVRSMPSQPFWQAVGKSLTLKGGKX
340 350 360 370全长ORF101a核苷酸序列<SEQ ID 443>是:1 ATGATTTATC AAAGAAACCT CATCAAAGAA CTCTCTTTTA CCGCCGTCGG51 CATTTTCGTC GTCCTCTTGG CGGTATTGGT CTCCACGCAG GCAATCAACC101 TGCTCGGCCN TGCCGCCGAC NGGCGTNTCG CCATCGATGC CGTGTTGGCA151 TTGGTCGGCT TCTGGGTCNN NNGNATGACG CCGCTTTTGC TNGTGTTGAC201 CGCATTTATC AGTACGTTGA CCGTGTTGAC CCGCTACTGG CGNGACAGCG251 AAATGTCGGT CTGGNTATCC TGCGGATTGG CATTGAAACA ATGGATACGC301 CCGGTGATGC AGTTTGCCGT GCCGTTTGCC GTTTTGGTTG CCGTCATGCA351 GCTTTGGGTG ATACCGTGGG CAGAGCTACG CAGCCGCGAA TACGCTGAAA401 TCCTGAAGCA GAAGCAGGAA TTGTCTTTGG TGGAGGCAGG CGGGTTCAAC451 AGTTTGGGCA AGCGCAACGG CAGGGTTTAT TTTGTCGAAA CCTTCGATAC501 CGAATCCGGC ATCATGAAAA ACCTGTTCCT GCGCGAACAG GACAAAAACG551 GCGGCGACAA CATCATCTTC NCCAAAGAAA GTAACTTCTC GCTGAACGAC601 AACAAACGCA CGCTCGAATT GCGCCACGGC TACCGTTACA GCGGCACGCC651 CGGACGCGCC GACTACAATC AGGTTTCCTT CCNAAAACTC AACCTGATTA701 TCAGCACCAC GCCCAAACTC ATCGACCCCG TTTCCCACCG CCGTACNATN751 CCNACNGCCC AACTGATTGG CAGCAGCAAC CCGCAACATC ANGCGGAATT801 GATGTGGCGC ATCTCGCTGA CCGTCAGCGT CCTCCTACTC TGCCTGCTTG851 CCGTGCCGCT TTCCTATTTC AACCCGCGCA GCGGACATAC CTACAATATC901 TTGANTGCCA TCGGTTTGTT TTTAATTTAC CAAAACGGGC TGACCCTGCT951 TTTTGAAGCC GTGGAAGACG GCAAAATCCA TTTTTGGCTC GGACTGCTGC1001 CTATGCACAT CATCATGTTC GTCATCGCAA TCGTACTTCT GCGCGTCCGC1051 AGCATGCCCA GCCAGCCCTT CTGGCAGGCG GTTGGCAAAA GTCTGACATT1101 GAAAGGCGGA AAATGA它编码的蛋白质具有氨基酸序列<SEQ ID 444>:1 MIYQRNLIKE LSFTAVGIFV VLLAVLVSTQ AINLLGXAAD XRXAIDAVLA51 LVGFWVXXMT PLLLVLTAFI STLTVLTRYW RDSEMSVWXS CGLALKQWIR101 PVMQFAVPFA VLVAVMQLWV IPWAELRSRE YAEILKQKQE LSLVEAGGFN151 SLGKRNGRVY FVETFDTESG IMKNLFLREQ DKNGGDNIIF XKESNFSLND201 NKRTLELRHG YRYSGTPGRA DYNQVSFXKL NLIISTTPKL IDPVSHRRTX251 PTAQLIGSSN PQHXAELMWR ISLTVSVLLL CLLAVPLSYF NPRSGHTYNI301 LXAIGLFLIY QNGLTLLFEA VEDGKIHFWL GLLPMHIIMF VIAIVLLRVR351 SMPSQPFWQA VGKSLTLKGG K*ORF101a和ORF101-1显示在371个氨基酸的重叠区内有95.4%的相同性:orf101a.pep MIYQRNLIKELSFTAVGIFVVLLAVLVSTQAINLLGXAADXRXAIDAVLALVGFWVXXMT 60
|||||||||||||||||||||||||||||||||||| ||| | ||||||||||||| ||
orf101-1 MIYQRNLIKELSFTAVGIFVVLLAVLVSTQAINLLGRAADGRVAIDAVLALVGFWVIGMT 60
orf101a.pep PLLLVLTAFISTLTVLTRYWRDSEMSVWXSCGLALKQWIRPVMQFAVPFAVLVAVMQLWV 120
|||||||||||||||||||||||||||| |||||||||||||||||||||||||||||||
orf101-1 PLLLVLTAFISTLTVLTRYWRDSEMSVWLSCGLALKQWIRPVMQFAVPFAVLVAVMQLWV 120
orf101a.pep IPWAELRSREYAEILKQKQELSLVEAGGFNSLGKRNGRVYFVETFDTESGIMKNLFLREQ 180
||||||||||||||||||||||||||| ||||||||||||||||||||||||||||||||
orf101-1 IPWAELRSREYAEILKQKQELSLVEAGEFNSLGKRNGRVYFVETFDTESGIMKNLFLREQ 180
orf101a.pep DKNGGDNIIFXKESNFSLNDNKRTLELRHGYRYSGTPGRADYNQVSFXKLNLIISTTPKL 240
|||||||||| ||:||||||||||||||||||||||||||||||||||||||||||||||
orf101-1 DKNGGDNIIFAKEGNFSLNDNKRTLELRHGYRYSGTPGRADYNQVSFQKLNLIISTTPKL 240
orf101a.pep IDPVSHRRTXPTAQLIGSSNPQHXAELMWRISLTVSVLLLCLLAVPLSYFNPRSGHTYNI 300
||||||||| ||||||||||||| ||||||||||||||||||||||||||||||||||||
orf101-1 IDPVSHRRTIPTAQLIGSSNPQHQAELMWRISLTVSVLLLCLLAVPLSYFNPRSGHTYNI 300
orf101a.pep LXAIGLFLIYQNGLTLLFEAVEDGKIHFWLGLLPMHIIMFVIAIVLLRVRSMPSQPFWQA 360
||||||||||||||||||||||||||||||||||||||||::|::|||||||||||||||
orf101-1 LIAIGLFLIYQNGLTLLFEAVEDGKIHFWLGLLPMHIIMFAVALILLRVRSMPSQPFWQA 360
orf101a.pep VGKSLTLKGGK 371
|||||||||||
orf101-1 VGKSLTLKGGK 371
Homology with a predicted ORF from N. gonorrhoeae
ORF101 shows 96.5% and 95.1% identity with a predicted ORF from N. gonorrhoeae (ORF101ng) over a 57-amino-acid overlap in the N-terminal domain and a 61-amino-acid overlap in the C-terminal domain, respectively:
orf101.pep MIYQRNLIKELSFTAVGIFVVLLAVLVSTQAINLLGRAADGXVIAIDAVLALVGFWV 57
||||||||||||||||||||||||||||||||||||||||| | |||||||||||||
orf101ng MIYQRNLIKELSFTAVGIFVVLLAVLVSTQAINLLGRAADGRV-AIDAVLALVGFWVIGM 59
//
orf101.pep IAIGLFLIYQNGLTLLFEAVEDGKIHFWLG 333
||||||||||||||||||||||||||||||
orf101ng SLTVSVLLLCLLAVPLSYFNPRSGHTYNILIAIGLFLIYQNGLTLLFEAVEDGKIHFWLG 331
orf101.pep LLPMHIIMFVLALILLRVRSMPSQPFWQAVGKSLTLKGGK 373
||||||||||:|::|||||||||||||||||
orf101ng LLPMHIIMFVIAIVLLRVRSMPSQPFWQAVG 362
预计ORF101ng核苷酸序列<SEQ ID 445>编码的蛋白质具有部分氨基酸序列<SEQ ID 446>:1 MIYQRNLIKE LSFTAVGIFV VLLAVLVSTQ AINLLGRAAD GRVAIDAVLA51 LVGFWVIGMT PLLLYLTAFI STLTVLTRYW RDSEMSVWLS CGLALKQWIR101 PVMQFAVPFA ILIAVMQLWV IPWAELRSRE YAEILKQKQE LSLVEAGEFN151 NLGKRNGRVY FVETFDTESG IMKNLFLREQ DKNGGDNIIF AKEGNFSLKD201 NKRTLELRHG YRYSGTPGRA DYNQVSFQKL NLIISTTPKL IDPVSHRRTI251 STAQLIGSSN PQHQAELMWR ISLTVSVLLL CLLAVPLSYF NPRSGHTYNI301 LIAIGLFLIY QNGLTLLFEA VEDGKIHFWL GLLPMHIIMF VIAIVLLRVR351 SMPSQPFWQA VG...进一步的工作揭示了完整的核苷酸序列<SEQ ID 447>:
1 ATGATTTATC AAAGAAACCT CATCAAAGAA CTCTCTTTTA CCGCCGTCGG 51 CATTTTCGTC GTCCTCTTGG CGGTGTTGGT GTCCACGCAG GCGATCAACC101 TGCTTGGCCG CGCAGCTGAC GGGCGTGTCG CCATCGATGC CGTGTTGGCC151 TTAGTCGGCT TCTGGGTCAT CGGTATGACC CCGCTTTTGC TGGTGTTGAC201 CGCATTCATC AGCACGCTGA CCGTATTGAC CCGCTACTGG CGCGACAGCG251 AAATGTCGGT CTGGCTATCC TGCGGATTGG CGTTGAAACA GTGGATACGC301 CCCGTCATGC AGTTTGCCGT GCCGTTTGCC ATCCTGATTG CCGTCATGCA351 GCTTTGGGTG ATACCGTGGG CAGAGCTGCG CAGCCGCGAA TATGCCGAAA401 TTTTGAAGCA GAAGCAGGAA TTGTCTTTGG TGGAAGCCGG CGAGTTCAAT451 AACTTGGGCA AGCGCAACGG CAgggtttaT TtcgtcgaaA CCTTTGACAC501 CGaatccgGC ATCATGAAAA ACCTGTtcct GcGCGAACAG GACAAAAACG551 gcggcgacaA CATCATCTTC GCcaaaGAag gtaactTctc gctgaaggaC601 AACAAAcgca cgctcgaATT GCGCCACGGC TACCGTTACA GCGGcacgcC651 CGGacGCGCc gactaCAATC AGGTTtcctt cCAAAAacTc aacctgATta701 TCAGCACCAC GCCCAAacTT ATCGaccCCG TTTCCCACCG CCGCACCATT751 tcgacCGCCC AAcTGATTGG CAGCAGCAAT CCGCAACATC AGGCAGAATT801 GATGTGGCGC ATCTCGCTGA CCGTCAGCGT CCTCCTGCTC TGCCTACTCG851 CCGTGCCGCT TTCCTATTTC AACCCGCGCA GCGGACATAC CTACAATATC901 TTGATTGCCA TCGGTTTGTT TTTAATTTAC CAAAACGGGC TGACCCTGCT951 TTTTGAAGCC GTGGAAGACG GCAAAATCCA TTTTTGGCTC GGACTGCTGC1001 CTATGCACAT CATCATGTTC GTCATCGCAA TCGTACTTCT GCGCGTCCGC1051 AGTATGCCCA GCCAGCCCTT CTGGCAGGCG GTTGGCAAAA GTCTGACATT1101 GAAAGgcgGA AAATGA它对应于氨基酸序列<SEQ ID 448;ORF101ng-1>:1 MIYQRNLIKE LSFTAVGIFV VLLAVLVSTQ AINLLGRAAD GRVAIDAVLA51 LVGFWVIGMT PLLLVLTAFI STLTVLTRYW RDSEMSVWLS CGLALKQWIR101 PVMQFAVPFA ILIAVMQLWN IPWAELRSRE YAEILKQKQE LSLVEAGEFN151 NLGKRNGRVY FVETFDTESG IMKNLFLREQ DKNGGDNIIF AKEGNFSLKD201 NKRTLELRHG YRYSGTPGRA DYNQVSFQKL NLIISTTPKL IDPVSHRRTI251 STAQLIGSSN PQHQAELMWR ISLTVSVLLL CLLAVPLSYF NPRSGHTYNI301 LIAIGLFLIY QNGLTLLFEA VEDGKIHFWL GLLPMHIIMF VIAIVLLRVR351 SMPSQPFWQA VGKSLTLKGG K*ORF101ng-1和ORF101-1显示在371个氨基酸的重叠区内有97.6%的相同性:
10 20 30 40 50 60orf101-1.pep MIYQRNLIKELSFTAVGIFVVLLAVLVSTQAINLLGRAADGRVAIDAVLALVGFWVIGMT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf101ng-1 MIYQRNLIKELSFTAVGIFVVLLAVLVSTQAINLLGRAADGRVAIDAVLALVGFWVIGMT
10 20 30 40 50 60
70 80 90 100 110 120orf101-1.pep PLLLVLTAFISTLTVLTRYWRDSEMSVWLSCGLALKQWIRPVMQFAVPFAVLVAVMQLWV
||||||||||||||||||||||||||||||||||||||||||||||||||:|:|||||||orf101ng-1 PLLLVLTAFISTLTVLTRYWRDSEMSVWLSCGLALKQWIRPVMQFAVPFAILIAVMQLWV
70 80 90 100 110 120
130 140 150 160 170 180orf101-1.pep IPWAELRSREYAEILKQKQELSLVEAGEFNSLGKRNGRVYFVETFDTESGIMKNLFLREQ
||||||||||||||||||||||||||||||:|||||||||||||||||||||||||||||orf101ng-1 IPWAELRSREYAEILKQKQELSLVEAGEFNNLGKRNGRVYFVETFDTESGIMKNLFLREQ
130 140 150 160 170 180
190 200 210 220 230 240orf101-1.pep DKNGGDNIIFAKEGNFSLNDNKRTLELRHGYRYSGTPGRADYNQVSFQKLNLIISTTPKL
||||||||||||||||||:|||||||||||||||||||||||||||||||||||||||||orf101ng-1 DKNGGDNIIFAKEGNFSLKDNKRTLELRHGYRYSGTPGRADYNQVSFQKLNLIISTTPKL
190 200 210 220 230 240
250 260 270 280 290 300orf101-1.pep IDPVSHRRTIPTAQLIGSSNPQHQAELMWRISLTVSVLLLCLLAVPLSYFNPRSGHTYNI
|||||||||| |||||||||||||||||||||||||||||||||||||||||||||||||orf101ng-1 IDPVSHRRTISTAQLIGSSNPQHQAELMWRISLTVSVLLLCLLAVPLSYFNPRSGHTYNI
250 260 270 280 290 300
310 320 330 340 350 360orf101-1.pep LIAIGLFLIYQNGLTLLFEAVEDGKIHFWLGLLPMHIIMFAVALILLRVRSMPSQPFWQA
||||||||||||||||||||||||||||||||||||||||::|::|||||||||||||||orf101ng-1 LIAIGLFLIYQNGLTLLFEAVEDGKIHFWLGLLPMHIIMFVIAIVLLRVRSMPSQPFWQA
310 320 330 340 350 360
370orf101-1.pep VGKSLTLKGGKX
||||||||||||orf101ng-1 VGKSLTLKGGKX
370
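The percent-identity figures quoted for these alignments can be checked column by column. The following is a minimal sketch, assuming two aligned rows of equal length and assuming that a gap (`-`) or an ambiguous residue (`X`) never counts as identical; the function name `percent_identity` is hypothetical and is not part of the original analysis. The two rows are columns 301-360 of the ORF101-1 / ORF101ng-1 alignment above.

```python
def percent_identity(row_a: str, row_b: str) -> tuple[int, int, float]:
    """Return (identical columns, total columns, percent identity)."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    columns = len(row_a)
    # Assumption: gaps ('-') and ambiguous residues ('X') are never identities.
    identical = sum(
        1 for a, b in zip(row_a, row_b) if a == b and a not in "-X"
    )
    return identical, columns, 100.0 * identical / columns


# Columns 301-360 of the alignment, copied from the rows above.
orf101_1 = "LIAIGLFLIYQNGLTLLFEAVEDGKIHFWLGLLPMHIIMFAVALILLRVRSMPSQPFWQA"
orf101ng_1 = "LIAIGLFLIYQNGLTLLFEAVEDGKIHFWLGLLPMHIIMFVIAIVLLRVRSMPSQPFWQA"

identical, columns, pct = percent_identity(orf101_1, orf101ng_1)
print(f"{identical}/{columns} = {pct:.1f}%")  # 56/60 = 93.3%
```

Running the same function over the concatenated rows of the whole alignment is one way to check the overall identity figure quoted in the text.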
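Based on this analysis, including the presence of a putative leader sequence (double-underlined) and several putative transmembrane domains (single-underlined) in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 60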
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 449>:
1 ..GGTGGTGGTT TTATCAATGC TTCCTGTGCC ACTTTGACGA CAGCCAAACC
51 GCAATATCAA GCAGGAGACC TTAGCGCTTT TAAGATAAGG CAAGGCAATG
101 TTGTAATCGC CGGACACGGT TTGGATGCAC GTGATACCGA TTACACACGT
151 ATTCTCAGTT ATCATTCCAA AATCGATGCA CCCGTATGGG GACAAGATGT
201 TCGTGTCGTC GCGGGACAAA ACGATGTGGC CGCAACAGGT GATGCACATT
251 CGCCTATTCT CAATAATGCT GCTGCCAATA CGTCAAACAA TACAGCCAAC
301 AACGGCACAC ATATCCCTTT ATTTGCGATT GATACAGGCA AATTAGGAGG
351 TAT.GTATGC CAACAAAATC ACCTTGATCA GTACGGTCGA GCAAGCAGGC
401 ATTCGTAA
This corresponds to the amino acid sequence <SEQ ID 450; ORF113>:
1 ..GGGFINASCA TLTTAKPQYQ AGDLSAFKIR QGNVVIAGHG LDARDTDYTR
51 ILSYHSKIDA PVWGQDVRVV AGQNDVAATG DAHSPILNNA AANTSNNTAN
101 NGTHIPLFAI DTGKLGGXVC QQNHLDQYGR ASRHS*
Computer analysis of this amino acid sequence gave the following results:
Homology with the pspA putative secreted protein of N. meningitidis (accession number AF030941)
ORF113 and pspA show 44% amino acid identity over a 179-amino-acid overlap:
orf113 GGGFINASCATLTTAKPQYQAGDLSAFKIRQGNVVIAGHGLDARDTDYTRILSYHSKIDA 60
GGG INA+ TLT+ P G+L+ F + G VVI G GLD D DYTRILS ++I+A
pspa GGGLINAASVTLTSGVPVLNNGNLTGFDVSSGKVVIGGKGLDTSDADYTRILSRAAEINA 256
orf113 PVWGQDVRVVAGQNDVAATGDAHSPILXXXXXXXXXXXXXXGTHIPLFAIDTGKLGGMYA 120
VWG+DV+VV+G+N + G + P AIDT LGGMYA
pspa GVWGKDVKVVSGKNKLDFDG---------SLAKTASAPSSSDSVTPTVAIDTATLGGMYA 307
orf113 NKITLISTVEQAGIRNQGQWFASAGNVAVNAEGKLVNTGMIAATGENHAVSLHARNVHN 179
+KITLIST A IRN+G+ FA+ G V ++A+GKL N+G I A +++ A+ V N
pspa DKITLISTDNGAVIRNKGRIFAATGGVTLSADGKLSNSGSIDAA----EITISAQTVDN 362
Homology with a predicted ORF from N. gonorrhoeae
ORF113 shows 86.5% and 94.1% identity with a predicted ORF from N. gonorrhoeae (ORF113ng) over a 52-amino-acid overlap in the N-terminal part and a 17-amino-acid overlap in the C-terminal part, respectively:
orf113 GGGFINASCATLTTAKPQYQAGDLSAFKIR 30
|||||||| |||||::|||||||:|:||||
orf113ng SHPSQLNGYIEVGGRRAEVVIANPAGIAVNGGGFINASRATLTTGQPQYQAGDFSGFKIR 224
orf113 QGNVVIAGHGLDARDTDYTRILSYHSKIDAPVWGQDVRVVAGQNDVAATGDAHSPILNNA 90
|||:|||||||||||||:||||
orf113ng QGNAVIAGHGLDARDTDFTRILVCQQNHLDQYGRTSRHS 263
orf113 IDTGKLGGXVCQQNHLDQYGRASRHS 135
||||||||||||:||||
orf113ng DFSGFKIRQGNAVIAGHGLDARDTDFTRILVCQQNHLDQYGRTSRHS 263
The predicted full-length ORF113ng nucleotide sequence <SEQ ID 451> encodes a protein having the amino acid sequence <SEQ ID 452>:
1 MNKTLYRVIF NRKRGAVVAV AETTKREGKS CADSGSGSVY VKSVSFIPTH
51 SKAFCFSALG FSLCLALGTV NIAFADGIIT DKAAPKTQQA TILQTGNGIP
101 QVNIQTPTSA GVSVNQYAQF DYGNRGAILN NSRSNTQTQL GGWIQGNPWL
151 TRGEARVVVN QINSSHPSQL NGYIEVGGRR AEVVIANPAG IAVNGGGFIN
201 ASRATLTTGQ PQYQAGDFSG FKIRQGNAVI AGHGLDARDT DFTRILVCQQ
251 NHLDQYGRTS RHS*
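The sequence listings in this document are printed in groups of ten residues, with a running position number at the start of each row of fifty. A minimal sketch of how such a listing can be turned back into a plain sequence string is given below. The helper name `parse_listing` is hypothetical, and the sketch assumes that digits and whitespace are never part of the sequence, that leading or trailing `.` characters mark a partial sequence, and that a trailing `*` marks the stop codon.

```python
import re


def parse_listing(text: str) -> str:
    """Recover a plain sequence from a numbered, ten-residue-grouped listing."""
    # Drop position numbers and spacing (assumed never to be sequence symbols).
    residues = re.sub(r"[\d\s]", "", text)
    # Drop the '..' partial-sequence markers and the trailing stop marker '*'.
    return residues.strip(".").rstrip("*")


# Last two rows of SEQ ID 452 (ORF113ng), with the row number fused to the
# preceding group as it can appear in extracted text.
tail = (
    "201 ASRATLTTGQ PQYQAGDFSG FKIRQGNAVI AGHGLDARDT DFTRILVCQQ251 "
    "NHLDQYGRTS RHS*"
)
seq = parse_listing(tail)
print(len(seq))  # 63 residues, i.e. positions 201 to 263
```

The end position of 263 obtained this way agrees with the last coordinate shown for orf113ng in the alignment above.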
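Based on this analysis, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 61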
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 453>:1 ..TCAACGGGAC ATAGCGAACA AAATTACACT TTGCCGCGAG AAATCACACG51 CAACATTTCA CTGGGTTCAT TTGCCTATGA ATCGCATCGC AAAGCATTAA101 GCCATCATGC GCCCAGCCAA GGCACTGAGT TGCCGCAAAG CAACGGTATT151 TCGCTACCCT ATACGTCCAA TTCTTTTACC CCATTACCCA GCAGCAGCTT201 ATACATTATC AATCCTGTCA ATAAAGGCTA TCTTGTTGAA ACCGATCCAC251 GCTTTGCCAA CTACCGTCAA TGGTTGGGTA GTGACTATAT GCtGGACAGC301 CTCAAACTAG ACCCAAACAA TTTACATAAA CGTTTGGGTG ATGGTTATTA351 CGAGCAACGT TTAATCAATG AACAAATCGC AGAGCTGACA GGGCATCGTC401 GTTTAGAcGG TTATCAAAAC GACGAAGAAC AATTTAAAGC CTTAATGGAT451 AATGGCGCGA CTGCGGCACG TTcGATGAAT CTCAGCGTTG GCATTGCATT501 AAGTGCCGAG CAAGTAGCGC AACTGACCAG CGATATTGTT TGGTTGGTAC551 AAAAAGAAGT TAAGCTTCCT GATGGCGGCA CACAAACCGT ATTGGTGCCA601 CAGGTTTATG TACGCGTTAA AAATGGCGAC ATAGACGGTA AAGGTGCATT651 GTTGTCAGGC AGCAATACAC AAATCAATGT TTCAGGCAGC CTGAAAAACT701 CAGGCACGAT TGCAGGgCGC AATGCGCTTA TTATCAATAC CGATACGCTA751 GACAATATCG GTGGGCGTAT TCATGCGCAA AAATCAGCGG TTACGGCCAC801 ACAAGACATC AATAATATTG GCGGCATGCT TTCTGCCGAA CAGACATTAT851 TGCTCAACGC AGGCAACAAC ATCAACAGCC AAAGCACCAC CGCCAGCAGT901 CAAAATACAC AAGGCAGCAG CACCTACCTA GACCGAATGG CAGGTATTTA951 TATCACAGGC AAAGAAAAAG GTGTTT..它对应于氨基酸序列<SEQ ID 454;ORF115>:1 ..STGHSEQNYT LPREITRNIS LGSFAYESHR KALSHHAPSQ GTELPQSNGI51 SLPYTSNSFT PLPSSSLYII NPVNKGYLVE TDPRFANYRQ WLGSDYMLDS101 LKLDPNNLHK RLGDGYYEQR LINEQIAELT GHRRLDGYQN DEEQFKALMD151 NGATAARSMN LSVGIALSAE QVAQLTSDIV WLVQKEVKLP DGGTQTVLVP201 QVYVRVKNGD IDGKGALLSG SNTQINVSGS LKNSGTIAGR NALIINTDTL251 DNIGGRIHAQ KSAVTATQDI NNIGGMLSAE QTLLLNAGNN INSQSTTASS301 QNTQGSSTYL DRMAGIYITG KEKGV..该氨基酸序列的计算机分析给出了下列结果:与脑膜炎奈瑟球菌的pspA推定分泌蛋白(登录号为AF030941)的同源性ORF115和pspA蛋白显示在325个氨基酸的重叠区内有50%的氨基酸相同性:Orf115:1 STGHSEQNYTLPREITRNISLGSFAYESHRKALSHHAPSQGTELPQSNGISLPYTSNSFT 60
STG+S Y E++ +I +G AY+ + + P + NGI +T
pspA: 778 STGYSRSPYEPAPEVS-SIRMGISAYKGYAPQQASDIPGTYVPVVAENGIHPTFT----- 831
Orf115: 61 PLPSSSLYIINPVNKGYLVETDPRFANYRQWLGSDYMLDSLKLDPNNLHKRLGDGYYEQR 120
LP+SSL+ I P NKGYL+ETDP F +YR+WLGS YML +L+ DPN++HKRLGDGYYEQ+
pspA: 832 -LPNSSLFAIAPNNKGYLIETDPAFTDYRKWLGSGYMLAALQQDPNHIHKRLGDGYYEQK 890
Orf115: 121 LINEQIAELTGHRRLDGYQNDEEQFKALMDNGATAARSMNLSVGIALSAEQVAQLTSDIV 180
L+NEQIA+LTG+RRLDGY NDEEQFKALMDNG T A+ + L+ GIALSAEQVA+LTSDIV
pspA: 891 LVNEQIAKLTGYRRLDGYTNDEEQFKALMDNGITIAKELQLTPGIALSAEQVARLTSDIV 950
Orf115: 181 WLVQKEVKLPDGGTQTVLVPQVYVRVKNGDIDGKGALLSGSNTQINVSGSLKN-SGTIAG 239
WL + V LPDG TQTVL P+VYVR + D++G+GALLSGS I SG+++N G IAG
pspA: 951 WLENETVTLPDGTTQTVLKPKVYVRARPKDMNGQGALLSGSVVDIG-SGAIENRGGLIAG 1009
Orf115: 240 RNALIINTDTLDNIGGRIHAQKSAVTATQDINNIGGMLSAEQTLLLNAGXXXXXXXXXXX 299
R ALI+N + N+ G + + A DI N G + AE LLL A
pspA: 1010 REALILNAQNIKNLQGDLQGKNIFAAAGSDITNTGS-IGAENALLLKASNNIESRSETRS 1068
Orf115: 300 XXXXXXXXXYLDRMAGIYITGKEKG 324
+ R+AGIY+TG++ G
pspA: 1069 NQNEQGSVRNIGRVAGIYLTGRQNG 1093
Homology with a predicted ORF from N. gonorrhoeae
ORF115 shows 91.9% identity with a predicted ORF from N. gonorrhoeae (ORF115ng) over a 334-amino-acid overlap:
orf115.pep STGHSEQNYTLPREITRNISLGSFAYESHRK 31
||| |||||||:||||:||||||||||| |orf115ng NEQTFGEKKVFSENGKLHNYWRARRKGHDETGHREQNYTLPEEITRDISLGSFAYESHSK 71orf115.pep ALSHHAPSQGTELPQSN----------GISLPYTSNSFTPLPSSSLYIINPVNKGYLVET 81
|||:||||||||||||| ||||||| |||||||:||||||||:||||||||orf115ng ALSRHAPSQGTELPQSNRDNIRTAKSNGISLPYTPNSFTPLPGSSLYIINPANKGYLVET 131orf115.pep DPRFANYRQWLGSDYMLDSLKLDPNNLHKRLGDGYYEQRLINEQIAELTGHRRLDGYQND 141
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf115ng DPRFANYRQWLGSDYMLGSLKLDPNNLHKRLGDGYYEQRLINEQIAELTGHRRLDGYQND 191orf115.pep EEQFKALMDNGATAARSMNLSVGIALSAEQVAQLTSDIVWLVQKEVKLPDGGTQTVLVPQ 201
||||||||||||||||||||||||||||||:||||||||||||||||||||||||||:||orf115ng EEQFKALMDNGATAARSMNLSVGIALSAEQAAQLTSDIVWLVQKEVKLPDGGTQTVLMPQ 251orf115.pep VYVRVKNGDIDGKGALLSGSNTQINVSGSLKNSGTIAGRNALIINTDTLDNIGGRIHAQK 261
|||||||| |||||||||||||||||||||||||||||||||||||||||||||||||||orf115ng VYVRVKNGGIDGKGALLSGSNTQINVSGSLKNSGTIAGRNALIINTDTLDNIGGRIHAQK 311orf115.pep SAVTATQDINNIGGMLSAEQTLLLNAGNNINSQSTTASSQNTQGSSTYLDRMAGIYITGK 321
||||||||||||||:||:|||||||||||||:|||: ||||:||||||||||||||||||orf115ng SAVTATQDINNIGGILSAEQTLLLNAGNNINNQSTAKSSQNAQGSSTYLDRMAGIYITGK 371
orf115.pep EKGV 325
||||
orf115ng EKGVLAAQAGKDINIIAGQISNQSDQGQTRLQAGRDINLDTVQTGKYQEIHFDADNHTIR 431
预计ORF115ng核苷酸序列<SEQ ID 455>编码的蛋白质具有氨基酸序列<SEQ ID456>:1 MLVQTEKDGL HNEQTFGEKK VFSENGKLHN YWRARRKGHD ETGHREQNYT51 LPEEITRDIS LGSFAYESHS KALSRHAPSQ GTELPQSNRD NIRTAKSNGI101 SLPYTPNSFT PLPGSSLYII NPANKGYLVE TDPRFANYRQ WLGSDYMLGS151 LKLDPNNLHK RLGDGYYEQR LINEQIAELT GHRRLDGYQN DEEQFKALMD201 NGATAARSMN LSVGIALSAE QAAQLTSDIV WLVQKEYKLP DGGTQTVLMP251 QVYVRVKNGG IDGKGALLSG SNTQINVSGS LKNSGTIAGR NALIINTDTL301 DNIGGRIHAQ KSAVTATQDI NNIGGILSAE QTLLLNAGNN INNQSTAKSS351 QNAQGSSTYL DRMAGIYITG KEKGVLAAQA GKDINIIAGQ ISNQSDQGQT401 RLQAGRDINL DTVQTGKYQE IHFDADNHTI RGSTNEVGSS IQTKGDVTLL451 SGNNLNAKAA EVGSAKGTLA VYAKNDITIS SGIHAGQVDD ASKHTGRSGG501 GNKLVITDKA QSHHETAQSS TFEGKQVVLQ AGNDANILGS NVISDNGTRI551 QAGNHVRIGT TQTQSQSETY HQTQKSGLMS AGIGFTIGSK TNTQENQSQS601 NEHTGSTVGS LKGDTTIVAS KHYEQTGSNV SSPEGNNLIS TQSMDIGAAQ651 NQLNSKTTQT YEQKGLTVAF SSPVTDLAQQ AIAVAHKAAK QFDKAKTTAL701 MPWRLPMQVG RLFKQAKAPK K*进一步的工作揭示了下列淋球菌的部分DNA序列<SEQ ID 457>:1 TTGCTTGTGC AAACAGAAAA AGACGGTTTG CATAACGAGC AAACCTTTGG51 CGAGAAGAAA GTCTTCAGCG AAAATGGTAA GTTGCACAAC TACTGGCGTG101 CGCGTCGTAA AGGACATGAT GAAACAGGGC ATCGTGAACA AAATTATACT151 TTGCCGGAGG AAATCACACG CGACATTTCA CTGGGTTCAT TTGCCTATGA201 ATCGCATAGC AAAGCATTAA GCCGTCATGC GCCCAGCCAA GGCACTGAGT251 TGCCACAAAG TAACCGGGAT AATATCCGTA CTGCGAAAAG CAACGGTATT301 TCGCTACCCT ATACGCCCAA TTCTTTTACC CCATTACCCG GCAGCAGCTT351 ATACATTATC AATCCTGCCA ATAAAGGCTA TCTTGTTGAA ACCGATCCAC401 GCTTTGCCAA CTACCGTCAA TGGTTGGGTA GTGACTATAT GCTGGGCAGC451 CTCAAACTAG ACCCAAACAA TTTACATAAA CGTTTGGGTG ATGGTTATTA501 CGAGCAACGT TTAATCAATG AACAAATCGC AGAGCTGACA GGGCATCGTC551 GTTTAGACGG TTATCAAAAC GACGAAGAAC AATTTAAAGC CTTAATGGAT601 AATGGCGCGA CTGCGGCACG TTCGATGAAT CTCAGCGTTG GCATTGCATT651 AAGTGCCGAG CAAGCAGCGC AACTGACCAG CGATATTGTT TGGTTGGTAC701 AAAAAGAAGT TAAACTTCCT GATGGCGGCA CACAAACCGT ATTGATGCCA751 CAGGTTTATG TACGCGTTAA AAATGGCGGC ATAGACGGTA AAGGTGCATT801 GTTGTCAGGC AGCAATACAC AAATCAATGT TTCAGGCAGC CTGAAAAACT851 CAGGCACGAT TGCAGGGCGC AATGCGCTTA TTATCAATAC CGATACGCTA901 GACAATATCG GTGGGCGTAT 
TCATGCGCAA AAATCAGCGG TTACGGCCAC951 ACAAGACATC AATAATATTG GCGGCATTCT TTCTGCCGAA CAGACATTAT1001 TGCTCAATGC GGGTAACAAC ATCAACAACC AAAGCACGGC CAAGAGCAGT1051 CAAAATGCAC AAGGTAGCAG CACCTACCTA GACCGAATGG CAGGTATTTA1101 TATCACAGGC AAAGAAAAAG GTGTTTTAGC AGCGCAGGCA GGCAAAGACA1151 TCAACATCAT TGCCGGTCAA ATCAGCAATC AATCAGATCA AGGGCAAACC1201 CGGCTGCAGG CAGGACGCGA CATTAACCTG GATACGGTAC AAACCGGCAA1251 ATATCAAGAA ATCCATTTTG ATGCCGATAA CCATACCATC CGAGGTTCAA1301 CGAACGAAGT CGGCAGCAGC ATTCAAACAA AAGGCGATGT TACCCtatTG1351 TCAGGGAATA ATCTCAATGC CAAAGCTGCC GAAGTCGGCA GCGCAAAAGG1401 CACACTTGCC GTGTATGCTA AAAATGACAT TACTATCAGC TCAGGCATCC1451 ATGCCGGCCA AGTTGATGAT GCGTCCAAAC ATACAGGCAG AAGCGGCGGC1501 GGTAATAAAT TAGTCATTAC CGATAAAGCC CAAAGTCATC ACGAAACTGC1551 TCAAAGCAGC ACCTTTGAAG GCAAGCAAGT TGTATTGCAG GCAGGAAACG1601 ATGCCAACAT CCTTGGCAGT AATGTTATTT CCGATAATGG CACCCGGATT1651 CAAGCAGGCA ATCATGTTCG CATTGGTACA ACCCAAACTC AAAGCCAAAG1701 CGAAACCTAT CATCAAACCC AAAAATCAGG ATTGATGAGT GCAGGTATCG1751 GCTTCACTAT TGGCAGCAAG ACAAACACAC AAGAAAACCA ATCCCAAAGC1801 AACGAACATA CAGGCAGTAC CGTAGGCAGC CTGAAAGGCG ATACCACCAT1851 TGTTGCAAGC AAACACTACG AACAAACCGG CAGCAAGGTT TCCAGCCCTG1901 AGGGCAACAA CCTTATCAGC ACGCAAAGTA TGGATATTGG CGCAGCACAA1951 AACCAATTAA ACAGCAAAAC CACCCAAACC TACGAACAAA AAGGCTTAAC2001 GGTGGCATTC AGTTCGCCCG TTACCGATTT GGCACAACAA GCGATTGCCG2051 TAGCACACAA AGCAGCAAAC AAGTCGGACA AAGCAAAAAC GACCGCGTTA2101 ATGCCATGGC GGCTGCCAAT GCAGGTTGGC AGGCCTATCA AACAGGCAAA2151 GGCGCACAAA ACTTAG它对应于氨基酸序列<SEQ ID 458;ORF115ng-1>:1 LLVQTEKDGL HNEQTFGEKK VFSENGKLHN YWRARRKGHD ETGHREQNYT51 LPEEITRDIS LGSFAYESHS KALSRHAPSQ GTELPQSNRD NIRTAKSNGI101 SLPYTPNSFT PLPGSSLYII NPANKGYLVE TDPRFANYRQ WLGSDYMLGS151 LKLDPNNLHK RLGDGYYEQR LINEQIAELT GHRRLDGYQN DEEQFKALMD201 NGATAARSMN LSVGIALSAE QAAQLTSDIV WLVQKEVKLP DGGTQTVLMP251 QVYVRVKNGG IDGKGALLSG SNTQINVSGS LKNSGTIAGR NALIINTDTL301 DNIGGRIHAQ KSAVTATQDI NNIGGILSAE QTLLLNAGNN INNQSTAKSS351 QNAQGSSTYL DRMAGIYITG KEKGVLAAQA GKDINIIAGQ ISNQSDQGQT401 RLQAGRDINL DTVQTGKYQE IHFDADNHTI 
RGSTNEVGSS IQTKGDVTLL451 SGNNLNAKAA EVGSAKGTLA VYAKNDITIS SGIHAGQVDD ASKHTGRSGG501 GNKLVITDKA QSHHETAQSS TFEGKQVVLQ AGNDANILGS NVISDNGTRI551 QAGNHVRIGT TQTQSQSETY HQTQKSGLMS AGIGFTIGSK TNTQENQSQS601 NEHTGSTVGS LKGDTTIVAS KHYEQTGSNV SSPEGNNLIS TQSMDIGAAQ651 NQLNSKTTQT YEQKGLTVAF SSPVTDLAQQ AIAYAHKAAN KSDKAKTTAL701 MPWRLPMQVG RPIKQAKAHK T*
This gonococcal protein (ORF115ng-1) shows 91.9% identity with ORF115 over 334 amino acids:
20 30 40 50 60 70orf115ng-1.p NEQTFGEKKVFSENGKLHNYWRARRKGHDETGHREQNYTLPEEITRDISLGSFAYESHSK
||| |||||||:||||:||||||||||| |orf115 STGHSEQNYTLPREITRNISLGSFAYESHRK
10 20 30
80 90 100 110 120 130orf115ng-1.p ALSRHAPSQGTELPQSNRDNIRTAKSNGISLPYTPNSFTPLPGSSLYIINPANKGYLVET
|||:||||||||||||| ||||||| |||||||:||||||||:||||||||orf115 ALSHHAPSQGTELPQSN----------GISLPYTSNSFTPLPSSSLYIINPVNKGYLVET
40 50 60 70 80
140 150 160 170 180 190orf115ng-1.p DPRFANYRQWLGSDYMLGSLKLDPNNLHKRLGDGYYEQRLINEQIAELTGHRRLDGYQND
||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||orf115 DPRFANYRQWLGSDYMLDSLKLDPNNLHKRLGDGYYEQRLINEQIAELTGHRRLDGYQND
90 100 110 120 130 140
200 210 220 230 240 250orf115ng-1.p EEQFKALMDNGATAARSMNLSVGIALSAEQAAQLTSDIVWLVQKEVKLPDGGTQTVLMPQ
||||||||||||||||||||||||||||||:||||||||||||||||||||||||||:||orf115 EEQFKALMDNGATAARSMNLSVGIALSAEQVAQLTSDIVWLVQKEVKLPDGGTQTVLVPQ
150 160 170 180 190 200
260 270 280 290 300 310orf115ng-1.p VYVRVKNGGIDGKGALLSGSNTQINVSGSLKNSGTIAGRNALIINTDTLDNIGGRIHAQK
|||||||| |||||||||||||||||||||||||||||||||||||||||||||||||||orf115 VYVRVKNGDIDGKGALLSGSNTQINVSGSLKNSGTIAGRNALIINTDTLDNIGGRIHAQK
210 220 230 240 250 260
320 330 340 350 360 370orf115ng-1.p SAVTATQDINNIGGILSAEQTLLLNAGNNINNQSTAKSSQNAQGSSTYLDRMAGIYITGK
||||||||||||||:||||||||||||||||:|||: ||||:||||||||||||||||||orf115 SAVTATQDINNIGGMLSAEQTLLLNAGNNINSQSTTASSQNTQGSSTYLDRMAGIYITGK
270 280 290 300 310 320
380 390 400 410 420 430orf115ng-1.p EKGVLAAQAGKDINIIAGQISNQSDQGQTRLQAGRDINLDTVQTGKYQEIHFDADNHTIR
||||orf115 EKGV另外,它显示出与数据库中一种分泌的脑膜炎奈瑟球菌蛋白同源:gi|2623258(AF030941)推定分泌的蛋白[脑膜炎奈瑟球菌]长度=2273评分=604位(1541),估计值=e-172相同性=325/678(47%),阳性=449/678(65%),空隙=22/678(3%)询问:1 LLVQTEKDGLHNEQTFGEKKVFSENGKLHNYWRARRKGHDETGHREQNYTLPEEITRDIS 60
L+V T + L N++T G K + ++ G LH Y R +KG D TG+ Y E++ I目标:739 LIVGTPESALDNDETLGTKTI-TDKGDLHRYHRHHKKGRDSTGYSRSPYEPAPEVS-SIR 796询问:61 LGSFAYESHSKALSRHAPSQGTELPQSNRDNIRTAKSNGISLPYTPNSFTPLPGSSLYII 120
+G AY++ AP Q +++P + + NGI +T LP SSL+ I目标:797 MGISAYKGY-------APQQASDIPGTV---VPVVAENGIHPTFT------LPNSSLFAI 840询问:121 NPANKGYLVETDPRFANYRQWLGSDYMLGSLKLDPNNLHKRLGDGYYEQRLINEQIAELT 180
P NKGYL+ETDP F+YR+WLGS YML +L+ DPN++HKRLGDGYYEQ+L+NEQIA+LT目标:841 APNNKGYLIETDPAFTDYRKWLGSGYMLAALQQDPNHIHKRLGDGYYEQKLVNEQIAKLT 900询问:181 GHRRLDGYQNDEEQFKALMDNGATAARSMNLSVGIALSAEQAAQLTSDIVWLVQKEVKLP 240
G+RRLDGY NDEEQFKALMDNG T A+ + L+ GIALSAEQ A+LTSDIVWL + V LP目标:901 GYRRLDGYTNDEEQFKALMDNGITIAKELQLTPGIALSAEQVARLTSDIVWLENETVTLP 960询问:241 DGGTQTVLMPQVYVRVKNGGIDGKGALLSGSNTQINVSGSLKN-SGTIAGRNALIINTDT 299
DG TQTVL P+VYVR + ++G+GALLSGS I SG+++N G IAGR ALI+N目标:961 DGTTQTVLKPKVYVRARPKDMNGQGALLSGSVVDIG-SGAIENRGGLIAGREALILNAQN 1019询问:300 LDNIGGRIHAQKSAVTATQDINNIGGILSAEQTLLLNAGNNINNQSTAKSSQNAQGSSTY 359
+ N+ G + + A DI N G I AE LLL A NNI ++S +S+QN QGS目标:1020 IKNLQGDLQGKNIFAAAGSDITNTGSI-GAENALLLKASNNIESRSETRSNQNEQGSVRN 1078询问:360 LDRMAGIYITGKEKGVLAAQAGKDINIIAGQISNQSDQGQTRLQAGRDINLDTVQTGKYQ 419
+ R+AGIY+TG++ G + AG +I + A +++NQS+ GQT L AG DI DT + Q目标:1079 IGRVAGIYLTGRQNGSVLLDAGNNIVLTASELTNQSEDGQTVLNAGGDIRSDTTGISRNQ 1138询问:420 EIHFDADNHTIRGSTNEVGSSIQTKGDVTLLSGNNLNAKAAEVGSAKGTLAVYAKNDITI 479
FD+DN+ IR NEVGS+I+T+G+++L + ++ +AAEVGS +G L + A DI +目标:1139 NTIFDSDNYVIRKEQNEVGSTIRTRGNLSLNAKGDIRIRAAEVGSEQGRLKLAAGRDIKV 1198询问:480 SSGIHAGQVDDASKHTGRSGGGNKLVITDKAQSHHETAQSSTFEGKQVVLQAGNDANILG 539
+G + +DA K+TGRSGGG K +T ++ + A S T +GK+++L +G D + G目标:1199 EAGKAHTETEDALKYTGRSGGGIKQKMTRHLKNQNGQAVSGTLDGKEIILVSGRDITVTG 1258询问:540 SNVISDNGTRIQAGNHVRIGTTQTQSQSETYHQTQKSGLM-SAGIGFTIGSKTNTQENQS 598
SN+I+DN T + A N++ + +T+S+S ++ +KSGLM S GIGFT GSK +TQ N+S目标:1259 SNIIADNHTILSAKNNIVLKAAETRSRSAEMNKKEKSGLMGSGGIGFTAGSKKDTQTNRS 1318询问:599 QSNEHTGSTVGSLKGDTTIVASKHYEQTGSNVSSPEGNNLISTQSMDIGAAQNQLNSKTT 658
++ HT S VGSL G+T I A KHY QTGS +SSP+G+ IS+ + I AAQN+ + ++目标:1319 ETVSHTESVVGSLNGNTLISAGKHYTQTGSTISSPQGDVGISSGKISIDAAQNRYSQESK 1378询问:659 QTYEQKGLTVAFSSPVTD 676
Q YEQKG+TVA S PV +目标:1379 QVYEQKGVTVAISVPVVN 1396
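Based on this analysis, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 62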
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 459>:
1 ..TCAGGGAATA ACCTCAATGC CAAAGCTGCC GAAGTCAGCA GCGCAAACGG
51 TACACTCGCT GTGTCTGCCA ATAATGACAT CAACATCAGC GCAGGCATCA
101 ACACGACCCA TGTTGATGAT GCGTCCAAAC ACACAGGCAG AAGCGGTGGT
151 GGCAATAAAT TAGTCATTAC CGATAAAGCC CAAAGTCATC ACGAAACCGC
201 CCAAAGCAGC ACCTTTGAAG GCAAGCAAGT TGTATTGCAG GCAGGAAACG
251 ATGCCAACAT CCTTGGCAGC AATGTTATTT CCGATAATGG CACCCAGATT
301 CAAGCAGGCA ATCATGTTCG CATTGGTACA ACCCAAACTC AAAGCCAAAG
351 CGAAACCTAT CATCAAACCC AGAAATCAGG ATTGATGAGT GCAGGTATCG
401 GCTTCACTAT TGGCAGCAAG ACAAACACAC AAGAAAACCA ATCCCAAAGC
451 AACGAACATA CAGGCAGTAC CGTAGGCAGC TTGAAAGGCG ATACCACCAT
501 TGTTGCAGGC AAACACTACG AACAAATCGG CAGTACCGTT TCCAGCCCGG
551 AAGGCAACAA TACCATCTAT GCCCAAAGCA TAGACATTCA AGCGGCACAC
601 AACAAATTAA ACAGTAATAC CACCCAAACC TATGAACAAA AAGG.CTAAC
651 GGTGGCATTC AGTTCGCCCG TTACCGATTT GGCACAACAA...
This corresponds to the amino acid sequence <SEQ ID 460; ORF117>:
1 ..SGNNLNAKAA EVSSANGTLA VSANNDINIS AGINTTHVDD ASKHTGRSGG
51 GNKLVITDKA QSHHETAQSS TFEGKQVVLQ AGNDANILGS NVISDNGTQI
101 QAGNHVRIGT TQTQSQSETY HQTQKSGLMS AGIGFTIGSK TNTQENQSQS
151 NEHTGSTVGS LKGDTTIVAG KHYEQIGSTV SSPEGNNTIY AQSIDIQAAH
201 NKLNSNTTQT YEQKXLTVAF SSPVTDLAQQ ...
Computer analysis of this amino acid sequence gave the following results:
Homology with the pspA putative secreted protein of N. meningitidis (accession number AF030941)
ORF117 and the pspA protein show 45% amino acid identity over a 224-amino-acid overlap:
Orf117: 4 NLNAKAAEVSSANGTLAVSANNDINISAGINTTHVDDASKHTGRSGGGNKLVITDKAQSH 63
++ +AAEV S G L ++A DI + AG T +DA K+TGRSGGG K +T ++
pspA: 1173 DIRIRAAEVGSEQGRLKLAAGRDIKVEAGKAHTETEDALKYTGRSGGGIKQKMTRHLKNQ 1232
Orf117: 64 HETAQSSTFEGKQVVLQAGNDANILGSNVISDNGTQIQAGNHVRIGTTQTQSQSETYHQT 123
+ A S T +GK+++L +G D + GSN+I+DN T + A N++ + +T+S+S ++
pspA: 1233 NGQAVSGTLDGKEIILVSGRDITVTGSNIIADNHTILSAKNNIVLKAAETRSRSAEMNKK 1292
Orf117: 124 QKSGLM-SAGIGFTIGSKTNTQENQSQSNEHTGSTVGSLKGDTTIVAGKHYEQIGSTVSS 182
+KSGLM S GIGFT GSK +TQ N+S++ HT S VGSL G+T I AGKHY Q GST+SS
pspA: 1293 EKSGLMGSGGIGFTAGSKKDTQTNRSETVSHTESVVGSLNGNTLISAGKHYTQTGSTISS 1352
Orf117: 183 PEGNNTIYAQSIDIQAAHNKLNSNTTQTYEQKXLTVAFSSPVTD 226
P+G+ I + I I AA N+ + + Q YEQK +TVA S PV +
pspA: 1353 PQGDVGISSGKISIDAAQNRYSQESKQVYEQKGVTVAISVPVVN 1396
Homology with a predicted ORF from N. gonorrhoeae
ORF117 shows 90% identity with a predicted ORF from N. gonorrhoeae (ORF117ng) over a 230-amino-acid overlap:
orf117.pep SGNNLNAKAAEVSSANGTLAVSANNDINIS 30
||||||||||||:||:||||| |:|||:||
orf117ng IHFDADNHTIRGSTNEVGSSIQTKGDVTLLSGNNLNAKAAEVGSAKGTLAVYAKNDITIS 480
orf117.pep AGINTTHVDDASKHTGRSGGGNKLVITDKAQSHHETAQSSTFEGKQVVLQAGNDANILGS 90
:|:: :|||||||||||||||||||||||||||||||||||||||||||||||||||||
orf117ng SGIHAGQVDDASKHTGRSGGGNKLVITDKAQSHHETAQSSTFEGKQVVLQAGNDANILGS 540
orf117.pep NVISDNGTQIQAGNHVRIGTTQTQSQSETYHQTQKSGLMSAGIGFTIGSKTNTQENQSQS 150
||||||||:|||||||||||||||||||||||||||||||||||||||||||||||||||
orf117ng NVISDNGTRIQAGNHVRIGTTQTQSQSETYHQTQKSGLMSAGIGFTIGSKTNTQENQSQS 600
orf117.pep NEHTGSTVGSLKGDTTIVAGKHYEQIGSTVSSPEGNNTIYAQSIDIQAAHNKLNSNTTQT 210
|||||||||||||||||||:||||| ||:|||||||| | :||:|| ||:|:|||:||||
orf117ng NEHTGSTVGSLKGDTTIVASKHYEQTGSNVSSPEGNNLISTQSMDIGAAQNQLNSKTTQT 660
orf117.pep YEQKXLTVAFSSPVTDLAQQ 230
|||| |||||||||||||||
orf117ng YEQKGLTVAFSSPVTDLAQQAIAVAHKAAKQFDKAKTTALMPWRLPMQVGRLFKQAKAPK 720
预计ORF117ng核苷酸序列<SEQ ID 461>编码的蛋白质具有氨基酸序列<SEQ ID462>:1 ..LLVQTEKDGL HNEQTFGEKK VFSENGKLHN YWRARRKGHD ETGHREQNYT51 LPEEITRDIS LGSFAYESHS KALSRHAPSQ GTELPQSNRD NIRTAKSNGI101 SLPYTPNSFT PLPGSSLYII NPANKGYLVE TDPRFANYRQ WLGSDYMLGS151 LKLDPNNLHK RLGDGYYEQR LINEQIAELT GHRRLDGYQN DEEQFKALMD201 NGATAARSMN LSVGIALSAE QAAQLTSDIV WLVQKEVKLP DGGTQTVLMP251 QVYVRVKNGG IDGKGALLSG SNTQINVSGS LKNSGTIAGR NALIINTDTL301 DNIGGRIHAQ KSAVTATQDI NNIGGILSAE QTLLLNAGNN INNQSTAKSS351 QNAQGSSTYL DRMAGIYITG KEKGVLAAQA GKDINIIAGQ ISNQSDQGQT401 RLQAGRDINL DTVQTGKYQE IHFDADNHTI RGSTNEVGSS IQTKGDVTLL451 SGNNLNAKAA EVGSAKGTLA VYAKNDITIS SGIHAGQVDD ASKHTGRSGG501 GNKLVITDKA QSHHETAQSS TFEGKQVVLQ AGNDANILGS NVISDNGTRI551 QAGNHVRIGT TQTQSQSETY HQTQKSGLMS AGIGFTIGSK TNTQENQSQS601 NEHTGSTVGS LKGDTTIVAS KHYEQTGSNV SSPEGNNLIS TQSMDIGAAQ651 NQLNSKTTQT YEQKGLTVAF SSPVTDLAQQ AIAVAHKAAK QFDKAKTTAL701 MPWRLPMQVG RLFKQAKAAPK K*进一步的工作揭示了下列淋球菌的部分DNA序列<SEQ ID 463>:1 TTGCTTGTGC AAACAGAAAA AGACGGTTTG CATAACGAGC AAACCTTTGG51 CGAGAAGAAA GTCTTCAGCG AAAATGGTAA GTTGCACAAC TACTGGCGTG101 CGCGTCGTAA AGGACATGAT GAAACAGGGC ATCGTGAACA AAATTATACT151 TTGCCGGAGG AAATCACACG CGACATTTCA CTGGGTTCAT TTGCCTATGA201 ATCGCATAGC AAAGCATTAA GCCGTCATGC GCCCAGCCAA GGCACTGAGT251 TGCCACAAAG TAACCGGGAT AATATCCGTA CTGCGAAAAG CAACGGTATT301 TCGCTACCCT ATACGCCCAA TTCTTTTACC CCATTACCCG GCAGCAGCTT351 ATACATTATC AATCCTGCCA ATAAAGGCTA TCTTGTTGAA ACCGATCCAC401 GCTTTGCCAA CTACCGTCAA TGGTTGGGTA GTGACTATAT GCTGGGCAGC451 CTCAAACTAG ACCCAAACAA TTTACATAAA CGTTTGGGTG ATGGTTATTA501 CGAGCAACGT TTAATCAATG AACAAATCGC AGAGCTGACA GGGCATCGTC551 GTTTAGACGG TTATCAAAAC GACGAAGAAC AATTTAAAGC CTTAATGGAT601 AATGGCGCGA CTGCGGCACG TTCGATGAAT CTCAGCGTTG GCATTGCATT651 AAGTGCCGAG CAAGCAGCGC AACTGACCAG CGATATTGTT TGGTTGGTAC701 AAAAAGAAGT TAAACTTCCT GATGGCGGCA CACAAACCGT ATTGATGCCA751 CAGGTTTATG TACGCGTTAA AAATGGCGGC ATAGACGGTA AAGGTGCATT801 GTTGTCAGGC AGCAATACAC AAATCAATGT TTCAGGCAGC CTGAAAAACT851 CAGGCACGAT TGCAGGGCGC AATGCGCTTA TTATCAATAC CGATACGCTA901 GACAATATCG GTGGGCGTAT 
TCATGCGCAA AAATCAGCGG TTACGGCCAC951 ACAAGACATC AATAATATTG GCGGCATTCT TTCTGCCGAA CAGACATTAT1001 TGCTCAATGC GGGTAACAAC ATCAACAACC AAAGCACGGC CAAGAGCAGT1051 CAAAATGCAC AAGGTAGCAG CACCTACCTA GACCGAATGG CAGGTATTTA1101 TATCACAGGC AAAGAAAAAG GTGTTTTAGC AGCGCAGGCA GGCAAAGACA1151 TCAACATCAT TGCCGGTCAA ATCAGCAATC AATCAGATCA AGGGCAAACC1201 CGGCTGCAGG CAGGACGCGA CATTAACCTG GATACGGTAC AAACCGGCAA1251 ATATCAAGAA ATCCATTTTG ATGCCGATAA CCATACCATC CGAGGTTCAA1301 CGAACGAAGT CGGCAGCAGC ATTCAAACAA AAGGCGATGT TACCCtatTG1351 TCAGGGAATA ATCTCAATGC CAAAGCTGCC GAAGTCGGCA GCGCAAAAGG1401 CACACTTGCC GTGTATGCTA AAAATGACAT TACTATCAGC TCAGGCATCC1451 ATGCCGGCCA AGTTGATGAT GCGTCCAAAC ATACAGGCAG AAGCGGCGGC1501 GGTAATAAAT TAGTCATTAC CGATAAAGCC CAAAGTCATC ACGAAACTGC1551 TCAAAGCAGC ACCTTTGAAG GCAAGCAAGT TGTATTGCAG GCAGGAAACG1601 ATGCCAACAT CCTTGGCAGT AATGTTATTT CCGATAATGG CACCCGGATT1651 CAAGCAGGCA ATCATGTTCG CATTGGTACA ACCCAAACTC AAAGCCAAAG1701 CGAAACCTAT CATCAAACCC AAAAATCAGG ATTGATGAGT GCAGGTATCG1751 GCTTCACTAT TGGCAGCAAG ACAAACACAC AAGAAAACCA ATCCCAAAGC1801 AACGAACATA CAGGCAGTAC CGTAGGCAGC CTGAAAGGCG ATACCACCAT1851 TGTTGCAAGC AAACACTACG AACAAACCGG CAGCAACGTT TCCAGCCCTG1901 AGGGCAACAA CCTTATCAGC ACGCAAAGTA TGGATATTGG CGCAGCACAA1951 AACCAATTAA ACAGCAAAAC CACCCAAACC TACGAACAAA AAGGCTTAAC2001 GGTGGCATTC AGTTCGCCCG TTACCGATTT GGCACAACAA GCGATTGCCG2051 TAGCACACAA AGCAGCAAAC AAGTCGGACA AAGCAAAAAC GACCGCGTTA2101 ATGCCATGGC GGCTGCCAAT GCAGGTTGGC AGGCCTATCA AACAGGCAAA2151 GGCGCACAAA ACTTAG它对应于氨基酸序列<SEQ ID 464;ORF117ng-1>:1 LLVQTEKDGL HNEQTFGEKK VFSENGKLHN YWRARRKGHD ETGHREQNYT51 LPEEITRDIS LGSFAYESHS KALSRHAPSQ GTELPQSNRD NIRTAKSNGI101 SLPYTPNSFT PLPGSSLYII NPANKGYLVE TDPRFANYRQ WLGSDYMLGS151 LKLDPNNLHK RLGDGYYEQR LINEQIAELT GHRRLDGYQN DEEQFKALMD201 NGATAARSMN LSVGIALSAE QAAQLTSDIV WLVQKEVKLP DGGTQTVLMP251 QVYVRVKNGG IDGKGALLSG SNTQINVSGS LKNSGTIAGR NALIINTDTL301 DNIGGRIHAQ KSAVTATQDI NNIGGILSAE QTLLLNAGNN INNQSTAKSS351 QNAQGSSTYL DRMAGIYITG KEKGVLAAQA GKDINIIAGQ ISNQSDQGQT401 RLQAGRDINL DTVQTGKYQE IHFDADNHTI 
RGSTNEVGSS IQTKGDVTLL451 SGNNLNAKAA EVGSAKGTLA VYAKNDITIS SGIHAGQVDD ASKHTGRSGG501 GNKLVITDKA QSHHETAQSS TFEGKQVVLQ AGNDANILGS NVISDNGTRI551 QAGNHVRIGT TQTQSQSETY HQTQKSGLMS AGIGFTIGSK TNTQENQSQS601 NEHTGSTVGS LKGDTTIVAS KHYEQTGSNV SSPEGNNLIS TQSMDIGAAQ651 NQLNSKTTQT YEQKGLTVAF SSPVTDLAQQ AIAVAHKAAN KSDKAKTTAL701 MPWRLPMQVG RPIKQAKAHK T*
ORF117ng-1 and ORF117 likewise show 90% identity over a 230-amino-acid overlap. In addition, ORF117ng-1 shows homology with a secreted N. meningitidis protein in the database:
gi|2623258 (AF030941) putative secreted protein [Neisseria meningitidis] Length = 2273
Score = 604 bits (1541), Expect = e-172
Identities = 325/678 (47%), Positives = 449/678 (65%), Gaps = 22/678 (3%)
Query: 1 LLVQTEKDGLHNEQTFGEKKVFSENGKLHNYWRARRKGHDETGHREQNYTLPEEITRDIS 60
L+V T + L N++T G K + ++ G LH Y R +KG D TG+ Y E++ I
Sbjct: 739 LIVGTPESALDNDETLGTKTI-TDKGDLHRYHRHHKKGRDSTGYSRSPYEPAPEVS-SIR 796
Query: 61 LGSFAYESHSKALSRHAPSQGTELPQSNRDNIRTAKSNGISLPYTPNSFTPLPGSSLYII 120
+G AY+ + AP Q +++P + + NGI +T LP SSL+ I
Sbjct: 797 MGISAYKGY-------APQQASDIPGTV---VPVVAENGIHPTFT------LPNSSLFAI 840
Query: 121 NPANKGYLVETDPRFANYRQWLGSDYMLGSLKLDPNNLHKRLGDGYYEQRLINEQIAELT 180
P NKGYL+ETDP F +YR+WLGS YML +L+ DPN++HKRLGDGYYEQ+L+NEQIA+LT
Sbjct: 841 APNNKGYLIETDPAFTDYRKWLGSGYMLAALQQDPNHIHKRLGDGYYEQKLVNEQIAKLT 900
Query: 181 GHRRLDGYQNDEEQFKALMDNGATAARSMNLSVGIALSAEQAAQLTSDIVWLVQKEVKLP 240
G+RRLDGY NDEEQFKALMDNG T A+ + L+ GIALSAEQ A+LTSDIVWL + V LP
Sbjct: 901 GYRRLDGYTNDEEQFKALMDNGITIAKELQLTPGIALSAEQVARLTSDIVWLENETVTLP 960
Query: 241 DGGTQTVLMPQVYVRVKNGGIDGKGALLSGSNTQINVSGSLKN-SGTIAGRNALIINTDT 299
DG TQTVL P+VYVR + ++G+GALLSGS I SG+++N G IAGR ALI+N
Sbjct: 961 DGTTQTVLKPKVYVRARPKDMNGQGALLSGSVVDIG-SGAIENRGGLIAGREALILNAQN 1019
Query: 300 LDNIGGRIHAQKSAVTATQDINNIGGILSAEQTLLLNAGNNINNQSTAKSSQNAQGSSTY 359
+ N+ G + + A DI N G I AE LLL A NNI ++S +S+QN QGS
Sbjct: 1020 IKNLQGDLQGKNIFAAAGSDITNTGSI-GAENALLLKASNNIESRSETRSNQNEQGSVRN 1078
Query: 360 LDRMAGIYITGKEKGVLAAQAGKDINIIAGQISNQSDQGQTRLQAGRDINLDTVQTGKYQ 419
+ R+AGIY+TG++ G + AG +I + A +++NQS+ GQT L AG DI DT + Q
Sbjct: 1079 IGRVAGIYLTGRQNGSVLLDAGNNIVLTASELTNQSEDGQTVLNAGGDIRSDTTGISRNQ 1138
Query: 420 EIHFDADNHTIRGSTNEVGSSIQTKGDVTLLSGNNLNAKAAEVGSAKGTLAVYAKNDITI 479
FD+DN+ IR NEVGS+I+T+G+++L + ++ +AAEVGS +G L + A DI +
Sbjct: 1139 NTIFDSDNYVIRKEQNEVGSTIRTRGNLSLNAKGDIRIRAAEVGSEQGRLKLAAGRDIKV 1198
Query: 480 SSGIHAGQVDDASKHTGRSGGGNKLVITDKAQSHHETAQSSTFEGKQVVLQAGNDANILG 539
+G + +DA K+TGRSGGG K +T ++ + A S T +GK+++L +G D + G
Sbjct: 1199 EAGKAHTETEDALKYTGRSGGGIKQKMTRHLKNQNGQAVSGTLDGKEIILVSGRDITVTG 1258
Query: 540 SNVISDNGTRIQAGNHVRIGTTQTQSQSETYHQTQKSGLM-SAGIGFTIGSKTNTQENQS 598
SN+I+DN T + A N++ + +T+S+S ++ +KSGLM S GIGFT GSK +TQ N+S
Sbjct: 1259 SNIIADNHTILSAKNNIVLKAAETRSRSAEMNKKEKSGLMGSGGIGFTAGSKKDTQTNRS 1318
Query: 599 QSNEHTGSTVGSLKGDTTIVASKHYEQTGSNVSSPEGNNLISTQSMDIGAAQNQLNSKTT 658
++ HT S VGSL G+T I A KHY QTGS +SSP+G+ IS+ + I AAQN+ + ++
Sbjct: 1319 ETVSHTESVVGSLNGNTLISAGKHYTQTGSTISSPQGDVGISSGKISIDAAQNRYSQESK 1378
Query: 659 QTYEQKGLTVAFSSPVTD 676
Q YEQKG+TVA S PV +
Sbjct: 1379 QVYEQKGVTVAISVPVVN 1396
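Based on this analysis, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 63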
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 465>:1 ATGATTTACA TCGTACTGTT TCTAGCTGTC GTCCTCGCCG TTGTCGCCTA51 CAACATGTAT CAGGAAAACC AATACCGCAA AAAAGTGCGC GACCAGTTCG101 GACACTCCGA CAAAGATGCC CTGCTCAACA GCAwAACCAG CCATGTCCGC151 GACGGCAAAC CGTCCGGCGG GTCAGTCATG ATGCCGAAAC CCCAACCGGC201 GGTCAAAAAA ACGGCAAAAC CCCAAGACCC CGyCATGCGC AACCTGCAAG251 AACAGGATGC CGTCTACATC GCCAAGCAGA AACAGGCAAA AGCCTCCCCG301 TTCAAAACCG AAATCGAAAC CGCCTTGGAA GAAAGCGGCA TTATCGGCAA351 CTCCGCCCAC ACCGTTTCCG AACCCCAAAC CGGACATTCC GCAACGAAAC401 CTGCCGACGC GTCGGCAAAA CCTGCACCCG TTCCGCAAAC ACCTGCAAAA451 CCGCTGATTA CGCTCAAAGA ACTGTCAAAA GTCGAATTAT CCTGGTTTGA501 CGTGCGCATC GACTTCATCT CCTAT...它对应于氨基酸序列<SEQ ID 466;ORF119>:1 MIYIVLFLAV VLAVVAYNMY QENQYRKKVR DQFGHSDKDA LLNSXTSHVR51 DGKPSGGSVM MPKPQPAVKK TAKPQDPXMR NLQEQDAVYI AKQKQAKASP101 FKTEIETALE ESGIIGNSAH TVSEPQTGHS ATKPADASAK PAPVPQTPAK151 PLITLKELSK VELSWFDVRI DFISY...进一步的工作揭示了完整的核苷酸序列<SEQ ID 467>:1 ATGATTTACA TCGTACTGTT TCTAGCTGTC GTCCTCGCCG TTGTCGCCTA51 CAACATGTAT CAGGAAAACC AATACCGCAA AAAAGTGCGC GACCAGTTCG101 GACACTCCGA CAAAGATGCC CTGCTCAACA GCAAAACCAG CCATGTCCGC151 GACGGCAAAC CGTCCGGCGG GTCAGTCATG ATGCCGAAAC CCCAACCGGC201 GGTCAAAAAA ACGGCAAAAC CCCAAGACCC CGCCATGCGC AACCTGCAAG251 AACAGGATGC CGTCTACATC GCCAAGCAGA AACAGGCAAA AGCCTCCCCG301 TTCAAAACCG AAATCGAAAC CGCCTTGGAA GAAAGCGGCA TTATCGGCAA351 CTCCGCCCAC ACCGTTTCCG AACCCCAAAC CGGACATTCC GCACCGAAAC 401 CTGCCGAGGC GCCGGCAAAA CCTGCACCCG TTCCGCAAAC AGCTGCAAAA451 CCGCTGATTA CGCTCAAAGA ACTGTCAAAA GTCGAATTAC CCTGGTTTGA501 CGTGCGCTTC GACTTCATCT CCTATATCGC GCTGACCGAA GCCAAAGAAC551 TGCACGCACT GCCGCGCCTT TCCAACCGCT GCCGCTACCA GATTGTCGGC601 TGCACCATGG ACGACCATTT CCAGATTGCC GAACCCATCC CGGGCATCCG651 CTATCAGGCA TTTATCGTGG GTATTCAGGC AGTCAGCCGC AACGGACTTG701 CCTCGCAGGA AGAACTCTCC GCATTCAACC GCCAGGTGGA CGCATTCGCA751 CAAAGCATGG GCGGTCAGAC GCTGCACACC GACCTTGCCG CCTTTATCGA801 AGTGGCTTCC GCACTGGACG CATTCTGCGC GCGCGTCGAC CAGACCATCG851 CCATCCATTT GGTTTCCCCG ACCAGCATCA GCGGCGTAGA ACTGCGTTCC901 GCCGTAACGG GCGTGGGTTT CGTTTTGGAA GACGACGGCG 
CGTTCCACTA951 TACCGACACG TCGGGCTCGA CCATGTTCTC CATCTGCTCG CTCAACAACG1001 AGCCGTTTAC XAACGCCCTT TTGGACAACC AGTCCTACAA AGGCTTCAGT1051 ATGCTGCTCG ACATCCCGCA CTCTCCGGCA GGCGAAAAAA CCTTCGACGA1101 TTTGTTTATG GATTTGGCGG TACGCCTGTC CGGCCAGTTG AACCTGAATC1151 TGGTCAACGA CAAAATGGAA GAAGTTTCGA CCCAATG6CT CAAAGACGTG1201 CGCACTTATG TATTGGCGCG TCAGTCCGAG ATGCTCAAAG TCGGTATCGA1251 ACCGGGCGGC AAAACCGCAT TGCGCCTGTT CTCCTAA它对应于氨基酸序列<SEQ ID 468;ORF119-1>:1 MIYIVLFLAV VLAVVAYNMY QENQYRKKVR DQFGHSDKDA LLNSKTSHVR51 DGKPSGGSVM MPKPQPAVKK TAKPQDPAMR NLQEQDAVYI AKQKQAKASP101 FKTEIETALE ESGIIGNSAH TVSEPQTGHS APKPADAPAK PAPVPQTPAK151 PLITLKELSK VELPWFDVRF DFISYIALTE AKELHALPRL SNRCRYQIVG201 CTMDDHFQIA EPIPGIRYQA FIVGIQAVSR NGLASQEELS AFNRQVDAFA251 QSMGGQTLHT DLAAFIEVAS ALDAFCARVD QTIAIHLVSP TSISGVELRS301 AVTGVGFVLE DDGAFHYTDT SGSTMFSICS LNNEPFTNAL LDNQSYKGFS351 MLLDIPHSPA GEKTFDDLFM DLAVRLSGQL NLNLVNDKME EVSTQWLKDV401 RTYVLARQSE MLKVGIEPGG KTALRLFS*
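Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF119 shows 93.7% identity over a 175-amino-acid overlap with an ORF from N. meningitidis strain A (ORF119a):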
10 20 30 40 50 60orf119.pep MIYIVLFLAVVLAVVAYNMYQENQYRKKVRDQFGHSDKDALLNSXTSHVRDGKPSGGSVM
|||||||||:|||||||||||||||||||||||||||||||||| |||||||||||| ||orf119a MIYIVLFLAAVLAVVAYNMYQENQYRKKVRDQFGHSDKDALLNSKTSHVRDGKPSGGPVM
10 20 30 40 50 60
70 80 90 100 110 120orf119.pep MPKPQPAVKKTAKPQDPXMRNLQEQDAVYIAKQKQAKASPFKTEIETALEESGIIGNSAH
||||||||||||| ||| ||||||||||||||||||||||||||||||||||||||||||orf119a MPKPQPAVKKTAKSQDPAMRNLQEQDAVYIAKQKQAKASPFKTEIETALEESGIIGNSAH
70 80 90 100 110 120
130 140 150 160 170orf119.pep TVSEPQTGHSATKPADASAKPAPVPQTPAKPLITLKELSKVELSWFDVRIDFISY
|| |||||||| ||||| |||:||||||||||||||||||||| |||||:|||||orf119a TVPEPQTGHSAPKPADAPAKPVPVPQTPAKPLITLKELSKVELPWFDVRFDFISYIALTE
130 140 150 160 170 180orf119a AKELHALPRLSNRCRYQIVGCTMDDHFQIAEPIPGIRYQAFIVGIQAVSRNGLASQEELS
190 200 210 220 230 240全长ORF119a核苷酸序列<SEQ ID 469>是:1 ATGATTTACA TCGTACTGTT CCTCGCCGCC GTCCTCGCCG TTGTCGCCTA51 CAATATGTAT CAGGAAAACC AATACCGCAA AAAAGTGCGC GACCAGTTCG101 GGCACTCCGA CAAAGATGCC CTGCTCAACA GCAAAACCAG CCATGTCCGC 151 GACGGCAAAC CGTCCGGCGG GCCAGTCATG ATGCCGAAAC CCCAACCGGC201 GGTCAAAAAA ACGGCAAAAT CCCAAGACCC CGCCATGCGC AACCTGCAAG251 AGCAGGATGC CGTCTACATC GCCAAGCAGA AACAGGCAAA AGCCTCCCCG301 TTCAAAACCG AAATCGAAAC CGCCTTGGAA GAAAGCGGCA TTATCGGCAA351 CTCCGCCCAC ACCGTTCCCG AACCCCAAAC CGGACATTCC GCACCAAAAC401 CTGCCGACGC GCCGGCAAAA CCTGTTCCCG TTCCGCAAAC GCCGGCAAAA451 CCGCTGATTA CGCTCAAAGA GCTGTCGAAG GTCGAGCTGC CCTGGTTTGA501 CGTGCGCTTC GACTTCATCT CTTATATCGC GCTGACCGAA GCCAAAGAAC551 TGCACGCACT GCCGCGCCTT TCCAACCGCT GCCGCTACCA GATTGTCGGC601 TGCACCATGG ACGACCATTT CCAGATTGCC GAACCCATCC CGGGCATCCG651 CTATCAGGCA TTTATCGTGG GTATTCAGGC AGTCAGCCGC AACGGACTTG701 CCTCGCAGGA AGAACTCTCC GCATTCAACC GCCAGGTGGA TGCATTCGCA751 CACAGCATGG GCGGTCAGAC GCTGCACACC GACCTTGCCG CCTTTATCGA801 AGTGGCTTCC GCACTGGACG CATTCTGCGC GCGCGTCGAC CAGACTATCG851 CCATCCATTT GGTTTCCCCG ACCAGCATCA GCGGCGTAGA ACTGCGTTCC901 GCCGTAACGG GCGTGGGTTT CGTTTTGGAA GACGACGGCG CGTTCCACTA951 TACCGACACG TCGGGCTCGA CCATGTTCTC CATCTGCTCG CTCAACAACG1001 AGCCGTTTAC CAATGCCCTT TTGGACAACC AGTCCTATAA AGGCTTCAGT1051 ATGCTGCTCG ACATCCCGCA CTCTCCGGCA GGCGAAAAAA CCTTCGACGA1101 TTTGTTTATG GATTTGGCGG TACGCCTGTC CGGCCAGTTG AACCTGAATC1151 TGGTCAACGA CAAAATGGAA GAAGTTTCGA CCCAATGGCT CAAAGACGTG1201 CGCACTTATG TATTGGCTCG TCAGTCCGAG ATGCTCAAAG TCCGTATCGA1251 ACCGGGCGGC AAAACCGCAT TGCGCCTGTT CTCCTAA它编码的蛋白质具有氨基酸序列<SEQ ID 470>:1 MIYIVLFLAA VLAVVAYNMY QENQYRKKVR DQFGHSDKDA LLNSKTSHVR51 DGKPSGGPVM MPKPQPAVKK TAKSQDPAMR NLQEQDAVYI AKQKQAKASP101 FKTEIETALE ESGIIGNSAH TVPEPQTGHS APKPADAPAK PVPVPQTPAK151 PLITLKELSK VELPWFDVRF DFISYIALTE AKELHALPRL SNRCRYQIVG201 CTMDDHFQIA EPIPGIRYQA FIVGIQAVSR NGLASQEELS AFNRQVDAFA251 HSMGGQTLHT DLAAFIEVAS ALDAFCARVD QTIAIHLVSP TSISGVELRS301 AVTGVGFVLE DDGAFHYTDT SGSTMFSICS LNNEPFTNAL LDNQSYKGFS351 MLLDIPHSPA 
GEKTFDDLFM DLAVRLSGQL NLNLVNDKME EVSTQWLKDV401 RTYVLARQSE MLKVGIEPGG KTALRLFS*ORF119a和ORF119-1显示在428个氨基酸的重叠区内有98.6%的相同性:
10 20 30 40 50 60orf119a.pep MIYIVLFLAAVLAVVAYNMYQENQYRKKVRDQFGHSDKDALLNSKTSHVRDGKPSGGPVM
|||||||||:||||||||||||||||||||||||||||||||||||||||||||||| ||orf119-1 MIYIVLFLAVVLAVVAYNMYQENQYRKKVRDQFGHSDKDALLNSKTSHVRDGKPSGGSVM
10 20 30 40 50 60
70 80 90 100 110 120orf119a.pep MPKPQPAVKKTAKSQDPAMRNLQEQDAVYIAKQKQAKASPFKTEIETALEESGIIGNSAH
||||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||orf119-1 MPKPQPAVKKTAKPQDPAMRNLQEQDAVYIAKQKQAKASPFKTEIETALEESGIIGNSAH
70 80 90 100 110 120
130 140 150 160 170 180orf119a.pep TVPEPQTGHSAPKPADAPAKPVPVPQTPAKPLITLKELSKVELPWFDVRFDFISYIALTE
|| ||||||||||||||||||:||||||||||||||||||||||||||||||||||||||orf119-1 TVSEPQTGHSAPKPADAPAKPAPVPQTPAKPLITLKELSKVELPWFDVRFDFISYIALTE
130 140 150 160 170 180
190 200 210 220 230 240orf119a.pep AKELHALPRLSNRCRYQIVGCTMDDHFQIAEPIPGIRYQAFIVGIQAVSRNGLASQEELS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf119-1 AKELHALPRLSNRCRYQIVGCTMDDHFQIAEPIPGIRYQAFIVGIQAVSRNGLASQEELS
190 200 210 220 230 240
250 260 270 280 290 300orf119a.pep AFNRQVDAFAHSMGGQTLHTDLAAFIEVASALDAFCARVDQTIAIHLVSPTSISGVELRS
||||||||||:|||||||||||||||||||||||||||||||||||||||||||||||||orf119-1 AFNRQVDAFAQSMGGQTLHTDLAAFIEVASALDAFCARVDQTIAIHLVSPTSISGVELRS
250 260 270 280 290 300
310 320 330 340 350 360orf119a.pep AVTGVGFVLEDDGAFHYTDTSGSTMFSICSLNNEPFTNALLDNQSYKGFSMLLDIPHSPA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf119-1 AVTGVGFVLEDDGAFHYTDTSGSTMFSICSLNNEPFTNALLDNQSYKGFSMLLDIPHSPA
310 320 330 340 350 360
370 380 390 400 410 420orf119a.pep GEKTFDDLFMDLAVRLSGQLNLNLVNDKMEEVSTQWLKDVRTYVLARQSEMLKVGIEPGG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf119-1 GEKTFDDLFMDLAVRLSGQLNLNLVNDKMEEVSTQWLKDVRTYVLARQSEMLKVGIEPGG
370 380 390 400 410 420
429orf119a.pep KTALRLFSX
|||||||||orf119-1 KTALRLFSX
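The identity percentages quoted for these pairwise alignments can be reproduced by counting matching columns. The following is a minimal Python sketch, assuming the two sequences are already aligned to equal length with `-` marking gaps; the function name `percent_identity` is illustrative and is not part of the original analysis, which was produced by an alignment program.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of aligned columns that hold the same residue.

    Assumes both strings come from an existing pairwise alignment
    (equal length, '-' for gaps). Gap columns never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b)
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# First ten residues of ORF119-1 and ORF119a differ at one position (V vs A).
print(percent_identity("MIYIVLFLAV", "MIYIVLFLAA"))  # 90.0
```

The match lines of the ORF119a/ORF119-1 alignment above mark six mismatched columns, and 422/428 rounds to the quoted 98.6%.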
Homology with a predicted ORF from N. gonorrhoeae
ORF119 shows 93.1% identity over a 175-amino-acid overlap with a predicted ORF from N. gonorrhoeae (ORF119ng):
orf119.pep MIYIVLFLAVVLAVVAYNMYQENQYRKKVRDQFGHSDKDALLNSXTSHVRDGKPSGGSVM 60
|||||||||:|||||||||||||||||||||||||||||||||| |||||||||||||||orf119ng MIYIVLFLAAVLAVVAYNMYQENQYRKKVRDQFGHSDKDALLNSKTSHVRDGKPSGGPVM 60orf119.pep MPKPQPAVKKTAKPQDPXMRNLQEQDAVYIAKQKQAKASPFKTEIETALEESGIIGNSAH 120
|||||||||| ||||| ||||||||||||||||||||||||||||||||| ||||||||orf119ng MPKPQPAVKKPAKPQDSAMRNLQEQDAVYIAKQKQAKASPFKTEIETALEEIGIIGNSAH 120orf119.pep TVSEPQTGHSATKPADASAKPAPVPQTPAKPLITLKELSKVELSWFDVRIDFISY 175
||||||||||| ||||| |||:|||||||||||||||||||||||||||:|||||orf119ng TVSEPQTGHSAPKPADAPAKPVPVPQTPAKPLITLKELSKVELPWFDVRFDFISYIALTE 180全长ORF119ng核苷酸序列<SEQ ID 471>是:1 ATGATTTACA TCGTACTGTT CCTCGCCGCC GTCCTCGCCG TTGTCGCCTA51 CAATATGTAT CAGGAAAACC AATACCGCAA AAAAGTGCGC GACCAGTTCG101 GACACTCCGA CAAAGATGCC CTGCTCAACA GCAAAACCAG CCATGTCCGC151 GACGGCAAAC CGTCCGGCGG GCCAGTCATG ATGCCGAAAC CCCAACCGGC201 GGTCAAAAAA CCGGCCAAAC CCCAAGACTC CGCCATGCGC AACCTGCAAG251 AACAGGATGC CGTCTACATC GCCAAGCAGA AACAGGCAAA AGCCTCCCCG301 TTCAAAACCG AAATCGAAAC CGCCTTGGAA GAAATCGGCA TTATCGGCAA351 CTCCGCCCAC ACCGTTTCCG AACCCCAAAC CGGACATTCC GCACCGAAAC401 CTGCCGACGC GCCGGCAAAA CCCGTTCCCG TTCCGCAAAC GCCGGCAAAA451 CCGCTGATTA CGCTCAAAGA GCTGTCGAAG GTCGAGCTGC CCTGGTTTGA501 CGTGCGCTtc gACTTCATCT CCTATATCGC GCTGACCGAA GCCAAAGAAC551 TGCACGCACT GCCGCGCCTT tccAACCGCT GCCGCTACCA GATTGTCGGC601 TGCACCATGG ACGACCATTT CCAGATTGCC GAACCCATCC CGGGCATCCG651 CTATCAGGCA TTTATCGTGG GTATCCAGGC AGTCAGCCGC AACGGACTTG701 CCTCGCAGGA AGAACTCTCC GCATTCAACC GCCAGGCGGA CGCATTCGCA751 CAAAGCATGG GCGGTCAGAC GCTGCACACC GACCTTGCCG CCTTTATCGA801 AGTGGCTTCC GCACTGGACG CATTCTGCGC GCGCGTCGAC CAGACCATCG851 CCATCCATTT GGTTTCGCCG ACCAGCATCA GCGGCGTAGA ACTGCGTTCC901 GCCGTAACGG GCGTGGGTTT CGTTTTGGAA GACGACGGCG CGTTCCACTA951 TACCGACACG TCGGGCTCGA CCATGTTCTC CATCTGCTCG CTCAACAACG1001 AGCCGTTTAC CAATGCCCTT TTGGACAACC AGTCCTACAA AGGCTTCAGT1051 ATGCTGCTCG ACATCCCGCA CTCTCCGGCA GGCGAAAAAA CCTTCGACGA1101 TTTGTTTATG GATTTGGCGG TACGCCTGTC CGGTCAGTTG AACCTGAATC1151 TGGTCAACGA CAAAATGGAA GAAGTTTCGA CCCAATGGCT CAAAGACGTA1201 CGCACTTATG TATTGGCGCG TCAGTCCGAG ATGCTCAAAG TCGGTATCGA1251 ACCGGGCGGC AAAACCGCCC TGCGCCTGTT TTCATAA它编码的蛋白质具有氨基酸序列<SEQ ID 472>:1 MTYTVLFLAA VLAVVAYNMY QENQYRKKVR DQFGHSDKDA LLNSKTSHVR51 DGKPSGGPVM MPKPQPAVKK PAKPQDSAMR NLQEQDAVYI AKQKQAKASP101 FKTEIETALE EIGIIGNSAH TVSEPQTGHS APKPADAPAK PVPVPQTPAK151 PLITLKELSK VELPWFDVRF DFISYIALTE AKELHALPRL SNRCRYQIVG201 CTMDDHFQIA EPIPGIRYQA FIVGIQAVSR NGLASQEELS AFNRQADAFA251 QSMGGQTLHT DLAAFIEVAS 
ALDAFCARVD QTIAIHLVSP TSISGVELRS301 AVTGVGFVLE DDGAFHYTDT SGSTMFSICS LNNEPFTNAL LDNQSYKGFS351 MLLDIPHSPA GEKTFDDLFM DLAVRLSGQL NLNLVNDKME EVSTQWLKDV401 RTYVLARQSE MLKYGIEPGG KTALRLFS*ORF119ng和ORF119-1显示在428个氨基酸的重叠区内有98.4%的相同性:
10 20 30 40 50 60orf119ng MIYIVLFLAAVLAVVAYNMYQENQYRKKVRDQFGHSDKDALLNSKTSHVRDGKPSGGPVM
|||||||||:||||||||||||||||||||||||||||||||||||||||||||||| ||orf119-1 MIYIVLFLAVVLAVVAYNMYQENQYRKKVRDQFGHSDKDALLNSKTSHVRDGKPSGGSVM
10 20 30 40 50 60
70 80 90 100 110 120orf119ng MPKPQPAVKKPAKPQDSAMRNLQEQDAVYIAKQKQAKASPFKTEIETALEEIGIIGNSAH
|||||||||| ||||| |||||||||||||||||||||||||||||||||| ||||||||orf119-1 MPKPQPAVKKTAKPQDPAMRNLQEQDAVYIAKQKQAKASPFKTEIETALEESGIIGNSAH
70 80 90 100 110 120
130 140 150 160 170 180orf119ng TVSEPQTGHSAPKPADAPAKPVPVPQTPAKPLITLKELSKVELPWFDVRFDFISYIALTE
|||||||||||||||||||||:||||||||||||||||||||||||||||||||||||||orf119-1 TVSEPQTGHSAPKPADAPAKPAPVPQTPAKPLITLKELSKVELPWFDVRFDFISYIALTE
130 140 150 160 170 180
190 200 210 220 230 240orf119ng AKELHALPRLSNRCRYQIVGCTMDDHFQIAEPIPGIRYQAFIVGIQAVSRNGLASQEELS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf119-1 AKELHALPRLSNRCRYQIVGCTMDDHFQIAEPIPGIRYQAFIVGIQAVSRNGLASQEELS
190 200 210 220 230 240
250 260 270 280 290 300orf119ng AFNRQADAFAQSMGGQTLHTDLAAFIEVASALDAFCARVDQTIAIHLVSPTSISGVELRS
|||||:||||||||||||||||||||||||||||||||||||||||||||||||||||||orf119-1 AFNRQVDAFAQSMGGQTLHTDLAAFIEVASALDAFCARVDQTIAIHLVSPTSISGVELRS
250 260 270 280 290 300
310 320 330 340 350 360orf119ng AVTGVGFVLEDDGAFHYTDTSGSTMFSICSLNNEPFTNALLDNQSYKGFSMLLDIPHSPA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf119-1 AVTGVGFVLEDDGAFHYTDTSGSTMFSICSLNNEPFTNALLDNQSYKGFSMLLDIPHSPA
310 320 330 340 350 360
370 380 390 400 410 420orf119ng GEKTFDDLFMDLAVRLSGQLNLNLVNDKMEEVSTQWLKDVRTYVLARQSEMLKVGIEPGG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf119-1 GEKTFDDLFMDLAVRLSGQLNLNLVNDKMEEVSTQWLKDVRTYVLARQSEMLKVGIEPGG
370 380 390 400 410 420
429orf119ng KTALRLFSX
|||||||||orf119-1 KTALRLFSX
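Based on this analysis, including the presence of a putative leader sequence in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.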
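Each nucleotide listing in this example is paired with the amino acid sequence it encodes. As a hedged illustration of that correspondence, the Python sketch below translates a DNA listing codon by codon using the standard genetic code. The names `CODON_TABLE` and `translate` are assumptions made for this sketch. It strips the position numbers and spaces found in the listings, and it reports any codon containing an ambiguity code (such as `w` or `y`) as `X`, which matches how the partial sequences above are written.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA listing in reading frame 1.

    Digits, spaces and dots from the listing layout are discarded.
    Codons with non-ACGT letters become 'X'; stop codons become '*'.
    A trailing incomplete codon is ignored.
    """
    seq = "".join(ch for ch in dna.upper() if ch.isalpha())
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(0, len(seq) - 2, 3)
    )


# First 30 nucleotides of SEQ ID 465, as laid out in the listing.
print(translate("1 ATGATTTACA TCGTACTGTT TCTAGCTGTC"))  # MIYIVLFLAV
```

The output agrees with the first ten residues of SEQ ID 466. Translation is not terminated at the first stop codon here, so a full-length listing ends in `*` just as in the sequences above.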
Example 64
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 473>1 ..GCGCGGCACG GCACGGAAGA TTTCTTCATG AACAACAGCG ACAC.ATCAG51 GCAGATAGTC GAAAGCACCA CCGGTACGAT GAAGCTGCTG ATTTCCTCCA101 TCGCCCTGAT TTCATTGGTA GTCGGCGGCA TCGGCGTGAT GAACATCATG151 CTGGTGTCCG TTACCGAGCG CACCAAAGAA ATCGGCATAC GGATGGCAAT201 CGGCGCGCGG CGCGGCAATA TTTyGCAGCA GTTTTTGATT GAGGCGGTGT251 TAATCTGCGT CATCGGCGGT TTGGTCGGCG TGGGTTTGTC CGCCGCCGTC301 AGCCTCGTGT TCAATCATTT TGTAACCGAC TTCCCGATGG ACATTTCCGC351 CATGTCCGTC ATCGGCGCGG TCGCCTGTTC GACCGGAATC GGCATCGCGT401 TCGGCTTTAT GCCTGCCAAT AAAGCAGCCA AACTCAATCC GATAGACGCA451 TTGGCACAGG ATTGA它对应于氨基酸序列<SEQ ID 474;ORF134>:1 ..ARHGTEDFFM NNSDXIRQIV ESTTGTMKLL ISSIALISLV VGGIGVMNIM51 LVSVTERTKE IGIRMAIGAR RGNIXQQFLI EAVLICVIGG LVGVGLSAAV101 SLVFNHFVTD FPMDISAMSV IGAVACSTGI GIAFGFMPAN KAAKLNPIDA151 LAQD*进一步的工作揭示了其完整的核苷酸序列<SEQ ID 475>:1 ATGTCGGTGC AAGCAGTATT GGCGCACAAA ATGCGTTCGC TTCTGACGAT51 GCTCGGCATC ATCATCGGTA TCGCGTCGGT GGTTTCCGTC TCGCATTGG101 GCAATGGTTC GCAGAAAAAA ATCCTTGAAG ACATCAGTTC GATAGGGACG151 AACACCATCA GCATCTTCCC GGGGCGCGGC TTCGGCGACA GGCGCAGCGG201 CAGGATTAAA ACCCTGACCA TAGACGACGC AAAAATCATC GCCAAACAAA251 GCTACGTTGC TTCCGCCACG CCCATGACTT CGAGCGGCGG CACGCTGACT301 TACCGCAACA CCGACCTGAC CGCCTCGCTT TACGGCGTGG GCGAACAATA351 TTTCGACGTG CGCGGACTGA AGCTGGAAAC GGGGCGGCTG TTTGACGAAA401 ACGATGTGAA AGAAGACGCG CAGGTCGTCG TCATCGACCA AAATGTCAAA451 GACAAACTCT TTGCGGACTC GGATCCGTTG GGTAAAACCA TTTTGTTCAG501 GAAACGCCCC TTGACCGTCA TCGGCGTGAT GAAAAAAGAC GAAAACGCTT551 TCGGCAATTC CGACGTGCTG ATGCTTTGGT CGCCCTATAC GACGGTGATG601 CACCAAATCA CAGGCGAGAG CCACACCAAC TCCATCACCG TCAAAATCAA651 AGACAATGCC AATACCCAGG TTGCCGAAAA AGGGCTGACC GATCTGCTCA701 AAGCGCGGCA CGGCACGGAA GATTTCTTCA TGAACAACAG CGACAGCATC751 AGGCAGATAG TCGAAAGCAC CACCGGTACG ATGAAGCTGC TGATTTCCTC801 CATCGCCCTG ATTTCATTGG TAGTCGGCGG CATCGGCGTG ATGAACATCA851 TGCTGGTGTC CGTTACCGAG CGCACCAAAG AAATCGGCAT ACGGATGGCA901 ATCGGCGCGC GGCGCGGCAA TATTTTGCAG CAGTTTTTGA TTGAGGCGGT951 GTTAATCTGC GTCATCGGCG GTTTGGTCGG CGTGGGTTTG TCCGCCGCCG1001 TCAGCCTCGT GTTCAATCAT 
TTTGTAACCG ACTTCCCGAT GGACATTTCC1051 GCCATGTCCG TCATCGGCGC GGTCGCCTGT TCGACCGGAA TCGGCATCGC1101 GTTCGGCTTT ATGCCTGCCA ATAAAGCAGC CAAACTCAAT CCGATAGACG1151 CATTGGCACA GGATTGA它对应于氨基酸序列<SEQ ID 476;ORF134-1>:1 MSVQAVLAHK MRSLLTMLGI IIGIASVVSV VALGNGSQKK ILEDISSIGT51 NTISIFPGRG FGDRRSGRIK TLTIDDAKII AKQSYVASAT PMTSSGGTLT101 YRNTDLTASL YGVGEQYFDV RGLKLETGRL FDENDVKEDA QVVVIDQNVK151 DKLFADSDPL GKTILFRKRP LTVIGVMKKD ENAFGNSDVL MLWSPYTTVM201 HQITGESHTN SITVKIKDNA NTQVAEKGLT DLLKARHGTE DFFMNNSDSI251 RQIVESTTGT MKLLISSIAL ISLVVGGIGV MNIMLVSVTE RTKEIGIRMA301 IGARRGNILQ QFLIEAVLIC VIGGLVGVGL SAAVSLVFNH FVTDFPMDIS351 AMSVIGAVAC STGIGIAFGF MPANKAAKLN PIDALAQD*该氨基酸序列的计算机分析给出了下列结果:与假设的大肠杆菌o648蛋白(登录号为AE000189)的同源性ORF134和o648蛋白显示在153个氨基酸的重叠区内有45%的氨基酸相同性:0rf134:2 RHGTEDFFMNNSDXIRQIVESTTGTMKXXXXXXXXXXXVVGGIGVMNIMLVSVTERTKEI 61
RHG +DFF N D + + VE TT T++ VVGGIGVMNIMLVSVTERT+EI
o648: 496 RHGKKDFFTWNMDGVLKTVEKTTRTLQLFLTLVAVISLVVGGIGVMNIMLVSVTERTREI 555
orf134: 62 GIRMAIGARRGNIXQQFLIEAXXXXXXXXXXXXXXXXXXXXXFNHFVTDFPMDISAMSVI 121
GIRMA+GAR ++ QQFLIEA F+ + + S ++++
o648: 556 GIRMAVGARASDVLQQFLIEAVLVCLVGGALGITLSLLIAFTLQLFLPGWEIGFSPLALL 615
orf134: 122 GAVACSTGIGIAFGFMPANKAAKLNPIDALAQD 154
A CST GI FG++PA AA+L+P+DALA++
o648: 616 LAFLCSTVTGILFGWLPARNAARLDPVDALARE 648
Homology with a predicted ORF from N. meningitidis (strain A)
ORF134 shows 98.7% identity over a 154-amino-acid overlap with an ORF from N. meningitidis strain A (ORF134a):
10 20 30orf134.pep ARHGTEDFFMNNSDXIRQIVESTTGTMKLL
|||||||||||||| |||||||||||||||orf134a GESHTNSITVKIKDNANTQVAEKGLTDLLKARHGTEDFFMNNSDSIRQIVESTTGTMKLL
210 220 230 240 250 260
40 50 60 70 80 90orf134.pep ISSIALISLVVGGIGVMNIMLVSVTERTKEIGIRMAIGARRGNIXQQFLIEAVLICVIGG
|||||||||||||||||||||||||||||||||||||||||||| |||||||||||||||orf134a ISSIALISLVVGGIGVMNIMLVSVTERTKEIGIRMAIGARRGNILQQFLIEAVLICVIGG
270 280 290 300 310 320
100 110 120 130 140 150orf134.pep LVGVGLSAAVSLVFNHFVTDFPMDISAMSVIGAVACSTGIGIAFGFMPANKAAKLNPIDA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf134a LVGVGLSAAVSLVFNHFVTDFPMDISAMSVIGAVACSTGIGIAFGFMPANKAAKLNPIDA
330 340 350 360 370 380orf134.pep LAQDX
|||||orf134a LAQDX全长ORF134a核苷酸序列<SEQ ID 477>是:1 ATGTCGGTGC AAGCAGTATT GGCGCACAAA ATGCGTTCGC TTCTGACGAT551 GCTCGGCATC ATCATCGGTA TCGCTTCGGT TGTCTCCGTC GTCGCATTGG101 GCAACGGTTC GCAGAAAAAA ATCCTTGAAG ACATCAGTTC GATAGGGACG151 AACACCATCA GCATCTTCCC AGGGCGCGGC TTCGGCGACA GGCGCAGCGG201 CAGGATTAAA ACCCTGACCA TAGACGACGC AAAAATCATC GCCAAACAAA251 GCTACGTTGC TTCCGCCACG CCCATGACTT CGAGCGGCGG CACGCTGACT301 TACCGCAATA CCGACCTGAC CGCTTCTTTG TACGGTGTGG GCGAACAATA351 TTTCGACGTG CGCGGGCTGA AGCTGGAAAC GGGGCGGCTG TTTGACGAAA401 ACGATGTGAA AGAAGACGCG CAGGTCGTCG TCATCGACCA AAATGTCAAA 451 GACAAACTCT TTGCGGACTC GGATCCGTTG GGTAAAACCA TTTTGTTCAG501 GAAACGCCCC TTGACCGTCA TCGGCGTGAT GAAAAAAGAC GAAAACGCTT551 TCGGCAATTC CGACGTGCTG ATGCTTTGGT CGCCCTATAC GACGGTGATG601 CACCAAATCA CAGGCGAGAG CCACACCAAC TCCATCACCG TCAAAATCAA651 AGACAATGCC AATACCCAGG TTGCCGAAAA AGGGCTGACC GATCTGCTCA701 AAGCGCGGCA CGGCACGGAA GATTTCTTCA TGAACAACAG CGACAGCATC751 AGGCAGATAG TCGAAAGCAC CACCGGTACG ATGAAGCTGC TGATTTCCTC801 CATCGCCCTG ATTTCATTGG TAGTCGGCGG CATCGGCGTG ATGAACATCA851 TGCTGGTGTC CGTTACCGAG CGCACCAAAG AAATCGGCAT ACGGATGGCA901 ATCGGCGCGC GGCGCGGCAA TATTTTGCAG CAGTTTTTGA TTGAGGCGGT951 GTTAATCTGC GTCATCGGCG GTTTGGTCGG CGTGGGTTTG TCCGCCGCCG1001 TCAGCCTCGT GTTCAATCAT TTTGTAACCG ACTTCCCGAT GGACATTTCC1051 GCCATGTCCG TCATCGGCGC GGTCGCCTGT TCGACCGGAA TCGGCATCGC1101 GTTCGGCTTT ATGCCTGCCA ATAAAGCAGC CAAACTCAAT CCGATAGATG1151 CATTGGCGCA GGATTGA它编码的蛋白质具有氨基酸序列<SEQ ID 478>:1 MSVQAVLAHK MRSLLTMLGI IIGIASVVSV VALGNGSQKK ILEDISSIGT51 NTISIFPGRG FGDRRSGRIK TLTIDDAKII AKQSYVASAT PMTSSGGTLT101 YRNTDLTASL YGVGEQYFDV RGLKLETGRL FDENDVKEDA QVVVIDQNVK151 DKLFADSDPL GKTILFRKRP LTVIGVMKKD ENAFGNSDYL MLWSPYTTVM201 HQITGESHTN SITVKIKDNA NTQVAEKGLT DLLKARHGTE DFFMNNSDSI251 RQIVESTTGT MKLLISSIAL ISLVVGGIGV MNIMLVSVTE RTKEIGIRMA301 IGARRGNILQ QFLIEAVLIC VIGGLVGVGL SAAVSLVFNH FVTDFPMDIS351 AMSVIGAVAC STGIGIAFGF MPANKAAKLN PIDALAQD*ORF134a和ORF134-1显示在388个氨基酸的重叠区内有100.0%的相同性:orf134a.pep MSVQAVLAHKMRSLLTMLGIIIGIASVVSVVALGNGSQKKILEDISSIGTNTISIFPGRG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf134-1 MSVQAVLAHKMRSLLTMLGIIIGIASVVSVVALGNGSQKKILEDISSIGTNTISIFPGRGorf134a.pep FGDRRSGRIKTLTIDDAKIIAKQSYVASATPMTSSGGTLTYRNTDLTASLYGVGEQYFDV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf134-1 FGDRRSGRIKTLTIDDAKIIAKQSYVASATPMTSSGGTLTYRNTDLTASLYGVGEQYFDVorf134a.pep RGLKLETGRLFDENDVKEDAQVVVIDQNVKDKLFADSDPLGKTILFRKRPLTVIGVMKKD
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf134-1 RGLKLETGRLFDENDVKEDAQVVVIDQNVKDKLFADSDPLGKTILFRKRPLTVIGVMKKDorf134a.pep ENAFGNSDVLMLWSPYTTVMHQITGESHTNSITVKIKDNANTQVAEKGLTDLLKARHGTE
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf134-1 ENAFGNSDVLMLWSPYTTVMHQITGESHTNSITVKIKDNANTQVAEKGLTDLLKARHGTEorf134a.pep DFFMNNSDSIRQIVESTTGTMKLLISSIALISLVVGGIGVMNIMLVSVTERTKEIGIRMA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf134-1 DFFMNNSDSIRQIVESTTGTMKLLISSIALISLVVGGIGVMNIMLVSVTERTKEIGIRMAorf134a.pep IGARRGNILQQFLIEAVLICVIGGLVGVGLSAAVSLVFNHFVTDFPMDISAMSVIGAVAC
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf134-1 IGARRGNILQQFLIEAVLICVIGGLVGVGLSAAVSLVFNHFVTDFPMDISAMSVIGAVACorf134a.pep STGIGIAFGFMPANKAAKLNPIDALAQDX
|||||||||||||||||||||||||||||orf134-1 STGIGIAFGFMPANKAAKLNPIDALAQDX
Homology with a predicted ORF from N. gonorrhoeae
ORF134 shows 96.8% identity over a 154-amino-acid overlap with a predicted ORF from N. gonorrhoeae (ORF134ng):
orf134.pep ARHGTEDFFMNNSDXIRQIVESTTGTMKLL 30
|||||||||||||| |||:|||||||||||orf134ng GESHTNSITVKIKDNANTRVAEKGLAELLKARHGTEDFFMNNSDSIRQMVESTTGTMKLL 264orf134.pep ISSIALISLVVGGIGVMNIMLVSVTERTKEIGIRMAIGARRGNIXQQFLIEAVLICVIGG 90
|||||||||||||||||||||||||||||||||||||||||||| |||||||||||:|||orf134ng ISSIALISLVVGGIGVMNIMLVSVTERTKEIGIRMAIGARRGNILQQFLIEAVLICIIGG 324orf134.pep LVGVGLSAAVSLVFNHFVTDFPMDISAMSVIGAVACSTGIGIAFGFMPANKAAKLNPIDA 150
||||||||||||||||||||||||||| ||||||||||||||||||||||||||||||||orf134ng LVGVGLSAAVSLVFNHFVTDFPMDISAASVIGAVACSTGIGIAFGFMPANKAAKLNPIDA 384orf134.pep LAQD 154
||||orf134ng LAQD 388全长ORF134ng核苷酸序列<SEQ ID 479>是:1 ATGTCGGTGC AAGCAGTATT GGCGCACAAA ATGCGTTCGC TTCTGACCAT51 GCTCGGCATC ATCATCGGTA TCGCTTCGGT TGTCTCCGTC GTCGCGCTGG101 GCAACGGTTC GCAGAAAAAA ATCCTCGAAG ACATCAGTTC GATGGGGACG151 AACACCATCA GCATCTTCCC CGGGCGCGGC TTCGGCGACA GGCGCAGCGG201 CAAAATCAAA ACCCTGACCA TAGACGACGC AAAAATCATC GCCAAACAAA251 GCTACGTTGC CTCCGCCACG CCCATGACTT CGAGCGGCGG CACGCTGACC301 TACCGCAATA CCGACCTGAC CGCTTCTTTG TACGGTGTGG GCGAACAATA351 TTTCGACGTG CGCGGGCTGA AGCTGGAAAC GGGGCGGCTG TTTGATGAGA401 ACGATGTGAA AGAAGACGCG CAAGTCGTCG TCATCGACCA AAATGTCAAA451 GACAAACTCT TTGCGGACTC GGATCCGTTG GGTAAAACCA TTTTGTTCAG501 GAAACGCCCC TTGACCGTCA TCGGCGTGAT GAAAAAAGAC GAAAACGCTT551 TCGGCAATTC CGACGTGCTG ATGCTTTGGT CGCCCTATAC GACGGTGATG601 CACCAAATCA CAGGCGAGAG CCACACCAAC TCCATCACCG TCAAAATCAA651 AGACAATGCC AATACCCGGG TTGCCGAAAA AGGGCTGGCC GAGCTGCTCA701 AAGCACGGCA CGGCACGGAA GACTTCTTTA TGAACAACAG CGACAGCATC751 AGGCAGATGG TCGAAAGCAC CACCGGTACG ATGAAGCTGC TGATTTCCTC801 CATCGCCCTG ATTTCATTGG TAGTCGGCGG CATCGGTGTG ATGAACATTA851 TGCTGGTGTC CGTTACCGAG CGCACCAAAG AAATCGGCAT ACGGATGGCA901 ATCGGCGCGC GGCGCGGCAA TATTTTGCAG CAGTTTTTGA TTGAGGCGGT951 GTTAATCTGC ATCATCGGAG GCTTGGTCGG CGTAGGTTTG TCCGCCGCCG1001 TCAGCCTCGT GTTCAATCAT TTTGTAACCG ATTTCCCGAT GGACATTTCG1051 GCGGCATCCG TTATCGGGGC GGTCGCCTGT TCGACCGGAA TCGGCATCGC1101 GTTCGGCTTT ATGCCTGCCA ATAAGGCAGC CAAACTCAAT CCGATAGATG1151 CATTGGCGCA GGATTGA它编码的蛋白质具有氨基酸序列<SEQ ID 480>:1 MSVQAVLAHK MRSLLTMLGI IIGIASVVSV VALGNGSQKK ILEDISSMGT51 NTISIFPGRG FGDRRSGKIK TLTIDDAKII AKQSYVASAT PMTSSGGTLT101 YRNTDLTASL YGVGEQYFDV RGLKLETGRL FDENDVKEDA QVVVIDQNVK151 DKLFADSDPL GKTILFRKRP LTVIGVMKKD ENAFGNSDVL MLWSPYTTVM201 HQITGESHTN SITVKIKDNA NTRVAEKGLA ELLKARHGTE DFFMNNSDSI251 RQMVESTTGT MKLLISSIAL ISLVVGGIGV MNIMLVSVTE RTKEIGIRMA301 IGARRGNILQ QFLIEAVLIC IIGGLVGVGL SAAVSLVFNH FVTDFPMDIS351 AASVIGAVAC STGIGIAFGF MPANKAAKLN PIDALAQD*ORF134ng和ORF134-1显示在388个氨基酸的重叠区内有97.9%的相同性:orf134ng MSVQAVLAHKMRSLLTMLGIIIGIASVVSVVALGNGSQKKILEDISSMGTNTISIFPGRG
|||||||||||||||||||||||||||||||||||||||||||||||:||||||||||||orf134-1 MSVQAVLAHKMRSLLTMLGIIIGIASVVSVVALGNGSQKKILEDISSIGTNTISIFPGRGorf134ng FGDRRSGKIKTLTIDDAKIIAKQSYVASATPMTSSGGTLTYRNTDLTASLYGVGEQYFDV
|||||||:||||||||||||||||||||||||||||||||||||||||||||||||||||orf134-1 FGDRRSGRIKTLTIDDAKIIAKQSYVASATPMTSSGGTLTYRNTDLTASLYGVGEQYFDVorf134ng RGLKLETGRLFDENDVKEDAQVVVIDQNVKDKLFADSDPLGKTILFRKRPLTVIGVMKKD
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf134-1 RGLKLETGRLFDENDVKEDAQVVVIDQNVKDKLFADSDPLGKTILFRKRPLTVIGVMKKDorf134ng ENAFGNSDVLMLWSPYTTVMHQITGESHTNSITVKIKDNANTRVAEKGLAELLKARHGTE
||||||||||||||||||||||||||||||||||||||||||:||||||::|||||||||orf134-1 ENAFGNSDVLMLWSPYTTVMHQITGESHTNSITVKIKDNANTQVAEKGLTDLLKARHGTEorf134ng DFFMNNSDSIRQMVESTTGTMKLLISSIALISLVVGGIGVMNIMLVSVTERTKEIGIRMA
||||||||||||:|||||||||||||||||||||||||||||||||||||||||||||||orf134-1 DFFMNNSDSIRQIVESTTGTMKLLISSIALISLVVGGIGVMNIMLVSVTERTKEIGIRMAorf134ng IGARRGNILQQFLIEAVLICIIGGLVGVGLSAAVSLVFNHFVTDFPMDISAASVIGAVAC
||||||||||||||||||||:|||||||||||||||||||||||||||||||||||||||orf134-1 IGARRGNILQQFLIEAVLICVIGGLVGVGLSAAVSLVFNHFVTDFPMDISAMSVIGAVACorf134ng STGIGIAFGFMPANKAAKLNPIDALAQDX
|||||||||||||||||||||||||||||orf134-1 STGIGIAFGFMPANKAAKLNPIDALAQDX
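The columns in the alignment above that are not marked `|` are the residues that differ between the two proteins. As a small illustrative sketch (the helper name `substitutions` and its `offset` parameter are assumptions, not part of the original analysis), the differing positions of two aligned sequences can be listed as follows:

```python
def substitutions(aligned_a: str, aligned_b: str, offset: int = 1):
    """Return (position, residue_a, residue_b) for every differing column.

    Assumes the inputs are already aligned to equal length. `offset` is
    the position number of the first column, so fragments taken from the
    middle of a protein can keep the numbering of the full listing.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(aligned_a, aligned_b), start=offset)
        if a != b
    ]


# Residues 221-234 of ORF134ng (first) and ORF134-1 (second).
print(substitutions("NTRVAEKGLAELLK", "NTQVAEKGLTDLLK", offset=221))
# [(223, 'R', 'Q'), (230, 'A', 'T'), (231, 'E', 'D')]
```

Applied to the full alignment, the same count gives eight differing columns out of 388, which is consistent with the stated 97.9% identity.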
ORF134ng also shows homology to an E. coli ABC transporter:
sp|P75831|YBJZ_ECOLI hypothetical ABC transporter ATP-binding protein YBJZ >gi5(AE000189) o648: similar to YBBA_HAEIN SW:P45247 [Escherichia coli] Length = 648
Score = 297 bits (753), Expect = 6e-80
Identities = 162/389 (41%), Positives = 230/389 (58%), Gaps = 1/389 (0%)
Query: 1 MSVQAVLAHKMRSLLTMLXXXXXXXXXXXXXXLGNGSQKKILEDISSMGTNTISIFPGRG 60
M+ +A+ A+KMR+LLTML +G+ +++ +L DI S+GTNTI ++PG+
Sbjct: 260 MAWRALAANKMRTLLTMLGIIIGIASVVSIVVVGDAAKQMVLADIRSIGTNTIDVYPGKD 319
Query: 61 FGDRRSGKIKTLTIDDAKIIAKQSYVASATPMTSSGGTLTYRNTDLTASLYGVGEQYFDV 120
FGD + L DD I KQ +VASATP S L Y N D+ AS GV YF+V
Sbjct: 320 FGDDDPQYQQALKYDDLIAIQKQPWVASATPAVSQNLRLRYNNVDVAASANGVSGDYFNV 379
Query: 121 RGLKLETGRLFDENDVKEDAQVVVIDQNVKDKLFAD-SDPLGKTILFRKRPLTVIGVMKK 179
G+ G F++ + AQVVV+D N + +LF +D +G+ IL P VIGV ++
Sbjct: 380 YGMTFSEGNTFNQEQLNGRAQVVVLDSNTRRQLFPHKADVVGEVILVGNMPARVIGVAEE 439
Query: 180 DENAFGNSDVLMLWSPYTTVMHQITGESHTNSITVKIKDNANTRVAEKGLAELLKARHGT 239
++ FG+S VL +W PY+T+ ++ G+S NSITV++K+ ++ AE+ L LL RHG
Sbjct: 440 KQSMFGSSKVLRVWLPYSTMSGRVMGQSWLNSITVRVKEGFDSAEAEQQLTRLLSLRHGK 499
Query: 240 EDFFMNNSDSIRQMVESTTGTMKXXXXXXXXXXXVVGGIGVMNIMLVSVTERTKEIGIRM 299
+DFF N D + + VE TT T++ VVGGIGVMNIMLVSVTERT+EIGIRM
Sbjct: 500 KDFFTVNMDGVLKTVEKTTRTLQLFLTLVAVISLVVGGIGVMNIMLVSVTERTREIGIRM 559
Query: 300 AIGARRGNILQQFLIEXXXXXXXXXXXXXXXXXXXXXXFNHFVTDFPMDISAASVIGAVA 359
A+GAR ++LQQFLIE F+ + + S +++ A
Sbjct: 560 AVGARASDVLQQFLIEAVLVCLVGGALGITLSLLIAFTLQLFLPGWEIGFSPLALLLAFL 619
Query: 360 CSTGIGIAFGFMPANKAAKLNPIDALAQD 388
CST GI FG++PA AA+L+P+DALA++
Sbjct: 620 CSTVTGILFGWLPARNAARLDPVDALARE 648
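Based on this analysis (including the presence of a leader peptide and transmembrane regions in the gonococcal protein), it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 65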
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 481>:1 ..GGGACGGGAG CGATGCTGCT GCTGTTTTAC GCGGTAACGA T.CTGCCTTT551 GGCCACTGGC GTTACCCTGA GTTACACCTC GTCGATTTTT TTGGCGGTAT101 TTTCGTFCCT GATTTTGAAA GAACGGATTT CCGTTTACAC GCAGGCGGTG151 CTGCTCCTTG GTTTTGCCGG CGTGGTATTG CTGCTTAATC CCTCGTTCCG201 CAGCGGTCAG GAAACGGCGG CACTCGCCGG GCTGGCGGGC GGCGCGATGT251 CCGGCTGGGC GTATTTGAAA GTGCGCGAAC TGTCTTTGGC GGGCGAACCC301 GGCFGGCGCG TCGTGTTTTA CCTTTCCGTG ACAGGTGTGG CGATGTCGTC351 GGTTTGGGCG ACGCTGACCG GCTGGCACAC CCTGTCCTTT CCATCGGCAG401 TTTATCTGTC GTGCATCGGC GTGTCCGCGC TGATTGCCCA ACTGTCGATG451 ACCCCCGCCT ACAAAGTCGG CGACAAATTC ACGGTTGCCT CGCTTTCCTA501 TATGACCGTC GTTTTTTCCG CTCTGTCTGC CGCATTTTTT CTGGGCGAAG551 AGCTTTTCTG GCAGGAAATA CTCGGTATGT GCATCATCAT CCTCAGCGGT601 ATTTTGA它对应于氨基酸序列<SEQ ID 482;ORF135>:1 ..GTGAALLLFY AVTILPLATG VTLSYTSSIF LAVFSFLILK ERISVYTQAV51 LLLGFAGVVL LLNPSFRSGQ ETAALAGLAG GAMSGWAYLK VRELSLAGEP101 GWRVVFYLSV TGVAMSSVWA TLTGWHTLSF PSAVYLSCIG VSALIAQLSM151 TRAYKVGDKF TVASLSYMTV VFSALSAAFF LGEELFWQEI LGMCIIISAV201 F*进一步的工作揭示了完整的核苷酸序列<SEQ ID 483>:1 ATGGATACCG CAAAAAAAGA CATTTTAGGA TCGGGCTGGA TGCTGGTGGC51 GGCGGCCTGC TTTACCATTA TGAACGTATT GATTAAAGAG GCATCGGCAA101 AATTTGCCCT CGGCAGCGGC GAATTGGTCT TTTGGCGCAT GCTGTTTTCA151 ACCGTTGCGC TCGGGGCTGC CGCCGTATTG CGTCGGGACA mCTTCCGCAC201 GCCCCATTGG AAAAACCACT TAAACCGCAG TATGGTCGGG ACGGGGGCGA251 TGCTGCTGCT GTTTTACGCG GTAACGCATC TGCCTTTGGC CACTGGCGTT301 ACCCTGAGTT ACACCTCGTC GATTTTTTTG GCGGTATTTT CCTTCCTGAT351 TTTGAAAGAA CGGATTTCCG TTTACACGCA GGCGGTGCTG CTCCTTGGTT401 TTGCCGGCGT GGTATTGCTG CTTAATCCCT CGTTCCGCAG CGGTCAGGAA451 ACGGCGGCAC TCGCCGGGCT GGCGGGCGGC GCGATGTCCG GCTGGGCGTA501 TTTGAAAGTG CGCGAACTGT CTTTGGCGGG CGAACCCGGC TGGCGCGTCG551 TGTTTTACCT TTCCGTGACA GGTGTGGCGA TGTCGTCGGT TTGGGCGACG601 CTGACCGGCT GGCACACCCT GTCCTTTCCA TCGGCAGTTT ATCTGTCGTG651 CATCGGCGTG TCCGCGCTGA TTGCCCAACT GTCGATGACG CGCGCCTACA701 AAGTCGGCGA CAAATTCACG GTTGCCTCGC TTTCCTATAT GACCGTCGTT751 TTTTCCGCTC TGTCTGCCGC ATTTTTTCTG GGCGAAGAGC TTTTCTGGCA801 GGAAATACTC GGTATGTGCA TCATCATCCT 
CAGCGGTATT TTGAGCAGCA851 TCCGCCCCAC TGCCTTCAAA CAGCGGCTGC AATCCCTGTT CCGCCAAAGA901 TAA它对应于氨基酸序列<SEQ ID 484;ORF135-1>:1 MDTAKKDILG SGWMLVAAAC FTIMNVLIKE ASAKFALGSG ELVFWRMLFS51 TVALGAAAVL RRDXFRTPHW KNHLNRSMVG TGAMLLLFYA VTHLPLATGV101 TLSYTSSIFL AVFSFLILKE RISVYTQAVL LLGFAGVVLL LNPSFRSGQE151 TAALAGLAGG AMSGWAYLKV RFLSLAGEPG WRVVFYLSVT GVAMSSVWAT201 LTGWHTLSFP SAVYLSCIGV SALIAQLSMT RAYKVGDKFT VASLSYMTVV251 FSALSAAFFL GEELFWQEIL GMCIIILSGI LSSIRPTAFK QRLQSLFRQR301 *
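Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF135 shows 99.0% identity over a 197-amino-acid overlap with an ORF from N. meningitidis strain A (ORF135a):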
10 20 30orf135.pep GTGAMLLLFYAVTILPLATGVTLSYTSSIF
||||||||||||| ||||||||||||||||orf135a STVALGAAAVLRRDTFRTPHWKNHLNRSMVGTGAMLLLFYAVTHLPLATGVTLSYTSSIF
50 60 70 80 90 100
40 50 60 70 80 90orf135.pep LAVFSFLILKERISVYTQAVLLLGFAGVVLLLNPSFRSGQETAALAGLAGGAMSGWAYLK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf135a LAVFSFLILKERISVYTQAVLLLGFAGVVLLLNPSFRSGQETAALAGLAGGAMSGWAYLK
110 120 130 140 150 160
100 110 120 130 140 150orf135.pep VRELSLAGEPGWRVVFYLSVTGVAMSSVWATLTGWHTLSFPSAVYLSCIGVSALIAQLSM
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf135a VRELSLAGEPGWRVVFYLSVTGVAMSSVWATLTGWHTLSFPSAVYLSCIGVSALIAQLSM
170 180 190 200 210 220
160 170 180 190 200orf135.pep TRAYKVGDKFTVASLSYMTVVFSALSAAFFLGEELFWQEILGMCIIISAVFX
|||||||||||||||||||||||||||||||:|||||||||||||||orf135a TRAYKVGDKFTVASLSYMTVVFSALSAAFFLAEELFWQEILGMCIIILSGILSSIRPTAF
230 240 250 260 270 280orf135a KQRLQSLFRQRX
290 300全长ORF135a核苷酸序列<SEQ ID 485>是:1 ATGGATACCG CAAAAAAAGA CATTTTAGGA TCGGGCTGGA TGCTGGTGGC51 GGCGGCCTGC TTTACCATTA TGAACGTATT GATTAAAGAG GCATCGGCAA101 AATTTGCCCT CGGCAGCGGC GAATTGGTCT TTTGGCGCAT GCTGTTTTCA151 ACCGTTGCGC TCGGGGCTGC CGCCGTATTG CGTCGGGACA CCTTCCGCAC201 GCCCCATTGG AAAAACCACT TAAACCGCAG TATGGTCGGG ACGGGGGCGA251 TGCTGCTGCT GTTTTACGCG GTAACGCATC TGCCTTTGGC CACCGGCGTT301 ACCCTGAGTT ACACCTCGTC GATTTTTTTG GCGGTATTTT CCTTCCTGAT351 TTTGAAAGAA CGGATTTCCG TTTACACGCA GGCGGTGCTG CTCCTTGGTT401 TTGCCGGCGT GGTATTGCTG CTTAATCCCT CGTTCCGCAG CGGTCAGGAA451 ACGGCGGCAC TCGCCGGGCT GGCGGGCGGC GCGATGTCCG GCTGGGCGTA501 TTTGAAAGTG CGCGAACTGT CTTTGGCGGG CGAACCCGGC TGGCGCGTCG551 TGTTTTACCT TTCCGTGACA GGTGTGGCGA TGTCATCGGT TTGGGCGACG601 CTGACCGGCT GGCACACCCT GTCCTTTCCA TCGGCAGTTT ATCTGTCGTG651 CATCGGCGTG TCCGCGCTGA TTGCCCAACT GTCGATGACG CGCGCCTACA701 AAGTCGGCGA CAAATTCACG GTTGCCTCGC TTTCCTATAT GACCGTCGTT751 TTTTCCGCTC TGTCTGCCGC ATTTTTTCTG GCCGAAGAGC TTTTCTGGCA801 GGAAATACTC GGTATGTGCA TCATCATCCT CAGCGGTATT TTGAGCAGCA851 TCCGCCCCAC TGCCTTCAAA CAGCGGCTGC AATCCCTGTT CCGCCAAAGA901 TAA它编码的蛋白质具有氨基酸序列<SEQ ID 486>:1 MDTAKKDILG SGWMLVAAAC FTIMNVLIKE ASAKFALGSG ELVFWRMLFS51 TVALGAAAVL RRDTFRTPHW KNHLNRSMVG TGAMLLLFYA VTHLPLATGV101 TLSYTSSIFL AVFSFLILKE RISVYTQAVL LLGFAGVVLL LNPSFRSGQE151 TAALAGLAGG AMSGWAYLKV RELSLAGEPG WRVVFYLSVT GVAMSSVWAT201 LTGWHTLSFP SAVYLSCIGV SALIAQLSMT RAYKVGDKFT VASLSYMTVV251 FSALSAAFFL AEELFWQEIL GMCIIILSGI LSSIRPTAFK QRLQSLFRQR301 *ORF135a和ORF135-1显示在300个氨基酸的重叠区内有99.3%的相同性:orf135a.pep MDTAKKDILGSGWMLVAAACFTIMNVLIKEASAKFALGSGELVFWRMLFSTVALGAAAVL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf135-1 MDTAKKDILGSGWMLVAAACFTIMNVLIKEASAKFALGSGELVFWRMLFSTVALGAAAVLorf135a.pep RRDTFRTPHWKNHLNRSMVGTGAMLLLFYAVTHLPLATGVTLSYTSSIFLAVFSFLILKE
|||:||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf135-1 RRDXFRTPHWKNHLNRSMVGTGAMLLLFYAVTHLPLATGVTLSYTSSIFLAVFSFLILKEorf135a.pep RISVYTQAVLLLGFAGVVLLLNPSFRSGQETAALAGLAGGAMSGWAYLKVRELSLAGEPG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf135-1 RISVYTQAVLLLGFAGVVLLLNPSFRSGQETAALAGLAGGAMSGWAYLKVRELSLAGEPGorf135a.pep WRVVFYLSVTGVAMSSVWATLTGWHTLSFPSAVYLSCIGVSALIAQLSMTRAYKVGDKFT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf135-1 WRVVFYLSVTGVAMSSVWATLTGWHTLSFPSAVYLSCIGVSALIAQLSMTRAYKVGDKFTorf135a.pep VASLSYMTVVFSALSAAFFLAEELFWQEILGMCIIILSGILSSIRPTAFKQRLQSLFRQR
||||||||||||||||||||:||||||||||||||||||||||||||||:||||||||||orf135-1 VASLSYMTVVFSALSAAFFLGEELFWQEILGMCIIILSGILSSIRPTAFKQRLQSLFRQR
Homology with a predicted ORF from N. gonorrhoeae
ORF135 shows 97% identity over a 201-amino-acid overlap with a predicted ORF from N. gonorrhoeae (ORF135ng):
orf135.pep GTGAMLLLFYAVTXLPLATGVTLSYTSSIF 30
||||||||||||| |||:||||||||||||orf135ng STVTLGAAAVLRRDTFRTPHWKNHLNRSMVGTGAMLLLFYAVTHLPLTTGVTLSYTSSIF 335orf135.pep LAVFSFLILKERISVYTQAVLLLGFAGVVLLLNPSFRSGQETAALAGLAGGAMSGWAYLK 90
||||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||orf135ng LAVFSFLILKERISVYTQAVLLLGFAGVVLLLNPSFRSGQEPAALAGLAGGAMSGWAYLK 395orf135.pep VRELSLAGEPGWRVVFYLSVTGVAMSSVWATLTGWHTLSFPSAVYLSCIGVSALIAQLSM 150
|||||||||||||||||||:||||||||||||||||||||||||||| ||||||||||||orf135ng VRELSLAGEPGWRVVFYLSATGVAMSSVWATLTGWHTLSFPSAVYLSGIGVSALIAQLSM 455orf135.pep TRAYKVGDKFTVASLSYMTVVFSALSAAFFLGEELFWQEILGMCIIISAVF 201
||||||||||||||||||||||||||||||||||||||||| |||||||:|orf135ng TRAYKVGDKFTVASLSYMTVVFSALSAAFFLGEELFWQEILGMCIIISAAF 506
预计ORF135ng核苷酸序列<SEQ ID 487>编码的蛋白质具有氨基酸序列<SEQ ID488>:1 MPSEKAFRRH LRTASFQGLH LHHFHQKVGK CGIIGFGIHI FPTLLPAAQG51 ILDIQLGLFR IDFAALAVYR RTQVDFIHTV IDGIASDQAF SEVVQILRRL101 NLGHFTDTHL IAQARRFIAD FGNIRPMRRG EAKTFCRCFR FDGIDGIHGD151 FRQCGHINRL APGKDCRNGK RDKVFFHTRH YNQVCLEKTN CSARKIKFRH201 QKQAKTHSTS LAARFTIRPS LSQRPFMDTA KKDILGSGWM LVAAACFTVM251 NVLIKEASAK FALGSGELVF WRMLFSTVTL GAAAVLRRDT FRTPHWKNHL301 NRSMVGTGAM LLLFYAVTHL PLTTGVTLSY TSSIFLAVFS FLILKERISV351 YTQAVLLLGF AGVVLLLNPS FRSGQEPAAL AGLAGGAMSG WAYLKVRELS401 LAGEPGWRVV FYLSATGVAM SSVWATLTGW HTLSFPSAVY LSGIGVSALI451 AQLSMTRAYK VGDKFTVASL SYMTVVFSAL SAAFFLGEEL FWQEILGMCI501 IISAAF*进一步的工作揭示了下列淋球菌序列<SEQ ID 489>:1 ATGGATACCG CAAAAAAAGA CATTTTAGGA TCGGGCTGGA TGCTGGTGGC551 GGCGGCCTGC TTCACCGTTA TGAACGTATT GATTAAAGAG GCATCGGCAA101 AATTTGCCCT CGGCAGCGGC GAATTGGTCT TTTGGCGCAT GCTGTTTTCA151 ACCGTTACGC TCGGTGCTGC CGCCGTATTG CGGCGCGACA CCTTCCGCAC201 GCCCCATTGG AAAAACCACT TAAACCGCAG TATGGTCGGG ACGGGGGCGA251 TGCTGCTGCT GTTTTACGCG GTAACGCATC TGCCTTTGAC AACCGGCGTT301 ACCCTGAGTT ACACCTCGTC GATTTTTttg GCGGTATTTT CCTTCCTGAT351 TTTGAAAGAA CGGATTTCCG TTTACACGCA GGCGGTGCTG CTCCTTGGTT401 TTGCCGGCGT GGTATTGCTG CTTAATCCCT CGTTCCGCAG CGGTCAGGAA451 CCGGCGGCAC TCGCCGGGCT GGCGGGCGGC GCGATGTCCG GCTGGGCGTA501 TTTGAAAGTG CGCGAACTGT CTTTGGCGGG CGAACCCGGC TGGCGCGTCG551 TGTTTTACCT TTCCGCAACC GGCGTGGCGA TGTCGTCggt ttgggcgacg601 Ctgaccggct ggCACAcccT GTCCTTTcca tcggcagttt ATCtgtCGGG651 CATCGGCGTG tccgcgCtgA TTGCCCAaCT GtcgatgAcg cGCGcctaca701 aaGTCGGCGA CAAATTCACG GTTGCCTCGC tttcctaTAt gaccgtcGTC751 TTTTCCGCCC TGTCTGCCGC ATTTTTTCTg ggcgaagagc tttTCtggCA801 GGAAATACTC GGTATGTGCA TCATTAtccT CAGCGGCATT TTGAGCAGCA851 TCCGCCCCAT TGCCTTCAAA CAGCGGCTGC AAGCCCTCTT CCGCCAAAGA901 TAA它对应于氨基酸序列<SEQ ID 490;ORF135ng-1>:1 MDTAKKDILG SGWMLVAAAC FTVMNVLIKE ASAKFALGSG ELVFWRMLFS51 TVTLGAAAVL RRDTFRTPHW KNHLNRSMVG TGAMLLLFYA VTHLPLTTGV101 TLSYTSSIFL AVFSFLILKE RISVYTQAVL LLGFAGVVLL LNPSFRSGQE151 PAALAGLAGG AMSGWAYLKV RELSLAGEPG WRVVFYLSAT GVAMSSVWAT201 LTGWHTLSFP 
SAVYLSGIGV SALIAQLSMT RAYKVGDKFT VASLSYMTVV251 FSALSAAFFL GEELFWQEIL GMCIIILSGI LSSIRPIAFK QRLQALFRQR301 *ORF135ng-1和ORF135-1显示在300个氨基酸的重叠区内有97.0%的相同性:orf135ng-1.pep MDTAKKDILGSGWMLVAAACFTVMNVLIKEASAKFALGSGELVFWRMLFSTVTLGAAAVL
||||||||||||||||||||||:|||||||||||||||||||||||||||||:|||||||orf135-1 MDTAKKDILGSGWMLVAAACFTIMNVLIKEASAKFALGSGELVFWRMLFSTVALGAAAVLorf135ng-1.pep RRDTFRTPHWKNHLNRSMVGTGAMLLLFYAVTHLPLTTGVTLSYTSSIFLAVFSFLILKE
|||:||||||||||||||||||||||||||||||||:|||||||||||||||||||||||orf135-1 RRDXFRTPHWKNHLNRSMVGTGAMLLLFYAVTHLPLATGVTLSYTSSIFLAVFSFLILKEorf135ng-1.pep RISVYTQAVLLLGFAGVVLLLNPSFRSGQEPAALAGLAGGAMSGWAYLKVRELSLAGEPG
|||||||||||||||||||||||||||||| |||||||||||||||||||||||||||||orf135-1 RISVYTQAVLLLGFAGVVLLLNPSFRSGQETAALAGLAGGAMSGWAYLKVRELSLAGEPGorf135ng-1.pep WRVVFYLSATGVAMSSVWATLTGWHTLSFPSAVYLSGIGVSALIAQLSMTRAYKVGDKFT
||||||||:||||||||||||||||||||||||||| |||||||||||||||||||||||orf135-1 WRVVFYLSVTGVAMSSVWATLTGWHTLSFPSAVYLSCIGVSALIAQLSMTRAYKVGDKFTorf135ng-1.pep VASLSYMTVVFSALSAAFFLGEELFWQEILGMCIIILSGILSSIRPIAFKQRLQALFRQR
|||||||||||||||||||||||||||||||||||||||||||||| |||||||:|||||orf135-1 VASLSYMTVVFSALSAAFFLGEELFWQEILGMCIIILSGILSSIRPTAFKQRLQSLFRQR
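Based on this analysis, including the presence of several putative transmembrane domains in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.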
Example 66
在脑膜炎奈瑟球菌中鉴定出下列DNA序列<SEQ ID 491>:1 ATGAAGCGGC GTATAGCCGT CTTCGTCCTG TTCCCGCAGA TAATCCGAGT51 TTTGGGACAA CTGTTGCCGA AAATCGTCAA TACAGTTCCG GCACATCGGA101 TGCTCTTCCA GATTTTCGGG ATGTTCTTTT TCTTCATACA CCAGCAATAT151 CTGCCCGGGA TCGCCGAAAT CGATTCCCCA TGCGGCATCG TGTTCGGTGC201 GCTCCTCTTC CGTCATCTGC CCGCGCATTG CCTGTATGGT AAAGCCGCCG251 TAGGGGATGC CgTTGCACAC GAACATCCAG TCGCTGATGT CGTCAACCGG301 AACGCAAACG cTTTCGCCTT GTTCGACATT GGTCAGTTCG CCsGGTTCAT351 TGTTCAGCAC ACCGTAAATA TAAAGACCGT CAAAATAAAT ATCGTCGATC401 CACATATGTT CGCAAATTTC GCCGTCTTCG CCGTCTTGGA AAAAAGGGAC451 TTTGACCATG GCAAAATCCA AGGCGGAAAT AATGCGGCGG CGTTCCCAAA501 AAAGcTCGCG CCAAAAATAT TTGAATGTTT TACGGGCGCG TTCGTCGGCA551 CGGTTTACCG GTTCGTCTGC CTGTTCTACA TAATAAATGA CGGAATCGCC601 CATCATATCT GCTCCTCAAC GTGTACGGTA TCTGTTTGCA CCTTACTGCG651 GCTTTCTgcC kTCGGCATCC GATTCGGATT TGAAAAGTTC mmrwyATTCG701 GAATAG它对应于氨基酸序列<SEQ ID 492;ORF136>:1 MKRRIAVFVL FPQIIRVLGQ LLPKIVNTVP AHRMLFQIFG MFFFFIHQQY51 LPGIAEIDSP CGIVFGALLF RHLPAHCLYG KAAVGDAVAH EHPVADVVNR101 NANAFALFDI GQFAXFIVQH TVNIKTVKIN IVDPHMFANF AVFAVLEKRD151 FDHGKIQGGN NAAAFPKKLA PKIFECFTGA FVGTVYRFVC LFYIINDGIA201 HHSAPQRVRY LFAPYCGFLP SASDSDLKSS XXSE*进一步的工作揭示了完整的核苷酸序列<SEQ ID 493>:1 ATGATGAAGC GGCGTATAGC CGTCTTCGTC CTGTTCCCGC AGATAATCCG51 AGTTTTGGGA CAACTGTTGC CGAAAATCGT CAATACAGTT CCGGCACATC101 GGATGCTCTT CCAGATTTTC GGGATGTTCT TTTTCTTCAT ACACCAGCAA151 TATCTGCCCG GGATCGCCGA AATCGATTCC CCATGCGGCA TCGTGTTCGG201 TGCGCTCCTC TTCCGTCATC TGCCCGCGCA TTGCCTGTAT GGTAAAGCCG251 CCGTAGGGGA TGCCGTTGCA CACGAACATC CAGTCGCTGA TGTCGTCAAC301 CGGAACGCAA ACGCTTTCGC CTTGTTCGAC ATTGGTCAGT TCGCCGGGTT351 CATTGTTCAG CACACCGTAA ATATAAAGAC CGTCAAAATA AATATCGTCG401 ATCCACATAT GTTCGCAAAT TTCGCCGTCT TCGCCGTCTT GGAAAAAAGG451 GACTTTGACC ATGGCAAAAT CCAAGGCGGA AATAATGCGG CGGCGTTCCC501 AAAAAAGCTC GCGCCAAAAA TATTTGAATG TTTTACGGGC GCGTTCGTCG551 GCACGGTTTA CCGGTTCGTC TGCCTGTTCT ACATAATAAA TGACGGAATC601 GCCCATCATT CTGCTCCTCA ACGTGTACGG TATCTGTTTG CACCTTACTG651 CGGCTTTCTG CCTTCGGCAT CCGATTCGGA TTTGAAAAGT TCCAAATATT701 
CGGAATAG
This corresponds to the amino acid sequence <SEQ ID 494; ORF136-1>:
1 MMKRRIAVFV LFPQIIRVLG QLLPKIVNTV PAHRMLFQIF GMFFFFIHQQ
51 YLPGIAEIDS PCGIVFGALL FRHLPAHCLY GKAAVGDAVA HEHPVADVVN
101 RNANAFALFD IGQFAGFIVQ HTVNIKTVKI NIVDPHMFAN FAVFAVLEKR
151 DFDHGKIQGG NNAAAFPKKL APKIFECFTG AFVGTVYRFV CLFYIINDGI
201 AHHSAPQRVR YLFAPYCGFL PSASDSDLKS SKYSE*
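The correspondence between a nucleotide listing and its amino acid sequence can be checked with a short translation routine. The following is a minimal sketch, assuming the standard genetic code and reading frame 1 starting at the first base; the helper names `clean_listing` and `translate` are illustrative and do not come from the patent. Position numbers and spaces in the printed listing are stripped, and any codon containing an ambiguous base (for example N, K or M) is reported as X.

```python
import string

# Standard genetic code, built from the conventional T, C, A, G ordering
# of the three codon positions.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean_listing(text):
    """Keep only ASCII letters, dropping position numbers and spaces."""
    return "".join(ch for ch in text.upper() if ch in string.ascii_uppercase)


def translate(dna):
    """Translate from the first base; codons with ambiguous bases give 'X'."""
    seq = clean_listing(dna)
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE.get(seq[pos:pos + 3], "X")
        protein.append(residue)
        if residue == "*":  # stop codon ends the ORF
            break
    return "".join(protein)


# First 30 bases of SEQ ID 493 as printed in the listing.
print(translate("1 ATGATGAAGC GGCGTATAGC CGTCTTCGTC"))  # MMKRRIAVFV
```

The ten residues obtained agree with the start of ORF136-1 (MMKRRIAVFV).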
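Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF136 shows 71.7% identity over a 237 aa overlap with an ORF (ORF136a) from strain A of N. meningitidis: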
10 20 30 40 50 59orf136.pep MKRRIAVFVLFPQIIRVLGQLLPKIVNTVPAHRMLFQIFGMFFFFIHQQYLPGIAEIDS
||||||||||: | ||:|||||||||||||||||||| |||||||||||||||||||||orf136a MMKRRIAVFVLLMQKIRILGQLLPKIVNTVPAHRMLFQXFGMFFFFIHQQYLPGIAEIDS
10 20 30 40 50 60
60 70 80 90 100 110 119orf136.pep PCGIVFGALLFRHLPAHCLYGKAAVGDAVAHEHPVADVVNRNANAFALFDIGQFAXFIVQ
|||||||:||||| :||||||||||:|||||||||||||||||||||||||||| ||||orf136a PCGIVFGTLLFRHXSTHCLYGKAAVGNAVAHEHPVADVVNRNANAFALFDIGQFAGFIVQ
70 80 90 100 110 120
120 130 140 150 160 170 179orf136.pep HTVNIKTVKINIVDPHMFANFAVFAVLEKRDFDHGKIQGGNNAAAFPKKLAPKIFECFTG
|::|:||||||||||||||||| ||||||| : :| : |: | :: : :orf136a HAINVKTVKINIVDPHMFANFAXFAVLEKRALTMAKSKXXXMRRRSQKSSRQKYLNVLRA
130 140 150 160 170 180
180 190 200 210 220 230orf136.pep AFVGTVYRFVCLFYIINDGIAHH---SAPQRVRYLFAPYCGFLPSASDSDLKSSXXSEX
: ||: | : ::: ||||||||||||||||||||||||||| |||orf136a R---SPARFTGLSACSTXXMTESPIISAPQRVRYLFAPYCGFLPSASDSDLKSSKYSEX
190 200 210 220 230全长ORF136a核苷酸序列<SEQ ID 495>是:1 ATGATGAAGC GGCGTATAGC CGTCTTCGTC CTGCTCATGC AGAAAATCCG51 GATTTTGGGA CAACTGTTGC CGAAAATCGT CAATACAGTT CCGGCACATC101 GGATGCTCTT CCAGATNTTC GGGATGTTCT TTTTCTTCAT ACACCAGCAA151 TACCTGCCCG GGATCGCCGA AATCGATTCC CCATGCGGCA TCGTGTTCGG201 TACGCTCCTC TTCCGTCATC NGTCCACGCA TTGCCTGTAT GGTAAAGCCG251 CCGTAGGGAA TGCCGTTGCA CACGAACATC CAGTCGCTGA TGTCGTCAAC301 CGGAACGCAA ACGCTFTCGC CTTGTTCGAC ATTGGTCAGT TCGCCGGGTT351 CATTGTTCAG CACGCCATAA ATGTAAAGAC CGTCAAAATA AATATCGTCG401 ATCCACATAT GTTCGCAAAT TTCGCCNTCT TCGCCGTCTT GGAAAAAAGG451 GCTTTGACCA TGGCAAAATC TAAGGNGNNA NNGATGCGGC GGCGTTCCCA501 AAAAAGCTCG CGCCAAAAAT ATTTGAATGT TTTGCGGGCG CGTTCGCCGG551 CACGGTTTAC CGGTTTGTCT GCCTGTTCTA CATAATAAAT GACGGAATCG601 CCCATCATAT CTGCTCCTCA ACGTGTACGG TATCTGTTTG CACCTTACTG651 CGGCTTTCTG CCTTCGGCAT CCGATTCGGA TTTGAAAAGT TCCAAATATT701 CGGAATAG它编码的蛋白质具有氨基酸序列<SEQ ID 496>:1 MMKRRIAVFV LLMQKIRILG QLLPKIVNTV PAHRMLFQXF GMFFFFIHQQ51 YLPGIAEIDS PCGIVFGTLL FRHXSTHCLY GKAAVGNAVA HEHPVADVVN101 RNANAFALFD IGQFAGFIVQ HAINVKTVKI NIVDPHMFAN FAXFAVLEKR151 ALTMAKSKXX XMRRRSQKSS RQKYLNVLRA RSPARFTGLS ACST**MTES201 PIISAPQRVR YLFAPYCGFL PSASDSDLKS SKYSE*ORF136a和ORF136-1显示在238个氨基酸的重叠区内有73.1%的相同性:
10 20 30 40 50 60orf136a.pep MMKRRIAVFVLLMQKIRILGQLLPKIVNTVPAHRMLFQXFGMFFFFIHQQYLPGIAEIDS
|||||||||||: | ||:|||||||||||||||||||| |||||||||||||||||||||orf136-1 MMKRRIAVFVLFPQIIRVLGQLLPKIVNTVPAHRMLFQIFGMFFFFIHQQYLPGIAEIDS
10 20 30 40 50 60
70 80 90 100 110 120orf136a.pep PCGIVFGTLLFRHXSTHCLYGKAAVGNAVAHEHPVADVVNRNANAFALFDIGQFAGFIVQ
|||||||:||||| :||||||||||:|||||||||||||||||||||||||||||||||orf136-1 PCGIVFGALLFRHLPAHCLYGKAAVGDAVAHEHPVADVVNRNANAFALFDIGQFAGFIVQ
70 80 90 100 110 120
130 140 150 160 170 180orf136a.pep HAINVKTVKINIVDPHMFANFAXFAVLEKRALTMAKSKXXXMRRRSQKSSRQKYLNVLRA
|::|:||||||||||||||||| ||||||| : :| : |: | :: : :orf136-1 HTVNIKTVKINIVDPHMFANFAVFAVLEKRDFDHGKIQGGNNAAAFPKKLAPKIFECFTG
130 140 150 160 170 180
190 200 210 220 230orf136a.pep R---SPARFTGLSACSTXXMTESPIISAPQRVRYLFAPYCGFLPSASDSDLKSSKYSEX
: ||: | : ::: ||||||||||||||||||||||||||||||||orf136-1 AFVGTVYRFVCLFYIINDGIAHH---SAPQRVRYLFAPYCGFLPSASDSDLKSSKYSEX
190 200 210 220 230
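The identity figures quoted for the alignments above are the share of aligned columns in which both sequences carry the same residue. The following is a minimal sketch of that calculation, assuming the two rows are already aligned and padded with `-` for gaps; the function name `percent_identity` is illustrative, and how the original alignment program treated gaps and ambiguous X residues is not stated in the text, so the counting rule here is an assumption.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Return (percent identity, overlap length) for two aligned rows."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Columns where both rows hold a gap carry no information; skip them.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("empty alignment")
    # A column counts as identical only when both residues match
    # and neither of them is a gap.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns), len(columns)


# First 20 aligned columns of ORF136a and ORF136-1.
print(percent_identity("MMKRRIAVFVLLMQKIRILG", "MMKRRIAVFVLFPQIIRVLG"))
# (80.0, 20)
```

Applied to a full-length alignment, the same function gives a figure comparable to the percentages reported in the text, subject to the counting assumptions above.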
Homology with a predicted ORF from N. gonorrhoeae
ORF136和淋病奈瑟球菌的预计ORF(ORF136ng)在234个氨基酸的重叠区内显示出有92.3%的相同性:orf136.pep MKRRIAVFVLFPQIIRVLGQLLPKIVNTVPAHRMLFQIFGMFFFFIHQQYLPGIAEIDS 59
||||||||||: | ||:||||||||||||||||||||||||||||||:|||||||||||orf136ng MMKRRIAVFVLLMQKIRILGQLLPKIVNTVPAHRMLFQIFGMFFFFIHRQYLPGIAEIDS 60orf136.pep PCGIVFGALLFRHLPAHCLYGKAAVGDAVAHEHPVADVVNRNANAFALFDIGQFAXFIVQ 119
| |||||:|||||| |||||||||||||||||||||||:|||||||||||||| | ||||orf136ng PGGIVFGTLLFRHLSAHCLYGKAAVGDAVAHEHPVADVANRNANAFALFDIGQSAGFIVQ 120orf136.pep HTVNIKTVKINIVDPHMFANFAVFAVLEKRDFDHGKIQGGNNAAAFPKKLAPKIFECFTG 179
|||||||||||||||||||||||||||||||||||||||||||||||||||||:||||||orf136ng HTVNIKTVKINIVDPHMFANFAVFAVLEKRDFDHGKIQGGNNAAAFPKKLAPKVFECFTG 180orf136.pep AFVGTVYRFVCLFYIINDGIAHHSAPQRVRYLFAPYCGFLPSASDSDLKSSXXSE 234
||:||||||||||||||||||||:|||||||||||| |||| ||||||||| ||orf136ng AFAGTVYRFVCLFYIINDGIAHHTAPQRVRYLFAPYRGFLPPASDSDLKSSKYSE 235全长ORF136ng核苷酸序列<SEQ ID 497>是:1 ATGATGAAGC GGCGTATAGC CGTCTTCGTC CTGCTCATGC AGAAAATCCG51 GATTTTGGGA CAACTGTTGC CGAAAATCGT CAATACAGTT CCGGCACATC101 GGATGCTCTT CCAAATTTTC GGGATGTTCT TTTTCTTCAT ACACCGGCAA151 TACCTGCCCG GGATCGCCGA AATCGATTCC CCAGGCGGTA TCGTGTTCGG201 TACGCTCCTC TTCCGTCATC TGTCCGCGCA TTGCCTGTAC GGTAAAGCCG251 CCGTAGGGGA TGCCGTTGCA CACGAACATC CAGTCGCTGA TGTCGCCAAC301 CGGAACGCAA ACGCTTTCGC CTTGTTCGAC ATTGGTCAGT CCGCCGGGTT351 CATTGTTCAG CACACCGTAA ATATAAAGAC CGTCAAAATA AATATCGTCG401 ATCCACATAT GTTCGCAAAT TTCGCCGTCT TCGCCGTCTT GGAAAAAAGG451 GACTTTGACC ATGGCAAAAT CCAAGGCGGA AATAATGCGG CGGCGTTCCC501 AAAAAAGCTC GCGCCAAAAG TATTTGAATG TTTTACGGGC GCGTTCGCCG551 GCACGGTTTA CCGGTTCGTC TGCCTGTTCT ACATAATAAA TGACGGAATC601 GCCCATCATA CTGCTCCTCA ACGTGTACGG TATCTGTTTG CACCTTACCG551 CGGTTTTCTA CCTCCGGCAT CCGATTCGGA TTTGAAAAGT TCCAAATATT701 CGGAATAG它编码的蛋白质具有氨基酸序列<SEQ ID 498>:1 MMKRRIAVFV LLMQKIRILG QLLPKIVNTV PAHRMLFQIF GMFFFFIHRQ51 YLPGIAEIDS PGGIVFGTLL FRHLSAHCLY GKAAVGDAVA HEHPVADVAN101 RNANAFALFD IGQSAGFIVQ HTVNIKTVKI NIVDPHMFAN FAVFAVLEKR151 DFDHGKIQGG NNAAAFPKKL APKVFECFTG AFAGTVYRFV CLFYIINDGI201 AHHTAPQRVR YLFAPYRGFL PPASDSDLKS SKYSE*ORF136ng和ORF136-1显示在235个氨基酸的重叠区内有93.6%的相同性:orf136ng MMKRRIAVFVLLMQKIRILGQLLPKIVNTVPAHRMLFQIFGMFFFFIHRQYLPGIAEIDS
|||||||||||: | ||:||||||||||||||||||||||||||||||:|||||||||||orf136-1 MMKRRIAVFVLFPQIIRVLGQLLPKIVNTVPAHRMLFQIFGMFFFFIHQQYLPGIAEIDSorf136ng PGGIVFGTLLFRHLSAHCLYGKAAVGDAVAHEHPVADVANRNANAFALFDIGQSAGFIVQ
| |||||:|||||| |||||||||||||||||||||||:|||||||||||||| ||||||orf136-1 PCGIVFGALLFRHLPAHCLYGKAAVGDAVAHEHPVADVVNRNANAFALFDIGQFAGFIVQorf136ng HTVNIKTVKINIVDPHMFANFAVFAVLEKRDFDHGKIQGGNNAAAFPKKLAPKVFECFTG
|||||||||||||||||||||||||||||||||||||||||||||||||||||:||||||orf136-1 HTVNIKTVKINIVDPHMFANFAVFAVLEKRDFDHGKIQGGNNAAAFPKKLAPKIFECFTGorf136ng AFAGTVYRFVCLFYIINDGIAHHTAPQRVRYLFAPYRGFLPPASDSDLKSSKYSEX
||:||||||||||||||||||||:|||||||||||| |||| ||||||||||||||orf136-1 AFVGTVYRFVCLFYIINDGIAHHSAPQRVRYLFAPYCGFLPSASDSDLKSSKYSEX
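Based on the presence of a putative transmembrane domain in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.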
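The text does not say which program predicted the putative transmembrane domain. One common way to screen for such regions is a sliding-window hydropathy average, sketched below under stated assumptions: the Kyte-Doolittle hydropathy scale, a 19-residue window and a cut-off of 1.6 are conventional choices and not values taken from the patent, and the function names are illustrative.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Mean hydropathy of every window; unknown residues (e.g. X) score 0."""
    values = [KYTE_DOOLITTLE.get(res, 0.0) for res in sequence.upper()]
    if len(values) < window:
        return []
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def candidate_tm_windows(sequence, window=19, threshold=1.6):
    """1-based start positions of windows whose mean reaches the cut-off."""
    return [i + 1
            for i, mean in enumerate(hydropathy_profile(sequence, window))
            if mean >= threshold]


# First 60 residues of ORF136-1, used here only as sample input.
orf136_1_start = ("MMKRRIAVFVLFPQIIRVLGQLLPKIVNTVPAHRMLFQIF"
                  "GMFFFFIHQQYLPGIAEIDS")
print(candidate_tm_windows(orf136_1_start))
```

Windows flagged this way are only candidates; dedicated topology predictors apply further criteria.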
Example 67
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 499>:1 ATGGAAAATA TGGTAACGTT TTCAAAAATC AGACCGCTTT TGGCAATCGC51 CGCCGCCGCG TTGCTTGCCG CC.TGCGGAC GGCGGGARAT AATGCTGTCC101 GCAAGCCGGT GCAAACCGCC AAACCCGCCG CAGTGGTCGG TTTGGCACTC151 GGTGGCGGCG CATCTAAAGG ATTTGCCCAT GTAGGTATTA TTAAGGTTTT201 GAAAGAAAAC GGTATTCCTG TGAAGGTGGT TACCGGCACC TCCGCAGGTT251 CGATTGTCGG CAACCTTTTT GCATCGGGTA TGTCGCCCGA CCGCCTCGAA301 TTGGAAGCCG AAATTTTAGG CAAAACCGAT TTGGTCGATT TAACCTTGTC351 CACCAATGGG TTTATCAAAG GCGCAAAGCT GCAAAATTAC ATCAACCGAA401 AACTCCGCGG CATGCAGATT CAGCAGTTTC CCATCAAATT TGCCGCC..它对应于氨基酸序列<SEQ ID 500;ORF137>:1 MENMVTFSKI RPLLAIAAAA LLAAXRTAGN NAVRKPVQTA KPAAVVGLAL51 GGGRSKGFAH VGIIKVLKEN GIPVKVVTGT SAGSIVGNLF ASGMSPDRLE101 LEAEILGKTD LVDLTLSTNG FIKGAKLQNY INRKLRGMQI QQFPIKFAA..进一步的工作揭示了完整的核苷酸序列<SEQ ID 501>:1 ATGGAAAATA TGGTAACGTT TTCAAAAATC AGACCGCTTT TGGCAATCGC51 CGCCGCCGCG TTGCTTGCCG CCTGCGCCAC GGCGGGAAAT AATGCTGTCC101 GCAAGCCGGT GCAAACCGCC AAACCCGCCG CAGTGGTCGG TTTGGCACTC151 GGTGGCGGCG CATCTAAAGG ATTTGCCCAT GTAGGTATTA TTAAGGTTTT201 GAAAGAAAAC GGTATTCCTG TGAAGGTGGT TACCGGCACA TCGGCAGGTT251 CGATTGTCGG CAGCCTTTTT GCATCGGGTA TGTCGCCCGA CCGCCTCGAA301 TTGGAAGCCG AAATTTTAGG CAAAACCGAT TTGGTCGATT TAACCTTGTC351 CACCAGTGGT TTTATCAAAG GCGAAAAGCT GCAAAATTAC ATCAACCGAA401 AAGTCGGCGG CAGGCAGATT CAGCAGTTTC CCATCAAATT TGCCGCCGTT451 GCTACTGATT TTGAAACCGG CAAGGCCGTC GCTTTCAATC AGGGGAATGC501 CGGGCAGGCT GTGCGCGCTT CCGCCGCCAT TCCCAATGTG TTCCAACCCG551 TTATCATCGG CAGGCATACA TATGTTGACG GCGGTCTGTC GCAGCCCGTG601 CCCGTCAGTG CCGCCXGGCG GCAGGGGGCG AATTTCGTGA TTGCCGTCGA651 TATTTCCGCC CGTCCGGGCA AAAACATCAG CCAAGGTTTC TTCTCTTATC701 TCGATCAGAC GCTGAACGTA ATGAGCGTTT CTGCGTTGCA AAATGAGTTG751 GGGCAGGCGG ATGTGGTTAT CAAACCGCAG GTTTTGGATT TGGGTGCAGT801 CGGCGGATTC GATCAGAAAA AACGCGCCAT CCGGTTGGGT GAGGAGGCAG851 CACGTGCCGC ATTGCCTGAA ATCAAACGCA AACTGGCGGC ATACCGTTAT901 TGA它对应于氨基酸序列<SEQ ID 502;ORF137-1>:1 MENMVTFSKI RPLLAIAAAA LLAACGTAGN NAVRKPVQTA KPAAVVGLAL51 GGGASKGFAH VGIIKVLKEN GIPVKVVTGT SAGSIVGSLF ASGMSPDRLE101 LEAEILGKTD 
LVDLTLSTSG FIKGEKLQNY INRKVGGRQI QQFPIKFAAV151 ATDFETGKAV AFNQGNAGQA VRASAAIPNV FQPVIIGRHT YVDGGLSQPV201 PVSAARRQGA NFVIAVDISA RPGKNISQGF FSYLDQTLNV MSVSALQNEL251 GQADVVIKPQ VLDLGAVGGF DQKKRAIRLG EEAARAALPE IKRKLAAYRY301 *
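Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF137 shows 93.3% identity over a 149 aa overlap with an ORF (ORF137a) from strain A of N. meningitidis: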
10 20 30 40 50 60orf137.pep MENMVTFSKIRPLLAIAAAALLAAXRTAGNNAVRKPVQTAKPAAVVGLALGGGASKGFAH
|||||||||||||||||||||||| ||||||:|||||||||||||||||||||||||||orf137a MENMVTFSKIRPLLAIAAAALLAACGTAGNNAARKPVQTAKPAAVVGLALGGGASKGFAH
10 20 30 40 50 60
70 80 90 100 110 120orf137.pep VGIIKVLKENGIPVKVVTGTSAGSIVGNLFASGMSPDRLELEAEILGKTDLVDLTLSTNG
|||||||||||||||||||||||||||:||||||||||||||||||||||||||||||:|orf137a VGIIKVLKENGIPVKVVTGTSAGSIVGSLFASGMSPDRLELEAEILGKTDLVDLTLSTSG
70 80 90 100 110 120
130 140 149orf137.pep FIKGAKLQNYINRKLRGMQIQQFPIKFAA
|||| |||||||||: | :||||||||||orf137a FIKGEKLQNYINRKVGGRRIQQFPIKFAAVATDFETGKAVAFNQGNAGQAVRASAAIPNV
130 140 150 160 170 180全长ORF137a核苷酸序列<SEQ ID 503>是:1 ATGGAAAATA TGGTAACGTT TTCAAAAATC AGACCGCTTT TGGCAATCGC51 CGCCGCCGCG TTGCTTGCCG CCTGCGGCAC GGCGGGAAAT AATGCTGCCC101 GCAAGCCGGT GCAAACCGCC AAACCCGCCG CAGTGGTCGG TTTGGCACTC151 GGTGGCGGCG CATCTAAAGG ATTTGCCCAT GTAGGTATTA TTAAGGTTTT201 GAAAGAAAAC GGTATTCCTG TGAAGGTGGT TACCGGCACA TCGGCAGGTT251 CGATAGTCGG CAGCCTTTTT GCATCGGGTA TGTCGCCCGA CCGCCTCGAA301 TTGGAACCCG AAATTTTAGG TAAAACCGAT TTGGTCGATT TAACCTTGTC351 CACCAGTGGT TTTATCAAAG GCGAAAAGCT GCAAAATTAC ATCAACCGAA401 AAGTCGGCGG CAGGCGGATT CAGCAGTTTC CCATCAAATT TGCCGCCGTT451 GCTACTGATT TTGAAACCGG CAAGGCCGTC GCTTTCAATC AAGGGAATGC501 CGGGCAGGCT GTGCGCGCTT CCGCCGCCAT TCCCAATGTG TTCCAACCCG551 TTATCATCGG CAGGCATACA TATGTTGACG GCGGTCTGTC GCAGCCCGTG601 CCCGTCAGTG CCGCCCGGCG GCANGNNNNG NATNTCGTGA TTGCCGTCGA651 TATTTCCGCC CGTCCGAGCA AAAACATCAG CCAAGGCTTC TTCTCTTATC701 TCGATCAGAC GCTGAACGTA ATGAGCGTTT CCGCGTTGCA AAATGAGTTG751 GGGCAGGCGG ATGTGGTTAT CAAACCGCAG GTTTTGGATT TGGGTGCAGT801 CGGCGGATTC GATCAGAAAA AACGCGCCAT CCGGTTGGGT GAGGAGGCAG851 CACGTGCCGC ATTGCCTGAA ATCAAACGCA AACTGGCGGC ATACCGTTAT901 TGA它编码的蛋白质具有氨基酸序列<SEQ ID 504>:1 MENMVTFSKI RPLLAIAAAA LLAACGTAGN NAARKPVQTA KPAAVVGLAL51 GGGASKGFAH VGIIKVLKEN GIPVKVVTGT SAGSIVGSLF ASGMSPDRLE101 LEAEILGKTD LVDLTLSTSG FIKGEKLQNY INRKVGGRRI QQFPIKFAAV151 ATDFETGKAV AFNQGNAGQA VRASAAIPNV FQPVIIGRHT YVDGGLSQPV201 PVSAARRXXX XXVIAVDISA RPSKNISQGF FSYLDQTLNV MSVSALQNEL251 GQADVVIKPQ VLDLGAVGGF DQKKRAIRLG EEAARAALPE IKRKLAAYRY301 *ORF137a和ORF137-1显示在300个氨基酸的重叠区内有97.3%的相同性:orf137a.pep MENMVTFSKIRPLLAIAAAALLAACGTAGNNAARKPVQTAKPAAVVGLALGGGASKGFAH
|||||||||||||||||||||||:||||||||||||||||||||||||||||||||||||orf137-1 MENMVTFSKIRPLLAIAAAALLAACGTAGNNAVRKPVQTAKPAAVVGLALGGGASKGFAHorf137a.pep VGIIKVLKENGIPVKVVTGTSAGSIVGSLFASGMSPDRLELEAEILGKTDLVDLTLSTSG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf137-1 VGIIKVLKENGIPVKVVTGTSAGSIVGSLFASGMSPDRLELEAEILGKTDLVDLTLSTSGorf137a.pep FIKGEKLQNYINRKVGGRRIQQFPIKFAAVATDFETGKAVAFNQGNAGQAVRASAAIPNV
||||||||||||||||||:|||||||||||||||||||||||||||||||||||||||||orf137-1 FIKGEKLQNYINRKVGGRQIQQFPIKFAAVATDFETGKAVAFNQGNAGQAVRASAAIPNVorf137a.pep FQPVIIGRHTYVDGGLSQPVPVSAARRXXXXXVIAVDISARPSKNISQGFFSYLDQTLNV
||||||||||||||||||||||||||| ||||||||||:|||||||||||||||||orf137-1 FQPVIIGRHTYVDGGLSQPVPVSAARRQGANFVIAVDISARPGKNISQGFFSYLDQTLNVorf137a.pep MSVSALQNELGQADVVIIPQVLDLGAVGGFDQKKRAIRLGEEAARAALPEIKRKLAAYRY
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf137-1 MSVSALQNELGQADVVIKPQVLDLGAVGGFDQKKRAIRLGEEAARAALPEIKRKLAAYRY
Homology with a predicted ORF from N. gonorrhoeae
ORF137 shows 89.9% identity over a 149 aa overlap with a predicted ORF (ORF137ng) from N. gonorrhoeae:
orf137.pep MENMVTFSKIRPLLAIAAAALLAAXRTAGNNAVRKPVQTAKPAAVVGLALGGGASKGFAH 60
||||||||||| :||||||||||| ||||||:|||||||||||||:|||||||||||||
orf137ng MENMVTFSKIRSFLAIAAAALLAACGTAGNNAARKPVQTAKPAAVVALALGGGASKGFAH 60
orf137.pep VGIIKVLKENGIPVKVVTGTSAGSIVGNLFASGMSPDRLELEAEILGKTDLVDLTLSTNG 120
:||:|||||||||||||||||||||||:|:||||||||||||||||||||||||||||:|
orf137ng IGIVKVLKENGIPVKVVTGTSAGSIVGSLLASGMSPDRLELEAEILGKTDLVDLTLSTSG 120
orf137.pep FIKGAKLQNYINRKLRGMQIQQFPIKFAA 149
|||| |||||||||: | |||||||||||
orf137ng FIKGEKLQNYINRKVGGRQIQQFPIKFAAVATDFETGKAVAFNQGNAGQAVRASAAIPNV 180
全长ORF137ng核苷酸序列<SEQ ID 505>是:1 ATGGAAAATA TGGTAACGTT TTCAAAAATC AGATCATTTT TGGCAATCCC51 CGCCGCCGCG TTGCTTGCCG CCTGCGGTAC GGCGGGAAAC AATGCCGCCC101 GCAAGCCGGT GCAAACCGCC AAACCCGCCG CAGTGGTCGC TTTGGCACTC151 GGTGGCGGCG CATCTAAAGG ATTTGCCCAT ATAGGAATTG TTAAGGTTTT201 GAAAGAAAAC GGTATTCCTG TGAAGGTGGT TACCGGCACA TCGGCAGGTT251 CGATAGTCGG CAGCCTTTTG GCATCGGGTA TGTCGCCCGA CCGCCTCGAA301 TTGGAAGCCG AGATTTTAGG TAAAACCGAT TTAGTCGATT TAACCTTGTC351 CACCAGTGGT TTTATCAAAG GCGAAAAGCT GCAAAATTAC ATCAACCGAA401 AAGTCGGCGG CAGGCAGATT CAGCAGTTTC CCATCAAATT TGCCGCCGTT451 GCCACTGATT TTGAAACCGG CAAGGCCGTC GCTTTCAATC AAGGGAATGC501 CGGGCAGGCG GTTCGTGCTT CCGCCGCCAT TCCCAATGTG TTCCAGCCAG551 TCATCATCGG CAGGCACAAA TATGTTGACG GCGGTCTGTC GCAGCCCGTG601 CCCGTCAGTG CCGCTCGGCG GCAGGGGGCG AATTTCGTGA TTGCCGTCGA651 TATTTCCGCA CGTCCGAGCA AAAATGTCGG TCAAGGTTTC TTCTCTTATC701 TCGATCAGAC GCTGAACGTG ATGAGCGTTT CCGTGTTGCA AAACGAGTTG751 gggcAGGCGG ATGTGGTTAT CAAACCGCag gtTTTGGATT TGGGTGCAGT801 CGGCGGATTC GATCAGAAAA AGCGCGCCAT CCGGTTGGGC GAGGAGGCAG851 CACGTGCCGC ATTGCCTGAA ATCAAACGCA AACTGGCGGC ATACCGTTAT901 TGA它编码的蛋白质具有氨基酸序列<SEQ ID 506>:1 MENMVTFSKI RSFLAIAAAA LLAACGTAGN NAARKPVQTA KPAAVVALAL51 GGGASKGFAH IGIVKVLKEN GIPVKVVTGT SAGSIVGSLL ASGMSPDRLE101 LHAEILGKTD LVDLTLSTSG FIKGEKLQNY INRKVGGRQI QQFPIKFAAV151 ATDFETGKAV AFNQGNAGQA VRASAAIPNV FQPVIIGRHK YVDGGLSQPV201 PVSAARRQGA NFVIAVDISA RPSKNVGQGF FSYLDQTLNV MSVSVLQNEL251 GQADVVIKPQ VLDLGAVGGF DQKKRAIRLG EEAARAALPE IKRKLAAYRY301 *ORF137ng和ORF137-1显示在300个氨基酸的重叠区内有96.0%的相同性:orf137ng MENMVTFSKIRSFLAIAAAALLAACGTAGNNAARKPVQTAKPAAVVALALGGGASKGFAH
||||||||||| :|||||||||||||||||||:|||||||||||||:|||||||||||||orf137-1 MENMVTFSKIRPLLAIAAAALLAACGTAGNNAVRKPVQTAKPAAVVGLALGGGASKGFAHorf137ng IGIVKVLKENGIPVKVVTGTSAGSIVGSLLASGMSPDRLELEAEILGKTDLVDLTLSTSG
:||:|||||||||||||||||||||||||:||||||||||||||||||||||||||||||orf137-1 VGIIKVLKENGIPVKVVTGTSAGSIVGSLFASGMSPDRLELEAEILGKTDLVDLTLSTSGorf137ng FIKGEKLQNYINRKVGGRQIQQFPIKFAAVATDFETGKAVAFNQGNAGQAVRASAAIPNV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf137-1 FIKGEKLQNYINRKVGGRQIQQFPIKFAAVATDFETGKAVAFNQGNAGQAVRASAAIPNV
orf137ng FQPVIIGRHKYVDGGLSQPVPVSAARRQGANFVIAVDISARPSKNVGQGFFSYLDQTLNV
||||||||| ||||||||||||||||||||||||||||||||:||::|||||||||||||
orf137-1 FQPVIIGRHTYVDGGLSQPVPVSAARRQGANFVIAVDISARPGKNISQGFFSYLDQTLNV
orf137ng MSVSVLQNELGQADVVIKPQVLDLGAVGGFDQKKRAIRLGEEAARAALPEIKRKLAAYRY
||||:|||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf137-1 MSVSALQNELGQADVVIKPQVLDLGAVGGFDQKKRAIRLGEEAARAALPEIKRKLAAYRY
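Based on the presence of a predicted prokaryotic membrane lipoprotein lipid attachment site (underlined) in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.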
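The underlining that marked the lipid attachment site is not preserved in this text, and the prediction program is not named. The sketch below shows how such a site could be located, assuming a simplified regular expression written in the style of the PROSITE prokaryotic lipoprotein signature and a cysteine between positions 15 and 35. The exact pattern, the position window and the function name are assumptions, and the additional criteria that real tools apply are omitted.

```python
import re

# Assumed lipobox-style pattern: six residues that are not D, E, R or K,
# then hydrophobic/small residues, ending in the cysteine that is lipidated.
LIPOBOX = re.compile(r"[^DERK]{6}[LIVMFWSTAG]{2}[LIVMFYSTAGCQ][AGS]C")
PATTERN_LENGTH = 11


def lipid_attachment_cysteines(protein, min_pos=15, max_pos=35):
    """Return 1-based positions of cysteines that end a lipobox-like motif."""
    seq = protein.upper()
    hits = []
    for index, residue in enumerate(seq):
        position = index + 1
        if residue != "C" or not (min_pos <= position <= max_pos):
            continue
        start = index + 1 - PATTERN_LENGTH
        # Test the 11 residues that end at this cysteine.
        if start >= 0 and LIPOBOX.fullmatch(seq[start:index + 1]):
            hits.append(position)
    return hits


# N-terminal residues of the gonococcal protein (SEQ ID 506).
print(lipid_attachment_cysteines("MENMVTFSKIRSFLAIAAAALLAACGTAGNN"))  # [25]
```

Under these assumptions the candidate site is the cysteine at position 25, the first cysteine of the gonococcal sequence.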
Example 68
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 507>:1 ATGTTTCGTT TACAATTCAG GCTGTTTCCC CCTTTGCGAA CCGCCATGCA51 CATCCTGTTG ACCGCCCTGC TCAAATGCCT CTCCCTGcTG CCGCTTTCCT101 GTCTGCACAC GCTGGGAAAC CGGCTCGGAC ATCTGGCGTT TTACCTTTTA151 AAGGAAGACC GCGCGCGCAT CGTCGCCmAT ATGCGGCAGG CGGGTTTGAA201 CCCCGACCCC AAAACGGTCA AAGCCGTTTT TGCGGAAACG GCAAAAGGCG251 GTTTGGAACT TGCCCCCGCG TTTTTCAGAA AACCGGAAGA CATAGAAACA301 ATGTTCAAAG CGGTACACGG CTGGGAACAT GTGCAGCAGG CTTTGGACAA351 ACACGAAGGG CTGCTATTC..它对应于氨基酸序列<SEQ ID 508;ORF138>:1 MFRLQFRLFP PLRTAMHILL TALLKCLSLL PLSCLHTLGN RLGHLAFYLL51 KEDRARIVAX MRQAGLNPDP KTVKAVFAET AKGGLELAPA FFRKPEDIET101 MFKAVHGWEH VQQALDKHEG LLF进一步的工作揭示了完整的核苷酸序列<SEQ ID 509>:1 ATGTTTCGTT TACAATTCAG GCTGTTTCCC CCTTTGCGAA CCGCCATGCA51 CATCCTGTTG ACCGCCCTGC TCAAATGCCT CTCCCTGCTG CCGCTTTCCT101 GTCTGCACAC GCTGGGAAAC CGGCTCGGAC ATCTGGCGTT TTACCTTTTA151 AAGGAAGACC GCGCGCGCAT CGTCGCCAAT ATGCGGCAGG CGGGTTTGAA201 CCCCGACCCC AAAACGGTCA AAGCCGTTTT TGCGGAAACG GCAAAAGGCG251 GTTTGGAACT TGCCCCCGCG TTTTTCAGAA AACCGGAAGA CATAGAAACA301 ATGTTCAAAG CGGTACACGG CTGGGAACAT GTGCAGCAGG CTTTGGACAA351 ACACGAAGGG CTGCTATTCA TCACGCCGCA CATCGGCAGC TACGATTTGG401 GCGGACGCTA CATCAGCCAG CAGCTTCCGT TCCCGCTGAC CGCCATGTAC451 AAACCGCCGA AAATCAAAGC GATAGACAAA ATCATGCAGG CGGGCAGGGT501 TCGCGGCAAA GGAAAAACCG CGCCTACCAG CATACAAGGG GTCAAACAAA551 TCATCAAAGC CCTGCGTTCG GGCGAAGCAA CCATCGTCCT GCCCGACCAC601 GTCCCCTCCC CTCAAGAAGG CGGGGAAGGC GTATGGGTGG ATTTCTTCGG651 CAAACCTGCC TATACCATGA CGCTGGCGGC AAAATTGGCA CACGTCAAAG701 GCGTGAAAAC CCTGTTTTTC TGCTGCGAAC GCCTGCCTGG CGGACAAGGT751 TTCGATTTGC ACATCCGCCC CGTCCAAGGG GAATTGAACG GCGACAAAGC801 CCATGATGCC GCCGTGTTCA ACCGCAATGC CGAATATTGG ATACGCCGTT851 TTCCGACGCA GTATCTGTTT ATGTACAACC GCTACAAAAT GCCGTAA它对应于氨基酸序列<SEQ ID 510;ORF138-1>:1 MFRLQFRLFP PLRTAMHILL TALLKCLSLL PLSCLHTLGN RLGHLAFYLL51 KEDRARIVAN MRQAGLNPDP KTVKAVFAET AKGGLELAPA FFRKPEDIET101 MFKAVHGWEH VQQALDKHEG LLFITPHIGS YDLGGRYISQ QLPFPLTAMY151 KPPKIKAIDK IMQAGRVRGK GKTAPTSIQG VKQIIKALRS GEATIVLPDH201 VPSPQEGGEG VWVDFFGKPA YTMTLAAKLA 
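HVKGVKTLFF CCERLPGGQG
251 FDLHIRPVQG ELNGDKAHDA AVFNRNAEYW IRRFPTQYLF MYNRYKMP*
Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF138 shows 99.2% identity over a 123 aa overlap with an ORF (ORF138a) from strain A of N. meningitidis: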
10 20 30 40 50 60orf138.pep MFRLQFRLFPPLRTAMHILLTALLKCLSLLPLSCLHTLGNRLGHLAFYLLKEDRARIVAX
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf138a MFRLQFRLFPPLRTAMHILLTALLKCLSLLPLSCLHTLGNRLGHLAFYLLKEDRARIVAN
10 20 30 40 50 60
70 80 90 100 110 120orf138.pep MRQAGLNPDPKTVKAVFAETAKGGLELAPAFFRKPEDIETMFKAVHGWEHVQQALDKHEG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf138a MRQAGLNPDPKTVKAVFAETAKGGLELAPAFFRKPEDIETMFKAVHGWEHVQQALDKHEG
70 80 90 100 110 120orf138.pep LLF
|||orf138a LLFITPHIGSYDLGGRYISQQLPFPLTAMYKPPKIKAIDKIMQAGRVRGKGKTAPTSIQG
130 140 150 160 170 180全长ORF138a核苷酸序列<SEQ ID 511>是:1 ATGTTTCGTT TACAATTCAG GCTGTTTCCC CCTTTGCGAA CCGCCATGCA51 CATCCTGTTG ACCGCCCTGC TCAAATGCCT CTCCCTGCTG CCGCTTTCCT101 GTCTGCACAC GCTGGGAAAC CGGCTCGGAC ATCTGGCGTT TTACCTTTTA151 AAGGAAGACC GCGCGCGCAT CGTCGCCAAT ATGCGTCAGG CAGGCATGAA201 TCCCGACCCC AAAACGGTCA AAGCCGTTTT TGCGGAAACG GCAAAAGGCG251 GTTTGGAACT TGCCCCCGCG TTTTTCAGAA AACCGGAAGA CATAGAAACA301 ATGTTCAAAG CGGTACACGG CTGGGAACAT GTGCAGCAGG CTTTGGACAA351 ACACGAAGGG CTGCTATTCA TCACGCCGCA CATCGGCAGC TACGATTTGG401 GCGGACGCTA CATCAGCCAG CAGCTTCCGT TCCCGCTGAC CGCCATGTAC451 AAACCGCCGA AAATCAAAGC GATAGACAAA ATCATGCAGG CGGGCAGGGT501 TCGCGGCAAA GGAAAAACCG CGCCTACCAG CATACAAGGG GTCAAACAAA551 TCATCAAAGC CCTGCGTTCG GGCGAAGCAA CCATCGTCCT GCCCGACCAC601 GTCCCCTCCC CTCAAGAAGG CGGGGAAGGC GTATGGGTGG ATTTCTTCGG651 CAAACCTGCC TATACCATGA CGCTGGCGGC AAAATTGGCA CACGTCAAAG701 GCGTGAAAAC CCTGTTTTTC TGCTGCGAAC GCCTGCCTGG CGGACAAGGT751 TTCGATTTGC ACATCCGCCC CGTCCAAGGG GAATTGAACG GCGACAAAGC801 CCATGATGCC GCCGTGTTCA ACCGCAATGC CGAATATTGG ATACGCCGTT851 TTCCGACGCA GTATCTGTTT ATGTACAACC GCTACAAAAT GCCGTAA它编码的蛋白质具有氨基酸序列<SEQ ID 512>:1 MFRLQFRLFP PLRTAMHILL TALLKCLSLL PLSCLHTLGN RLGHLAFYLL51 KEDRARIVAN MRQAGLNPDP KTVKAVFAET AKGGLELAPA FFRKPEDIET101 MFKAVHGWEH VQQALDKHEG LLFITPHIGS YDLGGRYISQ QLPFPLTAMY151 KPPKIKAIDK IMQAGRVRGK GKTAPTSIQG VKQIIKALRS GEATIVLPDH201 VPSPQEGGEG VWVDFFGKPA YTMTLAAKLA HVKGVKTLFF CCERLPGGQG251 FDLHIRPVQG ELNGDKAHDA AVFNRNAEYW IRRFPTQYLF MYNRYKMP*ORF138a和ORF138-1显示在298个氨基酸的重叠区内有99.7%的相同性:orf138a.pep MFRLQFRLFPPLRTAMHILLTALLKCLSLLPLSCLHTLGNRLGHLAFYLLKEDRARIVAN
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf138-1 MFRLQFRLFPPLRTAMHILLTALLKCLSLLPLSCLHTLGNRLGHLAFYLLKEDRARIVANorf138a.pep MRQAGMNPDPKTVKAVFAETAKGGLELAPAFFRKPEDIETMFKAVHGWEHVQQALDKHEG
|||||:||||||||||||||||||||||||||||||||||||||||||||||||||||||orf138-1 MRQAGLNPDPKTVKAVFAETAKGGLELAPAFFRKPEDIETMFKAVHGWEHVQQALDKHEGorf138a.pep LLFITPHIGSYDLGGRYISQQLPFPLTAMYKPPKIKAIDKIMQAGRVRGKGKTAPTSIQG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf138-1 LLFITPHIGSYDLGGRYISQQLPFPLTAMYKPPKIKAIDKIMQAGRVRGKGKTAPTSIQG
orf138a.pep VKQIIKALRSGEATIVLPDHVPSPQEGGEGVWVDFFGKPAYTMTLAAKLAHVKGVKTLFF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf138-1 VKQIIKALRSGEATIVLPDHVPSPQEGGEGVWVDFFGKPAYTMTLAAKLAHVKGVKTLFF
orf138a.pep CCERLPGGQGFDLHIRPVQGELNGDKAHDAAVFNRNAEYWIRRFPTQYLFMYNRYKMP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf138-1 CCERLPGGQGFDLHIRPVQGELNGDKAHDAAVFNRNAEYWIRRFPTQYLFMYNRYKMP
Homology with a predicted ORF from N. gonorrhoeae
ORF138 shows 94.3% identity over a 123 aa overlap with a predicted ORF (ORF138ng) from N. gonorrhoeae:
orf138.pep MFRLQFRLFPPLRTAMHILLTALLKCLSLLPLSCLHTLGNRLGHLAFYLLKEDRARIVAX 60
|||||||||||||||||||||||||||||| |||||||||||||||||||||||||||||
orf138ng MFRLQFRLFPPLRTAMHILLTALLKCLSLLSLSCLHTLGNRLGHLAFYLLKEDRARIVAN 60
orf138.pep MRQAGLNPDPKTVKAVFAETAKGGLELAPAFFRKPEDIETMFKAVHGWEHVQQALDKHEG 120
||||||||| :|||||||||||||||||||||:|||||||||||||||||||||||| ||
orf138ng MRQAGLNPDTQTVKAVFAETAKCGLELAPAFFKKPEDIETMFKAVHGWEHVQQALDKGEG 120
orf138.pep LLF 123
|||
orf138ng LLFITPHIGSYDLGGRYISQQLPFHLTAMYKPPKIKAIDKIMQAGRVRGKGKTAPTGIQG 180
全长ORF138ng核苷酸序列<SEQ ID 513>是:1 ATGTTTCGTT TACAATTCAG GCTGTTTCCC CCTTTGCGAA CCGCCATGCA51 CATCCTGTTG ACCGCCCTGC TCAAATGCCT CTCCCTGCTG TCGCTTTCCT101 GTCTGCACAC GCTGGGAAAC CGGCTCGGAC ATCTGGCGTT TTACCTTTTA151 AAGGAAGACC GCGCGCGCAT CGTCGCCAAT ATGCGGCAGG CGGGTTTGAA201 CCCCGACACG CAGACGGTCA AAGCCGTTTT TGCGGAAACG GCAAAATGCG251 GTTTGGAACT TGCCCCCGCG TTTTTCAAAA AACCGGAAGA CATCGAAACA301 ATGTTCAAAG CGGTACACGG CTGGGAACAC GTGCAGCAGG CTTTGGACAA351 GGGCGAAGGG CTGCTGTTCA TCACGCCGCA CATCGGCAGC TACGATTTGG401 GCGGACGCTA CATCAGCCAG CAGCTTCCGT TCCACCTGAC CGCCATGTAC451 AAGCCGCCGA AAATCAAAGC GATAGACAAA ATCATGCAGG CGGGCAGGGT501 GCGCGGCAAA GGCAAAACcg cgcccaccgg catACAAGGG GTCAAACAAA551 tcatcaAGGC CCTGCGCGCG GGCGAGGCAA CCAtcATCCT GCCCGACCAC601 GTCCCTTCTC CGCAGGAagg cggCGGCGTG TGGGCGGATT TTTTCGGCAA651 ACCTGCATAc acCATGACAC TGGCGGCAAA ATTGGCACAC GTCAAAGGCG701 TGAAAACCCT GTTTTTCTGC TGCGAACGCC TGCCCGACGG ACAAGGCTTC751 GTGTTGCACA TCCGCCCCGT CCAAGGGGAA TTGAACGGCA ACAAAGCCCA801 CGATGCCGCC GTGTTCAACC GCAATACCGA ATATTGGATA CGCCGTTTTC851 CGACGCAGTA TCTGTTTATG TACAACCGCT ATAAAACGCC GTAA它编码的蛋白质具有氨基酸序列<SEQ ID 514>:1 MFRLQFRLFP PLRTAMHILL TALLKCLSLL SLSCLHTLGN RLGHLAFYLL51 KEDRARIVAN MRQAGLNPDT QTVKAVFAET AKCGLELAPA FFKKPEDIET101 MFKAVHGWEH VQQALDKGEG LLFITPHIGS YDLGGRYISQ QLPFHLTAMY151 KPPKIKAIDK IMQAGRVRGK GKTAPTGIQG VKQIIKALRA GEATIILPDH201 VPSPQEGGGV WADFFGKPAY TMTLAAKLAH VKGVKTLFFC CERLPDCQGF251 VLHIRPVQGE LNGNKAHDAA VFNRNTEYWI RRFPTQYLFM YNRYKTF*ORF138ng和ORF138-1在299个氨基酸的重叠区内显示出有94.3%的相同性:orf138-1.pep MFRLQFRLFPPLRTAMHILLTALLKCLSLLPLSCLHTLGNRLGHLAFYLLKEDRARIVAN
|||||||||||||||||||||||||||||| |||||||||||||||||||||||||||||orf138ng MFRLQFRLFPPLRTAMHILLTALLKCLSLLSLSCLHTLGNRLGHLAFYLLKEDRARIVANorf138-1.pep MRQAGLNPDPKTVKAVFAETAKGGLELAPAFFRKPEDIETMFKAVHGWEHVQQALDKHEG
||||||||| :||||||||||| ||||||||:|||||||||||||||||||||||| ||orf138ng MRQAGLNPDTQTVKAVFAETAKCGLELAPAFFKKPEDIETMFKAVHGWEHVQQALDKGEGorf138-1.pep LLFITPHIGSYDLGGRYISQQLPFPLTAMYKPPKIKAIDKIMQAGRVRGKGKTAPTSIQG
|||||||||||||||||||||||| |||||||||||||||||||||||||||||||:|||orf138ng LLFITPHIGSYDLGGRYISQQLPFHLTAMYKPPKIKAIDKIMQAGRVRGKGKTAPTGIQGorf138-1.pep VKQIIKALRSGEATIVLPDHVPSPQEGGEGVWVDFFGKPAYTMTLAAKLAHVKGVKTLFF
|||||||||:|||||:|||||||||||| |||:|||||||||||||||||||||||||||orf138ng VKQIIKALRAGEATIILPDHVPSPQEGG-GVWADFFGKPAYTMTLAAKLAHVKGVKTLFForf138-1.pep CCERLPGGQGFDLHIRPVQGELNGDKAHDAAVFNRNAEYWIRRFPTQYLFMYNRYKMP
|||||| |||| ||||||||||||:|||||||||||:||||||||||||||||||| |orf138ng CCERLPDGQGFVLHIRPVQGELNGNKAHDAAVFNRNTEYWIRRFPTQYLFMYNRYKTP另外,ORF138ng与荧光假单胞菌的htrB蛋白同源:gnl|PID|e334283(Y14568)htrB[荧光假单胞菌]长度=253评分=80.8位(196),估计值=9e-15相同性=49/151(32%),阳性=79/151(51%),空隙=6/151(3%)询问:101 MFKAVHGWEHVQQALDKGEGLLFITPHIGSYD-LGGRYISQQLPFHLTAMYKPPKIKAID 159
+ + V G E +++AL G+G++ IT H+G+++ L Y SQ P Y+PPK+KA+D目标:94 LVREVEGLEVLKEALASGKGVVGITSHLGNWEVLNHFYCSQCKPI---IFYRPPKLKAVD 150询问:160 KIMQAGRVRGKGKTAPTGIQGVKQIIKALRAGEATIILPDHVPSPQEGGGVWADFFGKPA 219
++++ RV+ K A + +G+ +IK +R G I D P P E G++ FF A目标:151 ELLRKQRVQLGNKVAASTKEGILSVIKEVRKGGQVGIPAD--PEPAESAGIFVPFFATQA 208询问:220 YTMTLAAKLAHVKGVKTLFFCCERLPDGQGF 250
T + +F RLPDG G+目标:209 LTSKFVPNMLAGGKAVGVFLHALRLPDGSGY 239
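Based on this analysis (including the presence of a putative transmembrane domain in the gonococcal protein), it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
ORF138-1 (57kDa) was cloned in the pGex vector and expressed in E. coli, as described above. The products of protein expression and purification were analyzed by SDS-PAGE. Figure 14A shows the results of affinity purification of the GST-fusion protein. Purified GST-fusion protein was used to immunise mice, and the mouse sera were used for ELISA (positive result) and FACS analysis (Figure 14B). These experiments confirm that ORF138-1 is a surface-exposed protein and a useful immunogen.
Example 69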
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 515>:1 ..GCGTGGTCGG CCGGCGAATC GTGGCGTGTG TTAATGGAAA GTGAAACGTG551 GCATGCGGTG TGGAATACTT TGCGCTTCTC GGCGGCGGCG GTGTATGCGG101 CAGCGGTTTT GGGTGTGGTG TATGCGGCGC CGGCGCGGCG GTCGGCGTGG151 ATGCGCGGGC TGATGTTTTA GCCGTTTATG GTGTCGCCGG TTTGTGTTTC201 GGCGGGCGTG CTGCTGCTTT ATCCGCAGTG GACGGCTTCG TTGCCGTTGC251 TGCTGGCGAT GTATGCGCTG CTGGCGTATC CGTTTGTGGC AAAAGATGTT301 TTATCAGCCT GGGATGCACT GCCGCCGGAT TACGGCAGGG CGGCGGCGGG351 TTTGGGTGCA AACGGCTTTC AGACGGCATG CCGCATCACG TTCCCCCTCT401 TGAAACCGGC GTTGCGGCGC GGTCTGACTT TGGCGGCGGC AACCTGCGTG451 GGCGAATTTG CGGCGACATT GTTTCTGTCG CGTCCGGAAT GGCAGACGCT501 GACGACTTTG ATTTATGCCT ATTTGGGACG CGCGGGTGAG GATAATTACG551 CGCGGGCGAT GGTGCTG..它对应于氨基酸序列<SEQ ID 516;ORF139>:1 ..AWSAGESWRV LMESETWHAV WNTLRFSAAA VYAAAVLGVV YAAPARRSAW51 MRGLMFXPFM VSPVCVSAGV LLLYPQWTAS LPLLLAMYAL LAYPFVAKDV101 LSAWDALPPD YGRAAAGLGA NGFQTACRIT FPLLKPALRR GLTLAAATCV151 GEFAATLFLS RPEWQTLTTL IYAYLGRAGE DNYARAMVL..进一步的工作揭示了完整的核苷酸序列<SEQ ID 517>:1 ATGGATGGAC GGCGTTGGGT GGTATGGGGT GCTTTTGCCC TGCTGCCTTC51 GGCTTTTTTG GCGGTAATGG TCGTTGCGCC TTTGTGGGCG GTGGCGGCGT101 ATGACGGTTT GGCGTGGCGC GCGGTGCTGT CGGATGCCTA TATGCTCAAA151 CGTTTGGCGT GGACGGTATT TCAGGCAGCG GCAACCTGTG TGCTGGTGCT201 GCCTTTGGGC GTGCCTGTCG CGTGGGTGCT GGCGCGGCTG GCGTTTCCGG251 GGCGGGCTTT GGTGCTGCGC CTGCTGATGC TGCCTTTTGT GATGCCCACG301 TTGGTGGCGG GCGTGGGCGT GCTGGCCCTG TTCGGGGCGG ACGGGCTGTT351 GTGGCGCGGC AGGCAGGATA CGCCGTATCT GTTGTTGTAC GGCAATGTGT401 TTTTCAACCT TCCTGTGTTG GTCAGGGCGG CGTATCAGGG GTTTGTGCAA451 GTGCCTGCGG CACGGGTTCA GACGGCACGG ACGTTGGGCG CGGGGGCGTG501 GCGGCGGTTT TGGGACATTG AAATGCCCGT TTTGCGCCCG TGGCTTGCCG551 GCGGCGTGTG CCTTGTCTTT CTGTATTGTT TTTCCGGGTT CGGGCTGGCG601 CTGCTGCTGG GCGGCAGCCG TTATGCCACG GTCGAAGTGG AAATTTACCA651 GTTGGTCATG TTCGAACTCG ATATGGCGGT TGCTTCGGTG CTGGTGTGGC701 TGGTGTTGGG GGTAACGGCG GCGGCAGGGT TGCTGTATGC GTGGTTCGGC751 AGGCGCGCGG TTTCGGATAA GGCGGTTTCC CCTGTGATGC CGTCGCCGCC801 GCAGTCGGTC GGGGAATATG TGCTGCTGGC GTTTGCGGCG GCGGTGTTGT851 CTGTGTGCTG CCTGTTTCCT TTGTTGGCAA 
TTGTTGTGAA AGCGTGGTCG901 GCCGGCGAAT CGTGGCGTGT GTTAATGGAA AGTGAAACGT GGCAGGCGGT951 GTGGAATACT TTGCGCTTCT CGGCGGCGGC GGTGTATGCG GCGGCGGTTT001 TGGGTGTGGT GTATGCGGCG GCGGCGCGGC GGTCGGCGTG GATGCGCGGG1051 CTGATGTTTT TGCCGTTTAT GGTGTCGCCG GTTTGTGTTT CGGCGGGCGT1101 GCTGCTGCTT TATCCGCAGT GGACGGCTTC GTTGCCGTTG CTGCTGGCGA1151 TGTATGCGCT GCTGGCGTAT CCGTTTGTGG CAAAAGATGT TTTATCAGCC1201 TGGGATGCAC TGCCGCCGGA TTACGGCAGG GCGGCGGCGG GTTTGGGTGC1251 AAACGGCTTT CAGACGGCAT GCCGCATCAC GTTCCCCCTC TTGAAACCGG1301 CGTTGCGGCG CGGTCTGACT TTGGCGGCGG CAACCTGCGT GGGCGAATTT1351 GCGGCGACAT TGTTTCTGTC GCGTCCGGAA TGGCAGACGC TGACGACTTT1401 GATTTATGCC TATTTGGGAC GCGCGGGTGA GGATAATTAC GCGCGGGCGA1451 TGGTGCTGAC ATTGCTGTTG GCGGCGTTCG CGCTGGGTAT TTTCCTGCTG1501 TTGGACGGCG GCGAAGGCGG AAAACAGACG GAAACGTTAT AA它对应于氨基酸序列<SEQ ID 518;ORF139-1>:1 MDGRRWVVWG AFALLPSAFL AVMVVAPLWA VAAYDGLAWR AVLSDAYMLK51 RLAWTVFQAA ATCVLVLPLG VPVAWVLARL AFPGRALVLR LLMLPFVMPT101 LVAGVGVLAL FGADGLLWRG RQDTPYLLLY GNVFFNLPVL VRAAYQGFVQ151 VPAARLQTAR TLGAGAWRRF WDIEMPVLRP WLAGGVCLVF LYCFSGFGLA201 LLLGGSRYAT VEVEIYQLVM FELDMAVASV LVWLVLGVTA AAGLLYAWFG251 RRAVSDKAVS PVMPSPPQSV GEYVLLAFAA AVLSVCCLFP LLAIVVKAWS301 AGESWRVLME SETWQAVWNT LRFSAAAVYA AAVLGVVYAA AARRSAWMRG351 LMFLPFMVSP VCVSAGVLLL YPQWTASLPL LLAMYALLAY PFVAKDVLSA401 WDALPPDYGR AAAGLGANGF QTACRITFPL LKFALRRGLT LAAATCVGEF451 AATLFLSRPE WQTLTTLIYA YLGRAGEDNY ARAMVLTLLL AAFALGIFLL501 LDGGEGGKQT ETL*
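Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF139 shows 94.7% identity over a 189 aa overlap with an ORF (ORF139a) from strain A of N. meningitidis: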
10 20 30
orf139.pep AWSAGESWRVLMESETWHAVWNTLRFSAAA
|||||||||||||||||:||||| ||||||
orf139a QSVGEYVLLAFAAAVXSVCCLFXLLAIVVKAWSAGESWRVLMESETWQAVWNTXRFSAAA
270 280 290 300 310 320
40 50 60 70 80 90
orf139.pep VYAAAVLGVVYAAPARRSAWMRGLMFXPFMVSPVCVSAGVLLLYPQWTASLPLLLAMYAL
||||||||||||| |||||||||||| |||||||||||||||| ||||||||||||||||
orf139a VYAAAVLGVVYAAAARRSAWMRGLMFLPFMVSPVCVSAGVLLLXPQWTASLPLLLAMYAL
330 340 350 360 370 380
100 110 120 130 140 150
orf139.pep LAYPFVAKDVLSAWDALPPDYGRAAAGLGANGFQTACRITFPLLKPALRRGLTLAAATCV
||||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||
orf139a LAYPFVAKDVLSAXDALPPDYGRAAAGLGANGFQTACRITFPLLKPALRRGLTLAAATCV
390 400 410 420 430 440
160 170 180 189
orf139.pep GEFAATLFLSRPEWQTLTTLIYAYLGRAGEDNYARAMVL
|||||||| || |||||||||||| |||| |||||||||
orf139a GEFAATLFXSRXEWQTLTTLIYAYXGRAGXDNYARAMVLTLLLAAFALGXFLLLDGGEGG
450 460 470 480 490 500全长ORF139a核苷酸序列<SEQ ID 519>是:1 ATGGATGGAC GGCGTTGGGC GGTATGGGGT GCTTTTGCCC TGCTGCCTTC51 GGCTTTTTTG GCGGCAATGG TCGTTGCGCC TTTGTGGGCG GTGGCGGCGT101 ATGACGGTTT GGCGTGGCGC GCGGTGCTGT CGGATGCCTA TATGCTCAAA151 CGTTTGGCGT GGACGGTATT TCAGGCAGCG GCAACCTGTG TGCTGGTGCT201 GCCTTTGGGC GTGCCTGTCG CGTGGGTGCT GGCGCGGCTG GCGTTTCCGG251 GGCGGGCTTT GGTGCTGCGC CTGCTGATGC TGCCTTTTGT GATGCCCACG301 TTGGTGGCGG GCGTGGGCGT GCTGGCTCTG TTCGGGGCGG ACGGCCTGTN351 GTGGCGCGGC TGGCAGGATA CGCCGTATCT GTTGTTGTAC GGCAATGTGT401 TTTTTNACCT TCCTGTGTTG GTCAGGGCGG CATATCAGGG GTTTGTGCAA451 GTGCCTGCGG CACGGCTTCA GACGGCACNG ACATTGGGCG CGGGGGCGTG501 GCGGCGGTTT TGGGACATTG AAATGCCCGT TTTGCGCCCG TGGCTTGCCG551 GCGGCGTGTG CCTTGTCTTC CTGTATTGTT TTTCGGGGTT CGGGCTGGCA601 TTGCTGCTGG GCGGCAGCCG TTATGCCACG GTCGAAGTGG AAATTTACCA651 GTTGGTCATG TTCGAACTCG ATATGGCGGT TGCTTCGGTG CTNGTGTGGC701 TGGTGTNGGG GGTAACNGCG GCGGCAGGGT TGCTGTATGC GTGGTTCGGC751 AGGCGCGCGG TTTCGGATAA GGCNGTTTCC CCTGTGATGC CGTCGCCGCC801 GCAGTCGGTC GGGGAATATG TGCTNCTGGC GTTTGCGGCG GCGGTGTNGT851 CTGTGTGCTG CCTGTTTCNT TTGTTGGCAA TTGTTGTGAA AGCGTGGTCG90I GCCGGCGAAT CGTGGCGTGT GTTAATGGAA AGTGAAACGT GGCAGGCGGT951 GTGGAATACT NTGCGCTTCT CGGCGGCGGC GGTGTATGCG GCGGCGGTTT1001 TGGGTGTGGT GTATGCGGCG GCGGCGCGGC GGTCGGCGTG GATGCGCGGG1051 CTGATGTTTT TGCCGTTTAT GGTGTCGCCG GTTTGTGTTT CGGCGGGCGT1101 GCTGCTGCTT NATCCGCAGT GGACGGCTTC GTTGCCGCTG CTGCTGGCGA1151 TGTATGCGCT GCTGGCGTAT CCGTTTGTGG CAAAAGATGT TTTATCAGCC1201 TGNGATGCAC TGCCGCCGGA TTACGGCAGG GCGGCGGCGG GTTTGGGTGC1251 AAACGGCTTT CAGACGGCAT GCCGCATCAC GTTCCCCCTC TTGAAACCGG1301 CGTTGCGGCG CGGTCTGACT TTGGCGGCGG CAACCTGCGT GGGCGAATTT1351 GCGGCAACCT TGTTCNTGTC GCGTCNCGAG TGGCAGACGC TGACGACTTT1401 GATTTATGCC TATNTGGGAC GCGCGGGTGA NGATAATTAC GCGCGGGCGA1451 TGGTGCTGAC ATTGCTGTTG GCGGCGTTCG CGCTGGGTAT NTTCCTGCTG1501 TTGGACGGCG GCGAAGGCGG AAAACGGACG GAAACGTTAT AA它编码的蛋白质具有氨基酸序列<SEQ ID 520>:1 MDGRRWAVWG AFALLPSAFL AAMVVAPLWA VAAYDGLAWR AVLSDAYMLK51 RLAWTVFQAA ATCVLVLPLG VPVAWVLARL AFPGRALVLR LLMLPFVMPT101 
LVAGVGVLAL FGADGLXWRG WQDTPYLLLY GNVFFXLPVL VRAAYQGFVQ151 VPAARLQTAX TLGAGAWRRF WDIEMPVLRP WLAGGVCLVF LYCFSGFGLA201 LLLGGSRYAT VEVEIYQLVM FELDMAVASV LVWLVXGVTA AAGLLYAWFG251 RRAVSDKAVS PVMPSPPQSV GEYVLLAFAA AVXSVCCLFX LLAIVVKAWS301 AGESWRVLME SETWQAVWNT XRFSAAAVYA AAVLGVVYAA AARRSAWMRG351 LMFLPFMVSP VCVSAGVLLL XPQWTASLPL LLAMYALLAY PFVAKDVLSA401 XDALPPDYGR AAAGLGANGF QTACRITFPL LKPALRRGLT LAAATCVGEF451 AATLFXSRXE WQTLTTLIYA YXGRAGXDNY ARAMVLTLLL AAFALGXFLL501 LDGGEGGKRT ETL*ORF139a和ORF139-1在514个氨基酸的重叠区内显示出有96.5%的同源性:orf139a.pep MDGRRWAVWGAFALLPSAFLAAMVVAPLWAVAAYDGLAWRAVLSDAYMLKRLAWTVFQAA
||||||:||||||||||||||:||||||||||||||||||||||||||||||||||||||orf139-1 MDGRRWVVWGAFALLPSAFLAVMVVAPLWAVAAYDGLAWRAVLSDAYMLKRLAWTVFQAAorf139a.pep ATCVLVLPLGVPVAWVLARLAFPGRALVLRLLMLPFVMPTLVAGVGVLALFGADGLXWRG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf139-1 ATCVLVLPLGVPVAWVLARLAFPGRALVLRLLMLPFVMPTLVAGVGVLALFGADGLLWRGorf139a.pep WQDTPYLLLYGNVFFXLPVLVRAAYQGFVQVPAARLQTAXTLGAGAWRRFWDIEMPVLRP
||||||||||||||| ||||||||||||||||||||||| ||||||||||||||||||||orf139-1 RQDTPYLLLYGNVFFNLPVLVRAAYQGFVQVPAARLQTARTLGAGAWRRFWDIEMPVLRPorf139a.pep WLAGGVCLVFLYCFSGFGLALLLGGSRYATVEVEIYQLVMFELDMAVASVLVWLVXGVTA
||||||||||||||||||||||||||||||||||||||||||||||||||||||| ||||orf139-1 WLAGGVCLVFLYCFSGFGLALLLGGSRYATVEVEIYQLVMFELDMAVASVLVWLVLGVTAorf139a.pep AAGLLYAWFGRRAVSDKAVSPVMPSPPQSVGEYVLLAFAAAVXSVCCLFXLLAIVVKAWS
|||||||||||||||||||||||||||||||||||||||||| |||||| ||||||||||orf139-1 AAGLLYAWFGRRAVSDKAVSPVMPSPPQSVGEYVLLAFAAAVLSVCCLFPLLAIVVKAWSorf139a.pep AGESWRVLMESETWQAVWNTXRFSAAAVYAAAVLGVVYAAAARRSAWMRGLMFLPFMVSP
|||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||orf139-1 AGESWRVLMESETWQAVWNTLRFSAAAVYAAAVLGVVYAAAARRSAWMRGLMFLPFMVSPorf139a.pep VCVSAGVLLLXPQWTASLPLLLAMYALLAYPFVAKDVLSAXDALPPDYGRAAAGLGANGF
|||||||||| ||||||||||||||||||||||||||||| |||||||||||||||||||orf139-1 VCVSAGVLLLYPQWTASLPLLLAMYALLAYPFVAKDVLSAWDALPPDYGRAAAGLGANGForf139a.pep QTACRITFPLLKPALRRGLTLAAATCVGEFAATLFXSRXEWQTLTTLIYAYXGRAGXDNY
||||||||||||||||||||||||||||||||||| || |||||||||||| |||| |||orf139-1 QTACRITFPLLKPALRRGLTLAAATCVGEFAATLFLSRPEWQTLTTLIYAYLGRAGEDNYorf139a.pep ARAMVLTLLLAAFALGXFLLLDGGEGGKRTETLX
|||||||||||||||| |||||||||||:|||||orf139-1 ARAMVLTLLLAAFALGIFLLLDGGEGGKQTETLX
Homology with a predicted ORF from N. gonorrhoeae
ORF139和淋病奈瑟球菌的预计ORF(ORF139ng)在189个氨基酸的重叠区内显示出有95.2%的相同性:orf139.pep AWSAGESWRVLMESETWHAVWNTLRFSAAA 30
||||||| |||||||||:||||||||||||orf139ng QSVGEYVLLAFSVAVLSVCCLFPLSAIVVKAWSAGESRRVLMESETWQAVWNTLRFSAAA 327orf139.pep VYAAAVLGVVYAAPARRSAWMRGLMFXPFMVSPVCVSAGVLLLYPQWTASLPLLLAMYAL 90
|:||||||||||| ||| :|||||:| |||||||||||||||||| ||||||||||||||orf139ng VFAAAVLGVVYAAAARRLVWMRGLVFLPFMVSPVCVSAGVLLLYPGWTASLPLLLAMYAL 387
orf139.pep LAYPFVAKDVLSAWDALPPDYGRAAAGLGANGFQTACRITFPLLKPALRRGLTLAAATCV 150
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf139ng LAYPFVAKDVLSAWDALPPDYGRAAAGLGANGFQTACRITFPLLKPALRRGLTLAAATCV 447
orf139.pep GEFAATLFLSRPEWQTLTTLIYAYLGRAGEDNYARAMVL 189
|||||||||||||||||||||||||||||||||||||||
orf139ng GEFAATLFLSRPEWQTLTTLIYAYLGRAGEDNYARAMVLTLLLSAFAVCIFLLLDNGEGG 507
预计全长ORF139ng核苷酸序列<SEQ ID 521>编码的蛋白质具有氨基酸序列<SEQ ID 522>:1 MDGRCWAVRG AFSLLPSAFL AVMVVAPLWA VAAYDGLAWR AVLSDAYMLK51 RLAWTVFQAA ATCVLVLPLG VPVAWVLARL AFPGRALVLR LLMLPFVMPT101 LVAGVGVLAL FGADGLLWRG RQDTPYLLLY GNVFFNLPVL VRAAYQGFAQ151 VPAARLQTAR TLGAGAWRPF WIIEMPVLRP WLAGGVCLVF LYCFSGFGLA201 LLLCGSRYAT VEVEIYQLVM FELDMAGASA LVWLVLGVTA AAGLLYAWFG251 RRAVSDKAVS PVMPSPPQSV GEYVLLAFSV AVLSVCCLFP LSAIVVKAWS301 AGESRRVLME SETWQAVWNT LRFSAAAVFA AAVLGVVYAA AARRLVWMRG351 LVFLPFMVSP VCVSAGVLLL YPGWTASLPL LLAMYALLAY PFVAKDVLSA401 WDALPPDYGR AAAGLGANGF QTACRITFPL LKPALRRGLT LAAATCVGEF451 AATLFLSRPE WQTLTTLIYA YLGRAGEDNY ARAMVLTLLL SAFAVCIFLL501 LDNGEGGKRT ETL*进一步的工作揭示了一个淋球菌变体DNA序列<SEQ ID 523>:1 ATGGATGGAC GGTGTTGGGC GGTACGGGGT GCTTTTTCCC TGCTGCCTTC51 GGCTTTTTTG GCGGTAATGG TCGTTGCGCC TTTGTGGGCG GTGGCGGCGT101 ATGACGGTTT GGCGTGGCGC GCGGTGCTGT CGGATGCCTA TATGCTCAAA151 CGTTTGGCGT GGACGGTGTT TCAGGCGGCG GCAACCTGTG TGCTGGTGCT201 GCCTTTGGGC GTGCCTGTCG CGTGGGTGCT GGCGCGGCTG GCGTTCCCGG251 GGCGGGCTTT GGTGCTGCGC CTGCTGATGC TGCCGTTTGT GATGCCCACG301 CTGGTGGCGG GCGTGGGCGT GCTGGCTCTG TTCGGGGCGG ACGGGCTGTT351 GTGGCGCGGC CGGCAGGATA CGCCGTATCT GTTGTTGTAC GGCAATGTGT401 TTTTCAACCT GCCCGTGTTG GTCAGGGCGG CGTATCAGGG GTTTGCTCAA451 GTGCCTGCGG CACGGCTTCA GACGGCACGG ACGTTGGGCG CGGGGGCGTG501 GCGGCGGTTT TGGGACATTG AAATGCCCGT TTTGCGCCCG TGGCTTGCCG551 GCGGCGTGTG CCTTGTCTTC CTGTATTGTT TTTCGGGGTT CGGGCTGGCA601 TTGCTGTTGG GCGGCAGCCG TTATGCCACG GTCGAAGTGG AAATTTACCA651 GTTGGTTATG TTCGAACTCG ATATGGCGGG GGCTTCGGCG CTGGTGTGGC701 TGGTGTTGGG GGTAACGGCG GCGGCAGGGT TGCTGTATGC GTGGTTCGGC751 AGGCGCGCGG TTTCGGATAA GGCGGTTTCC CCCGTGATGC CGTCGCCGCC801 GCAATCGGTG GGGGAATATG TATTGCTGGC ATTTTCGGTG GCGGTGTTGT851 CCGTGTGCTG CCTGTTTCCT TTGTCGGCAA TTGTTGTGAA AGCGTGGTCG901 GCCGGCGAAT CGCGGCGTGT GTTAATGGAA AGTGAAACGT GGCAGGCAGT951 GTGGAATACt ttGCGCTTTT CGGCGGCGGC GGTGTTTGCG GCGGCGGTTT1001 TGGGTGTGGT GTATGCGGCG GCGGCGCGGC GGCTGGTGTG GATGCGCGGA1051 CTGGTGTTTT TACCGTTTAT GGTGTCGCCG GTTTGTGTTT CGGCGGGCGT1101 GCTGCTGCTT TATCCGGGGT GGACGGCTTC 
GTTACCGCTG CTGCTGGCGA1151 TGTATGCGCT GCTGGCGTAT CCGTTTGTGG CAAAAGATGT TTTATCGGCC1201 TGGGATGCAC TGCCGCCGGA TTACGGCAGG GCGGCGGCAG GTTTGGGCGC1251 AAACGGCTTT CAGACGGCAT GCCGTATCAC GTTCCCCCTC TTGAAACCGG1301 CGTTGCGGCG CGGTCTGACT TTGGCGGCGG CGACGTGTGT GGGCGAATTT1351 GCGGCAACCT TGTTCCTGTC GCGTCCGGAA TGGCAGACGT TGACGACTTT1401 GATTTATGCC TATTTGGGGC GTGCGGGTGA GGACAATTAT GCGCGGGCAA1451 TGGTGTTGAC ATTGCTGTTG TCGGCATTTG CGGTGTGCAT TTTCCTGCTG1501 TTGGACAACG GCGAAGGCGg aaaACGGACG GAAACGTTAT AA它对应于氨基酸序列<SEQ ID 524;ORF139ng-1>:1 MDGRCWAVRG AFSLLPSAFL AVMVVAPLWA VAAYDGLAWR AVLSDAYMLK51 RLAWTVFQAA ATCVLVLPLG VPVAWVLARL AFPGRALVLR LLMLPFVMPT101 LVAGVGVLAL FGADGLLWRG RQDTPYLLLY GNVFFNLPVL VRAAYQGFAQ151 VPAARLQTAR TLGAGAWRRF WDIEMPVLRP WLAGGVCLVF LYCFSGFGLA201 LLLGGSRYAT VEVEIYQLVM FELDMAGASA LVWLVLGVTA AAGLLYAWFG251 RRAVSDKAVS PVMPSPPQSV GEYVLLAFSV AVLSVCCLFP LSAIVVKAWS301 AGESRRVLME SETWQAVWNT LRFSAAAVFA AAVLGVVYAA AARRLVWMRG351 LVFLPFMVSP VCVSAGVLLL YPGWTASLPL LLAMYALLAY PFVAKDVLSA401 WDALPPDYGR AAAGLGANGF QTACRITFPL LKPALRRGLT LAAATCVGEF451 AATLFLSRPE WQTLTTLIYA YLGRAGEDNY ARAMVLTLLL SAFAVCIFLL501 LDNGEGGKRT ETL*ORF139ng-1和ORF139-1在513个氨基酸的重叠区内显示出有95.9%的相同性:orf139ng MDGRCWAVRGAFSLLPSAFLAVMVVAPLWAVAAYDGLAWRAVLSDAYMLKRLAWTVFQAA
|||| |:| |||:|||||||||||||||||||||||||||||||||||||||||||||||orf139-1 MDGRRWVVWGAFALLPSAFLAVMVVAPLWAVAAYDGLAWRAVLSDAYMLKRLAWTVFQAAorf139ng ATCVLVLPLGVPVAWVLARLAFPGRALVLRLLMLPFVMPTLVAGVGVLALFGADGLLWRG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf139-1 ATCVLVLPLGVPVAWVLARLAFPGRALVLRLLMLPFVMPTLVAGVGVLALFGADGLLWRGorf139ng RQDTPYLLLYGNVFFNLPVLVRAAYQGFAQVPAARLQTARTLGAGAWRRFWDIEMPVLRP
||||||||||||||||||||||||||||:|||||||||||||||||||||||||||||||orf139-1 RQDTPYLLLYGNVFFNLPVLVRAAYQGFVQVPAARLQTARTLGAGAWRRFWDIEMPVLRPorf139ng WLAGGVCLVFLYCFSGFGLALLLGGSRYATVEVEIYQLVMFELDMAGASALVWLVLGVTA
|||||||||||||||||||||||||||||||||||||||||||||| ||:||||||||||orf139-1 WLAGGVCLVFLYCFSGFGLALLLGGSRYATVEVEIYQLVMFELDMAVASVLVWLVLGVTAorf139ng AAGLLYAWFGRRAVSDKAVSPVMPSPPQSVGEYVLLAFSVAVLSVCCLFPLSAIVVKAWS
||||||||||||||||||||||||||||::||||||||||||| ||||||||||||||||orf139-1 AAGLLYAWFGRRAVSDKAVSPVMPSPPQSVGEYVLLAFAAAVLSVCCLFPLLAIVVKAWSorf139ng AGESRRVLMESETWQAVWNTLRFSAAAVFAAAVLGVVYAAAARRLVWMRGLVFLPFMVSP
|||| |||||||||||||||||||||||:||||||||||||||| :|||||:||||||||orf139 AGESWRVLMESETWQAVWNTLRFSAAAVYAAAVLGVVYAAAARRSAWMRGLMFLPFMVSPorf139ng VCVSAGVLLLYPGWTASLPLLLAMYALLAYPFVAKDVLSAWDALPPDYGRAAAGLGANGF
|||||||||||| |||||||||||||||||||||||||||||||||||||||||||||||orf139-1 VCVSAGVLLLYPQWTASLPLLLAMYALLAYPFVAKDVLSAWDALPPDYGRAAAGLGANGForf139ng QTACRITFPLLKPALRRGLTLAAATCVGEFAATLFLSRPEWQTLTTLIYAYLGRAGEDNY
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf139-1 QTACRITFPLLKPALRRGLTLAAATCVGEFAATLFLSRPEWQTLTTLIYAYLGRAGEDNYorf139ng ARAMVLTLLLSAFAVCIFLLLDNGEGGKRTETL
||||||||||:|||: ||||||:|:|||:||||orf139-1 ARAMVLTLLLAAFALGIFLLLDGGEGGKQTETL
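Based on the presence of a predicted binding-protein-dependent transport system inner membrane component signature (underlined) in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 70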
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 525>:1 ATGGACGGCT GGACACAGAC GCTGTCCGCG CAAACCCTGT TGGGCATTTC51 GGCGGCGGCA ATCATCCTCA TTCTGATTTT AATCGTCAGA TTCCGCATCC101 ACGCGCTGCT GACACTGGTC ATCGTCAGCC TGCTGACGGC TTTGGCAACC151 GGTTTGCCCA CAGGCAGCAT TGTCAAAGAC ATACTGGTCA AAAACTTCGG201 CGGCACGCTC GGCGGCGTGG CGCTTCTGGT CGGCCTGGGC GCGATGCTCG251 AACGTTTGGT C...它对应于氨基酸序列<SEQ ID 526;ORF140>:1 MDGWTQTLSA QTLLGISAAA IILILILIVR FRIHALLTLV IVSLLTALAT51 GLPTGSIVKD ILVKNFGGTL GGVALLVGLG AMLERLV..进一步的工作揭示了其完整的核苷酸序列<SEQ ID 527>:1 ATGGACGGCT GGACACAGAC GCTGTCCGCG CAAACCCTGT TGGGCATTTC51 GGCGGCGGCA ATCATCCTCA TTCTGATTTT AATCGTCAAA TTCCGCATCC101 ACGCGCTGCT GACACTGGTC ATCGTCAGCC TGCTGACGGC TTTGGCAACC151 GGTTTGCCCA CAGGCAGCAT TGTCAACGAC ATACTGGTCA AAAACTTCGG201 CGGCACGCTC GGCGGCGTGG CGCTTCTGGT CGGCCTGGGC GCGATGCTCG251 GACGTTTGGT CGAAACATCC GGCGGCGCAC AGTCGCTGGC GGACGCGCTG301 ATCCGGATGT TCGGCGAAAA ACGCGCACCG TTCGCGCTGG GCGTTGCCTC351 GCTGATTTTC GGCTTCCCGA TTTTCTTCGA TGCCGGACTA ATCGTCATGC401 TGCCCATCGT GTTCGCCACC GCACGGCGCA TGAAACAGGA CGTACTGCCC451 TTCGCGCTTG CCTCCATCGG CGCATTTTCC GTCATGCACG TCTTCCTGCC501 GCCCCATCCG GGCCCGATTG CCGCTTCCGA ATTTTACGGC GCGAACATCG551 GCCAAGTTTT GATTTTGGGT CTGCCGACCG CCTTCATCAC ATGGTATTTC601 AGCGGCTATA TGCTCGGCAA AGTGTTGGGG CGCACCATCC ATGTTCCCGT551 TCCCGAACTG CTCAGCGGCG GCACGCAAGA CAACGACCTG CCGAAAGAAC701 CTGCCAAAGC AGGAACGGTC GTCGCCATCA TGCTGATTCC CATGCTGCTG751 ATTTTCCTGA ATACCGGCGT ATCGGCCCTC ATCAGCGAAA AACTCGTAAG801 TGCGGACGAA ACCTGGGTTC AGACGGCAAA AATAATCGGT TCGACACCGA851 TCGCCCTTCT GATTTCCGTA TTGGTCGCAC TGTTTGTCTT GGGACGCAAA901 CGCGGCGAAA GCGGCAGCGC GTTGGAAAAA ACCGTGGACG GCGCACTCGC951 CCCCGTCTGT TCCGTGATTC TGATTACCGG CGCGGGCGGT ATGTTCGGCG1001 GCGTTTTGCG CGCTTCCGGC ATCGGCAAGG CACTCGCCGA CAGCATGGCG1051 GATTTGGGCA TTCCCGTCCT TTTGGGCTGT TTCCTTGTCG CCTTGGCACT1101 GCGTATCGCG CAAGGTTCGG CAACCGTCGC CCTGACCACC GCCGCCGCGC1151 TGATGGCTCC TGCCGTTGCC GCCGCCGGCT TTACCGACTG GCAGCTCGCC1201 TGTATCGTAT TGGCAACGGC GGCAGGTTCG GTCGGTTGCA GCCACTTCAA1251 CGACTCCGGC TTCTGGCTGG TCGGCCGTCT 
CTTGGACATG GACGTACCGA1301 CCACGCTGAA AACCTGGACG GTCAACCAAA CCCTCATCGC ACTCATCGGC1351 TTTGCCTTGT CCGCACTGCT GTTCGCCATC GTCTGA它对应于氨基酸序列<SEQ ID 528;ORF140-1>:1 MDGWTQTLSA QTLLGISAAA IILILILIVK FRIHALLTLV IVSLLTALAT51 GLPTGSIVND ILVKNFGGTL GGVALLVGLG AMLGRLVETS GGAQSLADAL101 IRMFGEKRAP FALGVASLIF GFPIFFDAGL IVMLPIVFAT ARRMKQDVLP151 FALASIGAFS VMHVFLPPHP GPIAASEFYG ANIGQVLILG LPTAFITWYF201 SGYMLGKVLG RTIHVPVPEL LSGGTQDNDL PKEPAKAGTV VAIMLIPMLL251 IFLNTGVSAL ISEKLVSADE TWVQTAKIIG STPIALLISV LVALFVLGRK301 RGESGSALEK TVDGALAPVC SVILITGAGG MFGGVLRASG IGKALADSMA351 DLGIPVLLGC FLVALALRIA QGSATVALTT AAALMAPAVA AAGFTDWQLA401 CIVLATAAGS VGCSHFNDSG FWLVGRLLDM DVPTTLKTWT VNQTLIALIG451 FALSALLFAI V*
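The correspondence between a nucleotide listing and its amino acid sequence can be checked by translating the DNA in frame. The following is a minimal sketch, assuming the standard genetic code and a reading frame that starts at the first base. The function name `translate` and the choice to report any codon containing an ambiguous base as `X` are assumptions made for illustration and are not part of the original disclosure.

```python
# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; unknown codons become 'X', stops become '*'."""
    # Remove the spaces that separate blocks of ten bases in the listing.
    dna = "".join(dna.split()).upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[pos:pos + 3], "X"))
    return "".join(protein)


# First 30 bases of SEQ ID 527 as listed above.
print(translate("ATGGACGGCT GGACACAGAC GCTGTCCGCG"))
```

For these 30 bases the sketch returns the ten residues MDGWTQTLSA, which agree with the start of SEQ ID 528.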
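Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF140 shows 95.4% identity over an 87 aa overlap with an ORF (ORF140a) from strain A of N. meningitidis: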
10 20 30 40 50 60
orf140.pep MDGWTQTLSAQTLLGISAAAIILILILIVRFRIHALLTLVIVSLLTALATGLPTGSIVKD
|||||||||||||||||||||||||||||:||||||||||||||||||||||||||||:|
orf140a MDGWTQTLSAQTLLGISAAAIILILILIVKFRIHALLTLVIVSLLTALATGLPTGSIVND
10 20 30 40 50 60
70 80
orf140.pep ILVKNFGGTLGGVALLVGLGAMLERLV
:|||||||||||||||||||||| |||orf140a VLVKNFGGTLGGVALLVGLGAMLGRLVETSGGAQSLADALIRMFGEKRAPFALGVASLIF
70 80 90 100 110 120全长ORF140a核苷酸序列<SEQ ID 529>是:1 ATGGACGGCT GGACACAGAC GCTGTCCGCG CAAACCCTGT TGGGCATTTC51 GGCGGCGGCA ATCATCCTCA TTCTGATTTT AATCGTCAAA TTCCGCATCC101 ACGCGCTGCT GACACTGGTC ATCGTCAGCC TGCTGACGGC TTTGGCAACC151 GGTTTGCCCA CAGGCAGCAT TGTCAACGAC GTACTGGTCA AAAACTTCGG201 CGGCACGCTC GGCGGCGTGG CGCTTCTGGT CGGCCTGGGC GCGATGCTCG251 GACGTTTGGT CGAAACATCC GGCGGCGCAC AGTCGCTGGC GGACGCGCTG301 ATCCGGATGT TCGGCGAAAA ACGCGCACCG TTCGCGCTGG GCGTTGCCTC351 GCTGATTTTC GGCTTCCCGA TTTTCTTCGA TGCCGGACTA ATCGTCATGC401 TGCCCATCGT GTTCGCCACC GCACGGCGCA TGAAACAGGA CGTACTGCCC451 TTCGCGCTTG CCTCCATCGG CGCATTTTCC GTCATGCACG TCTTCCTGCC501 GCCCCATCCG GGCCCGATTG CCGCTTCCGA ATTTTACGGC GCGAACATCG551 GCCAAGTTTT GATTTTGGGT CTGCCGACCG CCTTCATCAC ATGGTATTTC601 AGCGGCTATA TGCTCGGCAA AGTGTTGGGG CGCACCATCC ATGTTCCCGT651 TCCCGAACTG CTCAGCGGCG GCACGCAAGA CAACGACCTG CCGAAAGAAC701 CTGCCAAAGC AGGAACGGTC GTCGCCATCA TGCTGATTCC CATGCTGCTG751 ATTTTCCTGA ATACCGGCGT ATCGGCCCTC ATCAGCGAAA AACTCGTAAG801 TGCGGACGAA ACCTGGGTTC AGACGGCAAA AATAATCGGT TCGACACCGA851 TCGCCCTTCT GATTTCCGTA TTGGTCGCAC TGTTTGTCTT GGGACGCAAA901 CGCGGCGAAA GCGGCAGCGC GTTGGAAAAA ACCGTGGACG GCGCACTCGC951 CCCCGTCTGT TCCGTGATTC TGATTACCGG CGCGGGCGGT ATGTTCGGCG1001 GCGTTTTGCG CGCTTCCGGC ATCGGCAAGG CACTCGCCGA CAGCATGGCG1051 GATTTGGGCA TTCCCGTCCT TTTGGGCTGT TTCCTTGTCG CCTTGGCACT1101 GCGTATCGCG CAAGGTTCGG CAACCGTCGC CCTGACCACC GCCGCCGCGC1151 TGATGGCTCC TGCCGTTGCC GCCGCCGGCT TTACCGACTG GCAGCTCGCC1201 TGTATCGTAT TGGCAACGGC GGCAGGTTCG GTCGGTTGCA GCCACTTCAA1251 CGACTCCGGC TTCTGGCTGG TCGGCCGCCT CTTGGACATG GACGTACCGA1301 CCACGCTGAA AACCTGGACG GTCAACCAAA CCCTCATCGC ACTCATCGGC1351 TTTGCCTTGT CCGCACTGCT GTTCGCCATC GTCTGA它编码的蛋白质具有氨基酸序列<SEQ ID 530>:1 MDGWTQTLSA QTLLGISAAA IILILILIVK FRIHALLTLV IVSLLTALAT51 GLPTGSIVND VLVKNFGGTL GGVALLVGIG AMLGRLVETS GGAQSLADAL101 IRMFGEKRAP FALGVASLIF GFPIFFDAGL IVMLPIVFAT ARRMKQDVLP151 FALASIGAFS VMHVFLPPHP GPIAASEFYG ANIGQVLILG LPTAFITWYF201 SGYMLGKVLG RTIHVPVPEL LSGGTQDNDL PKEPAKAGTV VAIMLIPMLL251 IFLNTGVSAL 
ISEKLVSADE TWVQTAKIIG STPIALLISV LVALFVLGRK301 RGESGSALEK TVDGALAPVC SVILITGAGG MFGGVLRASG IGKALADSMA351 DLGIPVLLGC FLVALALRIA QGSATVALTT AAALMAPAVA AAGFTDWQLA401 CIVLATAAGS VGCSHFNDSG FWLVGRLLDM DVPTTLKTWT VNQTLIALIG451 FALSALLFAI V*ORF140a和ORF140-1在461个氨基酸的重叠区内显示出有99.8%的相同性:orf140-1.pep MDGWTQTLSAQTLLGISAAAIILILILIVKFRIHALLTLVIVSLLTALATGLPTGSIVND 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf140a MDGWTQTLSAQTLLGISAAAIILILILIVKFRIHALLTLVIVSLLTALATGLPTGSIVND 60orf140-1.pep ILVKNFGGTLGGVALLVGLGAMLGRLVETSGGAQSLADALIRMFGEKRAPFALGVASLIF 120
:|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf140a VLVKNFGGTLGGVALLVGLGAMLGRLVETSGGAQSLADALIRMFGEKRAPFALGVASLIF 120orf140-1.pep GFPIFFDAGLIVMLPIVFATARRMKQDVLPFALASIGAFSVMHVFLPPHPGPIAASEFYG 180
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf140a GFPIFFDAGLIVMLPIVFATARRMKQDVLPFALASIGAFSVMHVFLPPHPGPIAASEFYG 180
orf140-1.pep ANIGQVLILGLPTAFITWYFSGYMLGKVLGRTIHVPVPELLSGGTQDNDLPKEPAKAGTV 240
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf140a ANIGQVLILGLPTAFITWYFSGYMLGKVLGRTIHVPVPELLSGGTQDNDLPKEPAKAGTV 240orf140-1.pep VAIMLIPMLLIFLNTGVSALISEKLVSADETWVQTAKIIGSTPIALLISVLVALFVLGRK 300
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf140a VAIMLIPMLLIFLNTGVSALISEKLVSADETWVQTAKIIGSTPIALLISVLVALFVLGRK 300orf140-1.pep RGESGSALEKTVDGALAPVCSVILITGAGGMFGGVLRASGIGKALADSMADLGIPVLLGC 360
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf140a RGESGSALEKTVDGALAPVCSVILITGAGGMFGGVLRASGIGKALADSMADLGIPVLLGC 360orf140-1.pep FLVALALRIAQGSATVALTTAAALMAPAVAAAGFTDWQLACIVLATAAGSVGCSHFNDSG 420
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf140a FLVALALRIAQGSATVALTTAAALMAPAVAAAGFTDWQLACIVLATAAGSVGCSHFNDSG 420orf140-1.pep FWLVGRLLDMDVPTTLKTWTVNQTLIALIGFALSALLFAIV 461
|||||||||||||||||||||||||||||||||||||||||
orf140a FWLVGRLLDMDVPTTLKTWTVNQTLIALIGFALSALLFAIV 461
Homology with an ORF from N. gonorrhoeae
ORF140 shows 92% identity over an 87 aa overlap with a predicted ORF (ORF140ng) from N. gonorrhoeae:
orf140.pep MDGWTQTLSAQTLLGISAAAIILILILIVRFRIHALLTLVIVSLLTALATGLPTGSIVKD 60
||| |||||||||||||||||||||||||:|||:|||||:||||||||||||||||||:|
orf140ng MDGRTQTLSAQTLLGISAAAIILILILIVKFRIRALLTLVIASLLTALATGLPTGSIVND 60
orf140.pep ILVKNFGGTLGGVALLVGLGAMLERLV 87
:|||||||||||||||||||||| |||
orf140ng VLVKNFGGTLGGVALLVGLGAMLGRLVETSGGAQSLADALIRMFGEKRAPFAPGVASLIF 120
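The identity figure quoted for this overlap can be reproduced by counting matching positions. The sketch below assumes an ungapped alignment in which both sequences start at the same residue, which holds for the ORF140/ORF140ng comparison shown above. The helper name `percent_identity` is hypothetical, and this simple position-wise count is not the alignment program used to produce the figures in the text.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions over the ungapped overlap of a and b."""
    overlap = min(len(a), len(b))
    if overlap == 0:
        raise ValueError("empty overlap")
    # zip stops at the shorter sequence, i.e. at the end of the overlap.
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / overlap


# Sequences copied from the alignment above (partial ORF140, start of ORF140ng).
orf140 = ("MDGWTQTLSAQTLLGISAAAIILILILIVRFRIHALLTLVIVSLLTALATGLPTGSIVKD"
          "ILVKNFGGTLGGVALLVGLGAMLERLV")
orf140ng = ("MDGRTQTLSAQTLLGISAAAIILILILIVKFRIRALLTLVIASLLTALATGLPTGSIVND"
            "VLVKNFGGTLGGVALLVGLGAMLGRLVETSGGAQSLADALIRMFGEKRAPFAPGVASLIF")

overlap = min(len(orf140), len(orf140ng))
print(f"{percent_identity(orf140, orf140ng):.1f}% over {overlap} aa")
```

With the sequences as typed here, 80 of the 87 overlapping positions match, which is consistent with the 92% identity stated above.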
预计全长ORF140ng核苷酸序列<SEQ ID 531>编码的蛋白质具有氨基酸序列<SEQ ID 532>:1 MDGRTQTLSA QTLLGISAAA IILILILIVK FRIRALLTLV IASLLTALAT551 GLPTGSIVND VLYKNFGGTL GGVALLVGLG AMLGRLVETS GGAQSLADAL101 IRMFGEKRAP FAPGVASLIF GFPIFFDAGL IVMLPIVFAT ARRMKQDVLP151 FALASVGAFS VMHVFLPPHP GPIAASEFYG ANIGQVLILG LPTAFITWYF201 SGYMLGKVLG RAIHVPVPEL LSGGTQDSDP PKEPAKAGTV VAVMLIPMLL251 IFLNTGVSAL ISEKLVSADE TWVQTAKMIG STPVALLISV LAALLVLGRK301 RGESGSTLEK TVDGALAPAC SVILITGAGG MFGGVLRASG IGKALADSMA351 DLGIPVLLGC FLVALALRIA QGSATVALTT AAALMAPAVA AAGFTDWQLA401 CIVLATAAGS VGCSHFNDSG FWLVGRLSDM DVPTTLKTWT VNQTLIAFIG451 FALSALLFAI V*进一步的工作揭示了一个淋球菌变体DNA序列<SEQ ID 533>:1 ATGGACGGCC GGACACAGAC GCTGTCCGCG CAAACCTTGT TGGGCATTTC51 GGCGGCGGCA ATCATCCTCA TTCTGATTTT AATCGTCAAA TTCCGCATCC101 GCGCGCTGCT GACACTGGTC ATCGCCAGCC TGCTGACGGC TTTGGCAACC151 GGTTTGCCCA CAGGCAGCAT CGTCAACGAC GTACTGGTCA AAAACTTCGG201 CGGCACGCTC GGCGGCGTGG CGCTTCTGGT CGGTCTGGGC GCAATGCTCG251 GACGTTTGGT AGAAACATCC GGCGGCGCAC AGTCGCTGGC GGACGCGCTG301 ATCCGGATGT TCGGCGAAAA ACGCGCACCG TTCGCTCCGG GCGTTGCCTC351 GCTGATTTTC GGCTTCCCGA TTTTCTTCGA TGCCGGACTA ATCGTCATGC401 TGCCCATCGT ATTCGCCACC GCACGGCGCA TGAAACAGGA CGTACTGCCC451 TTCGCGCTTG CCTCCGTCGG CGCATTTTCC GTCATGCACG TCTTCCTGCC501 GCCCCATCCG GGCCCGATTG CCGCTTCCGA ATTTTACGGC GCGAACATCG551 GCCAGGTTTT GATTTTGGGT CTGCCGACCG CCTTCATCAC ATGGTATTTC601 AGCGGCTATA TGCTCGGCAA AGTGTTGGGG CGCGCCATCC ATGTTCCCGT651 TCCCGAACTG CTCAGCGGCG GCACGCAAGA CAGCGACCCG CCGAAAGAAC701 CTGCCAAAGC ACGAACGGTC GTCGCCGTCA TGCTGATTCC CATGCTGCTG751 ATTTTCCTGA ATACCGGCGT ATCAGCCCTC ATCAGCGAAA AACTCGTAAG 801 TGCGGACGAA ACTTGGGTTC AGACGGCAAA AATGATCGGT TCGACACCTG851 TCGCCCTTCT GATTTCCGTA TTGGCCGCAC TGTTGGTCTT GGGACGCAAA901 CGCGGCGAAA GCGGCAGCAC GTTGGAAAAA ACCGTGGACG GCGCACTCGC951 CCCCGCCTGT TCCGTGATTC TGATTACCGG CGCGGGCGGT ATGTTCGGCG1001 GCGTTTTGCG CGCTTCCGGC ATCGGCAAGG CACTCGCCGA CAGCATGGCG1051 GATTTGGGCA TTCCCGTCCT TTTGGGCTGC TTCCTTGTCG CCTTGGCACT1101 GCGTATCGCG CAAGGTTCGG CAACCGTCGC CCTGACCACA GCCGCCGCGC1151 TGATGGCTCC TGCCGTTGCC GCCGCCGGCT 
TTACCGACTG GCAGCTCGCC1201 TGTATCGTAT TGGCAACGGC GGCAGGTTCG GTCGGTTGCA GCCACTTCAA1251 CGACTCCGGC TTCTGGCTGG TCGGCCGCCT CTTGGATATG GACGTACCGA1301 CCACGCTGAA AACCTGGACG GTCAACCAAA CCCTCATCGC ATTCATCGGC1351 TTTGCCTTGT CCGCACTGCT GTTTGCCATC GTCTGA它对应于氨基酸序列<SEQ ID 534;ORF140ng-1>:1 MDGRTQTLSA QTLLGISAAA IILILILIVK FRIRALLTLV IASLLTALAT51 GLPTGSIVND VLVKNFGGTL GGVALLVGLG AMLGRLVETS GGAQSLADAL101 IRMFGEKRAP FAPGVASLIF GFPIFFDAGL IVMLPIVFAT ARRMKQDVLP151 FALASVGAFS VMHVFLPPHP GPIAASEFYG ANIGQVLILG LPTAFITWYF201 SGYMLGKVLG RAIHVPVPEL LSGGTQDSDP PKEPAKAGTV VAVMLIPMLL251 IFLNTGVSAL ISEKLVSADE TWVQTAKMIG STPVALLISV LAALLVLGRK301 RGESGSTLEK TVDGALAPAC SVILITGAGG MFGGVLRASG IGKALADSMA351 DLGIPVLLGC FLVALALRIA QGSATVALTT AAALMAPAVA AAGFTDWQLA401 CIVLATAAGS VGCSHFNDSG FWLVGRLLDM DVPTTLKTWT VNQTLIAFIG451 FALSALLFAI V*ORF140ng-1和ORF140-1在461个氨基酸的重叠区内显示出有96.3%的相同性:orf140ng-1.pep MDGRTQTLSAQTLLGISAAAIIIILILIVKFRIRALLTLVIASLLTALATGLPTGSIVND
||| |||||||||||||||||||||||||||||:|||||||:||||||||||||||||||orf140-1 MDGWTQTLSAQTLLGISAAAIILILILIVKFRIHALLTLVIVSLLTALATGLPTGSIVNDorf140ng-1.pep VLVKNFGGTLGGVALLVGLGAMLGRLVETSGGAQSLADALIRMFGEKRAPFAPGVASLIF
:|||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||orf140-1 ILVKNFGGTLGGVALLVGLGAMLGRLVETSGGAQSLADALIRMFGEKRAPFALGVASLIForf140ng-1.pep GFPIFFDAGLIVMLPIVFATARRMKQDVLPFALASVGAFSVMHVFLPPHPGPIAASEFYG
|||||||||||||||||||||||||||||||||||:||||||||||||||||||||||||
orf140-1 GFPIFFDAGLIVMLPIVFATARRMKQDVLPFALASIGAFSVMHVFLPPHPGPIAASEFYG
orf140ng-1.pep ANIGQVLILGLPTAFITWYFSGYMLGKVLGRAIHVPVPELLSGGTQDSDPPKEPAKAGTV
|||||||||||||||||||||||||||||||:|||||||||||||||:||||||||||||orf140-1 ANIGQVLILGLPTAFITWYFSGYMLGKVLGRTIHVPVPELLSGGTQDNDLPKEPAKAGTVorf140ng-1.pep VAVMLIPMLLIFLNTGVSALISEKLVSADETWVQTAKMIGSTPVALLISVLAALLVLGRK
||:||||||||||||||||||||||||||||||||||:|||||:|||||||:||:|||||
orf140-1 VAIMLIPMLLIFLNTGVSALISEKLVSADETWVQTAKIIGSTPIALLISVLVALFVLGRK
orf140ng-1.pep RGESGSTLEKTVDGALAPACSVILITGAGGMFGGVLRASGIGKALADSMADLGIPVLLGC
||||||:|||||||||||:|||||||||||||||||||||||||||||||||||||||||orf140-1 RGESGSALEKTVDGALAPVCSVILITGAGGMFGGVLRASGIGKALADSMADLGIPVLLGCorf140ng-1.pep FLVALALRIAQGSATVALTTAAALMAPAVAAAGFTDWQLACIVLATAAGSVGCSHFNDSG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf140-1 FLVALALRIAQGSATVALTTAAALMAPAVAAAGFTDWQLACIVLATAAGSVGCSHFNDSGorf140ng-1.pep FWLVGRLLDMDVPTTLKTWTVNQTLIAFIGFALSALLFAIV
|||||||||||||||||||||||||||:|||||||||||||orf140-1 FWLVGRLLDMDVPTTLKTWTVNQTLIALIGFALSALLFAIV
In addition, ORF140ng-1 is homologous to an E. coli protein:
gi|882633 (U29579) ORF_o454 [Escherichia coli] >gi|1789097 (AE000358) o454;
This 454 aa ORF is 34% identical (9 gaps) to 444 residues of an approximately 456 aa protein, GNTP_BACLI SW:P46832 [Escherichia coli]
Length = 454
Score = 210 bits (529), Expect = 1e-53
Identities = 130/384 (33%), Positives = 194/384 (49%), Gaps = 19/384 (4%)
Query: 88 ETSGGAQSLADALIRMFGEKRAPFAPGVASLIFGFPIFFDAGLIVMLPIVFATARRMKQD 147
E SGGA+SLA+ R G+KR A +A+ G P+FFD G I++ PI++ A+ K
Sbjct: 80 EHSGGAESLANYFSRKLGDKRTIAALTLAAFFLGIPVFFDVGFIILAPIIYGFAKVAKIS 139
Query: 148 VLPFALASVGAFSVMHVFLPPHPGPIAASEFYGANIGQVLILGLPTAFITWYFSGYMLGK 207
L F L G +HV +PPHPGP+AA+ A+IG + I+G+ + I GY K
Sbjct: 140 PLKFGLPVAGIMLTVHVAVPPHPGPVAAAGLLHADIGWLTIIGIAIS-IPVGVVGYFAAK 198
Query: 208 VLGRAIHVPVPELL----------SGGTQDSDPPKEPAKAGTVVAVMLIPMLLIFLNTGV 257
++ + + E+L G T+ SD P A V ++++IP+ +I T
Sbjct: 199 IINKRQYAMSVEVLEQMQLAPASEEGATKLSDKINPPGVA-LVTSLIVIPIAIIMAGT-- 255
Query: 258 SALISEKLVSADETWVQTAKMIGSTPXXXXXXXXXXXXXXGRKRGESGSTLEKTVDGALA 317
+S L+ + T ++IGS +RG S + AL
Sbjct: 256 ---VSATLMPPSHPLLGTLQLIGSPMVALMIALVLAFWLLALRRGWSLQHTSDIMGSALP 312
Query: 318 PACSVILITGAGGMFGGVLRASGIGKALADSMADLGIPVLLGCFLVALALRIAQGSXXXX 377
A VIL+TGAGG+FG VL SG+GKALA+ + + +P+L F+++LALR +QGS
Sbjct: 313 TAAVVILVTGAGGVFGKVLVESGVGKALANMLQMIDLPLLPAAFIISLALRASQGS--AT 370
Query: 378 XXXXXXXXXXXXXXXGFTDWQLACIVLATAAGSVGCSHFNDSGFWLVGRLLDMDVPTTLK 437
G Q + LA G +G SH NDSGFW+V + L + V LK
Sbjct: 371 VAILTTGGLLSEAVMGLNPIQCVLVTLAACFGGLGASHINDSGFWIVTKYLGLSVADGLK 430
Query: 438 TWTVNQTLIAFIGFALSALLFAIV 461
TWTV T++ F GF ++ ++A++
Sbjct: 431 TWTVLTTILGFTGFLITWCVWAVI 454
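Based on this analysis, including the presence of a putative leader sequence (double-underlined) and several putative transmembrane domains (single-underlined) in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 71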
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 535>:1 ..GATTTCGGCA TATCGCCCGT GTATCTTTGG GTTGCCGCCG CGTTCAAACA51 TTTGCTGTCG CCGTGGGCTG CCGACTCATA CGATGTCGCA CGCTTTGCAG101 GCGTATTTTT TCCCGTTATC GGACTGACTT CCTGCGGCTT TGCCGGTTTC151 AACTTTTTGG GCAGACACCA CGGGCGCAC. GTCGTCCTGA TTCTCATCGG201 CTGTATCGGG CTGATTCCAG TTGCCCATTT CCTCAACCCC GCTGCCGCCG251 CCTTTGCCGC CGCCGGACTG GTGCTGCACG GTTATTCTTT GGCTCGCCGG301 CGCGTGATTG CCGCCTCTTT TCTGCTCGGT ACGGGCTGGA CGCTGATGTC351 GTTGGCAGCA GCTTATCCGG CAGCATTTGC CCTGATGCTG CCCTTGCCCG401 TACTGATGTT TTTCCGTCCG ..它对应于氨基酸序列<SEQ ID 536;ORF141>:1 ..DFGISPVYLW VAAAFKHLLS PWAADSYDVA RFAGVFFAVI GLTSCGFAGF51 NFLGRHHGRX VVLILIGCIG LIPVAHFLNP AAAAFAAAGL VLHGYSLARR101 RVIAASFLLG TGWILMSLAA AYPAAFALML PLPVLMFFRP ..进一步的工作揭示了完整的核苷酸序列<SEQ ID 537>:1 ATGCTGACCT ATACCCCGCC CGATGCCCGC CCGCCCGCCA AAACCCACGA51 AAAGCCGTGG CTGCTGCTGT TGATGGCGTT TGCCTGGTTG TGGCCCGGCGI01 TGTTTTCCCA CGATTTGTGG AATCCTGACG AACCTGCCGT CTATACCGCC151 GTCGAAGCAC TGGCAGGCAG CCCCACCCCC TTGGTTGCCC ATCTGTTCGG 201 TCAAACCGAT TTCGGCATAC CGCCCGTGTA TCTTTGGGTT GCCGCCGCGT251 TCAAACATTT GCTGTCGCCG TGGGCTGCCG ACTCATACGA TGCCGCACGC301 TTTGCAGGCG TATTTTTTGC CGTTATCGGA CTGACTTCCT GCGGCTTTGC351 CGGTTTCAAC TTTTTGGGCA GACACCACGG GCGCAgCGTC GTCCTGATTC401 TCATCGGCTG TATCGGGCTG ATTCCAGTTG CCCATTTCCT CAACCCCGCT451 GCCGCCGCCT TTGCCGCCGC CGGACTGGTG CTGCACGGTT ATTCTTTGGC501 TCGCCGGCGC GTGATTGCCG CCTCTTTTCT GCTCGGTACG GGCTGGACGC551 TGATGTCGTT GGCAGCAGCT TATCCGGCAG CATTTGCCCT GATGCTGCCC601 TTGCCCGTAC TGATGTTTTT CCGTCCGTGG CAAAGCAGGC GTTTGATGTT651 GACGGCAGTC GCCTCACTTG CCTTTGCCCT GCCGCTTATG ACCGTTTACC701 CGCTGCTCTT GGCAAAAACG CAGCCCGCGC TGTTCGCGCA ATGGCTCGAC751 TATCACGTTT TCGGTACGTT CGGCGGCGTG CGGCACGTTC AGACGGCATT801 CAGTTTGTTT TACTATCTGA AAAACCTGCT TTGGTTTGCA TTGCCCGCGC851 TGCCGCTGGC GGTTTGGACG GTTTGCCGCA CGCGCCTGTT TTCGACCGAC901 TGGGGGATTT TGGGCGTCGT CTGGATGCTT GCCGTTTTGG TGCTGCTTGC951 CGTCAATCCG CAGCGTTTTC AGGATAACCT CGTCTGGCTG CTTCCGCCGC1001 TTGCCCTGTT CGGCGCGGCG CAACTGGACA GCCTGAGGCG CGGCGCGGCG1051 GCGTTTGTCA ACTGGTTCGG 
CATTATGGCG TTCGGACTGT TTGCCGTGTT1101 CCTGTGGACG GGCTTTTTCG CCATGAATTA CGGCTGGCCC GCCAAGCTTG1151 CCGAACGCGC CGCCTATTTC AGCCCGTATT ATGTTCCTGA TATCGATCCC1201 ATTCCGATGG CGGTTGCCGT ACTGTTCACA CCCTTGTGGC TGTGGGCGAT1251 TACCCGGAAA AACATACGCG GCAGGCAGGC GGTTACCAAC TGGGCGGCAG1301 GCGTTACCCT GACCTGGGCT TTGCTGATGA CGCTGTTCCT GCCGTGGCTG1351 GACGCGGCGA AAAGCCACGC GCCGGTCGTC CGGAGTATGG AGGCATCGCT1401 TTCCCCGGAA TTGAAACGGG AGCTTTCAGA CGGCATCGAG TGTATCGGCA1451 TAGGCGGCGG CGACCTGCAC ACGCGGATTG TTTGGACGCA GTACGGCACA1501 TTGCCGCACC GCGTCGGCGA TGTACAATGC CGCTACCGCA TCGTCCTCCT1551 GCCCCAAAAT GCGGATGCGC CGCAAGGCTG GCAGACGGTT TGGCAGGGTG1601 CGCGTCCGCG CAACAAAGAC AGTAAGTTCG CACTGATACG GAAAATCGGG1651 GAAAATATAT AA它对应于氨基酸序列<SEQ ID 538;ORF141-1>:1 MLTYTPPDAR PPAKTHEKPW LLLLMAFAWL WPGVFSHDLW NPDEPAVYTA51 VEALAGSPTP LVAHLFGQTD FGIPPVYLWV AAAFKHLLSP WAADSYDAAR101 FAGVFFAVIG LTSCGFAGFN FLGRHHGRSV VLILIGCIGL IPVAHFLNPA151 AAAFAAAGLV LHGYSLARRR VIAASFLLGT GWTLMSLAAA YPAAFALMLP201 LPVLMFFRPW QSRRLMLTAV ASLAFALPLM TVYPLLLAKT QPALFAQWLD251 YHVFGTFGGV RHVQTAFSLF YYLKNLLWFA LPALPLAVWF VCRTRLFSTD301 WGILGVVWML AVLVLLAVNP QRFQDNLVWL LPPLALFGAA QLDSLRRGAA351 AFVNWFGIMA FGLFAVFLWT GFFAMNYGWP AKLAERAAYF SPYYVPDIDP401 IPMAVAVLFT PLWLWAITRK NIRGRQAVTN WAAGVTLTWA LLMTLFLPWL451 DAAKSHAPVV RSMEASLSPE LKRELSDGIE CIGIGGGDLH TRIVWTQYGT501 LPHRVGDVQC RYRIVLLPQN ADAPQGWQTV WQGARPRNKD SKFALIRKIG551 ENI*
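Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF141 shows 95.0% identity over a 140 aa overlap with an ORF (ORF141a) from strain A of N. meningitidis: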
10 20 30orf141.pep DFGISPVYLWVAAAFKHLLSPWAADSYDVA
|||| |||||||||||||||||||| ||:|orf141a WNPDEPAVYTAVEALAGSPTPLVAHLFGQIDFGIPPVYLWVAAAFKHLLSPWAADPYDAA
40 50 60 70 80 90
40 50 60 70 80 90orf141.pep RFAGVFFAVIGLTSCGFAGFNFLGRHHGRXVVLILIGCIGLIPVAHFLNPAAAAFAAAGL
|||||||||:||||||||||||||||||| |||||||||||||::|||||||||||||||orf141a RFAGVFFAVVGLTSCGFAGFNFLGRHHGRSVVLILIGCIGLIPTVHFLNPAAAAFAAAGL
100 110 120 130 140 150
100 110 120 130 140orf141.pep VLHGYSLARRRVIAASFLLGTGWTLMSLAAAYPAAFALMLPLPVLMFFRP
||||||||||||||||||||||||||||||||||||||||||||||||||orf141a VLHGYSLARRRVIAASFLLGTGWTLMSLAAAYPAAFALMLPLPVLMFFRPWQSRRLMLTA
160 170 180 190 200 210orf141a VASLAFALPLMTVYPLLLAKTQPALFAQWLDDHVFGTFGGVPHIQTAFSLFYYLKNLLWF
220 230 240 250 260 270全长ORF141a核苷酸序列<SEQ ID 539>是:1 ATGCTGACCT ATACCCCGCC CGATGCCCGC CCGCCCGCCA AAACCCACGA51 AAAGCCGTGG CTGTTGCTGT TGATGGCGTT TGCCTGGTTG TGGCCCGGCG101 TGTTTTCCCA CGATTTGTGG AATCCTGACG AACCTGCCGT CTATACCGCC151 GTCGAAGCAC TGGCAGGCAG CCCCACCCCT TTGGTTGCCC ATCTGTTCGG201 TCAAATCGAT TTCGGCATAC CGCCCGTGTA TCTTTGGGTT GCCGCCGCGT251 TCAAACATTT GCTGTCGCCG TGGGCTGCCG ACCCGTATGA TGCCGCACGC301 TTTGCCGGCG TGTTTTTCGC CGTTGTCGGA CTGACTTCCT GCGGCTTTGC351 CGGTTTCAAC TTTTTGGGCA GACACCACGG GCGCAGCGTC GTCCTGATTC401 TCATCGGCTG TATCGGGCTG ATTCCGACCG TACACTTTCT CAACCCCGCT451 GCCGCCGCCT TTGCCGCCGC CGGACTGGTG CTGCACGGTT ATTCTTTGGC501 TCGCCGGCGC GTGATTGCCG CCTCTTTTCT GCTCGGTACG GGTTGGACGC551 TGATGTCGTT GGCAGCAGCT TATCCGGCGG CATTTGCCCT GATGCTGCCC601 CTGCCCGTGC TGATGTTTTT CCGTCCGTGG CAAAGCAGGC GTTTGATGTT651 GACGGCAGTC GCCTCGCTTG CCTTTGCCCT GCCGCTTATG ACCGTTTACC701 CGCTGCTCTT GGCAAAAACG CAGCCCGCGC TGTTCGCGCA ATGGCTCGAC751 GATCACGTTT TCGGTACGTT CGGCGGCGTG CGGCACATTC AGACGGCATT801 CAGTTTGTTT TACTATCTGA AAAACCTGCT TTGGTTTGCA TTGCCTGCGC851 TGCCGCTGGC GGTTTGGACG GTTTGCCGCA CGCGCCTGTT TTCGACCGAC901 TGGGGGATTT TGGGCGTCGT CTGGATGCTT GCCGTTTTGG TGCTGCTTGC951 CGTCAATCCG CAGCGTTTTC AGGATAACCT CGTCTGGCTG CTTCCGCCGC1001 TTGCCCTGTT CGGCGCGGCG CAACTGGACA GCCTGAGACG CGGCGCGGCG1051 GCGTTTGTCA ACTGGTTCGG CATTATGGCG TTCGGACTGT TTGCCGTGTT1101 CCTGTGGACG GGCTTTTTCG CCATGAATTA CGGCTGGCCC GCCAAGCTTG1151 CCGAACGCGC CGCCTATTTC AGCCCGTATT ATGTTCCTGA TATCGATCCC1201 ATTCCGATGG CGGTTGCCGT ACTGTTCACA CCCTTGTGGC TGTGGGCGAT1251 TACCCGCAAA AACATACGCG GCAGGCAGGC GGTTACCAAC TGGGCGGCAG1301 GCGTTACCCT GACCTGGGCT TTGCTGATGA CGCTGTTCCT GCCGTGGCTG1351 GACGCGGCGA AAAGCCACGC GCCCGTCGTC CGGAGTATGG AGGCATCGCT1401 TTCCCCGGAA TTAAAACGGG AGCTTTCAGA CGGCATCGAG TGTATCGACA1451 TAGGCGGCGG CGACCTACAC ACGCGGATTG TTTGGACGCA GTACGGCACA1501 TTGCCGCACC GCGTCGGCGA TGTACAATGC CGCTACCGCA TCGTCCGCTT1551 GCCCCAAAAC GCGGATGCGC CGCAAGGCTG GCAGACGGTC TGGCAGGGTG1601 CGCGCCCGCG CAACAAAGAC AGTAAGTTCG CACTGATACG GAAAACCGGG1651 GAAAATATAT 
TAAAAACAAC AGATTGA它编码的蛋白质具有氨基酸序列<SEQ ID 540>:1 MLTYTPPDAR PPAKTHEKPW LLLLMAFAWL WPGVFSHDLW NPDEPAVYTA51 VEALAGSPTP LVAHLFGQID FG1PPVYLWV AAAFKHLLSP WAADPYDAAR101 FAGVFFAVVG LTSCGFAGFN FLGRHHGRSV VLILIGCIGL IPTVHFLNPA151 AAAFAAAGLV LHGYSLARRR VIAASFLLGT GWTLMSLAAA YPAAFALMLP201 LPVLMFFRPW QSRRLMLTAV ASLAFALPLM TVYPLLLAKT QPALFAQWLD251 DHVFGTFGGV RHIQTAFSLF YYLKNLLWFA LPALPLAVWT VCRTRLFSTD301 WGILGVVWML AVLVLLAVNP QRFQDNLVWL LPPLALFGAA QLDSLRRGAA351 AFVNWFGIMA FGLFAVFLWT GFFAMNYGWP AKLAERAAYF SPYYVPDIDP401 IPMAVAVLFT PLWLWAITRK NIRGRQAVTN WAAGVTLTWA LLMTLFLPWL451 DAAKSHAPVV RSMEASLSPE LKRELSDGIE CIDIGGGDLH TRIVWTQYGT501 LPHRVGDVQC RYRIVRLPQN ADAPQGWQTV WQGARPRNKD SKFALIRKTG551 ENILKTTD*
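The sequence listings in this section are flattened: each block of 50 residues is preceded by a position index that runs directly into the preceding residues (for example `...NPDEPAVYTA51 VEALAGSPTP...`). The sketch below shows one way to recover the bare sequence and to flag indices that do not match the running length, such as the index 551 that appears where 51 would be expected in the SEQ ID 532 listing. The function name `parse_listing` and its return format are assumptions for illustration only.

```python
import re


def parse_listing(text: str):
    """Return (sequence, problems) for a listing like '1 ATGG... 51 GGCT...'.

    problems is a list of (index_found, index_expected) pairs for position
    indices that disagree with the number of residues read so far.
    """
    # Split into runs of digits (indices) and runs of letters or '*' (residues).
    tokens = re.findall(r"\d+|[A-Za-z*]+", text)
    parts = []
    length = 0
    problems = []
    for tok in tokens:
        if tok.isdigit():
            expected = length + 1
            if int(tok) != expected:
                problems.append((int(tok), expected))
        else:
            parts.append(tok.upper())
            length += len(tok)
    return "".join(parts), problems


# Start of SEQ ID 540 as listed above: indices agree with the running length.
seq, problems = parse_listing(
    "1 MLTYTPPDAR PPAKTHEKPW LLLLMAFAWL WPGVFSHDLW NPDEPAVYTA51 VEALAGSPTP"
)
print(len(seq), problems)
```

A check of this kind only detects inconsistent indices; it cannot tell whether a residue itself was misread, so any flagged listing still needs to be compared against the original sequence record.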
ORF141a and ORF141-1 show 98.2% identity over a 553 aa overlap:
orf141a.pep MLTYTPPDARPPAKTHEKPWLLLLMAFAWLWPGVFSHDLWNPDEPAVYTAVEALAGSPTP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf141-1 MLTYTPPDARPPAKTHEKPWLLLLMAFAWLWPGVFSHDLWNPDEPAVYTAVEALAGSPTP
orf141a.pep LVAHLFGQIDFGIPPVYLWVAAAFKHLLSPWAADPYDAARFAGVFFAVVGLTSCGFAGFN
|||||||| ||||||||||||||||||||||||| |||||||||||||:|||||||||||
orf141-1 LVAHLFGQTDFGIPPVYLWVAAAFKHLLSPWAADSYDAARFAGVFFAVIGLTSCGFAGFN
orf141a.pep FLGRHHGRSVVLILIGCIGLIPTVHFLNPAAAAFAAAGLVLHGYSLARRRVIAASFLLGT
||||||||||||||||||||||::||||||||||||||||||||||||||||||||||||
orf141-1 FLGRHHGRSVVLILIGCIGLIPVAHFLNPAAAAFAAAGLVLHGYSLARRRVIAASFLLGT
orf141a.pep GWTLMSLAAAYPAAFALMLPLPVLMFFRPWQSRRLMLTAVASLAFALPLMTVYPLLLAKT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf141-1 GWTLMSLAAAYPAAFALMLPLPVLMFFRPWQSRRLMLTAVASLAFALPLMTVYPLLLAKT
orf141a.pep QPALFAQWLDDHVFGTFGGVRHIQTAFSLFYYLKNLLWFALPALPLAVWTVCRTRLFSTD
|||||||||| |||||||||||:|||||||||||||||||||||||||||||||||||||
orf141-1 QPALFAQWLDYHVFGTFGGVRHVQTAFSLFYYLKNLLWFALPALPLAVWTVCRTRLFSTD
orf141a.pep WGILGVVWMLAVLVLLAVNPQRFQDNLVWLLPPLALFGAAQLDSLRRGAAAFVNWFGIMA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf141-1 WGILGVVWMLAVLVLLAVNPQRFQDNLVWLLPPLALFGAAQLDSLRRGAAAFVNWFGIMA
orf141a.pep FGLFAVFLWTGFFAMNYGWPAKLAERAAYFSPYYVPDIDPIPMAVAVLFTPLWLWAITRK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf141-1 FGLFAVFLWTGFFAMNYGWPAKLAERAAYFSPYYVPDIDPIPMAVAVLFTPLWLWAITRK
orf141a.pep NIRGRQAVTNWAAGVTLTWALLMTLFLPWLDAAKSHAPVVRSMEASLSPELKRELSDGIE
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf141-1 NIRGRQAVTNWAAGVTLTWALLMTLFLPWLDAAKSHAPVVRSMEASLSPELKRELSDGIE
orf141a.pep CIDIGGGDLHTRIVWTQYGTLPHRVGDVQCRYRIVRLPQNADAPQGWQTVWQGARPRNKD
||||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||
orf141-1 CIGIGGGDLHTRIVWTQYGTLPHRVGDVQCRYRIVLLPQNADAPQGWQTVWQGARPRNKD
orf141a.pep SKFALIRKTGENI
|||||||| ||||
orf141-1 SKFALIRKIGENI
Homology with a predicted ORF from N. gonorrhoeae
ORF141 shows 95% identity over a 140 aa overlap with a predicted ORF (ORF141ng) from N. gonorrhoeae:
orf141.pep DFGISPVYLWVAAAFKHLLSPWAADSYDVA 30
|||| ||||||||||||||||||| ||:|
orf141ng WNPAEPAVYTAVEALAGSPTPLVAHLFGQTDFGIPPVYLWVAAAFKHLLSPWAAHPYDAA 126
orf141.pep RFAGVFFAVIGLTSCGFAGFNFLGRHHGRXVVLILIGCIGLIPVAHFLNPAAAAFAAAGL 90
||||||||||||||||||||||||||||| |||| ||||||||||||:||||||||||||
orf141ng RFAGVFFAVIGLTSCGFAGFNFLGRHHGRSVVLIHIGCIGLIPVAHFFNPAAAAFAAAGL 186
orf141.pep VLHGYSLARRRVIAASFLLGTGWTLMSLAAAYPAAFALMLPLPVLMFFRP 140
||||||||||||||||||||||||||||||||||||||||||||||||||
orf141ng VLHGYSLARRRVIAASFLLGTGWTLMSLAAAYPAAFALMLPLPVLMFFRPWQSRRLMLTA 246
预计ORF141ng核苷酸序列<SEQ ID 541>编码的蛋白质具有氨基酸序列<SEQ ID542>:1 MPSEAVSARP LCEYLLHLAI RPFLLTLMZT YTPPDARPPA KTHEKPELLL51 LMAFAWLWPG VFSHDLWNPA EPAVYTAVEA LAGSPTPLVA HLFGQTDFGI101 PPVYLWVAAA FKHLLSPWAA HPYDAARFAG VFFAVIGLTS CGFAGFNFLG151 RHHGRSVVLI HIGCIGLIPV AHFFNPAAAA FAAAGLVLHG YSLARRRVIA201 ASFLLGTGWT LMSLAAAYPA AFALMLPLPV LMFFRPWQSR RLMLTAVASL251 AFALPLMTVY PLLLAKTQPA LFAQWLNYHV FGTFGGVRHI QRAFSLFHYL301 KNLLWFAPPG LPLAVWTVCR TRLFSTDWGI LGIVWMLAVL VLLAFNPQRF351 QDNLVWLLPP LALFGAAQLD SLRRGAAAFV NWFGIMAFGL FAVFLWTGFF401 AMNYGWPAKL AERAAYFSPY YVPDIDPIPM AVAVLFTPLW LWAITRKVIR451 GRQAVTNWAA GVTLTWALLM TLFLPWLDAA KSHAPVVRSM EASFSPELKR501 ELSDGIECIG IGGGDLHTRI VWTQYGTLPH RVGDVRCRYR IVRLPQNADA551 PQGWQTVWQG ARPRNKDSKF ALIRKIGENI LKTTD*进一步的工作揭示了下列淋球菌DNA序列<SEQ ID 543>:1 ATGCTGACCT ATACCCCGCC CGATGCCCGC CCGCCCGCCA AAACCCACGA51 AAAACCGTGG CTGCTGCTGT TGATGGCGTT TGCCTGGCTG TGGCCCGGCG101 TGTTTTCCCA CGATTTGTGG AATCCTGCCG AACCTGCCGT CTATACCGCC151 GTCGAAGCAC TGGCAGGCAG CCCCACCCCC TTGGTTGCCC ATCTGTTCGG201 TCAAACCGAT TTCGGCATAC CGCCCGTGTA TCTTTGGGTT GCCGCCGCAT251 TCAAACATTT GCTGTCGCCG TGGGCAGCCG ACCCGTATGA TGCCGCACGC301 TTTGCAGGCG TATTTTTTGC CGTTATCGGA CTGACTTCTT GCGGCTTTGC351 CGGTTTCAAC TTTTTGGGCA GACACCACGG GCGCAGCGTT GTTTTAATCC401 ATATCGGCTG TATCGGGCTG ATTCCGGTTG CCCATTTCCT CAATCCcgcc451 gccgccgcct tTGCCGCCGC CGGACTGGTG CTGCacggct actcgctgGC501 ACGCCGGCGC GTGATtgccg cctctTtccT GCTCGGTACG GGTTGGACGT551 TGATGTCGCT GGCGGCAGCT TATCCGGCGG CGTTTGCGCT GATGCTGCCC601 CTGCCCGTGC TGATGTTTTT CCGTCCGTGG CAAAGCAGGC GTTTGATGTT651 GACGGCAGTC GCCTCGCTTG CCTTTGCCCT GCCGCTTATG ACCGTTTACC701 CGCTGCTCtt gGCAAAAACG CAGCCCGCGC TGTTTGCGCA ATGGCTCAAC751 TATCACGTTT TCGGTACGTt cggcgGCGTG CGGCAcaTTC AGAggGCatT801 Cagtttgttt cactatctgA AAaatctgct ttggttcgca ccgcccgggC851 TGCCGCTGGC GGTTTGGACG GTTTGCCGCA CACGCCTGTT TTCGACCGAC901 TGGGGGATTT TGGGCATTGT CTGGATGCTT GCCGTTTTGG TGCTGCTCGC951 CTTTAATCCG CAGCGTTTTC AAGACAACCT CGTCTGGCTG CTGCCGCCGC1001 TTGCCCTGTT CGGCGCGGCG CAACTGGACA GCCTGAGGCG CGGCGCGGCG1051 GCTTTTGTCA 
ACTGGTTCGG CATTATGGCG TTCGGGCTGT TTGCCGTGTT
1101 CCTGTGGACG GGCTTTTTCG CCATGAATTA CGGCTGGCCC GCCAAGCTTG
1151 CCGAACGCGC CGCCTACTTC AGCCCGTATT ACGTTCCCGA CATCGATCCC
1201 ATTCCGATGG CGGTTGCCGT ACTGTTCACA CCCTTGTGGC TGTGGGCGAT
1251 TACCCGGAAA AACATACGCG GCAGGCAGGC GGTTACCAAC TGGGCGGCAG
1301 GCGTTACCCT GACCTGGGCT TTGCTGATGA CGCTGTTCCT GCCGTGGCTG
1351 GACGCGGCGA AAAGCCACGC GCCCGTCGTC CGGAGTATGG AGGCATCGTT
1401 TTCCCCGGAA TTAAAACGGG AGCTTTCAGA CGGCATCGAG TGTATCGGCA
1451 TAGGCGGCGG CGACCTGCAC ACGCGGATTG TTTGGACGCA GTACGGCACA
1501 TTGCCGCACC GCGTCGGCGA TGTCCGTTGC CGCTACCGTA TCGTCCGCCT
1551 GCCCCAAAAC GCGGATGCGC CGCAAGGCTG GCAGACGGTC TGGCAGGGTG
1601 CGCGCCCGCG CAACAAAGAC AGTAAGTTTG CACTGATACG GAAAATCGGG
1651 GAAAATATAT TAAAAACAAC AGATTGA
This corresponds to the amino acid sequence <SEQ ID 544; ORF141ng-1>:
1 MLTYTPPDAR PPAKTHEKPW LLLLMAFAWL WPGVFSHDLW NPAEPAVYTA
51 VEALAGSPTP LVAHLFGQTD FGIPPVYLWV AAAFKHLLSP WAADPYDAAR
101 FAGVFFAVIG LTSCGFAGFN FLGRHHGRSV VLIHIGCIGL IPVAHFLNPA
151 AAAFAAAGLV LHGYSLARRR VIAASFLLGT GWTLMSLAAA YPAAFALMLP
201 LPVLMFFRPW QSRRLMLTAV ASLAFALPLM TVYPLLLAKT QPALFAQWLN
251 YHVFGTFGGV RHIQRAFSLF HYLKNLLWFA PPGLPLAVWT VCRTRLFSTD
301 WGILGIVWML AVLVLLAFNP QRFQDNLVWL LPPLALFGAA QLDSLRRGAA
351 AFVNWFGIMA FGLFAVFLWT GFFAMNYGWP AKLAERAAYF SPYYVPDIDP
401 IPMAVAVLFT PLWLWAITRK NIRGRQAVTN WAAGVTLTWA LLMTLFLPWL
451 DAAKSHAPVV RSMEASFSPE LKRELSDGIE CIGIGGGDLH TRIVWTQYGT
501 LPHRVGDVRC RYRIVRLPQN ADAPQGWQTV WQGARPRNKD SKFALIRKIG
551 ENILKTTD*
ORF141ng-1 and ORF141-1 show 97.5% identity over a 553-amino-acid overlap:
orf141ng-1.pep MLTYTPPDARPPAKTHEKPWLLLLMAFAWLWPGVFSHDLWNPAEPAVYTAVEALAGSPTP
|||||||||||||||||||||||||||||||||||||||||| |||||||||||||||||
orf141-1 MLTYTPPDARPPAKTHEKPWLLLLMAFAWLWPGVFSHDLWNPDEPAVYTAVEALAGSPTP
orf141ng-1.pep LVAHLFGQTDFGIPPVYLWVAAAFKHLLSPWAADPYDAARFAGVFFAVIGLTSCGFAGFN
|||||||||||||||||||||||||||||||||| |||||||||||||||||||||||||
orf141-1 LVAHLFGQTDFGIPPVYLWVAAAFKHLLSPWAADSYDAARFAGVFFAVIGLTSCGFAGFN
orf141ng-1.pep FLGRHHGRSVVLIHIGCIGLIPVAHFLNPAAAAFAAAGLVLHGYSLARRRVIAASFLLGT
||||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||
orf141-1 FLGRHHGRSVVLILIGCIGLIPVAHFLNPAAAAFAAAGLVLHGYSLARRRVIAASFLLGT
orf141ng-1.pep GWTLMSLAAAYPAAFALMLPLPVLMFFRPWQSRRLMLTAVASLAFALPLMTVYPLLLAKT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf141-1 GWTLMSLAAAYPAAFALMLPLPVLMFFRPWQSRRLMLTAVASLAFALPLMTVYPLLLAKT
orf141ng-1.pep QPALFAQWLNYHVFGTFGGVRHIQRAFSLFHYLKNLLWFAPPGLPLAVWTVCRTRLFSTD
|||||||||:||||||||||||:| |||||:||||||||| |:|||||||||||||||||
orf141-1 QPALFAQWLDYHVFGTFGGVRHVQTAFSLFYYLKNLLWFALPALPLAVWTVCRTRLFSTD
orf141ng-1.pep WGILGIVWMLAVLVLLAFNPQRFQDNLVWLLPPLALFGAAQLDSLRRGAAAFVNWFGIMA
|||||:||||||||||| ||||||||||||||||||||||||||||||||||||||||||
orf141-1 WGILGVVWMLAVLVLLAVNPQRFQDNLVWLLPPLALFGAAQLDSLRRGAAAFVNWFGIMA
orf141ng-1.pep FGLFAVFLWTGFFAMNYGWPAKLAERAAYFSPYYVPDIDPIPMAVAVLFTPLWLWAITRK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf141-1 FGLFAVFLWTGFFAMNYGWPAKLAERAAYFSPYYVPDIDPIPMAVAVLFTPLWLWAITRK
orf141ng-1.pep NIRGRQAVTNWAAGVTLTWALLMTLFLPWLDAAKSHAPVVRSMEASFSPELKRELSDGIE
||||||||||||||||||||||||||||||||||||||||||||||:|||||||||||||
orf141-1 NIRGRQAVTNWAAGVTLTWALLMTLFLPWLDAAKSHAPVVRSMEASLSPELKRELSDGIE
orf141ng-1.pep CIGIGGGDLHTRIVWTQYGTLPHRVGDVRCRYRIVRLPQNADAPQGWQTVWQGARPRNKD
||||||||||||||||||||||||||||:|||||| ||||||||||||||||||||||||
orf141-1 CIGIGGGDLHTRIVWTQYGTLPHRVGDVQCRYRIVLLPQNADAPQGWQTVWQGARPRNKD
orf141ng-1.pep SKFALIRKIGENILKTTDX
|||||||||||||||||||
orf141-1 SKFALIRKIGENIX
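Based on the presence of several putative transmembrane domains in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 72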
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 545>:
1 ..CAATCCGCCA AATGGTTATC GGGCCAAACT CTAGTCGGCA CAGCAATTGG
51 GATACGCGGG CAGATAAAGC TTGGCGGCAA CCTGCATTAC GATATATTTA
101 CCGGCCGCGC ATTGAAAAAG CCCGAATTTT TCCAATCAAG GAAATGGGCA
151 AGCGGTTTTC AGGTAGGCTA TACGTTTTAA
This corresponds to the amino acid sequence <SEQ ID 546; ORF142>:
1 ..QSAKWLSGQT LVGTAIGIRG QIKLGGNLHY DIFTGRALKK PEFFQSRKWA
51 SGFQVGYTF*
Further work revealed the complete nucleotide sequence <SEQ ID 547>:
1 ATGGATAATT CGGGTAGTGA GGCGACAGGA AAATACCAAG GAAATATCAC
51 TTTCTCTGCC GACAATCCTT TGGGACTGAG TGATATGTTC TATGTAAATT
101 ATGGACGTTC GATTGGCGGT ACGCCCGATG AGGAAAGTTT TGACGGCCAT
151 CGCAAAGAAG GCGGATCAAA CAATTACGCC GTACATTATT CAGCCCCTTT
201 CGGTAAATGG ACATGGGCAT TCAATCACAA TGGCTACCGT TACCATCAGG
251 CAGTTTCCGG ATTATCGGAA GTCTATGACT ATAATGGAAA AAGTTACAAT
301 ACTGATTTCG GCTTCAACCG CCTGTTGTAT CGTGATGCCA AACGCAAAAC
351 CTATCTCGGT GTAAAACTGT GGATGAGGGA AACAAAAAGT TACATTGATG
401 ATGCCGAACT GACTGTACAA CGGCGTAAAA CTGCGGGTTG GTTGGCAGAA
451 CTTTCCCACA AAGAATATAT CGGTCGCAGT ACGGCAGATT TTAAGTTGAA
501 ATATAAACGC GGCACCGGCA TGAAAGATGC TCTGCGCGCG CCTGAAGAAG
551 CCTTTGGCGA AGGCACGTCA CGTATGAAAA TTTGGACGGC ATCGGCTGAT
601 GTAAATACTC CTTTTCAAAT CGGTAAACAG CTATTTGCCT ATGACACATC
651 CGTTCATGCA CAATGGAACA AAACCCCGCT AACATCGCAA GACAAACTGG
701 CTATCGGCGG ACACCACACC GTACGTGGCT TCGACGGTGA AATGAGTTTG
751 TCTGCCGAGC GGGGATGGTA TTGGCGCAAC GATTTGAGCT GGCAATTTAA
801 ACCAGGCCAT CAGCTTTATC TTGGGGCTGA TGTAGGACAT GTTTCAGGAC
851 AATCCGCCAA ATGGTTATCG GGCCAAACTC TAGTCGGCAC AGCAATTGGG
901 ATACGCGGGC AGATAAAGCT TGGCGGCAAC CTGCATTACG ATATATTTAC
951 CGGCCGCGCA TTGAAAAAGC CCGAATTTTT CCAATCAAGG AAATGGGCAA
1001 GCGGTTTTCA GGTAGGCTAT ACGTTTTAA
This corresponds to the amino acid sequence <SEQ ID 548; ORF142-1>:
1 MDNSGSEATG KYQGNITFSA DNPLGLSDMF YVNYGRSIGG TPDEESFDGH
51 RKEGGSNNYA VHYSAPFGKW TWAFNHNGYR YHQAVSGLSE VYDYNGKSYN
101 TDFGFNRLLY RDAKRKTYLG VKLWMRETKS YIDDAELTVQ RRKTAGWLAE
151 LSHKEYIGRS TADFKLKYKR GTGMKDALRA PEEAFGEGTS RMKIWTASAD
201 VNTPFQIGKQ LFAYDTSVHA QWNKTPLTSQ DKLAIGGHHT VRGFDGEMSL
251 SAERGWYWRN DLSWQFKPGH QLYLGADVGH VSGQSAKWLS GQTLVGTAIG
301 IRGQIKLGGN LHYDIFTGRA LKKPEFFQSR KWASGFQVGY TF*
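The stated correspondence between the partial DNA sequence <SEQ ID 545> and the ORF142 peptide can be checked by conceptual translation. The following is a minimal sketch, assuming the standard genetic code read in frame from the first listed nucleotide; the helper name `translate` and the choice to test only the first and last 30 nucleotides of the listing are illustrative and not part of the patent.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; unknown codons become 'X'."""
    dna = "".join(dna.split()).upper()   # drop the 10-nt block spacing
    usable = len(dna) - len(dna) % 3     # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )


# First and last 30 nucleotides of <SEQ ID 545> as listed in this example.
n_term = translate("CAATCCGCCA AATGGTTATC GGGCCAAACT")
c_term = translate("AGCGGTTTTC AGGTAGGCTA TACGTTTTAA")
print(n_term, c_term)
```

Under these assumptions the N-terminal fragment translates to QSAKWLSGQT and the C-terminal fragment to SGFQVGYTF followed by a stop, which agrees with the ORF142 residues shown in the alignment against ORF142ng.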
Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. gonorrhoeae
ORF142 and the predicted ORF from N. gonorrhoeae (ORF142ng) show 88.1% identity over a 59-amino-acid overlap:
orf142.pep QSAKWLSGQTLVGTAIGIRGQIKLGGNLHY 30
|||||||||||:||||||||||||||||||
orf142ng RGWYWRNDLSWQFKPGHQLYLGADVGHVSGQSAKWLSGQTLAGTAIGIRGQIKLGGNLHY 313
orf142.pep DIFTGRALKKPEFFQSRKWASGFQVGYTF 59
||||||||||||:||::||::||||||:|orf142ng DIFTGRALKKPEYFQTKKWVTGFQVGYSF 342全长ORF142ng核苷酸序列<SEQ ID 549>是:1 ATGGATAATT CGGGTAGTGA GGCGACAGGA AAATACCAAG GAAATATCAC51 TTTCTCTGCC GACAATCCTT TTGGACTGAG TGATATGTTC TATGTAAATT101 ATGGACGTTC AATTGGCGGT ACGCCCGATG AGGAAAATTT TGACGGCCAT151 CGCAAAGAAG GCGGATCAAA CAATTACGCC GTACATTATT CAGCCCCTTT201 CGGTAAATGG ACATGGGCAT TCAATCACAA TGGCTACCGT TACCATCAGG251 CGGTTTCCGG ATTATCGGAA GTCTATGACT ATAATGGAAA AAGTTACAAC301 ACTGATTTCG GCTTCAACCG CCTGTTGTAT CGTGATGCCA AACGCAAAAC351 CTATCTCAGT GTAAAACTGT GGACGAGGGA AACAAAAAGT TACATTGATG401 ATGCCGAACT GACTGTACAA CGGCGTAAAA CCACAGGTTG GTTGGCAGAA 451 CTTTCCCACA AAGGATATAT CGGTCGCAGT ACGGCAGATT TTAAGTTGAA501 ATATAAACAC GGCACCGGCA TGAAAGATGC TCTGCGCGCG CCTGAAGAAG551 CCTTTGGCGA AGGCACGTCA CGTATGAAAA TTTGGACGGC ATCGGCTGAT601 GTAAATACTC CTTTTCAAAT CGGTAAACAG CTATTTGCCT ATGACACATC651 CGTTCATGCA CAATGGAACA AAACCCCGCT AACATCGCAA GACAAACTGG701 CTATCGGCGG ACACCACACC GTACGTGGCT TCGACGGTGA AATGAGTTTG751 CCTGCCGAGC GGGGATGGTA TTGGCGCAAC GATTTGAGCT GGCAATTTAA801 ACCAGGCCAT CAGCTTTATC TTGGGGCTGA TGTAGGACAT GTTTCAGGAC851 AATCCGCCAA ATGGTTATCG GGCCAAACTC TAGCCGGCAC AGCAATTGGG901 ATACGCGGGC AGATAAAGCT TGGCGGCAAC CTGCATTACG ATATATTTAC951 CGGCCGTGCA TTGAAAAAGC CCGAATATTT TCAGACGAAG AAATGGGTAA1001 CGGGGTTTCA GGTGGGTTAT TCGTTTTGA它编码的蛋白质具有氨基酸序列<SEQ ID 550>:1 MDNSGSEATG KYQGNITFSA DNPFGLSDMF YVNYGRSIGG TPDEENFDGH51 RKEGGSNNYA VHYSAPFGKW TWAFNHNGYR YHQAVSGLSE VYDYNGKSYN101 TDFGFNRLLY RDAKRKTYLS VKLWTRETKS YIDDAELTVQ RRKTTGWLAE151 LSHKGYIGRS TADFKLKYKH GTGMKDALRA PEEAFGEGTS RMKIWTASAD201 VNTPFQIGKQ LFAYDTSVHA QWNKTPLTSQ DKLAIGGHHT VRGFDGEMSL251 PAERGWYWRN DLSWQFKPGH QLYLGADVGH VSGQSAKWLS GQTLAGTAIG301 IRGQIKLGGN LHYDIFTGRA LKKPEYFQTK KWVTGFQVGY SF*通常发现有下划线的序列(芳族-Xaa-芳族氨基酸基序)在外膜蛋白的C端。ORF142ng和ORF142-1在342个氨基酸的重叠区内显示出有95.6%的相同性:orf142-1.pep MDNSGSEATGKYQGNITFSADNPLGLSDMFYVNYGRSIGGTPDEESFDGHRKEGGSNNYA
|||||||||||||||||||||||:|||||||||||||||||||||:||||||||||||||
orf142ng-1 MDNSGSEATGKYQGNITFSADNPFGLSDMFYVNYGRSIGGTPDEENFDGHRKEGGSNNYA
orf142-1.pep VHYSAPFGKWTWAFNHNGYRYHQAVSGLSEVYDYNGKSYNTDFGFNRLLYRDAKRKTYLG
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||:
orf142ng-1 VHYSAPFGKWTWAFNHNGYRYHQAVSGLSEVYDYNGKSYNTDFGFNRLLYRDAKRKTYLS
orf142-1.pep VKLWMRETKSYIDDAELTVQRRKTAGWLAELSHKEYIGRSTADFKLKYKRGTGMKDALRA
|||| |||||||||||||||||||:||||||||| |||||||||||||||||||||||||
orf142ng-1 VKLWTRETKSYIDDAELTVQRRKTTGWLAELSHKGYIGRSTADFKLKYKHGTGMKDALRA
orf142-1.pep PEEAFGEGTSRMKIWTASADVNTPFQIGKQLFAYDTSVHAQWNKTPLTSQDKLAIGGHHT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf142ng-1 PEEAFGEGTSRMKIWTASADVNTPFQIGKQLFAYDTSVHAQWNKTPLTSQDKLAIGGHHT
orf142-1.pep VRGFDGEMSLSAERGWYWRNDLSWQFKPGHQLYLGADVGHVSGQSAKWLSGQTLVGTAIG
|||||||||| |||||||||||||||||||||||||||||||||||||||||||:|||||
orf142ng-1 VRGFDGEMSLPAERGWYWRNDLSWQFKPGHQLYLGADVGHVSGQSAKWLSGQTLAGTAIG
orf142-1.pep IRGQIKLGGNLHYDIFTGRALKKPEFFQSRKWASGFQVGYTF
|||||||||||||||||||||||||:||::||::||||||:|
orf142ng-1 IRGQIKLGGNLHYDIFTGRALKKPEYFQTKKWVTGFQVGYSF
In addition, ORF142ng is homologous to the HecB protein of Erwinia chrysanthemi:
gi|1772622 (L39897) HecB [Erwinia chrysanthemi] Length = 558
Score = 119 bits (295), Expect = 3e-26
Identities = 88/346 (25%), Positives = 151/346 (43%), Gaps = 22/346 (6%)
Query: 2 DNSGSEATGKYQGNITFSADNPFGLSDMFYVNYGRSIGGTPDEENFDGHRKEGGSNNYAV 61
DNSG ++TG+ Q N + + DN FGL+D ++++ G S + + D + G
Sbjct: 230 DNSGQKSTGEEQLNGSLALDNVFGLADQWFISAGHS---SRFATSHDAESLQAG------ 280
Query: 62 HYSAPFGKWTWAFNHNGYRYHQAVSGLSEVYDYNGKSYNTDFGFNRLLYRDAKRKTYLSV 121
+S P+G W +N++ RY + G S F +R+++RD KT ++
Sbjct: 281 -FSMPYGYWNLGYNYSQSRYRNTFINRDFPWHSTGDSDTHRFSLSRVVFRDGTMKTAIAG 339
Query: 122 KLWTRETKSYIDDAELTVQRRKTTGWLAELSHKGYIGRSTADFKLKYKHGTGMKDALRAP 181
R +Y++ + L RK + ++H + A F Y G +
Sbjct: 340 TFSQRTGNNYLNGSLLPSSSRKLSSVSLGVNHSQKLWGGLATFNPTYNRGVRWLGSETDT 399
Query: 182 EEAFGEGTSRMKIWTASADVNTPFQIGKQLFAYDTSVHAQWNKTPLTSQDKLAIGGHHTV 241
+++ E + WT SA P Y S++ Q++ L ++L +GG ++
Sbjct: 400 DKSADEPRAEFNKWTLSASYYHPV---TDSITYLGSLYGQYSARALYGSEQLTLGGESSI 456
Query: 242 RGFDGEMSLPAERGWYWRNDLSWQFKP----GHQLYLGA-DVGHVSGQSAKWLSGQTLAG 296
RGF E RG YWRN+L+WQ G+ ++ A D GH+ + +L G
Sbjct: 457 RGF-REQYTSGNRGAYWRNELNWQAWQLPVLGNVTFMAAVDGGHLYNHKQDNSTAASLWG 515
Query: 297 TAIGIRGQIKLGGNLHYDIFTGRALKKPEYFQTKKWVTGFQVGYSF 342
A+G+ + L + G + P + Q V G++VG SF
Sbjct: 516 GAVGMTVASRW---LSQQVTVGWPISYPAWLQPDTMVVGYRVGLSF 558
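Based on this analysis, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.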
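This example notes that an aromatic-Xaa-aromatic motif is commonly found at the C-terminus of outer membrane proteins. A minimal sketch of such a check follows, assuming that "aromatic" means F, W or Y (the patent does not enumerate the residues) and that the trailing `*` marks the stop codon; the function name `c_terminal_motif` is hypothetical.

```python
import re
from typing import Optional

# Aromatic residue, any residue, aromatic residue, anchored at the C-terminus.
# Treating only F, W and Y as aromatic is an assumption of this sketch.
AROMATIC_C_TERM = re.compile(r"[FWY][A-Z][FWY]$")


def c_terminal_motif(protein: str) -> Optional[str]:
    """Return the terminal aromatic-Xaa-aromatic tripeptide, or None."""
    seq = "".join(protein.split()).rstrip("*")  # drop spacing and stop marker
    match = AROMATIC_C_TERM.search(seq)
    return match.group(0) if match else None


# C-terminal residues copied from the sequence listings in this document.
print(c_terminal_motif("LKKPEYFQTK KWVTGFQVGY SF*"))  # ORF142ng tail
print(c_terminal_motif("LKKPEFFQSR KWASGFQVGY TF*"))  # ORF142-1 tail
print(c_terminal_motif("TLVRILYRRY SNRV*"))           # ORF143-1 tail
```

With these inputs the ORF142ng and ORF142-1 tails end in YSF and YTF respectively, while the ORF143-1 tail has no such motif. A terminal match is only suggestive of an outer membrane location and is not evidence on its own.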
Example 73
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 551>:1 ATGCGGACGA AATGGTCAGC AGTGAGAAGC TGCTTACTTG GgCGGACACC51 GCCGACATCG ATACCGCTTT GAACCTGTTG TACCGTTTGC AAAAACTCGA101 ATTCCTCTAT GGCGATGAAA ACGGTCATTC AGACGGCATC AATTTGwCGG151 ACGAGCAATT GCCGTTGCTG ATGGAACAAT TGTCCGGCAG CGGTAAGGCG201 TTATTGGTCG ATCGGAACGG TCTGTATCTT GCCAACGCCA ATTTCCATCA251 TGAGGCGGCG GAAGAGTTGG GGTTGTTGGC GGCAGAAGTC GCACAGATGG301 AAAAGAAATA CCGGCTGCTG ATTAAGAACA AC..它对应于氨基酸序列<SEQ ID 552;ORF143>:1 MRTKWSAVRS CTWALTADID TALNLLYRLQ KLEFLYGDEN GHSDGINLXD51 EQLPLLMEQL SGSGKALLVD RNGLYLANAN FHHEAAEELG LLAAEVAQME101 KKYRLLIKNN..进一步的工作揭示了完整的核苷酸序列<SEQ ID 553>:1 ATGGAATCAA CACTTTCACT ACAAGCAAAT TTATATCCCC GCCTGACTCC51 TGCCGGTGCA TTTTATGCCG TATCCAGCGA TGCCCCCAGT GCCGGTAAAA101 CTTTGTTGCA CAGCCTGTTG AAAGCAGATG CGGACGAAAT GGTCAGCAGT151 GAGAAGCTGC TTACTTGGGC GGACACCGCC GACATCGATA CCGCTTTGAA201 CCTGTTGTAC CGTTTGCAAA AACTCGAATT CCTCTATGGC GATGAAAACG251 GTCATTCAGA CGGCATCAAT TTGTCGGACG AGCAATTGCC GTTGCTGATG301 GAACAATTGT CCGGCAGCGG TAAGGCGTTA TTGGTCGATC GGAACGGTCT351 GTATCTTGCC AACGCCAATT TCCATCATGA GGCGGCGGAA GAGTTGGGGT401 TGTTGGCGGC AGAAGTCGCA CAGATGGAAA AGAAATACCG GCTGCTGATT451 AAGAACAACC TGTATATCAA CAATAACGCT TGGGGCGTTT GCGATCCTTC501 CGGTCAGAGC GAATTGACAT TTTTCCCATT GTATATCGGT TCAACCAAAT551 TTATTTTGGT TATCGGCGGC ATTCCCGATT TGGGCAAAGA GGCATTTGTT601 ACTTTGGTAA GGATTTTATA CCGCCGTTAC AGCAACCGCG TGTAA它对应于氨基酸序列<SEQ ID 554;ORF143-1>:1 MESTLSLQAN LYPRLTPAGA FYAVSSDAPS AGKTLLHSLL KADADEMVSS51 EKLLTWADTA DIDTALNLLY RLQKLEFLYG DENGHSDGIN LSDEQLPLLM101 EQLSGSGKAL LVDRNGLYLA NANFHHEAAE ELGLLAAEVA QMEKKYRLLI151 KNNLYINNNA WGVCDPSGQS ELTFFPLYIG STKFILVIGG IPDLGKEAFV201 TLVRILYRRY SNRV*该氨基酸序列的计算机分析给出了下列结果:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF143 and the ORF from N. meningitidis strain A (ORF143a) show 92.4% identity over a 105-amino-acid overlap:
10 20 30
orf143.pep MRTKWSAVRSCTWADTADIDTALNLLYRLQKLEFL
|: : ||| ||||||||||||||||||||
orf143a GAFYAVSSDXPSAGKTLLHSLLKADADEMVSSEKLLTWAXTADIDTALNLLYRLQKLEFL
20 30 40 50 60 70
40 50 60 70 80 90
orf143.pep YGDENGHSDGINLXDEQLPLLMEQLSGSGKALLVDRNGLYLANANFHHEAAEELGLLAAE
||||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||
orf143a YGDENGHSDGINLSDEQLPLLMEQLSGSGKALLVDRNGLYLANANFHHEAAEELGLLAAE
80 90 100 110 120 130
100 110
orf143.pep VAQMEKKYRLLIKNN
|||||||||| ||||
orf143a VAQMEKKYRLXIKNNLYINNNAWGVCDPSGQSELTFFPLYIGSTKFILVIGGIPDLGKEA
140 150 160 170 180 190全长ORF143a核苷酸序列<SEQ ID 555>是:1 ATGGAATCAA CANTTTCACT ACAAGCAAAT TTATATCNCC GCCTGACTCC51 TGCCGGTGCA TTTTATGCCG TATCCAGCGA TGNCCCCAGT GCCGGTAAAA101 CTTTGTTGCA CAGCCTGTTG AAAGCGGATG CGGACGAAAT GGTNAGCAGT151 GAGAAGCTGC TTACCTGGGC GGANACCGCC GACATCGATA CCGCTTTGAA201 CCTGTTGTAC CGTTTGCAAA AACTCGAATT CCTCTATGGC GATGAAAACG251 GTCATTCAGA CGGCATCAAT TTGTCGGACG AGCAATTGCC GTTGCTGATG301 GAACAATTGT CCGGCAGCGG TAAGGCGTTA TTGGTCGATC GGAACGGTCT351 GTATCTTGCC AACGCCAATT TCCATCATGA GGCGGCGGAA GAGTTGGGGT401 TGTTGGCGGC AGAAGTCGCA CAGATGGAAA AGAAATACCG GCTGCNNATT451 AAGAACAACC TGTATATCAA CAATAACGCT TGGGGCGTTT GCGATCCTTC501 CGGTCAGAGC GAATTGACAT TTTTCCCATT GTATATCGGT TCAACCAAAT551 TTATTTTGGT TATCGGCGGC ATTCCCGATT TGGGCAAAGA GGCATTTGTT601 ACTTTGGTAA GGATNTTATA CCNCCNGTTA CAGCAACCGC GTGTAAAACT651 TGGGAGAGAG GANGGGTTAT GCAGCAATTA TTGA它编码的蛋白质具有氨基酸序列<SEQ ID 556>:1 MESTXSLQAN LYXRLTPAGA FYAVSSDXPS AGKTLLHSLL KADADEMVSS51 EKLLTWAXTA DIDTALNLLY RLQKLEFLYG DENGHSDGIN LSDEQLPLLM101 EQLSGSGKAL LYDRNGLYLA NANFHHEAAE ELGLLAAEVA QMEKKYRLXI151 KNNLYINNNA WGVCDPSGQS ELTFFPLYIG STKFILVIGG IPDLGKEAFV201 TLVRXLYXXL QQPRVKLGRE XGLCSNY*ORF143a和ORF143-1在207个氨基酸的重叠区内显示出有97.1%的相同性:orf143a.pep MESTXSLQANLYXRLTPAGAFYAVSSDXPSAGKTLLHSLLKADADEMVSSEKLLTWAXTA
|||| ||||||| |||||||||||||| ||||||||||||||||||||||||||||||||orf143-1 MESTLSLQANLYPRLTPAGAFYAVSSDAPSAGKTLLHSLLKADADEMVSSEKLLTWADTAorf143a.pep DIDTALNLLYRLQKLEFLYGDENGHSDGINLSDEQLPLLMEQLSGSGKALLVDRNGLYLA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf143-1 DIDTALNLLYRLQKLEFLYGDENGHSDGINLSDEQLPLLMEQLSGSGKALLVDRNGLYLAorf143a.peD NANFHHEAAEELGLLAAEVAQMEKKYRLXIKNNLYINNNAWGVCDPSGQSELTFFPLYIG
|||||||||||||||||||||||||||| |||||||||||||||||||||||||||||||orf143-1 NANFHHEAAEELGLLAAEVAQMEKKYRLLIKNNLYINNNAWGVCDPSGQSELTFFPLYIGorf143a.pep STKFILVIGGIPDLGKEAFVTLVRXLY
|||||||||||||||||||||||| ||
orf143-1 STKFILVIGGIPDLGKEAFVTLVRILY
Homology with a predicted ORF from N. gonorrhoeae
ORF143 and the predicted ORF from N. gonorrhoeae (ORF143ng) show 95.5% identity over a 110-amino-acid overlap:
orf143.pep MRTKWSAVRSCTWADTADIDTALNLLYRLQKLEFLYGDENGHSDGINLXDEQLPLLMEQL 60
|||||||||||: ||||||||||||||||||||||||||||||||||| |||||||||||
orf143ng MRTKWSAVRSCSRADTADIDTALNLLYRLQKLEFLYGDENGHSDGINLSDEQLPLLMEQL 60
orf143.pep SGSGKALLVDRNGLYLANANFHHEAAEELGLLAAEVAQMEKKYRLLIKNN 110
||||||||||||||||||||||||:||||||||||||||||||||||:||
orf143ng SGSGKALLVDRNGLYLANANFHHESAEELGLLAAEVAQMEKKYRLLIRNNLYINNNAWGV 120
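The identity figure quoted for ORF143 against ORF143ng can be re-derived by counting identical positions across the ungapped overlap. The sketch below assumes the two 110-residue strings are exactly those shown in the alignment above, and that an ambiguous residue (`X`) counts as a mismatch; the function name `percent_identity` is illustrative.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two equal-length aligned strings."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


# Residues 1-110 of each sequence, copied from the alignment in this example.
ORF143 = (
    "MRTKWSAVRSCTWADTADIDTALNLLYRLQKLEFLYGDENGHSDGINLXDEQLPLLMEQL"
    "SGSGKALLVDRNGLYLANANFHHEAAEELGLLAAEVAQMEKKYRLLIKNN"
)
ORF143NG = (
    "MRTKWSAVRSCSRADTADIDTALNLLYRLQKLEFLYGDENGHSDGINLSDEQLPLLMEQL"
    "SGSGKALLVDRNGLYLANANFHHESAEELGLLAAEVAQMEKKYRLLIRNN"
)

print(f"{percent_identity(ORF143, ORF143NG):.1f}% over {len(ORF143)} aa")
```

Five positions differ (T/S, W/R, X/S, A/S and K/R), giving 105 identical residues out of 110, which rounds to the 95.5% stated in the text.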
预计ORF143ng核苷酸序列<SEQ ID 557>编码的蛋白质具有氨基酸序列<SEQ ID558>:1 MRTKWSAVRS CSRADTADID TALNLLYRLQ KLEFLYGDEN GHSDGINLSD51 EQLPLLMEQL SGSGKALLVD RNGLYLANAN FHHESAEELG LLAAEVAQME101 KKYRLLIRNN LYINNNAWGV CDPSGQSELT FFPLYIGSTK FILVIAGIPD151 LSKGGICYFG KDFIPPLQQP RVKLGTGGIM RQLLISILED LNNTSTDIIA201 SAVISTDGLP MATMLPSHLN SDRVGAISAT LLALGSRSVQ ELACGELEQV251 MIKGKSGYIL LSQAGKDAVL VLVAKETGRL GLILLDAKRA ARHIAEAI*进一步的工作揭示了下列淋球菌DNA序列<SEQ ID 559>:1 ATGGAATCAA CACTTTCACT ACAAGCGAAT TTATATCCCT GCCTGACTCC51 TGCCGGTGCA TTTTATGCCG TATCCAGCGA TGCCCCCAGT GCCGGTAAAA101 CTTTGTTGCG CAGCCTGTTG AAAGCGGATG CGGACGAAGT GGTCAGCAGT151 GAGAAGCTGC TCGCGGCGGA CACCGCCGAC ATCGATACCG CTTTGAACCT201 GTTGTACCGT TTGCAAAAAC TCGAATTCCT CTATGGCGAT GAAAACGGTC251 ATTCAGACGG CATCAATTIG TCGGACGAGC AATTGCCGTT GCTGATGGAA301 CAATTGTCCG GCAGCGGTAA GGCATTATTG GTCGATCGGA ACGGTCTGTA351 TCTTGCCAAC GCCAATTTCC ATCATGAGTC GGCGGAAGAG TTGGGGTTGT401 TGGCGGCAGA AGTCGCACAG ATGGAAAAGA AATACCGGCT GCTGATTAGG451 AACAACCTGT ATATCAACAA TAACGCTTGG GGCGTTTGCG ATCCTTCCGG501 TCAGAGCGAA TTGACATTTT TCCCATTGTA TATCGGTTCA ACCAAATTTA551 TTTTGGTTAT CGCCGGCATT CCCGATTTGA GCAAAGAGGC ATTTGTTACT601 TTGGTAAGGA TTTTATACCG CCGTTACAGC AACCGCGTGT AA它对应于氨基酸序列<SEQ ID 560;ORF143ng-1>:1 MESTLSLQAN LYPCLTPAGA FYAVSSDAPS AGKTLLRSLL KADADEVVSS51 EKLLAADTAD IDTALNLLYR LQKLEFLYGD ENGHSDGINL SDEQLPLLME101 QLSGSGKALL VDRNGLYLAN ANFHHESAEE LGLLAAEVAQ MEKKYRLLIR151 NNLYINNNAW GVCDPSGQSE LTFFPLYIGS TKFILVIAGI PDLSKEAFVT201 LVRILYRRYS NRV*ORF143ng-1和ORF143-1在214个氨基酸的重叠区内显示出有95.8%的相同性:orf143ng-1.pep MESTLSLQANLYPCLTPAGAFYAVSSDAPSAGKTLLRSLLKADADEVVSSEKLLA-ADTA 59
||||||||||||| ||||||||||||||||||||||:|||||||||:|||||||: ||||
orf143-1 MESTLSLQANLYPRLTPAGAFYAVSSDAPSAGKTLLHSLLKADADEMVSSEKLLTWADTA 60
orf143ng-1.pep DIDTALNLLYRLQKLEFLYGDENGHSDGINLSDEQLPLLMEQLSGSGKALLVDRNGLYLA 119
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf143-1 DIDTALNLLYRLQKLEFLYGDENGHSDGINLSDEQLPLLMEQLSGSGKALLVDRNGLYLA 120
orf143ng-1.pep NANFHHESAEELGLLAAEVAQMEKKYRLLIRNNLYINNNAWGVCDPSGQSELTFFPLYIG 179
|||||||:||||||||||||||||||||||:|||||||||||||||||||||||||||||
orf143-1 NANFHHEAAEELGLLAAEVAQMEKKYRLLIKNNLYINNNAWGVCDPSGQSELTFFPLYIG 180
orf143ng-1.pep STKFILVIAGIPDLSKEAFVTLVRILYRRYSNRV 213
||||||||:|||||:|||||||||||||||||||
orf143-1 STKFILVIGGIPDLGKEAFVTLVRILYRRYSNRV 214
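Based on the presence of a putative transmembrane domain in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 74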
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 561>:1 ATGACCTTTT TACAACGTTT GCAAGGTTTG GCAGACAATA AAATCTGTGC51 GTTTGCATGG TTCGTCGTCC GCCGCTTTGA TGAAGAACGC GTACCGCAGr101 CGGCGGCAAG CATGACGTTT ACGACGCTGC TGGCACTCGT CCCCGTGCTG151 ACCGTGATGG TGGCGGTCGC TTCGATTTTC CCCGTGTTCG ACCGCTGGTC201 GGATTCGTTC GTCTCCTTCG TCAACCAAAC CATTGTGCCG CA.GGCGCGG251 ACATCGTGTT CGACTATATC AATGCGTTCC GCGAGCAGGC GAACCGGCTG301 ACGGCAATCG GCAGCGTGAT GCTGGTCGTT ACCTCGCTGA TGCTGATTCG351 GACGATAGAC AATACGTTCA ACCGCATCTG GaCGGGTCAA wTyCCAGCGT401 CCGTGGATG..它对应于氨基酸序列<SEQ ID 562;ORF144>:1 MTFLQRLQGL ADNKICAFAW FVVRRFDEER VPQXAASMTF TTLLALVPVL51 TVMVAVASIF PVFDRWSDSF VSFVNQTIVP XGADMVFDYI NAFREQANRL101 TAIGSVMLVV TSLMLIRTID NTFNRIWRVX XQRPWM...进一步的工作揭示了完整的核苷酸序列<SEQ ID 563>:1 ATGACCTTTT TACAACGTTT GCAAGGTTTG GCAGACAATA AAATCTGTGC51 GTTTGCATGG TTCGTCGTCC GCCGCTTTGA TGAAGAACGC GTACCGCAGG101 CGGCGGCAAG CATGACGTTT ACGACGCTGC TGGCACTCGT CCCCGTGCTG151 ACCGTGATGG TGGCGGTCGC TTCGATTTTC CCCGTGTTCG ACCGCTGGTC201 GGATTCGTTC GTCTCCTTCG TCAACCAAAC CATTGTGCCG CAGGGCGCGG251 ACATGGTGTT CGACTATATC AATGCGTTCC GCGAGCAGGC GAACCGGCTG301 ACGGCAATCG GCAGCGTGAT GCTGGTCGTT ACCTCGCTGA TGCTGATTCG351 GACGATAGAC AATACGTTCA ACCGCATCTG GCGGGTCAAT TCCCAGCGTC401 CGTGGATGAT GCAGTTTCTC GTCTATTGGG CTTTACTGAC GTTCGGGCCG451 CTGTCTTTGG GCGTGGGCAT TTCCTTTATG GTCGGCTCGG TACAGGATGC501 CGCGCTTGCC TCAGGTGCGC CGCAGTGGTC GGGCGCGTTG CGAACGGCGG551 CGACGCTGAC CTTCATGACG CTTTTGCTGT GGGGGCTGTA CCGCTTCGTG601 CCAAACCGCT TCGTTCCCGC GCGGCAGGCG TTTGTCGGGG CTTTGGCAAC651 AGCGTTTTGT CTGGAAACCG CGCGCTCCCT CTTCACTTGG TATATGGGCA701 ATTTCGACGG CTACCGCTCG ATTTACGGCG CGTTTGCCGC CGTGCCGTTT751 TTTCTGTTGT GGCTGAACCT GTTGTGGACG CTGGTCTTGG GCGGCGCGGT801 GCTGACTTCT TCACTCTCCT ACTGGCAGGG AGAAGCGTTC CGCAGGGGCT851 TCGACTCGCG CGGACGGTTT GACGACGTGT TGAAAATCCT GCTGCTTCTG901 GATGCGGCGC AAAAAGAAGG CAAAGCCTTG CCTGTTCAGG AGTTCAGACG951 GCATATCAAT ATGGGCTACG ACGAGTTGGG CGAGCTTTTG GAAAAGCTGG1001 CGCGGCACGG CTACATCTAT TCCGGCAGAC AGGGTTGGGT GTTGAAAACG1051 GGGGCGGATT CGATTGAGTT GAACGAACTC TTCAAGCTCT 
TCGTTTACCG
1101 TCCGTTGCCT GTGGAAAGGG ATCATGTGAA CCAAGCTGTC GATGCGGTAA
1151 TGACACCGTG TTTGCAGACT TTGAACATGA CGCTGGCAGA GTTTGACGCT
1201 CAGGCGAAAA AACGGCAGTA G
This corresponds to the amino acid sequence <SEQ ID 564; ORF144-1>:
1 MTFLQRLQGL ADNKICAFAW FVVRRFDEER VPQAAASMTF TTLLALVPVL
51 TVMVAVASIF PVFDRWSDSF VSFVNQTIVP QGADMVFDYI NAFREQANRL
101 TAIGSVMLVV TSLMLIRTID NTFNRIWRVN SQRPWMMQFL VYWALLTFGP
151 LSLGVGISFM VGSVQDAALA SGAPQWSGAL RTAATLTFMT LLLWGLYRFV
201 PNRFVPARQA FVGALATAFC LETARSLFTW YMGNFDGYRS IYGAFAAVPF
251 FLLWLNLLWT LVLGGAVLTS SLSYWQGEAF RRGFDSRGRF DDVLKILLLL
301 DAAQKEGKAL PVQEFRRHIN MGYDELGELL EKLARHGYIY SGRQGWVLKT
351 GADSIELNEL FKLFVYRPLP VERDHVNQAV DAVMTPCLQT LNMTLAEFDA
401 QAKKRQ*
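Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF144 and the ORF from N. meningitidis strain A (ORF144a) show 96.3% identity over a 136-amino-acid overlap: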
10 20 30 40 50 60
orf144.pep MTFLQRLQGLADNKICAFAWFVVRRFDEERVPQXAASMTFTTLLALVPVLTVMVAVASIF
||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
orf144a MTFLQRLQGLADNKICAFAWFVVRRFDEERVPQAAASMTFTTLLALVPVLTVMVAVASIF
10 20 30 40 50 60
70 80 90 100 110 120
orf144.pep PVFDRWSDSFVSFVNQTIVPXGADMVFDYINAFREQANRLTAIGSVMLVVTSLMLIRTID
|||||||||||||||||||| ||||||||||||||||||||||||||||||| |||||||
orf144a PVFDRWSDSFVSFVNQTIVPQGADMVFDYINAFREQANRLTAIGSVMLVVTSXMLIRTID
70 80 90 100 110 120
130
orf144.pep NTFNRIWRVXXQRPWM
||||||||| |||||
orf144a NTFNRIWRVNSQRPWMMQFLVYWALLTFGPLSLGVGISFXVGSVQDAALASGAPQWSGAL
130 140 150 160 170 180全长ORF144a核苷酸序列<SEQ ID 565>是:1 ATGACCTTTT TACAACGTTT GCAAGGTTTG GCAGACAATA AAATCTGTGC51 GTTTGCATGG TTCGTCGTCC GCCGCTTTGA TGAAGAACGC GTACCGCAGG101 CGGCGGCAAG CATGACGTTT ACGACACTGC TGGCACTCGT CCCCGTGCTG151 ACCGTGATGG TGGCGGTCGC TTCGATTTTC CCCGTGTTCG ACCGNTGGTC201 GGATTCGTTC GTCTCCTTCG TCAACCAAAC CATTGTGCCG CAGGGCGCGG251 ACATGGTNTT CGACTATATC AATGCGTTCC GCGAGCAGGC GAACCGGCTG301 ACGGCAATCG GCAGCGTGAT GCTGGTCGTT ACCTCGCNGA TGCTGATTCG351 GACGATAGAC AATACGTTCA ACCGCATCTG GCGGGTCAAT TCCCAGCGTC401 CGTGGATGAT GCAGTTTCTC GTCTATTGGG CTTTACTGAC GTTCGGGCCG451 CTGTCTTTGG GCGTGGGCAT TTCCTTTATN GTCGGCTCGG TACAGGATGC501 CGCGCTTGCC TCAGGTGCGC CGCAGTGGTC GGGCGCGTTG CGAACGGCGG551 CGACGCTGAN CTTCATGACG CTTTTGCTGT GGGGGCTGTA CCGCTNCGTG601 CCAAACCGCT TCGTTCCCGC GCGGCANGCG TTTGTCGGGG CTTTGGCAAC651 AGCGTTCTGT CTGGAAACCG CGCGTTCCCT CTTTACTTGG TATATGGGCA701 ATTTCGACGG CTACCGCTCG ATTTACGGNG CGTTTGCCGC CGTGCCGTTT751 TTTCTGTTGT GGCTGAACCT GTTGTGGACG CTGGTCTTGG GCGGCGCGGT801 GCTGACTTCT TCACTCTCCT ACTGGCAGGG AGAAGCGTTC CGCAGGGNCT851 TCGACTCGCG CGGACGGTTT GACGACGTGT TGAAAATCCT GCTGCTTCTG901 GATGCGGCGC AAAAAGAAGG CNAAGCCTTG CCTGTTCAGG AGTTCAGACG951 GCATATCAAT ATGGGCTACG ACGAGTTGGG CGAGCTTTTG GAAAAGCTGG1001 CGCGGCACGG CTACATCTAT TCCGGCAGAC AGGGTTGGGT GTTGAAAACG1051 GGGGCGGATT CGATTGAGTT GAACGAACTC TTCAAGCTCT TCGTTTACCG1101 TCCGTTGCCT GTGGAAAGGG ATCATGTGAA CCAAGCTGTC GATGCGGTAA1151 TGATGCCGTG TTTGCAGACT TTGAACATGA CGCTGGCAGA GTTTGACGCT1201 CAGGCGAAAA AACAGCAGCA ATCTTGA它编码的蛋白质具有氨基酸序列<SEQ ID 566>:1 MTFLQRLQGL ADNKICAFAW FVVRRFDEER VPQAAASMTF TTLLALVPVL51 TVMVAVASIF PVFDRWSDSF VSFVNQTIVP QGADMVFDYI NAFREQANRL101 TAIGSVMLVV TSXMLIRTID NTFNRIWRVN SQRPWMMQFL VYWALLTFGP151 LSLGVGISFX VGSVQDAALA SGAPQWSGAL RTAATLXFMT LLLWGLYRXV201 PNRFVPARXA FVGALATAFC LETARSLFTW YMGNFDGYRS IYGAFAAVPF251 FLLWLNLLWT LVLGGAVLTS SLSYWQGEAF RRXFDSRGRF DDVLKILLLL301 DAAQKEGXAL PVQEFRRHIN MGYDELGELL EKLARHGYIY SGRQGWVLKT351 GADSIELNEL FKLFVYRPLP VERDHVNQAV DAVMMPCLQT LNMTLAEFDA401 
QAKKQQQS*
ORF144a and ORF144-1 show 97.8% identity over a 406-amino-acid overlap:
orf144a.pep MTFLQRLQGLADNKICAFAWFVVRRFDEERVPQAAASMTFTTLLALVPVLTVMVAVASIF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf144-1 MTFLQRLQGLADNKICAFAWFVVRRFDEERVPQAAASMTFTTLLALVPVLTVMVAVASIF
orf144a.pep PVFDRWSDSFVSFVNQTIVPQGADMVFDYINAFREQANRLTAIGSVMLVVTSXMLIRTID
|||||||||||||||||||||||||||||||||||||||||||||||||||| |||||||
orf144-1 PVFDRWSDSFVSFVNQTIVPQGADMVFDYINAFREQANRLTAIGSVMLVVTSLMLIRTID
orf144a.pep NTFNRIWRVNSQRPWMMQFLVYWALLTFGPLSLGVGISFXVGSVQDAALASGAPQWSGAL
||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||
orf144-1 NTFNRIWRVNSQRPWMMQFLVYWALLTFGPLSLGVGISFMVGSVQDAALASGAPQWSGAL
orf144a.pep RTAATLXFMTLLLWGLYRXVPNRFVPARXAFVGALATAFCLETARSLFTWYMGNFDGYRS
||||||:||||||||||| |||||||||||||||||||||||||||||||||||||||||
orf144-1 RTAATLTFMTLLLWGLYRFVPNRFVPARQAFVGALATAFCLETARSLFTWYMGNFDGYRS
orf144a.pep IYGAFAAVPFFLLWLNLLWTLVLGGAVLTSSLSYWQGEAFRRXFDSRGRFDDVLKILLLL
|||||||||||||||||||||||||||||||||||||||||| |||||||||||||||||
orf144-1 IYGAFAAVPFFLLWLNLLWTLVLGGAVLTSSLSYWQGEAFRRGFDSRGRFDDVLKILLLL
orf144a.pep DAAQKEGXALPVQEFRRHINMGYDELGELLEKLARHGYIYSGRQGWVLKTGADSIELNEL
||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||
orf144-1 DAAQKEGKALPVQEFRRHINMGYDELGELLEKLARHGYIYSGRQGWVLKTGADSIELNEL
orf144a.pep FKLFVYRPLPVERDHVNQAVDAVMMPCLQTLNMTLAEFDAQAKKQQQS 408
|||||||||||||||||||||||| |||||||||||||||||||:|
orf144-1 FKLFVYRPLPVERDHVNQAVDAVMTPCLQTLNMTLAEFDAQAKKRQ 406
Homology with a predicted ORF from N. gonorrhoeae
ORF144 and the predicted ORF from N. gonorrhoeae (ORF144ng) show 91.2% identity over a 136-amino-acid overlap:
orf144.pep MTFLQRLQGLADNKICAFAWFVVRRFDEERVPQXAASMTFTTLLALVPVLTVMVAVASIF 60
||||| || ||||||||||||:|||||||||| ||||||||||||||||||||||||||
orf144ng MTFLQCWQGSADNKICAFAWFVIRRFSEERVPQAAASMTFTTLLALVPVLTVMVAVASIF 60
orf144.pep PVFDRWSDSFVSFVNQTIVPXGADMVFDYINAFREQANRLTAIGSVMLVVTSLMLIRTID 120
|||||||||||||||||||| |||||||||:|||:|||||||||||||||||||||||||
orf144ng PVFDRWSDSFVSFVNQTIVPQGADMVFDYIDAFRIQANRLTAIGSVMLVVTSLMLIRTID 120
orf144.pep NTFNRIWRVXXQRPWM 136
|:||||||| :|||||
orf144ng NAFNRIWRVNTQRPWMMQFLVYWALLTFGPLSLGVGISFMVGSVQDSVLSSGAQQWADAL 180
预计全长ORF144ng核苷酸序列<SEQ ID 567>编码的蛋白质具有氨基酸序列<SEQ ID 568>:1 MTFLQCWQGS ADNKICAFAW FVIRRFSEER VPQAAASMTF TTLLALVPVL51 TVMVAVASIF PVFDRWSDSF VSFVNQTIVP QGADMVFDYI DAFRDQANRL101 TAIGSVMLVV TSLMLIRTID NAFNRIWRVN TQRPWMMQFL VYWALLTFGP151 LSLGVGISFM VGSVQDSVLS SGAQQWADAL KTAARLAFMT LLLWGLYRFV201 PNRFVPARQA FVGALITAFC LETARFLFTW YMGNFDGYRS IYGAFAAVPF251 FLLWLNLLWT LVLGGAVLTS SLSYWQGEAF RRGFDSRGRF DDVLKILLLL301 DAAQKEGRTL SVQEFRRHIN MGYDELGELL EKLARYGYIY SGRQGWVLKT351 GADSIELSEL FKLFVYRPLP VERDHVNQAV DAVMTPCLQT LNMTLAEFDA 401 QAKKQQQS*进一步的工作揭示了下列淋球菌DNA序列<SEQ ID 569>:1 ATGACCTTTT TACAACGTTG GCAAGGTTTG GCGGACAATA AAATCTGTGC51 ATTTGCATGG TTCGTCATCC GCCGTTTCAG TGAAGAGCGC GTACCGCAGG101 CAGCGGCGAG CATGACGTTT ACGACACTGC TGGCACTCGT CCCCGTACTG151 ACCGTAATGG TCGCGGTCGC TTCGATTTTC CCCGTGTTCG ACCGCTGGTC201 GGATTCGTTC GTCTCCTTCG TCAACCAAAC CATTGTGCCG CAGGGCGCGG251 ATATGGTGTT CGACTATATC GACGCATTCC GCGATCAGGC AAACCGGCTG301 ACCGCCATCG GCAGCGTGAT GCTGGTCGTA ACCTCGCTGA TGCTGATTCG351 GACGATAGAC AATGCGTTCA ACCGCATCTG GCGGGTTAAC ACGCAACGCC401 CCTGGATGAT GCAGTTCCTC GTTTATTGGG CGTTGCTGAC TTTCGGGCCT451 TTGTCTTTGG GTGTGGGCAT TTCCTTTATG GTCGGGTCGG TTCAAGACTC501 CGTACTCTCC TCCGGAGCGC AACAATGGGC GGACGCGTTG AAGACGGCGG551 CAAGGCTGGC TTTCATGACG CTTTTGCTGT GGGGGCTGTA CCGCTTCGTG601 CCCAACCGCT TCGTGCCCGC CCGGCAGGCG TTTGTCGGAG CTTTGATTAC651 GGCATTCTGC CTGGAGACGG CACGTTTCCT GTTCACCTGG TATATGGGCA701 ATTTCGACGG CTACCGCTCG ATTTACGGCG CATTTGCCGC CGTGCCGTTT751 TTCCTGCTGT GGTTAAACCT GCTGTGGACG CTGGTCTTGG GCGGGGCGGT801 GCTGACTTCG TCGCTGTCTT ATTGGCAGGG CGAGGCCTTC CGCAGGGGAT851 TCGACTCGCG CGGACGGTTT GACGACGTGT TGAAAATCCT GCTGCTTCTG901 GATGCGGCGC AAAAAGAAGG CCGAACCCTG TCCGTTCAGG AGTTCAGACG951 GCATATCAAT ATGGGTTACG ATGAATTGGG CGAGCTTTTG GAAAAGCTGG1001 CGCGGTACGG CTATATCTAT TCCGGCAGAC AGGGCTGGGT TTTGAAAACG1051 GGGGCGGATT CGATTGAGTT GAGCGAACTC TTCAAGCTCT TCGTGTACCG1101 CCCGTTGCct gtggaAAGGG ATCATGTGAA CCAAGCTGtc gaTGCGGTAA1151 TGAcgccgtG TTTGCAGACT TTGAACATGA CGCTGGCGGA GTTTGACGCT1201 CAGgcgAAAA AACAGCAGCA GTCTTGA
它编码ORF144ng的一个变体,该变体具有氨基酸序列<SEQ ID 570;ORF144ng-1>:1 MTFLQRWQGL ADNKICAFAW FVIRRFSEER VPQAAASMTF TTLLALVPVL51 TVMVAVASIF PVFDRWSDSF VSFVNQTIVP QGADMVFDYI DAFRDQANRL101 TAIGSVMLVV TSLMLIRTID NAFNRIWRVN TQRPWMMQFL VYWALLTFGP151 LSLGVGISFM VGSVQDSVLS SGAQQWADAL KTAARLAFMT LLLWGLYRFV201 PNRFVPARQA FVGALITAFC LETARFLFTW YMGNFDGYRS IYGAFAAVPF251 FLLWLNLLWT LVLGGAVLTS SLSYWQGEAF RRGFDSRGRF DDVLKILLLL301 DAAQKEGRTL SVQEFRRHIN MGYDELGELL EKLARYGYIY SGRQGWVLKT351 GADSIELSEL FKLFVYRPLP VERDHVNQAV DAVMTPCLQT LNMTLAEFDA401 QAKKQQQS*ORF144ng-1和ORF144-1在406个氨基酸的重叠区内显示出有94.1%的相同性:orf144ng-1.pep MTFLQRWQGLADNKICAFAWFVIRRFSEERVPQAAASMTFTTLLALVPVLTVMVAVASIF
|||||| |||||||||||||||:|||:|||||||||||||||||||||||||||||||||
orf144-1 MTFLQRLQGLADNKICAFAWFVVRRFDEERVPQAAASMTFTTLLALVPVLTVMVAVASIF
orf144ng-1.pep PVFDRWSDSFVSFVNQTIVPQGADMVFDYIDAFRDQANRLTAIGSVMLVVTSLMLIRTID
||||||||||||||||||||||||||||||:|||:|||||||||||||||||||||||||
orf144-1 PVFDRWSDSFVSFVNQTIVPQGADMVFDYINAFREQANRLTAIGSVMLVVTSLMLIRTID
orf144ng-1.pep NAFNRIWRVNTQRPWMMQFLVYWALLTFGPLSLGVGISFMVGSVQDSVLSSGAQQWADAL
|:||||||||:|||||||||||||||||||||||||||||||||||||||||||||||||
orf144-1 NTFNRIWRVNSQRPWMMQFLVYWALLTFGPLSLGVGISFMVGSVQDAALASGAPQWSGAL
orf144ng-1.pep KTAARLAFMTLLLWGLYRFVPNRFVPARQAFVGALITAFCLETARFLFTWYMGNFDGYRS
:||| |:|||||||||||||||||||||||||||| ||||||||| ||||||||||||||
orf144-1 RTAATLTFMTLLLWGLYRFVPNRFVPARQAFVGALATAFCLETARSLFTWYMGNFDGYRS
orf144ng-1.pep IYGAFAAVPFFLLWLNLLWTLVLGGAVLTSSLSYWQGEAFRRGFDSRGRFDDVLKILLLL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf144-1 IYGAFAAVPFFLLWLNLLWTLVLGGAVLTSSLSYWQGEAFRRGFDSRGRFDDVLKILLLL
orf144ng-1.pep DAAQKEGRTLSVQEFRRHINMGYDELGELLEKLARYGYIYSGRQGWVLKTGADSIELSEL
|||||||::| ||||||||||||||||||||||||:|||||||||||||||||||||:||
orf144-1 DAAQKEGKALPVQEFRRHINMGYDELGELLEKLARHGYIYSGRQGWVLKTGADSIELNEL
orf144ng-1.pep FKLFVYRPLPVERDHVNQAVDAVMTPCLQTLNMTLAEFDAQAKKQQQS
||||||||||||||||||||||||||||||||||||||||||||:|
orf144-1 FKLFVYRPLPVERDHVNQAVDAVMTPCLQTLNMTLAEFDAQAKKRQ
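Based on this analysis, including the identification of several putative transmembrane domains in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.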
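The text does not say how the putative transmembrane domains were identified. One common first-pass approach, shown here purely as an illustrative assumption, is a sliding-window average of Kyte-Doolittle hydropathy values. The 19-residue window and the 1.6 cutoff are conventional choices rather than values taken from the patent, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_means(protein: str, window: int = 19) -> list:
    """Mean hydropathy of every full window; unknown residues score 0."""
    seq = "".join(protein.split()).rstrip("*")
    scores = [KD.get(res, 0.0) for res in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_tm_starts(protein: str, window: int = 19,
                        threshold: float = 1.6) -> list:
    """1-based start positions of windows at or above the threshold."""
    return [
        i + 1
        for i, mean in enumerate(window_means(protein, window))
        if mean >= threshold
    ]


# Residues 41-59 of ORF144-1, as listed in this example.
segment = "TTLLALVPVLTVMVAVASI"
print(window_means(segment)[0], candidate_tm_starts(segment))
```

For this 19-residue segment the mean hydropathy is about 2.29, above the assumed cutoff, so the window would be flagged as a candidate membrane-spanning stretch. Dedicated predictors would be needed to confirm any such assignment.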
Example 75
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 571>:1 ..AGACACGCCC GCCGCATCCG CATCGACACC GCCATCAACC CCGAACTGGA51 AGCCCTCGCC GAACACCTCC ACTACCAATG GCAGGGCTTC CTCTGGCTCA101 GCACCGATAT GCGTCAGGAA ATTTCCGCCC TCGTCATCCT GCTGCAACGC151 ACCCGCCGCA AATGGCTGGA TGCCCACGAA CGCCAACACC TGCGCCAAAG201 CCTGCTTGAA ACACGGGAAC ACGGCTGA它对应于氨基酸序列<SEQ ID 572;ORF146>:1 ..RHARRIRIDT AINPELEALA EHLHYQWQGF LWLSTDMRQE ISALVILLQR51 TRRKWLDAHE RQHLRQSLLE TREHG*进一步的工作揭示了完整的核苷酸序列<SEQ ID 573>:1 ATGAACACCT CGCAACGCAA CCGCCTCGTC AGCCGCTGGC TCAACTCCTA51 CGAACGCTAC CGCTACCGCC GCCTCATCCA CGCCGTCCGG CTCGGCGGGG101 CCGTCCTGTT CGCCACCGCC TCCGCCCGGC TGCTCCACCT CCAACACGGC151 GAGTGGATAG GGATGACCGT CTTCGTCGTC CTCGGCATGC TCCAGTTTCA201 AGGGGCGATT TACTCCAAGG CGGTGGAACG TATGCTCGGC ACGGTCATCG251 GGCTGGGCGC GGGTTTGGGC GTTTTATGGC TGAACCAGCA TTATTTCCAC301 GGCAACCTCC TCTTCTACCT CACCGTCGGC ACGGCAAGCG CACTGGCCGG351 CTGGGCGGCG GTCGGCAAAA ACGGCTACGT CCCTATGCTG GCAGGGCTGA401 CGATGTGTAT GCTCATCGGC GACAACGGCA GCGAATGGCT CGACAGCGGA451 CTCATGCGCG CCATGAACGT CCTCATCGGC GCGGCCATCG CCATCGCCGC501 CGCCAAACTG CTGCCGCTGA AATCCACACT GATGTGGCGT TTCATGCTTG551 CCGACAACCT GGCCGACTGC AGCAAAATGA TTGCCGAAAT CAGCAACGGC601 AGGCGCATGA CCCGCGAACG CCTCGAGGAG AACATGGCGA AAATGCGCCA651 AATCAACGCA CGCATGGTCA AAAGCCGCAG CCATCTCGCC GCCACATCGG701 GCGAAAGCCG CATCAGCCCC GCCATGATGG AAGCCATGCA GCACGCCCAC751 CGTAAAATCG TCAACACCAC CGAGCTGCTC CTGACCACCG CCGCCAAGCT801 GCAATCTCCC AAACTCAACG GCAGCGAAAT CCGGCTGCTT GACCGCCACT851 TCACACTGCT CCAAACCGAC CTGCAACAAA CCGTCGCCCT TATCAACGGC901 AGACACGCCC GCCGCATCCG CATCGACACC GCCATCAACC CCGAACTGGA951 AGCCCTCGCC GAACACCTCC ACTACCAATG GCAGGGCTTC CTCTGGCTCA1001 GCACCAATAT GCGTCAGGAA ATTTCCGCCC TCGTCATCCT GCTGCAACGC1051 ACCCGCCGCA AATGGCTGGA TGCCCACGAA CGCCAACACC TGCGCCAAAG1101 CCTGCTTGAA ACACGGGAAC ACGGCTGA它对应于氨基酸序列<SEQ ID 574;ORF146-1>:1 MNTSQRNRLV SRWLNSYERY RYRRLIHAVR LGGAVLFATA SARLLHLQHG51 EWIGMTVFVV LGMLQFQGAI YSKAVERMLG TVIGLGAGLG VLWLNQHYFH101 GNLLFYLTVG TASALAGWAA VGKNGYVPML AGLTMCMLIG DNGSEWLDSG151 LMRAMNVLIG AAIAIAAAKL LPLKSTLMWR 
FMLADNLADC SKMIAEISNG
201 RRMTRERLEE NMAKMRQINA RMVKSRSHLA ATSGESRISP AMMEAMQHAH
251 RKIVNTTELL LTTAAKLQSP KLNGSEIRLL DRHFTLLQTD LQQTVALING
301 RHARRIRIDT AINPELEALA EHLHYQWQGF LWLSTNMRQE ISALVILLQR
351 TRRKWLDAHE RQHLRQSLLE TREHG*
Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF146 and the ORF from N. meningitidis strain A (ORF146a) show 98.6% identity over a 74-amino-acid overlap:
10 20 30
orf146.pep RHARRIRIDTAINPELEALAEHLHYQWQGF
||||||||||||||||||||||||||||||
orf146a KLNGSEIRLLDRHFTLLQTDLQQTVALINGRHARRIRIDTAINPELEALAEHLHYQWQGF
280 290 300 310 320 330
40 50 60 70
orf146.pep LWLSTDMRQEISALVILLQRTRRKWLDAHERQHLRQSLLETREHGX
|||||:||||||||||||||||||||||||||||||||||||||:
orf146a LWLSTNMRQEISALVILLQRTRRKWLDAHERQHLRQSLLETREHSX
340 350 360 370全长ORF146a核苷酸序列<SEQ ID 575>是:1 ATGAACACCT CGCAACGCAA CCGCCTCGTC AGCCGCTGGC TCAACTCCTA51 CGAACGCTAC CGCTACCGCC GCCTCATCCA CGCCGTCCGG CTCGGCGGGG101 CCGTCCTGTT CGCCACCGCC TCCGCCCGGC TGCTCCACCT CCAACACGGC151 GAGTGGATAG GGATGACCGT CTTCGTCGTC CTCGGCATGC TCCAGTTTCA201 AGGGGCGATT TACTCCAAGG CGGTGGAACG TATGCTCGGC ACGGTCATCG251 GGCTGGGCGC GGGTTTGGGC GTTTTATGGC TGAACCAGCA TTATTTCCAC301 GGCAACCTCC TCTTCTACCT CACCGTCGGC ACGGCAAGCG CACTGGCCGG351 CTGGGCGGCG GTCGGCAAAA ACGGCTACGT CCCTATGCTG GCGGGGCTGA401 CGATGTGCAT GCTCATCGGC GACAACGGCA GCGAATGGTT CGACAGCGGC451 CTGATGCGCG CGATGAACGT CCTCATCGGC GCGGCCATCG CCATCGCCGC501 CGCCAAACTG CTGCCGCTGA AATCCACACT GATGTGGCGT TTCATGCTTG551 CCGACAACCT GACCGACTGC AGCAAAATGA TTGCCGAAAT CAGCAACGGC601 AGGCGCATGA CCCGCGAACG CCTCGAAGAG AACATGGCGA AAATGCGCCA651 AATCAACGCA CGCATGGTCA AAAGCCGCAG CCACCTCGCC GCCACATCGG701 GCGAAAGCCG CATCAGCCCC GCCATGATGG AAGCCATGCA GCACGCCCAC751 CGTAAAATTG TCAACACCAC CGAGCTGCTC CTGACCACCG CCGCCAAGCT801 GCAATCTCCC AAACTCAACG GCAGCGAAAT CCGGCTGCTT GACCGCCACT851 TCACACTGCT CCAAACCGAC CTGCAACAAA CCGTCGCCCT TATCAACGGC901 AGACACGCCC GCCGCATCCG CATCGACACC GCCATCAACC CCGAACTGGA951 AGCCCTCGCC GAACACCTCC ACTACCAATG GCAGGGCTTC CTCTGGCTCA1001 GCACCAATAT GCGTCAGGAA ATTTCCGCCC TCGTCATCCT GCTGCAACGC1051 ACCCGCCGCA AATGGCTGGA TGCCCACGAA CGCCAACACC TGCGCCAAAG1101 CCTGCTTGAA ACACGGGAAC ACAGTTGA它编码的蛋白质具有氨基酸序列<SEQ ID 576>:1 MNTSQRNRLV SRWLNSYERY RYRRLIHAVR LGGAVLFATA SARLLHLQHG51 EWIGMTVFVV LGMLQFQGAI YSKAVERMLG TVIGLGAGLG VLWLNQHYFH101 GNLLFYLTVG TASALAGWAA VGKNGYVPML AGLTMCMLIG DNGSEWFDSG151 LMRAMNVLIG AAIAIAAAKL LPLKSTLMWR FMLADNLTDC SKMIAEISNG201 RRMTRERLEE NMAKMRQINA RMVKSRSHLA ATSGESRISP AMMEAMQHAH251 RKIVNTTELL LTTAAKLQSP KLNGSEIRLL DRHFTLLQTD LQQTVALING301 RHARRIRIDT AINPELEALA EHLHYQWQGF LWLSTNMRQE ISALVILLQR351 TRRKWLDAHE RQHLRQSLLE TREHS*ORF146a和ORF146-1在374个氨基酸的重叠区内显示出有99.5%的相同性:orf146a.pep MNTSQRNRLVSRWLNSYERYRYRRLIHAVRLGGAVLFATASARLLHLQHGEWIGMTVFVV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf146-1 MNTSQRNRLVSRWLNSYERYRYRRLIHAVRLGGAVLFATASARLLHLQHGEWIGMTVFVVorf146a.pep LGMLQFQGAIYSKAVERMLGTVIGLGAGLGVLWLNQHYFHGNLLFYLTVGTASALAGWAA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf146-1 LGMLQFQGAIYSKAVERMLGTVIGLGAGLGVLWLNQHYFHGNLLFYLTVGTASALAGWAAorf146a.pep VGKNGYVPMLAGLTMCMLIGDNGSEWFDSGLMRAMNVLIGAAIAIAAAKLLPLKSTLMWR
||||||||||||||||||||||||||:|||||||||||||||||||||||||||||||||orf146-1 VGKNGYVPMLAGLTMCMLIGDNGSEWLDSGLMRAMNVLIGAAIAIAAAKLLPLKSTLMWRorf146a.pep FMLADNLTDCSKMIAEISNGRRMTRERLEENMAKMRQINARMVKSRSHLAATSGESRISP
|||||||:||||||||||||||||||||||||||||||||||||||||||||||||||||orf146-1 FMLADNLADCSKMIAEISNGRRMTRERLEENMAKMRQINARMVKSRSHLAATSGESRISPorf146a.pep AMMEAMQHAHRKIVNTTELLLTTAAKLQSPKLNGSEIRLLDRHFTLLQTDLQQTVALING
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf146-1 AMMEAMQHAHRKIVNTTELLLTTAAKLQSPKLNGSEIRLLDRHFTLLQTDLQQTVALINGorf146a.pep RHARRIRIDTAINPELEALAEHLHYQWQGFLWLSTNMRQEISALVILLQRTRRKWLDAHE
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf146-1 RHARRIRIDTAINPELEALAEHLHYQWQGFLWLSTNMRQEISALVILLQRTRRKWLDAHEorf146a.pep RQHLRQSLLETREHSX
||||||||||||||:orf146-1 RQHLRQSLLETREHGX
Homology with a predicted ORF from N. gonorrhoeae
ORF146 shows 97.3% identity over a 75 aa overlap with a predicted ORF (ORF146ng) from N. gonorrhoeae:
orf146.pep                                RHARRIRIDTAINPELEALAEHLHYQWQGF  30
                                          ||||||||||||||||||||||||||||||
orf146ng    KLNGSEIRLLDRHFTLLQTDLQQTAALINGRHARRIRIDTAINPELEALAEHLHYQWQGF 364
orf146.pep  LWLSTDMRQEISALVILLQRTRRKWLDAHERQHLRQSLLETREHG  75
            |||||:|||||||||| ||||||||||||||||||||||||||||
orf146ng    LWLSTNMRQEISALVIPLQRTRRKWLDAHERQHLRQSLLETREHG 409
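The identity figure quoted for this overlap can be checked directly from the aligned residues. The following is a minimal sketch, not part of the original analysis: the function name `percent_identity` is an assumption of this example, and it presumes a gap-free alignment in which both strings have the same length, as is the case for the 75 aa overlap between ORF146 and ORF146ng.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions carrying the same residue (gap-free alignment)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# The 75 aa overlap, copied from the ORF146 / ORF146ng alignment.
ORF146 = (
    "RHARRIRIDTAINPELEALAEHLHYQWQGF"
    "LWLSTDMRQEISALVILLQRTRRKWLDAHERQHLRQSLLETREHG"
)
ORF146NG = (
    "RHARRIRIDTAINPELEALAEHLHYQWQGF"
    "LWLSTNMRQEISALVIPLQRTRRKWLDAHERQHLRQSLLETREHG"
)

# 73 of the 75 positions match (the D/N and L/P differences), i.e. 97.3%.
print(f"{percent_identity(ORF146, ORF146NG):.1f}%")
```

Alignments that contain gaps would need a convention for how gap columns are counted; that choice is not specified in the text and is left out of this sketch.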
预计ORF146ng核苷酸序列<SEQ ID 577>编码的蛋白质具有氨基酸序列<SEQ ID578>:1 MSGVRFPSPA PIPSTDPPSG SLCFFTFPLQ TASDMNSSQR KRLSGRWLNS51 YERYRHRRLI HAVRLGGTVL FATALARLLH LQHGEWIGMT VFVVLGMLQF101 QGAIYSNAVE RMLGTVIGLG AGLGVLWLNQ HYFHGNLLFY LTIGTASALA151 GWAAVGKNGY VPMLAGLTMC MLIGDNGSEW LDSGLMRAMN VLIGAAIAIA201 AAKLLPLKST LMWRFMLADN LADCSKMIAE ISNGRRMTRE RLEQNMVKMR251 QINARMVKSR SHLAATSGES RISPSMMEAM QHAHRKIVNT TELLLTTAAK301 LQSPKLNGSE IRLLDRHFTL LQTDLQQTAA LINGRHARRI RIDTAINPEL351 EALAEHLHYQ WQGFLWLSTN MRQEISALVI PLQRTRRKWL DAHERQHLRQ401 SLLETREHG*进一步的工作揭示了下列淋球菌DNA序列<SEQ ID 579>:1 ATGAACTCCT CGCAACGCAA ACGCCTTTCC GgccGCTGGC TCAACTCCTA51 CGAACGCTac cGCCaccGCC GCCTCATACA TGCCGTGCGG CTCGGCggaa101 ccgtCCTGTT CGCCACCGCA CTCGCCCGgc tACTCCACCT CCAacacggc151 gAATGGATAG GGAtgaCCGT CTTCGTCGTC CTCGGCATGC TCCAGTTCCA201 AGGCgcgatt tActccaacg cggtgGAacg taTGctcggt acggtcatcg251 ggctgGGCGC GGGTTTGGgc gTTTTATGGC TGAACCAGCA TTAtttccac301 ggcaacCTcc tcttctacct gaccatcggc acggcaagcg cactggccgg351 ctGGGCGGCG GTCGGCAAAA acggctacgt ccctatgctg GCGGGGctgA401 CGATGTGCAT gctcatcggc gACAACGGCA GCGAATGGCT CGACAGCGGC451 CTGATGCGCG CGATGAACGT CCTCATCGGC GCCGCCATCG CCATTGCCGC 501 CGCCAAACTG CTGCCGCTGA AATCCACACT GATGTGGCGT TTCATGCTTG551 CCGACAACCT GGCCGACTGC AGCAAAATGA TTGCCGAAAT CAGCAACGGC601 AGGCGTATGA CGCGCGAACG TTTGGAGCAG AATATGGTCA AAATGCGCCA651 AATCAACGCA CGCATGGTCA AAAGCCGCAG CCACCTCGCC GCCACATCGG701 GCGAAAGCCG CATCAGCCCC TCCATGATGG AAGCCATGCA GCACGCCCAC751 CGCAAAATCG TCAACACCAC CGAGCTGCTC CTGACCACCG CCGCCAAGCT801 GCAATCTCCC AAACTCAACG GCAGCGAAAT CCGGCTGCTC GACCGCCACT851 TCACACTGCT CCAAACCGAC CTGCAACAAA CCGCCGCCCT CATCAACGGC901 AGACACGCCC GCCGCATCCG CATCGACACC GCCATCAACC CCGAACTGGA951 AGCCCTCGCC GAACACCTCC ACTACCAATG GCAGGGCTTC CTCTGGCTCA1001 GCACCAATAT GCGTCAGGAA ATTTCCGCCC TCGTCATCCT GCTGCAACGC1051 ACCCGCCGCA AATGGCTGGA TGCCCACGAA CGCCAACACC TGCGCCAAAG1101 CCTGCTTGAA ACACGGGAAC ACGGCTGA它对应于氨基酸序列<SEQ ID 580;ORF146ng-1>:1 MNSSQRKRLS GRWLNSYERY RHRRLIHAVR LGGTVLFATA LARLLHLQHG51 EWIGMTVFVV LGMLQFQGAI YSNAVERMLG 
TVIGLGAGLG VLWLNQHYFH101 GNLLFYLTIG TASALAGWAA VGKNGYVPML AGLTMCMLIG DNGSEWLDSG151 LARAMNVLIG AAIAIAAAKL LPLKSTLMWR FMLADNLADC SKMIAEISNG201 RRMTRERLEQ NMVKMRQINA RMVKSRSHLA ATSGESRISP SMMEAMQHAH251 RKIVNTTELL LTTAAKLQSP KLNGSEIRLL DRHFTLLQTD LQQTAALING301 RHARRIRIDT AINPELEALA EHLHYQWQGF LWLSTNMRQE ISALVILLQR351 TRRKWLDAHE RQHLRQSLLE TREHG*ORF146ng-1和ORF146-1在375个氨基酸的重叠区内显示出有96.5%的相同性:orf146-1.pep MNTSQRNRLVSRWLNSYERYRYRRLIHAVRLGGAVLFATASARLLHLQHGEWIGMTVFVV
||:|||:|| :||||||||||:|||||||||||:|||||| |||||||||||||||||||orf146ng-1 MNSSQRKRLSGRWLNSYERYRHRRLIHAVRLGGTVLFATALARLLHLQHGEWIGMTVFVVorf146-1.pep LGMLQFQGAIYSKAVERMLGTVIGLGAGLGVLWLNQHYFHGNLLFYLTVGTASALAGWAA
||||||||||||:|||||||||||||||||||||||||||||||||||:|||||||||||orf146ng-1 LGMLQFQGAIYSNAVERMLGTVIGLGAGLGVLWLNQHYFHGNLLFYLTIGTASALAGWAAorf146-1.pep VGKNGYVPMLAGLTMCMLIGDNGSEWLDSGLMRAMNVLIGAAIAIAAAKLLPLKSTLMWR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf146ng-1 VGKNGYVPMLAGLTMCMLIGDNGSEWLDSGLMRAMNVLIGAAIAIAAAKLLPLKSTLMWRorf146-1.pep FMLADNLADCSKMIAEISNGRRMTRERLEENMAKMRQINARMVKSRSHLAATSGESRISP
|||||||||||||||||||||||||||||:||:|||||||||||||||||||||||||||orf146ng-1 FMLADNLADCSKMIAEISNGRRMTRERLEQNMVKMRQINARMVKSRSHLAATSGESRISPorf146-1.pep AMMEAMQHAHRKIVNTTELLLTTAAKLQSPKLNGSEIRLLDRHFTLLQTDLQQTVALING
:|||||||||||||||||||||||||||||||||||||||||||||||||||||:|||||orf146ng-1 SMMEAMQHAHRKIVNTTELLLTTAAKLQSPKLNGSEIRLLDRHFTLLQTDLQQTAALINGorf146-1.pep RHARRIRIDTAINPELEALAEHLHYQWQGFLWLSTNMRQEISALVILLQRTRRKWLDAHE
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf146ng-1 RHARRIRIDTAINPELEALAEHLHYQWQGFLWLSTNMRQEISALVILLQRTRRKWLDAHEorf146-1.pep RQHLRQSLLETREHGX
||||||||||||||||orf146ng-1 RQHLRQSLLETREHGX
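In addition, ORF146ng-1 shows homology with a hypothetical E. coli protein:
sp|P33011|YEEA_ECOLI HYPOTHETICAL 40.0 KD PROTEIN IN COBU-SBMC INTERGENIC REGION
>gi|1736674|gnl|PID|d1016553 (D90838) ORF_ID:o348#20; similar to [SwissProt Accession Number P33011] [Escherichia coli]
>gi|1736682|gnl|PID|d1016560 (D90839) ORF_ID:o348#20; similar to [SwissProt Accession Number P33011] [Escherichia coli]
>gi|1788318 (AE000292) f352; 100% identical to fragment YEEA_ECOLI SW: P33011 but has 203 additional C-terminal residues [Escherichia coli] Length = 352
Score = 109 bits (271), Expect = 2e-23
Identities = 89/347 (25%), Positives = 150/347 (42%), Gaps = 21/347 (6%)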
Query: 20 YRHRRLIHAVRLGGTVLFATALARLLHLQHGEWIGMTVFVVLGMLQFQGAIYSNAVERML 79
YRH R++H R+ L + RL + W +T+ V++G + F G + A ER+
Sbjct: 15 YRHYRIVHGTRVALAFLLTFLIIRLFTIPESTWPLVTMVVIMGPISFWGNVVPRAFERIG 74
Query: 80 GTVIGLGAGLGVLWLNQHYFHGNLLFYLTIGTASALAGWAAVGKNGYVPMLAGLTMCMLI 139
GTV+G GL L L L + A L GW A+GK Y +L G+T+ +++
Sbjct: 75 GTVLGSILGLIALQLE---LISLPLMLVWCAAAMFLCGWLALGKKPYQGLLIGVTLAIVV 131
Query: 140 GDNGSEWLDSGLMRAMNVLIGXXXXXXXXKLLPLKSTLMWRFMLADNLADCSKMIAEISN 199
G E +D+ L R+ +V++G + P ++ + WR LA +L + +++ +
Sbjct: 132 GSPTGE-IDTALWRSGDVILGSLLAMLFTGIWPQRAFIHWRIQLAKSLTEYNRVYQSAFS 190
Query: 200 GRRMTRERLEQNMVKMRQINARMVKSRSHLAATSGESRISPSMMEAMQHAHRKIVNXXXX 259
+ R RLE ++ K+ VK R +A S E+RI S+ E +Q +R +V
Sbjct: 191 PNLLERPRLESHLQKLL---TDAVKMRGLIAPASKETRIPKSIYEGIQTINRNLVCMLEL 247
Query: 260 XXXXXXXXQSPK---LNGSEIRLLDRHFXXXXXXXXXXAALINGRHARRIRIDTAINPEL 316
+ LN ++R D AL G +N +
Sbjct: 248 QINAYWATRPSHFVLLNAQKLR--DTQHMMQQILLSLVHALYEGNPQPVFANTEKLNDAV 305
Query: 317 EALAEHL--HYQWQ-------GFLWLSTNMRQEISALVILLQRTRRK 354
E L + L H+ + G++WL+ ++ L L+ R RK
Sbjct: 306 EELRQLLNNHHDLKVVETPIYGYVWLNMETAHQLELLSNLICRALRK 352
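Based on this analysis, including the identification of several transmembrane domains in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.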
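The text does not say which method was used to identify the transmembrane domains. As an illustration only, the sketch below applies a sliding-window Kyte-Doolittle hydropathy average, one common way to flag candidate membrane-spanning stretches; it is a swapped-in technique, not the analysis actually performed. The function name `hydropathy_profile`, the 19-residue window, and the choice of the two test segments are assumptions of this example. Windows whose mean hydropathy is around 1.6 or higher are conventionally treated as candidates.

```python
# Kyte-Doolittle hydropathy scale (Kyte & Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 19) -> list[float]:
    """Mean Kyte-Doolittle hydropathy for every full window along the sequence."""
    scores = [KD[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Two 19-residue stretches taken from ORF146ng-1 (SEQ ID 580).
HYDROPHOBIC = "WIGMTVFVVLGMLQFQGAI"  # residues 52-70
CHARGED = "RRMTRERLEQNMVKMRQIN"      # residues 201-219

for name, segment in (("52-70", HYDROPHOBIC), ("201-219", CHARGED)):
    mean = hydropathy_profile(segment)[0]
    print(f"residues {name}: mean hydropathy {mean:.2f}")
```

The first stretch scores strongly positive and the second strongly negative, which is the kind of contrast a hydropathy-based prediction relies on. A real prediction would scan the whole sequence and would normally be confirmed with a dedicated topology predictor.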
Example 76
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 581>1 ..GCCGAAGACA CGCGCGTTAC CGCACAGCTT TTGAGCGCGT ACGGCATTCA51 GGGCAAACTC GTCAGTGTGC GCGAACACAA CGAACGGCAG ATGGCGGACA101 AGATTGTCGG CTATCTTTCA GACGGCATGG TTGTGGCACA GGTTTCCGAT151 GCGGGTACGC CGGCCGTGTG CGACCCGGGC GCGAAACTCG CCCGCCGCGT201 GCGTGAGGCC GGGTTTAAAG TCGTTCCCGT CGTGGGCGCA AC.GCGGTGA251 TGGCGGCTTT GAGCGTGGCC GGTGTGGAAG GATCCGATTT TTATTTCAAC301 GGTTTTGTAC CGCCGAAATC GGGAGAACGC AGGAAACTGT TTGCCAAATG351 GGTGCGGGCG GCGTTTCCTA TCGTCATGTT TGAAACGCCG CACCGCATCG401 GTGCAGCGCT TGCCGATATG GCGGAACTGT TCCCCGAACG CCGATTAATG451 CTGGCGCGCG AAATTACGAA AACGTTTGAA ACGTTCTTAA GCGGCACGGT501 TGGGGAAATT CAGACGGCAT TGTCTGCCGA CGGCGACCAA TCGCGCGGCG551 AGATGGTGTT GGTGCTTTAT CCGGCGCAGG ATGAAAAACA CGAAGGCTTG601 TCCGAGTCCG CGCAAAACAT CATGAAAATC CTCACAGCCG AGCTGCCGAC651 CAAACAGGCG GCGGAGCTTG CTGCCAAAAT CACGGGCGAG GGAAAGAAAG701 CTTTGTACGA T..它对应于氨基酸序列<SEQ ID 582;ORF147>:1 ..AEDTRVTAQL LSAYGIQGKL VSVREHNERQ MADKIVGYLS DGMVVAQVSD51 AGTPAVCDPG AKLARRVREA GFKVVPVVGA XAVMAALSVA GVEGSDFYFN101 GFVPPKSGER RKLFAKWVRA AFPIVMFETP HRIGAALADM AELFPERRLM151 LAREITKTFE TFLSGTVGEI QTALSADGDQ SRGEMVLVLY PAQDEKHEGL201 SESAQNIMKI LTAELPTKQA AELAAKITGE GKKALYD..进一步的工作揭示了完整的核苷酸序列<SEQ ID 583>:1 ATGTTTCAGA AACATTTGCA GAAAGCCTCC GACAGCGTCG TCGGAGGGAC51 ATTATACGTG GTTGCCACGC CCATCGGCAA TTTGGCGGAC ATTACCCTGC101 GCGCTTTGGC GGTATTGCAA AAGGCGGACA TCATCTGTGC CGAAGACACG151 CGCGTTACCG CACAGCTTTT GAGCGCGTAC GGCATTCAGG GCAAACTCGT201 CAGTGTGCGC GAACACAACG AACGGCAGAT GGCGGACAAG ATTGTCGGCT251 ATCTTTCAGA CGGCATGGTT GTGGCACAGG TTTCCGATGC GGGTACGCCG301 GCCGTGTGCG ACCCGGGCGC GAAACTCGCC CGCCGCGTGC GTGAGGCCGG351 GTTTAAAGTC GTTCCCGTCG TGGGCGCAAG CGCGGTGATG GCGGCTTTGA401 GCGTGGCCGG TGTGGAAGGA TCCGATTTTT ATTTCAACGG TTTTGTACCG451 CCGAAATCGG GAGAACGCAG GAAACTGTTT GCCAAATGGG TGCGGGCGGC501 GTTTCCTATC GTCATGTTTG AAACGCCGCA CCGCATCGGT GCGACGCTTG551 CCGATATGGC GGAACTGTTC CCCGAACGCC GATTAATGCT GGCGCGCGAA601 ATTACGAAAA CGTTTGAAAC GTTCTTAAGC GGCACGGTTG GGGAAATTCA651 GACGGCATTG TCTGCCGACG GCAACCAATC GCGCGGCGAG 
ATGGTGTTGG701 TGCTTTATCC GGCGCAGGAT GAAAAACACG AAGGCTTGTC CGAGTCCGCG751 CAAAACATCA TGAAAATCCT CACAGCCGAG CTGCCGACCA AACAGGCGGC801 GGAGCTTGCT GCCAAAATCA CGGGCGAGGG AAAGAAAGCT TTGTACGATC851 TGGCTCTGTC TTGGAAAAAC AAATAG它对应于氨基酸序列<SEQ ID 584;ORF147-1>:1 MFQKHLQKAS DSVVGGTLYV VATPIGNLAD ITLRALAVLQ KADIICAEDT51 RVTAQLLSAY GIQGKLVSVR EHNERQMADK IVGYLSDGMV VAQVSDAGTP101 AVCDPGAKLA RRVREAGFKV VPVVGASAVM AALSVAGVEG SDFYFNGFVP151 PKSGERRKLF AKWVRAAFPI VMFETPHRIG ATLADMAELF PERRLMLARE201 ITKTFETFLS GTVGEIQTAL SADGNQSRGE MVLVLYPAQD EKHEGLSESA251 QNIMKILTAE LPTKQAAELA AKITGEGKKA LYDLALSWKN K*
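The correspondence between the nucleotide sequences and the amino acid sequences listed in this example can be reproduced with a standard-genetic-code translation. The sketch below is illustrative: the function name `translate` is an assumption of this example, and rendering any codon that contains an unresolved base (such as the `.` in SEQ ID 581) as `X` is inferred from the way the partial ORF147 sequence is written, not stated in the text.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; codons with unresolved bases become 'X'."""
    dna = "".join(dna.split()).upper()  # drop the spaces used in the listing
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


# First 30 nt of SEQ ID 583 give the first 10 residues of ORF147-1.
print(translate("ATGTTTCAGA AACATTTGCA GAAAGCCTCC"))  # MFQKHLQKAS
# Last 15 nt of SEQ ID 583, ending in the TAG stop codon.
print(translate("TGGAAAAACAAATAG"))                   # WKNK*
# In-frame stretch of SEQ ID 581 containing the unresolved base.
print(translate("GTGGGCGCAAC.GCGGTGATG"))             # VGAXAVM
```

The third call reproduces the `X` at position 81 of the partial ORF147 sequence (SEQ ID 582), which the complete sequence (SEQ ID 584) resolves as serine.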
Computer analysis of this amino acid sequence gave the following results:
Homology with hypothetical protein ORF286 of E. coli (accession number U18997)
ORF147 shows 36% aa identity over a 237 aa overlap with the ORF286 protein of E. coli:
Orf147: 1 AEDTRVTAQLLSAYGIQGKLVSVREHNERQMADKIVGYLSDGMVVAQVSDAGTPAVCDPG 60
AEDTR T LL +GI +L ++ +HNE+Q A+ ++ L +G +A VSDAGTP + DPG
Orf286: 43 AEDTRHTGLLLQHFGINARLFALHDHNEQQKAETLLAKLQEGQNIALVSDAGTPLINDPG 102
Orf147: 61 AKLARRVREXXXXXXXXXXXXXXXXXXXXXXXEGSDFYFNGFVPPKSGERRKLFAKWVRA 120
L R RE F + GF+P KS RR
Orf286: 103 YHLVRTCREAGIRVVPLPGPCAAITALSAAGLPSDRFCYEGFLPAKSKGRRDALKAIEAE 162
Orf147: 121 AFPIVMFETPHRIGAALADMAELFPERR-LMLAREITKTFETFLSGTVGEIQTALSADGD 179
++ +E+ HR+ +L D+ + E R ++LARE+TKT+ET VGE+ + D +
Orf286: 163 PRTLIFYESTHRLLDSLEDIVAVLGESRYVVLARELTKTWETIHGAPVGELLAWVKEDEN 222
Orf147: 180 QSRGEMVLVLYPAQDEKHEGLSESAQNIMKILTAELPTKQAAELAAKITGEGKKALY 236
+ +GEMVL++ + E L A + +L AELP K+AA LAA+I G K ALY
Orf286: 223 RRKGEMVLIV-EGHKAQEEDLPADALRTLALLQAELPLKKAAALAAEIHGVKKNALY 278
Homology with a predicted ORF from N. meningitidis (strain A)
ORF147 shows 96.6% identity over a 237 aa overlap with ORF75a from strain A of N. meningitidis:
10 20 30
orf147.pep                                AEDTRVTAQLLSAYGIQGKLVSVREHNERQ
                                          ||||||||||||||||||||||||||||||
orf75a      TLYVVATPIGNLADITLRALAVLQKADIICAEDTRVTAQLLSAYGIQGKLVSVREHNERQ
20 30 40 50 60 70
40 50 60 70 80 90
orf147.pep  MADKIVGYLSDGMVVAQVSDAGTPAVCDPGAKLARRVREAGFKVVPVVGAXAVMAALSVA
            |||||||||||||||||||||||||||||||||||||||:|||||||||| |||||||||
orf75a      MADKIVGYLSDGMVVAQVSDAGTPAVCDPGAKLARRVREVGFKVVPVVGASAVMAALSVA
80 90 100 110 120 130
100 110 120 130 140 150
orf147.pep  GVEGSDFYFNGFVPPKSGERRKLFAKWVRAAFPIVMFETPHRIGAALADMAELFPERRLM
            || ||||||||||||||||||||||||||:|||:|||||||||||:||||||||||||||
orf75a      GVAGSDFYFNGFVPPKSGERRKLFAKWVRVAFPVVMFETPHRIGATLADMAELFPERRLM
140 150 160 170 180 190
160 170 180 190 200 210
orf147.pep  LAREITKTFETFLSGTVGEIQTALSADGDQSRGEMVLVLYPAQDEKHEGLSESAQNIMKI
            ||||||||||||||||||||||||:|||:|||||||||||||||||||||||||||||||
orf75a      LAREITKTFETFLSGTVGEIQTALAADGNQSRGEMVLVLYPAQDEKHEGLSESAQNIMKI
200 210 220 230 240 250
220 230
orf147.pep  LTAELPTKQAAELAAKITGEGKKALYD
            |||||||||||||||||||||||||||
orf75a      LTAELPTKQAAELAAKITGEGKKALYDLALSWKNKX
260 270 280 290
ORF147a is identical to ORF75a, which includes amino acids 56-292 of ORF75.
Homology with a predicted ORF from N. gonorrhoeae
ORF147 shows 94.1% identity over a 237 aa overlap with a predicted ORF (ORF147ng) from N. gonorrhoeae:
orf147.pep                                AEDTRVTAQLLSAYGIQGKLVSVREHNERQ  30
                                          ||||||||||||||||||:|||||||||||
orf147ng    TLYVVATPIGNLADITLRALAVLQKADIICAEDTRVTAQLLSAYGIQGRLVSVREHNERQ  85
orf147.pep  MADKIVGYLSDGMVVAQVSDAGTPAVCDPGAKLARRVREAGFKVVPVVGAXAVMAALSVA  90
            ||||::|:||||:||||||||||||||||||||||||||||||||||||| |||||||||
orf147ng    MADKVIGFLSDGLVVAQVSDAGTPAVCDPGAKLARRVREAGFKVVPVVGASAVMAALSVA 145
orf147.pep  GVEGSDFYFNGFVPPKSGERRKLFAKWVRAAFPIVMFETPHRIGAALADMAELFPERRLM 150
            ||  |||||||||||||||||||||||||||||:|||||||||||:||||||||||||||
orf147ng    GVAESDFYFNGFVPPKSGERRKLFAKWVRAAFPVVMFETPHRIGATLADMAELFPERRLM 205
orf147.pep  LAREITKTFETFLSGTVGEIQTALSADGDQSRGEMVLVLYPAQDEKHEGLSESAQNIMKI 210
            ||||||||||||||||||||||||:|||:||||||||||||||||||||||||||| |||
orf147ng    LAREITKTFETFLSGTVGEIQTALAADGNQSRGEMVLVLYPAQDEKHEGLSESAQNAMKI 265
orf147.pep  LTAELPTKQAAELAAKITGEGKKALYD 237
            |:|||||||||||||||||||||||||
orf147ng    LAAELPTKQAAELAAKITGEGKKALYDLALSWKNK 300
预计ORF147ng核苷酸序列<SEQ ID 585>编码的蛋白质具有氨基酸序列<SEQ ID586>:1 MSVFQTAFFM FQKHLQKASD SVVGGTLYVV ATPIGNLADI TLRALAVLQK51 ADIICAEDTR VTAQLLSAYG IQGRLVSVRE HNERQMADKV IGFLSDGLVV101 AQVSDAGTPA VCDPGAKLAR RVREAGFKVV PVVGASAVMA ALSVAGVAES151 DFYFNGFVPP KSGERRKLFA KWVRAAFPVV MFETPHRIGA TLADMAELFP201 ERRLMLAREI TKTFETFLSG TVGEIQTALA ADGNQSRGEM VLVLYPAQDE251 KHEGLSESAQ NAMKILAAEL PTKQAAELAA KITGEGKKAL YDLALSWKNK301 *进一步的工作揭示了下列淋球菌DNA序列<SEQ ID 587>:1 ATGTTTCAGA AACACTTGCA GAAAGCCTCC GACAGCGTCG TCGGAGGGAC51 ATTATACGTG GTTGCCACGC CCATCGGCAA TTTGGCAGAC ATTACCCTGC101 GCGCTTTGGC GGTATTGCAA AAGGCGGACA TCATTTGTGC CGAAGACACG151 CGCGTTACTG CGCAGCTTTT GAGCGCGTAC GGCATTCAGG GCAGGTTGGT201 CAGTGTGCGC GAACACAACG AGCGGCAGAT GGCGGACAAG GTAATCGGTT251 TCCTTTCAGA CGGCCTGGTT GTGGCGCAGG TTTCCGATGC GGGTACGCCG301 GCCGTGTGCG ACCCGGGCGC GAAACTCGCC CGCCGCGTGC GCGAAGCAGG351 GTTCAAAGTC GTTCCCGTCG TGGGCGCAAG CGCGGTAATG GCGGCGTTGA401 GTGTGGCCGG TGTGGCGGAA TCCGATTTTT ATTTCAACGG TTTTGTACCG451 CCGAAATCGG GCGAACGTAG GAAATTGTTT GCCAAATGGG TGCGGGCGGC501 ATTTCCTGTC GTCATGTTTG AAACGCCGCA CCGAATCGGG GCAACGCTTG551 CCGATATGGC GGAATTGTTC CCCGAACGCC GTCTGATGCT GGCGCGCGAA601 ATCACGAAAA CGTTTGAAAC GTTCTTAAGC GGCACGGTTG GGGAAATTCA651 GACGGCATTG GCGGCGGACG GCAACCAATC GCGCGGCGAG ATGGTGTTGG701 TGCTTTATCC GGCGCAGGAT GAAAAACACG AAGGCTTGTC CGAGTCTGCG751 CAAAATGCGA TGAAAATCCT TGCGGCCGAG CTGCCGACCA AGCAGGCGGC801 GGAGCTTGCC GCCAAGATTA CAGGTGAGGG CAAAAAGGCT TTGTACGATT851 TGGCACTGTC GTGGAAAAAC AAATGA它对应于氨基酸序列<SEQ ID 588;ORF147ng-1>:1 MFQKHLQKAS DSVVGGTLYV VATPIGNLAD ITLRALAVLQ KADIICAEDT51 RVTAQLLSAY GIQGRLVSVR EHNERQMADK VIGFLSDGLV VAQVSDAGTP101 AVCDPGAKLA RRVREAGFKV VPVVGASAVM AALSVAGVAE SDFYFNGFVP151 PKSGERRKLF AKWVRAAFPV VMFETPHRIG ATLADMAELF PERRLMLARE201 ITKTFETFLS GTVGEIQTAL AADGNQSRGE MVLYLYPAQD EKHEGLSESA251 QNAMKILAAE LPTKQAAELA AKITGEGKKA LYDLALSWKN K*
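ORF147ng shows homology with a hypothetical E. coli protein:
sp|P45528|YRAL_ECOLI HYPOTHETICAL 31.3 KD PROTEIN IN AGAI-MTR INTERGENIC REGION (F286)
>gi|606086 (U18997) ORF_f286 [Escherichia coli]
>gi|1789535 (AE000395) hypothetical 31.3 kD protein in agai-mtr intergenic region [Escherichia coli] Length = 286
Score = 218 bits (550), Expect = 3e-56
Identities = 128/284 (45%), Positives = 171/284 (60%), Gaps = 4/284 (1%)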
Query: 4 KHLQKASDSVVGGTLYVVATPIGNLADITLRALAVLQKADIICAEDTRVTAQLLSAYGIQ 63
K Q A +S G LY+V TPIGNLADIT RAL VLQ D+I AEDTR T LL +GI
Sbjct: 2 KQHQSADNSQ--GQLYIVPTPIGNLADITQRALEVLQAVDLIAAEDTRHTGLLLQHFGIN 59
Query: 64 GRLVSVREHNERQMADKVIGFLSDGLVVAQVSDAGTPAVCDPGAKLARRVREAGFKVVPV 123
RL ++ +HNE+Q A+ ++ L +G +A VSDAGTP + DPG L R REAG +VVP+
Sbjct: 60 ARLFALHDHNEQQKAETLLAKLQEGQNIALVSDAGTPLINDPGYHLVRTCREAGIRVVPL 119
Query: 124 VGASAVMAALSVAGVAESDFYFNGFVPPKSGERRKLFAKWVRAAFPVVMFETPHRIGATL 183
G A + ALS AG+ F + GF+P KS RR ++ +E+ HR+ +L
Sbjct: 120 PGPCAAITALSAAGLPSDRFCYEGFLPAKSKGRRDALKAIEAEPRTLIFYESTHRLLDSL 179
Query: 184 ADMAELFPERR-LMLAREITKTFETFLSGTVGEIQTALAADGNQSRGEMVLVLYPAQDEK 242
D+ + E R ++LARE+TKT+ET VGE+ + D N+ +GEMVL++ +
Sbjct: 180 EDIVAVLGESRYVVLARELTKTWETIHGAPVGELLAWVKEDENRRKGEMVLIV-EGHKAQ 238
Query: 243 HEGLSESAQNAMKILAAELPTKQAAELAAKITGEGKKALYDLAL 286
E L A + +L AELP K+AA LAA+I G K ALY AL
Sbjct: 239 EEDLPADALRTLALLQAELPLKKAAALAAEIHGVKKNALYKYAL 282
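The percentages in the summary line of this E. coli comparison follow from the raw counts over the 284 aligned columns. The short sketch below recomputes them; the helper name `blast_percent` is hypothetical, and rounding down to a whole percent is an assumption that happens to reproduce the reported figures for this alignment.

```python
def blast_percent(count: int, aligned_length: int) -> int:
    """Whole-number percentage of aligned columns, rounded down (assumed convention)."""
    return count * 100 // aligned_length


ALIGNED_LENGTH = 284  # aligned columns, including the 4 gap positions
COUNTS = {"Identities": 128, "Positives": 171, "Gaps": 4}

for label, count in COUNTS.items():
    pct = blast_percent(count, ALIGNED_LENGTH)
    print(f"{label} = {count}/{ALIGNED_LENGTH} ({pct}%)")
```

Identities count columns with the same residue in both sequences, while positives also include the conservative substitutions marked `+` in the middle line of the alignment.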
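Based on the computer analysis and the presence of a putative transmembrane domain in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.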
Example 77
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 589> 1 ATGAAAACAA CCGACAAACG GACAACCGAA ACACACCGCA AAGCCCCGAA51 AACCGGTCGC ATCCGCTTCT C.GCTGCTTA CTTAGCCATA TGCCTGTCGT101 TCGGCATTCT TCCCCAAGCC TGGGCGGGAC ACACTTATTT CGGCATCAAC151 TACCAATACT ATCGCGACTT TGCCGAAAAT AAAGGCAAGT TTGCAGTCGG201 GGCGAAAGAT ATTGAGGTTT ACAACAAAAA AGGGGAGTTG GTCGGCAAAT251 CAATGACAAA AGCCCCGATG ATTGATTTTT CTGTGGTGTC GCGTAACGGC301 GTGGCGGcAT TGGTGGGCGt ATCAATATAT TGTGAGCGTG GCACATAACG351 GCGGCTATAA CAACGTTGAT TTTGGTGCGG AAGGAAk.AA tATCCC.GAT401 CAACAwCGww TTACTTATAA AATTGTGAAA CGGAATAATT ATAAAGCAGG451 GACTAAAGGC CATCCTTATG GCGGCGATTA TCATATGCCG CGTTTGCATA501 AATwTGTCAC AGATGCAGAA CCTGTTGAAA TGACCAGTTA TATGGATGGG551 CGGAAATATA TCGATCAAAA TAATTACCCT GACCGTGTTC GTATTGGGGC601 AGGCAGGCAA TATTGGCGAT CTGATGAAGA TGAGCCCAAT AACCGCGAAA651 GTTCATATCA TATTGCAAGT .......... .......... ..........701 .......... .....GGCTC ACCAATGTTT ATCTATGATG CCCAAAAGCA751 AAAGTGGTTA ATTAATGGGG TATTGCAAAC GGGCAACCCC TATATAGGAA801 AAAGCAATGG CTTCCAGCTG GTTCGTAAAG ATTGGTTCTA TGATGAAATC851 TTTGCTGGAG ATACCCATTC AGTATTCTAC GAACCACGTC AAAATGGGAA901 ATACTCTTTT AACGACGATA ATAATGGCAC AGGAAAAATC AATGCCAAAC951 ATGAACACAA TTCTCTGCCT AATAGATTAA AAACACGAAC CGTTCAATTG1001 TTTAATGTTT CTTTATCCGA GACAGCAAGA GAACCTGTTT ATCATGCTGC1051 AGGTGGTGTC AACAGTTATC GACCCAGACT GAATAATGGA GAAAATATTT1101 CCTTTATTGA CGAAGGAAAA GGCGAATTGA TACTTACCAG CAACATCAAT1151 CAAGGTGCTG GAGGATTATA TTTCCAAGGA GATTTTACGG TCTCGCCTGA1201 AAATAACGAA ACTTGGCAAG GCGCGGGCGT TCATATCAGT GAAGACAGTA1251 CCGTTACTTG GAAAGTAAAC GGCGTGGCAA ACGACCGCCT GTCCAAAATC1301 GGCAAAGGCA CGCTG..... .......... .......... ..........
//2101 .......... .......... .......... .......... ...GATAAAG2151 TGACTGCTTC ATTGACTAAG ACCGACATCA GCGGCAATGT CGATCTTGCC2201 GATCACGCTC ATTTAAATCT CACAGGGCTT GCCACACTCA ACGGCAATCT2251 TAGTGCAAAT GGCGATACAC GTTATACAGT CAGCCACAAC GCCACCCAAA2301 ACGGCAACCk TAgCCtCGtG G.sAATGcCC AAGCAACATT TAATCAAGCC2351 ACATTAAACG GCAACACATC GGCTTCgGGC AATGCTTCAT TTAATCTAAG2401 CGACCACGCC GTACAAAACG GCAGTCTGAC GCTTTCCGGC AACGCTAAGG2451 CAAACGTAAG CCATTCCGCA CTCAACGGTA ATGTCTCCCT AGCCGATAAG2501 GCAGTATTCC ATTTTGAAAG CAGCCGCTTT ACCGGACAAA TCAGCGGCGG2551 CAagGATACG GCATTACACT TAAAAGACAG CGAATGGACG CTGCCGTCAg2601 GarCGGAATT AGGCAATTTA AACCTTGACA ACGCCACCAT TACaCTCAAT2651 TCCGCCTATC GCCACGATGC GGCAGGGGCG CAAACCGGCA GTGCGACAGA2701 TGCGCCGCGC CGCCGTTCGC GCCGTTCGCG CCGTTCCCTA TTATmCGTTA2751 CACCGCCAAC TTCGGTAGAA TCCCGTTTCA ACACGCTGAC GGTAAACGGC2801 AAATTGAACG GTCAGGGAAC ATTCCGCTTT ATGTCGGAAC TCTTCGGCTA2851 CCGCAGCGAC AAATTGAAGC TGGCGGAAAG TTCCGAAGGC ACTTACACCT2901 TGGCGGTCAA CAATACCGGC AACGAACCTG CAAGCCTCGA ACAATTGACG2951 GTAGTGGAAG GAAAAGACAA CAAACCGCTG TCCGAAAACC TTAATTTCAC3001 CCTGCAAAAC GAACACGTCG ATGCAGGCGC GTGG...... ..........
//3551 .......... .......... ....TTAGAC CGCGTATTTG CCGAAGACCG3601 CCGCAACGCC GTTTGGACAA GCGGCATCCG GGACACCAAA CACTACCGTT3651 CGCAAGATTT CCGCGCCTAC CGCCAACAAA CCGACCTGCG CCAAATCGGT3701 ATGCAGAAAA ACCTCGGCAG CGGGCGCGTC GCCATCCTGT TTTCGCACAA3751 CCGGACCGAA AACACCTTCG ACGACGGCAT CGGCAACTCG GCACGGCTTG3801 CCCACGGCGC CGTTTTCGGG CAATACGGCA TCGACAGGTT CTACATCGGC3851 ATCAGnCGCG GGCGCGGGTT TTAGCAGCGG CAGCCTTTcA GACGGCATCG3901 GAGsmAAAwT CCGCCGCCGC GTGCtGCATT ACGGCATTCA GGCACGAtAC3951 CGCGCCGgtt tCggCGgATt CGGCATCGAA CCGCACATCG GCGCAACGCg4001 ctATTTCGTC CAAAAAGCGG ATTACCGCTA CGAAAACGTC AATATCGCCA4051 CCCCCGGCCT TGCATTCAAC CGcTACCGCG CGGGCATTAa GGCAGATTAT4101 TCATTCAAAC CGGCGCAACA CATTTCCATC ACGCCTTATT TGAGCCTGTC4151 CTATACCGAT GCCGCTTCGG GCAAAGTCCG AACACGCGTC AATACCGCCG4201 TATTGGCTCA GGATTTCGGC AAAACCCGCA GTGCGGAATG GGgCGTAAAC4251 GCCGAAATCA AAGGTTTCAC GCTGTCCCTC CACGCTGCCG CCGCCAAAGG4301 CCCGCAACTG GAAGCGCAAC ACAGCGCGGG CATCAAATTA GGCTACCGCT4351 GGTAA...它对应于氨基酸序列<SEQ ID 590;ORFI>:1 MKTTDKRTTE THRKAPKTGR IRFXAAYLAI CLSFGILPQA WAGHTYFGIN51 YQYYRDFAEN KGKFAVGAKD IEVYNKKGEL VGKSMTKAPM IDFSVVSRNG101 VAALVGYQYI VSVAHNGGYN NVDFGAEGXN IXDQXRXTYK IVKRNNYKAG151 TKGHPYGGDY HMPRLHKXVT DAEPVEMTSY MDGRKYIDQN NYPDRVRIGA201 GRQYWRSDED EPNNRESSYH IAS....... ........GS PMFIYDAQKQ251 KWLINGVLQT GNPYIGKSNG FQLVRKDWFY DEIFAGDTHS VFYEPRQNGK301 YSFNDDNNGT GKINAKHEHN SLPNRLKTRT VQLFNVSLSE TAREPVYHAA351 GGVNSYRPRL NNGENISFID EGKGELILTS NINQGAGGLY FQGDFTVSPE401 NNETWQGAGV HISEDSTVTW KVNGVANDRL SKIGKGTL.. ..........
//701 .......... ....DKVTAS LTKTDISGNV DLADHAHLNL TGLATLNGNL751 SANGDTRYTV SHNATQNGNX SLVXNAQATF NQATLNGNTS ASGNASFNLS801 DHAVQNGSLT LSGNAKANVS HSALNGNVSL ADKAVFHFES SRFTGQISGG851 KDTALHLKDS EWTLPSGXEL GNLNLDNATI TLNSAYRHDA AGAQTGSATD901 APRRRSRRSR RSLLXVTPPT SVESRFNTLT VNGKLNGQGT FRFMSELFGY951 RSDKLKLAES SEGTYTLAVN NTGNEPASLE QLTVVEGKDN KPLSENLNFT1001 LQNEHVDAGA W......... .......... .......... ..........
//1151 .......... .......... .......... .......... .LDRVFAEDR1201 RNAVWTSGIR DTKHYRSQDF RAYRQQTDLR QIGMQKNLGS GRVGILFSHN1251 RTENTFDDGI GNSARLAHGA VFGQYGIDRF YIGISAGAGF SSGSLSDGIG1301 XKXRRRVLHY GIQARYRAGF GGFGIEPHIG ATRYFVQKAD YRYENVNIAT1351 PGLAFNRYRA GIKADYSFKP AQHISITPYL SLSYTDAASG KVRTRVNTAV1401 LAQDFGKTRS AEWGVNAEIK GFTLSLHAAA AKGPQLEAQH SAGIKLGYRW1451 *进一步的序列分析揭示了全部的核苷酸序列<SEQ ID 591>:1 ATGAAAACAA CCGACAAACG GACAACCGAA ACACACCGCA AAGCCCCGAA51 AACCGGCCGC ATCCGCTTCT CGCCTGCTTA CTTAGCCATA TGCCTGTCGT101 TCGGCATTCT TCCCCAAGCC TGGGCGGGAC ACACTTATTT CGGCATCAAC151 TACCAATACT ATCGCGACTT TGCCGAAAAT AAAGGCAAGT TTGCAGTCGG201 GGCGAAAGAT ATTGAGGTTT ACAACAAAAA AGGGGAGTTG GTCGGCAAAT251 CAATGACAAA AGCCCCGATG ATTGATTTTT CTGTGGTGTC GCGTAACGGC301 GTGGCGGCAT TGGTGGGCGA TCAATATATT GTGAGCGTGG CACATAACGG351 CGGCTATAAC AACGTTGATT TTGGTGCGGA AGGAAGAAAT CCCGATCAAC401 ATCGTTTTAC TTATAAAATT GTGAAACGGA ATAATTATAA AGCAGGGACT451 AAAGGCCATC CTTATGGCGG CGATTATCAT ATGCCGCGTT TGCATAAATT501 TGTCACAGAT GCAGAACCTG TTGAAATGAC CAGTTATATG GATGGGCGGA551 AATATATCGA TCAAAATAAT TACCCTGACC GTGTTCGTAT TGGGGCAGGC601 AGGCAATATT GGCGATCTGA TGAAGATGAG CCCAATAACC GCGAAAGTTC651 ATATCATATT GCAAGTGCGT ATTCTTGGCT CGTTGGTGGC AATACCTTTG701 CACAAAATGG ATCAGGTGGT GGCACAGTCA ACTTAGGTAG TGAAAAAATT751 AAACATAGCC CATATGGTTT TTTACCAACA GGAGGCTCAT TTGGCGACAG801 TGGCTCACCA ATGTTTATCT ATGATGCCCA AAAGCAAAAG TGGTTAATTA851 ATGGGGTATT GCAAACGGGC AACCCCTATA TAGGAAAAAG CAATGGCTTC901 CAGCTGGTTC GTAAAGATTG GTTCTATGAT GAAATCTTTG CTGGAGATAC951 CCATTCAGTA TTCTACGAAC CACGTCAAAA TGGGAAATAC TCTTTTAACG1001 ACGATAATAA TGGCACAGGA AAAATCAATG CCAAACATGA ACACAATTCT1051 CTGCCTAATA GATTAAAAAC ACGAACCGTT CAATTGTTTA ATGTTTCTTT1101 ATCCGAGACA GCAAGAGAAC CTGTTTATCA TGCTGCAGGT GGTGTCAACA1151 GTTATCGACC CAGACTGAAT AATGGAGAAA ATATTTCCTT TATTGACGAA1201 GGAAAAGGCG AATTGATACT TACCAGCAAC ATCAATCAAG GTGCTGGAGG1251 ATTATATTTC CAAGGAGATT TTACGGTCTC GCCTGAAAAT AACGAAACTT1301 GGCAAGGCGC GGGCGTTCAT ATCAGTGAAG ACAGTACCGT TACTTGGAAA1351 GTAAACGGCG TGGCAAACGA 
CCGCCTGTCC AAAATCGGCA AAGGCACGCT1401 GCACGTTCAA GCCAAAGGGG AAAACCAAGG CTCGATCAGC GTGGGCGACG1451 GTACAGTCAT TTTGGATCAG CAGGCAGACG ATAAAGGCAA AAAACAAGCC1501 TTTAGTGAAA TCGGCTTGGT CAGCGGCAGG GGTACGGTGC AACTGAATGC1551 CGATAATCAG TTCAACCCCG ACAAACTCTA TTTCGGCTTT CGCGGCGGAC1601 GTTTGGATTT AAACGGGCAT TCGCTTTCGT TCCACCGTAT TCAAAATACC1651 GATGAAGGGG CGATGATTGT CAACCACAAT CAAGACAAAG AATCCACCGT1701 TACCATTACA GGCAATAAAG ATATTGCTAC AACCGGCAAT AACAACAGCT1751 TGGATAGCAA AAAAGAAATT GCCTACAACG GTTGGTTTGG CGAGAAAGAT1801 ACGACCAAAA CGAACGGGCG GCTCAACCTT GTTTACCAGC CCGCCGCAGA1851 AGACCGCACC CTGCTGCTTT CCGGCGGAAC AAATTTAAAC GGCAACATCA1901 CGCAAACAAA CGGCAAACTG TTTTTCAGCG GCAGACCAAC ACCGCACGCC1951 TACAATCATT TAAACGACCA TTGGTCGCAA AAAGAGGGCA TTCCTCGCGG2001 GGAAATCGTG TGGGACAACG ACTGGATCAA CCGCACATTT AAAGCGGAAA2051 ACTTCCAAAT TAAAGGCGGA CAGGCGGTGG TTTCCCGCAA TGTTGCCAAA2101 GTGAAAGGCG ATTGGCATTT GAGCAATCAC GCCCAAGCAG TTTTTGGTGT2151 CGCACCGCAT CAAAGCCACA CAATCTGTAC ACGTTCGGAC TGGACGGGTC2201 TGACAAATTG TGTCGAAAAA ACCATTACCG ACGATAAAGT GATTGCTTCA2251 TTGACTAAGA CCGACATCAG CGGCAATGTC GATCTTGCCG ATCACGCTCA2301 TTTAAATCTC ACAGGGCTTG CCACACTCAA CGGCAATCTT AGTGCAAATG2251 GCGATACACG TTATACAGTC AGCCACAACG CCACCCAAAA CGGCAACCTT2401 AGCCTCGTGG GCAATGCCCA AGCAACATTT AATCAAGCCA CATTAAACGG2451 CAACACATCG GCTTCGGGCA ATGCTTCATT TAATCTAAGC GACCACGCCG2501 TACAAAACGG CAGTCTGACG CTTTCCGGCA ACGCTAAGGC AAACGTAAGC2551 CATTCCGCAC TCAACGGTAA TGTCTCCCTA GCCGATAAGG CAGTATTCCA2601 TTTTGAAAGC AGCCGCTTTA CCGGACAAAT CAGCGGCGGC AAGGATACGG2651 CATTACACTT AAAAGACAGC GAATGGACGC TGCCGTCAGG CACGGAATTA2701 GGCAATTTAA ACCTTGACAA CGCCACCATT ACACTCAATT CCGCCTATCG2751 CCACGATGCG GCAGGGGCGC AAACCGGCAG TGCGACAGAT GCGCCGCGCC2801 GCCGTTCGCG CCGTTCGCGC CGTTCCCTAT TATCCGTTAC ACCGCCAACT2851 TCGGTAGAAT CCCGTTTCAA CACGCTGACG GTAAACGGCA AATTGAACGG2901 TCAGGGAACA TTCCGCTTTA TGTCGGAACT CTTCGGCTAC CGCAGCGACA2951 AATTGAAGCT GGCGGAAAGT TCCGAAGGCA CTTACACCTT GGCGGTCAAC3001 AATACCGGCA ACGAACCTGC AAGCCTCGAA CAATTGACGG TAGTGGAAGG3051 AAAAGACAAC 
AAACCGCTGT CCGAAAACCT TAATTTCACC CTGCAAAACG3101 AACACGTCGA TGCCGGCGCG TGGCGTTACC AACTCATCCG CAAAGACGGC3151 GAGTTCCGCC TGCATAATCC GGTCAAAGAA CAAGAGCTTT CCGACAAACT3201 CGGCAAGGCA GAAGCCAAAA AACAGGCGGA AAAAGACAAC GCGCAAAGCC3251 TTGACGCGCT GATTGCGGCC GGGCGCGATG CCGTCGAAAA GACAGAAAGC3301 GTTGCCGAAC CGGCCCGGCA GGCAGGCGGG GAAAATGTCG GCATTATGCA3351 GGCGGAGGAA GAGAAAAAAC GGGTGCAGGC GGATAAAGAC ACCGCCTTGG3401 CGAAACAGCG CGAAGCGGAA ACCCGGCCGG CTACCACCGC CTTCCCCCGC3451 GCCCGCCGCG CCCGCCGGGA TTTGCCGCAA CTGCAACCCC AACCGCAGCC3501 CCAACCGCAG CGCGACCTGA TCAGCCGTTA TGCCAATAGC GGTTTGAGTG3551 AATTTTCCGC CACGCTCAAC AGCGTTTTCG CCGTACAGGA CGAATTAGAC3601 CGCGTATTTG CCGAAGACCG CCGCAACGCC GTTTGGACAA GCGGCATCCG3651 GGACACCAAA CACTACCGTT CGCAAGATTT CCGCGCCTAC CGCCAACAAA3701 CCGACCTGCG CCAAATCGGT ATGCAGAAAA ACCTCGGCAG CGGGCGCGTC3751 GGCATCCTGT TTTCGCACAA CCGGACCGAA AACACCTTCG ACGACGGCAT3801 CGGCAACTCG GCACGGCTTG CCCACGGCGC CGTTTTCGGG CAATACGGCA3851 TCGACAGGTT CTACATCGGC ATCAGCGCGG GCGCGGGTTT TAGCAGCGGC3901 AGCCTTTCAG ACGGCATCGG AGGCAAAATC CGCCGCCGCG TGCTGCATTA3951 CGGCATTCAG GCACGATACC GCGCCGGTTT CGGCGGATTC GGCATCGAAC4001 CGCACATCGG CGCAACGCGC TATTTCGTCC AAAAAGCGGA TTACCGCTAC4051 GAAAACGTCA ATATCGCCAC CCCCGGCCTT GCATTCAACC GCTACCGCGC4101 GGGCATTAAG GCAGATTATT CATTCAAACC GGCGCAACAC ATTTCCATCA4151 CGCCTTATTT GAGCCTGTCC TATACCGATG CCGCTTCGGG CAAAGTCCGA4201 ACACGCGTCA ATACCGCCGT ATTGGCTCAG GATTTCGGCA AAACCCGCAG4251 TGCGGAATGG GGCGTAAACG CCGAAATCAA AGGTTTCACG CTGTCCCTCC4301 ACGCTGCCGC CGCCAAAGGC CCGCAACTGG AAGCGCAACA CAGCGCGGGC4351 ATCAAATTAG GCTACCGCTG GTAA它对应于氨基酸序列<SEQ ID 592;ORF1-1>:1 MKTTDKRTTE THRKAPKTGR IRFSPAYLAI CLSFGILPQA WAGHTYFGIN51 YQYYRDFAEN KGKFAVGAKD IEVYNKKGEL VGKSMTKAPM IDFSVVSRNG101 VAALVGDQYI VSVAHNGGYN NVDFGAEGRN PDQHRFTYKI VKRNNYKAGT151 KGHPYGGDYH MPRLHKFVTD AEPVEMTSYM DGRKYIDQNN YPDRVRIGAG201 RQYWRSDEDE PNNRESSYHI ASAYSWLVGG NTFAQNGSGG GTVNLGSEKI251 KHSPYGFLPT GGSFGDSGSP MFIYDAQKQK WLINGVLQTG NPYIGKSNGF301 QLVRKDWFYD EIFAGDTHSV FYEPRQNGKY SFNDDNNGTG KINAKHEHNS351 LPNRLKTRTV 
QLFNVSLSET AREPVYHAAG GVNSYRPRLN NGENISFIDE401 GKGELILTSN INQGAGGLYF QGDFTVSPEN NETWQGAGVH ISEDSTVTWK451 VNGVANDRLS KIGKGTLHVQ AKGENQGSIS VGDGTVILDQ QADDKGKKQA501 FSEIGLVSGR GTVQLNADNQ FNPDKLYFGF RGGRLDLNGH SLSFHRIQNT551 DEGAMIVNHN QDKESTVTIT GNKDIATTGN NNSLDSKKEI AYNGWFGEKD601 TTKTNGRLNL VYQPAAEDRT LLLSGGTNLN GNITQTNGKL FFSGRPTPHA651 YNHLNDHWSQ KEGIPRGEIV WDNDWINRTF KAENFQIKGG QAVVSRNVAK701 VKGDWHLSNH AQAVFGVAPH QSHTICTRSD WTGLTNCVEK TITDDKVIAS751 LTKTDISGNV DLADHAHLNL TGLATLNGNL SANGDTRYTV SHNATQNGNL801 SLVGNAQATF NQATLNGNTS ASGNASFNLS DHAVQNGSLT LSGNAKANVS851 HSALNGNVSL ADKAVFHFES SRFTGQISGG KDTALHLKDS EWTLPSGTEL901 GNLNLDNATI TLNSAYRHDA AGAQTGSATD APRRRSRRSR RSLLSVTPPT951 SYESRFNTLT VNGKLNGQGT FRFMSELFGY RSDKLKLAES SEGTYTLAVN1001 NTGNEPASLE QLTVVEGKDN KPLSENLNFT LQNEHVDAGA WRYQLIRKDG1051 EFRLHNPVKE QELSDKLGKA EAKKQAEKDN AQSLDALIAA GRDAVEKTES1101 VAEPARQAGG ENVGIMQAEE EKKRVQADKD TALAKQREAE TRPATTAFPR1151 ARRARRDLPQ LQPQPQPQPQ RDLISRYANS GLSEFSATLN SVFAVQDELD1201 RVFAEDRRNA VWTSGIRDTK HYRSQDFRAY RQQTDLRQIG MQKNLGSGRV1251 GILFSHNRTE NTFDDGIGNS ARLAHGAVFG QYGIDRFYIG ISAGAGFSSG1301 SLSDGIGGKI RRRVLHYGIQ ARYRAGFGGF GIEPHIGATR YFVQKADYRY1351 ENVNIATPGL AFNRYRAGIK ADYSFKPAQH ISITPYLSLS YTDAASGKVR1401 TRVNTAVLAQ DFGKTRSAEW GVNAEIKGFT LSLHAAAAKG PQLEAQHSAG1451 IKLGYRW*
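Computer analysis of these sequences gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF1 shows 57.8% identity over a 1456 aa overlap with an ORF (ORF1a) from strain A of N. meningitidis: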
10 20 30 40 50 60
orf1.pep MKTTDKRTTETHRKAPKTGRIRFXAAYLAICLSFGILPQAWAGHTYFGINYQYYRDFAEN
||||||||||||||||||||||| |||||||||||||||||||||||||||||||||||
orf1a MKTTDKRTTETHRKAPKTGRIRFSPAYLAICLSFGILPQAWAGHTYFGINYQYYRDFAEN
10 20 30 40 50 60
70 80 90 100 110 120
orf1.pep KGKFAVGAKDIEVYNKKGELVGKSMTKAPMIDFSVVSRNGVAALVGVQYIVSVAHNGGYN
|||||||||||||||||||||||||||||||||||||||||||||| |||||||||||||
orf1a KGKFAVGAKDIEVYNKKGELVGKSMTKAPMIDFSVVSRNGVAALVGDQYIVSVAHNGGYN
70 80 90 100 110 120
130 140 150 160 170 180
orf1.pep NVDFGAEGXNIXDQXRXTYKIVKRNNYKAGTKGHPYGGDYHMPRLHKXVTDAEPVEMTSY
|||||||||| || | :|:|||||||| :: ||:|| ||||||| ||||||||||||
orf1a NVDFGAEGXN-PDQHRFSYQIVKRNNYKPDNS-HPYNGDXHMPRLHKFVTDAEPVEMTSD
130 140 150 160 170
190 200 210
orf1.pep MDGRKYIDQNNYPDRVRIGAGRQYWRSDEDEP---------------------NN-----
| | | |:::||:|||||:|::||| |:|: ||
orf1a MRGNTYSDKEKYPERVRIGSGHHYWRYDDDKHGDLSYSGAWLIGGNTHMQGWGNNGVXSL
550 560 570 580 590 600orf1.pep NAKANVSHSALNGNVSLADKAVFHFESSRFTGQISGGKDTALHLKDSEWTLPSGXELGNL
||||||||||||||||||||||||||:||||||:||:| |||||||||||||||:|||||orf1a NAKANVSHSALNGNVSLADKAVFHFENSRFTGQLSGSKXTALHLKDSEWTLPSGTELGNL
840 850 860 870 880 890
610 620 630 640 650 660orf1.pep NLDNATITLNSAYRHDAAGAQTGSATDAPRRRSRRSRRSLLXVTPPTSVESRFNTLTVNG
||||||||||||||||||||||| ::||||||| ||||| ||||||||||||||||||orf1a NLDNATITLNSAYRHDAAGAQTGXVSDTPRRRSRRS---LLSVTPPTSVESRFNTLTVNG
900 910 920 930 940 950
670 680 690 700 710 720orf1.pep KLNGQGTFRFMSELFGYRSDKLKLAESSEGTYTLAVNNTGNEPASLEQLTVVEGKDNKPL
||| |||||||||||||||||||||||||||||||||||||||:||:|||||||||||||orf1a KLNXQGTFRFMSELFGYRSDKLKLAESSEGTYTLAVNNTGNEPVSLDQLTVVEGKDNKPL
960 970 980 990 1000 1010
730 740 750orf1.pep SENLNFTLQNEHVDAGAW------------------------------------------
||||||||||||||||||orf1a SENLNFTLQNEHVDAGAWRYQLIRKDGEFRLHNPVKEQELSDKLGKAEAJKQAEKDNAQS
1020 1030 1040 1050 1060 1070orf1 pep ------------------------------------------------------------orf1a LDALIAAGRDAAEKTESVAEPARXAGGENVGIMQAEEEKKRVQADKDSALAKQREAETRP
1080 1090 1100 1110 1120 1130
760orf1.pep ---------------------------------------------------------LDR
|||orf1a XTTAFPRARXARRDLPQPQPQPQPQPQPQRDLXSRYANSGLSEFSATLNSVFAVQDELDR
1140 1150 1160 1170 1180 1190
770 780 790 800 810 820orf1.pep VFAEDRRNAVWTSGIRDTKHYRSQDFRAYRQQTDLRQIGMQKNLGSGRVGILFSHNRTEN
||||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||orf1a VFAEDRRNAVWTSXIRXTKHYRSQDFRAYRQQTDLRQIGKQKNLGSGRVGILFSHNRTEN
1200 1210 1220 1230 1240 1250
830 840 850 860 870 880orf1.pep TFDDGIGNSARLAHGAVFGQYGIDRFYIGISAGAGFSSGSLSDGIGXKXRRRVLHYGIQA
:|||||||||||||||||||||| || ||||:||||||| |||||| | |||||||||||orf1a XFDDGIGNSARLAHGAVFGQYGIGRFDIGISTGAGFSSGXLSDGIGGKIRRRVLHYGIQA
1260 1270 1280 1290 1300 1310
890 900 910 920 930 940orf1.pep RYRAGFGGFGIEPHIGATRYFVQKADYRYENVNIATPGLAFNRYRAGIKADYSFKPAQHI
|||||||||||||:||||||||||||||||||||||||||||||||||||||||||||||orf1a RYRAGFGGFGIEPYIGATRYFVQKADYRYENVNIATPGLAFNRYRAGIKADYSFKPAQHX
1320 1330 1340 1350 1360 1370
950 960 970 980 990 1000orf1.pep SITPYLSLSYTDAASGKVRTRVNTAVLAQDFGKTRSAEWGVNAEIKGFTLSLHAAAAKGP
||||| ||||||||||||||||||||||||||||||||||||||||||||| ||||||||orf1a SITPYXSLSYTDAASGKVRTRVNTAVLAQDFGKTRSAEWGVNAEIKGFTLSXHAAAAKGP
1380 1390 1400 1410 1420 1430
1010 1020orf1.pep QLEAQHSAGIKLGYRWX
|||||||||||||||||orf1a QLEAQHSAGIKLGYRWX
1440 1450全长ORF1a核苷酸序列<SEQ ID 593>是:1 ATGAAAACAA CCGACAAACG GACAACCGAA ACACACCGCA AAGCCCCGAA51 AACCGGCCGC ATCCGCTTCT CGCCTGCTTA CTTAGCCATA TGCCTGTCGT101 TCGGCATTCT TCCCCAAGCT TGGGCGGGAC ACACTTATTT CGGCATCAAC151 TACCAATACT ATCGCGACTT TGCCGAAAAT AAAGGCAAGT TTGCAGTCGG201 GGCGAAAGAT ATTGAGGTNT ACAACAAAAA AGGGGAGTTG GTCGGCAAAT251 CAATGACAAA AGCCCCGATG ATTGATTTTT CTGTGGTGTC GCGTAACGGC301 GTGGCGGCAT TGGTGGGCGA TCAATATATT GTGAGCGTGG CACATAACGG351 CGGCTATAAC AACGTTGATT TTGGTGCGGA AGGAAGNAAT CCCGATCAGC401 ACCGTTTTTC TTACCAAATT GTGAAAAGAA ATAATTATAA GCCTGACAAT451 TCACACCCTT ACAACGGCGA TTANCATATG CCGCGTTTGC ATAAATTTGT501 CACAGATGCA GAACCTGTCG AAATGACGAG TGACATGAGG GGGAATACCT551 ATTCCGATAA AGAAAAATAT CCCGAGCGTG TCCGCATCGG CTCAGGACAC601 CACTATTGGC GTTATGATGA TGACAAACAC GGCGATTTAT CCTACTCCGG651 CGCATGGTTA ATTGGCGGCA ATACACATAT GCAGGGTTGG GGAAATAATG701 GCGTANTTAG TTTGAGCGGC GATGTGCGCC ATGCCAACGA CTATGGCCCT751 ATGCCGATTG CAGGTGCGGC AGGCGACAGC GGTTCGCCAA TGTTTATTTA801 TGACAAAACA AACAATAAAT GGCTGCTCAA CGGAGTTTTA CAAACCGGCT851 ACCCTTATTC CGGCAGGGAA AACGGTTTCC AGCTGATACG CAAAGATTGG901 TTCTACGATG ACATTTACAG AGGCGATACA CATACCGTCT NTTTTGAACC951 GCGCAGTAAC GGACATTTTT CCTTTACATC CAACAACAAC GGTACGGGTA1001 CGGTAACAGA AACCAACGAA AAGGTNTCCA ATCCAAAGCT TAAAGTACAG1051 ACAGTCCGAC TGTTTGACGA ATCTTTGAAT GAAACTGATA AAGAACCAGT1101 TTACGCGGCA GGGGGTGTTA ATCAGTACCG TCCAAGGTTA AACAACGGTG1151 AAAACCTTTC TTTTATCGAT TACGGCAACG GCAAACTCAT CTTATCAAAC1201 AACATCAACC AAGGCGCGGG CGGTTTGTAT TTTGAAGGTG ATTTTACGGT1251 CTCGCCTGAA AACAACGAAA CGTGGCAAGG CGCGGGCGTT CATATCAGTG1301 AAGACAGTAC CGTTACTTGG AAAGTAAACG GCGTGGCAAA CGACCGCCTG1351 TCCAAAATCG GCAAAGGCAC GCTGCACGTT CAAGCCAAAG GGGAAAACCA1401 AGGCTCGATC AGCGTGGGCG ACGGTACAGT CATTTTGGAT CAGCAGGCAG1451 ACGATAAAGG CAAAAAACAA CCCTTTAGTG AAATCGGCTT GNTCAGCGGC1501 AGGGGTACGG TGCAACTGAA TGCCGATAAT CAGTTCAACC CCGACAAACT1551 CTATTTCGGC TTTCGCGGCG GACGTTTGGA TTTAAACGGG CATTCGCTTT1601 CGTTCCACCG TATTCAAAAT ACCGATGAAG GGGCGATGAT TGNCNATCAT1651 AATGCCACAA CAACATCCAC CGTTACCATT 
ACAGGGAATG AAAGTATTAC1701 ACAACCGAGT GGTAAGAATA TCAATAGACT TAATTACAGC AAAGAAATTG1751 CCTACAACGG TTGGTTTGGC GAGAAAGATA CGACCAAAAC GAACGGGCGG1801 CTCAACCTTG TTTACCAGCC CGCCGCAGAA GACCGCACCC NGCTGCTTTC1851 CGGCGGAACA AATTTAAACG GCAACATCAC GCAAACAAAC GGCAAACTGT1901 TTTTCAGCGG CAGACCGACA CCGCACGCCT ACAATCATTT AGGAAGCGGG1951 TGGTCAAAAA TGGAAGGTAT CCCACAAGGA GAAATCGTGT GGGACAACGA2001 CTGGATCNAC CGCACGTTTA AAGCGGAAAA TTTCCATATT CAGGGCGGGC2051 AGGCGGTGAT TTCCCGCAAT GTTGCCAAAG TGGAAGGCGA TTGNCATTTG2101 AGCAATCACG CCCAAGCAGT TTTTGGTGTC GCACCGCATC AAAGCCATAC2151 AATCTGTACA CGTTCGGACT GGACNGGTCT GACAAATTGT GTCGAANAAA2201 NCATTACCGA CGATAAAGTG ATTGCTTCAT TGACTAAGAC NGACNTNAGC2251 GGCANTGTNA GNCTNNCCNA TNACGNTNNT TNAAANCTCN CNGGGCNTGC2301 NNCACTNAAN GGCAATCTTA GTGCAAATGG CGATACACGT TATACAGTCA2351 GCCACAACGC CACCCAAAAC GGCAACCTTA GCCTCGTGGG CAATGCCCAA2401 GCAACATTTA ATCAAGCCAC ATTAAACGGC AACNCATCGG NTTCGGGCAA2451 TGCTTCATTT AATCTAAGCA ACAACGCCGC ACAAAACGGC AGTCTGACGC2501 TTTCCGACAA CGCTAAGGCA AACGTAAGCC ATTCCGCACT CAACGGCAAT2551 GTCTCCCTAG CCGATAAGGC AGTATTCCAT TTTGAAAACA GCCGCTTTAC2601 CGGACAACTC AGCGGCAGCA AGGANACAGC ATTACACTTA AAAGACAGCG2651 AATGGACGCT GCCGTCAGGC ACGGAATTAG GCAATTTAAA CCTTGACAAC2701 GCCACCATTA CACTCAATTC CGCCTATCGC CACGATGCTG CAGGCGCGCA2751 AACCGGCAGN GTGTCAGACA CGCCGCGCCG CCGTTCGCGC CGTTCCCTAT2801 TATCCGTTAC ACCGCCAACT TCGGTAGAAT CCCGTTTCAA CACGCTGACG2851 GTAAACGGCA AATTGAACNG TCAAGGAACA TTCCGCTTTA TGTCGGAACT2901 CTTCGGCTAC CGAAGCGACA AATTGAAGCT GGCGGAAAGT TCCGAAGGNA2951 CTTACACCTT GGCGGTCAAC AATACCGGCA ACGAACCCGT AAGCCTCGAT3001 CAATTGACGG TAGTGGAAGG GAAAGACAAC AAACCGCTGT CCGAAAACCT3051 TAATTTCACC CTGCAAAACG AACACGTCGA TGCCGGCGCG TGGCGTTACC3101 AACTCATCCG CAAAGACGGC GAGTTCCGCC TGCATAATCC GGTCAAAGAA3151 CAAGAGCTTT CCGACAAACT CGGCAAGGCA GAAGCCAAAA AACAGGCGGA3201 AAAAGACAAC GCGCAAAGCC TTGACGCGCT GATTGCGGCC GGGCGCGATG3251 CCGCCGAAAA GACAGAAAGC GTTGCCGAAC CGGCCCGGCN GGCAGGCGGG3301 GAAAATGTCG GCATTATGCA GGCGGAGGAA GAGAAAAAAC GGGTGCAGGC3351 GGATAAAGAC AGCGCNTTGG 
CGAAACAGCG CGAAGCGGAA ACCCGGCCGG3401 NTACCACCGC CTTCCCCCGC GCCCGCNGCG CCCGCCGGGA TTTGCCGCAA3451 CCGCAGCCCC AACCGCAACC TCAACCCCAA CCGCAGCGCG ACCTGATNAG3501 CCGTTATGCC AATAGCGGTT TGAGTGAATT TTCCGCCACG CTCAACAGCG3551 TTTTCGCCGT ACAGGACGAA TTGGACCGCG TGTTTGCCGA AGACCGCCGC3601 AACGCNGTTT GGACAAGCNG CATCCGGNAC ACCAAACACT ACCGTTCGCA3651 AGATTTCCGC GCCTACCGCC AACAAACCGA CCTGCGCCAA ATCGGTATGC3701 AGAAAAACCT CGGCAGCGGG CGCGTCGGCA TCCTGTTTTC GCACAACCGG3751 ACCGAAAACA NCTTCGACGA CGGCATCGGC AACTCGGCAC GGCTTGCCCA3801 CGGCGCCGTT TTCGGGCAAT ACGGCATCGG CAGGTTCGAC ATCGGCATCA3851 GCACGGGCGC GGGTTTTAGC AGCGGCANTC TNTCAGACGG CATCGGAGGC3901 AAAATCCGCC GCCGCGTGCT GCATTACGGC ATTCAGGCAC GATACCGCGC3951 CGGTTTCGGC GGATTCGGCA TCGAACCGTA CATCGGCGCA ACGCGCTATT4001 TCGTCCAAAA AGCGGATTAC CGCTACGAAA ACGTCAATAT CGCCACCCCC4051 GGTCTTGCGT TCAACCGNTA CCGNGCGGGC ATTAAGGCAG ATTATTCATT4101 CAAACCGGCG CAACACATNT CCATCACNCC TTATTTNAGC CTGTCCTATA4151 CCGATGCCGC TTCGGGCAAA GTCCGAACAC GCGTCAATAC CGCNGTATTG4201 GCTCAGGATT TCGGCAAAAC CCGCAGTGCG GAATGGGGCG TAAACGCCGA4251 AATCAAAGGT TTCACGCTGT CCNTCCACGC TGCCGCCGCC AAAGGNCCGC4301 AACTGGAAGC GCAACACAGC GCGGGCATCA AATTAGGCTA CCGCTGGTAA它编码的蛋白质具有氨基酸序列<SEQ ID 594>:1 MKTTDKRTTE THRKAPKTGR IRFSPAYLAI CLSFGILPQA WAGHTYFGIN51 YQYYRDFAEN KGKFAVGAKD IEVYNKKGEL VGKSMTKAPM IDFSVVSRNG101 VAALVGDQYI VSVAHNGGYN NVDFGAEGXN PDQHRFSYQI VKRNNYKPDN151 SHPYNGDXHM PRLHKFVTDA EPVEMTSDMR GNTYSDKEKY PERVRIGSGH201 HYWRYDDDKH GDLSYSGAWL IGGNTHMQGW GNNGVXSLSG DVRHANDYGP251 MPIAGAAGDS GSPMFIYDKT NNKWLLNGVL QTGYPYSGRE NGFQLIRKDW301 FYDDIYRGDT HTVXFEPRSN GHFSFTSNNN GTGTVTETNE KVSNPKLKVQ351 TVRLFDESLN ETDKEPVYAA GGVNQYRPRL NNGENLSFID YGNGKLILSN401 NINQGAGGLY FEGDFTVSPE NNETWQGAGV HISEDSTVTW KVNGVANDRL451 SKIGKGTLHV QAKGENQGSI SVGDGTVILD QQADDKGKKQ AFSEIGLXSG501 RGTVQLNADN QFNPDKLYFG FRGGRLDLNG HSLSFHRIQN TDEGAMIXXH551 NATTTSTVTI TGNESITQPS GKNINRLNYS KEIAYNGWFG EKDTTKTNGR601 LNLVYQPAAE DRTXLLSGGT NLNGNITQTN GKLFFSGRPT PHAYNHLGSG651 WSKMEGIPQG EIVWDNDWIX RTFKAENFHI QGGQAVISRN VAKVEGDXHL701 
SNHAQAVFGV APHQSHTICT RSDWTGLTNC VEXXITDDKV IASLTKTDXS751 GXVXLXXXXX XXLXGXAXLX GNLSANGDTR YTVSHNATQN GNLSLVGNAQ801 ATFNQATLNG NXSXSGNASF NLSNNAAQNG SLTLSDNAKA NVSHSALNGN851 VSLADKAVFH FENSRFTGQL SGSKXTALHL KDSEWTLPSG TELGNLNLDN901 ATITLNSAYR HDAAGAQTGX VSDTPRRRSR RSLLSVTPPT SVESRFNTLT951 VNGKLNXQGT FRFMSELFGY RSDKLKLAES SEGTYTLAVN NTGNEPVSLD1001 QLTVVEGKDN KPLSENLNFT LQNEHVDAGA WRYQLIRKDG EFRLHNPVKE1051 QELSDKLGKA EAKKQAEKDN AQSLDALIAA GRDAAEKTES VAEPARXAGG1101 ENVGIMQAEE EKKRVQADKD SALAKQREAE TRPXTTAFPR ARXARRDLPQ1151 PQpQpQPQPQ PQRDLXSRYA NSGLSEFSAT LNSVFAVQDE LDRVFAEDRR1201 NAVWTSXIRX TKHYRSQDFR AYRQQTDLRQ IGMQKNLGSG RVGILFSHNR1251 TENXFDDGIG NSARLAHGAV FGQYGIGRFD IGISTGAGFS SGXLSDGIGG1301 KIRRRVLHYG IQARYRAGFG GFGIEPYIGA TRYFVQKADY RYENVNIATP1351 GLAFNRYRAG IKADYSFKPA QHXSITPYXS LSYTDAASGK VRTRVNTAVL1401 AQDFGKTRSA EWGVNAEIKG FTLSXHAAAA KGPQLEAQHS AGIKLGYRW*
The transmembrane regions are underlined.
ORF1-1 and ORF1a show 86.3% identity over a 1462-amino-acid overlap:
10 20 30 40 50 60orf1a.pep MKTTDKRTTETHRKAPKTGRIRFSPAYLAICLSFGILPQAWAGHTYFGINYQYYRDFAEN
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf1-1 MKTTDKRTTETHRKAPKTGRIRFSPAYLAICLSFGILPQAWAGHTYFGINYQYYRDFAEN
10 20 30 40 50 60
70 80 90 100 110 120orf1a.pep KGKFAVGAKDIEVYNKKGELVGKSMTKAPMIDFSVVSRNGVAALVGDQYIVSVAHNGGYN
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf1-1 KGKFAVGAKDIEVYNKKGELVGKSMTKAPMIDFSVVSRNGVAALVGDQYIVSVAHNGGYN
70 80 90 100 110 120
130 140 150 160 170 179orf1a.pep NVDFGAEGXNPDQHRFSYQIVKRNNYKPDNS-HPYNGDXHMPRLHKFVTDAEPVEMTSDM
|||||||| |||||||:|:|||||||| :: |||:|| |||||||||||||||||||||orf1-1 NVDFGAEGRNPDQHRFTYKIVKRNNYKAGTKGHPYGGDYHMPRLHKFVTDAEPVEMTSYM
130 140 150 160 170 180
180 190 200 210 220 230orf1a.pep RGNTYSDKEKYPERVRIGSGHHYWRYDDDKHGDL--SYSGA----WLIGGNTHMQGWGNN
| | |:::||:|||||:|::||| |:|: :: || | ||:|||| |: :::orf1-1 DGRKYIDQNNYPDRVRIGAGRQYWRSDEDEPNNRESSYHIASAYSWLVGGNTFAQNGSGG
190 200 210 220 230 240
240 250 260 270 280 290orf1a.pep GVXSLSGD-VRHANDYGPMPIAGAAGDSGSPMFIYDKTNNKWLLNGVLQTGYPYSGRENG
|: :|::: ::|: || :| :|: ||||||||||| ::|||:||||||| || |: ||orf1-1 GTVNLGSEKIKHS-PYGFLPTGGSFGDSGSPMFIYDAQKQKWLINGVLQTGNPYIGKSNG
250 260 270 280 290
300 310 320 330 340 350orf1a.pep FQLIRKDWFYDDIYRGDTHTVXFEPRSNGHFSFTSNNNGTGTVTETNEKVSNP-KLKVQT
|||:|||||||:|: ||||:| :|||:||::||:::||||| :: :|: | | :||::|orf1-1 FQLVRKDWFYDEIFAGDTHSVFYEPRQNGKYSFNDDNNGTGKINAKHEHNSLPNRLKTRT
300 310 320 330 340 350
360 370 380 390 400 410orf1a.pep VRLFDESLNETDKEPVY-AAGGVNQYRPRLNNGENLSFIDYGNGKLILSNNINQGAGGLY
|:||: ||:|| :|||| ||||||:||||||||||:|||| |:|:|||::||||||||||orf1-1 VQLFNVSLSETAREPVYHAAGGVNSYIPRLNNGENISFIDEGKGELILTSNINQGAGGLY
360 370 380 390 400 410
420 430 440 450 460 470orf1a.pep FEGDFTVSPENNETWQGAGVHISEDSTVTWKVNGVANDRLSKIGKGTLHVQAKGENQGSI
|:||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf1-1 FQGDFTVSPENNETWQGAGVHISEDSTVTWKVNGVANDRLSKIGKGTLHVQAKGENQGSI
420 430 440 450 460 470
480 490 500 510 520 530orf1a.pep SVGDGTVILDQQADDKGKKQAFSEIGLXSGRGTVQLNADNQFNPDKLYFGFRGGRLDLNG
||||||||||||||||||||||||||| ||||||||||||||||||||||||||||||||orf1-1 SVGDGTVILDQQADDKGKKQAFSEIGLVSGRGTVQLNADNQFNPDKLYFGFRGGRLDLNG
480 490 500 510 520 530
540 550 560 570 580 590orf1a.pep HSLSFHRIQNTDEGAMIXXHNATTTSTVTITGNESITQPSGKNINRLNYSKEIAYNGWFG
||||||||||||||||| || ||||||||::|: :|:| | |: :||||||||||orf1-1 HSLSFHRIQNTDEGAMIVNHNQDKESTVTITGNKDIAT-TGNN-NSLDSKKEIAYNGWFG
540 550 560 570 580 590
600 610 620 630 640 650orf1a.pep EKDTTKTNGRLNLVYQPAAEDRTXLLSGGTNLNGNITQTNGKLFFSGRPTPHAYNHLGSG
||||||||||||||||||||||| |||||||||||||||||||||||||||||||||::orf1-1 EKDTTKTNGRLNLVYQPAAEDRTLLLSGGTNLNGNITQTNGKLFFSGRPTPHAYNHLNDH
600 610 620 630 640 650
660 670 680 690 700 710orf1a.pep WSKMEGIPQGEIVWDNDWIXRTFKAENFHIQGGQAVISRNVAKVEGDXHLSNHAQAVFGV
||: ||||:|||||||||| ||||||||:|:|||||:|||||||:|| ||||||||||||orf1-1 WSQKEGIPRGEIVWDNDWINRTFKAENFQIKGGQAVVSRNVAKVKGDWHLSNHAQAVFGV
660 670 680 690 700 710
720 730 740 750 760 770orf1a.pep APHQSHTICTRSDWTGLTNCVEXXITDDKVIASLTKTDXSGXVXLXXXXXXXLXGXAXLX
|||||||||||||||||||||| :|||||||||||||| || | | |:| | |orf1-1 APHQSHTICTRSDWTGLTNCVEKTITDDKVIASLTKTDISGNVDLADHAHLNLTGLATLN
720 730 740 750 760 770
780 790 800 810 820 830orf1a.pep GNLSANGDTRYTVSHNATQNGNLSLVGNAQATFNQATLNGNXSXSGNASFNLSNNAAQNG
|||||||||||||||||||||||||||||||||||||||||:| |||||||||::|:|||orf1-1 GNLSANGDTRYTVSHNATQNGNLSLVGNAQATFNQATLNGNTSASGNASFNLSDHAVQNG
780 790 800 810 820 830
840 850 860 870 880 890orf1a.pep SLTLSDNAKANVSHSALNGNVSLADKAVFHFENSRFTGQLSGSKXTALHLKDSEWTLPSG
||||| ||||||||||||||||||||||||||:||||||:||:|||||||||||||||||orf1-1 SLTLSGNAKANVSHSALNGNVSLADKAVFHFESSRFTGQISGGKDTALHLKDSEWTLPSG
840 850 860 870 880 890
900 910 920 930 940orf1a.pep TELGNLNLDNATITLNSAYRHDAAGAQTGXVSDTPRRRSRRS---LLSVTPPTSVESRFN
||||||||||||||||||||||||||||| ::|||||||||| |||||||||||||||orf1-1 TELGNLNLDNATITLNSAYRHDAAGAQTGSATDAPRRRSRRSRRSLLSVTPPTSVESRFN
900 910 920 930 940 950
950 960 970 980 990 1000orf1a.pep TLTVNGKLNXQGTFRFMSELFGYRSDKLKLAESSEGTYTLAVNNTGNEPVSLDQLTVVEG
||||||||| |||||||||||||||||||||||||||||||||||||||:||:|||||||orf1-1 TLTVNGKLNGQGTFRFMSELFGYRSDKLKLAESSEGTYTLAVNNTGNEPASLEQLTVVEG
960 970 980 990 1000 1010
1010 1020 1030 1040 1050 1060orf1a.pep KDNKPLSENLNFTLQNEHVDAGAWRYQLIRKDGEFRLHNPVKEQELSDKLGKAEAKKQAE
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf1-1 KDNKPLSENLNFTLQNEHVDAGAWRYQLIRKDGEFRLHNPVKEQELSDKLGKAEAKKQAE
1020 1030 1040 1050 1060 1070
1070 1080 1090 1100 1110 1120orf1a.pep KDNAQSLDALIAAGRDAAEKTESVAEPARXAGGENVGIMQAEEEKKRVQADKDSALAKQR
|||||||||||||||||:||||||||||| |||||||||||||||||||||||:||||||orf1-1 KDNAQSLDALIAAGRDAVEKTESVAEPARQAGGENVGIMQAEEEKKRVQADKDTALAKQR
1080 1090 1100 1110 1120 1130
1130 1140 1150 1160 1170 1180
orf1a.pep EAETRPXTTAFPRARXARRDLPQPQPQPQPQPQPQRDLXSRYANSGLSEFSATLNSVFAV
|||||| |||||||| ||||||| |||||||| |||| |||||||||||||||||||||
orf1-1 EAETRPATTAFPRARRARRDLPQLQPQPQPQP--QRDLISRYANSGLSEFSATLNSVFAV
1140 1150 1160 1170 1180 1190
1190 1200 1210 1220 1230 1240
orf1a.pep QDELDRVFAEDRRNAVWTSXIRXTKHYRSQDFRAYRQQTDLRQIGMQKNLGSGRVGILFS
||||||||||||||||||| || |||||||||||||||||||||||||||||||||||||
orf1-1 QDELDRVFAEDRRNAVWTSGIRDTKHYRSQDFRAYRQQTDLRQIGMQKNLGSGRVGILFS
1200 1210 1220 1230 1240 1250
1250 1260 1270 1280 1290 1300
orf1a.pep HNRTENXFDDGIGNSARLAHGAVFGQYGIGRFDIGISTGAGFSSGXLSDGIGGKIRRRVL
||||||:|||||||||||||||||||||| || ||||:||||||| ||||||||||||||
orf1-1 HNRTENTFDDGIGNSARLAHGAVFGQYGIDRFYIGISAGAGFSSGSLSDGIGGKIRRRVL
1260 1270 1280 1290 1300 1310
1310 1320 1330 1340 1350 1360
orf1a.pep HYGIQARYRAGFGGFGIEPYIGATRYFVQKADYRYENVNIATPGLAFNRYRAGIKADYSF
|||||||||||||||||||:||||||||||||||||||||||||||||||||||||||||
orf1-1 HYGIQARYRAGFGGFGIEPHIGATRYFVQKADYRYENVNIATPGLAFNRYRAGIKADYSF
1320 1330 1340 1350 1360 1370
1370 1380 1390 1400 1410 1420
orf1a.pep KPAQHXSITPYXSLSYTDAASGKVRTRVNTAVLAQDFGKTRSAEWGVNAEIKGFTLSXHA
||||| ||||| ||||||||||||||||||||||||||||||||||||||||||||||||
orf1-1 KPAQHISITPYLSLSYTDAASGKVRTRVNTAVLAQDFGKTRSAEWGVNAEIKGFTLSLHA
1380 1390 1400 1410 1420 1430
1430 1440 1450
orf1a.pep AAAKGPQLEAQHSAGIKLGYRWX
|||||||||||||||||||||||
orf1-1 AAAKGPQLEAQHSAGIKLGYRWX
1440 1450
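The ORF1a nucleotide sequence <SEQ ID 593> contains ambiguous bases (N), and the encoded protein <SEQ ID 594> carries X at some of the corresponding positions. As an illustration only, the following Python sketch translates a nucleotide sequence with the standard genetic code and reports X whenever an N-containing codon is compatible with more than one amino acid. The function names are hypothetical, and the text does not state how the listed translation was produced, so the rule used here is an assumption. The short test fragments are copied from <SEQ ID 593>.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_codon(codon):
    """Translate one codon; N is expanded to all four bases (assumed rule)."""
    if any(base not in "ACGTN" for base in codon):
        return "X"  # any other symbol is treated as unknown
    choices = [BASES if base == "N" else base for base in codon]
    residues = {CODON_TABLE["".join(c)] for c in product(*choices)}
    # A single possible residue is reported as such; otherwise X.
    return residues.pop() if len(residues) == 1 else "X"


def translate(nucleotides):
    """Translate in frame 1, ignoring whitespace and any trailing partial codon."""
    seq = "".join(nucleotides.split()).upper()
    usable = len(seq) - len(seq) % 3
    return "".join(translate_codon(seq[i:i + 3]) for i in range(0, usable, 3))


print(translate("ATGAAAACAA CCGACAAACG GACAACCGAA"))  # MKTTDKRTTE
print(translate("GATATTGAGGTNTACAAC"))  # DIEVYN (GTN is Val for any N)
print(translate("GAAGGAAGNAATCCC"))     # EGXNP (AGN may be Ser or Arg)
```

Under this rule the fragment containing GTN still yields V, while AGN yields X, which agrees with the residues shown at those positions in <SEQ ID 594>.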
Homology with the adhesion and penetration protein hap precursor of Haemophilus influenzae (accession number P45387)
Amino acids 23-423 of ORF1 and the hap protein show 59% amino acid identity over a 450-amino-acid overlap:
orf1 23 FXAAYLAICLSFGILPQAWAGHTYFGINYQYYRDFAENKGKFAVGAKDIEVYNKKGELVG 82
F +L C+S GI QAWAGHTYFGI+YQYYRDFAENKGKF VGAK+IEVYNK+G+LVG
hap 6 FRLNFLTACVSLGIASQAWAGHTYFGIDYQYYRDFAENKGKFTVGAKNIEVYNKEGQLVG 65
orf1 83 KSMTKAPMIDFSVVSRNGVAALVGVQYIVSVAHNGGYNNVDFGAEGXNIXDQXRXTYKIV 142
SMTKAPMIDFSVVSRNGVAALVG QYIVSVAHNGGYN+VDFGAEG N DQ R TY+IV
hap 66 TSMTKAPMIDFSVVSRNGVAALVGDQYIVSVAHNGGYNDVDFGAEGRN-PDQHRFTYQIV 124
orf1 143 KRNNYKAGTKGHPYGGDYHMPRLHKXVTDAEPVEMTSYMDGRKYIDQNNYPDRVRIGAGR 202
KRNNY+A + HPY GDYHMPRLHK VT+AEPV MT+ MDG+ Y D+ NYP+RVRIG+GR
hap 125 KRNNYQAWERKHPYDGDYHMPRLHKFVTEAEPVGMTTNMDGKVYADRENYPERVRIGSGR 184
orf1 203 QYWRSDEDEPNNRESSYHIA---------------------------------------- 222
QYWR+D+DE N SSY+++
hap 185 QYWRTDKDEETNVHSSYYVSGAYRYLTAGNTHTQSGNGNGTVNLSGNVVSPNHYGPLPTG 244
orf1 223 -----SGSPMFIYDAQKQKWLINGVLQTGNPYIGKSNGFQLVRKDWFYDEIFAGDTHSVF 277
SGSPMFIYDA+K++WLIN VLQTG+P+ G+ NGFQL+R++WFY+E+ A DT SVF
hap 245 GSKGDSGSPMFIYDAKKKQWLINAVLQTGHPFFGRGNGFQLIREEWFYNEVLAVDTPSVF 304
orf1 278 --YEPRQNGKYSFNDDNNGTGKIN-AKHEHNSLPNRLKTRTVQLFNVSLSETAREPVYHA 334
Y P NG YSF +N+GTGK+ + + + + TV+LFN SL++TA+E V A
hap 305 QRYIPPINGHYSFVSNNDGTGKLTLTRPSKDGSKAKSEVGTVKLFNPSLNQTAKEHV-KA 363
orf1 335 AGGVNSYRPRLNNGENISFIDEGKGELILTSNINQGAGGLYFQGDFTV-SPENNETWQGA 393
A G N Y+PR+ G+NI D+GKG L + +NINQGAGGLYF+G+F V +NN TWQGA
hap 364 AAGYNIYQPRMEYGKNIYLGDQGKGTLTIENNINQGAGGLYFEGNFVVKGKQNNITWQGA 423
orf1 394 GVHISEDSTVTWKVNGVANDRLSKIGKGTL 423
GV I +D+TV WKV+ NDRLSKIG GTL
hap 424 GVSIGQDATVEWKVHNPENDRLSKIGIGTL 453
Amino acids 715-1011 of ORF1 and the hap protein show 50% amino acid identity over a 258-amino-acid overlap:
orf1 41 DTRYTVSHNATQ-NGNXSLVXNAQATFNQ-ATLNGNTSASGNASFNLSDEAVQNGSLTLS 98
DT+ S TQ NG+ +L NA + A LNGN + ++ F LS++A Q G++ LS
hap 733 DTKVINSIPITQINGSINLTNNATVNIHGLAKLNGNVTLIDHSQFTLSNNATQTGNIKLS 792
orf1 99 GNAKANVSHSALNGNVSLADKAVFHFESSRFTGQISGGKDTALHLKDSEWTLPSGXELGN 158
+A A V+++ LNGNV L D A F ++S F QI G KDT + L+++ WT+PS L N
hap 793 NHANATVNNATLNGNVHLTDSAQFSLKNSHFWHQIQGDKDTTVTLENATWTMPSDTTLQN 852
orf1 159 LNLDNATITLNSAYRHDAAGAQTGSATDAPXXXXXXXXXXLLXVTPPTSVESRFNTLTVN 218
L L+N+T+TLNSAY + S+ +AP L T PTS E RFNTLTVN
hap 853 LTLNNSTVTLNSAY--------SASSNNAPRHRRS-----LETETTPTSAEHRFNTLTVN 899
orf1 219 GKLNGQGTFRFMSELFGYRSDKLKLAESSEGTYTLAVNNTGNEPASLEQLTVVEGKDNKP 278
GKL+GQGTF+F S LFGY+SDKLKL+ +EG YTL+V NTG EP +LEQLT++E DNKP
hap 900 GKLSGQGTFQFTSSLFGYKSDKLKLSNDAEGDYTLSVRNTGKEPVTLEQLTLIESLDNKP 959
orf1 279 LSENLNFTLQNEHVDAGA 296
LS+ L FTL+N+HVDAGA
hap 960 LSDKLKFTLENDHVDAGA 977
Amino acids 1192-1450 of ORF1 and the hap protein show 41% amino acid identity over a 259-amino-acid overlap:
orf1 1 LDRVFAEDRRNAVWTSGIRDTKHYRSQDFRAYRQQTDLRQIGMQKNLGSGRVGILFSHNR 60
LDR+F + ++AVWT+ +D + Y S FRAY+Q+T+LRQIG+QK L +GR+G +FSH+R
hap 1135 LDRLFVDQAQSAVWTNIAQDKRRYDSDAFRAYQQKTNLRQIGVQKALANGRIGAVFSHSR 1194
orf1 61 TENTFDDGIGNSARLAHGAVFGQYGIDRFYXXXXXXXXXXXXXXXXXIGXKXRRRVLHYG 120
++NTFD+ + N A L + F QY K R+ ++YG
hap 1195 SDNTFDEQVKNHATLTMMSGFAQYQWGDLQFGVNVGTGISASKMAEEQSRKIHRKAINYG 1254
orf1 121 IQARYRAGFGGFGIEPHIGATRYFVQKADYRYENVNIATPGLAFNRYRAGIKADYSFKPA 180
+ A Y+ G GI+P+ G RYF+++ +Y+ E V + TP LAFNRY AGI+ DY+F P
hap 1255 VNASYQFRLGQLGIQPYFGVNRYFIERENYQSEEVRVKTPSLAFNRYNAGIRVDYTFTPT 1314
orf1 181 QHISITPYLSLSYTDAASGKVRTRVNTAVLAQDFGKTRSAEWGVNAEIKGFTLSLHAAAA 240
+IS+ PY ++Y D ++ V+T VN VL Q FG+ E G+ AEI F +S + +
hap 1315 DNISVKPYFFVNYVDVSNANVQTTVNLTVLQQPFGRYWQKEVGLKAEILHFQISAFISKS 1374
orf1 241 KGPQLEAQHSAGIKLGYRW 259
+G QL Q + G+KLGYRW
hap 1375 QGSQLGKQQNVGVKLGYRW 1393
Homology with a predicted ORF from Neisseria gonorrhoeae
Fragments of ORF1 and the predicted ORF from Neisseria gonorrhoeae (ORF1ng) show 83.5%, 88.3% and 97.7% identity over overlaps of 467, 298 and 259 amino acids, respectively:
orf1.pep MKTTDKRTTETHRKAPKTGRIRFXAAYLAICLSFGILPQAWAGHTYFGINYQYYRDFAEN 60
||||||||||||||||||||||| |||||||||||||||||||||||||||||||||||
orf1ng MKTTDKRTTETHRKAPKTGRIRFSPAYLAICLSFGILPQARAGHTYFGINYQYYRDFAEN 60
orf1.pep KGKFAVGAKDIEVYNKKGELVGKSMTKAPMIDFSVVSRNGVAALVGVQYIVSVAHNGGYN 120
||||||||||||||||||||||||||||||||||||||||||||:| |||||||||||||
orf1ng KGKFAVGAKDIEVYNKKGELVGKSMTKAPMIDFSVVSRNGVAALAGDQYIVSVAHNGGYN 120
orf1.pep NVDFGAEGXNIXDQXRXTYKIVKRNNYKAGTKGHPYGGDYHMPRLHKXVTDAEPVEMTSY 180
|||||||| | || | :|:|||||||||||:||||||||||||||| ||||||||||||
orf1ng NVDFGAEGSN-PDQHRFSYQIVKRNNYKAGTNGHPYGGDYHMPRLHKFVTDAEPVEMTSY 179
orf1.pep MDGRKYIDQNNYPDRVRIGAGRQYWRSDEDEPNNRESSYHIAS----------------- 223
||| || | |:||||||||||||||||||||||||||||||||
orf1ng MDGWKYADLNKYPDRVRIGAGRQYWRSDEDEPNNRESSYHIASAYSWLVGGNTFAQNGSG 239
orf1.pep ----------------------------GSPMFIYDAQKQKWLINGVLQTGNPYIGKSNG 255
||||||||||||||||||||||||||||||||
orf1ng GGTVNLGSEKIKHSPYGFLPTGGSFGDSGSPMFIYDAQKQKWLINGVLQTGNPYIGKSNG 289
orf1.pep FQLVRKDWFYDEIFAGDTHSVFYEPRQNGKYSFNDDNNGTGKINAKHEHNSLPNRLKTRT 315
|||||||||||||||||||||||||:||||| |||:|||:|||:|||:| ||| ||||||
orf1ng FQLVRKDWFYDEIFAGDTHSVFYEPHQNGKYFFNDNNNGAGKIDAKHKHYSLPYRLKTRT 359
orf1.pep VQLFNVSLSETAREPVYHAAGGVNSYRPRLNNGENISFIDEGKGELILTSNINQGAGGLY 375
||||||||||||||||||||||||||||||||||||||||:|||||||||||||||||||
orf1ng VQLFNVSLSETAREPVYHAAGGVNSYRPRLNNGENISFIDKGKGELILTSNINQGAGGLY 419
orf1.pep FQGDFTVSPENNETWQGAGVHISEDSTVTWKVNGVANDRLSKIGKGT 422
|:|:|||||:|||||||||||||: ||||||||||||||||||||||
orf1ng FEGNFTVSPKNNETWQGAGVHISDGSTVTWKVNGVANDRLSKIGKGTLLVQAKGENQGSV 479
//
orf1.pep DKVTASLTKTDISGNVDLADHAHLNLTGLA 744
||| |||:|||: |||:|||||||||||||
orf1ng FGVAPHQSHTICTRSDWTGLTSCTEKTITDDKVIASLSKTDVRGNVSLADHAHLNLTGLA 774
orf1.pep TLNGNLSANGDTR-YTVSHNATQNGNXSLVXNAQATFNQATLNGNTSASGNASFNLSDHA 803
|:|||| ::::|| : ||||||| ||| |||||||||||||||||| |||||||::|
orf1ng TFNGNL-VQAETRTIRLRANATQNGNLSLVGNAQATFNQATLNGNTSASDNASFNLSNNA 833
orf1.pep VQNGSLTLSGNAKANVSHSALNGNVSLADKAVFHFESSRFTGQISGGKDTALHLKDSEWT 863
||||||||| ||||||||||||||||||||||||||:|||||:|||||||||||||||||
orf1ng VQNGSLTLSDNAKANVSHSALNGNVSLADKAVFHFENSRFTGKISGGKDTALHLKDSEWT 893
orf1.pep LPSGXELGNLNLDNATITLNSAYRHDAAGAQTGSATDAPRRRSRRSRRSLLXVTPPTSVE 923
||||:||||||||||||||||||||||||||||||:|||||||||| || |||||:|
orf1ng LPSGTELGNLNLDNATITLNSAYRHDAAGAQTGSAADAPRRRSRRS---LLSVTPPTSAE 950
orf1.pep SRFNTLTVNGKLNGQGTFRFMSELFGYRSDKLKLAESSEGTYTLAVNNTGNEPASLEQLT 983
||||||||||||||||||||||||||||| |||||||||||||||||||||||:||||||
orf1ng SRFNTLTVNGKLNGQGTFRFMSELFGYRSGKLKLAESSEGTYTLAVNNTGNEPVSLEQLT 1010
orf1.pep VVEGKDNKPLSENLNFTLQNEHVDAGAW 1011
||||||| ||||||||||||||||||||
orf1ng VVEGKDNTPLSENLNFTLQNEHVDAGAWRYQLIRKDGEFRLHNPVKEQELSDKLGKAGET 1070
//
orf1.pep LDRVFAEDRRNAVWTSGIRDTKHYRSQDFR 1211
||||||||||||||||||||||||||||||
orf1ng PQRDLISRYANSGLSEFSATLNSVFAVQDELDRVFAEDRRNAVWTSGIRDTKHYRSQDFR 1239
orf1.pep AYRQQTDLRQIGMQKNLGSGRVGILFSHNRTENTFDDGIGNSARLAHGAVFGQYGIDRFY 1271
||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||||
orf1ng AYRQQTDLRQIGMQKNLGSGRVGILFSHNRTGNTFDDGIGNSARLAHGAVFGQYGIGRFD 1299
orf1.pep IGISAGAGFSSGSLSDGIGXKXRRRVLHYGIQARYRAGFGGFGIEPHIGATRYFVQKADY 1331
|||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||
orf1ng IGISAGAGFSSGSLSDGIRGKIRRRVLHYGIQARYRAGFGGFGIEPHIGATRYFVQKADY 1359
orf1.pep RYENVNIATPGLAFNRYRAGIKADYSFKPAQHISITPYLSLSYTDAASGKVRTRVNTAVL 1391
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf1ng RYENVNIATPGLAFNRYRAGIKADYSFKPAQHISITPYLSLSYTDAASGKVRTRVNTAVL 1419
orf1.pep AQDFGKTRSAEWGVNAEIKGFTLSLHAAAAKGPQLEAQHSAGIKLGYRW 1440
|||||||||||||||||||||||||||||||||||||||||||||||||
orf1ng AQDFGKTRSAEWGVNAEIKGFTLSLHAAAAKGPQLEAQHSAGIKLGYRW 1468
The full-length ORF1ng nucleotide sequence <SEQ ID 595> was identified:
1 ATGAAAACAA CCGACAAACG GACAACCGAA ACACACCGCA AAGCCCCTAA51 AACCGGCCGC ATCCGCTTCT CGCCCGCTTA CTTAGCCATA TGCCTGTCGT101 TCGGCATTCT GCCCCAAGCC CGGGCGGGAC ACACTTATTT CGGCATCAAC151 TACCAATACT ATCGCGACTT TGCCGAAAAT AAAGGCAAGT TTGCAGTCGG201 GGCGAAAGAT ATTGAGGTTT ACAACAAAAA AGGGGAGTTG GTCGGCAAAT251 CGATGACGAA AGCCCCGATG ATTGATTTTT CTGTGGTATC GCGTAACGGC301 GTGGCGGCAT TGGCGGGCGA TCAATATATT GTGAGCGTGG CACATAACGG351 CGGCTATAAC AATGTTGATT TTGGTGCGGA GGGAAGCAAT CCCGATCAGC401 ACCGCTTTTC TTACCAAATT GTGAAAAGAA ATAATTATAA AGCAGGGACT451 AACGGCCATC CTTATGGCGG CGATTATCAT ATGCCGCGTT TGCACAAATT501 TGTCACAGAT GCAGAACCTG TTGAGATGAC CAGTTATATG GATGGGTGGA551 AATACGCTGA TTTAAATAAA TACCCTGATC GTGTTCGAAT CGGAGCAGGC601 AGACAATATT GGCGGTCTGA TGAAGACGAA CCCAATAACC GCGAAAGTTC651 ATATCATATT GCAAGCGCAT ATTCTTGGCT CGTCGGTGGC AATACCTTTG701 CACAAAATGG ATCAGGTGGT GGCACAGTCA ACTTAGGTAG CGAAAAAATT751 AAACATAGCC CATATGGTTT TTTACCAACA GGAGGCTCAT TTGGCGACAG801 TGGCTCACCA ATGTTTATCT ATGATGCCCA AAAGCAAAAG TGGTTAATTA851 ATGGGGTATT GCAAACAGGC AACCCCTATA TAGGAAAAAG CAATGGCTTC901 CAGCTAGTTC GTAAAGATTG GTTCTATGAT GAAATCTTTG CTGGAGATAC951 CCATTCAGTA TTCTACGAAC CACATCAAAA TGGGAAATAC TTTTTTAACG1001 ACAATAATAA TGGCGCAGGA AAAATCGATG CCAAACATAA ACACTATTCT1051 CTACCTTATA GATTAAAAAC ACGAACCGTT CAATTGTTTA ATGTTTCTTT1101 ATCCGAGACA GCAAGAGAAC CTGTTTATCA TGCTGCAGGT GGGGTCAACA1151 GTTATCGACC CAGACTGAAT AATGGAGAAA ATATTTCCTT TATTGACAAA1201 GGAAAAGGTG AATTGATACT TACCAGCAAC ATCAACCAAG GCGCGGGCGG1251 TTTGTATTTT GAGGGTAATT TTACGGTCTC GCCTAAAAAC AACGAAACGT1301 GGCAAGGCGC GGGCGTTCAT ATCAGTGATG GCAGTACCGT TACTTGGAAA1351 GTAAACGGCG TGGCAAACGA CCGCCTGTCC AAAATCGGCA AAGGCACGCT1401 GCTGGTTCAA GCCAAAGGGG AAAACCAAGG CTCGGTCAGC GTGGGCGACG1451 GTAAAGTCAT CTTAGATCAG CAGGCGGACG ATCAAGGCAA AAAACAAGCC1501 TTTAGTGAAA TCGGCTTGGT CAGCGGCAGG GGGACGGTGC AACTGAATGC1551 CGATAATCAG TTCAACCCCG ACAAACTCTA TTTCGGCTTT CGCGGCGGAC1601 GTTTGGATTT GAACGGGCAT TCGCTTTCGT TCCACCGCAT TCAAAATACC1651 GATGAAGGGG CGATGATTGT CAACCACAAT CAAGACAAAG AATCCACCGT1701 TACCATTACA 
GGCAATAAAG ATATTACTAC AACCGGCAAT AACAACAACT1751 TGGATAGCAA AAAAGAAATT GCCTACAACG GTTGGTTTGG CGAGAAAGAT1801 GCAACCAAAA CGAACGGGCG GCTCAATCTG AATTACCAAC CGGAAGAAGC1851 GGATCGCACT TTACTGCTTT CCGGCGGAAC AAATTTAAAC GGCAATATCA1901 CGCAAACAAA CGGCAAACTG TTTTTCAGCG GCAGACCGAC ACCGCACGCC1951 TACAATCATT TAGGAAGCGG GTGGTCAAAA ATGGAAGGTA TCCCACAAGG2001 AGAAATCGTG TGGGACAACG ATTGGATCGA CCGCACATTT AAAGCGGAAA2051 ACTTCCATAT TCAGGGCGGA CAAGCGGTGG TTTCCCGCAA TGTTGCCAAA2101 GTGGAAGGCG ATTGGCATTT AAGCAATCAC GCCCAAGCAG TTTTCGGTGT2151 CGCACCGCAT CAAAGCCACA CAATCTGTAC ACGTTCGGAC TGGACGGGTC2201 TGACAAGTTG TACCGAAAAA ACCATTACCG ACGATAAAGT GATTCCTTCA2251 TTGAGCAAGA CCGACATCAG AGGCAATGTC AGCCTTGCCG ATCACGCTCA2301 TTTAAATCTC ACAGGACTTG CCACACTCAA CGGCAATCTT AGTGCAGGCG2351 GAGACACGCA CTATACGGTT ACGCGCAACG CCACCCAAAA CGGCAACCTC2401 AGCCTCGTGG GCAATGCCCA AGCAACATTT AATCAAGCCA CATTAAACGG2451 CAACACATCG GCTTCGGACA ATGCTTCATT TAATCTAAGC AACAACGCCG2501 TACAAAACGG CAGTCTGACG CTTTCCGACA ACGCTAAGGC AAACGTAAGC2551 CATTCCGCAC TCAACGGCAA TGTCTCCCTA GCCGATAAGG CAGTATTCCA2601 TTTTGAAAAC AGCCGCTTTA CCGGAAAAAT CAGCGGCGGC AAGGATACGG2651 CATTACACTT AAAAGACAGC GAATGGACGC TGCCGTCGGG CACGGAATTA2701 GGCAATTTAA ACCTTGACAA CGCCACCATT ACACTCAATT CCGCCTATCG2751 ACACGATGCG GCAGGCGCGC AAACCGGCAG TGCGGCAGAT GCGCCGCGCC2801 GCCGTTCGCG CCGTTCCCTA TTATCCGTTA CGCCGCCAAC TTCGGCAGAA2851 TCCCGTTTCA ACACGCTGAC GGTAAACGGC AAATTGAACG GTCAGGGAAC2901 ATTCCGCTTT ATGTCGGAAC TCTTCGGCTA CCGCAGCGGC AAATTGAAGC2951 TGGCGGAAAG TTCCGAAGGC ACTTACACCT TGGCTGTCAA CAATACCGGC3001 AACGAACCCG TAAGTCTCGA GCAATTGACG GTAGTGGAAG GAAAAGACAA3051 CACACCGCTG TCCGAAAATC TTAATTTCAC CCTGCaaaAc gaacacgtcg3101 atgccggcgc atggCGTTAT CAGCTTATCC gcaaagacgG CGAGTTCCgc3151 CTGCATAATC CGGTCAAAGA ACAAGAGCTT TCCGACAAAC TCGGCAAGgc3201 gggagaaACA GAggccgccT TGACGGCAAA ACAGGCacaA CTTGCCGCCA3251 AAcaacaggc ggaaaAAGAC AACgcgcaaa gccttgAcgc gctgattgcg3301 gCcgggcgca atgccaccga AAAGGCAgaa agtgttgccg aaccgGCCCG3351 GCAGGCAGGC GGGGAAAAtg ccgGCATTAT GCAGGCGGAG GAAGAGAAAA3401 
AACGGGTGCA GGCGGATAAA GACACCGCCT TGGCGAAACA GCGCGAAGCG3451 GAAACCCGGC CGGCTACCAC CGCCTTCCCC CGCGCCCGCC GCGCCCGCCG3501 GGATTTGCCG CAACCGCAGC CCCAACCGCA ACCCCAACCG CAGCGCGACC3551 TGATCAGCCG TTATGCCAAT AGCGGTTTGA GTGAATTTTC CGCCACGCTC3601 AACAGCGTTT TCGCCGTACA GGACGAATTG GACCGCGTGT TTGCCGAAGA3651 CCGCCGCAAC GCCGTTTGGA CAAGCGGCAT CCGGGACACC AAACACTACC3701 GTTCGCAAGA TTTCCGCGCC TACCGCCAAC AAACCGACCT GCGCCAAATC3751 GGTATGCAGA AAAACCTCGG CAGCGGGCGC GTCGGCATCC TGTTTTCGCA3801 CAACCGGACC GGAAACACCT TCGACGACGG CATCGGCAAC TCGGCACGGC3851 TTGCCCACGG TGCCGTTTTC GGGCAATACG GCATCGGCAG GTTCGACATC3901 GGCATCAGCG CGGGCGCGGG TTTTAGTAGC GGCAGCCTTT CAGACGGCAT3951 CAGAGGCAAA ATCCGCCGCC GCGTGCTGCA TTACGGCATT CAGGCAAGAT4001 ACCGCGCAGG TTTCGGCGGA TTCGGCATCG AACCGCACAT CGGCGCAACG4051 CGCTATTTCG TCCAAAAAGC GGATTACCGA TACGAAAACG TCAATATCGC4101 CACCCCGGGC CTTGCATTCA ACCGCTACCG CGCGGGCATT AAGGCAGATT4151 ATTCATTCAA ACCGGCGCAA CACATTTCCA TCACGCCTTA TTTGAGCCTG4201 TCCTATACCG ATGCCGCTTC CGGCAAAGTC CGAACGCGCG TCAATACCGC4251 CGTATTGGCG CAGGATTTCG GCAAAACCCG CAGTGCGGAA TGGGGCGTAA4301 ACGCCGAAAT CAAAGGTTTC ACGCTGTCCC TCCACGCTGC CGCCGCCAAG4351 GGGCCGCAAT TGGAAGCGCA GCACAGCGCG GGCATCAAAT TAGGCTACCG4401 CTGGTAA预计它编码的蛋白质具有氨基酸序列<SEQ ID 596>:1 MKTTDKRTTE THRKAPKTGR IRFSPAYLAI CLSFGILPQA RAGHTYFGIN51 YQYYRDFAEN KGKFAVGAKD IEVYNKKGEL VGKSMTKAPM IDFSVVSRNG101 VAALAGDQYI VSVAHNGGYN NVDFGAEGSN PDQHRFSYQI VKRNNYKAGT151 NGHPYGGDYH MPRLHKFVTD AEPVEMTSYM DGWKYADLNK YPDRVRIGAG201 RQYWRSDEDE PNNRESSYHI ASAYSWLVGG NTFAQNGSGG GTVNLGSEKI251 KHSPYGFLPT GGSFGDSGSP MFIYDAQKQK WLINGVLQTG NPYIGKSNGF301 QLVRKDWFYD EIFAGDTHSV FYEPHQNGKY FFNDNNNGAG KIDAKHKHYS351 LPYRLKTRTV QLFNVSLSET AREPVYHAAG GVNSYRPRLN NGENISFIDK401 GKGELILTSN INQGAGGLYF EGNFTVSPKN NETWQGAGVH ISDGSTVTWK451 VNGVANDRLS KIGKGTLLVQ AKGENQGSVS VGDGKVILDQ QADDQGKKQA501 FSEIGLVSGR GTVQLNADNQ FNPDKLYFGF RGGRLDLNGH SLSFHRIQNT551 DEGAMIVNHN QDKESTVTIT GNKDITTTGN NNNLDSKKEI AYNGWFGEKD601 ATKTNGGLNL NYPPEEADRT LLLSGGTNLN GNITQTNGKL FFSGRPTPHA 651 YNHLGSGWSK MEGIPQGEIV 
WDNDWIDRTF KAENFHIQGG QAVVSRNVAK701 VEGDWHLSNH AQAVFGVAPH QSHTICTRSD WTGLTSCTEK TITDDKVIAS751 LSKTDVRGNV SLADHAHLNL TGLATFNGNL VQAETRTIRL RANATQNGNL801 SLYGNAQATF NQATLNGNTS ASDNASFNLS NNAVQNGSLT LSDNAKANVS851 HSALNGNVSL ADKAVFHFEN SRFTGKISGG KDTALHLKDS EWTLPSGTEL901 GNLNLDNATI TLNSAYRHDA AGAQTGSAAD APRRRSRRSL LSVTPPTSAE951 SRFNTLTVNG KLNGQGTFRF MSELFGYRSG KLKLAESSEG TYTLAVNNTG1001 NEPVSLEQLT VVEGKDNTPL SENLNFTLQN EHVDAGAWRY QLIRKDGEFR1051 LHNPVKEQEL SDKLGKAGET EAALTAKQAQ LAAKQQAEKD NAQSLDALIA1101 AGRNATEKAE SVAEPARQAG GENAGIMQAE EEKKRVQADK DTALAKQREA1151 ETRPATTAFP RARRARRDLP QPQPQPQPQP QRDLISRYAN SGLSEFSATL1201 NSVFAVQDEL DRVFAEDRRN AVWTSGIRDT KHYRSQDFRA YRQQTDLRQI1251 GMQKNLGSGR VGILFSHNRT GNTFDDGIGN SARLAHGAYF GQYGIGRFDI1301 GISAGAGFSS GSLSDGIRGK IRRRVLHYGI QARYRAGFGG FGIEPHIGAT1351 RYFVQKADYR YENVNIATPG LAFNRYRAGI KADYSFKPAQ HISITPYLSL1401 SYTDAASGKV RTRVNTAVLA QDFGKTRSAE WGVNAEIKGF TLSLHAAAAK1451 GPQLEAQHSA GIKLGYRW*
The underlined and double-underlined sequences represent the active site of a serine protease (trypsin family) and ATP/GTP-binding site motif A (P-loop).
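ATP/GTP-binding site motif A (P-loop) is commonly described by the PROSITE-style pattern [AG]-x(4)-G-K-[ST]. The following Python sketch searches a protein fragment for that pattern. It is given only as an illustration: the function name is hypothetical, the text does not say which tool was used to find the motif, and the underlining is not preserved here, so the hit below is not necessarily the span marked in the original. The fragment is residues 271-310 of <SEQ ID 596>.

```python
import re

# [AG]-x(4)-G-K-[ST] written as a regular expression; the lookahead
# lets overlapping occurrences be reported as well.
P_LOOP = re.compile(r"(?=([AG].{4}GK[ST]))")


def find_p_loops(protein, offset=1):
    """Return (position, matched residues) pairs; positions start at `offset`."""
    return [(m.start() + offset, m.group(1)) for m in P_LOOP.finditer(protein)]


# Residues 271-310 of SEQ ID 596 (ORF1ng), copied from the listing.
fragment = "MFIYDAQKQKWLINGVLQTGNPYIGKSNGFQLVRKDWFYD"
print(find_p_loops(fragment, offset=271))  # [(290, 'GNPYIGKS')]
```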
ORF1-1 and ORF1ng show 93.7% identity over a 1471-amino-acid overlap:
10 20 30 40 50 60
orf1-1.pep MKTTDKRTTETHRKAPKTGRIRFSPAYLAICLSFGILPQAWAGHTYFGINYQYYRDFAEN
|||||||||||||||||||||||||||||||||||||||| |||||||||||||||||||
orf1ng-1 MKTTDKRTTETHRKAPKTGRIRFSPAYLAICLSFGILPQARAGHTYFGINYQYYRDFAEN
10 20 30 40 50 60
70 80 90 100 110 120
orf1-1.pep KGKFAVGAKDIEVYNKKGELVGKSMTKAPMIDFSVVSRNGVAALVGDQYIVSVAHNGGYN
||||||||||||||||||||||||||||||||||||||||||||:|||||||||||||||
orf1ng-1 KGKFAVGAKDIEVYNKKGELVGKSMTKAPMIDFSVVSRNGVAALAGDQYIVSVAHNGGYN
70 80 90 100 110 120
130 140 150 160 170 180
orf1-1.pep NVDFGAEGRNPDQHRFTYKIVKRNNYKAGTKGHPYGGDYHMPRLHKFVTDAEPVEMTSYM
|||||||| |||||||:|:|||||||||||:|||||||||||||||||||||||||||||
orf1ng-1 NVDFGAEGSNPDQHRFSYQIVKRNNYKAGTNGHPYGGDYHMPRLHKFVTDAEPVEMTSYM
130 140 150 160 170 180
190 200 210 220 230 240
orf1-1.pep DGRKYIDQNNYPDRVRIGAGRQYWRSDEDEPNNRESSYHIASAYSWLVGGNTFAQNGSGG
||||| | |:||||||||||||||||||||||||||||||||||||||||||||||||||
orf1ng-1 DGWKYADLNKYPDRVRIGAGRQYWRSDEDEPNNRESSYHIASAYSWLVGGNTFAQNGSGG
190 200 210 220 230 240
250 260 270 280 290 300
orf1-1.pep GTVNLGSEKIKHSPYGFLPTGGSFGDSGSPMFIYDAQKQKWLINGVLQTGNPYIGKSNGF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf1ng-1 GTVNLGSEKIKHSPYGFLPTGGSFGDSGSPMFIYDAQKQKWLINGVLQTGNPYIGKSNGF
250 260 270 280 290 300
310 320 330 340 350 360
orf1-1.pep QLVRKDWFYDEIFAGDTHSVFYEPRQNGKYSFNDDNNGTGKINAKHEHNSLPNRLKTRTV
||||||||||||||||||||||||:||||| |||:|||:|||:|||:| ||| |||||||
orf1ng-1 QLVRKDWFYDEIFAGDTHSVFYEPHQNGKYFFNDNNNGAGKIDAKHKHYSLPYRLKTRTV
310 320 330 340 350 360
370 380 390 400 410 420
orf1-1.pep QLFNVSLSETAREPVYHAAGGVNSYRPRLNNGENISFIDEGKGELILTSNINQGAGGLYF
|||||||||||||||||||||||||||||||||||||||:|||||||||||||||||||| orf1ng-1 QLFNVSLSETAREPVYHAAGGVNSYRPRLNNGENISFIDKGKGELILTSNINQGAGGLYF
370 380 390 400 410 420
430 440 450 460 470 480orf1-1.pep QGDFTVSPENNETWQGAGVHISEDSTVTWKVNGVANDRLSKIGKGTLHVQAKGENQGSIS
:|:|||||:|||||||||||||: ||||||||||||||||||||||| ||||||||||:|orf1ng-1 EGNFTVSPKNNETWQGAGVHISDGSTVTWKVNGVANDRLSKIGKGTLLVQAKGENQGSVS
430 440 450 460 470 480
490 500 510 520 530 540orf1-1.pep VGDGTVILDQQADDKGKKQAFSEIGLVSGRGTVQLNADNQFNPDKLYFGFRGGRLDLNGH
|||| |||||||||:|||||||||||||||||||||||||||||||||||||||||||||orf1ng-1 VGDGKVILDQQADDQGKKQAFSEIGLVSGRGTVQLNADNQFNPDKLYFGFRGGRLDLNGH
490 500 510 520 530 540
550 560 570 580 590 600orf1-1.pep SLSFHRIQNTDEGAMIVNHNQDKESTVTITGNKDIATTGNNNSLDSKKEIAYNGWFGEKD
|||||||||||||||||||||||||||||||||||:||||||:|||||||||||||||||orf1ng-1 SLSFHRIQNTDEGAMIVNHNQDKESTVTITGNKDITTTGNNNNLDSKKEIAYNGWFGEKD
550 560 570 580 590 600
610 620 630 640 650 660orf1-1.pep TTKTNGRLNLVYQPAAEDRTLLLSGGTNLNGNITQTNGKLFFSGRPTPHAYNHLNDHWSQ
:||||||||| ||| |||||||||||||||||||||||||||||||||||||:: ||orf1ng-1 ATKTNGRLNLNYQPEEADRTLLLSGGTNLNGNITQTNGKLFFSGRPTPHAYNHLGSGWSK
610 620 630 640 650 660
670 680 690 700 710 720orf1-1.pep KEGIPRGEIVWDNDWINRTFKAENFQIKGGQAVVSRNVAKVKGDWHLSNHAQAVFGVAPH
||||:||||||||||:||||||||:|:|||||||||||||:||||||||||||||||||orf1ng-1 MEGIPQGEIVWDNDWIDRTFKAENFHIQGGQAVVSRNVAKVEGDWHLSNHAQAVFGVAPH
670 680 690 700 710 720
730 740 750 760 770 780orf1-1.pep QSHTICTRSDWTGLTNCVEKTITDDKVIASLTKTDISGNVDLADHAHLNLTGLATLNGNL
|||||||||||||||:|:|||||||||||||:|||| |||:|||||||||||||||||||orf1ng-1 QSHTICTRSDWTGLTSCTEKTITDDKVIASLSKTDIRGNVSLADHAHLNLTGLATLNGNL
730 740 750 760 770 780
790 800 810 820 830 840orf1-1.pep SANGDTRYTVSHNATQNGNLSLVGNAQATFNQATLNGNTSASGNASFNLSDHAVQNGSLT
||:|||:|||::|||||||||||||||||||||||||||||| |||||||::||||||||orf1ng-1 SAGGDTHYTVTRNATQNGNLSLVGNAQATFNQATLNGNTSASDNASFNLSNNAVQNGSLT
790 800 810 820 830 840
850 860 870 880 890 900orf1-1.pep LSGNAKANVSHSALNGNVSLADKAVFHFESSRFTGQISGGKDTALHLKDSEWTLPSGTEL
|| ||||||||||||||||||||||||||:|||||:||||||||||||||||||||||||orf1ng-1 LSDNAKANVSHSALNGNVSLADKAVFHFENSRFTGKISGGKDTALHLKDSEWTLPSGTEL
850 860 870 880 890 900
910 920 930 940 950 960orf1-1.pep GNLNLDNATITLNSAYRHDAAGAQTGSATDAPRRRSRRSRRSLLSVTPPTSVESRFNTLT
||||||||||||||||||||||||||||:|||||||| |||||||||||:||||||||orf1ng-1 GNLNLDNATITLNSAYRHDAAGAQTGSAADAPRRRSR---RSLLSVTPPTSAESRFNTLT
910 920 930 940 950
970 980 990 1000 1010 1020orf1-1.pep VNGKLNGQGTFRFMSELFGYRSDKLKLAESSEGTYTLAVNNTGNEPASLEQLTVVEGKDN
|||||||||||||||||||||| |||||||||||||||||||||||:|||||||||||||orf1ng-1 VNGKLNGQGTFRFMSELFGYRSGKLKLAESSEGTYTLAVNNTGNEPVSLEQLTVVEGKDN
960 970 980 990 1000 1010
1030 1040 1050 1060 1070orf1-1.pep KPLSENLNFTLQNEHVDAGAWRYQLIRKDGEFRLHNPVKEQELSDKLGKA----------
||||||||||||||||||||||||||||||||||||||||||||||||||orf1ng-1 TPLSENLNFTLQNEHVDAGAWRYQLIRKDGEFRLHNPVKEQELSDKLGKAGETEAALTAK
1020 1030 1040 1050 1060 1070
1080 1090 1100 1110 1120orf1-1.pep ----EAKKQAEKDNAQSLDALIAAGRDAVEKTESVAEPARQAGGENVGIMQAEEEKKRVQ
||:||||||||||||||||||:|:||:||||||||||||||:|||||||||||||orf1ng-1 QAQLAAKQQAEKDNAQSLDALIAAGRNATEKAESVAEPARQAGGENAGIMQAEEEKKRVQ
1080 1090 1100 1110 1120 1130
1130 1140 1150 1160 1170 1180orf1-1.pep ADKDTALAKQREAETRPATTAFPRARRARRDLPQLQPQPQPQPQRDLISRYANSGLSEFS
|||||||||||||||||||||||||||||||||| |||||||||||||||||||||||||orf1ng-1 ADKDTALAKQREAETRPATTAFPRARRARRDLPQPQPQPQPQPQRDLISRYANSGLSEFS
1140 1150 1160 1170 1180 1190
1190 1200 1210 1220 1230 1240orf1-1.pep ATLNSVFAVQDELDRVFAEDRRNAVWTSGIRDTKHYRSQDFRAYRQQTDLRQIGMQKNLG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf1ng-1 ATLNSVFAVQDELDRVFAEDRRNAVWTSGIRDTKHYRSQDFRAYRQQTDLRQIGMQKNLG
1200 1210 1220 1230 1240 1250
1250 1260 1270 1280 1290 1300orf1-1.pep SGRVGILFSHNRTENTFDDGIGNSARLAHGAVFGQYGIDRFYIGISAGAGFSSGSLSDGI
||||||||||||| |||||||||||||||||||||||| || ||||||||||||||||||orf1ng-1 SGRVGILFSHNRTGNTFDDGIGNSARLAHGAVFGQYGIGRFDIGISAGAGFSSGSLSDGI
1260 1270 1280 1290 1300 1310
1310 1320 1330 1340 1350 1360orf1-1.pep GGKIRRRVLHYGIQARYRAGFGGFGIEPHIGATRYFVQKADYRYENVNIATPGLAFNRYR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf1ng-1 RGKIRRRVLHYGIQARYRAGFGGFGIEPHIGATRYFVQKADYRYENVNIATPGLAFNRYR
1320 1330 1340 1350 1360 1370
1370 1380 1390 1400 1410 1420orf1-1.pep AGIKADYSFKPAQHISITPYLSLSYTDAASGKVRTRVNTAVLAQDFGKTRSAEWGVNAEI
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf1ng-1 AGIKADYSFKPAQHISITPYLSLSYTDAASGKVRTRVNTAVLAQDFGKTRSAEWGVNAEI
1380 1390 1400 1410 1420 1430
1430 1440 1450orf1-1.pep KGFTLSIHAAAAKGPQLEAQHSAGIKLGYRWX
||||||||||||||||||||||||||||||||orf1ng-1 KGFTLSLHAAAAKGPQLEAQHSAGIKLGYRWX
1440 1450 1460
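The identity figures quoted for these alignments can be illustrated with a short calculation. The Python sketch below counts identical columns in a gapped pairwise alignment and divides by the number of aligned columns. Counting gap columns in the denominator is an assumption about how the overlap is measured, and the function name is hypothetical. The two rows are one 60-column block of the ORF1-1/ORF1ng alignment above, so the result describes that block only, not the full 1471-residue overlap.

```python
def percent_identity(row_a, row_b, gap="-"):
    """Return (identical columns, total columns, percent identity)."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    columns = len(row_a)
    # A column is identical when both rows carry the same residue (not a gap).
    identical = sum(1 for a, b in zip(row_a, row_b) if a == b and a != gap)
    return identical, columns, 100.0 * identical / columns


# One block of the ORF1-1 / ORF1ng alignment (ORF1-1 residues 901-960).
orf1_1 = "GNLNLDNATITLNSAYRHDAAGAQTGSATDAPRRRSRRSRRSLLSVTPPTSVESRFNTLT"
orf1ng = "GNLNLDNATITLNSAYRHDAAGAQTGSAADAPRRRSR---RSLLSVTPPTSAESRFNTLT"

identical, columns, pct = percent_identity(orf1_1, orf1ng)
print(identical, columns, round(pct, 1))  # 55 60 91.7
```

The two substitutions (T/A and V/A) and the three-residue gap account for the five non-identical columns in this block.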
In addition, ORF1ng and the hap protein (P45387) show 55.7% identity over a 1455-amino-acid overlap:
SCORES Init1: 1104 Initn: 4632 Opt: 2680
Smith-Waterman score: 5165; 55.7% identity in 1455 aa overlap
10 20 30 40 50 60
orf1ng-1.pep MKTTDKRTTETHRKAPKTGRIRFSPAYLAICLSFGILPQARAGHTYFGINYQYYRDFAEN
| :|: |:|:||: || ||||||||:||||||||||
p45387 MKKTVFRLNFLTACISLGIVSQAWAGHTYFGIDYQYYRDFAEN
10 20 30 40
70 80 90 100 110 120orf1ng-1.pep KGKFAVGAKDIEVYNKKGELVGKSMTKAPMIDFSVVSRNGVAALAGDQYIVSVAHNGGYN
||||:|||::|:||||:|:||| |||||||||||||||||||||: :||||||||| ||:p45387 KGKFTVGAQNIKVYNKQGQLVGTSMTKAPMIDFSVVSRNGVAALVENQYIVSVAHNVGYT
50 60 70 80 90 100
130 140 150 160 170 180orf1ng-1.pep NVDFGAEGSNPDQHRFSYQIVKRNNYKAGTNGHPYGGDYHMPRLHKFVTDAEPVEMTSYM
:|||||||:|||||||:|:|||||||| | ||| ||| ||||||||:| |::||| |p45387 DVDFGAEGNNPDQHRFTYKIVKRNNYKKD-NLHPYEDDYHNPRLHKFVTEAAPIDMTSNM
110 120 130 140 150 160
190 200 210 220 230 240orf1ng-1.pep DGWKYADLNKYPDRVRIGAGRQYWRSDEDEPNNRESSYHIASAYSWLVGGNTFAQNGSGG
:| |:| :|||:|||||:|||:||:|:|: : ::|:|| :|::||| | |:|:p45387 NGSTYSDRTKYPERVRIGSGRQFWRNDQDKGD------QVAGAYHYLTAGNTHNQRGAGN
170 180 190 200 210
250 260 270 280 290 300orf1ng-1.pep GTVNLGSEKIKHSPYGFLPTGGSFGDSGSPMFIYDAQKQKWLINGVLQTGNPYIGKSNGF
| ||:: | : || || :|| ||||||||||||:||||||||:|: |||: || |||p45387 GYSYLGGDVRKAGEYGPLPIAGSKGDSGSPMFIYDAEKQKWLINGILREGNPFEGKENGF
220 230 240 250 260 270
310 320 330 340 350 360orf1ng-1.pep QLVRKDWFYDEIFAGDTHSVFYEPHQNGKYFFNDNNNGAGKIDAKHKHYSLPYRLKTRTV
|||||::| |||| | |: :| || | :: |:|| |:| | ::| ::| :p45387 QLVRKSYF-DEIFERDLHTSLYTRAGNGVYTISGNDNGQGSITQKS---GIPSEIK---I
280 290 300 310 320
370 380 390 400 410 419orf1ng-1.pep QLFNVSLSETAREPVYHAA-GGVNSYRPRLNNGENISFIDKGKGELILTSNINQGAGGLY
| |:|| :: |:: | | | |||||||:: |:|: :| ||::|:|||||||||p45387 TLANMSLPLKEKDKVHNPRYDGPNIYSPRLNNGETLYFMDQKQGSLIFASDINQGAGGLY
330 340 350 360 370 380
420 430 440 450 460 470 479orf1ng-1.pep FEGNFTVSPKNNETWQGAGVHISDGSTVTWKVNGVANDRLSKIGKGTLLVQAKGENQGSV
|||||||||::|:||||||:|:|::|||||||||| :||||||||||| |||||||:||:p45387 FEGNFTVSPNSNQTWQGAGIHVSENSTVTWKVNGVEHDRLSKIGKGTLHVQAKGENKGSI
390 400 410 420 430 440
480 490 500 510 520 530 539orf1ng-1.pep SVGDGKVILDQQADDQGKKQAFSEIGLVSGRGTVQLNADNQFNPDKLYFGFRGGRLDLNG
|||||||||:|||||||:||||||||||||||||||| |:||: ||:|||||||||||||p45387 SVGDGKVILEQQADDQGNKQAFSEIGLVSGRGTVQLNDDKQFDTDKFYFGFRGGRLDLNG
450 460 470 480 490 500
540 550 560 570 580 590orf1ng-1.pep HSLSFHRIQNTDEGAMIVNHNQDKESTVTITGNKDITT-TGNN-NNLDSKKEIAYNGWFG
|||:|:||||||||||||||| : ::||||||::|: :||| |:|| :||||||||||p45387 HSLTFKRIQNTDEGAMIVNHNTTQAANVTITGNESIVLPNGNNINKLDYRKEIAYNGWFG
510 520 530 540 550 560
600 610 620 630 640 650orf1ng-1.pep EKDATKTNGRLNLNYQPEEADRTLLLSGGTNLNGNITQTNGKLFFSGRPTPHAYNHLGSG
| | :| |||||| |:| ||||||||||||:|:||||:|||||||||||||||||::p45387 ETDKNKHNGRLNLIYKPTTEDRTLLLSGGTNLKGDITQTKGKLFFSGRPTPHAYNHLNKR
570 580 590 600 610 620
660 670 680 690 700 710orf1ng-1.pep WSKMEGIPQGEIVWDNDWIDRTFKAENFHIQGGQAVVSRNVAKVEGDWHLSNHAQAVFGV
||:||||||||||||:|||:||||||||:|:||||||||||:::||:| :||:|:|:|||p45387 WSEMEGIPQGEIVWDHDWINRTFKAENFQIKGGSAVVSRNVSSIEGNWTVSNNANATFGV
630 640 650 660 670 680
720 730 740 750 760 770orf1ng-1.pep APHQSHTICTRSDWTGLTSCTEKTITDDKVIASLSKTDIRGNVSLADHAHLNLTGLATLN
:|:|::||||||||||||:| : :|| ||| |: ||:| |:::|:|:| |: ||| ||p45387 VPNQQNTICTRSDWTGLTTCQKVDLTDTKVINSIPKTQINGSINLTDNATANVKGLAKLN
690 700 710 720 730 740
780 790 800 810 820 830orf1ng-1.pep GNLSAGGDTHYTVTRNATQNGNLSLVGNAQATFNQATLNGNTSASDNASFNLSNNAVQNG
||:: :::::|:|||||:| |p45387 GNVTL---------------------------------------TNHSQFTLSNNATQIG
750 760 770
840 850 860 870 880 890orf1ng-1.pep SLTLSDNAKANVSHSALNGNVSLADKAVFHFENSRFTGKISGGKDTALHLKDSEWTLPSG
:: ||||: |:|::: ||||| |:|:| | ::||:|: :|:| | |:: |::: ||:||p45387 NIRLSDNSTATVDNANLNGNVHLTDSAQFSLKNSHFSHQIQGDKGTTVTLENATWTMPSD
780 790 800 810 820 830
900 910 920 930 940 950orf1ng-1.pep TELGNLNLDNATITLNSAYRHDAAGAQTGSAADAPRRRSRRSLLSVTPPTSAESRFNTLT
| | ||:|:|:||||||||| ::|: ::||||| | : | ||||| ||||||p45387 TTLQNLTLNNSTITLNSAY--------SASSNNTPRRRS---LETETTPTSAEHRFNTLT
840 850 860 870
960 970 980 990 1000 1010orf1ng-1.pep VNGKLNGQGTFRFMSELFGYRSGKLKLAESSEGTYTLAVNNTGNEPVSLEQLTVVEGKDN
|||||:|||||:| | ||||:| ||||::::|| | |:| |||:|| :|||||:||:|||p45387 VNGKLSGQGTFQFTSSLFGYKSDKLKLSNDAEGDYILSVRNTGKEPETLEQLTLVESKDN
880 890 900 910 920 930
1020 1030 1040 1050 1060 1070orf1ng-1.pep TPLSENLNFTLQNEHVDAGAWRYQLIRKDGEFRLHNPVKEQELSDKLGKAGETEAALTAK
|||::|:|||:|:|||||| ||:|:::|||||||||:||||| : | :| ::| :| ||p45387 QPLSDKLKFTLENDHVDAGALRYKLVKNDGEFRLHNPIKEQELHNDLVRAEQAERTLEAK
940 950 960 970 980 990
1080 1090 1100 1110 1120 1130orf1ng-1.pep QAQLAAKQQAEKDNAQSLDALIAAGRNAT-EKAESVAEPARQAGGENAGIMQAEEEKKRV
|:: :|| |: : ::::| | || :: ::: | |:|| :| :::: : |:|p45387 QVEPTAKTQTGEPKVRSRRAARAAFPDTLPDQSLLNALEAKQAE-LTAETQKSKAKTKKV
1000 1010 1020 1030 1040 1050
1140 1150 1160 1170 1180 1190orf1ng-1.pep QADK---DTALAKQREAETRPATTAFPRARRARRD-LPQPQpQPQPQPQRDLISRYANSG
:: : : | | : | :: :::::| | | : : | : |:||||||:||:p45387 RSKRAVFSDPLLDQSLFALEAALEVIDAPQQSEKDRLAQEEAEKQ-RKQKDLISRYSNSA
1060 1070 1080 1090 1100 1110
1200 1210 1220 1230 1240 1250orf1ng-1.pep LSEFSATLNSVFAVQDELDRVFAEDRRNAVWTSGIRDTKHYRSQDFRAYRQQ-TDLRQIG
|||:|||:||:::|||||||:|::: ::||||: :| ::| |: ||||:|| |:|||||p45387 LSELSATVNSMLSVQDELDRLFVDQAQSAVWTNIAQDKRRYDSDAFRAYQQQKTNLRQIG
1120 1130 1140 1150 1160 1170
1260 1270 1280 1290 1300 1310orf1ng-1.pep MQKNLGSGRVGILFSHNRTGNTFDDGIGNSARLAHGAVFGQYGIGRFDIGISAGAGFSSG
:|| |::||:| :|||:|: ||||: : | | |: : |:|| | :::|:::|:|:|::p45387 VQKALANGRIGAVFSHSRSDNTFDEQVKNHATLTMMSGFAQYQWGDLQFGVNVGTGISAS
1180 1190 1200 1210 1220 1230
1320 1330 1340 1350 1360 1370orf1ng-1.pep SLSDGIRGKIRRRVLHYGIQARYRAGFGGFGIEPHIGATRYFVQKADYRYENVNIATPGL
:::: ||:|::::||::| |: :| :||:|::|::|||::: :|: |:| : ||:|p45387 KMAEEQSRKIHRKAINYGVNASYQFRLGQLGIQPYFGVNRYFIERENYQSEEVRVKTPSL
1240 1250 1260 1270 1280 1290
1380 1390 1400 1410 1420 1430orf1ng-1.pep AFNRYRAGIKADYSFKPAQHISITPYLSLSYTDAASGKVRTRVNTAVLAQDFGKTRSAEW
||||| |||::||:| |:::||: ||: ::|:|:::::|:| || :|| | ||: : |p45387 AFNRYNAGIRVDYTFTPTDNISVKPYFFVNYVDVSNANVQTTVNLTVLQQPFGRYWQKEV
1300 1310 1320 1330 1340 1350
1440 1450 1460 1469orf1ng-1.pep GVNAEIKGFTLSLHAAAAKGPQLEAQHSAGIKLGYRWX
|::||| | :| : ::| || |:::|:||||||p45387 GLKAEILHFQISAFISKSQGSQLGKQQNVGVKLGYVRW
1360 1370 1380 1390
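The Smith-Waterman score quoted for the comparison above is a local-alignment score. As a rough illustration only, the following minimal Python sketch computes such a score with a flat match/mismatch/gap scheme. The function name and the scoring values are assumptions made for this example; they are not the substitution matrix or gap penalties behind the scores reported here, so the sketch will not reproduce those numbers. The two fragments are taken from the start of the conserved region of the ORF1ng and hap (P45387) sequences shown above.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local-alignment score of a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # A cell never drops below 0, which is what makes the alignment local.
            cur[j] = max(0, prev[j - 1] + sub, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best


# Fragments copied from the ORF1ng / hap alignment (illustrative only).
orf1ng_fragment = "GINYQYYRDFAEN"
hap_fragment = "GIDYQYYRDFAEN"
score = smith_waterman_score(orf1ng_fragment, hap_fragment)
print(score)
```

With this toy scheme the two 13-residue fragments, which differ at a single position, align end to end: twelve matches and one mismatch.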
Based on this analysis, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 78
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 597>:
1 ..AAGGTGTGGC AATTTGTCGA AGA.CCGCTG CGTGCCGTCG TGCCTGCCGA
51 CAGTTTTGAA CCGACCGCGC AAAAATTGAA CCTGTTTAAG GCGGGTGCGG
101 CAACCATTTT GTTTTATGAA GATCAAAATG TCGTCAAAGG TTTGCAGGAG
151 CAGTTCCCTG CTTATGCCGC TAACTTCCCC GTTTGGGCGg ATCAGGCAAA
201 CGCGATGGTG CAGTATGCCG TTTGGACGAC ACTTGCCGCG GTCGGCGTAG
251 GTGCAAACCT GCAACATTAC AATCCCTTGC CCGATGCGGC GATTGCCAAA
301 GCGTGGAATA TCCCCGAAAA CTGGTTGTTG CGCGCACAAA TGGTTATCGG
351 CGGTATTGAA GGGGCGGCAG GTGAAAAGAC CTTTGAACCC GTTGCAGAAC
401 GTTTGAAAGT GTTCGGCGCA TAA
This corresponds to the amino acid sequence <SEQ ID 598; ORF6>:
1 ..KVWQFVEXPL RAVVPADSFE PTAQKLNLFK AGAATILFYE DQNVVKGLQE
51 QFPAYAANFP VWADQANAMV QYAVWTTLAA VGVGANLQHY NPLPDAAIAK
101 AWNIPENWLL RAQMVIGGIE GAAGEKTFEP VAERLKVFGA *
Further sequence analysis revealed the following partial DNA sequence <SEQ ID 599>:
1 ..CTGCGTGCCG TCGTGCCTGC CGACAGTTTT GAACCGACCG CGCAAAAATT
51 GAACCTGTTT AAGGCGGGTG CGGCAACCAT TTTGTTTTAT GAAGATCAAA
101 ATGTCGTCAA AGGTTTGCAG GAGCAGTTCC CTGCTTATGC CGCTAACTTC
151 CCCGTTTGGG CGGATCAGGC AAACGCGATG GTGCAGTATG CCGTTTGGAC
201 GACACTTGCC GCGGTCGGCG TAGGTGCAAA CCTGCAACAT TACAATCCCT
251 TGCCCGATGC GGCGATTGCC AAAGCGTGGA ATATCCCCGA AAACTGGTTG
301 TTGCGCGCAC AAATGGTTAT CGGCGGTATT GAAGGGGCGG CAGGTGAAAA
351 GACCTTTGAA CCCGTTGCAG AACGTTTGAA AGTGTTCGGC GCATAA
This corresponds to the amino acid sequence <SEQ ID 600; ORF6-1>:
1 ..LRAVVPADSF EPTAQKLNLF KAGAATILFY EDQNVVKGLQ EQFPAYAANF
51 PVWADQANAM VQYAVWTTLA AVGVGANLQH YNPLPDAAIA KAWNIPENWL
101 LRAQMVIGGI EGAAGEKTFE PVAERLKVFG A*
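Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF6 shows 98.6% identity over a 140 aa overlap with an ORF (ORF6a) from strain A of N. meningitidis: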
10 20 30orf6.pep KVWQFVEXPLRAVVPADSFEPTAQKLNLFK
||||||| |||||||||||||||||||||orf6a QIVEHAVLHTPSSFNSQSARVVVLFGEEHDKVWQFVEDALRAVVPADSFEPTAQKLNLFK
40 50 60 70 80 90
40 50 60 70 80 90orf6.pep AGAATILFYEDQNVVKGLQEQFPAYAANFPVWADQANAMVQYAVWTTLAAVGVGANLQHY
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf6a AGAATILFYEDQNVVKGLQEQFPAYAANFPVWADQANAMVQYAVWTTLAAVGVGANLQHY
100 110 120 130 140 150
100 110 120 130 140orf6.pep NPLPDAAIAKAWNIPENWLLRAQMVIGGIEGAAGEKTFEPVAERLKVFGAX
|||||||||||||||||||||||||||||||||||||||||||||||||||orf6a NPLPDAAIAKAWNIPENWLLRAQMVIGGIEGAAGEKTFEPVAERLKVFGAX
160 170 180 190 200
The complete length ORF6a nucleotide sequence <SEQ ID 601> is:
1 ATGACCCGTC AATCTCTGCA ACAGGCTGCC GAAAGCCGCC GTTCCATTTA
51 TTCGTTAAAT AAAAATCTGC CCGTCGGCAA AGATGAAATC GTCCAAATCG
101 TCGAACACGC CGTTTTGCAC ACACCTTCTT CGTTCAATTC CCAATCTGCC
151 CGTGTGGTCG TGCTGTTTGG CGAAGAGCAT GATAAGGTGT GGCAATTTGT
201 CGAAGACGCG CTGCGTGCCG TCGTGCCTGC CGACAGTTTT GAACCGACCG
251 CGCAAAAATT GAACCTGTTT AAGGCGGGTG CGGCAACTAT TTTGTTTTAT
301 GAAGATCAAA ATGTCGTCAA AGGTTTGCAG GAGCAGTTCC CTGCTTATGC
351 CGCCAACTTT CCCGTTTGGG CGGACCAGGC GAACGCGATG GTGCAGTATG
401 CCGTTTGGAC GACACTTGCC GCGGTCGGCG TAGGTGCAAA CCTGCAACAT
451 TACAATCCCT TGCCCGATGC GGCGATTGCC AAAGCGTGGA ATATCCCCGA
501 AAACTGGTTG TTGCGCGCAC AAATGGTTAT CGGCGGTATT GAAGGGGCGG
551 CAGGTGAAAA GACCTTTGAA CCAGTTGCAG AACGTTTGAA AGTGTTCGGC
601 GCATAA
This is predicted to encode a protein having the amino acid sequence <SEQ ID 602>:
1 MTRQSLQQAA ESRRSIYSLN KNLPVGKDEI VQIVEHAVLH TPSSFNSQSA
51 RVVVLFGEEH DKVWQFVEDA LRAVVPADSF EPTAQKLNLF KAGAATILFY
101 EDQNVVKGLQ EQFPAYAANF PVWADQANAM VQYAVWTTLA AVGVGANLQH
151 YNPLPDAAIA KAWNIPENWL LRAQMVIGGI EGAAGEKTFE PVAERLKVFG
201 A*
ORF6a and ORF6-1 show 100.0% identity over a 131 aa overlap:
50 60 70 80 90 100orf6a.pep TPSSFNSQSARVVVLFGEEHDKVWQFVEDALRAVVPADSFEPTAQKLNLFKAGAATILFY
||||||||||||||||||||||||||||||orf6-1 LRAVVPADSFEPTAQKLNLFKAGAATILFY
10 20 30
110 120 130 140 150 160orf6a.pep EDQNVVKGLQEQFPAYAANFPVWADQANAMVQYAVWTTLAAVGVGANLQHYNPLPDAAIA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf6-1 EDQNVVKGLQEQFPAYAANFPVWADQANAMVQYAVWTTLAAVGVGANLQHYNPLPDAAIA
40 50 60 70 80 90
170 180 190 200
orf6a.pep KAWNIPENWLLRAQMVIGGIEGAAGEKTFEPVAERLKVFGAX
||||||||||||||||||||||||||||||||||||||||||
orf6-1 KAWNIPENWLLRAQMVIGGIEGAAGEKTFEPVAERLKVFGAX
100 110 120 130
Homology with a predicted ORF from N. gonorrhoeae
ORF6 shows 95.7% identity over a 140 aa overlap with a predicted ORF (ORF6ng) from N. gonorrhoeae:
orf6.pep KVWQFVEXPLRAVVPADSFEPTAQKLNLFK 30
||||||| |||||||||||||||||:|||
orf6ng SNVSLDMSNPTVLRMGLPLYIASLRRGAIYKVWQFVEDALRAVVPADSFEPTAQKLKLFK 64
orf6.pep AGAATILFYEDQNVVKGLQEQFPAYAANFPVWADQANAMVQYAVWTTLAAVGVGANLQHY 90
||||||||||||||||||||||||||||||||||||||||||||||||||||:|||||||
orf6ng AGAATILFYEDQNVVKGLQEQFPAYAANFPVWADQANAMVQYAVWTTLAAVGAGANLQHY 124
orf6.pep NPLPDAAIAKAWNIPENWLLRAQMVIGGIEGAAGEKTFEPVAERLKVFGA 140
|||||:||||||||||||||||||||||||||||||:|||||||||||||
orf6ng NPLPDVAIAKAWNIPENWLLRAQMVIGGIEGAAGEKVFEPVAERLKVFGA 174
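The identity figures quoted for these comparisons can be checked directly from the aligned rows. The following is a minimal Python sketch, assuming that identity is the number of columns with the same residue divided by the number of aligned columns, and that an undetermined residue (X) or a gap (-) never counts as a match. The function name is hypothetical and is not part of any alignment program.

```python
def percent_identity(seq_a, seq_b):
    """Return (matches, aligned_length, percent) for two equal-length aligned rows."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # Assumption: 'X' (undetermined residue) and '-' (gap) are never identities.
    matches = sum(1 for x, y in zip(seq_a, seq_b) if x == y and x not in "-X")
    return matches, len(seq_a), 100 * matches / len(seq_a)


# Last block of the ORF6 / ORF6ng alignment shown above (residues 91-140 of ORF6).
orf6_block = "NPLPDAAIAKAWNIPENWLLRAQMVIGGIEGAAGEKTFEPVAERLKVFGA"
orf6ng_block = "NPLPDVAIAKAWNIPENWLLRAQMVIGGIEGAAGEKVFEPVAERLKVFGA"
matches, length, pct = percent_identity(orf6_block, orf6ng_block)
print(matches, length, round(pct, 1))

# Over the whole 140 aa overlap the alignment shows six non-identical positions.
overall = round(100 * (140 - 6) / 140, 1)
print(overall)
```

Six non-identical positions in 140 aligned residues give 134/140, which rounds to the 95.7% stated for ORF6 and ORF6ng.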
The complete length ORF6ng nucleotide sequence <SEQ ID 603> was identified:
1 ATGGCCGTTG CGTCAAATGT CAGCTTGGAT ATGTCCAATC CTACGGTGTT
51 ACGCATGGGA TTACCCTTAT ATATTGCGTC CCTAAGAAGG GGCGCAATAT
101 ATAAGGTGTG GCAATTTGTC GAAGACGCGC TGCGTGCCGT CGTGCCTGCC
151 GACAGTTTTG AACCGACCGC GCAAAAATTG AAGCTGTTTA AGGCGGGCGC
201 GGCAACCATT TTGTTTTATG AAGATCAAAA TGTCGTCAAA GGTTTGCAGG
251 AGCAGTTCCC TGCTTATGCC GCCAACTTTC CCGTTTGGGC GGACCAGGCG
301 AACGCTATGG TACAGTATGC CGTCTGGACG ACACTTGCCG CGGTCGGTGC
351 AGGTGCAAAT CTGCAACATT ACAACCCCTT GCCCGATGTG GCGATTGCTA
401 AAGCGTGGAA TATTCCCGAA AACTGGCTGT TGCGCGCGCA AATGGTTATC
451 GGTGGTATTG AAGGGGcggc aggtgaaaaa gtctttgaac CCGTTGCgga
501 acgtttgAAA GTGTTCGGCG CATAA
This encodes a protein having the amino acid sequence <SEQ ID 604>:
1 MAVASNVSLD MSNPTVLRMG LPLYIASLRR GAIYKVWQFV EDALRAVVPA
51 DSFEPTAQKL KLFKAGAATI LFYEDQNVVK GLQEQFPAYA ANFPVWADQA
101 NAMVQYAVWT TLAAVGAGAN LQHYNPLPDV AIAKAWNIPE NWLLRAQMVI
151 GGIEGAAGEK VFEPVAERLK VFGA*
ORF6ng and ORF6-1 show 96.9% identity over a 131 aa overlap:
10 20 30orf6-1.pep LRAVVPADSFEPTAQKLNLFKAGAATILFY
|||||||||||||||||:||||||||||||orf6ng PTVLRMGLPLYIASLRRGAIYKVWQFVEDALRAVVPADSFEPTAQKLKLFKAGAATILFY
20 30 40 50 60 70
40 50 60 70 80 90orf6-1.pep EDQNVVKGLQEQFPAYAANFPVWADQANAMVQYAVWTTLAAVGVGANLQHYNPLPDAAIA
|||||||||||||||||||||||||||||||||||||||||||:||||||||||||:|||orf6ng EDQNVVKGLQEQFPAYAANFPVWADQANAMVQYAVWTTLAAVGAGANLQHYNPLPDVAIA
80 90 100 110 120 130
100 110 120 130orf6-1.pep KAWNIPENWLLRAQMVIGGIEGAAGEKTFEPVAERLKVFGAX
|||||||||||||||||||||||||||:||||||||||||||orf6ng KAWNIPENWLLRAQMVIGGIEGAAGEKVFEPVAERLKVFGAX
140 150 160 170
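It is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.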
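The correspondence between the partial DNA sequences of this example and their amino acid sequences can be illustrated with a short translation routine. The sketch below is a minimal Python example that assumes the standard genetic code and reading frame 1; the names are chosen for illustration only. Any codon containing a character other than A, C, G or T (such as the undetermined base written "." in SEQ ID 597) is reported as X, which matches how undetermined residues are written in the listings here.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate DNA in frame 1; spaces are ignored, unknown codons become 'X'."""
    dna = "".join(dna.split()).upper()
    protein = []
    # Stop before any trailing partial codon.
    for pos in range(0, len(dna) - len(dna) % 3, 3):
        protein.append(CODON_TABLE.get(dna[pos:pos + 3], "X"))
    return "".join(protein)


# First 30 bases of SEQ ID 597 and of SEQ ID 599, as printed in the listings.
orf6_start = translate("AAGGTGTGGC AATTTGTCGA AGA.CCGCTG")
orf6_1_start = translate("CTGCGTGCCG TCGTGCCTGC CGACAGTTTT")
print(orf6_start)    # KVWQFVEXPL
print(orf6_1_start)  # LRAVVPADSF
```

The two results agree with the first ten residues of SEQ ID 598 (ORF6) and SEQ ID 600 (ORF6-1). Stop codons are written as "*", as in the listings.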
Example 79
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 605>:
1 ..GGCTACAACT ACCTGTTCGC GCGCGGCAGC CGCATCGCCA ACTACCAAAT
51 CAACGGCATC CCCGTTGCCG ACGCGCTGGC CGATACGGGt CAATGCCAAC
101 ACCGCCGCCT ATGAGCGCGT AGAAGTCGTG CGCGGCGTGG CGGGGCTGCT
151 GGACGGCACG GGCGAGCCTT CCGCCACCGT CAATCTGGTG CGCAAACGCC
201 TGACCCGCAA GCCATTGTTT GAAGTCCGCG CCGAAGCgGG CAACCGcAAA
251 CATTTCGGGC TGGACGCGGA CGTATCGGGC AGCCTGAACA CCGAAG.crC
301 rCTGCGCgGC CGCCTGGTTT CCAcCTTCGG ACGCGGCGAC TCGTGGCGGC
351 GGCGCGAACG CAGCCGskAT GCCGAACTCT ACGGCATTTT GGAATACGAC
401 ATCGCACCGC AAACCCGCGT CCACGCArGC ATGGACTACC AGCAGGCGAA
451 AGAAACCGCC GACGCGCCGC TCAGcTACGC CGTGTACGAC AGCCAAGGTT
501 ATGCCACCGC CTTCGGCCCG AAAGACAACC CCGCCACAAA TTGGGCGAAC
551 AGCCACCACC GTGCGCTCAA CCTGTTCGCC GGCATCGAAC ACCGCTTCAA
601 CCAAGACTGG AAACTCAAAG CCGAATACGA CTAC..
This corresponds to the amino acid sequence <SEQ ID 606; ORF23>:
1 ..GYNYLFARGS RIANYQINGI PVADALADTG NANTAAYERV EVVRGVAGLL
51 DGTGEPSATV NLVRKRLTRK PLFEVRAEAG NRKHFGLDAD VSGSLNTEXX
101 LRGRLVSTFG RGDSWRRRER SRXAELYGIL EYDIAPQTRV HAXMDYQQAK
151 ETADAPLSYA VYDSQGYATA FGPKDNPATN WANSHHRALN LFAGIEHRFN
201 QDWKLKAEYD Y..
Further work revealed the complete nucleotide sequence <SEQ ID 607>:
1 ATGACACGCT TCAAATATTC CCTGCTGTTT GCCGCCCTGT TGCCCGTGTA
51 CGCGCAGGCC GATGTTTCTG TTTCAGACGA CCCCAAACCG CAGGAAAGCA
101 CTGAATTGCC GACCATCACC GTTACCGCCG ACCGCACCGC GAGTTCCAAC
151 GACGGCTACA CTGTTTCCGG CACGCACACC CCGCTCGGGC TGCCCATGAC
201 CCTGCGCGAA ATCCCGCAGA GCGTCAGCGT CATCACATCG CAACAAATGC
251 GCGACCAAAA CATCAAAACG CTCGACCGCG CCCTGTTGCA GGCGACCGGC
301 ACCAGCCGCC AGATTTACGG CTCCGACCGC GCGGGCTACA ACTACCTGTT
351 CGCGCGCGGC AGCCGCATCG CCAACTACCA AATCAACGGC ATCCCCGTTG
401 CCGACGCGCT GGCCGATACG GGCAATGCCA ACACCGCCGC CTATGAGCGC
451 GTAGAAGTCG TGCGCGGCGT GGCGGGGCTG CTGGACGGCA CGGGCGAGCC
501 TTCCGCCACC GTCAATCTGG TGCGCAAACG CCTGACCCGC AAGCCATTGT
551 TTGAAGTCCG CGCCGAAGCG GGCAACCGCA AACATTTCGG GCTGGACGCG
601 GACGTATCGG GCAGCCTGAA CACCGAAGGC ACGCTGCGCG GCCGCCTGGT
651 TTCCACCTTC GGACGCGGCG ACTCGTGGCG GCGGCGCGAA CGCAGCCGCG
701 ATGCCGAACT CTACGGCATT TTGGAATACG ACATCGCACC GCAAACCCGC
751 GTCCACGCAG GCATGGACTA CCAGCAGGCG AAAGAAACCG CCGACGCGCC
801 GCTCAGCTAC GCCGTGTACG ACAGCCAAGG TTATGCCACC GCCTTCGGCC
851 CGAAAGACAA CCCCGCCACA AATTGGGCGA ACAGCCGCCA CCGTGCGCTC
901 AACCTGTTCG CCGGCATCGA ACACCGCTTC AACCAAGACT GGAAACTCAA
951 AGCCGAATAC GACTACACCC GCAGCCGCTT CCGCCAGCCC TACGGCGTAG
1001 CAGGCGTGCT TTCCATCGAC CACAACACCG CCGCCACCGA CCTGATTCCC
1051 GGTTATTGGC ACGCCGACCC GCGCACCCAC AGCGCCAGCG TGTCATTGAT
1101 CGGCAAATAC CGCCTGTTCG GCCGCGAACA CGATTTAATC GCGGGTATCA
1151 ACGGTTACAA ATACGCCAGC AACAAATACG GCGAACGCAG CATCATCCCC
1201 AACGCCATTC CCAACGCCTA CGAATTTTCC CGCACGGGTG CCTACCCGCA
1251 GCCTGCATCG TTTGCCCAAA CCATCCCGCA ATACGGCACC AGGCGGCAAA
1301 TCGGCGGCTA TCTCGCCACC CGTTTCCGCG CCGCCGACAA CCTTTCGCTG
1351 ATTTTGGGCG GACGATACAC CCGTTACCGC ACCGGCAGCT ACGACAGCCG
1401 CACACAAGGC ATGACCTATG TGTCCGCCAA CCGTTTCACC CCCTACACAG
1451 GCATCGTGTT CGACCTGACC GGCAACCTGT CTCTTTACGG CTCGTACAGC
1501 AGCCTGTTCG TCCCGCAATC GCAAAAAGAC GAACACGGCA GCTACCTGAA
1551 ACCCGTAACC GGCAACAATC TGGAAGCCGG CATCAAAGGC GAATGGCTTG
1601 AAGGCCGTCT GAACGCATCC GCCGCCGTGT ACCGCGCCCG TAAAAACAAC
1651 CTCGCCACCG CAGCAGGACG CGACCCGAGC GGCAACACCT ACTACCGCGC
1701 CGCCAACCAA GCCAAAACCC ACGGCTGGGA AATCGAAGTC GGCGGCCGCA
1751 TCACGCCCGA ATGGCAGATA CAGGCAGGTT ACAGCCAAAG CAAAACCCGC
1801 GACCAAGACG GCAGCCGCCT GAACCCCGAC AGCGTACCCG AACGCAGCTT
1851 CAAACTCTTC ACTGCCTACC ACTTTGCCCC CGAAGCCCCC AGCGGCTGGA
1901 CCATCGGCGC AGGCGTGCGC TGGCAGAGCG AAACCCACAC CGACCCTGCC
1951 ACGCTCCGCA TCCCCAACCC CGCCGCCAAA GCCCGCGCCG CCGACAACAG
2001 CCGCCAAAAA GCCTACGCCG TCGCCGACAT CATGGCGCGT TACCGCTTCA
2051 ATCCGCGCGC CGAACTGTCG CTGAACGTGG ACAATCTGTT CAACAAACAC
2101 TACCGCACCC AGCCCGACCG CCACAGCTAC GGCGCACTGC GGACAGTGAA
2151 CGCGGCGTTT ACCTATCGGT TTAAATAA
This corresponds to the amino acid sequence <SEQ ID 608; ORF23-1>:
1 MTRFKYSLLF AALLPVYAQA DVSVSDDPKP QESTELPTIT VTADRTASSN
51 DGYTVSGTHT PLGLPMTLRE IPQSVSVITS QQMRDQNIKT LDRALLQATG
101 TSRQIYGSDR AGYNYLFARG SRIANYQING IPVADALADT GNANTAAYER
151 VEVVRGVAGL LDGTGEPSAT VNLVRKRLTR KPLFEVRAEA GNRKHFGLDA
201 DVSGSLNTEG TLRGRLVSTF GRGDSWRRRE RSRDAELYGI LEYDIAPQTR
251 VHAGMDYQQA KETADAPLSY AVYDSQGYAT AFGPKDNPAT NWANSRHRAL
301 NLFAGIEHRF NQDWKLKAEY DYTRSRFRQP YGVAGVLSID HNTAATDLIP
351 GYWHADPRTH SASVSLIGKY RLFGREHDLI AGINGYKYAS NKYGERSIIP
401 NAIPNAYEFS RTGAYPQPAS FAQTIPQYGT RRQIGGYLAT RFRAADNLSL
451 ILGGRYTRYR TGSYDSRTQG MTYVSANRFT PYTGIVFDLT GNLSLYGSYS
501 SLFVPQSQKD EHGSYLKPVT GNNLEAGIKG EWLEGRLNAS AAVYRARKNN
551 LATAAGRDPS GNTYYRAANQ AKTHGWEIEV GGRITPEWQI QAGYSQSKTR
601 DQDGSRLNPD SVPERSFKLF TAYHFAPEAP SGWTIGAGVR WQSETHTDPA
651 TLRIPNPAAK ARAADNSRQK AYAVADIMAR YRFNPRAELS LNVDNLFNKH
701 YRTQPDRHSY GALRTVNAAF TYRFK*
Computer analysis of this amino acid sequence gave the following results:
Homology with the ferric-pseudobactin receptor PupB of Pseudomonas putida (accession number P38047)
ORF23 and the PupB protein show 32% aa identity over a 205 aa overlap:
Orf23 6 FARGSRIANYQINGIPVADALADTGNANTAAYERVEVVRGVAGLLDGTGEPSATVNLVRK 65
++RG I NY+++G+P + L D + + A ++RVE+VRG GL+ G G PSAT+NL+RK
PupB 215 WSRGFAIQNYEVDGVPTSTRL-DNYSQSMAMFDRVEIVRGATGLISGMGNPSATINLIRK 273
Orf23 66 RLTRKPLFEVRAEAGNRKHFGLDADVSGSLNTEXXLRGRLVSTFXXXXXXXXXXXXXXAE 125
R T + + EAGN +G DVSG L +RGR V+ +
PupB 274 RPTAEAQASITGEAGNWDRYGTGFDVSGPLTETGNIRGRFVADYKTEKAWIDRYNQQSQL 333
Orf23 126 LYGILEYDIAPQTRVHAXMDYQQAKETADAPLSYAVYD--SQGYATAFGPKDNPATNWAN 183
+YGI E+D++ T + Y + D+PL + S G T N A +W+
PupB 334 MYGITEFDLSEDTLLTVGFSY--LRSDIDSPLRSGLPTRFSTGERTNLKRSLNAAPDWSY 391
Orf23 184 SHHRALNLFAGIEHRFNQDWKLKAE 208
+ H + F IE + W K E
PupB 392 NDHEQTSFFTSIEQQLGNGWSGKIE 416
Homology with a predicted ORF from N. meningitidis (strain A)
ORF23 shows 95.7% identity over a 211 aa overlap with an ORF (ORF23a) from strain A of N. meningitidis:
10 20 30orf23.pep GYNYLFARGSRIANYQINGIPVADALADTG
||||||||||||||||||||||||||||||orf23a QMRDQNIKALDRALLQATGTSRQIYGSDRAGYNYLFARGSRIANYQINGIPVADALADTG
90 100 110 120 130 140
40 50 60 70 80 90orf23.pep NANTAAYERVEVVRGVAGLLDGTGEPSATVNLVRKRLTRKPLFEVRAEAGNRKHFGLDAD
|||||||||||||||||||||||||||||||||||| |||||||||||||||||||| ||orf23a NANTAAYERVEVVRGVAGLLDGTGEPSATVNLVRKRPTRKPLFEVRAEAGNRKHFGLGAD
150 160 170 180 190 200
100 110 120 130 140 150orf23.pep VSGSLNTEXXLRGRLVSTFGRGDSWRRRERSRXAELYGILEYDIAPQTRVHAXMDYQQAK
||||||:| :|||||||||||||||||:|||| ||||||||||||||||||| |||||||orf23a VSGSLNAEGTLRGRLVSTFGRGDSWRQRERSRDAELYGILEYDIAPQTRVHAGMDYQQAK
210 220 230 240 250 260
160 170 180 190 200 210orf23.pep ETADAPLSYAVYDSQGYATAFGPKDNPATNWANSHHRALNLFAGIEHRFNQDWKLKAEYD
||||||||||||||||||||||||||||||||||:|||||||||||||||||||||||||orf23a ETADAPLSYAVYDSQGYATAFGPKDNPATNWANSRHRALNLFAGIEHRFNQDWKLKAEYD
270 280 290 300 310 320orf23.pep Y
|orf23a YTRSRFRQPYGVAGVLSIDHNTAATDLIPGYWHADPRTHSASVSLIGKYRLFGREHDLIA
330 340 350 360 370 380
The complete length ORF23a nucleotide sequence <SEQ ID 609> is:
1 ATGACACGCT TCAAATATTC CCTGCTGTTT GCCGCCCTGT TGCCCGTGTA
51 CGCGCAGGCC GATGTTTCTG TTTCAGACGA CCCAAAACCG CAGGAAAGCA
101 CTGAATTGCC GACCATCACC GTTACCGCCG ACCGCACCGC GAGTTCCAAC
151 GACGGCTACA CTGTTTCCGG CACGCACACC CCGCTCGGGC TGCCCATGAC
201 CCTGCGCGAA ATCCCGCAGA GCGTCAGCGT CATCACATCG CAACAAATGC
251 GCGACCAAAA CATCAAAGCG CTCGACCGCG CCCTGTTGCA GGCGACCGGC
301 ACCAGCCGCC AGATTTACGG CTCCGACCGC GCGGGCTACA ACTACCTGTT
351 CGCGCGCGGC AGCCGCATCG CCAACTACCA AATCAACGGC ATCCCCGTTG
401 CCGACGCGCT GGCCGATACG GGCAATGCCA ACACCGCCGC CTATGAGCGC
451 GTAGAAGTCG TGCGCGGCGT GGCGGGGCTG CTGGACGGCA CGGGCGAGCC
501 TTCCGCCACC GTCAATCTGG TGCGCAAACG CCCGACCCGC AAGCCATTGT
551 TTGAAGTCCG CGCCGAAGCG GGCAACCGCA AACATTTCGG GCTGGGCGCG
601 GACGTATCGG GCAGCCTGAA TGCCGAAGGC ACGCTGCGCG GCCGCCTGGT
651 TTCCACCTTC GGACGCGGCG ACTCGTGGCG GCAGCGCGAA CGCAGCCGCG
701 ATGCCGAACT CTACGGCATT TTGGAATACG ACATCGCACC GCAAACCCGC
751 GTCCACGCAG GCATGGACTA CCAGCAGGCG AAAGAAACCG CCGACGCGCC
801 GCTCAGCTAC GCCGTGTACG ACAGCCAAGG TTATGCCACC GCCTTCGGCC
851 CGAAAGACAA CCCCGCCACA AATTGGGCGA ACAGCCGCCA CCGTGCGCTC
901 AACCTGTTCG CCGGCATCGA ACACCGCTTC AACCAAGACT GGAAACTCAA
951 AGCCGAATAC GACTACACCC GCAGCCGCTT CCGCCAGCCC TACGGCGTAG
1001 CAGGCGTGCT TTCCATCGAC CACAACACCG CCGCCACCGA CCTGATTCCC
1051 GGTTATTGGC ACGCCGACCC GCGCACCCAC AGCGCCAGCG TGTCATTAAT
1101 CGGCAAATAC CGCCTGTTCG GCCGCGAACA CGATTTAATC GCGGGTATCA
1151 ACGGTTACAA ATACGCCAGC AACAAATACG GCGAACGCAG CATCATCCCC
1201 AACGCCATTC CCAACGCCTA CGAATTTTCC CGCACGGGTG CCTACCCGCA
1251 GCCTGCATCG TTTGCCCAAA CCATCCCGCA ATACGGCACC AGGCGGCAAA
1301 TCGGCGGCTA TCTCGCCACC CGTTTCCGCG CCGCCGACAA CCTTTCGCTG
1351 ATACTCGGCG GCAGATACAG CCGTTACCGC ACCGGCAGCT ACGACAGCCG
1401 CACACAAGGC ATGACCTATG TGTCCGCCAA CCGTTTCACC CCCTACACAG
1451 GCATCGTGTT CGACCTGACC GGCAACCTGT CGCTTTACGG CTCGTACAGC
1501 AGCCTGTTCG TCCCGCAATC GCAAAAAGAC GAACACGGCA GCTACCTGAA
1551 ACCCGTAACC GGCAACAATC TGGAAGCCGG CATCAAAGGC GAATGGCTTG
1601 AAGGCCGTCT GAACGCATCC GCCGCCGTGT ACCGCGCCCG TAAAAACAAC
1651 CTCGCCACCG CAGCAGGACG CGACCCGAGC GGCAACACCT ACTACCGCGC
1701 CGCCAACCAA GCCAAAACCC ACGGCTGGGA AATCGAAGTC GGCGGCCGCA
1751 TCACGCCCGA ATGGCAGATA CAGGCAGGTT ACAGCCAAAG CAAAACCCGC
1801 GACCAAGACG GCAGCCGCCT GAACCCCGAC AGCGTACCCG AACGCAGCTT
1851 CAAACTCTTC ACTGCCTACC ACTTTGCCCC CGAAGCCCCC AGCGGCTGGA
1901 CCATCGGCGC AGGCGTGCGC TGGCAGAGCG AAACCCACAC CGACCCTGCC
1951 ACGCTCCGCA TCCCCAACCC CGCCGCCAAA GCCCGCGCCG CCGACAACAG
2001 CCGCCAAAAA GCCTACGCCG TCGCCGACAT CATGGCGCGT TACCGCTTCA
2051 ATCCGCGCGC CGAACTGTCG CTGAACGTGG ACAATCTGTT CAACAAACAC
2101 TACCGCACCC AGCCCGACCG CCACAGCTAC GGCGCACTGC GGACAGTGAA
2151 CGCGGCGTTT ACCTATCGGT TTAAATAA
This encodes a protein having the amino acid sequence <SEQ ID 610>:
1 MTRFKYSLLF AALLPVYAQA DVSVSDDPKP QESTELPTIT VTADRTASSN
51 DGYTVSGTHT PLGLPMTLRE IPQSVSVITS QQMRDQNIKA LDRALLQATG
101 TSRQIYGSDR AGYNYLFARG SRIANYQING IPVADALADT GNANTAAYER
151 VEVVRGVAGL LDGTGEPSAT VNLVRKRPTR KPLFEVRAEA GNRKHFGLGA
201 DVSGSLNAEG TLRGRLVSTF GRGDSWRQRE RSRDAELYGI LEYDIAPQTR
251 VHAGMDYQQA KETADAPLSY AVYDSQGYAT AFGPKDNPAT NWANSRHRAL
301 NLFAGIEHRF NQDWKLKAEY DYTRSRFRQP YGVAGVLSID HNTAATDLIP
351 GYWHADPRTH SASVSLIGKY RLFGREHDLI AGINGYKYAS NKYGERSIIP
401 NAIPNAYEFS RTGAYPQPAS FAQTIPQYGT RRQIGGYLAT RFRAADNLSL
451 ILGGRYSRYR TGSYDSRTQG MTYVSANRFT PYTGIVFDLT GNLSLYGSYS
501 SLFVPQSQKD EHGSYLKPVT GNNLEAGIKG EWLEGRLNAS AAVYRARKNN
551 LATAAGRDPS GNTYYRAANQ AKTHGWEIEV GGRITPEWQI QAGYSQSKTR
601 DQDGSRLNPD SVPERSFKLF TAYHFAPEAP SGWTIGAGVR WQSETHTDPA
651 TLRIPNPAAK ARAADNSRQK AYAVADIMAR YRFNPRAELS LNVDNLFNKH
701 YRTQPDRHSY GALRTVNAAF TYRFK*
ORF23a and ORF23-1 show 99.2% identity over a 725 aa overlap:
10 20 30 40 50 60orf23a.pep MTRFKYSLLFAALLPVYAQADVSVSDDPKPQESTELPTITVTADRTASSNDGYTVSGTHT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf23-1 MTRFKYSLLFAALLPVYAQADVSVSDDPKPQESTELPTITVTADRTASSNDGYTVSGTHT
10 20 30 40 50 60
70 80 90 100 110 120orf23a.pep PLGLPMTLREIPQSVSVITSQQMRDQNIKALDRALLQATGTSRQIYGSDRAGYNYLFARG
|||||||||||||||||||||||||||||:||||||||||||||||||||||||||||||orf23-1 PLGLPMTLREIPQSVSVITSQQMRDQNIKTLDRALLQATGTSRQIYGSDRAGYNYLFARG
70 80 90 100 110 120
130 140 150 160 170 180orf23a.pep SRIANYQINGIPVADALADTGNANTAAYERVEVVRGVAGLLDGTGEPSATVNLVRKRPTR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||| ||orf23-1 SRIANYQINGIPVADALADTGNANTAAYERVEVVRGVAGLLDGTGEPSATVNLVRKRLTR
130 140 150 160 170 180
190 200 210 220 230 240orf23a.pep KPLFEVRAEAGNRKHFGLGADVSGSLNAEGTLRGRLVSTFGRGDSWRQRERSRDAELYGI
|||||||||||||||||| ||||||||:|||||||||||||||||||:||||||||||||orf23-1 KPLFEVRAEAGNRKHFGLDADVSGSLNTEGTLRGRLVSTFGRGDSWRRRERSRDAELYGI
190 200 210 220 230 240
250 260 270 280 290 300orf23a.pep LEYDIAPQTRVHAGMDYQQAKETADAPLSYAVYDSQGYATAFGPKDNPATNWANSRHRAL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf23-1 LEYDIAPQTRVHAGMDYQQAKETADAPLSYAVYDSQGYATAFGPKDNPATNWANSRHRAL
250 260 270 280 290 300
310 320 330 340 350 360
orf23a.pep NLFAGIEHRFNQDWKLKAEYDYTRSRFRQPYGVAGVLSIDHNTAATDLIPGYWHADPRTH
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf23-1 NLFAGIEHRFNQDWKLKAEYDYTRSRFRQPYGVAGVLSIDHNTAATDLIPGYWHADPRTH
310 320 330 340 350 360
370 380 390 400 410 420
orf23a.pep SASVSLIGKYRLFGREHDLIAGINGYKYASNKYGERSIIPNAIPNAYEFSRTGAYPQPAS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf23-1 SASVSLIGKYRLFGREHDLIAGINGYKYASNKYGERSIIPNAIPNAYEFSRTGAYPQPAS
370 380 390 400 410 420
430 440 450 460 470 480
orf23a.pep FAQTIPQYGTRRQIGGYLATRFRAADNLSLILGGRYSRYRTGSYDSRTQGMTYVSANRFT
||||||||||||||||||||||||||||||||||||:|||||||||||||||||||||||
orf23-1 FAQTIPQYGTRRQIGGYLATRFRAADNLSLILGGRYTRYRTGSYDSRTQGMTYVSANRFT
430 440 450 460 470 480
490 500 510 520 530 540
orf23a.pep PYTGIVFDLTGNLSLYGSYSSLFVPQSQKDEHGSYLKPVTGNNLEAGIKGEWLEGRLNAS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf23-1 PYTGIVFDLTGNLSLYGSYSSLFVPQSQKDEHGSYLKPVTGNNLEAGIKGEWLEGRLNAS
490 500 510 520 530 540
550 560 570 580 590 600
orf23a.pep AAVYRARKNNLATAAGRDPSGNTYYRAANQAKTHGWEIEVGGRITPEWQIQAGYSQSKTR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf23-1 AAVYRARKNNLATAAGRDPSGNTYYRAANQAKTHGWEIEVGGRITPEWQIQAGYSQSKTR
550 560 570 580 590 600
610 620 630 640 650 660
orf23a.pep DQDGSRLNPDSVPERSFKLFTAYHFAPEAPSGWTIGAGVRWQSETHTDPATLRIPNPAAK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf23-1 DQDGSRLNPDSVPERSFKLFTAYHFAPEAPSGWTIGAGVRWQSETHTDPATLRIPNPAAK
610 620 630 640 650 660
670 680 690 700 710 720
orf23a.pep ARAADNSRQKAYAVADIMARYRFNPRAELSLNVDNLFNKHYRTQPDRHSYGALRTVNAAF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf23-1 ARAADNSRQKAYAVADIMARYRFNPRAELSLNVDNLFNKHYRTQPDRHSYGALRTVNAAF
670 680 690 700 710 720
orf23a.pep TYRFKX
||||||
orf23-1 TYRFKX
Homology with a predicted ORF from N. gonorrhoeae
ORF23 shows 93.4% identity over a 211 aa overlap with a predicted ORF (ORF23.ng) from N. gonorrhoeae:
orf23.pep GYNYLFARGSRIANYQINGIPVADALADTGNANTAAYERVEVVRGVAGLLD 51
||||||||||||||||||||||||||||||||||||||||||||||||| |
orf23ng SAVDACRIPGYNYLFARGSRIANYQINGIPVADALADTGNANTAAYERVEVVRGVAGLPD 60
orf23.pep GTGEPSATVNLVRKRLTRKPLFEVRAEAGNRKHFGLDADVSGSLNTEXXLRGRLVSTFGR 111
||||||||||||||: |||||||||||||||||||| ||||||||:| :|||||||||||
orf23ng GTGEPSATVNLVRKHPTRKPLFEVRAEAGNRKHFGLGADVSGSLNAEGTLRGRLVSTFGR 120
orf23.pep GDSWRRRERSRXAELYGILEYDIAPQTRVHAXMDYQQAKETADAPLSYAVYDSQGYATAF 171
|||||: |||| ||||||||||||||||||| ||||||||||||||||||||||||||||
orf23ng GDSWRQLERSRDAELYGILEYDIAPQTRVHAGMDYQQAKETADAPLSYAVYDSQGYATAF 180
orf23.pep GPKDNPATNWANSHHRALNLFAGIEHRFNQDWKLKAEYDY 211
||||||||||:||::|||||||||||||||||||||||||
orf23ng GPKDNPATNWSNSRNRALNLFAGIEHRFNQDWKLKAEYDYTRSRFRQPYGVAGVLSIDHS 240
The ORF23ng nucleotide sequence <SEQ ID 611> is predicted to encode a protein comprising the amino acid sequence <SEQ ID 612>:
1 SAVDACRIPG YNYLFARGSR IANYQINGIP VADALADTGN ANTAAYERVE
51 VVRGVAGLPD GTGEPSATVN LVRKHPTRKP LFEVRAEAGN RKHFGLGADV
101 SGSLNAEGTL RGRLVSTFGR GDSWRQLERS RDAELYGILE YDIAPQTRVH
151 AGMDYQQAKE TADAPLSYAV YDSQGYATAF GPKDNPATNW SNSRNRALNL
201 FAGIEHRFNQ DWKLKAEYDY TRSRFRQPYG VAGVLSIDHS TAATDLIPGY
251 WHADPRTHSA SMSLTGKYRL FGREHDLIAG INGYKYASNK YGERSIIPNA
301 IPNAYEFSRT GAYPQPSSFA QTIPQYDTRR QIGGYLATRF RAADNLSLIL
351 GGRYSRYRAG SYNSRTQGMT YVSANRFTPY TGIVFDLTGN LSLYGSYSSL
401 FVPQLQKDEH GSYLKPVTGN NLEADIKGEW LEGRLNASAA VYRARKNNLA
451 TAAGRDQSGN TYYRAANQAK THGWEIEVGG RITPEWQIQA GYSQSKPRDQ
501 DGSRLNPDSV PERSFKLFTA YHLAPEAPSG RTIGAGVRRQ GETHTDPAAL
551 RIPNPAAKAR AVANSRQKAY AVADIMARYR FNPRTELSLN VDNLFNKHYR
601 TQPDRHSYGA LRTVNAAFTY RFK*
Further work revealed the complete nucleotide sequence <SEQ ID 613>:
1 ATGACACGCT TCAAATACTC CCTGCTTTTT GCCGCCCTGC TACCCGTGTA
51 CGCGCAGGCC GATGTTTCTG TTTCAGACGA CCCCAAACCG CAGGAAAGCA
101 CCGAATTGCC GACCATCACC GTTACCGCCG ACCGCACCGC GAGTTCCAAC
151 GACGGCTACA CCGTTTCCGG CACGCACACC CCGTTCGGGC TGCCCATGAC
201 CCTGCGCGAA ATCCCGCAGA GCGTCAGCGT CATCACATCG CAACAAATGC
251 GCGACCAAAA CATCAAAACG CTCGACCGCG CCCTGTTGCA GGCGACCGGC
301 ACCAGCCGCC AGATTTACGG CTCCGACCGC GCGGGCTACA ACTACCTGTT
351 CGCGCGCGGC AGCCGCATCG CCAACTACCA AATCAACGGC ATCCCCGTTG
401 CCGACGCGCT GGCCGATACG GGCAATGCCA ACACCGCCGC CTATGAGCGC
451 GTAGAAGTCG TGCGCGGCGT GGCGGGGCTG CCGGACGGCA CGGGCGAGCC
501 TTCTGCCACC GTCAATCTGG TACGCAAACA CCCGACCCGC AAGCCATTGT
551 TTGAAGTCCG CGCCGAAGCC GGCAACCGCA AACATTTCGG GCTGGGCGCG
601 GACGTATCGG GCAGCCTGAA CGCCGAAGGC ACGCTGCGCG GCCGCCTGGT
651 TTCCACCTTC GGACGCGGCG ACTCGTGGCG GCAGCTCGAA CGCAGCCGCG
701 ATGCCGAACT CTACGGCATT TTGGAATACG ACATCGCACC GCAAACCCGC
751 GTCCACGCAG GCATGGACTA CCAGCAGGCG AAAGAAACCG CAGACGCGCC
801 GCTCAGCTAC GCCGTGTACG ACAGCCAAGG TTATGCCACC GCCTTCGGCC
851 CAAAAGACAA CCCCGCCACA AATTGGTCGA ACAGCCGCAA CCGTGCGCTC
901 AACCTGTTCG CCGGCATAGA ACACCGCTTC AACCAAGACT GGAAACTCAA
951 AGCCGAATAC GACTACACCC GTAGCCGCTT CCGCCAGCCC TACGGTGTGG
1001 CAGGCGTACT TTCCATCGAC CACAGCACTG CCGCCACCGA CCTGATTCCC
1051 GGTTATTGGC ACGCcgatcc GCGCACCCAC AGCGCCAGCA TGTCATTGAC
1101 CGGCAAATAC CgcctGTTCG GCCGCGAGCA CGATTTAATC GCGGGTATCA
1151 ACGGCTACAA ATACGCCAGC AACAAATACG GCGAACGCAG CATCATTCCC
1201 AACGCCATTC CCAACGCCTA CGAATTTTCC CGCACGGGCG CCTATCCGCA
1251 GCCATCATCG TTTGCCCAAA CCATCCCGCA ATACGACACC AGGCGGCAAA
1301 TCGGCGGCTA TCTCGCCACC CGTTTCCGCG CCGCCGACAA CCTTTCGCTG
1351 ATACTCGGCG GCAGATACAG CCGCTACCGC GCAGGCAGCT ACAACAGCCG
1401 CACACAAGGC ATGACCTATG TGTCCGCCAA CCGTTTCACC CCCTACACAG
1451 GCATCGTGTT CGATCTGACC GGCAACCTGT CGCTTTACGG CTCGTACAGC
1501 AGCCTGTTCG TCCCGCAATT GCAAAAAGAC GAACACGGCA GCTACCTGAA
1551 ACCCGTAACC GGCAACAATC TGGAAGCCGA CATCAAAGGC GAATGGCTTG
1601 AAGGGCGTCT GAACGCATCC GCCGCCGTGT ACCGCGCCCG TAAAAACAAC
1651 CTCGCCACCG CAGCAGGACG CGACCAGAGC GGCAACACCT ACTATCGCGC
1701 CGCCAACCAA GCCAAAACCC ACGGCTGGGA AATCGAAGTC GGCGGCCGCA
1751 TCACGCCCGA ATGGCAGATA CAGGCAGGCT ACAGCCAAAG CAAACCCCGC
1801 GACCAAGACG GCAGCCGCCT GAACCCCGAC AGCGTAcCCG AACGCAGCTT
1851 CAAACTCTTC ACCGCCTACC ACTTAGCCCC CGAAGCCCCC AGCGGCCGGA
1901 CCATcggTGC GGGTGTGCGC CGGCAGGGCG AAACCCACAC CGACCCAGCC
1951 GCGCTCCGCA TCCCCAACCC CGCCGCCAAA GCCCGCGCCG TCGCCAACAG
2001 CCGCCAGAAA GCCTACGCCG TCGCCGACAT CATGGCGCGT TACCGCTTCA
2051 ATCCGCGCAC CGAACTGTCG CTGAACGTGG ACAACCTGTT CAACAAACAC
2101 TACCGCACCC AGCCCGACCG CCACAGCTAC GGCGCACTGC GGACAGTGAA
2151 CGCGGCGTTT ACCTATCGGT TTAAATAA
This corresponds to the amino acid sequence <SEQ ID 614; ORF23ng-1>:
1 MTRFKYSLLF AALLPVYAQA DVSVSDDPKP QESTELPTIT VTADRTASSN
51 DGYTVSGTHT PFGLPMTLRE IPQSVSVITS QQMRDQNIKT LDRALLQATG
101 TSRQIYGSDR AGYNYLFARG SRIANYQING IPVADALADT GNANTAAYER
151 VEVVRGVAGL PDGTGEPSAT VNLVRKHPTR KPLFEVRAEA GNRKHFGLGA
201 DVSGSLNAEG TLRGRLVSTF GRGDSWRQLE RSRDAELYGI LEYDIAPQTR
251 VHAGMDYQQA KETADAPLSY AVYDSQGYAT AFGPKDNPAT NWSNSRNRAL
301 NLFAGIEHRF NQDWKLKAEY DYTRSRFRQP YGVAGVLSID HSTAATDLIP
351 GYWHADPRTH SASMSLTGKY RLFGREHDLI AGINGYKYAS NKYGERSIIP
401 NAIPNAYEFS RTGAYPQPSS FAQTIPQYDT RRQIGGYLAT RFRAADNLSL
451 ILGGRYSRYR AGSYNSRTQG MTYVSANRFT PYTGIVFDLT GNLSLYGSYS
501 SLFVPQLQKD EHGSYLKPVT GNNLEADIKG EWLEGRLNAS AAVYRARKNN
551 LATAAGRDQS GNTYYRAANQ AKTHGWEIEV GGRITPEWQI QAGYSQSKPR
601 DQDGSRLNPD SVPERSFKLF TAYHLAPEAP SGRTIGAGVR RQGETHTDPA
651 ALRIPNPAAK ARAVANSRQK AYAVADIMAR YRFNPRTELS LNVDNLFNKH
701 YRTQPDRHSY GALRTVNAAF TYRFK*
ORF23ng-1 and ORF23-1 show 95.9% identity over a 725 aa overlap:
10 20 30 40 50 60orf23-1.pep MTRFKYSLLFAALLPVYAQADVSVSDDPKPQESTELPTITVTADRTASSNDGYTVSGTHT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf23ng-1 MTRFKYSLLFAALLPVYAQADVSVSDDPKPQESTELPTITVTADRTASSNDGYTVSGTHT
10 20 30 40 50 60
70 80 90 100 110 120orf23-1.pep PLGLPMTLREIPQSVSVITSQQMRDQNIKTLDRALLQATGTSRQIYGSDRAGYNYLFARG
|:||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf23ng-1 PFGLPMTLREIPQSVSVITSQQMRDQNIKTLDRALLQATGTSRQIYGSDRAGYNYLFARG
70 80 90 100 110 120
130 140 150 160 170 180orf23-1.pep SRIANYQINGIPVADALADTGNANTAAYERVEVVRGVAGLLDGTGEPSATVNLVRKRLTR
|||||||||||||||||||||||||||||||||||||||| |||||||||||||||: ||orf23ng-1 SRIANYQINGIPVADALADTGNANTAAYERVEVVRGVAGLPDGTGEPSATVNLVRKHPTR
130 140 150 160 170 180
190 200 210 220 230 240orf23-1.pep KPLFEVRAEAGNRKHFGLDADVSGSLNTEGTLRGRLVSTFGRGDSWRRRERSRDAELYGI
|||||||||||||||||| ||||||||:|||||||||||||||||||: |||||||||||orf23ng-1 KPLFEVRAEAGNRKHFGLGADVSGSLNAEGTLRGRLVSTFGRGDSWRQLERSRDAELYGI
190 200 210 220 230 240
250 260 270 280 290 300orf23-1.pep LEYDIAPQTRVHAGMDYQQAKETADAPLSYAVYDSQGYATAFGPKDNPATNWANSRHRAL
||||||||||||||||||||||||||||||||||||||||||||||||||||:|||:|||orf23ng-1 LEYDIAPQTRVHAGMDYQQAKETADAPLSYAVYDSQGYATAFGPKDNPATNWSNSRNRAL
250 260 270 280 290 300
310 320 330 340 350 360orf23-1.pep NLFAGIEHRFNQDWKLKAEYDYTRSRFRQPYGVAGVLSIDHNTAATDLIPGYWHADPRTH
|||||||||||||||||||||||||||||||||||||||||:||||||||||||||||||orf23ng-1 NLFAGIEHRFNQDWKLKAEYDYTRSRFRQPYGVAGVLSIDHSTAATDLIPGYWHADPRTH
310 320 330 340 350 360
370 380 390 400 410 420
orf23-1.pep SASVSLIGKYRLFGREHDLIAGINGYKYASNKYGERSIIPNAIPNAYEFSRTGAYPQPAS
|||:|| |||||||||||||||||||||||||||||||||||||||||||||||||||:|
orf23ng-1 SASMSLTGKYRLFGREHDLIAGINGYKYASNKYGERSIIPNAIPNAYEFSRTGAYPQPSS
370 380 390 400 410 420
430 440 450 460 470 480
orf23-1.pep FAQTIPQYGTRRQIGGYLATRFRAADNLSLILGGRYTRYRTGSYDSRTQGMTYVSANRFT
|||||||| |||||||||||||||||||||||||||:|||:|||:|||||||||||||||
orf23ng-1 FAQTIPQYDTRRQIGGYLATRFRAADNLSLILGGRYSRYRAGSYNSRTQGMTYVSANRFT
430 440 450 460 470 480
490 500 510 520 530 540
orf23-1.pep PYTGIVFDLTGNLSLYGSYSSLFVPQSQKDEHGSYLKPVTGNNLEAGIKGEWLEGRLNAS
|||||||||||||||||||||||||| ||||||||||||||||||| |||||||||||||
orf23ng-1 PYTGIVFDLTGNLSLYGSYSSLFVPQLQKDEHGSYLKPVTGNNLEADIKGEWLEGRLNAS
490 500 510 520 530 540
550 560 570 580 590 600
orf23-1.pep AAVYRARKNNLATAAGRDPSGNTYYRAANQAKTHGWEIEVGGRITPEWQIQAGYSQSKTR
|||||||||||||||||| ||||||||||||||||||||||||||||||||||||||| |
orf23ng-1 AAVYRARKNNLATAAGRDQSGNTYYRAANQAKTHGWEIEVGGRITPEWQIQAGYSQSKPR
550 560 570 580 590 600
610 620 630 640 650 660
orf23-1.pep DQDGSRLNPDSVPERSFKLFTAYHFAPEAPSGWTIGAGVRWQSETHTDPATLRIPNPAAK
||||||||||||||||||||||||:||||||| ||||||| |:|||||||:|||||||||
orf23ng-1 DQDGSRLNPDSVPERSFKLFTAYHLAPEAPSGRTIGAGVRRQGETHTDPAALRIPNPAAK
610 620 630 640 650 660
670 680 690 700 710 720
orf23-1.pep ARAADNSRQKAYAVADIMARYRFNPRAELSLNVDNLFNKHYRTQPDRHSYGALRTVNAAF
|||: |||||||||||||||||||||:||||||||||||||||||||||||||||||||||
orf23ng-1 ARAVANSRQKAYAVADIMARYRFNPRTELSLNVDNLFNKHYRTQPDRHSYGALRTVNAAF
670 680 690 700 710 720
orf23-1.pep TYRFKX
||||||
orf23ng-1 TYRFKX
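In addition, ORF23ng-1 shows significant homology with an OMP from E. coli:
sp|P16869|FHUE_ECOLI OUTER-MEMBRANE RECEPTOR FOR FE(III)-COPROGEN, FE(III)-FERRIOXAMINE AND FE(III)-RHODOTRULIC ACID PRECURSOR >gi|1651542|gnl|PID|d1015403 (D90745) outer membrane protein FhuE precursor [Escherichia coli] >gi|1651545|gnl|PID|d1015405 (D90746) outer membrane protein FhuE precursor [Escherichia coli] >gi|1787344 (AE000210) outer membrane receptor for FE(III)-coprogen, FE(III)-ferrioxamine and FE(III)-rhodotrulic acid precursor [Escherichia coli] Length = 729
Score = 332 bits (843), Expect = 3e-90
Identities = 228/717 (31%), Positives = 350/717 (48%), Gaps = 60/717 (8%)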
Query: 38 TITVTADRTASSN--DGYTVSGTHTPFGLPMTLREIPQSVSVITSQQMRDQNIKTLDRAL 95
T+V TA + + Y+V+ T +MT R+IPQSV++++ Q+M DQ ++TL +
Sbjct: 43 TVIVEGSATAPDDGENDYSVTSTSAGTKMQMTQRDIPQSVTIVSQQRMEDQQLQTLGEVM 102
Query: 96 LQATGTSRQIYGSDRAGYNYLFARGSRIANYQINGIP--------VADALADTGNANTAA 147
G S+ SDRA Y ++RG +I NY ++GIP + DAL+D A
Sbjct: 103 ENTLGISKSQADSDRALY---YSRGFQIDNYMVDGIPTYFESRWNLGDALSDM-----AL 154
Query: 148 YERVEVVRGVAGLPDGTGEPSATVNLVRKHPTRKPLF-EVRAEAGNRKHFGLGADVSGSL 206
+ERVEVVRG GL GTG PSA +N+VRKH T + +V AE G+ AD+ L
Sbjct: 155 FERVEVVRGATGLMTGTGNPSAAINMVRKHATSREFKGDVSAEYGSWNKERYVADLQSPL 214
Query: 207 NAEGTLRGRLVSTFGRGDSWRQLERSRDAELYGILEYDIAPQTRVHAGMDYQQAKETADA 266
+G +R R+V + DSW S GI++ D+ T + AG +YQ+ +
Sbjct: 215 TEDGKIRARIVGGYQNNDSWLDRYNSEKTFFSGIVDADLGDLTTLSAGYEYQRIDVNSPT 274
Query: 267 PLSYAVYDSQGYATAFGPKDNPATNWSNSRNRALNLFAGIEHRFNQDWKLKAEYDYTRSR 326
+++ G + ++ + A +W+ + +F ++ +F W+ ++
Sbjct: 275 WGGLPRWNTDGSSNSYDRARSTAPDWAYNDKEINKVFMTLKQQFADTWQATNLATHSEVE 334
Query: 327 F--RQPYGVAGVLSIDHSTAA--TDLIPGY-------WHADPRTHSA-SMSLTGKYRLFG 374
F + Y A V D ++ PG+ W++ R A + G Y LFG
Sbjct: 335 FDSKMMYVDAYVNKADGMLVGPYSNYGPGFDYVGGTGWNSGKRKVDALDLFADGSYELFG 394
Query: 375 REHDLIAGINGYKYASNKYGER--SIIPNAIPNAYEFSRTGAYPQPSSFAQTIPQYDTRR 432
R+H+L+ G Y +N+Y +I P+ I + Y F+ G +PQ Q++ Q DT
Sbjct: 395 RQHNLMFG-GSYSKQNNRYFSSWANIFPDEIGSFYNFN--GNFPQTDWSPQSLAQDDTTH 451
Query: 433 QIGGYLATRFRAADNLSLILGGRYSRYRAGSYNSRTQGMTY-VSANRFTPYTGIVFDXXX 491
Y ATR AD L LILG RY+ +R + +TY + N TPY G+VFD
Sbjct: 452 MKSLYAATRVTLADPLHLILGARYTNWRVDT-------LTYSMEKNHTTPYAGLVFDIND 504
Query: 492 XXXXXXXXXXXFVPQLQKDEHGSYLKPVTGNNLEADIKGEWLEGRLNASAAVYRARKNNL 551
F PQ +D G YL P+TGNN E +K +W+ RL + A++R ++N+
Sbjct: 505 NWSTYASYTSIFQPQNDRDSSGKYLAPITGNNYELGLKSDWMNSRLTTTLAIFRIEQDNV 564
Query: 552 ATAAGR---DQSGNTYYRAANQAKTHGWEIEVGGRITPEWQIQAGYSQSKPRDQDGSRLN 608
A + G +G T Y+A + + G E E+ G IT WQ+ G ++ D +G+ +N
Sbjct: 565 AQSTGTPIPGSNGETAYKAVDGTVSKGVEFELNGAITDNWQLTFGATRYIAEDNEGNAVN 624
Query: 609 PDSVPERSFKLFTAYHLAPEAPSGRTIGAGVRRQGETHTDPAALRIPNPAAKARAVANSR 668
P ++P + K+FT+Y L P P T+G GV Q +TD P RA
Sbjct: 625 P-NLPRTTVKMFTSYRL-PVMPE-LTVGGGVNWQNRVYTDTV-----TPYGTFRA----E 672
Query: 669 QKAYAVADIMARYRFNPRTELSLNVDNLFNKHYRTQPDRH-SYGALRTVNAAFTYRF 724
Q +YA+ D+ RY+ L NV+NLF+K Y T + YG R + TY+F
Sbjct: 673 QGSYALVDLFTRYQVTKNFSLQGNVNNLFDKTYDTNVEGSIVYGTPRNFSITGTYQF 729
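The percentages in the alignment summary follow directly from the reported counts. The short Python sketch below recomputes them; the function name is illustrative, and rounding down to a whole number is an assumption that happens to be consistent with all three reported values.

```python
def whole_percent(count: int, aligned_length: int) -> int:
    """Percentage of aligned columns, rounded down to a whole number (assumed convention)."""
    return (100 * count) // aligned_length


ALIGNED_LENGTH = 717  # alignment length reported for ORF23ng-1 vs FhuE
REPORTED_COUNTS = {"identities": 228, "positives": 350, "gaps": 60}

for name, count in REPORTED_COUNTS.items():
    print(f"{name}: {count}/{ALIGNED_LENGTH} ({whole_percent(count, ALIGNED_LENGTH)}%)")
```

Run as written, this prints 31%, 48% and 8%, matching the figures quoted for the ORF23ng-1 / FhuE comparison.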
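Based on this analysis, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
ORF23-1 (77.5kDa) was cloned in pET and pGex vectors and expressed in E. coli, as described above. The products of protein expression and purification were analyzed by SDS-PAGE. Figure 15A shows the results of affinity purification of the His-fusion protein, and Figure 15B shows the results of expression of the GST-fusion in E. coli. Purified His-fusion protein was used to immunise mice, and the mouse sera were used for Western blot (Figure 15C) and ELISA (positive result). These experiments confirm that ORF23-1 is a surface-exposed protein and a useful immunogen.
Example 80
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 615>: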
1 ATGCGCACGG CAGTGGTTTT GCTGTTGATC ATGCCGATGG CGGCTTCGTC
51 GGCAATGATG CCGGAAATGG TGTGCGCGGG CGTGTCGCCG GGAACGGCAA
101 TCATATCCAA GCCGACCGAA CAAACGGCGG TCATGGCTTC GAGTTTGTCC
151 AGCGTCAgcA CGCCTGCTTC GGCGgcGgCa ATCATACCTT CGTCTTCGGA
201 AACGGGGATA AACGcGCCAC TCAAACCCCC GACCGCGCTG GAAGCCATCA
251 TGCCGCCTTT TTTCACGGCA TCGTTCAGCA ATGCCAAAGC TGCTGTTGTG
301 CCGTGCGTAC CGCAGACGCT CAAGCCCATT TnTTCAAGAA TGCGTGCCAC
351 TnAGTCGCCG ACGGGG..
This corresponds to the amino acid sequence <SEQ ID 616; ORF24>:
1 MRTAVVLLLI MPMAASSAMM PEMVCAGVSP GTAIISKPTE QTAVMASSLS
51 SVSTPASAAA IIPSSSETGI NAPLKPPTAL EAIMPPFFTA SFSNAKAAVV
101 PCVPQTLKPI XSRMRATXSP TG..
Further work revealed the complete nucleotide sequence <SEQ ID 617>:
1 ATGCGCACGG CAGTGGTTTT GCTGTTGATC ATGCCGATGG CGGCTTCGTC
51 GGCAATGATG CCGGAAATGG TGTGCGCGGG CGTGTCGCCG GGAACGGCAA
101 TCATATCCAA GCCGACCGAA CAAACGGCGG TCATGGCTTC GAGTTTGTCC
151 AGCGTCAGCA CGCCTGCTTC GGCGGCGGCA ATCATACCTT CGTCTTCGGA
201 AACGGGGATA AACGCGCCAC TCAAACCCCC GACCGCGCTG GAAGCCATCA
251 TGCCGCCTTT TTTCACGGCA TCGTTCAGCA ATGCCAAAGC TGCTGTTGTG
301 CCGTGCGTAC CGCAGACGCT CAAGCCCATT TCTTCAAGAA TGCGTGCCAC
351 TGAGTCGCCG ACGGCGGGGG TCGGCGCCAG CGACAAGTCG AGAATACCAA
401 ACGGGATATT CAGCATTTTT GAGGCTTCGC GGCCGATGAG TTCGCCCACG
451 CGGGTAATTT TGAAAGCAGT TTTCTTCACT ACTTCCGCAA CTTCGGTCAA
501 TGTCGTTGCA TCTGAATTTT CCAACGCGGC TTTTACGACA CCTGGGCCGG
551 ATACGCCGAC ATTGATAACG GCATCCGCTT CGCCCGAACC ATGAAACGCG
601 CCCGCCATAA ACGGGTTGTC TTCCACCGCG TTGCAGAACA CGACAATTTT
651 AGCGCAGCCG AAACCTTCGG GCGTGATTTC CGCCGTGCGT TTGACGGTTT
701 CGCCCGCCAG CTTGACCGCA TCCATATTGA TACCGGCACG CGTACTGCCG
751 ATATTGATGG AGCTGCACAC AATATCGGTA GTCTTCATCG CTTCGGGAAT
801 GGAGCGGATT AACACCTCAT CCGAAGGCGA CATCCCTTTT TGCACCAACG
851 CGGAAAAACC GCCGATAAAA GACACACCGA TGGCTTTGGC AGCTTTATCC
901 AAAGTTTGCG CCACGCTGAC GTAA
This corresponds to the amino acid sequence <SEQ ID 618; ORF24-1>:
1 MRTAVVLLLI MPMAASSAMM PEMVCAGVSP GTAIISKPTE QTAVMASSLS
51 SVSTPASAAA IIPSSSETGI NAPLKPPTAL EAIMPPFFTA SFSNAKAAVV
101 PCVPQTLKPI SSRMRATESP TAGVGASDKS RIPNGIFSIF EASRPMSSPT
151 RVILKAVFFT TSATSVNVVA SEFSNAAFTT PGPDTPTLIT ASASPEP*NA
201 PAINGLSSTA LQNTTILAQP KPSGVISAVR LTVSPASLTA SILIPARVLP
251 ILMELHTISV VFIASGMERI NTSSEGDIPF CTNAEKPPIK DTPMALAALS
301 KVCATLT*
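The relationship between the nucleotide sequence and the amino acid sequence above can be checked with a small translation routine. The Python sketch below uses the standard genetic code and, as an assumption made for illustration, translates straight through stop codons (printing `*`) rather than halting, which is how the listing above is laid out. The two fragments are copied from SEQ ID 617 (nucleotides 1-30 and 586-600); the names `translate`, `start` and `around_stop` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate in frame 1; whitespace is ignored, stops become '*', unknown codons 'X'."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # drop any trailing partial codon
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


start = "ATGCGCACGG CAGTGGTTTT GCTGTTGATC"  # nucleotides 1-30 of SEQ ID 617
around_stop = "GAACCATGAAACGCG"             # nucleotides 586-600 (codons 196-200)

print(translate(start))        # MRTAVVLLLI
print(translate(around_stop))  # EP*NA
```

The second fragment covers codon 198 (TGA), which is the in-frame stop shown as `*` at position 198 of the protein listing.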
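Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF24 shows 96.4% identity over a 307 aa overlap with an ORF (ORF24a) from strain A of N. meningitidis: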
10 20 30 40 50 60
orf24a.pep MRTAVVLLLIMPMAASSAMMPEMVCAGVSPGTAIISXPTEQTAVIASSLSNVSTPASAAA
|||||||||||||||||||||||||||||||||||| |||||||:|||||:|||||||||
orf24 MRTAVVLLLIMPMAASSAMMPEMVCAGVSPGTAIISKPTEQTAVMASSLSSVSTPASAAA
10 20 30 40 50 60
70 80 90 100 110 120
orf24a.pep IIPSSSXTGINAPLKPPTALEAIMPPFFTASFSNAKAAVVPCVPQTLKPISSRMRATESP
|||||| |||||||||||||||||||||||||||||||||||||||||||||||||||||
orf24 IIPSSSETGINAPLKPPTALEAIMPPFFTASFSNAKAAVVPCVPQTLKPISSRMRATESP
70 80 90 100 110 120
130 140 150 160 170 180
orf24a.pep TAGVGASDKSRIPNGIFSIFEASRPMSSPTRVILKAVFFTTSATSVNVVASEFSNAAFTT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf24 TAGVGASDKSRIPNGIFSIFEASRPMSSPTRVILKAVFFTTSATSVNVVASEFSNAAFTT
130 140 150 160 170 180
190 200 210 220 230 240
orf24a.pep PGPDTPTLITASASPEPXNAPAIXGLSSXALQNTTILAQPKPSSVISXVRLMVSPASLTA
||||||||||||||||||||||| ||||:||||||||||||||:||| ||| ||||||||
orf24 PGPDTPTLITASASPEPXNAPAINGLSSTALQNTTILAQPKPSGVISAVRLTVSPASLTA
190 200 210 220 230 240
250 260 270 280 290 300
orf24a.pep SILIPARVLPILMELHTISVVFIASGMERXNTSSEGDIPFCTSAEKPPIKDTPMALAALS
||||||||||||||||||||||||||||| ||||||||||||:|||||||||||||||||
orf24 SILIPARVLPILMELHTISVVFIASGMERINTSSEGDIPFCTNAEKPPIKDTPMALAALS
250 260 270 280 290 300
orf24a.pep KVCATLTX
||||||||
orf24 KVCATLTX
The complete length ORF24a nucleotide sequence <SEQ ID 619> is:
1 ATGCGCACGG CAGTGGTTTT GCTGTTGATC ATGCCGATGG CGGCTTCGTC
51 GGCAATGATG CCGGAAATGG TGTGCGCGGG TGTGTCGCCG GGAACGGCAA
101 TCATATCCAA NCCGACCGAA CAAACGGCGG TCATCGCTTC GAGTTTATCC
151 AACGTCAGCA CGCCTGCTTC GGCGGCGGCA ATCATACCTT CGTCTTCGGA
201 NACGGGGATA AACGCGCCAC TCAAACCGCC AACCGCGCTC GAAGCCATCA
251 TGCCGCCCTT TTTCACGGCA TCGTTCAGCA ATGCCAAAGC TGCTGTTGTG
301 CCGTGCGTAC CGCAGACGCT CAAACCCATT TCTTCAAGAA TGCGCGCCAC
351 CGAGTCGCCG ACGGCAGGGG TCGGTGCCAG CGACAAGTCG AGAATACCAA
401 ACGGGATATT CAGCATTTTT GAGGCTTCGC GGCCGATGAG TTCGCCCACG
451 CGGGTAATTT TGAAGGCGGT TTTCTTCACA ACTTCGGCAA CTTCGGTCAA
501 TGTCGTTGCA TCCGAATTTT CCAACGCGGC TTTTACGACA CCCGGGCCGG
551 ATACGCCGAC ATTAATCACA GCATCCGCTT CGCCTGAGCC GTGAAACGCG
601 CCCGCCATAN ACGGGTTGTC TTCCNCCGCG TTGCAGAACA CGACGATTTT
651 GGCGCAGCCG AAACCTTCTA GTGTGATTTC ANCCGTGCGT TTGATGGTTT
701 CGCCCGCCAG TCTGACCGCG TCCATATTGA TACCGGCGCG CGTACTGCCG
751 ATATTGATGG AGCTGCACAC GATATCAGTA GTCTTCATCG CTTCGGGAAT
801 GGAACGGATN AACACCTCGT CAGAAGGCGA CATACCTTTT TGCACCAGCG
851 CGGAAAAGCC GCCAATAAAA GACACGCCGA TGGCTTTGGC AGCCTTATCC
901 AAAGTTTGCG CCACGCTGAC GTAA
This encodes a protein having the amino acid sequence <SEQ ID 620>:
1 MRTAVVLLLI MPMAASSAMM PEMVCAGVSP GTAIISXPTE QTAVIASSLS
51 NVSTPASAAA IIPSSSXTGI NAPLKPPTAL EAIMPPFFTA SFSNAKAAVV
101 PCVPQTLKPI SSRMRATESP TAGVGASDKS RIPNGIFSIF EASRPMSSPT
151 RVILKAVFFT TSATSVNVVA SEFSNAAFTT PGPDTPTLIT ASASPEP*NA
201 PAIXGLSSXA LQNTTILAQP KPSSVISXVR LMVSPASLTA SILIPARVLP
251 ILMELHTISV VFIASGMERX NTSSEGDIPF CTSAEKPPIK DTPMALAALS
301 KVCATLT*
It should be noted that this protein includes a stop codon at position 198. ORF24a and ORF24-1 show 96.4% identity over a 307 aa overlap:
10 20 30 40 50 60
orf24a.pep MRTAVVLLLIMPMAASSAMMPEMVCAGVSPGTAIISXPTEQTAVIASSLSNVSTPASAAA
|||||||||||||||||||||||||||||||||||| |||||||:|||||:|||||||||
orf24-1 MRTAVVLLLIMPMAASSAMMPEMVCAGVSPGTAIISKPTEQTAVMASSLSSVSTPASAAA
10 20 30 40 50 60
70 80 90 100 110 120
orf24a.pep IIPSSSXTGINAPLKPPTALEAIMPPFFTASFSNAKAAVVPCVPQTLKPISSRMRATESP
|||||| |||||||||||||||||||||||||||||||||||||||||||||||||||||
orf24-1 IIPSSSETGINAPLKPPTALEAIMPPFFTASFSNAKAAVVPCVPQTLKPISSRMRATESP
70 80 90 100 110 120
130 140 150 160 170 180
orf24a.pep TAGVGASDKSRIPNGIFSIFEASRPMSSPTRVILKAVFFTTSATSVNVVASEFSNAAFTT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf24-1 TAGVGASDKSRIPNGIFSIFEASRPMSSPTRVILKAVFFTTSATSVNVVASEFSNAAFTT
130 140 150 160 170 180
190 200 210 220 230 240
orf24a.pep PGPDTPTLITASASPEPXNAPAIXGLSSXALQNTTILAQPKPSSVISXVRLMVSPASLTA
||||||||||||||||||||||| ||||:||||||||||||||:||| ||| ||||||||
orf24-1 PGPDTPTLITASASPEPXNAPAINGLSSTALQNTTILAQPKPSGVISAVRLTVSPASLTA
190 200 210 220 230 240
250 260 270 280 290 300
orf24a.pep SILIPARVLPILMELHTISVVFIASGMERXNTSSEGDIPFCTSAEKPPIKDTPMALAALS
||||||||||||||||||||||||||||| ||||||||||||:|||||||||||||||||
orf24-1 SILIPARVLPILMELHTISVVFIASGMERINTSSEGDIPFCTNAEKPPIKDTPMALAALS
250 260 270 280 290 300
orf24a.pep KVCATLTX
||||||||
orf24-1 KVCATLTX
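The identity figures quoted for these pairwise comparisons are counts of matching columns divided by the overlap length. The Python sketch below shows that calculation on the first 60-column row of the ORF24a / ORF24-1 alignment above. The function name is illustrative, and treating the ambiguity symbol X as an ordinary character (so X against K counts as a mismatch) is an assumption; the alignment program used in the source may score such columns differently.

```python
def identity(row_a: str, row_b: str) -> tuple[int, int, float]:
    """Return (identical columns, compared columns, percent identity) for two aligned rows."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    # A column counts as identical only when both symbols match and neither is a gap.
    identical = sum(1 for a, b in zip(row_a, row_b) if a == b and a != "-")
    compared = len(row_a)
    return identical, compared, 100.0 * identical / compared


# Columns 1-60 of the ORF24a / ORF24-1 alignment, copied from the rows above.
orf24a = "MRTAVVLLLIMPMAASSAMMPEMVCAGVSPGTAIISXPTEQTAVIASSLSNVSTPASAAA"
orf24_1 = "MRTAVVLLLIMPMAASSAMMPEMVCAGVSPGTAIISKPTEQTAVMASSLSSVSTPASAAA"

same, total, percent = identity(orf24a, orf24_1)
print(f"{same}/{total} identical ({percent:.1f}%)")
```

For this single row the sketch reports 57 of 60 columns identical; the 96.4% figure in the text refers to the whole 307 aa overlap.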
Homology with a predicted ORF from N. gonorrhoeae
ORF24 shows 96.7% identity over a 121 aa overlap with a predicted ORF (ORF24ng) from N. gonorrhoeae:
orf24.pep MRTAVVLLLIMPMAASSAMMPEMVCAGVSPGTAIISKPTEQTAVMASSLSSVSTPASAAA 60
||||||||||||||||||||||||||||||||||:|||||||||||||||||:|||||||
orf24ng MRTAVVLLLIMPMAASSAMMPEMVCAGVSPGTAIMSKPTEQTAVMASSLSSVNTPASAAA 60
orf24.pep IIPSSSETGINAPLKPPTALEAIMPPFFTASFSNAKAAVVPCVPQTLKPIXSRMRATXSP 120
|||||||||||||||||||||||||||||||||||||||||||||||||| |||||| ||
orf24ng IIPSSSETGINAPLKPPTALEAIMPPFFTASFSNAKAAVVPCVPQTLKPISSRMRATESP 120
orf24.pep TG 122
|:
orf24ng TAGVGASDKSRMPNGIFSIFEASRPMSSPTRVILKAVFFTTSATSVRLTASEFSSAALTT 180
The complete length ORF24ng nucleotide sequence <SEQ ID 621> is:
1 ATGCGCACGG CGGTGGTTTT GCTGTTGATC ATGCCGATGG CGGCTTCGTC
51 GGCGATGATG CCGGAAATGG TGTGCGCGGG CGTGTCGCCG GGAACGGCAA
101 TCATGTCCAA ACCAACGGAG CAGACGGCGG TCATGGCTTC GAGTTTGTCC
151 AGCGTCAACA CGCCTGCCTC GGCGGCGGCA ATCATACCTT CGTCTTCGGA
201 AACGGGGATA AACGCGCCGC TCAAACCGCC GACCGCGCTG GAAGCCATCA
251 TGCCGCCCTT TTTCACGGCA TCGTTCAGCA ATGCCAAAGC TGCTGTTGTG
301 CCGTGCGTAC CGCAGACGCT CAAGCCCATT TCTTCAAGAA TGCGCGCCAC
351 CGAGTCGCCG ACGGCGGGGG TCGGTGCCAG CGACAAATCG AGAATGCCGA
401 ACGGGATATT CAGCATTTTT GAGGCTTCGC GACCGATGAG TTCGCCCACG
451 CGGGTGATTT TGAAAGCGGT TTTCTTCACG ACTTCGGCGA CCTCGGTCAG
501 GCTGACCGCG TCCGAATTTT CCAGCGCGGC TTTGACCACG CCTGGACCGG
551 ATACGCCGAC ATTAATCACA GCATCCGCTT CGCCCGAGCC GTGGAACGCA
601 CCCGCCATAA ACGGATTGTC TTCCACCGCG TTGCAGAACA CGACGATTTT
651 GGCGCAGCCG AAACCTTCGG GTGTGATTTC AGCCGTGCGT TTGATGGTTT
701 CGCCTGCCAG CTTGACCGCA TCCATATTGA TACCGGCACG CGTGCTGCCG
751 ATATTGATGG AGCTGCACAC GATATCGGTA GTTTTCATCG CTTCGGGAAC
801 GGAACGGATC AACACCTCAT CCGAAGGCGA CATACCTTTT TGCACCAGCG
851 CGGAAAAGCC GCCGATAAAG GACACGCCGA TGGCTTTGGC TGCCTTGTCC
901 AAAGTCTGCG CCACGCTGAC ATAA
This encodes a protein having the amino acid sequence <SEQ ID 622>:
1 MRTAVVLLLI MPMAASSAMM PEMVCAGVSP GTAIMSKPTE QTAVMASSLS
51 SVNTPASAAA IIPSSSETGI NAPLKPPTAL EAIMPPFFTA SFSNAKAAVV
101 PCVPQTLKPI SSRMRATESP TAGVGASDKS RMPNGIFSIF EASRPMSSPT
151 RVILKAVFFT TSATSVRLTA SEFSSAALTT PGPDTPTLIT ASASPEPWNA
201 PAINGLSSTA LQNTTILAQP KPSGVISAVR LMVSPASLTA SILIPARVLP
251 ILMELHTISV VFIASGTERI NTSSEGDIPF CTSAEKPPIK DTPMALAALS
301 KVCATLT*
ORF24ng and ORF24-1 show 96.1% identity over a 307 aa overlap:
10 20 30 40 50 60
orf24-1.pep MRTAVVLLLIMPMAASSAMMPEMVCAGVSPGTAIISKPTEQTAVMASSLSSVSTPASAAA
||||||||||||||||||||||||||||||||||:|||||||||||||||||:|||||||
orf24ng MRTAVVLLLIMPMAASSAMMPEMVCAGVSPGTAIMSKPTEQTAVMASSLSSVNTPASAAA
10 20 30 40 50 60
70 80 90 100 110 120
orf24-1.pep IIPSSSETGINAPLKPPTALEAIMPPFFTASFSNAKAAVVPCVPQTLKPISSRMRATESP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf24ng IIPSSSETGINAPLKPPTALEAIMPPFFTASFSNAKAAVVPCVPQTLKPISSRMRATESP
70 80 90 100 110 120
130 140 150 160 170 180
orf24-1.pep TAGVGASDKSRIPNGIFSIFEASRPMSSPTRVILKAVFFTTSATSVNVVASEFSNAAFTT
|||||||||||:|||||||||||||||||||||||||||||||||||::|||||:||:||
orf24ng TAGVGASDKSRMPNGIFSIFEASRPMSSPTRVILKAVFFTTSATSVRLTASEFSSAALTT
130 140 150 160 170 180
190 200 210 220 230 240
orf24-1.pep PGPDTPTLITASASPEPXNAPAINGLSSTALQNTTILAQPKPSGVISAVRLTVSPASLTA
||||||||||||||||| ||||||||||||||||||||||||||||||||| ||||||||
orf24ng PGPDTPTLITASASPEPWNAPAINGLSSTALQNTTILAQPKPSGVISAVRLMVSPASLTA
190 200 210 220 230 240
250 260 270 280 290 300
orf24-1.pep SILIPARVLPILMELHTISVVFIASGMERINTSSEGDIPFCTNAEKPPIKDTPMALAALS
|||||||||||||||||||||||||| |||||||||||||||:|||||||||||||||||
orf24ng SILIPARVLPILMELHTISVVFIASGTERINTSSEGDIPFCTSAEKPPIKDTPMALAALS
250 260 270 280 290 300
orf24-1.pep KVCATLTX
||||||||
orf24ng KVCATLTX
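Based on this analysis, including the presence of a putative leader sequence (first 18 amino acids, double-underlined) and putative transmembrane domains (single-underlined) in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 81
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 623>: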
1 .. ACCGACGTGC AAAAAGAGTT GGTCGGCGAA CAACGCAAGT GGGCGCAGGA
51 AAAAATCAGC AACTGCCGAC AAGCCGCCGC GCAGGCAGAC CGGCAGGAAT
101 ACGCCGAATA CCTCAAGCTG CAATGCGACA CGCGGATGAC GCGCGAACGG
151 ATACAGTATC TTCGCGGCTA TTCCATCGAT TAG
This corresponds to the amino acid sequence <SEQ ID 624; ORF25>:
1 ..TDVQKELVGE QRKWAQEKIS NCRQAAAQAD RQEYAEYLKL QCDTRMTRER
51 IQYLRGYSID*
Further work revealed the complete nucleotide sequence <SEQ ID 625>:
1 ATGTATCGGA AACTCATTGC GCTGCCGTTT GCCCTGCTGC TTGCCGCTTG
51 CGGCAGGGAA GAACCGCCCA AGGCATTGGA ATGCGCCAAC CCCGCCGTGT
101 TGCAAGGCAT ACGCGGCAAT ATTCAGGAAA CGCTCACGCA GGAAGCGCGT
151 TCTTTCGCGC GCGAAGACGG CAGGCAGTTT GTCGATGCCG ACAAAATTAT
201 CGCCGCCGCC TACGGTTTGG CGTTTTCTTT GGAACACGCT TCGGAAACGC
251 AGGAAGGCGG GCGCACGTTC TGTATCGCCG ATTTGAACAT TACCGTGCCG
301 TCTGAAACGC TTGCCGATGC CAAGGCAAAC AGCCCCCTGT TGTACGGGGA
351 AACTGCTTTG TCGGATATTG TGCGGCAGAA GACGGGCGGC AATGTCGAGT
401 TTAAAGACGG CGTATTGACG GCAGCCGTCC GCTTCCTGCC CGTCAAAGAC
451 GGTCAGACGG CATTTGTCGA CAACACGGTC GGTATGGCGG CGCAAACGCT
501 GTCTGCCGCG CTGCTGCCTT ACGGCGTGAA GAGCATCGTG ATGATAGACG
551 GCAAGGCGGT GAAAAAAGAA GACGCGGTCA GGATTTTGAG CGGAAAAGCC
601 CGTGAAGAAG AACCGTCCAA ACCCACGCCC GAAGACATTT TGGAACACAA
651 TGCCGCCGGC GGCGATGCGG GCGTACCCCA AGCCGCAGAA GGCGCGCCCG
701 AACCGGAAAT CCTGCATCCT GACGACGGCG AGCGTGCCGA TACCGTTACC
751 GTATCACGGG GCGAAGTGGA AGAGGCGCGC GTACAAAACC AGCGTGCGGA
801 ATCCGAAATT ACCAAACTTT GGGGAGGACT CGATACCGAC GTGCAAAAAG
851 AGTTGGTCGG CGAACAACGC AAGTGGGCGC AGGAAAAAAT CAGCAACTGC
901 CGACAAGCCG CCGCGCAGGC AGACCGGCAG GAATACGCCG AATACCTCAA
951 GCTGCAATGC GACACGCGGA TGACGCGCGA ACGGATACAG TATCTTCGCG
1001 GCTATTCCAT CGATTAG
This corresponds to the amino acid sequence <SEQ ID 626; ORF25-1>:
1 MYRKLIALPF ALLLAACGRE EPPKALECAN PAVLQGIRGN IQETLTQEAR
51 SFAREDGRQF VDADKIIAAA YGLAFSLEHA SETQEGGRTF CIADLNITVP
101 SETLADAKAN SPLLYGETAL SDIVRQKTGG NVEFKDGVLT AAVRFLPVKD
151 GQTAFVDNTV GMAAQTLSAA LLPYGVKSIV MIDGKAVKKE DAVRILSGKA
201 REEEPSKPTP EDILEHNAAG GDAGVPQAAE GAPEPEILHP DDGERADTVT
251 VSRGEVEEAR VQNQRAESEI TKLWGGLDTD VQKELVGEQR KWAQEKISNC
301 RQAAAQADRQ EYAEYLKLQC DTRMTRERIQ YLRGYSID*
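The sequence listings use a fixed layout: each line begins with the position of its first residue, followed by blocks of ten residues, fifty per line. The Python sketch below reads such lines back into one string and checks each leading index against the running count. The function name is hypothetical, and the sketch assumes every block contains only residue symbols (it does not handle the `..` placeholders used in the partial sequences).

```python
def read_numbered_sequence(lines: list[str]) -> str:
    """Join '<index> <block> <block> ...' lines, verifying each index as we go."""
    residues: list[str] = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        index, blocks = int(fields[0]), fields[1:]
        expected = len(residues) + 1
        if index != expected:
            raise ValueError(f"line starts at {index}, expected {expected}")
        residues.extend("".join(blocks))
    return "".join(residues)


# First two lines of the ORF25-1 listing above.
listing = [
    "1 MYRKLIALPF ALLLAACGRE EPPKALECAN PAVLQGIRGN IQETLTQEAR",
    "51 SFAREDGRQF VDADKIIAAA YGLAFSLEHA SETQEGGRTF CIADLNITVP",
]
sequence = read_numbered_sequence(listing)
print(len(sequence), sequence[:10])
```

A check of this kind flags a line whose leading index does not agree with the number of residues read so far, which is a common symptom of a transcription error in either the index or the residues.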
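Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF25 shows 98.3% identity over a 60 aa overlap with an ORF (ORF25a) from strain A of N. meningitidis: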
10 20 30
orf25.pep TDVQKELVGEQRKWAQEKISNCRQAAAQAD
||||||||||||||||||||||||||||||
orf25a VTVSRGEVEEARVQNQRAESEITKLWGGLDTDVQKELVGEXRKWAQEKISNCRQAAAQAD
250 260 270 280 290 300
40 50 60
orf25.pep RQEYAEYLKLQCDTRMTRERIQYLRGYSIDX
|||||||||||||||||||||||||||||||
orf25a RQEYAEYLKLQCDTRMTRERIQYLRGYSIDX
310 320 330
The complete length ORF25a nucleotide sequence <SEQ ID 627> is:
1 ATGTATCGGA AACTCATTGC GCTGCCGTTT GCCCTGCTGC TTGCCGCTTG
51 CGGCAGGGAA GAACCGCCCA AGGCATTGGA ATGCGCCAAC CCCGCCGTGT
101 TGCAANGCAT ACGCNGCAAT ATTCAGGAAA CGCTCACGCA GGAAGCGCGT
151 TCTTTCGCGC GCGAAGACNG CANGCAGTTT GTCGATGCCG ACNAAATTAT
201 CGCCGCCGCC TANGNTNNGN NGNTNTCTTT GGAACACGCT TCGGAAACGC
251 AGGAAGGCGG GCGCACGTTC TGTNTCGCCG ATTTGAACAT TACCGTGCCG
301 TCTGAAACGC TTGCCGATGC CAAGGCAAAC AGCCCCCTGC TGTACGGGGA
351 AACCGCTTTG TCGGATATTG TGCGGCAGAA GACGGGCGGC AATGTCGAGT
401 TTAAAGACGG CGTATTGACG GCAGCCGTCC GCTTCCTACC CGTCAAAGAC
451 GGTCAGANGG CATTTGTCGA CAACACGGTC GGTATGGCGG CGCAAACGCT
501 GTCTGCCGCG TTGCTGCCTT ACGGCGTGAA GAGCATCGTG ATGATAGACG
551 GCAAGGCGGT AAAAAAAGAA GACGCGGTCA GGATTNTGAG CNGANAAGCC
601 CGTGAANAAG AACCGTCCAA ANCCNNGCCC GAAGACATTT TGGAACATAA
651 TGCCGCCGGA GGGGATGCAG ACGTACCCCA AGCCGGAGAA GACGCGCCCG
701 AACCGGAAAT CCTGCATCCT GACGACGGCG AGCGTGCCGA TACCGTTACC
751 GTATCACGGG GCGAAGTGGA AGAGGCGCGN GTACAAAACC AGCGTGCGGA
801 ATCCGAAATT ACCAAACTTT GGGGAGGACT CGATACCGAC GTGCAAAAAG
851 AGTTGGTCGG CGAANAACGC AAGTGGGCGC AGGAAAAAAT CAGCAACTGC
901 CGACAAGCCG CCGCGCAGGC AGACCGGCAG GAATACGCCG AATACCTCAA
951 GCTGCAATGC GACACGCGGA TGACGCGCGA ACGGATACAG TATCTTCGCG
1001 GCTATTCCAT CGATTAG
This encodes a protein having the amino acid sequence <SEQ ID 628>:
1 MYRKLIALPF ALLLAACGRE EPPKALECAN PAVLQXIRXN IQETLTQEAR
51 SFAREDXXQF VDADXIIAAA XXXXXSLEHA SETQEGGRTF CXADLNITVP
101 SETLADAKAN SPLLYGETAL SDIVRQKTGG NVEFKDGVLT AAVRFLPVKD
151 GQXAFVDNTV GMAAQTLSAA LLPYGVKSIV MIDGKAVKKE DAVRIXSXXA
201 REXEPSKXXP EDILEHNAAG GDADVPQAGE DAPEPEILHP DDGERADTVT
251 VSRGEVEEAR VQNQRAESEI TKLWGGLDTD VQKELVGEXR KWAQEKISNC
301 RQAAAQADRQ EYAEYLKLQC DTRMTRERIQ YLRGYSID*
ORF25a and ORF25-1 show 93.5% identity over a 338 aa overlap:
10 20 30 40 50 60
orf25a.pep MYRKLIALPFALLLAACGREEPPKALECANPAVLQXIRXNIQETLTQEARSFAREDXXQF
||||||||||||||||||||||||||||||||||| || ||||||||||||||||| ||
orf25-1 MYRKLIALPFALLLAACGREEPPKALECANPAVLQGIRGNIQETLTQEARSFAREDGRQF
10 20 30 40 50 60
70 80 90 100 110 120
orf25a.pep VDADXIIAAAXXXXXSLEHASETQEGGRTFCXADLNITVPSETLADAKANSPLLYGETAL
|||| ||||| |||||||||||||||| ||||||||||||||||||||||||||||
orf25-1 VDADKIIAAAYGLAFSLEHASETQEGGRTFCIADLNITVPSETLADAKANSPLLYGETAL
70 80 90 100 110 120
130 140 150 160 170 180
orf25a.pep SDIVRQKTGGNVEFKDGVLTAAVRFLPVKDGQXAFVDNTVGMAAQTLSAALLPYGVKSIV
||||||||||||||||||||||||||||||||:|||||||||||||||||||||||||||
orf25-1 SDIVRQKTGGNVEFKDGVLTAAVRFLPVKDGQTAFVDNTVGMAAQTLSAALLPYGVKSIV
130 140 150 160 170 180
190 200 210 220 230 240
orf25a.pep MIDGKAVKKEDAVRIXSXXAREXEPSKXXPEDILEHNAAGGDADVPQAGEDAPEPEILHP
||||||||||||||| | ||| |||| :|||||||||||||| ||||:| |||||||||
orf25-1 MIDGKAVKKEDAVRILSGKAREEEPSKPTPEDILEHNAAGGDAGVPQAAEGAPEPEILHP
190 200 210 220 230 240
250 260 270 280 290 300
orf25a.pep DDGERADTVTVSRGEVEEARVQNQRAESEITKLWGGLDTDVQKELVGEXRKWAQEKISNC
|||||||||||||||||||||||||||||||||||||||||||||||| |||||||||||
orf25-1 DDGERADTVTVSRGEVEEARVQNQRAESEITKLWGGLDTDVQKELVGEQRKWAQEKISNC
250 260 270 280 290 300
310 320 330 339
orf25a.pep RQAAAQADRQEYAEYLKLQCDTRMTRERIQYLRGYSIDX
|||||||||||||||||||||||||||||||||||||||
orf25-1 RQAAAQADRQEYAEYLKLQCDTRMTRERIQYLRGYSIDX
310 320 330
Homology with a predicted ORF from N. gonorrhoeae
ORF25 shows 100% identity over a 60 aa overlap with a predicted ORF (ORF25ng) from N. gonorrhoeae:
orf25.pep TDVQKELVGEQRKWAQEKISNCRQAAAQAD 30
||||||||||||||||||||||||||||||
orf25ng VTVSRGEVEEARVQNQRAESEITKLWGGLDTDVQKELVGEQRKWAQEKISNCRQAAAQAD 308
orf25.pep RQEYAEYLKLQCDTRMTRERIQYLRGYSID 60
||||||||||||||||||||||||||||||
orf25ng RQEYAEYLKLQCDTRMTRERIQYLRGYSID 338
The complete length ORF25ng nucleotide sequence <SEQ ID 629> is:
1 ATGTATCGGA AACTCATTGC GCTGCCGTTT GCCCTGCTGC TTGCAGCGTG
51 CGGCAGGGAA GAACCGCCCA AGGCGTTGGA ATGCGCCAAC CCCGCCGTGT
101 TGCAGGACAT ACGCGGCAGT ATTCAGGAAA CGCTCACGCA GGAAGCGCGT
151 TCTTTCGCGC GCGAAGACGG CAGGCAGTTT GTCGATGCCG ACAAAATTAT
201 CGCCGCCGCC TACGGTTTGG CGTTTTCTTT GGAACACGCT TCGGAAACGC
251 AGGAAGGCGG GCGCACGTTC TGTATCGCCG ATTTGAACAT TACCGTGCCG
301 TCTGAAACGC TTGCCGATGC CGAGGCAAAC AGCCCCCTGC TGTATGGGGA
351 AACGTCTTTG GCAGACATCG TGCAGCAGAA GACGGGCGGC AATGTCGAGT
401 TTAAAGACGG CGTATTGACG GCAGCCGTCC GCTTCCTGCC CGCCAAAGAC
451 GCTCGGACGG CATTTATCGA CAACACGGTC GGTATGGCGA CGCAAACGCT
501 GTCTGCCGCG TTGCTGCCTT ACGGCGTGAA GAGCATCGTG ATGATAGACG
551 GCAAGGCGGT GACAAAAGAA GACGCGGTCA GGGTTTTGAG CGGCAAAGCC
601 CGTGAAGAAG AACCGTCCAA ACCCACCCCC GAAGACATTT TGGAACACAA
651 TGCCGCCGGC GGCGATGCGG GCGTACCCCA AGCCGCAGAA GGCGCACCCG
701 AACCCGAAAT CCTGCATCCC GACGACGTCG AGCGTGCCGA TACCGTTACC
751 GTATCACGGG GCGAAGTGGA AGAGGCGCGC GTACAAAACC AACGTGCGGA
801 ATCCGAAATT ACCAAACTTT GGGGAGGACT CGATACCGAC GTGCAAAAAG
851 AGTTGGTCGG CGAACAGCGC AAGTGGGCGC AGGAAAAAAT CAGcaactgc
901 cgACAAGCCG CCGCGCAGGC AGACCGGCAG GAATACGCCG AATACCTCAA
951 GCTCCAATGC GACACGCGGA TGACGCGCGA ACggaTACAG TATCTTCGCG
1001 GCTATTCCAT CGATTAG
This encodes a protein having the amino acid sequence <SEQ ID 630>:
1 MYRKLIALPF ALLLAACGRE EPPKALECAN PAVLQDIRGS IQETLTQEAR
51 SFAREDGRQF VDADKIIAAA YGLAFSLEHA SETQEGGRTF CIADLNITVP
101 SETLADAEAN SPLLYGETSL ADIVQQKTGG NVEFKDGVLT AAVRFLPAKD
151 ARTAFIDNTV GMATQTLSAA LLPYGVKSIV MIDGKAVTKE DAVRVLSGKA
201 REEEPSKPTP EDILEHNAAG GDAGVPQAAE GAPEPEILHP DDVERADTVT
251 VSRGEVEEAR VQNQRAESEI TKLWGGLDTD VQKELVGEQR KWAQEKISNC
301 RQAAAQADRQ EYAEYLKLQC DTRMTRERIQ YLRGYSID*
ORF25ng and ORF25-1 show 95.9% identity over a 338 aa overlap:
10 20 30 40 50 60
orf25-1.pep MYRKLIALPFALLLAACGREEPPKALECANPAVLQGIRGNIQETLTQEARSFAREDGRQF
||||||||||||||||||||||||||||||||||| |||:||||||||||||||||||||
orf25ng MYRKLIALPFALLLAACGREEPPKALECANPAVLQDIRGSIQETLTQEARSFAREDGRQF
10 20 30 40 50 60
70 80 90 100 110 120
orf25-1.pep VDADKIIAAAYGLAFSLEHASETQEGGRTFCIADLNITVPSETLADAKANSPLLYGETAL
|||||||||||||||||||||||||||||||||||||||||||||||:||||||||||:|
orf25ng VDADKIIAAAYGLAFSLEHASETQEGGRTFCIADLNITVPSETLADAEANSPLLYGETSL
70 80 90 100 110 120
130 140 150 160 170 180
orf25-1.pep SDIVRQKTGGNVEFKDGVLTAAVRFLPVKDGQTAFVDNTVGMAAQTLSAALLPYGVKSIV
:|||:||||||||||||||||||||||:||::|||:|||||||:||||||||||||||||
orf25ng ADIVQQKTGGNVEFKDGVLTAAVRFLPAKDARTAFIDNTVGMATQTLSAALLPYGVKSIV
130 140 150 160 170 180
190 200 210 220 230 240
orf25-1.pep MIDGKAVKKEDAVRILSGKAREEEPSKPTPEDILEHNAAGGDAGVPQAAEGAPEPEILHP
||||||| ||||||:|||||||||||||||||||||||||||||||||||||||||||||
orf25ng MIDGKAVTKEDAVRVLSGKAREEEPSKPTPEDILEHNAAGGDAGVPQAAEGAPEPEILHP
190 200 210 220 230 240
250 260 270 280 290 300
orf25-1.pep DDGERADTVTVSRGEVEEARVQNQRAESEITKLWGGLDTDVQKELVGEQRKWAQEKISNC
|| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf25ng DDVERADTVTVSRGEVEEARVQNQRAESEITKLWGGLDTDVQKELVGEQRKWAQEKISNC
250 260 270 280 290 300
310 320 330 339
orf25-1.pep RQAAAQADRQEYAEYLKLQCDTRMTRERIQYLRGYSIDX
|||||||||||||||||||||||||||||||||||||||
orf25ng RQAAAQADRQEYAEYLKLQCDTRMTRERIQYLRGYSIDX
310 320 330
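Based on this analysis, including the presence of a predicted prokaryotic membrane lipoprotein lipid attachment site (underlined) in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
ORF25-1 (37kDa) was cloned in pET and pGex vectors and expressed in E. coli, as described above. The products of protein expression and purification were analyzed by SDS-PAGE. Figure 16A shows the results of affinity purification of the GST-fusion protein, and Figure 16B shows the results of expression of the His-fusion in E. coli. Purified His-fusion protein was used to immunise mice, and the mouse sera were used for Western blot (Figure 16C), ELISA (positive result), and FACS analysis (Figure 16D). These experiments confirm that ORF25-1 is a surface-exposed protein and a useful immunogen.
Figure 16E shows plots of hydrophilicity, antigenic index, and AMPHI regions for ORF25-1.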
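The 37kDa figure quoted for ORF25-1 can be sanity-checked against the 338-residue length of the sequence listed for SEQ ID 626. The Python sketch below uses an average residue mass of about 110 Da, which is a common rule of thumb and an assumption made here, not a value taken from this document; the function and constant names are illustrative.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average; an assumption, not from the source


def approx_mass_kda(n_residues: int) -> float:
    """Rough protein mass in kDa estimated from residue count alone."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


ORF25_1_LENGTH = 338  # residues in SEQ ID 626, excluding the terminal stop
print(f"ORF25-1: about {approx_mass_kda(ORF25_1_LENGTH):.1f} kDa")
```

The estimate is only a coarse consistency check; an exact mass would require summing the individual residue masses for the actual composition.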
Example 82
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 631>:
1 ATGCAGCTGA TCGACTATTC ACATTCATTT TTCTCGGTTG TGCCACCCTT
51 TTTGGCACTG GCACTTGCCG TCATTACCCG CCGCGTACTG CTGTCTTTAG
101 GCATCGGTAT TCTGGwysGC GTTGCCTTTT TGGTCGGCGG CAACCCCGTC
151 GACGGTCTGA CACACCTGAA AGACATGGTC GTCGGCTTGG CTTGGTCAGA
201 CGsyGATTGG TCGCTGGGCA AACCAAAAAT CTTGGTTTTC CkGATACTTT
251 TGGGTATTTT TACTTCCCTG CTGACCTACT CCGGCAGCAA T.........
//
851 .......... .......... .......... ........AC TTCGCTGGTA
901 TTCGGCGGCA CTTGCGGCGT CTTTGCCGTC GTTCTCTGCA CGCTCGGCAC
951 GATTAAAACC GCCGACTATC CCAAAGCCGT TTGGCAGGGT GCGAAATCTA
1001 TGTTCGGCGC AATCGCCATT TTAATCCTCG CTTGGCTCAT CAGTACGGTT
1051 GTCGGCGAAA TGCACACCGG CGATTACCTC TCCACACTGG TTGCGGGCAA
1101 CATCCATCCC GGCTTCCTGC CCGTCATCCT CTTCCTGCTC GCCAGCGTGA
1151 TGGCGTTTGC CACAGGCACA AGCTGGGGGA CGTTCGGCAT TATGCTGCCG
1201 ATTGCCGCCG CCATGGCGGT CAAAGTCGAA CCCGCGCTGA TTATCCCGTG
1251 TATGTCCGCA GTAATGGCGG GGGCGGTATG CGGCGACCAC TGCTCGCCCA
1301 TTTCCGACAC GACCATCCTG TCGTCCACCG GCGCGCGCTG CAACCACATC
1351 GACCACGTTA CCTCGCAACT GCCTTACGCC TTAACCGTTG CCGCCGCCGC
1401 CGCATCGGGC TACCTCGCAT TGGGTCTGAC AAAATCCGCG CTGTTGGGCT
1451 TTGGCACGAC AGGCATTGTA TTGGCGGTGC TGATTTTTCT GTTGAAAGAT
1501 AAAAAA..
This corresponds to the amino acid sequence <SEQ ID 632; ORF26>:
1 MQLIDYSHSF FSVVPPFLAL ALAVITRRVL LSLGIGILXX VAFLVGGNPV
51 DGLTHLKDMV VGLAWSDXDW SLGKPKILVF XILLGIFTSL LTYSGSN...
//
251 .......... .......... .......... .......... ......TSLV
301 FGGTCGVFAV VLCTLGTIKT ADYPKAVWQG AKSMFGAIAI LILAWLISTV
351 VGEMHTGDYL STLVAGNIHP GFLPVILFLL ASVMAFATGT SWGTFGIMLP
401 IAAAMAVKVE PALIIPCMSA VMAGAVCGDH CSPISDTTIL SSTGARCNHI
451 DHVTSQLPYA LTVAAAAASG YLALGLTKSA LLGFGTTGIV LAVLIFLLKD
501 KK..
Further work revealed the complete nucleotide sequence <SEQ ID 633>:
1 ATGCAGCTGA TCGACTATTC ACATTCATTT TTCTCGGTTG TGCCACCCTT
51 TTTGGCACTG GCACTTGCCG TCATTACCCG CCGCGTACTG CTGTCTTTAG
101 GCATCGGTAT TCTGGTCGGC GTTGCCTTTT TGGTCGGCGG CAACCCCGTC
151 GACGGTCTGA CACACCTGAA AGACATGGTC GTCGGCTTGG CTTGGTCAGA
201 CGGCGATTGG TCGCTGGGCA AACCAAAAAT CTTGGTTTTC CTGATACTTT
251 TGGGTATTTT TACTTCCCTG CTGACCTACT CCGGCAGCAA TCAGGCGTTT
301 GCCGACTGGG CAAAACGGCA CATTAAAAAC CGGCGCGGCG CGAAAATGCT
351 GACCGCCTGC CTCGTGTTCG TAACCTTTAT CGACGACTAT TTCCACAGTC
401 TCGCCGTCGG TGCGATTGCC CGCCCCGTTA CCGACAAGTT TAAAGTTTCC
451 CGCACCAAAC TCGCCTACAT CCTCGACTCC ACTGCCGCTC CTATGTGCGT
501 GCTGATGCCC GTTTCAAGCT GGGGCGCGTC GATTATCGCC ACGCTTGCCG
551 GACTGCTCGT TACCTACAAA ATCACCGAAT ACACGCCGAT GGGGACGTTT
601 GTCGCCATGA GCCTGATGAA CTATTACGCA CTGTTTGCCC TGATTATGGT
651 GTTCGTCGTC GCATGGTTTT CCTTCGACAT CGGCTCGATG GCACGTTTCG
701 AACAAGCCGC GTTGAACGAA GCCCACGATG AAACTGCCGT TTCAGACGCT
751 ACCAAAGGTC GTGTTTACGC ACTGATTATT CCCGTTTTGG CCTTAATCGC
801 CTCAACGGTT TCCGCCATGA TCTACACCGG CGCGCAGGCA AGCGAAACCT
851 TCAGCATTTT GGGGGCATTT GAAAACACGG ACGTAAACAC TTCGCTGGTA
901 TTCGGCGGCA CTTGCGGCGT CCTTGCCGTC GTTCTCTGCA CGCTCGGCAC
951 GATTAAAACC GCCGACTATC CCAAAGCCGT TTGGCAGGGT GCGAAATCTA
1001 TGTTCGGCGC AATCGCCATT TTAATCCTCG CTTGGCTCAT CAGTACGGTT
1051 GTCGGCGAAA TGCACACCGG CGATTACCTC TCCACACTGG TTGCGGGCAA
1101 CATCCATCCC GGCTTCCTGC CCGTCATCCT CTTCCTGCTC GCCAGCGTGA
1151 TGGCGTTTGC CACAGGCACA AGCTGGGGGA CGTTCGGCAT TATGCTGCCG
1201 ATTGCCGCCG CCATGGCGGT CAAAGTCGAA CCCGCGCTGA TTATCCCGTG
1251 TATGTCCGCA GTAATGGCGG GGGCGGTATG CGGCGACCAC TGCTCGCCCA
1301 TTTCCGACAC GACCATCCTG TCGTCCACCG GCGCGCGCTG CAACCACATC
1351 GACCACGTTA CCTCGCAACT GCCTTACGCC TTAACCGTTG CCGCCGCCGC
1401 CGCATCGGGC TACCTCGCAT TGGGTCTGAC AAAATCCGCG CTGTTGGGCT
1451 TTGGCACGAC AGGCATTGTA TTGGCGGTGC TGATTTTTCT GTTGAAAGAT
1501 AAAAAACGCG CCAACGCCTG A
This corresponds to the amino acid sequence <SEQ ID 634; ORF26-1>:
1 MQLIDYSHSF FSVVPPFLAL ALAVITRRVL LSLGIGILVG VAFLVGGNPV
51 DGLTHLKDMV VGLAWSDGDW SLGKPKILVF LILLGIFTSL LTYSGSNQAF
101 ADWAKRHIKN RRGAKMLTAC LVFVTFIDDY FHSLAVGAIA RPVTDKFKVS
151 RTKLAYILDS TAAPMCVLMP VSSWGASIIA TLAGLLVTYK ITEYTPMGTF
201 VAMSLMNYYA LFALIMVFVV AWFSFDIGSM ARFEQAALNE AHDETAVSDA
251 TKGRVYALII PVLALIASTV SAMIYTGAQA SETFSILGAF ENTDVNTSLV
301 FGGTCGVLAV VLCTLGTIKT ADYPKAVWQG AKSMFGAIAI LILAWLISTV
351 VGEMHTGDYL STLVAGNIHP GFLPVILFLL ASVMAFATGT SWGTFGIMLP
401 IAAAMAVKVE PALIIPCMSA VMAGAVCGDH CSPISDTTIL SSTGARCNHI
451 DHVTSQLPYA LTVAAAAASG YLALGLTKSA LLGFGTTGIV LAVLIFLLKD
501 KKRANA*
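The partial sequence <SEQ ID 631> contains lower-case IUPAC ambiguity codes (w, y, s, k), and the corresponding positions in the partial protein appear as X. The Python sketch below illustrates why: a codon with ambiguous bases is expanded to every codon it could represent, and a residue is reported only if all expansions agree. The function names are hypothetical, and the rule "report X whenever more than one residue is possible" is an assumption about how the listing was produced.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def possible_residues(codon: str) -> set[str]:
    """All residues a three-letter, possibly ambiguous, codon could encode."""
    options = [IUPAC[base] for base in codon.upper()]
    return {CODON_TABLE["".join(c)] for c in product(*options)}


def translate_codon(codon: str) -> str:
    """Return the residue if it is unambiguous, otherwise 'X'."""
    residues = possible_residues(codon)
    return next(iter(residues)) if len(residues) == 1 else "X"


# Codons 39 and 40 of SEQ ID 631 read 'Gwy' and 'sGC'; SEQ ID 633 has GTC and GGC there.
for codon in ("Gwy", "sGC", "GTC", "GGC"):
    print(codon, sorted(possible_residues(codon)), translate_codon(codon))
```

Under these assumptions `Gwy` could encode D or V and `sGC` could encode G or R, so both are reported as X, while the resolved codons GTC and GGC give V and G as in ORF26-1.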
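Computer analysis of this amino acid sequence gave the following results:
Homology with the hypothetical transmembrane protein HI1586 of H. influenzae (accession number P44263)
ORF26 and HI1586 show 53% and 49% amino acid identity over overlaps of 97 and 221 amino acids at the N-terminus and C-terminus, respectively: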
Orf26 1 MQLIDYSHSFFSVVPPFLALALAVITRRVXXXXXXXXXXXVAFLVGGNPVDGLTHLKDMV 60
M+LID+S S +S+VP LA+ LA+ TRRV L +L V
HI1586 14 MELIDFSSSVWSIVPALLAIILAIATRRVLVSLSAGIIIGSLMLSDWQIGSAFNYLVKNV 73
Orf26 61 VGLAWSDXDWSLGKPKILVFXILLGIFTSLLTYSGSN 97
V L ++D + + I++F +LLG+ T+LLT SGSN
HI1586 74 VSLVYADGEIN-SNMNIVLFLLLLGVLTALLTVSGSN 109
//
Orf26 86 IFTSLLTYSGS--NTSLVFGGTCGVFAVVLCTL--GTIKTADYPKAVWQGAKSMFGXXXX 141
+F+ L T+ + TSLV GG C + L + + +Y ++ G KSM G
HI1586 299 VFSVLGTFENTVVGTSLVVGGFCSIIISTLLIILDRQVSVPEYVRSWIVGIKSMSGAIAI 358
Orf26 142 XXXXXXXSTVVGEMHTGDYLSTLVAGNIHPGFLPVILFLLASVMAFATGTSWGTFGIMLP 201
+ +VG+M TG YLS+LV+GNI FLPVILF+L + MAF+TGTSWGTFGIMLP
HI1586 359 LFFAWTINKIVGDMQTGKYLSSLVSGNIPMQFLPVILFVLGAAMAFSTGTSWGTFGIMLP 418
Orf26 202 IAAAMAVKVEPALIIPCMSAVMAGAVCGDHCSPISDTTILSSTGARCNHIDHVTSQXXXX 261
IAAAMA P L++PC+SAVMAGAVCGDHCSP+SDTTILSSTGA+CNHIDHVT+Q
HI1586 419 IAAAMAANAAPELLLPCLSAVMAGAVCGDHCSPVSDTTILSSTGAKCNHIDHVTTQLPYA 478
Orf26 262 XXXXXXXXXXXXXXXXXKSALLGFGTTGIVLAVLIFLLKDK 302
S L GF T + L V+IF +K +
HI1586 479 ATVATATSIGYIVVGFTYSGLAGFAATAVSLIVIIFAVKKR 519
Homology with a predicted ORF from N. meningitidis (strain A)
ORF26 shows 58.2% identity over a 502 aa overlap with an ORF (ORF26a) from strain A of N. meningitidis:
10 20 30 40 50 60
orf26.pep MQLIDYSHSFFSVVPPFLALALAVITRRVLLSLGIGILXXVAFLVGGNPVDGLTHLKDMV
|||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||
orf26a MQLIDYSHSFFSVVPPFLALALAVITRRVLLSLGIGILVGVAFLVGGNPVDGLTHLKDMV
10 20 30 40 50 60
70 80 90 99
orf26.pep VGLAWSDXDWSLGKPKILVFXILLGIFTSLLTYSGSNXX---------------------
||||||| |||||||| ||| ||||||||||||||||
orf26a VGLAWSDGDWSLGKPKXLVFLILLGIFTSLLTYSGSNQAFADWAKRHIKNRRGAKMLTAC
70 80 90 100 110 120
orf26.pep ------------------------------------------------------------
orf26a LVFVTFIDDYFHSLAVGAXARPVTDKFKVSRAKLAYILDSTAAPMCVLMPVSSWGASIIA
130 140 150 160 170 180
orf26.pep ------------------------------------------------------------orf26a TLAGLLVTYKITEYTPMGTFVAMSLMNYYALFALIMVFVVAWFSFDIGSMARFEQAALNE
190 200 210 220 230 240
100 110orf26.pep --------------------------------------------------------TSLV
||||orf26a AHDETAVSDGSWGRVYALIIPVLALIASTVSAMIYTGAQASETFSILGAFENTDVNTSLV
250 260 270 280 290 300
120 130 140 150 160 170orf26.pep FGGTCGVFAVVLCTLGTIKTADYPKAVWQGAKSMFGAIAILILAWLISTVVGEMHTGDYL
|||||||:||||||||||| ||||||||||||||||||||||||||||||||||||||||orf26a FGGTCGVLAVVLCTLGTIKIADYPKAVWQGAKSMFGAIAILILAWLISTVVGEMHTGDYL
310 320 330 340 350 360
180 190 200 210 220 230orf26.peD STLVAGNIHPGFLPVILFLLASVMAFATGTSWGTFGIMLPIAAAMAVKVEPALIIPCMSA
||||||||||||| |||||||||||||||||||||||||||||||||||:|:||||||||orf26a STLVAGNIHPGFLXVILFLLASVMAFATGTSWGTFGIMLPIAAAMAVKVDPSLIIPCMSA
370 380 390 400 410 420
240 250 260 270 280 290orf26.pep VMAGAVCGDHCSPISDTTILSSTGARCNHIDHVTSQLPYALTVAAAAASGYLALGLTKSA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf26a VMAGAVCGDHCSPISDTTILSSTGARCNHIDHVTSQLPYALTVAAAAASGYLALGLTKSA
430 440 450 460 470 480
300 310orf26.pep LLGFGTTGIVLAVLIFLLKDKK
|||||:||||||||||||||||orf26a LLGFGXTGIVLAVLIFLLKDKKRANAX
490 500全长ORF26a核苷酸序列<SEQ ID 635>是:
1 ATGCAGCTGA TCGACTATTC ACATTCATTT TTCTCGGTTG TGCCACCCTT
51 TTTGGCACTG GCACTTGCCG TCATTACCCG CCGCGTACTG CTGTCTTTAG
101 GCATCGGTAT TCTGGTCGGC GTTGCCTTTT TGGTCGGCGG CAACCCCGTC
151 GACGGTCTGA CACACCTGAA AGACATGGTC GTCGGCTTGG CTTGGTCAGA
201 CGGCGATTGG TCGCTGGGCA AACCAAAANT CTTGGTTTTC CTGATACTTT
251 TGGGTATTTT TACTTCCCTG CTGACCTACT CCGGCAGCAA TCAGGCGTTT
301 GCCGACTGGG CAAAACGGCA CATTAAAAAC CGGCGCGGCG CGAAAATGCT
351 GACCGCCTGC CTCGTGTTCG TAACCTTTAT CGACGACTAT TTCCACAGTC
401 TCGCCGTCGG TGCGNTTGCC CGCCCCGTTA CCGACAAGTT TAAAGTTTCC
451 CGCGCCAAAC TCGCCTACAT CCTCGACTCC ACTGCCGCGC CTATGTGCGT
501 GCTGATGCCC GTTTCAAGCT GGGGCGCGTC GATTATCGCC ACGCTTGCCG
551 GACTGCTCGT TACCTACAAA ATCACCGAAT ACACGCCGAT GGGGACGTTT
601 GTCGCCATGA GCCTGATGAA CTATTACGCA CTGTTTGCCC TGATTATGGT
651 GTTCGTCGTC GCATGGTTCT CCTTCGACAT CGGCTCGATG GCACGTTTCG
701 AACAAGCCGC GTTGAACGAA GCCCACGATG AAACTGCCGT TTCAGACGGC
751 AGCTGGGGCA GGGTTTACGC ATTGATTATT CCCGTTTTGG CCTTAATCGC
801 CTCAACGGTT TCCGCCATGA TCTACACCGG TGCACAGGCA AGCGAAACCT
851 TCAGCATTTT GGGTGCATTT GAAAATACGG ACGTGAACAC TTCGCTGGTA
901 TTCGGCGGCA CTTGCGGCGT GCTTGCCGTC GTCCTCTGCA CGCTCGGCAC
951 GATTAAAATC GCCGATTATC CCAAAGCCGT TTGGCAGGGT GCGAAATCCA
1001 TGTTCGGCGC AATCGCCATT TTAATCCTTG CCTGGCTCAT CAGTACGGTT
1051 GTCGGCGAAA TGCACACAGG CGACTACCTC TCCACGCTGG TTGCGGGCAA
1101 CATCCATCCC GGCTTCCTGN CCGTCATCCT TTTCCTGCTC GCCAGCGTGA
1151 TGGCGTTTGC CACAGGCACA AGCTGGGGGA CGTTCGGCAT CATGCTGCCG
1201 ATTGCCGCCG CCATGGCGGT CAAAGTCGAT CCCTCACTGA TTATCCCGTG
1251 TATGTCCGCC GTGATGGCGG GGGCGGTATG CGGCGACCAC TGCTCGCCCA
1301 TTTCCGACAC GACCATCCTG TCGTCCACCG GCGCGCGCTG CAACCACATC
1351 GACCACGTTA CNTCGCAACT GCCTTACGCC TTAACCGTTG CCGCCGCCGC
1401 CGCATCGGGN TACCTCGCAT TGGGTCTGAC AAAATCCGCG CTGTTGGGTT
1451 TTGGCANGAC AGGCATTGTA TTGGCGGTGC TGATTTTTCT GTTGAAAGAT
1501 AAAAAACGCG CCAACGCCTG A
This encodes a protein having the amino acid sequence <SEQ ID 636>:
1 MQLIDYSHSF FSVVPPFLAL ALAVITRRVL LSLGIGILVG VAFLVGGNPV
51 DGLTHLKDMV VGLAWSDGDW SLGKPKXLVF LILLGIFTSL LTYSGSNQAF
101 ADWAKRHIKN RRGAKMLTAC LVFVTFIDDY FHSLAVGAXA RPVTDKFKVS
151 RAKLAYILDS TAAPMCVLMP VSSWGASIIA TLAGLLVTYK ITEYTPMGTF
201 VAMSLMNYYA LFALIMVFVV AWFSFDIGSM ARFEQAALNE AHDETAVSDG
251 SWGRVYALII PVLALIASTV SAMIYTGAQA SETFSILGAF ENTDVNTSLV
301 FGGTCGVLAV VLCTLGTIKI ADYPKAVWQG AKSMFGAIAI LILAWLISTV
351 VGEMHTGDYL STLVAGNIHP GFLXVILFLL ASVMAFATGT SWGTFGIMLP
401 IAAAMAVKVD PSLIIPCMSA VMAGAVCGDH CSPISDTTIL SSTGARCNHI
451 DHVTSQLPYA LTVAAAAASG YLALGLTKSA LLGFGXTGIV LAVLIFLLKD
501 KKRANA*
ORF26a and ORF26-1 show 97.8% identity over a 506 amino acid overlap:
10 20 30 40 50 60
orf26a.pep MQLIDYSHSFFSVVPPFLALALAVITRRVLLSLGIGILVGVAFLVGGNPVDGLTHLKDMV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf26-1 MQLIDYSHSFFSVVPPFLALALAVITRRVLLSLGIGILVGVAFLVGGNPVDGLTHLKDMV
10 20 30 40 50 60
70 80 90 100 110 120
orf26a.pep VGLAWSDGDWSLGKPKXLVFLILLGIFTSLLTYSGSNQAFADWAKRHIKNRRGAKMLTAC
|||||||||||||||| |||||||||||||||||||||||||||||||||||||||||||
orf26-1 VGLAWSDGDWSLGKPKILVFLILLGIFTSLLTYSGSNQAFADWAKRHIKNRRGAKMLTAC
70 80 90 100 110 120
130 140 150 160 170 180
orf26a.pep LVFVTFIDDYFHSLAVGAXARPVTDKFKVSRAKLAYILDSTAAPMCVLMPVSSWGASIIA
|||||||||||||||||| ||||||||||||:||||||||||||||||||||||||||||
orf26-1 LVFVTFIDDYFHSLAVGAIARPVTDKFKVSRTKLAYILDSTAAPMCVLMPVSSWGASIIA
130 140 150 160 170 180
190 200 210 220 230 240
orf26a.pep TLAGLLVTYKITEYTPMGTFVAMSLMNYYALFALIMVFVVAWFSFDIGSMARFEQAALNE
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf26-1 TLAGLLVTYKITEYTPMGTFVAMSLMNYYALFALIMVFVVAWFSFDIGSMARFEQAALNE
190 200 210 220 230 240
250 260 270 280 290 300
orf26a.pep AHDETAVSDGSWGRVYALIIPVLALIASTVSAMIYTGAQASETFSILGAFENTDVNTSLV
|||||||||:: ||||||||||||||||||||||||||||||||||||||||||||||||
orf26-1 AHDETAVSDATKGRVYALIIPVLALIASTVSAMIYTGAQASETFSILGAFENTDVNTSLV
250 260 270 280 290 300
310 320 330 340 350 360
orf26a.pep FGGTCGVLAVVLCTLGTIKIADYPKAVWQGAKSMFGAIAILILAWLISTVVGEMHTGDYL
||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||
orf26-1 FGGTCGVLAVVLCTLGTIKTADYPKAVWQGAKSMFGAIAILILAWLISTVVGEMHTGDYL
310 320 330 340 350 360
370 380 390 400 410 420
orf26a.pep STLVAGNIHPGFLXVILFLLASVMAFATGTSWGTFGIMLPIAAAMAVKVDPSLIIPCMSA
||||||||||||| |||||||||||||||||||||||||||||||||||:|:||||||||
orf26-1 STLVAGNIHPGFLPVILFLLASVMAFATGTSWGTFGIMLPIAAAMAVKVEPALIIPCMSA
370 380 390 400 410 420
430 440 450 460 470 480
orf26a.pep VMAGAVCGDHCSPISDTTILSSTGARCNHIDHVTSQLPYALTVAAAAASGYLALGLTKSA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf26-1 VMAGAVCGDHCSPISDTTILSSTGARCNHIDHVTSQLPYALTVAAAAASGYLALGLTKSA
430 440 450 460 470 480
490 500
orf26a.pep LLGFGXTGIVLAVLIFLLKDKKRANAX
|||||:|||||||||||||||||||||
orf26-1 LLGFGTTGIVLAVLIFLLKDKKRANAX
490 500
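Homology with a predicted ORF from N. gonorrhoeae
ORF26 shows 94.8% and 99% identity over overlaps of 97 and 206 amino acids at the N-terminus and C-terminus, respectively, with a predicted ORF (ORF26ng) from N. gonorrhoeae: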
orf26.pep MQLIDYSHSFFSVVPPFLALALAVITRRVLLSLGIGILXXVAFLVGGNPVDGLTHLKDMV 60
|||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||
orf26ng MQLIDYSHSFFSVVPPFLALALAVITRRVLLSLGIGILVGVAFLVGGNPVDGLTHLKDMV 60
orf26.pep VGLAWSDXDWSLGKPKILVFXILLGIFTSLLTYSGSN 97
|||||:| |||||||||||| ||||||||||||||||
orf26ng VGLAWADGDWSLGKPKILVFLILLGIFTSLLTYSGSNQAFADWAKRHIKNRCGAKMLTAC 120
//
orf26.pep TSLVFGGTCGVFAVVLCTLGTIKTADYPKA 326
|||||||||||:||||||:|||||||||||
orf26ng ASTVSAMIYTGAQASETFSILGAFENTDVNTSLVFGGTCGVLAVVLCTFGTIKTADYPKA 326
orf26.pep VWQGAKSMFGAIAILILAWLISTVVGEMHTGDYLSTLVAGNIHPGFLPVILFLLASVMAF 386
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf26ng VWQGAKSMFGAIAILILAWLISTVVGEMHTGDYLSTLVAGNIHPGFLPVILFLLASVMAF 386
orf26.pep ATGTSWGTFGIMLPIAAAMAVKVEPALIIPCMSAVMAGAVCGDHCSPISDTTILSSTGAR 446
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf26ng ATGTSWGTFGIMLPIAAAMAVKVEPALIIPCMSAVMAGAVCGDHCSPISDTTILSSTGAR 446
orf26.pep CNHIDHVTSQLPYALTVAAAAASGYLALGLTKSALLGFGTTGIVLAVLIFLLKDKK 502
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf26ng CNHIDHVTSQLPYALTVAAAAASGYLALGLTKSALLGFGTTGIVLAVLIFLLKDKKRADV 506
The complete length ORF26ng nucleotide sequence <SEQ ID 637> is:
1 ATGCAGCTGA TTGACTATTC ACATTCATTT TTCTCGGTTG TGCCACCCTT
51 TTTGGCACTG GCACTTGCCG TCATTACCCG CCGCGTACTG CTGTCTTTAG
101 GCATCGGTAT TTTGGTCGGC GTTGCCTTTT TGGTCGGCGG CAACCCCGTC
151 GACGGTCTGA CACACCTGAA AGACATGGTC GTCGGCTTGG CTTGGGCAGA
201 CGGCGATTGG TCGCTGGGCA AACCAAAAAT CTTGGTTTTC CTGATACTTT
251 TGGGCATTTT CACTTCACTG CTGACCTACT CCGGCAGCAA TCAGGCGTTT
301 GCCGACTGGG CAAAACGGCA CATTAAAAAC CGGTGCGGCG CGAAAATGCT
351 GACCGCCTGC CTCGTGTTCG TAACCTTTAT CGACGACTAT TTCCACAGCC
401 TCGCCGTCGG TGCGATTGCC CGCCCCGTTA CCGACAAGTT TAAAGTTTCC
451 CGCGCCAAAC TCGCCTACAT CCTCGACTCC ACTGCCTCGC CCATGTGCGT
501 GCTGATGCCC GTTTCAAGCT GGGGCGCGTC GATTATCGCC ACGCTTGCCG
551 GATTGCTCGT TACCTACAAA ATTACCGAAT ACACGCCGAT GGGGACGTTT
601 GTCGCCATGA GCCTGATGAA CTATTACGCG CTGTTTGCCC TGATTATGGT
651 ATTCGTCGTC GCATGGTTCT CCTTCGACAT CGGCTCGAtg gCGCGTTTCG
701 AACAGGCTGC GTTGAACGAA gcccaggacg aaaccgccgc tTCAGACgCT
751 ACCAAAGGTC GTGTTTACGC ATTGATTATT CCCGTTTTGG CCTTAATCGC
801 CTCAACGGTT TCCGCCATGA TCTACACCGG CGCGCAGGCA AGCGAAACCT
851 TCAGCATTTT GGGGGCATTT GAAAATACCG ACGTAAACAC TTCGCTGGTA
901 TTCGGCGGCA CTTGCGGCGT GCTTGCCGTC GTCCTCTGCA CGTTCGGCAC
951 GATTAAAACC GCCGATTATC CCAAAGCCGT GTGGCAGGGT GCGAAATCCA
1001 TGTTCGGCGC AATCGCCATT TTAATCCTCG CCTGGCTCAT CAGTACGGTT
1051 GTCGGCGAAA TGCACACGGG CGACTACCTC TCCACGCTGG TTGCGGGCAA
1101 CATCCATCCC GGCTTCCTGC CCGTCATCCT CTTCCTGCTC GCCAGCGTGA
1151 TGGCGTTTGC CACAGGCACA AGCTGGGGGA CGTTCGGCAT TATGCTGCCG
1201 ATTGCCGCCG CCATGGCGGT CAAAGTCGAA CCCGCGCTGA TTAtcccGTG
1251 TATGTCCGCA GTAATGGCGG GGGCGGTATG CGGCGACCAC TGTTCGCCCA
1301 TCTCCGACAC GACCATCCTG TCGTCCACCG GCGCGCGCTG CAACCACATC
1351 GACCACGTTA CCTCGCAACT GCCTTATGCC CTGACGGTTG CCGCCGCCGC
1401 CGCATCGGGC TACCTCGCAT TGGGTCTGAC AAAATCCGCG CTGTTGGGCT
1451 TTGGCACGAC CGGTATTGTA TTGGCGGTGC TGATTTTTCT GTTGAAAGAT
1501 AAAAAACGCG CCGACGTTTG A
This encodes a protein having the amino acid sequence <SEQ ID 638>:
1 MQLIDYSHSF FSVVPPFLAL ALAVITRRVL LSLGIGILVG VAFLVGGNPV
51 DGLTHLKDMV VGLAWADGDW SLGKPKILVF LILLGIFTSL LTYSGSNQAF
101 ADWAKRHIKN RCGAKMLTAC LVFVTFIDDY FHSLAVGAIA RPVTDKFKVS
151 RAKLAYILDS TASPMCVLMP VSSWGASIIA TLAGLLVTYK ITEYTPMGTF
201 VAMSLMNYYA LFALIMVFVV AWFSFDIGSM ARFEQAALNE AQDETAASDA
251 TKGRVYALII PVLALIASTV SAMIYTGAQA SETFSILGAF ENTDVNTSLV
301 FGGTCGVLAV VLCTFGTIKT ADYPKAVWQG AKSMFGAIAI LILAWLISTV
351 VGEMHTGDYL STLVAGNIHP GFLPVILFLL ASVMAFATGT SWGTFGIMLP
401 IAAAMAVKVE PALIIPCMSA VMAGAVCGDH CSPISDTTIL SSTGARCNHI
451 DHVTSQLPYA LTVAAAAASG YLALGLTKSA LLGFGTTGIV LAVLIFLLKD
501 KKRADV*
ORF26ng and ORF26-1 show 98.4% identity over a 505 amino acid overlap:
10 20 30 40 50 60
orf26-1.pep MQLIDYSHSFFSVVPPFLALALAVITRRVLLSLGIGILVGVAFLVGGNPVDGLTHLKDMV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf26ng MQLIDYSHSFFSVVPPFLALALAVITRRVLLSLGIGILVGVAFLVGGNPVDGLTHLKDMV
10 20 30 40 50 60
70 80 90 100 110 120
orf26-1.pep VGLAWSDGDWSLGKPKILVFLILLGIFTSLLTYSGSNQAFADWAKRHIKNRRGAKMLTAC
|||||:||||||||||||||||||||||||||||||||||||||||||||| ||||||||
orf26ng VGLAWADGDWSLGKPKILVFLILLGIFTSLLTYSGSNQAFADWAKRHIKNRCGAKMLTAC
70 80 90 100 110 120
130 140 150 160 170 180
orf26-1.pep LVFVTFIDDYFHSLAVGAIARPVTDKFKVSRTKLAYILDSTAAPMCVLMPVSSWGASIIA
|||||||||||||||||||||||||||||||:||||||||||:|||||||||||||||||
orf26ng LVFVTFIDDYFHSLAVGAIARPVTDKFKVSRAKLAYILDSTASPMCVLMPVSSWGASIIA
130 140 150 160 170 180
190 200 210 220 230 240
orf26-1.pep TLAGLLVTYKITEYTPMGTFVAMSLMNYYALFALIMVFVVAWFSFDIGSMARFEQAALNE
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf26ng TLAGLLVTYKITEYTPMGTFVAMSLMNYYALFALIMVFVVAWFSFDIGSMARFEQAALNE
190 200 210 220 230 240
250 260 270 280 290 300
orf26-1.pep AHDETAVSDATKGRVYALIIPVLALIASTVSAMIYTGAQASETFSILGAFENTDVNTSLV
|:||||:|||||||||||||||||||||||||||||||||||||||||||||||||||||
orf26ng AQDETAASDATKGRVYALIIPVLALIASTVSAMIYTGAQASETFSILGAFENTDVNTSLV
250 260 270 280 290 300
310 320 330 340 350 360
orf26-1.pep FGGTCGVLAVVLCTLGTIKTADYPKAVWQGAKSMFGAIAILILAWLISTVVGEMHTGDYL
||||||||||||||:|||||||||||||||||||||||||||||||||||||||||||||
orf26ng FGGTCGVLAVVLCTFGTIKTADYPKAVWQGAKSMFGAIAILILAWLISTVVGEMHTGDYL
310 320 330 340 350 360
370 380 390 400 410 420
orf26-1.pep STLVAGNIHPGFLPVILFLLASVMAFATGTSWGTFGIMLPIAAAMAVKVEPALIIPCMSA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf26ng STLVAGNIHPGFLPVILFLLASVMAFATGTSWGTFGIMLPIAAAMAVKVEPALIIPCMSA
370 380 390 400 410 420
430 440 450 460 470 480
orf26-1.pep VMAGAVCGDHCSPISDTTILSSTGARCNHIDHVTSQLPYALTVAAAAASGYLALGLTKSA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf26ng VMAGAVCGDHCSPISDTTILSSTGARCNHIDHVTSQLPYALTVAAAAASGYLALGLTKSA
430 440 450 460 470 480
490 500
orf26-1.pep LLGFGTTGIVLAVLIFLLKDKKRANAX
||||||||||||||||||||||||::
orf26ng LLGFGTTGIVLAVLIFLLKDKKRADVX
490 500
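In addition, ORF26ng shows significant homology to a hypothetical H. influenzae protein:
sp|P44263|YF86_HAEIN hypothetical protein HI1586 >gi|1074850|pir||C64037 hypothetical protein HI1586 - Haemophilus influenzae (strain Rd KW20) >gi|1574427 (U32832) H. influenzae predicted coding region HI1586 [Haemophilus influenzae] Length = 519
Score = 538 bits (1370), Expect = e-152
Identities = 280/507 (55%), Positives = 346/507 (68%), Gaps = 7/507 (1%)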
Query: 1 MQLIDYSHSFFSVVPPFLALALAVITRRXXXXXXXXXXXXXAFLVGGNPVDGLTHLKDMV 60
M+LID+S S +S+VP LA+ LA+ TRR L +L V
Sbjct: 14 MELIDFSSSVWSIVPALLAIILAIATRRVLVSLSAGIIIGSLMLSDWQIGSAFNYLVKNV 73
Query: 61 VGLAWADGDWSLGKPKILVFLILLGIFTSLLTYSGSNQAFADWAKRHIKNRCGAKMLTAC 120
V L +ADG+ + I++FL+LLG+ T+LLT SGSN+AFA+WA+ IK R GAK+L A
Sbjct: 74 VSLVYADGEIN-SNMNIVLFLLLLGVLTALLTVSGSNRAFAEWAQSRIKGRRGAKLLAAS 132
Query: 121 LVFVTFIDDYFHSLAVGAIARPVTDKFKVSRAKLAYILDSTASPMCVLMPVSSWGASIIA 180
LVFVTFIDDYFHSLAVGAIARPVTD+FKVSRAKLAYILDSTA+PMCV+MPVSSWGA II
Sbjct: 133 LVFVTFIDDYFHSLAVGAIARPVTDRFKVSRAKLAYILDSTAAPMCVMMPVSSWGAYIIT 192
Query: 181 TLAGLLVTYKITEYTPMGTFVAMSLMNYYALFALIMVFVVAWFSFDIGSMARFEQAALNE 240
+ GLL TY ITEYTP+G FVAMS MN+YA+F++IMVF VA+FSFDI SM R E+ AL
Sbjct: 193 LIGGLLATYSITEYTPIGAFVAMSSMNFYAIFSIIMVFFVAYFSFDIASMVRHEKLALKN 252
Query: 241 AQDETAASDATKGRVYALIIPVLALIASTVSAMIYTGAQA----SETFSILGAFENTDVN 296
+D+ TKG+V LI+P+L LI +TVS MIYTGA+A + FS+LG FENT V
Sbjct: 253 TEDQLEEETGTKGQVRNLILPILVLIIATVSMMIYTGAEALAADGKVFSVLGTFENTVVG 312
Query: 297 TSLVFGGTCGVL--AVVLCTFGTIKTADYPKAVWQGAKSMFGXXXXXXXXXXXSTVVGEM 354
TSLV GG C ++ +++ + +Y ++ G KSM G + +VG+M
Sbjct: 313 TSLVVGGFCSIIISTLLIILDRQVSVPEYVRSWIVGIKSMSGAIAILFFAWTINKIVGDM 372
Query: 355 HTGDYLSTLVAGNIHPGFLPVILFLLASVMAFATGTSWGTFGIMLPIAAAMAVKVEPALI 414
TG YLS+LV+GNI FLPVILF+L + MAF+TGTSWGTFGIMLPIAAAMA P L+
Sbjct: 373 QTGKYLSSLVSGNIPMQFLPVILFVLGAAMAFSTGTSWGTFGIMLPIAAAMAANAAPELL 432
Query: 415 IPCMSAVMAGAVCGDHCSPISDTTILSSTGARCNHIDHVTSQXXXXXXXXXXXXXXXXXX 474
+PC+SAVMAGAVCGDHCSP+SDTTILSSTGA+CNHIDHVT+Q
Sbjct: 433 LPCLSAVMAGAVCGDHCSPVSDTTILSSTGAKCNHIDHVTTQLPYAATVATATSIGYIVV 492
Query: 475 XXXKSALLGFGTTGIVLAVLIFLLKDK 501
S L GF T + L V+IF +K +
Sbjct: 493 GFTYSGLAGFAATAVSLIVIIFAVKKR 519
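Based on this analysis, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.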
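The identity percentages quoted for the pairwise alignments in this example can be read as the share of aligned columns in which both rows carry the same residue. The following Python sketch illustrates that calculation on the final alignment segment of orf26-1 and orf26ng listed above. The function name `percent_identity` and the handling of gap columns are assumptions made for illustration. The alignment program used to produce the listed figures is not named here and may count ambiguous residues or gaps differently.

```python
def percent_identity(row_a: str, row_b: str) -> float:
    """Percentage of aligned columns holding the same residue in both rows.

    Both rows must already be aligned and of equal length. Columns where
    both rows hold a gap character ('-') are counted as mismatches
    (an assumption of this sketch).
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    if not row_a:
        raise ValueError("alignment is empty")
    matches = sum(1 for a, b in zip(row_a, row_b) if a == b and a != "-")
    return 100.0 * matches / len(row_a)


# Final alignment segment (positions 481 onward) of orf26-1 versus orf26ng.
orf26_1_tail = "LLGFGTTGIVLAVLIFLLKDKKRANAX"
orf26ng_tail = "LLGFGTTGIVLAVLIFLLKDKKRADVX"

if __name__ == "__main__":
    value = percent_identity(orf26_1_tail, orf26ng_tail)
    print(f"{value:.1f}% identity over {len(orf26_1_tail)} columns")
```

For this short segment, 25 of the 27 columns match. The same function can be applied to the concatenated rows of a full alignment.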
Example 83
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 639>:
1 .. AAGCAATGGT ATGCCGACGN .AGTATCAAG ACGGAAATGG TTATGGTCAA
51 CGATGAGCCT GCCAAAATTC TGACTTGGGA TGAAAGCGGC CGATTACTCT
101 CGGAACTGTC TATCCGCCAC CATCAACGCA ACGGGGTGGT TTTGGAGTGG
151 TATGAAGATG GTTCTAAAAA GAGCGAAGT. GTTTATCAGG ATGACAAGTT
201 GGTCAGGAAA ACCCAGTGGG ATAAGGATGG TTATTTAATC GAACCCTGA
This corresponds to the amino acid sequence <SEQ ID 640; ORF27>:
1 ..KQWYADXSIK TEMVMVNDEP AKILTWDESG RLLSELSIRH HQRNGVVLEW
51 YEDGSKKSEX VYQDDKLVRK TQWDKDGYLI EP*
Further work revealed the complete nucleotide sequence <SEQ ID 641>:
1 ATGAAAAAAT TATCTCGGAT TGTATTTTCA ACTGTCCTGT TGGGTTTTTC
51 GGCCGCTTTG CCGGCGCAGA CCTATTCTGT TTATTTTAAT CAGAACGGAA
101 AGCTGACGGC GACGATGTCT TCTGCCGCTT ATATCAGGCA ATATAGTGTG
151 GTGGCGGGTA TTGCGCACGC GCAGGATTTT TATTATCCGT CGATGAAGAA
201 ATATTCTGAA CCTTATATCG TTGCTTCAAC GCAAATCAAA TCTTTTGTGC
251 CTACCCTGCA AAACGGTATG TTGATTTTGT GGCATTTTAA TGGTCAGAAA
301 AAAATGGCGG GGGGCTTCAG CAAGGGTAAG CCGGACGGGG AGTGGGTCAA
351 CTGGTATCCG AACGGTAAAA AATCTGCCGT TATGCCTTAT AAAAATGGCT
401 TGAGTGAGGG TACGGGATAG CGCTATTACC GTAACGGCGG CAAGGAAAGC
451 GAAATCCAGT TTAAGCAAAA TAAGGCAAAC GGCGTATGGA AGCAATGGTA
501 TGCCGACGGC AGTATCAAGA CGGAAATGGT TATGGTCAAC GATGAGCCTG
551 CCAAAATTCT GACTTGGGAT GAAAGCGGCC GATTACTCTC GGAACTGTCT
601 ATCCGCCACC ATCAACGCAA CGGGGTGGTT TTGGAGTGGT ATGAAGATGG
651 TTCTAAAAAG AGCGAAGCTG TTTATCAGGA TGACAAGTTG GTCAGGAAAA
701 CCCAGTGGGA TAAGGATGGT TATTTAATCG AACCCTGA
This corresponds to the amino acid sequence <SEQ ID 642; ORF27-1>:
1 MKKLSRIVFS TVLLGFSAAL PAQTYSVYFN QNGKLTATMS SAAYIRQYSV
51 VAGIAHAQDF YYPSMKKYSE PYIVASTQIK SFVPTLQNGM LILWHFNGQK
101 KMAGGFSKGK PDGEWVNWYP NGKKSAVMPY KNGLSEGTGY RYYRNGGKES
151 EIQFKQNKAN GVWKQWYADG SIKTEMVMVN DEPAKILTWD ESGRLLSELS
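201 IRHHQRNGVV LEWYEDGSKK SEAVYQDDKL VRKTQWDKDG YLIEP*
Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF27 shows 91.5% identity over an 82 amino acid overlap with an ORF (ORF27a) from strain A of N. meningitidis: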
10 20 30
orf27.pep KQWYADXSIKTEMVMVNDEPAKILTWDESG
|||||| :||||||||||||||||||||||
orf27a LSEGTGXRYYRNGGKESEIQFKQNKANGVWKQWYADGNIKTEMVMVNDEPAKILTWDESG
140 150 160 170 180 190
40 50 60 70 80
orf27.pep RLLSELSIRHHQRNGVVLEWYEDGSKKSEXVYQDDKLVRKTQWDKDGYLIEPX
||||||||:|| ||||||||||||||| | |||||||||||||| ||||||||
orf27a RLLSELSIHHHXRNGVVLEWYEDGSKKXEAVYQDDKLVRKTQWDXDGYLIEPX
200 210 220 230 240
The complete length ORF27a nucleotide sequence <SEQ ID 643> is:
1 ATGAAAAAAT TATCTCGGAT TGTATTTTCA ACTGTCCTGT TGGGTTTTTC
51 GGCCGCTTTG CCGGCGCAGA NCTATTCTGT TTATTTTAAT CAGAACGGGA
101 AACTGACGGC GACGNTGTCT TCTGCCGCNT ATATCAGGCA ATATAGTGTG
151 GCGGAGGGTA TTGCGCACGC GCAGGANTTT TANTATCCGT CGATGAAGAA
201 ATATTCCGAA CCTTATATCG TTGCTTCAAC GCAAATCAAA TCTTTTGTGC
251 CTACCCTGCA AAACGGTATG TTGATTTTGT GGCATTTTAA NGGTCAGAAA
301 AAAATGGCNG GGGGCTTCAG CAAGGGTAAG CCGGACGGGG AGTGGGTCAA
351 CTGGTATCCG AACGGTAAAA AATCTGCCGT TATGCCTTAT AAAAATGGTT
401 TGAGTGAAGG TACGGGGTNN CGCTATTACC GTAACGGCGG CAAGGAAAGC
451 GAAATCCAGT TTAAACAGAA TAAGGCAAAC GGCGTATGGA AGCAATGGTA
501 TGCCGACGGC AATATCAAAA CGGAAATGGT TATGGTCAAT GATGAGCCTG
551 CCAAAATTCT GACATGGGAT GAAAGCGGTC GATTACTCTC GGAACTGTCT
601 ATCCATCATC ATNAACGTAA TGGAGTAGTC TTAGAGTGGT ATGAAGATGG
651 TTCTAAAAAG ANTGAAGCTG TTTATCAGGA TGATAAGTTG GTCAGGAAAA
701 CCCAGTGGGA TAANGATGGT TATTTAATCG AACCCTGA
This encodes a protein having the amino acid sequence <SEQ ID 644>:
1 MKKLSRIVFS TVLLGFSAAL PAQXYSVYFN QNGKLTATXS SAAYIRQYSV
51 AEGIAHAQXF XYPSMKKYSE PYIVASTQIK SFVPTLQNGM LILWHFXGQK
101 KMAGGFSKGK PDGEWVNWYP NGKKSAVMPY KNGLSEGTGX RYYRNGGKES
151 EIQFKQNKAN GVWKQWYADG NIKTEMVMVN DEPAKILTWD ESGRLLSELS
201 IHHHXRNGVV LEWYEDGSKK XEAVYQDDKL VRKTQWDXDG YLIEP*
ORF27a and ORF27-1 show 94.7% identity over a 245 amino acid overlap:
10 20 30 40 50 60
orf27a.pep MKKLSRIVFSTVLLGFSAALPAQXYSVYFNQNGKLTATXSSAAYIRQYSVAEGIAHAQXF
|||||||||||||||||||||||:|||||||||||||| |||||||||||: |||||| |
orf27-1 MKKLSRIVFSTVLLGFSAALPAQTYSVYFNQNGKLTATMSSAAYIRQYSVVAGIAHAQDF
10 20 30 40 50 60
70 80 90 100 110 120
orf27a.pep XYPSMKKYSEPYIVASTQIKSFVPTLQNGMLILWHFXGQKKMAGGFSKGKPDGEWVNWYP
|||||||||||||||||||||||||||||||||||| |||||||||||||||||||||||
orf27-1 YYPSMKKYSEPYIVASTQIKSFVPTLQNGMLILWHFNGQKKMAGGFSKGKPDGEWVNWYP
70 80 90 100 110 120
130 140 150 160 170 180
orf27a.pep NGKKSAVMPYKNGLSEGTGXRYYRNGGKESEIQFKQNKANGVWKQWYADGNIKTEMVMVN
||||||||||||||||||| ||||||||||||||||||||||||||||||:|||||||||
orf27-1 NGKKSAVMPYKNGLSEGTGYRYYRNGGKESEIQFKQNKANGVWKQWYADGSIKTEMVMVN
130 140 150 160 170 180
190 200 210 220 230 240
orf27a.pep DEPAKILTWDESGRLLSELSIHHHXRNGVVLEWYEDGSKKXEAVYQDDKLVRKTQWDXDG
|||||||||||||||||||||:|| ||||||||||||||| |||||||||||||||| ||
orf27-1 DEPAKILTWDESGRLLSELSIRHHQRNGVVLEWYEDGSKKSEAVYQDDKLVRKTQWDKDG
190 200 210 220 230 240
orf27a.pep YLIEPX
||||||
orf27-1 YLIEPX
Homology with a predicted ORF from N. gonorrhoeae
ORF27 shows 96.3% identity over an 82 amino acid overlap with a predicted ORF (ORF27ng) from N. gonorrhoeae:
orf27.pep KQWYADXSIKTEMVMVNDEPAKILTWDESG 30
||||||||||||||||||||||||||||||
orf27ng LSEGTGYRYYRNGGKESEIQFKQNKANGVWKQWYADGSIKTEMVMVNDEPAKILTWDESG 193
orf27.pep RLLSELSIRHHQRNGVVLEWYEDGSKKSEXVYQDDKLVRKTQWDKDGYLIEP 82
|||||||||||:||||||||||||||||| ||||||||||||||||||||||
orf27ng RLLSELSIRHHKRNGVVLEWYEDGSKKSEAVYQDDKLVRKTQWDKDGYLIEP 245
The complete length ORF27ng nucleotide sequence <SEQ ID 645> is:
1 ATGAAGAAAT TATCTCGGAT TGTATTTTCA ATCGTACTGT TGGGTTTTTC
51 GGCCGCTTTG CCGGCGCAGA CCTATTCTGT TTATTTTAAT CAGAACGGGA
101 AACTGACGGC GACGATGTCT TCTGCCGCTT ATATCAGGCA ATATAGTGTG
151 CCGGCGGGTA TCGCACACGC GCAGGATTTT TATTATCCGT CGATGAAGAA
201 ATATTCCGAA CCTTATATCG TTGCTTCAAC GCAAATCAAA TCTTTTGTGC
251 CTACCCTGCA AAACGGTATG TTGATTTTGT GGCATTTTAA TGGTCAGAAA
301 AAAATGGCGG GGGGCTTCAG CAAGGGTAAG CCGGACGGGG AATGGGTCAA
351 CTGGTATCCG AACGGTAAAA AATCTGCGGT TATGCCTTAT AAAAATGGCT
401 TGAGTGAGGG TACGGGATAC CGTTATTACC GTAACGGCGG CAAGGAAAGC
451 GAAATCCAGT TTAAGCAAAA TAAGGCGAAC GGCGTATGGA AGCAATGGTA
501 TGCCGATGGA AGTATCAAGA CGGAAATGGT TATGGTCAAC GATGAGCCTG
551 CCAAAATTCT GACTTGGGAT GAAAGCGGCC GATTACTTTC GGAACTGTCT
601 ATCCGCCACC ATAAACGCAA CGGGGTGGTT TTGGAGTGGT ATGAAGATGG
651 TTCTAAAAAG AGCGAGGCTG TTTATCAGGA TGACAAGTTG GTCAGGAAAA
701 CCCAATGGGA TAAGGATGGT TATTTAATCG AACCCTGA
This encodes a protein having the amino acid sequence <SEQ ID 646>:
1 MKKLSRIVFS IVLLGFSAAL PAQTYSVYFN QNGKLTATMS SAAYIRQYSV
51 AAGIAHAQDF YYPSMKKYSE PYIVASTQIK SFVPTLQNGM LILWHFNGQK
101 KMAGGFSKGK PDGEWVNWYP NGKKSAVMPY KNGLSEGTGY RYYRNGGKES
151 EIQFKQNKAN GVWKQWYADG SIKTEMVMVN DEPAKILTWD ESGRLLSELS
201 IRHHKRNGVV LEWYEDGSKK SEAVYQDDKL VRKTQWDKDG YLIEP*
ORF27ng and ORF27-1 show 98.8% identity over a 245 amino acid overlap:
10 20 30 40 50 60
orf27-1.pep MKKLSRIVFSTVLLGFSAALPAQTYSVYFNQNGKLTATMSSAAYIRQYSVVAGIAHAQDF
|||||||||| |||||||||||||||||||||||||||||||||||||||:|||||||||
orf27ng MKKLSRIVFSIVLLGFSAALPAQTYSVYFNQNGKLTATMSSAAYIRQYSVAAGIAHAQDF
10 20 30 40 50 60
70 80 90 100 110 120
orf27-1.pep YYPSMKKYSEPYIVASTQIKSFVPTLQNGMLILWHFNGQKKMAGGFSKGKPDGEWVNWYP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf27ng YYPSMKKYSEPYIVASTQIKSFVPTLQNGMLILWHFNGQKKMAGGFSKGKPDGEWVNWYP
70 80 90 100 110 120
130 140 150 160 170 180
orf27-1.pep NGKKSAVMPYKNGLSEGTGYRYYRNGGKESEIQFKQNKANGVWKQWYADGSIKTEMVMVN
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf27ng NGKKSAVMPYKNGLSEGTGYRYYRNGGKESEIQFKQNKANGVWKQWYADGSIKTEMVMVN
130 140 150 160 170 180
190 200 210 220 230 240
orf27-1.pep DEPAKILTWDESGRLLSELSIRHHQRNGVVLEWYEDGSKKSEAVYQDDKLVRKTQWDKDG
||||||||||||||||||||||||:|||||||||||||||||||||||||||||||||||
orf27ng DEPAKILTWDESGRLLSELSIRHHKRNGVVLEWYEDGSKKSEAVYQDDKLVRKTQWDKDG
190 200 210 220 230 240
orf27-1.pep YLIEPX
||||||
orf27ng YLIEPX
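Based on this analysis, including the presence of a putative leader sequence in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
As described above, ORF27-1 (24.5 kDa) was cloned into pET and pGEX vectors and expressed in E. coli. The products of protein expression and purification were analyzed by SDS-PAGE. Figure 17A shows the results of affinity purification of the GST-fusion protein, and Figure 17B shows the results of expression of the His-fusion in E. coli. Purified GST-fusion protein was used to immunise mice, and the mouse sera were used in an ELISA, which gave a positive result. This confirms that ORF27-1 is a surface-exposed protein and a useful immunogen.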
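Each nucleotide listing in this example is paired with the amino acid sequence it encodes. The following Python sketch shows how such a pairing can be checked by translating a listing codon by codon. It assumes the standard genetic code (whose codon assignments are shared by the bacterial code) and maps any codon containing an unresolved base such as N to X, which mirrors the X residues in the listings. The helper names `clean` and `translate` are hypothetical and are not taken from the source.

```python
import re

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean(listing: str) -> str:
    """Drop position numbers, spaces and line breaks from a sequence listing."""
    return re.sub(r"[^A-Za-z]", "", listing).upper()


def translate(listing: str) -> str:
    """Translate a DNA listing in frame 1; unknown codons become 'X'."""
    dna = clean(listing)
    residues = []
    for start in range(0, len(dna) - 2, 3):
        residues.append(CODON_TABLE.get(dna[start:start + 3], "X"))
    return "".join(residues)


if __name__ == "__main__":
    # First 30 nucleotides of <SEQ ID 641>, copied with their position number.
    first_codons = "1 ATGAAAAAAT TATCTCGGAT TGTATTTTCA"
    print(translate(first_codons))
```

Applied to the first 30 nucleotides of <SEQ ID 641>, the sketch yields MKKLSRIVFS, which agrees with the first ten residues listed for <SEQ ID 642>.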
Example 84
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 647>:
1 ATGAAATTTA CCAAGCACCC CGTCTGGGCA ATGGCGTTCC GCCCATTTTA
51 TTCGCTGGCG GCTCTGTACG GCGCATTGTC CGTATTGCTG TGGGGTTTCG
101 GCTACACGGG AACGCACkAG CTGTCCGGTT TCTATTGGCA CGCGCATGAg
151 ATGATTTGGG GTTATGCCGG ACTGGTCGTC ATCGCCTTCC TGCTGACCGC
201 CGTCGCCACT TGGACGGGGC AGCCGCCCAC GCGGGGCGGC GTaTCTGGTC
251 GGCTTGACTA TCTTTTGGCT GGCTGCGCGG ATTGCCGCCT TTATCCCGGG
301 TTGGGGTGCG TCGGCAAGCG GCATACTCGG TACGCTGTTT TTCTGGTACG
351 GCGCGGTGTG CATGGCTTTG CCCGTTATCC GTTCGCAGAA TCAACGCAAC
401 TATGTTgCCG TGTTCGCGCT GTTCGTCTTG GGCGGCACGC ATGCGGCGTT
451 CCACGTCCAG CTGCACAACG GCAACCTAGG CGGACTCTTG AGCGGATTGC
501 AGTCGGGCTT GGTGATG
This corresponds to the amino acid sequence <SEQ ID 648; ORF47>:
1 MKFTKHPVWA MAFRPFYSLA ALYGALSVLL WGFGYTGTHX LSGFYWHAHE
51 MIWGYAGLVV IAFLLTAVAT WTGQPPTRGG VLVGLTIFWL AARIAAFIPG
101 WGASASGILG TLFFWYGAVC MALPVIRSQN QRNYVAVFAL FVLGGTHAAF
151 HVQLHNGNLG GLLSGLQSGL VM
Further work revealed the complete nucleotide sequence <SEQ ID 649>:
1 ATGAAATTTA CCAAGCACCC CGTCTGGGCA ATGGCGTTCC GCCCATTTTA
51 TTCGCTGGCG GCTCTGTACG GCGCATTGTC CGTATTGCTG TGGGGTTTCG
101 GCTACACGGG AACGCACGAG CTGTCCGGTT TCTATTGGCA CGCGCATGAG
151 ATGATTTGGG GTTATGCCGG ACTGGTCGTC ATCGCCTTCC TGCTGACCGC
201 CGTCGCCACT TGGACGGGGC AGCCGCCCAC GCGGGGCGGC GTTCTGGTCG
251 GCTTGACTAT CTTTTGGCTG GCTGCGCGGA TTGCCGCCTT TATCCCGGGT
301 TGGGGTGCGT CGGCAAGCGG CATACTCGGT ACGCTGTTTT TCTGGTACGG
351 CGCGGTGTGC ATGGCTTTGC CCGTTATCCG TTCGCAGAAT CAACGCAACT
401 ATGTTGCCGT GTTCGCGCTG TTCGTCTTGG GCGGCACGCA TGCGGCGTTC
451 CACGTCCAGC TGCACAACGG CAACCTAGGC GGACTCTTGA GCGGATTGCA
501 GTCGGGCTTG GTGATGGTGT CGGGTTTTAT CGGTCTGATT GGTACGCGGA
551 TTATTTCGTT TTTTACGTCC AAACGCTTGA ATGTGCCGCA GATTCCCAGT
601 CCGAAATGGG TGGCGCAGGC TTCGCTGTGG CTGCCCATGC TGACTGCCAT
651 GCTGATGGCG CACGGTGTGT TGGCTTGGCT GTCTGCCGTT TTTGCCTTTG
701 CGGCAGGTGT GATTTTTACC GTGCAGGTGT ACCGCTGGTG GTATAAACCC
751 GTGTTGAAAG AGCCGATGCT GTGGATTCTG TTTGCCGGCT ATCTGTTTAC
801 CGGATTGGGG CTGATTGCGG TCGGCGCGTC TTATTTCAAA CCCGCTTTCC
851 TCAATCTGGG TGTGCATCTG ATCGGGGTCG GCGGTATCGG CGTGCTGACT
901 TTGGGCATGA TGGCGCGTAC CGCGCTTGGT CATACGGGCA ATCCGATTTA
951 TCCGCCGCCC AAAGCCGTTC CCGTTGCGTT TTGGCTGATG ATGGCGGCAA
1001 CCGCCGTCCG TATGGTTGCC GTATTTTCTT CCGGCACTGC CTACACGCAC
1051 AGCATCCGCA CCTCTTCGGT TTTGTTTGCA CTCGCGCTTT TGGTGTATGC
1101 GTGGAAGTAT ATTCCTTGGC TGATTCGTCC GCGTTCGGAC GGCAGGCCCG
1151 GTTGA
This corresponds to the amino acid sequence <SEQ ID 650; ORF47-1>:
1 MKFTKHPVWA MAFRPFYSLA ALYGALSVLL WGFGYTGTHE LSGFYWHAHE
51 MIWGYAGLVV IAFLLTAVAT WTGQPPTRGG VLVGLTIFWL AARIAAFIPG
101 WGASASGILG TLFFWYGAVC MALPVIRSQN QRNYVAVFAL FVLGGTHAAF
151 HVQLHNGNLG GLLSGLQSGL VMVSGFIGLI GTRIISFFTS KRLNVPQIPS
201 PKWVAQASLW LPMLTAMLMA HGVLAWLSAV FAFAAGVIFT VQVYRWWYKP
251 VLKEPMLWIL FAGYLFTGLG LIAVGASYFK PAFLNLGVHL IGVGGIGVLT
301 LGMMARTALG HTGNPIYPPP KAVPVAFWLM MAATAVRMVA VFSSGTAYTH
351 SIRTSSVLFA LALLVYAWKY IPWLIRPRSD GRPG*
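Computer analysis of this amino acid sequence predicts a leader peptide, and also gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF47 shows 99.4% identity over a 172 amino acid overlap with an ORF (ORF47a) from strain A of N. meningitidis: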
10 20 30 40 50 60
orf47.pep MKFTKHPVWAMAFRPFYSLAALYGALSVLLWGFGYTGTHXLSGFYWHAHEMIWGYAGLVV
||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||
orf47a MKFTKHPVWAMAFRPFYSLAALYGALSVLLWGFGYTGTHELSGFYWHAHEMIWGYAGLVV
10 20 30 40 50 60
70 80 90 100 110 120
orf47.pep IAFLLTAVATWTGQPPTRGGVLVGLTIFWLAARIAAFIPGWGASASGILGTLFFWYGAVC
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf47a IAFLLTAVATWTGQPPTRGGVLVGLTIFWLAARIAAFIPGWGASASGILGTLFFWYGAVC
70 80 90 100 110 120
130 140 150 160 170
orf47.pep MALPVIRSQNQRNYVAVFALFVLGGTHAAFHVQLHNGNLGGLLSGLQSGLVM
||||||||||||||||||||||||||||||||||||||||||||||||||||
orf47a MALPVIRSQNQRNYVAVFALFVLGGTHAAFHVQLHNGNLGGLLSGLQSGLVMVSGFIGLI
130 140 150 160 170 180
orf47a GTRIISFFTSKRLNVPQIPSPKWVAQASLWLPMLTAMLMAHGVMPWLSAAFAFAAGVIFT
190 200 210 220 230 240
The complete length ORF47a nucleotide sequence <SEQ ID 651> is:
1 ATGAAATTTA CCAAGCACCC CGTTTGGGCA ATGGCGTTCC GCCCGTTTTA
51 TTCACTGGCG GCTCTGTACG GCGCATTGTC CGTATTGCTG TGGGGTTTCG
101 GCTACACGGG AACGCACGAG CTGTCCGGTT TCTATTGGCA CGCGCATGAG
151 ATGATTTGGG GTTATGCCGG ACTGGTCGTC ATCGCCTTCC TGCTGACCGC
201 CGTCGCCACT TGGACGGGGC AGCCGCCCAC GCGGGGCGGC GTTCTGGTCG
251 GCTTGACTAT CTTTTGGCTG GCTGCGCGGA TTGCCGCCTT TATCCCGGGT
301 TGGGGTGCGT CGGCAAGCGG CATACTCGGT ACGCTGTTTT TCTGGTACGG
351 CGCGGTGTGC ATGGCTTTGC CCGTTATCCG TTCGCAGAAT CAACGCAATT
401 ATGTTGCCGT GTTCGCGCTG TTCGTCTTGG GCGGTACGCA CGCGGCGTTC
451 CACGTCCAGC TGCACAACGG CAACCTAGGC GGACTCTTGA GCGGATTGCA
501 GTCGGGCTTG GTGATGGTGT CGGGTTTTAT CGGTCTGATT GGTACGCGGA
551 TTATTTCGTT TTTTACGTCC AAACGGTTGA ATGTGCCGCA GATTCCCAGT
601 CCGAAATGGG TGGCGCAGGC TTCGCTGTGG CTGCCCATGC TGACCGCCAT
651 GCTGATGGCG CACGGCGTGA TGCCTTGGCT GTCGGCGGCT TTCGCGTTTG
701 CGGCAGGTGT GATTTTTACC GTGCAGGTGT ACCGCTGGTG GTATAAGCCT
751 GTGTTGAAAG AGCCGATGCT GTGGATTCTG TTTGCCGGCT ATCTGTTTAC
801 CGGATTGGGG CTGATTGCGG TCGGCGCGTC TTATTTCAAA CCCGCTTTCC
851 TCAATCTGGG TGTGCATCTG ATCGGGGTCG GCGGTATCGG CGTGCTGACT
901 TTGGGCATGA TGGCGCGTAC CGCGCTCGGT CATACGGGCA ATCCGATTTA
951 TCCGCCGCCC AAAGCCGTTC CCGTTGCGTT TTGGCTGATG ATGGCGGCAA
1001 CCGCCGTCCG TATGGTTGCC GTATTTTCTT CCGGCACTGC CTACACGCAC
1051 AGCATACGCA CCTCTTCGGT TTTGTTTGCA CTCGCGCTTT TGGTGTATGC
1101 GTGGAAGTAT ATTCCTTGGC TGATTCGTCC GCGTTCGGAC GGCAGGCCCG
1151 GTTGA
This encodes a protein having the amino acid sequence <SEQ ID 652>:
1 MKFTKHPVWA MAFRPFYSLA ALYGALSVLL WGFGYTGTHE LSGFYWHAHE
51 MIWGYAGLVV IAFLLTAVAT WTGQPPTRGG VLVGLTIFWL AARIAAFIPG
101 WGASASGILG TLFFWYGAVC MALPVIRSQN QRNYVAVFAL FVLGGTHAAF
151 HVQLHNGNLG GLLSGLQSGL VMVSGFIGLI GTRIISFFTS KRLNVPQIPS
201 PKWVAQASLW LPMLTAMLMA HGVMPWLSAA FAFAAGVIFT VQVYRWWYKP
251 VLKEPMLWIL FAGYLFTGLG LIAVGASYFK PAFLNLGVHL IGVGGIGVLT
301 LGMMARTALG HTGNPIYPPP KAVPVAFWLM MAATAVRMVA VFSSGTAYTH
351 SIRTSSVLFA LALLVYAWKY IPWLIRPRSD GRPG*
ORF47a and ORF47-1 show 99.2% identity over a 384 amino acid overlap:
10 20 30 40 50 60
orf47a.pep MKFTKHPVWAMAFRPFYSLAALYGALSVLLWGFGYTGTHELSGFYWHAHEMIWGYAGLVV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf47-1 MKFTKHPVWAMAFRPFYSLAALYGALSVLLWGFGYTGTHELSGFYWHAHEMIWGYAGLVV
10 20 30 40 50 60
70 80 90 100 110 120
orf47a.pep IAFLLTAVATWTGQPPTRGGVLVGLTIFWLAARIAAFIPGWGASASGILGTLFFWYGAVC
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf47-1 IAFLLTAVATWTGQPPTRGGVLVGLTIFWLAARIAAFIPGWGASASGILGTLFFWYGAVC
70 80 90 100 110 120
130 140 150 160 170 180
orf47a.pep MALPVIRSQNQRNYVAVFALFVLGGTHAAFHVQLHNGNLGGLLSGLQSGLVMVSGFIGLI
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf47-1 MALPVIRSQNQRNYVAVFALFVLGGTHAAFHVQLHNGNLGGLLSGLQSGLVMVSGFIGLI
130 140 150 160 170 180
190 200 210 220 230 240
orf47a.pep GTRIISFFTSKRLNVPQIPSPKWVAQASLWLPMLTAMLMAHGVMPWLSAAFAFAAGVIFT
|||||||||||||||||||||||||||||||||||||||||||: ||||:||||||||||
orf47-1 GTRIISFFTSKRLNVPQIPSPKWVAQASLWLPMLTAMLMAHGVLAWLSAVFAFAAGVIFT
190 200 210 220 230 240
250 260 270 280 290 300
orf47a.pep VQVYRWWYKPVLKEPMLWILFAGYLFTGLGLIAVGASYFKPAFLNLGVHLIGVGGIGVLT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf47-1 VQVYRWWYKPVLKEPMLWILFAGYLFTGLGLIAVGASYFKPAFLNLGVHLIGVGGIGVLT
250 260 270 280 290 300
310 320 330 340 350 360
orf47a.pep LGMMARTALGHTGNPIYPPPKAVPVAFWLMMAATAVRMVAVFSSGTAYTHSIRTSSVLFA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf47-1 LGMMARTALGHTGNPIYPPPKAVPVAFWLMMAATAVRMVAVFSSGTAYTHSIRTSSVLFA
310 320 330 340 350 360
370 380
orf47a.pep LALLVYAWKYIPWLIRPRSDGRPGX
|||||||||||||||||||||||||
orf47-1 LALLVYAWKYIPWLIRPRSDGRPGX
370 380
Homology with a predicted ORF from N. gonorrhoeae
ORF47 shows 97.1% identity over a 172 amino acid overlap with a predicted ORF (ORF47ng) from N. gonorrhoeae:
ORF47 MKFTKHPVWAMAFRPFYSLAALYGALSVLLWGFGYTGTHELSGFYWHAHEMIWGYAGLVV 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
ORF47ng MKFTKHPVWAMAFRPFYSLAALYGALSVLLWGFGYTGTHELSGFYWHAHEMIWGYAGLVV 60
ORF47 IAFLLTAVATWTGQPPTRGGVLVGLTIFWLAARIAAFIPGWGASASGILGTLFFWYGAVC 120
|||||||||||||||||||||||||| ||||||||||||||||:||||||||||||||||
ORF47ng IAFLLTAVATWTGQPPTRGGVLVGLTAFWLAARIAAFIPGWGAAASGILGTLFFWYGAVC 120
ORF47 MALPVIRSQNQRNYVAVFALFVLGGTHAAFHVQLHNGNLGGLLSGLQSGLVM 172
||||||||||:||||||||:||||||||||||||||||||||||||||||||
ORF47ng MALPVIRSQNRRNYVAVFAIFVLGGTHAAFHVQLHNGNLGGLLSGLQSGLVMVWGFIGLI 180
The predicted ORF47ng nucleotide sequence <SEQ ID 653> encodes a protein comprising the amino acid sequence <SEQ ID 654>:
1 MKFTKHPVWA MAFRPFYSLA ALYGALSVLL WGFGYTGTHE LSGFYWHAHE
51 MIWGYAGLVV IAFLLTAVAT WTGQPPTRGG VLVGLTAFWL AARIAAFIPG
101 WGAAASGILG TLFFWYGAVC MALPVIRSQN RRNYVAVFAI FVLGGTHAAF
151 HVQLHNGNLG GLLSGLQSGL VMVMGFIGLI GMKIISFFTS KRLKLPQIPS
201 PKWVAHASLW LPMLNAILMA HRVMPWLSAA FPFAAGVIFT VQVYAGGITP
251 IEETSCGSVA GICYRLGNSS G
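The predicted leader peptide and transmembrane domains are identical to sequences in the meningococcal protein (see also Pseudomonas stutzeri orf396, accession number e246540), except for an Ile/Ala substitution at position 87 and a Leu/Ile substitution at position 140:
TM segments in ORF47ng
INTEGRAL Likelihood = -5.63 Transmembrane 52-68
INTEGRAL Likelihood = -3.88 Transmembrane 169-185
INTEGRAL Likelihood = -3.08 Transmembrane 82-98
INTEGRAL Likelihood = -1.91 Transmembrane 134-150
INTEGRAL Likelihood = -1.44 Transmembrane 107-123
INTEGRAL Likelihood = -1.38 Transmembrane 227-243
Further work revealed the complete gonococcal DNA sequence <SEQ ID 655>: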
1 ATGAAATTTA CCAAACATCC CGTCTGGGCA ATGGCGTTCC GCCCGTTTTA
51 TTCACTGGCG GCACTGTACG GCGCATTGTC CGTATTGCTG TGGGGTTTCG
101 GCTACACGGG AACGCACGAG CTGTCCGGTT TCTATTGGCA CGCGCATGAG
151 ATGATTTGGG GTTATGCCGG TCTCGTCGTC ATCGCCTTCC TGCTGACCGC
201 CGTCGCCACT TGGACGGGAC AGCCGCCCAC GAGGGGCGGC GTTCTGGTCG
251 GCTTGACCGC CTTTTGGCTG GCTGCGCGGA TTGCCGCCTT TATCCCGGGT
301 TGGGGTGCGG CGGCAAGCGG CATACTCGGT ACGCTGTTTT TCTGGTACGG
351 CGCGGTGTGC ATGGCTTTGC CCGTTATCCG TtcgCAAAAC CGGCGCAACT
401 ATGtcgCCGT ATTCGCAATA TTTGTGCTGG GCGGTACGCA TGCGgcgTTC
451 CACGtccAgc tGCACAACGG CAACCTAGGC GGACTCTTGA GCGGATTGCA
501 GTCGGGCCTG GTTATGGTGT CGGGCTTTAT CGGCCTGATT GGGATGAGGA
551 TTATTTCGTT TTTTACGTCC AAACGGTTGA ACGTGCCGCA GATTCCCAGT
601 CCGAAATGGG TGGCGCAGGC TTCGCTGTGG CTACCCATGC TGACCGCCAT
651 ACTGATGGCG CACGGCGTGA TGCCTTGGCT GTCGGCGGCT TTCGCGTTTG
701 CGGCGGGCGT GATTTTTACC GTACAGGTGT ACCGCTGGTG GTATAAACCC
751 GTATTGAAAG AACCGATGCT GTGGATTCTG TTTGCCGGCT ATCTGTTTAC
801 CGGATTGGGG CTGATTGCGG TCGGCGCGTC TTATTTCAAA CCTGCCTTCC
851 TCAATCTGGG CGTACATCTG ATCGGGGTCG GCGGTATCGG CGTGCTGACT
901 TTGGGCATGA TGGCGCGTAC CGCGCTCGGT CATACGGGCA ATTCGATTTA
951 TCCGCCGCCC AAAGCCGTTC CCGTTGCGTT TTGGCTGATG ATGGCGGCAA
1001 CCGCCGTCCG TATGGTTGCC GTATTTTCTT CCGGCACTGC CTACACGCAC
1051 AGCATCCGCA CGTCTTCGGT TTTGTTTGCA CTCGCGCTGC TGGTGTATGC
1101 GTGGAAATAC ATTCCGTGGC TGATCCGTCC GCGTTCGGAC GGCAGGCCCG
1151 GTTGA
This encodes a protein having the amino acid sequence <SEQ ID 656; ORF47ng-1>:
1 MKFTKHPVWA MAFRPFYSLA ALYGALSVLL WGFGYTGTHE LSGFYWHAHE
51 MIWGYAGLVV IAFLLTAVAT WTGQPPTRGG VLVGLTAFWL AARIAAFIPG
101 WGAAASGILG TLFFWYGAVC MALPVIRSQN RRNYVAVFAI FVLGGTHAAF
151 HVQLHNGNLG GLLSGLQSGL VMVSGFIGLI GMRIISFFTS KRLNVPQIPS
201 PKWVAQASLW LPMLTAILMA HGVMPWLSAA FAFAAGVIFT VQVYRWWYKP
251 VLKEPMLWIL FAGYLFTGLG LIAVGASYFK PAFLNLGVHL IGVGGIGVLT
301 LGMMARTALG HTGNSIYPPP KAVPVAFWLM MAATAVRMVA VFSSGTAYTH
351 SIRTSSVLFA LALLVYAWKY IPWLIRPRSD GRPG*
ORF47ng-1 and ORF47-1 show 97.4% identity over a 384 amino acid overlap:
10 20 30 40 50 60
orf47-1.pep MKFTKHPVWAMAFRPFYSLAALYGALSVLLWGFGYTGTHELSGFYWHAHEMIWGYAGLVV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf47ng-1 MKFTKHPVWAMAFRPFYSLAALYGALSVLLWGFGYTGTHELSGFYWHAHEMIWGYAGLVV
10 20 30 40 50 60
70 80 90 100 110 120
orf47-1.pep IAFLLTAVATWTGQPPTRGGVLVGLTIFWLAARIAAFIPGWGASASGILGTLFFWYGAVC
|||||||||||||||||||||||||| ||||||||||||||||:||||||||||||||||
orf47ng-1 IAFLLTAVATWTGQPPTRGGVLVGLTAFWLAARIAAFIPGWGAAASGILGTLFFWYGAVC
70 80 90 100 110 120
130 140 150 160 170 180
orf47-1.pep MALPVIRSQNQRNYVAVFALFVLGGTHAAFHVQLHNGNLGGLLSGLQSGLVMVSGFIGLI
||||||||||:||||||||:||||||||||||||||||||||||||||||||||||||||
orf47ng-1 MALPVIRSQNRRNYVAVFAIFVLGGTHAAFHVQLHNGNLGGLLSGLQSGLVMVSGFIGLI
130 140 150 160 170 180
190 200 210 220 230 240
orf47-1.pep GTRIISFFTSKRLNVPQIPSPKWVAQASLWLPMLTAMLMAHGVLAWLSAVFAFAAGVIFT
| ||||||||||||||||||||||||||||||||||:||||||: ||||:||||||||||
orf47ng-1 GMRIISFFTSKRLNVPQIPSPKWVAQASLWLPMLTAILMAHGVMPWLSAAFAFAAGVIFT
190 200 210 220 230 240
250 260 270 280 290 300
orf47-1.pep VQVYRWWYKPVLKEPMLWILFAGYLFTGLGLIAVGASYFKPAFLNLGVHLIGVGGIGVLT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf47ng-1 VQVYRWWYKPVLKEPMLWILFAGYLFTGLGLIAVGASYFKPAFLNLGVHLIGVGGIGVLT
250 260 270 280 290 300
310 320 330 340 350 360
orf47-1.pep LGMMARTALGHTGNPIYPPPKAVPVAFWLMMAATAVRMVAVFSSGTAYTHSIRTSSVLFA
|||||||||||||| |||||||||||||||||||||||||||||||||||||||||||||
orf47ng-1 LGMMARTALGHTGNSIYPPPKAVPVAFWLMMAATAVRMVAVFSSGTAYTHSIRTSSVLFA
310 320 330 340 350 360
370 380
orf47-1.pep LALLVYAWKYIPWLIRPRSDGRPGX
|||||||||||||||||||||||||
orf47ng-1 LALLVYAWKYIPWLIRPRSDGRPGX
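370 380
In addition, ORF47ng-1 shows significant homology to an ORF from Pseudomonas stutzeri:
gnl|PID|e246540 (Z73914) ORF396 protein [Pseudomonas stutzeri] Length = 396
Score = 155 bits (389), Expect = 5e-37
Identities = 121/391 (30%), Positives = 169/391 (42%), Gaps = 21/391 (5%)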
Query: 7 PVWAMAFRPFYSLAALYGALSVLLWGFGYTGTHELSGFY-------WHAHEMIWGYAGLV 59
P+W +AFRPF+ +LY L++ LW +TG GF WH HEM++G+A +
Sbjct: 14 PIWRLAFRPFFLAGSLYALLAIPLWVAAWTGLWP--GFQPTGGWLAWHRHEMLFGFAMAI 71
Query: 60 VIAFLLTAVATWTGQPPTRGGVLVGLTAFWLAARIAAFIPGWGAAASGILGTLFFWYGAV 119
V FLLTAV TWTGQ G LVGL A WLAAR+ ++ G AA L LF
Sbjct: 72 VAGFLLTAVQTWTGQTAPSGNRLVGLAAVWLAARL-GWLFGLPAAWLAPLDLLFLVALVW 130
Query: 120 CMALPVIRSQNRRNYVAVFAIFVLGGTHAAFXXXXXXXXXXXXXXXXXXXXXMVSGFIGL 179
MA + + +RNY V + ++ G +V+ + L
Sbjct: 131 MMAQMLWAVRQKRNYPIVVVLSLMLGADVLILTGLLQGNDALQRQGVLAGLWLVAALMAL 190
Query: 180 IGMRIISFFTSKRLNVPQIPSP-KWVAQASLWLPMLTAILMAHGV----MPWLSAAFAFA 234
IG R+I FFT + L P W+ A L + A+L A GV P L F A
Sbjct: 191 IGGRVIPFFTQRGLGKVDAVKPWVWLDVALLVGTGVIALLHAFGVAMRPQPLLGLLFV-A 249
Query: 235 AGVIFTVQVYRWWYKPVLKEPMLWILFAGYLFTGLGLIAVGASYF-KPAFXXXXXXXXXX 293
GV +++ RW+ K + K +LW L L+ + + +F A
Sbjct: 250 IGVGHLLRLMRWYDKGIWKVGLLWSLHVAMLWLVVAAFGLALWHFGLLAQSSPSLHALSV 309
Query: 294 XXXXXXXXXMMARTALGHTGNSIYPPPKAVPVAFWLXXXXXXXXXXXXFSSGTAYTHSIR 353
M+AR LGHTG + P + AF L F S +
Sbjct: 310 GSMSGLILAMIARVTLGHTGRPLQLPAGIIG-AFVL---FNLGTAARVFLSVAWPVGGLW 365
Query: 354 TSSVLFALALLVYAWKYIPWLIRPRSDGRPG 384
++V + LA +Y W+Y P L+ R DG PG
Sbjct: 366 LAAVCWTLAFALYVWRYAPMLVAARVDGHPG 396
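Based on this analysis, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 85
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 657>: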
1 ..ATGCCGTCTG AAGGTTCAGA CGGCmTCGGT GyCGGGGAAy CAGAAGyGGT
51 AGCGCATGCC CAATGAGACT TCGTGGGTTT TGAAGCGGGT GTTTTCCAAG
101 CGTCCCCAGT TGTGGTAACG GTATCCGGTG TCyAArGTCA GCTTGGGyGT
151 GATGTCGAAa CCGACACCGG CGATGACACC AAGACCyAmG CTGCTGATrC
201 TGTkGCTTTC GTGATAGGsA GGTTTGyTGG kmksAsyTTG TAyrATwkkG
251 CCTssCwsTG kAGmGCCkTk CkyTGGTkkA swGrwArTAG TCGTGGTTTy
301 TkTTyyCACC GAATGAACyT GATGTTTAAC GTGTCCGTAG GCGACGCGCG
351 CGCCGATATA GGGTTTGAAT TTATCGTTGA GTTTGAAATC GTAAATGGCG
401 GACAAGCCGA GAGAAGAAAC GGCGTGGAAG CTGCCGTTTC CCTGATGTTT
451 TGTTTGGGTT TCTTTGTAGT TGTTGTTTAT CTCTTCAGTA ACTTTTTTAG
501 TAGAAGAATT ACTTTCTTTC CATTTTCTGT AACTGGCATA ATCTGCCGCT
551 ATTCTCCAGC CGCCGAAATC ..
This corresponds to the amino acid sequence <SEQ ID 658; ORF67>:
1 ..MPSEGSDGXG XGEXEXVAHA QXDFVGFEAG VFQASPVVVT VSGVXXQLGX
51 DVETDTGDDT KTXAADXVAF VIGRFXGXXL YXXAXXXXAX XWXXXXSRGF
101 XXHRMNLMFN VSVGDARADI GFEFIVEFEI VNGGQAERRN GVEAAVSLMF
151 CLGFFVVVVY LFSNFFSRRI TFFPFSVTGI ICRYSPAAEI ..
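Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. gonorrhoeae
ORF67 shows 51.8% identity over a 199 amino acid overlap with a predicted ORF (ORF67ng) from N. gonorrhoeae: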
orf67.pep MPSEGSDGXGXGEXEXVAHAQXDFVGFEAG 30
|||||||| | || | ||||| |||||||
orf67ng TNFEIAVLSGMTVRVFYCARPAPVNGGRLKMPSEGSDGIGIGESEAVAHAQRGFVGFEAG 146
90 100 110 120 130 140
orf67.pep VFQASPVVVTVSGVXXQLGXDVETDTGDDTKTXAADXVAFVIGRFXGXXLYXXAXXXXAX 90
|||||||||:|:|| | | || : : ::: || |||:|| | : :
orf67ng VFQASPVVVAVAGVQGQAGRDVYAHARHRAEAQAAAAVAFLIGVFLRMSVRIVRNCCVSI 206
orf67.pep XWXXXXSRGFXXHRMNLMFNVSVGDARADIGFEFIVEFEIVNGGQAERRNGVEAAVSLMF 150
: | : |:: : :|||||||:||||||:|||||||||||||||||| || |||
orf67ng TRVGGKSTCYFFSRIDAVSDVSVGDARTDIGFEFVVEFEIVNGGQAERRNGVECAVFDMF 266
orf67.pep CLGFFVV-------VVYLFSNFFSRRITFF-PFSVTGIICRYSPAAEI 190
| | | :: |: |: : | : ||:||||| :||||:
orf67ng RLLVFYVKLVAAKSFIILSFQLFYVHGIFIVVPFPVTGIIRGDAPAAEVVADRHPGVDGM 326
The ORF67ng nucleotide sequence <SEQ ID 659> is predicted to encode a protein comprising the amino acid sequence <SEQ ID 660>:
1 MPSETVGSIV NVGVDESVGF SPPFPSIQHF YRFHRIHRIR LFRPPGPMQL
51 NRHSHGSGNL GRGVWATVLS DKFPCGQVRI PACAGMTNFE IAVLSGMTVR
101 VFYCARPAPV NGGRLKMPSE GSDGIGIGES EAVAHAQRGF VGFEAGVFQA
151 SPVVVAVAGV QGQAGRDVYA HARHRAEAQA AAAVAFLIGV FLRMSVRINR
201 NCCVSITRVG GKSTCYFFSR IDAVSDVSVG DARTDIGFEF VVEFEIVNGG
251 QAERRNGVEC AVFLMFRLLV FYVKLVAAKS FIILSFQLFY VHGIFIVVPF
301 PVTGIIRGDA PAAEVVADRH PGVDGMRTDV SEIIAYRAYF VFAWSGWFRI
351 IVGNAFGGVG *
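Based on the presence of several putative transmembrane domains in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 86
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 661>: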
1 ATGTTTGCTT TTTTAGAAGC CTTTTTTGTC GAATACGGTT ATGCGGCTGT
51 TTTTTTTGTA TTGGTCATCT GCGGTTTCGG CGTGCCGATT CCCGAGGATT
101 TGACCTTGGT AACAGGCGGC GTGATTTCGG GTATGGGTTA TACCAATCCG
151 CATATTATGT TTGCAGTCGG TATGCTCGGC GTATTGGTCG GGGACGGCAT
201 CATGTTCGCC GCCGGACGAA TTTGGGGGCA GArArTCCTA rGGTTCArAC
251 CTATTGCGsG CATCATGACG CCGrAACGTT ATGAGCAGGT TCAGGAAAAA
301 TTCGACAAAT ACGGTAACTG GGTCTTATTT GTCGCCCGTT TCCTGCCCGG
351 TTTGAGAACG GCCGTATTTG TTACAGCCGG TATCAGCCGC AAGGTTTCAT
401 ACTTGCGTTT TATCATTATG GATGGACTGG CCGCA...
This corresponds to the amino acid sequence <SEQ ID 662; ORF78>:
1 MFAFLEAFFV EYGYAAVFFV LVICGFGVPI PEDLTLVTGG VISGMGYTNP
51 HIMFAVGMLG VLVGDGIMFA AGRIWGQXXL XFXPIAXIMT PXRYEQVQEK
101 FDKYGNWVLF VARFLPGLRT AVFVTAGISR KVSYLRFIIM DGLAA...
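The partial DNA sequences in these examples carry lowercase IUPAC ambiguity codes (r, y, k, m, s, w, n), and the corresponding protein sequences show X at the affected residues. The following Python sketch illustrates one way such a translation could be reproduced with the standard genetic code. It is an assumption that the original analysis worked this way; in particular, the rule "emit X unless every expansion of the codon encodes the same residue" and the helper name `translate` are illustrative only.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# IUPAC nucleotide codes mapped to the bases they can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate(dna):
    """Translate DNA in frame 1; ambiguous codons become 'X' (assumed rule)."""
    dna = "".join(dna.split()).upper()  # drop the 10-base block spacing
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        codon = dna[pos:pos + 3]
        # Expand the codon into every unambiguous codon it could represent.
        residues = {
            CODON_TABLE["".join(bases)]
            for bases in product(*(IUPAC[b] for b in codon))
        }
        protein.append(residues.pop() if len(residues) == 1 else "X")
    return "".join(protein)


# Codons 76-84 of SEQ ID 661 (in frame), copied from the listing above.
print(translate("TGGGGGCAGArArTCCTArGGTTCArA"))
```

Applied to that fragment, the sketch yields residues that agree with positions 76 to 84 of ORF78 as listed (W G Q X X L X F X).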
Further work revealed the complete nucleotide sequence <SEQ ID 663>:
1 ATGTTTGCTT TTTTAGAAGC CTTTTTTGTC GAATACGGTT ATGCGGCTGT
51 TTTTTTTGTA TTGGTCATCT GCGGTTTCGG CGTGCCGATT CCCGAGGATT
101 TGACCTTGGT AACAGGCGGC GTGATTTCGG GTATGGGTTA TACCAATCCG
151 CATATTATGT TTGCAGTCGG TATGCTCGGC GTATTGGTCG GGGACGGCAT
201 CATGTTCGCC GCCGGACGAA TTTGGGGGCA GAAAATCCTA AGGTTCAAAC
251 CTATTGCGCG CATCATGACG CCGAAACGTT ATGAGCAGGT TCAGGAAAAA
301 TTCGACAAAT ACGGTAACTG GGTCTTATTT GTCGCCCGTT TCCTGCCCGG
351 TTTGAGAACG GCCGTATTTG TTACAGCCGG TATCAGCCGC AAGGTTTCAT
401 ACTTGCGTTT TATCATTATG GATGGACTGG CCGCACTGAT TTCCGTCCCT
451 ATTTGGATTT ATCTGGGCGA ATACGGTGCG CACAACATCG ATTGGCTGAT
501 GGCGAAAATG CACAGCCTGC AATCGGGTAT TTTTGTTATC TTGGGTATAG
551 GTGCGACCGT TGTCGCTTGG ATTTGGTGGA AAAAACGCCA ACGTATCCAG
601 TTTTACCGCA GCAAATTGAA AGAAAAGCGG GCGCAACGCA AAGCCGCCAA
651 GGCAGCCAAA AAAGCCGCGC AAAGCAAACA ATAA
This corresponds to the amino acid sequence <SEQ ID 664; ORF78-1>:
1 MFAFLEAFFV EYGYAAVFFV LVICGFGVPI PEDLTLVTGG VISGMGYTNP
51 HIMFAVGMLG VLVGDGIMFA AGRIWGQKIL RFKPIARIMT PKRYEQVQEK
101 FDKYGNWVLF VARFLPGLRT AVFVTAGISR KVSYLRFIIM DGLAALISVP
151 IWIYLGEYGA HNIDWLMAKM HSLQSGIFVI LGIGATVVAW IWWKKRQRIQ
201 FYRSKLKEKR AQRKAAKAAK KAAQSKQ*
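A quick consistency check between a partial sequence containing X residues and the later complete sequence can be sketched as follows. The function name `mismatches` and the treatment of X as a wildcard are assumptions made for illustration; the fragments are residues 71 to 100 of ORF78 and ORF78-1 as listed above.

```python
def mismatches(partial, complete, wildcard="X"):
    """Return 1-based positions where `partial` contradicts `complete`.

    Positions holding the wildcard are treated as undetermined and skipped.
    """
    return [
        i + 1
        for i, (p, c) in enumerate(zip(partial, complete))
        if p != wildcard and p != c
    ]


orf78_partial = "AGRIWGQXXLXFXPIAXIMTPXRYEQVQEK"   # ORF78, residues 71-100
orf78_1_full = "AGRIWGQKILRFKPIARIMTPKRYEQVQEK"    # ORF78-1, residues 71-100

print(mismatches(orf78_partial, orf78_1_full))
```

An empty list indicates that every determined residue of the partial sequence is confirmed by the complete sequence.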
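Computer analysis of this amino acid sequence predicted several transmembrane domains, and also gave the following results:
Homology with the dedA homologue of H. influenzae (accession number P45280)
ORF78 and the dedA homologue show 58% aa identity over a 144 aa overlap: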
Orf78:4 FLEAFFVEYGYAAVFFVLVICGFGVPIPEDLTLVTGGVISGM--GYTNPHIMFAVGMLGV 61
FL FF EYGY AV FVL+ICGFGVPIPED+TLV+GGVI+G+ N H+M V M+GV
DedA: 20 FLIGFFTEYGYWAVLFVLIICGFGVPIPEDITLVSGGVIAGLYPENVNSHLMLLVSMIGV 79
Orf78:62 LVGDGIMFAAGRIWGQXXLXFXPIAXIMTPXRYEQVQEKFDKYGNWVLFVARFLPGLRTA 121
L GD M+ GRI+G L F PI I+T R V+EKF +YGN VLFVARFLPGLR
DedA: 80 LAGDSCMYWLGRIYGTKILRFRPIRRIVTLQRLRMVREKFSQYGNRVLFVARFLPGLRAP 139
Orf78:122 VFVTAGISRKVSYLRFIIMDGLAA 145
+++ +GI+R+VSY+RF+++D AA
DedA: 140 IYMVSGITRRVSYVRFVLIDFCAA 163
Homology with a predicted ORF from N. meningitidis (strain A)
ORF78 shows 93.8% identity over a 145 aa overlap with an ORF (ORF78a) from strain A of N. meningitidis:
10 20 30 40 50 60
orf78.pep MFAFLEAFFVEYGYAAVFFVLVICGFGVPIPEDLTLVTGGVISGMGYTNPHIMFAVGMLG
|||:||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf78a MFALLEAFFVEYGYAAVFFVLVICGFGVPIPEDLTLVTGGVISGMGYTNPHIMFAVGMLG
10 20 30 40 50 60
70 80 90 100 110 120
orf78.pep VLVGDGIMFAAGRIWGQXXLXFXPIAXIMTPXRYEQVQEKFDKYGNWVLFVARFLPGLRT
||||||||||||||||| | | ||| |||| || |||||||||||||||||||||||||
orf78a VLVGDGIMFAAGRIWGQKILKFKPIARIMTPKRYAQVQEKFDKYGNWVLFVARFLPGLRT
70 80 90 100 110 120
130 140
orf78.pep AVFVTAGISRKVSYLRFIIMDGLAA
|||||||||||||||||:|||||||
orf78a AVFVTAGISRKVSYLRFLIMDGLAALISVPVWIYLGEYGAHNIDWLMAKMHSLQSGIFIA
130 140 150 160 170 180
The complete length ORF78a nucleotide sequence <SEQ ID 665> is:
1 ATGTTTGCCC TTTTGGAAGC CTTTTTTGTC GAATACGGCT ATGCGGCCGT
51 GTTTTTCGTT TTGGTCATCT GCGGTTTCGG CGTGCCGATT CCCGAGGATT
101 TGACCTTGGT AACAGGCGGC GTGATTTCGG GTATGGGTTA TACCAATCCG
151 CATATTATGT TTGCAGTCGG TATGCTCGGC GTATTGGTCG GGGACGGCAT
201 CATGTTCGCC GCCGGACGCA TCTGGGGGCA GAAAATCCTC AAGTTCAAAC
251 CGATTGCGCG CATCATGACG CCGAAACGTT ACGCACAGGT TCAGGAAAAA
301 TTCGACAAAT ACGGCAACTG GGTGTTATTT GTCGCTCGTT TCCTGCCCGG
351 TTTGCGGACT GCCGTTTTCG TTACCGCCGG CATCAGCCGC AAAGTATCGT
401 ATCTGCGCTT TCTGATTATG GACGGGCTTG CCGCGCTGAT TTCCGTGCCC
451 GTTTGGATTT ACTTGGGCGA GTACGGCGCG CACAACATCG ATTGGCTGAT
501 GGCGAAAATG CACAGCCTGC AATCCGGCAT CTTCATCGCA TTGGGCGTGC
551 TGGCGGCGGC GCTGGCGTGG TTCTGGTGGC GCAAACGCCG ACATTATCAG
601 CTTTACCGCG CACAATTGAG CGAAAAACGC GCCAAACGCA AGGCGGAAAA
651 GGCAGCGAAA AAAGCGGCAC AGAAGCAGCA GTAA
This encodes a protein having the amino acid sequence <SEQ ID 666>:
1 MFALLEAFFV EYGYAAVFFV LVICGFGVPI PEDLTLVTGG VISGMGYTNP
51 HIMFAVGMLG VLVGDGIMFA AGRIWGQKIL KFKPIARIMT PKRYAQVQEK
101 FDKYGNWVLF VARFLPGLRT AVFVTAGISR KVSYLRFLIM DGLAALISVP
151 VWIYLGEYGA HNIDWLMAKM HSLQSGIFIA LGVLAAALAW FWWRKRRHYQ
201 LYRAQLSEKR AKRKAEKAAK KAAQKQQ*
ORF78a and ORF78-1 show 89.0% identity over a 227 aa overlap:
10 20 30 40 50 60
orf78a.pep MFALLEAFFVEYGYAAVFFVLVICGFGVPIPEDLTLVTGGVISGMGYTNPHIMFAVGMLG
|||:||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf78-1 MFAFLEAFFVEYGYAAVFFVLVICGFGVPIPEDLTLVTGGVISGMGYTNPHIMFAVGMLG
10 20 30 40 50 60
70 80 90 100 110 120
orf78a.pep VLVGDGIMFAAGRIWGQKILKFKPIARIMTPKRYAQVQEKFDKYGNWVLFVARFLPGLRT
||||||||||||||||||||:||||||||||||| |||||||||||||||||||||||||
orf78-1 VLVGDGIMFAAGRIWGQKILRFKPIARIMTPKRYEQVQEKFDKYGNWVLFVARFLPGLRT
70 80 90 100 110 120
130 140 150 160 170 180
orf78a.pep AVFVTAGISRKVSYLRFLIMDGLAALISVPVWIYLGEYGAHNIDWLMAKMHSLQSGIFIA
|||||||||||||||||:||||||||||||:|||||||||||||||||||||||||||:
orf78-1 AVFVTAGISRKVSYLRFIIMDGLAALISVPIWIYLGEYGAHNIDWLMAKMHSLQSGIFVI
130 140 150 160 170 180
190 200 210 220
orf78a.pep LGVLAAALAWFWWRKRRHYQLYRAQLSEKRAKRKAEKAAKKAAQKQQX
||: |:::||:||:||:: |:||::|:||||:||| ||||||||::||
orf78-1 LGIGATVVAWIWWKKRQRIQFYRSKLKEKRAQRKAAKAAKKAAQSKQX
190 200 210 220
Homology with a predicted ORF from N. gonorrhoeae
ORF78 shows 97.4% identity over a 38 aa overlap with a predicted ORF (ORF78ng) from N. gonorrhoeae:
orf78.pep XXLXFXPIAXIMTPXRYEQVQEKFDKYGNWVLFVARFLPGLRTAVFVTAGISRKVSYLRF 137
||||||||||||||||||||||||||||||
orf78ng YPVLFVARFLPGLRTAVFVTAGISRKVSYLRF 32
orf78.pep IIMDGLAA 145
:|||||||
orf78ng LIMDGLAALISVPVWIYLGEYGAHNIDWLMAKMHSLQSGIFIALGVLAAALAWFWWRKRR 92
The ORF78ng nucleotide sequence <SEQ ID 667> is predicted to encode a protein having the amino acid sequence <SEQ ID 668>:
1..YPVLFVARFL PGLRTAVFVT AGISRKVSYL RFLIMDGLAA LISVPVWIYL
51 GEYGAHNIDW LMAKMHSLQS GIFIALGVLA AALAWFWWRK RRHYQLYRAQ
101 LSEKRAKRKA EKAAKKAAQK QQ*
Further work revealed the complete gonococcal nucleotide sequence <SEQ ID 669>:
1 atgtttgccc tttTggaagc CTTTTTTGTC GAAtacggCt atgcGGCCGT
51 GTTTTTCGTT TTGGTCATCT GCGGTTTCGG CGTGCCGATT CCCGAAGATT
101 TGACCTTGGT AACGGGCGGC GTGATTTCGG GTATGGGTTA TACCAATCCG
151 CATATTATGT TTGCGGTCGG TATGCTCGGC GTGTTGGCGG GCGACGGCGT
201 GATGTTTGCC GCCGGACGCA TCTGGGGGCA GAAAATCCTC AAGTTCAAAC
251 CGATTGCGCG CATCATGACG CCGAAACGTT ACGCGCAGGT TCAGGAAAAA
301 TTCGACAAAT ACGGCAACTG GGTTCTGTTT GTCGCCCGTT TCCTGCCGGG
351 TTTGCGGACT GCCGTTTTCG TTACCGCCGG CATCAGCCGC AAAGTATCGT
401 ATCTGCGCTT TCTGATTATG GACGGGCTGG CCGCGCTGAT TTCCGTGCCC
451 GTTTGGATTT ACTTGGGCGA GTACGGCGCG CACAACATCG ATTGGCTGAT
501 GGCGAAAATG CACAGCCTGC AATCGGGCAT CTTCATCGCA TTGGGCGTGC
551 TGGCGGCGGC GCTGGCGTGG TTCTGGTGGC GCAAACGCCG ACATTATCAG
601 CTTTACCGCG CACAATTGAG CGAAAAACGC GCCAAACGCA AGGCGGAAAA
651 GGCAGCGAAA AAAGCGGCAC AGAAGCAGCA GTAa
This corresponds to the amino acid sequence <SEQ ID 670; ORF78ng-1>:
1 MFALLEAFFV EYGYAAVFFV LVICGFGVPI PEDLTLVTGG VISGMGYTNP
51 HIMFAVGMLG VLAGDGVMFA AGRIWGQKIL KFKPIARIMT PKRYAQVQEK
101 FDKYGNWVLF VARFLPGLRT AVFVTAGISR KVSYLRFLIM DGLAALISVP
151 VWIYLGEYGA HNIDWLMAKM HSLQSGIFIA LGVLAAALAW FWWRKRRHYQ
201 LYRAQLSEKR AKRKAEKAAK KAAQKQQ*
ORF78ng-1 and ORF78-1 show 88.1% identity over a 227 aa overlap:
10 20 30 40 50 60
orf78-1.pep MFAFLEAFFVEYGYAAVFFVLVICGFGVPIPEDLTLVTGGVISGMGYTNPHIMFAVGMLG
|||:||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf78ng-1 MFALLEAFFVEYGYAAVFFVLVICGFGVPIPEDLTLVTGGVISGMGYTNPHIMFAVGMLG
10 20 30 40 50 60
70 80 90 100 110 120
orf78-1.pep VLVGDGIMFAAGRIWGQKILRFKPIARIMTPKRYEQVQEKFDKYGNWVLFVARFLPGLRT
||:|||:|||||||||||||:||||||||||||| |||||||||||||||||||||||||
orf78ng-1 VLAGDGVMFAAGRIWGQKILKFKPIARIMTPKRYAQVQEKFDKYGNWVLFVARFLPGLRT
70 80 90 100 110 120
130 140 150 160 170 180
orf78-1.pep AVFVTAGISRKVSYLRFIIMDGLAALISVPIWIYLGEYGAHNIDWLMAKMHSLQSGIFVI
|||||||||||||||||:||||||||||||:|||||||||||||||||||||||||||:
orf78ng-1 AVFVTAGISRKVSYLRFLIMDGLAALISVPVWIYLGEYGAHNIDWLMAKMHSLQSGIFIA
130 140 150 160 170 180
190 200 210 220
orf78-1.pep LGIGATVVAWIWWKKRQRIQFYRSKLKEKRAQRKAAKAAKKAAQSKQX
||: |:::||:||:||:: |:||::|:||||:||| ||||||||::||
orf78ng-1 LGVLAAALAWFWWRKRRHYQLYRAQLSEKRAKRKAEKAAKKAAQKQQX
190 200 210 220
In addition, ORF78ng-1 shows homology to the dedA protein of H. influenzae:
sp|P45280|YG29_HAEIN hypothetical protein HI1629 >gi|1073983|pir||D64133 dedA protein (dedA) homolog - Haemophilus influenzae (strain Rd KW20)
>gi|1574476 (U32836) dedA protein (dedA) [Haemophilus influenzae] Length = 212
Score = 223 bits (563), Expect = 7e-58
Identities = 108/182 (59%), Positives = 140/182 (76%), Gaps = 2/182 (1%)
Query: 5 LEAFFVEYGYAAVFFVLVICGFGVPIPEDLTLVTGGVISGM--GYTNPHIMFAVGMLGVL 62
L FF EYGY AV FVL+ICGFGVPIPED+TLV+GGVI+G+ N H+M V M+GVL
Sbjct: 21 LIGFFTEYGYWAVLFVLIICGFGVPIPEDITLVSGGVIAGLYPENVNSHLMLLVSMIGVL 80
Query: 63 AGDGVMFAAGRIWGQKILKFKPIARIMTPKRYAQVQEKFDKYGNWVLFVARFLPGLRTAV 122
AGD M+ GRI+G KIL+F+PI RI+T +R V+EKF +YGN VLFVARFLPGLR +
Sbjct: 81 AGDSCMYWLGRIYGTKILRFRPIRRIVTLQRLRMVREKFSQYGNRVLFVARFLPGLRAPI 140
Query: 123 FVTAGISRKVSYLRFLIMDGLAALISVPVWIYLGEYGAHNIDWLMAKMHSLQSGIFIALG 182
++ +GI+R+VSY+RF+++D AA+ISVP+WIYLGE GA N+DWL ++ Q I+I +G
Sbjct: 141 YMVSGITRRVSYVRFVLIDFCAAIISVPIWIYLGELGAKNLDWLHTQIQKGQIVIYIFIG 200
Query: 183 VL 184
L
Sbjct: 201 YL 202
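Based on this analysis, including the presence of putative transmembrane domains, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.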
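The text does not state which program produced the transmembrane-domain predictions. As a rough illustration only, a sliding-window hydropathy profile using the Kyte-Doolittle scale (a substituted, commonly used technique, not necessarily the one applied here) can flag strongly hydrophobic stretches. The window of 19 residues and the cutoff of 1.6 follow the usual Kyte-Doolittle convention and are assumptions, as is the function name `hydropathy_profile`.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean Kyte-Doolittle hydropathy for each full window along `seq`."""
    values = [KD[aa] for aa in seq]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Two 19-residue stretches of ORF78-1 <SEQ ID 664>, copied from the listing.
hydrophobic = "GIFVILGIGATVVAWIWWK"   # residues 176-194
charged_tail = "KEKRAQRKAAKAAKKAAQS"  # residues 207-225

for name, segment in (("176-194", hydrophobic), ("207-225", charged_tail)):
    score = hydropathy_profile(segment)[0]
    label = "candidate TM segment" if score > 1.6 else "not hydrophobic enough"
    print(name, round(score, 2), label)
```

A profile of this kind only suggests candidate membrane-spanning segments; it does not replace a dedicated topology predictor.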
Example 87
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 671>:
1 ATGAAAAAAT TATTGGCGGC CGTGATGATG GCAGGTTTGG CAGGCGCGGT
51 TTCCGCCGCC GGAGTCCACG TTGAGGACGG CTGGGCGCGC ACCACCGTCG
101 AAGGTATGAA AATAGGCGGC GCGTTCATGA AAATCCACAA CGACGAAGCC
151 AAACAAGACT TTTTGCTCGG CGGAAGCAGC CCCGTTGCCG ACCGCGTCGA
201 AGTGCATACC CACATCAACG ACAACGGCGT GATGCGGATG CGCGAAGTCG
251 AAGGCGGCGT GCCTTTGGAA GCGAAATCCG TTACCGAACT CAAACCCGGC
301 AGCTATCATG TGATGTTTAT GGGTTTGAAA AAACAATTAA AAGAGGGCGA
351 TAAAATTCCC GTTACCCTGA AATTTAAAAA CGCCAAAGCG CAAACCGTCC
401 AACTGGAAGT CAAAATCGCG CCGATGCCGG CAATGAACCA C...
This corresponds to the amino acid sequence <SEQ ID 672; ORF79>:
1 MKKLLAAVMM AGLAGAVSAA GVHVEDGWAR TTVEGMKIGG AFMKIHNDEA
51 KQDFLLGGSS PVADRVEVHT HINDNGVMRM REVEGGVPLE AKSVTELKPG
101 SYHVMFMGLK KQLKEGDKIP VTLKFKNAKA QTVQLEVKIA PMPAMNH..
Further work revealed the complete nucleotide sequence <SEQ ID 673>:
1 ATGAAAAAAT TATTGGCGGC CGTGATGATG GCAGGTTTGG CAGGCGCGGT
51 TTCCGCCGCC GGAGTCCACG TTGAGGACGG CTGGGCGCGC ACCACCGTCG
101 AAGGTATGAA AATAGGCGGC GCGTTCATGA AAATCCACAA CGACGAAGCC
151 AAACAAGACT TTTTGCTCGG CGGAAGCAGC CCCGTTGCCG ACCGCGTCGA
201 AGTGCATACC CACATCAACG ACAACGGCGT GATGCGGATG CGCGAAGTCG
251 AAGGCGGCGT GCCTTTGGAA GCGAAATCCG TTACCGAACT CAAACCCGGC
301 AGCTATCATG TGATGTTTAT GGGTTTGAAA AAACAATTAA AAGAGGGCGA
351 TAAAATTCCC GTTACCCTGA AATTTAAAAA CGCCAAAGCG CAAACCGTCC
401 AACTGGAAGT CAAAATCGCG CCGATGCCGG CAATGAACCA CGGTCATCAC
451 CACGGCGAAG CGCATCAGCA CTAA
This corresponds to the amino acid sequence <SEQ ID 674; ORF79-1>:
1 MKKLLAAVMM AGLAGAVSAA GVHVEDGWAR TTVEGMKIGG AFMKIHNDEA
51 KQDFLLGGSS PVADRVEVHT HINDNGVMRM REVEGGVPLE AKSVTELKPG
101 SYHVMFMGLK KQLKEGDKIP VTLKFKNAKA QTVQLEVKIA PMPAMNHGHH
151 HGEAHQH*
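Computer analysis of this amino acid sequence revealed a putative leader peptide, and also gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF79 shows 94.6% identity over a 147 aa overlap with an ORF (ORF79a) from strain A of N. meningitidis: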
10 20 30 40 50 60
orf79.pep MKKLLAAVMMAGLAGAVSAAGVHVEDGWARTTVEGMKIGGAFMKIHNDEAKQDFLLGGSS
|| ||||||||||||||||||:|||||||||||||||:||||||||||||||||||||||
orf79a MKXLLAAVMMAGLAGAVSAAGIHVEDGWARTTVEGMKMGGAFMKIHNDEAKQDFLLGGSS
10 20 30 40 50 60
70 80 90 100 110 120
orf79.pep PVADRVEVHTHINDNGVMRMREVEGGVPLEAKSVTELKPGSYHVMFMGLKKQLKEGDKIP
|||||||||||||||||||||||||||||||||||||||||||||||| ||||| |||||
orf79a PVADRVEVHTHINDNGVMRMREVEGGVPLEAKSVTELKPGSYHVMFMGXKKQLKXGDKIP
70 80 90 100 110 120
130 140
orf79.pep VTLKFKNAKAQTVQLEVKIAPMPAMNH
|||||||||||||||||| ||| ||:|
orf79a VTLKFKNAKAQTVQLEVKTAPMSAMDHGHHHGEAHQHX
130 140 150
The complete length ORF79a nucleotide sequence <SEQ ID 675> is:
1 ATGAAANAAC TATTGGCAGC CGTGATGATG GCAGGTTTGG CAGGCGCGGT
51 TTCCGCCGCC GGAATCCACG TTGAGGACGG CTGGGCGCGC ACGACCGTCG
101 AAGGTATGAA AATGGGCGGC GCGTTCATGA AAATCCACAA CGACGAAGCC
151 AAACAAGACT TTTTGCTCGG CGGAAGCAGC CCTGTTGCCG ACCGCGTCGA
201 AGTGCATACC CATATCAATG ATAACGGTGT GATGCGGATG CGCGAAGTCG
251 AAGGCGGCGT GCCTTTGGAG GCGAAATCCG TTACCGAACT CAAACCCGGC
301 AGCTATCATG TCATGTTTAT GGGTNTGAAA AAACAATTAA AAGANGGCGA
351 CAAGATTCCC GTTACCCTGA AATTTAAAAA CGCCAAAGCA CAAACCGTCC
401 AACTGGAAGT CAAAACCGCG CCGATGTCGG CAATGGACCA CGGTCATCAC
451 CACGGCGAAG CGCATCAGCA CTAA
This encodes a protein having the amino acid sequence <SEQ ID 676>:
1 MKXLLAAVMM AGLAGAVSAA GIHVEDGWAR TTVEGMKMGG AFMKIHNDEA
51 KQDFLLGGSS PVADRVEVHT HINDNGVMRM REVEGGVPLE AKSVTELKPG
101 SYHVMFMGXK KQLKXGDKIP VTLKFKNAKA QTVQLEVKTA PMSAMDHGHH
151 HGEAHQH*
ORF79a and ORF79-1 show 94.9% identity over a 157 aa overlap:
10 20 30 40 50 60
orf79a.pep MKXLLAAVMMAGLAGAVSAAGIHVEDGWARTTVEGMKMGGAFMKIHNDEAKQDFLLGGSS
|| ||||||||||||||||||:|||||||||||||||:||||||||||||||||||||||
orf79-1 MKKLLAAVMMAGLAGAVSAAGVHVEDGWARTTVEGMKIGGAFMKIHNDEAKQDFLLGGSS
10 20 30 40 50 60
70 80 90 100 110 120
orf79a.pep PVADRVEVHTHINDNGVMRMREVEGGVPLEAKSVTELKPGSYHVMFMGXKKQLKXGDKIP
|||||||||||||||||||||||||||||||||||||||||||||||| ||||| |||||
orf79-1 PVADRVEVHTHINDNGVMRMREVEGGVPLEAKSVTELKPGSYHVMFMGLKKQLKEGDKIP
70 80 90 100 110 120
130 140 150
orf79a.pep VTLKFKNAKAQTVQLEVKTAPMSAMDHGHHHGEAHQHX
|||||||||||||||||| ||| ||:||||||||||||
orf79-1 VTLKFKNAKAQTVQLEVKIAPMPAMNHGHHHGEAHQHX
130 140 150
Homology with a predicted ORF from N. gonorrhoeae
ORF79 shows 96.1% identity over a 76 aa overlap with a predicted ORF (ORF79ng) from N. gonorrhoeae:
orf79.pep FMKIHNDEAKQDFLLGGSSPVADRVEVHTHINDNGVMRMREVEGGVPLEAKSVTELKPGS 101
||||||||||||:|||||||||||||||||
orf79ng INDNGVMRMREVKGGVPLEAKSVTELKPGS 30
orf79.pep YHVMFMGLKKQLKEGDKIPVTLKFKNAKAQTVQLEVKIAPMPAMNH 147
||||||||||||||||||||||||||||||||||||| ||| ||||
orf79ng YHVMFMGLKKQLKEGDKIPVTLKFKNAKAQTVQLEVKTAPMSAMNHGHHHGEAHQH 86
The ORF79ng nucleotide sequence <SEQ ID 677> is predicted to encode a protein comprising the amino acid sequence <SEQ ID 678>:
1..INDNGVMRMR EVKGGVPLEA KSVTELKPGS YHVMFMGLKK QLKEGDKIPV
51 TLKFKNAKAQ TVQLEVKTAP MSAMNHGHHH GEAHQH*
Further work revealed the complete gonococcal DNA sequence <SEQ ID 679>:
1 ATGAAAAAAT TATTGGCAGC CGTGATGATG GCAGGTTTGG CAGGCGCGGT
51 TTccgccgCc GGagTccAtG TCGAggACGG CTGGGCGCGc accaCTGtcg
101 aaggtATgaa aatggGCGGC GCgttCATga aaATCCACAA CGACGaaGcc
151 atacaaGACt ttgtgcTCgg CGGaagcatg cccgttgccg accgcGTCGA
201 AGTGCAtaca cacATCAACG ACAACGGCGT GATGCGTATG CGCGAAGTCA
251 AAGGCGGCGT GCCTTTGGAG GCGAAATCCG TTACCGAACT CAAACCCGGC
301 AGCTATCACG TGATGTTTAT GGGTTTGAAA AAACAACTGA AAGAGGGCGA
351 CAAGATTCCC GTTACCCTGA AATTTAAAAA CGCCAAAGCG CAAACCGTCC
401 AACTGGAAGT CAAAACCGCG CCGATGTCGG CAATGAACCA CGGTCATCAC
451 CACGGCGAAG CGCATCAGCA CTAA
This corresponds to the amino acid sequence <SEQ ID 680; ORF79ng-1>:
1 MKKLLAAVMM AGLAGAVSAA GVHVEDGWAR TTVEGMKMGG AFMKIHNDEA
51 IQDFVLGGSM PVADRVEVHT HINDNGVMRM REVKGGVPLE AKSVTELKPG
101 SYHVMFMGLK KQLKEGDKIP VTLKFKNAKA QTVQLEVKTA PMSAMNHGHH
151 HGEAHQH*
ORF79ng-1 and ORF79-1 show 95.5% identity over a 157 aa overlap:
10 20 30 40 50 60
orf79-1.pep MKKLLAAVMMAGLAGAVSAAGVHVEDGWARTTVEGMKIGGAFMKIHNDEAKQDFLLGGSS
|||||||||||||||||||||||||||||||||||||:|||||||||||| |||:|||||
orf79ng-1 MKKLLAAVMMAGLAGAVSAAGVHVEDGWARTTVEGMKMGGAFMKIHNDEAIQDFVLGGSM
10 20 30 40 50 60
70 80 90 100 110 120
orf79-1.pep PVADRVEVHTHINDNGVMRMREVEGGVPLEAKSVTELKPGSYHVMFMGLKKQLKEGDKIP
|||||||||||||||||||||:||||||||||||||||||||||||||||||||||||||
orf79ng-1 PVADRVEVHTHINDNGVMRMREVKGGVPLEAKSVTELKPGSYHVMFMGLKKQLKEGDKIP
70 80 90 100 110 120
130 140 150
orf79-1.pep VTLKFKNAKAQTVQLEVKIAPMPAMNHGHHHGEAHQHX
|||||||||||||||||| ||| |||||||||||||||
orf79ng-1 VTLKFKNAKAQTVQLEVKTAPMSAMNHGHHHGEAHQHX
130 140 150
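The identity figures quoted for ungapped, equal-length comparisons can be checked with a simple residue-by-residue count. The sketch below is illustrative only: the function name `percent_identity` is hypothetical, the sequences are copied from SEQ ID 674 and SEQ ID 680 above without the stop codon, and alignments containing gaps would need a proper alignment program rather than this direct comparison.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical positions between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length (no gaps handled)")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# ORF79-1 <SEQ ID 674>, 157 residues, stop codon omitted.
ORF79_1 = (
    "MKKLLAAVMMAGLAGAVSAAGVHVEDGWARTTVEGMKIGGAFMKIHNDEA"
    "KQDFLLGGSSPVADRVEVHTHINDNGVMRMREVEGGVPLEAKSVTELKPG"
    "SYHVMFMGLKKQLKEGDKIPVTLKFKNAKAQTVQLEVKIAPMPAMNHGHH"
    "HGEAHQH"
)

# ORF79ng-1 <SEQ ID 680>, 157 residues, stop codon omitted.
ORF79NG_1 = (
    "MKKLLAAVMMAGLAGAVSAAGVHVEDGWARTTVEGMKMGGAFMKIHNDEA"
    "IQDFVLGGSMPVADRVEVHTHINDNGVMRMREVKGGVPLEAKSVTELKPG"
    "SYHVMFMGLKKQLKEGDKIPVTLKFKNAKAQTVQLEVKTAPMSAMNHGHH"
    "HGEAHQH"
)

print(round(percent_identity(ORF79_1, ORF79NG_1), 1))
```

The two sequences differ at 7 of 157 positions, so the count gives 150/157, in line with the 95.5% stated above.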
In addition, ORF79ng-1 shows significant homology to a protein from Aquifex aeolicus:
gi|2983695 (AE000731) putative protein [Aquifex aeolicus] Length = 151
Score = 63.6 bits (152), Expect = 6e-10
Identities = 38/114 (33%), Positives = 58/114 (50%), Gaps = 1/114 (0%)
Query: 24 VEDGWARTTVEGMKMGGAFMKIHNDEAIQDFVLGGSMPVADRVEVHTHINDNGVMRMREV 83
V+ W G M I N+ D+++G +A RVE+H + +N V +M
Sbjct: 27 VKHPWVMEPPPGPNTTMMGMIIVNEGDEPDYLIGAKTDIAQRVELHKTVIENDVAKMVPQ 86
Query: 84 KGGVPLEAKSVTELKPGSYHVMFMGLKKQLKEGDKIPVTLKFKNAKAQTVQLEV 137
+ + + K E K YHVM +GLKK++KEGDK+ V L F+ + TV+ V
Sbjct: 87 ER-IEIPPKGKVEFKHHGYHVMIIGLKKRIKEGDKVKVELIFEKSGKITVEAPV 139
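Based on this analysis, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
As described above, ORF79-1 (15.6 kDa) was cloned in a pET vector and expressed in E. coli. The products of protein expression and purification were analyzed by SDS-PAGE. Figure 18A shows the results of affinity purification of the His-fusion protein. Mice were immunized with the purified His-fusion protein, and their sera were used for ELISA (positive result) and FACS analysis (Figure 18B). These experiments confirm that ORF79-1 is a surface-exposed protein and that it is a useful immunogen.
Example 88
The following DNA sequence, believed to be complete, was identified in N. meningitidis <SEQ ID 681>: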
1 ATGACGGTAA CTGCGGCCGA AGGCGGCAAA GCTGCCAAGG CGTTAAAAAA
51 ATATCTGATT ACGGGCATTT TGGTCTGGCT GCCGATTGCG GTAACGGTTT
101 GGGTGGTTTC CTATATCGTT TCCGCGTCCG ATCAGCTCGT CAACCTGCTG
151 CCGAAGCAAT GGCGGCCGCA ATATGTTTTG GGGTTTAATA TCCCGGGGCT
201 GGGCGTTATC GTTGCCATTG CCGTATTGTT TGTAACCGGA TTGTTTGCCG
251 CCAACGTATT GGGTCGGCAG ATCCTCGCCG CGTGGGACAG CCTGTTGGGG
301 CGGATTCCGG TTGTGAAAtC CATCTATTCG AGTGTGAAAA AAGTATCCGA
351 ATacgTGCTG TCCGACAGCA GCCGTTCGTT TAAAACGCCG GTACTCGTGC
401 CGTTTCCCCA GCCCGGTATT TGGACGATyG CTTTCGTGTC AGGGCAGGTG
451 TCGAATGCGG TTAAGGCCGC ATTGCCGAAs GACGGCGATT ATCTTTCCGT
501 GTATGTTCCG ACCACGCCGA ATCCGACCGG CGGTTACTAT ATTATGGTAA
551 AGAAAAGCGA TGTGCGCGAA CTCGATATGA GCGTGGACGA AsCATTGAAA
601 TATGTGATTT CGCTGGGTAT GGTCATCCCT GACGACCTGC CCGTCAAAAC
651 ATTGGCAsGA CCTATGCCGT CTGAAAAGGC GGATTTGCCC GAACAACAAT
701 AA
This corresponds to the amino acid sequence <SEQ ID 682; ORF98>:
1 MTVTAAEGGK AAKALKKYLI TGILVWLPIA VTVWVVSYIV SASDQLVNLL
51 PKQWRPQYVL GFNIPGLGVI VAIAVLFVTG LFAANVLGRQ ILAAWDSLLG
101 RIPVVKSIYS SVKKVSEYVL SDSSRSFKTP VLVPFPQPGI WTIAFVSGQV
151 SNAVKAALPX DGDYLSVYVP TTPNPTGGYY IMVKKSDVRE LDMSVDEXLK
201 YVISLGMVIP DDLPVKTLAX PMPSEKADLP EQQ*
Further work revealed the complete nucleotide sequence <SEQ ID 683>:
1 ATGACGGAAC nTGCGGCCGA AGGCGGCAAA GCTGCCAArG CGTTAAAAAA
51 ATATCTGATT ACGGGCATTT TGGTCTGGCT GCCGATTGCG GTAACGGTTT
101 GGGTGGTTTC CTATATCGTT TCCGCGTCCG ATCAGCTCGT CAACCTGCTG
151 CCGAAGCAAT GGCGGCCGCA ATATGTTTTG GGGTTTAATA TCCCGGGGCT
201 GGGCGTTATC GTTGCCATTG CCGTATTGTT TGTAACCGGA TTGTTTGCCG
251 CCAACGTATT GGGTCGGCAG ATCCTCGCCG CGTGGGACAG CCTGTTGGGG
301 CGGATTCCGG TTGTGAAATC CATCTATTCG AGTGTGAAAA AAGTATCCGA
351 ATCGCTGCTG TCCGACAGCA GCCGTTCGTT TAAAACGCCG GTACTCGTGC
401 CGTTTCCCCA GCCCGGTATT TGGACGATTG CTTTCGTGTC AGGGCAGGTG
451 TCGAATGCGG TTAAGGCCGC ATTGCCGAAG GACGGCGATT ATCTTTCCGT
501 GTATGTTCCG ACCACGCCGA ATCCGACCGG CGGTTACTAT ATTATGGTAA
551 AGAAAAGCGA TGTGCGCGAA CTCGATATGA GCGTGGACGA AGCATTGAAA
601 TATGTGATTT CGCTGGGTAT GGTCATCCCT GACGACCTGC CCGTCAAAAC
651 ATTGGCAGGA CCTATGCCGT CTGAAAAGGC GGATTTGCCC GAACAACAAT
701 AA
This corresponds to the amino acid sequence <SEQ ID 684; ORF98-1>:
1 MTEXAAEGGK AAKALKKYLI TGILVWLPIA VTVWVVSYIV SASDQLVNLL
51 PKQWRPQYVL GFNIPGLGVI VAIAVLFVTG LFAANVLGRQ ILAAWDSLLG
101 RIPVVKSIYS SVKKVSESLL SDSSRSFKTP VLVPFPQPGI WTIAFVSGQV
151 SNAVKAALPK DGDYLSVYVP TTPNPTGGYY IMVKKSDVRE LDMSVDEALK
201 YVISLGMVIP DDLPVKTLAG PMPSEKADLP EQQ*
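Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF98 shows 96.1% identity over a 233 aa overlap with an ORF (ORF98a) from strain A of N. meningitidis: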
10 20 30 40 50 60
orf98.pep MTVTAAEGGKAAKALKKYLITGILVWLPIAVTVWVVSYIVSASDQLVNLLPKQWRPQYVL
|| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf98a MTEPAAEGGKAAKALKKYLITGILVWLPIAVTVWVVSYIVSASDQLVNLLPKQWRPQYVL
10 20 30 40 50 60
70 80 90 100 110 120
orf98.pep GFNIPGLGVIVAIAVLFVTGLFAANVLGRQILAAWDSLLGRIPVVKSIYSSVKKVSEYVL
|||||||||||||||||||||||||||||||||||||||||||||||||||||||| :|
orf98a GFNIPGLGVIVAIAVLFVTGLFAANVLGRQILAAWDSLLGRIPVVKSIYSSVKKVSXSLL
70 80 90 100 110 120
130 140 150 160 170 180
orf98.pep SDSSRSFKTPVLVPFPQPGIWTIAFVSGQVSNAVKAALPXDGDYLSVYVPTTPNPTGGYY
||||||||||||||||| ||||||||||||||||||||| ||||||||||||||||||||
orf98a SDSSRSFKTPVLVPFPQSGIWTIAFVSGQVSNAVKAALPKDGDYLSVYVPTTPNPTGGYY
130 140 150 160 170 180
190 200 210 220 230
orf98.pep IMVKKSDVRELDMSVDEXLKYVISLGMVIPDDLPVKTLAXPMPSEKADLPEQQX
||||||||||||||||| ||||||||||||||||||||| ||||||||||||||
orf98a IMVKKSDVRELDMSVDEALKYVISLGMVIPDDLPVKTLAGPMPSEKADLPEQQX
190 200 210 220 230
The complete length ORF98a nucleotide sequence <SEQ ID 685> is:
1 ATGACGGAAC CTGCGGCCGA AGGCGGCAAA GCTGCCAAGG CGTTAAAAAA
51 ATATCTGATT ACGGGCATTT TGGTCTGGCT GCCGATTGCG GTAACGGTTT
101 GGGTGGTTTC CTATATCGTT TCCGCGTCCG ATCAGCTCGT CAACCTGCTG
151 CCGAAGCAAT GGCGGCCGCA ATATGTTTTG GGGTTTAATA TCCCGGGGCT
201 GGGCGTTATC GTTGCCATTG CCGTATTGTT TGTAACCGGA TTATTTGCCG
251 CAAACGTATT GGGCCGGCAG ATTCTTGCCG CGTGGGACAG CTTGTTGGGG
301 CGGATTCCGG TTGTGAAGTC CATCTATTCG AGTGTGAAAA AAGTATCCGA
351 NTCGTTGCTG TCCGACAGCA GCCGTTCGTT TAAAACACCA GTACTCGTGC
401 CGTTTCCCCA ATCGGGTATT TGGACAATCG CATTCGTGTC CGGTCAGGTG
451 TCGAATGCGG TTAAGGCCGC ATTGCCGAAG GACGGCGATT ATCTTTCCGT
501 GTATGTTCCG ACCACGCCGA ATCCGACCGG CGGTTACTAT ATTATGGTAA
551 AGAAAAGCGA TGTGCGCGAA CTCGATATGA GCGTGGACGA AGCGTTGAAA
601 TATGTGATTT CGCTGGGTAT GGTCATCCCT GACGACCTGC CCGTCAAAAC
651 ATTGGCAGGA CCTATGCCGT CTGAAAAGGC GGATTTGCCC GAACAACAAT
701 AA
This encodes a protein having the amino acid sequence <SEQ ID 686>:
1 MTEPAAEGGK AAKALKKYLI TGILVWLPIA VTVWVVSYIV SASDQLVNLL
51 PKQWRPQYVL GFNIPGLGVI VAIAVLFVTG LFAANVLGRQ ILAAWDSLLG
101 RIPVVKSIYS SVKKVSXSLL SDSSRSFKTP VLVPFPQSGI WTIAFVSGQV
151 SNAVKAALPK DGDYLSVYVP TTPNPTGGYY IMVKKSDVRE LDMSVDEALK
201 YVISLGMVIP DDLPVKTLAG PMPSEKADLP EQQ*
ORF98a and ORF98-1 show 98.7% identity over a 233 aa overlap:
10 20 30 40 50 60
orf98a.pep MTEPAAEGGKAAKALKKYLITGILVWLPIAVTVWVVSYIVSASDQLVNLLPKQWRPQYVL
||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf98-1 MTEXAAEGGKAAKALKKYLITGILVWLPIAVTVWVVSYIVSASDQLVNLLPKQWRPQYVL
10 20 30 40 50 60
70 80 90 100 110 120
orf98a.pep GFNIPGLGVIVAIAVLFVTGLFAANVLGRQILAAWDSLLGRIPVVKSIYSSVKKVSXSLL
|||||||||||||||||||||||||||||||||||||||||||||||||||||||| |||
orf98-1 GFNIPGLGVIVAIAVLFVTGLFAANVLGRQILAAWDSLLGRIPVVKSIYSSVKKVSESLL
70 80 90 100 110 120
130 140 150 160 170 180
orf98a.pep SDSSRSFKTPVLVPFPQSGIWTIAFVSGQVSNAVKAALPKDGDYLSVYVPTTPNPTGGYY
||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
orf98-1 SDSSRSFKTPVLVPFPQPGIWTIAFVSGQVSNAVKAALPKDGDYLSVYVPTTPNPTGGYY
130 140 150 160 170 180
190 200 210 220 230
orf98a.pep IMVKKSDVRELDMSVDEALKYVISLGMVIPDDLPVKTLAGPMPSEKADLPEQQX
||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf98-1 IMVKKSDVRELDMSVDEALKYVISLGMVIPDDLPVKTLAGPMPSEKADLPEQQX
190 200 210 220 230
Homology with a predicted ORF from N. gonorrhoeae
ORF98 shows 95.3% identity over a 233 aa overlap with a predicted ORF (ORF98ng) from N. gonorrhoeae:
10 20 30 40 50 60
orf98.pep MTVTAAEGGKAAKALKKYLITGILVWLPIAVTVWVVSYIVSASDQLVNLLPKQWRPQYVL 60
|| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf98ng MTEPAAEGGKAAKALKKYLITGILVWLPIAVTVWVVSYIVSASDQLVNLLPKQWRPQYVL 60
orf98.pep GFNIPGLGVIVAIAVLFVTGLFAANVLGRQILAAWDSLLGRIPVVKSIYSSVKKVSEYVL 120
||||||||||||||||||||||||||||||||||||||||||||||||||||||||| :|
orf98ng GFNIPGLGVIVAIAVLFVTGLFAANVLGRQILAAWDSLLXRIPVVKSIYSSVKKVSESLL 120
orf98.pep SDSSRSFKTPVLVPFPQPGIWTIAFVSGQVSNAVKAALPXDGDYLSVYVPTTPNPTGGYY 180
||||||||||||||||| ||||||||||||||||||||| ||||||||||||||||||||
orf98ng SDSSRSFKTPVLVPFPQSGIWTIAFVSGQVSNAVKAALPQDGDYLSVYVPTTPNPTGGYY 180
orf98.pep IMVKKSDVRELDMSVDEXLKYVISLGMVIPDDLPVKTLAXPMPSEKADLPEQQ 233
||||||||||||||||| ||||||||||||||||||||| ||| |||:|||||
orf98ng IMVKKSDVRELDMSVDEALKYVISLGMVIPDDLPVKTLAGPMPPEKAELPEQQ 233
The complete length ORF98ng nucleotide sequence <SEQ ID 687> is predicted to encode a protein having the amino acid sequence <SEQ ID 688>:
1 MTEPAAEGGK AAKALKKYLI TGILVWLPIA VTVWVVSYIV SASDQLVNLL
51 PKQWRPQYVL GFNIPGLGVI VAIAVLFVTG LFAANVLGRQ ILAAWDSLLX
101 RIPVVKSIYS SVKKVSESLL SDSSRSFKTP VLVPFPQSGI WTIAFVSGQV
151 SNAVKAALPQ DGDYLSVYVP TTPNPTGGYY IMVKKSDVRE LDMSVDEALK
201 YVISLGMVIP DDLPVKTLAG PMPPEKAELP EQQ*
Further work revealed the complete nucleotide sequence <SEQ ID 689>:
1 ATGACGGAAC CTGCGGCCGA AGGCGGCAAA GCTGCCAAGG CGTTAAAAAA
51 ATATCTGATT ACAGGCATTT TGGTCTGGCT GCCGATTGCG GTAACGGTTT
101 GGGTGGTTTC CTATATCGTT TCCGCGTCCG ACCAGCTTGT CAACCTGCTG
151 CCGAAGCAAT GGCGGCCGCA ATATGTTTTG GGGTTTAATA TCCCCGGGCT
201 CGCCGTTATT GTTGCCATTG CCGTATTGTT TGTAACCGGA TTATTTGCCG
251 CAAACGTGTT GGGCCGGCAG ATTCTTGCCG CGTGGGACAG CCTGTTgggg
301 cggaTTCCGG TTGTCAAATC CATCTATTCG AGTGTGAAAA AAGTATCCGA
351 ATCGCTGCTG TCCGACAGCA GCCGTTCGTT TAAAACGCCG GTACTCGTGC
401 CGTTTCCCCA ATCGGGTATT TGGACAATCG CATTCGTGTC CGGTCAGGTG
451 TCGAATGCGG TTAAGGCCGC ATTGCCGCAG GATGGCGATT ATCTTTCCGT
501 GTATGTCCCG ACCACGCCCA ACCCGACCGG CGGTTACTAT ATTATGGTAA
551 AGAAAAGCGA TGTGCGCGAA CTCGATATGA GCGTGGACGA AGCGTTGAAA
601 TATGTGATTT CGCTGGGTAT GGTCATCCCT GACGACCTGC CCGTCAAAAC
651 ATTGGCAGGA CCTATGCCGC CTGAAAAGGC GGAGTTGCCC GAACAACAAT
701 AA
This corresponds to the amino acid sequence <SEQ ID 690; ORF98ng-1>:
1 MTEPAAEGGK AAKALKKYLI TGILVWLPIA VTVWVVSYIV SASDQLVNLL
51 PKQWRPQYVL GFNIPGLGVI VAIAVLFVTG LFAANVLGRQ ILAAWDSLLG
101 RIPVVKSIYS SVKKVSESLL SDSSRSFKTP VLVPFPQSGI WTIAFVSGQV
151 SNAVKAALPQ DGDYLSVYVP TTPNPTGGYY IMVKKSDVRE LDMSVDEALK
201 YVISLGMVIP DDLPVKTLAG PMPPEKAELP EQQ*
ORF98ng-1 and ORF98-1 show 97.9% identity over a 233 aa overlap:
10 20 30 40 50 60
orf98-1.pep MTEXAAEGGKAAKALKKYLITGILVWLPIAVTVWVVSYIVSASDQLVNLLPKQWRPQYVL
||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf98ng-1 MTEPAAEGGKAAKALKKYLITGILVWLPIAVTVWVVSYIVSASDQLVNLLPKQWRPQYVL
10 20 30 40 50 60
70 80 90 100 110 120
orf98-1.pep GFNIPGLGVIVAIAVLFVTGLFAANVLGRQILAAWDSLLGRIPVVKSIYSSVKKVSESLL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf98ng-1 GFNIPGLGVIVAIAVLFVTGLFAANVLGRQILAAWDSLLGRIPVVKSIYSSVKKVSESLL
70 80 90 100 110 120
130 140 150 160 170 180
orf98-1.pep SDSSRSFKTPVLVPFPQPGIWTIAFVSGQVSNAVKAALPKDGDYLSVYVPTTPNPTGGYY
||||||||||||||||| |||||||||||||||||||||:||||||||||||||||||||
orf98ng-1 SDSSRSFKTPVLVPFPQSGIWTIAFVSGQVSNAVKAALPQDGDYLSVYVPTTPNPTGGYY
130 140 150 160 170 180
190 200 210 220 230
orf98-1.pep IMVKKSDVRELDMSVDEALKYVISLGMVIPDDLPVKTLAGPMPSEKADLPEQQX
||||||||||||||||||||||||||||||||||||||||||| |||:||||||
orf98ng-1 IMVKKSDVRELDMSVDEALKYVISLGMVIPDDLPVKTLAGPMPPEKAELPEQQX
190 200 210 220 230
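Based on this analysis, including the fact that the putative transmembrane domains in the gonococcal protein are identical to the sequences in the meningococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 89
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 691>: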
1 ATgAAAACGG TAGTCTGGAT TGTCGTCCTG TTTGCCGCCG CCGTCGGACT
51 GGCGCTGGCT TCGGGCATTT ACACCGGCGA CGTGTATATC GTACTCGGAC
101 AGACCATGCT CAGAATCAAC CTGCACGCCT TTGTGTTAGG TTCGCTGATT
151 GCCGTCGTGG TGTGGTATTT CTTGTTTAAA TTCATTATCG GsGgTACTCA
201 ATATCCCCGA AAAGATGCAG CGTTTCGGTT CGGCnCGTAA AGGCCkCAAG
251 ssCGsGCTTG CCTTGAACAA GGCGGGTTTG GCGTATTTTG AAGGGCGTTT
301 TGAAAAGGCG GAACTAGAAG CCTCACGCGT GTTGGTCAAC AAAGtAGGCC
351 GaGAGACAAC CGGACTTTGG CATTGATGCT GrGCGCGCAC GCCGCCGGAC
401 AGATGGAAAA CATCGAssTG CGCGACCGTT ATCTTGCGGA AATCGCCAAA
451 CTGCCGGAAA AACAGCAGCT TTCCCGTTAT CTTTTGTTGG CGGAATCGGC
501 GTTGAACCGG CGCGATTACG AAGCGGCGGA AGCCAATCTT CATGCGGCGG
551 CGAAGATGAA TGCCAACCTT ACGCGCCTCG TGCGTCTGCA .ATTCGTTAC
601 GCTTTCGACA GGGGCGACGC GTTGCAGGTT CTGGCAAAAA CCGAAAAACT
651 TTCCAAGGCG GGCGCGTTGG GCAAATCGGA AATGGAACGG TATCAAAATT
701 GGGCATATCC GTCGCCAGCT GGCGGATGCT GCCGATGCCG CCGCTTTGAA
751 AACCTGCCTG AAGCGGATTC CCGACAGCCT CAAAAACGGG GAATTGAGCG
801 TATCGGTTGC GGAAAAGTAC GAACGTTTGG GACTGTATGC CGATGCGGTC
851 AAATGGGTCA AACAGCATTA TCCGCAsAAC CGCCGCCCCG AGCTTTTGGA
901 AGCCTTTGTC GAAAGCGTGC GCTTTTTGGG CGAGCGCGAA CAGCAGAAAG
951 CCATCGATTT TGCCGATGCT TGGCTGAAAG AACAGCCCGA TAACGCGCTT
1001 CTGCTGATGT ATCTCGGTCG GCTCGCCTTC GGCCGCAAAC TTTGGGGCAA
1051 GGCAAAAGGC TACCTTGAAG CGAGCATTGC ATTAAAGCCG AGTATTTCCG
1101 CGCGTTTGGT TCTAACAAAG GTTTTCGACG AAATCGGAGA ACCGCAGAAG
1151 GCGGAGGCGC AC...
This corresponds to the amino acid sequence <SEQ ID 692; ORF100>:
1 MKTVVWIVVL FAAAVGLALA SGIYTGDVYI VLGQTMLRIN LHAFVLGSLI
51 AVVVWYFLFK FIIGVLNIPE KMQRFGSARK GXKXXLALNK AGLAYFEGRF
101 EKAELEASRV LVNKVGRDNR TLALMLXAHA AGQMENIXXR DRYLAEIAKL
151 PEKQQLSRYL LLAESALNRR DYEAAEANLH AAAKMNANLT RLVRLXIRYA
201 FDRGDALQVL AKTEKLSKAG ALGKSEMERY QNWAYRRQLA DAADAAALKT
251 CLKRIPDSLK NGELSVSVAE KYERLGLYAD AVKWVKQHYP XNRRPELLEA
301 FVESVRFLGE REQQKAIDFA DAWLKEQPDN ALLLMYLGRL AFGRKLWGKA
351 KGYLEASIAL KPSISARLVL TKVFDEIGEP QKAEAH...
Further work revealed the complete nucleotide sequence <SEQ ID 693>:
1 ATGAAAACGG TAGTCTGGAT TGTCGTCCTG TTTGCCGCCG CCGTCGGACT
51 GGCGCTGGCT TCGGGCATTT ACACCGGCGA CGTGTATATC GTACTCGGAC
101 AGACCATGCT CAGAATCAAC CTGCACGCCT TTGTGTTAGG TTCGCTGATT
151 GCCGTCGTGG TGTGGTATTT CTTGTTTAAA TTCATTATCG GCGTACTCAA
201 TATCCCCGAA AAGATGCAGC GTTTCGGTTC GGCGCGTAAA GGCCGCAAGG
251 CCGCGCTTGC CTTGAACAAG GCGGGTTTGG CGTATTTTGA AGGGCGTTTT
301 GAAAAGGCGG AACTAGAAGC CTCACGCGTG TTGGTCAACA AAGAGGCCGG
351 AGACAACCGG ACTTTGGCAT TGATGCTGGG CGCGCACGCC GCCGGACAGA
401 TGGAAAACAT CGAGCTGCGC GACCGTTATC TTGCGGAAAT CGCCAAACTG
451 CCGGAAAAAC AGCAGCTTTC CCGTTATCTT TTGTTGGCGG AATCGGCGTT
501 GAACCGGCGC GATTACGAAG CGGCGGAAGC CAATCTTCAT GCGGCGGCGA
551 AGATGAATGC CAACCTTACG CGCCTCGTGC GTCTGCAACT TCGTTACGCT
601 TTCGACAGGG GCGACGCGTT GCAGGTTCTG GCAAAAACCG AAAAACTTTC
651 CAAGGCGGGC GCGTTGGGCA AATCGGAAAT GGAACGGTAT CAAAATTGGG
701 CATACCGCCG CCAGCTGGCG GATGCTGCCG ATGCCGCCGC TTTGAAAACC
751 TGCCTGAAGC GGATTCCCGA CAGCCTCAAA AACGGGGAAT TGAGCGTATC
801 GGTTGCGGAA AAGTACGAAC GTTTGGGACT GTATGCCGAT GCGGTCAAAT
851 GGGTCAAACA GCATTATCCG CACAACCGCC GCCCCGAGCT TTTGGAAGCC
901 TTTGTCGAAA GCGTGCGCTT TTTGGGCGAG CGCGAACAGC AGAAAGCCAT
951 CGATTTTGCC GATGCTTGGC TGAAAGAACA GCCCGATAAC GCGCTTCTGC
1001 TGATGTATCT CGGTCGGCTC GCCTACGGCC GCAAACTTTG GGGCAAGGCA
1051 AAAGGCTACC TTGAAGCGAG CATTGCATTA AAGCCGAGTA TTTCCGCGCG
1101 TTTGGTTCTA GCAAAGGTTT TCGACGAAAT CGGAGAACCG CAGAAGGCGG
1151 AGGCGCAGCG CAACTTGGTT TTGGAAGCCG TCTCCGATGA CGAACGTCAC
1201 GCAGCGTTAG AGCAGCATAG CTGA
This corresponds to the amino acid sequence <SEQ ID 694; ORF100-1>:
1 MKTVVWIVVL FAAAVGLALA SGIYTGDVYI VLGQTMLRIN LHAFVLGSLI
51 AVVVWYFLFK FIIGVLNIPE KMQRFGSARK GRKAALALNK AGLAYFEGRF
101 EKAELEASRV LVNKEAGDNR TLALMLGAHA AGQMENIELR DRYLAEIAKL
151 PEKQQLSRYL LLAESALNRR DYEAAEANLH AAAKMNANLT RLVRLQLRYA
201 FDRGDALQVL AKTEKLSKAG ALGKSEMERY QNWAYRRQLA DAADAAALKT
251 CLKRIPDSLK NGELSVSVAE KYERLGLYAD AVKWVKQHYP HNRRPELLEA
301 FVESVRFLGE REQQKAIDFA DAWLKEQPDN ALLLMYLGRL AYGRKLWGKA
351 KGYLEASIAL KPSISARLVL AKVFDEIGEP QKAEAQRNLV LEAVSDDERH
401 AALEQHS*
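Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF100 shows 93.5% identity over a 386 aa overlap with an ORF (ORF100a) from strain A of N. meningitidis: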
10 20 30 40 50 60
orf100.pep MKTVVWIVVLFAAAVGLALASGIYTGDVYIVLGQTMLRINLHAFVLGSLIAVVVWYFLFK
|||||||||||||| |||||||| ||||||||||||||||||||||||||||||||||||
orf100a MKTVVWIVVLFAAAXGLALASGIXTGDVYIVLGQTMLRINLHAFVLGSLIAVVVWYFLFK
10 20 30 40 50 60
70 80 90 100 110 120
orf100.pep FIIGVLNIPEKMQRFGSARKGXKXXLALNKAGLAYFEGRFEKAELEASRVLVNKVGRDNR
||||||| ||||||||||||| | |||||||||||||||||||||||||| || : |||
orf100a FIIGVLNXPEKMQRFGSARKGRKAALALNKAGLAYFEGRFEKAELEASRVLGNKEAGDNR
70 80 90 100 110 120
130 140 150 160 170 180
orf100.pep TLALMLXAHAAGQMENIXXRDRYLAEIAKLPEKQQLSRYLLLAESALNRRDYEAAEANLH
|||||| |||||||||| |||||||||||||||||||||||||||||||||||||||||
orf100a TLALMLGAHAAGQMENIELRDRYLAEIAKLPEKQQLSRYLLLAESALNRRDYEAAEANLH
130 140 150 160 170 180
190 200 210 220 230 240
orf100.pep AAAKMNANLTRLVRLXIRYAFDRGDALQVLAKTEKLSKAGALGKSEMERYQNWAYRRQLA
||||||||||||||| :|||||||||||||||||| ||||| ||||||||||||||||||
orf100a AAAKMNANLTRLVRLQLRYAFDRGDALQVLAKTEKXSKAGAXGKSEMERYQNWAYRRQLX
190 200 210 220 230 240
250 260 270 280 290 300
orf100.pep DAADAAALKTCLKRIPDSLKNGELSVSVAEKYERLGLYADAVKWVKQHYPXNRRPELLEA
|||||||||||||||||||||||||||||||||||||||||||||||||| |||||||||
orf100a DAADAAALKTCLKRIPDSLKNGELSVSVAEKYERLGLYADAVKWVKQHYPHNRRPELLEA
250 260 270 280 290 300
310 320 330 340 350 360
orf100.pep FVESVRFLGEREQQKAIDFADAWLKEQPDNALLLMYLGRLAFGRKLWGKAKGYLEASIAL
|||||||||||:|||||||||||||||||||||| ||||||:||||||||||||||||||
orf100a FVESVRFLGERDQQKAIDFADAWLKEQPDNALLLXYLGRLAYGRKLWGKAKGYLEASIAL
310 320 330 340 350 360
370 380
orf100.pep KPSISARLVLTKVFDEIGEPQKAEAH
||||||||||:||||| ||||||||:
orf100a KPSISARLVLAKVFDETGEPQKAEAQRNLVLASVAEENRPSAETHX
370 380 390 400
The complete length ORF100a nucleotide sequence <SEQ ID 695> is:
1 ATGAAAACGG TAGTCTGGAT TGTCGTCCTG TTTGCCGCCG CNNTCGGGCT
51 GGCATTGGCG TCGGGCATTN ACACCGGCGA CGTGTATATC GTACTCGGAC
101 AGACCATGCT CAGAATCAAC CTGCACGCCT TTGTGTTAGG TTCGCTGATT
151 GCCGTCGTGG TGTGGTATTT CCTGTTCAAA TTCATCATCG GCGTACTCAA
201 TANCCCCGAA AAGATGCAGC GTTTCGGTTC GGCGCGTAAA CGCCGCAAGG
251 CCGCGCTTGC TTTGAACAAG GCGGGTTTGG CGTATTTTGA AGGGCGTTTT
301 GAAAAGGCGG AACTTGAAGC CTCGCGCGTA TTGGGAAACA AAGAGGCGGG
351 GGATAACCGG ACTTTGGCAT TGATGTTGGG CGCACATGCC GCCGGGCAGA
401 TGGAAAACAT CGAGCTGCGC GACCGTTATC TTGCGGAAAT CGCCAAACTG
451 CCGGAAAAGC AGCAGCTTTC CCGTTATCTT TTGTTGGCGG AATCGGCGTT
501 GAACCGGCGC GATTACGAAG CGGCGGAAGC CAATCTTCAT GCGGCGGCGA
551 AGATGAATGC CAACCTTACG CGCCTCGTGC GTCTGCAACT TCGTTACGCT
601 TTCGACAGGG GCGACGCGTT GCAGGTTCTG GCAAAAACCG AAAAANTTTC
651 CAAGGCGGGC GCGTNGGGCA AATCGGAAAT GGAACGGTAT CAAAATTGGG
701 CATACCGCCG CCAGCTGNCG GATGCTGCCG ATGCCGCCGC TTTGAAAACC
751 TGCCTGAAGC GGATTCCCGA CAGCCTCAAA AACGGGGAAT TGAGCGTATC
801 GGTTGCGGAA AAGTACGAAC GTTTGGGACT GTATGCCGAT GCGGTCAAAT
851 GGGTCAAACA GCATTATCCG CACAACCGCC GACCCGAACT TTTGGAAGCN
901 TTTGTCGAAA GCGTGCGCTT TTTGGGCGAA CGCGATCAGC AGAAAGCCAT
951 CGATTTTGCC GATGCTTGGC TGAAAGAACA GCCCGATAAT GCGCTTCTGC
1001 TGANGTATCT CGGTCGGCTC GCCTACGGCC GCAAACTTTG GGGCAAGGCA
1051 AAAGGCTACC TTGAAGCGAG CATTGCATTA AAGCCGAGTA TTTCCGCGCG
1101 TTTGGTTCTG GCAAAGGTTT TTGACGAAAC CGGAGAACCG CAGAAGGCGG
1151 AGGCGCAGCG CAACTTGGTT TTGGCAAGCG TTGCCGAGGA AAACCGNCCT
1201 TCCGCCGAAA CCCATTGA
This encodes a protein having the amino acid sequence <SEQ ID 696>:
1 MKTVVWIVVL FAAAXGLALA SGIXTGDVYI VLGQTMLRIN LHAFVLGSLI
51 AVVVWYFLFK FIIGVLNXPE KMQRFGSARK GRKAALALNK AGLAYFEGRF
101 EKAELEASRV LGNKEAGDNR TLALMLGAHA AGQMENIELR DRYLAEIAKL
151 PEKQQLSRYL LLAESALNRR DYEAAEANLH AAAKMNANLT RLVRLQLRYA
201 FDRGDALQVL AKTEKXSKAG AXGKSEMERY QNWAYRRQLX DAADAAALKT
251 CLKRIPDSLK NGELSVSVAE KYERLGLYAD AVKWVKQHYP HNRRPELLEA
301 FVESVRFLGE RDQQKAIDFA DAWLKEQPDN ALLLXYLGRL AYGRKLWGKA
351 KGYLEASIAL KPSISARLVL AKVFDETGEP QKAEAQRNLV LASVAEENRP
401 SAETH*
ORF100a and ORF100-1 show 95.1% identity over a 406 aa overlap:
10 20 30 40 50 60
orf100a.pep MKTVVWIVVLFAAAXGLALASGIXTGDVYIVLGQTMLRINLHAFVLGSLIAVVVWYFLFK
|||||||||||||| |||||||| ||||||||||||||||||||||||||||||||||||
orf100-1 MKTVVWIVVLFAAAVGLALASGIYTGDVYIVLGQTMLRINLHAFVLGSLIAVVVWYFLFK
10 20 30 40 50 60
70 80 90 100 110 120
orf100a.pep FIIGVLNXPEKMQRFGSARKGRKAALALNKAGLAYFEGRFEKAELEASRVLGNKEAGDNR
||||||| ||||||||||||||||||||||||||||||||||||||||||| ||||||||
orf100-1 FIIGVLNIPEKMQRFGSARKGRKAALALNKAGLAYFEGRFEKAELEASRVLVNKEAGDNR
70 80 90 100 110 120
130 140 150 160 170 180
orf100a.pep TLALMLGAHAAGQMENIELRDRYLAEIAKLPEKQQLSRYLLLAESALNRRDYEAAEANLH
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf100-1 TLALMLGAHAAGQMENIELRDRYLAEIAKLPEKQQLSRYLLLAESALNRRDYEAAEANLH
130 140 150 160 170 180
190 200 210 220 230 240
orf100a.pep AAAKMNANLTRLVRLQLRYAFDRGDALQVLAKTEKXSKAGAXGKSEMERYQNWAYRRQLX
||||||||||||||||||||||||||||||||||| ||||| ||||||||||||||||||
orf100-1 AAAKMNANLTRLVRLQLRYAFDRGDALQVLAKTEKLSKAGALGKSEMERYQNWAYRRQLA
190 200 210 220 230 240
250 260 270 280 290 300
orf100a.pep DAADAAALKTCLKRIPDSLKNGELSVSVAEKYERLGLYADAVKWVKQHYPHNRRPELLEA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf100-1 DAADAAALKTCLKRIPDSLKNGELSVSVAEKYERLGLYADAVKWVKQHYPHNRRPELLEA
250 260 270 280 290 300
310 320 330 340 350 360
orf100a.pep FVESVRFLGERDQQKAIDFADAWLKEQPDNALLLXYLGRLAYGRKLWGKAKGYLEASIAL
|||||||||||:|||||||||||||||||||||| |||||||||||||||||||||||||
orf100-1 FVESVRFLGEREQQKAIDFADAWLKEQPDNALLLMYLGRLAYGRKLWGKAKGYLEASIAL
310 320 330 340 350 360
370 380 390 400
orf100a.pep KPSISARLVLAKVFDETGEPQKAEAQRNLVLASVAEENRPSA-ETHX
|||||||||||||||| |||||||||||||| :|::::| :| | |
orf100-1 KPSISARLVLAKVFDEIGEPQKAEAQRNLVLEAVSDDERHAALEQHSX
370 380 390 400
Homology with a predicted ORF from N. gonorrhoeae
ORF100 shows 93.3% identity over a 386 aa overlap with a predicted ORF (ORF100ng) from N. gonorrhoeae:
orf100.pep MKTVVWIVVLFAAAVGLALASGIYTGDVYIVLGQTMLRINLHAFVLGSLIAVVVWYFLFK 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf100ng MKTVVWIVVLFAAAVGLALASGIYTGDVYIVLGQTMLRINLHAFVLGSLIAVVVWYFLFK 60
orf100.pep FIIGVLNIPEKMQRFGSARKGXKXXLALNKAGLAYFEGRFEKAELEASRVLVNKVGRDNR 120
||||||||||:|:| |||||| | |||||||||||||||||||||||||| || : |||
orf100ng FIIGVLNIPENMRRSGSARKGRKAALALNKAGLAYFEGRFEKAELEASRVLGNKEAGDNR 120
orf100.pep TLALMLXAHAAGQMENIXXRDRYLAEIAKLPEKQQLSRYLLLAESALNRRDYEAAEANLH 180
|||||| |||||||||| |||||||||||||||||||||||||||||||||||||||||
orf100ng TLALMLGAHAAGQMENIELRDRYLAEIAKLPEKQQLSRYLLLAESALNRRDYEAAEANLH 180
orf100.pep AAAKMNANLTRLVRLXIRYAFDRGDALQVLAKTEKLSKAGALGKSEMERYQNWAYRRQLA 240
||||||||||||||| :|||||||||||||||||||||||||||||||||||||||||:|
orf100ng AAAKMNANLTRLVRLQLRYAFDRGDALQVLAKTEKLSKAGALGKSEMERYQNWAYRRQMA 240
orf100.pep DAADAAALKTCLKRIPDSLKNGELSVSVAEKYERLGLYADAVKWVKQHYPXNRRPELLEA 300
|||||||||||||||||||||||||||||||||||||||||||||||||| |||||||||
orf100ng DAADAAALKTCLKRIPDSLKNGELSVSVAEKYERLGLYADAVKWVKQHYPHNRRPELLEA 300
orf100.pep FVESVRFLGEREQQKAIDFADAWLKEQPDNALLLMYLGRLAFGRKLWGKAKGYLEASIAL 360
|||||||||||||||||||||:|||||||||||||||||||:||||||||||||||||||
orf100ng FVESVRFLGEREQQKAIDFADSWLKEQPDNALLLMYLGRLAYGRKLWGKAKGYLEASIAL 360
orf100.pep KPSISARLVLTKVFDEIGEPQKAEAH 386
|||| |||||:||||| :: |||||:
orf100ng KPSIPARLVLAKVFDETAQSQKAEAQRNLVLASVAGENRPSAETR 405
The full-length ORF100ng nucleotide sequence <SEQ ID 697> is:
1 ATGAAAACGG TAGTCTGGAT TGTTGTCCTG TTTGCCGCCG CCGTCGGACT
51 GGCGCTGGCT TCGGGCATTT ACACCGGCGA CGTGTATATC GTACTCGGAC
101 AGACCATGCT CAGAATCAAC CTGCACGCCT TTGTGTTAGG TTCGCTGATT
151 GCCGTCGTGG TGTGGTATTT CCTGTTTAAA TTCATCATCG GCGTACTCAA
201 TATCCCCGAA AATATGCGGC GTTCCGGTTC GGCGCGGAAA GGCCGCAAGG
251 CCGCGCTTGC CTTGAATAAG GCGGGTTTGG CGTATTTCGA AGGGCGTTTT
301 GAAAAGGCGG AACTCGAAGC CTCTCGAGTG TTGGGCAACA AAGAGGCCGG
351 AGACAACCGG ACTTTGGCAT TGATGCTGGG CGCGCACGCG GCAGGACAGA
401 TGGAAAATAT CGAGCTGCGC GACCGTTATC TTGCGGAAAT CGCCAAACTG
451 CCGGAAAAAC AGCAGCTTTC CCGCTATCTT CTGCTGGCGG AATCGGCGTT
501 AAACCGGCGC GATTACGAAG CGGCGGAAGC CAATCTTCAT GCGGCGGCGA
551 AGATGAATGC CAACCTTACG CGCCTCGTGC GTCTGCAACT TCGTTACGCC
601 TTCGATCGGG GCGATGCGTT GCAGGTTCTG GCAAAAaccG AAAAACTTTC
651 CAAGGCGGGC GCGTTGGGCA AATCGGAAAT GGAACGGTAT CAAAATTGGG
701 CATACCGCCG CCAGATGGCG GATGCTGCCG ATGCCGCCGC TTTGAAAACC
751 TGCCTGAAGC GGATTCCCGA CAGCCTCAAA AACGGGGAAT TGagcGTATC
801 GGTTGCGGAA AAGTACGAAC GTTTGGGACT GTATGCCGAT GCGGTCAAAT
851 GGGTCAAACA GCATTATCCG CACAACCGCC GCCCCGAGCT TTTGGAAGCC
901 TTTGTCGAAA GCGTGCGCTT TTTGGGCGAG CGCGAACAGC AGAAAGCCAT
951 CGATTTTGCC GATTCTTGGC TGAAAGAACA GCCCGATAAC GCGCTTCTGC
1001 TGATGTATCT CGGCCGGCTC GCCTACGGCC GCAAACTTTG GGGTAAGGCA
1051 AAAGGCTACC TTGAAGCGAG TATTGCACTG AAGCCGAGTA TTCCGGCGCG
1101 TTTGGTGTTG GCAAAGGTTT TTGACGAAAC CGCACAGTCG CAAAAAGCCG
1151 AAGCACAGCG CAACTTGGTT TTGGCAAGCG TTGCCGGGGA AAACCGCCCT
1201 TCCGCCGAAA CCCGTTGA
This encodes a protein having the amino acid sequence <SEQ ID 698>:
1 MKTVVWIVVL FAAAVGLALA SGIYTGDVYI VLGQTMLRIN LHAFVLGSLI
51 AVVVWYFLFK FIIGVLNIPE NMRRSGSARK GRKAALALNK AGLAYFEGRF
101 EKAELEASRV LGNKEAGDNR TLALMLGAHA AGQMENIELR DRYLAEIAKL
151 PEKQQLSRYL LLAESALNRR DYEAAEANLH AAAKMNANLT RLVRLQLRYA
201 FDRGDALQVL AKTEKLSKAG ALGKSEMERY QNWAYRRQMA DAADAAALKT
251 CLKRIPDSLK NGELSVSVAE KYERLGLYAD AVKWVKQHYP HNRRPELLEA
301 FVESVRFLGE REQQKAIDFA DSWLKEQPDN ALLLMYLGRL AYGRKLWGKA
351 KGYLEASIAL KPSIPARLVL AKVFDETAQS QKAEAQRNLV LASVAGENRP
401 SAETR*
ORF100ng and ORF100-1 show 95.3% identity over a 402 aa overlap:
10 20 30 40 50 60
orf100-1.pep MKTVVWIVVLFAAAVGLALASGIYTGDVYIVLGQTMLRINLHAFVLGSLIAVVVWYFLFK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf100ng MKTVVWIVVLFAAAVGLALASGIYTGDVYIVLGQTMLRINLHAFVLGSLIAVVVWYFLFK
10 20 30 40 50 60
70 80 90 100 110 120
orf100-1.pep FIIGVLNIPEKMQRFGSARKGRKAALALNKAGLAYFEGRFEKAELEASRVLVNKEAGDNR
||||||||||:|:| |||||||||||||||||||||||||||||||||||| ||||||||
orf100ng FIIGVLNIPENMRRSGSARKGRKAALALNKAGLAYFEGRFEKAELEASRVLGNKEAGDNR
70 80 90 100 110 120
130 140 150 160 170 180
orf100-1.pep TLALMLGAHAAGQMENIELRDRYLAEIAKLPEKQQLSRYLLLAESALNRRDYEAAEANLH
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf100ng TLALMLGAHAAGQMENIELRDRYLAEIAKLPEKQQLSRYLLLAESALNRRDYEAAEANLH
130 140 150 160 170 180
190 200 210 220 230 240
orf100-1.pep AAAKMNANLTRLVRLQLRYAFDRGDALQVLAKTEKLSKAGALGKSEMERYQNWAYRRQLA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||:|
orf100ng AAAKMNANLTRLVRLQLRYAFDRGDALQVLAKTEKLSKAGALGKSEMERYQNWAYRRQMA
190 200 210 220 230 240
250 260 270 280 290 300
orf100-1.pep DAADAAALKTCLKRIPDSLKNGELSVSVAEKYERLGLYADAVKWVKQHYPHNRRPELLEA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf100ng DAADAAALKTCLKRIPDSLKNGELSVSVAEKYERLGLYADAVKWVKQHYPHNRRPELLEA
250 260 270 280 290 300
310 320 330 340 350 360
orf100-1.pep FVESVRFLGEREQQKAIDFADAWLKEQPDNALLLMYLGRLAYGRKLWGKAKGYLEASIAL
|||||||||||||||||||||:||||||||||||||||||||||||||||||||||||||
orf100ng FVESVRFLGEREQQKAIDFADSWLKEQPDNALLLMYLGRLAYGRKLWGKAKGYLEASIAL
310 320 330 340 350 360
370 380 390 400
orf100-1.pep KPSISARLVLAKVFDEIGEPQKAEAQRNLVLEAVSDDERHAALEQHSX
|||| ||||||||||| :: ||||||||||| :|: ::| :|
orf100ng KPSIPARLVLAKVFDETAQSQKAEAQRNLVLASVAGENRPSAETRX
370 380 390 400
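Based on this analysis, including the presence of a putative leader sequence, a putative transmembrane domain and an RGD motif, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.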
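The RGD motif mentioned in this analysis can be located in the listed sequences with a simple substring search. The following is a minimal sketch, not the analysis software used for the application; the function name `find_motif` and the 20-residue fragment (residues 191-210 of SEQ ID 696, copied from the listing) are chosen here for illustration only.

```python
def find_motif(protein, motif="RGD"):
    """Return the 1-based start position of every occurrence of motif."""
    positions = []
    start = protein.find(motif)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based
        start = protein.find(motif, start + 1)
    return positions


# Residues 191-210 of ORF100a (SEQ ID 696), copied from the listing.
fragment = "RLVRLQLRYAFDRGDALQVL"
offset = 190  # number of residues that precede the fragment

# Positions in the full-length protein numbering.
hits = [offset + p for p in find_motif(fragment)]
print(hits)
```

Under these assumptions the motif starts at residue 203 of the full-length protein, which agrees with the `FDRGDALQVL` block beginning at position 201 in the listing.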
Example 90
The following DNA sequence, believed to be complete, was identified in N. meningitidis <SEQ ID 699>:
1 ATGATGTTTT CTTGGTTCAA GCTGTTTCAC TTGTTTTTTG TCATTTCGTG
51 GTTTGCAGGG CTGTTTTACC TGCCGAGGAT TTTCGTCAAT ATGGCGATGA
101 TTGATGTGCC GCGCGGCAAT CCCGAGTATG TGCGTCTGTC GGGCATGGCG
151 GTGCGGCTGT ACCGTTTTAT GTCGCCGTTG GGCTTCGGCG CGGTCGTGTT
201 CGGCGCGGCG ATACCGTTTG CCGCCGGCTG GTGGGGCAGC GGCTGGGTAC
251 ACGTCAAACT GTGTTTGGGC TTGATGCTCT TGGCTTACCA GTTGTATTGC
301 GGCGTGCTGC TGCGCCGTTT TCAGGATTAC AGCAATGCTT TTTCACACCG
351 CTGGTACCGC GTGTTCAACG AAATCCCCGT GCTGCTGATG GTTGCCGCGC
401 TGTATsTGGT CGTGTTCAAA CCGTTTTGA
This corresponds to the amino acid sequence <SEQ ID 700; ORF102>:
1 MMFSWFKLFH LFFVISWFAG LFYLPRIFVN MAMIDVPRGN PEYVRLSGMA
51 VRLYRFMSPL GFGAVVFGAA IPFAAGWWGS GWVHVKLCLG LMLLAYQLYC
101 GVLLRRFQDY SNAFSHRWYR VFNEIPVLLM VAALYXVVFK PF*
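The correspondence between the nucleotide listing and the amino acid listing can be checked by translating the DNA in frame. The sketch below assumes the standard genetic code and assumes that any codon containing an ambiguous base (such as the lowercase `s` at position 406 of SEQ ID 699) is reported as `X`, which is consistent with the `X` at residue 136 of SEQ ID 700. The names `CODON_TABLE` and `translate` are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate DNA in frame 1; unknown or ambiguous codons become 'X'."""
    dna = "".join(dna.split()).upper()  # drop spaces used in the listing
    protein = []
    for i in range(0, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


# First 21 nucleotides of SEQ ID 699.
print(translate("ATGATGTTTT CTTGGTTCAA G"))
# Codons 135-137 of SEQ ID 699, including the ambiguous base 's'.
print(translate("TATsTGGTC"))
# Last three codons, ending in the stop codon.
print(translate("CCGTTTTGA"))
```

The three calls give `MMFSWFK`, `YXV` and `PF*`, matching the start, the ambiguous position and the end of SEQ ID 700.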
Further work revealed the complete nucleotide sequence <SEQ ID 701>:
1 ATGATGTTTT CTTGGTTCAA GCTGTTTCAC TTGTTTTTTG TCATTTCGTG
51 GTTTGCAGGG CTGTTTTACC TGCCGAGGAT TTTCGTCAAT ATGGCGATGA
101 TTGATGTGCC GCGCGGCAAT CCCGAGTATG TGCGTCTGTC GGGCATGGCG
151 GTGCGGCTGT ACCGTTTTAT GTCGCCGTTG GGCTTCGGCG CGGTCGTGTT
201 CGGCGCGGCG ATACCGTTTG CCGCCGGCTG GTGGGGCAGC GGCTGGGTAC
251 ACGTCAAACT GTGTTTGGGC TTGATGCTCT TGGCTTACCA GTTGTATTGC
301 GGCGTGCTGC TGCGCCGTTT TCAGGATTAC AGCAATGCTT TTTCACACCG
351 CTGGTACCGC GTGTTCAACG AAATCCCCGT GCTGCTGATG GTTGCCGCGC
401 TGTATCTGGT CGTGTTCAAA CCGTTTTGA
This corresponds to the amino acid sequence <SEQ ID 702; ORF102-1>:
1 MMFSWFKLFH LFFVISWFAG LFYLPRIFVN MAMIDVPRGN PEYVRLSGMA
51 VRLYRFMSPL GFGAVVFGAA IPFAAGWWGS GWVHVKLCLG LMLLAYQLYC
101 GVLLRRFQDY SNAFSHRWYR VFNEIPVLLM VAALYLVVFK PF*
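Computer analysis of this amino acid sequence gave the following results:
Homology with the HP1484 hypothetical integral membrane protein of H. pylori (accession number AE000647)
ORF102 and HP1484 show 33% aa identity over a 143 aa overlap: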
orf102 3 FSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDVPRGNPEYVRLSGMAVRLYRFMSPLGF 62
F W K FH+ VISW A LFYLPR+FV A + V++ +LY F++
HP1484 8 FLWVKAFHVIAVISWMAALFYLPRLFVYHAENAHKKEFVGVVQIQEK--KLYSFIASPAM 65
orf102 63 GAVVFGAAIPFAAG---WWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFSHRWY 119
G + + + GW+H KL L ++LLAY YC +R + + R+Y
HP1484 66 GFTLITGILMLLIEPTLFKSGGWLHAKLALVVLLLAYHFYCKKCMRELEKDPTRRNARFY 125
orf102 120 RVFNEIPXXXXXXXXXXXXFKPF 142
RVFNE P KPF
HP1484 126 RVFNEAPTILMILIVILVVVKPF 148
Homology with a predicted ORF from N. meningitidis (strain A)
ORF102 shows 99.3% identity over a 142 aa overlap with an ORF (ORF102a) from strain A of N. meningitidis:
10 20 30 40 50 60
orf102.pep MMFSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDVPRGNPEYVRLSGMAVRLYRFMSPL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf102a MMFSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDVPRGNPEYVRLSGMAVRLYRFMSPL
10 20 30 40 50 60
70 80 90 100 110 120
orf102.pep GFGAVVFGAAIPFAAGWWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFSHRWYR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf102a GFGAVVFGAAIPFAAGWWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFSHRWYR
70 80 90 100 110 120
130 140
orf102.pep VFNEIPVLLMVAALYXVVFKPFX
||||||||||||||| |||||||
orf102a VFNEIPVLLMVAALYLVVFKPFX
130 140
The full-length ORF102a nucleotide sequence <SEQ ID 703> is:
1 ATGATGTTTT CTTGGTTCAA GCTGTTTCAC TTGTTTTTTG TCATTTCGTG
51 GTTTGCAGGG CTGTTTTACC TGCCGAGGAT TTTCGTCAAT ATGGCGATGA
101 TTGATGTGCC GCGCGGCAAT CCCGAGTATG TGCGTCTGTC GGGCATGGCG
151 GTGCGGCTGT ACCGTTTTAT GTCGCCGTTG GGCTTCGGCG CGGTCGTGTT
201 CGGCGCGGCG ATACCGTTTG CCGCCGGCTG GTGGGGCAGC GGCTGGGTAC
251 ACGTCAAACT GTGTTTGGGC TTGATGCTCT TGGCTTACCA GTTGTATTGC
301 GGCGTGCTGC TGCGCCGTTT TCAGGATTAC AGCAATGCTT TTTCACACCG
351 CTGGTACCGC GTGTTCAACG AAATCCCCGT GCTGCTGATG GTTGCCGCGC
401 TGTATCTGGT CGTGTTCAAA CCGTTTTGA
This encodes a protein having the amino acid sequence <SEQ ID 704>:
1 MMFSWFKLFH LFFVISWFAG LFYLPRIFVN MAMIDVPRGN PEYVRLSGMA
51 VRLYRFMSPL GFGAVVFGAA IPFAAGWWGS GWVHVKLCLG LMLLAYQLYC
101 GVLLRRFQDY SNAFSHRWYR VFNEIPVLLM VAALYLVVFK PF*
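Listings in this format carry a position number at the start of each row and a space after every ten residues. Before a sequence can be compared or translated, those layout elements need to be removed. The following sketch shows one way to do this; the function name `parse_listing` is hypothetical, and the rows are those of SEQ ID 704 as printed above.

```python
def parse_listing(lines):
    """Join a numbered, space-separated sequence listing into one string."""
    residues = []
    for line in lines:
        for token in line.split():
            if token.isdigit():
                continue  # skip the position number at the start of a row
            residues.append(token)
    return "".join(residues)


rows = [
    "1 MMFSWFKLFH LFFVISWFAG LFYLPRIFVN MAMIDVPRGN PEYVRLSGMA",
    "51 VRLYRFMSPL GFGAVVFGAA IPFAAGWWGS GWVHVKLCLG LMLLAYQLYC",
    "101 GVLLRRFQDY SNAFSHRWYR VFNEIPVLLM VAALYLVVFK PF*",
]
orf102a = parse_listing(rows)
# The trailing '*' marks the stop codon and is not a residue.
print(len(orf102a.rstrip("*")))
```

The residue count of 142 agrees with the overlap length quoted for the ORF102 comparisons.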
ORF102a and ORF102-1 show 100% identity over a 142 aa overlap:
10 20 30 40 50 60
orf102a.pep MMFSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDVPRGNPEYVRLSGMAVRLYRFMSPL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf102-1 MMFSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDVPRGNPEYVRLSGMAVRLYRFMSPL
10 20 30 40 50 60
70 80 90 100 110 120
orf102a.pep GFGAVVFGAAIPFAAGWWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFSHRWYR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf102-1 GFGAVVFGAAIPFAAGWWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFSHRWYR
70 80 90 100 110 120
130 140
orf102a.pep VFNEIPVLLMVAALYLVVFKPFX
|||||||||||||||||||||||
orf102-1 VFNEIPVLLMVAALYLVVFKPFX
130 140
Homology with a predicted ORF from N. gonorrhoeae
ORF102 shows 97.9% identity over a 142 aa overlap with a predicted ORF (ORF102ng) from N. gonorrhoeae:
orf102.pep MMFSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDVPRGNPEYVRLSGMAVRLYRFMSPL 60
|||||||||||||||||||||||||||||||||||:||||||||||||||||||||||||
orf102ng MMFSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDAPRGNPEYVRLSGMAVRLYRFMSPL 60
orf102.pep GFGAVVFGAAIPFAAGWWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFSHRWYR 120
|||||||||||||||| |||||||||||||||||||||||||||||||||||||||||||
orf102ng GFGAVVFGAAIPFAAGRWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFSHRWYR 120
orf102.pep VFNEIPVLLMVAALYXVVFKPF 142
||||||||||||||| ||||||
orf102ng VFNEIPVLLMVAALYLVVFKPF 142
The full-length ORF102ng nucleotide sequence <SEQ ID 705> is:
1 ATGATGTTTT CTTGGTTCAA GCTGTTTCAC TTGTTTTTTG TCATTTCGTG
51 GTTTGCAGGG CTGTTTTACC TGCCGAGGAT TTTCGTCAAT ATGGCGATGA
101 TTGATGCGCC GCGCGGCAAT CCCGAGTATG TGCGCCTGTC GGGGATGGCG
151 GTGCGGTTGT ACCGTTTTAT GTCGCCTTTG GGTTTCGGCG CGGTCGTGTT
201 CGGCGCGGCG ATACCGTTTG CCGCcggccg GTGGGGCagc ggctggGTTC
251 ACGTCAAACT GTGTTTGGGC TTGATGCTCT TGGCTTATCA GTTGTATTGC
301 GGCGTGCTGC TGCGCCGTTT TCAGGATTAC AGCAATGCTT TTTCACACCG
351 CTGGTACCGC GTGTTCAAcg aAATCCCCGT GCTGCTGATG GTTGCCGCGC
401 TGTATCTGGT CGTGTTCAAA CCGTTTTGA
This encodes a protein having the amino acid sequence <SEQ ID 706>:
1 MMFSWFKLFH LFFVISWFAG LFYLPRIFVN MAMIDAPRGN PEYVRLSGMA
51 VRLYRFMSPL GFGAVVFGAA IPFAAGRWGS GWVHVKLCLG LMLLAYQLYC
101 GVLLRRFQDY SNAFSHRWYR VFNEIPVLLM VAALYLVVFK PF*
ORF102ng and ORF102-1 show 98.6% identity over a 142 aa overlap:
10 20 30 40 50 60
orf102-1.pep MMFSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDVPRGNPEYVRLSGMAVRLYRFMSPL
|||||||||||||||||||||||||||||||||||:||||||||||||||||||||||||
orf102ng MMFSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDAPRGNPEYVRLSGMAVRLYRFMSPL
10 20 30 40 50 60
70 80 90 100 110 120
orf102-1.pep GFGAVVFGAAIPFAAGWWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFSHRWYR
|||||||||||||||| |||||||||||||||||||||||||||||||||||||||||||
orf102ng GFGAVVFGAAIPFAAGRWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFSHRWYR
70 80 90 100 110 120
130 140
orf102-1.pep VFNEIPVLLMVAALYLVVFKPFX
|||||||||||||||||||||||
orf102ng VFNEIPVLLMVAALYLVVFKPFX
130 140
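The identity figures quoted for these pairwise comparisons are consistent with counting identical aligned positions and dividing by the overlap length. The sketch below illustrates that calculation for the ungapped ORF102ng / ORF102-1 alignment; it is not the alignment program used for the application. The function name `percent_identity` is illustrative, and treating `X` or a gap as a non-match is an assumption.

```python
def percent_identity(a, b):
    """Percent identical positions between two aligned, equal-length strings."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Assumption: an unknown residue 'X' or a gap '-' never counts as a match.
    matches = sum(1 for x, y in zip(a, b) if x == y and x not in "X-")
    return 100.0 * matches / len(a)


# ORF102-1 (SEQ ID 702) and ORF102ng (SEQ ID 706), without the stop symbol.
orf102_1 = (
    "MMFSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDVPRGNPEYVRLSGMAVRLYRFMSPL"
    "GFGAVVFGAAIPFAAGWWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFSHRWYR"
    "VFNEIPVLLMVAALYLVVFKPF"
)
orf102ng = (
    "MMFSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDAPRGNPEYVRLSGMAVRLYRFMSPL"
    "GFGAVVFGAAIPFAAGRWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFSHRWYR"
    "VFNEIPVLLMVAALYLVVFKPF"
)

print(round(percent_identity(orf102_1, orf102ng), 1))
```

The two sequences differ at residues 36 (V/A) and 77 (W/R), so 140 of 142 positions match, which rounds to the 98.6% stated above.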
In addition, ORF102ng shows significant homology with a membrane protein from H. pylori:
gi|2314656 (AE000647) conserved hypothetical integral membrane protein [Helicobacter pylori] Length = 148
Score = 79.2 bits (192), Expect = 1e-14
Identities = 50/147 (34%), Positives = 68/147 (46%), Gaps = 13/147 (8%)
Query: 3 FSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDAPRGNPEYVRLSGMAVRLYRFMSPLGF 62
F W K FH+ VISW A LFYLPR+FV A + V++ +LY F++
Sbjct: 8 FLWVKAFHVIAVISWMAALFYLPRLFVYHAENAHKKEFVGVVQIQEK--KLYSFIASPAM 65
Query: 63 GAVVFGAAIP-------FAAGRWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFS 115
G + + F +G GW+H KL L ++LLAY YC +R + +
Sbjct: 66 GFTLITGILMLLIEPTLFKSG----GWLHAKLALVVLLLAYHFYCKKCMRELEKDPTRRN 121
Query: 116 HRWYRVFNEIPXXXXXXXXXXXXFKPF 142
R+YRVFNE P KPF
Sbjct: 122 ARFYRVFNEAPTILMILIVILVVVKPF 148
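The percentages in the database search summary for ORF102ng (identities 50/147, positives 68/147, gaps 13/147) can be recomputed from the counts. The sketch below assumes that the percentages are truncated to whole numbers rather than rounded, an inference from 13/147 (about 8.8%) being reported as 8%. The function name `summary_percent` is hypothetical.

```python
def summary_percent(count, length):
    """Whole-number percentage, truncated (assumed, not rounded)."""
    return (100 * count) // length


alignment_length = 147
for label, count in [("Identities", 50), ("Positives", 68), ("Gaps", 13)]:
    pct = summary_percent(count, alignment_length)
    print(f"{label} = {count}/{alignment_length} ({pct}%)")
```

With truncation the three values come out as 34%, 46% and 8%, matching the summary line of this search result.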
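Based on this analysis, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 91
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 707>: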
1 ATGGCAAAAA TGATGAAATG GGCGGCTGTT GCGGCGGTCG CGGCGGCAGC
51 GGTTTGGGGC GGATGGTCTT AACTGAAGCC CGAGCCGCAC GTGCTTGATA
101 TTACGGAAAC GGTCAGGCGC GGC //....
//.. ATTTCGTTTA CGATTTTGTC CGAACCGGAT ACGCCGATTA AGGCGAAGCT
51 CGACAGCGTC GACCCCGGGC TGACCACGAT GTCGTCGGGC GGTTACAACA
101 GCAGTACGGA TACGGCTTCC AATGCGGTCT ACTATTATGC CCGTTCGTTT
151 GTGCCGAATC CGGACGGCAA ACTCGCCACG GGGATGACGA CGCAGAATAC
201 GGTTGAAATC GACGGCGTGA AAAATGTGCT GATTATTCCG TCGCTGACCG
251 TGAAAAATCG CGGCGGCAAG GCGTTTGTGC GCGTGTTGGG TGCGGACGGC
301 AAGGCGGCGG AACGCGAAAT CCGGACCGGT ATGAGAGACA GTATGAATAC
351 CGAAGTAAAA AGCGGGTTGA AAGAGGGGGA CAAAGTGGTC ATCTCCGAAA
401 TAACCGCCGC CGAGCAACAG GAAAGCGGCG AACGCGCCCT AGGCGGCCCG
451 CCGCGCCGAT AA
This corresponds to the amino acid sequence <SEQ ID 708; ORF85>:
1 MAKMMKWAAV AAVAAAAVWG GWS.LKPEPH VLDITETVRR G.........
51 .......... .......... .......... .......... ..........
101 .......... .......... .......... .......... ..........
151 .......... .......... .......... .......... ..........
201 .......... .......... .......... .........I SFTILSEPDT
251 PIKAKLDSVD PGLTTMSSGG YNSSTDTASN AVYYYARSFV PNPDGKLATG
301 MTTQNTVEID GVKNVLIIPS LTVKNRGGKA FVRVLGADGK AAEREIRTGM
351 RDSMNTEVKS GLKEGDKVVI SEITAAEQQE SGERALGGPP RR*
Further work revealed the following partial nucleotide sequence <SEQ ID 709>:
1 ..GTATCGGTCG GCGCGCAGGC ATCGGGGCAG ATTAAGATAC TTTATGTCAA
51 ACTCGGGCAA CAGGTTAAAA AGGGCGATTT GATTGCGGAA ATCAATTCGA
101 CCTCGCAGAC CAATACGCTC AATACGGAAA AATCCAAGTT GGAAACGTAT
151 CAGGCGAAGC TGGTGTCGGC ACAGATTGCA TTGGGCAGCG CGGAGAAGAA
201 ATATAAGCGT CAGGCGGCGT TATGGAAGGA AAACGCGACT TCCAAAGAGG
251 ATTTGGAAAG CGCGCAGGAT GCGTTTGCCG CCGCCAAAGC CAATGTTGCC
301 GAGCTGAAGG CTTTAATCAG ACAGAGCAAA ATTTCCATCA ATACCGCCGA
351 GTCGGAATTG GGCTACACGC GCATTACCGC AACGATGGAC GGCACGGTGG
401 TGGCGATTCT CGTGGAAGAG GGGCAGACTG TGAACGCGGC GCAGTCTACG
451 CCGACGATTG TCCAATTGGC GAATCTGGAT ATGATGTTGA ACAAAATGCA
501 GATTGCCGAG GGCGATATTA CCAAGGTGAA GGCGGGGCAG GATATTTCGT
551 TTACGATTTT GTCCGAACCG GATACGCCGA TTAAGGCGAA GCTCGACAGC
601 GTCGACCCCG GGCTGACCAC GATGTCGTCG GGCGGTTACA ACAGCAGTAC
651 GGATACGGCT TCCAATGCGG TCTACTATTA TGCCCGTTCG TTTGTGCCGA
701 ATCCGGACGG CAAACTCGCC ACGGGGATGA CGACGCAGAA TACGGTTGAA
751 ATCGACGGCG TGAAAAATGT GCTGATTATT CCGTCGCTGA CCGTGAAAAA
801 TCGCGGCGGC AAGGCGTTTG TGCGCGTGTT GGGTGCGGAC GGCAAGGCGG
851 CGGAACGCGA AATCCGGACC GGTATGAGAG ACAGTATGAA TACCGAAGTA
901 AAAAGCGGGT TGAAAGAGGG GGACAAAGTG GTCATCTCCG AAATAACCGC
951 CGCCGAGCAA CAGGAAAGCG GCGAACGCGC CCTAGGCGGC CCGCCGCGCC
1001 GATAA
This corresponds to the amino acid sequence <SEQ ID 710; ORF85-1>:
1 ..VSVGAQASGQ IKILYVKLGQ QVKKGDLIAE INSTSQTNTL NTEKSKLETY
51 QAKLVSAQIA LGSAEKKYKR QAALWKENAT SKEDLESAQD AFAAAKANVA
101 ELKALIRQSK ISINTAESEL GYTRITATMD GTVVAILVEE GQTVNAAQST
151 PTIVQLANLD MMLNKMQIAE GDITKVKAGQ DISFTILSEP DTPIKAKLDS
201 VDPGLTTMSS GGYNSSTDTA SNAVYYYARS FVPNPDGKLA TGMTTQNTVE
251 IDGVKNVLII PSLTVKNRGG KAFVRVLGAD GKAAEREIRT GMRDSMNTEV
301 KSGLKEGDKV VISEITAAEQ QESGERALGG PPRR*
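Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF85 shows 87.8% identity over a 41 aa overlap and 99.3% identity over a 153 aa overlap with an ORF (ORF85a) from strain A of N. meningitidis: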
10 20 30 40
orf85.pep MAKMMKWAAVAAVAAAAVWGGWS-LKPEPHVLDITETVRRG
||||||||||||||||||||||| |||||:: ||||||||
orf85a MAKMMKWAAVAAVAAAAVWGGWSYLKPEPQAAYITETVRRGDISRTVSATGEISPSNLVS
10 20 30 40 50 60
//
80 90 100
orf85.pep ..............................ISFTILSEPDTPIKAKLDSVDPGLTTMSSG
||||||||||||||||||||||||||||||
orf85a TIVQLANLDMMLNKMQIAEGDITKVKAGQDISFTILSEPDTPIKAKLDSVDPGLTTMSSG
210 220 230 240 250 260
110 120 130 140 150 160
orf85.pep GYNSSTDTASNAVYYYARSFVPNPDGKLATGMTTQNTVEIDGVKNVLIIPSLTVKNRGGK
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||:
orf85a GYNSSTDTASNAVYYYARSFVPNPDGKLATGMTTQNTVEIDGVKNVLIIPSLTVKNRGGR
270 280 290 300 310 320
170 180 190 200 210 220
orf85.pep AFVRVLGADGKAAEREIRTGMRDSMNTEVKSGLKEGDKVVISEITAAEQQESGERALGGP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf85a AFVRVLGADGKAAEREIRTGMRDSMNTEVKSGLKEGDKVVISEITAAEQQESGERALGGP
330 340 350 360 370 380
230
orf85.pep PRRX
||||
orf85a PRRX
390
The full-length ORF85a nucleotide sequence <SEQ ID 711> is:
1 ATGGCAAAAA TGATGAAATG GGCGGCTGTT GCGGCGGTCG CGGCGGCAGC
51 GGTTTGGGGC GGATGGTCTT ATCTGAAGCC CGACCCGCAG GCTGCTTATA
101 TTACGGAAAC GGTCAGGCGC GGCGACATCA GCCGGACGGT TTCTGCAACA
151 GGGGAGATTT CGCCGTCCAA CCTGGTATCG GTCGGCGCGC AGGCATCGGG
201 GCAGATTAAG AAACTTTATG TCAAACTCGG GCAACAGGTT AAAAAGGGCG
251 ATTTGATTGC GGAAATCAAT TCGACCTCGC AGACCAATAC GCTCAATACG
301 GAAAAATCCA AATTGGAAAC GTATCAGGCG AAGCTGGTGT CGGCACAGAT
351 TGCATTGGGC AGCGCGGAGA AGAAATATAA GCGTCAGGCG GCGTTGTGGA
401 AGGATGATGC GACCGCTAAA GAAGATTTGG AAAGCGCACA GGATGCGCTT
451 GCCGCCGCCA AAGCCAATGT TGCCGAGCTG AAGGCTCTAA TCAGACAGAG
501 CAAAATTTCC ATCAATACCG CCGAGTCGGA ATTGGGCTAC ACGCGCATTA
551 CCGCAACGAT GGACGGCACG GTGGTGGCGA TTCTCGTGGA AGAGGGGCAG
601 ACTGTGAACG CGGCGCAGTC TACGCCGACG ATTGTCCAAT TGGCGAATCT
651 GGATATGATG TTGAACAAAA TGCAGATTGC CGAGGGCGAT ATTACCAAGG
701 TGAAGGCGGG GCAGGATATT TCGTTTACGA TTTTGTCCGA ACCGGATACG
751 CCGATTAAGG CGAAGCTCGA CAGCGTCGAC CCCGGGCTGA CCACGATGTC
801 GTCGGGCGGC TACAACAGCA GTACGGATAC GGCTTCCAAT GCGGTCTACT
851 ATTATGCCCG TTCGTTTGTG CCGAATCCGG ACGGCAAACT CGCCACGGGG
901 ATGACGACGC AGAATACGGT TGAAATCGAC GGTGTGAAAA ATGTGCTGAT
951 TATTCCGTCG CTGACCGTGA AAAATCGCGG CGGCAGGGCG TTTGTGCGCG
1001 TGTTGGGTGC AGACGGCAAG GCGGCGGAAC GCGAAATCCG GACCGGTATG
1051 AGAGACAGTA TGAATACCGA AGTAAAAAGC GGGTTGAAAG AGGGGGACAA
1101 AGTGGTCATC TCCGAAATAA CCGCCGCCGA GCAGCAGGAA AGCGGCGAAC
1151 GCGCCCTAGG CGGCCCGCCG CGCCGATAA
This encodes a protein having the amino acid sequence <SEQ ID 712>:
1 MAKMMKWAAV AAVAAAAVWG GWSYLKPEPQ AAYITETVRR GDISRTVSAT
51 GEISPSNLVS VGAQASGQIK KLYVKLGQQV KKGDLIAEIN STSQTNTLNT
101 EKSKLETYQA KLVSAQIALG SAEKKYKRQA ALWKDDATAK EDLESAQDAL
151 AAAKANVAEL KALIRQSKIS INTAESELGY TRITATMDGT VVAILVEEGQ
201 TVNAAQSTPT IVQLANLDMM LNKMQIAEGD ITKVKAGQDI SFTILSEPDT
251 PIKAKLDSVD PGLTTMSSGG YNSSTDTASN AVYYYARSFV PNPDGKLATG
301 MTTQNTVEID GVKNVLIIPS LTVKNRGGRA FVRVLGADGK AAEREIRTGM
351 RDSMNTEVKS GLKEGDKVVI SEITAAEQQE SGERALGGPP RR*
ORF85a and ORF85-1 show 98.2% identity over a 334 aa overlap:
30 40 50 60 70 80
orf85a.pep PQAAYITETVRRGDISRTVSATGEISPSNLVSVGAQASGQIKKLYVKLGQQVKKGDLIAE
|||||||||||| |||||||||||||||||
orf85-1 VSVGAQASGQIKILYVKLGQQVKKGDLIAE
10 20 30
90 100 110 120 130 140
orf85a.pep INSTSQTNTLNTEKSKLETYQAKLVSAQIALGSAEKKYKRQAALWKDDATAKEDLESAQD
||||||||||||||||||||||||||||||||||||||||||||||::||:|||||||||
orf85-1 INSTSQTNTLNTEKSKLETYQAKLVSAQIALGSAEKKYKRQAALWKENATSKEDLESAQD
40 50 60 70 80 90
150 160 170 180 190 200
orf85a.pep ALAAAKANVAELKALIRQSKISINTAESELGYTRITATMDGTVVAILVEEGQTVNAAQST
|:||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf85-1 AFAAAKANVAELKALIRQSKISINTAESELGYTRITATMDGTVVAILVEEGQTVNAAQST
100 110 120 130 140 150
210 220 230 240 250 260
orf85a.pep PTIVQLANLDMMLNKMQIAEGDITKVKAGQDISFTILSEPDTPIKAKLDSVDPGLTTMSS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf85-1 PTIVQLANLDMMLNKMQIAEGDITKVKAGQDISFTILSEPDTPIKAKLDSVDPGLTTMSS
160 170 180 190 200 210
270 280 290 300 310 320
orf85a.pep GGYNSSTDTASNAVYYYARSFVPNPDGKLATGMTTQNTVEIDGVKNVLIIPSLTVKNRGG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf85-1 GGYNSSTDTASNAVYYYARSFVPNPDGKLATGMTTQNTVEIDGVKNVLIIPSLTVKNRGG
220 230 240 250 260 270
330 340 350 360 370 380
orf85a.pep RAFVRVLGADGKAAEREIRTGMRDSMNTEVKSGLKEGDKVVISEITAAEQQESGERALGG
:|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf85-1 KAFVRVLGADGKAAEREIRTGMRDSMNTEVKSGLKEGDKVVISEITAAEQQESGERALGG
280 290 300 310 320 330
390
orf85a.pep PPRRX
|||||
orf85-1 PPRRX
Figure 19D shows plots of hydrophilicity, antigenic index and AMPHI regions for ORF85a.
Homology with a predicted ORF from N. gonorrhoeae
ORF85 shows a high degree of identity with a predicted ORF (ORF85ng) from N. gonorrhoeae:
ORF85 1 MAKMMKWAAVAAVAAAAVWGGWS.LKPEPHVLDITETVRRG......... 40
||||||||||||||||||||||| |||||:: |||:||||
ORF85ng 1 MAKMMKWAAVAAVAAAAVWGGWSYLKPEPQAAYITEAVRRGDISRTVSAT 50
ORF85 .......................................ISFTILSEPDT 250
|||||||||||
ORF85ng 201 TVNAAQSTPTIVQLANLDMMLNKMQIAEGDITKVKAGQDISFTILSEPDT 250
ORF85 251 PIKAKLDSVDPGLTTMSSGGYNSSTDTASNAVYYYARSFVPNPDGKLATG 300
||||||||||||||||||||||||||||||||||||||||||||||||||
ORF85ng 251 PIKAKLDSVDPGLTTMSSGGYNSSTDTASNAVYYYARSFVPNPDGKLATG 300
ORF85 301 MTTQNTVEIDGVKNVLIIPSLTVKNRGGKAFVRVLGADGKAAEREIRTGM 350
||||||||||||||||:|||||||||||||||||||||||| ||||||||
ORF85ng 301 MTTQNTVEIDGVKNVLLIPSLTVKNRGGKAFVRVLGADGKAVEREIRTGM 350
ORF85 351 RDSMNTEVKSGLKEGDKVVISEITAAEQQESGERALGGPPRR 393
:|||||||||||||||||||||||||||||||||||||||||
ORF85ng 351 KDSMNTEVKSGLKEGDKVVISEITAAEQQESGERALGGPPRR 393
The full-length ORF85ng nucleotide sequence <SEQ ID 713> is:
1 ATGGCAAAAA TGATGAAATG GGCGGCTGTT GCGGCGGTCG CGGCGGCaac
51 GGTTTGGGGC GGATGGTCTT ATCTGAAGCC CGAACCGCAG GCTGCTTATA
101 TTACGGAaac ggTCAGGCGC GGCGATATCA GCCGGACGGT TTCCGCGACG
151 GgcgAGATTT CGCCGTCCAA CCTGGTATCG GTCGGCGCGC AGGCTTCGGG
201 GCAGATTAAA AAGCTTTATG TCAAACTCGG GCAACAGGTC AAAAAGGGCG
251 ATTTGATTGC GGAAATCAAT TCGACCACGC AGACCAACAC GATCGATATG
301 GAAAAATCCA AATTGGAAAC GTATCAGGCG AAGCTGGTGT CGGCACAGAT
351 TGCATTGGGC AGCGCGGAGA AGAAATATAA GCGTCAGGCG GCGTTGTGGA
401 AGGATGATGC GACCTCTAAA GAAGATTTGG AAAGCGCGCA GGATGCGCTT
451 GCCGCCGCCA AAGCCAATGT TGCCGAGTTG AAGGCTTTAA TCAGACAGAG
501 CAAAATTTCC ATCAATACCG CCGAGTCGGA TTTGGGCTAC ACGCGCATTA
551 CCGCGACGAT GGACGGCACG GTGGTGGCGA TTCCCGTGGA AGAGGGGCAG
601 ACTGTGAACG CGGCGCAGTC TACGCCGACG ATTGTCCAAT TGGCGAATCT
651 GGATATGATG TTGAACAAAA TGCAGATTGC CGAGGGCGAT ATTACCAAGG
701 TGAAGGCGGG GCAGGATATT TCGTTTACGA TTTTGTCCGA ACCGGATACG
751 CCGATTAAGG CGAAGCTCGA CAGCGTCGAC CCCGGGCTGA CCACGATGTC
801 GTCGGGCGGC TACAACAGCA GTACGGATAC GGCTTCCAAT GCGGTCTATT
851 ATTATGCCCG TTCGTTTGTG CCGAATCCGG ACGGCAAACT CGCCACGGGG
901 ATGACGACGC AGAATACGGT TGAAATCGAC GGTGTGAAAA ATGTGTTGCT
951 TATTCCGTCG CTGACCGTGA AAAATCGCGG CGGCAAGGCG TTCGTACGCG
1001 TGTTGGGTGC GGACGGCAAG GCAGTGGAAC GCGAAATCCG GACCGGTATG
1051 AAAGACAGTA TGAATACCGA AGTGAAAAGC GGGTTGAAAG AGGGGGACAA
1101 AGTGGTCATC TCCGAAATAA CCGCCGCCGA GCAGCAGGAA AGCGGCGAAC
1151 GCGCCCTAGG CGGCCCGCCG CGCCGATAA
This encodes a protein having the amino acid sequence <SEQ ID 714>:
1 MAKMMKWAAV AAVAAAAVWG GWSYLKPEPQ AAYITEAVRR GDISRTVSAT
51 GEISPSNLVS VGAQASGQIK KLYVKLGQQV KKGDLIAEIN STTQTNTIDM
101 EKSKLETYQA KLVSAQIALG SAEKKYKRQA ALWKDDATSK EDLESAQDAL
151 AAAKANVAEL KALIRQSKIS INTAESDLGY TRITATMDGT VVAIPVEEGQ
201 TVNAAQSTPT IVQLANLDMM LNKMQIAEGD ITKVKAGQDI SFTILSEPDT
251 PIKAKLDSVD PGLTTMSSGG YNSSTDTASN AVYYYARSFV PNPDGKLATG
301 MTTQNTVEID GVKNVLLIPS LTVKNRGGKA FVRVLGADGK AVEREIRTGM
351 KDSMNTEVKS GLKEGDKVVI SEITAAEQQE SGERALGGPP RR*
ORF85ng and ORF85-1 show 96.1% identity over a 334 aa overlap:
30 40 50 60 70 80
orf85ng PQAAYITETVRRGDISRTVSATGEISPSNLVSVGAQASGQIKKLYVKLGQQVKKGDLIAE
|||||||||||| |||||||||||||||||
orf85-1 VSVGAQASGQIKILYVKLGQQVKKGDLIAE
10 20 30
90 100 110 120 130 140
orf85ng INSTTQTNTIDMEKSKLETYQAKLVSAQIALGSAEKKYKRQAALWKDDATSKEDLESAQD
||||:||||:: ||||||||||||||||||||||||||||||||||::||||||||||||
orf85-1 INSTSQTNTLNTEKSKLETYQAKLVSAQIALGSAEKKYKRQAALWKENATSKEDLESAQD
40 50 60 70 80 90
150 160 170 180 190 200
orf85ng ALAAAKANVAELKALIRQSKISINTAESDLGYTRITATMDGTVVAIPVEEGQTVNAAQST
|:||||||||||||||||||||||||||:||||||||||||||||| |||||||||||||
orf85-1 AFAAAKANVAELKALIRQSKISINTAESELGYTRITATMDGTVVAILVEEGQTVNAAQST
100 110 120 130 140 150
210 220 230 240 250 260
orf85ng PTIVQLANLDMMLNKMQIAEGDITKVKAGQDISFTILSEPDTPIKAKLDSVDPGLTTMSS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf85-1 PTIVQLANLDMMLNKMQIAEGDITKVKAGQDISFTILSEPDTPIKAKLDSVDPGLTTMSS
160 170 180 190 200 210
270 280 290 300 310 320
orf85ng GGYNSSTDTASNAVYYYARSFVPNPDGKLATGMTTQNTVEIDGVKNVLLIPSLTVKNRGG
||||||||||||||||||||||||||||||||||||||||||||||||:|||||||||||
orf85-1 GGYNSSTDTASNAVYYYARSFVPNPDGKLATGMTTQNTVEIDGVKNVLIIPSLTVKNRGG
220 230 240 250 260 270
330 340 350 360 370 380
orf85ng KAFVRVLGADGKAVEREIRTGMKDSMNTEVKSGLKEGDKVVISEITAAEQQESGERALGG
|||||||||||||:||||||||:|||||||||||||||||||||||||||||||||||||
orf85-1 KAFVRVLGADGKAAEREIRTGMRDSMNTEVKSGLKEGDKVVISEITAAEQQESGERALGG
280 290 300 310 320 330
390
orf85ng PPRRX
|||||
orf85-1 PPRRX
In addition, ORF85ng shows significant homology with a membrane fusion protein from E. coli:
gi|1787104 (AE000189) o380; 27% identical (27 gaps) to 332 residues of membrane fusion protein precursor, MTRC_NEIGO SW:P43505 (412 aa) [Escherichia coli] Length = 380
Score = 193 bits (485), Expect = 2e-48
Identities = 120/345 (34%), Positives = 182/345 (51%), Gaps = 13/345 (3%)
Query: 29 PQAAYITETVRRGDISRTVSATGEISPSNLVSVGAQASGQIKKLYVKLGQQVKKGDLIAE 88
P Y T VR GD+ ++V ATG++ V VGAQ SGQ+K L V +G +VKK L+
Sbjct: 41 PVPTYQTLIVRPGDLQQSVLATGKLDALRKVDVGAQVSGQLKTLSVAIGDKVKKDQLLGV 100
Query: 89 INSTTQTNTIDMEKSKLETYQAKLVSAQIALGSAEKKYKRQAALWKDDATSKEXXXXXXX 148
I+ N I ++ L +A+ A+ L A Y RQ L + A S++
Sbjct: 101 IDPEQAENQIKEVEATLMELRAQRQQAEAELKLARVTYSRQQRLAQTKAVSQQDLDTAAT 160
Query: 149 XXXXXXXXXXXXXXXIRQSKISINTAESDLGYTRITATMDGTVVAIPVEEGQTVNAAQST 208
I++++ S++TA+++L YTRI A M G V I +GQTV AAQ
Sbjct: 161 EMAVKQAQIGTIDAQIKRNQASLDTAKTNLDYTRIVAPMAGEVTQITTLQGQTVIAAQQA 220
Query: 209 PTIVQLANLDMMLNKMQIAEGDITKVKAGQDISFTILSEPDTPIKAKLDSVDPGLTTMSS 268
P I+ LA++ ML K Q++E D+ +K GQ FT+L +P T + ++ V P
Sbjct: 221 PNILTLADMSAMLVKAQVSEADVIHLKPGQKAWFTVLGDPLTRYEGQIKDVLP------- 273
Query: 269 GGYNSSTDTASNAVYYYARSFVPNPDGKLATGMTTQNTVEIDGVKNVLLIPSLTVKNRGG 328
+ + ++A++YYAR VPNP+G L MT Q +++ VKNVL IP + + G
Sbjct: 274 -----TPEKVNDAIFYYARFEVPNPNGLLRLDMTAQVHIQLTDVKNVLTIPLSALGDPVG 328
Query: 329 KAFVRV-LGADGKAVEREIRTGMKDSMNTEVKSGLKEGDKVVISE 372
+V L +G+ ERE+ G ++ + E+ GL+ GD+VVI E
Sbjct: 329 DNRYKVKLLRNGETREREVTIGARNDTDVEIVKGLEAGDEVVIGE 373
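Based on this analysis, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
ORF85-1 (40.4 kDa) was cloned into the pGex vector and expressed in E. coli, as described above. The products of protein expression and purification were analysed by SDS-PAGE. Figure 19A shows the results of affinity purification of the GST-fusion protein. Purified GST-fusion protein was used to immunise mice, and the mouse sera were used for Western blot (Figure 19B), FACS analysis (Figure 19C) and ELISA (positive result). These experiments confirm that ORF85-1 is a surface-exposed protein and that it is a useful immunogen.
Example 92
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 715>: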
1 ..ATTCCCGCCA CGATGACATT TGAACGCAGC GGCAATGCTT ACAAAATCGT
51 TTCGACGATT AAAGTGCCGC TATACAATAT CCGTTTCGAG TCCGGCGGTA
101 CGGTTGTCGG CAATACCCTG CACCCTACCT ACTATAGAGA CATACGCAGG
151 GGCAAACTGT ATGCGGAAgc CAAATTCGCC GACgGcAGCG TAACTTACGG
201 CAAAGCGGGC GAGAGCAAAA CCGAGCAAAG CCCCAAGGCT ATGGATTTGT
251 TCACGCTTGC CTGGCAGTTG GCGGCAAATG ACGCGAAACT CCCCCCGGGG
301 CTGAAAATCA CCAACGGCAA AAAACTTTAT TCCGTCGGCG GTTTGAATAA
351 GGCGGGTACA GGAAAATACA GCATAGGCGG CGTGGAAACC GAAGTCGTCA
401 AATATCGGGT GCGGCGCGGC GACGATGCGG TAATGTATTT cTTCGCACCG
451 TCCCTGAACA ATATTCCGGC ACAAATCGGC TATACCGACG ACGGCAAAAC
501 CTATACGCTG AAACTCAAAT CGGTGCAGAT CAACGGCCAG GCAGCCAAAC
551 CGTAA
This corresponds to the amino acid sequence <SEQ ID 716; ORF120>:
1 ..IPATMTFERS GNAYKIVSTI KVPLYNIRFE SGGTVVGNTL HPTYYRDIRR
51 GKLYAEAKFA DGSVTYGKAG ESKTEQSPKA MDLFTLAWQL AANDAKLPPG
101 LKITNGKKLY SVGGLNKAGT GKYSIGGVET EVVKYRVRRG DDAVMYFFAP
151 SLNNIPAQIG YTDDGKTYTL KLKSVQINGQ AAKP*
Further work revealed the complete nucleotide sequence <SEQ ID 717>:
1 ATGATGAAGA CTTTTAAAAA TATATTTTCC GCCGCCATTT TGTCCGCCGC
51 CCTGCCGTGC GCGTATGCGG CAGGGCTGCC CCAATCCGCC GTGCTGCACT
101 ATTCCGGCAG CTACGGCATT CCCGCCACGA TGACATTTGA ACGCAGCGGC
151 AATGCTTACA AAATCGTTTC GACGATTAAA GTGCCGCTAT ACAATATCCG
201 TTTCGAGTCC GGCGGTACGG TTGTCGGCAA TACCCTGCAC CCTACCTACT
251 ATAGAGACAT ACGCAGGGGC AAACTGTATG CGGAAGCCAA ATTCGCCGAC
301 GGCAGCGTAA CTTACGGCAA AGCGGGCGAG AGCAAAACCG AGCAAAGCCC
351 CAAGGCTATG GATTTGTTCA CGCTTGCCTG GCAGTTGGCG GCAAATGACG
401 CGAAACTCCC CCCGGGGCTG AAAATCACCA ACGGCAAAAA ACTTTATTCC
451 GTCGGCGGTT TGAATAAGGC GGGTACAGGA AAATACAGCA TAGGCGGCGT
501 GGAAACCGAA GTCGTCAAAT ATCGGGTGCG GCGCGGCGAC GATGCGGTAA
551 TGTATTTCTT CGCACCGTCC CTGAACAATA TTCCGGCACA AATCGGCTAT
601 ACCGACGACG GCAAAACCTA TACGCTGAAA CTCAAATCGG TGCAGATCAA
651 CGGCCAGGCA GCCAAACCGT AA
This corresponds to the amino acid sequence <SEQ ID 718; ORF120-1>:
1 MMKTFKNIFS AAILSAALPC AYAAGLPQSA VLHYSGSYGI PATMTFERSG
51 NAYKIVSTIK VPLYNIRFES GGTVVGNTLH PTYYRDIRRG KLYAEAKFAD
101 GSVTYGKAGE SKTEQSPKAM DLFTLAWQLA ANDAKLPPGL KITNGKKLYS
151 VGGLNKAGTG KYSIGGVETE VVKYRVRRGD DAVMYFFAPS LNNIPAQIGY
201 TDDGKTYTLK LKSVQINGQA AKP*
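Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF120 shows 92.4% identity over a 184 aa overlap with an ORF (ORF120a) from strain A of N. meningitidis: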
10 20 30
orf120.pep IPATMTFERSGNAYKIVSTIKVPLYNIRFE
|||| : || ||||||||||||||||
orf120a SAAILSAALPCAYAAGLPXSAVLHYSGSYGIPATXXXXXXXNAXKIVSTIKVPLYNIRFE
10 20 30 40 50 60
40 50 60 70 80 90
orf120.pep SGGTVVGNTLHPTYYRDIRRGKLYAEAKFADGSVTYGKAGESKTEQSPKAMDLFTLAWQL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf120a SGGTVVGNTLHPTYYRDIRRGKLYAEAKFADGSVTYGKAXXXXXXQSPKAMDLFTLAWQL
70 80 90 100 110 120
100 110 120 130 140 150
orf120.pep AANDAKLPPGLKITNGKKLYSVGGLNKAGTGKYSIGGVETEVVKYRVRRGDDAVMYFFAP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf120a AANDAKLPPGLKITNGKKLYSVGGLNKAGTGKYSIGGVETEVVKYRVRRGDDAVMYFFAP
130 140 150 160 170 180
160 170 180
orf120.pep SLNNIPAQIGYTDDGKTYTLKLKSVQINGQAAKPX
|||||||||||||||||||||||||||||||||||
orf120a SLNNIPAQIGYTDDGKTYTLKLKSVQINGQAAKPX
190 200 210 220
The full-length ORF120a nucleotide sequence <SEQ ID 719> is:
1 ATGATGAAGA CTTTTAAAAA TATATTTTCC GCCGCCATTT TGTCCGCCGC
51 CCTGCCGTGC GCGTATGCGG CAGGGCTGCC CNAATCCGCC GTGCTGCACT
101 ATTCCGGCAG CTACGGCATT CCCGCCACNA NNANNTNNGN ACNNNGNGNC
151 AATGCTTNCA AAATCGTTTC GACGATTAAA GTGCCGCTAT ACAATATCCG
201 TTTCGAGTCC GGCGGTACGG TTGTCGGCAA TACCCTGCAC CCTACCTACT
251 ATAGAGACAT ACGCAGGGGC AAACTGTATG CGGAAGCCAA ATTCGCCGAC
301 GGCAGCGTAA CCTACGGCAA AGCGGNNNNN ANCNNNNNNG NGCAAAGCCC
351 CAAGGCTATG GATTTGTTCA CGCTTGCNTG GCAGTTGGCG GCAAATGACG
401 CGAAACTCCC CCCGGGGCTG AAAATCACCA ACGGCAAAAA ACTTTATTCC
451 GTCGGCGGTT TGAATAAGGC GGGTACAGGA AAATACAGCA TAGGCGGCGT
501 GGAAACCGAA GTCGTCAAAT ATCGGGTGCG GCGCGGCGAC GATGCGGTAA
551 TGTATTTCTT CGCACCGTCC CTGAACAATA TTCCGGCACA AATCGGCTAT
601 ACCGACGACG GCAAAACCTA TACGCTGAAA CTCAAATCGG TGCAGATCAA
651 CGGCCAGGCA GCCAAACCGT AA
This encodes a protein having the amino acid sequence <SEQ ID 720>:
1 MMKTFKNIFS AAILSAALPC AYAAGLPXSA VLHYSGSYGI PATXXXXXXX
51 NAXKIVSTIK VPLYNIRFES GGTVVGNTLH PTYYRDIRRG KLYAEAKFAD
101 GSVTYGKAXX XXXXQSPKAM DLFTLAWQLA ANDAKLPPGL KITNGKKLYS
151 VGGLNKAGTG KYSIGGVETE VVKYRVRRGD DAVMYFFAPS LNNIPAQIGY
201 TDDGKTYTLK LKSVQINGQA AKP*
ORF120a and ORF120-1 show 93.3% identity in a 223 aa overlap:
10 20 30 40 50 60
orf120a.pep MMKTFKNIFSAAILSAALPCAYAAGLPXSAVLHYSGSYGIPATXXXXXXXNAXKIVSTIK
||||||||||||||||||||||||||| ||||||||||||||| : || |||||||
orf120-1 MMKTFKNIFSAAILSAALPCAYAAGLPQSAVLHYSGSYGIPATMTFERSGNAYKIVSTIK
10 20 30 40 50 60
70 80 90 100 110 120
orf120a.pep VPLYNIRFESGGTVVGNTLHPTYYRDIRRGKLYAEAKFADGSVTYGKAXXXXXXQSPKAM
|||||||||||||||||||||||||||||||||||||||||||||||| : ||||||
orf120-1 VPLYNIRFESGGTVVGNTLHPTYYRDIRRGKLYAEAKFADGSVTYGKAGESKTEQSPKAM
70 80 90 100 110 120
130 140 150 160 170 180
orf120a.pep DLFTLAWQLAANDAKLPPGLKITNGKKLYSVGGLNKAGTGKYSIGGVETEVVKYRVRRGD
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf120-1 DLFTLAWQLAANDAKLPPGLKITNGKKLYSVGGLNKAGTGKYSIGGVETEVVKYRVRRGD
130 140 150 160 170 180
190 200 210 220
orf120a.pep DAVMYFFAPSLNNIPAQIGYTDDGKTYTLKLKSVQINGQAAKPX
||||||||||||||||||||||||||||||||||||||||||||
orf120-1 DAVMYFFAPSLNNIPAQIGYTDDGKTYTLKLKSVQINGQAAKPX
190 200 210 220
Homology with a predicted ORF from N. gonorrhoeae
ORF120 shows 97.8% identity over a 184 aa overlap with a predicted ORF (ORF120ng) from N. gonorrhoeae:
orf120.pep IPATMTFERSGNAYKIVSTIKVPLYNIRFE 30
||||||||||||||||||||||||||||||
orf120ng SAAILSAALPCAYAARLPQSAVLHYSGSYGIPATMTFERSGNAYKIVSTIKVPLYNIRFE 69
orf120.pep SGGTVVGNTLHPTYYRDIRRGKLYAEAKFADGSVTYGKAGESKTEQSPKAMDLFTLAWQL 90
||||||||||||:||:||||||||||||||||||||||||||||||||||||||||||||
orf120ng SGGTVVGNTLHPAYYKDIRRGKLYAEAKFADGSVTYGKAGESKTEQSPKAMDLFTLAWQL 129
orf120.pep AANDAKLPPGLKITNGKKLYSVGGLNKAGTGKYSIGGVETEVVKYRVRRGDDAVMYFFAP 150
||||||||||||||||||||||||||||||||||||||||||||||||||||:|||||||
orf120ng AANDAKLPPGLKITNGKKLYSVGGLNKAGTGKYSIGGVETEVVKYRVRRGDDTVTYFFAP 189
orf120.pep SLNNIPAQIGYTDDGKTYTLKLKSVQINGQAAKP 184
||||||||||||||||||||||||||||||||||
orf120ng SLNNIPAQIGYTDDGKTYTLKLKSVQINGQAAKP 223
The complete length ORF120ng nucleotide sequence <SEQ ID 721> is:
1 ATGATGAAGA CTTTTAAAAA TATATTTTCC GCCGCCATTT TGTCCGCCGC
51 CCTGCCGTGC GCGTATGCGG CAAGGCTACC CCAATCCGCC GTGCTGCACT
101 ATTCCGGCAG CTACGGCATT CCCGCCACGA TGACATTTGA ACGCAGCGGC
151 AATGCTTACA AAATCGTTTC GACGATTAAA GTGCCCCTAT ACAATATCCG
201 TTTCGAATCC GGCGGTACGG TTGTCGGCAA TACCCTGCAC CCTGCCTACT
251 ATAAAGACAT ACGCAGGGGC AAACTGTATG CGGAAGCCAA ATTCGCCGAC
301 GGCAGCGTAA CCTACGGCAA AGCGGGCGAG AGCAAAACCG AGCAAAGCCC
351 CAAGGCTATG GATTTGTTCA CGCTTGCCTG GCAGTTGGCG GCAAATGACG
401 CGAAACTCCC CCCGGGTCTG AAAATCACCA ACGGCAAAAA ACTTTATTCC
451 GTCGGCGGCC TGAATAAGGC GGGTACGGGA AAATACAGCA TaggCGGCGT
501 GGAAACCGAA GTCGTCAAAT ATCGGGTGCG GCGCGGCGAC GATACGGTAA
551 CGTATTTCTT CGCACCGTCC CTGAACAATA TTCCGGCACA AATCGGCTAT
601 ACCGACGACG GCAAAACCTA TACGCTGAAG CTCAAATCGG TGCAGATCAA
651 CGGACAGGCC GCCAAACCGT AA
This encodes a protein having amino acid sequence <SEQ ID 722>:
1 MMKTFKNIFS AAILSAALPC AYAARLPQSA VLHYSGSYGI PATMTFERSG
51 NAYKIVSTIK VPLYNIRFES GGTVVGNTLH PAYYKDIRRG KLYAEAKFAD
101 GSVTYGKAGE SKTEQSPKAM DLFTLAWQLA ANDAKLPPGL KITNGKKLYS
151 VGGLNKAGTG KYSIGGVETE VVKYRVRRGD DTVTYFFAPS LNNIPAQIGY
201 TDDGKTYTLK LKSVQINGQA AKP*
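As a quick consistency check on the sequence pair above, the following sketch translates the opening and closing codons of <SEQ ID 721> with the standard genetic code and compares them with <SEQ ID 722>. The function name `translate`, and the choice to report any codon containing a non-ACGT base as `X`, are assumptions made for illustration; the text does not state which software produced its translations.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; unknown codons (e.g. containing N) become X."""
    dna = "".join(dna.split()).upper()  # drop the spaces used in the listing
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


# First row of <SEQ ID 721> and the final nine bases of its last row.
first_row = "ATGATGAAGA CTTTTAAAAA TATATTTTCC GCCGCCATTT TGTCCGCCGC"
last_codons = "AAACCGTAA"

print(translate(first_row))    # N-terminal residues
print(translate(last_codons))  # C-terminal residues and the stop codon
```

The first call yields MMKTFKNIFSAAILSA and the second yields KP*, which agree with the start and end of <SEQ ID 722>. An ambiguous codon such as NAA translates to X under this sketch, consistent with the X residues shown in the strain A sequence.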
ORF120ng shows 97.8% identity over a 223 aa overlap with ORF120-1:
10 20 30 40 50 60
orf120-1.pep MMKTFKNIFSAAILSAALPCAYAAGLPQSAVLHYSGSYGIPATMTFERSGNAYKIVSTIK
|||||||||||||||||||||||| |||||||||||||||||||||||||||||||||||
orf120ng MMKTFKNIFSAAILSAALPCAYAARLPQSAVLHYSGSYGIPATMTFERSGNAYKIVSTIK
10 20 30 40 50 60
70 80 90 100 110 120
orf120-1.pep VPLYNIRFESGGTVVGNTLHPTYYRDIRRGKLYAEAKFADGSVTYGKAGESKTEQSPKAM
|||||||||||||||||||||:||:|||||||||||||||||||||||||||||||||||
orf120ng VPLYNIRFESGGTVVGNTLHPAYYKDIRRGKLYAEAKFADGSVTYGKAGESKTEQSPKAM
70 80 90 100 110 120
130 140 150 160 170 180
orf120-1.pep DLFTLAWQLAANDAKLPPGLKITNGKKLYSVGGLNKAGTGKYSIGGVETEVVKYRVRRGD
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf120ng DLFTLAWQLAANDAKLPPGLKITNGKKLYSVGGLNKAGTGKYSIGGVETEVVKYRVRRGD
130 140 150 160 170 180
190 200 210 220
orf120-1.pep DAVMYFFAPSLNNIPAQIGYTDDGKTYTLKLKSVQINGQAAKPX
|:| ||||||||||||||||||||||||||||||||||||||||
orf120ng DTVTYFFAPSLNNIPAQIGYTDDGKTYTLKLKSVQINGQAAKPX
190 200 210 220
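The identity figure quoted for this gap-free alignment can be reproduced by counting matching positions. The sketch below assumes identity is simply the number of identical residues divided by the overlap length; the alignment program actually used may treat gaps or ambiguous residues differently, and `percent_identity` is a hypothetical helper name.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions carrying the same residue."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Rows copied from the alignment above, without the trailing stop (X).
orf120_1 = (
    "MMKTFKNIFSAAILSAALPCAYAAGLPQSAVLHYSGSYGIPATMTFERSGNAYKIVSTIK"
    "VPLYNIRFESGGTVVGNTLHPTYYRDIRRGKLYAEAKFADGSVTYGKAGESKTEQSPKAM"
    "DLFTLAWQLAANDAKLPPGLKITNGKKLYSVGGLNKAGTGKYSIGGVETEVVKYRVRRGD"
    "DAVMYFFAPSLNNIPAQIGYTDDGKTYTLKLKSVQINGQAAKP"
)
orf120ng = (
    "MMKTFKNIFSAAILSAALPCAYAARLPQSAVLHYSGSYGIPATMTFERSGNAYKIVSTIK"
    "VPLYNIRFESGGTVVGNTLHPAYYKDIRRGKLYAEAKFADGSVTYGKAGESKTEQSPKAM"
    "DLFTLAWQLAANDAKLPPGLKITNGKKLYSVGGLNKAGTGKYSIGGVETEVVKYRVRRGD"
    "DTVTYFFAPSLNNIPAQIGYTDDGKTYTLKLKSVQINGQAAKP"
)

# 1-based positions where the two proteins differ.
mismatches = [
    (i + 1, a, b)
    for i, (a, b) in enumerate(zip(orf120_1, orf120ng))
    if a != b
]
print(mismatches)
print(round(percent_identity(orf120_1, orf120ng), 1))
```

Five positions differ (25, 82, 85, 182 and 184), so 218 of 223 residues match, which rounds to the 97.8% stated in the text.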
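This analysis, including the presence of a putative leader sequence in the gonococcal protein, suggests that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 93
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 723>: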
1 ATGTATCGGA GGAAAGGGCG GGGCATCAAG CCGTGGATGG GTGCCGGTGC
51 .GCGTTTGCC GCCTTGGTCT GGCTGGTTTT CGCGCTCGGC GATACTTTGA
101 CTCCGTTTGC GGTTGCGGCG GTGCTGGCGT ATGTATTGGA CCCTTTGGTC
151 GAATGGTTGC AGAAAAAGGG TTTGAACCGT GCATCCGCTT CGATGTCTGT
201 GATGGTGTTT TCCTTGATTT TGTTGTTGGC ATTATTGTTG ATTATCGTCC
251 CTATGCTGGT CGGGCAGTTC AACAATTTGG CATCGCGCCT GCCCCAATTA
301 ATCGGTTTTA TGCAGAACAC GCTGCTGCCG TGGTTGAAAA ATACAATCGG
351 CGGATATGTG GAAATCGATC AGGCATCTAT TATTGCGTGG CTTCAGGCGC
401 ATACGGGAGA GTTGAGCAAC GCGCTTAAGG CGTGGTTTCC CGTTTTGATG
451 AGGCAGGGCG GCAATATT..
This corresponds to the amino acid sequence <SEQ ID 724; ORF121>:
1 MYRRKGRGIK PWMGAGXAFA ALVWLVFALG DTLTPFAVAA VLAYVLDPLV
51 EWLQKKGLNR ASASMSVMVF SLILLLALLL IIVPMLVGQF NNLASRLPQL
101 IGFMQNTLLP WLKNTIGGYV EIDQASIIAW LQAHTGELSN ALKAWFPVLM
151 RQGGNI..
Further work revealed the complete nucleotide sequence <SEQ ID 725>:
1 ATGTATCGGA GGAAAGGGCG GGGCATCAAG CCGTGGATGG GTGCCGGTGC
51 GGCGTTTGCC GCCTTGGTCT GGCTGGTTTT CGCGCTCGGC GATACTTTGA
101 CTCCGTTTGC GGTTGCGGCG GTGCTGGCGT ATGTATTGGA CCCTTTGGTC
151 GAATGGTTGC AGAAAAAGGG TTTGAACCGT GCATCCGCTT CGATGTCTGT
201 GATGGTGTTT TCCTTGATTT TGTTGTTGGC ATTATTGTTG ATTATCGTCC
251 CTATGCTGGT CGGGCAGTTC AACAATTTGG CATCGCGCCT GCCCCAATTA
301 ATCGGTTTTA TGCAGAACAC GCTGCTGCCG TGGTTGAAAA ATACAATCGG
351 CGGATATGTG GAAATCGATC AGGCATCTAT TATTGCGTGG CTTCAGGCGC
401 ATACGGGAGA GTTGAGCAAC GCGCTTAAGG CGTGGTTTCC CGTTTTGATG
451 AGGCAGGGCG GCAATATTGT CAGCAGTATC GGCAACCTGC TGCTGCTTCC
501 CTTGCTGCTT TACTATTTCC TGCTGGATTG GCAGCGGTGG TCGTGCGGCA
551 TTGCCAAACT GGTTCCGAgG CGTTTTGCCG GTGCTTATAC GCGCATTACA
601 GGCAATTTGA ACGAGGTATT GGGCGAATTT TTGCGCGGGC AGCTTCTGGT
651 AATGCTGATT ATGGGCTTGG TTTACGGTTT GGGATTGGTG CTGGTCGGGC
701 TGGATTCGGG GTTTGCCATC GGTATGCTTG CCGGTATTTT GGTGTTTGTC
751 CCTTATCTCG GGGCGTTTAC GGGATTGCTG CTTGCCACCG TCGCCGCCTT
801 GCTCCAGTTC GGTTCGTGGA ACGGCATCCT ATCGGTTTGG GCGGTTTTTG
851 CCGTAGGACA GTTTCTCGAA AGTTTTTTCA TTACGCCGAA AATCGTGGGA
901 GACCGTATCG GGCTGTCGCC GTTTTGGGTT ATCTTTTCGC TGATGGCGTT
951 CGGGCAGCTG ATGGGCTTTG TCGGAATGTT GGCGGGATTG CCTTTGGCCG
1001 CCGTAACCTT GGTCTTGCTT CGCGAGGGCG TGCAGAAATA TTTTGCCGGC
1051 AGTTTTTACC GGGGCAGGTA G
This corresponds to the amino acid sequence <SEQ ID 726; ORF121-1>:
1 MYRRKGRGIK PWMGAGAAFA ALVWLVFALG DTLTPFAVAA VLAYVLDPLV
51 EWLQKKGLNR ASASMSVMVF SLILLLALLL IIVPMLVGQF NNLASRLPQL
101 IGFMQNTLLP WLKNTIGGYV EIDQASIIAW LQAHTGELSN ALKAWFPVLM
151 RQGGNIVSSI GNLLLLPLLL YYFLLDWQRW SCGIAKLVPR RFAGAYTRIT
201 GNLNEVLGEF LRGQLLVMLI MGLVYGLGLV LVGLDSGFAI GMLAGILVFV
251 PYLGAFTGLL LATVAALLQF GSWNGILSVW AVFAVGQFLE SFFITPKIVG
301 DRIGLSPFWV IFSLMAFGQL MGFVGMLAGL PLAAVTLVLL REGVQKYFAG
351 SFYRGR*
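Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF121 shows 98.7% identity over a 156 aa overlap with an ORF (ORF121a) from strain A of N. meningitidis: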
10 20 30 40 50 60
orf121.pep MYRRKGRGIKPWMGAGXAFAALVWLVFALGDTLTPFAVAAVLAYVLDPLVEWLQKKGLNR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf121a MYRRKGRGIKPWMDAGAAFAALVWLVFALGDTLTPFAVAAVLAYVLDPLVEWLQKKGLNR
10 20 30 40 50 60
70 80 90 100 110 120
orf121.pep ASASMSVMVFSLILLLALLLIIVPMLVGQFNNLASRLPQLIGFMQNTLLPWLKNTIGGYV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf121a ASASMSVMVFSLILLLALLLIIVPMLVGQFNNLASRLPQLIGFMQNTLLPWLKNTIGGYV
70 80 90 100 110 120
130 140 150
orf121.pep EIDQASIIAWLQAHTGELSNALKAWFPVLMRQGGNI
||||||||||||||||||||||||||||||||||||
orf121a EIDQASIIAWLQAHTGELSNALKAWFPVLMRQGGNIVSSIGNLLLLPLLLYYFLLDWQRW
130 140 150 160 170 180
orf121a SCGIAKLVPRRFAGAYTRITGNLNEVLGEFLRGQLLVMLIMGLVYGLGLVLVGLDSGFAI
190 200 210 220 230 240
The complete length ORF121a nucleotide sequence <SEQ ID 727> is:
1 ATGTATCGGA GGAAAGGGCG GGGCATCAAG CCGTGGATGG ATGCCGGTGC
51 GGCGTTTGCC GCCTTGGTCT GGCTGGTTTT CGCGCTCGGC GATACTTTGA
101 CTCCGTTTGC GGTTGCGGCG GTGCTGGCGT ATGTATTGGA CCCTTTGGTC
151 GAATGGTTGC AGAAAAAGGG TTTGAACCGT GCATCCGCTT CGATGTCTGT
201 GATGGTGTTT TCCTTGATTT TGTTGTTGGC ATTATTGTTG ATTATTGTCC
251 CTATGCTGGT CGGGCAGTTC AACAATTTGG CATCGCGCCT GCCCCAATTA
301 ATCGGTTTTA TGCAGAACAC GCTGCTGCCG TGGTTGAAAA ATACAATCGG
351 CGGATATGTG GAAATCGATC AGGCATCTAT TATTGCGTGG CTTCAGGCGC
401 ATACGGGCGA GTTGAGCAAC GCGCTTAAGG CGTGGTTTCC CGTTTTGATG
451 AGGCAGGGCG GCAATATTGT CAGCAGTATC GGCAACCTGC TGCTGCTTCC
501 CTTGCTGCTT TACTATTTCC TGCTGGATTG GCAGCGGTGG TCGTGCGGCA
551 TTGCCAAACT GGTTCCGAGG CGTTTTGCCG GTGCTTATAC GCGCATTACA
601 GGCAATTTGA ACGAGGTATT GGGCGAATTT TTGCGCGGGC AGCTTCTGGT
651 GATGCTGATT ATGGGTTTGG TTTACGGCTT GGGGTTGGTG CTGGTCGGGC
701 TGGATTCGGG GTTTGCAATC GGTATGGTTG CCGGTATTTT GGTTTTTGTT
751 CCCTATTTGG GCGCGTTTAC AGGACTGCTG CTGGCAACCG TCGCCGCCTT
801 GCTCCAGTTC GGTTCGTGGA ACGGCATCTT GGCTGTTTGG GCGGTTTTTG
851 CCGTAGGACA GTTTCTCGAA AGTTTTTTCA TTACGCCGAA AATCGTGGGA
901 GACCGTATCG GCCTGTCGCC GTTTTGGGTT ATCTTTTCGC TGATGGCGTT
951 CGGGCAGCTG ATGGGCTTTG TCGGAATGTT GGCCGGATTG CCTTTGGCCG
1001 CCGTAACCTT GGTCTTGCTT CGCGAGGGCG TGCAGAAATA TTTTGCCGGC
1051 AGTTTTTACC GGGGCAGGTA G
This encodes a protein having amino acid sequence <SEQ ID 728>:
1 MYRRKGRGIK PWMDAGAAFA ALVWLVFALG DTLTPFAVAA VLAYVLDPLV
51 EWLQKKGLNR ASASMSVMVF SLILLLALLL IIVPMLVGQF NNLASRLPQL
101 IGFMQNTLLP WLKNTIGGYV EIDQASIIAW LQAHTGELSN ALKAWFPVLM
151 RQGGNIVSSI GNLLLLPLLL YYFLLDWQRW SCGIAKLVPR RFAGAYTRIT
201 GNLNEVLGEF LRGQLLVMLI MGLVYGLGLV LVGLDSGFAI GMVAGILVFV
251 PYLGAFTGLL LATVAALLQF GSWNGILAVW AVFAVGQFLE SFFITPKIVG
301 DRIGLSPFWV IFSLMAFGQL MGFVGMLAGL PLAAVTLVLL REGVQKYFAG
351 SFYRGR*
ORF121a and ORF121-1 show 99.2% identity in a 356 aa overlap:
10 20 30 40 50 60
orf121a.pep MYRRKGRGIKPWMDAGAAFAALVWLVFALGDTLTPFAVAAVLAYVLDPLVEWLQKKGLNR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf121-1 MYRRKGRGIKPWMGAGAAFAALVWLVFALGDTLTPFAVAAVLAYVLDPLVEWLQKKGLNR
10 20 30 40 50 60
70 80 90 100 110 120
orf121a.pep ASASMSVMVFSLILLLALLLIIVPMLVGQFNNLASRLPQLIGFMQNTLLPWLKNTIGGYV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf121-1 ASASMSVMVFSLILLLALLLIIVPMLVGQFNNLASRLPQLIGFMQNTLLPWLKNTIGGYV
70 80 90 100 110 120
130 140 150 160 170 180
orf121a.pep EIDQASIIAWLQAHTGELSNALKAWFPVLMRQGGNIVSSIGNLLLLPLLLYYFLLDWQRW
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf121-1 EIDQASIIAWLQAHTGELSNALKAWFPVLMRQGGNIVSSIGNLLLLPLLLYYFLLDWQRW
130 140 150 160 170 180
190 200 210 220 230 240
orf121a.pep SCGIAKLVPRRFAGAYTRITGNLNEVLGEFLRGQLLVMLIMGLVYGLGLVLVGLDSGFAI
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf121-1 SCGIAKLVPRRFAGAYTRITGNLNEVLGEFLRGQLLVMLIMGLVYGLGLVLVGLDSGFAI
190 200 210 220 230 240
250 260 270 280 290 300
orf121a.pep GMVAGILVFVPYLGAFTGLLLATVAALLQFGSWNGILAVWAVFAVGQFLESFFITPKIVG
||:||||||||||||||||||||||||||||||||||:||||||||||||||||||||||
orf121-1 GMLAGILVFVPYLGAFTGLLLATVAALLQFGSWNGILSVWAVFAVGQFLESFFITPKIVG
250 260 270 280 290 300
310 320 330 340 350
orf121a.pep DRIGLSPFWVIFSLMAFGQLMGFVGMLAGLPLAAVTLVLLREGVQKYFAGSFYRGRX
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf121-1 DRIGLSPFWVIFSLMAFGQLMGFVGMLAGLPLAAVTLVLLREGVQKYFAGSFYRGRX
310 320 330 340 350
Homology with a predicted ORF from N. gonorrhoeae
ORF121 shows 97.4% identity over a 156 aa overlap with a predicted ORF (ORF121ng) from N. gonorrhoeae:
orf121.pep MYRRKGRGIKPWMGAGXAFAALVWLVFALGDTLTPFAVAAVLAYVLDPLVEWLQKKGLNR 60
|||||||||||||||| |||||||||:|||||||||||||||||||||||||||||||||
orf121ng MYRRKGRGIKPWMGAGAAFAALVWLVYALGDTLTPFAVAAVLAYVLDPLVEWLQKKGLNR 60
orf121.pep ASASMSVMVFSLILLLALLLIIVPMLVGQFNNLASRLPQLIGFMQNTLLPWLKNTIGGYV 120
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf121ng ASASMSVMVFSLILLLALLLIIVPMLVGQFNNLASRLPQLIGFMQNTLLPWLKNTIGGYV 120
orf121.pep EIDQASIIAWLQAHTGELSNALKAWFPVLMRQGGNI 156
||||||||||:|||||||||||||||||||:|||||
orf121ng EIDQASIIAWFQAHTGELSNALKAWFPVLMKQGGNIVSTIGNLLLPPLLLYYFLLDWHRW 180
The predicted ORF121ng nucleotide sequence <SEQ ID 729> encodes a protein having amino acid sequence <SEQ ID 730>:
1 MYRRKGRGIK PWMGAGAAFA ALVWLVYALG DTLTPFAVAA VLAYVLDPLV
51 EWLQKKGLNR ASASMSVMVF SLILLLALLL IIVPMLVGQF NNLASRLPQL
101 IGFMQNTLLP WLKNTIGGYV EIDQASIIAW FQAHTGELSN ALKAWFPVLM
151 KQGGNIVSTI GNLLLPPLLL YYFLLDWHRW SCGIPKLVPR RFAGAYTRIT
201 GNLNKVWGKF LRGQLLGETE RGAVVCRVGR ECWEGGGARS RPSDDGWPRW
251 GGG*
Further work revealed the following gonococcal DNA sequence <SEQ ID 731>:
1 ATGTATCGGA GAAAAGGACG GGGCATCAAG CCGTGGATGG GTGCCGGCGC
51 GGCGTTTGCC GCCTTGGTCT GGCTGGTTTA CGCGCTCGGC GATACTTTGA
101 CTCCGTTTGC GGTTGCGGCG GTGCTGGCGT ATGTGTTGGA CCCTTTGGTC
151 GAATGGTTGC AGAAAAAGGG TTTGAACCGT GCATCCGCTT CGATGTCTGT
201 GATGGTGTTT TCCTTGATTT TGTTGTTGGC ATTATTGTTG ATTATTGTCC
251 CTATGCTGGT CGGGCAGTTC AATAATTTGG CATCTCGCCT GCCCCAATTA
301 ATCGGTTTTA TGCAGAACAC GCTGCTGCCG TGGTTGAAAA ATACAATCGG
351 CGGATATGTG GAAATCGATC AGGCATCTAT TATTGCGTGG TTTCAGGCGC
401 ATACGGGCGA GTTGAGCAAC GCGCTTAAGG CGTGGTTTCC CGTTTTGATG
451 AAACAGGGCG GCAATATTGT CAGCAGTATC GGCAACCTGC TGCTGCCGCC
501 CTTGCTGCTT TACTATTTCC TGCTGGATTG GCAGCGGTGG TCGTGCGGCA
551 TCGCCAAACT GGTTCCGAGG CGTTTTGCCG GTGCTTATAC GCGCATTACG
601 GGTAATTTGA ACGAGGTATT GGGCGAATTT TTGCGCGGTC AGCTTCTGGT
651 GATGCTGATT ATGGGCTTGG TTTACGGTTT GGGATTGATG CTAGTCGGAC
701 TGGATTCGGG ATTTGCCATC GGTATGGTTG CCGGTATTTT GGTGTTTGTC
751 CCCTATTTGG GTGCGTTTAC GGGATTGCTG CTTGCCACTG TTGCAGCCTT
801 GCTCCAGTTC GGTTCGTGGA ACGGAATCTT GGCTGTTTGG GCGGTTTTTG
851 CCGTCGGTCA GTTTCTCGAA AGTTTTTTCA TTACGCCGAA AATTGTAGGA
901 GACCGTATCG GCCTGTCGCC GTTTTGGGTT ATCTTTTCGC TGATGGCGTT
951 CGGAGAGCTG ATGGGCTTTG TCGGAATGTT GGCCGGATTG CCTTTGGCCG
1001 CCGTAACCTT GGTCTTGCTT CGCGAGGGCG CGCAGAAATA TTTTGCCGGC
1051 AGTTTTTACC GGGGCAGGTA G
This corresponds to the amino acid sequence <SEQ ID 732; ORF121ng-1>:
1 MYRRKGRGIK PWMGAGAAFA ALVWLVYALG DTLTPFAVAA VLAYVLDPLV
51 EWLQKKGLNR ASASMSVMVF SLILLLALLL IIVPMLVGQF NNLASRLPQL
101 IGFMQNTLLP WLKNTIGGYV EIDQASIIAW FQAHTGELSN ALKAWFPVLM
151 KQGGNIVSSI GNLLLPPLLL YYFLLDWQRW SCGIAKLVPR RFAGAYTRIT
201 GNLNEVLGEF LRGQLLVMLI MGLVYGLGLM LVGLDSGFAI GMVAGILVFV
251 PYLGAFTGLL LATVAALLQF GSWNGILAVW AVFAVGQFLE SFFITPKIVG
301 DRIGLSPFWV IFSLMAFGEL MGFVGMLAGL PLAAVTLVLL REGAQKYFAG
351 SFYRGR*
ORF121ng-1 and ORF121-1 show 97.5% identity in a 356 aa overlap:
10 20 30 40 50 60
orf121-1.pep MYRRKGRGIKPWMGAGAAFAALVWLVFALGDTLTPFAVAAVLAYVLDPLVEWLQKKGLNR
||||||||||||||||||||||||||:|||||||||||||||||||||||||||||||||
orf121ng-1 MYRRKGRGIKPWMGAGAAFAALVWLVYALGDTLTPFAVAAVLAYVLDPLVEWLQKKGLNR
10 20 30 40 50 60
70 80 90 100 110 120
orf121-1.pep ASASMSVMVFSLILLLALLLIIVPMLVGQFNNLASRLPQLIGFMQNTLLPWLKNTIGGYV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf121ng-1 ASASMSVMVFSLILLLALLLIIVPMLVGQFNNLASRLPQLIGFMQNTLLPWLKNTIGGYV
70 80 90 100 110 120
130 140 150 160 170 180
orf121-1.pep EIDQASIIAWLQAHTGELSNALKAWFPVLMRQGGNIVSSIGNLLLLPLLLYYFLLDWQRW
||||||||||:|||||||||||||||||||:|||||||||||||| ||||||||||||||
orf121ng-1 EIDQASIIAWFQAHTGELSNALKAWFPVLMKQGGNIVSSIGNLLLPPLLLYYFLLDWQRW
130 140 150 160 170 180
190 200 210 220 230 240
orf121-1.pep SCGIAKLVPRRFAGAYTRITGNLNEVLGEFLRGQLLVMLIMGLVYGLGLVLVGLDSGFAI
|||||||||||||||||||||||||||||||||||||||||||||||||:||||||||||
orf121ng-1 SCGIAKLVPRRFAGAYTRITGNLNEVLGEFLRGQLLVMLIMGLVYGLGLMLVGLDSGFAI
190 200 210 220 230 240
250 260 270 280 290 300
orf121-1.pep GMLAGILVFVPYLGAFTGLLLATVAALLQFGSWNGILSVWAVFAVGQFLESFFITPKIVG
||:||||||||||||||||||||||||||||||||||:||||||||||||||||||||||
orf121ng-1 GMVAGILVFVPYLGAFTGLLLATVAALLQFGSWNGILAVWAVFAVGQFLESFFITPKIVG
250 260 270 280 290 300
310 320 330 340 350
orf121-1.pep DRIGLSPFWVIFSLMAFGQLMGFVGMLAGLPLAAVTLVLLREGVQKYFAGSFYRGRX
||||||||||||||||||:||||||||||||||||||||||||:|||||||||||||
orf121ng-1 DRIGLSPFWVIFSLMAFGELMGFVGMLAGLPLAAVTLVLLREGAQKYFAGSFYRGRX
310 320 330 340 350
In addition, ORF121ng-1 shows homology with a permease from H. influenzae:
sp|P43969|PERM_HAEIN putative permease PERM homolog, Length = 349
Score = 69.9 bits (168), Expect = 2e-11
Identities = 67/317 (21%), Positives = 120/317 (37%), Gaps = 7/317 (2%)
Query: 26 VYALGDTLTPFAVAAVLAYVLDPLVEWL-QKKGLNRASASMSVMVFSXXXXXXXXXXXVP 84
+Y GD + P +A VL+Y+L+ + +L Q R A++ + VP
Sbjct: 32 IYFFGDLIAPLLIALVLSYLLEIPINFLNQYLKCPRMLATILIFGSFIGLAAVFFLVLVP 91
Query: 85 MLVGQFNNLASRLPQLIGFMQNTLLPWLKNTIGGYVE-IDQASIIAWFQAHTGELSNALK 143
ML Q +L S LP + N WL N Y E ID + + + F + ++ +
Sbjct: 92 MLWNQTISLLSDLPAMF----NKSNEWLLNLPKNYPELIDYSMVDSIFNSVREKILGFGE 147
Query: 144 AWFPVLMKQGGNIVSSIGNXXXXXXXXXXXXXDWQRWSCGIAKLVPRRFAGAYTRITGNL 203
+ + + N+VS D G+++ +P+ A+ R +
Sbjct: 148 SAVKLSLASIMNLVSLGIYAFLVPLMMFFMLKDKSELLQGVSRFLPKNRNLAFXRWK-EM 206
Query: 204 NEVLGEFLRGQXXXXXXXXXXXXXXXXXXXXDSGFAIGMVAGILVFVPYXXXXXXXXXXX 263
+ + ++ G+ + + G+ V VPY
Sbjct: 207 QQQISNYIHGKLLEILIVTLITYIIFLIFGLNYPLLLAFAVGLSVLVPYIGAVIVTIPVA 266
Query: 264 XXXXXQFGSWNGILAVWAVFAVGQFLESFFITPKIVGDRIGLSPFWVIFSLMAFGELMGF 323
QFG + FAV Q L+ + P + + + L P +I S++ FG L GF
Sbjct: 267 LVALFQFGISPTFWYIIIAFAVSQLLDGNLLVPYLFSEAVNLHPLIIIISVLIFGGLWGF 326
Query: 324 VGMLAGLPLAAVTLVLL 340
G+ +PLA + ++
Sbjct: 327 WGVFFAIPLATLVKAVI 343
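The percentages in the database-search summary above follow directly from the raw counts. The sketch below assumes the program rounds percentages down to whole numbers, an inference from the fact that 120/317 (about 37.9%) is reported as 37%; `truncated_percent` is a hypothetical helper name, not part of any search tool.

```python
def truncated_percent(count: int, total: int) -> int:
    """Integer percentage of count over total, rounded down."""
    return 100 * count // total


# Counts taken from the summary line of the H. influenzae permease hit.
aligned_length = 317
summary = {"Identities": 67, "Positives": 120, "Gaps": 7}

for label, count in summary.items():
    pct = truncated_percent(count, aligned_length)
    print(f"{label} = {count}/{aligned_length} ({pct}%)")
```

Under that assumption the three values come out as 21%, 37% and 2%, matching the figures reported for this hit.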
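Based on this analysis, including the presence of a putative leader sequence and transmembrane domains in the two proteins, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 94
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 733>: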
1 ..ACTGCTTTTT CGGCGGCGCT GCGCTTGAGT CCATCATGAC TCGTCATATT
51 TTTGTCCTTT GGGAAACCGT ATCAACAAAC AGCCGCCATC TTAACATTTT
101 TTTGCACGTC CTGCCCGCCG CGTTCAAATG CGTACCAGCA ATACCGCCGC
151 CTGCGCCTCT ATGCCTTCCA TCCGCCCGAG ATAGCCGAGT TTTTCGTTGG
201 TTTTGCCTTT GATGTTGACG CACGAAATGT CTATGCCCAA ATCGGCGGCG
251 ATGTTGGCAC GCATTTGCGG AATGTGCGGC GCGAGTGTGG GTTTCTGTGC
301 AATCACGGTC GTATCGACAT TGACCGCCTG CCAACCCTGC GCCTGAACGC
351 TTTGATACGC CGCACGCAAA AGGACGCGGC TGTCCGCATC TTTGAACTCT
401 GCGGCGGTGT CGGGGAAATG GCTGCCGATA TCGCCCAAAC CTGCCGCACC
451 GAGCAGCGCG TCGGTAACGG CGTGCAGCAG CGCATCGGCA TCGGAGTGTC
501 CGAGCAGCCC TTTTTCAAAT GGGATTTCAA CTCCGCCAAG TATCAG..
This corresponds to the amino acid sequence <SEQ ID 734; ORF122>:
1 ..TAFSAALRLS PSXLVIFLSF GKPYQQTAAI LTFFCTSCPP RSNAYQQYRR
51 LRLYAFHPPE IAEFFVGFAF DVDARNVYAQ IGGDVGTHLR NVRRECGFLC
101 NHGRIDIDRL PTLRLNALIR RTQKDAAVRI FELCGGVGEM AADIAQTCRT
151 EQRVGNGVQQ RIGIGVSEQP FFKWDFNSAK YQ..
Further work revealed the complete nucleotide sequence <SEQ ID 735>:
1 ATATCGTACT GGGCAAGCAG TTCGCCGGAT TTTTTGGAAG TAGATACCGC
51 GCCTTTGATT TTTTTGCCGC TCTTACCCAA GGCTTCGATG AAAAAGTTGA
101 TGGTCGAGCC GGTACCGATG CCGATATATT CATTTTCGGG TACGAATTCG
151 ACTGCTTTTT CGGCGGCGAT GCGCTTGAGT TCGTCTTGTG TCGTCATATT
201 TTTGTCCTTT GGGAAACCGT ATCAACAAAC AGCCGCCATC TTAACATTTT
251 TTTGCACGTC CTGCCCGCCG CGTTCAAATG CGTACCAGCA ATACCGCCGC
301 CTGCGCCTCT ATGCCTTCCA TCCGCCCGAG ATAGCCGAGT TTTTCGTTGG
351 TTTTGCCTTT GATGTTGACG CACGAAATGT CTATGCCCAA ATCGGCGGCG
401 ATGTTGGCAC GCATTTGCGG AATGTGCGGC GCGAGTTTGG GTTTCTGTGC
451 AATCACGGTC GTATCGACAT TGACCGCCTG CCAACCCTGC GCCTGAACGC
501 TTTGATACGC CGCACGCAAA AGGACGCGGC TGTCCGCATC TTTGAACTCT
551 GCGGCGGTGT CGGGGAAATG GCTGCCGATA TCGCCCAAAC CTGCCGCACC
601 GAGCAGCGCG TCGGTAACGG CGTGCAGCAG CGCATCGGCA TCGGAGTGTC
651 CGAGCAGCCC TTTTTCAAAT GGGATTTCAA CTCCGCCAAG TATCAGCTTT
701 CTGCCTTCGG TCAGTTGGTG GACATCGTAG CCCTGTCCGA TACGGATGTT
751 CGTCATCGTT TGTGTTCCTG A
This corresponds to the amino acid sequence <SEQ ID 736; ORF122-1>:
1 ISYWASSSPD FLEVDTAPLI FLPLLPKASM KKLMVEPVPM PIYSFSGTNS
51 TAFSAAMRLS SSCVVIFLSF GKPYQQTAAI LTFFCTSCPP RSNAYQQYRR
101 LRLYAFHPPE IAEFFVGFAF DVDARNVYAQ IGGDVGTHLR NVRREFGFLC
151 NHGRIDIDRL PTLRLNALIR RTQKDAAVRI FELCGGVGEM AADIAQTCRT
201 EQRVGNGVQQ RIGIGVSEQP FFKWDFNSAK YQLSAFGQLV DIVALSDTDV
251 RHRLCS*
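Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF122 shows 94.0% identity over a 182 aa overlap with an ORF (ORF122a) from strain A of N. meningitidis: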
10 20 30
orf122.pep TAFSAALRLSPSXLVIFLSFGKPYQQTAAI
||||||:||| | :||||||||||||||||
orf122a FLPLLPKASMKKLMVEPVPMPMYSFSGTNSTAFSAAMRLSSSCVVIFLSFGKPYQQTAAI
30 40 50 60 70 80
40 50 60 70 80 90
orf122.pep LTFFCTSCPPRSNAYQQYRRLRLYAFHPPEIAEFFVGFAFDVDARNVYAQIGGDVGTHLR
|||| |||||||| ||||||||||||| |||:|||||||| |||||||||||||||||||
orf122a LTFFXTSCPPRSNPYQQYRRLRLYAFHAPEITEFFVGFAFXVDARNVYAQIGGDVGTHLR
90 100 110 120 130 140
100 110 120 130 140 150
orf122.pep NVRRECGFLCNHGRIDIDRLPTLRLNALIRRTQKDAAVRIFELCGGVGEMAADIAQTCRT
|:||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf122a NMRREFGFLCNHGRIDIDRLPTLRLNALIRRTQKDAAVRIFELCGGVGEMAADIAQTCRT
150 160 170 180 190 200
160 170 180
orf122.pep EQRVGNGVQQRIGIGVSEQPFFKWDFNSAKYQ
||||||||||||||||||||||||||||||||
orf122a EQRVGNGVQQRIGIGVSEQPFFKWDFNSAKYQLSAFGQLVDIVALSDTDVRHRLCSX
210 220 230 240 250
The complete length ORF122a nucleotide sequence <SEQ ID 737> is:
1 ATATCATATT GGGCAAGCAG TTCACTGGAT TTTTTGGAAG TAGATACCGC
51 GCCTTTGATT TTTTTGCCGC TCTTACCCAA GGCTTCGATG AAAAAGTTGA
101 TGGTCGAACC GGTACCGATG CCGATGTATT CGTTTTCGGG TACGAATTCG
151 ACTGCNTTTT CGGCGGCGAT GCGCTTGAGT TCGTCTTGTG TCGTCATATT
201 TTTGTCCTTT GGGAAACCGT ATCAACAAAC AGCCGCCATC TTAACATTTT
251 TTNNNACGTC CTGCCCGCCG CGTTCAAATC CTTACCAGCA ATACCGCCGC
301 CTGCGACTCT ATGCCTTCCA TGCGCCCGAG ATAACCGAGT TTTTCGTTGG
351 TTTTGCCTTT GANGTTGACG CACGAAATGT CTATGCCCAA ATCGGCGGCG
401 ATGTTGGCAC GCATTTGCGG AATATGCGGC GCGAGTTTGG GTTTCTGTGC
451 AATCACGGTC GTATCGACAT TGACCGCCTG CCAACCCTGC GCCTGAACGC
501 TTTGATACGC CGCACGCAAA AGGACGCGGC TGTCCGCATC TTTGAACTCT
551 GCGGCGGTGT CGGGGAAATG GCTGCCGATA TCGCCCAAAC CTGCCGCACC
601 GAGCAGCGCG TCGGTAACGG CGTGCAGCAG CGCATCGGCA TCGGAGTGTC
651 CGAGCAGCCC TTTTTCAAAT GGGATTTCAA CTCCGCCAAG TATCAGCTTT
701 CTGCCTTCGG TCAGTTGGTG GACATCGTAG CCCTGTCCGA TACGGATGTT
751 CGTCATCGTT TGTGTTCCTG A
This encodes a protein having amino acid sequence <SEQ ID 738>:
1 ISYWASSSLD FLEVDTAPLI FLPLLPKASM KKLMVEPVPM PMYSFSGTNS
51 TAFSAAMRLS SSCVVIFLSF GKPYQQTAAI LTFFXTSCPP RSNPYQQYRR
101 LRLYAFHAPE ITEFFVGFAF XVDARNVYAQ IGGDVGTHLR NMRREFGFLC
151 NHGRIDIDRL PTLRLNALIR RTQKDAAVRI FELCGGVGEM AADIAQTCRT
201 EQRVGNGVQQ RIGIGVSEQP FFKWDFNSAK YQLSAFGQLV DIVALSDTDV
251 RHRLCS*
ORF122a and ORF122-1 show 96.9% identity in a 256 aa overlap:
10 20 30 40 50 60
orf122a.pep ISYWASSSLDFLEVDTAPLIFLPLLPKASMKKLMVEPVPMPMYSFSGTNSTAFSAAMRLS
|||||||| ||||||||||||||||||||||||||||||||:||||||||||||||||||
orf122-1 ISYWASSSPDFLEVDTAPLIFLPLLPKASMKKLMVEPVPMPIYSFSGTNSTAFSAAMRLS
10 20 30 40 50 60
70 80 90 100 110 120
orf122a.pep SSCVVIFLSFGKPYQQTAAILTFFXTSCPPRSNPYQQYRRLRLYAFHAPEITEFFVGFAF
|||||||||||||||||||||||| |||||||| ||||||||||||| |||:||||||||
orf122-1 SSCVVIFLSFGKPYQQTAAILTFFCTSCPPRSNAYQQYRRLRLYAFHPPEIAEFFVGFAF
70 80 90 100 110 120
130 140 150 160 170 180
orf122a.pep XVDARNVYAQIGGDVGTHLRNMRREFGFLCNHGRIDIDRLPTLRLNALIRRTQKDAAVRI
||||||||||||||||||||:||||||||||||||||||||||||||||||||||||||
orf122-1 DVDARNVYAQIGGDVGTHLRNVRREFGFLCNHGRIDIDRLPTLRLNALIRRTQKDAAVRI
130 140 150 160 170 180
190 200 210 220 230 240
orf122a.pep FELCGGVGEMAADIAQTCRTEQRVGNGVQQRIGIGVSEQPFFKWDFNSAKYQLSAFGQLV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf122-1 FELCGGVGEMAADIAQTCRTEQRVGNGVQQRIGIGVSEQPFFKWDFNSAKYQLSAFGQLV
190 200 210 220 230 240
250
orf122a.pep DIVALSDTDVRHRLCSX
|||||||||||||||||
orf122-1 DIVALSDTDVRHRLCSX
250
Homology with a predicted ORF from N. gonorrhoeae
ORF122 shows 89.6% identity over a 182 aa overlap with an ORF (ORF122ng) from N. gonorrhoeae:
orf122.pep TAFSAALRLSPSXLVIFLSFGKPYQQTAAI 30
||||||:||| | :||||||||||||||||
orf122ng FLPLLPKASMKKLMVEPVPMPMYSFSGTNSTAFSAAMRLSSSCVVIFLSFGKPYQQTAAI 80
orf122.pep LTFFCTSCPPRSNAYQQYRRLRLYAFHPPEIAEFFVGFAFDVDARNVYAQIGGDVGTHLR 90
||||||| ||||| |||||||||||||||||||||||||||:||||: :|||||||||||
orf122ng LTFFCTSWPPRSNPYQQYRRLRLYAFHPPEIAEFFVGFAFDIDARNIDTQIGGDVGTHLR 140
orf122.pep NVRRECGFLCNHGRIDIDRLPTLRLNALIRRTQKDAAVRIFELCGGVGEMAADIAQTCRT 150
||| | ||||||||||||:|||||||||||||||||||||||||||||:||||:||||||
orf122ng NVRCEFGFLCNHGRIDIDHLPTLRLNALIRRTQKDAAVRIFELCGGVGKMAADVAQTCRT 200
orf122.pep EQRVGNGVQQRIGIGVSEQPFFKWDFNSAKYQ 182
|||||||||||:|| : |||||||||||||||
orf122ng EQRVGNGVQQRVGIRMPEQPFFKWDFNSAKYQLSAFGQLVDIVALSDTDIRHRLCS 256
The complete length ORF122ng nucleotide sequence <SEQ ID 739> is:
1 ATGTCGTACC GGGCAAGCAG TTCGCCGGAT TTTTTGGAGG TTGAAACCGC
51 GCCTTTGATT TTTTTACCGC TTTTGCCCAA GGCTTCGATG AAGAAATTGa
101 tgGTCGAACC GgtaCCGATG CCGATGTATT CGTTTTCGGG TACGAATTCG
151 ACTGCTTTTT CGGCGGCGAT GCGCttgAgt TCgtcttgcg TcgTCATATT
201 TTTAtccttt gGGAAaccct atcaAcaAAc agccgccatC TTAACATTTT
251 TTTGCACGtc ctggccgccg cgttcaAATc cgtaccaGca ataccgccgc
301 ctgcgcctCT AtgcCTTCCA TCCGCCCGAG ATAGCCGAGT TTTTCGTTGG
351 TTTTGCCTTT GATatTGACG CACGAAATAT CGatacCCAa atcggcgGCG
401 ATGTTGGCAC GCATTTGCGG AATGTGCGGT GCGAGTTTGG GTTTCTGTGC
451 AATCACGGTC GTATCGACAT TGACCACCTG CCAACCCTGC GCCTGAACGC
501 TTTGATACGC CGCACGCAAA AGGACGCGGC TGTCCGCATC TTTGAACTCT
551 GCGGCGGTGT CGGGAAAATG GCTGCCGATG TCGCCCAAAC CTGCCGCACC
601 GAGCAGCgcg tcggtaaCGG CGTGCAGCAG cgcgTcgGCA TCCGAATGCC
651 CGAGCAGCCC TTTTTCAAAT GGGATTTCAA CTCCGCCAAG TATCAGCTTT
701 CTGCCTTCGG TCAATTGGTG GACATCGTAG CCCTGTCCGA TACGGATATT
751 CGTCATCGTT TGTGTTCCTG A
This encodes a protein having amino acid sequence <SEQ ID 740>:
1 MSYRASSSPD FLEVETAPLI FLPLLPKASM KKLMVEPVPM PMYSFSGTNS
51 TAFSAAMRLS SSCVVIFLSF GKPYQQTAAI LTFFCTSWPP RSNPYQQYRR
101 LRLYAFHPPE IAEFFVGFAF DIDARNIDTQ IGGDVGTHLR NVRCEFGFLC
151 NHGRIDIDHL PTLRLNALIR RTQKDAAVRI FELCGGVGKM AADVAQTCRT
201 EQRVGNGVQQ RVGIRMPEQP FFKWDFNSAK YQLSAFGQLV DIVALSDTDI
251 RHRLCS*
ORF122ng and ORF122-1 show 92.6% identity in a 256 aa overlap:
10 20 30 40 50 60
orf122-1.pep ISYWASSSPDFLEVDTAPLIFLPLLPKASMKKLMVEPVPMPIYSFSGTNSTAFSAAMRLS
:|| ||||||||||:||||||||||||||||||||||||||:||||||||||||||||||
orf122ng MSYRASSSPDFLEVETAPLIFLPLLPKASMKKLMVEPVPMPMYSFSGTNSTAFSAAMRLS
10 20 30 40 50 60
70 80 90 100 110 120
orf122-1.pep SSCVVIFLSFGKPYQQTAAILTFFCTSCPPRSNAYQQYRRLRLYAFHPPEIAEFFVGFAF
||||||||||||||||||||||||||| ||||| ||||||||||||||||||||||||||
orf122ng SSCVVIFLSFGKPYQQTAAILTFFCTSWPPRSNPYQQYRRLRLYAFHPPEIAEFFVGFAF
70 80 90 100 110 120
130 140 150 160 170 180
orf122-1.pep DVDARNVYAQIGGDVGTHLRNVRREFGFLCNHGRIDIDRLPTLRLNALIRRTQKDAAVRI
|:||||: :|||||||||||||| ||||||||||||||:|||||||||||||||||||||
orf122ng DIDARNIDTQIGGDVGTHLRNVRCEFGFLCNHGRIDIDHLPTLRLNALIRRTQKDAAVRI
130 140 150 160 170 180
190 200 210 220 230 240
orf122-1.pep FELCGGVGEMAADIAQTCRTEQRVGNGVQQRIGIGVSEQPFFKWDFNSAKYQLSAFGQLV
||||||||:||||:|||||||||||||||||:|| : |||||||||||||||||||||||
orf122ng FELCGGVGKMAADVAQTCRTEQRVGNGVQQRVGIRMPEQPFFKWDFNSAKYQLSAFGQLV
190 200 210 220 230 240
250
orf122-1.pep DIVALSDTDVRHRLCSX
|||||||||:|||||||
orf122ng DIVALSDTDIRHRLCSX
250
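Based on this analysis, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 95
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 741>: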
1 ..GCCGGCGCGA GTGCGAACAA CATTTCCGCG CGTTTTGCGG AAACACCCGT
51 CGCTGTCAGC GTTACCCTGA TCGGCACGGT ACTTGCCGTC ATGCTGCCCG
101 TTACCGAATA TGAAAACTTC CTGCTGCTTA TCGGCTCGGT ATTTGCGCCG
151 ATGGGGCGGA TTTTGATTGC CGACTTTTTC GTCTTGAAAC GGCGTGA
This corresponds to the amino acid sequence <SEQ ID 742; ORF125>:
1 ..AGASANNISA RFAETPVAVS VTLIGTVLAV MLPVTEYENF LLLIGSVFAP
51 MGGFDCRLFR LETA*
Further work revealed the complete nucleotide sequence <SEQ ID 743>:
1 ATGTCGGGCA ATGCCTCCTC TCCTTCATCT TCCTCCGCCA TCGGGCTGAT
51 TTGGTTCGGC GCGGCGGTAT CGATTGCCGA AATCAGCACG GGTACGCTGC
101 TTGCGCCTTT GGGCTGGCAG CGCGGTCTGG CGGCTCTACT TTTGGGTCAT
151 GCCGTCGGCG GCGCGCTGTT TTTTGCGGCG GCGTATATCG GCGCACTGAC
201 CGGACGCAGC TCGATGGAAA GCGTGCGCCT GTCGTTCGGC AAACGCGGTT
251 CAGTGCTGTT TTCCGTGGCG AATATGCTGC AACTGGCCGG CTGGACGGCG
301 GTGATGATTT ACGCCGGCGC AACGGTCAGC TCCGCTTTGG GCAAAGTGTT
351 GTGGGACGGC GAATCTTTTG TCTGGTGGGC ATTGGCAAAC GGCGCGCTGA
401 TTGTGCTGTG GCTGGTTTTC GGCGCACGCA AAACAGGCGG GCTGAAAACC
451 GTTTCGATGC TGCTGATGCT GTTGGCGGTT CTGTGGCTGA GTGCCGAAGT
501 CTTTTCCACG GCAGGCAGCA CCGCCGCACA GGTTTCAGAC GGCATGAGTT
551 TCGGAACGGC AGTCGAGCTG TCCGCCGTGA TGCCGCTTTC CTGGCTGCCG
601 CTTGCCGCCG ACTACACGCG CCACGCGCGC CGCCCGTTTG CGGCAACCCT
651 GACGGCAACG CTCGCCTACA CGCTGACCGG CTGCTGGATG TATGCCTTGG
701 GTTTGGCAGC GGCGTTGTTC ACCGGAGAAA CCGACGTGGC AAAAATCCTG
751 CTGGGCGCAG GTTTGGGTGC GGCAGGCATT TTGGCGGTCG TCCTCTCCAC
801 CGTTACCACA ACGTTTCTCG ATGCCTATTC CGCCGGCGCG AGTGCGAACA
851 ACATTTCCGC GCGTTTTGCG GAAACACCCG TCGCTGTCGG CGTTACCCTG
901 ATCGGCACGG TACTTGCCGT CATGCTGCCC GTTACCGAAT ATGAAAACTT
951 CCTGCTGCTT ATCGGCTCGG TATTTGCGCC GATGGCGGCG GTTTTGATTG
1001 CCGACTTTTT CGTCTTGAAA CGGCGTGAGG AGATTGAAGG CTTTGACTTT
1051 GCCGGACTGG TTCTGTGGCT TGCGGGCTTC ATCCTCTACC GCTTCCTGCT
1101 CTCGTCCGGC TGGGAAAGCA GCATCGGTCT GACCGCCCCC GTAATGTCTG
1151 CCGTTGCCAT TGCCACCGTA TCGGTACGCC TTTTCTTTAA AAAAACCCAA
1201 TCTTTACAAA GGAACCCGTC ATGA
This corresponds to the amino acid sequence <SEQ ID 744; ORF125-1>:
1 MSGNASSPSS SSAIGLIWFG AAVSIAEIST GTLLAPLGWQ RGLAALLLGH
51 AVGGALFFAA AYIGALTGRS SMESVRLSFG KRGSVLFSVA NMLQLAGWTA
101 VMIYAGATVS SALGKVLWDG ESFVWWALAN GALIVLWLVF GARKTGGLKT
151 VSMLLMLLAV LWLSAEVFST AGSTAAQVSD GMSFGTAVEL SAVMPLSWLP
201 LAADYTRHAR RPFAATLTAT LAYTLTGCWM YALGLAAALF TGETDVAKIL
251 LGAGLGAAGI LAVVLSTVTT TFLDAYSAGA SANNISARFA ETPVAVGVTL
301 IGTVLAVMLP VTEYENFLLL IGSVFAPMAA VLIADFFVLK RREEIEGFDF
351 AGLVLWLAGF ILYRFLLSSG WESSIGLTAP VMSAVAIATV SVRLFFKKTQ
401 SLQRNPS*
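Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF125 shows 76.5% identity over a 51 aa overlap with an ORF (ORF125a) from strain A of N. meningitidis: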
10 20 30
orf125.pep AGASANNISARFAETPVAVSVTLIGTVLAV
||:|||||||:::| |:||:|:::||:|||
orf125a KILLGAGLGAAGILAVVLSTVTTTFLDAYSAGVSANNISAKLSEIPIAVAVAVVGTLLAV
250 260 270 280 290 300
40 50 60
orf125.pep MLPVTEYENFLLLIGSVFAPMGGFDCRLFRLETAX
:||||||||||||||||||||:
orf125a LLPVTEYENFLLLIGSVFAPMAAVLIADFFVLKRREEIEG
310 320 330 340
The partial ORF125a nucleotide sequence <SEQ ID 745> is:
1 ATGTCGGGCA ATGCCTCCTC TCNTTCATCT TCCGCCGCCA TCGGGCTGAT
51 TTGGTTCGGC GCGGCGGTAT CGATTGCCGA AATCAGCACG GGTACACTGC
101 TTGCGCCTTT GGGCTGGCAG CGCGGTCTGG CNGCTCTGCT TTTGGGTCAT
151 GCCGTCGGCG GCGCGCTGTT TTTTGCGGCG GCGTATATCG GCGCACTGAC
201 CGGACNCANC TCGATGGAAA GCGTGCGCCT GTCGTTCGGC AAACGCGGTT
251 CAGTGCTGTT TTCCGTGGCG AATATGCTGC AACTGGCCGG CTGGACGGCG
301 GTGATGATTT ACGCCGGCGC AACGGTCAGC TCCGCTTTGG GCAAAGTGTT
351 GTGGGACGGC GAATCTTTTG TCTGGTGGGC ATTGGCAAAC GGCGCGCTGA
401 TTGTGCTGTG GCTGGTTTTC GGCGCACGCA AAACAGGCGG GCTGAAAACC
451 GTTTCGATGC TGCTGATGCT GTTGGCGGTT CTGTGGCTGA GTGCCGAANT
501 NTTTTCCACG GCAGGCAGCA CCGCCGCANN GGTNNCAGAC GGCATGAGTT
551 TCGGAACGGC AGTCGAGCTG TCCGCCGTNA TGCCGCTTTC TTGGCTGCCG
601 CTGGCCGCCG ACTACACGCG CCACGCGCGC CGCCCGTTTG CGGCAACCCT
651 GACGGCAACG CTCGCCTACA CGCTGACCGG CTGCTGGATG TATGCCTTGG
701 GTTTGGCAGC GGCGTTGTTC ACCGGAGAAA CCGACGTGGC AAAAATCCTG
751 CTGGGCGCAG GTTTGGGTGC GGCAGGCATT TTGGCGGTCG TCCTGTCGAC
801 CGTTACCACC ACTTTTCTCG ATGCNTACTC CGCCGGCGTA AGTGCCAACA
851 ATATTTCCGC CAAACTTTCG GAAATACCNA TCGCCGTTGC CGTCGCCGTT
901 GTCGGCACAC TGCTTGCCGT CCTCCTGCCC GTTACCGAAT ATGAAAACTT
951 CCTGCTGCTT ATCGGCTCGG TATTTGCGCC GATGGCGGCG GTTTTGATTG
1001 CCGACTTTTT CGTCTTGAAA CGGCGTGAGG AGATTGAAGG C..
This encodes a protein having the partial amino acid sequence <SEQ ID 746>:
1 MSGNASSXSS SAAIGLIWFG AAVSIAEIST GTLLAPLGWQ RGLAALLLGH
51 AVGGALFFAA AYIGALTGXX SMESVRLSFG KRGSVLFSVA NMLQLAGWTA
101 VMIYAGATVS SALGKVLWDG ESFVWWALAN GALIVLWLVF GARKTGGLKT
151 VSMLLMLLAV LWLSAEXFST AGSTAAXVXD GMSFGTAVEL SAVMPLSWLP
201 LAADYTRHAR RPFAATLTAT LAYTLTGCWM YALGLAAALF TGETDVAKIL
251 LGAGLGAAGI LAVVLSTVTT TFLDAYSAGV SANNISAKLS EIPIAVAVAV
301 VGTLLAVLLP VTEYENFLLL IGSVFAPMAA VLIADFFVLK RREEIEG..
ORF125a and ORF125-1 show 94.5% identity in a 347 aa overlap:
10 20 30 40 50 60
orf125a.pep MSGNASSXSSSAAIGLIWFGAAVSIAEISTGTLLAPLGWQRGLAALLLGHAVGGALFFAA
||||||| |||:||||||||||||||||||||||||||||||||||||||||||||||||
orf125-1 MSGNASSPSSSSAIGLIWFGAAVSIAEISTGTLLAPLGWQRGLAALLLGHAVGGALFFAA
10 20 30 40 50 60
70 80 90 100 110 120
orf125a.pep AYIGALTGXXSMESVRLSFGKRGSVLFSVANMLQLAGWTAVMIYAGATVSSALGKVLWDG
|||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||
orf125-1 AYIGALTGRSSMESVRLSFGKRGSVLFSVANMLQLAGWTAVMIYAGATVSSALGKVLWDG
70 80 90 100 110 120
130 140 150 160 170 180
orf125a.pep ESFVWWALANGALIVLWLVFGARKTGGLKTVSMLLMLLAVLWLSAEXFSTAGSTAAXVXD
|||||||||||||||||||||||||||||||||||||||||||||| ||||||||| | |
orf125-1 ESFVWWALANGALIVLWLVFGARKTGGLKTVSMLLMLLAVLWLSAEVFSTAGSTAAQVSD
130 140 150 160 170 180
190 200 210 220 230 240
orf125a.pep GMSFGTAVELSAVMPLSWLPLAADYTRHARRPFAATLTATLAYTLTGCWMYALGLAAALF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf125-1 GMSFGTAVELSAVMPLSWLPLAADYTRHARRPFAATLTATLAYTLTGCWMYALGLAAALF
190 200 210 220 230 240
250 260 270 280 290 300
orf125a.pep TGETDVAKILLGAGLGAAGILAVVLSTVTTTFLDAYSAGVSANNISAKLSEIPIAVAVAV
|||||||||||||||||||||||||||||||||||||||:|||||||:::| |:||:|::
orf125-1 TGETDVAKILLGAGLGAAGILAVVLSTVTTTFLDAYSAGASANNISARFAETPVAVGVTL
250 260 270 280 290 300
310 320 330 340
orf125a.pep VGTLLAVLLPVTEYENFLLLIGSVFAPMAAVLIADFFVLKRREEIEG
:||:|||:|||||||||||||||||||||||||||||||||||||||
orf125-1 IGTVLAVMLPVTEYENFLLLIGSVFAPMAAVLIADFFVLKRREEIEGFDFAGLVLWLAGF
310 320 330 340 350 360
Homology with a predicted ORF from N. gonorrhoeae
ORF125 shows 86.2% identity over a 65 aa overlap with a predicted ORF (ORF125ng) from N. gonorrhoeae:
orf125.pep AGASANNISARFAETPVAVSVTLIGTVLAV 30
|||||||||||||| ||||:|||| |||||
orf125ng KILLGAGLGITGILAVVLSTVTTTFLDTYSAGASANNISARFAEIPVAVGVTLIRTVLAV 308
orf125.pep MLPVTEYENFLLLIGSVFAPM-GGFDCRLFRLETA 64
|||||||:|||||| |||:|| |||||||| |:||
orf125ng MLPVTEYKNFLLLIRSVFGPMAGGFDCRLFCLKTA 343
The predicted ORF125ng nucleotide sequence <SEQ ID 747> encodes a protein having amino acid sequence <SEQ ID 748>:
1 MSGNASSPSS SAAIGLVWFG AAVSIAEIST GTLLAPLGWQ RGLAALLLGH
51 AVGGALFFAA AYIGALTGRS SMESVRLSFG KCGSVLFSVA NMLQLAGWTA
101 VMIYVGATVS SALGKVLWDG ESFVWWALAN GALIVLWLVF GARRTGGLKT
151 VSMLLMLLAV LWLSVEVFAS SGTNAAPAVS DGMTFGTAVE LSAVMPLSWL
201 PLAADYTRQA RRPFAATLTA TLAYTLTGCW MYALGLAAAL FTGETDVAKI
251 LLGAGLGITG ILAVVLSTVT TTFLDTYSAG ASANNISARF AEIPVAVGVT
301 LIRTVLAVML PVTEYKNFLL LIRSVFGPMA GGFDCRLFCL KTA*
Further work revealed the following gonococcal DNA sequence <SEQ ID 749>:
1 ATGTCGGGCA ATGCCTCCTC TCCTTCATCT TCCGCCGCCA TCGGGCTGGT
51 TTGGTTCGGC GCGGCGGTAT CGATTGCCGA AATCAGCACG GGTACGCTGC
101 TCGCCCCCTT GGGCTGGCAG CGCGGTCTGG CGGCCCTGCT TTTGGGTCAT
151 GCCGTCGGCG GCGCGCTGTT TTTTGCGGCG GCGTATATCG GCGCACTGAC
201 CGGACGCAGC TCGATGGAAA GTGTGCGCCT GTCGTTCGGC AAATGCGGTT
251 CAGTGCTGTT TTCCGTGGCG AATATGCTGC AACTGGCCGG CTGGACGGCG
301 GTGATGATTT ACGTCGGCGC AACGGTCAGC TCCGCTTTGG GCAAAGTGTT
351 GTGGGACGGC GAATCCTTTG TCTGGTGGGC ATTGGCAAAC GGCGCACTGA
401 TCGTGCTGTG GCTGGTTTTC GGCGCACGCA GAACGGGCGG GCTGAAAACC
451 GTTTCGATGC TGCTGATGCT GCTTGCCGTG TTGTGGTTGA GCGTCGAAGT
501 GTTCGCTTCG TCCGGCACAA ACGCCGCGCC CGCCGTTTCA GACGGCATGA
551 CCTTCGGAAC GGCAGTCGAA CTGTCCGCCG TCATGCCGCT TTCCTGGCTG
601 CCGCTGGCCG CCGACTACAC GCGCCAAGCA CGCCGCCCGT TTGCGGCAAC
651 CCTGACGGCA ACGCTCGCCT ATACGCTGAC GGGCTGCTGG ATGTATGCCT
701 TGGGTTTGGC GGCGGCTCTG TTTACCGGAG AAACCGACGT GGCGAAAATC
751 CTGTTGGGCG CGGGCTTGGG CATAACGGGC ATTCTGGCAG TCGTCCTCTC
801 CACCGTTACC ACAACGTTTC TCGATACCTA TTCCGCCGGC GCGAGTGCGA
851 ACAACATTTC CGCGCGTTTT GCGGAAATAC CCGTCGCTGT CGGCGTTACC
901 CTGATCGGCA CGGTGCTTGC CGTCATGCTG CCCGTTACCG AATATAAAAA
951 CTTCCTGCTG CTTATCGGCT CGGTATTTGC GCCGATGGCG GCGGTTTTGA
1001 TTGCCGACTT TTTCGTCTTA AAACGGCGTG AGGAGATTGA AGGCTTTGAC
1051 TTTGCCGGAC TGGTTCTGTG GCTGGCAGGC TTCATCCTCT ACCGCTTCCT
1101 GCTCTCGTCC GGTTGGGAAA GCAGCATCGG TCTGACCGCC CCCGTAATGT
1151 CTGCCGTTGC CATTGCCACC GTATCGGTAC GCCTTTTCTT TAAAAAAACC
1201 CAATCTTTAC AAAGGAACCC GTCATGA
This corresponds to the amino acid sequence <SEQ ID 750; ORF125ng-1>:
1 MSGNASSPSS SAAIGLVWFG AAVSIAEIST GTLLAPLGWQ RGLAALLLGH
51 AVGGALFFAA AYIGALTGRS SMESVRLSFG KCGSVLFSVA NMLQLAGWTA
101 VMIYVGATVS SALGKVLWDG ESFVWWALAN GALIVLWLVF GARRTGGLKT
151 VSMLLMLLAV LWLSVEVFAS SGTNAAPAVS DGMTFGTAVE LSAVMPLSWL
201 PLAADYTRQA RRPFAATLTA TLAYTLTGCW MYALGLAAAL FTGETDVAKI
251 LLGAGLGITG ILAVVLSTVT TTFLDTYSAG ASANNISARF AEIPVAVGVT
301 LIGTVLAVML PVTEYKNFLL LIGSVFAPMA AVLIADFFVL KRREEIEGFD
351 FAGLVLWLAG FILYRFLLSS GWESSIGLTA PVMSAVAIAT VSVRLFFKKT
401 QSLQRNPS*
ORF125ng-1 and ORF125-1 show 95.1% identity in 408 aa overlap:
10 20 30 40 50 60
orf125-1.pep MSGNASSPSSSSAIGLIWFGAAVSIAEISTGTLLAPLGWQRGLAALLLGHAVGGALFFAA
|||||||||||:||||:|||||||||||||||||||||||||||||||||||||||||||
orf125ng-1 MSGNASSPSSSAAIGLVWFGAAVSIAEISTGTLLAPLGWQRGLAALLLGHAVGGALFFAA
10 20 30 40 50 60
70 80 90 100 110 120
orf125-1.pep AYIGALTGRSSMESVRLSFGKRGSVLFSVANMLQLAGWTAVMIYAGATVSSALGKVLWDG
||||||||||||||||||||| ||||||||||||||||||||||:|||||||||||||||
orf125ng-1 AYIGALTGRSSMESVRLSFGKCGSVLFSVANMLQLAGWTAVMIYVGATVSSALGKVLWDG
70 80 90 100 110 120
130 140 150 160 170 179
orf125-1.pep ESFVWWALANGALIVLWLVFGARKTGGLKTVSMLLMLLAVLWLSAEVFSTAGSTAAQ-VS
|||||||||||||||||||||||:||||||||||||||||||||:|||:::|::||  ||
orf125ng-1 ESFVWWALANGALIVLWLVFGARRTGGLKTVSMLLMLLAVLWLSVEVFASSGTNAAPAVS
130 140 150 160 170 180
180 190 200 210 220 230 239
orf125-1.pep DGMSFGTAVELSAVMPLSWLPLAADYTRHARRPFAATLTATLAYTLTGCWMYALGLAAAL
|||:||||||||||||||||||||||||:|||||||||||||||||||||||||||||||
orf125ng-1 DGMTFGTAVELSAVMPLSWLPLAADYTRQARRPFAATLTATLAYTLTGCWMYALGLAAAL
190 200 210 220 230 240
240 250 260 270 280 290 299
orf125-1.pep FTGETDVAKILLGAGLGAAGILAVVLSTVTTTFLDAYSAGASANNISARFAETPVAVGVT
||||||||||||||||| :||||||||||||||||:|||||||||||||||| |||||||
orf125ng-1 FTGETDVAKILLGAGLGITGILAVVLSTVTTTFLDTYSAGASANNISARFAEIPVAVGVT
250 260 270 280 290 300
300 310 320 330 340 350 359
orf125-1.pep LIGTVLAVMLPVTEYENFLLLIGSVFAPMAAVLIADFFVLKRREEIEGFDFAGLVLWLAG
|||||||||||||||:||||||||||||||||||||||||||||||||||||||||||||
orf125ng-1 LIGTVLAVMLPVTEYKNFLLLIGSVFAPMAAVLIADFFVLKRREEIEGFDFAGLVLWLAG
310 320 330 340 350 360
360 370 380 390 400
orf125-1.pep FILYRFLLSSGWESSIGLTAPVMSAVAIATVSVRLFFKKTQSLQRNPSX
|||||||||||||||||||||||||||||||||||||||||||||||||
orf125ng-1 FILYRFLLSSGWESSIGLTAPVMSAVAIATVSVRLFFKKTQSLQRNPSX
370 380 390 400
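Based on this analysis, including the presence of a putative leader sequence and transmembrane domains in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 96
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 751>: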
1 ATGACCCGTA TCGCCATCCT CGGCGGCGGC CTCTCGGGAA GGCTGACCGC
51 GTTGCAGCTT GCAGAACAAG GTTATCAGAT TGCACTTTTC GATAAAAGCT
101 GCCGCCGGGG CGAACACGCC GCCGCCTATG TAGCCGCCGC CATGCTCGCG
151 CCTGCAGCGG A.ACGGTCGA AGCCACGCCC GAAGTGGTCA GGCTGGGCAG
201 GCAGAGCATC CCGCTTTGGC GCGGCATCCG ATGCCGTCTG AACACGCACA
251 CGATGATGCA GGAAAACGGC AGCCTGATTG TATGGCACGG GCAGGACAAG
301 CCATTATCCA GCGAGTTCGT CCGCCATCTC AAACGCGGCG GCGT.ACGGA
351 TGACGAAATC GTCCGTTGGC GCGCCGACGA CATCGCCGAA CGCGAACCGC
401 AACTCGGCGG ACGTTTTTAA GACGGCATCT ACCTGCCGAC CGAAGC.CAG
451 CTCGACGGGC GGCAATTATA GTCTGCACTT GCCGACGCTT TGGACGAACT
501 GAACGTCCCC TGCCATTGGG AACACGAATG CGTCCCCGAA GCCTGCAAG.
This corresponds to the amino acid sequence <SEQ ID 752; ORF126>:
1 MTRIAILGGG LSGRLTALQL AEQGYQIALF DKSCRRGEHA AAYVAAAMLA
51 PAAXTVEATP EVVRLGRQSI PLWRGIRCRL NTHTMMQENG SLIVWHGQDK
101 PLSSEFVRHL KRGGXTDDEI VRWRADDIAE REPQLGGRFX DGIYLPTEXQ
151 LDGRQLXSAL ADALDELNVP CHWEHECVPE ACK...
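The correspondence between a DNA listing and the amino acid sequence given for it can be checked mechanically. The following is a minimal sketch, assuming the standard genetic code and assuming that any codon containing an unread base (shown as `.` in the listing) is reported as `X`. The name `translate` is illustrative and does not come from the source. Stop codons are emitted as `*`.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate a DNA listing in frame 1.

    Whitespace between ten-base groups is ignored. Any codon that is not
    made only of A/C/G/T (for example one containing '.') becomes 'X'.
    A trailing partial codon is dropped.
    """
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        protein.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(protein)


# First 30 bases of the partial sequence, and bases 151-165 with one unread base.
start = translate("ATGACCCGTA TCGCCATCCT CGGCGGCGGC")
ambiguous = translate("CCTGCAGCGG A.ACG")
```

Applied to the first row of the listing, this gives the first ten residues of the translation; the second call illustrates how a single unread base produces one `X`.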
Further work revealed the complete nucleotide sequence <SEQ ID 753>:
1 ATGACCCGTA TCGCCATCCT CGGCGGCGGC CTCTCGGGAA GGCTGACCGC
51 GTTGCAGCTT GCAGAACAAG GTTATCAGAT TGCACTTTTC GATAAAGGCT
101 GCCGCCGGGG CGAACACGCC GCCGCCTATG TTGCCGCCGC CATGCTCGCG
151 CCTGCGGCGG AAGCGGTCGA AGCCACGCCC GAAGTGGTCA GGCTGGGCAG
201 GCAGAGCATC CCGCTTTGGC GCGGCATCCG ATGCCGTCTG AACACGCACA
251 CGATGATGCA GGAAAACGGC AGCCTGATTG TGTGGCACGG GCAGGACAAG
301 CCATTATCCA GCGAGTTCGT CCGCCATCTC AAACGCGGCG GCGTAGCGGA
351 TGACGAAATC GTCCGTTGGC GCGCCGACGA CATCGCCGAA CGCGAACCGC
401 AACTCGGCGG ACGTTTTTCA GACGGCATCT ACCTGCCGAC CGAAGGCCAG
451 CTCGACGGGC GGCAAATATT GTCTGCACTT GCCGACGCTT TGGACGAACT
501 GAACGTCCCC TGCCATTGGG AACACGAATG CGTCCCCGAA GGCCTGCAAG
551 CCCAATACGA CTGGCTGATC GACTGCCGCG GCTACGGCGC AAAAACCGCG
601 TGGAACCAAT CCCCCGAGCA CACCAGCACC CTGCGCGGCA TACGCGGCGA
651 AGTGGCGCGG GTTTACACAC CCGAAATCAC GCTCAACCGC CCCGTGCGTC
701 TGCTCCATCC GCGTTATCCG CTCTACATCG CCCCGAAAGA AAACCACGTC
751 TTCGTCATCG GCGCGACCCA AATCGAAAGC GAAAGCCAAG CCCCCGCCAG
801 CGTGCGTTCA GGGTTGGAAC TCTTGTCCGC ACTCTATGCC ATCCACCCCG
851 CCTTCGGCGA AGCCGACATC CTCGAAATCG CCACCGGCCT GCGCCCCACG
901 CTCAACCACC ACAACCCCGA AATCCGTTAC AACCGCGCCC GACGCCTGAT
951 TGAAATCAAC GGCCTTTTCC GCCACGGTTT CATGATCTCC CCCGCCGTAA
1001 CCGCCGCCGC CGCCAGATTG GCAGTGGCAC TGTTTGACGG AAAAGACGCG
1051 CCCGAACGCG ATAAAGAAAG CGGTTTGGCG TATATCCGAA GACAAGATTA
1101 A
This corresponds to the amino acid sequence <SEQ ID 754; ORF126-1>:
1 MTRIAILGGG LSGRLTALQL AEQGYQIALF DKGCRRGEHA AAYVAAAMLA
51 PAAEAVEATP EVVRLGRQSI PLWRGIRCRL NTHTMMQENG SLIVWHGQDK
101 PLSSEFVRHL KRGGVADDEI VRWRADDIAE REPQLGGRFS DGIYLPTEGQ
151 LDGRQILSAL ADALDELNVP CHWEHECVPE GLQAQYDWLI DCRGYGAKTA
201 WNQSPEHTST LRGIRGEVAR VYTPEITLNR PVRLLHPRYP LYIAPKENHV
251 FVIGATQIES ESQAPASVRS GLELLSALYA IHPAFGEADI LEIATGLRPT
301 LNHHNPEIRY NRARRLIEIN GLFRHGFMIS PAVTAAAARL AVALFDGKDA
351 PERDKESGLA YIRRQD*
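The listings in this section are printed as numbered rows of ten-character groups. A minimal sketch for collapsing such a listing into one contiguous string, for example before comparing it with a translation, is given below. The name `parse_listing` is hypothetical, and keeping the trailing `*` unchanged is an assumption rather than anything the source specifies.

```python
def parse_listing(text: str) -> str:
    """Join a numbered sequence listing into a single string.

    Each row is expected to look like '51 PAAEAVEATP EVVRLGRQSI ...':
    a leading position number followed by space-separated groups.
    The position number is discarded; everything else is concatenated.
    """
    residues = []
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0].isdigit():
            fields = fields[1:]  # drop the leading position number
        residues.extend(fields)
    return "".join(residues)


# Last row of the ORF126-1 listing, including the stop symbol.
tail = parse_listing("351 PERDKESGLA YIRRQD*")
```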
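Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF126 shows 90.0% identity over a 180 aa overlap with an ORF (ORF126a) from strain A of N. meningitidis: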
10 20 30 40 50 60
orf126.pep MTRIAILGGGLSGRLTALQLAEQGYQIALFDKSCRRGEHAAAYVAAAMLAPAAXTVEATP
||||||||||||||||||||||||||||||||:|||||||||||||||||||| :|||||
orf126a MTRIAILGGGLSGRLTALQLAEQGYQIALFDKGCRRGEHAAAYVAAAMLAPAAEAVEATP
10 20 30 40 50 60
70 80 90 100 110 120
orf126.pep EVVRLGRQSIPLWRGIRCRLNTHTMMQENGSLIVWHGQDKPLSSEFVRHLKRGGXTDDEI
|||||||| |||||||||:|:| :|| ||||||||||||||||:|||||||||| :|| |
orf126a EVVRLGRQXIPLWRGIRCHLKTPAMMXENGSLIVWHGQDKPLSNEFVRHLKRGGVADDXI
70 80 90 100 110 120
130 140 150 160 170 180
orf126.pep VRWRADDIAEREPQLGGRFXDGIYLPTEXQLDGRQLXSALADALDELNVPCHWEHECVPE
||||||||||||||||||| |||||||| ||||||: ||||||||||||||||||||:||
orf126a VRWRADDIAEREPQLGGRFSDGIYLPTEGQLDGRQILSALADALDELNVPCHWEHECAPE
130 140 150 160 170 180
The complete length ORF126a nucleotide sequence <SEQ ID 755> is:
1 ATGACCCGTA TCGCCATCCT CGGCGGCGGC CTCTCNGGAA GGCTGACCGC
51 ACTGCAGCTT GCAGAACAAG GTTATCAGAT TGCACTTTTC GATAAAGGCT
101 GCCGCCGGGG CGAACACGCC GCCGCCTATG TTGCCGCCGC CATGCTCGCG
151 CCTGCGGCGG AAGCGGTCGA AGCCACGCCT GAAGTGGTCA GGCTGGGCAG
201 GCAGANCATC CCGCTTTGGC GCGGCATCCG ATGCCATCTG AAAACGCCTG
251 CCATGATGCA NGAAAACGGC AGCCTGATTG TGTGGCACGG GCAGGACAAA
301 CCTTTATCCA ACGAGTTCGT CCGCCATCTC AAACGCGGCG GCGTAGCGGA
351 TGACNAAATC GTCCGTTGGC GCGCCGACGA CATCGCCGAA CGCGAACCGC
401 AACTCGGCGG ACGTTTTTCA GACGGCATCT ACCTGCCGAC CGAAGGCCAG
451 CTCGACGGGC GGCAAATATT GTCTGCACTT GCCGACGCTT TGGACGAACT
501 GAACGTCCCC TGCCATTGGG AACACGAATG TGCCCCCGAA GACTTGCAAG
551 CCCAATACGA CTGGCTGATC GACTGCCGCG GCTACGGCGC AAAAACCGCG
601 TGGAACCAAT CCCCCGANNA NACCAGCACC CTGCGCGGCA TACGCGGCGA
651 AGTGGCGCGG GTTTACACAC CCGAAATCAC GCTCAACCGC CCCGTGCGCC
701 TGCTACACCC GCGCTATCCG CTNTACATCG CCCCGAAAGA AAACCNCGTC
751 TTCGTCATCG GCGCGACCCA AATCGAAAGC GAAAGCCAAG CACCTGCCAG
801 CGTGCGTTCC GGGCTGGAAC TCTTATCCGC ACTCTATGCC GTCCACCCCG
851 CCTTCGGCGA AGCCGACATC CTCGAAATCG CCACCGGCCT GCGCCCCACG
901 CTCAATCACC ACAACCCCGA AATCCGTTAC AACCGCGCCC GACGCCTGAT
951 TGAAATCAAC GGCCTTTTCC GCCACGGTTT CATGATCTCC CCCGCCGTAA
1001 CCGCCGCCGC CGTCAGATTG GCAGTGGCAC TGTTTGACGG AAAAGANGCG
1051 CCCGAACGCG ATGAAGAAAG CGGTTTGGCG TATATCCGAA GACAAGATTA
1101 A
This encodes a protein having amino acid sequence <SEQ ID 756>:
1 MTRIAILGGG LSGRLTALQL AEQGYQIALF DKGCRRGEHA AAYVAAAMLA
51 PAAEAVEATP EVVRLGRQXI PLWRGIRCHL KTPAMMXENG SLIVWHGQDK
101 PLSNEFVRHL KRGGVADDXI VRWRADDIAE REPQLGGRFS DGIYLPTEGQ
151 LDGRQILSAL ADALDELNVP CHWEHECAPE DLQAQYDWLI DCRGYGAKTA
201 WNQSPXXTST LRGIRGEVAR VYTPEITLNR PVRLLHPRYP LYIAPKENXV
251 FVIGATQIES ESQAPASVRS GLELLSALYA VHPAFGEADI LEIATGLRPT
301 LNHHNPEIRY NRARRLIEIN GLFRHGFMIS PAVTAAAVRL AVALFDGKXA
351 PERDEESGLA YIRRQD*
ORF126a and ORF126-1 show 95.4% identity in 366 aa overlap:
10 20 30 40 50 60
orf126a.pep MTRIAILGGGLSGRLTALQLAEQGYQIALFDKGCRRGEHAAAYVAAAMLAPAAEAVEATP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf126-1 MTRIAILGGGLSGRLTALQLAEQGYQIALFDKGCRRGEHAAAYVAAAMLAPAAEAVEATP
10 20 30 40 50 60
70 80 90 100 110 120
orf126a.pep EVVRLGRQXIPLWRGIRCHLKTPAMMXENGSLIVWHGQDKPLSNEFVRHLKRGGVADDXI
|||||||| |||||||||:|:| :|| ||||||||||||||||:|||||||||||||| |
orf126-1 EVVRLGRQSIPLWRGIRCRLNTHTMMQENGSLIVWHGQDKPLSSEFVRHLKRGGVADDEI
70 80 90 100 110 120
130 140 150 160 170 180
orf126a.pep VRWRADDIAEREPQLGGRFSDGIYLPTEGQLDGRQILSALADALDELNVPCHWEHECAPE
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||:||
orf126-1 VRWRADDIAEREPQLGGRFSDGIYLPTEGQLDGRQILSALADALDELNVPCHWEHECVPE
130 140 150 160 170 180
190 200 210 220 230 240
orf126a.pep DLQAQYDWLIDCRGYGAKTAWNQSPXXTSTLRGIRGEVARVYTPEITLNRPVRLLHPRYP
|||||||||||||||||||||||| |||||||||||||||||||||||||||||||||
orf126-1 GLQAQYDWLIDCRGYGAKTAWNQSPEHTSTLRGIRGEVARVYTPEITLNRPVRLLHPRYP
190 200 210 220 230 240
250 260 270 280 290 300
orf126a.pep LYIAPKENXVFVIGATQIESESQAPASVRSGLELLSALYAVHPAFGEADILEIATGLRPT
|||||||| |||||||||||||||||||||||||||||||:|||||||||||||||||||
orf126-1 LYIAPKENHVFVIGATQIESESQAPASVRSGLELLSALYAIHPAFGEADILEIATGLRPT
250 260 270 280 290 300
310 320 330 340 350 360
orf126a.pep LNHHNPEIRYNRARRLIEINGLFRHGFMISPAVTAAAVRLAVALFDGKXAPERDEESGLA
|||||||||||||||||||||||||||||||||||||:|||||||||| |||||:|||||
orf126-1 LNHHNPEIRYNRARRLIEINGLFRHGFMISPAVTAAAARLAVALFDGKDAPERDKESGLA
310 320 330 340 350 360
orf126a.pep YIRRQDX
|||||||
orf126-1 YIRRQDX
Homology with a predicted ORF from N. gonorrhoeae
ORF126 shows 90% identity over a 180 aa overlap with a predicted ORF (ORF126ng) from N. gonorrhoeae:
orf126.pep MTRIAILGGGLSGRLTALQLAEQGYQIALFDKSCRRGEHAAAYVAAAMLAPAAXTVEATP 60
|||||:||||||||||||||||||||| ||||: |:||||||||||||||||| :|||||
orf126ng MTRIAVLGGGLSGRLTALQLAEQGYQIELFDKGTRQGEHAAAYVAAAMLAPAAEAVEATP 60
orf126.pep EVVRLGRQSIPLWRGIRCRLNTHTMMQENGSLIVWHGQDKPLSSEFVRHLKRGGXTDDEI 120
||:||||||||||||||||||| ||||||||||||||||||||||||||||||| :||||
orf126ng EVIRLGRQSIPLWRGIRCRLNTLTMMQENGSLIVWHGQDKPLSSEFVRHLKRGGVADDEI 120
orf126.pep VRWRADDIAEREPQLGGRFXDGIYLPTEXQLDGRQLXSALADALDELNVPCHWEHECVPE 180
||||||:|||||||||||| |||||||| ||||||: ||||||||||||||||||||:|:
orf126ng VRWRADEIAEREPQLGGRFSDGIYLPTEGQLDGRQILSALADALDELNVPCHWEHECAPQ 180
The ORF126ng nucleotide sequence <SEQ ID 757> is predicted to encode a protein having amino acid sequence <SEQ ID 758>:
1 MTRIAVLGGG LSGRLTALQL AEQGYQIELF DKGTRQGEHA AAYVAAAMLA
51 PAAEAVEATP EVIRLGRQSI PLWRGIRCRL NTLTMMQENG SLIVWHGQDK
101 PLSSEFVRHL KRGGVADDEI VRWRADEIAE REPQLGGRFS DGIYLPTEGQ
151 LDGRQILSAL ADALDELNVP CHWEHECAPQ DLQAQVDWVI DCRGYGAKTA
201 WNQSPEHTST LRGIRGEVRG FTRPKSRSTA PCACCTRAIR STSPRKKTTS
251 SSSARPKSKA KAKPPPAYVP GWNSYPRSMP STPPSAKPTS SKWRPGLRPT
301 LNHHNPEIRY SRERRLIEIN GLFRHGFMIS PAVTAAAVRL AVALFDGKDA
351 PERDEESGLA YIGRQD*
Further work revealed the following gonococcal DNA sequence <SEQ ID 759>:
1 ATGACCCGTA TCGCCGTCCT CGGAGGCGGC CTTTCCGGAA GGCTGACCGC
51 ATTGCAGCTT GCAGAACAAG GTTATCAGAT TGAACTTTTC GACAAGGGCA
101 CCCGCCAAGG CGAACACGCC GCCGCCTATG TTGCCGCCGC GATGCTCGCG
151 CCTGCGGCGG AAGCGGTCGA GGCAACGCCC GAAGTCATCA GGCTGGGCAG
201 GCAGAGCATT CCGCTTTGGC GCGGCATCCG ATGCCGTCTG AACACGCTCA
251 CGATGATGCA GGAAAACGGC AGCCTGATTG TGTGGCACGG GCAGGACAAG
301 CCATTATCCA GCGAGTTCGT CCGCCATCTC AAACGCGGCG GCGTAGCGGA
351 TGACGAAATC GTCCGTTGGC GCGCCGATGA AATCGCCGAA CGCGAACCGC
401 AACTCGGCGG ACGTTTTTCA GACGGCATCT ACCTGCCGAC CGAAGGCCAG
451 CTCGACGGGC GGCAAATATT GTCTGCACTT GCCGACGCTT TGGACGAACT
501 GAACGTCCCT TGCCATTGGG AACACGAATG CGCCCCCCAA GACCTGCAAG
551 CCCAATACGA CTGGGTAATC GACTGCCGGG GCTACGGCGC GAAAACCGCG
601 TGGAACCAAT CCCCCGAGCA CACCAGCACC TTGCGCGGCA TACGCGGCGA
651 AGTGGCGCGG GTTTACACGC CCGAAATCAC GCTCAACCGC CCCGTGCGCC
701 TGCTGCACCC GCGCTATCCG CTCTACATCG CCCCGAAAGA AAACCACGTC
751 TTCGTCATCG GCGCGACCCA AATCGAAAGC GAAAGCCAAG CCCCCGCCAG
801 CGTACGTTCC GGGCTGGAAC TCTTATCCGC GCTCTATGCC GTCCACCCCG
851 CCTTCGGCGA AGCCGACATC CTCGAAATCG CCGCCGGCCT GCGCCCCACG
901 CTCAACCACC ACAACCCCGA AATCCGCTAC AGCCGCGAAC GCCGCCTCAT
951 CGAAATCAAC GGCCTTTTCC GGCACGGCTT TATGATTTCC CCCGCCGTAA
1001 CCGCCGCCGC CGTCAGATTG GCAGTGGCAC TGTTTGACGG AAAAGACGCG
1051 CCCGAACGTG ATGAAGAAAG CGGTTTGGCG TATATCGGAA GACAAGATTA
1101 A
This corresponds to the amino acid sequence <SEQ ID 760; ORF126ng-1>:
1 MTRIAVLGGG LSGRLTALQL AEQGYQIELF DKGTRQGEHA AAYVAAAMLA
51 PAAEAVEATP EVIRLGRQSI PLWRGIRCRL NTLTMMQENG SLIVWHGQDK
101 PLSSEFVRHL KRGGVADDEI VRWRADEIAE REPQLGGRFS DGIYLPTEGQ
151 LDGRQILSAL ADALDELNVP CHWEHECAPQ DLQAQYDWVI DCRGYGAKTA
201 WNQSPEHTST LRGIRGEVAR VYTPEITLNR PVRLLHPRYP LYIAPKENHV
251 FVIGATQIES ESQAPASVRS GLELLSALYA VHPAFGEADI LEIAAGLRPT
301 LNHHNPEIRY SRERRLIEIN GLFRHGFMIS PAVTAAAVRL AVALFDGKDA
351 PERDEESGLA YIGRQD*
ORF126ng-1 and ORF126-1 show 95.1% identity in 366 aa overlap:
10 20 30 40 50 60
orf126-1.pep MTRIAILGGGLSGRLTALQLAEQGYQIALFDKGCRRGEHAAAYVAAAMLAPAAEAVEATP
|||||:||||||||||||||||||||| ||||| |:||||||||||||||||||||||||
orf126ng-1 MTRIAVLGGGLSGRLTALQLAEQGYQIELFDKGTRQGEHAAAYVAAAMLAPAAEAVEATP
10 20 30 40 50 60
70 80 90 100 110 120
orf126-1.pep EVVRLGRQSIPLWRGIRCRLNTHTMMQENGSLIVWHGQDKPLSSEFVRHLKRGGVADDEI
||:||||||||||||||||||| |||||||||||||||||||||||||||||||||||||
orf126ng-1 EVIRLGRQSIPLWRGIRCRLNTLTMMQENGSLIVWHGQDKPLSSEFVRHLKRGGVADDEI
70 80 90 100 110 120
130 140 150 160 170 180
orf126-1.pep VRWRADDIAEREPQLGGRFSDGIYLPTEGQLDGRQILSALADALDELNVPCHWEHECVPE
||||||:||||||||||||||||||||||||||||||||||||||||||||||||||:|:
orf126ng-1 VRWRADEIAEREPQLGGRFSDGIYLPTEGQLDGRQILSALADALDELNVPCHWEHECAPQ
130 140 150 160 170 180
190 200 210 220 230 240
orf126-1.pep GLQAQYDWLIDCRGYGAKTAWNQSPEHTSTLRGIRGEVARVYTPEITLNRPVRLLHPRYP
|||||||:|||||||||||||||||||||||||||||||||||||||||||||||||||
orf126ng-1 DLQAQYDWVIDCRGYGAKTAWNQSPEHTSTLRGIRGEVARVYTPEITLNRPVRLLHPRYP
190 200 210 220 230 240
250 260 270 280 290 300
orf126-1.pep LYIAPKENHVFVIGATQIESESQAPASVRSGLELLSALYAIHPAFGEADILEIATGLRPT
||||||||||||||||||||||||||||||||||||||||:|||||||||||||:|||||
orf126ng-1 LYIAPKENHVFVIGATQIESESQAPASVRSGLELLSALYAVHPAFGEADILEIAAGLRPT
250 260 270 280 290 300
310 320 330 340 350 360
orf126-1.pep LNHHNPEIRYNRARRLIEINGLFRHGFMISPAVTAAAARLAVALFDGKDAPERDKESGLA
||||||||||:| ||||||||||||||||||||||||:||||||||||||||||:|||||
orf126ng-1 LNHHNPEIRYSRERRLIEINGLFRHGFMISPAVTAAAVRLAVALFDGKDAPERDEESGLA
310 320 330 340 350 360
orf126-1.pep YIRRQDX
|| ||||
orf126ng-1 YIGRQDX
In addition, ORF126ng-1 shows homology to a putative Rhizobium oxidase flavoprotein:
gi|2627327 (AF004408) putative amino acid oxidase flavoprotein [Rhizobium etli] Length = 327
Score = 169 bits (423), Expect = 3e-41
Identities = 112/329 (34%), Positives = 163/329 (49%), Gaps = 25/329 (7%)
Query: 3 RIAVLGGGLSGRLTALQLAEQGYQIELFDKGTRQGEHXXXXXXXXXXXXXXXXXXXXXXX 62
RI V G G++G A QL G+++ L ++ G
Sbjct: 2 RILVNGAGVAGLTVAWQLYRHGFRVTLAERAGTVGA-GASGFAGGMLAPWCERESAEEPV 60
Query: 63 IRLGRQSIPLWRGIRCRLNTLTMMQENGSLIVWHGQDKPLSSEFVRHLKRGGVADDEIVR 122
+ LGR + W + G+L+V G+D F R G DE+
Sbjct: 61 LTLGRLAADWWEAA-----LPGHVHRRGTLVVAGGRDTGELDRFSRRTS-GWEWLDEVA- 113
Query: 123 WRADEIAEREPQLGGRFSDGIYLPTEGQLDGRQILSALADALDELNVPCHWEHECAPQDL 182
IA EP L GRF ++ E LD RQ L+ALA L++ + +
Sbjct: 114 -----IAALEPDLAGRFRRALFFRQEAHLDPRQALAALAAGLEDARMRLTLG---VVGES 165
Query: 183 QAQYDWVIDCRGYGAKTAWNQSPEHTSTLRGIRGEVARVYTPEITLNRPVRLLHPRYPLY 242
+D V+DC G LRG+RGE+ V T E++L+RPVRLLHPR+P+Y
Sbjct: 166 DVDHDRVVDCTGAA-------QIGRLPGLRGVRGEMLCVETTEVSLSRPVRLLHPRHPIY 218
Query: 243 IAPKENHVFVIGATQIESESQAPASVRSGLELLSALYAVHPAFGEADILEIAAGLRPTLN 302
I P++ + F++GAT IES+ P + RS +ELL+A YA+HPAFGEA + E AG+RP
Sbjct: 219 IVPRDKNRFMVGATMIESDDGGPITARSLMELLNAAYAMHPAFGEARVTETGAGVRPAYP 278
Query: 303 HHNPEIRYSRERRLIEINGLFRHGFMISP 331
+ P R ++E R + +NGL+RHGF+++P
Sbjct: 279 DNLP--RVTQEGRTLHVNGLYRHGFLLAP 305
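The percentages in the hit summary for gi|2627327 can be re-derived from the raw counts over the 329 aligned columns. The sketch below assumes, since the source does not say so, that the reported figures are truncated rather than rounded. That assumption is consistent with 163/329 being reported as 49% and 25/329 as 7%. The names used are illustrative only.

```python
def truncated_percent(count: int, length: int) -> int:
    """Integer percentage of count over length, truncated (not rounded)."""
    return (100 * count) // length


ALIGNED_LENGTH = 329  # aligned columns, from the hit summary
summary = {"identities": 112, "positives": 163, "gaps": 25}

# Recompute each percentage from its raw count.
percents = {
    name: truncated_percent(count, ALIGNED_LENGTH)
    for name, count in summary.items()
}
```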
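This analysis suggests that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 97
The following DNA sequence, believed to be complete, was identified in N. meningitidis <SEQ ID 761>: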
1 ATGACTGATA ATCGGGGGTT TACGCTGGTT GAATTAATAT CAGTGGTCTT
51 GATATTGTCT GTACTTGCTT TAATTGTTTA TCCGAGCTAT CGCAATTATG
101 TTGAGAAAGC AAAGATAAAT GCAGTGCGGG CAGCCTTGTT AGAAAATGCA
151 CATTTTATGG AAAAGTTTTA TCTGCAGAAT GGGAGGTTTA AACAAACATC
201 TACCAAGTGG CCAAGTTTGC CGATTAAAGA GGCAGAAGGC TTTTGTATCC
251 GTTTGAATGG AATCGtCGCG CGGG..GCTT TAGACAGTAA ATTCATGTTG
301 AAGGCGGTAG CCATAGATAA AGATAAAAAT CCTTTTATTA TTAAGATGAA
351 TGAAAATCTA GTAACCTTTA aTTTGCAAGA AGTCCGCCAG TTCGTGTAGT
401 GACGGGCTGG ATTATTTTAA AGGAAATGAT AAGGACTGCA AGTTACTTAA
451 GTAG
This corresponds to the amino acid sequence <SEQ ID 762; ORF127>:
1 MTDNRGFTLV ELISVVLILS VLALIVYPSY RNYVEKAKIN AVRAALLENA
51 HFMEKFYLQN GRFKQTSTKW PSLPIKEAEG FCIRLNGIVA RXALDSKFML
101 KAVAIDKDKN PFIIKMNENL VTFICKKSAS SCSDGLDYFK GNDKDCKLLK
151 *
Further work revealed the following DNA sequence <SEQ ID 763>:
1 ATGACTGATA ATCGGGGGTT TACGCTGGTT GAATTAATAT CAGTGGTCTT
51 GATATTGTCT GTACTTGCTT TAATTGTTTA TCCGAGCTAT CGCAATTATG
101 TTGAGAAAGC AAAGATAAAT GCAGTGCGGG CAGCCTTGTT AGAAAATGCA
151 CATTTTATGG AAAAGTTTTA TCTGCAGAAT GGGAGGTTTA AACAAACATC
201 TACCAAGTGG CCAAGTTTGC CGATTAAAGA GGCAGAAGGC TTTTGTATCC
251 GTTTGAATGG AATCGCGCGC GGGGCTTTAG ACAGTAAATT CATGTTGAAG
301 GCGGTAGCCA TAGATAAAGA TAAAAATCCT TTTATTATTA AGATGAATGA
351 AAATCTAGTA ACCTTTATTT GCAAGAAGTC CGCCAGTTCG TGTAGTGACG
401 GGCTGGATTA TTTTAAAGGA AATGATAAGG ACTGCAAGTT ACTTAAGTAG
This corresponds to the amino acid sequence <SEQ ID 764; ORF127-1>:
1 MTDNRGFTLV ELISVVLILS VLALIVYPSY RNYVEKAKIN AVRAALLENA
51 HFMEKFYLQN GRFKQTSTKW PSLPIKEAEG FCIRLNGIAR GALDSKFMLK
101 AVAIDKDKNP FIIKMNENLV TFICKKSASS CSDGLDYFKG NDKDCKLLK*
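Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF127 shows 98.0% identity over a 150 aa overlap with an ORF (ORF127a) from strain A of N. meningitidis: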
10 20 30 40 50 60
orf127.pep MTDNRGFTLVELISVVLILSVLALIVYPSYRNYVEKAKINAVRAALLENAHFMEKFYLQN
||||||||||||||||||||||||||||||||||||||||:|||||||||||||||||||
orf127a MTDNRGFTLVELISVVLILSVLALIVYPSYRNYVEKAKINTVRAALLENAHFMEKFYLQN
10 20 30 40 50 60
70 80 90 100 110 120
orf127.pep GRFKQTSTKWPSLPIKEAEGFCIRLNGIVARXALDSKFMLKAVAIDKDKNPFIIKMNENL
|||||||||||||||||||||||||||| || ||||||||||||||||||||||||||||
orf127a GRFKQTSTKWPSLPIKEAEGFCIRLNGI-ARGALDSKFMLKAVAIDKDKNPFIIKMNENL
70 80 90 100 110
130 140 150
orf127.pep VTFICKKSASSCSDGLDYFKGNDKDCKLLKX
|||||||||||||||||||||||||||||||
orf127a VTFICKKSASSCSDGLDYFKGNDKDCKLLKX
120 130 140 150
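An identity figure of this kind can be recomputed from the two rows of an alignment. The following is a minimal sketch under stated assumptions: both rows have equal length, `-` marks a gap, and a column counts as identical only when both rows carry the same non-gap character. How the original alignment program treated `X` and the terminal stop symbol is not stated in the source, so the result need not reproduce the quoted figure exactly. The name `percent_identity` is hypothetical.

```python
def percent_identity(row_a: str, row_b: str, gap: str = "-"):
    """Return (identical columns, total columns, percent identity)."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    identical = sum(
        1 for a, b in zip(row_a, row_b) if a == b and a != gap
    )
    columns = len(row_a)
    return identical, columns, 100.0 * identical / columns


# Middle block (columns 61-120) of the orf127.pep / orf127a alignment.
block_a = "GRFKQTSTKWPSLPIKEAEGFCIRLNGIVARXALDSKFMLKAVAIDKDKNPFIIKMNENL"
block_b = "GRFKQTSTKWPSLPIKEAEGFCIRLNGI-ARGALDSKFMLKAVAIDKDKNPFIIKMNENL"
identical, columns, pct = percent_identity(block_a, block_b)
```

Running the function block by block and summing the counts gives a figure for the whole overlap.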
The complete length ORF127a nucleotide sequence <SEQ ID 765> is:
1 ATGACTGATA ATCGGGGGTT TACGCTGGTT GAATTAATAT CAGTGGTCTT
51 GATATTGTCT GTACTTGCTT TAATTGTTTA TCCGAGCTAT CGCAATTATG
101 TTGAGAAAGC AAAGATAAAT ACAGTGCGGG CAGCCTTGTT AGAAAATGCA
151 CATTTTATGG AAAAGTTTTA TCTGCAGAAT GGGAGATTTA AACAAACATC
201 TACCAAATGG CCAAGTTTGC CGATTAAAGA GGCAGAAGGC TTTTGTATCC
251 GTTTGAATGG AATCGCGCGC GGGGCCTTAG ACAGTAAATT CATGTTGAAG
301 GCGGTAGCCA TAGATAAAGA TAAAAATCCT TTTATTATTA AGATGAATGA
351 AAATCTAGTA ACCTTTATTT GCAAGAAGTC CGCCAGTTCG TGTAGTGACG
401 GGCTGGATTA TTTTAAAGGA AATGATAAGG ACTGCAAGTT ACTTAAGTAG
This encodes a protein having amino acid sequence <SEQ ID 766>:
1 MTDNRGFTLV ELISVVLILS VLALIVYPSY RNYVEKAKIN TVRAALLENA
51 HFMEKFYLQN GRFKQTSTKW PSLPIKEAEG FCIRLNGIAR GALDSKFMLK
101 AVAIDKDKNP FIIKMNENLV TFICKKSASS CSDGLDYFKG NDKDCKLLK*
ORF127a and ORF127-1 show 99.3% identity in 149 aa overlap:
10 20 30 40 50 60
orf127a.pep MTDNRGFTLVELISVVLILSVLALIVYPSYRNYVEKAKINTVRAALLENAHFMEKFYLQN
||||||||||||||||||||||||||||||||||||||||:|||||||||||||||||||
orf127-1 MTDNRGFTLVELISVVLILSVLALIVYPSYRNYVEKAKINAVRAALLENAHFMEKFYLQN
10 20 30 40 50 60
70 80 90 100 110 120
orf127a.pep GRFKQTSTKWPSLPIKEAEGFCIRLNGIARGALDSKFMLKAVAIDKDKNPFIIKMNENLV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf127-1 GRFKQTSTKWPSLPIKEAEGFCIRLNGIARGALDSKFMLKAVAIDKDKNPFIIKMNENLV
70 80 90 100 110 120
130 140 150
orf127a.pep TFICKKSASSCSDGLDYFKGNDKDCKLLKX
||||||||||||||||||||||||||||||
orf127-1 TFICKKSASSCSDGLDYFKGNDKDCKLLKX
130 140 150
Homology with a predicted ORF from N. gonorrhoeae
ORF127 shows 97.3% identity over a 150 aa overlap with a predicted ORF (ORF127ng) from N. gonorrhoeae:
orf127.pep MTDNRGFTLVELISVVLILSVLALIVYPSYRNYVEKAKINAVRAALLENAHFMEKFYLQN 60
|||||||||||||||||||||||||||||||||||||||||||||:||||||||||||||
orf127ng MTDNRGFTLVELISVVLILSVLALIVYPSYRNYVEKAKINAVRAAFLENAHFMEKFYLQN 60
orf127.pep GRFKQTSTKWPSLPIKEAEGFCIRLNGIVARXALDSKFMLKAVAIDKDKNPFIIKMNENL 120
|||||||||||||||||||||||||||| || ||||||||||||||||||||||||||||
orf127ng GRFKQTSTKWPSLPIKEAEGFCIRLNGI-ARGALDSKFMLKAVAIDKDKNPFIIKMNENL 119
orf127.pep VTFICKKSASSCSDGLDYFKGNDKDCKLLK 150
|||||||||||||| |||||||||||||||
orf127ng VTFICKKSASSCSDRLDYFKGNDKDCKLLK 149
The complete length ORF127ng nucleotide sequence <SEQ ID 767> is:
1 ATGACTGATA ATCGGGGGTT TACACTGGTT GAATTAATAT CAGTGGTCTT
51 GATATTGTCT GTACTTGCTT TAATTGTTTA TCCGAGCTAT CGCAATTATG
101 TTGAGAAAGC AAAGATAAAT GCAGTGCGGG CAGCCTTGTT AGAAAATGCA
151 CATTTTATGG AAAAGTTTTA TCTGCAGAAT GGGAGATTTA AACAAACATC
201 TACCAAATGG CCAAGTTTGC CGATTAAAGA GGCAGAAGGC TTTTGTATCC
251 GTTTGAATGG AATCGCGCGC GGGGCTTTAG ACAGTAAATT CATGTTGAAG
301 GCGGTAGCCA TAGATAAAGA TAAAAATCCT TTTATTATTA AGATGAATGA
351 AAATCTAGTA ACCTTTATTT GCAAGAAGTC CGCCAGTTCG TGTAGTGACG
401 GGCTGGATTA TTTTAAAGGA AATGATAAGG ACTGCAAGTT ACTTAAGTAG
This encodes a protein having amino acid sequence <SEQ ID 768>:
1 MTDNRGFTLV ELISVVLILS VLALIVYPSY RNYVEKAKIN AVRAAFLENA
51 HFMEKFYLQN GRFKQTSTKW PSLPIKEAEG FCIRLNGIAR GALDSKFMLK
101 AVAIDKDKNP FIIKMNENLV TFICKKSASS CSDRLDYFKG NDKDCKLLK*
ORF127ng and ORF127-1 show 100.0% identity in 149 aa overlap:
10 20 30 40 50 60
orf127-1.pep MTDNRGFTLVELISVVLILSVLALIVYPSYRNYVEKAKINAVRAALLENAHFMEKFYLQN
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf127ng-1 MTDNRGFTLVELISVVLILSVLALIVYPSYRNYVEKAKINAVRAALLENAHFMEKFYLQN
10 20 30 40 50 60
70 80 90 100 110 120
orf127-1.pep GRFKQTSTKWPSLPIKEAEGFCIRLNGIARGALDSKFMLKAVAIDKDKNPFIIKMNENLV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf127ng-1 GRFKQTSTKWPSLPIKEAEGFCIRLNGIARGALDSKFMLKAVAIDKDKNPFIIKMNENLV
70 80 90 100 110 120
130 140 150
orf127-1.pep TFICKKSASSCSDGLDYFKGNDKDCKLLKX
||||||||||||||||||||||||||||||
orf127ng-1 TFICKKSASSCSDGLDYFKGNDKDCKLLKX
130 140 150
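This analysis, including the fact that predicted transmembrane domains are present in both the meningococcal and gonococcal proteins, suggests that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 98
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 769>: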
1 ..GTGTCGCTGG CTTCGGTGAT TGCCTCTCAA ATCTTCCTTT ACGAAGATTT
51 CAACCAAATG CGGAAAACCC GTGGAGCTAT CTGCGGTTTT CTTGTCCAAT
101 ATTTATCTGG GGTTTCAGCA GGGGTATTTC GATTTGAGTG CCGACGAGAA
151 CCCCGTACTG CATATCTGGT CTTTGGCAGT AGAGGAACAG TATTACCTCC
201 TGTATCCCCT TTTGCTGATA TTTTGCTGCA AAAAAACCAA ATCGCTACGG
251 GTGCTGCGTA ACATCAGCAT CATCCTGTTT TTGATTTTGA CTGCCTCATC
301 GTTTTTGCCA AGCGGGTTTT ATACCGACAT CCTCAACCAA CCCAATACTT
351 ATTACCTTTC GACACTGAGG TTTCCCGAGC TGTTGGCAGG TTCGCTGCTG
401 GCGGTTTACG GGCAAACGCA AAACGGCAGA CGGCAAACAG CAAATGGAAA
451 ACGGCAGTTG CTTTCATCAC TCTGCTTCGG CGCATTGCTT GCCTGCCTGT
501 TCGTGATTGA CAAACACAAT CCGTTTATCC CGGGAATGAC CCTGCTCCTT
551 CCCTGCCTGC TGACGGCACT GCTTATCCGG AGTATGCAAT ACGGGACACT
601 TCCGACCCGC ATCCTGTCGG CAAGCCCCAT CGTATTTGTC GGCAAAATCT
651 CTTATTCCCT ATACCTGTAC CATTGGATTT TTATTGCTTT CGCTCCGCTC
701 ATTAGAGGCG GGAAACAGCT CGGACTGCCT GCCG..
This corresponds to the amino acid sequence <SEQ ID 770; ORF128>:
1 ..VSLASVIASQ IFLYEDFNQM RKTVELSAVF LSNIYLGFQQ GYFDLSADEN
51 PVLHIWSLAV EEQYYLLYPL LLIFCCKKTK SLRVLRNISI ILFLILTASS
101 FLPSGFYTDI LNQPNTYYLS TLRFPELLAG SLLAVYGQTQ NGRRQTANGK
151 RQLLSSLCFG ALLACLFVID KHNPFIPGMT LLLPCLLTAL LIRSMQYGTL
201 PTRILSASPI VFVGKISYSL YLYHWIFIAF APLIRGGKQL GLPA..
Further work revealed the complete nucleotide sequence <SEQ ID 771>:
1 ATGCAAGCTG TCCGATACAG ACCGGAAATT GACGGATTGC GGGCCGTCGC
51 CGTGCTATCC GTCATGATTT TCCACCTGAA TAACCGCTGG CTGCCCGGAG
101 GATTCCTGGG GGTGGACATT TTCTTTGTCA TCTCAGGATT CCTCATTACC
151 GGCATCATTC TTTCTGAAAT ACAGAACGGT TCTTTTTCTT TCCGGGATTT
201 TTATACCCGC AGGATTAAGC GGATTTATCC TGCCTTTATT GCGGCCGTGT
251 CGCTGGCTTC GGTGATTGCC TCTCAAATCT TCCTTTACGA AGATTTCAAC
301 CAAATGCGGA AAACCGTGGA GCTTTCTGCG GTTTTCTTGT CCAATATTTA
351 TCTGGGGTTT CAGCAGGGGT ATTTCGATTT GAGTGCCGAC GAGAACCCCG
401 TACTGCATAT CTGGTCTTTG GCAGTAGAGG AACAGTATTA CCTCCTGTAT
451 CCCCTTTTGC TGATATTTTG CTGCAAAAAA ACCAAATCGC TACGGGTGCT
501 GCGTAACATC AGCATCATCC TGTTTTTGAT TTTGACTGCC TCATCGTTTT
551 TGCCAAGCGG GTTTTATACC GACATCCTCA ACCAACCCAA TACTTATTAC
601 CTTTCGACAC TGAGGTTTCC CGAGCTGTTG GCAGGTTCGC TGCTGGCGGT
651 TTACGGGCAA ACGCAAAACG GCAGACGGCA AACAGCAAAT GGAAAACGGC
701 AGTTGCTTTC ATCACTCTGC TTCGGCGCAT TGCTTGCCTG CCTGTTCGTG
751 ATTGACAAAC ACAATCCGTT TATCCCGGGA ATGACCCTGC TCCTTCCCTG
801 CCTGCTGACG GCACTGCTTA TCCGGAGTAT GCAATACGGG ACACTTCCGA
851 CCCGCATCCT GTCGGCAAGC CCCATCGTAT TTGTCGGCAA AATCTCTTAT
901 TCCCTATACC TGTACCATTG GATTTTTATT GCTTTCGCCC ATTACATTAC
951 AGGCGACAAA CAGCTCGGAC TGCCTGCCGT ATCGGCGGTT GCCGCGTTGA
1001 CGGCCGGATT TTCCCTGTTG AGTTATTATT TGATTGAACA GCCGCTTAGA
1051 AAACGGAAGA TGACCTTCAA AAAGGCATTT TTCTGCCTCT ATCTCGCCCC
1101 GTCCCTGATA CTTGTCGGTT ACAACCTGTA CGCAAGGGGG ATATTGAAAC
1151 AGGAACACCT CCGCCCGTTG CCCGGCGCGC CCCTTGCTGC GGAAAATCAT
1201 TTTCCGGAAA CCGTCCTGAC CCTCGGCGAC TCGCACGCCG GACACCTGAG
1251 GGGGTTTCTG GATTATGTCG GCAGCCGGGA AGGGTGGAAA GCCAAAATCC
1301 TGTCCCTCGA TTCGGAGTGT TTGGTTTGGG TAGATGAGAA GCTGGCAGAC
1351 AACCCGTTAT GTCGAAAATA CCGGGATGAA GTTGAAAAAG CCGAAGCCGT
1401 TTTCATTGCC CAATTCTATG ATTTGAGGAT GGGCGGCCAG CCTGTGCCGA
1451 GATTTGAAGC GCAATCCTTC CTAATACCCG GGTTCCCAGC CCGATTCAGG
1501 GAAACCGTCA AAAGGATAGC CGCCGTCAAA CCCGTCTATG TTTTTGCAAA
1551 CAACACATCA ATCAGCCGTT CGCCCCTGAG GGAGGAAAAA TTGAAAAGAT
1601 TTGCCGCAAA CCAATATCTC CGCCCCATTC AGGCTATGGG CGACATCGGC
1651 AAGAGCAATC AGGCGGTCTT TGATTTGATT AAAGATATTC CCAATGTGCA
1701 TTGGGTGGAC GCACAAAAAT ACCTGCCCAA AAACACGGTC GAAATATACG
1751 GCCGCTATCT TTACGGCGAC CAAGACCACC TGACCTATTT CGGTTCTTAT
1801 TATATGGGGC GGGAATTCCA CAAACACGAA CGCCTGCTTA AATCTTCCCA
1851 CGGCGGCGCA TTGCAGTAG
This corresponds to the amino acid sequence <SEQ ID 772; ORF128-1>:
1 MQAVRYRPEI DGLRAVAVLS VMIFHLNNRW LPGGFLGVDI FFVISGFLIT
51 GIILSEIQNG SFSFRDFYTR RIKRIYPAFI AAVSLASVIA SQIFLYEDFN
101 QMRKTVELSA VFLSNIYLGF QQGYFDLSAD ENPVLHIWSL AVEEQYYLLY
151 PLLLIFCCKK TKSLRVLRNI SIILFLILTA SSFLPSGFYT DILNQPNTYY
201 LSTLRFPELL AGSLLAVYGQ TQNGRRQTAN GKRQLLSSLC FGALLACLFV
251 IDKHNPFIPG MTLLLPCLLT ALLIRSMQYG TLPTRILSAS PIVFVGKISY
301 SLYLYHWIFI AFAHYITGDK QLGLPAVSAV AALTAGFSLL SYYLIEQPLR
351 KRKMTFKKAF FCLYLAPSLI LVGYNLYARG ILKQEHLRPL PGAPLAAENH
401 FPETVLTLGD SHAGHLRGFL DYVGSREGWK AKILSLDSEC LVWVDEKLAD
451 NPLCRKYRDE VEKAEAVFIA QFYDLRMGGQ PVPRFEAQSF LIPGFPARFR
501 ETVKRIAAVK PVYVFANNTS ISRSPLREEK LKRFAANQYL RPIQAMGDIG
551 KSNQAVFDLI KDIPNVHWVD AQKYLPKNTV EIYGRYLYGD QDHLTYFGSY
601 YMGREFHKHE RLLKSSHGGA LQ*
Computer analysis of this amino acid sequence gave the following results:
Homology with hypothetical integral membrane protein HI0392 of H. influenzae (accession number U32723)
ORF128 and HI0392 show 52% aa identity in 180 aa overlap:
Orf128:1 VSLASVIASQIFLYEDFNQMRKTVELSAVFLSNIYLGFQQGYFDLSADENPVLHIWSLAV 60
++L S IAS IF+Y DFN++RKT+EL+ FLSN YLG QGYFDLSA+ENPVLHIWSLAV
HI0392:46 MALVSFTASAIFTYNDFNKLRKTIELAIAFLSNFYLGLTQGYFDLSANENPVLHIWSLAV 105
Orf128:61 EEQXXXXXXXXXIFCCKKTKSLRVLRNISIILFLILTASSFLPSGFYTDILNQPNTYYLS 120
E Q I KK + ++VL I++ILF IL A+SF+ + FY ++L+QPN YYLS
HI0392:106 EGQYYLIFPLILILAYKKFREVKVLFIITLILFFILLATSFVSANFYKEVLHQPNIYYLS 165
Orf128:121 TLRFPELLAGSLLAVYGQTQNGRRQTANGKRQLLSSLCFGALLACLFVIDKHNPFIPGMT 180
LRFPELL GSLLA+Y N + Q + +L+ L L +CLF+++ + FIPG+T
HI0392:166 NLRFPELLVGSLLAIYHNLSN-KVQLSKQVNNILAILSTLLLFSCLFLMNNNIAFIPGIT 224
Homology with a predicted ORF from N. meningitidis (strain A)
ORF128 shows 98.0% identity over a 244 aa overlap with an ORF (ORF128a) from strain A of N. meningitidis:
10 20 30
orf128.pep VSLASVIASQIFLYEDFNQMRKTVELSAVF
||||||||||||||||||||||||||||||
orf128a ILSEIQNGSFSFRDFYTRRIKRIYPAFIAAVSLASVIASQIFLYEDFNQMRKTVELSAVF
60 70 80 90 100 110
40 50 60 70 80 90
orf128.pep LSNIYLGFQQGYFDLSADENPVLHIWSLAVEEQYYLLYPLLLIFCCKKTKSLRVLRNISI
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf128a LSNIYLGFQQGYFDLSADENPVLHIWSLAVEEQYYLLYPLLLIFCCKKTKSLRVLRNISI
120 130 140 150 160 170
100 110 120 130 140 150
orf128.pep ILFLILTASSFLPSGFYTDILNQPNTYYLSTLRFPELLAGSLLAVYGQTQNGRRQTANGK
||||||||:|||||||||||||||||||||||||||||||||||||||||||||||||||
orf128a ILFLILTATSFLPSGFYTDILNQPNTYYLSTLRFPELLAGSLLAVYGQTQNGRRQTANGK
180 190 200 210 220 230
160 170 180 190 200 210
orf128.pep RQLLSSLCFGALLACLFVIDKHNPFIPGMTLLLPCLLTALLIRSMQYGTLPTRILSASPI
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf128a RQLLSSLCFGALLACLFVIDKHNPFIPGMTLLLPCLLTALLIRSMQYGTLPTRILSASPI
240 250 260 270 280 290
220 230 240
orf128.pep VFVGKISYSLYLYHWIFIAFAPLIRGGKQLGLPA
||||||||||||||||||||| | | |||||||
orf128a VFVGKISYSLYLYHWIFIAFAHYITGDKQLGLPAVSAVAALTAGFSLLSYYLIEQPLRKR
300 310 320 330 340 350
orf128a KMTFKKAFFCLYLAPSLILVGYNLYARGILKQEHLRPLPGAPLAAENHFPETVLTLGDSH
360 370 380 390 400 410
The complete length ORF128a nucleotide sequence <SEQ ID 773> is:
1 ATGCAAGCTG TCCGATACAG ACCGGAAATT GACGGATTGC GGGCCGTCGC
51 CGTGCTATCC GTCATGATTT TCCACCTGAA TAACCGCTGG CTGCCCGGAG
101 GATTCCTGGG GGTGGACATT TTCTTTGTCA TCTCAGGATT CCTCATTACC
151 GGCATCATTC TTTCTGAAAT ACAGAACGGT TCTTTTTCTT TCCGGGATTT
201 TTATACCCGC AGGATTAAGC GGATTTATCC TGCTTTTATT GCGGCCGTGT
251 CGCTGGCTTC GGTGATTGCC TCTCAAATCT TCCTTTACGA AGATTTCAAC
301 CAAATGCGGA AAACCGTGGA GCTTTCTGCG GTTTTCTTGT CCAATATTTA
351 TCTGGGGTTT CAGCAGGGGT ATTTCGATTT GAGTGCCGAC GAGAACCCCG
401 TACTGCATAT CTGGTCTTTG GCAGTAGAGG AACAGTATTA CCTCCTGTAT
451 CCTCTTTTGC TGATATTTTG CTGCAAAAAA ACAAAATCGC TACGGGTGCT
501 GCGTAACATC AGCATCATCC TATTTCTGAT TTTGACTGCC ACATCGTTTT
551 TGCCAAGCGG GTTTTATACC GATATTCTCA ACCAACCCAA TACTTATTAC
601 CTTTCGACAC TGAGGTTTCC CGAGCTGTTG GCAGGTTCGC TGCTGGCGGT
651 TTACGGGCAA ACGCAAAACG GCAGACGGCA AACAGCAAAT GGAAAACGGC
701 AGTTGCTTTC ATCACTCTGC TTCGGCGCAT TGCTTGCCTG CCTGTTCGTG
751 ATTGACAAAC ACAATCCGTT TATCCCGGGA ATGACCCTGC TCCTTCCCTG
801 CCTGCTGACG GCACTGCTTA TCCGGAGTAT GCAATACGGG ACACTTCCGA
851 CCCGCATCCT GTCGGCAAGC CCCATCGTAT TTGTCGGCAA AATCTCTTAT
901 TCCCTATACC TGTACCATTG GATTTTTATT GCTTTCGCCC ATTACATTAC
951 AGGCGACAAA CAGCTCGGAC TGCCTGCCGT ATCGGCGGTT GCCGCGTTGA
1001 CGGCCGGATT TTCCCTGTTG AGTTATTATT TGATTGAACA GCCGCTTAGA
1051 AAACGGAAGA TGACCTTCAA AAAGGCATTT TTCTGCCTCT ATCTCGCCCC
1101 GTCCCTGATA CTTGTCGGTT ACAACCTGTA CGCAAGGGGG ATATTGAAAC
1151 AGGAACACCT CCGCCCGTTG CCCGGCGCGC CCCTTGCTGC GGAAAATCAT
1201 TTTCCGGAAA CCGTCCTGAC CCTCGGCGAC TCGCACGCCG GACACCTGCG
1251 GGGGTTTCTG GATTATGTCG GCAGCCGGGA AGGGTGGAAA GCCAAAATCC
1301 TGTCCCTCGA TTCGGAGTGT TTGGTTTGGG TAGATGAGAA GCTGGCAGAC
1351 AACCCGTTAT GTCGAAAATA CCGGGATGAA GTTGAAAAAG CCGAAGCCGT
1401 TTTCATTGCC CAATTCTATG ATTTGAGGAT GGGCGGCCAG CCCGTGCCGA
1451 GATTTGAAGC GCAATCCTTC CTAATACCCG GGTTCCCAGC CCGATTCAGG
1501 GAAACCGTCA AAAGGATAGC CGCCGTCAAA CCCGTCTATG TTTTTGCAAA
1551 CAACACATCA ATCAGCCGTT CGCCCCTGAG GGAGGAAAAA TTGAAAAGAT
1601 TTGCCGCAAA CCAATATCTC CGCCCCATTC AGGCTATGGG CGACATCGGC
1651 AAGAGCAATC AGGCGGTCTT TGATTTGATT AAAGATATTC CCAATGTGCA
1701 TTGGGTGGAC GCACAAAAAT ACCTGCCCAA AAACACGGTC GAAATATACG
1751 GCCGCTATCT TTACGGCGAC CAAGACCACC TGACCTATTT CGGTTCTTAT
1801 TATATGGGGC GGGAATTTCA CAAACACGAA CGCCTGCTTA AATCTTCTCG
1851 CGACGGCGCA TTGCAGTAG
This encodes a protein having amino acid sequence <SEQ ID 774>:
1 MQAVRYRPEI DGLRAVAVLS VMIFHLNNRW LPGGFLGVDI FFVISGFLIT
51 GIILSEIQNG SFSFRDFYTR RIKRIYPAFI AAVSLASVIA SQIFLYEDFN
101 QMRKTVELSA VFLSNIYLGF QQGYFDLSAD ENPVLHIWSL AVEEQYYLLY
151 PLLLIFCCKK TKSLRVLRNI SIILFLILTA TSFLPSGFYT DILNQPNTYY
201 LSTLRFPELL AGSLLAVYGQ TQNGRRQTAN GKRQLLSSLC FGALLACLFV
251 IDKHNPFIPG MTLLLPCLLT ALLIRSMQYG TLPTRILSAS PIVFVGKISY
301 SLYLYHWIFI AFAHYITGDK QLGLPAVSAV AALTAGFSLL SYYLIEQPLR
351 KRKMTFKKAF FCLYLAPSLI LVGYNLYARG ILKQEHLRPL PGAPLAAENH
401 FPETVLTLGD SHAGHLRGFL DYVGSREGWK AKILSLDSEC LVWVDEKLAD
451 NPLCRKYRDE VEKAEAVFIA QFYDLRMGGQ PVPRFEAQSF LIPGFPARFR
501 ETVKRIAAVK PVYVFANNTS ISRSPLREEK LKRFAANQYL RPIQAMGDIG
551 KSNQAVFDLI KDIPNVHWVD AQKYLPKNTV EIYGRYLYGD QDHLTYFGSY
601 YMGREFHKHE RLLKSSRDGA LQ*
ORF128a and ORF128-1 show 99.5% identity in 622 aa overlap:
orf128a.pep MQAVRYRPEIDGLRAVAVLSVMIFHLNNRWLPGGFLGVDIFFVISGFLITGIILSEIQNG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf128-1 MQAVRYRPEIDGLRAVAVLSVMIFHLNNRWLPGGFLGVDIFFVISGFLITGIILSEIQNG
orf128a.pep SFSFRDFYTRRIKRIYPAFIAAVSLASVIASQIFLYEDFNQMRKTVELSAVFLSNIYLGF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf128-1 SFSFRDFYTRRIKRIYPAFIAAVSLASVIASQIFLYEDFNQMRKTVELSAVFLSNIYLGF
orf128a.pep QQGYFDLSADENPVLHIWSLAVEEQYYLLYPLLLIFCCKKTKSLRVLRNISIILFLILTA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf128-1 QQGYFDLSADENPVLHIWSLAVEEQYYLLYPLLLIFCCKKTKSLRVLRNISIILFLILTA
orf128a.pep TSFLPSGFYTDILNQPNTYYLSTLRFPELLAGSLLAVYGQTQNGRRQTANGKRQLLSSLC
:|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf128-1 SSFLPSGFYTDILNQPNTYYLSTLRFPELLAGSLLAVYGQTQNGRRQTANGKRQLLSSLC
orf128a.pep FGALLACLFVIDKHNPFIPGMTLLLPCLLTALLIRSMQYGTLPTRILSASPIVFVGKISY
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf128-1 FGALLACLFVIDKHNPFIPGMTLLLPCLLTALLIRSMQYGTLPTRILSASPIVFVGKISY
orf128a.pep SLYLYHWIFIAFAHYITGDKQLGLPAVSAVAALTAGFSLLSYYLIEQPLRKRKMTFKKAF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf128-1 SLYLYHWIFIAFAHYITGDKQLGLPAVSAVAALTAGFSLLSYYLIEQPLRKRKMTFKKAF
orf128a.pep FCLYLAPSLILVGYNLYARGILKQEHLRPLPGAPLAAENHFPETVLTLGDSHAGHLRGFL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf128-1 FCLYLAPSLILVGYNLYARGILKQEHLRPLPGAPLAAENHFPETVLTLGDSHAGHLRGFL
orf128a.pep DYVGSREGWKAKILSLDSECLVWVDEKLADNPLCRKYRDEVEKAEAVFIAQFYDLRMGGQ
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf128-1 DYVGSREGWKAKILSLDSECLVWVDEKLADNPLCRKYRDEVEKAEAVFIAQFYDLRMGGQ
orf128a.pep PVPRFEAQSFLIPGFPARFRETVKRIAAVKPVYVFANNTSISRSPLREEKLKRFAANQYL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf128-1 PVPRFEAQSFLIPGFPARFRETVKRIAAVKPVYVFANNTSISRSPLREEKLKRFAANQYL
orf128a.pep RPIQAMGDIGKSNQAVFDLIKDIPNVHWVDAQKYLPKNTVEIYGRYLYGDQDHLTYFGSY
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf128-1 RPIQAMGDIGKSNQAVFDLIKDIPNVHWVDAQKYLPKNTVEIYGRYLYGDQDHLTYFGSY
orf128a.pep YMGREFHKHERLLKSSRDGALQX
||||||||||||||||: |||||
orf128-1 YMGREFHKHERLLKSSHGGALQX
Homology with a predicted ORF from N. gonorrhoeae
ORF128 shows 93.4% identity over a 244 aa overlap with a predicted ORF (ORF128ng) from N. gonorrhoeae:
orf128.pep VSLASVIASQIFLYEDFNQMRKTVELSAVF 30
|||||||||||||||||||||||:|||:||
orf128ng ILSEIQNGSFSFRDFYTRRIKRIYPAFIAAVSLASVIASQIFLYEDFNQMRKTIELSTVF 112
orf128.pep LSNIYLGFQQGYFDLSADENPVLHIWSLAVEEQYYLLYPLLLIFCCKKTKSLRVLRNISI 90
||||||||: ||||||||||||||||||||||||||||||||||| ||||||||||||||
orf128ng LSNIYLGFRLGYFDLSADENPVLHIWSLAVEEQYYLLYPLLLIFCYKKTKSLRVLRNISI 172
orf128.pep ILFLILTASSFLPSGFYTDILNQPNTYYLSTLRFPELLAGSLLAVYGQTQNGRRQTANGK 150
|||||||||||||:||||||||||||||||||||||||:||||||||||||||||| |||
orf128ng ILFLILTASSFLPAGFYTDILNQPNTYYLSTLRFPELLVGSLLAVYGQTQNGRRQTENGK 232
orf128.pep RQLLSSLCFGALLACLFVIDKHNPFIPGMTLLLPCLLTALLIRSMQYGTLPTRILSASPI 210
||||| |||||||:||||||||:|||||:|||||||||||||||||||||||||||||||
orf128ng RQLLSLLCFGALLVCLFVIDKHDPFIPGITLLLPCLLTALLIRSMQYGTLPTRILSASPI 292
orf128.pep VFVGKISYSLYLYHWIFIAFAPLIRGGKQLGLPA 244
||||||||||||||||||||| | | |||||||
orf128ng VFVGKISYSLYLYHWIFIAFAHYITGDKQLGLPAVSAVAALTAGFSLLSYYLIEQPLRKR 352
The complete length ORF128ng nucleotide sequence <SEQ ID 775> is:
1 ATGCAAGCTG TCCGATACAG GCCTGAAATT GACGGATTGC GGGCCGTCGC
51 CGTGCTATCC GTCATTATTT TCCACCTGAA TAACCGCTGG CTGCCCGGAG
101 GATTCCTGGG GGTGGACATT TTCTTTGTCA TCTCGGGATT CCTCATTACC
151 AACATCATTC TTTCTGAAAT ACAGAACGGT TCTTTTTCTT TCCGGGATTT
201 TTATACCCGC AGGATTAAGC GGATTTATCC TGCTTTTATT GCGGCCGTGT
251 CCCTGGCTTC GGTGATTGCT TCTCAAATCT TCCTTTACGA AGATTTCAAC
301 CAAATGAGGA AAACCATAGA GCTTTCTACG GTTTTTTTGT CCAATATTTA
351 TTTGGGGTTC CGATTGGGGT ATTTCGATTT GAGTGCCGAC GAGAACCCCG
401 TACTGCATAT CTGGTCTTTG GCGGTAGAGG AACAGTATTA CCTCCTGTAT
451 CCTCTTTTGC TGATATTCTG TTACAAAAAA ACCAAATCAC TACGGGTGCT
501 GCGTAATATC AGCATCATCC TGTTTCTGAT TTTGACCGCA TCATCGTTTT
551 TGCCGGCCGG GTTTTATACC GACATCCTCA ACCAACCcaa TACTTATTAC
601 CTTTCGACAC TGAGGTTTCC CGAGCTGTTG GTGGGTTCGC TGTTGGCGGT
651 TTACGGGCAA ACGCAAAACG GCAGACGGCA AACAGAAAAT GGAAAACGGC
701 AGTTGCTTTC ATTACTCTGT TTCGGCGCat tgCTTGTCTG CCTGTTCGTG
751 ATCGACAAAC ACGATCCGTT TATCCCGGGA ATAACCCTGC TCCTTCCCTG
801 CCTGCTGACG GCGCTGCTTA TCCGGAGTAT GCAATACGGG ACACTTCCGA
851 CCCGCATCCT GTCGGCAAGC CCCATCGTAT TTGTCGGCAA AATCTCTTAT
901 TCCCTATACC TGTACCATTG GATTTTTATT GCCTTCGCCC ATTACATTAC
951 AGGCGACAAA CAGCTCGGAC TGCCTGCCGT ATCGGCGGTT GCCGCGTTGA
1001 CGGCCGGATT TTCCCTGTTG AGCTATTATT TGATTGAACA GCCGCTTAGA
1051 AAACGGAAGA TGACCTTCAA AAAGGCATTT TTCTGCCTTT ATCTCGCCCC
1101 GTCCCTGATG CTTGTCGGTT ACAACCTGTA TTCAAGAGGG ATATTGAAAC
1151 AGGAACACCT CCGCCCGCTG CCCGGCACGC CCGTTGCTGC GGAAAATAAT
1201 TTTCCGGAAA CCGTCTTGAC CCTCGGCGAC TCGCACGCCG GACACCTGCG
1251 GGGGTTTCTG GATTATGTCG GCGGCAGGGA AGGGTGGAAA GCTAAAATCC
1301 TGTCCCTCGA TTCGGAGTGT TTGGTTTGGG TGGATGAGAA GCTGGCAGAC
1351 AACCCGTTGT GCCGAAAATA CCGGGATGAA GTTGAAAAAG CCGAAGCTGT
1401 TTTCATTGCC CAATTCTATG ATTTGAGGAT GGGCGGCCAG CCCGTGCCGA
1451 GATTTGAAGC GCAATCCTTC CTGATACCCG GGTTCAAAGC CCGATTCAGG
1501 GAAACCGTCA AGAGGATAGC CGCCGTCAAA CCTGTATATG TTTTTGCAAA
1551 CAATACATCA ATCAGCCGTT CTCCCTTGAG GGAGGAAAAA TTGAAAAGAT
1601 TTGCTATAAA CCAATACCTC CGGCCTATTC GGGCTATGGG CGACATCGGC
1651 AAGAGCAATC AGGCGGTCTT TGATTTGGTT AAAGATATTC CCAATGTGCA
1701 TTGGGTGGAC GCACAAAAAT ACCTGCCCAA AAACACGGTC GAAATACACG
1751 GACGCTATCT TTACGGCGAC CAAGACCACC TGACCTATTT CGGTTCTTAT
1801 TATATGGGGC GGGAATTTCA CAAACACGAA CGCCTGCTCA AGCATTCCCG
1851 AGGCGGCGCA TTGCAGTAG
This encodes a protein having the amino acid sequence <SEQ ID 776>:
1 MQAVRYRPEI DGLRAVAVLS VIIFHLNNRW LPGGFLGVDI FFVISGFLIT
51 NIILSEIQNG SFSFRDFYTR RIKRIYPAFI AAVSLASVIA SQIFLYEDFN
101 QMRKTIELST VFLSNIYLGF RLGYFDLSAD ENPVLHIWSL AVEEQYYLLY
151 PLLLIFCYKK TKSLRVLRNI SIILFLILTA SSFLPAGFYT DILNQPNTYY
201 LSTLRFPELL VGSLLAVYGQ TQNGRRQTEN GKRQLLSLLC FGALLVCLFV
251 IDKHDPFIPG ITLLLPCLLT ALLIRSMQYG TLPTRILSAS PIVFVGKISY
301 SLYLYHWIFI AFAHYITGDK QLGLPAVSAV AALTAGFSLL SYYLIEQPLR
351 KRKMTFKKAF FCLYLAPSLM LVGYNLYSRG ILKQEHLRPL PGTPVAAENN
401 FPETVLTLGD SHAGHLRGFL DYVGGREGWK AKILSLDSEC LVWVDEKLAD
451 NPLCRKYRDE VEKAEAVFIA QFYDLRMGGQ PVPRFEAQSF LIPGFKARFR
501 ETVKRIAAVK PVYVFANNTS ISRSPLREEK LKRFAINQYL RPIRAMGDIG
551 KSNQAVFDLV KDIPNVHWVD AQKYLPKNTV EIHGRYLYGD QDHLTYFGSY
601 YMGREFHKHE RLLKHSRGGA LQ*
ORF128ng and ORF128-1 show 95.7% identity over a 622 aa overlap:
orf128-1.pep MQAVRYRPEIDGLRAVAVLSVMIFHLNNRWLPGGFLGVDIFFVISGFLITGIILSEIQNG
|||||||||||||||||||||:||||||||||||||||||||||||||||:|||||||||
orf128ng MQAVRYRPEIDGLRAVAVLSVIIFHLNNRWLPGGFLGVDIFFVISGFLITNIILSEIQNG
orf128-1.pep SFSFRDFYTRRIKRIYPAFIAAVSLASVIASQIFLYEDFNQMRKTVELSAVFLSNIYLGF
|||||||||||||||||||||||||||||||||||||||||||||:|||:||||||||||
orf128ng SFSFRDFYTRRIKRIYPAFIAAVSLASVIASQIFLYEDFNQMRKTIELSTVFLSNIYLGF
orf128-1.pep QQGYFDLSADENPVLHIWSLAVEEQYYLLYPLLLIFCCKKTKSLRVLRNISIILFLILTA
: ||||||||||||||||||||||||||||||||||| ||||||||||||||||||||||
orf128ng RLGYFDLSADENPVLHIWSLAVEEQYYLLYPLLLIFCYKKTKSLRVLRNISIILFLILTA
orf128-1.pep SSFLPSGFYTDILNQPNTYYLSTLRFPELLAGSLLAVYGQTQNGRRQTANGKRQLLSSLC
|||||:||||||||||||||||||||||||:||||||||||||||||| |||||||| ||
orf128ng SSFLPAGFYTDILNQPNTYYLSTLRFPELLVGSLLAVYGQTQNGRRQTENGKRQLLSLLC
orf128-1.pep FGALLACLFVIDKHNPFIPGMTLLLPCLLTALLIRSMQYGTLPTRILSASPIVFVGKISY
|||||:||||||||:|||||:|||||||||||||||||||||||||||||||||||||||
orf128ng FGALLVCLFVIDKHDPFIPGITLLLPCLLTALLIRSMQYGTLPTRILSASPIVFVGKISY
orf128-1.pep SLYLYHWIFIAFAHYITGDKQLGLPAVSAVAALTAGFSLLSYYLIEQPLRKRKMTFKKAF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf128ng SLYLYHWIFIAFAHYITGDKQLGLPAVSAVAALTAGFSLLSYYLIEQPLRKRKMTFKKAF
orf128-1.pep FCLYLAPSLILVGYNLYARGILKQEHLRPLPGAPLAAENHFPETVLTLGDSHAGHLRGFL
|||||||||:|||||||:||||||||||||||:|:||||:||||||||||||||||||||
orf128ng FCLYLAPSLMLVGYNLYSRGILKQEHLRPLPGTPVAAENNFPETVLTLGDSHAGHLRGFL
orf128-1.pep DYVGSREGWKAKILSLDSECLVWVDEKLADNPLCRKYRDEVEKAEAVFIAQFYDLRMGGQ
||||:|||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf128ng DYVGGREGWKAKILSLDSECLVWVDEKLADNPLCRKYRDEVEKAEAVFIAQFYDLRMGGQ
orf128-1.pep PVPRFEAQSFLIPGFPARFRETVKRIAAVKPVYVFANNTSISRSPLREEKLKRFAANQYL
||||||||||||||| ||||||||||||||||||||||||||||||||||||||| ||||
orf128ng PVPRFEAQSFLIPGFKARFRETVKRIAAVKPVYVFANNTSISRSPLREEKLKRFAINQYL
orf128-1.pep RPIQAMGDIGKSNQAVFDLIKDIPNVHWVDAQKYLPKNTVEIYGRYLYGDQDHLTYFGSY
|||:|||||||||||||||:||||||||||||||||||||||:|||||||||||||||||
orf128ng RPIRAMGDIGKSNQAVFDLVKDIPNVHWVDAQKYLPKNTVEIHGRYLYGDQDHLTYFGSY
orf128-1.pep YMGREFHKHERLLKSSHGGALQX
|||||||||||||| |:||||||
orf128ng YMGREFHKHERLLKHSRGGALQX
610 620
In addition, ORF128ng shows homology with a hypothetical H. influenzae protein:
sp|P43993|Y392_HAEIN hypothetical protein HI0392 >gi|1074385|pir||B64007 hypothetical protein HI0392 - Haemophilus influenzae (strain Rd KW20)
>gi|1573364 (U32723) H. influenzae predicted coding region HI0392 [Haemophilus influenzae] Length = 245
Score = 239 bits (604), Expect = 3e-62
Identities = 124/225 (55%), Positives = 152/225 (67%), Gaps = 1/225 (0%)
Query: 38 VDIFFVISGFLITNIILSEIQNGSFSFRDFYTRRIKRIYPXXXXXXXXXXXXXXXXFLYE 97
+DIFFVISGFLIT II++EIQ SFS + FYTRRIKRIYP F+Y
Sbjct: 1 MDIFFVISGFLITGIIITEIQQNSFSLKQFYTRRIKRIYPAFITVMALVSFIASAIFIYN 60
Query: 98 DFNQMRKTIELSTVFLSNIYLGFRLGYFDLSADENPVLHIWSLAVEEQXXXXXXXXXIFC 157
DFN++RKTIEL+ FLSN YLG GYFDLSA+ENPVLHIWSLAVE Q I
Sbjct: 61 DFNKLRKTIELAIAFLSNFYLGLTQGYFDLSANENPVLHIWSLAVEGQYYLIFPLILILA 120
Query: 158 YKKTKSLRVLRNISIILFLILTASSFLPAGFYTDILNQPNTYYLSTLRFPELLVGSLLAV 217
YKK + ++VL I++ILF IL A+SF+ A FY ++L+QPN YYLS LRFPELLVGSLLA+
Sbjct: 121 YKKFREVKVLFIITLILFFILLATSFVSANFYKEVLHQPNIYYLSNLRFPELLVGSLLAI 180
Query: 218 YGQTQNGRRQTENGKRQLLSLLCFGALLVCLFVIDKHDPFIPGIT 262
Y N + Q +L++L L CLF+++ + FIPGIT
Sbjct: 181 YHNLSN-KVQLSKQVNNILAILSTLLLFSCLFLMNNNIAFIPGIT 224
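The percentages in a BLAST summary line can be recomputed from the raw counts. The following Python sketch is illustrative only: the helper name `blast_percent` is hypothetical, and it assumes the reported values are integer-truncated rather than rounded, an assumption that is consistent with the figures reported here (152/225 is reported as 67%).

```python
def blast_percent(count: int, length: int) -> int:
    """Return count/length as an integer-truncated percentage.

    Truncation (not rounding) is an assumption made for this sketch.
    """
    if length <= 0:
        raise ValueError("alignment length must be positive")
    return 100 * count // length


# Raw counts taken from the ORF128ng / HI0392 summary line.
identities = blast_percent(124, 225)
positives = blast_percent(152, 225)
gaps = blast_percent(1, 225)
print(identities, positives, gaps)
```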
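This analysis, including the identification of several putative transmembrane domains, suggests that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.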
Example 99
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 777>:
1 ..ATTATTTACG AATACCGCTG GATGTTTCTT TACGGCGCAC TGACGACCTT
51 GGGGCTGACG GTCGTGGCAA C.GCGGGCGG TTCGGTATTG GGTCTGTTGT
101 TGGCGTTGGC GCGCCTGATT CACTTGGAAA AAGCCGGTGC GCCGATGCGC
151 GTGCTGGCGT GGGCGTTGCG TAAAGTTTCG CTGCTGTATG TTACGCTGTT
201 CCGGGGTACG CCGCTGTTTG TGCAGATTGT GATTTGGGCG TATGTGTGGT
251 TTCCGTTTTT CGTC..
This corresponds to the amino acid sequence <SEQ ID 778; ORF129>:
1 ..IIYEYRWMFL YGALTTLGLT VVAXAGGSVL GLLLALARLI HLEKAGAPMR
51 VLAWALRKVS LLYVTLFRGT PLFVQIVIWA YVWFPFFV..
Further work revealed the complete nucleotide sequence <SEQ ID 779>:
1 ATGGATTTTC GTTTTGACAT TATTTACGAA TACCGCTGGA TGTTTCTTTA
51 CGGCGCACTG ACGACCTTGG GGCTGACGGT CGTGGCAACG GCGGGCGGTT
101 CGGTATTGGG TCTGTTGTTG GCGTTGGCGC GCCTGATTCA CTTGGAAAAA
151 GCCGGTGCGC CGATGCGCGT GCTGGCGTGG GCGTTGCGTA AAGTTTCGCT
201 GCTGTATGTT ACGCTGTTCC GGGGTACGCC GCTGTTTGTG CAGATTGTGA
251 TTTGGGCGTA TGTGTGGTTT CCGTTTTTCG TCCATCCTTC AGACGGCATT
301 TTGGTCAGCG GCGAGGCGGC AATCGCGCTG CGTCGCGGAT ACGGGCCGCT
351 GATTGCCGGT TCTTTGGCAC TGATCGCCAA CTCGGGGGCG TATATCTGTG
401 AGATTTTCCG CGCGGGCATC CAGTCTATAG ACAAAGGACA GATGGAGGCG
451 GCGCGTTCTT TGGGGCTGAC CTATCCGCAG GCGATGCGCT ATGTGATTCT
501 GCCGCAGGCA TTGCGCCGCA TGCTGCCGCC TTTGGCGAGC GAGTTCATCA
551 CGCTCTTGAA AGACAGCTCG CTGCTGTCGG TCATTGCTGT GGCGGAGTTG
601 GCGTATGTTC AGAATACGAT TACGGGCCGG TATTCGGTTT ATGAAGAACC
651 GCTTTACACC GTCGCCCTGA TTTATCTGTT GATGACGACT TTCTTAGGCT
701 GGATATTCCT GCGTTTGGAA AAACGTTACA ATCCGCAACA CCGCTGA
This corresponds to the amino acid sequence <SEQ ID 780; ORF129-1>:
1 MDFRFDIIYE YRWMFLYGAL TTLGLTVVAT AGGSVLGLLL ALARLIHLEK
51 AGAPMRVLAW ALRKVSLLYV TLFRGTPLFV QIVIWAYVWF PFFVHPSDGI
101 LVSGEAAIAL RRGYGPLIAG SLALIANSGA YICEIFRAGI QSIDKGQMEA
151 ARSLGLTYPQ AMRYVILPQA LRRMLPPLAS EFITLLKDSS LLSVIAVAEL
201 AYVQNTITGR YSVYEEPLYT VALIYLLMTT FLGWIFLRLE KRYNPQHR*
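The correspondence between the nucleotide sequence <SEQ ID 779> and the amino acid sequence <SEQ ID 780> can be checked by translating codons with the standard genetic code. The sketch below is a minimal illustration, not the method used in the source; the names `CODON_TABLE` and `translate` are hypothetical, and any codon containing a character other than A, C, G or T is reported as `X`.

```python
BASES = "TCAG"
# Standard genetic code, ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; spaces and position numbers are ignored."""
    seq = "".join(ch for ch in dna.upper() if ch.isalpha())
    usable = len(seq) - len(seq) % 3  # drop an incomplete trailing codon
    return "".join(
        CODON_TABLE.get(seq[pos:pos + 3], "X") for pos in range(0, usable, 3)
    )


# First 30 and last 12 nucleotides of <SEQ ID 779>.
start = translate("ATGGATTTTC GTTTTGACAT TATTTACGAA")
end = translate("CAACA CCGCTGA")
print(start, end)
```

Ambiguity codes and the `.` placeholders found in the partial sequences fall outside the table, which is why they surface as `X` in the partial ORFs.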
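Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF129 shows 98.9% identity over an 88 aa overlap with an ORF (ORF129a) from strain A of N. meningitidis: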
10 20 30 40 50
orf129.pep IIYEYRWMFLYGALTTLGLTVVAXAGGSVLGLLLALARLIHLEKAGAPMRVLAW
|||||||||||||||||||||||:||||||||||||||||||||||||||||||
orf129a MDFRFDIIYEYRWMFLYGALTTLGLTVVATAGGSVLGLLLALARLIHLEKAGAPMRVLAW
10 20 30 40 50 60
60 70 80
orf129.pep ALRKVSLLYVTLFRGTPLFVQIVIWAYVWFPFFV
||||||||||||||||||||||||||||||||||
orf129a ALRKVSLLYVTLFRGTPLFVQIVIWAYVWFPFFVHPSDGILVSGEAAIALRRGYGPLIAG
70 80 90 100 110 120
orf129a SLALIANSGAYICEIFRAGIQSIDKGQMEAARSLGLTYPQAMRYVILPQALRRMLPPLAS
130 140 150 160 170 180
The complete length ORF129a nucleotide sequence <SEQ ID 781> is:
1 ATGGATTTTC GTTTTGACAT TATTTACGAA TACCGCTGGA TGTTTCTTTA
51 CGGCGCACTG ACGACCTTGG GGCTGACGGT CGTGGCGACG GCGGGCGGTT
101 CGGTATTGGG TCTGTTGTTG GCGTTGGCGC GCCTGATTCA CTTGGAAAAA
151 GCCGGTGCGC CGATGCGCGT GCTGGCGTGG GCGTTGCGTA AGGTTTCGCT
201 GCTGTATGTT ACGCTGTTCC GGGGTACGCC GCTGTTTGTG CAGATTGTGA
251 TTTGGGCGTA TGTGTGGTTT CCGTTTTTCG TCCATCCTTC AGACGGCATT
301 TTGGTTAGCG GCGAGGCGGC AATCGCGCTG CGTCGCGGAT ACGGGCCGCT
351 GATTGCCGGT TCTTTGGCAC TGATCGCCAA CTCGGGGGCG TATATCTGTG
401 AGATTTTCCG CGCGGGCATC CAGTCTATAG ACAAAGGACA GATGGAGGCG
451 GCGCGTTCTT TGGGGCTGAC CTATCCGCAG GCGATGCGCT ATGTGATTCT
501 GCCGCAGGCA TTGCGCCGTA TGCTGCCGCC TTTGGCGAGC GAGTTCATCA
551 CGCTCTTGAA AGACAGCTCG CTGCTGTCGG TCATTGCTGT GGCGGAGTTG
601 GCGTATGTTC AGAATACGAT TACGGGCCGG TATTCGGTTT ATGAAGAACC
651 GCTTTACACC GTCGCCCTGA TTTATCTGTT GATGACGACT TTCTTAGGCT
701 GGATATTCCT GCGTTTGGAA AAACGTTACA ATCCGCAACA CCGCTGA
This encodes a protein having the amino acid sequence <SEQ ID 782>:
1 MDFRFDIIYE YRWMFLYGAL TTLGLTVVAT AGGSVLGLLL ALARLIHLEK
51 AGAPMRVLAW ALRKVSLLYV TLFRGTPLFV QIVIWAYVWF PFFVHPSDGI
101 LVSGEAAIAL RRGYGPLIAG SLALIANSGA YICEIFRAGI QSIDKGQMEA
151 ARSLGLTYPQ AMRYVILPQA LRRMLPPLAS EFITLLKDSS LLSVIAVAEL
201 AYVQNTITGR YSVYEEPLYT VALIYLLMTT FLGWIFLRLE KRYNPQHR*
ORF129a and ORF129-1 show 100.0% identity over a 248 aa overlap:
orf129a.pep MDFRFDIIYEYRWMFLYGALTTLGLTVVATAGGSVLGLLLALARLIHLEKAGAPMRVLAW
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf129-1 MDFRFDIIYEYRWMFLYGALTTLGLTVVATAGGSVLGLLLALARLIHLEKAGAPMRVLAW
orf129a.pep ALRKVSLLYVTLFRGTPLFVQIVIWAYVWFPFFVHPSDGILVSGEAAIALRRGYGPLIAG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf129-1 ALRKVSLLYVTLFRGTPLFVQIVIWAYVWFPFFVHPSDGILVSGEAAIALRRGYGPLIAG
orf129a.pep SLALIANSGAYICEIFRAGIQSIDKGQMEAARSLGLTYPQAMRYVILPQALRRMLPPLAS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf129-1 SLALIANSGAYICEIFRAGIQSIDKGQMEAARSLGLTYPQAMRYVILPQALRRMLPPLAS
orf129a.pep EFITLLKDSSLLSVIAVAELAYVQNTITGRYSVYEEPLYTVALIYLLMTTFLGWIFLRLE
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf129-1 EFITLLKDSSLLSVIAVAELAYVQNTITGRYSVYEEPLYTVALIYLLMTTFLGWIFLRLE
orf129a.pep KRYNPQHRX
|||||||||
orf129-1 KRYNPQHRX
Homology with a predicted ORF from N. gonorrhoeae
ORF129 shows 98.9% identity over an 88 aa overlap with a predicted ORF (ORF129ng) from N. gonorrhoeae:
orf129.pep IIYEYRWMFLYGALTTLGLTVVAXAGGSVLGLLLALARLIHLEKAGAPMRVLAW 54
|||||||||||||||||||||||:||||||||||||||||||||||||||||||
orf129ng MDFRFDIIYEYRWMFLYGALTTLGLTVVATAGGSVLGLLLALARLIHLEKAGAPMRVLAW 60
orf129.pep ALRKVSLLYVTLFRGTPLFVQIVIWAYVWFPFFV 88
||||||||||||||||||||||||||||||||||
orf129ng ALRKVSLLYVTLFRGTPLFVQIVIWAYVWFPFFVILHTAFLGNAMRQSRRVPDKGRWIAG 120
The ORF129ng nucleotide sequence <SEQ ID 783> is predicted to encode a protein having the amino acid sequence <SEQ ID 784>:
1 MDFRFDIIYE YRWMFLYGAL TTLGLTVVAT AGGSVLGLLL ALARLIHLEK
51 AGAPMRVLAW ALRKVSLLYV TLFRGTPLFV QIVIWAYVWF PFFVILHTAF
101 LGNAMRQSRR VPDKGRWIAG SLELNCQPRG RKTRGEFPPG ESNLGTEPRN
151 PLSMGQRRFP GCENWYPPQN FIKK*
Further work revealed the following gonococcal sequence <SEQ ID 785>:
1 ATGGATTTTc gtTTTGACAT TATTTAcgaA TACCGCTGGA TGTTTCTTTA
51 CGGCGCACTG Acgaccttgg ggctgacggt cgtggcgacg gCGGGCGGTT
101 CGGtattggG TCTGTTGTTG GCGTTGGCGC GCCTGATTCA CTTGGAAAAA
151 GCCGGTGCGC CGATGCGCGT GCTGGCGTGG GCGTTGCGTA AGGTTTCGCT
201 GCTGTACGTT ACCCTGTTCC GGGGTACGCC GCTGTTTGTG CAGATTGTGA
251 TTTGGGCGTA TGTGTGGTTT CCGTTTTTCG TCCATCCTTC AGACGGCATT
301 TTGGTCAGCG GCGAGGCGGC AATCGCGCTG CGTCGCGGAT ACGGGCCGCT
351 GATTGCCGGT TCTTTGGCAC TGATCGCCAA CTCGGGGGCG TATATCTGTG
401 AGATTTTCCG CGCGGGCATC CAGTCTATAG ACAAAGGACA GATGGAGGCG
451 GCGTGTTCTT TGGGACTGAC CTATCCGCAG GCGATGCGCT ATGTGATTCT
501 GCCGCAGGCA TTGCGCCGTA TGCTGCCGCC TTTGGCGAGC GAGTTCATCA
551 CGCTCTTGAA AGACAGCTCG CTGCTGTCGG TCATTGCTGT GGCGGAGTTG
601 GCGTATGTTC AGAATACGAT TACGGGCCGG TATTCGGTTT ATGAAGAACC
651 GCTTTACACC GCCGCCCTGA TTTATCTGTT GATGACGACT TTCTTAGGCT
701 GGATATTCCT GCGTTTGGAA AAACGTTACA ATCCGCAACA CCGCTGA
This corresponds to the amino acid sequence <SEQ ID 786; ORF129ng-1>:
1 MDFRFDIIYE YRWMFLYGAL TTLGLTVVAT AGGSVLGLLL ALARLIHLEK
51 AGAPMRVLAW ALRKVSLLYV TLFRGTPLFV QIVIWAYVWF PFFVHPSDGI
101 LVSGEAAIAL RRGYGPLIAG SLALIANSGA YICEIFRAGI QSIDKGQMEA
151 ARSLGLTYPQ AMRYVILPQA LRRMLPPLAS EFITLLKDSS LLSVIAVAEL
201 AYVQNTITGR YSVYEEPLYT VALIYLLMTT FLGWIFLRLE KRYNPQHR*
ORF129ng-1 and ORF129-1 show 99.2% identity over a 248 aa overlap:
orf129-1.pep MDFRFDIIYEYRWMFLYGALTTLGLTVVATAGGSVLGLLLALARLIHLEKAGAPMRVLAW
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf129ng-1 MDFRFDIIYEYRWMFLYGALTTLGLTVVATAGGSVLGLLLALARLIHLEKAGAPMRVLAW
orf129-1.pep ALRKVSLLYVTLFRGTPLFVQIVIWAYVWFPFFVHPSDGILVSGEAAIALRRGYGPLIAG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf129ng-1 ALRKVSLLYVTLFRGTPLFVQIVIWAYVWFPFFVHPSDGILVSGEAAIALRRGYGPLIAG
orf129-1.pep SLALIANSGAYICEIFRAGIQSIDKGQMEAARSLGLTYPQAMRYVILPQALRRMLPPLAS
||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||||
orf129ng-1 SLALIANSGAYICEIFRAGIQSIDKGQMEAACSLGLTYPQAMRYVILPQALRRMLPPLAS
orf129-1.pep EFITLLKDSSLLSVIAVAELAYVQNTITGRYSVYEEPLYTVALIYLLMTTFLGWIFLRLE
||||||||||||||||||||||||||||||||||||||||:|||||||||||||||||||
orf129ng-1 EFITLLKDSSLLSVIAVAELAYVQNTITGRYSVYEEPLYTAALIYLLMTTFLGWIFLRLE
orf129-1.pep KRYNPQHRX
|||||||||
orf129ng-1 KRYNPQHRX
In addition, ORF129ng-1 shows homology with an ABC transporter from A. fulgidus:
2650409 (AE001090) glutamine ABC transporter, permease protein (glnP) [Archaeoglobus fulgidus]
Length = 224
Score = 132 bits (329), Expect = 2e-30
Identities = 86/178 (48%), Positives = 103/178 (57%), Gaps = 18/178 (10%)
Query: 65 VSLLYVTLFRGTPLFVQIVIWAYVWFPFFVHPSDGILVSGEAAIALRRGYGPLIAGSLAL 124
+S YV + RGTPL VQI+I +F P+ GI + E A G +AL
Sbjct: 58 ISTAYVEVIRGTPLLVQILI------VYFGLPAIGINLQPEPA------------GIIAL 99
Query: 125 IANSGAYICEIFRAGIQSIDKGQMEAACSLGLTYPQAMRYVILPQALRRMLPPLASEFIT 184
SGAYI EI RAGI+SI GQMEAA SLG+TY QAMRYVI PQA R +LP L +EFI
Sbjct: 100 SICSGAYIAEIVRAGIESIPIGQMEAARSLGMTYLQAMRYVIFPQAFRNILPALGNEFIA 159
Query: 185 LLKDSSLLSVIAVAELAYVQNTITGRYSVYEEPLYTAALIYLLMTTFLGWIFLRLEKR 242
LLKDSSLLSVI++ EL V I P AL YL+MT L + +K+
Sbjct: 160 LLKDSSLLSVISIVELTRVGRQIVNTTFNAWTPFLGVALFYLMMTIPLSRLVAYSQKK 217
This analysis, including the identification of several transmembrane domains in both proteins, suggests that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
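The source does not state which program identified the transmembrane domains. One common first-pass approach, shown here purely as an illustration, is a sliding-window Kyte-Doolittle hydropathy average. The function names, the 19-residue window and the 1.6 threshold are assumptions of this sketch rather than parameters taken from the source.

```python
# Kyte-Doolittle hydropathy scale.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 19) -> list:
    """Mean hydropathy of every full window (empty if protein is too short)."""
    scores = [KD[aa] for aa in protein]  # raises KeyError on non-standard residues
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_windows(protein: str, window: int = 19, threshold: float = 1.6) -> list:
    """(1-based start, mean) for windows at or above the assumed threshold."""
    profile = hydropathy_profile(protein, window)
    return [(i + 1, round(mean, 2)) for i, mean in enumerate(profile) if mean >= threshold]


# First 50 residues of ORF129-1 <SEQ ID 780>.
n_term = "MDFRFDIIYEYRWMFLYGALTTLGLTVVATAGGSVLGLLLALARLIHLEK"
candidates = candidate_windows(n_term)
print(candidates)
```

Windows that pass the threshold are only candidates; a dedicated topology predictor would be needed to assign actual membrane-spanning segments.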
Example 100
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 787>:
1 ..CTGAAAGAAT GCCGTCTGAA AGACCCTGTT TTTATTCCAA ATATCGTTTA
51 TAAGAACATC GCCATTACTT TCCTGCTCTT GCACGCCGCC GCCGAACTTT
101 GGCTGCCCGC GCAAACCGCC GGTTTTACCG CGCTCGCCGT CGGCTTCATC
151 CTGCTCGCCA AGCTGCGTGA gCTTCACCAT CACGAACTCT TACGTAAACA
201 cTACGTCCGC ACTTATTACy TGCTCCAACT CTTTGCCGCC GCAGgcTAgT
251 TTGTGGACAG GCGCGGCGwA ATTACAAAAC CTGCCCGCyT CCGCGCCCCT
301 GCACCTGATT ACCCTCGGCG GCATGATGGG CGGCGTGATG ATGGTGTGGc
351 TGACCGCCGG ACTGTGGCAC AGCGGCTTTA CCAAACTCGA CTACCCCAAA
401 CTCTGCCGCA TTGCCGTCCC CATCCTTTTC GCCGCCGCCG TCTCGCGCGC
451 TTTCTTGrTG AACGTGAACC CGrTATTTTT CATTACCGTT CCTGCGATTC
501 TGACCGCCGC CGTATTCGTA CTGTATCTTT TCrCGTTTAT ACCGATATTT
551 CGGGCGAATG CGTTTACAGA CGATCCGGAr TAr
This corresponds to the amino acid sequence <SEQ ID 788; ORF130>:
1 ..LKECRLKDPV FIPNIVYKNI AITFLLLHAA AELWLPAQTA GFTALAVGFI
51 LLAKLRELHH HELLRKHYVR TYYLLQLFAA AGSLWTGAAX LQNLPASAPL
101 HLITLGGMMG GVMMVWLTAG LWHSGFTKLD YPKLCRIAVP ILFAAAVSRA
151 FLXNVNPXFF ITVPAILTAA VFVLYLFXFI PIFRANAFTD DPE*
Further work revealed the complete nucleotide sequence <SEQ ID 789>:
1 ATGCGGCCGT TTTTCGTCGG CGCGGCGGTG CTTGCCATAC TCGGTGCGCT
51 GGTGTTTTTC ATCAACCCCG GTGCCATCGT CCTGCACCGC CAAATTTTCT
101 TGGAACTTAT GCTGCCGGCG GCATACGGCG GTTTTTTGAC TGCGGCTTTG
151 TTGGACTGGA CGGGTTTTTC GGGTAACCTG AAACCTGTCG CGACTTTGAT
201 GGCGGCATTA TTGCTCGCCG CATCCGCTAT ACTGCCCTTT TCGCCGCAAA
251 CTGCCTCGTT TTTCGTCGCC GCCTATTGGC TGGTGTTGCT GCTGTTCTGC
301 GCCCGGCTGA TTTGGCTAGA CCGAAACACC GACAACTTCG CCCTGCTAAT
351 GTTACTTGCC GCGTTCACTG TTTTTCAGAC GGCATATGCC GTCAGCGGCG
401 ATTTGAACCT GTTGCGCGCG CAAGTGCATC TAAATATGGC GGCGGTGATG
451 TTCGTATCCG TGCGCGTCAG TATTCTTTTG GGCGCGGAAG CCCTGAAAGA
501 ATGCCGTCTG AAAGACCCTG TTTTTATTCC AAATATCGTT TATAAAAACA
551 TCGCCATTAC TTTCCTGCTC TTGCACGCCG CCGCCGAACT TTGGCTGCCC
601 GCGCAAACCG CCGGTTTTAC CGCGCTCGCC GTCGGCTTCA TCCTCCTCGC
651 CAAGCTGCGT GAGCTTCACC ATCACGAACT CTTACGTAAA CACTACGTCC
701 GCACTTATTA CCTGCTCCAA CTCTTTGCCG CCGCAGGCTA TTTGTGGACA
751 GGCGCGGCGA AATTACAAAA CCTGCCCGCC TCCGCGCCCC TGCACCTGAT
801 TACCCTCGGC GGCATGATGG GCGGCGTGAT GATGGTGTGG CTGACCGCCG
851 GACTGTGGCA CAGCGGCTTT ACCAAACTCG ACTACCCCAA ACTCTGCCGC
901 ATTGCCGTCC CCATCCTTTT CGCCGCCGCC GTCTCGCGCG CTTTCTTGAT
951 GAACGTGAAC CCGATATTTT TCATTACCGT TCCTGCGATT CTGACCGCCG
1001 CCGTATTCGT ACTGTATCTT TTCACGTTTA TACCGATATT TCGGGCGAAT
1051 GCGTTTACAG ACGATCCGGA ATAA
This corresponds to the amino acid sequence <SEQ ID 790; ORF130-1>:
1 MRPFFVGAAV LAILGALVFF INPGAIVLHR QIFLELMLPA AYGGFLTAAL
51 LDWTGFSGNL KPVATLMAAL LLAASAILPF SPQTASFFVA AYWLVLLLFC
101 ARLIWLDRNT DNFALLMLLA AFTVFQTAYA VSGDLNLLRA QVHLNMAAVM
151 FVSVRVSILL GAEALKECRL KDPVFIPNIV YKNIAITFLL LHAAAELWLP
201 AQTAGFTALA VGFILLAKLR ELHHHELLRK HYVRTYYLLQ LFAAAGYLWT
251 GAAKLQNLPA SAPLHLITLG GMMGGVMMVW LTAGLWHSGF TKLDYPKLCR
301 IAVPILFAAA VSRAFLMNVN PIFFITVPAI LTAAVFVLYL FTFIPIFRAN
351 AFTDDPE*
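Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF130 shows 94.3% identity over a 193 aa overlap with an ORF (ORF130a) from strain A of N. meningitidis: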
10 20 30
orf130.pep LKECRLKDPVFIPNIVYKNIAITFLLLHAA
||||||||||||||||||||||||||||||
orf130a LNLLRAQVHLNMAAVMFVSVRVSILLGAEALKECRLKDPVFIPNVVYKNIAITFLLLHAA
140 150 160 170 180 190
40 50 60 70 80 90
orf130.pep AELWLPAQTAGFTALAVGFILLAKLRELHHHELLRKHYVRTYYLLQLFAAAGSLWTGAAX
|||||||||||||:|||||||||||||||||||||||||||||||||||||| |||||||
orf130a AELWLPAQTAGFTSLAVGFILLAKLRELHHHELLRKHYVRTYYLLQLFAAAGYLWTGAAK
200 210 220 230 240 250
100 110 120 130 140 150
orf130.pep LQNLPASAPLHLITLGGMMGGVMMVWLTAGLWHSGFTKLDyPKLCRIAVPILFAAAVSRA
||||||||||||||||||||:|||||||||||||||||||||||||||||||||||||||
orf130a LQNLPASAPLHLITLGGMMGSVMMVWLTAGLWHSGFTKLDYPKLCRIAVPILFAAAVSRA
260 270 280 290 300 310
160 170 180 190
orf130.pep FLXNVNPXFFITVPAILTAAVFVLYLFXFIPIFRANAFTDDPEX
| |||| ||||||||||||||||||::|:||||||||||||||
orf130a VLMNVNPIFFITVPAILTAAVFVLYLLTFVPIFRANAFTDDPEX
320 330 340 350
The complete length ORF130a nucleotide sequence <SEQ ID 791> is:
1 ATGCGGCCGT TTTTCGTCGG CGCGGCGGTG CTTGCCATAC TCGGTGCGCT
51 GGTGTTTTTC ATCAACCCCG GTGCCATCGT CCTGCACCGC CAAATTTTCT
101 TGGAACTTAT GCTGCCGGCG GCATACGGCG GTTTTTTGAC TGCGGCTTTG
151 TTGGACTGGA CGGGTTTTTC GGGTAACCTG AAACCTGTCG CGACTTTGAT
201 GGCGGCATTA TTGCTCGCCG CATCCGCTAT ACTGCCCTTT TCGCCGCAAA
251 CTGCCTCGTT TTTCGTCGCC GCCTATTGGC TGGTGTTGCT GCTGTTCTGC
301 GCCCGGCTGA TTTGGCTAGA CCGAAACACC GACAACTTCG CCCTGCTAAT
351 GTTACTTGCC GCGTTCACTG TTTTTCAGAC GGCATATGCC GTCAGCGGCG
401 ATTTGAACCT GTTGCGCGCG CAAGTGCATC TAAATATGGC GGCGGTGATG
451 TTCGTATCCG TGCGCGTCAG TATTCTTTTG GGCGCGGAAG CCCTGAAAGA
501 ATGCCGTCTG AAAGACCCAG TATTCATCCC CAATGTCGTC TATAAAAACA
551 TCGCCATTAC CTTCCTGCTC CTGCACGCCG CCGCCGAACT TTGGCTGCCT
601 GCGCAAACCG CCGGTTTTAC CTCGCTCGCC GTCGGCTTTA TCCTGCTTGC
651 CAAGCTGCGT GAGCTTCACC ATCACGAACT CCTGCGCAAA CACTACGTCC
701 GCACTTATTA CCTGCTCCAA CTCTTTGCCG CCGCAGGCTA TTTGTGGACA
751 GGCGCGGGGA AATTACAAAA CCTGCCCGCC TCCGCGCCCC TGCACCTGAT
801 TACCCTCGGT GGCATGATGG GCAGCGTGAT GATGGTGTGG CTGACTGCCG
851 GACTGTGGCA CAGCGGCTTT ACCAAGCTCG ACTACCCGAA ACTCTGCCGC
901 ATCGCCGTCC CCATCCTNTT CGCCGCCGCC GTTTCGCGCG CTGTTTTAAT
951 GAACGTAAAC CCGATATTCT TCATCACCGT CCCCGCAATT CTGACCGCCG
1001 CCGTGTTCGT GCTTTACCTG CTGACATTCG TACCGATCTT TCGGGCGAAC
1051 GCGTTTACAG ACGATCCGGA ATAA
This encodes a protein having the amino acid sequence <SEQ ID 792>:
1 MRPFFVGAAV LAILGALVFF INPGAIVLHR QIFLELMLPA AYGGFLTAAL
51 LDWTGFSGNL KPVATLMAAL LLAASAILPF SPQTASFFVA AYWLVLLLFC
101 ARLIWLDRNT DNFALLMLLA AFTVFQTAYA VSGDLNLLRA QVHLNMAAVM
151 FVSVRVSILL GAEALKECRL KDPVFIPNVV YKNIAITFLL LHAAAELWLP
201 AQTAGFTSLA VGFILLAKLR ELHHHELLRK HYVRTYYLLQ LFAAAGYLWT
251 GAAKLQNLPA SAPLHLITLG GMMGSVMMVW LTAGLWHSGF TKLDYPKLCR
301 IAVPILFAAA VSRAVLMNVN PIFFITVPAI LTAAVFVLYL LTFVPIFRAN
351 AFTDDPE*
ORF130a and ORF130-1 show 98.3% identity over a 357 aa overlap:
orf130a.pep MRPFFVGAAVLAILGALVFFINPGAIVLHRQIFLELMLPAAYGGFLTAALLDWTGFSGNL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf130-1 MRPFFVGAAVLAILGALVFFINPGAIVLHRQIFLELMLPAAYGGFLTAALLDWTGFSGNL
orf130a.pep KPVATLMAALLLAASAILPFSPQTASFFVAAYWLVLLLFCARLIWLDRNTDNFALLMLLA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf130-1 KPVATLMAALLLAASAILPFSPQTASFFVAAYWLVLLLFCARLIWLDRNTDNFALLMLLA
orf130a.pep AFTVFQTAYAVSGDLNLLRAQVHLNMAAVMFVSVRVSILLGAEALKECRLKDPVFIPNVV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||:|
orf130-1 AFTVFQTAYAVSGDLNLLRAQVHLNMAAVMFVSVRVSILLGAEALKECRLKDPVFIPNIV
orf130a.pep YKNIAITFLLLHAAAELWLPAQTAGFTSLAVGFILLAKLRELHHHELLRKHYVRTYYLLQ
|||||||||||||||||||||||||||:||||||||||||||||||||||||||||||||
orf130-1 YKNIAITFLLLHAAAELWLPAQTAGFTALAVGFILLAKLRELHHHELLRKHYVRTYYLLQ
orf130a.pep LFAAAGYLWTGAAKLQNLPASAPLHLITLGGMMGSVMMVWLTAGLWHSGFTKLDYPKLCR
||||||||||||||||||||||||||||||||||:|||||||||||||||||||||||||
orf130-1 LFAAAGYLWTGAAKLQNLPASAPLHLITLGGMMGGVMMVWLTAGLWHSGFTKLDYPKLCR
orf130a.pep IAVPILFAAAVSRAVLMNVNPIFFITVPAILTAAVFVLYLLTFVPIFRANAFTDDPE
|||||||||||||| |||||||||||||||||||||||||:||:|||||||||||||
orf130-1 IAVPILFAAAVSRAFLMNVNPIFFITVPAILTAAVFVLYLFTFIPIFRANAFTDDPE
Homology with a predicted ORF from N. gonorrhoeae
ORF130 shows 91.7% identity over a 193 aa overlap with a predicted ORF (ORF130ng) from N. gonorrhoeae:
orf130.pep LKECRLKDPVFIPNIVYKNIAITFLLLHAA 30
||||||||||||||||||||||||||||||
orf130ng LNLLRAQVHLNMAAVMFVSVRVSVLLGTETLKECRLKDPVFIPNVIYKNIAIT-LLLHAA 201
orf130.pep AELWLPAQTAGFTALAVGFILLAKLRELHHHELLRKHYVRTYYLLQLFAAAGSLWTGAAX 90
|||||||||||||||||||||||||||||||||||||||||||||||||||| |||||||
orf130ng AELWLPAQTAGFTALAVGFILLAKLRELHHHELLRKHYVRTYYLLQLFAAAGYLWTGAAK 261
orf130.pep LQNLPASAPLHLITLGGMMGGVMMVWLTAGLWHSGFTKLDYPKLCRIAVPILFAAAVSRA 150
|||||||||||||||||| |||||||||||||||||||||||||||||| ||||:|||||
orf130ng LQNLPASAPLHLITLGGMTGGVMMVWLTAGLWHSGFTKLDYPKLCRIAVSILFASAVSRA 321
orf130.pep FLXNVNPXFFITVPAILTAAVFVLYLFXFIPIFRANAFTDDPE 193
| |||| |||||| |||||||:|||::|:|||||||||||||
orf130ng VLMNVNPIFFITVPEILTAAVFMLYLLTFVPIFRANAFTDDPE 364
The ORF130ng nucleotide sequence <SEQ ID 793> is predicted to encode a protein having the amino acid sequence <SEQ ID 794>:
1 MNKFFTHPMR PFFVGAAVLA ILGALVFFHQ PRRYHPAPPN FLGTYAAGCI
51 RRFFDYRFVG PDGFFRQPET CRYFDGGVVA CCGCFIAVFT ATCRIFRRRL
101 LAGVAAVLRL ADLARRQHRT LRSVDVTAAF TVFQTAYAVS GDLNLLRAQV
151 HLNMAAVMFV SVRVSVLLGT ETLKECRLKD PVFIPNVIYK NIAITLLLHA
201 AAELWLPAQT AGFTALAVGF ILLAKLRELH HHELLRKHYV RTYYLLQLFA
251 AAGYLWTGAA KLQNLPASAP LHLITLGGMT GGVMMVWLTA GLWHSGFTKL
301 DYPKLCRIAV SILFASAVSR AVLMNVNPIF FITVPEILTA AVFMLYLLTF
351 VPIFRANAFT DDPE*
Further work revealed the following gonococcal DNA sequence <SEQ ID 795>:
1 ATGCGCCCGT TTTTCGTCGG TGCGGCAGTA CTTGCCATAC TCGGTGCGTT
51 GGTGTTTTTT ATCAACCCCG GCGCTATCAT CCTGCACCGC CAAATTTTCT
101 TGGAACTTAT GCTGCCGGCT GCATACGGCG GTTTTTTGAC TACCGCTTTG
151 TTGGACCGGA CGGGTTTTTC AGGCAACCTG AAACCTGCCG CTACTTTGAT
201 GGCGGTGTTG TTGCTTGTTG CGGCTGTTTT ATTGCCGTTT TTACCGCAAC
251 TTGCCGCATT TTTCGTCGCC GCCTATTGGC TGGTGTTGCT GCTGTTCTGC
301 GCCTGGCTGA TTTGGCTCGA CCGCAACACC GACAACTTCG CTCTGTTGAT
351 GTTACTTGCC GCATTTACCG TTTTTCAGAC GGCCTATGCC GTCAGCGGCG
401 ATTTGAACTT ACTGCGCGCG CAAGTGCATT TGAATATGGC GGCGGTCATG
451 TTCGTATCCG TCCGCGTCAG CGTCCTTTTG GGCACGGAAA CCCTGAAAGA
501 ATGCCGTCTG AAAGACCCCG TATTCATCCC CAACGTTATC TATAAAAACA
551 TCGCCATCAC CCTGCTGCTG CACGCCGCCG CCGAACTTTG GCTGCCCGCG
601 CAAACCGCCG GTTTTACTGC GCTTGCCGTC GGCTTCATCC TGCTCGCCAA
651 GCTGCGCGAA CTGCACCATC ACGAACTCTT ACGCAAACAC TACGTCCGCA
701 CTTATTACCT GCTCCAGCTC TTTGCCGCCG CAGGTTATCT GTGGACAGGC
751 GCGGCGAAAC TGCAAAACCT GCCCGCCTCC GCGCCCCTGC ACCTGATTAC
801 CCTCGGCGGC ATGACGGGTG GCGTGATGAT GGTGTGGCTG ACTGCCGGAC
851 TGTGGCACAG CGGCTTTACC AAACTCGACT ACCCGAAACT CTGCCGCATC
901 GCCGTCTCCA TCCTTTTCGC CTCCGCCGTT TCGCGCGCTG TTTTAATGAA
951 CGTGAATCCG ATATTCTTCA TCACCGTTCC CGAGATTCTG ACCGCCGCCG
1001 TGTTCATGCT TTACCTGCTG ACGTTCGTAC CGATTTTTCG AGCGAACGCG
1051 TTTACAGACG ATCCGGAATA A
This corresponds to the amino acid sequence <SEQ ID 796; ORF130ng-1>:
1 MRPFFVGAAV LAILGALVFF INPGAIILHR QIFLELMLPA AYGGFLTTAL
51 LDRTGFSGNL KPAATLMAVL LLVAAVLLPF LPQLAAFFVA AYWLVLLLFC
101 AWLIWLDRNT DNFALLMLLA AFTVFQTAYA VSGDLNLLRA QVHLNMAAVM
151 FVSVRVSVLL GTETLKECRL KDPVFIPNVI YKNIAITLLL HAAAELWLPA
201 QTAGFTALAV GFILLAKLRE LHHHELLRKH YVRTYYLLQL FAAAGYLWTG
251 AAKLQNLPAS APLHLITLGG MTGGVMMVWL TAGLWHSGFT KLDYPKLCRI
301 AVSILFASAV SRAVLMNVNP IFFITVPEIL TAAVFMLYLL TFVPIFRANA
351 FTDDPE*
ORF130ng-1 and ORF130-1 show 92.4% identity over a 357 aa overlap:
orf130-1.pep MRPFFVGAAVLAILGALVFFINPGAIVLHRQIFLELMLPAAYGGFLTAALLDWTGFSGNL
||||||||||||||||||||||||||:||||||||||||||||||||:|||| |||||||
orf130ng-1 MRPFFVGAAVLAILGALVFFINPGAIILHRQIFLELMLPAAYGGFLTTALLDRTGFSGNL
orf130-1.pep KPVATLMAALLLAASAILPFSPQTASFFVAAYWLVLLLFCARLIWLDRNTDNFALLMLLA
||:|||||:|||:|:::||| || |:||||||||||||||| ||||||||||||||||||
orf130ng-1 KPAATLMAVLLLVAAVLLPFLPQLAAFFVAAYWLVLLLFCAWLIWLDRNTDNFALLMLLA
orf130-1.pep AFTVFQTAYAVSGDLNLLRAQVHLNMAAVMFVSVRVSILLGAEALKECRLKDPVFIPNIV
|||||||||||||||||||||||||||||||||||||:|||:|:||||||||||||||::
orf130ng-1 AFTVFQTAYAVSGDLNLLRAQVHLNMAAVMFVSVRVSVLLGTETLKECRLKDPVFIPNVI
orf130-1.pep YKNIAITFLLLHAAAELWLPAQTAGFTALAVGFILLAKLRELHHHELLRKHYVRTYYLLQ
||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||
orf130ng-1 YKNIAIT-LLLHAAAELWLPAQTAGFTALAVGFILLAKLRELHHHELLRKHYVRTYYLLQ
orf130-1.pep LFAAAGYLWTGAAKLQNLPASAPLHLITLGGMMGGVMMVWLTAGLWHSGFTKLDYPKLCR
|||||||||||||||||||||||||||||||| |||||||||||||||||||||||||||
orf130ng-1 LFAAAGYLWTGAAKLQNLPASAPLHLITLGGMTGGVMMVWLTAGLWHSGFTKLDYPKLCR
orf130-1.pep IAVPILFAAAVSRAFLMNVNPIFFITVPAILTAAVFVLYLFTFIPIFRANAFTDDPEX
||| ||||:||||| ||||||||||||| |||||||:|||:||:||||||||||||||
orf130ng-1 IAVSILFASAVSRAVLMNVNPIFFITVPEILTAAVFMLYLLTFVPIFRANAFTDDPEX
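Based on this analysis, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.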
Example 101
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 797>:
1 ATGGAAATTC GGGCAATAAA ATATACGGCA ATGGCTGCGT TGCTTGCATT
51 TACGGTTGCA GGCTGCCGGC TGGCGGGGTG GTATGAGTGT TCGTCCCTCA
101 CCGGCTGGTG TAAGCCGAGA AAACCGGCTG CCATCGATTT TTGGGATATT
151 GGCGGCGAGA GTCCGCCGTC TTTAGGGGAC TACGAGATAC CGCTTTCAGA
201 CGGCAATAGT TCCGTCAGGG CAAACGAATA TGAATCCGCA CAACAATCTT
251 ACTTTTACAG GAAAATAGGG AAGTTTGAAG C.TGCGGGCT GGATTGGCGT
301 ACGCGTGACG GCAAACCTTT GATTGAGACG TTCAAACAGG GAGGATTTGA
351 CTGCTTGGAA AAG..
This corresponds to the amino acid sequence <SEQ ID 798; ORF131>:
1 MEIRAIKYTA MAALLAFTVA GCRLAGWYEC SSLTGWCKPR KPAAIDFWDI
51 GGESPPSLGD YEIPLSDGNS SVRANEYESA QQSYFYRKIG KFEXCGLDWR
101 TRDGKPLIET FKQGGFDCLE K..
Further work revealed the complete nucleotide sequence <SEQ ID 799>:
1 ATGGAAATTC GGGCAATAAA ATATACGGCA ATGGCTGCGT TGCTTGCATT
51 TACGGTTGCA GGCTGCCGGC TGGCGGGGTG GTATGAGTGT TCGTCCCTCA
101 CCGGCTGGTG TAAGCCGAGA AAACCGGCTG CCATCGATTT TTGGGATATT
151 GGCGGCGAGA GTCCGCCGTC TTTAGGGGAC TACGAGATAC CGCTTTCAGA
201 CGGCAATCGT TCCGTCAGGG CAAACGAATA TGAATCCGCA CAACAATCTT
251 ACTTTTACAG GAAAATAGGG AAGTTTGAAG CCTGCGGGCT GGATTGGCGT
301 ACGCGTGACG GCAAACCTTT GATTGAGACG TTCAAACAGG GAGGATTTGA
351 CTGCTTGGAA AAGCAGGGGT TGCGGCGCAA CGGTCTGTCC GAGCGCGTCC
401 GATGGTAA
This corresponds to the amino acid sequence <SEQ ID 800; ORF131-1>:
1 MEIRAIKYTA MAALLAFTVA GCRLAGWYEC SSLTGWCKPR KPAAIDFWDI
51 GGESPPSLGD YEIPLSDGNR SVRANEYESA QQSYFYRKIG KFEACGLDWR
101 TRDGKPLIET FKQGGFDCLE KQGLRRNGLS ERVRW*
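Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF131 shows 95.0% identity over a 121 aa overlap with an ORF (ORF131a) from strain A of N. meningitidis: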
10 20 30 40 50 60
orf131.pep MEIRAIKYTAMAALLAFTVAGCRLAGWYECSSLTGWCKPRKPAAIDFWDIGGESPPSLGD
|||||||||||||||||||||||||||||||||:|||||||||||||||||||||||| |
orf131a MEIRAIKYTAMAALLAFTVAGCRLAGWYECSSLSGWCKPRKPAAIDFWDIGGESPPSLED
10 20 30 40 50 60
70 80 90 100 110 120
orf131.pep YEIPLSDGNSSVRANEYESAQQSYFYRKIGKFEXCGLDWRTRDGKPLIETFKQGGFDCLE
||||||||| ||||||||||||||||||||||| ||||||||||||||||||| |||||:
orf131a YEIPLSDGNRSVRANEYESAQQSYFYRKIGKFEACGLDWRTRDGKPLIETFKQEGFDCLK
70 80 90 100 110 120
orf131.pep K
|
orf131a KQGLRRNGLSERVRWX
130
The complete length ORF131a nucleotide sequence <SEQ ID 801> is:
1 ATGGAAATTC GGGCAATAAA ATATACGGCA ATGGCTGCGT TGCTTGCATT
51 TACGGTTGCA GGCTGCCGGT TGGCAGGTTG GTATGAGTGT TCGTCCCTGT
101 CCGGCTGGTG TAAGCCGAGA AAACCTGCCG CCATCGATTT TTGGGATATT
151 GGCGGCGAGA GTCCTCCGTC TTTAGAGGAC TACGAGATAC CGCTTTCAGA
201 CGGCAATCGT TCCGTCAGGG CAAACGAATA TGAATCCGCA CAACAATCTT
251 ACTTTTACAG GAAAATAGGG AAGTTTGAAG CCTGCGGGTT GGATTGGCGT
301 ACGCGTGACG GCAAACCTTT GATTGAGACG TTCAAACAGG AAGGTTTTGA
351 TTGTTTGAAA AAGCAGGGGT TGCGGCGCAA CGGTCTGTCC GAGCGCGTCC
401 GATGGTAA
This encodes a protein having the amino acid sequence <SEQ ID 802>:
1 MEIRAIKYTA MAALLAFTVA GCRLAGWYEC SSLSGWCKPR KPAAIDFWDI
51 GGESPPSLED YEIPLSDGNR SVRANEYESA QQSYFYRKIG KFEACGLDWR
101 TRDGKPLIET FKQEGFDCLK KQGLRRNGLS ERVRW*
ORF131a and ORF131-1 show 97.0% identity over a 135 aa overlap:
orf131a.pep MEIRAIKYTAMAALLAFTVAGCRLAGWYECSSLSGWCKPRKPAAIDFWDIGGESPPSLED
|||||||||||||||||||||||||||||||||:|||||||||||||||||||||||| |
orf131-1 MEIRAIKYTAMAALLAFTVAGCRLAGWYECSSLTGWCKPRKPAAIDFWDIGGESPPSLGD
orf131a.pep YEIPLSDGNRSVRANEYESAQQSYFYRKIGKFEACGLDWRTRDGKPLIETFKQEGFDCLK
||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||:
orf131-1 YEIPLSDGNRSVRANEYESAQQSYFYRKIGKFEACGLDWRTRDGKPLIETFKQGGFDCLE
orf131a.pep KQGLRRNGLSERVRWX
||||||||||||||||
orf131-1 KQGLRRNGLSERVRWX
Homology with a predicted ORF from N. gonorrhoeae
ORF131 shows 89.3% identity over a 121 aa overlap with a predicted ORF (ORF131ng) from N. gonorrhoeae:
orf131.pep MEIRAIKYTAMAALLAFTVAGCRLAGWYECSSLTGWCKPRKPAAIDFWDIGGESPPSLGD 60
||||:||||| |||:||||||||||||||| ||:||||||||||||||||||||| || |
orf131ng MEIRVIKYTATAALFAFTVAGCRLAGWYECLSLSGWCKPRKPAAIDFWDIGGESPLSLED 60
orf131.pep YEIPLSDGNSSVRANEYESAQQSYFYRKIGKFEXCGLDWRTRDGKPLIETFKQGGFDCLE 120
||||||||| |||||||||||:||||||||||| |||||||||||||:| ||| ||||||
orf131ng YEIPLSDGNRSVRANEYESAQKSYFYRKIGKFEACGLDWRTRDGKPLVERFKQEGFDCLE 120
orf131.pep K 121
|
orf131ng KGGLRRNGLSERVRW 134
The complete length ORF131ng nucleotide sequence <SEQ ID 803> is predicted to encode a protein having the amino acid sequence <SEQ ID 804>:
1 MEIRVIKYTA TAALFAFTVA GCRLAGWYEC LSLSGWCKPR KPAAIDFWDI
51 GGESPLSLED YEIPLSDGNR SVRANEYESA QKSYFYRKIG KFEACGLDWR
101 TRDGKPLVER FKQEGFDCLE KQGLRRNGLS ERVRW*
Further work revealed the following gonococcal DNA sequence <SEQ ID 805>:
1 ATGGAAATTC GGGTAATAAA ATATACGGCA ACGGCTGCGT TGTTTGCATT
51 TACGGTTGCA GGCTGCCGGC TGGCGGGGTG GTATGAGTGT TCGTCCTTGT
101 CCGGCTGGTG TAAGCCGAGA AAACCTGCCG CCATCGATTT TTGGGATATT
151 GGCGGCGAGA GtccgctGTC TTTAGAGGAC TACGAGATAC CGCTTTCAGA
201 CGGCAATCGT TCCGTCAGGG CAAACGAATA TGAATCCGCG CAAAAATCTT
251 ACTTTTATAG GAAAATAGGG AAGTTTGAAG CCTGCGGGTT GGATTGGCGT
301 ACGCGTGACG GCAAACCTTT GGTTGAGAGG TTCAAACAGG AAGGTTTCGA
351 CTGTTTGGAA AAGCAGGGGT TGCGGCGCAA CGGCCTGTCC GAGCGCGTCC
401 GATGGTAA
This corresponds to the amino acid sequence <SEQ ID 806; ORF131ng-1>:
1 MEIRVIKYTA TAALFAFTVA GCRLAGWYEC SSLSGWCKPR KPAAIDFWDI
51 GGESPLSLED YEIPLSDGNR SVRANEYESA QKSYFYRKIG KFEACGLDWR
101 TRDGKPLVER FKQEGFDCLE KQGLRRNGLS ERVRW*
ORF131ng-1 and ORF131-1 show 92.6% identity over a 135 amino acid overlap:
orf131ng-1.pep MEIRVIKYTATAALFAFTVAGCRLAGWYECSSLSGWCKPRKPAAIDFWDIGGESPLSLED
||||:||||| |||:||||||||||||||||||:||||||||||||||||||||| || |
orf131-1 MEIRAIKYTAMAALLAFTVAGCRLAGWYECSSLTGWCKPRKPAAIDFWDIGGESPPSLGD
orf131ng-1.pep YEIPLSDGNRSVRANEYESAQKSYFYRKIGKFEACGLDWRTRDGKPLVERFKQEGFDCLE
|||||||||||||||||||||:|||||||||||||||||||||||||:| ||| ||||||
orf131-1 YEIPLSDGNRSVRANEYESAQQSYFYRKIGKFEACGLDWRTRDGKPLIETFKQGGFDCLE
orf131ng-1.pep KQGLRRNGLSERVRWX
||||||||||||||||
orf131-1 KQGLRRNGLSERVRWX
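Based on the presence of a predicted prokaryotic membrane lipoprotein lipid attachment site, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.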
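The patent does not state which tool produced the lipoprotein prediction. As an illustration only, the sketch below scans a sequence with a pattern of the kind used for prokaryotic membrane lipoprotein lipid attachment sites. The regular expression and the two positional rules (cysteine between residues 15 and 35, at least one Lys or Arg in the first seven residues) are written from memory of the PROSITE entry PS00013 and should be treated as assumptions; the function name `find_lipobox_cysteine` is hypothetical.

```python
import re

# Assumed PROSITE-style pattern: {DERK}(6)-[LIVMFWSTAG](2)-[LIVMFYSTAGCQ]-[AGS]-C
# Wrapped in a lookahead so that overlapping candidate windows are all examined.
LIPOBOX = re.compile(r"(?=([^DERK]{6}[LIVMFWSTAG]{2}[LIVMFYSTAGCQ][AGS]C))")


def find_lipobox_cysteine(protein):
    """Return the 1-based position of the first candidate lipid-attachment
    cysteine, or None if no window satisfies the assumed rules."""
    if not any(res in "KR" for res in protein[:7]):
        return None
    for match in LIPOBOX.finditer(protein):
        cys_pos = match.start() + 11  # pattern is 11 residues long, C is last
        if 15 <= cys_pos <= 35:
            return cys_pos
    return None


# N-terminal 50 residues of ORF131-1, copied from the alignment above.
orf131_1_nterm = "MEIRAIKYTAMAALLAFTVAGCRLAGWYECSSLTGWCKPRKPAAIDFWDI"
print(find_lipobox_cysteine(orf131_1_nterm))
```

A hit from a pattern like this is only a prediction; it does not by itself show that the protein is lipidated.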
Example 102
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 807>:
1 ATGAAACACA TCCATATTAT CGGTATCGGC GGCACGTTTA TGGGCGGGCT
51 TGCCGCCATT GCCAAAGAAG CGGGGTTTGA AGTCAGCGGT TGCGACGCGA
101 AGATGTATCC GCCGATGAGC ACCCAGCTCG AAGCCTTGGG TATAGACGTG
151 TATGAAGGCT TCGATGCCGC TCAGTTGGAC GAATTTAAAG CCGACGTTTA
201 CGTTATCGGC AATGTCGCCA AGCGCGGGAT GGATGTGGTT GAAGCGATTT
251 TGAACCTCGG CCTGCCtTAT ATtTcCGGCC CGCAATGGCT GTCGGAAAAC
301 GTGCTGCACC ATCATTGGGT ACTCGGTGTG GCGGGGACgC ACGGCAAAAC
351 GACCACCGCC TCCATGCTCG CATGGGTCTT GGAATATgCC GGCCTCGCGC
401 CGGGCTTCCT TATtGGCGGC GTACC.GGAA AATttCGGCG TTTCCGCCCG
451 CCTGCCGCAA ACGCCGCGCC AAGACCCGAA CAGCCAATCG CCGTTTTTcG
501 TCATCGAAGC CGACGAATAC GACACCGCCT TTtTCGACAA ACGTTCTAAA
551 TtCGTGCATT ACCGTCCGCG TACCGCCGTG TTGAACAATC TGGAATTCGA
601 CCACGCCGAC ATCTTTGCCG ACTTGGGCGC GATACAGACc CAGTTCCACT
651 ACCTCGTGCG TACCGTGCCG TCTGAAGGCT TAATCGTCTG CAACGGACGG
701 CAGCAAAGCC TGCAAGATAC TTTGGACAAA GGCTGCTGGA CGCCGGTGGA
751 AAAATTCGGC ACGGAACACG GCTGGCA..
This corresponds to the amino acid sequence <SEQ ID 808; ORF132>:
1 MKHIHIIGIG GTFMGGLAAI AKEAGFEVSG CDAKMYPPMS TQLEALGIDV
51 YEGFDAAQLD EFKADVYVIG NVAKRGMDVV EAILNLGLPY ISGPQWLSEN
101 VLHHHWVLGV AGTHGKTTTA SMLAWVLEYA GLAPGFLIGG VXGKFRRFRP
151 PAANAAPRPE QPIAVFRHRS RRIRHRLFRQ TFXIRALPSA YRRVEQSGIR
201 PRRHLCRLGR DTDPVPLPRA YRAVXRLNRL QRTAAKPARY FGQRLLDAGG
251 KIRHGTRLA..
Further work revealed the complete nucleotide sequence <SEQ ID 809>:
1 ATGAAACACA TCCATATTAT CGGTATCGGC GGCACGTTTA TGGGCGGGCT
51 TGCCGCCATT GCCAAAGAAG CGGGGTTTGA AGTCAGCGGT TGCGACGCGA
101 AGATGTATCC GCCGATGAGC ACCCAGCTCG AAGCCTTGGG TATAGACGTG
151 TATGAAGGCT TCGATGCCGC TCAGTTGGAC GAATTTAAAG CCGACGTTTA
201 CGTTATCGGC AATGTCGCCA AGCGCGGGAT GGATGTGGTT GAAGCGATTT
251 TGAACCTCGG CCTGCCTTAT ATTTCCGGCC CGCAATGGCT GTCGGAAAAC
301 GTGCTGCACC ATCATTGGGT ACTCGGTGTG GCGGGGACGC ACGGCAAAAC
351 GACCACCGCC TCCATGCTCG CATGGGTCTT GGAATATGCC GGCCTCGCGC
401 CGGGCTTCCT TATTGGCGGC GTACCGGAAA ATTTCGGCGT TTCCGCCCGC
451 CTGCCGCAAA CGCCGCGCCA AGACCCGAAC AGCCAATCGC CGTTTTTCGT
501 CATCGAAGCC GACGAATACG ACACCGCCTT TTTCGACAAA CGTTCTAAAT
551 TCGTGCATTA CCGTCCGCGT ACCGCCGTGT TGAACAATCT GGAATTCGAC
601 CACGCCGACA TCTTTGCCGA CTTGGGCGCG ATACAGACCC AGTTCCACTA
651 CCTCGTGCGT ACCGTGCCGT CTGAAGGCTT AATCGTCTGC AACGGACGGC
701 AGCAAAGCCT GCAAGATACT TTGGACAAAG GCTGCTGGAC GCCGGTGGAA
751 AAATTCGGCA CGGAACACGG CTGGCAGGCC GGCGAAGCCA ATGCCGACGG
801 CTCGTTCGAC GTGTTGCTCG ACGGCAAAAC CGCCGGACGC GTCAAATGGG
851 ATTTGATGGG CAGGCACAAC CGCATGAACG CGCTCGCCGT CATTGCCGCC
901 GCGCGTCATG TCGGTGTCGA TATTCAGACC GCCTGCGAAG CCTTGGGCGC
951 GTTTAAAAAC GTCAAACGCC GGATGGAAAT CAAAGGCACG GCAAACGGCA
1001 TCACCGTTTA CGACGACTTC GCCCACCACC CGACCGCCAT CGAAACCACG
1051 ATTCAAGGTT TGCGCCAACG CGTCGGCGGC GCGCGCATCC TCGCCGTCCT
1101 CGAACCGCGT TCCAACACGA TGAAGCTGGG CACGATGAAG TCCGCCCTGC
1151 CTGTAAGCCT CAAAGAAGCC GACCAAGTGT TCTGCTACGC CGGCGGCGTG
1201 GACTGGGACG TCGCCGAAGC CCTCGCGCCT TTGGGCGGCA GGCTGAACGT
1251 CGGCAAAGAC TTCGATGCCT TCGTTGCCGA AATCGTGAAA AACGCCGAAG
1301 TAGGCGACCA TATTTTGGTG ATGAGCAACG GCGGTTTCGG CGGAATACAC
1351 GGAAAGCTGC TGGAAGCTTT GAGATAG
This corresponds to the amino acid sequence <SEQ ID 810; ORF132-1>:
1 MKHIHIIGIG GTFMGGLAAI AKEAGFEVSG CDAKMYPPMS TQLEALGIDV
51 YEGFDAAQLD EFKADVYVIG NVAKRGMDVV EAILNLGLPY ISGPQWLSEN
101 VLHHHWVLGV AGTHGKTTTA SMLAWVLEYA GLAPGFLIGG VPENFGVSAR
151 LPQTPRQDPN SQSPFFVIEA DEYDTAFFDK RSKFVHYRPR TAVLNNLEFD
201 HADIFADLGA IQTQFHYLVR TVPSEGLIVC NGRQQSLQDT LDKGCWTPVE
251 KFGTEHGWQA GEANADGSFD VLLDGKTAGR VKWDLMGRHN RMNALAVIAA
301 ARHVGVDIQT ACEALGAFKN VKRRMEIKGT ANGITVYDDF AHHPTAIETT
351 IQGLRQRVGG ARILAVLEPR SNTMKLGTMK SALPVSLKEA DQVFCYAGGV
401 DWDVAEALAP LGGRLNVGKD FDAFVAEIVK NAEVGDHILV MSNGGFGGIH
451 GKLLEALR*
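Comparing <SEQ ID 807> with <SEQ ID 809> around nucleotide 426 suggests why ORF132 and ORF132-1 diverge after residue 141: the partial sequence carries an extra undetermined position (shown as "."), which shifts the reading frame downstream. The sketch below translates the two fragments in frame, assuming the standard genetic code. The fragment boundaries and the names `translate`, `complete` and `partial` are chosen for illustration, and the undetermined base is written as N.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate in frame 1; codons with undetermined bases are reported as X."""
    dna = "".join(dna.split()).upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


# Nucleotides 415-450 of <SEQ ID 809> (codons 139-150 of ORF132-1).
complete = "GGCGGC GTACCGGAAA ATTTCGGCGT TTCCGCCCGC"
# The same region of <SEQ ID 807>, with "." written as N.
partial = "GGCGGC GTACCNGGAA AATTTCGGCG TTTCCGCCCG"

print(translate(complete))
print(translate(partial))
```

The first output can be compared with residues 139 to 150 of <SEQ ID 810> and the second with the same positions of <SEQ ID 808>.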
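Computer analysis of this amino acid sequence gave the following results:
Homology with the hypothetical o457 protein of E. coli (accession number U14003)
ORF132 and o457 show 58% amino acid identity over a 140 amino acid overlap: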
Orf132:4 IHIIGIGGTFMGGLAAIAKEAGFEVSGCDAKMYPPMSTQLEALGIDVYEGFDAAQLDEFK 63
IHI+GI GTFMGGLA +A++ G EV+G DA +YPPMST LE GI++ +G+DA+QL+ +
o457: 3 IHILGICGTFMGGLAMLARQLGHEVTGSDANVYPPMSTLLEKQGIELIQGYDASQLEP-Q 61
Orf132:64 ADVYVIGNVAKRGMDVVEAILNLGLPYISGPQWLSENVLHHHWVLGVAGTHGKTTTASML 123
D+ +IGN RG VEA+L +PY+SGPQWL + VL WVL VAGTHGKTTTA M
o457: 62 PDLVIIGNAMTRGNPCVEAVLEKNIPYMSGPQWLHDFVLRDRWVLAVAGTHGKTTTAGMA 121
Orf132:124 AWVLEYAGLAPGFLIGGVXG 143
W+LE G PGF+IGGV G
o457: 122 TWILEQCGYKPGFVIGGVPG 141
Homology with a predicted ORF from N. meningitidis (strain A)
ORF132 shows 74.6% identity over a 189 amino acid overlap with an ORF (ORF132a) from strain A of N. meningitidis:
10 20 30 40 50 60
orf132.pep MKHIHIIGIGGTFMGGLAAIAKEAGFEVSGCDAKMYPPMSTQLEALGIDVYEGFDAAQLD
||||||||||||||||:|||||||||| |||||||||||||||||||| ||||||:||||
orf132a MKHIHIIGIGGTFMGGIAAIAKEAGFEXSGCDAKMYPPMSTQLEALGIGVYEGFDTAQLD
10 20 30 40 50 60
70 80 90 100 110 120
orf132.pep EFKADVYVIGNVAKRGMDVVEAILNLGLPYISGPQWLSENVLHHHWVLGVAGTHGKTTTA
||||||||||||||||||||||||| |||||||||||:|| ||||| |||| ||||||||
orf132a EFKADVYVIGNVAKRGMDVVEAILNRGLPYISGPQWLAENXLHHHWXLGVAXTHGKTTTA
70 80 90 100 110 120
130 140 150 160
orf132.pep SMLAWVLEYAGLAPGFLIGGVXGKFR---RFRPPAANAAPRPEQPI----------AVFR
|||||||||||||||| |||| :| |: | : | ::|: | |
orf132a SMLAWVLEYAGLAPGFXIGGVPENFSVSARL-PQTPRQDPNSQSPFFVIEADEYDTAFFD
130 140 150 160 170
170 180 190 200 210 220
orf132.pep HRSRRIRHRLFRQTFXIRALPSAYRRVEQSGIRPRRHLCRLGRDTDPVPLPRAYRAVXRL
:||: :::|
orf132a KRSKFVHYRPRTAVLNNLEFDHADIFADLGAIQTQFHHLVRTVPSEGLIVCNGRQQSLQD
180 190 200 210 220 230
The full-length ORF132a nucleotide sequence <SEQ ID 811> is:
1 ATGAAACACA TCCACATTAT CGGTATCGGC GGCACGTTTA TGGGTGGGAT
51 TGCCGCCATT GCCAAAGAAG CAGGGTTTGA ANTCAGCGGT TGCGATGCGA
101 AGATGTATCC GCCGATGAGC ACCCAGCTCG AAGCCTTGGG CATAGGCGTG
151 TATGAAGGCT TCGACACCGC GCAGTTGGAC GAATTTAAAG CCGACGTTTA
201 CGTTATCGGC AATGTCGCCA AGCGCGGGAT GGATGTGGTT GAAGCGATTT
251 TGAACCGTGG GCTGCCTTAT ATTTCCGGCC CGCAATGGCT GGCTGAAAAC
301 NTGCTGCACC ATCATTGGNN ACTCGGCGTG GCGGNGACGC ACGGCAAAAC
351 GACCACCGCG TCTATGCTCG CGTGGGTTTT GGAATATGCC GGACTCGCAC
401 CGGGCTTCNT TATCGGCGGC GTACCGGAAA ACTTCAGCGT TTCCGCCCGC
451 CTGCCGCAAA CGCCGCGCCA AGACCCGAAC AGCCAATCGC CGTTTTTCGT
501 CATTGAAGCC GACGAATACG ACACCGCGTT TTTCGACAAA CGCTCCAAAT
551 TCGTGCATTA CCGTCCGCGT ACCGCCGTGT TGAACAATCT GGAATTCGAC
601 CACGCCGACA TCTTCGCCGA TTTGGGCGCG ATACAGACCC AGTTCCACCA
651 CCTCGTGCGT ACCGTGCCGT CTGAAGGCCT CATCGTCTGC AACGGACGGC
701 AGCAAAGCCT GCAAGACACT TTGGACAAAG GCTGCTGGAC GCCGGTGGAA
751 AAATTCGGCA CGGAACACGG CTGGCAGGCC GGCGAAGCCA ATGCCGATGG
801 CTCGTTCGAC GTGTTGCTTG ACGGCAAAAA AGCCGGACAC GTCGCTTGGA
851 GTTTGATGGG CGGACACAAC CGCATGAACG CGCTCGCNGT CATCGCCGCC
901 GCGCGTCATG CCGGAGTNGA CATTCAGACG GCCTGCGAAG CCTTGAGCAC
951 GTTTAAAAAC GTCAAACGCC GCATGGAAAT CAAAGGCACG GCAAACGGTA
1001 TCACCGTTTA CGACGACTTC GCCCACCATC CGACCGCTAT CGAAACCACG
1051 ATTCAAGGTT TGCGCCAGCG CGTCGGCGGC GCGCGCATCC TCGCCGTCCT
1101 CGAACCGCGT TCCAATACGA TGAAGCTGGG TACGATGAAA GCCGCCCTGC
1151 CCGCAAGCCT CAAAGAAGCC GACCAAGTGT TCTGNTACGC CGGCGGCGCG
1201 GACTGGGACG TTGCCGAAGC CCTCGCGCCT TTGGGCGGCA GGCTGCACGT
1251 CGGCAAAGAC TTCGATGCCT TCGTTGCCGA AATCGTGAAA AACGCCGAAG
1301 CAGGCGACCA TATTTTGGTG ATGAGCAACG GCGGTTTCGG CGGAATACAC
1351 ACCAAACTGC TGGACGCTTT GAGATAG
It encodes a protein having the amino acid sequence <SEQ ID 812>:
1 MKHIHIIGIG GTFMGGIAAI AKEAGFEXSG CDAKMYPPMS TQLEALGIGV
51 YEGFDTAQLD EFKADVYVIG NVAKRGMDVV EAILNRGLPY ISGPQWLAEN
101 XLHHHWXLGV AXTHGKTTTA SMLAWVLEYA GLAPGFXIGG VPENFSVSAR
151 LPQTPRQDPN SQSPFFVIEA DEYDTAFFDK RSKFVHYRPR TAVLNNLEFD
201 HADIFADLGA IQTQFHHLVR TVPSEGLIVC NGRQQSLQDT LDKGCWTPVE
251 KFGTEHGWQA GEANADGSFD VLLDGKKAGH VAWSLMGGHN RMNALAVIAA
301 ARHAGVDIQT ACEALSTFKN VKRRMEIKGT ANGITVYDDF AHHPTAIETT
351 IQGLRQRVGG ARILAVLEPR SNTMKLGTMK AALPASLKEA DQVFXYAGGA
401 DWDVAEALAP LGGRLHVGKD FDAFVAEIVK NAEAGDHILV MSNGGFGGIH
451 TKLLDALR*
ORF132a and ORF132-1 show 93.9% identity over a 458 amino acid overlap:
orf132a.pep MKHIHIIGIGGTFMGGIAAIAKEAGFEXSGCDAKMYPPMSTQLEALGIGVYEGFDTAQLD
||||||||||||||||:|||||||||| |||||||||||||||||||| ||||||:||||
orf132-1 MKHIHIIGIGGTFMGGLAAIAKEAGFEVSGCDAKMYPPMSTQLEALGIDVYEGFDAAQLD
orf132a.pep EFKADVYVIGNVAKRGMDVVEAILNRGLPYISGPQWLAENXLHHHWXLGVAXTHGKTTTA
||||||||||||||||||||||||| |||||||||||:|| ||||| |||| ||||||||
orf132-1 EFKADVYVIGNVAKRGMDVVEAILNLGLPYISGPQWLSENVLHHHWVLGVAGTHGKTTTA
orf132a.pep SMLAWVLEYAGLAPGFXIGGVPENFSVSARLPQTPRQDPNSQSPFFVIEADEYDTAFFDK
|||||||||||||||| ||||||||:||||||||||||||||||||||||||||||||||
orf132-1 SMLAWVLEYAGLAPGFLIGGVPENFGVSARLPQTPRQDPNSQSPFFVIEADEYDTAFFDK
orf132a.pep RSKFVHYRPRTAVLNNLEFDHADIFADLGAIQTQFHHLVRTVPSEGLIVCNGRQQSLQDT
||||||||||||||||||||||||||||||||||||:|||||||||||||||||||||||
orf132-1 RSKFVHYRPRTAVLNNLEFDHADIFADLGAIQTQFHYLVRTVPSEGLIVCNGRQQSLQDT
orf132a.pep LDKGCWTPVEKFGTEHGWQAGEANADGSFDVLLDGKKAGHVAWSLMGGHNRMNALAVIAA
|||||||||||||||||||||||||||||||||||| ||:| |:||| ||||||||||||
orf132-1 LDKGCWTPVEKFGTEHGWQAGEANADGSFDVLLDGKTAGRVKWDLMGRHNRMNALAVIAA
orf132a.pep ARHAGVDIQTACEALSTFKNVKRRMEIKGTANGITVYDDFAHHPTAIETTIQGLRQRVGG
|||:|||||||||||::|||||||||||||||||||||||||||||||||||||||||||
orf132-1 ARHVGVDIQTACEALGAFKNVKRRMEIKGTANGITVYDDFAHHPTAIETTIQGLRQRVGG
orf132a.pep ARILAVLEPRSNTMKLGTMKAALPASLKEADQVFXYAGGADWDVAEALAPLGGRLHVGKD
||||||||||||||||||||:|||:||||||||| ||||:|||||||||||||||:||||
orf132-1 ARILAVLEPRSNTMKLGTMKSALPVSLKEADQVFCYAGGVDWDVAEALAPLGGRLNVGKD
orf132a.pep FDAFVAEIVKNAEAGDHILVMSNGGFGGIHTKLLDALRX
|||||||||||||:|||||||||||||||| |||:||||
orf132-1 FDAFVAEIVKNAEVGDHILVMSNGGFGGIHGKLLEALRX
Homology with a predicted ORF from N. gonorrhoeae
ORF132 shows 89.6% identity over a 259 amino acid overlap with a predicted ORF (ORF132ng) from N. gonorrhoeae:
orf132.pep MKHIHIIGIGGTFMGGLAAIAKEAGFEVSGCDAKMYPPMSTQLEALGIDVYEGFDAAQLD 60
||||||||||||||||:|||||||||:||||||||||||||||||||| |:||||||||:
orf132ng MKHIHIIGIGGTFMGGIAAIAKEAGFKVSGCDAKMYPPMSTQLEALGIGVHEGFDAAQLE 60
orf132.pep EFKADVYVIGNVAKRGMDVVEAILNLGLPYISGPQWLSENVLHHHWVLGVAGTHGKTTTA 120
||:||:|||||||:||||||||||| |||||||||||:||||||||||||||||||||||
orf132ng EFQADIYVIGNVARRGMDVVEAILNRGLPYISGPQWLAENVLHHHWVLGVAGTHGKTTTA 120
orf132.pep SMLAWVLEYAGLAPGFLIGGVXGKFRRFRPPAANAAPRPEQPIAVFRHRSRRIRHRLFRQ 180
||||||||||||||||||||| |||||||||:|||| |||| ||||||||||||||||||
orf132ng SMLAWVLEYAGLAPGFLIGGVPGKFRRFRPPTANAASRPEQQIAVFRHRSRRIRHRLFRQ 180
orf132.pep TFXIRALPSAYRRVEQSGIRPRRHLCRLGRDTDPVPLPRAYRAVXRLNRLQRTAAKPARY 240
|: |||| |||||||||||||||| |||||||||| |||:|:: | :||||||||||||
orf132ng TLQIRALSPAYRRVEQSGIRPRRHLRRLGRDTDPVPPPRAHRTIRRPHRLQRTAAKPARY 240
orf132.pep FGQRLLDAGGKIRHGTRLA 259
|||||||||||||| ||||
orf132ng FGQRLLDAGGKIRHRTRLADW 261
The predicted ORF132ng nucleotide sequence <SEQ ID 813> encodes a protein having the amino acid sequence <SEQ ID 814>:
1 MKHIHIIGIG GTFMGGIAAI AKEAGFKVSG CDAKMYPPMS TQLEALGIGV
51 HEGFDAAQLE EFQADIYVIG NVARRGMDVV EAILNRGLPY ISGPQWLAEN
101 VLHHHWVLGV AGTHGKTTTA SMLAWVLEYA GLAPGFLIGG VPGKFRRFRP
151 PTANAASRPE QQIAVFRHRS RRIRHRLFRQ TLQIRALSPA YRRVEQSGIR
201 PRRHLRRLGR DTDPVPPPRA HRTIRRPHRL QRTAAKPARY FGQRLLDAGG
251 KIRHRTRLAD W*
Further work revealed the following gonococcal DNA sequence <SEQ ID 815>:
1 ATGAAACACA TCCACATTAT CGGTATCGGC GGCACGTTTA TGGGCGGGAT
51 TGCCGCCATT GCCAAAGAAG CCGGGTTCAA AGTCAGCGGT TGCGACGCGA
101 AGATGTATCC GCCGATGAGC ACCCAGCTCG AAGCCTTGGG CATAGGCGTA
151 CACGAAGGCT TCGATGCCGC GCAGTTGGAA GAATTTCAAG CCGATATTTA
201 CGTCATCGGC AATGTCGCCA GGCGCGGGAT GGATGTGGTC GAGGCGATTT
251 TGAACCGTGG GCTGCCTTAT ATTTCCGGCC CGCAATGGCT GGCTGAAAac
301 GTGCtgcacc atcaTTGGgt ACTCGGCGTG GcagggaCGC ACGGcaaAac
351 gaccaCcGcg tCCATGCTCG CCTGGGTCTT GGAATATGCC GGACTCGCGC
401 CGGGCTTCCT CATCGGCGGt gtaccggaAA ATTTCGGCGT TTCCGCCCGC
451 CTACCGCAAA CGCCGCGTCA AGACCCGAAC AGCAAATCGC CGTTTTTCGT
501 CATCGAAGCC GACGAATACG ACACCGCCTT TTTCGACAAA CGCTCCAAAT
551 TCGTGCATTA TCGCCCGCGT ACCGCCGTGT TGAACAATCT GGAATTCGAC
601 CACGCCGACA TCTTCGCCGA CTTGGGCGCG ATACAGACCC AGTTCCACCA
651 CCTCGTGCGC ACCGTACCAT CCGAAGGCCT CATCGTCTGC AACGGACAGC
701 AGCAAAGCCT GCAAGATACT TTGGACAAAG GCTGCTGGAC GCCGGTGGAA
751 AAATTCGGCA CCGGACACGG CTGGCAGATT GGTGAAGTCA ATGCCGACGG
801 CTCGTTCGAC GTATTGCTTG ACGGCAAAAA AGCCGGACAC GTCGCATGGG
851 ATTTGATGGG CGGACACAAC CGCATGAACG CGCTCGCCGT CATCGCTGCC
901 GCACGCCATG CCGGAGTCGA TGTTCAGACG GCCTGCGAAG CCTTGGGTGC
951 GTTTAAAAAC GTCAAACGCC GCATGGAAAT CAAAGGCACG GCAAACGGCA
1001 TCACCGTTTA CGACGATTTC GCCCACCACC CGACCGCCAT CGAAACCACG
1051 ATTCAAGGTT TGCGCCAACG TGTCGGCGGC GCGCGCATCC TCGCCGTCCT
1101 CGAGCCGCGT TCCAACACCA TGAAACTCGG CACGATGAAG TCCGCCCTGC
1151 CCGCAAGCCT CAAAGAAGCC GACCAAGTGT TCTGCTACGC CGGCGGCGCG
1201 GACTGGGACG TTGCCGAAGC CCTCGCGCCT TTGGGCTGCA GGCTGCGCGT
1251 CGGTAAAGAT TTCGATACCT TCGTTGCCGA AATTGTGAAA AACGCCCGAA
1301 CCGGCGACCA TATTTTGGTG ATGAGCAACG GCGGTTTCGG CGGAATACAC
1351 ACCAAACTGC TGGACGCTTT GAGATAG
This corresponds to the amino acid sequence <SEQ ID 816; ORF132ng-1>:
1 MKHIHIIGIG GTFMGGIAAI AKEAGFKVSG CDAKMYPPMS TQLEALGIGV
51 HEGFDAAQLE EFQADIYVIG NVARRGMDVV EAILNRGLPY ISGPQWLAEN
101 VLHHHWVLGV AGTHGKTTTA SMLAWVLEYA GLAPGFLIGG VPENFGVSAR
151 LPQTPRQDPN SKSPFFVIEA DEYDTAFFDK RSKFVHYRPR TAVLNNLEFD
201 HADIFADLGA IQTQFHHLVR TVPSEGLIVC NGQQQSLQDT LDKGCWTPVE
251 KFGTGHGWQI GEVNADGSFD VLLDGKKAGH VAWDLMGGHN RMNALAVIAA
301 ARHAGVDVQT ACEALGAFKN VKRRMEIKGT ANGITVYDDF AHHPTAIETT
351 IQGLRQRVGG ARILAVLEPR SNTMKLGTMK SALPASLKEA DQVFCYAGGA
401 DWDVAEALAP LGCRLRVGKD FDTFVAEIVK NARTGDHILV MSNGGFGGIH
451 TKLLDALR*
ORF132ng-1 and ORF132-1 show 93.2% identity over a 458 amino acid overlap:
orf132ng-1.pep MKHIHIIGIGGTFMGGIAAIAKEAGFKVSGCDAKMYPPMSTQLEALGIGVHEGFDAAQLE
||||||||||||||||:|||||||||:||||||||||||||||||||| |:||||||||:
orf132-1 MKHIHIIGIGGTFMGGLAAIAKEAGFEVSGCDAKMYPPMSTQLEALGIDVYEGFDAAQLD
orf132ng-1.pep EFQADIYVIGNVARRGMDVVEAILNRGLPYISGPQWLAENVLHHHWVLGVAGTHGKTTTA
||:||:|||||||:||||||||||| |||||||||||:||||||||||||||||||||||
orf132-1 EFKADVYVIGNVAKRGMDVVEAILNLGLPYISGPQWLSENVLHHHWVLGVAGTHGKTTTA
orf132ng-1.pep SMLAWVLEYAGLAPGFLIGGVPENFGVSARLPQTPRQDPNSKSPFFVIEADEYDTAFFDK
|||||||||||||||||||||||||||||||||||||||||:||||||||||||||||||
orf132-1 SMLAWVLEYAGLAPGFLIGGVPENFGVSARLPQTPRQDPNSQSPFFVIEADEYDTAFFDK
orf132ng-1.pep RSKFVHYRPRTAVLNNLEFDHADIFADLGAIQTQFHHLVRTVPSEGLIVCNGQQQSLQDT
||||||||||||||||||||||||||||||||||||:|||||||||||||||:|||||||
orf132-1 RSKFVHYRPRTAVLNNLEFDHADIFADLGAIQTQFHYLVRTVPSEGLIVCNGRQQSLQDT
orf132ng-1.pep LDKGCWTPVEKFGTGHGWQIGEVNADGSFDVLLDGKKAGHVAWDLMGGHNRMNALAVIAA
|||||||||||||| |||| ||:||||||||||||| ||:| ||||| ||||||||||||
orf132-1 LDKGCWTPVEKFGTEHGWQAGEANADGSFDVLLDGKTAGRVKWDLMGRHNRMNALAVIAA
orf132ng-1.pep ARHAGVDVQTACEALGAFKNVKRRMEIKGTANGITVYDDFAHHPTAIETTIQGLRQRVGG
|||:|||:||||||||||||||||||||||||||||||||||||||||||||||||||||
orf132-1 ARHVGVDIQTACEALGAFKNVKRRMEIKGTANGITVYDDFAHHPTAIETTIQGLRQRVGG
orf132ng-1.pep ARILAVLEPRSNTMKLGTMKSALPASLKEADQVFCYAGGADWDVAEALAPLGCRLRVGKD
||||||||||||||||||||||||:||||||||||||||:|||||||||||| || ||||
orf132-1 ARILAVLEPRSNTMKLGTMKSALPVSLKEADQVFCYAGGVDWDVAEALAPLGGRLNVGKD
orf132ng-1.pep FDTFVAEIVKNARTGDHILVMSNGGFGGIHTKLLDALRX
||:|||||||||::|||||||||||||||| |||:||||
orf132-1 FDAFVAEIVKNAEVGDHILVMSNGGFGGIHGKLLEALRX
In addition, ORF132ng-1 is homologous to a hypothetical E. coli protein:
pir||S56459 hypothetical protein o457 - Escherichia coli >gi|537075 (U14003)
ORF-o457 [Escherichia coli] >gi|1790680 (AE000494) hypothetical 48.5 kD protein in fbp-pmba intergenic region [Escherichia coli] Length = 457
Score = 474 bits (1207), Expect = e-133
Identities = 249/439 (56%), Positives = 294/439 (66%), Gaps = 13/439 (2%)
Query: 22 KEAGFKVSGCDAKMYPPMSTQLEALGIGVHEGFDAAQLEEFQADIYVIGNVARRGMDVVE 81
++ G +V+G DA +YPPMST LE GI + +G+DA+QLE Q D+ +IGN RG VE
Sbjct: 21 RQLGHEVTGSDANVYPPMSTLLEKQGIELIQGYDASQLEP-QPDLVIIGNAMTRGNPCVE 79
Query: 82 AILNRGLPYISGPQWLAENVLHHHWVLGVAGTHGKTTTASMLAWVLEYAGLAPGFLIGGV 141
A+L + +PY+SGPQWL + VL WVL VAGTHGKTTTA M W+LE G PGF+IGGV
Sbjct: 80 AVLEKNIPYMSGPQWLHDFVLRDRWVLAVAGTHGKTTTAGMATWILEQCGYKPGFVIGGV 139
Query: 142 PENFGVSARLPQTPRQDPNSKSPFFVIEADEYDTAFFDKRSKFVHYRPRTAVLNNLEFDH 201
P NF VSA L +S FFVIEADEYD AFFDKRSKFVHY PRT +LNNLEFDH
Sbjct: 140 PGNFEVSAHL---------GESDFFVIEADEYDCAFFDKRSKFVHYCPRTLILNNLEFDH 190
Query: 202 ADIFADLGAIQTQFHHLVRTVPSEGLIVCNGQQQSLQDTLDKGCWTPVEKFGTGHGWQIG 261
ADIF DL AIQ QFHHLVR VP +G I+ +L+ T+ GCW+ E G WQ
Sbjct: 191 ADIFDDLKAIQKQFHHLVRIVPGQGRIIWPENDINLKQTMAMGCWSEQELVGEQGHWQAK 250
Query: 262 EVNADGS-FDVLLDGKKAGHVAWDLMGGHNRMNALAVIAAARHAGVDVQTACEALGAFKN 320
++ D S ++VLLDG+K G V W L+G HN N L IAAARH GV A ALG+F N
Sbjct: 251 KLTTDASEWEVLLDGEKVGEVKWSLVGEHNMHNGLMAIAAARHVGVAPADAANALGSFIN 310
Query: 321 VKRRMEIKGTANGITVYDDFAHHPTAIETTIQGLRQRVGG-ARILAVLEPRSNTMKLGTM 379
+RR+E++G ANG+TVYDDFAHHPTAI T+ LR +VGG ARI+AVLEPRSNTMK+G
Sbjct: 311 ARRRLELRGEANGVTVYDDFAHHPTAILATLAALRGKVGGTARIIAVLEPRSNTMKMGIC 370
Query: 380 KSALPASLKEADQVF-CYAGGADWDVAEALAPLGCRLRVGKDFDTFVAEIVKNARTGDHI 438
K L SL AD+VF W VAE D DT +VK A+ GDHI
Sbjct: 371 KDDLAPSLGRADEVFLLQPAHIPWQVAEVAEACVQPAHWSGDVDTLADMVVKTAQPGDHI 430
Query: 439 LVMSNGGFGGIHTKLLDAL 457
LVMSNGGFGGIH KLLD L
Sbjct: 431 LVMSNGGFGGIHQKLLDGL 449
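The percentages in the report header above are whole numbers. They are consistent with truncating rather than rounding (294/439 is about 66.97% but is reported as 66%), so the sketch below uses integer division. This rounding rule is inferred from the numbers shown here and is an assumption, as is the helper name `report_percent`.

```python
def report_percent(count, length):
    """Whole-number percentage, truncated (assumed convention, see lead-in)."""
    return (100 * count) // length


# Counts copied from the report header for ORF132ng-1 versus o457.
ALIGNED_LENGTH = 439
COUNTS = {"Identities": 249, "Positives": 294, "Gaps": 13}

for label, count in COUNTS.items():
    print(f"{label} = {count}/{ALIGNED_LENGTH} ({report_percent(count, ALIGNED_LENGTH)}%)")
```

"Positives" counts identical residues together with conservative substitutions (the "+" positions in the middle line of the alignment), so it is never smaller than "Identities".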
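Based on this analysis, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
ORF132-1 (26.4kDa) was cloned in pET and pGEX vectors and expressed in E. coli, as described above. The products of protein expression and purification were analyzed by SDS-PAGE. Figure 20A shows the results of affinity purification of the His-fusion protein, and Figure 20B shows the results of expression of the GST-fusion in E. coli. Purified His-fusion protein was used to immunise mice, and the mouse sera were used for FACS analysis (Figure 20C) and ELISA (positive result). These experiments confirm that ORF132 is a surface-exposed protein and that it is a useful immunogen.
Example 103
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 817>: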
1 ..CCGGGCTATT ACGGCTCGGA TGACGAATTT AAGCGGGCAT TCGGAGAAAA
51 CTCGCCGACA TmCAAGAAAC ATTGCAACCG GAGCTGCGGG ATTTATGAAC
101 CCGTATTGAA AAAATACGGC AAAAAGCGCG CCAACAACCA TTCGGTCAGC
151 ATTAGTGCGG ACTTCGGCGA TTATTTCATG CCGTTCGCCA GCTATTCGCG
201 CACACACCGT ATGCCCAACA TCCAAGAAAT GTATTTTTCC CAAATCGGCG
251 ACTCCGGCGT TCACACCGCC TTAAAACCAG AGCGCGCAAA CACTTGGCAA
301 TTTGGCTTCr ATACCTATAA AAAAGGATTG TTAAAACAAG ATGATACATT
351 AGGATTAAAA CTGGTCGGCT ACCGCAGCCG CATCGACAAC TACATCCACA
401 ACGTTTACGG GAAATGGTGG GATTTGAACG GGGATATTCC GAGCTGGGTC
451 AGCAGCACCG GGCTTGCCTA CACCATCCAA CATCGCrATT TCAwAGACAA
501 AGTGCATCAA nnnnnnnnnn nnnnnnnnnn nnnnTACGAT TATGGGCGTT
551 TTTTCACCAA CCTTTCTTAC GCCTATCAAA AAAGCACGCA ACCGACCAAC
601 TTCAGCGATG CGAGCGAATC GCCCAACAAT GCGTCCAAAG AAGACCAACT
651 CAAACAAGGT TATGGGTTGA GCAGGGTTTC CGCCCTGCCG CGAGATTACG
701 GACGTTTGGA AGTCGGTACG CGCTGGTTGG GCAACAAACT GACTTTGGGC
751 GGCGCGATGC GCTATTTCGG CAAGAGCATC CGCGCGACGG CTGAAGAACG
801 CTATATCGAC GCCACCAACG GGGGAAATAC CAGCAATTTC CGGCAACTGG
851 GCAAGCGTTC CATCAAACAA ACCGAAACTC TTGCCCGCCA GCCTTTGATT
901 TTwGATTTTa ACGCCGCTTA CGAGCCGAAG AAAAACCTTA TTTTCCGCGC
951 CGAAGTCAAA AATCTGTTCG ACAGGCGTTA TATCGATCCG CTCGATGCGG
1001 GCAATGATGC GGCAAC.GAG CGTTATTACA GCTCGTTCGA CCCGAAAGAC
1051 AAGGACrrAG ACGTAACGTG TAATGCTGAT AAAACGTTGT GCaACGGCAA
1101 ATACGGCGGC ACAAGCAAAA GCGTATTGAC CAATTTTGCA CGCGGACGCA
1151 CCTTTTTgAT GACGATGAGC TACAAGTTTT AA
This corresponds to the amino acid sequence <SEQ ID 818; ORF133>:
1 ..PGYYGSDDEF KRAFGENSPT XKKHCNRSCG IYEPVLKKYG KKRANNHSVS
51 ISADFGDYFM PFASYSRTHR MPNIQEMYFS QIGDSGVHTA LKPERANTWQ
101 FGFXTYKKGL LKQDDTLGLK LVGYRSRIDN YIHNVYGKWW DLNGDIPSWV
151 SSTGLAYTIQ HRXFXDKVHQ XXXXXXXXYD YGRFFTNLSY AYQKSTQPTN
201 FSDASESPNN ASKEDQLKQG YGLSRVSALP RDYGRLEVGT RWLGNKLTLG
251 GAMRYFGKSI RATAEERYID GTNGGNTSNF RQLGKRSIKQ TETLARQPLI
301 XDFNAAYEPK KNLIFRAEVK NLFDRRYIDP LDAGNDAAXE RYYSSFDPKD
351 KDXDVTCNAD KTLCNGKYGG TSKSVLTNFA RGRTFLMTMS YKF*
Further work revealed the partial DNA sequence <SEQ ID 819>:
1 GAGGCGCAGA TACAGGTTTT GGAAGATGTG CACGTCAAGG CGAAGCGCGT
51 ACCGAAAGAC AAAAAAGTGT TTACCGATGC GCGTGCCGTA TCGACCCGTC
101 AGGATATATT CAAATCCAGC GAAAACCTCG ACAACATCGT ACGCAGCATC
151 CCCGGTGCGT TTACACAGCA AGATAAAAGC TCGGGCATTG TGTCTTTGAA
201 TATTCGCGGC GACAGCGGGT TCGGGCGGGT CAATACGATG GTGGACGGCA
251 TCACGCAGAC CTTTTATTCG ACTTCTACCG ATGCGGGCAG GGCAGGCGGT
301 TCATCTCAAT TCGGTGCATC TGTCGACAGC AATTTTATTG CCGGACTGGA
351 TGTCGTCAAA GGCAGCTTCA GCGGCTCGGC AGGCATCAAC AGCCTTGCCG
401 GTTCGGCGAA TCTGCGGACT TTAGGCGTGG ATGACGTCGT TCAGGGCAAT
451 AATACCTACG GCCTGCTGCT AAAAGGTCTG ACCGGCACCA ATTCAACCAA
501 AGGTAATGCG ATGGCGGCGA TAGGTGCGCG CAAATGGCTG GAAAGCGGAG
551 CATCTGTCGG TGTGCTTTAC GGGCACAGCA GGCGCAGCGT GGCGCAAAAT
601 TACCGCGTGG GCGGCGGCGG GCAGCACATC GGAAATTTTG GCGCGGAATA
651 TTTGGAACGG CGCAAGCAGC GATATTTTGT ACAAGAGGGT GCTTTGAAAT
701 TCAATTCCGA CAGCGGAAAA TGGGAGCGGG ATTTACAAAG GCAACAGTGG
751 AAATACAAGC CGTATAAAAA TTACAACAAC CAAGAACTAC AaAAATACAT
801 CGAAGAGCAT GACAAAAGCT GGCGGGAAAA CCTg.CaCCG CAATACGACA
851 TTACCCCCAT CGATCCGTCC AGCCTGAAGC AGCAGTCGGC AGGCAATCTG
901 TTTAAATTGG AATACGACGG CGTATTCAAT AAATACACGG CGCAATTTCG
951 CGATTTAAAC ACCAAAATCG GCAGCCGCAA AATCATCAAC CGCAATTATC
1001 AGTTCAATTA CGGTTTGTCT TTGAACCCGT ATACCAACCT CAATCTGACC
1051 GCAGCCTACA ATTCGGGCAG GCAGAAATAT CCGAAAGGGT CGAAGTTTAC
1101 AGGCTGGGGG CTTTTAAAGG ATTTTGAAAC CTACAACAAC GCGAAAATCC
1151 TCGACCTCAA CAACACCGCC ACCTTCCGGC TGCCCCGCGA AACCGAGTTG
1201 CAAACCACTT TGGGCTTCAA TTATTTCCAC AACGAATACG GCAAAAACCG
1251 CTTTCCTGAA GAATTGGGGC TGTTTTTCGA CGGTCCTGAT CAGGACAACG
1301 GGCTTTATTC CTATTTGGGG CGGTTTAAGG GCGATAAAGG GCTGCTGCCC
1351 CAAAAATCAA CCATTGTCCA ACCGGCCGGC AGCCAATATT TCAACACGTT
1401 CTACTTCGAT GCCGCGCTCA AAAAAGACAT TTACCGCTTA AACTACAGCA
1451 CCAATACCGT CGGCTACCGT TTCGGCGGCG AATATACGGG CTATTACGGC
1501 TCGGATGACG AATTTAAGCG GGCATTCGGA GAAAACTCGC CGACATACAA
1551 GAAACATTGC AACCGGAGCT GCGGGATTTA TGAACCCGTA TTGAAAAAAT
1601 ACGGCAAAAA GCGCGCCAAC AACCATTCGG TCAGCATTAG TGCGGACTTC
1651 GGCGATTATT TCATGCCGTT CGCCAGCTAT TCGCGCACAC ACCGTATGCC
1701 CAACATCCAA GAAATGTATT TTTCCCAAAT CGGCGACTCC GGCGTTCACA
1751 CCGCCTTAAA ACCAGAGCGC GCAAACACTT GGCAATTTGG CTTCAATACC
1801 TATAAAAAAG GATTGTTAAA ACAAGATGAT ACATTAGGAT TAAAACTGGT
1851 CGGCTACCGC AGCCGCATCG ACAACTACAT CCACAACGTT TACGGGAAAT
1901 GGTGGGATTT GAACGGGGAT ATTCCGAGCT GGGTCAGCAG CACCGGGCTT
1951 GCCTACACCA TCCAACATCG CAATTTCAAA GACAAAGTGC ACAAACACGG
2001 TTTTGAGTTG GAGCTGAATT ACGATTATGG GCGTTTTTTC ACCAACCTTT
2051 CTTACGCCTA TCAAAAAAGC ACGCAACCGA CCAACTTCAG CGATGCGAGC
2101 GAATCGCCCA ACAATGCGTC CAAAGAAGAC CAACTCAAAC AAGGTTATGG
2151 GTTGAGCAGG GTTTCCGCCC TGCCGCGAGA TTACGGACGT TTGGAAGTCG
2201 GTACGCGCTG GTTGGGCAAC AAACTGACTT TGGGCGGCGC GATGCGCTAT
2251 TTCGGCAAGA GCATCCGCGC GACGGCTGAA GAACGCTATA TCGACGGCAC
2301 CAACGGGGGA AATACCAGCA ATTTCCGGCA ACTGGGCAAG CGTTCCATCA
2351 AACAAACCGA AACTCTTGCC CGCCAGCCTT TGATTTTTGA TTTTTACGCC
2401 GCTTACGAGC CGAAGAAAAA CCTTATTTTC CGCGCCGAAG TCAAAAATCT
2451 GTTCGACAGG CGTTATATCG ATCCGCTCGA TGCGGGCAAT GATGCGGCAA
2501 CGCAGCGTTA TTACAGCTCG TTCGACCCGA AAGACAAGGA CGAAGACGTA
2551 ACGTGTAATG CTGATAAAAC GTTGTGCAAC GGCAAATACG GCGGCACAAG
2601 CAAAAGCGTA TTGACCAATT TTGCACGCGG ACGCACCTTT TTGATGACGA
2651 TGAGCTACAA GTTTTAA
This corresponds to the amino acid sequence <SEQ ID 820; ORF133-1>:
1 EAQIQVLEDV HVKAKRVPKD KKVFTDARAV STRQDIFKSS ENLDNIVRSI
51 PGAFTQQDKS SGIVSLNIRG DSGFGRVNTM VDGITQTFYS TSTDAGRAGG
101 SSQFGASVDS NFIAGLDVVK GSFSGSAGIN SLAGSANLRT LGVDDVVQGN
151 NTYGLLLKGL TGTNSTKGNA MAAIGARKWL ESGASVGVLY GHSRRSVAQN
201 YRVGGGGQHI GNFGAEYLER RKQRYFVQEG ALKFNSDSGK WERDLQRQQW
251 KYKPYKNYNN QELQKYIEEH DKSWRENLXP QYDITPIDPS SLKQQSAGNL
301 FKLEYDGVFN KYTAQFRDLN TKIGSRKIIN RNYQFNYGLS LNPYTNLNLT
351 AAYNSGRQKY PKGSKFTGWG LLKDFETYNN AKILDLNNTA TFRLPRETEL
401 QTTLGFNYFH NEYGKNRFPE ELGLFFDGPD QDNGLYSYLG RFKGDKGLLP
451 QKSTIVQPAG SQYFNTFYFD AALKKDIYRL NYSTNTVGYR FGGEYTGYYG
501 SDDEFKRAFG ENSPTYKKHC NRSCGIYEPV LKKYGKKRAN NHSVSISADF
551 GDYFMPFASY SRTHRMPNIQ EMYFSQIGDS GVHTALKPER ANTWQFGFNT
601 YKKGLLKQDD TLGLKLVGYR SRIDNYIHNV YGKWWDLNGD IPSWVSSTGL
651 AYTIQHRNFK DKVHKHGFEL ELNYDYGRFF TNLSYAYQKS TQPTNFSDAS
701 ESPNNASKED QLKQGYGLSR VSALPRDYGR LEVGTRWLGN KLTLGGAMRY
751 FGKSIRATAE ERYIDGTNGG NTSNFRQLGK RSIKQTETLA RQPLIFDFYA
801 AYEPKKNLIF RAEVKNLFDR RYIDPLDAGN DAATQRYYSS FDPKDKDEDV
851 TCNADKTLCN GKYGGTSKSV LTNFARGRTF LMTMSYKF*
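Computer analysis of this amino acid sequence gave the following results:
Homology with the putative TonB-dependent receptor HI121 of H. influenzae (accession number U32801)
ORF133 and HI121 show 57% amino acid identity over a 363 amino acid overlap: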
Orf133:31 IYEPVLKKYGKKRANNHSVSISADFGDYFMPFASYSRTHRMPNIQEMYFSQIGDSGVHTA 90
I EP+L K G K+A NHS ++SA+ DYFMPF +YSRTHRMPNIQEM+FSQ+ ++GV+TA
HI121: 563 INEPILHKSGHKKAFNHSATLSAELSDYFMPFFTYSRTHRMPNIQEMFFSQVSNAGVNTA 622
Orf133:91 LKPERANTWQFGFXTYKKGLLKQDDTLGLKLVGYRSRIDNYIHNVYGKWWDLNGDIPSWV 150
LKPE+++T+Q GF TYKKGL QDD LG+KLVGYRS I NYIHNVYG WW +P+W
HI121: 623 LKPEQSDTYQLGFNTYKKGLFTQDDVLGVKLVGYRSFIKNYIHNVYGVWW--RDGMPTWA 680
Orf133:151 SSTGLAYTIQHRXFXDKVHXXXXXXXXXYDYGRFFTNLSYAYQKSTQPTNFSDASESPNN 210
S G YTI H+ + V YD GRFF N+SYAYQ++ QPTN++DAS PNN
HI121: 681 ESNGFKYTIAHQNYKPIVKKSGYELEINYDMGRFFANVSYAYQRTNQPTNYADASPRPNN 740
Orf133:211 ASKEDQLKQGYGLSRVSALPRDYGRLEVGTRWLGNKLTLGGAMRYFGKSIRATAEERYID 270
AS+ED LKQGYGLSRVS LP+DYGRLE+GTRW KLTLG A RY+GKS RAT EE YI+
HI121: 741 ASQEDILKQGYGLSRVSMLPKDYGRLELGTRWFDQKLTLGLAARYYGKSKRATIEEEYIN 800
Orf133:271 GTNGGNTSNFRQLGKRSIKQTETLARQPLIXDFNAAYEPKKNLIFRAEVKNLFDRRYIDP 330
G+ + R+ ++K+TE + +QP+I D + +YEP K+LI +AEV+NL D+RY+DP
HI121: 801 GSR-FKKNTLRRENYYAVKKTEDIKKQPIILDLHVSYEPIKDLIIKAEVQNLLDKRYVDP 859
Orf133:331 LDAGNDAAXERYYSSFDPKDKDXDVTCNADKTLCNGKYGGTSKSVLTNFARGRTFLMTMS 390
LDAGNDAA +RYYSS + + C D + C GG+ K+VL NFARGRT++++++
HI121: 860 LDAGNDAASQRYYSSL-----NNSIECAQDSSAC----GGSDKTVLYNFARGRTYILSLN 910
Orf133:391 YKF 393
YKF
HI121: 911 YKF 913
Homology with a predicted ORF from N. meningitidis (strain A)
ORF133 shows 90.8% identity over a 392 amino acid overlap with an ORF (ORF133a) from strain A of N. meningitidis:
10 20 30
orf133.pep PGYYGSDDEFKRAFGENSPTXKKHCNRSCGI
||| ||||||||||||||| ||||:||||
orf133a FYFDAALKKDIYRLNYSTNTVGYRFGGXYTGYYXSDDEFKRAFGENSPTYXKHCNQSCGI
450 460 470 480 490 500
40 50 60 70 80 90
orf133.pep YEPVLKKYGKKRANNHSVSISADFGDYFMPFASYSRTHRMPNIQEMYFSQIGDSGVHTAL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf133a YEPVLKKYGKKRANNHSVSISADFGDYFMPFASYSRTHRMPNIQEMYFSQIGDSGVHTAL
510 520 530 540 550 560
100 110 120 130 140 150
orf133.pep KPERANTWQFGFXTYKKGLLKQDDTLGLKLVGYRSRIDNYIHNVYGKWWDLNGDIPSWVS
|||||||||||| ||||||||||| ||||||||||||| ||||||||||||||:||||||
orf133a KPERANTWQFGFNTYKKGLLKQDDILGLKLVGYRSRIDXYIHNVYGKWWDLNGNIPSWVS
570 580 590 600 610 620
160 170 180 190 200 210
orf133.pep STGLAYTIQHRXFXDKVHQXXXXXXXXYDYGRFFTNLSYAYQKSTQPTNFSDASESPNNA
||||||||||| | ||||: ||| |||||||||||||||||||||||||||||
orf133a STGLAYTIQHRNFKDKVHKHGFELELNYDYXRFFTNLSYAYQKSTQPTNFSDASESPNNA
630 640 650 660 670 680
220 230 240 250 260 270
orf133.pep SKEDQLKQGYGLSRVSALPRDYGRLEVGTRWLGNKLTLGGAMRYFGKSIRATAEERYIDG
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf133a SKEDQLKQGYGLSRVSALPRDYGRLEVGTRWLGNKLTLGGAMRYFGKSIRATAEERYIDX
690 700 710 720 730 740
280 290 300 310 320 330
orf133.pep TNGGNTSNFRQLGKRSIKQTETLARQPLIXDFNAAYEPKKNLIFRAEVKNLFDRRYIDPL
||| |||||||||||| ||||||||||| | ||||||| |||||||||||||||||||
orf133a TNGXXTSNFRQLGKRSIXQTETLARQPLIFDXYAAYEPKKXLIFRAEVKNLFDRRYIDPL
750 760 770 780 790 800
340 350 360 370 380 390
orf133.pep DAGNDAAXERYYSSFDPKDKDXDVTCNADKTLCNGKYGGTSKSVLTNFARGRTFLMTMSY
|||||||::|||||||||||| :|||| |:||||||||||||||||||||| |||:||||
orf133a DAGNDAATQRYYSSFDPKDKDEEVTCNDDNTLCNGKYGGTSKSVLTNFARGXTFLITMSY
810 820 830 840 850 860
orf133.pep KFX
|||
orf133a KFX
870
The partial ORF133a nucleotide sequence <SEQ ID 821> is:
1 AAAGACAAAA AAGTGTTTAC CGATGCGCGT GCCGTATCGA CCCGTCAGGA
51 TATATTCAAA TCCANCGAAA ACCTCGACAA CATCGTACGC ANCATCCCCG
101 GTGCGTTTAC ACANCAANAT AAAAGCTCGG GCNTTGTGTC TTTGAATATT
151 CGCNGCGACA GCGGGTTCGG GCGGGTCAAT ACNATGGTNG ACGGCATCAC
201 NCANACCTTT TATTCGACTT CTACCGATGC GGGCAGGGCA GGCGGTTCAT
251 CTCAATTCGG TGCATCTGTC GACAGCAATT TTATNGCCGG ACTGGATGTC
301 GTCAAAGGCA GCTTCAGCGG CTCGGCAGGC ATCAACAGCC TTGCCGGTTC
351 GGCGAATCTG CGGACTTTAN GCGTGGATGA TGTCGTTCAG GGCAATANTA
401 CNTACGGCCT GCTGCTAAAA GGTCTGACCG GCACCAATTC AACCAAAGGT
451 AATGCGATGG CGGCGATAGG TGCGCGCAAA TGGCTGGAAA GCGGAGCATC
501 TGTCGGTGTG CTTTACGGGC ACAGCAGGCG CAGCGTGGCG CAAAATTACC
551 GCGTGGGCGG CGGCGGGCAG CACATCGGAA ATTTTGGCGC GGAATATCTG
601 GAACGACGCA AGCAACGATA TTTTGAGCAA GAAGGCGGGT TGAAATTCAA
651 TTCCAACAGC GGAAAATGGG AGCGGGATTT CCAAAAGTCG TACTGGAAAA
701 CCAAGTGGTA TCAAAAATAC GATGCCCCCC AAGAACTGCA AAAATACATC
751 GAAGGTCATG ATAAAAGCTG GCGGGAAAAC CTGGCGCCGC AATACGACAT
801 CACCCCCATC GATCCGTCCA GCCTGAAGCN GCAGTCGGCA GGCAACCTGT
851 TTAAATTGGA ATACGACGGC GTATTCAATA AATACACGGC GCAATTTCGC
901 GATTTAAACA CCAAAATCGG CAGCCGCAAA ATCATCAACC GCAATTATCA
951 ATTCAATTAC GGTTTGTCTT TGAACCCGTA TACCAACCTC AATCTGACCG
1001 CAGCCTACAA TTCGGGCAGG CAGAAATATC CGAAAGGGTC GAAGTTTACA
1051 GGCTGGGGGC TTTTNAAAGA TTTTGAAACC TACAACAACG CAAAAATCCT
1101 CGACCTCANC AACACCTCCA CCTTCCGGCT GCCCCGTGAA ACCGAGTTGC
1151 AAACCACTTT GGGCTTCAAT TATTTCCACA ACGAATACGG CAAAAACCGC
1201 TTTCCTGAAG AATTGGGGCT GTTTTTCGAC GGTCCGGATC ANGACAACGG
1251 GCTTTATTCC TATTTGGGGC GGTTTAAGGG CGATAAAGGG CTGCTGCCCC
1301 AAAAATCAAC CATTGTCCAA CCGGCCGGCA GCCAATATTT CAACACGTTC
1351 TACTTCGATG CCGCGCTCAA AAAAGACATT TACCGCTTAA ACTACAGCAC
1401 CAATACCGTC GGCTACCGTT TCGGCGGCNA ATATACGGGC TATTACNGCT
1451 CGGATGACGA ATTTAAGCGG GCATTCGGAG AAAACTCGCC GACATACANG
1501 AAACATTGCA ACCAGAGCTG CGGAATTTAT GAACCCGTAT TGAAAAAATA
1551 CGGCAAAAAG CGCGCCAACA ACCATTCGGT CAGCATTAGT GCGGACTTCG
1601 GCGATTATTT CATGCCGTTC GCCAGCTATT CGCGCACACA CCGTATGCCC
1651 AACATCCAAG AAATGTATTT TTCCCAAATC GGCGACTCCG GCGTTCACAC
1701 CGCCTTAAAA CCAGAGCGCG CAAACACTTG GCAATTTGGC TTCAATACCT
1751 ATAAAAAAGG ATTGTTAAAA CAAGATGATA TATTAGGATT AAAACTGGTC
1801 GGCTACCGCA GCCGCATCGA CNACTACATC CACAACGTTT ACGGGAAATG
1851 GTGGGATTTG AACGGGAATA TTCCGAGCTG GGTCAGCAGC ACCGGGCTTG
1901 CCTACACCAT CCAACACCGC AATTTCAAAG ACAAAGTGCA CAAACACGGT
1951 TTTGAGTTGG AGCTGAATTA CGATTATNGG CGTTTTTTCA CCAACCTTTC
2001 TTACGCCTAT CAAAAAAGCA CGCAACCGAC CAACTTCAGC GATGCGAGCG
2051 AATCGCCCAA CAATGCGTCC AAAGAAGACC AACTCAAACA AGGTTATGGG
2101 TTGAGCAGGG TTTCCGCCCT GCCGCGAGAT TACGGACGTT TGGAAGTCGG
2151 TACGCGCTGG TTGGGCAACA AACTGACTTT GGGCGGCGCG ATGCGCTATT
2201 TCGGCAAGAG CATCCGCGCG ACGGCTGAAG AACGCTATAT CGACGNCACC
2251 AATGGGGNAN NTACCAGCAA TTTCCGGCAA CTGGGCAAGC GTTCCATCAN
2301 ACAAACCGAA ACCCTTGCCC GCCAGCCTTT GATTTTTGAT TTNTACGCCG
2351 CTTACGAGCC GAAGAAAAAN CTTATTTTCC GCGCCGAAGT CAAAAATCTG
2401 TTCGACAGGC GTTATATCGA TCCGCTCGAT GCGGGCAATG ATGCGGCAAC
2451 GCAGCGTTAT TACAGTTCGT TCGACCCGAA AGACAAGGAC GAAGAAGTAA
2501 CGTGTAATGA TGATAACACG TTATGCAACG GCAAATACGG CGGCACAAGC
2551 AAAAGCGTAT TGACCAATTT TGCACGCGGA CNCACCTTTT TGATAACGAT
2601 GAGCTACAAG TTTTAA
It encodes a protein having the (partial) amino acid sequence <SEQ ID 822>:
1 KDKKVFTDAR AVSTRQDIFK SXENLDNIVR XIPGAFTXQX KSSGXVSLNI
51 RXDSGFGRVN TMVDGITXTF YSTSTDAGRA GGSSQFGASV DSNFXAGLDV
101 VKGSFSGSAG INSLAGSANL RTLXVDDVVQ GNXTYGLLLK GLTGTNSTKG
151 NAMAAIGARK WLESGASVGV LYGHSRRSVA QNYRVGGGGQ HIGNFGAEYL
201 ERRKQRYFEQ EGGLKFNSNS GKWERDFQKS YWKTKWYQKY DAPQELQKYI
251 EGHDKSWREN LAPQYDITPI DPSSLKXQSA GNLFKLEYDG VFNKYTAQFR
301 DLNTKIGSRK IINRNYQFNY GLSLNPYTNL NLTAAYNSGR QKYPKGSKFT
351 GWGLXKDFET YNNAKILDLX NTSTFRLPRE TELQTTLGFN YFHNEYGKNR
401 FPEELGLFFD GPDXDNGLYS YLGRFKGDKG LLPQKSTIVQ PAGSQYFNTF
451 YFDAALKKDI YRLNYSTNTV GYRFGGXYTG YYXSDDEFKR AFGENSPTYX
501 KHCNQSCGIY EPVLKKYGKK RANNHSVSIS ADFGDYFMPF ASYSRTHRMP
551 NIQEMYFSQI GDSGVHTALK PERANTWQFG FNTYKKGLLK QDDILGLKLV
601 GYRSRIDXYI HNVYGKWWDL NGNIPSWVSS TGLAYTIQHR NFKDKVHKHG
651 FELELNYDYX RFFTNLSYAY QKSTQPTNFS DASESPNNAS KEDQLKQGYG
701 LSRVSALPRD YGRLEVGTRW LGNKLTLGGA MRYFGKSIRA TAEERYIDXT
751 NGXXTSNFRQ LGKRSIXQTE TLARQPLIFD XYAAYEPKKX LIFRAEVKNL
801 FDRRYIDPLD AGNDAATQRY YSSFDPKDKD EEVTCNDDNT LCNGKYGGTS
851 KSVLTNFARG XTFLITMSYK F*
ORF133a and ORF133-1 show 94.3% identity over an 871 amino acid overlap:
10 20 30 40
orf133a.pep KDKKVFTDARAVSTRQDIFKSXENLDNIVRXIPGAFTXQXKS
||||||||||||||||||||||||||||||||||||||||||
orf133-1 EAQIQVLEDVHVKAKRVPKDKKVFTDARAVSTRQDIFKSSENLDNIVRSIPGAFTQQDKS
10 20 30 40 50 60
50 60 70 80 90 100
orf133a.pep SGXVSLNIRXDSGFGRVNTMVDGITXTFYSTSTDAGRAGGSSQFGASVDSNFXAGLDVVK
|| |||||| ||||||||||||||| |||||||||||||||||||||||||| |||||||
orf133-1 SGIVSLNIRGDSGFGRVNTMVDGITQTFYSTSTDAGRAGGSSQFGASVDSNFIAGLDVVK
70 80 90 100 110 120
110 120 130 140 150 160
orf133a.pep GSFSGSAGINSLAGSANLRTLXVDDVVQGNXTYGLLLKGLTGTNSTKGNAMAAIGARKWL
||||||||||||||||||||| |||||||| |||||||||||||||||||||||||||||
orf133-1 GSFSGSAGINSLAGSANLRTLGVDDVVQGNNTYGLLLKGLTGTNSTKGNAMAAIGARKWL
130 140 150 160 170 180
170 180 190 200 210 220
orf133a.pep ESGASVGVLYGHSRRSVAQNYRVGGGGQHIGNFGAEYLERRKQRYFEQEGGLKFNSNSGK
|||||||||||||||||||||||||||||||||||||||||||||| |||:|||||:|||
orf133-1 ESGASVGVLYGHSRRSVAQNYRVGGGGQHIGNFGAEYLERRKQRYFVQEGALKFNSDSGK
190 200 210 220 230 240
230 240 250 260 270 280
orf133a.pep WERDFQKSYWKTKWYQKYDAPQELQKYIEGHDKSWRENLAPQYDITPIDPSSLKXQSAGN
||||:|:: || | |::|: |||||||| ||||||||| |||||||||||||| |||||
orf133-1 WERDLQRQQWKYKPYKNYNN-QELQKYIEEHDKSWRENLXPQYDITPIDPSSLKQQSAGN
250 260 270 280 290
290 300 310 320 330 340
orf133a.pep LFKLEYDGVFNKYTAQFRDLNTKIGSRKIINRNYQFNYGLSLNPYTNLNLTAAYNSGRQK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf133-1 LFKLEYDGVFNKYTAQFRDLNTKIGSRKIINRNYQFNYGLSLNPYTNLNLTAAYNSGRQK
300 310 320 330 340 350
350 360 370 380 390 400
orf133a.pep YPKGSKFTGWGLXKDFETYNNAKILDLXNTSTFRLPRETELQTTLGFNYFHNEYGKNRFP
|||||||||||| |||||||||||||| ||:|||||||||||||||||||||||||||||
orf133-1 YPKGSKFTGWGLLKDFETYNNAKILDLNNTATFRLPRETELQTTLGFNYFHNEYGKNRFP
360 370 380 390 400 410
410 420 430 440 450 460
orf133a.pep EELGLFFDGPDXDNGLYSYLGRFKGDKGLLPQKSTIVQPAGSQYFNTFYFDAALKKDIYR
||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||
orf133-1 EELGLFFDGPDQDNGLYSYLGRFKGDKGLLPQKSTIVQPAGSQYFNTFYFDAALKKDIYR
420 430 440 450 460 470
470 480 490 500 510 520
orf133a.pep LNYSTNTVGYRFGGXYTGYYXSDDEFKRAFGENSPTYXKHCNQSCGIYEPVLKKYGKKRA
|||||||||||||| ||||| |||||||||||||||| ||||:|||||||||||||||||
orf133-1 LNYSTNTVGYRFGGEYTGYYGSDDEFKRAFGENSPTYKKHCNRSCGIYEPVLKKYGKKRA
480 490 500 510 520 530
530 540 550 560 570 580
orf133a.pep NNHSVSISADFGDYFMPFASYSRTHRMPNIQEMYFSQIGDSGVHTALKPERANTWQFGFN
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf133-1 NNHSVSISADFGDYFMPFASYSRTHRMPNIQEMYFSQIGDSGVHTALKPERANTWQFGFN
540 550 560 570 580 590
590 600 610 620 630 640
orf133a.pep TYKKGLLKQDDILGLKLVGYRSRIDXYIHNVYGKWWDLNGNIPSWVSSTGLAYTIQHRNF
||||||||||| ||||||||||||| ||||||||||||||:|||||||||||||||||||
orf133-1 TYKKGLLKQDDTLGLKLVGYRSRIDNYIHNVYGKWWDLNGDIPSWVSSTGLAYTIQHRNF
600 610 620 630 640 650
650 660 670 680 690 700
orf133a.pep KDKVHKHGFELELNYDYXRFFTNLSYAYQKSTQPTNFSDASESPNNASKEDQLKQGYGLS
||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
orf133-1 KDKVHKHGFELELNYDYGRFFTNLSYAYQKSTQPTNFSDASESPNNASKEDQLKQGYGLS
660 670 680 690 700 710
710 720 730 740 750 760
orf133a.pep RVSALPRDYGRLEVGTRWLGNKLTLGGAMRYFGKSIRATAEERYIDXTNGXXTSNFRQLG
|||||||||||||||||||||||||||||||||||||||||||||| ||| ||||||||
orf133-1 RVSALPRDYGRLEVGTRWLGNKLTLGGAMRYFGKSIRATAEERYIDGTNGGNTSNFRQLG
720 730 740 750 760 770
770 780 790 800 810 820orf133a.pep KRSIXQTETLARQPLIFDXYAAYEPKKXLIFRAEVKNLFDRRYIDPLDAGNDAATQRYYS
|||| ||||||||||||| |||||||| ||||||||||||||||||||||||||||||||
orf133-1 KRSIKQTETLARQPLIFDFYAAYEPKKNLIFRAEVKNLFDRRYIDPLDAGNDAATQRYYS
780 790 800 810 820 830
830 840 850 860 870
orf133a.pep SFDPKDKDEEVTCNDDNTLCNGKYGGTSKSVLTNFARGXTFLITMSYKFX
|||||||||:|||| |:||||||||||||||||||||| |||:|||||||
orf133-1 SFDPKDKDEDVTCNADKTLCNGKYGGTSKSVLTNFARGRTFLMTMSYKFX
840 850 860 870 880
Homology with a predicted ORF from N. gonorrhoeae
ORF133 shows 92.3% identity over a 392 aa overlap with a predicted ORF (ORF133ng) from N. gonorrhoeae:
orf133.pep PGYYGSDDEFKRAFGENSPTXKKHCNRSCGI 31
|||||::|||||||||||: |:||: |||:
orf133ng FYFDAALKKDIYRLNYSTNAINYRFGGEYTGYYGSENEFKRAFGENSPAYKEHCDPSCGL 560
orf133.pep YEPVLKKYGKKRANNHSVSISADFGDYFMPFASYSRTHRMPNIQEMYFSQIGDSGVHTAL 91
||||||||||||||||||||||||||||||||:|||||||||||||||||||||||||||
orf133ng YEPVLKKYGKKRANNHSVSISADFGDYFMPFAGYSRTHRMPNIQEMYFSQIGDSGVHTAL 620
orf133.pep KPERANTWQFGFXTYKKGLLKQDDTLGLKLVGYRSRIDNYIHNVYGKWWDLNGDIPSWVS 151
|||||||||||| ||||||||||| ||||||||||||||||||||||||||||||||||:
orf133ng KPERANTWQFGFNTYKKGLLKQDDILGLKLVGYRSRIDNYIHNVYGKWWDLNGDIPSWVG 680
orf133.pep STGLAYTIQHRXFXDKVHQXXXXXXXXYDYGRFFTNLSYAYQKSTQPTNFSDASESPNNA 211
||||||||:|| | ||||: |||||||||||||||||||||||||||||||||
orf133ng STGLAYTIRHRNFKDKVHKHGFELELNYDYGRFFTNLSYAYQKSTQPTNFSDASESPNNA 740
orf133.pep SKEDQLKQGYGLSRVSALPRDYGRLEVGTRWLGNKLTLGGAWRYFHKSIRATAEERYIDG 271
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf133ng SKEDQLKQGYGLSRVSALPRDYGRLEVGTRWLGNKLTLGGAMRYFGKSIRATAEERYIDG 800
orf133.pep TNGGNTSNFRQLGKRSIKQTETLARQPLIXDFNAAYEPKKNLIFRAEVKNLFDRRYIDPL 331
|||||||| |||||||||||||||||||| || |||||||||||||||||||||||||||
orf133ng TNGGNTSNVRQLGKRSIKQTETLARQPLIFDFYAAYEPKKNLIFRAEVKNLFDRRYIDPL 860
orf133.pep DAGNDAAXERYYSSFDPKDKDXDVTCNADKTLCNGKYGGTSKSVLTNFARGRTFLMTMSY 391
|||||||::|||||||||||| ||||||||||||||||||||||||||||||||||||||
orf133ng DAGNDAATQRYYSSFDPKDKDEDVTCNADKTLCNGKYGGTSKSVLTNFARGRTFLMTMSY 920
orf133.pep KF 393
||
orf133ng KF 922
The complete length ORF133ng nucleotide sequence <SEQ ID 823> is predicted to encode a protein having amino acid sequence <SEQ ID 824>:
1 MRSSFRLKPI CFYLMGVMLY HHSYAEDAGR AGSEAQIQVL EDVHVKAKRV
51 PKDKKVFTDA RAVSTRQDVF KSGENLDNIV RSIPGAFTQQ DKSSGIVSLN
101 IRGDSGFGRV NTMVDGITQT FYSTSTDAGR AGGSSQFGAS VDSNFIAGLD
151 VVKGSFSGSA GINSLAGSAN LRTLGVDDVV QGNNTYGLLL KGLTGTNSTK
201 GNAMAAIGAR KWLESGASVG VLYGHSRRGV AQNYRVGGGG QHIGNFGEEY
251 LERRKQQYFV QEGGLKFNAG SGKWERDLQR QYWKTKWYKK YEDPQELQKY
301 IEEHDKSWRE NLAPQYDITP IDPSGLKQQS AGNLLALEYD GVFNKYTAQF
351 RDLNTRIGSR KIINRNYQFN YGLSLNPYTN LNLTAAYNSG RQKYPKGAKF
401 TGWGLLKDFE TYNNAKILDL NNTATFRLPR ETELQTTLGF NYFHNEYGKN
451 RFPEELGLFF DGPDQDNGLY SYLGRFKGDK GLLPQKSTIV QPAGSQYFNT
501 FYFDAALKKD IYRLNYSTNA INYRFGGEYT GYYGSENEFK RAFGENSPAY
551 KEHCDPSCGL YEPVLKKYGK KRANNHSVSI SADFGDYFMP FAGYSRTHRM
601 PNIQEMYFSQ IGDSGVHTAL KPERANTWQF GFNTYKKGLL KQDDILGLKL
651 VGYRSRIDNY IHNVYGKWWD LNGDIPSWVG STGLAYTIRH RNFKDKVHKH
701 GFELELNYDY GRFFTNLSYA YQKSTQPTNF SDASESPNNA SKEDQLKQGY
751 GLSRVSALPR DYGRLEVGTR WLGNKLTLGG AMRYFGKSIR ATAEERYIDG
801 TNGGNTSNVR QLGKRSIKQT ETLARQPLIF DFYAAYEPKK NLIFRAEVKN
851 LFDRRYIDPL DAGNDAATQR YYSSFDPKDK DEDVTCNADK TLCNGKYGGT
901 SKSVLTNFAR GRTFLMTMSY KF*
A variant was also identified, encoded by the gonococcal DNA sequence <SEQ ID 825>:
1 ATGAGATCTT CTTTCCGGTT GAAGCCGATT TGTTTTTATC TTATGGGTGT
51 TATGCTATAT CATCATAGTT ATGCCGAAGA TGCAGGGCGC GCGGGCAGCG
101 AGGCGCAGAT ACAGGTTTTG GAAGATGTGC ACGTCAAGGC GAAGCGCGTA
151 CCGAAAGACA AAAAAGTGTT TACCGATGCG CGTGCCGTAT CGACCCGTca
201 gGATGTGTTC AAATCCGGCG AAAACCTCGA CAACATCGTA CGCAGCATAC
251 CCGGTGCGTT TACACAGCAA GATAAAAGCT CGGGCATTGT GTCTTTGAAT
301 ATTCGCGGCG ACAGCGGGTT CGGGCGGGTC AATACGATGG TGGACGGCAT
351 CACGCAGACC TTTTATTCGA CTTCTACCGA TGCGGGCAGG GCAGGCGGTT
401 CATCTCAATT CGGTGCATCT GTCGACAGCA ATTTTATTGC CGGACTGGAT
451 GTCGTCAAAG GCAGCTTCAG CGGCTCGGCA GGCATCAACA GCCTTGCCGG
501 TTCGGCGAAT CTGCGGACTT TAGGCGTGGA TGACGTCGTT CAGGGCAATA
551 ATACCTACGG CCTGCTGCTA AAAGGTCTGA CCGGCACCAA TTCAACCAAA
601 GGTAATGCGA TGGCGGCGAT AGGTGCGCGC AAATGGCTGG AAAGCGGAGC
651 GTCTGTCGGT GTGCTTTACG GGCACAGCAG GCGCGGCGTG GCGCAAAATT
701 ACCGCGTGGG CGGCGGCGGG CAGCACATCG GAAATTTTGG TGAAGAATAT
751 CTGGAACGGC GCAAACAGCA ATATTTTGTA CAAGAGGGTG GTTTGAAATT
801 CAATGCCGGC AGCGGAAAAT GGGAACGGGA TTTGCAAAGG CAATACTGGA
851 AAACAAAGTG GTATAAAAAA TACGAAGACC CCCAAGAACT GCAAAAATAC
901 ATCGAAGAGC ATGATAAAAG CTGGCGGGAA AACCTGGCGC CGCAATACGA
951 CATCACCCCC ATCGATCCGT CCGGCCTGAA GCAGCAGTCG GCAGGCAATC
1001 TGTTTAAATT GGAATACGAC GGCGTATTCA ATAAATACAC GGCGCAATTT
1051 CGCGATTTAA ACACCAGAAT CGGCAGCCGC AAAATCATCA ACCGCAATTA
1101 TCAATTCAAT TACGGTTTGT CTTTGAACCC GTATACCAAC CTCAATCTGA
1151 CCGCAGCCTA CAATTCGGGC AGGCAGAAAT ATCCGAAAGG GGCGAAGTTT
1201 ACAGGCTGGG GGCTTTTAAA AGATTTTGAA ACCTACAACA ACGCGAAAAT
1251 CCTCGACCTC AACAACACCG CCACCTTCCG GCTGCCCCGC GAAACCGAGT
1301 TGCAAACCAC TTTGGGCTTC AATTATTTCC ACAACGAATA CGGCAAAAAC
1351 CGCTTTCCTG AAGAATTGGG GCTGTTTTTC GACGGTCCTG ATCAGGACAA
1401 CGGGCTTTAT TCCTATTTGG GGCGGTTTAA GGGCGATAAA GGGCTGTTGC
1451 CTCAAAAATC AACCATTGTC CAACCGGCCG GCAGCCAATA TTTCAACACG
1501 TTCTACTTCG ATGCCGCGCT CAAAAAAGAC ATTTACCGCT TAAACTACAG
1551 CACCAATGCA ATCAACTACC GTTTCGGCGG CGAATATACG GGCTATTACG
1601 GCTCGGAAAA CGAATTTAAG CGGGCATTCG GAGAAAACTC GCCGGCATAC
1651 AAGGAACATT GCGACCCGAG CTGCGGGCTT TATGAACCCG TATTGAAAAA
1701 ATACGGCAAA AAGCGCGCCA ACAACCATTC GGTCAGCATT AGTGCGGACT
1751 TCGGCGATTA TTTCATGCCG TTCGCCGGGT ATTCGCGCAC ACACCGTATG
1801 CCCAACATCC AAGAAATGTA TTTTTCCCAA ATCGGCGACT CCGGCGTTCA
1851 CACCGCCTTA AAACCAGAGC GCGCAAACAC TTGGCAATTT GGCTTCAATA
1901 CCTATAAAAA AGGATTGTTA AAACAAGATG ATATATTAGG ATTGAAACTG
1951 GTCGGCTACC GCAGCCGCAT TGACAACTAC ATCCACAACG TTTACGGGAA
2001 ATGGTGGGAT TTGAACGGGG ATATTCCGAG CTGGGTCGGC AGCACCGGGC
2051 TTGCCTACAC CATCCGACAC CGCAATTTCA AAGACAAAGT GCACAAACAC
2101 GGTTTTGAGC TGGAGCTGAA TTACGATTAT GGGCGTTTTT TCACCAACCT
2151 TTCTTACGCC TATCAAAAAA GCACGCAACC GACCAATTTC AGCGATGCGA
2201 GCGAATCGCC CAACAATGCC tccaaAGAAG ACCAACTCAA ACAAGGTTAT
2251 GGGCTGAGCA GGGTTTCCGC CCTGCCGCGA GATTACGGAC GTTTGGAAGT
2301 CGGTACGCGC TGGTTGGGCA ACAAACTGAC TTTGGGCGGC GCGAtgcGCT
2351 ATTTCGGCAA GAGCATCCGC GCGACGGCTG AAGAACGCTA TATCGACGGC
2401 ACCAACGGGG GAAATACCAG CAATGTCCGG CAACTGGGCA AGCGTTCCAT
2451 CAAACAAACC GAAACCCTTG CCCGACAGCC TTTGATTTTT GATTTTTACG
2501 CCGCTTACGA GCCGAAGAAA AACCTTATTT TCCGCGCCGA AGTCAAAAAC
2551 CTGTTCGACA GGCGTTATAT CGATCCGCTC GATGCGGGCA ATGATGCGGC
2601 AACGCAGCGT TATTACAGCT CGTTCGACCC GAAAGACAAG GACGAAGACG
2651 TAACGTGTAA TGCTGATAAA ACGTTGTGCA ACGGCAAATA CGGCGGCACA
2701 AGCAAAAGCG TATTGACCAA TTTCGCACGC GGACGCACCT TCTTGATGAC
2751 GATGAGCTAC AAGTTTTAA
This corresponds to the amino acid sequence <SEQ ID 826; ORF133ng-1>:
1 MRSSFRLKPI CFYLMGVMLY HHSYAEDAGR AGSEAQIQVL EDVHVKAKRV
51 PKDKKVFTDA RAVSTRQDVF KSGENLDNIV RSIPGAFTQQ DKSSGIVSLN
101 IRGDSGFGRV NTMVDGITQT FYSTSTDAGR AGGSSQFGAS VDSNFIAGLD
151 VVKGSFSGSA GINSLAGSAN LRTLGVDDVV QGNNTYGLLL KGLTGTNSTK
201 GNAMAAIGAR KWLESGASVG VLYGHSRRGV AQNYRVGGGG QHIGNFGEEY
251 LERRKQQYFV QEGGLKFNAG SGKWERDLQR QYWKTKWYKK YEDPQELQKY
301 IEEHDKSWRE NLAPQYDITP IDPSGLKQQS AGNLFKLEYD GVFNKYTAQF
351 RDLNTRIGSR KIINRNYQFN YGLSLNPYTN LNLTAAYNSG RQKYPKGAKF
401 TGWGLLKDFE TYNNAKILDL NNTATFRLPR ETELQTTLGF NYFHNEYGKN
451 RFPEELGLFF DGPDQDNGLY SYLGRFKGDK GLLPQKSTIV QPAGSQYFNT
501 FYFDAALKKD IYRLNYSTNA INYRFGGEYT GYYGSENEFK RAFGENSPAY
551 KEHCDPSCGL YEPVLKKYGK KRANNHSVSI SADFGDYFMP FAGYSRTHRM
601 PNIQEMYFSQ IGDSGVHTAL KPERANTWQF GFNTYKKGLL KQDDILGLKL
651 VGYRSRIDNY IHNVYGKWWD LNGDIPSWVG STGLAYTIRH RNFKDKVHKH
701 GFELELNYDY GRFFTNLSYA YQKSTQPTNF SDASESPNNA SKEDQLKQGY
751 GLSRVSALPR DYGRLEVGTR WLGNKLTLGG AMRYFGKSIR ATAEERYIDG
801 TNGGNTSNVR QLGKRSIKQT ETLARQPLIF DFYAAYEPKK NLIFRAEVKN
851 LFDRRYIDPL DAGNDAATQR YYSSFDPKDK DEDVTCNADK TLCNGKYGGT
901 SKSVLTNFAR GRTFLMTMSY KF*
ORF133ng-1 and ORF133-1 show 96.2% identity in 889 aa overlap:
10 20 30 40 50 60orf133ng-1.pep SFRLKPICFYLMGVMLYHHSYAEDAGRAGSEAQIQVLEDVHVKAKRVPKDKKVFTDARAV
||||||||||||||||||||||||||||||orf133-1 EAQIQVLEDVHVKAKRVPKDKKVFTDARAV
10 20 30
70 80 90 100 110 120orf133ng-1.pep STRQDVFKSGENLDNIVRSIPGAFTQQDKSSGIVSLNIRGDSGFGRVNTMVDGITQTFYS
|||||:|||:||||||||||||||||||||||||||||||||||||||||||||||||||orf133-1 STRQDIFKSSENLDNIVRSIPGAFTQQDKSSGIVSLNIRGDSGFGRVNTMVDGITQTFYS
40 50 60 70 80 90
130 140 150 160 170 180orf133ng-1.pep TSTDAGRAGGSSQFGASVDSNFIAGLDVVKGSFSGSAGINSLAGSANLRTLGVDDVVQGN
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf133-1 TSTDAGRAGGSSQFGASVDSNFIAGLDVVKGSFSGSAGINSLAGSANLRTLGVDDVVQGN
100 110 120 130 140 150
190 200 210 220 230 240orf133ng-1.pep NTYGLLLKGLTGTNSTKGNAMAAIGARKWLESGASVGVLYGHSRRGVAQNYRVGGGGQHI
|||||||||||||||||||||||||||||||||||||||||||||:||||||||||||||orf133-1 NTYGLLLKGLTGTNSTKGNAMAAIGARKWLESGASVGVLYGHSRRSVAQNYRVGGGGQHI
160 170 180 190 200 210
250 260 270 280 290 300orf133ng-1.pep GNFGEEYLERRKQQYFVQEGGLKFNAGSGKWERDLQRQYWKTKWYKKYEDPQELQKYIEE
|||| ||||||||:||||||:||||: ||||||||||| || | ||:|:: |||||||||orf133-1 GNFGAEYLERRKQRYFVQEGALKFNSDSGKWERDLQRQQWKYKPYKNYNN-QELQKYIEE
220 230 240 250 260
310 320 330 340 350 360orf133ng-1.pep HDKSWRENLAPQYDITPIDPSGLKQQSAGNLFKLEYDGVFNKYTAQFRDLNTRIGSRKII
||||||||| |||||||||||:||||||||||||||||||||||||||||||:|||||||orf133-1 HDKSWRENLXPQYDITPIDPSSLKQQSAGNLFKLEYDGVFNKYTAQFRDLNTKIGSRKII
270 280 290 300 310 320
370 380 390 400 410 420orf133ng-1.pep NRNYQFNYGLSLNPYTNLNLTAAYNSGRQKYPKGAKFTGWGLLKDFETYNNAKILDLNNT
||||||||||||||||||||||||||||||||||:|||||||||||||||||||||||||orf133-1 NRNYQFNYGLSLNPYTNLNLTAAYNSGRQKYPKGSKFTGWGLLKDFETYNNAKILDLNNT
330 340 350 360 370 380
430 440 450 460 470 480orf133ng-1.pep ATFRLPRETELQTTLGFNYFHNEYGKNRFPEELGLFFDGPDQDNGLYSYLGRFKGDKGLL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf133-1 ATFRLPRETELQTTLGFNYFHNEYGKNRFPEELGLFFDGPDQDNGLYSYLGRFKGDKGLL
390 400 410 420 430 440
490 500 510 520 530 540orf133ng-1.pep PQKSTIVQPAGSQYFNTFYFDAALKKDIYRLNYSTNAINYRFGGEYTGYYGSENEFKRAF
||||||||||||||||||||||||||||||||||||:::|||||||||||||::||||||orf133-1 PQKSTIVQPAGSQYFNTFYFDAALKKDIYRLNYSTNTVGYRFGGEYTGYYGSDDEFKRAF
450 460 470 480 490 500
550 560 570 580 590 600orf133ng-1.pep GENSPAYKEHCDPSCGLYEPVLKKYGKKRANNHSVSISADFGDYFMPFAGYSRTHRMPNI
|||||:||:||: |||:||||||||||||||||||||||||||||||||:||||||||||orf133-1 GENSPTYKKHCNRSCGIYEPVLKKYGKKRANNHSVSISADFGDYFMPFASYSRTHRMPNI
510 520 530 540 550 560
610 620 630 640 650 660orf133ng-1.pep QEMYFSQIGDSGVHTALKPERANTWQFGFNTYKKGLLKQDDILGLKLVGYRSRIDNYIHN
||||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||orf133-1 QEMYFSQIGDSGVHTALKPERANTWQFGFNTYKKGLLKQDDTLGLKLVGYRSRIDNYIHN
570 580 590 600 610 620
670 680 690 700 710 720orf133ng-1.pep VYGKWWDLNGDIPSWVGSTGLAYTIRHRNFKDKVHKHGFELELNYDYGRFFTNLSYAYQK
||||||||||||||||:||||||||:||||||||||||||||||||||||||||||||||orf133-1 VYGKWWDLNGDIPSWVSSTGLAYTIQHRNFKDKVHKHGFELELNYDYGRFFTNLSYAYQK
630 640 650 660 670 680
730 740 750 760 770 780orf133ng-1.pep STQPTNFSDASESPNNASKEDQLKQGYGLSRVSALPRDYGRLEVGTRWLGNKLTLGGAMR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf133-1 STQPTNFSDASESPNNASKEDQLKQGYGLSRVSALPRDYGRLEVGTRWLGNKLTLGGAMR
690 700 710 720 730 740
790 800 810 820 830 840orf133ng-1.pep YFGKSIRATAEERYIDGTNGGNTSNVRQLGKRSIKQTETLARQPLIFDFYAAYEPKKNLI
||||||||||||||||||||||||| ||||||||||||||||||||||||||||||||||orf133-1 YFGKSIRATAEERYIDGTNGGNTSNFRQLGKRSIKQTETLARQPLIFDFYAAYEPKKNLI
750 760 770 780 790 800
850 860 870 880 890 900orf133ng-1.pep FRAEVKNLFDRRYIDPLDAGNDAATQRYYSSFDPKDKDEDVTCNADKTLCNGKYGGTSKS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||orf133-1 FRAEVKNLFDRRYIDPLDAGNDAATQRYYSSFDPKDKDEDVTCNADKTLCNGKYGGTSKS
810 820 830 840 850 860
910 920orf133ng-1.pep VLTNFARGRTFLMTMSYKFX
||||||||||||||||||||orf133-1 VLTNFARGRTFLMTMSYKFX
870 880
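In addition, ORF133ng-1 is homologous to a TonB-dependent receptor in H. influenzae:
sp|P45114|YC17_HAEIN PROBABLE TONB-DEPENDENT RECEPTOR HI1217 PRECURSOR
>gi|1075372|pir||G64110 transferrin binding protein 1 precursor (tbp1) homolog - Haemophilus influenzae (strain Rd KW20) >gi|1574147 (U32801) transferrin binding protein 1 precursor (tbp1) [Haemophilus influenzae] Length = 913
Score = 930 bits (2377), Expect = 0.0
Identities = 476/921 (51%), Positives = 619/921 (66%), Gaps = 72/921 (7%)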
Query: 38 QVLEDVHVKAKRVPKDKKVFTDARAVSTRQDVFKSGENLDNIVRSIPGAFTQQDKSSGIV 97
+ L + V K + DKK FT+A+A STR++VFK + +D ++RSIPGAFTQQDK SG+V
Sbjct: 29 ETLGQIDVVEKVISNDKKPFTEAKAKSTRENVFKETQTIDQVIRSIPGAFTQQDKGSGVV 88
Query: 98 SLNIRGDSGFGRVNTMVDGITQTFYSTSTDAGRAGGSSQFGASVDSNFIAGLDVVKGSFS 157
S+NIRG++G GRVNTMVDG+TQTFYST+ D+G++GGSSQFGA++D NFIAG+DV K +FS
Sbjct: 89 SVNIRGENGLGRVNTMVDGVTQTFYSTALDSGQSGGSSQFGAAIDPNFIAGVDVNKSNFS 148
Query: 158 GSAGINSLAGSANLRTLGVDDVVQXXXXXXXXXXXXXXXXXXXXXAMAAIGARKWLESGA 217
G++GIN+LAGSAN RTLGV+DV+ M RKWL++G
Sbjct: 149 GASGINALAGSANFRTLGVNDVITDDKPFGIILKGMTGSNATKSNFMTMAAGRKWLDNGG 208
Query: 218 SVGVLYGHSRRGVAQNYRVGGGGQHIGNFGEEYLERRKQQYFVQEGGLKFNAGSGKWERD 277
VGV+YG+S+R V+Q+YR+ GGG+ + + G++ L + K+ YF + G N G+W D
Sbjct: 209 YVGVVYGYSQREVSQDYRI-GGGERLASLGQDILAKEKEAYF-RNAGYILNP-EGQWTPD 265
Query: 278 LQRQYWK-----------TKWY--------------------KKYEDPQELQK---YIEE 303
L +++W +Y KK +D ++LQK IEE
Sbjct: 266 LSKKHWSCNKPDYQKNGDCSYYRIGSAAKTRREILQELLTNGKKPKDIEKLQKGNDGIEE 325
Query: 304 HDKSWRENLAPQYDITPIDPSGLKQQSAGNLFKLEYDGVFNKYTAQFRDLNTRIGSRKII 363
DKS+ N QY + PI+P L+ +S +L K EY AQ R L+ +IGSRKI
Sbjct: 326 TDKSFERN-KDQYSVAPIEPGSLQSRSRSHLLKFEYGDDHQNLGAQLRTLDNKIGSRKIE 384
Query: 364 NRNYQFNYGLSLNPYTNLNLTAAYNSGRQKYPKGAKFTGWGLLKDFETYNNAKILDLNNT 423
NRNYQ NY + N Y +LNL AA+N G+ YPKG F GW + T N A I+D+NN+
Sbjct: 385 NRNYQVNYNFNNNSYLDLNLMAAHNIGKTIYPKGGFFAGWQVADKLITKNVANIVDINNS 444
Query: 424 ATFRLPRETELQTTLGFNYFHNEYGKNRFPEELGLFFDGPDQDNGLYSY--LGRFKGDKG 481
TF LP+E +L+TTLGFNYF NEY KNRFPEEL LF++ D GLYS+ GR+ G K
Sbjct: 445 HTFLLPKEIDLKTTLGFNYFTNEYSKNRFPEELSLFYNDASHDQGLYSHSKRGRYSGTKS 504
Query: 482 LLPQKSTIVQPAGSQYFNTFYFDAALKKDIYRLNYSTNAINYRFGGEYTGYYGSENEFKR 541
LLPQ+S I+QP+G Q F T YFD AL K IY LNYS N +Y F GEY GY
Sbjct: 505 LLPQRSVILQPSGKQKFKTVYFDTALSKGIYHLNYSVNFTHYAFNGEYVGY--------- 555
Query: 542 AFGENSPAYKEHCDPSCGLYEPVLKKYGKKRANNHSVSISADFGDYFMPFAGYSRTHRMP 601
EN+ + + EP+L K G K+A NHS ++SA+ DYFMPF YSRTHRMP
Sbjct: 556 ---ENTAGQQ--------INEPILHKSGHKKAFNHSATLSAELSDYFMPFFTYSRTHRMP 604
Query: 602 NIQEMYFSQIGDSGVHTALKPERANTWQFGFNTYKKGLLKQDDILGLKLVGYRSRIDNYI 661
NIQEM+FSQ+ ++GV+TALKPE+++T+Q GFNTYKKGL QDD+LG+KLVGYRS I NYI
Sbjct: 605 NIQEMFFSQVSNAGVNTALKPEQSDTYQLGFNTYKKGLFTQDDVLGVKLVGYRSFIKNYI 664
Query: 662 HNVYGKWWDLNGDIPSWVGSTGLAYTIRHRNFKDKVHKHGFELELNYDYGRFFTNLSYAY 721
HNVYG WW +P+W S G YTI H+N+K V K G ELE+NYD GRFF N+SYAY
Sbjct: 665 HNVYGVWW--RDGMPTWAESNGFKYTIAHQNYKPIVKKSGVELEINYDMGRFFANVSYAY 722
Query: 722 QKSTQPTNFSDASESPNNASKEDQLKQGYGLSRVSALPRDYGRLEVGTRWLGNKLTLGGA 781
Q++ QPTN++DAS PNNAS+ED LKQGYGLSRVS LP+DYGRLE+GTRW KLTLG A
Sbjct: 723 QRTNQPTNYADASPRPNNASQEDILKQGYGLSRVSMLPKDYGRLELGTRWFDQKLTLGLA 782
Query: 782 MRYFGKSIRATAEERYIDGTNGGNTSNVRQLGKRSIKQTETLARQPLIFDFYAAYEPKKN 841
RY+GKS RAT EE YI+G+ + +R+ ++K+TE + +QP+I D + +YEP K+
Sbjct: 783 ARYYGKSKRATIEEEYINGSR-FKKNTLRRENYYAVKKTEDIKKQPIILDLHVSYEPIKD 841
Query: 842 LIFRAEVKNLFDRRYIDPLDAGNDAATQRYYSSFDPKDKDEDVTCNADKTLCNGKYGGTS 901
LI +AEV+NL D+RY+DPLDAGNDAA+QRYYSS + + C D + C GG+
Sbjct: 842 LIIKAEVQNLLDKRYVDPLDAGNDAASQRYYSSL-----NNSIECAQDSSAC----GGSD 892
Query: 902 KSVLTNFARGRTFLMTMSYKF 922
K+VL NFARGRT++++++YKF
Sbjct: 893 KTVLYNFARGRTYILSLNYKF 913
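The underlined motif in the gonococcal protein is predicted to be an ATP/GTP-binding site motif A (P-loop). This analysis suggests that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.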
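The P-loop prediction can be illustrated with a simple pattern scan. The Python sketch below is not part of the original disclosure: it assumes the motif was defined as in PROSITE entry PS00017, `[AG]-x(4)-G-K-[ST]`, and the helper name `find_p_loops` is hypothetical. The underlining did not survive in this text, so the fragment used here (residues 771-790 of <SEQ ID 826> as printed) is only a candidate location for the motif the paragraph refers to.

```python
import re

# PROSITE PS00017 (ATP/GTP-binding site motif A, "P-loop"): [AG]-x(4)-G-K-[ST].
# A lookahead is used so that overlapping matches are also reported.
P_LOOP = re.compile(r"(?=([AG].{4}GK[ST]))")


def find_p_loops(protein: str, offset: int = 1):
    """Return (position, motif) pairs; position is 1-based, shifted by `offset`."""
    seq = "".join(protein.split()).upper()
    return [(m.start() + offset, m.group(1)) for m in P_LOOP.finditer(seq)]


# Residues 771-790 of ORF133ng-1 <SEQ ID 826>, copied from the listing.
fragment = "WLGNKLTLGG AMRYFGKSIR"
for position, motif in find_p_loops(fragment, offset=771):
    print(position, motif)
```

Running the scan over a whole sequence only needs `offset=1`; a pattern hit of this kind is a prediction and says nothing by itself about nucleotide binding.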
Example 104
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 827>:
1 ATGAACCTGA TTTCACGTTA CATCATCCGT CAAATGGCGG TTATGGCGGT
51 TTACGCGCTC CTTGCCTTCC TCGCTTTGTA CAGCTTTTTT GAAATCCTGT
101 ACGAAACCGG CAACCTCGGC AAAGGCAGTT ACGGCATATG GGAAATGCTG
151 GGCTACACCG CCCTCAAAAT GCCCGCCCGC GCCTACGAAC TGATTCCCCT
201 CGCCGTCCTT ATCGGCGGAC TGGTCTCCCT CAGCCAGCTT GCCGCCGGCA
251 GCGAACTGAC CGTCATCAAA GCCAGCGGCA TGAGCACCAA AAAGCTGCTG
301 TTGATTCTGT CGCAGTTCGG TTTTATTTTT GCTATTGCCA CCGTCGCGCT
351 CGGCGAATGG GTTGCGCCCA CACTGAGCCA AAAAGCCGAA AACATCAAAG
401 CCGCCGCCAT CAACGGCAAA ATCAGCACCG GCAATACCGG CCTTTGGCTG
451 AAAGAAAAAA ACAGCGTGAT CAATGTGCGC GAAATGTTGC CCGACCAT..
This corresponds to the amino acid sequence <SEQ ID 828; ORF112>:
1 MNLISRYIIR QMAVMAVYAL LAFLALYSFF EILYETGNLG KGSYGIWEML
51 GYTALKMPAR AYELIPLAVL IGGLVSLSQL AAGSELTVIK ASGMSTKKLL
101 LILSQFGFIF AIATVALGEW VAPTLSQKAE NIKAAAINGK ISTGNTGLWL
151 KEKNSVINVR EMLPDH...
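The correspondence between the partial DNA sequence <SEQ ID 827> and the ORF112 amino acid sequence can be checked by translating the DNA in the first reading frame. The following Python sketch is an illustration only, not part of the original disclosure; it assumes the standard genetic code, and the helper name `translate` is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; whitespace is ignored, a trailing partial codon is dropped."""
    seq = "".join(dna.split()).upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


# First 30 nucleotides of <SEQ ID 827> as printed in the listing.
print(translate("ATGAACCTGA TTTCACGTTA CATCATCCGT"))
```

The same function can be applied to the other unambiguous nucleotide listings in this example; stop codons are reported as `*`, matching the notation used in the amino acid listings.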
Further work revealed the partial nucleotide sequence <SEQ ID 829>:
1 ATGAACCTGA TTTCACGTTA CATCATCCGT CAAATGGCGG TTATGGCGGT
51 TTACGCGCTC CTTGCCTTCC TCGCTTTGTA CAGCTTTTTT GAAATCCTGT
101 ACGAAACCGG CAACCTCGGC AAAGGCAGTT ACGGCATATG GGAAATGCTG
151 gGCTACACCG CCCTCAAAAT GCCCGCCCGC GCCTACGAAC TGATTCCCCT
201 CGCCGTCCTT ATCGGCGGAC TGGTCTCCCT CAGCCAGCTT GCCGCCGGCA
251 GCGAACTGAC CGTCATCAAA GCCAGCGGCA TGAGCACCAA AAAGCTGCTG
301 TTGATTCTGT CGCAGTTCGG TTTTATTTTT GCTATTGCCA CCGTCGCGCT
351 CGGCGAATGG GTTGCGCCCA CACTGAGCCA AAAAGCCGAA AACATCAAAG
401 CCGCCGCCAT CAACGGCAAA ATCAGCACCG GCAATACCGG CCTTTGGCTG
451 AAAGAAAAAA ACAGCrTkAT CAATGTGCGC GAAATGTTGC CCGACCATAC
501 GCTTTTGGGC ATCAAAATTT GGGCGCGCAA CGATAAAAAC GAATTGGCAG
551 AGGCAGTGGA AGCCGATTCC GCCGTTTTGA ACAGCGACGG CAGTTGGCAG
601 TTGAAAAACA TCCGCCGCAG CACGCTTGGC GAAGACAAAG TCGAGGTCTC
651 TATTGCGGCT GAAGAAAACT GGCCGATTTC CGTCAAACGC AACCTGATGG
701 ACGTATTGCT CGTCAAACCC GACCAAATGT CCGTCGGCGA ACTGACCACC
751 TACATCCGCC ACCTCCAAAA CAACAGCCAA AACACCCGAA TCTACGCCAT
801 CGCATGGTGG CGCAAATTGG TTTACCCCGC CGCAGCCTGG GTGATGGCGC
851 TCGTCGCCTT TGCCTTTACC CCGCAAACCA CCCGCCACGG CAATATGGGC
901 TTAAAACTCT TCGGCGGCAT CTGTsTCGGA TTGCTGTTCC ACCTTGCCGG
951 ACGGCTCTTT GGGTTTACCA GCCAACTCGG...
This corresponds to the amino acid sequence <SEQ ID 830; ORF112-1>:
1 MNLISRYIIR QMAVMAVYAL LAFLALYSFF EILYETGNLG KGSYGIWEML
51 GYTALKMPAR AYELIPLAVL IGGLVSLSQL AAGSELTVIK ASGMSTKKLL
101 LILSQFGFIF AIATVALGEW VAPTLSQKAE NIKAAAINGK ISTGNTGLWL
151 KEKNSXINVR EMLPDHTLLG IKIWARNDKN ELAEAVEADS AVLNSDGSWQ
201 LKNIRRSTLG EDKVEVSIAA EENWPISVKR NLMDVLLVKP DQMSVGELTT
251 YIRHLQNNSQ NTRIYAIAWW RKLVYPAAAW VMALVAFAFT PQTTRHGNMG
301 LKLFGGICXG LLFHLAGRLF GFTSQL...
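The lower-case letters `r`, `k` and `s` in <SEQ ID 829> look like IUPAC ambiguity codes, and the `X` residues in ORF112-1 fall where those codes occur. The Python sketch below illustrates that reading; it is an assumption, not something stated in the text, that `X` marks a codon whose possible translations disagree. The helper name `translate_ambiguous` is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate_ambiguous(dna: str) -> str:
    """Translate frame 1; emit 'X' when the expansions of a codon encode different residues."""
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        expansions = product(*(IUPAC[base] for base in seq[i:i + 3]))
        residues = {CODON_TABLE["".join(codon)] for codon in expansions}
        protein.append(residues.pop() if len(residues) == 1 else "X")
    return "".join(protein)


# Nucleotides 451-480 of <SEQ ID 829>, which contain the codon rTk.
print(translate_ambiguous("AAAGAAAAAA ACAGCrTkAT CAATGTGCGC"))
```

Here `rTk` expands to ATG, ATT, GTG and GTT (Met, Ile, Val, Val), so no single residue can be assigned, which is consistent with the `X` at position 156 of ORF112-1.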
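Computer analysis of this amino acid sequence predicted two transmembrane domains and gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF112 shows 96.4% identity over a 166 aa overlap with an ORF (ORF112a) from strain A of N. meningitidis: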
10 20 30 40 50 60
orf112.pep MNLISRYIIRQMAVMAVYALLAFLALYSFFEILYETGNLGKGSYGIWEMLGYTALKMPAR
||||||||||||||||||||||||||||||||||||||||||||||||| ||||||| ||
orf112a MNLISRYIIRQMAVMAVYALLAFLALYSFFEILYETGNLGKGSYGIWEMXGYTALKMXAR
10 20 30 40 50 60
70 80 90 100 110 120
orf112.pep AYELIPLAVLIGGLVSLSQLAAGSELTVIKASGMSTKKLLLILSQFGFIFAIATVALGEW
||||:||||||||||| |||||||||:|||||||||||||||||||||||||||||||||
orf112a AYELMPLAVLIGGLVSXSQLAAGSELXVIKASGMSTKKLLLILSQFGFIFAIATVALGEW
70 80 90 100 110 120
130 140 150 160
orf112.pep VAPTLSQKAENIKAAAINGKISTGNTGLWLKEKNSVINVREMLPDH
|||||||||||||||||||||||||||||||||||:||||||||||
orf112a VAPTLSQKAENIKAAAINGKISTGNTGLWLKEKNSIINVREMLPDHTLLGIKIWARNDKN
130 140 150 160 170 180
orf112a ELAEAVEADSAVLNSDGSWQLKNIRRSTLGEDKVEVSIAAEEXWPISVKRNLMDVLLVKP
190 200 210 220 230 240
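The identity figure quoted for this overlap can be recomputed from the two listings. The Python sketch below is illustrative only: it assumes identity is counted column by column over the overlap, with `X` (undetermined) and gap positions never counted as identical, and the function name `percent_identity` is hypothetical. On the sequences as printed this gives 160 identical positions out of 166.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned columns with the same residue ('X' and '-' never match)."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    same = sum(1 for x, y in zip(a, b) if x == y and x not in "X-")
    return 100.0 * same / len(a)


# ORF112 <SEQ ID 828> (166 aa) and the first 166 residues of ORF112a <SEQ ID 832>.
ORF112 = (
    "MNLISRYIIRQMAVMAVYALLAFLALYSFFEILYETGNLGKGSYGIWEML"
    "GYTALKMPARAYELIPLAVLIGGLVSLSQLAAGSELTVIKASGMSTKKLL"
    "LILSQFGFIFAIATVALGEWVAPTLSQKAENIKAAAINGKISTGNTGLWL"
    "KEKNSVINVREMLPDH"
)
ORF112A_PREFIX = (
    "MNLISRYIIRQMAVMAVYALLAFLALYSFFEILYETGNLGKGSYGIWEMX"
    "GYTALKMXARAYELMPLAVLIGGLVSXSQLAAGSELXVIKASGMSTKKLL"
    "LILSQFGFIFAIATVALGEWVAPTLSQKAENIKAAAINGKISTGNTGLWL"
    "KEKNSIINVREMLPDH"
)

print(f"{percent_identity(ORF112, ORF112A_PREFIX):.1f}% identity over {len(ORF112)} aa")
```

No gaps are needed for this pair, so the two strings can be compared directly; alignments that contain gaps would first have to be padded with `-` to equal length.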
The ORF112a nucleotide sequence <SEQ ID 831> is:
1 ATGAACCTGA TTTCACGTTA CATCATCCGT CAAATGGCGG TTATGGCGGT
51 TTACGCGCTC CTTGCCTTCC TCGCTTTGTA CAGCTTTTTT GAAATCCTGT
101 ACGAAACCGG CAACCTCGGC AAAGGCAGTT ACGGCATATG GGAAATGNTG
151 GGNTACACCG CCCTCAAAAT GNCCGCCCGC GCCTACGAAC TGATGCCCCT
201 CGCCGTCCTT ATCGGCGGAC TGGTCTCTNT CAGCCAGCTT GCCGCCGGCA
251 GCGAACTGAN CGTCATCAAA GCCAGCGGCA TGAGCACCAA AAAGCTGCTG
301 TTGATTCTGT CGCAGTTCGG TTTTATTTTT GCTATTGCCA CCGTCGCGCT
351 CGGCGAATGG GTTGCGCCCA CACTGAGCCA AAAAGCCGAA AACATCAAAG
401 CCGCGGCCAT CAACGGCAAA ATCAGTACCG GCAATACCGG CCTTTGGCTG
451 AAAGAAAAAA ACAGCATTAT CAATGTGCGC GAAATGTTGC CCGACCATAC
501 CCTGCTGGGC ATTAAAATCT GGGCCCGCAA CGATAAAAAC GAACTGGCAG
551 AGGCAGTGGA AGCCGATTCC GCCGTTTTGA ACAGCGACGG CAGTTGGCAG
601 TTGAAAAACA TCCGCCGCAG CACGCTTGGC GAAGACAAAG TCGAGGTCTC
651 TATTGCGGCT GAAGAAAANT GGCCGATTTC CGTCAAACGC AACCTGATGG
701 ACGTATTGCT CGTCAAACCC GACCAAATGT CCGTCGGCGA ACTGACCACC
751 TACATCCGCC ACCTCCAAAN NNACAGCCAA AACACCCGAA TCTACGCCAT
801 CGCATGGTGG CGCAAATTGG TTTACCCCGC CGCAGCCTGG GTGATGGCGC
851 TCGTCGCCTT TGCCTTTACC CCGCAAACCA CCCGCCACGG CAATATGGGC
901 TTAAAANTCT TCGGCGGCAT CTGTCTCGGA TTGCTGTTCC ACCTTGCCGG
951 NCGGCTCTTC NGGTTTACCA GCCAACTCTA CGGCATCCCG CCCTTCCTCG
1001 NCGGCGCACT ACCTACCATA GCCTTCGCCT TGCTCGCCGT TTGGCTGATA
1051 CGCAAACAGG AAAAACGCTA A
This encodes a protein having amino acid sequence <SEQ ID 832>:
1 MNLISRYIIR QMAVMAVYAL LAFLALYSFF EILYETGNLG KGSYGIWEMX
51 GYTALKMXAR AYELMPLAVL IGGLVSXSQL AAGSELXVIK ASGMSTKKLL
101 LILSQFGFIF AIATVALGEW VAPTLSQKAE NIKAAAINGK ISTGNTGLWL
151 KEKNSIINVR EMLPDHTLLG IKIWARNDKN ELAEAVEADS AVLNSDGSWQ
201 LKNIRRSTLG EDKVEVSIAA EEXWPISVKR NLMDVLLVKP DQMSVGELTT
251 YIRHLQXXSQ NTRIYAIAWW RKLVYPAAAW VMALVAFAFT PQTTRHGNMG
301 LKXFGGICLG LLFHLAGRLF XFTSQLYGIP PFLXGALPTI AFALLAVWLI
351 RKQEKR*
ORF112a and ORF112-1 show 96.3% identity in 326 aa overlap:
orf112a.pep MNLISRYIIRQMAVMAVYALLAFLALYSFFEILYETGNLGKGSYGIWEMXGYTALKMXAR
||||||||||||||||||||||||||||||||||||||||||||||||| ||||||| ||
orf112-1 MNLISRYIIRQMAVMAVYALLAFLALYSFFEILYETGNLGKGSYGIWEMLGYTALKMPAR
orf112a.pep AYELMPLAVLIGGLVSXSQLAAGSELXVIKASGMSTKKLLLILSQFGFIFAIATVALGEW
||||:||||||||||| |||||||||:|||||||||||||||||||||||||||||||||
orf112-1 AYELIPLAVLIGGLVSLSQLAAGSELTVIKASGMSTKKLLLILSQFGFIFAIATVALGEW
orf112a.pep VAPTLSQKAENIKAAAINGKISTGNTGLWLKEKNSIINVREMLPDHTLLGIKIWARNDKN
||||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||
orf112-1 VAPTLSQKAENIKAAAINGKISTGNTGLWLKEKNSXINVREMLPDHTLLGIKIWARNDKN
orf112a.pep ELAEAVEADSAVLNSDGSWQLKNIRRSTLGEDKVEVSIAAEEXWPISVKRNLMDVLLVKP
|||||||||||||||||||||||||||||||||||||||||| |||||||||||||||||
orf112-1 ELAEAVEADSAVLNSDGSWQLKNIRRSTLGEDKVEVSIAAEENWPISVKRNLMDVLLVKP
orf112a.pep DQMSVGELTTYIRHLQXXSQNTRIYAIAWWRKLVYPAAAWVMALVAFAFTPQTTRHGNMG
|||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
orf112-1 DQMSVGELTTYIRHLQNNSQNTRIYAIAWWRKLVYPAAAWVMALVAFAFTPQTTRHGNMG
orf112a.pep LKXFGGICLGLLFHLAGRLFXFTSQLYGIPPFLXGALPTIAFALLAVWLIRKQEKRX
|| ||||| ||||||||||| |||||
orf112-1 LKLFGGICXGLLFHLAGRLFGFTSQL
Homology with a predicted ORF from N. gonorrhoeae
ORF112 shows 95.8% identity over a 166 aa overlap with a predicted ORF (ORF112ng) from N. gonorrhoeae:
orf112.pep MNLISRYIIRQMAVMAVYALLAFLALYSFFEILYETGNLGKGSYGIWEMLGYTALKMPAR 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf112ng MNLISRYIIRQMAVMAVYALLAFLALYSFFEILYETGNLGKGSYGIWEMLGYTALKMPAR 60
orf112.pep AYELIPLAVLIGGLVSLSQLAAGSELTVIKASGMSTKKLLLILSQFGFIFAIATVALGEW 120
||||:|||||||||:|||||||||||:||||||||||||||||||||||||||:||||||
orf112ng AYELMPLAVLIGGLASLSQLAAGSELAVIKASGMSTKKLLLILSQFGFIFAIAAVALGEW 120
orf112.pep VAPTLSQKAENIKAAAINGKISTGNTGLWLKEKNSVINVREMLPDH 166
|||||||||||||||||||||||||||||||||:|:|||| |||||
orf112ng VAPTLSQKAENIKAAAINGKISTGNTGLWLKEKTSIINVRGMLPDHTLLGIKIWARNDKN 180
The complete length ORF112ng nucleotide sequence <SEQ ID 833> is:
1 ATGAACCTGA TTTCACGTTA CATCATCCGC CAAATGGCGG TTATGGCGGT
51 TTACGCGCTC CTTGCCTTCC TCGCTTTGTA CAGCTTTTTT GAAATCCTGT
101 ACGAAACCGG CAACCTCGGC AAAGGCAGTT ACGGCATATG GGAAATGCTG
151 GGCTACACCG CCCTCAAAAT GCCCGCCCGC GCCTACGAAC TCATGCCCCT
201 CGCCGTCCTC ATCGGCGGAC TGGCCTCTCT CAGCCAGCTT GCCGCCGGCA
251 GCGAACTGGC CGTCATCAAA GCCAGCGGCA TGAGCACCAA AAAGCTGCTG
301 TTGATTCTGT CTCAGTTCGG TTTTATTTTT GCTATTGCCG CCGTCGCGCT
351 CGGCGAATGG GTTGCGCCCA CGCTGAGCCA AAAAGCCGAA AACATCAAag
401 cCGCCGCCAt taacggCAAA ATCAGCAccg gcAATACCGG CCTTTggcTG
451 AAAGAAAAAa ccAGCATTAT CAATGTGcGc GGAATGTTGC CCGACCATAC
501 GCTTTTGGGC ATCAAAATTT GGGCGCGCAA CGATAAAAAC GAATTGGCAG
551 AGGCAGTGGA AGCCGATTCC GCCGTTTTGA ACAGCGACGG CAGCTGGCAG
601 TTGAAAAACA TCCGCCGCAG CATCATGGGT ACAGACAAAA TCGAAACATC
651 cgCCGCCGCC GAAGAAACTT gGCCGATTGC CGTCAGACGC AACCTGATGG
701 ACGTATTGCT CGTCAAGCCC GACCAAATGT CCGTCGGCGA GCTGACCACC
751 TACATCCGCC ACCTCCAAAA CAACAGCCAA AACACCCAAA TCTACGCCAT
801 CGCATGGTGG CGTAAACTCG TTTACCCCGT CGCCGCATGG GTCATGGCGC
851 TCGTTGCCTT CGCCTTTACG CCGCAAACCA CGCGCCACGG CAATATGGGC
901 TTAAAACTCT TCGGCGGCAT CTGTCTCGGA TTGCTGTTCC ACCTTGCCGG
951 CAGGCTCTTC GGGTTTACCA GCCAACTCTA CGGCACCCCA CCCTTCCTCG
1001 CCGGCGCACT GCCTACCATA GCCTTCGCCT TGCTCGCTGT TTGGCTGATA
1051 CGCAAACAGG AAAAACGTTG A
This encodes a protein having amino acid sequence <SEQ ID 834>:
1 MNLISRYIIR QMAVMAVYAL LAFLALYSFF EILYETGNLG KGSYGIWEML
51 GYTALKMPAR AYELMPLAVL IGGLASLSQL AAGSELAVIK ASGMSTKKLL
101 LILSQFGFIF AIAAVALGEW VAPTLSQKAE NIKAAAINGK ISTGNTGLWL
151 KEKTSIINVR GMLPDHTLLG IKIWARNDKN ELAEAVEADS AVLNSDGSWQ
201 LKNIRRSIMG TDKIETSAAA EETWPIAVRR NLMDVLLVKP DQMSVGELTT
251 YIRHLQNNSQ NTQIYAIAWW RKLVYPVAAW VMALVAFAFT PQTTRHGNMG
301 LKLFGGICLG LLFHLAGRLF GFTSQLYGTP PFLAGALPTI AFALLAVWLI
351 RKQEKR*
ORF112ng and ORF112-1 show 94.2% identity in 326 aa overlap:
10 20 30 40 50 60
orf112ng MNLISRYIIRQMAVMAVYALLAFLALYSFFEILYETGNLGKGSYGIWEMLGYTALKMPAR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf112-1 MNLISRYIIRQMAVMAVYALLAFLALYSFFEILYETGNLGKGSYGIWEMLGYTALKMPAR
10 20 30 40 50 60
70 80 90 100 110 120
orf112ng AYELMPLAVLIGGLASLSQLAAGSELAVIKASGMSTKKLLLILSQFGFIFAIAAVALGEW
||||:|||||||||:|||||||||||:||||||||||||||||||||||||||:||||||
orf112-1 AYELIPLAVLIGGLVSLSQLAAGSELTVIKASGMSTKKLLLILSQFGFIFAIATVALGEW
70 80 90 100 110 120
130 140 150 160 170 180
orf112ng VAPTLSQKAENIKAAAINGKISTGNTGLWLKEKTSIINVRGMLPDHTLLGIKIWARNDKN
|||||||||||||||||||||||||||||||||:| |||| |||||||||||||||||||
orf112-1 VAPTLSQKAENIKAAAINGKISTGNTGLWLKEKNSXINVREMLPDHTLLGIKIWARNDKN
130 140 150 160 170 180
190 200 210 220 230 240
orf112ng ELAEAVEADSAVLNSDGSWQLKNIRRSIMGTDKIETSAAAEETWPIAVRRNLMDVLLVKP
||||||||||||||||||||||||||| :| ||:|:| ||||:|||:|:|||||||||||
orf112-1 ELAEAVEADSAVLNSDGSWQLKNIRRSTLGEDKVEVSIAAEENWPISVKRNLMDVLLVKP
190 200 210 220 230 240
250 260 270 280 290 300
orf112ng DQMSVGELTTYIRHLQNNSQNTQIYAIAWWRKLVYPVAAWVMALVAFAFTPQTTRHGNMG
||||||||||||||||||||||:|||||||||||||:|||||||||||||||||||||||
orf112-1 DQMSVGELTTYIRHLQNNSQNTRIYAIAWWRKLVYPAAAWVMALVAFAFTPQTTRHGNMG
250 260 270 280 290 300
310 320 330 340 350
orf112ng LKLFGGICLGLLFHLAGRLFGFTSQLYGTPPFLAGALPTIAFALLAVWLIRKQEKRX
||||||||||||||||||||||||||
orf112-1 LKLFGGICXGLLFHLAGRLFGFTSQL
310 320
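This analysis suggests that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
It will be appreciated that the invention has been described by way of example only, and that modifications may be made whilst remaining within the spirit and scope of the invention.
Table I - PCR primers
ORF | Primer | Sequence | Restriction sites |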
ORF1, ORF2, ORF2-1, ORF4, ORF5, ORF6, ORF7, ORF8, ORF9, ORF10 | Forward, Reverse, Forward, Reverse, Forward, Reverse, Forward, Reverse, Forward, Forward, Reverse, Forward, Reverse, Forward, Reverse, Forward, Reverse, Forward, Reverse, Forward, Reverse | CGCGGATCCGCTAGC-GGACACACTTATTTCGGCCCGCTCGAG-CCAGCGGTAGCCTAATTGCGGATCCCATATG-TTTGATTTCGGTTTGGGCCCGCTCGAG-GACGGCATAACGGCGGCGGATCCCATATG-TTTGATTTCGGTTTGGGCCCGCTCGAG-TGATTTACGGACGCGCAGCGGATCCCATATG-TGCGGAGGTCAAAAAGACCCCGCTCGAG-TTTGGCTGCGCCTTCGGAATTCCATATGGCCATGG-TGGAAGGCGCACAACCCGGGATCC-ATGGAAGGCGCACAACCCCGCTCGAG-GACTGTGCAAAAACGGCGCGGATCCCATATG-ACCCGTCAATCTCTGCACCCGCTCGAG-TGCGCCGAACACTTTCCGCGGATCCGCTAGC-GCGCTGCTTTTTGTTCCCCCGCTCGAG-TTTCAAAATATATTTGCGGAGCGGATCCCATATG-GCTCAACTGCTTCGTACCCCGCTCGAG-AGCAGGCTTTGGCGCCGCGGATCCCATATG-CCGAAGGAAGTCGGAAACCCGCTCGAG-TTTCCGAGGTTTTCGGGGCGGATCCCATATG-GACACAAAAGAAATCCTCCCCGCTCGAG-TAATGGGAAACCTTGTTTT | BamHⅠ-NheⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠNdeⅠ-NcoⅠBamHⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NheⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠ |
ORF11, ORF13, ORF15, ORF17, ORF18, ORF19, ORF20, ORF22, ORF23, ORF24 | Forward, Reverse, Forward, Reverse, Forward, Forward, Reverse, Forward, Forward, Reverse, Forward, Reverse, Forward, Forward, Reverse, Forward, Forward, Reverse, Forward, Forward, Reverse, Forward, Reverse, Forward, Forward | GCGGATCCCATATG-GCGGTCAACCTCTACGCCCGCTCGAG-GGAAACGACTTCGCCCGCGGATCCCATATG-GCTCTGCTTTCCGCGCCCCGCTCGAG-AGGGTGTGTGATAATAAGGGAATTCCATATGGCCATGG-GCGGGACACTGACAGCGGGATCC-TGCGGGACACTGACAGGCCCGCTCGAG-AGGTTGGCCTTGTCTATGGGAATTCCATATGGCCATGG-TTGCCGGCCTGTTCGCGGGATCC-ATTGCCGGCCTGTTCGCCCGCTCGAG-AAGCAGGTTGTACAGCGCGGATCCCATATG-ATTTTGCTGCATTTGGATCCCGCTCGAG-TCTTCCAATTTCTGAAAGCGGAATTCCATATGGCCATGG-TCGCCAGTGTTTTTACCCGGGATCC-TTCGCCAGTGTTTTTACCGCCCGCTCGAG-GGTGTTTTTGAAGCTGCCGGAATTCCATATGGCCATGG-TCGGCGCGGGTATGCGGGATCC-TTCGGCGCGGGTATGCCCGCTCGAG-CGGCGAGCGAGAGCAGGAATTCCATATGGCCATGG-TGATTAAAATCAAAAAAGGTCTCGGGATCC-ATGATTAAAATCAAAAAAGGTCTAAACCCCCGCTCGAG-ATTATGATAGCGGCCCCGCGGATCCCATATG-GATGTTTCTGTTTCAGACCCCGCTCGAG-TTTAAACCGATAGGTAAACGGAATTCCATATGGCCATGG-TGATGCCGGAAATGGTGCGGGATCC-ATGATGCCGGAAATGGTG | BamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠNdeⅠ-NcoⅠBamHⅠXhoⅠNdeⅠ-NcoⅠBamHⅠXhoⅠBamHⅠ-NdeⅠXhoⅠNdeⅠ-NcoⅠBamHⅠXhoⅠNdeⅠ-NcoⅠBamHⅠXhoⅠNdeⅠ-NcoⅠBamHⅠXhoⅠBamHⅠ-NdeⅠXhoⅠNdeⅠ-NcoⅠBamHⅠ |
ORF25, ORF26, ORF27, ORF28, ORF29, ORF32, ORF33, ORF35, ORF37, ORF58 | Reverse, Forward, Reverse, Forward, Reverse, Forward, Forward, Reverse, Forward, Forward, Reverse, Forward, Forward, Reverse, Forward, Reverse, Forward, Reverse, Forward, Forward, Reverse, Forward, Reverse, Forward, Reverse | CCCGCTCGAG-TGTCAGCGTGGCGCAGCGGATCCCATATG-TATCGCAAACTGATTGCCCCGCTCGAG-ATCGATGGAATAGCCGGCGGATCCCATATG-CAGCTGATCGACTATTCCCCGCTCGAG-GACATCGGCGCGTTTTGGAATTCCATATGGCCATGG-AGACCTATTCTGTTTACGGGATCC-CAGACCTATTCTGTTTATTTTAATCCCCGCTCGAG-GGGTTCGATTAAATAACCATGGAATTCCATATGGCCATGG-ACGGCTGTACGTTGATGTCGGGATCC-AACGGCTGTACGTTGATGCCCGCTCGAG-TTTGTCAGAGGAATTCGCGGCGGATCCCATATG-AACGGTTTGGATGCCCGCGCGGATCCGCTAGC-AACGGTTTGGATGCCCGCCCGCTCGAG-TTTGTCTAAGTTCCTGATATGCGCGGATCCCATATG-AATACTCCTCCTTTTGCCCGCTCGAG-GCGTATTTTTTGATGCTTTGGCGGATCCCATATG-ATTGATAGGGATCGTATGCCCGCTCGAG-TTGATCTTTCAAACGGCCGCGGATCCCATATG-TTCAGAGCTCAGCTTCGCGGATCCGCTAGC-TTCAGAGCTCAGCTTCCCGCTCGAG-AAACAGCCATTTGAGCGACGGATCCCATATG-GATGACGTATCGGATTTTCCGCTCGAG-ATAGCCCGCTTTCAGGGCGGATCCGCTAGC-TCCGAACGCGAGTGGATCCGCTCGAG-AGCATTGTCCAAGGGGAC | XhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠNdeⅠ-NcoⅠBamHⅠXhoⅠNdeⅠ-NcoⅠBamHⅠXhoⅠBamHⅠ-NdeⅠBamHⅠ-NheⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠBamHⅠ-NheⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NheⅠXhoⅠ |
ORF65, ORF66, ORF72, ORF73, ORF75, ORF76, ORF79, ORF83, ORF84, ORF85, ORF89 | Forward, Forward, Reverse, Forward, Reverse, Forward, Reverse, Forward, Reverse, Forward, Reverse, Forward, Reverse, Forward, Reverse, Forward, Reverse, Forward, Reverse, Forward, Reverse, Forward, Forward | GGAATTCCATATGGCCATGG-TGCTGTATCTGAATCAAGCGGGATCC-TTGCTGTATCTGAATCAAGGCCCGCTCGAG-CCGCATCGGCAGACAGCGGATCCCATATG-TACGCATTTACCGCCGCCCGCTCGAG-TGGATTTTGCAGAGATGGCGCGGATCCCATATG-AATGCAGTAAAAATATCTGACCCGCTCGAG-GCCTGAGACCTTTGCAAGCGGATCCCATATG-AGATTTTTCGGTATCGGCCCGCTCGAG-TTCATCTTTTTCATGTTCGGCGGATCCCATATG-TCTGTCTTTCAAACGGCCCCGCTCGAG-TTTGTTTTTGCAAGACAGGATCAGCTAGCCATATG-AAACAGAAAAAAACCGCCGGGATCC-TTACGGTTTGACACCGTTCGCGGATCCCATATG-GTTTCCGCCGCCGCCCGCTCGAG-GTGCTGATGCGCTTCGGCGGATCCCATATG-AAAACCCTGCTGCTGCCCCGCTCGAG-GCCGCCTTTGCGGCGCGGATCCCATATG-GCAGAGATCTGTTTGCCCGCTCGAG-GTTTGCCGATCCGACCACGCGGATCCCATATG-GCGGTTTGGGGCGGACCCGCTCGAG-TCGGCGCGGCGGGCGGAATTCCATATGGCCATGG-CCATACCTTCTTATCACGGGATCC-GCCATACCTTCTTATCAGAG | NdeⅠ-NcoⅠBamHⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠNheⅠ-NdeⅠBamHⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠNdeⅠ-NcoⅠBamHⅠ |
ORF97, ORF98, ORF100, ORF101, ORF102, ORF103, ORF104, ORF105, ORF106, ORF109, ORF110 | Reverse, Forward, Reverse, Forward, Reverse, Forward, Reverse, Forward, Reverse, Forward, Reverse, Forward, Reverse, Forward, Reverse, Forward, Reverse, Forward, Reverse, Forward, Reverse, Forward, Reverse | CCCGCTCGAG-TTTTTTGCGATTAGAAAAAGCGCGGATCCCATATG-CATCCTGCCAGCGAACCCCGCTCGAG-TTCGCCTACGGTTTTTTGGCGGATCCCATATG-ACGGTAACTGCGGCCCGCTCGAG-TTGTTGTTCGGGCAAATCGCGGATCCCATATG-TCGGGCATTTACACCGCCCGCTCGAG-ACGGGTTTCGGCGGAAGCGGATCCCATATG-ATTTATCAAAGAAACCTCCCCGCTCGAG-TTTTCCGCCTTTCAATGTGCGGATCCCATATG-GCAGGGCTGTTTTACCCCCGCTCGAG-AAACGGTTTGAACACGACGCGGATCCCATATG-AACCACGACATCACCCCGCTCGAG-CAGCCACAGGACGGCGCGGATCCCATATG-ACGTGGGGAACGCCCCGCTCGAG-GCGGCGTTTGAACGGCGCGGATCCCATATG-ACCAAATTTCAAACCCCTCCCCGCTCGAG-TAAACGAATGCCGTCCAGGCGGATCCCATATG-AGGATAACCGACGGCGCCCGCTCGAG-TTTGTTCCCGATGATGTTGCGGATCCCATATG-GAAGATTTATATATAATACTCGCCCGCTCGAG-ATCAGCTTCGAACCGAAGAAGAATTC-ATGAGTAAATCCCGTAGATCTCCCAACTGCAG-GGAAAACCACATCCGCACTCTGCC | XhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠEcoRⅠPstⅠ |
ORF111ORF113ORF115ORF119ORF120ORF121ORF122ORF125ORF126ORF127ORF128ORF129 | 正向反向正向反向正向反向正向反向正向反向正向反向正向反向正向反向正向反向正向反向正向反向正向反向 | AAAGAATTC-GCACCGCAAAAGGCAAAAACCGCAAAACTGCAG-TCTGCGCGTTTTCGGGCAGGGTGGAAAGAATTC-ATGAACAAAACCCTCTATCGTGTGATTTTCAACCGAAACTGCAG-TTACGAATGCCTGCTTGCTCGACCGTACTGAAAGAATTC-TTGCTTGTGCAAACAGAAAAAGACGGAAAAAAGTCGAC-CTATTTTTTAGGGGCTTTTGCTTGTTTGAAAAGCCTGCCAAAGAATTC-TACAACATGTATCAGGAAAACCAATACCGAAACTGCAG-TTATGAAAACAGGCGCAGGGCGGTTTTGCCAAAGAATTC-GCAAGGCTACCCCAATCCGCCGTGAAACTGCAG-CGGTTTGGCTGCCTGGCCGTTGATAAAGAATTC-GCCTTGGTCTGGCTGGTTTTCGCAAACTGCAG-TCATCCCCCACCCCACCTCGGCCATCCATCAAAAAAGTCGAC-ATGTCTTACCGCCCAAGCAGTTCTCCAAACTGCAG-TCAGGAACACAAACGATGACGAATATCCGTATCAAAGAATTC-GCGCTGTTTTTTGCGGCGGCGTATAAACTGCAG-CGCCGTTTCAAGACGAAAAAGTCGAAGAATTC-GCGGAAACGGTCGAAGAACTGCAG-TTAATCTTGTCTTCCGATATACAAGAATTC-ATGACTGATAATCGGGGGTTTACGAAAAAGTCGAC-CTTAAGTAACTTGCAGTCCTTATCAAGAATTC-ATGCAAGCTGTCCGCTACAGGCCAACTGCAG-CTATTGCAATGCGCCGCCCCCGGAATGTTTGAGCAGGCGAAGAATTC-ATGGATTTTCGTTTTGACATTATTTACGAATACCGAACTGCAG-TTATTTTTTGATGAAATTTTGGGGCGG | EcoRⅠPstⅠEcoRⅠPstⅠEcoRⅠSalⅠEcoRⅠPstⅠEcoRⅠPstⅠEcoRⅠPstⅠSalⅠPstⅠEcoRⅠPstⅠEcoRⅠPstⅠEcoRⅠSalⅠEcoRⅠPstⅠEcoRⅠPstⅠ |
ORF130ORF131ORF132ORF133ORF134ORF135ORF136ORF137ORF138ORF139ORF140ORF141 | 正向反向正向反向正向反向正向反向正向反向正向反向正向反向正向反向正向反向正向反向正向反向正向 | AAAGAATTC-GCAGTACTTGCCATCTCGGTGCGAAACTGCAGG-CTCCGGATCGTCTGTAAACGCATTGCGGATCCCATATG-GAAATTCGGGCAATAAAATCCCGCTCGAG-CCAGCGGACGCGTTCGCGGATCCCATATG-AAAGAAGCGGGGTTTGCCCGCTCGAG-CCAATCTGCCAGCCGTCGCGGATCCCATATG-GAAGATGCAGGGCGCGCCCGCTCGAG-AAACTTGTAGCTCATCGTGCGGATCCCATATG-TCTGTGCAAGCAGTATTGCCCGCTCGAG-ATCCTGTGCCAATGCGGCGGATCCCATATG-CCGTCTGAAAAAGCTTTCCCGCTCGAG-AAATACCGCTGAGGATGGGCGGATCCGCTAGC-ATGAAGCGGCGTATAGCCCCCGCTCGAG-TTCCGAATATTTGGAACTTTTCGCGGATCCCATATG-GGCACGGCGGGAAATACCCGCTCGAG-ATAACGGTATGCCGCCGCGGATCCCATATG-TTTCGTTTACAATTCAGGCCCCGCTCGAG-CGGCGTTTTATAGCGGGCGGATCCCATATG-GCTTTTTTGGCGGTAATGCCCGCTCGAG-TAACGTTTCCGTGCGTTTGCGGATCCCATATG-TTGCCCACAGGCAGCCCCGCTCGAG-GACGATGGCAAACAGCGCGGATCCCATATG-CCGTCTGAAGCAGTCT | EcoRⅠPstⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NheⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠ |
ORF142ORF143ORF144ORF147 | 反向正向反向正向反向正向反向正向反向 | CCCGCTCGAG-ATCTGTTGTTTTTAAAATATTGCGGATCCCATATG-GATAATTCTGGTAGTGAAGCCCGCTCGAG-AAACGTATAGCCTACCTGCGGATCCCATATG-GATACCGCTTTGAACCTCCCGCTCGAG-AATGGCTTCCGCAATATGGCGGATCCCATATG-ACCTTTTTACAACGTTTGCCCCGCTCGAG-AGATTGTTGTTGTTTTTTCGGCGGATCCCATATG-TCTGTCTTTCAAACGGCCCCGCTCGAG-TTTGTTTTTGCAAGACAG | XhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠBamHⅠ-NdeⅠXhoⅠ |
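NB:
- Restriction sites are underlined.
- For ORF110-130, when the ORF itself carries an EcoRI site (e.g. ORF122), a SalI site is used in the forward primer. Likewise, when the ORF carries a PstI site (e.g. ORF115 and ORF127), a SalI site is used in the reverse primer.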
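The site-substitution rule in the note above can be written as a small check. The Python sketch below is illustrative only and is not part of the described procedure. The function name `choose_cloning_sites` and the example sequences are hypothetical. It assumes the standard recognition sequences EcoRI = GAATTC, PstI = CTGCAG and SalI = GTCGAC, and it does not test whether the ORF also contains a SalI site.

```python
from typing import Tuple

# Standard recognition sequences (assumed; all three are palindromic,
# so scanning one strand of the ORF is sufficient).
ECORI_SITE = "GAATTC"
PSTI_SITE = "CTGCAG"


def choose_cloning_sites(orf_sequence: str) -> Tuple[str, str]:
    """Return (forward_primer_site, reverse_primer_site) for one ORF.

    Hypothetical helper: the default pair is EcoRI (forward) and PstI
    (reverse); SalI replaces whichever site already occurs inside the ORF.
    """
    seq = orf_sequence.upper()
    forward_site = "SalI" if ECORI_SITE in seq else "EcoRI"
    reverse_site = "SalI" if PSTI_SITE in seq else "PstI"
    return forward_site, reverse_site


if __name__ == "__main__":
    # Made-up sequences, used only to exercise the three cases.
    print(choose_cloning_sites("ATGAAACCCGGG"))   # no internal site
    print(choose_cloning_sites("ATGGAATTCAAA"))   # internal EcoRI site
    print(choose_cloning_sites("ATGCTGCAGAAA"))   # internal PstI site
```

A fuller check would also confirm that the substituted SalI site (GTCGAC) is absent from the ORF before using it.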
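Table II - Summary of cloning, expression and purification
ORF | PCR/cloning | His-fusion expression | GST-fusion expression | Purification |
---|---|---|---|---|
orf1 | + | + | + | His-fusion |
orf2 | + | + | + | GST-fusion |
orf2.1 | + | not determined | + | GST-fusion |
orf4 | + | + | + | His-fusion |
orf5 | + | not determined | + | GST-fusion |
orf6 | + | + | + | GST-fusion |
orf7 | + | + | + | GST-fusion |
orf8 | + | not determined | not determined | |
orf9 | + | + | + | GST-fusion |
orf10 | + | not determined | not determined | |
orf11 | + | not determined | not determined | |
orf13 | + | not determined | + | GST-fusion |
orf15 | + | + | + | GST-fusion |
orf17 | + | not determined | not determined | |
orf18 | + | not determined | not determined | |
orf19 | + | not determined | not determined | |
orf20 | + | not determined | not determined | |
orf22 | + | + | + | GST-fusion |
orf23 | + | + | + | His-fusion |
orf24 | + | not determined | not determined | |
orf25 | + | + | + | His-fusion |
orf26 | + | not determined | not determined | |
orf27 | + | + | + | GST-fusion |
orf28 | + | + | + | GST-fusion |
orf29 | + | not determined | not determined | |
orf32 | + | + | + | His-fusion |
orf33 | + | not determined | not determined | |
orf35 | + | not determined | not determined | |
orf37 | + | + | + | GST-fusion |
orf58 | + | not determined | not determined | |
orf65 | + | not determined | not determined | |
orf66 | + | not determined | not determined | |
orf72 | + | + | not determined | His-fusion |
orf73 | + | not determined | + | not determined |
orf75 | + | not determined | not determined | |
orf76 | + | + | not determined | His-fusion |
orf79 | + | + | not determined | His-fusion |
orf83 | + | not determined | + | not determined |
orf84 | + | not determined | not determined | |
orf85 | + | not determined | + | GST-fusion |
orf89 | + | not determined | + | GST-fusion |
orf97 | + | + | + | GST-fusion |
orf98 | + | not determined | not determined | |
orf100 | + | not determined | not determined | |
orf101 | + | not determined | not determined | |
orf102 | + | not determined | not determined | |
orf103 | + | not determined | not determined | |
orf104 | + | not determined | not determined | |
orf105 | + | not determined | not determined | |
orf106 | + | + | + | His-fusion |
orf109 | + | not determined | not determined | |
orf110 | + | not determined | not determined | |
orf111 | + | + | not determined | His-fusion |
orf113 | + | + | not determined | His-fusion |
orf115 | not determined | not determined | not determined | |
orf119 | + | + | not determined | His-fusion |
orf120 | + | + | not determined | His-fusion |
orf121 | + | not determined | not determined | |
orf122 | + | + | not determined | His-fusion |
orf125 | + | + | not determined | His-fusion |
orf126 | + | + | not determined | His-fusion |
orf127 | + | + | not determined | His-fusion |
orf128 | + | not determined | not determined | |
orf129 | + | + | not determined | His-fusion |
orf130 | + | not determined | not determined | |
orf131 | + | + | + | not determined |
orf132 | + | + | + | His-fusion |
orf133 | + | not determined | + | GST-fusion |
orf134 | + | not determined | not determined | |
orf135 | + | not determined | not determined | |
orf136 | + | not determined | not determined | |
orf137 | + | not determined | + | GST-fusion |
orf138 | + | not determined | + | GST-fusion |
orf139 | + | not determined | not determined | |
orf140 | + | not determined | not determined | |
orf141 | + | not determined | not determined | |
orf142 | + | not determined | not determined | |
orf143 | + | not determined | not determined | |
orf144 | + | not determined | + | not determined |
orf147 | + | not determined | not determined | |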
Claims (17)
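1. A protein comprising an amino acid sequence selected from SEQ ID 2, 4, 6 and 8.
2. A nucleic acid molecule which encodes the protein of claim 1.
3. The nucleic acid molecule according to claim 2, comprising a nucleotide sequence selected from SEQ ID 1, 3, 5 and 7.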
4. A protein comprising an amino acid sequence selected from SEQ ID 2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54,56,58,60,62,64,66,68,70,72,74,76,78,80,82,84,86,88,90,92,94,96,98,100,102,104,106,108,110,112,114,116,118,120,122,124,126,128,130,132,134,136,138,140,142,144,146,148,150,152,154,156,158,160,162,164,166,168,170,172,174,176,178,180,182,184,186,188,190,192,194,196,198,200,202,204,206,208,210,212,214,216,218,220,222,224,226,228,230,232,234,236,238,240,242,244,246,248,250,252,254,256,258,260,262,264,266,268,270,272,274,276,278,280,282,284,286,288,290,292,294,296,298,300,302,304,306,308,310,312,314,316,318,320,322,324,326,328,330,332,334,336,338,340,342,344,346,348,350,352,354,356,358,360,362,364,366,368,370,372,374,376,378,380,382,384,386,388,390,392,394,396,398,400,402,404,406,408,410,412,414,416,418,420,422,424,426,428,430,432,434,436,438,440,442,444,446,448,450,452,454,456,458,460,462,464,466,468,470,472,474,476,478,480,482,484,486,488,490,492,494,496,498,500,502,504,506,508,510,512,514,516,518,520,522,524,526,528,530,532,534,536,538,540,542,544,546,548,550,552,554,556,558,560,562,564,566,568,570,572,574,576,578,580,582,584,586,588,590,592,594,596,598,600,602,604,606,608,610,612,614,616,618,620,622,624,626,628,630,632,634,636,638,640,642,644,646,648,650,652,654,656,658,660,662,664,666,668,670,672,674,676,678,680,682,684,686,688,690,692,694,696,698,700,702,704,706,708,710,712,714,716,718,720,722,724,726,728,730,732,734,736,738,740,742,744,746,748,750,752,754,756,758,760,762,764,766,768,770,772,774,776,778,780,782,784,786,788,790,792,794,796,798,800,802,804,806,808,810,812,814,816,818,820,822,824,826,828,830,832,834,836,838,840,842,844,846,848,850,852,854,856,858,860,862,864,866,868,870,872,874,876,878,880,882,884,886,888,890 and 892.
5. A protein having 50% or greater sequence identity to the protein of claim 4.
6. A protein comprising a fragment of an amino acid sequence selected from SEQ ID 2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54,56,58,60,62,64,66,68,70,72,74,76,78,80,82,84,86,88,90,92,94,96,98,100,102,104,106,108,110,112,114,116,118,120,122,124,126,128,130,132,134,136,138,140,142,144,146,148,150,152,154,156,158,160,162,164,166,168,170,172,174,176,178,180,182,184,186,188,190,192,194,196,198,200,202,204,206,208,210,212,214,216,218,220,222,224,226,228,230,232,234,236,238,240,242,244,246,248,250,252,254,256,258,260,262,264,266,268,270,272,274,276,278,280,282,284,286,288,290,292,294,296,298,300,302,304,306,308,310,312,314,316,318,320,322,324,326,328,330,332,334,336,338,340,342,344,346,348,350,352,354,356,358,360,362,364,366,368,370,372,374,376,378,380,382,384,386,388,390,392,394,396,398,400,402,404,406,408,410,412,414,416,418,420,422,424,426,428,430,432,434,436,438,440,442,444,446,448,450,452,454,456,458,460,462,464,466,468,470,472,474,476,478,480,482,484,486,488,490,492,494,496,498,500,502,504,506,508,510,512,514,516,518,520,522,524,526,528,530,532,534,536,538,540,542,544,546,548,550,552,554,556,558,560,562,564,566,568,570,572,574,576,578,580,582,584,586,588,590,592,594,596,598,600,602,604,606,608,610,612,614,616,618,620,622,624,626,628,630,632,634,636,638,640,642,644,646,648,650,652,654,656,658,660,662,664,666,668,670,672,674,676,678,680,682,684,686,688,690,692,694,696,698,700,702,704,706,708,710,712,714,716,718,720,722,724,726,728,730,732,734,736,738,740,742,744,746,748,750,752,754,756,758,760,762,764,766,768,770,772,774,776,778,780,782,784,786,788,790,792,794,796,798,800,802,804,806,808,810,812,814,816,818,820,822,824,826,828,830,832,834,836,838,840,842,844,846,848,850,852,854,856,858,860,862,864,866,868,870,872,874,876,878,880,882,884,886,888,890 and 892.
7. An antibody which binds to the protein of any one of claims 4 to 6.
8. A nucleic acid molecule which encodes the protein of any one of claims 4 to 6.
9. The nucleic acid molecule according to claim 8, comprising a nucleotide sequence selected from SEQ ID 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47,49,51,53,55,57,59,61,63,65,67,69,71,73,75,77,79,81,83,85,87,89,91,93,95,97,99,101,103,105,107,109,111,113,115,117,119,121,123,125,127,129,131,133,135,137,139,141,143,145,147,149,151,153,155,157,159,161,163,165,167,169,171,173,175,177,179,181,183,185,187,189,191,193,195,197,199,201,203,205,207,209,211,213,215,217,219,221,223,225,227,229,231,233,235,237,239,241,243,245,247,249,251,253,255,257,259,261,263,265,267,269,271,273,275,277,279,281,283,285,287,289,291,293,295,297,299,301,303,305,307,309,311,313,315,317,319,321,323,325,327,329,331,333,335,337,339,341,343,345,347,349,351,353,355,357,359,361,363,365,367,369,371,373,375,377,379,381,383,385,387,389,391,393,395,397,399,401,403,405,407,409,411,413,415,417,419,421,423,425,427,429,431,433,435,437,439,441,443,445,447,449,451,453,455,457,459,461,463,465,467,469,471,473,475,477,479,481,483,485,487,489,491,493,495,497,499,501,503,505,507,509,511,513,515,517,519,521,523,525,527,529,531,533,535,537,539,541,543,545,547,549,551,553,555,557,559,561,563,565,567,569,571,573,575,577,579,581,583,585,587,589,591,593,595,597,599,601,603,605,607,609,611,613,615,617,619,621,623,625,627,629,631,633,635,637,639,641,643,645,647,649,651,653,655,657,659,661,663,665,667,669,671,673,675,677,679,681,683,685,687,689,691,693,695,697,699,701,703,705,707,709,711,713,715,717,719,721,723,725,727,729,731,733,735,737,739,741,743,745,747,749,751,753,755,757,759,761,763,765,767,769,771,773,775,777,779,781,783,785,787,789,791,793,795,797,799,801,803,805,807,809,811,813,815,817,819,821,823,825,827,829,831,833,835,837,839,841,843,845,847,849,851,853,855,857,859,861,863,865,867,869,871,873,875,877,879,881,883,885,887,889 and 891.
10. A nucleic acid molecule comprising a fragment of a nucleotide sequence selected from SEQ ID 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47,49,51,53,55,57,59,61,63,65,67,69,71,73,75,77,79,81,83,85,87,89,91,93,95,97,99,101,103,105,107,109,111,113,115,117,119,121,123,125,127,129,131,133,135,137,139,141,143,145,147,149,151,153,155,157,159,161,163,165,167,169,171,173,175,177,179,181,183,185,187,189,191,193,195,197,199,201,203,205,207,209,211,213,215,217,219,221,223,225,227,229,231,233,235,237,239,241,243,245,247,249,251,253,255,257,259,261,263,265,267,269,271,273,275,277,279,281,283,285,287,289,291,293,295,297,299,301,303,305,307,309,311,313,315,317,319,321,323,325,327,329,331,333,335,337,339,341,343,345,347,349,351,353,355,357,359,361,363,365,367,369,371,373,375,377,379,381,383,385,387,389,391,393,395,397,399,401,403,405,407,409,411,413,415,417,419,421,423,425,427,429,431,433,435,437,439,441,443,445,447,449,451,453,455,457,459,461,463,465,467,469,471,473,475,477,479,481,483,485,487,489,491,493,495,497,499,501,503,505,507,509,511,513,515,517,519,521,523,525,527,529,531,533,535,537,539,541,543,545,547,549,551,553,555,557,559,561,563,565,567,569,571,573,575,577,579,581,583,585,587,589,591,593,595,597,599,601,603,605,607,609,611,613,615,617,619,621,623,625,627,629,631,633,635,637,639,641,643,645,647,649,651,653,655,657,659,661,663,665,667,669,671,673,675,677,679,681,683,685,687,689,691,693,695,697,699,701,703,705,707,709,711,713,715,717,719,721,723,725,727,729,731,733,735,737,739,741,743,745,747,749,751,753,755,757,759,761,763,765,767,769,771,773,775,777,779,781,783,785,787,789,791,793,795,797,799,801,803,805,807,809,811,813,815,817,819,821,823,825,827,829,831,833,835,837,839,841,843,845,847,849,851,853,855,857,859,861,863,865,867,869,871,873,875,877,879,881,883,885,887,889 and 891.
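11. A nucleic acid molecule comprising a nucleotide sequence complementary to the nucleic acid molecule of any one of claims 8 to 10.
12. A nucleic acid molecule comprising a nucleotide sequence having 50% or greater sequence identity to the nucleic acid molecule of any one of claims 8 to 11.
13. A nucleic acid molecule which can hybridise to the nucleic acid molecule of any one of claims 8 to 12 under high stringency conditions.
14. A composition comprising the protein, nucleic acid molecule or antibody of any preceding claim.
15. The composition according to claim 14, which is a vaccine composition or a diagnostic composition.
16. Use of the composition of claim 14 or 15 as a medicament.
17. Use of the composition of claim 14 in the manufacture of a medicament for the treatment or prevention of infection caused by Neisserial bacteria.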
Applications Claiming Priority (14)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB9723516.2 | 1997-11-06 | ||
GBGB9723516.2A GB9723516D0 (zh) | 1997-11-06 | 1997-11-06 | |
GBGB9724190.5A GB9724190D0 (zh) | 1997-11-14 | 1997-11-14 | |
GB9724190.5 | 1997-11-14 | ||
GB9724386.9 | 1997-11-18 | ||
GBGB9724386.9A GB9724386D0 (zh) | 1997-11-18 | 1997-11-18 | |
GBGB9725158.1A GB9725158D0 (zh) | 1997-11-27 | 1997-11-27 | |
GB9725158.1 | 1997-11-27 | ||
GB9726147.3 | 1997-12-10 | ||
GBGB9726147.3A GB9726147D0 (en) | 1997-12-10 | 1997-12-10 | Antigens |
GBGB9800759.4A GB9800759D0 (zh) | 1998-01-14 | 1998-01-14 | |
GB9800759.4 | 1998-01-14 | ||
GBGB9819016.8A GB9819016D0 (zh) | 1998-09-01 | 1998-09-01 | |
GB9819016.8 | 1998-09-01 |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2005101133957A Division CN1824675B (zh) | 1997-11-06 | 1998-10-09 | Neisserial antigens |
Publications (2)
Publication Number | Publication Date |
---|---|
CN1286727A (zh) | 2001-03-07 |
CN1263854C (zh) | 2006-07-12 |
Family
ID=27562941
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB988128446A Expired - Fee Related CN1263854C (zh) | 1997-11-06 | 1998-10-09 | Neisserial antigens |
Country Status (11)
Country | Link |
---|---|
EP (3) | EP2278006A3 (zh) |
JP (3) | JP4472866B2 (zh) |
CN (1) | CN1263854C (zh) |
AT (1) | ATE476508T1 (zh) |
AU (1) | AU9363798A (zh) |
BR (1) | BR9813930A (zh) |
CA (2) | CA2308606A1 (zh) |
CY (1) | CY1110880T1 (zh) |
DE (1) | DE69841807D1 (zh) |
HK (2) | HK1027126A1 (zh) |
WO (1) | WO1999024578A2 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN100354297C (zh) * | 2001-10-03 | 2007-12-12 | Chiron Corporation | Adjuvanted meningococcal compositions |
Families Citing this family (126)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2324093A (en) | 1996-01-04 | 1998-10-14 | Rican Limited | Helicobacter pylori bacterioferritin |
CA2266656A1 (en) | 1996-09-17 | 1998-03-26 | Chiron Corporation | Compositions and methods for treating intracellular diseases |
US6914131B1 (en) | 1998-10-09 | 2005-07-05 | Chiron S.R.L. | Neisserial antigens |
AUPP083997A0 (en) | 1997-12-10 | 1998-01-08 | Csl Limited | Porphyromonas gingivalis nucleotides |
US8129500B2 (en) | 1997-12-10 | 2012-03-06 | Csl Limited | Porphyromonas gingivalis polypeptides and nucleotides |
GB9808866D0 (en) * | 1998-04-24 | 1998-06-24 | Smithkline Beecham Biolog | Novel compounds |
AU761780B2 (en) * | 1998-05-01 | 2003-06-12 | Glaxosmithkline Biologicals Sa | Neisseria meningitidis antigens and compositions |
GB9814902D0 (en) * | 1998-07-10 | 1998-09-09 | Univ Nottingham | Screening of neisserial vaccine candidates against pathogenic neisseria |
GB9818004D0 (en) * | 1998-08-18 | 1998-10-14 | Smithkline Beecham Biolog | Novel compounds |
US6610306B2 (en) | 1998-10-22 | 2003-08-26 | The University Of Montana | OMP85 protein of neisseria meningitidis, compositions containing the same and methods of use thereof |
US10967045B2 (en) | 1998-11-02 | 2021-04-06 | Secretary of State for Health and Social Care | Multicomponent meningococcal vaccine |
JP2004537956A (ja) | 1999-01-22 | 2004-12-24 | グラクソスミスクライン バイオロジカルズ ソシエテ アノニム | 新規化合物 |
DK1163343T3 (da) | 1999-03-12 | 2010-04-19 | Glaxosmithkline Biolog Sa | Neisseria meningitidis antigene polypeptider, tilsvarende polynukleotider og beskyttende antistoffer |
EP1228217B1 (en) | 1999-04-30 | 2012-11-21 | Novartis Vaccines and Diagnostics S.r.l. | Conserved neisserial antigens |
MXPA01011867A (es) * | 1999-05-19 | 2002-05-06 | Chiron Spa | Composiciones neisseriales de combinacion. |
GB9911683D0 (en) * | 1999-05-19 | 1999-07-21 | Chiron Spa | Antigenic peptides |
GB9918003D0 (en) * | 1999-07-30 | 1999-09-29 | Smithkline Beecham Biolog | Novel compounds |
EP1741784B1 (en) | 1999-11-29 | 2010-03-10 | Novartis Vaccines and Diagnostics S.r.l. | 85kDa neisserial antigen |
GB9928196D0 (en) | 1999-11-29 | 2000-01-26 | Chiron Spa | Combinations of B, C and other antigens |
EP2275129A3 (en) | 2000-01-17 | 2013-11-06 | Novartis Vaccines and Diagnostics S.r.l. | Outer membrane vesicle (OMV) vaccine comprising N. meningitidis serogroup B outer membrane proteins |
HUP0300696A3 (en) | 2000-01-25 | 2004-10-28 | Univ Queensland Brisbane | Proteins comprising conserved regions of neisseria meningitidis surface antigen nhha |
JP4763210B2 (ja) | 2000-02-28 | 2011-08-31 | ノバルティス ヴァクシンズ アンド ダイアグノスティクス エスアールエル | ナイセリアのタンパク質の異種発現 |
GB0011108D0 (en) * | 2000-05-08 | 2000-06-28 | Microscience Ltd | Virulence gene and protein and their use |
EP1322328B1 (en) | 2000-07-27 | 2014-08-20 | Children's Hospital & Research Center at Oakland | Vaccines for broad spectrum protection against diseases caused by neisseria meningitidis |
US6830898B2 (en) * | 2000-08-24 | 2004-12-14 | Omnigene Bioproducts, Inc. | Microorganisms and assays for the identification of antibiotics |
PT2088197T (pt) | 2000-10-02 | 2016-08-01 | Id Biomedical Corp Quebec | Antigénios de haemophilus influenzae e fragmentos de adn correspondentes |
MX357775B (es) | 2000-10-27 | 2018-07-20 | J Craig Venter Inst Inc | Acidos nucleicos y proteinas de los grupos a y b de estreptococos. |
US7261901B2 (en) | 2001-01-31 | 2007-08-28 | University Of Iowa Research Foundation | Vaccine and compositions for the prevention and treatment of neisserial infections |
WO2002060936A2 (en) * | 2001-01-31 | 2002-08-08 | University Of Iowa Research Foundation | Vaccine and compositions for the prevention and treatment of neisserial infections |
GB0107658D0 (en) | 2001-03-27 | 2001-05-16 | Chiron Spa | Streptococcus pneumoniae |
GB0107661D0 (en) | 2001-03-27 | 2001-05-16 | Chiron Spa | Staphylococcus aureus |
GB0109289D0 (en) * | 2001-04-12 | 2001-05-30 | Glaxosmithkline Biolog Sa | Novel compounds |
EP1392831B1 (en) | 2001-05-15 | 2008-11-12 | ID Biomedical Corporation | Moraxella(branhamella) catarrhalis antigens |
EP1399183B1 (en) | 2001-05-31 | 2010-06-30 | Novartis Vaccines and Diagnostics, Inc. | Chimeric alphavirus replicon particles |
GB0115176D0 (en) | 2001-06-20 | 2001-08-15 | Chiron Spa | Capular polysaccharide solubilisation and combination vaccines |
GB0118249D0 (en) | 2001-07-26 | 2001-09-19 | Chiron Spa | Histidine vaccines |
GB0121591D0 (en) * | 2001-09-06 | 2001-10-24 | Chiron Spa | Hybrid and tandem expression of neisserial proteins |
ATE469915T1 (de) | 2001-07-27 | 2010-06-15 | Novartis Vaccines & Diagnostic | Antikörper gegen das meningokokken adhäsin app |
AR045702A1 (es) | 2001-10-03 | 2005-11-09 | Chiron Corp | Composiciones de adyuvantes. |
MX339524B (es) | 2001-10-11 | 2016-05-30 | Wyeth Corp | Composiciones inmunogenicas novedosas para la prevencion y tratamiento de enfermedad meningococica. |
GB0129007D0 (en) * | 2001-12-04 | 2002-01-23 | Chiron Spa | Adjuvanted antigenic meningococcal compositions |
ATE406912T1 (de) | 2001-12-12 | 2008-09-15 | Novartis Vaccines & Diagnostic | Immunisierung gegen chlamydia tracheomatis |
US7501134B2 (en) | 2002-02-20 | 2009-03-10 | Novartis Vaccines And Diagnostics, Inc. | Microparticles with adsorbed polypeptide-containing molecules |
US7785608B2 (en) | 2002-08-30 | 2010-08-31 | Wyeth Holdings Corporation | Immunogenic compositions for the prevention and treatment of meningococcal disease |
GB0220194D0 (en) | 2002-08-30 | 2002-10-09 | Chiron Spa | Improved vesicles |
PT2353608T (pt) | 2002-10-11 | 2020-03-11 | Novartis Vaccines And Diagnostics S R L | Vacinas de polipéptidos para protecção alargada contra linhagens meningocócicas hipervirulentas |
AU2003288660A1 (en) | 2002-11-15 | 2004-06-15 | Chiron Srl | Unexpected surface proteins in neisseria meningitidis |
GB0227346D0 (en) | 2002-11-22 | 2002-12-31 | Chiron Spa | 741 |
EP1585542B1 (en) | 2002-12-27 | 2012-06-13 | Novartis Vaccines and Diagnostics, Inc. | Immunogenic compositions containing phospholipid |
DK1587537T3 (da) | 2003-01-30 | 2012-07-16 | Novartis Ag | Injicerbare vacciner mod multiple meningococ-serogrupper |
WO2004078949A2 (en) * | 2003-03-06 | 2004-09-16 | Children's Hospital, Inc. | Genes of an otitis media isolate of nontypeable haemophilus influenzae |
EP1608369B1 (en) | 2003-03-28 | 2013-06-26 | Novartis Vaccines and Diagnostics, Inc. | Use of organic compounds for immunopotentiation |
GB0308198D0 (en) | 2003-04-09 | 2003-05-14 | Chiron Srl | ADP-ribosylating bacterial toxin |
US7731967B2 (en) | 2003-04-30 | 2010-06-08 | Novartis Vaccines And Diagnostics, Inc. | Compositions for inducing immune responses |
ATE437633T2 (de) | 2003-06-02 | 2009-08-15 | Novartis Vaccines & Diagnostic | Immunogene zusammensetzungen auf basis von biologisch abbaubaren mikroteilchen enthaltend ein diphtherie- und ein tetanustoxoid |
SI1961426T1 (sl) | 2003-10-02 | 2011-10-28 | Novartis Ag | Kombinirana cepiva proti meningitisu |
GB0323103D0 (en) | 2003-10-02 | 2003-11-05 | Chiron Srl | De-acetylated saccharides |
GB0408977D0 (en) | 2004-04-22 | 2004-05-26 | Chiron Srl | Immunising against meningococcal serogroup Y using proteins |
US20110104186A1 (en) | 2004-06-24 | 2011-05-05 | Nicholas Valiante | Small molecule immunopotentiators and assays for their detection |
US20060165716A1 (en) | 2004-07-29 | 2006-07-27 | Telford John L | Immunogenic compositions for gram positive bacteria such as streptococcus agalactiae |
RU2432962C2 (ru) | 2005-01-27 | 2011-11-10 | Чилдрен'З Хоспитал Энд Рисерч Сентер Эт Окленд | Вакцины с использованием везикул на основе gna 1870 широкого спектра действия для профилактики заболеваний, вызываемых neisseria meningitidis |
GB0502095D0 (en) | 2005-02-01 | 2005-03-09 | Chiron Srl | Conjugation of streptococcal capsular saccharides |
GB0502096D0 (en) | 2005-02-01 | 2005-03-09 | Chiron Srl | Purification of streptococcal capsular polysaccharide |
WO2006089264A2 (en) | 2005-02-18 | 2006-08-24 | Novartis Vaccines And Diagnostics Inc. | Proteins and nucleic acids from meningitis/sepsis-associated escherichia coli |
CN101180312A (zh) | 2005-02-18 | 2008-05-14 | Novartis Vaccines and Diagnostics, Inc. | Immunogens from uropathogenic Escherichia coli |
EP2070945A1 (en) | 2005-06-16 | 2009-06-17 | Nationwide Children's Hospital, Inc. | Genes of an otitis media isolate of nontypeable haemophilus influenzae |
US20110223197A1 (en) | 2005-10-18 | 2011-09-15 | Novartis Vaccines And Diagnostics Inc. | Mucosal and Systemic Immunization with Alphavirus Replicon Particles |
JP5215865B2 (ja) | 2005-11-22 | 2013-06-19 | ノバルティス ヴァクシンズ アンド ダイアグノスティクス インコーポレイテッド | ノロウイルス抗原およびサポウイルス抗原 |
GB0524066D0 (en) | 2005-11-25 | 2006-01-04 | Chiron Srl | 741 ii |
AU2007281934B2 (en) | 2006-01-18 | 2012-11-15 | University Of Chicago | Compositions and methods related to Staphylococcal bacterium proteins |
CA2646539A1 (en) | 2006-03-23 | 2007-09-27 | Novartis Ag | Imidazoquinoxaline compounds as immunomodulators |
WO2007110602A1 (en) * | 2006-03-28 | 2007-10-04 | The University Of Nottingham | Immunogenic compositions |
US8039007B2 (en) | 2006-06-29 | 2011-10-18 | J. Craig Venter Institute, Inc. | Polypeptides from Neisseria meningitidis |
US20100166788A1 (en) | 2006-08-16 | 2010-07-01 | Novartis Vaccines And Diagnostics | Immunogens from uropathogenic escherichia coli |
AR064642A1 (es) | 2006-12-22 | 2009-04-15 | Wyeth Corp | Polinucleotido vector que lo comprende celula recombinante que comprende el vector polipeptido , anticuerpo , composicion que comprende el polinucleotido , vector , celula recombinante polipeptido o anticuerpo , uso de la composicion y metodo para preparar la composicion misma y preparar una composi |
GB0700562D0 (en) | 2007-01-11 | 2007-02-21 | Novartis Vaccines & Diagnostic | Modified Saccharides |
GB0713880D0 (en) | 2007-07-17 | 2007-08-29 | Novartis Ag | Conjugate purification |
AU2008299376B2 (en) | 2007-09-12 | 2013-02-28 | Glaxosmithkline Biologicals S.A. | GAS57 mutant antigens and GAS57 antibodies |
CN101951949B (zh) | 2007-10-19 | 2013-10-02 | Novartis AG | Meningococcal vaccine formulations |
US8815253B2 (en) | 2007-12-07 | 2014-08-26 | Novartis Ag | Compositions for inducing immune responses |
KR101773114B1 (ko) | 2007-12-21 | 2017-08-30 | 노파르티스 아게 | 스트렙토라이신 o의 돌연변이 형태 |
EP3263591B1 (en) | 2008-02-21 | 2019-03-27 | GlaxoSmithKline Biologicals S.A. | Meningococcal fhbp polypeptides |
NZ588191A (en) | 2008-03-03 | 2012-06-29 | Irm Llc | Compounds and compositions as tlr activity modulators |
HUE029265T2 (en) | 2008-10-27 | 2017-02-28 | Glaxosmithkline Biologicals Sa | Method of purifying carbohydrates from the group streptococci |
US9175353B2 (en) | 2008-11-14 | 2015-11-03 | Gen-Probe Incorporated | Compositions, kits and methods for detection of campylobacter nucleic acid |
US8425922B2 (en) | 2009-01-05 | 2013-04-23 | EpitoGenesis, Inc. | Adjuvant compositions and methods of use |
AU2010204139A1 (en) | 2009-01-12 | 2011-08-11 | Novartis Ag | Cna_B domain antigens in vaccines against gram positive bacteria |
ITMI20090946A1 (it) | 2009-05-28 | 2010-11-29 | Novartis Ag | Espressione di proteine ricombinanti |
WO2010144734A1 (en) | 2009-06-10 | 2010-12-16 | Novartis Ag | Benzonaphthyridine-containing vaccines |
ES2596653T3 (es) | 2009-06-16 | 2017-01-11 | Glaxosmithkline Biologicals Sa | Ensayos bactericidas de opsonización y dependientes de anticuerpo mediado por el complemento de alto rendimiento |
EP2470204B1 (en) | 2009-08-27 | 2015-12-16 | GlaxoSmithKline Biologicals SA | Hybrid polypeptides including meningococcal fhbp sequences |
WO2011026111A1 (en) | 2009-08-31 | 2011-03-03 | The United States Of America, As Represented By The Secretary, Department Of Health And Human Services | Oral delivery of a vaccine to the large intestine to induce mucosal immunity |
ES2443952T3 (es) | 2009-09-02 | 2014-02-21 | Novartis Ag | Composiciones inmunógenas que incluyen moduladores de la actividad de TLR |
TWI445708B (zh) | 2009-09-02 | 2014-07-21 | Irm Llc | 作為tlr活性調節劑之化合物及組合物 |
CA2779798C (en) | 2009-09-30 | 2019-03-19 | Novartis Ag | Conjugation of staphylococcus aureus type 5 and type 8 capsular polysaccharides |
MX2012004850A (es) | 2009-10-27 | 2012-05-22 | Novartis Ag | Polipeptidos fhbp meningococicos modificados. |
CN102971009A (zh) | 2009-10-30 | 2013-03-13 | Novartis AG | Purification of Staphylococcus aureus type 5 and type 8 capsular polysaccharides |
WO2011057148A1 (en) | 2009-11-05 | 2011-05-12 | Irm Llc | Compounds and compositions as tlr-7 activity modulators |
CN102762206A (zh) | 2009-12-15 | 2012-10-31 | Novartis AG | Homogeneous suspension of immunopotentiating compounds and uses thereof |
EA023725B1 (ru) | 2010-03-23 | 2016-07-29 | Новартис Аг | Соединения (липопептиды на основе цистеина) и композиции в качестве агонистов tlr2, применяемые для лечения инфекционных, воспалительных, респираторных и других заболеваний |
EP3730943A1 (en) | 2010-04-08 | 2020-10-28 | University of Pittsburgh - Of the Commonwealth System of Higher Education | B-cell antigen presenting cell assay |
WO2011149564A1 (en) | 2010-05-28 | 2011-12-01 | Tetris Online, Inc. | Interactive hybrid asynchronous computer game infrastructure |
RU2013103763A (ru) * | 2010-07-02 | 2014-08-10 | Ангиохем Инк. | Короткие и содержащие d-аминокислоты полипептиды для терапевтических конъюгатов и их применения |
MX350142B (es) | 2010-08-23 | 2017-08-28 | Wyeth Llc * | Formulaciones estables de antigenos rlp2086 de neisseria meningitidis. |
EP3549601B1 (en) | 2010-09-10 | 2021-02-24 | Wyeth LLC | Non-lipidated variants of neisseria meningitidis orf2086 antigens |
WO2012032498A2 (en) | 2010-09-10 | 2012-03-15 | Novartis Ag | Developments in meningococcal outer membrane vesicles |
GB201101665D0 (en) | 2011-01-31 | 2011-03-16 | Novartis Ag | Immunogenic compositions |
CA2860331A1 (en) | 2010-12-24 | 2012-06-28 | Novartis Ag | Compounds |
WO2012095684A1 (en) * | 2011-01-10 | 2012-07-19 | Inserm ( Institut National De La Sante Et De La Recherche Medicale) | Methods for the screening of substances useful for the prevention and treatment of neisseria infections |
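CN107837394A (zh) | 2011-06-24 | 2018-03-27 | EpitoGenesis, Inc. | Pharmaceutical compositions comprising a combination of select carriers, vitamins, tannins and flavonoids as antigen-specific immunomodulators |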
MX354924B (es) | 2011-11-07 | 2018-03-22 | Novartis Ag | Molecula portadora que comprende un antigeno spr0096 y un spr2021. |
MX359256B (es) | 2012-03-09 | 2018-09-19 | Pfizer | Composiciones de neisseria meningitidis y metodos de las mismas. |
SA115360586B1 (ar) | 2012-03-09 | 2017-04-12 | فايزر انك | تركيبات لعلاج الالتهاب السحائي البكتيري وطرق لتحضيرها |
US10376573B2 (en) | 2012-06-14 | 2019-08-13 | Glaxosmithkline Biologicals Sa | Vaccines for serogroup X meningococcus |
ES2848048T3 (es) | 2012-10-03 | 2021-08-05 | Glaxosmithkline Biologicals Sa | Composiciones inmunogénicas |
BR112015018014A2 (pt) | 2013-02-01 | 2017-07-11 | Glaxosmithkline Biologicals Sa | liberação intradérmica de composições imunológicas compreendendo agonistas do receptor do tipo toll |
EP2964665B1 (en) | 2013-03-08 | 2018-08-01 | Pfizer Inc | Immunogenic fusion polypeptides |
RU2662968C2 (ru) | 2013-09-08 | 2018-07-31 | Пфайзер Инк. | Иммуногенная композиция против neisseria meningitidis (варианты) |
CN107249626A (zh) | 2015-02-19 | 2017-10-13 | Pfizer Inc. | Neisseria meningitidis compositions and methods thereof |
BR112018017141A2 (pt) | 2016-02-22 | 2019-01-02 | Boehringer Ingelheim Vetmedica Gmbh | método para a imobilização de biomoléculas |
US11612664B2 (en) | 2016-04-05 | 2023-03-28 | Gsk Vaccines S.R.L. | Immunogenic compositions |
SG11201906519RA (en) | 2017-01-31 | 2019-08-27 | Pfizer | Neisseria meningitidis compositions and methods thereof |
WO2019018744A1 (en) | 2017-07-21 | 2019-01-24 | The United States Of America, As Represented By The Secretary, Department Of Health And Human Services | IMMUNOGENIC COMPOSITIONS OF NEISSERIA MENINGITIDIS |
CN111867623B (zh) | 2018-02-12 | 2024-06-04 | Inimmune Corp. | Toll-like receptor ligands |
WO2020086408A1 (en) | 2018-10-26 | 2020-04-30 | The United States Of America, As Represented By The Secretary, Department Of Health And Human Services | A high-yield perfusion-based transient gene expression bioprocess |
WO2022096596A1 (en) | 2020-11-04 | 2022-05-12 | Eligo Bioscience | Cutibacterium acnes recombinant phages, method of production and uses thereof |
Family Cites Families (138)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US2386796A (en) | 1942-08-05 | 1945-10-16 | Bond Crown & Cork Co | Extruding device |
DE2855719A1 (de) | 1978-12-22 | 1980-07-10 | Siemens Ag | Zahnaerztliche handstueckanordnung |
US4336336A (en) | 1979-01-12 | 1982-06-22 | President And Fellows Of Harvard College | Fused gene and method of making and using same |
AU545912B2 (en) | 1980-03-10 | 1985-08-08 | Cetus Corporation | Cloned heterologous jive products in bacillies |
ZA811368B (en) | 1980-03-24 | 1982-04-28 | Genentech Inc | Bacterial polypedtide expression employing tryptophan promoter-operator |
NZ199722A (en) | 1981-02-25 | 1985-12-13 | Genentech Inc | Dna transfer vector for expression of exogenous polypeptide in yeast;transformed yeast strain |
DK188582A (da) | 1981-04-29 | 1982-10-30 | Biogen Nv | Bacillus-kloningsvektorer rekombinations-dna-molekyler bacillus-vaerter transformeret dermed samt fremgangsmaader til ekspressionaf fremmede dna-sekvenser og fremstilling af polypeptideer som er kodede dermed |
US4551433A (en) | 1981-05-18 | 1985-11-05 | Genentech, Inc. | Microbial hybrid promoters |
US4405712A (en) | 1981-07-01 | 1983-09-20 | The United States Of America As Represented By The Department Of Health And Human Services | LTR-Vectors |
US4603112A (en) | 1981-12-24 | 1986-07-29 | Health Research, Incorporated | Modified vaccinia virus |
US4769330A (en) | 1981-12-24 | 1988-09-06 | Health Research, Incorporated | Modified vaccinia virus and methods for making and using the same |
CA1341116C (en) | 1983-02-22 | 2000-10-17 | Rae Lyn Burke | Yeast expression systems with vectors having gapdh or pyk promoters and synthesis of foreign protein |
US4876197A (en) | 1983-02-22 | 1989-10-24 | Chiron Corporation | Eukaryotic regulatable transcription |
JPS59166086A (ja) | 1983-03-09 | 1984-09-19 | Teruhiko Beppu | Novel expression plasmids and a method of using them to express the calf prochymosin gene in Escherichia coli |
US4546083A (en) | 1983-04-22 | 1985-10-08 | Stolle Research & Development Corporation | Method and device for cell culture growth |
US4588684A (en) | 1983-04-26 | 1986-05-13 | Chiron Corporation | a-Factor and its processing signals |
JPS59205983A (ja) | 1983-04-28 | 1984-11-21 | Genex Corporation | Method for expressing heterologous genes in prokaryotic microorganisms |
US4663280A (en) | 1983-05-19 | 1987-05-05 | Public Health Research Institute Of The City Of New York | Expression and secretion vectors and method of constructing vectors |
EP0127839B1 (en) | 1983-05-27 | 1992-07-15 | THE TEXAS A&M UNIVERSITY SYSTEM | Method for producing a recombinant baculovirus expression vector |
US4689406A (en) | 1983-08-10 | 1987-08-25 | Amgen | Enhancement of microbial expression of polypeptides |
US4870008A (en) | 1983-08-12 | 1989-09-26 | Chiron Corporation | Secretory expression in eukaryotes |
JPS6054685A (ja) | 1983-09-02 | 1985-03-29 | Suntory Ltd | Improved expression vector and its use |
EP0136907A3 (en) | 1983-10-03 | 1986-12-30 | Genentech, Inc. | A xenogeneic expression control system, a method of using it, expression vectors containing it, cells transformed thereby and heterologous proteins produced therefrom |
DK518384A (da) | 1984-01-31 | 1985-07-01 | Idaho Res Found | Vector for producing a gene product in insect cells, method for its production, and its use |
US4880734A (en) | 1984-05-11 | 1989-11-14 | Chiron Corporation | Eukaryotic regulatable transcription |
EP0164556B1 (en) | 1984-05-11 | 1994-03-02 | Chiron Corporation | Enhanced yeast transcription employing hybrid promoter region constructs |
CA1282721C (en) | 1984-06-04 | 1991-04-09 | Bernard Roizman | Herpes simplex virus as a vector |
US5288641A (en) | 1984-06-04 | 1994-02-22 | Arch Development Corporation | Herpes Simplex virus as a vector |
US4738921A (en) | 1984-09-27 | 1988-04-19 | Eli Lilly And Company | Derivative of the tryptophan operon for expression of fused gene products |
US4745056A (en) | 1984-10-23 | 1988-05-17 | Biotechnica International, Inc. | Streptomyces secretion vector |
US4837148A (en) | 1984-10-30 | 1989-06-06 | Phillips Petroleum Company | Autonomous replication sequences for yeast strains of the genus pichia |
US4762915A (en) | 1985-01-18 | 1988-08-09 | Liposome Technology, Inc. | Protein-liposome conjugates |
US4797368A (en) | 1985-03-15 | 1989-01-10 | The United States Of America As Represented By The Department Of Health And Human Services | Adeno-associated virus as eukaryotic expression vector |
EP0196056B1 (en) | 1985-03-28 | 1991-05-22 | Chiron Corporation | Improved expression using fused genes providing for protein product |
US4683195A (en) | 1986-01-30 | 1987-07-28 | Cetus Corporation | Process for amplifying, detecting, and/or-cloning nucleic acid sequences |
US4683202A (en) | 1985-03-28 | 1987-07-28 | Cetus Corporation | Process for amplifying nucleic acid sequences |
US4865974A (en) | 1985-09-20 | 1989-09-12 | Cetus Corporation | Bacterial methionine N-terminal peptidase |
US4777127A (en) | 1985-09-30 | 1988-10-11 | Labsystems Oy | Human retrovirus-related products and methods of diagnosing and treating conditions associated with said retrovirus |
JPS6296086A (ja) | 1985-10-21 | 1987-05-02 | Agency Of Ind Science & Technol | Composite plasmid |
US5139941A (en) | 1985-10-31 | 1992-08-18 | University Of Florida Research Foundation, Inc. | AAV transduction vectors |
US5091309A (en) | 1986-01-16 | 1992-02-25 | Washington University | Sindbis virus vectors |
US4861719A (en) | 1986-04-25 | 1989-08-29 | Fred Hutchinson Cancer Research Center | DNA constructs for retrovirus packaging cell lines |
ES2061482T3 (es) | 1986-05-02 | 1994-12-16 | Gist Brocades Nv | Method for identifying secretory signal sequences for extracellular protein synthesis in Bacillus |
JP2612874B2 (ja) | 1986-10-02 | 1997-05-21 | Massachusetts Institute of Technology | Methods of regulating the metabolic stability of proteins |
JPS63123383A (ja) | 1986-11-11 | 1988-05-27 | Mitsubishi Kasei Corp | Hybrid promoter, expression-regulating DNA sequence, and expression vector |
GB8702816D0 (en) | 1987-02-07 | 1987-03-11 | Al Sumidaie A M K | Obtaining retrovirus-containing fraction |
US5219740A (en) | 1987-02-13 | 1993-06-15 | Fred Hutchinson Cancer Research Center | Retroviral gene transfer into diploid fibroblasts for gene therapy |
JP2795850B2 (ja) | 1987-03-23 | 1998-09-10 | ZymoGenetics, Inc. | Yeast expression vector |
US4980289A (en) | 1987-04-27 | 1990-12-25 | Wisconsin Alumni Research Foundation | Promoter deficient retroviral vector |
WO1989001973A2 (en) | 1987-09-02 | 1989-03-09 | Applied Biotechnology, Inc. | Recombinant pox virus for immunization against tumor-associated antigens |
DK463887D0 (da) | 1987-09-07 | 1987-09-07 | Novo Industri As | Yeast leader |
WO1989002468A1 (en) | 1987-09-11 | 1989-03-23 | Whitehead Institute For Biomedical Research | Transduced fibroblasts and uses therefor |
US4929555A (en) | 1987-10-19 | 1990-05-29 | Phillips Petroleum Company | Pichia transformation |
DE3886363T3 (de) | 1987-11-18 | 2004-09-09 | Chiron Corp. (a Delaware corporation), Emeryville | NANBV diagnostics |
WO1989005349A1 (en) | 1987-12-09 | 1989-06-15 | The Australian National University | Method of combating viral infections |
CA1340772C (en) | 1987-12-30 | 1999-09-28 | Patricia Tekamp-Olson | Expression and secretion of heterologous proteins in yeast employing truncated alpha-factor leader sequences |
US4973551A (en) | 1988-01-15 | 1990-11-27 | Merck & Co., Inc. | Vector for the expression of fusion proteins and protein immunogens |
JPH03504079A (ja) | 1988-03-21 | 1991-09-12 | Chiron Corporation | Recombinant retroviruses |
US5591624A (en) | 1988-03-21 | 1997-01-07 | Chiron Viagene, Inc. | Retroviral packaging cell lines |
US5662896A (en) | 1988-03-21 | 1997-09-02 | Chiron Viagene, Inc. | Compositions and methods for cancer immunotherapy |
US5206152A (en) | 1988-04-08 | 1993-04-27 | Arch Development Corporation | Cloning and expression of early growth regulatory protein genes |
US5422120A (en) | 1988-05-30 | 1995-06-06 | Depotech Corporation | Heterovesicular liposomes |
AP129A (en) | 1988-06-03 | 1991-04-17 | Smithkline Biologicals S A | Expression of retrovirus gag protein eukaryotic cells |
WO1990002806A1 (en) | 1988-09-01 | 1990-03-22 | Whitehead Institute For Biomedical Research | Recombinant retroviruses with amphotropic and ecotropic host ranges |
US5217879A (en) | 1989-01-12 | 1993-06-08 | Washington University | Infectious Sindbis virus vectors |
JP2752788B2 (ja) | 1989-01-23 | 1998-05-18 | Chiron Corporation | Recombinant therapies for infection and hyperproliferative disorders |
CA2045129A1 (en) | 1989-02-01 | 1990-08-02 | Alfred I. Geller | Herpes simplex virus type i expression vector |
JP3140757B2 (ja) | 1989-02-06 | 2001-03-05 | Dana-Farber Cancer Institute | Packaging-defective HIV provirus, cell lines, and uses thereof |
EP0463056A1 (en) | 1989-03-17 | 1992-01-02 | E.I. Du Pont De Nemours And Company | External regulation of gene expression |
US5703055A (en) | 1989-03-21 | 1997-12-30 | Wisconsin Alumni Research Foundation | Generation of antibodies through lipid mediated DNA delivery |
ATE240401T1 (de) | 1989-03-21 | 2003-05-15 | Vical Inc | Expression of exogenous polynucleotide sequences in vertebrates |
JPH0832638B2 (ja) | 1989-05-25 | 1996-03-29 | Chiron Corporation | Adjuvant formulation comprising a submicron oil droplet emulsion |
AU653919B2 (en) | 1989-08-15 | 1994-10-20 | Pasminco Australia Limited | Absorption of zinc vapour in molten lead |
WO1991002805A2 (en) | 1989-08-18 | 1991-03-07 | Viagene, Inc. | Recombinant retroviruses delivering vector constructs to target cells |
US5585362A (en) | 1989-08-22 | 1996-12-17 | The Regents Of The University Of Michigan | Adenovirus vectors for gene therapy |
US5166057A (en) | 1989-08-28 | 1992-11-24 | The Mount Sinai School Of Medicine Of The City University Of New York | Recombinant negative strand rna virus expression-systems |
GB8919607D0 (en) | 1989-08-30 | 1989-10-11 | Wellcome Found | Novel entities for cancer therapy |
AU7007491A (en) | 1990-02-02 | 1991-08-08 | Schweiz. Serum- & Impfinstitut Bern | Cdna corresponding to the genome of negative-strand rna viruses, and process for the production of infectious negative-strand rna viruses |
ZA911974B (en) | 1990-03-21 | 1994-08-22 | Res Dev Foundation | Heterovesicular liposomes |
CA2039921A1 (en) | 1990-04-16 | 1991-10-17 | Xandra O. Breakefield | Transfer and expression of gene sequences into central nervous system cells using herpes simplex virus mutants with deletions in genes for viral replication |
AU7906691A (en) | 1990-05-23 | 1991-12-10 | United States of America, as represented by the Secretary, U.S. Department of Commerce, The | Adeno-associated virus (aav)-based eucaryotic vectors |
US5149655A (en) | 1990-06-21 | 1992-09-22 | Agracetus, Inc. | Apparatus for genetic transformation |
EP0467714A1 (en) | 1990-07-19 | 1992-01-22 | Merck & Co. Inc. | The class II protein of the outer membrane of neisseria meningitidis |
JPH06500923A (ja) | 1990-09-21 | 1994-01-27 | Chiron Corporation | Packaging cells |
WO1992007945A1 (en) | 1990-10-30 | 1992-05-14 | Dana Farber Cancer Institute | Cell type specific alteration of levels of gene products in neural cells |
US5173414A (en) | 1990-10-30 | 1992-12-22 | Applied Immune Sciences, Inc. | Production of recombinant adeno-associated virus vectors |
SE9003978D0 (sv) | 1990-12-13 | 1990-12-13 | Henrik Garoff | DNA expression systems based on a virus replicon |
AU657111B2 (en) | 1990-12-20 | 1995-03-02 | Dana-Farber Cancer Institute | Control of gene expression by ionizing radiation |
EP0648271B1 (en) | 1991-08-20 | 2003-04-16 | THE GOVERNMENT OF THE UNITED STATES OF AMERICA as represented by the SECRETARY OF THE DEPARTMENT OF HEALTH AND HUMAN SERVICES | Adenovirus mediated transfer of genes to the gastrointestinal tract |
FR2681786A1 (fr) | 1991-09-27 | 1993-04-02 | Centre Nat Rech Scient | Recombinant vectors of viral origin, method for obtaining them, and their use for expressing polypeptides in muscle cells |
IL103059A0 (en) | 1991-09-30 | 1993-02-21 | Boehringer Ingelheim Int | Conjugates for introducing nucleic acid into higher eucaryotic cells |
NZ244306A (en) | 1991-09-30 | 1995-07-26 | Boehringer Ingelheim Int | Composition for introducing nucleic acid complexes into eucaryotic cells, complex containing nucleic acid and endosomolytic agent, peptide with endosomolytic domain and nucleic acid binding domain and preparation |
US5252479A (en) | 1991-11-08 | 1993-10-12 | Research Corporation Technologies, Inc. | Safe vector for gene therapy |
WO1993010218A1 (en) | 1991-11-14 | 1993-05-27 | The United States Government As Represented By The Secretary Of The Department Of Health And Human Services | Vectors including foreign genes and negative selective markers |
GB9125623D0 (en) | 1991-12-02 | 1992-01-29 | Dynal As | Cell modification |
CA2128616A1 (en) | 1992-01-23 | 1993-08-05 | Gary H. Rhodes | Ex vivo gene transfer |
FR2688514A1 (fr) | 1992-03-16 | 1993-09-17 | Centre Nat Rech Scient | Defective recombinant adenoviruses expressing cytokines and antitumor drugs containing them |
JPH07507689A (ja) | 1992-06-08 | 1995-08-31 | The Regents of the University of California | Methods and compositions for targeting specific tissues |
JPH09507741A (ja) | 1992-06-10 | 1997-08-12 | United States of America | Vector particles resistant to inactivation by human serum |
GB2269175A (en) | 1992-07-31 | 1994-02-02 | Imperial College | Retroviral vectors |
WO1994008026A1 (en) | 1992-09-25 | 1994-04-14 | Rhone-Poulenc Rorer S.A. | Adenovirus vectors for the transfer of foreign genes into cells of the central nervous system, particularly in brain |
EP1321526A3 (en) | 1992-11-18 | 2003-07-02 | Arch Development Corporation | Adenovirus-mediated gene transfer to cardiac and vascular smooth muscle |
WO1994012649A2 (en) | 1992-12-03 | 1994-06-09 | Genzyme Corporation | Gene therapy for cystic fibrosis |
US5478745A (en) | 1992-12-04 | 1995-12-26 | University Of Pittsburgh | Recombinant viral vector system |
US5348358A (en) | 1993-02-22 | 1994-09-20 | Selick David A | Contact lens insertion tool |
DE4311651A1 (de) | 1993-04-08 | 1994-10-13 | Boehringer Ingelheim Int | Virus for transporting foreign DNA into higher eukaryotic cells |
DE69431750T2 (de) | 1993-04-22 | 2003-04-03 | Skyepharma Inc., San Diego | Multivesicular liposomes with encapsulated cyclodextrin and pharmacologically active compounds, and methods for their use |
JPH09501309A (ja) | 1993-05-26 | 1997-02-10 | United States of America | Fusion proteins containing adeno-associated virus Rep protein and a bacterial protein |
FR2705686B1 (fr) | 1993-05-28 | 1995-08-18 | Transgene Sa | New defective adenoviruses and corresponding complementation lines |
ATE304604T1 (de) | 1993-06-24 | 2005-09-15 | Frank L Graham | Adenovirus vectors for gene therapy |
EP0667912B1 (fr) | 1993-07-13 | 2008-07-02 | Centelion | Defective adenoviral vectors and their use in gene therapy |
WO1995004139A1 (en) | 1993-07-27 | 1995-02-09 | The Wistar Institute Of Anatomy And Biology | Modified dna virus vectors and uses therefor |
US5631236A (en) | 1993-08-26 | 1997-05-20 | Baylor College Of Medicine | Gene therapy for solid tumors, using a DNA sequence encoding HSV-Tk or VZV-Tk |
US5362865A (en) | 1993-09-02 | 1994-11-08 | Monsanto Company | Enhanced expression in plants using non-translated leader sequences |
EP0982405B1 (en) | 1993-09-15 | 2009-08-26 | Novartis Vaccines and Diagnostics, Inc. | Recombinant alphavirus vectors |
FR2710536B1 (fr) | 1993-09-29 | 1995-12-22 | Transgene Sa | Anticancer use of a viral vector comprising a gene that modulates the immune and/or inflammatory response |
CA2171109A1 (en) | 1993-10-01 | 1995-04-13 | Zvi Ram | Gene therapy of the nervous system |
SK283703B6 (sk) | 1993-10-25 | 2003-12-02 | Canji, Inc. | Recombinant adenovirus vector and its use |
RU2160093C2 (ru) | 1993-11-16 | 2000-12-10 | SkyePharma Inc. | Vesicles with controlled release of active ingredients |
US5693506A (en) | 1993-11-16 | 1997-12-02 | The Regents Of The University Of California | Process for protein production in plants |
FR2712603B1 (fr) | 1993-11-18 | 1996-02-09 | Centre Nat Rech Scient | Recombinant viruses, their preparation and use in gene therapy |
JPH07241786A (ja) | 1994-03-08 | 1995-09-19 | Fanuc Ltd | Control device for industrial robot |
US6780406B1 (en) | 1994-03-21 | 2004-08-24 | The Regents Of The University Of Michigan | Inhibition of vascular smooth muscle cell proliferation administering a thymidine kinase gene |
US7252989B1 (en) | 1994-04-04 | 2007-08-07 | Board Of Regents, The University Of Texas System | Adenovirus supervector system |
JPH10507061A (ja) | 1994-04-28 | 1998-07-14 | The University of Michigan | Gene delivery vectors using plasmid DNA packaged in adenovirus, and packaging cell lines |
ES2297831T3 (es) | 1994-05-09 | 2008-05-01 | Oxford Biomedica (Uk) Limited | Retroviral vectors with a reduced recombination rate |
JP3816518B2 (ja) | 1994-06-10 | 2006-08-30 | GenVec, Inc. | Complementary adenoviral vector systems and cell lines |
FR2723588B1 (fr) | 1994-08-12 | 1996-09-20 | Rhone Poulenc Rorer Sa | Adenovirus comprising a gene encoding glutathione peroxidase |
US6121037A (en) * | 1994-10-18 | 2000-09-19 | Stojiljkovic; Igor | Bacterial hemoglobin receptor genes |
IL117483A (en) | 1995-03-17 | 2008-03-20 | Bernard Brodeur | Proteinase K resistant surface protein of Neisseria meningitidis |
US6265567B1 (en) * | 1995-04-07 | 2001-07-24 | University Of North Carolina At Chapel Hill | Isolated FrpB nucleic acid molecule |
AU5741996A (en) | 1995-05-22 | 1996-12-11 | Chiron Corporation | Position-specific integration of vector constructs into eukaryotic genomes mediated by a chimeric integrase protein |
DE19534579C2 (de) * | 1995-09-18 | 2000-06-08 | Max Planck Gesellschaft | Nucleic acid molecules encoding proteins that mediate the adhesion of Neisseria cells to human cells |
US5753235A (en) | 1996-02-15 | 1998-05-19 | Heska Corporation | Recombinant canine herpesviruses |
US5980898A (en) | 1996-11-14 | 1999-11-09 | The United States Of America As Represented By The U.S. Army Medical Research & Material Command | Adjuvant for transcutaneous immunization |
GB9808866D0 (en) * | 1998-04-24 | 1998-06-24 | Smithkline Beecham Biolog | Novel compounds |
AU761780B2 (en) * | 1998-05-01 | 2003-06-12 | Glaxosmithkline Biologicals Sa | Neisseria meningitidis antigens and compositions |
US9714465B2 (en) | 2008-12-01 | 2017-07-25 | Applied Materials, Inc. | Gas distribution blocker apparatus |
-
1998
- 1998-10-09 CA CA002308606A patent/CA2308606A1/en not_active Abandoned
- 1998-10-09 EP EP10179785A patent/EP2278006A3/en not_active Withdrawn
- 1998-10-09 WO PCT/IB1998/001665 patent/WO1999024578A2/en active Application Filing
- 1998-10-09 AU AU93637/98A patent/AU9363798A/en not_active Abandoned
- 1998-10-09 BR BRPI9813930-4A patent/BR9813930A/pt not_active Application Discontinuation
- 1998-10-09 CN CNB988128446A patent/CN1263854C/zh not_active Expired - Fee Related
- 1998-10-09 CA CA002671261A patent/CA2671261A1/en not_active Abandoned
- 1998-10-09 DE DE69841807T patent/DE69841807D1/de not_active Expired - Lifetime
- 1998-10-09 EP EP98946675A patent/EP1029052B1/en not_active Expired - Lifetime
- 1998-10-09 EP EP07075379A patent/EP1900818A3/en not_active Withdrawn
- 1998-10-09 AT AT98946675T patent/ATE476508T1/de not_active IP Right Cessation
- 1998-10-09 JP JP2000520572A patent/JP4472866B2/ja not_active Expired - Fee Related
-
2000
- 2000-09-19 HK HK00105869.7A patent/HK1027126A1/xx not_active IP Right Cessation
-
2001
- 2001-06-06 HK HK01103903A patent/HK1033337A1/xx not_active IP Right Cessation
-
2005
- 2005-10-03 JP JP2005290551A patent/JP2006089488A/ja not_active Withdrawn
-
2009
- 2009-10-26 JP JP2009245980A patent/JP2010046089A/ja not_active Withdrawn
-
2010
- 2010-11-04 CY CY20101100997T patent/CY1110880T1/el unknown
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN100354297C (zh) * | 2001-10-03 | 2007-12-12 | Chiron Corporation | Adjuvanted meningococcal compositions |
Also Published As
Publication number | Publication date |
---|---|
WO1999024578A3 (en) | 2000-03-02 |
JP4472866B2 (ja) | 2010-06-02 |
HK1033337A1 (en) | 2001-08-24 |
EP1900818A3 (en) | 2008-06-11 |
EP1029052B1 (en) | 2010-08-04 |
JP2003522514A (ja) | 2003-07-29 |
EP2278006A3 (en) | 2011-03-02 |
CA2671261A1 (en) | 1999-05-20 |
HK1027126A1 (en) | 2001-01-05 |
AU9363798A (en) | 1999-05-31 |
CY1110880T1 (el) | 2015-06-10 |
WO1999024578A2 (en) | 1999-05-20 |
JP2010046089A (ja) | 2010-03-04 |
ATE476508T1 (de) | 2010-08-15 |
CA2308606A1 (en) | 1999-05-20 |
BR9813930A (pt) | 2006-12-19 |
EP1900818A2 (en) | 2008-03-19 |
EP1029052A2 (en) | 2000-08-23 |
EP2278006A2 (en) | 2011-01-26 |
CN1263854C (zh) | 2006-07-12 |
JP2006089488A (ja) | 2006-04-06 |
DE69841807D1 (de) | 2010-09-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN1263854C (zh) | Neisserial antigens | |
CN1224708C (zh) | Neisseria meningitidis antigens | |
US8293251B2 (en) | Neisserial antigens | |
CN100379757C (zh) | Neisseria meningitidis antigens and compositions | |
CN1359426A (zh) | Neisserial genomic sequences and methods of their use | |
KR102607213B1 (ko) | Ammonia-oxidizing Nitrosomonas eutropha strain D23 | |
CN1338005A (zh) | Neisserial genomic sequences and uses thereof | |
AU745787B2 (en) | Enterococcus faecalis polynucleotides and polypeptides | |
CN1451046A (zh) | Conserved Neisserial antigens | |
CN1617740A (zh) | Immunisation against Chlamydia trachomatis | |
CN1433470A (zh) | Neisserial antigenic peptides | |
CN1774447A (zh) | Streptococcus pneumoniae antigens | |
RU2673715C2 (ru) | Vaccine against Haemophilus parasuis serotype 4 | |
CN1824675B (zh) | Neisserial antigens | |
CN1911959A (zh) | Neisserial genomic sequences and uses thereof | |
MXPA00004363A (en) | Neisserial antigens | |
AU2003235364B2 (en) | Neisseria meningitidis antigens and compositions | |
CN1186516A (zh) | Nucleic acid and amino acid sequences relating to Helicobacter pylori for diagnostics and therapeutics | |
AU1546202A (en) | Enterococcus faecalis polynucleotides and polypeptides |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
C56 | Change in the name or address of the patentee | Owner name: NOVARTIS VACCINES + DIAGNOSTIC; Free format text: FORMER NAME: CHIRON S.P.A. |
CP01 | Change in the name or title of a patent holder | Address after: Siena, Italy; Patentee after: Novartis Vaccines & Diagnostic; Address before: Siena, Italy; Patentee before: Chiron S.p.A. |
C17 | Cessation of patent right | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20060712; Termination date: 20131009 |