CN112301018B - Novel Cas proteins, Crispr-Cas systems, and their use in the field of gene editing - Google Patents

Novel Cas proteins, Crispr-Cas systems, and their use in the field of gene editing

Info

Publication number
CN112301018B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010401622.0A
Other languages
English (en)
Other versions
CN112301018A (zh)
Inventor
江媛
王丹
章登位
戴雪辰
汪晓珏
纪泽阳
王�琦
赵静
李卓坤
顾颖
欧阳文杰
沈玥
陈奥
章文蔚
肖亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BGI Shenzhen Co Ltd
Original Assignee
BGI Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BGI Shenzhen Co Ltd
Priority to CN202310742030.9A (CN116694603A)
Publication of CN112301018A
Application granted
Publication of CN112301018B
Legal status: Active
Anticipated expiration

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C12N15/87 Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90 Stable introduction of foreign DNA into chromosome
    • C12N15/902 Stable introduction of foreign DNA into chromosome using homologous recombination
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A50/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
    • Y02A50/30 Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Zoology (AREA)
  • Biomedical Technology (AREA)
  • Organic Chemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Wood Science & Technology (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Plant Pathology (AREA)
  • Medicinal Chemistry (AREA)
  • Mycology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Peptides Or Proteins (AREA)

Abstract

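The invention relates to the field of gene editing, and in particular to novel Cas proteins, Crispr-Cas systems, and their use in gene editing. The novel Cas protein is selected from at least one of the following: SEQ ID NO:1 to SEQ ID NO:4; a protein whose sequence similarity to any one of SEQ ID NO:1 to SEQ ID NO:4 is at least 85%, preferably at least 90%. The novel Cas proteins provided by the invention can be used in a Crispr-Cas system to edit genes. They can edit more target sites, are easier to deliver into cells for editing, and do not cause off-target effects.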

Description

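Novel Cas proteins, Crispr-Cas systems, and their use in the field of gene editing
Technical Field
The invention relates to the field of gene editing, and in particular to novel Cas proteins, Crispr-Cas systems, and their use in gene editing.
Background
CRISPR (clustered regularly interspaced short palindromic repeats) is in effect a gene editor and a natural immune mechanism found in most bacteria and archaea. Analysis of the sequences flanking CRISPR clusters revealed a polymorphic gene family nearby that functions together with the CRISPR region; these genes were therefore named CRISPR-associated genes, abbreviated Cas. Most CRISPR-Cas systems contain the Cas1 protein, which is one of the more conserved members of the Cas family. Based on the structure of the effector module, the CRISPR-Cas systems discovered so far fall into two classes: Class 1 systems contain multiple Cas proteins, with several effector proteins acting together, and mainly comprise Type I, Type III, and Type IV; Class 2 systems contain a single large effector protein and comprise Type II, Type V, and Type VI. At present, the Class 2 systems widely used in gene editing are the Cas9 system (Type II) and the Cpf1 system (Type V).
However, Crispr-Cas systems still have a number of shortcomings: for example, off-target editing may occur and the range of application is limited, so further improvement is needed.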
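Summary of the Invention
The invention aims to solve, at least to some extent, one of the technical problems in the related art. To this end, one object of the invention is to provide novel Cas proteins, Crispr-Cas systems, and their use in the field of gene editing.
The CRISPR/Cas system is a commonly used gene-editing system that has been successfully applied to precise editing of animal and plant genomes. In this system, an RNA guides recognition of a specific site on double-stranded DNA, which is then cut by a nuclease; the most widely used nucleases are Cas9 and Cpf1. Both recognize and cut a specific double-stranded DNA site under RNA guidance, producing a double-strand break that the cell repairs through NHEJ (non-homologous end joining) or HR (homologous recombination), thereby achieving site-specific modification of the target gene. SpCas9 is a Cas9 nuclease in wide commercial use; it recognizes the PAM sequence NGG at the 3' end of the target sequence and cuts 3 bp from the PAM to produce blunt ends. LbCpf1 is a Cpf1 nuclease in wide commercial use; it recognizes a TTTN PAM at the 5' end of the target sequence and cuts distal to the PAM to produce sticky ends.
During our research we found that both SpCas9 and LbCpf1 have relatively strict PAM requirements, which restricts target-site design. In addition, SpCas9 and LbCpf1 consist of 1368 and 1228 amino acids respectively; they are too large to be packaged and delivered with AAV vectors, which limits their use in animal cells to some extent. Furthermore, the SpCas9 target sequence is 20 bp long, and similar sequences readily occur elsewhere in a genome, leading to off-target effects.
It is therefore important to find new usable Cas proteins that are shorter, so that they can be packaged and delivered more easily and applied more widely in animal cells, and that make the Crispr-Cas system less prone to off-target effects.
To this end, our research identified several novel Cas proteins. They are shorter, can be delivered into cells more easily when used in a Crispr-Cas system, and are less prone to off-target effects. Take the BES1 protein obtained from the human gut bacterium Veillonella sp. AF13-2 (hereinafter AF13-2) as an example: when used in a Crispr-Cas system, the PAM it recognizes is less specific than those of commercial SpCas9 and LbCpf1, so it can potentially edit more target sites. BES1 also consists of only 1064 amino acids, which makes it easier to deliver into cells for editing. The SpCas9 target sequence is 20 bp, whereas the BES1 target sequence is 23 bp, so BES1 is potentially less prone to off-target effects than SpCas9.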
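To make the claim about PAM stringency concrete, the following minimal Python sketch estimates what fraction of random sequences satisfies each PAM motif quoted above. It assumes, purely for illustration, that the four bases are equally likely and independent at each position; the function name `pam_fraction` is hypothetical and not part of the patent.

```python
# IUPAC nucleotide codes that appear in the PAM motifs quoted in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "ACGT",  # any base
    "V": "ACG",   # A, C or G (not T)
    "Y": "CT",    # pyrimidine
    "M": "AC",    # A or C
}

def pam_fraction(pam: str) -> float:
    """Fraction of random len(pam)-mers that match the motif.

    Assumes uniform, independent base composition (an illustration only).
    """
    frac = 1.0
    for code in pam:
        frac *= len(IUPAC[code]) / 4
    return frac

if __name__ == "__main__":
    for name, pam in {"SpCas9": "NGG", "LbCpf1": "TTTN", "BES1": "NNNV"}.items():
        print(f"{name:7s} {pam:5s} {pam_fraction(pam):.4f}")
```

Under this simplifying assumption, NNNV is satisfied by three quarters of all four-base windows, whereas NGG and TTTN are satisfied by far fewer, which is the sense in which BES1 can reach more candidate sites.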
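Specifically, the invention provides the following technical solutions:
According to a first aspect, the invention provides a Cas protein selected from at least one of the following: SEQ ID NO:1 to SEQ ID NO:4; a protein whose sequence similarity to any one of SEQ ID NO:1 to SEQ ID NO:4 is at least 85%, preferably at least 90%. The novel Cas proteins of SEQ ID NO:1 to SEQ ID NO:4 were obtained by bioinformatic screening and verified by molecular biology techniques; any one of them is easily delivered into cells for gene editing. The PAM sequences they recognize are suitably specific, so more target sites can be edited, and the target sequence length is also suitable, so off-target effects are less likely. Proteins with at least 85% sequence similarity to any of SEQ ID NO:1 to SEQ ID NO:4 (for example at least 86%, 87%, 88%, or 89%), preferably at least 90% (for example at least 91%, 92%, 93%, or 94%), have the same or similar activity and function as the Cas proteins of SEQ ID NO:1 to SEQ ID NO:4; they are likewise easy to deliver into cells for gene editing, can edit more target sites, target sequences of a more suitable length, and are less prone to off-target effects.
According to embodiments of the invention, the Cas protein described above may further include the following technical features:
In some embodiments of the invention, the sequence similarity to any one of SEQ ID NO:1 to SEQ ID NO:4 is at least 95%, preferably at least 96%, more preferably at least 97%, more preferably at least 98%, and most preferably at least 99%. Proteins with at least 95% (preferably at least 96%, 97%, 98%, 99%, or 99.5%) sequence similarity to any of SEQ ID NO:1 to SEQ ID NO:4 have the same or similar activity as the Cas proteins above, are easy to deliver into cells for gene editing, can edit more target sites, target sequences of a more suitable length, and are less prone to off-target effects.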
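In some embodiments of the invention, the Cas protein is a Cas protein with nuclease activity obtained by substituting, deleting, or adding one or several amino acids relative to any one of SEQ ID NO:1 to SEQ ID NO:4. These proteins have the same or similar nuclease activity as the proteins of SEQ ID NO:1 to SEQ ID NO:4, are likewise easy to deliver into cells for gene editing, have many target sites, and are less prone to off-target effects.
In some embodiments of the invention, the Cas protein is a Cas protein with nuclease activity obtained by substituting, deleting, or adding at most 8 amino acids relative to any one of SEQ ID NO:1 to SEQ ID NO:4. These proteins have the same or similar nuclease activity as the proteins of SEQ ID NO:1 to SEQ ID NO:4, are likewise easy to deliver into cells for gene editing, have many target sites, and are less prone to off-target effects.
In some embodiments of the invention, the Cas protein is a Cas protein with nuclease activity obtained by substituting, deleting, or adding at most 6 amino acids relative to any one of SEQ ID NO:1 to SEQ ID NO:4. These proteins have the same or similar nuclease activity as the proteins of SEQ ID NO:1 to SEQ ID NO:4, are likewise easy to deliver into cells for gene editing, have many target sites, and are less prone to off-target effects.
In some embodiments of the invention, the Cas protein is a Cas protein with nuclease activity obtained by substituting, deleting, or adding at most 5 amino acids relative to any one of SEQ ID NO:1 to SEQ ID NO:4. These proteins have the same or similar nuclease activity as the proteins of SEQ ID NO:1 to SEQ ID NO:4, are likewise easy to deliver into cells for gene editing, have many target sites, and are less prone to off-target effects.
In some embodiments of the invention, the Cas protein is a Cas protein with nuclease activity obtained by substituting, deleting, or adding at most 4 amino acids relative to any one of SEQ ID NO:1 to SEQ ID NO:4. These proteins have the same or similar nuclease activity as the proteins of SEQ ID NO:1 to SEQ ID NO:4, are likewise easy to deliver into cells for gene editing, have many target sites, and are less prone to off-target effects.
In some embodiments of the invention, the Cas protein is a Cas protein with nuclease activity obtained by substituting, deleting, or adding at most 3 amino acids relative to any one of SEQ ID NO:1 to SEQ ID NO:4. These proteins have the same or similar nuclease activity as the proteins of SEQ ID NO:1 to SEQ ID NO:4, are likewise easy to deliver into cells for gene editing, have many target sites, and are less prone to off-target effects.
In some embodiments of the invention, the Cas protein is a Cas protein with nuclease activity obtained by substituting, deleting, or adding at most 2 amino acids relative to any one of SEQ ID NO:1 to SEQ ID NO:4. These proteins have the same or similar nuclease activity as the proteins of SEQ ID NO:1 to SEQ ID NO:4, are likewise easy to deliver into cells for gene editing, have many target sites, and are less prone to off-target effects.
In some embodiments of the invention, the Cas protein is a Cas protein with nuclease activity obtained by substituting, deleting, or adding 1 amino acid relative to any one of SEQ ID NO:1 to SEQ ID NO:4. These proteins have the same or similar nuclease activity as the proteins of SEQ ID NO:1 to SEQ ID NO:4, are likewise easy to deliver into cells for gene editing, have many target sites, and are less prone to off-target effects.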
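In some embodiments of the invention, the Cas protein is as shown in SEQ ID NO:1. This Cas protein consists of 1064 amino acids, a relatively small number, which makes it easier to deliver into cells for editing. The PAM sequence it recognizes is NNNV (where V stands for A, G, or C), so more target sites can be edited, and its target sequence is 23 bp, so off-target effects are less likely. This Cas protein cleaves double-stranded DNA in vitro; no editing activity was detected in human cells.
In some embodiments of the invention, the Cas protein is as shown in SEQ ID NO:2. This Cas protein consists of 1368 amino acids, a relatively small number, which makes it easier to deliver into cells for editing, and the PAM sequence it recognizes is NNMTA. This Cas protein cleaves double-stranded DNA in vitro; no editing activity was detected in human cells.
In some embodiments of the invention, the Cas protein is as shown in SEQ ID NO:3. This Cas protein consists of 1245 amino acids, a relatively small number, which makes it easier to deliver into cells for editing, and the PAM sequence it recognizes is TTTN. This Cas protein cleaves double-stranded DNA in vitro and has editing activity in human cells.
In some embodiments of the invention, the Cas protein is as shown in SEQ ID NO:4. This Cas protein consists of 1306 amino acids, a relatively small number, which makes it easier to deliver into cells for editing, and the PAM sequence it recognizes is YYN, which greatly relaxes the restriction that LbCpf1 recognizes only TTTN. This Cas protein cleaves double-stranded DNA in vitro and has editing activity in human cells.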
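According to a second aspect, the invention provides a nucleic acid sequence selected from at least one of the following: a nucleic acid sequence encoding the Cas protein of any embodiment of the first aspect; a nucleic acid sequence that is the reverse complement of a nucleic acid sequence encoding the Cas protein of any embodiment of the first aspect.
In some embodiments of the invention, the nucleic acid sequence is DNA or RNA.
According to a third aspect, the invention provides an expression vector comprising the nucleic acid sequence of the second aspect. The nucleic acid sequence is cloned into a vector to obtain an expression vector that can express the corresponding Cas protein in a target cell, so that the corresponding gene editing can be carried out in that cell. Commonly used vectors include plasmids, lentiviruses, and the like, for example the pET28a vector or the pMD19 vector.
According to a fourth aspect, the invention provides a recombinant cell containing the expression vector of the third aspect. Introducing the expression vector into a cell yields a recombinant cell; expressing the corresponding Cas protein from the vector allows the recombinant cell to be gene-edited. The recombinant cells may be eukaryotic cells, such as plant cells or animal cells. In particular, compared with the commonly used SpCas9 and LbCpf1 proteins, the Cas proteins provided here have fewer amino acids and are easier to deliver into cells for editing. In animal cells they are more readily packaged and delivered by viral vectors, which broadens their application in animal cells.
According to a fifth aspect, the invention provides a Crispr-Cas system comprising the Cas protein of the first aspect. The Cas proteins provided by the invention can be used in a Crispr-Cas system for gene editing; they widen the editable range, are less prone to off-target effects, and improve editing accuracy. The system can be used in many fields, including basic bioscience, medicine, and agriculture.
According to embodiments of the invention, the Crispr-Cas system described above may further include the following technical features:
In some embodiments of the invention, the Crispr-Cas system further comprises at least one of the following: a crRNA, a tracrRNA, or a chimeric RNA formed from a crRNA and a tracrRNA. These RNAs help the Crispr-Cas system carry out gene editing. In addition, the Crispr-Cas system may, as needed, further comprise a Crispr_repeat sequence; the Crispr_repeat sequence corresponding to each Cas protein is listed in Appendix Table I and Appendix Table II.
In some embodiments of the invention, the crRNA and tracrRNA are as listed in Appendix Table I and Appendix Table II, which give the crRNA and tracrRNA sequences used by the Cas proteins for gene editing. These sequences help the Cas protein locate the target sequence precisely and so achieve precise gene editing.
According to a sixth aspect, the invention provides the use of a Cas protein, nucleic acid sequence, expression vector, recombinant cell, or Crispr-Cas system in the field of gene editing, wherein the Cas protein is that of the first aspect, the nucleic acid sequence is that of the second aspect, the expression vector is that of the third aspect, the recombinant cell is that of the fourth aspect, and the Crispr-Cas system is that of the fifth aspect.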
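Brief Description of the Drawings
Figure 1 shows the PAM preference of BES1 according to an embodiment of the invention.
Figure 2 shows the purification result for BES1 according to an embodiment of the invention.
Figure 3 shows the base sequences and structures of the BES1 crRNA+tracrRNA-L, sgRNA-1, sgRNA-2, and sgRNA-3 according to an embodiment of the invention.
Figure 4 shows the chip-detected PAM preference of BES1 with crRNA+tracrRNA-L, sgRNA-1, and sgRNA-3 according to an embodiment of the invention.
Figure 5 shows the spacer sequences according to an embodiment of the invention.
Figure 6 shows the sequence of the constructed PAM library according to an embodiment of the invention.
Figure 7 is a schematic of the cleavage substrate sequence according to an embodiment of the invention.
Figure 8 shows the bands of in vitro cleavage products of BES1 with crRNA+tracrRNA-L, sgRNA-1, sgRNA-2, and sgRNA-3 at 20 °C, 25 °C, and 37 °C according to an embodiment of the invention.
Figure 9 is a flow chart of the procedure for obtaining the novel Cas proteins according to an embodiment of the invention.
Figure 10 shows the chip-detected PAM preferences of the BES2, BES4, and BES6 systems according to an embodiment of the invention.
Figure 11 shows the in vitro cleavage experiments for the BES2, BES4, and BES6 systems according to an embodiment of the invention.
Figure 12 is an electrophoresis image of the human-cell editing activity assay for the BES6 system according to an embodiment of the invention.
Figure 13 is an electrophoresis image of the human-cell editing activity assay for the BES4 system according to an embodiment of the invention.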
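Detailed Description
Embodiments of the invention are described in detail below, with examples shown in the drawings, in which the same or similar reference signs denote the same or similar elements or elements with the same or similar functions throughout. The embodiments described with reference to the drawings are exemplary and intended to explain the invention; they should not be construed as limiting it. To help those skilled in the art understand the invention, some terms and expressions used here are explained; these explanations are only an aid to understanding and should not be regarded as limiting the scope of protection.
In this document, the terms "Crispr", "crispr", and "CRISPR" all refer to clustered regularly interspaced short palindromic repeats; upper case, lower case, and initial capital are all common usage in the field, and "Crispr-Cas system" accordingly appears with different capitalization. When denoting bases, unless otherwise stated, the letters N and V have their usual meaning in the field: N stands for any base A, T, C, or G, and V stands for any base A, C, or G.
The Cas9 enzyme cuts at a target site on the target DNA, and the target site is usually determined as follows: an RNA molecule called Crispr RNA (crRNA) uses part of its sequence to base-pair with an RNA molecule called tracrRNA, forming a chimeric RNA (tracrRNA/crRNA); another part of the crRNA sequence then base-pairs with the target DNA site, so that the chimeric RNA guides the Cas protein to bind and cut at that site. This chimeric RNA is also called a guide RNA. Unlike the Crispr-Cas9 system, the Cpf1 enzyme can process the crRNA precursor on its own and then use the resulting crRNA to specifically target and cut DNA, without needing a host ribonuclease or a tracrRNA.
The targeting specificity of Crispr is determined by two components: base pairing between the chimeric RNA and the target DNA, and the interaction between the Cas protein and a short DNA sequence at the 3' end of the target DNA called the PAM (protospacer adjacent motif).
If the PAM requirement is strict (for example, a few specific bases), the Cas protein can edit relatively few target sites, which limits the application of the Crispr-Cas system. SpCas9 and LbCpf1 both have relatively strict PAM requirements, which restricts target-site design. For example, the SpCas9 nuclease recognizes the PAM sequence NGG at the 3' end of the target sequence and cuts 3 bp from the PAM to produce blunt ends; because its PAM is limited to NGG, the application of this editing system is restricted.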
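The effect of the PAM on target-site design can be illustrated with a short Python sketch that scans one strand of a DNA string for protospacers followed by a 3' PAM. This is a simplified illustration, not the inventors' pipeline: the function name `find_sites` is hypothetical, only the forward strand is searched, and the input is assumed to contain only unambiguous A, C, G, T.

```python
import re

# Regex fragments for the IUPAC codes used by the PAM motifs in the text.
IUPAC_RE = {"A": "A", "C": "C", "G": "G", "T": "T",
            "N": "[ACGT]", "V": "[ACG]", "Y": "[CT]", "M": "[AC]"}

def find_sites(seq: str, pam: str, spacer_len: int):
    """Return (start, protospacer, pam) for every forward-strand site
    where a protospacer of spacer_len bases is followed by a 3' PAM."""
    pam_re = "".join(IUPAC_RE[c] for c in pam)
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{spacer_len}}})({pam_re}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]

if __name__ == "__main__":
    toy = "A" * 20 + "TGG" + "CCCC"          # toy sequence, not from the patent
    for start, spacer, motif in find_sites(toy, "NGG", 20):
        # SpCas9 is described as cutting 3 bp from the PAM.
        print(start, spacer, motif, "cut after index", start + 20 - 3 - 1)
```

Calling `find_sites(seq, "NNNV", 23)` on the same input would apply the BES1 parameters given in the text (23 bp target, NNNV PAM) and typically returns many more candidate sites.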
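Using bioinformatics and molecular experiments, we identified in the human gut microbiota several novel Cas9 and Cpf1 systems with gene-editing potential, listed in Appendix Table I and Appendix Table II. The Cpf1 enzyme of the Cpf1 system, also known as the Cas12a protein, edits genes in a different way from Cas9; it is smaller than SpCas9 and therefore easier to deliver into cells and tissues. When used in a Crispr-Cpf1 system it requires only a crRNA and can edit multiple sites simultaneously. The Cas proteins provided in this application include both Cas9 proteins and Cpf1 proteins. That is, the invention provides Cas proteins that are at least one of SEQ ID NO:1 to SEQ ID NO:4. These Cas proteins have nuclease activity and can cut a target nucleic acid, so they can be used in Crispr-Cas systems to edit genes effectively, with more editable target sites and a broader range of application.
The novel Cas9 and Cpf1 systems provided here recognize PAMs of lower specificity, which broadens the application of the gene-editing system. Take BES1, obtained from the human gut bacterium Veillonella sp. AF13-2 (hereinafter AF13-2), as an example: its PAM is less specific, and the protein is smaller, than commercial SpCas9 and LbCpf1. Because BES1 has relatively few amino acids, it is easier to deliver into cells to carry out gene editing. The PAM preference of BES1 is shown in Figure 1, where the horizontal axis represents the 7 positions immediately adjacent to the 3' end of the target sequence and the vertical axis represents the proportion of each base among all cleaved (positive) sequences. At the first position adjacent to the 3' end of the target sequence, A, C, T, and G all occur with high probability, so this position can be written as N; the remaining positions are read in the same way. Figure 1 shows that the probability of cleavage is very low (below 0.05) only when the fourth position is T, so the PAM sequence of BES1 is NNNV (where V stands for A, G, or C).
The table of novel Cas9 systems includes the strain carrying each Cas protein, Genome ID (NCBI database), TaxID (NCBI database), species, genus, phylum, crRNA, tracrRNA, crispr repeat sequence, effector protein length, effector amino acid sequence, and so on; see Appendix Table I.
The table of novel Cpf1 systems includes the strain carrying each Cas protein, Genome ID (NCBI database), TaxID (NCBI database), species, genus, phylum, crRNA, tracrRNA, crispr repeat sequence, effector protein length, effector amino acid sequence, and so on; see Appendix Table II.
The solutions of the invention are explained below with reference to examples. Those skilled in the art will understand that the following examples only illustrate the invention and should not be regarded as limiting its scope. Where specific techniques or conditions are not indicated, the examples follow the techniques or conditions described in the literature of the field or the product instructions. Reagents and instruments for which no manufacturer is given are conventional, commercially available products.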
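Example 1
Microorganisms of the human gut microbiota were analyzed using microbial genome databases to predict Cas protein sequences and Crispr sequences, and all protein sequences within 20 kb upstream and downstream of each Crispr locus were identified. These were then compared against the NCBI protein database to obtain homologs of known Type II or Type V proteins. The homologs were analyzed for conserved sites in key domains and for protein completeness, yielding the Cas protein sequences and neighboring Crispr sequences in Appendix Table I and Appendix Table II. The analysis workflow is shown in Figure 9. These novel Crispr-Cas systems are new Type II and Type V systems with gene-editing capabilities different from those of the existing SpCas9 protein. They enrich the set of available Crispr-Cas systems and can be used as needed for gene editing in different cells, such as animal and plant cells.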
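The neighborhood step of this screen (collecting every protein within 20 kb of a predicted Crispr array) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the inventors' code: the `Feature` class, the function name, and the coordinates are hypothetical, and the prediction of arrays and proteins is assumed to have been done by upstream tools.

```python
from dataclasses import dataclass

@dataclass
class Feature:
    name: str
    start: int  # 0-based start on the contig
    end: int    # exclusive end on the contig

def proteins_near_array(proteins, array, flank=20_000):
    """Return proteins overlapping the window that extends `flank` bp
    upstream and downstream of the Crispr array."""
    lo = max(0, array.start - flank)
    hi = array.end + flank
    return [p for p in proteins if p.start < hi and p.end > lo]

if __name__ == "__main__":
    array = Feature("crispr_array", 50_000, 51_000)   # hypothetical coordinates
    proteins = [
        Feature("p1", 10_000, 12_000),   # too far upstream
        Feature("p2", 31_000, 34_000),   # inside the upstream flank
        Feature("p3", 70_000, 72_000),   # overlaps the downstream boundary
        Feature("p4", 80_000, 81_000),   # too far downstream
    ]
    print([p.name for p in proteins_near_array(proteins, array)])
```

The retained candidates would then be compared against known Type II and Type V effectors, as the example describes.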
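Take BES1, obtained from the human gut bacterium Veillonella sp. AF13-2 (AF13-2 for short), as an example: its PAM is less specific, and the protein is smaller, than commercial SpCas9 and LbCpf1. The PAM preference of BES1 is shown in Figure 1; the probability of cleavage is very low (below 0.05) only when the fourth position is T, so the PAM sequence of BES1 is NNNV (where V stands for A, G, or C).
Appendix Table I lists the novel Cas9 systems, including the strain carrying each Crispr-Cas system or Cas protein, Genome ID (NCBI database), TaxID (NCBI database), species, genus, and phylum, as well as the crRNA, tracrRNA, crispr repeat sequence, effector protein length, and effector amino acid sequence. For the Cas proteins in Appendix Table I, the corresponding crRNA, tracrRNA, and/or crispr repeat sequence are given, and those skilled in the art can use the listed sequences directly.
Appendix Table II lists the novel Cpf1 systems, including the strain carrying each Crispr-Cas system or Cas protein, Genome ID (NCBI database), TaxID (NCBI database), species, genus, and phylum, as well as the crRNA, tracrRNA, crispr repeat sequence, effector protein length, and effector amino acid sequence. Where the corresponding crRNA, tracrRNA, and/or crispr repeat sequence is not given for a Cas protein in Appendix Table II, a crRNA, tracrRNA, and/or crispr repeat sequence that enables the protein to carry out editing can be found from the information on that Cas protein.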
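Example 2: Expression and purification of the BES1 protein
1. Construction of the BES1 expression vector
The expression vector was constructed by the In-Fusion method. The pET28a vector was digested at the NdeI and EcoRI sites, and the BES1 coding sequence was inserted into the cloning region of pET28a. Six His residues at the N-terminus of the recombinant BES1 protein serve as the purification tag, and kanamycin resistance is the selection marker. The resulting vector was named pET28a-BES1.
2. Culture and induction of the BES1 strain
LB liquid medium: tryptone 10 g/L, yeast extract 5 g/L, NaCl 10 g/L.
The recombinant expression vector pET28a-BES1 was transformed into the E. coli expression strain BL21(DE3). The cells were spread evenly on LB agar plates containing 50 μg/mL kanamycin and incubated overnight at 37 °C. A single colony was picked into 5 mL LB medium (50 μg/mL kanamycin) and grown overnight at 37 °C, 200 rpm. This culture was inoculated 1:100 into 50 mL LB medium (50 μg/mL kanamycin) and grown at 37 °C, 200 rpm for 4 h. The expanded culture was inoculated 1:100 into 2 L LB liquid medium (50 μg/mL kanamycin) and grown at 37 °C, 200 rpm until the OD600 reached about 0.6-0.8; IPTG was then added to a final concentration of 0.4 mM and the culture was incubated overnight at 16 °C, 200 rpm, for about 16-18 h. After induction, the cells were collected by centrifugation at 10000 g and stored frozen at -20 °C until use.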
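The dilutions in this protocol follow the usual C1·V1 = C2·V2 relation. The short Python sketch below works through them; the 1 M IPTG stock and 50 mg/mL kanamycin stock are assumptions made for illustration only (the source gives final concentrations but not stock concentrations), and the helper name is hypothetical.

```python
def stock_volume_ml(final_conc: float, culture_ml: float, stock_conc: float) -> float:
    """Volume of stock (mL) to add so the culture reaches final_conc.

    Uses C1*V1 = C2*V2 and ignores the small volume added by the stock.
    final_conc and stock_conc must be in the same unit.
    """
    return final_conc * culture_ml / stock_conc

if __name__ == "__main__":
    # 0.4 mM IPTG in 2 L, from an ASSUMED 1 M (1000 mM) stock.
    print("IPTG stock (mL):", stock_volume_ml(0.4, 2000, 1000))
    # 50 ug/mL kanamycin in 2 L, from an ASSUMED 50 mg/mL (50000 ug/mL) stock.
    print("Kanamycin stock (mL):", stock_volume_ml(50, 2000, 50_000))
    # 1:100 inoculation of the 2 L culture.
    print("Inoculum (mL):", 2000 / 100)
```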
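3. Extraction and purification of the BES1 protein
Purification buffers:
(1) Ni-column affinity chromatography
Buffer A (equilibration buffer): 50 mM Tris-HCl + 500 mM NaCl + 20 mM imidazole, pH 7.5.
Buffer B (elution buffer): 50 mM Tris-HCl + 500 mM NaCl + 500 mM imidazole, pH 7.5.
(2) Ion-exchange chromatography
Buffer C (equilibration buffer): 50 mM Tris-HCl + 100 mM NaCl, pH 7.0.
Buffer D (elution buffer): 50 mM Tris-HCl + 1 M NaCl, pH 7.0.
(3) Protein sample diluent
Buffer E (diluent): 50 mM Tris-HCl, pH 7.0.
(4) 2× protein storage buffer
Buffer F (2× storage buffer): 50 mM Tris-HCl + 300 mM NaCl, pH 7.0.
The cell pellet was resuspended at 15 mL Buffer A per 1 g of cells, PMSF was added to a final concentration of 1 mM, and the cells were disrupted by sonication until the suspension became clear. The lysate was centrifuged at 4 °C, 12000 rpm for 30 min; the supernatant was passed through a 0.22 μm filter and stored at 4 °C.
The Ni affinity column was washed with 5 CV of water and 5 CV of Buffer B, then equilibrated with 10 CV of Buffer A before the sample was loaded. After loading, the column was equilibrated with 15 CV, washed with 15% Buffer B to remove contaminating proteins, and eluted with a linear gradient (15-100% Buffer B over 10 CV); protein was collected when the UV signal exceeded 100 mAU.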
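For readers who want the actual imidazole or salt concentration at a given point of a gradient, the mixing is linear in the percentage of the second buffer. The following Python sketch shows the arithmetic for the buffers defined above; the function name `gradient_conc` is hypothetical.

```python
def gradient_conc(percent_b: float, conc_a: float, conc_b: float) -> float:
    """Concentration of a component when two buffers are mixed at
    percent_b % of the second buffer (linear interpolation)."""
    return conc_a + (conc_b - conc_a) * percent_b / 100

if __name__ == "__main__":
    # Ni column: Buffer A has 20 mM imidazole, Buffer B has 500 mM.
    for pct in (15, 50, 100):
        print(f"{pct:3d}% B -> {gradient_conc(pct, 20, 500):.0f} mM imidazole")
    # Ion exchange: Buffer C has 100 mM NaCl, Buffer D has 1000 mM.
    for pct in (0, 50, 100):
        print(f"{pct:3d}% D -> {gradient_conc(pct, 100, 1000):.0f} mM NaCl")
```

With these buffers, the 15% Buffer B wash corresponds to 92 mM imidazole.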
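The protein collected from the Ni column was diluted 5-fold with Buffer E. A Q anion-exchange column was washed with 5 CV of water and equilibrated with 5 CV of Buffer C; the protein sample was loaded and the flow-through was collected once the UV signal began to rise. An SP cation-exchange column was equilibrated with 5 CV of Buffer C and loaded with the protein sample from the previous step; after loading, the column was equilibrated with 15 CV of Buffer C and eluted with a linear gradient of elution buffer Buffer D (0-100% Buffer D over 10 CV), and the protein was collected. The collected protein was dialyzed overnight against 2× storage buffer. The final protein concentration was 1 mg/mL and the glycerol concentration was 50%. As shown in Figure 2, SDS-PAGE indicates that the fusion protein was purified well and its purity was acceptable.
Examples 3 and 4 below use the Cas9 protein BES1 (SEQ ID NO:1), found in the human gut bacterium Veillonella sp. AF13-2, to investigate the PAM sequence the protein recognizes and its ability to cleave a target substrate in vitro.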
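Example 3: Determining the BES1 PAM sequence
1. Guide RNA preparation
First, based on the predicted crRNA and tracrRNA sequences of BES1 in strain AF13-2 (see Appendix Table I), we designed double-stranded DNA transcription templates for the crRNA and tracrRNA-L (see Table 1 below). On this basis we also shortened the paired region of the crRNA and tracrRNA-L and joined the two with a GAAA linker to form a single strand, sgRNA-1; the transcription template sequence of sgRNA-1 is given in Table 1. In addition, to preserve the activity of the original RNA as far as possible, sgRNA-3 was designed; its transcription template sequence is also given in Table 1. All deoxynucleotide sequences in Table 1 were synthesized at the Synthesis and Editing Platform of the China National GeneBank (Shenzhen). The sequences shown in Table 1 are the DNA templates used to transcribe each RNA. The sequences and secondary structures of crRNA+tracrRNA-L, sgRNA-1, and sgRNA-3 are shown in Figure 3.
Table 1. Template sequences for RNA transcription used in the BES1 chip cleavage experiment
The double-stranded DNA templates were prepared by polymerase chain reaction using KAPA HiFi HotStart ReadyMix (Roche). After the reaction, the templates were purified with a phenol:chloroform:isoamyl alcohol mixture (Aladdin); purity was measured on a NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific), and concentration was measured with the Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific) on a Qubit 3.0 fluorometer.
The double-stranded DNA templates were then transcribed. Following the instructions of the MEGAscript T7 Transcription Kit, 2 pmol of double-stranded DNA template was used and the reaction was incubated at 37 °C for 12 hours in a Bio-Rad S1000 thermal cycler. The RNA was purified with a phenol:chloroform:isoamyl alcohol mixture (Aladdin), and its purity and concentration were measured on a NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific).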
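2. Preparation of single-stranded circular cleavage substrate
A cleavage substrate suitable for the BES1 protein was prepared; the deoxynucleotide sequences used are listed in Table 2 below. All deoxynucleotide sequences in Table 2 were synthesized at the Synthesis and Editing Platform of the China National GeneBank (Shenzhen).
The double-stranded substrate to be cleaved was prepared by polymerase chain reaction using HotStart ReadyMix (Roche). The two oligonucleotides PAM_AF13-2_2/1 and PAM_AF13-2_2/2 from Table 2 were denatured at 95 °C and reannealed to serve as the template, and PAM_AF13-2_1 and PAM_AF13-2_3 were used as primers for PCR amplification of the double-stranded substrate.
The PCR product was recovered with the E.Z.N.A. Gel Extraction Kit; purity of the recovered product was measured on a NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific), and concentration was measured with the Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific) on a Qubit 3.0 fluorometer.
Table 2. Deoxynucleotide sequences used to prepare the cleavage substrate
The double-stranded substrate obtained above was then circularized to give a single-stranded circular product, as follows:
The reaction contained 1 pmol of the double-stranded DNA substrate prepared above, 1× TA buffer (Epicentre), 120 U T4 DNA ligase (Epicentre), and ATP (NEB) at a final concentration of 10 mM, in a reaction volume of 60 μL, and was incubated at 37 °C for 1 hour in a Bio-Rad S1000 thermal cycler.
Uncircularized PCR product was then digested with EXO III (10 U/μL, from BGI) and EXO I (3 U/μL, from BGI) by incubation at 37 °C for 30 minutes in a Bio-Rad S1000 thermal cycler. The product was purified with 2.5 volumes of AMPure XP (Beckman), and its concentration was measured with the Qubit ssDNA Assay Kit (Thermo Fisher Scientific) on a Qubit 3.0 fluorometer (Thermo Fisher Scientific).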
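3. SE51 sequencing
(1) DNA nanoballs (DNBs) for loading were prepared from the single-stranded circles. Six nanograms of the single-stranded circular product was brought to 20 μL with nuclease-free water (Ambion), 20 μL of Make DnB buffer (BGI) was added, and the mixture was mixed and centrifuged. It was then incubated in a Bio-Rad S1000 thermal cycler at 95 °C for 1 minute, 65 °C for 1 minute, and 40 °C for 1 minute, followed by incubation at 4 °C.
To the reaction product were added 40 μL of make DnB enzyme mix V2.0 (BGI) and 2 μL of make DnB enzyme mix II V2.0 (BGI). After mixing, the reaction was incubated at 30 °C for 20 minutes in a Bio-Rad S1000 thermal cycler. DnB stop buffer (BGI) was then mixed in by pipetting with wide-bore tips (Axygen), 30 μL of load DnB buffer (BGI) was added and mixed with wide-bore tips (Axygen), and the library was immobilized on a BGISEQ-500 V3.1 chip (BGI) using the BGISEQ-500 DnB loader (BGI) to give the chip for sequencing.
(2) The chip was sequenced (SE51) on a BGISEQ-500 sequencer (BGI) using the BGISEQ-500 SE100 sequencing cartridge kit (BGI), giving the sequence and ID of each nucleic acid.
4. BES1-PAM native-strand sequencing
Because the sequencing above yields single-stranded DNA, a complementary strand (the native strand) is synthesized from it, and the resulting double-stranded DNA is used for the protein cleavage experiment. The steps are:
(1) After chip sequencing, the new strand generated in the first sequencing run was eluted with 100% formamide (Sigma) on the BGISEQ-500 DnB loader (BGI).
(2) After elution, the native strand was synthesized on the BGISEQ-500 sequencer (BGI) using dNTP mix 2 (BGI) to obtain double-stranded DNA. The synthesized length was 50 nucleotides; the 51st base was synthesized with dNTP mix 1 (BGI), which adds a fluorescently labeled dNTP to the end of the synthesized strand.
(3) The chip was then imaged on the BGISEQ-500 sequencer (BGI), and the image was saved on the sequencer as original image 1.
(4) BES1 on-chip cleavage reaction. The double-stranded DNA from step (2) was cleaved using different RNAs. The reaction buffer was 1× SpCas9 reaction buffer (NEB); 30 μg of RNA prepared in step 1 (crRNA+tracrRNA-L, sgRNA-1, or sgRNA-3) was used; the final BES1 protein concentration was 0.1 μM; RNase inhibitor (Epicentre) was included; and the final reaction volume was 300 μL. The mixture was pumped onto the chip with the BGISEQ-500 DnB loader (BGI) and incubated at 37 °C for 5 hours.
(5) The chip was washed 3 times with 300 μL of wash buffer 2 (BGI).
(6) The chip was then imaged on the BGISEQ-500 sequencer (BGI), and the image was saved on the sequencer as original image 2.
(7) The saved original images 1 and 2 were processed with the manual basecall software (BGI) on the BGISEQ-500 sequencer (BGI) to compare fluorescence signals before and after cleavage. The PAM sequence of BES1 was analyzed with SpCas9 as a control; the results are shown in Figure 4.
In Figure 4, the horizontal axis shows the 7 positions immediately adjacent to the 3' end of the target sequence, and the vertical axis shows the proportion of each base among all cleaved (positive) sequences. That is, with the number of cleaved sequences as the denominator, the base present at each position of each cleaved sequence is determined, and the proportions of the four bases at each position are calculated. Figure 4 shows that, relative to the SpCas9 control, the PAM preference of BES1 does not differ much between guide RNAs of slightly different structure.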
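The per-position proportions described above are straightforward to compute once the PAM regions of the cleaved sequences are available. The following Python sketch illustrates the calculation and how a degenerate consensus such as NNNV follows from the 0.05 threshold mentioned in the text. It is an illustration only: the function names are hypothetical, the input list is toy data, and the image-based detection of cleaved sequences is assumed to have been done upstream.

```python
from collections import Counter

# IUPAC code for each set of allowed bases (keys are sorted base strings).
SET_TO_IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
}

def pam_frequencies(cleaved_pams):
    """Per-position base proportions among the cleaved (positive) sequences."""
    table = []
    for pos in range(len(cleaved_pams[0])):
        counts = Counter(p[pos] for p in cleaved_pams)
        total = sum(counts.values())
        table.append({b: counts.get(b, 0) / total for b in "ACGT"})
    return table

def consensus(table, threshold=0.05):
    """Degenerate consensus: a base is allowed if its proportion >= threshold."""
    codes = []
    for freqs in table:
        allowed = "".join(sorted(b for b, f in freqs.items() if f >= threshold))
        codes.append(SET_TO_IUPAC.get(allowed, "?"))
    return "".join(codes)

if __name__ == "__main__":
    toy = ["AAAA", "CCCG", "GGGC", "TTTA"]   # toy data, not from the patent
    print(consensus(pam_frequencies(toy)))
```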
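Example 4: In vitro cleavage by BES1
1. Guide RNA preparation
Following the method of Example 3, the crRNA transcription template, the double-stranded DNA transcription template for tracrRNA-L, and the double-stranded DNA transcription templates for sgRNA-1 and sgRNA-3 were obtained. A shorter tracrRNA-S was also designed, and sgRNA-2 was designed from the complete crRNA and tracrRNA-S; its transcription template sequence is given in Table 3 below. All transcription template DNA was synthesized at the Synthesis and Editing Platform of the China National GeneBank (Shenzhen).
Table 3. Double-stranded DNA transcription template for sgRNA-2
The DNA templates above can be transcribed into the functional RNAs shown in Figure 3, namely crRNA+tracrRNA-L, sgRNA-1, sgRNA-2, and sgRNA-3 (the target sequence is represented by N in Figure 3).
Specifically, following the method of Example 3:
The double-stranded DNA templates were prepared by polymerase chain reaction using KAPA HiFi HotStart ReadyMix (Roche). After the reaction, the templates were purified with a phenol:chloroform:isoamyl alcohol mixture (Aladdin); purity was measured on a NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific), and concentration was measured with the Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific) on a Qubit 3.0 fluorometer.
The double-stranded DNA templates were then transcribed. Following the instructions of the MEGAscript T7 Transcription Kit, 2 pmol of double-stranded DNA template was used and the reaction was incubated at 37 °C for 12 hours in a Bio-Rad S1000 thermal cycler. The RNA was purified with a phenol:chloroform:isoamyl alcohol mixture (Aladdin), and its purity and concentration were measured on a NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific).
2. Cleavage substrate preparation
Target site design: a Crispr locus usually consists of a leader, multiple repeats, and multiple spacers. The leader usually acts as the promoter of the Crispr locus, the repeats can form hairpin structures, and the spacers usually consist of captured foreign DNA. The original protospacer sequence in the genome of Veillonella sp. AF13-2 (NCBI genome ID: QTMT00000000), shown as selected-spacer in Figure 5, was therefore used as the target site sequence.
PAM sequence design: a PAM library of 7 N positions was built (the spacer and PAM sequence in Figure 6) for cleavage by the BES1 protein.
Cleavage substrate design: the synthesized PAM library sequence was cloned into the pMD19 vector to give the pMD19-AF13-2-3'PAM library. From this library we amplified an 842 bp cleavage substrate (see Figure 7; the substrate sequence is shown in SEQ ID NO:23). The target site is at positions 402-431 bp (see Figure 7) and the PAM at positions 432-438 bp (see Figure 7; the 7 random bases at positions 432 to 438 of SEQ ID NO:24, underlined), so both cleavage products are about 400 bp. The substrate was designed this way so that, when gel resolution is limited, the cleavage products form a single broad band, which makes it easy to tell whether cleavage has occurred.
The 842 bp cleavage substrate sequence is as follows (N represents any base) (SEQ ID NO:23):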
CTGGCCTTTTGCTCACATGTTCTTTCCTGCGTTATCCCCTGATTCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTGATACCGCTCGCCGCAGCCGAACGACCGAGCGCAGCGAGTCAGTGAGCGAGGAAGCGGAAGAGCGCCCAATACGCAAACCGCCTCTCCCCGCGCGTTGGCCGATTCATTAATGCAGCTGGCACGACAGGTTTCCCGACTGGAAAGCGGGCAGTGAGCGCAACGCAATTAATGTGAGTTAGCTCACTCATTAGGCACCCCAGGCTTTACACTTTATGCTTCCGGCTCGTATGTTGTGTGGAATTGTGAGCGGATAACAATTTCACACAGGAAACAGCTATGACCATGATTACGCCAAGTTTGCACGCCTGCCGTTCGACGATTGTAGTAGCTCAAAAGGGAACTGCTACCGAANNNNNNNAATCTCTGGAAGATCCGCGCGTACCGAGTTCTAATTCACTGGCCGTCGTTTTACAACGTCGTGACTGGGAAAACCCTGGCGTTACCCAACTTAATCGCCTTGCAGCACATCCCCCTTTCGCCAGCTGGCGTAATAGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAATGGCGCCTGATGCGGTATTTTCTCCTTACGCATCTGTGCGGTATTTCACACCGCATATGGTGCACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGCCAGCCCCGACACCCGCCAACACCCGCTGACGCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAGACAAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGG(SEQ ID NO:23)。
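The expected product sizes for this substrate can be checked with a short calculation. The Python sketch below assumes, for illustration only, a blunt cut 3 bp upstream of the PAM, as the text describes for SpCas9; the exact cut position of BES1 is not stated in the source, and the function names are hypothetical.

```python
def pam_start_1based(seq: str, n_run: int = 7) -> int:
    """1-based position of the first base of the randomized PAM (run of N)."""
    return seq.index("N" * n_run) + 1

def fragment_sizes(substrate_len: int, pam_start: int, cut_offset: int = 3):
    """Sizes of the two products for a blunt cut `cut_offset` bp upstream
    of the PAM (ASSUMED Cas9-like cut position)."""
    left = pam_start - 1 - cut_offset
    return left, substrate_len - left

if __name__ == "__main__":
    # Values taken from the text: 842 bp substrate, PAM at positions 432-438.
    print(fragment_sizes(842, 432))
```

With the positions given in the text, the two products differ by only 14 bp, which is consistent with the statement that both are about 400 bp and run as one broad band on a low-resolution gel.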
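3. Cleavage experiment and results
The cleavage reaction contained functional RNA (the four RNAs shown in Figure 3), cleavage substrate, and BES1, each at a final concentration of 100 nM, and was incubated for 1 hour at 20 °C, 25 °C, or 37 °C. The cleavage products were analyzed on a 2% agarose gel; the results are shown in Figure 8.
Figure 8 shows that, at 20 °C, 25 °C, and 37 °C, BES1 combined with each of the four functional RNAs in Figure 3 cleaves the target substrate.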
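Example 5: PAM preference of the BES2, BES4, and BES6 systems
The PAM identification methods and steps for the BES2, BES4, and BES6 systems are the same as in the examples above. The main steps are:
(1) Guide RNA preparation
The tracrRNA and crRNA sequences of the BES2 system in strain Collinsella sp. Marseille-P2666 were predicted bioinformatically (see Appendix Table I), and a double-stranded DNA transcription template was designed for an sgRNA joining the crRNA and tracrRNA; the deoxynucleotide sequences are given in Table 4 below. BES4 and BES6 are Cpf1 homologs; in these systems a crRNA alone guides the effector protein to cleave the genomic target, with no tracrRNA required. The crRNA sequences of the two proteins were predicted bioinformatically, and their double-stranded DNA transcription templates were designed and synthesized; the deoxynucleotide sequences are given in Table 4 below. All deoxynucleotide sequences in Table 4 were synthesized at the Synthesis and Editing Platform of the China National GeneBank (Shenzhen).
Table 4:
Table 4 lists the double-stranded DNA transcription templates for the BES2 system gRNA and the BES4 and BES6 system crRNAs. The guide RNAs were prepared as in Example 3.
(2) PAM identification
The PAM sequences of the BES2, BES4, and BES6 systems were detected rapidly on DNB chips as in Example 3. The PAM preferences of the three systems are shown in Figure 10.
Example 6: In vitro cleavage activity of the BES2, BES4, and BES6 systems
First, the guide RNAs of the BES2, BES4, and BES6 systems were transcribed in vitro as described in Example 3; next, the effector proteins of the three systems were expressed and purified as in Example 2; finally, substrates were prepared and in vitro cleavage was carried out as in Example 4. As shown in Figure 11, all three systems cleave double-stranded DNA in vitro.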
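Example 7: Editing activity of the BES6 system in human cells
(1) Human cell culture
The inventors chose human HEK293T cells for testing editing activity in cells. HEK293T cells were cultured in DMEM supplemented with fetal bovine serum (FBS).
(2) RNP preparation
For editing in HEK293T cells, the endogenous gene AAVS1 was chosen to verify targeted cleavage.
The nucleotide sequence of the targeted AAVS1 region is as follows: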
CCCTTGCTCTCTGCTGTGTTGCTGCCCAAGGATGCTCTTTCCGGAGCACTTCCTTCTCGGCGCTGCACCACGTGATGTCCTCTGAGCGGATCCTCCCCGTGTCTGGGTCCTCTCCGGGCATCTCTCCTCCCTCACCCAACCCCATGCCGTcTTCACTCGCTGGGTTCCCTTTTCCTTCTCCTTCTGGGGCCTGTGCCATCTCTCGTTTCTTAGGATGGCCTTCTCCGACGGATGTCTCCCTTGCGTCCCGCCTCCCCTTCTTGTAGGCCTGCATCATCACCGTTTTTCTGGACAACCCCAAAGTACCCCGTCTCCCTGGCTTtAGcCACCTCTCCATCCTCTTGCTTTCTTTGCCTGGACACCCCGTTCTCCTGTGGATTCGGGTCACCTCTCACTCCTTTCATTTGGGCAGCTCCCCTACCCCCCTTACCTCTCTAGTCTGTGCTAGCTCTTCCAGCCCCCTGTCATGGCATCTTCCAGGGGTCCGAGAGCTCAGCTAGTCTTCTTCCTCCAACCCGGGCCCcTATGTCCACTTCAGGACAGCATGTTTGCTGCCTCCAGGGATCCTGTGTCCCCGAGCTGGGACCACCTTATATTCCCAGGGCCGGTTAATGTGGCTCTGGTTCTGGGTACTTTTATCTGTCCCCTCCACCCCACAGTGGGGCCACTAGGGACAGGATTGGTGACAGAAAAGCCCCATCCTTAGGCCTCCTCCTTCCTAGTCTCCTGATATTGGGTCTAACCCCCACCTCCTGTTAGGCAGATTCCTTATCTGGTGACACACCCCCATTTCCTGGAGCCATCTCTCTCCTTGCCAGAACCTCTAAGGTTTGCTTACGATGGAGCCAGAGAGGATCCTGGGAGGGAGAGCTTGGCAGGGGGTGGGAGGGAAGGGGGGGATGCGTGACCTGCCCGGTTCTCAGTGGCCACCCTGCGCTACCCTCTCCCAGAACCTGAGCTGCTCTGACGCGGCTGTCTGGTGCGTTTCACTGATCCTGGTGCTGCAGCTTCCTTACACTTCCCAAGAGGAGAAGCAGTTTGGAAAAACAAAATCAGAATAAGTTGGTCCTGAGTTCTAACTTTGGCTCTTCACCTTTCTAGTCCCCAATTTATATTGTTCCTCCGTGCGTCAGTTTTACCTGTGAGATAAGGCCAGTAGCCACCCCCGTCCTGGCAGGGCTGTGGTGAGGAGGGGGGTGTCCGTGTGGAAAACTCCCTTTGTGAGAATGGTGCGTCCTAGGTGTTCACCAGGTCGTGGCCGCCTCTACTCCCTTTCTCTTTCTCCATCCTTCTTTCCTTAAAGAGCCCCCAGTGCTATCTGGACATATTCCTCCGCCCAGAGCAGGGTCCGCTTCCCTAAGGCCCTGCTCTGGGCTTCTGGGTTTGAGTCCTTGCAAGCCCAGGAGAGCGCTAGCTTCCCTGTCCCCCTTCCTCGTCCACCATCTCATGCCCTGGCTCTCCTGCCCCTTCCTACA(SEQ ID NO:27).
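One target site was designed for this gene, and its double-stranded DNA transcription template was designed and synthesized; the deoxynucleotide sequences are given in Table 5 below. All deoxynucleotide sequences in Table 5 were synthesized at the Synthesis and Editing Platform of the China National GeneBank (Shenzhen).
Table 5:
The guide RNAs for the BES4 and BES6 AAVS1 target site sequences shown in Table 5 were generated by in vitro transcription from ordered oligonucleotides with the MEGAshortscript T7 Transcription Kit (Invitrogen), following the manufacturer's recommended protocol. The BES4 and BES6 effector proteins were expressed as in Example 2.
(3) Delivery of RNP into human cells
In a twelve-well plate, 10 pmol of purified effector protein and 0.5 μL of gRNA were added per well. The RNP was assembled and transfected into HEK293T cells using the Neon Transfection System kit and electroporation device (Invitrogen) according to the manufacturer's protocol.
(4) Editing activity assay
Cells were harvested 2-3 days after RNP transfection, and activity was assessed with a T7E1 cleavage assay as follows:
(a) Cell collection: 200 μL of 0.5 M EDTA (pH 8.0) was added to each well of the 12-well plate to resuspend the cells;
(b) Genomic DNA extraction: genomic DNA was extracted with a genomic DNA extraction kit (Tiangen), and the gDNA concentration was measured on a NanoDrop;
(c) Target-region PCR: the target-site region was amplified from gDNA using GXL Prime with the primers listed in Table 6 below; all deoxynucleotide sequences were synthesized at the Synthesis and Editing Platform of the China National GeneBank (Shenzhen). The product was purified with a PCR clean-up and gel extraction kit (MN). The cleanliness of the PCR product was checked by agarose gel electrophoresis, and its concentration was measured on a NanoDrop.
(d) Denaturation and annealing: the purified product from step (c) was denatured and reannealed in a Bio-Rad thermal cycler. Equal amounts of substrate DNA were used for the T7E1 reactions (about 200-300 ng per reaction, in a 10 μL reaction volume).
(e) T7E1 digestion: 0.2 μL of T7E1 nuclease was added to the 10 μL sample from step (d), and the reaction was incubated at 37 °C for 20 minutes.
(f) Activity detection: after the T7E1 digestion, loading buffer was added and the bands were examined on an agarose gel.
Table 6. PCR amplification primers
As shown in Figure 12, BES6 has editing activity in human cells.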
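The patent reports the T7E1 result qualitatively, from the gel image. Where a numerical estimate is wanted, a commonly used approximation converts band intensities into an indel percentage; the Python sketch below shows that calculation. It is not part of the patent's method: the formula is the standard mismatch-cleavage estimate, and the function name and band intensities are hypothetical.

```python
from math import sqrt

def indel_percent(parent: float, cut1: float, cut2: float) -> float:
    """Estimate % indels from T7E1 gel band intensities.

    parent: intensity of the uncut PCR band
    cut1, cut2: intensities of the two cleavage bands
    Uses the common approximation 100 * (1 - sqrt(1 - fraction_cleaved)),
    which accounts for random reannealing of edited and unedited strands.
    """
    total = parent + cut1 + cut2
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    fraction_cleaved = (cut1 + cut2) / total
    return 100 * (1 - sqrt(1 - fraction_cleaved))

if __name__ == "__main__":
    # Hypothetical densitometry values, for illustration only.
    print(f"{indel_percent(50, 25, 25):.1f}% indels")
```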
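Example 8: Editing activity of the BES4 system in human cells
(1) Human cell culture
The inventors chose human HEK293T cells for testing editing activity in cells. HEK293T cells were cultured in DMEM supplemented with fetal bovine serum (FBS).
(2) Plasmid preparation
For editing in HEK293T cells, the endogenous gene HBG was chosen to verify targeted cleavage.
The nucleotide sequence of the targeted HBG region is as follows: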
CCCTGCTGTGCTCAGATCAATACTCCGTTGTCTAAGTTGCCTCGAGACTAAAGGCAACAGGGCTGAAACATCTCCTGGACTCACCTTGAAGTTCTCAGGATCCACATGCAGCTTGTCACAGTGCAGTTCACTCAGCTGGGCAAAGGTGCCCTTGAGATCATCCAGGTGCTTTGTGGCATCTCCCAAGGAAGTCAGCACCTTCTTGCCATGTGCCTTGACTTTGGGGTTGCCCATGATGGCAGAGGCAGAGGACAGGTTGCCAAAGCTGTCAAAGAACCTCTGGGTCCATGGGTAGACAACCAGGAGCCTGTGAGATTGACAAGAACAGTTTGACAGTCAGAAGGTGCCACAAATCCTGAGAAGCGACCTGGACTTTTGCCAGGCACAGGGTCCTTCCTTCCCTCCCTTGTCCTGGTCACCAGAGCCTACCTTCCCAGGGTTTCTCCTCCAGCATCTTCCACATTCACCTTGCCCCACAGGCTTGTGATAGTAGCCTTGTCCTCCTCTGTGAAATGACCCATGGCGTCTGGACTAGGAGCTTATTGATAACCTCAGACGTTCCAGAAGCGAGTGTGTGGAACTGCTGAAGGGTGCTTCCTTTTATTCTTCATCCCTAGCCAGCCGCCGGCCCCTGGCCTCACTGGATACTCTAAGACTATTGGTCAAGTTTGCCTTGTCAAGGCTATTGGTCAAGGCAAGGCTGGCCAACCCATGGGTGGAGTTTAGCCAGGGACCGTTTCAGACAGATATTTGCATTGAGATAGTGTGGGGAAGGGGCCCCCAAGAGGATACTGCTAATTTTTTTTATAGCCTTTGCCTTGTTCCGATTCAGTCATTCCAGTTTTTCTCTAATTTATTCTTCCCTTTAGCTAGTTTCCTTCTCCCATCATAGAGGATACCAGGACTTCTTTTGTCAGCCGTTTTTTACCTTCTTGTCTCTAGCTCCAGTGAGGCCTGTAGTTTAAAGCTAAAGCATGTACCAATTTTTGAAAAGTTCAGGGATTGTGAAATGTGTTTTAGGCATAGGTCCAGGATTTTTGACGGGACAAATCTTAGTCTCTTTCAGTTAGCAGTGGTTTCTAAGGA(SEQ ID NO:32).
For this region, the inventors designed three target sites and synthesized the corresponding plasmid sequences:
BES4-HBG-sg01:
GAGGGCCTATTTCCCATGATTCCTTCATATTTGCATATACGATACAAGGCTGTTAGAGAGATAATTGGAATTAATTTGACTGTAAACACAAAGATATTAGTACAAAATACGTGACGTAGAAAGTAATAATTTCTTGGGTAGTTTGCAGTTTTAAAATTATGTTTTAAAATGGACTATCATATGCTTACCGTAACTTGAAAGTATTTCGATTTCTTGGCTTTATATATCTTGTGGAAAGGACGAAACACCGAATTTCTACTATTGTAGATGCCAGCCTTGCCTTGACCAATAGTTTTTTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTTTTAGCGCGTGCGCCAATTCTGCAGACAAATGGCTCTAGAGGTACCCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTGTGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTCGAGGTGAGCCCCACGTTCTGCTTCACTCTCCCCATCTCCCCCCCCTCCCCACCCCCAATTTTGTATTTATTTATTTTTTAATTATTTTGTGCAGCGATGGGGGCGGGGGGGGGGGGGGGGCGCGCGCCAGGCGGGGCGGGGCGGGGCGAGGGGCGGGGCGGGGCGAGGCGGAGAGGTGCGGCGGCAGCCAATCAGAGCGGCGCGCTCCGAAAGTTTCCTTTTATGGCGAGGCGGCGGCGGCGGCGGCCCTATAAAAAGCGAAGCGCGCGGCGGGCGGGAGTCGCTGCGCGCTGCCTTCGCCCCGTGCCCCGCTCCGCCGCCGCCTCGCGCCGCCCGCCCCGGCTCTGACTGACCGCGTTACTCCCACAGGTGAGCGGGCGGGACGGCCCTTCTCCTCCGGGCTGTAATTAGCTGAGCAAGAGGTAAGGGTTTAAGGGATGGTTGGTTGGTGGGGTATTAATGTTTAATTACCTGGAGCACCTGCCTGAAATCACTTTTTTTCAGGTTGGACCGGTGCCACCATGGACTATAAGGACCACGACGGAGACTACAAGGATCATGATATTGATTACAAAGACGATGACGATAAGATGGCCCCAAAGAAGAAGCGGAAGGTCGGTATCCACGGAGTCCCAGCAGCCATGCAGGAGAGAAAGAAGATCAGCCACCTGACCCACAGAAACAGCGTGAAGAAAACCATCAGAATGCAGCTGAACCCCGTGGGAAAGACCATGGACTACTTCCAGGCCAAGCAGATCCTGGAGAACGACGAGAAGCTGAAGGAGGACTACCAGAAGATCAAGGAGATCGCCGACAGATTCTACAGAAACCTGAACGAGGACGTGCTGAGCAAAACCGGACTGGACAAGCTGAAGGACTACGCCGAGATCTACTACCATTGCAACACCGACGCCGACAGAAAGAGACTGAACGAGTGCGCCAGCGAGCTGAGAAAGGAGATCGTGAAGAACTTCAAGAACAGAGATGAGTACAACAAGCTGTTCAACAAGAAGATGATCGAGATCGTGCTGCCCAAGCACCTGAAGAACGAGGACGAGAAGGAAGTGGTGGCCAGCTTCAAGAACTTCACCACCTACTTCACCGGCTTCTTCACCAACAGAAAGAACATGTACAGCGACGGCGAAGAGTCTACCGCTATTGCCTACAGATGCATCAACGAGAACCTGCCCAAGCACCTGGACAACGTGAAGGTGTTCGAGAAGGCCATCAGCAAGCTGAGCAAGAACGCCATCGACGACCTGGATGCCACATATTCTGGCCTGTGCGGCACAAATCTGTACGACGTGTTCACCGTGGACTACT
TCAACTTCCTGCTGCCCCAAAGCGGAATCACCGAGTACAACAAGATCATCGGCGGCTACACAACAAGCGACGGCACCAAAGTGAAGGGCATCAACGAGTACATCAACCTGTACAACCAGCAGGTGAGCAAGAGAGACAAGATCCCCAACCTGAAGATCCTGTACAAGCAGATCCTGAGCGAGAGCGAGAAGGTGTCTTTCATCCCCCCCAAGTTCGAGGACGACAACGAACTGCTGTCTGCCGTGAGCGAGTTCTATGCCAACGACGAGACATTTGATGGCATGCCCCTGAAGAAAGCCATCGACGAAACCAAACTGCTGTTCGGCAACCTGGACAACAGCAGCCTGAACGGCATCTACATCCAGAACGACAGAAGCGTGACCAACCTGAGCAACAGCATGTTCGGCAGCTGGAGCGTGATTGAGGACCTGTGGAACAAGAACTACGACAGCGTGAACAGCAACAGCAGAATCAAGGACATCCAGAAGAGAGAGGACAAGAGAAAGAAGGCCTACAAGGCCGAGAAGAAGCTGAGCCTGAGCTTCCTGCAGGTGCTGATCAGCAACAGCGAGAACGACGAGATCAGAAAGAAGAGCATCGTGGACTACTACAAGACCAGCCTGATGCAGCTGACCGACAACCTGAGCGACAAGTACAAAGAAGCCGCCCCCCTGTTTTCTGAGAACTACGACAACGAGAAGGGCCTGAAGAACGACGACAAGAGCATCAGCCTGATCAAGAACTTCCTGGACGCCATCAAGGAGATCGAGAAGTTCATCAAGCCCCTGAGCGAGACAAATATCACCGGCGAGAAGAACGACCTGTTCTACAGCCAGTTCACCCCCCTGCTGGACAACATCAGCAGAATCGACAGACTGTACGACAAGGTGAGAAACTACGTGACCCAGAAGCCCTTCAGCACCGACAAGATCAAGCTGAACTTCGGCAACAGCCAGCTTCTGAACGGCTGGGACAGAAACAAGGAGAAGGACTGTGGCGCTGTGCTGCTGTGTAAGGACGAGAAGTACTACCTGGCCATCATCGACAAGAGCAACAACAGCATCCTGGAGAACATCGACTTCCAGGACTGCAACGAGAGCGACTACTACGAGAAGATCGTGTACAAGCTGCTGACCAAGATCTCTGGCAACCTGCCCAGAGTGTTCTTCAGCGAGAAGCACAAGAAGCTGCTGAGCCCCAGCGATGAGATCCTGAAGATCTACAAGAGCGGCACCTTCAAGAAGGGCGACAAGTTCAGCCTTGACGACTGCCACAAGCTGATCGACTTCTACAAGGAGAGCTTCAAGAAGTACCCCAAGTGGCTGATCTACAACTTCAAGTTCAAGAACACCAACGAGTACAACGACATCAGCGAGTTCTACAACGACGTGGCCAGCCAGGGATACAACATCAGCAAGATGAAGATCCCCACCAGCTTCATCGACAAGCTGGTGGACGAGGGCAAGATCTACCTGTTCCAGCTGTACAACAAGGACTTCAGCCCCCACAGCAAGGGAACACCTAACCTGCACACCCTGTACTTCAAGATGCTGTTCGACGAGAGAAACCTGGAGGACGTGGTGTACAAGCTGAATGGCGAGGCCGAGATGTTTTACAGACCCGCCAGCATCAAGTATGACAAGCCCACCCACCCTAAGAACACCCCCATCAAGAACAAGAACACCCTGAACGACAAGAAGGCCAGCACCTTCCCCTACGACCTGATCAAGGACAAGAGATACACCAAGTGGCAGTTCAGCCTGCACTTCCCCATCACCATGAACTTCAAGGCCCCCGACAGAGCCATGATCAACGACGACGTGAGAAACCTGCTGAAGAGCTGCAACAACAACTTCATCATCGGCATCGACAGAGGCGAGAGAAACCTGCTGTACGTGAGCGTGATCGATAGCAACGGCGCCATCATCTACCAGCACAGCCTGAACATCATCGGCAACAAGTTCAAGGGCAAGACCTACGAAACCAACTACAGAGAGAAG
CTGGCCACCAGAGAGAAGGAGAGAACCGAGCAGAGAAGAAACTGGAAGGCCATCGAGAGCATCAAGGAGCTGAAGGAGGGCTACATCAGCCAAACCGTGCACGTGATTTGCCAGCTGGTGGTGAAGTACGACGCCATCATCGTGATGGAGAAGCTGACCGACGGCTTCAAGAGAGGCAGAACCAAGTTCGAGAAGCAGGTGTACCAGAAGTTCGAGAAGATGCTGATCGACAAGCTGAACTACTACGTGGACAAGAAGCTGGACCCCAATGAGGAAGGCGGACTGCTGCATGCTTATCAGCTGACCAACAAGCTGGACAGCTTCGACAAGCTGGGAATGCAGAGCGGCTTCATCTTCTACGTCAGACCCGACTTCACCAGCAAAATCGACCCCGTGACCGGATTTGTGAACCTGCTGTACCCCAGATACGAGAACATCGACAAGGCCAAGGACATGATCAGCAGATTCGACGACATCAGATACAACGCCGGCGAGGACTTCTTCGAGTTCGACATCGACTACGACAAGTTCCCCAAGACCGCCAGCGACTACAGAAAGAAGTGGACCATCTGCACCAACGGCGAGAGAATCGAGGCCTTCAGAAACCCCGCCAACAACAACGAGTGGAGCTACAGAACCATCATCCTGGCCGAGAAGTTCAAGGAGCTGTTCGACAACAACAGCATCAACTACAGAGACAGCGACGACCTGAAAGCCGAGATCCTGAGCCAAACCAAGGGCAAGTTCTTCGAGGACTTCTTCAAGCTGCTGAGACTGACCCTGCAGATGAGAAACAGCAACCCCGAAACCGGAGAGGACAGGATTCTGAGCCCCGTGAAGGACAAGAACGGCAACTTCTACGACAGCAGCAAGTACGACGAGAAGAGCAAGCTGCCCTGTGACGCTGATGCTAACGGCGCTTACAACATCGCCAGAAAGGGCCTGTGGATCGTGGAGCAGTTCAAGAAGGCCGACAACGTGTCTGCTGTGGAACCCGTGATCCACAACGACAAGTGGCTGAAGTTCGTGCAGGAGAACGACATGGCCAACAACAAAAGGCCGGCGGCCACGAAAAAGGCCGGCCAGGCAAAAAAGAAAAAGGAATTCGGCAGTGGAGAGGGCAGAGGAAGTCTGCTAACATGCGGTGACGTCGAGGAGAATCCTGGCCCAGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAGCTGTACAAGGAATTCTAACTAGAGCTCGCTGATCAGCCTCGACTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCCCACTGTC
CTTTCCTAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAAGAGAATAGCAGGCATGCTGGGGAGCGGCCGCAGGAACCCCTAGTGATGGAGTTGGCCACTCCCTCTCTGCGCGCTCGCTCGCTCACTGAGGCCGGGCGACCAAAGGTCGCCCGACGCCCGGGCTTTGCCCGGGCGGCCTCAGTGAGCGAGCGAGCGCGCAGCTGCCTGCAGGGGCGCCTGATGCGGTATTTTCTCCTTACGCATCTGTGCGGTATTTCACACCGCATACGTCAAAGCAACCATAGTACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCAGCGCCTTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGACCCCAAAAAACTTGATTTGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACACTCAACTCTATCTCGGGCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGTCTATTGGTTAAAAAATGAGCTGATTTAACAAAAATTTAACGCGAATTTTAACAAAATATTAACGTTTACAATTTTATGGTGCACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGCCAGCCCCGACACCCGCCAACACCCGCTGACGCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAGACAAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTCACCGTCATCACCGAAACGCGCGAGACGAAAGGGCCTCGTGATACGCCTATTTTTATAGGTTAATGTCATGATAATAATGGTTTCTTAGACGTCAGGTGGCACTTTTCGGGGAAATGTGCGCGGAACCCCTATTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTCATGAGACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTGAGAGTTTTCGCCCCGAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCCCGTATTGACGCCGGGCAAGAGCAACTCGGTCGCCGCATACACTATTCTCAGAATGACTTGGTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAATTATGCAGTGCTGCCATAACCATGAGTGATAACACTGCGGCCAACTTACTTCTGACAACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACATGGGGGATCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAATGAAGCCATACCAAACGACGAGCGTGACACCACGATGCCTGTAGCAATGGCAACAACGTTGCGCAAACTATTAACTGGCGAACTACTTACTCTAGCTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGGACCACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCGTGGAAGCCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCTACACGACGGGGAGTCAGGCAACTATGGATGAA
CGAAATAGACAGATCGCTGAGATAGGTGCCTCACTGATTAAGCATTGGTAACTGTCAGACCAAGTTTACTCATATATACTTTAGATTGATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGAAGATCCTTTTTGATAATCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTTCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACATGT(SEQ ID NO:33).
BES4-HBG-sg02:
GAGGGCCTATTTCCCATGATTCCTTCATATTTGCATATACGATACAAGGCTGTTAGAGAGATAATTGGAATTAATTTGACTGTAAACACAAAGATATTAGTACAAAATACGTGACGTAGAAAGTAATAATTTCTTGGGTAGTTTGCAGTTTTAAAATTATGTTTTAAAATGGACTATCATATGCTTACCGTAACTTGAAAGTATTTCGATTTCTTGGCTTTATATATCTTGTGGAAAGGACGAAACACCGAATTTCTACTATTGTAGATACCAATAGCCTTGACAAGGCAAATTTTTTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTTTTAGCGCGTGCGCCAATTCTGCAGACAAATGGCTCTAGAGGTACCCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTGTGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTCGAGGTGAGCCCCACGTTCTGCTTCACTCTCCCCATCTCCCCCCCCTCCCCACCCCCAATTTTGTATTTATTTATTTTTTAATTATTTTGTGCAGCGATGGGGGCGGGGGGGGGGGGGGGGCGCGCGCCAGGCGGGGCGGGGCGGGGCGAGGGGCGGGGCGGGGCGAGGCGGAGAGGTGCGGCGGCAGCCAATCAGAGCGGCGCGCTCCGAAAGTTTCCTTTTATGGCGAGGCGGCGGCGGCGGCGGCCCTATAAAAAGCGAAGCGCGCGGCGGGCGGGAGTCGCTGCGCGCTGCCTTCGCCCCGTGCCCCGCTCCGCCGCCGCCTCGCGCCGCCCGCCCCGGCTCTGACTGACCGCGTTACTCCCACAGGTGAGCGGGCGGGACGGCCCTTCTCCTCCGGGCTGTAATTAGCTGAGCAAGAGGTAAGGGTTTAAGGGATGGTTGGTTGGTGGGGTATTAATGTTTAATTACCTGGAGCACCTGCCTGAAATCACTTTTTTTCAGGTTGGACCGGTGCCACCATGGACTATAAGGACCACGACGGAGACTACAAGGATCATGATATTGATTACAAAGACGATGACGATAAGATGGCCCCAAAGAAGAAGCGGAAGGTCGGTATCCACGGAGTCCCAGCAGCCATGCAGGAGAGAAAGAAGATCAGCCACCTGACCCACAGAAACAGCGTGAAGAAAACCATCAGAATGCAGCTGAACCCCGTGGGAAAGACCATGGACTACTTCCAGGCCAAGCAGATCCTGGAGAACGACGAGAAGCTGAAGGAGGACTACCAGAAGATCAAGGAGATCGCCGACAGATTCTACAGAAACCTGAACGAGGACGTGCTGAGCAAAACCGGACTGGACAAGCTGAAGGACTACGCCGAGATCTACTACCATTGCAACACCGACGCCGACAGAAAGAGACTGAACGAGTGCGCCAGCGAGCTGAGAAAGGAGATCGTGAAGAACTTCAAGAACAGAGATGAGTACAACAAGCTGTTCAACAAGAAGATGATCGAGATCGTGCTGCCCAAGCACCTGAAGAACGAGGACGAGAAGGAAGTGGTGGCCAGCTTCAAGAACTTCACCACCTACTTCACCGGCTTCTTCACCAACAGAAAGAACATGTACAGCGACGGCGAAGAGTCTACCGCTATTGCCTACAGATGCATCAACGAGAACCTGCCCAAGCACCTGGACAACGTGAAGGTGTTCGAGAAGGCCATCAGCAAGCTGAGCAAGAACGCCATCGACGACCTGGATGCCACATATTCTGGCCTGTGCGGCACAAATCTGTACGACGTGTTCACCGTGGACTACT
TCAACTTCCTGCTGCCCCAAAGCGGAATCACCGAGTACAACAAGATCATCGGCGGCTACACAACAAGCGACGGCACCAAAGTGAAGGGCATCAACGAGTACATCAACCTGTACAACCAGCAGGTGAGCAAGAGAGACAAGATCCCCAACCTGAAGATCCTGTACAAGCAGATCCTGAGCGAGAGCGAGAAGGTGTCTTTCATCCCCCCCAAGTTCGAGGACGACAACGAACTGCTGTCTGCCGTGAGCGAGTTCTATGCCAACGACGAGACATTTGATGGCATGCCCCTGAAGAAAGCCATCGACGAAACCAAACTGCTGTTCGGCAACCTGGACAACAGCAGCCTGAACGGCATCTACATCCAGAACGACAGAAGCGTGACCAACCTGAGCAACAGCATGTTCGGCAGCTGGAGCGTGATTGAGGACCTGTGGAACAAGAACTACGACAGCGTGAACAGCAACAGCAGAATCAAGGACATCCAGAAGAGAGAGGACAAGAGAAAGAAGGCCTACAAGGCCGAGAAGAAGCTGAGCCTGAGCTTCCTGCAGGTGCTGATCAGCAACAGCGAGAACGACGAGATCAGAAAGAAGAGCATCGTGGACTACTACAAGACCAGCCTGATGCAGCTGACCGACAACCTGAGCGACAAGTACAAAGAAGCCGCCCCCCTGTTTTCTGAGAACTACGACAACGAGAAGGGCCTGAAGAACGACGACAAGAGCATCAGCCTGATCAAGAACTTCCTGGACGCCATCAAGGAGATCGAGAAGTTCATCAAGCCCCTGAGCGAGACAAATATCACCGGCGAGAAGAACGACCTGTTCTACAGCCAGTTCACCCCCCTGCTGGACAACATCAGCAGAATCGACAGACTGTACGACAAGGTGAGAAACTACGTGACCCAGAAGCCCTTCAGCACCGACAAGATCAAGCTGAACTTCGGCAACAGCCAGCTTCTGAACGGCTGGGACAGAAACAAGGAGAAGGACTGTGGCGCTGTGCTGCTGTGTAAGGACGAGAAGTACTACCTGGCCATCATCGACAAGAGCAACAACAGCATCCTGGAGAACATCGACTTCCAGGACTGCAACGAGAGCGACTACTACGAGAAGATCGTGTACAAGCTGCTGACCAAGATCTCTGGCAACCTGCCCAGAGTGTTCTTCAGCGAGAAGCACAAGAAGCTGCTGAGCCCCAGCGATGAGATCCTGAAGATCTACAAGAGCGGCACCTTCAAGAAGGGCGACAAGTTCAGCCTTGACGACTGCCACAAGCTGATCGACTTCTACAAGGAGAGCTTCAAGAAGTACCCCAAGTGGCTGATCTACAACTTCAAGTTCAAGAACACCAACGAGTACAACGACATCAGCGAGTTCTACAACGACGTGGCCAGCCAGGGATACAACATCAGCAAGATGAAGATCCCCACCAGCTTCATCGACAAGCTGGTGGACGAGGGCAAGATCTACCTGTTCCAGCTGTACAACAAGGACTTCAGCCCCCACAGCAAGGGAACACCTAACCTGCACACCCTGTACTTCAAGATGCTGTTCGACGAGAGAAACCTGGAGGACGTGGTGTACAAGCTGAATGGCGAGGCCGAGATGTTTTACAGACCCGCCAGCATCAAGTATGACAAGCCCACCCACCCTAAGAACACCCCCATCAAGAACAAGAACACCCTGAACGACAAGAAGGCCAGCACCTTCCCCTACGACCTGATCAAGGACAAGAGATACACCAAGTGGCAGTTCAGCCTGCACTTCCCCATCACCATGAACTTCAAGGCCCCCGACAGAGCCATGATCAACGACGACGTGAGAAACCTGCTGAAGAGCTGCAACAACAACTTCATCATCGGCATCGACAGAGGCGAGAGAAACCTGCTGTACGTGAGCGTGATCGATAGCAACGGCGCCATCATCTACCAGCACAGCCTGAACATCATCGGCAACAAGTTCAAGGGCAAGACCTACGAAACCAACTACAGAGAGAAG
CTGGCCACCAGAGAGAAGGAGAGAACCGAGCAGAGAAGAAACTGGAAGGCCATCGAGAGCATCAAGGAGCTGAAGGAGGGCTACATCAGCCAAACCGTGCACGTGATTTGCCAGCTGGTGGTGAAGTACGACGCCATCATCGTGATGGAGAAGCTGACCGACGGCTTCAAGAGAGGCAGAACCAAGTTCGAGAAGCAGGTGTACCAGAAGTTCGAGAAGATGCTGATCGACAAGCTGAACTACTACGTGGACAAGAAGCTGGACCCCAATGAGGAAGGCGGACTGCTGCATGCTTATCAGCTGACCAACAAGCTGGACAGCTTCGACAAGCTGGGAATGCAGAGCGGCTTCATCTTCTACGTCAGACCCGACTTCACCAGCAAAATCGACCCCGTGACCGGATTTGTGAACCTGCTGTACCCCAGATACGAGAACATCGACAAGGCCAAGGACATGATCAGCAGATTCGACGACATCAGATACAACGCCGGCGAGGACTTCTTCGAGTTCGACATCGACTACGACAAGTTCCCCAAGACCGCCAGCGACTACAGAAAGAAGTGGACCATCTGCACCAACGGCGAGAGAATCGAGGCCTTCAGAAACCCCGCCAACAACAACGAGTGGAGCTACAGAACCATCATCCTGGCCGAGAAGTTCAAGGAGCTGTTCGACAACAACAGCATCAACTACAGAGACAGCGACGACCTGAAAGCCGAGATCCTGAGCCAAACCAAGGGCAAGTTCTTCGAGGACTTCTTCAAGCTGCTGAGACTGACCCTGCAGATGAGAAACAGCAACCCCGAAACCGGAGAGGACAGGATTCTGAGCCCCGTGAAGGACAAGAACGGCAACTTCTACGACAGCAGCAAGTACGACGAGAAGAGCAAGCTGCCCTGTGACGCTGATGCTAACGGCGCTTACAACATCGCCAGAAAGGGCCTGTGGATCGTGGAGCAGTTCAAGAAGGCCGACAACGTGTCTGCTGTGGAACCCGTGATCCACAACGACAAGTGGCTGAAGTTCGTGCAGGAGAACGACATGGCCAACAACAAAAGGCCGGCGGCCACGAAAAAGGCCGGCCAGGCAAAAAAGAAAAAGGAATTCGGCAGTGGAGAGGGCAGAGGAAGTCTGCTAACATGCGGTGACGTCGAGGAGAATCCTGGCCCAGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAGCTGTACAAGGAATTCTAACTAGAGCTCGCTGATCAGCCTCGACTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCCCACTGTC
CTTTCCTAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAAGAGAATAGCAGGCATGCTGGGGAGCGGCCGCAGGAACCCCTAGTGATGGAGTTGGCCACTCCCTCTCTGCGCGCTCGCTCGCTCACTGAGGCCGGGCGACCAAAGGTCGCCCGACGCCCGGGCTTTGCCCGGGCGGCCTCAGTGAGCGAGCGAGCGCGCAGCTGCCTGCAGGGGCGCCTGATGCGGTATTTTCTCCTTACGCATCTGTGCGGTATTTCACACCGCATACGTCAAAGCAACCATAGTACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCAGCGCCTTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGACCCCAAAAAACTTGATTTGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACACTCAACTCTATCTCGGGCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGTCTATTGGTTAAAAAATGAGCTGATTTAACAAAAATTTAACGCGAATTTTAACAAAATATTAACGTTTACAATTTTATGGTGCACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGCCAGCCCCGACACCCGCCAACACCCGCTGACGCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAGACAAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTCACCGTCATCACCGAAACGCGCGAGACGAAAGGGCCTCGTGATACGCCTATTTTTATAGGTTAATGTCATGATAATAATGGTTTCTTAGACGTCAGGTGGCACTTTTCGGGGAAATGTGCGCGGAACCCCTATTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTCATGAGACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTGAGAGTTTTCGCCCCGAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCCCGTATTGACGCCGGGCAAGAGCAACTCGGTCGCCGCATACACTATTCTCAGAATGACTTGGTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAATTATGCAGTGCTGCCATAACCATGAGTGATAACACTGCGGCCAACTTACTTCTGACAACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACATGGGGGATCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAATGAAGCCATACCAAACGACGAGCGTGACACCACGATGCCTGTAGCAATGGCAACAACGTTGCGCAAACTATTAACTGGCGAACTACTTACTCTAGCTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGGACCACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCGTGGAAGCCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCTACACGACGGGGAGTCAGGCAACTATGGATGAA
CGAAATAGACAGATCGCTGAGATAGGTGCCTCACTGATTAAGCATTGGTAACTGTCAGACCAAGTTTACTCATATATACTTTAGATTGATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGAAGATCCTTTTTGATAATCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTTCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACATGT(SEQ ID NO:34)
BES4-HBG-SG03:
GAGGGCCTATTTCCCATGATTCCTTCATATTTGCATATACGATACAAGGCTGTTAGAGAGATAATTGGAATTAATTTGACTGTAAACACAAAGATATTAGTACAAAATACGTGACGTAGAAAGTAATAATTTCTTGGGTAGTTTGCAGTTTTAAAATTATGTTTTAAAATGGACTATCATATGCTTACCGTAACTTGAAAGTATTTCGATTTCTTGGCTTTATATATCTTGTGGAAAGGACGAAACACCGAATTTCTACTATTGTAGATCCTTGTCAAGGCTATTGGTCAAGTTTTTTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTTTTAGCGCGTGCGCCAATTCTGCAGACAAATGGCTCTAGAGGTACCCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTGTGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTCGAGGTGAGCCCCACGTTCTGCTTCACTCTCCCCATCTCCCCCCCCTCCCCACCCCCAATTTTGTATTTATTTATTTTTTAATTATTTTGTGCAGCGATGGGGGCGGGGGGGGGGGGGGGGCGCGCGCCAGGCGGGGCGGGGCGGGGCGAGGGGCGGGGCGGGGCGAGGCGGAGAGGTGCGGCGGCAGCCAATCAGAGCGGCGCGCTCCGAAAGTTTCCTTTTATGGCGAGGCGGCGGCGGCGGCGGCCCTATAAAAAGCGAAGCGCGCGGCGGGCGGGAGTCGCTGCGCGCTGCCTTCGCCCCGTGCCCCGCTCCGCCGCCGCCTCGCGCCGCCCGCCCCGGCTCTGACTGACCGCGTTACTCCCACAGGTGAGCGGGCGGGACGGCCCTTCTCCTCCGGGCTGTAATTAGCTGAGCAAGAGGTAAGGGTTTAAGGGATGGTTGGTTGGTGGGGTATTAATGTTTAATTACCTGGAGCACCTGCCTGAAATCACTTTTTTTCAGGTTGGACCGGTGCCACCATGGACTATAAGGACCACGACGGAGACTACAAGGATCATGATATTGATTACAAAGACGATGACGATAAGATGGCCCCAAAGAAGAAGCGGAAGGTCGGTATCCACGGAGTCCCAGCAGCCATGCAGGAGAGAAAGAAGATCAGCCACCTGACCCACAGAAACAGCGTGAAGAAAACCATCAGAATGCAGCTGAACCCCGTGGGAAAGACCATGGACTACTTCCAGGCCAAGCAGATCCTGGAGAACGACGAGAAGCTGAAGGAGGACTACCAGAAGATCAAGGAGATCGCCGACAGATTCTACAGAAACCTGAACGAGGACGTGCTGAGCAAAACCGGACTGGACAAGCTGAAGGACTACGCCGAGATCTACTACCATTGCAACACCGACGCCGACAGAAAGAGACTGAACGAGTGCGCCAGCGAGCTGAGAAAGGAGATCGTGAAGAACTTCAAGAACAGAGATGAGTACAACAAGCTGTTCAACAAGAAGATGATCGAGATCGTGCTGCCCAAGCACCTGAAGAACGAGGACGAGAAGGAAGTGGTGGCCAGCTTCAAGAACTTCACCACCTACTTCACCGGCTTCTTCACCAACAGAAAGAACATGTACAGCGACGGCGAAGAGTCTACCGCTATTGCCTACAGATGCATCAACGAGAACCTGCCCAAGCACCTGGACAACGTGAAGGTGTTCGAGAAGGCCATCAGCAAGCTGAGCAAGAACGCCATCGACGACCTGGATGCCACATATTCTGGCCTGTGCGGCACAAATCTGTACGACGTGTTCACCGTGGACTACT
TCAACTTCCTGCTGCCCCAAAGCGGAATCACCGAGTACAACAAGATCATCGGCGGCTACACAACAAGCGACGGCACCAAAGTGAAGGGCATCAACGAGTACATCAACCTGTACAACCAGCAGGTGAGCAAGAGAGACAAGATCCCCAACCTGAAGATCCTGTACAAGCAGATCCTGAGCGAGAGCGAGAAGGTGTCTTTCATCCCCCCCAAGTTCGAGGACGACAACGAACTGCTGTCTGCCGTGAGCGAGTTCTATGCCAACGACGAGACATTTGATGGCATGCCCCTGAAGAAAGCCATCGACGAAACCAAACTGCTGTTCGGCAACCTGGACAACAGCAGCCTGAACGGCATCTACATCCAGAACGACAGAAGCGTGACCAACCTGAGCAACAGCATGTTCGGCAGCTGGAGCGTGATTGAGGACCTGTGGAACAAGAACTACGACAGCGTGAACAGCAACAGCAGAATCAAGGACATCCAGAAGAGAGAGGACAAGAGAAAGAAGGCCTACAAGGCCGAGAAGAAGCTGAGCCTGAGCTTCCTGCAGGTGCTGATCAGCAACAGCGAGAACGACGAGATCAGAAAGAAGAGCATCGTGGACTACTACAAGACCAGCCTGATGCAGCTGACCGACAACCTGAGCGACAAGTACAAAGAAGCCGCCCCCCTGTTTTCTGAGAACTACGACAACGAGAAGGGCCTGAAGAACGACGACAAGAGCATCAGCCTGATCAAGAACTTCCTGGACGCCATCAAGGAGATCGAGAAGTTCATCAAGCCCCTGAGCGAGACAAATATCACCGGCGAGAAGAACGACCTGTTCTACAGCCAGTTCACCCCCCTGCTGGACAACATCAGCAGAATCGACAGACTGTACGACAAGGTGAGAAACTACGTGACCCAGAAGCCCTTCAGCACCGACAAGATCAAGCTGAACTTCGGCAACAGCCAGCTTCTGAACGGCTGGGACAGAAACAAGGAGAAGGACTGTGGCGCTGTGCTGCTGTGTAAGGACGAGAAGTACTACCTGGCCATCATCGACAAGAGCAACAACAGCATCCTGGAGAACATCGACTTCCAGGACTGCAACGAGAGCGACTACTACGAGAAGATCGTGTACAAGCTGCTGACCAAGATCTCTGGCAACCTGCCCAGAGTGTTCTTCAGCGAGAAGCACAAGAAGCTGCTGAGCCCCAGCGATGAGATCCTGAAGATCTACAAGAGCGGCACCTTCAAGAAGGGCGACAAGTTCAGCCTTGACGACTGCCACAAGCTGATCGACTTCTACAAGGAGAGCTTCAAGAAGTACCCCAAGTGGCTGATCTACAACTTCAAGTTCAAGAACACCAACGAGTACAACGACATCAGCGAGTTCTACAACGACGTGGCCAGCCAGGGATACAACATCAGCAAGATGAAGATCCCCACCAGCTTCATCGACAAGCTGGTGGACGAGGGCAAGATCTACCTGTTCCAGCTGTACAACAAGGACTTCAGCCCCCACAGCAAGGGAACACCTAACCTGCACACCCTGTACTTCAAGATGCTGTTCGACGAGAGAAACCTGGAGGACGTGGTGTACAAGCTGAATGGCGAGGCCGAGATGTTTTACAGACCCGCCAGCATCAAGTATGACAAGCCCACCCACCCTAAGAACACCCCCATCAAGAACAAGAACACCCTGAACGACAAGAAGGCCAGCACCTTCCCCTACGACCTGATCAAGGACAAGAGATACACCAAGTGGCAGTTCAGCCTGCACTTCCCCATCACCATGAACTTCAAGGCCCCCGACAGAGCCATGATCAACGACGACGTGAGAAACCTGCTGAAGAGCTGCAACAACAACTTCATCATCGGCATCGACAGAGGCGAGAGAAACCTGCTGTACGTGAGCGTGATCGATAGCAACGGCGCCATCATCTACCAGCACAGCCTGAACATCATCGGCAACAAGTTCAAGGGCAAGACCTACGAAACCAACTACAGAGAGAAG
CTGGCCACCAGAGAGAAGGAGAGAACCGAGCAGAGAAGAAACTGGAAGGCCATCGAGAGCATCAAGGAGCTGAAGGAGGGCTACATCAGCCAAACCGTGCACGTGATTTGCCAGCTGGTGGTGAAGTACGACGCCATCATCGTGATGGAGAAGCTGACCGACGGCTTCAAGAGAGGCAGAACCAAGTTCGAGAAGCAGGTGTACCAGAAGTTCGAGAAGATGCTGATCGACAAGCTGAACTACTACGTGGACAAGAAGCTGGACCCCAATGAGGAAGGCGGACTGCTGCATGCTTATCAGCTGACCAACAAGCTGGACAGCTTCGACAAGCTGGGAATGCAGAGCGGCTTCATCTTCTACGTCAGACCCGACTTCACCAGCAAAATCGACCCCGTGACCGGATTTGTGAACCTGCTGTACCCCAGATACGAGAACATCGACAAGGCCAAGGACATGATCAGCAGATTCGACGACATCAGATACAACGCCGGCGAGGACTTCTTCGAGTTCGACATCGACTACGACAAGTTCCCCAAGACCGCCAGCGACTACAGAAAGAAGTGGACCATCTGCACCAACGGCGAGAGAATCGAGGCCTTCAGAAACCCCGCCAACAACAACGAGTGGAGCTACAGAACCATCATCCTGGCCGAGAAGTTCAAGGAGCTGTTCGACAACAACAGCATCAACTACAGAGACAGCGACGACCTGAAAGCCGAGATCCTGAGCCAAACCAAGGGCAAGTTCTTCGAGGACTTCTTCAAGCTGCTGAGACTGACCCTGCAGATGAGAAACAGCAACCCCGAAACCGGAGAGGACAGGATTCTGAGCCCCGTGAAGGACAAGAACGGCAACTTCTACGACAGCAGCAAGTACGACGAGAAGAGCAAGCTGCCCTGTGACGCTGATGCTAACGGCGCTTACAACATCGCCAGAAAGGGCCTGTGGATCGTGGAGCAGTTCAAGAAGGCCGACAACGTGTCTGCTGTGGAACCCGTGATCCACAACGACAAGTGGCTGAAGTTCGTGCAGGAGAACGACATGGCCAACAACAAAAGGCCGGCGGCCACGAAAAAGGCCGGCCAGGCAAAAAAGAAAAAGGAATTCGGCAGTGGAGAGGGCAGAGGAAGTCTGCTAACATGCGGTGACGTCGAGGAGAATCCTGGCCCAGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAGCTGTACAAGGAATTCTAACTAGAGCTCGCTGATCAGCCTCGACTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCCCACTGTC
CTTTCCTAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAAGAGAATAGCAGGCATGCTGGGGAGCGGCCGCAGGAACCCCTAGTGATGGAGTTGGCCACTCCCTCTCTGCGCGCTCGCTCGCTCACTGAGGCCGGGCGACCAAAGGTCGCCCGACGCCCGGGCTTTGCCCGGGCGGCCTCAGTGAGCGAGCGAGCGCGCAGCTGCCTGCAGGGGCGCCTGATGCGGTATTTTCTCCTTACGCATCTGTGCGGTATTTCACACCGCATACGTCAAAGCAACCATAGTACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCAGCGCCTTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGACCCCAAAAAACTTGATTTGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACACTCAACTCTATCTCGGGCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGTCTATTGGTTAAAAAATGAGCTGATTTAACAAAAATTTAACGCGAATTTTAACAAAATATTAACGTTTACAATTTTATGGTGCACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGCCAGCCCCGACACCCGCCAACACCCGCTGACGCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAGACAAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTCACCGTCATCACCGAAACGCGCGAGACGAAAGGGCCTCGTGATACGCCTATTTTTATAGGTTAATGTCATGATAATAATGGTTTCTTAGACGTCAGGTGGCACTTTTCGGGGAAATGTGCGCGGAACCCCTATTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTCATGAGACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTGAGAGTTTTCGCCCCGAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCCCGTATTGACGCCGGGCAAGAGCAACTCGGTCGCCGCATACACTATTCTCAGAATGACTTGGTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAATTATGCAGTGCTGCCATAACCATGAGTGATAACACTGCGGCCAACTTACTTCTGACAACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACATGGGGGATCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAATGAAGCCATACCAAACGACGAGCGTGACACCACGATGCCTGTAGCAATGGCAACAACGTTGCGCAAACTATTAACTGGCGAACTACTTACTCTAGCTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGGACCACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCGTGGAAGCCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCTACACGACGGGGAGTCAGGCAACTATGGATGAA
CGAAATAGACAGATCGCTGAGATAGGTGCCTCACTGATTAAGCATTGGTAACTGTCAGACCAAGTTTACTCATATATACTTTAGATTGATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGAAGATCCTTTTTGATAATCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTTCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACATGT(SEQ ID NO:35).
PX458-HBG-SG01:
GAGGGCCTATTTCCCATGATTCCTTCATATTTGCATATACGATACAAGGCTGTTAGAGAGATAATTGGAATTAATTTGACTGTAAACACAAAGATATTAGTACAAAATACGTGACGTAGAAAGTAATAATTTCTTGGGTAGTTTGCAGTTTTAAAATTATGTTTTAAAATGGACTATCATATGCTTACCGTAACTTGAAAGTATTTCGATTTCTTGGCTTTATATATCTTGTGGAAAGGACGAAACACCCTTGTCAAGGCTATTGGTCAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTTTTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTTTTAGCGCGTGCGCCAATTCTGCAGACAAATGGCTCTAGAGGTACCCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTGTGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTCGAGGTGAGCCCCACGTTCTGCTTCACTCTCCCCATCTCCCCCCCCTCCCCACCCCCAATTTTGTATTTATTTATTTTTTAATTATTTTGTGCAGCGATGGGGGCGGGGGGGGGGGGGGGGCGCGCGCCAGGCGGGGCGGGGCGGGGCGAGGGGCGGGGCGGGGCGAGGCGGAGAGGTGCGGCGGCAGCCAATCAGAGCGGCGCGCTCCGAAAGTTTCCTTTTATGGCGAGGCGGCGGCGGCGGCGGCCCTATAAAAAGCGAAGCGCGCGGCGGGCGGGAGTCGCTGCGCGCTGCCTTCGCCCCGTGCCCCGCTCCGCCGCCGCCTCGCGCCGCCCGCCCCGGCTCTGACTGACCGCGTTACTCCCACAGGTGAGCGGGCGGGACGGCCCTTCTCCTCCGGGCTGTAATTAGCTGAGCAAGAGGTAAGGGTTTAAGGGATGGTTGGTTGGTGGGGTATTAATGTTTAATTACCTGGAGCACCTGCCTGAAATCACTTTTTTTCAGGTTGGACCGGTGCCACCATGGACTATAAGGACCACGACGGAGACTACAAGGATCATGATATTGATTACAAAGACGATGACGATAAGATGGCCCCAAAGAAGAAGCGGAAGGTCGGTATCCACGGAGTCCCAGCAGCCGACAAGAAGTACAGCATCGGCCTGGACATCGGCACCAACTCTGTGGGCTGGGCCGTGATCACCGACGAGTACAAGGTGCCCAGCAAGAAATTCAAGGTGCTGGGCAACACCGACCGGCACAGCATCAAGAAGAACCTGATCGGAGCCCTGCTGTTCGACAGCGGCGAAACAGCCGAGGCCACCCGGCTGAAGAGAACCGCCAGAAGAAGATACACCAGACGGAAGAACCGGATCTGCTATCTGCAAGAGATCTTCAGCAACGAGATGGCCAAGGTGGACGACAGCTTCTTCCACAGACTGGAAGAGTCCTTCCTGGTGGAAGAGGATAAGAAGCACGAGCGGCACCCCATCTTCGGCAACATCGTGGACGAGGTGGCCTACCACGAGAAGTACCCCACCATCTACCACCTGAGAAAGAAACTGGTGGACAGCACCGACAAGGCCGACCTGCGGCTGATCTATCTGGCCCTGGCCCACATGATCAAGTTCCGGGGCCACTTCCTGATCGAGGGCGACCTGAACCCCGACAACAGCGACGTGGACAAGCTGTTCATCCAGCTGGTGCAGACCTACAACCAGCTGTTCGAGGAAAACCCCATCAACGCCAGCGGCGTGGACGCCAAGGCCAT
CCTGTCTGCCAGACTGAGCAAGAGCAGACGGCTGGAAAATCTGATCGCCCAGCTGCCCGGCGAGAAGAAGAATGGCCTGTTCGGAAACCTGATTGCCCTGAGCCTGGGCCTGACCCCCAACTTCAAGAGCAACTTCGACCTGGCCGAGGATGCCAAACTGCAGCTGAGCAAGGACACCTACGACGACGACCTGGACAACCTGCTGGCCCAGATCGGCGACCAGTACGCCGACCTGTTTCTGGCCGCCAAGAACCTGTCCGACGCCATCCTGCTGAGCGACATCCTGAGAGTGAACACCGAGATCACCAAGGCCCCCCTGAGCGCCTCTATGATCAAGAGATACGACGAGCACCACCAGGACCTGACCCTGCTGAAAGCTCTCGTGCGGCAGCAGCTGCCTGAGAAGTACAAAGAGATTTTCTTCGACCAGAGCAAGAACGGCTACGCCGGCTACATTGACGGCGGAGCCAGCCAGGAAGAGTTCTACAAGTTCATCAAGCCCATCCTGGAAAAGATGGACGGCACCGAGGAACTGCTCGTGAAGCTGAACAGAGAGGACCTGCTGCGGAAGCAGCGGACCTTCGACAACGGCAGCATCCCCCACCAGATCCACCTGGGAGAGCTGCACGCCATTCTGCGGCGGCAGGAAGATTTTTACCCATTCCTGAAGGACAACCGGGAAAAGATCGAGAAGATCCTGACCTTCCGCATCCCCTACTACGTGGGCCCTCTGGCCAGGGGAAACAGCAGATTCGCCTGGATGACCAGAAAGAGCGAGGAAACCATCACCCCCTGGAACTTCGAGGAAGTGGTGGACAAGGGCGCTTCCGCCCAGAGCTTCATCGAGCGGATGACCAACTTCGATAAGAACCTGCCCAACGAGAAGGTGCTGCCCAAGCACAGCCTGCTGTACGAGTACTTCACCGTGTATAACGAGCTGACCAAAGTGAAATACGTGACCGAGGGAATGAGAAAGCCCGCCTTCCTGAGCGGCGAGCAGAAAAAGGCCATCGTGGACCTGCTGTTCAAGACCAACCGGAAAGTGACCGTGAAGCAGCTGAAAGAGGACTACTTCAAGAAAATCGAGTGCTTCGACTCCGTGGAAATCTCCGGCGTGGAAGATCGGTTCAACGCCTCCCTGGGCACATACCACGATCTGCTGAAAATTATCAAGGACAAGGACTTCCTGGACAATGAGGAAAACGAGGACATTCTGGAAGATATCGTGCTGACCCTGACACTGTTTGAGGACAGAGAGATGATCGAGGAACGGCTGAAAACCTATGCCCACCTGTTCGACGACAAAGTGATGAAGCAGCTGAAGCGGCGGAGATACACCGGCTGGGGCAGGCTGAGCCGGAAGCTGATCAACGGCATCCGGGACAAGCAGTCCGGCAAGACAATCCTGGATTTCCTGAAGTCCGACGGCTTCGCCAACAGAAACTTCATGCAGCTGATCCACGACGACAGCCTGACCTTTAAAGAGGACATCCAGAAAGCCCAGGTGTCCGGCCAGGGCGATAGCCTGCACGAGCACATTGCCAATCTGGCCGGCAGCCCCGCCATTAAGAAGGGCATCCTGCAGACAGTGAAGGTGGTGGACGAGCTCGTGAAAGTGATGGGCCGGCACAAGCCCGAGAACATCGTGATCGAAATGGCCAGAGAGAACCAGACCACCCAGAAGGGACAGAAGAACAGCCGCGAGAGAATGAAGCGGATCGAAGAGGGCATCAAAGAGCTGGGCAGCCAGATCCTGAAAGAACACCCCGTGGAAAACACCCAGCTGCAGAACGAGAAGCTGTACCTGTACTACCTGCAGAATGGGCGGGATATGTACGTGGACCAGGAACTGGACATCAACCGGCTGTCCGACTACGATGTGGACCATATCGTGCCTCAGAGCTTTCTGAAGGACGACTCCATCGACAACAAGGTGCTGACCAGAAGCGACAAGAACCGGGGCAAGAGCGACAACGTGCCCTCCGAAGAGGTCGTGAAGA
AGATGAAGAACTACTGGCGGCAGCTGCTGAACGCCAAGCTGATTACCCAGAGAAAGTTCGACAATCTGACCAAGGCCGAGAGAGGCGGCCTGAGCGAACTGGATAAGGCCGGCTTCATCAAGAGACAGCTGGTGGAAACCCGGCAGATCACAAAGCACGTGGCACAGATCCTGGACTCCCGGATGAACACTAAGTACGACGAGAATGACAAGCTGATCCGGGAAGTGAAAGTGATCACCCTGAAGTCCAAGCTGGTGTCCGATTTCCGGAAGGATTTCCAGTTTTACAAAGTGCGCGAGATCAACAACTACCACCACGCCCACGACGCCTACCTGAACGCCGTCGTGGGAACCGCCCTGATCAAAAAGTACCCTAAGCTGGAAAGCGAGTTCGTGTACGGCGACTACAAGGTGTACGACGTGCGGAAGATGATCGCCAAGAGCGAGCAGGAAATCGGCAAGGCTACCGCCAAGTACTTCTTCTACAGCAACATCATGAACTTTTTCAAGACCGAGATTACCCTGGCCAACGGCGAGATCCGGAAGCGGCCTCTGATCGAGACAAACGGCGAAACCGGGGAGATCGTGTGGGATAAGGGCCGGGATTTTGCCACCGTGCGGAAAGTGCTGAGCATGCCCCAAGTGAATATCGTGAAAAAGACCGAGGTGCAGACAGGCGGCTTCAGCAAAGAGTCTATCCTGCCCAAGAGGAACAGCGATAAGCTGATCGCCAGAAAGAAGGACTGGGACCCTAAGAAGTACGGCGGCTTCGACAGCCCCACCGTGGCCTATTCTGTGCTGGTGGTGGCCAAAGTGGAAAAGGGCAAGTCCAAGAAACTGAAGAGTGTGAAAGAGCTGCTGGGGATCACCATCATGGAAAGAAGCAGCTTCGAGAAGAATCCCATCGACTTTCTGGAAGCCAAGGGCTACAAAGAAGTGAAAAAGGACCTGATCATCAAGCTGCCTAAGTACTCCCTGTTCGAGCTGGAAAACGGCCGGAAGAGAATGCTGGCCTCTGCCGGCGAACTGCAGAAGGGAAACGAACTGGCCCTGCCCTCCAAATATGTGAACTTCCTGTACCTGGCCAGCCACTATGAGAAGCTGAAGGGCTCCCCCGAGGATAATGAGCAGAAACAGCTGTTTGTGGAACAGCACAAGCACTACCTGGACGAGATCATCGAGCAGATCAGCGAGTTCTCCAAGAGAGTGATCCTGGCCGACGCTAATCTGGACAAAGTGCTGTCCGCCTACAACAAGCACCGGGATAAGCCCATCAGAGAGCAGGCCGAGAATATCATCCACCTGTTTACCCTGACCAATCTGGGAGCCCCTGCCGCCTTCAAGTACTTTGACACCACCATCGACCGGAAGAGGTACACCAGCACCAAAGAGGTGCTGGACGCCACCCTGATCCACCAGAGCATCACCGGCCTGTACGAGACACGGATCGACCTGTCTCAGCTGGGAGGCGACAAAAGGCCGGCGGCCACGAAAAAGGCCGGCCAGGCAAAAAAGAAAAAGGAATTCGGCAGTGGAGAGGGCAGAGGAAGTCTGCTAACATGCGGTGACGTCGAGGAGAATCCTGGCCCAGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTG
GGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAGCTGTACAAGGAATTCTAACTAGAGCTCGCTGATCAGCCTCGACTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAAGAGAATAGCAGGCATGCTGGGGAGCGGCCGCAGGAACCCCTAGTGATGGAGTTGGCCACTCCCTCTCTGCGCGCTCGCTCGCTCACTGAGGCCGGGCGACCAAAGGTCGCCCGACGCCCGGGCTTTGCCCGGGCGGCCTCAGTGAGCGAGCGAGCGCGCAGCTGCCTGCAGGGGCGCCTGATGCGGTATTTTCTCCTTACGCATCTGTGCGGTATTTCACACCGCATACGTCAAAGCAACCATAGTACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCAGCGCCTTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGACCCCAAAAAACTTGATTTGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACACTCAACTCTATCTCGGGCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGTCTATTGGTTAAAAAATGAGCTGATTTAACAAAAATTTAACGCGAATTTTAACAAAATATTAACGTTTACAATTTTATGGTGCACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGCCAGCCCCGACACCCGCCAACACCCGCTGACGCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAGACAAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTCACCGTCATCACCGAAACGCGCGAGACGAAAGGGCCTCGTGATACGCCTATTTTTATAGGTTAATGTCATGATAATAATGGTTTCTTAGACGTCAGGTGGCACTTTTCGGGGAAATGTGCGCGGAACCCCTATTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTCATGAGACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTGAGAGTTTTCGCCCCGAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCCCGTATTGACGCCGGGCAAGAGCAACTCGGTCGCCGCATACACTATTCTCAGAATGACTTGGTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAATTATGCAGTGCTGCCATAACCATGAGTGATAACA
CTGCGGCCAACTTACTTCTGACAACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACATGGGGGATCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAATGAAGCCATACCAAACGACGAGCGTGACACCACGATGCCTGTAGCAATGGCAACAACGTTGCGCAAACTATTAACTGGCGAACTACTTACTCTAGCTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGGACCACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCGTGGAAGCCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCTACACGACGGGGAGTCAGGCAACTATGGATGAACGAAATAGACAGATCGCTGAGATAGGTGCCTCACTGATTAAGCATTGGTAACTGTCAGACCAAGTTTACTCATATATACTTTAGATTGATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGAAGATCCTTTTTGATAATCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTTCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACATGT(SEQ ID NO:36)。
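After the commercially synthesized plasmids, and the strains carrying them, that provide the plasmid sequences required for the BES4 activity assay were obtained, the plasmids were amplified by direct inoculation:
(a) Take 15 mL of antibiotic-free LB liquid medium and add 15 μL of 1000X Amp antibiotic; then use a white pipette tip to pick the strain harboring the target plasmid, place it in the medium, and culture overnight at 37°C and 200 rpm;
(b) Centrifuge the overnight culture at 8000 rpm for 3 min to pellet the bacteria, and discard the medium;
(c) Extract the plasmid using the Tiangen mini-prep kit or the Tiangen endotoxin-free mini-to-medium-scale plasmid extraction kit;
(d) After extraction, quantify the plasmid concentration with a Nanodrop and store the plasmid at -20°C.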
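The antibiotic dilution in step (a) can be checked with a small calculation. The sketch below is illustrative only: the function name `stock_volume_ul` is hypothetical, and the small volume contributed by the stock itself is ignored.

```python
def stock_volume_ul(culture_volume_ml: float, stock_fold: float) -> float:
    """Return the volume (in microliters) of an N-fold concentrated stock
    needed to reach a 1X working concentration in the given culture volume.

    The volume added by the stock itself is neglected (an assumption that
    holds well for 1000X stocks).
    """
    if culture_volume_ml <= 0 or stock_fold <= 0:
        raise ValueError("volume and stock concentration factor must be positive")
    # 1 mL = 1000 microliters; dividing by the fold factor gives the stock volume.
    return culture_volume_ml * 1000.0 / stock_fold


if __name__ == "__main__":
    # Step (a): 15 mL of LB medium with a 1000X Amp stock.
    print(stock_volume_ul(15, 1000))
```

For the 15 mL culture and 1000X stock in step (a), this gives the 15 μL stated in the protocol.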
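(3) Transfection of plasmids into human cells
(a) Plasmids were transfected with the Lipo3000 kit (1.5 μg of plasmid per well);
(b) After transfection, the cells were cultured for 2-3 days so that gene editing could proceed fully, and were then harvested;
(c) After culture, the medium was aspirated with a pipette tip, and 200 μL of 0.5 M EDTA solution was added to each well of the 12-well plate. After ten minutes, the cells were resuspended by pipetting, transferred to EP tubes, and centrifuged at 12000 rpm for 1 min; the supernatant was removed and the cells were recovered;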
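Given the Nanodrop concentration of a plasmid prep, the volume that delivers the 1.5 μg per well used in step (a) follows from a unit conversion. This is a minimal sketch: the function name `plasmid_volume_ul` and the example concentration of 500 ng/μL are hypothetical and do not come from the specification.

```python
def plasmid_volume_ul(mass_ug: float, conc_ng_per_ul: float) -> float:
    """Return the volume (in microliters) of plasmid solution containing
    `mass_ug` micrograms of DNA at a concentration of `conc_ng_per_ul` ng/uL."""
    if mass_ug < 0:
        raise ValueError("mass must be non-negative")
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    # 1 microgram = 1000 nanograms.
    return mass_ug * 1000.0 / conc_ng_per_ul


if __name__ == "__main__":
    # Hypothetical prep measured at 500 ng/uL; 1.5 ug per well as in step (a).
    print(plasmid_volume_ul(1.5, 500.0))
```

With the hypothetical 500 ng/μL prep, 3 μL of plasmid solution would be used per well.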
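(4) Identification of editing activity
After the cells were harvested, genomic DNA was extracted and a T7E1 digestion assay was performed to detect activity, as follows:
(a) Genomic DNA extraction: genomic DNA was extracted with a genomic DNA extraction kit (Tiangen), and the gDNA concentration was measured with a Nanodrop;
(b) PCR of the target region: the target-site region was amplified from the gDNA using GXL Prime; the amplification primers are listed in Table 7 below, and all deoxynucleotide sequences used were synthesized at the Synthesis and Editing Platform of the China National GeneBank, Shenzhen. The products were purified with a PCR clean-up and gel extraction kit (MN). The purity of the PCR products was analyzed by agarose gel electrophoresis, and the concentration was measured with a Nanodrop.
(c) Denaturation and annealing: the purified product from step (b) was denatured and annealed on a Bio-rad PCR instrument. Equal amounts of substrate DNA were used in the T7E1 digestion reactions (approximately 200-300 ng per reaction, in a 10 μL reaction volume).
(d) T7E1 digestion: 0.2 μL of T7E1 nuclease was added to the 10 μL sample from step (c), and the digestion was carried out at 37°C for 20 minutes.
(e) Activity detection: after the T7E1 cleavage reaction was complete, loading buffer was added and the bands were examined by agarose gel electrophoresis.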
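The specification reads the T7E1 result qualitatively from the gel. If band intensities are measured by densitometry, a commonly used estimate converts the cleaved fraction into an approximate indel percentage: indel % = 100 × (1 − sqrt(1 − f_cut)), where f_cut is the cleaved-band intensity divided by the total intensity. The sketch below illustrates that general estimate and is not a method stated in this document; the function names and the example intensities are hypothetical.

```python
import math
from typing import Sequence


def fraction_cleaved(uncut: float, cut_bands: Sequence[float]) -> float:
    """Fraction of total band intensity found in the T7E1 cleavage products."""
    cut = sum(cut_bands)
    total = uncut + cut
    if uncut < 0 or any(b < 0 for b in cut_bands) or total <= 0:
        raise ValueError("band intensities must be non-negative with a positive total")
    return cut / total


def estimated_indel_percent(uncut: float, cut_bands: Sequence[float]) -> float:
    """Approximate indel percentage from gel band intensities.

    Uses the common estimate 100 * (1 - sqrt(1 - f_cut)), which assumes that
    edited and unedited strands reanneal at random during step (c).
    """
    f_cut = fraction_cleaved(uncut, cut_bands)
    return 100.0 * (1.0 - math.sqrt(1.0 - f_cut))


if __name__ == "__main__":
    # Hypothetical densitometry values: one uncut band and two cleavage bands.
    print(round(estimated_indel_percent(64.0, [18.0, 18.0]), 2))
```

A lane with no cleavage bands gives 0%, and the estimate rises nonlinearly with the cleaved fraction, so band intensity ratios should not be read directly as editing efficiency.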
Table 7: List of PCR amplification primers
As shown in Figure 13, the sg03 plasmid of BES4 has editing activity in human cells.
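In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or as implicitly specifying the number of the technical features indicated. Thus, a feature defined with "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality of" means at least two, for example two, three, etc., unless otherwise expressly and specifically limited.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic expressions of the above terms need not refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict one another, those skilled in the art may combine the different embodiments or examples described in this specification, as well as the features of the different embodiments or examples.
Although embodiments of the present invention have been shown and described above, it is to be understood that the above embodiments are exemplary and are not to be construed as limiting the present invention; those of ordinary skill in the art may make changes, modifications, substitutions and variations to the above embodiments within the scope of the present invention.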
Appendix Table I: Novel Type II Crispr-Cas systems
Appendix Table II: Novel Cpf1 (Type V) systems
SEQUENCE LISTING
<110> Shenzhen BGI Life Science Research Institute (深圳华大生命科学研究院)
<120> Novel Cas proteins, Crispr-Cas systems and uses thereof in the field of gene editing
<130> PIDC3202629
<160> 38
<170> PatentIn version 3.3
<210> 1
<211> 1064
<212> PRT
<213> Artificial
<220>
<223> BES1 sequence
<400> 1
Met Gly Tyr Ile Leu Gly Leu Asp Ile Gly Val Ala Ser Val Gly Tyr
1 5 10 15
Ala Ile Ile Asp Glu Asn Tyr Asn Val Leu Ile Ser Gly Val Arg Leu
20 25 30
Phe Arg Glu Gly Thr Ala Glu Glu Asn Val Ala Arg Arg Gly Phe Arg
35 40 45
Ser Ser Arg Arg Ser Met Arg Arg Ser Arg His Arg Leu Asp Arg Leu
50 55 60
Lys Glu Leu Leu Ser Ser Ala Leu Gly Val Ser Gly Asp Gln Ser Tyr
65 70 75 80
Thr Asn Leu Tyr Glu Ile Arg Val Arg Gly Leu Ser Asn Lys Leu Leu
85 90 95
Pro Asp Glu Leu Ile Ala Ala Ile Ile Gln Leu Ala Lys His Arg Gly
100 105 110
Ile Phe Tyr Leu Ser Pro Glu Asp Leu Ala Thr Glu Asp Gly Ser Asn
115 120 125
Arg Ser Ser Ala Asp Ile Ile Arg Thr Asn Glu Asn Lys Leu Lys Asp
130 135 140
Gly Ile Tyr Pro Cys His Val Gln Leu Glu Lys Leu Asn Thr Thr Gly
145 150 155 160
Lys Val Arg Gly Ile Glu Asn Lys Phe Thr His Gly Ser Tyr Arg Ser
165 170 175
Glu Leu Ile Lys Leu Leu Glu Val Gln Ser Ser Phe Tyr Pro Lys Leu
180 185 190
Lys Gly Ile Met Asp Glu Val Leu Cys Ile Tyr Asp Ser Lys Arg Glu
195 200 205
Tyr Tyr Glu Gly Pro Gly Ser Tyr Lys Ser Pro Thr Pro Tyr Gly Ser
210 215 220
Tyr Gln Leu Asp Glu Ser Gly Asn Val Ile Lys Ile Asn Leu Ile Asp
225 230 235 240
Lys Met Arg Gly Thr Cys Thr Tyr Phe Pro Asp Glu Leu Arg Ala Pro
245 250 255
Lys Trp Ser Asn Ser Ala Cys Leu Phe Asn Leu Leu Asn Asp Leu Asn
260 265 270
Asn Leu Thr Ile Gln Gly Val Lys Ile Thr Glu Val Gln Lys Gln Glu
275 280 285
Leu Ile Ser Glu Tyr Val Asn Lys Gly Lys Thr Val Thr Ile Pro Ala
290 295 300
Ile Ala Lys Val Cys Gly Val Lys Lys Glu Asp Ile Phe Gly Phe Arg
305 310 315 320
Ile Asp Lys Ser Glu Lys Pro Ile Phe Thr Lys Phe Glu Gly Tyr Asn
325 330 335
Glu Leu Leu Lys Ile Ala Lys Ser Val Asn Glu Glu Asp Ala Ile Glu
340 345 350
Gly Lys Lys Gln Leu Val Asp Asp Ile Ser Glu Ile Leu Thr Lys Glu
355 360 365
Lys Ser Ile Asp Val Arg Glu Arg Lys Leu Val Asp Asp Leu Asn Leu
370 375 380
Ser Thr Ser Leu Ala Lys Glu Ile Ala Lys Ser Gly Gly Phe Thr Thr
385 390 395 400
Tyr His Ser Leu Ser Phe Lys Ala Ile Asn Leu Ile Leu Asp Asp Leu
405 410 415
Leu Lys Thr Ser Lys Asn Gln Met Glu Leu Phe Thr Glu Ala Gly Ile
420 425 430
Lys Pro Tyr Asn His Lys Phe Ser Gln Ser Tyr Gln Leu Ser Ala Asn
435 440 445
Leu Ser Asp Trp Ile Val Ser Pro Val Val Lys Arg Ser Ile Asn Glu
450 455 460
Thr Ile Lys Val Phe Asn Ala Leu Arg Lys Tyr Leu Lys Thr Gln Asn
465 470 475 480
Ser Asp Asp Thr Glu Phe Ser Asp Val Val Val Glu Leu Ala Arg Glu
485 490 495
Lys Asn Ser Gln Glu Lys Lys Asp Leu Ile Lys Lys Ile Gln Lys Ala
500 505 510
Asn Glu Glu Lys Arg Tyr Lys Ile Met Glu Leu Val Glu Asn Arg Lys
515 520 525
Leu Thr Ser Ala Glu Phe Glu Arg Ile Ser Leu Leu Leu Glu Gln Asp
530 535 540
Phe Lys Cys Ala Tyr Ser Leu Glu Pro Ile Glu Leu Ser Asp Val Phe
545 550 555 560
Lys Ala Gly Leu Leu Glu Val Asp His Ile Ile Pro Leu Ser Ile Ser
565 570 575
Leu Ser Asp Ala Gln Ser Asn Lys Val Leu Val Tyr Gln Ser Glu Asn
580 585 590
Gln Ala Lys Gly Gln Arg Ser Pro Phe Gln Tyr Phe Ser Ser Gly Lys
595 600 605
Ala Lys Ile Thr Phe Glu Arg Phe Lys Glu Tyr Val Thr Lys Asn Leu
610 615 620
Asn Phe Ser Asn Ala Lys Lys Arg Asn Leu Leu Tyr Leu Gly Asn Pro
625 630 635 640
Val Glu Asp Met Lys Gly Phe Ile Asn Arg Asn Leu Val Asp Thr Arg
645 650 655
Tyr Ala Ser Arg Glu Thr Tyr Asn Leu Leu Lys Ser Phe Phe Asp Tyr
660 665 670
His Asn Ile Asn Thr Lys Val Lys Val Ile Asn Gly Ser Ala Thr Ser
675 680 685
Tyr Phe Arg Lys Lys Ala Tyr Leu Ser Lys Asn Arg Glu Glu Thr Tyr
690 695 700
Ala His His Ala Gln Asp Ala Met Ile Ile Ala Gly Phe Ala Asn Thr
705 710 715 720
Lys Leu Met Lys Phe Phe Ser Lys Ile Gly Ala Phe Ser Glu Ser Leu
725 730 735
Asn Asn Lys Asp Ser Ile Val Glu Val Asp Gly Asn Ile Ile Asn Ser
740 745 750
Glu Thr Gly Glu Val Leu Glu Gln Glu Leu Phe Asp Lys Ser Glu Asn
755 760 765
Val Ser Asn Tyr Ile Gln Phe Leu Lys Arg Ile Glu Ser Ile Glu Pro
770 775 780
Leu Tyr Ser His Lys Val Asp Arg Lys Pro Asn Arg Ala Leu Tyr Asp
785 790 795 800
Gln Gln Ile Lys Ala Thr Arg Ser Phe Val Glu Asp Asn Lys Glu Val
805 810 815
Thr Tyr Ile Ile Thr Lys Tyr Ser Asp Ile Tyr Asn Thr Gly Thr Gly
820 825 830
Asn Ser Gly Ser Lys Leu Lys Lys Met Ile Leu Glu Ser Pro Asp Lys
835 840 845
Leu Leu Met Tyr His His Asp Pro Lys Thr Phe Glu Ile Phe Gln Lys
850 855 860
Ile Val Glu Gln Tyr Gly Asp Glu Ser Asn Pro Phe Ala Ala Tyr Lys
865 870 875 880
Glu Asp His Gly Pro Ile Arg Lys Tyr Ser Lys Lys Gly Asn Gly Pro
885 890 895
Ile Ile Glu Ser Val Lys Phe Arg Asp Lys Gln Leu Gly Ser His Arg
900 905 910
Val Asn Thr Lys Gln Asn Gly Tyr Asn Lys Ser Val Phe Leu Lys Ile
915 920 925
Lys Ser Leu Arg Thr Asp Val Tyr Gln Asp Gly Glu Asn Tyr Leu Val
930 935 940
Leu Asn Val Pro Tyr Asp Met Val Ser Phe Val Asn Gly Lys Tyr Ile
945 950 955 960
Ile Asp Gln Asp Lys Tyr Asn Lys Ser Lys Gln Ala Gln Lys Ile Pro
965 970 975
Glu Ser Ala Thr Phe Val Thr Ser Leu Tyr Arg Gly Asp His Ile Thr
980 985 990
Tyr Glu Glu Asn Gly Glu Ile Val Glu Cys Ile Phe Lys Cys Ile Asn
995 1000 1005
Asn Glu Lys Ala His Lys Ile Glu Ile Ser Tyr Val Asn Arg Pro
1010 1015 1020
Thr Asp Lys Gln Val Met Lys Gly Ile Lys Thr Ser Ile Lys Asn
1025 1030 1035
Leu Thr Lys Tyr Asn Val Asp Val Leu Gly Asn Lys Tyr Lys Val
1040 1045 1050
Thr Asp Glu Lys Leu Glu Phe Asp Val Thr Ile
1055 1060
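The protein entries in this listing are given in three-letter code with a position ruler under each row. For use in alignment or search tools they usually need to be converted to one-letter code. The sketch below shows one possible way to do that conversion; the function name `three_to_one` is hypothetical, and only the first row of SEQ ID NO:1 is used as input to keep the example short.

```python
# Standard three-letter to one-letter amino acid codes (20 canonical residues)
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def three_to_one(listing_text: str) -> str:
    """Convert sequence-listing rows in three-letter code to one-letter code.

    Purely numeric tokens (the position ruler lines) are skipped.
    An unknown residue name raises KeyError.
    """
    letters = []
    for token in listing_text.split():
        if token.isdigit():
            continue  # position ruler, not a residue
        letters.append(THREE_TO_ONE[token])
    return "".join(letters)


# First residue row and ruler row of SEQ ID NO:1, copied from the listing
excerpt = """
Met Gly Tyr Ile Leu Gly Leu Asp Ile Gly Val Ala Ser Val Gly Tyr
1 5 10 15
"""
one_letter = three_to_one(excerpt)
print(one_letter, len(one_letter))
```

Running the same function over a complete entry and comparing the resulting length with the `<211>` field is a quick consistency check on a transcribed sequence.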
<210> 2
<211> 1368
<212> PRT
<213> Artificial
<220>
<223> BES2 sequence
<400> 2
Met Lys Leu Arg Asn Ile Glu Gly Asp Tyr Asn Ile Gly Leu Asp Leu
1 5 10 15
Gly Thr Gly Ser Val Gly Trp Ala Ala Thr Gly Ile Asp Gly Glu Leu
20 25 30
Leu Thr Gln Asn Asp Lys Pro Ala Trp Gly Ser Arg Val Phe Pro Ser
35 40 45
Gly Glu Thr Ala Ala Asp Thr Arg Leu Lys Arg Gly Gln Arg Arg Arg
50 55 60
Tyr Glu Arg Arg Arg Trp Arg Leu Asp Leu Leu Gln Arg Phe Phe Glu
65 70 75 80
Asp Tyr Met Ala Val Val Asp Pro Ala Phe Phe Ile Arg Leu Lys Gln
85 90 95
Ala Arg Leu Leu Arg Glu Asp Arg Asp Glu Ser Cys Arg Asp Tyr His
100 105 110
Ser Pro Leu Phe Ile Ser Gly Asp Ala Glu Arg Asp Tyr Tyr Lys Arg
115 120 125
Phe Pro Thr Ile Tyr His Leu Arg Ala Trp Leu Met Thr Thr Glu Lys
130 135 140
Lys Ala Asp Leu Arg Glu Val Tyr Leu Ala Leu His Asn Ile Val Lys
145 150 155 160
His Arg Gly Asn Phe Leu His Gln Asp Asn Pro Asn Leu Ser Ala Thr
165 170 175
Ala Ala Asn Met Glu Glu Ser Val Glu Arg Leu Cys Leu Glu Leu Asp
180 185 190
Asp Arg Cys Ala Ala Leu Asp Ile Pro Cys Ala Cys Asp Ala Ala Ser
195 200 205
Ile Arg Gln Val Phe Glu Asp Pro Ser Leu Ala Arg Ala Gly Lys Ser
210 215 220
Glu Ser Val Ser Lys Leu Phe Gly Phe Asp Lys Asp Ser Gln Lys Thr
225 230 235 240
Met Gly Lys Gly Ile Ser Arg Ala Ile Val Gly Tyr Lys Val Asp Phe
245 250 255
Ala Thr Val Leu Gly Cys Glu Phe Glu Asp Ser Ala Phe Ser Leu Ser
260 265 270
Asp Asp Glu Lys Val Asp Gly Ala Leu Ala Ala Ile Pro Asp Asp Ala
275 280 285
Met Gly Leu Phe Asp Ala Ile Arg Ala Ala Tyr Ser Ser Tyr Val Leu
290 295 300
Leu Gly Ile Leu Ser Ser Gly Asp Asp Ser Pro Ile Thr Ser Gly Ala
305 310 315 320
Leu Ser Ser Ala Ser Gly Arg Thr Val Ser Phe Cys Lys Val Arg Glu
325 330 335
Tyr Glu Thr Tyr Lys Ala Asp Leu Ala Leu Leu Lys Ser Leu Val Arg
340 345 350
Thr Tyr Val Pro Glu Gln Tyr Glu Gly Phe Phe Arg Gly Glu Leu Ile
355 360 365
Ala Gly Thr Ser His Tyr Asp Pro Ala Lys Ala Lys Gly Tyr Thr Arg
370 375 380
Tyr Asp Leu Thr His Lys Val Ala Tyr Ala Asp Phe Phe Lys Glu Val
385 390 395 400
Lys Ser Leu Leu Asp Lys Thr Asp Ala Val Thr Asp Glu Arg Tyr Lys
405 410 415
Asp Met Leu Gly Arg Phe Glu Glu Glu Arg Phe Leu Arg Arg Leu Lys
420 425 430
Thr Ser Asp Asn Gly Ser Ile Pro Tyr Gln Leu His Leu Glu Glu Met
435 440 445
Asp Ala Ile Leu Lys Asn Gln Gly Lys His Tyr Pro Phe Leu Leu Glu
450 455 460
Asn Leu Asp Lys Ile Glu Ser Leu Val Ser Phe Arg Ile Pro Tyr Tyr
465 470 475 480
Val Gly Pro Leu Thr Gln Lys Asn Ala Ala Leu Asp His Asn Gly Gln
485 490 495
Ala Arg Phe Ala Trp Ala Thr Arg Lys Pro Gly Lys Gly Asp Glu Pro
500 505 510
Val Tyr Pro Trp Asn Trp Glu Glu Val Ile Asp Lys Gly His Ala Ala
515 520 525
His Ala Phe Ile Gln Arg Met Thr Ser Asp Cys Ser Tyr Leu Ile Gly
530 535 540
Glu Gly Val Leu Pro Arg Asn Ser Leu Met Tyr Glu Glu Phe Cys Val
545 550 555 560
Leu Asn Glu Leu Asn Gly Ala Arg Tyr Ser Val Asp Gly Asp Asp Trp
565 570 575
Arg Arg Phe Asp Tyr Ala Asp Arg Met Gly Ile Met Asp Asp Leu Phe
580 585 590
Arg Gln Arg Arg Ser Val Thr Tyr Lys Met Val Glu Asp Trp Met Arg
595 600 605
Ala Asn Arg Gly Trp Ala Arg Val His Val Arg Gly Gly Gln Gly Glu
610 615 620
Asn Lys Phe Glu Ser Ser Leu Leu Ala Tyr Arg Phe Phe Cys Lys Asp
625 630 635 640
Val Phe Lys Thr Asp Glu Leu Ser Pro Ser Leu Ile Pro Met Val Glu
645 650 655
Thr Ile Ile Leu Trp Ser Thr Leu Phe Glu Asp Arg Ser Ile Leu Lys
660 665 670
Glu Gln Leu Ile Arg Asn Phe Ser Asp Arg Leu Ser Pro Glu Gln Ile
675 680 685
Lys Ile Ile Cys Lys Lys Arg Leu Thr Gly Trp Gly Asn Leu Ser Glu
690 695 700
Arg Phe Leu Ala Glu Ile Lys Val Glu Thr Asp Cys Gly Pro Arg Ser
705 710 715 720
Ile Met Asp Ile Leu Arg Glu Gly Ser Pro Val Gly Gly Glu Gln Gly
725 730 735
Arg Thr Met Val Leu Met Glu Val Leu His Asp Glu Arg Leu Gly Phe
740 745 750
Glu Val Lys Ile Glu Glu Ile Asn Ala Glu Arg Ile Ala Asp Ala Gly
755 760 765
Arg Leu Glu Val Gly Asp Leu Pro Gly Ser Pro Ala Leu Arg Arg Thr
770 775 780
Val Asn Gln Ala Val Arg Val Val Glu Glu Ile Val Arg Ile Ala Gly
785 790 795 800
Lys Pro Pro Val Asn Ile Phe Ile Glu Asn Thr Arg Asp Glu Asp Leu
805 810 815
Ser Arg Lys Gly Lys Arg Thr Lys Arg Arg Tyr Asp Ala Ile Lys Glu
820 825 830
Ala Val Asn Ala Phe Lys Arg Glu Asn Ala Asp Leu Ala Gln Glu Leu
835 840 845
Lys Asp Phe Lys Pro Thr Asp Phe Asp Asp Glu Arg Leu Thr Leu Tyr
850 855 860
Phe Met Gln Gly Gly Lys Ser Leu Tyr Ser Lys Ala Pro Leu Asp Val
865 870 875 880
Thr Arg Leu Ser Glu Tyr Glu Ile Asp His Ile Ile Pro Gln Ser Tyr
885 890 895
Ile Lys Asp Asp Ser Phe Glu Asn Lys Ala Leu Val Leu Lys Ser Glu
900 905 910
Asn Gln Thr Lys Thr Asn Gln Leu Leu Leu Pro Gln Gly Val Arg Val
915 920 925
Lys Met Ala Ser Tyr Trp Gln Glu Leu His Arg Cys Gly Leu Met Gly
930 935 940
Asp Lys Lys Leu Arg Asn Leu Met Cys Ser Asp Ile Ser Glu Arg Arg
945 950 955 960
Ile Lys Gly Phe Ile Ala Arg Gln Leu Val Glu Thr Ser Gln Ile Val
965 970 975
Lys Leu Thr Lys Met Val Leu Glu Asn Arg Leu Pro Glu Ser Arg Leu
980 985 990
Val Pro Ile Lys Ala Ser Leu Ser His Glu Leu Arg Glu Ala Lys His
995 1000 1005
Tyr Tyr Lys Cys Arg Glu Ile Asn Asp Phe His His Ala His Asp
1010 1015 1020
Ala Leu Leu Ala Ala Glu Ile Gly Arg Phe Leu Leu Leu Arg His
1025 1030 1035
Ala Gly Met Tyr Asp Asn Pro Ile Gly Tyr Ala His Val Val Lys
1040 1045 1050
Asp Phe Val Arg Val Gln Ala Asp Glu Ala Lys Arg Thr Gly Arg
1055 1060 1065
Leu Pro Gly Ser Ala Gly Phe Ile Val Ser Ser Phe Leu His Ser
1070 1075 1080
Gly Phe Asp Lys Asp Thr Gly Glu Ile Ser Trp Asp Ala Glu Phe
1085 1090 1095
Glu Cys Glu Arg Ile Arg Lys Tyr Leu Asn Tyr Arg Gln Val Tyr
1100 1105 1110
Leu Ser Arg Met Pro Glu Glu Thr Ser Gly Ala Phe Trp Asp Ala
1115 1120 1125
Thr Ile Tyr Ser Pro Arg Gly Lys Met Lys Leu Ser Leu Pro Leu
1130 1135 1140
Lys Glu Gly Leu Asp Pro Ser Lys Tyr Gly Gly Tyr Ser Ser Glu
1145 1150 1155
Lys Tyr Ala Tyr Phe Phe Cys Tyr Tyr Ala Lys Asp Lys Lys Gly
1160 1165 1170
Lys Arg Ile Ile Asp Phe Ala Pro Val Pro Val Ser Arg Ala Ala
1175 1180 1185
Gly Gly Gln Val Asp Ile Glu Ala Phe Gly Arg Glu Val Ala Glu
1190 1195 1200
Glu Arg Gly Tyr Ala Phe Glu Ser Ile Ala Arg Ala Lys Ile Ala
1205 1210 1215
Val Lys Gln Leu Ile Glu Val Asp Gly Cys Arg Leu Phe Ile Thr
1220 1225 1230
Gly Ala Asp Glu Val Arg Ser Ala Val Pro Leu Ala Tyr Ser Gln
1235 1240 1245
Asp Asp Thr His Leu Met Thr Arg Leu Phe Ala Gly Ser Asp Thr
1250 1255 1260
Asp Cys Asp Arg Leu Phe Cys Gln Met Met Ala Gly Ile Glu Arg
1265 1270 1275
Phe Asp Lys Arg Leu Tyr Asp Asn Leu Lys Leu Lys Ser Arg Ala
1280 1285 1290
Ser Ala Phe Pro Ala Leu Gly Asp Glu Asn Lys Lys Leu Val Leu
1295 1300 1305
Lys Gly Leu Thr Ala Leu Ser Ser Ala Ser Ser Asn Lys Glu Asp
1310 1315 1320
Met Arg Pro Ile Gly Gly Ala Lys Thr Ala Gly Gln Leu Lys Ile
1325 1330 1335
Val Phe Arg Asn Val Leu Ser Asn Gln Gly Ile Thr Phe Ile Asp
1340 1345 1350
Gln Ser Val Thr Gly Met Phe Glu Arg Lys Thr Tyr Ile Gly Leu
1355 1360 1365
<210> 3
<211> 1306
<212> PRT
<213> Artificial
<220>
<223> BES6 sequence
<400> 3
Met Ala Lys Asn Phe Glu Asp Phe Lys Arg Leu Tyr Pro Leu Ser Lys
1 5 10 15
Thr Leu Arg Phe Glu Ala Lys Pro Ile Gly Val Thr Leu Asp Asn Ile
20 25 30
Val Lys Ser Gly Leu Leu Asp Glu Asp Glu His Arg Ala Ala Ser Tyr
35 40 45
Val Lys Val Lys Lys Leu Ile Asp Glu Tyr His Lys Val Phe Ile Asp
50 55 60
Arg Val Leu Ala Asp Gly Cys Leu Pro Leu Lys Asn Glu Gly His Asn
65 70 75 80
Asn Ser Leu Thr Glu Tyr Tyr Asp Asn Tyr Val Ser Lys Ser Gln Asn
85 90 95
Glu Asp Ala Lys Lys Ala Phe Glu Glu Asn Gln Gln Asn Leu Arg Ser
100 105 110
Ile Ile Ala Lys Lys Leu Thr Glu Asp Lys Ala Tyr Ala Asn Leu Phe
115 120 125
Gly Lys Asn Leu Ile Glu Ser Tyr Lys Asp Lys Thr Asp Lys Thr Lys
130 135 140
Ile Ile Asp Ser Asp Leu Phe Lys Phe Ile Asn Thr Ala Glu Ser Thr
145 150 155 160
Gln Leu Asp Ser Met Ser Gln Asp Glu Ala Lys Glu Ile Val Lys Glu
165 170 175
Phe Trp Gly Phe Thr Thr Tyr Phe Val Gly Phe Phe Asp Asn Arg Lys
180 185 190
Asn Met Tyr Thr Ala Glu Glu Lys Ser Thr Gly Ile Ala Tyr Arg Leu
195 200 205
Ile Asn Glu Asn Leu Pro Lys Phe Ile Asp Asn Met Glu Ala Phe Lys
210 215 220
Lys Ala Ile Ala Arg Thr Glu Ile Gln Ala Asn Met Asp Glu Leu Tyr
225 230 235 240
Ser Asn Phe Ser Glu Tyr Leu Asn Val Glu Ser Ile Gln Glu Met Phe
245 250 255
Gln Leu Asp Tyr Tyr Asn Met Leu Leu Thr Gln Lys Gln Ile Asp Val
260 265 270
Tyr Asn Ala Ile Ile Gly Gly Lys Thr Asp Asp Glu His Asp Val Lys
275 280 285
Ile Lys Gly Ile Asn Glu Tyr Ile Asn Leu Tyr Asn Gln Gln His Lys
290 295 300
Asp Asp Lys Leu Pro Lys Leu Lys Ala Leu Phe Lys Gln Ile Leu Ser
305 310 315 320
Asp Arg Asn Ala Ile Ser Trp Leu Pro Glu Glu Phe Asn Ser Asp Gln
325 330 335
Glu Val Leu Asn Ala Ile Lys Asp Cys Tyr Glu Arg Leu Ser Glu Asn
340 345 350
Val Leu Gly Asp Lys Val Leu Lys Ser Leu Leu Gly Ser Leu Ala Asp
355 360 365
Tyr Ser Leu Asp Gly Ile Phe Ile Arg Asn Asp Leu Gln Leu Thr Asp
370 375 380
Ile Ser Gln Lys Ile Phe Gly Asn Trp Gly Val Ile Gln Asn Ala Ile
385 390 395 400
Met Gln Asn Ile Lys Arg Val Ala Pro Ala Arg Lys His Lys Glu Ser
405 410 415
Glu Glu Asp Tyr Glu Lys Arg Ile Ala Gly Ile Phe Lys Lys Ala Asp
420 425 430
Ser Phe Ser Ile Ser Tyr Ile Asn Asp Cys Leu Asn Glu Ala Asp Pro
435 440 445
Asn Asn Ala Tyr Phe Val Glu Asn Tyr Phe Ala Thr Phe Gly Ala Val
450 455 460
Asn Thr Pro Thr Met Gln Arg Glu Asn Leu Phe Ala Leu Val Gln Asn
465 470 475 480
Ala Tyr Thr Glu Val Ala Ala Leu Leu His Ser Asp Tyr Pro Thr Val
485 490 495
Lys His Leu Ala Gln Asp Lys Ala Asn Val Ser Lys Ile Lys Ala Leu
500 505 510
Leu Asp Ala Ile Lys Ser Leu Gln His Phe Val Lys Pro Leu Leu Gly
515 520 525
Lys Gly Asp Glu Ser Asp Lys Asp Glu Arg Phe Tyr Gly Glu Leu Ala
530 535 540
Ser Leu Trp Ala Glu Leu Asp Thr Val Thr Pro Leu Tyr Asn Met Ile
545 550 555 560
Arg Asn Tyr Met Thr Arg Lys Pro Tyr Ser Gln Lys Lys Ile Lys Leu
565 570 575
Asn Phe Glu Asn Pro Gln Leu Leu Gly Gly Trp Asp Ala Asn Lys Glu
580 585 590
Lys Asp Tyr Ala Thr Ile Ile Leu Arg Arg Asn Gly Leu Tyr Tyr Leu
595 600 605
Ala Ile Met Asp Lys Asp Ser Arg Lys Leu Leu Gly Lys Ala Met Pro
610 615 620
Ser Asp Gly Glu Cys Tyr Glu Lys Met Val Tyr Lys Phe Phe Lys Asp
625 630 635 640
Val Thr Thr Met Ile Pro Lys Cys Ser Thr Gln Leu Lys Asp Val Gln
645 650 655
Ala Tyr Phe Lys Val Asn Thr Asp Asp Tyr Val Leu Asn Ser Lys Ala
660 665 670
Phe Asn Lys Pro Leu Thr Ile Thr Lys Glu Val Phe Asp Leu Asn Asn
675 680 685
Val Leu Tyr Gly Lys Tyr Lys Lys Phe Gln Lys Gly Tyr Leu Thr Ala
690 695 700
Thr Gly Asp Asn Val Gly Tyr Thr His Ala Val Asn Val Trp Ile Lys
705 710 715 720
Phe Cys Met Asp Phe Leu Asn Ser Tyr Asp Ser Thr Cys Ile Tyr Asp
725 730 735
Phe Ser Ser Leu Lys Pro Glu Ser Tyr Leu Ser Leu Asp Ala Phe Tyr
740 745 750
Gln Asp Ala Asn Leu Leu Leu Tyr Lys Leu Ser Phe Ala Arg Ala Ser
755 760 765
Val Ser Tyr Ile Asn Gln Leu Val Glu Glu Gly Lys Met Tyr Leu Phe
770 775 780
Gln Ile Tyr Asn Lys Asp Phe Ser Glu Tyr Ser Lys Gly Thr Pro Asn
785 790 795 800
Met His Thr Leu Tyr Trp Lys Ala Leu Phe Asp Glu Arg Asn Leu Ala
805 810 815
Asp Val Val Tyr Lys Leu Asn Gly Gln Ala Glu Met Phe Tyr Arg Lys
820 825 830
Lys Ser Ile Glu Asn Thr His Pro Thr His Pro Ala Asn His Pro Ile
835 840 845
Leu Asn Lys Asn Lys Asp Asn Lys Lys Lys Glu Ser Leu Phe Asp Tyr
850 855 860
Asp Leu Ile Lys Asp Arg Arg Tyr Thr Val Asp Lys Phe Met Phe His
865 870 875 880
Val Pro Ile Thr Met Asn Phe Lys Ser Val Gly Leu Glu Asn Ile Asn
885 890 895
Gln Asp Val Lys Ala Tyr Leu Arg His Ala Asp Asp Met His Ile Ile
900 905 910
Gly Ile Asp Arg Gly Glu Arg His Leu Leu Tyr Leu Val Val Ile Asp
915 920 925
Leu Gln Gly Asn Ile Lys Glu Gln Tyr Ser Leu Asn Glu Ile Val Asn
930 935 940
Glu Tyr Asn Gly Asn Thr Tyr His Thr Asn Tyr His Asp Leu Leu Asp
945 950 955 960
Val Arg Glu Glu Glu Arg Leu Lys Ala Arg Gln Ser Trp Gln Thr Ile
965 970 975
Glu Asn Ile Lys Glu Leu Lys Glu Gly Tyr Leu Ser Gln Val Ile His
980 985 990
Lys Ile Thr Gln Leu Met Val Arg Tyr His Ala Ile Val Val Leu Glu
995 1000 1005
Asp Leu Ser Lys Gly Phe Met Arg Ser Arg Gln Lys Val Glu Lys
1010 1015 1020
Gln Val Tyr Gln Lys Phe Glu Lys Met Leu Ile Asp Lys Leu Asn
1025 1030 1035
Tyr Leu Val Asp Lys Lys Thr Asp Val Ser Thr Pro Gly Gly Leu
1040 1045 1050
Leu Asn Ala Tyr Gln Leu Thr Cys Lys Ser Asp Ser Ser Gln Lys
1055 1060 1065
Leu Gly Lys Gln Ser Gly Phe Leu Phe Tyr Ile Pro Ala Trp Asn
1070 1075 1080
Thr Ser Lys Ile Asp Pro Val Thr Gly Phe Val Asn Leu Leu Asp
1085 1090 1095
Thr His Ser Leu Asn Ser Lys Glu Lys Ile Lys Ala Phe Phe Ser
1100 1105 1110
Lys Phe Asp Ala Ile Arg Tyr Asn Lys Asp Lys Lys Trp Phe Glu
1115 1120 1125
Phe Asn Leu Asp Tyr Asp Lys Phe Gly Lys Lys Ala Glu Asp Thr
1130 1135 1140
Arg Thr Lys Trp Thr Leu Cys Thr Arg Gly Met Arg Ile Asp Thr
1145 1150 1155
Phe Arg Asn Lys Glu Lys Asn Ser Gln Trp Asp Asn Gln Glu Val
1160 1165 1170
Asp Leu Thr Thr Glu Met Lys Ser Leu Leu Glu His Tyr Tyr Ile
1175 1180 1185
Asp Ile His Gly Asn Leu Lys Asp Ala Ile Ser Thr Gln Thr Asp
1190 1195 1200
Lys Ala Phe Phe Thr Gly Leu Leu His Ile Leu Lys Leu Thr Leu
1205 1210 1215
Gln Met Arg Asn Ser Ile Thr Gly Thr Glu Thr Asp Tyr Leu Val
1220 1225 1230
Ser Pro Val Ala Asp Glu Asn Gly Ile Phe Tyr Asp Ser Arg Ser
1235 1240 1245
Cys Gly Asp Gln Leu Pro Glu Asn Ala Asp Ala Asn Gly Ala Tyr
1250 1255 1260
Asn Ile Ala Arg Lys Gly Leu Met Leu Ile Glu Gln Ile Lys Asn
1265 1270 1275
Ala Glu Asp Leu Asn Asn Val Lys Phe Asp Ile Ser Asn Lys Ala
1280 1285 1290
Trp Leu Asn Phe Ala Gln Gln Lys Pro Tyr Lys Asn Gly
1295 1300 1305
<210> 4
<211> 1245
<212> PRT
<213> Artificial
<220>
<223> BES4 sequence
<400> 4
Met Gln Glu Arg Lys Lys Ile Ser His Leu Thr His Arg Asn Ser Val
1 5 10 15
Lys Lys Thr Ile Arg Met Gln Leu Asn Pro Val Gly Lys Thr Met Asp
20 25 30
Tyr Phe Gln Ala Lys Gln Ile Leu Glu Asn Asp Glu Lys Leu Lys Glu
35 40 45
Asp Tyr Gln Lys Ile Lys Glu Ile Ala Asp Arg Phe Tyr Arg Asn Leu
50 55 60
Asn Glu Asp Val Leu Ser Lys Thr Gly Leu Asp Lys Leu Lys Asp Tyr
65 70 75 80
Ala Glu Ile Tyr Tyr His Cys Asn Thr Asp Ala Asp Arg Lys Arg Leu
85 90 95
Asn Glu Cys Ala Ser Glu Leu Arg Lys Glu Ile Val Lys Asn Phe Lys
100 105 110
Asn Arg Asp Glu Tyr Asn Lys Leu Phe Asn Lys Lys Met Ile Glu Ile
115 120 125
Val Leu Pro Lys His Leu Lys Asn Glu Asp Glu Lys Glu Val Val Ala
130 135 140
Ser Phe Lys Asn Phe Thr Thr Tyr Phe Thr Gly Phe Phe Thr Asn Arg
145 150 155 160
Lys Asn Met Tyr Ser Asp Gly Glu Glu Ser Thr Ala Ile Ala Tyr Arg
165 170 175
Cys Ile Asn Glu Asn Leu Pro Lys His Leu Asp Asn Val Lys Val Phe
180 185 190
Glu Lys Ala Ile Ser Lys Leu Ser Lys Asn Ala Ile Asp Asp Leu Asp
195 200 205
Ala Thr Tyr Ser Gly Leu Cys Gly Thr Asn Leu Tyr Asp Val Phe Thr
210 215 220
Val Asp Tyr Phe Asn Phe Leu Leu Pro Gln Ser Gly Ile Thr Glu Tyr
225 230 235 240
Asn Lys Ile Ile Gly Gly Tyr Thr Thr Ser Asp Gly Thr Lys Val Lys
245 250 255
Gly Ile Asn Glu Tyr Ile Asn Leu Tyr Asn Gln Gln Val Ser Lys Arg
260 265 270
Asp Lys Ile Pro Asn Leu Lys Ile Leu Tyr Lys Gln Ile Leu Ser Glu
275 280 285
Ser Glu Lys Val Ser Phe Ile Pro Pro Lys Phe Glu Asp Asp Asn Glu
290 295 300
Leu Leu Ser Ala Val Ser Glu Phe Tyr Ala Asn Asp Glu Thr Phe Asp
305 310 315 320
Gly Met Pro Leu Lys Lys Ala Ile Asp Glu Thr Lys Leu Leu Phe Gly
325 330 335
Asn Leu Asp Asn Ser Ser Leu Asn Gly Ile Tyr Ile Gln Asn Asp Arg
340 345 350
Ser Val Thr Asn Leu Ser Asn Ser Met Phe Gly Ser Trp Ser Val Ile
355 360 365
Glu Asp Leu Trp Asn Lys Asn Tyr Asp Ser Val Asn Ser Asn Ser Arg
370 375 380
Ile Lys Asp Ile Gln Lys Arg Glu Asp Lys Arg Lys Lys Ala Tyr Lys
385 390 395 400
Ala Glu Lys Lys Leu Ser Leu Ser Phe Leu Gln Val Leu Ile Ser Asn
405 410 415
Ser Glu Asn Asp Glu Ile Arg Lys Lys Ser Ile Val Asp Tyr Tyr Lys
420 425 430
Thr Ser Leu Met Gln Leu Thr Asp Asn Leu Ser Asp Lys Tyr Lys Glu
435 440 445
Ala Ala Pro Leu Phe Ser Glu Asn Tyr Asp Asn Glu Lys Gly Leu Lys
450 455 460
Asn Asp Asp Lys Ser Ile Ser Leu Ile Lys Asn Phe Leu Asp Ala Ile
465 470 475 480
Lys Glu Ile Glu Lys Phe Ile Lys Pro Leu Ser Glu Thr Asn Ile Thr
485 490 495
Gly Glu Lys Asn Asp Leu Phe Tyr Ser Gln Phe Thr Pro Leu Leu Asp
500 505 510
Asn Ile Ser Arg Ile Asp Arg Leu Tyr Asp Lys Val Arg Asn Tyr Val
515 520 525
Thr Gln Lys Pro Phe Ser Thr Asp Lys Ile Lys Leu Asn Phe Gly Asn
530 535 540
Ser Gln Leu Leu Asn Gly Trp Asp Arg Asn Lys Glu Lys Asp Cys Gly
545 550 555 560
Ala Val Leu Leu Cys Lys Asp Glu Lys Tyr Tyr Leu Ala Ile Ile Asp
565 570 575
Lys Ser Asn Asn Ser Ile Leu Glu Asn Ile Asp Phe Gln Asp Cys Asn
580 585 590
Glu Ser Asp Tyr Tyr Glu Lys Ile Val Tyr Lys Leu Leu Thr Lys Ile
595 600 605
Ser Gly Asn Leu Pro Arg Val Phe Phe Ser Glu Lys His Lys Lys Leu
610 615 620
Leu Ser Pro Ser Asp Glu Ile Leu Lys Ile Tyr Lys Ser Gly Thr Phe
625 630 635 640
Lys Lys Gly Asp Lys Phe Ser Leu Asp Asp Cys His Lys Leu Ile Asp
645 650 655
Phe Tyr Lys Glu Ser Phe Lys Lys Tyr Pro Lys Trp Leu Ile Tyr Asn
660 665 670
Phe Lys Phe Lys Asn Thr Asn Glu Tyr Asn Asp Ile Ser Glu Phe Tyr
675 680 685
Asn Asp Val Ala Ser Gln Gly Tyr Asn Ile Ser Lys Met Lys Ile Pro
690 695 700
Thr Ser Phe Ile Asp Lys Leu Val Asp Glu Gly Lys Ile Tyr Leu Phe
705 710 715 720
Gln Leu Tyr Asn Lys Asp Phe Ser Pro His Ser Lys Gly Thr Pro Asn
725 730 735
Leu His Thr Leu Tyr Phe Lys Met Leu Phe Asp Glu Arg Asn Leu Glu
740 745 750
Asp Val Val Tyr Lys Leu Asn Gly Glu Ala Glu Met Phe Tyr Arg Pro
755 760 765
Ala Ser Ile Lys Tyr Asp Lys Pro Thr His Pro Lys Asn Thr Pro Ile
770 775 780
Lys Asn Lys Asn Thr Leu Asn Asp Lys Lys Ala Ser Thr Phe Pro Tyr
785 790 795 800
Asp Leu Ile Lys Asp Lys Arg Tyr Thr Lys Trp Gln Phe Ser Leu His
805 810 815
Phe Pro Ile Thr Met Asn Phe Lys Ala Pro Asp Arg Ala Met Ile Asn
820 825 830
Asp Asp Val Arg Asn Leu Leu Lys Ser Cys Asn Asn Asn Phe Ile Ile
835 840 845
Gly Ile Asp Arg Gly Glu Arg Asn Leu Leu Tyr Val Ser Val Ile Asp
850 855 860
Ser Asn Gly Ala Ile Ile Tyr Gln His Ser Leu Asn Ile Ile Gly Asn
865 870 875 880
Lys Phe Lys Gly Lys Thr Tyr Glu Thr Asn Tyr Arg Glu Lys Leu Ala
885 890 895
Thr Arg Glu Lys Glu Arg Thr Glu Gln Arg Arg Asn Trp Lys Ala Ile
900 905 910
Glu Ser Ile Lys Glu Leu Lys Glu Gly Tyr Ile Ser Gln Thr Val His
915 920 925
Val Ile Cys Gln Leu Val Val Lys Tyr Asp Ala Ile Ile Val Met Glu
930 935 940
Lys Leu Thr Asp Gly Phe Lys Arg Gly Arg Thr Lys Phe Glu Lys Gln
945 950 955 960
Val Tyr Gln Lys Phe Glu Lys Met Leu Ile Asp Lys Leu Asn Tyr Tyr
965 970 975
Val Asp Lys Lys Leu Asp Pro Asn Glu Glu Gly Gly Leu Leu His Ala
980 985 990
Tyr Gln Leu Thr Asn Lys Leu Asp Ser Phe Asp Lys Leu Gly Met Gln
995 1000 1005
Ser Gly Phe Ile Phe Tyr Val Arg Pro Asp Phe Thr Ser Lys Ile
1010 1015 1020
Asp Pro Val Thr Gly Phe Val Asn Leu Leu Tyr Pro Arg Tyr Glu
1025 1030 1035
Asn Ile Asp Lys Ala Lys Asp Met Ile Ser Arg Phe Asp Asp Ile
1040 1045 1050
Arg Tyr Asn Ala Gly Glu Asp Phe Phe Glu Phe Asp Ile Asp Tyr
1055 1060 1065
Asp Lys Phe Pro Lys Thr Ala Ser Asp Tyr Arg Lys Lys Trp Thr
1070 1075 1080
Ile Cys Thr Asn Gly Glu Arg Ile Glu Ala Phe Arg Asn Pro Ala
1085 1090 1095
Asn Asn Asn Glu Trp Ser Tyr Arg Thr Ile Ile Leu Ala Glu Lys
1100 1105 1110
Phe Lys Glu Leu Phe Asp Asn Asn Ser Ile Asn Tyr Arg Asp Ser
1115 1120 1125
Asp Asp Leu Lys Ala Glu Ile Leu Ser Gln Thr Lys Gly Lys Phe
1130 1135 1140
Phe Glu Asp Phe Phe Lys Leu Leu Arg Leu Thr Leu Gln Met Arg
1145 1150 1155
Asn Ser Asn Pro Glu Thr Gly Glu Asp Arg Ile Leu Ser Pro Val
1160 1165 1170
Lys Asp Lys Asn Gly Asn Phe Tyr Asp Ser Ser Lys Tyr Asp Glu
1175 1180 1185
Lys Ser Lys Leu Pro Cys Asp Ala Asp Ala Asn Gly Ala Tyr Asn
1190 1195 1200
Ile Ala Arg Lys Gly Leu Trp Ile Val Glu Gln Phe Lys Lys Ala
1205 1210 1215
Asp Asn Val Ser Ala Val Glu Pro Val Ile His Asn Asp Lys Trp
1220 1225 1230
Leu Lys Phe Val Gln Glu Asn Asp Met Ala Asn Asn
1235 1240 1245
<210> 5
<211> 50
<212> DNA
<213> Artificial
<220>
<223> crRNA sequence
<220>
<221> misc_feature
<222> (1)..(26)
<223> n is a, c, g, t or u
<400> 5
nnnnnnnnnn nnnnnnnnnn nnnnnnguuu uaguacucug uaauuuuucg 50
<210> 6
<211> 177
<212> DNA
<213> Artificial
<220>
<223> tracrRNA:
<400> 6
agauuuuacc auagcgaaag guuacagaau cuacuaaaau aagacuuuau gucgaaauca 60
cuacuuuuaa guaguuauua acaauaguau auguaaauug aguuaguagu acauauuacu 120
aauguuuuuu gugugaaauu uugagcacgg gucuuaugau cugugcucuu uuuguuu 177
<210> 7
<211> 36
<212> DNA
<213> Artificial
<220>
<223> crispr_repeat
<400> 7
guuuuaucau agcgaaaaau uacagaguac uaaaac 36
<210> 8
<211> 42
<212> DNA
<213> Artificial
<220>
<223> crRNA
<220>
<221> misc_feature
<222> (1)..(20)
<223> n is a, c, g, t or u
<400> 8
nnnnnnnnnn nnnnnnnnnn guuuuggagc agugucguuc ug 42
<210> 9
<211> 102
<212> DNA
<213> Artificial
<220>
<223> tracrRNA
<400> 9
ggacgacacu gcgagucaaa auacggcuuu gccaaaaaug ccuccgggcg ccacguaggu 60
ggcaauuuga cuugccaagg gcccucaaug agggcccuuu uu 102
<210> 10
<211> 47
<212> DNA
<213> Artificial
<220>
<223> crispr_repeat
<400> 10
guuguggucu gcuuucauuu aaguaucuuu gaaccauugg aaacagu 47
<210> 11
<211> 36
<212> DNA
<213> Artificial
<220>
<223> crispr-repeat
<400> 11
atctacaata gtagaaatta ttgaagcata ctagcc 36
<210> 12
<211> 38
<212> DNA
<213> Artificial
<220>
<223> crispr-repeat
<400> 12
atctacaata gtagaaatta tatagggtta ttaaacat 38
<210> 13
<211> 68
<212> DNA
<213> Artificial
<220>
<223> crRNA
<400> 13
ttctaatacg actcactata ggtaaaatag tacatttata gaaaggtttt agtactctgt 60
aatttttc 68
<210> 14
<211> 82
<212> DNA
<213> Artificial
<220>
<223> tracrRNA:
<400> 14
ttctaatacg actcactata ggaaaggtta cagaatctac taaaataaga ctttatgtcg 60
aaatcactac ttttaagtag tt 82
<210> 15
<211> 117
<212> DNA
<213> Artificial
<220>
<223> sgRNA-1
<400> 15
ttctaatacg actcactata ggctcaaaag ggaactgcta ccgaagtttt agtactctgt 60
gaaaacagaa tctactaaaa taagacttta tgtcgaaatc actactttta agtagtt 117
<210> 16
<211> 131
<212> DNA
<213> Artificial
<220>
<223> sgRNA-3
<400> 16
ttctaatacg actcactata ggctcaaaag ggaactgcta ccgaagtttt agtactctgt 60
aatttttcaa aaaaggttac agaatctact aaaataagac tttatgtcga aatcactact 120
tttaagtagt t 131
<210> 17
<211> 59
<212> DNA
<213> Artificial
<220>
<223> PAM_AF13-2_1
<400> 17
tgtgagccaa ggagttggcc taggcaattg tcttcctaag accgcttggc ctccgactt 59
<210> 18
<211> 59
<212> DNA
<213> Artificial
<220>
<223> PAM_AF13-2_2/1
<220>
<221> misc_feature
<222> (24)..(38)
<223> n is a, c, g, t or u
<400> 18
ttcggtagca gttccctttt gagnnnnnnn nnnnnnnnaa gtcggaggcc aagcggtct 59
<210> 19
<211> 59
<212> DNA
<213> Artificial
<220>
<223> PAM_AF13-2_2/2
<220>
<221> misc_feature
<222> (22)..(36)
<223> n is a, c, g, or t
<400> 19
agaccgcttg gcctccgact tnnnnnnnnn nnnnnnctca aaagggaact gctaccgaa 59
<210> 20
<211> 59
<212> DNA
<213> Artificial
<220>
<223> PAM_AF13-2_3
<220>
<221> misc_feature
<222> (26)..(40)
<223> n is a, c, g, t or u
<400> 20
gaacgacatg gctacgatcc gacttnnnnn nnnnnnnnnn ttcggtagca gttcccttt 59
<210> 21
<211> 74
<212> DNA
<213> Artificial
<220>
<223> tracrRNA-S
<400> 21
ttctaatacg actcactata ggaaaggtta cagaatctac taaaataaga ctttatgtcg 60
aaatcactac tttt 74
<210> 22
<211> 123
<212> DNA
<213> Artificial
<220>
<223> sgRNA-2
<400> 22
ttctaatacg actcactata ggctcaaaag ggaactgcta ccgaagtttt agtactctgt 60
aatttttcaa aaaaggttac agaatctact aaaataagac tttatgtcga aatcactact 120
ttt 123
<210> 23
<211> 842
<212> DNA
<213> Artificial
<220>
<223> 842 bp cleavage substrate sequence
<220>
<221> misc_feature
<222> (432)..(438)
<223> n is a, c, g, t or u
<400> 23
ctggcctttt gctcacatgt tctttcctgc gttatcccct gattctgtgg ataaccgtat 60
taccgccttt gagtgagctg ataccgctcg ccgcagccga acgaccgagc gcagcgagtc 120
agtgagcgag gaagcggaag agcgcccaat acgcaaaccg cctctccccg cgcgttggcc 180
gattcattaa tgcagctggc acgacaggtt tcccgactgg aaagcgggca gtgagcgcaa 240
cgcaattaat gtgagttagc tcactcatta ggcaccccag gctttacact ttatgcttcc 300
ggctcgtatg ttgtgtggaa ttgtgagcgg ataacaattt cacacaggaa acagctatga 360
ccatgattac gccaagtttg cacgcctgcc gttcgacgat tgtagtagct caaaagggaa 420
ctgctaccga annnnnnnaa tctctggaag atccgcgcgt accgagttct aattcactgg 480
ccgtcgtttt acaacgtcgt gactgggaaa accctggcgt tacccaactt aatcgccttg 540
cagcacatcc ccctttcgcc agctggcgta atagcgaaga ggcccgcacc gatcgccctt 600
cccaacagtt gcgcagcctg aatggcgaat ggcgcctgat gcggtatttt ctccttacgc 660
atctgtgcgg tatttcacac cgcatatggt gcactctcag tacaatctgc tctgatgccg 720
catagttaag ccagccccga cacccgccaa cacccgctga cgcgccctga cgggcttgtc 780
tgctcccggc atccgcttac agacaagctg tgaccgtctc cgggagctgc atgtgtcaga 840
gg 842
<210> 24
<211> 169
<212> DNA
<213> Artificial
<220>
<223> BES2-gRNA sequence
<400> 24
ttctaatacg actcactata ggaaaaggga actgctaccg aagttttgga gcagtgtcgt 60
tctgaaagga cgacactgcg agtcaaaata cggctttgcc aaaaatgcct ccgggcgcca 120
cgtaggtggc aatttgactt gccaagggcc ctcaatgagg gcccttttt 169
<210> 25
<211> 65
<212> DNA
<213> Artificial
<220>
<223> BES4-crRNA
<400> 25
ttctaatacg actcactata ggaatttcta ctattgtaga tttcggtagc agttcccttt 60
tgagc 65
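The transcription templates in this listing share a common 5' prefix that contains the T7 promoter consensus (taatacgactcactatag). The sketch below splits SEQ ID NO:25 into that prefix, a 19-nt repeat-derived segment, and a 24-nt guide segment, and then checks that the reverse complement of the guide segment lies immediately upstream of the randomized positions (432)..(438) in the SEQ ID NO:23 substrate. This three-part split is an interpretation of the sequences and is not stated in the listing; the variable and function names are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a lowercase DNA string (n stays n)."""
    return seq.translate(str.maketrans("acgtn", "tgcan"))[::-1]


# SEQ ID NO:25 (BES4-crRNA template), copied from the listing
seq_id_25 = ("ttctaatacgactcactataggaatttctactattgtagatttcggtagcagttcccttt"
             "tgagc")

# Positions 361..480 of SEQ ID NO:23 (cleavage substrate), copied from the listing
substrate_excerpt = (
    "ccatgattacgccaagtttgcacgcctgccgttcgacgattgtagtagctcaaaagggaa"
    "ctgctaccgaannnnnnnaatctctggaagatccgcgcgtaccgagttctaattcactgg"
)

# Assumed segment boundaries (interpretation, not stated in the source)
t7_prefix = "ttctaatacgactcactatagg"
repeat_segment = "aatttctactattgtagat"

if not seq_id_25.startswith(t7_prefix + repeat_segment):
    raise ValueError("template does not start with the assumed prefix and repeat")

guide_segment = seq_id_25[len(t7_prefix) + len(repeat_segment):]
target_site = reverse_complement(guide_segment)

# Locate the target site in the substrate and read the 7 positions after it
start = substrate_excerpt.find(target_site)
flank = substrate_excerpt[start + len(target_site): start + len(target_site) + 7]

print("guide segment:", guide_segment)
print("target site  :", target_site)
print("3' flank     :", flank)
```

Under this reading, the randomized block sits directly next to the site matched by the guide, which is what a PAM-screening substrate would be expected to look like.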
<210> 26
<211> 65
<212> DNA
<213> Artificial
<220>
<223> BES6-crRNA
<400> 26
ttctaatacg actcactata ggaatttcta ctattgtaga tttcggtagc agttcccttt 60
tgagc 65
<210> 27
<211> 1475
<212> DNA
<213> Artificial
<220>
<223> Nucleotide sequence of the AAVS1 target region
<400> 27
cccttgctct ctgctgtgtt gctgcccaag gatgctcttt ccggagcact tccttctcgg 60
cgctgcacca cgtgatgtcc tctgagcgga tcctccccgt gtctgggtcc tctccgggca 120
tctctcctcc ctcacccaac cccatgccgt cttcactcgc tgggttccct tttccttctc 180
cttctggggc ctgtgccatc tctcgtttct taggatggcc ttctccgacg gatgtctccc 240
ttgcgtcccg cctccccttc ttgtaggcct gcatcatcac cgtttttctg gacaacccca 300
aagtaccccg tctccctggc tttagccacc tctccatcct cttgctttct ttgcctggac 360
accccgttct cctgtggatt cgggtcacct ctcactcctt tcatttgggc agctccccta 420
ccccccttac ctctctagtc tgtgctagct cttccagccc cctgtcatgg catcttccag 480
gggtccgaga gctcagctag tcttcttcct ccaacccggg cccctatgtc cacttcagga 540
cagcatgttt gctgcctcca gggatcctgt gtccccgagc tgggaccacc ttatattccc 600
agggccggtt aatgtggctc tggttctggg tacttttatc tgtcccctcc accccacagt 660
ggggccacta gggacaggat tggtgacaga aaagccccat ccttaggcct cctccttcct 720
agtctcctga tattgggtct aacccccacc tcctgttagg cagattcctt atctggtgac 780
acacccccat ttcctggagc catctctctc cttgccagaa cctctaaggt ttgcttacga 840
tggagccaga gaggatcctg ggagggagag cttggcaggg ggtgggaggg aaggggggga 900
tgcgtgacct gcccggttct cagtggccac cctgcgctac cctctcccag aacctgagct 960
gctctgacgc ggctgtctgg tgcgtttcac tgatcctggt gctgcagctt ccttacactt 1020
cccaagagga gaagcagttt ggaaaaacaa aatcagaata agttggtcct gagttctaac 1080
tttggctctt cacctttcta gtccccaatt tatattgttc ctccgtgcgt cagttttacc 1140
tgtgagataa ggccagtagc cacccccgtc ctggcagggc tgtggtgagg aggggggtgt 1200
ccgtgtggaa aactcccttt gtgagaatgg tgcgtcctag gtgttcacca ggtcgtggcc 1260
gcctctactc cctttctctt tctccatcct tctttcctta aagagccccc agtgctatct 1320
ggacatattc ctccgcccag agcagggtcc gcttccctaa ggccctgctc tgggcttctg 1380
ggtttgagtc cttgcaagcc caggagagcg ctagcttccc tgtccccctt cctcgtccac 1440
catctcatgc cctggctctc ctgccccttc ctaca 1475
<210> 28
<211> 65
<212> DNA
<213> Artificial
<220>
<223> BES6-AAVS1-crRNA4
<400> 28
ttctaatacg actcactata ggaatttcta ctattgtaga tggcagctcc cctacccccc 60
ttacc 65
<210> 29
<211> 118
<212> DNA
<213> Artificial
<220>
<223> spCas9-AAVS1-crRNA sequence
<400> 29
ttctaatacg actcactata ggggggccac tagggacagg atgttttaga gctagaaata 60
gcaagttaaa ataaggctag tccgttatca acttgaaaaa gtggcaccga gtcggtgc 118
<210> 30
<211> 20
<212> DNA
<213> Artificial
<220>
<223> AAVS1-F1
<400> 30
cccttgctct ctgctgtgtt 20
<210> 31
<211> 20
<212> DNA
<213> Artificial
<220>
<223> AAVS1-R8
<400> 31
tgtaggaagg ggcaggagag 20
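The two primers above (SEQ ID NO:30 and SEQ ID NO:31) appear to correspond to the two ends of the AAVS1 target region (SEQ ID NO:27). The sketch below checks this against the first and last rows of SEQ ID NO:27 only, so it confirms the primer positions but does not reproduce the full 1475-nt amplicon. The function and variable names are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a lowercase DNA string."""
    return seq.translate(str.maketrans("acgt", "tgca"))[::-1]


# Primers copied from the listing
aavs1_f1 = "cccttgctctctgctgtgtt"   # SEQ ID NO:30
aavs1_r8 = "tgtaggaaggggcaggagag"   # SEQ ID NO:31

# First row (positions 1..60) and last row (positions 1441..1475) of SEQ ID NO:27
region_5p_end = "cccttgctctctgctgtgttgctgcccaaggatgctctttccggagcacttccttctcgg"
region_3p_end = "catctcatgccctggctctcctgccccttcctaca"

# The forward primer should match the top strand at the 5' end;
# the reverse primer should match the bottom strand at the 3' end.
forward_ok = region_5p_end.startswith(aavs1_f1)
reverse_ok = region_3p_end.endswith(reverse_complement(aavs1_r8))

print("forward primer at 5' end:", forward_ok)
print("reverse primer at 3' end:", reverse_ok)
```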
<210> 32
<211> 1086
<212> DNA
<213> Artificial
<220>
<223> Nucleotide sequence of the HBG target region
<400> 32
ccctgctgtg ctcagatcaa tactccgttg tctaagttgc ctcgagacta aaggcaacag 60
ggctgaaaca tctcctggac tcaccttgaa gttctcagga tccacatgca gcttgtcaca 120
gtgcagttca ctcagctggg caaaggtgcc cttgagatca tccaggtgct ttgtggcatc 180
tcccaaggaa gtcagcacct tcttgccatg tgccttgact ttggggttgc ccatgatggc 240
agaggcagag gacaggttgc caaagctgtc aaagaacctc tgggtccatg ggtagacaac 300
caggagcctg tgagattgac aagaacagtt tgacagtcag aaggtgccac aaatcctgag 360
aagcgacctg gacttttgcc aggcacaggg tccttccttc cctcccttgt cctggtcacc 420
agagcctacc ttcccagggt ttctcctcca gcatcttcca cattcacctt gccccacagg 480
cttgtgatag tagccttgtc ctcctctgtg aaatgaccca tggcgtctgg actaggagct 540
tattgataac ctcagacgtt ccagaagcga gtgtgtggaa ctgctgaagg gtgcttcctt 600
ttattcttca tccctagcca gccgccggcc cctggcctca ctggatactc taagactatt 660
ggtcaagttt gccttgtcaa ggctattggt caaggcaagg ctggccaacc catgggtgga 720
gtttagccag ggaccgtttc agacagatat ttgcattgag atagtgtggg gaaggggccc 780
ccaagaggat actgctaatt ttttttatag cctttgcctt gttccgattc agtcattcca 840
gtttttctct aatttattct tccctttagc tagtttcctt ctcccatcat agaggatacc 900
aggacttctt ttgtcagccg ttttttacct tcttgtctct agctccagtg aggcctgtag 960
tttaaagcta aagcatgtac caatttttga aaagttcagg gattgtgaaa tgtgttttag 1020
gcataggtcc aggatttttg acgggacaaa tcttagtctc tttcagttag cagtggtttc 1080
taagga 1086
<210> 33
<211> 8871
<212> DNA
<213> Artificial
<220>
<223> BES4-HBG-sg01
<400> 33
gagggcctat ttcccatgat tccttcatat ttgcatatac gatacaaggc tgttagagag 60
ataattggaa ttaatttgac tgtaaacaca aagatattag tacaaaatac gtgacgtaga 120
aagtaataat ttcttgggta gtttgcagtt ttaaaattat gttttaaaat ggactatcat 180
atgcttaccg taacttgaaa gtatttcgat ttcttggctt tatatatctt gtggaaagga 240
cgaaacaccg aatttctact attgtagatg ccagccttgc cttgaccaat agttttttgt 300
tttagagcta gaaatagcaa gttaaaataa ggctagtccg tttttagcgc gtgcgccaat 360
tctgcagaca aatggctcta gaggtacccg ttacataact tacggtaaat ggcccgcctg 420
gctgaccgcc caacgacccc cgcccattga cgtcaatagt aacgccaata gggactttcc 480
attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt 540
atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt 600
gtgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 660
tcgctattac catggtcgag gtgagcccca cgttctgctt cactctcccc atctcccccc 720
cctccccacc cccaattttg tatttattta ttttttaatt attttgtgca gcgatggggg 780
cggggggggg gggggggcgc gcgccaggcg gggcggggcg gggcgagggg cggggcgggg 840
cgaggcggag aggtgcggcg gcagccaatc agagcggcgc gctccgaaag tttcctttta 900
tggcgaggcg gcggcggcgg cggccctata aaaagcgaag cgcgcggcgg gcgggagtcg 960
ctgcgcgctg ccttcgcccc gtgccccgct ccgccgccgc ctcgcgccgc ccgccccggc 1020
tctgactgac cgcgttactc ccacaggtga gcgggcggga cggcccttct cctccgggct 1080
gtaattagct gagcaagagg taagggttta agggatggtt ggttggtggg gtattaatgt 1140
ttaattacct ggagcacctg cctgaaatca ctttttttca ggttggaccg gtgccaccat 1200
ggactataag gaccacgacg gagactacaa ggatcatgat attgattaca aagacgatga 1260
cgataagatg gccccaaaga agaagcggaa ggtcggtatc cacggagtcc cagcagccat 1320
gcaggagaga aagaagatca gccacctgac ccacagaaac agcgtgaaga aaaccatcag 1380
aatgcagctg aaccccgtgg gaaagaccat ggactacttc caggccaagc agatcctgga 1440
gaacgacgag aagctgaagg aggactacca gaagatcaag gagatcgccg acagattcta 1500
cagaaacctg aacgaggacg tgctgagcaa aaccggactg gacaagctga aggactacgc 1560
cgagatctac taccattgca acaccgacgc cgacagaaag agactgaacg agtgcgccag 1620
cgagctgaga aaggagatcg tgaagaactt caagaacaga gatgagtaca acaagctgtt 1680
caacaagaag atgatcgaga tcgtgctgcc caagcacctg aagaacgagg acgagaagga 1740
agtggtggcc agcttcaaga acttcaccac ctacttcacc ggcttcttca ccaacagaaa 1800
gaacatgtac agcgacggcg aagagtctac cgctattgcc tacagatgca tcaacgagaa 1860
cctgcccaag cacctggaca acgtgaaggt gttcgagaag gccatcagca agctgagcaa 1920
gaacgccatc gacgacctgg atgccacata ttctggcctg tgcggcacaa atctgtacga 1980
cgtgttcacc gtggactact tcaacttcct gctgccccaa agcggaatca ccgagtacaa 2040
caagatcatc ggcggctaca caacaagcga cggcaccaaa gtgaagggca tcaacgagta 2100
catcaacctg tacaaccagc aggtgagcaa gagagacaag atccccaacc tgaagatcct 2160
gtacaagcag atcctgagcg agagcgagaa ggtgtctttc atccccccca agttcgagga 2220
cgacaacgaa ctgctgtctg ccgtgagcga gttctatgcc aacgacgaga catttgatgg 2280
catgcccctg aagaaagcca tcgacgaaac caaactgctg ttcggcaacc tggacaacag 2340
cagcctgaac ggcatctaca tccagaacga cagaagcgtg accaacctga gcaacagcat 2400
gttcggcagc tggagcgtga ttgaggacct gtggaacaag aactacgaca gcgtgaacag 2460
caacagcaga atcaaggaca tccagaagag agaggacaag agaaagaagg cctacaaggc 2520
cgagaagaag ctgagcctga gcttcctgca ggtgctgatc agcaacagcg agaacgacga 2580
gatcagaaag aagagcatcg tggactacta caagaccagc ctgatgcagc tgaccgacaa 2640
cctgagcgac aagtacaaag aagccgcccc cctgttttct gagaactacg acaacgagaa 2700
gggcctgaag aacgacgaca agagcatcag cctgatcaag aacttcctgg acgccatcaa 2760
ggagatcgag aagttcatca agcccctgag cgagacaaat atcaccggcg agaagaacga 2820
cctgttctac agccagttca cccccctgct ggacaacatc agcagaatcg acagactgta 2880
cgacaaggtg agaaactacg tgacccagaa gcccttcagc accgacaaga tcaagctgaa 2940
cttcggcaac agccagcttc tgaacggctg ggacagaaac aaggagaagg actgtggcgc 3000
tgtgctgctg tgtaaggacg agaagtacta cctggccatc atcgacaaga gcaacaacag 3060
catcctggag aacatcgact tccaggactg caacgagagc gactactacg agaagatcgt 3120
gtacaagctg ctgaccaaga tctctggcaa cctgcccaga gtgttcttca gcgagaagca 3180
caagaagctg ctgagcccca gcgatgagat cctgaagatc tacaagagcg gcaccttcaa 3240
gaagggcgac aagttcagcc ttgacgactg ccacaagctg atcgacttct acaaggagag 3300
cttcaagaag taccccaagt ggctgatcta caacttcaag ttcaagaaca ccaacgagta 3360
caacgacatc agcgagttct acaacgacgt ggccagccag ggatacaaca tcagcaagat 3420
gaagatcccc accagcttca tcgacaagct ggtggacgag ggcaagatct acctgttcca 3480
gctgtacaac aaggacttca gcccccacag caagggaaca cctaacctgc acaccctgta 3540
cttcaagatg ctgttcgacg agagaaacct ggaggacgtg gtgtacaagc tgaatggcga 3600
ggccgagatg ttttacagac ccgccagcat caagtatgac aagcccaccc accctaagaa 3660
cacccccatc aagaacaaga acaccctgaa cgacaagaag gccagcacct tcccctacga 3720
cctgatcaag gacaagagat acaccaagtg gcagttcagc ctgcacttcc ccatcaccat 3780
gaacttcaag gcccccgaca gagccatgat caacgacgac gtgagaaacc tgctgaagag 3840
ctgcaacaac aacttcatca tcggcatcga cagaggcgag agaaacctgc tgtacgtgag 3900
cgtgatcgat agcaacggcg ccatcatcta ccagcacagc ctgaacatca tcggcaacaa 3960
gttcaagggc aagacctacg aaaccaacta cagagagaag ctggccacca gagagaagga 4020
gagaaccgag cagagaagaa actggaaggc catcgagagc atcaaggagc tgaaggaggg 4080
ctacatcagc caaaccgtgc acgtgatttg ccagctggtg gtgaagtacg acgccatcat 4140
cgtgatggag aagctgaccg acggcttcaa gagaggcaga accaagttcg agaagcaggt 4200
gtaccagaag ttcgagaaga tgctgatcga caagctgaac tactacgtgg acaagaagct 4260
ggaccccaat gaggaaggcg gactgctgca tgcttatcag ctgaccaaca agctggacag 4320
cttcgacaag ctgggaatgc agagcggctt catcttctac gtcagacccg acttcaccag 4380
caaaatcgac cccgtgaccg gatttgtgaa cctgctgtac cccagatacg agaacatcga 4440
caaggccaag gacatgatca gcagattcga cgacatcaga tacaacgccg gcgaggactt 4500
cttcgagttc gacatcgact acgacaagtt ccccaagacc gccagcgact acagaaagaa 4560
gtggaccatc tgcaccaacg gcgagagaat cgaggccttc agaaaccccg ccaacaacaa 4620
cgagtggagc tacagaacca tcatcctggc cgagaagttc aaggagctgt tcgacaacaa 4680
cagcatcaac tacagagaca gcgacgacct gaaagccgag atcctgagcc aaaccaaggg 4740
caagttcttc gaggacttct tcaagctgct gagactgacc ctgcagatga gaaacagcaa 4800
ccccgaaacc ggagaggaca ggattctgag ccccgtgaag gacaagaacg gcaacttcta 4860
cgacagcagc aagtacgacg agaagagcaa gctgccctgt gacgctgatg ctaacggcgc 4920
ttacaacatc gccagaaagg gcctgtggat cgtggagcag ttcaagaagg ccgacaacgt 4980
gtctgctgtg gaacccgtga tccacaacga caagtggctg aagttcgtgc aggagaacga 5040
catggccaac aacaaaaggc cggcggccac gaaaaaggcc ggccaggcaa aaaagaaaaa 5100
ggaattcggc agtggagagg gcagaggaag tctgctaaca tgcggtgacg tcgaggagaa 5160
tcctggccca gtgagcaagg gcgaggagct gttcaccggg gtggtgccca tcctggtcga 5220
gctggacggc gacgtaaacg gccacaagtt cagcgtgtcc ggcgagggcg agggcgatgc 5280
cacctacggc aagctgaccc tgaagttcat ctgcaccacc ggcaagctgc ccgtgccctg 5340
gcccaccctc gtgaccaccc tgacctacgg cgtgcagtgc ttcagccgct accccgacca 5400
catgaagcag cacgacttct tcaagtccgc catgcccgaa ggctacgtcc aggagcgcac 5460
catcttcttc aaggacgacg gcaactacaa gacccgcgcc gaggtgaagt tcgagggcga 5520
caccctggtg aaccgcatcg agctgaaggg catcgacttc aaggaggacg gcaacatcct 5580
ggggcacaag ctggagtaca actacaacag ccacaacgtc tatatcatgg ccgacaagca 5640
gaagaacggc atcaaggtga acttcaagat ccgccacaac atcgaggacg gcagcgtgca 5700
gctcgccgac cactaccagc agaacacccc catcggcgac ggccccgtgc tgctgcccga 5760
caaccactac ctgagcaccc agtccgccct gagcaaagac cccaacgaga agcgcgatca 5820
catggtcctg ctggagttcg tgaccgccgc cgggatcact ctcggcatgg acgagctgta 5880
caaggaattc taactagagc tcgctgatca gcctcgactg tgccttctag ttgccagcca 5940
tctgttgttt gcccctcccc cgtgccttcc ttgaccctgg aaggtgccac tcccactgtc 6000
ctttcctaat aaaatgagga aattgcatcg cattgtctga gtaggtgtca ttctattctg 6060
gggggtgggg tggggcagga cagcaagggg gaggattggg aagagaatag caggcatgct 6120
ggggagcggc cgcaggaacc cctagtgatg gagttggcca ctccctctct gcgcgctcgc 6180
tcgctcactg aggccgggcg accaaaggtc gcccgacgcc cgggctttgc ccgggcggcc 6240
tcagtgagcg agcgagcgcg cagctgcctg caggggcgcc tgatgcggta ttttctcctt 6300
acgcatctgt gcggtatttc acaccgcata cgtcaaagca accatagtac gcgccctgta 6360
gcggcgcatt aagcgcggcg ggtgtggtgg ttacgcgcag cgtgaccgct acacttgcca 6420
gcgccttagc gcccgctcct ttcgctttct tcccttcctt tctcgccacg ttcgccggct 6480
ttccccgtca agctctaaat cgggggctcc ctttagggtt ccgatttagt gctttacggc 6540
acctcgaccc caaaaaactt gatttgggtg atggttcacg tagtgggcca tcgccctgat 6600
agacggtttt tcgccctttg acgttggagt ccacgttctt taatagtgga ctcttgttcc 6660
aaactggaac aacactcaac tctatctcgg gctattcttt tgatttataa gggattttgc 6720
cgatttcggt ctattggtta aaaaatgagc tgatttaaca aaaatttaac gcgaatttta 6780
acaaaatatt aacgtttaca attttatggt gcactctcag tacaatctgc tctgatgccg 6840
catagttaag ccagccccga cacccgccaa cacccgctga cgcgccctga cgggcttgtc 6900
tgctcccggc atccgcttac agacaagctg tgaccgtctc cgggagctgc atgtgtcaga 6960
ggttttcacc gtcatcaccg aaacgcgcga gacgaaaggg cctcgtgata cgcctatttt 7020
tataggttaa tgtcatgata ataatggttt cttagacgtc aggtggcact tttcggggaa 7080
atgtgcgcgg aacccctatt tgtttatttt tctaaataca ttcaaatatg tatccgctca 7140
tgagacaata accctgataa atgcttcaat aatattgaaa aaggaagagt atgagtattc 7200
aacatttccg tgtcgccctt attccctttt ttgcggcatt ttgccttcct gtttttgctc 7260
acccagaaac gctggtgaaa gtaaaagatg ctgaagatca gttgggtgca cgagtgggtt 7320
acatcgaact ggatctcaac agcggtaaga tccttgagag ttttcgcccc gaagaacgtt 7380
ttccaatgat gagcactttt aaagttctgc tatgtggcgc ggtattatcc cgtattgacg 7440
ccgggcaaga gcaactcggt cgccgcatac actattctca gaatgacttg gttgagtact 7500
caccagtcac agaaaagcat cttacggatg gcatgacagt aagagaatta tgcagtgctg 7560
ccataaccat gagtgataac actgcggcca acttacttct gacaacgatc ggaggaccga 7620
aggagctaac cgcttttttg cacaacatgg gggatcatgt aactcgcctt gatcgttggg 7680
aaccggagct gaatgaagcc ataccaaacg acgagcgtga caccacgatg cctgtagcaa 7740
tggcaacaac gttgcgcaaa ctattaactg gcgaactact tactctagct tcccggcaac 7800
aattaataga ctggatggag gcggataaag ttgcaggacc acttctgcgc tcggcccttc 7860
cggctggctg gtttattgct gataaatctg gagccggtga gcgtggaagc cgcggtatca 7920
ttgcagcact ggggccagat ggtaagccct cccgtatcgt agttatctac acgacgggga 7980
gtcaggcaac tatggatgaa cgaaatagac agatcgctga gataggtgcc tcactgatta 8040
agcattggta actgtcagac caagtttact catatatact ttagattgat ttaaaacttc 8100
atttttaatt taaaaggatc taggtgaaga tcctttttga taatctcatg accaaaatcc 8160
cttaacgtga gttttcgttc cactgagcgt cagaccccgt agaaaagatc aaaggatctt 8220
cttgagatcc tttttttctg cgcgtaatct gctgcttgca aacaaaaaaa ccaccgctac 8280
cagcggtggt ttgtttgccg gatcaagagc taccaactct ttttccgaag gtaactggct 8340
tcagcagagc gcagatacca aatactgttc ttctagtgta gccgtagtta ggccaccact 8400
tcaagaactc tgtagcaccg cctacatacc tcgctctgct aatcctgtta ccagtggctg 8460
ctgccagtgg cgataagtcg tgtcttaccg ggttggactc aagacgatag ttaccggata 8520
aggcgcagcg gtcgggctga acggggggtt cgtgcacaca gcccagcttg gagcgaacga 8580
cctacaccga actgagatac ctacagcgtg agctatgaga aagcgccacg cttcccgaag 8640
ggagaaaggc ggacaggtat ccggtaagcg gcagggtcgg aacaggagag cgcacgaggg 8700
agcttccagg gggaaacgcc tggtatcttt atagtcctgt cgggtttcgc cacctctgac 8760
ttgagcgtcg atttttgtga tgctcgtcag gggggcggag cctatggaaa aacgccagca 8820
acgcggcctt tttacggttc ctggcctttt gctggccttt tgctcacatg t 8871
<210> 34
<211> 8871
<212> DNA
<213> Artificial
<220>
<223> BES4-HBG-sg02
<400> 34
gagggcctat ttcccatgat tccttcatat ttgcatatac gatacaaggc tgttagagag 60
ataattggaa ttaatttgac tgtaaacaca aagatattag tacaaaatac gtgacgtaga 120
aagtaataat ttcttgggta gtttgcagtt ttaaaattat gttttaaaat ggactatcat 180
atgcttaccg taacttgaaa gtatttcgat ttcttggctt tatatatctt gtggaaagga 240
cgaaacaccg aatttctact attgtagata ccaatagcct tgacaaggca aattttttgt 300
tttagagcta gaaatagcaa gttaaaataa ggctagtccg tttttagcgc gtgcgccaat 360
tctgcagaca aatggctcta gaggtacccg ttacataact tacggtaaat ggcccgcctg 420
gctgaccgcc caacgacccc cgcccattga cgtcaatagt aacgccaata gggactttcc 480
attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt 540
atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt 600
gtgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 660
tcgctattac catggtcgag gtgagcccca cgttctgctt cactctcccc atctcccccc 720
cctccccacc cccaattttg tatttattta ttttttaatt attttgtgca gcgatggggg 780
cggggggggg gggggggcgc gcgccaggcg gggcggggcg gggcgagggg cggggcgggg 840
cgaggcggag aggtgcggcg gcagccaatc agagcggcgc gctccgaaag tttcctttta 900
tggcgaggcg gcggcggcgg cggccctata aaaagcgaag cgcgcggcgg gcgggagtcg 960
ctgcgcgctg ccttcgcccc gtgccccgct ccgccgccgc ctcgcgccgc ccgccccggc 1020
tctgactgac cgcgttactc ccacaggtga gcgggcggga cggcccttct cctccgggct 1080
gtaattagct gagcaagagg taagggttta agggatggtt ggttggtggg gtattaatgt 1140
ttaattacct ggagcacctg cctgaaatca ctttttttca ggttggaccg gtgccaccat 1200
ggactataag gaccacgacg gagactacaa ggatcatgat attgattaca aagacgatga 1260
cgataagatg gccccaaaga agaagcggaa ggtcggtatc cacggagtcc cagcagccat 1320
gcaggagaga aagaagatca gccacctgac ccacagaaac agcgtgaaga aaaccatcag 1380
aatgcagctg aaccccgtgg gaaagaccat ggactacttc caggccaagc agatcctgga 1440
gaacgacgag aagctgaagg aggactacca gaagatcaag gagatcgccg acagattcta 1500
cagaaacctg aacgaggacg tgctgagcaa aaccggactg gacaagctga aggactacgc 1560
cgagatctac taccattgca acaccgacgc cgacagaaag agactgaacg agtgcgccag 1620
cgagctgaga aaggagatcg tgaagaactt caagaacaga gatgagtaca acaagctgtt 1680
caacaagaag atgatcgaga tcgtgctgcc caagcacctg aagaacgagg acgagaagga 1740
agtggtggcc agcttcaaga acttcaccac ctacttcacc ggcttcttca ccaacagaaa 1800
gaacatgtac agcgacggcg aagagtctac cgctattgcc tacagatgca tcaacgagaa 1860
cctgcccaag cacctggaca acgtgaaggt gttcgagaag gccatcagca agctgagcaa 1920
gaacgccatc gacgacctgg atgccacata ttctggcctg tgcggcacaa atctgtacga 1980
cgtgttcacc gtggactact tcaacttcct gctgccccaa agcggaatca ccgagtacaa 2040
caagatcatc ggcggctaca caacaagcga cggcaccaaa gtgaagggca tcaacgagta 2100
catcaacctg tacaaccagc aggtgagcaa gagagacaag atccccaacc tgaagatcct 2160
gtacaagcag atcctgagcg agagcgagaa ggtgtctttc atccccccca agttcgagga 2220
cgacaacgaa ctgctgtctg ccgtgagcga gttctatgcc aacgacgaga catttgatgg 2280
catgcccctg aagaaagcca tcgacgaaac caaactgctg ttcggcaacc tggacaacag 2340
cagcctgaac ggcatctaca tccagaacga cagaagcgtg accaacctga gcaacagcat 2400
gttcggcagc tggagcgtga ttgaggacct gtggaacaag aactacgaca gcgtgaacag 2460
caacagcaga atcaaggaca tccagaagag agaggacaag agaaagaagg cctacaaggc 2520
cgagaagaag ctgagcctga gcttcctgca ggtgctgatc agcaacagcg agaacgacga 2580
gatcagaaag aagagcatcg tggactacta caagaccagc ctgatgcagc tgaccgacaa 2640
cctgagcgac aagtacaaag aagccgcccc cctgttttct gagaactacg acaacgagaa 2700
gggcctgaag aacgacgaca agagcatcag cctgatcaag aacttcctgg acgccatcaa 2760
ggagatcgag aagttcatca agcccctgag cgagacaaat atcaccggcg agaagaacga 2820
cctgttctac agccagttca cccccctgct ggacaacatc agcagaatcg acagactgta 2880
cgacaaggtg agaaactacg tgacccagaa gcccttcagc accgacaaga tcaagctgaa 2940
cttcggcaac agccagcttc tgaacggctg ggacagaaac aaggagaagg actgtggcgc 3000
tgtgctgctg tgtaaggacg agaagtacta cctggccatc atcgacaaga gcaacaacag 3060
catcctggag aacatcgact tccaggactg caacgagagc gactactacg agaagatcgt 3120
gtacaagctg ctgaccaaga tctctggcaa cctgcccaga gtgttcttca gcgagaagca 3180
caagaagctg ctgagcccca gcgatgagat cctgaagatc tacaagagcg gcaccttcaa 3240
gaagggcgac aagttcagcc ttgacgactg ccacaagctg atcgacttct acaaggagag 3300
cttcaagaag taccccaagt ggctgatcta caacttcaag ttcaagaaca ccaacgagta 3360
caacgacatc agcgagttct acaacgacgt ggccagccag ggatacaaca tcagcaagat 3420
gaagatcccc accagcttca tcgacaagct ggtggacgag ggcaagatct acctgttcca 3480
gctgtacaac aaggacttca gcccccacag caagggaaca cctaacctgc acaccctgta 3540
cttcaagatg ctgttcgacg agagaaacct ggaggacgtg gtgtacaagc tgaatggcga 3600
ggccgagatg ttttacagac ccgccagcat caagtatgac aagcccaccc accctaagaa 3660
cacccccatc aagaacaaga acaccctgaa cgacaagaag gccagcacct tcccctacga 3720
cctgatcaag gacaagagat acaccaagtg gcagttcagc ctgcacttcc ccatcaccat 3780
gaacttcaag gcccccgaca gagccatgat caacgacgac gtgagaaacc tgctgaagag 3840
ctgcaacaac aacttcatca tcggcatcga cagaggcgag agaaacctgc tgtacgtgag 3900
cgtgatcgat agcaacggcg ccatcatcta ccagcacagc ctgaacatca tcggcaacaa 3960
gttcaagggc aagacctacg aaaccaacta cagagagaag ctggccacca gagagaagga 4020
gagaaccgag cagagaagaa actggaaggc catcgagagc atcaaggagc tgaaggaggg 4080
ctacatcagc caaaccgtgc acgtgatttg ccagctggtg gtgaagtacg acgccatcat 4140
cgtgatggag aagctgaccg acggcttcaa gagaggcaga accaagttcg agaagcaggt 4200
gtaccagaag ttcgagaaga tgctgatcga caagctgaac tactacgtgg acaagaagct 4260
ggaccccaat gaggaaggcg gactgctgca tgcttatcag ctgaccaaca agctggacag 4320
cttcgacaag ctgggaatgc agagcggctt catcttctac gtcagacccg acttcaccag 4380
caaaatcgac cccgtgaccg gatttgtgaa cctgctgtac cccagatacg agaacatcga 4440
caaggccaag gacatgatca gcagattcga cgacatcaga tacaacgccg gcgaggactt 4500
cttcgagttc gacatcgact acgacaagtt ccccaagacc gccagcgact acagaaagaa 4560
gtggaccatc tgcaccaacg gcgagagaat cgaggccttc agaaaccccg ccaacaacaa 4620
cgagtggagc tacagaacca tcatcctggc cgagaagttc aaggagctgt tcgacaacaa 4680
cagcatcaac tacagagaca gcgacgacct gaaagccgag atcctgagcc aaaccaaggg 4740
caagttcttc gaggacttct tcaagctgct gagactgacc ctgcagatga gaaacagcaa 4800
ccccgaaacc ggagaggaca ggattctgag ccccgtgaag gacaagaacg gcaacttcta 4860
cgacagcagc aagtacgacg agaagagcaa gctgccctgt gacgctgatg ctaacggcgc 4920
ttacaacatc gccagaaagg gcctgtggat cgtggagcag ttcaagaagg ccgacaacgt 4980
gtctgctgtg gaacccgtga tccacaacga caagtggctg aagttcgtgc aggagaacga 5040
catggccaac aacaaaaggc cggcggccac gaaaaaggcc ggccaggcaa aaaagaaaaa 5100
ggaattcggc agtggagagg gcagaggaag tctgctaaca tgcggtgacg tcgaggagaa 5160
tcctggccca gtgagcaagg gcgaggagct gttcaccggg gtggtgccca tcctggtcga 5220
gctggacggc gacgtaaacg gccacaagtt cagcgtgtcc ggcgagggcg agggcgatgc 5280
cacctacggc aagctgaccc tgaagttcat ctgcaccacc ggcaagctgc ccgtgccctg 5340
gcccaccctc gtgaccaccc tgacctacgg cgtgcagtgc ttcagccgct accccgacca 5400
catgaagcag cacgacttct tcaagtccgc catgcccgaa ggctacgtcc aggagcgcac 5460
catcttcttc aaggacgacg gcaactacaa gacccgcgcc gaggtgaagt tcgagggcga 5520
caccctggtg aaccgcatcg agctgaaggg catcgacttc aaggaggacg gcaacatcct 5580
ggggcacaag ctggagtaca actacaacag ccacaacgtc tatatcatgg ccgacaagca 5640
gaagaacggc atcaaggtga acttcaagat ccgccacaac atcgaggacg gcagcgtgca 5700
gctcgccgac cactaccagc agaacacccc catcggcgac ggccccgtgc tgctgcccga 5760
caaccactac ctgagcaccc agtccgccct gagcaaagac cccaacgaga agcgcgatca 5820
catggtcctg ctggagttcg tgaccgccgc cgggatcact ctcggcatgg acgagctgta 5880
caaggaattc taactagagc tcgctgatca gcctcgactg tgccttctag ttgccagcca 5940
tctgttgttt gcccctcccc cgtgccttcc ttgaccctgg aaggtgccac tcccactgtc 6000
ctttcctaat aaaatgagga aattgcatcg cattgtctga gtaggtgtca ttctattctg 6060
gggggtgggg tggggcagga cagcaagggg gaggattggg aagagaatag caggcatgct 6120
ggggagcggc cgcaggaacc cctagtgatg gagttggcca ctccctctct gcgcgctcgc 6180
tcgctcactg aggccgggcg accaaaggtc gcccgacgcc cgggctttgc ccgggcggcc 6240
tcagtgagcg agcgagcgcg cagctgcctg caggggcgcc tgatgcggta ttttctcctt 6300
acgcatctgt gcggtatttc acaccgcata cgtcaaagca accatagtac gcgccctgta 6360
gcggcgcatt aagcgcggcg ggtgtggtgg ttacgcgcag cgtgaccgct acacttgcca 6420
gcgccttagc gcccgctcct ttcgctttct tcccttcctt tctcgccacg ttcgccggct 6480
ttccccgtca agctctaaat cgggggctcc ctttagggtt ccgatttagt gctttacggc 6540
acctcgaccc caaaaaactt gatttgggtg atggttcacg tagtgggcca tcgccctgat 6600
agacggtttt tcgccctttg acgttggagt ccacgttctt taatagtgga ctcttgttcc 6660
aaactggaac aacactcaac tctatctcgg gctattcttt tgatttataa gggattttgc 6720
cgatttcggt ctattggtta aaaaatgagc tgatttaaca aaaatttaac gcgaatttta 6780
acaaaatatt aacgtttaca attttatggt gcactctcag tacaatctgc tctgatgccg 6840
catagttaag ccagccccga cacccgccaa cacccgctga cgcgccctga cgggcttgtc 6900
tgctcccggc atccgcttac agacaagctg tgaccgtctc cgggagctgc atgtgtcaga 6960
ggttttcacc gtcatcaccg aaacgcgcga gacgaaaggg cctcgtgata cgcctatttt 7020
tataggttaa tgtcatgata ataatggttt cttagacgtc aggtggcact tttcggggaa 7080
atgtgcgcgg aacccctatt tgtttatttt tctaaataca ttcaaatatg tatccgctca 7140
tgagacaata accctgataa atgcttcaat aatattgaaa aaggaagagt atgagtattc 7200
aacatttccg tgtcgccctt attccctttt ttgcggcatt ttgccttcct gtttttgctc 7260
acccagaaac gctggtgaaa gtaaaagatg ctgaagatca gttgggtgca cgagtgggtt 7320
acatcgaact ggatctcaac agcggtaaga tccttgagag ttttcgcccc gaagaacgtt 7380
ttccaatgat gagcactttt aaagttctgc tatgtggcgc ggtattatcc cgtattgacg 7440
ccgggcaaga gcaactcggt cgccgcatac actattctca gaatgacttg gttgagtact 7500
caccagtcac agaaaagcat cttacggatg gcatgacagt aagagaatta tgcagtgctg 7560
ccataaccat gagtgataac actgcggcca acttacttct gacaacgatc ggaggaccga 7620
aggagctaac cgcttttttg cacaacatgg gggatcatgt aactcgcctt gatcgttggg 7680
aaccggagct gaatgaagcc ataccaaacg acgagcgtga caccacgatg cctgtagcaa 7740
tggcaacaac gttgcgcaaa ctattaactg gcgaactact tactctagct tcccggcaac 7800
aattaataga ctggatggag gcggataaag ttgcaggacc acttctgcgc tcggcccttc 7860
cggctggctg gtttattgct gataaatctg gagccggtga gcgtggaagc cgcggtatca 7920
ttgcagcact ggggccagat ggtaagccct cccgtatcgt agttatctac acgacgggga 7980
gtcaggcaac tatggatgaa cgaaatagac agatcgctga gataggtgcc tcactgatta 8040
agcattggta actgtcagac caagtttact catatatact ttagattgat ttaaaacttc 8100
atttttaatt taaaaggatc taggtgaaga tcctttttga taatctcatg accaaaatcc 8160
cttaacgtga gttttcgttc cactgagcgt cagaccccgt agaaaagatc aaaggatctt 8220
cttgagatcc tttttttctg cgcgtaatct gctgcttgca aacaaaaaaa ccaccgctac 8280
cagcggtggt ttgtttgccg gatcaagagc taccaactct ttttccgaag gtaactggct 8340
tcagcagagc gcagatacca aatactgttc ttctagtgta gccgtagtta ggccaccact 8400
tcaagaactc tgtagcaccg cctacatacc tcgctctgct aatcctgtta ccagtggctg 8460
ctgccagtgg cgataagtcg tgtcttaccg ggttggactc aagacgatag ttaccggata 8520
aggcgcagcg gtcgggctga acggggggtt cgtgcacaca gcccagcttg gagcgaacga 8580
cctacaccga actgagatac ctacagcgtg agctatgaga aagcgccacg cttcccgaag 8640
ggagaaaggc ggacaggtat ccggtaagcg gcagggtcgg aacaggagag cgcacgaggg 8700
agcttccagg gggaaacgcc tggtatcttt atagtcctgt cgggtttcgc cacctctgac 8760
ttgagcgtcg atttttgtga tgctcgtcag gggggcggag cctatggaaa aacgccagca 8820
acgcggcctt tttacggttc ctggcctttt gctggccttt tgctcacatg t 8871
<210> 35
<211> 8871
<212> DNA
<213> Artificial
<220>
<223> BES4-HBG-SG03
<400> 35
gagggcctat ttcccatgat tccttcatat ttgcatatac gatacaaggc tgttagagag 60
ataattggaa ttaatttgac tgtaaacaca aagatattag tacaaaatac gtgacgtaga 120
aagtaataat ttcttgggta gtttgcagtt ttaaaattat gttttaaaat ggactatcat 180
atgcttaccg taacttgaaa gtatttcgat ttcttggctt tatatatctt gtggaaagga 240
cgaaacaccg aatttctact attgtagatc cttgtcaagg ctattggtca agttttttgt 300
tttagagcta gaaatagcaa gttaaaataa ggctagtccg tttttagcgc gtgcgccaat 360
tctgcagaca aatggctcta gaggtacccg ttacataact tacggtaaat ggcccgcctg 420
gctgaccgcc caacgacccc cgcccattga cgtcaatagt aacgccaata gggactttcc 480
attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt 540
atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt 600
gtgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 660
tcgctattac catggtcgag gtgagcccca cgttctgctt cactctcccc atctcccccc 720
cctccccacc cccaattttg tatttattta ttttttaatt attttgtgca gcgatggggg 780
cggggggggg gggggggcgc gcgccaggcg gggcggggcg gggcgagggg cggggcgggg 840
cgaggcggag aggtgcggcg gcagccaatc agagcggcgc gctccgaaag tttcctttta 900
tggcgaggcg gcggcggcgg cggccctata aaaagcgaag cgcgcggcgg gcgggagtcg 960
ctgcgcgctg ccttcgcccc gtgccccgct ccgccgccgc ctcgcgccgc ccgccccggc 1020
tctgactgac cgcgttactc ccacaggtga gcgggcggga cggcccttct cctccgggct 1080
gtaattagct gagcaagagg taagggttta agggatggtt ggttggtggg gtattaatgt 1140
ttaattacct ggagcacctg cctgaaatca ctttttttca ggttggaccg gtgccaccat 1200
ggactataag gaccacgacg gagactacaa ggatcatgat attgattaca aagacgatga 1260
cgataagatg gccccaaaga agaagcggaa ggtcggtatc cacggagtcc cagcagccat 1320
gcaggagaga aagaagatca gccacctgac ccacagaaac agcgtgaaga aaaccatcag 1380
aatgcagctg aaccccgtgg gaaagaccat ggactacttc caggccaagc agatcctgga 1440
gaacgacgag aagctgaagg aggactacca gaagatcaag gagatcgccg acagattcta 1500
cagaaacctg aacgaggacg tgctgagcaa aaccggactg gacaagctga aggactacgc 1560
cgagatctac taccattgca acaccgacgc cgacagaaag agactgaacg agtgcgccag 1620
cgagctgaga aaggagatcg tgaagaactt caagaacaga gatgagtaca acaagctgtt 1680
caacaagaag atgatcgaga tcgtgctgcc caagcacctg aagaacgagg acgagaagga 1740
agtggtggcc agcttcaaga acttcaccac ctacttcacc ggcttcttca ccaacagaaa 1800
gaacatgtac agcgacggcg aagagtctac cgctattgcc tacagatgca tcaacgagaa 1860
cctgcccaag cacctggaca acgtgaaggt gttcgagaag gccatcagca agctgagcaa 1920
gaacgccatc gacgacctgg atgccacata ttctggcctg tgcggcacaa atctgtacga 1980
cgtgttcacc gtggactact tcaacttcct gctgccccaa agcggaatca ccgagtacaa 2040
caagatcatc ggcggctaca caacaagcga cggcaccaaa gtgaagggca tcaacgagta 2100
catcaacctg tacaaccagc aggtgagcaa gagagacaag atccccaacc tgaagatcct 2160
gtacaagcag atcctgagcg agagcgagaa ggtgtctttc atccccccca agttcgagga 2220
cgacaacgaa ctgctgtctg ccgtgagcga gttctatgcc aacgacgaga catttgatgg 2280
catgcccctg aagaaagcca tcgacgaaac caaactgctg ttcggcaacc tggacaacag 2340
cagcctgaac ggcatctaca tccagaacga cagaagcgtg accaacctga gcaacagcat 2400
gttcggcagc tggagcgtga ttgaggacct gtggaacaag aactacgaca gcgtgaacag 2460
caacagcaga atcaaggaca tccagaagag agaggacaag agaaagaagg cctacaaggc 2520
cgagaagaag ctgagcctga gcttcctgca ggtgctgatc agcaacagcg agaacgacga 2580
gatcagaaag aagagcatcg tggactacta caagaccagc ctgatgcagc tgaccgacaa 2640
cctgagcgac aagtacaaag aagccgcccc cctgttttct gagaactacg acaacgagaa 2700
gggcctgaag aacgacgaca agagcatcag cctgatcaag aacttcctgg acgccatcaa 2760
ggagatcgag aagttcatca agcccctgag cgagacaaat atcaccggcg agaagaacga 2820
cctgttctac agccagttca cccccctgct ggacaacatc agcagaatcg acagactgta 2880
cgacaaggtg agaaactacg tgacccagaa gcccttcagc accgacaaga tcaagctgaa 2940
cttcggcaac agccagcttc tgaacggctg ggacagaaac aaggagaagg actgtggcgc 3000
tgtgctgctg tgtaaggacg agaagtacta cctggccatc atcgacaaga gcaacaacag 3060
catcctggag aacatcgact tccaggactg caacgagagc gactactacg agaagatcgt 3120
gtacaagctg ctgaccaaga tctctggcaa cctgcccaga gtgttcttca gcgagaagca 3180
caagaagctg ctgagcccca gcgatgagat cctgaagatc tacaagagcg gcaccttcaa 3240
gaagggcgac aagttcagcc ttgacgactg ccacaagctg atcgacttct acaaggagag 3300
cttcaagaag taccccaagt ggctgatcta caacttcaag ttcaagaaca ccaacgagta 3360
caacgacatc agcgagttct acaacgacgt ggccagccag ggatacaaca tcagcaagat 3420
gaagatcccc accagcttca tcgacaagct ggtggacgag ggcaagatct acctgttcca 3480
gctgtacaac aaggacttca gcccccacag caagggaaca cctaacctgc acaccctgta 3540
cttcaagatg ctgttcgacg agagaaacct ggaggacgtg gtgtacaagc tgaatggcga 3600
ggccgagatg ttttacagac ccgccagcat caagtatgac aagcccaccc accctaagaa 3660
cacccccatc aagaacaaga acaccctgaa cgacaagaag gccagcacct tcccctacga 3720
cctgatcaag gacaagagat acaccaagtg gcagttcagc ctgcacttcc ccatcaccat 3780
gaacttcaag gcccccgaca gagccatgat caacgacgac gtgagaaacc tgctgaagag 3840
ctgcaacaac aacttcatca tcggcatcga cagaggcgag agaaacctgc tgtacgtgag 3900
cgtgatcgat agcaacggcg ccatcatcta ccagcacagc ctgaacatca tcggcaacaa 3960
gttcaagggc aagacctacg aaaccaacta cagagagaag ctggccacca gagagaagga 4020
gagaaccgag cagagaagaa actggaaggc catcgagagc atcaaggagc tgaaggaggg 4080
ctacatcagc caaaccgtgc acgtgatttg ccagctggtg gtgaagtacg acgccatcat 4140
cgtgatggag aagctgaccg acggcttcaa gagaggcaga accaagttcg agaagcaggt 4200
gtaccagaag ttcgagaaga tgctgatcga caagctgaac tactacgtgg acaagaagct 4260
ggaccccaat gaggaaggcg gactgctgca tgcttatcag ctgaccaaca agctggacag 4320
cttcgacaag ctgggaatgc agagcggctt catcttctac gtcagacccg acttcaccag 4380
caaaatcgac cccgtgaccg gatttgtgaa cctgctgtac cccagatacg agaacatcga 4440
caaggccaag gacatgatca gcagattcga cgacatcaga tacaacgccg gcgaggactt 4500
cttcgagttc gacatcgact acgacaagtt ccccaagacc gccagcgact acagaaagaa 4560
gtggaccatc tgcaccaacg gcgagagaat cgaggccttc agaaaccccg ccaacaacaa 4620
cgagtggagc tacagaacca tcatcctggc cgagaagttc aaggagctgt tcgacaacaa 4680
cagcatcaac tacagagaca gcgacgacct gaaagccgag atcctgagcc aaaccaaggg 4740
caagttcttc gaggacttct tcaagctgct gagactgacc ctgcagatga gaaacagcaa 4800
ccccgaaacc ggagaggaca ggattctgag ccccgtgaag gacaagaacg gcaacttcta 4860
cgacagcagc aagtacgacg agaagagcaa gctgccctgt gacgctgatg ctaacggcgc 4920
ttacaacatc gccagaaagg gcctgtggat cgtggagcag ttcaagaagg ccgacaacgt 4980
gtctgctgtg gaacccgtga tccacaacga caagtggctg aagttcgtgc aggagaacga 5040
catggccaac aacaaaaggc cggcggccac gaaaaaggcc ggccaggcaa aaaagaaaaa 5100
ggaattcggc agtggagagg gcagaggaag tctgctaaca tgcggtgacg tcgaggagaa 5160
tcctggccca gtgagcaagg gcgaggagct gttcaccggg gtggtgccca tcctggtcga 5220
gctggacggc gacgtaaacg gccacaagtt cagcgtgtcc ggcgagggcg agggcgatgc 5280
cacctacggc aagctgaccc tgaagttcat ctgcaccacc ggcaagctgc ccgtgccctg 5340
gcccaccctc gtgaccaccc tgacctacgg cgtgcagtgc ttcagccgct accccgacca 5400
catgaagcag cacgacttct tcaagtccgc catgcccgaa ggctacgtcc aggagcgcac 5460
catcttcttc aaggacgacg gcaactacaa gacccgcgcc gaggtgaagt tcgagggcga 5520
caccctggtg aaccgcatcg agctgaaggg catcgacttc aaggaggacg gcaacatcct 5580
ggggcacaag ctggagtaca actacaacag ccacaacgtc tatatcatgg ccgacaagca 5640
gaagaacggc atcaaggtga acttcaagat ccgccacaac atcgaggacg gcagcgtgca 5700
gctcgccgac cactaccagc agaacacccc catcggcgac ggccccgtgc tgctgcccga 5760
caaccactac ctgagcaccc agtccgccct gagcaaagac cccaacgaga agcgcgatca 5820
catggtcctg ctggagttcg tgaccgccgc cgggatcact ctcggcatgg acgagctgta 5880
caaggaattc taactagagc tcgctgatca gcctcgactg tgccttctag ttgccagcca 5940
tctgttgttt gcccctcccc cgtgccttcc ttgaccctgg aaggtgccac tcccactgtc 6000
ctttcctaat aaaatgagga aattgcatcg cattgtctga gtaggtgtca ttctattctg 6060
gggggtgggg tggggcagga cagcaagggg gaggattggg aagagaatag caggcatgct 6120
ggggagcggc cgcaggaacc cctagtgatg gagttggcca ctccctctct gcgcgctcgc 6180
tcgctcactg aggccgggcg accaaaggtc gcccgacgcc cgggctttgc ccgggcggcc 6240
tcagtgagcg agcgagcgcg cagctgcctg caggggcgcc tgatgcggta ttttctcctt 6300
acgcatctgt gcggtatttc acaccgcata cgtcaaagca accatagtac gcgccctgta 6360
gcggcgcatt aagcgcggcg ggtgtggtgg ttacgcgcag cgtgaccgct acacttgcca 6420
gcgccttagc gcccgctcct ttcgctttct tcccttcctt tctcgccacg ttcgccggct 6480
ttccccgtca agctctaaat cgggggctcc ctttagggtt ccgatttagt gctttacggc 6540
acctcgaccc caaaaaactt gatttgggtg atggttcacg tagtgggcca tcgccctgat 6600
agacggtttt tcgccctttg acgttggagt ccacgttctt taatagtgga ctcttgttcc 6660
aaactggaac aacactcaac tctatctcgg gctattcttt tgatttataa gggattttgc 6720
cgatttcggt ctattggtta aaaaatgagc tgatttaaca aaaatttaac gcgaatttta 6780
acaaaatatt aacgtttaca attttatggt gcactctcag tacaatctgc tctgatgccg 6840
catagttaag ccagccccga cacccgccaa cacccgctga cgcgccctga cgggcttgtc 6900
tgctcccggc atccgcttac agacaagctg tgaccgtctc cgggagctgc atgtgtcaga 6960
ggttttcacc gtcatcaccg aaacgcgcga gacgaaaggg cctcgtgata cgcctatttt 7020
tataggttaa tgtcatgata ataatggttt cttagacgtc aggtggcact tttcggggaa 7080
atgtgcgcgg aacccctatt tgtttatttt tctaaataca ttcaaatatg tatccgctca 7140
tgagacaata accctgataa atgcttcaat aatattgaaa aaggaagagt atgagtattc 7200
aacatttccg tgtcgccctt attccctttt ttgcggcatt ttgccttcct gtttttgctc 7260
acccagaaac gctggtgaaa gtaaaagatg ctgaagatca gttgggtgca cgagtgggtt 7320
acatcgaact ggatctcaac agcggtaaga tccttgagag ttttcgcccc gaagaacgtt 7380
ttccaatgat gagcactttt aaagttctgc tatgtggcgc ggtattatcc cgtattgacg 7440
ccgggcaaga gcaactcggt cgccgcatac actattctca gaatgacttg gttgagtact 7500
caccagtcac agaaaagcat cttacggatg gcatgacagt aagagaatta tgcagtgctg 7560
ccataaccat gagtgataac actgcggcca acttacttct gacaacgatc ggaggaccga 7620
aggagctaac cgcttttttg cacaacatgg gggatcatgt aactcgcctt gatcgttggg 7680
aaccggagct gaatgaagcc ataccaaacg acgagcgtga caccacgatg cctgtagcaa 7740
tggcaacaac gttgcgcaaa ctattaactg gcgaactact tactctagct tcccggcaac 7800
aattaataga ctggatggag gcggataaag ttgcaggacc acttctgcgc tcggcccttc 7860
cggctggctg gtttattgct gataaatctg gagccggtga gcgtggaagc cgcggtatca 7920
ttgcagcact ggggccagat ggtaagccct cccgtatcgt agttatctac acgacgggga 7980
gtcaggcaac tatggatgaa cgaaatagac agatcgctga gataggtgcc tcactgatta 8040
agcattggta actgtcagac caagtttact catatatact ttagattgat ttaaaacttc 8100
atttttaatt taaaaggatc taggtgaaga tcctttttga taatctcatg accaaaatcc 8160
cttaacgtga gttttcgttc cactgagcgt cagaccccgt agaaaagatc aaaggatctt 8220
cttgagatcc tttttttctg cgcgtaatct gctgcttgca aacaaaaaaa ccaccgctac 8280
cagcggtggt ttgtttgccg gatcaagagc taccaactct ttttccgaag gtaactggct 8340
tcagcagagc gcagatacca aatactgttc ttctagtgta gccgtagtta ggccaccact 8400
tcaagaactc tgtagcaccg cctacatacc tcgctctgct aatcctgtta ccagtggctg 8460
ctgccagtgg cgataagtcg tgtcttaccg ggttggactc aagacgatag ttaccggata 8520
aggcgcagcg gtcgggctga acggggggtt cgtgcacaca gcccagcttg gagcgaacga 8580
cctacaccga actgagatac ctacagcgtg agctatgaga aagcgccacg cttcccgaag 8640
ggagaaaggc ggacaggtat ccggtaagcg gcagggtcgg aacaggagag cgcacgaggg 8700
agcttccagg gggaaacgcc tggtatcttt atagtcctgt cgggtttcgc cacctctgac 8760
ttgagcgtcg atttttgtga tgctcgtcag gggggcggag cctatggaaa aacgccagca 8820
acgcggcctt tttacggttc ctggcctttt gctggccttt tgctcacatg t 8871
<210> 36
<211> 9290
<212> DNA
<213> Artificial
<220>
<223> PX458-HBG-SG01
<400> 36
gagggcctat ttcccatgat tccttcatat ttgcatatac gatacaaggc tgttagagag 60
ataattggaa ttaatttgac tgtaaacaca aagatattag tacaaaatac gtgacgtaga 120
aagtaataat ttcttgggta gtttgcagtt ttaaaattat gttttaaaat ggactatcat 180
atgcttaccg taacttgaaa gtatttcgat ttcttggctt tatatatctt gtggaaagga 240
cgaaacaccc ttgtcaaggc tattggtcag ttttagagct agaaatagca agttaaaata 300
aggctagtcc gttatcaact tgaaaaagtg gcaccgagtc ggtgcttttt tgttttagag 360
ctagaaatag caagttaaaa taaggctagt ccgtttttag cgcgtgcgcc aattctgcag 420
acaaatggct ctagaggtac ccgttacata acttacggta aatggcccgc ctggctgacc 480
gcccaacgac ccccgcccat tgacgtcaat agtaacgcca atagggactt tccattgacg 540
tcaatgggtg gagtatttac ggtaaactgc ccacttggca gtacatcaag tgtatcatat 600
gccaagtacg ccccctattg acgtcaatga cggtaaatgg cccgcctggc attgtgccca 660
gtacatgacc ttatgggact ttcctacttg gcagtacatc tacgtattag tcatcgctat 720
taccatggtc gaggtgagcc ccacgttctg cttcactctc cccatctccc ccccctcccc 780
acccccaatt ttgtatttat ttatttttta attattttgt gcagcgatgg gggcgggggg 840
gggggggggg cgcgcgccag gcggggcggg gcggggcgag gggcggggcg gggcgaggcg 900
gagaggtgcg gcggcagcca atcagagcgg cgcgctccga aagtttcctt ttatggcgag 960
gcggcggcgg cggcggccct ataaaaagcg aagcgcgcgg cgggcgggag tcgctgcgcg 1020
ctgccttcgc cccgtgcccc gctccgccgc cgcctcgcgc cgcccgcccc ggctctgact 1080
gaccgcgtta ctcccacagg tgagcgggcg ggacggccct tctcctccgg gctgtaatta 1140
gctgagcaag aggtaagggt ttaagggatg gttggttggt ggggtattaa tgtttaatta 1200
cctggagcac ctgcctgaaa tcactttttt tcaggttgga ccggtgccac catggactat 1260
aaggaccacg acggagacta caaggatcat gatattgatt acaaagacga tgacgataag 1320
atggccccaa agaagaagcg gaaggtcggt atccacggag tcccagcagc cgacaagaag 1380
tacagcatcg gcctggacat cggcaccaac tctgtgggct gggccgtgat caccgacgag 1440
tacaaggtgc ccagcaagaa attcaaggtg ctgggcaaca ccgaccggca cagcatcaag 1500
aagaacctga tcggagccct gctgttcgac agcggcgaaa cagccgaggc cacccggctg 1560
aagagaaccg ccagaagaag atacaccaga cggaagaacc ggatctgcta tctgcaagag 1620
atcttcagca acgagatggc caaggtggac gacagcttct tccacagact ggaagagtcc 1680
ttcctggtgg aagaggataa gaagcacgag cggcacccca tcttcggcaa catcgtggac 1740
gaggtggcct accacgagaa gtaccccacc atctaccacc tgagaaagaa actggtggac 1800
agcaccgaca aggccgacct gcggctgatc tatctggccc tggcccacat gatcaagttc 1860
cggggccact tcctgatcga gggcgacctg aaccccgaca acagcgacgt ggacaagctg 1920
ttcatccagc tggtgcagac ctacaaccag ctgttcgagg aaaaccccat caacgccagc 1980
ggcgtggacg ccaaggccat cctgtctgcc agactgagca agagcagacg gctggaaaat 2040
ctgatcgccc agctgcccgg cgagaagaag aatggcctgt tcggaaacct gattgccctg 2100
agcctgggcc tgacccccaa cttcaagagc aacttcgacc tggccgagga tgccaaactg 2160
cagctgagca aggacaccta cgacgacgac ctggacaacc tgctggccca gatcggcgac 2220
cagtacgccg acctgtttct ggccgccaag aacctgtccg acgccatcct gctgagcgac 2280
atcctgagag tgaacaccga gatcaccaag gcccccctga gcgcctctat gatcaagaga 2340
tacgacgagc accaccagga cctgaccctg ctgaaagctc tcgtgcggca gcagctgcct 2400
gagaagtaca aagagatttt cttcgaccag agcaagaacg gctacgccgg ctacattgac 2460
ggcggagcca gccaggaaga gttctacaag ttcatcaagc ccatcctgga aaagatggac 2520
ggcaccgagg aactgctcgt gaagctgaac agagaggacc tgctgcggaa gcagcggacc 2580
ttcgacaacg gcagcatccc ccaccagatc cacctgggag agctgcacgc cattctgcgg 2640
cggcaggaag atttttaccc attcctgaag gacaaccggg aaaagatcga gaagatcctg 2700
accttccgca tcccctacta cgtgggccct ctggccaggg gaaacagcag attcgcctgg 2760
atgaccagaa agagcgagga aaccatcacc ccctggaact tcgaggaagt ggtggacaag 2820
ggcgcttccg cccagagctt catcgagcgg atgaccaact tcgataagaa cctgcccaac 2880
gagaaggtgc tgcccaagca cagcctgctg tacgagtact tcaccgtgta taacgagctg 2940
accaaagtga aatacgtgac cgagggaatg agaaagcccg ccttcctgag cggcgagcag 3000
aaaaaggcca tcgtggacct gctgttcaag accaaccgga aagtgaccgt gaagcagctg 3060
aaagaggact acttcaagaa aatcgagtgc ttcgactccg tggaaatctc cggcgtggaa 3120
gatcggttca acgcctccct gggcacatac cacgatctgc tgaaaattat caaggacaag 3180
gacttcctgg acaatgagga aaacgaggac attctggaag atatcgtgct gaccctgaca 3240
ctgtttgagg acagagagat gatcgaggaa cggctgaaaa cctatgccca cctgttcgac 3300
gacaaagtga tgaagcagct gaagcggcgg agatacaccg gctggggcag gctgagccgg 3360
aagctgatca acggcatccg ggacaagcag tccggcaaga caatcctgga tttcctgaag 3420
tccgacggct tcgccaacag aaacttcatg cagctgatcc acgacgacag cctgaccttt 3480
aaagaggaca tccagaaagc ccaggtgtcc ggccagggcg atagcctgca cgagcacatt 3540
gccaatctgg ccggcagccc cgccattaag aagggcatcc tgcagacagt gaaggtggtg 3600
gacgagctcg tgaaagtgat gggccggcac aagcccgaga acatcgtgat cgaaatggcc 3660
agagagaacc agaccaccca gaagggacag aagaacagcc gcgagagaat gaagcggatc 3720
gaagagggca tcaaagagct gggcagccag atcctgaaag aacaccccgt ggaaaacacc 3780
cagctgcaga acgagaagct gtacctgtac tacctgcaga atgggcggga tatgtacgtg 3840
gaccaggaac tggacatcaa ccggctgtcc gactacgatg tggaccatat cgtgcctcag 3900
agctttctga aggacgactc catcgacaac aaggtgctga ccagaagcga caagaaccgg 3960
ggcaagagcg acaacgtgcc ctccgaagag gtcgtgaaga agatgaagaa ctactggcgg 4020
cagctgctga acgccaagct gattacccag agaaagttcg acaatctgac caaggccgag 4080
agaggcggcc tgagcgaact ggataaggcc ggcttcatca agagacagct ggtggaaacc 4140
cggcagatca caaagcacgt ggcacagatc ctggactccc ggatgaacac taagtacgac 4200
gagaatgaca agctgatccg ggaagtgaaa gtgatcaccc tgaagtccaa gctggtgtcc 4260
gatttccgga aggatttcca gttttacaaa gtgcgcgaga tcaacaacta ccaccacgcc 4320
cacgacgcct acctgaacgc cgtcgtggga accgccctga tcaaaaagta ccctaagctg 4380
gaaagcgagt tcgtgtacgg cgactacaag gtgtacgacg tgcggaagat gatcgccaag 4440
agcgagcagg aaatcggcaa ggctaccgcc aagtacttct tctacagcaa catcatgaac 4500
tttttcaaga ccgagattac cctggccaac ggcgagatcc ggaagcggcc tctgatcgag 4560
acaaacggcg aaaccgggga gatcgtgtgg gataagggcc gggattttgc caccgtgcgg 4620
aaagtgctga gcatgcccca agtgaatatc gtgaaaaaga ccgaggtgca gacaggcggc 4680
ttcagcaaag agtctatcct gcccaagagg aacagcgata agctgatcgc cagaaagaag 4740
gactgggacc ctaagaagta cggcggcttc gacagcccca ccgtggccta ttctgtgctg 4800
gtggtggcca aagtggaaaa gggcaagtcc aagaaactga agagtgtgaa agagctgctg 4860
gggatcacca tcatggaaag aagcagcttc gagaagaatc ccatcgactt tctggaagcc 4920
aagggctaca aagaagtgaa aaaggacctg atcatcaagc tgcctaagta ctccctgttc 4980
gagctggaaa acggccggaa gagaatgctg gcctctgccg gcgaactgca gaagggaaac 5040
gaactggccc tgccctccaa atatgtgaac ttcctgtacc tggccagcca ctatgagaag 5100
ctgaagggct cccccgagga taatgagcag aaacagctgt ttgtggaaca gcacaagcac 5160
tacctggacg agatcatcga gcagatcagc gagttctcca agagagtgat cctggccgac 5220
gctaatctgg acaaagtgct gtccgcctac aacaagcacc gggataagcc catcagagag 5280
caggccgaga atatcatcca cctgtttacc ctgaccaatc tgggagcccc tgccgccttc 5340
aagtactttg acaccaccat cgaccggaag aggtacacca gcaccaaaga ggtgctggac 5400
gccaccctga tccaccagag catcaccggc ctgtacgaga cacggatcga cctgtctcag 5460
ctgggaggcg acaaaaggcc ggcggccacg aaaaaggccg gccaggcaaa aaagaaaaag 5520
gaattcggca gtggagaggg cagaggaagt ctgctaacat gcggtgacgt cgaggagaat 5580
cctggcccag tgagcaaggg cgaggagctg ttcaccgggg tggtgcccat cctggtcgag 5640
ctggacggcg acgtaaacgg ccacaagttc agcgtgtccg gcgagggcga gggcgatgcc 5700
acctacggca agctgaccct gaagttcatc tgcaccaccg gcaagctgcc cgtgccctgg 5760
cccaccctcg tgaccaccct gacctacggc gtgcagtgct tcagccgcta ccccgaccac 5820
atgaagcagc acgacttctt caagtccgcc atgcccgaag gctacgtcca ggagcgcacc 5880
atcttcttca aggacgacgg caactacaag acccgcgccg aggtgaagtt cgagggcgac 5940
accctggtga accgcatcga gctgaagggc atcgacttca aggaggacgg caacatcctg 6000
gggcacaagc tggagtacaa ctacaacagc cacaacgtct atatcatggc cgacaagcag 6060
aagaacggca tcaaggtgaa cttcaagatc cgccacaaca tcgaggacgg cagcgtgcag 6120
ctcgccgacc actaccagca gaacaccccc atcggcgacg gccccgtgct gctgcccgac 6180
aaccactacc tgagcaccca gtccgccctg agcaaagacc ccaacgagaa gcgcgatcac 6240
atggtcctgc tggagttcgt gaccgccgcc gggatcactc tcggcatgga cgagctgtac 6300
aaggaattct aactagagct cgctgatcag cctcgactgt gccttctagt tgccagccat 6360
ctgttgtttg cccctccccc gtgccttcct tgaccctgga aggtgccact cccactgtcc 6420
tttcctaata aaatgaggaa attgcatcgc attgtctgag taggtgtcat tctattctgg 6480
ggggtggggt ggggcaggac agcaaggggg aggattggga agagaatagc aggcatgctg 6540
gggagcggcc gcaggaaccc ctagtgatgg agttggccac tccctctctg cgcgctcgct 6600
cgctcactga ggccgggcga ccaaaggtcg cccgacgccc gggctttgcc cgggcggcct 6660
cagtgagcga gcgagcgcgc agctgcctgc aggggcgcct gatgcggtat tttctcctta 6720
cgcatctgtg cggtatttca caccgcatac gtcaaagcaa ccatagtacg cgccctgtag 6780
cggcgcatta agcgcggcgg gtgtggtggt tacgcgcagc gtgaccgcta cacttgccag 6840
cgccttagcg cccgctcctt tcgctttctt cccttccttt ctcgccacgt tcgccggctt 6900
tccccgtcaa gctctaaatc gggggctccc tttagggttc cgatttagtg ctttacggca 6960
cctcgacccc aaaaaacttg atttgggtga tggttcacgt agtgggccat cgccctgata 7020
gacggttttt cgccctttga cgttggagtc cacgttcttt aatagtggac tcttgttcca 7080
aactggaaca acactcaact ctatctcggg ctattctttt gatttataag ggattttgcc 7140
gatttcggtc tattggttaa aaaatgagct gatttaacaa aaatttaacg cgaattttaa 7200
caaaatatta acgtttacaa ttttatggtg cactctcagt acaatctgct ctgatgccgc 7260
atagttaagc cagccccgac acccgccaac acccgctgac gcgccctgac gggcttgtct 7320
gctcccggca tccgcttaca gacaagctgt gaccgtctcc gggagctgca tgtgtcagag 7380
gttttcaccg tcatcaccga aacgcgcgag acgaaagggc ctcgtgatac gcctattttt 7440
ataggttaat gtcatgataa taatggtttc ttagacgtca ggtggcactt ttcggggaaa 7500
tgtgcgcgga acccctattt gtttattttt ctaaatacat tcaaatatgt atccgctcat 7560
gagacaataa ccctgataaa tgcttcaata atattgaaaa aggaagagta tgagtattca 7620
acatttccgt gtcgccctta ttcccttttt tgcggcattt tgccttcctg tttttgctca 7680
cccagaaacg ctggtgaaag taaaagatgc tgaagatcag ttgggtgcac gagtgggtta 7740
catcgaactg gatctcaaca gcggtaagat ccttgagagt tttcgccccg aagaacgttt 7800
tccaatgatg agcactttta aagttctgct atgtggcgcg gtattatccc gtattgacgc 7860
cgggcaagag caactcggtc gccgcataca ctattctcag aatgacttgg ttgagtactc 7920
accagtcaca gaaaagcatc ttacggatgg catgacagta agagaattat gcagtgctgc 7980
cataaccatg agtgataaca ctgcggccaa cttacttctg acaacgatcg gaggaccgaa 8040
ggagctaacc gcttttttgc acaacatggg ggatcatgta actcgccttg atcgttggga 8100
accggagctg aatgaagcca taccaaacga cgagcgtgac accacgatgc ctgtagcaat 8160
ggcaacaacg ttgcgcaaac tattaactgg cgaactactt actctagctt cccggcaaca 8220
attaatagac tggatggagg cggataaagt tgcaggacca cttctgcgct cggcccttcc 8280
ggctggctgg tttattgctg ataaatctgg agccggtgag cgtggaagcc gcggtatcat 8340
tgcagcactg gggccagatg gtaagccctc ccgtatcgta gttatctaca cgacggggag 8400
tcaggcaact atggatgaac gaaatagaca gatcgctgag ataggtgcct cactgattaa 8460
gcattggtaa ctgtcagacc aagtttactc atatatactt tagattgatt taaaacttca 8520
tttttaattt aaaaggatct aggtgaagat cctttttgat aatctcatga ccaaaatccc 8580
ttaacgtgag ttttcgttcc actgagcgtc agaccccgta gaaaagatca aaggatcttc 8640
ttgagatcct ttttttctgc gcgtaatctg ctgcttgcaa acaaaaaaac caccgctacc 8700
agcggtggtt tgtttgccgg atcaagagct accaactctt tttccgaagg taactggctt 8760
cagcagagcg cagataccaa atactgttct tctagtgtag ccgtagttag gccaccactt 8820
caagaactct gtagcaccgc ctacatacct cgctctgcta atcctgttac cagtggctgc 8880
tgccagtggc gataagtcgt gtcttaccgg gttggactca agacgatagt taccggataa 8940
ggcgcagcgg tcgggctgaa cggggggttc gtgcacacag cccagcttgg agcgaacgac 9000
ctacaccgaa ctgagatacc tacagcgtga gctatgagaa agcgccacgc ttcccgaagg 9060
gagaaaggcg gacaggtatc cggtaagcgg cagggtcgga acaggagagc gcacgaggga 9120
gcttccaggg ggaaacgcct ggtatcttta tagtcctgtc gggtttcgcc acctctgact 9180
tgagcgtcga tttttgtgat gctcgtcagg ggggcggagc ctatggaaaa acgccagcaa 9240
cgcggccttt ttacggttcc tggccttttg ctggcctttt gctcacatgt 9290
<210> 37
<211> 24
<212> DNA
<213> Artificial
<220>
<223> HBG1F
<400> 37
tccttagaaa ccactgctaa ctga 24
<210> 38
<211> 20
<212> DNA
<213> Artificial
<220>
<223> HBG1R
<400> 38
ccctgctgtg ctcagatcaa 20
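
Each record in the listing above declares its length in the <211> field and repeats a running residue count at the end of every sequence line, so a listing can be checked mechanically. The following is a minimal sketch only, not part of the patent: it assumes DNA records laid out exactly as above, and the names `parse_st25`, `length_mismatches`, and `SAMPLE` are hypothetical helpers introduced for illustration. It reads the two primer records (SEQ ID NO: 37 and 38) and reports any record whose residue count differs from its declared length.

```python
import re

# The two primer records copied from the listing above (SEQ ID NO: 37 and 38).
SAMPLE = """\
<210> 37
<211> 24
<212> DNA
<213> Artificial
<220>
<223> HBG1F
<400> 37
tccttagaaa ccactgctaa ctga 24
<210> 38
<211> 20
<212> DNA
<213> Artificial
<220>
<223> HBG1R
<400> 38
ccctgctgtg ctcagatcaa 20
"""

TAG = re.compile(r"<(\d{3})>\s*(.*)")


def parse_st25(text):
    """Parse DNA records of an ST.25-style listing into a list of dicts.

    Illustrative only: handles the numeric tags used above and assumes
    nucleotide sequences written in lowercase letters.
    """
    records = []
    current = None
    in_seq = False  # True once the <400> tag of the current record is seen
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = TAG.match(line)
        if m:
            tag, value = m.group(1), m.group(2)
            if tag == "210":  # a new sequence record starts here
                current = {"id": int(value), "seq": ""}
                records.append(current)
            elif current is not None:
                if tag == "211":
                    current["length"] = int(value)
                elif tag == "212":
                    current["type"] = value
                elif tag == "223":
                    current["name"] = value
            in_seq = tag == "400"
        elif current is not None and in_seq:
            # Residue line: keep letters, drop spaces and the running count.
            current["seq"] += re.sub(r"[^a-z]", "", line.lower())
    return records


def length_mismatches(records):
    """Return the IDs of records whose residue count differs from <211>."""
    return [r["id"] for r in records if len(r["seq"]) != r.get("length")]


if __name__ == "__main__":
    print(length_mismatches(parse_st25(SAMPLE)))
```

An empty list means every parsed record matches its declared length. Protein records, which use three-letter residue codes, would need a different residue-line rule than the one sketched here.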

Claims (10)

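1. A Cas protein, characterized by comprising:
the amino acid sequence shown in SEQ ID NO: 3.
2. A nucleic acid sequence, characterized in that the nucleic acid sequence encodes the Cas protein of claim 1.
3. The nucleic acid sequence of claim 2, characterized in that the nucleic acid sequence is DNA or RNA.
4. An expression vector, characterized in that the expression vector comprises the nucleic acid sequence of claim 2 or 3.
5. A recombinant cell, characterized in that the recombinant cell contains the expression vector of claim 4, and the recombinant cell is not a plant cell.
6. The recombinant cell of claim 5, characterized in that the recombinant cell is a eukaryotic cell.
7. The recombinant cell of claim 6, characterized in that the recombinant cell is an animal cell.
8. A CRISPR-Cas system, characterized by comprising the Cas protein of claim 1.
9. The system of claim 8, characterized by further comprising at least one of the following: a crRNA, a tracrRNA, or a chimeric RNA formed from a crRNA and a tracrRNA.
10. Use of the Cas protein of claim 1, the nucleic acid sequence of claim 2 or 3, the expression vector of claim 4, the recombinant cell of any one of claims 5 to 7, or the CRISPR-Cas system of claim 8 or 9 in the field of gene editing for purposes other than disease diagnosis or treatment.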
CN202010401622.0A 2019-05-14 2020-05-13 Novel Cas protein, CRISPR-Cas system, and use thereof in the field of gene editing Active CN112301018B (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310742030.9A CN116694603A (zh) 2019-05-14 2020-05-13 Novel Cas protein, CRISPR-Cas system, and use thereof in the field of gene editing

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2019103990824 2019-05-14
CN201910399082 2019-05-14

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202310742030.9A Division CN116694603A (zh) 2019-05-14 2020-05-13 Novel Cas protein, CRISPR-Cas system, and use thereof in the field of gene editing

Publications (2)

Publication Number Publication Date
CN112301018A CN112301018A (zh) 2021-02-02
CN112301018B CN112301018B (zh) 2023-07-25

Family

ID=74336498

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202310742030.9A Pending CN116694603A (zh) 2019-05-14 2020-05-13 Novel Cas protein, CRISPR-Cas system, and use thereof in the field of gene editing
CN202010401622.0A Active CN112301018B (zh) 2019-05-14 2020-05-13 Novel Cas protein, CRISPR-Cas system, and use thereof in the field of gene editing

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN202310742030.9A Pending CN116694603A (zh) 2019-05-14 2020-05-13 Novel Cas protein, CRISPR-Cas system, and use thereof in the field of gene editing

Country Status (1)

Country Link
CN (2) CN116694603A (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
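CN114921439B (zh) * 2022-06-16 2024-04-26 尧唐(上海)生物科技有限公司 CRISPR-Cas effector protein, gene editing system thereof, and use thereof
WO2024098383A1 (zh) * 2022-11-11 2024-05-16 深圳华大生命科学研究院 Protein mutant and use thereof in treating diseases associated with HBB gene mutations
CN116410955B (zh) * 2023-03-10 2023-12-19 华中农业大学 Two novel endonucleases and their use in nucleic acid detection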

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108690845A (zh) * 2017-04-10 2018-10-23 中国科学院动物研究所 Genome editing system and method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107784200B (zh) * 2016-08-26 2020-11-06 深圳华大生命科学研究院 Method and device for screening novel CRISPR-Cas systems
EP3555275A1 (en) * 2016-12-14 2019-10-23 Wageningen Universiteit Thermostable cas9 nucleases

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zou, Y. et al. GenBank: RGS46198.1, "type V CRISPR-associated protein Cpf1 [Prevotella copri]". GenBank. 2018, FEATURES and ORIGIN sections. *

Also Published As

Publication number Publication date
CN116694603A (zh) 2023-09-05
CN112301018A (zh) 2021-02-02

Similar Documents

Publication Publication Date Title
CN112301018B (zh) Novel Cas protein, CRISPR-Cas system, and use thereof in the field of gene editing
AU2020289750B2 (en) Engineered meganucleases with recognition sequences found in the human T cell receptor alpha constant region gene
AU2021200863A1 (en) Genetically-modified cells comprising a modified human t cell receptor alpha constant region gene
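CN112375748B (zh) Novel coronavirus chimeric recombinant vaccine based on a vesicular stomatitis virus vector, and preparation method and use thereof
KR102528337B1 (ko) Scalable biotechnological production of DNA single-strand molecules of defined sequence and length
CN110467679B (zh) Fusion protein, base editing tool and method, and use thereof
CN110582567A (zh) Genetically modified trehalase-expressing yeast and fermentation methods using such genetically modified yeast
KR20210151916A (ko) AAV vector-mediated deletion of a large mutational hotspot for the treatment of Duchenne muscular dystrophy
CN112941038B (zh) Recombinant novel coronavirus based on a vesicular stomatitis virus vector, and preparation method and use thereof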
CA2747462A1 (en) Systems and methods for the secretion of recombinant proteins in gram negative bacteria
CN114921439A (zh) CRISPR-Cas effector protein, gene editing system thereof, and use thereof
WO2020169221A1 (en) Production of plant-based active substances (e.g. cannabinoids) by recombinant microorganisms
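CN112442515B (zh) Use of a gRNA target combination in constructing a hemophilia model pig cell line
CN101511996B (zh) Method for the enzymatic reduction of alkyne derivatives
CN111534578A (zh) Method for high-throughput screening of target genes involved in interactions between eukaryotic cells and pesticides
KR20140105821A (ko) Vaccination using recombinant yeast that elicits a protective humoral immune response against defined antigens
CN111718932A (zh) Preparation method and use of a novel gene-edited animal bioreactor
CN114835818B (zh) Gene editing fusion protein, adenine base editor constructed therefrom, and use thereof
CN113481114B (zh) Visual biosensor for explosives based on yeast cell surface display technology, and preparation method and use thereof
CN114958759B (zh) Construction method and use of an amyotrophic lateral sclerosis model pig
KR101831121B1 (ko) Nucleic acid construct comprising a pyripyropene biosynthesis gene cluster and a marker gene
CN110964748B (zh) Vector containing a mitochondrial targeting sequence, and construction method and use thereof
CN112538497B (zh) CRISPR/Cas9 system and its use in constructing α-, β-, and α&β-thalassemia model pig cell lines
CN112442513B (zh) Cas9 overexpression vector, and construction method and use thereof
CN111534544A (zh) Method for high-throughput screening of target genes involved in interactions between eukaryotic cells and viruses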

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant