TW202221119A - Dna-binding domain transactivators and uses thereof - Google Patents

Publication number
TW202221119A
Authority
TW
Taiwan
Prior art keywords
nucleic acid
seq
cdata
leu
arg
Application number
TW110127164A
Other languages
Chinese (zh)
Inventor
Miguel Sena-Esteves
Scott A. Wolfe
Original Assignee
University of Massachusetts
Application filed by University of Massachusetts

Classifications

    • CCHEMISTRY; METALLURGY
    • C40COMBINATORIAL TECHNOLOGY
    • C40BCOMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00Libraries per se, e.g. arrays, mixtures
    • C40B40/04Libraries containing only organic compounds
    • C40B40/06Libraries containing nucleotides or polynucleotides, or derivatives thereof
    • C40B40/08Libraries containing RNA or DNA which encodes proteins, e.g. gene libraries
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/11DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/62DNA sequences coding for fusion proteins
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/10Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034Isolating an individual clone by screening libraries
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/63Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/63Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C12N15/86Viral vectors
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14Hydrolases (3)
    • C12N9/16Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22Ribonucleases RNAses, DNAses
    • CCHEMISTRY; METALLURGY
    • C07ORGANIC CHEMISTRY
    • C07KPEPTIDES
    • C07K2319/00Fusion polypeptide
    • C07K2319/01Fusion polypeptide containing a localisation/targetting motif
    • CCHEMISTRY; METALLURGY
    • C07ORGANIC CHEMISTRY
    • C07KPEPTIDES
    • C07K2319/00Fusion polypeptide
    • C07K2319/01Fusion polypeptide containing a localisation/targetting motif
    • C07K2319/09Fusion polypeptide containing a localisation/targetting motif containing a nuclear localisation signal
    • CCHEMISTRY; METALLURGY
    • C07ORGANIC CHEMISTRY
    • C07KPEPTIDES
    • C07K2319/00Fusion polypeptide
    • C07K2319/70Fusion polypeptide containing domain for protein-protein interaction
    • C07K2319/71Fusion polypeptide containing domain for protein-protein interaction containing domain for transcriptional activation, e.g. VP16
    • CCHEMISTRY; METALLURGY
    • C07ORGANIC CHEMISTRY
    • C07KPEPTIDES
    • C07K2319/00Fusion polypeptide
    • C07K2319/80Fusion polypeptide containing a DNA binding domain, e.g. LacI or Tet-repressor
    • CCHEMISTRY; METALLURGY
    • C07ORGANIC CHEMISTRY
    • C07KPEPTIDES
    • C07K2319/00Fusion polypeptide
    • C07K2319/80Fusion polypeptide containing a DNA binding domain, e.g. LacI or Tet-repressor
    • C07K2319/81Fusion polypeptide containing a DNA binding domain, e.g. LacI or Tet-repressor containing a Zn-finger domain for DNA binding
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/11DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex- forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C12N15/1138Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex- forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing against receptors or cell surface proteins
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00Structure or type of the nucleic acid
    • C12N2310/10Type of nucleic acid
    • C12N2310/20Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2740/00Reverse transcribing RNA viruses
    • C12N2740/00011Details
    • C12N2740/10011Retroviridae
    • C12N2740/16011Human Immunodeficiency Virus, HIV
    • C12N2740/16041Use of virus, viral particle or viral elements as a vector
    • C12N2740/16043Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2750/00ssDNA viruses
    • C12N2750/00011Details
    • C12N2750/14011Parvoviridae
    • C12N2750/14111Dependovirus, e.g. adenoassociated viruses
    • C12N2750/14141Use of virus, viral particle or viral elements as a vector
    • C12N2750/14143Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2830/00Vector systems having a special element relevant for transcription
    • C12N2830/008Vector systems having a special element relevant for transcription cell type or tissue specific enhancer/promoter combination

Abstract

In some aspects, the disclosure relates to recombinant adeno-associated viruses (rAAVs) comprising a nucleic acid encoding a fusion protein comprising a DNA-binding domain and a transcriptional regulator domain and methods of using the same. In some embodiments, expression of the fusion protein results in modified expression of a target gene in a cell.

Description

DNA binding domain transactivators and uses thereof

Modulation of target gene expression has become a major area of biomedical research. Upregulating gene expression can correct haploinsufficiency caused by reduced gene expression. Haploinsufficiency typically results when one or more loss-of-function mutations are present in at least one copy of a gene. AAV-based gene augmentation approaches for treating diseases associated with haploinsufficiency are hampered by the packaging capacity of conventional rAAV vectors.

Aspects of the invention relate to isolated nucleic acids and recombinant AAV vectors for gene delivery. The invention is based, in part, on compositions (e.g., rAAV vectors and rAAVs) and methods for modulating the expression of a target gene, wherein the target gene is haploinsufficient, such as SCN1A. In some embodiments, the invention provides fusion proteins comprising a DNA binding domain, such as a Cys2-His2 zinc finger protein (ZFP), and a transcriptional regulatory domain. In some embodiments, compositions described by the invention include fusion proteins comprising a DNA binding domain (e.g., a ZFP, a transcription activator-like effector (TALE) domain, etc.) fused to a transcriptional regulatory domain. In some embodiments, fusion proteins described by the invention increase the expression of a target gene (e.g., SCN1A) and are therefore useful for treating diseases characterized by deficient expression of the target gene in a cell or subject relative to a normal cell or subject (e.g., diseases associated with haploinsufficiency of the target gene).

Accordingly, in some aspects, the invention provides an isolated nucleic acid comprising a transgene configured to express at least one DNA binding domain fused to at least one transcriptional regulatory domain, wherein the DNA binding domain binds to a target gene or to a regulatory region (e.g., an enhancer sequence, a promoter sequence, a repressor sequence, etc.) of the target gene (e.g., in a subject or cell), and wherein the target gene encodes a voltage-gated sodium channel (e.g., Nav1.1). In some embodiments, the target gene is the SCN1A gene. In some embodiments, the transgene is flanked by adeno-associated virus (AAV) inverted terminal repeats (ITRs). In some embodiments, the at least one DNA binding domain binds to the target gene (e.g., in a subject or cell) and the transcriptional regulatory domain modifies (e.g., upregulates) expression of the target gene.

In some aspects, the invention provides a recombinant AAV (rAAV) comprising: a nucleic acid comprising a transgene encoding at least one DNA binding domain fused to at least one transcriptional regulatory domain, wherein the DNA binding domain binds to a target gene or to a regulatory region of the target gene (e.g., in a subject or cell), and wherein the target gene encodes a voltage-gated sodium channel (e.g., Nav1.1); and at least one capsid protein. In some embodiments, the target gene is the SCN1A gene. In some embodiments, the transgene is flanked by AAV inverted terminal repeats (ITRs).

In some embodiments, the at least one DNA binding domain binds to the target gene (e.g., in a subject or cell) and the transcriptional regulatory domain modifies (e.g., upregulates) expression of the target gene in the subject.

In some embodiments, the at least one DNA binding domain binds to an untranslated region of the target gene. In some embodiments, the DNA binding domain binds to a regulatory region of the target gene, optionally an enhancer sequence, a promoter sequence, and/or a repressor sequence.

In some embodiments, the DNA binding domain binds between 2 and 2000 bp upstream or between 2 and 2000 bp downstream of a regulatory region (e.g., an enhancer sequence, a promoter sequence, a repressor sequence, etc.) of the target gene.
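The distance window in the preceding paragraph can be illustrated with a minimal sketch. The function name, the coordinate convention (0-based, half-open, plus strand), and the example coordinates are assumptions made for illustration only and do not come from the patent.

```python
def within_window(site_start, site_end, region_start, region_end,
                  min_bp=2, max_bp=2000):
    """Return True if a binding site lies min_bp..max_bp upstream or
    downstream of a regulatory region.

    Coordinates are 0-based, half-open, on the plus strand (an assumption
    of this sketch). A site that overlaps the region returns False.
    """
    if site_end <= region_start:        # site lies upstream of the region
        gap = region_start - site_end
    elif site_start >= region_end:      # site lies downstream of the region
        gap = site_start - region_end
    else:                               # site overlaps the region
        return False
    return min_bp <= gap <= max_bp


# Hypothetical 18-bp site ending 382 bp upstream of a region at 500..700.
print(within_window(100, 118, 500, 700))
```

The gap is measured between the nearest edges of the site and the region; a different convention (for example, distance between midpoints) would need a different calculation.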

In some embodiments, the at least one DNA binding domain encodes a zinc finger protein (ZFP), a transcription activator-like effector (TALE), a dCas protein (e.g., dCas9 or dCas12a), and/or a homeodomain. In some embodiments, the at least one DNA binding domain binds to a nucleic acid sequence set forth in any one of SEQ ID NOs: 5-7. In some embodiments, the at least one DNA binding domain binds to at least 2 (e.g., at least 3, 4, 5, 6, 7, 8, 9, 10, or more) consecutive nucleotides of the nucleic acid sequence set forth in SEQ ID NO: 3. In some embodiments, the at least one DNA binding domain is a zinc finger protein comprising a recognition helix encoded by a nucleic acid having the sequence set forth in any one of SEQ ID NOs: 11-16, 23-28, or 35-40. In some embodiments, the at least one DNA binding domain is a zinc finger protein comprising an amino acid sequence set forth in any one of SEQ ID NOs: 17-22, 29-34, or 41-46.

In some embodiments, the at least one DNA binding domain is a zinc finger protein comprising a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 11, a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 12, a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 13, a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 14, a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 15, and/or a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 16. In some embodiments, the at least one DNA binding domain is a zinc finger protein comprising the amino acid sequence of SEQ ID NO: 57. In some embodiments, the ZFP that binds to the SCN1A gene comprises an amino acid sequence having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity to the amino acid sequence of SEQ ID NO: 57.

In some embodiments, the at least one DNA binding domain is a zinc finger protein comprising a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 23, a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 24, a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 25, a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 26, a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 27, and/or a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 28. In some embodiments, the at least one DNA binding domain is a zinc finger protein comprising the amino acid sequence of SEQ ID NO: 59. In some embodiments, the ZFP that binds to the SCN1A gene comprises an amino acid sequence having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity to the amino acid sequence of SEQ ID NO: 59.

In some embodiments, the at least one DNA binding domain is a zinc finger protein comprising a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 35, a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 36, a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 37, a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 38, a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 39, and/or a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 40. In some embodiments, the at least one DNA binding domain is a zinc finger protein comprising the amino acid sequence of SEQ ID NO: 61. In some embodiments, the ZFP that binds to the SCN1A gene comprises an amino acid sequence having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity to the amino acid sequence of SEQ ID NO: 61.
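The percent sequence identity thresholds in the three preceding paragraphs can be made concrete with a small sketch. The patent text here does not state how identity is to be computed, so the definition below (matching columns divided by alignment length, over two sequences that have already been aligned and padded with "-") is an assumption, as are the function names and the toy sequences.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length.

    Gap characters ("-") never count as matches. The denominator is the
    number of alignment columns, which is one of several conventions in use.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b)
                  if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_identity(aligned_a, aligned_b, threshold=60.0):
    """True if the two aligned sequences share at least `threshold` % identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy 10-residue peptides differing at one position (not patent sequences).
print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
```

In practice the alignment itself would come from a dedicated alignment tool; this sketch only covers the final counting step.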

In some embodiments, the at least one DNA binding domain is a zinc finger protein comprising a recognition helix comprising the amino acid sequence of SEQ ID NO: 17, a recognition helix comprising the amino acid sequence of SEQ ID NO: 18, a recognition helix comprising the amino acid sequence of SEQ ID NO: 19, a recognition helix comprising the amino acid sequence of SEQ ID NO: 20, a recognition helix comprising the amino acid sequence of SEQ ID NO: 21, and/or a recognition helix comprising the amino acid sequence of SEQ ID NO: 22.

In some embodiments, the at least one DNA binding domain is a zinc finger protein comprising a recognition helix comprising SEQ ID NO: 29, a recognition helix comprising SEQ ID NO: 30, a recognition helix comprising SEQ ID NO: 31, a recognition helix comprising SEQ ID NO: 32, a recognition helix comprising SEQ ID NO: 33, and/or a recognition helix comprising SEQ ID NO: 34.

In some embodiments, the at least one DNA binding domain is a zinc finger protein comprising a recognition helix comprising SEQ ID NO: 41, a recognition helix comprising SEQ ID NO: 42, a recognition helix comprising SEQ ID NO: 43, a recognition helix comprising SEQ ID NO: 44, a recognition helix comprising SEQ ID NO: 45, and/or a recognition helix comprising SEQ ID NO: 46.

In some embodiments, the at least one DNA binding domain is a catalytically inactive CRISPR-associated protein (Cas protein). In some embodiments, the catalytically inactive Cas protein (or "dead Cas protein") is a dCas9 or dCas12 protein. In some embodiments, the nucleic acid or rAAV further comprises at least one guide nucleic acid (e.g., a guide RNA or gRNA). In some embodiments, the guide nucleic acid comprises a spacer sequence targeting SCN1A. In some embodiments, the guide nucleic acid comprises a spacer sequence having the nucleotide sequence of any one of SEQ ID NOs: 85, 86, 89, 90, 93, or 94. In some embodiments, the guide nucleic acid comprises the nucleotide sequence of any one of SEQ ID NOs: 83-94. In some embodiments, the guide nucleic acid is encoded by a nucleic acid sequence set forth in any one of SEQ ID NOs: 83-94.
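As a rough illustration of what a spacer sequence is, the sketch below lists candidate spacers in a DNA string that sit immediately 5' of an NGG protospacer-adjacent motif (PAM). It assumes a Streptococcus pyogenes-type dCas9 (NGG PAM, 20-nt spacer) and scans only the forward strand; the patent text above does not specify the Cas ortholog, and this is not presented as the procedure used to derive the listed SEQ ID NOs. The function name and the toy input are hypothetical.

```python
import re


def candidate_spacers(seq, spacer_len=20):
    """List (start, spacer, pam) tuples for forward-strand NGG PAM sites.

    `start` is the 0-based index of the first spacer base. PAMs that are too
    close to the 5' end to leave room for a full-length spacer are skipped.
    """
    seq = seq.upper()
    hits = []
    # A lookahead is used so that overlapping PAMs are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        if pam_start >= spacer_len:
            spacer = seq[pam_start - spacer_len:pam_start]
            hits.append((pam_start - spacer_len, spacer, match.group(1)))
    return hits


# Toy input: 20 A's followed by a TGG PAM (not a real SCN1A sequence).
print(candidate_spacers("A" * 20 + "TGG"))
```

A fuller search would also scan the reverse complement and score candidates for off-target matches, neither of which is attempted here.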

In some embodiments, the at least one transcriptional regulatory domain is a transactivator domain comprising: a VP16 domain, a VP64 domain, an Rta domain, a p65 domain, an Hsf1 domain, a TCF4 domain, a MEF2A domain, a MEF2C domain, a MEF2D domain, an Sp1 glutamine-rich domain, a p53 domain, an E2F1 domain, a MyoD domain, a MAPK7 domain, an NF1B proline-rich domain, a RelA domain, or any combination thereof, such as a VPR domain (VP64+p65+Rta domains). In some embodiments, the at least one transcriptional regulatory domain is encoded by the nucleic acid sequence set forth in SEQ ID NO: 47. In some embodiments, the at least one transactivator domain comprises an amino acid sequence set forth in SEQ ID NO: 48 or any one of SEQ ID NOs: 122-134.

In some embodiments, the nucleic acids described herein further comprise a nuclear localization sequence (e.g., comprising the sequence of any one of SEQ ID NOs: 135-140).

In some embodiments, the ITRs flanking the transgene comprise an ITR selected from the group consisting of: an AAV1 ITR, AAV2 ITR, AAV3 ITR, AAV4 ITR, AAV5 ITR, AAV6 ITR, AAV8 ITR, AAVrh8 ITR, AAV9 ITR, AAV10 ITR, or AAVrh10 ITR. In some embodiments, the ITR is a ΔTR or mTR.

In some embodiments, the transgene of the isolated nucleic acid is operably linked to a promoter. In some embodiments, the promoter is a tissue-specific promoter. In some embodiments, the tissue-specific promoter is a neuronal promoter, such as an SST, NPY, phosphate-activated glutaminase (PAG), vesicular glutamate transporter-1 (VGLUT1), glutamic acid decarboxylase 65 and 67 (GAD65, GAD67), synapsin I, α-CaMKII, Dock10, Prox1, parvalbumin (PV), somatostatin (SST), cholecystokinin (CCK), calretinin (CR), or neuropeptide Y (NPY) promoter.

In some embodiments, the DNA binding domain of the transgene is fused to the transcriptional regulatory domain by a linker domain. In some embodiments, the linker domain is a flexible linker (e.g., a glycine-rich linker or a glycine-serine linker) or a cleavable linker (e.g., a photocleavable linker or an enzymatically cleavable linker, such as a protease-cleavable linker).

In some embodiments, the isolated nucleic acid comprises a transgene encoding multiple DNA binding domains (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 DNA binding domains). In some embodiments, the isolated nucleic acid comprises a transgene encoding multiple transcriptional regulatory domains (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 transcriptional regulatory domains).

In some embodiments, the isolated nucleic acid or rAAV is expressed in a cell or subject characterized by abnormal expression or haploinsufficiency (e.g., increased or decreased expression) of the target gene relative to a normal cell or subject. In some embodiments, the isolated nucleic acid or rAAV is expressed in a cell or subject characterized by deficient (e.g., reduced) expression of the target gene relative to a normal cell or subject. In some embodiments, the target gene of the isolated nucleic acid or rAAV is SCN1A.

In some embodiments, the AAV capsid serotype is selected from the group consisting of: AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAVrh8, AAV9, AAV10, AAVrh10, or AAV.PHPB.

In some aspects, the invention provides methods of modulating the expression of a target gene. In some embodiments, the methods comprise administering an isolated nucleic acid or rAAV as described herein to a cell or subject expressing the target gene, wherein the subject is haploinsufficient for the target gene (e.g., SCN1A haploinsufficiency). For example, in some embodiments, expression of the target gene (such as SCN1A) in the cell or subject is deficient (e.g., reduced) relative to expression of the target gene in a normal cell or subject. In some embodiments, the cell to which the isolated nucleic acid or rAAV is administered is a neuron. In some embodiments, the neuron is a GABAergic neuron.

In some embodiments, administration of the isolated nucleic acid or rAAV results in an increase in target gene expression (e.g., SCN1A expression) of at least 2-fold, at least 10-fold, at least 20-fold, at least 30-fold, at least 40-fold, at least 50-fold, at least 60-fold, at least 70-fold, at least 80-fold, at least 90-fold, or at least 100-fold relative to a subject that has not been administered the isolated nucleic acid or rAAV. In some embodiments, administration of the isolated nucleic acid or rAAV results in an increase in target gene expression (e.g., SCN1A expression) of at least 2-fold, at least 10-fold, at least 20-fold, at least 30-fold, at least 40-fold, at least 50-fold, at least 60-fold, at least 70-fold, at least 80-fold, at least 90-fold, or at least 100-fold relative to expression of the target gene (e.g., SCN1A) in the subject prior to administration of the isolated nucleic acid or rAAV.
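The fold-increase language above compares expression after administration with a reference level (an untreated subject, or the same subject before administration). A minimal sketch of that comparison follows; the function names and the example values are hypothetical, and the expression measurements are assumed to be already normalized to the same scale.

```python
# Fold-increase thresholds recited in the paragraph above.
FOLD_THRESHOLDS = (2, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100)


def fold_change(treated, reference):
    """Ratio of treated expression to reference expression."""
    if reference <= 0:
        raise ValueError("reference expression must be positive")
    return treated / reference


def highest_threshold_met(treated, reference):
    """Largest recited threshold that the fold change reaches, or None."""
    fc = fold_change(treated, reference)
    met = [t for t in FOLD_THRESHOLDS if fc >= t]
    return max(met) if met else None


# Hypothetical normalized expression values, not data from the patent.
print(highest_threshold_met(25.0, 1.0))
```

The same function covers both comparisons in the paragraph; only the choice of `reference` (untreated subject versus pre-administration baseline) changes.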

In some aspects, the invention provides a method of modulating gene expression (e.g., expression of SCN1A) in a subject, wherein an isolated nucleic acid or rAAV as described herein is administered to a subject expressing the target gene. In some embodiments, expression of the target gene in the subject is abnormal (e.g., increased or decreased) relative to a healthy subject. In some embodiments, the subject is haploinsufficient, or suspected of being haploinsufficient, for expression of the target gene relative to a healthy subject.

In some embodiments, the subject has or is suspected of having a disease or disorder caused by haploinsufficient expression of the target gene. For example, in some embodiments, a subject haploinsufficient for SCN1A expression has Dravet syndrome. In some embodiments, the isolated nucleic acid or rAAV is administered to the subject by intravenous injection, intramuscular injection, inhalation, subcutaneous injection, and/or intracranial injection.

In some aspects, the invention provides compositions comprising an isolated nucleic acid or rAAV as described by the invention. In some embodiments, the composition comprises a pharmaceutically acceptable carrier.

In some aspects, the invention provides kits comprising a container housing an isolated nucleic acid or rAAV as described by the invention. In some embodiments, the kit comprises a container housing a pharmaceutically acceptable carrier. In some embodiments, the isolated nucleic acid or rAAV and the pharmaceutically acceptable carrier are housed in the same container. In some embodiments, the container is a syringe.

在一些態樣中,本發明提供包含如由本發明描述之經分離核酸或rAAV之宿主細胞。在一些實施例中,宿主細胞係真核細胞。在一些實施例中,宿主細胞係哺乳動物細胞。在一些實施例中,宿主細胞係人類細胞,選擇性地為神經元,例如GABA能神經元。In some aspects, the present invention provides host cells comprising an isolated nucleic acid or rAAV as described by the present invention. In some embodiments, the host cell is a eukaryotic cell. In some embodiments, the host cell is a mammalian cell. In some embodiments, the host cell is a human cell, optionally a neuron, e.g., a GABAergic neuron.

相關申請案 Related Applications

本申請案主張2020年7月24日申請之標題為「DNA BINDING DOMAIN TRANSACTIVATORS AND USES THEREOF」之美國臨時申請案第63/056,528號之申請日之權益,該案之全部內容係以引用之方式併入本文中。This application claims the benefit of the filing date of U.S. Provisional Application No. 63/056,528, filed July 24, 2020, entitled "DNA BINDING DOMAIN TRANSACTIVATORS AND USES THEREOF," the entire contents of which are incorporated herein by reference.

本發明之態樣係關於用於調節(例如,增加)靶基因在細胞或個體中之表現之方法及組合物,其中該靶基因係單倍不足(即,靶基因包含一個功能拷貝)。在一些實施例中,該靶基因係SCN1A。Aspects of the invention relate to methods and compositions for modulating (eg, increasing) the expression of a target gene in a cell or individual, wherein the target gene is haploinsufficient (ie, the target gene comprises one functional copy). In some embodiments, the target gene is SCN1A.

在一些實施例中,本發明提供包含DNA結合域(諸如ZFP)及轉錄調節域之融合蛋白。在一些實施例中,本發明提供包含DNA結合域(諸如ZFP)及轉活化子域(例如,VPR域)之融合蛋白。在一些實施例中,DNA結合蛋白結合至靶基因或靶基因之調節區之序列。在一些實施例中,調節區係強化子序列、啟動子序列或抑制子序列。在一些實施例中,啟動子序列可為內部啟動子(例如,位於靶基因之內含子中)或外部啟動子(例如,位於靶基因之轉錄起始位點上游)。在一些實施例中,本文描述之融合蛋白之DNA結合域結合靶基因(例如,SCN1A)之啟動子區中之保守序列,因此該轉活化子域增加基因表現。In some embodiments, the present invention provides fusion proteins comprising a DNA binding domain, such as a ZFP, and a transcriptional regulatory domain. In some embodiments, the present invention provides fusion proteins comprising a DNA binding domain (such as a ZFP) and a transactivator domain (eg, a VPR domain). In some embodiments, the DNA binding protein binds to a target gene or a sequence of a regulatory region of a target gene. In some embodiments, the regulatory region is an enhancer sequence, a promoter sequence, or a suppressor sequence. In some embodiments, the promoter sequence can be an internal promoter (eg, located in an intron of the target gene) or an external promoter (eg, located upstream of the transcription start site of the target gene). In some embodiments, the DNA binding domains of the fusion proteins described herein bind to conserved sequences in the promoter regions of target genes (eg, SCN1A), and thus the transactivator domains increase gene expression.

在一些態樣中,本發明係關於用於增加靶基因(例如,SCN1A)在細胞或個體中之表現之方法。在一些實施例中,該靶基因含有致使細胞或個體之該靶基因呈現單倍不足之突變。因此,在一些實施例中,本發明之方法及組合物可用於治療與靶基因產物之單倍不足相關聯之疾病及疾患(例如卓飛症候群),其通常由導致電位閘控鈉通道α次單元Nav1.1單倍不足之SCN1A基因之一個拷貝中之突變引起。In some aspects, the invention relates to methods for increasing the expression of a target gene (e.g., SCN1A) in a cell or individual. In some embodiments, the target gene contains a mutation that renders the cell or individual haploinsufficient for the target gene. Thus, in some embodiments, the methods and compositions of the invention can be used to treat diseases and disorders associated with haploinsufficiency of a target gene product, e.g., Dravet syndrome, which is typically caused by a mutation in one copy of the SCN1A gene that results in haploinsufficiency of the voltage-gated sodium channel α subunit Nav1.1.

轉活化子融合蛋白 Transactivator Fusion Proteins

本發明之一些態樣係關於包含DNA結合域(DBD)及轉活化子域之融合蛋白。如本文使用,融合蛋白包含由兩個或更多個不同胺基酸序列編碼之兩個或更多個連接之多肽。如本文使用,嵌合蛋白係其中該等兩個或更多個連接之基因來自不同物種之融合蛋白。融合蛋白係通常重組產生,其中編碼該融合蛋白之基因係在支持該等兩個或更多個連接之基因之表現並將所得mRNA轉譯為重組蛋白之系統中。在一些實施例中,融合蛋白係在原核或真核細胞中重組產生。融合蛋白可以多種排列組態。例如,一種蛋白質(蛋白A)位於第二種蛋白質(蛋白B)上游。在其他融合蛋白組態中,蛋白B位於蛋白A上游。在一些實施例中,編碼DNA結合域之核酸序列位於編碼轉活化子域之核酸序列上游,並產生包含連接至該轉活化子之DBD之融合蛋白。在一些實施例中,編碼轉活化子域之核酸序列位於編碼DNA結合域之核酸序列上游,並產生包含連接至該DNA結合域之轉活化子域之融合蛋白。在一些實施例中,融合蛋白包含位於DNA結合域上游之轉活化子域。在一些實施例中,融合蛋白包含位於轉活化子域上游之DNA結合域。Some aspects of the invention relate to fusion proteins comprising a DNA binding domain (DBD) and a transactivator domain. As used herein, a fusion protein comprises two or more linked polypeptides encoded by two or more different amino acid sequences. As used herein, a chimeric protein is a fusion protein in which the two or more linked genes are from different species. Fusion proteins are typically produced recombinantly, wherein the gene encoding the fusion protein is placed in a system that supports expression of the two or more linked genes and translation of the resulting mRNA into a recombinant protein. In some embodiments, fusion proteins are recombinantly produced in prokaryotic or eukaryotic cells. Fusion proteins can be arranged in a variety of configurations. For example, one protein (Protein A) is located upstream of a second protein (Protein B). In other fusion protein configurations, Protein B is located upstream of Protein A. In some embodiments, the nucleic acid sequence encoding the DNA binding domain is located upstream of the nucleic acid sequence encoding the transactivator domain, producing a fusion protein comprising the DBD linked to the transactivator. In some embodiments, the nucleic acid sequence encoding the transactivator domain is located upstream of the nucleic acid sequence encoding the DNA binding domain, producing a fusion protein comprising the transactivator domain linked to the DNA binding domain. In some embodiments, the fusion protein comprises a transactivator domain upstream of the DNA binding domain. In some embodiments, the fusion protein comprises a DNA binding domain upstream of the transactivator domain.

在一些實施例中,由本發明描述之融合蛋白包含DNA結合域。如本文使用,「DNA結合域(DBD)」係指包含至少一個識別雙股或單股DNA (dsDNA或ssDNA)之結構模體之經獨立折疊蛋白質。某些DBD識別特異性序列(識別序列或模體),而其他類型之DBD對DNA具有一般親和力。在一些實施例中,由本發明描述之融合蛋白包含序列特異性DBD。在一些實施例中,該DBD識別(例如,特異性結合至)於編碼SCN1A蛋白(例如,Nav1.1)之基因內或與其相鄰之核酸序列。含有DBD之蛋白質通常參與細胞過程,諸如轉錄、複製、修復及DNA儲存。轉錄因子中之DBD識別啟動子區或強化子元件中之特異性DNA序列以促進基因表現。轉錄因子DBD在基因工程中用作融合蛋白以調節靶基因表現且可突變以改變DNA結合特異性或DNA結合親和力並因此調節所需靶基因之表現。DBD之實例包括(但不限於)螺旋轉螺旋模體、鋅指模體(包括Cys2-His2鋅指)、轉錄活化子樣效應物(TALE)、翼狀螺旋模體、HMG匣、dCas蛋白(例如,dCas9或dCas12a)、同源域及OB折疊域。In some embodiments, the fusion proteins described by the present invention comprise a DNA binding domain. As used herein, a "DNA binding domain (DBD)" refers to an independently folded protein comprising at least one structural motif that recognizes double- or single-stranded DNA (dsDNA or ssDNA). Certain DBDs recognize specific sequences (recognition sequences or motifs), while other types of DBDs have a general affinity for DNA. In some embodiments, the fusion proteins described by the present invention comprise sequence-specific DBDs. In some embodiments, the DBD recognizes (eg, specifically binds to) a nucleic acid sequence within or adjacent to a gene encoding a SCN1A protein (eg, Nav1.1). DBD-containing proteins are often involved in cellular processes such as transcription, replication, repair, and DNA storage. DBDs in transcription factors recognize specific DNA sequences in promoter regions or enhancer elements to facilitate gene expression. The transcription factor DBD is used as a fusion protein in genetic engineering to modulate target gene expression and can be mutated to alter DNA binding specificity or DNA binding affinity and thus modulate the expression of a desired target gene. Examples of DBDs include, but are not limited to, helix-turn-helix motifs, zinc finger motifs (including Cys2-His2 zinc fingers), transcription activator-like effectors (TALEs), winged helix motifs, HMG cassettes, dCas proteins ( For example, dCas9 or dCas12a), the homeodomain and the OB fold domain.

在一些實施例中,本發明係關於鋅指DBD融合蛋白。如本文使用,「鋅指蛋白(ZFP)」係指含有至少一個結構模體之蛋白質,該結構模體之特徵在於一或多個穩定蛋白質折疊之鋅離子之配位。鋅指係蛋白質中發現之最多樣化結構模體之一,且多達3%之人類基因編碼鋅指。大多數ZFP含有多個鋅指,該等鋅指可與靶分子(包括DNA、RNA及小蛋白泛素)串聯接觸。「經典」鋅指模體由2個半胱胺酸胺基酸及2個組胺酸胺基酸(C2H2)構成,並以序列特異性方式結合DNA。此等ZFP (包括轉錄因子IIIA (TFIIIA))通常參與基因表現。DNA結合蛋白中之多個鋅指模體結合並包裹在DNA雙螺旋之外部。鋅指域融合蛋白由於其等相對較小之尺寸(例如,各指係約25至40,通常27至35個胺基酸)而可用以產生具有新穎DNA結合特異性之DBD。此等DBD可遞送其他融合域(例如,轉錄活化或抑制域或表觀遺傳修飾域)以改變靶基因之轉錄調節。在一些實施例中,鋅指蛋白包含2至8個指,其中各指含有27至40個胺基酸(例如,27、28、29、30、31、32、33、34、35、36、37、38、39或40個胺基酸)。In some embodiments, the invention relates to zinc finger DBD fusion proteins. As used herein, a "zinc finger protein (ZFP)" refers to a protein containing at least one structural motif characterized by the coordination of one or more zinc ions that stabilize the protein fold. Zinc fingers are among the most diverse structural motifs found in proteins, and as many as 3% of human genes encode zinc fingers. Most ZFPs contain multiple zinc fingers that make tandem contacts with target molecules, including DNA, RNA, and the small protein ubiquitin. "Classic" zinc finger motifs consist of 2 cysteine residues and 2 histidine residues (C2H2) and bind DNA in a sequence-specific manner. These ZFPs, including transcription factor IIIA (TFIIIA), are typically involved in gene expression. Multiple zinc finger motifs in a DNA-binding protein bind to and wrap around the outside of the DNA double helix. Because of their relatively small size (e.g., about 25 to 40, typically 27 to 35, amino acids per finger), zinc finger domain fusion proteins can be used to generate DBDs with novel DNA-binding specificities. These DBDs can deliver other fusion domains (e.g., transcriptional activation or repression domains, or epigenetic modification domains) to alter transcriptional regulation of a target gene. In some embodiments, the zinc finger protein comprises 2 to 8 fingers, wherein each finger contains 27 to 40 amino acids (e.g., 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 or 40 amino acids).

在一些實施例中,ZFP包含1、2、3、4、5、6、7或8個鋅指。各鋅指可包含25至40、25至30、30至35、35至40或40至45個胺基酸。在一些實施例中,鋅指包含27至35個胺基酸。在一些實施例中,鋅指包含27、28、29、30、31、32、33、34或35個胺基酸。鋅指可特異性識別或結合至個體中單倍不足之靶序列,例如,靶基因或靶基因之調節區。在一些實施例中,鋅指結合至(例如)如SEQ ID NO: 49中闡述之SCN1A基因(例如,人類SCN1A)之靶序列。在一些實施例中,結合至SCN1A基因之靶序列之鋅指包含SEQ ID NO: 63至80之一或多個胺基酸序列,或其組合。在一些實施例中,鋅指特異性識別或結合至包含三核苷酸序列之靶序列。In some embodiments, the ZFP comprises 1, 2, 3, 4, 5, 6, 7 or 8 zinc fingers. Each zinc finger may contain 25 to 40, 25 to 30, 30 to 35, 35 to 40, or 40 to 45 amino acids. In some embodiments, the zinc fingers comprise 27 to 35 amino acids. In some embodiments, the zinc finger comprises 27, 28, 29, 30, 31, 32, 33, 34 or 35 amino acids. Zinc fingers can specifically recognize or bind to a haploinsufficient target sequence in an individual, eg, a target gene or a regulatory region of a target gene. In some embodiments, the zinc finger binds to, eg, a target sequence of the SCN1A gene (eg, human SCN1A) as set forth in SEQ ID NO: 49. In some embodiments, the zinc finger that binds to the target sequence of the SCN1A gene comprises one or more amino acid sequences of SEQ ID NOs: 63 to 80, or a combination thereof. In some embodiments, the zinc finger specifically recognizes or binds to a target sequence comprising a trinucleotide sequence.

在一些實施例中,鋅指包含識別或結合至靶序列(例如,包含三核苷酸序列之靶序列)之識別螺旋。在一些實施例中,識別螺旋結合至三核苷酸。在一些實施例中,識別螺旋包含4至10個胺基酸。在一些實施例中,識別螺旋包含4、6、7、8、9或10個胺基酸。在一些實施例中,識別螺旋結合至SCN1A基因之三核苷酸序列。在一些實施例中,結合至SCN1A基因之識別序列包含SEQ ID NO: 17至22、29至34或41至46中任一者之胺基酸序列。在一些實施例中,結合至SCN1A基因之識別序列係由SEQ ID NO: 11至16、23至28或35至40中之任一者編碼。在一些實施例中,鋅指結合至與包含SEQ ID NO: 17至22、29至34或41至46中任一者之胺基酸序列之識別螺旋相同之核苷酸序列。In some embodiments, the zinc finger comprises a recognition helix that recognizes or binds to a target sequence (eg, a target sequence comprising a trinucleotide sequence). In some embodiments, the recognition helix is bound to a trinucleotide. In some embodiments, the recognition helix comprises 4 to 10 amino acids. In some embodiments, the recognition helix comprises 4, 6, 7, 8, 9 or 10 amino acids. In some embodiments, the recognition helix binds to the trinucleotide sequence of the SCN1A gene. In some embodiments, the recognition sequence bound to the SCN1A gene comprises the amino acid sequence of any one of SEQ ID NOs: 17-22, 29-34, or 41-46. In some embodiments, the recognition sequence that binds to the SCN1A gene is encoded by any of SEQ ID NOs: 11-16, 23-28, or 35-40. In some embodiments, the zinc finger binds to the same nucleotide sequence as the recognition helix comprising the amino acid sequence of any of SEQ ID NOs: 17-22, 29-34, or 41-46.

在一些實施例中,鋅指於其C末端包含連接子序列,該序列可用以將該鋅指連接或連結至另外鋅指。在一些實施例中,連接子序列可為典型連接子,例如,包含TGEKP之胺基酸序列(SEQ ID NO: 120)。在一些實施例中,連接子序列可為非典型連接子,例如,包含TGSQKP之胺基酸序列(SEQ ID NO: 121)。在一些實施例中,連接子序列可為2至10個胺基酸,例如,2、3、4、5、6、7、8、9或10個胺基酸。In some embodiments, a zinc finger comprises a linker sequence at its C-terminus that can be used to link or link the zinc finger to another zinc finger. In some embodiments, the linker sequence can be a typical linker, eg, an amino acid sequence comprising TGEKP (SEQ ID NO: 120). In some embodiments, the linker sequence can be an atypical linker, eg, an amino acid sequence comprising TGSQKP (SEQ ID NO: 121). In some embodiments, the linker sequence can be 2 to 10 amino acids, eg, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids.

在一些實施例中,結合至靶基因(例如,SCN1A基因)之ZFP包含六個鋅指,其等中之各者識別或結合至該靶基因(例如,SCN1A基因)之不同三核苷酸序列。在一些實施例中,結合至SCN1A基因之ZFP包含SEQ ID NO: 57之胺基酸序列。在一些實施例中,結合至SCN1A基因之ZFP包括包含SEQ ID NO: 63、64、65、66、67及/或68之胺基酸序列之鋅指。在一些實施例中,結合至SCN1A基因之ZFP包括包含SEQ ID NO: 17、18、19、20、21及/或22之胺基酸序列之識別螺旋。在一些實施例中,結合至SCN1A基因之ZFP包含SEQ ID NO: 59之胺基酸序列。在一些實施例中,結合至SCN1A基因之ZFP包括包含SEQ ID NO: 69、70、71、72、73及/或74之胺基酸序列之鋅指。在一些實施例中,結合至SCN1A基因之ZFP包括包含SEQ ID NO: 29、30、31、32、33及/或34之胺基酸序列之識別螺旋。在一些實施例中,結合至SCN1A基因之ZFP包含SEQ ID NO: 61之胺基酸序列。在一些實施例中,結合至SCN1A基因之ZFP包括包含SEQ ID NO: 75、76、77、78、79及/或80之胺基酸序列之鋅指。在一些實施例中,結合至SCN1A基因之ZFP包括包含SEQ ID NO: 41、42 43、44、45及/或46之胺基酸序列之識別螺旋。在一些實施例中,結合至SCN1A基因之ZFP包含與如下文顯示的SEQ ID NO: 57、59或61具有至少60%、至少65%、至少70%、至少75%、至少80%、至少85%、至少90%、至少95%、至少97%或至少99%序列一致性。In some embodiments, a ZFP that binds to a target gene (eg, SCN1A gene) comprises six zinc fingers, each of which recognizes or binds to a different trinucleotide sequence of the target gene (eg, SCN1A gene) . In some embodiments, the ZFP that binds to the SCN1A gene comprises the amino acid sequence of SEQ ID NO:57. In some embodiments, the ZFP that binds to the SCN1A gene comprises a zinc finger comprising the amino acid sequence of SEQ ID NO: 63, 64, 65, 66, 67 and/or 68. In some embodiments, the ZFP that binds to the SCN1A gene includes a recognition helix comprising the amino acid sequence of SEQ ID NO: 17, 18, 19, 20, 21 and/or 22. In some embodiments, the ZFP that binds to the SCN1A gene comprises the amino acid sequence of SEQ ID NO:59. In some embodiments, the ZFP that binds to the SCN1A gene includes a zinc finger comprising the amino acid sequence of SEQ ID NO: 69, 70, 71, 72, 73 and/or 74. In some embodiments, the ZFP that binds to the SCN1A gene includes a recognition helix comprising the amino acid sequence of SEQ ID NOs: 29, 30, 31, 32, 33 and/or 34. In some embodiments, the ZFP that binds to the SCN1A gene comprises the amino acid sequence of SEQ ID NO:61. 
In some embodiments, the ZFP that binds to the SCN1A gene comprises a zinc finger comprising the amino acid sequence of SEQ ID NO: 75, 76, 77, 78, 79 and/or 80. In some embodiments, the ZFP that binds to the SCN1A gene includes a recognition helix comprising the amino acid sequence of SEQ ID NO: 41, 42, 43, 44, 45 and/or 46. In some embodiments, the ZFP that binds to the SCN1A gene comprises an amino acid sequence having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, or at least 99% sequence identity to SEQ ID NO: 57, 59 or 61, as shown below.

SEQ ID NO: 57 (ZFP1 蛋白之 胺基酸序列 )RPFQCRICMRNFSQRGNLVRHIRTHTGEKPFACDICGKKFALSFNLTRHTKIHTGSQKPFQCRICMRNFSRSDNLTRHIRTHTGEKPFACDICGKKFADRSHLARHTKIHTGSQKPFQCRICMRNFSQKAHLTAHIRTHTGEKPFACDICGRKFARSDNLTRHTKIHLRQKD SEQ ID NO: 57 ( amino acid sequence of ZFP1 protein ) RPFQCRICMRNFSQRGNLVRHIRTHTGEKPFACDICGKKFALSFNLTRHTKIHTGSQKPFQCRICMRNFSRSDNLTRHIRTHTGEKPFACDICGKKFADRSHLARHTKIHTGSQKPFQCRICMRNFSQKAHLTAHIRTHTGEKPFACDICGRKFARSDNLTRHTKIHLRQKD

SEQ ID NO: 59 (ZFP2 蛋白之胺基酸序列 )RPFQCRICMRNFSRSSNLTRHIRTHTGEKPFACDICGKKFADKRTLIRHTKIHTGSQKPFQCRICMRNFSQRGNLVRHIRTHTGEKPFACDICGKKFALSFNLTRHTKIHTGSQKPFQCRICMRNFSRSDNLTRHIRTHTGEKPFACDICGRKFADRSHLARHTKIHLRQKD SEQ ID NO: 59 ( amino acid sequence of ZFP2 protein ) RPFQCRICMRNFSRSSNLTRHIRTHTGEKPFACDICGKKFADKRTLIRHTKIHTGSQKPFQCRICMRNFSQRGNLVRHIRTHTGEKPFACDICGKKFALSFNLTRHTKIHTGSQKPFQCRICMRNFSRSDNLTRHIRTHTGEKPFACDICGRKFADRSHLARHTKIHLRQKD

SEQ ID NO: 61 (ZFP3 蛋白之胺基酸序列 )RPFQCRICMRNFSDRSALARHIRTHTGEKPFACDICGKKFARSDNLTRHTKIHTGSQKPFQCRICMRNFSQSGDLTRHIRTHTGEKPFACDICGKKFAVRQTLKQHTKIHTGSQKPFQCRICMRNFSAAGNLTRHIRTHTGEKPFACDICGRKFARSDNLTRHTKIHLRQKD SEQ ID NO: 61 ( amino acid sequence of ZFP3 protein ) RPFQCRICMRNFSDRSALARHIRTHTGEKPFACDICGKKFARSDNLTRHTKIHTGSQKPFQCRICMRNFSQSGDLTRHIRTHTGEKPFACDICGKKFAVRQTLKQHTKIHTGSQKPFQCRICMRNFSAAGNLTRHIRTHTGEKPFACDICGRKFARSDNLTRHTKIHLRQKD
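The six-finger layout and the two linker types described above (TGEKP, SEQ ID NO: 120; TGSQKP, SEQ ID NO: 121) can be made visible by splitting a listed ZFP sequence at its linkers. The following Python sketch is illustrative only: the sequence is copied from the SEQ ID NO: 57 listing above, while the function name `split_fingers` and the split-at-linker approach are assumptions introduced here, not a method described by the source.

```python
import re

# SEQ ID NO: 57 (ZFP1) as listed above, broken after each linker for readability.
ZFP1 = (
    "RPFQCRICMRNFSQRGNLVRHIRTHTGEKP"
    "FACDICGKKFALSFNLTRHTKIHTGSQKP"
    "FQCRICMRNFSRSDNLTRHIRTHTGEKP"
    "FACDICGKKFADRSHLARHTKIHTGSQKP"
    "FQCRICMRNFSQKAHLTAHIRTHTGEKP"
    "FACDICGRKFARSDNLTRHTKIHLRQKD"
)

# Linker sequences named in the text (SEQ ID NO: 121 and SEQ ID NO: 120).
LINKERS = ("TGSQKP", "TGEKP")


def split_fingers(zfp: str) -> list[str]:
    """Split a ZFP sequence at each linker; the linker residues are dropped.

    Hypothetical helper: it assumes linkers occur only between fingers.
    The last segment keeps whatever C-terminal residues follow the final finger.
    """
    pattern = "|".join(re.escape(linker) for linker in LINKERS)
    return [segment for segment in re.split(pattern, zfp) if segment]


fingers = split_fingers(ZFP1)
# One trinucleotide per finger, as stated in the text, gives the target-site length.
target_site_nt = 3 * len(fingers)

for index, finger in enumerate(fingers, start=1):
    print(f"finger {index}: {finger} ({len(finger)} aa, linker excluded)")
print(f"{len(fingers)} fingers -> {target_site_nt} nt target site")
```

Splitting SEQ ID NO: 57 this way yields six segments, consistent with the six-finger ZFPs described above; at one trinucleotide per finger this corresponds to an 18-nucleotide target site.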

在一些實施例中,DBD係轉錄活化子樣效應物蛋白(TALE)。TALE可特異性識別或結合至靶序列,例如,靶基因或靶基因之調節區。在一些實施例中,個體係該靶基因單倍不足。在一些實施例中,TALE結合至SCN1A基因(例如,如SEQ ID NO: 49中提供之人類SCN1A)之靶序列。TALE蛋白係由細菌分泌並結合宿主植物中之啟動子序列以活化有助於細菌感染之植物基因之表現。通常,操作TALE蛋白以結合新穎DNA序列,因為通過由可變數量之~30至35個胺基酸重複序列構成之中央重複域識別靶序列,其中各重複序列識別該靶序列內之單一鹼基對。此等重複序列之陣列通常係識別DNA序列必需的。In some embodiments, the DBD is a transcription activator-like effector (TALE) protein. A TALE can specifically recognize or bind to a target sequence, e.g., a target gene or a regulatory region of a target gene. In some embodiments, the individual is haploinsufficient for the target gene. In some embodiments, the TALE binds to a target sequence of the SCN1A gene (e.g., human SCN1A as provided in SEQ ID NO: 49). TALE proteins are secreted by bacteria and bind promoter sequences in host plants to activate the expression of plant genes that aid bacterial infection. TALE proteins are commonly engineered to bind novel DNA sequences because the target sequence is recognized by a central repeat domain made up of a variable number of repeats of ~30 to 35 amino acids each, where each repeat recognizes a single base pair within the target sequence. An array of these repeats is typically required to recognize a DNA sequence.
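Because each TALE repeat recognizes one base pair, the number of repeats tracks the length of the target site. The Python sketch below illustrates that one-repeat-per-base relationship. The repeat-variable diresidue (RVD) assignments used (NI for A, HD for C, NN for G, NG for T) are the commonly cited canonical code and are background knowledge, not taken from the source; the target site and the function name `tale_repeats` are hypothetical.

```python
# Commonly cited canonical RVD code (background knowledge, not from the source).
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def tale_repeats(target: str) -> list[str]:
    """Return one RVD per base of the target site (one repeat per base pair)."""
    try:
        return [RVD_FOR_BASE[base] for base in target.upper()]
    except KeyError as exc:
        raise ValueError(f"unsupported base: {exc.args[0]}") from None


site = "GATTACA"  # hypothetical target site, for illustration only
rvds = tale_repeats(site)
print(f"{len(rvds)} repeats: {'-'.join(rvds)}")
```

A real design would also have to account for repeat specificity (some RVDs tolerate more than one base) and flanking-sequence preferences, which this sketch ignores.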

在一些實施例中,DBD係同源域。同源域可特異性識別或結合至靶序列,例如,靶基因或靶基因之調節區。在一些實施例中,個體係該靶基因單倍不足。在一些實施例中,同源域結合至SCN1A基因(例如,如SEQ ID NO: 49中提供之人類SCN1A)之靶序列。同源域係含有三個α螺旋及N端臂之負責識別靶序列之蛋白質。同源域通常識別小DNA序列(~4至8個鹼基對),然而此等域可與其他DNA結合域(其他同源域或鋅指蛋白)串聯融合以識別較長延伸序列(12至24個鹼基對)。因此,同源域可為識別人類基因體內之唯一序列之DBD組分。In some embodiments, the DBD is a homeodomain. A homeodomain can specifically recognize or bind to a target sequence, e.g., a target gene or a regulatory region of a target gene. In some embodiments, the individual is haploinsufficient for the target gene. In some embodiments, the homeodomain binds to a target sequence of the SCN1A gene (e.g., human SCN1A as provided in SEQ ID NO: 49). Homeodomains are protein domains containing three α-helices and an N-terminal arm that are responsible for recognizing the target sequence. Homeodomains typically recognize short DNA sequences (~4 to 8 base pairs); however, these domains can be fused in tandem with other DNA-binding domains (other homeodomains or zinc finger proteins) to recognize longer sequences (12 to 24 base pairs). Thus, a homeodomain can be a component of a DBD that recognizes a unique sequence within the human genome.

在一些實施例中,至少一個DNA結合域係無觸媒活性CRISPR相關蛋白(Cas蛋白)。無觸媒活性Cas蛋白(亦稱為dCas或「死Cas蛋白」)係已經修飾或突變使得核酸酶活性(例如,核酸內切酶活性)減弱或缺乏所有核酸酶活性(例如,核酸內切酶活性)之Cas蛋白。在一些實施例中,無觸媒活性Cas蛋白係dCas9或dCas12蛋白。在一些實施例中,DBD係dCas蛋白(亦稱為「死Cas」),諸如dCas9或dCas12a。dCas蛋白係已經突變使得無觸媒活性(即,無法裂解核苷酸)之CRISPR相關蛋白之突變變體(Cas,例如,Cas9或Cas12a)。dCas可特異性識別或結合至靶序列,例如,靶基因或靶基因之調節區。包含dCas蛋白及引導核酸(例如,gRNA)之複合物可靶向及/或結合至與該引導核酸互補之特異性核苷酸序列或基因。在一些實施例中,個體係靶基因單倍不足。在一些實施例中,dCas結合至SCN1A基因(例如,如SEQ ID NO: 49中提供之人類SCN1A)之靶序列。然而,當結合至與該標靶DNA序列互補或部分互補之引導核酸(例如,引導RNA、gRNA或sgRNA)時,dCas蛋白保留其等識別並結合至標靶DNA序列之能力。在一些實施例中,用於將dCas (例如,dCas9)蛋白靶向SCN1A之引導核酸包含具有SEQ ID NO: 85、86、89、90、93或94中任一者之間隔區序列。在一些實施例中,用於將dCas (例如,dCas9)蛋白靶向SCN1A之引導核酸包含具有SEQ ID NO: 85、86、89、90、93或94中任一者之至少15 (例如,至少16、17、18、19或20)個連續核苷酸之間隔區序列。在一些實施例中,用於將dCas (例如,dCas9)蛋白靶向SCN1A之引導核酸包含SEQ ID NO: 83、84、87、88、91或92中之任一者。在一些實施例中,用於將dCas (例如,dCas9)蛋白靶向SCN1A之引導核酸包含SEQ ID NO: 83至94中之任一者或由其構成。因此,dCas核酸內切酶可為識別人類基因體內之唯一序列之DBD組分。在一些實施例中,融合蛋白包含dCas9蛋白及轉活化域(例如,VPR域)。In some embodiments, at least one DNA binding domain is a catalytically inactive CRISPR-associated protein (Cas protein). Catalytically inactive Cas proteins (also known as dCas or "dead Cas proteins") are those that have been modified or mutated such that nuclease activity (eg, endonuclease activity) is reduced or lacks all nuclease activity (eg, endonuclease activity) activity) of the Cas protein. In some embodiments, the catalytically inactive Cas protein is a dCas9 or dCas12 protein. In some embodiments, the DBD is a dCas protein (also known as "dead Cas"), such as dCas9 or dCas12a. dCas proteins are mutant variants of CRISPR-associated proteins (Cas, eg, Cas9 or Cas12a) that have been mutated to render them inactive (ie, unable to cleave nucleotides). dCas can specifically recognize or bind to a target sequence, eg, a target gene or a regulatory region of a target gene. A complex comprising a dCas protein and a guide nucleic acid (eg, gRNA) can target and/or bind to a specific nucleotide sequence or gene complementary to the guide nucleic acid. 
In some embodiments, the individual is haploinsufficient for the target gene. In some embodiments, the dCas binds to a target sequence of the SCN1A gene (e.g., human SCN1A as provided in SEQ ID NO: 49). Although catalytically inactive, dCas proteins retain the ability to recognize and bind a target DNA sequence when bound to a guide nucleic acid (e.g., guide RNA, gRNA, or sgRNA) that is complementary or partially complementary to that target DNA sequence. In some embodiments, the guide nucleic acid used to target a dCas (e.g., dCas9) protein to SCN1A comprises a spacer sequence having any one of SEQ ID NOs: 85, 86, 89, 90, 93 or 94. In some embodiments, the guide nucleic acid used to target a dCas (e.g., dCas9) protein to SCN1A comprises a spacer sequence having at least 15 (e.g., at least 16, 17, 18, 19 or 20) consecutive nucleotides of any one of SEQ ID NOs: 85, 86, 89, 90, 93 or 94. In some embodiments, the guide nucleic acid used to target a dCas (e.g., dCas9) protein to SCN1A comprises any one of SEQ ID NOs: 83, 84, 87, 88, 91 or 92. In some embodiments, the guide nucleic acid used to target a dCas (e.g., dCas9) protein to SCN1A comprises or consists of any one of SEQ ID NOs: 83 to 94. Thus, the dCas endonuclease can be a component of a DBD that recognizes a unique sequence within the human genome. In some embodiments, the fusion protein comprises a dCas9 protein and a transactivation domain (e.g., a VPR domain).
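The "at least 15 consecutive nucleotides" criterion for spacer sequences can be checked mechanically by finding the longest contiguous stretch shared by a candidate spacer and a reference spacer. The Python sketch below is one way to do that under stated assumptions: the reference sequence is a made-up placeholder (the sequences of SEQ ID NOs: 85, 86, 89, 90, 93 and 94 are not reproduced in this section), and the function name `longest_shared_run` is hypothetical.

```python
def longest_shared_run(a: str, b: str) -> int:
    """Length of the longest contiguous stretch present in both sequences.

    Standard longest-common-substring dynamic programme, kept to two rows.
    """
    best = 0
    previous = [0] * (len(b) + 1)
    for char_a in a:
        current = [0] * (len(b) + 1)
        for j, char_b in enumerate(b, start=1):
            if char_a == char_b:
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best


# Placeholder 20-nt reference spacer (hypothetical; NOT one of the patent's SEQ ID NOs).
REFERENCE_SPACER = "GACCTGATCGATTGCAAGTC"
# Candidate that keeps 17 consecutive reference nucleotides and differs elsewhere.
candidate = REFERENCE_SPACER[2:19] + "TTT"

MIN_CONSECUTIVE = 15  # threshold recited in the text
run = longest_shared_run(candidate, REFERENCE_SPACER)
print(f"longest shared run: {run} nt; meets threshold: {run >= MIN_CONSECUTIVE}")
```

The same check could be repeated with the thresholds of 16 to 20 nucleotides listed in the text by changing `MIN_CONSECUTIVE`.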

在一些態樣中,本發明係關於結合至編碼電位閘控鈉通道(例如,Nav1.1)之基因之DNA結合域。在一些實施例中,編碼電位閘控鈉通道之基因係SCN1A基因,且包含SEQ ID NO: 49中闡述之序列。在一些實施例中,DNA結合域結合至靶基因之非轉譯區,諸如3’-非轉譯區(3’UTR)或5’-非轉譯區(5’UTR)。在一些實施例中,非轉譯區包含調節序列,例如強化子、啟動子、內含子或抑制子序列。在一些實施例中,DNA結合域係包含SEQ ID NO: 57至62中闡述之序列之鋅指蛋白。在一些實施例中,DNA結合域結合至SEQ ID NO: 5至7之任一者中闡述之核酸序列。在一些實施例中,DNA結合域結合至SEQ ID NO: 5至7之任一者中闡述之核酸序列之全長。在一些實施例中,DNA結合域結合至SEQ ID NO: 5至7之任一者中闡述之核酸序列之至少6、7、8、9、10、11、12、13、14、15、16、17、18、19、20、21、22、23、24、25、26、27、28、29、30、35、40、45或50個連續核苷酸。在一些實施例中,DNA結合域結合至SEQ ID NO: 3中闡述之核酸序列。在一些實施例中,DNA結合域結合至SEQ ID NO: 3之任一者中闡述之核酸序列之至少6、7、8、9、10、11、12、13、14、15、16、17、18、19、20、21、22、23、24、25、26、27、28、29、30、35、40、45或50個連續核苷酸。In some aspects, the invention relates to DNA binding domains that bind to a gene encoding a voltage-gated sodium channel (e.g., Nav1.1). In some embodiments, the gene encoding the voltage-gated sodium channel is the SCN1A gene and comprises the sequence set forth in SEQ ID NO: 49. In some embodiments, the DNA binding domain binds to an untranslated region of the target gene, such as the 3'-untranslated region (3'UTR) or the 5'-untranslated region (5'UTR). In some embodiments, the untranslated region comprises a regulatory sequence, such as an enhancer, promoter, intron or repressor sequence. In some embodiments, the DNA binding domain is a zinc finger protein comprising a sequence set forth in SEQ ID NOs: 57 to 62. In some embodiments, the DNA binding domain binds to the nucleic acid sequence set forth in any one of SEQ ID NOs: 5 to 7. In some embodiments, the DNA binding domain binds to the full length of the nucleic acid sequence set forth in any one of SEQ ID NOs: 5 to 7. In some embodiments, the DNA binding domain binds to at least 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45 or 50 consecutive nucleotides of the nucleic acid sequence set forth in any one of SEQ ID NOs: 5 to 7. In some embodiments, the DNA binding domain binds to the nucleic acid sequence set forth in SEQ ID NO: 3.
In some embodiments, the DNA binding domain binds to at least 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45 or 50 consecutive nucleotides of the nucleic acid sequence set forth in SEQ ID NO: 3.

SEQ ID NO: 3 SCN1A 靶向區,核苷酸序列TTTTTTTTTTTTTTTTTGAAACAAGCTATTTGCTGATTTGTATTAGGTACCATAGAGTGAGGCGAGGATGAAGCCGAGAGGATACTGCAGAGGTCTCTGGTGCATGTGTGTATGTGTGCGTTTGTGTGTG SEQ ID NO: 3 , SCN1A targeting region, nucleotide sequence TTTTTTTTTTTTTTTTTGAAACAAGCTATTTGCTGATTTGTATTAGGTACCATAGAGTGAGGCGAGGATGAAGCCGAGAGGATACTGCAGAGGTCTCTGGTGCATGTGTGTATGTGTGCGTTTGTGTGTG
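The "at least N consecutive nucleotides" language above amounts to choosing a contiguous window within the targeting region. As a rough illustration, the Python sketch below enumerates the windows of a given length. The `REGION` string is an excerpt of the SEQ ID NO: 3 listing above, beginning after the leading poly-T run (the excerpt boundary is a choice made here for readability), and the function name `windows` is hypothetical. Only one strand is considered.

```python
# Excerpt of SEQ ID NO: 3 as listed above, starting after the leading poly-T run.
REGION = (
    "GAAACAAGCTATTTGCTGATTTGTATTAGGTACCATAGAGTGAGGCGAGG"
    "ATGAAGCCGAGAGGATACTGCAGAGGTCTCTGGTGCATGTGTGTATGTGT"
    "GCGTTTGTGTGTG"
)


def windows(sequence: str, length: int) -> list[str]:
    """All contiguous sub-sequences of the given length, in 5'-to-3' order.

    Returns an empty list when the requested length exceeds the sequence.
    """
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


# A sequence of L nucleotides contains L - k + 1 windows of length k.
for k in (6, 18, 50):
    print(f"{k}-nt windows in the excerpt: {len(windows(REGION, k))}")
```

An 18-nucleotide window matches the footprint of a six-finger ZFP that reads one trinucleotide per finger; which windows are actually bound is determined experimentally and is not implied by this enumeration.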

由轉基因編碼之DNA結合域之數量可變化。在一些實施例中,轉基因編碼一個DNA結合域。在一些實施例中,轉基因編碼2個DNA結合域。在一些實施例中,轉基因編碼3個DNA結合域。在一些實施例中,轉基因編碼4個DNA結合域。在一些實施例中,轉基因編碼5個DNA結合域。在一些實施例中,轉基因編碼6個DNA結合域。在一些實施例中,轉基因編碼7個DNA結合域。在一些實施例中,轉基因編碼8個DNA結合域。在一些實施例中,轉基因編碼9個DNA結合域。在一些實施例中,轉基因編碼10個DNA結合域。在一些實施例中,轉基因編碼超過10 (例如,20、30、50、100等)個DNA結合域。該等DNA結合域可為相同DNA結合域(例如,相同DBD之多個拷貝)、不同DNA結合域(例如,各DBD結合唯一序列),或其組合。The number of DNA binding domains encoded by the transgene can vary. In some embodiments, the transgene encodes a DNA binding domain. In some embodiments, the transgene encodes two DNA binding domains. In some embodiments, the transgene encodes three DNA binding domains. In some embodiments, the transgene encodes four DNA binding domains. In some embodiments, the transgene encodes five DNA binding domains. In some embodiments, the transgene encodes six DNA binding domains. In some embodiments, the transgene encodes seven DNA binding domains. In some embodiments, the transgene encodes 8 DNA binding domains. In some embodiments, the transgene encodes nine DNA binding domains. In some embodiments, the transgene encodes 10 DNA binding domains. In some embodiments, the transgene encodes more than 10 (eg, 20, 30, 50, 100, etc.) DNA binding domains. The DNA binding domains can be the same DNA binding domain (eg, multiple copies of the same DBD), different DNA binding domains (eg, each DBD binds a unique sequence), or a combination thereof.

在一些態樣中,本發明係關於包含轉活化子域之融合蛋白。如本文使用,「轉活化域」係指含有調節基因表現之其他蛋白質(諸如轉錄共調節因子)之結合位點之轉錄因子中之支架域。在一些實施例中,轉活化域(亦稱為轉錄活化域)結合DBD一起用以直接經由接觸轉錄因子,或間接經由共活化子蛋白而自啟動子或強化子活化轉錄。轉活化域(TAD)通常以其胺基酸組成命名,其中該等胺基酸對活性而言係必需的或為TAD中最豐富的。TAD在基因工程中用作融合蛋白以調節靶基因之表現且可突變以改變轉錄活化之程度並因此改變該靶基因之表現。轉活化域之實例包括(但不限於) GAL4、HAP1、VP16、P65、RTA、GCN4、TCF4 AD1、TCF4 AD2、MEF2A、MEF2C、MEF2D、Sp1富麩胺酸域、p53、E2F1、MyoD、MAPK7、NF1B富脯胺酸域、RelA及HSF1。In some aspects, the invention relates to fusion proteins comprising transactivator domains. As used herein, a "transactivation domain" refers to a scaffold domain in a transcription factor that contains binding sites for other proteins that regulate gene expression, such as transcriptional coregulators. In some embodiments, a transactivation domain (also referred to as a transcription activation domain) is used in conjunction with DBD to activate transcription from a promoter or enhancer, either directly via contact with a transcription factor, or indirectly via a coactivator protein. Transactivation domains (TADs) are often named for their amino acid composition, which are either essential for activity or the most abundant in TADs. TAD is used as a fusion protein in genetic engineering to modulate the expression of a target gene and can be mutated to alter the degree of transcriptional activation and thus the expression of the target gene. Examples of transactivation domains include, but are not limited to, GAL4, HAP1, VP16, P65, RTA, GCN4, TCF4 AD1, TCF4 AD2, MEF2A, MEF2C, MEF2D, Sp1 glutamate-rich domain, p53, E2F1, MyoD, MAPK7, NF1B proline-rich domain, RelA and HSF1.

在一些實施例中,轉活化子域包含VP64域。VP64係由VP16蛋白之四個串聯拷貝構成之酸性TAD,其由單純疱疹病毒天然表現。當融合至於基因之啟動子處或附近結合之DBD時,VP64用作強轉錄活化子且可因此用以調節靶基因(例如,SCN1A)之表現。該VP64域通常由單純疱疹蛋白VP16之最小活化域之四聚體重複序列構成。在一些實施例中,該VP64域包含VP16中胺基酸殘基437至448之四個重複序列。在一些實施例中,VP16蛋白係由人類疱疹病毒2 UL48基因編碼,人類疱疹病毒2 UL48基因包含NCBI參考序列寄存編號:NC_001798.2中闡述之序列。在一些實施例中,VP16基因包含與由NCBI參考序列寄存編號:YP_009137200.1中闡述之核酸序列編碼之胺基酸序列具有99%一致性、95%一致性、90%一致性、80%一致性、70%一致性、60%一致性或50%一致性之核苷酸序列。在一些實施例中,VP16蛋白包含與NCBI參考序列寄存編號Q69113-1中闡述之胺基酸序列具有99%一致性、95%一致性、90%一致性、80%一致性、70%一致性、60%一致性或50%一致性之胺基酸序列。在一些實施例中,VP16基因包含與由SEQ ID NO: 51中闡述之核酸序列編碼之胺基酸序列具有99%一致性、95%一致性、90%一致性、80%一致性、70%一致性、60%一致性或50%一致性之核苷酸序列。在一些實施例中,VP16蛋白包含與SEQ ID NO: 52中闡述之胺基酸序列具有99%一致性、95%一致性、90%一致性、80%一致性、70%一致性、60%一致性或50%一致性之胺基酸序列。In some embodiments, the transactivation subdomain comprises a VP64 domain. VP64 is an acidic TAD consisting of four tandem copies of the VP16 protein, which is naturally expressed by the herpes simplex virus. When fused to DBD bound at or near the promoter of a gene, VP64 acts as a strong transcriptional activator and can thus be used to regulate the expression of target genes (eg, SCN1A). The VP64 domain typically consists of a tetrameric repeat of the minimal activation domain of the herpes simplex protein VP16. In some embodiments, the VP64 domain comprises four repeats of amino acid residues 437 to 448 in VP16. In some embodiments, the VP16 protein is encoded by the human herpesvirus 2 UL48 gene comprising the sequence set forth in NCBI Reference Sequence Accession No.: NC_001798.2. In some embodiments, the VP16 gene comprises 99% identity, 95% identity, 90% identity, 80% identity to the amino acid sequence encoded by the nucleic acid sequence set forth in NCBI Reference Sequence Accession No.: YP_009137200.1 Nucleotide sequences that are identical, 70% identical, 60% identical or 50% identical. 
In some embodiments, the VP16 protein comprises 99% identity, 95% identity, 90% identity, 80% identity, 70% identity to the amino acid sequence set forth in NCBI Reference Sequence Accession No. Q69113-1 , 60% identical or 50% identical amino acid sequences. In some embodiments, the VP16 gene comprises 99% identity, 95% identity, 90% identity, 80% identity, 70% identity to the amino acid sequence encoded by the nucleic acid sequence set forth in SEQ ID NO: 51 Nucleotide sequences that are identical, 60% identical, or 50% identical. In some embodiments, the VP16 protein comprises 99% identity, 95% identity, 90% identity, 80% identity, 70% identity, 60% identity to the amino acid sequence set forth in SEQ ID NO: 52 Amino acid sequences that are identical or 50% identical.
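The identity levels recited above (99%, 95%, 90%, 80%, 70%, 60%, 50%) can be related to a concrete calculation. The Python sketch below computes percent identity for two sequences that are assumed to be already aligned and of equal length, then reports the highest recited level that is met. It is a simplification: the source does not specify an alignment method, practical comparisons normally use an alignment tool that handles gaps, and treating each level as an "at least" threshold is an assumption made here. The example sequences and function names are hypothetical.

```python
# Identity levels recited in the text, highest first.
IDENTITY_LEVELS = (99, 95, 90, 80, 70, 60, 50)


def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def highest_level_met(identity: float):
    """Highest recited identity level that the value reaches, or None if below all."""
    for level in IDENTITY_LEVELS:
        if identity >= level:
            return level
    return None


# Toy 10-residue sequences differing at one position (hypothetical, for illustration).
reference = "ACDEFGHIKL"
variant = "ACDEFGHIKV"

identity = percent_identity(reference, variant)
print(f"identity: {identity:.1f}%; highest level met: {highest_level_met(identity)}")
```

For the toy pair, nine of ten positions match, so the identity is 90%.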

In some embodiments, the transactivator domain comprises a p65 activation domain. p65 is a subunit of the NF-κB transcription factor, and its C-terminus contains two adjacent acidic TADs. When fused to a DBD that binds at or near the promoter of a gene, the p65 protein acts as a strong transcriptional activator and can therefore be used to regulate expression of a target gene, for example as described by Urlinger et al., "The p65 domain from NF-kappaB is an efficient human activator in the tetracycline-regulatable gene expression system", Gene, 2000. In some embodiments, the p65 protein is encoded by the human RELA gene, which comprises the sequence set forth in NCBI Reference Sequence Accession No. NM_001145138.1, NM_001243984.1, NM_001243985.1, or NM_021975.3. In some embodiments, the RELA gene comprises a nucleotide sequence having 99%, 95%, 90%, 80%, 70%, 60%, or 50% identity to the amino acid sequence encoded by the nucleic acid sequence set forth in any of NCBI Reference Sequence ID NOs: NM_001145138.1, NM_001243984.1, NM_001243985.1, or NM_021975.3. In some embodiments, the p65 protein comprises an amino acid sequence that is 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequences set forth in NP_001138610.1, NP_001230913.1, NP_001230914.1, and NP_068110.3. In some embodiments, the RELA gene comprises a nucleotide sequence having 99%, 95%, 90%, 80%, 70%, 60%, or 50% identity to the amino acid sequence encoded by the nucleic acid sequence set forth in SEQ ID NO: 53. In some embodiments, the p65 protein comprises an amino acid sequence that is 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence set forth in SEQ ID NO: 54.

In some embodiments, the transactivator domain comprises an RTA domain. RTA is a hydrophobic TAD derived from Epstein-Barr virus; it is a potent transactivation domain that binds enhancer regions to promote expression of several viral genes. When fused to a DBD that binds at or near the promoter of a gene, the RTA protein acts as a strong transcriptional activator and can therefore be used to regulate expression of a target gene, for example as described by Miyazawa et al., "IL-10 promoter transactivation by the viral K-RTA protein involves the host-cell transcription factors, specificity proteins 1 and 3", Journal of Biological Chemistry, 2018. In some embodiments, the RTA protein is encoded by the Epstein-Barr virus BRLF1 gene, which comprises the sequence set forth in NCBI Reference Sequence Accession No. YP_041674.1. In some embodiments, the BRLF1 gene comprises a nucleotide sequence having 99%, 95%, 90%, 80%, 70%, 60%, or 50% identity to the amino acid sequence encoded by the nucleic acid sequence set forth in NCBI Reference Sequence ID NO: YP_041674.1. In some embodiments, the RTA protein comprises an amino acid sequence that is 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence set forth in YP_041674.1. In some embodiments, the BRLF1 gene comprises a nucleotide sequence having 99%, 95%, 90%, 80%, 70%, 60%, or 50% identity to the amino acid sequence encoded by the nucleic acid sequence set forth in SEQ ID NO: 55. In some embodiments, the RTA protein comprises an amino acid sequence that is 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence set forth in SEQ ID NO: 56.

In some embodiments, the transactivator domain comprises a transcription factor 4 (TCF4) activation domain. In some embodiments, the TCF4 activation domain is a protein domain of the TCF4 protein. In some embodiments, the TCF4 protein is encoded by the human TCF4 gene, which comprises the sequence set forth in NCBI Reference Sequence Accession No. NM_003199. In some embodiments, the TCF4 gene comprises a nucleotide sequence having 99%, 95%, 90%, 80%, 70%, 60%, or 50% identity to the amino acid sequence encoded by the nucleic acid sequence set forth in NCBI Reference Sequence ID NO: NM_003199. In some embodiments, the TCF4 protein comprises an amino acid sequence that is 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence set forth in NP_003190.1. In some embodiments, the TCF4 activation domain comprises an amino acid sequence that is 100%, 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence set forth in SEQ ID NO: 122. In some embodiments, the TCF4 activation domain comprises an amino acid sequence that is 100%, 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence set forth in SEQ ID NO: 123.

SEQ ID NO: 122, TCF4 activation domain 1, amino acid sequence: MHHQQRMAALGTDKELSDLLDFSAMFSPPVSSGKNGPTSLASGHFTGSNVEDRSSSGSWGNGGHPSPSRNYGDGTPYDHMTSRDLGSHDNLSPPFVNS

SEQ ID NO: 123, TCF4 activation domain 2, amino acid sequence: TNNSFSSNPSTPVGSPPSLSAGTAVWSRNGGQASSSPNYEGPLHSLQSRIEDRLERLDDAIHVLRNHAVGPS

In some embodiments, the transactivator domain comprises a myocyte-specific enhancer factor 2A (MEF2A) activation domain. In some embodiments, the MEF2A activation domain is a protein domain of the MEF2A protein. In some embodiments, the MEF2A protein is encoded by the human MEF2A gene, which comprises the sequence set forth in any of NCBI Reference Sequence Accession Nos. NM_001130926.2, NM_001130927.3, or NM_001130928.2. In some embodiments, the MEF2A protein comprises a sequence that is 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence encoded by the nucleic acid sequence set forth in any of NCBI Reference Sequence ID NOs: NM_001130926.2, NM_001130927.3, or NM_001130928.2. In some embodiments, the MEF2A protein comprises an amino acid sequence that is 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence set forth in NP_001124398.1, NP_001124399.1, or NP_001124400.1. In some embodiments, the MEF2A activation domain comprises an amino acid sequence that is 100%, 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence set forth in SEQ ID NO: 124.

SEQ ID NO: 124, MEF2A activation domain, amino acid sequence: PLSEEEELELNTQR

In some embodiments, the transactivator domain comprises a myocyte enhancer factor 2C (MEF2C) activation domain. In some embodiments, the MEF2C activation domain is a protein domain of the MEF2C protein. In some embodiments, the MEF2C protein is encoded by the human MEF2C gene, which comprises the sequence set forth in any of NCBI Reference Sequence Accession Nos. NM_001131005.2, NM_001193347.1, NM_001193348.1, or NM_001193349.2. In some embodiments, the MEF2C gene comprises a nucleotide sequence that is 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the nucleic acid sequence set forth in any of NCBI Reference Sequence ID NOs: NM_001131005.2, NM_001193347.1, NM_001193348.1, or NM_001193349.2. In some embodiments, the MEF2C protein comprises an amino acid sequence that is 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence set forth in any of NCBI Reference Sequence ID NOs: NP_001124477.1, NP_001180276.1, NP_001180277.1, or NP_001180278.1. In some embodiments, the MEF2C activation domain comprises an amino acid sequence that is 100%, 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence set forth in SEQ ID NO: 125.

SEQ ID NO: 125, MEF2C activation domain, amino acid sequence: SVSEDVDLLLNQR

In some embodiments, the transactivator domain comprises a myocyte enhancer factor 2D (MEF2D) activation domain. In some embodiments, the MEF2D activation domain is a protein domain of the MEF2D protein. In some embodiments, the MEF2D protein is encoded by the human MEF2D gene, which comprises the sequence set forth in NCBI Reference Sequence Accession No. NM_001271629.2 or NM_005920.4. In some embodiments, the MEF2D gene comprises a nucleotide sequence having 99%, 95%, 90%, 80%, 70%, 60%, or 50% identity to the amino acid sequence encoded by the nucleic acid sequence set forth in either of NCBI Reference Sequence ID NOs: NM_001271629.2 or NM_005920.4. In some embodiments, the MEF2D protein comprises an amino acid sequence that is 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence set forth in either of NCBI Reference Sequence ID NOs: NP_001258558.1 or NP_005911.1. In some embodiments, the MEF2D activation domain comprises an amino acid sequence that is 100%, 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence set forth in SEQ ID NO: 126.

SEQ ID NO: 126, MEF2D activation domain, amino acid sequence: HLTEDHLDLNNAQR
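
The percent-identity levels recited for these activation domains can be made concrete with a small calculation. The following Python sketch is illustrative only. It assumes two sequences that are already aligned without gaps and have equal length, and it defines percent identity as the number of matching positions divided by the alignment length. The text does not specify an alignment method or an identity formula, so the function names (`percent_identity`, `highest_threshold_met`) and the "at least X%" reading of each recited level are assumptions. The example compares the MEF2A (SEQ ID NO: 124) and MEF2D (SEQ ID NO: 126) activation domain sequences, which happen to have the same length.

```python
from typing import Optional, Sequence

# Sequences copied from SEQ ID NO: 124 and SEQ ID NO: 126.
MEF2A_AD = "PLSEEEELELNTQR"
MEF2D_AD = "HLTEDHLDLNNAQR"

# Identity levels recited in the text, highest first.
THRESHOLDS = (100, 99, 95, 90, 80, 70, 60, 50)


def percent_identity(a: str, b: str) -> float:
    """Percent of positions carrying the same residue in two ungapped,
    equal-length sequences (an assumed, simplified definition)."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def highest_threshold_met(
    pid: float, thresholds: Sequence[int] = THRESHOLDS
) -> Optional[int]:
    """Return the largest recited level that pid reaches, or None if it
    falls below all of them (assumes an 'at least X%' interpretation)."""
    for level in sorted(thresholds, reverse=True):
        if pid >= level:
            return level
    return None


if __name__ == "__main__":
    pid = percent_identity(MEF2A_AD, MEF2D_AD)
    print(f"{pid:.1f}% identical; level met: {highest_threshold_met(pid)}")
```

Sequences of different lengths, such as SEQ ID NO: 125 against SEQ ID NO: 124, would first need a gapped alignment, which this sketch does not attempt.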

In some embodiments, the transactivator domain comprises a transcription factor Sp1 glutamine-rich activation domain. In some embodiments, the Sp1 activation domain is a protein domain of the Sp1 protein. In some embodiments, the Sp1 protein is encoded by the human SP1 gene, which comprises the sequence set forth in NCBI Reference Sequence Accession No. NM_001251825.2. In some embodiments, the SP1 gene comprises a nucleotide sequence that is 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the nucleic acid sequence set forth in NCBI Reference Sequence ID No: NM_001251825.2. In some embodiments, the Sp1 protein comprises an amino acid sequence that is 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence set forth in NCBI Reference No. NP_001238754.1. In some embodiments, the transcription factor Sp1 glutamine-rich activation domain comprises an amino acid sequence that is 100%, 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence set forth in SEQ ID NO: 127.

SEQ ID NO: 127, SP1 glutamine-rich activation domain, amino acid sequence: NSVSAATLTPSSQAVTISSSGSQESGSQPVTSGTTISSASLVSSQASSSSFFTNANSYSTTTTTSNMGIMNFTTSGSSGTNSQGQTPQRVSGLQGSDALNIQQNQTSGGSLQAGQQKEGEQNQQTQQQQILIQPQLVQGGQALQALQAAPLSGQTFTTQAISQETLQNLQLQAVPNSGPIIIRTPTVGPNGQVSWQTLQLQNLQVQNPQAQTITLAPMQGVSLGQTSSSNTTLTPIA

In some embodiments, the transactivator domain comprises a tumor protein p53 activation domain. In some embodiments, the p53 activation domain is a protein domain of the p53 protein. In some embodiments, the p53 protein is encoded by the human p53 gene, which comprises the sequence set forth in NCBI Reference Sequence Accession No. NM_000546.6 or NM_001126112.2. In some embodiments, the p53 gene comprises a nucleotide sequence that is 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the nucleic acid sequence set forth in NCBI Reference Sequence ID No: NM_000546.6 or NM_001126112.2. In some embodiments, the p53 protein comprises an amino acid sequence that is 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence set forth in NCBI Reference No. NP_000537.3 or NP_001119584.1. In some embodiments, the p53 activation domain comprises an amino acid sequence that is 100%, 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence set forth in SEQ ID NO: 128.

SEQ ID NO: 128, p53 activation domain, amino acid sequence: MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGPDEAPRMPEAAPPVAPAPAAPTPAAPAPAPSWPLS

In some embodiments, the transactivator domain comprises an E2F transcription factor 1 (E2F1) activation domain. In some embodiments, the E2F1 activation domain is a protein domain of the E2F1 protein. In some embodiments, the E2F1 protein is encoded by the human E2F1 gene, which comprises the sequence set forth in NCBI Reference Sequence Accession No. NM_005225.3. In some embodiments, the E2F1 gene comprises a nucleotide sequence that is 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the nucleic acid sequence set forth in NCBI Reference Sequence ID No: NM_005225.3. In some embodiments, the E2F1 protein comprises an amino acid sequence that is 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence set forth in NCBI Reference No. NP_005216.1. In some embodiments, the E2F1 activation domain comprises amino acid residues 380 to 437 of the E2F1 protein having the amino acid sequence set forth in NCBI Reference No. NP_005216.1. In some embodiments, the E2F1 activation domain comprises an amino acid sequence that is 100%, 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence set forth in SEQ ID NO: 129.

SEQ ID NO: 129, E2F transcription factor 1 activation domain, amino acid sequence: ADSLLEHVREDFSGLLPEEFISLSPPHEALDYHFGLEEGEGIRDLFDCDFGDLTPLDF

In some embodiments, the transactivator domain comprises a myoblast determination protein 1 (MyoD) activation domain. In some embodiments, the MyoD activation domain is a protein domain of the MyoD protein. In some embodiments, the MyoD protein is encoded by the human MyoD gene, which comprises the sequence set forth in NCBI Reference Sequence Accession No. NM_002478.5. In some embodiments, the MyoD gene comprises a nucleotide sequence that is 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the nucleic acid sequence set forth in NCBI Reference Sequence ID No: NM_002478.5. In some embodiments, the MyoD protein comprises an amino acid sequence that is 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence set forth in NCBI Reference No. NP_002469.2. In some embodiments, the MyoD activation domain comprises amino acid residues 1 to 63 of the MyoD protein having the amino acid sequence set forth in NCBI Reference No. NP_002469.2. In some embodiments, the MyoD activation domain comprises an amino acid sequence that is 100%, 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence set forth in SEQ ID NO: 130.

SEQ ID NO: 130, MyoD activation domain, amino acid sequence: MELLSPPLRDVDLTAPDGSLCSFATTDDFYDDPCFDSPDLRFFEDLDPRLMHVGALLKPEEHS
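
Residue ranges such as "amino acid residues 380 to 437" or "1 to 63" are conventionally read as 1-based and inclusive. The Python sketch below shows one way to turn such a range into a substring and to check that a listed domain has the expected length. It is illustrative only: the helper names (`extract_residues`, `span_length`) are hypothetical, and the mock full-length protein is a placeholder built from SEQ ID NO: 130 plus filler residues, not the actual sequence of NP_002469.2.

```python
# Sequences copied from SEQ ID NO: 129 (E2F1) and SEQ ID NO: 130 (MyoD).
E2F1_AD = "ADSLLEHVREDFSGLLPEEFISLSPPHEALDYHFGLEEGEGIRDLFDCDFGDLTPLDF"
MYOD_AD = "MELLSPPLRDVDLTAPDGSLCSFATTDDFYDDPCFDSPDLRFFEDLDPRLMHVGALLKPEEHS"


def extract_residues(protein: str, start: int, end: int) -> str:
    """Return residues start..end of a protein, using 1-based inclusive
    numbering (the usual convention for protein residue positions)."""
    if not 1 <= start <= end <= len(protein):
        raise ValueError("residue range lies outside the protein")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return protein[start - 1:end]


def span_length(start: int, end: int) -> int:
    """Number of residues in a 1-based inclusive range."""
    return end - start + 1


if __name__ == "__main__":
    # Hypothetical stand-in for a full-length protein: the MyoD activation
    # domain followed by 20 placeholder residues ("X").
    mock_full_length = MYOD_AD + "X" * 20
    domain = extract_residues(mock_full_length, 1, 63)
    print("MyoD residues 1-63 match SEQ ID NO: 130:", domain == MYOD_AD)
    print("E2F1 residues 380-437 span", span_length(380, 437),
          "residues; SEQ ID NO: 129 has", len(E2F1_AD))
```

The off-by-one adjustment in the slice is the detail most easily overlooked when converting published residue numbers into code.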

In some embodiments, the transactivator domain comprises a mitogen-activated protein kinase 7 (MAPK7) activation domain. In some embodiments, the MAPK7 activation domain is a protein domain of the MAPK7 protein. In some embodiments, the MAPK7 protein is encoded by the human MAPK7 gene, which comprises the sequence set forth in NCBI Reference Sequence Accession No. NM_002749.4. In some embodiments, the MAPK7 gene comprises a nucleotide sequence that is 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the nucleic acid sequence set forth in NCBI Reference Sequence ID No: NM_002749.4. In some embodiments, the MAPK7 protein comprises an amino acid sequence that is 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence set forth in NCBI Reference No. NP_002740.2. In some embodiments, the MAPK7 activation domain comprises an amino acid sequence that is 100%, 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence set forth in SEQ ID NO: 131.

SEQ ID NO: 131, MAPK7 activation domain, amino acid sequence: LAAQSLVPPPGLPGSSTPGVLPYFPPGLPPPDAGGAPQSSMSESPDVNLVTQQLSKSQVEDPLPPVFSGTPKGSGAGYGVGFDLEEFLNQSFDMGVADGPQDGQADSASLSASLLADWLEGHGMNPA

In some embodiments, the transactivator domain comprises a nuclear factor 1 B-type (NF1B) proline-rich activation domain. In some embodiments, the NF1B proline-rich activation domain is a protein domain of the NF1B protein. In some embodiments, the NF1B protein is encoded by the human NF1B gene, which comprises the sequence set forth in NCBI Reference Sequence Accession No. NM_001369480. In some embodiments, the NF1B gene comprises a nucleotide sequence that is 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the nucleic acid sequence set forth in NCBI Reference Sequence ID No: NM_001369480. In some embodiments, the NF1B protein comprises an amino acid sequence that is 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence set forth in NCBI Reference No. NP_001356409.1. In some embodiments, the NF1B activation domain comprises amino acid residues 319 to 419 of the NF1B protein having the amino acid sequence set forth in NCBI Reference No. NP_001356409.1. In some embodiments, the NF1B activation domain comprises an amino acid sequence that is 100%, 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence set forth in SEQ ID NO: 132.

SEQ ID NO: 132, NF1B proline-rich activation domain, amino acid sequence: PEKPLFSSASPQDSSPRLSTFPQHHHPGIPGVAHSVISTRTPPPPSPLPFPTQAILPPAPSSYFSHPTIRYPPHLNPQDTLKNYVPSYDPSSPQTSQSWYLG

In some embodiments, the transactivator domain comprises a RelA activation domain. In some embodiments, the RelA activation domain is a protein domain of the RelA protein. In some embodiments, the RelA protein is encoded by the human RelA gene, which comprises the sequence set forth in NCBI Reference Sequence Accession No. NM_001145138.2 or NM_021975.4. In some embodiments, the RelA gene comprises a nucleotide sequence that is 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the nucleic acid sequence set forth in NCBI Reference Sequence ID No: NM_001145138.2 or NM_021975.4. In some embodiments, the RelA protein comprises an amino acid sequence that is 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence set forth in NCBI Reference No. NP_001138610.1 or NP_068810.3. In some embodiments, the RelA activation domain comprises an amino acid sequence that is 100%, 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence set forth in SEQ ID NO: 133.

SEQ ID NO: 133, RelA activation domain, amino acid sequence: QYLPDTDDRHRIEEKRKRTYETFKSIMKKSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQGIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALL

In some embodiments, the transactivator domain comprises a heat shock transcription factor 1 (HSF1) activation domain. In some embodiments, the HSF1 activation domain is a protein domain of the HSF1 protein. In some embodiments, the HSF1 protein is encoded by the human HSF1 gene, which comprises the sequence set forth in NCBI Reference Sequence Accession No. NM_005526.4. In some embodiments, the HSF1 gene comprises a nucleotide sequence that is 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the nucleic acid sequence set forth in NCBI Reference Sequence ID No: NM_005526.4. In some embodiments, the HSF1 protein comprises an amino acid sequence that is 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence set forth in NCBI Reference No. NP_005517.1. In some embodiments, the HSF1 activation domain comprises an amino acid sequence that is 100%, 99%, 95%, 90%, 80%, 70%, 60%, or 50% identical to the amino acid sequence set forth in SEQ ID NO: 134.

SEQ ID NO: 134, HSF1 activation domain, amino acid sequence: GFSVDTSALLDLFSPSVTVPDMSLPDLDSSLASIQELLSPQEPPRPPEAENSSPDSGKQLVHYTAQPLFLLDPGSVDTGSNDLPVLFELGEGSYFSEGDGFAEDPTISLLTGSEPPKAKDPTVS

The present invention is based in part on fusion proteins comprising hybrid transactivator domains. As used herein, a "hybrid transactivator domain" refers to a fusion protein comprising more than one transcriptional activator protein or portion thereof (e.g., 2, 3, 4, 5, or more transcriptional activator proteins, or portions thereof). Hybrid transactivation domains are used in genetic engineering to increase expression of target genes. In some embodiments of the invention, a tripartite hybrid transactivation domain comprising the nucleotide sequence of VP64-P65-RTA (VPR), as described in Chavez et al., "Highly efficient Cas9-mediated transcriptional programming", Nat Methods, 2015 (SEQ ID NO: 47), is used to increase expression of a target gene (e.g., SCN1A).

In some embodiments, the fusion proteins described herein may comprise a DBD (e.g., a ZFP) and a transcriptional repressor protein. In some aspects, the invention relates to fusion proteins comprising a transcriptional repressor domain. As used herein, a "transcriptional repressor" protein generally refers to a polypeptide that downregulates expression of a target gene. Examples of transcriptional repressors include, but are not limited to, KRAB, SMRT/TRAC-2, and NCoR/RIP-13. In some embodiments, such transcriptional repressor fusion proteins are useful for reducing the expression level of a target gene (e.g., a gene that is overexpressed in a gain-of-function disease).

In some embodiments, the fusion proteins described herein additionally comprise a nuclear localization signal or sequence (NLS). An NLS is an amino acid sequence that promotes entry of a protein into the cell nucleus. In some embodiments, the NLS is an amino acid sequence comprising a plurality of positively charged amino acids (e.g., lysine or arginine). In some embodiments, the NLS comprises any one of SEQ ID NOs: 135 to 140. In some embodiments, the NLS comprises one or more of SEQ ID NOs: 135 to 140 (e.g., in any combination). The NLS may be at the N-terminus or the C-terminus of the fusion proteins described herein. In some embodiments, the NLS may be located in the interior of the protein.

Table A: Nuclear localization sequences

Identifier             Sequence                   SEQ ID NO:
SV40 NLS               PKKKRKVE                   135
cMyc NLS               PAAKRVKLD                  136
cMyc-like NLS          PAAKKKKLD                  137
Nucleoplasmin NLS      KRPAATKKAGQAKKKKLD         138
Bipartite SV40 NLS     KRTADGSEFESTPKKKRKVE       139
Bipartite TCF4 NLS     PRRRPLHSSAMEVQTKKVRKVPP    140
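
The placement options described for the NLS (N-terminus or C-terminus) and its enrichment in positively charged residues can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the dictionary mirrors Table A (the cMyc-like entry follows the source-language version of the table), while the function names (`basic_fraction`, `add_nls`) and the five-residue cargo string are hypothetical and chosen only for demonstration. The sketch performs plain string concatenation and ignores practical details such as a start methionine or linkers.

```python
# Table A, keyed by identifier; values are (sequence, SEQ ID NO).
NLS_TABLE = {
    "SV40 NLS": ("PKKKRKVE", 135),
    "cMyc NLS": ("PAAKRVKLD", 136),
    "cMyc-like NLS": ("PAAKKKKLD", 137),
    "Nucleoplasmin NLS": ("KRPAATKKAGQAKKKKLD", 138),
    "Bipartite SV40 NLS": ("KRTADGSEFESTPKKKRKVE", 139),
    "Bipartite TCF4 NLS": ("PRRRPLHSSAMEVQTKKVRKVPP", 140),
}


def basic_fraction(seq: str) -> float:
    """Fraction of residues that are lysine (K) or arginine (R)."""
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(1 for aa in seq if aa in "KR") / len(seq)


def add_nls(protein: str, nls_name: str, terminus: str = "N") -> str:
    """Attach a Table A NLS to the N- or C-terminus of a protein sequence.

    Simplified string concatenation; internal placement is not modeled.
    """
    nls_seq, _seq_id = NLS_TABLE[nls_name]
    if terminus == "N":
        return nls_seq + protein
    if terminus == "C":
        return protein + nls_seq
    raise ValueError("terminus must be 'N' or 'C'")


if __name__ == "__main__":
    cargo = "MEEPQ"  # hypothetical placeholder cargo, not a real construct
    print(add_nls(cargo, "SV40 NLS", terminus="C"))
    for name, (seq, seq_id) in NLS_TABLE.items():
        print(f"SEQ ID NO: {seq_id} {name}: {basic_fraction(seq):.2f} K/R")
```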

Isolated nucleic acids

An isolated nucleic acid sequence refers to a DNA or RNA sequence. In some embodiments, the proteins and nucleic acids of the invention are isolated. As used herein, the term "isolated" means artificially produced. As used herein with respect to nucleic acids, the term "isolated" means: (i) amplified in vitro by, for example, polymerase chain reaction (PCR); (ii) recombinantly produced by cloning; (iii) purified, as by cleavage and gel separation; or (iv) synthesized by, for example, chemical synthesis. An isolated nucleic acid is one that is readily manipulable by recombinant DNA techniques well known in the art. Thus, a nucleotide sequence contained in a vector in which the 5' and 3' restriction sites are known, or for which polymerase chain reaction (PCR) primer sequences have been disclosed, is considered isolated, but a nucleic acid sequence existing in its native state in its natural host is not. An isolated nucleic acid may be substantially purified, but need not be. For example, a nucleic acid that is isolated within a cloning or expression vector is not pure, in that it may comprise only a small percentage of the material in the cell in which it resides. Such a nucleic acid is nevertheless isolated, as the term is used herein, because it is readily manipulable by standard techniques known to those of ordinary skill in the art. As used herein with respect to proteins or peptides, the term "isolated" refers to a protein or peptide that has been isolated from its natural environment or artificially produced (e.g., by chemical synthesis, by recombinant DNA technology, etc.).

In some aspects, the invention relates to isolated nucleic acids (e.g., expression constructs, such as rAAV vectors) configured to express one or more ZFP-transactivation domain fusion proteins. In some embodiments, a fusion protein comprises 1 to 10 (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10) DBDs and/or 1 to 10 (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10) transactivator domains. In some embodiments, a fusion protein comprises more than 10 DBDs and/or more than 10 transactivator domains.

In some aspects of the invention, the DNA-binding domain is fused to the transcriptional regulatory domain indirectly via a linker. As used herein, a "linker" is generally a stretch of polypeptide that structurally joins two different polypeptides within a single transgene. In some embodiments, the linker is flexible, allowing the different polypeptides to move. In some embodiments, a flexible linker comprises glycine residues. In some embodiments, a flexible linker comprises a mixture of glycine and serine residues. In some embodiments, the linker is cleavable, allowing the polypeptides to be separated. In some embodiments, a cleavable linker is cleaved by a protease. In some embodiments, the protease is trypsin or factor X.

In some embodiments, the linker comprises 5 to 30 amino acids (e.g., 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 amino acids). In some embodiments, the linker comprises 3 to 30 amino acids (e.g., 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 amino acids). In some embodiments, the linker comprises 3 to 20 amino acids (e.g., 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 amino acids).
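The linker description above (flexible, glycine/serine-containing, within a stated length range) can be sketched as follows. The GGGGS repeat unit is an assumption chosen because it is a commonly used flexible motif; the text does not specify any particular linker sequence, and the helper names are hypothetical.

```python
def gs_linker(repeats: int, unit: str = "GGGGS") -> str:
    """Build a glycine/serine linker by repeating an assumed unit (not from the source)."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return unit * repeats


def length_in_range(linker: str, low: int, high: int) -> bool:
    """True if the linker length falls inside an inclusive amino acid range."""
    return low <= len(linker) <= high


if __name__ == "__main__":
    linker = gs_linker(3)  # 15 residues
    # The three ranges named in the text: 5-30, 3-30 and 3-20 amino acids.
    for low, high in [(5, 30), (3, 30), (3, 20)]:
        print(linker, f"{low}-{high}:", length_in_range(linker, low, high))
```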

The present invention is based, in part, on fusion proteins engineered to increase expression of a gene (e.g., SCN1A) encoding a voltage-gated sodium ion channel subunit protein (also referred to as an SCN protein). As used herein, "SCN protein" refers to a sodium ion channel protein that mediates the voltage-dependent sodium ion permeability of excitable membranes, allowing sodium ions to pass through the membrane. Examples of SCN proteins in humans include, but are not limited to, SCN1A, SCN3A, SCN5A, SCN10A and SCN11A. In some embodiments, the SCN protein is SCN1A (also known as Nav1.1), the type 1 α1 ion channel subunit. In some embodiments, the SCN protein is an SCN1B protein, the type 1 β1 ion channel subunit, or an SCN1C protein. In some embodiments, the SCN protein is a combination of SCN1A, SCN1B and/or SCN1C proteins. As disclosed herein, an SCN protein may be a portion or fragment of an SCN protein. In some embodiments, an SCN protein as disclosed herein is a variant of an SCN protein, such as a point mutant or a truncation mutant.

In humans, SCN1A is encoded by the SCN1A gene (Gene ID: 6323, human), which is conserved in chimpanzee, rhesus monkey, dog, cow, mouse, rat and chicken. In humans, the SCN1A gene is expressed primarily in the brain, lung and testis. In some embodiments, the SCN1A protein comprises five structural repeats (I, II, III, IV, Q).

In some embodiments, the SCN1A protein is encoded by a human SCN1A gene comprising the sequence set forth in NCBI Reference Sequence ID No: NM_001165963.2, NM_00165964.2, NM_001202435.2, NM_001353948.1, NM_001353949.1, NM_001353950.1, NM_00135395.1, NM_001353952.1, NM_001353954.1, NM_00353955.1, NM_001353957.1, NM_001353958.1, NM_001353960.1, NM_001353961.1 or NM_006920.5. In some embodiments, the SCN1A protein is encoded by a mouse SCN1A gene comprising the sequence set forth in NCBI Reference Sequence ID No: NM_001313997.1 or NM_018733.2. In some embodiments, the SCN1A protein comprises an amino acid sequence that is 99%, 95%, 90%, 80%, 70%, 60% or 50% identical to the amino acid sequence encoded by the nucleic acid sequence set forth in NCBI Reference Sequence ID No: NG_011906.1, NM_001313997.1 or NM_018733.2. In some embodiments, the SCN1A gene comprises an amino acid sequence that is 99%, 95%, 90%, 80%, 70%, 60% or 50% identical to the sequence set forth in SEQ ID NO: 50. In some embodiments, the human SCN1A protein comprises the sequence set forth in NCBI Reference Sequence ID No: NP_001159435.1, NP_0011159436.1, NP_001189364.1, NP_001340877.1, NP_001340878.1, NP_001340879.1, NP_001340880.1, NP_001340881.1, NP_001340883.1, NP_001340884.1, NP_001340886.1, NP_001340887.1, NP_001340889.1, NP_001340890.1 or NP_00851.3. In some embodiments, the SCN1A protein comprises an amino acid sequence that is 99%, 95%, 90%, 80%, 70%, 60% or 50% identical to the amino acid sequence encoded by the nucleic acid sequence set forth in NCBI Reference Sequence ID No: NG_011906.1, NM_001313997.1 or NM_018733.2. In some embodiments, the mouse SCN1A protein comprises the sequence set forth in NCBI Reference Sequence ID No: NP_001300926.1 or NP_061203.2. In some embodiments, the human SCN1A protein comprises an amino acid sequence that is 99%, 95%, 90%, 80%, 70%, 60% or 50% identical to the nucleic acid sequence set forth in SEQ ID NO: 49.
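The identity thresholds above (99% down to 50%) presuppose some way of computing percent identity, which the text does not define. The sketch below shows one simple convention, assumed here for illustration: the two sequences are already aligned to equal length with "-" as the gap character, and identity is the number of matching columns divided by the number of columns in which at least one sequence has a residue. Other conventions (for example, dividing by the shorter sequence length) give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumed convention: columns where both sequences have a gap are ignored;
    a column with a gap in one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if identity is at least the given percentage (e.g., 90.0)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy sequences for illustration only, not SCN1A.
    print(percent_identity("MKTAYIAK", "MKTAYIGK"))
```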

An isolated nucleic acid of the invention may be a recombinant adeno-associated virus (AAV) vector (rAAV vector). In some embodiments, an isolated nucleic acid as described by the invention comprises a region (e.g., a first region) comprising a first adeno-associated virus (AAV) inverted terminal repeat (ITR), or a variant thereof. The isolated nucleic acid (e.g., the recombinant AAV vector) may be packaged into a capsid protein and administered to a subject and/or delivered to a selected target cell. A "recombinant AAV (rAAV) vector" is typically composed of, at a minimum, a transgene and its regulatory sequences, and 5' and 3' AAV inverted terminal repeats (ITRs). As described elsewhere in the disclosure, the transgene may comprise a region encoding, for example, a protein and/or an expression control sequence (e.g., a poly-A tail).

Generally, ITR sequences are about 145 bp in length. Preferably, substantially the entire sequences encoding the ITRs are used in the molecule, although some degree of minor modification of these sequences is permissible. The ability to modify these ITR sequences is within the skill of the art. (See, e.g., texts such as Sambrook et al., "Molecular Cloning. A Laboratory Manual", 2nd ed., Cold Spring Harbor Laboratory, New York (1989); and K. Fisher et al., J Virol., 70:520-532 (1996).) An example of such a molecule employed in the present invention is a "cis-acting" plasmid containing the transgene, in which the selected transgene sequence and associated regulatory elements are flanked by the 5' and 3' AAV ITR sequences. The AAV ITR sequences may be obtained from any known AAV, including presently identified mammalian AAV types. In some embodiments, the isolated nucleic acid further comprises a region (e.g., a second region, a third region, a fourth region, etc.) comprising a second AAV ITR.

In addition to the major elements identified above for the recombinant AAV vector, the vector also includes conventional control elements that are operably linked to elements of the transgene in a manner that permits its transcription, translation and/or expression in a cell transfected with the vector or infected with the virus produced by the invention. As used herein, "operably linked" sequences include both expression control sequences that are contiguous with the gene of interest and expression control sequences that act in trans or at a distance to control the gene of interest. Expression control sequences include appropriate transcription initiation, termination, promoter and enhancer sequences; efficient RNA processing signals, such as splicing and polyadenylation (poly-A) signals; sequences that stabilize cytoplasmic mRNA; sequences that enhance translation efficiency (e.g., Kozak consensus sequences); sequences that enhance protein stability; and, when desired, sequences that enhance secretion of the encoded product. A large number of expression control sequences, including promoters that are native, constitutive, inducible and/or tissue-specific, are known in the art and may be utilized.

As used herein, a nucleic acid sequence (e.g., a coding sequence) and regulatory sequences are said to be operably linked when they are covalently linked in such a way as to place the expression or transcription of the nucleic acid sequence under the influence or control of the regulatory sequences. If it is desired that the nucleic acid sequences be translated into a functional protein, two DNA sequences are said to be operably linked if induction of a promoter in the 5' regulatory sequences results in transcription of the coding sequence and if the nature of the linkage between the two DNA sequences does not (1) result in the introduction of a frameshift mutation, (2) interfere with the ability of the promoter region to direct transcription of the coding sequence, or (3) interfere with the ability of the corresponding RNA transcript to be translated into a protein. Thus, a promoter region is operably linked to a nucleic acid sequence if the promoter region is capable of effecting transcription of that DNA sequence such that the resulting transcript can be translated into the desired protein or polypeptide. Similarly, two or more coding regions are operably linked when they are linked in such a way that their transcription from a common promoter results in the expression of two or more proteins that have been translated in frame. In some embodiments, operably linked coding sequences yield a fusion protein.
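Condition (1) above, that joining two coding sequences must not introduce a frameshift, can be checked mechanically at the sequence level. The following is a minimal sketch under stated assumptions: both inputs are DNA coding sequences given 5' to 3', the upstream sequence is supplied without its stop codon, and the function name `fuse_in_frame` is hypothetical. It does not address conditions (2) and (3), which concern promoter function and translation and cannot be verified from sequence length alone.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq: str) -> list[str]:
    """Split a sequence into complete codons, ignoring any trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def fuse_in_frame(upstream_cds: str, downstream_cds: str) -> str:
    """Concatenate two coding sequences, refusing joins that would break the reading frame."""
    first = upstream_cds.upper()
    second = downstream_cds.upper()
    if len(first) % 3 != 0:
        raise ValueError("upstream length is not a multiple of 3: downstream frame would shift")
    if any(c in STOP_CODONS for c in codons(first)):
        raise ValueError("upstream sequence contains an in-frame stop codon")
    if len(second) % 3 != 0:
        raise ValueError("downstream length is not a multiple of 3")
    return first + second


if __name__ == "__main__":
    # Toy sequences: Met-Ala fused to Gly-Ser-stop.
    print(fuse_in_frame("ATGGCC", "GGCTCCTAA"))
```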

A region comprising a transgene (e.g., a transgene encoding a fusion protein, etc.) may be positioned at any suitable location of the isolated nucleic acid that will permit expression of the fusion protein.

It should be appreciated that, where a transgene encodes more than one polypeptide, each polypeptide may be positioned at any suitable location within the transgene. For example, a nucleic acid encoding a first polypeptide may be positioned in an intron of the transgene, and a nucleic acid sequence encoding a second polypeptide may be positioned in another untranslated region (e.g., between the last codon of a protein coding sequence and the first base of the poly-A signal of the transgene).

A "promoter" refers to a DNA sequence that is recognized by the synthetic machinery of the cell, or by introduced synthetic machinery, and that is required to initiate the specific transcription of a gene. The phrases "operably linked", "operably positioned", "under control" or "under transcriptional control" mean that the promoter is in the correct location and/or orientation in relation to the nucleic acid to control RNA polymerase initiation and expression of the gene.

For nucleic acids encoding proteins, a polyadenylation sequence is generally inserted after the transgene sequence and before the 3' AAV ITR sequence. An rAAV construct useful in the present invention may also contain an intron, desirably located between the promoter/enhancer sequence and the transgene. One possible intron sequence is derived from SV-40 and is referred to as the SV-40 T intron sequence. Another vector element that may be used is an internal ribosome entry site (IRES). An IRES sequence is used to produce more than one polypeptide from a single gene transcript. An IRES sequence would be used to produce a protein that contains more than one polypeptide chain. Selection of these and other common vector elements is conventional, and many such sequences are available [see, e.g., Sambrook et al., and references cited therein at, for example, pages 3.18-3.26 and 16.17-16.27, and Ausubel et al., Current Protocols in Molecular Biology, John Wiley & Sons, New York, 1989]. In some embodiments, a foot-and-mouth disease virus 2A sequence is included in a polyprotein; this is a small peptide (approximately 18 amino acids in length) that has been shown to mediate the cleavage of polyproteins (Ryan, M D et al., EMBO, 1994; 4: 928-933; Mattion, N M et al., J Virology, November 1996; pp. 8124-8127; Furler, S et al., Gene Therapy, 2001; 8: 864-873; and Halpin, C et al., The Plant Journal, 1999; 4: 453-459). The cleavage activity of the 2A sequence has previously been demonstrated in artificial systems including plasmids and gene therapy vectors (AAV and retroviruses) (Ryan, M D et al., EMBO, 1994; 4: 928-933; Mattion, N M et al., J Virology, November 1996; pp. 8124-8127; Furler, S et al., Gene Therapy, 2001; 8: 864-873; Halpin, C et al., The Plant Journal, 1999; 4: 453-459; de Felipe, P et al., Gene Therapy, 1999; 6: 198-208; de Felipe, P et al., Human Gene Therapy, 2000; 11: 1921-1931; and Klump, H et al., Gene Therapy, 2001; 8: 811-817).
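The element placement described above (intron between the promoter/enhancer and the transgene, polyadenylation sequence after the transgene and before the 3' ITR, everything flanked by ITRs) amounts to an ordering constraint. The sketch below encodes that order and checks a proposed cassette against it. The element labels, the decision to treat the intron as the only optional element, and the function name are assumptions for illustration; the text describes these placements as typical or desirable rather than mandatory.

```python
# Assumed 5'-to-3' order, following the placement described in the text.
EXPECTED_ORDER = ["5' ITR", "promoter", "intron", "transgene", "polyA", "3' ITR"]
OPTIONAL = {"intron"}  # assumption: only the intron is treated as optional here


def is_valid_layout(elements: list[str]) -> bool:
    """True if all required elements are present once each and in the expected order."""
    if any(e not in EXPECTED_ORDER for e in elements):
        return False  # unknown element label
    if len(set(elements)) != len(elements):
        return False  # duplicated element
    required = [e for e in EXPECTED_ORDER if e not in OPTIONAL]
    if any(r not in elements for r in required):
        return False  # missing required element
    positions = [EXPECTED_ORDER.index(e) for e in elements]
    return positions == sorted(positions)


if __name__ == "__main__":
    cassette = ["5' ITR", "promoter", "intron", "transgene", "polyA", "3' ITR"]
    print(is_valid_layout(cassette))
```

Constructs with additional elements such as an IRES or a 2A sequence, or with more than one promoter, would need a richer model than this flat list.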

Examples of constitutive promoters include, without limitation, the retroviral Rous sarcoma virus (RSV) LTR promoter (optionally with the RSV enhancer), the cytomegalovirus (CMV) promoter (optionally with the CMV enhancer) [see, e.g., Boshart et al., Cell, 41:521-530 (1985)], the SV40 promoter, the dihydrofolate reductase promoter, the β-actin promoter, the phosphoglycerol kinase (PGK) promoter and the EF1α promoter [Invitrogen]. In some embodiments, the promoter is a P2 promoter. In some embodiments, the promoter is a chicken β-actin (CBA) promoter. In some embodiments, the promoter is two CBA promoters. In some embodiments, the promoter is two CBA promoters separated by a CMV enhancer. In some embodiments, the promoter is a CAG promoter.

Inducible promoters allow regulation of gene expression and can be regulated by exogenously supplied compounds, by environmental factors such as temperature, or by the presence of a specific physiological state (e.g., acute phase, a particular differentiation state of the cell, or in replicating cells only). Inducible promoters and inducible systems are available from a variety of commercial sources, including, without limitation, Invitrogen, Clontech and Ariad. Many other systems have been described and can be readily selected by one of skill in the art. Examples of inducible promoters regulated by exogenously supplied compounds include the zinc-inducible sheep metallothionein (MT) promoter, the dexamethasone (Dex)-inducible mouse mammary tumor virus (MMTV) promoter, the T7 polymerase promoter system (WO 98/10088), the ecdysone insect promoter (No et al., Proc. Natl. Acad. Sci. USA, 93:3346-3351 (1996)), the tetracycline-repressible system (Gossen et al., Proc. Natl. Acad. Sci. USA, 89:5547-5551 (1992)), the tetracycline-inducible system (Gossen et al., Science, 268:1766-1769 (1995); see also Harvey et al., Curr. Opin. Chem. Biol., 2:512-518 (1998)), the RU486-inducible system (Wang et al., Nat. Biotech., 15:239-243 (1997) and Wang et al., Gene Ther., 4:432-441 (1997)) and the rapamycin-inducible system (Magari et al., J. Clin. Invest., 100:2865-2872 (1997)). Still other types of inducible promoters that may be useful in this context are those regulated by a specific physiological state (e.g., temperature, acute phase, a particular differentiation state of the cell, or in replicating cells only).

In another embodiment, the native promoter for the transgene is used. The native promoter may be preferred when it is desired that expression of the transgene should mimic native expression. The native promoter may be used when expression of the transgene must be regulated temporally or developmentally, in a tissue-specific manner, or in response to specific transcriptional stimuli. In a further embodiment, other native expression control elements, such as enhancer elements, polyadenylation sites or Kozak consensus sequences, may also be used to mimic native expression.

In some embodiments, the regulatory sequences impart tissue-specific gene expression capabilities. In some cases, the tissue-specific regulatory sequences bind tissue-specific transcription factors that induce transcription in a tissue-specific manner. Such tissue-specific regulatory sequences (e.g., promoters, enhancers, etc.) are well known in the art. Exemplary tissue-specific regulatory sequences include, but are not limited to, the following tissue-specific promoters: a liver-specific thyroxine-binding globulin (TBG) promoter, an insulin promoter, a glucagon promoter, a somatostatin promoter, a pancreatic polypeptide (PPY) promoter, a synapsin-1 (Syn) promoter, a creatine kinase (MCK) promoter, a mammalian desmin (DES) promoter, an α-myosin heavy chain (α-MHC) promoter or a cardiac troponin T (cTnT) promoter. Other exemplary promoters include the β-actin promoter; the hepatitis B virus core promoter (Sandig et al., Gene Ther., 3:1002-9 (1996)); the α-fetoprotein (AFP) promoter (Arbuthnot et al., Hum. Gene Ther., 7:1503-14 (1996)); the bone osteocalcin promoter (Stein et al., Mol. Biol. Rep., 24:185-96 (1997)); the bone sialoprotein promoter (Chen et al., J. Bone Miner. Res., 11:654-64 (1996)); the CD2 promoter (Hansal et al., J. Immunol., 161:1063-8 (1998)); the immunoglobulin heavy chain promoter; the T cell receptor α-chain promoter; neuronal promoters such as the neuron-specific enolase (NSE) promoter (Andersen et al., Cell. Mol. Neurobiol., 13:503-15 (1993)); the neurofilament light-chain gene promoter (Piccioli et al., Proc. Natl. Acad. Sci. USA, 88:5611-5 (1991)); and the neuron-specific vgf gene promoter (Piccioli et al., Neuron, 15:373-84 (1995)), among others that will be apparent to the skilled artisan.

In some embodiments, a transgene encoding a fusion protein comprising a DBD and a transactivator is operably linked to a promoter. In some embodiments, the promoter is a constitutive promoter. In some embodiments, the promoter is an inducible promoter. In some embodiments, the promoter is a tissue-specific promoter. In some embodiments, the promoter is specific for neural tissue. In some embodiments, the promoter is an SST or NPY promoter.

Aspects of the invention relate to an isolated nucleic acid comprising more than one promoter (e.g., 2, 3, 4, 5 or more promoters). For example, in the context of a construct having a transgene comprising a first region encoding a protein and a second region encoding a protein, it may be desirable to drive expression of the first protein coding region using a first promoter sequence (e.g., a first promoter sequence operably linked to the protein coding region), and to drive expression of the second protein coding region with a second promoter sequence (e.g., a second promoter sequence operably linked to the second protein coding region). Generally, the first promoter sequence and the second promoter sequence can be the same promoter sequence or different promoter sequences. In some embodiments, the first promoter sequence (e.g., the promoter driving expression of the protein coding region) is an RNA polymerase III (pol III) promoter sequence. Non-limiting examples of pol III promoter sequences include U6 and H1 promoter sequences. In some embodiments, the second promoter sequence (e.g., the promoter sequence driving expression of the second protein) is an RNA polymerase II (pol II) promoter sequence. Non-limiting examples of pol II promoter sequences include T7, T3, SP6, RSV and cytomegalovirus promoter sequences. In some embodiments, a pol III promoter sequence drives expression of the first protein coding region. In some embodiments, a pol II promoter sequence drives expression of the second protein coding region.

Recombinant adeno-associated viruses (rAAVs)

In some aspects, the invention provides isolated adeno-associated viruses (AAVs). As used herein with respect to AAVs, the term "isolated" refers to an AAV that has been artificially produced or obtained. Isolated AAVs may be produced using recombinant methods. Such AAVs are referred to herein as "recombinant AAVs". Recombinant AAVs (rAAVs) preferably have tissue-specific targeting capabilities, such that a nuclease and/or transgene of the rAAV is delivered specifically to one or more predetermined tissues. The AAV capsid is an important element in determining these tissue-specific targeting capabilities. Thus, an rAAV having a capsid appropriate for the tissue being targeted can be selected.

Methods for obtaining recombinant AAVs having a desired capsid protein are well known in the art (see, e.g., US 2003/0138772, the contents of which are incorporated herein by reference in their entirety). Typically the methods involve culturing a host cell that contains a nucleic acid sequence encoding an AAV capsid protein; a functional rep gene; a recombinant AAV vector composed of AAV inverted terminal repeats (ITRs) and a transgene; and sufficient helper functions to permit packaging of the recombinant AAV vector into the AAV capsid proteins. In some embodiments, capsid proteins are structural proteins encoded by the cap gene of an AAV. AAVs comprise three capsid proteins, virion proteins 1 to 3 (named VP1, VP2 and VP3), all of which are transcribed from a single cap gene via alternative splicing. In some embodiments, the molecular weights of VP1, VP2 and VP3 are about 87 kDa, about 72 kDa and about 62 kDa, respectively. In some embodiments, upon translation, capsid proteins form a spherical 60-mer protein shell around the viral genome. In some embodiments, the functions of the capsid proteins are to protect the viral genome, deliver the genome and interact with the host. In some aspects, capsid proteins deliver the viral genome to a host in a tissue-specific manner.

In some embodiments, the AAV capsid protein is of an AAV serotype selected from the group consisting of AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAVrh8, AAV9, AAV10, AAVrh10 and AAV.PHP.B. In some embodiments, the AAV capsid protein is of a serotype derived from a non-human primate, for example the AAVrh8 serotype. In some embodiments, the AAV capsid protein is of a serotype derived for broad and efficient CNS transduction, for example AAV.PHP.B. In some embodiments, the capsid protein is of AAV serotype 9.

The components to be cultured in the host cell to package an rAAV vector in an AAV capsid may be provided to the host cell in trans. Alternatively, any one or more of the required components (e.g., recombinant AAV vector, rep sequences, cap sequences and/or helper functions) may be provided by a stable host cell that has been engineered to contain one or more of the required components using methods known to those of skill in the art. Most suitably, such a stable host cell will contain the required component(s) under the control of an inducible promoter. However, the required component(s) may be under the control of a constitutive promoter. Examples of suitable inducible and constitutive promoters are provided herein, in the discussion of regulatory elements suitable for use with the transgene. In still another alternative, a selected stable host cell may contain selected component(s) under the control of a constitutive promoter and other selected component(s) under the control of one or more inducible promoters. For example, a stable host cell may be generated that is derived from 293 cells (which contain E1 helper functions under the control of a constitutive promoter), but that contains the rep and/or cap proteins under the control of inducible promoters. Still other stable host cells may be generated by one of skill in the art.

In some embodiments, the present invention relates to a host cell containing a nucleic acid that comprises a coding sequence encoding a transgene (e.g., a DNA binding domain fused to a transcriptional regulator domain). In some embodiments, the host cell is a mammalian cell, a yeast cell, a bacterial cell, an insect cell, a plant cell, or a fungal cell.

The recombinant AAV vector, rep sequences, cap sequences, and helper functions required for producing the rAAV of the present invention may be delivered to the packaging host cell using any appropriate genetic element (vector). The selected genetic element may be delivered by any suitable method, including those described herein. The methods used to construct any embodiment of the present invention are known to those with skill in nucleic acid manipulation and include genetic engineering, recombinant engineering, and synthetic techniques. See, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, Cold Spring Harbor, N.Y. Similarly, methods of generating rAAV virions are well known, and the selection of a suitable method is not a limitation on the present invention. See, e.g., K. Fisher et al., J. Virol., 70:520-532 (1993) and U.S. Patent No. 5,478,745.

In some embodiments, recombinant AAVs may be produced using the triple transfection method (described in detail in U.S. Patent No. 6,001,650). Typically, the recombinant AAVs are produced by transfecting a host cell with an AAV vector (comprising a transgene flanked by ITR elements) to be packaged into AAV particles, an AAV helper function vector, and an accessory function vector. An AAV helper function vector encodes the "AAV helper function" sequences (e.g., rep and cap), which function in trans for productive AAV replication and encapsidation. Preferably, the AAV helper function vector supports efficient AAV vector production without generating any detectable wild-type AAV virions (e.g., AAV virions containing functional rep and cap genes). Non-limiting examples of vectors suitable for use with the present invention include pHLP19, described in U.S. Patent No. 6,001,650, and the pRep6cap6 vector, described in U.S. Patent No. 6,156,303, the entirety of both of which is incorporated herein by reference. The accessory function vector encodes nucleotide sequences for non-AAV-derived viral and/or cellular functions upon which AAV is dependent for replication (e.g., "accessory functions"). The accessory functions include those functions required for AAV replication, including, without limitation, those moieties involved in activation of AAV gene transcription, stage-specific AAV mRNA splicing, AAV DNA replication, synthesis of cap expression products, and AAV capsid assembly. Viral-based accessory functions can be derived from any of the known helper viruses, such as adenovirus, herpesvirus (other than herpes simplex virus type 1), and vaccinia virus.

In some aspects, the present invention provides transfected host cells. The term "transfection" is used to refer to the uptake of foreign DNA by a cell; a cell has been "transfected" when exogenous DNA has been introduced inside the cell membrane. A number of transfection techniques are generally known in the art. See, e.g., Graham et al. (1973) Virology, 52:456; Sambrook et al. (1989) Molecular Cloning, a laboratory manual, Cold Spring Harbor Laboratories, New York; Davis et al. (1986) Basic Methods in Molecular Biology, Elsevier; and Chu et al. (1981) Gene 13:197. Such techniques can be used to introduce one or more exogenous nucleic acids, such as a nucleotide integration vector and other nucleic acid molecules, into suitable host cells.

A "host cell" refers to any cell that harbors, or is capable of harboring, a substance of interest. Often a host cell is a mammalian cell. In some embodiments, a host cell is a neuron, optionally a GABAergic neuron. As used herein, a "GABAergic neuron" is a nerve cell that produces gamma-aminobutyric acid (GABA). In mammals, GABA is a neurotransmitter that is widely distributed in the nervous system and that inhibits the neurons to which it binds. As such, GABA has been implicated in many disorders affecting the nervous system, including epilepsy, autism, and anxiety. Studies in SCN1A hemizygous and knockout mice have observed a severe sodium current deficit in GABAergic neurons in the brain. A host cell may be used as a recipient of an AAV helper construct, an AAV minigene plasmid, an accessory function vector, or other transfer DNA associated with the production of recombinant AAVs. The term includes the progeny of the original cell that has been transfected. Thus, a "host cell" as used herein may refer to a cell that has been transfected with an exogenous DNA sequence. It is understood that the progeny of a single parental cell may not necessarily be completely identical in morphology, or in genomic or total DNA complement, to the original parent, due to natural, accidental, or deliberate mutation.

As used herein, the term "cell line" refers to a population of cells capable of continuous or prolonged growth and division in vitro. Often, cell lines are clonal populations derived from a single progenitor cell. It is further known in the art that spontaneous or induced changes can occur in karyotype during storage or transfer of such clonal populations. Therefore, cells derived from the cell line referred to may not be precisely identical to the ancestral cells or cultures, and the cell line referred to includes such variants.

As used herein, the term "recombinant cell" refers to a cell into which an exogenous DNA segment has been introduced, such as a DNA segment that leads to the transcription of a biologically active polypeptide or the production of a biologically active nucleic acid, such as an RNA.

As used herein, the term "vector" includes any genetic element, such as a plasmid, phage, transposon, cosmid, chromosome, artificial chromosome, virus, virion, etc., that is capable of replication when associated with the proper control elements and that can transfer gene sequences between cells. In some embodiments, a vector is a viral vector, such as an rAAV vector, a lentiviral vector, an adenoviral vector, a retroviral vector, etc. Thus, the term includes cloning and expression vehicles, as well as viral vectors. In some embodiments, useful vectors are contemplated to be those in which the nucleic acid segment to be transcribed is positioned under the transcriptional control of a promoter.

A "promoter" refers to a DNA sequence that is recognized by the synthetic machinery of the cell, or by introduced synthetic machinery, and that is required to initiate the specific transcription of a gene. The phrases "operatively linked," "operatively positioned," "under control," or "under transcriptional control" mean that the promoter is in the correct location and/or orientation in relation to the nucleic acid to control RNA polymerase initiation and expression of the gene. The term "expression vector or construct" means any type of genetic construct containing a nucleic acid in which part or all of the nucleic acid coding sequence is capable of being transcribed. In some embodiments, expression includes transcription of the nucleic acid, for example, to generate a biologically active polypeptide product from a transcribed gene. The foregoing methods for packaging recombinant vectors in desired AAV capsids to produce the rAAVs of the present invention are not meant to be limiting, and other suitable methods will be apparent to the skilled artisan.

Methods for modulating target gene expression

The present invention provides methods for modulating gene expression in a cell or subject. The methods typically involve administering to a cell or a subject an isolated nucleic acid or rAAV comprising a transgene that encodes a fusion protein comprising a DNA binding domain (e.g., a ZFP domain) and a transactivation domain. In some embodiments, the fusion protein comprises a ZFP and a VP64 transactivator. In some embodiments, the fusion protein comprises a ZFP and a p65 transactivator. In some embodiments, the fusion protein comprises a ZFP and an RTA transactivator. In some embodiments, the fusion protein comprises a ZFP and a VPR transactivator. In some embodiments, the method involves administering to a cell or subject a dCas9 protein and at least one guide nucleic acid targeting SCN1A (e.g., a guide nucleic acid comprising, or encoded by, any one of SEQ ID NOs: 83 to 94).

In some embodiments, administration to a cell or subject of an isolated nucleic acid or rAAV encoding a fusion protein (e.g., a fusion protein comprising a transactivator) results in increased expression of a target gene (e.g., SCN1A). Accordingly, in some embodiments, the compositions and methods described by the present invention are useful for treating conditions resulting from haploinsufficiency of a target gene, such as Dravet syndrome, which results from haploinsufficiency of the SCN1A gene.

As used herein, "haploinsufficiency" refers to a genetic condition in which one copy of a gene (e.g., SCN1A) is inactivated, for example by genetic mutation or deletion, and the remaining functional copy of the gene does not produce enough gene product to maintain the normal function of that gene.

Dravet syndrome (also known as severe myoclonic epilepsy of infancy) is a rare, lifelong form of epilepsy that typically manifests in the first three years of life. Dravet syndrome is characterized by prolonged and frequent seizures, behavioral and developmental delays, movement and balance issues, delayed language and speech, and disruptions of the autonomic nervous system. In some embodiments, a subject has a haploinsufficiency associated with Dravet syndrome, such as a mutation in one copy of the SCN1A gene that results in reduced SCN1A protein in the cell or subject. Most Dravet syndrome patients carry SCN1A mutations that are translated into truncated proteins; other SCN1A mutations associated with Dravet syndrome include splice-site and missense mutations, as well as mutations randomly distributed throughout the SCN1A gene. In some embodiments, a fusion protein of the present invention comprises a ZFP domain that specifically targets (e.g., binds to) the SCN1A gene, and a transactivation domain. In some embodiments, a composition for targeting SCN1A comprises (i) a fusion protein comprising a dCas protein and a transactivation domain, and (ii) a guide nucleic acid (e.g., a gRNA) that specifically targets (e.g., binds to) the SCN1A gene.

In some embodiments, a subject has a haploinsufficiency associated with MED13L haploinsufficiency syndrome, in which the subject has only a single functional copy of the MED13L gene. Subjects with MED13L haploinsufficiency syndrome typically have a mutation in the second, non-functional copy of their MED13L gene. MED13L haploinsufficiency syndrome is characterized by intellectual disability, speech problems, distinctive facial features, and developmental delay. In some embodiments, a fusion protein of the present invention comprises a ZFP domain that specifically targets (e.g., binds to) the MED13L gene, and a transactivation domain. In some embodiments, a composition for targeting MED13L comprises (i) a fusion protein comprising a dCas protein and a transactivation domain, and (ii) a guide nucleic acid (e.g., a gRNA) that specifically targets (e.g., binds to) the MED13L gene.

In some embodiments, a subject has a haploinsufficiency associated with myelodysplastic syndromes. Subjects with myelodysplastic syndromes typically have a mutation in one copy of the isocitrate dehydrogenase 1 (IDH1), isocitrate dehydrogenase 2 (IDH2), and/or GATA2 genes. Myelodysplastic syndromes are a group of cancers in which immature blood cells in the bone marrow do not mature into healthy blood cells. Sometimes, this syndrome can lead to acute myeloid leukemia. In some embodiments, a fusion protein of the present invention comprises a ZFP domain that specifically targets (e.g., binds to) the IDH1 gene, and a transactivation domain. In some embodiments, a composition for targeting IDH1 comprises (i) a fusion protein comprising a dCas protein and a transactivation domain, and (ii) a guide nucleic acid (e.g., a gRNA) that specifically targets (e.g., binds to) the IDH1 gene. In some embodiments, a fusion protein of the present invention comprises a ZFP domain that specifically targets (e.g., binds to) the IDH2 gene, and a transactivation domain. In some embodiments, a composition for targeting IDH2 comprises (i) a fusion protein comprising a dCas protein and a transactivation domain, and (ii) a guide nucleic acid (e.g., a gRNA) that specifically targets (e.g., binds to) the IDH2 gene. In some embodiments, a fusion protein of the present invention comprises a ZFP domain that specifically targets (e.g., binds to) the GATA2 gene, and a transactivation domain. In some embodiments, a composition for targeting GATA2 comprises (i) a fusion protein comprising a dCas protein and a transactivation domain, and (ii) a guide nucleic acid (e.g., a gRNA) that specifically targets (e.g., binds to) the GATA2 gene.

In some embodiments, a subject has a haploinsufficiency associated with DiGeorge syndrome. Subjects with DiGeorge syndrome typically have a deletion of 30 to 40 genes in the middle of chromosome 22, at a location known as 22q11.2. In particular, the disease may be characterized by haploinsufficiency of the TBX gene. DiGeorge syndrome is characterized by congenital heart problems, specific facial features, frequent infections, developmental delay, learning problems, and cleft palate. In some embodiments, a fusion protein of the present invention comprises a ZFP domain that specifically targets (e.g., binds to) the TBX gene, and a transactivation domain. In some embodiments, a composition for targeting TBX comprises (i) a fusion protein comprising a dCas protein and a transactivation domain, and (ii) a guide nucleic acid (e.g., a gRNA) that specifically targets (e.g., binds to) the TBX gene.

In some embodiments, a subject has a haploinsufficiency associated with CHARGE syndrome. In most cases, subjects with CHARGE syndrome are haploinsufficient for the CHD7 gene. CHARGE syndrome is characterized by coloboma of the eye, heart defects, choanal atresia, retardation of growth and/or development, genital and/or urinary abnormalities, and ear abnormalities and deafness. In some embodiments, a fusion protein of the present invention comprises a ZFP domain that specifically targets (e.g., binds to) the CHD7 gene, and a transactivation domain. In some embodiments, a composition for targeting CHD7 comprises (i) a fusion protein comprising a dCas protein and a transactivation domain, and (ii) a guide nucleic acid (e.g., a gRNA) that specifically targets (e.g., binds to) the CHD7 gene.

In some embodiments, a subject has a haploinsufficiency associated with Ehlers-Danlos syndrome. Subjects with Ehlers-Danlos syndrome may be haploinsufficient for the COL1A1, COL1A2, COL3A1, COL5A1, COL5A2, TNXB, ADAMTS2, PLOD1, B4GALT7, DSE, and/or D4ST1/CHST14 genes. Ehlers-Danlos syndrome is characterized by hyperelastic skin and can lead to aortic dissection, scoliosis, and early-onset osteoarthritis. In some embodiments, a fusion protein of the present invention comprises a ZFP domain that specifically targets (e.g., binds to) any one of the COL1A1, COL1A2, COL3A1, COL5A1, COL5A2, TNXB, ADAMTS2, PLOD1, B4GALT7, DSE, or D4ST1/CHST14 genes, and a transactivation domain. In some embodiments, a composition for targeting any one of COL1A1, COL1A2, COL3A1, COL5A1, COL5A2, TNXB, ADAMTS2, PLOD1, B4GALT7, DSE, or D4ST1/CHST14 comprises (i) a fusion protein comprising a dCas protein and a transactivation domain, and (ii) a guide nucleic acid (e.g., a gRNA) that specifically targets (e.g., binds to) any one of the COL1A1, COL1A2, COL3A1, COL5A1, COL5A2, TNXB, ADAMTS2, PLOD1, B4GALT7, DSE, or D4ST1/CHST14 genes.

In some embodiments, a subject has a haploinsufficiency associated with frontotemporal dementia (FTD). Subjects with FTD are haploinsufficient for the MAPT gene (which encodes the Tau protein) and/or the GRN gene. FTD is characterized by memory loss, lack of social awareness, poor impulse control, and difficulty with speech. In some embodiments, a fusion protein of the present invention comprises a ZFP domain that specifically targets (e.g., binds to) the MAPT gene, and a transactivation domain. In some embodiments, a composition for targeting MAPT comprises (i) a fusion protein comprising a dCas protein and a transactivation domain, and (ii) a guide nucleic acid (e.g., a gRNA) that specifically targets (e.g., binds to) the MAPT gene. In some embodiments, a fusion protein of the present invention comprises a ZFP domain that specifically targets (e.g., binds to) the GRN gene, and a transactivation domain. In some embodiments, a composition for targeting GRN comprises (i) a fusion protein comprising a dCas protein and a transactivation domain, and (ii) a guide nucleic acid (e.g., a gRNA) that specifically targets (e.g., binds to) the GRN gene.

In some embodiments, a subject has a haploinsufficiency associated with Holt-Oram syndrome. Subjects with Holt-Oram syndrome are haploinsufficient for the TBX5 gene. Holt-Oram syndrome is characterized by cardiac complications, including congenital heart defects and cardiac conduction disease. In some embodiments, a fusion protein of the present invention comprises a ZFP domain that specifically targets (e.g., binds to) the TBX5 gene, and a transactivation domain. In some embodiments, a composition for targeting TBX5 comprises (i) a fusion protein comprising a dCas protein and a transactivation domain, and (ii) a guide nucleic acid (e.g., a gRNA) that specifically targets (e.g., binds to) the TBX5 gene.

In some embodiments, a subject has a haploinsufficiency associated with Marfan syndrome. Subjects with Marfan syndrome are typically haploinsufficient for the FBN1 gene, which encodes the fibrillin-1 protein. Marfan syndrome is characterized by disproportionate limb length, early-onset arthritis, cardiac complications, and/or dysfunction of the autonomic nervous system. In some embodiments, a fusion protein of the present invention comprises a ZFP domain that specifically targets (e.g., binds to) the FBN1 gene, and a transactivation domain. In some embodiments, a composition for targeting FBN1 comprises (i) a fusion protein comprising a dCas protein and a transactivation domain, and (ii) a guide nucleic acid (e.g., a gRNA) that specifically targets (e.g., binds to) the FBN1 gene.

The present invention is based, in part, on methods of administering to a subject a fusion protein as described herein. In some embodiments, the fusion protein comprises a DNA binding domain (DBD) and a transcriptional activator. In some embodiments, the DBD is a ZNF, a TALE, a dCas protein (e.g., dCas9 or dCas12a), or a homeodomain that binds to the SCN1A gene. In some embodiments, the transcriptional activator is VP64, p65, RTA, or a tripartite transcriptional activator comprising VP64-p65-RTA (VPR). In some embodiments, the fusion protein is flanked by AAV inverted terminal repeat (ITR) sequences. In some embodiments, the fusion protein is operably linked to a promoter. In some embodiments, the subject has, or is suspected of having, a mutation in SCN1A that results in haploinsufficiency of the SCN1A protein. In some embodiments, the subject has, or is suspected of having, Dravet syndrome.

In some aspects, the present invention provides methods of modulating (e.g., increasing, decreasing, etc.) expression of a target gene in a cell. In some embodiments, the present invention provides methods of increasing expression of a target gene (e.g., SCN1A) in a cell. In some embodiments, the cell is a mammalian cell. In some embodiments, the cell is in a subject (e.g., in vivo). In some embodiments, the subject is a mammalian subject, for example a human. In some embodiments, the cell is a nervous system cell (a central nervous system cell or a peripheral nervous system cell), for example a neuron (e.g., a GABAergic neuron, unipolar neuron, bipolar neuron, basket cell, Betz cell, Lugaro cell, spiny neuron, Purkinje cell, pyramidal cell, Renshaw cell, granule cell, motor neuron, spindle cell, etc.) or a glial cell (e.g., an astrocyte, oligodendrocyte, ependymal cell, radial glial cell, Schwann cell, satellite cell, etc.).

In a "normal" cell or subject, expression of the target gene (e.g., SCN1A) is sufficient such that the cell or subject is not haploinsufficient with respect to that target gene (e.g., SCN1A). In some embodiments, "improved" or "increased" expression or activity of a transgene is measured relative to expression or activity of that transgene in a cell or subject that has not been administered one or more isolated nucleic acids, rAAVs, or compositions as described herein. In some embodiments, "improved" or "increased" expression or activity of a transgene is measured relative to expression or activity of that transgene in the same subject after the subject has been administered one or more isolated nucleic acids, rAAVs, or compositions as described herein (e.g., gene expression is measured before and after administration of the one or more isolated nucleic acids, rAAVs, or compositions). For example, in some embodiments, "improved" or "increased" expression of SCN1A in a cell or subject is measured relative to a cell or subject that has not been administered a transgene encoding a fused ZFP-transactivator. In some embodiments, the methods described by the present invention result in SCN1A expression and/or activity in a subject that is increased between 2-fold and 100-fold (e.g., 2-fold, 5-fold, 10-fold, 50-fold, 100-fold, etc.) relative to the SCN1A expression and/or activity of a subject that has not been administered one or more compositions described by the present invention.
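By way of illustration only, the fold increase referred to above may be computed as a simple ratio of expression in the administered cell or subject to expression in the reference (non-administered, or pre-administration) cell or subject. The following Python sketch is not part of the disclosed methods; the function names, the expression values, and the choice of arbitrary expression units are assumptions made solely for the example.

```python
def fold_change(treated: float, reference: float) -> float:
    """Ratio of expression after administration to expression in the reference.

    `reference` is expression in a cell or subject that has not been
    administered the composition (or the same subject before administration).
    """
    if reference <= 0:
        raise ValueError("reference expression must be positive")
    return treated / reference


def within_stated_range(fold: float, low: float = 2.0, high: float = 100.0) -> bool:
    """True if the fold increase lies in the 2-fold to 100-fold range recited above."""
    return low <= fold <= high


# Hypothetical expression values in arbitrary units (not measured data).
reference_scn1a = 0.5
treated_scn1a = 1.5

fc = fold_change(treated_scn1a, reference_scn1a)
print(fc)                       # 3.0
print(within_stated_range(fc))  # True
```

The same ratio applies whether the reference is a separate non-administered subject or the same subject measured before administration; only the source of the reference value differs.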

As used herein, the terms "treatment," "treating," and "therapy" refer to therapeutic treatment and to prophylactic or preventative manipulations. The terms further include ameliorating existing symptoms, preventing additional symptoms, ameliorating or preventing the underlying causes of symptoms, and preventing or reversing the causes of symptoms, for example, symptoms associated with a haploinsufficient gene (e.g., a haploinsufficient SCN1A gene). Thus, the terms denote that a beneficial result has been conferred on a subject having a disorder (e.g., a disease or condition associated with a haploinsufficient gene, e.g., Dravet syndrome), or having the potential to develop such a disorder. Furthermore, the term "treatment" includes the application or administration of an agent (e.g., a therapeutic agent or a therapeutic composition, e.g., an isolated nucleic acid or rAAV that targets or binds to a target gene or a regulatory region of a target gene) to a subject, or to an isolated tissue or cell line from a subject, who may have a disease, a symptom of a disease, or a predisposition toward a disease, with the purpose of curing, healing, alleviating, relieving, altering, remedying, ameliorating, improving, or affecting the disease, the symptoms of the disease, or the predisposition toward the disease.

Therapeutic agents or therapeutic compositions may include a compound in a pharmaceutically acceptable form that prevents and/or reduces the symptoms of a particular disease (e.g., a disease or condition associated with a haploinsufficient gene, e.g., Dravet syndrome). For example, a therapeutic composition may be a pharmaceutical composition that prevents and/or reduces the symptoms of a disease or condition associated with a haploinsufficient gene (e.g., Dravet syndrome). It is contemplated that the therapeutic composition of the present invention will be provided in any suitable form. The form of the therapeutic composition will depend on a number of factors, including the mode of administration as described herein. The therapeutic composition may contain diluents, adjuvants, and excipients, among other ingredients as described herein.

Modes of administration

The isolated nucleic acids, rAAVs, and compositions of the present invention may be delivered to a subject in compositions according to any appropriate method known in the art. For example, an rAAV, preferably suspended in a physiologically compatible carrier (e.g., in a composition), may be administered to a subject, i.e., a host animal, such as a human, mouse, rat, cat, dog, sheep, rabbit, horse, cow, goat, pig, guinea pig, hamster, chicken, turkey, or non-human primate (e.g., a macaque). In some embodiments, a host animal does not include a human.

An rAAV may be delivered to a mammalian subject by, for example, intramuscular injection or by administration into the bloodstream of the mammalian subject. Administration into the bloodstream may be by injection into a vein, an artery, or any other vascular conduit. In some embodiments, the rAAV is administered into the bloodstream by way of isolated limb perfusion, a technique well known in the surgical arts, which essentially enables the artisan to isolate a limb from the systemic circulation prior to administration of the rAAV virions. A variant of the isolated limb perfusion technique, described in U.S. Pat. No. 6,177,403, can also be employed by the skilled artisan to administer the virions into the vasculature of an isolated limb to potentially enhance transduction into muscle cells or tissue. Moreover, in certain instances, it may be desirable to deliver the virions to the CNS of a subject. "CNS" means all cells and tissue of the brain and spinal cord of a vertebrate. Thus, the term includes, but is not limited to, neuronal cells, glial cells, astrocytes, cerebrospinal fluid (CSF), interstitial spaces, bone, cartilage, and the like.

Recombinant AAVs may be delivered directly to the CNS or brain by injection into, e.g., the ventricular region, as well as into the striatum (e.g., the caudate nucleus or putamen of the striatum), thalamus, spinal cord and neuromuscular junction, or cerebellar lobule, with a needle, catheter, or related device, using neurosurgical techniques known in the art, such as stereotactic injection (see, e.g., Stein et al., J Virol 73:3424-3429, 1999; Davidson et al., PNAS 97:3428-3432, 2000; Davidson et al., Nat. Genet. 3:219-223, 1993; and Alisky and Davidson, Hum. Gene Ther. 11:2315-2329, 2000). In some embodiments, an rAAV as described in the present invention is administered by intravenous injection. In some embodiments, the rAAV is administered by intracerebral injection. In some embodiments, the rAAV is administered by intrathecal injection. In some embodiments, the rAAV is administered by intrastriatal injection. In some embodiments, the rAAV is delivered by intracranial injection. In some embodiments, the rAAV is delivered by cisterna magna injection. In some embodiments, the rAAV is delivered by injection into the lateral cerebral ventricle.

Aspects of the present invention relate to compositions comprising a recombinant AAV comprising a capsid protein and a nucleic acid encoding a transgene, wherein the transgene comprises a nucleic acid sequence encoding one or more proteins. In some embodiments, the nucleic acid further comprises AAV ITRs. In some embodiments, the composition further comprises a pharmaceutically acceptable carrier.

The compositions of the present invention may comprise an rAAV alone, or in combination with one or more other viruses (e.g., a second rAAV encoding one or more different transgenes). In some embodiments, a composition comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more different rAAVs, each having one or more different transgenes.

Suitable carriers may be readily selected by one of skill in the art in view of the indication for which the rAAV is directed. For example, one suitable carrier includes saline, which may be formulated with a variety of buffering solutions (e.g., phosphate buffered saline). Other exemplary carriers include sterile saline, lactose, sucrose, calcium phosphate, gelatin, dextran, agar, pectin, peanut oil, sesame oil, and water. The selection of the carrier is not a limitation of the present invention.

Optionally, the compositions of the present invention may contain, in addition to the rAAV and carrier(s), other conventional pharmaceutical ingredients, such as preservatives or chemical stabilizers. Suitable exemplary preservatives include chlorobutanol, potassium sorbate, sorbic acid, sulfur dioxide, propyl gallate, parabens, ethyl vanillin, glycerin, phenol, parachlorophenol, and poloxamers (non-ionic surfactants) such as Pluronic® F-68. Suitable chemical stabilizers include gelatin and albumin.

The rAAVs are administered in sufficient amounts to transfect the cells of a desired tissue and to provide sufficient levels of gene transfer and expression without undue adverse effects. Conventional and pharmaceutically acceptable routes of administration include, but are not limited to, direct delivery to the selected organ (e.g., intraportal delivery to the liver), oral, inhalation (including intranasal and intratracheal delivery), intraocular, intravenous, intramuscular, subcutaneous, intradermal, intratumoral, and other parenteral routes of administration. Routes of administration may be combined, if desired.

The dose of rAAV virions required to achieve a particular "therapeutic effect" (e.g., the dose in units of genome copies per kilogram of body weight (GC/kg)) will vary based on several factors including, but not limited to: the route of rAAV virion administration, the level of gene or RNA expression required to achieve a therapeutic effect, the specific disease or disorder being treated, and the stability of the gene or RNA product. One of skill in the art can readily determine an rAAV virion dose range to treat a patient having a particular disease or disorder based on the aforementioned factors, as well as other factors that are well known in the art.
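The arithmetic behind a GC/kg dose can be sketched as follows. This is a minimal illustration only: the function names and all numeric values (dose, body weight, stock titer) are hypothetical placeholders chosen for the example, not values taken from this document and not dosing recommendations.

```python
def total_genome_copies(dose_gc_per_kg: float, body_weight_kg: float) -> float:
    """Total genome copies (GC) to administer for a weight-based dose."""
    if dose_gc_per_kg <= 0 or body_weight_kg <= 0:
        raise ValueError("dose and body weight must be positive")
    return dose_gc_per_kg * body_weight_kg


def injection_volume_ml(total_gc: float, titer_gc_per_ml: float) -> float:
    """Volume of vector stock (mL) that contains the requested number of GC."""
    if titer_gc_per_ml <= 0:
        raise ValueError("titer must be positive")
    return total_gc / titer_gc_per_ml


# Hypothetical example values (assumptions, for illustration only).
dose = 1e12        # GC per kg of body weight
weight = 0.025     # kg, e.g. a small laboratory animal
titer = 1e13       # GC per mL of vector stock

total = total_genome_copies(dose, weight)
volume = injection_volume_ml(total, titer)
print(f"total GC: {total:.2e}, volume: {volume * 1000:.2f} uL")
```

The point of separating the two steps is that the weight-based dose fixes the number of genome copies, while the stock titer only determines the volume needed to deliver them.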

An effective amount of an rAAV is an amount sufficient to infect an animal or to target a desired tissue. In some embodiments, an effective amount of an rAAV is administered to the subject during a pre-symptomatic stage of a lysosomal storage disease. In some embodiments, the pre-symptomatic stage of the lysosomal storage disease occurs between birth (e.g., the perinatal period) and 4 weeks of age.

In some embodiments, rAAV compositions are formulated to reduce aggregation of AAV particles in the composition, particularly where high rAAV concentrations are present (e.g., ~10^13 GC/mL or more). Methods for reducing aggregation of rAAVs are well known in the art and include, for example, the addition of surfactants, pH adjustment, salt concentration adjustment, etc. (See, e.g., Wright FR et al., Molecular Therapy (2005) 12, 171-178, the contents of which are incorporated herein by reference.)

Formulation of pharmaceutically acceptable excipients and carrier solutions is well known to those of skill in the art, as is the development of suitable dosing and treatment regimens for using the particular compositions described herein in a variety of treatment regimens.

Typically, these formulations may contain at least about 0.1% of the active compound or more, although the percentage of the active ingredient(s) may, of course, be varied and may conveniently be between about 1 or 2% and about 70% or 80% or more of the weight or volume of the total formulation. Naturally, the amount of active compound in each therapeutically useful composition may be prepared in such a way that a suitable dosage will be obtained in any given unit dose of the compound. Factors such as solubility, bioavailability, biological half-life, route of administration, product shelf life, and other pharmacological considerations will be contemplated by one skilled in the art of preparing such pharmaceutical formulations, and as such, a variety of dosages and treatment regimens may be desirable.

In certain circumstances, it will be desirable to deliver the rAAV-based therapeutic constructs in suitably formulated pharmaceutical compositions disclosed herein subcutaneously, intrapancreatically, intranasally, parenterally, intravenously, intramuscularly, intrathecally, orally, intraperitoneally, or by inhalation. In some embodiments, the administration modalities described in U.S. Pat. Nos. 5,543,158; 5,641,515; and 5,399,363 (each specifically incorporated herein by reference in its entirety) may be used to deliver rAAVs. In some embodiments, a preferred mode of administration is by portal vein injection.

The pharmaceutical forms suitable for injectable use include sterile aqueous solutions or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions. Dispersions may also be prepared in glycerol, liquid polyethylene glycols, and mixtures thereof, and in oils. Under ordinary conditions of storage and use, these preparations contain a preservative to prevent the growth of microorganisms. In many cases, the form is sterile and fluid to the extent that it is easily injected. It must be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms, such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, a polyol (e.g., glycerol, propylene glycol, liquid polyethylene glycol, and the like), suitable mixtures thereof, and/or vegetable oils. Proper fluidity may be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of a dispersion, and by the use of surfactants. The prevention of the action of microorganisms can be brought about by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, sorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars or sodium chloride. Prolonged absorption of the injectable compositions can be brought about by the use in the compositions of agents that delay absorption, for example, aluminum monostearate and gelatin.

For administration of an injectable aqueous solution, for example, the solution may be suitably buffered, if necessary, and the liquid diluent first rendered isotonic with sufficient saline or glucose. These particular aqueous solutions are especially suitable for intravenous, intramuscular, subcutaneous, and intraperitoneal administration. In this connection, sterile aqueous media that can be employed will be known to those of skill in the art. For example, one dose may be dissolved in 1 mL of isotonic NaCl solution and either added to 1000 mL of hypodermoclysis fluid or injected at the proposed site of infusion (see, e.g., "Remington's Pharmaceutical Sciences", 15th Edition, pages 1035-1038 and 1570-1580). Some variation in dosage will necessarily occur depending on the condition of the host. The person responsible for administration will, in any event, determine the appropriate dose for the individual host.

Sterile injectable solutions are prepared by incorporating the active rAAV in the required amount in the appropriate solvent, optionally with various of the other ingredients enumerated herein, followed by filter sterilization. Generally, dispersions are prepared by incorporating the various sterilized active ingredients into a sterile vehicle that contains the basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum-drying and freeze-drying techniques, which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

The rAAV compositions disclosed herein may also be formulated in a neutral or salt form. Pharmaceutically acceptable salts include the acid addition salts (formed with the free amino groups of the protein), which are formed with inorganic acids such as, for example, hydrochloric or phosphoric acid, or with organic acids such as acetic, oxalic, tartaric, mandelic, and the like. Salts formed with the free carboxyl groups can also be derived from inorganic bases such as, for example, sodium, potassium, ammonium, calcium, or ferric hydroxide, and from organic bases such as isopropylamine, trimethylamine, histidine, procaine, and the like. Upon formulation, solutions will be administered in a manner compatible with the dosage formulation and in such amount as is therapeutically effective. The formulations are easily administered in a variety of dosage forms, such as injectable solutions, drug-release capsules, and the like.

As used herein, "carrier" includes any and all solvents, dispersion media, vehicles, coatings, diluents, antibacterial and antifungal agents, isotonic and absorption-delaying agents, buffers, carrier solutions, suspensions, colloids, and the like. The use of such media and agents for pharmaceutically active substances is well known in the art. Supplementary active ingredients can also be incorporated into the compositions. The phrase "pharmaceutically acceptable" refers to molecular entities and compositions that do not produce an allergic or similar untoward reaction when administered to a host.

Delivery vehicles such as liposomes, nanocapsules, microparticles, microspheres, lipid particles, vesicles, and the like may be used to introduce the compositions of the present invention into suitable host cells. In particular, the transgenes delivered by rAAV vectors may be formulated for delivery encapsulated in a lipid particle, a liposome, a vesicle, a nanosphere, a nanoparticle, or the like.

Such formulations may be preferred for the introduction of pharmaceutically acceptable formulations of the nucleic acids or the rAAV constructs disclosed herein. The formation and use of liposomes is generally known to those of skill in the art. Recently, liposomes were developed with improved serum stability and circulation half-lives (U.S. Pat. No. 5,741,516). Further, various methods of using liposomes and liposome-like preparations as potential drug carriers have been described (U.S. Pat. Nos. 5,567,434; 5,552,157; 5,565,213; 5,738,868; and 5,795,587).

Liposomes have been used successfully with a number of cell types that are normally resistant to transfection by other procedures. In addition, liposomes are free of the DNA length constraints that are typical of viral-based delivery systems. Liposomes have been used effectively to introduce genes, drugs, radiotherapeutic agents, viruses, transcription factors, and allosteric effectors into a variety of cultured cell lines and animals. In addition, several successful clinical trials examining the effectiveness of liposome-mediated drug delivery have been completed.

Liposomes are formed from phospholipids that are dispersed in an aqueous medium and spontaneously form multilamellar concentric bilayer vesicles (also termed multilamellar vesicles (MLVs)). MLVs generally have diameters of from 25 nm to 4 µm. Sonication of MLVs results in the formation of small unilamellar vesicles (SUVs) with diameters in the range of 200 to 500 Å, containing an aqueous solution in the core.

Alternatively, nanocapsule formulations of the rAAV may be used. Nanocapsules can generally entrap substances in a stable and reproducible way. To avoid side effects due to intracellular polymeric overloading, such ultrafine particles (sized around 0.1 µm) should be designed using polymers able to be degraded in vivo. Biodegradable polyalkyl-cyanoacrylate nanoparticles that meet these requirements are contemplated for use.

In addition to the methods of delivery described above, the following techniques are also contemplated as alternative methods of delivering the rAAV compositions to a host. Sonophoresis (i.e., ultrasound) has been used and described in U.S. Pat. No. 5,656,016 as a device for enhancing the rate and efficacy of drug permeation into and through the circulatory system. Other drug delivery alternatives contemplated are intraosseous injection (U.S. Pat. No. 5,779,708), microchip devices (U.S. Pat. No. 5,797,898), ophthalmic formulations (Bourlais et al., 1998), transdermal matrices (U.S. Pat. Nos. 5,770,219 and 5,783,208), and feedback-controlled delivery (U.S. Pat. No. 5,697,899).

Examples
Example 1: Design of zinc finger proteins to upregulate SCN1A gene expression
Regions of homology between the human (HEK293T cells) and mouse (HEPG2 cells) SCN1A promoter sequences were identified by aligning the sequences surrounding the two major transcription start sites identified in the RIKEN CAGE sequence dataset for each species (Figure 1). A highly conserved sequence between human (HEK) and mouse (HEPG2) is present in the proximal promoter region of SCN1A (Figure 2). Three ZFPs (ZFP1 to ZFP3), each composed of six fingers, were designed by assembling one-finger and two-finger modules with predetermined DNA-binding specificities to bind overlapping 15- to 22-nucleotide homologous regions in the SCN1A proximal promoter region; these are the overlapping, highly conserved sequences identified in Figure 3. Each finger was designed to bind a three-base region (triplet) within the highly conserved region of the SCN1A proximal promoter.

As shown in Figure 4A, ZFP-1 recognizes individual three-base regions within the proximal promoter region of the SCN1A gene (SEQ ID NO: 2) (DNA triplets indicated in red and separated by "•"). As shown in Figure 4B, each recognition helix (seven amino acids) of fingers 1 to 6 of ZFP-1 binds a three-nucleotide sequence. The amino acid sequences of the six fingers of ZFP-1 are shown in Figure 4C (SEQ ID NOs: 17 to 22); the linkers between the fingers are highlighted to designate canonical (TGEKP) and non-canonical (TGSQKP) linker sequences. The nucleotide sequences of the six fingers of ZFP-1 are shown in Figure 4D (SEQ ID NOs: 11 to 16).

Table 1: Recognition helices of ZFP-1 targeting SCN1A
Recognition helix      Amino acid sequence        Nucleotide sequence
ZFP-1 helix 1          QRGNLVR (SEQ ID NO: 17)    CAGCGGGGAAACCTGGTGAGG (SEQ ID NO: 11)
ZFP-1 helix 2          LSFNLTR (SEQ ID NO: 18)    CTGAGCTTCAATCTAACCAGA (SEQ ID NO: 12)
ZFP-1 helix 3          RSDNLTR (SEQ ID NO: 19)    CGGAGTGACAACTTAACGCGG (SEQ ID NO: 13)
ZFP-1 helix 4          DRSHLAR (SEQ ID NO: 20)    GACCGGTCTCACCTTGCCCGA (SEQ ID NO: 14)
ZFP-1 helix 5          QKAHLTA (SEQ ID NO: 21)    CAGAAGGCCCATTTGACTGCC (SEQ ID NO: 15)
ZFP-1 helix 6          RSDNLTR (SEQ ID NO: 22)    CGGTCGGACAACCTCACACGC (SEQ ID NO: 16)
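As a quick consistency check on Table 1, each 21-nucleotide sequence can be translated with the standard genetic code and compared with the seven-residue recognition helix listed beside it. The sketch below is illustrative only: the function and variable names are hypothetical, and the use of the standard genetic code is an assumption (the document does not state which codon table applies).

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence whose length is a multiple of three."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# (helix amino acids, coding nucleotides) pairs copied from Table 1.
ZFP1_HELICES = [
    ("QRGNLVR", "CAGCGGGGAAACCTGGTGAGG"),
    ("LSFNLTR", "CTGAGCTTCAATCTAACCAGA"),
    ("RSDNLTR", "CGGAGTGACAACTTAACGCGG"),
    ("DRSHLAR", "GACCGGTCTCACCTTGCCCGA"),
    ("QKAHLTA", "CAGAAGGCCCATTTGACTGCC"),
    ("RSDNLTR", "CGGTCGGACAACCTCACACGC"),
]

for number, (helix, coding) in enumerate(ZFP1_HELICES, start=1):
    status = "ok" if translate(coding) == helix else "MISMATCH"
    print(f"helix {number}: {coding} -> {translate(coding)} ({status})")
```

Helices 3 and 6 share the same amino acid sequence (RSDNLTR) but use different codons, which is why the pairs are kept in a list rather than a dictionary keyed by helix.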

As shown in Figure 5A, ZFP-2 recognizes individual three-base regions within the proximal promoter region of the SCN1A gene (SEQ ID NO: 3) (DNA triplets indicated in red and separated by "•"). As shown in Figure 5B, each recognition helix (seven amino acids) of fingers 1 to 6 of ZFP-2 binds a three-nucleotide sequence. The amino acid sequences of the six fingers of ZFP-2 are shown in Figure 5C (SEQ ID NOs: 29 to 34); the linkers between the fingers are highlighted to designate canonical (TGEKP) and non-canonical (TGSQKP) linker sequences. The nucleotide sequences of the six fingers of ZFP-2 are shown in Figure 5D (SEQ ID NOs: 23 to 28).

Table 2: Recognition helices of ZFP-2 targeting SCN1A
Recognition helix      Amino acid sequence        Nucleotide sequence
ZFP-2 helix 1          RSSNLTR (SEQ ID NO: 29)    CGAAGTTCCAACCTGACACGG (SEQ ID NO: 23)
ZFP-2 helix 2          DKRTLIR (SEQ ID NO: 30)    GACAAGCGGACCTTAATCCGC (SEQ ID NO: 24)
ZFP-2 helix 3          QRGNLVR (SEQ ID NO: 31)    CAGCGGGGAAATCTAGTGCGA (SEQ ID NO: 25)
ZFP-2 helix 4          LSFNLTR (SEQ ID NO: 32)    CTGAGCTTCAACTTGACTCGT (SEQ ID NO: 26)
ZFP-2 helix 5          RSDNLTR (SEQ ID NO: 33)    CGGAGTGACAATCTTACGAGA (SEQ ID NO: 27)
ZFP-2 helix 6          DRSHLAR (SEQ ID NO: 34)    GACCGGAGCCACTTAGCCAGG (SEQ ID NO: 28)

As shown in Figure 6A, ZFP-3 recognizes individual three-base regions within the proximal promoter region of the SCN1A gene (SEQ ID NO: 4) (DNA triplets indicated in red and separated by "•"). As shown in Figure 6B, each recognition helix (seven amino acids) of fingers 1 to 6 of ZFP-3 binds a three-nucleotide sequence. The amino acid sequences of the six fingers of ZFP-3 are shown in Figure 6C (SEQ ID NOs: 41 to 46); the linkers between the fingers are highlighted to designate canonical (TGEKP) and non-canonical (TGSQKP) linker sequences. The nucleotide sequences of the six fingers of ZFP-3 are shown in Figure 6D (SEQ ID NOs: 35 to 40).

Table 3: Recognition helices of ZFP-3 targeting SCN1A
Recognition helix      Amino acid sequence        Nucleotide sequence
ZFP-3 helix 1          DRSALAR (SEQ ID NO: 41)    GACCGGAGCGCGCTGGCACGG (SEQ ID NO: 35)
ZFP-3 helix 2          RSDNLTR (SEQ ID NO: 42)    CGAAGTGACAACTTAACGCGC (SEQ ID NO: 36)
ZFP-3 helix 3          QSGDLTR (SEQ ID NO: 43)    CAGTCAGGGGACCTCACTCGT (SEQ ID NO: 37)
ZFP-3 helix 4          VRQTLKQ (SEQ ID NO: 44)    GTACGACAGACGCTTAAACAA (SEQ ID NO: 38)
ZFP-3 helix 5          AAGNLTR (SEQ ID NO: 45)    GCCGCTGGTAACTTGACACGA (SEQ ID NO: 39)
ZFP-3 helix 6          RSDNLTR (SEQ ID NO: 46)    AGATCTGATAATCTAACGCGT (SEQ ID NO: 40)

Additional ZFPs designed to target conserved sequences in the proximal promoter region of the SCN1A gene will each comprise five or six finger domains and will bind to regions of 15 to 22 nucleotides that are highly conserved between human and mouse SCN1A.

Table 4: Zinc finger proteins targeting SCN1A

ZFP-1
Amino acid sequence (SEQ ID NO: 57):
RPFQCRICMRNFSQRGNLVRHIRTHTGEKPFACDICGKKFALSFNLTRHTKIHTGSQKPFQCRICMRNFSRSDNLTRHIRTHTGEKPFACDICGKKFADRSHLARHTKIHTGSQKPFQCRICMRNFSQKAHLTAHIRTHTGEKPFACDICGRKFARSDNLTRHTKIHLRQKD
Nucleotide sequence (SEQ ID NO: 58):
CGACCATTCCAGTGTCGAATCTGCATGCGCAACTTCAGCCAGCGGGGAAACCTGGTGAGGCATATCCGCACCCACACGGGAGAGAAGCCTTTTGCCTGCGATATTTGTGGAAAGAAGTTTGCTCTGAGCTTCAATCTAACCAGACACACCAAGATTCATACTGGGTCCCAGAAACCGTTCCAGTGTAGGATATGCATGAGGAATTTCTCTCGGAGTGACAACTTAACGCGGCATATAAGGACGCACACAGGTGAAAAACCATTTGCATGCGACATCTGTGGCAAAAAGTTTGCGGACCGGTCTCACCTTGCCCGACACACAAAAATCCATACCGGCAGTCAAAAGCCCTTTCAATGTCGCATTTGCATGCGAAACTTCTCACAGAAGGCCCATTTGACTGCCCATATTCGTACTCATACTGGCGAGAAACCTTTCGCTTGCGATATATGTGGTCGTAAGTTTGCACGGTCGGACAACCTCACACGCCACACTAAGATACACCTGCGGCAGAAGGAC

ZFP-2
Amino acid sequence (SEQ ID NO: 59):
RPFQCRICMRNFSRSSNLTRHIRTHTGEKPFACDICGKKFADKRTLIRHTKIHTGSQKPFQCRICMRNFSQRGNLVRHIRTHTGEKPFACDICGKKFALSFNLTRHTKIHTGSQKPFQCRICMRNFSRSDNLTRHIRTHTGEKPFACDICGRKFADRSHLARHTKIHLRQKD
Nucleotide sequence (SEQ ID NO: 60):
CGACCATTCCAGTGTCGAATCTGCATGCGCAACTTCAGCCGAAGTTCCAACCTGACACGGCATATCCGCACCCACACGGGAGAGAAGCCTTTTGCCTGCGATATTTGTGGAAAGAAGTTTGCTGACAAGCGGACCTTAATCCGCCACACCAAGATTCATACTGGGTCCCAGAAACCGTTCCAGTGTAGGATATGCATGAGGAATTTCTCTCAGCGGGGAAATCTAGTGCGACATATAAGGACGCACACAGGTGAAAAACCATTTGCATGCGACATCTGTGGCAAAAAGTTTGCGCTGAGCTTCAACTTGACTCGTCACACAAAAATCCATACCGGCAGTCAAAAGCCCTTTCAATGTCGCATTTGCATGCGAAACTTCTCACGGAGTGACAATCTTACGAGACATATTCGTACTCATACTGGCGAGAAACCTTTCGCTTGCGATATATGTGGTCGTAAGTTTGCAGACCGGAGCCACTTAGCCAGGCACACTAAGATACACCTGCGGCAGAAGGAC

ZFP-3
Amino acid sequence (SEQ ID NO: 61):
RPFQCRICMRNFSDRSALARHIRTHTGEKPFACDICGKKFARSDNLTRHTKIHTGSQKPFQCRICMRNFSQSGDLTRHIRTHTGEKPFACDICGKKFAVRQTLKQHTKIHTGSQKPFQCRICMRNFSAAGNLTRHIRTHTGEKPFACDICGRKFARSDNLTRHTKIHLRQKD
Nucleotide sequence (SEQ ID NO: 62):
CGACCATTCCAGTGTCGAATCTGCATGCGCAACTTCAGCGACCGGAGCGCGCTGGCACGGCATATCCGCACCCACACGGGAGAGAAGCCTTTTGCCTGCGATATTTGTGGAAAGAAGTTTGCTCGAAGTGACAACTTAACGCGCCACACCAAGATTCATACTGGGTCCCAGAAACCGTTCCAGTGTAGGATATGCATGAGGAATTTCTCTCAGTCAGGGGACCTCACTCGTCATATAAGGACGCACACAGGTGAAAAACCATTTGCATGCGACATCTGTGGCAAAAAGTTTGCGGTACGACAGACGCTTAAACAACACACAAAAATCCATACCGGCAGTCAAAAGCCCTTTCAATGTCGCATTTGCATGCGAAACTTCTCAGCCGCTGGTAACTTGACACGACATATTCGTACTCATACTGGCGAGAAACCTTTCGCTTGCGATATATGTGGTCGTAAGTTTGCAAGATCTGATAATCTAACGCGTCACACTAAGATACACCTGCGGCAGAAGGAC
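The relationship between the full-length ZFP-1 protein in Table 4 and the six recognition helices in Table 1 can be illustrated with a short sketch. The flanking motifs used in the pattern (RNFS or KKFA/RKFA before each helix, a histidine after it) are an assumption inferred by inspecting the sequences in Table 4, not a rule stated in this document, and the names below are hypothetical.

```python
import re

# ZFP-1 amino acid sequence (SEQ ID NO: 57), written one finger per line.
# Fingers are joined by the TGEKP and TGSQKP linkers named in the text.
ZFP1_PROTEIN = (
    "RPFQCRICMRNFSQRGNLVRHIRTHTGEKP"
    "FACDICGKKFALSFNLTRHTKIHTGSQKP"
    "FQCRICMRNFSRSDNLTRHIRTHTGEKP"
    "FACDICGKKFADRSHLARHTKIHTGSQKP"
    "FQCRICMRNFSQKAHLTAHIRTHTGEKP"
    "FACDICGRKFARSDNLTRHTKIHLRQKD"
)

# Assumed flanks: each seven-residue helix follows "RNFS", "KKFA" or "RKFA"
# and is immediately followed by a histidine.
HELIX_PATTERN = re.compile(r"(?:RNFS|[KR]KFA)([A-Z]{7})H")


def recognition_helices(protein: str) -> list[str]:
    """Return the seven-residue recognition helices in N- to C-terminal order."""
    return HELIX_PATTERN.findall(protein)


helices = recognition_helices(ZFP1_PROTEIN)
print(helices)
print("canonical TGEKP linkers:", ZFP1_PROTEIN.count("TGEKP"))
print("non-canonical TGSQKP linkers:", ZFP1_PROTEIN.count("TGSQKP"))
```

Because ZFP-2 and ZFP-3 in Table 4 share the same framework residues, the same pattern should apply to them as well, although that has only been checked here against ZFP-1.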

Example 2: ZFPs increase SCN1A gene expression in human cells

To examine the ability of ZFP1-ZFP3 to upregulate SCN1A transcription, the ZFP1-ZFP3 DNA-binding domains were fused to the hybrid VP64-p65-Rta (VPR) tripartite strong transcriptional activation domain to form chimeric transactivators. The VPR fusion activation domain recruits transcriptional regulatory complexes, increases chromatin accessibility, and helps achieve high levels of gene expression. The ZFP domains therefore target the VPR activator to highly conserved sequences in the proximal promoter region to increase SCN1A gene expression.

Expression plasmids encoding the VPR-ZFP1, VPR-ZFP2, and/or VPR-ZFP3 fusion proteins were transiently transfected into HEK293 cells, and SCN1A gene expression was measured by qRT-PCR, using TBP expression as the normalization reference. The VPR-ZFP fusions comprise ZFP1, ZFP2, and/or ZFP3 fused to VPR. Transfection of all three constructs for multiplexed regulation (containing the ZFP1, ZFP2, and ZFP3 DNA-binding domains, each fused to VPR) resulted in a 45-fold increase in SCN1A gene expression relative to untransfected cells, indicating that VPR-ZFP chimeric transactivators can increase SCN1A gene expression by binding in the proximal region of the gene promoter (Figure 7).
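The text reports a fold increase in SCN1A expression measured by qRT-PCR and normalized to TBP, but does not state how the fold change was computed. The sketch below assumes the widely used 2^(-ΔΔCt) method, which in turn assumes roughly 100% amplification efficiency for both primer pairs. The function name and all Ct values are hypothetical and chosen only for illustration; they are not data from the source.

```python
def fold_change(ct_target_treated: float, ct_ref_treated: float,
                ct_target_control: float, ct_ref_control: float) -> float:
    """Relative expression by the 2^(-ddCt) method (assumed, not stated in the source)."""
    # Normalize the target gene (e.g., SCN1A) to the reference gene (e.g., TBP).
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the treated sample with the control sample.
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold 5 cycles earlier after treatment,
# while the reference gene is unchanged.
print(fold_change(ct_target_treated=27.0, ct_ref_treated=24.0,
                  ct_target_control=32.0, ct_ref_control=24.0))  # 32.0
```

With this convention, a value above 1 indicates upregulation relative to the control sample (here, untransfected cells).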

The VPR-[ZFP1-ZFP3] fusion proteins, and VPR-ZFP fusion proteins whose ZFP DNA-binding domains are still being designed, were transfected into HeLa and HepG2 cells, both of which express SCN1A at low levels. The VPR-ZFP fusion proteins contain single ZFP DNA-binding domains, or combinations of multiple ZFP DNA-binding domains, fused to the VPR transactivator. SCN1A gene expression was measured by qRT-PCR to determine whether these VPR-ZFP fusions increase gene expression. The most promising VPR-ZFP fusion candidates were tested for their ability to increase SCN1A expression in primary mouse cortical neurons following adeno-associated virus (AAV) delivery of the fusion proteins.

The specificity of the ZFP domains was further optimized using a bacterial one-hybrid selection system (see, e.g., Meng et al., "Targeted gene inactivation in zebrafish using engineered zinc-finger nucleases", Nat Biotechnol, 2008) to identify optimal ZFPs from randomized libraries in which residues important for DNA binding are varied. Newly selected ZFPs were fused to the VPR transactivation domain, individually and in combinations of multiple ZFPs, and transfected into HEK293, HeLa, and HepG2 cells and primary mouse cortical neurons to identify the candidate ZFP domains that most strongly increase SCN1A gene expression as measured by qRT-PCR.

Example 3: Generation of a series of ZFP SCN1A transactivators with varying potency

The ZFPs from Example 2 that most effectively upregulate SCN1A gene expression were fused to a series of human transactivation domains with an expected gradient of potency (e.g., Rta, p65, Hsf1) to identify assemblies that achieve a 2-fold upregulation of SCN1A gene expression across a range of AAV multiplicities of infection (MOI). Primary cortical neurons from normal and SCN1A +/- mice were infected with AAV vectors expressing ZFP SCN1A fusion transactivators. NaV1.1 expression levels were assessed by Western blot and by qPCR. Primary neurons treated with TGF-α for 8 hours were used as a positive control, because this treatment increases NaV1.1 protein expression approximately 6- to 8-fold (Chen et al., 2015, Neuroinflammation 12: 126). Changes in the expression levels of other NaV α-subunit genes were also assessed to confirm the specificity of ZFP SCN1A transactivation. To determine whether NaV1.1 expression remained restricted to GABAergic interneurons, double immunofluorescence staining was performed with antibodies against ZFP SCN1A (HA tag) and against either markers specific for GABAergic neurons (e.g., parvalbumin+ or somatostatin+) or pan-neuronal markers (e.g., NeuN, TUBIII, and/or Map2). The specificity of ZFP SCN1A for SCN1A transactivation was also assessed by ChIP-Seq and RNA-Seq to map genomic binding sites and the transcriptomic profile resulting from gene transfer.

Example 4: Histone organization and epigenomic landscape of the SCN1A promoter in GABAergic inhibitory neurons guide the design of promoter activity-dependent SCN1A-ZFP transactivators

The ability of a ZFP to bind its genomic target depends on the accessibility of the target sequence (e.g., the presence of nucleosome-free regions). This requirement for DNA accessibility is used to design ZFP transactivators that function only in the subset of cell types in which the DNA target sequence is accessible. Cell-type activity is further restricted by using tissue-specific promoters to express the ZFP transactivators. Small promoters from the pufferfish (Takifugu rubripes) somatostatin and neuropeptide Y genes have been shown to drive highly specific transgene expression in cortical and hippocampal inhibitory interneurons in the context of AAV and lentiviral vectors. In some embodiments, combining AAV-based transcriptional restriction with SCN1A-specific ZFPs that are sensitive to DNA accessibility results in highly specific upregulation of NaV1.1 protein expression in inhibitory interneurons throughout the brain. This dual regulation approach minimizes the side effects that could result from ectopic expression of NaV1.1 protein in cells that do not normally express it.

The nucleosome structure and epigenetic landscape of the SCN1A promoter were analyzed in mouse and human GABAergic inhibitory and glutamatergic excitatory neurons. This information was used to design GABAergic inhibitory neuron-restricted ZFP transactivators by targeting sequences around the SCN1A locus that are accessible only in this cell type.

Fluorescence-activated cell sorting (FACS) was used to isolate GABAergic inhibitory neurons from transgenic mice expressing TdTomato under the GAD67 promoter, and GFP-positive glutamatergic excitatory neurons generated by crossing Emx1-IRES-Cre mice with ROSA26/stop/EGFP mice. Human GABAergic and excitatory neurons were generated from induced pluripotent stem (iPS) cells and confirmed by immunostaining and RT-PCR for markers specific to these cell types, and by electrophysiological activity. The assay for transposase-accessible chromatin with sequencing (ATAC-Seq) was used to characterize accessible genomic regions around the SCN1A promoter in the mouse and human neuronal populations.

Based on the differential chromatin accessibility of genomic regions around the SCN1A promoter in inhibitory and excitatory neurons, ZFP SCN1A transactivators were designed to recognize sequences that are accessible only in GABAergic neurons. A series of candidate ZFP-VPR transactivator fusions was generated to target different accessible SCN1A regions, in which transactivator binding is expected to efficiently upregulate NaV1.1 expression in inhibitory neurons, and to reveal any undesired induction of NaV1.1 expression in excitatory neurons.
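The selection of target sequences that are accessible in one neuron type but not the other can be pictured as an interval comparison between two ATAC-Seq peak sets. The sketch below is only an illustration under stated assumptions: peaks are half-open intervals on a single chromosome, any overlap with a peak from the other cell type disqualifies a region, and all coordinates and names are hypothetical. The source does not describe the actual peak-calling or filtering procedure.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) coordinates on one chromosome


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def accessible_only_in(target_peaks: List[Interval],
                       other_peaks: List[Interval]) -> List[Interval]:
    """Keep peaks from the target cell type that overlap no peak in the other cell type."""
    return [p for p in target_peaks
            if not any(overlaps(p, q) for q in other_peaks)]


# Hypothetical peak coordinates near a promoter, for illustration only.
gabaergic_peaks = [(100, 250), (400, 520), (900, 1000)]
glutamatergic_peaks = [(200, 300), (950, 1100)]

print(accessible_only_in(gabaergic_peaks, glutamatergic_peaks))  # [(400, 520)]
```

A stricter or looser criterion (for example, a minimum overlap fraction) could be substituted; the all-or-nothing rule here is a simplifying choice.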

Expression studies were performed in cultured human iPS-derived neurons modeling Dravet syndrome and in mouse SCN1A +/- primary neurons to determine whether ZFP SCN1A transactivators designed to recognize DNA sequences accessible only in inhibitory neurons provide the necessary specificity when expressed from AAV vectors under the pan-neuronal human synapsin-1 promoter or an inhibitory interneuron-specific promoter. NaV1.1 expression levels were measured by qRT-PCR, Western blot, and double immunofluorescence with neuron type-specific markers of inhibitory GABAergic neurons (e.g., GABA+, GAD65/67+, somatostatin, and/or parvalbumin) and excitatory glutamatergic neurons (e.g., Cux1+, FoxG1+, GABA A receptor, GABA- neurons). Cell type-specific ZFP SCN1A transactivators were designed to target different sequences in the mouse and human SCN1A promoters, because chromatin structure and DNA sequence within the syntenic region differ between species. Controls in these experiments included neuronal cultures infected with comparable AAV vectors encoding GFP, a ZFP without a transactivation domain, or a transactivator without a ZFP DNA-binding domain.

Binding sites for microRNAs (miRNAs) that are restricted to cell types in which undesired expression occurs (e.g., glutamatergic excitatory neurons) are incorporated into the 3' untranslated region (3' UTR) of the ZFP SCN1A transactivators. This approach has previously been used to restrict the expression of AAV-delivered transgenes (Xie et al., "MicroRNA-regulated, systemically delivered rAAV9: a step closer to CNS-restricted transgene expression", Mol. Ther. 2011). Differences in miRNA expression profiles between GABAergic inhibitory neurons and other cell types were determined by small RNA sequencing.

Example 5: Assessing the potential of AAV-ZFP SCN1A gene therapy to correct sodium current deficits in GABAergic interneurons generated from patient-derived iPS cells

A key step in developing ZFP SCN1A transactivators for Dravet syndrome is to confirm that these artificial transactivators have the desired function in human neurons. For this purpose, iPS cells were obtained from Dravet patients (n = 4 to 6) and non-Dravet patients (n = 4). These cells carry the donor's Dravet or non-Dravet genetic background without requiring artificial manipulation of gene expression, and iPS cells have therefore become a current state-of-the-art cell system for biomedical research. Isogenic cell lines were generated using CRISPR-Cas9 genome editing, either by repairing the genetic mutation in SCN1A to the wild-type sequence or by introducing a Dravet-associated mutation into a normal allele in a control cell line. Isogenic lines eliminate the natural variability that arises when comparing cell lines from different human individuals and are therefore valuable for confirming and strengthening disease-specific phenotypes. Established inhibitory neuron differentiation protocols and validation pathways were used to differentiate the iPS cell lines into forebrain GABAergic inhibitory interneurons.

Inhibitory neurons derived from Dravet patients show reduced sodium currents and impaired action potential firing, as determined by whole-cell patch-clamp electrophysiology. Similar measurements were performed to confirm that the Dravet-derived neurons described herein recapitulate the phenotypes associated with this disease. Sodium current deficits occur in inhibitory neurons, but not in excitatory neurons, of Dravet patients (Sun et al.), and therefore only inhibitory neurons are used in the present invention. Mutation-induced sodium channel deficits in inhibitory neurons derived from Dravet patients can be rescued by ectopic expression of wild-type SCN1A (ref. 20). Thus, in the context of Dravet syndrome, the methods described in the present invention are suitable for testing the efficacy of ZFP SCN1A transactivators in restoring wild-type sodium channel function and physiology.

GABAergic inhibitory neuron cultures were infected with AAV vectors encoding ZFP SCN1A transactivators under a pan-neuronal or inhibitory neuron-specific promoter. Changes in NaV1.1 expression levels were assessed by Western blot. Restoration of functional sodium currents in inhibitory neurons was assessed by whole-cell patch clamp, comparing untransfected cells with transfected cells. Binding of the ZFP SCN1A transactivators across the genome in inhibitory neurons derived from all patients was analyzed by ChIP-seq and correlated with any transcriptomic changes detected by RNA-seq. Controls in these experiments were neuronal cultures infected with comparable AAV vectors encoding GFP, a ZFP without the VPR transactivation domain, or the VPR transactivation domain without a ZFP DNA-binding domain.

Example 6: Assessing the therapeutic potential of AAV-ZFP SCN1A intervention in SCN1A mice at different ages and by different routes of delivery

The broad tropism of AAV is a key property for gene therapy applications involving widely expressed genes, but it becomes a major challenge when the transgene of interest must be expressed in a cell type-specific manner. For major tissues such as liver, muscle, and heart, this problem has largely been addressed by using tissue-specific promoters, such as the thyroxine-binding protein (TBP), creatine kinase, and troponin T promoters, respectively. An additional level of control can be layered on top of tissue-specific promoters to achieve a greater degree of de-targeting from specific tissues by incorporating multiple copies of binding sites for microRNAs that are highly abundant in those tissues (such as miR-122 in liver and miR-1 in skeletal muscle). The recently described AAV-PHP.B serotype is exceptionally efficient for CNS gene transfer after systemic delivery and transduces a wide range of cell types. In addition, the tropism of AAV-PHP.B for peripheral tissues is largely as broad as that of AAV9. The goal of a gene therapy approach for Dravet syndrome is to restore NaV1.1 expression only in GABAergic inhibitory interneurons while preventing deleterious effects caused by ectopic expression in other neurons and elsewhere.

AAV and lentiviral vectors encoding GFP under small promoters (<2.8 kb) derived from the pufferfish (Takifugu rubripes) somatostatin (fSST) and neuropeptide Y (fNPY) genes have been shown to drive inhibitory neuron-specific expression in the mouse brain after intracranial injection. AAV-PHP.B vectors carrying these promoters driving GFP expression were compared with control vectors in which transgene expression is driven by the strong ubiquitous CAG promoter or the minimal, relatively weak mouse MeCP2 promoter. The specificity of AAV-PHP.B-GFP vectors carrying the fSST and fNPY promoters for GABAergic inhibitory interneurons was studied after delivery to the CNS by systemic administration in 6-week-old (tail vein) and postnatal day 1 (retro-orbital) mice, by CSF delivery in neonates, and finally by unilateral injection targeting the dentate gyrus (DG) (Table 5). The efficiency of CNS gene transfer varies considerably with the route of delivery, and because Scn1a +/- mice are treated at different ages, extensive analysis was performed to establish a baseline for the neuronal transduction efficiency and promoter specificity of each delivery route for GABAergic inhibitory interneurons throughout the CNS. AAV vectors driving GFP expression from the short fSST and fNPY promoters have previously been shown to be highly specific for inhibitory interneurons in the hippocampus after direct injection. The AAV-PHP.B vectors of the present invention were validated in the same manner as in the subsequent studies, which assess the therapeutic impact of restoring NaV1.1 expression in inhibitory neurons in the hippocampal formation of Scn1a +/- mice (specifically in the dentate gyrus and the inner layer of the granule cell layer) (the rationale is set out below).

Experiments were performed in 129SvJ/C57BL/6 mice generated at UMMS by mating 129SvJ mice with C57BL/6 mice obtained from Jackson Laboratories (Bar Harbor, ME). One month after injection, mice were euthanized and brains and spinal cords were collected for histological analysis of transduction efficiency and specificity by double immunofluorescence with antibodies against cell-specific markers and GFP. The efficiency and specificity of gene transfer to GABAergic inhibitory interneurons were assessed throughout the brain and spinal cord by double immunofluorescence staining with antibodies against glutamic acid decarboxylase (GAD; a marker of GABAergic neurons) and GFP. In addition, the preferential specificity of the promoters and/or AAV-PHP.B for subsets of inhibitory interneurons expressing somatostatin (SST), parvalbumin (PV), calretinin (CR), vasoactive intestinal peptide (VIP), or neuropeptide Y (NPY) was assessed using antibodies specific for those proteins and GFP. Liver, heart, and skeletal muscle were collected from mice treated by systemic and ICV administration to assess GFP expression histologically and by Western blot, in order to determine the possibility of ectopic expression in peripheral tissues.

Table 5: Experimental groups (number of mice per cohort)

| Delivery route | Systemic | Systemic | ICV | IC |
|---|---|---|---|---|
| Age | 6 weeks* | PND1# | PND1# | 8 weeks* |
| Dose (vg) | 2x10^12 | 4x10^11 | 4x10^10 | 1x10^10 |
| AAV-fSST-GFP | 6 | 6-8 | 6-8 | 4 |
| AAV-fNPY-GFP | 6 | 6-8 | 6-8 | 4 |
| AAV-CAG-GFP | 6 | 6-8 | 6-8 | 4 |
| AAV-MeCP2-GFP | 6 | 6-8 | 6-8 | - |
| Vehicle (PBS) | 2 | - | - | |

* Groups consist of equal numbers of mice of both sexes.
# One litter injected per vector.
Abbreviations: ICV: intracerebroventricular injection; IC: intracranial injection; PND1: postnatal day 1.

AAV-PHP.B vectors encoding different ZFP Scn1a transactivator proteins, a construct containing the ZFP Scn1a activation domain but no DNA-binding domain (to control for the effects of the activator alone), or an equal volume of phosphate-buffered saline (PBS) were administered by bilateral injection into the dentate gyrus of six-week-old Scn1a +/- mice (n = 3 males + 3 females per group). The single-stranded AAV vectors used in these experiments also carry an IRES-GFP cassette downstream of the ZFP Scn1a cDNA to facilitate identification of transduced cells. At least two ZFP Scn1a transactivators that may have broader activity across neuron types were tested, along with the two most promising GABAergic inhibitory neuron-restricted ZFP SCN1A transactivators described above. One month after injection, brains were harvested and the hippocampus from one hemisphere was dissected to assess the expression levels of ZFP Scn1a, NaV1.1, NaV1.3, GAD65, and GAD67 proteins by Western blot, using β-actin or tubulin as a loading control. The other hemisphere was examined histologically using serial brain sections (10 µm) to determine the percentage of transduced inhibitory interneurons in the dentate gyrus and the inner layer of the granule cell layer by double immunofluorescence staining with antibodies against GAD and GFP, or against GAD and the epitope tag (HA or myc tag) included in all ZFP Scn1a proteins. Likewise, the percentage of GAD-positive neurons expressing NaV1.1 and NaV1.3 was determined to confirm restoration of the normal pattern of sodium channel expression. In addition to immunofluorescence detection of NaV1.1 and NaV1.3 protein expression, RNAscope probes for NaV1.1, NaV1.3, ZFP Scn1a, and GAD were used to assess changes in mRNA levels in GABAergic interneurons. RNAscope is a highly sensitive in situ hybridization technique for analyzing mRNA levels in brain neurons. The combination of these two methods for assessing changes in NaV1.1 levels caused by ZFP Scn1a expression provides a comprehensive understanding of how changes in interneurons are achieved by the gene therapy approach of the present invention.

The therapeutic efficacy of AAV-PHP.B-ZFP Scn1a gene therapy was analyzed in Scn1a +/- mice of both sexes, with treatment via the tail vein starting at postnatal day 1 or at 6 weeks of age. Controls included mice treated with an AAV vector encoding a ZFP-like protein lacking the ZFP DNA-binding domain, age-matched untreated Scn1a +/- mice, and wild-type littermates (n = 15 males and 15 females per group). A subset of mice in each group (n = 3 males and 3 females) was euthanized at 12 weeks of age to assess, by Western blot and by immunofluorescence with antibodies against GAD (and other neuron type-specific markers, e.g., GAD65, GAD67) and the ZFP, the efficiency of gene transfer to GABAergic interneurons and the restoration of NaV1.1 expression in those cells throughout the brain and spinal cord. In addition, ectopic expression of the ZFP and NaV1.1 expression in peripheral tissues were assessed. Another subset of animals in each group (n = 24) was used to study effects on survival (up to one year of age), motor performance, and behavior, with testing every two months from 2 to 12 months of age.

Motor function and coordination were assessed using accelerating rotarod and balance beam tests, because Scn1a +/- mice show impaired forelimb and hindlimb coordination by PND21. In addition, behavioral tests in which Scn1a +/- mice show impaired performance were used, including open field, elevated plus maze, nest building, marble burying, and the Barnes maze, the last to test spatial learning and memory, which are severely impaired in Scn1a +/- mice. The spontaneous seizures characteristic of Dravet syndrome patients are also evident in Scn1a +/- mice, and their frequency increases with age and body temperature. Furthermore, Scn1a +/- mice undergo premature sudden death immediately after tonic-clonic seizures. Therefore, 24-hour continuous video monitoring was used at 2, 6, and 12 months of age to assess seizure frequency and duration. If significant changes were detected in the primary outcomes measured in the tests described above, social interaction studies using chamber preference readouts in response to novel objects, odors, and mice were considered. Brains, spinal cords, and peripheral organs were collected at the humane endpoint of the experiments and evaluated by the molecular and histological analyses outlined above.

Example 7: ZFP and dCas9 systems increase SCN1A gene expression in human cells

To examine the ability of ZFP1-ZFP3 to upregulate SCN1A transcription, the ZFP1-ZFP3 DNA-binding domains were fused to the hybrid VP64-p65-Rta (VPR) tripartite strong transcriptional activation domain to form chimeric transactivators. The VPR fusion activation domain recruits transcriptional regulatory complexes, increases chromatin accessibility, and helps achieve high levels of gene expression. The ZFP domains therefore target the VPR activator to highly conserved sequences in the proximal promoter region to increase SCN1A gene expression.

In addition, to examine the ability of an SCN1A-targeting dCas9 system to upregulate SCN1A transcription, three guide RNAs targeting SCN1A were complexed with the dCas9 protein.

HEK293T cells were transiently transfected under one of the following experimental conditions: (1) the VPR-ZFP1 construct; (2) the VPR-ZFP2 construct; (3) the VPR-ZFP3 construct; (4) all three of the VPR-ZFP1, VPR-ZFP2, and VPR-ZFP3 constructs; (5) the dCas9-VPR construct and SCN1A guide RNA 1; (6) the dCas9-VPR construct and SCN1A guide RNA 2; (7) the dCas9-VPR construct and SCN1A guide RNA 3; (8) the dCas9-VPR construct and all three of SCN1A guide RNA 1, SCN1A guide RNA 2, and SCN1A guide RNA 3; and (9) the dCas9-VPR construct without any guide RNA (control). SCN1A gene expression was measured by qRT-PCR. Fold activation of SCN1A was normalized to the control experiment (the dCas9-VPR construct without any guide RNA).

All tested experimental conditions increased SCN1A gene activation relative to the control experiment (Figure 8). These data confirm that the zinc finger proteins described in this example and throughout the present invention are able to target SCN1A to affect gene expression. These data further confirm that the guide RNA sequences of this example (SEQ ID NOs: 83 to 94) are able to target dCas9 to SCN1A to affect gene expression.

Table 6: Guide nucleic acids targeting SCN1A (the spacer sequence is set off from the scaffold by a space)

SCN1A guide 1
Nucleotide sequence (DNA) (SEQ ID NO: 83): GAGGTACCATAGAGTGAGGCG GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC
Nucleotide sequence (RNA) (SEQ ID NO: 84): GAGGUACCAUAGAGUGAGGCG GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC

SCN1A guide 2
Nucleotide sequence (DNA) (SEQ ID NO: 87): ACCGAGGCGAGGATGAAGCCGAG GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC
Nucleotide sequence (RNA) (SEQ ID NO: 88): ACCGAGGCGAGGAUGAAGCCGAG GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC

SCN1A guide 3
Nucleotide sequence (DNA) (SEQ ID NO: 91): ACCGAAGCCGAGAGGATACTGCAG GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC
Nucleotide sequence (RNA) (SEQ ID NO: 92): ACCGAAGCCGAGAGGAUACUGCAG GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC
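Each guide in Table 6 is listed as a DNA sequence and as the corresponding RNA sequence, and each consists of a spacer followed by a common scaffold. The sketch below shows one way to convert the DNA form to the RNA form and to split a guide into spacer and scaffold. The spacer boundaries follow the spacing in the table; the function names are illustrative assumptions rather than anything defined in the source.

```python
# Scaffold shared by all three guides in Table 6 (DNA form).
SCAFFOLD_DNA = (
    "GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC"
)

# Spacer sequences as set off in Table 6 (DNA form).
SPACERS_DNA = {
    "SCN1A guide 1": "GAGGTACCATAGAGTGAGGCG",
    "SCN1A guide 2": "ACCGAGGCGAGGATGAAGCCGAG",
    "SCN1A guide 3": "ACCGAAGCCGAGAGGATACTGCAG",
}


def to_rna(dna: str) -> str:
    """Write a DNA sequence in RNA letters (T replaced by U)."""
    return dna.upper().replace("T", "U")


def split_guide(guide_dna: str) -> tuple:
    """Split a full guide into (spacer, scaffold); raise if the scaffold is absent."""
    if not guide_dna.endswith(SCAFFOLD_DNA):
        raise ValueError("guide does not end with the expected scaffold")
    return guide_dna[: -len(SCAFFOLD_DNA)], SCAFFOLD_DNA


for name, spacer in SPACERS_DNA.items():
    full_dna = spacer + SCAFFOLD_DNA
    print(name, len(spacer), to_rna(full_dna))
```

Comparing the printed RNA strings with the RNA entries in the table is a quick consistency check on the listing.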

Example 8: SCN1A-specific zinc finger protein (ZFP) transactivators with a wide range of potencies

Zinc finger protein (ZFP) transactivators (e.g., with 2 fingers) targeting the conserved region of the SCN1A promoter were developed using a bacterial one-hybrid (B1H) system for ZFP assembly. B1H 2-finger ZFP module (2FM) libraries were then generated for each 6 base pair subsite within the larger ZFP recognition sequence, and B1H selections were performed to identify candidate 2FMs for each 6 bp subsite. In addition, the DNA-binding specificity of the 2FMs was assessed to select 2FMs with preferential specificity for each 6 base pair subsite within the target recognition sequence.

Each combination of the characterized and selected 2-finger modules (2FMs) was then assembled into 6-finger ZFPs with improved DNA-binding specificity. These 6-finger ZFPs were expressed and purified in vitro, and their DNA-binding specificity was then assessed using SELEX-seq and/or CUT&Tag in HEK293T cells. 6-finger ZFPs that target SCN1A with preferential specificity for ≥14 bases within each target site were identified, such that at least one 6-finger ZFP was identified for each of three different target sequences.

The affinity of the 6-finger ZFPs was then fine-tuned to improve their genome-wide binding profiles, and the effect of inter-finger linker changes on genome-wide binding profiles was assessed (e.g., using CUT&Tag). In addition, the effect of reduced non-specific DNA-binding affinity on genome-wide binding profiles was assessed (e.g., using CUT&Tag in HEK293T cells). The DNA-binding specificity of candidate 6-finger ZFPs was additionally assessed (e.g., using SELEX-seq or CUT&Tag). These studies can identify at least three different ZFPs that target SCN1A and have an improved distribution of binding sites within the genome.

ZFP-transactivator fusion proteins that target SCN1A with varying potency in neurons were then identified. Candidate activation domain (AD) modules (e.g., 8 to 12 different human ADs from transcription factors present in GABAergic neurons) were evaluated in neurons. Candidate ADs were fused to the lead candidate ZFPs, and the resulting fusion proteins were assessed for their ability to activate SCN1A in HEK293T cells relative to synthetic ADs of established potency (e.g., VP16, VP64 and VPR). qRT-PCR data were used to verify the functionality and activation potential of the ZFP Scn1A-AD fusion proteins.

The activation profile of ZFP Scn1A-AD in iCell GABAergic inhibitory neurons was assessed. Each lead ZFP Scn1A-AD was transduced into iCell GABAergic inhibitory neurons, and the effects on SCN1A expression (e.g., using qRT-PCR and Western blotting) and on the transcriptome (e.g., using RNA-seq) were assessed. The qRT-PCR and RNA-seq data allow quantitative analysis (e.g., volcano plots) of the effect of each AD on SCN1A expression and transcriptome homeostasis.

ZFP Scn1A-AD candidates were evaluated in neurons on the basis of their genome-wide binding profiles. Each candidate ZFP Scn1A-AD was transduced into iCell GABAergic inhibitory neurons, and its DNA-binding specificity and genome-wide binding profile were assessed (e.g., using CUT&Tag). Effects on SCN1A expression (e.g., using qRT-PCR and Western blotting) and on the transcriptome (e.g., using RNA-seq) were assessed. These measurements were used to identify off-target sites (e.g., unwanted gene activation).

The specificity of ZFP Scn1A-AD for SCN1A binding (relative to off-target binding) was tuned using a B1H counter-selection strategy to identify sets of ZFP fingers that discriminate between target and off-target sites. The DNA-binding specificity and genome-wide binding profiles of the candidate ZFP Scn1A-ADs were reassessed. SCN1A activation by the best ADs in combination with the revised ZFP Scn1A-ADs was reassessed to confirm selective activation of SCN1A by each candidate ZFP Scn1A-AD.

Lead candidate ZFP Scn1A-ADs were evaluated in the central nervous system (CNS) of wild-type mice. Wild-type mice (4 to 5 weeks old) were treated systemically with 2x10^12 vg of AAV-PHP.eB vectors encoding candidate ZFP Scn1a-ADs with a range of activation potentials. Controls included wild-type mice treated with an AAV vector encoding a ZFP without a DNA-binding domain, and untreated mice. Each ZFP Scn1a-AD treatment group consisted of 2 male and 2 female mice. In-life outcome measures were survival, standard behavioral assessments, and standard health assessments. Animals were euthanized 5 weeks after treatment for molecular and histological analysis. Necropsy outcome measures were Western blot and histological analysis of SCN1A and ZFP Scn1a protein levels, and snRNA-seq analysis of transcriptomic changes in transduced and non-transduced cells.

The binding profiles of effective AAV-ZFP Scn1a vectors were then assessed in mouse experiments, evaluating both the binding profile of each AAV-ZFP Scn1a vector in the mouse CNS and SCN1A activation. Cre-driver and loxP-nGFP mice were used to selectively label different subsets of GABAergic neurons. Mice (4 to 5 weeks old) were treated systemically with 2x10^12 vg of AAV-PHP.eB vector encoding a candidate ZFP Scn1a-AD. Controls included wild-type mice treated with an AAV vector encoding a ZFP without a DNA-binding domain, and untreated mice. Each ZFP Scn1a-AD treatment group consisted of 2 male and 2 female mice. In-life outcome measures were survival, standard behavioral assessments, and standard health assessments. Animals were euthanized 5 weeks after treatment for molecular and histological analysis. Necropsy outcome measures were FACS analysis of GABAergic neuron nuclei and transcriptome analysis using single-nucleus RNA-seq (snRNAseq). Genome-wide ZFP binding was assessed by CUT&Tag. The most effective AAV-ZFP Scn1a vectors were then selected based on their ability to activate Scn1a with minimal impact on the neuronal transcriptomic profile. The therapeutic utility of the most promising vectors, spanning a range of activation potentials, can subsequently be assessed.

Example 9: GABAergic neuron-specific gene expression system

Transgene expression cassettes specific for inhibitory GABAergic interneurons were developed using the three methods described below.

Method 1: Bioinformatics-guided design of GABA-specific promoters

Method 1 is depicted in Figure 10, left panel.

Genome-wide ATACseq, ChIPseq, DNase I, CAGE and HiC datasets from mouse and human brain were analyzed to find candidate enhancer elements for use in the SCN1A, GAD1 and GAD2 promoters. HiC data for human Chr. 2 near SCN1A show regions of 3D intrachromosomal interaction that may indicate enhancers (~1 Mb centered on SCN1A; arrows indicate potential interactions between different regions of chromosome 2 within the 165 to 166 Mb interval). Interchromosomal interactions were also assessed.

Primary data can be generated by crossing Cre-driver mouse lines for subsets of GABA neurons (Scn1a, Gad2, Sst, Cck, Vip promoters) with loxP-GFP mice (Figure 10, left panel, shaded box). GABA neurons are sorted, and CUT&Tag is used to assess the epigenetic landscape of the Scn1a, Gad1 and Gad2 genes; HiC is also used to assess chromosomal interactions that may be specific to GABA neurons.

Bioinformatics-generated enhancer candidates were evaluated for GABA neuron-specific expression. A barcoded library of enhancer candidates fused to the SCN1A minimal promoter was generated.

Normal 6- to 8-week-old mice (n=4) received a systemic infusion of 10^12 vg of the AAV-PHP.eB library, with the experimental endpoint at 4 to 6 weeks post-injection. Outcome measures were snRNAseq of the distribution of expressed barcodes across CNS cell populations, and RT-PCR amplification of barcodes in liver, heart and skeletal muscle followed by NGS analysis of their frequency. Unique gene expression profiles were used to identify the different cell populations in the analyzed tissues. Enhancers were selected for further study based on their ability to drive gene expression (unique barcodes) specifically in GABA neurons; a second selection criterion was to identify GABA-specific enhancers with different degrees of potency. Three to four enhancers can be selected for comparison with those identified in Method 2.

Method 2: Long-range enhancer scanning array

AAV capsid libraries with >4x10^9 variants were generated. These capsid libraries were used to probe entire genomic regions for the presence of tissue-specific enhancers. A library of ~98,800 oligonucleotides (140 nucleotides long) was synthesized and cloned upstream of a minimal SCN1A promoter in an AAV-nlsGFP vector. The oligonucleotides tile an entire genomic region (the SCN1A, GAD1 and GAD2 genes in mouse and human), and the offset between oligonucleotides (from one nucleotide to no overlap) determines the size of the target, which ranges from a ~99 kb to a 13 Mb genomic region. The AAV library carries an 18-nucleotide barcode ((NNM)6 = ~10^9 barcodes), unique to each enhancer, in the 3'UTR of the transgene mRNA. The barcodes associated with each enhancer were determined by NGS after removal of the nlsGFP cDNA using a rare-cutting restriction enzyme. The number of barcode reads in GFP-positive nuclei from the CNS and peripheral tissues provides a measure of specificity and of overall transgene expression efficiency. The AAV enhancer scanning array (AAVeSA) library allows cell type-specific enhancers to be developed rapidly for any cell type by sampling genes uniquely expressed in the target.
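By way of non-limiting illustration, the stated barcode diversity can be checked with a short Python sketch. It assumes that "NNM6" denotes six repeats of the degenerate triplet NNM under the IUPAC nucleotide codes (N = A, C, G or T; M = A or C); the function name `barcode_diversity` is hypothetical.

```python
# IUPAC degenerate nucleotide codes used in the barcode design (assumed reading of "NNM6").
DEGENERATE = {"N": "ACGT", "M": "AC"}


def barcode_diversity(pattern: str) -> int:
    """Count the distinct sequences that a degenerate pattern can encode."""
    total = 1
    for code in pattern:
        total *= len(DEGENERATE[code])
    return total


pattern = "NNM" * 6  # six NNM triplets, 18 nucleotides in total

if __name__ == "__main__":
    print(len(pattern))                # barcode length in nucleotides
    print(barcode_diversity(pattern))  # number of possible barcodes
```

Each NNM triplet encodes 4 x 4 x 2 = 32 sequences, so six triplets give 32^6 = 1,073,741,824 possible barcodes, consistent with the approximate figure of 10^9.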

Oligonucleotide libraries with a 20-nucleotide offset were generated for the human and mouse SCN1A, GAD1 and GAD2 genes to probe a ~2 Mb region around each gene. An AAV enhancer scanning array library was used with a minimum of 100x coverage of all sequences (3 genes x 2 species x 98,800 oligonucleotides/gene = 592,800 sequences), or 5.9x10^7 variants. Prior to production, the identities of all enhancer barcodes (theoretically ~100 barcodes/enhancer) were determined by NGS as described above.
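By way of non-limiting illustration, the tiling and coverage figures given for the enhancer scanning array can be reproduced with the following Python sketch. The span formula assumes evenly spaced oligonucleotide start positions; the function name `tiled_span` is hypothetical.

```python
def tiled_span(n_oligos: int, oligo_len: int, offset: int) -> int:
    """Length (nt) of the region covered by n_oligos tiles of oligo_len nt
    whose start positions are spaced offset nt apart."""
    return (n_oligos - 1) * offset + oligo_len


N_OLIGOS = 98_800   # oligonucleotides per gene
OLIGO_LEN = 140     # nucleotides per oligonucleotide

# Library size: 3 genes x 2 species x 98,800 oligonucleotides per gene.
n_sequences = 3 * 2 * N_OLIGOS
# Minimum number of variants needed for 100x coverage of every sequence.
min_variants = 100 * n_sequences

if __name__ == "__main__":
    for offset in (1, 20, 140):  # 1 nt offset, 20 nt offset, no overlap
        print(offset, tiled_span(N_OLIGOS, OLIGO_LEN, offset))
    print(n_sequences, min_variants)
```

Under these assumptions, a 1-nucleotide offset covers about 99 kb, a 20-nucleotide offset covers about 2 Mb, and non-overlapping tiles cover about 13.8 Mb per gene, while 100x coverage of the 592,800 sequences requires about 5.9x10^7 variants.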

The AAVeSA library was packaged in AAV-PHP.eB for systemic delivery.

In vivo screening for GABA neuron-specific enhancers was performed in normal 6- to 8-week-old mice (n=4) by systemic infusion of 10^12 vg of the AAVeSA library, with the experimental endpoint at 4 to 6 weeks post-injection. Outcome measures were snRNAseq of the distribution of expressed barcodes across CNS cell populations, and RT-PCR amplification of barcodes in liver, heart and skeletal muscle followed by NGS analysis of their frequency. Enhancers were selected for further study based on specificity for GABA neurons in the CNS and on coverage of a range of gene expression levels.
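By way of non-limiting illustration, barcode read counts could be summarized per enhancer as in the following Python sketch. The scoring scheme (fraction of reads found in GABA neurons, together with total reads as a proxy for expression level), the function name `enhancer_specificity`, and all counts are hypothetical assumptions and are not taken from the disclosure.

```python
from collections import defaultdict


def enhancer_specificity(read_counts, target="GABA"):
    """Return {enhancer: (fraction of reads in target cell type, total reads)}.

    read_counts maps (enhancer, cell_type) to a barcode read count.
    """
    totals = defaultdict(int)
    on_target = defaultdict(int)
    for (enhancer, cell_type), n_reads in read_counts.items():
        totals[enhancer] += n_reads
        if cell_type == target:
            on_target[enhancer] += n_reads
    return {
        enhancer: (on_target[enhancer] / total, total)
        for enhancer, total in totals.items()
        if total > 0
    }


# Hypothetical counts for two hypothetical enhancers.
example_counts = {
    ("enh_A", "GABA"): 90,
    ("enh_A", "excitatory"): 10,
    ("enh_B", "GABA"): 5,
    ("enh_B", "astrocyte"): 15,
}

if __name__ == "__main__":
    for enhancer, (fraction, total) in enhancer_specificity(example_counts).items():
        print(enhancer, fraction, total)
```

A summary of this kind would allow enhancers to be ranked by specificity while retaining candidates that span a range of expression levels.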

The GABA-specific AAV vectors were compared with one another. AAV-nlsGFP vectors carrying the enhancers selected above were packaged in AAV-PHP.eB and studied individually. Three to four vectors from each method were selected for further study.

Normal 6- to 8-week-old mice (n=4 per vector) received a systemic infusion of 10^12 vg of AAV-PHP.eB-nlsGFP vector, with the experimental endpoint at 4 to 6 weeks post-injection. Outcome measures were double immunofluorescence staining of brain sections for GFP and cell-specific markers of neurons (NeuN), GABAergic neurons (GAD1 and other markers of GABA neuron subsets), astrocytes (ALDH1L1) and microglia (Iba1); Western blot analysis of GFP expression in liver, heart and skeletal muscle; and vector genome biodistribution in brain and the same peripheral organs. The expected outcome is the definition of a set of enhancers (or combinations of enhancers) that provide different levels of gene expression restricted to GABAergic neurons.

Method 3: Post-transcriptional miR de-targeting of gene expression from non-GABA neuronal cell populations in the CNS

Existing data on miR expression profiles in mouse brain have been analyzed, and a number of candidates have been selected that should de-target gene expression from most neuronal populations in the brain other than GABA neurons, as well as from astrocytes and microglia. An oligonucleotide library was generated covering combinations of a large number of miR targets (miR-T) and a large number of repeats, to be cloned into the 3'UTR of an AAV vector expressing nlsGFP under the human synapsin-1 promoter. All miR-T cassettes carry targets for miR-1 and miR-122 to de-target gene expression from muscle and liver, respectively. This feature is included to allow the use of more ubiquitous promoters of different strengths to drive ZFP expression. Scn1a expression can be fine-tuned by varying promoter strength and ZFP potency. Finally, each miR-T combination is associated with a unique barcode for RNAseq analysis. The library approach allows rapid selection of the most efficient combination of elements to achieve GABA neuron-specific expression, which can be combined with the other enhancers identified in Method 2.

The AAV-Syn1-nlsGFP-miRT library was generated by designing and constructing the library for cloning into the 3'UTR of the transgene cassette. The plasmid library was analyzed by NGS using the principles described above.

The library was produced using AAV-PHP.eB.

In vivo screening for GABA neuron-specific miR-T cassettes was performed in normal 6- to 8-week-old mice (n=4) by systemic infusion of 10^12 vg of the AAV.miR-T library, with the experimental endpoint at 4 to 6 weeks post-injection. Outcome measures were snRNAseq of the distribution of expressed barcodes across CNS cell populations, and RT-PCR amplification of barcodes in liver, heart and skeletal muscle followed by NGS analysis of their frequency.

miR-T cassettes were validated for restricting transgene expression to GABA neurons in the CNS. The top miR-T cassettes were tested individually in transgene cassettes encoding nlsGFP driven by the Syn-1 and CBA promoters. The AAV-PHP.eB-nlsGFP vectors were studied individually [(2 miR-T cassettes + no miR-T) x 2 promoters = 6 vectors].
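By way of non-limiting illustration, the six-vector validation matrix can be enumerated with a short Python sketch. The cassette labels "miR-T cassette 1" and "miR-T cassette 2" are hypothetical placeholders; only the promoter names and the count are taken from the text.

```python
from itertools import product

# Two top miR-T cassettes (placeholder labels) plus a control without miR-T.
mirt_options = ["miR-T cassette 1", "miR-T cassette 2", "no miR-T"]
# Promoters named in the text.
promoters = ["Syn-1", "CBA"]

# Every promoter paired with every miR-T option.
designs = [
    f"AAV-PHP.eB {promoter}-nlsGFP ({mirt})"
    for promoter, mirt in product(promoters, mirt_options)
]

if __name__ == "__main__":
    for design in designs:
        print(design)
    print(len(designs))  # (2 miR-T cassettes + no miR-T) x 2 promoters
```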

Normal 6- to 8-week-old mice (n=4 per vector) received a systemic infusion of 10^12 vg of AAV-PHP.eB-nlsGFP vector, with the experimental endpoint at 4 to 6 weeks post-injection. Outcome measures were double immunofluorescence staining of brain sections for GFP and cell-specific markers of neurons (NeuN), GABAergic neurons (GAD1 and other markers of GABA neuron subsets), astrocytes (ALDH1L1) and microglia (Iba1); Western blot analysis of GFP expression in liver, heart and skeletal muscle; and vector genome biodistribution in brain and the same peripheral organs. The expected outcome is the definition of one or more miR-T cassettes that can restrict gene expression to GABAergic neurons.

The final AAV vector designs carrying the selected enhancers and miR-T cassettes were selected. Three GABA-specific enhancers covering a range of potencies were combined with the top miR-T cassettes and studied individually, for a total of at least six AAV vector designs.

Normal 6- to 8-week-old mice (n=4 per vector) received a systemic infusion of 10^12 vg of AAV-PHP.eB-nlsGFP vector (6 vector designs), with the experimental endpoint at 4 to 6 weeks post-injection. Outcome measures were snRNAseq; double immunofluorescence staining of brain sections for GFP and cell-specific markers of neurons (NeuN), GABAergic neurons (GAD1 and other markers of GABA neuron subsets), astrocytes (ALDH1L1) and microglia (Iba1); Western blot analysis of GFP expression in liver, heart and skeletal muscle; and vector genome biodistribution in brain and the same peripheral organs. The expected outcome is the selection of at least three AAV vectors with different potencies in most GABAergic neuron populations and no transgene expression in other cell types of the CNS or peripheral tissues.

Figure 1 shows sequencing chromatogram data indicating sequence conservation between the human (HEK) and mouse (HEPG2) SCN1A genes (consensus sequence: SEQ ID NO: 98; target sequence: SEQ ID NO: 99; Hep-SCN1A_R4 sequence (top): SEQ ID NO: 100; Hep-SCN1A_R4 sequence (bottom): SEQ ID NO: 101).

Figure 2 shows a sequence alignment of the proximal promoter regions of the human (SEQ ID NO: 1) and mouse (SEQ ID NO: 2) SCN1A genes, with the conserved sequence highlighted. Within this conserved sequence is the target region of interest for zinc finger protein (ZFP) binding, shown in bold (SEQ ID NO: 4).

Figure 3 is a schematic showing the locations of the binding sites (SEQ ID NOs: 5 to 7) of three overlapping target ZFPs (ZFP-1, ZFP-2, ZFP-3) in the proximal promoter region of the SCN1A gene (SEQ ID NO: 3).
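By way of non-limiting illustration, the positions of the three 18-base ZFP binding sites (SEQ ID NOs: 5 to 7) within the 41-base target region (SEQ ID NO: 4) can be located with the following Python sketch, using the sequences given in the sequence listing. The helper names are hypothetical, and no assignment of individual fingers to triplets is implied.

```python
# Target region of interest (SEQ ID NO: 4) and ZFP binding sites (SEQ ID NOs: 5 to 7).
TARGET_REGION = "gagtgaggcgaggatgaagccgagaggatactgcagaggtc"
BINDING_SITES = {
    "ZFP-1": "gagtgaggcgaggatgaa",
    "ZFP-2": "ggcgaggatgaagccgag",
    "ZFP-3": "gaggatactgcagaggtc",
}


def site_start(region: str, site: str) -> int:
    """Return the 0-based start of site within region, or raise if absent."""
    start = region.find(site)
    if start < 0:
        raise ValueError("binding site not found in region")
    return start


def triplets(site: str) -> list[str]:
    """Split an 18-base site into six consecutive three-base subsites."""
    return [site[i:i + 3] for i in range(0, len(site), 3)]


if __name__ == "__main__":
    for name, site in BINDING_SITES.items():
        start = site_start(TARGET_REGION, site)
        # Report 1-based inclusive coordinates within SEQ ID NO: 4.
        print(name, start + 1, start + len(site), triplets(site))
```

Under these sequences, the sites occupy bases 1 to 18, 7 to 24 and 24 to 41 of SEQ ID NO: 4, respectively.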

Figures 4A to 4D show an alignment of the six recognition helix sequences of the individual zinc fingers (finger 1 to finger 6; F1 to F6) in ZFP-1, which recognize individual three-base regions (DNA triplets shown in red, separated by "•") within the proximal promoter region of the SCN1A gene (SEQ ID NO: 2). Figure 4A highlights the nucleotide sequence (SEQ ID NO: 3) to which zinc fingers 1 to 6 (F1 to F6) of ZFP-1 bind. Figure 4B shows the three-nucleotide sequence recognized by each recognition helix (seven amino acids) of fingers 1 to 6 of ZFP-1 (SEQ ID NOs: 17 to 22). Figure 4C shows the amino acid sequence of ZFP-1, which contains 6 fingers, one per row, with the linkers between the fingers highlighted to indicate canonical (TGEKP) and atypical (TGSQKP) linker sequences (SEQ ID NOs: 65 to 70). Figure 4D shows the nucleotide sequences of ZFP-1 (F1 to F6) (SEQ ID NOs: 102 to 107).

Figures 5A to 5D show an alignment of the six recognition helix sequences of the individual zinc fingers (finger 1 to finger 6; F1 to F6) in ZFP-2, which recognize individual three-base regions (DNA triplets shown in red, separated by "*") within the proximal promoter region of the SCN1A gene (SEQ ID NO: 3). Figure 5A highlights the nucleotide sequence (SEQ ID NO: 3) to which zinc fingers 1 to 6 (F1 to F6) of ZFP-2 bind. Figure 5B shows the first three nucleotides recognized by each recognition helix (seven amino acids) of fingers 1 to 6 of ZFP-2 (SEQ ID NOs: 29 to 34). Figure 5C shows the amino acid sequence of ZFP-2, which contains 6 fingers, one per row (SEQ ID NOs: 69 to 74), with the linkers between the fingers highlighted to indicate canonical (TGEKP) and atypical (TGSQKP) linker sequences. Figure 5D shows the nucleotide sequences of ZFP-2 (F1 to F6) (SEQ ID NOs: 108 to 113).

Figures 6A to 6D show an alignment of the six recognition helix sequences of the individual zinc fingers (finger 1 to finger 6; F1 to F6) in ZFP-3, which recognize individual three-base regions (DNA triplets shown in red, separated by "*") within the proximal promoter region of the SCN1A gene (SEQ ID NO: 4). Figure 6A highlights the nucleotide sequence (SEQ ID NO: 3) to which zinc fingers 1 to 6 (F1 to F6) of ZFP-3 bind. Figure 6B shows the first three nucleotides recognized by each recognition helix (seven amino acids) of fingers 1 to 6 of ZFP-3 (SEQ ID NOs: 41 to 46). Figure 6C shows the amino acid sequence of ZFP-3, which contains 6 fingers, one per row (SEQ ID NOs: 75 to 80), with the linkers between the fingers highlighted to indicate canonical (TGEKP) and atypical (TGSQKP) linker sequences. Figure 6D shows the nucleotide sequences of ZFP-3 (F1 to F6) (SEQ ID NOs: 114 to 119).

Figure 7 shows data indicating that the SCN1A-binding ZFPs described in Figures 4 to 6 increase SCN1A gene expression in HEK293T cells, as measured by quantitative real-time polymerase chain reaction (qRT-PCR). The expression constructs were delivered to cells by transient transfection of expression plasmids encoding the following transcriptional regulators: Streptococcus pyogenes Cas9 + SCN1A guide RNA (SpCas9 + Scn1a); Cas9 without endonuclease activity (dCas9); VPR activation domain + SCN1A guide RNA (dCas9_VPR + Scn1a); VPR activation domain + ZFP1 (VPR_ZFP1); VPR activation domain + ZFP2 (VPR_ZFP2); VPR activation domain + ZFP3 (VPR_ZFP3); SpCas9 + ASCL1 guide RNA (SpCas9 + Ascl1); and all three VPR_ZFPs (VPR_ZFP1 + VPR_ZFP2 + VPR_ZFP3). Expression levels were normalized to the TBP expression level determined by qRT-PCR in each sample.
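By way of non-limiting illustration, normalization of SCN1A expression to TBP can be sketched in Python using the widely used 2^(-ΔΔCt) calculation. The disclosure does not state which calculation was used, so this method, the assumption of approximately 100% amplification efficiency, the function names, and all Ct values below are assumptions for illustration only.

```python
def delta_ct(ct_target: float, ct_reference: float) -> float:
    """Ct of the target gene minus Ct of the reference gene in the same sample."""
    return ct_target - ct_reference


def fold_change(ct_target_sample: float, ct_ref_sample: float,
                ct_target_control: float, ct_ref_control: float) -> float:
    """Relative expression of the target in a sample versus a control,
    each normalized to the reference gene (2^(-ddCt), assuming ~100% efficiency)."""
    ddct = (delta_ct(ct_target_sample, ct_ref_sample)
            - delta_ct(ct_target_control, ct_ref_control))
    return 2.0 ** (-ddct)


if __name__ == "__main__":
    # Hypothetical Ct values: SCN1A and TBP in a treated sample and in a control.
    print(fold_change(ct_target_sample=26.0, ct_ref_sample=22.0,
                      ct_target_control=30.0, ct_ref_control=22.0))
```

With these hypothetical values, the treated sample shows a 16-fold increase in TBP-normalized SCN1A expression relative to the control.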

Figure 8 shows data indicating that the SCN1A-binding ZFPs described in Figures 4 to 6 and Cas9 + SCN1A guide RNA increase SCN1A gene expression in HEK293T cells, as measured by quantitative real-time polymerase chain reaction (qRT-PCR).

Figure 9 shows high-throughput chromosome conformation capture (Hi-C) data for a ~1 Mb region centered on SCN1A. Arrows indicate potential interactions between different regions of chromosome 2 within the 165 to 166 Mb interval.

Figure 10 shows three methods for developing GABA neuron-specific AAV vectors.

                                 
          <![CDATA[<110>  University of Massachusetts]]>
          <![CDATA[<120>  DNA-binding domain transactivators and uses thereof]]>
          <![CDATA[<130>  U0120.70147WO00]]>
          <![CDATA[<140>  TW 110127164]]>
          <![CDATA[<141>  2017-07-23]]>
          <![CDATA[<150>  US 63/056,528]]>
          <![CDATA[<151>  2020-07-24]]>
          <![CDATA[<160>  140   ]]>
          <![CDATA[<170>  PatentIn version 3.5]]>
          <![CDATA[<210>  1]]>
          <![CDATA[<211>  672]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Homo sapiens]]>
          <![CDATA[<400>  1]]>
          aatttccatg gactcttttt ccaaaggaat aactggaatg aataaactta aaatcaagat       60
          gaaacaatta gatggcttac ctgattaaaa ggaaaattat ccatctgcag tgaggaacag      120
          catcacccaa agacgagatg ataacaatgt gccttcagtt gcaattgttc agttccttct      180
          tgcaaaaggt gtcaaagtat ttacaagggc tgcagtctca ctggggcaga acacacagac      240
          acacaaacac acacaaacgc acacatacac acatgcacca gagacctctg cagtatcctc      300
          tcggcttcat cctcgcctca ctctatggta cctaatacaa atcagcaaat agcttgtttc      360
          aaaaaaaaaa aaaagtcaag acagcacctt acattacatc gccatctagt ggctaaatat      420
          taaacacttt ctcacaatcc agatttatga tttcttcctc aacctctttt ctctcagctt      480
          ttttcctttc ttctctgtaa tctcccagta ttgcttctcc ttgcttctct ttcattccct      540
          attgctatat aatatcatga acctaatgac tcaaagagga aaaggtttga aagtaaatat      600
          agctattttc aagtagtact tgaaaaactt agcattattt tagtttgaaa ctgttacttt      660
          attcctaata tg                                                          672
          <![CDATA[<210>  2]]>
          <![CDATA[<211>  669]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Mus musculus]]>
          <![CDATA[<400>  2]]>
          tatttccgtg ggctcttctc cccaaggatt taccaggtaa gaattcacca ccaaagaaga       60
          tcacaatgag ataatcagat ggcttacctg ataaaaagga aaattatcca tctgcagtca      120
          ggagcaacat ctccccacga cgagtccgca ccttccgttg caacgattca gattccttct      180
          tgcaaaaggt gaccaagtgc ttcacaaggg ctgcagcctc ataggggaga acacacgtac      240
          acaaacacac gcacacacac acacacatgc accagagacc tctgcagtat cctctggctt      300
          catcctcgcc tcactctatg gtacctaata caaatcagca aatagcttgt tttaaaaaaa      360
          agaaagaaaa aaagcggaga cagcacctaa cgttacagtg ccatctagtg gctacatcgt      420
          aaataggttc tcacagcctg gatttctgtg ttctttctca accgcttcct tctggttcct      480
          ttttcttttt tcctctttat tttggtttta ttacttcctc agatgccttt ttttcattcc      540
          cctttgctct gcctacatgg aactattgac ttaaagatta aaacaatcag aactggagag      600
          cgttgctttt aagttaaaaa aaaaaaggtt gctaattttg tttgtaaatg ttactttatt      660
          ttctctatt                                                              669
          <![CDATA[<210>  3]]>
          <![CDATA[<211>  130]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  3]]>
          tttttttttt tttttttgaa acaagctatt tgctgatttg tattaggtac catagagtga       60
          ggcgaggatg aagccgagag gatactgcag aggtctctgg tgcatgtgtg tatgtgtgcg      120
          tttgtgtgtg                                                             130
          <![CDATA[<210>  4]]>
          <![CDATA[<211>  41]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  4]]>
          gagtgaggcg aggatgaagc cgagaggata ctgcagaggt c                           41
          <![CDATA[<210>  5]]>
          <![CDATA[<211>  18]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  5]]>
          gagtgaggcg aggatgaa                                                     18
          <![CDATA[<210>  6]]>
          <![CDATA[<211>  18]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  6]]>
          ggcgaggatg aagccgag                                                     18
          <![CDATA[<210>  7]]>
          <![CDATA[<211>  18]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  7]]>
          gaggatactg cagaggtc                                                     18
          <![CDATA[<210>  8]]>
          <![CDATA[<211>  5]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  8]]>
          Glu Gly Glu Asp Glu 
          1               5   
          <![CDATA[<210>  9]]>
          <![CDATA[<211>  6]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  9]]>
          Gly Glu Asp Glu Ala Glu 
          1               5       
          <![CDATA[<210>  10]]>
          <![CDATA[<211>  6]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  10]]>
          Glu Asp Thr Ala Glu Val 
          1               5       
          <![CDATA[<210>  11]]>
          <![CDATA[<211>  21]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  11]]>
          cagcggggaa acctggtgag g                                                 21
          <![CDATA[<210>  12]]>
          <![CDATA[<211>  21]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  12]]>
          ctgagcttca atctaaccag a                                                 21
          <![CDATA[<210>  13]]>
          <![CDATA[<211>  21]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  13]]>
          cggagtgaca acttaacgcg g                                                 21
          <![CDATA[<210>  14]]>
          <![CDATA[<211>  21]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  14]]>
          gaccggtctc accttgcccg a                                                 21
          <![CDATA[<210>  15]]>
          <![CDATA[<211>  21]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  15]]>
          cagaaggccc atttgactgc c                                                 21
          <![CDATA[<210>  16]]>
          <![CDATA[<211>  21]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  16]]>
          cggtcggaca acctcacacg c                                                 21
          <![CDATA[<210>  17]]>
          <![CDATA[<211>  7]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  17]]>
          Gln Arg Gly Asn Leu Val Arg 
          1               5           
          <![CDATA[<210>  18]]>
          <![CDATA[<211>  7]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  18]]>
          Leu Ser Phe Asn Leu Thr Arg 
          1               5           
          <![CDATA[<210>  19]]>
          <![CDATA[<211>  7]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  19]]>
          Arg Ser Asp Asn Leu Thr Arg 
          1               5           
          <![CDATA[<210>  20]]>
          <![CDATA[<211>  7]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  20]]>
          Asp Arg Ser His Leu Ala Arg 
          1               5           
          <![CDATA[<210>  21]]>
          <![CDATA[<211>  7]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  21]]>
          Gln Lys Ala His Leu Thr Ala 
          1               5           
          <![CDATA[<210>  22]]>
          <![CDATA[<211>  7]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  22]]>
          Arg Ser Asp Asn Leu Thr Arg 
          1               5           
          <![CDATA[<210>  23]]>
          <![CDATA[<211>  21]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  23]]>
          cgaagttcca acctgacacg g                                                 21
          <![CDATA[<210>  24]]>
          <![CDATA[<211>  21]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  24]]>
          gacaagcgga ccttaatccg c                                                 21
          <![CDATA[<210>  25]]>
          <![CDATA[<211>  21]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  25]]>
          cagcggggaa atctagtgcg a                                                 21
          <![CDATA[<210>  26]]>
          <![CDATA[<211>  21]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  26]]>
          ctgagcttca acttgactcg t                                                 21
          <![CDATA[<210>  27]]>
          <![CDATA[<211>  21]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  27]]>
          cggagtgaca atcttacgag a                                                 21
          <![CDATA[<210>  28]]>
          <![CDATA[<211>  21]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  28]]>
          gaccggagcc acttagccag g                                                 21
          <![CDATA[<210>  29]]>
          <![CDATA[<211>  7]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  29]]>
          Arg Ser Ser Asn Leu Thr Arg 
          1               5           
          <![CDATA[<210>  30]]>
          <![CDATA[<211>  7]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  30]]>
          Asp Lys Arg Thr Leu Ile Arg 
          1               5           
          <![CDATA[<210>  31]]>
          <![CDATA[<211>  7]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  31]]>
          Gln Arg Gly Asn Leu Val Arg 
          1               5           
          <![CDATA[<210>  32]]>
          <![CDATA[<211>  7]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  32]]>
          Leu Ser Phe Asn Leu Thr Arg 
          1               5           
          <![CDATA[<210>  33]]>
          <![CDATA[<211>  7]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  33]]>
          Arg Ser Asp Asn Leu Thr Arg 
          1               5           
          <![CDATA[<210>  34]]>
          <![CDATA[<211>  7]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  34]]>
          Asp Arg Ser His Leu Ala Arg 
          1               5           
          <![CDATA[<210>  35]]>
          <![CDATA[<211>  21]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  35]]>
          gaccggagcg cgctggcacg g                                                 21
          <![CDATA[<210>  36]]>
          <![CDATA[<211>  21]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  36]]>
          cgaagtgaca acttaacgcg c                                                 21
          <![CDATA[<210>  37]]>
          <![CDATA[<211>  21]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  37]]>
          cagtcagggg acctcactcg t                                                 21
          <![CDATA[<210>  38]]>
          <![CDATA[<211>  21]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  38]]>
          gtacgacaga cgcttaaaca a                                                 21
          <![CDATA[<210>  39]]>
          <![CDATA[<211>  21]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  39]]>
          gccgctggta acttgacacg a                                                 21
          <![CDATA[<210>  40]]>
          <![CDATA[<211>  21]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  40]]>
          agatctgata atctaacgcg t                                                 21
          <![CDATA[<210>  41]]>
          <![CDATA[<211>  7]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  41]]>
          Asp Arg Ser Ala Leu Ala Arg 
          1               5           
          <![CDATA[<210>  42]]>
          <![CDATA[<211>  7]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  42]]>
          Arg Ser Asp Asn Leu Thr Arg 
          1               5           
          <![CDATA[<210>  43]]>
          <![CDATA[<211>  7]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  43]]>
          Gln Ser Gly Asp Leu Thr Arg 
          1               5           
          <![CDATA[<210>  44]]>
          <![CDATA[<211>  7]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  44]]>
          Val Arg Gln Thr Leu Lys Gln 
          1               5           
          <![CDATA[<210>  45]]>
          <![CDATA[<211>  7]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  45]]>
          Ala Ala Gly Asn Leu Thr Arg 
          1               5           
          <![CDATA[<210>  46]]>
          <![CDATA[<211>  7]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  46]]>
          Arg Ser Asp Asn Leu Thr Arg 
          1               5           
          <![CDATA[<210>  47]]>
          <![CDATA[<211>  1569]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  47]]>
          gaggccagcg gttccggacg ggctgacgca ttggacgatt ttgatctgga tatgctggga       60
          agtgacgccc tcgatgattt tgaccttgac atgcttggtt cggatgccct tgatgacttt      120
          gacctcgaca tgctcggcag tgacgccctt gatgatttcg acctggacat gctgattaac      180
          tctagaagtt ccggatctag ccagtacctg cccgacaccg acgaccggca ccggatcgag      240
          gaaaagcgga agcggaccta cgagacattc aagagcatca tgaagaagtc ccccttcagc      300
          ggccccaccg accctagacc tccacctaga agaatcgccg tgcccagcag atccagcgcc      360
          agcgtgccaa aacctgcccc ccagccttac cccttcacca gcagcctgag caccatcaac      420
          tacgacgagt tccctaccat ggtgttcccc agcggccaga tctctcaggc ctctgctctg      480
          gctccagccc ctcctcaggt gctgcctcag gctcctgctc ctgcaccagc tccagccatg      540
          gtgtctgcac tggctcaggc accagcaccc gtgcctgtgc tggctcctgg acctccacag      600
          gctgtggctc caccagcccc taaacctaca caggccggcg agggcacact gtctgaagct      660
          ctgctgcagc tgcagttcga cgacgaggat ctgggagccc tgctgggaaa cagcaccgat      720
          cctgccgtgt tcaccgacct ggccagcgtg gacaacagcg agttccagca gctgctgaac      780
          cagggcatcc ctgtggcccc tcacaccacc gagcccatgc tgatggaata ccccgaggcc      840
          atcacccggc tcgtgacagg cgctcagagg cctcctgatc cagctcctgc ccctctggga      900
          gcaccaggcc tgcctaatgg actgctgtct ggcgacgagg acttcagctc tatcgccgat      960
          atggatttct cagccttgct gggctctggc agcggcagcc gggattccag ggaagggatg     1020
          tttttgccga agcctgaggc cggctccgct attagtgacg tgtttgaggg ccgcgaggtg     1080
          tgccagccaa aacgaatccg gccatttcat cctccaggaa gtccatgggc caaccgccca     1140
          ctccccgcca gcctcgcacc aacaccaacc ggtccagtac atgagccagt cgggtcactg     1200
          accccggcac cagtccctca gccactggat ccagcgcccg cagtgactcc cgaggccagt     1260
          cacctgttgg aggatcccga tgaagagacg agccaggctg tcaaagccct tcgggagatg     1320
          gccgatactg tgattcccca gaaggaagag gctgcaatct gtggccaaat ggacctttcc     1380
          catccgcccc caaggggcca tctggatgag ctgacaacca cacttgagtc catgaccgag     1440
          gatctgaacc tggactcacc cctgaccccg gaattgaacg agattctgga taccttcctg     1500
          aacgacgagt gcctcttgca tgccatgcat atcagcacag gactgtccat cttcgacaca     1560
          tctctgttt                                                             1569
          <![CDATA[<210>  48]]>
          <![CDATA[<211>  523]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  48]]>
          Glu Ala Ser Gly Ser Gly Arg Ala Asp Ala Leu Asp Asp Phe Asp Leu 
          1               5                   10                  15      
          Asp Met Leu Gly Ser Asp Ala Leu Asp Asp Phe Asp Leu Asp Met Leu 
                      20                  25                  30          
          Gly Ser Asp Ala Leu Asp Asp Phe Asp Leu Asp Met Leu Gly Ser Asp 
                  35                  40                  45              
          Ala Leu Asp Asp Phe Asp Leu Asp Met Leu Ile Asn Ser Arg Ser Ser 
              50                  55                  60                  
          Gly Ser Ser Gln Tyr Leu Pro Asp Thr Asp Asp Arg His Arg Ile Glu 
          65                  70                  75                  80  
          Glu Lys Arg Lys Arg Thr Tyr Glu Thr Phe Lys Ser Ile Met Lys Lys 
                          85                  90                  95      
          Ser Pro Phe Ser Gly Pro Thr Asp Pro Arg Pro Pro Pro Arg Arg Ile 
                      100                 105                 110         
          Ala Val Pro Ser Arg Ser Ser Ala Ser Val Pro Lys Pro Ala Pro Gln 
                  115                 120                 125             
          Pro Tyr Pro Phe Thr Ser Ser Leu Ser Thr Ile Asn Tyr Asp Glu Phe 
              130                 135                 140                 
          Pro Thr Met Val Phe Pro Ser Gly Gln Ile Ser Gln Ala Ser Ala Leu 
          145                 150                 155                 160 
          Ala Pro Ala Pro Pro Gln Val Leu Pro Gln Ala Pro Ala Pro Ala Pro 
                          165                 170                 175     
          Ala Pro Ala Met Val Ser Ala Leu Ala Gln Ala Pro Ala Pro Val Pro 
                      180                 185                 190         
          Val Leu Ala Pro Gly Pro Pro Gln Ala Val Ala Pro Pro Ala Pro Lys 
                  195                 200                 205             
          Pro Thr Gln Ala Gly Glu Gly Thr Leu Ser Glu Ala Leu Leu Gln Leu 
              210                 215                 220                 
          Gln Phe Asp Asp Glu Asp Leu Gly Ala Leu Leu Gly Asn Ser Thr Asp 
          225                 230                 235                 240 
          Pro Ala Val Phe Thr Asp Leu Ala Ser Val Asp Asn Ser Glu Phe Gln 
                          245                 250                 255     
          Gln Leu Leu Asn Gln Gly Ile Pro Val Ala Pro His Thr Thr Glu Pro 
                      260                 265                 270         
          Met Leu Met Glu Tyr Pro Glu Ala Ile Thr Arg Leu Val Thr Gly Ala 
                  275                 280                 285             
          Gln Arg Pro Pro Asp Pro Ala Pro Ala Pro Leu Gly Ala Pro Gly Leu 
              290                 295                 300                 
          Pro Asn Gly Leu Leu Ser Gly Asp Glu Asp Phe Ser Ser Ile Ala Asp 
          305                 310                 315                 320 
          Met Asp Phe Ser Ala Leu Leu Gly Ser Gly Ser Gly Ser Arg Asp Ser 
                          325                 330                 335     
          Arg Glu Gly Met Phe Leu Pro Lys Pro Glu Ala Gly Ser Ala Ile Ser 
                      340                 345                 350         
          Asp Val Phe Glu Gly Arg Glu Val Cys Gln Pro Lys Arg Ile Arg Pro 
                  355                 360                 365             
          Phe His Pro Pro Gly Ser Pro Trp Ala Asn Arg Pro Leu Pro Ala Ser 
              370                 375                 380                 
          Leu Ala Pro Thr Pro Thr Gly Pro Val His Glu Pro Val Gly Ser Leu 
          385                 390                 395                 400 
          Thr Pro Ala Pro Val Pro Gln Pro Leu Asp Pro Ala Pro Ala Val Thr 
                          405                 410                 415     
          Pro Glu Ala Ser His Leu Leu Glu Asp Pro Asp Glu Glu Thr Ser Gln 
                      420                 425                 430         
          Ala Val Lys Ala Leu Arg Glu Met Ala Asp Thr Val Ile Pro Gln Lys 
                  435                 440                 445             
          Glu Glu Ala Ala Ile Cys Gly Gln Met Asp Leu Ser His Pro Pro Pro 
              450                 455                 460                 
          Arg Gly His Leu Asp Glu Leu Thr Thr Thr Leu Glu Ser Met Thr Glu 
          465                 470                 475                 480 
          Asp Leu Asn Leu Asp Ser Pro Leu Thr Pro Glu Leu Asn Glu Ile Leu 
                          485                 490                 495     
          Asp Thr Phe Leu Asn Asp Glu Cys Leu Leu His Ala Met His Ile Ser 
                      500                 505                 510         
          Thr Gly Leu Ser Ile Phe Asp Thr Ser Leu Phe 
                  515                 520             
          <![CDATA[<210>  49]]>
          <![CDATA[<211>  6027]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Homo sapiens]]>
          <![CDATA[<400>  49]]>
          atggaacaga ccgtgctggt gccgccgggc ccggatagct ttaacttttt tacccgcgaa       60
          agcctggcgg cgattgaacg ccgcattgcg gaagaaaaag cgaaaaaccc gaaaccggat      120
          aaaaaagatg atgatgaaaa cggcccgaaa ccgaacagcg atctggaagc gggcaaaaac      180
          ctgccgttta tttatggcga tattccgccg gaaatggtga gcgaaccgct ggaagatctg      240
          gatccgtatt atattaacaa aaaaaccttt attgtgctga acaaaggcaa agcgattttt      300
          cgctttagcg cgaccagcgc gctgtatatt ctgaccccgt ttaacccgct gcgcaaaatt      360
          gcgattaaaa ttctggtgca tagcctgttt agcatgctga ttatgtgcac cattctgacc      420
          aactgcgtgt ttatgaccat gagcaacccg ccggattgga ccaaaaacgt ggaatatacc      480
          tttaccggca tttatacctt tgaaagcctg attaaaatta ttgcgcgcgg cttttgcctg      540
          gaagatttta cctttctgcg cgatccgtgg aactggctgg attttaccgt gattaccttt      600
          gcgtatgtga ccgaatttgt ggatctgggc aacgtgagcg cgctgcgcac ctttcgcgtg      660
          ctgcgcgcgc tgaaaaccat tagcgtgatt ccgggcctga aaaccattgt gggcgcgctg      720
          attcagagcg tgaaaaaact gagcgatgtg atgattctga ccgtgttttg cctgagcgtg      780
          tttgcgctga ttggcctgca gctgtttatg ggcaacctgc gcaacaaatg cattcagtgg      840
          ccgccgacca acgcgagcct ggaagaacat agcattgaaa aaaacattac cgtgaactat      900
          aacggcaccc tgattaacga aaccgtgttt gaatttgatt ggaaaagcta tattcaggat      960
          agccgctatc attattttct ggaaggcttt ctggatgcgc tgctgtgcgg caacagcagc     1020
          gatgcgggcc agtgcccgga aggctatatg tgcgtgaaag cgggccgcaa cccgaactat     1080
          ggctatacca gctttgatac ctttagctgg gcgtttctga gcctgtttcg cctgatgacc     1140
          caggattttt gggaaaacct gtatcagctg accctgcgcg cggcgggcaa aacctatatg     1200
          attttttttg tgctggtgat ttttctgggc agcttttatc tgattaacct gattctggcg     1260
          gtggtggcga tggcgtatga agaacagaac caggcgaccc tggaagaagc ggaacagaaa     1320
          gaagcggaat ttcagcagat gattgaacag ctgaaaaaac agcaggaagc ggcgcagcag     1380
          gcggcgaccg cgaccgcgag cgaacatagc cgcgaaccga gcgcggcggg ccgcctgagc     1440
          gatagcagca gcgaagcgag caaactgagc agcaaaagcg cgaaagaacg ccgcaaccgc     1500
          cgcaaaaaac gcaaacagaa agaacagagc ggcggcgaag aaaaagatga agatgaattt     1560
          cagaaaagcg aaagcgaaga tagcattcgc cgcaaaggct ttcgctttag cattgaaggc     1620
          aaccgcctga cctatgaaaa acgctatagc agcccgcatc agagcctgct gagcattcgc     1680
          ggcagcctgt ttagcccgcg ccgcaacagc cgcaccagcc tgtttagctt tcgcggccgc     1740
          gcgaaagatg tgggcagcga aaacgatttt gcggatgatg aacatagcac ctttgaagat     1800
          aacgaaagcc gccgcgatag cctgtttgtg ccgcgccgcc atggcgaacg ccgcaacagc     1860
          aacctgagcc agaccagccg cagcagccgc atgctggcgg tgtttccggc gaacggcaaa     1920
          atgcatagca ccgtggattg caacggcgtg gtgagcctgg tgggcggccc gagcgtgccg     1980
          accagcccgg tgggccagct gctgccggaa gtgattattg ataaaccggc gaccgatgat     2040
          aacggcacca ccaccgaaac cgaaatgcgc aaacgccgca gcagcagctt tcatgtgagc     2100
          atggattttc tggaagatcc gagccagcgc cagcgcgcga tgagcattgc gagcattctg     2160
          accaacaccg tggaagaact ggaagaaagc cgccagaaat gcccgccgtg ctggtataaa     2220
          tttagcaaca tttttctgat ttgggattgc agcccgtatt ggctgaaagt gaaacatgtg     2280
          gtgaacctgg tggtgatgga tccgtttgtg gatctggcga ttaccatttg cattgtgctg     2340
          aacaccctgt ttatggcgat ggaacattat ccgatgaccg atcattttaa caacgtgctg     2400
          accgtgggca acctggtgtt taccggcatt tttaccgcgg aaatgtttct gaaaattatt     2460
          gcgatggatc cgtattatta ttttcaggaa ggctggaaca tttttgatgg ctttattgtg     2520
          accctgagcc tggtggaact gggcctggcg aacgtggaag gcctgagcgt gctgcgcagc     2580
          tttcgcctgc tgcgcgtgtt taaactggcg aaaagctggc cgaccctgaa catgctgatt     2640
          aaaattattg gcaacagcgt gggcgcgctg ggcaacctga ccctggtgct ggcgattatt     2700
          gtgtttattt ttgcggtggt gggcatgcag ctgtttggca aaagctataa agattgcgtg     2760
          tgcaaaattg cgagcgattg ccagctgccg cgctggcata tgaacgattt ttttcatagc     2820
          tttctgattg tgtttcgcgt gctgtgcggc gaatggattg aaaccatgtg ggattgcatg     2880
          gaagtggcgg gccaggcgat gtgcctgacc gtgtttatga tggtgatggt gattggcaac     2940
          ctggtggtgc tgaacctgtt tctggcgctg ctgctgagca gctttagcgc ggataacctg     3000
          gcggcgaccg atgatgataa cgaaatgaac aacctgcaga ttgcggtgga tcgcatgcat     3060
          aaaggcgtgg cgtatgtgaa acgcaaaatt tatgaattta ttcagcagag ctttattcgc     3120
          aaacagaaaa ttctggatga aattaaaccg ctggatgatc tgaacaacaa aaaagatagc     3180
          tgcatgagca accataccgc ggaaattggc aaagatctgg attatctgaa agatgtgaac     3240
          ggcaccacca gcggcattgg caccggcagc agcgtggaaa aatatattat tgatgaaagc     3300
          gattatatga gctttattaa caacccgagc ctgaccgtga ccgtgccgat tgcggtgggc     3360
          gaaagcgatt ttgaaaacct gaacaccgaa gattttagca gcgaaagcga tctggaagaa     3420
          agcaaagaaa aactgaacga aagcagcagc agcagcgaag gcagcaccgt ggatattggc     3480
          gcgccggtgg aagaacagcc ggtggtggaa ccggaagaaa ccctggaacc ggaagcgtgc     3540
          tttaccgaag gctgcgtgca gcgctttaaa tgctgccaga ttaacgtgga agaaggccgc     3600
          ggcaaacagt ggtggaacct gcgccgcacc tgctttcgca ttgtggaaca taactggttt     3660
          gaaaccttta ttgtgtttat gattctgctg agcagcggcg cgctggcgtt tgaagatatt     3720
          tatattgatc agcgcaaaac cattaaaacc atgctggaat atgcggataa agtgtttacc     3780
          tatattttta ttctggaaat gctgctgaaa tgggtggcgt atggctatca gacctatttt     3840
          accaacgcgt ggtgctggct ggattttctg attgtggatg tgagcctggt gagcctgacc     3900
          gcgaacgcgc tgggctatag cgaactgggc gcgattaaaa gcctgcgcac cctgcgcgcg     3960
          ctgcgcccgc tgcgcgcgct gagccgcttt gaaggcatgc gcgtggtggt gaacgcgctg     4020
          ctgggcgcga ttccgagcat tatgaacgtg ctgctggtgt gcctgatttt ttggctgatt     4080
          tttagcatta tgggcgtgaa cctgtttgcg ggcaaatttt atcattgcat taacaccacc     4140
          accggcgatc gctttgatat tgaagatgtg aacaaccata ccgattgcct gaaactgatt     4200
          gaacgcaacg aaaccgcgcg ctggaaaaac gtgaaagtga actttgataa cgtgggcttt     4260
          ggctatctga gcctgctgca ggtggcgacc tttaaaggct ggatggatat tatgtatgcg     4320
          gcggtggata gccgcaacgt ggaactgcag ccgaaatatg aagaaagcct gtatatgtat     4380
          ctgtattttg tgatttttat tatttttggc agctttttta ccctgaacct gtttattggc     4440
          gtgattattg ataactttaa ccagcagaaa aaaaaatttg gcggccagga tatttttatg     4500
          accgaagaac agaaaaaata ttataacgcg atgaaaaaac tgggcagcaa aaaaccgcag     4560
          aaaccgattc cgcgcccggg caacaaattt cagggcatgg tgtttgattt tgtgacccgc     4620
          caggtgtttg atattagcat tatgattctg atttgcctga acatggtgac catgatggtg     4680
          gaaaccgatg atcagagcga atatgtgacc accattctga gccgcattaa cctggtgttt     4740
          attgtgctgt ttaccggcga atgcgtgctg aaactgatta gcctgcgcca ttattatttt     4800
          accattggct ggaacatttt tgattttgtg gtggtgattc tgagcattgt gggcatgttt     4860
          ctggcggaac tgattgaaaa atattttgtg agcccgaccc tgtttcgcgt gattcgcctg     4920
          gcgcgcattg gccgcattct gcgcctgatt aaaggcgcga aaggcattcg caccctgctg     4980
          tttgcgctga tgatgagcct gccggcgctg tttaacattg gcctgctgct gtttctggtg     5040
          atgtttattt atgcgatttt tggcatgagc aactttgcgt atgtgaaacg cgaagtgggc     5100
          attgatgata tgtttaactt tgaaaccttt ggcaacagca tgatttgcct gtttcagatt     5160
          accaccagcg cgggctggga tggcctgctg gcgccgattc tgaacagcaa accgccggat     5220
          tgcgatccga acaaagtgaa cccgggcagc agcgtgaaag gcgattgcgg caacccgagc     5280
          gtgggcattt ttttttttgt gagctatatt attattagct ttctggtggt ggtgaacatg     5340
          tatattgcgg tgattctgga aaactttagc gtggcgaccg aagaaagcgc ggaaccgctg     5400
          agcgaagatg attttgaaat gttttatgaa gtgtgggaaa aatttgatcc ggatgcgacc     5460
          cagtttatgg aatttgaaaa actgagccag tttgcggcgg cgctggaacc gccgctgaac     5520
          ctgccgcagc cgaacaaact gcagctgatt gcgatggatc tgccgatggt gagcggcgat     5580
          cgcattcatt gcctggatat tctgtttgcg tttaccaaac gcgtgctggg cgaaagcggc     5640
          gaaatggatg cgctgcgcat tcagatggaa gaacgcttta tggcgagcaa cccgagcaaa     5700
          gtgagctatc agccgattac caccaccctg aaacgcaaac aggaagaagt gagcgcggtg     5760
          attattcagc gcgcgtatcg ccgccatctg ctgaaacgca ccgtgaaaca ggcgagcttt     5820
          acctataaca aaaacaaaat taaaggcggc gcgaacctgc tgattaaaga agatatgatt     5880
          attgatcgca ttaacgaaaa cagcattacc gaaaaaaccg atctgaccat gagcaccgcg     5940
          gcgtgcccgc cgagctatga tcgcgtgacc aaaccgattg tggaaaaaca tgaacaggaa     6000
          ggcaaagatg aaaaagcgaa aggcaaa                                         6027
          <![CDATA[<210>  50]]>
          <![CDATA[<211>  2009]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Homo sapiens]]>
          <![CDATA[<400>  50]]>
          Met Glu Gln Thr Val Leu Val Pro Pro Gly Pro Asp Ser Phe Asn Phe 
          1               5                   10                  15      
          Phe Thr Arg Glu Ser Leu Ala Ala Ile Glu Arg Arg Ile Ala Glu Glu 
                      20                  25                  30          
          Lys Ala Lys Asn Pro Lys Pro Asp Lys Lys Asp Asp Asp Glu Asn Gly 
                  35                  40                  45              
          Pro Lys Pro Asn Ser Asp Leu Glu Ala Gly Lys Asn Leu Pro Phe Ile 
              50                  55                  60                  
          Tyr Gly Asp Ile Pro Pro Glu Met Val Ser Glu Pro Leu Glu Asp Leu 
          65                  70                  75                  80  
          Asp Pro Tyr Tyr Ile Asn Lys Lys Thr Phe Ile Val Leu Asn Lys Gly 
                          85                  90                  95      
          Lys Ala Ile Phe Arg Phe Ser Ala Thr Ser Ala Leu Tyr Ile Leu Thr 
                      100                 105                 110         
          Pro Phe Asn Pro Leu Arg Lys Ile Ala Ile Lys Ile Leu Val His Ser 
                  115                 120                 125             
          Leu Phe Ser Met Leu Ile Met Cys Thr Ile Leu Thr Asn Cys Val Phe 
              130                 135                 140                 
          Met Thr Met Ser Asn Pro Pro Asp Trp Thr Lys Asn Val Glu Tyr Thr 
          145                 150                 155                 160 
          Phe Thr Gly Ile Tyr Thr Phe Glu Ser Leu Ile Lys Ile Ile Ala Arg 
                          165                 170                 175     
          Gly Phe Cys Leu Glu Asp Phe Thr Phe Leu Arg Asp Pro Trp Asn Trp 
                      180                 185                 190         
          Leu Asp Phe Thr Val Ile Thr Phe Ala Tyr Val Thr Glu Phe Val Asp 
                  195                 200                 205             
          Leu Gly Asn Val Ser Ala Leu Arg Thr Phe Arg Val Leu Arg Ala Leu 
              210                 215                 220                 
          Lys Thr Ile Ser Val Ile Pro Gly Leu Lys Thr Ile Val Gly Ala Leu 
          225                 230                 235                 240 
          Ile Gln Ser Val Lys Lys Leu Ser Asp Val Met Ile Leu Thr Val Phe 
                          245                 250                 255     
          Cys Leu Ser Val Phe Ala Leu Ile Gly Leu Gln Leu Phe Met Gly Asn 
                      260                 265                 270         
          Leu Arg Asn Lys Cys Ile Gln Trp Pro Pro Thr Asn Ala Ser Leu Glu 
                  275                 280                 285             
          Glu His Ser Ile Glu Lys Asn Ile Thr Val Asn Tyr Asn Gly Thr Leu 
              290                 295                 300                 
          Ile Asn Glu Thr Val Phe Glu Phe Asp Trp Lys Ser Tyr Ile Gln Asp 
          305                 310                 315                 320 
          Ser Arg Tyr His Tyr Phe Leu Glu Gly Phe Leu Asp Ala Leu Leu Cys 
                          325                 330                 335     
          Gly Asn Ser Ser Asp Ala Gly Gln Cys Pro Glu Gly Tyr Met Cys Val 
                      340                 345                 350         
          Lys Ala Gly Arg Asn Pro Asn Tyr Gly Tyr Thr Ser Phe Asp Thr Phe 
                  355                 360                 365             
          Ser Trp Ala Phe Leu Ser Leu Phe Arg Leu Met Thr Gln Asp Phe Trp 
              370                 375                 380                 
          Glu Asn Leu Tyr Gln Leu Thr Leu Arg Ala Ala Gly Lys Thr Tyr Met 
          385                 390                 395                 400 
          Ile Phe Phe Val Leu Val Ile Phe Leu Gly Ser Phe Tyr Leu Ile Asn 
                          405                 410                 415     
          Leu Ile Leu Ala Val Val Ala Met Ala Tyr Glu Glu Gln Asn Gln Ala 
                      420                 425                 430         
          Thr Leu Glu Glu Ala Glu Gln Lys Glu Ala Glu Phe Gln Gln Met Ile 
                  435                 440                 445             
          Glu Gln Leu Lys Lys Gln Gln Glu Ala Ala Gln Gln Ala Ala Thr Ala 
              450                 455                 460                 
          Thr Ala Ser Glu His Ser Arg Glu Pro Ser Ala Ala Gly Arg Leu Ser 
          465                 470                 475                 480 
          Asp Ser Ser Ser Glu Ala Ser Lys Leu Ser Ser Lys Ser Ala Lys Glu 
                          485                 490                 495     
          Arg Arg Asn Arg Arg Lys Lys Arg Lys Gln Lys Glu Gln Ser Gly Gly 
                      500                 505                 510         
          Glu Glu Lys Asp Glu Asp Glu Phe Gln Lys Ser Glu Ser Glu Asp Ser 
                  515                 520                 525             
          Ile Arg Arg Lys Gly Phe Arg Phe Ser Ile Glu Gly Asn Arg Leu Thr 
              530                 535                 540                 
          Tyr Glu Lys Arg Tyr Ser Ser Pro His Gln Ser Leu Leu Ser Ile Arg 
          545                 550                 555                 560 
          Gly Ser Leu Phe Ser Pro Arg Arg Asn Ser Arg Thr Ser Leu Phe Ser 
                          565                 570                 575     
          Phe Arg Gly Arg Ala Lys Asp Val Gly Ser Glu Asn Asp Phe Ala Asp 
                      580                 585                 590         
          Asp Glu His Ser Thr Phe Glu Asp Asn Glu Ser Arg Arg Asp Ser Leu 
                  595                 600                 605             
          Phe Val Pro Arg Arg His Gly Glu Arg Arg Asn Ser Asn Leu Ser Gln 
              610                 615                 620                 
          Thr Ser Arg Ser Ser Arg Met Leu Ala Val Phe Pro Ala Asn Gly Lys 
          625                 630                 635                 640 
          Met His Ser Thr Val Asp Cys Asn Gly Val Val Ser Leu Val Gly Gly 
                          645                 650                 655     
          Pro Ser Val Pro Thr Ser Pro Val Gly Gln Leu Leu Pro Glu Val Ile 
                      660                 665                 670         
          Ile Asp Lys Pro Ala Thr Asp Asp Asn Gly Thr Thr Thr Glu Thr Glu 
                  675                 680                 685             
          Met Arg Lys Arg Arg Ser Ser Ser Phe His Val Ser Met Asp Phe Leu 
              690                 695                 700                 
          Glu Asp Pro Ser Gln Arg Gln Arg Ala Met Ser Ile Ala Ser Ile Leu 
          705                 710                 715                 720 
          Thr Asn Thr Val Glu Glu Leu Glu Glu Ser Arg Gln Lys Cys Pro Pro 
                          725                 730                 735     
          Cys Trp Tyr Lys Phe Ser Asn Ile Phe Leu Ile Trp Asp Cys Ser Pro 
                      740                 745                 750         
          Tyr Trp Leu Lys Val Lys His Val Val Asn Leu Val Val Met Asp Pro 
                  755                 760                 765             
          Phe Val Asp Leu Ala Ile Thr Ile Cys Ile Val Leu Asn Thr Leu Phe 
              770                 775                 780                 
          Met Ala Met Glu His Tyr Pro Met Thr Asp His Phe Asn Asn Val Leu 
          785                 790                 795                 800 
          Thr Val Gly Asn Leu Val Phe Thr Gly Ile Phe Thr Ala Glu Met Phe 
                          805                 810                 815     
          Leu Lys Ile Ile Ala Met Asp Pro Tyr Tyr Tyr Phe Gln Glu Gly Trp 
                      820                 825                 830         
          Asn Ile Phe Asp Gly Phe Ile Val Thr Leu Ser Leu Val Glu Leu Gly 
                  835                 840                 845             
          Leu Ala Asn Val Glu Gly Leu Ser Val Leu Arg Ser Phe Arg Leu Leu 
              850                 855                 860                 
          Arg Val Phe Lys Leu Ala Lys Ser Trp Pro Thr Leu Asn Met Leu Ile 
          865                 870                 875                 880 
          Lys Ile Ile Gly Asn Ser Val Gly Ala Leu Gly Asn Leu Thr Leu Val 
                          885                 890                 895     
          Leu Ala Ile Ile Val Phe Ile Phe Ala Val Val Gly Met Gln Leu Phe 
                      900                 905                 910         
          Gly Lys Ser Tyr Lys Asp Cys Val Cys Lys Ile Ala Ser Asp Cys Gln 
                  915                 920                 925             
          Leu Pro Arg Trp His Met Asn Asp Phe Phe His Ser Phe Leu Ile Val 
              930                 935                 940                 
          Phe Arg Val Leu Cys Gly Glu Trp Ile Glu Thr Met Trp Asp Cys Met 
          945                 950                 955                 960 
          Glu Val Ala Gly Gln Ala Met Cys Leu Thr Val Phe Met Met Val Met 
                          965                 970                 975     
          Val Ile Gly Asn Leu Val Val Leu Asn Leu Phe Leu Ala Leu Leu Leu 
                      980                 985                 990         
          Ser Ser Phe Ser Ala Asp Asn Leu  Ala Ala Thr Asp Asp  Asp Asn Glu 
                  995                 1000                 1005             
          Met Asn  Asn Leu Gln Ile Ala  Val Asp Arg Met His  Lys Gly Val 
              1010                 1015                 1020             
          Ala Tyr  Val Lys Arg Lys Ile  Tyr Glu Phe Ile Gln  Gln Ser Phe 
              1025                 1030                 1035             
          Ile Arg  Lys Gln Lys Ile Leu  Asp Glu Ile Lys Pro  Leu Asp Asp 
              1040                 1045                 1050             
          Leu Asn  Asn Lys Lys Asp Ser  Cys Met Ser Asn His  Thr Ala Glu 
              1055                 1060                 1065             
          Ile Gly  Lys Asp Leu Asp Tyr  Leu Lys Asp Val Asn  Gly Thr Thr 
              1070                 1075                 1080             
          Ser Gly  Ile Gly Thr Gly Ser  Ser Val Glu Lys Tyr  Ile Ile Asp 
              1085                 1090                 1095             
          Glu Ser  Asp Tyr Met Ser Phe  Ile Asn Asn Pro Ser  Leu Thr Val 
              1100                 1105                 1110             
          Thr Val  Pro Ile Ala Val Gly  Glu Ser Asp Phe Glu  Asn Leu Asn 
              1115                 1120                 1125             
          Thr Glu  Asp Phe Ser Ser Glu  Ser Asp Leu Glu Glu  Ser Lys Glu 
              1130                 1135                 1140             
          Lys Leu  Asn Glu Ser Ser Ser  Ser Ser Glu Gly Ser  Thr Val Asp 
              1145                 1150                 1155             
          Ile Gly  Ala Pro Val Glu Glu  Gln Pro Val Val Glu  Pro Glu Glu 
              1160                 1165                 1170             
          Thr Leu  Glu Pro Glu Ala Cys  Phe Thr Glu Gly Cys  Val Gln Arg 
              1175                 1180                 1185             
          Phe Lys  Cys Cys Gln Ile Asn  Val Glu Glu Gly Arg  Gly Lys Gln 
              1190                 1195                 1200             
          Trp Trp  Asn Leu Arg Arg Thr  Cys Phe Arg Ile Val  Glu His Asn 
              1205                 1210                 1215             
          Trp Phe  Glu Thr Phe Ile Val  Phe Met Ile Leu Leu  Ser Ser Gly 
              1220                 1225                 1230             
          Ala Leu  Ala Phe Glu Asp Ile  Tyr Ile Asp Gln Arg  Lys Thr Ile 
              1235                 1240                 1245             
          Lys Thr  Met Leu Glu Tyr Ala  Asp Lys Val Phe Thr  Tyr Ile Phe 
              1250                 1255                 1260             
          Ile Leu  Glu Met Leu Leu Lys  Trp Val Ala Tyr Gly  Tyr Gln Thr 
              1265                 1270                 1275             
          Tyr Phe  Thr Asn Ala Trp Cys  Trp Leu Asp Phe Leu  Ile Val Asp 
              1280                 1285                 1290             
          Val Ser  Leu Val Ser Leu Thr  Ala Asn Ala Leu Gly  Tyr Ser Glu 
              1295                 1300                 1305             
          Leu Gly  Ala Ile Lys Ser Leu  Arg Thr Leu Arg Ala  Leu Arg Pro 
              1310                 1315                 1320             
          Leu Arg  Ala Leu Ser Arg Phe  Glu Gly Met Arg Val  Val Val Asn 
              1325                 1330                 1335             
          Ala Leu  Leu Gly Ala Ile Pro  Ser Ile Met Asn Val  Leu Leu Val 
              1340                 1345                 1350             
          Cys Leu  Ile Phe Trp Leu Ile  Phe Ser Ile Met Gly  Val Asn Leu 
              1355                 1360                 1365             
          Phe Ala  Gly Lys Phe Tyr His  Cys Ile Asn Thr Thr  Thr Gly Asp 
              1370                 1375                 1380             
          Arg Phe  Asp Ile Glu Asp Val  Asn Asn His Thr Asp  Cys Leu Lys 
              1385                 1390                 1395             
          Leu Ile  Glu Arg Asn Glu Thr  Ala Arg Trp Lys Asn  Val Lys Val 
              1400                 1405                 1410             
          Asn Phe  Asp Asn Val Gly Phe  Gly Tyr Leu Ser Leu  Leu Gln Val 
              1415                 1420                 1425             
          Ala Thr  Phe Lys Gly Trp Met  Asp Ile Met Tyr Ala  Ala Val Asp 
              1430                 1435                 1440             
          Ser Arg  Asn Val Glu Leu Gln  Pro Lys Tyr Glu Glu  Ser Leu Tyr 
              1445                 1450                 1455             
          Met Tyr  Leu Tyr Phe Val Ile  Phe Ile Ile Phe Gly  Ser Phe Phe 
              1460                 1465                 1470             
          Thr Leu  Asn Leu Phe Ile Gly  Val Ile Ile Asp Asn  Phe Asn Gln 
              1475                 1480                 1485             
          Gln Lys  Lys Lys Phe Gly Gly  Gln Asp Ile Phe Met  Thr Glu Glu 
              1490                 1495                 1500             
          Gln Lys  Lys Tyr Tyr Asn Ala  Met Lys Lys Leu Gly  Ser Lys Lys 
              1505                 1510                 1515             
          Pro Gln  Lys Pro Ile Pro Arg  Pro Gly Asn Lys Phe  Gln Gly Met 
              1520                 1525                 1530             
          Val Phe  Asp Phe Val Thr Arg  Gln Val Phe Asp Ile  Ser Ile Met 
              1535                 1540                 1545             
          Ile Leu  Ile Cys Leu Asn Met  Val Thr Met Met Val  Glu Thr Asp 
              1550                 1555                 1560             
          Asp Gln  Ser Glu Tyr Val Thr  Thr Ile Leu Ser Arg  Ile Asn Leu 
              1565                 1570                 1575             
          Val Phe  Ile Val Leu Phe Thr  Gly Glu Cys Val Leu  Lys Leu Ile 
              1580                 1585                 1590             
          Ser Leu  Arg His Tyr Tyr Phe  Thr Ile Gly Trp Asn  Ile Phe Asp 
              1595                 1600                 1605             
          Phe Val  Val Val Ile Leu Ser  Ile Val Gly Met Phe  Leu Ala Glu 
              1610                 1615                 1620             
          Leu Ile  Glu Lys Tyr Phe Val  Ser Pro Thr Leu Phe  Arg Val Ile 
              1625                 1630                 1635             
          Arg Leu  Ala Arg Ile Gly Arg  Ile Leu Arg Leu Ile  Lys Gly Ala 
              1640                 1645                 1650             
          Lys Gly  Ile Arg Thr Leu Leu  Phe Ala Leu Met Met  Ser Leu Pro 
              1655                 1660                 1665             
          Ala Leu  Phe Asn Ile Gly Leu  Leu Leu Phe Leu Val  Met Phe Ile 
              1670                 1675                 1680             
          Tyr Ala  Ile Phe Gly Met Ser  Asn Phe Ala Tyr Val  Lys Arg Glu 
              1685                 1690                 1695             
          Val Gly  Ile Asp Asp Met Phe  Asn Phe Glu Thr Phe  Gly Asn Ser 
              1700                 1705                 1710             
          Met Ile  Cys Leu Phe Gln Ile  Thr Thr Ser Ala Gly  Trp Asp Gly 
              1715                 1720                 1725             
          Leu Leu  Ala Pro Ile Leu Asn  Ser Lys Pro Pro Asp  Cys Asp Pro 
              1730                 1735                 1740             
          Asn Lys  Val Asn Pro Gly Ser  Ser Val Lys Gly Asp  Cys Gly Asn 
              1745                 1750                 1755             
          Pro Ser  Val Gly Ile Phe Phe  Phe Val Ser Tyr Ile  Ile Ile Ser 
              1760                 1765                 1770             
          Phe Leu  Val Val Val Asn Met  Tyr Ile Ala Val Ile  Leu Glu Asn 
              1775                 1780                 1785             
          Phe Ser  Val Ala Thr Glu Glu  Ser Ala Glu Pro Leu  Ser Glu Asp 
              1790                 1795                 1800             
          Asp Phe  Glu Met Phe Tyr Glu  Val Trp Glu Lys Phe  Asp Pro Asp 
              1805                 1810                 1815             
          Ala Thr  Gln Phe Met Glu Phe  Glu Lys Leu Ser Gln  Phe Ala Ala 
              1820                 1825                 1830             
          Ala Leu  Glu Pro Pro Leu Asn  Leu Pro Gln Pro Asn  Lys Leu Gln 
              1835                 1840                 1845             
          Leu Ile  Ala Met Asp Leu Pro  Met Val Ser Gly Asp  Arg Ile His 
              1850                 1855                 1860             
          Cys Leu  Asp Ile Leu Phe Ala  Phe Thr Lys Arg Val  Leu Gly Glu 
              1865                 1870                 1875             
          Ser Gly  Glu Met Asp Ala Leu  Arg Ile Gln Met Glu  Glu Arg Phe 
              1880                 1885                 1890             
          Met Ala  Ser Asn Pro Ser Lys  Val Ser Tyr Gln Pro  Ile Thr Thr 
              1895                 1900                 1905             
          Thr Leu  Lys Arg Lys Gln Glu  Glu Val Ser Ala Val  Ile Ile Gln 
              1910                 1915                 1920             
          Arg Ala  Tyr Arg Arg His Leu  Leu Lys Arg Thr Val  Lys Gln Ala 
              1925                 1930                 1935             
          Ser Phe  Thr Tyr Asn Lys Asn  Lys Ile Lys Gly Gly  Ala Asn Leu 
              1940                 1945                 1950             
          Leu Ile  Lys Glu Asp Met Ile  Ile Asp Arg Ile Asn  Glu Asn Ser 
              1955                 1960                 1965             
          Ile Thr  Glu Lys Thr Asp Leu  Thr Met Ser Thr Ala  Ala Cys Pro 
              1970                 1975                 1980             
          Pro Ser  Tyr Asp Arg Val Thr  Lys Pro Ile Val Glu  Lys His Glu 
              1985                 1990                 1995             
          Gln Glu  Gly Lys Asp Glu Lys  Ala Lys Gly Lys 
              2000                 2005                 
          <![CDATA[<210>  51]]>
          <![CDATA[<211>  1470]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Homo sapiens]]>
          <![CDATA[<400>  51]]>
          atggatctgc tggtggatga actgtttgcg gatatgaacg cggatggcgc gagcccgccg       60
          ccgccgcgcc cggcgggcgg cccgaaaaac accccggcgg cgccgccgct gtatgcgacc      120
          ggccgcctga gccaggcgca gctgatgccg agcccgccga tgccggtgcc gccggcggcg      180
          ctgtttaacc gcctgctgga tgatctgggc tttagcgcgg gcccggcgct gtgcaccatg      240
          ctggatacct ggaacgaaga tctgtttagc gcgctgccga ccaacgcgga tctgtatcgc      300
          gaatgcaaat ttctgagcac cctgccgagc gatgtggtgg aatggggcga tgcgtatgtg      360
          ccggaacgca cccagattga tattcgcgcg catggcgatg tggcgtttcc gaccctgccg      420
          gcgacccgcg atggcctggg cctgtattat gaagcgctga gccgcttttt tcatgcggaa      480
          ctgcgcgcgc gcgaagaaag ctatcgcacc gtgctggcga acttttgcag cgcgctgtat      540
          cgctatctgc gcgcgagcgt gcgccagctg catcgccagg cgcatatgcg cggccgcgat      600
          cgcgatctgg gcgaaatgct gcgcgcgacc attgcggatc gctattatcg cgaaaccgcg      660
          cgcctggcgc gcgtgctgtt tctgcatctg tatctgtttc tgacccgcga aattctgtgg      720
          gcggcgtatg cggaacagat gatgcgcccg gatctgtttg attgcctgtg ctgcgatctg      780
          gaaagctggc gccagctggc gggcctgttt cagccgttta tgtttgtgaa cggcgcgctg      840
          accgtgcgcg gcgtgccgat tgaagcgcgc cgcctgcgcg aactgaacca tattcgcgaa      900
          catctgaacc tgccgctggt gcgcagcgcg gcgaccgaag aaccgggcgc gccgctgacc      960
          accccgccga ccctgcatgg caaccaggcg cgcgcgagcg gctattttat ggtgctgatt     1020
          cgcgcgaaac tggatagcta tagcagcttt accaccagcc cgagcgaagc ggtgatgcgc     1080
          gaacatgcgt atagccgcgc gcgcaccaaa aacaactatg gcagcaccat tgaaggcctg     1140
          ctggatctgc cggatgatga tgcgccggaa gaagcgggcc tggcggcgcc gcgcctgagc     1200
          tttctgccgg cgggccatac ccgccgcctg agcaccgcgc cgccgaccga tgtgagcctg     1260
          ggcgatgaac tgcatctgga tggcgaagat gtggcgatgg cgcatgcgga tgcgctggat     1320
          gattttgatc tggatatgct gggcgatggc gatagcccgg gcccgggctt taccccgcat     1380
          gatagcgcgc cgtatggcgc gctggatatg gcggattttg aatttgaaca gatgtttacc     1440
          gatgcgctgg gcattgatga atatggcggc                                      1470
          <![CDATA[<210>  52]]>
          <![CDATA[<211>  490]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Homo sapiens]]>
          <![CDATA[<400>  52]]>
          Met Asp Leu Leu Val Asp Glu Leu Phe Ala Asp Met Asn Ala Asp Gly 
          1               5                   10                  15      
          Ala Ser Pro Pro Pro Pro Arg Pro Ala Gly Gly Pro Lys Asn Thr Pro 
                      20                  25                  30          
          Ala Ala Pro Pro Leu Tyr Ala Thr Gly Arg Leu Ser Gln Ala Gln Leu 
                  35                  40                  45              
          Met Pro Ser Pro Pro Met Pro Val Pro Pro Ala Ala Leu Phe Asn Arg 
              50                  55                  60                  
          Leu Leu Asp Asp Leu Gly Phe Ser Ala Gly Pro Ala Leu Cys Thr Met 
          65                  70                  75                  80  
          Leu Asp Thr Trp Asn Glu Asp Leu Phe Ser Ala Leu Pro Thr Asn Ala 
                          85                  90                  95      
          Asp Leu Tyr Arg Glu Cys Lys Phe Leu Ser Thr Leu Pro Ser Asp Val 
                      100                 105                 110         
          Val Glu Trp Gly Asp Ala Tyr Val Pro Glu Arg Thr Gln Ile Asp Ile 
                  115                 120                 125             
          Arg Ala His Gly Asp Val Ala Phe Pro Thr Leu Pro Ala Thr Arg Asp 
              130                 135                 140                 
          Gly Leu Gly Leu Tyr Tyr Glu Ala Leu Ser Arg Phe Phe His Ala Glu 
          145                 150                 155                 160 
          Leu Arg Ala Arg Glu Glu Ser Tyr Arg Thr Val Leu Ala Asn Phe Cys 
                          165                 170                 175     
          Ser Ala Leu Tyr Arg Tyr Leu Arg Ala Ser Val Arg Gln Leu His Arg 
                      180                 185                 190         
          Gln Ala His Met Arg Gly Arg Asp Arg Asp Leu Gly Glu Met Leu Arg 
                  195                 200                 205             
          Ala Thr Ile Ala Asp Arg Tyr Tyr Arg Glu Thr Ala Arg Leu Ala Arg 
              210                 215                 220                 
          Val Leu Phe Leu His Leu Tyr Leu Phe Leu Thr Arg Glu Ile Leu Trp 
          225                 230                 235                 240 
          Ala Ala Tyr Ala Glu Gln Met Met Arg Pro Asp Leu Phe Asp Cys Leu 
                          245                 250                 255     
          Cys Cys Asp Leu Glu Ser Trp Arg Gln Leu Ala Gly Leu Phe Gln Pro 
                      260                 265                 270         
          Phe Met Phe Val Asn Gly Ala Leu Thr Val Arg Gly Val Pro Ile Glu 
                  275                 280                 285             
          Ala Arg Arg Leu Arg Glu Leu Asn His Ile Arg Glu His Leu Asn Leu 
              290                 295                 300                 
          Pro Leu Val Arg Ser Ala Ala Thr Glu Glu Pro Gly Ala Pro Leu Thr 
          305                 310                 315                 320 
          Thr Pro Pro Thr Leu His Gly Asn Gln Ala Arg Ala Ser Gly Tyr Phe 
                          325                 330                 335     
          Met Val Leu Ile Arg Ala Lys Leu Asp Ser Tyr Ser Ser Phe Thr Thr 
                      340                 345                 350         
          Ser Pro Ser Glu Ala Val Met Arg Glu His Ala Tyr Ser Arg Ala Arg 
                  355                 360                 365             
          Thr Lys Asn Asn Tyr Gly Ser Thr Ile Glu Gly Leu Leu Asp Leu Pro 
              370                 375                 380                 
          Asp Asp Asp Ala Pro Glu Glu Ala Gly Leu Ala Ala Pro Arg Leu Ser 
          385                 390                 395                 400 
          Phe Leu Pro Ala Gly His Thr Arg Arg Leu Ser Thr Ala Pro Pro Thr 
                          405                 410                 415     
          Asp Val Ser Leu Gly Asp Glu Leu His Leu Asp Gly Glu Asp Val Ala 
                      420                 425                 430         
          Met Ala His Ala Asp Ala Leu Asp Asp Phe Asp Leu Asp Met Leu Gly 
                  435                 440                 445             
          Asp Gly Asp Ser Pro Gly Pro Gly Phe Thr Pro His Asp Ser Ala Pro 
              450                 455                 460                 
          Tyr Gly Ala Leu Asp Met Ala Asp Phe Glu Phe Glu Gln Met Phe Thr 
          465                 470                 475                 480 
          Asp Ala Leu Gly Ile Asp Glu Tyr Gly Gly 
                          485                 490 
          <![CDATA[<210>  53]]>
          <![CDATA[<211>  2570]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Homo sapiens]]>
          <![CDATA[<400>  53]]>
          agcgcgcagg cgcggccgga ttccgggcag tgacgcgacg gcgggccgcg cggcgcattt       60
          ccgcctctgg cgaatggctc gtctgtagtg cacgccgcgg gcccagctgc gaccccggcc      120
          ccgcccccgg gaccccggcc atggacgaac tgttccccct catcttcccg gcagagccag      180
          cccaggcctc tggcccctat gtggagatca ttgagcagcc caagcagcgg ggcatgcgct      240
          tccgctacaa gtgcgagggg cgctccgcgg gcagcatccc aggcgagagg agcacagata      300
          ccaccaagac ccaccccacc atcaagatca atggctacac aggaccaggg acagtgcgca      360
          tctccctggt caccaaggac cctcctcacc ggcctcaccc ccacgagctt gtaggaaagg      420
          actgccggga tggcttctat gaggctgagc tctgcccgga ccgctgcatc cacagtttcc      480
          agaacctggg aatccagtgt gtgaagaagc gggacctgga gcaggctatc agtcagcgca      540
          tccagaccaa caacaacccc ttccaagaag agcagcgtgg ggactacgac ctgaatgctg      600
          tgcggctctg cttccaggtg acagtgcggg acccatcagg caggcccctc cgcctgccgc      660
          ctgtcctttc tcatcccatc tttgacaatc gtgcccccaa cactgccgag ctcaagatct      720
          gccgagtgaa ccgaaactct ggcagctgcc tcggtgggga tgagatcttc ctactgtgtg      780
          acaaggtgca gaaagaggac attgaggtgt atttcacggg accaggctgg gaggcccgag      840
          gctccttttc gcaagctgat gtgcaccgac aagtggccat tgtgttccgg acccctccct      900
          acgcagaccc cagcctgcag gctcctgtgc gtgtctccat gcagctgcgg cggccttccg      960
          accgggagct cagtgagccc atggaattcc agtacctgcc agatacagac gatcgtcacc     1020
          ggattgagga gaaacgtaaa aggacatatg agaccttcaa gagcatcatg aagaagagtc     1080
          ctttcagcgg acccaccgac ccccggcctc cacctcgacg cattgctgtg ccttcccgca     1140
          gctcagcttc tgtccccaag ccagcacccc agccctatcc ctttacgtca tccctgagca     1200
          ccatcaacta tgatgagttt cccaccatgg tgtttccttc tgggcagatc agccaggcct     1260
          cggccttggc cccggcccct ccccaagtcc tgccccaggc tccagcccct gcccctgctc     1320
          cagccatggt atcagctctg gcccaggccc cagcccctgt cccagtccta gccccaggcc     1380
          ctcctcaggc tgtggcccca cctgccccca agcccaccca ggctggggaa ggaacgctgt     1440
          cagaggccct gctgcagctg cagtttgatg atgaagacct gggggccttg cttggcaaca     1500
          gcacagaccc agctgtgttc acagacctgg catccgtcga caactccgag tttcagcagc     1560
          tgctgaacca gggcatacct gtggcccccc acacaactga gcccatgctg atggagtacc     1620
          ctgaggctat aactcgccta gtgacagggg cccagaggcc ccccgaccca gctcctgctc     1680
          cactgggggc cccggggctc cccaatggcc tcctttcagg agatgaagac ttctcctcca     1740
          ttgcggacat ggacttctca gccctgctga gtcagatcag ctcctaaggg ggtgacgcct     1800
          gccctcccca gagcactggg ttgcagggga ttgaagccct ccaaaagcac ttacggattc     1860
          tggtggggtg tgttccaact gcccccaact ttgtggatgt cttccttgga ggggggagcc     1920
          atattttatt cttttattgt cagtatctgt atctctctct ctttttggag gtgcttaagc     1980
          agaagcatta acttctctgg aaagggggga gctggggaaa ctcaaacttt tcccctgtcc     2040
          tgatggtcag ctcccttctc tgtagggaac tctggggtcc cccatcccca tcctccagct     2100
          tctggtactc tcctagagac agaagcaggc tggaggtaag gcctttgagc ccacaaagcc     2160
          ttatcaagtg tcttccatca tggattcatt acagcttaat caaaataacg ccccagatac     2220
          cagcccctgt atggcactgg cattgtccct gtgcctaaca ccagcgtttg aggggctggc     2280
          cttcctgccc tacagaggtc tctgccggct ctttccttgc tcaaccatgg ctgaaggaaa     2340
          ccagtgcaac agcactggct ctctccagga tccagaaggg gtttggtctg ggacttcctt     2400
          gctctccctc ttctcaagtg ccttaatagt agggtaagtt gttaagagtg ggggagagca     2460
          ggctggcagc tctccagtca ggaggcatag tttttactga acaatcaaag cacttggact     2520
          cttgctcttt ctactctgaa ctaataaatc tgttgccaag ctggctagaa                2570
          <![CDATA[<210>  54]]>
          <![CDATA[<211>  548]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Homo sapiens]]>
          <![CDATA[<400>  54]]>
          Met Asp Glu Leu Phe Pro Leu Ile Phe Pro Ala Glu Pro Ala Gln Ala 
          1               5                   10                  15      
          Ser Gly Pro Tyr Val Glu Ile Ile Glu Gln Pro Lys Gln Arg Gly Met 
                      20                  25                  30          
          Arg Phe Arg Tyr Lys Cys Glu Gly Arg Ser Ala Gly Ser Ile Pro Gly 
                  35                  40                  45              
          Glu Arg Ser Thr Asp Thr Thr Lys Thr His Pro Thr Ile Lys Ile Asn 
              50                  55                  60                  
          Gly Tyr Thr Gly Pro Gly Thr Val Arg Ile Ser Leu Val Thr Lys Asp 
          65                  70                  75                  80  
          Pro Pro His Arg Pro His Pro His Glu Leu Val Gly Lys Asp Cys Arg 
                          85                  90                  95      
          Asp Gly Phe Tyr Glu Ala Glu Leu Cys Pro Asp Arg Cys Ile His Ser 
                      100                 105                 110         
          Phe Gln Asn Leu Gly Ile Gln Cys Val Lys Lys Arg Asp Leu Glu Gln 
                  115                 120                 125             
          Ala Ile Ser Gln Arg Ile Gln Thr Asn Asn Asn Pro Phe Gln Glu Glu 
              130                 135                 140                 
          Gln Arg Gly Asp Tyr Asp Leu Asn Ala Val Arg Leu Cys Phe Gln Val 
          145                 150                 155                 160 
          Thr Val Arg Asp Pro Ser Gly Arg Pro Leu Arg Leu Pro Pro Val Leu 
                          165                 170                 175     
          Ser His Pro Ile Phe Asp Asn Arg Ala Pro Asn Thr Ala Glu Leu Lys 
                      180                 185                 190         
          Ile Cys Arg Val Asn Arg Asn Ser Gly Ser Cys Leu Gly Gly Asp Glu 
                  195                 200                 205             
          Ile Phe Leu Leu Cys Asp Lys Val Gln Lys Glu Asp Ile Glu Val Tyr 
              210                 215                 220                 
          Phe Thr Gly Pro Gly Trp Glu Ala Arg Gly Ser Phe Ser Gln Ala Asp 
          225                 230                 235                 240 
          Val His Arg Gln Val Ala Ile Val Phe Arg Thr Pro Pro Tyr Ala Asp 
                          245                 250                 255     
          Pro Ser Leu Gln Ala Pro Val Arg Val Ser Met Gln Leu Arg Arg Pro 
                      260                 265                 270         
          Ser Asp Arg Glu Leu Ser Glu Pro Met Glu Phe Gln Tyr Leu Pro Asp 
                  275                 280                 285             
          Thr Asp Asp Arg His Arg Ile Glu Glu Lys Arg Lys Arg Thr Tyr Glu 
              290                 295                 300                 
          Thr Phe Lys Ser Ile Met Lys Lys Ser Pro Phe Ser Gly Pro Thr Asp 
          305                 310                 315                 320 
          Pro Arg Pro Pro Pro Arg Arg Ile Ala Val Pro Ser Arg Ser Ser Ala 
                          325                 330                 335     
          Ser Val Pro Lys Pro Ala Pro Gln Pro Tyr Pro Phe Thr Ser Ser Leu 
                      340                 345                 350         
          Ser Thr Ile Asn Tyr Asp Glu Phe Pro Thr Met Val Phe Pro Ser Gly 
                  355                 360                 365             
          Gln Ile Ser Gln Ala Ser Ala Leu Ala Pro Ala Pro Pro Gln Val Leu 
              370                 375                 380                 
          Pro Gln Ala Pro Ala Pro Ala Pro Ala Pro Ala Met Val Ser Ala Leu 
          385                 390                 395                 400 
          Ala Gln Ala Pro Ala Pro Val Pro Val Leu Ala Pro Gly Pro Pro Gln 
                          405                 410                 415     
          Ala Val Ala Pro Pro Ala Pro Lys Pro Thr Gln Ala Gly Glu Gly Thr 
                      420                 425                 430         
          Leu Ser Glu Ala Leu Leu Gln Leu Gln Phe Asp Asp Glu Asp Leu Gly 
                  435                 440                 445             
          Ala Leu Leu Gly Asn Ser Thr Asp Pro Ala Val Phe Thr Asp Leu Ala 
              450                 455                 460                 
          Ser Val Asp Asn Ser Glu Phe Gln Gln Leu Leu Asn Gln Gly Ile Pro 
          465                 470                 475                 480 
          Val Ala Pro His Thr Thr Glu Pro Met Leu Met Glu Tyr Pro Glu Ala 
                          485                 490                 495     
          Ile Thr Arg Leu Val Thr Gly Ala Gln Arg Pro Pro Asp Pro Ala Pro 
                      500                 505                 510         
          Ala Pro Leu Gly Ala Pro Gly Leu Pro Asn Gly Leu Leu Ser Gly Asp 
                  515                 520                 525             
          Glu Asp Phe Ser Ser Ile Ala Asp Met Asp Phe Ser Ala Leu Leu Ser 
              530                 535                 540                 
          Gln Ile Ser Ser 
          545             
          <![CDATA[<210>  55]]>
          <![CDATA[<211>  1815]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Homo sapiens]]>
          <![CDATA[<400>  55]]>
          atgcgcccga aaaaagatgg cctggaagat tttctgcgcc tgaccccgga aattaaaaaa       60
          cagctgggca gcctggtgag cgattattgc aacgtgctga acaaagaatt taccgcgggc      120
          agcgtggaaa ttaccctgcg cagctataaa atttgcaaag cgtttattaa cgaagcgaaa      180
          gcgcatggcc gcgaatgggg cggcctgatg gcgaccctga acatttgcaa cttttgggcg      240
          attctgcgca acaaccgcgt gcgccgccgc gcggaaaacg cgggcaacga tgcgtgcagc      300
          attgcgtgcc cgattgtgat gcgctatgtg ctggatcatc tgattgtggt gaccgatcgc      360
          ttttttattc aggcgccgag caaccgcgtg atgattccgg cgaccattgg caccgcgatg      420
          tataaactgc tgaaacatag ccgcgtgcgc gcgtatacct atagcaaagt gctgggcgtg      480
          gatcgcgcgg cgattatggc gagcggcaaa caggtggtgg aacatctgaa ccgcatggaa      540
          aaagaaggcc tgctgagcag caaatttaaa gcgttttgca aatgggtgtt tacctatccg      600
          gtgctggaag aaatgtttca gaccatggtg agcagcaaaa ccggccatct gaccgatgat      660
          gtgaaagatg tgcgcgcgct gattaaaacc ctgccgcgcg cgagctatag cagccatgcg      720
          ggccagcgca gctatgtgag cggcgtgctg ccggcgtgcc tgctgagcac caaaagcaaa      780
          gcggtggaaa ccccgattct ggtgagcggc gcggatcgca tggatgaaga actgatgggc      840
          aacgatggcg gcgcgagcca taccgaagcg cgctatagcg aaagcggcca gtttcatgcg      900
          tttaccgatg aactggaaag cctgccgagc ccgaccatgc cgctgaaacc gggcgcgcag      960
          agcgcggatt gcggcgatag cagcagcagc agcagcgata gcggcaacag cgataccgaa     1020
          cagagcgaac gcgaagaagc gcgcgcggaa gcgccgcgcc tgcgcgcgcc gaaaagccgc     1080
          cgcaccagcc gcccgaaccg cggccagacc ccgtgcccga gcaacgcggc ggaaccggaa     1140
          cagccgtgga ttgcggcggt gcatcaggaa agcgatgaac gcccgatttt tccgcatccg     1200
          agcaaaccga cctttctgcc gccggtgaaa cgcaaaaaag gcctgcgcga tagccgcgaa     1260
          ggcatgtttc tgccgaaacc ggaagcgggc agcgcgatta gcgatgtgtt tgaaggccgc     1320
          gaagtgtgcc agccgaaacg cattcgcccg tttcatccgc cgggcagccc gtgggcgaac     1380
          cgcccgctgc cggcgagcct ggcgccgacc ccgaccggcc cggtgcatga accggtgggc     1440
          agcctgaccc cggcgccggt gccgcagccg ctggatccgg cgccggcggt gaccccggaa     1500
          gcgagccatc tgctggaaga tccggatgaa gaaaccagcc aggcggtgaa agcgctgcgc     1560
          gaaatggcgg ataccgtgat tccgcagaaa gaagaagcgg cgatttgcgg ccagatggat     1620
          ctgagccatc cgccgccgcg cggccatctg gatgaactga ccaccaccct ggaaagcatg     1680
          accgaagatc tgaacctgga tagcccgctg accccggaac tgaacgaaat tctggatacc     1740
          tttctgaacg atgaatgcct gctgcatgcg atgcatatta gcaccggcct gagcattttt     1800
          gataccagcc tgttt                                                      1815
          <![CDATA[<210>  56]]>
          <![CDATA[<211>  605]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  56]]>
          Met Arg Pro Lys Lys Asp Gly Leu Glu Asp Phe Leu Arg Leu Thr Pro 
          1               5                   10                  15      
          Glu Ile Lys Lys Gln Leu Gly Ser Leu Val Ser Asp Tyr Cys Asn Val 
                      20                  25                  30          
          Leu Asn Lys Glu Phe Thr Ala Gly Ser Val Glu Ile Thr Leu Arg Ser 
                  35                  40                  45              
          Tyr Lys Ile Cys Lys Ala Phe Ile Asn Glu Ala Lys Ala His Gly Arg 
              50                  55                  60                  
          Glu Trp Gly Gly Leu Met Ala Thr Leu Asn Ile Cys Asn Phe Trp Ala 
          65                  70                  75                  80  
          Ile Leu Arg Asn Asn Arg Val Arg Arg Arg Ala Glu Asn Ala Gly Asn 
                          85                  90                  95      
          Asp Ala Cys Ser Ile Ala Cys Pro Ile Val Met Arg Tyr Val Leu Asp 
                      100                 105                 110         
          His Leu Ile Val Val Thr Asp Arg Phe Phe Ile Gln Ala Pro Ser Asn 
                  115                 120                 125             
          Arg Val Met Ile Pro Ala Thr Ile Gly Thr Ala Met Tyr Lys Leu Leu 
              130                 135                 140                 
          Lys His Ser Arg Val Arg Ala Tyr Thr Tyr Ser Lys Val Leu Gly Val 
          145                 150                 155                 160 
          Asp Arg Ala Ala Ile Met Ala Ser Gly Lys Gln Val Val Glu His Leu 
                          165                 170                 175     
          Asn Arg Met Glu Lys Glu Gly Leu Leu Ser Ser Lys Phe Lys Ala Phe 
                      180                 185                 190         
          Cys Lys Trp Val Phe Thr Tyr Pro Val Leu Glu Glu Met Phe Gln Thr 
                  195                 200                 205             
          Met Val Ser Ser Lys Thr Gly His Leu Thr Asp Asp Val Lys Asp Val 
              210                 215                 220                 
          Arg Ala Leu Ile Lys Thr Leu Pro Arg Ala Ser Tyr Ser Ser His Ala 
          225                 230                 235                 240 
          Gly Gln Arg Ser Tyr Val Ser Gly Val Leu Pro Ala Cys Leu Leu Ser 
                          245                 250                 255     
          Thr Lys Ser Lys Ala Val Glu Thr Pro Ile Leu Val Ser Gly Ala Asp 
                      260                 265                 270         
          Arg Met Asp Glu Glu Leu Met Gly Asn Asp Gly Gly Ala Ser His Thr 
                  275                 280                 285             
          Glu Ala Arg Tyr Ser Glu Ser Gly Gln Phe His Ala Phe Thr Asp Glu 
              290                 295                 300                 
          Leu Glu Ser Leu Pro Ser Pro Thr Met Pro Leu Lys Pro Gly Ala Gln 
          305                 310                 315                 320 
          Ser Ala Asp Cys Gly Asp Ser Ser Ser Ser Ser Ser Asp Ser Gly Asn 
                          325                 330                 335     
          Ser Asp Thr Glu Gln Ser Glu Arg Glu Glu Ala Arg Ala Glu Ala Pro 
                      340                 345                 350         
          Arg Leu Arg Ala Pro Lys Ser Arg Arg Thr Ser Arg Pro Asn Arg Gly 
                  355                 360                 365             
          Gln Thr Pro Cys Pro Ser Asn Ala Ala Glu Pro Glu Gln Pro Trp Ile 
              370                 375                 380                 
          Ala Ala Val His Gln Glu Ser Asp Glu Arg Pro Ile Phe Pro His Pro 
          385                 390                 395                 400 
          Ser Lys Pro Thr Phe Leu Pro Pro Val Lys Arg Lys Lys Gly Leu Arg 
                          405                 410                 415     
          Asp Ser Arg Glu Gly Met Phe Leu Pro Lys Pro Glu Ala Gly Ser Ala 
                      420                 425                 430         
          Ile Ser Asp Val Phe Glu Gly Arg Glu Val Cys Gln Pro Lys Arg Ile 
                  435                 440                 445             
          Arg Pro Phe His Pro Pro Gly Ser Pro Trp Ala Asn Arg Pro Leu Pro 
              450                 455                 460                 
          Ala Ser Leu Ala Pro Thr Pro Thr Gly Pro Val His Glu Pro Val Gly 
          465                 470                 475                 480 
          Ser Leu Thr Pro Ala Pro Val Pro Gln Pro Leu Asp Pro Ala Pro Ala 
                          485                 490                 495     
          Val Thr Pro Glu Ala Ser His Leu Leu Glu Asp Pro Asp Glu Glu Thr 
                      500                 505                 510         
          Ser Gln Ala Val Lys Ala Leu Arg Glu Met Ala Asp Thr Val Ile Pro 
                  515                 520                 525             
          Gln Lys Glu Glu Ala Ala Ile Cys Gly Gln Met Asp Leu Ser His Pro 
              530                 535                 540                 
          Pro Pro Arg Gly His Leu Asp Glu Leu Thr Thr Thr Leu Glu Ser Met 
          545                 550                 555                 560 
          Thr Glu Asp Leu Asn Leu Asp Ser Pro Leu Thr Pro Glu Leu Asn Glu 
                          565                 570                 575     
          Ile Leu Asp Thr Phe Leu Asn Asp Glu Cys Leu Leu His Ala Met His 
                      580                 585                 590         
          Ile Ser Thr Gly Leu Ser Ile Phe Asp Thr Ser Leu Phe 
                  595                 600                 605 
          <![CDATA[<210>  57]]>
          <![CDATA[<211>  172]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  57]]>
          Arg Pro Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Gln Arg Gly 
          1               5                   10                  15      
          Asn Leu Val Arg His Ile Arg Thr His Thr Gly Glu Lys Pro Phe Ala 
                      20                  25                  30          
          Cys Asp Ile Cys Gly Lys Lys Phe Ala Leu Ser Phe Asn Leu Thr Arg 
                  35                  40                  45              
          His Thr Lys Ile His Thr Gly Ser Gln Lys Pro Phe Gln Cys Arg Ile 
              50                  55                  60                  
          Cys Met Arg Asn Phe Ser Arg Ser Asp Asn Leu Thr Arg His Ile Arg 
          65                  70                  75                  80  
          Thr His Thr Gly Glu Lys Pro Phe Ala Cys Asp Ile Cys Gly Lys Lys 
                          85                  90                  95      
          Phe Ala Asp Arg Ser His Leu Ala Arg His Thr Lys Ile His Thr Gly 
                      100                 105                 110         
          Ser Gln Lys Pro Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Gln 
                  115                 120                 125             
          Lys Ala His Leu Thr Ala His Ile Arg Thr His Thr Gly Glu Lys Pro 
              130                 135                 140                 
          Phe Ala Cys Asp Ile Cys Gly Arg Lys Phe Ala Arg Ser Asp Asn Leu 
          145                 150                 155                 160 
          Thr Arg His Thr Lys Ile His Leu Arg Gln Lys Asp 
                          165                 170         
          <![CDATA[<210>  58]]>
          <![CDATA[<211>  516]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  58]]>
          cgaccattcc agtgtcgaat ctgcatgcgc aacttcagcc agcggggaaa cctggtgagg       60
          catatccgca cccacacggg agagaagcct tttgcctgcg atatttgtgg aaagaagttt      120
          gctctgagct tcaatctaac cagacacacc aagattcata ctgggtccca gaaaccgttc      180
          cagtgtagga tatgcatgag gaatttctct cggagtgaca acttaacgcg gcatataagg      240
          acgcacacag gtgaaaaacc atttgcatgc gacatctgtg gcaaaaagtt tgcggaccgg      300
          tctcaccttg cccgacacac aaaaatccat accggcagtc aaaagccctt tcaatgtcgc      360
          atttgcatgc gaaacttctc acagaaggcc catttgactg cccatattcg tactcatact      420
          ggcgagaaac ctttcgcttg cgatatatgt ggtcgtaagt ttgcacggtc ggacaacctc      480
          acacgccaca ctaagataca cctgcggcag aaggac                                516
          <![CDATA[<210>  59]]>
          <![CDATA[<211>  172]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  59]]>
          Arg Pro Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Arg Ser Ser 
          1               5                   10                  15      
          Asn Leu Thr Arg His Ile Arg Thr His Thr Gly Glu Lys Pro Phe Ala 
                      20                  25                  30          
          Cys Asp Ile Cys Gly Lys Lys Phe Ala Asp Lys Arg Thr Leu Ile Arg 
                  35                  40                  45              
          His Thr Lys Ile His Thr Gly Ser Gln Lys Pro Phe Gln Cys Arg Ile 
              50                  55                  60                  
          Cys Met Arg Asn Phe Ser Gln Arg Gly Asn Leu Val Arg His Ile Arg 
          65                  70                  75                  80  
          Thr His Thr Gly Glu Lys Pro Phe Ala Cys Asp Ile Cys Gly Lys Lys 
                          85                  90                  95      
          Phe Ala Leu Ser Phe Asn Leu Thr Arg His Thr Lys Ile His Thr Gly 
                      100                 105                 110         
          Ser Gln Lys Pro Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Arg 
                  115                 120                 125             
          Ser Asp Asn Leu Thr Arg His Ile Arg Thr His Thr Gly Glu Lys Pro 
              130                 135                 140                 
          Phe Ala Cys Asp Ile Cys Gly Arg Lys Phe Ala Asp Arg Ser His Leu 
          145                 150                 155                 160 
          Ala Arg His Thr Lys Ile His Leu Arg Gln Lys Asp 
                          165                 170         
          <![CDATA[<210>  60]]>
          <![CDATA[<211>  516]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  60]]>
          cgaccattcc agtgtcgaat ctgcatgcgc aacttcagcc gaagttccaa cctgacacgg       60
          catatccgca cccacacggg agagaagcct tttgcctgcg atatttgtgg aaagaagttt      120
          gctgacaagc ggaccttaat ccgccacacc aagattcata ctgggtccca gaaaccgttc      180
          cagtgtagga tatgcatgag gaatttctct cagcggggaa atctagtgcg acatataagg      240
          acgcacacag gtgaaaaacc atttgcatgc gacatctgtg gcaaaaagtt tgcgctgagc      300
          ttcaacttga ctcgtcacac aaaaatccat accggcagtc aaaagccctt tcaatgtcgc      360
          atttgcatgc gaaacttctc acggagtgac aatcttacga gacatattcg tactcatact      420
          ggcgagaaac ctttcgcttg cgatatatgt ggtcgtaagt ttgcagaccg gagccactta      480
          gccaggcaca ctaagataca cctgcggcag aaggac                                516
          <![CDATA[<210>  61]]>
          <![CDATA[<211>  172]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  61]]>
          Arg Pro Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Asp Arg Ser 
          1               5                   10                  15      
          Ala Leu Ala Arg His Ile Arg Thr His Thr Gly Glu Lys Pro Phe Ala 
                      20                  25                  30          
          Cys Asp Ile Cys Gly Lys Lys Phe Ala Arg Ser Asp Asn Leu Thr Arg 
                  35                  40                  45              
          His Thr Lys Ile His Thr Gly Ser Gln Lys Pro Phe Gln Cys Arg Ile 
              50                  55                  60                  
          Cys Met Arg Asn Phe Ser Gln Ser Gly Asp Leu Thr Arg His Ile Arg 
          65                  70                  75                  80  
          Thr His Thr Gly Glu Lys Pro Phe Ala Cys Asp Ile Cys Gly Lys Lys 
                          85                  90                  95      
          Phe Ala Val Arg Gln Thr Leu Lys Gln His Thr Lys Ile His Thr Gly 
                      100                 105                 110         
          Ser Gln Lys Pro Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Ala 
                  115                 120                 125             
          Ala Gly Asn Leu Thr Arg His Ile Arg Thr His Thr Gly Glu Lys Pro 
              130                 135                 140                 
          Phe Ala Cys Asp Ile Cys Gly Arg Lys Phe Ala Arg Ser Asp Asn Leu 
          145                 150                 155                 160 
          Thr Arg His Thr Lys Ile His Leu Arg Gln Lys Asp 
                          165                 170         
          <![CDATA[<210>  62]]>
          <![CDATA[<211>  516]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  62]]>
          cgaccattcc agtgtcgaat ctgcatgcgc aacttcagcg accggagcgc gctggcacgg       60
          catatccgca cccacacggg agagaagcct tttgcctgcg atatttgtgg aaagaagttt      120
          gctcgaagtg acaacttaac gcgccacacc aagattcata ctgggtccca gaaaccgttc      180
          cagtgtagga tatgcatgag gaatttctct cagtcagggg acctcactcg tcatataagg      240
          acgcacacag gtgaaaaacc atttgcatgc gacatctgtg gcaaaaagtt tgcggtacga      300
          cagacgctta aacaacacac aaaaatccat accggcagtc aaaagccctt tcaatgtcgc      360
          atttgcatgc gaaacttctc agccgctggt aacttgacac gacatattcg tactcatact      420
          ggcgagaaac ctttcgcttg cgatatatgt ggtcgtaagt ttgcaagatc tgataatcta      480
          acgcgtcaca ctaagataca cctgcggcag aaggac                                516
          <![CDATA[<210>  63]]>
          <![CDATA[<211>  28]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  63]]>
          Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Gln Arg Gly Asn Leu 
          1               5                   10                  15      
          Val Arg His Ile Arg Thr His Thr Gly Glu Lys Pro 
                      20                  25              
          <![CDATA[<210>  64]]>
          <![CDATA[<211>  29]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  64]]>
          Phe Ala Cys Asp Ile Cys Gly Lys Lys Phe Ala Leu Ser Phe Asn Leu 
          1               5                   10                  15      
          Thr Arg His Thr Lys Ile His Thr Gly Ser Gln Lys Pro 
                      20                  25                  
          <![CDATA[<210>  65]]>
          <![CDATA[<211>  28]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  65]]>
          Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Arg Ser Asp Asn Leu 
          1               5                   10                  15      
          Thr Arg His Ile Arg Thr His Thr Gly Glu Lys Pro 
                      20                  25              
          <![CDATA[<210>  66]]>
          <![CDATA[<211>  29]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  66]]>
          Phe Ala Cys Asp Ile Cys Gly Lys Lys Phe Ala Asp Arg Ser His Leu 
          1               5                   10                  15      
          Ala Arg His Thr Lys Ile His Thr Gly Ser Gln Lys Pro 
                      20                  25                  
          <![CDATA[<210>  67]]>
          <![CDATA[<211>  28]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  67]]>
          Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Gln Lys Ala His Leu 
          1               5                   10                  15      
          Thr Ala His Ile Arg Thr His Thr Gly Glu Lys Pro 
                      20                  25              
          <![CDATA[<210>  68]]>
          <![CDATA[<211>  28]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  68]]>
          Phe Ala Cys Asp Ile Cys Gly Arg Lys Phe Ala Arg Ser Asp Asn Leu 
          1               5                   10                  15      
          Thr Arg His Thr Lys Ile His Leu Arg Gln Lys Asp 
                      20                  25              
          <![CDATA[<210>  69]]>
          <![CDATA[<211>  28]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  69]]>
          Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Arg Ser Ser Asn Leu 
          1               5                   10                  15      
          Thr Arg His Ile Arg Thr His Thr Gly Glu Lys Pro 
                      20                  25              
          <![CDATA[<210>  70]]>
          <![CDATA[<211>  29]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  70]]>
          Phe Ala Cys Asp Ile Cys Gly Lys Lys Phe Ala Asp Lys Arg Thr Leu 
          1               5                   10                  15      
          Ile Arg His Thr Lys Ile His Thr Gly Ser Gln Lys Pro 
                      20                  25                  
          <![CDATA[<210>  71]]>
          <![CDATA[<211>  28]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  71]]>
          Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Gln Arg Gly Asn Leu 
          1               5                   10                  15      
          Val Arg His Ile Arg Thr His Thr Gly Glu Lys Pro 
                      20                  25              
          <![CDATA[<210>  72]]>
          <![CDATA[<211>  29]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  72]]>
          Phe Ala Cys Asp Ile Cys Gly Lys Lys Phe Ala Leu Ser Phe Asn Leu 
          1               5                   10                  15      
          Thr Arg His Thr Lys Ile His Thr Gly Ser Gln Lys Pro 
                      20                  25                  
          <![CDATA[<210>  73]]>
          <![CDATA[<211>  28]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  73]]>
          Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Arg Ser Asp Asn Leu 
          1               5                   10                  15      
          Thr Arg His Ile Arg Thr His Thr Gly Glu Lys Pro 
                      20                  25              
          <![CDATA[<210>  74]]>
          <![CDATA[<211>  28]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  74]]>
          Phe Ala Cys Asp Ile Cys Gly Arg Lys Phe Ala Asp Arg Ser His Leu 
          1               5                   10                  15      
          Ala Arg His Thr Lys Ile His Leu Arg Gln Lys Asp 
                      20                  25              
          <![CDATA[<210>  75]]>
          <![CDATA[<211>  28]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  75]]>
          Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Asp Arg Ser Ala Leu 
          1               5                   10                  15      
          Ala Arg His Ile Arg Thr His Thr Gly Glu Lys Pro 
                      20                  25              
          <![CDATA[<210>  76]]>
          <![CDATA[<211>  29]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  76]]>
          Phe Ala Cys Asp Ile Cys Gly Lys Lys Phe Ala Arg Ser Asp Asn Leu 
          1               5                   10                  15      
          Thr Arg His Thr Lys Ile His Thr Gly Ser Gln Lys Pro 
                      20                  25                  
          <![CDATA[<210>  77]]>
          <![CDATA[<211>  28]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  77]]>
          Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Gln Ser Gly Asp Leu 
          1               5                   10                  15      
          Thr Arg His Ile Arg Thr His Thr Gly Glu Lys Pro 
                      20                  25              
          <![CDATA[<210>  78]]>
          <![CDATA[<211>  29]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  78]]>
          Phe Ala Cys Asp Ile Cys Gly Lys Lys Phe Ala Val Arg Gln Thr Leu 
          1               5                   10                  15      
          Lys Gln His Thr Lys Ile His Thr Gly Ser Gln Lys Pro 
                      20                  25                  
          <![CDATA[<210>  79]]>
          <![CDATA[<211>  28]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  79]]>
          Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Ala Ala Gly Asn Leu 
          1               5                   10                  15      
          Thr Arg His Ile Arg Thr His Thr Gly Glu Lys Pro 
                      20                  25              
          <![CDATA[<210>  80]]>
          <![CDATA[<211>  28]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  80]]>
          Phe Ala Cys Asp Ile Cys Gly Arg Lys Phe Ala Arg Ser Asp Asn Leu 
          1               5                   10                  15      
          Thr Arg His Thr Lys Ile His Leu Arg Gln Lys Asp 
                      20                  25              
          <![CDATA[<210>  81]]>
          <![CDATA[<211>  4104]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  81]]>
          atggacaaga agtactccat tgggctcgct atcggtacca acagcgtcgg ctgggccgtc       60
          attacggacg agtacaaggt gccgagcaaa aaattcaaag ttctgggcaa taccgatcgc      120
          cacagcataa agaagaacct cattggagcc ctcctgttcg actccgggga gacggccgaa      180
          gccacgcggc tcaaaagaac agcacggcgc agatataccc gcagaaagaa tcggatctgc      240
          tacctgcagg agatctttag taatgagatg gctaaggtgg atgactcttt cttccatagg      300
          ctggaggagt cctttttggt ggaggaggat aaaaagcacg agcgccaccc aatctttggc      360
          aatatcgtgg acgaggtggc gtaccatgaa aagtacccaa ccatatatca tctgaggaag      420
          aagctggtag acagtactga taaggctgac ttgcggttga tctatctcgc gctggcgcac      480
          atgatcaaat ttcggggaca cttcctcatc gagggggacc tgaacccaga caacagcgat      540
          gtcgacaaac tctttatcca actggttcag acttacaatc agcttttcga ggagaacccg      600
          atcaacgcat ccggcgttga cgccaaagca atcctgagcg ctaggctgtc caaatcccgg      660
          cggctcgaaa acctcatcgc acagctccct ggggagaaga agaacggcct gtttggtaat      720
          cttatcgccc tgtcactcgg gctgaccccc aactttaaat ctaacttcga cctggccgaa      780
          gatgccaagc tgcaactgag caaagacacc tacgatgatg atctcgacaa tctgctggcc      840
          cagatcggcg accagtacgc agaccttttt ttggcggcaa agaacctgtc agacgccatt      900
          ctgctgagtg atattctgcg agtgaacacg gagatcacca aagctccgct gagcgctagt      960
          atgatcaagc gctatgatga gcaccaccaa gacttgactt tgctgaaggc ccttgtcaga     1020
          cagcaactgc ctgagaagta caaggaaatt ttcttcgatc agtctaaaaa tggctacgcc     1080
          ggatacattg acggcggagc aagccaggag gaattttaca aatttattaa gcccatcttg     1140
          gaaaaaatgg acggcaccga ggagctgctg gtaaagctga acagagaaga tctgttgcgc     1200
          aaacagcgca ctttcgacaa tggaagcatc ccccaccaga ttcacctggg cgaactgcac     1260
          gctatcctca ggcggcaaga ggatttctac ccctttttga aagataacag ggaaaagatt     1320
          gagaaaatcc tcacatttcg gataccctac tatgtaggcc ccctcgctcg gggaaattcc     1380
          agattcgcgt ggatgactcg caaatcagaa gagaccatca ctccctggaa cttcgaggaa     1440
          gtcgtggata agggggcctc tgcccagtcc ttcatcgaaa ggatgactaa ctttgataaa     1500
          aatctgccta acgaaaaggt gcttcctaaa cactctctgc tgtacgagta cttcacagtt     1560
          tataacgagc tcaccaaggt caaatacgtc acagaaggga tgagaaagcc agcattcctg     1620
          tctggagagc agaagaaagc tatcgtggac ctcctcttca agacgaaccg gaaagttacc     1680
          gtgaaacagc tcaaagaaga ctatttcaaa aagattgaat gtttcgactc tgttgaaatc     1740
          agcggagtgg aggatcgctt caacgcatcc ctgggaacgt atcacgatct cctgaaaatc     1800
          attaaagaca aggacttcct ggacaatgag gagaacgagg acattcttga ggacattgtc     1860
          ctcaccctta cgttgtttga agatagggag atgattgaag aacgcttgaa aacttacgct     1920
          catctcttcg acgacaaagt catgaaacag ctcaagagac gccgatatac aggatggggg     1980
          cggctgtcaa gaaaactgat caatggcatc cgagacaagc agagtggaaa gacaatcctg     2040
          gattttctta agtccgatgg atttgccaac cggaacttca tgcagttgat ccatgatgac     2100
          tctctcacct ttaaggagga catccagaaa gcacaagttt ctggccaggg ggacagtctt     2160
          cacgagcaca tcgctaatct tgcaggtagc ccagctatca aaaagggaat actgcagacc     2220
          gttaaggtcg tggatgaact cgtcaaagta atgggaaggc ataagcccga gaatatcgtt     2280
          atcgagatgg cccgagagaa ccaaactacc cagaagggac agaagaacag tagggaaagg     2340
          atgaagagga ttgaagaggg tataaaagaa ctggggtccc aaatccttaa ggaacaccca     2400
          gttgaaaaca cccagcttca gaatgagaag ctctacctgt actacctgca gaacggcagg     2460
          gacatgtacg tggatcagga actggacatc aaccggttgt ccgactacga cgtggatgct     2520
          atcgtgcccc aaagctttct caaagatgat tctattgata ataaagtgtt gacaagatcc     2580
          gataaaaata gagggaagag tgataacgtc ccctcagaag aagttgtcaa gaaaatgaaa     2640
          aattattggc ggcagctgct gaacgccaaa ctgatcacac aacggaagtt cgataatctg     2700
          actaaggctg aacgaggtgg cctgtctgag ttggataaag ccggcttcat caaaaggcag     2760
          cttgttgaga cacgccagat caccaagcac gtggcccaaa ttctcgattc acgcatgaac     2820
          accaagtacg atgaaaatga caaactgatt cgagaggtga aagttattac tctgaagtct     2880
          aagctggtct cagatttcag aaaggacttt cagttttata aggtgagaga gatcaacaat     2940
          taccaccatg cgcatgatgc ctacctgaat gcagtggtag gcactgcact tatcaaaaaa     3000
          tatcccaagc tggaatctga atttgtttac ggagactata aagtgtacga tgttaggaaa     3060
          atgatcgcaa agtctgagca ggaaataggc aaggccaccg ctaagtactt cttttacagc     3120
          aatattatga attttttcaa gaccgagatt acactggcca atggagagat tcggaagcga     3180
          ccacttatcg aaacaaacgg agaaacagga gaaatcgtgt gggacaaggg tagggatttc     3240
          gcgacagtcc gcaaggtcct gtccatgccg caggtgaaca tcgttaaaaa gaccgaagta     3300
          cagaccggag gcttctccaa ggaaagtatc ctcccgaaaa ggaacagcga caagctgatc     3360
          gcacgcaaaa aagattggga ccccaagaaa tacggcggat tcgattctcc tacagtcgct     3420
          tacagtgtac tggttgtggc caaagtggag aaagggaagt ctaaaaaact caaaagcgtc     3480
          aaggaactgc tgggcatcac aatcatggag cgatccagct tcgagaaaaa ccccatcgac     3540
          tttctcgaag cgaaaggata taaagaggtc aaaaaagacc tcatcattaa gctgcccaag     3600
          tactctctct ttgagcttga aaacggccgg aaacgaatgc tcgctagtgc gggcgagctg     3660
          cagaaaggta acgagctggc actgccctct aaatacgtta atttcttgta tctggccagc     3720
          cactatgaaa agctcaaagg gtctcccgaa gataatgagc agaagcagct gttcgtggaa     3780
          caacacaaac actaccttga tgagatcatc gagcaaataa gcgagttctc caaaagagtg     3840
          atcctcgccg acgctaacct cgataaggtg ctttctgctt acaataagca cagggataag     3900
          cccatcaggg agcaggcaga aaacattatc cacttgttta ctctgaccaa cttgggcgcg     3960
          cctgcagcct tcaagtactt cgacaccacc atagacagaa agcggtacac ctctacaaag     4020
          gaggtcctgg acgccacact gattcatcag tcaattacgg ggctctatga aacaagaatc     4080
          gacctctctc agctcggtgg agac                                            4104
          <![CDATA[<210>  82]]>
          <![CDATA[<211>  1368]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  82]]>
          Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val 
          1               5                   10                  15      
          Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe 
                      20                  25                  30          
          Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile 
                  35                  40                  45              
          Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu 
              50                  55                  60                  
          Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys 
          65                  70                  75                  80  
          Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser 
                          85                  90                  95      
          Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys 
                      100                 105                 110         
          His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr 
                  115                 120                 125             
          His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp 
              130                 135                 140                 
          Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His 
          145                 150                 155                 160 
          Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro 
                          165                 170                 175     
          Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr 
                      180                 185                 190         
          Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala 
                  195                 200                 205             
          Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn 
              210                 215                 220                 
          Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn 
          225                 230                 235                 240 
          Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe 
                          245                 250                 255     
          Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp 
                      260                 265                 270         
          Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp 
                  275                 280                 285             
          Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp 
              290                 295                 300                 
          Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser 
          305                 310                 315                 320 
          Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys 
                          325                 330                 335     
          Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe 
                      340                 345                 350         
          Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser 
                  355                 360                 365             
          Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp 
              370                 375                 380                 
          Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg 
          385                 390                 395                 400 
          Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu 
                          405                 410                 415     
          Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe 
                      420                 425                 430         
          Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile 
                  435                 440                 445             
          Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp 
              450                 455                 460                 
          Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu 
          465                 470                 475                 480 
          Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr 
                          485                 490                 495     
          Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser 
                      500                 505                 510         
          Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys 
                  515                 520                 525             
          Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln 
              530                 535                 540                 
          Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr 
          545                 550                 555                 560 
          Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp 
                          565                 570                 575     
          Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly 
                      580                 585                 590         
          Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp 
                  595                 600                 605             
          Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr 
              610                 615                 620                 
          Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala 
          625                 630                 635                 640 
          His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr 
                          645                 650                 655     
          Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp 
                      660                 665                 670         
          Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe 
                  675                 680                 685             
          Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe 
              690                 695                 700                 
          Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu 
          705                 710                 715                 720 
          His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly 
                          725                 730                 735     
          Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly 
                      740                 745                 750         
          Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln 
                  755                 760                 765             
          Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile 
              770                 775                 780                 
          Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro 
          785                 790                 795                 800 
          Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu 
                          805                 810                 815     
          Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg 
                      820                 825                 830         
          Leu Ser Asp Tyr Asp Val Asp Ala Ile Val Pro Gln Ser Phe Leu Lys 
                  835                 840                 845             
          Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg 
              850                 855                 860                 
          Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys 
          865                 870                 875                 880 
          Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys 
                          885                 890                 895     
          Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp 
                      900                 905                 910         
          Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr 
                  915                 920                 925             
          Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp 
              930                 935                 940                 
          Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser 
          945                 950                 955                 960 
          Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg 
                          965                 970                 975     
          Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val 
                      980                 985                 990         
          Val Gly Thr Ala Leu Ile Lys Lys  Tyr Pro Lys Leu Glu  Ser Glu Phe 
                  995                 1000                 1005             
          Val Tyr  Gly Asp Tyr Lys Val  Tyr Asp Val Arg Lys  Met Ile Ala 
              1010                 1015                 1020             
          Lys Ser  Glu Gln Glu Ile Gly  Lys Ala Thr Ala Lys  Tyr Phe Phe 
              1025                 1030                 1035             
          Tyr Ser  Asn Ile Met Asn Phe  Phe Lys Thr Glu Ile  Thr Leu Ala 
              1040                 1045                 1050             
          Asn Gly  Glu Ile Arg Lys Arg  Pro Leu Ile Glu Thr  Asn Gly Glu 
              1055                 1060                 1065             
          Thr Gly  Glu Ile Val Trp Asp  Lys Gly Arg Asp Phe  Ala Thr Val 
              1070                 1075                 1080             
          Arg Lys  Val Leu Ser Met Pro  Gln Val Asn Ile Val  Lys Lys Thr 
              1085                 1090                 1095             
          Glu Val  Gln Thr Gly Gly Phe  Ser Lys Glu Ser Ile  Leu Pro Lys 
              1100                 1105                 1110             
          Arg Asn  Ser Asp Lys Leu Ile  Ala Arg Lys Lys Asp  Trp Asp Pro 
              1115                 1120                 1125             
          Lys Lys  Tyr Gly Gly Phe Asp  Ser Pro Thr Val Ala  Tyr Ser Val 
              1130                 1135                 1140             
          Leu Val  Val Ala Lys Val Glu  Lys Gly Lys Ser Lys  Lys Leu Lys 
              1145                 1150                 1155             
          Ser Val  Lys Glu Leu Leu Gly  Ile Thr Ile Met Glu  Arg Ser Ser 
              1160                 1165                 1170             
          Phe Glu  Lys Asn Pro Ile Asp  Phe Leu Glu Ala Lys  Gly Tyr Lys 
              1175                 1180                 1185             
          Glu Val  Lys Lys Asp Leu Ile  Ile Lys Leu Pro Lys  Tyr Ser Leu 
              1190                 1195                 1200             
          Phe Glu  Leu Glu Asn Gly Arg  Lys Arg Met Leu Ala  Ser Ala Gly 
              1205                 1210                 1215             
          Glu Leu  Gln Lys Gly Asn Glu  Leu Ala Leu Pro Ser  Lys Tyr Val 
              1220                 1225                 1230             
          Asn Phe  Leu Tyr Leu Ala Ser  His Tyr Glu Lys Leu  Lys Gly Ser 
              1235                 1240                 1245             
          Pro Glu  Asp Asn Glu Gln Lys  Gln Leu Phe Val Glu  Gln His Lys 
              1250                 1255                 1260             
          His Tyr  Leu Asp Glu Ile Ile  Glu Gln Ile Ser Glu  Phe Ser Lys 
              1265                 1270                 1275             
          Arg Val  Ile Leu Ala Asp Ala  Asn Leu Asp Lys Val  Leu Ser Ala 
              1280                 1285                 1290             
          Tyr Asn  Lys His Arg Asp Lys  Pro Ile Arg Glu Gln  Ala Glu Asn 
              1295                 1300                 1305             
          Ile Ile  His Leu Phe Thr Leu  Thr Asn Leu Gly Ala  Pro Ala Ala 
              1310                 1315                 1320             
          Phe Lys  Tyr Phe Asp Thr Thr  Ile Asp Arg Lys Arg  Tyr Thr Ser 
              1325                 1330                 1335             
          Thr Lys  Glu Val Leu Asp Ala  Thr Leu Ile His Gln  Ser Ile Thr 
              1340                 1345                 1350             
          Gly Leu  Tyr Glu Thr Arg Ile  Asp Leu Ser Gln Leu  Gly Gly Asp 
              1355                 1360                 1365             
          <![CDATA[<210>  83]]>
          <![CDATA[<211>  97]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  83]]>
          gaggtaccat agagtgaggc ggttttagag ctagaaatag caagttaaaa taaggctagt       60
          ccgttatcaa cttgaaaaag tggcaccgag tcggtgc                                97
          <![CDATA[<210>  84]]>
          <![CDATA[<211>  97]]>
          <![CDATA[<212>  RNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  84]]>
          gagguaccau agagugaggc gguuuuagag cuagaaauag caaguuaaaa uaaggcuagu       60
          ccguuaucaa cuugaaaaag uggcaccgag ucggugc                                97
          <![CDATA[<210>  85]]>
          <![CDATA[<211>  21]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  85]]>
          gaggtaccat agagtgaggc g                                                 21
          <![CDATA[<210>  86]]>
          <![CDATA[<211>  21]]>
          <![CDATA[<212>  RNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  86]]>
          gagguaccau agagugaggc g                                                 21
          <![CDATA[<210>  87]]>
          <![CDATA[<211>  99]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  87]]>
          accgaggcga ggatgaagcc gaggttttag agctagaaat agcaagttaa aataaggcta       60
          gtccgttatc aacttgaaaa agtggcaccg agtcggtgc                              99
          <![CDATA[<210>  88]]>
          <![CDATA[<211>  99]]>
          <![CDATA[<212>  RNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  88]]>
          accgaggcga ggaugaagcc gagguuuuag agcuagaaau agcaaguuaa aauaaggcua       60
          guccguuauc aacuugaaaa aguggcaccg agucggugc                              99
          <![CDATA[<210>  89]]>
          <![CDATA[<211>  23]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  89]]>
          accgaggcga ggatgaagcc gag                                               23
          <![CDATA[<210>  90]]>
          <![CDATA[<211>  23]]>
          <![CDATA[<212>  RNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  90]]>
          accgaggcga ggaugaagcc gag                                               23
          <![CDATA[<210>  91]]>
          <![CDATA[<211>  100]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  91]]>
          accgaagccg agaggatact gcaggtttta gagctagaaa tagcaagtta aaataaggct       60
          agtccgttat caacttgaaa aagtggcacc gagtcggtgc                            100
          <![CDATA[<210>  92]]>
          <![CDATA[<211>  100]]>
          <![CDATA[<212>  RNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  92]]>
          accgaagccg agaggauacu gcagguuuua gagcuagaaa uagcaaguua aaauaaggcu       60
          aguccguuau caacuugaaa aaguggcacc gagucggugc                            100
          <![CDATA[<210>  93]]>
          <![CDATA[<211>  24]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  93]]>
          accgaagccg agaggatact gcag                                              24
          <![CDATA[<210>  94]]>
          <![CDATA[<211>  24]]>
          <![CDATA[<212>  RNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  94]]>
          accgaagccg agaggauacu gcag                                              24
          <![CDATA[<210>  95]]>
          <![CDATA[<211>  1569]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  95]]>
          gacgcattgg acgattttga tctggatatg ctgggaagtg acgccctcga tgattttgac       60
          cttgacatgc ttggttcgga tgcccttgat gactttgacc tcgacatgct cggcagtgac      120
          gcccttgatg atttcgacct ggacatgctg attaactcta gaagttccgg atctccgaaa      180
          aagaaacgca aagttggtag ccagtacctg cccgacaccg acgaccggca ccggatcgag      240
          gaaaagcgga agcggaccta cgagacattc aagagcatca tgaagaagtc ccccttcagc      300
          ggccccaccg accctagacc tccacctaga agaatcgccg tgcccagcag atccagcgcc      360
          agcgtgccaa aacctgcccc ccagccttac cccttcacca gcagcctgag caccatcaac      420
          tacgacgagt tccctaccat ggtgttcccc agcggccaga tctctcaggc ctctgctctg      480
          gctccagccc ctcctcaggt gctgcctcag gctcctgctc ctgcaccagc tccagccatg      540
          gtgtctgcac tggctcaggc accagcaccc gtgcctgtgc tggctcctgg acctccacag      600
          gctgtggctc caccagcccc taaacctaca caggccggcg agggcacact gtctgaagct      660
          ctgctgcagc tgcagttcga cgacgaggat ctgggagccc tgctgggaaa cagcaccgat      720
          cctgccgtgt tcaccgacct ggccagcgtg gacaacagcg agttccagca gctgctgaac      780
          cagggcatcc ctgtggcccc tcacaccacc gagcccatgc tgatggaata ccccgaggcc      840
          atcacccggc tcgtgacagg cgctcagagg cctcctgatc cagctcctgc ccctctggga      900
          gcaccaggcc tgcctaatgg actgctgtct ggcgacgagg acttcagctc tatcgccgat      960
          atggatttct cagccttgct gggctctggc agcggcagcc gggattccag ggaagggatg     1020
          tttttgccga agcctgaggc cggctccgct attagtgacg tgtttgaggg ccgcgaggtg     1080
          tgccagccaa aacgaatccg gccatttcat cctccaggaa gtccatgggc caaccgccca     1140
          ctccccgcca gcctcgcacc aacaccaacc ggtccagtac atgagccagt cgggtcactg     1200
          accccggcac cagtccctca gccactggat ccagcgcccg cagtgactcc cgaggccagt     1260
          cacctgttgg aggatcccga tgaagagacg agccaggctg tcaaagccct tcgggagatg     1320
          gccgatactg tgattcccca gaaggaagag gctgcaatct gtggccaaat ggacctttcc     1380
          catccgcccc caaggggcca tctggatgag ctgacaacca cacttgagtc catgaccgag     1440
          gatctgaacc tggactcacc cctgaccccg gaattgaacg agattctgga taccttcctg     1500
          aacgacgagt gcctcttgca tgccatgcat atcagcacag gactgtccat cttcgacaca     1560
          tctctgttt                                                             1569
          <![CDATA[<210>  96]]>
          <![CDATA[<211>  523]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  96]]>
          Asp Ala Leu Asp Asp Phe Asp Leu Asp Met Leu Gly Ser Asp Ala Leu 
          1               5                   10                  15      
          Asp Asp Phe Asp Leu Asp Met Leu Gly Ser Asp Ala Leu Asp Asp Phe 
                      20                  25                  30          
          Asp Leu Asp Met Leu Gly Ser Asp Ala Leu Asp Asp Phe Asp Leu Asp 
                  35                  40                  45              
          Met Leu Ile Asn Ser Arg Ser Ser Gly Ser Pro Lys Lys Lys Arg Lys 
              50                  55                  60                  
          Val Gly Ser Gln Tyr Leu Pro Asp Thr Asp Asp Arg His Arg Ile Glu 
          65                  70                  75                  80  
          Glu Lys Arg Lys Arg Thr Tyr Glu Thr Phe Lys Ser Ile Met Lys Lys 
                          85                  90                  95      
          Ser Pro Phe Ser Gly Pro Thr Asp Pro Arg Pro Pro Pro Arg Arg Ile 
                      100                 105                 110         
          Ala Val Pro Ser Arg Ser Ser Ala Ser Val Pro Lys Pro Ala Pro Gln 
                  115                 120                 125             
          Pro Tyr Pro Phe Thr Ser Ser Leu Ser Thr Ile Asn Tyr Asp Glu Phe 
              130                 135                 140                 
          Pro Thr Met Val Phe Pro Ser Gly Gln Ile Ser Gln Ala Ser Ala Leu 
          145                 150                 155                 160 
          Ala Pro Ala Pro Pro Gln Val Leu Pro Gln Ala Pro Ala Pro Ala Pro 
                          165                 170                 175     
          Ala Pro Ala Met Val Ser Ala Leu Ala Gln Ala Pro Ala Pro Val Pro 
                      180                 185                 190         
          Val Leu Ala Pro Gly Pro Pro Gln Ala Val Ala Pro Pro Ala Pro Lys 
                  195                 200                 205             
          Pro Thr Gln Ala Gly Glu Gly Thr Leu Ser Glu Ala Leu Leu Gln Leu 
              210                 215                 220                 
          Gln Phe Asp Asp Glu Asp Leu Gly Ala Leu Leu Gly Asn Ser Thr Asp 
          225                 230                 235                 240 
          Pro Ala Val Phe Thr Asp Leu Ala Ser Val Asp Asn Ser Glu Phe Gln 
                          245                 250                 255     
          Gln Leu Leu Asn Gln Gly Ile Pro Val Ala Pro His Thr Thr Glu Pro 
                      260                 265                 270         
          Met Leu Met Glu Tyr Pro Glu Ala Ile Thr Arg Leu Val Thr Gly Ala 
                  275                 280                 285             
          Gln Arg Pro Pro Asp Pro Ala Pro Ala Pro Leu Gly Ala Pro Gly Leu 
              290                 295                 300                 
          Pro Asn Gly Leu Leu Ser Gly Asp Glu Asp Phe Ser Ser Ile Ala Asp 
          305                 310                 315                 320 
          Met Asp Phe Ser Ala Leu Leu Gly Ser Gly Ser Gly Ser Arg Asp Ser 
                          325                 330                 335     
          Arg Glu Gly Met Phe Leu Pro Lys Pro Glu Ala Gly Ser Ala Ile Ser 
                      340                 345                 350         
          Asp Val Phe Glu Gly Arg Glu Val Cys Gln Pro Lys Arg Ile Arg Pro 
                  355                 360                 365             
          Phe His Pro Pro Gly Ser Pro Trp Ala Asn Arg Pro Leu Pro Ala Ser 
              370                 375                 380                 
          Leu Ala Pro Thr Pro Thr Gly Pro Val His Glu Pro Val Gly Ser Leu 
          385                 390                 395                 400 
          Thr Pro Ala Pro Val Pro Gln Pro Leu Asp Pro Ala Pro Ala Val Thr 
                          405                 410                 415     
          Pro Glu Ala Ser His Leu Leu Glu Asp Pro Asp Glu Glu Thr Ser Gln 
                      420                 425                 430         
          Ala Val Lys Ala Leu Arg Glu Met Ala Asp Thr Val Ile Pro Gln Lys 
                  435                 440                 445             
          Glu Glu Ala Ala Ile Cys Gly Gln Met Asp Leu Ser His Pro Pro Pro 
              450                 455                 460                 
          Arg Gly His Leu Asp Glu Leu Thr Thr Thr Leu Glu Ser Met Thr Glu 
          465                 470                 475                 480 
          Asp Leu Asn Leu Asp Ser Pro Leu Thr Pro Glu Leu Asn Glu Ile Leu 
                          485                 490                 495     
          Asp Thr Phe Leu Asn Asp Glu Cys Leu Leu His Ala Met His Ile Ser 
                      500                 505                 510         
          Thr Gly Leu Ser Ile Phe Asp Thr Ser Leu Phe 
                  515                 520             
          <![CDATA[<210>  97]]>
          <![CDATA[<211>  1939]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  97]]>
          Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val 
          1               5                   10                  15      
          Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe 
                      20                  25                  30          
          Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile 
                  35                  40                  45              
          Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu 
              50                  55                  60                  
          Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys 
          65                  70                  75                  80  
          Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser 
                          85                  90                  95      
          Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys 
                      100                 105                 110         
          His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr 
                  115                 120                 125             
          His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp 
              130                 135                 140                 
          Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His 
          145                 150                 155                 160 
          Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro 
                          165                 170                 175     
          Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr 
                      180                 185                 190         
          Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala 
                  195                 200                 205             
          Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn 
              210                 215                 220                 
          Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn 
          225                 230                 235                 240 
          Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe 
                          245                 250                 255     
          Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp 
                      260                 265                 270         
          Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp 
                  275                 280                 285             
          Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp 
              290                 295                 300                 
          Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser 
          305                 310                 315                 320 
          Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys 
                          325                 330                 335     
          Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe 
                      340                 345                 350         
          Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser 
                  355                 360                 365             
          Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp 
              370                 375                 380                 
          Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg 
          385                 390                 395                 400 
          Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu 
                          405                 410                 415     
          Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe 
                      420                 425                 430         
          Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile 
                  435                 440                 445             
          Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp 
              450                 455                 460                 
          Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu 
          465                 470                 475                 480 
          Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr 
                          485                 490                 495     
          Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser 
                      500                 505                 510         
          Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys 
                  515                 520                 525             
          Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln 
              530                 535                 540                 
          Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr 
          545                 550                 555                 560 
          Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp 
                          565                 570                 575     
          Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly 
                      580                 585                 590         
          Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp 
                  595                 600                 605             
          Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr 
              610                 615                 620                 
          Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala 
          625                 630                 635                 640 
          His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr 
                          645                 650                 655     
          Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp 
                      660                 665                 670         
          Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe 
                  675                 680                 685             
          Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe 
              690                 695                 700                 
          Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu 
          705                 710                 715                 720 
          His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly 
                          725                 730                 735     
          Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly 
                      740                 745                 750         
          Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln 
                  755                 760                 765             
          Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile 
              770                 775                 780                 
          Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro 
          785                 790                 795                 800 
          Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu 
                          805                 810                 815     
          Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg 
                      820                 825                 830         
          Leu Ser Asp Tyr Asp Val Asp Ala Ile Val Pro Gln Ser Phe Leu Lys 
                  835                 840                 845             
          Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg 
              850                 855                 860                 
          Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys 
          865                 870                 875                 880 
          Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys 
                          885                 890                 895     
          Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp 
                      900                 905                 910         
          Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr 
                  915                 920                 925             
          Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp 
              930                 935                 940                 
          Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser 
          945                 950                 955                 960 
          Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg 
                          965                 970                 975     
          Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val 
                      980                 985                 990         
          Val Gly Thr Ala Leu Ile Lys Lys  Tyr Pro Lys Leu Glu  Ser Glu Phe 
                  995                 1000                 1005             
          Val Tyr  Gly Asp Tyr Lys Val  Tyr Asp Val Arg Lys  Met Ile Ala 
              1010                 1015                 1020             
          Lys Ser  Glu Gln Glu Ile Gly  Lys Ala Thr Ala Lys  Tyr Phe Phe 
              1025                 1030                 1035             
          Tyr Ser  Asn Ile Met Asn Phe  Phe Lys Thr Glu Ile  Thr Leu Ala 
              1040                 1045                 1050             
          Asn Gly  Glu Ile Arg Lys Arg  Pro Leu Ile Glu Thr  Asn Gly Glu 
              1055                 1060                 1065             
          Thr Gly  Glu Ile Val Trp Asp  Lys Gly Arg Asp Phe  Ala Thr Val 
              1070                 1075                 1080             
          Arg Lys  Val Leu Ser Met Pro  Gln Val Asn Ile Val  Lys Lys Thr 
              1085                 1090                 1095             
          Glu Val  Gln Thr Gly Gly Phe  Ser Lys Glu Ser Ile  Leu Pro Lys 
              1100                 1105                 1110             
          Arg Asn  Ser Asp Lys Leu Ile  Ala Arg Lys Lys Asp  Trp Asp Pro 
              1115                 1120                 1125             
          Lys Lys  Tyr Gly Gly Phe Asp  Ser Pro Thr Val Ala  Tyr Ser Val 
              1130                 1135                 1140             
          Leu Val  Val Ala Lys Val Glu  Lys Gly Lys Ser Lys  Lys Leu Lys 
              1145                 1150                 1155             
          Ser Val  Lys Glu Leu Leu Gly  Ile Thr Ile Met Glu  Arg Ser Ser 
              1160                 1165                 1170             
          Phe Glu  Lys Asn Pro Ile Asp  Phe Leu Glu Ala Lys  Gly Tyr Lys 
              1175                 1180                 1185             
          Glu Val  Lys Lys Asp Leu Ile  Ile Lys Leu Pro Lys  Tyr Ser Leu 
              1190                 1195                 1200             
          Phe Glu  Leu Glu Asn Gly Arg  Lys Arg Met Leu Ala  Ser Ala Gly 
              1205                 1210                 1215             
          Glu Leu  Gln Lys Gly Asn Glu  Leu Ala Leu Pro Ser  Lys Tyr Val 
              1220                 1225                 1230             
          Asn Phe  Leu Tyr Leu Ala Ser  His Tyr Glu Lys Leu  Lys Gly Ser 
              1235                 1240                 1245             
          Pro Glu  Asp Asn Glu Gln Lys  Gln Leu Phe Val Glu  Gln His Lys 
              1250                 1255                 1260             
          His Tyr  Leu Asp Glu Ile Ile  Glu Gln Ile Ser Glu  Phe Ser Lys 
              1265                 1270                 1275             
          Arg Val  Ile Leu Ala Asp Ala  Asn Leu Asp Lys Val  Leu Ser Ala 
              1280                 1285                 1290             
          Tyr Asn  Lys His Arg Asp Lys  Pro Ile Arg Glu Gln  Ala Glu Asn 
              1295                 1300                 1305             
          Ile Ile  His Leu Phe Thr Leu  Thr Asn Leu Gly Ala  Pro Ala Ala 
              1310                 1315                 1320             
          Phe Lys  Tyr Phe Asp Thr Thr  Ile Asp Arg Lys Arg  Tyr Thr Ser 
              1325                 1330                 1335             
          Thr Lys  Glu Val Leu Asp Ala  Thr Leu Ile His Gln  Ser Ile Thr 
              1340                 1345                 1350             
          Gly Leu  Tyr Glu Thr Arg Ile  Asp Leu Ser Gln Leu  Gly Gly Asp 
              1355                 1360                 1365             
          Gly Thr  Gly Gly Pro Pro Lys  Lys Lys Arg Lys Val  Ala Ala Ala 
              1370                 1375                 1380             
          Ser Arg  Tyr Pro Arg Gly Asp  Ala Leu Asp Asp Phe  Asp Leu Asp 
              1385                 1390                 1395             
          Met Leu  Gly Ser Asp Ala Leu  Asp Asp Phe Asp Leu  Asp Met Leu 
              1400                 1405                 1410             
          Gly Ser  Asp Ala Leu Asp Asp  Phe Asp Leu Asp Met  Leu Gly Ser 
              1415                 1420                 1425             
          Asp Ala  Leu Asp Asp Phe Asp  Leu Asp Met Leu Ile  Asn Ser Arg 
              1430                 1435                 1440             
          Ser Ser  Gly Ser Pro Lys Lys  Lys Arg Lys Val Gly  Ser Gln Tyr 
              1445                 1450                 1455             
          Leu Pro  Asp Thr Asp Asp Arg  His Arg Ile Glu Glu  Lys Arg Lys 
              1460                 1465                 1470             
          Arg Thr  Tyr Glu Thr Phe Lys  Ser Ile Met Lys Lys  Ser Pro Phe 
              1475                 1480                 1485             
          Ser Gly  Pro Thr Asp Pro Arg  Pro Pro Pro Arg Arg  Ile Ala Val 
              1490                 1495                 1500             
          Pro Ser  Arg Ser Ser Ala Ser  Val Pro Lys Pro Ala  Pro Gln Pro 
              1505                 1510                 1515             
          Tyr Pro  Phe Thr Ser Ser Leu  Ser Thr Ile Asn Tyr  Asp Glu Phe 
              1520                 1525                 1530             
          Pro Thr  Met Val Phe Pro Ser  Gly Gln Ile Ser Gln  Ala Ser Ala 
              1535                 1540                 1545             
          Leu Ala  Pro Ala Pro Pro Gln  Val Leu Pro Gln Ala  Pro Ala Pro 
              1550                 1555                 1560             
          Ala Pro  Ala Pro Ala Met Val  Ser Ala Leu Ala Gln  Ala Pro Ala 
              1565                 1570                 1575             
          Pro Val  Pro Val Leu Ala Pro  Gly Pro Pro Gln Ala  Val Ala Pro 
              1580                 1585                 1590             
          Pro Ala  Pro Lys Pro Thr Gln  Ala Gly Glu Gly Thr  Leu Ser Glu 
              1595                 1600                 1605             
          Ala Leu  Leu Gln Leu Gln Phe  Asp Asp Glu Asp Leu  Gly Ala Leu 
              1610                 1615                 1620             
          Leu Gly  Asn Ser Thr Asp Pro  Ala Val Phe Thr Asp  Leu Ala Ser 
              1625                 1630                 1635             
          Val Asp  Asn Ser Glu Phe Gln  Gln Leu Leu Asn Gln  Gly Ile Pro 
              1640                 1645                 1650             
          Val Ala  Pro His Thr Thr Glu  Pro Met Leu Met Glu  Tyr Pro Glu 
              1655                 1660                 1665             
          Ala Ile  Thr Arg Leu Val Thr  Gly Ala Gln Arg Pro  Pro Asp Pro 
              1670                 1675                 1680             
          Ala Pro  Ala Pro Leu Gly Ala  Pro Gly Leu Pro Asn  Gly Leu Leu 
              1685                 1690                 1695             
          Ser Gly  Asp Glu Asp Phe Ser  Ser Ile Ala Asp Met  Asp Phe Ser 
              1700                 1705                 1710             
          Ala Leu  Leu Gly Ser Gly Ser  Gly Ser Arg Asp Ser  Arg Glu Gly 
              1715                 1720                 1725             
          Met Phe  Leu Pro Lys Pro Glu  Ala Gly Ser Ala Ile  Ser Asp Val 
              1730                 1735                 1740             
          Phe Glu  Gly Arg Glu Val Cys  Gln Pro Lys Arg Ile  Arg Pro Phe 
              1745                 1750                 1755             
          His Pro  Pro Gly Ser Pro Trp  Ala Asn Arg Pro Leu  Pro Ala Ser 
              1760                 1765                 1770             
          Leu Ala  Pro Thr Pro Thr Gly  Pro Val His Glu Pro  Val Gly Ser 
              1775                 1780                 1785             
          Leu Thr  Pro Ala Pro Val Pro  Gln Pro Leu Asp Pro  Ala Pro Ala 
              1790                 1795                 1800             
          Val Thr  Pro Glu Ala Ser His  Leu Leu Glu Asp Pro  Asp Glu Glu 
              1805                 1810                 1815             
          Thr Ser  Gln Ala Val Lys Ala  Leu Arg Glu Met Ala  Asp Thr Val 
              1820                 1825                 1830             
          Ile Pro  Gln Lys Glu Glu Ala  Ala Ile Cys Gly Gln  Met Asp Leu 
              1835                 1840                 1845             
          Ser His  Pro Pro Pro Arg Gly  His Leu Asp Glu Leu  Thr Thr Thr 
              1850                 1855                 1860             
          Leu Glu  Ser Met Thr Glu Asp  Leu Asn Leu Asp Ser  Pro Leu Thr 
              1865                 1870                 1875             
          Pro Glu  Leu Asn Glu Ile Leu  Asp Thr Phe Leu Asn  Asp Glu Cys 
              1880                 1885                 1890             
          Leu Leu  His Ala Met His Ile  Ser Thr Gly Leu Ser  Ile Phe Asp 
              1895                 1900                 1905             
          Thr Ser  Leu Phe Pro Lys Lys  Lys Arg Lys Val Arg  Ser Lys Arg 
              1910                 1915                 1920             
          Pro Ala  Ala Thr Lys Lys Ala  Gly Gln Ala Lys Lys  Lys Lys Leu 
              1925                 1930                 1935             
          Asp 
          <![CDATA[<210>  98]]>
          <![CDATA[<211>  112]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  98]]>
          gaaacaagct atttgctgat ttgtattagg taccatagag tgaggcgagg atgaagccga       60
          gaggatactg cagaggtctc tggtgcaatg tgtgtatgtg tgcgtttgtg tg              112
          <![CDATA[<210>  99]]>
          <![CDATA[<211>  71]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  99]]>
          gaaacaagct atttgctgat ttgtattagg taccatagag tgaggcgagg atgaagccga       60
          gaggatactg c                                                            71
          <![CDATA[<210>  100]]>
          <![CDATA[<211>  112]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  100]]>
          gaaacaagct atttgctgat ttgtattagg taccatagag tgaggcgagg atgaagccga       60
          gaggatactg cagaggtctc tggtgcaatg tgtgtatgtg tgcgtttgtg tg              112
          <![CDATA[<210>  101]]>
          <![CDATA[<211>  108]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<220>]]>
          <![CDATA[<221>  misc_feature]]>
          <![CDATA[<222>  (52)..(52)]]>
          <![CDATA[<223>  n is a, c, g, or t]]>
          <![CDATA[<400>  101]]>
          gaaacaagct atttgctgat ttgtattagg taccatagag tgaggcgagg angaagccga       60
          gaggatactg cagaggtctc tggtgcaatg tgtgtatgtg tgcgtttg                   108
          <![CDATA[<210>  102]]>
          <![CDATA[<211>  84]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  102]]>
          ttccagtgtc gaatctgcat gcgcaacttc agccagcggg gaaacctggt gaggcatatc       60
          cgcacccaca cgggagagaa gcct                                              84
          <![CDATA[<210>  103]]>
          <![CDATA[<211>  87]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  103]]>
          tttgcctgcg atatttgtgg aaagaagttt gctctgagct tcaatctaac cagacacacc       60
          aagattcata ctgggtccca gaaaccg                                           87
          <![CDATA[<210>  104]]>
          <![CDATA[<211>  85]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  104]]>
          ttccagtgta ggatatgcat gaggaatttc tctcggagtg acaacttaac gcggcatata       60
          aggacgcaca caggtgaaaa aacaa                                             85
          <![CDATA[<210>  105]]>
          <![CDATA[<211>  87]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  105]]>
          tttgcatgcg acatctgtgg caaaaagttt gcggaccggt ctcaccttgc ccgacacaca       60
          aaaatccata ccggcagtca aaagccc                                           87
          <![CDATA[<210>  106]]>
          <![CDATA[<211>  84]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  106]]>
          tttcaatgtc gcatttgcat gcgaaacttc tcacagaagg cccatttgac tgcccatatt       60
          cgtactcata ctggcgagaa acct                                              84
          <![CDATA[<210>  107]]>
          <![CDATA[<211>  84]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  107]]>
          ttcgcttgcg atatatgtgg tcgtaagttt gcacggtcgg acaacctcac acgccacact       60
          aagatacacc tgcggcagaa ggac                                              84
          <![CDATA[<210>  108]]>
          <![CDATA[<211>  85]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  108]]>
          ttccagtgtc gaatctgcat gcgcaacttc agcccgaatg tccaacctga cacggcatat       60
          ccgcacccac acgggagaga agcct                                             85
          <![CDATA[<210>  109]]>
          <![CDATA[<211>  87]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  109]]>
          tttgcctgcg atatttgtgg aaagaagttt gctgacaagc ggaccttaat ccgccacacc       60
          aagattcata ctgggtccca gaaaccg                                           87
          <![CDATA[<210>  110]]>
          <![CDATA[<211>  84]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  110]]>
          ttccagtgta ggatatgcat gaggaatttc tctcagcggg gaaatctagt gcgacatata       60
          aggacgcaca caggtgaaaa acca                                              84
          <![CDATA[<210>  111]]>
          <![CDATA[<211>  87]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  111]]>
          tttgcatgcg acatctgtgg caaaaagttt gcgctgagct tcaacttgac tcgtcacaca       60
          aaaatccata ccggcagtca aaagccc                                           87
          <![CDATA[<210>  112]]>
          <![CDATA[<211>  84]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  112]]>
          tttcaatgtc gcatttgcat gcgaaacttc tcacggagtg acaatcttac gagacatatt       60
          cgtactcata ctggcgagaa acct                                              84
          <![CDATA[<210>  113]]>
          <![CDATA[<211>  84]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  113]]>
          ttcgcttgcg atatatgtgg tcgtaagttt gcagaccgga gccacttagc caggcacact       60
          aagatacacc tgcggcagaa ggac                                              84
          <![CDATA[<210>  114]]>
          <![CDATA[<211>  84]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  114]]>
          ttccagtgtc gaatctgcat gcgcaacttc agcgaccgga gcgcgctggc acggcatatc       60
          cgcacccaca cgggagagaa gcct                                              84
          <![CDATA[<210>  115]]>
          <![CDATA[<211>  87]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  115]]>
          tttgcctgcg atatttgtgg aaagaagttt gctcgaagtg acaacttaac gcgccacacc       60
          aagattcata ctgggtccca gaaaccg                                           87
          <![CDATA[<210>  116]]>
          <![CDATA[<211>  84]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  116]]>
          ttccagtgta ggatatgcat gaggaatttc tctcagtcag gggacctcac tcgtcatata       60
          aggacgcaca caggtgaaaa acca                                              84
          <![CDATA[<210>  117]]>
          <![CDATA[<211>  87]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  117]]>
          tttgcatgcg acatctgtgg caaaaagttt gcggtacgac agacgcttaa acaacacaca       60
          aaaatccata ccggcagtca aaagccc                                           87
          <![CDATA[<210>  118]]>
          <![CDATA[<211>  84]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  118]]>
          tttcaatgtc gcatttgcat gcgaaacttc tcagccgctg gtaacttgac acgacatatt       60
          cgtactcata ctggcgagaa acct                                              84
          <![CDATA[<210>  119]]>
          <![CDATA[<211>  84]]>
          <![CDATA[<212>  DNA]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  119]]>
          ttcgcttgcg atatatgtgg tcgtaagttt gcaagatctg ataatctaac gcgtcacact       60
          aagatacacc tgcggcagaa ggac                                              84
          <![CDATA[<210>  120]]>
          <![CDATA[<211>  5]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic polypeptide]]>
          <![CDATA[<400>  120]]>
          Thr Gly Glu Lys Pro 
          1               5   
          <![CDATA[<210>  121]]>
          <![CDATA[<211>  6]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic polypeptide]]>
          <![CDATA[<400>  121]]>
          Thr Gly Ser Gln Lys Pro 
          1               5       
          <![CDATA[<210>  122]]>
          <![CDATA[<211>  98]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  122]]>
          Met His His Gln Gln Arg Met Ala Ala Leu Gly Thr Asp Lys Glu Leu 
          1               5                   10                  15      
          Ser Asp Leu Leu Asp Phe Ser Ala Met Phe Ser Pro Pro Val Ser Ser 
                      20                  25                  30          
          Gly Lys Asn Gly Pro Thr Ser Leu Ala Ser Gly His Phe Thr Gly Ser 
                  35                  40                  45              
          Asn Val Glu Asp Arg Ser Ser Ser Gly Ser Trp Gly Asn Gly Gly His 
              50                  55                  60                  
          Pro Ser Pro Ser Arg Asn Tyr Gly Asp Gly Thr Pro Tyr Asp His Met 
          65                  70                  75                  80  
          Thr Ser Arg Asp Leu Gly Ser His Asp Asn Leu Ser Pro Pro Phe Val 
                          85                  90                  95      
          Asn Ser 
          <![CDATA[<210>  123]]>
          <![CDATA[<211>  72]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  123]]>
          Thr Asn Asn Ser Phe Ser Ser Asn Pro Ser Thr Pro Val Gly Ser Pro 
          1               5                   10                  15      
          Pro Ser Leu Ser Ala Gly Thr Ala Val Trp Ser Arg Asn Gly Gly Gln 
                      20                  25                  30          
          Ala Ser Ser Ser Pro Asn Tyr Glu Gly Pro Leu His Ser Leu Gln Ser 
                  35                  40                  45              
          Arg Ile Glu Asp Arg Leu Glu Arg Leu Asp Asp Ala Ile His Val Leu 
              50                  55                  60                  
          Arg Asn His Ala Val Gly Pro Ser 
          65                  70          
          <![CDATA[<210>  124]]>
          <![CDATA[<211>  14]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  124]]>
          Pro Leu Ser Glu Glu Glu Glu Leu Glu Leu Asn Thr Gln Arg 
          1               5                   10                  
          <![CDATA[<210>  125]]>
          <![CDATA[<211>  13]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  125]]>
          Ser Val Ser Glu Asp Val Asp Leu Leu Leu Asn Gln Arg 
          1               5                   10              
          <![CDATA[<210>  126]]>
          <![CDATA[<211>  14]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  126]]>
          His Leu Thr Glu Asp His Leu Asp Leu Asn Asn Ala Gln Arg 
          1               5                   10                  
          <![CDATA[<210>  127]]>
          <![CDATA[<211>  237]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  127]]>
          Asn Ser Val Ser Ala Ala Thr Leu Thr Pro Ser Ser Gln Ala Val Thr 
          1               5                   10                  15      
          Ile Ser Ser Ser Gly Ser Gln Glu Ser Gly Ser Gln Pro Val Thr Ser 
                      20                  25                  30          
          Gly Thr Thr Ile Ser Ser Ala Ser Leu Val Ser Ser Gln Ala Ser Ser 
                  35                  40                  45              
          Ser Ser Phe Phe Thr Asn Ala Asn Ser Tyr Ser Thr Thr Thr Thr Thr 
              50                  55                  60                  
          Ser Asn Met Gly Ile Met Asn Phe Thr Thr Ser Gly Ser Ser Gly Thr 
          65                  70                  75                  80  
          Asn Ser Gln Gly Gln Thr Pro Gln Arg Val Ser Gly Leu Gln Gly Ser 
                          85                  90                  95      
          Asp Ala Leu Asn Ile Gln Gln Asn Gln Thr Ser Gly Gly Ser Leu Gln 
                      100                 105                 110         
          Ala Gly Gln Gln Lys Glu Gly Glu Gln Asn Gln Gln Thr Gln Gln Gln 
                  115                 120                 125             
          Gln Ile Leu Ile Gln Pro Gln Leu Val Gln Gly Gly Gln Ala Leu Gln 
              130                 135                 140                 
          Ala Leu Gln Ala Ala Pro Leu Ser Gly Gln Thr Phe Thr Thr Gln Ala 
          145                 150                 155                 160 
          Ile Ser Gln Glu Thr Leu Gln Asn Leu Gln Leu Gln Ala Val Pro Asn 
                          165                 170                 175     
          Ser Gly Pro Ile Ile Ile Arg Thr Pro Thr Val Gly Pro Asn Gly Gln 
                      180                 185                 190         
          Val Ser Trp Gln Thr Leu Gln Leu Gln Asn Leu Gln Val Gln Asn Pro 
                  195                 200                 205             
          Gln Ala Gln Thr Ile Thr Leu Ala Pro Met Gln Gly Val Ser Leu Gly 
              210                 215                 220                 
          Gln Thr Ser Ser Ser Asn Thr Thr Leu Thr Pro Ile Ala 
          225                 230                 235         
          <![CDATA[<210>  128]]>
          <![CDATA[<211>  94]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  128]]>
          Met Glu Glu Pro Gln Ser Asp Pro Ser Val Glu Pro Pro Leu Ser Gln 
          1               5                   10                  15      
          Glu Thr Phe Ser Asp Leu Trp Lys Leu Leu Pro Glu Asn Asn Val Leu 
                      20                  25                  30          
          Ser Pro Leu Pro Ser Gln Ala Met Asp Asp Leu Met Leu Ser Pro Asp 
                  35                  40                  45              
          Asp Ile Glu Gln Trp Phe Thr Glu Asp Pro Gly Pro Asp Glu Ala Pro 
              50                  55                  60                  
          Arg Met Pro Glu Ala Ala Pro Pro Val Ala Pro Ala Pro Ala Ala Pro 
          65                  70                  75                  80  
          Thr Pro Ala Ala Pro Ala Pro Ala Pro Ser Trp Pro Leu Ser 
                          85                  90                  
          <![CDATA[<210>  129]]>
          <![CDATA[<211>  58]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  129]]>
          Ala Asp Ser Leu Leu Glu His Val Arg Glu Asp Phe Ser Gly Leu Leu 
          1               5                   10                  15      
          Pro Glu Glu Phe Ile Ser Leu Ser Pro Pro His Glu Ala Leu Asp Tyr 
                      20                  25                  30          
          His Phe Gly Leu Glu Glu Gly Glu Gly Ile Arg Asp Leu Phe Asp Cys 
                  35                  40                  45              
          Asp Phe Gly Asp Leu Thr Pro Leu Asp Phe 
              50                  55              
          <![CDATA[<210>  130]]>
          <![CDATA[<211>  63]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  130]]>
          Met Glu Leu Leu Ser Pro Pro Leu Arg Asp Val Asp Leu Thr Ala Pro 
          1               5                   10                  15      
          Asp Gly Ser Leu Cys Ser Phe Ala Thr Thr Asp Asp Phe Tyr Asp Asp 
                      20                  25                  30          
          Pro Cys Phe Asp Ser Pro Asp Leu Arg Phe Phe Glu Asp Leu Asp Pro 
                  35                  40                  45              
          Arg Leu Met His Val Gly Ala Leu Leu Lys Pro Glu Glu His Ser 
              50                  55                  60              
          <![CDATA[<210>  131]]>
          <![CDATA[<211>  127]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  131]]>
          Leu Ala Ala Gln Ser Leu Val Pro Pro Pro Gly Leu Pro Gly Ser Ser 
          1               5                   10                  15      
          Thr Pro Gly Val Leu Pro Tyr Phe Pro Pro Gly Leu Pro Pro Pro Asp 
                      20                  25                  30          
          Ala Gly Gly Ala Pro Gln Ser Ser Met Ser Glu Ser Pro Asp Val Asn 
                  35                  40                  45              
          Leu Val Thr Gln Gln Leu Ser Lys Ser Gln Val Glu Asp Pro Leu Pro 
              50                  55                  60                  
          Pro Val Phe Ser Gly Thr Pro Lys Gly Ser Gly Ala Gly Tyr Gly Val 
          65                  70                  75                  80  
          Gly Phe Asp Leu Glu Glu Phe Leu Asn Gln Ser Phe Asp Met Gly Val 
                          85                  90                  95      
          Ala Asp Gly Pro Gln Asp Gly Gln Ala Asp Ser Ala Ser Leu Ser Ala 
                      100                 105                 110         
          Ser Leu Leu Ala Asp Trp Leu Glu Gly His Gly Met Asn Pro Ala 
                  115                 120                 125         
          <![CDATA[<210>  132]]>
          <![CDATA[<211>  102]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  132]]>
          Pro Glu Lys Pro Leu Phe Ser Ser Ala Ser Pro Gln Asp Ser Ser Pro 
          1               5                   10                  15      
          Arg Leu Ser Thr Phe Pro Gln His His His Pro Gly Ile Pro Gly Val 
                      20                  25                  30          
          Ala His Ser Val Ile Ser Thr Arg Thr Pro Pro Pro Pro Ser Pro Leu 
                  35                  40                  45              
          Pro Phe Pro Thr Gln Ala Ile Leu Pro Pro Ala Pro Ser Ser Tyr Phe 
              50                  55                  60                  
          Ser His Pro Thr Ile Arg Tyr Pro Pro His Leu Asn Pro Gln Asp Thr 
          65                  70                  75                  80  
          Leu Lys Asn Tyr Val Pro Ser Tyr Asp Pro Ser Ser Pro Gln Thr Ser 
                          85                  90                  95      
          Gln Ser Trp Tyr Leu Gly 
                      100         
          <![CDATA[<210>  133]]>
          <![CDATA[<211>  260]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  133]]>
          Gln Tyr Leu Pro Asp Thr Asp Asp Arg His Arg Ile Glu Glu Lys Arg 
          1               5                   10                  15      
          Lys Arg Thr Tyr Glu Thr Phe Lys Ser Ile Met Lys Lys Ser Pro Phe 
                      20                  25                  30          
          Ser Gly Pro Thr Asp Pro Arg Pro Pro Pro Arg Arg Ile Ala Val Pro 
                  35                  40                  45              
          Ser Arg Ser Ser Ala Ser Val Pro Lys Pro Ala Pro Gln Pro Tyr Pro 
              50                  55                  60                  
          Phe Thr Ser Ser Leu Ser Thr Ile Asn Tyr Asp Glu Phe Pro Thr Met 
          65                  70                  75                  80  
          Val Phe Pro Ser Gly Gln Ile Ser Gln Ala Ser Ala Leu Ala Pro Ala 
                          85                  90                  95      
          Pro Pro Gln Val Leu Pro Gln Ala Pro Ala Pro Ala Pro Ala Pro Ala 
                      100                 105                 110         
          Met Val Ser Ala Leu Ala Gln Ala Pro Ala Pro Val Pro Val Leu Ala 
                  115                 120                 125             
          Pro Gly Pro Pro Gln Ala Val Ala Pro Pro Ala Pro Lys Pro Thr Gln 
              130                 135                 140                 
          Ala Gly Glu Gly Thr Leu Ser Glu Ala Leu Leu Gln Leu Gln Phe Asp 
          145                 150                 155                 160 
          Asp Glu Asp Leu Gly Ala Leu Leu Gly Asn Ser Thr Asp Pro Ala Val 
                          165                 170                 175     
          Phe Thr Asp Leu Ala Ser Val Asp Asn Ser Glu Phe Gln Gln Leu Leu 
                      180                 185                 190         
          Asn Gln Gly Ile Pro Val Ala Pro His Thr Thr Glu Pro Met Leu Met 
                  195                 200                 205             
          Glu Tyr Pro Glu Ala Ile Thr Arg Leu Val Thr Gly Ala Gln Arg Pro 
              210                 215                 220                 
          Pro Asp Pro Ala Pro Ala Pro Leu Gly Ala Pro Gly Leu Pro Asn Gly 
          225                 230                 235                 240 
          Leu Leu Ser Gly Asp Glu Asp Phe Ser Ser Ile Ala Asp Met Asp Phe 
                          245                 250                 255     
          Ser Ala Leu Leu 
                      260 
          <![CDATA[<210>  134]]>
          <![CDATA[<211>  124]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  134]]>
          Gly Phe Ser Val Asp Thr Ser Ala Leu Leu Asp Leu Phe Ser Pro Ser 
          1               5                   10                  15      
          Val Thr Val Pro Asp Met Ser Leu Pro Asp Leu Asp Ser Ser Leu Ala 
                      20                  25                  30          
          Ser Ile Gln Glu Leu Leu Ser Pro Gln Glu Pro Pro Arg Pro Pro Glu 
                  35                  40                  45              
          Ala Glu Asn Ser Ser Pro Asp Ser Gly Lys Gln Leu Val His Tyr Thr 
              50                  55                  60                  
          Ala Gln Pro Leu Phe Leu Leu Asp Pro Gly Ser Val Asp Thr Gly Ser 
          65                  70                  75                  80  
          Asn Asp Leu Pro Val Leu Phe Glu Leu Gly Glu Gly Ser Tyr Phe Ser 
                          85                  90                  95      
          Glu Gly Asp Gly Phe Ala Glu Asp Pro Thr Ile Ser Leu Leu Thr Gly 
                      100                 105                 110         
          Ser Glu Pro Pro Lys Ala Lys Asp Pro Thr Val Ser 
                  115                 120                 
          <![CDATA[<210>  135]]>
          <![CDATA[<211>  8]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  135]]>
          Pro Lys Lys Lys Arg Lys Val Glu 
          1               5               
          <![CDATA[<210>  136]]>
          <![CDATA[<211>  9]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  136]]>
          Pro Ala Ala Lys Arg Val Lys Leu Asp 
          1               5                   
          <![CDATA[<210>  137]]>
          <![CDATA[<211>  9]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  137]]>
          Pro Ala Ala Lys Lys Lys Lys Leu Asp 
          1               5                   
          <![CDATA[<210>  138]]>
          <![CDATA[<211>  18]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  138]]>
          Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys Lys Lys 
          1               5                   10                  15      
          Leu Asp 
          <![CDATA[<210>  139]]>
          <![CDATA[<211>  20]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  139]]>
          Lys Arg Thr Ala Asp Gly Ser Glu Phe Glu Ser Thr Pro Lys Lys Lys 
          1               5                   10                  15      
          Arg Lys Val Glu 
                      20  
          <![CDATA[<210>  140]]>
          <![CDATA[<211>  23]]>
          <![CDATA[<212>  PRT]]>
          <![CDATA[<213>  Artificial Sequence]]>
          <![CDATA[<220>]]>
          <![CDATA[<223>  Synthetic]]>
          <![CDATA[<400>  140]]>
          Pro Arg Arg Arg Pro Leu His Ser Ser Ala Met Glu Val Gln Thr Lys 
          1               5                   10                  15      
          Lys Val Arg Lys Val Pro Pro 
                      20              
                                  
           <![CDATA[ <110> University of Massachusetts]]>
           <![CDATA[ <120> DNA-binding domain transactivators and uses thereof]]>
           <![CDATA[ <130> U0120.70147WO00]]>
           <![CDATA[ <140> TW 110127164]]>
           <![CDATA[ <141> 2021-07-23]]>
           <![CDATA[ <150> US 63/056,528]]>
           <![CDATA[ <151> 2020-07-24]]>
           <![CDATA[ <160> 140 ]]>
           <![CDATA[ <170> PatentIn version 3.5]]>
           <![CDATA[ <210> 1]]>
           <![CDATA[ <211> 672]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Homo sapiens]]>
           <![CDATA[ <400> 1]]>
          aatttccatg gactcttttt ccaaaggaat aactggaatg aataaactta aaatcaagat 60
          gaaacaatta gatggcttac ctgattaaaa ggaaaattat ccatctgcag tgaggaacag 120
          catcacccaa agacgagatg ataacaatgt gccttcagtt gcaattgttc agttccttct 180
          tgcaaaaggt gtcaaagtat ttacaagggc tgcagtctca ctggggcaga acacacagac 240
          acacaaacac acacaaacgc acacatacac acatgcacca gagacctctg cagtatcctc 300
          tcggcttcat cctcgcctca ctctatggta cctaatacaa atcagcaaat agcttgtttc 360
          aaaaaaaaaa aaaagtcaag acagcacctt acattacatc gccatctagt ggctaaatat 420
          taaacacttt ctcacaatcc agatttatga tttcttcctc aacctctttt ctctcagctt 480
          ttttcctttc ttctctgtaa tctcccagta ttgcttctcc ttgcttctct ttcattccct 540
          attgctatat aatatcatga acctaatgac tcaaagagga aaaggtttga aagtaaatat 600
          agctattttc aagtagtact tgaaaaactt agcattattt tagtttgaaa ctgttacttt 660
          attcctaata tg 672
           <![CDATA[ <210> 2]]>
           <![CDATA[ <211> 669]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Mus musculus]]>
           <![CDATA[ <400> 2]]>
          tatttccgtg ggctcttctc cccaaggatt taccaggtaa gaattcacca ccaaagaaga 60
          tcacaatgag ataatcagat ggcttacctg ataaaaagga aaattatcca tctgcagtca 120
          ggagcaacat ctccccacga cgagtccgca ccttccgttg caacgattca gattccttct 180
          tgcaaaaggt gaccaagtgc ttcacaaggg ctgcagcctc ataggggaga acacacgtac 240
          acaaacacac gcacacacac acacacatgc accagagacc tctgcagtat cctctggctt 300
          catcctcgcc tcactctatg gtacctaata caaatcagca aatagcttgt tttaaaaaaa 360
          agaaagaaaa aaagcggaga cagcacctaa cgttacagtg ccatctagtg gctacatcgt 420
          aaataggttc tcacagcctg gatttctgtg ttctttctca accgcttcct tctggttcct 480
          ttttcttttt tcctctttat tttggtttta ttacttcctc agatgccttt ttttcattcc 540
          cctttgctct gcctacatgg aactattgac ttaaagatta aaacaatcag aactggagag 600
          cgttgctttt aagttaaaaa aaaaaaggtt gctaattttg tttgtaaatg ttactttatt 660
          ttctctatt 669
           <![CDATA[ <210> 3]]>
           <![CDATA[ <211> 130]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 3]]>
          tttttttttt tttttttgaa acaagctatt tgctgatttg tattaggtac catagagtga 60
          ggcgaggatg aagccgagag gatactgcag aggtctctgg tgcatgtgtg tatgtgtgcg 120
          tttgtgtgtg 130
           <![CDATA[ <210> 4]]>
           <![CDATA[ <211> 41]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 4]]>
          gagtgaggcg aggatgaagc cgagaggata ctgcagaggt c 41
           <![CDATA[ <210> 5]]>
           <![CDATA[ <211> 18]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 5]]>
          gagtgaggcg aggatgaa 18
           <![CDATA[ <210> 6]]>
           <![CDATA[ <211> 18]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 6]]>
          ggcgaggatg aagccgag 18
           <![CDATA[ <210> 7]]>
           <![CDATA[ <211> 18]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 7]]>
          gaggatactg cagaggtc 18
           <![CDATA[ <210> 8]]>
           <![CDATA[ <211> 5]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 8]]>
          Glu Gly Glu Asp Glu
          1               5
           <![CDATA[ <210> 9]]>
           <![CDATA[ <211> 6]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 9]]>
          Gly Glu Asp Glu Ala Glu
          1               5
           <![CDATA[ <210> 10]]>
           <![CDATA[ <211> 6]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 10]]>
          Glu Asp Thr Ala Glu Val
          1               5
           <![CDATA[ <210> 11]]>
           <![CDATA[ <211> 21]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 11]]>
          cagcggggga acctggtgag g 21
           <![CDATA[ <210> 12]]>
           <![CDATA[ <211> 21]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 12]]>
          ctgagcttca atctaaccag a 21
           <![CDATA[ <210> 13]]>
           <![CDATA[ <211> 21]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 13]]>
          cggagtgaca acttaacgcg g 21
           <![CDATA[ <210> 14]]>
           <![CDATA[ <211> 21]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 14]]>
          gaccggtctc accttgcccg a 21
           <![CDATA[ <210> 15]]>
           <![CDATA[ <211> 21]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 15]]>
          cagaaggccc atttgactgc c 21
           <![CDATA[ <210> 16]]>
           <![CDATA[ <211> 21]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 16]]>
          cggtcggaca acctcacacg c 21
           <![CDATA[ <210> 17]]>
           <![CDATA[ <211> 7]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 17]]>
          Gln Arg Gly Asn Leu Val Arg
          1               5
           <![CDATA[ <210> 18]]>
           <![CDATA[ <211> 7]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 18]]>
          Leu Ser Phe Asn Leu Thr Arg
          1               5
           <![CDATA[ <210> 19]]>
           <![CDATA[ <211> 7]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 19]]>
          Arg Ser Asp Asn Leu Thr Arg
          1               5
           <![CDATA[ <210> 20]]>
           <![CDATA[ <211> 7]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 20]]>
          Asp Arg Ser His Leu Ala Arg
          1               5
           <![CDATA[ <210> 21]]>
           <![CDATA[ <211> 7]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 21]]>
          Gln Lys Ala His Leu Thr Ala
          1               5
           <![CDATA[ <210> 22]]>
           <![CDATA[ <211> 7]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 22]]>
          Arg Ser Asp Asn Leu Thr Arg
          1               5
           <![CDATA[ <210> 23]]>
           <![CDATA[ <211> 21]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 23]]>
          cgaagttcca acctgacacg g 21
           <![CDATA[ <210> 24]]>
           <![CDATA[ <211> 21]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 24]]>
          gacaagcgga ccttaatccg c 21
           <![CDATA[ <210> 25]]>
           <![CDATA[ <211> 21]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 25]]>
          cagcggggga atctagtgcg a 21
           <![CDATA[ <210> 26]]>
           <![CDATA[ <211> 21]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 26]]>
          ctgagcttca acttgactcg t 21
           <![CDATA[ <210> 27]]>
           <![CDATA[ <211> 21]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 27]]>
          cggagtgaca atcttacgag a 21
           <![CDATA[ <210> 28]]>
           <![CDATA[ <211> 21]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 28]]>
          gaccggagcc acttagccag g 21
           <![CDATA[ <210> 29]]>
           <![CDATA[ <211> 7]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 29]]>
          Arg Ser Ser Asn Leu Thr Arg
          1               5
           <![CDATA[ <210> 30]]>
           <![CDATA[ <211> 7]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 30]]>
          Asp Lys Arg Thr Leu Ile Arg
          1               5
           <![CDATA[ <210> 31]]>
           <![CDATA[ <211> 7]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 31]]>
          Gln Arg Gly Asn Leu Val Arg
          1               5
           <![CDATA[ <210> 32]]>
           <![CDATA[ <211> 7]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 32]]>
          Leu Ser Phe Asn Leu Thr Arg
          1               5
           <![CDATA[ <210> 33]]>
           <![CDATA[ <211> 7]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 33]]>
          Arg Ser Asp Asn Leu Thr Arg
          1               5
           <![CDATA[ <210> 34]]>
           <![CDATA[ <211> 7]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 34]]>
          Asp Arg Ser His Leu Ala Arg
          1               5
           <![CDATA[ <210> 35]]>
           <![CDATA[ <211> 21]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 35]]>
          gaccggagcg cgctggcacg g 21
           <![CDATA[ <210> 36]]>
           <![CDATA[ <211> 21]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 36]]>
          cgaagtgaca acttaacgcg c 21
           <![CDATA[ <210> 37]]>
           <![CDATA[ <211> 21]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 37]]>
          cagtcagggg acctcactcg t 21
           <![CDATA[ <210> 38]]>
           <![CDATA[ <211> 21]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 38]]>
          gtacgacaga cgcttaaaca a 21
           <![CDATA[ <210> 39]]>
           <![CDATA[ <211> 21]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 39]]>
          gccgctggta acttgacacg a 21
           <![CDATA[ <210> 40]]>
           <![CDATA[ <211> 21]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 40]]>
          agatctgata atctaacgcg t 21
           <![CDATA[ <210> 41]]>
           <![CDATA[ <211> 7]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 41]]>
          Asp Arg Ser Ala Leu Ala Arg
          1               5
           <![CDATA[ <210> 42]]>
           <![CDATA[ <211> 7]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 42]]>
          Arg Ser Asp Asn Leu Thr Arg
          1               5
           <![CDATA[ <210> 43]]>
           <![CDATA[ <211> 7]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 43]]>
          Gln Ser Gly Asp Leu Thr Arg
          1               5
           <![CDATA[ <210> 44]]>
           <![CDATA[ <211> 7]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 44]]>
          Val Arg Gln Thr Leu Lys Gln
          1               5
           <![CDATA[ <210> 45]]>
           <![CDATA[ <211> 7]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 45]]>
          Ala Ala Gly Asn Leu Thr Arg
          1               5
           <![CDATA[ <210> 46]]>
           <![CDATA[ <211> 7]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 46]]>
          Arg Ser Asp Asn Leu Thr Arg
          1               5
           <![CDATA[ <210> 47]]>
           <![CDATA[ <211> 1569]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 47]]>
          gaggccagcg gttccggacg ggctgacgca ttggacgatt ttgatctgga tatgctggga 60
          agtgacgccc tcgatgattt tgaccttgac atgcttggtt cggatgccct tgatgacttt 120
          gacctcgaca tgctcggcag tgacgccctt gatgatttcg acctggacat gctgattaac 180
          tctagaagtt ccggatctag ccagtacctg cccgacaccg acgaccggca ccggatcgag 240
          gaaaagcgga agcggaccta cgagacattc aagagcatca tgaagaagtc ccccttcagc 300
          ggccccaccg accctagacc tccacctaga agaatcgccg tgcccagcag atccagcgcc 360
          agcgtgccaa aacctgcccc ccagccttac cccttcacca gcagcctgag caccatcaac 420
          tacgacgagt tccctaccat ggtgttcccc agcggccaga tctctcaggc ctctgctctg 480
          gctccagccc ctcctcaggt gctgcctcag gctcctgctc ctgcaccagc tccagccatg 540
          gtgtctgcac tggctcaggc accagcaccc gtgcctgtgc tggctcctgg acctccacag 600
          gctgtggctc caccagcccc taaacctaca caggccggcg agggcacact gtctgaagct 660
          ctgctgcagc tgcagttcga cgacgaggat ctgggagccc tgctgggaaa cagcaccgat 720
          cctgccgtgt tcaccgacct ggccagcgtg gacaacagcg agttccagca gctgctgaac 780
          cagggcatcc ctgtggcccc tcacaccacc gagcccatgc tgatggaata ccccgaggcc 840
          atcacccggc tcgtgacagg cgctcagagg cctcctgatc cagctcctgc ccctctggga 900
          gcaccaggcc tgcctaatgg actgctgtct ggcgacgagg acttcagctc tatcgccgat 960
          atggatttct cagccttgct gggctctggc agcggcagcc gggattccag ggaagggatg 1020
          tttttgccga agcctgaggc cggctccgct attagtgacg tgtttgaggg ccgcgaggtg 1080
          tgccagccaa aacgaatccg gccatttcat cctccaggaa gtccatgggc caaccgccca 1140
          ctccccgcca gcctcgcacc aacaccaacc ggtccagtac atgagccagt cgggtcactg 1200
          accccggcac cagtccctca gccactggat ccagcgcccg cagtgactcc cgaggccagt 1260
          cacctgttgg aggatcccga tgaagagacg agccaggctg tcaaagccct tcgggagatg 1320
          gccgatactg tgattcccca gaaggaagag gctgcaatct gtggccaaat ggacctttcc 1380
          catccgcccc caaggggcca tctggatgag ctgacaacca cacttgagtc catgaccgag 1440
          gatctgaacc tggactcacc cctgaccccg gaattgaacg agattctgga taccttcctg 1500
          aacgacgagt gcctcttgca tgccatgcat atcagcacag gactgtccat cttcgacaca 1560
          tctctgttt 1569
           <![CDATA[ <210> 48]]>
           <![CDATA[ <211> 523]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 48]]>
          Glu Ala Ser Gly Ser Gly Arg Ala Asp Ala Leu Asp Asp Phe Asp Leu
          1               5                   10                  15
          Asp Met Leu Gly Ser Asp Ala Leu Asp Asp Phe Asp Leu Asp Met Leu
                      20                  25                  30
          Gly Ser Asp Ala Leu Asp Asp Phe Asp Leu Asp Met Leu Gly Ser Asp
                  35                  40                  45
          Ala Leu Asp Asp Phe Asp Leu Asp Met Leu Ile Asn Ser Arg Ser Ser
              50                  55                  60
          Gly Ser Ser Gln Tyr Leu Pro Asp Thr Asp Asp Arg His Arg Ile Glu
          65                  70                  75                  80
          Glu Lys Arg Lys Arg Thr Tyr Glu Thr Phe Lys Ser Ile Met Lys Lys
                          85                  90                  95
          Ser Pro Phe Ser Gly Pro Thr Asp Pro Arg Pro Pro Pro Arg Arg Ile
                      100                 105                 110
          Ala Val Pro Ser Arg Ser Ser Ala Ser Val Pro Lys Pro Ala Pro Gln
                  115                 120                 125
          Pro Tyr Pro Phe Thr Ser Ser Leu Ser Thr Ile Asn Tyr Asp Glu Phe
              130                 135                 140
          Pro Thr Met Val Phe Pro Ser Gly Gln Ile Ser Gln Ala Ser Ala Leu
          145                 150                 155                 160
          Ala Pro Ala Pro Pro Gln Val Leu Pro Gln Ala Pro Ala Pro Ala Pro
                          165                 170                 175
          Ala Pro Ala Met Val Ser Ala Leu Ala Gln Ala Pro Ala Pro Val Pro
                      180                 185                 190
          Val Leu Ala Pro Gly Pro Pro Gln Ala Val Ala Pro Pro Ala Pro Lys
                  195                 200                 205
          Pro Thr Gln Ala Gly Glu Gly Thr Leu Ser Glu Ala Leu Leu Gln Leu
              210                 215                 220
          Gln Phe Asp Asp Glu Asp Leu Gly Ala Leu Leu Gly Asn Ser Thr Asp
          225                 230                 235                 240
          Pro Ala Val Phe Thr Asp Leu Ala Ser Val Asp Asn Ser Glu Phe Gln
                          245                 250                 255
          Gln Leu Leu Asn Gln Gly Ile Pro Val Ala Pro His Thr Thr Glu Pro
                      260                 265                 270
          Met Leu Met Glu Tyr Pro Glu Ala Ile Thr Arg Leu Val Thr Gly Ala
                  275                 280                 285
          Gln Arg Pro Pro Asp Pro Ala Pro Ala Pro Leu Gly Ala Pro Gly Leu
              290                 295                 300
          Pro Asn Gly Leu Leu Ser Gly Asp Glu Asp Phe Ser Ser Ile Ala Asp
          305                 310                 315                 320
          Met Asp Phe Ser Ala Leu Leu Gly Ser Gly Ser Gly Ser Arg Asp Ser
                          325                 330                 335
          Arg Glu Gly Met Phe Leu Pro Lys Pro Glu Ala Gly Ser Ala Ile Ser
                      340                 345                 350
          Asp Val Phe Glu Gly Arg Glu Val Cys Gln Pro Lys Arg Ile Arg Pro
                  355                 360                 365
          Phe His Pro Pro Gly Ser Pro Trp Ala Asn Arg Pro Leu Pro Ala Ser
              370                 375                 380
          Leu Ala Pro Thr Pro Thr Gly Pro Val His Glu Pro Val Gly Ser Leu
          385                 390                 395                 400
          Thr Pro Ala Pro Val Pro Gln Pro Leu Asp Pro Ala Pro Ala Val Thr
                          405                 410                 415
          Pro Glu Ala Ser His Leu Leu Glu Asp Pro Asp Glu Glu Thr Ser Gln
                      420                 425                 430
          Ala Val Lys Ala Leu Arg Glu Met Ala Asp Thr Val Ile Pro Gln Lys
                  435                 440                 445
          Glu Glu Ala Ala Ile Cys Gly Gln Met Asp Leu Ser His Pro Pro Pro
              450                 455                 460
          Arg Gly His Leu Asp Glu Leu Thr Thr Thr Leu Glu Ser Met Thr Glu
          465                 470                 475                 480
          Asp Leu Asn Leu Asp Ser Pro Leu Thr Pro Glu Leu Asn Glu Ile Leu
                          485                 490                 495
          Asp Thr Phe Leu Asn Asp Glu Cys Leu Leu His Ala Met His Ile Ser
                      500                 505                 510
          Thr Gly Leu Ser Ile Phe Asp Thr Ser Leu Phe
                  515                 520
           <![CDATA[ <210> 49]]>
           <![CDATA[ <211> 6027]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Homo sapiens]]>
           <![CDATA[ <400> 49]]>
          atggaacaga ccgtgctggt gccgccgggc ccggatagct ttaacttttt tacccgcgaa 60
          agcctggcgg cgattgaacg ccgcattgcg gaagaaaaag cgaaaaaccc gaaaccggat 120
          aaaaaagatg atgatgaaaa cggcccgaaa ccgaacagcg atctggaagc gggcaaaaac 180
          ctgccgttta tttatggcga tattccgccg gaaatggtga gcgaaccgct ggaagatctg 240
          gatccgtatt atattaacaa aaaaaccttt attgtgctga acaaaggcaa agcgattttt 300
          cgctttagcg cgaccagcgc gctgtatatt ctgaccccgt ttaacccgct gcgcaaaatt 360
          gcgattaaaa ttctggtgca tagcctgttt agcatgctga ttatgtgcac cattctgacc 420
          aactgcgtgt ttatgaccat gagcaacccg ccggattgga ccaaaaacgt ggaatatacc 480
          tttaccggca tttatacctt tgaaagcctg attaaaatta ttgcgcgcgg cttttgcctg 540
          gaagatttta cctttctgcg cgatccgtgg aactggctgg attttaccgt gattaccttt 600
          gcgtatgtga ccgaatttgt ggatctgggc aacgtgagcg cgctgcgcac ctttcgcgtg 660
          ctgcgcgcgc tgaaaaccat tagcgtgatt ccgggcctga aaaccattgt gggcgcgctg 720
          attcagagcg tgaaaaaact gagcgatgtg atgattctga ccgtgttttg cctgagcgtg 780
          tttgcgctga ttggcctgca gctgtttatg ggcaacctgc gcaacaaatg cattcagtgg 840
          ccgccgacca acgcgagcct ggaagaacat agcattgaaa aaaacattac cgtgaactat 900
          aacggcaccc tgattaacga aaccgtgttt gaatttgatt ggaaaagcta tattcaggat 960
          agccgctatc attattttct ggaaggcttt ctggatgcgc tgctgtgcgg caacagcagc 1020
          gatgcgggcc agtgcccgga aggctatatg tgcgtgaaag cgggccgcaa cccgaactat 1080
          ggctatacca gctttgatac ctttagctgg gcgtttctga gcctgtttcg cctgatgacc 1140
          caggattttt gggaaaacct gtatcagctg accctgcgcg cggcgggcaa aacctatatg 1200
          attttttttg tgctggtgat ttttctgggc agcttttatc tgattaacct gattctggcg 1260
          gtggtggcga tggcgtatga agaacagaac caggcgaccc tggaagaagc ggaacagaaa 1320
          gaagcggaat ttcagcagat gattgaacag ctgaaaaaac agcaggaagc ggcgcagcag 1380
          gcggcgaccg cgaccgcgag cgaacatagc cgcgaaccga gcgcggcggg ccgcctgagc 1440
          gatagcagca gcgaagcgag caaactgagc agcaaaagcg cgaaagaacg ccgcaaccgc 1500
          cgcaaaaaac gcaaacagaa agaacagagc ggcggcgaag aaaaagatga agatgaattt 1560
          cagaaaagcg aaagcgaaga tagcattcgc cgcaaaggct ttcgctttag cattgaaggc 1620
          aaccgcctga cctatgaaaa acgctatagc agcccgcatc agagcctgct gagcattcgc 1680
          ggcagcctgt ttagcccgcg ccgcaacagc cgcaccagcc tgtttagctt tcgcggccgc 1740
          gcgaaagatg tgggcagcga aaacgatttt gcggatgatg aacatagcac ctttgaagat 1800
          aacgaaagcc gccgcgatag cctgtttgtg ccgcgccgcc atggcgaacg ccgcaacagc 1860
          aacctgagcc agaccagccg cagcagccgc atgctggcgg tgtttccggc gaacggcaaa 1920
          atgcatagca ccgtggattg caacggcgtg gtgagcctgg tgggcggccc gagcgtgccg 1980
          accagcccgg tgggccagct gctgccggaa gtgattattg ataaaccggc gaccgatgat 2040
          aacggcacca ccaccgaaac cgaaatgcgc aaacgccgca gcagcagctt tcatgtgagc 2100
          atggattttc tggaagatcc gagccagcgc cagcgcgcga tgagcattgc gagcattctg 2160
          accaacaccg tggaagaact ggaagaaagc cgccagaaat gcccgccgtg ctggtataaa 2220
          tttagcaaca tttttctgat ttgggattgc agcccgtatt ggctgaaagt gaaacatgtg 2280
          gtgaacctgg tggtgatgga tccgtttgtg gatctggcga ttaccatttg cattgtgctg 2340
          aacaccctgt ttatggcgat ggaacattat ccgatgaccg atcattttaa caacgtgctg 2400
          accgtgggca acctggtgtt taccggcatt tttaccgcgg aaatgtttct gaaaattatt 2460
          gcgatggatc cgtattatta ttttcaggaa ggctggaaca tttttgatgg ctttattgtg 2520
          accctgagcc tggtggaact gggcctggcg aacgtggaag gcctgagcgt gctgcgcagc 2580
          tttcgcctgc tgcgcgtgtt taaactggcg aaaagctggc cgaccctgaa catgctgatt 2640
          aaaattattg gcaacagcgt gggcgcgctg ggcaacctga ccctggtgct ggcgattatt 2700
          gtgtttattt ttgcggtggt gggcatgcag ctgtttggca aaagctataa agattgcgtg 2760
          tgcaaaattg cgagcgattg ccagctgccg cgctggcata tgaacgattt ttttcatagc 2820
          tttctgattg tgtttcgcgt gctgtgcggc gaatggattg aaaccatgtg ggattgcatg 2880
          gaagtggcgg gccaggcgat gtgcctgacc gtgtttatga tggtgatggt gattggcaac 2940
          ctggtggtgc tgaacctgtt tctggcgctg ctgctgagca gctttagcgc ggataacctg 3000
          gcggcgaccg atgatgataa cgaaatgaac aacctgcaga ttgcggtgga tcgcatgcat 3060
          aaaggcgtgg cgtatgtgaa acgcaaaatt tatgaattta ttcagcagag ctttattcgc 3120
          aaacagaaaa ttctggatga aattaaaccg ctggatgatc tgaacaacaa aaaagatagc 3180
          tgcatgagca accataccgc ggaaattggc aaagatctgg attatctgaa agatgtgaac 3240
          ggcaccacca gcggcattgg caccggcagc agcgtggaaa aatatattat tgatgaaagc 3300
          gattatatga gctttattaa caacccgagc ctgaccgtga ccgtgccgat tgcggtgggc 3360
          gaaagcgatt ttgaaaacct gaacaccgaa gattttagca gcgaaagcga tctggaagaa 3420
          agcaaagaaa aactgaacga aagcagcagc agcagcgaag gcagcaccgt ggatattggc 3480
          gcgccggtgg aagaacagcc ggtggtggaa ccggaagaaa ccctggaacc ggaagcgtgc 3540
          tttaccgaag gctgcgtgca gcgctttaaa tgctgccaga ttaacgtgga agaaggccgc 3600
          ggcaaacagt ggtggaacct gcgccgcacc tgctttcgca ttgtggaaca taactggttt 3660
          gaaaccttta ttgtgtttat gattctgctg agcagcggcg cgctggcgtt tgaagatatt 3720
          tatattgatc agcgcaaaac cattaaaacc atgctggaat atgcggataa agtgtttacc 3780
          tatattttta ttctggaaat gctgctgaaa tgggtggcgt atggctatca gacctatttt 3840
          accaacgcgt ggtgctggct ggattttctg attgtggatg tgagcctggt gagcctgacc 3900
          gcgaacgcgc tgggctatag cgaactgggc gcgattaaaa gcctgcgcac cctgcgcgcg 3960
          ctgcgcccgc tgcgcgcgct gagccgcttt gaaggcatgc gcgtggtggt gaacgcgctg 4020
          ctgggcgcga ttccgagcat tatgaacgtg ctgctggtgt gcctgatttt ttggctgatt 4080
          tttagcatta tgggcgtgaa cctgtttgcg ggcaaatttt atcattgcat taacaccacc 4140
          accggcgatc gctttgatat tgaagatgtg aacaaccata ccgattgcct gaaactgatt 4200
          gaacgcaacg aaaccgcgcg ctggaaaaac gtgaaagtga actttgataa cgtgggcttt 4260
          ggctatctga gcctgctgca ggtggcgacc tttaaaggct ggatggatat tatgtatgcg 4320
          gcggtggata gccgcaacgt ggaactgcag ccgaaatatg aagaaagcct gtatatgtat 4380
          ctgtattttg tgatttttat tatttttggc agctttttta ccctgaacct gtttattggc 4440
          gtgattattg ataactttaa ccagcagaaa aaaaaatttg gcggccagga tatttttatg 4500
          accgaagaac agaaaaaata ttataacgcg atgaaaaaac tgggcagcaa aaaaccgcag 4560
          aaaccgattc cgcgcccggg caacaaattt cagggcatgg tgtttgattt tgtgacccgc 4620
          caggtgtttg atattagcat tatgattctg atttgcctga acatggtgac catgatggtg 4680
          gaaaccgatg atcagagcga atatgtgacc accattctga gccgcattaa cctggtgttt 4740
          attgtgctgt ttaccggcga atgcgtgctg aaactgatta gcctgcgcca ttattatttt 4800
          accattggct ggaacatttt tgattttgtg gtggtgattc tgagcattgt gggcatgttt 4860
          ctggcggaac tgattgaaaa atattttgtg agcccgaccc tgtttcgcgt gattcgcctg 4920
          gcgcgcattg gccgcattct gcgcctgatt aaaggcgcga aaggcattcg caccctgctg 4980
          tttgcgctga tgatgagcct gccggcgctg tttaacattg gcctgctgct gtttctggtg 5040
          atgtttattt atgcgatttt tggcatgagc aactttgcgt atgtgaaacg cgaagtgggc 5100
          attgatgata tgtttaactt tgaaaccttt ggcaacagca tgatttgcct gtttcagatt 5160
          accaccagcg cgggctggga tggcctgctg gcgccgattc tgaacagcaa accgccggat 5220
          tgcgatccga acaaagtgaa cccgggcagc agcgtgaaag gcgattgcgg caacccgagc 5280
          gtgggcattt ttttttttgt gagctatatt attattagct ttctggtggt ggtgaacatg 5340
          tatattgcgg tgattctgga aaactttagc gtggcgaccg aagaaagcgc ggaaccgctg 5400
          agcgaagatg attttgaaat gttttatgaa gtgtgggaaa aatttgatcc ggatgcgacc 5460
          cagtttatgg aatttgaaaa actgagccag tttgcggcgg cgctggaacc gccgctgaac 5520
          ctgccgcagc cgaacaaact gcagctgatt gcgatggatc tgccgatggt gagcggcgat 5580
          cgcattcatt gcctggatat tctgtttgcg tttaccaaac gcgtgctggg cgaaagcggc 5640
          gaaatggatg cgctgcgcat tcagatggaa gaacgcttta tggcgagcaa cccgagcaaa 5700
          gtgagctatc agccgattac caccaccctg aaacgcaaac aggaagaagt gagcgcggtg 5760
          attattcagc gcgcgtatcg ccgccatctg ctgaaacgca ccgtgaaaca ggcgagcttt 5820
          acctataaca aaaacaaaat taaaggcggc gcgaacctgc tgattaaaga agatatgatt 5880
          attgatcgca ttaacgaaaa cagcattacc gaaaaaaccg atctgaccat gagcaccgcg 5940
          gcgtgcccgc cgagctatga tcgcgtgacc aaaccgattg tggaaaaaca tgaacaggaa 6000
          ggcaaagatg aaaaagcgaa aggcaaa 6027
           <![CDATA[ <210> 50]]>
           <![CDATA[ <211> 2009]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Homo sapiens]]>
           <![CDATA[ <400> 50]]>
          Met Glu Gln Thr Val Leu Val Pro Pro Gly Pro Asp Ser Phe Asn Phe
          1 5 10 15
          Phe Thr Arg Glu Ser Leu Ala Ala Ile Glu Arg Arg Ile Ala Glu Glu
                      20 25 30
          Lys Ala Lys Asn Pro Lys Pro Asp Lys Lys Asp Asp Asp Glu Asn Gly
                  35 40 45
          Pro Lys Pro Asn Ser Asp Leu Glu Ala Gly Lys Asn Leu Pro Phe Ile
              50 55 60
          Tyr Gly Asp Ile Pro Pro Glu Met Val Ser Glu Pro Leu Glu Asp Leu
          65 70 75 80
          Asp Pro Tyr Tyr Ile Asn Lys Lys Thr Phe Ile Val Leu Asn Lys Gly
                          85 90 95
          Lys Ala Ile Phe Arg Phe Ser Ala Thr Ser Ala Leu Tyr Ile Leu Thr
                      100 105 110
          Pro Phe Asn Pro Leu Arg Lys Ile Ala Ile Lys Ile Leu Val His Ser
                  115 120 125
          Leu Phe Ser Met Leu Ile Met Cys Thr Ile Leu Thr Asn Cys Val Phe
              130 135 140
          Met Thr Met Ser Asn Pro Pro Asp Trp Thr Lys Asn Val Glu Tyr Thr
          145 150 155 160
          Phe Thr Gly Ile Tyr Thr Phe Glu Ser Leu Ile Lys Ile Ile Ala Arg
                          165 170 175
          Gly Phe Cys Leu Glu Asp Phe Thr Phe Leu Arg Asp Pro Trp Asn Trp
                      180 185 190
          Leu Asp Phe Thr Val Ile Thr Phe Ala Tyr Val Thr Glu Phe Val Asp
                  195 200 205
          Leu Gly Asn Val Ser Ala Leu Arg Thr Phe Arg Val Leu Arg Ala Leu
              210 215 220
          Lys Thr Ile Ser Val Ile Pro Gly Leu Lys Thr Ile Val Gly Ala Leu
          225 230 235 240
          Ile Gln Ser Val Lys Lys Leu Ser Asp Val Met Ile Leu Thr Val Phe
                          245 250 255
          Cys Leu Ser Val Phe Ala Leu Ile Gly Leu Gln Leu Phe Met Gly Asn
                      260 265 270
          Leu Arg Asn Lys Cys Ile Gln Trp Pro Pro Thr Asn Ala Ser Leu Glu
                  275 280 285
          Glu His Ser Ile Glu Lys Asn Ile Thr Val Asn Tyr Asn Gly Thr Leu
              290 295 300
          Ile Asn Glu Thr Val Phe Glu Phe Asp Trp Lys Ser Tyr Ile Gln Asp
          305 310 315 320
          Ser Arg Tyr His Tyr Phe Leu Glu Gly Phe Leu Asp Ala Leu Leu Cys
                          325 330 335
          Gly Asn Ser Ser Asp Ala Gly Gln Cys Pro Glu Gly Tyr Met Cys Val
                      340 345 350
          Lys Ala Gly Arg Asn Pro Asn Tyr Gly Tyr Thr Ser Phe Asp Thr Phe
                  355 360 365
          Ser Trp Ala Phe Leu Ser Leu Phe Arg Leu Met Thr Gln Asp Phe Trp
              370 375 380
          Glu Asn Leu Tyr Gln Leu Thr Leu Arg Ala Ala Gly Lys Thr Tyr Met
          385 390 395 400
          Ile Phe Phe Val Leu Val Ile Phe Leu Gly Ser Phe Tyr Leu Ile Asn
                          405 410 415
          Leu Ile Leu Ala Val Val Ala Met Ala Tyr Glu Glu Gln Asn Gln Ala
                      420 425 430
          Thr Leu Glu Glu Ala Glu Gln Lys Glu Ala Glu Phe Gln Gln Met Ile
                  435 440 445
          Glu Gln Leu Lys Lys Gln Gln Glu Ala Ala Gln Gln Ala Ala Thr Ala
              450 455 460
          Thr Ala Ser Glu His Ser Arg Glu Pro Ser Ala Ala Gly Arg Leu Ser
          465 470 475 480
          Asp Ser Ser Ser Glu Ala Ser Lys Leu Ser Ser Lys Ser Ala Lys Glu
                          485 490 495
          Arg Arg Asn Arg Arg Lys Lys Arg Lys Gln Lys Glu Gln Ser Gly Gly
                      500 505 510
          Glu Glu Lys Asp Glu Asp Glu Phe Gln Lys Ser Glu Ser Glu Asp Ser
                  515 520 525
          Ile Arg Arg Lys Gly Phe Arg Phe Ser Ile Glu Gly Asn Arg Leu Thr
              530 535 540
          Tyr Glu Lys Arg Tyr Ser Ser Pro His Gln Ser Leu Leu Ser Ile Arg
          545 550 555 560
          Gly Ser Leu Phe Ser Pro Arg Arg Asn Ser Arg Thr Ser Leu Phe Ser
                          565 570 575
          Phe Arg Gly Arg Ala Lys Asp Val Gly Ser Glu Asn Asp Phe Ala Asp
                      580 585 590
          Asp Glu His Ser Thr Phe Glu Asp Asn Glu Ser Arg Arg Asp Ser Leu
                  595 600 605
          Phe Val Pro Arg Arg His Gly Glu Arg Arg Asn Ser Asn Leu Ser Gln
              610 615 620
          Thr Ser Arg Ser Ser Arg Met Leu Ala Val Phe Pro Ala Asn Gly Lys
          625 630 635 640
          Met His Ser Thr Val Asp Cys Asn Gly Val Val Ser Leu Val Gly Gly
                          645 650 655
          Pro Ser Val Pro Thr Ser Pro Val Gly Gln Leu Leu Pro Glu Val Ile
                      660 665 670
          Ile Asp Lys Pro Ala Thr Asp Asp Asn Gly Thr Thr Thr Glu Thr Glu
                  675 680 685
          Met Arg Lys Arg Arg Ser Ser Ser Phe His Val Ser Met Asp Phe Leu
              690 695 700
          Glu Asp Pro Ser Gln Arg Gln Arg Ala Met Ser Ile Ala Ser Ile Leu
          705 710 715 720
          Thr Asn Thr Val Glu Glu Leu Glu Glu Ser Arg Gln Lys Cys Pro Pro
                          725 730 735
          Cys Trp Tyr Lys Phe Ser Asn Ile Phe Leu Ile Trp Asp Cys Ser Pro
                      740 745 750
          Tyr Trp Leu Lys Val Lys His Val Val Asn Leu Val Val Met Asp Pro
                  755 760 765
          Phe Val Asp Leu Ala Ile Thr Ile Cys Ile Val Leu Asn Thr Leu Phe
              770 775 780
          Met Ala Met Glu His Tyr Pro Met Thr Asp His Phe Asn Asn Val Leu
          785 790 795 800
          Thr Val Gly Asn Leu Val Phe Thr Gly Ile Phe Thr Ala Glu Met Phe
                          805 810 815
          Leu Lys Ile Ile Ala Met Asp Pro Tyr Tyr Tyr Phe Gln Glu Gly Trp
                      820 825 830
          Asn Ile Phe Asp Gly Phe Ile Val Thr Leu Ser Leu Val Glu Leu Gly
                  835 840 845
          Leu Ala Asn Val Glu Gly Leu Ser Val Leu Arg Ser Phe Arg Leu Leu
              850 855 860
          Arg Val Phe Lys Leu Ala Lys Ser Trp Pro Thr Leu Asn Met Leu Ile
          865 870 875 880
          Lys Ile Ile Gly Asn Ser Val Gly Ala Leu Gly Asn Leu Thr Leu Val
                          885 890 895
          Leu Ala Ile Ile Val Phe Ile Phe Ala Val Val Gly Met Gln Leu Phe
                      900 905 910
          Gly Lys Ser Tyr Lys Asp Cys Val Cys Lys Ile Ala Ser Asp Cys Gln
                  915 920 925
          Leu Pro Arg Trp His Met Asn Asp Phe Phe His Ser Phe Leu Ile Val
              930 935 940
          Phe Arg Val Leu Cys Gly Glu Trp Ile Glu Thr Met Trp Asp Cys Met
          945 950 955 960
          Glu Val Ala Gly Gln Ala Met Cys Leu Thr Val Phe Met Met Val Met
                          965 970 975
          Val Ile Gly Asn Leu Val Val Leu Asn Leu Phe Leu Ala Leu Leu Leu
                      980 985 990
          Ser Ser Phe Ser Ala Asp Asn Leu Ala Ala Thr Asp Asp Asp Asn Glu
                  995 1000 1005
          Met Asn Asn Leu Gln Ile Ala Val Asp Arg Met His Lys Gly Val
              1010 1015 1020
          Ala Tyr Val Lys Arg Lys Ile Tyr Glu Phe Ile Gln Gln Ser Phe
              1025 1030 1035
          Ile Arg Lys Gln Lys Ile Leu Asp Glu Ile Lys Pro Leu Asp Asp
              1040 1045 1050
          Leu Asn Asn Lys Lys Asp Ser Cys Met Ser Asn His Thr Ala Glu
              1055 1060 1065
          Ile Gly Lys Asp Leu Asp Tyr Leu Lys Asp Val Asn Gly Thr Thr
              1070 1075 1080
          Ser Gly Ile Gly Thr Gly Ser Ser Val Glu Lys Tyr Ile Ile Asp
              1085 1090 1095
          Glu Ser Asp Tyr Met Ser Phe Ile Asn Asn Pro Ser Leu Thr Val
              1100 1105 1110
          Thr Val Pro Ile Ala Val Gly Glu Ser Asp Phe Glu Asn Leu Asn
              1115 1120 1125
          Thr Glu Asp Phe Ser Ser Glu Ser Asp Leu Glu Glu Ser Lys Glu
              1130 1135 1140
          Lys Leu Asn Glu Ser Ser Ser Ser Ser Glu Gly Ser Thr Val Asp
              1145 1150 1155
          Ile Gly Ala Pro Val Glu Glu Gln Pro Val Val Glu Pro Glu Glu
              1160 1165 1170
          Thr Leu Glu Pro Glu Ala Cys Phe Thr Glu Gly Cys Val Gln Arg
              1175 1180 1185
          Phe Lys Cys Cys Gln Ile Asn Val Glu Glu Gly Arg Gly Lys Gln
              1190 1195 1200
          Trp Trp Asn Leu Arg Arg Thr Cys Phe Arg Ile Val Glu His Asn
              1205 1210 1215
          Trp Phe Glu Thr Phe Ile Val Phe Met Ile Leu Leu Ser Ser Gly
              1220 1225 1230
          Ala Leu Ala Phe Glu Asp Ile Tyr Ile Asp Gln Arg Lys Thr Ile
              1235 1240 1245
          Lys Thr Met Leu Glu Tyr Ala Asp Lys Val Phe Thr Tyr Ile Phe
              1250 1255 1260
          Ile Leu Glu Met Leu Leu Lys Trp Val Ala Tyr Gly Tyr Gln Thr
              1265 1270 1275
          Tyr Phe Thr Asn Ala Trp Cys Trp Leu Asp Phe Leu Ile Val Asp
              1280 1285 1290
          Val Ser Leu Val Ser Leu Thr Ala Asn Ala Leu Gly Tyr Ser Glu
              1295 1300 1305
          Leu Gly Ala Ile Lys Ser Leu Arg Thr Leu Arg Ala Leu Arg Pro
              1310 1315 1320
          Leu Arg Ala Leu Ser Arg Phe Glu Gly Met Arg Val Val Val Asn
              1325 1330 1335
          Ala Leu Leu Gly Ala Ile Pro Ser Ile Met Asn Val Leu Leu Val
              1340 1345 1350
          Cys Leu Ile Phe Trp Leu Ile Phe Ser Ile Met Gly Val Asn Leu
              1355 1360 1365
          Phe Ala Gly Lys Phe Tyr His Cys Ile Asn Thr Thr Thr Gly Asp
              1370 1375 1380
          Arg Phe Asp Ile Glu Asp Val Asn Asn His Thr Asp Cys Leu Lys
              1385 1390 1395
          Leu Ile Glu Arg Asn Glu Thr Ala Arg Trp Lys Asn Val Lys Val
              1400 1405 1410
          Asn Phe Asp Asn Val Gly Phe Gly Tyr Leu Ser Leu Leu Gln Val
              1415 1420 1425
          Ala Thr Phe Lys Gly Trp Met Asp Ile Met Tyr Ala Ala Val Asp
              1430 1435 1440
          Ser Arg Asn Val Glu Leu Gln Pro Lys Tyr Glu Glu Ser Leu Tyr
              1445 1450 1455
          Met Tyr Leu Tyr Phe Val Ile Phe Ile Ile Phe Gly Ser Phe Phe
              1460 1465 1470
          Thr Leu Asn Leu Phe Ile Gly Val Ile Ile Asp Asn Phe Asn Gln
              1475 1480 1485
          Gln Lys Lys Lys Phe Gly Gly Gln Asp Ile Phe Met Thr Glu Glu
              1490 1495 1500
          Gln Lys Lys Tyr Tyr Asn Ala Met Lys Lys Leu Gly Ser Lys Lys
              1505 1510 1515
          Pro Gln Lys Pro Ile Pro Arg Pro Gly Asn Lys Phe Gln Gly Met
              1520 1525 1530
          Val Phe Asp Phe Val Thr Arg Gln Val Phe Asp Ile Ser Ile Met
              1535 1540 1545
          Ile Leu Ile Cys Leu Asn Met Val Thr Met Met Val Glu Thr Asp
              1550 1555 1560
          Asp Gln Ser Glu Tyr Val Thr Thr Ile Leu Ser Arg Ile Asn Leu
              1565 1570 1575
          Val Phe Ile Val Leu Phe Thr Gly Glu Cys Val Leu Lys Leu Ile
              1580 1585 1590
          Ser Leu Arg His Tyr Tyr Phe Thr Ile Gly Trp Asn Ile Phe Asp
              1595 1600 1605
          Phe Val Val Val Ile Leu Ser Ile Val Gly Met Phe Leu Ala Glu
              1610 1615 1620
          Leu Ile Glu Lys Tyr Phe Val Ser Pro Thr Leu Phe Arg Val Ile
              1625 1630 1635
          Arg Leu Ala Arg Ile Gly Arg Ile Leu Arg Leu Ile Lys Gly Ala
              1640 1645 1650
          Lys Gly Ile Arg Thr Leu Leu Phe Ala Leu Met Met Ser Leu Pro
              1655 1660 1665
          Ala Leu Phe Asn Ile Gly Leu Leu Leu Phe Leu Val Met Phe Ile
              1670 1675 1680
          Tyr Ala Ile Phe Gly Met Ser Asn Phe Ala Tyr Val Lys Arg Glu
              1685 1690 1695
          Val Gly Ile Asp Asp Met Phe Asn Phe Glu Thr Phe Gly Asn Ser
              1700 1705 1710
          Met Ile Cys Leu Phe Gln Ile Thr Thr Ser Ala Gly Trp Asp Gly
              1715 1720 1725
          Leu Leu Ala Pro Ile Leu Asn Ser Lys Pro Pro Asp Cys Asp Pro
              1730 1735 1740
          Asn Lys Val Asn Pro Gly Ser Ser Val Lys Gly Asp Cys Gly Asn
              1745 1750 1755
          Pro Ser Val Gly Ile Phe Phe Phe Val Ser Tyr Ile Ile Ile Ser
              1760 1765 1770
          Phe Leu Val Val Val Asn Met Tyr Ile Ala Val Ile Leu Glu Asn
              1775 1780 1785
          Phe Ser Val Ala Thr Glu Glu Ser Ala Glu Pro Leu Ser Glu Asp
              1790 1795 1800
          Asp Phe Glu Met Phe Tyr Glu Val Trp Glu Lys Phe Asp Pro Asp
              1805 1810 1815
          Ala Thr Gln Phe Met Glu Phe Glu Lys Leu Ser Gln Phe Ala Ala
              1820 1825 1830
          Ala Leu Glu Pro Pro Leu Asn Leu Pro Gln Pro Asn Lys Leu Gln
              1835 1840 1845
          Leu Ile Ala Met Asp Leu Pro Met Val Ser Gly Asp Arg Ile His
              1850 1855 1860
          Cys Leu Asp Ile Leu Phe Ala Phe Thr Lys Arg Val Leu Gly Glu
              1865 1870 1875
          Ser Gly Glu Met Asp Ala Leu Arg Ile Gln Met Glu Glu Arg Phe
              1880 1885 1890
          Met Ala Ser Asn Pro Ser Lys Val Ser Tyr Gln Pro Ile Thr Thr
              1895 1900 1905
          Thr Leu Lys Arg Lys Gln Glu Glu Val Ser Ala Val Ile Ile Gln
              1910 1915 1920
          Arg Ala Tyr Arg Arg His Leu Leu Lys Arg Thr Val Lys Gln Ala
              1925 1930 1935
          Ser Phe Thr Tyr Asn Lys Asn Lys Ile Lys Gly Gly Ala Asn Leu
              1940 1945 1950
          Leu Ile Lys Glu Asp Met Ile Ile Asp Arg Ile Asn Glu Asn Ser
              1955 1960 1965
          Ile Thr Glu Lys Thr Asp Leu Thr Met Ser Thr Ala Ala Cys Pro
              1970 1975 1980
          Pro Ser Tyr Asp Arg Val Thr Lys Pro Ile Val Glu Lys His Glu
              1985 1990 1995
          Gln Glu Gly Lys Asp Glu Lys Ala Lys Gly Lys
              2000 2005
           <![CDATA[ <210> 51]]>
           <![CDATA[ <211> 1470]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Homo sapiens]]>
           <![CDATA[ <400> 51]]>
          atggatctgc tggtggatga actgtttgcg gatatgaacg cggatggcgc gagcccgccg 60
          ccgccgcgcc cggcgggcgg cccgaaaaac accccggcgg cgccgccgct gtatgcgacc 120
          ggccgcctga gccaggcgca gctgatgccg agcccgccga tgccggtgcc gccggcggcg 180
          ctgtttaacc gcctgctgga tgatctgggc tttagcgcgg gcccggcgct gtgcaccatg 240
          ctggatacct ggaacgaaga tctgtttagc gcgctgccga ccaacgcgga tctgtatcgc 300
          gaatgcaaat ttctgagcac cctgccgagc gatgtggtgg aatggggcga tgcgtatgtg 360
          ccggaacgca cccagattga tattcgcgcg catggcgatg tggcgtttcc gaccctgccg 420
          gcgacccgcg atggcctggg cctgtattat gaagcgctga gccgcttttt tcatgcggaa 480
          ctgcgcgcgc gcgaagaaag ctatcgcacc gtgctggcga acttttgcag cgcgctgtat 540
          cgctatctgc gcgcgagcgt gcgccagctg catcgccagg cgcatatgcg cggccgcgat 600
          cgcgatctgg gcgaaatgct gcgcgcgacc attgcggatc gctattatcg cgaaaccgcg 660
          cgcctggcgc gcgtgctgtt tctgcatctg tatctgtttc tgacccgcga aattctgtgg 720
          gcggcgtatg cggaacagat gatgcgcccg gatctgtttg attgcctgtg ctgcgatctg 780
          gaaagctggc gccagctggc gggcctgttt cagccgttta tgtttgtgaa cggcgcgctg 840
          accgtgcgcg gcgtgccgat tgaagcgcgc cgcctgcgcg aactgaacca tattcgcgaa 900
          catctgaacc tgccgctggt gcgcagcgcg gcgaccgaag aaccgggcgc gccgctgacc 960
          accccgccga ccctgcatgg caaccaggcg cgcgcgagcg gctattttat ggtgctgatt 1020
          cgcgcgaaac tggatagcta tagcagcttt accaccagcc cgagcgaagc ggtgatgcgc 1080
          gaacatgcgt atagccgcgc gcgcaccaaa aacaactatg gcagcaccat tgaaggcctg 1140
          ctggatctgc cggatgatga tgcgccggaa gaagcgggcc tggcggcgcc gcgcctgagc 1200
          tttctgccgg cgggccatac ccgccgcctg agcaccgcgc cgccgaccga tgtgagcctg 1260
          ggcgatgaac tgcatctgga tggcgaagat gtggcgatgg cgcatgcgga tgcgctggat 1320
          gattttgatc tggatatgct gggcgatggc gatagcccgg gcccgggctt taccccgcat 1380
          gatagcgcgc cgtatggcgc gctggatatg gcggattttg aatttgaaca gatgtttacc 1440
          gatgcgctgg gcattgatga atatggcggc 1470
           <![CDATA[ <210> 52]]>
           <![CDATA[ <211> 490]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Homo sapiens]]>
           <![CDATA[ <400> 52]]>
          Met Asp Leu Leu Val Asp Glu Leu Phe Ala Asp Met Asn Ala Asp Gly
          1 5 10 15
          Ala Ser Pro Pro Pro Pro Arg Pro Ala Gly Gly Pro Lys Asn Thr Pro
                      20 25 30
          Ala Ala Pro Pro Leu Tyr Ala Thr Gly Arg Leu Ser Gln Ala Gln Leu
                  35 40 45
          Met Pro Ser Pro Pro Met Pro Val Pro Pro Ala Ala Leu Phe Asn Arg
              50 55 60
          Leu Leu Asp Asp Leu Gly Phe Ser Ala Gly Pro Ala Leu Cys Thr Met
          65 70 75 80
          Leu Asp Thr Trp Asn Glu Asp Leu Phe Ser Ala Leu Pro Thr Asn Ala
                          85 90 95
          Asp Leu Tyr Arg Glu Cys Lys Phe Leu Ser Thr Leu Pro Ser Asp Val
                      100 105 110
          Val Glu Trp Gly Asp Ala Tyr Val Pro Glu Arg Thr Gln Ile Asp Ile
                  115 120 125
          Arg Ala His Gly Asp Val Ala Phe Pro Thr Leu Pro Ala Thr Arg Asp
              130 135 140
          Gly Leu Gly Leu Tyr Tyr Glu Ala Leu Ser Arg Phe Phe His Ala Glu
          145 150 155 160
          Leu Arg Ala Arg Glu Glu Ser Tyr Arg Thr Val Leu Ala Asn Phe Cys
                          165 170 175
          Ser Ala Leu Tyr Arg Tyr Leu Arg Ala Ser Val Arg Gln Leu His Arg
                      180 185 190
          Gln Ala His Met Arg Gly Arg Asp Arg Asp Leu Gly Glu Met Leu Arg
                  195 200 205
          Ala Thr Ile Ala Asp Arg Tyr Tyr Arg Glu Thr Ala Arg Leu Ala Arg
              210 215 220
          Val Leu Phe Leu His Leu Tyr Leu Phe Leu Thr Arg Glu Ile Leu Trp
          225 230 235 240
          Ala Ala Tyr Ala Glu Gln Met Met Arg Pro Asp Leu Phe Asp Cys Leu
                          245 250 255
          Cys Cys Asp Leu Glu Ser Trp Arg Gln Leu Ala Gly Leu Phe Gln Pro
                      260 265 270
          Phe Met Phe Val Asn Gly Ala Leu Thr Val Arg Gly Val Pro Ile Glu
                  275 280 285
          Ala Arg Arg Leu Arg Glu Leu Asn His Ile Arg Glu His Leu Asn Leu
              290 295 300
          Pro Leu Val Arg Ser Ala Ala Thr Glu Glu Pro Gly Ala Pro Leu Thr
          305 310 315 320
          Thr Pro Pro Thr Leu His Gly Asn Gln Ala Arg Ala Ser Gly Tyr Phe
                          325 330 335
          Met Val Leu Ile Arg Ala Lys Leu Asp Ser Tyr Ser Ser Phe Thr Thr
                      340 345 350
          Ser Pro Ser Glu Ala Val Met Arg Glu His Ala Tyr Ser Arg Ala Arg
                  355 360 365
          Thr Lys Asn Asn Tyr Gly Ser Thr Ile Glu Gly Leu Leu Asp Leu Pro
              370 375 380
          Asp Asp Asp Ala Pro Glu Glu Ala Gly Leu Ala Ala Pro Arg Leu Ser
          385 390 395 400
          Phe Leu Pro Ala Gly His Thr Arg Arg Leu Ser Thr Ala Pro Pro Thr
                          405 410 415
          Asp Val Ser Leu Gly Asp Glu Leu His Leu Asp Gly Glu Asp Val Ala
                      420 425 430
          Met Ala His Ala Asp Ala Leu Asp Asp Phe Asp Leu Asp Met Leu Gly
                  435 440 445
          Asp Gly Asp Ser Pro Gly Pro Gly Phe Thr Pro His Asp Ser Ala Pro
              450 455 460
          Tyr Gly Ala Leu Asp Met Ala Asp Phe Glu Phe Glu Gln Met Phe Thr
          465 470 475 480
          Asp Ala Leu Gly Ile Asp Glu Tyr Gly Gly
                          485 490
           <![CDATA[ <210> 53]]>
           <![CDATA[ <211> 2570]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Homo sapiens]]>
           <![CDATA[ <400> 53]]>
          agcgcgcagg cgcggccgga ttccgggcag tgacgcgacg gcgggccgcg cggcgcattt 60
          ccgcctctgg cgaatggctc gtctgtagtg cacgccgcgg gcccagctgc gaccccggcc 120
          ccgcccccgg gaccccggcc atggacgaac tgttccccct catcttcccg gcagagccag 180
          cccaggcctc tggcccctat gtggagatca ttgagcagcc caagcagcgg ggcatgcgct 240
          tccgctacaa gtgcgagggg cgctccgcgg gcagcatccc aggcgagagg agcacagata 300
          ccaccaagac ccaccccacc atcaagatca atggctacac aggaccaggg acagtgcgca 360
          tctccctggt caccaaggac cctcctcacc ggcctcaccc ccacgagctt gtaggaaagg 420
          actgccggga tggcttctat gaggctgagc tctgcccgga ccgctgcatc cacagtttcc 480
          agaacctggg aatccagtgt gtgaagaagc gggacctgga gcaggctatc agtcagcgca 540
          tccagaccaa caacaacccc ttccaagaag agcagcgtgg ggactacgac ctgaatgctg 600
          tgcggctctg cttccaggtg acagtgcggg acccatcagg caggcccctc cgcctgccgc 660
          ctgtcctttc tcatcccatc tttgacaatc gtgcccccaa cactgccgag ctcaagatct 720
          gccgagtgaa ccgaaactct ggcagctgcc tcggtgggga tgagatcttc ctactgtgtg 780
          acaaggtgca gaaagaggac attgaggtgt atttcacggg accaggctgg gaggcccgag 840
          gctccttttc gcaagctgat gtgcaccgac aagtggccat tgtgttccgg acccctccct 900
          acgcagaccc cagcctgcag gctcctgtgc gtgtctccat gcagctgcgg cggccttccg 960
          accgggagct cagtgagccc atggaattcc agtacctgcc agatacagac gatcgtcacc 1020
          ggattgagga gaaacgtaaa aggacatatg agaccttcaa gagcatcatg aagaagagtc 1080
          ctttcagcgg acccaccgac ccccggcctc cacctcgacg cattgctgtg ccttcccgca 1140
          gctcagcttc tgtccccaag ccagcacccc agccctatcc ctttacgtca tccctgagca 1200
          ccatcaacta tgatgagttt cccaccatgg tgtttccttc tgggcagatc agccaggcct 1260
          cggccttggc cccggcccct ccccaagtcc tgccccaggc tccagcccct gcccctgctc 1320
          cagccatggt atcagctctg gcccaggccc cagcccctgt cccagtccta gccccaggcc 1380
          ctcctcaggc tgtggcccca cctgccccca agcccaccca ggctggggaa ggaacgctgt 1440
          cagaggccct gctgcagctg cagtttgatg atgaagacct gggggccttg cttggcaaca 1500
          gcacagaccc agctgtgttc acagacctgg catccgtcga caactccgag tttcagcagc 1560
          tgctgaacca gggcatacct gtggcccccc acacaactga gcccatgctg atggagtacc 1620
          ctgaggctat aactcgccta gtgacagggg cccagaggcc ccccgaccca gctcctgctc 1680
          cactgggggc cccggggctc cccaatggcc tcctttcagg agatgaagac ttctcctcca 1740
          ttgcggacat ggacttctca gccctgctga gtcagatcag ctcctaaggg ggtgacgcct 1800
          gccctcccca gagcactggg ttgcagggga ttgaagccct ccaaaagcac ttacggattc 1860
          tggtggggtg tgttccaact gcccccaact ttgtggatgt cttccttgga ggggggagcc 1920
          atattttatt cttttattgt cagtatctgt atctctctct ctttttggag gtgcttaagc 1980
          agaagcatta acttctctgg aaagggggga gctggggaaa ctcaaacttt tcccctgtcc 2040
          tgatggtcag ctcccttctc tgtagggaac tctggggtcc cccatcccca tcctccagct 2100
          tctggtactc tcctagagac agaagcaggc tggaggtaag gcctttgagc ccacaaagcc 2160
          ttatcaagtg tcttccatca tggattcatt acagcttaat caaaataacg ccccagatac 2220
          cagcccctgt atggcactgg cattgtccct gtgcctaaca ccagcgtttg aggggctggc 2280
          cttcctgccc tacagaggtc tctgccggct ctttccttgc tcaaccatgg ctgaaggaaa 2340
          ccagtgcaac agcactggct ctctccagga tccagaaggg gtttggtctg ggacttcctt 2400
          gctctccctc ttctcaagtg ccttaatagt agggtaagtt gttaagagtg ggggagagca 2460
          ggctggcagc tctccagtca ggaggcatag tttttactga acaatcaaag cacttggact 2520
          cttgctcttt ctactctgaa ctaataaatc tgttgccaag ctggctagaa 2570
           <![CDATA[ <210> 54]]>
           <![CDATA[ <211> 548]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Homo sapiens]]>
           <![CDATA[ <400> 54]]>
          Met Asp Glu Leu Phe Pro Leu Ile Phe Pro Ala Glu Pro Ala Gln Ala
          1 5 10 15
          Ser Gly Pro Tyr Val Glu Ile Ile Glu Gln Pro Lys Gln Arg Gly Met
                      20 25 30
          Arg Phe Arg Tyr Lys Cys Glu Gly Arg Ser Ala Gly Ser Ile Pro Gly
                  35 40 45
          Glu Arg Ser Thr Asp Thr Thr Lys Thr His Pro Thr Ile Lys Ile Asn
              50 55 60
          Gly Tyr Thr Gly Pro Gly Thr Val Arg Ile Ser Leu Val Thr Lys Asp
          65 70 75 80
          Pro Pro His Arg Pro His Pro His Glu Leu Val Gly Lys Asp Cys Arg
                          85 90 95
          Asp Gly Phe Tyr Glu Ala Glu Leu Cys Pro Asp Arg Cys Ile His Ser
                      100 105 110
          Phe Gln Asn Leu Gly Ile Gln Cys Val Lys Lys Arg Asp Leu Glu Gln
                  115 120 125
          Ala Ile Ser Gln Arg Ile Gln Thr Asn Asn Asn Pro Phe Gln Glu Glu
              130 135 140
          Gln Arg Gly Asp Tyr Asp Leu Asn Ala Val Arg Leu Cys Phe Gln Val
          145 150 155 160
          Thr Val Arg Asp Pro Ser Gly Arg Pro Leu Arg Leu Pro Pro Val Leu
                          165 170 175
          Ser His Pro Ile Phe Asp Asn Arg Ala Pro Asn Thr Ala Glu Leu Lys
                      180 185 190
          Ile Cys Arg Val Asn Arg Asn Ser Gly Ser Cys Leu Gly Gly Asp Glu
                  195 200 205
          Ile Phe Leu Leu Cys Asp Lys Val Gln Lys Glu Asp Ile Glu Val Tyr
              210 215 220
          Phe Thr Gly Pro Gly Trp Glu Ala Arg Gly Ser Phe Ser Gln Ala Asp
          225 230 235 240
          Val His Arg Gln Val Ala Ile Val Phe Arg Thr Pro Pro Tyr Ala Asp
                          245 250 255
          Pro Ser Leu Gln Ala Pro Val Arg Val Ser Met Gln Leu Arg Arg Pro
                      260 265 270
          Ser Asp Arg Glu Leu Ser Glu Pro Met Glu Phe Gln Tyr Leu Pro Asp
                  275 280 285
          Thr Asp Asp Arg His Arg Ile Glu Glu Lys Arg Lys Arg Thr Tyr Glu
              290 295 300
          Thr Phe Lys Ser Ile Met Lys Lys Ser Pro Phe Ser Gly Pro Thr Asp
          305 310 315 320
          Pro Arg Pro Pro Pro Arg Arg Ile Ala Val Pro Ser Arg Ser Ser Ala
                          325 330 335
          Ser Val Pro Lys Pro Ala Pro Gln Pro Tyr Pro Phe Thr Ser Ser Leu
                      340 345 350
          Ser Thr Ile Asn Tyr Asp Glu Phe Pro Thr Met Val Phe Pro Ser Gly
                  355 360 365
          Gln Ile Ser Gln Ala Ser Ala Leu Ala Pro Ala Pro Pro Gln Val Leu
              370 375 380
          Pro Gln Ala Pro Ala Pro Ala Pro Ala Pro Ala Met Val Ser Ala Leu
          385 390 395 400
          Ala Gln Ala Pro Ala Pro Val Pro Val Leu Ala Pro Gly Pro Pro Gln
                          405 410 415
          Ala Val Ala Pro Pro Ala Pro Lys Pro Thr Gln Ala Gly Glu Gly Thr
                      420 425 430
          Leu Ser Glu Ala Leu Leu Gln Leu Gln Phe Asp Asp Glu Asp Leu Gly
                  435 440 445
          Ala Leu Leu Gly Asn Ser Thr Asp Pro Ala Val Phe Thr Asp Leu Ala
              450 455 460
          Ser Val Asp Asn Ser Glu Phe Gln Gln Leu Leu Asn Gln Gly Ile Pro
          465 470 475 480
          Val Ala Pro His Thr Thr Glu Pro Met Leu Met Glu Tyr Pro Glu Ala
                          485 490 495
          Ile Thr Arg Leu Val Thr Gly Ala Gln Arg Pro Pro Asp Pro Ala Pro
                      500 505 510
          Ala Pro Leu Gly Ala Pro Gly Leu Pro Asn Gly Leu Leu Ser Gly Asp
                  515 520 525
          Glu Asp Phe Ser Ser Ile Ala Asp Met Asp Phe Ser Ala Leu Leu Ser
              530 535 540
          Gln Ile Ser Ser
          545
           <![CDATA[ <210> 55]]>
           <![CDATA[ <211> 1815]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Homo sapiens]]>
           <![CDATA[ <400> 55]]>
          atgcgcccga aaaaagatgg cctggaagat tttctgcgcc tgaccccgga aattaaaaaa 60
          cagctgggca gcctggtgag cgattattgc aacgtgctga acaaagaatt taccgcgggc 120
          agcgtggaaa ttaccctgcg cagctataaa atttgcaaag cgtttattaa cgaagcgaaa 180
          gcgcatggcc gcgaatgggg cggcctgatg gcgaccctga acatttgcaa cttttgggcg 240
          attctgcgca acaaccgcgt gcgccgccgc gcggaaaacg cgggcaacga tgcgtgcagc 300
          attgcgtgcc cgattgtgat gcgctatgtg ctggatcatc tgattgtggt gaccgatcgc 360
          ttttttattc aggcgccgag caaccgcgtg atgattccgg cgaccattgg caccgcgatg 420
          tataaactgc tgaaacatag ccgcgtgcgc gcgtatacct atagcaaagt gctgggcgtg 480
          gatcgcgcgg cgattatggc gagcggcaaa caggtggtgg aacatctgaa ccgcatggaa 540
          aaagaaggcc tgctgagcag caaatttaaa gcgttttgca aatgggtgtt tacctatccg 600
          gtgctggaag aaatgtttca gaccatggtg agcagcaaaa ccggccatct gaccgatgat 660
          gtgaaagatg tgcgcgcgct gattaaaacc ctgccgcgcg cgagctatag cagccatgcg 720
          ggccagcgca gctatgtgag cggcgtgctg ccggcgtgcc tgctgagcac caaaagcaaa 780
          gcggtggaaa ccccgattct ggtgagcggc gcggatcgca tggatgaaga actgatgggc 840
          aacgatggcg gcgcgagcca taccgaagcg cgctatagcg aaagcggcca gtttcatgcg 900
          tttaccgatg aactggaaag cctgccgagc ccgaccatgc cgctgaaacc gggcgcgcag 960
          agcgcggatt gcggcgatag cagcagcagc agcagcgata gcggcaacag cgataccgaa 1020
          cagagcgaac gcgaagaagc gcgcgcggaa gcgccgcgcc tgcgcgcgcc gaaaagccgc 1080
          cgcaccagcc gcccgaaccg cggccagacc ccgtgcccga gcaacgcggc ggaaccggaa 1140
          cagccgtgga ttgcggcggt gcatcaggaa agcgatgaac gcccgatttt tccgcatccg 1200
          agcaaaccga cctttctgcc gccggtgaaa cgcaaaaaag gcctgcgcga tagccgcgaa 1260
          ggcatgtttc tgccgaaacc ggaagcgggc agcgcgatta gcgatgtgtt tgaaggccgc 1320
          gaagtgtgcc agccgaaacg cattcgcccg tttcatccgc cgggcagccc gtgggcgaac 1380
          cgcccgctgc cggcgagcct ggcgccgacc ccgaccggcc cggtgcatga accggtgggc 1440
          agcctgaccc cggcgccggt gccgcagccg ctggatccgg cgccggcggt gaccccggaa 1500
          gcgagccatc tgctggaaga tccggatgaa gaaaccagcc aggcggtgaa agcgctgcgc 1560
          gaaatggcgg ataccgtgat tccgcagaaa gaagaagcgg cgatttgcgg ccagatggat 1620
          ctgagccatc cgccgccgcg cggccatctg gatgaactga ccaccaccct ggaaagcatg 1680
          accgaagatc tgaacctgga tagcccgctg accccggaac tgaacgaaat tctggatacc 1740
          tttctgaacg atgaatgcct gctgcatgcg atgcatatta gcaccggcct gagcattttt 1800
          gataccagcc tgttt 1815
           <![CDATA[ <210> 56]]>
           <![CDATA[ <211> 605]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial Sequence]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic]]>
           <![CDATA[ <400> 56]]>
          Met Arg Pro Lys Lys Asp Gly Leu Glu Asp Phe Leu Arg Leu Thr Pro
          1 5 10 15
          Glu Ile Lys Lys Gln Leu Gly Ser Leu Val Ser Asp Tyr Cys Asn Val
                      20 25 30
          Leu Asn Lys Glu Phe Thr Ala Gly Ser Val Glu Ile Thr Leu Arg Ser
                  35 40 45
          Tyr Lys Ile Cys Lys Ala Phe Ile Asn Glu Ala Lys Ala His Gly Arg
              50 55 60
          Glu Trp Gly Gly Leu Met Ala Thr Leu Asn Ile Cys Asn Phe Trp Ala
          65 70 75 80
          Ile Leu Arg Asn Asn Arg Val Arg Arg Arg Ala Glu Asn Ala Gly Asn
                          85 90 95
          Asp Ala Cys Ser Ile Ala Cys Pro Ile Val Met Arg Tyr Val Leu Asp
                      100 105 110
          His Leu Ile Val Val Thr Asp Arg Phe Phe Ile Gln Ala Pro Ser Asn
                  115 120 125
          Arg Val Met Ile Pro Ala Thr Ile Gly Thr Ala Met Tyr Lys Leu Leu
              130 135 140
          Lys His Ser Arg Val Arg Ala Tyr Thr Tyr Ser Lys Val Leu Gly Val
          145 150 155 160
          Asp Arg Ala Ala Ile Met Ala Ser Gly Lys Gln Val Val Glu His Leu
                          165 170 175
          Asn Arg Met Glu Lys Glu Gly Leu Leu Ser Ser Lys Phe Lys Ala Phe
                      180 185 190
          Cys Lys Trp Val Phe Thr Tyr Pro Val Leu Glu Glu Met Phe Gln Thr
                  195 200 205
          Met Val Ser Ser Lys Thr Gly His Leu Thr Asp Asp Val Lys Asp Val
              210 215 220
          Arg Ala Leu Ile Lys Thr Leu Pro Arg Ala Ser Tyr Ser Ser His Ala
          225 230 235 240
          Gly Gln Arg Ser Tyr Val Ser Gly Val Leu Pro Ala Cys Leu Leu Ser
                          245 250 255
          Thr Lys Ser Lys Ala Val Glu Thr Pro Ile Leu Val Ser Gly Ala Asp
                      260 265 270
          Arg Met Asp Glu Glu Leu Met Gly Asn Asp Gly Gly Ala Ser His Thr
                  275 280 285
          Glu Ala Arg Tyr Ser Glu Ser Gly Gln Phe His Ala Phe Thr Asp Glu
              290 295 300
          Leu Glu Ser Leu Pro Ser Pro Thr Met Pro Leu Lys Pro Gly Ala Gln
          305 310 315 320
          Ser Ala Asp Cys Gly Asp Ser Ser Ser Ser Ser Ser Asp Ser Gly Asn
                          325 330 335
          Ser Asp Thr Glu Gln Ser Glu Arg Glu Glu Ala Arg Ala Glu Ala Pro
                      340 345 350
          Arg Leu Arg Ala Pro Lys Ser Arg Arg Thr Ser Arg Pro Asn Arg Gly
                  355 360 365
          Gln Thr Pro Cys Pro Ser Asn Ala Ala Glu Pro Glu Gln Pro Trp Ile
              370 375 380
          Ala Ala Val His Gln Glu Ser Asp Glu Arg Pro Ile Phe Pro His Pro
          385 390 395 400
          Ser Lys Pro Thr Phe Leu Pro Pro Val Lys Arg Lys Lys Gly Leu Arg
                          405 410 415
          Asp Ser Arg Glu Gly Met Phe Leu Pro Lys Pro Glu Ala Gly Ser Ala
                      420 425 430
          Ile Ser Asp Val Phe Glu Gly Arg Glu Val Cys Gln Pro Lys Arg Ile
                  435 440 445
          Arg Pro Phe His Pro Pro Gly Ser Pro Trp Ala Asn Arg Pro Leu Pro
              450 455 460
          Ala Ser Leu Ala Pro Thr Pro Thr Gly Pro Val His Glu Pro Val Gly
          465 470 475 480
          Ser Leu Thr Pro Ala Pro Val Pro Gln Pro Leu Asp Pro Ala Pro Ala
                          485 490 495
          Val Thr Pro Glu Ala Ser His Leu Leu Glu Asp Pro Asp Glu Glu Thr
                      500 505 510
          Ser Gln Ala Val Lys Ala Leu Arg Glu Met Ala Asp Thr Val Ile Pro
                  515 520 525
          Gln Lys Glu Glu Ala Ala Ile Cys Gly Gln Met Asp Leu Ser His Pro
              530 535 540
          Pro Pro Arg Gly His Leu Asp Glu Leu Thr Thr Thr Leu Glu Ser Met
          545 550 555 560
          Thr Glu Asp Leu Asn Leu Asp Ser Pro Leu Thr Pro Glu Leu Asn Glu
                          565 570 575
          Ile Leu Asp Thr Phe Leu Asn Asp Glu Cys Leu Leu His Ala Met His
                      580 585 590
          Ile Ser Thr Gly Leu Ser Ile Phe Asp Thr Ser Leu Phe
                  595 600 605
           <![CDATA[ <210> 57]]>
           <![CDATA[ <211> 172]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 57]]>
          Arg Pro Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Gln Arg Gly
          1 5 10 15
          Asn Leu Val Arg His Ile Arg Thr His Thr Gly Glu Lys Pro Phe Ala
                      20 25 30
          Cys Asp Ile Cys Gly Lys Lys Phe Ala Leu Ser Phe Asn Leu Thr Arg
                  35 40 45
          His Thr Lys Ile His Thr Gly Ser Gln Lys Pro Phe Gln Cys Arg Ile
              50 55 60
          Cys Met Arg Asn Phe Ser Arg Ser Asp Asn Leu Thr Arg His Ile Arg
          65 70 75 80
          Thr His Thr Gly Glu Lys Pro Phe Ala Cys Asp Ile Cys Gly Lys Lys
                          85 90 95
          Phe Ala Asp Arg Ser His Leu Ala Arg His Thr Lys Ile His Thr Gly
                      100 105 110
          Ser Gln Lys Pro Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Gln
                  115 120 125
          Lys Ala His Leu Thr Ala His Ile Arg Thr His Thr Gly Glu Lys Pro
              130 135 140
          Phe Ala Cys Asp Ile Cys Gly Arg Lys Phe Ala Arg Ser Asp Asn Leu
          145 150 155 160
          Thr Arg His Thr Lys Ile His Leu Arg Gln Lys Asp
                          165 170
           <![CDATA[ <210> 58]]>
           <![CDATA[ <211> 516]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 58]]>
          cgaccattcc agtgtcgaat ctgcatgcgc aacttcagcc agcggggaaa cctggtgagg 60
          catatccgca cccacacggg agagaagcct tttgcctgcg atatttgtgg aaagaagttt 120
          gctctgagct tcaatctaac cagacacacc aagattcata ctgggtccca gaaaccgttc 180
          cagtgtagga tatgcatgag gaatttctct cggagtgaca acttaacgcg gcatataagg 240
          acgcacacag gtgaaaaacc atttgcatgc gacatctgtg gcaaaaagtt tgcggaccgg 300
          tctcaccttg cccgacacac aaaaatccat accggcagtc aaaagccctt tcaatgtcgc 360
          atttgcatgc gaaacttctc acagaaggcc catttgactg cccatattcg tactcatact 420
          ggcgagaaac ctttcgcttg cgatatatgt ggtcgtaagt ttgcacggtc ggacaacctc 480
          acacgccaca ctaagataca cctgcggcag aaggac 516
           <![CDATA[ <210> 59]]>
           <![CDATA[ <211> 172]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 59]]>
          Arg Pro Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Arg Ser Ser
          1 5 10 15
          Asn Leu Thr Arg His Ile Arg Thr His Thr Gly Glu Lys Pro Phe Ala
                      20 25 30
          Cys Asp Ile Cys Gly Lys Lys Phe Ala Asp Lys Arg Thr Leu Ile Arg
                  35 40 45
          His Thr Lys Ile His Thr Gly Ser Gln Lys Pro Phe Gln Cys Arg Ile
              50 55 60
          Cys Met Arg Asn Phe Ser Gln Arg Gly Asn Leu Val Arg His Ile Arg
          65 70 75 80
          Thr His Thr Gly Glu Lys Pro Phe Ala Cys Asp Ile Cys Gly Lys Lys
                          85 90 95
          Phe Ala Leu Ser Phe Asn Leu Thr Arg His Thr Lys Ile His Thr Gly
                      100 105 110
          Ser Gln Lys Pro Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Arg
                  115 120 125
          Ser Asp Asn Leu Thr Arg His Ile Arg Thr His Thr Gly Glu Lys Pro
              130 135 140
          Phe Ala Cys Asp Ile Cys Gly Arg Lys Phe Ala Asp Arg Ser His Leu
          145 150 155 160
          Ala Arg His Thr Lys Ile His Leu Arg Gln Lys Asp
                          165 170
           <![CDATA[ <210> 60]]>
           <![CDATA[ <211> 516]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 60]]>
          cgaccattcc agtgtcgaat ctgcatgcgc aacttcagcc gaagttccaa cctgacacgg 60
          catatccgca cccacacggg agagaagcct tttgcctgcg atatttgtgg aaagaagttt 120
          gctgacaagc ggaccttaat ccgccacacc aagattcata ctgggtccca gaaaccgttc 180
          cagtgtagga tatgcatgag gaatttctct cagcggggaa atctagtgcg acatataagg 240
          acgcacacag gtgaaaaacc atttgcatgc gacatctgtg gcaaaaagtt tgcgctgagc 300
          ttcaacttga ctcgtcacac aaaaatccat accggcagtc aaaagccctt tcaatgtcgc 360
          atttgcatgc gaaacttctc acggagtgac aatcttacga gacatattcg tactcatact 420
          ggcgagaaac ctttcgcttg cgatatatgt ggtcgtaagt ttgcagaccg gagccactta 480
          gccaggcaca ctaagataca cctgcggcag aaggac 516
           <![CDATA[ <210> 61]]>
           <![CDATA[ <211> 172]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 61]]>
          Arg Pro Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Asp Arg Ser
          1 5 10 15
          Ala Leu Ala Arg His Ile Arg Thr His Thr Gly Glu Lys Pro Phe Ala
                      20 25 30
          Cys Asp Ile Cys Gly Lys Lys Phe Ala Arg Ser Asp Asn Leu Thr Arg
                  35 40 45
          His Thr Lys Ile His Thr Gly Ser Gln Lys Pro Phe Gln Cys Arg Ile
              50 55 60
          Cys Met Arg Asn Phe Ser Gln Ser Gly Asp Leu Thr Arg His Ile Arg
          65 70 75 80
          Thr His Thr Gly Glu Lys Pro Phe Ala Cys Asp Ile Cys Gly Lys Lys
                          85 90 95
          Phe Ala Val Arg Gln Thr Leu Lys Gln His Thr Lys Ile His Thr Gly
                      100 105 110
          Ser Gln Lys Pro Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Ala
                  115 120 125
          Ala Gly Asn Leu Thr Arg His Ile Arg Thr His Thr Gly Glu Lys Pro
              130 135 140
          Phe Ala Cys Asp Ile Cys Gly Arg Lys Phe Ala Arg Ser Asp Asn Leu
          145 150 155 160
          Thr Arg His Thr Lys Ile His Leu Arg Gln Lys Asp
                          165 170
           <![CDATA[ <210> 62]]>
           <![CDATA[ <211> 516]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 62]]>
          cgaccattcc agtgtcgaat ctgcatgcgc aacttcagcg accggagcgc gctggcacgg 60
          catatccgca cccacacggg agagaagcct tttgcctgcg atatttgtgg aaagaagttt 120
          gctcgaagtg acaacttaac gcgccacacc aagattcata ctgggtccca gaaaccgttc 180
          cagtgtagga tatgcatgag gaatttctct cagtcagggg acctcactcg tcatataagg 240
          acgcacacag gtgaaaaacc atttgcatgc gacatctgtg gcaaaaagtt tgcggtacga 300
          cagacgctta aacaacacac aaaaatccat accggcagtc aaaagccctt tcaatgtcgc 360
          atttgcatgc gaaacttctc agccgctggt aacttgacac gacatattcg tactcatact 420
          ggcgagaaac ctttcgcttg cgatatatgt ggtcgtaagt ttgcaagatc tgataatcta 480
          acgcgtcaca ctaagataca cctgcggcag aaggac 516
           <![CDATA[ <210> 63]]>
           <![CDATA[ <211> 28]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 63]]>
          Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Gln Arg Gly Asn Leu
          1 5 10 15
          Val Arg His Ile Arg Thr His Thr Gly Glu Lys Pro
                      20 25
           <![CDATA[ <210> 64]]>
           <![CDATA[ <211> 29]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 64]]>
          Phe Ala Cys Asp Ile Cys Gly Lys Lys Phe Ala Leu Ser Phe Asn Leu
          1 5 10 15
          Thr Arg His Thr Lys Ile His Thr Gly Ser Gln Lys Pro
                      20 25
           <![CDATA[ <210> 65]]>
           <![CDATA[ <211> 28]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 65]]>
          Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Arg Ser Asp Asn Leu
          1 5 10 15
          Thr Arg His Ile Arg Thr His Thr Gly Glu Lys Pro
                      20 25
           <![CDATA[ <210> 66]]>
           <![CDATA[ <211> 29]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 66]]>
          Phe Ala Cys Asp Ile Cys Gly Lys Lys Phe Ala Asp Arg Ser His Leu
          1 5 10 15
          Ala Arg His Thr Lys Ile His Thr Gly Ser Gln Lys Pro
                      20 25
           <![CDATA[ <210> 67]]>
           <![CDATA[ <211> 28]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 67]]>
          Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Gln Lys Ala His Leu
          1 5 10 15
          Thr Ala His Ile Arg Thr His Thr Gly Glu Lys Pro
                      20 25
           <![CDATA[ <210> 68]]>
           <![CDATA[ <211> 28]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 68]]>
          Phe Ala Cys Asp Ile Cys Gly Arg Lys Phe Ala Arg Ser Asp Asn Leu
          1 5 10 15
          Thr Arg His Thr Lys Ile His Leu Arg Gln Lys Asp
                      20 25
           <![CDATA[ <210> 69]]>
           <![CDATA[ <211> 28]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 69]]>
          Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Arg Ser Ser Asn Leu
          1 5 10 15
          Thr Arg His Ile Arg Thr His Thr Gly Glu Lys Pro
                      20 25
           <![CDATA[ <210> 70]]>
           <![CDATA[ <211> 29]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 70]]>
          Phe Ala Cys Asp Ile Cys Gly Lys Lys Phe Ala Asp Lys Arg Thr Leu
          1 5 10 15
          Ile Arg His Thr Lys Ile His Thr Gly Ser Gln Lys Pro
                      20 25
           <![CDATA[ <210> 71]]>
           <![CDATA[ <211> 28]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 71]]>
          Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Gln Arg Gly Asn Leu
          1 5 10 15
          Val Arg His Ile Arg Thr His Thr Gly Glu Lys Pro
                      20 25
           <![CDATA[ <210> 72]]>
           <![CDATA[ <211> 29]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 72]]>
          Phe Ala Cys Asp Ile Cys Gly Lys Lys Phe Ala Leu Ser Phe Asn Leu
          1 5 10 15
          Thr Arg His Thr Lys Ile His Thr Gly Ser Gln Lys Pro
                      20 25
           <![CDATA[ <210> 73]]>
           <![CDATA[ <211> 28]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 73]]>
          Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Arg Ser Asp Asn Leu
          1 5 10 15
          Thr Arg His Ile Arg Thr His Thr Gly Glu Lys Pro
                      20 25
           <![CDATA[ <210> 74]]>
           <![CDATA[ <211> 28]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 74]]>
          Phe Ala Cys Asp Ile Cys Gly Arg Lys Phe Ala Asp Arg Ser His Leu
          1 5 10 15
          Ala Arg His Thr Lys Ile His Leu Arg Gln Lys Asp
                      20 25
           <![CDATA[ <210> 75]]>
           <![CDATA[ <211> 28]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 75]]>
          Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Asp Arg Ser Ala Leu
          1 5 10 15
          Ala Arg His Ile Arg Thr His Thr Gly Glu Lys Pro
                      20 25
           <![CDATA[ <210> 76]]>
           <![CDATA[ <211> 29]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 76]]>
          Phe Ala Cys Asp Ile Cys Gly Lys Lys Phe Ala Arg Ser Asp Asn Leu
          1 5 10 15
          Thr Arg His Thr Lys Ile His Thr Gly Ser Gln Lys Pro
                      20 25
           <![CDATA[ <210> 77]]>
           <![CDATA[ <211> 28]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 77]]>
          Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Gln Ser Gly Asp Leu
          1 5 10 15
          Thr Arg His Ile Arg Thr His Thr Gly Glu Lys Pro
                      20 25
           <![CDATA[ <210> 78]]>
           <![CDATA[ <211> 29]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 78]]>
          Phe Ala Cys Asp Ile Cys Gly Lys Lys Phe Ala Val Arg Gln Thr Leu
          1 5 10 15
          Lys Gln His Thr Lys Ile His Thr Gly Ser Gln Lys Pro
                      20 25
           <![CDATA[ <210> 79]]>
           <![CDATA[ <211> 28]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 79]]>
          Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Ala Ala Gly Asn Leu
          1 5 10 15
          Thr Arg His Ile Arg Thr His Thr Gly Glu Lys Pro
                      20 25
           <![CDATA[ <210> 80]]>
           <![CDATA[ <211> 28]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 80]]>
          Phe Ala Cys Asp Ile Cys Gly Arg Lys Phe Ala Arg Ser Asp Asn Leu
          1 5 10 15
          Thr Arg His Thr Lys Ile His Leu Arg Gln Lys Asp
                      20 25
           <![CDATA[ <210> 81]]>
           <![CDATA[ <211> 4104]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 81]]>
          atggacaaga agtactccat tgggctcgct atcggtacca acagcgtcgg ctgggccgtc 60
          attacggacg agtacaaggt gccgagcaaa aaattcaaag ttctgggcaa taccgatcgc 120
          cacagcataa agaagaacct cattggagcc ctcctgttcg actccgggga gacggccgaa 180
          gccacgcggc tcaaaagaac agcacggcgc agatataccc gcagaaagaa tcggatctgc 240
          tacctgcagg agatctttag taatgagatg gctaaggtgg atgactcttt cttccatagg 300
          ctggaggagt cctttttggt ggaggaggat aaaaagcacg agcgccaccc aatctttggc 360
          aatatcgtgg acgaggtggc gtaccatgaa aagtacccaa ccatatatca tctgaggaag 420
          aagctggtag acagtactga taaggctgac ttgcggttga tctatctcgc gctggcgcac 480
          atgatcaaat ttcggggaca cttcctcatc gagggggacc tgaacccaga caacagcgat 540
          gtcgacaaac tctttatcca actggttcag acttacaatc agcttttcga ggagaacccg 600
          atcaacgcat ccggcgttga cgccaaagca atcctgagcg ctaggctgtc caaatcccgg 660
          cggctcgaaa acctcatcgc acagctccct ggggagaaga agaacggcct gtttggtaat 720
          cttatcgccc tgtcactcgg gctgaccccc aactttaaat ctaacttcga cctggccgaa 780
          gatgccaagc tgcaactgag caaagacacc tacgatgatg atctcgacaa tctgctggcc 840
          cagatcggcg accagtacgc agaccttttt ttggcggcaa agaacctgtc agacgccatt 900
          ctgctgagtg atattctgcg agtgaacacg gagatcacca aagctccgct gagcgctagt 960
          atgatcaagc gctatgatga gcaccaccaa gacttgactt tgctgaaggc ccttgtcaga 1020
          cagcaactgc ctgagaagta caaggaaatt ttcttcgatc agtctaaaaa tggctacgcc 1080
          ggatacattg acggcggagc aagccaggag gaattttaca aatttattaa gcccatcttg 1140
          gaaaaaatgg acggcaccga ggagctgctg gtaaagctga acagagaaga tctgttgcgc 1200
          aaacagcgca ctttcgacaa tggaagcatc ccccaccaga ttcacctggg cgaactgcac 1260
          gctatcctca ggcggcaaga ggatttctac ccctttttga aagataacag ggaaaagatt 1320
          gagaaaatcc tcacatttcg gataccctac tatgtaggcc ccctcgctcg gggaaattcc 1380
          agattcgcgt ggatgactcg caaatcagaa gagaccatca ctccctggaa cttcgaggaa 1440
          gtcgtggata agggggcctc tgcccagtcc ttcatcgaaa ggatgactaa ctttgataaa 1500
          aatctgccta acgaaaaggt gcttcctaaa cactctctgc tgtacgagta cttcacagtt 1560
          tataacgagc tcaccaaggt caaatacgtc acagaaggga tgagaaagcc agcattcctg 1620
          tctggagagc agaagaaagc tatcgtggac ctcctcttca agacgaaccg gaaagttacc 1680
          gtgaaacagc tcaaagaaga ctatttcaaa aagattgaat gtttcgactc tgttgaaatc 1740
          agcggagtgg aggatcgctt caacgcatcc ctgggaacgt atcacgatct cctgaaaatc 1800
          attaaagaca aggacttcct ggacaatgag gagaacgagg acattcttga ggacattgtc 1860
          ctcaccctta cgttgtttga agatagggag atgattgaag aacgcttgaa aacttacgct 1920
          catctcttcg acgacaaagt catgaaacag ctcaagagac gccgatatac aggatggggg 1980
          cggctgtcaa gaaaactgat caatggcatc cgagacaagc agagtggaaa gacaatcctg 2040
          gattttctta agtccgatgg atttgccaac cggaacttca tgcagttgat ccatgatgac 2100
          tctctcacct ttaaggagga catccagaaa gcacaagttt ctggccaggg ggacagtctt 2160
          cacgagcaca tcgctaatct tgcaggtagc ccagctatca aaaagggaat actgcagacc 2220
          gttaaggtcg tggatgaact cgtcaaagta atgggaaggc ataagcccga gaatatcgtt 2280
          atcgagatgg cccgagagaa ccaaactacc cagaagggac agaagaacag tagggaaagg 2340
          atgaagagga ttgaagaggg tataaaagaa ctggggtccc aaatccttaa ggaacaccca 2400
          gttgaaaaca cccagcttca gaatgagaag ctctacctgt actacctgca gaacggcagg 2460
          gacatgtacg tggatcagga actggacatc aaccggttgt ccgactacga cgtggatgct 2520
          atcgtgcccc aaagctttct caaagatgat tctattgata ataaagtgtt gacaagatcc 2580
          gataaaaata gagggaagag tgataacgtc ccctcagaag aagttgtcaa gaaaatgaaa 2640
          aattattggc ggcagctgct gaacgccaaa ctgatcacac aacggaagtt cgataatctg 2700
          actaaggctg aacgaggtgg cctgtctgag ttggataaag ccggcttcat caaaaggcag 2760
          cttgttgaga cacgccagat caccaagcac gtggcccaaa ttctcgattc acgcatgaac 2820
          accaagtacg atgaaaatga caaactgatt cgagaggtga aagttattac tctgaagtct 2880
          aagctggtct cagatttcag aaaggacttt cagttttata aggtgagaga gatcaacaat 2940
          taccaccatg cgcatgatgc ctacctgaat gcagtggtag gcactgcact tatcaaaaaa 3000
          tatcccaagc tggaatctga atttgtttac ggagactata aagtgtacga tgttaggaaa 3060
          atgatcgcaa agtctgagca ggaaataggc aaggccaccg ctaagtactt cttttacagc 3120
          aatattatga attttttcaa gaccgagatt acactggcca atggagagat tcggaagcga 3180
          ccacttatcg aaacaaacgg agaaacagga gaaatcgtgt gggacaaggg tagggatttc 3240
          gcgacagtcc gcaaggtcct gtccatgccg caggtgaaca tcgttaaaaa gaccgaagta 3300
          cagaccggag gcttctccaa ggaaagtatc ctcccgaaaa ggaacagcga caagctgatc 3360
          gcacgcaaaa aagattggga ccccaagaaa tacggcggat tcgattctcc tacagtcgct 3420
          tacagtgtac tggttgtggc caaagtggag aaagggaagt ctaaaaaact caaaagcgtc 3480
          aaggaactgc tgggcatcac aatcatggag cgatccagct tcgagaaaaa ccccatcgac 3540
          tttctcgaag cgaaaggata taaagaggtc aaaaaagacc tcatcattaa gctgcccaag 3600
          tactctctct ttgagcttga aaacggccgg aaacgaatgc tcgctagtgc gggcgagctg 3660
          cagaaaggta acgagctggc actgccctct aaatacgtta atttcttgta tctggccagc 3720
          cactatgaaa agctcaaagg gtctcccgaa gataatgagc agaagcagct gttcgtggaa 3780
          caacacaaac actaccttga tgagatcatc gagcaaataa gcgagttctc caaaagagtg 3840
          atcctcgccg acgctaacct cgataaggtg ctttctgctt acaataagca cagggataag 3900
          cccatcaggg agcaggcaga aaacattatc cacttgttta ctctgaccaa cttgggcgcg 3960
          cctgcagcct tcaagtactt cgacaccacc atagacagaa agcggtacac ctctacaaag 4020
          gaggtcctgg acgccacact gattcatcag tcaattacgg ggctctatga aacaagaatc 4080
          gacctctctc agctcggtgg agac 4104
           <![CDATA[ <210> 82]]>
           <![CDATA[ <211> 1368]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 82]]>
          Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val
          1 5 10 15
          Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
                      20 25 30
          Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
                  35 40 45
          Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
              50 55 60
          Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
          65 70 75 80
          Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
                          85 90 95
          Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
                      100 105 110
          His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
                  115 120 125
          His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
              130 135 140
          Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
          145 150 155 160
          Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
                          165 170 175
          Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
                      180 185 190
          Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
                  195 200 205
          Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
              210 215 220
          Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
          225 230 235 240
          Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
                          245 250 255
          Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
                      260 265 270
          Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
                  275 280 285
          Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
              290 295 300
          Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
          305 310 315 320
          Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
                          325 330 335
          Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
                      340 345 350
          Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
                  355 360 365
          Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
              370 375 380
          Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
          385 390 395 400
          Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
                          405 410 415
          Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
                      420 425 430
          Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
                  435 440 445
          Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
              450 455 460
          Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
          465 470 475 480
          Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
                          485 490 495
          Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
                      500 505 510
          Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
                  515 520 525
          Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
              530 535 540
          Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
          545 550 555 560
          Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
                          565 570 575
          Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
                      580 585 590
          Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
                  595 600 605
          Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
              610 615 620
          Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
          625 630 635 640
          His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
                          645 650 655
          Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
                      660 665 670
          Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
                  675 680 685
          Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
              690 695 700
          Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
          705 710 715 720
          His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
                          725 730 735
          Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
                      740 745 750
          Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
                  755 760 765
          Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
              770 775 780
          Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
          785 790 795 800
          Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
                          805 810 815
          Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
                      820 825 830
          Leu Ser Asp Tyr Asp Val Asp Ala Ile Val Pro Gln Ser Phe Leu Lys
                  835 840 845
          Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
              850 855 860
          Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
          865 870 875 880
          Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
                          885 890 895
          Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
                      900 905 910
          Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
                  915 920 925
          Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
              930 935 940
          Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
          945 950 955 960
          Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
                          965 970 975
          Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
                      980 985 990
          Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
                  995 1000 1005
          Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala
              1010 1015 1020
          Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
              1025 1030 1035
          Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
              1040 1045 1050
          Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu
              1055 1060 1065
          Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val
              1070 1075 1080
          Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
              1085 1090 1095
          Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys
              1100 1105 1110
          Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
              1115 1120 1125
          Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
              1130 1135 1140
          Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
              1145 1150 1155
          Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser
              1160 1165 1170
          Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
              1175 1180 1185
          Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
              1190 1195 1200
          Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly
              1205 1210 1215
          Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
              1220 1225 1230
          Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
              1235 1240 1245
          Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
              1250 1255 1260
          His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
              1265 1270 1275
          Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
              1280 1285 1290
          Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
              1295 1300 1305
          Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala
              1310 1315 1320
          Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser
              1325 1330 1335
          Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
              1340 1345 1350
          Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
              1355 1360 1365
           <![CDATA[ <210> 83]]>
           <![CDATA[ <211> 97]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 83]]>
          gaggtaccat agagtgaggc ggttttagag ctagaaatag caagttaaaa taaggctagt 60
          ccgttatcaa cttgaaaaag tggcaccgag tcggtgc 97
           <![CDATA[ <210> 84]]>
           <![CDATA[ <211> 97]]>
           <![CDATA[ <212> RNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 84]]>
          gagguaccau agagugaggc gguuuuagag cuagaaauag caaguuaaaa uaaggcuagu 60
          ccguuaucaa cuugaaaaag uggcaccgag ucggugc 97
           <![CDATA[ <210> 85]]>
           <![CDATA[ <211> 21]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 85]]>
          gaggtaccat agagtgaggc g 21
           <![CDATA[ <210> 86]]>
           <![CDATA[ <211> 21]]>
           <![CDATA[ <212> RNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 86]]>
          gagguaccau agagugaggc g 21
           <![CDATA[ <210> 87]]>
           <![CDATA[ <211> 99]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 87]]>
          accgaggcga ggatgaagcc gaggttttag agctagaaat agcaagttaa aataaggcta 60
          gtccgttatc aacttgaaaa agtggcaccg agtcggtgc 99
           <![CDATA[ <210> 88]]>
           <![CDATA[ <211> 99]]>
           <![CDATA[ <212> RNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 88]]>
          accgaggcga ggaugaagcc gagguuuuag agcuagaaau agcaaguuaa aauaaggcua 60
          guccguuauc aacuugaaaa aguggcaccg agucggugc 99
           <![CDATA[ <210> 89]]>
           <![CDATA[ <211> 23]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 89]]>
          accgaggcga ggatgaagcc gag 23
           <![CDATA[ <210> 90]]>
           <![CDATA[ <211> 23]]>
           <![CDATA[ <212> RNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 90]]>
          accgaggcga ggaugaagcc gag 23
           <![CDATA[ <210> 91]]>
           <![CDATA[ <211> 100]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 91]]>
          accgaagccg agaggatact gcaggtttta gagctagaaa tagcaagtta aaataaggct 60
          agtccgttat caacttgaaa aagtggcacc gagtcggtgc 100
           <![CDATA[ <210> 92]]>
           <![CDATA[ <211> 100]]>
           <![CDATA[ <212> RNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 92]]>
          accgaagccg agaggauacu gcagguuuua gagcuagaaa uagcaaguua aaauaaggcu 60
          aguccguuau caacuugaaa aaguggcacc gagucggugc 100
           <![CDATA[ <210> 93]]>
           <![CDATA[ <211> 24]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 93]]>
          accgaagccg agaggatact gcag 24
           <![CDATA[ <210> 94]]>
           <![CDATA[ <211> 24]]>
           <![CDATA[ <212> RNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 94]]>
          accgaagccg agaggauacu gcag 24
           <![CDATA[ <210> 95]]>
           <![CDATA[ <211> 1569]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 95]]>
          gacgcattgg acgattttga tctggatatg ctgggaagtg acgccctcga tgattttgac 60
          cttgacatgc ttggttcgga tgcccttgat gactttgacc tcgacatgct cggcagtgac 120
          gcccttgatg atttcgacct ggacatgctg attaactcta gaagttccgg atctccgaaa 180
          aagaaacgca aagttggtag ccagtacctg cccgacaccg acgaccggca ccggatcgag 240
          gaaaagcgga agcggaccta cgagacattc aagagcatca tgaagaagtc ccccttcagc 300
          ggccccaccg accctagacc tccacctaga agaatcgccg tgcccagcag atccagcgcc 360
          agcgtgccaa aacctgcccc ccagccttac cccttcacca gcagcctgag caccatcaac 420
          tacgacgagt tccctaccat ggtgttcccc agcggccaga tctctcaggc ctctgctctg 480
          gctccagccc ctcctcaggt gctgcctcag gctcctgctc ctgcaccagc tccagccatg 540
          gtgtctgcac tggctcaggc accagcaccc gtgcctgtgc tggctcctgg acctccacag 600
          gctgtggctc caccagcccc taaacctaca caggccggcg agggcacact gtctgaagct 660
          ctgctgcagc tgcagttcga cgacgaggat ctgggagccc tgctgggaaa cagcaccgat 720
          cctgccgtgt tcaccgacct ggccagcgtg gacaacagcg agttccagca gctgctgaac 780
          cagggcatcc ctgtggcccc tcacaccacc gagcccatgc tgatggaata ccccgaggcc 840
          atcacccggc tcgtgacagg cgctcagagg cctcctgatc cagctcctgc ccctctggga 900
          gcaccaggcc tgcctaatgg actgctgtct ggcgacgagg acttcagctc tatcgccgat 960
          atggatttct cagccttgct gggctctggc agcggcagcc gggattccag ggaagggatg 1020
          tttttgccga agcctgaggc cggctccgct attagtgacg tgtttgaggg ccgcgaggtg 1080
          tgccagccaa aacgaatccg gccatttcat cctccaggaa gtccatgggc caaccgccca 1140
          ctccccgcca gcctcgcacc aacaccaacc ggtccagtac atgagccagt cgggtcactg 1200
          accccggcac cagtccctca gccactggat ccagcgcccg cagtgactcc cgaggccagt 1260
          cacctgttgg aggatcccga tgaagagacg agccaggctg tcaaagccct tcgggagatg 1320
          gccgatactg tgattcccca gaaggaagag gctgcaatct gtggccaaat ggacctttcc 1380
          catccgcccc caaggggcca tctggatgag ctgacaacca cacttgagtc catgaccgag 1440
          gatctgaacc tggactcacc cctgaccccg gaattgaacg agattctgga taccttcctg 1500
          aacgacgagt gcctcttgca tgccatgcat atcagcacag gactgtccat cttcgacaca 1560
          tctctgttt 1569
           <![CDATA[ <210> 96]]>
           <![CDATA[ <211> 523]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 96]]>
          Asp Ala Leu Asp Asp Phe Asp Leu Asp Met Leu Gly Ser Asp Ala Leu
          1 5 10 15
          Asp Asp Phe Asp Leu Asp Met Leu Gly Ser Asp Ala Leu Asp Asp Phe
                      20 25 30
          Asp Leu Asp Met Leu Gly Ser Asp Ala Leu Asp Asp Phe Asp Leu Asp
                  35 40 45
          Met Leu Ile Asn Ser Arg Ser Ser Gly Ser Pro Lys Lys Lys Arg Lys
              50 55 60
          Val Gly Ser Gln Tyr Leu Pro Asp Thr Asp Asp Arg His Arg Ile Glu
          65 70 75 80
          Glu Lys Arg Lys Arg Thr Tyr Glu Thr Phe Lys Ser Ile Met Lys Lys
                          85 90 95
          Ser Pro Phe Ser Gly Pro Thr Asp Pro Arg Pro Pro Pro Arg Arg Ile
                      100 105 110
          Ala Val Pro Ser Arg Ser Ser Ala Ser Val Pro Lys Pro Ala Pro Gln
                  115 120 125
          Pro Tyr Pro Phe Thr Ser Ser Leu Ser Thr Ile Asn Tyr Asp Glu Phe
              130 135 140
          Pro Thr Met Val Phe Pro Ser Gly Gln Ile Ser Gln Ala Ser Ala Leu
          145 150 155 160
          Ala Pro Ala Pro Pro Gln Val Leu Pro Gln Ala Pro Ala Pro Ala Pro
                          165 170 175
          Ala Pro Ala Met Val Ser Ala Leu Ala Gln Ala Pro Ala Pro Val Pro
                      180 185 190
          Val Leu Ala Pro Gly Pro Pro Gln Ala Val Ala Pro Pro Ala Pro Lys
                  195 200 205
          Pro Thr Gln Ala Gly Glu Gly Thr Leu Ser Glu Ala Leu Leu Gln Leu
              210 215 220
          Gln Phe Asp Asp Glu Asp Leu Gly Ala Leu Leu Gly Asn Ser Thr Asp
          225 230 235 240
          Pro Ala Val Phe Thr Asp Leu Ala Ser Val Asp Asn Ser Glu Phe Gln
                          245 250 255
          Gln Leu Leu Asn Gln Gly Ile Pro Val Ala Pro His Thr Thr Glu Pro
                      260 265 270
          Met Leu Met Glu Tyr Pro Glu Ala Ile Thr Arg Leu Val Thr Gly Ala
                  275 280 285
          Gln Arg Pro Pro Asp Pro Ala Pro Ala Pro Leu Gly Ala Pro Gly Leu
              290 295 300
          Pro Asn Gly Leu Leu Ser Gly Asp Glu Asp Phe Ser Ser Ile Ala Asp
          305 310 315 320
          Met Asp Phe Ser Ala Leu Leu Gly Ser Gly Ser Gly Ser Arg Asp Ser
                          325 330 335
          Arg Glu Gly Met Phe Leu Pro Lys Pro Glu Ala Gly Ser Ala Ile Ser
                      340 345 350
          Asp Val Phe Glu Gly Arg Glu Val Cys Gln Pro Lys Arg Ile Arg Pro
                  355 360 365
          Phe His Pro Pro Gly Ser Pro Trp Ala Asn Arg Pro Leu Pro Ala Ser
              370 375 380
          Leu Ala Pro Thr Pro Thr Gly Pro Val His Glu Pro Val Gly Ser Leu
          385 390 395 400
          Thr Pro Ala Pro Val Pro Gln Pro Leu Asp Pro Ala Pro Ala Val Thr
                          405 410 415
          Pro Glu Ala Ser His Leu Leu Glu Asp Pro Asp Glu Glu Thr Ser Gln
                      420 425 430
          Ala Val Lys Ala Leu Arg Glu Met Ala Asp Thr Val Ile Pro Gln Lys
                  435 440 445
          Glu Glu Ala Ala Ile Cys Gly Gln Met Asp Leu Ser His Pro Pro Pro
              450 455 460
          Arg Gly His Leu Asp Glu Leu Thr Thr Thr Leu Glu Ser Met Thr Glu
          465 470 475 480
          Asp Leu Asn Leu Asp Ser Pro Leu Thr Pro Glu Leu Asn Glu Ile Leu
                          485 490 495
          Asp Thr Phe Leu Asn Asp Glu Cys Leu Leu His Ala Met His Ile Ser
                      500 505 510
          Thr Gly Leu Ser Ile Phe Asp Thr Ser Leu Phe
                  515 520
           <![CDATA[ <210> 97]]>
           <![CDATA[ <211> 1939]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 97]]> Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val 1 5 10 15 Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe 20 25 30 Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile 35 40 45 Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu 50 55 60 Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys 65 70 75 80 Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser 85 90 95 Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys 100 105 110 His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr 115 120 125 His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp 130 135 140 Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His 145 150 155 160 Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro 165 170 175 Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr 180 185 190 Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala 195 200 205 Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn 210 215 220 Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn 225 230 235 240 Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe 245 250 255 Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp 260 265 270 Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp 275 280 285 Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp 290 295 300 Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser 305 310 315 320 Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys 325 330 335 Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe 340 345 350 Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser 355 360 365 Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp 370 375 380 Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg 385 390 395 400 Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu 
405 410 415 Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe 420 425 430 Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile 435 440 445 Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp 450 455 460 Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu 465 470 475 480 Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr 485 490 495 Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser 500 505 510 Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys 515 520 525 Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln 530 535 540 Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr 545 550 555 560 Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp 565 570 575 Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly 580 585 590 Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp 595 600 605 Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr 610 615 620 Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala 625 630 635 640 His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr 645 650 655 Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp 660 665 670 Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe 675 680 685 Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe 690 695 700 Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu 705 710 715 720 His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly 725 730 735 Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly 740 745 750 Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln 755 760 765 Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile 770 775 780 Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro 785 790 795 800 Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu 805 810 815 Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg 
820 825 830 Leu Ser Asp Tyr Asp Val Asp Ala Ile Val Pro Gln Ser Phe Leu Lys 835 840 845 Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg 850 855 860 Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys 865 870 875 880 Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys 885 890 895 Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp 900 905 910 Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr 915 920 925 Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp 930 935 940 Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser 945 950 955 960 Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg 965 970 975 Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val 980 985 990 Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe 995 1000 1005 Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala 1010 1015 1020 Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe 1025 1030 1035 Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala 1040 1045 1050 Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu 1055 1060 1065 Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val 1070 1075 1080 Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr 1085 1090 1095 Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys 1100 1105 1110 Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro 1115 1120 1125 Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val 1130 1135 1140 Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys 1145 1150 1155 Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser 1160 1165 1170 Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys 1175 1180 1185 Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu 1190 1195 1200 Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly 1205 1210 1215 Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val 1220 1225 1230 Asn Phe 
Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser 1235 1240 1245 Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys 1250 1255 1260 His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys 1265 1270 1275 Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala 1280 1285 1290 Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn 1295 1300 1305 Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala 1310 1315 1320 Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser 1325 1330 1335 Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr 1340 1345 1350 Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp 1355 1360 1365 Gly Gly Gly Pro Pro Lys Lys Lys Arg Lys Val Ala Ala Ala 1370 1375 1380 Ser Arg Tyr Pro Arg Gly Asp Ala Leu Asp Asp Phe Asp Leu Asp 1385 1390 1395 Met Leu Gly Ser Asp Ala Leu Asp Asp Phe Asp Leu Asp Met Leu 1400 1405 1410 Gly Ser Asp Ala Leu Asp Asp Phe Asp Leu Asp Met Leu Gly Ser 1415 1420 1425 Asp Ala Leu Asp Asp Phe Asp Leu Asp Met Leu Ile Asn Ser Arg 1430 1435 1440 Ser Ser Gly Ser Pro Lys Lys Lys Arg Lys Val Gly Ser Gln Tyr 1445 1450 1455 Leu Pro Asp Thr Asp Asp Arg His Arg Ile Glu Glu Lys Arg Lys 1460 1465 1470 Arg Thr Tyr Glu Thr Phe Lys Ser Ile Met Lys Lys Ser Pro Phe 1475 1480 1485 Ser Gly Pro Thr Asp Pro Arg Pro Pro Pro Arg Arg Ile Ala Val 1490 1495 1500 Pro Ser Arg Ser Ser Ala Ser Val Pro Lys Pro Ala Pro Gln Pro 1505 1510 1515 Tyr Pro Phe Thr Ser Ser Leu Ser Thr Ile Asn Tyr Asp Glu Phe 1520 1525 1530 Pro Thr Met Val Phe Pro Ser Gly Gln Ile Ser Gln Ala Ser Ala 1535 1540 1545 Leu Ala Pro Ala Pro Pro Gln Val Leu Pro Gln Ala Pro Ala Pro 1550 1555 1560 Ala Pro Ala Pro Ala Met Val Ser Ala Leu Ala Gln Ala Pro Ala 1565 1570 1575 Pro Val Pro Val Leu Ala Pro Gly Pro Pro Gln Ala Val Ala Pro 1580 1585 1590 Pro Ala Pro Lys Pro Thr Gln Ala Gly Glu Gly Thr Leu Ser Glu 1595 1600 1605 Ala Leu Leu Gln Leu Gln Phe Asp Asp Glu Asp Leu Gly Ala Leu 1610 1615 1620 Leu Gly Asn Ser Thr Asp Pro Ala Val Phe Thr Asp Leu Ala Ser 
1625 1630 1635 Val Asp Asn Ser Glu Phe Gln Gln Leu Leu Asn Gln Gly Ile Pro 1640 1645 1650 Val Ala Pro His Thr Thr Glu Pro Met Leu Met Glu Tyr Pro Glu 1655 1660 1665 Ala Ile Thr Arg Leu Val Thr Gly Ala Gln Arg Pro Pro Asp Pro 1670 1675 1680 Ala Pro Ala Pro Leu Gly Ala Pro Gly Leu Pro Asn Gly Leu Leu 1685 1690 1695 Ser Gly Asp Glu Asp Phe Ser Ser Ile Ala Asp Met Asp Phe Ser 1700 1705 1710 Ala Leu Leu Gly Ser Gly Ser Gly Ser Arg Asp Ser Arg Glu Gly 1715 1720 1725 Met Phe Leu Pro Lys Pro Glu Ala Gly Ser Ala Ile Ser Asp Val 1730 1735 1740 Phe Glu Gly Arg Glu Val Cys Gln Pro Lys Arg Ile Arg Pro Phe 1745 1750 1755 His Pro Pro Gly Ser Pro Trp Ala Asn Arg Pro Leu Pro Ala Ser 1760 1765 1770 Leu Ala Pro Thr Pro Thr Gly Pro Val His Glu Pro Val Gly Ser 1775 1780 1785 Leu Thr Pro Ala Pro Val Pro Gln Pro Leu Asp Pro Ala Pro Ala 1790 1795 1800 Val Thr Pro Glu Ala Ser His Leu Leu Glu Asp Pro Asp Glu Glu 1805 1810 1815 Thr Ser Gln Ala Val Lys Ala Leu Arg Glu Met Ala Asp Thr Val 1820 1825 1830 Ile Pro Gln Lys Glu Glu Ala Ala Ile Cys Gly Gln Met Asp Leu 1835 1840 1845 Ser His Pro Pro Pro Arg Gly His Leu Asp Glu Leu Thr Thr Thr 1850 1855 1860 Leu Glu Ser Met Thr Glu Asp Leu Asn Leu Asp Ser Pro Leu Thr 1865 1870 1875 Pro Glu Leu Asn Glu Ile Leu Asp Thr Phe Leu Asn Asp Glu Cys 1880 1885 1890 Leu Leu His Ala Met His Ile Ser Thr Gly Leu Ser Ile Phe Asp 1895 1900 1905 Thr Ser Leu Phe Pro Lys Lys Lys Arg Lys Val Arg Ser Lys Arg 1910 1915 1920 Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys Lys Lys Leu 1925 1930 1935 Asp
           <![CDATA[ <210> 98]]>
           <![CDATA[ <211> 112]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 98]]>
          gaaacaagct atttgctgat ttgtattagg taccatagag tgaggcgagg atgaagccga 60
          gaggatactg cagaggtctc tggtgcaatg tgtgtatgtg tgcgtttgtg tg 112
           <![CDATA[ <210> 99]]>
           <![CDATA[ <211> 71]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 99]]>
          gaaacaagct atttgctgat ttgtattagg taccatagag tgaggcgagg atgaagccga 60
          gaggatactg c 71
           <![CDATA[ <210> 100]]>
           <![CDATA[ <211> 112]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 100]]>
          gaaacaagct atttgctgat ttgtattagg taccatagag tgaggcgagg atgaagccga 60
          gaggatactg cagaggtctc tggtgcaatg tgtgtatgtg tgcgtttgtg tg 112
           <![CDATA[ <210> 101]]>
           <![CDATA[ <211> 108]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <221> misc_feature]]>
           <![CDATA[ <222> (52)..(52)]]>
           <![CDATA[ <223> n is a, c, g or t]]>
           <![CDATA[ <400> 101]]>
          gaaacaagct atttgctgat ttgtattagg taccatagag tgaggcgagg angaagccga 60
          gaggatactg cagaggtctc tggtgcaatg tgtgtatgtg tgcgtttg 108
           <![CDATA[ <210> 102]]>
           <![CDATA[ <211> 84]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 102]]>
          ttccagtgtc gaatctgcat gcgcaacttc agccagcggg gaaacctggt gaggcatatc 60
          cgcacccaca cgggagagaa gcct 84
           <![CDATA[ <210> 103]]>
           <![CDATA[ <211> 87]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 103]]>
          tttgcctgcg atatttgtgg aaagaagttt gctctgagct tcaatctaac cagacacacc 60
          aagattcata ctgggtccca gaaaccg 87
           <![CDATA[ <210> 104]]>
           <![CDATA[ <211> 85]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 104]]>
          ttccagtgta ggatatgcat gaggaatttc tctcggagtg acaacttaac gcggcatata 60
          aggacgcaca caggtgaaaa aacaa 85
           <![CDATA[ <210> 105]]>
           <![CDATA[ <211> 87]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 105]]>
          tttgcatgcg acatctgtgg caaaaagttt gcggaccggt ctcaccttgc ccgacacaca 60
          aaaatccata ccggcagtca aaagccc 87
           <![CDATA[ <210> 106]]>
           <![CDATA[ <211> 84]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 106]]>
          tttcaatgtc gcatttgcat gcgaaacttc tcacagaagg cccatttgac tgcccatatt 60
          cgtactcata ctggcgagaa acct 84
           <![CDATA[ <210> 107]]>
           <![CDATA[ <211> 84]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 107]]>
          ttcgcttgcg atatatgtgg tcgtaagttt gcacggtcgg acaacctcac acgccacact 60
          aagatacacc tgcggcagaa ggac 84
           <![CDATA[ <210> 108]]>
           <![CDATA[ <211> 85]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 108]]>
          ttccagtgtc gaatctgcat gcgcaacttc agcccgaatg tccaacctga cacggcatat 60
          ccgcacccac acgggagaga agcct 85
           <![CDATA[ <210> 109]]>
           <![CDATA[ <211> 87]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 109]]>
          tttgcctgcg atatttgtgg aaagaagttt gctgacaagc ggaccttaat ccgccacacc 60
          aagattcata ctgggtccca gaaaccg 87
           <![CDATA[ <210> 110]]>
           <![CDATA[ <211> 84]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 110]]>
          ttccagtgta ggatatgcat gaggaatttc tctcagcggg gaaatctagt gcgacatata 60
          aggacgcaca caggtgaaaa acca 84
           <![CDATA[ <210> 111]]>
           <![CDATA[ <211> 87]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 111]]>
          tttgcatgcg acatctgtgg caaaaagttt gcgctgagct tcaacttgac tcgtcacaca 60
          aaaatccata ccggcagtca aaagccc 87
           <![CDATA[ <210> 112]]>
           <![CDATA[ <211> 84]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 112]]>
          tttcaatgtc gcatttgcat gcgaaacttc tcacggagtg acaatcttac gagacatatt 60
          cgtactcata ctggcgagaa acct 84
           <![CDATA[ <210> 113]]>
           <![CDATA[ <211> 84]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 113]]>
          ttcgcttgcg atatatgtgg tcgtaagttt gcagaccgga gccacttagc caggcacact 60
          aagatacacc tgcggcagaa ggac 84
           <![CDATA[ <210> 114]]>
           <![CDATA[ <211> 84]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 114]]>
          ttccagtgtc gaatctgcat gcgcaacttc agcgaccgga gcgcgctggc acggcatatc 60
          cgcacccaca cgggagagaa gcct 84
           <![CDATA[ <210> 115]]>
           <![CDATA[ <211> 87]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 115]]>
          tttgcctgcg atatttgtgg aaagaagttt gctcgaagtg acaacttaac gcgccacacc 60
          aagattcata ctgggtccca gaaaccg 87
           <![CDATA[ <210> 116]]>
           <![CDATA[ <211> 84]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 116]]>
          ttccagtgta ggatatgcat gaggaatttc tctcagtcag gggacctcac tcgtcatata 60
          aggacgcaca caggtgaaaa acca 84
           <![CDATA[ <210> 117]]>
           <![CDATA[ <211> 87]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 117]]>
          tttgcatgcg acatctgtgg caaaaagttt gcggtacgac agacgcttaa acaacacaca 60
          aaaatccata ccggcagtca aaagccc 87
           <![CDATA[ <210> 118]]>
           <![CDATA[ <211> 84]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 118]]>
          tttcaatgtc gcatttgcat gcgaaacttc tcagccgctg gtaacttgac acgacatatt 60
          cgtactcata ctggcgagaa acct 84
           <![CDATA[ <210> 119]]>
           <![CDATA[ <211> 84]]>
           <![CDATA[ <212> DNA]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 119]]>
          ttcgcttgcg atatatgtgg tcgtaagttt gcaagatctg ataatctaac gcgtcacact 60
          aagatacacc tgcggcagaa ggac 84
           <![CDATA[ <210> 120]]>
           <![CDATA[ <211> 5]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic peptides]]>
           <![CDATA[ <400> 120]]>
          Thr Gly Glu Lys Pro
          1 5
           <![CDATA[ <210> 121]]>
           <![CDATA[ <211> 6]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthetic peptides]]>
           <![CDATA[ <400> 121]]>
          Thr Gly Ser Gln Lys Pro
          1 5
           <![CDATA[ <210> 122]]>
           <![CDATA[ <211> 98]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 122]]>
          Met His His Gln Gln Arg Met Ala Ala Leu Gly Thr Asp Lys Glu Leu
          1 5 10 15
          Ser Asp Leu Leu Asp Phe Ser Ala Met Phe Ser Pro Pro Val Ser Ser
                      20 25 30
          Gly Lys Asn Gly Pro Thr Ser Leu Ala Ser Gly His Phe Thr Gly Ser
                  35 40 45
          Asn Val Glu Asp Arg Ser Ser Ser Gly Ser Trp Gly Asn Gly Gly His
              50 55 60
          Pro Ser Pro Ser Arg Asn Tyr Gly Asp Gly Thr Pro Tyr Asp His Met
          65 70 75 80
          Thr Ser Arg Asp Leu Gly Ser His Asp Asn Leu Ser Pro Pro Phe Val
                          85 90 95
          Asn Ser
           <![CDATA[ <210> 123]]>
           <![CDATA[ <211> 72]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 123]]>
          Thr Asn Asn Ser Phe Ser Ser Asn Pro Ser Thr Pro Val Gly Ser Pro
          1 5 10 15
          Pro Ser Leu Ser Ala Gly Thr Ala Val Trp Ser Arg Asn Gly Gly Gln
                      20 25 30
          Ala Ser Ser Ser Pro Asn Tyr Glu Gly Pro Leu His Ser Leu Gln Ser
                  35 40 45
          Arg Ile Glu Asp Arg Leu Glu Arg Leu Asp Asp Ala Ile His Val Leu
              50 55 60
          Arg Asn His Ala Val Gly Pro Ser
          65 70
           <![CDATA[ <210> 124]]>
           <![CDATA[ <211> 14]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 124]]>
          Pro Leu Ser Glu Glu Glu Glu Leu Glu Leu Asn Thr Gln Arg
          1 5 10
           <![CDATA[ <210> 125]]>
           <![CDATA[ <211> 13]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 125]]>
          Ser Val Ser Glu Asp Val Asp Leu Leu Leu Asn Gln Arg
          1 5 10
           <![CDATA[ <210> 126]]>
           <![CDATA[ <211> 14]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 126]]>
          His Leu Thr Glu Asp His Leu Asp Leu Asn Asn Ala Gln Arg
          1 5 10
           <![CDATA[ <210> 127]]>
           <![CDATA[ <211> 237]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 127]]>
          Asn Ser Val Ser Ala Ala Thr Leu Thr Pro Ser Ser Gln Ala Val Thr
          1 5 10 15
          Ile Ser Ser Ser Gly Ser Gln Glu Ser Gly Ser Gln Pro Val Thr Ser
                      20 25 30
          Gly Thr Thr Ile Ser Ser Ala Ser Leu Val Ser Ser Gln Ala Ser Ser
                  35 40 45
          Ser Ser Phe Phe Thr Asn Ala Asn Ser Tyr Ser Thr Thr Thr Thr Thr
              50 55 60
          Ser Asn Met Gly Ile Met Asn Phe Thr Thr Ser Gly Ser Ser Gly Thr
          65 70 75 80
          Asn Ser Gln Gly Gln Thr Pro Gln Arg Val Ser Gly Leu Gln Gly Ser
                          85 90 95
          Asp Ala Leu Asn Ile Gln Gln Asn Gln Thr Ser Gly Gly Ser Leu Gln
                      100 105 110
          Ala Gly Gln Gln Lys Glu Gly Glu Gln Asn Gln Gln Thr Gln Gln Gln
                  115 120 125
          Gln Ile Leu Ile Gln Pro Gln Leu Val Gln Gly Gly Gln Ala Leu Gln
              130 135 140
          Ala Leu Gln Ala Ala Pro Leu Ser Gly Gln Thr Phe Thr Thr Gln Ala
          145 150 155 160
          Ile Ser Gln Glu Thr Leu Gln Asn Leu Gln Leu Gln Ala Val Pro Asn
                          165 170 175
          Ser Gly Pro Ile Ile Ile Arg Thr Pro Thr Val Gly Pro Asn Gly Gln
                      180 185 190
          Val Ser Trp Gln Thr Leu Gln Leu Gln Asn Leu Gln Val Gln Asn Pro
                  195 200 205
          Gln Ala Gln Thr Ile Thr Leu Ala Pro Met Gln Gly Val Ser Leu Gly
              210 215 220
          Gln Thr Ser Ser Ser Asn Thr Thr Leu Thr Pro Ile Ala
          225 230 235
           <![CDATA[ <210> 128]]>
           <![CDATA[ <211> 94]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 128]]>
          Met Glu Glu Pro Gln Ser Asp Pro Ser Val Glu Pro Pro Leu Ser Gln
          1 5 10 15
          Glu Thr Phe Ser Asp Leu Trp Lys Leu Leu Pro Glu Asn Asn Val Leu
                      20 25 30
          Ser Pro Leu Pro Ser Gln Ala Met Asp Asp Leu Met Leu Ser Pro Asp
                  35 40 45
          Asp Ile Glu Gln Trp Phe Thr Glu Asp Pro Gly Pro Asp Glu Ala Pro
              50 55 60
          Arg Met Pro Glu Ala Ala Pro Pro Val Ala Pro Ala Pro Ala Ala Pro
          65 70 75 80
          Thr Pro Ala Ala Pro Ala Pro Ala Pro Ser Trp Pro Leu Ser
                          85 90
           <![CDATA[ <210> 129]]>
           <![CDATA[ <211> 58]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 129]]>
          Ala Asp Ser Leu Leu Glu His Val Arg Glu Asp Phe Ser Gly Leu Leu
          1 5 10 15
          Pro Glu Glu Phe Ile Ser Leu Ser Pro Pro His Glu Ala Leu Asp Tyr
                      20 25 30
          His Phe Gly Leu Glu Glu Gly Glu Gly Ile Arg Asp Leu Phe Asp Cys
                  35 40 45
          Asp Phe Gly Asp Leu Thr Pro Leu Asp Phe
              50 55
           <![CDATA[ <210> 130]]>
           <![CDATA[ <211> 63]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 130]]>
          Met Glu Leu Leu Ser Pro Pro Leu Arg Asp Val Asp Leu Thr Ala Pro
          1 5 10 15
          Asp Gly Ser Leu Cys Ser Phe Ala Thr Thr Asp Asp Phe Tyr Asp Asp
                      20 25 30
          Pro Cys Phe Asp Ser Pro Asp Leu Arg Phe Phe Glu Asp Leu Asp Pro
                  35 40 45
          Arg Leu Met His Val Gly Ala Leu Leu Lys Pro Glu Glu His Ser
              50 55 60
           <![CDATA[ <210> 131]]>
           <![CDATA[ <211> 127]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 131]]>
          Leu Ala Ala Gln Ser Leu Val Pro Pro Pro Gly Leu Pro Gly Ser Ser
          1 5 10 15
          Thr Pro Gly Val Leu Pro Tyr Phe Pro Pro Gly Leu Pro Pro Pro Asp
                      20 25 30
          Ala Gly Gly Ala Pro Gln Ser Ser Met Ser Glu Ser Pro Asp Val Asn
                  35 40 45
          Leu Val Thr Gln Gln Leu Ser Lys Ser Gln Val Glu Asp Pro Leu Pro
              50 55 60
          Pro Val Phe Ser Gly Thr Pro Lys Gly Ser Gly Ala Gly Tyr Gly Val
          65 70 75 80
          Gly Phe Asp Leu Glu Glu Phe Leu Asn Gln Ser Phe Asp Met Gly Val
                          85 90 95
          Ala Asp Gly Pro Gln Asp Gly Gln Ala Asp Ser Ala Ser Leu Ser Ala
                      100 105 110
          Ser Leu Leu Ala Asp Trp Leu Glu Gly His Gly Met Asn Pro Ala
                  115 120 125
           <![CDATA[ <210> 132]]>
           <![CDATA[ <211> 102]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 132]]>
          Pro Glu Lys Pro Leu Phe Ser Ser Ala Ser Pro Gln Asp Ser Ser Pro
          1 5 10 15
          Arg Leu Ser Thr Phe Pro Gln His His His Pro Gly Ile Pro Gly Val
                      20 25 30
          Ala His Ser Val Ile Ser Thr Arg Thr Pro Pro Pro Pro Ser Pro Leu
                  35 40 45
          Pro Phe Pro Thr Gln Ala Ile Leu Pro Pro Ala Pro Ser Ser Tyr Phe
              50 55 60
          Ser His Pro Thr Ile Arg Tyr Pro Pro His Leu Asn Pro Gln Asp Thr
          65 70 75 80
          Leu Lys Asn Tyr Val Pro Ser Tyr Asp Pro Ser Ser Pro Gln Thr Ser
                          85 90 95
          Gln Ser Trp Tyr Leu Gly
                      100
           <![CDATA[ <210> 133]]>
           <![CDATA[ <211> 260]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 133]]>
          Gln Tyr Leu Pro Asp Thr Asp Asp Arg His Arg Ile Glu Glu Lys Arg
          1 5 10 15
          Lys Arg Thr Tyr Glu Thr Phe Lys Ser Ile Met Lys Lys Ser Pro Phe
                      20 25 30
          Ser Gly Pro Thr Asp Pro Arg Pro Pro Pro Arg Arg Ile Ala Val Pro
                  35 40 45
          Ser Arg Ser Ser Ala Ser Val Pro Lys Pro Ala Pro Gln Pro Tyr Pro
              50 55 60
          Phe Thr Ser Ser Leu Ser Thr Ile Asn Tyr Asp Glu Phe Pro Thr Met
          65 70 75 80
          Val Phe Pro Ser Gly Gln Ile Ser Gln Ala Ser Ala Leu Ala Pro Ala
                          85 90 95
          Pro Pro Gln Val Leu Pro Gln Ala Pro Ala Pro Ala Pro Ala Pro Ala
                      100 105 110
          Met Val Ser Ala Leu Ala Gln Ala Pro Ala Pro Val Pro Val Leu Ala
                  115 120 125
          Pro Gly Pro Pro Gln Ala Val Ala Pro Pro Ala Pro Lys Pro Thr Gln
              130 135 140
          Ala Gly Glu Gly Thr Leu Ser Glu Ala Leu Leu Gln Leu Gln Phe Asp
          145 150 155 160
          Asp Glu Asp Leu Gly Ala Leu Leu Gly Asn Ser Thr Asp Pro Ala Val
                          165 170 175
          Phe Thr Asp Leu Ala Ser Val Asp Asn Ser Glu Phe Gln Gln Leu Leu
                      180 185 190
          Asn Gln Gly Ile Pro Val Ala Pro His Thr Thr Glu Pro Met Leu Met
                  195 200 205
          Glu Tyr Pro Glu Ala Ile Thr Arg Leu Val Thr Gly Ala Gln Arg Pro
              210 215 220
          Pro Asp Pro Ala Pro Ala Pro Leu Gly Ala Pro Gly Leu Pro Asn Gly
          225 230 235 240
          Leu Leu Ser Gly Asp Glu Asp Phe Ser Ser Ile Ala Asp Met Asp Phe
                          245 250 255
          Ser Ala Leu Leu
                      260
           <![CDATA[ <210> 134]]>
           <![CDATA[ <211> 124]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 134]]>
          Gly Phe Ser Val Asp Thr Ser Ala Leu Leu Asp Leu Phe Ser Pro Ser
          1 5 10 15
          Val Thr Val Pro Asp Met Ser Leu Pro Asp Leu Asp Ser Ser Leu Ala
                      20 25 30
          Ser Ile Gln Glu Leu Leu Ser Pro Gln Glu Pro Pro Arg Pro Pro Glu
                  35 40 45
          Ala Glu Asn Ser Ser Pro Asp Ser Gly Lys Gln Leu Val His Tyr Thr
              50 55 60
          Ala Gln Pro Leu Phe Leu Leu Asp Pro Gly Ser Val Asp Thr Gly Ser
          65 70 75 80
          Asn Asp Leu Pro Val Leu Phe Glu Leu Gly Glu Gly Ser Tyr Phe Ser
                          85 90 95
          Glu Gly Asp Gly Phe Ala Glu Asp Pro Thr Ile Ser Leu Leu Thr Gly
                      100 105 110
          Ser Glu Pro Pro Lys Ala Lys Asp Pro Thr Val Ser
                  115 120
           <![CDATA[ <210> 135]]>
           <![CDATA[ <211> 8]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 135]]>
          Pro Lys Lys Lys Arg Lys Val Glu
          1 5
           <![CDATA[ <210> 136]]>
           <![CDATA[ <211> 9]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 136]]>
          Pro Ala Ala Lys Arg Val Lys Leu Asp
          1 5
           <![CDATA[ <210> 137]]>
           <![CDATA[ <211> 9]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 137]]>
          Pro Ala Ala Lys Lys Lys Lys Lys Leu Asp
          1 5
           <![CDATA[ <210> 138]]>
           <![CDATA[ <211> 18]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 138]]>
          Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys Lys Lys
          1 5 10 15
          Leu Asp
           <![CDATA[ <210> 139]]>
           <![CDATA[ <211> 20]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 139]]>
          Lys Arg Thr Ala Asp Gly Ser Glu Phe Glu Ser Thr Pro Lys Lys Lys
          1 5 10 15
          Arg Lys Val Glu
                      20
           <![CDATA[ <210> 140]]>
           <![CDATA[ <211> 23]]>
           <![CDATA[ <212> PRT]]>
           <![CDATA[ <213> Artificial sequences]]>
           <![CDATA[ <220>]]>
           <![CDATA[ <223> Synthesis]]>
           <![CDATA[ <400> 140]]>
          Pro Arg Arg Arg Pro Leu His Ser Ser Ala Met Glu Val Gln Thr Lys
          1 5 10 15
          Lys Val Arg Lys Val Pro Pro
                      20
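The entries above follow the usual listing layout: `<211>` declares the sequence length, `<212>` the molecule type, and `<400>` introduces the residues in three-letter code with position rulers underneath. As an illustrative sketch only (the helper names `parse_residues` and `check_length` are hypothetical and not part of the listing), the declared length of an entry can be checked against its residues like this, using SEQ ID NO: 135 and SEQ ID NO: 136 as input:

```python
# Standard three-letter to one-letter amino acid codes (20 common residues).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def parse_residues(lines):
    """Join three-letter residues into a one-letter string.

    Purely numeric tokens are treated as position-ruler marks and skipped.
    """
    residues = []
    for line in lines:
        for token in line.split():
            if token.isdigit():
                continue  # ruler line such as "1 5 10 15"
            residues.append(THREE_TO_ONE[token])
    return "".join(residues)


def check_length(declared_length, lines):
    """Return the one-letter sequence and whether it matches the <211> value."""
    sequence = parse_residues(lines)
    return sequence, len(sequence) == declared_length


# SEQ ID NO: 135 (<211> 8) and SEQ ID NO: 136 (<211> 9), copied from the listing.
seq_135, ok_135 = check_length(8, ["Pro Lys Lys Lys Arg Lys Val Glu", "1 5"])
seq_136, ok_136 = check_length(9, ["Pro Ala Ala Lys Arg Val Lys Leu Asp", "1 5"])

print(seq_135, ok_135)  # PKKKRKVE True
print(seq_136, ok_136)  # PAAKRVKLD True
```

The same check can be run over any other entry by passing its `<211>` value and its residue lines.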
          
      


Claims (58)

1. An isolated nucleic acid comprising a transgene configured to express at least one DNA binding domain fused to at least one transcriptional regulatory domain, wherein the DNA binding domain binds to a target gene or a regulatory region of a target gene, wherein the target gene encodes a voltage-gated sodium channel, and wherein the at least one transcriptional regulatory domain comprises a TCF4 transactivator, a MEF2A transactivator, a MEF2C transactivator, a MEF2D transactivator, an Sp1 glutamate-rich transactivator, a p53 transactivator domain, an E2F1 transactivator, a MyoD transactivator, a MAPK7 transactivator domain, an NF1B proline-rich transactivator, or a RelA transactivator, or any combination thereof.
2. The isolated nucleic acid of claim 1, wherein the transgene additionally comprises a nuclear localization sequence.
3. An isolated nucleic acid comprising a transgene configured to express at least one DNA binding domain fused to at least one transcriptional regulatory domain, wherein the DNA binding domain binds to a target gene or a regulatory region of a target gene, wherein the target gene encodes a voltage-gated sodium channel, and wherein the transgene comprises a nuclear localization sequence.
4. The isolated nucleic acid of claim 3, wherein the at least one transcriptional regulatory domain comprises a VPR transactivator, an Rta transactivator, a p65 transactivator, an Hsf1 transactivator, a TCF4 transactivator, a MEF2A transactivator, a MEF2C transactivator, a MEF2D transactivator, an Sp1 glutamate-rich transactivator, a p53 transactivator domain, an E2F1 transactivator, a MyoD transactivator, a MAPK7 transactivator domain, an NF1B proline-rich transactivator, or a RelA transactivator, or any combination thereof.
5. The isolated nucleic acid of any one of claims 2 to 4, wherein the nuclear localization sequence comprises any one of SEQ ID NOs: 135 to 140.
6. The isolated nucleic acid of any one of claims 2 to 4, wherein the nuclear localization sequence comprises any combination of SEQ ID NOs: 135 to 140.
7. The isolated nucleic acid of any one of claims 1 to 6, wherein the transgene is flanked by inverted terminal repeats (ITRs) derived from adeno-associated virus (AAV).
8. The isolated nucleic acid of any one of claims 1 to 7, wherein the transcriptional regulatory domain upregulates expression of the target gene.
9. The isolated nucleic acid of any one of claims 1 to 8, wherein the at least one DNA binding domain binds to an untranslated region of the target gene.
10. The isolated nucleic acid of claim 9, wherein the untranslated region is an enhancer, a promoter, an intron, and/or a repressor.
11. The isolated nucleic acid of any one of claims 1 to 10, wherein the DNA binding domain binds between 2 and 2000 bp upstream or between 2 and 2000 bp downstream of a regulatory region of the target gene.
12. The isolated nucleic acid of any one of claims 1 to 11, wherein the at least one DNA binding domain encodes a zinc finger protein (ZFP), a transcription activator-like effector (TALE), a dCas protein (e.g., dCas9 or dCas12a), and/or a homeodomain.
13. The isolated nucleic acid of any one of claims 1 to 12, wherein the at least one DNA binding domain binds to the nucleic acid sequence set forth in any one of SEQ ID NOs: 5 to 7.
14. The isolated nucleic acid of any one of claims 1 to 13, wherein the at least one DNA binding domain binds to at least 2 (e.g., at least 3, 4, 5, 6, 7, 8, 9, 10, or more) consecutive nucleotides of the nucleic acid sequence set forth in SEQ ID NO: 3.
15. The isolated nucleic acid of any one of claims 1 to 14, wherein the at least one DNA binding domain is a zinc finger protein comprising a recognition helix encoded by a nucleic acid having the sequence set forth in any one of SEQ ID NOs: 11 to 16, 23 to 28, or 35 to 40.
16. The isolated nucleic acid of claim 15, wherein:
(i) the at least one DNA binding domain is a zinc finger protein comprising a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 11, a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 12, a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 13, a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 14, a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 15, and/or a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 16;
(ii) the at least one DNA binding domain is a zinc finger protein comprising a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 23, a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 24, a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 25, a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 26, a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 27, and/or a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 28; or
(iii) the at least one DNA binding domain is a zinc finger protein comprising a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 35, a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 36, a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 37, a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 38, a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 39, and/or a recognition helix encoded by a nucleic acid comprising SEQ ID NO: 40.
17. The isolated nucleic acid of any one of claims 1 to 14, wherein the at least one DNA binding domain is a zinc finger protein comprising the amino acid sequence set forth in any one of SEQ ID NOs: 17 to 22, 29 to 34, or 41 to 46.
18. The isolated nucleic acid of claim 17, wherein:
(i) the at least one DNA binding domain is a zinc finger protein comprising a recognition helix comprising SEQ ID NO: 17, a recognition helix comprising SEQ ID NO: 18, a recognition helix comprising SEQ ID NO: 19, a recognition helix comprising SEQ ID NO: 20, a recognition helix comprising SEQ ID NO: 21, and/or a recognition helix comprising SEQ ID NO: 22;
(ii) the at least one DNA binding domain is a zinc finger protein comprising a recognition helix comprising SEQ ID NO: 29, a recognition helix comprising SEQ ID NO: 30, a recognition helix comprising SEQ ID NO: 31, a recognition helix comprising SEQ ID NO: 32, a recognition helix comprising SEQ ID NO: 33, and/or a recognition helix comprising SEQ ID NO: 34; or
(iii) the at least one DNA binding domain is a zinc finger protein comprising a recognition helix comprising SEQ ID NO: 41, a recognition helix comprising SEQ ID NO: 42, a recognition helix comprising SEQ ID NO: 43, a recognition helix comprising SEQ ID NO: 44, a recognition helix comprising SEQ ID NO: 45, and/or a recognition helix comprising SEQ ID NO: 46.
19. The isolated nucleic acid of any one of claims 1 to 18, wherein the at least one DNA binding domain is a dCas protein, optionally a dCas9 protein, and optionally wherein the isolated nucleic acid additionally comprises at least one guide nucleic acid.
20. The isolated nucleic acid of claim 19, wherein the guide nucleic acid comprises a spacer sequence targeting SCN1A.
21. The isolated nucleic acid of claim 19 or 20, wherein the guide nucleic acid comprises a spacer sequence having the nucleotide sequence of any one of SEQ ID NOs: 85, 86, 89, 90, 93, or 94.
22. The isolated nucleic acid of any one of claims 19 to 21, wherein the guide nucleic acid comprises the nucleotide sequence of any one of SEQ ID NOs: 83 to 94.
23. The isolated nucleic acid of any one of claims 1 to 22, wherein the at least one transcriptional regulatory domain is encoded by the amino acid sequence set forth in any one of SEQ ID NOs: 122 to 134.
24. The isolated nucleic acid of any one of claims 1 to 23, wherein the nucleic acid comprises an AAV2 ITR.
25. The isolated nucleic acid of claim 24, wherein the ITR is a ΔTR and/or an mTR.
26. The isolated nucleic acid of any one of claims 1 to 25, wherein the transgene is operably linked to a promoter.
27. The isolated nucleic acid of claim 26, wherein the promoter is a tissue-specific promoter, optionally wherein the promoter is a neuronal promoter, such as SST, NPY, phosphate-activated glutaminase (PAG), vesicular glutamate transporter-1 (VGLUT1), glutamate decarboxylase 65 and 67 (GAD65, GAD67), synapsin I, a-CamKII, Dock10, Prox1, parvalbumin (PV), somatostatin (SST), cholecystokinin (CCK), calretinin (CR), or neuropeptide Y (NPY).
28. The isolated nucleic acid of any one of claims 1 to 27, wherein the at least one DNA binding domain is fused to the at least one transcriptional regulatory domain via a linker domain.
29. The isolated nucleic acid of claim 28, wherein the linker domain is optionally:
(i) a flexible linker, optionally consisting of glycine, or
(ii) a cleavable linker.
30. The isolated nucleic acid of any one of claims 1 to 29, wherein the transgene encodes 1 DNA binding domain, 2 DNA binding domains, 3 DNA binding domains, 4 DNA binding domains, 5 DNA binding domains, 6 DNA binding domains, 7 DNA binding domains, 8 DNA binding domains, 9 DNA binding domains, or 10 DNA binding domains.
31. The isolated nucleic acid of any one of claims 1 to 30, wherein the transgene encodes 1 transcriptional regulatory domain, 2 transcriptional regulatory domains, 3 transcriptional regulatory domains, 4 transcriptional regulatory domains, 5 transcriptional regulatory domains, 6 transcriptional regulatory domains, 7 transcriptional regulatory domains, 8 transcriptional regulatory domains, 9 transcriptional regulatory domains, or 10 transcriptional regulatory domains.
32. A recombinant AAV (rAAV) comprising:
(i) the isolated nucleic acid of any one of claims 1 to 31, and
(ii) at least one capsid protein.
33. The rAAV of claim 32, wherein the AAV capsid protein serotype is selected from the group consisting of: AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAVrh8, AAV9, AAV10, AAVrh10, or AAV.PHPB.
34. A method of increasing expression of a target gene, the method comprising administering to a cell or individual comprising the target gene the isolated nucleic acid of any one of claims 1 to 31 or the rAAV of claim 32 or 33.
35. The method of claim 34, wherein the individual is haploinsufficient for the target gene.
36. The method of claim 34 or 35, wherein the target gene is SCN1A.
37. The method of any one of claims 34 to 36, wherein the cell is a neuron, optionally a GABAergic neuron.
38. The method of any one of claims 34 to 37, wherein administration of the isolated nucleic acid of any one of claims 1 to 31 or the rAAV of claim 32 or 33 results in an increase in target gene expression of at least 2-fold, at least 10-fold, at least 20-fold, at least 30-fold, at least 40-fold, at least 50-fold, at least 60-fold, at least 70-fold, at least 80-fold, at least 90-fold, or at least 100-fold relative to expression of the transgene in the individual prior to administration.
39. A method of treating Dravet syndrome in an individual, comprising administering to an individual expressing a target gene the isolated nucleic acid of any one of claims 1 to 31 or the rAAV of claim 32 or 33.
40. The method of claim 39, wherein expression of the target gene in the individual is reduced compared to a normal individual.
41. The method of claim 39 or 40, wherein the individual is, or is suspected of being, haploinsufficient for expression of the target gene relative to a normal individual.
42. The method of any one of claims 39 to 41, wherein the individual has or is suspected of having a disorder caused by haploinsufficient expression of the target gene.
43. The method of any one of claims 39 to 42, wherein the target gene is SCN1A.
44. The method of any one of claims 39 to 43, wherein the isolated nucleic acid or the rAAV is administered by intravenous injection, intramuscular injection, inhalation, subcutaneous injection, and/or intracranial injection.
The method of any one of claims 39 to 44, wherein administration of the isolated nucleic acid or the rAAV results in an increase in target gene expression of at least 2-fold, at least 10-fold, at least 20-fold, at least 30-fold, at least 40-fold, at least 50-fold, at least 60-fold, at least 70-fold, at least 80-fold, at least 90-fold, or at least 100-fold relative to expression of the transgene in the individual prior to administration.
A composition comprising the isolated nucleic acid of any one of claims 1 to 31 or the rAAV of claim 32 or 33.
The composition of claim 46, further comprising a pharmaceutically acceptable carrier.
A kit comprising: a container housing the isolated nucleic acid of any one of claims 1 to 31 or the rAAV of claim 32 or 33.
The kit of claim 48, wherein the kit further comprises a container housing a pharmaceutically acceptable carrier.
The kit of claim 48 or 49, wherein the isolated nucleic acid of the rAAV and the pharmaceutically acceptable carrier are housed in the same container.
The kit of any one of claims 48 to 50, wherein the container is a syringe.
A host cell comprising the isolated nucleic acid of any one of claims 1 to 31 or the rAAV of claim 32 or 33.
The host cell of claim 52, wherein the host cell is a eukaryotic cell.
The host cell of claim 52, wherein the host cell is a mammalian cell, optionally a human cell, optionally a neuron, optionally a GABAergic neuron.
A method of identifying a GABAergic promoter, the method comprising: (i) bioinformatically mining neuronal promoters to select candidate enhancer elements; (ii) individually fusing each candidate enhancer element to a minimal promoter linked to a transgene, thereby generating a library of candidate promoters linked to the transgene; (iii) delivering each transgene of the library to an individual; and (iv) evaluating expression of each transgene in GABA neurons of the individual relative to non-target tissue.
The method of claim 55, wherein in (iii) each transgene of the library is delivered simultaneously to the same individual.
The method of claim 55 or 56, wherein each transgene comprises a unique barcode.
The method of any one of claims 55 to 57, wherein the neuronal promoter is an SCN1A, GAD1, or GAD2 promoter.
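Step (iv) of the promoter-identification method, combined with the unique barcode recited for each transgene, amounts to comparing per-barcode expression in the target neurons against non-target tissue. The claims do not specify how that comparison is computed, so the following Python sketch is only one plausible scoring scheme: the function name `rank_candidates`, the pseudocount, the log2 enrichment score, and the barcode labels and read counts are all hypothetical and do not come from the patent.

```python
import math


def rank_candidates(target_counts, off_target_counts, pseudocount=1.0):
    """Rank barcoded candidate promoters by log2 enrichment in target tissue.

    target_counts / off_target_counts map barcode -> read count observed in
    the target neurons and in non-target tissue. Hypothetical scoring scheme,
    not taken from the patent.
    """
    barcodes = sorted(set(target_counts) | set(off_target_counts))
    # Normalize each tissue to its own library size so that sequencing depth
    # does not bias the comparison; the pseudocount avoids division by zero.
    target_total = sum(target_counts.values()) + pseudocount * len(barcodes)
    off_total = sum(off_target_counts.values()) + pseudocount * len(barcodes)
    scores = {}
    for barcode in barcodes:
        target_frac = (target_counts.get(barcode, 0) + pseudocount) / target_total
        off_frac = (off_target_counts.get(barcode, 0) + pseudocount) / off_total
        scores[barcode] = math.log2(target_frac / off_frac)
    # Highest enrichment first: the most target-selective candidates.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Illustrative, made-up read counts for three barcoded candidates.
    gaba_neuron_reads = {"BC01": 900, "BC02": 50, "BC03": 50}
    non_target_reads = {"BC01": 100, "BC02": 450, "BC03": 450}
    for barcode, score in rank_candidates(gaba_neuron_reads, non_target_reads):
        print(f"{barcode}\tlog2 enrichment = {score:.2f}")
```

A positive score marks a candidate whose barcode is over-represented in the target neurons, and a negative score marks one that is more active in non-target tissue. Pooling all barcoded constructs in the same individual, as the dependent claim on simultaneous delivery describes, is what makes such a within-animal ratio meaningful.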
TW110127164A 2020-07-24 2021-07-23 Dna-binding domain transactivators and uses thereof TW202221119A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202063056528P 2020-07-24 2020-07-24
US63/056,528 2020-07-24

Publications (1)

Publication Number Publication Date
TW202221119A true TW202221119A (en) 2022-06-01

Family

ID=79728976

Family Applications (1)

Application Number Title Priority Date Filing Date
TW110127164A TW202221119A (en) 2020-07-24 2021-07-23 Dna-binding domain transactivators and uses thereof

Country Status (6)

Country Link
US (1) US20230279405A1 (en)
EP (1) EP4185303A1 (en)
JP (1) JP2023535025A (en)
AR (1) AR123041A1 (en)
TW (1) TW202221119A (en)
WO (1) WO2022020706A1 (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007061759A1 (en) * 2005-11-18 2007-05-31 The Government Of The United States Of America, As Represented By The Secretary, Department Of Health And Human Services Delayed expression vectors
US9163330B2 (en) * 2009-07-13 2015-10-20 President And Fellows Of Harvard College Bifunctional stapled polypeptides and uses thereof
US20190127713A1 (en) * 2016-04-13 2019-05-02 Duke University Crispr/cas9-based repressors for silencing gene targets in vivo and methods of use
AU2019375975A1 (en) * 2018-11-05 2021-06-17 Allen Institute Artificial expression constructs for selectively modulating gene expression in excitatory cortical neurons

Also Published As

Publication number Publication date
JP2023535025A (en) 2023-08-15
US20230279405A1 (en) 2023-09-07
WO2022020706A1 (en) 2022-01-27
EP4185303A1 (en) 2023-05-31
AR123041A1 (en) 2022-10-26

Similar Documents

Publication Publication Date Title
KR102604159B1 (en) Tissue-selective transgene expression
KR20200107949A (en) Engineered DNA binding protein
TW201932479A (en) Compositions and methods for TTR gene editing and treating ATTR amyloidosis
KR20210009317A (en) Gene therapy for diseases caused by unbalanced nucleotide pools, including mitochondrial DNA depletion syndrome
CN114126665A (en) Gene therapy for fundus yellow speckle disease (ABCA4)
CN114174520A (en) Compositions and methods for selective gene regulation
US20230365963A1 (en) Methods for treating neurological disease
CN114402075A (en) Gene therapy for Uschel syndrome (USH2A)
US20220185862A1 (en) Dna-binding domain transactivators and uses thereof
CN116685329A (en) Nucleic acid constructs and their use for the treatment of spinal muscular atrophy
EP3372249B1 (en) Aav/upr-plus virus, upr-plus fusion protein, genetic treatment method and use thereof in the treatment of neurodegenerative diseases, such as parkinson's disease and huntington's disease, inter alia
CA3133455A1 (en) Vector and method for treating angelman syndrome
TW202221119A (en) Dna-binding domain transactivators and uses thereof
US20220378941A1 (en) Recombinant nucleic acids containing alphaherpesvirus promoter sequences
CN117580941A (en) Multiple CRISPR/Cas9 mediated target gene activation system
US20230078498A1 (en) Targeted Translation of RNA with CRISPR-Cas13 to Enhance Protein Synthesis
BR112021016294A2 (en) DNA BINDING DOMAIN TRANSACTIVATORS AND USES THEREOF
WO2024009280A1 (en) Integrated stress response inhibitors and methods of using the same
EP4297801A1 (en) Inducible single aav system and uses thereof
CA3179402A1 (en) Gene therapy delivery of parkin mutants having increased activity to treat parkinson's disease
EA046157B1 (en) COMPOSITIONS AND METHODS FOR SELECTIVE REGULATION OF GENE EXPRESSION
CN117836420A (en) Recombinant TERT-encoding viral genome and vector
CN116761812A (en) NEUROD1 and DLX2 vectors
CN112236516A (en) Gene therapy for oxidative stress