CN111471665B - DNA cyclization molecule and application thereof - Google Patents

Publication number
CN111471665B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910063623.6A
Other languages
Chinese (zh)
Other versions
CN111471665A (en)
Inventor
冯松杰
江雯
黄行许
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ShanghaiTech University
Original Assignee
ShanghaiTech University
Application filed by ShanghaiTech University
Priority to CN201910063623.6A
Publication of CN111471665A
Application granted
Publication of CN111471665B

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]

Abstract

The invention relates to the field of biotechnology, in particular to a fusion protein and application thereof. The invention provides a fusion protein comprising a homodimer fragment and a dCas9 protein fragment, wherein the homodimer fragment comprises an LDB1 protein fragment or a Dimer domain fragment of an LDB1 protein. The invention fuses LDB1 with spdCas9 to form a novel fusion protein, LDB1-dCas9, which regulates gene expression by changing the three-dimensional structure of chromatin, without altering genome sequence information or epigenetic modifications. The approach is simple to operate and has a short experimental preparation period, which greatly reduces time and labor costs. No small molecule needs to be added to induce dimer formation, and because the fusion protein forms homodimers, a single CRISPR-Cas9 construct can target two sites simultaneously to form a loop. The efficiency of DNA loop formation also improves as target sites on the gene are added.

Description

DNA cyclization molecule and application thereof
Technical Field
The invention relates to the field of biotechnology, in particular to a fusion protein and application thereof.
Background
The folding of chromatin and the interaction of different regions in the nucleus create a three-dimensional chromatin structure, which in turn plays a key role in gene expression. Promoter and enhancer sequences are cis-acting elements that regulate gene expression. A promoter is a region of DNA immediately upstream of a gene that recruits transcription factors to initiate gene expression. Enhancers are sequences located at a distance upstream or downstream of a target gene; they activate or increase the expression of the target gene by DNA looping, which brings them into contact with the target gene promoter. Studies of different stages of embryonic development have described in increasing detail how dynamic changes in enhancer-promoter loops regulate the spatiotemporal expression of genes during development. Among these, the human β-globin gene cluster has been studied most intensively. The distal enhancer in this gene cluster is referred to as the "locus control region (LCR)". During development, the LCR sequentially regulates the expression of embryonic ε-globin (HBE), fetal γ-globin (HBG), and adult δ-globin (HBD) and β-globin (HBB) by looping to the promoters of the different genes. Thus, artificial DNA looping (cyclization) can be used as a viable strategy to study how enhancers regulate endogenous gene expression, and it even has potential for use in disease treatment.
Disclosure of Invention
In view of the above-mentioned drawbacks of the prior art, an object of the present invention is to provide a fusion protein and use thereof for solving the problems of the prior art.
To achieve the above and other related objects, in one aspect, the present invention provides a fusion protein comprising a homodimer fragment and a dCas9 protein fragment, the homodimer fragment comprising an LDB1 protein fragment or a Dimer domain fragment of an LDB1 protein.
In some embodiments of the invention, the amino acid sequence of the LDB1 protein fragment comprises:
a) An amino acid sequence as shown in SEQ ID NO.1; or
b) An amino acid sequence having a sequence similarity of 80% or more with SEQ ID NO.1 and having the function of the amino acid sequence defined under a), preferably being capable of forming a homodimer.
In some embodiments of the invention, the amino acid sequence of the Dimer domain fragment of the LDB1 protein comprises:
c) An amino acid sequence as shown in SEQ ID NO.2; or
d) An amino acid sequence having a sequence similarity of 80% or more with SEQ ID NO.2 and having the function of the amino acid sequence defined in c), preferably being capable of forming homodimers.
In some embodiments of the invention, the amino acid sequence of the dCas9 protein fragment includes:
e) An amino acid sequence as shown in SEQ ID NO.3; or
f) An amino acid sequence having a sequence similarity of 80% or more with SEQ ID NO.3 and having the function of the amino acid sequence defined in e), preferably being capable of cooperating with an sgRNA that specifically targets a site.
In some embodiments of the invention, the fusion protein comprises, in order from the N-terminus to the C-terminus (the 5' to 3' direction of the coding sequence), a homodimer fragment and a dCas9 protein fragment.
In some embodiments of the invention, the fusion protein comprises, in order from the N-terminus to the C-terminus (the 5' to 3' direction of the coding sequence), a dCas9 protein fragment and a homodimer fragment.
In some embodiments of the invention, the fusion protein further comprises a flexible linker peptide segment between the homodimer segment and the dCas9 protein segment, preferably the amino acid sequence of the flexible linker peptide segment is shown in SEQ ID nos. 41-42.
In some embodiments of the invention, the amino acid sequence of the fusion protein is shown in one of SEQ ID Nos. 37 to 40.
In another aspect, the invention provides an isolated polynucleotide encoding the fusion protein.
In another aspect, the invention provides a DNA-loop system comprising said fusion protein, further comprising a promoter-targeting sgRNA and an enhancer-targeting sgRNA.
In some embodiments of the invention, the sgRNA targeting the promoter targets a region 100 to 200 bp upstream (-100 to -200 bp) of the transcription start site (TSS) of the gene.
In some embodiments of the invention, the sgRNA targeting the promoter has the sequence characteristic gnnnnnnnnnnnnnnnnnnnNGG (SEQ ID No. 43).
In some embodiments of the invention, the GC content of the sgRNA targeting the promoter is between 40% and 60%.
In some embodiments of the invention, the sgRNA of the targeting promoter targets the promoter region of the HBB gene, preferably, the sequence of the sgRNA of the targeting promoter is shown in SEQ ID NO.4 to 6.
In some embodiments of the invention, the sgRNA of the targeting enhancer targets the DHS region of the enhancer.
In some embodiments of the invention, the sgRNA targeting the enhancer targets the vicinity of DHS2 in the LCR region of β-globin.
In some embodiments of the invention, the sequence of the sgRNA of the targeting enhancer is shown in SEQ ID NO. 7-9.
In another aspect, the invention provides an expression system comprising a host cell capable of expressing the fusion protein, the sgRNA of the targeting promoter and the sgRNA of the targeting enhancer.
In some embodiments of the invention, the expression system comprises a host cell comprising an expression vector comprising a polynucleotide encoding the fusion protein, or a host cell having a polynucleotide encoding the fusion protein integrated into the chromosome.
In some embodiments of the invention, the expression system comprises a host cell comprising an expression vector of a polynucleotide encoding a sgRNA of the targeting promoter, or a host cell having integrated in its chromosome a polynucleotide encoding a sgRNA of the targeting promoter.
In some embodiments of the invention, the expression system comprises a host cell comprising an expression vector comprising a polynucleotide encoding an sgRNA of the targeting enhancer, or a host cell having integrated in its chromosome a polynucleotide encoding an sgRNA of the targeting enhancer.
In some embodiments of the invention, the expression system further comprises a host cell capable of expressing the gene of interest.
In some embodiments of the invention, the host cell is selected from eukaryotic cells.
In some embodiments of the invention, the host cell is selected from a primary cell of metazoan origin or an immortalized cell line.
In some embodiments of the invention, the host cell is selected from the group consisting of a blood cell line.
In some embodiments of the invention, the host cell is selected from human K562 cells.
In another aspect, the invention provides the use of said DNA-loop forming molecule, said polynucleotide, said loop forming system, said expression system in gene expression.
In some embodiments of the invention, the use in gene expression is use in gene expression in eukaryotes.
In some embodiments of the invention, the eukaryotic organism is selected from the group consisting of a metazoan.
In some embodiments of the invention, the eukaryotic organism is selected from one or more of human, mouse, nematode, drosophila.
In another aspect, the present invention provides a method of gene expression comprising: shortening the three-dimensional spatial distance between the target sites by means of the fusion protein or the loop-forming system, and carrying out gene expression.
In some embodiments of the invention, the gene expression method comprises: culturing a host cell capable of expressing a gene of interest under suitable conditions in the presence of said loop-forming system.
In some embodiments of the invention, the gene expression method is an in vitro gene expression method.
In some embodiments of the invention, the gene expression method comprises: culturing the expression system under appropriate conditions.
Drawings
FIG. 1 shows a schematic representation of the invention in which LDB1-dCAS9 mediated DNA looping reprograms the spatial position of a gene of interest.
FIG. 2 shows the changes in expression of each gene in the β-globin gene cluster after LDB1-dCas9 and dCas9-LDB1 mediated DNA cyclization according to the present invention.
FIG. 3 is a schematic diagram showing the expression of other globin genes in the gene cluster of the present invention.
FIG. 4 shows a comparison of the efficiency of HBB gene activation by LDB1-dCas9, dCas9-LDB1, dCas9-DD and DD-dCas9 according to the present invention.
Detailed Description
The present inventors have made extensive studies to provide a novel DNA circularization molecule comprising a fusion protein formed by LDB1 and dCas9, which can regulate gene expression by reprogramming the spatial position of the gene, thereby completing the present invention.
The first aspect of the invention provides a fusion protein comprising a homodimer fragment and a dCas9 protein fragment, the homodimer fragment comprising an LDB1 protein fragment or a Dimer domain (DD) fragment of an LDB1 protein. The cyclization molecule provided by the invention is generally a fusion protein that regulates the expression of a gene by reprogramming the spatial position of the gene. Because of the homodimer fragment, a single CRISPR-Cas9 construct can target two sites simultaneously to form a loop, so that the promoter, the enhancer and the target gene are brought closer together in space, the efficiency of DNA loop formation is improved, and the expression of the target gene is regulated.
In the fusion protein provided by the invention, the amino acid sequence of the LDB1 protein fragment can comprise: a) an amino acid sequence as shown in SEQ ID NO.1; or b) an amino acid sequence having a sequence similarity of 80% or more with SEQ ID NO.1 and having the function of the amino acid sequence defined in a). Specifically, the amino acid sequence in b) refers to a polypeptide fragment that is obtained from the amino acid sequence shown in SEQ ID NO.1 by substitution, deletion or addition of one or more (specifically, 1 to 50, 1 to 30, 1 to 20, 1 to 10, 1 to 5, or 1 to 3) amino acids, or by addition of one or more (specifically, 1 to 50, 1 to 30, 1 to 20, 1 to 10, 1 to 5, or 1 to 3) amino acids at the N-terminus and/or C-terminus, and that retains the function of the polypeptide fragment shown in SEQ ID NO.1, for example the ability to form a homodimer. The amino acid sequence in b) may have 80%, 85%, 90%, 93%, 95%, 97%, or 99% or more similarity to SEQ ID NO.1.
MLDRDVGPTPMYPPTYLEPGIGRHTPYGNQTDYRIFELNKRLQNWTEECDNLWWDAFTTEFFEDDAMLTITFCLEDGPKRYTIGRTLIPRYFRSIFEGGATELYYVLKHPKEAFHSNFVSLDCDQGSMVTQHGKPMFTQVCVEGRLYLEFMFDDMMRIKTWHFSIRQHRELIPRSILAMHAQDPQMLDQLSKNITRCGLSNSTLNYLRLCVILEPMQELMSRHKTYSLSPRDCLKTCLFQKWQRMVAPPAEPTRQQPSKRRKRKMSGGSTMSSGGGNTNNSNSKKKSPASTFALSSQVPDVMVVGEPTLMGGEFGDEDERLITRLENTQFDAANGIDDEDSFNNSPALGANSPWNSKPPSSQESKSENPTSQASQ(SEQ ID NO.1)
In the fusion protein provided by the invention, the Dimer domain fragment of the LDB1 protein is the fragment of the LDB1 protein that forms a homodimer, and the amino acid sequence of the Dimer domain fragment of the LDB1 protein can comprise: c) an amino acid sequence as shown in SEQ ID NO.2; or d) an amino acid sequence having a sequence similarity of 80% or more with SEQ ID NO.2 and having the function of the amino acid sequence defined in c). Specifically, the amino acid sequence in d) refers to a polypeptide fragment that is obtained from the amino acid sequence shown in SEQ ID NO.2 by substitution, deletion or addition of one or more (specifically, 1 to 50, 1 to 30, 1 to 20, 1 to 10, 1 to 5, or 1 to 3) amino acids, or by addition of one or more (specifically, 1 to 50, 1 to 30, 1 to 20, 1 to 10, 1 to 5, or 1 to 3) amino acids at the N-terminus and/or C-terminus, and that retains the function of the polypeptide fragment shown in SEQ ID NO.2, for example the ability to form a homodimer. The amino acid sequence in d) may have 80%, 85%, 90%, 93%, 95%, 97%, or 99% or more similarity to SEQ ID NO.2.
MLDRDVGPTPMYPPTYLEPGIGRHTPYGNQTDYRIFELNKRLQNWTEECDNLWWDAFTTEFFEDDAMLTITFCLEDGPKRYTIGRTLIPRYFRSIFEGGATELYYVLKHPKEAFHSNFVSLDCDQGSMVTQHGKPMFTQVCVEGRLYLEFMFDDMMRIKTWHFSIRQHRELIPRSILAMHAQDPQMLDQLSKNITRCGLS(SEQ ID NO.2)
In the fusion protein provided by the invention, the amino acid sequence of the dCas9 protein fragment can comprise: e) an amino acid sequence as shown in SEQ ID NO.3; or f) an amino acid sequence having a sequence similarity of 80% or more with SEQ ID NO.3 and having the function of the amino acid sequence defined in e). Specifically, the amino acid sequence in f) refers to a polypeptide fragment that is obtained from the amino acid sequence shown in SEQ ID NO.3 by substitution, deletion or addition of one or more (specifically, 1 to 50, 1 to 30, 1 to 20, 1 to 10, 1 to 5 or 1 to 3) amino acids, or by addition of one or more (specifically, 1 to 50, 1 to 30, 1 to 10, 1 to 5 or 1 to 3) amino acids at the N-terminus and/or C-terminus, and that retains the function of the polypeptide fragment shown in SEQ ID NO.3. That function is, for example, to cooperate with an sgRNA directed at a specific target site (for example, a promoter or an enhancer) and to recognize that site, so that the three-dimensional spatial distance between the target sites can be shortened. The amino acid sequence in f) may have 80%, 85%, 90%, 93%, 95%, 97%, or 99% or more similarity to SEQ ID NO.3.
DKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKARGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGD(SEQ ID NO.3)
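The 80% threshold applied to variants b), d) and f) can be illustrated with a small calculation. The following is a minimal sketch only: the text speaks of "sequence similarity" without naming a scoring method, so plain percent identity over a pre-aligned pair is used here as an assumed stand-in, and the function names `percent_identity` and `meets_threshold` are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumption: identity is counted over all alignment columns that are
    not gap-vs-gap; the patent text does not define the metric.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if the pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy example on a 16-residue peptide with four substituted residues.
reference = "SGSETPGTSESATPES"
variant = "AAAATPGTSESATPES"
print(percent_identity(reference, variant), meets_threshold(reference, variant))
```

In practice the two sequences would first be aligned with a dedicated alignment tool, and a similarity score based on a substitution matrix could give a different value from plain identity.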
In the fusion protein provided by the invention, the fusion protein can comprise, in order from the N-terminus to the C-terminus (the 5' to 3' direction of the coding sequence), a homodimer fragment and a dCas9 protein fragment; for example, the homodimer fragment can be linked to the amino terminus of the dCas9 protein. Alternatively, the fusion protein may comprise, in the same N-terminal to C-terminal order, a dCas9 protein fragment and a homodimer fragment; for example, the homodimer fragment may be linked to the carboxy terminus of the dCas9 protein.
In the fusion protein provided by the invention, the fusion protein further comprises a flexible linker peptide, which is typically located between the homodimer fragment and the dCas9 protein fragment. One skilled in the art can generally select a suitable flexible linker peptide to link the homodimer fragment and the dCas9 protein fragment. For example, when the fusion protein comprises the homodimer fragment followed by the dCas9 protein fragment (N-terminus to C-terminus), the amino acid sequence of the flexible linker peptide may be SGSETPGTSESATPES (SEQ ID No. 41). For another example, when the fusion protein comprises the dCas9 protein fragment followed by the homodimer fragment (N-terminus to C-terminus), the amino acid sequence of the flexible linker peptide may be GRAGGGSGGGSGGGS (SEQ ID No. 42).
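The two fusion orientations and their linkers can be sketched as simple string concatenation. This is an illustrative sketch only: the function name `assemble_fusion` and the short fragment strings in the usage example are hypothetical, and only the two linker sequences (SEQ ID No. 41 and 42) are taken from the text. The full sequences listed as SEQ ID No. 37 to 40 contain additional residues (for example GSPKKKRKV directly after the dCas9 fragment) that this sketch leaves out.

```python
# Linker sequences quoted from the description.
LINKER_DIMER_FIRST = "SGSETPGTSESATPES"  # SEQ ID No. 41, homodimer fragment N-terminal to dCas9
LINKER_DCAS9_FIRST = "GRAGGGSGGGSGGGS"   # SEQ ID No. 42, dCas9 N-terminal to homodimer fragment


def assemble_fusion(dimer: str, dcas9: str, dimer_first: bool) -> str:
    """Concatenate the two fragments, N-terminus to C-terminus, with the matching linker.

    Hypothetical helper for illustration; real constructs are built at the DNA level.
    """
    if dimer_first:
        return dimer + LINKER_DIMER_FIRST + dcas9
    return dcas9 + LINKER_DCAS9_FIRST + dimer


# Usage with short placeholder fragments (not real protein sequences).
print(assemble_fusion("MLDRDV", "DKKYSI", dimer_first=True))
print(assemble_fusion("MLDRDV", "DKKYSI", dimer_first=False))
```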
In one embodiment of the present invention, the amino acid sequence of the fusion protein may be as shown in SEQ ID Nos. 37 to 40.
DKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKARGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDGSPKKKRKVGRAGGGSGGGSGGGSMLDRDVGPTPMYPPTYLEPGIGRHTPYGNQTDYRIFELNKRLQNWTEECDNLWWDAFTTEFFEDDAMLTITFCLEDGPKRYTIGRTLIPRYFRSIFEGGATELYYVLKHPKEAFHSNFVSLDCDQGSMVTQHGKPMFTQVCVEGRLYLEFMFDDMMRIKTWHFSIRQHRELIPRSILAMHAQDPQMLDQLSKNITRCGLSNSTLNYLRLCVILEPMQELMSRHKTYSLSPRDCLKTCLFQKWQRMVAPPAEPTRQQPSKRRKRKMSGGSTMSSGGGNTNNSNSKKKSPASTFALSSQVPDVMVVGEPTLMGGEFGDEDERLITRLENTQFDAANGIDDEDSFNNSPALGANSPWNSKPPSSQESKSENPTSQASQG(SEQ ID No.37).
MLDRDVGPTPMYPPTYLEPGIGRHTPYGNQTDYRIFELNKRLQNWTEECDNLWWDAFTTEFFEDDAMLTITFCLEDGPKRYTIGRTLIPRYFRSIFEGGATELYYVLKHPKEAFHSNFVSLDCDQGSMVTQHGKPMFTQVCVEGRLYLEFMFDDMMRIKTWHFSIRQHRELIPRSILAMHAQDPQMLDQLSKNITRCGLSNSTLNYLRLCVILEPMQELMSRHKTYSLSPRDCLKTCLFQKWQRMVAPPAEPTRQQPSKRRKRKMSGGSTMSSGGGNTNNSNSKKKSPASTFALSSQVPDVMVVGEPTLMGGEFGDEDERLITRLENTQFDAANGIDDEDSFNNSPALGANSPWNSKPPSSQESKSENPTSQASQSGSETPGTSESATPESDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKARGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDGSPKKKRKV(SEQ ID No.38)
MLDRDVGPTPMYPPTYLEPGIGRHTPYGNQTDYRIFELNKRLQNWTEECDNLWWDAFTTEFFEDDAMLTITFCLEDGPKRYTIGRTLIPRYFRSIFEGGATELYYVLKHPKEAFHSNFVSLDCDQGSMVTQHGKPMFTQVCVEGRLYLEFMFDDMMRIKTWHFSIRQHRELIPRSILAMHAQDPQMLDQLSKNITRCGLSSGSETPGTSESATPESDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKARGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDGSPKKKRKV(SEQ ID No.39)
DKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKARGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDGSPKKKRKVGRAGGGSGGGSGGGSMLDRDVGPTPMYPPTYLEPGIGRHTPYGNQTDYRIFELNKRLQNWTEECDNLWWDAFTTEFFEDDAMLTITFCLEDGPKRYTIGRTLIPRYFRSIFEGGATELYYVLKHPKEAFHSNFVSLDCDQGSMVTQHGKPMFTQVCVEGRLYLEFMFDDMMRIKTWHFSIRQHRELIPRSILAMHAQDPQMLDQLSKNITRCGLSG(SEQ ID No.40)
In a second aspect, the invention provides an isolated polynucleotide encoding the fusion protein provided in the first aspect of the invention.
In a third aspect, the invention provides a DNA-loop system comprising the fusion protein provided in the first aspect of the invention, and further comprising an sgRNA targeting a promoter and an sgRNA targeting an enhancer. One skilled in the art can select appropriate promoter-targeting and/or enhancer-targeting sgRNAs depending on the gene of interest to be expressed. For example, the sequence of the sgRNA targeting the promoter may be at least partially complementary to the promoter of the target gene, and the sequence of the sgRNA targeting the enhancer may be at least partially complementary to the enhancer of the target gene, so that the dimer formed by the cyclization molecule can shorten the three-dimensional spatial distance between the target sites.
In the DNA-loop system provided by the invention, the sgRNA targeting the promoter generally targets the region between -100 and -200 bp upstream of the transcription start site (TSS), its sequence is generally designed to have the characteristic gnnnnnnnnnnnnnnnnnnnNGG (SEQ ID No. 43), and its GC content is generally between 40% and 60%. In a specific embodiment of the present invention, the sgRNA targeting the promoter targets the promoter region of the HBB gene; specifically, the sequence of the sgRNA targeting the promoter may be as shown in SEQ ID No. 4 to 6.
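The three design rules for the promoter-targeting sgRNA (target site within -100 to -200 bp of the TSS, a G followed by 19 arbitrary bases and then an NGG PAM, and 40% to 60% GC content) can be combined into a simple candidate filter. The following is a minimal sketch under stated assumptions: the function names are hypothetical, only the forward strand is scanned, the whole 23-nt site is required to lie inside the window, and GC content is computed on the 20-nt protospacer. None of these details is specified in the text.

```python
import re

# G + 19 arbitrary bases (20-nt protospacer) followed by an NGG PAM.
# The lookahead lets overlapping candidate sites be reported.
SITE = re.compile(r"(?=(G[ACGT]{19})([ACGT]GG))")


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in seq."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def candidate_sgrnas(upstream: str, near: int = 100, far: int = 200,
                     min_gc: float = 0.40, max_gc: float = 0.60):
    """Return (position, protospacer, PAM) tuples for forward-strand sites.

    `upstream` is the forward-strand sequence ending immediately 5' of the
    TSS, so its last base is position -1. Positions are reported relative
    to the TSS (negative numbers).
    """
    upstream = upstream.upper()
    n = len(upstream)
    hits = []
    for m in SITE.finditer(upstream):
        protospacer, pam = m.group(1), m.group(2)
        first = m.start(1) - n          # position of the first protospacer base
        last = m.start(1) + 22 - n      # position of the last PAM base
        in_window = -far <= first and last <= -near
        if in_window and min_gc <= gc_fraction(protospacer) <= max_gc:
            hits.append((first, protospacer, pam))
    return hits


# Usage on a synthetic 250-bp sequence containing one engineered site.
site = "G" + "ACGT" * 4 + "ACG" + "TGG"
print(candidate_sgrnas("A" * 80 + site + "A" * 147))
```

A real design workflow would also scan the reverse strand and screen candidates for off-target matches in the genome, which this sketch does not attempt.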
In the DNA-loop system provided by the present invention, the sgRNA targeting the enhancer may target the DHS (DNase hypersensitive site) region of the enhancer, specifically a position near or inside the DHS of the enhancer. The sequence of the sgRNA targeting the enhancer is generally designed to have the characteristic gnnnnnnnnnnnnnnnnnnnNGG (SEQ ID No. 43), and its GC content is generally between 40% and 60%. In a specific embodiment of the present invention, the sgRNA targeting the enhancer targets an LCR region, preferably the LCR region of the β-globin gene cluster, and more preferably the hypersensitive site DHS2 of that LCR region; specifically, the sequence of the sgRNA targeting the enhancer may be as shown in SEQ ID No. 7 to 9.
In a fourth aspect, the invention provides an expression system comprising a host cell capable of expressing the fusion protein, the sgRNA targeting the promoter and the sgRNA targeting the enhancer. The dimer formed by the fusion protein can thus shorten the three-dimensional spatial distance between the target sites, so that the target gene can be expressed. Methods for enabling the expression system to express the fusion protein, the sgRNA targeting the promoter and the sgRNA targeting the enhancer should be known to the person skilled in the art. For example, the expression system may include a host cell comprising an expression vector that carries a polynucleotide encoding the fusion protein, or a host cell having a polynucleotide encoding the fusion protein integrated in its chromosome. For another example, the expression system may include a host cell comprising an expression vector that carries a polynucleotide encoding the sgRNA targeting the promoter, or a host cell having a polynucleotide encoding the sgRNA targeting the promoter integrated in its chromosome. For another example, the expression system may include a host cell comprising an expression vector that carries a polynucleotide encoding the sgRNA targeting the enhancer, or a host cell having a polynucleotide encoding the sgRNA targeting the enhancer integrated in its chromosome. The expression system may also include a host cell capable of expressing a gene of interest, which in a particular embodiment of the invention may be a gene in the β-globin gene cluster, more particularly a silenced gene, and more particularly the HBB gene.
In another embodiment of the invention, the host cell may be a eukaryotic cell, more particularly a cell of a metazoan, more particularly a primary cell or an immortalized cell line derived from a metazoan (e.g., including but not limited to human, mouse, etc.), for example, a blood line cell line, more particularly a human K562 cell. The skilled artisan can select an appropriate expression vector depending on the type of host cell, for example, the expression vector may be a transient vector including, but not limited to, pcdna3.1, pST1374, etc., or a lenti virus vector, etc. In the expression system, the host cell may be one or more of a gene of interest, the fusion protein, the sgRNA of the targeting promoter, the sgRNA of the targeting enhancer, so that the DNA loop system may be formed in the expression system.
In a fifth aspect, the invention provides the use of the DNA loop-forming molecule provided in the first aspect of the invention, the loop-forming system provided in the second aspect of the invention, or the expression system provided in the third aspect of the invention, in gene expression, preferably gene expression in a eukaryote. The eukaryote may in particular be a metazoan, including but not limited to humans, mice, Drosophila, nematodes and the like. In a specific embodiment of the present invention, the expressed target gene may be a gene in the β-globin gene cluster, more specifically a silenced gene, and more specifically the HBB gene.
The sixth aspect of the present invention provides a gene expression method, which may be an in vitro gene expression method, comprising: performing gene expression by using the fusion protein provided in the first aspect of the invention or the loop-forming system provided in the second aspect of the invention to bring the targeted sites closer together in three-dimensional space. For example, the gene expression method may include: culturing a host cell capable of expressing a gene of interest under suitable conditions in the presence of said loop-forming system. As another example, the gene expression method may include: culturing the expression system provided in the third aspect of the present invention under appropriate conditions.
Compared with the prior art, the invention works by changing the three-dimensional structure of chromatin without changing genome sequence information or epigenetic modifications. It is simple to operate and has a short experimental preparation period, which greatly reduces time and labor costs. No small molecule needs to be added to induce dimer formation, and because the monomers form homodimers, a single CRISPR-Cas9 system is sufficient to form a loop between two sites. In addition, the efficiency of DNA loop formation is improved when the number of targeted sites of the gene is increased.
Other advantages and effects of the present invention will become apparent to those skilled in the art from the following disclosure, which describes the embodiments of the present invention with reference to specific examples. The invention may also be practiced or applied in other, different embodiments, and the details of the present description may be modified or varied without departing from the spirit and scope of the present invention.
Before the embodiments of the invention are explained in further detail, it is to be understood that the invention is not limited in its scope to the particular embodiments described below; it is also to be understood that the terminology used in the examples of the invention is for the purpose of describing particular embodiments only, and is not intended to limit the scope of the invention; in the description and claims of the invention, the singular forms "a", "an" and "the" include plural referents unless the context clearly dictates otherwise.
Where numerical ranges are provided in the examples, it is understood that, unless otherwise stated herein, both endpoints of each numerical range and any value between the two endpoints may be selected. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. In addition to the specific methods, devices, and materials used in the embodiments, any methods, devices, and materials of the prior art similar or equivalent to those described in the embodiments of the present invention may be used to practice the present invention, according to the knowledge of one skilled in the art and the description of the present invention.
Unless otherwise indicated, the experimental methods, detection methods, and preparation methods disclosed in the present invention employ techniques conventional in the art of molecular biology, biochemistry, chromatin structure and analysis, analytical chemistry, cell culture, recombinant DNA technology, and related fields. These techniques are well described in the literature; see, in particular, Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, Second edition, Cold Spring Harbor Laboratory Press, 1989 and Third edition, 2001; Ausubel et al., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, New York, 1987 and periodic updates; the series METHODS IN ENZYMOLOGY, Academic Press, San Diego; Wolffe, CHROMATIN STRUCTURE AND FUNCTION, Third edition, Academic Press, San Diego, 1998; METHODS IN ENZYMOLOGY, Vol. 304, Chromatin (P.M. Wassarman and A.P. Wolffe, eds.), Academic Press, San Diego, 1999; and METHODS IN MOLECULAR BIOLOGY, Vol. 119, Chromatin Protocols (P.B. Becker, ed.), Humana Press, Totowa, 1999, etc.
Example 1
Construction of LDB1-dCAS9 plasmid
The cDNA of human LDB1 (NM_001113407.2), whose nucleotide sequence is atgctggatagggatgtgggtccaactcccatgtatccgcctacatacctggagccagggattgggaggcacacaccatatggcaaccaaactgactacagaatatttgagcttaacaaacggcttcagaactggacagaggagtgtgacaatctctggtgggatgcattcacgactgagttctttgaggatgatgccatgttgaccatcactttctgcctggaggatggaccaaagagatataccattggccggaccctgatcccacgctacttccgcagcatctttgaggggggtgctacggagctgtactatgttcttaagcaccccaaggaggcattccacagcaactttgtgtccctcgactgtgaccagggcagcatggtgacccagcatggcaagcccatgttcacccaggtgtgtgtggagggccggttgtacctggagttcatgtttgacgacatgatgcggataaagacgtggcacttcagcatccggcagcaccgagagctcatcccccgcagcatccttgccatgcatgcccaagacccccagatgttggatcagctctccaaaaacatcactcggtgtgggctgtccaattccactctcaactacctccgactctgtgtgatactcgagcccatgcaagagctcatgtcacgccacaagacctacagcctcagcccccgcgactgcctcaagacctgccttttccagaagtggcagcgcatggtagcaccccctgcggagcccacacgtcagcagcccagcaaacggcggaaacggaagatgtcagggggcagcaccatgagctctggtggtggcaacaccaacaacagcaacagcaagaagaagagcccagctagcaccttcgccctctccagccaggtacctgatgtgatggtggtgggggagcccaccctgatgggcggggagttcggggacgaggacgagaggctcatcacccggctggagaacacccagtttgacgcagccaacggcattgacgacgaggacagctttaacaactcccctgcactgggcgccaacagcccctggaacagcaagcctccgtccagccaagaaagcaaatcggagaaccccacgtcacaggcctcccag (SEQ ID NO. 37), was diluted to 10 μL and used as the PCR template. A forward primer carrying a NotI restriction site, gggacctaagaaaaagaggaaggtggcggccgctggcggcagcatgctggatagggatgtgggtccaactcccatgtatccg (SEQ ID NO. 29), and a reverse primer carrying a KpnI restriction site, ctctcgggggtggcgctctcgctggtaccgggggtctcgctgccgctctgggaggcctgtgacgt (SEQ ID NO. 30), were designed and dissolved in water to 10 μM. The cDNA sequence fragment of LDB1 was amplified using the Vazyme high-fidelity enzyme kit (Vazyme, P501-d2). The amplification system and PCR reaction conditions are as follows:
Figure BDA0001954959520000121
Figure BDA0001954959520000122
The PCR amplification product was purified and recovered using the AxyPrep PCR Clean-up kit (Axygen, AP-PCR-500G). In addition, 1 μg of the pST1374-N-NLS-flag-linker-dCAS9 vector was digested with NotI-HF (NEB, R3189S) and KpnI-HF (NEB, R3142S) by incubation at 37 °C for 2 h. The digestion system is as follows:
Figure BDA0001954959520000131
The digested product was recovered by gel extraction using the AxyPrep DNA gel recovery kit (Axygen, AP-GX-250G). The PCR fragment and the digested vector fragment were joined by recombination using the Vazyme recombination kit (Vazyme, C112-01); the ligation system is as follows:
Figure BDA0001954959520000132
The ligation product was incubated at 37 °C for 0.5 h and plated, and clones were verified by Sanger sequencing to obtain the correct LDB1-dCAS9 plasmid, whose sequence is shown in SEQ ID NO. 11.
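As a quick sanity check on the primer design described above, the following minimal sketch confirms that the forward primer (SEQ ID NO. 29) contains a NotI recognition site, that the reverse primer (SEQ ID NO. 30) contains a KpnI recognition site, and that the 3' end of the reverse primer is complementary to the last 18 nt of the LDB1 cDNA (SEQ ID NO. 37). The helper names are hypothetical; the recognition sequences GCGGCCGC (NotI) and GGTACC (KpnI) are the standard ones.

```python
# Standard recognition sequences of the two restriction enzymes.
NOTI_SITE = "GCGGCCGC"
KPNI_SITE = "GGTACC"

# Primers copied from the specification (SEQ ID NO. 29 and 30).
FORWARD = ("gggacctaagaaaaagaggaaggtggcggccgctggcggcagc"
           "atgctggatagggatgtgggtccaactcccatgtatccg")
REVERSE = ("ctctcgggggtggcgctctcgctggtaccgggggtctcgctgccgct"
           "ctgggaggcctgtgacgt")

# First and last 18 nt of the LDB1 cDNA (SEQ ID NO. 37).
TEMPLATE_START = "atgctggatagggatgtg"
TEMPLATE_END = "acgtcacaggcctcccag"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (upper case)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def has_site(primer: str, site: str) -> bool:
    """True if the primer contains the given recognition sequence."""
    return site in primer.upper()


print(has_site(FORWARD, NOTI_SITE))                               # NotI site present?
print(has_site(REVERSE, KPNI_SITE))                               # KpnI site present?
print(TEMPLATE_START.upper() in FORWARD.upper())                  # anneals at cDNA start?
print(REVERSE.upper().endswith(reverse_complement(TEMPLATE_END))) # anneals at cDNA end?
```

The same checks could be repeated for the BssHII (GCGCGC) and ApaI (GGGCCC) primer pairs used for the C-terminal fusions.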
Construction of dCAS9-LDB1 plasmid
Using the cDNA of LDB1 as the PCR template, a forward primer carrying a BssHII restriction site, gggcgcgctggaggaggatccggaggaggatccggaggaggatccatgctggatagggatgtgggtccaactcccatgtatccg (SEQ ID NO. 31), and a reverse primer carrying an ApaI restriction site, gaagggcccctgggaggcctgtgacgt (SEQ ID NO. 32), were designed and dissolved in water to 10 μM. The cDNA sequence fragment of LDB1 was amplified using the Vazyme high-fidelity enzyme kit (Vazyme, P501-d2). The amplification system and PCR reaction conditions are as follows:
Figure BDA0001954959520000133
Figure BDA0001954959520000141
The PCR amplification product was purified and recovered using the AxyPrep PCR Clean-up kit (Axygen, AP-PCR-500G), and 1 μg of the pST1374-N-NLS-flag-linker-dCAS9 vector was taken. The PCR fragment of interest and the vector were each digested with ApaI (NEB, R0114S) and BssHII (NEB, R0119S) by incubation at 25 °C for 2 h. The digestion system is as follows:
Figure BDA0001954959520000142
The digested product was recovered by gel extraction using the AxyPrep DNA gel recovery kit (Axygen, AP-GX-250G). The digested PCR fragment and vector fragment were ligated with T4 ligase (NEB, M0202S); the ligation system is as follows:
Figure BDA0001954959520000143
The ligation product was incubated at 16 °C for 2 h and plated, and clones were verified by Sanger sequencing to obtain the correct dCAS9-LDB1 plasmid, whose sequence is shown in SEQ ID NO. 12.
Construction of DD-dCAS9 plasmid
Using the cDNA of LDB1 as the PCR template, a forward primer carrying a NotI restriction site, gtggcggccgctggcggcagcatgctggatagggatgtgggtccaactcccatgtatccg (SEQ ID NO. 33), and a reverse primer carrying a KpnI restriction site, cgctggtaccgggggtctcgctgccgctggacagcccacaccgagtgatgtttttgg (SEQ ID NO. 34), were designed and dissolved in water to 10 μM. The dimerization domain (DD) fragment of LDB1 was amplified using the Vazyme high-fidelity enzyme kit (Vazyme, P501-d2). The amplification system and PCR reaction conditions are as follows:
Figure BDA0001954959520000151
Figure BDA0001954959520000152
The PCR amplification product was purified and recovered using the AxyPrep PCR Clean-up kit (Axygen, AP-PCR-500G), and 1 μg of the pST1374-N-NLS-flag-linker-dCAS9 vector was digested with NotI-HF (NEB, R3189S) and KpnI-HF (NEB, R3142S) by incubation at 37 °C for 2 h. The digestion system is as follows:
Figure BDA0001954959520000153
The digested product was recovered by gel extraction using the AxyPrep DNA gel recovery kit (Axygen, AP-GX-250G). The digested PCR fragment and vector fragment were ligated with T4 ligase (NEB, M0202S); the ligation system is as follows:
Figure BDA0001954959520000154
The ligation product was incubated at 16 °C for 2 h and plated, and clones were verified by Sanger sequencing to obtain the correct DD-dCAS9 plasmid, whose sequence is shown in SEQ ID NO. 13.
Construction of dCAS9-DD plasmid
Using the cDNA of LDB1 as the PCR template, a forward primer carrying a BssHII restriction site, gggcgcgctggaggaggatccggaggaggatccggaggaggatccatgctggatagggatgtgggtccaactcccatgtatccg (SEQ ID NO. 35), and a reverse primer carrying an ApaI restriction site, tcgaagggcccggacagcccacaccgagtgatgtt (SEQ ID NO. 36), were designed and dissolved in water to 10 μM. The DD fragment of LDB1 was amplified using the Vazyme high-fidelity enzyme kit (Vazyme, P501-d2). The amplification system and PCR reaction conditions are as follows:
Figure BDA0001954959520000161
Figure BDA0001954959520000162
The PCR amplification product was purified and recovered using the AxyPrep PCR Clean-up kit (Axygen, AP-PCR-500G), and 1 μg of the pST1374-N-NLS-flag-linker-dCAS9 vector was taken. The PCR fragment of interest and the vector were each digested with ApaI (NEB, R0114S) and BssHII (NEB, R0119S) by incubation at 25 °C for 2 h. The digestion system is as follows:
Figure BDA0001954959520000163
The digested product was recovered by gel extraction using the AxyPrep DNA gel recovery kit (Axygen, AP-GX-250G). The digested PCR fragment and vector fragment were ligated with T4 ligase (NEB, M0202S); the ligation system is as follows:
Figure BDA0001954959520000171
The ligation product was incubated at 16 °C for 2 h and plated, and clones were verified by Sanger sequencing to obtain the correct dCAS9-DD plasmid, whose sequence is shown in SEQ ID NO. 14.
Construction of targeting site sgRNA plasmids
Three targeting sgRNAs were designed for DHS2 of the LCR region of the β-globin gene cluster in K562 cells: L-sg1, with sequence aatatgtcacattctgtctc (SEQ ID NO. 7); L-sg3, with sequence ggactatgggaggtcactaa (SEQ ID NO. 8); and L-sg4, with sequence gaaggttacacagaaccaga (SEQ ID NO. 9). Three sgRNAs were designed for the promoter region of the HBB gene: P-sg1, with sequence ggccaagagatatatcttag (SEQ ID NO. 4); P-sg3, with sequence gtgccagaagagccaaggac (SEQ ID NO. 5); and P-sg4, with sequence gtggagccacaccctagggt (SEQ ID NO. 6). The negative control sgRNA targets EGFP and is designated sg-EGFP, with sequence ggagcgcaccatcttcttca (SEQ ID NO. 10). Complementary positive-strand and negative-strand primers were designed from each sgRNA sequence; the bases ACCG were added at the 5' end of the positive strand and the bases AAAC at the 5' end of the negative strand, and the primers were dissolved in sterile water to 100 μM. The annealed double-stranded DNA fragments with overhangs were ligated into the pGL3-U6-sgRNA (Addgene #51133) vector linearized by BsaI (NEB, R0535S) digestion to construct the target-specific sgRNA plasmids. The primer sequences for the sgRNAs of all targeting sites are shown in SEQ ID NO. 15 to 28, specifically as follows:
L-sg1 positive strand primer sequence: ACCG AATATGTCACATTCTGTCTC (SEQ ID NO. 15)
L-sg1 negative strand primer sequence: AAAC GAGACAGAATGTGACATATT (SEQ ID NO. 16)
L-sg3 positive strand primer sequence: ACCG GGACTATGGGAGGTCACTAA (SEQ ID NO. 17)
L-sg3 negative strand primer sequence: AAAC TTAGTGACCTCCCATAGTCC (SEQ ID NO. 18)
L-sg4 positive strand primer sequence: ACCG GAAGGTTACACAGAACCAGA (SEQ ID NO. 19)
L-sg4 negative strand primer sequence: AAAC TCTGGTTCTGTGTAACCTTC (SEQ ID NO. 20)
P-sg1 positive strand primer sequence: ACCG GGCCAAGAGATATATCTTAG (SEQ ID NO. 21)
P-sg1 negative strand primer sequence: AAAC CTAAGATATATCTCTTGGCC (SEQ ID NO. 22)
P-sg3 positive strand primer sequence: ACCG GTGCCAGAAGAGCCAAGGAC (SEQ ID NO. 23)
P-sg3 negative strand primer sequence: AAAC GTCCTTGGCTCTTCTGGCAC (SEQ ID NO. 24)
P-sg4 positive strand primer sequence: ACCG GTGGAGCCACACCCTAGGGT (SEQ ID NO. 25)
P-sg4 negative strand primer sequence: AAAC ACCCTAGGGTGTGGCTCCAC (SEQ ID NO. 26)
sg-EGFP positive strand primer sequence: ACCG GGAGCGCACCATCTTCTTCA (SEQ ID NO. 27)
sg-EGFP negative strand primer sequence: AAAC TGAAGAAGATGGTGCGCTCC (SEQ ID NO. 28)
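The primer pairs listed above follow a fixed rule: the positive strand is ACCG plus the 20-nt sgRNA sequence, and the negative strand is AAAC plus the reverse complement of that sequence. A minimal sketch of the rule follows; the function names are hypothetical and are used only for illustration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (upper case)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def sgrna_oligos(protospacer: str) -> tuple[str, str]:
    """Build the positive- and negative-strand primers for one sgRNA.

    ACCG is prepended to the sgRNA sequence; AAAC is prepended to its
    reverse complement, as described in the text.
    """
    target = protospacer.upper()
    positive = "ACCG" + target
    negative = "AAAC" + reverse_complement(target)
    return positive, negative


# L-sg1 (SEQ ID NO. 7) reproduces the primers of SEQ ID NO. 15 and 16.
positive, negative = sgrna_oligos("aatatgtcacattctgtctc")
print(positive)  # ACCGAATATGTCACATTCTGTCTC
print(negative)  # AAACGAGACAGAATGTGACATATT
```

After annealing, the two primers leave 4-nt single-stranded ends (ACCG and AAAC), which is what allows the fragment to be ligated into the BsaI-linearized vector.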
The annealing system and annealing procedure were as follows:
Figure BDA0001954959520000181
Figure BDA0001954959520000182
The pGL3-U6-sgRNA (Addgene #51133) plasmid was digested with BsaI (NEB, R0535S) to give the linearized sgRNA vector. The digestion system is as follows:
Figure BDA0001954959520000183
The digested product was recovered by gel extraction using the AxyPrep DNA gel recovery kit (Axygen, AP-GX-250G) to obtain the linearized vector. 50 ng of linearized vector was ligated with 3 μl of annealed product using T4 ligase (NEB, M0202S) by incubation at 16 °C for 2 hours; the product was plated, and clones were verified by Sanger sequencing to obtain the correct target-specific sgRNA plasmids.
The ligation system is as follows:
Figure BDA0001954959520000184
A schematic diagram of LDB1-dCAS9 reprogramming the spatial position of the target gene through artificial DNA looping is shown in FIG. 1.
Example 2
LDB1-dCAS9 and dCAS9-LDB1 activate HBB expression by reprogramming the spatial position of the gene through DNA looping:
K562 cells were transfected by electroporation with the LDB1-dCAS9 and dCAS9-LDB1 systems described above, as follows:
1) K562 cells (from ATCC) were thawed and cultured in 10 cm dishes (Corning, 430167) in RPMI 1640 medium (Gibco, 11875093) supplemented with 10% fetal bovine serum (HyClone, SV30087). The culture temperature was 37 °C and the carbon dioxide concentration was 5%.
2) When the cell concentration reached 1×10^6 cells per ml, 1×10^6 cells per tube were collected by centrifugation at 1000 r/min. Cells were electroporated using the Lonza Amaxa Cell Line Nucleofector Kit V (Lonza, VCA-1003). The plasmids transfected per well were 1 μg of LDB1-dCAS9 or dCAS9-LDB1 plasmid, 0.5 μg of sgRNA plasmid targeting DHS2 of the LCR region and 0.5 μg of sgRNA plasmid targeting the HBB gene promoter region, and the electroporation program was T-016 (Lonza 2b). The three LCR-targeting sgRNAs and the three HBB promoter-targeting sgRNAs were combined pairwise, giving 9 combinations in total. The negative control group was electroporated with the EGFP-targeting sgRNA.
3) After electroporation, the cells were gently washed out with 500 μl of medium and transferred to 12-well plates, each well of which contained 1.5 ml of 1640 medium.
4) 24 hours after transfection, the cells were treated with puromycin (InvivoGen, ant-pr-1) and blasticidin (InvivoGen, ant-bl-1) at a final concentration of 2 ng/ml.
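The pairing scheme in step 2) can be sketched as follows: each of the three LCR-targeting sgRNAs is paired with each of the three promoter-targeting sgRNAs, and each pair is co-transfected with 1 μg of fusion-protein plasmid and 0.5 μg of each sgRNA plasmid. The function and variable names are illustrative assumptions, not part of the protocol.

```python
from itertools import product

LCR_SGRNAS = ["L-sg1", "L-sg3", "L-sg4"]
PROMOTER_SGRNAS = ["P-sg1", "P-sg3", "P-sg4"]
NEGATIVE_CONTROL = "sg-EGFP"

# All pairwise LCR x promoter combinations (3 x 3 = 9).
COMBINATIONS = list(product(LCR_SGRNAS, PROMOTER_SGRNAS))


def plasmid_mix(fusion_plasmid: str, lcr_sgrna: str, promoter_sgrna: str) -> dict[str, float]:
    """Plasmid amounts in micrograms for one electroporation well."""
    return {fusion_plasmid: 1.0, lcr_sgrna: 0.5, promoter_sgrna: 0.5}


for lcr, promoter in COMBINATIONS:
    print(plasmid_mix("LDB1-dCAS9", lcr, promoter))
```

Each well therefore receives 2 μg of plasmid in total under this reading of the protocol.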
Cells were harvested 72 hours after transfection and lysed with 500 μl Trizol (Invitrogen, 15596018), and RNA was extracted using the TransZol Up Plus RNA Kit (ER501-01). 500 ng of RNA was reverse-transcribed with the TOYOBO reverse transcription kit (Toyobo, FSQ-301), and the product was diluted 10-fold for use. qPCR was performed using Biotool SYBR Green qPCR Master Mix (Biotool, B21703) as follows:
Figure BDA0001954959520000191
The regulation of HBB gene expression by LDB1-dCAS9 and dCAS9-LDB1 was measured for each sgRNA combination and is shown in FIG. 2. Compared with the negative control group, both LDB1-dCAS9 and dCAS9-LDB1 specifically up-regulated HBB expression by reprogramming the spatial position of the gene through DNA looping. The activation effect of LDB1-dCAS9 on HBB was higher than that of dCAS9-LDB1: with the combination of L-sg3 and P-sg3 as targeting sgRNAs, LDB1-dCAS9 gave the highest up-regulation of HBB expression, 12-fold, whereas dCAS9-LDB1 gave an increase of nearly 8-fold.
To verify that the increase in HBB expression was specifically caused by bringing the HBB gene spatially closer to the LCR, we examined the expression of the other globin genes in the gene cluster under different targeting sgRNA combinations, namely L-sg3 and P-sg1, L-sg3 and P-sg3, and L-sg1 and P-sg3. As shown in FIG. 3, compared with the blank control, HBB was increased 27-fold with the combination of L-sg3 and P-sg1 and 25-fold with the combination of L-sg3 and P-sg3; with dCAS9-LDB1, HBB was increased 15-fold with the combination of L-sg1 and P-sg3 and 14-fold with the combination of L-sg3 and P-sg3. Meanwhile, HBD expression also increased 4- to 5-fold, presumably because the HBD gene lies close to the HBB gene, so that DNA looping also brought the LCR and HBD spatially closer and raised HBD expression.
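Fold changes such as those reported here are commonly derived from qPCR Ct values by the 2^-ΔΔCt method. The specification does not state which quantification method or reference gene was used, so the following is only a sketch under that assumption; the Ct values in the example are hypothetical and are not data from the invention.

```python
def fold_change(ct_target_treated: float, ct_ref_treated: float,
                ct_target_control: float, ct_ref_control: float) -> float:
    """Relative expression by the 2^-ddCt method (assumed, not from the source).

    dCt  = Ct(target gene) - Ct(reference gene), computed per sample
    ddCt = dCt(treated sample) - dCt(control sample)
    """
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target gene crosses threshold 3 cycles earlier
# in the treated sample while the reference gene is unchanged.
print(fold_change(22.0, 18.0, 25.0, 18.0))
```

The method assumes roughly 100% amplification efficiency for both target and reference amplicons; if that does not hold, an efficiency-corrected model would be needed.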
Full-length LDB1 protein activates HBB gene expression more strongly:
To compare the efficiency of full-length LDB1 and the DD domain in regulating the HBB gene, the spatial position of the HBB gene was reprogrammed with LDB1-dCAS9, DD-dCAS9, dCAS9-LDB1 and dCAS9-DD, respectively. In addition, to test whether targeting multiple sites has a multiplying effect on DNA looping efficiency, two targeting sites were selected in each of the LCR and HBB promoter regions. 1 μg of each of the above fusion protein plasmid vectors was electroporated together with three sets of mixed sgRNA plasmids, P-sg3, P-sg1+3, L-sg3&P-sg1+3, L-sg1+3&P-sg1+3. Each sgRNA mixture totaled 0.5 μg, with equal amounts of each plasmid when a mixture contained several plasmid vectors. The electroporation program was T-016, as described above.
RNA extraction and qPCR detection were performed as described above. The changes in HBB expression are shown in FIG. 4. With each set of targeting sgRNAs, the activation efficiency of full-length LDB1 on the HBB gene was higher than that of the DD domain. Notably, LDB1-dCAS9 and dCAS9-LDB1 showed a multiplying effect on the activation efficiency of HBB when the number of targeting sites in the promoter region was increased. Therefore, increasing the number of sgRNAs targeting the promoter region of the gene of interest can significantly increase DNA looping efficiency and thus gene expression.
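The rule that each sgRNA mixture totals 0.5 μg, split equally among its plasmids, amounts to simple division. A minimal sketch follows; the function name is hypothetical.

```python
def per_plasmid_mass(sgrna_plasmids: list[str], total_ug: float = 0.5) -> dict[str, float]:
    """Split a fixed total sgRNA plasmid mass equally among the plasmids."""
    share = total_ug / len(sgrna_plasmids)
    return {name: share for name in sgrna_plasmids}


# The L-sg1+3 & P-sg1+3 mixture contains four sgRNA plasmids.
print(per_plasmid_mass(["L-sg1", "L-sg3", "P-sg1", "P-sg3"]))
```

Keeping the total sgRNA plasmid mass constant means that adding targeting sites lowers the amount of each individual sgRNA plasmid, which is relevant when comparing activation across mixtures.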
In summary, the present invention effectively overcomes the disadvantages of the prior art and has high industrial utility value.
The above embodiments merely illustrate the principles of the present invention and its effects, and are not intended to limit the invention. Those skilled in the art may modify or vary the above embodiments without departing from the spirit and scope of the invention. Accordingly, all equivalent modifications and variations made by persons of ordinary skill in the art without departing from the spirit and technical ideas disclosed herein shall be covered by the claims of the present invention.
Sequence listing
<110> ShanghaiTech University
<120> DNA cyclization molecule and application thereof
<160> 43
<170> SIPOSequenceListing 1.0
<210> 1
<211> 375
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 1
Met Leu Asp Arg Asp Val Gly Pro Thr Pro Met Tyr Pro Pro Thr Tyr
1 5 10 15
Leu Glu Pro Gly Ile Gly Arg His Thr Pro Tyr Gly Asn Gln Thr Asp
20 25 30
Tyr Arg Ile Phe Glu Leu Asn Lys Arg Leu Gln Asn Trp Thr Glu Glu
35 40 45
Cys Asp Asn Leu Trp Trp Asp Ala Phe Thr Thr Glu Phe Phe Glu Asp
50 55 60
Asp Ala Met Leu Thr Ile Thr Phe Cys Leu Glu Asp Gly Pro Lys Arg
65 70 75 80
Tyr Thr Ile Gly Arg Thr Leu Ile Pro Arg Tyr Phe Arg Ser Ile Phe
85 90 95
Glu Gly Gly Ala Thr Glu Leu Tyr Tyr Val Leu Lys His Pro Lys Glu
100 105 110
Ala Phe His Ser Asn Phe Val Ser Leu Asp Cys Asp Gln Gly Ser Met
115 120 125
Val Thr Gln His Gly Lys Pro Met Phe Thr Gln Val Cys Val Glu Gly
130 135 140
Arg Leu Tyr Leu Glu Phe Met Phe Asp Asp Met Met Arg Ile Lys Thr
145 150 155 160
Trp His Phe Ser Ile Arg Gln His Arg Glu Leu Ile Pro Arg Ser Ile
165 170 175
Leu Ala Met His Ala Gln Asp Pro Gln Met Leu Asp Gln Leu Ser Lys
180 185 190
Asn Ile Thr Arg Cys Gly Leu Ser Asn Ser Thr Leu Asn Tyr Leu Arg
195 200 205
Leu Cys Val Ile Leu Glu Pro Met Gln Glu Leu Met Ser Arg His Lys
210 215 220
Thr Tyr Ser Leu Ser Pro Arg Asp Cys Leu Lys Thr Cys Leu Phe Gln
225 230 235 240
Lys Trp Gln Arg Met Val Ala Pro Pro Ala Glu Pro Thr Arg Gln Gln
245 250 255
Pro Ser Lys Arg Arg Lys Arg Lys Met Ser Gly Gly Ser Thr Met Ser
260 265 270
Ser Gly Gly Gly Asn Thr Asn Asn Ser Asn Ser Lys Lys Lys Ser Pro
275 280 285
Ala Ser Thr Phe Ala Leu Ser Ser Gln Val Pro Asp Val Met Val Val
290 295 300
Gly Glu Pro Thr Leu Met Gly Gly Glu Phe Gly Asp Glu Asp Glu Arg
305 310 315 320
Leu Ile Thr Arg Leu Glu Asn Thr Gln Phe Asp Ala Ala Asn Gly Ile
325 330 335
Asp Asp Glu Asp Ser Phe Asn Asn Ser Pro Ala Leu Gly Ala Asn Ser
340 345 350
Pro Trp Asn Ser Lys Pro Pro Ser Ser Gln Glu Ser Lys Ser Glu Asn
355 360 365
Pro Thr Ser Gln Ala Ser Gln
370 375
<210> 2
<211> 200
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 2
Met Leu Asp Arg Asp Val Gly Pro Thr Pro Met Tyr Pro Pro Thr Tyr
1 5 10 15
Leu Glu Pro Gly Ile Gly Arg His Thr Pro Tyr Gly Asn Gln Thr Asp
20 25 30
Tyr Arg Ile Phe Glu Leu Asn Lys Arg Leu Gln Asn Trp Thr Glu Glu
35 40 45
Cys Asp Asn Leu Trp Trp Asp Ala Phe Thr Thr Glu Phe Phe Glu Asp
50 55 60
Asp Ala Met Leu Thr Ile Thr Phe Cys Leu Glu Asp Gly Pro Lys Arg
65 70 75 80
Tyr Thr Ile Gly Arg Thr Leu Ile Pro Arg Tyr Phe Arg Ser Ile Phe
85 90 95
Glu Gly Gly Ala Thr Glu Leu Tyr Tyr Val Leu Lys His Pro Lys Glu
100 105 110
Ala Phe His Ser Asn Phe Val Ser Leu Asp Cys Asp Gln Gly Ser Met
115 120 125
Val Thr Gln His Gly Lys Pro Met Phe Thr Gln Val Cys Val Glu Gly
130 135 140
Arg Leu Tyr Leu Glu Phe Met Phe Asp Asp Met Met Arg Ile Lys Thr
145 150 155 160
Trp His Phe Ser Ile Arg Gln His Arg Glu Leu Ile Pro Arg Ser Ile
165 170 175
Leu Ala Met His Ala Gln Asp Pro Gln Met Leu Asp Gln Leu Ser Lys
180 185 190
Asn Ile Thr Arg Cys Gly Leu Ser
195 200
<210> 3
<211> 1367
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 3
Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly
1 5 10 15
Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys
20 25 30
Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly
35 40 45
Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys
50 55 60
Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr
65 70 75 80
Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe
85 90 95
Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His
100 105 110
Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His
115 120 125
Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser
130 135 140
Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met
145 150 155 160
Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp
165 170 175
Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn
180 185 190
Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys
195 200 205
Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu
210 215 220
Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu
225 230 235 240
Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp
245 250 255
Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp
260 265 270
Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu
275 280 285
Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile
290 295 300
Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser Met
305 310 315 320
Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys Ala
325 330 335
Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp
340 345 350
Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln
355 360 365
Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp Gly
370 375 380
Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys
385 390 395 400
Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu Gly
405 410 415
Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe Leu
420 425 430
Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro
435 440 445
Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met
450 455 460
Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val
465 470 475 480
Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn
485 490 495
Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu
500 505 510
Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr
515 520 525
Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys
530 535 540
Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr Val
545 550 555 560
Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp Ser
565 570 575
Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly Thr
580 585 590
Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn
595 600 605
Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr Leu
610 615 620
Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala His
625 630 635 640
Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr
645 650 655
Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp Lys
660 665 670
Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe Ala
675 680 685
Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys
690 695 700
Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu His
705 710 715 720
Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile
725 730 735
Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly Arg
740 745 750
His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr
755 760 765
Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile Glu
770 775 780
Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro Val
785 790 795 800
Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln
805 810 815
Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg Leu
820 825 830
Ser Asp Tyr Asp Val Asp Ala Ile Val Pro Gln Ser Phe Leu Lys Asp
835 840 845
Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Ala Arg Gly
850 855 860
Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn
865 870 875 880
Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe
885 890 895
Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys
900 905 910
Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr Lys
915 920 925
His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu
930 935 940
Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser Lys
945 950 955 960
Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu
965 970 975
Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val Val
980 985 990
Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val
995 1000 1005
Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys Ser
1010 1015 1020
Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr Ser Asn
1025 1030 1035 1040
Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn Gly Glu Ile
1045 1050 1055
Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr Gly Glu Ile Val
1060 1065 1070
Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg Lys Val Leu Ser Met
1075 1080 1085
Pro Gln Val Asn Ile Val Lys Lys Thr Glu Val Gln Thr Gly Gly Phe
1090 1095 1100
Ser Lys Glu Ser Ile Leu Pro Lys Arg Asn Ser Asp Lys Leu Ile Ala
1105 1110 1115 1120
Arg Lys Lys Asp Trp Asp Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro
1125 1130 1135
Thr Val Ala Tyr Ser Val Leu Val Val Ala Lys Val Glu Lys Gly Lys
1140 1145 1150
Ser Lys Lys Leu Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met
1155 1160 1165
Glu Arg Ser Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys
1170 1175 1180
Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr
1185 1190 1195 1200
Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala
1205 1210 1215
Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro
1235 1240 1245
Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His Tyr
1250 1255 1260
Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg Val Ile
1265 1270 1275 1280
Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr Asn Lys His
1285 1290 1295
Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile Ile His Leu Phe
1300 1305 1310
Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe Lys Tyr Phe Asp Thr
1315 1320 1325
Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr Lys Glu Val Leu Asp Ala
1330 1335 1340
Thr Leu Ile His Gln Ser Ile Thr Gly Leu Tyr Glu Thr Arg Ile Asp
1345 1350 1355 1360
Leu Ser Gln Leu Gly Gly Asp
1365
<210> 4
<211> 20
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 4
ggccaagaga tatatcttag 20
<210> 5
<211> 20
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 5
gtgccagaag agccaaggac 20
<210> 6
<211> 20
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 6
gtggagccac accctagggt 20
<210> 7
<211> 20
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 7
aatatgtcac attctgtctc 20
<210> 8
<211> 20
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 8
ggactatggg aggtcactaa 20
<210> 9
<211> 20
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 9
gaaggttaca cagaaccaga 20
<210> 10
<211> 20
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 10
ggagcgcacc atcttcttca 20
<210> 11
<211> 10428
<212> DNA
<213> Artificial Sequence
<400> 11
gacggatcgg gagatctccc gatcccctat ggtcgactct cagtacaatc tgctctgatg 60
ccgcatagtt aagccagtat ctgctccctg cttgtgtgtt ggaggtcgct gagtagtgcg 120
cgagcaaaat ttaagctaca acaaggcaag gcttgaccga caattgcatg aagaatctgc 180
ttagggttag gcgttttgcg ctgcttcgcg atgtacgggc cagatatacg cgttgacatt 240
gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata 300
tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc 360
cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc 420
attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt 480
atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt 540
atgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 600
tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg 660
actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc 720
aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg 780
gtaggcgtgt acggtgggag gtctatataa gcagagctct ctggctaact agagaaccca 840
ctgcttactg gcttatcgaa attaatacga ctcactatag ggagacccaa gctggctagc 900
accatgggac ctaagaaaaa gaggaaggtg gcggccgctg gcggcagcat gctggatagg 960
gatgtgggtc caactcccat gtatccgcct acatacctgg agccagggat tgggaggcac 1020
acaccatatg gcaaccaaac tgactacaga atatttgagc ttaacaaacg gcttcagaac 1080
tggacagagg agtgtgacaa tctctggtgg gatgcattca cgactgagtt ctttgaggat 1140
gatgccatgt tgaccatcac tttctgcctg gaggatggac caaagagata taccattggc 1200
cggaccctga tcccacgcta cttccgcagc atctttgagg ggggtgctac ggagctgtac 1260
tatgttctta agcaccccaa ggaggcattc cacagcaact ttgtgtccct cgactgtgac 1320
cagggcagca tggtgaccca gcatggcaag cccatgttca cccaggtgtg tgtggagggc 1380
cggttgtacc tggagttcat gtttgacgac atgatgcgga taaagacgtg gcacttcagc 1440
atccggcagc accgagagct catcccccgc agcatccttg ccatgcatgc ccaagacccc 1500
cagatgttgg atcagctctc caaaaacatc actcggtgtg ggctgtccaa ttccactctc 1560
aactacctcc gactctgtgt gatactcgag cccatgcaag agctcatgtc acgccacaag 1620
acctacagcc tcagcccccg cgactgcctc aagacctgcc ttttccagaa gtggcagcgc 1680
atggtagcac cccctgcgga gcccacacgt cagcagccca gcaaacggcg gaaacggaag 1740
atgtcagggg gcagcaccat gagctctggt ggtggcaaca ccaacaacag caacagcaag 1800
aagaagagcc cagctagcac cttcgccctc tccagccagg tacctgatgt gatggtggtg 1860
ggggagccca ccctgatggg cggggagttc ggggacgagg acgagaggct catcacccgg 1920
ctggagaaca cccagtttga cgcagccaac ggcattgacg acgaggacag ctttaacaac 1980
tcccctgcac tgggcgccaa cagcccctgg aacagcaagc ctccgtccag ccaagaaagc 2040
aaatcggaga accccacgtc acaggcctcc cagagcggca gcgagacccc cggtaccagc 2100
gagagcgcca cccccgagag cgacaagaaa tactctattg gactggctat cgggacaaac 2160
tccgttggct gggccgtcat aaccgacgag tataaggtgc caagcaagaa attcaaggtg 2220
ctgggtaata ctgaccgcca ttcaatcaag aagaacctga tcggagcact cctcttcgac 2280
tccggtgaaa ccgctgaagc tactcggctg aagcggaccg caaggcggag atacacccgc 2340
cgcaagaatc ggatatgtta tctgcaagag atctttagca acgaaatggc taaggtggac 2400
gactccttct ttcaccgcct ggaagagagc tttctggtgg aggaggataa gaaacacgag 2460
aggcacccta tattcggaaa tatcgtggat gaggtggctt accatgaaaa gtatcctaca 2520
atctaccatc tgaggaagaa gctggtggac agcaccgata aagcagacct gaggctcatc 2580
tatctggccc tggctcatat gataaagttt agaggacact ttctgatcga gggcgacctg 2640
aatcccgata attccgatgt ggataaactc ttcattcaac tggtgcagac atataaccaa 2700
ctgttcgagg agaatcccat aaacgcttct ggtgtggatg ccaaggctat tctgtccgct 2760
cggctgtcca agtcacgcag actggagaat ctgattgccc aactgccagg agaaaagaag 2820
aacggcctgt ttgggaacct catcgccctg agcctgggcc tgacacctaa cttcaagtcc 2880
aattttgatc tggccgaaga tgctaaactc cagctctcca aggacaccta tgacgatgat 2940
ctggacaacc tgctcgcaca gataggcgac cagtacgccg atctctttct ggctgctaag 3000
aatctctccg acgccattct gctgagcgac atactccggg tcaacactga gatcaccaaa 3060
gcacctctga gcgcctccat gataaaacgc tatgatgaac accatcaaga cctgactctg 3120
ctcaaagccc tcgtgaggca acagctgcca gagaagtaca aagagatatt cttcgaccag 3180
agcaagaatg gatatgccgg atacatcgat ggcggagcat cacaggaaga attttacaag 3240
ttcatcaaac caatcctcga gaagatggac ggtactgaag agctgctggt gaagctgaac 3300
agggaggacc tgctgaggaa gcagaggacc tttgataatg gctccattcc acatcagata 3360
cacctgggag agctgcatgc aatcctccgc aggcaggagg atttctatcc tttcctgaag 3420
gataaccggg agaagataga gaagatcctg accttcagga tcccttatta cgtcggccct 3480
ctggctagag gcaactcccg cttcgcttgg atgaccagga aatctgagga gacaattact 3540
ccttggaact tcgaagaggt cgtggataag ggcgcaagcg cccagtcatt catcgaacgg 3600
atgaccaatt tcgataagaa cctgccaaac gagaaggtcc tgcccaaaca ttcactcctg 3660
tacgagtatt tcaccgtcta taacgagctg actaaagtga agtacgtgac cgagggcatg 3720
aggaagcctg ccttcctgtc cggagagcag aagaaggcta tcgttgatct gctcttcaag 3780
actaatagaa aggtgacagt gaagcagctc aaggaggatt actttaagaa gatcgaatgc 3840
tttgactcag tggaaatctc tggcgtggag gaccgcttta atgccagcct gggcacttac 3900
catgatctgc tgaagataat caaagacaaa gatttcctcg ataatgagga gaacgaggac 3960
atcctggaag atatcgtgct gaccctgact ctgttcgagg atagagagat gatcgaagag 4020
cgcctgaaga cctatgccca tctgtttgac gataaagtca tgaaacagct caagcggcgg 4080
cgctacactg ggtggggtag actctccagg aaactcataa acggcatccg cgacaaacag 4140
agcggaaaga ccatcctgga tttcctgaaa tccgacggat tcgctaacag gaacttcatg 4200
caactgattc acgatgactc tctgacattt aaagaggaca tccagaaggc acaggtgagc 4260
ggtcaaggcg acagcctgca cgagcacatc gccaacctcg ctggatcacc cgccataaag 4320
aagggaatac tgcagacagt caaggtcgtg gacgaactcg tcaaagtgat gggtcggcac 4380
aagccagaga atatcgttat cgaaatggca agggagaacc aaaccaccca gaagggccag 4440
aagaactctc gggaacggat gaaaagaatc gaagagggaa ttaaggagct gggatctcag 4500
atactgaagg agcaccctgt ggagaataca cagctccaga acgagaaact ctacctgtac 4560
tacctccaga acgggcggga catgtacgtt gaccaggaac tcgacatcaa ccggctgtcc 4620
gattatgacg tggacgctat tgttccacag tccttcctca aagatgactc cattgacaac 4680
aaggtgctga ccagatccga taaggcccgc ggtaagtctg acaatgttcc atcagaagag 4740
gtggtcaaga agatgaagaa ttactggcgg cagctcctca acgccaaact gatcacccag 4800
cggaagtttg acaatctgac taaggcagaa agaggaggtc tgagcgaact cgacaaggcc 4860
ggctttatta agaggcaact ggtcgaaaca cgccagatta ccaaacacgt ggcacaaatc 4920
ctcgactcta ggatgaacac taagtacgat gagaacgata agctgatcag ggaagtgaaa 4980
gtgataactc tgaagagcaa gctggtgtct gacttccgga aggactttca attctacaaa 5040
gttcgcgaaa taaacaatta ccatcatgct cacgatgcct atctcaatgc tgtcgttggc 5100
accgccctga tcaagaaata ccctaaactg gagtctgagt tcgtgtacgg tgactataaa 5160
gtctacgatg tgaggaagat gatagcaaag tctgagcaag agattggcaa agccaccgcc 5220
aagtacttct tctactctaa tatcatgaat ttctttaaga ctgagataac cctggctaac 5280
ggcgaaatcc ggaagcgccc actgatcgaa acaaacggag aaacaggaga aatcgtgtgg 5340
gataaaggca gggacttcgc aactgtgcgg aaggtgctgt ccatgccaca agtcaatatc 5400
gtgaagaaga ccgaagtgca gaccggcgga ttctcaaagg agagcatcct gccaaagcgg 5460
aactctgaca agctgatcgc caggaagaaa gattgggacc caaagaagta tggcggtttc 5520
gattccccta cagtggctta ttccgttctg gtcgtggcaa aagtggagaa aggcaagtcc 5580
aagaaactca agtctgttaa ggagctgctc ggaattacta ttatggagag atccagcttc 5640
gagaagaatc caatcgattt cctggaagct aagggctata aagaagtgaa gaaagatctc 5700
atcatcaaac tgcccaagta ctctctcttt gagctggaga atggtaggaa gcggatgctg 5760
gcctccgccg gagagctgca gaaaggaaac gagctggctc tgccctccaa atacgtgaac 5820
ttcctgtatc tggcctccca ctacgagaaa ctcaaaggta gccctgaaga caatgagcag 5880
aagcaactct ttgttgagca acataaacac tacctggacg aaatcattga acagattagc 5940
gagttcagca agcgggttat tctggccgat gcaaacctcg ataaagtgct gagcgcatat 6000
aataagcaca gggacaagcc aattcgcgaa caagcagaga atattatcca cctctttact 6060
ctgactaatc tgggcgctcc tgctgccttc aagtatttcg atacaactat tgacaggaag 6120
cggtacacct ctaccaaaga agttctcgat gccaccctga tacaccagtc aattaccgga 6180
ctgtacgaga ctcgcatcga cctgtctcag ctcggcggcg acggttctcc caagaagaag 6240
aggaaagtct cgagcggtgg agctgcagga taggaattcg ggcccttcga aggtaagcct 6300
atccctaacc ctctcctcgg tctcgattct acgcgtaccg gtcatcatca ccatcaccat 6360
tgagtttaaa cccgctgatc agcctcgact gtgccttcta gttgccagcc atctgttgtt 6420
tgcccctccc ccgtgccttc cttgaccctg gaaggtgcca ctcccactgt cctttcctaa 6480
taaaatgagg aaattgcatc gcattgtctg agtaggtgtc attctattct ggggggtggg 6540
gtggggcagg acagcaaggg ggaggattgg gaagacaata gcaggcatgc tggggatgcg 6600
gtgggctcta tggcttctga ggcggaaaga accagctggg gctctagggg gtatccccac 6660
gcgccctgta gcggcgcatt aagcgcggcg ggtgtggtgg ttacgcgcag cgtgaccgct 6720
acacttgcca gcgccctagc gcccgctcct ttcgctttct tcccttcctt tctcgccacg 6780
ttcgccggct ttccccgtca agctctaaat cggggcatcc ctttagggtt ccgatttagt 6840
gctttacggc acctcgaccc caaaaaactt gattagggtg atggttcacg tagtgggcca 6900
tcgccctgat agacggtttt tcgccctttg acgttggagt ccacgttctt taatagtgga 6960
ctcttgttcc aaactggaac aacactcaac cctatctcgg tctattcttt tgatttataa 7020
gggattttgg ggatttcggc ctattggtta aaaaatgagc tgatttaaca aaaatttaac 7080
gcgaattaat tctgtggaat gtgtgtcagt tagggtgtgg aaagtcccca ggctccccag 7140
gcaggcagaa gtatgcaaag catgcatctc aattagtcag caaccaggtg tggaaagtcc 7200
ccaggctccc cagcaggcag aagtatgcaa agcatgcatc tcaattagtc agcaaccata 7260
gtcccgcccc taactccgcc catcccgccc ctaactccgc ccagttccgc ccattctccg 7320
ccccatggct gactaatttt ttttatttat gcagaggccg aggccgcctc tgcctctgag 7380
ctattccaga agtagtgagg aggctttttt ggaggcctag gcttttgcaa aaagctcccg 7440
ggagcttgta tatccatttt cggatctgat cagcacgtgt tgacaattaa tcatcggcat 7500
agtatatcgg catagtataa tacgacaagg tgaggaacta aaccatggcc aagcctttgt 7560
ctcaagaaga atccaccctc attgaaagag caacggctac aatcaacagc atccccatct 7620
ctgaagacta cagcgtcgcc agcgcagctc tctctagcga cggccgcatc ttcactggtg 7680
tcaatgtata tcattttact gggggacctt gtgcagaact cgtggtgctg ggcactgctg 7740
ctgctgcggc agctggcaac ctgacttgta tcgtcgcgat cggaaatgag aacaggggca 7800
tcttgagccc ctgcggacgg tgtcgacagg tgcttctcga tctgcatcct gggatcaaag 7860
cgatagtgaa ggacagtgat ggacagccga cggcagttgg gattcgtgaa ttgctgccct 7920
ctggttatgt gtgggagggc taagcacttc gtggccgagg agcaggactg acacgtgcta 7980
cgagatttcg attccaccgc cgccttctat gaaaggttgg gcttcggaat cgttttccgg 8040
gacgccggct ggatgatcct ccagcgcggg gatctcatgc tggagttctt cgcccacccc 8100
aacttgttta ttgcagctta taatggttac aaataaagca atagcatcac aaatttcaca 8160
aataaagcat ttttttcact gcattctagt tgtggtttgt ccaaactcat caatgtatct 8220
tatcatgtct gtataccgtc gacctctagc tagagcttgg cgtaatcatg gtcatagctg 8280
tttcctgtgt gaaattgtta tccgctcaca attccacaca acatacgagc cggaagcata 8340
aagtgtaaag cctggggtgc ctaatgagtg agctaactca cattaattgc gttgcgctca 8400
ctgcccgctt tccagtcggg aaacctgtcg tgccagctgc attaatgaat cggccaacgc 8460
gcggggagag gcggtttgcg tattgggcgc tcttccgctt cctcgctcac tgactcgctg 8520
cgctcggtcg ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta 8580
tccacagaat caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc 8640
aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag 8700
catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac 8760
caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc 8820
ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcaatg ctcacgctgt 8880
aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc 8940
gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga 9000
cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta 9060
ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta 9120
tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga 9180
tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg 9240
cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag 9300
tggaacgaaa actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc 9360
tagatccttt taaattaaaa atgaagtttt aaatcaatct aaagtatata tgagtaaact 9420
tggtctgaca gttaccaatg cttaatcagt gaggcaccta tctcagcgat ctgtctattt 9480
cgttcatcca tagttgcctg actccccgtc gtgtagataa ctacgatacg ggagggctta 9540
ccatctggcc ccagtgctgc aatgataccg cgagacccac gctcaccggc tccagattta 9600
tcagcaataa accagccagc cggaagggcc gagcgcagaa gtggtcctgc aactttatcc 9660
gcctccatcc agtctattaa ttgttgccgg gaagctagag taagtagttc gccagttaat 9720
agtttgcgca acgttgttgc cattgctaca ggcatcgtgg tgtcacgctc gtcgtttggt 9780
atggcttcat tcagctccgg ttcccaacga tcaaggcgag ttacatgatc ccccatgttg 9840
tgcaaaaaag cggttagctc cttcggtcct ccgatcgttg tcagaagtaa gttggccgca 9900
gtgttatcac tcatggttat ggcagcactg cataattctc ttactgtcat gccatccgta 9960
agatgctttt ctgtgactgg tgagtactca accaagtcat tctgagaata gtgtatgcgg 10020
cgaccgagtt gctcttgccc ggcgtcaata cgggataata ccgcgccaca tagcagaact 10080
ttaaaagtgc tcatcattgg aaaacgttct tcggggcgaa aactctcaag gatcttaccg 10140
ctgttgagat ccagttcgat gtaacccact cgtgcaccca actgatcttc agcatctttt 10200
actttcacca gcgtttctgg gtgagcaaaa acaggaaggc aaaatgccgc aaaaaaggga 10260
ataagggcga cacggaaatg ttgaatactc atactcttcc tttttcaata ttattgaagc 10320
atttatcagg gttattgtct catgagcgga tacatatttg aatgtattta gaaaaataaa 10380
caaatagggg ttccgcgcac atttccccga aaagtgccac ctgacgtc 10428
<210> 12
<211> 10458
<212> DNA
<213> Artificial Sequence
<400> 12
gacggatcgg gagatctccc gatcccctat ggtcgactct cagtacaatc tgctctgatg 60
ccgcatagtt aagccagtat ctgctccctg cttgtgtgtt ggaggtcgct gagtagtgcg 120
cgagcaaaat ttaagctaca acaaggcaag gcttgaccga caattgcatg aagaatctgc 180
ttagggttag gcgttttgcg ctgcttcgcg atgtacgggc cagatatacg cgttgacatt 240
gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata 300
tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc 360
cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc 420
attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt 480
atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt 540
atgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 600
tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg 660
actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc 720
aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg 780
gtaggcgtgt acggtgggag gtctatataa gcagagctct ctggctaact agagaaccca 840
ctgcttactg gcttatcgaa attaatacga ctcactatag ggagacccaa gctggctagc 900
accatgggac ctaagaaaaa gaggaaggtg gcggccgctg actacaaaga ccatgacggt 960
gattataaag atcatgacat cgactacaag gatgacgatg acaagtctag agacaagaaa 1020
tactctattg gactggctat cgggacaaac tccgttggct gggccgtcat aaccgacgag 1080
tataaggtgc caagcaagaa attcaaggtg ctgggtaata ctgaccgcca ttcaatcaag 1140
aagaacctga tcggagcact cctcttcgac tccggtgaaa ccgctgaagc tactcggctg 1200
aagcggaccg caaggcggag atacacccgc cgcaagaatc ggatatgtta tctgcaagag 1260
atctttagca acgaaatggc taaggtggac gactccttct ttcaccgcct ggaagagagc 1320
tttctggtgg aggaggataa gaaacacgag aggcacccta tattcggaaa tatcgtggat 1380
gaggtggctt accatgaaaa gtatcctaca atctaccatc tgaggaagaa gctggtggac 1440
agcaccgata aagcagacct gaggctcatc tatctggccc tggctcatat gataaagttt 1500
agaggacact ttctgatcga gggcgacctg aatcccgata attccgatgt ggataaactc 1560
ttcattcaac tggtgcagac atataaccaa ctgttcgagg agaatcccat aaacgcttct 1620
ggtgtggatg ccaaggctat tctgtccgct cggctgtcca agtcacgcag actggagaat 1680
ctgattgccc aactgccagg agaaaagaag aacggcctgt ttgggaacct catcgccctg 1740
agcctgggcc tgacacctaa cttcaagtcc aattttgatc tggccgaaga tgctaaactc 1800
cagctctcca aggacaccta tgacgatgat ctggacaacc tgctcgcaca gataggcgac 1860
cagtacgccg atctctttct ggctgctaag aatctctccg acgccattct gctgagcgac 1920
atactccggg tcaacactga gatcaccaaa gcacctctga gcgcctccat gataaaacgc 1980
tatgatgaac accatcaaga cctgactctg ctcaaagccc tcgtgaggca acagctgcca 2040
gagaagtaca aagagatatt cttcgaccag agcaagaatg gatatgccgg atacatcgat 2100
ggcggagcat cacaggaaga attttacaag ttcatcaaac caatcctcga gaagatggac 2160
ggtactgaag agctgctggt gaagctgaac agggaggacc tgctgaggaa gcagaggacc 2220
tttgataatg gctccattcc acatcagata cacctgggag agctgcatgc aatcctccgc 2280
aggcaggagg atttctatcc tttcctgaag gataaccggg agaagataga gaagatcctg 2340
accttcagga tcccttatta cgtcggccct ctggctagag gcaactcccg cttcgcttgg 2400
atgaccagga aatctgagga gacaattact ccttggaact tcgaagaggt cgtggataag 2460
ggcgcaagcg cccagtcatt catcgaacgg atgaccaatt tcgataagaa cctgccaaac 2520
gagaaggtcc tgcccaaaca ttcactcctg tacgagtatt tcaccgtcta taacgagctg 2580
actaaagtga agtacgtgac cgagggcatg aggaagcctg ccttcctgtc cggagagcag 2640
aagaaggcta tcgttgatct gctcttcaag actaatagaa aggtgacagt gaagcagctc 2700
aaggaggatt actttaagaa gatcgaatgc tttgactcag tggaaatctc tggcgtggag 2760
gaccgcttta atgccagcct gggcacttac catgatctgc tgaagataat caaagacaaa 2820
gatttcctcg ataatgagga gaacgaggac atcctggaag atatcgtgct gaccctgact 2880
ctgttcgagg atagagagat gatcgaagag cgcctgaaga cctatgccca tctgtttgac 2940
gataaagtca tgaaacagct caagcggcgg cgctacactg ggtggggtag actctccagg 3000
aaactcataa acggcatccg cgacaaacag agcggaaaga ccatcctgga tttcctgaaa 3060
tccgacggat tcgctaacag gaacttcatg caactgattc acgatgactc tctgacattt 3120
aaagaggaca tccagaaggc acaggtgagc ggtcaaggcg acagcctgca cgagcacatc 3180
gccaacctcg ctggatcacc cgccataaag aagggaatac tgcagacagt caaggtcgtg 3240
gacgaactcg tcaaagtgat gggtcggcac aagccagaga atatcgttat cgaaatggca 3300
agggagaacc aaaccaccca gaagggccag aagaactctc gggaacggat gaaaagaatc 3360
gaagagggaa ttaaggagct gggatctcag atactgaagg agcaccctgt ggagaataca 3420
cagctccaga acgagaaact ctacctgtac tacctccaga acgggcggga catgtacgtt 3480
gaccaggaac tcgacatcaa ccggctgtcc gattatgacg tggacgctat tgttccacag 3540
tccttcctca aagatgactc cattgacaac aaggtgctga ccagatccga taaggcccgc 3600
ggtaagtctg acaatgttcc atcagaagag gtggtcaaga agatgaagaa ttactggcgg 3660
cagctcctca acgccaaact gatcacccag cggaagtttg acaatctgac taaggcagaa 3720
agaggaggtc tgagcgaact cgacaaggcc ggctttatta agaggcaact ggtcgaaaca 3780
cgccagatta ccaaacacgt ggcacaaatc ctcgactcta ggatgaacac taagtacgat 3840
gagaacgata agctgatcag ggaagtgaaa gtgataactc tgaagagcaa gctggtgtct 3900
gacttccgga aggactttca attctacaaa gttcgcgaaa taaacaatta ccatcatgct 3960
cacgatgcct atctcaatgc tgtcgttggc accgccctga tcaagaaata ccctaaactg 4020
gagtctgagt tcgtgtacgg tgactataaa gtctacgatg tgaggaagat gatagcaaag 4080
tctgagcaag agattggcaa agccaccgcc aagtacttct tctactctaa tatcatgaat 4140
ttctttaaga ctgagataac cctggctaac ggcgaaatcc ggaagcgccc actgatcgaa 4200
acaaacggag aaacaggaga aatcgtgtgg gataaaggca gggacttcgc aactgtgcgg 4260
aaggtgctgt ccatgccaca agtcaatatc gtgaagaaga ccgaagtgca gaccggcgga 4320
ttctcaaagg agagcatcct gccaaagcgg aactctgaca agctgatcgc caggaagaaa 4380
gattgggacc caaagaagta tggcggtttc gattccccta cagtggctta ttccgttctg 4440
gtcgtggcaa aagtggagaa aggcaagtcc aagaaactca agtctgttaa ggagctgctc 4500
ggaattacta ttatggagag atccagcttc gagaagaatc caatcgattt cctggaagct 4560
aagggctata aagaagtgaa gaaagatctc atcatcaaac tgcccaagta ctctctcttt 4620
gagctggaga atggtaggaa gcggatgctg gcctccgccg gagagctgca gaaaggaaac 4680
gagctggctc tgccctccaa atacgtgaac ttcctgtatc tggcctccca ctacgagaaa 4740
ctcaaaggta gccctgaaga caatgagcag aagcaactct ttgttgagca acataaacac 4800
tacctggacg aaatcattga acagattagc gagttcagca agcgggttat tctggccgat 4860
gcaaacctcg ataaagtgct gagcgcatat aataagcaca gggacaagcc aattcgcgaa 4920
caagcagaga atattatcca cctctttact ctgactaatc tgggcgctcc tgctgccttc 4980
aagtatttcg atacaactat tgacaggaag cggtacacct ctaccaaaga agttctcgat 5040
gccaccctga tacaccagtc aattaccgga ctgtacgaga ctcgcatcga cctgtctcag 5100
ctcggcggcg acggttctcc caagaagaag aggaaagtcg ggcgcgctgg aggaggatcc 5160
ggaggaggat ccggaggagg atccatgctg gatagggatg tgggtccaac tcccatgtat 5220
ccgcctacat acctggagcc agggattggg aggcacacac catatggcaa ccaaactgac 5280
tacagaatat ttgagcttaa caaacggctt cagaactgga cagaggagtg tgacaatctc 5340
tggtgggatg cattcacgac tgagttcttt gaggatgatg ccatgttgac catcactttc 5400
tgcctggagg atggaccaaa gagatatacc attggccgga ccctgatccc acgctacttc 5460
cgcagcatct ttgagggggg tgctacggag ctgtactatg ttcttaagca ccccaaggag 5520
gcattccaca gcaactttgt gtccctcgac tgtgaccagg gcagcatggt gacccagcat 5580
ggcaagccca tgttcaccca ggtgtgtgtg gagggccggt tgtacctgga gttcatgttt 5640
gacgacatga tgcggataaa gacgtggcac ttcagcatcc ggcagcaccg agagctcatc 5700
ccccgcagca tccttgccat gcatgcccaa gacccccaga tgttggatca gctctccaaa 5760
aacatcactc ggtgtgggct gtccaattcc actctcaact acctccgact ctgtgtgata 5820
ctcgagccca tgcaagagct catgtcacgc cacaagacct acagcctcag cccccgcgac 5880
tgcctcaaga cctgcctttt ccagaagtgg cagcgcatgg tagcaccccc tgcggagccc 5940
acacgtcagc agcccagcaa acggcggaaa cggaagatgt cagggggcag caccatgagc 6000
tctggtggtg gcaacaccaa caacagcaac agcaagaaga agagcccagc tagcaccttc 6060
gccctctcca gccaggtacc tgatgtgatg gtggtggggg agcccaccct gatgggcggg 6120
gagttcgggg acgaggacga gaggctcatc acccggctgg agaacaccca gtttgacgca 6180
gccaacggca ttgacgacga ggacagcttt aacaactccc ctgcactggg cgccaacagc 6240
ccctggaaca gcaagcctcc gtccagccaa gaaagcaaat cggagaaccc cacgtcacag 6300
gcctcccagg ggcccttcga aggtaagcct atccctaacc ctctcctcgg tctcgattct 6360
acgcgtaccg gtcatcatca ccatcaccat tgagtttaaa cccgctgatc agcctcgact 6420
gtgccttcta gttgccagcc atctgttgtt tgcccctccc ccgtgccttc cttgaccctg 6480
gaaggtgcca ctcccactgt cctttcctaa taaaatgagg aaattgcatc gcattgtctg 6540
agtaggtgtc attctattct ggggggtggg gtggggcagg acagcaaggg ggaggattgg 6600
gaagacaata gcaggcatgc tggggatgcg gtgggctcta tggcttctga ggcggaaaga 6660
accagctggg gctctagggg gtatccccac gcgccctgta gcggcgcatt aagcgcggcg 6720
ggtgtggtgg ttacgcgcag cgtgaccgct acacttgcca gcgccctagc gcccgctcct 6780
ttcgctttct tcccttcctt tctcgccacg ttcgccggct ttccccgtca agctctaaat 6840
cggggcatcc ctttagggtt ccgatttagt gctttacggc acctcgaccc caaaaaactt 6900
gattagggtg atggttcacg tagtgggcca tcgccctgat agacggtttt tcgccctttg 6960
acgttggagt ccacgttctt taatagtgga ctcttgttcc aaactggaac aacactcaac 7020
cctatctcgg tctattcttt tgatttataa gggattttgg ggatttcggc ctattggtta 7080
aaaaatgagc tgatttaaca aaaatttaac gcgaattaat tctgtggaat gtgtgtcagt 7140
tagggtgtgg aaagtcccca ggctccccag gcaggcagaa gtatgcaaag catgcatctc 7200
aattagtcag caaccaggtg tggaaagtcc ccaggctccc cagcaggcag aagtatgcaa 7260
agcatgcatc tcaattagtc agcaaccata gtcccgcccc taactccgcc catcccgccc 7320
ctaactccgc ccagttccgc ccattctccg ccccatggct gactaatttt ttttatttat 7380
gcagaggccg aggccgcctc tgcctctgag ctattccaga agtagtgagg aggctttttt 7440
ggaggcctag gcttttgcaa aaagctcccg ggagcttgta tatccatttt cggatctgat 7500
cagcacgtgt tgacaattaa tcatcggcat agtatatcgg catagtataa tacgacaagg 7560
tgaggaacta aaccatggcc aagcctttgt ctcaagaaga atccaccctc attgaaagag 7620
caacggctac aatcaacagc atccccatct ctgaagacta cagcgtcgcc agcgcagctc 7680
tctctagcga cggccgcatc ttcactggtg tcaatgtata tcattttact gggggacctt 7740
gtgcagaact cgtggtgctg ggcactgctg ctgctgcggc agctggcaac ctgacttgta 7800
tcgtcgcgat cggaaatgag aacaggggca tcttgagccc ctgcggacgg tgtcgacagg 7860
tgcttctcga tctgcatcct gggatcaaag cgatagtgaa ggacagtgat ggacagccga 7920
cggcagttgg gattcgtgaa ttgctgccct ctggttatgt gtgggagggc taagcacttc 7980
gtggccgagg agcaggactg acacgtgcta cgagatttcg attccaccgc cgccttctat 8040
gaaaggttgg gcttcggaat cgttttccgg gacgccggct ggatgatcct ccagcgcggg 8100
gatctcatgc tggagttctt cgcccacccc aacttgttta ttgcagctta taatggttac 8160
aaataaagca atagcatcac aaatttcaca aataaagcat ttttttcact gcattctagt 8220
tgtggtttgt ccaaactcat caatgtatct tatcatgtct gtataccgtc gacctctagc 8280
tagagcttgg cgtaatcatg gtcatagctg tttcctgtgt gaaattgtta tccgctcaca 8340
attccacaca acatacgagc cggaagcata aagtgtaaag cctggggtgc ctaatgagtg 8400
agctaactca cattaattgc gttgcgctca ctgcccgctt tccagtcggg aaacctgtcg 8460
tgccagctgc attaatgaat cggccaacgc gcggggagag gcggtttgcg tattgggcgc 8520
tcttccgctt cctcgctcac tgactcgctg cgctcggtcg ttcggctgcg gcgagcggta 8580
tcagctcact caaaggcggt aatacggtta tccacagaat caggggataa cgcaggaaag 8640
aacatgtgag caaaaggcca gcaaaaggcc aggaaccgta aaaaggccgc gttgctggcg 8700
tttttccata ggctccgccc ccctgacgag catcacaaaa atcgacgctc aagtcagagg 8760
tggcgaaacc cgacaggact ataaagatac caggcgtttc cccctggaag ctccctcgtg 8820
cgctctcctg ttccgaccct gccgcttacc ggatacctgt ccgcctttct cccttcggga 8880
agcgtggcgc tttctcaatg ctcacgctgt aggtatctca gttcggtgta ggtcgttcgc 8940
tccaagctgg gctgtgtgca cgaacccccc gttcagcccg accgctgcgc cttatccggt 9000
aactatcgtc ttgagtccaa cccggtaaga cacgacttat cgccactggc agcagccact 9060
ggtaacagga ttagcagagc gaggtatgta ggcggtgcta cagagttctt gaagtggtgg 9120
cctaactacg gctacactag aaggacagta tttggtatct gcgctctgct gaagccagtt 9180
accttcggaa aaagagttgg tagctcttga tccggcaaac aaaccaccgc tggtagcggt 9240
ggtttttttg tttgcaagca gcagattacg cgcagaaaaa aaggatctca agaagatcct 9300
ttgatctttt ctacggggtc tgacgctcag tggaacgaaa actcacgtta agggattttg 9360
gtcatgagat tatcaaaaag gatcttcacc tagatccttt taaattaaaa atgaagtttt 9420
aaatcaatct aaagtatata tgagtaaact tggtctgaca gttaccaatg cttaatcagt 9480
gaggcaccta tctcagcgat ctgtctattt cgttcatcca tagttgcctg actccccgtc 9540
gtgtagataa ctacgatacg ggagggctta ccatctggcc ccagtgctgc aatgataccg 9600
cgagacccac gctcaccggc tccagattta tcagcaataa accagccagc cggaagggcc 9660
gagcgcagaa gtggtcctgc aactttatcc gcctccatcc agtctattaa ttgttgccgg 9720
gaagctagag taagtagttc gccagttaat agtttgcgca acgttgttgc cattgctaca 9780
ggcatcgtgg tgtcacgctc gtcgtttggt atggcttcat tcagctccgg ttcccaacga 9840
tcaaggcgag ttacatgatc ccccatgttg tgcaaaaaag cggttagctc cttcggtcct 9900
ccgatcgttg tcagaagtaa gttggccgca gtgttatcac tcatggttat ggcagcactg 9960
cataattctc ttactgtcat gccatccgta agatgctttt ctgtgactgg tgagtactca 10020
accaagtcat tctgagaata gtgtatgcgg cgaccgagtt gctcttgccc ggcgtcaata 10080
cgggataata ccgcgccaca tagcagaact ttaaaagtgc tcatcattgg aaaacgttct 10140
tcggggcgaa aactctcaag gatcttaccg ctgttgagat ccagttcgat gtaacccact 10200
cgtgcaccca actgatcttc agcatctttt actttcacca gcgtttctgg gtgagcaaaa 10260
acaggaaggc aaaatgccgc aaaaaaggga ataagggcga cacggaaatg ttgaatactc 10320
atactcttcc tttttcaata ttattgaagc atttatcagg gttattgtct catgagcgga 10380
tacatatttg aatgtattta gaaaaataaa caaatagggg ttccgcgcac atttccccga 10440
aaagtgccac ctgacgtc 10458
<210> 13
<211> 9903
<212> DNA
<213> Artificial Sequence
<400> 13
gacggatcgg gagatctccc gatcccctat ggtcgactct cagtacaatc tgctctgatg 60
ccgcatagtt aagccagtat ctgctccctg cttgtgtgtt ggaggtcgct gagtagtgcg 120
cgagcaaaat ttaagctaca acaaggcaag gcttgaccga caattgcatg aagaatctgc 180
ttagggttag gcgttttgcg ctgcttcgcg atgtacgggc cagatatacg cgttgacatt 240
gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata 300
tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc 360
cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc 420
attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt 480
atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt 540
atgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 600
tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg 660
actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc 720
aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg 780
gtaggcgtgt acggtgggag gtctatataa gcagagctct ctggctaact agagaaccca 840
ctgcttactg gcttatcgaa attaatacga ctcactatag ggagacccaa gctggctagc 900
accatgggac ctaagaaaaa gaggaaggtg gcggccgctg gcggcagcat gctggatagg 960
gatgtgggtc caactcccat gtatccgcct acatacctgg agccagggat tgggaggcac 1020
acaccatatg gcaaccaaac tgactacaga atatttgagc ttaacaaacg gcttcagaac 1080
tggacagagg agtgtgacaa tctctggtgg gatgcattca cgactgagtt ctttgaggat 1140
gatgccatgt tgaccatcac tttctgcctg gaggatggac caaagagata taccattggc 1200
cggaccctga tcccacgcta cttccgcagc atctttgagg ggggtgctac ggagctgtac 1260
tatgttctta agcaccccaa ggaggcattc cacagcaact ttgtgtccct cgactgtgac 1320
cagggcagca tggtgaccca gcatggcaag cccatgttca cccaggtgtg tgtggagggc 1380
cggttgtacc tggagttcat gtttgacgac atgatgcgga taaagacgtg gcacttcagc 1440
atccggcagc accgagagct catcccccgc agcatccttg ccatgcatgc ccaagacccc 1500
cagatgttgg atcagctctc caaaaacatc actcggtgtg ggctgtccag cggcagcgag 1560
acccccggta ccagcgagag cgccaccccc gagagcgaca agaaatactc tattggactg 1620
gctatcggga caaactccgt tggctgggcc gtcataaccg acgagtataa ggtgccaagc 1680
aagaaattca aggtgctggg taatactgac cgccattcaa tcaagaagaa cctgatcgga 1740
gcactcctct tcgactccgg tgaaaccgct gaagctactc ggctgaagcg gaccgcaagg 1800
cggagataca cccgccgcaa gaatcggata tgttatctgc aagagatctt tagcaacgaa 1860
atggctaagg tggacgactc cttctttcac cgcctggaag agagctttct ggtggaggag 1920
gataagaaac acgagaggca ccctatattc ggaaatatcg tggatgaggt ggcttaccat 1980
gaaaagtatc ctacaatcta ccatctgagg aagaagctgg tggacagcac cgataaagca 2040
gacctgaggc tcatctatct ggccctggct catatgataa agtttagagg acactttctg 2100
atcgagggcg acctgaatcc cgataattcc gatgtggata aactcttcat tcaactggtg 2160
cagacatata accaactgtt cgaggagaat cccataaacg cttctggtgt ggatgccaag 2220
gctattctgt ccgctcggct gtccaagtca cgcagactgg agaatctgat tgcccaactg 2280
ccaggagaaa agaagaacgg cctgtttggg aacctcatcg ccctgagcct gggcctgaca 2340
cctaacttca agtccaattt tgatctggcc gaagatgcta aactccagct ctccaaggac 2400
acctatgacg atgatctgga caacctgctc gcacagatag gcgaccagta cgccgatctc 2460
tttctggctg ctaagaatct ctccgacgcc attctgctga gcgacatact ccgggtcaac 2520
actgagatca ccaaagcacc tctgagcgcc tccatgataa aacgctatga tgaacaccat 2580
caagacctga ctctgctcaa agccctcgtg aggcaacagc tgccagagaa gtacaaagag 2640
atattcttcg accagagcaa gaatggatat gccggataca tcgatggcgg agcatcacag 2700
gaagaatttt acaagttcat caaaccaatc ctcgagaaga tggacggtac tgaagagctg 2760
ctggtgaagc tgaacaggga ggacctgctg aggaagcaga ggacctttga taatggctcc 2820
attccacatc agatacacct gggagagctg catgcaatcc tccgcaggca ggaggatttc 2880
tatcctttcc tgaaggataa ccgggagaag atagagaaga tcctgacctt caggatccct 2940
tattacgtcg gccctctggc tagaggcaac tcccgcttcg cttggatgac caggaaatct 3000
gaggagacaa ttactccttg gaacttcgaa gaggtcgtgg ataagggcgc aagcgcccag 3060
tcattcatcg aacggatgac caatttcgat aagaacctgc caaacgagaa ggtcctgccc 3120
aaacattcac tcctgtacga gtatttcacc gtctataacg agctgactaa agtgaagtac 3180
gtgaccgagg gcatgaggaa gcctgccttc ctgtccggag agcagaagaa ggctatcgtt 3240
gatctgctct tcaagactaa tagaaaggtg acagtgaagc agctcaagga ggattacttt 3300
aagaagatcg aatgctttga ctcagtggaa atctctggcg tggaggaccg ctttaatgcc 3360
agcctgggca cttaccatga tctgctgaag ataatcaaag acaaagattt cctcgataat 3420
gaggagaacg aggacatcct ggaagatatc gtgctgaccc tgactctgtt cgaggataga 3480
gagatgatcg aagagcgcct gaagacctat gcccatctgt ttgacgataa agtcatgaaa 3540
cagctcaagc ggcggcgcta cactgggtgg ggtagactct ccaggaaact cataaacggc 3600
atccgcgaca aacagagcgg aaagaccatc ctggatttcc tgaaatccga cggattcgct 3660
aacaggaact tcatgcaact gattcacgat gactctctga catttaaaga ggacatccag 3720
aaggcacagg tgagcggtca aggcgacagc ctgcacgagc acatcgccaa cctcgctgga 3780
tcacccgcca taaagaaggg aatactgcag acagtcaagg tcgtggacga actcgtcaaa 3840
gtgatgggtc ggcacaagcc agagaatatc gttatcgaaa tggcaaggga gaaccaaacc 3900
acccagaagg gccagaagaa ctctcgggaa cggatgaaaa gaatcgaaga gggaattaag 3960
gagctgggat ctcagatact gaaggagcac cctgtggaga atacacagct ccagaacgag 4020
aaactctacc tgtactacct ccagaacggg cgggacatgt acgttgacca ggaactcgac 4080
atcaaccggc tgtccgatta tgacgtggac gctattgttc cacagtcctt cctcaaagat 4140
gactccattg acaacaaggt gctgaccaga tccgataagg cccgcggtaa gtctgacaat 4200
gttccatcag aagaggtggt caagaagatg aagaattact ggcggcagct cctcaacgcc 4260
aaactgatca cccagcggaa gtttgacaat ctgactaagg cagaaagagg aggtctgagc 4320
gaactcgaca aggccggctt tattaagagg caactggtcg aaacacgcca gattaccaaa 4380
cacgtggcac aaatcctcga ctctaggatg aacactaagt acgatgagaa cgataagctg 4440
atcagggaag tgaaagtgat aactctgaag agcaagctgg tgtctgactt ccggaaggac 4500
tttcaattct acaaagttcg cgaaataaac aattaccatc atgctcacga tgcctatctc 4560
aatgctgtcg ttggcaccgc cctgatcaag aaatacccta aactggagtc tgagttcgtg 4620
tacggtgact ataaagtcta cgatgtgagg aagatgatag caaagtctga gcaagagatt 4680
ggcaaagcca ccgccaagta cttcttctac tctaatatca tgaatttctt taagactgag 4740
ataaccctgg ctaacggcga aatccggaag cgcccactga tcgaaacaaa cggagaaaca 4800
ggagaaatcg tgtgggataa aggcagggac ttcgcaactg tgcggaaggt gctgtccatg 4860
ccacaagtca atatcgtgaa gaagaccgaa gtgcagaccg gcggattctc aaaggagagc 4920
atcctgccaa agcggaactc tgacaagctg atcgccagga agaaagattg ggacccaaag 4980
aagtatggcg gtttcgattc ccctacagtg gcttattccg ttctggtcgt ggcaaaagtg 5040
gagaaaggca agtccaagaa actcaagtct gttaaggagc tgctcggaat tactattatg 5100
gagagatcca gcttcgagaa gaatccaatc gatttcctgg aagctaaggg ctataaagaa 5160
gtgaagaaag atctcatcat caaactgccc aagtactctc tctttgagct ggagaatggt 5220
aggaagcgga tgctggcctc cgccggagag ctgcagaaag gaaacgagct ggctctgccc 5280
tccaaatacg tgaacttcct gtatctggcc tcccactacg agaaactcaa aggtagccct 5340
gaagacaatg agcagaagca actctttgtt gagcaacata aacactacct ggacgaaatc 5400
attgaacaga ttagcgagtt cagcaagcgg gttattctgg ccgatgcaaa cctcgataaa 5460
gtgctgagcg catataataa gcacagggac aagccaattc gcgaacaagc agagaatatt 5520
atccacctct ttactctgac taatctgggc gctcctgctg ccttcaagta tttcgataca 5580
actattgaca ggaagcggta cacctctacc aaagaagttc tcgatgccac cctgatacac 5640
cagtcaatta ccggactgta cgagactcgc atcgacctgt ctcagctcgg cggcgacggt 5700
tctcccaaga agaagaggaa agtctcgagc ggtggagctg caggatagga attcgggccc 5760
ttcgaaggta agcctatccc taaccctctc ctcggtctcg attctacgcg taccggtcat 5820
catcaccatc accattgagt ttaaacccgc tgatcagcct cgactgtgcc ttctagttgc 5880
cagccatctg ttgtttgccc ctcccccgtg ccttccttga ccctggaagg tgccactccc 5940
actgtccttt cctaataaaa tgaggaaatt gcatcgcatt gtctgagtag gtgtcattct 6000
attctggggg gtggggtggg gcaggacagc aagggggagg attgggaaga caatagcagg 6060
catgctgggg atgcggtggg ctctatggct tctgaggcgg aaagaaccag ctggggctct 6120
agggggtatc cccacgcgcc ctgtagcggc gcattaagcg cggcgggtgt ggtggttacg 6180
cgcagcgtga ccgctacact tgccagcgcc ctagcgcccg ctcctttcgc tttcttccct 6240
tcctttctcg ccacgttcgc cggctttccc cgtcaagctc taaatcgggg catcccttta 6300
gggttccgat ttagtgcttt acggcacctc gaccccaaaa aacttgatta gggtgatggt 6360
tcacgtagtg ggccatcgcc ctgatagacg gtttttcgcc ctttgacgtt ggagtccacg 6420
ttctttaata gtggactctt gttccaaact ggaacaacac tcaaccctat ctcggtctat 6480
tcttttgatt tataagggat tttggggatt tcggcctatt ggttaaaaaa tgagctgatt 6540
taacaaaaat ttaacgcgaa ttaattctgt ggaatgtgtg tcagttaggg tgtggaaagt 6600
ccccaggctc cccaggcagg cagaagtatg caaagcatgc atctcaatta gtcagcaacc 6660
aggtgtggaa agtccccagg ctccccagca ggcagaagta tgcaaagcat gcatctcaat 6720
tagtcagcaa ccatagtccc gcccctaact ccgcccatcc cgcccctaac tccgcccagt 6780
tccgcccatt ctccgcccca tggctgacta atttttttta tttatgcaga ggccgaggcc 6840
gcctctgcct ctgagctatt ccagaagtag tgaggaggct tttttggagg cctaggcttt 6900
tgcaaaaagc tcccgggagc ttgtatatcc attttcggat ctgatcagca cgtgttgaca 6960
attaatcatc ggcatagtat atcggcatag tataatacga caaggtgagg aactaaacca 7020
tggccaagcc tttgtctcaa gaagaatcca ccctcattga aagagcaacg gctacaatca 7080
acagcatccc catctctgaa gactacagcg tcgccagcgc agctctctct agcgacggcc 7140
gcatcttcac tggtgtcaat gtatatcatt ttactggggg accttgtgca gaactcgtgg 7200
tgctgggcac tgctgctgct gcggcagctg gcaacctgac ttgtatcgtc gcgatcggaa 7260
atgagaacag gggcatcttg agcccctgcg gacggtgtcg acaggtgctt ctcgatctgc 7320
atcctgggat caaagcgata gtgaaggaca gtgatggaca gccgacggca gttgggattc 7380
gtgaattgct gccctctggt tatgtgtggg agggctaagc acttcgtggc cgaggagcag 7440
gactgacacg tgctacgaga tttcgattcc accgccgcct tctatgaaag gttgggcttc 7500
ggaatcgttt tccgggacgc cggctggatg atcctccagc gcggggatct catgctggag 7560
ttcttcgccc accccaactt gtttattgca gcttataatg gttacaaata aagcaatagc 7620
atcacaaatt tcacaaataa agcatttttt tcactgcatt ctagttgtgg tttgtccaaa 7680
ctcatcaatg tatcttatca tgtctgtata ccgtcgacct ctagctagag cttggcgtaa 7740
tcatggtcat agctgtttcc tgtgtgaaat tgttatccgc tcacaattcc acacaacata 7800
cgagccggaa gcataaagtg taaagcctgg ggtgcctaat gagtgagcta actcacatta 7860
attgcgttgc gctcactgcc cgctttccag tcgggaaacc tgtcgtgcca gctgcattaa 7920
tgaatcggcc aacgcgcggg gagaggcggt ttgcgtattg ggcgctcttc cgcttcctcg 7980
ctcactgact cgctgcgctc ggtcgttcgg ctgcggcgag cggtatcagc tcactcaaag 8040
gcggtaatac ggttatccac agaatcaggg gataacgcag gaaagaacat gtgagcaaaa 8100
ggccagcaaa aggccaggaa ccgtaaaaag gccgcgttgc tggcgttttt ccataggctc 8160
cgcccccctg acgagcatca caaaaatcga cgctcaagtc agaggtggcg aaacccgaca 8220
ggactataaa gataccaggc gtttccccct ggaagctccc tcgtgcgctc tcctgttccg 8280
accctgccgc ttaccggata cctgtccgcc tttctccctt cgggaagcgt ggcgctttct 8340
caatgctcac gctgtaggta tctcagttcg gtgtaggtcg ttcgctccaa gctgggctgt 8400
gtgcacgaac cccccgttca gcccgaccgc tgcgccttat ccggtaacta tcgtcttgag 8460
tccaacccgg taagacacga cttatcgcca ctggcagcag ccactggtaa caggattagc 8520
agagcgaggt atgtaggcgg tgctacagag ttcttgaagt ggtggcctaa ctacggctac 8580
actagaagga cagtatttgg tatctgcgct ctgctgaagc cagttacctt cggaaaaaga 8640
gttggtagct cttgatccgg caaacaaacc accgctggta gcggtggttt ttttgtttgc 8700
aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag atcctttgat cttttctacg 8760
gggtctgacg ctcagtggaa cgaaaactca cgttaaggga ttttggtcat gagattatca 8820
aaaaggatct tcacctagat ccttttaaat taaaaatgaa gttttaaatc aatctaaagt 8880
atatatgagt aaacttggtc tgacagttac caatgcttaa tcagtgaggc acctatctca 8940
gcgatctgtc tatttcgttc atccatagtt gcctgactcc ccgtcgtgta gataactacg 9000
atacgggagg gcttaccatc tggccccagt gctgcaatga taccgcgaga cccacgctca 9060
ccggctccag atttatcagc aataaaccag ccagccggaa gggccgagcg cagaagtggt 9120
cctgcaactt tatccgcctc catccagtct attaattgtt gccgggaagc tagagtaagt 9180
agttcgccag ttaatagttt gcgcaacgtt gttgccattg ctacaggcat cgtggtgtca 9240
cgctcgtcgt ttggtatggc ttcattcagc tccggttccc aacgatcaag gcgagttaca 9300
tgatccccca tgttgtgcaa aaaagcggtt agctccttcg gtcctccgat cgttgtcaga 9360
agtaagttgg ccgcagtgtt atcactcatg gttatggcag cactgcataa ttctcttact 9420
gtcatgccat ccgtaagatg cttttctgtg actggtgagt actcaaccaa gtcattctga 9480
gaatagtgta tgcggcgacc gagttgctct tgcccggcgt caatacggga taataccgcg 9540
ccacatagca gaactttaaa agtgctcatc attggaaaac gttcttcggg gcgaaaactc 9600
tcaaggatct taccgctgtt gagatccagt tcgatgtaac ccactcgtgc acccaactga 9660
tcttcagcat cttttacttt caccagcgtt tctgggtgag caaaaacagg aaggcaaaat 9720
gccgcaaaaa agggaataag ggcgacacgg aaatgttgaa tactcatact cttccttttt 9780
caatattatt gaagcattta tcagggttat tgtctcatga gcggatacat atttgaatgt 9840
atttagaaaa ataaacaaat aggggttccg cgcacatttc cccgaaaagt gccacctgac 9900
gtc 9903
<210> 14
<211> 9933
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 14
gacggatcgg gagatctccc gatcccctat ggtcgactct cagtacaatc tgctctgatg 60
ccgcatagtt aagccagtat ctgctccctg cttgtgtgtt ggaggtcgct gagtagtgcg 120
cgagcaaaat ttaagctaca acaaggcaag gcttgaccga caattgcatg aagaatctgc 180
ttagggttag gcgttttgcg ctgcttcgcg atgtacgggc cagatatacg cgttgacatt 240
gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata 300
tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc 360
cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc 420
attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt 480
atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt 540
atgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 600
tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg 660
actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc 720
aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg 780
gtaggcgtgt acggtgggag gtctatataa gcagagctct ctggctaact agagaaccca 840
ctgcttactg gcttatcgaa attaatacga ctcactatag ggagacccaa gctggctagc 900
accatgggac ctaagaaaaa gaggaaggtg gcggccgctg actacaaaga ccatgacggt 960
gattataaag atcatgacat cgactacaag gatgacgatg acaagtctag agacaagaaa 1020
tactctattg gactggctat cgggacaaac tccgttggct gggccgtcat aaccgacgag 1080
tataaggtgc caagcaagaa attcaaggtg ctgggtaata ctgaccgcca ttcaatcaag 1140
aagaacctga tcggagcact cctcttcgac tccggtgaaa ccgctgaagc tactcggctg 1200
aagcggaccg caaggcggag atacacccgc cgcaagaatc ggatatgtta tctgcaagag 1260
atctttagca acgaaatggc taaggtggac gactccttct ttcaccgcct ggaagagagc 1320
tttctggtgg aggaggataa gaaacacgag aggcacccta tattcggaaa tatcgtggat 1380
gaggtggctt accatgaaaa gtatcctaca atctaccatc tgaggaagaa gctggtggac 1440
agcaccgata aagcagacct gaggctcatc tatctggccc tggctcatat gataaagttt 1500
agaggacact ttctgatcga gggcgacctg aatcccgata attccgatgt ggataaactc 1560
ttcattcaac tggtgcagac atataaccaa ctgttcgagg agaatcccat aaacgcttct 1620
ggtgtggatg ccaaggctat tctgtccgct cggctgtcca agtcacgcag actggagaat 1680
ctgattgccc aactgccagg agaaaagaag aacggcctgt ttgggaacct catcgccctg 1740
agcctgggcc tgacacctaa cttcaagtcc aattttgatc tggccgaaga tgctaaactc 1800
cagctctcca aggacaccta tgacgatgat ctggacaacc tgctcgcaca gataggcgac 1860
cagtacgccg atctctttct ggctgctaag aatctctccg acgccattct gctgagcgac 1920
atactccggg tcaacactga gatcaccaaa gcacctctga gcgcctccat gataaaacgc 1980
tatgatgaac accatcaaga cctgactctg ctcaaagccc tcgtgaggca acagctgcca 2040
gagaagtaca aagagatatt cttcgaccag agcaagaatg gatatgccgg atacatcgat 2100
ggcggagcat cacaggaaga attttacaag ttcatcaaac caatcctcga gaagatggac 2160
ggtactgaag agctgctggt gaagctgaac agggaggacc tgctgaggaa gcagaggacc 2220
tttgataatg gctccattcc acatcagata cacctgggag agctgcatgc aatcctccgc 2280
aggcaggagg atttctatcc tttcctgaag gataaccggg agaagataga gaagatcctg 2340
accttcagga tcccttatta cgtcggccct ctggctagag gcaactcccg cttcgcttgg 2400
atgaccagga aatctgagga gacaattact ccttggaact tcgaagaggt cgtggataag 2460
ggcgcaagcg cccagtcatt catcgaacgg atgaccaatt tcgataagaa cctgccaaac 2520
gagaaggtcc tgcccaaaca ttcactcctg tacgagtatt tcaccgtcta taacgagctg 2580
actaaagtga agtacgtgac cgagggcatg aggaagcctg ccttcctgtc cggagagcag 2640
aagaaggcta tcgttgatct gctcttcaag actaatagaa aggtgacagt gaagcagctc 2700
aaggaggatt actttaagaa gatcgaatgc tttgactcag tggaaatctc tggcgtggag 2760
gaccgcttta atgccagcct gggcacttac catgatctgc tgaagataat caaagacaaa 2820
gatttcctcg ataatgagga gaacgaggac atcctggaag atatcgtgct gaccctgact 2880
ctgttcgagg atagagagat gatcgaagag cgcctgaaga cctatgccca tctgtttgac 2940
gataaagtca tgaaacagct caagcggcgg cgctacactg ggtggggtag actctccagg 3000
aaactcataa acggcatccg cgacaaacag agcggaaaga ccatcctgga tttcctgaaa 3060
tccgacggat tcgctaacag gaacttcatg caactgattc acgatgactc tctgacattt 3120
aaagaggaca tccagaaggc acaggtgagc ggtcaaggcg acagcctgca cgagcacatc 3180
gccaacctcg ctggatcacc cgccataaag aagggaatac tgcagacagt caaggtcgtg 3240
gacgaactcg tcaaagtgat gggtcggcac aagccagaga atatcgttat cgaaatggca 3300
agggagaacc aaaccaccca gaagggccag aagaactctc gggaacggat gaaaagaatc 3360
gaagagggaa ttaaggagct gggatctcag atactgaagg agcaccctgt ggagaataca 3420
cagctccaga acgagaaact ctacctgtac tacctccaga acgggcggga catgtacgtt 3480
gaccaggaac tcgacatcaa ccggctgtcc gattatgacg tggacgctat tgttccacag 3540
tccttcctca aagatgactc cattgacaac aaggtgctga ccagatccga taaggcccgc 3600
ggtaagtctg acaatgttcc atcagaagag gtggtcaaga agatgaagaa ttactggcgg 3660
cagctcctca acgccaaact gatcacccag cggaagtttg acaatctgac taaggcagaa 3720
agaggaggtc tgagcgaact cgacaaggcc ggctttatta agaggcaact ggtcgaaaca 3780
cgccagatta ccaaacacgt ggcacaaatc ctcgactcta ggatgaacac taagtacgat 3840
gagaacgata agctgatcag ggaagtgaaa gtgataactc tgaagagcaa gctggtgtct 3900
gacttccgga aggactttca attctacaaa gttcgcgaaa taaacaatta ccatcatgct 3960
cacgatgcct atctcaatgc tgtcgttggc accgccctga tcaagaaata ccctaaactg 4020
gagtctgagt tcgtgtacgg tgactataaa gtctacgatg tgaggaagat gatagcaaag 4080
tctgagcaag agattggcaa agccaccgcc aagtacttct tctactctaa tatcatgaat 4140
ttctttaaga ctgagataac cctggctaac ggcgaaatcc ggaagcgccc actgatcgaa 4200
acaaacggag aaacaggaga aatcgtgtgg gataaaggca gggacttcgc aactgtgcgg 4260
aaggtgctgt ccatgccaca agtcaatatc gtgaagaaga ccgaagtgca gaccggcgga 4320
ttctcaaagg agagcatcct gccaaagcgg aactctgaca agctgatcgc caggaagaaa 4380
gattgggacc caaagaagta tggcggtttc gattccccta cagtggctta ttccgttctg 4440
gtcgtggcaa aagtggagaa aggcaagtcc aagaaactca agtctgttaa ggagctgctc 4500
ggaattacta ttatggagag atccagcttc gagaagaatc caatcgattt cctggaagct 4560
aagggctata aagaagtgaa gaaagatctc atcatcaaac tgcccaagta ctctctcttt 4620
gagctggaga atggtaggaa gcggatgctg gcctccgccg gagagctgca gaaaggaaac 4680
gagctggctc tgccctccaa atacgtgaac ttcctgtatc tggcctccca ctacgagaaa 4740
ctcaaaggta gccctgaaga caatgagcag aagcaactct ttgttgagca acataaacac 4800
tacctggacg aaatcattga acagattagc gagttcagca agcgggttat tctggccgat 4860
gcaaacctcg ataaagtgct gagcgcatat aataagcaca gggacaagcc aattcgcgaa 4920
caagcagaga atattatcca cctctttact ctgactaatc tgggcgctcc tgctgccttc 4980
aagtatttcg atacaactat tgacaggaag cggtacacct ctaccaaaga agttctcgat 5040
gccaccctga tacaccagtc aattaccgga ctgtacgaga ctcgcatcga cctgtctcag 5100
ctcggcggcg acggttctcc caagaagaag aggaaagtcg ggcgcgctgg aggaggatcc 5160
ggaggaggat ccggaggagg atccatgctg gatagggatg tgggtccaac tcccatgtat 5220
ccgcctacat acctggagcc agggattggg aggcacacac catatggcaa ccaaactgac 5280
tacagaatat ttgagcttaa caaacggctt cagaactgga cagaggagtg tgacaatctc 5340
tggtgggatg cattcacgac tgagttcttt gaggatgatg ccatgttgac catcactttc 5400
tgcctggagg atggaccaaa gagatatacc attggccgga ccctgatccc acgctacttc 5460
cgcagcatct ttgagggggg tgctacggag ctgtactatg ttcttaagca ccccaaggag 5520
gcattccaca gcaactttgt gtccctcgac tgtgaccagg gcagcatggt gacccagcat 5580
ggcaagccca tgttcaccca ggtgtgtgtg gagggccggt tgtacctgga gttcatgttt 5640
gacgacatga tgcggataaa gacgtggcac ttcagcatcc ggcagcaccg agagctcatc 5700
ccccgcagca tccttgccat gcatgcccaa gacccccaga tgttggatca gctctccaaa 5760
aacatcactc ggtgtgggct gtccgggccc ttcgaaggta agcctatccc taaccctctc 5820
ctcggtctcg attctacgcg taccggtcat catcaccatc accattgagt ttaaacccgc 5880
tgatcagcct cgactgtgcc ttctagttgc cagccatctg ttgtttgccc ctcccccgtg 5940
ccttccttga ccctggaagg tgccactccc actgtccttt cctaataaaa tgaggaaatt 6000
gcatcgcatt gtctgagtag gtgtcattct attctggggg gtggggtggg gcaggacagc 6060
aagggggagg attgggaaga caatagcagg catgctgggg atgcggtggg ctctatggct 6120
tctgaggcgg aaagaaccag ctggggctct agggggtatc cccacgcgcc ctgtagcggc 6180
gcattaagcg cggcgggtgt ggtggttacg cgcagcgtga ccgctacact tgccagcgcc 6240
ctagcgcccg ctcctttcgc tttcttccct tcctttctcg ccacgttcgc cggctttccc 6300
cgtcaagctc taaatcgggg catcccttta gggttccgat ttagtgcttt acggcacctc 6360
gaccccaaaa aacttgatta gggtgatggt tcacgtagtg ggccatcgcc ctgatagacg 6420
gtttttcgcc ctttgacgtt ggagtccacg ttctttaata gtggactctt gttccaaact 6480
ggaacaacac tcaaccctat ctcggtctat tcttttgatt tataagggat tttggggatt 6540
tcggcctatt ggttaaaaaa tgagctgatt taacaaaaat ttaacgcgaa ttaattctgt 6600
ggaatgtgtg tcagttaggg tgtggaaagt ccccaggctc cccaggcagg cagaagtatg 6660
caaagcatgc atctcaatta gtcagcaacc aggtgtggaa agtccccagg ctccccagca 6720
ggcagaagta tgcaaagcat gcatctcaat tagtcagcaa ccatagtccc gcccctaact 6780
ccgcccatcc cgcccctaac tccgcccagt tccgcccatt ctccgcccca tggctgacta 6840
atttttttta tttatgcaga ggccgaggcc gcctctgcct ctgagctatt ccagaagtag 6900
tgaggaggct tttttggagg cctaggcttt tgcaaaaagc tcccgggagc ttgtatatcc 6960
attttcggat ctgatcagca cgtgttgaca attaatcatc ggcatagtat atcggcatag 7020
tataatacga caaggtgagg aactaaacca tggccaagcc tttgtctcaa gaagaatcca 7080
ccctcattga aagagcaacg gctacaatca acagcatccc catctctgaa gactacagcg 7140
tcgccagcgc agctctctct agcgacggcc gcatcttcac tggtgtcaat gtatatcatt 7200
ttactggggg accttgtgca gaactcgtgg tgctgggcac tgctgctgct gcggcagctg 7260
gcaacctgac ttgtatcgtc gcgatcggaa atgagaacag gggcatcttg agcccctgcg 7320
gacggtgtcg acaggtgctt ctcgatctgc atcctgggat caaagcgata gtgaaggaca 7380
gtgatggaca gccgacggca gttgggattc gtgaattgct gccctctggt tatgtgtggg 7440
agggctaagc acttcgtggc cgaggagcag gactgacacg tgctacgaga tttcgattcc 7500
accgccgcct tctatgaaag gttgggcttc ggaatcgttt tccgggacgc cggctggatg 7560
atcctccagc gcggggatct catgctggag ttcttcgccc accccaactt gtttattgca 7620
gcttataatg gttacaaata aagcaatagc atcacaaatt tcacaaataa agcatttttt 7680
tcactgcatt ctagttgtgg tttgtccaaa ctcatcaatg tatcttatca tgtctgtata 7740
ccgtcgacct ctagctagag cttggcgtaa tcatggtcat agctgtttcc tgtgtgaaat 7800
tgttatccgc tcacaattcc acacaacata cgagccggaa gcataaagtg taaagcctgg 7860
ggtgcctaat gagtgagcta actcacatta attgcgttgc gctcactgcc cgctttccag 7920
tcgggaaacc tgtcgtgcca gctgcattaa tgaatcggcc aacgcgcggg gagaggcggt 7980
ttgcgtattg ggcgctcttc cgcttcctcg ctcactgact cgctgcgctc ggtcgttcgg 8040
ctgcggcgag cggtatcagc tcactcaaag gcggtaatac ggttatccac agaatcaggg 8100
gataacgcag gaaagaacat gtgagcaaaa ggccagcaaa aggccaggaa ccgtaaaaag 8160
gccgcgttgc tggcgttttt ccataggctc cgcccccctg acgagcatca caaaaatcga 8220
cgctcaagtc agaggtggcg aaacccgaca ggactataaa gataccaggc gtttccccct 8280
ggaagctccc tcgtgcgctc tcctgttccg accctgccgc ttaccggata cctgtccgcc 8340
tttctccctt cgggaagcgt ggcgctttct caatgctcac gctgtaggta tctcagttcg 8400
gtgtaggtcg ttcgctccaa gctgggctgt gtgcacgaac cccccgttca gcccgaccgc 8460
tgcgccttat ccggtaacta tcgtcttgag tccaacccgg taagacacga cttatcgcca 8520
ctggcagcag ccactggtaa caggattagc agagcgaggt atgtaggcgg tgctacagag 8580
ttcttgaagt ggtggcctaa ctacggctac actagaagga cagtatttgg tatctgcgct 8640
ctgctgaagc cagttacctt cggaaaaaga gttggtagct cttgatccgg caaacaaacc 8700
accgctggta gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag aaaaaaagga 8760
tctcaagaag atcctttgat cttttctacg gggtctgacg ctcagtggaa cgaaaactca 8820
cgttaaggga ttttggtcat gagattatca aaaaggatct tcacctagat ccttttaaat 8880
taaaaatgaa gttttaaatc aatctaaagt atatatgagt aaacttggtc tgacagttac 8940
caatgcttaa tcagtgaggc acctatctca gcgatctgtc tatttcgttc atccatagtt 9000
gcctgactcc ccgtcgtgta gataactacg atacgggagg gcttaccatc tggccccagt 9060
gctgcaatga taccgcgaga cccacgctca ccggctccag atttatcagc aataaaccag 9120
ccagccggaa gggccgagcg cagaagtggt cctgcaactt tatccgcctc catccagtct 9180
attaattgtt gccgggaagc tagagtaagt agttcgccag ttaatagttt gcgcaacgtt 9240
gttgccattg ctacaggcat cgtggtgtca cgctcgtcgt ttggtatggc ttcattcagc 9300
tccggttccc aacgatcaag gcgagttaca tgatccccca tgttgtgcaa aaaagcggtt 9360
agctccttcg gtcctccgat cgttgtcaga agtaagttgg ccgcagtgtt atcactcatg 9420
gttatggcag cactgcataa ttctcttact gtcatgccat ccgtaagatg cttttctgtg 9480
actggtgagt actcaaccaa gtcattctga gaatagtgta tgcggcgacc gagttgctct 9540
tgcccggcgt caatacggga taataccgcg ccacatagca gaactttaaa agtgctcatc 9600
attggaaaac gttcttcggg gcgaaaactc tcaaggatct taccgctgtt gagatccagt 9660
tcgatgtaac ccactcgtgc acccaactga tcttcagcat cttttacttt caccagcgtt 9720
tctgggtgag caaaaacagg aaggcaaaat gccgcaaaaa agggaataag ggcgacacgg 9780
aaatgttgaa tactcatact cttccttttt caatattatt gaagcattta tcagggttat 9840
tgtctcatga gcggatacat atttgaatgt atttagaaaa ataaacaaat aggggttccg 9900
cgcacatttc cccgaaaagt gccacctgac gtc 9933
<210> 15
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 15
accgaatatg tcacattctg tctc 24
<210> 16
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 16
aaacgagaca gaatgtgaca tatt 24
<210> 17
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 17
accgggacta tgggaggtca ctaa 24
<210> 18
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 18
aaacttagtg acctcccata gtcc 24
<210> 19
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 19
accggaaggt tacacagaac caga 24
<210> 20
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 20
aaactctggt tctgtgtaac cttc 24
<210> 21
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 21
accgggccaa gagatatatc ttag 24
<210> 22
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 22
aaacctaaga tatatctctt ggcc 24
<210> 23
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 23
accggtgcca gaagagccaa ggac 24
<210> 24
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 24
aaacgtcctt ggctcttctg gcac 24
<210> 25
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 25
accggtggag ccacacccta gggt 24
<210> 26
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 26
aaacacccta gggtgtggct ccac 24
<210> 27
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 27
aaacggagcg caccatcttc ttca 24
<210> 28
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 28
aaactgaaga agatggtgcg ctcc 24
<210> 29
<211> 82
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 29
gggacctaag aaaaagagga aggtggcggc cgctggcggc agcatgctgg atagggatgt 60
gggtccaact cccatgtatc cg 82
<210> 30
<211> 65
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 30
ctctcggggg tggcgctctc gctggtaccg ggggtctcgc tgccgctctg ggaggcctgt 60
gacgt 65
<210> 31
<211> 84
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 31
gggcgcgctg gaggaggatc cggaggagga tccggaggag gatccatgct ggatagggat 60
gtgggtccaa ctcccatgta tccg 84
<210> 32
<211> 27
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 32
gaagggcccc tgggaggcct gtgacgt 27
<210> 33
<211> 60
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 33
gtggcggccg ctggcggcag catgctggat agggatgtgg gtccaactcc catgtatccg 60
<210> 34
<211> 57
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 34
cgctggtacc gggggtctcg ctgccgctgg acagcccaca ccgagtgatg tttttgg 57
<210> 35
<211> 84
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 35
gggcgcgctg gaggaggatc cggaggagga tccggaggag gatccatgct ggatagggat 60
gtgggtccaa ctcccatgta tccg 84
<210> 36
<211> 35
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 36
tcgaagggcc cggacagccc acaccgagtg atgtt 35
<210> 37
<211> 1767
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 37
Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly
1 5 10 15
Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys
20 25 30
Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly
35 40 45
Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys
50 55 60
Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr
65 70 75 80
Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe
85 90 95
Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His
100 105 110
Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His
115 120 125
Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser
130 135 140
Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met
145 150 155 160
Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp
165 170 175
Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn
180 185 190
Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys
195 200 205
Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu
210 215 220
Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu
225 230 235 240
Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp
245 250 255
Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp
260 265 270
Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu
275 280 285
Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile
290 295 300
Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser Met
305 310 315 320
Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys Ala
325 330 335
Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp
340 345 350
Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln
355 360 365
Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp Gly
370 375 380
Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys
385 390 395 400
Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu Gly
405 410 415
Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe Leu
420 425 430
Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro
435 440 445
Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met
450 455 460
Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val
465 470 475 480
Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn
485 490 495
Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu
500 505 510
Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr
515 520 525
Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys
530 535 540
Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr Val
545 550 555 560
Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp Ser
565 570 575
Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly Thr
580 585 590
Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn
595 600 605
Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr Leu
610 615 620
Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala His
625 630 635 640
Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr
645 650 655
Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp Lys
660 665 670
Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe Ala
675 680 685
Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys
690 695 700
Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu His
705 710 715 720
Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile
725 730 735
Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly Arg
740 745 750
His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr
755 760 765
Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile Glu
770 775 780
Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro Val
785 790 795 800
Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln
805 810 815
Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg Leu
820 825 830
Ser Asp Tyr Asp Val Asp Ala Ile Val Pro Gln Ser Phe Leu Lys Asp
835 840 845
Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Ala Arg Gly
850 855 860
Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn
865 870 875 880
Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe
885 890 895
Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys
900 905 910
Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr Lys
915 920 925
His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu
930 935 940
Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser Lys
945 950 955 960
Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu
965 970 975
Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val Val
980 985 990
Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val
995 1000 1005
Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys Ser
1010 1015 1020
Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr Ser Asn
1025 1030 1035 1040
Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn Gly Glu Ile
1045 1050 1055
Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr Gly Glu Ile Val
1060 1065 1070
Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg Lys Val Leu Ser Met
1075 1080 1085
Pro Gln Val Asn Ile Val Lys Lys Thr Glu Val Gln Thr Gly Gly Phe
1090 1095 1100
Ser Lys Glu Ser Ile Leu Pro Lys Arg Asn Ser Asp Lys Leu Ile Ala
1105 1110 1115 1120
Arg Lys Lys Asp Trp Asp Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro
1125 1130 1135
Thr Val Ala Tyr Ser Val Leu Val Val Ala Lys Val Glu Lys Gly Lys
1140 1145 1150
Ser Lys Lys Leu Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met
1155 1160 1165
Glu Arg Ser Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys
1170 1175 1180
Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr
1185 1190 1195 1200
Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala
1205 1210 1215
Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro
1235 1240 1245
Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His Tyr
1250 1255 1260
Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg Val Ile
1265 1270 1275 1280
Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr Asn Lys His
1285 1290 1295
Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile Ile His Leu Phe
1300 1305 1310
Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe Lys Tyr Phe Asp Thr
1315 1320 1325
Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr Lys Glu Val Leu Asp Ala
1330 1335 1340
Thr Leu Ile His Gln Ser Ile Thr Gly Leu Tyr Glu Thr Arg Ile Asp
1345 1350 1355 1360
Leu Ser Gln Leu Gly Gly Asp Gly Ser Pro Lys Lys Lys Arg Lys Val
1365 1370 1375
Gly Arg Ala Gly Gly Gly Ser Gly Gly Gly Ser Gly Gly Gly Ser Met
1380 1385 1390
Leu Asp Arg Asp Val Gly Pro Thr Pro Met Tyr Pro Pro Thr Tyr Leu
1395 1400 1405
Glu Pro Gly Ile Gly Arg His Thr Pro Tyr Gly Asn Gln Thr Asp Tyr
1410 1415 1420
Arg Ile Phe Glu Leu Asn Lys Arg Leu Gln Asn Trp Thr Glu Glu Cys
1425 1430 1435 1440
Asp Asn Leu Trp Trp Asp Ala Phe Thr Thr Glu Phe Phe Glu Asp Asp
1445 1450 1455
Ala Met Leu Thr Ile Thr Phe Cys Leu Glu Asp Gly Pro Lys Arg Tyr
1460 1465 1470
Thr Ile Gly Arg Thr Leu Ile Pro Arg Tyr Phe Arg Ser Ile Phe Glu
1475 1480 1485
Gly Gly Ala Thr Glu Leu Tyr Tyr Val Leu Lys His Pro Lys Glu Ala
1490 1495 1500
Phe His Ser Asn Phe Val Ser Leu Asp Cys Asp Gln Gly Ser Met Val
1505 1510 1515 1520
Thr Gln His Gly Lys Pro Met Phe Thr Gln Val Cys Val Glu Gly Arg
1525 1530 1535
Leu Tyr Leu Glu Phe Met Phe Asp Asp Met Met Arg Ile Lys Thr Trp
1540 1545 1550
His Phe Ser Ile Arg Gln His Arg Glu Leu Ile Pro Arg Ser Ile Leu
1555 1560 1565
Ala Met His Ala Gln Asp Pro Gln Met Leu Asp Gln Leu Ser Lys Asn
1570 1575 1580
Ile Thr Arg Cys Gly Leu Ser Asn Ser Thr Leu Asn Tyr Leu Arg Leu
1585 1590 1595 1600
Cys Val Ile Leu Glu Pro Met Gln Glu Leu Met Ser Arg His Lys Thr
1605 1610 1615
Tyr Ser Leu Ser Pro Arg Asp Cys Leu Lys Thr Cys Leu Phe Gln Lys
1620 1625 1630
Trp Gln Arg Met Val Ala Pro Pro Ala Glu Pro Thr Arg Gln Gln Pro
1635 1640 1645
Ser Lys Arg Arg Lys Arg Lys Met Ser Gly Gly Ser Thr Met Ser Ser
1650 1655 1660
Gly Gly Gly Asn Thr Asn Asn Ser Asn Ser Lys Lys Lys Ser Pro Ala
1665 1670 1675 1680
Ser Thr Phe Ala Leu Ser Ser Gln Val Pro Asp Val Met Val Val Gly
1685 1690 1695
Glu Pro Thr Leu Met Gly Gly Glu Phe Gly Asp Glu Asp Glu Arg Leu
1700 1705 1710
Ile Thr Arg Leu Glu Asn Thr Gln Phe Asp Ala Ala Asn Gly Ile Asp
1715 1720 1725
Asp Glu Asp Ser Phe Asn Asn Ser Pro Ala Leu Gly Ala Asn Ser Pro
1730 1735 1740
Trp Asn Ser Lys Pro Pro Ser Ser Gln Glu Ser Lys Ser Glu Asn Pro
1745 1750 1755 1760
Thr Ser Gln Ala Ser Gln Gly
1765
<210> 38
<211> 1767
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 38
Met Leu Asp Arg Asp Val Gly Pro Thr Pro Met Tyr Pro Pro Thr Tyr
1 5 10 15
Leu Glu Pro Gly Ile Gly Arg His Thr Pro Tyr Gly Asn Gln Thr Asp
20 25 30
Tyr Arg Ile Phe Glu Leu Asn Lys Arg Leu Gln Asn Trp Thr Glu Glu
35 40 45
Cys Asp Asn Leu Trp Trp Asp Ala Phe Thr Thr Glu Phe Phe Glu Asp
50 55 60
Asp Ala Met Leu Thr Ile Thr Phe Cys Leu Glu Asp Gly Pro Lys Arg
65 70 75 80
Tyr Thr Ile Gly Arg Thr Leu Ile Pro Arg Tyr Phe Arg Ser Ile Phe
85 90 95
Glu Gly Gly Ala Thr Glu Leu Tyr Tyr Val Leu Lys His Pro Lys Glu
100 105 110
Ala Phe His Ser Asn Phe Val Ser Leu Asp Cys Asp Gln Gly Ser Met
115 120 125
Val Thr Gln His Gly Lys Pro Met Phe Thr Gln Val Cys Val Glu Gly
130 135 140
Arg Leu Tyr Leu Glu Phe Met Phe Asp Asp Met Met Arg Ile Lys Thr
145 150 155 160
Trp His Phe Ser Ile Arg Gln His Arg Glu Leu Ile Pro Arg Ser Ile
165 170 175
Leu Ala Met His Ala Gln Asp Pro Gln Met Leu Asp Gln Leu Ser Lys
180 185 190
Asn Ile Thr Arg Cys Gly Leu Ser Asn Ser Thr Leu Asn Tyr Leu Arg
195 200 205
Leu Cys Val Ile Leu Glu Pro Met Gln Glu Leu Met Ser Arg His Lys
210 215 220
Thr Tyr Ser Leu Ser Pro Arg Asp Cys Leu Lys Thr Cys Leu Phe Gln
225 230 235 240
Lys Trp Gln Arg Met Val Ala Pro Pro Ala Glu Pro Thr Arg Gln Gln
245 250 255
Pro Ser Lys Arg Arg Lys Arg Lys Met Ser Gly Gly Ser Thr Met Ser
260 265 270
Ser Gly Gly Gly Asn Thr Asn Asn Ser Asn Ser Lys Lys Lys Ser Pro
275 280 285
Ala Ser Thr Phe Ala Leu Ser Ser Gln Val Pro Asp Val Met Val Val
290 295 300
Gly Glu Pro Thr Leu Met Gly Gly Glu Phe Gly Asp Glu Asp Glu Arg
305 310 315 320
Leu Ile Thr Arg Leu Glu Asn Thr Gln Phe Asp Ala Ala Asn Gly Ile
325 330 335
Asp Asp Glu Asp Ser Phe Asn Asn Ser Pro Ala Leu Gly Ala Asn Ser
340 345 350
Pro Trp Asn Ser Lys Pro Pro Ser Ser Gln Glu Ser Lys Ser Glu Asn
355 360 365
Pro Thr Ser Gln Ala Ser Gln Ser Gly Ser Glu Thr Pro Gly Thr Ser
370 375 380
Glu Ser Ala Thr Pro Glu Ser Asp Lys Lys Tyr Ser Ile Gly Leu Ala
385 390 395 400
Ile Gly Thr Asn Ser Val Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys
405 410 415
Val Pro Ser Lys Lys Phe Lys Val Leu Gly Asn Thr Asp Arg His Ser
420 425 430
Ile Lys Lys Asn Leu Ile Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr
435 440 445
Ala Glu Ala Thr Arg Leu Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg
450 455 460
Arg Lys Asn Arg Ile Cys Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met
465 470 475 480
Ala Lys Val Asp Asp Ser Phe Phe His Arg Leu Glu Glu Ser Phe Leu
485 490 495
Val Glu Glu Asp Lys Lys His Glu Arg His Pro Ile Phe Gly Asn Ile
500 505 510
Val Asp Glu Val Ala Tyr His Glu Lys Tyr Pro Thr Ile Tyr His Leu
515 520 525
Arg Lys Lys Leu Val Asp Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile
530 535 540
Tyr Leu Ala Leu Ala His Met Ile Lys Phe Arg Gly His Phe Leu Ile
545 550 555 560
Glu Gly Asp Leu Asn Pro Asp Asn Ser Asp Val Asp Lys Leu Phe Ile
565 570 575
Gln Leu Val Gln Thr Tyr Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn
580 585 590
Ala Ser Gly Val Asp Ala Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys
595 600 605
Ser Arg Arg Leu Glu Asn Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys
610 615 620
Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro
625 630 635 640
Asn Phe Lys Ser Asn Phe Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu
645 650 655
Ser Lys Asp Thr Tyr Asp Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile
660 665 670
Gly Asp Gln Tyr Ala Asp Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp
675 680 685
Ala Ile Leu Leu Ser Asp Ile Leu Arg Val Asn Thr Glu Ile Thr Lys
690 695 700
Ala Pro Leu Ser Ala Ser Met Ile Lys Arg Tyr Asp Glu His His Gln
705 710 715 720
Asp Leu Thr Leu Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu Lys
725 730 735
Tyr Lys Glu Ile Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr
740 745 750
Ile Asp Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro
755 760 765
Ile Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu Asn
770 775 780
Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile
785 790 795 800
Pro His Gln Ile His Leu Gly Glu Leu His Ala Ile Leu Arg Arg Gln
805 810 815
Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys
820 825 830
Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly
835 840 845
Asn Ser Arg Phe Ala Trp Met Thr Arg Lys Ser Glu Glu Thr Ile Thr
850 855 860
Pro Trp Asn Phe Glu Glu Val Val Asp Lys Gly Ala Ser Ala Gln Ser
865 870 875 880
Phe Ile Glu Arg Met Thr Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys
885 890 895
Val Leu Pro Lys His Ser Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn
900 905 910
Glu Leu Thr Lys Val Lys Tyr Val Thr Glu Gly Met Arg Lys Pro Ala
915 920 925
Phe Leu Ser Gly Glu Gln Lys Lys Ala Ile Val Asp Leu Leu Phe Lys
930 935 940
Thr Asn Arg Lys Val Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys
945 950 955 960
Lys Ile Glu Cys Phe Asp Ser Val Glu Ile Ser Gly Val Glu Asp Arg
965 970 975
Phe Asn Ala Ser Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile Lys
980 985 990
Asp Lys Asp Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp
995 1000 1005
Ile Val Leu Thr Leu Thr Leu Phe Glu Asp Arg Glu Met Ile Glu Glu
1010 1015 1020
Arg Leu Lys Thr Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys Gln
1025 1030 1035 1040
Leu Lys Arg Arg Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu
1045 1050 1055
Ile Asn Gly Ile Arg Asp Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe
1060 1065 1070
Leu Lys Ser Asp Gly Phe Ala Asn Arg Asn Phe Met Gln Leu Ile His
1075 1080 1085
Asp Asp Ser Leu Thr Phe Lys Glu Asp Ile Gln Lys Ala Gln Val Ser
1090 1095 1100
Gly Gln Gly Asp Ser Leu His Glu His Ile Ala Asn Leu Ala Gly Ser
1105 1110 1115 1120
Pro Ala Ile Lys Lys Gly Ile Leu Gln Thr Val Lys Val Val Asp Glu
1125 1130 1135
Leu Val Lys Val Met Gly Arg His Lys Pro Glu Asn Ile Val Ile Glu
1140 1145 1150
Met Ala Arg Glu Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg
1155 1160 1165
Glu Arg Met Lys Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln
1170 1175 1180
Ile Leu Lys Glu His Pro Val Glu Asn Thr Gln Leu Gln Asn Glu Lys
1185 1190 1195 1200
Leu Tyr Leu Tyr Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val Asp Gln
1205 1210 1215
Glu Leu Asp Ile Asn Arg Leu Ser Asp Tyr Asp Val Asp Ala Ile Val
1220 1225 1230
Pro Gln Ser Phe Leu Lys Asp Asp Ser Ile Asp Asn Lys Val Leu Thr
1235 1240 1245
Arg Ser Asp Lys Ala Arg Gly Lys Ser Asp Asn Val Pro Ser Glu Glu
1250 1255 1260
Val Val Lys Lys Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys
1265 1270 1275 1280
Leu Ile Thr Gln Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly
1285 1290 1295
Gly Leu Ser Glu Leu Asp Lys Ala Gly Phe Ile Lys Arg Gln Leu Val
1300 1305 1310
Glu Thr Arg Gln Ile Thr Lys His Val Ala Gln Ile Leu Asp Ser Arg
1315 1320 1325
Met Asn Thr Lys Tyr Asp Glu Asn Asp Lys Leu Ile Arg Glu Val Lys
1330 1335 1340
Val Ile Thr Leu Lys Ser Lys Leu Val Ser Asp Phe Arg Lys Asp Phe
1345 1350 1355 1360
Gln Phe Tyr Lys Val Arg Glu Ile Asn Asn Tyr His His Ala His Asp
1365 1370 1375
Ala Tyr Leu Asn Ala Val Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro
1380 1385 1390
Lys Leu Glu Ser Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val
1395 1400 1405
Arg Lys Met Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala
1410 1415 1420
Lys Tyr Phe Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile
1425 1430 1435 1440
Thr Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn
1445 1450 1455
Gly Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr
1460 1465 1470
Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1475 1480 1485
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg
1490 1495 1500
Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys Lys
1505 1510 1515 1520
Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu Val Val
1525 1530 1535
Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser Val Lys Glu
1540 1545 1550
Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe Glu Lys Asn Pro
1555 1560 1565
Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu Val Lys Lys Asp Leu
1570 1575 1580
Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn Gly Arg
1585 1590 1595 1600
Lys Arg Met Leu Ala Ser Ala Gly Glu Leu Gln Lys Gly Asn Glu Leu
1605 1610 1615
Ala Leu Pro Ser Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser His Tyr
1620 1625 1630
Glu Lys Leu Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe
1635 1640 1645
Val Glu Gln His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser
1650 1655 1660
Glu Phe Ser Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val
1665 1670 1675 1680
Leu Ser Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala
1685 1690 1695
Glu Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala
1700 1705 1710
Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser
1715 1720 1725
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly
1730 1735 1740
Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp Gly Ser
1745 1750 1755 1760
Pro Lys Lys Lys Arg Lys Val
1765
<210> 39
<211> 1592
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 39
Met Leu Asp Arg Asp Val Gly Pro Thr Pro Met Tyr Pro Pro Thr Tyr
1 5 10 15
Leu Glu Pro Gly Ile Gly Arg His Thr Pro Tyr Gly Asn Gln Thr Asp
20 25 30
Tyr Arg Ile Phe Glu Leu Asn Lys Arg Leu Gln Asn Trp Thr Glu Glu
35 40 45
Cys Asp Asn Leu Trp Trp Asp Ala Phe Thr Thr Glu Phe Phe Glu Asp
50 55 60
Asp Ala Met Leu Thr Ile Thr Phe Cys Leu Glu Asp Gly Pro Lys Arg
65 70 75 80
Tyr Thr Ile Gly Arg Thr Leu Ile Pro Arg Tyr Phe Arg Ser Ile Phe
85 90 95
Glu Gly Gly Ala Thr Glu Leu Tyr Tyr Val Leu Lys His Pro Lys Glu
100 105 110
Ala Phe His Ser Asn Phe Val Ser Leu Asp Cys Asp Gln Gly Ser Met
115 120 125
Val Thr Gln His Gly Lys Pro Met Phe Thr Gln Val Cys Val Glu Gly
130 135 140
Arg Leu Tyr Leu Glu Phe Met Phe Asp Asp Met Met Arg Ile Lys Thr
145 150 155 160
Trp His Phe Ser Ile Arg Gln His Arg Glu Leu Ile Pro Arg Ser Ile
165 170 175
Leu Ala Met His Ala Gln Asp Pro Gln Met Leu Asp Gln Leu Ser Lys
180 185 190
Asn Ile Thr Arg Cys Gly Leu Ser Ser Gly Ser Glu Thr Pro Gly Thr
195 200 205
Ser Glu Ser Ala Thr Pro Glu Ser Asp Lys Lys Tyr Ser Ile Gly Leu
210 215 220
Ala Ile Gly Thr Asn Ser Val Gly Trp Ala Val Ile Thr Asp Glu Tyr
225 230 235 240
Lys Val Pro Ser Lys Lys Phe Lys Val Leu Gly Asn Thr Asp Arg His
245 250 255
Ser Ile Lys Lys Asn Leu Ile Gly Ala Leu Leu Phe Asp Ser Gly Glu
260 265 270
Thr Ala Glu Ala Thr Arg Leu Lys Arg Thr Ala Arg Arg Arg Tyr Thr
275 280 285
Arg Arg Lys Asn Arg Ile Cys Tyr Leu Gln Glu Ile Phe Ser Asn Glu
290 295 300
Met Ala Lys Val Asp Asp Ser Phe Phe His Arg Leu Glu Glu Ser Phe
305 310 315 320
Leu Val Glu Glu Asp Lys Lys His Glu Arg His Pro Ile Phe Gly Asn
325 330 335
Ile Val Asp Glu Val Ala Tyr His Glu Lys Tyr Pro Thr Ile Tyr His
340 345 350
Leu Arg Lys Lys Leu Val Asp Ser Thr Asp Lys Ala Asp Leu Arg Leu
355 360 365
Ile Tyr Leu Ala Leu Ala His Met Ile Lys Phe Arg Gly His Phe Leu
370 375 380
Ile Glu Gly Asp Leu Asn Pro Asp Asn Ser Asp Val Asp Lys Leu Phe
385 390 395 400
Ile Gln Leu Val Gln Thr Tyr Asn Gln Leu Phe Glu Glu Asn Pro Ile
405 410 415
Asn Ala Ser Gly Val Asp Ala Lys Ala Ile Leu Ser Ala Arg Leu Ser
420 425 430
Lys Ser Arg Arg Leu Glu Asn Leu Ile Ala Gln Leu Pro Gly Glu Lys
435 440 445
Lys Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu Ser Leu Gly Leu Thr
450 455 460
Pro Asn Phe Lys Ser Asn Phe Asp Leu Ala Glu Asp Ala Lys Leu Gln
465 470 475 480
Leu Ser Lys Asp Thr Tyr Asp Asp Asp Leu Asp Asn Leu Leu Ala Gln
485 490 495
Ile Gly Asp Gln Tyr Ala Asp Leu Phe Leu Ala Ala Lys Asn Leu Ser
500 505 510
Asp Ala Ile Leu Leu Ser Asp Ile Leu Arg Val Asn Thr Glu Ile Thr
515 520 525
Lys Ala Pro Leu Ser Ala Ser Met Ile Lys Arg Tyr Asp Glu His His
530 535 540
Gln Asp Leu Thr Leu Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu
545 550 555 560
Lys Tyr Lys Glu Ile Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly
565 570 575
Tyr Ile Asp Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys
580 585 590
Pro Ile Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu
595 600 605
Asn Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser
610 615 620
Ile Pro His Gln Ile His Leu Gly Glu Leu His Ala Ile Leu Arg Arg
625 630 635 640
Gln Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu Lys Ile Glu
645 650 655
Lys Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ala Arg
660 665 670
Gly Asn Ser Arg Phe Ala Trp Met Thr Arg Lys Ser Glu Glu Thr Ile
675 680 685
Thr Pro Trp Asn Phe Glu Glu Val Val Asp Lys Gly Ala Ser Ala Gln
690 695 700
Ser Phe Ile Glu Arg Met Thr Asn Phe Asp Lys Asn Leu Pro Asn Glu
705 710 715 720
Lys Val Leu Pro Lys His Ser Leu Leu Tyr Glu Tyr Phe Thr Val Tyr
725 730 735
Asn Glu Leu Thr Lys Val Lys Tyr Val Thr Glu Gly Met Arg Lys Pro
740 745 750
Ala Phe Leu Ser Gly Glu Gln Lys Lys Ala Ile Val Asp Leu Leu Phe
755 760 765
Lys Thr Asn Arg Lys Val Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe
770 775 780
Lys Lys Ile Glu Cys Phe Asp Ser Val Glu Ile Ser Gly Val Glu Asp
785 790 795 800
Arg Phe Asn Ala Ser Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile
805 810 815
Lys Asp Lys Asp Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu
820 825 830
Asp Ile Val Leu Thr Leu Thr Leu Phe Glu Asp Arg Glu Met Ile Glu
835 840 845
Glu Arg Leu Lys Thr Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys
850 855 860
Gln Leu Lys Arg Arg Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg Lys
865 870 875 880
Leu Ile Asn Gly Ile Arg Asp Lys Gln Ser Gly Lys Thr Ile Leu Asp
885 890 895
Phe Leu Lys Ser Asp Gly Phe Ala Asn Arg Asn Phe Met Gln Leu Ile
900 905 910
His Asp Asp Ser Leu Thr Phe Lys Glu Asp Ile Gln Lys Ala Gln Val
915 920 925
Ser Gly Gln Gly Asp Ser Leu His Glu His Ile Ala Asn Leu Ala Gly
930 935 940
Ser Pro Ala Ile Lys Lys Gly Ile Leu Gln Thr Val Lys Val Val Asp
945 950 955 960
Glu Leu Val Lys Val Met Gly Arg His Lys Pro Glu Asn Ile Val Ile
965 970 975
Glu Met Ala Arg Glu Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser
980 985 990
Arg Glu Arg Met Lys Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly Ser
995 1000 1005
Gln Ile Leu Lys Glu His Pro Val Glu Asn Thr Gln Leu Gln Asn Glu
1010 1015 1020
Lys Leu Tyr Leu Tyr Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val Asp
1025 1030 1035 1040
Gln Glu Leu Asp Ile Asn Arg Leu Ser Asp Tyr Asp Val Asp Ala Ile
1045 1050 1055
Val Pro Gln Ser Phe Leu Lys Asp Asp Ser Ile Asp Asn Lys Val Leu
1060 1065 1070
Thr Arg Ser Asp Lys Ala Arg Gly Lys Ser Asp Asn Val Pro Ser Glu
1075 1080 1085
Glu Val Val Lys Lys Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala
1090 1095 1100
Lys Leu Ile Thr Gln Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu Arg
1105 1110 1115 1120
Gly Gly Leu Ser Glu Leu Asp Lys Ala Gly Phe Ile Lys Arg Gln Leu
1125 1130 1135
Val Glu Thr Arg Gln Ile Thr Lys His Val Ala Gln Ile Leu Asp Ser
1140 1145 1150
Arg Met Asn Thr Lys Tyr Asp Glu Asn Asp Lys Leu Ile Arg Glu Val
1155 1160 1165
Lys Val Ile Thr Leu Lys Ser Lys Leu Val Ser Asp Phe Arg Lys Asp
1170 1175 1180
Phe Gln Phe Tyr Lys Val Arg Glu Ile Asn Asn Tyr His His Ala His
1185 1190 1195 1200
Asp Ala Tyr Leu Asn Ala Val Val Gly Thr Ala Leu Ile Lys Lys Tyr
1205 1210 1215
Pro Lys Leu Glu Ser Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp
1220 1225 1230
Val Arg Lys Met Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr
1235 1240 1245
Ala Lys Tyr Phe Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu
1250 1255 1260
Ile Thr Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr
1265 1270 1275 1280
Asn Gly Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala
1285 1290 1295
Thr Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys
1300 1305 1310
Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys
1315 1320 1325
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys
1330 1335 1340
Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu Val
1345 1350 1355 1360
Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser Val Lys
1365 1370 1375
Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe Glu Lys Asn
1380 1385 1390
Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu Val Lys Lys Asp
1395 1400 1405
Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn Gly
1410 1415 1420
Arg Lys Arg Met Leu Ala Ser Ala Gly Glu Leu Gln Lys Gly Asn Glu
1425 1430 1435 1440
Leu Ala Leu Pro Ser Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser His
1445 1450 1455
Tyr Glu Lys Leu Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu
1460 1465 1470
Phe Val Glu Gln His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile
1475 1480 1485
Ser Glu Phe Ser Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys
1490 1495 1500
Val Leu Ser Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln
1505 1510 1515 1520
Ala Glu Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro
1525 1530 1535
Ala Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr
1540 1545 1550
Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1555 1560 1565
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp Gly
1570 1575 1580
Ser Pro Lys Lys Lys Arg Lys Val
1585 1590
<210> 40
<211> 1592
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 40
Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly
1 5 10 15
Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys
20 25 30
Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly
35 40 45
Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys
50 55 60
Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr
65 70 75 80
Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe
85 90 95
Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His
100 105 110
Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His
115 120 125
Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser
130 135 140
Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met
145 150 155 160
Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp
165 170 175
Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn
180 185 190
Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys
195 200 205
Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu
210 215 220
Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu
225 230 235 240
Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp
245 250 255
Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp
260 265 270
Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu
275 280 285
Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile
290 295 300
Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser Met
305 310 315 320
Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys Ala
325 330 335
Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp
340 345 350
Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln
355 360 365
Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp Gly
370 375 380
Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys
385 390 395 400
Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu Gly
405 410 415
Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe Leu
420 425 430
Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro
435 440 445
Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met
450 455 460
Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val
465 470 475 480
Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn
485 490 495
Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu
500 505 510
Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr
515 520 525
Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys
530 535 540
Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr Val
545 550 555 560
Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp Ser
565 570 575
Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly Thr
580 585 590
Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn
595 600 605
Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr Leu
610 615 620
Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala His
625 630 635 640
Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr
645 650 655
Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp Lys
660 665 670
Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe Ala
675 680 685
Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys
690 695 700
Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu His
705 710 715 720
Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile
725 730 735
Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly Arg
740 745 750
His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr
755 760 765
Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile Glu
770 775 780
Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro Val
785 790 795 800
Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln
805 810 815
Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg Leu
820 825 830
Ser Asp Tyr Asp Val Asp Ala Ile Val Pro Gln Ser Phe Leu Lys Asp
835 840 845
Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Ala Arg Gly
850 855 860
Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn
865 870 875 880
Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe
885 890 895
Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys
900 905 910
Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr Lys
915 920 925
His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu
930 935 940
Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser Lys
945 950 955 960
Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu
965 970 975
Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val Val
980 985 990
Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val
995 1000 1005
Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys Ser
1010 1015 1020
Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr Ser Asn
1025 1030 1035 1040
Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn Gly Glu Ile
1045 1050 1055
Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr Gly Glu Ile Val
1060 1065 1070
Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg Lys Val Leu Ser Met
1075 1080 1085
Pro Gln Val Asn Ile Val Lys Lys Thr Glu Val Gln Thr Gly Gly Phe
1090 1095 1100
Ser Lys Glu Ser Ile Leu Pro Lys Arg Asn Ser Asp Lys Leu Ile Ala
1105 1110 1115 1120
Arg Lys Lys Asp Trp Asp Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro
1125 1130 1135
Thr Val Ala Tyr Ser Val Leu Val Val Ala Lys Val Glu Lys Gly Lys
1140 1145 1150
Ser Lys Lys Leu Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met
1155 1160 1165
Glu Arg Ser Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys
1170 1175 1180
Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr
1185 1190 1195 1200
Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala
1205 1210 1215
Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro
1235 1240 1245
Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His Tyr
1250 1255 1260
Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg Val Ile
1265 1270 1275 1280
Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr Asn Lys His
1285 1290 1295
Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile Ile His Leu Phe
1300 1305 1310
Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe Lys Tyr Phe Asp Thr
1315 1320 1325
Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr Lys Glu Val Leu Asp Ala
1330 1335 1340
Thr Leu Ile His Gln Ser Ile Thr Gly Leu Tyr Glu Thr Arg Ile Asp
1345 1350 1355 1360
Leu Ser Gln Leu Gly Gly Asp Gly Ser Pro Lys Lys Lys Arg Lys Val
1365 1370 1375
Gly Arg Ala Gly Gly Gly Ser Gly Gly Gly Ser Gly Gly Gly Ser Met
1380 1385 1390
Leu Asp Arg Asp Val Gly Pro Thr Pro Met Tyr Pro Pro Thr Tyr Leu
1395 1400 1405
Glu Pro Gly Ile Gly Arg His Thr Pro Tyr Gly Asn Gln Thr Asp Tyr
1410 1415 1420
Arg Ile Phe Glu Leu Asn Lys Arg Leu Gln Asn Trp Thr Glu Glu Cys
1425 1430 1435 1440
Asp Asn Leu Trp Trp Asp Ala Phe Thr Thr Glu Phe Phe Glu Asp Asp
1445 1450 1455
Ala Met Leu Thr Ile Thr Phe Cys Leu Glu Asp Gly Pro Lys Arg Tyr
1460 1465 1470
Thr Ile Gly Arg Thr Leu Ile Pro Arg Tyr Phe Arg Ser Ile Phe Glu
1475 1480 1485
Gly Gly Ala Thr Glu Leu Tyr Tyr Val Leu Lys His Pro Lys Glu Ala
1490 1495 1500
Phe His Ser Asn Phe Val Ser Leu Asp Cys Asp Gln Gly Ser Met Val
1505 1510 1515 1520
Thr Gln His Gly Lys Pro Met Phe Thr Gln Val Cys Val Glu Gly Arg
1525 1530 1535
Leu Tyr Leu Glu Phe Met Phe Asp Asp Met Met Arg Ile Lys Thr Trp
1540 1545 1550
His Phe Ser Ile Arg Gln His Arg Glu Leu Ile Pro Arg Ser Ile Leu
1555 1560 1565
Ala Met His Ala Gln Asp Pro Gln Met Leu Asp Gln Leu Ser Lys Asn
1570 1575 1580
Ile Thr Arg Cys Gly Leu Ser Gly
1585 1590
<210> 41
<211> 16
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 41
Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser
1 5 10 15
<210> 42
<211> 15
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 42
Gly Arg Ala Gly Gly Gly Ser Gly Gly Gly Ser Gly Gly Gly Ser
1 5 10 15
<210> 43
<211> 23
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 43
gnnnnnnnnn nnnnnnnnnn ngg 23
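
The protein entries in the listing above are written as three-letter residue codes interleaved with rows of position numbers. As a minimal sketch (not part of the patent), the following Python converts such a block to one-letter code and checks the residue count against the <211> length field. The function name `three_to_one` and the parsing rules are illustrative assumptions; the inputs are the two short linker sequences, SEQ ID No. 41 and SEQ ID No. 42, copied from the listing.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def three_to_one(listing: str) -> str:
    """Convert a three-letter sequence block to one-letter code.

    Purely numeric tokens (the position rows) are skipped; any other
    unknown token raises an error instead of being silently dropped.
    """
    residues = []
    for token in listing.split():
        if token.isdigit():
            continue
        if token not in THREE_TO_ONE:
            raise ValueError(f"unrecognized token: {token!r}")
        residues.append(THREE_TO_ONE[token])
    return "".join(residues)


# SEQ ID No. 41 (<211> 16) and SEQ ID No. 42 (<211> 15), as listed above.
SEQ_41 = """
Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser
1 5 10 15
"""
SEQ_42 = """
Gly Arg Ala Gly Gly Gly Ser Gly Gly Gly Ser Gly Gly Gly Ser
1 5 10 15
"""

linker_41 = three_to_one(SEQ_41)
linker_42 = three_to_one(SEQ_42)

# The residue counts should agree with the <211> fields.
assert len(linker_41) == 16
assert len(linker_42) == 15
print(linker_41)
print(linker_42)
```

The same function can be applied to the longer entries, after which a plain substring search shows where a linker sits inside a fusion protein; for instance, the SEQ ID No. 42 residues appear within SEQ ID No. 40.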

Claims (16)

1. A fusion protein, characterized in that the amino acid sequence of the fusion protein is as shown in one of SEQ ID No. 37-40.
2. A DNA looping system comprising the fusion protein of claim 1, a promoter-targeting sgRNA and an enhancer-targeting sgRNA, wherein
the target site of the promoter-targeting sgRNA is located between -100 and -200 bp relative to the TSS (100 to 200 bp upstream);
and/or the promoter-targeting sgRNA has the sequence feature gnnnnnnnnnnnnnnnngg;
and/or the promoter-targeting sgRNA targets the promoter region of the HBB gene;
and/or the enhancer-targeting sgRNA targets the vicinity of DHS2 in the LCR region of β-globin.
3. The DNA looping system of claim 2, wherein the sequence of the promoter-targeting sgRNA is as shown in SEQ ID No. 4-6.
4. The DNA looping system of claim 2, wherein the sequence of the enhancer-targeting sgRNA is as shown in SEQ ID No. 7-9.
5. An isolated polynucleotide encoding the fusion protein of claim 1.
6. An expression system comprising a host cell capable of expressing the fusion protein of claim 1 and the promoter-targeting sgRNA and the enhancer-targeting sgRNA of claim 2.
7. The expression system of claim 6, wherein the host cell is a eukaryotic cell.
8. The expression system of claim 6, wherein the host cell is selected from a primary cell of metazoan origin or an immortalized cell line.
9. The expression system of claim 6, wherein the host cell is a blood cell line.
10. The expression system of claim 8, wherein the host cell is a human K562 cell.
11. Use of the DNA looping system according to any one of claims 2 to 4, the polynucleotide according to claim 5, or the expression system according to any one of claims 6 to 10 for regulating gene expression for purposes other than disease treatment.
12. The use of claim 11, wherein the gene expression is eukaryotic gene expression.
13. The use according to claim 12, wherein the eukaryote is a metazoan.
14. The use of claim 12, wherein the eukaryote is one or more of human, mouse, nematode and Drosophila.
15. A method of regulating gene expression for purposes other than disease treatment, comprising: regulating gene expression by using the DNA looping system according to any one of claims 2 to 4 to bring the target sites closer together in three-dimensional space.
16. The method of claim 15, wherein the method comprises: culturing a host cell capable of expressing a gene of interest under suitable conditions in the presence of said DNA looping system;
and/or the method is a method for regulating gene expression in vitro.
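
The positional and sequence constraints that claim 2 places on the promoter-targeting sgRNA can be sketched in code. The Python below is an illustrative assumption, not the patent's procedure: it scans only the forward strand, uses the 23-nt pattern of SEQ ID No. 43 from the sequence listing (g, 20 arbitrary bases, then gg), treats `n` as any of A, C, G or T, and keeps a site only if all of it lies between -200 and -100 bp relative to the TSS. The function name `find_promoter_targets`, its parameters, and the toy sequence are hypothetical.

```python
import re

# Pattern of SEQ ID No. 43: g, 20 arbitrary bases, then gg.
# Assumption: 'n' stands for any of A, C, G or T.
# The lookahead allows overlapping candidates to be reported as well.
MOTIF = re.compile(r"(?=(G[ACGT]{20}GG))")


def find_promoter_targets(seq, tss, near=100, far=200):
    """Return (offset_from_tss, site) pairs for forward-strand matches.

    seq : DNA string with the sense strand running left to right
    tss : 0-based index of the transcription start site within seq
    A site is kept only if all 23 nt fall between -far and -near
    relative to the TSS (one possible reading of the claimed window).
    """
    seq = seq.upper()
    hits = []
    for match in MOTIF.finditer(seq):
        site = match.group(1)
        start = match.start(1)
        end = start + len(site)  # exclusive end index
        if tss - far <= start and end <= tss - near:
            hits.append((start - tss, site))
    return hits


# Synthetic toy sequence (not a real promoter) with one planted site.
planted = "G" + "ACTT" * 5 + "GG"
toy = "A" * 60 + planted + "A" * 167

print(find_promoter_targets(toy, tss=220))
```

A real design workflow would also scan the reverse strand and screen candidates for off-target matches; both are omitted here to keep the sketch short.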
CN201910063623.6A 2019-01-23 2019-01-23 DNA cyclization molecule and application thereof Active CN111471665B (en)


Publications (2)

Publication Number Publication Date
CN111471665A (en) 2020-07-31
CN111471665B (en) 2023-07-04

Family

ID=71743398





Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant