CN108795900B

CN108795900B - DNA polymerase and preparation method thereof

Info

Publication number: CN108795900B
Application number: CN201710287578.3A
Authority: CN
Inventors: 陈清斌; 董宇亮; 章文蔚; 刘芬; 徐崇钧; 斯内扎娜·德马纳克
Original assignee: MGI Tech Co Ltd
Current assignee: MGI Tech Co Ltd
Priority date: 2017-04-27
Filing date: 2017-04-27
Publication date: 2021-02-02
Anticipated expiration: 2037-04-27
Also published as: CN108795900A

Abstract

The invention discloses a DNA polymerase and a preparation method thereof. The DNA polymerase disclosed by the invention is any one of the following A1) to A3): A1) a mutant protein having DNA polymerase activity, obtained by substituting and/or deleting and/or adding amino acid residues in the amino acid sequence of 9°N DNA polymerase; A2) a mutant protein having DNA polymerase activity, obtained by modifying amino acid residues of the amino acid sequence of 9°N DNA polymerase; A3) a fusion protein having DNA polymerase activity, obtained by attaching a tag to the middle and/or N-terminus and/or C-terminus of A1) or A2). Compared with 9°N DNA polymerase, the mutant protein of A1) or A2) has reduced affinity for template DNA, while its DNA polymerase activity is not reduced.

Description

DNA polymerase and preparation method thereof

Technical Field

The invention relates to DNA polymerase and a preparation method thereof in the field of biotechnology.

Background

DNA polymerase is an enzyme involved in DNA replication. Its structure usually consists of three subdomains: thumb, palm and finger. Each subdomain has a specific role: the palm is the catalytic center of the DNA polymerase, the finger mainly binds nucleotides or nucleotide analogs, and the thumb is related to the ability to bind template DNA and to processivity. It has been shown that mutations in the thumb domain primarily affect the polymerase's binding to template DNA and its processivity, but do not significantly affect other properties of the polymerase, such as catalytic activity or affinity for nucleotides or nucleotide analogs.

Thermococcus sp. 9°N-7 polymerase is derived from the thermophilic marine archaeon Thermococcus sp. 9°N-7, which was isolated from a submarine hydrothermal vent at 9°N in the eastern Pacific Ocean.

Disclosure of Invention

The technical problem to be solved by the present invention is how to reduce the binding ability of a DNA polymerase to a template DNA without reducing the DNA polymerase activity.

In order to solve the above problem, the present invention provides a protein having DNA polymerase activity, which is obtained by replacing or modifying some of the amino acid residues in the sequence of 9°N DNA polymerase (abbreviated as DC), and is named DCm.

The protein DCm with DNA polymerase activity provided by the invention is any one of the following A1) -A3):

A1) a mutant protein having a DNA polymerase activity, which is obtained by substituting and/or deleting and/or adding an amino acid residue to an amino acid sequence of a DC;

A2) a mutant protein having a DNA polymerase activity, which is obtained by modifying an amino acid residue of an amino acid sequence of DC;

A3) a fusion protein having DNA polymerase activity obtained by attaching a tag to the middle or/and N-terminal or/and C-terminal of A1) or A2);

Compared with DC, the mutant protein of A1) or A2) has reduced affinity for template DNA, without a reduction in DNA polymerase activity.

In order to facilitate purification of the protein of A1) or A2), a tag shown in Table 1 may be attached to the amino terminus or carboxyl terminus of the protein of A1) or A2).

TABLE 1 Sequences of tags

Tag	Number of residues	Sequence
Poly-Arg	5-6 (typically 5)	RRRRR
Poly-His	2-10 (generally 6)	HHHHHH
FLAG	8	DYKDDDDK
Strep-tag II	8	WSHPQFEK
c-myc	10	EQKLISEEDL

The DCm can be synthesized artificially, or can be obtained by synthesizing the coding gene and then carrying out biological expression.

The gene encoding DCm can be obtained by mutating one or several nucleotides in the DNA sequence encoding DC, and/or by linking the coding sequence of a tag shown in Table 1 above to the middle and/or the 5' end and/or the 3' end of that sequence. The DNA sequence encoding DC may be a sequence obtained by adding ATG to the 5' end of positions 43-2367 of sequence 10. The 1st nucleotide of sequence 10 is its 5' terminal nucleotide.

In the above-mentioned DCm, the DC may be an exonuclease-inactivated protein obtained by replacing aspartic acid at position 141 of the NCBI ID AAA88769.1 with alanine and replacing glutamic acid at position 143 with alanine. The DC can be protein shown in a sequence 1 in a sequence table.

In the above-mentioned DCm, the substitution and/or deletion and/or addition of amino acid residues is a substitution and/or deletion and/or addition of one or several amino acid residues. The DCm of A1) has 75% or more identity to DC. The above-mentioned identity of 75% or more may be 80%, 85%, 90% or 95% or more.

In the above-mentioned DCm, the substitution may be a substitution of an amino acid residue with an alanine residue. The substitution may be of one or several amino acid residues.

Specifically, the substitution may be at at least one of positions 674, 665, 667, 668, 735 and 679 of DC.

In the above-mentioned DCm, the modification of the amino acid residue is a modification of one or several amino acid residues.

Specifically, the modification may be at at least one of positions 674, 665, 667, 668, 735 and 679 of DC.

In the above-mentioned DCm, the fusion protein described in A3) may specifically be a protein obtained by linking a His tag and/or a TEV protease recognition sequence to the middle, the amino terminus or the carboxyl terminus of the protein of A1) or A2). The fusion protein of A3) may specifically be obtained by inserting amino acid residues 2-14 of sequence 2 in the sequence table between amino acid residues 1 and 2 of the protein of A1) or A2).

The above-mentioned DCm may be any one of the following B1) -B8):

B1) a protein obtained by substituting the lysine residue at position 674 from the N-terminus of DC with an alanine residue (this DCm is designated DC5);

B2) a protein obtained by substituting both the glutamine residue at position 665 and the threonine residue at position 667 from the N-terminus of DC with alanine residues (this DCm is designated DC24);

B3) a protein obtained by substituting the arginine residue at position 668 from the N-terminus of DC with an alanine residue (this DCm is designated DC4);

B4) a protein obtained by substituting the glutamine residue at position 665 from the N-terminus of DC with an alanine residue (this DCm is designated DC35);

B5) a protein obtained by substituting the threonine residue at position 667 from the N-terminus of DC with an alanine residue (this DCm is designated DC36);

B6) a protein obtained by substituting the asparagine residue at position 735, the glutamine residue at position 665 and the threonine residue at position 667 from the N-terminus of DC with alanine residues (this DCm is designated DC40);

B7) a protein obtained by substituting the histidine residue at position 679 from the N-terminus of DC with an alanine residue (this DCm is designated DC17);

B8) a protein obtained by substituting the asparagine residue at position 735 from the N-terminus of DC with an alanine residue (this DCm is designated DC8).
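As an illustration only, the alanine substitutions listed in B1) to B8) can be written in the usual wild-type/position/mutant notation and applied programmatically. The Python sketch below is not part of the disclosed method: the names VARIANTS and apply_substitutions are hypothetical, and the demonstration sequence is a placeholder, not sequence 1 of the sequence table.

```python
# Alanine substitutions from B1)-B8); positions are 1-based, counted from
# the N-terminus of DC. The dictionary name is a hypothetical helper.
VARIANTS = {
    "DC5": ["K674A"],
    "DC24": ["Q665A", "T667A"],
    "DC4": ["R668A"],
    "DC35": ["Q665A"],
    "DC36": ["T667A"],
    "DC40": ["Q665A", "T667A", "N735A"],
    "DC17": ["H679A"],
    "DC8": ["N735A"],
}


def apply_substitutions(seq, substitutions):
    """Apply substitutions such as 'K674A' to a protein sequence string.

    Raises ValueError if a position is out of range or if the residue
    found at that position is not the expected wild-type residue.
    """
    residues = list(seq)
    for sub in substitutions:
        wild_type, position, mutant = sub[0], int(sub[1:-1]), sub[-1]
        if position < 1 or position > len(residues):
            raise ValueError(f"{sub}: position outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(f"{sub}: found {residues[position - 1]} instead of {wild_type}")
        residues[position - 1] = mutant
    return "".join(residues)


# Placeholder sequence (NOT sequence 1): glycine everywhere except the six
# positions named in B1)-B8), which carry the wild-type residues stated there.
placeholder = ["G"] * 775
for position, residue in {665: "Q", 667: "T", 668: "R", 674: "K", 679: "H", 735: "N"}.items():
    placeholder[position - 1] = residue
placeholder_dc = "".join(placeholder)

dc40_like = apply_substitutions(placeholder_dc, VARIANTS["DC40"])
print(dc40_like[664], dc40_like[666], dc40_like[734])
```

Checking the wild-type residue before replacing it is a cheap guard against off-by-one numbering errors, which are easy to make when a tag shifts residue positions.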

The DC5 fusion protein can be a protein shown in a sequence 2 in a sequence table;

the DC24 fusion protein can be a protein shown in a sequence 3 in a sequence table;

the DC4 fusion protein can be a protein shown in a sequence 4 in a sequence table;

the DC35 fusion protein can be a protein shown in a sequence 5 in a sequence table;

the DC36 fusion protein can be a protein shown in a sequence 6 in a sequence table;

the DC40 fusion protein can be a protein shown as a sequence 7 in a sequence table;

the DC17 fusion protein can be a protein shown in a sequence 8 in a sequence table;

the DC8 fusion protein can be a protein shown in a sequence 9 in a sequence table.

The above-mentioned DCm has the following characteristic: compared with DC, the affinity of DCm for DNA molecules, cDNA molecules, or biochips containing DNA molecules or cDNA molecules is reduced.

The biochip is a microarray hybridization chip (micro-array) in which biological information molecules (such as gene fragments, DNA fragments, polypeptides, proteins, sugar molecules, tissues and the like) are immobilized at high density on a support medium.

DCm may have nucleotides or nucleotide analogs as substrates. The nucleotide analogue is obtained by modifying nucleotide. The nucleotide analog may be specifically a substance obtained by modifying a nucleotide with a fluorescent group.

In order to solve the technical problems, the invention also provides a biomaterial related to the DCm.

The biomaterial related to DCm provided by the invention is any one of the following C1) to C5):

C1) a nucleic acid molecule encoding DCm;

C2) an expression cassette comprising the nucleic acid molecule of C1);

C3) a recombinant vector comprising the nucleic acid molecule of C1), or a recombinant vector comprising the expression cassette of C2);

C4) a recombinant microorganism containing the nucleic acid molecule of C1), or a recombinant microorganism containing the expression cassette of C2), or a recombinant microorganism containing the recombinant vector of C3);

C5) a transgenic cell line containing the nucleic acid molecule of C1), or a transgenic cell line containing the expression cassette of C2).

In the above biological material, the nucleic acid molecule of C1) may be 1), 2) or 3) described below:

1) a cDNA molecule or a DNA molecule which is obtained by replacing at least one nucleotide of the sequence of the coding gene of the DC and codes the DCm;

2) a cDNA molecule or a genomic DNA molecule having 75% or more identity with the nucleotide sequence defined in 1) and encoding DCm;

3) a cDNA molecule or a genomic DNA molecule which hybridizes with the nucleotide sequence defined in 1) under stringent conditions and encodes DCm.

Wherein the nucleic acid molecule may be DNA, such as cDNA, genomic DNA or recombinant DNA; the nucleic acid molecule may also be RNA, such as mRNA or hnRNA, etc.

Wherein the DNA molecule shown at positions 43 to 2367 of sequence 10 encodes the protein shown at positions 2 to 775 of sequence 1.
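The coordinates cited here can be checked with simple arithmetic. The Python sketch below relies on the description given elsewhere in this document that sequence 10 encodes the DC fusion protein, with a 6-residue His tag and the 7-residue ENLYFQG site placed after the first methionine. Reading the one extra codon as a stop codon is an assumption, not something the text states.

```python
# Tag composition described for the DC fusion protein.
HIS_TAG_RESIDUES = 6
TEV_SITE_RESIDUES = len("ENLYFQG")  # 7

# Coordinates quoted in the text (1-based, inclusive).
start_nt, end_nt = 43, 2367          # nucleotides of sequence 10
first_residue, last_residue = 2, 775  # residues of sequence 1

upstream_nt = start_nt - 1                 # nucleotides before position 43
upstream_codons = upstream_nt // 3         # start codon plus tag codons
segment_nt = end_nt - start_nt + 1         # length of the cited DNA segment
segment_codons = segment_nt // 3
encoded_residues = last_residue - first_residue + 1

print("upstream:", upstream_nt, "nt =", upstream_codons, "codons")
print("segment:", segment_nt, "nt =", segment_codons, "codons")
print("residues 2-775:", encoded_residues)
```

Under these assumptions the 42 upstream nucleotides correspond to ATG plus 13 tag codons, and the segment holds one codon more than the 774 residues it is said to encode.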

One of ordinary skill in the art can easily mutate the nucleotide sequence encoding DCm of the present invention using known methods, such as directed evolution and point mutation. Artificially modified nucleotide sequences having 75% or more identity to the nucleotide sequence encoding the DCm of the present invention are considered to be derived from, and equivalent to, the nucleotide sequence of the present invention, as long as they encode DCm and retain the function of DCm.

The term "identity" as used herein refers to sequence similarity to a native nucleic acid sequence. "Identity" includes nucleotide sequences that are 75% or more, 85% or more, 90% or more, or 95% or more identical to the nucleotide sequence encoding the amino acid sequence of the DCm of the present invention. Identity can be assessed visually or by computer software. Using computer software, the identity between two or more sequences can be expressed as a percentage (%), which can be used to assess the identity between related sequences.
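As a minimal sketch of how software might express identity as a percentage, the Python function below counts matching columns in two sequences that have already been aligned. It is only an illustration: the text does not name a program or a formula, the function name percent_identity is hypothetical, and dividing by the number of aligned columns (excluding columns where both sequences have a gap) is one of several conventions in use.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Columns in which both sequences carry a gap are ignored; a column
    with a gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy sequences for illustration only (one mismatch in nine columns).
identity = percent_identity("ATGGCTAAA", "ATGGCAAAA")
print(round(identity, 2))
print(identity >= 75.0)
```

Producing the alignment itself is a separate step that would normally be left to dedicated alignment software.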

In the above biological material, the stringent conditions are hybridization at 68 ℃ in a solution of 2× SSC and 0.1% SDS with 2 membrane washes of 5 min each, followed by hybridization at 68 ℃ in a solution of 0.5× SSC and 0.1% SDS with 2 membrane washes of 15 min each; alternatively, hybridization and membrane washing at 65 ℃ in a solution of 0.1× SSPE (or 0.1× SSC) and 0.1% SDS.

The above-mentioned identity of 75% or more may be 80%, 85%, 90% or 95% or more.

In the biological material, the nucleic acid molecule encoding the DC5 fusion protein can be a DNA molecule shown as a sequence 11 in a sequence table;

the nucleic acid molecule for encoding the DC24 fusion protein can be a DNA molecule shown as a sequence 12 in a sequence table;

the nucleic acid molecule for encoding the DC4 fusion protein can be a DNA molecule shown as a sequence 13 in a sequence table;

the nucleic acid molecule for encoding the DC35 fusion protein can be a DNA molecule shown as a sequence 14 in a sequence table;

the nucleic acid molecule for encoding the DC36 fusion protein can be a DNA molecule shown as a sequence 15 in a sequence table;

the nucleic acid molecule for encoding the DC40 fusion protein can be a DNA molecule shown as a sequence 16 in a sequence table;

the nucleic acid molecule for encoding the DC17 fusion protein can be a DNA molecule shown as a sequence 17 in a sequence table;

the nucleic acid molecule for encoding the DC8 fusion protein can be a DNA molecule shown as a sequence 18 in a sequence table.

The expression cassette containing a nucleic acid molecule encoding DCm (DCm gene expression cassette) according to C2) in the above-mentioned biological material means a DNA capable of expressing DCm in a host cell, and the DNA may include not only a promoter which initiates transcription of DCm gene but also a terminator which terminates transcription of DCm gene. Further, the expression cassette may also include an enhancer sequence.

The recombinant vector containing the DCm gene expression cassette can be constructed by using the existing expression vector.

In the above biological material, the vector may be a plasmid, a cosmid, a phage, or a viral vector. The plasmid may be pET-28a.

The recombinant vector may be one obtained by inserting the nucleic acid molecule encoding DCm into the multiple cloning site of the vector. In the embodiment of the invention, the recombinant vector is specifically a recombinant vector expressing a fusion protein of DCm with a His tag and a TEV protease recognition sequence, obtained by replacing the DNA fragment between the AlwNI and HpaI recognition sequences of pET-28a with the nucleic acid molecule encoding the DCm fusion protein.

In the above biological material, the microorganism may be yeast, bacteria, algae or fungi. The bacterium may be Escherichia coli, such as Escherichia coli BL21(DE3).

In the examples of the present invention, the recombinant microorganism is specifically a recombinant bacterium obtained by introducing the recombinant vector into E. coli BL21(DE3).

In the above biological material, the transgenic cell line may or may not include propagation material.

In order to solve the technical problems, the invention also provides a preparation method of the DCm.

The method for preparing the DCm comprises the step of introducing the coding gene of the DCm into a biological cell to express the coding gene of the DCm so as to obtain the DCm.

In the above method, the introducing of the gene encoding DCm into the biological cell may be introducing a recombinant expression vector containing the gene encoding DCm into the biological cell to obtain a recombinant cell.

The gene encoding said DCm may in particular be the nucleic acid molecule described above under C1).

The recombinant expression vector can be obtained by introducing the gene encoding DCm into an expression vector. The expression vector may be a plasmid, cosmid, phage, or viral vector. The plasmid may be pET-28a.

In the above method, the biological cell may be a microorganism, an animal cell or a plant cell. The microorganism may in particular be Escherichia coli, such as Escherichia coli BL21(DE3).

In the above method, the expression of the gene encoding DCm may be specifically culturing the recombinant cell to obtain a culture, and expressing the gene encoding DCm in the recombinant cell.

The above method may further comprise purifying the DCm from the culture. Purification of DCm from the culture can be carried out by affinity chromatography and ion exchange chromatography.

In order to solve the technical problem, the invention also provides any one of the following applications:

E1) the use of DCm as a DNA polymerase;

E2) the use of the biological material for the preparation of a DNA polymerase;

E3) application of DCm in Polymerase Chain Reaction (PCR);

E4) application of DCm in preparing polymerase chain reaction products;

E5) the use of the biological material in a polymerase chain reaction;

E6) the application of the biological material in preparing polymerase chain reaction products;

E7) the application of the preparation method of the DCm in preparing polymerase chain reaction products;

E8) the use of DCm in sequencing;

E9) the use of the biological material in sequencing;

E10) application of DCm in preparation of sequencing products;

E11) the application of the biological material in the preparation of sequencing products;

E12) the preparation method of the DCm is applied to preparation of sequencing products.

In the present invention, the DNA polymerase may use a nucleotide or a nucleotide analog as a substrate. The nucleotide analogue is obtained by modifying nucleotide. The nucleotide analog may be specifically a substance obtained by modifying a nucleotide with a fluorescent group.

Experiments show that, compared with DC, the protein with DNA polymerase activity (DCm) obtained by replacing some of the amino acid residues in the DC sequence has reduced affinity for template DNA, while its enzyme activity is not reduced. A suitable decrease in affinity is beneficial as long as the activity of the enzyme is maintained. For example, during sequencing, a decrease in affinity for the template DNA makes it easier to elute the polymerase bound to the chip, so that the polymerase added in each round can perform better, which improves the quality of the sequencing results.

Drawings

FIG. 1 shows the SDS-PAGE result of the purified DC5 fusion protein. Wherein, lane 1 is the purified DC5 fusion protein, lane 2 is the 20-fold diluted sample of the purified DC5 fusion protein, and lane 3 is the protein molecular weight standard.

FIG. 2 shows the SDS-PAGE result of the purified DC24 fusion protein. Wherein, lane 1 is the protein molecular weight standard, lane 2 is the purified DC24 fusion protein, and lane 3 is the 20-fold diluted sample of the purified DC24 fusion protein.

FIG. 3 shows the SDS-PAGE result of the purified DC4 fusion protein. Wherein, lane 1 is the purified DC4 fusion protein, lane 2 is the 20-fold diluted sample of the purified DC4 fusion protein, and lane 3 is the protein molecular weight standard.

FIG. 4 shows the SDS-PAGE result of the purified DC35 fusion protein. Wherein, lane 1 is the purified DC35 fusion protein, lane 2 is the 20-fold diluted sample of the purified DC35 fusion protein, and lane M is the protein molecular weight standard.

FIG. 5 shows the SDS-PAGE result of the purified DC36 fusion protein. Wherein, lane 1 is the purified DC36 fusion protein, lane 2 is the 20-fold diluted sample of the purified DC36 fusion protein, and lane M is the protein molecular weight standard.

FIG. 6 shows the SDS-PAGE of the purified DC40 fusion protein. Wherein, lane 1 is the purified DC40 fusion protein, lane 2 is the 20-fold diluted sample of the purified DC40 fusion protein, and lane M is the protein molecular weight standard.

FIG. 7 shows the SDS-PAGE of the purified DC17 fusion protein. Wherein, lane 1 is the protein molecular weight standard, lane 2 is the purified DC17 fusion protein, and lane 3 is the 20-fold diluted sample of the purified DC17 fusion protein.

FIG. 8 shows the SDS-PAGE results of the purified DC8 fusion protein. Wherein, lane 1 is the purified DC8 fusion protein, lane 2 is the 20-fold diluted sample of the purified DC8 fusion protein, and lane 3 is the protein molecular weight standard.

FIG. 9 shows the result of SDS-PAGE electrophoresis of the purified DC fusion protein. Wherein, lane 1 is the protein molecular weight standard, lane 2 is the purified DC fusion protein, and lane 3 is the 20-fold diluted sample of the purified DC fusion protein.

FIG. 10 shows the results of detecting the endonuclease activity of the DC5 fusion protein. Wherein S represents DC5 fusion protein, and NC represents negative control.

FIG. 11 shows the results of detecting the endonuclease activity of the DC24 fusion protein. Wherein S represents DC24 fusion protein, and NC represents negative control.

FIG. 12 shows the results of detecting the endonuclease activity of the DC4 fusion protein. Wherein S represents DC4 fusion protein, and NC represents negative control.

FIG. 13 shows the results of detecting the endonuclease activity of the DC35 fusion protein. Wherein S represents DC35 fusion protein, and NC represents negative control.

FIG. 14 shows the results of detecting the endonuclease activity of the DC36 fusion protein. Wherein S represents DC36 fusion protein, and NC represents negative control.

FIG. 15 shows the results of detecting the endonuclease activity of the DC40 fusion protein. Wherein S represents DC40 fusion protein, and NC represents negative control.

FIG. 16 shows the results of detecting the endonuclease activity of the DC17 fusion protein. Wherein S represents DC17 fusion protein, and NC represents negative control.

FIG. 17 shows the results of detecting the endonuclease activity of the DC8 fusion protein. Wherein S represents DC8 fusion protein, and NC represents negative control.

FIG. 18 shows the results of detecting the endonuclease activity of the DC fusion protein. Wherein S represents DC fusion protein, and NC represents negative control.

Detailed Description

The present invention is described in further detail below with reference to specific embodiments, which are given for the purpose of illustration only and are not intended to limit the scope of the invention.

The experimental procedures in the following examples are conventional unless otherwise specified.

Materials, reagents and the like used in the following examples are commercially available unless otherwise specified.

pUC19 in the following examples is the NEB product.

The ICR buffer in the following examples uses water as the solvent, and its solutes and their concentrations are: Trizma Base 50 mM (pH 8.0), NaCl 50 mM, EDTA 1 mM, MgSO₄ 10 mM and Tween-20 0.05% (volume percent).

Example 1 preparation of DNA polymerase

In this application, some amino acid residues in the sequence of exonuclease-inactivated 9°N DNA polymerase (hereinafter referred to as DC) are replaced to obtain DNA polymerases named DC5, DC24, DC4, DC35, DC36, DC40, DC17 and DC8, respectively. DC is a protein obtained by replacing the aspartic acid at position 141 of the protein with NCBI ID AAA88769.1 with alanine and replacing the glutamic acid at position 143 with alanine; the amino acid sequence of DC is shown as sequence 1 in the sequence table.

DC5 is a DNA polymerase obtained by replacing the lysine residue at position 674 of DC with an alanine residue. The DC5 fusion protein is obtained by adding a His tag consisting of 6 histidine residues and a TEV protease cleavage site consisting of the seven amino acid residues ENLYFQG after the first methionine residue of DC5; its sequence is sequence 2 in the sequence table;

DC24 is a DNA polymerase obtained by replacing the glutamine residue at position 665 and the threonine residue at position 667 of DC with alanine residues. The DC24 fusion protein is obtained by adding a His tag consisting of 6 histidine residues and a TEV protease cleavage site consisting of the seven amino acid residues ENLYFQG after the first methionine residue of DC24; its sequence is sequence 3 in the sequence table;

DC4 is a DNA polymerase obtained by replacing the arginine residue at position 668 of DC with an alanine residue. The DC4 fusion protein is obtained by adding a His tag consisting of 6 histidine residues and a TEV protease cleavage site consisting of the seven amino acid residues ENLYFQG after the first methionine residue of DC4; its sequence is sequence 4 in the sequence table;

DC35 is a DNA polymerase obtained by replacing the glutamine residue at position 665 of DC with an alanine residue. The DC35 fusion protein is obtained by adding a His tag consisting of 6 histidine residues and a TEV protease cleavage site consisting of the seven amino acid residues ENLYFQG after the first methionine residue of DC35; its sequence is sequence 5 in the sequence table;

DC36 is a DNA polymerase obtained by replacing the threonine residue at position 667 of DC with an alanine residue. The DC36 fusion protein is obtained by adding a His tag consisting of 6 histidine residues and a TEV protease cleavage site consisting of the seven amino acid residues ENLYFQG after the first methionine residue of DC36; its sequence is sequence 6 in the sequence table;

DC40 is a DNA polymerase obtained by replacing the glutamine residue at position 665, the threonine residue at position 667 and the asparagine residue at position 735 of DC with alanine residues. The DC40 fusion protein is obtained by adding a His tag consisting of 6 histidine residues and a TEV protease cleavage site consisting of the seven amino acid residues ENLYFQG after the first methionine residue of DC40; its sequence is sequence 7 in the sequence table;

DC17 is a DNA polymerase obtained by replacing the histidine residue at position 679 of DC with an alanine residue. The DC17 fusion protein is obtained by adding a His tag consisting of 6 histidine residues and a TEV protease cleavage site consisting of the seven amino acid residues ENLYFQG after the first methionine residue of DC17; its sequence is sequence 8 in the sequence table;

DC8 is a DNA polymerase obtained by replacing the asparagine residue at position 735 of DC with an alanine residue. The DC8 fusion protein is obtained by adding a His tag consisting of 6 histidine residues and a TEV protease cleavage site consisting of the seven amino acid residues ENLYFQG after the first methionine residue of DC8; its sequence is sequence 9 in the sequence table.
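The tag arrangement described for each fusion protein (6 histidine residues followed by ENLYFQG, placed directly after the first methionine) can be sketched in Python as follows. This is an illustration under stated assumptions: the function name add_n_terminal_tag is hypothetical, and the input sequence is a short placeholder, not any sequence from the sequence table.

```python
HIS_TAG = "H" * 6      # His tag of 6 histidine residues
TEV_SITE = "ENLYFQG"   # TEV protease cleavage site named in the text


def add_n_terminal_tag(protein):
    """Insert the His tag and TEV site after the first methionine."""
    if not protein.startswith("M"):
        raise ValueError("expected the protein to start with methionine")
    return protein[0] + HIS_TAG + TEV_SITE + protein[1:]


# Short placeholder protein, for illustration only.
untagged = "MAAAKK"
fusion = add_n_terminal_tag(untagged)
print(fusion)
# Residues 2-14 of the fusion (1-based) are the inserted tag and site.
print(fusion[1:14])
```

Because 13 residues are inserted after position 1, a residue at position n (n of 2 or more) in the untagged protein sits at position n + 13 in a fusion built this way; position 674, for example, would correspond to position 687.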

The specific preparation method of each DNA polymerase is as follows:

1. preparation of recombinant bacterium

The DNA fragment between the AlwNI and HpaI recognition sequences of pET-28a (Novagen) was replaced with the gene encoding the DC fusion protein to obtain a recombinant vector, which was named V-DC. The DC fusion protein is obtained by adding a His tag consisting of 6 histidine residues and a TEV protease cleavage site consisting of the seven amino acid residues ENLYFQG after the first methionine residue of DC; the sequence of the gene encoding the DC fusion protein is sequence 10 in the sequence table.

V-DC was introduced into Escherichia coli BL21(DE3) (Tiangen Biochemical technology, Inc. (Beijing) Co., Ltd.) to obtain a recombinant bacterium, which was named BL21-V-DC. BL21-V-DC expresses the DC fusion protein.

According to the above method, the gene encoding the DC fusion protein was replaced with the gene encoding the DC5 fusion protein, the DC24 fusion protein, the DC4 fusion protein, the DC35 fusion protein, the DC36 fusion protein, the DC40 fusion protein, the DC17 fusion protein or the DC8 fusion protein, respectively, with all other steps unchanged, to obtain recombinant bacteria (BL21-V-DC5, BL21-V-DC24, BL21-V-DC4, BL21-V-DC35, BL21-V-DC36, BL21-V-DC40, BL21-V-DC17 and BL21-V-DC8) that express the DC5, DC24, DC4, DC35, DC36, DC40, DC17 and DC8 fusion proteins, respectively.

Wherein the coding gene of the DC5 fusion protein is a DNA molecule shown in a sequence 11, the coding gene of the DC24 fusion protein is a DNA molecule shown in a sequence 12, the coding gene of the DC4 fusion protein is a DNA molecule shown in a sequence 13, the coding gene of the DC35 fusion protein is a DNA molecule shown in a sequence 14, the coding gene of the DC36 fusion protein is a DNA molecule shown in a sequence 15, the coding gene of the DC40 fusion protein is a DNA molecule shown in a sequence 16, the coding gene of the DC17 fusion protein is a DNA molecule shown in a sequence 17, and the coding gene of the DC8 fusion protein is a DNA molecule shown in a sequence 18.

pET-28a was introduced into E. coli BL21(DE3) to obtain a recombinant bacterium, which was named BL21-V.

2. Preparation of DC5 fusion protein

2.1 Induction of expression

The recombinant bacterium BL21-V-DC5 obtained in step 1 was inoculated into 5 ml of Kan-LB medium (a liquid medium with a kanamycin concentration of 25 μg/ml, obtained by adding kanamycin to LB medium) and cultured overnight at 37 ℃ and 220 rpm; the resulting culture was inoculated at a ratio of 1:50 into 30 ml of Kan-LB medium and cultured at 37 ℃ and 220 rpm for 4 h; the resulting culture was inoculated at a ratio of 1:50 into 250 ml of Kan-LB medium and cultured at 37 ℃ and 220 rpm for about 4 h; when the OD600 reached about 0.6, IPTG was added to a final concentration of 0.5 mM, and the culture was continued overnight (about 16 h) at 30 ℃ and 220 rpm; BL21-V-DC5 cells were then collected by centrifugation at 8000 rpm for 10 min. In parallel, a culture without added IPTG served as a blank control, from which uninduced BL21-V-DC5 cells were collected.
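The volumes implied by this induction scheme can be worked out with a short Python sketch. Two points are assumptions rather than statements from the text: "1:50" is read as one volume of inoculum per 50 volumes of medium, and the 1 M IPTG stock concentration is hypothetical, chosen only to make the arithmetic concrete. The small volume change caused by adding the stock is ignored.

```python
def inoculum_volume_ml(medium_ml, ratio=50):
    """Inoculum volume for a 1:ratio inoculation (assumed inoculum:medium)."""
    return medium_ml / ratio


def stock_volume_ml(final_mM, culture_ml, stock_mM):
    """Volume of stock needed to reach final_mM, from C1 * V1 = C2 * V2."""
    return final_mM * culture_ml / stock_mM


seed_30 = inoculum_volume_ml(30)     # into the 30 ml culture
seed_250 = inoculum_volume_ml(250)   # into the 250 ml culture

IPTG_STOCK_MM = 1000.0  # hypothetical 1 M stock; not given in the text
iptg_ml = stock_volume_ml(0.5, 250, IPTG_STOCK_MM)

print(f"inoculum for 30 ml: {seed_30:.2f} ml")
print(f"inoculum for 250 ml: {seed_250:.2f} ml")
print(f"IPTG stock for 0.5 mM in 250 ml: {iptg_ml * 1000:.0f} ul")
```

With a different stock concentration only IPTG_STOCK_MM needs to change; the inoculum volumes depend only on the 1:50 ratio.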

Following the above induction method, the recombinant bacterium BL21-V-DC5 was replaced with BL21-V, with all other steps unchanged, to obtain BL21-V cells.

2.2 disruption and purification of the cells

The BL21-V-DC5 cells from step 2.1 were resuspended at a ratio of 1 g of cells per 20 ml of affinity solution (50 mM KPO₄, 500 mM NaCl, 5% glycerol, pH 7.4) to obtain cell suspension 1; phenylmethylsulfonyl fluoride, Triton X-100 and lysozyme were added to cell suspension 1 to obtain cell suspension 2, in which their final concentrations were 0.25 mM, 0.5% (volume percent) and 2.5 mg/100 mL, respectively; cell suspension 2 was incubated at room temperature for 30 min and then sonicated at 40% power (about 400 W) in cycles of 2 s on and 2 s off for 30 min; the sonicated product was centrifuged at 12000 rpm and 4 ℃ for 30 min, and the supernatant was collected.

The supernatant was filtered through a 0.45 μm syringe filter (Life Sciences) and loaded at an appropriate flow rate onto a Ni affinity chromatography column (prepacked affinity column HisTrap HP, 5 ml, 17-5248-02, GE Healthcare); the column was then equilibrated with 10 CV of affinity buffer 1 (50 mM KPO₄, 500 mM NaCl, 10 mM imidazole, 5% glycerol, pH 7.4), eluted with 5 CV of 3% affinity buffer 2 (50 mM KPO₄, 1 M NaCl, 500 mM imidazole, 5% glycerol, pH 7.4), and eluted with 5 CV of 50% affinity buffer 2; the Ni column eluate corresponding to peaks of ≥100 mAU was collected.
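To make the two elution steps concrete, the Python sketch below estimates the solute concentrations reached when affinity buffer 2 is blended with affinity buffer 1. It assumes that "3%" and "50%" denote the volume fraction of affinity buffer 2 in the blend, which is a common convention on two-buffer chromatography systems but is not stated explicitly in the text. The names BUFFER_1, BUFFER_2 and blend are hypothetical.

```python
# Solute concentrations in mM, taken from the buffer recipes in the text.
BUFFER_1 = {"KPO4": 50.0, "NaCl": 500.0, "imidazole": 10.0}
BUFFER_2 = {"KPO4": 50.0, "NaCl": 1000.0, "imidazole": 500.0}


def blend(percent_buffer_2):
    """Concentrations (mM) after mixing buffers 1 and 2 by volume fraction."""
    fraction = percent_buffer_2 / 100.0
    return {
        solute: (1.0 - fraction) * BUFFER_1[solute] + fraction * BUFFER_2[solute]
        for solute in BUFFER_1
    }


step_3 = blend(3)
step_50 = blend(50)
for label, mix in (("3%", step_3), ("50%", step_50)):
    print(label, {solute: round(value, 1) for solute, value in mix.items()})
```

Under this reading, the 3% step corresponds to roughly 25 mM imidazole (a low-imidazole wash) and the 50% step to roughly 255 mM imidazole.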

The eluate with peaks of ≥100 mAU was diluted 10-fold with ion buffer 1 (25 mM KPO₄, 50 mM NaCl, 5% glycerol, pH 7.4) and then loaded at a constant flow rate onto an anion exchange column (prepacked ion exchange column HiTrap Q HP, 5 ml, 17-1154-01, GE Healthcare); the anion exchange flow-through was collected after loading.

The anion exchange flow-through was loaded at a constant flow rate onto a cation exchange column (prepacked ion exchange column HiTrap SP HP, 5 ml, 17-1152-01, GE Healthcare); the column was equilibrated with 10 CV of ion buffer 1, eluted with 5 CV of 3% ion buffer 2 (50 mM KPO₄, 1 M NaCl, 5% glycerol, pH 7.40), and eluted with 5 CV of 50% ion buffer 2; the SP column eluate corresponding to peaks of ≥100 mAU was collected, dialyzed for 24 hours against dialysis buffer (20 mM Tris, 200 mM KCl, 0.2 mM EDTA, 5% glycerol) in a dialysis bag (spectrumlabs, 131267), and adjusted to 1 mg/ml in 50% glycerol to obtain the purified DC5 fusion protein.

Following the induction, expression, cell disruption and purification methods of steps 2.1-2.2, with BL21-V-DC5 replaced by BL21-V-DC24, BL21-V-DC4, BL21-V-DC35, BL21-V-DC36, BL21-V-DC40, BL21-V-DC17 and BL21-V-DC8 respectively and all other conditions unchanged, the purified DC24 fusion protein, DC4 fusion protein, DC35 fusion protein, DC36 fusion protein, DC40 fusion protein, DC17 fusion protein and DC8 fusion protein were obtained, respectively.

3. Routine quality inspection

(1) Purity and quality inspection by SDS-PAGE electrophoresis

10 μl of the DC5 fusion protein obtained in step 2 at a concentration of 1 mg/ml was mixed with 10 μl of loading buffer to obtain liquid 1, and 10 μl of the DC5 fusion protein obtained in step 2 at a concentration of 0.05 mg/ml was mixed with 10 μl of loading buffer to obtain liquid 2. 10 μl each of liquid 1 and liquid 2 was subjected to SDS-PAGE electrophoresis. The result is shown in FIG. 1: the purity of the purified DC5 fusion protein is more than 95%.

The purities of the purified DC fusion protein, DC24 fusion protein, DC4 fusion protein, DC35 fusion protein, DC36 fusion protein, DC40 fusion protein, DC17 fusion protein and DC8 fusion protein were respectively examined according to the above-mentioned methods, and as a result, the purities of these proteins were all found to be more than 95% (fig. 2 to fig. 9).

(2) Quality control of endonuclease activity

The endonuclease activity of the DC5 fusion protein obtained in step 2 was examined using the reaction system of table 2:

Table 2. Reaction system for detecting endonuclease activity

After the reaction system shown in Table 2 was incubated at 37 ℃ for 4 hours, 1 μl of 0.5 M EDTA was added to terminate the reaction. The reaction product was purified using a Gel Extraction Kit (Omega Bio-tek) and then subjected to 1% agarose gel electrophoresis. The result is shown in FIG. 10: the pUC19 plasmid band was not degraded, demonstrating that the DC5 fusion protein was not contaminated with endonuclease. For the negative control (NC), the DC5 fusion protein was replaced with water.

Quality tests of endonuclease activity of DC fusion protein, DC24 fusion protein, DC4 fusion protein, DC35 fusion protein, DC36 fusion protein, DC40 fusion protein, DC17 fusion protein and DC8 fusion protein were carried out according to the above-mentioned methods, and the results (FIGS. 11-18) demonstrated that none of these proteins was contaminated with endonuclease.

(3) Exonuclease activity quality test

DNase Alert Substrate (Thermo Fisher Scientific) was resuspended in 1 ml of TE buffer, ensuring complete dissolution of the substrate. 5 μl of the resuspended DNase Alert Substrate was added to each well of a 384-well plate; 40 μl of the DC5 fusion protein solution obtained in step 2 was added to the sample well and mixed well. A negative control and a positive control were set up. In the negative control, the DC5 fusion protein solution was replaced by an equal volume of nuclease-free water. For the positive control, the 10× nucleic acid Alert Buffer (Invitrogen) was diluted 10-fold with nuclease-free water to a 1× solution, the standard DNase I was diluted five-fold with the 1× solution, 2.5 μl of the diluted DNase I was added to the positive control well, 37.5 μl of nuclease-free water was added to make up the volume, and the mixture was shaken and mixed well.

After incubation at 37 ℃ for 10 min, fluorescence was read with a Gen5 microplate reader. From the measured fluorescence values, the DNase activity of the DC5 fusion protein was calculated to be 0.002 U/μl, i.e., almost no exonuclease activity, so the protein passed the quality inspection.

The exonuclease activities of the DC fusion protein, the DC24 fusion protein, the DC4 fusion protein, the DC35 fusion protein, the DC36 fusion protein, the DC40 fusion protein, the DC17 fusion protein and the DC8 fusion protein were each quality-tested according to the above method, and none of these proteins was found to be contaminated with exonuclease.

4. Detection of the DNA polymerase activity of the DC5 fusion protein using dNTP as substrate

20 μl of the DC5 fusion protein (1 mg/ml) obtained in step 2 was serially diluted 2-, 4-, 8-, 16- and 32-fold in a round-bottom 96-well plate using enzyme dilution buffer (solvent: water; solutes and their concentrations: 10 mM Trizma Base (pH 7.4), 100 mM KCl, 0.1 mM EDTA and 5% (volume percentage) glycerol), finally giving the 32-fold diluted DC5 fusion protein; the reaction was then set up according to the reaction system of Table 3.

Table 3. Reaction system

Components	Volume
Annealed mixture	2.5 μl
10 mM dNTP	1.25 μl
32-fold diluted DC5 fusion protein	3 μl
ICR buffer solution	12.5 μl
H₂O	5.75 μl

Composition of the annealed mixture in Table 3

Components	Volume
60 μM template	2.5 μl
60 μM primer	2.5 μl
2× Anneal buffer	12.5 μl
H₂O	7.5 μl

With H₂O as the negative control, three replicates were incubated in a PCR apparatus at 60 ℃ for 10 min and then rapidly placed on ice, and the reaction was stopped with 10 mM EDTA. Here, the 2× Anneal buffer consists of 10 mM Trizma Base, 50 mM NaCl and 1 mM EDTA, pH 7.4; the template is a 60 μM solution of the single-stranded DNA shown in sequence 20 of the sequence table, prepared by adding that single-stranded DNA to deionized water; the primer is a 60 μM solution of the single-stranded DNA shown in sequence 19 of the sequence table, prepared by adding that single-stranded DNA to deionized water.
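The volumes in Table 3 and in the annealed mixture imply the working concentrations in each reaction. The following is a minimal sketch of that arithmetic, using only the stock concentrations and volumes stated above; the helper name `final_conc` is illustrative, and "10 mM dNTP" is taken at face value without assuming whether it refers to each nucleotide or to the total.

```python
# Sketch: working concentrations implied by Table 3 and the annealed mixture.
def final_conc(stock_conc, volume_ul, total_ul):
    """C1 * V1 / V_total for a single component."""
    return stock_conc * volume_ul / total_ul


ANNEAL_TOTAL_UL = 2.5 + 2.5 + 12.5 + 7.5            # annealed mixture
REACTION_TOTAL_UL = 2.5 + 1.25 + 3 + 12.5 + 5.75    # Table 3 reaction

# Template (60 uM stock) in the annealed mixture, then in the reaction.
template_in_anneal_uM = final_conc(60.0, 2.5, ANNEAL_TOTAL_UL)
template_in_reaction_uM = final_conc(template_in_anneal_uM, 2.5, REACTION_TOTAL_UL)

# dNTP (10 mM stock) in the reaction.
dntp_in_reaction_mM = final_conc(10.0, 1.25, REACTION_TOTAL_UL)

# Two-fold dilution series of the 1 mg/ml enzyme stock (mg/ml at each step).
enzyme_dilutions = {fold: 1.0 / fold for fold in (2, 4, 8, 16, 32)}

# Mass of enzyme in 3 ul of the 32-fold dilution (mg/ml * ul = ug).
enzyme_in_reaction_ug = enzyme_dilutions[32] * 3.0

print(f"reaction volume: {REACTION_TOTAL_UL} ul")
print(f"template: {template_in_reaction_uM:.2f} uM, dNTP: {dntp_in_reaction_mM:.2f} mM")
print(f"enzyme per reaction: {enzyme_in_reaction_ug:.5f} ug")
```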

2 μl of the final reaction solution was transferred to a 96-well plate pre-loaded with 38 μl of TE buffer and mixed by pipetting, and 10 μl of this dilution was then transferred to a black flat-bottom 96-well plate pre-loaded with 40 μl of TE buffer. In other wells of the 96-well plate, 500 ng/ml lambda DNA was serially diluted 2-fold, and 50 μl of each dilution was transferred to a new well as a standard. PicoGreen dye (Thermo Fisher Scientific) was diluted 200-fold with TE buffer, 50 μl was added to each sample well and mixed well, and the plate was left to stand at room temperature for 2 min in the dark. Fluorescence was measured with excitation at 480 nm and emission at 520 nm.
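The quantification described above relies on a lambda DNA standard curve and on the two dilution steps applied to each sample. The following is a minimal sketch of how such a curve might be fitted and applied. The fluorescence readings are hypothetical (generated from an invented linear response), the top standard is assumed to be 500 ng/ml, and `linear_fit` and `rfu_to_ng_per_ml` are illustrative names, not part of the described method.

```python
# Sketch: linear standard curve for a dsDNA dye assay plus sample back-calculation.
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Two-fold series of lambda DNA standards (ASSUMED to start at 500 ng/ml).
standards_ng_per_ml = [500.0 / 2 ** i for i in range(6)]

# HYPOTHETICAL readings: 20 RFU per ng/ml on a background of 100 RFU.
standard_rfu = [20.0 * c + 100.0 for c in standards_ng_per_ml]

slope, intercept = linear_fit(standards_ng_per_ml, standard_rfu)


def rfu_to_ng_per_ml(rfu):
    """Invert the fitted line to get the DNA concentration in the well."""
    return (rfu - intercept) / slope


# Sample dilution before dye addition: 2 ul into 38 ul, then 10 ul into 40 ul.
SAMPLE_DILUTION = ((2 + 38) / 2) * ((10 + 40) / 10)

sample_rfu = 2100.0  # HYPOTHETICAL sample reading
dna_in_well = rfu_to_ng_per_ml(sample_rfu)
dna_in_reaction = dna_in_well * SAMPLE_DILUTION

print(f"dilution factor: {SAMPLE_DILUTION:.0f}x")
print(f"DNA in well: {dna_in_well:.1f} ng/ml, in reaction: {dna_in_reaction:.1f} ng/ml")
```

Because standards and samples both receive the same volume of dye, the dye addition cancels out and only the 100-fold pre-dilution needs to be corrected for.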

The DNA polymerase activities of the DC fusion protein, DC24 fusion protein, DC4 fusion protein, DC35 fusion protein, DC36 fusion protein, DC40 fusion protein, DC17 fusion protein and DC8 fusion protein were measured according to the above-described methods, respectively.

By the above method, the amount of dNTP polymerized by the 32-fold diluted enzyme solution of the DC5 fusion protein was found to be 0.51 nmol, that of the DC fusion protein was 0.43 nmol, and those of the DC4 fusion protein, the DC24 fusion protein, the DC17 fusion protein, the DC8 fusion protein, the DC35 fusion protein, the DC36 fusion protein and the DC40 fusion protein were 0.45, 0.52, 0.47, 0.48, 0.46, 0.48 and 0.47 nmol, respectively.

One unit (1 U) of DNA polymerase activity is defined as the amount of DNA polymerase that polymerizes 0.35 nmol of dNTP within 10 min at 60 ℃.

Accordingly, the DNA polymerase activities of the DC5 fusion protein, the DC4 fusion protein, the DC24 fusion protein, the DC17 fusion protein, the DC8 fusion protein, the DC35 fusion protein, the DC36 fusion protein and the DC40 fusion protein are 1.46 U, 1.29 U, 1.49 U, 1.34 U, 1.37 U, 1.31 U, 1.37 U and 1.34 U, respectively, and the DNA polymerase activity of the DC fusion protein is 1.23 U.
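The conversion from polymerized dNTP to activity units is a single division by the unit definition (0.35 nmol per unit). The following is a minimal sketch of that arithmetic; the protein labels and nmol values are taken from the paragraph listing the polymerized dNTP amounts, and the names `NMOL_PER_UNIT` and `to_units` are illustrative.

```python
# Sketch: activity units from polymerized dNTP, using 1 U = 0.35 nmol
# polymerized in 10 min at 60 degrees C.
NMOL_PER_UNIT = 0.35

# Amounts (nmol) as listed for the 32-fold diluted enzyme solutions.
polymerized_nmol = {
    "DC": 0.43,
    "DC5": 0.51,
    "DC4": 0.45,
    "DC24": 0.52,
    "DC17": 0.47,
    "DC8": 0.48,
    "DC35": 0.46,
    "DC36": 0.48,
    "DC40": 0.47,
}


def to_units(nmol):
    """Convert polymerized dNTP (nmol) to activity units, rounded to 2 decimals."""
    return round(nmol / NMOL_PER_UNIT, 2)


activity_units = {name: to_units(nmol) for name, nmol in polymerized_nmol.items()}

for name, units in activity_units.items():
    print(f"{name}: {units:.2f} U")
```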

These results show that the DC5 fusion protein, DC24 fusion protein, DC4 fusion protein, DC35 fusion protein, DC36 fusion protein, DC8 fusion protein, DC40 fusion protein and DC17 fusion protein all have DNA polymerase activity, and that this activity is not reduced compared to the DC fusion protein.

5. Kinetics of single base incorporation of DNA polymerase

In this example, the relative reaction rates of DC fusion protein, DC5 fusion protein, DC1 fusion protein, DC4 fusion protein, DC6 fusion protein, DC7 fusion protein, DC8 fusion protein, DC9 fusion protein, DC17 fusion protein and DC24 fusion protein were measured by a microplate reader using a Cy3 fluorescent dye-labeled dATP (dATP-Cy3) and a Cy5 fluorescent dye-labeled DNA template (template DNA-Cy5), and the specific experimental methods were as follows:

The single-stranded primers S1A (sequence 21 in the sequence table) and S2A (sequence 22 in the sequence table) with a 5' Cy5 fluorescent label (the two primers serve as primer and template for each other) were mixed at a ratio of 1:1 and annealed at 65 ℃ for 1 min, 40 ℃ for 1 min and 4 ℃ for 10 min; the annealed product was stored at -20 ℃ protected from light, giving the Cy5 fluorescent dye-labeled template DNA-Cy5.

Enzyme activity was measured with a BioTek microplate reader. The reactions were carried out in 384-well plates (Corning, black, clear bottom) with a volume of 50 μl per well; dATP was in excess in the reaction system, and the template DNA was tested at 8 concentrations of 2, 4, 5, 8, 10, 20, 40 and 80 nmol/50 μl. The reaction temperature was 25 ℃. The specific reaction system is as follows:

2 U of DC fusion protein, DC5 fusion protein, DC4 fusion protein, DC6 fusion protein, DC7 fusion protein, DC8 fusion protein, DC9 fusion protein, DC17 fusion protein or DC24 fusion protein (one protein per reaction system), 1 μM dATP-Cy3, 10 μM dTTP, 10 μM dCTP, 10 μM dGTP, and template DNA-Cy5 at one of the 8 concentrations of 2, 4, 5, 8, 10, 20, 40 and 80 nmol/50 μl (one template DNA-Cy5 concentration per reaction system). The reaction was carried out in enzyme reaction buffer (solvent: water; solutes and their concentrations: 20 mM Tris-HCl, 10 mM (NH₄)₂SO₄, 10 mM KCl and 2 mM MgSO₄; pH 8.8).

The enzyme reaction is carried out in a dynamic detection mode, data is recorded every 5min of the reaction, and the detection condition is

The reaction rate can be approximately calculated from the change in relative fluorescence value over time.

The reaction rate depends on the concentration of the template DNA-Cy5, so the Km value (i.e., the concentration of the substrate, here template DNA-Cy5, at which the reaction rate reaches half of the maximum reaction rate) can be approximately determined by measuring the DNA polymerase activity at different concentrations of template DNA-Cy5.
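One common way to estimate Km from rates measured at several substrate concentrations is a Hanes-Woolf linearization of the Michaelis-Menten equation, v = Vmax·[S]/(Km + [S]), which gives [S]/v = [S]/Vmax + Km/Vmax. The following is a minimal sketch of that approach; it is not stated to be the fitting procedure actually used here. The substrate concentrations are the eight template levels listed above, while the rates are synthetic values generated from an assumed Km of 29 and an assumed Vmax of 1.0 purely to show that the fit recovers them.

```python
# Sketch: Km estimation by Hanes-Woolf linearization of Michaelis-Menten data.
def hanes_woolf(substrate, rates):
    """Fit [S]/v = [S]/Vmax + Km/Vmax by least squares; return (Km, Vmax)."""
    ys = [s / v for s, v in zip(substrate, rates)]
    n = len(substrate)
    mean_x = sum(substrate) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in substrate)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(substrate, ys))
    slope = sxy / sxx                  # equals 1 / Vmax
    intercept = mean_y - slope * mean_x  # equals Km / Vmax
    return intercept / slope, 1.0 / slope


# Template DNA-Cy5 levels used in the assay (units as given in the text).
template_levels = [2.0, 4.0, 5.0, 8.0, 10.0, 20.0, 40.0, 80.0]

# SYNTHETIC rates from assumed parameters (not measured data).
ASSUMED_KM = 29.0
ASSUMED_VMAX = 1.0
synthetic_rates = [ASSUMED_VMAX * s / (ASSUMED_KM + s) for s in template_levels]

km_estimate, vmax_estimate = hanes_woolf(template_levels, synthetic_rates)
print(f"Km = {km_estimate:.2f}, Vmax = {vmax_estimate:.2f}")
```

With real plate-reader data the rates would carry noise, and a nonlinear fit of the Michaelis-Menten equation is often preferred because linearizations weight the errors unevenly.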

The Km values of DC fusion protein, DC5 fusion protein, DC24 fusion protein, DC4 fusion protein, DC35 fusion protein, DC36 fusion protein, DC8 fusion protein, DC40 fusion protein and DC17 fusion protein were 29, 38.4, 39, 42.1, 40.3, 40.6, 38, 43.3, 42.6 and 41.4, respectively. The Km values of DC5 fusion protein, DC24 fusion protein, DC4 fusion protein, DC35 fusion protein, DC36 fusion protein, DC8 fusion protein, DC40 fusion protein and DC17 fusion protein were all significantly increased compared to the DC fusion protein, indicating that the affinities of these fusion proteins for the template DNA were all significantly decreased.

<110> Shenzhen Hua Dagen research institute, Shenzhen Hua Dagen science and technology Limited

<120> DNA polymerase and method for preparing the same

<160> 22

<170> PatentIn version 3.5

<210> 1

<211> 775

<212> PRT

<213> Artificial sequence

<220>

<223>

<400> 1

Met Ile Leu Asp Thr Asp Tyr Ile Thr Glu Asn Gly Lys Pro Val Ile

1 5 10 15

Arg Val Phe Lys Lys Glu Asn Gly Glu Phe Lys Ile Glu Tyr Asp Arg

20 25 30

Thr Phe Glu Pro Tyr Phe Tyr Ala Leu Leu Lys Asp Asp Ser Ala Ile

35 40 45

Glu Asp Val Lys Lys Val Thr Ala Lys Arg His Gly Thr Val Val Lys

50 55 60

Val Lys Arg Ala Glu Lys Val Gln Lys Lys Phe Leu Gly Arg Pro Ile

65 70 75 80

Glu Val Trp Lys Leu Tyr Phe Asn His Pro Gln Asp Val Pro Ala Ile

85 90 95

Arg Asp Arg Ile Arg Ala His Pro Ala Val Val Asp Ile Tyr Glu Tyr

100 105 110

Asp Ile Pro Phe Ala Lys Arg Tyr Leu Ile Asp Lys Gly Leu Ile Pro

115 120 125

Met Glu Gly Asp Glu Glu Leu Thr Met Leu Ala Phe Ala Ile Ala Thr

130 135 140

Leu Tyr His Glu Gly Glu Glu Phe Gly Thr Gly Pro Ile Leu Met Ile

145 150 155 160

Ser Tyr Ala Asp Gly Ser Glu Ala Arg Val Ile Thr Trp Lys Lys Ile

165 170 175

Asp Leu Pro Tyr Val Asp Val Val Ser Thr Glu Lys Glu Met Ile Lys

180 185 190

Arg Phe Leu Arg Val Val Arg Glu Lys Asp Pro Asp Val Leu Ile Thr

195 200 205

Tyr Asn Gly Asp Asn Phe Asp Phe Ala Tyr Leu Lys Lys Arg Cys Glu

210 215 220

Glu Leu Gly Ile Lys Phe Thr Leu Gly Arg Asp Gly Ser Glu Pro Lys

225 230 235 240

Ile Gln Arg Met Gly Asp Arg Phe Ala Val Glu Val Lys Gly Arg Ile

245 250 255

His Phe Asp Leu Tyr Pro Val Ile Arg Arg Thr Ile Asn Leu Pro Thr

260 265 270

Tyr Thr Leu Glu Ala Val Tyr Glu Ala Val Phe Gly Lys Pro Lys Glu

275 280 285

Lys Val Tyr Ala Glu Glu Ile Ala Gln Ala Trp Glu Ser Gly Glu Gly

290 295 300

Leu Glu Arg Val Ala Arg Tyr Ser Met Glu Asp Ala Lys Val Thr Tyr

305 310 315 320

Glu Leu Gly Arg Glu Phe Phe Pro Met Glu Ala Gln Leu Ser Arg Leu

325 330 335

Ile Gly Gln Ser Leu Trp Asp Val Ser Arg Ser Ser Thr Gly Asn Leu

340 345 350

Val Glu Trp Phe Leu Leu Arg Lys Ala Tyr Lys Arg Asn Glu Leu Ala

355 360 365

Pro Asn Lys Pro Asp Glu Arg Glu Leu Ala Arg Arg Arg Gly Gly Tyr

370 375 380

Ala Gly Gly Tyr Val Lys Glu Pro Glu Arg Gly Leu Trp Asp Asn Ile

385 390 395 400

Val Tyr Leu Asp Phe Arg Ser Leu Tyr Pro Ser Ile Ile Ile Thr His

405 410 415

Asn Val Ser Pro Asp Thr Leu Asn Arg Glu Gly Cys Lys Glu Tyr Asp

420 425 430

Val Ala Pro Glu Val Gly His Lys Phe Cys Lys Asp Phe Pro Gly Phe

435 440 445

Ile Pro Ser Leu Leu Gly Asp Leu Leu Glu Glu Arg Gln Lys Ile Lys

450 455 460

Arg Lys Met Lys Ala Thr Val Asp Pro Leu Glu Lys Lys Leu Leu Asp

465 470 475 480

Tyr Arg Gln Arg Ala Ile Lys Ile Leu Ala Asn Ser Phe Tyr Gly Tyr

485 490 495

Tyr Gly Tyr Ala Lys Ala Arg Trp Tyr Cys Lys Glu Cys Ala Glu Ser

500 505 510

Val Thr Ala Trp Gly Arg Glu Tyr Ile Glu Met Val Ile Arg Glu Leu

515 520 525

Glu Glu Lys Phe Gly Phe Lys Val Leu Tyr Ala Asp Thr Asp Gly Leu

530 535 540

His Ala Thr Ile Pro Gly Ala Asp Ala Glu Thr Val Lys Lys Lys Ala

545 550 555 560

Lys Glu Phe Leu Lys Tyr Ile Asn Pro Lys Leu Pro Gly Leu Leu Glu

565 570 575

Leu Glu Tyr Glu Gly Phe Tyr Val Arg Gly Phe Phe Val Thr Lys Lys

580 585 590

Lys Tyr Ala Val Ile Asp Glu Glu Gly Lys Ile Thr Thr Arg Gly Leu

595 600 605

Glu Ile Val Arg Arg Asp Trp Ser Glu Ile Ala Lys Glu Thr Gln Ala

610 615 620

Arg Val Leu Glu Ala Ile Leu Lys His Gly Asp Val Glu Glu Ala Val

625 630 635 640

Arg Ile Val Lys Glu Val Thr Glu Lys Leu Ser Lys Tyr Glu Val Pro

645 650 655

Pro Glu Lys Leu Val Ile His Glu Gln Ile Thr Arg Asp Leu Arg Asp

660 665 670

Tyr Lys Ala Thr Gly Pro His Val Ala Val Ala Lys Arg Leu Ala Ala

675 680 685

Arg Gly Val Lys Ile Arg Pro Gly Thr Val Ile Ser Tyr Ile Val Leu

690 695 700

Lys Gly Ser Gly Arg Ile Gly Asp Arg Ala Ile Pro Ala Asp Glu Phe

705 710 715 720

Asp Pro Thr Lys His Arg Tyr Asp Ala Glu Tyr Tyr Ile Glu Asn Gln

725 730 735

Val Leu Pro Ala Val Glu Arg Ile Leu Lys Ala Phe Gly Tyr Arg Lys

740 745 750

Glu Asp Leu Arg Tyr Gln Lys Thr Lys Gln Val Gly Leu Gly Ala Trp

755 760 765

Leu Lys Val Lys Gly Lys Lys

770 775

<210> 2

<211> 788

<212> PRT

<213> Artificial sequence

<220>

<223>

<400> 2

Met His His His His His His Glu Asn Leu Tyr Phe Gln Gly Ile Leu

1 5 10 15

Asp Thr Asp Tyr Ile Thr Glu Asn Gly Lys Pro Val Ile Arg Val Phe

20 25 30

Lys Lys Glu Asn Gly Glu Phe Lys Ile Glu Tyr Asp Arg Thr Phe Glu

35 40 45

Pro Tyr Phe Tyr Ala Leu Leu Lys Asp Asp Ser Ala Ile Glu Asp Val

50 55 60

Lys Lys Val Thr Ala Lys Arg His Gly Thr Val Val Lys Val Lys Arg

65 70 75 80

Ala Glu Lys Val Gln Lys Lys Phe Leu Gly Arg Pro Ile Glu Val Trp

85 90 95

Lys Leu Tyr Phe Asn His Pro Gln Asp Val Pro Ala Ile Arg Asp Arg

100 105 110

Ile Arg Ala His Pro Ala Val Val Asp Ile Tyr Glu Tyr Asp Ile Pro

115 120 125

Phe Ala Lys Arg Tyr Leu Ile Asp Lys Gly Leu Ile Pro Met Glu Gly

130 135 140

Asp Glu Glu Leu Thr Met Leu Ala Phe Ala Ile Ala Thr Leu Tyr His

145 150 155 160

Glu Gly Glu Glu Phe Gly Thr Gly Pro Ile Leu Met Ile Ser Tyr Ala

165 170 175

Asp Gly Ser Glu Ala Arg Val Ile Thr Trp Lys Lys Ile Asp Leu Pro

180 185 190

Tyr Val Asp Val Val Ser Thr Glu Lys Glu Met Ile Lys Arg Phe Leu

195 200 205

Arg Val Val Arg Glu Lys Asp Pro Asp Val Leu Ile Thr Tyr Asn Gly

210 215 220

Asp Asn Phe Asp Phe Ala Tyr Leu Lys Lys Arg Cys Glu Glu Leu Gly

225 230 235 240

Ile Lys Phe Thr Leu Gly Arg Asp Gly Ser Glu Pro Lys Ile Gln Arg

245 250 255

Met Gly Asp Arg Phe Ala Val Glu Val Lys Gly Arg Ile His Phe Asp

260 265 270

Leu Tyr Pro Val Ile Arg Arg Thr Ile Asn Leu Pro Thr Tyr Thr Leu

275 280 285

Glu Ala Val Tyr Glu Ala Val Phe Gly Lys Pro Lys Glu Lys Val Tyr

290 295 300

Ala Glu Glu Ile Ala Gln Ala Trp Glu Ser Gly Glu Gly Leu Glu Arg

305 310 315 320

Val Ala Arg Tyr Ser Met Glu Asp Ala Lys Val Thr Tyr Glu Leu Gly

325 330 335

Arg Glu Phe Phe Pro Met Glu Ala Gln Leu Ser Arg Leu Ile Gly Gln

340 345 350

Ser Leu Trp Asp Val Ser Arg Ser Ser Thr Gly Asn Leu Val Glu Trp

355 360 365

Phe Leu Leu Arg Lys Ala Tyr Lys Arg Asn Glu Leu Ala Pro Asn Lys

370 375 380

Pro Asp Glu Arg Glu Leu Ala Arg Arg Arg Gly Gly Tyr Ala Gly Gly

385 390 395 400

Tyr Val Lys Glu Pro Glu Arg Gly Leu Trp Asp Asn Ile Val Tyr Leu

405 410 415

Asp Phe Arg Ser Leu Tyr Pro Ser Ile Ile Ile Thr His Asn Val Ser

420 425 430

Pro Asp Thr Leu Asn Arg Glu Gly Cys Lys Glu Tyr Asp Val Ala Pro

435 440 445

Glu Val Gly His Lys Phe Cys Lys Asp Phe Pro Gly Phe Ile Pro Ser

450 455 460

Leu Leu Gly Asp Leu Leu Glu Glu Arg Gln Lys Ile Lys Arg Lys Met

465 470 475 480

Lys Ala Thr Val Asp Pro Leu Glu Lys Lys Leu Leu Asp Tyr Arg Gln

485 490 495

Arg Ala Ile Lys Ile Leu Ala Asn Ser Phe Tyr Gly Tyr Tyr Gly Tyr

500 505 510

Ala Lys Ala Arg Trp Tyr Cys Lys Glu Cys Ala Glu Ser Val Thr Ala

515 520 525

Trp Gly Arg Glu Tyr Ile Glu Met Val Ile Arg Glu Leu Glu Glu Lys

530 535 540

Phe Gly Phe Lys Val Leu Tyr Ala Asp Thr Asp Gly Leu His Ala Thr

545 550 555 560

Ile Pro Gly Ala Asp Ala Glu Thr Val Lys Lys Lys Ala Lys Glu Phe

565 570 575

Leu Lys Tyr Ile Asn Pro Lys Leu Pro Gly Leu Leu Glu Leu Glu Tyr

580 585 590

Glu Gly Phe Tyr Val Arg Gly Phe Phe Val Thr Lys Lys Lys Tyr Ala

595 600 605

Val Ile Asp Glu Glu Gly Lys Ile Thr Thr Arg Gly Leu Glu Ile Val

610 615 620

Arg Arg Asp Trp Ser Glu Ile Ala Lys Glu Thr Gln Ala Arg Val Leu

625 630 635 640

Glu Ala Ile Leu Lys His Gly Asp Val Glu Glu Ala Val Arg Ile Val

645 650 655

Lys Glu Val Thr Glu Lys Leu Ser Lys Tyr Glu Val Pro Pro Glu Lys

660 665 670

Leu Val Ile His Glu Gln Ile Thr Arg Asp Leu Arg Asp Tyr Ala Ala

675 680 685

Thr Gly Pro His Val Ala Val Ala Lys Arg Leu Ala Ala Arg Gly Val

690 695 700

Lys Ile Arg Pro Gly Thr Val Ile Ser Tyr Ile Val Leu Lys Gly Ser

705 710 715 720

Gly Arg Ile Gly Asp Arg Ala Ile Pro Ala Asp Glu Phe Asp Pro Thr

725 730 735

Lys His Arg Tyr Asp Ala Glu Tyr Tyr Ile Glu Asn Gln Val Leu Pro

740 745 750

Ala Val Glu Arg Ile Leu Lys Ala Phe Gly Tyr Arg Lys Glu Asp Leu

755 760 765

Arg Tyr Gln Lys Thr Lys Gln Val Gly Leu Gly Ala Trp Leu Lys Val

770 775 780

Lys Gly Lys Lys

785

<210> 3

<211> 788

<212> PRT

<213> Artificial sequence

<220>

<223>

<400> 3

Met His His His His His His Glu Asn Leu Tyr Phe Gln Gly Ile Leu

1 5 10 15

Asp Thr Asp Tyr Ile Thr Glu Asn Gly Lys Pro Val Ile Arg Val Phe

20 25 30

Lys Lys Glu Asn Gly Glu Phe Lys Ile Glu Tyr Asp Arg Thr Phe Glu

35 40 45

Pro Tyr Phe Tyr Ala Leu Leu Lys Asp Asp Ser Ala Ile Glu Asp Val

50 55 60

Lys Lys Val Thr Ala Lys Arg His Gly Thr Val Val Lys Val Lys Arg

65 70 75 80

Ala Glu Lys Val Gln Lys Lys Phe Leu Gly Arg Pro Ile Glu Val Trp

85 90 95

Lys Leu Tyr Phe Asn His Pro Gln Asp Val Pro Ala Ile Arg Asp Arg

100 105 110

Ile Arg Ala His Pro Ala Val Val Asp Ile Tyr Glu Tyr Asp Ile Pro

115 120 125

Phe Ala Lys Arg Tyr Leu Ile Asp Lys Gly Leu Ile Pro Met Glu Gly

130 135 140

Asp Glu Glu Leu Thr Met Leu Ala Phe Ala Ile Ala Thr Leu Tyr His

145 150 155 160

Glu Gly Glu Glu Phe Gly Thr Gly Pro Ile Leu Met Ile Ser Tyr Ala

165 170 175

Asp Gly Ser Glu Ala Arg Val Ile Thr Trp Lys Lys Ile Asp Leu Pro

180 185 190

Tyr Val Asp Val Val Ser Thr Glu Lys Glu Met Ile Lys Arg Phe Leu

195 200 205

Arg Val Val Arg Glu Lys Asp Pro Asp Val Leu Ile Thr Tyr Asn Gly

210 215 220

Asp Asn Phe Asp Phe Ala Tyr Leu Lys Lys Arg Cys Glu Glu Leu Gly

225 230 235 240

Ile Lys Phe Thr Leu Gly Arg Asp Gly Ser Glu Pro Lys Ile Gln Arg

245 250 255

Met Gly Asp Arg Phe Ala Val Glu Val Lys Gly Arg Ile His Phe Asp

260 265 270

Leu Tyr Pro Val Ile Arg Arg Thr Ile Asn Leu Pro Thr Tyr Thr Leu

275 280 285

Glu Ala Val Tyr Glu Ala Val Phe Gly Lys Pro Lys Glu Lys Val Tyr

290 295 300

Ala Glu Glu Ile Ala Gln Ala Trp Glu Ser Gly Glu Gly Leu Glu Arg

305 310 315 320

Val Ala Arg Tyr Ser Met Glu Asp Ala Lys Val Thr Tyr Glu Leu Gly

325 330 335

Arg Glu Phe Phe Pro Met Glu Ala Gln Leu Ser Arg Leu Ile Gly Gln

340 345 350

Ser Leu Trp Asp Val Ser Arg Ser Ser Thr Gly Asn Leu Val Glu Trp

355 360 365

Phe Leu Leu Arg Lys Ala Tyr Lys Arg Asn Glu Leu Ala Pro Asn Lys

370 375 380

Pro Asp Glu Arg Glu Leu Ala Arg Arg Arg Gly Gly Tyr Ala Gly Gly

385 390 395 400

Tyr Val Lys Glu Pro Glu Arg Gly Leu Trp Asp Asn Ile Val Tyr Leu

405 410 415

Asp Phe Arg Ser Leu Tyr Pro Ser Ile Ile Ile Thr His Asn Val Ser

420 425 430

Pro Asp Thr Leu Asn Arg Glu Gly Cys Lys Glu Tyr Asp Val Ala Pro

435 440 445

Glu Val Gly His Lys Phe Cys Lys Asp Phe Pro Gly Phe Ile Pro Ser

450 455 460

Leu Leu Gly Asp Leu Leu Glu Glu Arg Gln Lys Ile Lys Arg Lys Met

465 470 475 480

Lys Ala Thr Val Asp Pro Leu Glu Lys Lys Leu Leu Asp Tyr Arg Gln

485 490 495

Arg Ala Ile Lys Ile Leu Ala Asn Ser Phe Tyr Gly Tyr Tyr Gly Tyr

500 505 510

Ala Lys Ala Arg Trp Tyr Cys Lys Glu Cys Ala Glu Ser Val Thr Ala

515 520 525

Trp Gly Arg Glu Tyr Ile Glu Met Val Ile Arg Glu Leu Glu Glu Lys

530 535 540

Phe Gly Phe Lys Val Leu Tyr Ala Asp Thr Asp Gly Leu His Ala Thr

545 550 555 560

Ile Pro Gly Ala Asp Ala Glu Thr Val Lys Lys Lys Ala Lys Glu Phe

565 570 575

Leu Lys Tyr Ile Asn Pro Lys Leu Pro Gly Leu Leu Glu Leu Glu Tyr

580 585 590

Glu Gly Phe Tyr Val Arg Gly Phe Phe Val Thr Lys Lys Lys Tyr Ala

595 600 605

Val Ile Asp Glu Glu Gly Lys Ile Thr Thr Arg Gly Leu Glu Ile Val

610 615 620

Arg Arg Asp Trp Ser Glu Ile Ala Lys Glu Thr Gln Ala Arg Val Leu

625 630 635 640

Glu Ala Ile Leu Lys His Gly Asp Val Glu Glu Ala Val Arg Ile Val

645 650 655

Lys Glu Val Thr Glu Lys Leu Ser Lys Tyr Glu Val Pro Pro Glu Lys

660 665 670

Leu Val Ile His Glu Ala Ile Ala Arg Asp Leu Arg Asp Tyr Lys Ala

675 680 685

Thr Gly Pro His Val Ala Val Ala Lys Arg Leu Ala Ala Arg Gly Val

690 695 700

Lys Ile Arg Pro Gly Thr Val Ile Ser Tyr Ile Val Leu Lys Gly Ser

705 710 715 720

Gly Arg Ile Gly Asp Arg Ala Ile Pro Ala Asp Glu Phe Asp Pro Thr

725 730 735

Lys His Arg Tyr Asp Ala Glu Tyr Tyr Ile Glu Asn Gln Val Leu Pro

740 745 750

Ala Val Glu Arg Ile Leu Lys Ala Phe Gly Tyr Arg Lys Glu Asp Leu

755 760 765

Arg Tyr Gln Lys Thr Lys Gln Val Gly Leu Gly Ala Trp Leu Lys Val

770 775 780

Lys Gly Lys Lys

785

<210> 4

<211> 788

<212> PRT

<213> Artificial sequence

<220>

<223>

<400> 4

Met His His His His His His Glu Asn Leu Tyr Phe Gln Gly Ile Leu

1 5 10 15

Asp Thr Asp Tyr Ile Thr Glu Asn Gly Lys Pro Val Ile Arg Val Phe

20 25 30

Lys Lys Glu Asn Gly Glu Phe Lys Ile Glu Tyr Asp Arg Thr Phe Glu

35 40 45

Pro Tyr Phe Tyr Ala Leu Leu Lys Asp Asp Ser Ala Ile Glu Asp Val

50 55 60

Lys Lys Val Thr Ala Lys Arg His Gly Thr Val Val Lys Val Lys Arg

65 70 75 80

Ala Glu Lys Val Gln Lys Lys Phe Leu Gly Arg Pro Ile Glu Val Trp

85 90 95

Lys Leu Tyr Phe Asn His Pro Gln Asp Val Pro Ala Ile Arg Asp Arg

100 105 110

Ile Arg Ala His Pro Ala Val Val Asp Ile Tyr Glu Tyr Asp Ile Pro

115 120 125

Phe Ala Lys Arg Tyr Leu Ile Asp Lys Gly Leu Ile Pro Met Glu Gly

130 135 140

Asp Glu Glu Leu Thr Met Leu Ala Phe Ala Ile Ala Thr Leu Tyr His

145 150 155 160

Glu Gly Glu Glu Phe Gly Thr Gly Pro Ile Leu Met Ile Ser Tyr Ala

165 170 175

Asp Gly Ser Glu Ala Arg Val Ile Thr Trp Lys Lys Ile Asp Leu Pro

180 185 190

Tyr Val Asp Val Val Ser Thr Glu Lys Glu Met Ile Lys Arg Phe Leu

195 200 205

Arg Val Val Arg Glu Lys Asp Pro Asp Val Leu Ile Thr Tyr Asn Gly

210 215 220

Asp Asn Phe Asp Phe Ala Tyr Leu Lys Lys Arg Cys Glu Glu Leu Gly

225 230 235 240

Ile Lys Phe Thr Leu Gly Arg Asp Gly Ser Glu Pro Lys Ile Gln Arg

245 250 255

Met Gly Asp Arg Phe Ala Val Glu Val Lys Gly Arg Ile His Phe Asp

260 265 270

Leu Tyr Pro Val Ile Arg Arg Thr Ile Asn Leu Pro Thr Tyr Thr Leu

275 280 285

Glu Ala Val Tyr Glu Ala Val Phe Gly Lys Pro Lys Glu Lys Val Tyr

290 295 300

Ala Glu Glu Ile Ala Gln Ala Trp Glu Ser Gly Glu Gly Leu Glu Arg

305 310 315 320

Val Ala Arg Tyr Ser Met Glu Asp Ala Lys Val Thr Tyr Glu Leu Gly

325 330 335

Arg Glu Phe Phe Pro Met Glu Ala Gln Leu Ser Arg Leu Ile Gly Gln

340 345 350

Ser Leu Trp Asp Val Ser Arg Ser Ser Thr Gly Asn Leu Val Glu Trp

355 360 365

Phe Leu Leu Arg Lys Ala Tyr Lys Arg Asn Glu Leu Ala Pro Asn Lys

370 375 380

Pro Asp Glu Arg Glu Leu Ala Arg Arg Arg Gly Gly Tyr Ala Gly Gly

385 390 395 400

Tyr Val Lys Glu Pro Glu Arg Gly Leu Trp Asp Asn Ile Val Tyr Leu

405 410 415

Asp Phe Arg Ser Leu Tyr Pro Ser Ile Ile Ile Thr His Asn Val Ser

420 425 430

Pro Asp Thr Leu Asn Arg Glu Gly Cys Lys Glu Tyr Asp Val Ala Pro

435 440 445

Glu Val Gly His Lys Phe Cys Lys Asp Phe Pro Gly Phe Ile Pro Ser

450 455 460

Leu Leu Gly Asp Leu Leu Glu Glu Arg Gln Lys Ile Lys Arg Lys Met

465 470 475 480

Lys Ala Thr Val Asp Pro Leu Glu Lys Lys Leu Leu Asp Tyr Arg Gln

485 490 495

Arg Ala Ile Lys Ile Leu Ala Asn Ser Phe Tyr Gly Tyr Tyr Gly Tyr

500 505 510

Ala Lys Ala Arg Trp Tyr Cys Lys Glu Cys Ala Glu Ser Val Thr Ala

515 520 525

Trp Gly Arg Glu Tyr Ile Glu Met Val Ile Arg Glu Leu Glu Glu Lys

530 535 540

Phe Gly Phe Lys Val Leu Tyr Ala Asp Thr Asp Gly Leu His Ala Thr

545 550 555 560

Ile Pro Gly Ala Asp Ala Glu Thr Val Lys Lys Lys Ala Lys Glu Phe

565 570 575

Leu Lys Tyr Ile Asn Pro Lys Leu Pro Gly Leu Leu Glu Leu Glu Tyr

580 585 590

Glu Gly Phe Tyr Val Arg Gly Phe Phe Val Thr Lys Lys Lys Tyr Ala

595 600 605

Val Ile Asp Glu Glu Gly Lys Ile Thr Thr Arg Gly Leu Glu Ile Val

610 615 620

Arg Arg Asp Trp Ser Glu Ile Ala Lys Glu Thr Gln Ala Arg Val Leu

625 630 635 640

Glu Ala Ile Leu Lys His Gly Asp Val Glu Glu Ala Val Arg Ile Val

645 650 655

Lys Glu Val Thr Glu Lys Leu Ser Lys Tyr Glu Val Pro Pro Glu Lys

660 665 670

Leu Val Ile His Glu Gln Ile Thr Ala Asp Leu Arg Asp Tyr Lys Ala

675 680 685

Thr Gly Pro His Val Ala Val Ala Lys Arg Leu Ala Ala Arg Gly Val

690 695 700

Lys Ile Arg Pro Gly Thr Val Ile Ser Tyr Ile Val Leu Lys Gly Ser

705 710 715 720

Gly Arg Ile Gly Asp Arg Ala Ile Pro Ala Asp Glu Phe Asp Pro Thr

725 730 735

Lys His Arg Tyr Asp Ala Glu Tyr Tyr Ile Glu Asn Gln Val Leu Pro

740 745 750

Ala Val Glu Arg Ile Leu Lys Ala Phe Gly Tyr Arg Lys Glu Asp Leu

755 760 765

Arg Tyr Gln Lys Thr Lys Gln Val Gly Leu Gly Ala Trp Leu Lys Val

770 775 780

Lys Gly Lys Lys

785

<210> 5

<211> 788

<212> PRT

<213> Artificial sequence

<220>

<223>

<400> 5

Met His His His His His His Glu Asn Leu Tyr Phe Gln Gly Ile Leu

1 5 10 15

Asp Thr Asp Tyr Ile Thr Glu Asn Gly Lys Pro Val Ile Arg Val Phe

20 25 30

Lys Lys Glu Asn Gly Glu Phe Lys Ile Glu Tyr Asp Arg Thr Phe Glu

35 40 45

Pro Tyr Phe Tyr Ala Leu Leu Lys Asp Asp Ser Ala Ile Glu Asp Val

50 55 60

Lys Lys Val Thr Ala Lys Arg His Gly Thr Val Val Lys Val Lys Arg

65 70 75 80

Ala Glu Lys Val Gln Lys Lys Phe Leu Gly Arg Pro Ile Glu Val Trp

85 90 95

Lys Leu Tyr Phe Asn His Pro Gln Asp Val Pro Ala Ile Arg Asp Arg

100 105 110

Ile Arg Ala His Pro Ala Val Val Asp Ile Tyr Glu Tyr Asp Ile Pro

115 120 125

Phe Ala Lys Arg Tyr Leu Ile Asp Lys Gly Leu Ile Pro Met Glu Gly

130 135 140

Asp Glu Glu Leu Thr Met Leu Ala Phe Ala Ile Ala Thr Leu Tyr His

145 150 155 160

Glu Gly Glu Glu Phe Gly Thr Gly Pro Ile Leu Met Ile Ser Tyr Ala

165 170 175

Asp Gly Ser Glu Ala Arg Val Ile Thr Trp Lys Lys Ile Asp Leu Pro

180 185 190

Tyr Val Asp Val Val Ser Thr Glu Lys Glu Met Ile Lys Arg Phe Leu

195 200 205

Arg Val Val Arg Glu Lys Asp Pro Asp Val Leu Ile Thr Tyr Asn Gly

210 215 220

Asp Asn Phe Asp Phe Ala Tyr Leu Lys Lys Arg Cys Glu Glu Leu Gly

225 230 235 240

Ile Lys Phe Thr Leu Gly Arg Asp Gly Ser Glu Pro Lys Ile Gln Arg

245 250 255

Met Gly Asp Arg Phe Ala Val Glu Val Lys Gly Arg Ile His Phe Asp

260 265 270

Leu Tyr Pro Val Ile Arg Arg Thr Ile Asn Leu Pro Thr Tyr Thr Leu

275 280 285

Glu Ala Val Tyr Glu Ala Val Phe Gly Lys Pro Lys Glu Lys Val Tyr

290 295 300

Ala Glu Glu Ile Ala Gln Ala Trp Glu Ser Gly Glu Gly Leu Glu Arg

305 310 315 320

Val Ala Arg Tyr Ser Met Glu Asp Ala Lys Val Thr Tyr Glu Leu Gly

325 330 335

Arg Glu Phe Phe Pro Met Glu Ala Gln Leu Ser Arg Leu Ile Gly Gln

340 345 350

Ser Leu Trp Asp Val Ser Arg Ser Ser Thr Gly Asn Leu Val Glu Trp

355 360 365

Phe Leu Leu Arg Lys Ala Tyr Lys Arg Asn Glu Leu Ala Pro Asn Lys

370 375 380

Pro Asp Glu Arg Glu Leu Ala Arg Arg Arg Gly Gly Tyr Ala Gly Gly

385 390 395 400

Tyr Val Lys Glu Pro Glu Arg Gly Leu Trp Asp Asn Ile Val Tyr Leu

405 410 415

Asp Phe Arg Ser Leu Tyr Pro Ser Ile Ile Ile Thr His Asn Val Ser

420 425 430

Pro Asp Thr Leu Asn Arg Glu Gly Cys Lys Glu Tyr Asp Val Ala Pro

435 440 445

Glu Val Gly His Lys Phe Cys Lys Asp Phe Pro Gly Phe Ile Pro Ser

450 455 460

Leu Leu Gly Asp Leu Leu Glu Glu Arg Gln Lys Ile Lys Arg Lys Met

465 470 475 480

Lys Ala Thr Val Asp Pro Leu Glu Lys Lys Leu Leu Asp Tyr Arg Gln

485 490 495

Arg Ala Ile Lys Ile Leu Ala Asn Ser Phe Tyr Gly Tyr Tyr Gly Tyr

500 505 510

Ala Lys Ala Arg Trp Tyr Cys Lys Glu Cys Ala Glu Ser Val Thr Ala

515 520 525

Trp Gly Arg Glu Tyr Ile Glu Met Val Ile Arg Glu Leu Glu Glu Lys

530 535 540

Phe Gly Phe Lys Val Leu Tyr Ala Asp Thr Asp Gly Leu His Ala Thr

545 550 555 560

Ile Pro Gly Ala Asp Ala Glu Thr Val Lys Lys Lys Ala Lys Glu Phe

565 570 575

Leu Lys Tyr Ile Asn Pro Lys Leu Pro Gly Leu Leu Glu Leu Glu Tyr

580 585 590

Glu Gly Phe Tyr Val Arg Gly Phe Phe Val Thr Lys Lys Lys Tyr Ala

595 600 605

Val Ile Asp Glu Glu Gly Lys Ile Thr Thr Arg Gly Leu Glu Ile Val

610 615 620

Arg Arg Asp Trp Ser Glu Ile Ala Lys Glu Thr Gln Ala Arg Val Leu

625 630 635 640

Glu Ala Ile Leu Lys His Gly Asp Val Glu Glu Ala Val Arg Ile Val

645 650 655

Lys Glu Val Thr Glu Lys Leu Ser Lys Tyr Glu Val Pro Pro Glu Lys

660 665 670

Leu Val Ile His Glu Ala Ile Thr Arg Asp Leu Arg Asp Tyr Lys Ala

675 680 685

Thr Gly Pro His Val Ala Val Ala Lys Arg Leu Ala Ala Arg Gly Val

690 695 700

Lys Ile Arg Pro Gly Thr Val Ile Ser Tyr Ile Val Leu Lys Gly Ser

705 710 715 720

Gly Arg Ile Gly Asp Arg Ala Ile Pro Ala Asp Glu Phe Asp Pro Thr

725 730 735

Lys His Arg Tyr Asp Ala Glu Tyr Tyr Ile Glu Asn Gln Val Leu Pro

740 745 750

Ala Val Glu Arg Ile Leu Lys Ala Phe Gly Tyr Arg Lys Glu Asp Leu

755 760 765

Arg Tyr Gln Lys Thr Lys Gln Val Gly Leu Gly Ala Trp Leu Lys Val

770 775 780

Lys Gly Lys Lys

785

<210> 6

<211> 788

<212> PRT

<213> Artificial sequence

<220>

<223>

<400> 6

Met His His His His His His Glu Asn Leu Tyr Phe Gln Gly Ile Leu

1 5 10 15

Asp Thr Asp Tyr Ile Thr Glu Asn Gly Lys Pro Val Ile Arg Val Phe

20 25 30

Lys Lys Glu Asn Gly Glu Phe Lys Ile Glu Tyr Asp Arg Thr Phe Glu

35 40 45

Pro Tyr Phe Tyr Ala Leu Leu Lys Asp Asp Ser Ala Ile Glu Asp Val

50 55 60

Lys Lys Val Thr Ala Lys Arg His Gly Thr Val Val Lys Val Lys Arg

65 70 75 80

Ala Glu Lys Val Gln Lys Lys Phe Leu Gly Arg Pro Ile Glu Val Trp

85 90 95

Lys Leu Tyr Phe Asn His Pro Gln Asp Val Pro Ala Ile Arg Asp Arg

100 105 110

Ile Arg Ala His Pro Ala Val Val Asp Ile Tyr Glu Tyr Asp Ile Pro

115 120 125

Phe Ala Lys Arg Tyr Leu Ile Asp Lys Gly Leu Ile Pro Met Glu Gly

130 135 140

Asp Glu Glu Leu Thr Met Leu Ala Phe Ala Ile Ala Thr Leu Tyr His

145 150 155 160

Glu Gly Glu Glu Phe Gly Thr Gly Pro Ile Leu Met Ile Ser Tyr Ala

165 170 175

Asp Gly Ser Glu Ala Arg Val Ile Thr Trp Lys Lys Ile Asp Leu Pro

180 185 190

Tyr Val Asp Val Val Ser Thr Glu Lys Glu Met Ile Lys Arg Phe Leu

195 200 205

Arg Val Val Arg Glu Lys Asp Pro Asp Val Leu Ile Thr Tyr Asn Gly

210 215 220

Asp Asn Phe Asp Phe Ala Tyr Leu Lys Lys Arg Cys Glu Glu Leu Gly

225 230 235 240

Ile Lys Phe Thr Leu Gly Arg Asp Gly Ser Glu Pro Lys Ile Gln Arg

245 250 255

Met Gly Asp Arg Phe Ala Val Glu Val Lys Gly Arg Ile His Phe Asp

260 265 270

Leu Tyr Pro Val Ile Arg Arg Thr Ile Asn Leu Pro Thr Tyr Thr Leu

275 280 285

Glu Ala Val Tyr Glu Ala Val Phe Gly Lys Pro Lys Glu Lys Val Tyr

290 295 300

Ala Glu Glu Ile Ala Gln Ala Trp Glu Ser Gly Glu Gly Leu Glu Arg

305 310 315 320

Val Ala Arg Tyr Ser Met Glu Asp Ala Lys Val Thr Tyr Glu Leu Gly

325 330 335

Arg Glu Phe Phe Pro Met Glu Ala Gln Leu Ser Arg Leu Ile Gly Gln

340 345 350

Ser Leu Trp Asp Val Ser Arg Ser Ser Thr Gly Asn Leu Val Glu Trp

355 360 365

Phe Leu Leu Arg Lys Ala Tyr Lys Arg Asn Glu Leu Ala Pro Asn Lys

370 375 380

Pro Asp Glu Arg Glu Leu Ala Arg Arg Arg Gly Gly Tyr Ala Gly Gly

385 390 395 400

Tyr Val Lys Glu Pro Glu Arg Gly Leu Trp Asp Asn Ile Val Tyr Leu

405 410 415

Asp Phe Arg Ser Leu Tyr Pro Ser Ile Ile Ile Thr His Asn Val Ser

420 425 430

Pro Asp Thr Leu Asn Arg Glu Gly Cys Lys Glu Tyr Asp Val Ala Pro

435 440 445

Glu Val Gly His Lys Phe Cys Lys Asp Phe Pro Gly Phe Ile Pro Ser

450 455 460

Leu Leu Gly Asp Leu Leu Glu Glu Arg Gln Lys Ile Lys Arg Lys Met

465 470 475 480

Lys Ala Thr Val Asp Pro Leu Glu Lys Lys Leu Leu Asp Tyr Arg Gln

485 490 495

Arg Ala Ile Lys Ile Leu Ala Asn Ser Phe Tyr Gly Tyr Tyr Gly Tyr

500 505 510

Ala Lys Ala Arg Trp Tyr Cys Lys Glu Cys Ala Glu Ser Val Thr Ala

515 520 525

Trp Gly Arg Glu Tyr Ile Glu Met Val Ile Arg Glu Leu Glu Glu Lys

530 535 540

Phe Gly Phe Lys Val Leu Tyr Ala Asp Thr Asp Gly Leu His Ala Thr

545 550 555 560

Ile Pro Gly Ala Asp Ala Glu Thr Val Lys Lys Lys Ala Lys Glu Phe

565 570 575

Leu Lys Tyr Ile Asn Pro Lys Leu Pro Gly Leu Leu Glu Leu Glu Tyr

580 585 590

Glu Gly Phe Tyr Val Arg Gly Phe Phe Val Thr Lys Lys Lys Tyr Ala

595 600 605

Val Ile Asp Glu Glu Gly Lys Ile Thr Thr Arg Gly Leu Glu Ile Val

610 615 620

Arg Arg Asp Trp Ser Glu Ile Ala Lys Glu Thr Gln Ala Arg Val Leu

625 630 635 640

Glu Ala Ile Leu Lys His Gly Asp Val Glu Glu Ala Val Arg Ile Val

645 650 655

Lys Glu Val Thr Glu Lys Leu Ser Lys Tyr Glu Val Pro Pro Glu Lys

660 665 670

Leu Val Ile His Glu Gln Ile Ala Arg Asp Leu Arg Asp Tyr Lys Ala

675 680 685

Thr Gly Pro His Val Ala Val Ala Lys Arg Leu Ala Ala Arg Gly Val

690 695 700

Lys Ile Arg Pro Gly Thr Val Ile Ser Tyr Ile Val Leu Lys Gly Ser

705 710 715 720

Gly Arg Ile Gly Asp Arg Ala Ile Pro Ala Asp Glu Phe Asp Pro Thr

725 730 735

Lys His Arg Tyr Asp Ala Glu Tyr Tyr Ile Glu Asn Gln Val Leu Pro

740 745 750

Ala Val Glu Arg Ile Leu Lys Ala Phe Gly Tyr Arg Lys Glu Asp Leu

755 760 765

Arg Tyr Gln Lys Thr Lys Gln Val Gly Leu Gly Ala Trp Leu Lys Val

770 775 780

Lys Gly Lys Lys

785

<210> 7

<211> 788

<212> PRT

<213> Artificial sequence

<220>

<223>

<400> 7

Met His His His His His His Glu Asn Leu Tyr Phe Gln Gly Ile Leu

1 5 10 15

Asp Thr Asp Tyr Ile Thr Glu Asn Gly Lys Pro Val Ile Arg Val Phe

20 25 30

Lys Lys Glu Asn Gly Glu Phe Lys Ile Glu Tyr Asp Arg Thr Phe Glu

35 40 45

Pro Tyr Phe Tyr Ala Leu Leu Lys Asp Asp Ser Ala Ile Glu Asp Val

50 55 60

Lys Lys Val Thr Ala Lys Arg His Gly Thr Val Val Lys Val Lys Arg

65 70 75 80

Ala Glu Lys Val Gln Lys Lys Phe Leu Gly Arg Pro Ile Glu Val Trp

85 90 95

Lys Leu Tyr Phe Asn His Pro Gln Asp Val Pro Ala Ile Arg Asp Arg

100 105 110

Ile Arg Ala His Pro Ala Val Val Asp Ile Tyr Glu Tyr Asp Ile Pro

115 120 125

Phe Ala Lys Arg Tyr Leu Ile Asp Lys Gly Leu Ile Pro Met Glu Gly

130 135 140

Asp Glu Glu Leu Thr Met Leu Ala Phe Ala Ile Ala Thr Leu Tyr His

145 150 155 160

Glu Gly Glu Glu Phe Gly Thr Gly Pro Ile Leu Met Ile Ser Tyr Ala

165 170 175

Asp Gly Ser Glu Ala Arg Val Ile Thr Trp Lys Lys Ile Asp Leu Pro

180 185 190

Tyr Val Asp Val Val Ser Thr Glu Lys Glu Met Ile Lys Arg Phe Leu

195 200 205

Arg Val Val Arg Glu Lys Asp Pro Asp Val Leu Ile Thr Tyr Asn Gly

210 215 220

Asp Asn Phe Asp Phe Ala Tyr Leu Lys Lys Arg Cys Glu Glu Leu Gly

225 230 235 240

Ile Lys Phe Thr Leu Gly Arg Asp Gly Ser Glu Pro Lys Ile Gln Arg

245 250 255

Met Gly Asp Arg Phe Ala Val Glu Val Lys Gly Arg Ile His Phe Asp

260 265 270

Leu Tyr Pro Val Ile Arg Arg Thr Ile Asn Leu Pro Thr Tyr Thr Leu

275 280 285

Glu Ala Val Tyr Glu Ala Val Phe Gly Lys Pro Lys Glu Lys Val Tyr

290 295 300

Ala Glu Glu Ile Ala Gln Ala Trp Glu Ser Gly Glu Gly Leu Glu Arg

305 310 315 320

Val Ala Arg Tyr Ser Met Glu Asp Ala Lys Val Thr Tyr Glu Leu Gly

325 330 335

Arg Glu Phe Phe Pro Met Glu Ala Gln Leu Ser Arg Leu Ile Gly Gln

340 345 350

Ser Leu Trp Asp Val Ser Arg Ser Ser Thr Gly Asn Leu Val Glu Trp

355 360 365

Phe Leu Leu Arg Lys Ala Tyr Lys Arg Asn Glu Leu Ala Pro Asn Lys

370 375 380

Pro Asp Glu Arg Glu Leu Ala Arg Arg Arg Gly Gly Tyr Ala Gly Gly

385 390 395 400

Tyr Val Lys Glu Pro Glu Arg Gly Leu Trp Asp Asn Ile Val Tyr Leu

405 410 415

Asp Phe Arg Ser Leu Tyr Pro Ser Ile Ile Ile Thr His Asn Val Ser

420 425 430

Pro Asp Thr Leu Asn Arg Glu Gly Cys Lys Glu Tyr Asp Val Ala Pro

435 440 445

Glu Val Gly His Lys Phe Cys Lys Asp Phe Pro Gly Phe Ile Pro Ser

450 455 460

Leu Leu Gly Asp Leu Leu Glu Glu Arg Gln Lys Ile Lys Arg Lys Met

465 470 475 480

Lys Ala Thr Val Asp Pro Leu Glu Lys Lys Leu Leu Asp Tyr Arg Gln

485 490 495

Arg Ala Ile Lys Ile Leu Ala Asn Ser Phe Tyr Gly Tyr Tyr Gly Tyr

500 505 510

Ala Lys Ala Arg Trp Tyr Cys Lys Glu Cys Ala Glu Ser Val Thr Ala

515 520 525

Trp Gly Arg Glu Tyr Ile Glu Met Val Ile Arg Glu Leu Glu Glu Lys

530 535 540

Phe Gly Phe Lys Val Leu Tyr Ala Asp Thr Asp Gly Leu His Ala Thr

545 550 555 560

Ile Pro Gly Ala Asp Ala Glu Thr Val Lys Lys Lys Ala Lys Glu Phe

565 570 575

Leu Lys Tyr Ile Asn Pro Lys Leu Pro Gly Leu Leu Glu Leu Glu Tyr

580 585 590

Glu Gly Phe Tyr Val Arg Gly Phe Phe Val Thr Lys Lys Lys Tyr Ala

595 600 605

Val Ile Asp Glu Glu Gly Lys Ile Thr Thr Arg Gly Leu Glu Ile Val

610 615 620

Arg Arg Asp Trp Ser Glu Ile Ala Lys Glu Thr Gln Ala Arg Val Leu

625 630 635 640

Glu Ala Ile Leu Lys His Gly Asp Val Glu Glu Ala Val Arg Ile Val

645 650 655

Lys Glu Val Thr Glu Lys Leu Ser Lys Tyr Glu Val Pro Pro Glu Lys

660 665 670

Leu Val Ile His Glu Ala Ile Ala Arg Asp Leu Arg Asp Tyr Lys Ala

675 680 685

Thr Gly Pro His Val Ala Val Ala Lys Arg Leu Ala Ala Arg Gly Val

690 695 700

Lys Ile Arg Pro Gly Thr Val Ile Ser Tyr Ile Val Leu Lys Gly Ser

705 710 715 720

Gly Arg Ile Gly Asp Arg Ala Ile Pro Ala Asp Glu Phe Asp Pro Thr

725 730 735

Lys His Arg Tyr Asp Ala Glu Tyr Tyr Ile Glu Ala Gln Val Leu Pro

740 745 750

Ala Val Glu Arg Ile Leu Lys Ala Phe Gly Tyr Arg Lys Glu Asp Leu

755 760 765

Arg Tyr Gln Lys Thr Lys Gln Val Gly Leu Gly Ala Trp Leu Lys Val

770 775 780

Lys Gly Lys Lys

785

<210> 8

<211> 788

<212> PRT

<213> Artificial sequence

<220>

<223>

<400> 8

Met His His His His His His Glu Asn Leu Tyr Phe Gln Gly Ile Leu

1 5 10 15

Asp Thr Asp Tyr Ile Thr Glu Asn Gly Lys Pro Val Ile Arg Val Phe

20 25 30

Lys Lys Glu Asn Gly Glu Phe Lys Ile Glu Tyr Asp Arg Thr Phe Glu

35 40 45

Pro Tyr Phe Tyr Ala Leu Leu Lys Asp Asp Ser Ala Ile Glu Asp Val

50 55 60

Lys Lys Val Thr Ala Lys Arg His Gly Thr Val Val Lys Val Lys Arg

65 70 75 80

Ala Glu Lys Val Gln Lys Lys Phe Leu Gly Arg Pro Ile Glu Val Trp

85 90 95

Lys Leu Tyr Phe Asn His Pro Gln Asp Val Pro Ala Ile Arg Asp Arg

100 105 110

Ile Arg Ala His Pro Ala Val Val Asp Ile Tyr Glu Tyr Asp Ile Pro

115 120 125

Phe Ala Lys Arg Tyr Leu Ile Asp Lys Gly Leu Ile Pro Met Glu Gly

130 135 140

Asp Glu Glu Leu Thr Met Leu Ala Phe Ala Ile Ala Thr Leu Tyr His

145 150 155 160

Glu Gly Glu Glu Phe Gly Thr Gly Pro Ile Leu Met Ile Ser Tyr Ala

165 170 175

Asp Gly Ser Glu Ala Arg Val Ile Thr Trp Lys Lys Ile Asp Leu Pro

180 185 190

Tyr Val Asp Val Val Ser Thr Glu Lys Glu Met Ile Lys Arg Phe Leu

195 200 205

Arg Val Val Arg Glu Lys Asp Pro Asp Val Leu Ile Thr Tyr Asn Gly

210 215 220

Asp Asn Phe Asp Phe Ala Tyr Leu Lys Lys Arg Cys Glu Glu Leu Gly

225 230 235 240

Ile Lys Phe Thr Leu Gly Arg Asp Gly Ser Glu Pro Lys Ile Gln Arg

245 250 255

Met Gly Asp Arg Phe Ala Val Glu Val Lys Gly Arg Ile His Phe Asp

260 265 270

Leu Tyr Pro Val Ile Arg Arg Thr Ile Asn Leu Pro Thr Tyr Thr Leu

275 280 285

Glu Ala Val Tyr Glu Ala Val Phe Gly Lys Pro Lys Glu Lys Val Tyr

290 295 300

Ala Glu Glu Ile Ala Gln Ala Trp Glu Ser Gly Glu Gly Leu Glu Arg

305 310 315 320

Val Ala Arg Tyr Ser Met Glu Asp Ala Lys Val Thr Tyr Glu Leu Gly

325 330 335

Arg Glu Phe Phe Pro Met Glu Ala Gln Leu Ser Arg Leu Ile Gly Gln

340 345 350

Ser Leu Trp Asp Val Ser Arg Ser Ser Thr Gly Asn Leu Val Glu Trp

355 360 365

Phe Leu Leu Arg Lys Ala Tyr Lys Arg Asn Glu Leu Ala Pro Asn Lys

370 375 380

Pro Asp Glu Arg Glu Leu Ala Arg Arg Arg Gly Gly Tyr Ala Gly Gly

385 390 395 400

Tyr Val Lys Glu Pro Glu Arg Gly Leu Trp Asp Asn Ile Val Tyr Leu

405 410 415

Asp Phe Arg Ser Leu Tyr Pro Ser Ile Ile Ile Thr His Asn Val Ser

420 425 430

Pro Asp Thr Leu Asn Arg Glu Gly Cys Lys Glu Tyr Asp Val Ala Pro

435 440 445

Glu Val Gly His Lys Phe Cys Lys Asp Phe Pro Gly Phe Ile Pro Ser

450 455 460

Leu Leu Gly Asp Leu Leu Glu Glu Arg Gln Lys Ile Lys Arg Lys Met

465 470 475 480

Lys Ala Thr Val Asp Pro Leu Glu Lys Lys Leu Leu Asp Tyr Arg Gln

485 490 495

Arg Ala Ile Lys Ile Leu Ala Asn Ser Phe Tyr Gly Tyr Tyr Gly Tyr

500 505 510

Ala Lys Ala Arg Trp Tyr Cys Lys Glu Cys Ala Glu Ser Val Thr Ala

515 520 525

Trp Gly Arg Glu Tyr Ile Glu Met Val Ile Arg Glu Leu Glu Glu Lys

530 535 540

Phe Gly Phe Lys Val Leu Tyr Ala Asp Thr Asp Gly Leu His Ala Thr

545 550 555 560

Ile Pro Gly Ala Asp Ala Glu Thr Val Lys Lys Lys Ala Lys Glu Phe

565 570 575

Leu Lys Tyr Ile Asn Pro Lys Leu Pro Gly Leu Leu Glu Leu Glu Tyr

580 585 590

Glu Gly Phe Tyr Val Arg Gly Phe Phe Val Thr Lys Lys Lys Tyr Ala

595 600 605

Val Ile Asp Glu Glu Gly Lys Ile Thr Thr Arg Gly Leu Glu Ile Val

610 615 620

Arg Arg Asp Trp Ser Glu Ile Ala Lys Glu Thr Gln Ala Arg Val Leu

625 630 635 640

Glu Ala Ile Leu Lys His Gly Asp Val Glu Glu Ala Val Arg Ile Val

645 650 655

Lys Glu Val Thr Glu Lys Leu Ser Lys Tyr Glu Val Pro Pro Glu Lys

660 665 670

Leu Val Ile His Glu Gln Ile Thr Arg Asp Leu Arg Asp Tyr Lys Ala

675 680 685

Thr Gly Pro Ala Val Ala Val Ala Lys Arg Leu Ala Ala Arg Gly Val

690 695 700

Lys Ile Arg Pro Gly Thr Val Ile Ser Tyr Ile Val Leu Lys Gly Ser

705 710 715 720

Gly Arg Ile Gly Asp Arg Ala Ile Pro Ala Asp Glu Phe Asp Pro Thr

725 730 735

Lys His Arg Tyr Asp Ala Glu Tyr Tyr Ile Glu Asn Gln Val Leu Pro

740 745 750

Ala Val Glu Arg Ile Leu Lys Ala Phe Gly Tyr Arg Lys Glu Asp Leu

755 760 765

Arg Tyr Gln Lys Thr Lys Gln Val Gly Leu Gly Ala Trp Leu Lys Val

770 775 780

Lys Gly Lys Lys

785

<210> 9

<211> 788

<212> PRT

<213> Artificial sequence

<220>

<223>

<400> 9

Met His His His His His His Glu Asn Leu Tyr Phe Gln Gly Ile Leu

1 5 10 15

Asp Thr Asp Tyr Ile Thr Glu Asn Gly Lys Pro Val Ile Arg Val Phe

20 25 30

Lys Lys Glu Asn Gly Glu Phe Lys Ile Glu Tyr Asp Arg Thr Phe Glu

35 40 45

Pro Tyr Phe Tyr Ala Leu Leu Lys Asp Asp Ser Ala Ile Glu Asp Val

50 55 60

Lys Lys Val Thr Ala Lys Arg His Gly Thr Val Val Lys Val Lys Arg

65 70 75 80

Ala Glu Lys Val Gln Lys Lys Phe Leu Gly Arg Pro Ile Glu Val Trp

85 90 95

Lys Leu Tyr Phe Asn His Pro Gln Asp Val Pro Ala Ile Arg Asp Arg

100 105 110

Ile Arg Ala His Pro Ala Val Val Asp Ile Tyr Glu Tyr Asp Ile Pro

115 120 125

Phe Ala Lys Arg Tyr Leu Ile Asp Lys Gly Leu Ile Pro Met Glu Gly

130 135 140

Asp Glu Glu Leu Thr Met Leu Ala Phe Ala Ile Ala Thr Leu Tyr His

145 150 155 160

Glu Gly Glu Glu Phe Gly Thr Gly Pro Ile Leu Met Ile Ser Tyr Ala

165 170 175

Asp Gly Ser Glu Ala Arg Val Ile Thr Trp Lys Lys Ile Asp Leu Pro

180 185 190

Tyr Val Asp Val Val Ser Thr Glu Lys Glu Met Ile Lys Arg Phe Leu

195 200 205

Arg Val Val Arg Glu Lys Asp Pro Asp Val Leu Ile Thr Tyr Asn Gly

210 215 220

Asp Asn Phe Asp Phe Ala Tyr Leu Lys Lys Arg Cys Glu Glu Leu Gly

225 230 235 240

Ile Lys Phe Thr Leu Gly Arg Asp Gly Ser Glu Pro Lys Ile Gln Arg

245 250 255

Met Gly Asp Arg Phe Ala Val Glu Val Lys Gly Arg Ile His Phe Asp

260 265 270

Leu Tyr Pro Val Ile Arg Arg Thr Ile Asn Leu Pro Thr Tyr Thr Leu

275 280 285

Glu Ala Val Tyr Glu Ala Val Phe Gly Lys Pro Lys Glu Lys Val Tyr

290 295 300

Ala Glu Glu Ile Ala Gln Ala Trp Glu Ser Gly Glu Gly Leu Glu Arg

305 310 315 320

Val Ala Arg Tyr Ser Met Glu Asp Ala Lys Val Thr Tyr Glu Leu Gly

325 330 335

Arg Glu Phe Phe Pro Met Glu Ala Gln Leu Ser Arg Leu Ile Gly Gln

340 345 350

Ser Leu Trp Asp Val Ser Arg Ser Ser Thr Gly Asn Leu Val Glu Trp

355 360 365

Phe Leu Leu Arg Lys Ala Tyr Lys Arg Asn Glu Leu Ala Pro Asn Lys

370 375 380

Pro Asp Glu Arg Glu Leu Ala Arg Arg Arg Gly Gly Tyr Ala Gly Gly

385 390 395 400

Tyr Val Lys Glu Pro Glu Arg Gly Leu Trp Asp Asn Ile Val Tyr Leu

405 410 415

Asp Phe Arg Ser Leu Tyr Pro Ser Ile Ile Ile Thr His Asn Val Ser

420 425 430

Pro Asp Thr Leu Asn Arg Glu Gly Cys Lys Glu Tyr Asp Val Ala Pro

435 440 445

Glu Val Gly His Lys Phe Cys Lys Asp Phe Pro Gly Phe Ile Pro Ser

450 455 460

Leu Leu Gly Asp Leu Leu Glu Glu Arg Gln Lys Ile Lys Arg Lys Met

465 470 475 480

Lys Ala Thr Val Asp Pro Leu Glu Lys Lys Leu Leu Asp Tyr Arg Gln

485 490 495

Arg Ala Ile Lys Ile Leu Ala Asn Ser Phe Tyr Gly Tyr Tyr Gly Tyr

500 505 510

Ala Lys Ala Arg Trp Tyr Cys Lys Glu Cys Ala Glu Ser Val Thr Ala

515 520 525

Trp Gly Arg Glu Tyr Ile Glu Met Val Ile Arg Glu Leu Glu Glu Lys

530 535 540

Phe Gly Phe Lys Val Leu Tyr Ala Asp Thr Asp Gly Leu His Ala Thr

545 550 555 560

Ile Pro Gly Ala Asp Ala Glu Thr Val Lys Lys Lys Ala Lys Glu Phe

565 570 575

Leu Lys Tyr Ile Asn Pro Lys Leu Pro Gly Leu Leu Glu Leu Glu Tyr

580 585 590

Glu Gly Phe Tyr Val Arg Gly Phe Phe Val Thr Lys Lys Lys Tyr Ala

595 600 605

Val Ile Asp Glu Glu Gly Lys Ile Thr Thr Arg Gly Leu Glu Ile Val

610 615 620

Arg Arg Asp Trp Ser Glu Ile Ala Lys Glu Thr Gln Ala Arg Val Leu

625 630 635 640

Glu Ala Ile Leu Lys His Gly Asp Val Glu Glu Ala Val Arg Ile Val

645 650 655

Lys Glu Val Thr Glu Lys Leu Ser Lys Tyr Glu Val Pro Pro Glu Lys

660 665 670

Leu Val Ile His Glu Gln Ile Thr Arg Asp Leu Arg Asp Tyr Lys Ala

675 680 685

Thr Gly Pro His Val Ala Val Ala Lys Arg Leu Ala Ala Arg Gly Val

690 695 700

Lys Ile Arg Pro Gly Thr Val Ile Ser Tyr Ile Val Leu Lys Gly Ser

705 710 715 720

Gly Arg Ile Gly Asp Arg Ala Ile Pro Ala Asp Glu Phe Asp Pro Thr

725 730 735

Lys His Arg Tyr Asp Ala Glu Tyr Tyr Ile Glu Ala Gln Val Leu Pro

740 745 750

Ala Val Glu Arg Ile Leu Lys Ala Phe Gly Tyr Arg Lys Glu Asp Leu

755 760 765

Arg Tyr Gln Lys Thr Lys Gln Val Gly Leu Gly Ala Trp Leu Lys Val

770 775 780

Lys Gly Lys Lys

785

<210> 10

<211> 2367

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 10

atgcatcacc atcatcatca cgagaatctt tactttcagg gcattctgga cactgattac 60

attaccgaaa acggtaaacc ggttatccgc gtgttcaaga aagagaatgg tgagttcaaa 120

atcgagtacg atcgcacgtt tgaaccgtac ttctatgctc tgctgaaaga cgattctgcg 180

attgaagatg tgaaaaaagt gacggcgaaa cgtcacggca ccgtggttaa ggtgaaacgt 240

gcggagaaag tgcaaaagaa attcctgggc cgtccgatcg aagtttggaa gctgtacttt 300

aaccacccac aagacgtccc ggcgattcgt gaccgcatcc gtgcgcaccc ggctgtggtt 360

gacatctatg agtacgatat tccgttcgct aagagatact tgattgacaa gggtctgatc 420

cctatggaag gcgacgaaga actgaccatg ctggccttcg ctatcgcgac gttgtatcac 480

gagggcgaag agtttggcac cggcccaatc ctgatgatta gctatgccga cggttccgaa 540

gcgcgtgtga tcacctggaa gaaaattgat ctgccgtacg tcgatgtggt gagcacggaa 600

aaagaaatga tcaaacgttt tctgcgtgtg gtccgtgaga aagatccgga tgtcctgatt 660

acgtataacg gtgacaattt tgattttgcg tacctgaaaa agcgctgcga ggaactgggt 720

atcaagttca cgctgggtcg tgatggtagc gagccgaaga ttcagcgtat gggtgaccgt 780

tttgcagttg aggtgaaggg tcgcattcac ttcgacctgt acccggttat tcgccgcacc 840

atcaacttgc ctacctacac cctggaagcg gtctatgaag ctgtctttgg caaaccgaaa 900

gagaaagttt acgcggaaga gatcgcgcag gcgtgggaga gcggtgaggg tctggaacgt 960

gttgcccgct acagcatgga agatgcgaag gtgacttatg agttgggtcg cgagtttttc 1020

ccgatggaag cacagctgag ccgtctgatc ggccaaagcc tgtgggacgt cagccgttcg 1080

tccaccggca acttggttga atggttcctg ctgcgtaagg catacaagcg taacgaactg 1140

gcgccgaata agccggacga gcgtgagctg gcccgtcgcc gtggtggtta tgccggtggc 1200

tatgttaaag agccggagcg cggtctgtgg gacaatatcg tgtatctgga cttccgctcc 1260

ctgtatccga gcatcattat cacccacaat gttagcccgg atactttaaa ccgcgagggt 1320

tgtaaagagt acgacgtggc gcctgaggtc ggccacaagt tttgcaaaga tttcccgggc 1380

ttcatcccaa gcctgctggg cgatctgctg gaggaacgtc agaagatcaa acgcaaaatg 1440

aaagcaacgg ttgatccgct ggagaaaaag ctgctggatt atcgtcagcg cgcaattaag 1500

atcctggcga atagctttta tggttactac ggttatgcca aagcgcgttg gtactgtaaa 1560

gaatgcgctg agtctgtcac cgcgtggggc cgtgagtaca tcgaaatggt tatccgtgag 1620

ctcgaagaga aattcggttt taaggttctg tatgccgaca ccgacggtct gcacgcgacc 1680

atcccgggtg cagacgccga aaccgtcaag aagaaagcaa aagaatttct gaaatacatt 1740

aatccgaaat tgccgggtct gttggagttg gagtatgagg gtttctacgt tcgtggcttc 1800

tttgttacca agaagaagta cgcggtcatt gacgaagagg gcaagattac gacccgtggt 1860

ctggaaattg ttcgccgtga ctggtccgag attgcgaaag aaacccaggc gagagtgctg 1920

gaagcgattc tgaagcatgg tgatgtcgag gaagccgtgc gtatcgttaa agaagtgacg 1980

gagaagttga gcaagtacga agtcccaccg gagaaactgg tgattcatga gcagatcacg 2040

cgcgatttac gtgactataa agcaaccggt ccgcatgttg ccgtggcaaa gcgtctggct 2100

gcgcgtggcg ttaagatccg tccgggcacg gttattagct acattgtgtt gaaaggtagc 2160

ggtcgtattg gcgaccgcgc cattccggcc gacgagttcg atccgaccaa gcaccgctac 2220

gatgcagagt attacatcga gaaccaagtg ctgccggctg tagagcgtat tctgaaggca 2280

ttcggttatc gtaaagaaga tctgcgctat caaaagacga aacaagttgg cctgggtgcg 2340

tggctgaagg tcaagggcaa gaaataa 2367

<210> 11

<211> 2367

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 11

atgcatcacc atcatcatca cgagaatctt tactttcagg gcattctgga cactgattac 60

attaccgaaa acggtaaacc ggttatccgc gtgttcaaga aagagaatgg tgagttcaaa 120

atcgagtacg atcgcacgtt tgaaccgtac ttctatgctc tgctgaaaga cgattctgcg 180

attgaagatg tgaaaaaagt gacggcgaaa cgtcacggca ccgtggttaa ggtgaaacgt 240

gcggagaaag tgcaaaagaa attcctgggc cgtccgatcg aagtttggaa gctgtacttt 300

aaccacccac aagacgtccc ggcgattcgt gaccgcatcc gtgcgcaccc ggctgtggtt 360

gacatctatg agtacgatat tccgttcgct aagagatact tgattgacaa gggtctgatc 420

cctatggaag gcgacgaaga actgaccatg ctggccttcg ctatcgcgac gttgtatcac 480

gagggcgaag agtttggcac cggcccaatc ctgatgatta gctatgccga cggttccgaa 540

gcgcgtgtga tcacctggaa gaaaattgat ctgccgtacg tcgatgtggt gagcacggaa 600

aaagaaatga tcaaacgttt tctgcgtgtg gtccgtgaga aagatccgga tgtcctgatt 660

acgtataacg gtgacaattt tgattttgcg tacctgaaaa agcgctgcga ggaactgggt 720

atcaagttca cgctgggtcg tgatggtagc gagccgaaga ttcagcgtat gggtgaccgt 780

tttgcagttg aggtgaaggg tcgcattcac ttcgacctgt acccggttat tcgccgcacc 840

atcaacttgc ctacctacac cctggaagcg gtctatgaag ctgtctttgg caaaccgaaa 900

gagaaagttt acgcggaaga gatcgcgcag gcgtgggaga gcggtgaggg tctggaacgt 960

gttgcccgct acagcatgga agatgcgaag gtgacttatg agttgggtcg cgagtttttc 1020

ccgatggaag cacagctgag ccgtctgatc ggccaaagcc tgtgggacgt cagccgttcg 1080

tccaccggca acttggttga atggttcctg ctgcgtaagg catacaagcg taacgaactg 1140

gcgccgaata agccggacga gcgtgagctg gcccgtcgcc gtggtggtta tgccggtggc 1200

tatgttaaag agccggagcg cggtctgtgg gacaatatcg tgtatctgga cttccgctcc 1260

ctgtatccga gcatcattat cacccacaat gttagcccgg atactttaaa ccgcgagggt 1320

tgtaaagagt acgacgtggc gcctgaggtc ggccacaagt tttgcaaaga tttcccgggc 1380

ttcatcccaa gcctgctggg cgatctgctg gaggaacgtc agaagatcaa acgcaaaatg 1440

aaagcaacgg ttgatccgct ggagaaaaag ctgctggatt atcgtcagcg cgcaattaag 1500

atcctggcga atagctttta tggttactac ggttatgcca aagcgcgttg gtactgtaaa 1560

gaatgcgctg agtctgtcac cgcgtggggc cgtgagtaca tcgaaatggt tatccgtgag 1620

ctcgaagaga aattcggttt taaggttctg tatgccgaca ccgacggtct gcacgcgacc 1680

atcccgggtg cagacgccga aaccgtcaag aagaaagcaa aagaatttct gaaatacatt 1740

aatccgaaat tgccgggtct gttggagttg gagtatgagg gtttctacgt tcgtggcttc 1800

tttgttacca agaagaagta cgcggtcatt gacgaagagg gcaagattac gacccgtggt 1860

ctggaaattg ttcgccgtga ctggtccgag attgcgaaag aaacccaggc gagagtgctg 1920

gaagcgattc tgaagcatgg tgatgtcgag gaagccgtgc gtatcgttaa agaagtgacg 1980

gagaagttga gcaagtacga agtcccaccg gagaaactgg tgattcatga gcagatcacg 2040

cgcgatttac gtgactatgc agcaaccggt ccgcatgttg ccgtggcaaa gcgtctggct 2100

gcgcgtggcg ttaagatccg tccgggcacg gttattagct acattgtgtt gaaaggtagc 2160

ggtcgtattg gcgaccgcgc cattccggcc gacgagttcg atccgaccaa gcaccgctac 2220

gatgcagagt attacatcga gaaccaagtg ctgccggctg tagagcgtat tctgaaggca 2280

ttcggttatc gtaaagaaga tctgcgctat caaaagacga aacaagttgg cctgggtgcg 2340

tggctgaagg tcaagggcaa gaaataa 2367

<210> 12

<211> 2367

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 12

atgcatcacc atcatcatca cgagaatctt tactttcagg gcattctgga cactgattac 60

attaccgaaa acggtaaacc ggttatccgc gtgttcaaga aagagaatgg tgagttcaaa 120

atcgagtacg atcgcacgtt tgaaccgtac ttctatgctc tgctgaaaga cgattctgcg 180

attgaagatg tgaaaaaagt gacggcgaaa cgtcacggca ccgtggttaa ggtgaaacgt 240

gcggagaaag tgcaaaagaa attcctgggc cgtccgatcg aagtttggaa gctgtacttt 300

aaccacccac aagacgtccc ggcgattcgt gaccgcatcc gtgcgcaccc ggctgtggtt 360

gacatctatg agtacgatat tccgttcgct aagagatact tgattgacaa gggtctgatc 420

cctatggaag gcgacgaaga actgaccatg ctggccttcg ctatcgcgac gttgtatcac 480

gagggcgaag agtttggcac cggcccaatc ctgatgatta gctatgccga cggttccgaa 540

gcgcgtgtga tcacctggaa gaaaattgat ctgccgtacg tcgatgtggt gagcacggaa 600

aaagaaatga tcaaacgttt tctgcgtgtg gtccgtgaga aagatccgga tgtcctgatt 660

acgtataacg gtgacaattt tgattttgcg tacctgaaaa agcgctgcga ggaactgggt 720

atcaagttca cgctgggtcg tgatggtagc gagccgaaga ttcagcgtat gggtgaccgt 780

tttgcagttg aggtgaaggg tcgcattcac ttcgacctgt acccggttat tcgccgcacc 840

atcaacttgc ctacctacac cctggaagcg gtctatgaag ctgtctttgg caaaccgaaa 900

gagaaagttt acgcggaaga gatcgcgcag gcgtgggaga gcggtgaggg tctggaacgt 960

gttgcccgct acagcatgga agatgcgaag gtgacttatg agttgggtcg cgagtttttc 1020

ccgatggaag cacagctgag ccgtctgatc ggccaaagcc tgtgggacgt cagccgttcg 1080

tccaccggca acttggttga atggttcctg ctgcgtaagg catacaagcg taacgaactg 1140

gcgccgaata agccggacga gcgtgagctg gcccgtcgcc gtggtggtta tgccggtggc 1200

tatgttaaag agccggagcg cggtctgtgg gacaatatcg tgtatctgga cttccgctcc 1260

ctgtatccga gcatcattat cacccacaat gttagcccgg atactttaaa ccgcgagggt 1320

tgtaaagagt acgacgtggc gcctgaggtc ggccacaagt tttgcaaaga tttcccgggc 1380

ttcatcccaa gcctgctggg cgatctgctg gaggaacgtc agaagatcaa acgcaaaatg 1440

aaagcaacgg ttgatccgct ggagaaaaag ctgctggatt atcgtcagcg cgcaattaag 1500

atcctggcga atagctttta tggttactac ggttatgcca aagcgcgttg gtactgtaaa 1560

gaatgcgctg agtctgtcac cgcgtggggc cgtgagtaca tcgaaatggt tatccgtgag 1620

ctcgaagaga aattcggttt taaggttctg tatgccgaca ccgacggtct gcacgcgacc 1680

atcccgggtg cagacgccga aaccgtcaag aagaaagcaa aagaatttct gaaatacatt 1740

aatccgaaat tgccgggtct gttggagttg gagtatgagg gtttctacgt tcgtggcttc 1800

tttgttacca agaagaagta cgcggtcatt gacgaagagg gcaagattac gacccgtggt 1860

ctggaaattg ttcgccgtga ctggtccgag attgcgaaag aaacccaggc gagagtgctg 1920

gaagcgattc tgaagcatgg tgatgtcgag gaagccgtgc gtatcgttaa agaagtgacg 1980

gagaagttga gcaagtacga agtcccaccg gagaaactgg tgattcatga ggcaatcgca 2040

cgcgatttac gtgactataa agcaaccggt ccgcatgttg ccgtggcaaa gcgtctggct 2100

gcgcgtggcg ttaagatccg tccgggcacg gttattagct acattgtgtt gaaaggtagc 2160

ggtcgtattg gcgaccgcgc cattccggcc gacgagttcg atccgaccaa gcaccgctac 2220

gatgcagagt attacatcga gaaccaagtg ctgccggctg tagagcgtat tctgaaggca 2280

ttcggttatc gtaaagaaga tctgcgctat caaaagacga aacaagttgg cctgggtgcg 2340

tggctgaagg tcaagggcaa gaaataa 2367

<210> 13

<211> 2367

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 13

atgcatcacc atcatcatca cgagaatctt tactttcagg gcattctgga cactgattac 60

attaccgaaa acggtaaacc ggttatccgc gtgttcaaga aagagaatgg tgagttcaaa 120

atcgagtacg atcgcacgtt tgaaccgtac ttctatgctc tgctgaaaga cgattctgcg 180

attgaagatg tgaaaaaagt gacggcgaaa cgtcacggca ccgtggttaa ggtgaaacgt 240

gcggagaaag tgcaaaagaa attcctgggc cgtccgatcg aagtttggaa gctgtacttt 300

aaccacccac aagacgtccc ggcgattcgt gaccgcatcc gtgcgcaccc ggctgtggtt 360

gacatctatg agtacgatat tccgttcgct aagagatact tgattgacaa gggtctgatc 420

cctatggaag gcgacgaaga actgaccatg ctggccttcg ctatcgcgac gttgtatcac 480

gagggcgaag agtttggcac cggcccaatc ctgatgatta gctatgccga cggttccgaa 540

gcgcgtgtga tcacctggaa gaaaattgat ctgccgtacg tcgatgtggt gagcacggaa 600

aaagaaatga tcaaacgttt tctgcgtgtg gtccgtgaga aagatccgga tgtcctgatt 660

acgtataacg gtgacaattt tgattttgcg tacctgaaaa agcgctgcga ggaactgggt 720

atcaagttca cgctgggtcg tgatggtagc gagccgaaga ttcagcgtat gggtgaccgt 780

tttgcagttg aggtgaaggg tcgcattcac ttcgacctgt acccggttat tcgccgcacc 840

atcaacttgc ctacctacac cctggaagcg gtctatgaag ctgtctttgg caaaccgaaa 900

gagaaagttt acgcggaaga gatcgcgcag gcgtgggaga gcggtgaggg tctggaacgt 960

gttgcccgct acagcatgga agatgcgaag gtgacttatg agttgggtcg cgagtttttc 1020

ccgatggaag cacagctgag ccgtctgatc ggccaaagcc tgtgggacgt cagccgttcg 1080

tccaccggca acttggttga atggttcctg ctgcgtaagg catacaagcg taacgaactg 1140

gcgccgaata agccggacga gcgtgagctg gcccgtcgcc gtggtggtta tgccggtggc 1200

tatgttaaag agccggagcg cggtctgtgg gacaatatcg tgtatctgga cttccgctcc 1260

ctgtatccga gcatcattat cacccacaat gttagcccgg atactttaaa ccgcgagggt 1320

tgtaaagagt acgacgtggc gcctgaggtc ggccacaagt tttgcaaaga tttcccgggc 1380

ttcatcccaa gcctgctggg cgatctgctg gaggaacgtc agaagatcaa acgcaaaatg 1440

aaagcaacgg ttgatccgct ggagaaaaag ctgctggatt atcgtcagcg cgcaattaag 1500

atcctggcga atagctttta tggttactac ggttatgcca aagcgcgttg gtactgtaaa 1560

gaatgcgctg agtctgtcac cgcgtggggc cgtgagtaca tcgaaatggt tatccgtgag 1620

ctcgaagaga aattcggttt taaggttctg tatgccgaca ccgacggtct gcacgcgacc 1680

atcccgggtg cagacgccga aaccgtcaag aagaaagcaa aagaatttct gaaatacatt 1740

aatccgaaat tgccgggtct gttggagttg gagtatgagg gtttctacgt tcgtggcttc 1800

tttgttacca agaagaagta cgcggtcatt gacgaagagg gcaagattac gacccgtggt 1860

ctggaaattg ttcgccgtga ctggtccgag attgcgaaag aaacccaggc gagagtgctg 1920

gaagcgattc tgaagcatgg tgatgtcgag gaagccgtgc gtatcgttaa agaagtgacg 1980

gagaagttga gcaagtacga agtcccaccg gagaaactgg tgattcatga gcagatcacg 2040

gcagatttac gtgactataa agcaaccggt ccgcatgttg ccgtggcaaa gcgtctggct 2100

gcgcgtggcg ttaagatccg tccgggcacg gttattagct acattgtgtt gaaaggtagc 2160

ggtcgtattg gcgaccgcgc cattccggcc gacgagttcg atccgaccaa gcaccgctac 2220

gatgcagagt attacatcga gaaccaagtg ctgccggctg tagagcgtat tctgaaggca 2280

ttcggttatc gtaaagaaga tctgcgctat caaaagacga aacaagttgg cctgggtgcg 2340

tggctgaagg tcaagggcaa gaaataa 2367

<210> 14

<211> 2367

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 14

atgcatcacc atcatcatca cgagaatctt tactttcagg gcattctgga cactgattac 60

attaccgaaa acggtaaacc ggttatccgc gtgttcaaga aagagaatgg tgagttcaaa 120

atcgagtacg atcgcacgtt tgaaccgtac ttctatgctc tgctgaaaga cgattctgcg 180

attgaagatg tgaaaaaagt gacggcgaaa cgtcacggca ccgtggttaa ggtgaaacgt 240

gcggagaaag tgcaaaagaa attcctgggc cgtccgatcg aagtttggaa gctgtacttt 300

aaccacccac aagacgtccc ggcgattcgt gaccgcatcc gtgcgcaccc ggctgtggtt 360

gacatctatg agtacgatat tccgttcgct aagagatact tgattgacaa gggtctgatc 420

cctatggaag gcgacgaaga actgaccatg ctggccttcg ctatcgcgac gttgtatcac 480

gagggcgaag agtttggcac cggcccaatc ctgatgatta gctatgccga cggttccgaa 540

gcgcgtgtga tcacctggaa gaaaattgat ctgccgtacg tcgatgtggt gagcacggaa 600

aaagaaatga tcaaacgttt tctgcgtgtg gtccgtgaga aagatccgga tgtcctgatt 660

acgtataacg gtgacaattt tgattttgcg tacctgaaaa agcgctgcga ggaactgggt 720

atcaagttca cgctgggtcg tgatggtagc gagccgaaga ttcagcgtat gggtgaccgt 780

tttgcagttg aggtgaaggg tcgcattcac ttcgacctgt acccggttat tcgccgcacc 840

atcaacttgc ctacctacac cctggaagcg gtctatgaag ctgtctttgg caaaccgaaa 900

gagaaagttt acgcggaaga gatcgcgcag gcgtgggaga gcggtgaggg tctggaacgt 960

gttgcccgct acagcatgga agatgcgaag gtgacttatg agttgggtcg cgagtttttc 1020

ccgatggaag cacagctgag ccgtctgatc ggccaaagcc tgtgggacgt cagccgttcg 1080

tccaccggca acttggttga atggttcctg ctgcgtaagg catacaagcg taacgaactg 1140

gcgccgaata agccggacga gcgtgagctg gcccgtcgcc gtggtggtta tgccggtggc 1200

tatgttaaag agccggagcg cggtctgtgg gacaatatcg tgtatctgga cttccgctcc 1260

ctgtatccga gcatcattat cacccacaat gttagcccgg atactttaaa ccgcgagggt 1320

tgtaaagagt acgacgtggc gcctgaggtc ggccacaagt tttgcaaaga tttcccgggc 1380

ttcatcccaa gcctgctggg cgatctgctg gaggaacgtc agaagatcaa acgcaaaatg 1440

aaagcaacgg ttgatccgct ggagaaaaag ctgctggatt atcgtcagcg cgcaattaag 1500

atcctggcga atagctttta tggttactac ggttatgcca aagcgcgttg gtactgtaaa 1560

gaatgcgctg agtctgtcac cgcgtggggc cgtgagtaca tcgaaatggt tatccgtgag 1620

ctcgaagaga aattcggttt taaggttctg tatgccgaca ccgacggtct gcacgcgacc 1680

atcccgggtg cagacgccga aaccgtcaag aagaaagcaa aagaatttct gaaatacatt 1740

aatccgaaat tgccgggtct gttggagttg gagtatgagg gtttctacgt tcgtggcttc 1800

tttgttacca agaagaagta cgcggtcatt gacgaagagg gcaagattac gacccgtggt 1860

ctggaaattg ttcgccgtga ctggtccgag attgcgaaag aaacccaggc gagagtgctg 1920

gaagcgattc tgaagcatgg tgatgtcgag gaagccgtgc gtatcgttaa agaagtgacg 1980

gagaagttga gcaagtacga agtcccaccg gagaaactgg tgattcatga ggcaatcacg 2040

cgcgatttac gtgactataa agcaaccggt ccgcatgttg ccgtggcaaa gcgtctggct 2100

gcgcgtggcg ttaagatccg tccgggcacg gttattagct acattgtgtt gaaaggtagc 2160

ggtcgtattg gcgaccgcgc cattccggcc gacgagttcg atccgaccaa gcaccgctac 2220

gatgcagagt attacatcga gaaccaagtg ctgccggctg tagagcgtat tctgaaggca 2280

ttcggttatc gtaaagaaga tctgcgctat caaaagacga aacaagttgg cctgggtgcg 2340

tggctgaagg tcaagggcaa gaaataa 2367

<210> 15

<211> 2367

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 15

atgcatcacc atcatcatca cgagaatctt tactttcagg gcattctgga cactgattac 60

attaccgaaa acggtaaacc ggttatccgc gtgttcaaga aagagaatgg tgagttcaaa 120

atcgagtacg atcgcacgtt tgaaccgtac ttctatgctc tgctgaaaga cgattctgcg 180

attgaagatg tgaaaaaagt gacggcgaaa cgtcacggca ccgtggttaa ggtgaaacgt 240

gcggagaaag tgcaaaagaa attcctgggc cgtccgatcg aagtttggaa gctgtacttt 300

aaccacccac aagacgtccc ggcgattcgt gaccgcatcc gtgcgcaccc ggctgtggtt 360

gacatctatg agtacgatat tccgttcgct aagagatact tgattgacaa gggtctgatc 420

cctatggaag gcgacgaaga actgaccatg ctggccttcg ctatcgcgac gttgtatcac 480

gagggcgaag agtttggcac cggcccaatc ctgatgatta gctatgccga cggttccgaa 540

gcgcgtgtga tcacctggaa gaaaattgat ctgccgtacg tcgatgtggt gagcacggaa 600

aaagaaatga tcaaacgttt tctgcgtgtg gtccgtgaga aagatccgga tgtcctgatt 660

acgtataacg gtgacaattt tgattttgcg tacctgaaaa agcgctgcga ggaactgggt 720

atcaagttca cgctgggtcg tgatggtagc gagccgaaga ttcagcgtat gggtgaccgt 780

tttgcagttg aggtgaaggg tcgcattcac ttcgacctgt acccggttat tcgccgcacc 840

atcaacttgc ctacctacac cctggaagcg gtctatgaag ctgtctttgg caaaccgaaa 900

gagaaagttt acgcggaaga gatcgcgcag gcgtgggaga gcggtgaggg tctggaacgt 960

gttgcccgct acagcatgga agatgcgaag gtgacttatg agttgggtcg cgagtttttc 1020

ccgatggaag cacagctgag ccgtctgatc ggccaaagcc tgtgggacgt cagccgttcg 1080

tccaccggca acttggttga atggttcctg ctgcgtaagg catacaagcg taacgaactg 1140

gcgccgaata agccggacga gcgtgagctg gcccgtcgcc gtggtggtta tgccggtggc 1200

tatgttaaag agccggagcg cggtctgtgg gacaatatcg tgtatctgga cttccgctcc 1260

ctgtatccga gcatcattat cacccacaat gttagcccgg atactttaaa ccgcgagggt 1320

tgtaaagagt acgacgtggc gcctgaggtc ggccacaagt tttgcaaaga tttcccgggc 1380

ttcatcccaa gcctgctggg cgatctgctg gaggaacgtc agaagatcaa acgcaaaatg 1440

aaagcaacgg ttgatccgct ggagaaaaag ctgctggatt atcgtcagcg cgcaattaag 1500

atcctggcga atagctttta tggttactac ggttatgcca aagcgcgttg gtactgtaaa 1560

gaatgcgctg agtctgtcac cgcgtggggc cgtgagtaca tcgaaatggt tatccgtgag 1620

ctcgaagaga aattcggttt taaggttctg tatgccgaca ccgacggtct gcacgcgacc 1680

atcccgggtg cagacgccga aaccgtcaag aagaaagcaa aagaatttct gaaatacatt 1740

aatccgaaat tgccgggtct gttggagttg gagtatgagg gtttctacgt tcgtggcttc 1800

tttgttacca agaagaagta cgcggtcatt gacgaagagg gcaagattac gacccgtggt 1860

ctggaaattg ttcgccgtga ctggtccgag attgcgaaag aaacccaggc gagagtgctg 1920

gaagcgattc tgaagcatgg tgatgtcgag gaagccgtgc gtatcgttaa agaagtgacg 1980

gagaagttga gcaagtacga agtcccaccg gagaaactgg tgattcatga gcagatcgca 2040

cgcgatttac gtgactataa agcaaccggt ccgcatgttg ccgtggcaaa gcgtctggct 2100

gcgcgtggcg ttaagatccg tccgggcacg gttattagct acattgtgtt gaaaggtagc 2160

ggtcgtattg gcgaccgcgc cattccggcc gacgagttcg atccgaccaa gcaccgctac 2220

gatgcagagt attacatcga gaaccaagtg ctgccggctg tagagcgtat tctgaaggca 2280

ttcggttatc gtaaagaaga tctgcgctat caaaagacga aacaagttgg cctgggtgcg 2340

tggctgaagg tcaagggcaa gaaataa 2367

<210> 16

<211> 2367

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 16

atgcatcacc atcatcatca cgagaatctt tactttcagg gcattctgga cactgattac 60

attaccgaaa acggtaaacc ggttatccgc gtgttcaaga aagagaatgg tgagttcaaa 120

atcgagtacg atcgcacgtt tgaaccgtac ttctatgctc tgctgaaaga cgattctgcg 180

attgaagatg tgaaaaaagt gacggcgaaa cgtcacggca ccgtggttaa ggtgaaacgt 240

gcggagaaag tgcaaaagaa attcctgggc cgtccgatcg aagtttggaa gctgtacttt 300

aaccacccac aagacgtccc ggcgattcgt gaccgcatcc gtgcgcaccc ggctgtggtt 360

gacatctatg agtacgatat tccgttcgct aagagatact tgattgacaa gggtctgatc 420

cctatggaag gcgacgaaga actgaccatg ctggccttcg ctatcgcgac gttgtatcac 480

gagggcgaag agtttggcac cggcccaatc ctgatgatta gctatgccga cggttccgaa 540

gcgcgtgtga tcacctggaa gaaaattgat ctgccgtacg tcgatgtggt gagcacggaa 600

aaagaaatga tcaaacgttt tctgcgtgtg gtccgtgaga aagatccgga tgtcctgatt 660

acgtataacg gtgacaattt tgattttgcg tacctgaaaa agcgctgcga ggaactgggt 720

atcaagttca cgctgggtcg tgatggtagc gagccgaaga ttcagcgtat gggtgaccgt 780

tttgcagttg aggtgaaggg tcgcattcac ttcgacctgt acccggttat tcgccgcacc 840

atcaacttgc ctacctacac cctggaagcg gtctatgaag ctgtctttgg caaaccgaaa 900

gagaaagttt acgcggaaga gatcgcgcag gcgtgggaga gcggtgaggg tctggaacgt 960

gttgcccgct acagcatgga agatgcgaag gtgacttatg agttgggtcg cgagtttttc 1020

ccgatggaag cacagctgag ccgtctgatc ggccaaagcc tgtgggacgt cagccgttcg 1080

tccaccggca acttggttga atggttcctg ctgcgtaagg catacaagcg taacgaactg 1140

gcgccgaata agccggacga gcgtgagctg gcccgtcgcc gtggtggtta tgccggtggc 1200

tatgttaaag agccggagcg cggtctgtgg gacaatatcg tgtatctgga cttccgctcc 1260

ctgtatccga gcatcattat cacccacaat gttagcccgg atactttaaa ccgcgagggt 1320

tgtaaagagt acgacgtggc gcctgaggtc ggccacaagt tttgcaaaga tttcccgggc 1380

ttcatcccaa gcctgctggg cgatctgctg gaggaacgtc agaagatcaa acgcaaaatg 1440

aaagcaacgg ttgatccgct ggagaaaaag ctgctggatt atcgtcagcg cgcaattaag 1500

atcctggcga atagctttta tggttactac ggttatgcca aagcgcgttg gtactgtaaa 1560

gaatgcgctg agtctgtcac cgcgtggggc cgtgagtaca tcgaaatggt tatccgtgag 1620

ctcgaagaga aattcggttt taaggttctg tatgccgaca ccgacggtct gcacgcgacc 1680

atcccgggtg cagacgccga aaccgtcaag aagaaagcaa aagaatttct gaaatacatt 1740

aatccgaaat tgccgggtct gttggagttg gagtatgagg gtttctacgt tcgtggcttc 1800

tttgttacca agaagaagta cgcggtcatt gacgaagagg gcaagattac gacccgtggt 1860

ctggaaattg ttcgccgtga ctggtccgag attgcgaaag aaacccaggc gagagtgctg 1920

gaagcgattc tgaagcatgg tgatgtcgag gaagccgtgc gtatcgttaa agaagtgacg 1980

gagaagttga gcaagtacga agtcccaccg gagaaactgg tgattcatga ggcaatcgca 2040

cgcgatttac gtgactataa agcaaccggt ccgcatgttg ccgtggcaaa gcgtctggct 2100

gcgcgtggcg ttaagatccg tccgggcacg gttattagct acattgtgtt gaaaggtagc 2160

ggtcgtattg gcgaccgcgc cattccggcc gacgagttcg atccgaccaa gcaccgctac 2220

gatgcagagt attacatcga ggcacaagtg ctgccggctg tagagcgtat tctgaaggca 2280

ttcggttatc gtaaagaaga tctgcgctat caaaagacga aacaagttgg cctgggtgcg 2340

tggctgaagg tcaagggcaa gaaataa 2367

<210> 17

<211> 2367

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 17

atgcatcacc atcatcatca cgagaatctt tactttcagg gcattctgga cactgattac 60

attaccgaaa acggtaaacc ggttatccgc gtgttcaaga aagagaatgg tgagttcaaa 120

atcgagtacg atcgcacgtt tgaaccgtac ttctatgctc tgctgaaaga cgattctgcg 180

attgaagatg tgaaaaaagt gacggcgaaa cgtcacggca ccgtggttaa ggtgaaacgt 240

gcggagaaag tgcaaaagaa attcctgggc cgtccgatcg aagtttggaa gctgtacttt 300

aaccacccac aagacgtccc ggcgattcgt gaccgcatcc gtgcgcaccc ggctgtggtt 360

gacatctatg agtacgatat tccgttcgct aagagatact tgattgacaa gggtctgatc 420

cctatggaag gcgacgaaga actgaccatg ctggccttcg ctatcgcgac gttgtatcac 480

gagggcgaag agtttggcac cggcccaatc ctgatgatta gctatgccga cggttccgaa 540

gcgcgtgtga tcacctggaa gaaaattgat ctgccgtacg tcgatgtggt gagcacggaa 600

aaagaaatga tcaaacgttt tctgcgtgtg gtccgtgaga aagatccgga tgtcctgatt 660

acgtataacg gtgacaattt tgattttgcg tacctgaaaa agcgctgcga ggaactgggt 720

atcaagttca cgctgggtcg tgatggtagc gagccgaaga ttcagcgtat gggtgaccgt 780

tttgcagttg aggtgaaggg tcgcattcac ttcgacctgt acccggttat tcgccgcacc 840

atcaacttgc ctacctacac cctggaagcg gtctatgaag ctgtctttgg caaaccgaaa 900

gagaaagttt acgcggaaga gatcgcgcag gcgtgggaga gcggtgaggg tctggaacgt 960

gttgcccgct acagcatgga agatgcgaag gtgacttatg agttgggtcg cgagtttttc 1020

ccgatggaag cacagctgag ccgtctgatc ggccaaagcc tgtgggacgt cagccgttcg 1080

tccaccggca acttggttga atggttcctg ctgcgtaagg catacaagcg taacgaactg 1140

gcgccgaata agccggacga gcgtgagctg gcccgtcgcc gtggtggtta tgccggtggc 1200

tatgttaaag agccggagcg cggtctgtgg gacaatatcg tgtatctgga cttccgctcc 1260

ctgtatccga gcatcattat cacccacaat gttagcccgg atactttaaa ccgcgagggt 1320

tgtaaagagt acgacgtggc gcctgaggtc ggccacaagt tttgcaaaga tttcccgggc 1380

ttcatcccaa gcctgctggg cgatctgctg gaggaacgtc agaagatcaa acgcaaaatg 1440

aaagcaacgg ttgatccgct ggagaaaaag ctgctggatt atcgtcagcg cgcaattaag 1500

atcctggcga atagctttta tggttactac ggttatgcca aagcgcgttg gtactgtaaa 1560

gaatgcgctg agtctgtcac cgcgtggggc cgtgagtaca tcgaaatggt tatccgtgag 1620

ctcgaagaga aattcggttt taaggttctg tatgccgaca ccgacggtct gcacgcgacc 1680

atcccgggtg cagacgccga aaccgtcaag aagaaagcaa aagaatttct gaaatacatt 1740

aatccgaaat tgccgggtct gttggagttg gagtatgagg gtttctacgt tcgtggcttc 1800

tttgttacca agaagaagta cgcggtcatt gacgaagagg gcaagattac gacccgtggt 1860

ctggaaattg ttcgccgtga ctggtccgag attgcgaaag aaacccaggc gagagtgctg 1920

gaagcgattc tgaagcatgg tgatgtcgag gaagccgtgc gtatcgttaa agaagtgacg 1980

gagaagttga gcaagtacga agtcccaccg gagaaactgg tgattcatga gcagatcacg 2040

cgcgatttac gtgactataa agcaaccggt ccggcagttg ccgtggcaaa gcgtctggct 2100

gcgcgtggcg ttaagatccg tccgggcacg gttattagct acattgtgtt gaaaggtagc 2160

ggtcgtattg gcgaccgcgc cattccggcc gacgagttcg atccgaccaa gcaccgctac 2220

gatgcagagt attacatcga gaaccaagtg ctgccggctg tagagcgtat tctgaaggca 2280

ttcggttatc gtaaagaaga tctgcgctat caaaagacga aacaagttgg cctgggtgcg 2340

tggctgaagg tcaagggcaa gaaataa 2367

<210> 18

<211> 2367

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 18

atgcatcacc atcatcatca cgagaatctt tactttcagg gcattctgga cactgattac 60

attaccgaaa acggtaaacc ggttatccgc gtgttcaaga aagagaatgg tgagttcaaa 120

atcgagtacg atcgcacgtt tgaaccgtac ttctatgctc tgctgaaaga cgattctgcg 180

attgaagatg tgaaaaaagt gacggcgaaa cgtcacggca ccgtggttaa ggtgaaacgt 240

gcggagaaag tgcaaaagaa attcctgggc cgtccgatcg aagtttggaa gctgtacttt 300

aaccacccac aagacgtccc ggcgattcgt gaccgcatcc gtgcgcaccc ggctgtggtt 360

gacatctatg agtacgatat tccgttcgct aagagatact tgattgacaa gggtctgatc 420

cctatggaag gcgacgaaga actgaccatg ctggccttcg ctatcgcgac gttgtatcac 480

gagggcgaag agtttggcac cggcccaatc ctgatgatta gctatgccga cggttccgaa 540

gcgcgtgtga tcacctggaa gaaaattgat ctgccgtacg tcgatgtggt gagcacggaa 600

aaagaaatga tcaaacgttt tctgcgtgtg gtccgtgaga aagatccgga tgtcctgatt 660

acgtataacg gtgacaattt tgattttgcg tacctgaaaa agcgctgcga ggaactgggt 720

atcaagttca cgctgggtcg tgatggtagc gagccgaaga ttcagcgtat gggtgaccgt 780

tttgcagttg aggtgaaggg tcgcattcac ttcgacctgt acccggttat tcgccgcacc 840

atcaacttgc ctacctacac cctggaagcg gtctatgaag ctgtctttgg caaaccgaaa 900

gagaaagttt acgcggaaga gatcgcgcag gcgtgggaga gcggtgaggg tctggaacgt 960

gttgcccgct acagcatgga agatgcgaag gtgacttatg agttgggtcg cgagtttttc 1020

ccgatggaag cacagctgag ccgtctgatc ggccaaagcc tgtgggacgt cagccgttcg 1080

tccaccggca acttggttga atggttcctg ctgcgtaagg catacaagcg taacgaactg 1140

gcgccgaata agccggacga gcgtgagctg gcccgtcgcc gtggtggtta tgccggtggc 1200

tatgttaaag agccggagcg cggtctgtgg gacaatatcg tgtatctgga cttccgctcc 1260

ctgtatccga gcatcattat cacccacaat gttagcccgg atactttaaa ccgcgagggt 1320

tgtaaagagt acgacgtggc gcctgaggtc ggccacaagt tttgcaaaga tttcccgggc 1380

ttcatcccaa gcctgctggg cgatctgctg gaggaacgtc agaagatcaa acgcaaaatg 1440

aaagcaacgg ttgatccgct ggagaaaaag ctgctggatt atcgtcagcg cgcaattaag 1500

atcctggcga atagctttta tggttactac ggttatgcca aagcgcgttg gtactgtaaa 1560

gaatgcgctg agtctgtcac cgcgtggggc cgtgagtaca tcgaaatggt tatccgtgag 1620

ctcgaagaga aattcggttt taaggttctg tatgccgaca ccgacggtct gcacgcgacc 1680

atcccgggtg cagacgccga aaccgtcaag aagaaagcaa aagaatttct gaaatacatt 1740

aatccgaaat tgccgggtct gttggagttg gagtatgagg gtttctacgt tcgtggcttc 1800

tttgttacca agaagaagta cgcggtcatt gacgaagagg gcaagattac gacccgtggt 1860

ctggaaattg ttcgccgtga ctggtccgag attgcgaaag aaacccaggc gagagtgctg 1920

gaagcgattc tgaagcatgg tgatgtcgag gaagccgtgc gtatcgttaa agaagtgacg 1980

gagaagttga gcaagtacga agtcccaccg gagaaactgg tgattcatga gcagatcacg 2040

cgcgatttac gtgactataa agcaaccggt ccgcatgttg ccgtggcaaa gcgtctggct 2100

gcgcgtggcg ttaagatccg tccgggcacg gttattagct acattgtgtt gaaaggtagc 2160

ggtcgtattg gcgaccgcgc cattccggcc gacgagttcg atccgaccaa gcaccgctac 2220

gatgcagagt attacatcga ggcacaagtg ctgccggctg tagagcgtat tctgaaggca 2280

ttcggttatc gtaaagaaga tctgcgctat caaaagacga aacaagttgg cctgggtgcg 2340

tggctgaagg tcaagggcaa gaaataa 2367

<210> 19

<211> 18

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 19

gtaaaacgac ggccagtg 18

<210> 20

<211> 63

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 20

aattgaacat tcatgattat ttaagaaata aattgtttta aaatgcactg gccgtcgttt 60

tac 63

<210> 21

<211> 31

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 21

cgtgtatgcg taatacgact cactatggac g 31

<210> 22

<211> 31

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 22

cgtgtatcgt ccatagtgag tcgtattacg c 31

Claims

1. A protein, being A1) or A2) below:

A1) a mutant protein having DNA polymerase activity, obtained by substituting the amino acid residue at at least one of positions 674, 665, 667, 668 and 735 of 9°N DNA polymerase; the sequence of the 9°N DNA polymerase is shown as sequence 1 in the sequence listing; the mutant protein has reduced affinity for DNA compared with 9°N DNA polymerase, without reduced DNA polymerase activity; the mutant protein is any one of the following B1)-B7):

B1) a protein obtained by substituting the lysine residue at position 674 from the N-terminus of 9°N DNA polymerase with an alanine residue;

B2) a protein obtained by substituting both the glutamine residue at position 665 and the threonine residue at position 667 from the N-terminus of 9°N DNA polymerase with alanine residues;

B3) a protein obtained by substituting the arginine residue at position 668 from the N-terminus of 9°N DNA polymerase with an alanine residue;

B4) a protein obtained by substituting the glutamine residue at position 665 from the N-terminus of 9°N DNA polymerase with an alanine residue;

B5) a protein obtained by substituting the threonine residue at position 667 from the N-terminus of 9°N DNA polymerase with an alanine residue;

B6) a protein obtained by substituting the asparagine residue at position 735, the glutamine residue at position 665 and the threonine residue at position 667 from the N-terminus of 9°N DNA polymerase with alanine residues;

B7) a protein obtained by substituting the asparagine residue at position 735 from the N-terminus of 9°N DNA polymerase with an alanine residue;

A2) a fusion protein having DNA polymerase activity, obtained by attaching a tag to the N-terminus and/or the C-terminus of the mutant protein of A1).
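For illustration only, the substitutions recited in B1) to B7) can be written in the common single-letter shorthand (for example K674A for B1) and applied to a sequence string. The following is a minimal Python sketch under stated assumptions: the names `VARIANTS`, `apply_substitutions` and `placeholder_parent` are hypothetical, and the parent sequence is a placeholder that only carries the recited wild-type residues at the five recited positions; it is not sequence 1 of the sequence listing.

```python
# Substitutions recited in B1)-B7): (wild-type residue, 1-based position, new residue).
VARIANTS = {
    "B1": [("K", 674, "A")],
    "B2": [("Q", 665, "A"), ("T", 667, "A")],
    "B3": [("R", 668, "A")],
    "B4": [("Q", 665, "A")],
    "B5": [("T", 667, "A")],
    "B6": [("N", 735, "A"), ("Q", 665, "A"), ("T", 667, "A")],
    "B7": [("N", 735, "A")],
}


def apply_substitutions(protein, substitutions):
    """Return a copy of `protein` with each substitution applied.

    Positions are counted from the N-terminus starting at 1. A ValueError is
    raised if the residue found at a position is not the expected wild type.
    """
    residues = list(protein)
    for wild_type, position, replacement in substitutions:
        index = position - 1
        if residues[index] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


def placeholder_parent(length=775):
    """Hypothetical stand-in for the parent enzyme (NOT sequence 1).

    Glycine everywhere except the five recited positions, which carry the
    wild-type residues named in B1)-B7).
    """
    residues = ["G"] * length
    for substitutions in VARIANTS.values():
        for wild_type, position, _ in substitutions:
            residues[position - 1] = wild_type
    return "".join(residues)


parent = placeholder_parent()
b6 = apply_substitutions(parent, VARIANTS["B6"])
print(b6[664], b6[666], b6[734])  # residues 665, 667 and 735 of the B6 variant
```

With a real parent sequence in place of the placeholder, the wild-type check guards against off-by-one errors in position numbering.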

2. The protein of claim 1, wherein the affinity of the protein for DNA molecules, cDNA molecules, or biochips containing DNA molecules or cDNA molecules is reduced compared with that of 9°N DNA polymerase.

3. A biomaterial related to the protein of claim 1 or 2, wherein the biomaterial is any one of the following C1) to C5):

C1) a nucleic acid molecule encoding the protein of claim 1 or 2;

C2) an expression cassette comprising the nucleic acid molecule of C1);

C5) a transgenic cell line comprising the nucleic acid molecule of C1), or a transgenic cell line comprising the expression cassette of C2), said transgenic cell line not comprising propagation material.

4. The biomaterial of claim 3, wherein the nucleic acid molecule of C1) is a cDNA molecule or a DNA molecule encoding the protein of claim 1 or 2, obtained by substituting at least one nucleotide in the sequence of the gene encoding 9°N DNA polymerase.

5. The biomaterial of claim 4, wherein the gene encoding the 9°N DNA polymerase is a cDNA molecule or a DNA molecule obtained by adding ATG to the 5' end of the nucleic acid molecule shown at positions 43-2364 of sequence 10 in the sequence listing.

6. The biomaterial of claim 4 or 5, wherein the nucleic acid molecule of C1) is a cDNA molecule or a DNA molecule obtained by adding ATG to the 5' end of the nucleic acid molecule shown at positions 43-2364 of any one of sequences 11-16 and 18 in the sequence listing.
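The position arithmetic implied by claims 5 and 6 can be sketched in Python. Because the coding molecule is ATG followed by positions 43-2364 of a listed sequence, residue p (p ≥ 2) of the encoded polymerase corresponds to listing positions 43 + 3(p - 2) through 45 + 3(p - 2). The function names below are hypothetical and introduced only for illustration. The two rows are the listing lines ending at position 2040 of sequences 15 and 18, with the spaces removed.

```python
ROW_START = 1981  # listing position of the first nucleotide in the rows below

# Rows covering listing positions 1981-2040, copied from sequences 15 and 18.
ROW_SEQ15 = (
    "gagaagttga gcaagtacga agtcccaccg gagaaactgg tgattcatga gcagatcgca"
).replace(" ", "")
ROW_SEQ18 = (
    "gagaagttga gcaagtacga agtcccaccg gagaaactgg tgattcatga gcagatcacg"
).replace(" ", "")


def coding_gene(listed_sequence):
    """ATG plus listing positions 43-2364 (1-based, inclusive)."""
    return "atg" + listed_sequence[42:2364]


def codon_range(residue):
    """Listing positions (1-based, inclusive) of the codon for a polymerase residue."""
    if residue < 2:
        raise ValueError("residue 1 is the added ATG, not part of positions 43-2364")
    start = 43 + 3 * (residue - 2)
    return start, start + 2


def codon_from_row(row, residue):
    """Extract the codon for `residue` from a row that starts at ROW_START."""
    start, end = codon_range(residue)
    return row[start - ROW_START : end - ROW_START + 1]


print(codon_range(665))  # (2032, 2034)
print(codon_from_row(ROW_SEQ15, 667), codon_from_row(ROW_SEQ18, 667))  # gca acg
```

In the standard genetic code, gca encodes alanine and acg encodes threonine, so at residue 667 the row from sequence 15 carries an alanine codon while the row from sequence 18 retains the threonine codon.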

7. A method for producing the protein of claim 1 or 2, which comprises introducing a gene encoding the protein of claim 1 or 2 into a biological cell and expressing the gene encoding the protein of claim 1 or 2 to obtain the protein of claim 1 or 2.

8. Any one of the following uses:

E1) use of the protein of claim 1 or 2 as a DNA polymerase;

E2) use of the biomaterial of any one of claims 3-6 for the preparation of a DNA polymerase;

E3) use of the protein of claim 1 or 2 in DNA polymerization reactions;

E4) use of the protein of claim 1 or 2 for the preparation of a polymerase chain reaction product;

E5) use of the biomaterial of any one of claims 3-6 in a polymerase chain reaction;

E6) use of the biomaterial of any one of claims 3-6 for the preparation of a polymerase chain reaction product;

E7) use of the preparation method of claim 7 for preparing a DNA polymerization product;

E8) use of the protein of claim 1 or 2 in sequencing;

E9) use of the biomaterial of any one of claims 3-6 in sequencing;

E10) use of the protein of claim 1 or 2 for the preparation of a sequencing product;

E11) use of the biomaterial of any one of claims 3-6 in the preparation of a sequencing product;

E12) use of the preparation method of claim 7 for preparing a sequencing product.