
CN110747227B - Blue light induced and activated Cre recombination optimization system and application thereof

Info

Publication number: CN110747227B
Application number: CN201910746260.6A
Authority: CN
Inventors: 李大力; 李慧莹; 吴英尹; 刘明耀
Original assignee: East China Normal University; Bioray Laboratories Inc
Current assignee: East China Normal University; Bioray Laboratories Inc
Priority date: 2019-08-13
Filing date: 2019-08-13
Publication date: 2023-03-31
Anticipated expiration: 2039-08-13
Also published as: CN110747227A

Abstract

The application discloses an optimized blue-light-inducible Cre recombination system and its application. The Cre recombinase produced by the system is formed from two separate fusion proteins, CIB1-CreC_106-343 and CRY2-CreN_19-104. Under suitable blue-light conditions, the CRY2 in CRY2-CreN_19-104 undergoes a conformational change and binds the CIB1 in CIB1-CreC_106-343, so that CreC_106-343 and CreN_19-104 associate to form a protein with Cre recombinase activity, which then edits the target gene located between loxP sites. To improve the expression efficiency and the induction fold of the Cre recombinase, the system was optimized: the expression efficiency of the blue-light-activated Cre recombinase is markedly improved, and the induction fold is increased about 40-fold.

Description

Blue light induced and activated Cre recombination optimization system and application thereof

Technical Field

The invention relates to the technical field of biology, in particular to a blue light induced and activated Cre recombination optimization system and application thereof.

Background

Cre recombinase is derived from bacteriophage P1 and belongs to the tyrosine recombinase family. It specifically recognizes a 34 bp DNA sequence (loxP) and deletes or recombines the gene sequence between loxP sites. Because Cre recombinase is highly efficient and highly sequence-specific, scientists have constructed many Cre recombinase transgenic mouse models, which play an irreplaceable role in studying gene function and organismal development. However, the literature reports that Cre recombinase expressed in vivo can affect the development of mouse tissues such as the heart and lung, so effective control of recombinase expression is particularly important.

The inducible Cre recombinase transgenic mouse models reported so far include Cre-ERT2 and DD-Cre. Both are induced by chemical drugs and have certain limitations: induction takes a long time, the drug acts over a wide range, and it can disturb endogenous cellular signaling pathways. Although specific promoters can provide spatial specificity to some extent, they do not allow precise manipulation of systemically distributed cell populations that lack tissue specificity.

Disclosure of Invention

Aiming at the defects in the prior art, the invention aims to provide a blue light induced and activated Cre recombination optimization system and application thereof.

The invention optimizes and re-engineers PA-Cre, a Cre recombinase system induced and activated by blue light, improving the activity of the Cre recombinase and greatly reducing the leakiness of the system. Transgenic cell lines or transgenic animals obtained by transferring the system into a cell line or an animal allow more precise spatiotemporal regulation through control of the time and location of illumination, and provide a more suitable tool for subsequent research.

In one aspect, the invention provides an optimized blue-light-inducible Cre recombination system, which comprises a blue-light-activated Cre recombinase expression cassette, wherein the expression cassette comprises the following genes connected in this order: a gene encoding the photosensitive protein ligand CIB1, a gene encoding the C-terminal fragment of Cre recombinase, a gene encoding the photosensitive protein CRY2, and a gene encoding the N-terminal fragment of Cre recombinase;

the C-terminal fragment of the Cre recombinase comprises the amino acid sequence shown in SEQ ID No.1 (positions 106-343 of the Cre recombinase); the N-terminal fragment of the Cre recombinase comprises the amino acid sequence shown in SEQ ID No.3 (positions 19-104 of the Cre recombinase);

preferably, the coding gene sequence of the Cre recombinase C end is shown in SEQ ID No. 2; the coding gene sequence of the Cre recombinase N end is shown in SEQ ID No. 4.
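The residue ranges quoted above can be cross-checked against the lengths declared in the sequence listing (238 and 86 amino acids; 714 and 258 nucleotides). The following is a minimal sketch of that arithmetic; the function names are illustrative and do not come from the patent.

```python
# Sanity check: residue ranges vs. the <211> lengths in the sequence listing.
# Function names are hypothetical helpers introduced for illustration only.

def fragment_length(first: int, last: int) -> int:
    """Number of residues in the inclusive range first..last."""
    return last - first + 1


def coding_length(n_residues: int) -> int:
    """Nucleotides needed to encode n_residues, without a stop codon."""
    return 3 * n_residues


cre_c = fragment_length(106, 343)  # C-terminal fragment, SEQ ID No.1 / No.2
cre_n = fragment_length(19, 104)   # N-terminal fragment, SEQ ID No.3 / No.4

print(cre_c, coding_length(cre_c))  # 238 714
print(cre_n, coding_length(cre_n))  # 86 258
```

Both fragments agree with the listing, which also shows that the coding sequences in SEQ ID No.2 and No.4 carry no stop codon of their own.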

To facilitate expression of the Cre recombinase and the study of gene function in cells or animals, the system further comprises two spaced loxP sites recognized by Cre recombinase, wherein the two loxP sites are located on a vector different from the expression cassette or are integrated into the genome of the cell or animal.

In a preferred embodiment, the expression cassette further comprises a gene encoding a first linker peptide and a gene encoding a second linker peptide,

the gene encoding the first linker peptide is located between the gene encoding the photosensitive protein ligand CIB1 and the gene encoding the C-terminal fragment of Cre recombinase,

the gene encoding the second linker peptide is located between the gene encoding the photosensitive protein CRY2 and the gene encoding the N-terminal fragment of Cre recombinase,

preferably, the first linker peptide and the second linker peptide comprise the amino acid sequence shown in SEQ ID No.5, 6, 7, or 8, more preferably SEQ ID No.5;

more preferably, the genes encoding the first and second linker peptides comprise the nucleotide sequence shown in SEQ ID No.9.

In a preferred embodiment, the expression cassette further comprises a gene encoding a nuclear localization signal NLS,

preferably, the coding gene of the nuclear localization signal NLS is positioned before the coding gene of the light sensitive protein CRY2 or before the coding gene of the light sensitive protein ligand CIB1, more preferably, before the coding gene of the light sensitive protein ligand CIB1,

preferably, the nuclear localization signal NLS comprises an amino acid sequence shown in SEQ ID No.10, and more preferably, the coding gene of the nuclear localization signal NLS comprises a nucleotide sequence shown in SEQ ID No. 11.

In a preferred embodiment, the expression cassette further comprises a promoter which is CAG or CMV, preferably CAG;

more preferably, the nucleotide sequence of CAG is shown in SEQ ID No. 12.

In a preferred embodiment, the expression cassette further comprises a gene encoding a self-cleaving protein or an IRES (internal ribosome entry site sequence),

the gene encoding the self-cleaving protein, or the IRES, is located between the gene encoding the C-terminal fragment of Cre recombinase and the gene encoding the photosensitive protein CRY2,

preferably, the nucleotide sequence of the IRES is shown in SEQ ID No.13,

the self-cleaving protein is P2A, T2A, or F2A, preferably T2A,

more preferably, the amino acid sequence of T2A is shown in SEQ ID No.14, and still more preferably, the nucleotide sequence encoding T2A is shown in SEQ ID No.15.

In a preferred embodiment, the expression cassette further comprises a transcriptional regulatory element,

the transcriptional regulatory element is located after the gene encoding the N-terminal fragment of Cre recombinase,

the transcriptional regulatory element is a WPRE element, i.e., the Woodchuck Hepatitis Virus post-transcriptional regulatory element (WPRE), which acts at the transcriptional and post-transcriptional regulatory stages and can raise the protein expression level to a certain extent;

preferably, the WPRE element comprises the nucleotide sequence shown as SEQ ID No. 16;

and/or, the expression cassette also comprises a coding gene of the protein tag sequence,

preferably, the coding gene of the protein tag sequence is positioned between the promoter and the coding gene of the nuclear localization signal NLS,

the protein tag comprises any one or at least two of Myc, His, GST, HA, Flag, MBP, Avi tag, SUMO, and c-Myc tag; preferably Flag, more preferably 3×Flag.

In a preferred embodiment, the light-sensitive protein ligand CIB1 comprises an amino acid sequence shown in SEQ ID No.17, and preferably, the coding gene sequence of the light-sensitive protein ligand CIB1 comprises a nucleotide sequence shown in SEQ ID No. 18.

In a preferred embodiment, the light-sensitive protein CRY2 comprises an amino acid sequence shown in SEQ ID No.19, and preferably, the coding gene sequence of the light-sensitive protein CRY2 comprises a nucleotide sequence shown in SEQ ID No. 20.

In another aspect, the present invention provides a recombinant vector comprising any one of the above expression cassettes, wherein the recombinant vector is a plasmid vector, a lentiviral vector, a retroviral vector, an adenoviral vector, an adeno-associated viral vector, a simian virus vector, a vaccinia virus vector, a Sendai virus vector, an Epstein-Barr virus vector, or a herpes simplex virus vector;

preferably, the plasmid vector is a Cre/loxP recombinase system plasmid or a Sleeping Beauty transposon system plasmid.

In another aspect, the invention provides the use of any one of the above expression cassettes or recombinant vectors in the preparation of a blue light-induced and activated Cre recombinase transgenic cell line or transgenic animal model,

the animal is a mammal, preferably a mouse or rat, more preferably a mouse.

In this use, for the blue-light-activated Cre recombinase transgenic cell line, the blue light intensity is 3-5 mW/cm² and the irradiation pattern is 20-40 s of blue light on followed by 2-5 min off;

for the blue-light-activated Cre recombinase transgenic animal, the blue light intensity is 15-25 mW/cm² and the irradiation pattern is 0.5-2 min on followed by 3-5 min off.
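To make the pulsed illumination settings easier to compare, the duty cycle, the time-averaged irradiance, and the cumulative light dose can be derived from the intensity and the on/off times. The sketch below is illustrative only: the class and its method names are hypothetical, the animal values are the ones used in the examples (20 mW/cm², 1 min on, 4 min off), and the cell values are an arbitrary point inside the stated ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PulseProtocol:
    """Pulsed illumination setting (hypothetical helper, not from the patent)."""
    intensity_mw_cm2: float  # irradiance while the light is on
    on_s: float              # seconds of light per cycle
    off_s: float             # seconds of darkness per cycle

    @property
    def duty_cycle(self) -> float:
        # Fraction of each cycle during which the light is on.
        return self.on_s / (self.on_s + self.off_s)

    @property
    def mean_irradiance_mw_cm2(self) -> float:
        # Irradiance averaged over a whole on/off cycle.
        return self.intensity_mw_cm2 * self.duty_cycle

    def dose_j_cm2(self, total_s: float) -> float:
        # Approximate cumulative dose; ignores a partial final cycle.
        # mW/cm^2 * s = mJ/cm^2, divided by 1000 to give J/cm^2.
        return self.mean_irradiance_mw_cm2 * total_s / 1000.0


# Animal setting used in the examples: 20 mW/cm^2, 1 min on, 4 min off.
animal = PulseProtocol(intensity_mw_cm2=20.0, on_s=60.0, off_s=240.0)
# Assumed cell setting inside the stated ranges: 4 mW/cm^2, 30 s on, 3 min off.
cells = PulseProtocol(intensity_mw_cm2=4.0, on_s=30.0, off_s=180.0)

for label, protocol in (("animal", animal), ("cells", cells)):
    print(label,
          round(protocol.duty_cycle, 3),
          round(protocol.mean_irradiance_mw_cm2, 3),
          round(protocol.dose_j_cm2(12 * 3600), 1))
```

Under these assumptions the animal protocol keeps the light on for one fifth of the time, so its time-averaged irradiance is well below the peak intensity.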

In another aspect, the invention protects the use of a blue-light-activated Cre recombinase transgenic cell line or transgenic animal, prepared with any of the above Cre recombination optimization systems or recombinant vectors, in gene function research, lineage tracing, and cell clearance.

The invention has the following beneficial effects:

1. A light-controlled Cre recombinase transgenic mouse model is established for the first time.

Existing light-controlled Cre recombinase systems have been studied extensively in cells, but cell-level experiments cannot fully model the animal level. Here the light-control system is applied in mice for the first time, which broadens the range of applications of the photosensitive proteins and provides a reference for establishing animal models with other photosensitive proteins.

2. Spatiotemporally specific regulation of Cre recombinase expression is achieved for the first time.

Existing Cre recombinase induction systems achieve only temporal specificity and lack spatial specificity. With the light-controlled Cre recombinase transgenic mouse model that we established, spatiotemporally specific gene editing can be achieved by controlling the time and location of illumination, overcoming the shortcomings of traditional research methods.

3. The Cre recombinase produced by the system provided by the invention is formed from two separate fusion proteins, CIB1-CreC_106-343 and CRY2-CreN_19-104. Under suitable blue-light conditions, the CRY2 in CRY2-CreN_19-104 undergoes a conformational change and binds the CIB1 in CIB1-CreC_106-343, so that CreC_106-343 and CreN_19-104 associate to form a protein with Cre recombinase activity, which then edits the target gene located between the loxP sites. To improve the expression efficiency and the induction fold of the Cre recombinase, the system was optimized: the expression efficiency of the blue-light-activated Cre recombinase is markedly improved, and the induction fold is increased about 40-fold.

Drawings

FIG. 1 is a schematic diagram of the structure of plasmids containing different PA-Cre split sites (59/60 and 104/106), in which, from left to right: CMV is a strong mammalian expression promoter derived from human cytomegalovirus; CIB1 is the gene encoding the ligand of the photosensitive protein CRY2; Cre_60-343 and Cre_106-343 are genes encoding the C-terminal fragment of Cre recombinase; IRES is an internal ribosome entry site sequence, shown in SEQ ID No.13; CRY2_L384F is the gene encoding the L384F mutant of the photosensitive protein CRY2; Cre_19-59 and Cre_19-104 are genes encoding the N-terminal fragment of Cre recombinase.

FIG. 2 shows the measured effect of the different PA-Cre split sites on luciferase activity in the cell line.

FIG. 3 is a schematic diagram of PA-Cre constructs with linker peptides (Linker) of different amino acid sequences.

FIG. 4 shows the measured effect of the PA-Cre constructs shown in FIG. 3 on luciferase activity in the cell line.

FIG. 5 is a schematic diagram of PA-Cre constructs with the NLS at different positions.

FIG. 6 shows the measured effect of the PA-Cre constructs shown in FIG. 5 on luciferase activity in the cell line.

FIG. 7 is a schematic diagram of the optimized PA-Cre structure.

FIG. 8 shows the measured effect of the PA-Cre constructs shown in FIG. 7 on luciferase activity in the cell line.

FIG. 9 shows the protein expression level of the optimized PA-Cre: HEK293 cells were transfected, the cells were lysed after 48 h, and the protein was analyzed by western blot.

FIG. 10 is a quantitative analysis of the results of FIG. 9, with the GAPDH protein in HEK293 cells as the internal control.

FIG. 11 shows the activity comparison of the optimized PA-Cre in wild-type mice: the CMV-loxp-stop-loxp-luc plasmid was delivered by hydrodynamic tail vein injection, one group of mice was kept in the dark and the other group was irradiated with blue light, and in vivo imaging was performed 24 h later.

FIG. 12 is a quantitative analysis of the results of FIG. 11; the upper panel corresponds to CMV-PACre and the lower panel to CAG-PACre.

FIG. 13 shows frozen liver sections from CAG-loxp-stop-loxp-tdtomato mice that received the CAG-PACre transgenic vector by hydrodynamic tail vein injection; one group of mice was kept in the dark and the other group was irradiated with blue light, and the livers were collected for sectioning after 48 h. The scale bar at the lower right of the figure represents 200 μm; the line segment marked on the right of the second row indicates an irradiation depth of 400 μm; Merge is the overlay of the images in the first two columns.

FIG. 14 is a flow chart of the construction of PA-Cre transgenic mice using the transposase system.

FIG. 15 shows the results of protein expression characterization of PA-Cre transgenic mice.

FIG. 16 shows in vivo imaging of first-generation PA-Cre transgenic mice before and after illumination, on the seventh day after tail vein injection of AAV8 CMV-loxp-stop-loxp-luc virus; the reference light intensity and color bar values are the same as in FIG. 11.

FIG. 17 is a quantitative analysis of the results of FIG. 16.

FIG. 18 shows the results of protein expression characterization of the first generation PA-Cre transgenic mice.

In FIGS. 2, 4, 6, 8, and 17, the value above each column is the ratio of the light result to the dark result for the same treatment.
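The light-to-dark ratio annotated in these figures is a simple fold-induction calculation. The sketch below shows one way to compute it; the luciferase readings are hypothetical placeholders chosen for illustration, not data from the patent, and the function name is likewise an assumption.

```python
# Fold induction = signal under blue light / signal in the dark,
# for the same construct and treatment. All readings are made-up values.

def fold_induction(light: float, dark: float) -> float:
    """Ratio of the illuminated signal to the dark (background) signal."""
    if dark <= 0:
        raise ValueError("dark signal must be positive to form a ratio")
    return light / dark


# Hypothetical luciferase readings: construct -> (light, dark).
readings = {
    "construct_A": (5200.0, 130.0),
    "construct_B": (900.0, 300.0),
}

for name, (light, dark) in readings.items():
    print(name, round(fold_induction(light, dark), 1))
```

A construct with a high ratio and a low dark reading is both strongly inducible and only weakly leaky, which is the combination the optimization steps aim for.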

Detailed Description

The experimental procedures used in the following examples are all conventional procedures unless otherwise specified.

Materials, reagents and the like used in the following examples are commercially available unless otherwise specified.

Example 1 optimization of the PA-Cre System

In this example, the backbone vector for expressing the Cre recombinase is the commercial vector pT2/HB, and the PA-Cre recombinase gene is inserted into the backbone between the XbaI and EcoRI sites.

The reporter plasmid CMV-loxp-stop-loxp-LUC used in this example was obtained by ligating the sequences of CMV, loxp, stop, loxp, and LUC (SEQ ID No.21) in this order between the NcoI and EcoRI sites of the commercial vector pX330.
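The logic of a loxp-stop-loxp reporter can be illustrated with a toy string model: active Cre removes the segment between two loxP sites in the same orientation and leaves a single loxP behind, so the downstream reporter is no longer blocked. The sketch below is an assumption-laden illustration. The loxP sequence is the commonly cited wild-type site (two 13 bp inverted repeats around an 8 bp spacer) and is not taken from this patent, and the promoter, stop, and reporter strings are placeholders rather than real sequences.

```python
# Toy model of Cre-mediated excision of a floxed stop cassette.
# LOXP is the commonly cited wild-type site, not a sequence from the patent.

def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


ARM = "ATAACTTCGTATA"   # 13 bp recombinase-binding repeat
SPACER = "GCATACAT"     # 8 bp asymmetric spacer that sets orientation
LOXP = ARM + SPACER + reverse_complement(ARM)  # 34 bp in total


def excise_between_loxp(seq: str) -> str:
    """Remove the segment between the first two direct-repeat loxP sites.

    One loxP copy remains after excision. If fewer than two sites are
    found, the sequence is returned unchanged.
    """
    first = seq.find(LOXP)
    if first == -1:
        return seq
    second = seq.find(LOXP, first + len(LOXP))
    if second == -1:
        return seq
    return seq[: first + len(LOXP)] + seq[second + len(LOXP):]


# Placeholder segments (hypothetical, for illustration only).
promoter, stop, reporter = "GGGG", "TTTTAA", "ATGC"
floxed = promoter + LOXP + stop + LOXP + reporter

print(len(LOXP))                                   # 34
print(excise_between_loxp(floxed) == promoter + LOXP + reporter)  # True
```

In the real reporter the excision brings the promoter next to the luciferase coding sequence, which is why luciferase activity reads out recombinase activity.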

1. Determination of the Cre recombinase split site

The commonly used split sites of Cre recombinase are 59/60 and 104/106, so two plasmids with different split sites, PACre59/60 and PACre104/106, were constructed (FIG. 1).

In FIG. 1, for the plasmid PACre104/106, the amino acid sequence of Cre_19-104 is shown in SEQ ID No.3 and its coding sequence in SEQ ID No.4; the amino acid sequence of Cre_106-343 is shown in SEQ ID No.1 and its coding sequence in SEQ ID No.2; the amino acid sequence of CIB1 is shown in SEQ ID No.17 and its coding sequence in SEQ ID No.18; the amino acid sequence of CRY2_L384F is shown in SEQ ID No.19 and its coding sequence in SEQ ID No.20.

HEK293 cells were co-transfected with each of the two plasmids PACre59/60 and PACre104/106 together with the reporter plasmid CMV-loxp-stop-loxp-LUC. After 24 hours, one group was kept in the dark and the other group was irradiated with blue light at an intensity of 4 mW/cm². After 48 hours, luciferase (LUC) activity was measured. As shown in FIG. 2, the Cre recombinase split at 59/60 had almost no activity, so the 104/106 split site was selected.

2. Selection of different Linker peptides (Linker)

In the construction of fusion proteins, the linker peptide has an important influence on protein activity. We selected four linker peptides with different amino acid sequences (FIG. 3), shown in SEQ ID No.5, 6, 7, and 8 respectively, and placed the coding sequence of each linker peptide upstream of both Cre recombinase fragments, giving four different plasmids L1, L2, L3, and L4;

wherein the coding sequence of the linker peptide used in plasmid L1 is shown in SEQ ID No.9.

The four plasmids were each co-transfected into HEK293 cells together with the reporter plasmid CMV-loxp-stop-loxp-LUC, and LUC activity was then measured as in step 1. As shown in FIG. 4, L1 performed best: the Cre recombinase had high activity with almost no background (that is, almost no activity in the dark, so the system is almost free of leakage).
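The correspondence between a coding sequence and its peptide can be checked by translating the DNA with the standard genetic code. The sketch below translates the linker coding sequence of SEQ ID No.9 and compares it with the 15-residue peptide of SEQ ID No.5 from the sequence listing; the helper names are illustrative and not part of the patent.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces and case ignored) to one-letter codes."""
    dna = "".join(dna.split()).upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# SEQ ID No.9 as printed in the sequence listing (grouped in blocks of ten).
SEQ_ID_9 = "ggtggcggtg gctctggagg tggtgggtcc ggaggaggcg gccgc"
# SEQ ID No.5 written in one-letter code: (Gly4-Ser) x 2 followed by Gly4-Arg.
SEQ_ID_5 = "GGGGSGGGGSGGGGR"

print(translate(SEQ_ID_9))              # GGGGSGGGGSGGGGR
print(translate(SEQ_ID_9) == SEQ_ID_5)  # True
```

The same function can be pointed at the other DNA/protein pairs in the listing, for example SEQ ID No.11 against No.10 for the NLS.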

3. Selection of nuclear localization signal sequence (NLS) positions

To further improve the enzymatic activity of the Cre recombinase and its expression efficiency, an NLS was added and its relative position adjusted so that the protein is better enriched in the cell nucleus. To this end we designed four plasmids, N_0, N_C, N_R, and N_CR, which respectively carry: no NLS, an NLS only at the N-terminus of CIB1, an NLS only at the N-terminus of CRY2, and both nuclear localization signal sequences (FIG. 5).

Wherein, the amino acid sequence of the nuclear localization signal NLS is shown as SEQ ID No.10, and the coding gene sequence is shown as SEQ ID No. 11.

The four plasmids were each co-transfected into HEK293 cells together with the reporter plasmid CMV-loxp-stop-loxp-LUC, and LUC activity was then measured as in step 1. As shown in FIG. 6, adding an NLS only at the N-terminus of CIB1, i.e., plasmid N_C, gave the highest fold activation of Cre recombinase, about 40 times higher than that of the plasmid PACre104/106 before optimization.

4. Optimization of expression cassettes/vectors

The promoter was set to CAG (a broad-spectrum strong promoter), a 3×Flag tag was added in front of the nuclear localization signal to facilitate detection of protein expression, and the transcriptional regulatory element WPRE was placed at the end, giving the plasmid containing CAG-PACre; the plasmid CMV-PACre, with CMV as the promoter, served as the control (FIG. 7);

equimolar amounts of the CMV-PACre plasmid and the CAG-PACre plasmid were each co-transfected into HEK293 cells together with the reporter plasmid CMV-loxp-stop-loxp-LUC, and LUC activity was then measured as in step 1. As shown in FIG. 8, CAG-PACre gave the higher induction fold.

Western blot after cell lysis showed that, at equimolar amounts, the expression level of CAG-PACre was much higher than that of CMV-PACre (FIG. 9), and that expression from the plasmids was not affected by blue light irradiation (FIG. 10).

Wherein, the nucleotide sequence of the promoter CAG is shown as SEQ ID No. 12; the amino acid sequence of the T2A is shown as SEQ ID No.14, and the coding gene sequence is shown as SEQ ID No. 15; the nucleotide sequence of the WPRE element is shown as SEQ ID No. 16.
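The element order of the optimized cassette, as pieced together from the preferred embodiments and this step, can be written down as an ordered list and checked against the stated positional constraints. The sketch below is an illustrative data structure only: it assumes T2A (rather than IRES) as the separator, the element labels are shorthand chosen here, and no sequence identifier is given for the 3×Flag tag because none is stated in the text.

```python
# Ordered elements of the CAG-PACre cassette as described in the text.
# Labels and the checking helper are hypothetical, for illustration only.
CASSETTE = [
    ("CAG promoter", "SEQ ID No.12"),
    ("3xFlag", None),                  # no sequence identifier given
    ("NLS", "SEQ ID No.11"),
    ("CIB1", "SEQ ID No.18"),
    ("linker 1", "SEQ ID No.9"),
    ("CreC 106-343", "SEQ ID No.2"),
    ("T2A", "SEQ ID No.15"),           # assumed separator; IRES is the alternative
    ("CRY2", "SEQ ID No.20"),
    ("linker 2", "SEQ ID No.9"),
    ("CreN 19-104", "SEQ ID No.4"),
    ("WPRE", "SEQ ID No.16"),
]

# Positional statements from the description: (upstream, downstream).
PRECEDES = [
    ("CAG promoter", "3xFlag"),
    ("3xFlag", "NLS"),
    ("NLS", "CIB1"),
    ("CIB1", "linker 1"),
    ("linker 1", "CreC 106-343"),
    ("CreC 106-343", "T2A"),
    ("T2A", "CRY2"),
    ("CRY2", "linker 2"),
    ("linker 2", "CreN 19-104"),
    ("CreN 19-104", "WPRE"),
]


def violations(order: list[str]) -> list[tuple[str, str]]:
    """Return every (upstream, downstream) pair that the order breaks."""
    position = {name: i for i, name in enumerate(order)}
    return [(a, b) for a, b in PRECEDES if position[a] >= position[b]]


names = [name for name, _ in CASSETTE]
print(" > ".join(names))
print(violations(names))  # [] when every constraint holds
```

Keeping the constraints separate from the list makes it easy to test a variant layout, such as moving the NLS in front of CRY2, against the same rules.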

Example 2 validation of the function of the PA-Cre transgenic vector in mice

1. Verification on WT mice

The two plasmids containing CAG-PACre or CMV-PACre obtained in step 4 of Example 1 were each co-injected into mice together with the CMV-loxp-stop-loxp-LUC reporter plasmid by hydrodynamic tail vein injection. After 8 h, one group was kept in the dark and the other group was illuminated with blue light at an intensity of 20 mW/cm² (1 min on, 4 min off). After 16 h, in vivo imaging (FIG. 11) gave results consistent with those obtained in HEK293 cells: at equimolar amounts, CAG-PACre not only had a high induction fold but also had enzyme activity far higher than that of CMV-PACre (FIG. 12).

2. Validation on CAG-loxp-stop-loxp-tdtomato mice

To obtain more stable experimental results, we chose a Cre recombinase reporter mouse, the CAG-loxp-stop-loxp-tdtomato mouse. The plasmid containing CAG-PACre was delivered into the reporter mice by hydrodynamic tail vein injection. After 8 h, one group was kept in the dark and the other group was illuminated with blue light at an intensity of 20 mW/cm² (1 min on, 4 min off). After 48 h, frozen sections of mouse liver were prepared, and CAG-PACre was found to excise the stop sequence effectively so that tdtomato was expressed. Compared with the control Cre, CAG-PACre showed stronger spatial specificity: tdtomato was expressed only in the region that blue light could penetrate, with a penetration depth of about 400 μm (FIG. 13).

Example 3 Construction of PA-Cre transgenic mice using the Sleeping Beauty transposon system

The CAG-PACre-containing plasmid was extracted with phenol-chloroform to remove RNase and then microinjected into fertilized eggs together with in vitro transcribed SB transposase mRNA; healthy, viable fertilized eggs were transplanted into pseudopregnant mice (FIG. 14). After birth, mice with a positive genotyping result were subjected to protein identification, which showed that 7 mice (#2, #7, #24, #25, #33, #43, #44) successfully expressed the PA-Cre protein and could be bred (FIG. 15).

Example 4 functional verification of PA-Cre transgenic mice

Based on the protein identification results, 4 mice from Example 3 that successfully expressed the PA-Cre protein (#2, #24, #25, #44) were selected and mated with wild-type mice to produce offspring. Five weeks after the F1 generation was born, we selected mice positive for PA-Cre protein expression and injected each with AAV8 CMV-loxp-stop-loxp-LUC adeno-associated virus. After 7 days, the mice were imaged in vivo without blue light, and no LUC expression was detected (FIG. 16, first row). The mice were then illuminated with blue light at an intensity of 20 mW/cm² (1 min on, 4 min off) for 12 h and imaged in vivo again; LUC expression was detected in some mice (FIG. 16, second row; FIG. 17; FIG. 18), with a highest induction fold of 175, confirming that the PA-Cre transgenic mice were successfully established and can be stably bred.

Matters not described in detail in this specification are within the common knowledge of those skilled in the art. The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement or the like made within the spirit and principle of the present application shall be included in the scope of the claims of the present application.

SEQUENCE LISTING

<110> Bioray Laboratories Inc; East China Normal University

<120> blue light induction activated Cre recombination optimization system and application thereof

<130> JH-CNP190788

<160> 21

<170> PatentIn version 3.5

<210> 1

<211> 238

<212> PRT

<213> Artificial sequence

<400> 1

Arg Pro Ser Asp Ser Asn Ala Val Ser Leu Val Met Arg Arg Ile Arg

1 5 10 15

Lys Glu Asn Val Asp Ala Gly Glu Arg Ala Lys Gln Ala Leu Ala Phe

20 25 30

Glu Arg Thr Asp Phe Asp Gln Val Arg Ser Leu Met Glu Asn Ser Asp

35 40 45

Arg Cys Gln Asp Ile Arg Asn Leu Ala Phe Leu Gly Ile Ala Tyr Asn

50 55 60

Thr Leu Leu Arg Ile Ala Glu Ile Ala Arg Ile Arg Val Lys Asp Ile

65 70 75 80

Ser Arg Thr Asp Gly Gly Arg Met Leu Ile His Ile Gly Arg Thr Lys

85 90 95

Thr Leu Val Ser Thr Ala Gly Val Glu Lys Ala Leu Ser Leu Gly Val

100 105 110

Thr Lys Leu Val Glu Arg Trp Ile Ser Val Ser Gly Val Ala Asp Asp

115 120 125

Pro Asn Asn Tyr Leu Phe Cys Arg Val Arg Lys Asn Gly Val Ala Ala

130 135 140

Pro Ser Ala Thr Ser Gln Leu Ser Thr Arg Ala Leu Glu Gly Ile Phe

145 150 155 160

Glu Ala Thr His Arg Leu Ile Tyr Gly Ala Lys Asp Asp Ser Gly Gln

165 170 175

Arg Tyr Leu Ala Trp Ser Gly His Ser Ala Arg Val Gly Ala Ala Arg

180 185 190

Asp Met Ala Arg Ala Gly Val Ser Ile Pro Glu Ile Met Gln Ala Gly

195 200 205

Gly Trp Thr Asn Val Asn Ile Val Met Asn Tyr Ile Arg Asn Leu Asp

210 215 220

Ser Glu Thr Gly Ala Met Val Arg Leu Leu Glu Asp Gly Asp

225 230 235

<210> 2

<211> 714

<212> DNA

<213> Artificial sequence

<400> 2

cgaccaagtg acagcaatgc tgtttcactg gttatgcggc ggatccgaaa agaaaacgtt 60

gatgccggtg aacgtgcaaa acaggctcta gcgttcgaac gcactgattt cgaccaggtt 120

cgttcactca tggaaaatag cgatcgctgc caggatatac gtaatctggc atttctgggg 180

attgcttata acaccctgtt acgtatagcc gaaattgcca ggatcagggt taaagatatc 240

tcacgtactg acggtgggag aatgttaatc catattggca gaacgaaaac gctggttagc 300

accgcaggtg tagagaaggc acttagcctg ggggtaacta aactggtcga gcgatggatt 360

tccgtctctg gtgtagctga tgatccgaat aactacctgt tttgccgggt cagaaaaaat 420

ggtgttgccg cgccatctgc caccagccag ctatcaactc gcgccctgga agggattttt 480

gaagcaactc atcgattgat ttacggcgct aaggatgact ctggtcagag atacctggcc 540

tggtctggac acagtgcccg tgtcggagcc gcgcgagata tggcccgcgc tggagtttca 600

ataccggaga tcatgcaagc tggtggctgg accaatgtaa atattgtcat gaactatatc 660

cgtaacctgg atagtgaaac aggggcaatg gtgcgcctgc tggaagatgg ggat 714

<210> 3

<211> 86

<212> PRT

<213> Artificial sequence

<400> 3

Thr Ser Asp Glu Val Arg Lys Asn Leu Met Asp Met Phe Arg Asp Arg

1 5 10 15

Gln Ala Phe Ser Glu His Thr Trp Lys Met Leu Leu Ser Val Cys Arg

20 25 30

Ser Trp Ala Ala Trp Cys Lys Leu Asn Asn Arg Lys Trp Phe Pro Ala

35 40 45

Glu Pro Glu Asp Val Arg Asp Tyr Leu Leu Tyr Leu Gln Ala Arg Gly

50 55 60

Leu Ala Val Lys Thr Ile Gln Gln His Leu Gly Gln Leu Asn Met Leu

65 70 75 80

His Arg Arg Ser Gly Leu

85

<210> 4

<211> 258

<212> DNA

<213> Artificial sequence

<400> 4

acgagtgatg aggttcgcaa gaacctgatg gacatgttca gggatcgcca ggcgttttct 60

gagcatacct ggaaaatgct tctgtccgtt tgccggtcgt gggcggcatg gtgcaagttg 120

aataaccgga aatggtttcc cgcagaacct gaagatgttc gcgattatct tctatatctt 180

caggcgcgcg gtctggcagt aaaaactatc cagcaacatt tgggccagct aaacatgctt 240

catcgtcggt ccgggctg 258

<210> 5

<211> 15

<212> PRT

<213> Artificial sequence

<400> 5

Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Arg

1 5 10 15

<210> 6

<211> 10

<212> PRT

<213> Artificial sequence

<400> 6

Leu Glu Ala Ser Thr Gly Gly Ser Gly Thr

1 5 10

<210> 7

<211> 16

<212> PRT

<213> Artificial sequence

<400> 7

Leu Glu Ala Ser Pro Ser Asn Pro Gly Ala Ser Asn Gly Ser Gly Thr

1 5 10 15

<210> 8

<211> 15

<212> PRT

<213> Artificial sequence

<400> 8

Glu Ala Ala Ala Lys Glu Ala Ala Ala Lys Glu Ala Ala Ala Lys

1 5 10 15

<210> 9

<211> 45

<212> DNA

<213> Artificial sequence

<400> 9

ggtggcggtg gctctggagg tggtgggtcc ggaggaggcg gccgc 45

<210> 10

<211> 12

<212> PRT

<213> Artificial sequence

<400> 10

Ala Ser Pro Lys Lys Lys Arg Lys Val Glu Ala Ser

1 5 10

<210> 11

<211> 36

<212> DNA

<213> Artificial sequence

<400> 11

gccagtccca agaagaagag aaaggtggag gccagt 36

<210> 12

<211> 1716

<212> DNA

<213> Artificial sequence

<400> 12

ctagttatta atagtaatca attacggggt cattagttca tagcccatat atggagttcc 60

gcgttacata acttacggta aatggcccgc ctggctgacc gcccaacgac ccccgcccat 120

tgacgtcaat aatgacgtat gttcccatag taacgccaat agggactttc cattgacgtc 180

aatgggtgga gtatttacgg taaactgccc acttggcagt acatcaagtg tatcatatgc 240

caagtacgcc ccctattgac gtcaatgacg gtaaatggcc cgcctggcat tatgcccagt 300

acatgacctt atgggacttt cctacttggc agtacatcta cgtattagtc atcgctatta 360

ccatggtcga ggtgagcccc acgttctgct tcactctccc catctccccc ccctccccac 420

ccccaatttt gtatttattt attttttaat tattttgtgc agcgatgggg gcgggggggg 480

ggggggggcg cgcgccaggc ggggcggggc ggggcgaggg gcggggcggg gcgaggcgga 540

gaggtgcggc ggcagccaat cagagcggcg cgctccgaaa gtttcctttt atggcgaggc 600

ggcggcggcg gcggccctat aaaaagcgaa gcgcgcggcg ggcggggagt cgctgcgacg 660

ctgccttcgc cccgtgcccc gctccgccgc cgcctcgcgc cgcccgcccc ggctctgact 720

gaccgcgtta ctcccacagg tgagcgggcg ggacggccct tctcctccgg gctgtaatta 780

gcgcttggtt taatgacggc ttgtttcttt tctgtggctg cgtgaaagcc ttgaggggct 840

ccgggagggc cctttgtgcg gggggagcgg ctcggggggt gcgtgcgtgt gtgtgtgcgt 900

ggggagcgcc gcgtgcggct ccgcgctgcc cggcggctgt gagcgctgcg ggcgcggcgc 960

ggggctttgt gcgctccgca gtgtgcgcga ggggagcgcg gccgggggcg gtgccccgcg 1020

gtgcgggggg ggctgcgagg ggaacaaagg ctgcgtgcgg ggtgtgtgcg tgggggggtg 1080

agcagggggt gtgggcgcgt cggtcgggct gcaacccccc ctgcaccccc ctccccgagt 1140

tgctgagcac ggcccggctt cgggtgcggg gctccgtacg gggcgtggcg cggggctcgc 1200

cgtgccgggc ggggggtggc ggcaggtggg ggtgccgggc ggggcggggc cgcctcgggc 1260

cggggagggc tcgggggagg ggcgcggcgg cccccggagc gccggcggct gtcgaggcgc 1320

ggcgagccgc agccattgcc ttttatggta atcgtgcgag agggcgcagg gacttccttt 1380

gtcccaaatc tgtgcggagc cgaaatctgg gaggcgccgc cgcaccccct ctagcgggcg 1440

cggggcgaag cggtgcggcg ccggcaggaa ggaaatgggc ggggagggcc ttcgtgcgtc 1500

gccgcgccgc cgtccccttc tccctctcca gcctcggggc tgtccgcggg gggacggctg 1560

ccttcggggg ggacggggca gggcggggtt cggcttctgg cgtgtgaccg gcggctctag 1620

agcctctgct aaccatgttc atgccttctt ctttttccta cagctcctgg gcaacgtgct 1680

ggttattgtg ctgtctcatc attttggcaa agaatt 1716

<210> 13

<211> 587

<212> DNA

<213> Artificial sequence

<400> 13

cccctctccc tccccccccc ctaacgttac tggccgaagc cgcttggaat aaggccggtg 60

tgcgtttgtc tatatgttat tttccaccat attgccgtct tttggcaatg tgagggcccg 120

gaaacctggc cctgtcttct tgacgagcat tcctaggggt ctttcccctc tcgccaaagg 180

aatgcaaggt ctgttgaatg tcgtgaagga agcagttcct ctggaagctt cttgaagaca 240

aacaacgtct gtagcgaccc tttgcaggca gcggaacccc ccacctggcg acaggtgcct 300

ctgcggccaa aagccacgtg tataagatac acctgcaaag gcggcacaac cccagtgcca 360

cgttgtgagt tggatagttg tggaaagagt caaatggctc tcctcaagcg tattcaacaa 420

ggggctgaag gatgcccaga aggtacccca ttgtatggga tctgatctgg ggcctcggta 480

cacatgcttt acatgtgttt agtcgaggtt aaaaaaacgt ctaggccccc cgaaccacgg 540

ggacgtggtt ttcctttgaa aaacacgatg ataatatggc cacaacc 587

<210> 14

<211> 21

<212> PRT

<213> Artificial sequence

<400> 14

Gly Ser Gly Glu Gly Arg Gly Ser Leu Leu Thr Cys Gly Asp Val Glu

1 5 10 15

Glu Asn Pro Gly Pro

20

<210> 15

<211> 63

<212> DNA

<213> Artificial sequence

<400> 15

ggcagtggag agggcagagg aagtctgcta acatgcggtg acgtcgagga gaatcctggc 60

cca 63
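
As a consistency check on the two preceding entries, the 63-nucleotide sequence of SEQ ID No. 15 can be translated and compared with the 21-residue peptide of SEQ ID No. 14. The sketch below is illustrative only and is not part of the filed sequence listing. It assumes the standard genetic code, and the names `CODON_TABLE`, `translate`, `SEQ_ID_15` and `SEQ_ID_14` are hypothetical helpers introduced here.

```python
# Illustrative check (not part of the filed listing), assuming the standard
# genetic code. All names below are hypothetical helpers.
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# SEQ ID No. 15 as printed in the listing (spaces are ignored when translating).
SEQ_ID_15 = (
    "ggcagtggag agggcagagg aagtctgcta acatgcggtg acgtcgagga gaatcctggc cca"
)

# SEQ ID No. 14 written in one-letter code.
SEQ_ID_14 = "GSGEGRGSLLTCGDVEENPGP"


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, ignoring whitespace and case."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


if __name__ == "__main__":
    peptide = translate(SEQ_ID_15)
    print(peptide, len(peptide), peptide == SEQ_ID_14)
```

The same helper could be applied to the other DNA/protein pairs in the listing (for example SEQ ID No. 18 against SEQ ID No. 17), provided the protein entry is first converted from three-letter to one-letter code.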

<210> 16

<211> 589

<212> DNA

<213> Artificial sequence

<400> 16

aatcaacctc tggattacaa aatttgtgaa agattgactg gtattcttaa ctatgttgct 60

ccttttacgc tatgtggata cgctgcttta atgcctttgt atcatgctat tgcttcccgt 120

atggctttca ttttctcctc cttgtataaa tcctggttgc tgtctcttta tgaggagttg 180

tggcccgttg tcaggcaacg tggcgtggtg tgcactgtgt ttgctgacgc aacccccact 240

ggttggggca ttgccaccac ctgtcagctc ctttccggga ctttcgcttt ccccctccct 300

attgccacgg cggaactcat cgccgcctgc cttgcccgct gctggacagg ggctcggctg 360

ttgggcactg acaattccgt ggtgttgtcg gggaaatcat cgtcctttcc ttggctgctc 420

gcctgtgttg ccacctggat tctgcgcggg acgtccttct gctacgtccc ttcggccctc 480

aatccagcgg accttccttc ccgcggcctg ctgccggctc tgcggcctct tccgcgtctt 540

cgccttcgcc ctcagacgag tcggatctcc ctttgggccg cctccccgc 589

<210> 17

<211> 334

<212> PRT

<213> Artificial sequence

<400> 17

Asn Gly Ala Ile Gly Gly Asp Leu Leu Leu Asn Phe Pro Asp Met Ser

1 5 10 15

Val Leu Glu Arg Gln Arg Ala His Leu Lys Tyr Leu Asn Pro Thr Phe

20 25 30

Asp Ser Pro Leu Ala Gly Phe Phe Ala Asp Ser Ser Met Ile Thr Gly

35 40 45

Gly Glu Met Asp Ser Tyr Leu Ser Thr Ala Gly Leu Asn Leu Pro Met

50 55 60

Met Tyr Gly Glu Thr Thr Val Glu Gly Asp Ser Arg Leu Ser Ile Ser

65 70 75 80

Pro Glu Thr Thr Leu Gly Thr Gly Asn Phe Lys Lys Arg Lys Phe Asp

85 90 95

Thr Glu Thr Lys Asp Cys Asn Glu Lys Lys Lys Lys Met Thr Met Asn

100 105 110

Arg Asp Asp Leu Val Glu Glu Gly Glu Glu Glu Lys Ser Lys Ile Thr

115 120 125

Glu Gln Asn Asn Gly Ser Thr Lys Ser Ile Lys Lys Met Lys His Lys

130 135 140

Ala Lys Lys Glu Glu Asn Asn Phe Ser Asn Asp Ser Ser Lys Val Thr

145 150 155 160

Lys Glu Leu Glu Lys Thr Asp Tyr Ile His Val Arg Ala Arg Arg Gly

165 170 175

Gln Ala Thr Asp Ser His Ser Ile Ala Glu Arg Val Arg Arg Glu Lys

180 185 190

Ile Ser Glu Arg Met Lys Phe Leu Gln Asp Leu Val Pro Gly Cys Asp

195 200 205

Lys Ile Thr Gly Lys Ala Gly Met Leu Asp Glu Ile Ile Asn Tyr Val

210 215 220

Gln Ser Leu Gln Arg Gln Ile Glu Phe Leu Ser Met Lys Leu Ala Ile

225 230 235 240

Val Asn Pro Arg Pro Asp Phe Asp Met Asp Asp Ile Phe Ala Lys Glu

245 250 255

Val Ala Ser Thr Pro Met Thr Val Val Pro Ser Pro Glu Met Val Leu

260 265 270

Ser Gly Tyr Ser His Glu Met Val His Ser Gly Tyr Ser Ser Glu Met

275 280 285

Val Asn Ser Gly Tyr Leu His Val Asn Pro Met Gln Gln Val Asn Thr

290 295 300

Ser Ser Asp Pro Leu Ser Cys Phe Asn Asn Gly Glu Ala Pro Ser Met

305 310 315 320

Trp Asp Ser His Val Gln Asn Leu Tyr Gly Asn Leu Gly Val

325 330

<210> 18

<211> 1002

<212> DNA

<213> Artificial sequence

<400> 18

aatggagcca tcggcggcga cctgctgctg aacttccccg atatgagcgt gctggagaga 60

cagagggccc acctgaagta cctgaacccc acctttgaca gccctctggc tggcttcttc 120

gccgactcca gcatgatcac cggcggagag atggacagct atctgagcac cgccggcctg 180

aacctgccca tgatgtacgg cgagacaaca gtggagggcg actccagact gtccatcagc 240

cccgaaacaa ccctgggcac cggcaacttc aagaagagga agttcgacac cgagaccaaa 300

gactgcaacg aaaagaagaa aaagatgacc atgaacagag atgacctggt ggaggagggc 360

gaggaggaga agagcaagat caccgagcag aacaacggca gcaccaagtc catcaagaag 420

atgaagcata aggccaagaa ggaagagaac aatttcagca acgacagctc caaggtgacc 480

aaggagctcg agaagaccga ctacatccac gtgagggcca gaaggggcca ggccacagac 540

agccactcca ttgccgagag ggtcagaagg gagaagatct ccgagaggat gaagttcctg 600

caagacctgg tgcccggctg tgacaaaatc accggcaagg ccggcatgct ggacgagatc 660

atcaactacg tccagagcct ccagaggcag atcgagttcc tctccatgaa actggccatc 720

gtgaatccca ggcccgactt cgacatggac gacatcttcg ccaaggaggt cgcctccacc 780

cccatgacag tcgtgcccag ccccgagatg gtgctgtccg gatacagcca cgagatggtg 840

cacagcggct acagctccga gatggtgaac tccggctacc tgcacgtgaa tcccatgcag 900

caggtgaaca catcctccga tcccctgagc tgcttcaaca acggagaggc ccccagcatg 960

tgggactccc acgtgcagaa cctgtacgga aatctgggcg tg 1002

<210> 19

<211> 611

<212> PRT

<213> Artificial sequence

<400> 19

Lys Met Asp Lys Lys Thr Ile Val Trp Phe Arg Arg Asp Leu Arg Ile

1 5 10 15

Glu Asp Asn Pro Ala Leu Ala Ala Ala Ala His Glu Gly Ser Val Phe

20 25 30

Pro Val Phe Ile Trp Cys Pro Glu Glu Glu Gly Gln Phe Tyr Pro Gly

35 40 45

Arg Ala Ser Arg Trp Trp Met Lys Gln Ser Leu Ala His Leu Ser Gln

50 55 60

Ser Leu Lys Ala Leu Gly Ser Asp Leu Thr Leu Ile Lys Thr His Asn

65 70 75 80

Thr Ile Ser Ala Ile Leu Asp Cys Ile Arg Val Thr Gly Ala Thr Lys

85 90 95

Val Val Phe Asn His Leu Tyr Asp Pro Val Ser Leu Val Arg Asp His

100 105 110

Thr Val Lys Glu Lys Leu Val Glu Arg Gly Ile Ser Val Gln Ser Tyr

115 120 125

Asn Gly Asp Leu Leu Tyr Glu Pro Trp Glu Ile Tyr Cys Glu Lys Gly

130 135 140

Lys Pro Phe Thr Ser Phe Asn Ser Tyr Trp Lys Lys Cys Leu Asp Met

145 150 155 160

Ser Ile Glu Ser Val Met Leu Pro Pro Pro Trp Arg Leu Met Pro Ile

165 170 175

Thr Ala Ala Ala Glu Ala Ile Trp Ala Cys Ser Ile Glu Glu Leu Gly

180 185 190

Leu Glu Asn Glu Ala Glu Lys Pro Ser Asn Ala Leu Leu Thr Arg Ala

195 200 205

Trp Ser Pro Gly Trp Ser Asn Ala Asp Lys Leu Leu Asn Glu Phe Ile

210 215 220

Glu Lys Gln Leu Ile Asp Tyr Ala Lys Asn Ser Lys Lys Val Val Gly

225 230 235 240

Asn Ser Thr Ser Leu Leu Ser Pro Tyr Leu His Phe Gly Glu Ile Ser

245 250 255

Val Arg His Val Phe Gln Cys Ala Arg Met Lys Gln Ile Ile Trp Ala

260 265 270

Arg Asp Lys Asn Ser Glu Gly Glu Glu Ser Ala Asp Leu Phe Leu Arg

275 280 285

Gly Ile Gly Leu Arg Glu Tyr Ser Arg Tyr Ile Cys Phe Asn Phe Pro

290 295 300

Phe Thr His Glu Gln Ser Leu Leu Ser His Leu Arg Phe Phe Pro Trp

305 310 315 320

Asp Ala Asp Val Asp Lys Phe Lys Ala Trp Arg Gln Gly Arg Thr Gly

325 330 335

Tyr Pro Leu Val Asp Ala Gly Met Arg Glu Phe Trp Ala Thr Gly Trp

340 345 350

Met His Asn Arg Ile Arg Val Ile Val Ser Ser Phe Ala Val Lys Phe

355 360 365

Leu Leu Leu Pro Trp Lys Trp Gly Met Lys Tyr Phe Trp Asp Thr Leu

370 375 380

Leu Asp Ala Asp Leu Glu Cys Asp Ile Leu Gly Trp Gln Tyr Ile Ser

385 390 395 400

Gly Ser Ile Pro Asp Gly His Glu Leu Asp Arg Leu Asp Asn Pro Ala

405 410 415

Leu Gln Gly Ala Lys Tyr Asp Pro Glu Gly Glu Tyr Ile Arg Gln Trp

420 425 430

Leu Pro Glu Leu Ala Arg Leu Pro Thr Glu Trp Ile His His Pro Trp

435 440 445

Asp Ala Pro Leu Thr Val Leu Lys Ala Ser Gly Val Glu Leu Gly Thr

450 455 460

Asn Tyr Ala Lys Pro Ile Val Asp Ile Asp Thr Ala Arg Glu Leu Leu

465 470 475 480

Ala Lys Ala Ile Ser Arg Thr Arg Glu Ala Gln Ile Met Ile Gly Ala

485 490 495

Ala Pro Asp Glu Ile Val Ala Asp Ser Phe Glu Ala Leu Gly Ala Asn

500 505 510

Thr Ile Lys Glu Pro Gly Leu Cys Pro Ser Val Ser Ser Asn Asp Gln

515 520 525

Gln Val Pro Ser Ala Val Arg Tyr Asn Gly Ser Lys Arg Val Lys Pro

530 535 540

Glu Glu Glu Glu Glu Arg Asp Met Lys Lys Ser Arg Gly Phe Asp Glu

545 550 555 560

Arg Glu Leu Phe Ser Thr Ala Glu Ser Ser Ser Ser Ser Ser Val Phe

565 570 575

Phe Val Ser Gln Ser Cys Ser Leu Ala Ser Glu Gly Lys Asn Leu Glu

580 585 590

Gly Ile Gln Asp Ser Ser Asp Gln Ile Thr Thr Ser Leu Gly Lys Asn

595 600 605

Gly Cys Lys

610

<210> 20

<211> 1833

<212> DNA

<213> Artificial sequence

<400> 20

aagatggaca agaaaaccat cgtctggttc aggagggacc tgaggatcga ggataacccc 60

gctctggctg ctgctgctca cgagggttct gtcttccctg tgtttatttg gtgccctgag 120

gaggagggac agttctatcc tggcagggcc agcaggtggt ggatgaagca gtccctggct 180

cacctgtccc agagcctgaa ggctctgggc agcgatctca ccctcatcaa aacccacaac 240

accatctccg ccatcctcga ctgcatcaga gtcaccggcg ccaccaaggt ggtgttcaac 300

catctctacg accctgtgtc cctggtcaga gaccacacag tcaaggagaa gctcgtcgaa 360

agaggaatct ccgtgcagtc ctacaacggc gacctgctgt acgagccctg ggagatttac 420

tgcgagaagg gcaagccctt cacatccttc aacagctact ggaagaagtg tctggacatg 480

tccatcgaga gcgtcatgct gccccctcct tggagactga tgcccattac cgctgccgct 540

gaggctatct gggcttgttc catcgaagaa ctcggcctgg agaacgaggc cgaaaagccc 600

agcaacgccc tgctcaccag agcttggtcc cccggctgga gcaatgccga caagctgctc 660

aacgagttca tcgagaagca gctgatcgac tatgccaaga acagcaagaa agtggtgggc 720

aatagcacca gcctgctgag cccctacctg catttcggag agatttccgt gaggcacgtc 780

ttccagtgcg ccaggatgaa gcaaatcatc tgggccagag acaagaacag cgaaggagag 840

gagtccgccg atctctttct caggggaatc ggcctcagag agtatagcag gtacatttgc 900

ttcaacttcc cctttaccca tgagcagagc ctcctgagcc acctcagatt ctttccttgg 960

gacgccgatg tggacaaatt caaagcctgg aggcagggaa ggaccggata ccctctggtg 1020

gacgccggca tgagagagtt ttgggctacc ggctggatgc acaacaggat tagggtcatc 1080

gtgagcagct ttgccgtcaa attcctcctg ctgccctgga agtggggcat gaagtatttc 1140

tgggacacac tgctggatgc cgatctcgag tgcgacatcc tgggctggca gtatatcagc 1200

ggctccatcc ctgatggcca cgagctcgac agactggaca accctgccct gcagggcgct 1260

aagtacgacc ccgaaggcga gtacatcaga caatggctgc ctgaactggc cagactccct 1320

acagagtgga ttcatcaccc ctgggacgcc cctctgaccg tcctgaaggc cagcggagtg 1380

gagctgggca ccaactacgc taaacccatc gtcgacatcg acacagccag ggagctcctc 1440

gccaaggcca tctccagaac cagggaagct cagatcatga tcggcgccgc tcccgatgag 1500

atcgtggccg attccttcga agccctggga gctaacacca tcaaggagcc cggactgtgc 1560

ccctccgtga gcagcaacga ccagcaagtg ccctccgccg tgaggtataa cggctccaag 1620

agagtgaaac ccgaagagga ggaagagaga gacatgaaga agagcagggg cttcgacgaa 1680

agggagctgt tttccaccgc tgaatccagc agctcctcct ccgtcttctt cgtgagccag 1740

tcctgtagcc tggccagcga gggcaagaac ctggaaggaa tccaggacag ctccgaccag 1800

attaccacca gcctcggaaa gaacggctgc aag 1833

<210> 21

<211> 1650

<212> DNA

<213> Artificial sequence

<400> 21

gaagacgcca aaaacataaa gaaaggcccg gcgccattct atccgctgga agatggaacc 60

gctggagagc aactgcataa ggctatgaag agatacgccc tggttcctgg aacaattgct 120

tttacagatg cacatatcga ggtggacatc acttacgctg agtacttcga aatgtccgtt 180

cggttggcag aagctatgaa acgatatggg ctgaatacaa atcacagaat cgtcgtatgc 240

agtgaaaact ctcttcaatt ctttatgccg gtgttgggcg cgttatttat cggagttgca 300

gttgcgcccg cgaacgacat ttataatgaa cgtgaattgc tcaacagtat gggcatttcg 360

cagcctaccg tggtgttcgt ttccaaaaag gggttgcaaa aaattttgaa cgtgcaaaaa 420

aagctcccaa tcatccaaaa aattattatc atggattcta aaacggatta ccagggattt 480

cagtcgatgt acacgttcgt cacatctcat ctacctcccg gttttaatga atacgatttt 540

gtgccagagt ccttcgatag ggacaagaca attgcactga tcatgaactc ctctggatct 600

actggtctgc ctaaaggtgt cgctctgcct catagaactg cctgcgtgag attctcgcat 660

gccagagatc ctatttttgg caatcaaatc attccggata ctgcgatttt aagtgttgtt 720

ccattccatc acggttttgg aatgtttact acactcggat atttgatatg tggatttcga 780

gtcgtcttaa tgtatagatt tgaagaagag ctgtttctga ggagccttca ggattacaag 840

attcaaagtg cgctgctggt gccaacccta ttctccttct tcgccaaaag cactctgatt 900

gacaaatacg atttatctaa tttacacgaa attgcttctg gtggcgctcc cctctctaag 960

gaagtcgggg aagcggttgc caagaggttc catctgccag gtatcaggca aggatatggg 1020

ctcactgaga ctacatcagc tattctgatt acacccgagg gggatgataa accgggcgcg 1080

gtcggtaaag ttgttccatt ttttgaagcg aaggttgtgg atctggatac cgggaaaacg 1140

ctgggcgtta atcaaagagg cgaactgtgt gtgagaggtc ctatgattat gtccggttat 1200

gtaaacaatc cggaagcgac caacgccttg attgacaagg atggatggct acattctgga 1260

gacatagctt actgggacga agacgaacac ttcttcatcg ttgaccgcct gaagtctctg 1320

attaagtaca aaggctatca ggtggctccc gctgaattgg aatccatctt gctccaacac 1380

cccaacatct tcgacgcagg tgtcgcaggt cttcccgacg atgacgccgg tgaacttccc 1440

gccgccgttg ttgttttgga gcacggaaag acgatgacgg aaaaagagat cgtggattac 1500

gtcgccagtc aagtaacaac cgcgaaaaag ttgcgcggag gagttgtgtt tgtggacgaa 1560

gtaccgaaag gtcttaccgg aaaactcgac gcaagaaaaa tcagagagat cctcataaag 1620

gccaagaagg gcggaaagat cgccgtgtaa 1650

Claims

1. A blue light induced and activated Cre recombination optimization system, characterized in that: the system comprises a blue light induced and activated Cre recombinase expression cassette comprising the following genes linked in order: a gene encoding the photosensitive protein ligand CIB1, a gene encoding the C-terminal fragment of Cre recombinase, a gene encoding the photosensitive protein CRY2, and a gene encoding the N-terminal fragment of Cre recombinase;

the amino acid sequence of the C-terminal fragment of Cre recombinase is shown as SEQ ID No. 1, the amino acid sequence of the N-terminal fragment of Cre recombinase is shown as SEQ ID No. 3, the nucleotide sequence of the gene encoding the C-terminal fragment is shown as SEQ ID No. 2, and the nucleotide sequence of the gene encoding the N-terminal fragment is shown as SEQ ID No. 4;

the expression cassette further comprises a gene encoding a nuclear localization signal (NLS), which is located upstream of the gene encoding the photosensitive protein ligand CIB1, and the amino acid sequence of the NLS is shown as SEQ ID No. 10;

the expression cassette further comprises a gene encoding a first connecting peptide and a gene encoding a second connecting peptide, wherein the gene encoding the first connecting peptide is located between the gene encoding the photosensitive protein ligand CIB1 and the gene encoding the C-terminal fragment of Cre recombinase, and the gene encoding the second connecting peptide is located between the gene encoding the photosensitive protein CRY2 and the gene encoding the N-terminal fragment of Cre recombinase;

the expression cassette further comprises a promoter; the expression cassette further comprises a gene encoding a self-cleaving peptide, or an IRES, located between the gene encoding the C-terminal fragment of Cre recombinase and the gene encoding the photosensitive protein CRY2; the expression cassette further comprises a transcriptional regulatory element located downstream of the gene encoding the N-terminal fragment of Cre recombinase, and the transcriptional regulatory element is a WPRE element.

2. The Cre recombination optimization system according to claim 1, wherein: the nucleotide sequence of the gene encoding the nuclear localization signal NLS is shown as SEQ ID No. 11.

3. The Cre recombination optimization system according to claim 1, wherein: the amino acid sequences of the first connecting peptide and the second connecting peptide are shown as SEQ ID No. 5, 6, 7 or 8.

4. The Cre recombination optimization system according to claim 3, wherein: the amino acid sequences of the first connecting peptide and the second connecting peptide are shown as SEQ ID No. 5.

5. The Cre recombination optimization system according to claim 4, wherein: the nucleotide sequences of the genes encoding the first connecting peptide and the second connecting peptide are shown as SEQ ID No. 9.

6. The Cre recombination optimization system according to claim 1, wherein: the promoter is CAG or CMV.

7. The Cre recombination optimization system according to claim 6, wherein: the promoter is CAG.

8. The Cre recombination optimization system according to claim 7, wherein: the nucleotide sequence of the CAG promoter is shown as SEQ ID No. 12.

9. The Cre recombination optimization system according to claim 1, wherein: the self-cleaving peptide is T2A.

10. The Cre recombination optimization system according to claim 1, wherein: the nucleotide sequence of the IRES is shown as SEQ ID No. 13, and the self-cleaving peptide is T2A.

11. The Cre recombination optimization system according to claim 10, wherein: the amino acid sequence of T2A is shown as SEQ ID No. 14.

12. The Cre recombination optimization system according to claim 11, wherein: the nucleotide sequence of the gene encoding T2A is shown as SEQ ID No. 15.

13. The Cre recombination optimization system according to claim 1, wherein: the expression cassette further comprises a gene encoding a protein tag, which is located between the promoter and the gene encoding the nuclear localization signal NLS, and the protein tag comprises any one or at least two of Myc, His, GST, HA, Flag, MBP, Avi tag, SUMO and c-Myc tag;

and/or the photosensitive protein ligand CIB1 has the amino acid sequence shown as SEQ ID No. 17;

and/or the photosensitive protein CRY2 has the amino acid sequence shown as SEQ ID No. 19.

14. The Cre recombination optimization system according to claim 1, wherein: the nucleotide sequence of the WPRE element is shown as SEQ ID No. 16;

and/or the nucleotide sequence of the gene encoding the photosensitive protein ligand CIB1 is shown as SEQ ID No. 18;

and/or the nucleotide sequence of the gene encoding the photosensitive protein CRY2 is shown as SEQ ID No. 20.

15. The Cre recombination optimization system according to claim 13, wherein: the protein tag is Flag.

16. The Cre recombination optimization system according to claim 15, wherein: the protein tag is 3×Flag.

17. A recombinant vector comprising the Cre recombination optimization system according to any one of claims 1 to 16, wherein the recombinant vector is a plasmid vector, a lentiviral vector, a retroviral vector, an adenoviral vector, an adeno-associated viral vector, a simian virus vector, a vaccinia virus vector, a Sendai virus vector, an Epstein-Barr (EB) virus vector, or a herpes simplex virus vector.

18. The recombinant vector according to claim 17, wherein: the plasmid vector is a Cre/loxP recombinase system plasmid or a Sleeping Beauty transposon system plasmid.

19. Use of the Cre recombination optimization system according to any one of claims 1 to 16, or of the recombinant vector according to any one of claims 17 to 18, for preparing a blue light induced and activated Cre recombinase transgenic cell line or transgenic animal model, wherein the animal is a mammal.

20. The use according to claim 19, wherein: the mammal is a mouse or a rat.

21. The use according to claim 20, wherein: the mammal is a mouse.

22. The use according to any one of claims 19 to 21, wherein: for the blue light induced and activated Cre recombinase transgenic cell line, the blue light intensity is 3-5 mW/cm², and the irradiation mode is that the blue light is switched on for 20-40 s and then switched off for 2-5 min.

23. Use of a blue light induced and activated Cre recombinase transgenic cell line or transgenic animal, prepared using the Cre recombination optimization system according to any one of claims 1 to 16 or the recombinant vector according to any one of claims 17 to 18, in gene function research, lineage tracing and cell depletion.
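
For orientation only, the element order recited in claims 1 and 13 and the illumination window recited in claim 22 can be expressed as a short sketch. This is an illustrative reading and not part of the claims. The element labels, the function names, and the placement of the promoter first (inferred from claim 13, which places the tag between the promoter and the NLS gene) are assumptions introduced here. The dose figure is simply intensity multiplied by on-time (mW·s/cm² = mJ/cm²).

```python
# Illustrative sketch only; labels and function names are hypothetical.

# 5'->3' element order as read from claims 1 and 13.
CASSETTE_ORDER = [
    "promoter",     # CAG or CMV (claim 6); first position inferred from claim 13
    "tag",          # optional protein tag, e.g. 3xFlag (claims 13 and 16)
    "NLS",          # nuclear localization signal, upstream of CIB1
    "CIB1",
    "linker1",      # first connecting peptide, between CIB1 and CreC
    "CreC",         # C-terminal fragment of Cre recombinase
    "T2A_or_IRES",  # between CreC and CRY2
    "CRY2",
    "linker2",      # second connecting peptide, between CRY2 and CreN
    "CreN",         # N-terminal fragment of Cre recombinase
    "WPRE",         # downstream of CreN
]


def follows_claimed_order(elements):
    """Return True if `elements` contains every required part exactly once,
    in the relative order above (the optional tag may be absent)."""
    try:
        positions = [CASSETTE_ORDER.index(e) for e in elements]
    except ValueError:
        return False  # unknown element label
    required = set(CASSETTE_ORDER) - {"tag"}
    strictly_increasing = positions == sorted(set(positions))
    return strictly_increasing and required <= set(elements)


def duty_cycle(on_s, off_s):
    """Fraction of each on/off period during which the light is on."""
    return on_s / (on_s + off_s)


def dose_per_pulse_mj_cm2(intensity_mw_cm2, on_s):
    """Energy delivered per on-period in mJ/cm^2."""
    return intensity_mw_cm2 * on_s


if __name__ == "__main__":
    print(follows_claimed_order(CASSETTE_ORDER))
    # Extremes of the claim 22 window: 20-40 s on, 2-5 min off, 3-5 mW/cm^2.
    print(duty_cycle(20, 5 * 60), duty_cycle(40, 2 * 60))
    print(dose_per_pulse_mj_cm2(3, 20), dose_per_pulse_mj_cm2(5, 40))
```

Under these assumptions the claimed window corresponds to a duty cycle between 6.25% and 25% and to 60-200 mJ/cm² per on-period.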