Detailed Description
To more clearly illustrate the technical solutions of the present invention, the following embodiments are further described, but the present invention is not limited thereto, and these embodiments are only some examples of the present invention.
Example 1: Construction of the S tet on stable cell line
Construction of the S tet on stable cell line (control group: GFP tet on stable cell line) comprised: constructing a Tet-On inducible S protein lentiviral expression vector; packaging the lentivirus; transducing Vero E6 cells with the lentivirus; screening for an S tet on cell line; inducing the GFP tet on cell line to express GFP with doxycycline hydrochloride (DOX); inducing S tet on cells to express S protein with DOX; inducing the S tet on monoclonal cell line to express S protein with DOX; and establishing a syncytium lesion model. The specific steps are as follows:
(1) DNA sequence information for the S protein was obtained from the NCBI database, and a codon-optimized DNA fragment encoding the S protein was artificially synthesized; its nucleotide sequence is shown as SEQ ID NO: 1. The codon-optimized S protein nucleotide sequence is 72% identical to that of the novel coronavirus Wuhan strain (NCBI database ACCESSION NC_045512, REGION: 21563..25384), while the encoded protein sequences remain 100% identical. Codon optimization of the viral spike protein gene serves two purposes: ① it improves biosafety by avoiding contamination of human DNA with viral DNA through homologous recombination; ② it improves expression of the S protein in stably transduced cells.
(2) To improve the expression efficiency of the S protein, a Kozak consensus sequence was added upstream of the S protein DNA fragment; the DNA sequence of the Kozak consensus sequence is GCCACCATGG. Because the final G of the Kozak sequence (the G at position +4) plays an important role in ribosomal recognition of the start of the mRNA reading frame during translation, an alanine codon (GCC) was added immediately after the ATG start codon of the S protein DNA fragment. Alanine is one of the smallest aliphatic amino acids, and its methyl side chain is non-reactive and is rarely directly involved in protein function, which minimizes the possibility that the added alanine affects the function of the viral S protein product. The nucleotide sequence of the resulting target fragment is shown as SEQ ID NO: 2.
(3) The target fragment obtained in step (2) was ligated into a lentiviral expression vector containing a Tet-On regulatory system, which was then co-transfected with lentiviral packaging helper plasmids to obtain a recombinant lentivirus for Tet-On inducible expression of the S protein.
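The comparison described in step (1), lower nucleotide identity with an unchanged protein sequence, can be illustrated with a minimal sketch. The first sequence is the first 30 nucleotides of SEQ ID NO: 1; the second is a hypothetical synonymous recoding written only for this illustration and is not taken from NC_045512. The function names are likewise illustrative.

```python
# Standard genetic code (NCBI translation table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence into a one-letter protein string."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of matching positions between two equal-length, pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    matches = sum(x == y for x, y in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# First 30 nt of SEQ ID NO: 1 (codon-optimized fragment).
optimized = "atgttcgtgttcctggtgctgctgccgctg"
# Hypothetical synonymous recoding of the same ten codons (assumption, for illustration only).
variant = "atgtttgtttttcttgttttattgccacta"

nucleotide_identity = percent_identity(optimized, variant)
same_protein = translate(optimized) == translate(variant)
print(f"nucleotide identity: {nucleotide_identity:.1f}%, same protein: {same_protein}")
```

Applied to the full-length sequences, the same two functions would give the nucleotide identity and confirm whether the encoded proteins match; full-length comparison of sequences of unequal length would first require an alignment step, which this sketch does not perform.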
To construct a transgenic vector that is regulated by the Tet-On system and can stably integrate the target DNA segment (encoding the S protein) into the host cell genome for long-term expression of the S protein, the invention adopts a lentiviral transduction design. The basic vector used is the lentiviral expression vector pLent-TRE3G-hPGK-rtTA-SV40-Puro (Virginia Biotechnology Co., Ltd., http:// www.weizhenbio.cn/clone erviceZtDetail. phpid ═ 122& pid ═ 11& cid ═ fk ═ 1).
The plasmid map of this basic vector is shown in FIG. 5, in which the elements directly relevant to the present invention are:
1) TRE3G (tetracycline-responsive element 3G promoter, GenBank accession number: MN044710.1), which controls the mRNA transcription level of the target S protein, so that S protein expression ranges from very low basal expression to maximal expression after induction. The nucleotide sequence of this element is shown as SEQ ID NO: 4;
2) rtTA (reverse tetracycline-controlled transactivator, i.e. the Tet-On regulator, GenBank accession number: U89930.1), whose nucleotide sequence is shown as SEQ ID NO: 5.
Other important elements contained in the basic vector are:
1) Puro (puromycin resistance marker), used for screening resistant cells;
2) WPRE (woodchuck hepatitis virus posttranscriptional regulatory element), which enhances transgene expression;
The following are basic elements of a lentiviral transduction system:
3) cPPT (central polypurine tract), which enhances integration and transduction efficiency of the vector;
4) the psi (Ψ) sequence, a four-stem-loop higher-order RNA structure, which plays a key role in improving transgene efficiency;
5) the 5' LTR and 3' LTR (5' and 3' long terminal repeats), which facilitate integration of the transfer plasmid sequence into the host genome.
Primers containing BamHI and MluI restriction sites were designed to amplify the target fragment obtained in step (2); the nucleotide sequence of the amplified product is shown as SEQ ID NO: 3 (see FIG. 6). The amplified product and the vector pLent-TRE3G-hPGK-rtTA-SV40-Puro were then each digested with BamHI and MluI and ligated to obtain the recombinant lentiviral expression vector.
The lentiviral packaging procedure is briefly as follows: the prepared Tet-On regulated S protein lentiviral expression vector is mixed with a lentiviral packaging plasmid (psPAX2) and an envelope plasmid (pMD2.G) and co-transfected into HEK293T cells; after 72 hours of culture, lentiviral particles are collected from the culture supernatant and the viral titer is determined. See http:// www.weizhenbio.cn/useFavoriteDetail. phpid 2& pid 20& cid 22.
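The relationship between the three sequence lengths in the sequence listing (3822, 3831 and 3843 nt for SEQ ID NO: 1, 2 and 3) can be checked with a short sketch. It assumes that the Kozak bases GCCACC are placed directly before the ATG, that the alanine codon GCC directly follows the ATG, and that the BamHI site (GGATCC) and MluI site (ACGCGT) directly flank the fragment at the 5' and 3' ends respectively. The 3' placement of the MluI site is an assumption, since only the 5' end of SEQ ID NO: 3 can be inspected here. A placeholder coding sequence of the right length stands in for SEQ ID NO: 1.

```python
KOZAK_UPSTREAM = "GCCACC"  # Kozak bases preceding the start codon
ALANINE_CODON = "GCC"      # inserted right after ATG so that position +4 is G
BAMHI_SITE = "GGATCC"      # BamHI recognition sequence
MLUI_SITE = "ACGCGT"       # MluI recognition sequence


def add_kozak_and_alanine(cds: str) -> str:
    """Prefix the Kozak bases and insert an alanine codon after the start codon."""
    cds = cds.upper()
    if not cds.startswith("ATG"):
        raise ValueError("coding sequence must start with ATG")
    return KOZAK_UPSTREAM + "ATG" + ALANINE_CODON + cds[3:]


def add_cloning_sites(fragment: str) -> str:
    """Flank the fragment with BamHI (5') and MluI (3') sites (assumed layout)."""
    return BAMHI_SITE + fragment.upper() + MLUI_SITE


# Placeholder with the length of SEQ ID NO: 1 (3822 nt); N stands for any base.
placeholder_cds = "ATG" + "N" * 3819

target_fragment = add_kozak_and_alanine(placeholder_cds)  # corresponds to SEQ ID NO: 2
amplicon = add_cloning_sites(target_fragment)             # corresponds to SEQ ID NO: 3

print(len(placeholder_cds), len(target_fragment), len(amplicon))
print(amplicon[:18])
```

Under these assumptions the lengths grow by 9 nt (six Kozak bases plus one codon) and then by 12 nt (two six-base restriction sites), which matches the lengths declared in the listing, and the first bases of the amplicon match the start of SEQ ID NO: 3.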
(4) Vero E6 cells were plated in a 10 cm cell culture dish the day before lentiviral transduction, and 1.5 ml of basic medium containing lentivirus was added at an MOI of 1 (just enough basic medium to cover the cell layer). Taking 50 ml as an example, the basic medium is formulated as DMEM/HG containing 10% FBS and 1% P/S (GFP tet on was used as the control);
(5) after 8 h, the lentivirus-containing basic medium was removed from the cell layer, 8 ml of basic medium was added, and the cells were cultured at 37 °C with 5% CO2;
(6) after 2 days, the basic medium was removed and screening medium (basic medium containing 16 μg/ml puromycin) was added; screening for about one week, with the medium replaced every 2-3 days, yielded positive S tet on stably transduced cells;
Reagent | Manufacturer | Cat. no. | Storage | Specification
Puromycin | Solarbio | P8230 | -20 °C | 25 mg
(7) 1000 S tet on cells were seeded in a 10 cm cell culture dish and cultured at 37 °C with 5% CO2 for about 1 month, and monoclonal cells were picked for expansion culture;
(8) the inducer, the tetracycline analogue doxycycline hydrochloride (DOX), was added to the screening medium at 0.1-1 μg/ml, and the S tet on cells were induced and cultured at 37 °C with 5% CO2;
(9) after 8 h of induction, the growth state of the S tet on cells was observed under a microscope;
(10) total protein was collected from S tet on cells induced for different lengths of time, and S protein expression was verified by Western blot;
(11) total protein was collected from S tet on monoclonal cells induced for 24 h, and S protein expression was verified by Western blot.
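Step (4) specifies an MOI of 1 in 1.5 ml of basic medium. As a hedged illustration of the underlying arithmetic, the volume of viral stock follows from MOI = (transducing units added) / (number of cells). The cell count and titer below are hypothetical values chosen for the example; neither is stated in this description.

```python
def virus_volume_ml(cell_count: float, moi: float, titer_tu_per_ml: float) -> float:
    """Volume of viral stock (ml) that delivers the requested MOI."""
    if titer_tu_per_ml <= 0:
        raise ValueError("titer must be positive")
    return cell_count * moi / titer_tu_per_ml


# Hypothetical inputs (assumptions, not from the description):
cells_in_dish = 2.0e6      # cells in the 10 cm dish at the time of transduction
titer = 1.0e8              # transducing units per ml of viral stock
total_volume_ml = 1.5      # transduction volume given in step (4)

stock_ml = virus_volume_ml(cells_in_dish, moi=1.0, titer_tu_per_ml=titer)
medium_ml = total_volume_ml - stock_ml
print(f"viral stock: {stock_ml * 1000:.0f} ul, basic medium: {medium_ml:.2f} ml")
```

With a measured titer and an actual cell count substituted for the placeholders, the same function gives the stock volume to make up to 1.5 ml with basic medium.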
Example 2: Induction of the GFP tet on cell line with 0.1-1 μg/ml doxycycline hydrochloride (DOX) for 24 h results in expression of green fluorescent protein (GFP)
Study subjects: GFP tet on cell line
The specific experimental steps are as follows:
(1) obtaining the DNA sequence information of GFP and of the expression vector from the database;
(2) integrating the GFP sequence into the Tet-On system, and packaging the modified Tet-On construct into a lentivirus;
(3) screening GFP tet on stable cells by the same method used for the S tet on stable cell line;
(4) in a biosafety cabinet, removing the original medium from the GFP tet on cells in a six-well plate, adding 2.5 ml of pre-warmed medium containing 0.1 μg/ml or 1 μg/ml DOX to each well, and recording this time as 0 h of induction;
(5) after 24 h of DOX induction, observing under a microscope whether the GFP tet on cells show a green fluorescence signal (GFP emits green light when excited by blue light) (FIG. 1);
(6) analyzing the data and results.
The results are shown in FIG. 1, indicating that:
(1) The tetracycline analogue doxycycline hydrochloride (DOX) (0.1-1 μg/ml) activates the Tet-On system, GFP is expressed, and the cells emit green fluorescence under blue-light excitation.
(2) Adding 1 μg/ml DOX as an inducer to the screening medium resulted in a large amount of GFP at 24 h, so the working concentration of DOX was set at 1 μg/ml.
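The working concentrations used in this example (0.1 μg/ml and 1 μg/ml DOX in 2.5 ml per well) follow the usual dilution relation C1·V1 = C2·V2. The sketch below assumes a 1 mg/ml DOX stock solution; the stock concentration is a hypothetical value, as the description does not state one.

```python
def stock_volume_ul(final_ug_per_ml: float, final_volume_ml: float,
                    stock_mg_per_ml: float) -> float:
    """Volume of stock (ul) needed to reach the final concentration (C1*V1 = C2*V2)."""
    if stock_mg_per_ml <= 0:
        raise ValueError("stock concentration must be positive")
    stock_ug_per_ml = stock_mg_per_ml * 1000.0
    volume_ml = final_ug_per_ml * final_volume_ml / stock_ug_per_ml
    return volume_ml * 1000.0


ASSUMED_STOCK_MG_PER_ML = 1.0  # hypothetical DOX stock concentration
WELL_VOLUME_ML = 2.5           # medium volume per well, from step (4)

for final_conc in (0.1, 1.0):
    ul = stock_volume_ul(final_conc, WELL_VOLUME_ML, ASSUMED_STOCK_MG_PER_ML)
    print(f"{final_conc} ug/ml DOX in {WELL_VOLUME_ML} ml: {ul:.2f} ul of stock")
```

Sub-microlitre volumes such as the one needed for 0.1 μg/ml are difficult to pipette accurately, so in practice an intermediate dilution of the stock, or preparing the medium for all wells in one batch, is the more usual approach.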
Example 3: The S tet on cell line undergoes cell fusion, i.e., forms syncytia, under induction with 1 μg/ml doxycycline hydrochloride (DOX).
Study subjects: S tet on cell line
The experimental steps are as follows:
(1) seeding S tet on cells in a six-well plate at 4.0 × 10^5 cells/well in 2.5 ml of screening medium, and culturing overnight at 37 °C with 5% CO2; in a separate experiment, S tet on and GFP tet on cells were mixed at a 1:1 ratio and plated in the same way;
(2) in a biosafety cabinet, removing the original screening medium from the S tet on cells in the six-well plate, adding 2.5 ml of pre-warmed medium containing 1 μg/ml DOX to each well, and recording this time as 0 h of induction;
(3) observing the growth state of the cells under a microscope at 8 h, 16 h, 24 h and 32 h after DOX addition, and recording by photography; analyzing the data and results.
The results are shown in FIG. 2, indicating that:
(1) Cell fusion was observed 8 h after the S tet on cells were placed in induction medium, indicating that 1 μg/ml DOX activates the Tet-On system to start S protein expression, after which the S protein binds ACE2 on the cell surface and induces cell fusion.
(2) As the DOX induction time increased, cell-cell fusion intensified, cell membranes ruptured, and a large number of plaques appeared on the bottom of the culture dish. This provides a very intuitive readout for future drug screening with this cell model: by observing changes in the degree of cell fusion under a microscope, one can judge whether a drug protects the cells against S protein-mediated attack.
Example 4: Western blot confirms that DOX induces S tet on cells to express S protein
Study subjects: S tet on cells
The specific experimental steps are as follows:
(1) seeding cells in a six-well plate at 4.0 × 10^5 cells/well in 2.5 ml of screening medium, and culturing overnight at 37 °C with 5% CO2;
(2) in a biosafety cabinet, removing the original screening medium from the S tet on cells in the six-well plate, adding 2.5 ml of pre-warmed medium containing 1 μg/ml DOX to each well, and recording this time as 0 h of induction;
(3) collecting S tet on cells at 8 h, 16 h, 24 h and 32 h after DOX induction;
(4) scraping the cells from the bottom of the well with a cell scraper, and transferring the cell suspension to a 15 ml centrifuge tube;
(5) centrifuging at 3000 g and 4 °C for 5 min;
(6) discarding the supernatant, adding 1 ml of 1× PBS, and centrifuging at 3000 g and 4 °C for 5 min;
(7) repeating step (5) once;
(8) discarding the supernatant and adding 40 μl of RIPA buffer (containing 0.4 μl of 100 mM PMSF);
(9) incubating on ice for 30 min, vortexing for 3 s every 5 min;
(10) centrifuging at 14000 g for 5 min, and transferring the supernatant to a clean 1.5 ml centrifuge tube for later use;
(11) taking 20 μl of 5 mg/ml BSA standard protein, adding 180 μl of 1× PBS and mixing by shaking (giving 0.5 mg/ml BSA), then diluting to the standard concentrations in the following table for later use;
0.5 mg/ml BSA (μl) | 1× PBS (μl) | BSA final concentration (mg/ml)
40 | 0 | 0.5
32 | 8 | 0.4
24 | 16 | 0.3
16 | 24 | 0.2
12 | 28 | 0.15
8 | 32 | 0.1
4 | 36 | 0.05
0 | 40 | 0
(12) diluting the extracted protein samples 4-, 6- and 10-fold (diluted volume > 10 μl) for later use;
(13) preparing the BCA working solution by mixing the BCA solution and the Cu solution at a ratio of 50:1;
(14) taking a clear-bottom 96-well plate, and adding 100 μl of BCA working solution followed by 10 μl of protein sample to each well (each concentration of the standard protein in triplicate);
(15) incubating at 37 °C for 30 min; measuring the absorbance at 562 nm; calculating the protein concentration of each sample from the absorbance and known concentrations of the standard protein;
(16) preparing the protein electrophoresis gel (8% separating gel and 5% stacking gel); mixing 20 μg of protein with loading buffer (made up to 28 μl), denaturing at 100 °C for 5 min, and loading into the gel wells in order; electrophoresis at 90 V for 30 min, then 120 V for 2 h; membrane transfer at 200 mA for 150 min; blocking overnight at 4 °C;
(17) incubating with primary antibody for 1 h at room temperature, with antibody dilutions as follows: anti-S (~180 kDa) antibody, 1:1000; anti-GAPDH (37 kDa) antibody, 1:500;
(18) washing off the primary antibody with 1× TBST (0.05% Tween 20), shaking for 10 min, repeated 3 times;
(19) incubating with secondary antibody for 1 h at room temperature: for anti-S, Anti-Mouse (1:5000); for anti-GAPDH, Anti-Rabbit (1:5000);
(20) washing off the secondary antibody with 1× TBST (0.05% Tween 20), shaking for 10 min, repeated 3 times;
(21) developing: placing the PVDF membrane in developing solution, soaking with shaking for 2 min, then exposing and imaging in an imager, the developing solution being prepared as solution I : solution II : water = 1:1:2;
(22) analyzing the data and results.
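The BCA quantification in steps (11) to (15) can be sketched as follows. The dilution volumes come from the table above, and the stock concentration follows from step (11) (20 μl of 5 mg/ml BSA plus 180 μl of PBS). The absorbance readings are hypothetical, generated from an idealized straight line purely so that the example is self-checking, and a simple least-squares line is assumed as the standard curve; the description does not state which curve-fitting method was used.

```python
# Step (11): 20 ul of 5 mg/ml BSA + 180 ul PBS gives the 0.5 mg/ml working stock.
STOCK_MG_PER_ML = 5.0 * 20 / (20 + 180)

# (ul of 0.5 mg/ml BSA, ul of 1x PBS) for each standard, from the table.
DILUTIONS = [(40, 0), (32, 8), (24, 16), (16, 24), (12, 28), (8, 32), (4, 36), (0, 40)]


def standard_concentrations(stock, dilutions):
    """Final BSA concentration (mg/ml) of each standard."""
    return [stock * bsa / (bsa + pbs) for bsa, pbs in dilutions]


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sample_concentration(absorbance, slope, intercept, dilution_factor):
    """Concentration of the undiluted sample (mg/ml) read off the standard curve."""
    return (absorbance - intercept) / slope * dilution_factor


concentrations = standard_concentrations(STOCK_MG_PER_ML, DILUTIONS)

# Hypothetical A562 readings (assumption): an ideal line with blank 0.10 and slope 1.0.
absorbances = [0.10 + 1.0 * c for c in concentrations]
slope, intercept = fit_line(concentrations, absorbances)

# Hypothetical reading of a lysate that was diluted 4-fold before the assay.
lysate_mg_per_ml = sample_concentration(0.35, slope, intercept, dilution_factor=4)
print([round(c, 2) for c in concentrations], round(lysate_mg_per_ml, 3))
```

With real plate-reader data, the triplicate readings of each standard would be averaged before fitting, and only sample dilutions whose absorbance falls inside the range of the standards should be used for the calculation.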
Primary antibodies:
SARS-CoV-2 (2019-nCoV) Nucleocapsid Antibody, Rabbit PAb, Antigen Affinity Purified (Manufacturer: Sino Biological; Cat: 40588-T62) (1:1000)
Anti-GAPDH antibody (Manufacturer: Santa; Cat: SC32233) (1:500)
Secondary antibodies:
Goat Anti-Rabbit IgG (H+L), HRP Conjugate (Manufacturer: TRANS; Cat: HS101-01) (1:5000)
Goat Anti-Mouse IgG (H+L), HRP Conjugate (Manufacturer: TRANS; Cat: HS201-01) (1:5000)
The results are shown in FIG. 3, indicating that:
(1) 1 μg/ml DOX induces S tet on cells to express S protein, consistent with the results of Example 3;
(2) S tet on cells begin to express S protein after 8 h of induction with 1 μg/ml DOX; S protein expression increases with induction time and is highest at 24 h. At 32 h of DOX induction, however, the level of full-length S protein decreases and the level of S1 protein increases, indicating that after 32 h of induction a large amount of S protein has undergone self-cleavage, generating a 110 kDa S1 subunit fragment.
Example 5: Western blot confirms that DOX induces S tet on monoclonal cells to express S protein.
Study subjects: S tet on monoclonal cell line
The specific experimental steps are as follows:
(1) in a biosafety cabinet, seeding 1000 S tet on cells in a 10 cm cell culture dish and adding 8-10 ml of screening medium (containing 5 μM of the screening drug Y27632);
(2) culturing at 37 °C with 5% CO2;
(3) when monoclonal colonies in the dish became visible (about 1 month), picking single clones into a 96-well plate, adding 200 μl of screening medium, and culturing at 37 °C with 5% CO2;
(4) once the monoclonal cells covered more than 80% of the bottom of the well, digesting the cells and transferring them to a 48-well plate, then expanding them successively into a 24-well plate, a 12-well plate, a 6 cm dish and a 10 cm dish;
(5) seeding cells in a 6-well plate at 4.0 × 10^5 cells/well in 2.5 ml of screening medium, and culturing overnight at 37 °C with 5% CO2;
(6) in a biosafety cabinet, removing the original screening medium from the S tet on cells in the 6-well plate, adding 2.5 ml of pre-warmed medium containing 1 μg/ml DOX to each well, and recording this time as 0 h of induction;
(7) at 24 h of induction, scraping the cells from the bottom of the well with a cell scraper, and transferring the cell suspension to a 15 ml centrifuge tube;
(8) centrifuging at 3000 g and 4 °C for 5 min;
(9) discarding the supernatant, adding 1 ml of 1× PBS, and centrifuging at 3000 g and 4 °C for 5 min;
(10) repeating step (8) once;
(11) discarding the supernatant and adding 40 μl of RIPA buffer (containing 0.4 μl of 100 mM PMSF);
(12) incubating on ice for 30 min, vortexing every 5 min;
(13) centrifuging at 14000 g for 5 min, and transferring the supernatant to a 1.5 ml centrifuge tube for later use;
(14) taking 20 μl of 5 mg/ml BSA standard protein, adding 180 μl of 1× PBS and mixing by shaking (giving 0.5 mg/ml BSA), then diluting to the standard concentrations in the following table for later use;
0.5 mg/ml BSA (μl) | 1× PBS (μl) | BSA final concentration (mg/ml)
40 | 0 | 0.5
32 | 8 | 0.4
24 | 16 | 0.3
16 | 24 | 0.2
12 | 28 | 0.15
8 | 32 | 0.1
4 | 36 | 0.05
0 | 40 | 0
(15) diluting the extracted protein samples 4-, 6- and 10-fold (diluted volume > 10 μl) for later use;
(16) preparing the BCA working solution by mixing the BCA solution and the Cu solution at a ratio of 50:1;
(17) taking a clear-bottom 96-well plate, and adding 100 μl of BCA working solution followed by 10 μl of protein sample to each well (each concentration of the standard protein in triplicate);
(18) incubating at 37 °C for 30 min;
(19) measuring the absorbance at 562 nm;
(20) calculating the protein concentration of each sample from the absorbance and known concentrations of the standard protein;
(21) preparing the protein electrophoresis gel (8% separating gel and 5% stacking gel); mixing 20 μg of protein with loading buffer (made up to 28 μl), denaturing at 100 °C for 5 min, and loading into the gel wells in order;
(22) electrophoresis at 90 V for 30 min, then 120 V for 2 h; membrane transfer at 200 mA for 150 min; blocking overnight at 4 °C;
(23) incubating with primary antibody for 1 h at room temperature, with antibody dilutions as follows: anti-S (~180 kDa) antibody, 1:1000; anti-GAPDH (37 kDa) antibody, 1:500;
(24) washing off the primary antibody with 1× TBST (0.05% Tween 20), shaking for 10 min, repeated 3 times;
(25) incubating with secondary antibody for 1 h at room temperature: for anti-S, Anti-Mouse (1:5000); for anti-GAPDH, Anti-Rabbit (1:5000);
(26) washing off the secondary antibody with 1× TBST (0.05% Tween 20), shaking for 10 min, repeated 3 times;
(27) developing: placing the PVDF membrane in developing solution, soaking with shaking for 2 min, then exposing and imaging in an imager, the developing solution being prepared as solution I : solution II : water = 1:1:2;
(28) analyzing the data and results.
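The gel-loading step above (20 μg of protein made up to 28 μl) reduces to a small calculation once the BCA concentration of each lysate is known. The sketch below treats everything other than lysate as "buffer" in a single volume; how that volume is split between concentrated loading buffer and diluent is not specified in the description, and the example concentration is hypothetical.

```python
def loading_mix(lysate_mg_per_ml: float, protein_ug: float = 20.0,
                total_ul: float = 28.0):
    """Return (lysate_ul, buffer_ul) for loading a fixed protein mass in a fixed volume.

    1 mg/ml equals 1 ug/ul, so the lysate volume is simply mass / concentration.
    """
    if lysate_mg_per_ml <= 0:
        raise ValueError("lysate concentration must be positive")
    lysate_ul = protein_ug / lysate_mg_per_ml
    if lysate_ul > total_ul:
        raise ValueError("lysate too dilute to load the requested mass in this volume")
    return lysate_ul, total_ul - lysate_ul


# Hypothetical lysate concentration from the BCA assay (assumption).
lysate_ul, buffer_ul = loading_mix(2.0)
print(f"lysate: {lysate_ul:.1f} ul, buffer: {buffer_ul:.1f} ul")
```

The explicit error for over-dilute lysates reflects a practical limit: below about 0.71 mg/ml (20 μg / 28 μl), 20 μg cannot fit in the 28 μl loading volume.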
Primary antibodies:
SARS-CoV-2 (2019-nCoV) Nucleocapsid Antibody, Rabbit PAb, Antigen Affinity Purified (Manufacturer: Sino Biological; Cat: 40588-T62) (1:1000)
Anti-GAPDH antibody (Manufacturer: Santa; Cat: SC32233) (1:500)
Anti-β-Tubulin Mouse Monoclonal Antibody (Manufacturer: TRANS; Cat: HC101-02) (1:1000)
Secondary antibodies:
Goat Anti-Rabbit IgG (H+L), HRP Conjugate (Manufacturer: TRANS; Cat: HS101-01) (1:5000)
Goat Anti-Mouse IgG (H+L), HRP Conjugate (Manufacturer: TRANS; Cat: HS201-01) (1:5000)
The results are shown in FIG. 4, indicating that:
(1) S tet on monoclonal cells cultured for 24 h in induction medium containing 1 μg/ml DOX express a large amount of S protein;
(2) when S protein is abundantly expressed in the cells, expression of the cytoskeletal protein Tubulin falls sharply until it disappears, indicating that syncytia have formed between the cells.
Finally, it should be noted that the above embodiments are only used for illustrating the technical solutions of the present invention and not for limiting the protection scope of the present invention, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions can be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.
References
bioRxiv preprint, doi: https://doi.org/10.1101/2020.03.16.994152; this version posted March 17, 2020.
SEQUENCE LISTING
<110> five-membered Biotech Ltd of Nanchang
<120> cell syncytium lesion model based on S-TET protein expression system and preparation method thereof
<130> 2021.07.15
<160> 5
<170> PatentIn version 3.3
<210> 1
<211> 3822
<212> DNA
<213> Artificial sequence
<400> 1
atgttcgtgt tcctggtgct gctgccgctg gtgtcctccc aatgcgtgaa ccttaccact 60
cggacccagc tgcctcctgc ttacaccaat agctttacga gaggggtgta ctaccctgac 120
aaggtgttca gaagcagcgt gctgcacagc acccaagacc tgttcctgcc tttcttcagc 180
aatgtgacat ggttccacgc catccacgtg tctgggacta acggcaccaa gcgctttgac 240
aaccccgtgc tgcctttcaa cgatggcgtt tacttcgctt cgaccgaaaa gtccaatatc 300
atccgtggct ggattttcgg cacaaccctg gatagcaaga cccagagcct gctgatcgtg 360
aacaacgcca ccaacgtggt gatcaaggtg tgcgagttcc agttctgcaa cgaccctttt 420
ctgggcgtgt attaccacaa aaacaacaag tcctggatgg aatcagaatt cagagtgtat 480
agcagcgcca acaactgcac gttcgagtac gtcagccagc cttttcttat ggacctggag 540
ggcaagcagg gcaacttcaa aaatctgaga gagttcgtgt tcaagaatat cgatggctac 600
ttcaagatct acagcaagca cacacctatt aacctggttc gggacctgcc ccagggcttt 660
agcgctctgg aacccctggt cgacctgcct atcgggatca acatcaccag atttcagacc 720
ctgctggccc tacacagaag ctacctgacc cctggcgact cttcttccgg atggaccgcc 780
ggcgctgccg cctactacgt gggctacctg cagcctagaa catttctgct gaaatacaac 840
gagaacggca ccatcaccga cgccgtggat tgtgccctgg accccctttc tgagacaaag 900
tgcaccctga aatctttcac cgtggagaag ggcatctacc agaccagcaa cttccgggtc 960
cagcctaccg agtctatcgt acggttccct aacatcacga acctctgtcc ttttggcgag 1020
gtgtttaacg ccacaagatt cgccagcgtg tacgcctgga accggaagag aatcagcaac 1080
tgcgtggctg attacagcgt gctgtacaac agcgcctctt tcagcacatt caagtgttac 1140
ggcgtgtctc ccaccaagct aaacgacctg tgcttcacca acgtgtatgc cgacagcttc 1200
gtgatcaggg gcgacgaggt gcgacagatc gctcctggcc aaacaggcaa gatcgccgac 1260
tacaattaca agctgccaga cgacttcaca ggctgcgtga tcgcctggaa cagcaacaac 1320
ctggacagca aggtgggcgg caactacaac tacctgtacc ggctgtttag aaagagcaac 1380
ctgaagccct tcgagagaga tatcagcacc gagatctacc aggctggcag caccccttgc 1440
aacggcgttg agggcttcaa ctgttatttc ccactccaat cttacggctt ccagcctaca 1500
aatggcgtgg gctaccagcc ttaccgggtg gtcgtgctca gctttgagct gctgcatgcc 1560
cctgccaccg tttgtggccc taagaagagc accaacctgg tgaagaataa atgcgtgaat 1620
ttcaatttca acggcctgac cggcaccggt gtcctgaccg aatccaacaa gaagttcctg 1680
cccttccaac agttcggcag agatatcgcc gacaccaccg atgctgttag agatccccag 1740
accctggaga tcctggacat cacaccctgc tctttcggtg gcgtcagtgt gatcactccc 1800
ggcacaaata ccagcaatca agtggccgtg ctttaccagg atgttaactg tacagaggtc 1860
cccgtggcca tacacgcgga ccagctgacc ccaacatggc gggtgtactc gacaggcagc 1920
aacgtgttcc agacccgcgc gggctgtctg atcggcgccg aacacgttaa caactcctac 1980
gagtgcgata tccctatcgg cgccggcatt tgcgccagct accagaccca gaccaacagc 2040
cctcggagag ctagatctgt ggccagccag tctatcatcg cctacaccat gagtctgggt 2100
gctgaaaact ctgtagcata tagcaataat agcatcgcca tccccaccaa tttcaccatc 2160
agcgtgacca cagaaatcct gcctgtgtct atgaccaaga ccagcgtgga ctgtaccatg 2220
tacatctgcg gcgattccac agaatgctct aacctgctgc tgcagtacgg ctccttttgc 2280
acacagctga acagagccct gaccggaatt gcagtggaac aagacaagaa cacacaggag 2340
gtgttcgctc aggtgaagca gatctacaaa acccctccta tcaaggactt cggcggtttc 2400
aatttcagcc agattctgcc tgatcctagc aaaccttcca agcggagctt catcgaggac 2460
ctgctgttca acaaggtgac cctggccgat gccggcttca tcaagcaata cggcgactgc 2520
ctgggagaca tcgccgctcg ggacctgatc tgcgcccaga agtttaacgg cctgaccgtg 2580
ctgcctccac tgttgaccga cgaaatgatc gctcagtaca ccagcgccct gctggccgga 2640
acaatcacca gcggatggac attcggcgcc ggcgccgccc tgcagattcc tttcgctatg 2700
cagatggcct atagattcaa cggaatcgga gtcacccaga acgtgctata cgagaaccag 2760
aaacttatcg ccaaccagtt taactccgcc atcggaaaga tccaggattc cctgagcagc 2820
acagcttctg ccctcggaaa actgcaggac gtggtgaacc aaaacgccca ggccctgaac 2880
accctggtga agcagctgag ctcaaacttc ggcgccatca gctccgtgct caacgacatc 2940
ctgtctagac tggacaaagt ggaagccgag gtgcagatcg accggctgat caccggaaga 3000
ctacagagcc tgcagacata cgtcacccag cagctgatca gagccgccga gattcgcgcc 3060
agcgctaatc tcgccgccac aaagatgagc gaatgtgtgc tgggccagag caagagagtg 3120
gacttctgcg gcaaaggcta ccacctgatg agcttccccc agtctgctcc ccacggcgtg 3180
gtgttcctgc atgtgaccta cgtgccagcc caggagaaga acttcactac agcccctgct 3240
atctgccacg acggcaaggc ccacttccct agagagggcg tgttcgtgag caacggcacc 3300
cactggttcg tgacccagag aaacttctac gagccccaga tcatcacaac agacaacacc 3360
ttcgtatctg gcaactgcga tgtggtgatc ggaatcgtga ataacaccgt gtacgacccc 3420
ctgcagcctg agctggatag ttttaaagag gaactggaca agtactttaa gaaccacacc 3480
tccccagacg tggacctggg cgatatcagc ggaataaatg cctcagtggt gaacatccag 3540
aaagaaatcg acagactgaa cgaggtcgcc aagaacctga acgagtccct gatcgacctg 3600
caagaactgg gcaagtacga gcagtacatc aagtggcctt ggtacatctg gctgggcttc 3660
atcgctggac tgatcgccat cgtgatggtg accatcatgc tgtgctgcat gacaagctgc 3720
tgtagctgcc tgaagggctg ctgcagctgc ggatcttgtt gcaagttcga cgaagatgat 3780
tccgaacccg tgctgaaggg cgtgaaactc cactacacct ga 3822
<210> 2
<211> 3831
<212> DNA
<213> Artificial sequence
<400> 2
gccaccatgg ccttcgtgtt cctggtgctg ctgccgctgg tgtcctccca atgcgtgaac 60
cttaccactc ggacccagct gcctcctgct tacaccaata gctttacgag aggggtgtac 120
taccctgaca aggtgttcag aagcagcgtg ctgcacagca cccaagacct gttcctgcct 180
ttcttcagca atgtgacatg gttccacgcc atccacgtgt ctgggactaa cggcaccaag 240
cgctttgaca accccgtgct gcctttcaac gatggcgttt acttcgcttc gaccgaaaag 300
tccaatatca tccgtggctg gattttcggc acaaccctgg atagcaagac ccagagcctg 360
ctgatcgtga acaacgccac caacgtggtg atcaaggtgt gcgagttcca gttctgcaac 420
gacccttttc tgggcgtgta ttaccacaaa aacaacaagt cctggatgga atcagaattc 480
agagtgtata gcagcgccaa caactgcacg ttcgagtacg tcagccagcc ttttcttatg 540
gacctggagg gcaagcaggg caacttcaaa aatctgagag agttcgtgtt caagaatatc 600
gatggctact tcaagatcta cagcaagcac acacctatta acctggttcg ggacctgccc 660
cagggcttta gcgctctgga acccctggtc gacctgccta tcgggatcaa catcaccaga 720
tttcagaccc tgctggccct acacagaagc tacctgaccc ctggcgactc ttcttccgga 780
tggaccgccg gcgctgccgc ctactacgtg ggctacctgc agcctagaac atttctgctg 840
aaatacaacg agaacggcac catcaccgac gccgtggatt gtgccctgga ccccctttct 900
gagacaaagt gcaccctgaa atctttcacc gtggagaagg gcatctacca gaccagcaac 960
ttccgggtcc agcctaccga gtctatcgta cggttcccta acatcacgaa cctctgtcct 1020
tttggcgagg tgtttaacgc cacaagattc gccagcgtgt acgcctggaa ccggaagaga 1080
atcagcaact gcgtggctga ttacagcgtg ctgtacaaca gcgcctcttt cagcacattc 1140
aagtgttacg gcgtgtctcc caccaagcta aacgacctgt gcttcaccaa cgtgtatgcc 1200
gacagcttcg tgatcagggg cgacgaggtg cgacagatcg ctcctggcca aacaggcaag 1260
atcgccgact acaattacaa gctgccagac gacttcacag gctgcgtgat cgcctggaac 1320
agcaacaacc tggacagcaa ggtgggcggc aactacaact acctgtaccg gctgtttaga 1380
aagagcaacc tgaagccctt cgagagagat atcagcaccg agatctacca ggctggcagc 1440
accccttgca acggcgttga gggcttcaac tgttatttcc cactccaatc ttacggcttc 1500
cagcctacaa atggcgtggg ctaccagcct taccgggtgg tcgtgctcag ctttgagctg 1560
ctgcatgccc ctgccaccgt ttgtggccct aagaagagca ccaacctggt gaagaataaa 1620
tgcgtgaatt tcaatttcaa cggcctgacc ggcaccggtg tcctgaccga atccaacaag 1680
aagttcctgc ccttccaaca gttcggcaga gatatcgccg acaccaccga tgctgttaga 1740
gatccccaga ccctggagat cctggacatc acaccctgct ctttcggtgg cgtcagtgtg 1800
atcactcccg gcacaaatac cagcaatcaa gtggccgtgc tttaccagga tgttaactgt 1860
acagaggtcc ccgtggccat acacgcggac cagctgaccc caacatggcg ggtgtactcg 1920
acaggcagca acgtgttcca gacccgcgcg ggctgtctga tcggcgccga acacgttaac 1980
aactcctacg agtgcgatat ccctatcggc gccggcattt gcgccagcta ccagacccag 2040
accaacagcc ctcggagagc tagatctgtg gccagccagt ctatcatcgc ctacaccatg 2100
agtctgggtg ctgaaaactc tgtagcatat agcaataata gcatcgccat ccccaccaat 2160
ttcaccatca gcgtgaccac agaaatcctg cctgtgtcta tgaccaagac cagcgtggac 2220
tgtaccatgt acatctgcgg cgattccaca gaatgctcta acctgctgct gcagtacggc 2280
tccttttgca cacagctgaa cagagccctg accggaattg cagtggaaca agacaagaac 2340
acacaggagg tgttcgctca ggtgaagcag atctacaaaa cccctcctat caaggacttc 2400
ggcggtttca atttcagcca gattctgcct gatcctagca aaccttccaa gcggagcttc 2460
atcgaggacc tgctgttcaa caaggtgacc ctggccgatg ccggcttcat caagcaatac 2520
ggcgactgcc tgggagacat cgccgctcgg gacctgatct gcgcccagaa gtttaacggc 2580
ctgaccgtgc tgcctccact gttgaccgac gaaatgatcg ctcagtacac cagcgccctg 2640
ctggccggaa caatcaccag cggatggaca ttcggcgccg gcgccgccct gcagattcct 2700
ttcgctatgc agatggccta tagattcaac ggaatcggag tcacccagaa cgtgctatac 2760
gagaaccaga aacttatcgc caaccagttt aactccgcca tcggaaagat ccaggattcc 2820
ctgagcagca cagcttctgc cctcggaaaa ctgcaggacg tggtgaacca aaacgcccag 2880
gccctgaaca ccctggtgaa gcagctgagc tcaaacttcg gcgccatcag ctccgtgctc 2940
aacgacatcc tgtctagact ggacaaagtg gaagccgagg tgcagatcga ccggctgatc 3000
accggaagac tacagagcct gcagacatac gtcacccagc agctgatcag agccgccgag 3060
attcgcgcca gcgctaatct cgccgccaca aagatgagcg aatgtgtgct gggccagagc 3120
aagagagtgg acttctgcgg caaaggctac cacctgatga gcttccccca gtctgctccc 3180
cacggcgtgg tgttcctgca tgtgacctac gtgccagccc aggagaagaa cttcactaca 3240
gcccctgcta tctgccacga cggcaaggcc cacttcccta gagagggcgt gttcgtgagc 3300
aacggcaccc actggttcgt gacccagaga aacttctacg agccccagat catcacaaca 3360
gacaacacct tcgtatctgg caactgcgat gtggtgatcg gaatcgtgaa taacaccgtg 3420
tacgaccccc tgcagcctga gctggatagt tttaaagagg aactggacaa gtactttaag 3480
aaccacacct ccccagacgt ggacctgggc gatatcagcg gaataaatgc ctcagtggtg 3540
aacatccaga aagaaatcga cagactgaac gaggtcgcca agaacctgaa cgagtccctg 3600
atcgacctgc aagaactggg caagtacgag cagtacatca agtggccttg gtacatctgg 3660
ctgggcttca tcgctggact gatcgccatc gtgatggtga ccatcatgct gtgctgcatg 3720
acaagctgct gtagctgcct gaagggctgc tgcagctgcg gatcttgttg caagttcgac 3780
gaagatgatt ccgaacccgt gctgaagggc gtgaaactcc actacacctg a 3831
<210> 3
<211> 3843
<212> DNA
<213> Artificial sequence
<400> 3
ggatccgcca ccatggcctt cgtgttcctg gtgctgctgc cgctggtgtc ctcccaatgc 60
gtgaacctta ccactcggac ccagctgcct cctgcttaca ccaatagctt tacgagaggg 120
gtgtactacc ctgacaaggt gttcagaagc agcgtgctgc acagcaccca agacctgttc 180
ctgcctttct tcagcaatgt gacatggttc cacgccatcc acgtgtctgg gactaacggc 240
accaagcgct ttgacaaccc cgtgctgcct ttcaacgatg gcgtttactt cgcttcgacc 300
gaaaagtcca atatcatccg tggctggatt ttcggcacaa ccctggatag caagacccag 360
agcctgctga tcgtgaacaa cgccaccaac gtggtgatca aggtgtgcga gttccagttc 420
tgcaacgacc cttttctggg cgtgtattac cacaaaaaca acaagtcctg gatggaatca 480
gaattcagag tgtatagcag cgccaacaac tgcacgttcg agtacgtcag ccagcctttt 540
cttatggacc tggagggcaa gcagggcaac ttcaaaaatc tgagagagtt cgtgttcaag 600
aatatcgatg gctacttcaa gatctacagc aagcacacac ctattaacct ggttcgggac 660
ctgccccagg gctttagcgc tctggaaccc ctggtcgacc tgcctatcgg gatcaacatc 720
accagatttc agaccctgct ggccctacac agaagctacc tgacccctgg cgactcttct 780
tccggatgga ccgccggcgc tgccgcctac tacgtgggct acctgcagcc tagaacattt 840
ctgctgaaat acaacgagaa cggcaccatc accgacgccg tggattgtgc cctggacccc 900
ctttctgaga caaagtgcac cctgaaatct ttcaccgtgg agaagggcat ctaccagacc 960
agcaacttcc gggtccagcc taccgagtct atcgtacggt tccctaacat cacgaacctc 1020
tgtccttttg gcgaggtgtt taacgccaca agattcgcca gcgtgtacgc ctggaaccgg 1080
aagagaatca gcaactgcgt ggctgattac agcgtgctgt acaacagcgc ctctttcagc 1140
acattcaagt gttacggcgt gtctcccacc aagctaaacg acctgtgctt caccaacgtg 1200
tatgccgaca gcttcgtgat caggggcgac gaggtgcgac agatcgctcc tggccaaaca 1260
ggcaagatcg ccgactacaa ttacaagctg ccagacgact tcacaggctg cgtgatcgcc 1320
tggaacagca acaacctgga cagcaaggtg ggcggcaact acaactacct gtaccggctg 1380
tttagaaaga gcaacctgaa gcccttcgag agagatatca gcaccgagat ctaccaggct 1440
ggcagcaccc cttgcaacgg cgttgagggc ttcaactgtt atttcccact ccaatcttac 1500
ggcttccagc ctacaaatgg cgtgggctac cagccttacc gggtggtcgt gctcagcttt 1560
gagctgctgc atgcccctgc caccgtttgt ggccctaaga agagcaccaa cctggtgaag 1620
aataaatgcg tgaatttcaa tttcaacggc ctgaccggca ccggtgtcct gaccgaatcc 1680
aacaagaagt tcctgccctt ccaacagttc ggcagagata tcgccgacac caccgatgct 1740
gttagagatc cccagaccct ggagatcctg gacatcacac cctgctcttt cggtggcgtc 1800
agtgtgatca ctcccggcac aaataccagc aatcaagtgg ccgtgcttta ccaggatgtt 1860
aactgtacag aggtccccgt ggccatacac gcggaccagc tgaccccaac atggcgggtg 1920
tactcgacag gcagcaacgt gttccagacc cgcgcgggct gtctgatcgg cgccgaacac 1980
gttaacaact cctacgagtg cgatatccct atcggcgccg gcatttgcgc cagctaccag 2040
acccagacca acagccctcg gagagctaga tctgtggcca gccagtctat catcgcctac 2100
accatgagtc tgggtgctga aaactctgta gcatatagca ataatagcat cgccatcccc 2160
accaatttca ccatcagcgt gaccacagaa atcctgcctg tgtctatgac caagaccagc 2220
gtggactgta ccatgtacat ctgcggcgat tccacagaat gctctaacct gctgctgcag 2280
tacggctcct tttgcacaca gctgaacaga gccctgaccg gaattgcagt ggaacaagac 2340
aagaacacac aggaggtgtt cgctcaggtg aagcagatct acaaaacccc tcctatcaag 2400
gacttcggcg gtttcaattt cagccagatt ctgcctgatc ctagcaaacc ttccaagcgg 2460
agcttcatcg aggacctgct gttcaacaag gtgaccctgg ccgatgccgg cttcatcaag 2520
caatacggcg actgcctggg agacatcgcc gctcgggacc tgatctgcgc ccagaagttt 2580
aacggcctga ccgtgctgcc tccactgttg accgacgaaa tgatcgctca gtacaccagc 2640
gccctgctgg ccggaacaat caccagcgga tggacattcg gcgccggcgc cgccctgcag 2700
attcctttcg ctatgcagat ggcctataga ttcaacggaa tcggagtcac ccagaacgtg 2760
ctatacgaga accagaaact tatcgccaac cagtttaact ccgccatcgg aaagatccag 2820
gattccctga gcagcacagc ttctgccctc ggaaaactgc aggacgtggt gaaccaaaac 2880
gcccaggccc tgaacaccct ggtgaagcag ctgagctcaa acttcggcgc catcagctcc 2940
gtgctcaacg acatcctgtc tagactggac aaagtggaag ccgaggtgca gatcgaccgg 3000
ctgatcaccg gaagactaca gagcctgcag acatacgtca cccagcagct gatcagagcc 3060
gccgagattc gcgccagcgc taatctcgcc gccacaaaga tgagcgaatg tgtgctgggc 3120
cagagcaaga gagtggactt ctgcggcaaa ggctaccacc tgatgagctt cccccagtct 3180
gctccccacg gcgtggtgtt cctgcatgtg acctacgtgc cagcccagga gaagaacttc 3240
actacagccc ctgctatctg ccacgacggc aaggcccact tccctagaga gggcgtgttc 3300
gtgagcaacg gcacccactg gttcgtgacc cagagaaact tctacgagcc ccagatcatc 3360
acaacagaca acaccttcgt atctggcaac tgcgatgtgg tgatcggaat cgtgaataac 3420
accgtgtacg accccctgca gcctgagctg gatagtttta aagaggaact ggacaagtac 3480
tttaagaacc acacctcccc agacgtggac ctgggcgata tcagcggaat aaatgcctca 3540
gtggtgaaca tccagaaaga aatcgacaga ctgaacgagg tcgccaagaa cctgaacgag 3600
tccctgatcg acctgcaaga actgggcaag tacgagcagt acatcaagtg gccttggtac 3660
atctggctgg gcttcatcgc tggactgatc gccatcgtga tggtgaccat catgctgtgc 3720
tgcatgacaa gctgctgtag ctgcctgaag ggctgctgca gctgcggatc ttgttgcaag 3780
ttcgacgaag atgattccga acccgtgctg aagggcgtga aactccacta cacctgaacg 3840
cgt 3843
<210> 4
<211> 376
<212> DNA
<213> Artificial sequence
<400> 4
tttactccct atcagtgata gagaacgtat gaagagttta ctccctatca gtgatagaga 60
acgtatgcag actttactcc ctatcagtga tagagaacgt ataaggagtt tactccctat 120
cagtgataga gaacgtatga ccagtttact ccctatcagt gatagagaac gtatctacag 180
tttactccct atcagtgata gagaacgtat atccagttta ctccctatca gtgatagaga 240
acgtataagc tttaggcgtg tacggtgggc gcctataaaa gcagagctcg tttagtgaac 300
cgtcagatcg cctggagcaa ttccacaaca cttttgtctt ataccaactt tccgtaccac 360
ttcctaccct cgtaaa 376
<210> 5
<211> 1008
<212> DNA
<213> Artificial sequence
<400> 5
atgtctagat tagataaaag taaagtgatt aacagcgcat tagagctgct taatgaggtc 60
ggaatcgaag gtttaacaac ccgtaaactc gcccagaagc ttggtgtaga gcagcctaca 120
ctgtattggc atgtaaaaaa taagcgggct ttgctcgacg ccttagccat tgagatgtta 180
gataggcacc atactcactt ttgcccttta aaaggggaaa gctggcaaga ttttttacgc 240
aataacgcta aaagttttag atgtgcttta ctaagtcatc gcaatggagc aaaagtacat 300
tcagatacac ggcctacaga aaaacagtat gaaactctcg aaaatcaatt agccttttta 360
tgccaacaag gtttttcact agagaacgcg ttatatgcac tcagcgctgt ggggcatttt 420
actttaggtt gcgtattgga agatcaagag catcaagtcg ctaaagaaga aagggaaaca 480
cctactactg atagtatgcc gccattatta cgacaagcta tcgaattatt tgatcaccaa 540
ggtgcagagc cagccttctt attcggcctt gaattgatca tatgcggatt agaaaaacaa 600
cttaaatgtg aaagtgggtc cgcgtacagc cgcgcgcgta cgaaaaacaa ttacgggtct 660
accatcgagg gcctgctcga tctcccggac gacgacgccc ccgaagaggc ggggctggcg 720
gctccgcgcc tgtcctttct ccccgcggga cacacgcgca gactgtcgac ggcccccccg 780
accgatgtca gcctggggga cgagctccac ttagacggcg aggacgtggc gatggcgcat 840
gccgacgcgc tagacgattt cgatctggac atgttggggg acggggattc cccgggtccg 900
ggatttaccc cccacgactc cgccccctac ggcgctctgg atatggccga cttcgagttt 960
gagcagatgt ttaccgatgc ccttggaatt gacgagtacg gtgggtag 1008