CN110564772A - methods of engineering host cell genomes and uses thereof - Google Patents

Publication number
CN110564772A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910264952.7A
Other languages
Chinese (zh)
Other versions
CN110564772B (en)
Inventor
高闻达
毛昌群
邵静
岳国华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Antaiji Beijing Biotechnology Co Ltd
Original Assignee
Antaiji Beijing Biotechnology Co Ltd

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79: Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85: Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C12N15/87: Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90: Stable introduction of foreign DNA into chromosome
    • C12N15/902: Stable introduction of foreign DNA into chromosome using homologous recombination
    • C12N15/907: Stable introduction of foreign DNA into chromosome using homologous recombination in mammalian cells
    • C12N5/00: Undifferentiated human, animal or plant cells, e.g. cell lines; Tissues; Cultivation or maintenance thereof; Culture media therefor
    • C12N5/06: Animal cells or tissues; Human cells or tissues
    • C12N5/0602: Vertebrate cells
    • C12N5/0681: Cells of the genital tract; Non-germinal cells from gonads
    • C12N5/0682: Cells of the female genital tract, e.g. endometrium; Non-germinal cells from ovaries, e.g. ovarian follicle cells
    • C12N2510/00: Genetically modified cells
    • C12N2510/02: Cells for production
    • C12N2800/00: Nucleic acids vectors
    • C12N2800/10: Plasmid DNA
    • C12N2800/106: Plasmid DNA for vertebrates
    • C12N2800/107: Plasmid DNA for vertebrates for mammalian
    • C12N2800/30: Vector systems comprising sequences for excision in presence of a recombinase, e.g. loxP or FRT

Abstract

The present invention provides a method for integrating multiple foreign genes at a designated site in the genome of a host cell. The invention also provides a method of engineering a host cell genome, the method comprising: a. integrating a foreign gene at a designated site in the genome of the host cell; b. knocking out a fucosyltransferase 8 gene and/or a glutamine synthetase gene and/or an α2,3-sialyltransferase 4 gene and/or an α2,3-sialyltransferase 6 gene in the host cell genome. The invention also provides host cells prepared according to the methods of the invention, and the use of such host cells for the production of a protein of interest. Genes capable of improving protein expression are integrated at a fixed site in the genome of a CHO cell line, and the resulting CHO cell lines are of great significance for developing novel antibodies and high-end biosimilar antibodies.

Description

Methods of engineering host cell genomes and uses thereof
Technical Field
The present invention is in the fields of molecular biology, cell biology and bioengineering. In particular, the invention relates to a method of engineering the genome of a host cell, to host cells engineered using the method, and to uses of the engineered host cells.
Background
Since the U.S. FDA approved Genentech's recombinant insulin in 1982, recombinant proteins have been successfully used in scientific research and in the diagnosis, treatment, and prevention of various diseases. At present, recombinant proteins are mainly produced using foreign protein expression systems. The production of recombinant proteins in mammalian cells has become the predominant technology for biotechnological drugs. Most protein drugs currently on the market, in clinical trials, or in preclinical studies are expressed in mammalian cells, among which Chinese Hamster Ovary (CHO) cells are the most widely used host cells for recombinant protein production. There are multiple CHO cell lines. Those currently used in industrial production include CHO-K1, CHO-DG44, CHO-DXB11 (also called CHO-DUKX), CHO-S, and CHOZN (CHOZN ZFN-Modified CHO Cell Lines), as well as CHO-MK, which is under development.
Although CHO cells are widely used in the industry for the production of antibodies for medical use, CHO cells under prior art conditions suffer from various drawbacks.
First, CHO cells are of fibroblast origin and differ, in both their capacity to express antibodies and the way they modify them, from plasma cells, the immune-system cells specialized for antibody expression. Take specific productivity per cell, measured in pg/cell/day (pcd), as an example. After laborious gene amplification and screening, a CHO cell line is considered suitable for industrial production if it reaches about 20-40 pcd, and commercial large-scale antibody production then depends mainly on optimizing the downstream high-density cell culture and fermentation process. By comparison, plasma cells easily reach expression levels of 100-200 pcd and have no difficulty assembling and secreting the very large pentameric IgM molecule. For macromolecules that require complex glycosylation and multi-chain assembly, CHO cells show a bottleneck: once pcd reaches 20-40, further increases appear to hit a ceiling, whether they are sought through expression vector construction or through drug-pressure selection.
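As an illustration of the pcd index discussed above, the following is a minimal sketch of one common way to estimate specific productivity from two culture samples. It assumes exponential growth between the samples and uses the log-mean cell density; the function name, the choice of units, and the sample values are hypothetical and are not taken from the patent.

```python
import math

def specific_productivity_pcd(titer1_mg_l, titer2_mg_l,
                              cells1_per_ml, cells2_per_ml, days):
    """Estimate cell-specific productivity in pg/cell/day (pcd).

    Assumption: cells grow exponentially between the two samples, so the
    log-mean of the two densities is used as the average viable density.
    """
    if cells1_per_ml == cells2_per_ml:
        mean_density = cells1_per_ml
    else:
        mean_density = (cells2_per_ml - cells1_per_ml) / math.log(
            cells2_per_ml / cells1_per_ml)
    ivcd = mean_density * days                            # cell-days per mL
    delta_pg_per_ml = (titer2_mg_l - titer1_mg_l) * 1e6   # 1 mg/L = 1e6 pg/mL
    return delta_pg_per_ml / ivcd

# Hypothetical example: titer rises by 30 mg/L in one day at a constant
# density of 1e6 cells/mL.
print(specific_productivity_pcd(100.0, 130.0, 1e6, 1e6, 1.0))
```

With these hypothetical inputs the estimate falls inside the 20-40 pcd range that the text describes as the practical level for industrial CHO lines.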
Second, glycoproteins (including antibodies) secreted by wild-type CHO cells all carry fucose residues. Fucose on the sugar chain of the Fc portion of an antibody can hinder binding between the antibody and the Fc receptor. A number of methods currently exist for removing fucose, such as chemical or enzymatic treatment, small-molecule analogs used as inhibitors, RNA interference, overexpression of both beta-1,4-N-acetylglucosaminyltransferase III and Golgi alpha-mannosidase II to alter the fucose synthesis pathway, or the use of certain naturally mutated CHO strains with low fucose expression (e.g., the protein-fucosylation-deficient Lec13 CHO cells). However, these methods can hardly remove fucose from the antibody completely, so the antibody drug cannot meet strict requirements for the uniformity of the preparation. Knocking out the specific gene responsible for fucose synthesis in CHO cells ensures, at the gene level, that 100% of the antibodies expressed by the cells are completely free of fucose. In particular, knocking out the fucosyltransferase 8 (Fut8) gene, which is responsible for the terminal addition of fucose, is an effective way to produce antibodies that no longer carry fucose. The traditional gene knockout method is to construct a gene targeting vector and replace the target gene with an exogenous gene by homologous recombination. However, this method is very inefficient: in mouse ES cells, the efficiency of homologous recombination is on the order of one in a million. CHO cells are diploid somatic cells, in which the efficiency of homologous recombination is 100 times lower than in mouse ES cells.
One Japanese laboratory did use this method, but it took a year and a half of painstaking screening of about 12,000 CHO-DG44 cell clones to obtain a Fut8-/- CHO cell line (Yamane-Ohnuki N, et al, Biotechnol Bioeng 2004; 87: 614-.
Zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and chromosomal DNA cleavage derived from the prokaryotic adaptive immune system (Clustered Regularly Interspaced Short Palindromic Repeats, CRISPR) have greatly enhanced the ability to manipulate specific genomic sites at will. The Fut8 gene has been knocked out in CHO cells using ZFNs (Malphettes L et al, Biotechnol Bioeng. 2010; 106: 774-83). However, ZFNs have severe off-target effects (Wang X et al, Nat Biotechnol. 2015; 33: 175-8), and it is difficult to determine what effects they also have on other functional genes in the CHO cell genome. Similarly, although the Fut8 gene in CHO cells can be knocked out with relatively high efficiency by the CRISPR technique (Sun T et al, Engineering in Life Sciences, 2015; 15: 660-6; Ronda C et al, Biotechnol. Bioeng. 2014; 111: 1604-16), the off-target probability is still high. Work by Ronda C et al shows that, for a standard GN19NGG target sequence, the CHO-K1 genome contains on average 6 identical sequences, 377 sequences with a one-nucleotide mismatch, and 4558 sequences with a two-nucleotide mismatch. Whole-genome sequencing shows that transgenic mice prepared by site-directed gene modification with CRISPR carry 1397 single-nucleotide mutations and 117 DNA insertions or deletions compared with wild-type control mice (Schaefer, K.A. et al. Nat. Methods 2017; 14: 547-8). Thus, Fut8-/- CHO cells made with the ZFN and CRISPR methods carry too much genetic and epigenetic uncertainty to serve as a stable, universal industrial cell line for medical protein production.
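The kind of tally quoted above (identical sites, one-mismatch sites, two-mismatch sites for a GN19NGG target) can be illustrated with a brute-force scan. The sketch below is only an assumption about how such a count could be made on a small sequence; it is not the procedure used by Ronda et al, and the guide and toy genome are invented for the example.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))

def count_sites(genome, protospacer, max_mismatches=2):
    """Count sites followed by an NGG PAM, binned by number of mismatches.

    Both strands are scanned. Sites with more than max_mismatches
    differences from the protospacer are ignored.
    """
    counts = {m: 0 for m in range(max_mismatches + 1)}
    n = len(protospacer)
    for strand in (genome, reverse_complement(genome)):
        for i in range(len(strand) - n - 2):
            # PAM is N-G-G immediately 3' of the site; N can be any base.
            if strand[i + n + 1:i + n + 3] != "GG":
                continue
            mismatches = sum(a != b for a, b in zip(strand[i:i + n], protospacer))
            if mismatches <= max_mismatches:
                counts[mismatches] += 1
    return counts

# Hypothetical 20-nt guide (G followed by 19 nt) and a toy "genome" holding
# one exact site and one single-mismatch site.
guide = "GATTACAGGCTTAACCGTCA"
near_match = guide[:10] + "A" + guide[11:]
toy_genome = guide + "TGG" + "AAAAA" + near_match + "CGG"
print(count_sites(toy_genome, guide))
```

A scan like this is quadratic in practice and only suitable for short sequences; genome-scale counts would need an indexed search.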
TALENs are much more specific than ZFNs and CRISPR. This is because the left and right arms of a TALEN pair not only have precise requirements for their binding sequences on the sense and antisense strands of the DNA, respectively, but must also be separated by strictly 14-20 base pairs. This greatly reduces off-target effects in the CHO genome. To date, the only report of a Fut8-/- CHO cell line prepared with TALENs comes from the Gregory Cost laboratory at Sangamo BioSciences. In that work, however, an antibody expression module was inserted into the Fut8 gene of CHO cells (Cristea S et al, Biotechnol Bioeng. 2013; 110: 871-80), so that the antibody gene was expressed and the Fut8 gene was knocked out at the same time. Because the expression and regulation of the Fut8 gene are not comparable to those required for antibody expression, the Fut8-/- CHO cells obtained in this way are difficult to use, in terms of yield, as a general industrial cell line for large-scale production of medical proteins.
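The spacing constraint described above (left arm on the sense strand, right arm on the antisense strand, 14-20 base pairs between them) can be expressed as a simple check. The following is a minimal sketch; the function names and the binding-site sequences are hypothetical and do not correspond to any SEQ ID NO in this document.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))

def talen_spacer_length(target, left_site, right_site):
    """Return the spacer length in bp between the two arms, or None.

    left_site is read on the sense strand; right_site is given 5'->3' on the
    antisense strand, so its reverse complement is searched on the sense strand.
    """
    left_start = target.find(left_site)
    if left_start < 0:
        return None
    left_end = left_start + len(left_site)
    right_start = target.find(reverse_complement(right_site), left_end)
    if right_start < 0:
        return None
    return right_start - left_end

def is_valid_talen_pair(target, left_site, right_site, min_spacer=14, max_spacer=20):
    """True if both arms bind and the spacer is within the 14-20 bp window."""
    spacer = talen_spacer_length(target, left_site, right_site)
    return spacer is not None and min_spacer <= spacer <= max_spacer

# Hypothetical binding sites separated by a 16-bp spacer.
left = "TGACCTGAAGCT"
right = reverse_complement("GGATCCTTGACA")
target = "CC" + left + "A" * 16 + "GGATCCTTGACA" + "GG"
print(talen_spacer_length(target, left, right), is_valid_talen_pair(target, left, right))
```

Requiring two independent binding events at a fixed distance is what makes a chance match elsewhere in the genome much less likely than for a single short recognition sequence.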
Third, in the late phase of high-density growth, nutrients in the CHO cell culture are depleted and ammonium ions accumulate in the system, which puts great pressure on cell survival. Glutamine synthetase (GS) is an enzyme that controls nitrogen metabolism. Glutamine, an amino acid, is used by cells not only to synthesize proteins but also to transport nitrogen. Ammonium ions are toxic to cells; when cells need to eliminate excess ammonium ions, they synthesize glutamine from ammonium ions and glutamic acid. This is the reaction that GS catalyzes.
For CHO cell lines used in the industrial production of antibodies, cells in which the endogenous GS gene has been knocked out cannot grow in glutamine-deficient medium. A GS gene carried on an exogenous plasmid therefore not only increases the stringency of screening for stable cell lines in glutamine-deficient medium, but also enhances the conversion of waste ammonium ions into glutamine, which is highly favorable for large-scale cell culture. GS-knockout CHO cell lines have been obtained by the ZFN method (Fan L. et al. Biotechnol Bioeng. 2012 Apr; 109(4): 1007-15). However, this method has the same disadvantages as those analyzed above for ZFN knockout of the Fut8 gene.
In conclusion, it would be highly desirable for the engineering and optimization of CHO cell strains if controlled expression of multiple genes beneficial to CHO cell growth and to protein (including antibody) production and activity, as described above, could be achieved in CHO cells. Cell line engineering of CHO cells can provide an indispensable tool for basic biological research and drug development. Establishing a dedicated cell line that stably and highly expresses several gene products at the same time is very important for studying processes that involve multiple genes, such as the function and enzymatic activity of multi-subunit membrane receptors, signal transduction, metabolic pathways, protein modification, and virus assembly. Moreover, these gene products often need to be expressed in specific ratios. Although new genome editing technologies such as CRISPR help with rapid gene inactivation and replacement, achieving stable, high-level expression of multiple genes in defined ratios has long been a bottleneck in the industry, and progress has been very limited because of the great technical difficulty. The main technical problems encountered are:
1) The conventional method is random insertion of a foreign gene into a chromosome. A cell line that expresses gene A at a high level does not necessarily express gene B at a high level. After several genes have been introduced, the cell strains become heterogeneous and screening is difficult. Moreover, owing to gene silencing, genes inserted at different chromosomal locations may be gradually shut down at different rates, which further increases the unpredictability of the cell strains and the uncertainty of experiments.
2) If a vector with its own drug resistance gene is used for each gene, the available drug resistance genes are almost used up after 3-4 genes have been introduced. If an antibody gene is also to be expressed in the modified CHO cell line, at least one more drug resistance gene is needed.
In view of the above, there is a need for methods that can efficiently and comprehensively engineer CHO cell lines.
Disclosure of Invention
In view of the defects of the prior art, the invention aims to provide a method for engineering a host cell genome that can not only integrate multiple exogenous genes at a designated site in the host cell genome, so that the exogenous genes are stably and efficiently co-expressed, but also effectively knock out the Fut8 gene and/or the GS gene and/or the α2,3-sialyltransferase 4 gene and/or the α2,3-sialyltransferase 6 gene, thereby constructing a high-efficiency host cell that can meet the needs of industrial large-scale antibody production. The invention also provides host cells prepared according to the method and uses thereof.
In one aspect, the present invention provides a method for integrating multiple exogenous genes at a designated site of a host cell, wherein the method is a "Toggle-In" method based on the Cre-LoxP system and comprises the following steps:
1) Preparing an "anchored" host cell comprising the site-specific recombinase recognition sequences LoxPwt and LoxP1 and a positive selection sequence driven by a PGK promoter;
Preferably, the positive selection sequence is Puro-T2A-d1EGFP; this sequence makes the "anchored" host cell weakly green-fluorescent under a fluorescence microscope and resistant to puromycin.
Preferably, the "anchored" host cell is obtained using a targeting vector containing left and right homology arms of the designated site, the site-specific recombinase recognition sequences LoxPwt and LoxP1, and positive and negative selection sequences;
More preferably, the targeting vector comprises, in order: a GAPDH left homology arm, the positive-negative selection sequence Zeocin-T2A-TK, the positive selection sequence Puro-T2A-d1EGFP, the recognition sequences LoxPwt and LoxP1 of the site-specific recombinase of the Cre-LoxP system, a GAPDH right homology arm, and the negative selection sequence diphtheria toxin A chain (DTA);
2) Preparing a series of site-directed integration vectors (pToggle) containing foreign genes, including:
a first site-directed integration vector, prepared using a first vector, comprising a first foreign gene, a first antibiotic resistance gene, and the site-specific recombinase recognition sequences LoxPwt, LoxP4, and LoxP2;
a second site-directed integration vector, prepared using a second vector, comprising a second foreign gene, a second antibiotic resistance gene, and the site-specific recombinase recognition sequences LoxPwt, LoxP1, and LoxP5;
optionally, a third site-directed integration vector, prepared using the first vector, comprising a third foreign gene, the first antibiotic resistance gene, and the site-specific recombinase recognition sequences LoxPwt, LoxP4, and LoxP2;
optionally, a fourth site-directed integration vector, prepared using the second vector, comprising a fourth foreign gene, the second antibiotic resistance gene, and the site-specific recombinase recognition sequences LoxPwt, LoxP1, and LoxP5;
optionally, further vectors prepared correspondingly according to the genes to be integrated;
3) Transfecting the host cell with the first site-directed integration vector, and using the first antibiotic to select clones that are resistant to the first antibiotic and sensitive to the second antibiotic;
4) Transfecting the clone obtained in step 3) with the second site-directed integration vector, and using the second antibiotic to select clones that are resistant to the second antibiotic and sensitive to the first antibiotic;
5) Optionally, transfecting the clone obtained in step 4) with the third site-directed integration vector, and using the first antibiotic to select clones that are resistant to the first antibiotic and sensitive to the second antibiotic;
6) Optionally, transfecting the clone obtained in step 5) with the fourth site-directed integration vector, and using the second antibiotic to select clones that are resistant to the second antibiotic and sensitive to the first antibiotic;
7) Optionally, preparing further vectors according to the genes to be integrated and integrating them sequentially, to obtain a host cell in which multiple genes are integrated at the designated site.
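The alternating selection logic of steps 3) to 7) can be sketched as simple bookkeeping. The model below is an illustrative assumption only: it tracks which resistance marker sits at the designated site after each round and which genes have accumulated, and it does not model the LoxP recombination itself. The class and function names are hypothetical, and the marker names HygB and Puro follow the preferred embodiment described in this document.

```python
from dataclasses import dataclass, field

@dataclass
class Locus:
    """State of the designated site: current marker plus integrated genes."""
    marker: str = "Puro"   # assumption: the anchored cell starts puromycin resistant
    genes: list = field(default_factory=list)

def toggle_in(locus, vector_marker, gene):
    """Integrate one vector: the marker is swapped and the gene is added."""
    if vector_marker == locus.marker:
        raise ValueError("the incoming vector must carry the other resistance marker")
    return Locus(marker=vector_marker, genes=locus.genes + [gene])

def expected_phenotype(locus, antibiotics=("HygB", "Puro")):
    """Resistance pattern used to pick correct clones after each round."""
    return {ab: ("resistant" if ab == locus.marker else "sensitive")
            for ab in antibiotics}

# Four rounds with placeholder gene names, alternating the two vector types.
state = Locus()
for marker, gene in [("HygB", "gene1"), ("Puro", "gene2"),
                     ("HygB", "gene3"), ("Puro", "gene4")]:
    state = toggle_in(state, marker, gene)
    print(gene, expected_phenotype(state))
```

Because each round replaces the previous marker rather than adding a new one, two resistance genes suffice for any number of rounds in this model, which addresses the marker-exhaustion problem raised in the Background.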
Preferably, the first and second antibiotic resistance genes are two different genes selected from:
the hygromycin B resistance gene, the puromycin resistance gene, the geneticin resistance gene, the blasticidin resistance gene, and the phleomycin resistance gene.
Preferably, the first antibiotic resistance gene and/or the second antibiotic resistance gene is further linked to the gene sequence of a tracer protein via a self-cleaving peptide;
Preferably, the tracer protein is selected from d1EGFP or DsRed-E2;
Preferably, the first site-directed integration vector comprises, in order: the recombinase recognition sequence LoxPwt, a promoterless HygB resistance gene sequence (preferably with the nucleotide sequence shown in SEQ ID NO: 13) or a HygB-T2A-DsRed-E2 sequence, the recombinase recognition sequence LoxP4, a human elongation factor 1a (EF1a) promoter (preferably with the nucleotide sequence shown in SEQ ID NO: 14), the first foreign gene, and the recombinase recognition sequence LoxP2;
Preferably, the second site-directed integration vector comprises, in order: the recombinase recognition sequence LoxPwt, a promoterless Puro resistance gene sequence or a Puro-T2A-d1EGFP sequence (preferably with the nucleotide sequence shown in SEQ ID NO: 15), the recombinase recognition sequence LoxP1, a human elongation factor 1a (EF1a) promoter, the second foreign gene, and the recombinase recognition sequence LoxP5;
Preferably, the third site-directed integration vector comprises, in order: the recombinase recognition sequence LoxPwt, a promoterless HygB resistance gene sequence or a HygB-T2A-DsRed-E2 sequence, the recombinase recognition sequence LoxP4, a human elongation factor 1a (EF1a) promoter, the third foreign gene, and the recombinase recognition sequence LoxP2;
Preferably, the fourth site-directed integration vector comprises, in order: the recombinase recognition sequence LoxPwt, a promoterless Puro resistance gene sequence or a Puro-T2A-d1EGFP sequence, the recombinase recognition sequence LoxP1, a human elongation factor 1a (EF1a) promoter, the fourth foreign gene, and the recombinase recognition sequence LoxP5;
Preferably, the first and second vectors are selected from the plasmid vectors pBR322, pUC57, pBluescript, pCI-neo, and pcDNA3.1;
In a preferred embodiment, the first and second vectors used in the present invention are pTOG3 and pTOG4;
The nucleotide sequence of the vector pTOG3 is shown in SEQ ID NO: 11;
The nucleotide sequence of the vector pTOG4 is shown in SEQ ID NO: 12;
Preferably, the sequences of two or more genes are joined by self-cleaving peptide fragments to serve as one optional foreign gene, and the sequences of a further two or more genes are joined by self-cleaving peptide fragments to serve as another optional foreign gene;
More preferably, the self-cleaving peptide fragment is selected from P2A, T2A, or E2A;
More preferably, the self-cleaving peptide fragment is P2A; further preferably, its sequence is shown in SEQ ID NO: 16:
SEQ ID NO: 16: GSGATNFSLLKQAGDVEENPGP.
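To illustrate how several genes joined by the P2A peptide yield separate proteins, the sketch below builds a polyprotein at the amino-acid level and predicts the products. It relies on the generally described behavior of 2A peptides, in which the chain is interrupted between the final glycine and proline of the peptide; the helper names and the placeholder protein sequences are hypothetical.

```python
# P2A sequence as given in SEQ ID NO: 16 of this document.
P2A = "GSGATNFSLLKQAGDVEENPGP"

def join_with_2a(proteins, linker=P2A):
    """Concatenate protein sequences with a 2A peptide between each pair."""
    return linker.join(proteins)

def predicted_products(polyprotein, linker=P2A):
    """Predict the separate chains released from a 2A-linked polyprotein.

    The break falls between the last G and the last P of the linker, so each
    upstream chain keeps the linker minus its final P as a C-terminal tail,
    and each downstream chain starts with that P.
    """
    parts = polyprotein.split(linker)
    products = []
    for i, part in enumerate(parts):
        head = "P" if i > 0 else ""
        tail = linker[:-1] if i < len(parts) - 1 else ""
        products.append(head + part + tail)
    return products

# Placeholder sequences standing in for three linked genes.
poly = join_with_2a(["MAAA", "MBBB", "MCCC"])
for chain in predicted_products(poly):
    print(chain)
```

One consequence visible in the output is that every upstream product carries a short C-terminal 2A tail, which may matter when the linked gene products are sensitive to C-terminal extensions.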
Preferably, the host cell is a mammalian host cell, preferably a Chinese hamster ovary cell;
More preferably, the host cell is selected from the following cell lines:
the CHO-K1, CHO-S, DG44, CHO-DXB11, CHOZN, CHO-MK, and CHL-YN cell lines.
Preferably, the designated site is a high-expression hotspot region of the host cell genome;
Preferably, the designated site is selected from the regions of the following genes:
the Rosa26, beta-actin, beta 2-microglobulin, CDK2, Ubiquitin, DHFR, and GAPDH genes;
More preferably, the designated site is selected from the region of the GAPDH gene;
Further preferably, the designated site is the region of the GAPDH gene from position 3574544 to 3575484 in the CHO-K1 genome; preferably, the sequence of the site region is shown in SEQ ID NO: 8.
In another aspect, the invention provides a method of engineering a host cell genome, the method comprising:
a. integrating, at a designated site in the genome of the host cell, one or more of the exogenous genes signal recognition particle 9 (SRP-9), signal recognition particle 14 (SRP-14), signal recognition particle 54 (SRP-54), endoplasmic reticulum oxidoreductase (ERO1-L), fibroblast growth factor 9 (FGF-9), β1,4-galactosyltransferase 1 (GT), and/or α2,6-sialyltransferase 1 (ST1); and/or
b. knocking out the fucosyltransferase 8 (Fut8) gene and/or the glutamine synthetase (GS) gene and/or the α2,3-sialyltransferase 4 (ST4) gene and/or the α2,3-sialyltransferase 6 (ST6) gene in the genome of the host cell using a transcription activator-like effector nuclease (TALEN);
Preferably, the host cell is a mammalian host cell; more preferably a Chinese Hamster Ovary (CHO) cell; further preferably a CHO-K1, CHO-S, CHOZN, DG44, CHO-DXB11, CHO-MK, or CHL-YN cell line;
Preferably, the designated site is a high-expression hotspot region of the host cell genome; more preferably, the designated site is selected from the regions of constitutively highly expressed hotspot genes such as Rosa26, beta-actin, beta 2-microglobulin, CDK2, Ubiquitin, DHFR, and GAPDH. Further preferably, the site is the region in which the glyceraldehyde-3-phosphate dehydrogenase (GAPDH) gene is located, in particular the region of the GAPDH gene from position 3574544 to 3575484 in the CHO-K1 genome. In a particular embodiment, the sequence of the site is shown in SEQ ID NO: 8.
Preferably, in step a, the heterologous polynucleotide sequences encoding the foreign genes are sequentially inserted into the designated site of the host cell by "Toggle-In"; more preferably, the three human genes signal recognition particle 9, signal recognition particle 14, and signal recognition particle 54 are linked by self-cleaving peptide fragments to serve as one optional foreign gene, preferably as shown in SEQ ID NO: 17; endoplasmic reticulum oxidoreductase and fibroblast growth factor 9 are linked by a self-cleaving peptide fragment to serve as another optional foreign gene; and the human β1,4-galactosyltransferase 1 and/or α2,6-sialyltransferase 1 genes are linked by a self-cleaving peptide fragment to serve as a further optional foreign gene.
Preferably, in step b, the knockout is performed by the TALEN method.
Preferably, the gene sequences of the TALEN left-arm and right-arm proteins constructed for knocking out the Fut8 gene are shown in SEQ ID NO: 18 and SEQ ID NO: 19; the targeted region is exon 10 of the CHO cell Fut8 gene, but may be any exon that affects its function;
More preferably, the primers used are shown in SEQ ID NO: 20 and SEQ ID NO: 21.
Preferably, the gene sequences of the TALEN left-arm and right-arm proteins constructed for knocking out the GS gene are shown in SEQ ID NO: 22 and SEQ ID NO: 23; the targeted region is exon 7 of the CHO cell GS gene, but may be any exon that affects its function;
More preferably, the primers used are shown in SEQ ID NO: 24 and SEQ ID NO: 25.
Preferably, the gene sequences of the TALEN left-arm and right-arm proteins constructed for knocking out the ST3GAL4 gene are shown in SEQ ID NO: 26 and SEQ ID NO: 27; the targeted region is exon 5 of the ST3GAL4 gene in CHO cells, but may be any exon that affects its function;
More preferably, the primers used are shown in SEQ ID NO: 28 and SEQ ID NO: 29.
Preferably, the gene sequences of the TALEN left-arm and right-arm proteins constructed for knocking out the ST3GAL6 gene are shown in SEQ ID NO: 30 and SEQ ID NO: 31; the targeted region is exon 2 of the ST3GAL6 gene in CHO cells, but may be any exon that affects its function;
More preferably, the primers used are shown in SEQ ID NO: 32 and SEQ ID NO: 33.
In another aspect, the invention provides a host cell engineered according to the methods of the invention;
Preferably, the engineered host cell carries one or more of the foreign genes SRP-9, SRP-14, SRP-54, endoplasmic reticulum oxidoreductase (ERO1-L), fibroblast growth factor 9 (FGF-9), β1,4-galactosyltransferase 1 (GT), and/or α2,6-sialyltransferase 1 (ST6);
Preferably, the engineered host cell carries at least the exogenous genes signal recognition particle 9, signal recognition particle 14, signal recognition particle 54, endoplasmic reticulum oxidoreductase, and fibroblast growth factor 9;
Preferably, the engineered host cell carries the exogenous genes signal recognition particle 9, signal recognition particle 14, signal recognition particle 54, endoplasmic reticulum oxidoreductase, fibroblast growth factor 9, β1,4-galactosyltransferase 1, and α2,6-sialyltransferase 1;
Preferably, the engineered host cell does not express one or more of the fucosyltransferase 8 gene, the glutamine synthetase gene, the α2,3-sialyltransferase 4 gene, and the α2,3-sialyltransferase 6 gene;
Preferably, the engineered host cell does not express the fucosyltransferase 8 (Fut8) gene and/or the glutamine synthetase (GS) gene;
Preferably, the engineered host cell does not express the fucosyltransferase 8 gene, the glutamine synthetase gene, the α2,3-sialyltransferase 4 gene, or the α2,3-sialyltransferase 6 gene;
More preferably, the present invention provides the CHO cell lines described below.
CHO-E cell line: built on wild-type CHO-K1, it carries the exogenous genes SRP-9, SRP-14, SRP-54, endoplasmic reticulum oxidoreductase (ERO1-L) and fibroblast growth factor 9 (FGF-9), and has improved protein translation capacity and resistance to apoptosis.
CHO-EF cell line: the Fut8 gene is knocked out, and the cell line carries the exogenous genes SRP-9, SRP-14, SRP-54, endoplasmic reticulum oxidoreductase (ERO1-L) and fibroblast growth factor 9 (FGF-9). It has improved protein translation capacity and resistance to apoptosis, and the protein molecules it expresses are completely free of fucose.
CHO-EG cell line: carries the exogenous genes SRP-9, SRP-14, SRP-54, endoplasmic reticulum oxidoreductase (ERO1-L), fibroblast growth factor 9 (FGF-9) and β1,4-galactosyltransferase 1 (GT). It has improved protein translation capacity and resistance to apoptosis, and the sugar-chain termini of the expressed protein molecules carry more β1,4-linked galactose.
CHO-EFG cell line: the Fut8 gene is knocked out, and the cell line carries the exogenous genes SRP-9, SRP-14, SRP-54, endoplasmic reticulum oxidoreductase (ERO1-L), fibroblast growth factor 9 (FGF-9) and β1,4-galactosyltransferase 1 (GT). It has improved protein translation capacity and resistance to apoptosis, the expressed protein molecules are completely free of fucose, and the sugar-chain termini carry more β1,4-linked galactose.
CHO-ES (3,6) cell line: carries the exogenous genes SRP-9, SRP-14, SRP-54, endoplasmic reticulum oxidoreductase (ERO1-L), fibroblast growth factor 9 (FGF-9), β1,4-galactosyltransferase 1 (GT) and α2,6-sialyltransferase 1 (ST6). In addition to the basic characteristics of the CHO-E cell line, it enables the expressed protein molecules to carry α2,6-linked sialic acid. Because CHO cells have an endogenous α2,3-sialyltransferase (ST3), the protein molecules expressed by this cell line carry both α2,3-linked and α2,6-linked sialic acid, which is closer to the human glycosylation pattern.
CHO-EFS (3,6) cell line: the Fut8 gene is knocked out, and the cell line carries the exogenous genes SRP-9, SRP-14, SRP-54, endoplasmic reticulum oxidoreductase (ERO1-L), fibroblast growth factor 9 (FGF-9), β1,4-galactosyltransferase 1 (GT) and α2,6-sialyltransferase 1 (ST6). In addition to the basic characteristics of the CHO-ES (3,6) cell line, the protein molecules it expresses are completely free of fucose.
CHO-EFS (3,6) GS cell line: the Fut8 gene and the GS gene are knocked out, and the cell line carries the exogenous genes SRP-9, SRP-14, SRP-54, endoplasmic reticulum oxidoreductase (ERO1-L), fibroblast growth factor 9 (FGF-9), β1,4-galactosyltransferase 1 (GT) and α2,6-sialyltransferase 1 (ST6). This cell line expresses α2,3-sialyltransferase and α2,6-sialyltransferase simultaneously, and both the fucosyltransferase 8 gene and the glutamine synthetase gene are deleted, making it a high-performance CHO-K1 cell line.
CHO-ES (6) cell line: carries the exogenous genes SRP-9, SRP-14, SRP-54, endoplasmic reticulum oxidoreductase (ERO1-L), fibroblast growth factor 9 (FGF-9), β1,4-galactosyltransferase 1 (GT) and α2,6-sialyltransferase 1 (ST6), and the α2,3-sialyltransferase (ST3) endogenous to CHO cells is knocked out. In addition to the basic characteristics of the CHO-E cell line, the protein molecules it expresses carry only α2,6-linked sialic acid and no α2,3-linked sialic acid, so it can express more homogeneously those biological macromolecules whose biological activity depends on α2,6-linked sialic acid.
CHO-EFS (6) cell line: the Fut8 gene is knocked out, the cell line carries the exogenous genes SRP-9, SRP-14, SRP-54, endoplasmic reticulum oxidoreductase (ERO1-L), fibroblast growth factor 9 (FGF-9), β1,4-galactosyltransferase 1 (GT) and α2,6-sialyltransferase 1 (ST6), and the α2,3-sialyltransferase (ST3) endogenous to CHO cells is knocked out. In addition to the basic characteristics of the CHO-ES (6) cell line, the protein molecules it expresses are completely free of fucose.
The series of CHO cell lines of the invention has the following properties: the expressed protein, preferably an antibody, is completely free of fucose; or the cells can carry out human-type glycosylation, in particular galactosylation of glycoproteins or α2,6 sialylation, which CHO cells cannot otherwise perform; in particular, the α2,3-sialyltransferases endogenous to CHO cells are knocked out; or different combinations of these genotypes allow the physicochemical properties of pharmaceutical proteins to be optimized in a targeted way.
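The genotype combinations listed above can be summarized in a small lookup table. The following Python sketch is only an illustration: the gene labels, dictionary layout and feature names are assumptions introduced here for readability, and the predicted features simply restate what the cell line descriptions say (Fut8 knockout gives fucose-free product, ST6 knock-in gives α2,6-linked sialic acid, and α2,3-linked sialic acid is present unless the endogenous ST3 is knocked out).

```python
# Genotypes transcribed from the cell line descriptions above.
# Labels and feature names are illustrative, not the patent's nomenclature.
BASE = frozenset({"SRP-9", "SRP-14", "SRP-54", "ERO1-L", "FGF-9"})

# name -> (exogenous genes carried, endogenous genes knocked out)
CELL_LINES = {
    "CHO-E": (BASE, frozenset()),
    "CHO-EF": (BASE, frozenset({"Fut8"})),
    "CHO-EG": (BASE | {"GT"}, frozenset()),
    "CHO-EFG": (BASE | {"GT"}, frozenset({"Fut8"})),
    "CHO-ES (3,6)": (BASE | {"GT", "ST6"}, frozenset()),
    "CHO-EFS (3,6)": (BASE | {"GT", "ST6"}, frozenset({"Fut8"})),
    "CHO-EFS (3,6) GS": (BASE | {"GT", "ST6"}, frozenset({"Fut8", "GS"})),
    "CHO-ES (6)": (BASE | {"GT", "ST6"}, frozenset({"ST3"})),
    "CHO-EFS (6)": (BASE | {"GT", "ST6"}, frozenset({"Fut8", "ST3"})),
}


def glycan_features(name):
    """Return the product features implied by a cell line's genotype."""
    knock_in, knock_out = CELL_LINES[name]
    return {
        "fucose_free": "Fut8" in knock_out,
        "alpha2_3_sialic_acid": "ST3" not in knock_out,  # endogenous in CHO
        "alpha2_6_sialic_acid": "ST6" in knock_in,
        "extra_beta1_4_galactose": "GT" in knock_in,
        "gs_knockout": "GS" in knock_out,
    }


if __name__ == "__main__":
    for line in CELL_LINES:
        print(line, glycan_features(line))
```

Keeping knock-ins and knock-outs as separate sets makes it easy to see that each derivative line differs from CHO-E by only one or two genetic changes.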
In a further aspect, the invention provides the use of the host cell for the expression of a protein of interest; preferably, the protein of interest is an antibody.
Compared with the prior art, the cell strain has the following advantages:
1) The CHO cell lines provided by the invention have the Fut8 gene and/or the GS gene and/or the α2,3-sialyltransferase gene knocked out. Knockout of the Fut8 gene gives the antibody expressed by the CHO cell line a higher binding affinity for the FcγRIIIA receptor and greatly improves the ADCC (antibody-dependent cell-mediated cytotoxicity) effect and the in vivo tumor-killing activity of the antibody. Knockout of the GS gene links expression and clone selection of the exogenous gene to the GS gene carried on the vector, and the vector-borne GS gene also benefits cell growth during large-scale production;
2) Genes that improve antibody expression, such as SRP-14, SRP-54, SRP-9, ERO1-L and FGF9, and the genes GT and ST6, which are responsible for human-type glycosylation, are integrated site-specifically into the genome of the CHO cell line, so that the constructed CHO cell line gives a higher yield and the antibody produced carries the sialic acid or galactose modifications of a human antibody. Introducing human-type sialylation on top of fucose deficiency (Fut8 knockout) improves ADCC activity, can mask the immunogenicity of the molecule and increases the in vivo half-life. Introducing human-type galactosylation on top of fucose deficiency can improve both ADCC and the complement-activating CDC (complement-dependent cytotoxicity) function. For proteins with specific requirements, such as intravenous gamma globulin (IVIG) carrying α2,6-linked sialic acid, or the HIV envelope (Env) protein, which requires α2,6-linked sialic acid modification to induce neutralizing antibodies, the CHO-ES (6) cell line of the invention provides a protein expression platform that yields α2,6-linked sialic acid only and no α2,3-linked sialic acid, biasing sialic acid modification toward the form that matters for biological activity. The CHO cell lines with a modified glycan synthesis pathway, and in particular the various derivative lines in which the GS gene is additionally knocked out, are of great significance for developing novel antibodies and high-end biosimilar antibodies.
Drawings
Embodiments of the invention are described in detail below with reference to the attached drawings, in which:
FIG. 1 shows the CHO cell targeting vector CHO GAPDH-Zeo-TK HR v1.5, which contains the TK gene for negative selection and, between the left and right GAPDH homology arms, the PGK promoter-puromycin-T2A-d1EGFP expression cassette for positive selection. LoxPwt and LoxP1 sequences are located immediately upstream and downstream of the expression cassette, respectively.
FIG. 2 shows FACS profiles of CHO-K1 cells (ATCC, CCL-61) after electroporation with the targeting vector and drug selection (puromycin). The vast majority of resistant clones result from random insertion and express a very weak d1EGFP signal (FIG. 2, bottom left). Only a very small fraction, 0.6%, are true GAPDH site-directed integration events, which express a strong d1EGFP signal (FIG. 2, bottom right).
FIG. 3 shows the first step of the "Toggle In" reaction, in which the pTOG3 vector undergoes Cre recombinase-mediated recombination with chromosomal DNA carrying LoxP sequences.
FIG. 4 shows the second step of the "Toggle In" reaction, in which the pTOG4 vector undergoes Cre recombinase-mediated recombination with chromosomal DNA carrying LoxP sequences.
FIG. 5 shows micrographs of CHO cells after the pTOG3 vector carrying DsRed-E2 was transfected and integrated at the anchored site, followed by hygromycin selection. The cell clones seen under the ordinary light microscope overlap closely with the clones emitting red fluorescence under the fluorescence microscope (FIG. 5A), which shows the high efficiency of the site-directed chromosomal integration technique. RT-PCR showed that randomly selected clones of the CHO-E cell line, which has integrated genes such as SRP-14-9-54, hERO1-L and hFGF9, all expressed these genes at the same level, indicating genetic homogeneity (isogenicity) between the individual cell clones (FIG. 5B).
FIG. 6 shows the results of FACS analysis, using the FITC-labeled lectin LCA, of clones obtained after CHO-K1 cells (ATCC, CCL-61) were transfected with the TALEN vector to knock out the Fut8 gene. The surface of wild-type CHO cells was strongly positive for fucose, whereas the Fut8-/- clones were all negative.
FIG. 7 shows DNA sequence analysis of Fut8-/- CHO cell clones. Genomic DNA of the exon 10 region targeted by the TALEN was amplified by PCR (715 bp), digested with the restriction enzyme MscI and analyzed by agarose gel electrophoresis. In contrast to the wild-type allele, which yields fragments of 417 bp and 298 bp, the MscI site of at least one and in some cases both alleles of each Fut8-/- CHO cell clone was disrupted by the TALEN (FIG. 7A). Sequencing of both alleles of clone No. 1 showed that the TALEN introduced nonsense mutations in both, so that the protein product is prematurely terminated. The sequencing results also showed that the MscI site of one of the alleles was disrupted by the TALEN, consistent with the results in FIG. 7B.
FIG. 8 shows mass spectrometric analysis of the sugar chains of Fc fusion proteins expressed by wild-type CHO-K1 and CHO-Fut8-/- cells. The results indicate that 100% of the product expressed by CHO-Fut8-/- cells is free of fucose (filled diamonds in the figure).
FIG. 9 shows mass spectrometric analysis of the sugar chains of antibodies expressed by wild-type CHO-K1 and CHO-Fut8-/- cells. The results indicate that 100% of the product expressed by CHO-Fut8-/- cells is free of fucose (filled diamonds in the figure).
FIG. 10 shows the ELISA binding activity of hIgG1 antibody expressed by wild-type CHO-K1 and CHO-Fut8-/- cells toward various mouse and human Fc receptor proteins. The results show that the binding capacity of the fucose-deficient hIgG1 antibody for mouse FcRγI, FcRγIIb and FcRγIII and for human FcRγI, FcRγIIa (H133), FcRγIIa (R133) and FcRγIIb is unchanged, while its binding capacity for FcRγIIIa is greatly improved.
FIG. 11 shows the high degree of protein sequence homology among a guinea pig protein identified in the UniProt protein sequence database by bioinformatics methods (H0VDZ8, designated guinea pig FcRγIV, gpFcRγIV), mouse FcRγIV and human FcRγIIIa.
FIG. 12 shows the differences in binding capacity of wild-type (open circles) and fucose-deficient (filled circles) human hIgG1 and mouse mIgG2a antibodies for hFcRγIIIa-V158, mFcRγIV and gpFcRγIV. The results show that the binding of the fucose-deficient hIgG1 and mouse mIgG2a antibodies expressed by CHO-Fut8-/- cells to these receptors is improved to different extents.
FIG. 13 shows a comparison of the ADCC activity of wild-type and fucose-deficient hIgG1 against target cells. In this system, the effector cells are Jurkat cells in which lentivirus-mediated expression of the hFcRγIIIa-V158 receptor drives stimulated expression of a downstream luciferase reporter.
FIG. 14 shows a comparison of the ADCC activity of wild-type and fucose-deficient mIgG2c against target cells. In this system, the effector cells are Jurkat cells in which lentivirus-mediated expression of the mFcRγIV receptor drives stimulated expression of a downstream luciferase reporter.
FIG. 15 shows the growth of subclone #4 and wild-type cells in complete medium and in glutamine-deficient medium after knockout of the GS gene in CHO cells by TALEN. GS-/- CHO cells can grow only in complete medium and not in glutamine-deficient medium, whereas GS+/+ CHO cells can grow in both.
FIG. 16 shows the DNA sequences of the GS exon 7 region on both chromosomes of GS-/- CHO-K1 clone No. 4. Both chromosomal GS alleles are prematurely terminated as a result of gene editing.
FIG. 17 shows the results of FACS analysis, using the FITC-labeled lectin LCA, of clones obtained after the Fut8 gene was additionally knocked out, by the TALEN method disclosed in the present invention, in the GS-/- clone CHO-K1 #4. Clones #1, #4, #5 and #7 are GS-/- and Fut8-/- double-knockout lines.
FIG. 18 shows the CHO-EFS-6.1 and CHO-EFS-6.3 cell lines obtained after CHO-EF cells were transfected with the GT-P2A-ST6 gene, drug-selected and subcloned. Staining with the plant lectin Sambucus nigra lectin (SNA) demonstrates that the cell surface carries α2,6-linked sialic acid, which the CHO-F parental cell line cannot express; staining for α2,3-linked sialic acid, which is specifically recognized by Maackia amurensis lectin II (MAL II), did not differ among the three (FIG. 18A). In the CHO-EFS-6.1 cell line, which expresses α2,6-linked sialic acid, surface galactose is partially masked, so that specific staining for galactose with the plant lectin Erythrina cristagalli lectin (ECL) is weaker than in the CHO-F cell line (FIG. 18B).
FIG. 19 shows the CHO-ES-#3, CHO-ES-#4 and CHO-ES-#5 cell lines obtained after CHO-E cells were transfected with the GT-P2A-ST6 gene, drug-selected and subcloned. Staining with the plant lectin Sambucus nigra lectin (SNA) demonstrates that these cells carry α2,6-linked sialic acid on their surface, which the CHO-E parental cells do not express. This is consistent with the characteristics of the CHO-EFS-6.1 cell line, except that the former carry fucose (LCA staining positive, as in CHO-K1) while the latter is fucose-deficient (LCA staining negative, as in CHO-F).
FIG. 20 shows FACS profiles of the CHO-ES (6) cell line, obtained by knocking out the α2,3-sialyltransferases (ST3GAL4 and ST3GAL6) in CHO-ES (3,6) cells using TALEN technology, after staining with the plant lectins Sambucus nigra lectin (SNA) and Maackia amurensis lectin II (MAL II). Wild-type CHO-E cells express only α2,3-linked sialic acid. CHO-ES (3,6) cells express both α2,3-linked and α2,6-linked sialic acid. The CHO-ES (6) cell line derived from them, in which the α2,3-sialyltransferases are knocked out, expresses only α2,6-linked sialic acid.
FIG. 21 shows mass spectrometric analysis of the sugar chains of human IgG1 antibody expressed by the wild-type CHO-K1 and the Fut8-/- CHO-EFS (3,6) cell lines. The results show that the sugar chains of the two differ greatly: the latter carries no fucose and has more terminal sialic acid.
FIG. 22 shows the ADCC effect on GPC3-positive Huh7 cells of hIgG1 antibody against the tumor target GPC3 expressed by the wild-type (WT) CHO-K1, fucose-deficient (AF) CHO-F, and fucose-deficient and human-type sialylated (AF + S) CHO-EFS (3,6) cell lines, respectively. The ADCC activity of the fucose-deficient antibody is markedly higher than that of the wild-type antibody, and the ADCC activity of the antibody is improved after sialic acid is added on top of fucose deficiency.
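The MscI screening logic described for FIG. 7 can be sketched in a few lines of Python. MscI recognizes TGGCCA and cuts bluntly between TGG and CCA; an intact site in the 715 bp amplicon gives the 417 bp and 298 bp fragments, whereas an edited allele that has lost the site stays uncut. The sequences below are synthetic placeholders built only to reproduce those fragment sizes; they are not the Fut8 exon 10 sequence, and the 2 bp deletion is a hypothetical edit.

```python
MSCI_SITE = "TGGCCA"  # MscI recognition sequence, blunt cut TGG^CCA
CUT_OFFSET = 3        # cut position relative to the start of the site


def msci_fragments(seq):
    """Return fragment lengths after complete MscI digestion of a linear DNA."""
    seq = seq.upper()
    sizes, start = [], 0
    pos = seq.find(MSCI_SITE)
    while pos != -1:
        cut = pos + CUT_OFFSET
        sizes.append(cut - start)
        start = cut
        pos = seq.find(MSCI_SITE, pos + 1)
    sizes.append(len(seq) - start)
    return sizes


# Synthetic 715 bp "amplicon" with one MscI site (placeholder bases only).
wild_type = "A" * 414 + MSCI_SITE + "T" * 295
# Hypothetical edited allele: a 2 bp deletion inside the site destroys it.
edited = "A" * 414 + "TGGA" + "T" * 295

if __name__ == "__main__":
    print(msci_fragments(wild_type))  # [417, 298]
    print(msci_fragments(edited))     # [713]
```

On a gel, a clone with both alleles edited would therefore show only the uncut band, while a clone with one intact allele would show all three bands.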
Best Mode for Carrying Out The Invention
In order to explain the objects, technical solutions and advantages of the present invention more clearly, the invention is further described below with reference to the following examples. The specific embodiments described herein merely illustrate and explain the invention and do not limit its scope of application. Experimental procedures for which specific conditions are not given can be carried out under the usual conditions (for example, the conditions described in Molecular Cloning: A Laboratory Manual, or the conditions recommended by the manufacturer). Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art. In addition, any methods and materials similar or equivalent to those described herein can be used in the practice of the present invention. The preferred embodiments and materials described herein are intended to be exemplary only.
The detailed description of the invention is as follows:
As used herein, the term "host cell" refers to a cell into which a vector can be introduced, and includes, but is not limited to, prokaryotic cells such as Escherichia coli or Bacillus subtilis cells, fungal cells such as yeast or Aspergillus cells, insect cells such as Drosophila S2 cells or Sf9 cells, and animal cells such as fibroblasts, CHO cells, COS cells, NS0 cells, HeLa cells, BHK cells, HEK293 cells, PER.C6 cells, or human cells.
As used herein, a "progeny cell" of a cell refers to a cell that originates directly or indirectly from the cell, and includes not only progeny cells that are produced directly from the cell by cell division or cell multiplication, but also progeny cells that are produced from progeny cells of the cell. For example, progeny cells of CHO-K1 include any cell derived, directly or indirectly, from CHO-K1.
As used herein, the term "glutamine synthetase (GS)" refers to an enzyme capable of catalyzing the synthesis of glutamine from glutamate and ammonium. The international systematic classification number of GS is EC 6.3.1.2. In the genome of CHO cells, the nucleotide sequence encoding glutamine synthetase (GS) is known, and an exemplary nucleotide sequence can be found in, for example, NCBI accession No. NW_003613921.1, positions 1430036 to 1435423.
As used herein, the term "fucosyltransferase 8 (Fut8)", also known as α-(1,6)-fucosyltransferase, refers to an enzyme capable of catalyzing the transfer of fucosyl groups to glycosylation sites of proteins in an α-(1,6) linkage. The international systematic classification number of FUT8 is EC 2.4.1.68. In the genome of CHO cells, the nucleotide sequence encoding fucosyltransferase 8 (FUT8) is known, and an exemplary nucleotide sequence can be found in, for example, NCBI accession No. NW_003613860.1, positions 570169 to 731500.
As used herein, the term "α2,3-sialyltransferase 4 (ST3GAL4)", also known as ST3 β-galactoside α-2,3-sialyltransferase 4, refers to the subtype 4 enzyme capable of catalyzing the transfer of sialyl groups to β-galactosides in an α-(2,3) linkage. The international systematic classification number of ST3GAL4 is EC 2.4.99.4. In the genome of CHO cells, the nucleotide sequence encoding α2,3-sialyltransferase 4 (ST3GAL4) is known, and an exemplary nucleotide sequence can be found in, for example, NCBI accession No. NW_003613908.1, positions 1021239 to 1065065.
As used herein, the term "α2,3-sialyltransferase 6 (ST3GAL6)", also known as ST3 β-galactoside α-2,3-sialyltransferase 6, refers to the subtype 6 enzyme capable of catalyzing the transfer of sialyl groups to β-galactosides in an α-(2,3) linkage. The international systematic classification number of ST3GAL6 is EC 2.4.99.4. In the genome of CHO cells, the nucleotide sequence encoding α2,3-sialyltransferase 6 (ST3GAL6) is known, and an exemplary nucleotide sequence can be found in, for example, NCBI accession No. NW_003614792.1, positions 384880 to 438354.
As used herein, the term "endoplasmic reticulum oxidoreductase (ERO1-L)" refers to an oxidoreductase that catalyzes the formation of disulfide bonds in the endoplasmic reticulum. The international systematic classification number of ERO1-L is EC 1.8.4.2. In the human genome, the nucleotide sequence encoding ERO1-L is known, and an exemplary nucleotide sequence can be found in, for example, NCBI accession No. NC_000014.9, positions 52639915 to 52695931.
As used herein, the term "SRP-14" refers to the 14 kDa signal recognition particle protein. It forms a complex with SRP-9 and SRP-54 that recognizes the signal peptide, temporarily arrests translation and directs the secreted protein into the lumen of the rough endoplasmic reticulum. In the human genome, the nucleotide sequence encoding SRP-14 is known, and an exemplary nucleotide sequence can be found in, for example, NCBI accession No. NC_000015.10, positions 40035690 to 40039202.
As used herein, the term "SRP-54" refers to the 54 kDa signal recognition particle protein. It forms a complex with SRP-9 and SRP-14 that recognizes the signal peptide, temporarily arrests translation and directs the secreted protein into the lumen of the rough endoplasmic reticulum. In the human genome, the nucleotide sequence encoding SRP-54 is known, and an exemplary nucleotide sequence can be found in, for example, NCBI accession No. NC_000014.9, positions 34982898 to 35029567.
As used herein, the term "SRP-9" refers to the 9 kDa signal recognition particle protein. It forms a complex with SRP-14 and SRP-54 that recognizes the signal peptide, temporarily arrests translation and directs the secreted protein into the lumen of the rough endoplasmic reticulum. In the human genome, the nucleotide sequence encoding SRP-9 is known, and an exemplary nucleotide sequence can be found in, for example, NCBI accession No. NC_000001.11, positions 225777813 to 225790466.
As used herein, the term "FGF9", also known as fibroblast growth factor 9, and the term "hFGF9", also known as human fibroblast growth factor 9, refer to a factor that promotes the growth of CHO cells and their resistance to apoptosis. In the human genome, the nucleotide sequence encoding FGF9 is known, and an exemplary nucleotide sequence can be found in, for example, NCBI accession No. NC_000013.11, positions 21671076 to 21704501.
As used herein, the term "GT", also known as β1,4-galactosyltransferase 1, refers to an enzyme involved in adding β1,4-linked galactose residues during glycosylation of proteins in the Golgi apparatus. The international systematic classification number of GT is EC 2.4.1.133. In the human genome, the nucleotide sequence encoding GT is known, and an exemplary nucleotide sequence can be found in, for example, NCBI accession No. NC_000009.12, positions 33110641 to 33167358.
As used herein, the term "ST6", also known as α2,6-sialyltransferase 1, refers to an enzyme involved in adding α2,6-linked sialic acid residues during glycosylation of proteins in the Golgi apparatus. The international systematic classification number of ST6 is EC 2.4.99.10. In the human genome, the nucleotide sequence encoding ST6 is known, and an exemplary nucleotide sequence can be found in, for example, NCBI accession No. NC_000003.12, positions 186930526 to 187078553.
As used herein, the term "vector" refers to a nucleic acid delivery vehicle into which a polynucleotide can be inserted. When a vector is capable of expressing the protein encoded by an inserted polynucleotide, the vector is referred to as an expression vector. A vector may be introduced into a host cell by transformation, transduction or transfection, so that the genetic elements it carries are expressed in the host cell. Vectors are well known to those skilled in the art and include, but are not limited to: plasmids; phagemids; artificial chromosomes such as yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs) or P1-derived artificial chromosomes (PACs); bacteriophages such as lambda phage or M13 phage; and animal viruses. Animal viruses that may be used as vectors include, but are not limited to, retroviruses (including lentiviruses), adenoviruses, adeno-associated viruses, herpes viruses (e.g., herpes simplex virus), poxviruses, baculoviruses, papilloma viruses and papovaviruses (e.g., SV40). A vector may contain a variety of elements that control expression, including, but not limited to, promoter sequences, transcription initiation sequences, enhancer sequences, selection elements and reporter genes. In addition, the vector may contain an origin of replication.
As used herein, the term "stably expressed" means that the host cell expresses a foreign protein from a gene encoding it that is integrated into the host cell genome. In stable expression, the gene encoding the foreign protein is stably inserted into the genome (e.g., a chromosome) of the host cell, so that the host cell can express the foreign protein stably over a long period.
As used herein, the term "knockout" refers to editing a gene in the genome of a cell (e.g., altering the gene by insertion, substitution and/or deletion) such that the gene loses its original function (e.g., can no longer express a functional protein). Genes in the genome of a cell can be edited using various known molecular biology techniques (e.g., gene editing techniques using ZFNs, TALENs, CRISPR/Cas9 or NgAgo). Gene knockout is not limited to complete deletion or removal of the entire gene, as long as the gene loses its original function. For example, a gene can be knocked out by inserting a foreign DNA fragment into the gene so that it cannot express a functional protein, or by inserting or deleting one or several bases in the gene so that it undergoes a frameshift mutation.
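As a small illustration of the frameshift case mentioned in this definition, the Python sketch below translates a toy open reading frame before and after a one-base deletion. The ORF is a made-up example, not a sequence from this document; the codon table is the standard genetic code, and the helper names are assumptions chosen for this sketch. An insertion or deletion shifts the reading frame whenever its length is not a multiple of three.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf):
    """Translate from the first base until a stop codon or the end."""
    protein = []
    for i in range(0, len(orf) - 2, 3):
        aa = CODON_TABLE[orf[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def is_frameshift(indel_length):
    """An indel shifts the frame unless its length is a multiple of 3."""
    return indel_length % 3 != 0


toy_orf = "ATGGCCAAGTGGTAA"          # hypothetical ORF: M A K W stop
deleted = toy_orf[:3] + toy_orf[4:]  # delete one base after the start codon

if __name__ == "__main__":
    print(translate(toy_orf))  # MAKW
    print(translate(deleted))  # MPSG
```

Every codon downstream of the deletion is read in a different frame, which is why even a single-base edit can abolish the function of the encoded protein.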
As used herein, the term "ADCC", also known as antibody-dependent cell-mediated cytotoxicity, refers to the process in which cells with killing activity (e.g., NK cells, macrophages and neutrophils) use the Fc receptors (FcRs) expressed on their surface to recognize the Fc fragment of antibodies specifically bound to surface antigens of target cells (e.g., virus-infected cells and tumor cells), and thereby directly kill the target cells.
As used herein, the term "CDC", also known as complement-dependent cytotoxicity, refers to the action by which an antibody kills a target cell by activating the complement system: the classical complement pathway is activated when a specific antibody binds the corresponding antigen on the cell membrane surface to form a complex, and the resulting membrane attack complex lyses the target cell.
As used herein, the sequences of the wild-type and mutant LoxP sites are as follows:
LoxPwt (SEQ ID NO: 1): ATAACTTCGTATAGCATACATTATACGAAGTTAT
LoxP1 (SEQ ID NO: 2): ATAACTTCGTATAGTATAGTATATACGAACGGTA
LoxP2 (SEQ ID NO: 3): TACCGTTCGTATAGTATAGTATATACGAAGTTAT
LoxP3 (SEQ ID NO: 4): TACCGTTCGTATAGTATAGTAATACGAACGGTA
LoxP4 (SEQ ID NO: 5): ATAACTTCGTATAGGCTATAGTATACGAACGGTA
LoxP5 (SEQ ID NO: 6): TACCGTTCGTATAGGCTATAGTATACGAAGTTAT
LoxP6 (SEQ ID NO: 7): TACCGTTCGTATAGGCTATAGTATACGAACGGTA
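The relationship between these sites is easier to see when each is split into its parts. A canonical loxP site is 34 bp long: two 13 bp arms that are inverted repeats of each other, flanking an 8 bp spacer. The Python sketch below, offered only as an illustration with helper names chosen here, splits a site accordingly and compares its arms and spacer with LoxPwt, using the LoxPwt, LoxP1 and LoxP2 sequences listed above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def split_lox(site):
    """Split a 34 bp lox site into (left arm, spacer, right arm)."""
    if len(site) != 34:
        raise ValueError("expected a 34 bp lox site")
    return site[:13], site[13:21], site[21:]


def compare_to_wild_type(site, wild_type):
    """Report which parts of a lox variant differ from the wild-type site."""
    left, spacer, right = split_lox(site)
    wt_left, wt_spacer, wt_right = split_lox(wild_type)
    return {
        "left_arm_mutated": left != wt_left,
        "spacer_mutated": spacer != wt_spacer,
        "right_arm_mutated": right != wt_right,
        "arms_are_inverted_repeats": right == reverse_complement(left),
    }


LOXP_WT = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # SEQ ID NO: 1
LOXP_1 = "ATAACTTCGTATAGTATAGTATATACGAACGGTA"   # SEQ ID NO: 2
LOXP_2 = "TACCGTTCGTATAGTATAGTATATACGAAGTTAT"   # SEQ ID NO: 3

if __name__ == "__main__":
    for name, site in [("LoxPwt", LOXP_WT), ("LoxP1", LOXP_1), ("LoxP2", LOXP_2)]:
        print(name, split_lox(site)[1], compare_to_wild_type(site, LOXP_WT))
```

Run on these three sequences, the comparison shows that LoxP1 and LoxP2 share a spacer that differs from the wild-type one, with LoxP1 altered in the right arm and LoxP2 in the left arm.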
The nucleotide sequence from 3574544 th site to 3575484 th site of the GAPDH gene is shown as SEQ ID NO: 8: gggtgatgctggcgccgagtatgttgtggaatctactggcgtcttcaccaccatggagaaggctggggcccacttga agggcggggccaagagggtcatcatctccgccccttctgctgatgcccccatgtttgtgatgggtgtgaaccaagac aagtatgacaactccctcaagattgtcaggtgaggatggcagagggctgtggcaaagtgggcaagcaggggcaa ggttacaggtgggcgagcctcctaacctgtctcttctcttcagcaatgcgtcctgcaccaccaactgcttagcccccct ggccaaggtcatccatgacaactttggcattgtggaaggactcatggtatgtagttcatctgtttcatcctgccagcagt gggcgctgtggtgggggccctgcaagacctcactccctgcctctgtgtctttcagaccacggtccatgccatcactg ccacccagaagactgtggatggcccctccgggaagctgtggcgtgatggccgtggggctgcccagaacatcatcc ctgcatccactggcgctgccaaggctgtgggcaaagtcatcccagagctgaacgggaagctgactggcatggcctt ccgtgttcctacccccaacgtgtccgttgtggatctgacatgtcgcctggagaaacctgtatgtctggggtgggctga gggttgtctctagtggtgaggttggggcttgagtagtcaccttgatttttgcccttaataggccaagtatgaggacatca agaaggtggtgaagcaggcatctgagggcccactgaagggcatcctgggctacaccgaggaccaggttgtctcct gcgacttcaacagtgactcccactcttccacctttgatgctggggctggcattgctctcaatgacaactttgtaaagctc atttcctggtatgacaatgaatttggctacagcaacagagtggtggacctcatggcctacatggcctccaagg
The amino acid sequence of the positive selection sequence Puromycin-T2A-d1EGFP is shown as SEQ ID NO: 9:
MATEYKPTVRLATRDDVPRAVRTLAAAFADYPATRHTVDPDRHIERV TELQELFLTRVGLDIGKVWVADDGAAVAVWTTPESVEAGAVFAEIGSRMA ELSGSRLAAQQQMEGLLAPHRPKEPAWFLATVGVSPDHQGKGLGSAVVL PGVEAAERAGVPAFLETSAPRNLPFYERLGFTVTADVEVPEGPRTWCMTR KPGAGSEGRGSLLTCGDVEENPGPMVSKGEELFTGVVPILVELDGDVNG HKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYP DHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIE LKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDG SVQLADHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEF VTAAGITLGMDELYKKLSHGFPPAVAAQDDGTLPMSCAQESGMDRHPAA CASARINV
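The T2A element in this fusion lets one transcript yield two separate proteins: 2A peptides share the C-terminal consensus D(V/I)ExNPGP, and ribosome skipping occurs between the final glycine and proline. The Python sketch below locates that motif and splits a sequence at the skipping point. The excerpt is a short stretch copied from the sequence above around the T2A junction; the function name and the use of a regular expression are illustrative choices, not part of the described method.

```python
import re

# Consensus at the C-terminus of 2A peptides; skipping occurs between G and P.
TWO_A_MOTIF = re.compile(r"D[VI]E.NPGP")


def split_at_2a(protein):
    """Split a polyprotein at the first 2A skipping site, if one is present."""
    match = TWO_A_MOTIF.search(protein)
    if match is None:
        return [protein]
    cut = match.end() - 1  # the final proline starts the downstream protein
    return [protein[:cut], protein[cut:]]


# Excerpt of SEQ ID NO: 9 spanning the T2A junction.
excerpt = "TRKPGAGSEGRGSLLTCGDVEENPGPMVSKGEELFTG"

if __name__ == "__main__":
    upstream, downstream = split_at_2a(excerpt)
    print(upstream)    # ends with the 2A residues ...DVEENPG
    print(downstream)  # starts with P, followed by the d1EGFP N-terminus
```

The upstream product (here puromycin resistance) keeps the 2A residues at its C-terminus, while the downstream product (d1EGFP) gains a single N-terminal proline.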
The DNA sequence of the targeting vector (comprising the left and right GAPDH homology arms and the intervening portion, which includes TK gene-SV40 PolyA-PGK promoter-Puromycin-T2A-d1EGFP) is shown as SEQ ID NO: 10, in which the Puromycin-T2A-d1EGFP sequence (SEQ ID NO: 15) is marked in bold underline:
The nucleotide sequence of the vector pTOG3 is shown as SEQ ID NO: 11, in which the black underlined part is the tracer protein DsRed-E2, whose nucleotide sequence is shown as SEQ ID NO: 36 (not listed separately):
The nucleotide sequence of the vector pTOG4 is shown as SEQ ID NO: 12, in which the black underlined part is the tracer protein d1EGFP, whose nucleotide sequence is shown as SEQ ID NO: 37 (not listed separately):
The promoterless resistance gene HygB is shown as SEQ ID NO: 13:
atgaaaaagcctgaactcaccgcgacgtctgtcgagaagtttctgatcgaaaagttcgacagcgtctccgac ctgatgcagctctcggagggcgaagaatctcgtgctttcagcttcgatgtaggagggcgtggatatgtcctgcgggt aaatagctgcgccgatggtttctacaaagatcgttatgtttatcggcactttgcatcggccgcgctcccgattccggaa gtgcttgacattggggaattcagcgagagcctgacctattgcatctcccgccgtgcacagggtgtcacgttgcaaga cctgcctgaaaccgaactgcccgctgttctgcagccggtcgcggaggccatggatgcgatcgctgcggccgatctt agccagacgagcgggttcggcccattcggaccgcaaggaatcggtcaatacactacatggcgtgatttcatatgcg cgattgctgatccccatgtgtatcactggcaaactgtgatggacgacaccgtcagtgcgtccgtcgcgcaggctctc gatgagctgatgctttgggccgaggactgccccgaagtccggcacctcgtgcacgcggatttcggctccaacaatg tcctgacggacaatggccgcataacagcggtcattgactggagcgaggcgatgttcggggattcccaatacgaggt cgccaacatcttcttctggaggccgtggttggcttgtatggagcagcagacgcgctacttcgagcggaggcatccg gagcttgcaggatcgccgcggctccgggcgtatatgctccgcattggtcttgaccaactctatcagagcttggttgac ggcaatttcgatgatgcagcttgggcgcagggtcgatgcgacgcaatcgtccgatccggagccgggactgtcggg cgtacacaaatcgcccgcagaagcgcggccgtctggaccgatggctgtgtagaagtactcgccgatagtggaaac cgacgccccagcactcgtccgagggcaaaggaatag
The promoter of human elongation factor 1a (EF1a) is shown as SEQ ID NO: 14:
cgtgaggctccggtgcccgtcagtgggcagagcgcacatcgcccacagtccccgagaagttggggggag gggtcggcaattgaaccggtgcctagagaaggtggcgcggggtaaactgggaaagtgatgtcgtgtactggctccgcctttttcccgagggtgggggagaaccgtatataagtgcagtagtcgccgtgaacgttctttttcgcaacgggtttgc cgccagaacacaggtaagtgccgtgtgtggttcccgcgggcctggcctctttacgggttatggcccttgcgtgccttg aattacttccacctggctgcagtacgtgattcttgatcccgagcttcgggttggaagtgggtgggagagttcgaggcc ttgcgcttaaggagccccttcgcctcgtgcttgagttgaggcctggcctgggcgctggggccgccgcgtgcgaatct ggtggcaccttcgcgcctgtctcgctgctttcgataagtctctagccatttaaaatttttgatgacctgctgcgacgctttt tttctggcaagatagtcttgtaaatgcgggccaagatctgcacactggtatttcggtttttggggccgcgggcggcga cggggcccgtgcgtcccagcgcacatgttcggcgaggcggggcctgcgagcgcggccaccgagaatcggacg ggggtagtctcaagctggccggcctgctctggtgcctggcctcgcgccgccgtgtatcgccccgccctgggcggc aaggctggcccggtcggcaccagttgcgtgagcggaaagatggccgcttcccggccctgctgcagggagctcaa aatggaggacgcggcgctcgggagagcgggcgggtgagtcacccacacaaaggaaaagggcctttccgtcctc agccgtcgcttcatgtgactccacggagtaccgggcgccgtccaggcacctcgattagttctggagcttttggagtac gtcgtctttaggttggggggaggggttttatgcgatggagtttccccacactgagtgggtggagactgaagttaggcc agcttggcacttgatgtaattctccttggaatttgccctttttgagtttggatcttggttcattctcaagcctcagacagtgg ttcaaagtttttttcttccatttcaggtgtcgtga
Three human genes, signal recognition particle 9, signal recognition particle 14 and signal recognition particle 54, are joined by self-cleaving peptide segments to serve as an optional exogenous gene for integration, and are shown in SEQ ID NO: 17:
atggtgttgttggagagcgagcagttcctgacggagctgaccagacttttccagaagtgccggacgtcgggc agcgtctatatcaccttgaagaagtatgacggtcgaaccaaacccattccaaagaaaggtactgtggagggctttga gcccgcagacaacaagtgtctgttaagagctaccgatgggaagaagaagatcagcactgtggtgagctccaagga agtgaataagtttcagatggcttattcaaacctccttagagctaacatggatgggttgaagaagagagacaaaaagaa caaaactaagaagaccaaagcagcagcagcagcagcagcagcagcacctgccgcagcagcaacagcagcaca gggcagcgagggaaggggaagcctgctcacatgcggcgacgtcgaagagaaccctggccccatgccgcagtac cagacctgggaggagttcagccgcgctgccgagaagctttacctcgctgaccctatgaaggcacgtgtggttctca aatataggcattctgatgggaacttgtgtgttaaagtaacagatgatttagttagacagtgtcttgctctattgctcaggct gcagtgcagtggcatgatcatagctcactgcatcctcgacctcctgggctcaagcggtcctcttgcttcagcctccgg agccaccaacttcagcctgctgaagcaggccggcgatgtggaggagaatcctggccccatggttctagcagacctt ggaagaaaaataacatcagcattacgctcgttgagcaatgccaccattatcaatgaagaggtattgaatgctatgcta aaagaagtctgtaccgctttgttggaagcagatgttaatattaaactagtgaagcaactaagagaaaatgttaagtctg ctattgatcttgaagagatggcatctggtcttaacaaaagaaaaatgattcagcatgctgtatttaaagaacttgtgaag cttgtagaccctggagttaaggcatggacacccactaaaggaaaacaaaatgtgattatgtttgttggattgcaaggg agtggtaaaacaacaacatgttcaaagctagcatattattaccagaggaaaggttggaagacctgtttaatatgtgcag acacattcagagcaggggcttttgaccaactaaaacagaatgctaccaaagcaagaattccattttatggaagctata cagaaatggatcctgtcatcattgcttctgaaggagtagagaaatttaaaaatgaaaattttgaaattattattgttgatac aagtggccgccacaaacaagaagactctttgtttgaagaaatgcttcaagttgctaatgctatacaacctgataacatt gtttatgtgatggatgcctccattgggcaggcttgtgaagcccaggctaaggcttttaaagataaagtagatgtagcct cagtaatagtgacaaaacttgatggccatgcaaaaggaggtggtgcactcagtgcagtcgctgccacaaaaagtcc gattattttcattggtacaggggaacatatagatgactttgaacctttcaaaacacagccttttattagcaaacttcttggt atgggcgacattgaaggactgatagataaagtcaacgagttgaagttggatgacaatgaagcacttatagagaagtt gaaacatggtcagtttacgttgcgagacatgtatgagcaatttcaaaatatcatgaaaatgggccccttcagtcagatc ttggggatgatccctggttttgggacagattttatgagcaaaggaaatgaacaggagtcaatggcaaggctaaagaa attaatgacaataatggatagtatgaatgatcaagaactagacagtacggatggtgccaaagtttttagtaaacaacca 
ggaagaatccaaagagtagcaagaggatcgggtgtatcaacaagagatgttcaagaacttttgacacaatataccaagtttgcacagatggtaaaaaagatgggaggtatcaaaggacttttcaaaggtggcgacatgtctaagaatgtgagccagtcacagatggcaaaattgaaccaacaaatggccaaaatgatggatcctagggttcttcatcacatgggtggtatgg caggacttcagtcaatgatgaggcagtttcaacagggtgctgctggcaacatgaaaggcatgatgggattcaataat atgtaa
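As an illustration of how the self-cleaving linkers in a fusion such as SEQ ID NO: 17 might be located, the following Python sketch translates two short fragments copied from the listing above and searches for the C-terminal NPGP motif that 2A-type peptides share. The helper names (`translate`, `find_motif`), the fragment boundaries and the reading frame are assumptions made for illustration only and are not part of the original disclosure; the two fragments appear to correspond to the commonly used T2A and P2A linkers.

```python
from itertools import product

# Standard genetic code, with codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate DNA in frame 0, ignoring whitespace and any trailing partial codon."""
    seq = "".join(dna.split()).lower()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def find_motif(protein: str, motif: str = "NPGP") -> list[int]:
    """Return the 0-based start index of every occurrence of the motif."""
    hits, start = [], protein.find(motif)
    while start != -1:
        hits.append(start)
        start = protein.find(motif, start + 1)
    return hits


# Fragments copied from SEQ ID NO: 17 (frame chosen by hand; an assumption).
LINKER_1 = "gagggaaggggaagcctgctcacatgcggcgacgtcgaagagaaccctggcccc"
LINKER_2 = "gccaccaacttcagcctgctgaagcaggccggcgatgtggaggagaatcctggcccc"

for name, fragment in (("linker 1", LINKER_1), ("linker 2", LINKER_2)):
    peptide = translate(fragment)
    print(name, peptide, find_motif(peptide))
```

Running the same search on a translation of the complete sequence would be expected to report one hit per linker, which gives a quick check that the three coding regions are separated as described.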
The DNA sequence encoding the TALEN left-arm protein constructed for knocking out the Fut8 gene is shown in SEQ ID NO: 18:
atggactataaggaccacgacggagactacaaggatcatgatattgattacaaagacgatgacgataagatg gccccaaagaagaagcggaaggtcggtatccacggagtcccagcagccgtagatttgagaactttgggatattcac agcagcagcaggaaaagatcaagcccaaagtgaggtcgacagtcgcgcagcatcacgaagcgctggtgggtcat gggtttacacatgcccacatcgtagccttgtcgcagcaccctgcagcccttggcacggtcgccgtcaagtaccagg acatgattgcggcgttgccggaagccacacatgaggcgatcgtcggtgtggggaaacagtggagcggagcccga gcgcttgaggccctgttgacggtcgcgggagagctgagagggcctccccttcagctggacacgggccagttgctg aagatcgcgaagcggggaggagtcacggcggtcgaggcggtgcacgcgtggcgcaatgcgctcacgggagcacccctcaacctgaccccagagcaggtcgtggcaattgcgagcaacaacgggggaaagcaggcactcgaaaccgt ccagaggttgctgcctgtgctgtgccaagcgcacggacttacgccagagcaggtcgtggcaattgcgagcaacatcgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggactaaccccaga gcaggtcgtggcaattgcgagcaacatcgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgct gtgccaagcgcacgggttgaccccagagcaggtcgtggcaattgcgagcaacatcgggggaaagcaggcactc gaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggcctgaccccagagcaggtcgtggcaattgcg agcaacaacgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggact gacaccagagcaggtcgtggcaattgcgagcaacggagggggaaagcaggcactcgaaaccgtccagaggttg ctgcctgtgctgtgccaagcgcacggacttacacccgaacaagtcgtggcaattgcgagcaacaacgggggaaag caggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggacttacgccagagcaggtcgtg gcaattgcgagcaacaacgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagc gcacggactaaccccagagcaggtcgtggcaattgcgagcaacatcgggggaaagcaggcactcgaaaccgtc cagaggttgctgcctgtgctgtgccaagcgcacgggttgaccccagagcaggtcgtggcaattgcgagcaacgga gggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggcctgaccccaga gcaggtcgtggcaattgcgagcaacatcgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgct gtgccaagcgcacggactgacaccagagcaggtcgtggcaattgcgagcaacatcgggggaaagcaggcactc gaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggcctcaccccagagcaggtcgtggcaattgcg agcaacatcgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggactt acgccagagcaggtcgtggcaattgcgagcaacatcgggggaaagcaggcactcgaaaccgtccagaggttgct 
gcctgtgctgtgccaagcgcacggactaaccccagagcaggtcgtggcaattgcgagcaacatcgggggaaagc aggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacgggttgaccccagagcaggtcgtgg caattgcgagcaacatcgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgc acggcctgaccccagagcaggtcgtggcaattgcgagcaacatcgggggaaagcaggcactcgaaaccgtcca gaggttgctgcctgtgctgtgccaagcgcacggactgacaccagagcaggtcgtggcaattgcgagcaacaacgg gggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggactcacgcctgagc aggtagtggctattgcatccaacatcgggggcagacccgcactggagtcaatcgtggcccagctttcgaggccgg accccgcgctggccgcactcactaatgatcatcttgtagcgctggcctgcctcggcggacgacccgccttggatgc ggtgaagaaggggctcccgcacgcgcctgcattgattaagcggaccaacagaaggattcccgagaggacatcac atcgagtggcaggttcccaactcgtgaagagtgaacttgaggagaaaaagtcggagctgcggcacaaattgaaata cgtaccgcatgaatacatcgaacttatcgaaattgctaggaactcgactcaagacagaatccttgagatgaaggtaat ggagttctttatgaaggtttatggataccgagggaagcatctcggtggatcacgaaaacccgacggagcaatctata cggtggggagcccgattgattacggagtgatcgtcgacacgaaagcctacagcggtgggtacaatcttcccatcgg gcaggcagatgagatgcaacgttatgtcgaagaaaatcagaccaggaacaaacacatcaatccaaatgagtggtg gaaagtgtatccttcatcagtgaccgagtttaagtttttgtttgtctctgggcatttcaaaggcaactataaggcccagct cacacggttgaatcacattacgaactgcaatggtgcggttttgtccgtagaggaactgctcattggtggagaaatgat caaagcgggaactctgacactggaagaagtcagacgcaagtttaacaatggcgagatcaatttccgctcataa
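The TALEN coding sequences in this listing consist of near-identical repeats that differ mainly in two codons, the repeat-variable di-residue (RVD). The Python sketch below shows one way such repeats could be decoded: it translates a fragment covering the first two repeats of SEQ ID NO: 18 and reads the two residues that follow the conserved "VAIAS" stretch. The function names, the regular expression and the RVD-to-base table (the commonly cited NI→A, HD→C, NG→T, NN→G preferences) are assumptions for illustration and are not taken from the original disclosure.

```python
import re
from itertools import product

# Standard genetic code, with codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Commonly cited RVD preferences (NN may also recognise A); an assumption here.
RVD_TO_BASE = {"NI": "A", "HD": "C", "NG": "T", "NN": "G"}


def translate(dna: str) -> str:
    """Translate DNA in frame 0, ignoring whitespace and any trailing partial codon."""
    seq = "".join(dna.split()).lower()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def decode_rvds(dna: str) -> tuple[list[str], str]:
    """Return the RVDs found after each 'VAIAS' stretch and the bases they suggest."""
    rvds = re.findall(r"VAIAS(..)GG", translate(dna))
    return rvds, "".join(RVD_TO_BASE.get(rvd, "N") for rvd in rvds)


# First two repeats of SEQ ID NO: 18, copied from the listing with spaces removed.
TWO_REPEATS = (
    "ctgaccccagagcaggtcgtggcaattgcgagc" "aacaac"
    "gggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacgga"
    "cttacgccagagcaggtcgtggcaattgcgagc" "aacatc"
    "gggggaaagcag"
)

rvds, bases = decode_rvds(TWO_REPEATS)
print(rvds, bases)
```

Unknown RVDs are reported as "N" rather than guessed, so an unexpected repeat shows up in the output instead of being silently mapped to a base.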
The DNA sequence encoding the TALEN right-arm protein constructed for knocking out the Fut8 gene is shown in SEQ ID NO: 19:
atggactataaggaccacgacggagactacaaggatcatgatattgattacaaagacgatgacgataagatg gccccaaagaagaagcggaaggtcggtatccacggagtcccagcagccgtagatttgagaactttgggatattcac agcagcagcaggaaaagatcaagcccaaagtgaggtcgacagtcgcgcagcatcacgaagcgctggtgggtcat gggtttacacatgcccacatcgtagccttgtcgcagcaccctgcagcccttggcacggtcgccgtcaagtaccagg acatgattgcggcgttgccggaagccacacatgaggcgatcgtcggtgtggggaaacagtggagcggagcccga gcgcttgaggccctgttgacggtcgcgggagagctgagagggcctccccttcagctggacacgggccagttgctg aagatcgcgaagcggggaggagtcacggcggtcgaggcggtgcacgcgtggcgcaatgcgctcacgggagca cccctcaacctgaccccagagcaggtcgtggcaattgcgagccatgacgggggaaagcaggcactcgaaaccgt ccagaggttgctgcctgtgctgtgccaagcgcacggacttacgccagagcaggtcgtggcaattgcgagccatga cgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggactaaccccag agcaggtcgtggcaattgcgagcaacggagggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtg ctgtgccaagcgcacgggttgaccccagagcaggtcgtggcaattgcgagcaacggagggggaaagcaggcac tcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggcctgaccccagagcaggtcgtggcaattgc gagcaacggagggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacgga ctgacaccagagcaggtcgtggcaattgcgagcaacatcgggggaaagcaggcactcgaaaccgtccagaggtt gctgcctgtgctgtgccaagcgcacggacttacacccgaacaagtcgtggcaattgcgagcaacatcgggggaaa gcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggacttacgccagagcaggtcgt ggcaattgcgagccatgacgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagc gcacggactaaccccagagcaggtcgtggcaattgcgagcaacatcgggggaaagcaggcactcgaaaccgtc cagaggttgctgcctgtgctgtgccaagcgcacgggttgaccccagagcaggtcgtggcaattgcgagcaacatc gggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggcctgaccccaga gcaggtcgtggcaattgcgagcaacatcgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgct gtgccaagcgcacggactgacaccagagcaggtcgtggcaattgcgagcaacaacgggggaaagcaggcactc gaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggcctcaccccagagcaggtcgtggcaattgcg agcaacatcgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggactt acgccagagcaggtcgtggcaattgcgagcaacatcgggggaaagcaggcactcgaaaccgtccagaggttgct 
gcctgtgctgtgccaagcgcacggactaaccccagagcaggtcgtggcaattgcgagcaacaacgggggaaagc aggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacgggttgaccccagagcaggtcgtgg caattgcgagcaacaacgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcg cacggcctgaccccagagcaggtcgtggcaattgcgagcaacaacgggggaaagcaggcactcgaaaccgtcc agaggttgctgcctgtgctgtgccaagcgcacggactgacaccagagcaggtcgtggcaattgcgagcaacgga gggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggactcacgcctga gcaggtagtggctattgcatcccatgacgggggcagacccgcactggagtcaatcgtggcccagctttcgaggcc ggaccccgcgctggccgcactcactaatgatcatcttgtagcgctggcctgcctcggcggacgacccgccttggat gcggtgaagaaggggctcccgcacgcgcctgcattgattaagcggaccaacagaaggattcccgagaggacatc acatcgagtggcaggttcccaactcgtgaagagtgaacttgaggagaaaaagtcggagctgcggcacaaattgaa atacgtaccgcatgaatacatcgaacttatcgaaattgctaggaactcgactcaagacagaatccttgagatgaaggt aatggagttctttatgaaggtttatggataccgagggaagcatctcggtggatcacgaaaacccgacggagcaatct atacggtggggagcccgattgattacggagtgatcgtcgacacgaaagcctacagcggtgggtacaatcttcccat cgggcaggcagatgagatgcaacgttatgtcgaagaaaatcagaccaggaacaaacacatcaatccaaatgagtg gtggaaagtgtatccttcatcagtgaccgagtttaagtttttgtttgtctctgggcatttcaaaggcaactataaggccca gctcacacggttgaatcacattacgaactgcaatggtgcggttttgtccgtagaggaactgctcattggtggagaaat gatcaaagcgggaactctgacactggaagaagtcagacgcaagtttaacaatggcgagatcaatttccgctcataa
PCR primer 1 for amplifying the TALEN-targeted region in exon 10 of the Fut8 gene is shown in SEQ ID NO: 20:
taaatctgttgattccaggttccc
PCR primer 2 for amplifying the TALEN-targeted region in exon 10 of the Fut8 gene is shown in SEQ ID NO: 21:
gtactaagaagtgtggtactgtgttg
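Before ordering a primer pair such as SEQ ID NO: 20 and SEQ ID NO: 21, it can be useful to run a few sanity checks on the sequences. The Python sketch below reports the length, GC fraction and reverse complement of each primer; the function names are hypothetical, and the checks are a generic illustration rather than a procedure described in the original disclosure.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (lower case)."""
    return seq.lower().translate(COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in the sequence."""
    s = seq.lower()
    return (s.count("g") + s.count("c")) / len(s)


# Primer sequences copied from SEQ ID NO: 20 and SEQ ID NO: 21.
PRIMER_1 = "taaatctgttgattccaggttccc"
PRIMER_2 = "gtactaagaagtgtggtactgtgttg"

for name, primer in (("primer 1", PRIMER_1), ("primer 2", PRIMER_2)):
    print(name, len(primer), round(gc_fraction(primer), 2), reverse_complement(primer))
```

The same two functions can be applied unchanged to the other primer pairs in this listing.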
The DNA sequence encoding the TALEN left-arm protein constructed for knocking out the GS gene is shown in SEQ ID NO: 22:
atggactacaaggaccacgacggcgactacaaggaccacgacatcgactacaaggacgacgacgacaa gatggcccccaagaagaagagaaaggtgggcatccacagaggcgtgcccatggtggacctgagaaccctgggc tacagccagcagcagcaggagaagatcaagcccaaggtgagaagcaccgtggcccagcaccacgaggccctg gtgggccacggcttcacccacgcccacatcgtggccctgagccagcaccccgccgccctgggcaccgtggccgt gaagtaccaggacatgatcgccgccctgcccgaggccacccacgaggccatcgtgggcgtgggcaagcagtgg agcggcgccagagccctggaggccctgctgaccgtggccggcgagctgagaggcccccccctccagctggaca ccggccagctgctgaagatcgccaagagaggcggcgtgaccgccgtggaggccgtgcacgcctggagaaacg ccctgaccggcgcccccctgaatttgacacccgaccaagttgtggccattgccagcaacggtggagggaaacaag cattggagactgtccaacggctccttcccgtgttgtgtcaagcccacggtttgacccctgcacaagtggtcgccatcg cctcccatgacggcggtaagcaggccctggaaacagtgcaacggttgctccctgtcttgtgtcaagatcatggactg accccagaccaggtggtcgcaatcgcctctaacaacgggggaaagcaagccctggaaaccgtgcaaaggttgttg cccgtcctttgtcaagaccacggccttacacccgagcaagtcgtggccattgcatcaaacatcggtggcaaacagg ctcttgagactgttcagagacttctcccagttctctgccaggcacacgggcttactcccgatcaagttgtggccattgc cagcaacaacggagggaaacaagcattggagactgtccaacggctccttcccgtgttgtgtcaagcccacggtttg acccctgcacaagtggtcgccatcgcctccaacatcggcggtaagcaggccctggaaacagtgcagcgcctgctg cctgtgctgtgccaggatcatggactgaccccagaccaggttgtcgccatcgcctctaacatcgggggaaagcaag ccctggaaaccgtgcaaaggttgttgcccgtcctttgtcaagaccacggccttacacccgagcaagtcgtggccattgcatcaaacatcggtggcaaacaggctcttgagactgttcagagacttctcccagttctctgtcaagcccacggtttga cacccgaccaagttgtggccattgccagccatgacggagggaaacaagcattggagactgtccaacggctccttcc cgtgttgtgtcaagcccacggtttgacccctgcacaagtggtcgccatcgcctccaacggtggcggtaagcaggcc ctggaaacagtgcaacggttgctccctgtcttgtgtcaagatcatggactgaccccagaccaggtggtcgcaatcgc ctctaacatcgggggaaagcaagccctggaaaccgtgcaaaggttgttgcccgtcctttgtcaagaccacggcctta cacccgagcaagtcgtggccattgcatcaaacatcggtggcaaacaggctcttgagactgttcagagacttctccca gttctctgccaggcacacgggcttactcccgatcaagttgtggccattgccagcaacaacggagggaaacaagcat tggagactgtccaacggctccttcccgtgttgtgtcaagcccacggtttgacccctgcacaagtggtcgccatcgcct cccatgacggcggtaagcaggccctggaaacagtgcagcgcctgctgcctgtgctgtgccaggatcatggactga 
ccccagaccaggttgtcgccatcgcctctaacatcgggggaaagcaagccctggaaaccgtgcaaaggttgttgcc cgtcctttgtcaagaccacggccttacacccgagcaagtcgtggccattgcatcaaacatcggtggcaaacaggctc ttgagactgttcagagacttctcccagttctctgtcaagcccacggtttgacacccgaccaagttgtggccattgccag caacaacggagggaaacaagcattggagactgtccaacggctccttcccgtgttgtgtcaagcccacggtttgacc cctgcacaagtggtcgccatcgcctcccatgacggcggtaagcaggccctggaaacagtgcaacggttgctccct gtcttgtgtcaagaccatgggctgacccccgagcaggtggtggccatcgccagcaacaacggcggcagacccgc cctggagagcatcgtggcccagctgagcagacccgaccccgccctggccgccctgaccaacgaccacctggtgg ccctggcctgcctgggcggcagacccgccctggacgccgtgaagaagggcctgccccacgcccccgccctgat caagagaaccaacagaagaatccccgagagaaccagccacagagtggccggcagccagctggtgaagagcga gctggaggagaagaagagcgagctgagacacaagctgaagtacgtgccccacgagtacatcgagctgatcgaga tcgccagaaacagcacccaggacagaatcctggagatgaaggtgatggagttcttcatgaaggtgtacggctacag aggcaagcacctgggcggcagcagaaagcccgacggcgccatctacaccgtgggcagccccatcgactacggc gtgatcgtggacaccaaggcctacagcggcggctacaacctgcccatcggccaggccgacgagatgcagagata cgtggaggagaaccagaccagaaacaagcacatcaaccccaacgagtggtggaaggtgtaccccagcagcgtg accgagttcaagttcctgttcgtgagcggccacttcaagggcaactacaaggcccagctgaccagactgaaccaca tcaccaactgcaacggcgccgtgctgagcgtggaggagctgctgatcggcggcgagatgatcaaggccggcacc ctgaccctggaggaggtgagaagaaagttcaacaacggcgagatcaacttcagaagctctagatga
The DNA sequence encoding the TALEN right-arm protein constructed for knocking out the GS gene is shown in SEQ ID NO: 23:
atggactacaaggaccacgacggcgactacaaggaccacgacatcgactacaaggacgacgacgacaa gatggcccccaagaagaagagaaaggtgggcatccacagaggcgtgcccatggtggacctgagaaccctgggc tacagccagcagcagcaggagaagatcaagcccaaggtgagaagcaccgtggcccagcaccacgaggccctg gtgggccacggcttcacccacgcccacatcgtggccctgagccagcaccccgccgccctgggcaccgtggccgt gaagtaccaggacatgatcgccgccctgcccgaggccacccacgaggccatcgtgggcgtgggcaagcagtgg agcggcgccagagccctggaggccctgctgaccgtggccggcgagctgagaggcccccccctccagctggaca ccggccagctgctgaagatcgccaagagaggcggcgtgaccgccgtggaggccgtgcacgcctggagaaacg ccctgaccggcgcccccctgaatttgacacccgaccaagttgtggccattgccagcaacggtggagggaaacaag cattggagactgtccaacggctccttcccgtgttgtgtcaagcccacggtttgacccctgcacaagtggtcgccatcg cctccaacaacggcggtaagcaggccctggaaacagtgcaacggttgctccctgtcttgtgtcaagatcatggactg accccagaccaggtggtcgcaatcgcctctaacaacgggggaaagcaagccctggaaaccgtgcaaaggttgttg cccgtcctttgtcaagaccacggccttacacccgagcaagtcgtggccattgcatcaaacaacggtggcaaacagg ctcttgagactgttcagagacttctcccagttctctgccaggcacacgggcttactcccgatcaagttgtggccattgc cagcaacatcggagggaaacaagcattggagactgtccaacggctccttcccgtgttgtgtcaagcccacggtttga cccctgcacaagtggtcgccatcgcctccaacggtggcggtaagcaggccctggaaacagtgcagcgcctgctgc ctgtgctgtgccaggatcatggactgaccccagaccaggttgtcgccatcgcctctcatgacgggggaaagcaagccctggaaaccgtgcaaaggttgttgcccgtcctttgtcaagaccacggccttacacccgagcaagtcgtggccattgcatcaaacaacggtggcaaacaggctcttgagactgttcagagacttctcccagttctctgtcaagcccacggtttga cacccgaccaagttgtggccattgccagcaacggtggagggaaacaagcattggagactgtccaacggctccttcc cgtgttgtgtcaagcccacggtttgacccctgcacaagtggtcgccatcgcctccaacatcggcggtaagcaggcc ctggaaacagtgcaacggttgctccctgtcttgtgtcaagatcatggactgaccccagaccaggtggtcgcaatcgc ctctaacaacgggggaaagcaagccctggaaaccgtgcaaaggttgttgcccgtcctttgtcaagaccacggcctt acacccgagcaagtcgtggccattgcatcaaacaacggtggcaaacaggctcttgagactgttcagagacttctccc agttctctgccaggcacacgggcttactcccgatcaagttgtggccattgccagccatgacggagggaaacaagca ttggagactgtccaacggctccttcccgtgttgtgtcaagcccacggtttgacccctgcacaagtggtcgccatcgcc tccaacggtggcggtaagcaggccctggaaacagtgcagcgcctgctgcctgtgctgtgccaggatcatggactg 
accccagaccaggttgtcgccatcgcctctcatgacgggggaaagcaagccctggaaaccgtgcaaaggttgttg cccgtcctttgtcaagaccacggccttacacccgagcaagtcgtggccattgcatcaaacaacggtggcaaacagg ctcttgagactgttcagagacttctcccagttctctgtcaagcccacggtttgacacccgaccaagttgtggccattgc cagcaacatcggagggaaacaagcattggagactgtccaacggctccttcccgtgttgtgtcaagcccacggtttga cccctgcacaagtggtcgccatcgcctccaacatcggcggtaagcaggccctggaaacagtgcaacggttgctcc ctgtcttgtgtcaagaccatgggctgacccccgagcaggtggtggccatcgccagcaacggtggcggcagacccg ccctggagagcatcgtggcccagctgagcagacccgaccccgccctggccgccctgaccaacgaccacctggtg gccctggcctgcctgggcggcagacccgccctggacgccgtgaagaagggcctgccccacgcccccgccctga tcaagagaaccaacagaagaatccccgagagaaccagccacagagtggccggcagccagctggtgaagagcg agctggaggagaagaagagcgagctgagacacaagctgaagtacgtgccccacgagtacatcgagctgatcgag atcgccagaaacagcacccaggacagaatcctggagatgaaggtgatggagttcttcatgaaggtgtacggctaca gaggcaagcacctgggcggcagcagaaagcccgacggcgccatctacaccgtgggcagccccatcgactacgg cgtgatcgtggacaccaaggcctacagcggcggctacaacctgcccatcggccaggccgacgagatgcagagat acgtggaggagaaccagaccagaaacaagcacatcaaccccaacgagtggtggaaggtgtaccccagcagcgt gaccgagttcaagttcctgttcgtgagcggccacttcaagggcaactacaaggcccagctgaccagactgaaccac atcaccaactgcaacggcgccgtgctgagcgtggaggagctgctgatcggcggcgagatgatcaaggccggcac cctgaccctggaggaggtgagaagaaagttcaacaacggcgagatcaacttcagaagctctagatga
PCR primer 1 for amplifying the TALEN-targeted region in exon 7 of the GS gene is shown in SEQ ID NO: 24:
ttgtacccgttggagaagtgacag
PCR primer 2 for amplifying the TALEN-targeted region in exon 7 of the GS gene is shown in SEQ ID NO: 25:
gatgaactaggaaaggctcaagatcac
The DNA sequence encoding the TALEN left-arm protein constructed for knocking out the ST3GAL4 gene is shown in SEQ ID NO: 26:
atggactataaggaccacgacggagactacaaggatcatgatattgattacaaagacgatgacgataagatggccc caaagaagaagcggaaggtcggtatccacggagtcccagcagccgtagatttgagaactttgggatattcacagca gcagcaggaaaagatcaagcccaaagtgaggtcgacagtcgcgcagcatcacgaagcgctggtgggtcatgggt ttacacatgcccacatcgtagccttgtcgcagcaccctgcagcccttggcacggtcgccgtcaagtaccaggacatg attgcggcgttgccggaagccacacatgaggcgatcgtcggtgtggggaaacagtggagcggagcccgagcgct tgaggccctgttgacggtcgcgggagagctgagagggcctccccttcagctggacacgggccagttgctgaagat cgcgaagcggggaggagtcacggcggtcgaggcggtgcacgcgtggcgcaatgcgctcacgggagcacccct caacctgaccccagagcaggtcgtggcaattgcgagcaacggcgggggaaagcaggcactcgaaaccgtccag aggttgctgcctgtgctgtgccaagcgcacggacttacgccagagcaggtcgtggcaattgcgagcaacatcggg ggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggactaaccccagagca ggtcgtggcaattgcgagcaacatcgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtg ccaagcgcacgggttgaccccagagcaggtcgtggcaattgcgagcaacaacgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggcctgaccccagagcaggtcgtggcaattgcgagccacgacgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggactgac accagagcaggtcgtggcaattgcgagcaacggagggggaaagcaggcactcgaaaccgtccagaggttgctg cctgtgctgtgccaagcgcacggacttacacccgaacaagtcgtggcaattgcgagcaacaacgggggaaagca ggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggacttacgccagagcaggtcgtggc aattgcgagcaacaacgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgc acggactaaccccagagcaggtcgtggcaattgcgagcaacatcgggggaaagcaggcactcgaaaccgtcca gaggttgctgcctgtgctgtgccaagcgcacgggttgaccccagagcaggtcgtggcaattgcgagcaacaacgg gggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggcctgaccccagagc aggtcgtggcaattgcgagcaacaacgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgt gccaagcgcacggactgacaccagagcaggtcgtggcaattgcgagcaacatcgggggaaagcaggcactcga aaccgtccagaggttgctgcctgtgctgtgccaagcgcacggcctcaccccagagcaggtcgtggcaattgcgag caacatcgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggacttac gccagagcaggtcgtggcaattgcgagcaacaacgggggaaagcaggcactcgaaaccgtccagaggttgctg 
cctgtgctgtgccaagcgcacggactaaccccagagcaggtcgtggcaattgcgagcaacatcgggggaaagca ggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacgggttgaccccagagcaggtcgtggc aattgcgagcaacggcgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgc acggcctgaccccagagcaggtcgtggcaattgcgagcaacaacgggggaaagcaggcactcgaaaccgtcca gaggttgctgcctgtgctgtgccaagcgcacggactgacaccagagcaggtcgtggcaattgcgagcaacaacgg gggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggactcacgcctgagc aggtagtggctattgcatccaacaacgggggcagacccgcactggagtcaatcgtggcccagctttcgaggccgg accccgcgctggccgcactcactaatgatcatcttgtagcgctggcctgcctcggcggacgacccgccttggatgc ggtgaagaaggggctcccgcacgcgcctgcattgattaagcggaccaacagaaggattcccgagaggacatcac atcgagtggcaggttcccaactcgtgaagagtgaacttgaggagaaaaagtcggagctgcggcacaaattgaaata cgtaccgcatgaatacatcgaacttatcgaaattgctaggaactcgactcaagacagaatccttgagatgaaggtaat ggagttctttatgaaggtttatggataccgagggaagcatctcggtggatcacgaaaacccgacggagcaatctata cggtggggagcccgattgattacggagtgatcgtcgacacgaaagcctacagcggtgggtacaatcttcccatcgg gcaggcagatgagatgcaacgttatgtcgaagaaaatcagaccaggaacaaacacatcaatccaaatgagtggtg gaaagtgtatccttcatcagtgaccgagtttaagtttttgtttgtctctgggcatttcaaaggcaactataaggcccagct cacacggttgaatcacattacgaactgcaatggtgcggttttgtccgtagaggaactgctcattggtggagaaatgat caaagcgggaactctgacactggaagaagtcagacgcaagtttaacaatggcgagatcaatttccgctcataa
The DNA sequence encoding the TALEN right-arm protein constructed for knocking out the ST3GAL4 gene is shown in SEQ ID NO: 27:
atggactataaggaccacgacggagactacaaggatcatgatattgattacaaagacgatgacgataagatggccc caaagaagaagcggaaggtcggtatccacggagtcccagcagccgtagatttgagaactttgggatattcacagca gcagcaggaaaagatcaagcccaaagtgaggtcgacagtcgcgcagcatcacgaagcgctggtgggtcatgggt ttacacatgcccacatcgtagccttgtcgcagcaccctgcagcccttggcacggtcgccgtcaagtaccaggacatg attgcggcgttgccggaagccacacatgaggcgatcgtcggtgtggggaaacagtggagcggagcccgagcgct tgaggccctgttgacggtcgcgggagagctgagagggcctccccttcagctggacacgggccagttgctgaagat cgcgaagcggggaggagtcacggcggtcgaggcggtgcacgcgtggcgcaatgcgctcacgggagcacccct caacctgaccccagagcaggtcgtggcaattgcgagcaacaacgggggaaagcaggcactcgaaaccgtccag aggttgctgcctgtgctgtgccaagcgcacggacttacgccagagcaggtcgtggcaattgcgagcaacggcggg ggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggactaaccccagagca ggtcgtggcaattgcgagcaacatcgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtg ccaagcgcacgggttgaccccagagcaggtcgtggcaattgcgagcaacggcgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggcctgaccccagagcaggtcgtggcaattgcgagc aacatcgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggactgac accagagcaggtcgtggcaattgcgagcaacaacgggggaaagcaggcactcgaaaccgtccagaggttgctgc ctgtgctgtgccaagcgcacggacttacacccgaacaagtcgtggcaattgcgagccacgacgggggaaagcag gcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggacttacgccagagcaggtcgtggca attgcgagccacgacgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgca cggactaaccccagagcaggtcgtggcaattgcgagcaacatcgggggaaagcaggcactcgaaaccgtccag aggttgctgcctgtgctgtgccaagcgcacgggttgaccccagagcaggtcgtggcaattgcgagcaacggcggg ggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggcctgaccccagagca ggtcgtggcaattgcgagcaacggcgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtg ccaagcgcacggactgacaccagagcaggtcgtggcaattgcgagccacgacgggggaaagcaggcactcga aaccgtccagaggttgctgcctgtgctgtgccaagcgcacggcctcaccccagagcaggtcgtggcaattgcgag caacggcgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggactta cgccagagcaggtcgtggcaattgcgagcaacatcgggggaaagcaggcactcgaaaccgtccagaggttgctg 
cctgtgctgtgccaagcgcacggactaaccccagagcaggtcgtggcaattgcgagcaacaacgggggaaagca ggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacgggttgaccccagagcaggtcgtggc aattgcgagcaacaacgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgc acggcctgaccccagagcaggtcgtggcaattgcgagcaacaacgggggaaagcaggcactcgaaaccgtcca gaggttgctgcctgtgctgtgccaagcgcacggactgacaccagagcaggtcgtggcaattgcgagcaacatcgg gggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggactcacgcctgagc aggtagtggctattgcatccaacatcgggggcagacccgcactggagtcaatcgtggcccagctttcgaggccgg accccgcgctggccgcactcactaatgatcatcttgtagcgctggcctgcctcggcggacgacccgccttggatgc ggtgaagaaggggctcccgcacgcgcctgcattgattaagcggaccaacagaaggattcccgagaggacatcac atcgagtggcaggttcccaactcgtgaagagtgaacttgaggagaaaaagtcggagctgcggcacaaattgaaata cgtaccgcatgaatacatcgaacttatcgaaattgctaggaactcgactcaagacagaatccttgagatgaaggtaat ggagttctttatgaaggtttatggataccgagggaagcatctcggtggatcacgaaaacccgacggagcaatctata cggtggggagcccgattgattacggagtgatcgtcgacacgaaagcctacagcggtgggtacaatcttcccatcgg gcaggcagatgagatgcaacgttatgtcgaagaaaatcagaccaggaacaaacacatcaatccaaatgagtggtg gaaagtgtatccttcatcagtgaccgagtttaagtttttgtttgtctctgggcatttcaaaggcaactataaggcccagct cacacggttgaatcacattacgaactgcaatggtgcggttttgtccgtagaggaactgctcattggtggagaaatgat caaagcgggaactctgacactggaagaagtcagacgcaagtttaacaatggcgagatcaatttccgctcataa
PCR primer 1 for amplifying the TALEN-targeted region in exon 5 of the ST3GAL4 gene is shown in SEQ ID NO: 28:
ctaaaggctgctcccactctac
PCR primer 2 for amplifying the TALEN-targeted region in exon 5 of the ST3GAL4 gene is shown in SEQ ID NO: 29:
caaagtggaacttgggttgagg
The DNA sequence encoding the TALEN left-arm protein constructed for knocking out the ST3GAL6 gene is shown in SEQ ID NO: 30:
atggactataaggaccacgacggagactacaaggatcatgatattgattacaaagacgatgacgataagatggccc caaagaagaagcggaaggtcggtatccacggagtcccagcagccgtagatttgagaactttgggatattcacagca gcagcaggaaaagatcaagcccaaagtgaggtcgacagtcgcgcagcatcacgaagcgctggtgggtcatgggt ttacacatgcccacatcgtagccttgtcgcagcaccctgcagcccttggcacggtcgccgtcaagtaccaggacatg attgcggcgttgccggaagccacacatgaggcgatcgtcggtgtggggaaacagtggagcggagcccgagcgct tgaggccctgttgacggtcgcgggagagctgagagggcctccccttcagctggacacgggccagttgctgaagat cgcgaagcggggaggagtcacggcggtcgaggcggtgcacgcgtggcgcaatgcgctcacgggagcacccct caacctgaccccagagcaggtcgtggcaattgcgagcaacaacgggggaaagcaggcactcgaaaccgtccag aggttgctgcctgtgctgtgccaagcgcacggacttacgccagagcaggtcgtggcaattgcgagcaacaacggg ggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggactaaccccagagca ggtcgtggcaattgcgagcaacggcgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtg ccaagcgcacgggttgaccccagagcaggtcgtggcaattgcgagcaacaacgggggaaagcaggcactcgaa accgtccagaggttgctgcctgtgctgtgccaagcgcacggcctgaccccagagcaggtcgtggcaattgcgagcaacaacgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggactgac accagagcaggtcgtggcaattgcgagccacgacgggggaaagcaggcactcgaaaccgtccagaggttgctg cctgtgctgtgccaagcgcacggacttacacccgaacaagtcgtggcaattgcgagccacgacgggggaaagca ggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggacttacgccagagcaggtcgtggc aattgcgagcaacggcgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgc acggactaaccccagagcaggtcgtggcaattgcgagcaacggcgggggaaagcaggcactcgaaaccgtcca gaggttgctgcctgtgctgtgccaagcgcacgggttgaccccagagcaggtcgtggcaattgcgagcaacatcgg gggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggcctgaccccagagc aggtcgtggcaattgcgagcaacaacgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgt gccaagcgcacggactgacaccagagcaggtcgtggcaattgcgagcaacggcgggggaaagcaggcactcg aaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggcctcaccccagagcaggtcgtggcaattgcga gccacgacgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggactt acgccagagcaggtcgtggcaattgcgagcaacggcgggggaaagcaggcactcgaaaccgtccagaggttgc 
tgcctgtgctgtgccaagcgcacggactaaccccagagcaggtcgtggcaattgcgagcaacggcgggggaaag caggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacgggttgaccccagagcaggtcgtg gcaattgcgagcaacaacgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagc gcacggcctgaccccagagcaggtcgtggcaattgcgagcaacggcgggggaaagcaggcactcgaaaccgtc cagaggttgctgcctgtgctgtgccaagcgcacggactgacaccagagcaggtcgtggcaattgcgagcaacaac gggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggactcacgcctga gcaggtagtggctattgcatccaacggcgggggcagacccgcactggagtcaatcgtggcccagctttcgaggcc ggaccccgcgctggccgcactcactaatgatcatcttgtagcgctggcctgcctcggcggacgacccgccttggat gcggtgaagaaggggctcccgcacgcgcctgcattgattaagcggaccaacagaaggattcccgagaggacatc acatcgagtggcaggttcccaactcgtgaagagtgaacttgaggagaaaaagtcggagctgcggcacaaattgaa atacgtaccgcatgaatacatcgaacttatcgaaattgctaggaactcgactcaagacagaatccttgagatgaaggt aatggagttctttatgaaggtttatggataccgagggaagcatctcggtggatcacgaaaacccgacggagcaatct atacggtggggagcccgattgattacggagtgatcgtcgacacgaaagcctacagcggtgggtacaatcttcccat cgggcaggcagatgagatgcaacgttatgtcgaagaaaatcagaccaggaacaaacacatcaatccaaatgagtg gtggaaagtgtatccttcatcagtgaccgagtttaagtttttgtttgtctctgggcatttcaaaggcaactataaggccca gctcacacggttgaatcacattacgaactgcaatggtgcggttttgtccgtagaggaactgctcattggtggagaaat gatcaaagcgggaactctgacactggaagaagtcagacgcaagtttaacaatggcgagatcaatttccgctcataa
The DNA sequence encoding the TALEN right-arm protein constructed for knocking out the ST3GAL6 gene is shown in SEQ ID NO: 31:
atggactataaggaccacgacggagactacaaggatcatgatattgattacaaagacgatgacgataagatggccc caaagaagaagcggaaggtcggtatccacggagtcccagcagccgtagatttgagaactttgggatattcacagca gcagcaggaaaagatcaagcccaaagtgaggtcgacagtcgcgcagcatcacgaagcgctggtgggtcatgggt ttacacatgcccacatcgtagccttgtcgcagcaccctgcagcccttggcacggtcgccgtcaagtaccaggacatg attgcggcgttgccggaagccacacatgaggcgatcgtcggtgtggggaaacagtggagcggagcccgagcgct tgaggccctgttgacggtcgcgggagagctgagagggcctccccttcagctggacacgggccagttgctgaagat cgcgaagcggggaggagtcacggcggtcgaggcggtgcacgcgtggcgcaatgcgctcacgggagcacccct caacctgaccccagagcaggtcgtggcaattgcgagccacgacgggggaaagcaggcactcgaaaccgtccag aggttgctgcctgtgctgtgccaagcgcacggacttacgccagagcaggtcgtggcaattgcgagcaacatcggg ggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggactaaccccagagca ggtcgtggcaattgcgagcaacaacgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtg ccaagcgcacgggttgaccccagagcaggtcgtggcaattgcgagccacgacgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggcctgaccccagagcaggtcgtggcaattgcgagcaacatcgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggactgac accagagcaggtcgtggcaattgcgagcaacatcgggggaaagcaggcactcgaaaccgtccagaggttgctgc ctgtgctgtgccaagcgcacggacttacacccgaacaagtcgtggcaattgcgagcaacatcgggggaaagcag gcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggacttacgccagagcaggtcgtggca attgcgagccacgacgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgca cggactaaccccagagcaggtcgtggcaattgcgagcaacatcgggggaaagcaggcactcgaaaccgtccag aggttgctgcctgtgctgtgccaagcgcacgggttgaccccagagcaggtcgtggcaattgcgagcaacaacggg ggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggcctgaccccagagca ggtcgtggcaattgcgagcaacggcgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtg ccaagcgcacggactgacaccagagcaggtcgtggcaattgcgagcaacaacgggggaaagcaggcactcgaa accgtccagaggttgctgcctgtgctgtgccaagcgcacggcctcaccccagagcaggtcgtggcaattgcgagc aacggcgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggacttac gccagagcaggtcgtggcaattgcgagcaacatcgggggaaagcaggcactcgaaaccgtccagaggttgctgc 
ctgtgctgtgccaagcgcacggactaaccccagagcaggtcgtggcaattgcgagcaacaacgggggaaagcag gcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacgggttgaccccagagcaggtcgtggca attgcgagcaacggcgggggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgca cggcctgaccccagagcaggtcgtggcaattgcgagcaacatcgggggaaagcaggcactcgaaaccgtccag aggttgctgcctgtgctgtgccaagcgcacggactgacaccagagcaggtcgtggcaattgcgagcaacaacggg ggaaagcaggcactcgaaaccgtccagaggttgctgcctgtgctgtgccaagcgcacggactcacgcctgagca ggtagtggctattgcatccaacatcgggggcagacccgcactggagtcaatcgtggcccagctttcgaggccgga ccccgcgctggccgcactcactaatgatcatcttgtagcgctggcctgcctcggcggacgacccgccttggatgcg gtgaagaaggggctcccgcacgcgcctgcattgattaagcggaccaacagaaggattcccgagaggacatcacat cgagtggcaggttcccaactcgtgaagagtgaacttgaggagaaaaagtcggagctgcggcacaaattgaaatac gtaccgcatgaatacatcgaacttatcgaaattgctaggaactcgactcaagacagaatccttgagatgaaggtaatg gagttctttatgaaggtttatggataccgagggaagcatctcggtggatcacgaaaacccgacggagcaatctatac ggtggggagcccgattgattacggagtgatcgtcgacacgaaagcctacagcggtgggtacaatcttcccatcggg caggcagatgagatgcaacgttatgtcgaagaaaatcagaccaggaacaaacacatcaatccaaatgagtggtgg aaagtgtatccttcatcagtgaccgagtttaagtttttgtttgtctctgggcatttcaaaggcaactataaggcccagctc acacggttgaatcacattacgaactgcaatggtgcggttttgtccgtagaggaactgctcattggtggagaaatgatcaaagcgggaactctgacactggaagaagtcagacgcaagtttaacaatggcgagatcaatttccgctcataa
PCR primer 1 for amplifying the TALEN-targeted region in exon 2 of the ST3GAL6 gene is shown in SEQ ID NO: 32:
gggtgtagagatagattctcc
PCR primer 2 for amplifying the TALEN-targeted region in exon 2 of the ST3GAL6 gene is shown in SEQ ID NO: 33:
ggccagccatcactagtattc
The DNA sequence of the hERO-L gene is shown in SEQ ID NO: 38:
atgggccgcggctggggattcttgtttggcctcctgggcgccgtgtggctgctcagctcgggccacggaga ggagcagcccccggagacagcggcacagaggtgcttctgccaggttagtggttacttggatgattgtacctgtgatg ttgaaaccattgatagatttaataactacaggcttttcccaagactacaaaaacttcttgaaagtgactactttaggtatta caaggtaaacctgaagaggccgtgtcctttctggaatgacatcagccagtgtggaagaagggactgtgctgtcaaa ccatgtcaatctgatgaagttcctgatggaattaaatctgcgagctacaagtattctgaagaagccaataatctcattga agaatgtgaacaagctgaacgacttggagcagtggatgaatctctgagtgaggaaacacagaaggctgttcttcagt ggaccaagcatgatgattcttcagataacttctgtgaagctgatgacattcagtcccctgaagctgaatatgtagatttg cttcttaatcctgagcgctacactggttacaagggaccagatgcttggaaaatatggaatgtcatctacgaagaaaact gttttaagccacagacaattaaaagacctttaaatcctttggcttctggtcaagggacaagtgaagagaacactttttac agttggctagaaggtctctgtgtagaaaaaagagcattctacagacttatatctggcctacatgcaagcattaatgtgc atttgagtgcaagatatcttttacaagagacctggttagaaaagaaatggggacacaacattacagaatttcaacagc gatttgatggaattttgactgaaggagaaggtccaagaaggcttaagaacttgtattttctctacttaatagaactaagg gctttatccaaagtgttaccattcttcgagcgcccagattttcaactctttactggaaataaaattcaggatgaggaaaac aaaatgttacttctggaaatacttcatgaaatcaagtcatttcctttgcattttgatgagaattcattttttgctggggataaa aaagaagcacacaaactaaaggaggactttcgactgcattttagaaatatttcaagaattatggattgtgttggttgtttt aaatgtcgtctgtggggaaagcttcagactcagggtttgggcactgctctgaagatcttattttctgagaaattgatagc aaatatgccagaaagtggacctagttatgaattccatctaaccagacaagaaatagtatcattattcaacgcatttggaagaatttctacaagtgtgaaagaattagaaaacttcaggaacttgttacagaatattcattga
The DNA sequence of the hFGF9 gene is shown as SEQ ID NO. 39:
Atgaccagcaagctcgccgtggctctgctggctgccttcctgatcagcgccgccctctgcgagggcttagg tgaagttgggaactatttcggtgtgcaggatgcggtaccgtttgggaatgtgcccgtgttgccggtggacagcccgg ttttgttaagtgaccacctgggtcagtccgaagcaggggggctccccaggggacccgcagtcacggacttggatca tttaaaggggattctcaggcggaggcagctatactgcaggactggatttcacttagaaatcttccccaatggtactatc cagggaaccaggaaagaccacagccgatttggcattctggaatttatcagtatagcagtgggcctggtcagcattcg aggcgtggacagtggactctacctcgggatgaatgagaagggggagctgtatggatcagaaaaactaacccaaga gtgtgtattcagagaacagttcgaagaaaactggtataatacgtactcatcaaacctatataagcacgtggacactgg aaggcgatactatgttgcattaaataaagatgggaccccgagagaagggactaggactaaacggcaccagaaattc acacattttttacctagaccagtggaccccgacaaagtacctgaactgtataaggatattctaagccaaagttga
The DNA sequence of the GT-P2A-ST6 gene is shown in SEQ ID NO: 40:
atgaggcttcgggagccgctcctgagcggcagcgccgcgatgccaggcgcgtccctacagcgggcctgc cgcctgctcgtggccgtctgcgctctgcaccttggcgtcaccctcgtttactacctggctggccgcgacctgagccgcctgccccaactggtcggagtctccacaccgctgcagggcggctcgaacagtgccgccgccatcgggcagtcctccggggagctccggaccggaggggcccggccgccgcctcctctaggcgcctcctcccagccgcgcccgggtgg cgactccagcccagtcgtggattctggccctggccccgctagcaacttgacctcggtcccagtgccccacaccacc gcactgtcgctgcccgcctgccctgaggagtccccgctgcttgtgggccccatgctgattgagtttaacatgcctgtggacctggagctcgtggcaaagcagaacccaaatgtgaagatgggcggccgctatgcccccagggactgcgtctctcctcacaaggtggccatcatcattccattccgcaaccggcaggagcacctcaagtactggctatattatttgcaccca gtcctgcagcgccagcagctggactatggcatctatgttatcaaccaggcgggagacactatattcaatcgtgctaag ctcctcaatgttggctttcaagaagccttgaaggactatgactacacctgctttgtgtttagtgacgtggacctcattcca atgaatgaccataatgcgtacaggtgtttttcacagccacggcacatttccgttgcaatggataagtttggattcagcct accttatgttcagtattttggaggtgtctctgctctaagtaaacaacagtttctaaccatcaatggatttcctaataattattg gggctggggaggagaagatgatgacatttttaacagattagtttttagaggcatgtctatatctcgcccaaatgctgtg gtcgggaggtgtcgcatgatccgccactcaagagacaagaaaaatgaacccaatcctcagaggtttgaccgaattg cacacacaaaggagacaatgctctctgatggtttgaactcactcacctaccaggtgctggatgtacagagatacccat tgtatacccaaatcacagtggacatcgggacaccgagctcgagcggcagcggagccaccaacttcagcctgctga agcaggccggcgatgtggaggagaatcctggccccatgattcacaccaacctgaagaaaaagttcagctgctgcg tcctggtctttcttctgtttgcagtcatctgtgtgtggaaggaaaagaagaaagggagttactatgattcctttaaattgca aaccaaggaattccaggtgttaaagagtctggggaaattggccatggggtctgattcccagtctgtatcctcaagcag cacccaggacccccacaggggccgccagaccctcggcagtctcagaggcctagccaaggccaaaccagaggc ctccttccaggtgtggaacaaggacagctcttccaaaaaccttatccctaggctgcaaaagatctggaagaattacct aagcatgaacaagtacaaagtgtcctacaaggggccaggaccaggcatcaagttcagtgcagaggccctgcgctg ccacctccgggaccatgtgaatgtatccatggtagaggtcacagattttcccttcaatacctctgaatgggagggttat ctgcccaaggagagcattaggaccaaggctgggccttggggcaggtgtgctgttgtgtcgtcagcgggatctctga agtcctcccaactaggcagagaaatcgatgatcatgacgcagtcctgaggtttaatggggcacccacagccaacttc 
caacaagatgtgggcacaaaaactaccattcgcctgatgaactctcagttggttaccacagagaagcgcttcctcaa agacagtttgtacaatgaaggaatcctaattgtatgggacccatctgtataccactcagatatcccaaagtggtaccag aatccggattataatttctttaacaactacaagacttatcgtaagctgcaccccaatcagcccttttacatcctcaagccc cagatgccttgggagctatgggacattcttcaagaaatctccccagaagagattcagccaaaccccccatcctctgg gatgcttggtatcatcatcatgatgacgctgtgtgaccaggtggatatttatgagttcctcccatccaagcgcaagact gacgtgtgctactactaccagaagttcttcgatagtgcctgcacgatgggtgcctaccacccgctgctctatgagaag aatttggtgaagcatctcaaccagggcacagatgaggacatctacctgcttggaaaagccacactgcctggcttccg gaccattcactgctaa
Example 1: Construction of a CHO cell targeting vector and establishment of an anchored cell strain
To achieve continuous, cyclically alternating site-directed integration with no upper limit in the CHO cell genome, a high-expression hot spot must first be found in that genome. Through intensive research, the inventors found that the promoter of GAPDH, a constitutively and highly expressed housekeeping gene, meets the requirements of an open chromosome structure and constant high expression. By comparison, Bahr S. et al. identified a site in CHO cells similar to mouse ROSA26. ROSA26 is a known site with an open structure and constant high expression in the mouse genome, and is often used for inserting foreign genes to make transgenic mice. Although the ROSA26 gene is also present in CHO cells, its transcription at the mRNA level is lower than that of several other constitutively expressed housekeeping genes, and is only 1/300 that of GAPDH. GAPDH transcription is 4-5 times higher than that of the second-ranked gene, beta-actin. Therefore, we selected the nucleotide sequence from position 3574544 to position 3575484 of the GAPDH gene in CHO cells (SEQ ID NO: 8) for gene targeting.
Gene targeting, which is based on homologous recombination between chromosomal DNA and foreign DNA, has been widely used to prepare knock-out and knock-in mice. We cloned the left and right arms of GAPDH between the PacI and FseI cleavage sites of the targeting vector pDNL (purchased from AddGene). Between the left and right arms of GAPDH, the TK gene was included for negative selection and the PGK promoter-puromycin-T2A-d1EGFP expression cassette for positive selection. Further, LoxPwt and LoxP1 sequences were placed immediately upstream and downstream of the expression cassette, respectively. This targeting vector was designated CHO GAPDH-Zeo-TK HR v1.5 (FIG. 1; a partial nucleotide sequence is shown in SEQ ID NO: 10). All molecular cloning steps used the high-fidelity DNA polymerase Herculase II Fusion DNA Polymerase (Agilent) to join different fragments by overlapping PCR. All PCR primers were synthesized by Integrated DNA Technologies (IDT). Restriction enzymes were purchased from New England Biolabs (NEB).
CHO-K1 cells (ATCC, CCL-61) were electroporated with the targeting vector using a Neon electroporation system from LifeTech. Electroporation conditions were 1620 V, 10 ms, 3 pulses. 10 μg of targeting vector DNA and 2 million cells were used. After electroporation, the CHO cells were cultured in DMEM medium containing 5% FBS and, 48 hours later, selected with puromycin (10 μg/mL).
We used d1EGFP as the tracer protein, linked to the puromycin resistance gene through a self-cleaving sequence; the amino acid sequence of d1EGFP is shown in SEQ ID NO: 9. The self-cleaving sequence was selected from the 2A peptides (see "Research progress for construction of multigene expression vectors based on 2A peptide strategy", China Biotechnology, 2013, 33: 104-108). The self-cleaving 2A peptide is derived from foot-and-mouth disease virus (FMDV). 2A peptides and 2A-like peptides consist of 18-22 amino acids with a highly conserved C-terminal sequence: -DxExNPGP-. When the ribosome translates this sequence, it "skips" the formation of one peptide bond, so the polypeptide chain is separated without any protease. The separation occurs between the C-terminal G and P residues. The sequence is preferably selected from P2A (SEQ ID NO: 16: ATNFSLLKQAGDVEENPGP), T2A (SEQ ID NO: 34: EGRGSLLTCGDVEENPGP) or E2A (SEQ ID NO: 35: QCTNYALLKLAGDVESNPGP).
EGFP has a half-life in cells of about 24 hours and a strong fluorescence intensity. d1EGFP is a mutant of EGFP with a half-life of only 1 hour and a greatly reduced fluorescence intensity (sequence from Clontech). In the present invention, the rationale for using d1EGFP as the expression tracer protein is as follows: when the targeting vector is integrated at the GAPDH expression hot spot, even the weakly fluorescent d1EGFP gives strong fluorescence because of vigorous transcription, whereas vectors randomly inserted elsewhere in the chromosome give weak fluorescence.
FACS analysis of the puromycin-resistant clones showed that the vast majority were random insertions expressing a weak d1EGFP signal. Only about 0.6% were true targeted integration events at the GAPDH site, expressing a strong d1EGFP signal (FIG. 2). The inventors enriched these d1EGFP high-expressing cells by flow cytometry (MoFlo) and subcloned them. A resulting CHO cell strain that expressed a strong d1EGFP signal, had good morphology and grew fast was used as the anchored CHO cell strain (Anchor Cell Line), in which continuous, cyclically alternating site-directed genome integration with no upper limit can then be carried out.
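The conserved C-terminal motif and the G-P separation point described above can be checked directly against the three peptide sequences quoted in the text. The following is a minimal sketch; the function names and the regular expression are illustrative assumptions, not part of the patent.

```python
import re
from typing import Tuple

# 2A peptide sequences as quoted in the text (SEQ ID NO: 16, 34 and 35).
PEPTIDES_2A = {
    "P2A": "ATNFSLLKQAGDVEENPGP",
    "T2A": "EGRGSLLTCGDVEENPGP",
    "E2A": "QCTNYALLKLAGDVESNPGP",
}

# Conserved C-terminal motif -DxExNPGP-, where x stands for any residue.
MOTIF = re.compile(r"D.E.NPGP$")


def has_conserved_motif(peptide: str) -> bool:
    """True if the peptide ends in the DxExNPGP motif."""
    return MOTIF.search(peptide) is not None


def split_at_skip_site(peptide: str) -> Tuple[str, str]:
    """Split between the C-terminal G and P residues.

    Returns the part that stays on the upstream protein and the residue
    that becomes the first amino acid of the downstream protein.
    """
    if not has_conserved_motif(peptide):
        raise ValueError("peptide does not end in the DxExNPGP motif")
    return peptide[:-1], peptide[-1]


for name, seq in PEPTIDES_2A.items():
    upstream, downstream = split_at_skip_site(seq)
    print(name, len(seq), upstream, downstream)
```

All three sequences fall within the stated 18-22 residue range, and in each case the upstream product keeps the tail ending in NPG while the downstream product starts with proline.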
Example 2: Construction of "Toggle-In" vectors and method for stable co-expression of multiple genes
To solve the problem of stable multi-gene co-expression in CHO cell strain engineering, the inventors developed a Cre-LoxP mediated homologous recombination platform, "Toggle-In", based on the anchored CHO cell strain. The system is characterized in that continuous, cyclically alternating site-directed genome integration with no upper limit is achieved with two vectors (pTOG3 and pTOG4) used in sequence, and with only two drug resistance genes (hygromycin B and puromycin). Furthermore, the resulting CHO cell clones are isogenic, allowing 1:1 expression of the inserted genes.
pTOG3 (nucleotide sequence shown in SEQ ID NO: 11, carrying the DsRed-E2 tracer protein, whose nucleotide sequence is shown in SEQ ID NO: 36) and pTOG4 (nucleotide sequence shown in SEQ ID NO: 12, carrying the d1EGFP tracer protein, whose nucleotide sequence is shown in SEQ ID NO: 37) were constructed by the molecular cloning procedure described above, i.e., different fragments were joined by overlapping PCR with the high-fidelity DNA polymerase Herculase II Fusion DNA Polymerase (Agilent). All PCR primers were synthesized by Integrated DNA Technologies (IDT). Restriction enzymes were purchased from New England Biolabs (NEB).
pTOG3 carries the hygromycin resistance gene (hygromycin B) and foreign gene 1 (Gene-of-Interest, GOI-1), while pTOG4 carries the puromycin resistance gene (puromycin) and foreign gene 2 (GOI-2).
The sequences of wild-type LoxP and the LoxP mutants used in the pTOG3 and pTOG4 vectors are listed in Table 1. The LoxP sequence (34 nt) can be divided into a left arm (13 nt), a spacer region (8 nt) and a right arm (13 nt). The core spacer determines which LoxP sequences match, i.e., only LoxP sequences with the same spacer can undergo homologous recombination with each other (e.g., LoxP1 and LoxP2, or LoxP4 and LoxP5). Cre recombinase can tolerate some degree of mutation in one of the two arms (e.g., the right arm of LoxP1 compared with the right arm of LoxPwt), but cannot tolerate simultaneous mutation of both the left and right arms (e.g., LoxP3 or LoxP6 compared with LoxPwt). When recombination generates such a site, the newly formed LoxP3 or LoxP6 is inactive and no further Cre-mediated recombination can occur at it.
TABLE 1
LoxP sequence   Left arm (13 nt)   Spacer region (8 nt)   Right arm (13 nt)   Sequence number
LoxPwt          ATAACTTCGTATA      GCATACAT               TATACGAAGTTAT       SEQ ID NO:1
LoxP1           ATAACTTCGTATA      GTATAGTA               TATACGAACGGTA       SEQ ID NO:2
LoxP2           TACCGTTCGTATA      GTATAGTA               TATACGAAGTTAT       SEQ ID NO:3
LoxP3           TACCGTTCGTATA      GTATAGTA               TATACGAACGGTA       SEQ ID NO:4
LoxP4           ATAACTTCGTATA      GGCTATAG               TATACGAACGGTA       SEQ ID NO:5
LoxP5           TACCGTTCGTATA      GGCTATAG               TATACGAAGTTAT       SEQ ID NO:6
LoxP6           TACCGTTCGTATA      GGCTATAG               TATACGAACGGTA       SEQ ID NO:7
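The matching rules stated for Table 1 (same spacer required, at most one mutated arm tolerated) can be written down as a small model. This is a sketch under stated assumptions: the class and function names are hypothetical, and the arm-exchange rule in `recombine` (each product takes the left arm of one parent and the right arm of the other) is the usual textbook description of Cre recombination between half-mutant sites, which the text implies but does not spell out.

```python
from dataclasses import dataclass
from typing import Tuple

WT_LEFT = "ATAACTTCGTATA"   # left arm of LoxPwt
WT_RIGHT = "TATACGAAGTTAT"  # right arm of LoxPwt


@dataclass(frozen=True)
class LoxSite:
    left: str
    spacer: str
    right: str

    def active(self) -> bool:
        # One mutated arm is tolerated; two mutated arms inactivate the site.
        return self.left == WT_LEFT or self.right == WT_RIGHT


# Sequences copied from Table 1.
TABLE_1 = {
    "LoxPwt": LoxSite("ATAACTTCGTATA", "GCATACAT", "TATACGAAGTTAT"),
    "LoxP1": LoxSite("ATAACTTCGTATA", "GTATAGTA", "TATACGAACGGTA"),
    "LoxP2": LoxSite("TACCGTTCGTATA", "GTATAGTA", "TATACGAAGTTAT"),
    "LoxP3": LoxSite("TACCGTTCGTATA", "GTATAGTA", "TATACGAACGGTA"),
    "LoxP4": LoxSite("ATAACTTCGTATA", "GGCTATAG", "TATACGAACGGTA"),
    "LoxP5": LoxSite("TACCGTTCGTATA", "GGCTATAG", "TATACGAAGTTAT"),
    "LoxP6": LoxSite("TACCGTTCGTATA", "GGCTATAG", "TATACGAACGGTA"),
}


def can_recombine(a: LoxSite, b: LoxSite) -> bool:
    """Both sites must be active and share the same 8-nt spacer."""
    return a.spacer == b.spacer and a.active() and b.active()


def recombine(a: LoxSite, b: LoxSite) -> Tuple[LoxSite, LoxSite]:
    """Assumed arm-exchange model: products are (a.left, b.right) and (b.left, a.right)."""
    if not can_recombine(a, b):
        raise ValueError("sites cannot recombine")
    return LoxSite(a.left, a.spacer, b.right), LoxSite(b.left, a.spacer, a.right)


products = recombine(TABLE_1["LoxP1"], TABLE_1["LoxP2"])
print(products[1] == TABLE_1["LoxP3"], products[1].active())
```

Under this model, LoxP1 and LoxP2 yield one product identical to LoxP3 and LoxP4 and LoxP5 yield one identical to LoxP6; both products carry two mutated arms and are therefore reported as inactive, in line with the description above.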
In the first step, pTOG3 was transfected into the anchored CHO cells prepared in Example 1. The specific method is as follows: anchored CHO cells were cultured in 6-well plates in DMEM containing 5% FBS. When the cells reached 80% confluence, 1.0 μg of pTOG3 carrying the foreign gene and 20 ng of the pOG231 plasmid encoding Cre protein (purchased from AddGene) were co-transfected with Fugene 6 reagent (Promega). Because Cre-LoxP homologous recombination is very efficient in this step, the amount of pOG231 must not be too high, otherwise the efficiency drops; 20 ng is the optimum dose determined by titration. 48 hours after transfection, the cells were trypsinized and replated on two 10 cm dishes with 800 μg/mL hygromycin B; after 7-10 days of selection, hundreds of clones had formed.
As an example, the inventors cloned the gene of the DsRed-E2 tracer protein (SEQ ID NO: 36) between the MluI and XhoI cleavage sites of pTOG3, introduced the pTOG3 vector into the GAPDH expression hot spot of the anchored CHO cells, and selected clones with hygromycin. Fluorescence microscopy showed that all drug-resistant clones expressed high levels of red fluorescent protein (FIG. 5A), demonstrating the high efficiency of this site-directed genome integration technique.
The purpose of the drug resistance test (Drug test) is to identify true gene targeting events as opposed to random integration events. When the hygromycin B selected clones had grown to around 1000 cells, the plates were washed three times with 2 mL PBS to remove any floating cells. Well-formed clones were then picked into a 48-well plate with a sterile tip and cultured in DMEM containing 5% FBS. When the cells were 80% confluent in the 48-well plate, they were trypsinized and each clone was plated in two duplicate wells, one of which was tested with puromycin (10 μg/mL) while the other was kept as the stock culture. Whether the cells are sensitive to puromycin can generally be seen after 1-2 days.
According to the principle of the "Toggle-In" platform system, homologous recombination occurs between the pTOG3 vector carrying GOI-1 and the GAPDH site at the LoxPwt:LoxPwt and LoxP1:LoxP2 sequence pairs; the LoxPwt sequence is retained and an inactivated LoxP3 sequence is generated. GOI-1 is thus integrated into the chromosomal high-expression site, and the correct clones carrying the LoxP4 sequence are selected with hygromycin (FIG. 3). In the correct clones, which have truly undergone homologous recombination, the puromycin resistance gene is replaced by the hygromycin resistance gene, so the cells should be puromycin sensitive. If a clone is not sensitive, it is a random integration event. Although some double-resistant clones may also express GOI-1, the next "Toggle-In" operation cannot be performed on them.
In the second step, the above puromycin-sensitive clones were cultured in 6-well plates in DMEM containing 5% FBS. When the cells in the wells reached 80% confluence, 1.0 μg of pTOG4 carrying the foreign gene GOI-2 and 20 ng of the pOG231 plasmid were co-transfected with Fugene 6 reagent, and the correct clones carrying the LoxP1 sequence were selected with puromycin (FIG. 4). In this process, homologous recombination occurs between the pTOG4 vector and the new integration site described above at the LoxPwt:LoxPwt and LoxP4:LoxP5 sequence pairs; the LoxPwt sequence is retained and an inactivated LoxP6 sequence is generated, so that GOI-2 is integrated into the same chromosomal high-expression site. 48 hours after transfection, the cells were trypsinized and replated on two 10 cm dishes with 10 μg/mL puromycin; after 7-10 days of selection, 10-20 clones were observed. Because of the LoxP sequences involved, the recombination efficiency of this step is significantly lower than that of the pTOG3 step.
Drug resistance test (Drug test): when the puromycin-selected clones had grown to around 1000 cells, the plates were washed three times with 2 mL PBS to remove any floating cells. Well-formed clones were then picked into a 48-well plate with a sterile tip and cultured in DMEM containing 5% FBS. When the cells were 80% confluent in the 48-well plate, they were trypsinized and each clone was plated in two duplicate wells, one of which was tested with hygromycin (800 μg/mL) while the other was kept as the stock culture. Whether the cells are sensitive to hygromycin can generally be seen after 7 days.
After these two rounds of homologous recombination and integration, the chromosome is restored to the original state with LoxPwt and LoxP1 sequences, and the next round of integration can be carried out. Each round of integration is selected with hygromycin or puromycin alone, alternating between the two antibiotics. Using this method, the inventors transferred genes such as SRP-14-9-54 (nucleotide sequence shown in SEQ ID NO: 17), hERO-L (nucleotide sequence shown in SEQ ID NO: 38) and hFGF9 (nucleotide sequence shown in SEQ ID NO: 39) into the CHO cell high-expression site to form a high-expression cell strain that promotes antibody pairing, folding and secretion. This strain was named CHO-E, where E stands for Enhanced. RT-PCR showed genetic homogeneity (isogenicity) between individual CHO-E cell clones (FIG. 5B).
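The alternating cycle described above can be summarized as a two-state toggle. The sketch below is a simplified, hypothetical model and not the inventors' implementation: the `Locus` class, the `VECTORS` table and the assumption that genes from earlier rounds stay at the locus (as the construction of CHO-E suggests) are illustrative choices.

```python
from dataclasses import dataclass, field
from typing import List, Set

# Which downstream LoxP site each vector needs at the locus, which one it
# leaves behind, and which resistance marker it brings in (per the text).
VECTORS = {
    "pTOG3": {"needs": "LoxP1", "leaves": "LoxP4", "marker": "hygromycin"},
    "pTOG4": {"needs": "LoxP4", "leaves": "LoxP1", "marker": "puromycin"},
}


@dataclass
class Locus:
    # The anchored cell strain starts with puromycin resistance and LoxP1.
    marker: str = "puromycin"
    open_site: str = "LoxP1"
    gois: List[str] = field(default_factory=list)


def toggle_in(locus: Locus, vector: str, goi: str) -> Locus:
    """Return the locus after one correct Cre-mediated integration round."""
    spec = VECTORS[vector]
    if locus.open_site != spec["needs"]:
        raise ValueError(f"{vector} needs {spec['needs']}, locus has {locus.open_site}")
    # Assumption: genes integrated in earlier rounds are retained.
    return Locus(spec["marker"], spec["leaves"], locus.gois + [goi])


def passes_drug_test(resistant_to: Set[str], new_marker: str, old_marker: str) -> bool:
    """Correct clones gain the new resistance and lose the old one."""
    return new_marker in resistant_to and old_marker not in resistant_to


anchor = Locus()
round1 = toggle_in(anchor, "pTOG3", "GOI-1")
round2 = toggle_in(round1, "pTOG4", "GOI-2")
print(round1.marker, round1.open_site)
print(round2.marker, round2.open_site, round2.gois)
```

In this model a double-resistant clone fails `passes_drug_test`, which mirrors the statement that such clones are random integration events and cannot enter the next round.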
Example 3: Knocking out the Fut8 gene in CHO cells by TALEN technology
To knock out the Fut8 gene in CHO cells conveniently and rapidly, the inventors constructed a pair of TALEN vectors targeting exon 10 of the CHO cell Fut8 gene (the DNA sequence of the left arm of the TALEN protein is shown in SEQ ID NO: 18, and the DNA sequence of the right arm is shown in SEQ ID NO: 19), named FL #2 and FR #12, respectively. 5 μg of each of the two miniprep plasmids was introduced by electroporation into 1 million CHO-K1 cells (ATCC, CCL-61) using a Neon electroporation system from LifeTech. Electroporation conditions were 1620 V, 10 ms, 3 pulses. After electroporation, the CHO cells were cultured in DMEM medium containing 5% FBS for 8 days without any drug selection. This growth phase allows clones with a disrupted Fut8 gene to divide sufficiently and, through turnover of the cell membrane, to dilute out the membrane glycoproteins that originally contained fucose, while newly produced transmembrane glycoproteins contain no fucose. Fut8-/- CHO cells were then negatively selected using MACS magnetic beads. The specific method is as follows:
The cells were collected by centrifugation (1200 rpm, 5 minutes) in a 15 mL centrifuge tube. The supernatant was discarded, and the cell pellet was suspended in 1 mL of MACS buffer (PBS plus 5 mM EDTA plus 1% BSA) and stained for 15 minutes at 4 °C with the biotin-labeled, fucose-specific lectin Lens culinaris agglutinin (LCA) at a 1:2500 dilution (final concentration 2 μg/mL). LCA-biotin is available from Vector Laboratories, Inc. (Burlingame, CA), USA. 10 mL of MACS buffer was added to the cell suspension, and the cells were collected by centrifugation (1200 rpm, 5 minutes) and resuspended in 1 mL of MACS buffer. The cells were then stained with PE-labeled streptavidin (final concentration 1 μg/mL) at 4 °C for 15 minutes in the dark. Streptavidin-PE was purchased from eBioscience, Inc., USA (San Diego, Calif.). 10 mL of MACS buffer was added to the cell suspension, and the cells were collected by centrifugation (1200 rpm, 5 minutes) and resuspended in 1 mL of MACS buffer. 70 μL of anti-PE MicroBeads was added and allowed to bind at 4 °C for 15 minutes. Anti-PE magnetic beads were purchased from Miltenyi Biotec Inc. (San Diego, Calif.). 10 mL of MACS buffer was added to the cell suspension, and the cells were collected by centrifugation (1200 rpm, 5 minutes), resuspended in 0.5 mL of MACS buffer, and loaded onto an MS column that had been rinsed with 5 mL of MACS buffer and placed on a magnet. After the cell suspension had entered the column, the column was washed with 10 mL of MACS buffer. The flow-through was collected in a 15 mL centrifuge tube and centrifuged (1200 rpm, 5 minutes) to collect the cells. This fraction of cells is not bound by LCA, i.e., it lacks fucose, and represents approximately 1% of the total number of cells. The cells were diluted to a suitable concentration and seeded in 10 cm culture dishes in DMEM medium containing 5% FBS. After 10-14 days, well-formed clones were picked with a sterile tip into a 48-well plate and cultured in DMEM containing 5% FBS.
When the cells were 80% confluent in the 48-well plate, they were trypsinized and each clone was plated in two duplicate wells, one of which was analyzed by FACS for the presence of LCA binding sites on the cell surface while the other was kept as the stock culture. The FACS profiles obtained are shown in FIG. 6.
After 18 LCA-staining negative clones had been obtained, the inventors went on to PCR-amplify the exon 10 region (using the primers of SEQ ID NO: 20-21) from genomic DNA of clone No. 1 and the other clones shown in FIG. 7A. Genomic DNA was extracted according to the instructions of the Qiagen kit. The PCR fragment was recovered using a PCR recovery kit from Qiagen and digested with MscI from NEB for 1 hour at 37 °C. The 2% agarose gel electrophoresis pattern (FIG. 7A) shows that the 715 bp PCR product of wild-type cells was completely cut into two fragments of 417 bp and 298 bp, corresponding to the position of the MscI cleavage site in the amplified fragment. In clones No. 1, 2 and 18, half of the DNA was completely cut into the 417 bp and 298 bp fragments and the other half was not cut. The PCR products of clones No. 3, 4 and 12 were not digested at all. This result shows that clones No. 1, 2 and 18 had the MscI cleavage site on one allele inactivated by TALEN, while clones No. 3, 4 and 12 had the MscI cleavage site on both alleles inactivated by TALEN.
To further establish that the Fut8 gene no longer expresses a functional protein, the inventors treated the PCR product of clone No. 1 with T4 DNA Polymerase, cloned it using the TA Cloning kit from LifeTech, and transformed bacteria. Five single colonies were picked at random, cultured, and their plasmids were extracted and sequenced. The sequencing results (FIG. 7B) indicated that both alleles of exon 10 of Fut8 in clone No. 1 had been inactivated: one carries nucleotide deletions and insertions near the MscI cleavage site, which not only destroy the MscI site but also cause premature termination of protein translation; the other retains the MscI site, but a nucleotide insertion also causes premature termination of protein translation. This result is consistent with the digestion pattern of the PCR fragment. Thus, the inventors established a CHO-Fut8-/- cell line, i.e., the CHO-F cell line (Table 2).
To further show that the CHO-F cell line does express 100% fucose-depleted antibodies, in another embodiment of the present invention the inventors compared, by mass spectrometry, the sugar chains of the Fc fusion protein (FIG. 8) or antibodies (FIG. 9) expressed by wild-type CHO-K1 and CHO-F cells. This part of the work was outsourced to Waters Corporation and performed by standard methods of sugar chain mass spectrometry. The results showed that 100% of the product expressed by CHO-F cells was free of fucose.
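The reading of the MscI digestion pattern can be expressed as a simple decision rule. The sketch below uses the fragment sizes given in the text (715 bp uncut, 417 bp and 298 bp cut); the function name and the idealized assumption that bands are reported as exact sizes are illustrative only.

```python
from typing import Iterable

UNCUT_BP = 715            # full-length PCR product
CUT_BP = {417, 298}       # fragments produced when the MscI site is intact


def classify_msci_digest(bands_bp: Iterable[int]) -> str:
    """Interpret the bands seen after MscI digestion of the exon 10 amplicon."""
    observed = set(bands_bp)
    has_uncut = UNCUT_BP in observed
    has_cut = CUT_BP <= observed
    if has_cut and not has_uncut:
        return "MscI site intact on both alleles"
    if has_cut and has_uncut:
        return "MscI site lost on one allele"
    if has_uncut:
        return "MscI site lost on both alleles"
    raise ValueError("unexpected band pattern")


print(classify_msci_digest([417, 298]))        # pattern of wild-type cells
print(classify_msci_digest([715, 417, 298]))   # pattern of clones No. 1, 2 and 18
print(classify_msci_digest([715]))             # pattern of clones No. 3, 4 and 12
```

The digest only reports on the MscI site itself. An allele that keeps the site can still be inactivated, as the sequencing of clone No. 1 shows, so the classification is a screening step and not a proof of genotype.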
Example 4: comparison of Fc receptor binding Activity of fucose-deficient antibodies
To further illustrate the altered binding activity of fucose-deficient antibodies to Fc receptors, in another embodiment of the invention the inventors used ELISA to compare the binding of hIgG1 antibodies expressed by wild-type CHO-K1 and CHO-F cells to various mouse and human Fc receptor proteins (FIG. 10). The ELISA was performed according to standard procedures: the respective Fc receptor proteins (AB Biosciences, Boston, Mass.) were coated on ELISA plates at 2 μg/mL. After blocking with skim milk powder, wild-type or fucose-deficient antibody was added at the same concentration of 1 μg/mL; after thorough washing, bound antibody was detected with HRP-labeled goat anti-human antibody. The results show that the binding of the fucose-deficient hIgG1 antibody to mouse FcRγI, FcRγIIb and FcRγIII and to human FcRγI, FcRγIIa (H133), FcRγIIa (R133) and FcRγIIb is unchanged, while its binding to human FcRγIIIa is greatly improved.
Because the literature reports that mouse FcRγIV is a functional analogue of human FcRγIIIa (Mecetina, L.V. et al., Immunology. 2002; 54:463-8), the inventors wished to compare the binding curves of wild-type and fucose-deficient hIgG1 and mouse mIgG2a antibodies to both receptors. Using bioinformatics methods, the inventors also identified in the UniProt protein sequence database a previously uncharacterized guinea pig protein (H0VDZ8) with high homology to the mouse FcRγIV and human FcRγIIIa sequences (FIG. 11), which was designated guinea pig FcRγIV (gpFcRγIV). The inventors expressed hFcRγIIIa-V158, mFcRγIV or gpFcRγIV, respectively, by pTOG3 and "Toggle-In" on the surface of CHO-E cells that had already been transfected with the Fc receptor common γ chain, and then compared by FACS the binding of wild-type and fucose-deficient hIgG1 and mouse mIgG2a antibodies to the three receptors (FIG. 12). The results showed that the fucose-deficient hIgG1 and mouse mIgG2a antibodies expressed by CHO-F cells all bound these receptors better, to varying degrees (Mao, C. et al., J. Infect. Dis. Med., 2016; 1:1). Thus, the method established by the inventors for rapid knockout of the Fut8 gene in CHO cells by TALEN has been validated at both the gene and the protein function level.
It should be noted that, before introducing hFcRγIIIa-V158, mFcRγIV or gpFcRγIV with pTOG3, the inventors took into account that the membrane expression and activity of many Fc receptors depend on a shared receptor subunit called the Fc receptor common γ chain (Nimmerjahn F. et al., Immunity. 2005 Jul; 23:41-51). They therefore cloned this subunit into the pTOG4 vector and expressed it in CHO-E cells in advance by the "Toggle-In" method. After hFcRγIIIa-V158, mFcRγIV or gpFcRγIV was introduced with pTOG3 in the next round, the resulting CHO cell line simultaneously expressed a functional Fc receptor complex with a 1:1 subunit ratio.
Because of single-copy site-directed integration, the expression densities of different Fc receptors on the different cell lines are very close. Compared with random integration, this is an advantage for side-by-side comparison of the binding of the same antibody to different receptors. For example, as can be seen from the MFI values on the y-axis of FIG. 12, the binding activity of the human IgG1 antibody to the guinea pig gpFcRγIV receptor is much higher than its binding to the homologous receptors of the other two species. Therefore, whether pharmacokinetic and other data obtained for a human therapeutic IgG1 antibody in animal models of different species can be extrapolated across species should be considered carefully. The above case is also another successful application of the "Toggle-In" method described in the present invention.
Example 5: Comparison of ADCC activity of fucose-deficient antibodies
To further elucidate the properties of the fucose-deficient antibodies at the level of cellular function, the inventors compared the ADCC capacity of wild-type and fucose-deficient hIgG1 and mouse mIgG2c antibodies. The traditional method for detecting ADCC is to measure the death of target cells (release of 51Cr or LDH), with PBMC or NK cells purified from them as effector cells. However, effector cells from different donors often vary widely and are not easy to collect and store. Promega developed an assay in which activation of an effector cell line (Jurkat) through a transfected hFcRγIIIa (V158) receptor triggers a fluorescent signal. That cell line, however, maintains functional hFcRγIIIa expression by co-transfection of three plasmids and selection with two antibiotics, and after more passages it gradually loses hFcRγIIIa function. In the present invention, the inventors stably expressed the hFcRγIIIa or mFcRγIV receptor in the Jurkat line using a lentivirus-mediated approach, so that the Jurkat line is stable over the long term and the fluorescence signal is greatly enhanced.
Using this system, the inventors compared the ADCC of wild-type and fucose-deficient hIgG1 against target cells with hFcRγIIIa-Jurkat cells. The specific method is as follows: 24 hours before the start of the experiment, 10000 cells expressing the target antigen were seeded in a Falcon white opaque 96-well cell culture plate (Cat #353296). Such cells may be tumor cell lines expressing the target antigen, or CHO-E cells that express the target antigen on their surface after the target antigen has been cloned and introduced by pTOG3 and the "Toggle-In" method. On the day of the experiment, 2 × 10^5 hFcRγIIIa-Jurkat cells, mixed with different dilutions of the antibody against the target antigen, were added to the above 96-well plate in a total volume of 75-90 μL and incubated for 6 hours at 37 °C. After the incubation, 60 μL of Bio-Glo luciferase substrate (Promega, Cat #G719A) was added to each well, and after 30 minutes at room temperature the plate was read on a fluorescence reader. The results showed that the ADCC of the fucose-depleted hIgG1 expressed by CHO-F cells against target cells was 50-100 fold higher than that of wild-type hIgG1 (FIG. 13). In the same way, the inventors compared the ADCC of wild-type and fucose-deficient mIgG2c against target cells, except that the effector cells used were mFcRγIV-Jurkat cells with lentivirus-mediated expression. The results showed that the ADCC of fucose-depleted mIgG2c against target cells was 50-100 fold higher than that of wild-type mIgG2c (FIG. 14). Thus, the method for rapidly knocking out the Fut8 gene in CHO cell lines established in the examples of the present invention has been validated not only at the gene and protein function level, but also at the level of antibody-mediated cellular activity.
Using this method, the inventors not only knocked out the Fut8 gene in the CHO-K1 cell strain to obtain the CHO-F cell strain, but also knocked out the Fut8 gene in the CHO-E cell strain, which was established in the preceding examples and expresses multiple genes simultaneously, to obtain the CHO-EF cell strain.
Example 6: screening of GS deleted CHO cell clones
As shown in example 3, CHO-K1 cells or Fut8 gene-deleted CHO-F cells were electrically transduced with the TALEN left (SEQ ID NO:22) and right (SEQ ID NO: 23) vectors GS-L and GS-R (5. mu.g each) for the GS gene. The electrotransformation machine is a Neon Electroposition System from LifeTech. Electrotransfer conditions were 1620v, 10ms, 3 pulses. After electroporation, CHO cells were diluted to a certain concentration and seeded in a 10 cm dish of DMEM medium containing 5% FBS without any drug selection to form a monoclonal. After 10-14 days, the well-shaped clones were picked up with a sterile tip into a 48-well plate and cultured with DMEM containing 5% FBS. When the cells were 80% full in 48-well plates, they were trypsinized and each clone was plated in two duplicate wells, one of which was washed twice with PBS and replaced with glutaminefree HyCell medium (GE healthcare sciences, Logan, ct.cat No. sh30934.01); the other group was supplemented with 2mM glutamine in HyCell medium as described above. After 7 days, GS-/-CHO cells (clone #4) were able to grow only in glutamine-containing HyCell medium, but not in glutamine-deficient medium. While GS +/+ CHO cells were able to grow in both (FIG. 15).
We amplified the DNA region targeted by TALEN from clone #4 genome with a pair of primers (SEQ ID NO:24 and 25) directed to exon 7 of the GS gene. Sequencing by molecular cloning showed that both alleles were targeted and inactivated (FIG. 16).
to further verify that the GS-/-CHO-K1 strain can be used for screening of foreign genes, we introduced pDirect vector (10. mu.g) for antibody gene expression and carrying a GS expression cassette into GS-/-CHO-K1 cells according to the electrotransfer method described above. After electroporation, the cells were cultured in 5% FBS-DMEM medium containing glutamine for one day, washed twice with PBS, replaced with HyCell medium containing no glutamine, and cultured. After 8 to 10 days, when all GS-/-CHO-K1 cells not transfected with the exogenous GS gene died, GS-/-CHO-K1 cells that received the exogenous GS gene appeared to be resistant clones in the culture dish.
We thus verified that clone #4 of the CHO-K1 cells no longer carries a functional GS gene after TALEN treatment. In the same manner, we also obtained GS-/- clones of CHO-F cells, from which the Fut8 gene had already been deleted (#1, #4, #5 and #7 in FIG. 17), as well as GS-/- Fut8-/- CHO-DG44 cell lines.
Example 7: humanized glycosylation engineering of CHO cells
To make the glycosylation of antibodies expressed by CHO cells more similar to that of human antibodies (especially sialylation at the α2,6 position, which CHO cells cannot perform), the inventors used the fucose-deficient CHO-EF cell line established above as a model and further introduced genes for the enzymes responsible for synthesizing the specific glycosyl groups, following the method described in Example 2. The specific method was as follows:
First, to enable CHO cells to perform α2,6 sialylation, which they do not natively possess, the GT-P2A-ST6 gene (SEQ ID NO: 40) was cloned into the pTOG3 vector and transfected into CHO-EF cells. CHO-EF cells were cultured in 6-well plates in DMEM containing 5% FBS. When the cells reached 80% confluence, 1.0 μg of pTOG3 carrying the GT-P2A-ST6 gene and 20 ng of the Cre-encoding plasmid pOG231 (purchased from AddGene) were co-transfected using Fugene 6 reagent (Promega). Because Cre-LoxP recombination is very efficient in this step, the amount of pOG231 must not be too high, otherwise the efficiency actually decreases; 20 ng is the optimum dose determined by titration. 48 hours after transfection, the cells were trypsinized and replated on two 10 cm dishes, and 800 μg/mL hygromycin B was added; after 7-10 days of selection, hundreds of clones had formed. After 10-14 days, well-formed clones were picked with a sterile tip into a 48-well plate and cultured in DMEM containing 5% FBS. When the cells reached 80% confluence in the 48-well plate, they were trypsinized and each clone was plated into two sets of duplicate 48-well plates: one set was tested with puromycin (10 μg/mL) and the other was kept as a seed stock. In correct clones, which have genuinely undergone recombination, the puromycin resistance gene has been replaced by the hygromycin resistance gene, so the cells should be puromycin-sensitive. A clone that is not puromycin-sensitive reflects a random integration event. The inventors thus obtained two clones, CHO-EFS-6.1 and CHO-EFS-6.3, which are hygromycin-resistant and puromycin-sensitive. The cells were then stained with biotin-labeled Sambucus nigra lectin (SNA) or biotin-labeled Maackia amurensis lectin II (MAL II) at 4 °C for 15 minutes (stock solution diluted 2000-fold to a final concentration of 2.5 μg/mL). SNA-biotin and MAL II-biotin are available from Vector Laboratories, Inc. (Burlingame, CA, USA). 10 mL of MACS buffer was added to the cell suspension, and the cells were collected by centrifugation (1200 rpm, 5 min) and resuspended in 1 mL of MACS buffer. The cells were then stained with PE-labeled streptavidin (final concentration 1 μg/mL) at 4 °C for 15 minutes in the dark. Streptavidin-PE was purchased from eBioscience, Inc. (San Diego, CA, USA).
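The drug-resistance logic used above to separate correct cassette-exchange clones from random integrants can be summarized as a small decision function. This is a minimal sketch under the assumption that each clone has been scored as resistant or sensitive to each drug; the function name and the result labels are hypothetical.

```python
def classify_integration(hygromycin_resistant: bool, puromycin_resistant: bool) -> str:
    """Interpret the hygromycin/puromycin phenotype of a clone.

    Site-specific exchange replaces the puromycin resistance gene with the
    hygromycin resistance gene, so a correct clone is hygromycin-resistant
    and puromycin-sensitive.
    """
    if not hygromycin_resistant:
        # The incoming cassette is absent or not expressed
        return "no stable integration"
    if puromycin_resistant:
        # The original puromycin marker is still intact, so the incoming
        # cassette must have landed elsewhere in the genome
        return "random integration"
    return "site-specific exchange"


if __name__ == "__main__":
    # Phenotype reported for the clones CHO-EFS-6.1 and CHO-EFS-6.3
    print(classify_integration(hygromycin_resistant=True, puromycin_resistant=False))
```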
FIG. 18 shows the FACS profiles obtained. The parental CHO-F cell line cannot add sialic acid at the α2,6 position, so SNA staining is negative; it does, however, carry sialic acid at the α2,3 position, so MAL II staining is positive. The two clones CHO-EFS-6.1 and CHO-EFS-6.3 were positive for both SNA and MAL II staining, indicating that they can perform α2,6 sialylation in addition to their original α2,3 sialylation (FIG. 18A). FIG. 18B shows that the CHO-EFS-6.1 clone has fewer cell-surface binding sites for Erythrina cristagalli lectin (ECL) than the parental CHO-F cell line. ECL-biotin is also available from Vector Laboratories, Inc. (Burlingame, CA, USA), and the staining method was as above. ECL specifically recognizes galactose. Terminal galactose that is not capped with sialic acid, and high-mannose structures, on antibody sugar chains have a great influence on the in vivo distribution and kinetics of the antibody. Antibodies that are not sialylated are bound by the asialoglycoprotein receptor (ASGPR) in the liver and cleared from the blood. Antibodies with high mannose are likewise bound by the mannose receptor (ManR) in the liver and rapidly cleared. Antibodies produced by CHO-EFS-6.1 or CHO-EFS-6.3 cells are therefore expected to be more resistant to clearance by ASGPR and ManR, and thus to have a longer half-life in vivo and lower immunogenicity. After two rounds of subcloning, the inventors selected the CHO-EFS-6.1 cell line with the best morphology and named it CHO-EFS (3,6) (Table 2), where the numbers 3 and 6 indicate that the cell line can perform terminal sialic acid modification at both the α2,3 and α2,6 positions. Similarly, cell lines CHO-ES #3, #4 and #5 (FIG. 19) were established by introducing pTOG3-GT-P2A-ST6 into fucose-retaining CHO-E cells as described above, of which CHO-ES #3 was designated CHO-ES (3,6) (Table 2).
This cell line is characterized by expressing both fucose and α2,6-linked sialic acid (FIG. 19).
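The lectin readout used in this example maps directly onto the sialic acid linkages present on the cell surface: MAL II reports α2,3 linkages and SNA reports α2,6 linkages. The sketch below illustrates that mapping and the "(3,6)" style of suffix used in the cell line names. The function names are hypothetical, and positive/negative calls are assumed to have been made already from the FACS profiles.

```python
def sialylation_linkages(sna_positive: bool, mal2_positive: bool) -> tuple:
    """Return the sialic acid linkages implied by SNA / MAL II staining."""
    linkages = []
    if mal2_positive:
        linkages.append("alpha2,3")  # MAL II binds alpha2,3-linked sialic acid
    if sna_positive:
        linkages.append("alpha2,6")  # SNA binds alpha2,6-linked sialic acid
    return tuple(linkages)


def name_suffix(linkages: tuple) -> str:
    """Build a suffix such as '(3,6)' from the detected linkages."""
    positions = [link.split(",")[1] for link in linkages]
    return "(" + ",".join(positions) + ")" if positions else ""


if __name__ == "__main__":
    parental = sialylation_linkages(sna_positive=False, mal2_positive=True)
    engineered = sialylation_linkages(sna_positive=True, mal2_positive=True)
    print(parental, name_suffix(parental))
    print(engineered, name_suffix(engineered))
```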
In another embodiment of the invention, to enable CHO cells to galactosylate antibodies in a human-like manner and thereby promote complement activation, the GT gene was cloned into the pTOG3 vector and transfected into CHO-E or CHO-EF cells. The resulting hygromycin-resistant, puromycin-sensitive clones were designated CHO-EG and CHO-EFG, respectively (Table 2). The examples of the present invention have thus established a method for targeted engineering of CHO cell glycosylation, in particular human-type sialylation at the α2,6 position.
Example 8: knocking out ST3GAL4 and ST3GAL6 genes in CHO cells by TALEN technology
Even if the human α2,6 sialyltransferase gene is introduced by the above method, the endogenous α2,3 sialyltransferases of CHO cells (mainly the two enzymes ST3GAL4 and ST3GAL6) mean that the terminal sialic acid modification of glycoproteins may contain both α2,3- and α2,6-linked sialic acids, so product homogeneity cannot be controlled. Furthermore, it has been reported that α2,3-linked sialic acid inhibits ADCC, whereas α2,6-linked sialic acid is the optimal form for further enhancing ADCC activity in the absence of fucose, as demonstrated by in vitro chemoenzymatic modification (Lin CW et al. Proc Natl Acad Sci U S A. 2015; 112(34): 10611-6). For this reason, we continued to use TALEN technology to modify the CHO-ES (3,6) and CHO-EFS (3,6) cell lines at the gene level, knocking out the ST3GAL4 and ST3GAL6 enzyme genes.
We constructed a pair of TALEN vectors targeting exon 5 of the CHO ST3GAL4 gene, named S3G4-1 and S3G4-2 (SEQ ID NO: 26-27 show the DNA sequences encoding the TALEN proteins); the corresponding primers are SEQ ID NO: 28-29. We also constructed a pair of TALEN vectors targeting exon 2 of the CHO ST3GAL6 gene, named S3G6-1 and S3G6-2 (SEQ ID NO: 30-31 show the DNA sequences encoding the TALEN proteins); the corresponding primers are SEQ ID NO: 32-33. 2 μg of each of the four miniprep plasmids was introduced into 2 million CHO-ES (3,6) or CHO-EFS (3,6) cells by electroporation. The electroporator was a Neon Electroporation System from LifeTech, and the electroporation conditions were 1620 V, 10 ms, 3 pulses. After electroporation, the CHO cells were cultured in DMEM medium containing 5% FBS for 8 days without any drug selection. This growth phase allows clones in which the ST3GAL4/6 genes have been disrupted to divide sufficiently and, through cell membrane turnover, to dilute out the membrane glycoproteins and glycolipids that originally carried α2,3-linked sialic acid, while newly produced transmembrane glycoproteins and glycolipids carry no α2,3-linked sialic acid. ST3-/- CHO cells were then negatively selected by FACS, as follows. Cells were stained with biotin-labeled Maackia amurensis lectin II (MAL II) for 15 min at 4 °C (stock diluted 2000-fold to a final concentration of 2.5 μg/mL). MAL II-biotin is available from Vector Laboratories, Inc. (Burlingame, CA, USA). The cells were washed with FACS buffer, collected by centrifugation (1200 rpm, 5 min), and stained with PE-labeled streptavidin (final concentration 1 μg/mL) at 4 °C for 15 min in the dark. Streptavidin-PE was purchased from eBioscience, Inc. (San Diego, CA, USA).
The MAL II-negative cell population was then sorted on a MoFlo flow cytometer (FIG. 20) and subcloned in 96-well plates.
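The reason for the 8-day drug-free growth phase before MAL II sorting can be illustrated with a rough dilution estimate. The sketch below assumes, purely for illustration, that pre-existing α2,3-sialylated membrane molecules are split evenly between daughter cells at each division and are not otherwise degraded or replaced. The doubling time is a hypothetical input that is not stated in this document, and real membrane turnover would lower the residual signal further.

```python
def residual_fraction(days: float, doubling_time_h: float) -> float:
    """Fraction of pre-existing membrane label left per cell after `days` of growth.

    Assumes dilution by cell division only: each division halves the amount
    of old label per cell. Membrane turnover is ignored.
    """
    if days < 0 or doubling_time_h <= 0:
        raise ValueError("days must be >= 0 and doubling_time_h must be > 0")
    divisions = days * 24.0 / doubling_time_h
    return 0.5 ** divisions


if __name__ == "__main__":
    # 8 days of culture; 24 h doubling time is an assumed, illustrative value
    print(residual_fraction(days=8, doubling_time_h=24.0))
```

Under these assumptions, 8 days at one division per day corresponds to 8 halvings, so knockout cells would retain well under 1% of their original α2,3-sialylated surface molecules by the time of sorting.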
Table 2 lists the CHO cell lines used or newly prepared in the present invention, together with their genotypic characteristics. The cell line obtained by knocking out the Fut8 gene in CHO-K1 is named CHO-F; the cell line with the GS gene knocked out is named CHO-GS; and the cell line with both the GS and Fut8 genes knocked out is named CHO-GSF. The cell line into which genes such as SRP-14-9-54, hERO-L and hFGF9 were introduced site-specifically by the Cre-LoxP method to promote high antibody expression is named CHO-E, where E stands for Enhanced. The cell lines obtained by introducing the GT and ST6 genes, which enable human-type α2,6 sialylation, into CHO-E and CHO-EF are named CHO-ES (3,6) and CHO-EFS (3,6), respectively; those obtained by introducing the GT gene, which enables galactosylation, are named CHO-EG and CHO-EFG, respectively. The cell lines obtained by using TALEN technology to knock out the endogenous α2,3-sialyltransferase 4/6 genes in CHO-ES (3,6) and CHO-EFS (3,6) are named CHO-ES (6) and CHO-EFS (6), respectively. The cell line obtained by knocking out the Fut8 gene in CHO-DG44 is named CHO-F (DG44); the one with the GS gene knocked out is named CHO-GS (DG44); and the one with both GS and Fut8 knocked out is named CHO-GSF (DG44).
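The nomenclature in the preceding paragraph can be read as a small lineage graph in which each named cell line is a parent line plus one engineering step. The following Python sketch encodes only the relationships stated in that paragraph; it is not a reproduction of Table 2. The dictionary layout, the modification labels and the function name are hypothetical. CHO-E and CHO-EF are treated as starting points because their own derivation steps are not itemized here.

```python
# name -> (parent line, modifications introduced at this step)
CELL_LINES = {
    "CHO-F": ("CHO-K1", ("Fut8 knockout",)),
    "CHO-GS": ("CHO-K1", ("GS knockout",)),
    "CHO-GSF": ("CHO-K1", ("GS knockout", "Fut8 knockout")),
    "CHO-ES (3,6)": ("CHO-E", ("GT knock-in", "ST6 knock-in")),
    "CHO-EFS (3,6)": ("CHO-EF", ("GT knock-in", "ST6 knock-in")),
    "CHO-EG": ("CHO-E", ("GT knock-in",)),
    "CHO-EFG": ("CHO-EF", ("GT knock-in",)),
    "CHO-ES (6)": ("CHO-ES (3,6)", ("ST3GAL4 knockout", "ST3GAL6 knockout")),
    "CHO-EFS (6)": ("CHO-EFS (3,6)", ("ST3GAL4 knockout", "ST3GAL6 knockout")),
    "CHO-F (DG44)": ("CHO-DG44", ("Fut8 knockout",)),
    "CHO-GS (DG44)": ("CHO-DG44", ("GS knockout",)),
    "CHO-GSF (DG44)": ("CHO-DG44", ("GS knockout", "Fut8 knockout")),
}


def cumulative_modifications(name: str):
    """Walk up the lineage and return (starting line, modifications oldest first)."""
    steps = []
    while name in CELL_LINES:
        parent, mods = CELL_LINES[name]
        steps.append(mods)
        name = parent
    ordered = [m for mods in reversed(steps) for m in mods]
    return name, ordered


if __name__ == "__main__":
    print(cumulative_modifications("CHO-EFS (6)"))
```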
TABLE 2
example 9: effect of CHO cell after humanized glycosylation modification
To date, producing antibodies in wild-type CHO cells remains common practice in the antibody pharmaceutical industry. With continued optimization of culture media and improvements in fermentation processes, antibody yields (2-5 g/L) can meet the requirements of downstream purification in terms of quantity, but have also reached a development bottleneck. In terms of quality, on the other hand, therapeutic antibodies still have considerable room for improvement. The present invention focuses mainly on the latter and provides specific improvements at the level of the cell line.
For example, a fucose-deficient CHO cell line can increase the ADCC activity of an antibody 50- to 100-fold, greatly reducing the in vivo dose, the toxic side effects and the production cost. However, other forms of glycoengineering, and their various combinations, have not been systematically explored as ways to improve antibody quality. The in vitro chemoenzymatic modification methods reported in the literature are not practical for large-scale industrial use. In this example, we provide a means of carrying out such rigorous, systematic exploration at the gene level. For example, relying on the Toggle In site-directed integration mechanism, we prepared stable cell lines into which both the GT and ST6 enzymes required for human-type α2,6 sialic acid modification were introduced, yielding the two cell lines CHO-ES (3,6) and CHO-EFS (3,6), so that the effect of sialic acid modification on antibody ADCC activity can be compared head-to-head in the presence or absence of fucose. FIG. 21 shows mass spectrometry analysis of the sugar chains of a human IgG1 antibody expressed by wild-type CHO-K1 and by the Fut8-/- CHO-EFS (3,6) cell line. The results show that the sugar chains of the two differ greatly: the latter not only carry no fucose but also have more terminal sialic acid.
Next, we expressed an hIgG1 antibody against the tumor target GPC3 in the wild-type (WT) CHO-K1 cell line, the fucose-deficient (AF) CHO-F cell line, and the fucose-deficient, human-type sialylated (AF + S) CHO-EFS (3,6) cell line, and tested the ADCC effect of each on GPC3-positive Huh7 cells in vitro. For the experimental procedure, see the luciferase-based in vitro ADCC assay in Example 5. The ADCC activity of the fucose-deficient antibody was markedly higher than that of the wild-type antibody, and ADCC activity increased further when sialic acid was added on top of the fucose deficiency (FIG. 22). This is consistent with the literature report that adding α2,6-linked sialic acid further improves ADCC on a fucose-deficient background (Lin CW et al. Proc Natl Acad Sci U S A. 2015; 112(34): 10611-6). By analogy, the methods provided by the inventors can also be used to compare, in the presence or absence of fucose, the effect on antibody ADCC of having terminal sialic acid only at the α2,6 position and none at the α2,3 position, or the effect of terminal galactose modification on complement-dependent cytotoxicity (CDC), and so on.
While specific embodiments of the present invention have been illustrated and described in detail, it should be understood that the invention is not limited to these specific embodiments. The method provided by the invention is suitable not only for the CHO-K1 cell line but also for other CHO cell lines such as CHO-DG44. Various modifications, adaptations and variations can be made without departing from the spirit and scope of the invention, and these fall within the scope of the invention.
Sequence listing
<110> Antaiji (Beijing) Biotechnology Ltd
<120> method for modifying host cell genome and use thereof
<130> DIC18110048
<160> 40
<170> SIPOSequenceListing 1.0
<210> 1
<211> 34
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 1
ataacttcgt atagcataca ttatacgaag ttat 34
<210> 2
<211> 34
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 2
ataacttcgt atagtatagt atatacgaac ggta 34
<210> 3
<211> 34
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 3
taccgttcgt atagtatagt atatacgaag ttat 34
<210> 4
<211> 33
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 4
taccgttcgt atagtatagt aatacgaacg gta 33
<210> 5
<211> 34
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 5
ataacttcgt ataggctata gtatacgaac ggta 34
<210> 6
<211> 34
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 6
taccgttcgt ataggctata gtatacgaag ttat 34
<210> 7
<211> 34
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 7
taccgttcgt ataggctata gtatacgaac ggta 34
<210> 8
<211> 1001
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 8
gggtgatgct ggcgccgagt atgttgtgga atctactggc gtcttcacca ccatggagaa 60
ggctggggcc cacttgaagg gcggggccaa gagggtcatc atctccgccc cttctgctga 120
tgcccccatg tttgtgatgg gtgtgaacca agacaagtat gacaactccc tcaagattgt 180
caggtgagga tggcagaggg ctgtggcaaa gtgggcaagc aggggcaagg ttacaggtgg 240
gcgagcctcc taacctgtct cttctcttca gcaatgcgtc ctgcaccacc aactgcttag 300
cccccctggc caaggtcatc catgacaact ttggcattgt ggaaggactc atggtatgta 360
gttcatctgt ttcatcctgc cagcagtggg cgctgtggtg ggggccctgc aagacctcac 420
tccctgcctc tgtgtctttc agaccacggt ccatgccatc actgccaccc agaagactgt 480
ggatggcccc tccgggaagc tgtggcgtga tggccgtggg gctgcccaga acatcatccc 540
tgcatccact ggcgctgcca aggctgtggg caaagtcatc ccagagctga acgggaagct 600
gactggcatg gccttccgtg ttcctacccc caacgtgtcc gttgtggatc tgacatgtcg 660
cctggagaaa cctgtatgtc tggggtgggc tgagggttgt ctctagtggt gaggttgggg 720
cttgagtagt caccttgatt tttgccctta ataggccaag tatgaggaca tcaagaaggt 780
ggtgaagcag gcatctgagg gcccactgaa gggcatcctg ggctacaccg aggaccaggt 840
tgtctcctgc gacttcaaca gtgactccca ctcttccacc tttgatgctg gggctggcat 900
tgctctcaat gacaactttg taaagctcat ttcctggtat gacaatgaat ttggctacag 960
caacagagtg gtggacctca tggcctacat ggcctccaag g 1001
<210> 9
<211> 501
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 9
Met Ala Thr Glu Tyr Lys Pro Thr Val Arg Leu Ala Thr Arg Asp Asp
1 5 10 15
Val Pro Arg Ala Val Arg Thr Leu Ala Ala Ala Phe Ala Asp Tyr Pro
20 25 30
Ala Thr Arg His Thr Val Asp Pro Asp Arg His Ile Glu Arg Val Thr
35 40 45
Glu Leu Gln Glu Leu Phe Leu Thr Arg Val Gly Leu Asp Ile Gly Lys
50 55 60
Val Trp Val Ala Asp Asp Gly Ala Ala Val Ala Val Trp Thr Thr Pro
65 70 75 80
Glu Ser Val Glu Ala Gly Ala Val Phe Ala Glu Ile Gly Ser Arg Met
85 90 95
Ala Glu Leu Ser Gly Ser Arg Leu Ala Ala Gln Gln Gln Met Glu Gly
100 105 110
Leu Leu Ala Pro His Arg Pro Lys Glu Pro Ala Trp Phe Leu Ala Thr
115 120 125
Val Gly Val Ser Pro Asp His Gln Gly Lys Gly Leu Gly Ser Ala Val
130 135 140
Val Leu Pro Gly Val Glu Ala Ala Glu Arg Ala Gly Val Pro Ala Phe
145 150 155 160
Leu Glu Thr Ser Ala Pro Arg Asn Leu Pro Phe Tyr Glu Arg Leu Gly
165 170 175
Phe Thr Val Thr Ala Asp Val Glu Val Pro Glu Gly Pro Arg Thr Trp
180 185 190
Cys Met Thr Arg Lys Pro Gly Ala Gly Ser Glu Gly Arg Gly Ser Leu
195 200 205
Leu Thr Cys Gly Asp Val Glu Glu Asn Pro Gly Pro Met Val Ser Lys
210 215 220
Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val Glu Leu Asp
225 230 235 240
Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly Glu Gly Glu Gly
245 250 255
Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys Thr Thr Gly
260 265 270
Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Leu Thr Tyr Gly
275 280 285
Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys Gln His Asp Phe
290 295 300
Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg Thr Ile Phe
305 310 315 320
Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val Lys Phe Glu
325 330 335
Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile Asp Phe Lys
340 345 350
Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn Tyr Asn Ser
355 360 365
His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn Gly Ile Lys Val
370 375 380
Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser Val Gln Leu Ala
385 390 395 400
Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro Val Leu Leu
405 410 415
Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Ala Leu Ser Lys Asp Pro
420 425 430
Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val Thr Ala Ala
435 440 445
Gly Ile Thr Leu Gly Met Asp Glu Leu Tyr Lys Lys Leu Ser His Gly
450 455 460
Phe Pro Pro Ala Val Ala Ala Gln Asp Asp Gly Thr Leu Pro Met Ser
465 470 475 480
Cys Ala Gln Glu Ser Gly Met Asp Arg His Pro Ala Ala Cys Ala Ser
485 490 495
Ala Arg Ile Asn Val
500
<210> 10
<211> 6128
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 10
gggtgatgct ggcgccgagt atgttgtgga atctactggc gtcttcacca ccatggagaa 60
ggctggggcc cacttgaagg gcggggccaa gagggtcatc atctccgccc cttctgctga 120
tgcccccatg tttgtgatgg gtgtgaacca agacaagtat gacaactccc tcaagattgt 180
caggtgagga tggcagaggg ctgtggcaaa gtgggcaagc aggggcaagg ttacaggtgg 240
gcgagcctcc taacctgtct cttctcttca gcaatgcgtc ctgcaccacc aactgcttag 300
cccccctggc caaggtcatc catgacaact ttggcattgt ggaaggactc atggtatgta 360
gttcatctgt ttcatcctgc cagcagtggg cgctgtggtg ggggccctgc aagacctcac 420
tccctgcctc tgtgtctttc agaccacggt ccatgccatc actgccaccc agaagactgt 480
ggatggcccc tccgggaagc tgtggcgtga tggccgtggg gctgcccaga acatcatccc 540
tgcatccact ggcgctgcca aggctgtggg caaagtcatc ccagagctga acgggaagct 600
gactggcatg gccttccgtg ttcctacccc caacgtgtcc gttgtggatc tgacatgtcg 660
cctggagaaa cctgtatgtc tggggtgggc tgagggttgt ctctagtggt gaggttgggg 720
cttgagtagt caccttgatt tttgccctta ataggccaag tatgaggaca tcaagaaggt 780
ggtgaagcag gcatctgagg gcccactgaa gggcatcctg ggctacaccg aggaccaggt 840
tgtctcctgc gacttcaaca gtgactccca ctcttccacc tttgatgctg gggctggcat 900
tgctctcaat gacaactttg taaagctcat ttcctggtat gacaatgaat ttggctacag 960
caacagagtg gtggacctca tggcctacat ggcctccaag gagggaggtg ggaagttcct 1020
attccgaagt tcctattctt caaatagtat aggaacttcg ggcagcggag ccaccaactt 1080
cagcctgctg aagcaggccg gcgatgtgga ggagaatcct ggccccttgg ccaagttgac 1140
cagtgccgtt ccggtgctca ccgcgcgcga cgtcgccgga gcggtcgagt tctggaccga 1200
ccggctcggg ttctcccggg acttcgtgga ggacgacttc gccggtgtgg tccgggacga 1260
cgtgaccctg ttcatcagcg cggtccagga ccaggtggtg ccggacaaca ccctggcctg 1320
ggtgtgggtg cgcggcctgg acgagctgta cgccgagtgg tcggaggtcg tgtccacgaa 1380
cttccgggac gcctccgggc cggccttgac cgagatcggc gagcagccgt gggggcggga 1440
gttcgccctg cgcgacccgg ccggcaactg cgtgcacttc gtggccgagg agcaggacgg 1500
cagcgagggc agaggcagcc tgctgacctg cggcgacgtg gaggagaacc ccggccccat 1560
ggcttcgtac ccctgccatc aacacgcgtc tgcgttcgac caggctgcgc gttctcgcgg 1620
ccatagcaac cgacgtacgg cgttgcgccc tcgccggcag caagaagcca cggaagtccg 1680
cctggagcag aaaatgccca cgctactgcg ggtttatata gacggtcctc acgggatggg 1740
gaaaaccacc accacgcaac tgctggtggc cctgggttcg cgcgacgata tcgtctacgt 1800
acccgagccg atgacttact ggcaggtgct gggggcttcc gagacaatcg cgaacatcta 1860
caccacacaa caccgcctcg accagggtga gatatcggcc ggggacgcgg cggtggtaat 1920
gacaagcgcc cagataacaa tgggcatgcc ttatgccgtg accgacgccg ttctggctcc 1980
tcatatcggg ggggaggctg ggagttcaca tgccccgccc ccggccctca ccctcatctt 2040
cgaccgccat cccatcgccg ccctcctgtg ctacccggcc gcgcgatacc ttatgggcag 2100
catgaccccc caggccgtgc tggcgttcgt ggccctcatc ccgccgacct tgcccggcac 2160
aaacatcgtg ttgggggccc ttccggagga cagacacatc gaccgcctgg ccaaacgcca 2220
gcgccccggc gagcggcttg acctggctat gctggccgcg attcgccgcg tttacgggct 2280
gcttgccaat acggtgcggt atctgcaggg cggcgggtcg tggcgggagg attggggaca 2340
gctttcgggg acggccgtgc cgccccaggg tgccgagccc cagagcaacg cgggcccacg 2400
accccatatc ggggacacgt tatttaccct gtttcgggcc cccgagttgc tggcccccaa 2460
cggcgacctg tacaacgtgt ttgcctgggc cttggacgtc ttggccaaac gcctccgtcc 2520
catgcacgtc tttatcctgg attacgacca atcgcccgcc ggctgccggg acgccctgct 2580
gcaacttacc tccgggatgg tccagaccca cgtcaccacc cccggctcca taccgacgat 2640
ctgcgacctg gcgcgcacgt ttgcccggga gatgggggag gctaactgaa gttcctattc 2700
cgaagttcct attctctaga aagtatagga acttcatcat aatcagccat accacatttg 2760
tagaggtttt acttgcttta aaaaacctcc cacacctccc cctgaacctg aaacataaaa 2820
tgaatgcatt tgttgttgtt aacttgttta ttgcagctta taatggttac aaataaagca 2880
atagcatcac aaatttcaca aataaagcat ttttttcact gcattctagt tgtggtttgt 2940
ccaaactcat caatgtatct tatcatgtct caattgtacc gggtagggta ggcgcttttc 3000
ccaaggcagt ctggagcatg cgctttagca gccccgctgg gcacttggcg ctacacaagt 3060
ggcctctggc ctcgcacaca ttccacatcc accggtaggc gccaaccggc tccgttcttt 3120
ggtggcccct tcgcgccacc ttctactcct cccctagtca ggaagttccc ccccgccccg 3180
cagctcgcgt cgtgcaggac gtgacaaatg gaagtagcac gtctcactag tctcgtgcag 3240
atggacagca ccgctgagca atggaagcgg gtaggccttt ggggcagcgg ccaatagcag 3300
ctttgctcct tcgctttctg ggctcagagg ctgggaaggg gtgggtccgg gggcgggctc 3360
aggggcgggc tcaggggcgg ggcgggcgcc cgaaggtcct ccggaggccc ggcattctgc 3420
acgcttcaaa agcgcacgtc tgccgcgctg ttctcctctt cctcatctcc gggcctttcg 3480
accataactt cgtatagcat acattatacg aagttatgcc gccaccatgg ccaccgagta 3540
caagcccacg gtgcgcctcg ccacccgcga cgacgtcccc cgggccgtac gcaccctcgc 3600
cgccgcgttc gccgactacc ccgccacgcg ccacaccgtc gacccggacc gccacatcga 3660
gcgggtcacc gagctgcaag aactcttcct cacgcgcgtc gggctcgaca tcggcaaggt 3720
gtgggtcgcg gacgacggcg ccgcggtggc ggtctggacc acgccggaga gcgtcgaagc 3780
gggggcggtg ttcgccgaga tcggctcgcg catggccgag ttgagcggtt cccggctggc 3840
cgcgcagcaa cagatggaag gcctcctggc gccgcaccgg cccaaggagc ccgcgtggtt 3900
cctggccacc gtcggcgtct cgcccgacca ccagggcaag ggtctgggca gcgccgtcgt 3960
gctccccgga gtggaggcgg ccgagcgcgc tggggtgccc gccttcctgg agacctccgc 4020
gccccgcaac ctccccttct acgagcggct cggcttcacc gtcaccgccg acgtcgaggt 4080
gcccgaagga ccgcgcacct ggtgcatgac ccgcaagccc ggtgccggca gcgagggcag 4140
aggcagcctg ctgacctgcg gcgacgtgga ggagaacccc ggccccatgg tgagcaaggg 4200
cgaggagctg ttcaccgggg tggtgcccat cctggtcgag ctggacggcg acgtaaacgg 4260
ccacaagttc agcgtgtccg gcgagggcga gggcgatgcc acctacggca agctgaccct 4320
gaagttcatc tgcaccaccg gcaagctgcc cgtgccctgg cccaccctcg tgaccaccct 4380
gacctacggc gtgcagtgct tcagccgcta ccccgaccac atgaagcagc acgacttctt 4440
caagtccgcc atgcccgaag gctacgtcca ggagcgcacc atcttcttca aggacgacgg 4500
caactacaag acccgcgccg aggtgaagtt cgagggcgac accctggtga accgcatcga 4560
gctgaagggc atcgacttca aggaggacgg caacatcctg gggcacaagc tggagtacaa 4620
ctacaacagc cacaacgtct atatcatggc cgacaagcag aagaacggca tcaaggtgaa 4680
cttcaagatc cgccacaaca tcgaggacgg cagcgtgcag ctcgccgacc actaccagca 4740
gaacaccccc atcggcgacg gccccgtgct gctgcccgac aaccactacc tgagcaccca 4800
gtccgccctg agcaaagacc ccaacgagaa gcgcgatcac atggtcctgc tggagttcgt 4860
gaccgccgcc gggatcactc tcggcatgga cgagctgtac aagaagctca gccatggctt 4920
cccgccggcg gtggcggcgc aggatgatgg cacgctgccc atgtcttgtg cccaggagag 4980
cgggatggac cgtcaccctg cagcctgtgc ttctgctagg atcaatgtat aacttcgtat 5040
agtatagtat atacgaacgg tagcggccgc gaagcccacc ctggaccatc caccccagca 5100
aggactcgag caagagggag gccctggctg ctgagcagtc cctgtccaat aacccccaca 5160
ccgatcatct ccctcacagt ttccatccca gacccccaga ataaggaggg gcttagggag 5220
ccctactctc ttgaatacca tcaataaagt tcactgcacc catcttcctt ggcctttcaa 5280
tgtaaggttg ggggagggag gctgtgacct agcaaggttg ggaatcctct gtgtcacttt 5340
tcaaacaggg cactagccac atgccagccc aggtttcctg tcctgaacag atgaaaattc 5400
acctaaggtg tcttggtgct gggaggagtg ggggttgaca atgggaccag tagtatggtc 5460
ttagaggctt gggctggact gcatcaagtt ccagggctgt gtgtgtgtta tctgcaaaca 5520
aaggtcattt gtgtctggag gccttaggta aaattggaag gatgcccaac atagtaaaaa 5580
tgtatcagcc aggggaagtg actacactgt atctaacctg aaacagctga gctgtaagcc 5640
agcagctgtc actatgttca ggtgtggtcc gctggttctg gggtggtcac ttgtatccag 5700
tttgttagga agtgttgtca ttgcttgtta ggaagacaac acatctcagg ctgggcagtg 5760
gtagtgcgtg cctctaatgc cagcattccc agcacggaag tggcagaggc aggaggacag 5820
cctggaataa caacccaggg cagagccccc atctcggaaa agacaaaaac cagaaagtgc 5880
ttaaaacatt gacacaaagg tgctcaaata ttccttcatt gcttttagag attccactgt 5940
cagcttggca tggcctctag tgagacatcc tgacttggtc ccctgctttc caaggtcagg 6000
agaatgatag ccacagaacg ttccctcagc tgatggctgg agaaccgggg tccctgagcc 6060
cccaccctca cacccatgtg caggagggag cttcaccttt ccctccgagc agtgtctgcc 6120
ttcaggac 6128
<210> 11
<211> 6208
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 11
aaaataaaca aataggggtt ccgcgcacat ttccccgaaa agtgccacct ggatccataa 60
cttcgtatag catacattat acgaagttat gccgccacca tgtggtggcg cctgtggtgg 120
ctgctgctgc tgctgctgct gctgtggccc atggtgtggg ctatgttccc cgccatgccc 180
ctgagcagcc tgttcgtgaa cggccccaga accctgtgcg gcgccgagct ggtggacgcc 240
ctgcagttcg tgtgcggcga ccggggcttc tacttcaaca agcccaccgg ctacggcagc 300
agcagccggc gggcccccca gaccggcatc gtggacgagt gctgcttcag aagctgcgac 360
ctgcggagac tggagatgta ctgcgccccc ctgaagcccg ccaagagcgc cggcagcgag 420
ggcagaggca gcctgctgac ctgcggcgac gtggaggaga accccggccc catgaaccca 480
gccatcagcg tcgctctcct gctctcagtc ttgcaggtgt cccgagggca gaaggtgacc 540
agcctgacag cctgcctggt gaaccaaaac cttcgcctgg actgccgcca tgagaataac 600
accaaggata actccatcca gcatgagttc agcctgaccc gagagaagag gaagcacgtg 660
ctctcaggca ccctcgggat acccgagcac acgtaccgct cccgcgtcac cctctccaac 720
cagccctata tcaaggtcct taccctagcc aacttcacca ccaaggatga gggcgactac 780
ttttgtgagc ttcgagtctc gggcgcgaat cccatgagct ccaataaaag tatcagtgtg 840
tatagagaca aactggtcaa gtgtggcggc ataagcctgc tggttcagaa cacatcctgg 900
atgctgctgc tgctgctttc cctctccctc ctccaagccc tggacttcat ttctctgggc 960
agccagtgca caaactacgc cctgctgaaa ctggccggag atgtggaaag caaccccgga 1020
cccaaaaagc ctgaactcac cgcgacgtct gtcgagaagt ttctgatcga aaagttcgac 1080
agcgtctccg acctgatgca gctctcggag ggcgaagaat ctcgtgcttt cagcttcgat 1140
gtaggagggc gtggatatgt cctgcgggta aatagctgcg ccgatggttt ctacaaagat 1200
cgttatgttt atcggcactt tgcatcggcc gcgctcccga ttccggaagt gcttgacatt 1260
ggggaattca gcgagagcct gacctattgc atctcccgcc gtgcacaggg tgtcacgttg 1320
caagacctgc ctgaaaccga actgcccgct gttctgcagc cggtcgcgga ggccatggat 1380
gcgatcgctg cggccgatct tagccagacg agcgggttcg gcccattcgg accgcaagga 1440
atcggtcaat acactacatg gcgtgatttc atatgcgcga ttgctgatcc ccatgtgtat 1500
cactggcaaa ctgtgatgga cgacaccgtc agtgcgtccg tcgcgcaggc tctcgatgag 1560
ctgatgcttt gggccgagga ctgccccgaa gtccggcacc tcgtgcacgc ggatttcggc 1620
tccaacaatg tcctgacgga caatggccgc ataacagcgg tcattgactg gagcgaggcg 1680
atgttcgggg attcccaata cgaggtcgcc aacatcttct tctggaggcc gtggttggct 1740
tgtatggagc agcagacgcg ctacttcgag cggaggcatc cggagcttgc aggatcgccg 1800
cggctccggg cgtatatgct ccgcattggt cttgaccaac tctatcagag cttggttgac 1860
ggcaatttcg atgatgcagc ttgggcgcag ggtcgatgcg acgcaatcgt ccgatccgga 1920
gccgggactg tcgggcgtac acaaatcgcc cgcagaagcg cggccgtctg gaccgatggc 1980
tgtgtagaag tactcgccga tagtggaaac cgacgcccca gcactcgtcc gagggcaaag 2040
gaatagaagc ttatcataat cagccatacc acatttgtag aggttttact tgctttaaaa 2100
aacctcccac acctccccct gaacctgaaa cataaaatga atgcaattgt tgttgttaac 2160
ttgtttattg cagcttataa tggttacaaa taaagcaata gcatcacaaa tttcacaaat 2220
aaagcatttt tttcactgca ttctagttgt ggtttgtcca aactcatcaa tgtatcttat 2280
catgtctgat aacttcgtat aggctatagt atacgaacgg tacgtgaggc tccggtgccc 2340
gtcagtgggc agagcgcaca tcgcccacag tccccgagaa gttgggggga ggggtcggca 2400
attgaaccgg tgcctagaga aggtggcgcg gggtaaactg ggaaagtgat gtcgtgtact 2460
ggctccgcct ttttcccgag ggtgggggag aaccgtatat aagtgcagta gtcgccgtga 2520
acgttctttt tcgcaacggg tttgccgcca gaacacaggt aagtgccgtg tgtggttccc 2580
gcgggcctgg cctctttacg ggttatggcc cttgcgtgcc ttgaattact tccacctggc 2640
tgcagtacgt gattcttgat cccgagcttc gggttggaag tgggtgggag agttcgaggc 2700
cttgcgctta aggagcccct tcgcctcgtg cttgagttga ggcctggcct gggcgctggg 2760
gccgccgcgt gcgaatctgg tggcaccttc gcgcctgtct cgctgctttc gataagtctc 2820
tagccattta aaatttttga tgacctgctg cgacgctttt tttctggcaa gatagtcttg 2880
taaatgcggg ccaagatctg cacactggta tttcggtttt tggggccgcg ggcggcgacg 2940
gggcccgtgc gtcccagcgc acatgttcgg cgaggcgggg cctgcgagcg cggccaccga 3000
gaatcggacg ggggtagtct caagctggcc ggcctgctct ggtgcctggc ctcgcgccgc 3060
cgtgtatcgc cccgccctgg gcggcaaggc tggcccggtc ggcaccagtt gcgtgagcgg 3120
aaagatggcc gcttcccggc cctgctgcag ggagctcaaa atggaggacg cggcgctcgg 3180
gagagcgggc gggtgagtca cccacacaaa ggaaaagggc ctttccgtcc tcagccgtcg 3240
cttcatgtga ctccacggag taccgggcgc cgtccaggca cctcgattag ttctggagct 3300
tttggagtac gtcgtcttta ggttgggggg aggggtttta tgcgatggag tttccccaca 3360
ctgagtgggt ggagactgaa gttaggccag cttggcactt gatgtaattc tccttggaat 3420
ttgccctttt tgagtttgga tcttggttca ttctcaagcc tcagacagtg gttcaaagtt 3480
tttttcttcc atttcaggtg tcgtgaacgc gtcgccacca tgagcgatag cactgagaac 3540
gtcatcaagc ccttcatgcg cttcaaggtg cacatggagg gctccgtgaa cggccacgag 3600
ttcgagatcg agggcgaggg cgagggcaag ccctacgagg gcacccagac cgccaagctg 3660
caggtgacca agggcggccc cctgcccttc gcctgggaca tcctgtcccc ccagttccag 3720
tacggctcca aggtgtacgt gaagcacccc gccgacatcc ccgactacaa gaagctgtcc 3780
ttccccgagg gcttcaagtg ggagcgcgtg atgaacttcg aggacggcgg cgtggtgacc 3840
gtgacccagg actcctccct gcaggacggc accttcatct accacgtgaa gttcatcggc 3900
gtgaacttcc cctccgacgg ccccgtaatg cagaagaaga ctctgggctg ggagccctcc 3960
accgagcgcc tgtacccccg cgacggcgtg ctgaagggcg agatccacaa ggcgctgaag 4020
ctgaagggcg gcggccacta cctggtggag ttcaagtcaa tctacatggc caagaagccc 4080
gtgaagctgc ccggctacta ctacgtggac tccaagctgg acatcacctc ccacaacgag 4140
gactacaccg tggtggagca gtacgagcgc gccgaggccc gccaccacct gttccagtag 4200
ctcgaggact gtgccttcta gttgccagcc atctgttgtt tgcccctccc ccgtgccttc 4260
cttgaccctg gaaggtgcca ctcccactgt cctttcctaa taaaatgagg aaattgcatc 4320
gcattgtctg agtaggtgtc attctattct ggggggtggg gtggggcagg acagcaaggg 4380
ggaggattgg gaagagaata gcaggcatgg cggccgctac cgttcgtata gtatagtata 4440
tacgaagtta tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa ggccgcgttg 4500
ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg acgctcaagt 4560
cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc 4620
ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct 4680
tcgggaagcg tggcgctttc tcatagctca cgctgtaggt atctcagttc ggtgtaggtc 4740
gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta 4800
tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca 4860
gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag 4920
tggtggccta actacggcta cactagaagg acagtatttg gtatctgcgc tctgctgaag 4980
ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac caccgctggt 5040
agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg atctcaagaa 5100
gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc acgttaaggg 5160
attttggtca tgagattatc aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga 5220
agttttaaat caatctaaag tatatatgag taaacttggt ctgacagtta ccaatgctta 5280
atcagtgagg cacctatctc agcgatctgt ctatttcgtt catccatagt tgcctgactc 5340
cccgtcgtgt agataactac gatacgggag ggcttaccat ctggccccag tgctgcaatg 5400
ataccgcgag acccacgctc accggctcca gatttatcag caataaacca gccagccgga 5460
agggccgagc gcagaagtgg tcctgcaact ttatccgcct ccatccagtc tattaattgt 5520
tgccgggaag ctagagtaag tagttcgcca gttaatagtt tgcgcaacgt tgttgccatt 5580
gctgcaggca tcgtggtgtc acgctcgtcg tttggtatgg cttcattcag ctccggttcc 5640
caacgatcaa ggcgagttac atgatccccc atgttgtgca aaaaagcggt tagctccttc 5700
ggtcctccga tcgttgtcag aagtaagttg gccgcagtgt tatcactcat ggttatggca 5760
gcactgcata attctcttac tgtcatgcca tccgtaagat gcttttctgt gactggtgag 5820
tactcaacca agtcattctg agaatagtgt atgcggcgac cgagttgctc ttgcccggcg 5880
tcaacacggg ataataccgc gccacatagc agaactttaa aagtgctcat cattggaaaa 5940
cgttcttcgg ggcgaaaact ctcaaggatc ttaccgctgt tgagatccag ttcgatgtaa 6000
cccactcgtg cacccaactg atcttcagca tcttttactt tcaccagcgt ttctgggtga 6060
gcaaaaacag gaaggcaaaa tgccgcaaaa aagggaataa gggcgacacg gaaatgttga 6120
atactcatac tcttcctttt tcaatattat tgaagcattt atcagggtta ttgtctcatg 6180
agcggataca tatttgaatg tatttaga 6208
<210> 12
<211> 5932
<212> DNA
<213> Artificial Sequence
<400> 12
aaaataaaca aataggggtt ccgcgcacat ttccccgaaa agtgccacct ggatccataa 60
cttcgtatag catacattat acgaagttat gccgccacca tggccaccga gtacaagccc 120
acggtgcgcc tcgccacccg cgacgacgtc ccccgggccg tacgcaccct cgccgccgcg 180
ttcgccgact accccgccac gcgccacacc gtcgacccgg accgccacat cgagcgggtc 240
accgagctgc aagaactctt cctcacgcgc gtcgggctcg acatcggcaa ggtgtgggtc 300
gcggacgacg gcgccgcggt ggcggtctgg accacgccgg agagcgtcga agcgggggcg 360
gtgttcgccg agatcggctc gcgcatggcc gagttgagcg gttcccggct ggccgcgcag 420
caacagatgg aaggcctcct ggcgccgcac cggcccaagg agcccgcgtg gttcctggcc 480
accgtcggcg tctcgcccga ccaccagggc aagggtctgg gcagcgccgt cgtgctcccc 540
ggagtggagg cggccgagcg cgctggggtg cccgccttcc tggagacctc cgcgccccgc 600
aacctcccct tctacgagcg gctcggcttc accgtcaccg ccgacgtcga ggtgcccgaa 660
ggaccgcgca cctggtgcat gacccgcaag cccggtgccg gcagcgaggg cagaggcagc 720
ctgctgacct gcggcgacgt ggaggagaac cccggcccca tggtgagcaa gggcgaggag 780
ctgttcaccg gggtggtgcc catcctggtc gagctggacg gcgacgtaaa cggccacaag 840
ttcagcgtgt ccggcgaggg cgagggcgat gccacctacg gcaagctgac cctgaagttc 900
atctgcacca ccggcaagct gcccgtgccc tggcccaccc tcgtgaccac cctgacctac 960
ggcgtgcagt gcttcagccg ctaccccgac cacatgaagc agcacgactt cttcaagtcc 1020
gccatgcccg aaggctacgt ccaggagcgc accatcttct tcaaggacga cggcaactac 1080
aagacccgcg ccgaggtgaa gttcgagggc gacaccctgg tgaaccgcat cgagctgaag 1140
ggcatcgact tcaaggagga cggcaacatc ctggggcaca agctggagta caactacaac 1200
agccacaacg tctatatcat ggccgacaag cagaagaacg gcatcaaggt gaacttcaag 1260
atccgccaca acatcgagga cggcagcgtg cagctcgccg accactacca gcagaacacc 1320
cccatcggcg acggccccgt gctgctgccc gacaaccact acctgagcac ccagtccgcc 1380
ctgagcaaag accccaacga gaagcgcgat cacatggtcc tgctggagtt cgtgaccgcc 1440
gccgggatca ctctcggcat ggacgagctg tacaagaagc tcagccatgg cttcccgccg 1500
gcggtggcgg cgcaggatga tggcacgctg cccatgtctt gtgcccagga gagcgggatg 1560
gaccgtcacc ctgcagcctg tgcttctgct aggatcaatg tgtagaagct tatcataatc 1620
agccatacca catttgtaga ggttttactt gctttaaaaa acctcccaca cctccccctg 1680
aacctgaaac ataaaatgaa tgcaattgtt gttgttaact tgtttattgc agcttataat 1740
ggttacaaat aaagcaatag catcacaaat ttcacaaata aagcattttt ttcactgcat 1800
tctagttgtg gtttgtccaa actcatcaat gtatcttatc atgtctgata acttcgtata 1860
gtatagtata tacgaacggt acgtgaggct ccggtgcccg tcagtgggca gagcgcacat 1920
cgcccacagt ccccgagaag ttggggggag gggtcggcaa ttgaaccggt gcctagagaa 1980
ggtggcgcgg ggtaaactgg gaaagtgatg tcgtgtactg gctccgcctt tttcccgagg 2040
gtgggggaga accgtatata agtgcagtag tcgccgtgaa cgttcttttt cgcaacgggt 2100
ttgccgccag aacacaggta agtgccgtgt gtggttcccg cgggcctggc ctctttacgg 2160
gttatggccc ttgcgtgcct tgaattactt ccacctggct gcagtacgtg attcttgatc 2220
ccgagcttcg ggttggaagt gggtgggaga gttcgaggcc ttgcgcttaa ggagcccctt 2280
cgcctcgtgc ttgagttgag gcctggcctg ggcgctgggg ccgccgcgtg cgaatctggt 2340
ggcaccttcg cgcctgtctc gctgctttcg ataagtctct agccatttaa aatttttgat 2400
gacctgctgc gacgcttttt ttctggcaag atagtcttgt aaatgcgggc caagatctgc 2460
acactggtat ttcggttttt ggggccgcgg gcggcgacgg ggcccgtgcg tcccagcgca 2520
catgttcggc gaggcggggc ctgcgagcgc ggccaccgag aatcggacgg gggtagtctc 2580
aagctggccg gcctgctctg gtgcctggcc tcgcgccgcc gtgtatcgcc ccgccctggg 2640
cggcaaggct ggcccggtcg gcaccagttg cgtgagcgga aagatggccg cttcccggcc 2700
ctgctgcagg gagctcaaaa tggaggacgc ggcgctcggg agagcgggcg ggtgagtcac 2760
ccacacaaag gaaaagggcc tttccgtcct cagccgtcgc ttcatgtgac tccacggagt 2820
accgggcgcc gtccaggcac ctcgattagt tctggagctt ttggagtacg tcgtctttag 2880
gttgggggga ggggttttat gcgatggagt ttccccacac tgagtgggtg gagactgaag 2940
ttaggccagc ttggcacttg atgtaattct ccttggaatt tgcccttttt gagtttggat 3000
cttggttcat tctcaagcct cagacagtgg ttcaaagttt ttttcttcca tttcaggtgt 3060
cgtgaacgcg tgccgccacc atggtgagca agggcgagga gctgttcacc ggggtggtgc 3120
ccatcctggt cgagctggac ggcgacgtaa acggccacaa gttcagcgtg tccggcgagg 3180
gcgagggcga tgccacctac ggcaagctga ccctgaagtt catctgcacc accggcaagc 3240
tgcccgtgcc ctggcccacc ctcgtgacca ccctgaccta cggcgtgcag tgcttcagcc 3300
gctaccccga ccacatgaag cagcacgact tcttcaagtc cgccatgccc gaaggctacg 3360
tccaggagcg caccatcttc ttcaaggacg acggcaacta caagacccgc gccgaggtga 3420
agttcgaggg cgacaccctg gtgaaccgca tcgagctgaa gggcatcgac ttcaaggagg 3480
acggcaacat cctggggcac aagctggagt acaactacaa cagccacaac gtctatatca 3540
tggccgacaa gcagaagaac ggcatcaagg tgaacttcaa gatccgccac aacatcgagg 3600
acggcagcgt gcagctcgcc gaccactacc agcagaacac ccccatcggc gacggccccg 3660
tgctgctgcc cgacaaccac tacctgagca cccagtccgc cctgagcaaa gaccccaacg 3720
agaagcgcga tcacatggtc ctgctggagt tcgtgaccgc cgccgggatc actctcggca 3780
tggacgagct gtacaagaag ctcagccatg gcttcccgcc ggcggtggcg gcgcaggatg 3840
atggcacgct gcccatgtct tgtgcccagg agagcgggat ggaccgtcac cctgcagcct 3900
gtgcttctgc taggatcaat gtataactcg aggactgtgc cttctagttg ccagccatct 3960
gttgtttgcc cctcccccgt gccttccttg accctggaag gtgccactcc cactgtcctt 4020
tcctaataaa atgaggaaat tgcatcgcat tgtctgagta ggtgtcattc tattctgggg 4080
ggtggggtgg ggcaggacag caagggggag gattgggaag agaatagcag gcatggcggc 4140
ctaccgttcg tataggctat agtatacgaa gttatgtgag caaaaggcca gcaaaaggcc 4200
aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag 4260
catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac 4320
caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc 4380
ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt 4440
aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc 4500
gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga 4560
cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta 4620
ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta 4680
tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga 4740
tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg 4800
cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag 4860
tggaacgaaa actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc 4920
tagatccttt taaattaaaa atgaagtttt aaatcaatct aaagtatata tgagtaaact 4980
tggtctgaca gttaccaatg cttaatcagt gaggcaccta tctcagcgat ctgtctattt 5040
cgttcatcca tagttgcctg actccccgtc gtgtagataa ctacgatacg ggagggctta 5100
ccatctggcc ccagtgctgc aatgataccg cgagacccac gctcaccggc tccagattta 5160
tcagcaataa accagccagc cggaagggcc gagcgcagaa gtggtcctgc aactttatcc 5220
gcctccatcc agtctattaa ttgttgccgg gaagctagag taagtagttc gccagttaat 5280
agtttgcgca acgttgttgc cattgctgca ggcatcgtgg tgtcacgctc gtcgtttggt 5340
atggcttcat tcagctccgg ttcccaacga tcaaggcgag ttacatgatc ccccatgttg 5400
tgcaaaaaag cggttagctc cttcggtcct ccgatcgttg tcagaagtaa gttggccgca 5460
gtgttatcac tcatggttat ggcagcactg cataattctc ttactgtcat gccatccgta 5520
agatgctttt ctgtgactgg tgagtactca accaagtcat tctgagaata gtgtatgcgg 5580
cgaccgagtt gctcttgccc ggcgtcaaca cgggataata ccgcgccaca tagcagaact 5640
ttaaaagtgc tcatcattgg aaaacgttct tcggggcgaa aactctcaag gatcttaccg 5700
ctgttgagat ccagttcgat gtaacccact cgtgcaccca actgatcttc agcatctttt 5760
actttcacca gcgtttctgg gtgagcaaaa acaggaaggc aaaatgccgc aaaaaaggga 5820
ataagggcga cacggaaatg ttgaatactc atactcttcc tttttcaata ttattgaagc 5880
atttatcagg gttattgtct catgagcgga tacatatttg aatgtattta ga 5932
<210> 13
<211> 1026
<212> DNA
<213> Artificial Sequence
<400> 13
atgaaaaagc ctgaactcac cgcgacgtct gtcgagaagt ttctgatcga aaagttcgac 60
agcgtctccg acctgatgca gctctcggag ggcgaagaat ctcgtgcttt cagcttcgat 120
gtaggagggc gtggatatgt cctgcgggta aatagctgcg ccgatggttt ctacaaagat 180
cgttatgttt atcggcactt tgcatcggcc gcgctcccga ttccggaagt gcttgacatt 240
ggggaattca gcgagagcct gacctattgc atctcccgcc gtgcacaggg tgtcacgttg 300
caagacctgc ctgaaaccga actgcccgct gttctgcagc cggtcgcgga ggccatggat 360
gcgatcgctg cggccgatct tagccagacg agcgggttcg gcccattcgg accgcaagga 420
atcggtcaat acactacatg gcgtgatttc atatgcgcga ttgctgatcc ccatgtgtat 480
cactggcaaa ctgtgatgga cgacaccgtc agtgcgtccg tcgcgcaggc tctcgatgag 540
ctgatgcttt gggccgagga ctgccccgaa gtccggcacc tcgtgcacgc ggatttcggc 600
tccaacaatg tcctgacgga caatggccgc ataacagcgg tcattgactg gagcgaggcg 660
atgttcgggg attcccaata cgaggtcgcc aacatcttct tctggaggcc gtggttggct 720
tgtatggagc agcagacgcg ctacttcgag cggaggcatc cggagcttgc aggatcgccg 780
cggctccggg cgtatatgct ccgcattggt cttgaccaac tctatcagag cttggttgac 840
ggcaatttcg atgatgcagc ttgggcgcag ggtcgatgcg acgcaatcgt ccgatccgga 900
gccgggactg tcgggcgtac acaaatcgcc cgcagaagcg cggccgtctg gaccgatggc 960
tgtgtagaag tactcgccga tagtggaaac cgacgcccca gcactcgtcc gagggcaaag 1020
gaatag 1026
<210> 14
<211> 1184
<212> DNA
<213> Artificial Sequence
<400> 14
cgtgaggctc cggtgcccgt cagtgggcag agcgcacatc gcccacagtc cccgagaagt 60
tggggggagg ggtcggcaat tgaaccggtg cctagagaag gtggcgcggg gtaaactggg 120
aaagtgatgt cgtgtactgg ctccgccttt ttcccgaggg tgggggagaa ccgtatataa 180
gtgcagtagt cgccgtgaac gttctttttc gcaacgggtt tgccgccaga acacaggtaa 240
gtgccgtgtg tggttcccgc gggcctggcc tctttacggg ttatggccct tgcgtgcctt 300
gaattacttc cacctggctg cagtacgtga ttcttgatcc cgagcttcgg gttggaagtg 360
ggtgggagag ttcgaggcct tgcgcttaag gagccccttc gcctcgtgct tgagttgagg 420
cctggcctgg gcgctggggc cgccgcgtgc gaatctggtg gcaccttcgc gcctgtctcg 480
ctgctttcga taagtctcta gccatttaaa atttttgatg acctgctgcg acgctttttt 540
tctggcaaga tagtcttgta aatgcgggcc aagatctgca cactggtatt tcggtttttg 600
gggccgcggg cggcgacggg gcccgtgcgt cccagcgcac atgttcggcg aggcggggcc 660
tgcgagcgcg gccaccgaga atcggacggg ggtagtctca agctggccgg cctgctctgg 720
tgcctggcct cgcgccgccg tgtatcgccc cgccctgggc ggcaaggctg gcccggtcgg 780
caccagttgc gtgagcggaa agatggccgc ttcccggccc tgctgcaggg agctcaaaat 840
ggaggacgcg gcgctcggga gagcgggcgg gtgagtcacc cacacaaagg aaaagggcct 900
ttccgtcctc agccgtcgct tcatgtgact ccacggagta ccgggcgccg tccaggcacc 960
tcgattagtt ctggagcttt tggagtacgt cgtctttagg ttggggggag gggttttatg 1020
cgatggagtt tccccacact gagtgggtgg agactgaagt taggccagct tggcacttga 1080
tgtaattctc cttggaattt gccctttttg agtttggatc ttggttcatt ctcaagcctc 1140
agacagtggt tcaaagtttt tttcttccat ttcaggtgtc gtga 1184
<210> 15
<211> 1506
<212> DNA
<213> Artificial Sequence
<400> 15
atggccaccg agtacaagcc cacggtgcgc ctcgccaccc gcgacgacgt cccccgggcc 60
gtacgcaccc tcgccgccgc gttcgccgac taccccgcca cgcgccacac cgtcgacccg 120
gaccgccaca tcgagcgggt caccgagctg caagaactct tcctcacgcg cgtcgggctc 180
gacatcggca aggtgtgggt cgcggacgac ggcgccgcgg tggcggtctg gaccacgccg 240
gagagcgtcg aagcgggggc ggtgttcgcc gagatcggct cgcgcatggc cgagttgagc 300
ggttcccggc tggccgcgca gcaacagatg gaaggcctcc tggcgccgca ccggcccaag 360
gagcccgcgt ggttcctggc caccgtcggc gtctcgcccg accaccaggg caagggtctg 420
ggcagcgccg tcgtgctccc cggagtggag gcggccgagc gcgctggggt gcccgccttc 480
ctggagacct ccgcgccccg caacctcccc ttctacgagc ggctcggctt caccgtcacc 540
gccgacgtcg aggtgcccga aggaccgcgc acctggtgca tgacccgcaa gcccggtgcc 600
ggcagcgagg gcagaggcag cctgctgacc tgcggcgacg tggaggagaa ccccggcccc 660
atggtgagca agggcgagga gctgttcacc ggggtggtgc ccatcctggt cgagctggac 720
ggcgacgtaa acggccacaa gttcagcgtg tccggcgagg gcgagggcga tgccacctac 780
ggcaagctga ccctgaagtt catctgcacc accggcaagc tgcccgtgcc ctggcccacc 840
ctcgtgacca ccctgaccta cggcgtgcag tgcttcagcc gctaccccga ccacatgaag 900
cagcacgact tcttcaagtc cgccatgccc gaaggctacg tccaggagcg caccatcttc 960
ttcaaggacg acggcaacta caagacccgc gccgaggtga agttcgaggg cgacaccctg 1020
gtgaaccgca tcgagctgaa gggcatcgac ttcaaggagg acggcaacat cctggggcac 1080
aagctggagt acaactacaa cagccacaac gtctatatca tggccgacaa gcagaagaac 1140
ggcatcaagg tgaacttcaa gatccgccac aacatcgagg acggcagcgt gcagctcgcc 1200
gaccactacc agcagaacac ccccatcggc gacggccccg tgctgctgcc cgacaaccac 1260
tacctgagca cccagtccgc cctgagcaaa gaccccaacg agaagcgcga tcacatggtc 1320
ctgctggagt tcgtgaccgc cgccgggatc actctcggca tggacgagct gtacaagaag 1380
ctcagccatg gcttcccgcc ggcggtggcg gcgcaggatg atggcacgct gcccatgtct 1440
tgtgcccagg agagcgggat ggaccgtcac cctgcagcct gtgcttctgc taggatcaat 1500
gtataa 1506
<210> 16
<211> 19
<212> PRT
<213> Artificial Sequence
<400> 16
Ala Thr Asn Phe Ser Leu Leu Lys Gln Ala Gly Asp Val Glu Glu Asn
1 5 10 15
Pro Gly Pro
<210> 17
<211> 2256
<212> DNA
<213> Artificial Sequence
<400> 17
atggtgttgt tggagagcga gcagttcctg acggagctga ccagactttt ccagaagtgc 60
cggacgtcgg gcagcgtcta tatcaccttg aagaagtatg acggtcgaac caaacccatt 120
ccaaagaaag gtactgtgga gggctttgag cccgcagaca acaagtgtct gttaagagct 180
accgatggga agaagaagat cagcactgtg gtgagctcca aggaagtgaa taagtttcag 240
atggcttatt caaacctcct tagagctaac atggatgggt tgaagaagag agacaaaaag 300
aacaaaacta agaagaccaa agcagcagca gcagcagcag cagcagcacc tgccgcagca 360
gcaacagcag cacagggcag cgagggaagg ggaagcctgc tcacatgcgg cgacgtcgaa 420
gagaaccctg gccccatgcc gcagtaccag acctgggagg agttcagccg cgctgccgag 480
aagctttacc tcgctgaccc tatgaaggca cgtgtggttc tcaaatatag gcattctgat 540
gggaacttgt gtgttaaagt aacagatgat ttagttagac agtgtcttgc tctattgctc 600
aggctgcagt gcagtggcat gatcatagct cactgcatcc tcgacctcct gggctcaagc 660
ggtcctcttg cttcagcctc cggagccacc aacttcagcc tgctgaagca ggccggcgat 720
gtggaggaga atcctggccc catggttcta gcagaccttg gaagaaaaat aacatcagca 780
ttacgctcgt tgagcaatgc caccattatc aatgaagagg tattgaatgc tatgctaaaa 840
gaagtctgta ccgctttgtt ggaagcagat gttaatatta aactagtgaa gcaactaaga 900
gaaaatgtta agtctgctat tgatcttgaa gagatggcat ctggtcttaa caaaagaaaa 960
atgattcagc atgctgtatt taaagaactt gtgaagcttg tagaccctgg agttaaggca 1020
tggacaccca ctaaaggaaa acaaaatgtg attatgtttg ttggattgca agggagtggt 1080
aaaacaacaa catgttcaaa gctagcatat tattaccaga ggaaaggttg gaagacctgt 1140
ttaatatgtg cagacacatt cagagcaggg gcttttgacc aactaaaaca gaatgctacc 1200
aaagcaagaa ttccatttta tggaagctat acagaaatgg atcctgtcat cattgcttct 1260
gaaggagtag agaaatttaa aaatgaaaat tttgaaatta ttattgttga tacaagtggc 1320
cgccacaaac aagaagactc tttgtttgaa gaaatgcttc aagttgctaa tgctatacaa 1380
cctgataaca ttgtttatgt gatggatgcc tccattgggc aggcttgtga agcccaggct 1440
aaggctttta aagataaagt agatgtagcc tcagtaatag tgacaaaact tgatggccat 1500
gcaaaaggag gtggtgcact cagtgcagtc gctgccacaa aaagtccgat tattttcatt 1560
ggtacagggg aacatataga tgactttgaa cctttcaaaa cacagccttt tattagcaaa 1620
cttcttggta tgggcgacat tgaaggactg atagataaag tcaacgagtt gaagttggat 1680
gacaatgaag cacttataga gaagttgaaa catggtcagt ttacgttgcg agacatgtat 1740
gagcaatttc aaaatatcat gaaaatgggc cccttcagtc agatcttggg gatgatccct 1800
ggttttggga cagattttat gagcaaagga aatgaacagg agtcaatggc aaggctaaag 1860
aaattaatga caataatgga tagtatgaat gatcaagaac tagacagtac ggatggtgcc 1920
aaagttttta gtaaacaacc aggaagaatc caaagagtag caagaggatc gggtgtatca 1980
acaagagatg ttcaagaact tttgacacaa tataccaagt ttgcacagat ggtaaaaaag 2040
atgggaggta tcaaaggact tttcaaaggt ggcgacatgt ctaagaatgt gagccagtca 2100
cagatggcaa aattgaacca acaaatggcc aaaatgatgg atcctagggt tcttcatcac 2160
atgggtggta tggcaggact tcagtcaatg atgaggcagt ttcaacaggg tgctgctggc 2220
aacatgaaag gcatgatggg attcaataat atgtaa 2256
<210> 18
<211> 3216
<212> DNA
<213> Artificial Sequence
<400> 18
atggactata aggaccacga cggagactac aaggatcatg atattgatta caaagacgat 60
gacgataaga tggccccaaa gaagaagcgg aaggtcggta tccacggagt cccagcagcc 120
gtagatttga gaactttggg atattcacag cagcagcagg aaaagatcaa gcccaaagtg 180
aggtcgacag tcgcgcagca tcacgaagcg ctggtgggtc atgggtttac acatgcccac 240
atcgtagcct tgtcgcagca ccctgcagcc cttggcacgg tcgccgtcaa gtaccaggac 300
atgattgcgg cgttgccgga agccacacat gaggcgatcg tcggtgtggg gaaacagtgg 360
agcggagccc gagcgcttga ggccctgttg acggtcgcgg gagagctgag agggcctccc 420
cttcagctgg acacgggcca gttgctgaag atcgcgaagc ggggaggagt cacggcggtc 480
gaggcggtgc acgcgtggcg caatgcgctc acgggagcac ccctcaacct gaccccagag 540
caggtcgtgg caattgcgag caacaacggg ggaaagcagg cactcgaaac cgtccagagg 600
ttgctgcctg tgctgtgcca agcgcacgga cttacgccag agcaggtcgt ggcaattgcg 660
agcaacatcg ggggaaagca ggcactcgaa accgtccaga ggttgctgcc tgtgctgtgc 720
caagcgcacg gactaacccc agagcaggtc gtggcaattg cgagcaacat cgggggaaag 780
caggcactcg aaaccgtcca gaggttgctg cctgtgctgt gccaagcgca cgggttgacc 840
ccagagcagg tcgtggcaat tgcgagcaac atcgggggaa agcaggcact cgaaaccgtc 900
cagaggttgc tgcctgtgct gtgccaagcg cacggcctga ccccagagca ggtcgtggca 960
attgcgagca acaacggggg aaagcaggca ctcgaaaccg tccagaggtt gctgcctgtg 1020
ctgtgccaag cgcacggact gacaccagag caggtcgtgg caattgcgag caacggaggg 1080
ggaaagcagg cactcgaaac cgtccagagg ttgctgcctg tgctgtgcca agcgcacgga 1140
cttacacccg aacaagtcgt ggcaattgcg agcaacaacg ggggaaagca ggcactcgaa 1200
accgtccaga ggttgctgcc tgtgctgtgc caagcgcacg gacttacgcc agagcaggtc 1260
gtggcaattg cgagcaacaa cgggggaaag caggcactcg aaaccgtcca gaggttgctg 1320
cctgtgctgt gccaagcgca cggactaacc ccagagcagg tcgtggcaat tgcgagcaac 1380
atcgggggaa agcaggcact cgaaaccgtc cagaggttgc tgcctgtgct gtgccaagcg 1440
cacgggttga ccccagagca ggtcgtggca attgcgagca acggaggggg aaagcaggca 1500
ctcgaaaccg tccagaggtt gctgcctgtg ctgtgccaag cgcacggcct gaccccagag 1560
caggtcgtgg caattgcgag caacatcggg ggaaagcagg cactcgaaac cgtccagagg 1620
ttgctgcctg tgctgtgcca agcgcacgga ctgacaccag agcaggtcgt ggcaattgcg 1680
agcaacatcg ggggaaagca ggcactcgaa accgtccaga ggttgctgcc tgtgctgtgc 1740
caagcgcacg gcctcacccc agagcaggtc gtggcaattg cgagcaacat cgggggaaag 1800
caggcactcg aaaccgtcca gaggttgctg cctgtgctgt gccaagcgca cggacttacg 1860
ccagagcagg tcgtggcaat tgcgagcaac atcgggggaa agcaggcact cgaaaccgtc 1920
cagaggttgc tgcctgtgct gtgccaagcg cacggactaa ccccagagca ggtcgtggca 1980
attgcgagca acatcggggg aaagcaggca ctcgaaaccg tccagaggtt gctgcctgtg 2040
ctgtgccaag cgcacgggtt gaccccagag caggtcgtgg caattgcgag caacatcggg 2100
ggaaagcagg cactcgaaac cgtccagagg ttgctgcctg tgctgtgcca agcgcacggc 2160
ctgaccccag agcaggtcgt ggcaattgcg agcaacatcg ggggaaagca ggcactcgaa 2220
accgtccaga ggttgctgcc tgtgctgtgc caagcgcacg gactgacacc agagcaggtc 2280
gtggcaattg cgagcaacaa cgggggaaag caggcactcg aaaccgtcca gaggttgctg 2340
cctgtgctgt gccaagcgca cggactcacg cctgagcagg tagtggctat tgcatccaac 2400
atcgggggca gacccgcact ggagtcaatc gtggcccagc tttcgaggcc ggaccccgcg 2460
ctggccgcac tcactaatga tcatcttgta gcgctggcct gcctcggcgg acgacccgcc 2520
ttggatgcgg tgaagaaggg gctcccgcac gcgcctgcat tgattaagcg gaccaacaga 2580
aggattcccg agaggacatc acatcgagtg gcaggttccc aactcgtgaa gagtgaactt 2640
gaggagaaaa agtcggagct gcggcacaaa ttgaaatacg taccgcatga atacatcgaa 2700
cttatcgaaa ttgctaggaa ctcgactcaa gacagaatcc ttgagatgaa ggtaatggag 2760
ttctttatga aggtttatgg ataccgaggg aagcatctcg gtggatcacg aaaacccgac 2820
ggagcaatct atacggtggg gagcccgatt gattacggag tgatcgtcga cacgaaagcc 2880
tacagcggtg ggtacaatct tcccatcggg caggcagatg agatgcaacg ttatgtcgaa 2940
gaaaatcaga ccaggaacaa acacatcaat ccaaatgagt ggtggaaagt gtatccttca 3000
tcagtgaccg agtttaagtt tttgtttgtc tctgggcatt tcaaaggcaa ctataaggcc 3060
cagctcacac ggttgaatca cattacgaac tgcaatggtg cggttttgtc cgtagaggaa 3120
ctgctcattg gtggagaaat gatcaaagcg ggaactctga cactggaaga agtcagacgc 3180
aagtttaaca atggcgagat caatttccgc tcataa 3216
<210> 19
<211> 3216
<212> DNA
<213> Artificial Sequence
<400> 19
atggactata aggaccacga cggagactac aaggatcatg atattgatta caaagacgat 60
gacgataaga tggccccaaa gaagaagcgg aaggtcggta tccacggagt cccagcagcc 120
gtagatttga gaactttggg atattcacag cagcagcagg aaaagatcaa gcccaaagtg 180
aggtcgacag tcgcgcagca tcacgaagcg ctggtgggtc atgggtttac acatgcccac 240
atcgtagcct tgtcgcagca ccctgcagcc cttggcacgg tcgccgtcaa gtaccaggac 300
atgattgcgg cgttgccgga agccacacat gaggcgatcg tcggtgtggg gaaacagtgg 360
agcggagccc gagcgcttga ggccctgttg acggtcgcgg gagagctgag agggcctccc 420
cttcagctgg acacgggcca gttgctgaag atcgcgaagc ggggaggagt cacggcggtc 480
gaggcggtgc acgcgtggcg caatgcgctc acgggagcac ccctcaacct gaccccagag 540
caggtcgtgg caattgcgag ccatgacggg ggaaagcagg cactcgaaac cgtccagagg 600
ttgctgcctg tgctgtgcca agcgcacgga cttacgccag agcaggtcgt ggcaattgcg 660
agccatgacg ggggaaagca ggcactcgaa accgtccaga ggttgctgcc tgtgctgtgc 720
caagcgcacg gactaacccc agagcaggtc gtggcaattg cgagcaacgg agggggaaag 780
caggcactcg aaaccgtcca gaggttgctg cctgtgctgt gccaagcgca cgggttgacc 840
ccagagcagg tcgtggcaat tgcgagcaac ggagggggaa agcaggcact cgaaaccgtc 900
cagaggttgc tgcctgtgct gtgccaagcg cacggcctga ccccagagca ggtcgtggca 960
attgcgagca acggaggggg aaagcaggca ctcgaaaccg tccagaggtt gctgcctgtg 1020
ctgtgccaag cgcacggact gacaccagag caggtcgtgg caattgcgag caacatcggg 1080
ggaaagcagg cactcgaaac cgtccagagg ttgctgcctg tgctgtgcca agcgcacgga 1140
cttacacccg aacaagtcgt ggcaattgcg agcaacatcg ggggaaagca ggcactcgaa 1200
accgtccaga ggttgctgcc tgtgctgtgc caagcgcacg gacttacgcc agagcaggtc 1260
gtggcaattg cgagccatga cgggggaaag caggcactcg aaaccgtcca gaggttgctg 1320
cctgtgctgt gccaagcgca cggactaacc ccagagcagg tcgtggcaat tgcgagcaac 1380
atcgggggaa agcaggcact cgaaaccgtc cagaggttgc tgcctgtgct gtgccaagcg 1440
cacgggttga ccccagagca ggtcgtggca attgcgagca acatcggggg aaagcaggca 1500
ctcgaaaccg tccagaggtt gctgcctgtg ctgtgccaag cgcacggcct gaccccagag 1560
caggtcgtgg caattgcgag caacatcggg ggaaagcagg cactcgaaac cgtccagagg 1620
ttgctgcctg tgctgtgcca agcgcacgga ctgacaccag agcaggtcgt ggcaattgcg 1680
agcaacaacg ggggaaagca ggcactcgaa accgtccaga ggttgctgcc tgtgctgtgc 1740
caagcgcacg gcctcacccc agagcaggtc gtggcaattg cgagcaacat cgggggaaag 1800
caggcactcg aaaccgtcca gaggttgctg cctgtgctgt gccaagcgca cggacttacg 1860
ccagagcagg tcgtggcaat tgcgagcaac atcgggggaa agcaggcact cgaaaccgtc 1920
cagaggttgc tgcctgtgct gtgccaagcg cacggactaa ccccagagca ggtcgtggca 1980
attgcgagca acaacggggg aaagcaggca ctcgaaaccg tccagaggtt gctgcctgtg 2040
ctgtgccaag cgcacgggtt gaccccagag caggtcgtgg caattgcgag caacaacggg 2100
ggaaagcagg cactcgaaac cgtccagagg ttgctgcctg tgctgtgcca agcgcacggc 2160
ctgaccccag agcaggtcgt ggcaattgcg agcaacaacg ggggaaagca ggcactcgaa 2220
accgtccaga ggttgctgcc tgtgctgtgc caagcgcacg gactgacacc agagcaggtc 2280
gtggcaattg cgagcaacgg agggggaaag caggcactcg aaaccgtcca gaggttgctg 2340
cctgtgctgt gccaagcgca cggactcacg cctgagcagg tagtggctat tgcatcccat 2400
gacgggggca gacccgcact ggagtcaatc gtggcccagc tttcgaggcc ggaccccgcg 2460
ctggccgcac tcactaatga tcatcttgta gcgctggcct gcctcggcgg acgacccgcc 2520
ttggatgcgg tgaagaaggg gctcccgcac gcgcctgcat tgattaagcg gaccaacaga 2580
aggattcccg agaggacatc acatcgagtg gcaggttccc aactcgtgaa gagtgaactt 2640
gaggagaaaa agtcggagct gcggcacaaa ttgaaatacg taccgcatga atacatcgaa 2700
cttatcgaaa ttgctaggaa ctcgactcaa gacagaatcc ttgagatgaa ggtaatggag 2760
ttctttatga aggtttatgg ataccgaggg aagcatctcg gtggatcacg aaaacccgac 2820
ggagcaatct atacggtggg gagcccgatt gattacggag tgatcgtcga cacgaaagcc 2880
tacagcggtg ggtacaatct tcccatcggg caggcagatg agatgcaacg ttatgtcgaa 2940
gaaaatcaga ccaggaacaa acacatcaat ccaaatgagt ggtggaaagt gtatccttca 3000
tcagtgaccg agtttaagtt tttgtttgtc tctgggcatt tcaaaggcaa ctataaggcc 3060
cagctcacac ggttgaatca cattacgaac tgcaatggtg cggttttgtc cgtagaggaa 3120
ctgctcattg gtggagaaat gatcaaagcg ggaactctga cactggaaga agtcagacgc 3180
aagtttaaca atggcgagat caatttccgc tcataa 3216
<210> 20
<211> 24
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 20
taaatctgtt gattccaggt tccc 24
<210> 21
<211> 26
<212> DNA
<213> Artificial Sequence
<400> 21
gtactaagaa gtgtggtact gtgttg 26
<210> 22
<211> 3222
<212> DNA
<213> Artificial Sequence
<400> 22
atggactaca aggaccacga cggcgactac aaggaccacg acatcgacta caaggacgac 60
gacgacaaga tggcccccaa gaagaagaga aaggtgggca tccacagagg cgtgcccatg 120
gtggacctga gaaccctggg ctacagccag cagcagcagg agaagatcaa gcccaaggtg 180
agaagcaccg tggcccagca ccacgaggcc ctggtgggcc acggcttcac ccacgcccac 240
atcgtggccc tgagccagca ccccgccgcc ctgggcaccg tggccgtgaa gtaccaggac 300
atgatcgccg ccctgcccga ggccacccac gaggccatcg tgggcgtggg caagcagtgg 360
agcggcgcca gagccctgga ggccctgctg accgtggccg gcgagctgag aggccccccc 420
ctccagctgg acaccggcca gctgctgaag atcgccaaga gaggcggcgt gaccgccgtg 480
gaggccgtgc acgcctggag aaacgccctg accggcgccc ccctgaattt gacacccgac 540
caagttgtgg ccattgccag caacggtgga gggaaacaag cattggagac tgtccaacgg 600
ctccttcccg tgttgtgtca agcccacggt ttgacccctg cacaagtggt cgccatcgcc 660
tcccatgacg gcggtaagca ggccctggaa acagtgcaac ggttgctccc tgtcttgtgt 720
caagatcatg gactgacccc agaccaggtg gtcgcaatcg cctctaacaa cgggggaaag 780
caagccctgg aaaccgtgca aaggttgttg cccgtccttt gtcaagacca cggccttaca 840
cccgagcaag tcgtggccat tgcatcaaac atcggtggca aacaggctct tgagactgtt 900
cagagacttc tcccagttct ctgccaggca cacgggctta ctcccgatca agttgtggcc 960
attgccagca acaacggagg gaaacaagca ttggagactg tccaacggct ccttcccgtg 1020
ttgtgtcaag cccacggttt gacccctgca caagtggtcg ccatcgcctc caacatcggc 1080
ggtaagcagg ccctggaaac agtgcagcgc ctgctgcctg tgctgtgcca ggatcatgga 1140
ctgaccccag accaggttgt cgccatcgcc tctaacatcg ggggaaagca agccctggaa 1200
accgtgcaaa ggttgttgcc cgtcctttgt caagaccacg gccttacacc cgagcaagtc 1260
gtggccattg catcaaacat cggtggcaaa caggctcttg agactgttca gagacttctc 1320
ccagttctct gtcaagccca cggtttgaca cccgaccaag ttgtggccat tgccagccat 1380
gacggaggga aacaagcatt ggagactgtc caacggctcc ttcccgtgtt gtgtcaagcc 1440
cacggtttga cccctgcaca agtggtcgcc atcgcctcca acggtggcgg taagcaggcc 1500
ctggaaacag tgcaacggtt gctccctgtc ttgtgtcaag atcatggact gaccccagac 1560
caggtggtcg caatcgcctc taacatcggg ggaaagcaag ccctggaaac cgtgcaaagg 1620
ttgttgcccg tcctttgtca agaccacggc cttacacccg agcaagtcgt ggccattgca 1680
tcaaacatcg gtggcaaaca ggctcttgag actgttcaga gacttctccc agttctctgc 1740
caggcacacg ggcttactcc cgatcaagtt gtggccattg ccagcaacaa cggagggaaa 1800
caagcattgg agactgtcca acggctcctt cccgtgttgt gtcaagccca cggtttgacc 1860
cctgcacaag tggtcgccat cgcctcccat gacggcggta agcaggccct ggaaacagtg 1920
cagcgcctgc tgcctgtgct gtgccaggat catggactga ccccagacca ggttgtcgcc 1980
atcgcctcta acatcggggg aaagcaagcc ctggaaaccg tgcaaaggtt gttgcccgtc 2040
ctttgtcaag accacggcct tacacccgag caagtcgtgg ccattgcatc aaacatcggt 2100
ggcaaacagg ctcttgagac tgttcagaga cttctcccag ttctctgtca agcccacggt 2160
ttgacacccg accaagttgt ggccattgcc agcaacaacg gagggaaaca agcattggag 2220
actgtccaac ggctccttcc cgtgttgtgt caagcccacg gtttgacccc tgcacaagtg 2280
gtcgccatcg cctcccatga cggcggtaag caggccctgg aaacagtgca acggttgctc 2340
cctgtcttgt gtcaagacca tgggctgacc cccgagcagg tggtggccat cgccagcaac 2400
aacggcggca gacccgccct ggagagcatc gtggcccagc tgagcagacc cgaccccgcc 2460
ctggccgccc tgaccaacga ccacctggtg gccctggcct gcctgggcgg cagacccgcc 2520
ctggacgccg tgaagaaggg cctgccccac gcccccgccc tgatcaagag aaccaacaga 2580
agaatccccg agagaaccag ccacagagtg gccggcagcc agctggtgaa gagcgagctg 2640
gaggagaaga agagcgagct gagacacaag ctgaagtacg tgccccacga gtacatcgag 2700
ctgatcgaga tcgccagaaa cagcacccag gacagaatcc tggagatgaa ggtgatggag 2760
ttcttcatga aggtgtacgg ctacagaggc aagcacctgg gcggcagcag aaagcccgac 2820
ggcgccatct acaccgtggg cagccccatc gactacggcg tgatcgtgga caccaaggcc 2880
tacagcggcg gctacaacct gcccatcggc caggccgacg agatgcagag atacgtggag 2940
gagaaccaga ccagaaacaa gcacatcaac cccaacgagt ggtggaaggt gtaccccagc 3000
agcgtgaccg agttcaagtt cctgttcgtg agcggccact tcaagggcaa ctacaaggcc 3060
cagctgacca gactgaacca catcaccaac tgcaacggcg ccgtgctgag cgtggaggag 3120
ctgctgatcg gcggcgagat gatcaaggcc ggcaccctga ccctggagga ggtgagaaga 3180
aagttcaaca acggcgagat caacttcaga agctctagat ga 3222
<210> 23
<211> 3222
<212> DNA
<213> Artificial Sequence
<400> 23
atggactaca aggaccacga cggcgactac aaggaccacg acatcgacta caaggacgac 60
gacgacaaga tggcccccaa gaagaagaga aaggtgggca tccacagagg cgtgcccatg 120
gtggacctga gaaccctggg ctacagccag cagcagcagg agaagatcaa gcccaaggtg 180
agaagcaccg tggcccagca ccacgaggcc ctggtgggcc acggcttcac ccacgcccac 240
atcgtggccc tgagccagca ccccgccgcc ctgggcaccg tggccgtgaa gtaccaggac 300
atgatcgccg ccctgcccga ggccacccac gaggccatcg tgggcgtggg caagcagtgg 360
agcggcgcca gagccctgga ggccctgctg accgtggccg gcgagctgag aggccccccc 420
ctccagctgg acaccggcca gctgctgaag atcgccaaga gaggcggcgt gaccgccgtg 480
gaggccgtgc acgcctggag aaacgccctg accggcgccc ccctgaattt gacacccgac 540
caagttgtgg ccattgccag caacggtgga gggaaacaag cattggagac tgtccaacgg 600
ctccttcccg tgttgtgtca agcccacggt ttgacccctg cacaagtggt cgccatcgcc 660
tccaacaacg gcggtaagca ggccctggaa acagtgcaac ggttgctccc tgtcttgtgt 720
caagatcatg gactgacccc agaccaggtg gtcgcaatcg cctctaacaa cgggggaaag 780
caagccctgg aaaccgtgca aaggttgttg cccgtccttt gtcaagacca cggccttaca 840
cccgagcaag tcgtggccat tgcatcaaac aacggtggca aacaggctct tgagactgtt 900
cagagacttc tcccagttct ctgccaggca cacgggctta ctcccgatca agttgtggcc 960
attgccagca acatcggagg gaaacaagca ttggagactg tccaacggct ccttcccgtg 1020
ttgtgtcaag cccacggttt gacccctgca caagtggtcg ccatcgcctc caacggtggc 1080
ggtaagcagg ccctggaaac agtgcagcgc ctgctgcctg tgctgtgcca ggatcatgga 1140
ctgaccccag accaggttgt cgccatcgcc tctcatgacg ggggaaagca agccctggaa 1200
accgtgcaaa ggttgttgcc cgtcctttgt caagaccacg gccttacacc cgagcaagtc 1260
gtggccattg catcaaacaa cggtggcaaa caggctcttg agactgttca gagacttctc 1320
ccagttctct gtcaagccca cggtttgaca cccgaccaag ttgtggccat tgccagcaac 1380
ggtggaggga aacaagcatt ggagactgtc caacggctcc ttcccgtgtt gtgtcaagcc 1440
cacggtttga cccctgcaca agtggtcgcc atcgcctcca acatcggcgg taagcaggcc 1500
ctggaaacag tgcaacggtt gctccctgtc ttgtgtcaag atcatggact gaccccagac 1560
caggtggtcg caatcgcctc taacaacggg ggaaagcaag ccctggaaac cgtgcaaagg 1620
ttgttgcccg tcctttgtca agaccacggc cttacacccg agcaagtcgt ggccattgca 1680
tcaaacaacg gtggcaaaca ggctcttgag actgttcaga gacttctccc agttctctgc 1740
caggcacacg ggcttactcc cgatcaagtt gtggccattg ccagccatga cggagggaaa 1800
caagcattgg agactgtcca acggctcctt cccgtgttgt gtcaagccca cggtttgacc 1860
cctgcacaag tggtcgccat cgcctccaac ggtggcggta agcaggccct ggaaacagtg 1920
cagcgcctgc tgcctgtgct gtgccaggat catggactga ccccagacca ggttgtcgcc 1980
atcgcctctc atgacggggg aaagcaagcc ctggaaaccg tgcaaaggtt gttgcccgtc 2040
ctttgtcaag accacggcct tacacccgag caagtcgtgg ccattgcatc aaacaacggt 2100
ggcaaacagg ctcttgagac tgttcagaga cttctcccag ttctctgtca agcccacggt 2160
ttgacacccg accaagttgt ggccattgcc agcaacatcg gagggaaaca agcattggag 2220
actgtccaac ggctccttcc cgtgttgtgt caagcccacg gtttgacccc tgcacaagtg 2280
gtcgccatcg cctccaacat cggcggtaag caggccctgg aaacagtgca acggttgctc 2340
cctgtcttgt gtcaagacca tgggctgacc cccgagcagg tggtggccat cgccagcaac 2400
ggtggcggca gacccgccct ggagagcatc gtggcccagc tgagcagacc cgaccccgcc 2460
ctggccgccc tgaccaacga ccacctggtg gccctggcct gcctgggcgg cagacccgcc 2520
ctggacgccg tgaagaaggg cctgccccac gcccccgccc tgatcaagag aaccaacaga 2580
agaatccccg agagaaccag ccacagagtg gccggcagcc agctggtgaa gagcgagctg 2640
gaggagaaga agagcgagct gagacacaag ctgaagtacg tgccccacga gtacatcgag 2700
ctgatcgaga tcgccagaaa cagcacccag gacagaatcc tggagatgaa ggtgatggag 2760
ttcttcatga aggtgtacgg ctacagaggc aagcacctgg gcggcagcag aaagcccgac 2820
ggcgccatct acaccgtggg cagccccatc gactacggcg tgatcgtgga caccaaggcc 2880
tacagcggcg gctacaacct gcccatcggc caggccgacg agatgcagag atacgtggag 2940
gagaaccaga ccagaaacaa gcacatcaac cccaacgagt ggtggaaggt gtaccccagc 3000
agcgtgaccg agttcaagtt cctgttcgtg agcggccact tcaagggcaa ctacaaggcc 3060
cagctgacca gactgaacca catcaccaac tgcaacggcg ccgtgctgag cgtggaggag 3120
ctgctgatcg gcggcgagat gatcaaggcc ggcaccctga ccctggagga ggtgagaaga 3180
aagttcaaca acggcgagat caacttcaga agctctagat ga 3222
<210> 24
<211> 24
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 24
ttgtacccgt tggagaagtg acag 24
<210> 25
<211> 27
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 25
gatgaactag gaaaggctca agatcac 27
<210> 26
<211> 3216
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 26
atggactata aggaccacga cggagactac aaggatcatg atattgatta caaagacgat 60
gacgataaga tggccccaaa gaagaagcgg aaggtcggta tccacggagt cccagcagcc 120
gtagatttga gaactttggg atattcacag cagcagcagg aaaagatcaa gcccaaagtg 180
aggtcgacag tcgcgcagca tcacgaagcg ctggtgggtc atgggtttac acatgcccac 240
atcgtagcct tgtcgcagca ccctgcagcc cttggcacgg tcgccgtcaa gtaccaggac 300
atgattgcgg cgttgccgga agccacacat gaggcgatcg tcggtgtggg gaaacagtgg 360
agcggagccc gagcgcttga ggccctgttg acggtcgcgg gagagctgag agggcctccc 420
cttcagctgg acacgggcca gttgctgaag atcgcgaagc ggggaggagt cacggcggtc 480
gaggcggtgc acgcgtggcg caatgcgctc acgggagcac ccctcaacct gaccccagag 540
caggtcgtgg caattgcgag caacggcggg ggaaagcagg cactcgaaac cgtccagagg 600
ttgctgcctg tgctgtgcca agcgcacgga cttacgccag agcaggtcgt ggcaattgcg 660
agcaacatcg ggggaaagca ggcactcgaa accgtccaga ggttgctgcc tgtgctgtgc 720
caagcgcacg gactaacccc agagcaggtc gtggcaattg cgagcaacat cgggggaaag 780
caggcactcg aaaccgtcca gaggttgctg cctgtgctgt gccaagcgca cgggttgacc 840
ccagagcagg tcgtggcaat tgcgagcaac aacgggggaa agcaggcact cgaaaccgtc 900
cagaggttgc tgcctgtgct gtgccaagcg cacggcctga ccccagagca ggtcgtggca 960
attgcgagcc acgacggggg aaagcaggca ctcgaaaccg tccagaggtt gctgcctgtg 1020
ctgtgccaag cgcacggact gacaccagag caggtcgtgg caattgcgag caacggaggg 1080
ggaaagcagg cactcgaaac cgtccagagg ttgctgcctg tgctgtgcca agcgcacgga 1140
cttacacccg aacaagtcgt ggcaattgcg agcaacaacg ggggaaagca ggcactcgaa 1200
accgtccaga ggttgctgcc tgtgctgtgc caagcgcacg gacttacgcc agagcaggtc 1260
gtggcaattg cgagcaacaa cgggggaaag caggcactcg aaaccgtcca gaggttgctg 1320
cctgtgctgt gccaagcgca cggactaacc ccagagcagg tcgtggcaat tgcgagcaac 1380
atcgggggaa agcaggcact cgaaaccgtc cagaggttgc tgcctgtgct gtgccaagcg 1440
cacgggttga ccccagagca ggtcgtggca attgcgagca acaacggggg aaagcaggca 1500
ctcgaaaccg tccagaggtt gctgcctgtg ctgtgccaag cgcacggcct gaccccagag 1560
caggtcgtgg caattgcgag caacaacggg ggaaagcagg cactcgaaac cgtccagagg 1620
ttgctgcctg tgctgtgcca agcgcacgga ctgacaccag agcaggtcgt ggcaattgcg 1680
agcaacatcg ggggaaagca ggcactcgaa accgtccaga ggttgctgcc tgtgctgtgc 1740
caagcgcacg gcctcacccc agagcaggtc gtggcaattg cgagcaacat cgggggaaag 1800
caggcactcg aaaccgtcca gaggttgctg cctgtgctgt gccaagcgca cggacttacg 1860
ccagagcagg tcgtggcaat tgcgagcaac aacgggggaa agcaggcact cgaaaccgtc 1920
cagaggttgc tgcctgtgct gtgccaagcg cacggactaa ccccagagca ggtcgtggca 1980
attgcgagca acatcggggg aaagcaggca ctcgaaaccg tccagaggtt gctgcctgtg 2040
ctgtgccaag cgcacgggtt gaccccagag caggtcgtgg caattgcgag caacggcggg 2100
ggaaagcagg cactcgaaac cgtccagagg ttgctgcctg tgctgtgcca agcgcacggc 2160
ctgaccccag agcaggtcgt ggcaattgcg agcaacaacg ggggaaagca ggcactcgaa 2220
accgtccaga ggttgctgcc tgtgctgtgc caagcgcacg gactgacacc agagcaggtc 2280
gtggcaattg cgagcaacaa cgggggaaag caggcactcg aaaccgtcca gaggttgctg 2340
cctgtgctgt gccaagcgca cggactcacg cctgagcagg tagtggctat tgcatccaac 2400
aacgggggca gacccgcact ggagtcaatc gtggcccagc tttcgaggcc ggaccccgcg 2460
ctggccgcac tcactaatga tcatcttgta gcgctggcct gcctcggcgg acgacccgcc 2520
ttggatgcgg tgaagaaggg gctcccgcac gcgcctgcat tgattaagcg gaccaacaga 2580
aggattcccg agaggacatc acatcgagtg gcaggttccc aactcgtgaa gagtgaactt 2640
gaggagaaaa agtcggagct gcggcacaaa ttgaaatacg taccgcatga atacatcgaa 2700
cttatcgaaa ttgctaggaa ctcgactcaa gacagaatcc ttgagatgaa ggtaatggag 2760
ttctttatga aggtttatgg ataccgaggg aagcatctcg gtggatcacg aaaacccgac 2820
ggagcaatct atacggtggg gagcccgatt gattacggag tgatcgtcga cacgaaagcc 2880
tacagcggtg ggtacaatct tcccatcggg caggcagatg agatgcaacg ttatgtcgaa 2940
gaaaatcaga ccaggaacaa acacatcaat ccaaatgagt ggtggaaagt gtatccttca 3000
tcagtgaccg agtttaagtt tttgtttgtc tctgggcatt tcaaaggcaa ctataaggcc 3060
cagctcacac ggttgaatca cattacgaac tgcaatggtg cggttttgtc cgtagaggaa 3120
ctgctcattg gtggagaaat gatcaaagcg ggaactctga cactggaaga agtcagacgc 3180
aagtttaaca atggcgagat caatttccgc tcataa 3216
<210> 27
<211> 3216
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 27
atggactata aggaccacga cggagactac aaggatcatg atattgatta caaagacgat 60
gacgataaga tggccccaaa gaagaagcgg aaggtcggta tccacggagt cccagcagcc 120
gtagatttga gaactttggg atattcacag cagcagcagg aaaagatcaa gcccaaagtg 180
aggtcgacag tcgcgcagca tcacgaagcg ctggtgggtc atgggtttac acatgcccac 240
atcgtagcct tgtcgcagca ccctgcagcc cttggcacgg tcgccgtcaa gtaccaggac 300
atgattgcgg cgttgccgga agccacacat gaggcgatcg tcggtgtggg gaaacagtgg 360
agcggagccc gagcgcttga ggccctgttg acggtcgcgg gagagctgag agggcctccc 420
cttcagctgg acacgggcca gttgctgaag atcgcgaagc ggggaggagt cacggcggtc 480
gaggcggtgc acgcgtggcg caatgcgctc acgggagcac ccctcaacct gaccccagag 540
caggtcgtgg caattgcgag caacaacggg ggaaagcagg cactcgaaac cgtccagagg 600
ttgctgcctg tgctgtgcca agcgcacgga cttacgccag agcaggtcgt ggcaattgcg 660
agcaacggcg ggggaaagca ggcactcgaa accgtccaga ggttgctgcc tgtgctgtgc 720
caagcgcacg gactaacccc agagcaggtc gtggcaattg cgagcaacat cgggggaaag 780
caggcactcg aaaccgtcca gaggttgctg cctgtgctgt gccaagcgca cgggttgacc 840
ccagagcagg tcgtggcaat tgcgagcaac ggcgggggaa agcaggcact cgaaaccgtc 900
cagaggttgc tgcctgtgct gtgccaagcg cacggcctga ccccagagca ggtcgtggca 960
attgcgagca acatcggggg aaagcaggca ctcgaaaccg tccagaggtt gctgcctgtg 1020
ctgtgccaag cgcacggact gacaccagag caggtcgtgg caattgcgag caacaacggg 1080
ggaaagcagg cactcgaaac cgtccagagg ttgctgcctg tgctgtgcca agcgcacgga 1140
cttacacccg aacaagtcgt ggcaattgcg agccacgacg ggggaaagca ggcactcgaa 1200
accgtccaga ggttgctgcc tgtgctgtgc caagcgcacg gacttacgcc agagcaggtc 1260
gtggcaattg cgagccacga cgggggaaag caggcactcg aaaccgtcca gaggttgctg 1320
cctgtgctgt gccaagcgca cggactaacc ccagagcagg tcgtggcaat tgcgagcaac 1380
atcgggggaa agcaggcact cgaaaccgtc cagaggttgc tgcctgtgct gtgccaagcg 1440
cacgggttga ccccagagca ggtcgtggca attgcgagca acggcggggg aaagcaggca 1500
ctcgaaaccg tccagaggtt gctgcctgtg ctgtgccaag cgcacggcct gaccccagag 1560
caggtcgtgg caattgcgag caacggcggg ggaaagcagg cactcgaaac cgtccagagg 1620
ttgctgcctg tgctgtgcca agcgcacgga ctgacaccag agcaggtcgt ggcaattgcg 1680
agccacgacg ggggaaagca ggcactcgaa accgtccaga ggttgctgcc tgtgctgtgc 1740
caagcgcacg gcctcacccc agagcaggtc gtggcaattg cgagcaacgg cgggggaaag 1800
caggcactcg aaaccgtcca gaggttgctg cctgtgctgt gccaagcgca cggacttacg 1860
ccagagcagg tcgtggcaat tgcgagcaac atcgggggaa agcaggcact cgaaaccgtc 1920
cagaggttgc tgcctgtgct gtgccaagcg cacggactaa ccccagagca ggtcgtggca 1980
attgcgagca acaacggggg aaagcaggca ctcgaaaccg tccagaggtt gctgcctgtg 2040
ctgtgccaag cgcacgggtt gaccccagag caggtcgtgg caattgcgag caacaacggg 2100
ggaaagcagg cactcgaaac cgtccagagg ttgctgcctg tgctgtgcca agcgcacggc 2160
ctgaccccag agcaggtcgt ggcaattgcg agcaacaacg ggggaaagca ggcactcgaa 2220
accgtccaga ggttgctgcc tgtgctgtgc caagcgcacg gactgacacc agagcaggtc 2280
gtggcaattg cgagcaacat cgggggaaag caggcactcg aaaccgtcca gaggttgctg 2340
cctgtgctgt gccaagcgca cggactcacg cctgagcagg tagtggctat tgcatccaac 2400
atcgggggca gacccgcact ggagtcaatc gtggcccagc tttcgaggcc ggaccccgcg 2460
ctggccgcac tcactaatga tcatcttgta gcgctggcct gcctcggcgg acgacccgcc 2520
ttggatgcgg tgaagaaggg gctcccgcac gcgcctgcat tgattaagcg gaccaacaga 2580
aggattcccg agaggacatc acatcgagtg gcaggttccc aactcgtgaa gagtgaactt 2640
gaggagaaaa agtcggagct gcggcacaaa ttgaaatacg taccgcatga atacatcgaa 2700
cttatcgaaa ttgctaggaa ctcgactcaa gacagaatcc ttgagatgaa ggtaatggag 2760
ttctttatga aggtttatgg ataccgaggg aagcatctcg gtggatcacg aaaacccgac 2820
ggagcaatct atacggtggg gagcccgatt gattacggag tgatcgtcga cacgaaagcc 2880
tacagcggtg ggtacaatct tcccatcggg caggcagatg agatgcaacg ttatgtcgaa 2940
gaaaatcaga ccaggaacaa acacatcaat ccaaatgagt ggtggaaagt gtatccttca 3000
tcagtgaccg agtttaagtt tttgtttgtc tctgggcatt tcaaaggcaa ctataaggcc 3060
cagctcacac ggttgaatca cattacgaac tgcaatggtg cggttttgtc cgtagaggaa 3120
ctgctcattg gtggagaaat gatcaaagcg ggaactctga cactggaaga agtcagacgc 3180
aagtttaaca atggcgagat caatttccgc tcataa 3216
<210> 28
<211> 22
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 28
ctaaaggctg ctcccactct ac 22
<210> 29
<211> 22
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 29
caaagtggaa cttgggttga gg 22
<210> 30
<211> 3216
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 30
atggactata aggaccacga cggagactac aaggatcatg atattgatta caaagacgat 60
gacgataaga tggccccaaa gaagaagcgg aaggtcggta tccacggagt cccagcagcc 120
gtagatttga gaactttggg atattcacag cagcagcagg aaaagatcaa gcccaaagtg 180
aggtcgacag tcgcgcagca tcacgaagcg ctggtgggtc atgggtttac acatgcccac 240
atcgtagcct tgtcgcagca ccctgcagcc cttggcacgg tcgccgtcaa gtaccaggac 300
atgattgcgg cgttgccgga agccacacat gaggcgatcg tcggtgtggg gaaacagtgg 360
agcggagccc gagcgcttga ggccctgttg acggtcgcgg gagagctgag agggcctccc 420
cttcagctgg acacgggcca gttgctgaag atcgcgaagc ggggaggagt cacggcggtc 480
gaggcggtgc acgcgtggcg caatgcgctc acgggagcac ccctcaacct gaccccagag 540
caggtcgtgg caattgcgag caacaacggg ggaaagcagg cactcgaaac cgtccagagg 600
ttgctgcctg tgctgtgcca agcgcacgga cttacgccag agcaggtcgt ggcaattgcg 660
agcaacaacg ggggaaagca ggcactcgaa accgtccaga ggttgctgcc tgtgctgtgc 720
caagcgcacg gactaacccc agagcaggtc gtggcaattg cgagcaacgg cgggggaaag 780
caggcactcg aaaccgtcca gaggttgctg cctgtgctgt gccaagcgca cgggttgacc 840
ccagagcagg tcgtggcaat tgcgagcaac aacgggggaa agcaggcact cgaaaccgtc 900
cagaggttgc tgcctgtgct gtgccaagcg cacggcctga ccccagagca ggtcgtggca 960
attgcgagca acaacggggg aaagcaggca ctcgaaaccg tccagaggtt gctgcctgtg 1020
ctgtgccaag cgcacggact gacaccagag caggtcgtgg caattgcgag ccacgacggg 1080
ggaaagcagg cactcgaaac cgtccagagg ttgctgcctg tgctgtgcca agcgcacgga 1140
cttacacccg aacaagtcgt ggcaattgcg agccacgacg ggggaaagca ggcactcgaa 1200
accgtccaga ggttgctgcc tgtgctgtgc caagcgcacg gacttacgcc agagcaggtc 1260
gtggcaattg cgagcaacgg cgggggaaag caggcactcg aaaccgtcca gaggttgctg 1320
cctgtgctgt gccaagcgca cggactaacc ccagagcagg tcgtggcaat tgcgagcaac 1380
ggcgggggaa agcaggcact cgaaaccgtc cagaggttgc tgcctgtgct gtgccaagcg 1440
cacgggttga ccccagagca ggtcgtggca attgcgagca acatcggggg aaagcaggca 1500
ctcgaaaccg tccagaggtt gctgcctgtg ctgtgccaag cgcacggcct gaccccagag 1560
caggtcgtgg caattgcgag caacaacggg ggaaagcagg cactcgaaac cgtccagagg 1620
ttgctgcctg tgctgtgcca agcgcacgga ctgacaccag agcaggtcgt ggcaattgcg 1680
agcaacggcg ggggaaagca ggcactcgaa accgtccaga ggttgctgcc tgtgctgtgc 1740
caagcgcacg gcctcacccc agagcaggtc gtggcaattg cgagccacga cgggggaaag 1800
caggcactcg aaaccgtcca gaggttgctg cctgtgctgt gccaagcgca cggacttacg 1860
ccagagcagg tcgtggcaat tgcgagcaac ggcgggggaa agcaggcact cgaaaccgtc 1920
cagaggttgc tgcctgtgct gtgccaagcg cacggactaa ccccagagca ggtcgtggca 1980
attgcgagca acggcggggg aaagcaggca ctcgaaaccg tccagaggtt gctgcctgtg 2040
ctgtgccaag cgcacgggtt gaccccagag caggtcgtgg caattgcgag caacaacggg 2100
ggaaagcagg cactcgaaac cgtccagagg ttgctgcctg tgctgtgcca agcgcacggc 2160
ctgaccccag agcaggtcgt ggcaattgcg agcaacggcg ggggaaagca ggcactcgaa 2220
accgtccaga ggttgctgcc tgtgctgtgc caagcgcacg gactgacacc agagcaggtc 2280
gtggcaattg cgagcaacaa cgggggaaag caggcactcg aaaccgtcca gaggttgctg 2340
cctgtgctgt gccaagcgca cggactcacg cctgagcagg tagtggctat tgcatccaac 2400
ggcgggggca gacccgcact ggagtcaatc gtggcccagc tttcgaggcc ggaccccgcg 2460
ctggccgcac tcactaatga tcatcttgta gcgctggcct gcctcggcgg acgacccgcc 2520
ttggatgcgg tgaagaaggg gctcccgcac gcgcctgcat tgattaagcg gaccaacaga 2580
aggattcccg agaggacatc acatcgagtg gcaggttccc aactcgtgaa gagtgaactt 2640
gaggagaaaa agtcggagct gcggcacaaa ttgaaatacg taccgcatga atacatcgaa 2700
cttatcgaaa ttgctaggaa ctcgactcaa gacagaatcc ttgagatgaa ggtaatggag 2760
ttctttatga aggtttatgg ataccgaggg aagcatctcg gtggatcacg aaaacccgac 2820
ggagcaatct atacggtggg gagcccgatt gattacggag tgatcgtcga cacgaaagcc 2880
tacagcggtg ggtacaatct tcccatcggg caggcagatg agatgcaacg ttatgtcgaa 2940
gaaaatcaga ccaggaacaa acacatcaat ccaaatgagt ggtggaaagt gtatccttca 3000
tcagtgaccg agtttaagtt tttgtttgtc tctgggcatt tcaaaggcaa ctataaggcc 3060
cagctcacac ggttgaatca cattacgaac tgcaatggtg cggttttgtc cgtagaggaa 3120
ctgctcattg gtggagaaat gatcaaagcg ggaactctga cactggaaga agtcagacgc 3180
aagtttaaca atggcgagat caatttccgc tcataa 3216
<210> 31
<211> 3216
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 31
atggactata aggaccacga cggagactac aaggatcatg atattgatta caaagacgat 60
gacgataaga tggccccaaa gaagaagcgg aaggtcggta tccacggagt cccagcagcc 120
gtagatttga gaactttggg atattcacag cagcagcagg aaaagatcaa gcccaaagtg 180
aggtcgacag tcgcgcagca tcacgaagcg ctggtgggtc atgggtttac acatgcccac 240
atcgtagcct tgtcgcagca ccctgcagcc cttggcacgg tcgccgtcaa gtaccaggac 300
atgattgcgg cgttgccgga agccacacat gaggcgatcg tcggtgtggg gaaacagtgg 360
agcggagccc gagcgcttga ggccctgttg acggtcgcgg gagagctgag agggcctccc 420
cttcagctgg acacgggcca gttgctgaag atcgcgaagc ggggaggagt cacggcggtc 480
gaggcggtgc acgcgtggcg caatgcgctc acgggagcac ccctcaacct gaccccagag 540
caggtcgtgg caattgcgag ccacgacggg ggaaagcagg cactcgaaac cgtccagagg 600
ttgctgcctg tgctgtgcca agcgcacgga cttacgccag agcaggtcgt ggcaattgcg 660
agcaacatcg ggggaaagca ggcactcgaa accgtccaga ggttgctgcc tgtgctgtgc 720
caagcgcacg gactaacccc agagcaggtc gtggcaattg cgagcaacaa cgggggaaag 780
caggcactcg aaaccgtcca gaggttgctg cctgtgctgt gccaagcgca cgggttgacc 840
ccagagcagg tcgtggcaat tgcgagccac gacgggggaa agcaggcact cgaaaccgtc 900
cagaggttgc tgcctgtgct gtgccaagcg cacggcctga ccccagagca ggtcgtggca 960
attgcgagca acatcggggg aaagcaggca ctcgaaaccg tccagaggtt gctgcctgtg 1020
ctgtgccaag cgcacggact gacaccagag caggtcgtgg caattgcgag caacatcggg 1080
ggaaagcagg cactcgaaac cgtccagagg ttgctgcctg tgctgtgcca agcgcacgga 1140
cttacacccg aacaagtcgt ggcaattgcg agcaacatcg ggggaaagca ggcactcgaa 1200
accgtccaga ggttgctgcc tgtgctgtgc caagcgcacg gacttacgcc agagcaggtc 1260
gtggcaattg cgagccacga cgggggaaag caggcactcg aaaccgtcca gaggttgctg 1320
cctgtgctgt gccaagcgca cggactaacc ccagagcagg tcgtggcaat tgcgagcaac 1380
atcgggggaa agcaggcact cgaaaccgtc cagaggttgc tgcctgtgct gtgccaagcg 1440
cacgggttga ccccagagca ggtcgtggca attgcgagca acaacggggg aaagcaggca 1500
ctcgaaaccg tccagaggtt gctgcctgtg ctgtgccaag cgcacggcct gaccccagag 1560
caggtcgtgg caattgcgag caacggcggg ggaaagcagg cactcgaaac cgtccagagg 1620
ttgctgcctg tgctgtgcca agcgcacgga ctgacaccag agcaggtcgt ggcaattgcg 1680
agcaacaacg ggggaaagca ggcactcgaa accgtccaga ggttgctgcc tgtgctgtgc 1740
caagcgcacg gcctcacccc agagcaggtc gtggcaattg cgagcaacgg cgggggaaag 1800
caggcactcg aaaccgtcca gaggttgctg cctgtgctgt gccaagcgca cggacttacg 1860
ccagagcagg tcgtggcaat tgcgagcaac atcgggggaa agcaggcact cgaaaccgtc 1920
cagaggttgc tgcctgtgct gtgccaagcg cacggactaa ccccagagca ggtcgtggca 1980
attgcgagca acaacggggg aaagcaggca ctcgaaaccg tccagaggtt gctgcctgtg 2040
ctgtgccaag cgcacgggtt gaccccagag caggtcgtgg caattgcgag caacggcggg 2100
ggaaagcagg cactcgaaac cgtccagagg ttgctgcctg tgctgtgcca agcgcacggc 2160
ctgaccccag agcaggtcgt ggcaattgcg agcaacatcg ggggaaagca ggcactcgaa 2220
accgtccaga ggttgctgcc tgtgctgtgc caagcgcacg gactgacacc agagcaggtc 2280
gtggcaattg cgagcaacaa cgggggaaag caggcactcg aaaccgtcca gaggttgctg 2340
cctgtgctgt gccaagcgca cggactcacg cctgagcagg tagtggctat tgcatccaac 2400
atcgggggca gacccgcact ggagtcaatc gtggcccagc tttcgaggcc ggaccccgcg 2460
ctggccgcac tcactaatga tcatcttgta gcgctggcct gcctcggcgg acgacccgcc 2520
ttggatgcgg tgaagaaggg gctcccgcac gcgcctgcat tgattaagcg gaccaacaga 2580
aggattcccg agaggacatc acatcgagtg gcaggttccc aactcgtgaa gagtgaactt 2640
gaggagaaaa agtcggagct gcggcacaaa ttgaaatacg taccgcatga atacatcgaa 2700
cttatcgaaa ttgctaggaa ctcgactcaa gacagaatcc ttgagatgaa ggtaatggag 2760
ttctttatga aggtttatgg ataccgaggg aagcatctcg gtggatcacg aaaacccgac 2820
ggagcaatct atacggtggg gagcccgatt gattacggag tgatcgtcga cacgaaagcc 2880
tacagcggtg ggtacaatct tcccatcggg caggcagatg agatgcaacg ttatgtcgaa 2940
gaaaatcaga ccaggaacaa acacatcaat ccaaatgagt ggtggaaagt gtatccttca 3000
tcagtgaccg agtttaagtt tttgtttgtc tctgggcatt tcaaaggcaa ctataaggcc 3060
cagctcacac ggttgaatca cattacgaac tgcaatggtg cggttttgtc cgtagaggaa 3120
ctgctcattg gtggagaaat gatcaaagcg ggaactctga cactggaaga agtcagacgc 3180
aagtttaaca atggcgagat caatttccgc tcataa 3216
<210> 32
<211> 21
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 32
gggtgtagag atagattctc c 21
<210> 33
<211> 21
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 33
ggccagccat cactagtatt c 21
<210> 34
<211> 18
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 34
Glu Gly Arg Gly Ser Leu Leu Thr Cys Gly Asp Val Glu Glu Asn Pro
1 5 10 15
Gly Pro
<210> 35
<211> 20
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 35
Gln Cys Thr Asn Tyr Ala Leu Leu Lys Leu Ala Gly Asp Val Glu Ser
1 5 10 15
Asn Pro Gly Pro
20
<210> 36
<211> 681
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 36
atgagcgata gcactgagaa cgtcatcaag cccttcatgc gcttcaaggt gcacatggag 60
ggctccgtga acggccacga gttcgagatc gagggcgagg gcgagggcaa gccctacgag 120
ggcacccaga ccgccaagct gcaggtgacc aagggcggcc ccctgccctt cgcctgggac 180
atcctgtccc cccagttcca gtacggctcc aaggtgtacg tgaagcaccc cgccgacatc 240
cccgactaca agaagctgtc cttccccgag ggcttcaagt gggagcgcgt gatgaacttc 300
gaggacggcg gcgtggtgac cgtgacccag gactcctccc tgcaggacgg caccttcatc 360
taccacgtga agttcatcgg cgtgaacttc ccctccgacg gccccgtaat gcagaagaag 420
actctgggct gggagccctc caccgagcgc ctgtaccccc gcgacggcgt gctgaagggc 480
gagatccaca aggcgctgaa gctgaagggc ggcggccact acctggtgga gttcaagtca 540
atctacatgg ccaagaagcc cgtgaagctg cccggctact actacgtgga ctccaagctg 600
gacatcacct cccacaacga ggactacacc gtggtggagc agtacgagcg cgccgaggcc 660
cgccaccacc tgttccagta g 681
<210> 37
<211> 846
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 37
atggtgagca agggcgagga gctgttcacc ggggtggtgc ccatcctggt cgagctggac 60
ggcgacgtaa acggccacaa gttcagcgtg tccggcgagg gcgagggcga tgccacctac 120
ggcaagctga ccctgaagtt catctgcacc accggcaagc tgcccgtgcc ctggcccacc 180
ctcgtgacca ccctgaccta cggcgtgcag tgcttcagcc gctaccccga ccacatgaag 240
cagcacgact tcttcaagtc cgccatgccc gaaggctacg tccaggagcg caccatcttc 300
ttcaaggacg acggcaacta caagacccgc gccgaggtga agttcgaggg cgacaccctg 360
gtgaaccgca tcgagctgaa gggcatcgac ttcaaggagg acggcaacat cctggggcac 420
aagctggagt acaactacaa cagccacaac gtctatatca tggccgacaa gcagaagaac 480
ggcatcaagg tgaacttcaa gatccgccac aacatcgagg acggcagcgt gcagctcgcc 540
gaccactacc agcagaacac ccccatcggc gacggccccg tgctgctgcc cgacaaccac 600
tacctgagca cccagtccgc cctgagcaaa gaccccaacg agaagcgcga tcacatggtc 660
ctgctggagt tcgtgaccgc cgccgggatc actctcggca tggacgagct gtacaagaag 720
ctcagccatg gcttcccgcc ggcggtggcg gcgcaggatg atggcacgct gcccatgtct 780
tgtgcccagg agagcgggat ggaccgtcac cctgcagcct gtgcttctgc taggatcaat 840
gtataa 846
<210> 38
<211> 1407
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 38
atgggccgcg gctggggatt cttgtttggc ctcctgggcg ccgtgtggct gctcagctcg 60
ggccacggag aggagcagcc cccggagaca gcggcacaga ggtgcttctg ccaggttagt 120
ggttacttgg atgattgtac ctgtgatgtt gaaaccattg atagatttaa taactacagg 180
cttttcccaa gactacaaaa acttcttgaa agtgactact ttaggtatta caaggtaaac 240
ctgaagaggc cgtgtccttt ctggaatgac atcagccagt gtggaagaag ggactgtgct 300
gtcaaaccat gtcaatctga tgaagttcct gatggaatta aatctgcgag ctacaagtat 360
tctgaagaag ccaataatct cattgaagaa tgtgaacaag ctgaacgact tggagcagtg 420
gatgaatctc tgagtgagga aacacagaag gctgttcttc agtggaccaa gcatgatgat 480
tcttcagata acttctgtga agctgatgac attcagtccc ctgaagctga atatgtagat 540
ttgcttctta atcctgagcg ctacactggt tacaagggac cagatgcttg gaaaatatgg 600
aatgtcatct acgaagaaaa ctgttttaag ccacagacaa ttaaaagacc tttaaatcct 660
ttggcttctg gtcaagggac aagtgaagag aacacttttt acagttggct agaaggtctc 720
tgtgtagaaa aaagagcatt ctacagactt atatctggcc tacatgcaag cattaatgtg 780
catttgagtg caagatatct tttacaagag acctggttag aaaagaaatg gggacacaac 840
attacagaat ttcaacagcg atttgatgga attttgactg aaggagaagg tccaagaagg 900
cttaagaact tgtattttct ctacttaata gaactaaggg ctttatccaa agtgttacca 960
ttcttcgagc gcccagattt tcaactcttt actggaaata aaattcagga tgaggaaaac 1020
aaaatgttac ttctggaaat acttcatgaa atcaagtcat ttcctttgca ttttgatgag 1080
aattcatttt ttgctgggga taaaaaagaa gcacacaaac taaaggagga ctttcgactg 1140
cattttagaa atatttcaag aattatggat tgtgttggtt gttttaaatg tcgtctgtgg 1200
ggaaagcttc agactcaggg tttgggcact gctctgaaga tcttattttc tgagaaattg 1260
atagcaaata tgccagaaag tggacctagt tatgaattcc atctaaccag acaagaaata 1320
gtatcattat tcaacgcatt tggaagaatt tctacaagtg tgaaagaatt agaaaacttc 1380
aggaacttgt tacagaatat tcattga 1407
<210> 39
<211> 684
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 39
atgaccagca agctcgccgt ggctctgctg gctgccttcc tgatcagcgc cgccctctgc 60
gagggcttag gtgaagttgg gaactatttc ggtgtgcagg atgcggtacc gtttgggaat 120
gtgcccgtgt tgccggtgga cagcccggtt ttgttaagtg accacctggg tcagtccgaa 180
gcaggggggc tccccagggg acccgcagtc acggacttgg atcatttaaa ggggattctc 240
aggcggaggc agctatactg caggactgga tttcacttag aaatcttccc caatggtact 300
atccagggaa ccaggaaaga ccacagccga tttggcattc tggaatttat cagtatagca 360
gtgggcctgg tcagcattcg aggcgtggac agtggactct acctcgggat gaatgagaag 420
ggggagctgt atggatcaga aaaactaacc caagagtgtg tattcagaga acagttcgaa 480
gaaaactggt ataatacgta ctcatcaaac ctatataagc acgtggacac tggaaggcga 540
tactatgttg cattaaataa agatgggacc ccgagagaag ggactaggac taaacggcac 600
cagaaattca cacatttttt acctagacca gtggaccccg acaaagtacc tgaactgtat 660
aaggatattc taagccaaag ttga 684
<210> 40
<211> 2487
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 40
atgaggcttc gggagccgct cctgagcggc agcgccgcga tgccaggcgc gtccctacag 60
cgggcctgcc gcctgctcgt ggccgtctgc gctctgcacc ttggcgtcac cctcgtttac 120
tacctggctg gccgcgacct gagccgcctg ccccaactgg tcggagtctc cacaccgctg 180
cagggcggct cgaacagtgc cgccgccatc gggcagtcct ccggggagct ccggaccgga 240
ggggcccggc cgccgcctcc tctaggcgcc tcctcccagc cgcgcccggg tggcgactcc 300
agcccagtcg tggattctgg ccctggcccc gctagcaact tgacctcggt cccagtgccc 360
cacaccaccg cactgtcgct gcccgcctgc cctgaggagt ccccgctgct tgtgggcccc 420
atgctgattg agtttaacat gcctgtggac ctggagctcg tggcaaagca gaacccaaat 480
gtgaagatgg gcggccgcta tgcccccagg gactgcgtct ctcctcacaa ggtggccatc 540
atcattccat tccgcaaccg gcaggagcac ctcaagtact ggctatatta tttgcaccca 600
gtcctgcagc gccagcagct ggactatggc atctatgtta tcaaccaggc gggagacact 660
atattcaatc gtgctaagct cctcaatgtt ggctttcaag aagccttgaa ggactatgac 720
tacacctgct ttgtgtttag tgacgtggac ctcattccaa tgaatgacca taatgcgtac 780
aggtgttttt cacagccacg gcacatttcc gttgcaatgg ataagtttgg attcagccta 840
ccttatgttc agtattttgg aggtgtctct gctctaagta aacaacagtt tctaaccatc 900
aatggatttc ctaataatta ttggggctgg ggaggagaag atgatgacat ttttaacaga 960
ttagttttta gaggcatgtc tatatctcgc ccaaatgctg tggtcgggag gtgtcgcatg 1020
atccgccact caagagacaa gaaaaatgaa cccaatcctc agaggtttga ccgaattgca 1080
cacacaaagg agacaatgct ctctgatggt ttgaactcac tcacctacca ggtgctggat 1140
gtacagagat acccattgta tacccaaatc acagtggaca tcgggacacc gagctcgagc 1200
ggcagcggag ccaccaactt cagcctgctg aagcaggccg gcgatgtgga ggagaatcct 1260
ggccccatga ttcacaccaa cctgaagaaa aagttcagct gctgcgtcct ggtctttctt 1320
ctgtttgcag tcatctgtgt gtggaaggaa aagaagaaag ggagttacta tgattccttt 1380
aaattgcaaa ccaaggaatt ccaggtgtta aagagtctgg ggaaattggc catggggtct 1440
gattcccagt ctgtatcctc aagcagcacc caggaccccc acaggggccg ccagaccctc 1500
ggcagtctca gaggcctagc caaggccaaa ccagaggcct ccttccaggt gtggaacaag 1560
gacagctctt ccaaaaacct tatccctagg ctgcaaaaga tctggaagaa ttacctaagc 1620
atgaacaagt acaaagtgtc ctacaagggg ccaggaccag gcatcaagtt cagtgcagag 1680
gccctgcgct gccacctccg ggaccatgtg aatgtatcca tggtagaggt cacagatttt 1740
cccttcaata cctctgaatg ggagggttat ctgcccaagg agagcattag gaccaaggct 1800
gggccttggg gcaggtgtgc tgttgtgtcg tcagcgggat ctctgaagtc ctcccaacta 1860
ggcagagaaa tcgatgatca tgacgcagtc ctgaggttta atggggcacc cacagccaac 1920
ttccaacaag atgtgggcac aaaaactacc attcgcctga tgaactctca gttggttacc 1980
acagagaagc gcttcctcaa agacagtttg tacaatgaag gaatcctaat tgtatgggac 2040
ccatctgtat accactcaga tatcccaaag tggtaccaga atccggatta taatttcttt 2100
aacaactaca agacttatcg taagctgcac cccaatcagc ccttttacat cctcaagccc 2160
cagatgcctt gggagctatg ggacattctt caagaaatct ccccagaaga gattcagcca 2220
aaccccccat cctctgggat gcttggtatc atcatcatga tgacgctgtg tgaccaggtg 2280
gatatttatg agttcctccc atccaagcgc aagactgacg tgtgctacta ctaccagaag 2340
ttcttcgata gtgcctgcac gatgggtgcc taccacccgc tgctctatga gaagaatttg 2400
gtgaagcatc tcaaccaggg cacagatgag gacatctacc tgcttggaaa agccacactg 2460
cctggcttcc ggaccattca ctgctaa 2487
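The nucleotide records in this listing follow a fixed layout: a declared length in the `<211>` field, then sequence lines in blocks of ten residues, each ending in a running position counter. The sketch below is a minimal illustration of how such a record could be checked against its declared length, using the two primer records SEQ ID NO: 24 and 25 from this listing. The function names are hypothetical, and the parser handles nucleotide lines only, not the three-letter protein records.

```python
import re

def parse_sequence_lines(lines):
    """Join listing lines into one string, dropping spaces and the trailing counter."""
    parts = []
    for line in lines:
        # Strip the running position counter at the end of the line, e.g. " 24".
        body = re.sub(r"\s+\d+\s*$", "", line.strip())
        parts.append(body.replace(" ", ""))
    return "".join(parts)

def check_record(declared_length, lines):
    """Return (sequence, ok); ok is True when the length equals the <211> value."""
    seq = parse_sequence_lines(lines)
    return seq, len(seq) == declared_length

# SEQ ID NO: 24 (<211> 24) and SEQ ID NO: 25 (<211> 27), copied from the listing.
seq24, ok24 = check_record(24, ["ttgtacccgt tggagaagtg acag 24"])
seq25, ok25 = check_record(27, ["gatgaactag gaaaggctca agatcac 27"])
print(len(seq24), ok24)
print(len(seq25), ok25)
```

The same check extends to the multi-line records by passing every line of the record in order.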

Claims (9)

1. A method of integrating a plurality of exogenous genes at a designated site in a host cell, the method comprising the steps of:
1) Preparing an "anchored" host cell comprising the site-specific recombinase recognition sequences LoxPwt and LoxP1 and a PGK promoter-driven positive selection sequence;
Preferably, the positive selection sequence is Puro-T2A-d1EGFP, the amino acid sequence of which is shown as SEQ ID NO: 9;
Preferably, the "anchored" host cell is obtained using a targeting vector containing left and right homology arms of the designated site, the site-specific recombinase recognition sequences LoxPwt and LoxP1, and positive and negative selection sequences;
More preferably, the targeting vector comprises a GAPDH left homology arm, the positive and negative selection sequence Zeocin-T2A-TK, the positive selection sequence Puro-T2A-d1EGFP, the recognition sequences LoxPwt and LoxP1 of the site-specific recombinase of the Cre-LoxP system, a GAPDH right homology arm, and the negative selection sequence diphtheria toxin A chain (DTA); preferably, a partial DNA sequence of the targeting vector is shown as SEQ ID NO: 10;
2) Preparing a series of site-directed integration vectors containing exogenous genes, which comprises:
a first site-directed integration vector, prepared using a first vector, comprising a first exogenous gene, a first antibiotic resistance gene, and the site-specific recombinase recognition sequences LoxPwt, LoxP4 and LoxP2;
a second site-directed integration vector, prepared using a second vector, comprising a second exogenous gene, a second antibiotic resistance gene, and the site-specific recombinase recognition sequences LoxPwt, LoxP1 and LoxP5;
Optionally, a third site-directed integration vector, prepared using the first vector, comprising a third exogenous gene, the first antibiotic resistance gene, and the site-specific recombinase recognition sequences LoxPwt, LoxP4 and LoxP2;
optionally, a fourth site-directed integration vector, prepared using the second vector, comprising a fourth exogenous gene, the second antibiotic resistance gene, and the site-specific recombinase recognition sequences LoxPwt, LoxP1 and LoxP5;
optionally, preparing a corresponding vector according to the gene to be integrated;
3) transfecting the host cells with the first site-directed integration vector, and using the first antibiotic to select clones that are resistant to the first antibiotic and sensitive to the second antibiotic;
4) transfecting the clone obtained in step 3) with the second site-directed integration vector, and using the second antibiotic to select clones that are resistant to the second antibiotic and sensitive to the first antibiotic;
5) Optionally, transfecting the clone obtained in step 4) with the third site-directed integration vector, and using the first antibiotic to select clones that are resistant to the first antibiotic and sensitive to the second antibiotic;
6) optionally, transfecting the clone obtained in step 5) with the fourth site-directed integration vector, and using the second antibiotic to select clones that are resistant to the second antibiotic and sensitive to the first antibiotic;
7) Optionally, preparing corresponding vectors according to the genes to be integrated, and sequentially integrating to obtain a host cell integrating multiple genes at a specified site;
Preferably, the first and second antibiotic resistance genes are two different genes selected from:
hygromycin B resistance gene, puromycin resistance gene, geneticin resistance gene, blasticidin resistance gene and phleomycin resistance gene;
Preferably, the first antibiotic resistance gene and/or the second antibiotic resistance gene is also linked to the gene sequence of a tracer protein via a self-cleaving peptide;
Preferably, the tracer protein is selected from d1EGFP or DsRed-E2;
preferably, the first site-directed integration vector comprises, in order of arrangement: a recombinase recognition sequence LoxPwt, a promoterless resistance gene HygB sequence or a HygB-T2A-DsRed-E2 sequence, a recombinase recognition sequence LoxP4, a human elongation factor 1a (EF1a) promoter, the first exogenous gene and a recombinase recognition sequence LoxP2;
preferably, the second site-directed integration vector comprises, in order of arrangement: a recombinase recognition sequence LoxPwt, a promoterless resistance gene Puro sequence or a Puro-T2A-d1EGFP sequence, a recombinase recognition sequence LoxP1, a human elongation factor 1a (EF1a) promoter, the second exogenous gene and a recombinase recognition sequence LoxP5;
preferably, the third site-directed integration vector comprises, in order of arrangement: a recombinase recognition sequence LoxPwt, a promoterless resistance gene HygB sequence or a HygB-T2A-DsRed-E2 sequence, a recombinase recognition sequence LoxP4, a human elongation factor 1a (EF1a) promoter, the third exogenous gene and a recombinase recognition sequence LoxP2;
Preferably, the fourth site-directed integration vector comprises, in order of arrangement: a recombinase recognition sequence LoxPwt, a promoterless resistance gene Puro sequence or a Puro-T2A-d1EGFP sequence, a recombinase recognition sequence LoxP1, a human elongation factor 1a (EF1a) promoter, the fourth exogenous gene and a recombinase recognition sequence LoxP5;
Preferably, the first and second vectors are selected from the group consisting of the plasmid vectors pBR322, pUC57, pBluescript, pCI-neo and pcDNA3.1; preferably, the first and second vectors are pTOG3 and pTOG4, respectively; wherein the nucleotide sequence of the vector pTOG3 is shown as SEQ ID NO: 11 and the nucleotide sequence of the vector pTOG4 is shown as SEQ ID NO: 12;
Preferably, the sequences of two or more genes are linked by self-cleaving peptide fragments to serve as one optional exogenous gene; the sequences of a further two or more genes are linked by a self-cleaving peptide fragment to serve as another optional exogenous gene;
more preferably, the self-cleaving peptide fragment is selected from P2A, T2A, or E2A; more preferably, the self-cleaving peptide fragment is P2A, and even more preferably, its sequence is shown as SEQ ID NO: 16: GSGATNFSLLKQAGDVEENPGP.
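Steps 3) to 7) of claim 1 alternate between the two vector backbones, so that each round selects with one antibiotic and expects sensitivity to the other. The following sketch is only a bookkeeping aid for that alternation; it simulates no biology. The dictionary layout, the function name `plan_rounds` and the placeholder gene names are hypothetical, while the recombinase recognition sites per vector are taken from step 2) of the claim.

```python
# Bookkeeping sketch only; all names here are hypothetical illustrations.
VECTORS = {
    "first": {"lox_sites": ("LoxPwt", "LoxP4", "LoxP2"), "antibiotic": "first antibiotic"},
    "second": {"lox_sites": ("LoxPwt", "LoxP1", "LoxP5"), "antibiotic": "second antibiotic"},
}

def plan_rounds(genes):
    """Return one entry per integration round for an ordered list of genes."""
    rounds = []
    for number, gene in enumerate(genes, start=1):
        # Odd rounds use the first vector backbone, even rounds the second.
        name = "first" if number % 2 == 1 else "second"
        other = "second" if name == "first" else "first"
        rounds.append({
            "round": number,
            "gene": gene,
            "vector": name,
            "lox_sites": VECTORS[name]["lox_sites"],
            "select_with": VECTORS[name]["antibiotic"],
            "expect_sensitive_to": VECTORS[other]["antibiotic"],
        })
    return rounds

# Placeholder gene names; the claim leaves the exogenous genes open.
plan = plan_rounds(["gene_1", "gene_2", "gene_3", "gene_4"])
for step in plan:
    print(step["round"], step["vector"], step["select_with"], step["expect_sensitive_to"])
```

Extending the list of genes continues the same alternation, which corresponds to the optional further rounds in step 7).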
2. The method of claim 1, wherein the host cell is a mammalian host cell; preferably a Chinese hamster ovary (CHO) cell;
More preferably, the host cell is selected from the following cell strains:
CHO-K1 cell line, CHO-S cell line, DG44 cell line, CHO-DXB11 cell line, CHOZN cell line, CHO-MK cell line, CHL-YN cell line.
3. The method of claim 1, wherein the designated site is a high expression hotspot region of the host cell genome;
preferably, the designated site is selected from the regions of the following genes:
Rosa26 gene, beta-actin, beta 2-microglobulin, CDK2, Ubiquitin, DHFR and GAPDH genes;
More preferably, the designated site is selected from the region of the GAPDH gene;
Further preferably, the designated site is the region of the GAPDH gene from position 3574544 to 3575484 in the CHO-K1 genome; preferably, the sequence of the site region is shown as SEQ ID NO: 8.
4. A method of engineering a host cell genome, the method comprising the steps of:
a. integrating, at a designated site in the genome of the host cell, one or more exogenous genes selected from signal recognition particle 9, signal recognition particle 14, signal recognition particle 54, endoplasmic reticulum oxidoreductase, fibroblast growth factor 9, β1,4-galactosyltransferase 1 and α2,6-sialyltransferase 1; and/or
b. knocking out the fucosyltransferase 8 gene and/or the glutamine synthetase gene and/or the α2,3-sialyltransferase 4 gene and/or the α2,3-sialyltransferase 6 gene in the genome of the host cell using a transcription activator-like effector nuclease (TALEN).
5. The method of claim 4, wherein the host cell is a mammalian host cell; preferably a Chinese hamster ovary (CHO) cell;
more preferably, the host cell is selected from the following cell strains:
CHO-K1 cell line, CHO-S cell line, DG44 cell line, CHO-DXB11 cell line, CHOZN cell line, CHO-MK cell line or CHL-YN cell line.
Preferably, the designated site is a high expression hotspot region of the genome of the host cell;
More preferably, the designated site is selected from the regions of the following genes:
Rosa26 gene, beta-actin, beta 2-microrogobulin, CDK2, Ubiquitin, DHFR and GAPDH genes;
more preferably, the designated site is selected from the region of the GAPDH gene;
further preferably, the designated site is the region of the GAPDH gene from position 3574544 to 3575484 in the CHO-K1 genome; preferably, the sequence of the site region is shown as SEQ ID NO 8.
6. The method according to claim 4, wherein step a. is carried out by the method according to any one of claims 1 to 3;
preferably, the three human exogenous genes signal recognition particle 9, signal recognition particle 14 and signal recognition particle 54 are linked by self-cleaving peptide fragments to serve as an optional exogenous gene; the endoplasmic reticulum oxidoreductase and fibroblast growth factor 9 genes are linked by a self-cleaving peptide fragment to serve as an optional exogenous gene; and the human beta 1,4-galactosyltransferase 1 and/or alpha 2,6-sialyltransferase 1 genes are linked by a self-cleaving peptide fragment to serve as an optional exogenous gene.
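The linkage of several genes through a self-cleaving peptide, as recited in claims 1 and 6, can be illustrated at the protein level with a minimal Python sketch. It assumes the generally described behaviour of 2A peptides, in which ribosome skipping occurs between the final glycine and proline of the linker, so that the upstream product keeps the linker residues up to that glycine and the downstream product begins with the proline. The function names and the short protein sequences are hypothetical placeholders and are not taken from the patent; only the P2A sequence comes from SEQ ID NO: 16.

```python
# Hypothetical illustration only: a protein-level model of a 2A-linked cassette.
# P2A is the sequence recited as SEQ ID NO: 16; everything else is a placeholder.
P2A = "GSGATNFSLLKQAGDVEENPGP"


def join_with_2a(proteins, linker=P2A):
    """Concatenate protein sequences into one reading frame, separated by the linker."""
    return linker.join(proteins)


def predicted_products(polyprotein, linker=P2A):
    """Split a polyprotein at each linker, between its final Gly and Pro.

    Assumption: skipping happens before the terminal Pro of the linker, so the
    upstream product ends in ...NPG and the downstream product starts with P.
    """
    products = []
    rest = polyprotein
    while linker in rest:
        start = rest.index(linker)
        cut = start + len(linker) - 1  # index of the terminal Pro
        products.append(rest[:cut])
        rest = rest[cut:]
    products.append(rest)
    return products


# Placeholder sequences standing in for three linked genes (not real sequences).
toy_proteins = ["MSRP", "MKLV", "MTEQ"]
cassette = join_with_2a(toy_proteins)
for product in predicted_products(cassette):
    print(product)
```

In this toy model every product except the last carries a C-terminal tail derived from the linker, and every product except the first starts with an extra proline. A real construct would also require removing the stop codon of each upstream gene, which the sketch does not model.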
7. The method according to claim 4, wherein in step b. the knockout is performed using the TALEN method;
preferably, the gene sequences of the TALEN left-arm and right-arm proteins constructed for knocking out the Fut8 gene are shown as SEQ ID NO: 18 and 19; the targeted region is exon 10 of the Fut8 gene in CHO cells;
preferably, the gene sequences of the TALEN left-arm and right-arm proteins constructed for knocking out the GS gene are shown as SEQ ID NO: 22 and 23; the targeted region is exon 7 of the GS gene in CHO cells;
preferably, the gene sequences of the TALEN left-arm and right-arm proteins constructed for knocking out the ST3GAL4 gene are shown as SEQ ID NO: 26 and 27; the targeted region is exon 5 of the ST3GAL4 gene in CHO cells;
preferably, the gene sequences of the TALEN left-arm and right-arm proteins constructed for knocking out the ST3GAL6 gene are shown as SEQ ID NO: 30 and 31; the targeted region is exon 2 of the ST3GAL6 gene in CHO cells.
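The four knockout targets recited in claim 7 can be summarised as a small lookup table. The Python sketch below is only a convenience for reading the claim: the class and field names are hypothetical, and it assumes that each pair of SEQ ID NOs corresponds to the left-arm and right-arm sequences in that order, including for the GS gene, where the claim text names two SEQ ID NOs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TalenTarget:
    """One TALEN knockout target as recited in claim 7 (field names are hypothetical)."""
    gene: str
    exon: int
    left_arm_seq_id: int   # SEQ ID NO of the left-arm gene sequence (assumed order)
    right_arm_seq_id: int  # SEQ ID NO of the right-arm gene sequence (assumed order)


# Values transcribed from claim 7; the left/right assignment is an assumption.
TALEN_TARGETS = {
    "Fut8": TalenTarget("Fut8", exon=10, left_arm_seq_id=18, right_arm_seq_id=19),
    "GS": TalenTarget("GS", exon=7, left_arm_seq_id=22, right_arm_seq_id=23),
    "ST3GAL4": TalenTarget("ST3GAL4", exon=5, left_arm_seq_id=26, right_arm_seq_id=27),
    "ST3GAL6": TalenTarget("ST3GAL6", exon=2, left_arm_seq_id=30, right_arm_seq_id=31),
}


def describe(gene: str) -> str:
    """Return a one-line summary of the target for the given gene name."""
    target = TALEN_TARGETS[gene]
    return (
        f"{target.gene}: exon {target.exon}, "
        f"SEQ ID NO: {target.left_arm_seq_id} and {target.right_arm_seq_id}"
    )


for name in TALEN_TARGETS:
    print(describe(name))
```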
8. A host cell engineered according to the method of any one of claims 1 to 7;
preferably, the engineered host cell carries one or more of the exogenous genes signal recognition particle 9, signal recognition particle 14, signal recognition particle 54, endoplasmic reticulum oxidoreductase, fibroblast growth factor 9, β1,4-galactosyltransferase 1 and/or α2,6-sialyltransferase 1;
preferably, the engineered host cell carries at least the exogenous genes signal recognition particle 9, signal recognition particle 14, signal recognition particle 54, endoplasmic reticulum oxidoreductase and fibroblast growth factor 9;
preferably, the engineered host cell carries the exogenous genes signal recognition particle 9, signal recognition particle 14, signal recognition particle 54, endoplasmic reticulum oxidoreductase, fibroblast growth factor 9, β1,4-galactosyltransferase 1 and α2,6-sialyltransferase 1;
preferably, the engineered host cell does not express one or more of the fucosyltransferase 8 gene, the glutamine synthetase gene, the alpha 2,3-sialyltransferase 4 gene and the alpha 2,3-sialyltransferase 6 gene;
preferably, the engineered host cell does not express the fucosyltransferase 8 gene and/or the glutamine synthetase gene;
preferably, the engineered host cell does not express the fucosyltransferase 8 gene, the glutamine synthetase gene, the α2,3-sialyltransferase 4 gene or the α2,3-sialyltransferase 6 gene;
preferably, the engineered host cell is selected from the following cell lines:
CHO-E cell line, CHO-EF cell line, CHO-EG cell line, CHO-EFG cell line, CHO-ES (3,6) cell line, CHO-EFS (3,6) GS cell line, CHO-ES (6) cell line or CHO-EFS (6) cell line.
9. Use of a host cell engineered according to the method of any one of claims 1 to 7, or of the host cell according to claim 8, for the production of a protein of interest; preferably, the protein of interest is an antibody.
CN201910264952.7A 2019-04-03 2019-04-03 Methods of engineering host cell genomes and uses thereof Active CN110564772B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910264952.7A CN110564772B (en) 2019-04-03 2019-04-03 Methods of engineering host cell genomes and uses thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910264952.7A CN110564772B (en) 2019-04-03 2019-04-03 Methods of engineering host cell genomes and uses thereof

Publications (2)

Publication Number Publication Date
CN110564772A true CN110564772A (en) 2019-12-13
CN110564772B CN110564772B (en) 2022-07-05

Family

ID=68773398

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910264952.7A Active CN110564772B (en) 2019-04-03 2019-04-03 Methods of engineering host cell genomes and uses thereof

Country Status (1)

Country Link
CN (1) CN110564772B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108138201A (en) * 2015-09-04 2018-06-08 托卡根公司 Recombinant vectors comprising 2A peptides
WO2022012663A1 (en) * 2020-07-17 2022-01-20 创观(苏州)生物科技有限公司 Virus insusceptible animal and construction method therefor
CN114107176A (en) * 2021-12-14 2022-03-01 广东省农业科学院动物卫生研究所 CHO cell line for stably expressing African swine fever CD2v protein and construction method and application thereof

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100081794A1 (en) * 2008-09-26 2010-04-01 Eureka Therapeutics, Inc. Modified Glycoproteins and Uses Thereof
EP2313497A1 (en) * 2008-07-23 2011-04-27 Boehringer Ingelheim Pharma GmbH & Co. KG Improved production host cell lines
CN107760650A (en) * 2016-08-22 2018-03-06 厦门大学 A modified CHO cell and use thereof
CN109096399A (en) * 2017-08-11 2018-12-28 百奥泰生物科技(广州)有限公司 Recombinant antibody with a unique glycan profile produced by a genome-edited CHO host cell, and preparation method thereof

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2313497A1 (en) * 2008-07-23 2011-04-27 Boehringer Ingelheim Pharma GmbH & Co. KG Improved production host cell lines
US20100081794A1 (en) * 2008-09-26 2010-04-01 Eureka Therapeutics, Inc. Modified Glycoproteins and Uses Thereof
CN107760650A (en) * 2016-08-22 2018-03-06 厦门大学 A modified CHO cell and use thereof
CN109096399A (en) * 2017-08-11 2018-12-28 百奥泰生物科技(广州)有限公司 Recombinant antibody with a unique glycan profile produced by a genome-edited CHO host cell, and preparation method thereof

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
CHANGCHUIN MAO ET AL.: ""Identification of a Guinea Pig Fcγ Receptor that Exhibits Enhanced Binding to Afucosylated Human and Mouse IgG"", 《J INFECT DIS MED.》, vol. 1, no. 1, 31 December 2017 (2017-12-31), pages 1 - 2 *
CHENG-YU CHUNG ET AL.: ""Combinatorial genome and protein engineering yields monoclonal antibodies with hypergalactosylation from CHO cells"", 《BIOTECHNOLOGY AND BIOENGINEERING》, vol. 114, no. 12, 7 July 2017 (2017-07-07), pages 1 *
唐静 et al.: "Stable expression of human IL35-IgG4 (Fc) fusion protein in CHO/DG44 cells", 《生物工程学报》, vol. 25, no. 1, 25 January 2009 (2009-01-25), pages 109 - 115 *

Also Published As

Publication number Publication date
CN110564772B (en) 2022-07-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant