CN114085841B

CN114085841B - Site for stably expressing protein in CHO cell gene NW_003614092.1 and application thereof

Info

Publication number: CN114085841B
Application number: CN202111395539.8A
Authority: CN
Inventors: 陈蕴; 金坚; 丁学峰; 瞿丽丽; 李华钟; 蔡燕飞; 杨兆琪; 朱景宇; 鲁晨; 俞琪
Original assignee: Jiangnan University
Current assignee: Jiangnan University
Priority date: 2021-11-23
Filing date: 2021-11-23
Publication date: 2022-07-15
Anticipated expiration: 2041-11-23
Also published as: CN114085841A; WO2023093006A1

Abstract

The invention discloses a site for stably expressing protein in the CHO cell gene NW_003614092.1 and an application thereof. The stable expression site obtained by the invention lies within 100 bp upstream and downstream of base 1159467 of the CHO cell gene NW_003614092.1, namely bases 1159367-1159567, and can integrate a foreign protein gene and express it stably. The invention integrates the target gene into this stable expression region by site-specific integration, which solves the problem of undefined integration sites caused by random integration. By site-specifically integrating exogenous genes within bases 1159367-1159567 around base 1159467 of the stable expression site NW_003614092.1 in the CHO genome, the invention overcomes the expression instability caused by position effects and avoids the repeated and tedious cell line screening process.

Description

Site for stably expressing protein in CHO cell gene NW_003614092.1 and application thereof

Technical Field

The invention relates to a site for stably expressing protein in the CHO cell gene NW_003614092.1 and an application thereof, and belongs to the technical field of gene technology.

Background

Chinese hamster ovary (CHO) cells were established in the laboratory of Dr. Theodore T. Puck in 1957. They are immortalized cells that secrete little endogenous protein. In addition, they offer post-translational modifications close to those of natural human proteins, are not easily infected by human viruses, and can be cultured on a large scale in chemically defined serum-free media. They are therefore widely used in biopharmaceutical production and produce more than 70 percent of protein drugs. However, as mammalian cells, CHO cells have a long culture period and a high culture cost. At the same time, the demand for recombinant products such as monoclonal antibodies keeps increasing, which means that specific productivity needs to be optimized. Although expression can be improved by increasing the gene copy number, developing new strong promoters, finding suitable enhancers and the like, the expression level of most CHO cells is unstable during long-term culture, which directly affects regulatory approval and marketing of the product. Therefore, constructing stable, high-expressing CHO cell lines is very important for the research, development and marketing of protein drugs.

There are two strategies for constructing stable, high-expressing cell lines. One is the traditional method of random integration combined with high-throughput screening; the other combines gene editing technology with homology-directed repair to integrate the target gene into a predetermined chromosomal site in a targeted manner. With random integration, the uncertainty of the integration site and the existence of position effects mean that the genotypes of the constructed cells differ greatly and expression easily becomes unstable during long-term passage. The later screening process is therefore very long, and generating a recombinant cell line by random integration in an industrial setting usually takes 6-12 months. By contrast, site-specific integration uses gene editing technology, in particular the CRISPR/Cas9-mediated site-specific genome editing that has been widely applied in recent years, and greatly reduces development time and cost. Because the sequence at the integration site is known, the information obtained after site-specific integration is also clearer and better defined than after random integration.

Random integration is the most mature traditional method for constructing a protein expression system, but obtaining a stable, high-expressing cell line in this way requires multiple rounds of screening, which is time-consuming and costly. Moreover, for cell lines obtained by random integration, loss of expression stability during later culture cannot be predicted at all: the instability may never occur; it may appear only after a very large number of cell divisions; or significant instability may appear after only a few passages. The stability problem not only delays the time to market but also conflicts with drug regulatory requirements. In addition, information about random integration sites is unclear, and the position effect of exogenous gene integration can also cause the expression level of the target gene to fall markedly. According to existing literature reports, instability appears across recombinant CHO cell lines, and unstable expression has become an extremely common problem.

Disclosure of Invention

In order to solve the above technical problems, the invention provides a site for stably expressing protein in the CHO cell genome. The site is clearly defined, allows site-specific integration of protein genes and stable expression of the protein, greatly shortens the screening process and time during cell line construction, and reduces research and development costs.

The first purpose of the invention is to provide a site for stably expressing protein in the CHO cell genome, wherein the site for stably expressing protein is located within bases 1159367-1159567 of the CHO cell gene NW_003614092.1.

Further, the nucleotide sequence of bases 1159367-1159567 of the CHO cell gene NW_003614092.1 is shown in SEQ ID NO.1.
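The coordinate range follows directly from the central base and the 100 bp flank. A minimal Python sketch of that arithmetic is given below; the function name is hypothetical and 1-based inclusive coordinates are assumed.

```python
def flanking_window(center, flank):
    """Return (start, end, length) of a 1-based inclusive window around `center`."""
    start = center - flank
    end = center + flank
    # Inclusive coordinates: both end points count, hence the +1.
    return start, end, end - start + 1


start, end, length = flanking_window(1159467, 100)
print(f"bases {start}-{end}, {length} bases in total")
```

Because both end points are counted, the window contains 201 bases, which agrees with the length given for SEQ ID NO.1 in the sequence listing.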

Further, when the CRISPR/Cas9 technology is used for site-directed transfer of a coding gene of a target protein, the site for stably expressing the protein can be recognized by a 5'NNNNNNNNNNNNNNNNNNNNNGG 3' sequence of the CRISPR/Cas9 technology.

Further, in the embodiment of the present invention, the 5'NNNNNNNNNNNNNNNNNNNNNGG 3' sequence is selected from the following 9 groups of sequences: 5'-TAGGTCATGGGATTCCATGCTGG-3', 5'-TATGGCTTCATCTATGGAGTAGG-3', 5'-GTTCATACAAGTATTAGACTTGG-3', 5'-TCCATAGATGAAGCCATACCTGG-3', 5'-CCATACCTGGAACCCTTATCAGG-3', 5'-CTTTCCAGCCCAGTCTTTGTAGG-3', 5'-CTTGTGATCATTTTCCCCTCTGG-3', 5'-TATCAGGAAGAGTTTGGAGAGGG-3', 5'-AGTGTCTGTGTTCTTCCATGGGG-3'.

In the present invention, the above 9 sequences cover most of the upstream, middle and downstream portions of the 200-base range within bases 1159367-1159567 of the CHO cell gene NW_003614092.1, which indicates that this entire range can be used as a site for stably expressing proteins.

The present invention is not limited to these 9 sequences. They are only a preferred technical means for introducing the gene encoding the target protein into the stable expression site; the purpose of stably expressing the protein can also be achieved by using other sequences, or even other means of introducing the target gene.
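To see how the nine target sequences are distributed across SEQ ID NO.1, the window can be scanned for every 5'-N20-NGG-3' site on both strands. The following Python sketch is illustrative only and is not part of the claimed method; the function names are hypothetical, and the coordinates are 1-based positions within SEQ ID NO.1 rather than genomic coordinates.

```python
import re

# SEQ ID NO.1, copied from the sequence listing (201 bases).
SEQ_ID_NO_1 = (
    "ccctctccaa actcttcctg ataagggttc caggtatggc ttcatctatg gagtaggagt "
    "catattcaat cagaaagtgt ctgtgttctt ccatggggtt catacaagta ttagacttgg "
    "gagcatgtct ttccagccca gtctttgtag gtcatgggat tccatgctgg gtatgacttg "
    "tgatcatttt cccctctgga a"
).replace(" ", "").upper()

# The nine target sequences named in the text.
LISTED_TARGETS = [
    "TAGGTCATGGGATTCCATGCTGG", "TATGGCTTCATCTATGGAGTAGG",
    "GTTCATACAAGTATTAGACTTGG", "TCCATAGATGAAGCCATACCTGG",
    "CCATACCTGGAACCCTTATCAGG", "CTTTCCAGCCCAGTCTTTGTAGG",
    "CTTGTGATCATTTTCCCCTCTGG", "TATCAGGAAGAGTTTGGAGAGGG",
    "AGTGTCTGTGTTCTTCCATGGGG",
]


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_ngg_targets(seq):
    """Map each 23-mer (20-nt protospacer + NGG) to (start, end, strand).

    start/end are 1-based forward-strand positions; a repeated 23-mer
    keeps only one entry, which is enough for this sketch.
    """
    pattern = re.compile(r"(?=([ACGT]{21}GG))")  # lookahead finds overlapping sites
    n = len(seq)
    hits = {}
    for m in pattern.finditer(seq):
        hits[m.group(1)] = (m.start() + 1, m.start() + 23, "+")
    for m in pattern.finditer(reverse_complement(seq)):
        end = n - m.start()  # convert reverse-strand offset to forward coordinate
        hits.setdefault(m.group(1), (end - 22, end, "-"))
    return hits


sites = find_ngg_targets(SEQ_ID_NO_1)
for target in LISTED_TARGETS:
    print(target, sites.get(target, "not found"))
```

In this sketch a "-" entry means the 23-mer reads 5' to 3' on the reverse strand; its start and end are still reported on the forward strand.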

Further, the protein is a protein with a molecular weight of less than 160 kDa.

Furthermore, the protein is one of polypeptide, functional protein, antibody and fusion protein.

The second purpose of the invention is to provide the application of the site for stably expressing the protein in the CHO cell genome in the stable expression of the foreign protein by the CHO cells.

Furthermore, the application specifically comprises integrating a gene encoding a foreign protein or polypeptide at the site for stably expressing protein in the CHO cell gene NW_003614092.1.

The third purpose of the invention is to provide an expression vector for expressing protein in CHO cells, wherein the coding gene of the protein is positioned in the middle region of a 5 'homologous arm and a 3' homologous arm on the expression vector, and the 5 'homologous arm and the 3' homologous arm are respectively sequences with the upstream and downstream length of 600bp of the site for stably expressing the protein.

In the invention, the expression vector is a vector suitable for CHO cell expression.

Furthermore, the expression vector also comprises a promoter sequence positioned at the upstream of the coding gene of the protein, and the promoter controls the expression of the protein.

Further, the promoter is: CMV (a strong mammalian expression promoter derived from human cytomegalovirus), EF-1α (a strong mammalian expression promoter derived from human elongation factor 1α), SV40 (a mammalian expression promoter derived from simian vacuolating virus 40), PGK1 (a mammalian promoter derived from the phosphoglycerate kinase gene), UBC (a mammalian promoter derived from the human ubiquitin C gene), human β-actin (a mammalian promoter derived from the β-actin gene), CAG (a strong hybrid mammalian promoter), or the like.

In the present invention, there is also included a method of constructing an expression vector for expressing a protein in CHO cells, comprising the steps of: inserting the coding gene of the protein into the region between 5 'arm and 3' arm of the plasmid, and enabling the coding gene of the protein to be positioned at the downstream of the promoter and to be controlled by the promoter to obtain the expression vector for expressing the protein in the CHO cell.
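How the two 600 bp homology arms relate to the integration site can be sketched in Python as follows. This is an illustrative sketch only: the function name, the toy reference sequence and the choice to place the arms immediately adjacent to the target interval are assumptions, since the text specifies only that the arms are the 600 bp sequences upstream and downstream of the site.

```python
def homology_arms(reference, site_start, site_end, arm_len=600):
    """Return (5' arm, 3' arm) flanking a 1-based inclusive site on `reference`."""
    if site_start - 1 < arm_len or len(reference) - site_end < arm_len:
        raise ValueError("reference too short for the requested arm length")
    # Python slices are 0-based and end-exclusive.
    five_prime = reference[site_start - 1 - arm_len : site_start - 1]
    three_prime = reference[site_end : site_end + arm_len]
    return five_prime, three_prime


# Toy placeholder sequence, NOT the real NW_003614092.1 scaffold.
toy_reference = "ACGT" * 400            # 1600 bases
arm5, arm3 = homology_arms(toy_reference, site_start=701, site_end=723)
print(len(arm5), len(arm3))
```

With a real scaffold sequence, `site_start` and `site_end` would be the coordinates of the chosen target within that scaffold.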

The fourth purpose of the invention is to provide a CHO recombinant cell which can stably express protein in a fixed-point integration way, wherein the CHO recombinant cell is obtained by transferring the expression vector for expressing the protein in the CHO cell, sgRNA plasmid corresponding to a target sequence and Cas9 plasmid into the CHO cell.

Further, the target sequence is preferably 5'-TAGGTCATGGGATTCCATGCTGG-3', 5'-TATGGCTTCATCTATGGAGTAGG-3', 5'-GTTCATACAAGTATTAGACTTGG-3', 5'-TCCATAGATGAAGCCATACCTGG-3', 5'-CCATACCTGGAACCCTTATCAGG-3', 5'-CTTTCCAGCCCAGTCTTTGTAGG-3', 5'-CTTGTGATCATTTTCCCCTCTGG-3', 5'-TATCAGGAAGAGTTTGGAGAGGG-3' or 5'-AGTGTCTGTGTTCTTCCATGGGG-3'.

In the present invention, there is also provided a method for constructing a CHO recombinant cell, comprising the steps of:

(1) transfecting the plasmid vector into a CHO cell by a liposome transfection mode to obtain a recombinant CHO cell pool;

wherein, the plasmids are the expression vector for expressing the protein in the CHO cell, the sgRNA plasmid corresponding to the target sequence and the Cas9 plasmid respectively;

(2) screening the recombinant cell pool to obtain a CHO recombinant cell expressing the foreign protein;

(3) culturing the CHO recombinant cells in adherent culture, detecting the expression level of the protein, and adapting the high-expressing adherent CHO recombinant cells to suspension culture;

(4) culturing the suspension-adapted CHO recombinant cells, verifying their stability, and detecting the expression level of the protein.

The invention has the beneficial effects that:

The stable expression site obtained by the invention is located within 100 bp upstream and downstream of base 1159467 of the CHO cell gene NW_003614092.1, namely bases 1159367-1159567, and can integrate a foreign protein gene and express it stably. The invention integrates the target gene into this stable expression region by site-specific integration, which solves the problem of unclear integration sites caused by random integration. By site-specifically integrating the exogenous gene within bases 1159367-1159567, upstream and downstream of base 1159467 of the stable expression site NW_003614092.1 in the CHO genome, the invention overcomes the expression instability caused by position effects and the repeated and tedious cell line screening process, reduces the original screening time of 6-12 months to 1-3 months, effectively shortens the development time for constructing a stably expressing cell line, and reduces cost.

Description of the drawings:

FIG. 1 is a schematic diagram of a fixed point integration according to the present invention;

FIG. 2 shows the EGFP expression of cells constructed with different target sequences in different generations.

FIG. 3 shows the expression of HSA in cells constructed with different target sequences in different generations.

Detailed Description

The present invention is further described below with reference to specific examples so that those skilled in the art can better understand the present invention and can practice the present invention, but the examples are not intended to limit the present invention.

The related detection method comprises the following steps:

The average fluorescence intensity of the cells was measured as follows: the cells were cultured to about 90% confluence and digested with 0.25% trypsin; digestion was terminated with a volume of complete medium equal to that of the trypsin; the cells were collected in a sterile centrifuge tube and centrifuged at 1000 rpm for 5 min; the supernatant was discarded and the cells were resuspended in PBS; the cells were passed through a cell strainer into a flow cytometry sample tube; and the fluorescence intensity was analyzed on a flow cytometer with blank CHO-K1 cells as the control.

Example 1: screening for Stable expression sites

CHO-K1-1d2 cells expressing the Zsgreen1 reporter gene, obtained by high-throughput flow cytometric screening, were grown in adherent culture until they were in good condition. The cells at this point were designated generation 0 and were then passaged continuously for 20 generations. Zsgreen1 expression at generations 0, 10 and 20 was observed under an inverted fluorescence microscope, and the average fluorescence intensity at generations 0, 10 and 20 was measured with a BD flow cytometer.

Through observation by an inverted fluorescence microscope and detection by a flow cytometer, after 20 generations of continuous passage, the CHO-K1-1d2 cells can still express Zsgreen1 protein in percentage under the adherent culture state, and the expression levels of the Zsgreen1 protein among different generations are basically consistent and have stronger green fluorescence signals.

Suspension domestication is carried out on CHO-K1-1d2 cells verified by adherence stability, and 60 generations of continuous passage are carried out on CHO-K1-1d2 cells successfully subjected to suspension domestication, and the CHO-K1-1d2 cells successfully subjected to suspension domestication are used as the 0 th generation. And observing the Zsgreen1 protein expression condition of the cells at 0 th, 10 th, 20 th, 30 th, 40 th, 50 th and 60 th generations under an inverted fluorescence microscope, and simultaneously detecting the average fluorescence intensity of the cells at 0 th, 10 th, 20 th, 30 th, 40 th, 50 th and 60 th generations by using a flow cytometer.

Through observation by an inverted fluorescence microscope and detection by a flow cytometer, after 60 generations of continuous passage, the CHO-K1-1d2 cells can still express Zsgreen1 protein in percentage under the suspension culture state, and the expression levels of the Zsgreen1 protein among different generations are basically consistent and have stronger green fluorescence signals. The CHO-K1-1d2 cell is shown to be capable of stably expressing the Zsgreen1 reporter gene, and simultaneously, the integrated site of the lentivirus carrying the Zsgreen1 reporter gene is shown to be a stable expression site.

Example 2: lentiviral integration site analysis

The Integration Site of the lentiviral vector in CHO-K1-1d2 cells was analyzed using the Lenti-X Integration Site Analysis Kit (Clontech:631263) related to chromosome walking technology, and the specific steps were as follows:

(1) construction of lentivirus integration library

Collecting CHO-K1-1d2 cells, extracting a genome by using a DNA extraction kit, and performing enzyme digestion on the genome for 16-18h at 37 ℃ by using three restriction enzymes of DraI, SspI and HpaI respectively, wherein the enzyme digestion system is as follows:

The digested product was purified and recovered with a PCR purification kit, and GenomeWalker adaptors (chromosome walking adaptors) were ligated to both ends of the purified digested fragments. The ligation system is as follows:

After overnight incubation at 16 ℃, the reaction was terminated by heating at 70 ℃ for 5 minutes, and 32 μl of TE (10/1, pH 7.5) was added to the system to obtain three lentivirus integration libraries.

(2) PCR amplification of lentivirus integration libraries

Three lentivirus integration libraries obtained in step (1) of Example 2 were subjected to two rounds of nested PCR. Using the adaptor primers AP1 and AP2, which match the adaptor ligated in step (1) of Example 2, and the lentiviral sequence-specific primers LSP1 and LSP2, the LTR region together with the adjacent genomic region of the CHO-K1 cells was amplified.

The first-round PCR reaction system is as follows:

The reaction procedure was as follows:

μl of the first-round PCR product was diluted to 50 μl with deionized H2O.

The second-round PCR reaction system is as follows:

the reaction procedure was as follows:

(3) sequencing and analysis

The second-round PCR products were subjected to agarose gel electrophoresis, gel recovery and sequencing, following the Lenti-X Integration Site Analysis Kit (Clontech: 631263) protocol. The sequencing results were aligned against the CHO cell genome at NCBI to obtain the lentiviral integration site, which in CHO-K1-1d2 cells is located at base 1159467 of CHO cell genome NW_003614092.1.

Example 3: target sequence selection

Following the proximity principle, the CCTop CRISPR/Cas9 online target predictor was used to analyze the sequence spanning 100 bp upstream and downstream of base 1159467 of NW_003614092.1: CCCTCTCCAAACTCTTCCTGATAAGGGTTCCAGGTATGGCTTCATCTATGGAGTAGGAGTCATATTCAATCAGAAAGTGTCTGTGTTCTTCCATGGGGTTCATACAAGTATTAGACTTGGGAGCATGTCTTTCCAGCCCAGTCTTTGTAGGTCATGGGATTCCATGCTGGGTATGACTTGTGATCATTTTCCCCTCTGGAA (SEQ ID NO.1), and target sequences with higher predicted editing efficiency were selected.

The relevant parameters are set as follows:

1) the maximum number of mismatched bases allowed in the first 13 bp adjacent to the NGG, within the 20 bp sequence, is 1;

2) the maximum number of mismatched bases allowed across all 20 bp adjacent to the NGG is 4.

The CCTop CRISPR/Cas9 online prediction system scores the editing efficiency of each identified 5'NNNNNNNNNNNNNNNNNNNNNGG 3' target sequence: LOW efficacy (score < 0.56); MEDIUM efficacy (0.56 ≤ score ≤ 0.74); HIGH efficacy (score > 0.74).

Sequences in which the predicted editing efficiency was higher than 0.56 were selected as target sequences.

Target sequence 5'-TAGGTCATGGGATTCCATGCTGG-3' (SEQ ID No.2), score 0.75

Target sequence 5'-TATGGCTTCATCTATGGAGTAGG-3' (SEQ ID No.3), score 0.71

Target sequence 5'-GTTCATACAAGTATTAGACTTGG-3' (SEQ ID No.4), score 0.62

Target sequence 5'-TCCATAGATGAAGCCATACCTGG-3' (SEQ ID No.5), score 0.63

Target sequence 5'-CCATACCTGGAACCCTTATCAGG-3' (SEQ ID NO.6), score 0.63

Target sequence 5'-CTTTCCAGCCCAGTCTTTGTAGG-3' (SEQ ID NO.7), score 0.71

Target sequence 5'-CTTGTGATCATTTTCCCCTCTGG-3' (SEQ ID No.8), score 0.76

Target sequence 5'-TATCAGGAAGAGTTTGGAGAGGG-3' (SEQ ID No.9), score 0.70

Target sequence 5'-AGTGTCTGTGTTCTTCCATGGGG-3' (SEQ ID NO.10), score 0.78
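The efficacy tiers described above can be applied to the nine listed scores with a short Python sketch. The function name is hypothetical, and the MEDIUM tier is assumed to include both of its bounds (0.56 and 0.74).

```python
def efficacy_tier(score):
    """Classify a predicted editing-efficiency score into the tiers given in the text."""
    if score < 0.56:
        return "LOW"
    if score <= 0.74:
        return "MEDIUM"
    return "HIGH"


# Scores as listed for SEQ ID NO.2 to SEQ ID NO.10.
SCORES = {
    "SEQ ID NO.2": 0.75, "SEQ ID NO.3": 0.71, "SEQ ID NO.4": 0.62,
    "SEQ ID NO.5": 0.63, "SEQ ID NO.6": 0.63, "SEQ ID NO.7": 0.71,
    "SEQ ID NO.8": 0.76, "SEQ ID NO.9": 0.70, "SEQ ID NO.10": 0.78,
}

for name, score in SCORES.items():
    print(name, score, efficacy_tier(score))
```

All nine scores exceed the 0.56 selection threshold, so none of the listed targets falls in the LOW tier.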

Example 4: site-specific integration of EGFP

CRISPR/Cas9-mediated site-specific genome editing and homologous recombination were used to integrate the green fluorescent protein gene (EGFP, 26.7 kDa) at the target site. The CRISPR/Cas9-mediated homologous recombination technology requires the construction of an sgRNA plasmid and a Donor plasmid, and the construction process is as follows:

1. sgRNA plasmid construction

1) The oligonucleotide chain was synthesized based on the target sequence selected in example 3

sgRNA-F1 5'TTTGTAGGTCATGGGATTCCATGCGT 3'(SEQ ID NO.11)

sgRNA-R1 5'TAAAACGCATGGAATCCCATGACCTA 3'(SEQ ID NO.12)

sgRNA-F2 5'TTTGTATGGCTTCATCTATGGAGTGT 3'(SEQ ID NO.13)

sgRNA-R2 5'TAAAACACTCCATAGATGAAGCCATA 3'(SEQ ID NO.14)

sgRNA-F3 5'TTTGGTTCATACAAGTATTAGACTGT 3'(SEQ ID NO.15)

sgRNA-R3 5'TAAAACAGTCTAATACTTGTATGAAC 3'(SEQ ID NO.16)

sgRNA-F4 5'TTTGTCCATAGATGAAGCCATACCGT 3'(SEQ ID NO.17)

sgRNA-R4 5'TAAAACGGTATGGCTTCATCTATGGA 3'(SEQ ID NO.18)

sgRNA-F5 5'TTTGCCATACCTGGAACCCTTATCGT 3'(SEQ ID NO.19)

sgRNA-R5 5'TAAAACGATAAGGGTTCCAGGTATGG 3'(SEQ ID NO.20)

sgRNA-F6 5'TTTGCTTTCCAGCCCAGTCTTTGTGT 3'(SEQ ID NO.21)

sgRNA-R6 5'TAAAACACAAAGACTGGGCTGGAAAG 3'(SEQ ID NO.22)

sgRNA-F7 5'TTTGCTTGTGATCATTTTCCCCTCGT 3'(SEQ ID NO.23)

sgRNA-R7 5'TAAAACGAGGGGAAAATGATCACAAG 3'(SEQ ID NO.24)

sgRNA-F8 5'TTTGTATCAGGAAGAGTTTGGAGAGT 3'(SEQ ID NO.25)

sgRNA-R8 5'TAAAACTCTCCAAACTCTTCCTGATA 3'(SEQ ID NO.26)

sgRNA-F9 5'TTTGAGTGTCTGTGTTCTTCCATGGT 3'(SEQ ID NO.27)

sgRNA-R9 5'TAAAACCATGGAAGAACACAGACACT 3'(SEQ ID NO.28)

2) Annealing each pair of synthesized oligonucleotides (pairs 1-9) separately to form duplexes

Heating in a metal bath at 95 ℃ for 5 min, then cooling naturally to room temperature;

3) digesting the PSK-u6-gRNA plasmid with BbsI and recovering the digested vector from the gel;

4) ligating the recovered plasmid vector with the annealed oligonucleotides

Ligation was performed at 22 ℃ for 1h or at 4 ℃ overnight;

5) transforming the ligation product into DH5α competent cells;

6) selecting positive clones and sequencing them with the universal primer M13 fwd;

7) expanding the positive clones in culture and extracting the plasmid.

2. Construction of Donor plasmid: the Donor plasmid, shown in FIG. 1, is obtained by modifying an existing plasmid vector for expressing EGFP. The 5' arm and the 3' arm are the upstream and downstream homology arms of the target site recognized by each sgRNA pair, each 600 bp long, and the GOI is the target gene to be integrated.

1) Obtaining, by primer design and PCR amplification, the 5' arm and 3' arm, each 600 bp long, upstream and downstream of the site, with ends homologous to the plasmid;

2) excising the original homology arms of the Donor plasmid by double digestion and gel recovery;

3) ligating the 5' arm and 3' arm corresponding to the target site by a homologous recombination method;

4) the target gene EGFP sequence is carried by the original plasmid.

3. The constructed sgRNA plasmid, Donor plasmid and Cas9-DTU plasmid (donated by Dr. Helene F. Kildegaard, Technical University of Denmark) were co-transfected at a mass ratio of 1.8:1.8:1, using Lipofectamine 3000 transfection reagent, into CHO-K1 cells cultured at 37 ℃ under 5% CO2, and a blank control group was set up. 24 h after transfection, selection pressure was applied with puromycin at 10 μg/ml until all control cells were dead. The selected cell pool was then expanded, and monoclonal cells emitting only green fluorescence and no red fluorescence were sorted with a BD flow cytometer.

4. After the clonal cell lines were expanded, genomic DNA was extracted from a portion of the cells and the clones were identified by 5' Junction PCR, 3' Junction PCR and out-out PCR, as shown in FIG. 1.

5. The positive clone cell line was retained.
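The sgRNA oligonucleotide pairs listed in step 1 of this example appear to follow a regular pattern: the forward oligonucleotide is TTTG, the 20-nt protospacer and GT, and the reverse oligonucleotide is TAAAAC followed by the reverse complement of the protospacer, with the NGG PAM omitted from both. A minimal Python sketch of that pattern is given below; the function names are hypothetical, and the pattern is inferred from the listed sequences rather than stated in the text.

```python
def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def design_sgrna_oligos(target):
    """Build (forward, reverse) oligos from a 23-nt target (20-nt protospacer + NGG)."""
    if len(target) != 23 or not target.endswith("GG"):
        raise ValueError("expected a 20-nt protospacer followed by an NGG PAM")
    protospacer = target[:20]  # the PAM itself is not part of either oligo
    forward = "TTTG" + protospacer + "GT"
    reverse = "TAAAAC" + reverse_complement(protospacer)
    return forward, reverse


fwd, rev = design_sgrna_oligos("TAGGTCATGGGATTCCATGCTGG")
print("sgRNA-F1", fwd)
print("sgRNA-R1", rev)
```

For the first target sequence this reproduces the sgRNA-F1 and sgRNA-R1 sequences given as SEQ ID NO.11 and SEQ ID NO.12.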

Example 5: site-specific integration of HSA

CRISPR/Cas9-mediated site-specific genome editing and homologous recombination were used to integrate a gene expressing human serum albumin (HSA, 68 kDa) at the target site. The CRISPR/Cas9-mediated homologous recombination technology requires the construction of an sgRNA plasmid and a Donor plasmid, and the construction process is as follows:

1. sgRNA plasmid construction:

1) synthesis of an oligonucleotide chain according to the target sequence selected in example 3

sgRNA-F1 5'TTTGTAGGTCATGGGATTCCATGCGT 3'

sgRNA-R1 5'TAAAACGCATGGAATCCCATGACCTA 3'

sgRNA-F2 5'TTTGTATGGCTTCATCTATGGAGTGT 3'

sgRNA-R2 5'TAAAACACTCCATAGATGAAGCCATA 3'

sgRNA-F3 5'TTTGGTTCATACAAGTATTAGACTGT 3'

sgRNA-R3 5'TAAAACAGTCTAATACTTGTATGAAC 3'

sgRNA-F4 5'TTTGTCCATAGATGAAGCCATACCGT 3'

sgRNA-R4 5'TAAAACGGTATGGCTTCATCTATGGA 3'

sgRNA-F5 5'TTTGCCATACCTGGAACCCTTATCGT 3'

sgRNA-R5 5'TAAAACGATAAGGGTTCCAGGTATGG 3'

sgRNA-F6 5'TTTGCTTTCCAGCCCAGTCTTTGTGT 3'

sgRNA-R6 5'TAAAACACAAAGACTGGGCTGGAAAG 3'

sgRNA-F7 5'TTTGCTTGTGATCATTTTCCCCTCGT 3'

sgRNA-R7 5'TAAAACGAGGGGAAAATGATCACAAG 3'

sgRNA-F8 5'TTTGTATCAGGAAGAGTTTGGAGAGT 3'

sgRNA-R8 5'TAAAACTCTCCAAACTCTTCCTGATA 3'

sgRNA-F9 5'TTTGAGTGTCTGTGTTCTTCCATGGT 3'

sgRNA-R9 5'TAAAACCATGGAAGAACACAGACACT 3'

2) Annealing each pair of synthesized oligonucleotides (pairs 1-9) separately to form duplexes

Ligation was performed at 22 ℃ for 1h or at 4 ℃ overnight;

5) transforming the ligation product into DH5α competent cells;

7) expanding the positive clones in culture and extracting the plasmid;

2. Construction of Donor plasmid: the Donor plasmid is shown in FIG. 1. The 5' arm and the 3' arm are the upstream and downstream homology arms of the target site, each 600 bp long, and the GOI is the target gene to be integrated.

4) obtaining the target gene HSA by PCR amplification and inserting it into the plasmid vector by restriction digestion and ligation.

3. The constructed sgRNA plasmid, Donor plasmid and Cas9-DTU plasmid (donated by Dr. Helene F. Kildegaard, Technical University of Denmark) were co-transfected at a mass ratio of 1.8:1.8:1, using Lipofectamine 3000, into CHO-K1 cells cultured at 37 ℃ under 5% CO₂, and a blank control group was set up. 24 h after transfection, selection pressure was applied with puromycin at 10 μg/ml until all control cells were dead. The selected cell pool was then expanded, and monoclonal cells that did not fluoresce were sorted with a BD flow cytometer.

4. After the clonal cell lines were expanded, genomic DNA was extracted from a portion of the cells and the clones were identified by 5' Junction PCR, 3' Junction PCR and out-out PCR, as shown in FIG. 1.

5. The positive clonal cell lines were retained.
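The 1.8:1.8:1 mass ratio used in the co-transfection step can be converted into per-plasmid amounts once a total DNA mass is chosen. The Python sketch below shows the arithmetic; the function name is hypothetical, and the total of 4.6 μg is an arbitrary illustrative value, since the text does not state the total amount of DNA used.

```python
from fractions import Fraction


def split_by_mass_ratio(total_ug, ratio):
    """Split a total DNA mass (in μg) according to a mass ratio."""
    # Fractions built from decimal strings avoid floating-point drift in the ratio.
    parts = [Fraction(str(r)) for r in ratio]
    whole = sum(parts)
    total = Fraction(str(total_ug))
    return [float(total * p / whole) for p in parts]


# Hypothetical total of 4.6 μg DNA; order is sgRNA plasmid, Donor plasmid, Cas9 plasmid.
sgrna_ug, donor_ug, cas9_ug = split_by_mass_ratio(4.6, (1.8, 1.8, 1))
print(sgrna_ug, donor_ug, cas9_ug)
```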

Test example:

1. Detecting the green fluorescence intensity of the cell lines constructed in Example 4 with a BD flow cytometer

The detection method is as follows: the cell lines obtained in Example 4 were passaged continuously for 60 generations, cells were collected every 15 generations, and their fluorescence and its intensity were measured by flow cytometry. As shown in FIG. 2, more than 98% of the cells constructed with the different target sequences in Example 4 still expressed green fluorescent protein after 60 generations of continuous passage, and the fluctuation of green fluorescence intensity between generation 0 and generation 60 did not exceed 30%.

2. Detecting HSA expression in the cell lines constructed in Example 5 with a urine microalbumin assay kit

The detection method is as follows: the cell lines obtained in Example 5 were passaged continuously for 60 generations under serum-free culture conditions, the culture supernatant was collected every 15 generations, and the HSA content of the supernatant was measured with a urine microalbumin assay kit. As shown in FIG. 3, the cells constructed with the different target sequences in Example 5 expressed HSA stably across passages.
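The stability criterion in item 1 (fluctuation of fluorescence intensity not exceeding 30% between generation 0 and generation 60) can be checked programmatically. The Python sketch below assumes that fluctuation means the largest deviation from the generation-0 value, expressed as a percentage of that value; the text does not define the calculation. The intensity values are placeholders for illustration and are not measurements from the patent.

```python
def max_fluctuation_percent(intensity_by_generation, baseline_generation=0):
    """Largest absolute deviation from the baseline generation, in percent."""
    base = intensity_by_generation[baseline_generation]
    return max(100 * abs(v - base) / base for v in intensity_by_generation.values())


# Placeholder mean fluorescence intensities, sampled every 15 generations.
placeholder_mfi = {0: 1000, 15: 1100, 30: 950, 45: 1200, 60: 1050}

fluctuation = max_fluctuation_percent(placeholder_mfi)
print(f"{fluctuation:.1f}% fluctuation; within 30% limit: {fluctuation <= 30}")
```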

The 9 target sequences screened in Example 3 of the invention cover most of the upstream, middle and downstream portions of the 200 bp range, and site-specifically integrated, stably expressing cell lines were successfully constructed throughout bases 1159367-1159567 of the CHO cell gene NW_003614092.1, all of which stably express the target proteins.

The above-mentioned embodiments are merely preferred embodiments for fully illustrating the present invention, and the scope of the present invention is not limited thereto. The equivalent substitutions or changes made by the person skilled in the art on the basis of the present invention are all within the protection scope of the present invention. The protection scope of the invention is subject to the claims.

Sequence listing

<110> Jiangnan University

<120> site for stably expressing protein in CHO cell gene NW_003614092.1 and application thereof

<141> 2021-11-19

<160> 28

<170> SIPOSequenceListing 1.0

<210> 1

<211> 201

<212> DNA

<213> (Artificial sequence)

<400> 1

ccctctccaa actcttcctg ataagggttc caggtatggc ttcatctatg gagtaggagt 60

catattcaat cagaaagtgt ctgtgttctt ccatggggtt catacaagta ttagacttgg 120

gagcatgtct ttccagccca gtctttgtag gtcatgggat tccatgctgg gtatgacttg 180

tgatcatttt cccctctgga a 201

<210> 2

<211> 23

<212> DNA

<213> (Artificial sequence)

<400> 2

taggtcatgg gattccatgc tgg 23

<210> 3

<211> 23

<212> DNA

<213> (Artificial sequence)

<400> 3

tatggcttca tctatggagt agg 23

<210> 4

<211> 23

<212> DNA

<213> (Artificial sequence)

<400> 4

gttcatacaa gtattagact tgg 23

<210> 5

<211> 23

<212> DNA

<213> (Artificial sequence)

<400> 5

tccatagatg aagccatacc tgg 23

<210> 6

<211> 23

<212> DNA

<213> (Artificial sequence)

<400> 6

ccatacctgg aacccttatc agg 23

<210> 7

<211> 23

<212> DNA

<213> (Artificial sequence)

<400> 7

ctttccagcc cagtctttgt agg 23

<210> 8

<211> 23

<212> DNA

<213> (Artificial sequence)

<400> 8

cttgtgatca ttttcccctc tgg 23

<210> 9

<211> 23

<212> DNA

<213> (Artificial sequence)

<400> 9

tatcaggaag agtttggaga ggg 23

<210> 10

<211> 23

<212> DNA

<213> (Artificial sequence)

<400> 10

agtgtctgtg ttcttccatg ggg 23

<210> 11

<211> 26

<212> DNA

<213> (Artificial sequence)

<400> 11

tttgtaggtc atgggattcc atgcgt 26

<210> 12

<211> 26

<212> DNA

<213> (Artificial sequence)

<400> 12

taaaacgcat ggaatcccat gaccta 26

<210> 13

<211> 26

<212> DNA

<213> (Artificial sequence)

<400> 13

tttgtatggc ttcatctatg gagtgt 26

<210> 14

<211> 26

<212> DNA

<213> (Artificial sequence)

<400> 14

taaaacactc catagatgaa gccata 26

<210> 15

<211> 26

<212> DNA

<213> (Artificial sequence)

<400> 15

tttggttcat acaagtatta gactgt 26

<210> 16

<211> 26

<212> DNA

<213> (Artificial sequence)

<400> 16

taaaacagtc taatacttgt atgaac 26

<210> 17

<211> 26

<212> DNA

<213> (Artificial sequence)

<400> 17

tttgtccata gatgaagcca taccgt 26

<210> 18

<211> 26

<212> DNA

<213> (Artificial sequence)

<400> 18

taaaacggta tggcttcatc tatgga 26

<210> 19

<211> 26

<212> DNA

<213> (Artificial sequence)

<400> 19

tttgccatac ctggaaccct tatcgt 26

<210> 20

<211> 26

<212> DNA

<213> (Artificial sequence)

<400> 20

taaaacgata agggttccag gtatgg 26

<210> 21

<211> 26

<212> DNA

<213> (Artificial sequence)

<400> 21

tttgctttcc agcccagtct ttgtgt 26

<210> 22

<211> 26

<212> DNA

<213> (Artificial sequence)

<400> 22

taaaacacaa agactgggct ggaaag 26

<210> 23

<211> 26

<212> DNA

<213> (Artificial sequence)

<400> 23

tttgcttgtg atcattttcc cctcgt 26

<210> 24

<211> 26

<212> DNA

<213> (Artificial sequence)

<400> 24

taaaacgagg ggaaaatgat cacaag 26

<210> 25

<211> 26

<212> DNA

<213> (Artificial sequence)

<400> 25

tttgtatcag gaagagtttg gagagt 26

<210> 26

<211> 26

<212> DNA

<213> (Artificial sequence)

<400> 26

taaaactctc caaactcttc ctgata 26

<210> 27

<211> 26

<212> DNA

<213> (Artificial sequence)

<400> 27

tttgagtgtc tgtgttcttc catggt 26

<210> 28

<211> 26

<212> DNA

<213> (Artificial sequence)

<400> 28

taaaaccatg gaagaacaca gacact 26

Claims

1. Use of a site for stably expressing protein in the CHO cell gene NW_003614092.1 for stably expressing a foreign protein or polypeptide in CHO cells, characterized in that a gene encoding the foreign protein or polypeptide is integrated at the site for stably expressing protein in the CHO cell gene NW_003614092.1, and the site for stably expressing protein is located within bases 1159367-1159567 of the CHO cell gene NW_003614092.1.

2. The use according to claim 1, wherein the site of stable protein expression is recognized by CRISPR/Cas9 technology with 5'NNNNNNNNNNNNNNNNNNNNNGG 3' as target sequence.

3. The use according to claim 1, wherein the protein is a protein or polypeptide having a molecular weight of less than 160 kDa.

4. An expression vector for expressing a protein in CHO cells, wherein a gene encoding the protein is located in a region between a 5 'homology arm and a 3' homology arm of the expression vector, and the 5 'homology arm and the 3' homology arm are sequences having a length of 600bp upstream and downstream of the site where the protein is stably expressed according to claim 1.

5. The expression vector of claim 4, further comprising a promoter sequence upstream of the gene encoding the protein, wherein the promoter controls expression of the protein.

6. The expression vector of claim 5, wherein the promoter is: one of a human cytomegalovirus-derived strong mammalian expression promoter, a human elongation factor 1 α -derived strong mammalian expression promoter, a simian vacuolating virus 40-derived mammalian expression promoter, a phosphoglycerate kinase gene-derived mammalian promoter, a human ubiquitin C gene-derived mammalian promoter, a β -actin gene-derived mammalian promoter, and a strong hybrid mammalian promoter.

7. A CHO recombinant cell for expressing a protein in a site-specific integration manner, which is characterized in that the CHO recombinant cell is obtained by transferring the expression vector for expressing the protein in the CHO cell, sgRNA plasmids corresponding to target sequences and Cas9 plasmids into the CHO cell, wherein the target sequences are shown as 5'NNNNNNNNNNNNNNNNNNNNNGG 3'.