CN113755512A - Method for preparing tandem repeat protein and application - Google Patents
- Publication number
- CN113755512A (application CN202011405477.XA)
- Authority
- CN
- China
- Prior art keywords
- sequence
- gene
- double
- stranded dna
- dna molecule
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/70—Vectors or expression systems specially adapted for E. coli
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/43504—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from invertebrates
- C07K14/43513—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from invertebrates from arachnidae
- C07K14/43518—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from invertebrates from arachnidae from spiders
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A50/00—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
- Y02A50/30—Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change
Abstract
The invention provides a method for preparing tandem repeat proteins, together with related products and applications. The method comprises introducing an expression vector containing a double-stranded DNA molecule, named the single-copy gene expression cassette, into a recipient cell to obtain a recombinant cell; extracting the total RNA of the recombinant cell; and translating the total RNA to obtain the tandem repeat protein. This greatly shortens the time needed to prepare long tandem repeat proteins. Experiments show that a tandem repeat MaSp1 protein with 40 repeats can be obtained in only 7 days, far less time than the traditional method requires. The method has a short experimental cycle, saves time and cost, and is highly efficient.
Description
Technical Field
The invention relates to a method for preparing tandem repeat protein and application thereof in the field of biotechnology.
Background
Tandem repeat proteins are proteins with highly repetitive amino acid sequences, produced by the expression of tandem repeat genes. In the prior art, tandem repeat proteins are prepared by constructing an expression vector containing tandem repeat DNA and then expressing the protein. At present, tandem repeat DNA expression vectors are constructed mainly by two methods: the asymmetric sticky-end complementation method and the isocaudomer method. The copy number produced by the asymmetric sticky-end complementation method is random, and several enzymes are needed for digestion and ligation. The isocaudomer method is also complicated and requires repeated rounds of enzyme digestion and ligation. Both methods are time-consuming and labor-intensive.
The dragline silk protein in spider silk is very strong: at equal weight, dragline silk is 5 times as strong as steel wire and 3 times as strong as synthetic Kevlar fiber. Spider silk also has good plasticity, and these two properties make it widely applicable. In industry, for example, it can be used to produce composite materials for parachutes, protective clothing and aircraft. In the biomedical field, applications include wound sutures, delivery vehicles for biopharmaceuticals, and scaffolds for cell culture and organ transplantation. Dragline silk is composed primarily of the spidroin proteins MaSp1 (major ampullate spidroin 1) and MaSp2 (major ampullate spidroin 2). Both are highly modular proteins with long repetitive regions within the sequence, flanked by terminal regions approximately 100 amino acid residues in length. However, spider silk is difficult to obtain in large quantities by farming spiders, because spiders are strongly territorial and aggressive. Many studies have therefore attempted to express recombinant spidroin proteins in other hosts. Increasing the length of the recombinant dragline silk protein is one of the key factors in improving the mechanical properties of spun spider silk fibers. Natural dragline silk proteins are 250-320 kDa in size. In 2010, researchers expressed a 284.9 kDa recombinant spidroin protein in Escherichia coli, and fibers spun from it had mechanical properties similar to those of natural spider silk. Expressing the 284.9 kDa recombinant spidroin protein required first synthesizing a MaSp1 repeating unit, then using isocaudomer-based seamless splicing to build a concatemer of 2 units (MaSp2 in this numbering), then repeating the same procedure to synthesize MaSp4, MaSp8, MaSp16, MaSp32 and MaSp48 in sequence, and finally splicing to obtain MaSp96. These steps are complicated, time-consuming and labor-intensive.
Moreover, if the spider silk sequence is to be optimized, the gene must be resynthesized, and rebuilding the whole series of concatemers again takes a significant amount of time.
Disclosure of Invention
The problem to be solved by the present invention is how to prepare tandem repeat proteins.
To solve the above technical problem, the invention provides a method for preparing a tandem repeat protein, comprising: introducing an expression vector containing a double-stranded DNA molecule, named the single-copy gene expression cassette, into a recipient cell to obtain a recombinant cell; culturing the recombinant cell; and expressing to obtain the tandem repeat protein. The single-copy gene expression cassette comprises a promoter, an intron named the 3' intron connected to the promoter, a target protein coding gene named the single-copy gene connected to the 3' intron, a coding sequence of a ribosome binding site (RBS) connected to the target protein coding gene, a spacer sequence connected to the coding sequence of the ribosome binding site, an initiation codon connected to the spacer sequence, and an intron named the 5' intron connected to the initiation codon. The 3' intron and the 5' intron satisfy condition A: in the recombinant cell, the precursor RNA transcribed from the single-copy gene expression cassette forms splicing bubbles by complementary base pairing, and a splicing reaction produces a mature circular single-stranded RNA molecule. The target protein coding gene does not contain a stop codon.
In the above method, the spacer sequence is the sequence between the RBS and the ATG; its function is to allow ribosomes to bind the mRNA strongly. The spacer sequence may be a double-stranded DNA of 4-10 bp, for example a double-stranded DNA in which the nucleotide sequence of one strand is nucleotides 5535-5543 of sequence 1.
In the above method, the tandem repeat protein may contain more than 2 copies of the single-copy protein, for example more than 7 copies or more than 10 copies.
In the above method, the single-copy gene expression cassette consists of the promoter, the 3' intron, the target protein coding gene, the coding sequence of the ribosome binding site, the spacer sequence, the initiation codon, and the 5' intron, linked to one another in that order.
As the expression cassette for the circular mRNA of the target protein, the single-copy gene expression cassette may include a promoter that initiates transcription of the gene encoding the target protein, and may further include a terminator that terminates transcription of that gene. The single-copy gene expression cassette may also include an enhancer sequence. Promoters useful in the present invention include, but are not limited to, constitutive promoters; tissue-, organ- and development-specific promoters; and inducible promoters. Examples of promoters include, but are not limited to, the T7 promoter of phage T7 and the constitutive 35S promoter of cauliflower mosaic virus. They may be used alone or in combination with other promoters. Suitable transcription terminators include, but are not limited to, the Agrobacterium nopaline synthase terminator (NOS terminator), the cauliflower mosaic virus CaMV 35S terminator, and the tml terminator.
In the above method, the single copy gene is a gene encoding a protein of interest, and the gene encoding the protein of interest does not contain a stop codon (TAA, TGA or TAG).
In the above method, the initiation codon is ATG.
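The two sequence requirements stated above (an ATG initiation codon and a target gene without a stop codon) can be illustrated with a small check. This is a minimal sketch and not part of the patent. It assumes that the spliced circular RNA consists of the target gene followed by the RBS, the spacer and the ATG, and that uninterrupted translation around the circle additionally requires the circle length to be a multiple of 3. The function name and the toy sequences are hypothetical.

```python
# Stop codons named in the text (DNA alphabet).
STOP_CODONS = {"TAA", "TGA", "TAG"}


def check_circular_orf(gene, rbs, spacer, start="ATG"):
    """Return a list of problems that would interrupt translation around the circle.

    Assumption (not from the patent): the circle is read in frame from the
    start codon, through the gene, the RBS and the spacer, and back into the
    start codon.
    """
    circle = (start + gene + rbs + spacer).upper()
    problems = []
    if len(circle) % 3 != 0:
        problems.append("circle length is not a multiple of 3")
    # Walk the complete codons in the frame set by the start codon.
    usable = len(circle) - len(circle) % 3
    for offset in range(0, usable, 3):
        codon = circle[offset:offset + 3]
        if codon in STOP_CODONS:
            problems.append(f"in-frame stop codon {codon} at offset {offset}")
    return problems


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    print(check_circular_orf("GGTGCTGGTCAA", "AAGGAG", "ACATAC"))
    print(check_circular_orf("GGTTAAGCT", "AAGGAG", "ACATAC"))
```

An empty list means that, under the stated assumptions, a ribosome starting at the ATG would meet no stop signal on any lap of the circle.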
In the above method, the target gene may further comprise a replication origin (pMB1) gene.
In the above method, the target gene may further comprise a selection marker gene. The selection marker gene is a gene of known function and sequence capable of functioning as a specific marker. For example, a gene encoding an enzyme or a luminescent compound which can produce a color change (GUS gene, luciferase gene, etc.), a marker gene for antibiotics (e.g., nptII gene conferring resistance to kanamycin and related antibiotics, bar gene conferring resistance to the herbicide phosphinothricin, hph gene conferring resistance to the antibiotic hygromycin, and dhfr gene conferring resistance to methotrexate, EPSPS gene conferring resistance to glyphosate), or a chemical-resistant marker gene (e.g., herbicide-resistant gene), a mannose-6-phosphate isomerase gene providing the ability to metabolize mannose.
In the above method, the target protein coding gene encodes a target protein, which may be MaSp1; the MaSp1 is a protein whose amino acid sequence is SEQ ID No. 3.
In the above method, the recipient cell is any one of C1) to C4):
C1) a prokaryotic microbial cell;
C2) a Gram-negative bacterial cell;
C3) an Escherichia bacterial cell;
C4) an Escherichia coli BL21(DE3) cell.
In the above method, the 3' intron and the 5' intron that satisfy condition A are the following pair of introns:
the 3' intron comprises 6 splicing bubbles and a 3' splice site; the coding DNAs of the 6 splicing bubbles are named the 3' sp1 gene, 3' sp2 gene, 3' sp3 gene, 3' sp4 gene, 3' sp5 gene and 3' sp6 gene, and the coding DNA of the 3' splice site is named the 3' ss gene. Each of these is a double-stranded DNA molecule, one strand of which has the nucleotide sequence at the following positions of sequence 1 in the sequence listing: 3' sp1 gene, positions 5193-5214; 3' sp2 gene, positions 5278-5289; 3' sp3 gene, positions 5293-5306; 3' sp4 gene, positions 5318-5337; 3' sp5 gene, positions 5352-5370; 3' sp6 gene, positions 5371-5386; 3' ss gene (3' splice site), positions 5419-5423;
the 5' intron contains a 5' splice site and a 5' ss sequence; the nucleotide sequence of the 5' splice site is positions 5547-5556 of sequence 1; the nucleotide sequence of the 5' ss sequence is positions 5557-5721 of sequence 1 and comprises 4 splicing bubbles, whose coding DNAs are named the 5' sp1 gene, 5' sp2 gene, 5' sp3 gene and 5' sp4 gene. Each of these is a double-stranded DNA molecule, one strand of which has the nucleotide sequence at the following positions of sequence 1 in the sequence listing: 5' sp1 gene, positions 5569-5590; 5' sp2 gene, positions 5631-5643; 5' sp3 gene, positions 5648-5698; 5' sp4 gene, positions 5671-5687.
In the above method, the 3' intron is a double-stranded DNA in which one strand (the coding strand) has the nucleotide sequence of positions 5190-5423 of sequence 1, and the 5' intron is a double-stranded DNA in which one strand (the coding strand) has the nucleotide sequence of positions 5547-5721 of sequence 1.
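The positions quoted above are 1-based and inclusive, which is easy to get wrong when slicing a sequence programmatically. The following sketch, which is not part of the patent, transcribes the stated positions, converts them to lengths, and checks that every listed element lies inside its intron. The helper names are hypothetical, and sequence 1 itself is not reproduced here, so `extract` is shown on a toy string.

```python
# 1-based inclusive positions in sequence 1, transcribed from the text above.
INTRON_3 = (5190, 5423)
INTRON_5 = (5547, 5721)

ELEMENTS_3 = {
    "3' sp1": (5193, 5214),
    "3' sp2": (5278, 5289),
    "3' sp3": (5293, 5306),
    "3' sp4": (5318, 5337),
    "3' sp5": (5352, 5370),
    "3' sp6": (5371, 5386),
    "3' splice site": (5419, 5423),
}

ELEMENTS_5 = {
    "5' splice site": (5547, 5556),
    "5' sp1": (5569, 5590),
    "5' sp2": (5631, 5643),
    "5' sp3": (5648, 5698),
    "5' sp4": (5671, 5687),
}


def span_length(span):
    """Length in nucleotides of a 1-based inclusive span."""
    start, end = span
    return end - start + 1


def extract(sequence, span):
    """Slice a 1-based inclusive span out of a Python string."""
    start, end = span
    return sequence[start - 1:end]


def within(inner, outer):
    """True if the inner span lies entirely inside the outer span."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


if __name__ == "__main__":
    for name, span in {**ELEMENTS_3, **ELEMENTS_5}.items():
        print(f"{name}: {span[0]}-{span[1]} ({span_length(span)} nt)")
    print("3' intron length:", span_length(INTRON_3))
    print("5' intron length:", span_length(INTRON_5))
```

With the full sequence 1 loaded as a string, `extract(seq1, INTRON_3)` would return the coding strand of the 3' intron.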
In the above method, the single-copy gene expression cassette is a double-stranded DNA molecule in which one strand (the coding strand) has the nucleotide sequence of positions 5117-5835 of SEQ ID No. 1;
or the expression vector is a double-stranded DNA molecule (expressing tandem repeat MaSp protein) with one strand of which the nucleotide sequence is SEQ ID No. 1.
The invention also provides any one of the following products related to the method:
A1) the double-stranded DNA molecule named the single-copy gene expression cassette in the method;
A2) a vector containing the double-stranded DNA molecule of A1);
A3) a recombinant microorganism containing the double-stranded DNA molecule of A1).
The vector of A2) can be constructed from an existing expression vector. Existing expression vectors include the pMD18-T vector, pET21b and the like. The existing expression vector may also contain the 3' untranslated region of a foreign gene, i.e., a region containing the polyadenylation signal and any other DNA segments involved in mRNA processing or gene expression. The poly(A) signal directs the addition of poly(A) to the 3' end of the mRNA precursor. When constructing the vector of A2), enhancers, such as transcription enhancers, may also be used; these enhancer regions may be ATG start codons or adjacent regions, but must be in the same reading frame as the coding sequence to ensure correct translation of the entire sequence. To facilitate identification and screening of transformants, the existing expression vector may be modified, for example by adding genes encoding enzymes or luminescent compounds that produce a color change (GUS gene, luciferase gene, etc.), antibiotic marker genes (e.g., the nptII gene conferring resistance to kanamycin and related antibiotics, the bar gene conferring resistance to the herbicide phosphinothricin, the hph gene conferring resistance to the antibiotic hygromycin, the dhfr gene conferring resistance to methotrexate, and the EPSPS gene conferring resistance to glyphosate), chemical-resistance marker genes (e.g., herbicide-resistance genes), or the mannose-6-phosphate isomerase gene, which provides the ability to metabolize mannose.
The invention provides the use of the above method or the above product in the preparation of tandem repeat proteins.
The invention provides a method for preparing tandem repeat proteins, comprising introducing an expression vector containing a double-stranded DNA molecule, named the single-copy gene expression cassette, into a recipient cell to obtain a recombinant cell, extracting the total RNA of the recombinant cell, and translating the total RNA into the tandem repeat protein. This greatly shortens the time needed to prepare long tandem repeat proteins. Experiments show that a MaSp1 protein with 40 tandem repeats can be obtained in only 7 days, a much shorter time than before.
Drawings
FIG. 1 is a schematic diagram of the structure of MaSp1 RNA expression cassette in example 1 of the present invention. In the figure, RBS is the coding sequence of the ribosome binding site and ATG is the initiation codon.
FIG. 2 is a schematic diagram showing the mechanism by which intron splicing generates circular MaSp1 RNA in example 1 of the present invention. In the figure, BSJ is the back-splice junction site, i.e., the splice junction; RBS is the ribosome binding site; ATG is the initiation codon.
FIG. 3 is a schematic diagram of the mechanism of the MaSp1 tandem repeat protein translation in example 1 of the present invention. RBS is the ribosome binding site; ATG is the initiation codon.
FIG. 4 is a diagram showing the confirmation of MaSp1 RNA circularization in example 1 of the present invention.
FIG. 5 is a graph showing the result of Sanger sequencing of the MaSp1 RNA splice junction in example 1 of the present invention.
FIG. 6 is a PAGE gel of the protein translated from circularized MaSp1 RNA in example 1 of the present invention, wherein M is the marker, 1 is MaSp1 inclusion bodies, and 2 is MaSp1 supernatant.
FIG. 7 is a Western blot of the protein translated from circularized MaSp1 RNA in example 1, wherein 1 is MaSp1 inclusion bodies and 2 is MaSp1 supernatant.
FIG. 8 is a mass spectrum of a suspected MaSp1 protein in example 1 of the present invention.
Detailed Description
The present invention is described in further detail below with reference to specific embodiments, which are given for the purpose of illustration only and are not intended to limit the scope of the invention. The experimental procedures in the following examples are conventional unless otherwise specified. Materials, reagents and the like used in the following examples are commercially available unless otherwise specified.
In the following examples, Escherichia coli DH5α (BC102-02) is a product of Biomed; Escherichia coli BL21 (CW0809S) is a product of Beijing Kangwei Century (CWBIO).
In the following examples, the RNAprep Pure cultured cell/bacteria total RNA extraction kit (DP430) is a product of TIANGEN; the ReverTra Ace qPCR RT Kit cDNA synthesis kit (FSQ-101) is a product of TOYOBO.
In the following examples, 10× BSA protein solution (B9000S) is a product of NEB; 2× Es Taq MasterMix (with dye) (CW0690H) is a product of Beijing Kangwei Century (CWBIO).
In the following examples, the media used are specifically as follows:
The solid LB medium is a sterile medium prepared from tryptone, yeast extract, NaCl, agar and deionized water, with the following contents: 10 g/L tryptone, 5 g/L yeast extract, 10 g/L NaCl and 15 g/L agar.
The liquid LB medium is a sterile medium prepared from tryptone, yeast extract, NaCl and deionized water, with the following contents: 10 g/L tryptone, 5 g/L yeast extract and 10 g/L NaCl.
The solid LB medium with an ampicillin concentration of 100 μg/mL is a sterile medium prepared from ampicillin, tryptone, yeast extract, NaCl, agar and deionized water, with the following contents: 100 μg/mL ampicillin, 10 g/L tryptone, 5 g/L yeast extract, 10 g/L NaCl and 15 g/L agar.
The liquid LB medium with an ampicillin concentration of 100 μg/mL is a sterile medium prepared from ampicillin, tryptone, yeast extract, NaCl and deionized water, with the following contents: 100 μg/mL ampicillin, 10 g/L tryptone, 5 g/L yeast extract and 10 g/L NaCl.
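The four recipes above scale linearly with the volume prepared. The following is a minimal sketch of that arithmetic, using only the concentrations stated above; the function name `lb_recipe` and its parameters are illustrative and are not part of the original disclosure.

```python
# Concentrations taken from the media recipes in this example.
LB_BASE_G_PER_L = {"tryptone": 10.0, "yeast extract": 5.0, "NaCl": 10.0}
AGAR_G_PER_L = 15.0            # solid medium only
AMPICILLIN_UG_PER_ML = 100.0   # selective medium only


def lb_recipe(volume_ml, solid=False, ampicillin=False):
    """Return component amounts for volume_ml of LB medium.

    Keys carry their unit: grams for the dry components, milligrams for ampicillin.
    """
    litres = volume_ml / 1000.0
    recipe = {f"{name} (g)": conc * litres for name, conc in LB_BASE_G_PER_L.items()}
    if solid:
        recipe["agar (g)"] = AGAR_G_PER_L * litres
    if ampicillin:
        # ug/mL times mL gives ug; divide by 1000 to express it in mg.
        recipe["ampicillin (mg)"] = AMPICILLIN_UG_PER_ML * volume_ml / 1000.0
    return recipe


if __name__ == "__main__":
    for component, amount in lb_recipe(500, solid=True, ampicillin=True).items():
        print(f"{component}: {amount}")
```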
Example 1 preparation of tandem repeat MaSp1
This example prepared an expression vector containing the single-copy gene expression cassette, named pMaSP1. pMaSP1 is a double-stranded DNA, one strand of which has the nucleotide sequence of SEQ ID No.1 in the sequence listing. In sequence 1, positions 494-1459 are the ampicillin resistance gene, and positions 5117-5835 are the DNA molecule named the single-copy gene expression cassette, hereinafter referred to as the MaSp1 RNA expression cassette. The structure of the MaSp1 RNA expression cassette is shown in FIG. 1, and comprises a T7 promoter (the nucleotide sequence is the 5117-, an initiation codon ATG (nucleotides 5544-5546 of sequence 1), and an intron connected to the initiation codon and named the 5' intron (nucleotides 5547-5721 of sequence 1, of which nucleotides 5547-5556 are the 5' splice site and nucleotides 5557-5721 are the 5' ss sequence). Positions 5788-5835 of sequence 1 are a terminator that terminates transcription of the introns and the MaSp1 gene, and positions 12-467 are the replication origin.
The MaSp1 gene does not contain a stop codon (TAA, TGA or TAG). The MaSp1 gene encodes MaSp1, and MaSp1 is a protein with an amino acid sequence of sequence 2. The 3 'intron and the 5' intron satisfy condition A that the precursor RNA transcribed from the single-copy gene expression cassette in the recombinant cell forms a splicing bubble through base complementary pairing and generates a mature circular single-stranded RNA molecule through a splicing reaction (G-OH-catalyzed splicing reaction) (see the mechanism in FIG. 2).
The 3' intron contains 6 splicing bubbles and a 3' splice site. The coding DNAs of the 6 splicing bubbles are named the 3' sp1 gene, 3' sp2 gene, 3' sp3 gene, 3' sp4 gene, 3' sp5 gene and 3' sp6 gene, respectively, and the coding DNA of the 3' splice site is called the 3' ss gene. Each is a double-stranded DNA molecule, one strand of which has the following nucleotide sequence of sequence 1 in the sequence listing: the 3' sp1 gene, positions 5193-5214; the 3' sp2 gene, positions 5278-5289; the 3' sp3 gene, positions 5293-5306; the 3' sp4 gene, positions 5318-5337; the 3' sp5 gene, positions 5352-5370; the 3' sp6 gene, positions 5371-5386; the 3' splice site, positions 5419-5423.
The 5' intron contains a 5' splice site and a 5' ss sequence. The nucleotide sequence of the 5' splice site is positions 5547-5556 of sequence 1. The nucleotide sequence of the 5' ss sequence is positions 5557-5721 of sequence 1; it comprises 4 splicing bubbles, whose coding DNAs are named the 5' sp1 gene, 5' sp2 gene, 5' sp3 gene and 5' sp4 gene, respectively. Each is a double-stranded DNA molecule, one strand of which has the following nucleotide sequence of sequence 1 in the sequence listing: the 5' sp1 gene, positions 5569-5590; the 5' sp2 gene, positions 5631-5643; the 5' sp3 gene, positions 5648-5698; the 5' sp4 gene, positions 5671-5687.
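The coordinates listed for the two introns are 1-based and inclusive. As a minimal sketch, the table below restates them and derives each element's length; the helper names (`length`, `overlapping_pairs`, `slice_feature`) are illustrative assumptions, not part of the original disclosure. The overlap check simply reports what the stated ranges imply.

```python
# 1-based, inclusive coordinates on sequence 1, as stated in the description.
FEATURES = {
    "3' sp1": (5193, 5214),
    "3' sp2": (5278, 5289),
    "3' sp3": (5293, 5306),
    "3' sp4": (5318, 5337),
    "3' sp5": (5352, 5370),
    "3' sp6": (5371, 5386),
    "3' splice site": (5419, 5423),
    "5' splice site": (5547, 5556),
    "5' sp1": (5569, 5590),
    "5' sp2": (5631, 5643),
    "5' sp3": (5648, 5698),
    "5' sp4": (5671, 5687),
}


def length(span):
    """Number of nucleotides in a 1-based inclusive span."""
    start, end = span
    return end - start + 1


def overlapping_pairs(features):
    """Return pairs of feature names whose stated ranges overlap."""
    names = sorted(features, key=lambda name: features[name])
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            # 'second' starts at or after 'first'; they overlap if it starts before 'first' ends.
            if features[second][0] <= features[first][1]:
                pairs.append((first, second))
    return pairs


def slice_feature(sequence, span):
    """Extract a 1-based inclusive span from a 0-indexed Python string."""
    start, end = span
    return sequence[start - 1:end]


if __name__ == "__main__":
    for name, span in FEATURES.items():
        print(f"{name}: {span[0]}-{span[1]} ({length(span)} nt)")
    print("overlapping ranges:", overlapping_pairs(FEATURES))
```

With the ranges as stated, the only overlap reported is between the 5' sp3 gene and the 5' sp4 gene, because 5671-5687 lies inside 5648-5698.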
The mechanism of preparing tandem repeat protein using the expression vector pMaSp1 containing the single-copy gene expression cassette is as follows. pMaSp1 is introduced into recipient cells to obtain recombinant cells, in which pMaSp1 is transcribed into the precursor RNA shown in the left panel of FIG. 2, also called nuclear mRNA precursor (pre-mRNA). In the precursor RNA, the 3' intron and the 5' intron form splicing bubbles through complementary base pairing, a G-OH-catalyzed splicing reaction takes place, and the RNA is circularized to generate the mature circular single-stranded RNA molecule shown in the right panel of FIG. 2, called circular MaSp1 RNA. Ribosomes bind to the ribosome binding site (RBS) sequence on the circular MaSp1 RNA and initiate translation from the AUG; because the circular MaSp1 RNA contains no UAA, UGA or UAG, the ribosomes keep translating around the circular MaSp1 RNA, thereby generating the MaSp1 tandem repeat protein (FIG. 3). The specific process is as follows:
1. preparation of expression vector pMaSP1 containing Single copy Gene expression cassette
pMaSP1 is constructed in a modular manner, and the modules are joined by the Golden Gate method: protective bases, the recognition site of restriction endonuclease BsaI and complementary sticky ends are added to both ends of each module by embedding them in the primers. The specific steps are as follows:
1.1 Module
The construction of pMaSp1 requires module a and module B:
The deoxyribonucleotide sequence of module A is positions 5547-5423 of sequence 1 in the sequence listing, read around the circular plasmid (positions 5547-5860 followed by positions 1-5423). Within it, positions 5547-5721 are the 5' intron (positions 5547-5556 are the 5' splice site and positions 5557-5721 are the 5' ss gene), positions 5788-5835 are the transcription terminator, positions 12-467 are the replication origin (pMB1) gene, positions 494-1459 are the ampicillin resistance gene, positions 5117-5135 are the T7 promoter, and positions 5190-5423 are the 3' intron (positions 5190-5418 are the 3' ss gene and positions 5419-5423 are the 3' splice site).
The module B contains MaSp1 gene, and the deoxyribonucleotide sequence of the module B is shown as 5424-5528 site of the sequence 1 in the sequence table.
1.2 processing of the modules
Protective bases, the recognition site of restriction endonuclease BsaI and complementary sticky ends were added to both ends of module A by PCR to obtain module A flanked by BsaI sites, named module A-BsaI; the primer pair used for this PCR was PartA-F and PartA-R.
PartA-F: 5'-CCAGGTCTCAAAGGAGTACTCGATGGATCTCAGGTCAATTGAGGCCTGAGTA-3' (nucleotides 4-9, GGTCTC, are the BsaI recognition site)
PartA-R: 5'-CCAGGTCTCAGGTAGCATTATGTTCAGATAAGGTC-3' (nucleotides 4-9, GGTCTC, are the BsaI recognition site)
Protective bases, the recognition site of restriction endonuclease BsaI and complementary sticky ends were added to both ends of module B by PCR to obtain module B flanked by BsaI sites, named module B-BsaI; the primer pair used for this PCR was PartB-F and PartB-R.
PartB-F: 5'-CCAGGTCTCATACCAGCGGACGTGG-3' (nucleotides 4-9, GGTCTC, are the BsaI recognition site)
PartB-R: 5'-CCAGGTCTCACCTTTGTTCCCTGGCTTCC-3' (nucleotides 4-9, GGTCTC, are the BsaI recognition site)
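For a two-module Golden Gate assembly to close into a circle, the sticky end at the start of one module must be complementary to the sticky end at the end of the other. The sketch below checks this for the four primers given above. It relies on the known cutting pattern of BsaI (recognition site GGTCTC, cut one nucleotide downstream on the strand carrying the site, leaving a 4-nt 5' overhang); the function names are illustrative assumptions.

```python
BSAI_SITE = "GGTCTC"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Primer sequences copied from this example (5' to 3').
PRIMERS = {
    "PartA-F": "CCAGGTCTCAAAGGAGTACTCGATGGATCTCAGGTCAATTGAGGCCTGAGTA",
    "PartA-R": "CCAGGTCTCAGGTAGCATTATGTTCAGATAAGGTC",
    "PartB-F": "CCAGGTCTCATACCAGCGGACGTGG",
    "PartB-R": "CCAGGTCTCACCTTTGTTCCCTGGCTTCC",
}


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def bsai_overhang(primer):
    """Return the 4-nt overhang left after BsaI cuts next to the site in a primer tail."""
    site_start = primer.index(BSAI_SITE)
    cut = site_start + len(BSAI_SITE) + 1  # skip the single spacer nucleotide
    return primer[cut:cut + 4]


def compatible(forward_primer, reverse_primer):
    """True if the end made by forward_primer can anneal to the end made by reverse_primer."""
    return bsai_overhang(forward_primer) == reverse_complement(bsai_overhang(reverse_primer))


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, bsai_overhang(seq))
    print("A start joins B end:", compatible(PRIMERS["PartA-F"], PRIMERS["PartB-R"]))
    print("B start joins A end:", compatible(PRIMERS["PartB-F"], PRIMERS["PartA-R"]))
```

Both junctions are compatible, and the two overhang pairs differ from each other, which is what lets the two modules ligate in a single defined orientation.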
1.3 construction of pMaSP1
Module A-BsaI and module B-BsaI were ligated at a molar ratio of 1:1 into a circular plasmid using the following system. The 20 μL reaction system contained: module A-BsaI 5.05E-8 mol (about 100 ng), module B-BsaI 5.05E-8 mol, BsaI enzyme 1 μL, T4 DNA ligase 1 μL, 10× T4 buffer 2 μL and 10× BSA protein solution 2 μL, made up to 20 μL with deionized water. The ligation was performed under the following reaction conditions: 25 cycles of 37 °C for 3 min and 25 °C for 4 min; then 50 °C for 5 min to digest unligated fragments, and 80 °C for 5 min to inactivate the enzymes. After the reaction, the Golden Gate reaction solution of pMaSP1 was obtained.
5 μL of the Golden Gate reaction solution of pMaSP1 was transformed into Escherichia coli DH5α competent cells, which were screened on solid LB medium with an ampicillin concentration of 100 μg/mL. Colonies were picked and sequenced to identify correctly constructed plasmids, and the plasmid was amplified and extracted to obtain pMaSP1.
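Because the two modules differ greatly in length, a 1:1 molar ratio corresponds to very different masses. The sketch below estimates module lengths from the coordinates given for module A and module B and converts between mass and amount of double-stranded DNA. It assumes an average of 650 g/mol per base pair (a common approximation, not a value from this document) and ignores the short primer tails; all function names are illustrative.

```python
AVG_BP_MASS = 650.0     # g/mol per base pair; approximation, assumed here
PLASMID_LENGTH = 5860   # length of sequence 1 in the sequence listing


def circular_span_length(start, end, total=PLASMID_LENGTH):
    """Length of a 1-based inclusive span on a circular molecule (wraps when end < start)."""
    if end >= start:
        return end - start + 1
    return (total - start + 1) + end


def ng_to_pmol(mass_ng, length_bp):
    """Convert a mass of double-stranded DNA (ng) to an amount (pmol)."""
    return mass_ng * 1000.0 / (length_bp * AVG_BP_MASS)


def equimolar_mass_ng(ref_mass_ng, ref_length_bp, other_length_bp):
    """Mass of a second fragment that contains as many molecules as the reference fragment."""
    return ref_mass_ng * other_length_bp / ref_length_bp


if __name__ == "__main__":
    len_a = circular_span_length(5547, 5423)  # module A wraps around the plasmid origin
    len_b = circular_span_length(5424, 5528)  # module B, the MaSp1 gene
    print("module A length (bp):", len_a)
    print("module B length (bp):", len_b)
    print("pmol in 100 ng of module A:", round(ng_to_pmol(100, len_a), 4))
    print("ng of module B for 1:1:", round(equimolar_mass_ng(100, len_a, len_b), 2))
```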
2. Construction and verification of circular MaSp1 RNA
Transferring 0.5 μ L of pMaSP1 into competent cells of Escherichia coli BL21(DE3), screening on solid LB medium with ampicillin concentration of 100 μ g/mL to obtain Escherichia coli BL21(DE3) positive transformant (recombinant cells transferred into pMaSP 1), transferring Escherichia coli BL21(DE3) positive transformant into liquid LB medium with ampicillin concentration of 100 μ g/mL, and culturing at 37 deg.C to OD600nmAbout.0.4, inducing with 1mM isopropyl-beta-D-thiogalactoside (IPTG) for 12h, taking 1mL of the bacterial liquid, centrifuging for 2min at 12000rmp, discarding the supernatant, and extracting the total RNA of the escherichia coli BL21(DE3) positive transformant by using an RNAprep Pure cultured cell/bacteria total RNA extraction kit according to the method described in the specification.
Total RNA was reverse transcribed into cDNA using the Rever Tra Ace qPCRRT kit cDNA Synthesis kit according to the method described in the specification.
The PCR reaction of the cDNA was performed using the primer pair Testif cirMaSp1-F and Testif cirMaSp1-R to verify whether MaSp1 RNA forms a loop.
Testify cirMaSp1-F:5’-CAGGACAGGGAGGATATGGA-3’;
Testify cirMaSp1-R:5’-CTCCTCCCATGGCTGC-3’。
The DNA polymerase used for this verification was 2× Es Taq MasterMix (with dye). The PCR products were separated by gel electrophoresis, and the electrophoretogram is shown in FIG. 4. The band of about 100 bp was sent for sequencing, and the Sanger sequencing result of the MaSp1 RNA splice junction is shown in FIG. 5. Sequencing showed that the 5' splice site was ligated to the 3' splice site, indicating that the MaSp1 RNA had been circularized.
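The logic of this verification is that the sequence spanning the splice junction exists only in the circular RNA, never in the linear precursor. The sketch below illustrates this with the exon between the 3' and 5' splice sites (positions 5424-5546 of sequence 1, copied from the sequence listing). Treating exactly this segment as the circle is an assumption based on the stated splice-site coordinates, and the flank length of 10 nt and the function names are hypothetical.

```python
# Positions 5424-5546 of sequence 1: MaSp1 gene, RBS and spacer, ATG.
EXON = (
    "agcggtcgcggcggtctgggtggccagggtgcaggtatggcggctgcggctgcaatgggc"
    "ggtgctggccaaggtggctacggcggcctgggttctcagggtact"
    "aaggagatataccat"
    "atg"
).upper()


def back_splice_junction(exon, flank=10):
    """Sequence spanning the join formed when the exon's 3' end is ligated to its own 5' end."""
    return exon[-flank:] + exon[:flank]


def contains(template, query, circular=False):
    """True if query occurs in template; for a circular template, also search across the join."""
    if circular:
        template = template + template[:len(query) - 1]
    return query in template


if __name__ == "__main__":
    junction = back_splice_junction(EXON)
    print("junction:", junction)
    print("present in linear exon:", contains(EXON, junction))
    print("present in circular exon:", contains(EXON, junction, circular=True))
```

A read that covers this junction sequence therefore indicates circularization, which matches the reasoning applied to the sequencing result in FIG. 5.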
3. Preparation of MaSp1 tandem repeat proteins
The Escherichia coli BL21(DE3) positive transformant in which circularization of MaSp1 RNA had been verified in step 2 was inoculated into liquid LB medium with an ampicillin concentration of 100 μg/mL, cultured at 37 °C to an OD600 of approximately 0.4, and then induced with 1 mM IPTG at 37 °C for 6 h (labeled MaSp1 cirmRNA 6h in FIG. 6). The BL21 strain carrying the empty vector was used as a control in place of the BL21 strain carrying circularized MaSp1 RNA (labeled empty vector 6h in FIG. 6). After induction, protein samples were prepared and run on a gel; the result is shown in FIG. 6. Compared with the control, the Escherichia coli BL21(DE3) positive transformant with circularized MaSp1 RNA showed three additional protein bands. The three bands were excised separately and analyzed by mass spectrometry; the result is shown in FIG. 8, and the band with a molecular weight greater than 118 kD is spidroin.
Among them, the empty vector-transferred BL21 strain was a transformant obtained by transferring an empty vector into E.coli BL21(DE 3). The empty vector was a plasmid obtained by removing the MaSp1 RNA expression cassette from pMaSP1 while keeping the other nucleotides of pMaSP1 unchanged. The empty vector differed from pMaSp1 only in that it did not contain the MaSp1 RNA expression cassette.
Experiments show that MaSp1 with 40 tandem repeats can be obtained in only 7 days, which is much shorter than with the traditional method.
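The read-through mechanism used in this example can be sketched in silico: a ribosome that starts at the ATG of a circular template with no in-frame stop codon keeps adding one repeat unit per lap. The sketch below assumes the circle consists exactly of positions 5424-5546 of sequence 1 (the segment between the stated 3' and 5' splice sites, copied from the sequence listing) and uses the standard genetic code; the function name `translate_circle` is illustrative.

```python
from itertools import product

# Positions 5424-5546 of sequence 1, in the order they appear in the sequence listing.
CIRCLE_DNA = (
    "agcggtcgcggcggtctgggtggccagggtgcaggtatggcggctgcggctgcaatgggc"
    "ggtgctggccaaggtggctacggcggcctgggttctcagggtact"  # MaSp1 gene, 5424-5528
    "aaggagatataccat"                                # RBS and spacer, 5529-5543
    "atg"                                            # initiation codon, 5544-5546
).upper()

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_circle(circle, start, max_codons):
    """Translate a circular template from index `start`, wrapping around the join.

    Stops early if a stop codon is reached.
    """
    n = len(circle)
    protein = []
    for i in range(max_codons):
        codon = "".join(circle[(start + 3 * i + k) % n] for k in range(3))
        aa = CODON_TABLE[codon]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    start = len(CIRCLE_DNA) - 3  # the ATG that follows the RBS and spacer
    codons_per_lap = len(CIRCLE_DNA) // 3
    unit = translate_circle(CIRCLE_DNA, start, codons_per_lap)
    print("repeat unit:", unit)
    print("three laps are a perfect tandem repeat:",
          translate_circle(CIRCLE_DNA, start, 3 * codons_per_lap) == unit * 3)
```

Under these assumptions the circle is 123 nt long, a multiple of three, so the reading frame is preserved from lap to lap. Each lap encodes the initiator methionine, the 35 residues of sequence 2, and five residues read in frame from the RBS and spacer.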
The present invention has been described in detail above. It will be apparent to those skilled in the art that the invention can be practiced in a wide range of equivalent parameters, concentrations, and conditions without departing from the spirit and scope of the invention and without undue experimentation. While the invention has been described with reference to specific embodiments, it will be appreciated that the invention can be further modified. In general, this application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. The use of some of the essential features is possible within the scope of the claims attached below.
Sequence listing
<110> institute of biotechnology for Tianjin industry of Chinese academy of sciences
<120> a method for preparing tandem repeat protein and uses thereof
<130> GNCSY200930
<160> 2
<170> SIPOSequenceListing 1.0
<210> 1
<211> 5860
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 1
tggcgaatgg gacgcgccct gtagcggcgc attaagcgcg gcgggtgtgg tggttacgcg 60
cagcgtgacc gctacacttg ccagcgccct agcgcccgct cctttcgctt tcttcccttc 120
ctttctcgcc acgttcgccg gctttccccg tcaagctcta aatcgggggc tccctttagg 180
gttccgattt agtgctttac ggcacctcga ccccaaaaaa cttgattagg gtgatggttc 240
acgtagtggg ccatcgccct gatagacggt ttttcgccct ttgacgttgg agtccacgtt 300
ctttaatagt ggactcttgt tccaaactgg aacaacactc aaccctatct cggtctattc 360
ttttgattta taagggattt tgccgatttc ggcctattgg ttaaaaaatg agctgattta 420
acaaaaattt aacgcgaatt ttaacaaaat attaacgttt acaatttcag gtggcacttt 480
tcggggaaat gtgcgcggaa cccctatttg tttatttttc taaatacatt caaatatgta 540
tccgctcatg agacaataac cctgataaat gcttcaataa tattgaaaaa ggaagagtat 600
gagtattcaa catttccgtg tcgcccttat tccctttttt gcggcatttt gccttcctgt 660
ttttgctcac ccagaaacgc tggtgaaagt aaaagatgct gaagatcagt tgggtgcacg 720
agtgggttac atcgaactgg atctcaacag cggtaagatc cttgagagtt ttcgccccga 780
agaacgtttt ccaatgatga gcacttttaa agttctgcta tgtggcgcgg tattatcccg 840
tattgacgcc gggcaagagc aactcggtcg ccgcatacac tattctcaga atgacttggt 900
tgagtactca ccagtcacag aaaagcatct tacggatggc atgacagtaa gagaattatg 960
cagtgctgcc ataaccatga gtgataacac tgcggccaac ttacttctga caacgatcgg 1020
aggaccgaag gagctaaccg cttttttgca caacatgggg gatcatgtaa ctcgccttga 1080
tcgttgggaa ccggagctga atgaagccat accaaacgac gagcgtgaca ccacgatgcc 1140
tgcagcaatg gcaacaacgt tgcgcaaact attaactggc gaactactta ctctagcttc 1200
ccggcaacaa ttaatagact ggatggaggc ggataaagtt gcaggaccac ttctgcgctc 1260
ggcccttccg gctggctggt ttattgctga taaatctgga gccggtgagc gtgggtctcg 1320
cggtatcatt gcagcactgg ggccagatgg taagccctcc cgtatcgtag ttatctacac 1380
gacggggagt caggcaacta tggatgaacg aaatagacag atcgctgaga taggtgcctc 1440
actgattaag cattggtaac tgtcagacca agtttactca tatatacttt agattgattt 1500
aaaacttcat ttttaattta aaaggatcta ggtgaagatc ctttttgata atctcatgac 1560
caaaatccct taacgtgagt tttcgttcca ctgagcgtca gaccccgtag aaaagatcaa 1620
aggatcttct tgagatcctt tttttctgcg cgtaatctgc tgcttgcaaa caaaaaaacc 1680
accgctacca gcggtggttt gtttgccgga tcaagagcta ccaactcttt ttccgaaggt 1740
aactggcttc agcagagcgc agataccaaa tactgtcctt ctagtgtagc cgtagttagg 1800
ccaccacttc aagaactctg tagcaccgcc tacatacctc gctctgctaa tcctgttacc 1860
agtggctgct gccagtggcg ataagtcgtg tcttaccggg ttggactcaa gacgatagtt 1920
accggataag gcgcagcggt cgggctgaac ggggggttcg tgcacacagc ccagcttgga 1980
gcgaacgacc tacaccgaac tgagatacct acagcgtgag ctatgagaaa gcgccacgct 2040
tcccgaaggg agaaaggcgg acaggtatcc ggtaagcggc agggtcggaa caggagagcg 2100
cacgagggag cttccagggg gaaacgcctg gtatctttat agtcctgtcg ggtttcgcca 2160
cctctgactt gagcgtcgat ttttgtgatg ctcgtcaggg gggcggagcc tatggaaaaa 2220
cgccagcaac gcggcctttt tacggttcct ggccttttgc tggccttttg ctcacatgtt 2280
ctttcctgcg ttatcccctg attctgtgga taaccgtatt accgcctttg agtgagctga 2340
taccgctcgc cgcagccgaa cgaccgagcg cagcgagtca gtgagcgagg aagcggaaga 2400
gcgcctgatg cggtattttc tccttacgca tctgtgcggt atttcacacc gcatatatgg 2460
tgcactctca gtacaatctg ctctgatgcc gcatagttaa gccagtatac actccgctat 2520
cgctacgtga ctgggtcatg gctgcgcccc gacacccgcc aacacccgct gacgcgccct 2580
gacgggcttg tctgctcccg gcatccgctt acagacaagc tgtgaccgtc tccgggagct 2640
gcatgtgtca gaggttttca ccgtcatcac cgaaacgcgc gaggcagctg cggtaaagct 2700
catcagcgtg gtcgtgaagc gattcacaga tgtctgcctg ttcatccgcg tccagctcgt 2760
tgagtttctc cagaagcgtt aatgtctggc ttctgataaa gcgggccatg ttaagggcgg 2820
ttttttcctg tttggtcact gatgcctccg tgtaaggggg atttctgttc atgggggtaa 2880
tgataccgat gaaacgagag aggatgctca cgatacgggt tactgatgat gaacatgccc 2940
ggttactgga acgttgtgag ggtaaacaac tggcggtatg gatgcggcgg gaccagagaa 3000
aaatcactca gggtcaatgc cagcgcttcg ttaatacaga tgtaggtgtt ccacagggta 3060
gccagcagca tcctgcgatg cagatccgga acataatggt gcagggcgct gacttccgcg 3120
tttccagact ttacgaaaca cggaaaccga agaccattca tgttgttgct caggtcgcag 3180
acgttttgca gcagcagtcg cttcacgttc gctcgcgtat cggtgattca ttctgctaac 3240
cagtaaggca accccgccag cctagccggg tcctcaacga caggagcacg atcatgcgca 3300
cccgtggggc cgccatgccg gcgataatgg cctgcttctc gccgaaacgt ttggtggcgg 3360
gaccagtgac gaaggcttga gcgagggcgt gcaagattcc gaataccgca agcgacaggc 3420
cgatcatcgt cgcgctccag cgaaagcggt cctcgccgaa aatgacccag agcgctgccg 3480
gcacctgtcc tacgagttgc atgataaaga agacagtcat aagtgcggcg acgatagtca 3540
tgccccgcgc ccaccggaag gagctgactg ggttgaaggc tctcaagggc atcggtcgag 3600
atcccggtgc ctaatgagtg agctaactta cattaattgc gttgcgctca ctgcccgctt 3660
tccagtcggg aaacctgtcg tgccagctgc attaatgaat cggccaacgc gcggggagag 3720
gcggtttgcg tattgggcgc cagggtggtt tttcttttca ccagtgagac gggcaacagc 3780
tgattgccct tcaccgcctg gccctgagag agttgcagca agcggtccac gctggtttgc 3840
cccagcaggc gaaaatcctg tttgatggtg gttaacggcg ggatataaca tgagctgtct 3900
tcggtatcgt cgtatcccac taccgagata tccgcaccaa cgcgcagccc ggactcggta 3960
atggcgcgca ttgcgcccag cgccatctga tcgttggcaa ccagcatcgc agtgggaacg 4020
atgccctcat tcagcatttg catggtttgt tgaaaaccgg acatggcact ccagtcgcct 4080
tcccgttccg ctatcggctg aatttgattg cgagtgagat atttatgcca gccagccaga 4140
cgcagacgcg ccgagacaga acttaatggg cccgctaaca gcgcgatttg ctggtgaccc 4200
aatgcgacca gatgctccac gcccagtcgc gtaccgtctt catgggagaa aataatactg 4260
ttgatgggtg tctggtcaga gacatcaaga aataacgccg gaacattagt gcaggcagct 4320
tccacagcaa tggcatcctg gtcatccagc ggatagttaa tgatcagccc actgacgcgt 4380
tgcgcgagaa gattgtgcac cgccgcttta caggcttcga cgccgcttcg ttctaccatc 4440
gacaccacca cgctggcacc cagttgatcg gcgcgagatt taatcgccgc gacaatttgc 4500
gacggcgcgt gcagggccag actggaggtg gcaacgccaa tcagcaacga ctgtttgccc 4560
gccagttgtt gtgccacgcg gttgggaatg taattcagct ccgccatcgc cgcttccact 4620
ttttcccgcg ttttcgcaga aacgtggctg gcctggttca ccacgcggga aacggtctga 4680
taagagacac cggcatactc tgcgacatcg tataacgtta ctggtttcac attcaccacc 4740
ctgaattgac tctcttccgg gcgctatcat gccataccgc gaaaggtttt gcgccattcg 4800
atggtgtccg ggatctcgac gctctccctt atgcgactcc tgcattagga agcagcccag 4860
tagtaggttg aggccgttga gcaccgccgc cgcaaggaat ggtgcatgca aggagatggc 4920
gcccaacagt cccccggcca cggggcctgc caccataccc acgccgaaac aagcgctcat 4980
gagcccgaag tggcgagccc gatcttcccc atcggtgatg tcggcgatat aggcgccagc 5040
aaccgcacct gtggcgccgg tgatgccggc cacgatgcgt ccggcgtaga ggatcgagat 5100
ctcgatcccg cgaaattaat acgactcact ataggggaat tgtgagcgga taacaattcc 5160
cctctagaaa taattttgtt taactttaaa attctagaga aaatttcgtc tggattagtt 5220
acttatcgtg taaaatctga taaatggaat tggttctaca taaatgccta acgactatcc 5280
ctttggggag tagggtcaag tgactcgaaa cgatagacaa cttgctttaa caagttggag 5340
atatagtctg ctctgcatgg tgacatgcag ctggatataa ttccggggta agattaacga 5400
ccttatctga acataatgct accagcggtc gcggcggtct gggtggccag ggtgcaggta 5460
tggcggctgc ggctgcaatg ggcggtgctg gccaaggtgg ctacggcggc ctgggttctc 5520
agggtactaa ggagatatac catatggatc tgcgttcaat tgaggcctga gtataaggtg 5580
acttatactt gtaatctatc taaacgggga acctctctag tagacaatcc cgtgctaaat 5640
tgtaggactg ccctttaata aatacttcta tatttaaaga ggtatttatg aaaagcggaa 5700
tttatcagat taaaaatact ttgagatccg gctgctaaca aagcccgaaa ggaagctgag 5760
ttggctgctg ccaccgctga gcaataacta gcataacccc ttggggcctc taaacgggtc 5820
ttgaggggtt ttttgctgaa aggaggaact atatccggat 5860
<210> 2
<211> 35
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 2
Ser Gly Arg Gly Gly Leu Gly Gly Gln Gly Ala Gly Met Ala Ala Ala
1 5 10 15
Ala Ala Met Gly Gly Ala Gly Gln Gly Gly Tyr Gly Gly Leu Gly Ser
20 25 30
Gln Gly Thr
35
Claims (10)
1. A method of preparing tandem repeat proteins, characterized in that: the method comprises introducing an expression vector containing a double-stranded DNA molecule named the single-copy gene expression cassette into a recipient cell to obtain a recombinant cell, culturing the recombinant cell, and expressing to obtain a tandem repeat protein; the single-copy gene expression cassette comprises a promoter, an intron which is connected with the promoter and is named the 3' intron, a target protein coding gene which is connected with the 3' intron and is named the single-copy gene, a coding sequence of a ribosome binding site which is connected with the target protein coding gene, a spacer sequence which is connected with the coding sequence of the ribosome binding site, an initiation codon which is connected with the spacer sequence, and an intron which is connected with the initiation codon and is named the 5' intron; the 3' intron and the 5' intron satisfy condition A, where condition A is that, in the precursor RNA transcribed from the single-copy gene expression cassette in the recombinant cell, they form splicing bubbles by complementary base pairing and a mature circular single-stranded RNA molecule is produced by a splicing reaction; the target protein coding gene does not contain a stop codon.
2. The method of claim 1, wherein: the single copy gene expression cassette is formed by connecting a promoter, the 3 'intron, the target protein coding gene, the coding sequence of the ribosome binding site, the spacer sequence, the initiation codon and the 5' intron.
3. The method according to claim 1 or 2, characterized in that: the target protein is MaSp 1.
4. The method of claim 3, wherein: the MaSp1 is a protein with an amino acid sequence of SEQ ID No. 2.
5. The method according to any one of claims 1-4, wherein: the recipient cell is any one of C1) -C4):
C1) a prokaryotic microbial cell;
C2) gram-negative bacterial cells;
C3) an escherichia bacterial cell;
C4) escherichia coli BL21(DE3) cells.
6. The method according to any one of claims 1-5, wherein: the 3 'intron and the 5' intron which satisfy the condition A are a pair of introns as follows:
the 3' intron comprises 6 splicing bubbles and a 3' splice site; the coding DNAs of the 6 splicing bubbles are named the 3' sp1 gene, 3' sp2 gene, 3' sp3 gene, 3' sp4 gene, 3' sp5 gene and 3' sp6 gene, respectively, and the coding DNA of the 3' splice site is called the 3' ss gene; the 3' sp1 gene is a double-stranded DNA molecule, one strand of which has the nucleotide sequence of positions 5193-5214 of sequence 1 in the sequence listing; the 3' sp2 gene is a double-stranded DNA molecule, one strand of which has the nucleotide sequence of positions 5278-5289 of sequence 1 in the sequence listing; the 3' sp3 gene is a double-stranded DNA molecule, one strand of which has the nucleotide sequence of positions 5293-5306 of sequence 1 in the sequence listing; the 3' sp4 gene is a double-stranded DNA molecule, one strand of which has the nucleotide sequence of positions 5318-5337 of sequence 1 in the sequence listing; the 3' sp5 gene is a double-stranded DNA molecule, one strand of which has the nucleotide sequence of positions 5352-5370 of sequence 1 in the sequence listing; the 3' sp6 gene is a double-stranded DNA molecule, one strand of which has the nucleotide sequence of positions 5371-5386 of sequence 1 in the sequence listing; the 3' splice site is a double-stranded DNA molecule, one strand of which has the nucleotide sequence of positions 5419-5423 of sequence 1 in the sequence listing;
the 5' intron contains a 5' splice site and a 5' ss sequence; the nucleotide sequence of the 5' splice site is positions 5547-5556 of sequence 1; the nucleotide sequence of the 5' ss sequence is positions 5557-5721 of sequence 1 and comprises 4 splicing bubbles, whose coding DNAs are named the 5' sp1 gene, 5' sp2 gene, 5' sp3 gene and 5' sp4 gene, respectively; the 5' sp1 gene is a double-stranded DNA molecule, one strand of which has the nucleotide sequence of positions 5569-5590 of sequence 1 in the sequence listing; the 5' sp2 gene is a double-stranded DNA molecule, one strand of which has the nucleotide sequence of positions 5631-5643 of sequence 1 in the sequence listing; the 5' sp3 gene is a double-stranded DNA molecule, one strand of which has the nucleotide sequence of positions 5648-5698 of sequence 1 in the sequence listing; and the 5' sp4 gene is a double-stranded DNA molecule, one strand of which has the nucleotide sequence of positions 5671-5687 of sequence 1 in the sequence listing.
7. The method according to any one of claims 1-6, wherein: the 3 'intron is a double-stranded DNA of which the nucleotide sequence of one strand is the 5190-5423 th nucleotide of the sequence 1, and the 5' intron is a double-stranded DNA of which the nucleotide sequence of one strand is the 5547-5721 th nucleotide of the sequence 1.
8. The method according to any one of claims 1-7, wherein: the single copy gene expression cassette is a double-stranded DNA molecule with one strand of which the nucleotide sequence is the 5117-position 5835 of SEQ ID No. 1;
or the expression vector is a double-stranded DNA molecule with one strand of which the nucleotide sequence is SEQ ID No. 1.
9. Any one of the following products:
A1) the double-stranded DNA molecule named the single-copy gene expression cassette in the method of any one of claims 1-8;
A2) a vector containing the double-stranded DNA molecule of A1);
A3) a recombinant microorganism comprising the double-stranded DNA molecule of A1).
10. Use of the method of any one of claims 1 to 8 or the product of claim 9 for the preparation of tandem repeat proteins.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011405477.XA CN113755512B (en) | 2020-12-03 | 2020-12-03 | Method for preparing tandem repeat protein and application thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113755512A | 2021-12-07 |
CN113755512B | 2023-11-10 |
Family
ID=78786166
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011405477.XA Active CN113755512B (en) | 2020-12-03 | 2020-12-03 | Method for preparing tandem repeat protein and application thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113755512B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4673641A (en) * | 1982-12-16 | 1987-06-16 | Molecular Genetics Research And Development Limited Partnership | Co-aggregate purification of proteins |
US20020059656A1 (en) * | 2000-03-13 | 2002-05-16 | Qi Wang | Recombinant proteins containing repeating units |
KR20040083194A (en) * | 2003-03-21 | 2004-10-01 | 한국생명공학연구원 | The transformed plant cell expressing tandem repeats of β-amyloid gene and plant produced by the same |
WO2006073727A2 (en) * | 2004-12-21 | 2006-07-13 | Monsanto Technology, Llc | Recombinant dna constructs and methods for controlling gene expression |
US20100222553A1 (en) * | 2007-06-11 | 2010-09-02 | The Regents Of The University Of California | Spider silk dragline polynucleotides, polypeptides and methods of use thereof |
US20140094589A1 (en) * | 2010-11-25 | 2014-04-03 | Vladimir Grigorievich Bogush | Method for producing web protein, a fused protein, recombinant dna, an expression vector, a host cell and strain-producers |
Non-Patent Citations (1)
Title |
---|
J RIET et al.: "Improving the PCR protocol to amplify a repetitive DNA sequence" * |
Also Published As
Publication number | Publication date |
---|---|
CN113755512B (en) | 2023-11-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110923183A (en) | Construction method of lanosterol-producing escherichia coli strain | |
CN110951700B (en) | Diels-Alder reaction enzyme and application thereof | |
CN111893104B (en) | Structure-based CRISPR protein optimization design method | |
CN106755031B (en) | Rhamnolipid production plasmid, construction method thereof, escherichia coli engineering bacteria and application | |
CN113755512B (en) | Method for preparing tandem repeat protein and application thereof | |
CN109402109B (en) | Improved overlap extension PCR method | |
CN110964702B (en) | Application of Diels-Alder reaction enzyme and preparation method and application of mutant thereof | |
CN111748034B (en) | Preparation method of mycoplasma synoviae monoclonal antibody | |
CN115247173A (en) | Gene editing system for constructing TMPRSS6 gene mutant iron deficiency anemia pig nuclear transplantation donor cells and application thereof | |
CN107075495B (en) | Lyase, DNA encoding the lyase, vector comprising the DNA, and method for asymmetric synthesis of (S) -phenylacetylcarbinol | |
CN106715689B (en) | Lyase and method for asymmetric synthesis of (S) -phenylacetylcarbinol | |
CN113234746B (en) | Method for pesticide induced protein interaction and induced gene expression | |
RU2774333C1 (en) | RECOMBINANT PLASMID pET-GST-3CL-GPG PROVIDING SYNTHESIS OF SARS-CoV-2 3CL PROTEASE IN E. COLI CELLS IN SOLUBLE FORM | |
RU2792132C1 (en) | Soluble recombinant plasmid pet-gst-3cl ensuring synthesis of 3cl sars-cov-2 protease in e. coli cells | |
KR100902634B1 (en) | Nucleic acid delivery complex comprising recombinant hmgb-1 peptide | |
CN114539425B (en) | Method for improving biological expression of linear polypeptide | |
CN112553177B (en) | Glutamine transaminase variant with improved heat stability | |
KR101137021B1 (en) | Novel glycosyltransferase from Fusobacterium nucleatum and use thereof | |
CN112813087A (en) | Preparation method of SalI restriction endonuclease | |
CN107354172B (en) | Recombinant expression vector and construction method and application thereof | |
CN112662647A (en) | Method for preparing recombinant NcoI restriction enzyme | |
CN109306354A (en) | A kind of method of great expression antifungal protein | |
KR20220080101A (en) | Chimeric thermostable aminoacyl-tRNA synthetase for improved unnatural amino acid incorporation | |
CN115232813A (en) | Gene editing system for constructing von willebrand model pig nuclear transplantation donor cells with vWF gene mutation and application of gene editing system | |
CN115247191A (en) | Gene editing system and application thereof in construction of double-gene-mutation nevus basal cell carcinoma syndrome pig nuclear transplantation donor cell |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||