CN112553196B - Nucleic acid sequences and their use as promoters - Google Patents
- Publication number
- CN112553196B (application CN202011231411.3A)
- Authority
- CN
- China
- Prior art keywords
- plasmid
- egfp
- sequence
- pt2al
- nucleic acid
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/46—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
- C07K14/461—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from fish
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/43504—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from invertebrates
- C07K14/43595—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from invertebrates from coelenteratae, e.g. medusae
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/65—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression using markers
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/85—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K38/00—Medicinal preparations containing peptides
Abstract
The invention relates to the technical field of genetic engineering, in particular to a nucleic acid sequence and application thereof as a promoter. The present invention provides a novel promoter sequence which is a nucleic acid having a nucleotide sequence as shown in SEQ ID NO. 1; or a nucleic acid in which one or more nucleotides are substituted, deleted or added in this fragment; or a sequence complementary or partially complementary thereto. The promoter sequence is capable of driving protein expression in a variety of tissues or cells.
Description
Technical Field
The invention relates to the technical field of genetic engineering, in particular to a nucleic acid sequence and application thereof as a promoter.
Background
Gene expression refers to the process of transcription, translation, etc. of a gene to produce biologically active RNA or protein, which displays its stored genetic information, and is regulated by various factors at various levels. Most regulatory events occur at the transcriptional level, whether in eukaryotes or prokaryotes. Transcription is not only the first step in gene expression, but also a key step in controlling gene activity. Promoters are important players in the regulation of transcription levels and have been of interest for a long time.
A promoter is a DNA sequence that RNA polymerase recognizes and binds in order to initiate transcription. It contains the conserved sequences required for specific RNA polymerase binding and transcription initiation, and in most cases it lies upstream of the transcription start site of a structural gene. Such promoters are not themselves transcribed, but some promoters (e.g. tRNA promoters) lie downstream of the transcription start site, and these DNA sequences can be transcribed. Promoter activity is influenced by many factors, and for heterologous gene expression, promoters of different strengths are often selected according to practical requirements.
In evolutionary biology, lineage-specific genomic segments are frequently observed. These genomic fragments are present only in a particular species or lineage and are unique to it. Early studies of such fragments were relatively rare, and their function was poorly understood. Recently, however, a growing number of studies have found that lineage-specific genomic fragments often have important biological functions, frequently associated with species adaptation or the appearance of new phenotypes. In addition, many earlier studies held that new biological phenotypes arise mainly from mutations in the protein-coding regions of genes, and the large non-coding portion of the genome was often regarded as redundant, non-functional "junk DNA". In recent years, however, it has been found that these non-coding regions often have important regulatory effects: they can give genes new tissue or cell expression patterns and are closely related to the emergence of new gene functions.
It is therefore important to identify new genomic fragments with transcription-initiation activity and to use them to construct different expression systems.
Disclosure of Invention
In view of the above, the technical problem to be solved by the present invention is to provide a novel nucleic acid sequence and its use as a promoter, wherein the fragment is derived from Pundamilia nyererei, a fish of the family Cichlidae.
The invention provides a nucleic acid as described in any one of I), II) or III):
I) a nucleic acid with a nucleotide sequence shown as SEQ ID NO. 1;
II) a nucleic acid in which one or more nucleotides are substituted, deleted or added in the fragment of I);
III) a nucleic acid which is partially or completely complementary to I) or II).
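As an illustration of item III), the following minimal Python sketch computes the fully complementary strand of a DNA fragment and the fraction of paired positions between two strands. The function names are hypothetical, and the equal-length, position-by-position comparison is an assumption made for this example; the document does not define how partial complementarity is measured.

```python
# Watson-Crick pairing for DNA: A pairs with T, C pairs with G.
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the fully complementary strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def complementary_fraction(strand_a: str, strand_b: str) -> float:
    """Fraction of positions at which strand_b pairs with strand_a.

    Both strands are given 5'->3' and must have equal length (an assumption
    of this sketch). strand_b is reversed so that position i of strand_a
    faces its antiparallel partner.
    """
    if len(strand_a) != len(strand_b):
        raise ValueError("strands must have equal length")
    paired = sum(
        a.translate(COMPLEMENT) == b
        for a, b in zip(strand_a.lower(), reversed(strand_b.lower()))
    )
    return paired / len(strand_a)


# First 20 nt of SEQ ID NO. 1, used only as a short illustration.
fragment = "actggtgggctctgagtctt"
full = reverse_complement(fragment)   # completely complementary strand
partial = "t" + full[1:]              # one base changed: partially complementary
```

Here `complementary_fraction(fragment, full)` describes the completely complementary case, while the `partial` strand pairs at all but one position.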
In some embodiments, the nucleotide sequence of the nucleic acid is as set forth in any one of SEQ ID NOs 1-3.
The nucleic acid in which one or more nucleotides are substituted, deleted or added in the fragment described in I) refers to a sequence having at least 60% homology with the nucleic acid having the nucleotide sequence shown in SEQ ID NO. 1 and having a similar function.
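The document does not state how the homology percentage is calculated. As a hedged sketch, one common convention is percent identity over the columns of a global alignment; the Python code below implements a small Needleman-Wunsch alignment under that assumption. The scoring values (match 1, mismatch -1, gap -1) and the function name are illustrative choices, not taken from the document, and the two short windows are excerpts of SEQ ID NO. 1 and SEQ ID NO. 2 around the 3 bp difference.

```python
def global_identity(a: str, b: str, match: int = 1,
                    mismatch: int = -1, gap: int = -1) -> float:
    """Percent identity over the columns of an optimal global alignment."""
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    a, b = a.lower(), b.lower()
    n, m = len(a), len(b)

    # Fill the Needleman-Wunsch score matrix.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back one optimal path, counting identical and total columns.
    i, j, identical, columns = n, m, 0, 0
    while i > 0 or j > 0:
        columns += 1
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                identical += a[i - 1] == b[j - 1]
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1          # gap in b
        else:
            j -= 1          # gap in a
    return 100.0 * identical / columns


# Short windows excerpted from SEQ ID NO. 1 and SEQ ID NO. 2.
REF_WINDOW = "cacagagcaaacgtacccatctgatc"
DEL_WINDOW = "cacagagcaaaccccatctgatc"
identity = global_identity(REF_WINDOW, DEL_WINDOW)
meets_threshold = identity >= 60.0   # the "at least 60%" criterion
```

The quadratic-memory matrix is adequate for sequences of a few kilobases; for routine work an established alignment tool would normally be used instead.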
The nucleic acid fragment shown in SEQ ID NO. 1 is derived from an East African cichlid fish, P. nyererei, and the wild-type fragment is 2125 bp long. The nucleic acid fragment shown in SEQ ID NO. 2 is formed by deleting 3 bp from SEQ ID NO. 1. SEQ ID NO. 3 is derived from Nile tilapia (Oreochromis niloticus), a close relative of P. nyererei, and has 94% homology with the nucleic acid sequence shown in SEQ ID NO. 1.
In some embodiments, the 3' end of the nucleic acid as set forth in any one of SEQ ID Nos. 1-3 further includes a nucleotide fragment as set forth in SEQ ID No. 4.
The nucleic acid shown in SEQ ID NO. 1 is derived from the cichlid P. nyererei. The P. nyererei genome fragment was inserted by restriction digestion and ligation into a plasmid that originally contained no promoter, yielding a recombinant plasmid in which this unique genome fragment serves as the upstream promoter driving expression of enhanced green fluorescent protein (eGFP). The recombinant plasmid was injected into single-cell fertilized eggs of zebrafish, a model organism used in developmental and disease research, to obtain a transgenic expression system in which the P. nyererei genome fragment drives eGFP expression in a number of important tissues and cells of the zebrafish.
Further research shows that the nucleic acid fragments shown in SEQ ID NOs. 2-3 have a function similar to that of the fragment shown in SEQ ID NO. 1, but drive weaker protein expression and in fewer cell types. This indicates that the corresponding O. niloticus sequence (SEQ ID NO: 3) did not acquire promoter activity as strong as that of the P. nyererei sequence (SEQ ID NO: 1) during evolution. Moreover, deleting just three consecutive SNPs (GTA) from SEQ ID NO. 1, which gives SEQ ID NO: 2, is enough to alter its expression function. The expression pattern of the P. nyererei sequence after deletion of these 3 bp is similar to that of O. niloticus (SEQ ID NO: 3, which does not contain the three SNPs), showing that the three consecutive SNPs are an important cause of the expression difference between SEQ ID NO: 1 and the sequence of the related species.
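The 3 bp difference between SEQ ID NO. 1 and SEQ ID NO. 2 can be reproduced in silico. The sketch below is an illustration only: the function name is hypothetical, and the flanking bases used to locate the GTA triplet are read off the sequence listings in this document rather than stated by the authors. Anchoring on a unique flanking context avoids removing some other "gta" elsewhere in the fragment.

```python
def delete_at_unique_context(seq: str, left: str, motif: str, right: str) -> str:
    """Delete `motif` where it occurs between `left` and `right` flanks.

    Raises ValueError unless the flanked motif occurs exactly once, so the
    deletion cannot silently hit the wrong position.
    """
    seq = seq.lower()
    target = (left + motif + right).lower()
    first = seq.find(target)
    if first == -1:
        raise ValueError("flanked motif not found")
    if seq.find(target, first + 1) != -1:
        raise ValueError("flanked motif is not unique")
    start = first + len(left)
    return seq[:start] + seq[start + len(motif):]


# Window excerpted from SEQ ID NO. 1 around the three consecutive SNPs.
SEQ1_WINDOW = "catgcatacacagagcaaacgtacccatctgatcagatt"

no_snps = delete_at_unique_context(SEQ1_WINDOW, left="caaac",
                                   motif="gta", right="cccatc")
```

Applied to the window above, the result matches the corresponding stretch of SEQ ID NO. 2 and is exactly 3 bp shorter.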
The invention also provides the application of the nucleic acid as a promoter.
Zebrafish are a well-known model organism used in human disease research and in evolutionary studies. The invention uses zebrafish as the experimental animal, eGFP as the reporter, and the P. nyererei nucleic acid provided by the invention as the promoter. The results show that the nucleic acid can act as a promoter to drive eGFP expression in a number of zebrafish tissues and organs.
The invention also provides a transcription unit containing the nucleic acid and the target gene.
The invention takes eGFP protein as an object, and proves that the nucleic acid provided by the invention can be used as a promoter to drive the expression of eGFP. Based on the same principle, the nucleic acid provided by the invention can also be used for driving the expression of other proteins. The transcriptional unit comprises a nucleic acid of the invention and a CDS region of a gene of interest. In some embodiments, the transcriptional unit further comprises a terminator and/or an enhancer of a promoter.
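The composition of such a transcription unit can be sketched as a simple data structure. The Python example below is a minimal illustration under stated assumptions: the class name, the basic CDS sanity checks and all sequences in the usage lines are hypothetical placeholders, not the constructs of the invention.

```python
from dataclasses import dataclass

STOP_CODONS = {"taa", "tag", "tga"}


@dataclass(frozen=True)
class TranscriptionUnit:
    """Promoter, coding sequence (CDS) and optional terminator, in order."""
    promoter: str
    cds: str
    terminator: str = ""

    def __post_init__(self) -> None:
        cds = self.cds.lower()
        # Basic sanity checks on the coding sequence of the gene of interest.
        if len(cds) % 3 != 0:
            raise ValueError("CDS length must be a multiple of 3")
        if not cds.startswith("atg"):
            raise ValueError("CDS must begin with a start codon (atg)")
        if cds[-3:] not in STOP_CODONS:
            raise ValueError("CDS must end with a stop codon")

    def assemble(self) -> str:
        """Concatenate the parts 5'->3' into one lowercase sequence."""
        return (self.promoter + self.cds + self.terminator).lower()


# Toy placeholder sequences for illustration only.
unit = TranscriptionUnit(promoter="actggtgggc",
                         cds="atggtgagcaagtaa",
                         terminator="aataaa")
assembled = unit.assemble()
```

Swapping the `cds` field for the coding region of another gene of interest leaves the promoter part unchanged, which mirrors the statement that the same nucleic acid can drive the expression of other proteins.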
The invention also provides a plasmid vector comprising a nucleic acid according to the invention or a transcription unit according to the invention.
The plasmid vector also comprises: at least one of an origin of replication, a selectable marker, a multiple cloning site, a primer detection sequence, a transposase fragment, a secretory signal peptide, and a selectable tag.
In some embodiments, the plasmid vector comprises, in sequential linkage: the nucleic acid, SV40 polyA, 150G, Amp resistance selection marker and 200R.
In some embodiments, the nucleic acid fragment of the invention and the pT2AL backbone vector are included in the plasmid vector.
The invention also provides a recombinant host transfected or transformed with the plasmid vector.
The recombinant host is a microorganism, an animal cell and/or a plant cell transfected or transformed with the plasmid vector.
The invention also provides a preparation method of the recombinant protein, which is to culture and induce the recombinant host to express the recombinant protein.
The invention also provides a breeding method, in which embryonic cells or microbial cells are transfected or transformed with the plasmid vector so that the target gene is overexpressed.
The invention also provides a medicament which comprises pharmaceutically acceptable auxiliary materials and at least one of the following i) to iv):
i) the plasmid vector of the present invention;
ii) an agent that increases the activity of the nucleic acid of the invention;
iii) an agent that inhibits the activity of a nucleic acid of the invention;
iv) expression products of the recombinant host according to the invention.
The present invention provides a novel promoter sequence which is a nucleic acid having a nucleotide sequence as shown in SEQ ID NO. 1; or a nucleic acid in which one or more nucleotides are substituted, deleted or added in this fragment; or a sequence complementary or partially complementary thereto. The promoter sequence is capable of driving protein expression in a variety of tissues or cells.
Drawings
FIG. 1a shows the Sanger sequencing result of the pT2AL-Pn plasmid. Sequencing with the T3 primer on the plasmid vector yields the sequence downstream of T3 (such as the 200R fragment on the plasmid, marked in sky blue), the SmaI restriction site (marked in black) and part of the P. nyererei lg gene promoter starting from its 5' end (marked in orange, Pn_promoter_part); the gene of unknown function is named the lg gene in the invention. Note that a high-quality Sanger read is generally about 800 bp long, and the peaks in roughly the first and last 100 bp of a read are unreliable;
FIG. 1b shows the chromatogram of the pT2AL-Pn plasmid Sanger sequencing result (an enlarged view of FIG. 1a). Sequencing with the T3 primer on the plasmid vector yields the sequence downstream of T3, the SmaI restriction site and part of the P. nyererei lg gene promoter starting from its 5' end;
FIG. 2a shows the Sanger sequencing result of the pT2AL-Pn-eGFP plasmid. Sequencing with the T7 primer on the plasmid vector yields the sequence upstream of T7 (150G, marked in sky blue; SV40 polyA, marked in grey), the StuI restriction site (marked in black) and part of the eGFP coding region read in reverse from its 3' end (marked in green). Note that a high-quality Sanger read is generally about 800 bp long, and the peaks in roughly the first and last 100 bp of a read are unreliable;
FIG. 2b shows the chromatogram of the pT2AL-Pn-eGFP plasmid Sanger sequencing result (an enlarged view of FIG. 2a). Sequencing with the T7 primer on the plasmid vector yields the sequence upstream of T7, the StuI restriction site and part of the eGFP coding region read in reverse from its 3' end;
FIG. 3a shows the map of the pT2AL-Pn-eGFP plasmid. The red mark indicates the exogenous sequence (Pn) inserted into the empty plasmid vector, which comprises the restriction sites (black), the promoter sequence upstream of the P. nyererei lg gene (orange, Pn_promoter) and the first 48 bp of the lg gene exon sequence (blue); the green mark indicates the eGFP coding region;
FIG. 3b shows the map of the pT2AL-On-eGFP plasmid. The red mark indicates the O. niloticus sequence (On), homologous to the Pn sequence, inserted into the empty plasmid vector, which comprises the restriction sites (black), the promoter sequence upstream of the O. niloticus lg gene (orange, On_promoter) and the first 48 bp of the O. niloticus lg gene exon sequence (blue); the green mark indicates the eGFP coding region;
FIG. 3c shows the map of the pT2AL-Pn-noSNPs-eGFP plasmid. The red mark indicates the exogenous sequence inserted into the empty plasmid vector, which comprises the restriction sites (black), the promoter sequence upstream of the P. nyererei lg gene after deletion of the three consecutive SNPs (GTA) (orange, Pn_noSNPs_promoter) and the first 48 bp of the lg gene exon sequence (blue); the green mark indicates the eGFP coding region;
FIG. 4a shows the Sanger sequencing result of the pT2AL-On-eGFP plasmid. Sequencing with the T3 primer on the plasmid vector yields the sequence downstream of T3 (such as the 200R fragment on the plasmid, marked in sky blue), the SmaI restriction site (marked in black) and part of the O. niloticus lg gene promoter starting from its 5' end (marked in orange). Note that a high-quality Sanger read is generally about 800 bp long, and the peaks in roughly the first and last 100 bp of a read are unreliable;
FIG. 4b shows the chromatogram of the pT2AL-On-eGFP plasmid Sanger sequencing result (an enlarged view of FIG. 4a). Sequencing with the T3 primer on the plasmid vector yields the sequence downstream of T3, the SmaI restriction site and part of the O. niloticus lg gene promoter starting from its 5' end;
FIG. 5 shows the Sanger sequencing result of the pT2AL-Pn-noSNPs-eGFP plasmid. Primers were designed near the position of the three deleted consecutive SNPs for Sanger sequencing, confirming that the three consecutive SNPs were indeed deleted from the P. nyererei promoter sequence in pT2AL-Pn-noSNPs-eGFP;
FIG. 6a shows co-localization in hatching gland cells; the left panel shows eGFP fluorescence in the 488 nm channel, the right panel shows anti-Cathepsin L/MEP antibody staining in the 647 nm channel, and the middle panel is the merged co-localization image of the two;
FIG. 6b shows co-localization in epidermal chemosensory cells; the left panel shows eGFP fluorescence in the 488 nm channel, the right panel shows anti-Serotonin antibody staining in the 647 nm channel, and the middle panel is the merged co-localization image of the two;
FIG. 6c shows expression in the heart valve; the panel shows eGFP fluorescence in the 488 nm channel;
FIG. 6d shows co-localization in mandibular cells; the left panel shows eGFP fluorescence in the 488 nm channel, the right panel shows anti-SOX10 antibody staining in the 647 nm channel, and the middle panel is the merged co-localization image of the two;
FIG. 6e shows expression in the dorsal fin and anal fin; both panels show eGFP fluorescence in the 488 nm channel;
FIG. 6f shows co-localization in the pectoral fin; the left panel shows eGFP fluorescence in the 488 nm channel, the right panel shows anti-Collagen type II antibody staining in the 647 nm channel, and the middle panel is the merged co-localization image of the two;
FIG. 6g shows co-localization in the caudal fin; the left panel shows eGFP fluorescence in the 488 nm channel, the right panel shows anti-Collagen type II antibody staining in the 647 nm channel, and the middle panel is the merged co-localization image of the two;
FIG. 6h shows co-localization in neurons of the inner and outer nuclear layers of the eye; the leftmost panel shows eGFP fluorescence in the 488 nm channel, the rightmost panel shows anti-HuC/HuD antibody staining in the 647 nm channel, the second panel from the right shows DAPI fluorescence, and the second panel from the left is the merged co-localization image of the three;
FIG. 6i shows co-localization in olfactory epithelial cells; the left panel shows eGFP fluorescence in the 488 nm channel, the right panel shows anti-HuC/HuD antibody staining in the 647 nm channel, and the middle panel is the merged co-localization image of the two;
FIG. 6j shows co-localization in olfactory bulb cells; the left panel shows eGFP fluorescence in the 488 nm channel, the right panel shows anti-HuC/HuD antibody staining in the 647 nm channel, and the middle panel is the merged co-localization image of the two;
FIG. 6k shows co-localization in the optic tectum and cerebellum; the left panel shows eGFP fluorescence in the 488 nm channel, the right panel shows anti-HuC/HuD antibody staining in the 647 nm channel, and the middle panel is the merged co-localization image of the two;
FIG. 6l shows co-localization in the cerebellum; the left panel shows eGFP fluorescence in the 488 nm channel, the right panel shows anti-Pvalb7 antibody staining in the 647 nm channel, and the middle panel is the merged co-localization image of the two;
FIG. 6m shows co-localization in the hindbrain; the left panel shows eGFP fluorescence in the 488 nm channel, the right panel shows anti-HuC/HuD antibody staining in the 647 nm channel, and the middle panel is the merged co-localization image of the two;
FIG. 6n shows co-localization in spinal cord neurons; the left panel shows eGFP fluorescence in the 488 nm channel, the right panel shows anti-HuC/HuD antibody staining in the 647 nm channel, and the middle panel is the merged co-localization image of the two;
FIG. 7a shows fluorescence expression in the transgenic line pT2AL-Pn-eGFP, mainly in olfactory epithelial cells, the olfactory bulb, forebrain, optic tectum of the midbrain, cerebellum, hindbrain, pectoral fin, spinal cord, the mandible (including Meckel's cartilage, the palatoquadrate cartilage, and the hyoid, ceratohyal and gill bones) and epidermal chemosensory cells of the head;
FIG. 7b shows fluorescence expression in the transgenic line pT2AL-On-eGFP; the signal is markedly weaker than in the transgenic line pT2AL-Pn-eGFP, and clear fluorescence can be distinguished mainly in the cerebellum, hindbrain and the first two arches of the mandible (Meckel's cartilage, palatoquadrate cartilage, hyoid and ceratohyal bones);
FIG. 7c shows fluorescence expression in the transgenic line pT2AL-Pn-noSNPs-eGFP; the signal is markedly weaker than in the transgenic line pT2AL-Pn-eGFP and more closely resembles that of the transgenic line pT2AL-On-eGFP, and clear fluorescence can be distinguished mainly in the olfactory epithelium, the parietal gland, the hindbrain and the first two arches of the mandible;
FIG. 7d shows the statistical analysis of fluorescence intensity from confocal images taken with identical parameters; in the cerebellum, the hindbrain and the first two arches of the mandible, the fluorescence of the transgenic line pT2AL-Pn-eGFP is statistically significantly stronger than that of the other two transgenic lines.
Detailed Description
The present invention provides nucleic acid sequences and their use as promoters, which can be achieved by one skilled in the art with the appropriate modification of process parameters in view of the disclosure herein. It is expressly intended that all such similar substitutes and modifications which would be obvious to one skilled in the art are deemed to be included in the invention. While the methods and applications of this invention have been described in terms of preferred embodiments, it will be apparent to those of ordinary skill in the art that variations and modifications in the methods and applications described herein, as well as other suitable variations and combinations, may be made to implement and use the techniques of this invention without departing from the spirit and scope of the invention.
To obtain a novel promoter sequence, a new gene of unknown function was first identified in East African cichlid fishes by comparative genomics and transcriptomics, and was named the lg gene. By sequencing the regions 10 kb upstream and 10 kb downstream of this gene in species representing the major lineages of the East African cichlid phylogenetic tree, we found a fragment unique to East African cichlids, shown in SEQ ID NO: 1, located upstream of the coding region of the lg gene. PCR primers carrying specific restriction sites were designed; the whole genome of the species was extracted with a genome extraction kit, and the fragment, together with part of the 5'-end coding sequence of the lg gene, was amplified by PCR to obtain a target fragment flanked by the specific restriction sites. The fragment was then inserted as a promoter into the empty pT2AL vector by digestion, ligation and transformation, and the eGFP coding region was ligated downstream of it, giving a plasmid in which the unique fragment drives eGFP expression. After endotoxin-free purification, the plasmid was injected into single-cell fertilized eggs of the model organism zebrafish. Individuals expressing eGFP were screened and raised to sexual maturity (F0 generation) and mated with wild-type zebrafish, and eGFP-positive offspring were selected (F1 generation). These were raised to sexual maturity and again mated with wild-type zebrafish to obtain offspring of a purer line expressing eGFP (F2 generation). The results show that the fragment of SEQ ID NO. 1 functions as a promoter and can drive fluorescent protein expression in a variety of important cells and tissues.
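When primers carrying restriction sites are designed for this kind of cloning, it is usual to confirm that the chosen enzymes do not also cut inside the fragment. The Python sketch below illustrates that check for SmaI (CCCGGG) and StuI (AGGCCT), the two sites named in the figure descriptions. Which enzyme is placed on which primer, the two-base 5' clamp, and the function names are assumptions made for illustration; the primers actually used are not given in this section.

```python
# Recognition sequences of the two enzymes named in the figures.
# Both are palindromic, so scanning one strand is sufficient.
RECOGNITION_SITES = {"SmaI": "cccggg", "StuI": "aggcct"}


def internal_sites(insert: str, sites: dict = RECOGNITION_SITES) -> dict:
    """Map each enzyme to the 0-based positions of its site in `insert`."""
    insert = insert.lower()
    hits = {}
    for enzyme, site in sites.items():
        positions = []
        start = insert.find(site)
        while start != -1:
            positions.append(start)
            start = insert.find(site, start + 1)
        hits[enzyme] = positions
    return hits


def add_site_to_primer(primer: str, enzyme: str, clamp: str = "gc") -> str:
    """Prefix a primer with a 5' clamp and the enzyme's recognition site.

    The clamp length is a hypothetical choice; suitable overhangs depend
    on the enzyme and supplier recommendations.
    """
    return clamp + RECOGNITION_SITES[enzyme] + primer.lower()


# First 20 nt of SEQ ID NO. 1, used here as an example forward primer body.
primer_body = "actggtgggctctgagtctt"
forward_primer = add_site_to_primer(primer_body, "SmaI")
clean = internal_sites(primer_body)
```

In practice the check would be run on the full fragment to be amplified, and an enzyme with any internal hit would be replaced by another one.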
In the present invention, complementarity refers to the ability of nucleotide bases to pair through hydrogen bonding; in DNA, for example, adenine is complementary to thymine and cytosine is complementary to guanine. In some embodiments of the present invention, the nucleic acid formed by substituting, deleting or adding one or more nucleotides refers to a sequence that has at least 60% homology with the nucleic acid shown in SEQ ID NO. 1 and has a similar function. In the present invention, homology refers to the degree of matching with the sequence shown in SEQ ID NO. 1. In some embodiments, the homology is not less than 70%, 80%, 90%, 95% or 99%. In the examples, the nucleic acids of SEQ ID NOs. 2-3 were also functionally verified. The nucleic acid fragment of SEQ ID NO. 2 is formed by artificially deleting three consecutive SNPs (GTA) from SEQ ID NO. 1. SEQ ID NO. 3, from Nile tilapia (O. niloticus), a close relative of P. nyererei, has 94% homology with the nucleic acid sequence shown in SEQ ID NO. 1 and naturally lacks the three consecutive SNPs. Promoter analysis of the nucleic acid fragments shown in SEQ ID NOs. 2-3 shows that they have a function similar to that of the fragment shown in SEQ ID NO. 1, but drive weaker protein expression and in fewer cell types. In particular, SEQ ID NO. 2 differs from SEQ ID NO. 1 by only 3 bp, yet its regulatory activity is markedly reduced, which shows the importance of the three consecutive SNPs for the promoter activity of SEQ ID NO. 1.
In the present invention, the nucleic acid sequence of SEQ ID NO. 1 is:
actggtgggctctgagtctttgtttctgtaaactgcagtctgctagtgcatgccactgagtattaatgaaacataataaaaacataataacacaataaaacacggtctatgaaaactcaatcaggtgtaaaaagcatttcaatttttaaaggtctaaagagtcagaaatgtatctgcagaggggaactgttgactcatgtgaatcatgagacgggtttccataagtcagtgctcccatgatcactgcaatgtgttgtttgcattatgtgcacacacacgcacacacataaacacacacacacatgcatacacagagcaaacgtacccatctgatcagattatctgcatgatgtatgtcacaaactttacacaacatgtttacagctctgcagtcctttttccatttttgtttttagaaggaatcattcattttggcacattttaccaaaattacagtggctgactgtgttcatttacaatgagacatttttagaaatggcgtcttataagatgaaaaaaaatgtggaaaaaataaacctgactccactttgtttgtttgttttgttgttttcttctcttcctctctccctctctctctctctctctctctctctcatcacccctcctttgctacatctgtgtctgagtctccttcctcttccctatttcttctgcatctgtttttctcgctccatcaaaacagtcagccagagacagtgctggggctgaggaaatggatggtgtgtctatttttagaagcagggaaggagggcaaaccaaaagtaagcgagggagagagggagcaagagagagagggaggaggaggatgaggtggaggagagggaggggggatgtgaggagaagataaacctgcattctgcctctgcctctggtcagagacagttgaagcttcacagcacaattaacaagcagcagcagatgaaaacagaggatcgtaaataaaattgaggagacaaaaaaaagagagaacgagagcgacggatgcaggctgctatttaaagctgcaccaggagaagagacaagaaaagattcggtggactcaacaaagatcatccgtaaaggtgagtgtgaaggcaaattgagcagaaaacctttacattattcctgagaatgcatgacgaacacacacacagagtgcctgttttgtgcttcagcctcgagctgccgccgtggaaccatttttagctgttgtatacagaaggtgtggtctatatttagcagtggttacaccttggaagaaactgggagaggctggcattttggtttggcttgcgctatttttgtcatgaggtcaatacaagcatacattgtgggctacgctctgggttataacaactcggcgctccacactctgaaacagtttgaggctttacgtgtctgagatcccacaaaacagtcatctgtgtgtgcagcctctcactgggattgtttgtatattgaacggtcaacacacacaactccagagtaagcggcgcacggtgcctgaaatgaaggaaagggagaggagaggctcctatgtttacttggttttatcgatacagccagcagttgtagtcggttgtgttttgggagcttcggacaatgtgagcaatgtccagatgcacattatgtcattagaaaaacacattagactgatgaggtgattaaacaagaggcttttcaaaagaaactgaggaaaaattgttgcaaatacaatgcaacacaacagtgtgtacgaatgagaaattaactccagaagttataagtcagttcgtgagataatagtgcacagattcatctttttatttgtatcgtgtggagacacatattgcaagtttcgacataaatgacacacacaaacagggctctcactttgtaaactaggaagtaagattgcatacgttgcatctgaccctatcaggaatgtgtcacatctatcaaaatatccagttttccaaatggtgattatattttcctaaaggattctgcttttttgagac
ttggatgcagtatttcacattacacagtttttttccccatctaaatgagacctctcccgtcttgtctttctctcagct
The nucleic acid sequence shown in SEQ ID NO. 2 is:
actggtgggctctgagtctttgtttctgtaaactgcagtctgctagtgcatgccactgagtattaatgaaacataataaaaacataataacacaataaaacacggtctatgaaaactcaatcaggtgtaaaaagcatttcaatttttaaaggtctaaagagtcagaaatgtatctgcagaggggaactgttgactcatgtgaatcatgagacgggtttccataagtcagtgctcccatgatcactgcaatgtgttgtttgcattatgtgcacacacacgcacacacataaacacacacacacatgcatacacagagcaaaccccatctgatcagattatctgcatgatgtatgtcacaaactttacacaacatgtttacagctctgcagtcctttttccatttttgtttttagaaggaatcattcattttggcacattttaccaaaattacagtggctgactgtgttcatttacaatgagacatttttagaaatggcgtcttataagatgaaaaaaaatgtggaaaaaataaacctgactccactttgtttgtttgttttgttgttttcttctcttcctctctccctctctctctctctctctctctctctcatcacccctcctttgctacatctgtgtctgagtctccttcctcttccctatttcttctgcatctgtttttctcgctccatcaaaacagtcagccagagacagtgctggggctgaggaaatggatggtgtgtctatttttagaagcagggaaggagggcaaaccaaaagtaagcgagggagagagggagcaagagagagagggaggaggaggatgaggtggaggagagggaggggggatgtgaggagaagataaacctgcattctgcctctgcctctggtcagagacagttgaagcttcacagcacaattaacaagcagcagcagatgaaaacagaggatcgtaaataaaattgaggagacaaaaaaaagagagaacgagagcgacggatgcaggctgctatttaaagctgcaccaggagaagagacaagaaaagattcggtggactcaacaaagatcatccgtaaaggtgagtgtgaaggcaaattgagcagaaaacctttacattattcctgagaatgcatgacgaacacacacacagagtgcctgttttgtgcttcagcctcgagctgccgccgtggaaccatttttagctgttgtatacagaaggtgtggtctatatttagcagtggttacaccttggaagaaactgggagaggctggcattttggtttggcttgcgctatttttgtcatgaggtcaatacaagcatacattgtgggctacgctctgggttataacaactcggcgctccacactctgaaacagtttgaggctttacgtgtctgagatcccacaaaacagtcatctgtgtgtgcagcctctcactgggattgtttgtatattgaacggtcaacacacacaactccagagtaagcggcgcacggtgcctgaaatgaaggaaagggagaggagaggctcctatgtttacttggttttatcgatacagccagcagttgtagtcggttgtgttttgggagcttcggacaatgtgagcaatgtccagatgcacattatgtcattagaaaaacacattagactgatgaggtgattaaacaagaggcttttcaaaagaaactgaggaaaaattgttgcaaatacaatgcaacacaacagtgtgtacgaatgagaaattaactccagaagttataagtcagttcgtgagataatagtgcacagattcatctttttatttgtatcgtgtggagacacatattgcaagtttcgacataaatgacacacacaaacagggctctcactttgtaaactaggaagtaagattgcatacgttgcatctgaccctatcaggaatgtgtcacatctatcaaaatatccagttttccaaatggtgattatattttcctaaaggattctgcttttttgagacttg
gatgcagtatttcacattacacagtttttttccccatctaaatgagacctctcccgtcttgtctttctctcagct
the nucleic acid sequence shown in SEQ ID NO. 3 is:
actggtgggctctgagtctttgtttctgtaaactgcagtctgctagtgcttgccactgagtattaatgaaacataataaaacataataacacaataaaacacagtctatgaaaactcaatcaggtgtaaagagcatttcaatttttaaaggtctaatgagtcagaaatgtatctgtagatgagaactgttgactcatgtgactcatgagacaggtttccataagtcagtgctcccgtgatcactgcaatgtgttgtttgcattacgtgcacacacgtgcgcacgcataaacacacacacatccatatacacagagcaaacatgccaaagcccatctgatcagattgtctgcatgatgtatgtgacaagctttacacaacatgtttacagctctgcagtcctttttccatttttgtttttagaaggaatcgttcattttggcacattcaccaaaattacagtggctgactgtgttcatttacagtgagacatttttagaaatggcgtcttataagatgaaaaaaaatgtggaaaaaataaacctgtctccactttgtttgttttgttgttttctcctctctctctctctctctctctctctctctctctctctcatcacccctctgttacatctgtgtctgagtctccttcctcttccctctttcttctgcctctgtttttctcgttccatcaatacagtcagccagagacagtgctggggctgaggaaatggatggtgtgtctatttttagaagcagggaaggagggcaaaccaaaagtaagcgagggagagagggagcaagagagagagggaggaggaggatgaggtggaggagagggaggggggatgtgaggagaagataaatctgcattctgcctctgcctctggtcagagacagttgaagcttcacagcacaattaaggagaagcagcagcagatgaaaacagacgatcgtaaataaaattgaggagacaaaaaaagagagaaagagagcgacggatgcaggctgaaatttaaaactgcaccaggagaaaagacaagaaaagattccgtggactcaacaaagatcatccgtaaaggtgagtgtgaaggcaaattgagcagaaaacctttacattattcctgagaatgcatgacgaacacacacacagggtgcctgttttgtgcttcagcctcgagctgccgccgtggaaccatttttagcttttgtatacagaaggtgtggtctatatttagcagtggttacaccttggaagaaactgggagaggctggcattttggtttggcttgtgctatttttgtcacgaggtcaatacaagcatatattgtgggctacactcagggttataacaactcggcgctccacactctgaaacagtttgaggctttacgtgtctgagatcccacaaaacagtcatctgtgtgtgcagcctctcgctgggattgtttgtatattgaacggtcaacacacacaactccagagtaagcagcgcacggtgcctgaaatgaagggaagggagaggagaggcttctgtgtttacttggttttatcgatacagccagcagttgtagtcggttgtgttttgaaaacgcttcgggggatgtgagcaatgtccagatgtacattatgtcattagaaaaacacattagactgatgaggtgattaaacaagaggcttttgaaaagaaactgagcaaaaattgttgcaaatacaatgtgtgtacgaatgtgaaattaactccagaagttataagtcagttcatgagataacagtgcacagatttatatttttatttgtattgtgtggaggcacatattgcaagtttcgacataaaggacacacacaacacactcactttgtaaactaggaagtaagattgcatacgttgcatctgaccctatcaggaatgtgtcacacctatcaaaatatccagttttccaaacggtgattatattttcctaaaggattctgctttttcgagacttggatgcagtatttc
acattacgcagtttttttccccatctaaatgagacctctcccgtcttgtctttctctcagct
the nucleic acid sequence shown in SEQ ID NO. 4 is: atgcggccacggtccagactgctgtccaagaggagactgacgctgccc
In the invention, the promoter is a DNA sequence located upstream of the 5' end of a structural gene that can activate RNA polymerase and initiate gene transcription. The research underlying the invention shows that the nucleic acid can be used as a promoter to drive the expression of eGFP in multiple tissues and organs of zebrafish. In practical applications, the target gene whose expression is driven by the promoter of the present invention may be eGFP or another target gene.
In some embodiments, the sites at which the nucleic acid can drive expression of eGFP fluorescent protein in zebrafish include the following. First, it drives expression in the heart valve, so the nucleic acid can serve as a molecular marker for studying tissues or cells involved in heart valve development and related diseases. Second, it drives expression in many neurons of the zebrafish brain, including olfactory epithelial cells and cells in the olfactory bulb, which recognize chemical signals used in capturing food, evading predators, finding mates and the like, as well as neurons in the vision-related optic tectum of the midbrain, the cerebellum associated with motor balance, the hindbrain and the spinal cord. Expression in different types of brain neurons helps in studying brain function and provides important molecular markers for constructing models of related diseases. Third, it drives expression in neurons of the inner and outer nuclear layers of the eye, providing important molecular markers for studying the function of the visual nerves and constructing corresponding disease models. Fourth, it drives expression in chemosensory cells in fish skin. These cells sense the external environment, so this expression provides an important research basis and model for studying how fish sense the environment and how these cells interact with it. Fifth, it drives expression in hatching gland cells, which secrete enzymes involved in egg hatching and therefore play an important role in fish hatching and development; this is valuable both for basic research on fish hatching and for aquaculture applications. Sixth, it drives expression in fish fins and the mandibular skeleton, providing important molecular markers for understanding fish skeletal development and constructing skeletal disease models.
In the present invention, the promoter in the transcription unit is the nucleic acid of the present invention, and the promoter is linked to the target gene either directly or via a linker sequence. In some embodiments, the transcription unit further comprises a terminator and/or an enhancer of the promoter. An enhancer of the promoter is understood as a regulatory sequence of gene expression; it may be, for example, any control sequence, other than the promoter that effects transcription of the gene, that regulates transcription of the gene, or a sequence encoding a suitable mRNA ribosome binding site, etc. The target gene in the transcription unit may also be called the gene of interest and refers to a gene encoding a target protein, which is an eGFP-encoding gene in the present embodiments. However, the genes whose expression can be driven by the promoter of the present invention are not limited thereto; in other applications, the target gene may be the CDS region of another structural gene.
In the present invention, the plasmid vector refers to a DNA molecule capable of expressing a target gene in an appropriate host. The backbone of the plasmid vector is not limited by the present invention; any backbone that can be expressed in a host cell is within the scope of the present invention. In the present invention, the nucleic acid of the present invention is used as a promoter upstream of the target gene in the plasmid vector, and the promoter may or may not contain other nucleic acid sequences. In some embodiments, the plasmid vector does not contain an additional promoter upstream of the target protein. The plasmid vector contains the nucleic acid of the invention, or contains the transcription unit of the invention. The backbone vector of the plasmid vector may be a natural or recombinant plasmid, cosmid, virus, or phage. It may be a linear vector or a circular vector, and it may be an integrative plasmid vector, an episomal vector or a shuttle plasmid. The vector includes elements for plasmid replication and gene expression, for example at least one of an origin of replication, a selectable marker, a multiple cloning site, a primer detection sequence, a transposase fragment, a secretion signal peptide, a selectable tag, or an origin fragment. The selectable marker is at least one of a resistance selectable marker, an auxotrophic selectable marker, or a fluorescent marker; the resistance selectable marker is an ampicillin resistance marker, a kanamycin resistance marker, a tetracycline resistance marker, a streptomycin resistance marker or a chloramphenicol resistance marker. For example, the backbone of the plasmid vector can be one of a series of vectors such as pWE15, M13, λ LB3, λ BL4, λ xii, λ ASHII, λ APII, λ t10, λ t11, Charon4A, Charon 21A; it may also be a pBR, pUC, pBluescriptII, pGEM, pTZ, pCL or pET series vector.
More specifically, it may be pECCG117, pDZ, pACYC177, pACYC184, pCL, pUC19, pBR322, pMW118, pCC1BAC, pCES208, pXMJ19, pT2AL, pCS2 vectors.
In the present invention, the recombinant host refers to a host transfected or transformed with the plasmid vector of the present invention. Transfection is a process by which eukaryotic cells actively or passively take up foreign DNA fragments under certain conditions and thereby acquire a new phenotype. Transformation is the phenomenon in which a cell of one genotype takes up DNA of another genotype from the surrounding medium, so that its genotype and phenotype change accordingly. Transfection or transformation methods common in the art include: electroporation, lipofection, calcium phosphate (CaPO₄) precipitation, calcium chloride (CaCl₂) precipitation, microinjection, polyethylene glycol (PEG) technology, DEAE-dextran technology, cationic liposome technology, and lithium acetate-DMSO technology. Gene editing methods may also be used to insert the promoter fragment of the present invention into the host genome, for example the zinc finger nuclease method, TALENs technology, CRISPR/Cas9 technology, transposase-mediated transgenesis and the like. The recombinant host may be a microorganism, a plant cell or an animal cell. Preferably, the host cell of the recombinant host is a microorganism or an animal cell. In some embodiments, the animal cell is a zebrafish embryonic cell.
The promoter of the present invention can be used to prepare recombinant proteins, and can also be used to overexpress a given protein in order to raise the level of a cell metabolite. The promoter can be used to express a fluorescent protein or another selectable marker for localization analysis of tissues or cell types in the recombinant host. It can be introduced into embryonic cells to drive the expression of target genes, thereby constructing new cell lines or new animal varieties. Used as a medicine, the promoter of the present invention can raise the expression level of a target gene for the purpose of treating disease.
The test materials used in the invention are all common commercial products available on the market. The backbone vector pT2AL was obtained from Professor Li's laboratory at the College of Life Sciences, Southwest University.
The invention is further illustrated by the following examples:
Examples
First, plasmid construction
1. Construction of pT2AL-Pn-eGFP plasmid
First, the specific upstream promoter sequence of the East African cichlid P. nyererei, together with the first 48 bp of the coding region of the lg gene (hereinafter the Pn sequence), was ligated into the empty pT2AL vector to construct the pT2AL-Pn plasmid; then the coding region of the eGFP fluorescent protein was ligated downstream of the Pn sequence in the pT2AL-Pn plasmid to obtain the pT2AL-Pn-eGFP plasmid.
The method comprises the following specific steps:
1.1 construction of pT2AL-Pn plasmid
1.1.1 By analyzing the sequence of the multiple cloning site of the empty vector pT2AL (provided by Professor Li's laboratory, Southwest University), a suitable pair of restriction sites, Sma I and BamH I (NEB), was identified. Both sites are present in the multiple cloning site of pT2AL but absent from the target fragment Pn, so the target fragment is not cut.
1.1.2 Primers carrying the restriction sites, with protective bases at the 5' end, were designed:
F (5'-3'): TCCCCCGGGACTGGTGGGCTCTGAGT;
R (5'-3'): CGGGATCCGGGCAGCGTCAGTCTCCTCT;
The target fragment was amplified from the P. nyererei genome by PCR using the high-fidelity enzyme PrimeSTAR (Takara) (annealing temperature 60 °C to 50 °C, extension time 2 min), introducing the two restriction sites on either side of the target fragment to be ligated.
1.1.3 The PCR amplification products were purified using a gel recovery kit (Biospin); for detailed procedures, see the kit instructions.
1.1.4 The empty vector pT2AL plasmid and the purified PCR-amplified fragment were each double-digested using a double digestion system (NEB) to expose the same restriction sites (Sma I, BamH I) in the plasmid vector and the PCR product. The digested products were purified and recovered with a gel recovery kit (Biospin), and their concentrations were measured with a Nanodrop 2000.
1.1.5 The purified, linearized empty vector pT2AL and the purified PCR target fragment were ligated with ligase (Solution I, Takara) at 16 °C for 2 h. The ligation reaction volumes were calculated using the following formula:
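$$\frac{C_1 V_1 / (\text{length of the target fragment})}{C_2 V_2 / (\text{length of the vector fragment})} = 10$$
where $V_1 + V_2 = 7.5\ \mu\text{L}$ and Solution I $= 2.5\ \mu\text{L}$;
$C_1$ is the concentration of the target fragment and $C_2$ is the concentration of the vector, with $V_1$ and $V_2$ the corresponding volumes.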
1.1.6 Transformation: the above ligation product (10 μL) was added to a 1.5 mL EP tube containing 30 μL of E. coli DH5α competent cells (green), placed on ice for 30 min, heat-shocked at 42 °C for 90 s, and then placed on ice for 5 min. 1 mL of E. coli culture medium was added, and the cells were cultured at 37 °C and 180 rpm for 30 min. After centrifugation at 4000 rpm for 5 min, most of the supernatant was discarded (about 200 μL of liquid was retained). The pellet was resuspended in the remaining liquid by repeated pipetting, spread evenly on ampicillin-containing E. coli culture medium, and cultured overnight at 37 °C to select colonies carrying the ampicillin-resistant pT2AL plasmid with the target fragment inserted, giving the plasmid pT2AL-Pn.
1.1.7 A positive single colony was picked from the selection medium into 1 mL of culture medium containing ampicillin and expanded with shaking at 37 °C. The culture was then sent to a company for plasmid extraction and for Sanger sequencing from the T3 site of the multiple cloning site (engine family), to confirm that the selected clone indeed carried the successfully ligated pT2AL-Pn plasmid. The sequencing results are shown in FIG. 1a and FIG. 1b.
1.1.8 The pT2AL-Pn plasmid successfully constructed as described above was transformed into E. coli DH5α competent cells (transformation as in step 1.1.6). A single colony was picked into 40 mL of culture medium containing ampicillin and cultured at 37 °C and 180 rpm for 12-16 h. The plasmid was then extracted (medium-yield plasmid miniprep kit, Tiangen; see the kit instructions for details) and stored in a -20 °C refrigerator.
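The ligation formula of step 1.1.5 can be turned into a small calculation. The sketch below assumes the formula expresses a 10:1 insert-to-vector molar ratio, that is (C1·V1/insert length)/(C2·V2/vector length) = 10 with V1 + V2 = 7.5 μL. The function name, the example concentrations and the fragment lengths are hypothetical and are not taken from the patent.

```python
def ligation_volumes(c_insert, len_insert, c_vector, len_vector,
                     molar_ratio=10.0, total_volume=7.5):
    """Return (V1, V2) in uL for the target fragment and the vector.

    Assumption: the formula is a molar ratio,
        (C1*V1/len_insert) / (C2*V2/len_vector) = molar_ratio,
    subject to V1 + V2 = total_volume.
    Concentrations are in ng/uL and lengths in bp.
    """
    if min(c_insert, len_insert, c_vector, len_vector) <= 0:
        raise ValueError("concentrations and lengths must be positive")
    # Rearranging the ratio gives V1 = k * V2.
    k = molar_ratio * c_vector * len_insert / (c_insert * len_vector)
    v_vector = total_volume / (1.0 + k)
    v_insert = total_volume - v_vector
    return v_insert, v_vector


# Hypothetical Nanodrop readings and fragment lengths, for illustration only.
v1, v2 = ligation_volumes(c_insert=50.0, len_insert=2500,
                          c_vector=40.0, len_vector=5000)
print(f"target fragment: {v1:.2f} uL, vector: {v2:.2f} uL")
```

With these made-up inputs the 7.5 μL is split into 6.0 μL of fragment and 1.5 μL of vector, and 2.5 μL of Solution I brings the reaction to 10 μL.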
1.2 Construction of pT2AL-Pn-eGFP plasmid
1.2.1 By analyzing the multiple cloning site of the pT2AL-Pn plasmid constructed above and the sequence of the eGFP coding region, a suitable pair of restriction sites, BamH I and Stu I, was identified. Both sites are present in the multiple cloning site of pT2AL-Pn but absent from the eGFP coding region, so the eGFP coding region is not cut.
1.2.2 Primers carrying the restriction sites, with protective bases at the 5' end, were designed:
F (5'-3'): CGGGATCCATGGTGAGCAAGGGCGAGGA,
R (5'-3'): GAAGGCCTTTACTTGTACAGCTCGTCCA;
The eGFP coding region (the target fragment) was amplified by PCR using the high-fidelity enzyme PrimeSTAR (Takara) (annealing temperature 60 °C to 50 °C, extension time 1 min), introducing the restriction sites on both sides of the fragment.
1.2.3 The PCR amplification products were purified using a gel recovery kit (Biospin); for detailed procedures, see the kit instructions.
1.2.4 The pT2AL-Pn plasmid and the purified eGFP PCR product were each double-digested using a double digestion system (NEB) to expose the same restriction sites (BamH I, Stu I) in the plasmid vector and the eGFP fragment. The digested products were purified with a gel recovery kit (Biospin), and their concentrations were measured with a Nanodrop 2000.
1.2.5 The purified, double-digested linear vector pT2AL-Pn was ligated with the purified eGFP target fragment using ligase (Solution I, Takara) at 16 °C for 2 h. The ligation reaction volumes were calculated using the following formula:
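$$\frac{C_1 V_1 / (\text{length of the target fragment})}{C_2 V_2 / (\text{length of the vector fragment})} = 10$$
where $V_1 + V_2 = 7.5\ \mu\text{L}$ and Solution I $= 2.5\ \mu\text{L}$;
$C_1$ is the concentration of the target fragment and $C_2$ is the concentration of the vector, with $V_1$ and $V_2$ the corresponding volumes.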
1.2.6 Transformation: same as step 1.1.6. Colonies carrying the ampicillin-resistant pT2AL-Pn plasmid with the eGFP target fragment ligated in were selected, giving the plasmid pT2AL-Pn-eGFP.
1.2.7 A positive single colony was picked from the selection medium into 1 mL of culture medium containing ampicillin and expanded with shaking at 37 °C. The culture was then sent to a company for plasmid extraction and for Sanger sequencing from the T7 site of the multiple cloning site (Scout Biotech), to confirm that the selected clone indeed carried the successfully ligated pT2AL-Pn-eGFP plasmid. The sequencing results are shown in FIG. 2a and FIG. 2b.
1.2.8 The pT2AL-Pn-eGFP plasmid successfully constructed as described above was transformed into E. coli competent cells (transformation as in step 1.1.6). A single colony was picked into 40 mL of culture medium containing ampicillin and cultured at 37 °C and 180 rpm for 12-16 h. The plasmid was then extracted (medium-yield plasmid miniprep kit, Tiangen; see the kit instructions for details) and stored in a -20 °C refrigerator.
1.2.9 Finally, the foreign fragment inserted into the selected plasmid was sequenced through with the T3 and T7 primers, which further confirmed that the inserted sequence was the target sequence Pn-eGFP and that the constructed plasmid was pT2AL-Pn-eGFP (see the plasmid map in FIG. 3a; the full sequence of the plasmid is shown in SEQ ID NO: 5).
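Because the Pn fragment ends with the first 48 bp of the lg coding region and eGFP is joined to it through the BamH I site, it can be useful to check that the eGFP start codon stays in frame. The following is a minimal sketch of such a check. It assumes the junction is exactly the 48 bp listed for SEQ ID NO. 4, followed by the BamH I site, followed by the annealing part of the eGFP forward primer; the variable and function names are illustrative only.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# First 48 bp of the lg coding region, as listed for SEQ ID NO. 4.
LG_CDS_48 = "atgcggccacggtccagactgctgtccaagaggagactgacgctgccc".upper()
BAMHI = "GGATCC"
# eGFP forward primer without its CG protective bases and BamH I site.
EGFP_START = "ATGGTGAGCAAGGGCGAGGA"


def codons(seq):
    """Split seq into complete triplets, ignoring any trailing 1-2 bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


# Assumed junction: lg coding start + BamH I site + start of eGFP.
junction = LG_CDS_48 + BAMHI + EGFP_START
upstream_len = len(LG_CDS_48) + len(BAMHI)

in_frame = upstream_len % 3 == 0
premature_stops = [c for c in codons(junction) if c in STOP_CODONS]

print("lg coding bases:", len(LG_CDS_48))
print("eGFP ATG in frame:", in_frame)
print("stop codons before end of junction:", premature_stops)
```

Under these assumptions the 48 coding bases plus the 6 bp site give 54 bases upstream of the eGFP ATG, a multiple of three, and no stop codon appears in the reading frame across the junction.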
2. Construction of pT2AL-On-eGFP plasmid
The Pn sequence in the constructed pT2AL-Pn-eGFP plasmid was excised using the double restriction sites (Sma I and BamH I) flanking it and replaced by the sequence homologous to Pn from Nile tilapia O. niloticus, a closely related species of the East African cichlid P. nyererei, so as to construct the pT2AL-On-eGFP plasmid. The specific steps are as follows:
2.1 The pT2AL-On-eGFP plasmid was constructed using the double restriction sites (Sma I and BamH I) flanking the Pn sequence in the pT2AL-Pn-eGFP plasmid constructed above (these sites are absent from the On sequence, so the On sequence is not cut). Primers carrying the restriction sites, with protective bases at the 5' end, were designed:
F (5'-3'): TCCCCCGGGACTGGTGGGCTCTGAGT,
R (5'-3'): CGGGATCCGGGCAGCGTCAGTCTCCTCT,
The target fragment On was amplified from the Nile tilapia O. niloticus genome by PCR (annealing temperature 60 °C to 50 °C, extension time 2 min), introducing the two restriction sites on either side of the target fragment to be ligated.
2.2 The PCR amplification products were purified using a gel recovery kit (Biospin); for detailed procedures, see the kit instructions.
2.3 The constructed pT2AL-Pn-eGFP plasmid and the purified PCR product were each double-digested using a double digestion system (NEB) to expose the same restriction sites (Sma I and BamH I) in the plasmid vector and the PCR product, which also excised the Pn sequence from the pT2AL-Pn-eGFP plasmid. The digested products were purified and recovered with a gel recovery kit (Biospin), and their concentrations were measured with a Nanodrop 2000.
2.4 The purified linear vector pT2AL-eGFP and the purified PCR target fragment were ligated with ligase (Solution I, Takara) at 16 °C for 2 h. The ligation reaction volumes were calculated using the following formula:
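$$\frac{C_1 V_1 / (\text{length of the target fragment})}{C_2 V_2 / (\text{length of the vector fragment})} = 10$$
where $V_1 + V_2 = 7.5\ \mu\text{L}$ and Solution I $= 2.5\ \mu\text{L}$;
$C_1$ is the concentration of the target fragment and $C_2$ is the concentration of the vector, with $V_1$ and $V_2$ the corresponding volumes.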
2.5 Transformation: same as step 1.1.6. Colonies carrying the ampicillin-resistant pT2AL-eGFP plasmid with the On sequence ligated in were selected, giving the plasmid pT2AL-On-eGFP.
2.6 A positive single colony was picked from the selection medium into 1 mL of culture medium containing ampicillin and expanded with shaking at 37 °C. The culture was then sent to a company for plasmid extraction and for Sanger sequencing from the T3 site of the multiple cloning site (engine department), to confirm that the selected clone indeed carried the successfully ligated pT2AL-On-eGFP plasmid. The sequencing results are shown in FIG. 4a and FIG. 4b.
2.7 The pT2AL-On-eGFP plasmid successfully constructed as described above was transformed into E. coli competent cells (transformation as in step 1.1.6). A single colony was picked into 40 mL of culture medium containing ampicillin and cultured at 37 °C and 180 rpm for 12-16 h. The plasmid was then extracted (medium-yield plasmid miniprep kit, Tiangen; see the kit instructions for details) and stored in a -20 °C refrigerator.
2.8 Finally, the fragment inserted into the selected plasmid was sequenced through with the T3 and T7 primers, which further confirmed that the inserted sequence was the target sequence On-eGFP and that the constructed plasmid was pT2AL-On-eGFP (see the plasmid map in FIG. 3b).
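The primer layout used in steps 1.1.2, 1.2.2 and 2.1 (protective bases, then a restriction site, then the annealing region) can be checked programmatically. The sketch below splits each listed primer at the first known recognition site. The recognition sequences for Sma I, BamH I and Stu I are the standard ones; the dictionary keys, the function name and the 22-base template prefix are labels chosen for this illustration.

```python
# Recognition sequences of the enzymes named in the text.
SITES = {"SmaI": "CCCGGG", "BamHI": "GGATCC", "StuI": "AGGCCT"}

# Primer sequences copied from steps 1.1.2 / 2.1 and 1.2.2.
PRIMERS = {
    "Pn/On F": "TCCCCCGGGACTGGTGGGCTCTGAGT",
    "Pn/On R": "CGGGATCCGGGCAGCGTCAGTCTCCTCT",
    "eGFP F": "CGGGATCCATGGTGAGCAAGGGCGAGGA",
    "eGFP R": "GAAGGCCTTTACTTGTACAGCTCGTCCA",
}

# First 22 bases of the listed promoter sequences, upper-cased.
TEMPLATE_START = "ACTGGTGGGCTCTGAGTCTTTG"


def split_primer(primer):
    """Return (protective bases, enzyme name, annealing region).

    The primer is split at the recognition site closest to its 5' end.
    """
    hits = [(primer.find(site), name)
            for name, site in SITES.items() if site in primer]
    if not hits:
        raise ValueError("no known restriction site in primer")
    pos, name = min(hits)
    return primer[:pos], name, primer[pos + len(SITES[name]):]


for label, seq in PRIMERS.items():
    protective, enzyme, annealing = split_primer(seq)
    print(f"{label}: {protective} | {enzyme} | {annealing}")

# The forward annealing region should match the start of the template.
print(TEMPLATE_START.startswith(split_primer(PRIMERS["Pn/On F"])[2]))
```

The same split can be applied to any further primer pair before ordering, as a quick guard against a mistyped site.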
3. Construction of pT2AL-Pn-noSNPs-eGFP plasmid
Three specific consecutive SNPs (GTA) in the Pn sequence were deleted by bridge PCR amplification (see Reikofski J, Tao BY. Polymerase chain reaction (PCR) techniques for site-directed mutagenesis. Biotechnol Adv. 1992;10:535-47) to obtain the sequence Pn-noSNPs. The Pn sequence in the constructed pT2AL-Pn-eGFP plasmid was excised using the double restriction sites (Sma I and BamH I) flanking it and replaced by the Pn-noSNPs sequence, in which the three consecutive SNP sites are deleted, so as to construct the pT2AL-Pn-noSNPs-eGFP plasmid. The specific steps are as follows:
3.1 The three specific consecutive SNP sites in the Pn sequence were deleted by bridge PCR to obtain the Pn-noSNPs sequence. Briefly, primers were designed at the position where the SNPs were to be deleted and used to amplify an upstream fragment (AB) and a downstream fragment (CD), each containing the deletion site. The two PCR fragments were purified and recovered separately and then mixed (AB + CD). PCR primers were then designed at the two ends of the whole sequence (AD), and the whole sequence was amplified using the mixed fragments (AB + CD) as template, thereby deleting the three consecutive SNP positions from the target fragment. When the whole sequence was amplified, the double restriction sites (Sma I and BamH I) with protective bases were added to the primers, so that the restriction sites were introduced into the PCR product. Each primer sequence (5'-3') is as follows:
FA:TCCCCCGGGCTGGTGGGCTCTGAGTCTTTG
FB:TCTGATCAGATGGGGTTTGCTC
FC:GAGCAAACCCCATCTGATCAG
FD:CGGGATCCGGGCAGCGTCAGTCTCCTC
The target fragment Pn-noSNPs was amplified from the plasmid pT2AL-Pn-eGFP by PCR using the high-fidelity enzyme PrimeSTAR (Takara) (annealing temperature 50 °C; extension time 1 min for each of the AB and CD fragments and 2 min for the AD fragment).
3.2 The PCR amplification product was purified using a gel recovery kit (Biospin); for detailed procedures, see the kit instructions.
3.3 The constructed pT2AL-Pn-eGFP plasmid and the purified Pn-noSNPs PCR product were each double-digested using a double digestion system (NEB) to expose the same restriction sites (Sma I and BamH I) in the plasmid vector and the Pn-noSNPs PCR product, which also excised the Pn sequence from the pT2AL-Pn-eGFP plasmid. The digested products were purified and recovered with a gel recovery kit (Biospin), and their concentrations were measured with a Nanodrop 2000.
3.4 The purified linear vector pT2AL-eGFP and the purified Pn-noSNPs target fragment were ligated with ligase (Solution I, Takara) at 16 °C for 2 h. The ligation reaction volumes were calculated using the following formula:
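$$\frac{C_1 V_1 / (\text{length of the target fragment})}{C_2 V_2 / (\text{length of the vector fragment})} = 10$$
where $V_1 + V_2 = 7.5\ \mu\text{L}$ and Solution I $= 2.5\ \mu\text{L}$;
$C_1$ is the concentration of the target fragment and $C_2$ is the concentration of the vector, with $V_1$ and $V_2$ the corresponding volumes.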
3.5 Transformation: same as step 1.1.6. Colonies carrying the ampicillin-resistant pT2AL-eGFP plasmid with the Pn-noSNPs target fragment ligated in were selected, giving the plasmid pT2AL-Pn-noSNPs-eGFP.
3.6 A positive single colony was picked from the selection medium into 1 mL of culture medium containing ampicillin and expanded with shaking at 37 °C. The culture was then sent to a company for plasmid extraction. A primer (testF, 5'-3': GTCTATGAAAACTCAATCAGGTG) was designed near the position where the three consecutive SNP sites were deleted and used for Sanger sequencing (Scigosaccae), to confirm that the selected clone indeed carried the successfully ligated pT2AL-Pn-noSNPs-eGFP plasmid with the three consecutive SNPs (GTA) deleted. The sequencing results are shown in FIG. 5.
3.7 The pT2AL-Pn-noSNPs-eGFP plasmid successfully constructed as described above was transformed into E. coli competent cells (transformation as in step 1.1.6). A single colony was picked into 40 mL of culture medium containing ampicillin and cultured at 37 °C and 180 rpm for 12-16 h. The plasmid was then extracted (medium-yield plasmid miniprep kit, Tiangen; see the kit instructions for details) and stored in a -20 °C refrigerator.
3.8 Finally, the fragment inserted into the selected plasmid was sequenced through with the T3 and T7 primers, which further confirmed that the constructed plasmid was pT2AL-Pn-noSNPs-eGFP (see FIG. 3c).
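The bridge PCR logic of step 3.1 can be illustrated in two parts. First, the inner primers FB and FC listed above should be reverse complements of each other over the bridging region, which is easy to verify. Second, a toy model shows how two products that both skip the deleted bases fuse across their shared overlap. The toy template, the overlap length and the function names below are hypothetical and only illustrate the principle.

```python
def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Inner primers copied from step 3.1.
FB = "TCTGATCAGATGGGGTTTGCTC"
FC = "GAGCAAACCCCATCTGATCAG"

# The reverse complement of FC should be contained in FB.
print(revcomp(FC) in FB)


def delete_by_bridge_pcr(template, start, length, overlap=10):
    """Toy model of deleting template[start:start+length].

    AB ends with, and CD starts with, the same junction sequence that
    spans the deletion, so the two products can be fused across it.
    """
    left, right = template[:start], template[start + length:]
    if len(left) < overlap or len(right) < overlap:
        raise ValueError("flanks are shorter than the requested overlap")
    junction = left[-overlap:] + right[:overlap]
    ab = left + right[:overlap]    # product of the outer A and inner B primers
    cd = left[-overlap:] + right   # product of the inner C and outer D primers
    if not (ab.endswith(junction) and cd.startswith(junction)):
        raise RuntimeError("products do not share the junction")
    return ab + cd[len(junction):]


# Hypothetical template with a GTA triplet to remove.
toy_left, toy_right = "ACGTTGCAAGCTTGCAA", "CCGATTCGATCGGATC"
toy_template = toy_left + "GTA" + toy_right
fused = delete_by_bridge_pcr(toy_template, start=len(toy_left), length=3)
print(fused == toy_left + toy_right)
```

In the real experiment the fusion happens during the AD amplification, where the overlapping ends of AB and CD prime each other; the string concatenation here only mimics the end result.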
Second, microinjection of zebrafish single-cell fertilized eggs, fluorescence screening and passaging
1. Three successfully constructed plasmid vectors, each driving expression of the downstream eGFP fluorescent protein, were prepared: 1) the pT2AL-Pn-eGFP plasmid vector, which uses as promoter the upstream fragment of the fish-specific gene of unknown function from the East African cichlid P. nyererei; 2) the pT2AL-On-eGFP plasmid vector, constructed from the upstream fragment of Nile tilapia O. niloticus, a closely related species of P. nyererei, which is 94% homologous to Pn; and 3) the pT2AL-Pn-noSNPs-eGFP plasmid vector, constructed by deleting the three consecutive SNP sites. Each was transformed into E. coli DH5α competent cells (transformation as in step 1.1.6). Single colonies were picked into 40 mL of culture medium containing ampicillin and cultured at 37 °C and 180 rpm for 12-16 h, and each of the three plasmid vectors was purified with an endotoxin-free plasmid extraction kit, thereby obtaining high-purity plasmid vectors free of substances such as E. coli cell debris.
2. Each of the three plasmid vectors (70 ng/μL) was mixed with the transposase (100 ng/μL) at the indicated concentrations and injected into single-cell zebrafish fertilized eggs, so that each plasmid vector was randomly integrated into the zebrafish genome by the transposase.
3. The eGFP green fluorescent protein signal was examined under a fluorescence microscope 24 h after microinjection of the single-cell fertilized eggs, to determine whether the microinjection had succeeded and whether the recombinant plasmid could drive eGFP expression.
4. Fertilized eggs expressing fluorescence were selected and cultured at a standard temperature of 28.5 °C until sexual maturity (F0 generation). The F0 generation was mated with wild-type zebrafish, and fluorescent offspring were kept as the F1 generation. The F1 generation was cultured to sexual maturity and mated with wild-type zebrafish to obtain the fluorescent F2 generation. In this way, three fluorescent zebrafish lines (pT2AL-Pn-eGFP, pT2AL-On-eGFP, pT2AL-Pn-noSNPs-eGFP) were established separately.
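Preparing the injection mix of step 2 is a simple dilution problem. The sketch below assumes that 70 ng/μL and 100 ng/μL are the final concentrations in the mix, which the text does not state explicitly. The stock concentrations, the 10 μL total volume and the function name are hypothetical values chosen for illustration.

```python
def mix_volumes(targets, stocks, total_volume):
    """Return the volume (uL) of each stock, plus water, for the mix.

    targets and stocks map component name -> concentration in ng/uL.
    Uses C_stock * V_stock = C_final * V_total for each component.
    """
    volumes = {}
    for name, target in targets.items():
        stock = stocks[name]
        if stock < target:
            raise ValueError(f"{name} stock is more dilute than its target")
        volumes[name] = target * total_volume / stock
    water = total_volume - sum(volumes.values())
    if water < -1e-9:
        raise ValueError("stocks are too dilute to reach both targets")
    volumes["water"] = max(water, 0.0)
    return volumes


# Assumed final concentrations (from step 2) and hypothetical stocks.
mix = mix_volumes(
    targets={"plasmid": 70.0, "transposase": 100.0},
    stocks={"plasmid": 350.0, "transposase": 500.0},
    total_volume=10.0,
)
for name, volume in mix.items():
    print(f"{name}: {volume:.2f} uL")
```

With these made-up stocks, 2 μL of each component and 6 μL of water give the 10 μL mix.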
Third, determination of the fluorescence expression sites and cell types by confocal fluorescence microscopy and fluorescent immunohistochemical cell co-localization
1. The sites of eGFP fluorescence in zebrafish of the transgenic line pT2AL-Pn-eGFP were examined with a Zeiss 880 confocal microscope. To further define the cell/tissue types expressing eGFP green fluorescent protein in the transgenic zebrafish lines constructed according to the present invention, antibodies against the corresponding marker molecules were used in fluorescent immunohistochemistry and co-localized with the eGFP fluorescent protein signal in the 488 nm excitation channel (see Barresi MJF, Stickney HL, Devoto SH. The zebrafish slow-muscle-omitted gene product is required for Hedgehog signal transduction and the development of slow muscle identity. Development. 2000;127:2189-99). The antibodies used include: anti-Cathepsin L/MEP antibody (ab209708, abcam), which labels hatching gland cells; anti-Pvalb7 antibody (Professor Masahiko Hibi, Nagoya University), which labels cerebellar neurons; anti-Serotonin antibody (s5545, Sigma), which labels epidermal chemoreceptor cells; anti-HuC/HuD antibody (16A11, Invitrogen), which labels neurons; anti-SOX10 antibody (ab229331, abcam), which labels mandibular cartilage; anti-Collagen type II antibody (II-II6B3, DSHB), which labels osteoblasts; and anti-GFP antibody (ab6673, abcam), which labels green fluorescent protein. The secondary antibodies comprise donkey anti-goat IgG H&L (Alexa Fluor 488) pre-adsorbed secondary antibody (ab150133, abcam), donkey anti-mouse IgG H&L (Alexa Fluor 647) antibody (ab150107, abcam), and donkey anti-rabbit IgG H&L (Alexa Fluor 647) pre-adsorbed secondary antibody (ab150063, abcam), together with DAPI stain (ab228549, abcam).
The experimental results are as follows:
1. Three transgenic lines were successfully constructed for the first time: 1) the zebrafish transgenic line pT2AL-Pn-eGFP, in which the specific upstream sequence fragment of a fish-specific gene of unknown function (named the lg gene herein) from the East African cichlid P. nyererei serves as a promoter to drive expression of eGFP fluorescent protein; 2) the zebrafish transgenic line pT2AL-On-eGFP, in which the homologous sequence (94% homology) from the closely related species Nile tilapia (O. niloticus) serves as a promoter to drive expression of eGFP fluorescent protein; and 3) the zebrafish transgenic line pT2AL-Pn-noSNPs-eGFP, in which the three consecutive SNPs absent from O. niloticus were deleted from the P. nyererei upstream fragment used as the promoter in pT2AL-Pn-eGFP.
The plasmid maps are shown in FIG. 3a, FIG. 3b and FIG. 3c.
2. The results show that the upstream sequence of P. nyererei functions as a promoter and can drive high expression of the fluorescent protein in a variety of important cells/tissues: hatching gland cells associated with the hatching of fish eggs (FIG. 6a); epidermal receptor cells that sense the external environment (e.g. water flow, pheromones, hormones) (FIG. 6b); heart valve cells that control the direction of blood flow and prevent backflow (FIG. 6c); mandibular chondrocytes associated with food intake (FIG. 6d); osteoblasts in the fins associated with movement (FIG. 6e, FIG. 6f, FIG. 6g); neurons of the inner and outer nuclear layers of the eye associated with vision (FIG. 6h); and most neuronal cells in the brain, for example the olfactory epithelium (FIG. 6i) and olfactory bulb (FIG. 6j) in the forebrain, associated with sensing food, pheromones, hormones, etc.; neurons of the optic tectum in the midbrain, associated with visual transmission (FIG. 6k); cells of the cerebellum, associated with motor balance (FIG. 6k, FIG. 6l); and neuronal cells of the hindbrain (FIG. 6m) and spinal cord (FIG. 6n), associated with neurotransmission.
Interestingly, in Nile tilapia (O. niloticus), a close relative of P. nyerei, the On sequence, which is 94% homologous to the Pn sequence, also has promoter function and can drive expression of eGFP, but the range of expressing cell types and the signal intensity are markedly reduced compared with the Pn sequence (FIG. 7a, FIG. 7b). Expression is mainly in the cerebellum, hindbrain, spinal cord and the first two arches associated with mandible formation, while the signal is very weak in other cells and tissues.
On the other hand, with the Pn-noSNPs sequence, obtained by deleting from the Pn sequence three consecutive SNP sites (GTA) that are not present in On, the eGFP expression signal is also markedly weakened and the number of expressing cell types is markedly reduced (FIG. 7c); fluorescence is mainly seen in the tectum, olfactory epithelium, hindbrain and the first two arches of the mandible. The expression pattern becomes more similar to that of the Nile tilapia On sequence (FIG. 7b).
The cerebellum, hindbrain and first two mandibular arches of 4 dpf (days post fertilization) zebrafish embryos from the three transgenic lines were imaged on a Zeiss 880 confocal microscope with identical parameters, and the fluorescence signal intensity was analyzed statistically. The signal intensity of the pT2AL-Pn-eGFP line differed significantly from that of both the pT2AL-On-eGFP line and the pT2AL-Pn-noSNPs-eGFP line (FIG. 7d). This demonstrates that the absence from O. niloticus of the three consecutive SNPs (GTA) found only in P. nyerei is an important cause of the difference in fluorescence expression pattern between the pT2AL-Pn-eGFP line and the pT2AL-On-eGFP line.
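The relationship between the Pn sequence (SEQ ID NO. 1, 2078 bases) and the Pn-noSNPs sequence (SEQ ID NO. 2, 2075 bases) can be checked directly against the sequence listing. The following is a minimal Python sketch, not part of the original disclosure. The two excerpts are copied from the row of each sequence that begins at position 301 of the listing; the helper names `delete_bases` and `first_difference` are hypothetical, and positions 322 to 324 for the GTA triplet are obtained by counting along the listing rather than stated in the text.

```python
# Excerpts copied from the sequence listing; each starts at base 301 (1-based).
EXCERPT_START = 301
PN_EXCERPT = "acatgcatacacagagcaaacgtacccatctgatcagatt"    # SEQ ID NO. 1, bases 301-340
PN_NOSNPS_EXCERPT = "acatgcatacacagagcaaaccccatctgatcagatt"  # SEQ ID NO. 2, bases 301-337


def delete_bases(seq, start, length, offset=1):
    """Return seq with `length` bases removed, starting at 1-based position `start`.

    `offset` is the 1-based position of seq[0] in the full sequence, so that
    positions can be given in full-sequence coordinates (hypothetical helper).
    """
    i = start - offset
    if i < 0 or i + length > len(seq):
        raise ValueError("deletion falls outside the supplied excerpt")
    return seq[:i] + seq[i + length:]


def first_difference(a, b, offset=1):
    """Return the 1-based position of the first mismatch between a and b, or None."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i + offset
    return None


# The triplet at positions 322-324 of the Pn excerpt is "gta".
triplet = PN_EXCERPT[322 - EXCERPT_START:325 - EXCERPT_START]

# Deleting it reproduces the corresponding stretch of the Pn-noSNPs sequence.
derived = delete_bases(PN_EXCERPT, start=322, length=3, offset=EXCERPT_START)

print(triplet)
print(derived == PN_NOSNPS_EXCERPT)
print(first_difference(PN_EXCERPT, PN_NOSNPS_EXCERPT, offset=EXCERPT_START))
```

The same check can be run on the full 2078-base and 2075-base sequences by joining the rows of the listing and passing `offset=1`.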
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and improvements without departing from the principle of the present invention, and such modifications and improvements also fall within the protection scope of the present invention.
Sequence listing
<110> Sun Yat-sen University; Southwest University
<120> nucleic acid sequences and their use as promoters
<130> MP2012844
<160> 5
<170> SIPOSequenceListing 1.0
<210> 1
<211> 2078
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 1
actggtgggc tctgagtctt tgtttctgta aactgcagtc tgctagtgca tgccactgag 60
tattaatgaa acataataaa aacataataa cacaataaaa cacggtctat gaaaactcaa 120
tcaggtgtaa aaagcatttc aatttttaaa ggtctaaaga gtcagaaatg tatctgcaga 180
ggggaactgt tgactcatgt gaatcatgag acgggtttcc ataagtcagt gctcccatga 240
tcactgcaat gtgttgtttg cattatgtgc acacacacgc acacacataa acacacacac 300
acatgcatac acagagcaaa cgtacccatc tgatcagatt atctgcatga tgtatgtcac 360
aaactttaca caacatgttt acagctctgc agtccttttt ccatttttgt ttttagaagg 420
aatcattcat tttggcacat tttaccaaaa ttacagtggc tgactgtgtt catttacaat 480
gagacatttt tagaaatggc gtcttataag atgaaaaaaa atgtggaaaa aataaacctg 540
actccacttt gtttgtttgt tttgttgttt tcttctcttc ctctctccct ctctctctct 600
ctctctctct ctctcatcac ccctcctttg ctacatctgt gtctgagtct ccttcctctt 660
ccctatttct tctgcatctg tttttctcgc tccatcaaaa cagtcagcca gagacagtgc 720
tggggctgag gaaatggatg gtgtgtctat ttttagaagc agggaaggag ggcaaaccaa 780
aagtaagcga gggagagagg gagcaagaga gagagggagg aggaggatga ggtggaggag 840
agggaggggg gatgtgagga gaagataaac ctgcattctg cctctgcctc tggtcagaga 900
cagttgaagc ttcacagcac aattaacaag cagcagcaga tgaaaacaga ggatcgtaaa 960
taaaattgag gagacaaaaa aaagagagaa cgagagcgac ggatgcaggc tgctatttaa 1020
agctgcacca ggagaagaga caagaaaaga ttcggtggac tcaacaaaga tcatccgtaa 1080
aggtgagtgt gaaggcaaat tgagcagaaa acctttacat tattcctgag aatgcatgac 1140
gaacacacac acagagtgcc tgttttgtgc ttcagcctcg agctgccgcc gtggaaccat 1200
ttttagctgt tgtatacaga aggtgtggtc tatatttagc agtggttaca ccttggaaga 1260
aactgggaga ggctggcatt ttggtttggc ttgcgctatt tttgtcatga ggtcaataca 1320
agcatacatt gtgggctacg ctctgggtta taacaactcg gcgctccaca ctctgaaaca 1380
gtttgaggct ttacgtgtct gagatcccac aaaacagtca tctgtgtgtg cagcctctca 1440
ctgggattgt ttgtatattg aacggtcaac acacacaact ccagagtaag cggcgcacgg 1500
tgcctgaaat gaaggaaagg gagaggagag gctcctatgt ttacttggtt ttatcgatac 1560
agccagcagt tgtagtcggt tgtgttttgg gagcttcgga caatgtgagc aatgtccaga 1620
tgcacattat gtcattagaa aaacacatta gactgatgag gtgattaaac aagaggcttt 1680
tcaaaagaaa ctgaggaaaa attgttgcaa atacaatgca acacaacagt gtgtacgaat 1740
gagaaattaa ctccagaagt tataagtcag ttcgtgagat aatagtgcac agattcatct 1800
ttttatttgt atcgtgtgga gacacatatt gcaagtttcg acataaatga cacacacaaa 1860
cagggctctc actttgtaaa ctaggaagta agattgcata cgttgcatct gaccctatca 1920
ggaatgtgtc acatctatca aaatatccag ttttccaaat ggtgattata ttttcctaaa 1980
ggattctgct tttttgagac ttggatgcag tatttcacat tacacagttt ttttccccat 2040
ctaaatgaga cctctcccgt cttgtctttc tctcagct 2078
<210> 2
<211> 2075
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 2
actggtgggc tctgagtctt tgtttctgta aactgcagtc tgctagtgca tgccactgag 60
tattaatgaa acataataaa aacataataa cacaataaaa cacggtctat gaaaactcaa 120
tcaggtgtaa aaagcatttc aatttttaaa ggtctaaaga gtcagaaatg tatctgcaga 180
ggggaactgt tgactcatgt gaatcatgag acgggtttcc ataagtcagt gctcccatga 240
tcactgcaat gtgttgtttg cattatgtgc acacacacgc acacacataa acacacacac 300
acatgcatac acagagcaaa ccccatctga tcagattatc tgcatgatgt atgtcacaaa 360
ctttacacaa catgtttaca gctctgcagt cctttttcca tttttgtttt tagaaggaat 420
cattcatttt ggcacatttt accaaaatta cagtggctga ctgtgttcat ttacaatgag 480
acatttttag aaatggcgtc ttataagatg aaaaaaaatg tggaaaaaat aaacctgact 540
ccactttgtt tgtttgtttt gttgttttct tctcttcctc tctccctctc tctctctctc 600
tctctctctc tcatcacccc tcctttgcta catctgtgtc tgagtctcct tcctcttccc 660
tatttcttct gcatctgttt ttctcgctcc atcaaaacag tcagccagag acagtgctgg 720
ggctgaggaa atggatggtg tgtctatttt tagaagcagg gaaggagggc aaaccaaaag 780
taagcgaggg agagagggag caagagagag agggaggagg aggatgaggt ggaggagagg 840
gaggggggat gtgaggagaa gataaacctg cattctgcct ctgcctctgg tcagagacag 900
ttgaagcttc acagcacaat taacaagcag cagcagatga aaacagagga tcgtaaataa 960
aattgaggag acaaaaaaaa gagagaacga gagcgacgga tgcaggctgc tatttaaagc 1020
tgcaccagga gaagagacaa gaaaagattc ggtggactca acaaagatca tccgtaaagg 1080
tgagtgtgaa ggcaaattga gcagaaaacc tttacattat tcctgagaat gcatgacgaa 1140
cacacacaca gagtgcctgt tttgtgcttc agcctcgagc tgccgccgtg gaaccatttt 1200
tagctgttgt atacagaagg tgtggtctat atttagcagt ggttacacct tggaagaaac 1260
tgggagaggc tggcattttg gtttggcttg cgctattttt gtcatgaggt caatacaagc 1320
atacattgtg ggctacgctc tgggttataa caactcggcg ctccacactc tgaaacagtt 1380
tgaggcttta cgtgtctgag atcccacaaa acagtcatct gtgtgtgcag cctctcactg 1440
ggattgtttg tatattgaac ggtcaacaca cacaactcca gagtaagcgg cgcacggtgc 1500
ctgaaatgaa ggaaagggag aggagaggct cctatgttta cttggtttta tcgatacagc 1560
cagcagttgt agtcggttgt gttttgggag cttcggacaa tgtgagcaat gtccagatgc 1620
acattatgtc attagaaaaa cacattagac tgatgaggtg attaaacaag aggcttttca 1680
aaagaaactg aggaaaaatt gttgcaaata caatgcaaca caacagtgtg tacgaatgag 1740
aaattaactc cagaagttat aagtcagttc gtgagataat agtgcacaga ttcatctttt 1800
tatttgtatc gtgtggagac acatattgca agtttcgaca taaatgacac acacaaacag 1860
ggctctcact ttgtaaacta ggaagtaaga ttgcatacgt tgcatctgac cctatcagga 1920
atgtgtcaca tctatcaaaa tatccagttt tccaaatggt gattatattt tcctaaagga 1980
ttctgctttt ttgagacttg gatgcagtat ttcacattac acagtttttt tccccatcta 2040
aatgagacct ctcccgtctt gtctttctct cagct 2075
<210> 3
<211> 2062
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 3
actggtgggc tctgagtctt tgtttctgta aactgcagtc tgctagtgct tgccactgag 60
tattaatgaa acataataaa acataataac acaataaaac acagtctatg aaaactcaat 120
caggtgtaaa gagcatttca atttttaaag gtctaatgag tcagaaatgt atctgtagat 180
gagaactgtt gactcatgtg actcatgaga caggtttcca taagtcagtg ctcccgtgat 240
cactgcaatg tgttgtttgc attacgtgca cacacgtgcg cacgcataaa cacacacaca 300
tccatataca cagagcaaac atgccaaagc ccatctgatc agattgtctg catgatgtat 360
gtgacaagct ttacacaaca tgtttacagc tctgcagtcc tttttccatt tttgttttta 420
gaaggaatcg ttcattttgg cacattcacc aaaattacag tggctgactg tgttcattta 480
cagtgagaca tttttagaaa tggcgtctta taagatgaaa aaaaatgtgg aaaaaataaa 540
cctgtctcca ctttgtttgt tttgttgttt tctcctctct ctctctctct ctctctctct 600
ctctctctct ctcatcaccc ctctgttaca tctgtgtctg agtctccttc ctcttccctc 660
tttcttctgc ctctgttttt ctcgttccat caatacagtc agccagagac agtgctgggg 720
ctgaggaaat ggatggtgtg tctattttta gaagcaggga aggagggcaa accaaaagta 780
agcgagggag agagggagca agagagagag ggaggaggag gatgaggtgg aggagaggga 840
ggggggatgt gaggagaaga taaatctgca ttctgcctct gcctctggtc agagacagtt 900
gaagcttcac agcacaatta aggagaagca gcagcagatg aaaacagacg atcgtaaata 960
aaattgagga gacaaaaaaa gagagaaaga gagcgacgga tgcaggctga aatttaaaac 1020
tgcaccagga gaaaagacaa gaaaagattc cgtggactca acaaagatca tccgtaaagg 1080
tgagtgtgaa ggcaaattga gcagaaaacc tttacattat tcctgagaat gcatgacgaa 1140
cacacacaca gggtgcctgt tttgtgcttc agcctcgagc tgccgccgtg gaaccatttt 1200
tagcttttgt atacagaagg tgtggtctat atttagcagt ggttacacct tggaagaaac 1260
tgggagaggc tggcattttg gtttggcttg tgctattttt gtcacgaggt caatacaagc 1320
atatattgtg ggctacactc agggttataa caactcggcg ctccacactc tgaaacagtt 1380
tgaggcttta cgtgtctgag atcccacaaa acagtcatct gtgtgtgcag cctctcgctg 1440
ggattgtttg tatattgaac ggtcaacaca cacaactcca gagtaagcag cgcacggtgc 1500
ctgaaatgaa gggaagggag aggagaggct tctgtgttta cttggtttta tcgatacagc 1560
cagcagttgt agtcggttgt gttttgaaaa cgcttcgggg gatgtgagca atgtccagat 1620
gtacattatg tcattagaaa aacacattag actgatgagg tgattaaaca agaggctttt 1680
gaaaagaaac tgagcaaaaa ttgttgcaaa tacaatgtgt gtacgaatgt gaaattaact 1740
ccagaagtta taagtcagtt catgagataa cagtgcacag atttatattt ttatttgtat 1800
tgtgtggagg cacatattgc aagtttcgac ataaaggaca cacacaacac actcactttg 1860
taaactagga agtaagattg catacgttgc atctgaccct atcaggaatg tgtcacacct 1920
atcaaaatat ccagttttcc aaacggtgat tatattttcc taaaggattc tgctttttcg 1980
agacttggat gcagtatttc acattacgca gtttttttcc ccatctaaat gagacctctc 2040
ccgtcttgtc tttctctcag ct 2062
<210> 4
<211> 48
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 4
atgcggccac ggtccagact gctgtccaag aggagactga cgctgccc 48
<210> 5
<211> 6573
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 5
cacttttcgg ggaaatgtgc gcggaacccc tatttgttta tttttctaaa tacattcaaa 60
tatgtatccg ctcatgagac aataaccctg ataaatgctt caataatatt gaaaaaggaa 120
gagtatgagt attcaacatt tccgtgtcgc ccttattccc ttttttgcgg cattttgcct 180
tcctgttttt gctcacccag aaacgctggt gaaagtaaaa gatgctgaag atcagttggg 240
tgcacgagtg ggttacatcg aactggatct caacagcggt aagatccttg agagttttcg 300
ccccgaagaa cgttttccaa tgatgagcac ttttaaagtt ctgctatgtg gcgcggtatt 360
atcccgtatt gacgccgggc aagagcaact cggtcgccgc atacactatt ctcagaatga 420
cttggttgag tactcaccag tcacagaaaa gcatcttacg gatggcatga cagtaagaga 480
attatgcagt gctgccataa ccatgagtga taacactgcg gccaacttac ttctgacaac 540
gatcggagga ccgaaggagc taaccgcttt tttgcacaac atgggggatc atgtaactcg 600
ccttgatcgt tgggaaccgg agctgaatga agccatacca aacgacgagc gtgacaccac 660
gatgcctgta gcaatggcaa caacgttgcg caaactatta actggcgaac tacttactct 720
agcttcccgg caacaattaa tagactggat ggaggcggat aaagttgcag gaccacttct 780
gcgctcggcc cttccggctg gctggtttat tgctgataaa tctggagccg gtgagcgtgg 840
gtctcgcggt atcattgcag cactggggcc agatggtaag ccctcccgta tcgtagttat 900
ctacacgacg gggagtcagg caactatgga tgaacgaaat agacagatcg ctgagatagg 960
tgcctcactg attaagcatt ggtaactgtc agaccaagtt tactcatata tactttagat 1020
tgatttaaaa cttcattttt aatttaaaag gatctaggtg aagatccttt ttgataatct 1080
catgaccaaa atcccttaac gtgagttttc gttccactga gcgtcagacc ccgtagaaaa 1140
gatcaaagga tcttcttgag atcctttttt tctgcgcgta atctgctgct tgcaaacaaa 1200
aaaaccaccg ctaccagcgg tggtttgttt gccggatcaa gagctaccaa ctctttttcc 1260
gaaggtaact ggcttcagca gagcgcagat accaaatact gtccttctag tgtagccgta 1320
gttaggccac cacttcaaga actctgtagc accgcctaca tacctcgctc tgctaatcct 1380
gttaccagtg gctgctgcca gtggcgataa gtcgtgtctt accgggttgg actcaagacg 1440
atagttaccg gataaggcgc agcggtcggg ctgaacgggg ggttcgtgca cacagcccag 1500
cttggagcga acgacctaca ccgaactgag atacctacag cgtgagctat gagaaagcgc 1560
cacgcttccc gaagggagaa aggcggacag gtatccggta agcggcaggg tcggaacagg 1620
agagcgcacg agggagcttc cagggggaaa cgcctggtat ctttatagtc ctgtcgggtt 1680
tcgccacctc tgacttgagc gtcgattttt gtgatgctcg tcaggggggc ggagcctatg 1740
gaaaaacgcc agcaacgcgg cctttttacg gttcctggcc ttttgctggc cttttgctca 1800
catgttcttt cctgcgttat cccctgattc tgtggataac cgtattaccg cctttgagtg 1860
agctgatacc gctcgccgca gccgaacgac cgagcgcagc gagtcagtga gcgaggaagc 1920
ggaagagcgc ccaatacgca aaccgcctct ccccgcgcgt tggccgattc attaatgcag 1980
ctggcacgac aggtttcccg actggaaagc gggcagtgag cgcaacgcaa ttaatgtgag 2040
ttagctcact cattaggcac cccaggcttt acactttatg cttccggctc gtatgttgtg 2100
tggaattgtg agcggataac aatttcacac aggaaacagc tatgaccatg attacgccaa 2160
gcgcgcaatt aaccctcact aaagggaaca aaagctggag ctccaccgcg gtggcggccg 2220
ctctagaact agtggatctg ctgggcttgc tgaaggtagg gggtcaagaa ccagaggtgt 2280
aaagtacttg agtaatttta cttgattact gtacttaagt attatttttg gggattttta 2340
ctttacttga gtacaattaa aaatcaatac ttttactttt acttaattac atttttttag 2400
aaaaaaaagt actttttact ccttacaatt ttatttacag tcaaaaagta cttatttttt 2460
ggagatcact tgggcccggc tcgagcgctt aagtttaaac gcgttaacaa ttggccatat 2520
gcatgctagc ggccgcacgt ggcgcctagg ccgccgatcg tcgactagtt ataatttaaa 2580
tattggcgcg ccaagcttga tatcgaattc ctgcagcccg ggactggtgg gctctgagtc 2640
tttgtttctg taaactgcag tctgctagtg catgccactg agtattaatg aaacataata 2700
aaaacataat aacacaataa aacacggtct atgaaaactc aatcaggtgt aaaaagcatt 2760
tcaattttta aaggtctaaa gagtcagaaa tgtatctgca gaggggaact gttgactcat 2820
gtgaatcatg agacgggttt ccataagtca gtgctcccat gatcactgca atgtgttgtt 2880
tgcattatgt gcacacacac gcacacacat aaacacacac acacatgcat acacagagca 2940
aacgtaccca tctgatcaga ttatctgcat gatgtatgtc acaaacttta cacaacatgt 3000
ttacagctct gcagtccttt ttccattttt gtttttagaa ggaatcattc attttggcac 3060
attttaccaa aattacagtg gctgactgtg ttcatttaca atgagacatt tttagaaatg 3120
gcgtcttata agatgaaaaa aaatgtggaa aaaataaacc tgactccact ttgtttgttt 3180
gttttgttgt tttcttctct tcctctctcc ctctctctct ctctctctct ctctctcatc 3240
acccctcctt tgctacatct gtgtctgagt ctccttcctc ttccctattt cttctgcatc 3300
tgtttttctc gctccatcaa aacagtcagc cagagacagt gctggggctg aggaaatgga 3360
tggtgtgtct atttttagaa gcagggaagg agggcaaacc aaaagtaagc gagggagaga 3420
gggagcaaga gagagaggga ggaggaggat gaggtggagg agagggaggg gggatgtgag 3480
gagaagataa acctgcattc tgcctctgcc tctggtcaga gacagttgaa gcttcacagc 3540
acaattaaca agcagcagca gatgaaaaca gaggatcgta aataaaattg aggagacaaa 3600
aaaaagagag aacgagagcg acggatgcag gctgctattt aaagctgcac caggagaaga 3660
gacaagaaaa gattcggtgg actcaacaaa gatcatccgt aaaggtgagt gtgaaggcaa 3720
attgagcaga aaacctttac attattcctg agaatgcatg acgaacacac acacagagtg 3780
cctgttttgt gcttcagcct cgagctgccg ccgtggaacc atttttagct gttgtataca 3840
gaaggtgtgg tctatattta gcagtggtta caccttggaa gaaactggga gaggctggca 3900
ttttggtttg gcttgcgcta tttttgtcat gaggtcaata caagcataca ttgtgggcta 3960
cgctctgggt tataacaact cggcgctcca cactctgaaa cagtttgagg ctttacgtgt 4020
ctgagatccc acaaaacagt catctgtgtg tgcagcctct cactgggatt gtttgtatat 4080
tgaacggtca acacacacaa ctccagagta agcggcgcac ggtgcctgaa atgaaggaaa 4140
gggagaggag aggctcctat gtttacttgg ttttatcgat acagccagca gttgtagtcg 4200
gttgtgtttt gggagcttcg gacaatgtga gcaatgtcca gatgcacatt atgtcattag 4260
aaaaacacat tagactgatg aggtgattaa acaagaggct tttcaaaaga aactgaggaa 4320
aaattgttgc aaatacaatg caacacaaca gtgtgtacga atgagaaatt aactccagaa 4380
gttataagtc agttcgtgag ataatagtgc acagattcat ctttttattt gtatcgtgtg 4440
gagacacata ttgcaagttt cgacataaat gacacacaca aacagggctc tcactttgta 4500
aactaggaag taagattgca tacgttgcat ctgaccctat caggaatgtg tcacatctat 4560
caaaatatcc agttttccaa atggtgatta tattttccta aaggattctg cttttttgag 4620
acttggatgc agtatttcac attacacagt ttttttcccc atctaaatga gacctctccc 4680
gtcttgtctt tctctcagct atgcggccgc ggtccagact gctgtccaag aggagactga 4740
cgctgcccgg atccatggtg agcaagggcg aggagctgtt caccggggtg gtgcccatcc 4800
tggtcgagct ggacggcgac gtaaacggcc acaagttcag cgtgtccggc gagggcgagg 4860
gcgatgccac ctacggcaag ctgaccctga agttcatctg caccaccggc aagctgcccg 4920
tgccctggcc caccctcgtg accaccctga cctacggcgt gcagtgcttc agccgctacc 4980
ccgaccacat gaagcagcac gacttcttca agtccgccat gcccgaaggc tacgtccagg 5040
agcgcaccat cttcttcaag gacgacggca actacaagac ccgcgccgag gtgaagttcg 5100
agggcgacac cctggtgaac cgcatcgagc tgaagggcat cgacttcaag gaggacggca 5160
acatcctggg gcacaagctg gagtacaact acaacagcca caacgtctat atcatggccg 5220
acaagcagaa gaacggcatc aaggtgaact tcaagatccg ccacaacatc gaggacggca 5280
gcgtgcagct cgccgaccac taccagcaga acacccccat cggcgacggc cccgtgctgc 5340
tgcccgacaa ccactacctg agcacccagt ccgccctgag caaagacccc aacgagaagc 5400
gcgatcacat ggtcctgctg gagttcgtga ccgccgccgg gatcactctc ggcatggacg 5460
agctgtacaa gtaaaggcct atcgatgatg atccagacat gataagatac attgatgagt 5520
ttggacaaac cacaactaga atgcagtgaa aaaaatgctt tatttgtgaa atttgtgatg 5580
ctattgcttt atttgtaacc attataagct gcaataaaca agttaacaac aacaattgca 5640
ttcattttat gtttcaggtt cagggggagg tgtgggaggt tttttaaagc aagtaaaacc 5700
tctacaaatg tggtatggct gattatgatc ctctagatca gatctaatac tcaagtacaa 5760
ttttaatgga gtactttttt acttttactc aagtaagatt ctagccagat acttttactt 5820
ttaattgagt aaaattttcc ctaagtactt gtactttcac ttgagtaaaa tttttgagta 5880
ctttttacac ctctgtcaag aaccatatgc cggtacccaa ttcgccctat agtgagtcgt 5940
attacgcgcg ctcactggcc gtcgttttac aacgtcgtga ctgggaaaac cctggcgtta 6000
cccaacttaa tcgccttgca gcacatcccc ctttcgccag ctggcgtaat agcgaagagg 6060
cccgcaccga tcgcccttcc caacagttgc gcagcctgaa tggcgaatgg gacgcgccct 6120
gtagcggcgc attaagcgcg gcgggtgtgg tggttacgcg cagcgtgacc gctacacttg 6180
ccagcgccct agcgcccgct cctttcgctt tcttcccttc ctttctcgcc acgttcgccg 6240
gctttccccg tcaagctcta aatcgggggc tccctttagg gttccgattt agtgctttac 6300
ggcacctcga ccccaaaaaa cttgattagg gtgatggttc acgtagtggg ccatcgccct 6360
gatagacggt ttttcgccct ttgacgttgg agtccacgtt ctttaatagt ggactcttgt 6420
tccaaactgg aacaacactc aaccctatct cggtctattc ttttgattta taagggattt 6480
tgccgatttc ggcctattgg ttaaaaaatg agctgattta acaaaaattt aacgcgaatt 6540
ttaacaaaat attaacgctt acaatttagg tgg 6573
Claims (4)
1. Use of the nucleic acid having the nucleotide sequence shown in SEQ ID NO. 1 as a promoter.
2. The use according to claim 1, wherein a nucleotide fragment represented by SEQ ID NO. 4 is further added to the 3' -end of the nucleic acid represented by SEQ ID NO. 1.
3. A method for preparing a recombinant protein, characterized in that a nucleic acid having the nucleotide sequence shown in SEQ ID NO. 1 is used as a promoter to induce expression of the recombinant protein.
4. A breeding method, characterized in that a target gene is overexpressed in a bacterial cell using a nucleic acid having the nucleotide sequence shown in SEQ ID NO. 1 as a promoter.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011231411.3A CN112553196B (en) | 2020-11-06 | 2020-11-06 | Nucleic acid sequences and their use as promoters |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011231411.3A CN112553196B (en) | 2020-11-06 | 2020-11-06 | Nucleic acid sequences and their use as promoters |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112553196A CN112553196A (en) | 2021-03-26 |
CN112553196B true CN112553196B (en) | 2021-08-10 |
Family
ID=75041548
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011231411.3A Active CN112553196B (en) | 2020-11-06 | 2020-11-06 | Nucleic acid sequences and their use as promoters |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112553196B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114600837B (en) * | 2022-04-15 | 2023-05-02 | 润康生物医药(苏州)有限公司 | Animal model for granulocytopenia, construction method thereof and application of ikzf1 and cmyb in construction model |
CN116555259B (en) * | 2023-04-20 | 2024-02-27 | 云舟生物科技(广州)股份有限公司 | Nucleic acid molecules and their use as promoters |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2479278A1 (en) * | 2011-01-25 | 2012-07-25 | Synpromics Ltd. | Method for the construction of specific promoters |
Also Published As
Publication number | Publication date |
---|---|
CN112553196A (en) | 2021-03-26 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||