WO2016173556A1 - Novel gene targeting method - Google Patents

Novel gene targeting method

Info

Publication number
WO2016173556A1
Authority
WO
WIPO (PCT)
Prior art keywords
gene
och1
seq
ura3
targeting
Prior art date
Application number
PCT/CN2016/080788
Other languages
English (en)
French (fr)
Inventor
姜有为
李云飞
杜正伟
梁华军
Original Assignee
杭州菁因康生物科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 杭州菁因康生物科技有限公司
Priority to US15/570,656 (US11466280B2)
Priority to EP16785984.2A (EP3290519A4)
Publication of WO2016173556A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/87 Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90 Stable introduction of foreign DNA into chromosome
    • C12N15/902 Stable introduction of foreign DNA into chromosome using homologous recombination
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/80 Vectors or expression systems specially adapted for eukaryotic hosts for fungi
    • C12N15/81 Vectors or expression systems specially adapted for eukaryotic hosts for fungi for yeasts
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/87 Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90 Stable introduction of foreign DNA into chromosome
    • C12N15/902 Stable introduction of foreign DNA into chromosome using homologous recombination
    • C12N15/905 Stable introduction of foreign DNA into chromosome using homologous recombination in yeast
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N1/00 Microorganisms, e.g. protozoa; Compositions thereof; Processes of propagating, maintaining or preserving microorganisms or compositions thereof; Processes of preparing or isolating a composition containing a microorganism; Culture media therefor
    • C12N1/14 Fungi; Culture media therefor
    • C12N1/16 Yeasts; Culture media therefor
    • C12N1/165 Yeast isolates
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12R INDEXING SCHEME ASSOCIATED WITH SUBCLASSES C12C - C12Q, RELATING TO MICROORGANISMS
    • C12R2001/00 Microorganisms; Processes using microorganisms
    • C12R2001/645 Fungi; Processes using fungi
    • C12R2001/84 Pichia

Definitions

  • the present invention relates to the field of biotechnology; in particular, the present invention relates to novel targeting methods and nucleotide constructs for use in the methods.
  • In the genome of an organism, each gene generally consists of a coding region and a regulatory region.
  • The coding region, or open reading frame (ORF), encodes a protein or an RNA strand with various biological functions.
  • The boundaries of the protein coding sequence are the start codon at the 5' end (N-terminus) and the translation-terminating nonsense codon at the 3' end (C-terminus).
  • The regulatory regions before (5' region) and after (3' region) the coding region contain various DNA regulatory elements such as promoters, enhancers, terminators, polyadenylation signals, 5' untranslated regions (5' UTR) and 3' untranslated regions (3' UTR), which control various aspects of the gene expression process, including transcription, translation, and RNA stability.
  • mRNA: messenger RNA
  • RNA: ribonucleic acid
  • Gene expression can be controlled at the transcriptional and translational steps through various DNA elements in the regulatory region.
  • A promoter is a specific segment of a gene sequence that is recognized by RNA polymerase and at which transcription starts; it controls the initiation of transcription and determines the expression level of the gene.
  • a terminator is a specific sequence in a gene sequence responsible for transcription termination, which provides a signal that triggers the transcription machinery to release newly synthesized mRNA (or RNA) to terminate transcription.
  • Translation is the process of synthesizing proteins using mRNA as a template.
  • the mature mRNA consists of three parts: the 5' UTR, the ORF and the 3' UTR.
  • The translation initiation complex scans the 5' UTR in the 5' to 3' direction until it encounters the initiation codon AUG; from that position the ribosome moves along the mRNA toward the 3' end, synthesizing the protein from the N-terminus to the C-terminus.
  • Translation terminates at a stop codon (UAA, UAG or UGA).
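The scanning description above can be illustrated with a minimal sketch. It assumes the simplest case, in which translation starts at the first AUG from the 5' end and ends at the first in-frame stop codon; the function name `find_orf` and the example sequence are hypothetical and are not taken from the patent.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def find_orf(mrna):
    """Return (start, end) of the first ORF in an mRNA string, or None.

    start is the index of the A of the first AUG (scanning 5' to 3');
    end is the index just past the first in-frame stop codon.
    """
    mrna = mrna.upper().replace("T", "U")  # accept DNA-style input as well
    start = mrna.find("AUG")
    if start == -1:
        return None
    # Read codon by codon from the start codon toward the 3' end.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return start, i + 3
    return None

# Hypothetical toy mRNA: 5' UTR + ORF (AUG GCU UUU GAA UAA) + 3' UTR
mrna = "GGCACC" + "AUGGCUUUUGAAUAA" + "GCAAUAAA"
start, end = find_orf(mrna)
five_utr, orf, three_utr = mrna[:start], mrna[start:end], mrna[end:]
```

The three slices correspond to the 5' UTR, the ORF and the 3' UTR of the mature mRNA. Real initiation is more nuanced (sequence context around the AUG, for example), which this sketch does not model.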
  • Gene targeting is widely used to disrupt or enhance the activity of genes in the chromosomal genome. It is a method of integrating foreign DNA into the genome that results in alteration, replacement or duplication of the target gene. Gene targeting is a process common to all living organisms; it can be applied to any gene and is independent of the transcription and translation steps of the gene.
  • DSBs: DNA double-strand breaks
  • HR: homologous recombination
  • NHEJ: non-homologous end joining
  • a foreign DNA fragment generally a selectable marker gene
  • In the non-homologous end joining pathway, foreign DNA fragments are randomly integrated into non-homologous chromosomal loci (Paques and Haber 1999, Microbiology and Molecular Biology Reviews, 63: 349-404).
  • the efficiency of site-specific gene targeting is determined by the relative strength between homologous recombination and non-homologous end joining pathways.
  • An object of the present invention is to provide a gene targeting method capable of improving gene targeting efficiency, particularly for genes that are difficult to target effectively by conventional methods, and materials for use in the method.
  • The invention provides a nucleotide construct for use in regulating a gene, the structure of which is as follows:
  • A is a 5' homologous sequence
  • B is an interfering gene
  • C is a 3' homologous sequence
  • The 5' and 3' homologous sequences are such that the recombination site of the nucleotide construct is located between 110, preferably 50, nucleotides upstream of the first nucleotide of the start codon of the gene to be regulated and the first nucleotide of that start codon; or the 5' and 3' homologous sequences are such that the recombination site of the nucleotide construct is located between 100, 50 or 20, preferably 50, nucleotides upstream of the first nucleotide of the stop codon of the gene to be regulated and 300 nucleotides downstream of the first nucleotide of that stop codon.
  • the recombination sites are separated by 0-20 nucleotides, preferably 0-5 nucleotides, and most preferably 0 nucleotides.
  • There may be more than one interfering gene, and the interfering genes may be the same or different.
  • the interfering gene can be a marker gene.
  • the nucleotide construct can be circular or linear.
  • The gene to be regulated may be a gene with low recombination efficiency, preferably a recombination efficiency of <3%, more preferably a recombination efficiency of <1%.
  • The gene to be regulated is the OCH1 or ADE1 gene.
  • The homologous sequences are 400-1200 bp (base pairs), 500-1000 bp, or 600-800 bp in length.
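The positional and length ranges stated above can be checked mechanically. The following is a minimal sketch under stated assumptions: coordinates are 0-based indices on the gene's strand, the recombination site is represented by a single coordinate, both window ends are treated as inclusive, and the widest ranges given in the text (110 nt upstream of the start codon; 100 nt upstream to 300 nt downstream of the stop codon; 400-1200 bp homologous sequences) are used as defaults. All function and parameter names are hypothetical.

```python
def site_window(site, start_codon, stop_codon,
                up_start=110, up_stop=100, down_stop=300):
    """Classify a recombination site against the two windows in the text.

    site, start_codon and stop_codon are 0-based coordinates; the codon
    coordinates refer to the first nucleotide of the start/stop codon.
    Returns "start-proximal", "stop-proximal" or None.
    """
    if start_codon - up_start <= site <= start_codon:
        return "start-proximal"
    if stop_codon - up_stop <= site <= stop_codon + down_stop:
        return "stop-proximal"
    return None

def arm_length_ok(arm, min_bp=400, max_bp=1200):
    """Check that a homologous sequence falls in the stated length range."""
    return min_bp <= len(arm) <= max_bp

# Hypothetical gene: start codon at 1000, stop codon at 2212
print(site_window(950, 1000, 2212))   # inside the start-codon window
print(site_window(1500, 1000, 2212))  # mid-ORF, outside both windows
print(arm_length_ok("A" * 800))       # an 800 bp arm
```

Passing narrower values (for example `up_start=50`) gives the preferred ranges instead of the widest ones.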
  • the invention provides a host cell comprising the nucleotide construct of the first aspect of the invention.
  • the host cell is a yeast cell.
  • The yeast is Saccharomyces cerevisiae, Schizosaccharomyces pombe, Pichia pastoris, Hansenula polymorpha, Yarrowia lipolytica, Pichia stipitis, or Kluyveromyces lactis.
  • The yeast is Pichia pastoris, Hansenula polymorpha, Yarrowia lipolytica, Pichia stipitis, or Kluyveromyces lactis.
  • the invention provides a method of regulating gene expression, the method comprising:
  • step b) introducing the nucleotide construct constructed in step a) into the cell, thereby integrating it into the gene to be regulated by homologous recombination.
  • The gene to be regulated may be a gene with low recombination efficiency, preferably a recombination efficiency of <3%, more preferably a recombination efficiency of <1%.
  • The gene to be regulated is the OCH1 or ADE1 gene.
  • The method may further comprise step c) detecting the expression of the gene to be regulated in the cells obtained in step b).
  • the invention provides a method of engineering a strain comprising:
  • step b) transforming the nucleotide construct constructed in step a) into the strain to be engineered.
  • The method may further comprise step c) screening for the engineered strain.
  • the invention provides the use of a strain engineered using the method of the fourth aspect of the invention for producing recombinant proteins, metabolites, and for use in biocatalysis.
  • Integration by homologous recombination upstream of the yeast OCH1 coding region inhibits expression of the OCH1 gene, resulting in a change in the glycosylation pattern of the recombinant protein.
  • The production of metabolites means: integration by homologous recombination upstream of the coding region of yeast LPD1, inhibiting expression of the LPD1 gene, inhibiting its competing metabolic pathway, and increasing the yield of isobutanol; or integration by homologous recombination upstream of the coding region of yeast PDC1, inhibiting expression of the PDC1 gene and changing the yeast alcohol fermentation pathway, which contributes to the efficient production of L-lactic acid.
  • The application to biocatalysis means: integration by homologous recombination upstream of the yeast ARO8 coding region, inhibiting ARO8 gene expression, enhancing the biocatalytic ability of the yeast, and increasing the efficiency of glucose conversion to phenylethanol.
  • Figure 1 is a map of a typical gene, comprising a 5' regulatory region (5' region), an open reading frame (ORF) and a 3' regulatory region (3' region). Promoters and enhancers determine which part of the gene is transcribed into messenger RNA (mRNA).
  • The 5' and 3' UTRs regulate the translation of mRNA into protein.
  • The nucleotide numbering of the 5' and 3' regions takes the start codon of the coding region as nucleotides 1 to 3, with the 5' upstream region numbered with a minus sign, and takes the stop codon as nucleotides +1 to +3, with the 3' downstream region numbered with a plus sign. Vector components are not drawn to scale.
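The numbering convention can be made concrete with a short sketch. It assumes 0-based sequence indices as input and, as an interpretation inferred from position labels such as (1212/+1) used in the figure descriptions, that coding nucleotides between the start and stop codons continue the count begun at the start codon. The function name `position_label` and the example coordinates are hypothetical.

```python
def position_label(index, start_idx, stop_idx):
    """Convert a 0-based index into the figure-style position label.

    start_idx: 0-based index of the first nucleotide of the start codon.
    stop_idx:  0-based index of the first nucleotide of the stop codon.
    """
    if index < start_idx:
        # 5' upstream region: -1 is the nucleotide just before the start codon
        return str(index - start_idx)
    if index < stop_idx:
        # coding region: the start codon occupies positions 1 to 3
        return str(index - start_idx + 1)
    # stop codon is +1 to +3; the 3' downstream region continues at +4
    return "+" + str(index - stop_idx + 1)

# Hypothetical coordinates: a 1212-nt coding sequence starting at index 500
start_idx, stop_idx = 500, 500 + 1212
labels = [position_label(i, start_idx, stop_idx)
          for i in (499, 500, 1711, 1712, 1714, 1715)]
print(labels)
```

With this convention, an integration site written as (-1/1) lies between the last upstream nucleotide and the first nucleotide of the start codon.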
  • Figure 2 depicts a schematic of the construction of the pUO vector. Vector components are not drawn to scale.
  • Figure 3 depicts a schematic representation of the construction of the pUAH (1) targeting vector for integration into the OCH1 gene of Pichia pastoris. Vector components are not drawn to scale.
  • Figure 4 depicts a schematic representation of integration at different positions in the OCH1 gene (SEQ ID NO: 126) of Pichia pastoris. Vector components are not drawn to scale.
  • Lane 1: with the P1/P4 primer pair, wild-type OCH1 in JC301 yielded a 1433 bp band. Lanes 2, 3 and 4: gene integration at the three positions (-1/1), (1212/+1) and (+3/+4) yielded no band, because the sequence was too long (4900 bp) to be amplified by PCR with the P1/P4 primer pair. Lane 5: with the P1/P2 primer pair, wild-type OCH1 in JC301 yielded no band. Lanes 6, 7 and 8: with the P1/P2 primer pair, bands were obtained for gene integration at the three positions (-1/1), (1212/+1) and (+3/+4), respectively.
  • Lane 9: with the P3/P4 primer pair, wild-type OCH1 in JC301 yielded no band. Lanes 10, 11 and 12: with the P3/P4 primer pair, gene integration at the three positions (-1/1), (1212/+1) and (+3/+4) yielded bands of 3500, 2500 and 2300 bp, respectively.
  • Figure 5 depicts a schematic representation of integration at different positions in the ADE1 gene of the Pichia pastoris genome (the DNA sequence of P. pastoris PR-aminoimidazolesuccinocarboxamide synthase (ADE1) is shown in SEQ ID NO: 127). Vector components are not drawn to scale.
  • B. Depicts integration of the targeting cassette into the ADE1 locus of Pichia pastoris (knock-in) by double-crossover homologous recombination. Crosses indicate homologous recombination.
  • PCR validation results showing integration at different positions in the ADE1 gene.
  • M: DNA size marker
  • Lanes 1-13: genomic DNA of 13 randomly selected colonies, verified by PCR using the P5/P6 primer pair.
  • Wild type ADE1 obtained a 2398 bp band, and the position (912/+1) gene integration resulted in a 3763 bp band.
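As an illustration of how such PCR results might be read, the sketch below classifies an observed band against the two expected sizes quoted above (2398 bp for wild type, 3763 bp for integration at position (912/+1)). The tolerance value and the function name are assumptions made for illustration; they are not part of the patent's protocol.

```python
WILD_TYPE_BP = 2398   # expected P5/P6 amplicon from wild-type ADE1
INTEGRATED_BP = 3763  # expected P5/P6 amplicon after integration at (912/+1)

def classify_band(observed_bp, tolerance_bp=100):
    """Assign an estimated gel band size to the closest expected amplicon.

    tolerance_bp is an assumed allowance for the imprecision of reading
    band sizes from a gel.
    """
    if abs(observed_bp - WILD_TYPE_BP) <= tolerance_bp:
        return "wild type"
    if abs(observed_bp - INTEGRATED_BP) <= tolerance_bp:
        return "integrated"
    return "unclear"

# Net increase in amplicon length caused by the integration event
size_gain_bp = INTEGRATED_BP - WILD_TYPE_BP

print(classify_band(2400), classify_band(3750), size_gain_bp)
```

The size difference is the net change in amplicon length between the two alleles, which need not equal the length of the integrated cassette if any genomic sequence is replaced.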
  • Figure 6 depicts a schematic representation of the construction of the 5'AOX1-inducible lacZ expression vector p5'AOX1-URA3-lacZ, wherein URA3 is located between 5'AOX1 and the lacZ ORF. Vector components are not drawn to scale.
  • Figure 7 depicts a schematic representation of the construction of a series of 5'AOX1-inducible lacZ and lacZns expression vectors p5'AOX1-lacZ-URA3, p5'AOX1-lacZ-URA3(-), p5'AOX1-lacZns-URA3 and p5'AOX1-lacZns-URA3(-), wherein URA3 is located downstream of lacZ and lacZns in two orientations. Vector components are not drawn to scale.
  • Figure 8 depicts a schematic representation of the construction of a series of 5'OCH1-driven lacZ and lacZns expression vectors p5'OCH1-lacZ and p5'OCH1-lacZns. Vector components are not drawn to scale.
  • Figure 9 depicts a schematic representation of the construction of the 5'OCH1-driven lacZ expression vector p5'OCH1-URA3-lacZ, wherein URA3 is located between 5'OCH1 and the lacZ ORF. Vector components are not drawn to scale.
  • Figure 10 depicts a schematic representation of the construction of a series of 5'OCH1-driven lacZ expression vectors p5'OCH1-lacZ-URA3 and p5'OCH1-lacZ-URA3(-), wherein URA3 is located downstream of lacZ in two orientations. Vector components are not drawn to scale.
  • Figure 11 depicts a schematic representation of the construction of a series of 5'OCH1-driven lacZns expression vectors p5'OCH1-lacZns-URA3 and p5'OCH1-lacZns-URA3(-), wherein URA3 is located downstream of lacZns in two orientations. Vector components are not drawn to scale.
  • Figure 12 shows the relative expression amount (%) of lacZ mRNA.
  • A Relative expression level (%) of lacZ mRNA induced by 5' AOX1 in the presence of URA3 adjacent to the start and stop codons. 100% corresponds to the amount of lacZ mRNA expression induced by 5' AOX1 without URA3 integration (p5'AOX1-lacZ).
  • B Relative expression level (%) of lacZ mRNA driven by 5'OCH1 in the presence of URA3 adjacent to the start and stop codons. 100% corresponds to lacZ mRNA expression driven by 5'OCH1 without URA3 integration (p5'OCH1-lacZ). Data are shown as mean ± standard deviation (s.d.) of 3 experiments.
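The "relative expression (%)" and "mean ± s.d. of 3 experiments" reporting used in these captions can be sketched as follows. This is a minimal illustration with made-up numbers: it assumes each measurement is already a normalized expression value, scales samples to the mean of the reference construct (defined as 100%), and uses the sample standard deviation, since the text does not say which form of standard deviation was used.

```python
from statistics import mean, stdev

def relative_expression(sample_values, reference_values):
    """Return (mean %, s.d. %) of a sample relative to a reference.

    The mean of reference_values defines 100%. The sample standard
    deviation (n - 1 denominator) is an assumption.
    """
    reference = mean(reference_values)
    percents = [100.0 * value / reference for value in sample_values]
    return mean(percents), stdev(percents)

# Hypothetical normalized measurements from 3 experiments each
reference = [10.0, 10.0, 10.0]   # construct without URA3 integration
with_ura3 = [5.0, 5.0, 5.0]      # construct with URA3 integrated

avg, sd = relative_expression(with_ura3, reference)
print(f"{avg:.1f} ± {sd:.1f} %")
```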
  • Figure 13 shows the relative activity (%) of β-galactosidase in cells.
  • Figure 14 shows the relative expression amount (%) of OCH1 mRNA in strains containing gene integration adjacent to the start and stop codons. 100% corresponds to mRNA expression in the parental JC307 strain without gene integration. Data are shown as mean ± standard deviation (s.d.) of 3 experiments.
  • Figure 15 shows the positive-ion MALDI-TOF mass spectra of the N-glycans released from mIL-22.
  • A Mass spectrum showing N-glycans released by mIL-22 expressed by the GS115 strain.
  • B Mass spectrum of N-glycans released by mIL-22 expressed by strain och1 (-1/+1, ADE1URA3), which has gene integration upstream of the OCH1 coding region.
  • The inventors unexpectedly discovered a region for effective gene targeting by homologous recombination that is independent of the particular gene and, on this basis, developed a gene targeting system for gene expression regulation and gene disruption; the method of the present invention can regulate or modify any gene of an organism.
  • the present invention has been completed on this basis.
  • the present invention systematically analyzes gene targeting at different positions in a gene, and identifies an effective integration region of homologous recombination in the gene.
  • Gene targeting systems have also been developed for gene expression regulation and gene disruption.
  • the present invention utilizes endogenous homologous recombination processes in all organism cells, and thus, the methods of the invention can modulate or modify any gene of an organism.
  • The method of the present invention can be widely used in the biotechnology industry and in biological research to regulate gene expression, improve cell function, and produce heterologous proteins.
  • Gene is used broadly to refer to any nucleic acid segment that is biologically related.
  • Peptide polypeptide
  • protein protein
  • Gene targeting is a method of integrating foreign DNA at a gene in the genome, usually resulting in alteration, replacement or duplication of the target gene. This is a mechanism common to all living things.
  • "Cell" or "organism" refers to the cell or organism in which the gene targeting of the present invention is carried out.
  • Cell transformation means that foreign DNA is introduced into a cell. This is usually achieved by integration of the foreign DNA into chromosomal DNA or by introduction of a self-replicating plasmid.
  • target gene refers to a gene or DNA segment which is altered by the gene targeting method of the present invention.
  • The target gene can be an endogenous gene or an exogenous DNA segment previously introduced into the organism.
  • the target gene can be the endogenous genomic DNA of the organism, any part of the gene including, but not limited to, a polypeptide coding region, an open reading frame, a regulatory region, an intron, an exon, or a portion thereof.
  • "Marker" refers to a gene or sequence whose presence or absence provides a detectable phenotype in the organism. Different types of markers include, but are not limited to, selection markers, screening markers, and molecular markers.
  • a selectable marker is typically a gene whose expression can cause the organism to have a phenotype that is tolerant or sensitive to a particular set of conditions.
  • the screening marker delivers a phenotype as an observable and distinguishable feature.
  • Molecular markers are genetic sequence features that can be identified by DNA analysis.
  • a gene includes a "coding sequence", a “coding region” or an “open reading frame (ORF)" encoding a specific protein or functional RNA.
  • a protein coding sequence is a nucleic acid sequence that is transcribed into messenger RNA (mRNA) and is then translated into a protein. The boundaries of the protein coding sequence are defined by the 5' end (N-terminal) start codon and the 3' end (C-terminus) translation termination nonsense codon.
  • Genes also include “regulatory regions” or “regulatory elements” before and after the coding sequence.
  • The regulatory elements include, but are not limited to, a promoter, an enhancer, an intron, a polyadenylation signal, a 5' untranslated region (5' UTR), a 3' untranslated region (3' UTR), and any derivatives thereof. Some regulatory regions are transcribed as part of an RNA molecule, such as the 5' UTR and 3' UTR.
  • the term "5' untranslated region (5' UTR)" shall mean a nucleotide sequence in the mature mRNA immediately upstream of any coding sequence that is not translated into a protein.
  • The term "3' untranslated region (3' UTR)" shall mean a nucleotide sequence in the mature mRNA immediately downstream of any coding sequence that is not translated into a protein. It extends from the first nucleotide after the stop codon of any coding sequence to just before the poly(A) tail of the mRNA.
  • These regulatory elements can regulate various aspects of the gene expression process including, but not limited to, transcription (e.g., initiation, elongation, and/or termination), translation (initiation, elongation, and/or termination), RNA stability, and the like.
  • A "promoter" is a nucleic acid regulatory region that is capable of recruiting RNA polymerase and initiating transcription of a downstream (3' direction) coding sequence.
  • The promoter sequence is bounded at its 3' end by the transcription initiation site and extends upstream (5' direction) to include the minimum number of bases or elements required to initiate transcription.
  • a transcription initiation site and a protein binding domain responsible for binding to RNA polymerase can be found in the promoter sequence.
  • Eukaryotic promoters often (but not always) contain a "TATA" box and a "CAT" box, while prokaryotic promoters often contain the consensus sequence TATAAT.
  • Many promoters are referred to as constitutive promoters because they are active under all cellular conditions, but some are inducible promoters whose activity is regulated in response to specific stimuli.
  • a “terminator” is a segment of a nucleic acid sequence that provides a signal during transcription that triggers the transcription machinery to release newly synthesized mRNA (or RNA) to terminate transcription.
  • Two types of transcriptional terminators, Rho-dependent and Rho-independent sequences, are responsible for triggering transcription termination.
  • The transcriptional machinery recognizes the terminator signal and triggers the termination process to release the mRNA, after which a poly(A) tail is added to the 3' end of the mRNA by a polyadenylation reaction.
  • The regulation of transcription can be divided into two major categories according to its mode of action: the first is inhibition of DNA template function, in which inhibitory molecules bind to DNA and thereby inhibit its template function; the second is RNA polymerase inhibition, in which inhibitory molecules bind to RNA polymerase and thereby inhibit its activity (Sandhya Payankaulam, Li M. Li, and David N. Arnosti (2010) Transcriptional repression: conserved and evolved features. Curr Biol. 14; 20(17): R764-R771).
  • these two types of control methods cannot be used as a general method to specifically regulate the expression of any target gene.
  • Translation is the process of synthesizing proteins using mRNA as a template.
  • the mature mRNA consists of three parts: the 5' UTR, the ORF and the 3' UTR.
  • The translation initiation complex scans the 5' UTR in the 5' to 3' direction until it encounters the initiation codon AUG; from that position the ribosome moves along the mRNA toward the 3' end, synthesizing the protein from the N-terminus to the C-terminus.
  • Translation terminates at a stop codon (UAA, UAG or UGA).
  • the translation process can be controlled by RNA-binding proteins (RBPs) and small RNAs that regulate their translation by binding to mRNA.
  • RBPs typically bind to specific elements located in the 5' or 3' UTR to activate or repress translation.
  • Elements within the 5' UTR lie in the path of the scanning/translating ribosome, which can displace regulatory factors before they act.
  • The 3' UTR has limited influence on the stability and translational efficiency of most mRNAs (Noah Spies, Christopher B. Burge, and David P. Bartel (2013), 3'UTR-isoform choice has limited influence on the stability and translational efficiency of most mRNAs in mouse fibroblasts, Genome Research, 23: 2078-2090). Therefore, regulation of protein translation through binding of RBPs and small RNAs to mRNA is not expected to be an effective method.
  • Cells have a surveillance system to identify and eliminate abnormal mRNAs in order to avoid the production of potentially harmful protein products. For example, a cell can recognize an aberrant mRNA lacking a stop codon (non-stop mRNA) and form a Ski complex at its 3' end to mediate degradation of the non-stop mRNA. This non-stop mRNA decay avoids potentially deleterious extension products that have dominant negative activity relative to wild-type gene products (van Hoof A, Frischmeyer PA, Dietz HC, Parker R (2002) Exosome-mediated recognition and degradation of mRNAs lacking a termination codon. Science 295:2262-2264).
  • Gene targeting technology is a method of integrating foreign DNA into the chromosomal genome, resulting in alteration, replacement or duplication of the target gene; it is widely used to disrupt gene activity.
  • "Ends-in" and "ends-out" refer to two different arrangements that can be used to integrate foreign DNA into the genome via homologous recombination.
  • In gene targeting by "ends-in" recombination, when the foreign DNA pairs with a homologous region in the genome, the ends of the linear foreign DNA point toward each other, and single-crossover recombination ("roll-in") integrates the DNA into the genome, producing a duplication of the homologous target sequence.
  • The duplicated target gene can be excised by homologous recombination between the repeats, restoring the original wild-type state of the target gene.
  • In gene targeting by "ends-out" recombination, when the foreign DNA pairs with a homologous region in the genome, the ends of the linear foreign DNA point away from each other, and double-crossover recombination between the terminal targeting flanks and the homologous sequences of the host genome inserts the DNA into the genome.
  • "Ends-out" targeting is commonly used in mice and yeast because it can directly replace or delete target genes. However, the probability of an "ends-out" event is much lower than that of an "ends-in" event (Paques and Haber 1999, Microbiology and Molecular Biology Reviews, 63: 349-404).
  • In the present invention, gene targeting refers to "ends-out" double-crossover homologous recombination unless specifically indicated as an "ends-in" roll-in by single-crossover homologous recombination.
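The outcome of an "ends-out" double-crossover event can be illustrated with a simple string model. This sketch only shows the net result on the sequence (whatever lies between the two homology arms in the genome is replaced by the cassette); it does not model the recombination mechanism or its efficiency. The function name and the toy sequences are hypothetical.

```python
def ends_out_integrate(genome, arm5, arm3, cassette):
    """Model the product of a double crossover at two homology arms.

    The genomic segment between the arm5 match and the following arm3
    match is replaced by the cassette. Returns None if either arm has
    no match in the expected order.
    """
    i = genome.find(arm5)
    if i == -1:
        return None
    j = genome.find(arm3, i + len(arm5))
    if j == -1:
        return None
    return genome[:i + len(arm5)] + cassette + genome[j:]

# Toy genome: flank + 5' arm + target segment + 3' arm + flank
genome = "TTTT" + "ACGTAC" + "GGG" + "CATCAT" + "AAAA"
product = ends_out_integrate(genome, "ACGTAC", "CATCAT", "[URA3]")
print(product)
```

If the two arms are directly adjacent in the genome (separated by 0 nucleotides), the same function models a pure insertion rather than a replacement.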
  • Gene targeting is a process that is common to all living organisms and can be applied to any gene without being affected by its transcriptional and translational activities.
  • This technique has two limitations: low homologous recombination efficiency and a high rate of random (non-targeted) integration.
  • Gene targeting occurs through two distinct molecular mechanisms: the homologous recombination (HR) pathway and the non-homologous end joining (NHEJ) pathway. Both recombination pathways are typically mediated via repair of DNA double-strand breaks (DSBs).
  • In the homologous recombination pathway, a foreign DNA fragment, usually a selectable marker gene with homologous sequences at both ends, is precisely integrated at the corresponding homologous sequence in the genome.
  • In the non-homologous end joining pathway, exogenous DNA fragments containing the selectable marker gene are randomly integrated at non-homologous chromosomal loci.
  • the efficiency of site-specific gene targeting is usually determined by the relative strength between homologous recombination and non-homologous end joining pathways.
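In practice, the balance between the two pathways is observed as the fraction of transformants that carry the targeted integration. The sketch below computes that observed fraction from colony counts; it is a bookkeeping illustration with hypothetical numbers, not a mechanistic model of HR versus NHEJ, and the function name is an assumption.

```python
def targeting_efficiency(targeted, random_integrants):
    """Percentage of transformants with site-specific integration.

    targeted:          colonies verified as homologous (targeted) integrants
    random_integrants: colonies with non-targeted (random) integration
    """
    total = targeted + random_integrants
    if total == 0:
        raise ValueError("no transformants were scored")
    return 100.0 * targeted / total

# Hypothetical screens: an inefficient locus and an efficient one
print(targeting_efficiency(1, 99))   # 1 targeted colony out of 100
print(targeting_efficiency(19, 1))   # 19 targeted colonies out of 20
```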
  • The efficiency of gene targeting by homologous recombination also varies widely between biological systems.
  • The conventional yeast Saccharomyces cerevisiae and the fission yeast Schizosaccharomyces pombe have very efficient systems for gene targeting by homologous recombination.
  • The efficiency of gene replacement by homologous recombination can reach up to 95% of transformants (Paques and Haber 1999, Microbiology and Molecular Biology Reviews, 63: 349-404).
  • Most organisms have very low efficiency of gene targeting by homologous recombination.
  • The efficiency may be less than 0.1%, but when long (1 kb) targeting homologous sequences are used it may be higher than 50% for some target genes (Klinner U et al., (2004) FEMS Microbiology Reviews 28: 201-223; Cregg JM (2010) Pichia Protocols, Second edition. Totowa, New Jersey: Humana Press). Most other organisms, including fungi and other eukaryotes, also have low efficiency of gene targeting by homologous recombination.
  • the efficiency of homologous gene targeting in strains with the same genetic background is also gene-dependent.
  • the length of the targeting homologous sequence is between 200 and 900 bp
  • ARG1, ARG2, ARG3, HIS1, HIS2, HIS5 and HIS6 in the Pichia pastoris GS115 strain undergo homologous recombination at a high efficiency of 44-90% (Nett et al. (2005) Yeast 22:295–304).
  • The present invention uses OCH1, a gene that is targeted inefficiently, and ADE1, a gene that is targeted with ordinary efficiency, as models to systematically analyze homologous gene targeting at various positions in a genomic gene. This study helps to apply gene targeting technology to gene expression regulation and gene disruption.
  • The present invention develops targeting vectors comprising components such as marker genes, homologous sequences, and origins of replication. These components can be joined to form a circular vector.
  • The circular vector may contain other components, and linkers between the components, if desired.
  • the invention should also include other forms of targeting vectors that are functionally equivalent.
  • Targeting vectors can also be referred to as vectors.
  • Vectors used in recombinant DNA technology are usually in the form of "plasmids". In the present specification, the terms "vector" and "plasmid" are used interchangeably.
  • A marker refers to a gene or sequence whose presence or absence provides a detectable phenotype in the organism. One or more markers can be used to select and screen for gene targeting events. Different types of markers that can be used in the present invention include, but are not limited to, selection markers, screening markers, and molecular markers.
  • Selection markers include genes that confer resistance to antibiotics such as kanamycin, hygromycin, zeocin, bleomycin, spectinomycin, streptomycin, gentamicin, and the like.
  • The selectable marker system consists of an auxotrophic mutant host strain and a wild-type biosynthetic gene that complements the host's defect on incomplete media, such as the HIS4, LEU2, URA3, ADE1, LYS2, and TRP1 genes in yeast, and other genes known in the art.
  • The Saccharomyces cerevisiae or Pichia pastoris HIS4 gene can be used for the transformation of his4 Pichia strains.
  • Screenable markers convey observable and distinguishable features.
  • Screenable markers include fluorescent proteins such as green fluorescent protein (GFP), reporter enzymes such as beta-galactosidase (lacZ), alkaline phosphatase (AP), beta-lactamase, beta-glucuronidase, Glutathione S-transferase (GST), luciferase, and other enzymes known in the art.
  • Molecular markers are genetic sequence features that can be identified by DNA analysis.
  • the marker gene is flanked by two homologous recombination regions.
  • The homologous recombination region on the upstream side is homologous to the upstream region of the target gene, and the homologous recombination region on the downstream side is homologous to the downstream region of the target gene.
  • Single or multiple marker genes can be joined to each other in the same or opposite orientation between the upstream and downstream homologous recombination regions.
  • The homologous recombination regions are chosen such that the recombination site lies between the position 110 nucleotides, preferably 50 nucleotides, upstream of the first nucleotide of the start codon of the gene to be regulated and the first nucleotide of that start codon.
  • A region homologous to a corresponding gene region indicates that the region is at least 90%, preferably at least 92%, more preferably at least 94%, still more preferably at least 96%, still more preferably at least 98%, still more preferably at least 99%, and most preferably 100% identical to the base sequence of the gene region.
  • a preferred choice is that such "homologous regions" are derived from the described gene regions.
  • the length of the homologous recombination region is not particularly limited.
  • the length of this region is preferably adapted to undergo homologous recombination. Therefore, the length of this region is at least 40 bp.
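  • The two criteria stated above (at least 90% identity to the gene region and a length of at least 40 bp) can be checked with a short script. The following is a minimal sketch, not part of the disclosed method: it assumes the two sequences are already aligned and of equal length, and the function names are hypothetical.

```python
def percent_identity(region: str, reference: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length."""
    if not region or len(region) != len(reference):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(region.upper(), reference.upper()))
    return 100.0 * matches / len(region)


def is_usable_homology_region(region: str, reference: str,
                              min_identity: float = 90.0,
                              min_length: int = 40) -> bool:
    """Apply the >=90% identity and >=40 bp criteria given in the text."""
    return (len(region) >= min_length
            and percent_identity(region, reference) >= min_identity)


if __name__ == "__main__":
    reference = "ACGT" * 10             # toy 40 bp gene region
    region = reference[:-2] + "AA"      # two mismatches at the 3' end
    print(percent_identity(region, reference))
    print(is_usable_homology_region(region, reference))
```

  • The default thresholds correspond to the broadest ranges in the text; the preferred ranges (92% to 100%) can be applied by passing a higher `min_identity`.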
  • Bacterial origins of replication include f1-ori, colicin, ColE1, and other origins known in the art.
  • Antibiotic resistance genes include ampicillin, kanamycin, tetracycline, and Zeocin resistance genes, and other antibiotic resistance genes known in the art.
  • the origin of replication and the antibiotic resistance gene can be linked between different parts.
  • The present invention provides a linear "targeting cassette", which can be obtained by linearizing a targeting vector by restriction enzyme digestion or by chemical gene synthesis.
  • The "targeting cassette" may also be referred to herein as a "targeting fragment", "gene disruption fragment", or "gene integration fragment".
  • the targeting cassette is used to disrupt the target gene and integrate the foreign gene into the host's chromosomal genome such that the foreign gene can function in the host.
  • The essential portions of the targeting cassette are the marker gene and the homologous regions.
  • The targeting cassette may contain other portions and, if desired, linkers between the portions.
  • the upstream and downstream sides of the marker gene are flanked by homologous regions.
  • a targeting cassette or vector is introduced into a host cell for homologous recombination. Transformation and transfection of host cells can be performed according to methods well known to those skilled in the art.
  • Suitable transformation methods include viral infection, transfection, conjugation, protoplast fusion, electroporation, gene gun technology, calcium phosphate precipitation, direct microinjection, and the like.
  • the choice of method will usually depend on the type of cell being transformed and the conditions under which the transformation takes place. A general discussion of these methods can be found in Ausubel et al, Short Protocols in Molecular Biology, 3rd ed., Wiley & Sons, 1995.
  • Yeast transformation can be carried out using different methods, including spheroplast methods, electroporation, polyethylene glycol methods, alkali cation methods, etc. [Cregg JM (2010) Pichia Protocols, Second edition. Totowa, New Jersey: Humana Press].
  • Host cells useful in the present invention include typical eukaryotic and prokaryotic hosts, such as E. coli, Pseudomonas spp., Bacillus spp., and Streptomyces spp.; fungi; yeasts, such as Saccharomyces cerevisiae and Pichia pastoris; insect cells, such as Spodoptera frugiperda (SF9); animal cells, such as Chinese hamster ovary (CHO) cells, mouse cells, and African green monkey cells; cultured human cells; and plant cells.
  • Yeast is a preferred host cell of the invention.
  • Pichia pastoris is a more preferred host cell.
  • The transformed cells can then be selected based on the phenotype conferred by the selectable marker.
  • the present invention assessed site-specific homologous recombination integration of the 5'-regulatory region, coding region and 3'-regulatory region, identifying regions of high efficiency gene integration.
  • Previous reports have shown that when using a 1 kb or greater homologous region, the efficiency of homologous recombination gene targeting depends on the genomic gene.
  • The present inventors have found that, when homologous regions shorter than 1 kb are used, the efficiency of homologous recombination gene integration in the 5'- and 3'-regulatory regions is independent of the gene and is significantly higher.
  • The efficiency of gene integration at the 3' end of the coding region is higher than in other parts of the coding region.
  • The efficiency of targeting integration in different regions of the genomic gene can be expressed in the following order: 5'-regulatory region and 3'-regulatory region >> 3' end of the coding region > other coding regions.
  • the present invention develops a method for precisely controlling target gene expression by gene integration in the 5'-regulatory region.
  • the integration gene can be any marker gene that facilitates the identification of a transformant having gene integration, including selectable marker systems, selection markers, and molecular markers.
  • the marker gene ORF can be arranged in a certain position and orientation with different promoters, secretion signal sequences (if necessary) and transcription terminators, and fused to form an expression cassette. The arrangement and orientation of these segments are known to those of ordinary skill in the art.
  • Marker gene expression cassettes can be integrated in the DNA duplex of the genome, in the same or opposite orientation to the target ORF, anywhere in the 5'-regulatory region, more preferably in the 5'-regulatory region close to the target ORF, and most preferably in the same strand and the same orientation, immediately adjacent to the target ORF.
  • A fusion of the marker gene ORF and a transcription terminator can also be integrated in the same strand of the 5'-regulatory region, in the same orientation, close to the target ORF, and most preferably immediately adjacent to the target ORF, so that the 5'-regulatory region of the target gene can be used to drive expression of the marker gene.
  • the 5' regulatory region can inhibit the efficiency of transcription and translation, thereby effectively inhibiting specific target gene expression.
  • the methods for regulating gene transcription mainly include: inhibiting the binding function of molecules and DNA, and inhibiting the transcriptional activity by inhibiting the binding of molecules to RNA polymerase.
  • the method of regulating protein translation mainly uses small RNA and RNA-binding proteins to bind to mRNA to change its translation ability.
  • Gene integration in the genome, particularly upstream of the ORF, is a more efficient method of accurately controlling target gene expression.
  • The targeting cassette can be integrated at any position in the 5' region near the start codon of the target ORF, more preferably upstream of the start codon, and most preferably 3-10 bases upstream of the start codon.
  • A strong promoter can be introduced to upregulate the expression of any target ORF by using a targeting cassette that consists of a marker gene expression cassette fused downstream with the strong promoter.
  • The invention contemplates a method of reducing target ORF gene expression by integrating a selectable marker cassette in the 3'-regulatory region, in the same or opposite orientation in the DNA duplex of the genome, most preferably immediately downstream of the stop codon.
  • The present invention develops a method for reducing target ORF gene expression by integrating a selectable marker cassette near the 3' end of the coding region, in the same or opposite orientation in the DNA duplex of the genome.
  • Table 1 compares the efficiency of homologous recombination integration at different positions in the OCH1 and ADE1 genes of Pichia pastoris. These positions are determined by the nucleotide numbering of the genomic gene, which refers to the corresponding start codon of the coding region as nucleotides 1-3 and the corresponding stop codon as nucleotides +1 to +3.
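  • The numbering convention described above (start codon = nucleotides 1-3, stop codon = +1 to +3, upstream positions negative, no position 0) can be made concrete with a small converter. This is an illustrative sketch under stated assumptions: the function names are hypothetical, labels are passed as strings so that "1" and "+1" remain distinct, and the coding length of 1212 nucleotides for OCH1 is inferred from the position label "1212/+1" used in this document.

```python
def label_to_offset(label: str, cds_length: int) -> int:
    """Convert a position label to a 0-based offset from the A of the start codon.

    "-109" -> upstream of the start codon (negative offset)
    "554"  -> within the coding region (counted from the start codon)
    "+4"   -> counted from the first nucleotide of the stop codon
    cds_length is the number of coding nucleotides, excluding the stop codon.
    """
    label = label.strip()
    value = int(label)
    if value == 0:
        raise ValueError("this numbering scheme has no position 0")
    if label.startswith("+"):
        return cds_length + value - 1
    if value < 0:
        return value
    if value > cds_length:
        raise ValueError("coding position lies beyond the coding region")
    return value - 1


def insertion_offset(site: str, cds_length: int) -> int:
    """For a site such as "-1/1", return the offset of the right-hand nucleotide."""
    left, right = (label_to_offset(p, cds_length) for p in site.split("/"))
    if right - left != 1:
        raise ValueError("site must name two adjacent nucleotides")
    return right


if __name__ == "__main__":
    OCH1_CDS_LENGTH = 1212  # assumption inferred from the label "1212/+1"
    for site in ("-110/-109", "-1/1", "1212/+1", "+3/+4", "+203/+204"):
        print(site, insertion_offset(site, OCH1_CDS_LENGTH))
```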
  • A correct integrant is a clone in which the correct genomic integration has been verified by PCR. Targeting efficiency is defined as the ratio of correct integrants verified by PCR to the total number of clones examined.
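  • The definition of targeting efficiency given above is a simple ratio. A minimal sketch follows; the function name and the clone counts in the usage example are hypothetical and are not data from this document.

```python
def targeting_efficiency(correct_integrants: int, clones_examined: int) -> float:
    """Targeting efficiency in percent: PCR-verified integrants / clones examined."""
    if clones_examined <= 0:
        raise ValueError("at least one clone must be examined")
    if not 0 <= correct_integrants <= clones_examined:
        raise ValueError("correct integrants must be between 0 and clones examined")
    return 100.0 * correct_integrants / clones_examined


if __name__ == "__main__":
    # Hypothetical example: 7 PCR-verified integrants among 20 clones examined.
    print(targeting_efficiency(7, 20))
```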
  • Using the method of the invention, the skilled person is able to engineer a strain, in particular to modify a gene for which conventional targeting methods give a homologous recombination efficiency of less than 3%, preferably less than 1%.
  • the present invention provides a method of engineering a strain which can be used to produce a recombinant protein.
  • the glycosylation pattern in the recombinant protein is altered.
  • The gene targeting method of the present invention can be applied to modify the metabolic reactions of a strain so that it produces metabolites more efficiently.
  • the gene targeting method of the present invention can be applied to change the enzyme reactivity in the body, so that the modified organism can perform the biocatalytic reaction more efficiently.
  • the strains engineered by the method of the invention can also be used in various fields such as metabolic engineering, genetic research, and biotechnology applications.
  • the present invention identifies a region in which a gene-specific effective homologous recombination gene is targeted;
  • the method of the invention can modulate or engineer any gene of an organism
  • the method of the present invention is widely used in the fields of biotechnology industry and biological research to regulate gene expression, improve cell function, and produce heterologous proteins.
  • E. coli strain Trans1-T1 was obtained from TransGen Biotech.
  • Pichia pastoris auxotrophic strains JC301 (ade1 his4 ura3) and JC307 (his4 ura3) were obtained from Keck Graduate Institute (KGI), and GS115 (his4) was obtained from Invitrogen.
  • Nucleotide sequence data was primarily obtained from the public database NCBI (www.ncbi.nlm.nih.gov).
  • The methods used in the present invention are carried out according to standard methods well known to those skilled in the art of molecular and cell biology, including polymerase chain reaction (PCR), restriction enzyme cloning, DNA purification, bacterial and prokaryotic cell culture, transformation, transfection and Western blotting, as described in the following manuals: Sambrook J et al. (Molecular Cloning: A Laboratory Manual (Third Edition), Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 2001), Ausubel F M et al. (Current Protocols in Molecular Biology, Wiley InterScience, 2010), and Cregg JM (Pichia Protocols (Second Edition), Totowa, New Jersey: Humana Press, 2010).
  • the E. coli strain Trans1-T1 was used to construct and amplify the plasmid.
  • the strain was cultured in Luria-Bertani (LB) medium (10 g/L tryptone, 5 g/L yeast extract and 5 g/L sodium chloride) or LB plate (15 g/L agar) containing appropriate antibiotics.
  • The concentrations of the antibiotics added were as follows: 100 mg/L ampicillin, 50 mg/L kanamycin and 25 mg/L Zeocin.
  • Pichia pastoris strains were cultured in YPD medium (10 g/L yeast extract, 20 g/L peptone, 20 g/L glucose) and on YPD plates (10 g/L yeast extract, 20 g/L peptone, 20 g/L glucose, 15 g/L agar).
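  • The LB and YPD recipes above are given in g/L, so the amounts for any batch volume follow by simple scaling. The sketch below illustrates this; the dictionary layout and function name are hypothetical, and the LB plate entry assumes that "LB plate (15 g/L agar)" means LB medium plus 15 g/L agar.

```python
# Recipes in grams per liter, as given in the text.
RECIPES_G_PER_L = {
    "LB": {"tryptone": 10, "yeast extract": 5, "sodium chloride": 5},
    "LB plate": {"tryptone": 10, "yeast extract": 5, "sodium chloride": 5, "agar": 15},
    "YPD": {"yeast extract": 10, "peptone": 20, "glucose": 20},
    "YPD plate": {"yeast extract": 10, "peptone": 20, "glucose": 20, "agar": 15},
}


def grams_needed(medium: str, volume_l: float) -> dict:
    """Scale a g/L recipe to the requested volume in liters."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    return {name: g_per_l * volume_l
            for name, g_per_l in RECIPES_G_PER_L[medium].items()}


if __name__ == "__main__":
    for component, grams in grams_needed("YPD plate", 0.5).items():
        print(f"{component}: {grams} g")
```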
  • Amino acid-free YNB medium: 67 g/L yeast nitrogen base, 5 g/L glucose.
  • Amino acid-free YNB plates: 67 g/L yeast nitrogen base, 5 g/L glucose, 15 g/L agar.
  • Pichia pastoris auxotrophic strains were selected on SC medium (8 g/L SC without histidine and uracil, 20 g/L glucose) and SC plates (8 g/L SC without histidine and uracil, 20 g/L glucose, 15 g/L agar), with antibiotics added as appropriate.
  • The concentrations of the antibiotics added were as follows: 500 mg/L G-418 sulfate and 100 mg/L Zeocin.
  • Pichia pastoris was transformed by electroporation using a MicroPulser™ electroporation apparatus (Bio-Rad) according to the manufacturer's instruction manual.
  • Figure 2 depicts a schematic of the construction of a pUO vector.
  • PCR1: using genomic DNA as a template, the primer pair KpnIOch1(+54)F (SEQ ID NO: 1, the primer has a Kpn I restriction site) and Och1 (+801) BamHI R (SEQ ID NO: 2, the primer has a BamH I restriction site) was used to PCR-amplify the Pichia pastoris OCH1 3' homologous sequence (3'H);
  • PCR2: using the pBlunt-URA3SK vector as a template, the primer pair XhoIURA3F (SEQ ID NO: 3, the primer has a Xho I restriction site) and DRKpnI R (SEQ ID NO: 4, the primer has a Kpn I restriction site) was used to PCR-amplify the Pichia pastoris URA3 expression cassette and the SacI-KpnI fragment.
  • the pBlunt-URA3SK vector was obtained by ligating the PCR fusion fragment of the URA3 and SacI-KpnI fragments to the pBlunt vector (TransGen Biotech, China).
  • The PCR product of OCH1 3'H was digested with Kpn I and BamH I, and the URA3 expression cassette was digested with Xho I and Kpn I.
  • the KpnI-BamHI fragment of OCH1 3'H and the XhoI-KpnI fragment of the URA3 expression cassette were inserted into the Xho I and BamH I sites of the pBlunt-XB vector to generate a pUO3H vector.
  • the fragment containing the XhoI and BamHI sites was ligated to the pBlunt vector (TransGen Biotech) to obtain pBlunt-XB.
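  • The restriction digests used in this construction (Xho I, Kpn I, BamH I) can be simulated in silico to check which fragments a digest should yield. The following is a simplified sketch, not the authors' procedure: it models only the top strand of a linear molecule, the function names are hypothetical, and the toy sequence is invented for illustration. The recognition sites and top-strand cut positions (C^TCGAG, GGTAC^C, G^GATCC) are the standard ones for these enzymes.

```python
# enzyme -> (recognition site, cut offset within the site on the top strand)
ENZYMES = {
    "XhoI": ("CTCGAG", 1),   # C^TCGAG
    "KpnI": ("GGTACC", 5),   # GGTAC^C
    "BamHI": ("GGATCC", 1),  # G^GATCC
}


def cut_positions(seq: str, enzyme: str) -> list:
    """Return every top-strand cut position of one enzyme in a linear sequence."""
    site, offset = ENZYMES[enzyme]
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def digest(seq: str, enzymes: list) -> list:
    """Cut a linear, upper-case sequence with all listed enzymes; return fragments 5' to 3'."""
    cuts = sorted({p for e in enzymes for p in cut_positions(seq, e)})
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Toy insert flanked by an Xho I site and a Kpn I site (invented sequence).
    toy = "AAAA" + "CTCGAG" + "TTTT" + "GGTACC" + "CCCC"
    print(digest(toy, ["XhoI", "KpnI"]))
```

  • The middle fragment returned for the toy sequence corresponds to an XhoI-KpnI fragment of the kind ligated into the vector in the steps above.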
  • The PCR product of OCH1 5'H was digested with Sph I and Xho I, and the pUO3H vector was digested with Xho I and BamH I.
  • the SphI-XhoI fragment of OCH1 5'H and the XhoI-BamHI fragment of the URA3 expression cassette and OCH1 3'H were inserted into the BamH I and Sph I sites of the pUC19-EBSH vector.
  • The EcoRI-HindIII fragment of pUC19 containing the multiple cloning site was replaced with a fragment containing the EcoR I, BamH I, Sph I and Hind III restriction sites to obtain the pUC19-EBSH vector.
  • the resulting pUO vector was used as a base vector to construct other different OCH1 targeting vectors.
  • Figure 3 depicts a schematic representation of the construction of a targeting vector integrated into the Pichia pastoris OCH1 gene.
  • The PCR product of the ADE1 expression cassette was digested with Sac I and Kpn I, and the PCR product of the URA3 expression cassette was digested with Sac I and Xho I.
  • The SacI-KpnI fragment of ADE1 and the SacI-XhoI fragment of URA3 were inserted into the Xho I and Kpn I sites of the pUO vector to generate the pUAH vector.
  • the SphI-XhoI fragment of OCH1 5'H was inserted into the same restriction enzyme site of pUAH to generate pUA5H.
  • The KpnI-BamHI fragment of OCH1 3'H (1/646) was inserted into the same restriction enzyme sites of pUA5H to generate the OCH1 targeting vector pUAH(1), for integration in the OCH1 5'-regulatory region at the position immediately upstream of the start codon (-1/1).
  • Other OCH1 targeting vectors, which integrate at different positions of the OCH1 gene, were constructed by inserting the corresponding PCR products of the 5' and 3' homologous sequences of OCH1 into pUAH.
  • FIG. 4A depicts the target integration site in the OCH1 locus of Pichia pastoris.
  • The OCH1 targeting vectors, including pUAH (-109, 1, 554, 1097, 1166, +1, +4, and +204), were PCR amplified using the following primer pair to generate the linear OCH1 targeting cassettes UAH (-109, 1, 554, 1097, 1166, +1, +4 and +204):
  • Och1 (816) F (SEQ ID NO: 49) / Och1 (+803) R (SEQ ID NO: 50).
  • The OCH1 targeting cassette contains URA3 and ADE1 expression cassettes that are positioned adjacent to each other in opposite orientations on the DNA duplex. The two expression cassettes are flanked by 5' and 3' integration homologous sequences (5'H and 3'H) of the same length, 600 bp each, which are gene-specific homologous sequences that ensure precise integration at the targeting sites in the OCH1 gene.
  • Figure 4B depicts a representative homologous integration of the targeting cassette in the position (-1/1) upstream of the start codon of the OCH1 gene in Pichia pastoris.
  • Colonies were randomly picked and cultured, and genomic DNA was extracted for PCR to verify genomic integration.
  • Two primer pairs, P1 (SEQ ID NO: 51, located upstream of the 5' homology region in the genome) / P2 (SEQ ID NO: 52, located within URA3 of the targeting fragment) and P3 (SEQ ID NO: 53, located within ADE1 of the targeting fragment) / P4 (SEQ ID NO: 54, located downstream of the 3' homology region in the genome), were used to verify homologous integration at the targeting sites (Fig. 4B).
  • Figure 4C shows the results of PCR validation of integration at different locations in the OCH1 locus.
  • The predicted 1300, 2550 and 2550 bp bands were amplified by the P1/P2 primer pair, respectively. These strains also gave the predicted 3500, 2300 and 2300 bp bands in PCR using the P3/P4 primer pair.
  • the PCR results verified that the corresponding target fragments were successfully integrated at the specified positions (-1/1, 1212/+1, +3/+4).
  • the parent strain JC301 was used as a negative control, and the 1433 bp band was amplified by the P1/P4 primer pair, but no bands were amplified by the P1/P2 and P3/P4 primer pairs.
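  • The PCR verification logic described above (both junction PCRs must give the predicted band, whereas the parent strain gives only the P1/P4 band) can be written as a small classifier. This is a sketch under stated assumptions: band sizes are those given in the text and are read as base pairs, the 10% size tolerance is an arbitrary illustrative choice, and the function and variable names are hypothetical.

```python
# Predicted junction bands (bp) for the three integration sites named in the text.
EXPECTED_BANDS_BP = {
    "-1/1": {"P1/P2": 1300, "P3/P4": 3500},
    "1212/+1": {"P1/P2": 2550, "P3/P4": 2300},
    "+3/+4": {"P1/P2": 2550, "P3/P4": 2300},
}
PARENT_P1_P4_BP = 1433  # band amplified from the parent strain JC301


def band_matches(observed, expected, tolerance=0.10):
    """True if a band was observed and lies within the tolerance of the expected size."""
    return observed is not None and abs(observed - expected) <= tolerance * expected


def is_correct_integrant(site, observed_bands, tolerance=0.10):
    """A clone counts as correct only if both junction PCRs give the predicted band."""
    expected = EXPECTED_BANDS_BP[site]
    return all(band_matches(observed_bands.get(pair), size, tolerance)
               for pair, size in expected.items())


if __name__ == "__main__":
    clone = {"P1/P2": 1280, "P3/P4": 3550}   # hypothetical gel estimates
    parent = {"P1/P2": None, "P3/P4": None}  # no junction bands, as for JC301
    print(is_correct_integrant("-1/1", clone))
    print(is_correct_integrant("-1/1", parent))
```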
  • the strain integrated upstream of the initiation codon of the OCH1 locus is called the och1(-1/+1, ADE1URA3) strain.
  • Pichia pastoris is extremely inefficient in homologous recombination gene targeting.
  • the efficiency of its homologous recombination gene replacement events is highly dependent on the length of the target fragment.
  • When the targeting homologous sequence is shorter than 500 bp, the homologous recombination efficiency is less than 0.1%.
  • The present inventors have found that when a short homologous sequence of 600 bp is used, the homologous integration efficiency immediately upstream of the stop codon (1212/+1) is about 7%. This may be due to incomplete disruption of OCH1 function, because integration at this site causes the OCH1 gene to be transcribed into a non-stop OCH1 mRNA, which is subsequently translated to produce a functionally active, C-terminally extended OCH1 product.
  • Because the C-terminally extended product retains some activity, the homologous integration efficiency at the site adjacent to the 3' end of the coding region may be higher than at other sites in the coding region.
  • The present inventors have also found that the efficiency of homologous recombination at positions in the 5'- and 3'-regulatory regions of OCH1 is significantly higher: for example, 40% and 35% at the two positions upstream of the start codon (-110/-109 and -1/1), and 80% and 25% at the two positions downstream of the stop codon (+3/+4 and +203/+204).
  • Targeting cassette integration was performed at various locations in the ADE1 gene to further validate the results of the OCH1 gene targeting.
  • Figure 5A depicts the integration position of the targeting cassette in the genomic ADE1 gene.
  • Figure 5B shows a schematic of the construction of an ADE1 targeting cassette by PCR.
  • PCR1: using Pichia pastoris genomic DNA as a template, the primer pair ADE1 (-800) F (SEQ ID NO: 55) and ADE1 (-1) U R (SEQ ID NO: 56) (the primer has a URA3 overlapping sequence for fusion PCR) was used to PCR-amplify the 5'-homologous sequence of the ADE1 locus (5'H, -800/-1);
  • ADE1 targeting cassettes were constructed by PCR amplification and fusion using the corresponding primer pairs below, which were integrated at different positions of the ADE1 gene:
  • the sequence was used for fusion PCR) and ADE1 (+750)R (SEQ ID NO: 72) was used to construct the targeting cassette UH (863) for integration at the position in the ADE1 coding region (862/863).
  • These linear targeting cassettes contain the URA3 expression cassette flanked by 5' and 3' homologous sequences of approximately 800 bp (750-850 bp) in length, which are gene-specific homologous sequences for precise integration at the target positions of the Pichia pastoris ADE1 gene.
  • These targeting cassettes were transformed into cells of Pichia pastoris auxotrophic strain JC307 (his4 ura3) (Keck Graduate Institute, USA) by electroporation.
  • the transformed cells were grown on SC plates supplemented with 20 mg/L histidine (8 g/L SC without histidine and uracil, 20 g/L glucose, 15 g/L agar) to select the uracil prototrophy.
  • Colonies were picked at random from the plates to avoid a picking bias between white and light red colonies, because the accumulation of red pigment in ade1 strains and the appearance of pale red colonies require longer incubation.
  • Genomic DNA was extracted from overnight culture colonies for verification of integration of positions in the ADE1 gene by PCR.
  • Primer pairs P5/P6 (located upstream of the 5' homology region in the genome and downstream of the 3' homology region) were used to validate genomic integration (Fig. 5B).
  • The corresponding P5/P6 primer pairs were further named P5-1 (SEQ ID NO: 93) / P6-1 (SEQ ID NO: 94), P5-2 (SEQ ID NO: 95) / P6-2 (SEQ ID NO: 96), P5-3 (SEQ ID NO: 97) / P6-3 (SEQ ID NO: 98), and P5-4 (SEQ ID NO: 99) / P6-4 (SEQ ID NO: 100), and were used to verify integration at different locations in the ADE1 gene.
  • The homologous integration efficiency in the 5'- and 3'-regulatory regions of ADE1 was significantly higher: for example, 50% and 65% at the two positions upstream of the start codon (-110/-109 and -1/1), and 30% and 45% at the two positions downstream of the stop codon (+3/+4 and +203/+204).
  • homologous recombination gene targeting efficiency is gene-dependent, for example, the OCH1 gene targeting efficiency is extremely low, while the ADE1 gene targeting efficiency is higher.
  • Homologous recombination gene targeting efficiency depends mainly on the region of the target gene. When homologous sequences shorter than 1 kb are used, homologous recombination integration into the coding regions of OCH1 and ADE1, which disrupts gene function, shows completely different efficiencies for the two genes, whereas homologous recombination integration into the 5'- and 3'-regulatory regions of the OCH1 and ADE1 loci shows similarly high efficiency, more than 25%.
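  • The efficiencies reported in this section for the regulatory-region positions of OCH1 and ADE1 can be collected in one place to check the summary statement. The sketch below is illustrative only: the values are those quoted in the text, the assignment of each percentage to a position follows the order in which positions and values are listed, and the variable and function names are hypothetical.

```python
# Reported targeting efficiencies (%) at regulatory-region positions.
REGULATORY_EFFICIENCY_PERCENT = {
    "OCH1": {"-110/-109": 40, "-1/1": 35, "+3/+4": 80, "+203/+204": 25},
    "ADE1": {"-110/-109": 50, "-1/1": 65, "+3/+4": 30, "+203/+204": 45},
}
# Reported efficiency (%) at the 3' end of the OCH1 coding region (1212/+1), about 7%.
OCH1_CODING_3PRIME_END_PERCENT = 7


def lowest_regulatory_efficiency(data: dict) -> int:
    """Lowest efficiency over all genes and regulatory-region positions."""
    return min(value for gene in data.values() for value in gene.values())


if __name__ == "__main__":
    lowest = lowest_regulatory_efficiency(REGULATORY_EFFICIENCY_PERCENT)
    print("lowest regulatory-region efficiency:", lowest)
    print("above OCH1 coding 3' end:", lowest > OCH1_CODING_3PRIME_END_PERCENT)
```

  • For both genes, every regulatory-region position is therefore at or above the 25% level, which is consistent with the ordering 5'- and 3'-regulatory regions >> 3' end of the coding region.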
  • the present inventors have found that the efficiency of homologous recombination gene integration in the 5'- and 3'-regulatory regions is independent of genes and is highly efficient. In addition, the efficiency of gene integration adjacent to the 3' end of the coding region is higher than other regions of the coding region.
  • the homologous recombination targeting integration efficiency in different regions of the genomic gene can be expressed in the following order: 5'-regulatory region and 3'-regulatory region>>3' end of the coding region>other coding regions.
  • the lacZ gene of Escherichia coli encodes a beta-galactosidase that hydrolyzes various beta-D-galactosides, including chromogenic substrates, to produce colored products. Due to the convenience and sensitivity of the beta-galactosidase activity assay in liquid culture, it is a commonly used reporter enzyme to monitor the regulation of gene expression.
  • A lacZ reporter for Pichia pastoris can be constructed by fusing the lacZ ORF with the 5' and 3' regulatory regions of a gene.
  • Figures 6 and 7 depict a schematic representation of the construction of a series of 5' AOX1-induced lacZ expression vectors in which URA3 is adjacent to the start or stop codon of the lacZ ORF to regulate its expression.
  • BamHIlacZ F (SEQ ID NO: 101) (the primer has a BamH I restriction site) and lacZNotI R (SEQ ID NO: 102) (the primer has Not I and Xho I restriction sites) were used as a primer pair for PCR to amplify the lacZ ORF (SEQ ID NO: 128).
  • PCR2: using E. coli BL21 (DE3) genomic DNA as a template, the primer pair BamHIlacZ F (SEQ ID NO: 101) and lacZnsNotI R (SEQ ID NO: 103) (the primer lacks the lacZ stop codon and has Not I and Xho I restriction sites) was used for PCR to amplify the E. coli lacZ ORF without a stop codon (lacZns).
  • The lacZ and lacZns PCR products were each digested with BamH I and Not I.
  • The BamHI/NotI fragments of lacZ and lacZns were inserted into the BamH I and Not I sites of the pPIC3.5K vector (Invitrogen) to generate the p5'AOX1-lacZ and p5'AOX1-lacZns vectors (Fig. 6).
  • PCR3: using the pBLURA-SX vector as a template, the primer pair BamHIURA3F (SEQ ID NO: 104) (the primer has a BamH I restriction site) and URA3BamHI R (SEQ ID NO: 105) (the primer has a BamH I restriction site) was used for PCR to amplify the Pichia URA3 expression cassette.
  • the PCR product of the URA3 expression cassette was digested with BamH I and inserted into the BamH I site of the p5'AOX1-lacZ vector.
  • The ligation products, containing URA3 in either of the two orientations, were transformed into E. coli strain Trans1-T1 (TransGen Biotech, China), and colony PCR was performed to select the vector p5'AOX1-URA3-lacZ, in which URA3 is in the same strand and the same orientation, immediately upstream of the lacZ ORF (Fig. 6).
  • PCR4: using the pBLURA-SX vector as a template, the primer pair NotIURA3F (SEQ ID NO: 106) (the primer has a Not I restriction site) and URA3 NotI R (SEQ ID NO: 107) (the primer has a Not I restriction site) was used for PCR to amplify the Pichia URA3 expression cassette.
  • the PCR product of the URA3 expression cassette was digested with Not I and inserted into the Not I sites of the p5'AOX1-lacZ and p5'AOX1-lacZns vectors, respectively.
  • the ligated vector containing the two orientations of URA3 was transformed into the phage-tolerant Escherichia coli strain Trans1-T1, and the vector p5'AOX1-lacZ-URA3, p5'AOX1-lacZ-URA(-), p5 were selected by colony PCR.
  • Both p5'AOX1-lacZ-URA3 and p5'AOX1-lacZns-URA3 contain the URA3 expression cassette in the same strand and the same orientation, immediately downstream of the lacZ ORF and the lacZns ORF, respectively (Fig. 6).
  • The other two vectors, p5'AOX1-lacZ-URA3(-) and p5'AOX1-lacZns-URA3(-), contain the URA3 expression cassette in the opposite strand and the opposite orientation, immediately downstream of the lacZ ORF and the lacZns ORF, respectively (Fig. 7).
  • Figures 8, 9, 10 and 11 depict a schematic representation of the construction of a series of 5' OCH1-mediated lacZ expression vectors in which URA3 is adjacent to the start or stop codon of the lacZ ORF to regulate its expression.
  • PCR1: using genomic DNA as a template, the primer pair BamHIOCH1(-731)F (SEQ ID NO: 108) (the primer has a BamH I restriction site) and OCH1(-1)L R (SEQ ID NO: 109) (the primer has a lacZ overlapping sequence for fusion PCR) was used for PCR to amplify the 5' regulatory region of Pichia pastoris OCH1 (5'OCH1, -731/-1).
  • PCR2: using E. coli BL21 (DE3) genomic DNA as a template, the primer pair OLacZ F (SEQ ID NO: 110) (the primer has a 5' OCH1 overlapping sequence for fusion PCR) and lacZXhoI R (SEQ ID NO: 111) (the primer has a Xho I restriction site) was used for PCR to amplify the lacZ ORF.
  • PCR3: using E. coli BL21 (DE3) genomic DNA as a template, the primer pair OLacZ F (SEQ ID NO: 110) and lacZnsNotI R (SEQ ID NO: 103) (the primer lacks the lacZ stop codon and has Not I and Xho I restriction sites) was used for PCR to amplify the lacZ ORF without a stop codon (lacZns).
  • PCR4: using the BamHIOCH1(-731)F (SEQ ID NO: 108) and LacZXhoI R (SEQ ID NO: 111) primer pair, the PCR1 and PCR2 products were fused by overlap-extension PCR. This produced the 5'OCH1-lacZ fragment.
  • PCR5: using the BamHIOCH1(-731)F (SEQ ID NO: 108) and lacZnsNotI R (SEQ ID NO: 103) primer pair, the PCR1 and PCR3 products were fused by overlap-extension PCR. This produced the 5'OCH1-lacZns fragment.
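  • The overlap-extension (fusion) PCR steps above join two products that share a short overlapping sequence introduced by the primers. The expected fused product can be predicted in silico with a simple string model. This is a simplified sketch, not a simulation of the PCR itself: it considers only the top strand, requires an exact overlap, and uses a hypothetical function name, an illustrative default minimum overlap of 15 nt, and invented toy sequences.

```python
def fuse_by_overlap(upstream: str, downstream: str, min_overlap: int = 15) -> str:
    """Join two fragments where the 3' end of `upstream` equals the 5' end of `downstream`.

    The longest exact overlap of at least `min_overlap` nucleotides is used.
    """
    longest_possible = min(len(upstream), len(downstream))
    for k in range(longest_possible, min_overlap - 1, -1):
        if upstream.endswith(downstream[:k]):
            return upstream + downstream[k:]
    raise ValueError("fragments do not share a sufficient overlap")


if __name__ == "__main__":
    # Toy fragments sharing the 5 nt overlap "CCGGG" (invented sequences).
    five_prime_region = "AAAACCCGGG"
    orf_fragment = "CCGGGTTTT"
    print(fuse_by_overlap(five_prime_region, orf_fragment, min_overlap=5))
```

  • In the constructions above, the overlap is supplied by primers such as OCH1(-1)L R and OLacZ F, each of which carries the sequence of the neighbouring fragment.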
  • PCR6: using genomic DNA as a template, the primer pair XhoIOCH1(+4)F (SEQ ID NO: 112) (the primer has a Xho I restriction site) and OCH1 (+798) SacI R (SEQ ID NO: 113) (the primer has a Sac I restriction site) was used for PCR to amplify the 3' regulatory region of Pichia pastoris OCH1 (3'OCH1, +4/+798).
  • The PCR product of the 5'OCH1-lacZ fragment was digested with BamH I and Xho I, and the PCR product of 3'OCH1 was digested with Xho I and Sac I.
  • the BamHI-XhoI fragment of 5'OCH1-lacZ and the XhoI-SacI fragment of 3'OCH1 were inserted into the Sac I and BamH I sites of the pBLHIS-SX vector to generate the p5'OCH1-lacZ vector (Fig. 8).
  • the PCR product of the 5'OCH1-lacZns fragment was digested with BamH I and Xho I, and the PCR product of 3'OCH1 was digested with Xho I and Sac I.
  • the BamHI-XhoI fragment of 5'OCH1-lacZns and the XhoI-SacI fragment of 3'OCH1 were inserted into the Sac I and BamH I sites of the pBLHIS-SX vector to generate the p5'OCH1-lacZns vector (Fig. 8).
  • PCR7: using genomic DNA as a template, the primer pair BamHIOCH1(-731)F (SEQ ID NO: 108) and OCH1(-1)U R (SEQ ID NO: 114; carries a URA3 overlapping sequence for fusion PCR) was used to amplify the 5' regulatory region of OCH1 (5'OCH1, -731/-1).
  • PCR8: using the pBLURA-SX vector as a template, the primer pair OURA3F (SEQ ID NO: 115; carries an OCH1 overlapping sequence for fusion PCR) and URA3SphIXhoI R (SEQ ID NO: 116; carries Sph I and Xho I restriction sites) was used to amplify the Pichia URA3 expression cassette.
  • The PCR product of the 5'OCH1-URA3 fragment was digested with BamH I and Xho I, and the PCR product of the lacZ ORF was digested with Sph I and Xho I.
  • The BamHI-XhoI fragment of 5'OCH1-URA3 and the SphI-XhoI fragment of lacZ were inserted into the BamHI and XhoI sites of the p5'OCH1-lacZ vector to generate the p5'OCH1-URA3-lacZ vector (Fig. 9).
  • PCR11: using the pBLURA-SX vector as a template, the primer pair XhoIURA3F (SEQ ID NO: 3; carries an Xho I restriction site) and URA3XhoI R (SEQ ID NO: 10; carries an Xho I restriction site) was used to amplify the URA3 expression cassette.
  • The PCR product of the URA3 expression cassette was digested with Xho I and inserted into the Xho I site of the p5'OCH1-lacZ and p5'OCH1-lacZns vectors.
  • The resulting vectors, carrying URA3 in either orientation, were transformed into E. coli strain Trans1-T1, and colony PCR was used to select the vectors p5'OCH1-lacZ-URA3, p5'OCH1-lacZ-URA3(-), p5'OCH1-lacZns-URA3 and p5'OCH1-lacZns-URA3(-).
  • Both p5'OCH1-lacZ-URA3 and p5'OCH1-lacZns-URA3 contain the URA3 expression cassette on the same strand and in the same orientation, immediately downstream of the lacZ ORF or the lacZns ORF.
  • The other two vectors, p5'OCH1-lacZ-URA3(-) and p5'OCH1-lacZns-URA3(-), contain the URA3 expression cassette on the opposite strand and in the opposite orientation, immediately downstream of the lacZ ORF or the lacZns ORF (Figs. 10 and 11).
  • The 5'AOX1-induced lacZ expression vectors, including p5'AOX1-lacZ, p5'AOX1-URA3-lacZ, p5'AOX1-lacZ-URA3, p5'AOX1-lacZ-URA3(-), p5'AOX1-lacZns-URA3 and p5'AOX1-lacZns-URA3(-), were linearized by Sac I digestion and transformed into Pichia pastoris strain GS115 (his4) (Invitrogen) by electroporation. The transformed cells were cultured on YNB plates to select for histidine prototrophy. The linearized expression vectors were integrated into the genome by roll-in recombination as described by the manufacturer (Invitrogen).
  • The 5'OCH1-mediated lacZ expression vectors, including p5'OCH1-lacZ, p5'OCH1-URA3-lacZ, p5'OCH1-lacZ-URA3, p5'OCH1-lacZ-URA3(-), p5'OCH1-lacZns-URA3 and p5'OCH1-lacZns-URA3(-), were linearized by Stu I digestion and transformed into Pichia pastoris strain GS115 by electroporation. The transformed cells were cultured on YNB plates to select for histidine prototrophy. The linearized expression vectors were integrated at the his4 gene by roll-in recombination.
  • The cells were pelleted by centrifugation at 3000 g for 5 minutes and resuspended in 5 ml of BMMY medium (10 g/L yeast extract, 20 g/L peptone, 13.4 g/L YNB without amino acids, 100 mM potassium phosphate buffer, pH 6.0, 0.4 mg/L biotin, 10 ml/L methanol), then shaken at 225 rpm at 30 °C to induce lacZ expression.
  • The culture was supplemented twice daily with 50 µl of 100% methanol (1% final concentration) and induction was maintained for an additional 48 hours. The cells were then centrifuged at 3000 g for 10 minutes, washed with 5 ml of water and centrifuged again to collect the cell pellet. Cell pellets were used for the β-galactosidase assay or stored at -80 °C for total RNA isolation.
  • Cells transformed with the 5'OCH1-mediated lacZ expression vectors were cultured in 5 ml of YPD medium at 30 °C for 72 hours with shaking at 225 rpm. The cells were then centrifuged at 3000 g for 10 minutes, washed with 5 ml of water and centrifuged again to collect the cell pellet. Cell pellets were used for the β-galactosidase assay or stored at -80 °C for total RNA isolation.
  • Reverse transcription of RNA was carried out using the ReverTra Ace-α- first-strand cDNA synthesis kit (Toyobo) according to the manufacturer's instructions.
  • Real-time PCR reactions (20 µl total volume) contained 10 µl of 2× iTaq™ Universal Green supermix (BioRad, Hercules, CA), 1 µl of cDNA and 100 nM each of the GAPDH F (SEQ ID NO: 118) / R (SEQ ID NO: 119) and LacZ F (SEQ ID NO: 120) / R (SEQ ID NO: 121) primers.
  • PCR was run on a LightCycler LC480 (Roche) with the following program: 1 cycle of 95 °C for 1 minute, then 40 cycles of 95 °C for 10 seconds, 58 °C for 10 seconds and 72 °C for 10 seconds. All samples were run in triplicate and the measurements were repeated several times.
  • Figures 12A and 12B show that lacZ mRNA expression driven by 5'AOX1 and 5'OCH1 is inhibited to different extents when the URA3 expression cassette is integrated near the start and stop codons.
  • When the URA3 expression cassette was integrated upstream of the lacZ ORF start codon, on the same strand and in the same orientation, it reduced lacZ mRNA levels by 60% and 70%, respectively.
  • This reduction in mRNA can be attributed to the 3' URA3 terminator, which, being upstream of the lacZ ORF, effectively blocks lacZ transcription initiated by the 5' AOX1 or 5' URA3 promoter.
  • The URA3 3' terminator failed to terminate transcription completely, resulting in a low level of aberrant lacZ mRNA that lacks a suitable 5'UTR for translation.
  • Figures 13A and 13B show the relative intracellular specific activity of β-galactosidase when the URA3 expression cassette is integrated near the start and stop codons.
  • When URA3 was integrated upstream of the lacZ ORF start codon, on the same strand and in the same orientation, no β-galactosidase activity was detectable in the cells. This complete inhibition can be attributed to regulation at both the transcriptional and translational levels.
  • Termination by the 3' URA3 terminator significantly reduced transcription of the aberrant lacZ mRNA, which lacks the 5'UTR.
  • The present invention shows that gene integration in the 5' regulatory region, particularly immediately upstream of the ORF, specifically inhibits target gene expression by repressing transcription and translation.
  • Current methods for regulating gene transcription mainly either alter template function through binding of a repressor molecule to DNA, or inhibit transcriptional activity through binding of a repressor molecule to RNA polymerase.
  • Methods for regulating protein translation mainly use small RNAs and RNA-binding proteins that bind to mRNA and change its translatability.
  • By comparison, gene integration within genomic genes, particularly upstream of the ORF, is a more efficient method of specifically controlling target gene expression.
  • URA3 integration affects aberrant lacZ mRNA levels to different extents.
  • URA3 integration near the stop codon, on either strand and in either orientation, inhibits 5'AOX1- and 5'OCH1-mediated β-galactosidase translation in the cells to different extents.
  • When URA3 is integrated in the appropriate orientation near the stop codon, β-galactosidase activity can be reduced by up to 70%. This inhibition is more effective than previously reported inhibition by microRNA, which has been reported to reduce mRNA stability and translation by only 32% and 4%, respectively (Noah Spies, Christopher B. Burge, and David P. Bartel (2013). 3'UTR-isoform choice has limited influence on the stability and translational efficiency of most mRNAs in mouse fibroblasts, Genome Research, 23: 2078-2090).
  • The present invention thus shows another method of controlling target gene expression, by integration in the vicinity of the stop codon. Integration near the stop codon should be evaluated on both strands of the DNA duplex and in both orientations to obtain the best regulatory result.
  • Representative integrated strains from Example 3 were selected to analyze mRNA transcription of the OCH1 gene.
  • The control strain JC307 and the three integrated strains och1 (-1/+1), (1212/+1) and (+3/+4) were cultured in 5 ml of YPD medium at 30 °C for 72 hours with shaking at 225 rpm. The cells were then centrifuged at 3000 g for 10 minutes, washed with 5 ml of water, centrifuged again to collect the cell pellets, and stored at -80 °C for total RNA isolation.
  • Total RNA was isolated and reverse-transcribed to cDNA using the ReverTra Ace-α- first-strand cDNA synthesis kit (Toyobo).
  • Real-time PCR reactions (20 µl total volume) contained 10 µl of 2× iTaq™ Universal Green supermix (BioRad, Hercules, CA), 1 µl of cDNA and 100 nM each of the GAPDH F (SEQ ID NO: 118) / R (SEQ ID NO: 119) and OCH1 F (SEQ ID NO: 122) / R (SEQ ID NO: 123) primers.
  • PCR was run on a LightCycler LC480 (Roche) with the following program: 1 cycle of 95 °C for 1 minute, then 40 cycles of 95 °C for 10 seconds, 58 °C for 10 seconds and 72 °C for 10 seconds. All samples were run in triplicate and the measurements were repeated several times.
  • Figure 14 shows the relative expression levels of OCH1 mRNA in these strains.
  • Gene integration upstream of the initiation codon is effective in inhibiting OCH1 mRNA levels by more than 90%.
  • Gene integration downstream of the stop codon also reduces OCH1 mRNA levels.
  • Gene integration immediately upstream of the stop codon significantly increased OCH1 mRNA levels. This result is similar to the regulation of lacZ mRNA observed in Example 5.
  • Mammalian cells and yeast share the same N-glycosylation initiation step and processing in the endoplasmic reticulum. As the nascent peptide chain is synthesized in the lumen of the endoplasmic reticulum, the N-glycosylation precursor oligosaccharide Glc3Man9GlcNAc2 is linked to the Asn residue in the conserved sequence Asn-X-Thr/Ser (X is any amino acid other than Pro) of the nascent chain. The sugar chain is then processed by glycosidases such as glucosidase I and II to form the Man8GlcNAc2 structure, and the protein carrying this sugar chain is transported.
  • The sugar chain of the protein first accepts an α-1,6-mannose to form the Man9GlcNAc2 structure; various mannosyltransferases then add mannose residues continuously, often tens to hundreds of them, eventually forming a high-mannose sugar chain and hyperglycosylating the protein.
  • Och1p is the first and most important enzyme that distinguishes yeast from mammalian cells in forming high-mannose glycans, so disrupting the OCH1 gene is expected to block hypermannosylation in P. pastoris.
  • Codon-optimized mouse interleukin-22 (mIL-22), synthesized by Generay, was used as the template (the yeast-codon-optimized DNA sequence of the His-tagged mouse IL-22 mature peptide is shown in SEQ ID NO: 129).
  • The MIL22F (SEQ ID NO: 124) / R (SEQ ID NO: 125) primer pair was used for PCR amplification.
  • The PCR product was digested with Xho I and Not I and cloned into the Xho I/Not I sites of pPICZα (Invitrogen) to generate an mIL-22 expression vector that expresses and secretes His-tagged mIL-22.
  • The expression vector was linearized with the restriction enzyme Sac I and electroporated into the GS115 and och1(-1/+1) strains.
  • The transformed cells were cultured on YPD plates supplemented with 100 mg/L of Zeocin.
  • The linearized vector was integrated into the AOX1 locus by roll-in recombination as described by the manufacturer (Invitrogen).
  • The transformed cells were cultured in 5 ml of YPD medium at 30 °C with shaking at 225 rpm for 24 hours.
  • The cells were pelleted by centrifugation at 3000 g for 5 minutes, resuspended in 5 ml of BMGY medium, and cultured at 30 °C with shaking at 225 rpm for 24 hours.
  • The cells were pelleted by centrifugation at 3000 g for 5 minutes, resuspended in 5 ml of BMGY medium, and cultured at 30 °C with shaking at 225 rpm to induce mIL-22 expression.
  • The culture was supplemented twice daily with 50 µl of 100% methanol (1% final concentration) and induction was maintained for an additional 72 hours. The culture supernatant was then harvested by centrifugation (3000 g, 10 minutes) and stored frozen at -20 °C until use.
  • The His-tagged mIL-22 protein was purified from the supernatant by Ni-affinity chromatography according to the manufacturer's instructions (Nanjing Kingsray Biotechnology Co., Ltd.).
  • The molecular weight of the sugar chains was determined using an Ultraflex MALDI-TOF mass spectrometer (Bruker Daltonics, Bremen, Germany) according to the manufacturer's instructions.
  • Figure 15A shows the mass spectrum of N-glycans released from mIL-22 expressed in the GS115 strain. The major high-mannose N-glycans are Man9-15GlcNAc2 (m/z: 1907, 2069, 2231, 2393, 2555, 2717, 2880), indicating that Och1p initiates complete high-mannose modification on the Man8GlcNAc2 base.
  • Figure 15B shows the mass spectrum of N-glycans released from mIL-22 expressed in the och1(-1/+1) strain. The N-glycans are Man8-15GlcNAc2 (m/z: 1744, 1907, 2069, 2231, 2393, 2555, 2717, 2880), in which the high-mannose chains may be formed by the action of other mannosyltransferases.
  • Production of Man8GlcNAc2 (m/z: 1744) indicates that gene integration upstream of the OCH1 coding region effectively blocks the Och1p-initiated hypermannosylation (Choi, et al. (2003) Proc Natl Acad Sci USA 100:5022–5027).
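The m/z series quoted for Figure 15 can be sanity-checked with a few lines of Python. The sketch below only tests whether consecutive peaks are spaced by roughly one hexose (mannose) residue and labels them from Man8GlcNAc2 upward. The residue mass of about 162.14 Da, the tolerance, and the function names are assumptions made for illustration; they are not taken from the patent.

```python
# m/z values as listed in the text for the och1(-1/+1) strain (Fig. 15B).
PEAKS_15B = [1744, 1907, 2069, 2231, 2393, 2555, 2717, 2880]

# Assumption: average mass of one hexose (mannose) residue, about 162.14 Da.
HEXOSE_RESIDUE = 162.14


def peak_spacings(peaks):
    """Return the differences between consecutive peaks."""
    return [b - a for a, b in zip(peaks, peaks[1:])]


def is_hexose_ladder(peaks, tolerance=1.5):
    """True if every consecutive spacing is within `tolerance` of one hexose."""
    return all(abs(d - HEXOSE_RESIDUE) <= tolerance for d in peak_spacings(peaks))


def label_peaks(peaks, first_mannose_count=8):
    """Label peaks as ManN GlcNAc2, assuming the first peak carries
    `first_mannose_count` mannose residues."""
    return {mz: f"Man{first_mannose_count + i}GlcNAc2" for i, mz in enumerate(peaks)}


if __name__ == "__main__":
    print(peak_spacings(PEAKS_15B))
    print(is_hexose_ladder(PEAKS_15B))
    for mz, name in label_peaks(PEAKS_15B).items():
        print(mz, name)
```

Dropping the first value (1744) gives the Man9-15GlcNAc2 series reported for GS115; pass `first_mannose_count=9` in that case.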

Abstract

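A novel gene targeting method and nucleotide constructs for use in the method. The method integrates a nucleotide construct containing an interfering gene, by homologous recombination, into a gene-independent region of efficient gene targeting, thereby improving gene targeting efficiency. The invention also provides gene targeting systems for gene expression regulation and gene disruption.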

Description

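Novel gene targeting method
Technical Field
The present invention relates to the field of biotechnology; specifically, it relates to a novel targeting method and to nucleotide constructs for use in that method.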
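Background
In the genome of an organism, each gene generally consists of a coding region and regulatory regions. The coding region, or open reading frame (ORF), encodes proteins and RNA chains with various biological functions. The boundaries of a protein-coding sequence are the start codon at the 5' end (N-terminus) and the translation-terminating nonsense codon at the 3' end (C-terminus). The regulatory regions before (5' region) and after (3' region) the coding region contain various DNA regulatory elements, such as promoters, enhancers, terminators, polyadenylation signals, the 5' untranslated region (5'UTR) and the 3' untranslated region (3'UTR), which control every aspect of gene expression, including transcription, translation and RNA stability.
Gene expression involves two main steps: transcription of the encoded gene from DNA into messenger RNA (mRNA) or RNA, and translation of mRNA into protein. Expression can be controlled at the transcription and translation steps by the various DNA elements of the regulatory regions.
Transcription starts at the promoter, proceeds through elongation, and ends at the terminator. A promoter is the specific segment of a gene sequence that RNA polymerase recognizes to initiate transcription; it controls transcription initiation and determines the strength of expression. A terminator is the specific sequence responsible for transcription termination; it provides the signal that triggers the transcription machinery to release the newly synthesized mRNA (or RNA) and stop transcription.
Translation is the synthesis of protein using mRNA as the template. A mature mRNA has three parts: the 5'UTR, the ORF and the 3'UTR. The translation initiation complex scans the 5'UTR in the 5'-to-3' direction until it meets the start codon AUG; from that position the ribosome moves along the mRNA from 5' to 3', synthesizing protein from the N-terminus to the C-terminus. When the ribosome reaches a stop codon (UAA, UAG or UGA), protein synthesis terminates and the protein is released from the ribosome.
In practice, methods and/or tools that can predictably regulate the expression of any gene of interest would benefit biological research and numerous biotechnological applications.
Gene targeting is widely used to disrupt or enhance the activity of genes in the chromosomal genome. It is a method of integrating exogenous DNA into the genome that results in the target gene being modified, replaced or duplicated. Gene targeting is a process common to all living organisms and can be applied to any gene, regardless of its transcription and translation.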
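Gene targeting is usually mediated by the repair of DNA double-strand breaks (DSBs). Repair proceeds by two distinct molecular mechanisms: the homologous recombination (HR) pathway and the non-homologous end joining (NHEJ) pathway. In HR gene targeting, an exogenous DNA fragment (generally a selectable marker gene) is integrated precisely at the corresponding genomic sequence through the homologous sequences at its two ends. In the NHEJ pathway, by contrast, the exogenous DNA fragment integrates randomly at non-homologous chromosomal loci (Paques and Haber 1999, Microbiology and Molecular Biology Reviews, 63:349–404). The efficiency of site-specific gene targeting is therefore determined by the relative strength of the HR and NHEJ pathways.
Although the NHEJ pathway is considered the main cause of low gene targeting efficiency, the efficiency of HR gene targeting can also be gene-dependent in strains of the same genetic background. The molecular mechanism of this gene dependence is not well understood. One possible explanation is that each chromosomal genome has hotspot regions that are prone to homologous recombination (Wahls et al., PLoS One 3:e2887).
HR gene targeting efficiency also differs greatly between biological systems. The conventional yeasts, the budding yeast Saccharomyces cerevisiae and the fission yeast Schizosaccharomyces pombe, have very efficient HR gene targeting systems. In the methylotrophic yeast Pichia pastoris and other "non-conventional" yeasts, such as Hansenula polymorpha, Yarrowia lipolytica, Pichia stipitis and Kluyveromyces lactis, however, HR gene targeting is extremely inefficient. Most organisms, including fungi and other eukaryotes, likewise have extremely low HR gene targeting efficiency (Klinner U, et al. (2004) FEMS Microbiology Reviews 28:201-223; Cregg JM (2010) Pichia Protocols, Second edition. Totowa, New Jersey: Humana Press).
Although gene targeting efficiency depends on competition between the HR and NHEJ pathways, and on the genomic gene and the biological system, little is known about HR gene targeting efficiency at different positions within a gene's coding and regulatory regions, particularly in genes that are difficult to disrupt.
The field therefore urgently needs a gene targeting method that improves targeting efficiency, especially for genes that are difficult to target effectively by conventional methods.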
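Summary of the Invention
The object of the invention is to provide a gene targeting method that improves gene targeting efficiency, especially for genes that are difficult to target effectively by conventional methods, together with the materials used in that method.
In a first aspect, the invention provides a nucleotide construct for regulating a gene, with the structure shown in the following formula:
5'-A-B-C-3'
where A is a 5' homologous sequence, B is an interfering gene, and C is a 3' homologous sequence;
the 5' and 3' homologous sequences place the recombination site of the nucleotide construct between the first nucleotide of the start codon of the gene to be regulated and 110, preferably 50, nucleotides upstream of that first nucleotide; or the 5' and 3' homologous sequences place the recombination site between 100, 50 or 20 nucleotides, preferably 50 nucleotides, upstream of the first nucleotide of the stop codon of the gene to be regulated and 300 nucleotides downstream of that first nucleotide.
In a specific embodiment, the recombination sites are separated by 0-20 nucleotides, preferably 0-5 nucleotides, most preferably 0 nucleotides.
In a preferred embodiment, there may be more than one interfering gene, and the interfering genes may be the same or different.
In a preferred embodiment, the interfering gene may be a marker gene.
In a preferred embodiment, the nucleotide construct may be circular or linear.
In a specific embodiment, the gene to be regulated may be a gene with low recombination efficiency, preferably a recombination efficiency <3%, more preferably <1%.
In a specific embodiment, the gene to be regulated is the OCH1 or ADE1 gene.
In a preferred embodiment, the homologous sequences are 400-1200 bp (base pairs), 500-1000 bp or 600-800 bp long.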
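The recombination-site windows defined for the first aspect (start codon to 110 nucleotides upstream, or 100 nucleotides upstream to 300 nucleotides downstream of the stop codon) can be expressed as a simple range check. The following Python sketch is illustrative only: the function names are hypothetical, distances are measured from the first nucleotide of the start or stop codon, and treating both boundaries as inclusive is an assumption, since the text does not say.

```python
def near_start_codon(nt_upstream, max_upstream=110):
    """Check the start-codon window.

    nt_upstream: distance in nucleotides upstream of the first nucleotide
    of the start codon (0 = at the start codon itself).
    max_upstream: 110 in the broad definition, 50 in the preferred one.
    """
    return 0 <= nt_upstream <= max_upstream


def near_stop_codon(offset, max_upstream=100, max_downstream=300):
    """Check the stop-codon window.

    offset: signed distance from the first nucleotide of the stop codon;
    negative values are upstream (inside the ORF), positive downstream.
    max_upstream: 100, 50 or 20 depending on the embodiment.
    """
    return -max_upstream <= offset <= max_downstream


if __name__ == "__main__":
    # A site 109 nt upstream of the start codon, broad vs. preferred window.
    print(near_start_codon(109), near_start_codon(109, max_upstream=50))
    # A site 203 nt downstream of the stop codon.
    print(near_stop_codon(203))
```

Passing `max_upstream=50` to either function gives the preferred, narrower windows.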
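In a second aspect, the invention provides a host cell comprising the nucleotide construct of the first aspect of the invention.
In a specific embodiment, the host cell is a yeast cell.
In a preferred embodiment, the yeast is Saccharomyces cerevisiae, Schizosaccharomyces pombe, Pichia pastoris, Hansenula polymorpha, Yarrowia lipolytica, Pichia stipitis or Kluyveromyces lactis.
In a preferred embodiment, the yeast is Pichia pastoris, Hansenula polymorpha, Yarrowia lipolytica, Pichia stipitis or Kluyveromyces lactis.
In a third aspect, the invention provides a method of regulating gene expression, the method comprising:
a) constructing the nucleotide construct of the first aspect of the invention; and
b) introducing the nucleotide construct of step a) into a cell, so that it integrates into the gene to be regulated by homologous recombination.
In a specific embodiment, the gene to be regulated may be a gene with low recombination efficiency, preferably a recombination efficiency <3%, more preferably <1%.
In a preferred embodiment, the gene to be regulated is the OCH1 or ADE1 gene.
In a preferred embodiment, the method may further comprise step c), detecting expression of the gene to be regulated in the cells obtained in step b).
In a fourth aspect, the invention provides a method of modifying a strain, comprising:
a) constructing the nucleotide construct of the first aspect of the invention; and
b) transforming the nucleotide construct of step a) into the strain to be modified.
In a preferred embodiment, the method may further comprise step c), screening for the modified strain.
In a fifth aspect, the invention provides a use of a strain modified by the method of the fourth aspect: the strain is used to produce recombinant proteins or metabolites, or in biocatalysis.
In a preferred embodiment, integration by homologous recombination upstream of the yeast OCH1 coding region suppresses expression of the OCH1 gene, so that the glycosylation pattern of the recombinant protein is altered.
In a preferred embodiment, use for producing metabolites means: integration by homologous recombination upstream of the yeast LPD1 coding region suppresses LPD1 expression and its competing metabolic pathway, which can increase the yield of isobutanol; integration by homologous recombination upstream of the yeast PDC1 coding region suppresses PDC1 expression and alters the yeast alcoholic fermentation pathway, which helps produce L-lactic acid efficiently.
In a preferred embodiment, use in biocatalysis means: integration by homologous recombination upstream of the yeast ARO8 coding region suppresses ARO8 expression, which can enhance the biocatalytic capacity of the yeast and increase the efficiency with which glucose is converted to phenylethanol.
It should be understood that, within the scope of the invention, the technical features described above and those described specifically below (for example in the examples) can be combined with one another to form new or preferred technical solutions. For brevity, they are not enumerated one by one here.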
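Brief Description of the Drawings
Figure 1 is a typical gene map, comprising the 5' regulatory region (5' region), the open reading frame (ORF) and the 3' regulatory region (3' region). Promoters and enhancers determine which part of the gene is transcribed into messenger RNA (mRNA). The 5' and 3'UTRs regulate translation of the mRNA into protein. Nucleotides in the 5' and 3' regions are numbered as follows: the start codon of the coding region is nucleotides 1-3, and the 5' upstream region is numbered with a minus sign; the stop codon is nucleotides +1 to +3, and the 3' downstream region is numbered with a plus sign. Vector components are not drawn to scale.
Figure 2 is a schematic of the construction of the pUO vector. Vector components are not drawn to scale.
Figure 3 is a schematic of the construction of the pUAH(1) targeting vector for integration into the OCH1 gene of Pichia pastoris. Vector components are not drawn to scale.
Figure 4 is a schematic of integration at different positions in the OCH1 gene of Pichia pastoris (SEQ ID NO: 126). Vector components are not drawn to scale. A. Integration positions of the targeting cassette in the P. pastoris OCH1 locus. The integration positions are indicated by arrows and labeled with nucleotide numbers. Gene components in the vector are not drawn to scale. B. Integration of the targeting cassette into the P. pastoris OCH1 gene by double-crossover homologous recombination (knock-in). Crosses indicate homologous recombination. C. PCR verification of integration at different positions in the OCH1 gene. M, DNA size marker; lane 1, primer pair P1/P4, wild-type OCH1 in JC301 gives 1433 bp; lanes 2, 3, 4, primer pair P1/P4, gene integration at the three positions (-1/1), (1212/+1) and (+3/+4) gives no band because the sequence (4900 bp) is too long to amplify; lane 5, primer pair P1/P2, wild-type OCH1 in JC301 gives no band; lanes 6, 7, 8, primer pair P1/P2, gene integration at the three positions (-1/1), (1212/+1) and (+3/+4) gives bands of 1300, 2550 and 2550 bp, respectively; lane 9, primer pair P3/P4, wild-type OCH1 in JC301 gives no band; lanes 10, 11, 12, primer pair P3/P4, gene integration at the three positions (-1/1), (1212/+1) and (+3/+4) gives bands of 3500, 2500 and 2300 bp, respectively.
Figure 5 is a schematic of integration at different positions in the ADE1 gene of the Pichia pastoris genome (the DNA sequence of P. pastoris PR-aminoimidazolesuccinocarboxamide synthase (ADE1) is shown in SEQ ID NO: 127). Vector components are not drawn to scale. A. Integration positions of the targeting cassette in the P. pastoris ADE1 gene. The integration positions are indicated by arrows and labeled with nucleotide numbers. B. Integration of the targeting cassette into the P. pastoris ADE1 locus by double-crossover homologous recombination (knock-in). Crosses indicate homologous recombination. C. PCR verification of integration at different positions in the ADE1 gene. M, DNA size marker; lanes 1-13, genomic DNA from 13 randomly selected colonies verified by PCR with primer pair P5/P6. Wild-type ADE1 gives a 2398 bp band, and gene integration at position (912/+1) gives a 3763 bp band.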
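Figure 6 is a schematic of the construction of the 5'AOX1-induced lacZ expression vector p5'AOX1-URA3-lacZ, in which URA3 lies between 5'AOX1 and the lacZ ORF. Vector components are not drawn to scale.
Figure 7 is a schematic of the construction of a series of 5'AOX1-induced lacZ and lacZns expression vectors, p5'AOX1-lacZ-URA3, p5'AOX1-lacZ-URA3(-), p5'AOX1-lacZns-URA3 and p5'AOX1-lacZns-URA3(-), in which URA3 lies downstream of lacZ or lacZns in either of two orientations. Vector components are not drawn to scale.
Figure 8 is a schematic of the construction of the 5'OCH1-driven lacZ and lacZns expression vectors p5'OCH1-lacZ and p5'OCH1-lacZns. Vector components are not drawn to scale.
Figure 9 is a schematic of the construction of the 5'OCH1-driven lacZ expression vector p5'OCH1-URA3-lacZ, in which URA3 lies between 5'OCH1 and the lacZ ORF. Vector components are not drawn to scale.
Figure 10 is a schematic of the construction of the 5'OCH1-driven lacZ expression vectors p5'OCH1-lacZ-URA3 and p5'OCH1-lacZ-URA3(-), in which URA3 lies downstream of lacZ in either of two orientations. Vector components are not drawn to scale.
Figure 11 is a schematic of the construction of the 5'OCH1-driven lacZns expression vectors p5'OCH1-lacZns-URA3 and p5'OCH1-lacZns-URA3(-), in which URA3 lies downstream of lacZns in either of two orientations. Vector components are not drawn to scale.
Figure 12 shows the relative expression of lacZ mRNA (%). A. Relative expression (%) of 5'AOX1-induced lacZ mRNA in the presence of URA3 adjacent to the start or stop codon. 100% corresponds to 5'AOX1-induced lacZ mRNA expression without URA3 integration (p5'AOX1-lacZ). B. Relative expression (%) of 5'OCH1-driven lacZ mRNA in the presence of URA3 adjacent to the start or stop codon. 100% corresponds to 5'OCH1-driven lacZ mRNA expression without URA3 integration (p5'OCH1-lacZ). Data are shown as the mean ± standard deviation (s.d.) of 3 experiments.
Figure 13 shows the relative intracellular activity of β-galactosidase (%). A. 5'AOX1-induced β-galactosidase activity in the presence of URA3 adjacent to the start or stop codon. 100% corresponds to 5'AOX1-induced β-galactosidase activity without URA3 integration (p5'AOX1-lacZ). B. 5'OCH1-driven β-galactosidase activity in the presence of URA3 adjacent to the start or stop codon. 100% corresponds to 5'OCH1-driven β-galactosidase activity without URA3 integration (p5'OCH1-lacZ). Data are shown as the mean ± standard deviation (s.d.) of 3 experiments.
Figure 14 shows the relative expression of OCH1 mRNA (%) in strains carrying a gene integration adjacent to the start or stop codon. 100% corresponds to mRNA expression in the parental JC307 strain without gene integration. Data are shown as the mean ± standard deviation (s.d.) of 3 experiments.
Figure 15 shows positive-ion MALDI-TOF mass spectra of the N-glycans released from mIL-22. A. Mass spectrum of N-glycans released from mIL-22 expressed in the GS115 strain. B. Mass spectrum of N-glycans released from mIL-22 expressed in strain och1(-1/+1, ADE1URA3), which carries a gene integration upstream of the OCH1 coding region.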
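Figures 12 and 14 report mRNA levels as a percentage of a control strain, as the mean ± s.d. of three experiments, with GAPDH primers run alongside the target primers. The text does not say how the percentages were calculated. One common approach for real-time PCR data is the 2^(-ΔΔCt) method, sketched below in Python on the assumption that GAPDH serves as the reference gene and that amplification efficiency is 100%. The Ct values in the usage example are hypothetical.

```python
from statistics import mean, stdev


def relative_expression_percent(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression (% of control) by the 2^(-ddCt) method.

    ct_target / ct_ref: Ct of the target gene and the reference gene
    (assumed here to be GAPDH) in the sample.
    ct_target_ctrl / ct_ref_ctrl: the same two values in the control strain.
    Assumes 100% amplification efficiency for both primer pairs.
    """
    delta_sample = ct_target - ct_ref
    delta_control = ct_target_ctrl - ct_ref_ctrl
    return 100.0 * 2.0 ** (-(delta_sample - delta_control))


def summarize(values):
    """Mean and sample standard deviation of replicate measurements."""
    return mean(values), stdev(values)


if __name__ == "__main__":
    # Hypothetical Ct values for three replicate experiments.
    replicates = [
        relative_expression_percent(25.0, 18.0, 22.0, 18.0),
        relative_expression_percent(25.2, 18.1, 22.1, 18.0),
        relative_expression_percent(24.9, 18.0, 22.0, 18.1),
    ]
    avg, sd = summarize(replicates)
    print(f"{avg:.1f}% ± {sd:.1f}%")
```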
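Detailed Description
After extensive and in-depth research, the inventors unexpectedly discovered gene-independent regions of efficient gene targeting by homologous recombination, and on that basis developed gene targeting systems for gene expression regulation and gene disruption; the method of the invention can regulate or modify any gene of an organism. The invention was completed on this basis.
The invention systematically analyzes gene targeting at different positions within a gene and identifies the regions of a gene where homologous recombination integrates efficiently. Gene targeting systems were also developed for gene expression regulation and gene disruption. The invention uses the homologous recombination process endogenous to the cells of all organisms, so the method can regulate or modify any gene of an organism. In the biotechnology industry and in biological research, the method can be widely used to regulate gene expression, improve cell function and produce heterologous proteins.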
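The following terms are used herein according to the definitions below.
"Gene" is used broadly to refer to any nucleic acid segment associated with a biological function. "Peptide", "polypeptide" and "protein" are used interchangeably herein to refer to a polymeric form of amino acids of any length. "Gene targeting" is a method of integrating exogenous DNA at a genomic gene, usually resulting in modification, replacement or duplication at the target gene. It is a mechanism common to all living organisms. "Cell" or "organism" is the term for the organism in which the gene targeting of the invention is carried out. "Cell transformation" means that exogenous DNA is introduced into a cell, usually by integration of the exogenous DNA into chromosomal DNA or by introduction of a self-replicating plasmid. "Target gene" or "target site" refers to the gene or DNA segment that is altered by the gene targeting method of the invention. The target gene may be an endogenous gene or an exogenous DNA segment previously introduced into the organism. The target gene may be endogenous genomic DNA of the organism or any part of a gene, including but not limited to a polypeptide coding region, open reading frame, regulatory region, intron, exon, or a part thereof.
A "marker" is a gene sequence whose presence or absence gives the organism a detectable phenotype. Types of marker include, but are not limited to, selection markers, screening markers and molecular markers. A selection marker is usually a gene whose expression gives the organism a phenotype of resistance or sensitivity to a particular set of conditions. A screening marker confers a phenotype that is an observable and distinguishable trait. A molecular marker is a gene sequence feature that can be identified by DNA analysis.
A gene includes a "coding sequence", "coding region" or "open reading frame (ORF)" that encodes a specific protein or functional RNA. A protein-coding sequence is a nucleic acid sequence that is transcribed into messenger RNA (mRNA), which is in turn translated into protein. Its boundaries are defined by the start codon at the 5' end (N-terminus) and the translation-terminating nonsense codon at the 3' end (C-terminus).
A gene also includes "regulatory regions" or "regulatory elements" before and after the coding sequence. These include, but are not limited to, promoters, enhancers, introns, polyadenylation signals, the 5' untranslated region (5'UTR), the 3' untranslated region (3'UTR) and any derivatives thereof. Some regulatory regions, such as the 5'UTR and 3'UTR, are transcribed as part of the RNA molecule. The term "5' untranslated region (5'UTR)" means the nucleotide sequence in mature mRNA immediately upstream of any coding sequence that is not translated into protein. The term "3' untranslated region (3'UTR)" means the nucleotide sequence in mature mRNA immediately downstream of any coding sequence that is not translated into protein; it extends from the first nucleotide after the stop codon of any coding sequence to just before the poly(A) tail of the mRNA. These regulatory elements can regulate every aspect of gene expression, including but not limited to transcription (e.g., initiation, elongation and/or termination), translation (initiation, elongation and/or termination) and RNA stability.
A "promoter" is a nucleic acid regulatory region that can bind RNA polymerase and initiate transcription of a downstream (3' direction) coding sequence. The promoter sequence is bounded at its 3' end by the transcription start site and extends upstream (5' direction) to include the minimum number of bases or elements needed to initiate transcription. Within the promoter sequence are found the transcription start site and the protein-binding domains responsible for binding RNA polymerase. Eukaryotic promoters often (but not always) contain "TATA" and "CAT" boxes, whereas prokaryotic promoters often contain the consensus sequence TATAAT. Many promoters are constitutive, being active under all conditions in the cell, but some are inducible, their activity being regulated in response to specific stimuli.
A "terminator" is a segment of nucleic acid sequence that, during transcription, provides the signal that triggers the transcription machinery to release the newly synthesized mRNA (or RNA) and stop transcription. In prokaryotic transcription, two classes of terminator, Rho-dependent and Rho-independent sequences, trigger termination. In eukaryotic mRNA transcription, the transcription machinery recognizes the terminator signal and triggers the termination process that releases the mRNA; a poly(A) sequence is then added to the 3' end of the mRNA by polyadenylation.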
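Regulation of transcription falls into two main classes according to its mode of action: the first is inhibition of DNA template function, in which a repressor molecule binds DNA and alters template function; the second is inhibition of RNA polymerase, in which a repressor molecule binds RNA polymerase and inhibits its activity (Sandhya Payankaulam, Li M. Li, and David N. Arnosti (2010) Transcriptional repression: conserved and evolved features. Curr Biol. 14;20(17):R764–R771). Neither class of control, however, can serve as a general method for specifically regulating the expression of any target gene.
"Translation" is the synthesis of protein using mRNA as the template. A mature mRNA has three parts: the 5'UTR, the ORF and the 3'UTR. The translation initiation complex scans the 5'UTR in the 5'-to-3' direction until it meets the start codon AUG; from that position the ribosome moves along the mRNA from 5' to 3', synthesizing protein from the N-terminus to the C-terminus. When the ribosome reaches a stop codon (UAA, UAG or UGA), protein synthesis terminates and the protein is released from the ribosome.
Translation can be controlled by RNA-binding proteins (RBPs) and small RNAs, which regulate translation by binding to mRNA. RBPs usually bind specific elements in the 5' or 3'UTR to activate or repress translation. Elements within the 5'UTR, however, lie in the path of the scanning/translating ribosome, which can displace regulators before they act. A survey of the degradation and translation rates of all mRNAs in the cell found that the 3'UTR element with the greatest effect on both is the microRNA complementary site, which reduces mRNA stability and translation by 32% and 4%, respectively. For most mRNAs, though, the 3'UTR has limited influence on stability and translational efficiency (Noah Spies, Christopher B. Burge, and David P. Bartel (2013), 3'UTR-isoform choice has limited influence on the stability and translational efficiency of most mRNAs in mouse fibroblasts, Genome Research, 23:2078–2090). Regulating protein translation through RBPs and small RNAs bound to mRNA is therefore unlikely to be an effective method.
Cells have surveillance systems that recognize and eliminate aberrant mRNAs to avoid producing potentially harmful protein products. For example, cells can recognize aberrant mRNAs that lack a stop codon (nonstop mRNAs) and form the Ski complex at the 3' end to mediate their degradation. This nonstop mRNA decay avoids the production of potentially harmful extended products that could have dominant-negative activity relative to the wild-type gene product (van Hoof A, Frischmeyer PA, Dietz HC, Parker R (2002) Exosome mediated recognition and degradation of mRNAs lacking a termination codon. Science 295:2262–2264).
In practice, effective methods for regulating the expression of genomic genes are currently lacking. Any method and/or tool that can predictably and specifically regulate the expression of a gene of interest in the genome would benefit biological research and biotechnology.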
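Gene targeting is a method of integrating exogenous DNA into the chromosomal genome so that the gene at the target locus is modified, replaced or duplicated; it is widely used to disrupt gene activity.
"Ends-in" and "ends-out" refer to two different arrangements for integrating exogenous DNA into the genome by homologous recombination. In gene targeting by ends-in recombination, the ends of the linear exogenous DNA point toward each other when the DNA pairs with its homologous region in the genome, and the DNA is integrated by single crossover recombination ("roll-in"), producing a direct repeat of the target gene. The duplicated target gene can, however, excise the exogenous DNA again by homologous recombination between the repeats, restoring the original wild-type state. In gene targeting by ends-out recombination, the ends of the linear exogenous DNA point away from each other when the DNA pairs with its homologous region, and the DNA is inserted by double crossover recombination between the terminal targeting flanks and the homologous host genomic sequences. Ends-out targeting is commonly used in mice and yeast because it can directly replace or delete the target gene. Ends-out events, however, occur far less often than ends-in events (Paques and Haber 1999, Microbiology and Molecular Biology Reviews, 63:349-404). In the present invention, gene targeting means ends-out double-crossover homologous recombination unless ends-in targeting by single-crossover homologous recombination (roll-in) is specifically indicated.
Gene targeting is a process common to all living organisms and can be applied to any gene, regardless of its transcriptional and translational activity. The technique is, however, subject to two limitations: low efficiency of homologous recombination and a high rate of random (non-targeted) integration.
Gene targeting occurs by two distinct molecular mechanisms: the homologous recombination (HR) pathway and the non-homologous end joining (NHEJ) pathway. Both recombination pathways are usually mediated by the repair of DNA double-strand breaks (DSBs). In HR gene targeting, an exogenous DNA fragment, usually a selectable marker gene with homologous sequences at both ends, is integrated precisely at the corresponding sequence of its homologous genome. In the NHEJ pathway, by contrast, the exogenous DNA fragment containing the selectable marker gene integrates randomly at non-homologous chromosomal loci. The efficiency of site-specific gene targeting is usually determined by the relative strength of the HR and NHEJ pathways.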
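HR gene targeting efficiency also differs greatly between biological systems. The conventional yeasts, the budding yeast Saccharomyces cerevisiae and the fission yeast Schizosaccharomyces pombe, have very efficient HR gene targeting systems. In S. cerevisiae, with targeting fragments of 30 to 45 bp (base pairs), the efficiency of gene replacement by HR can reach 95% of transformants (Paques and Haber 1999, Microbiology and Molecular Biology Reviews, 63:349-404). Most organisms, however, have very low HR gene targeting efficiency. In the methylotrophic yeast Pichia pastoris and other "non-conventional" yeasts, including Hansenula polymorpha, Yarrowia lipolytica, Pichia stipitis and Kluyveromyces lactis, gene targeting is extremely inefficient. The efficiency of gene replacement depends strongly on the length of the homologous sequences in the targeting fragment. With targeting homologies shorter than 500 bp, efficiency may be below 0.1%, whereas with large targeting homologies of 1 kb it can exceed 50% for some target genes (Klinner U et al., (2004) FEMS Microbiology Reviews 28:201-223; Cregg JM (2010) Pichia Protocols, Second edition. Totowa, New Jersey: Humana Press). Most organisms, including fungi and other eukaryotes, also have low HR gene targeting efficiency.
Moreover, the efficiency of homologous gene targeting is gene-dependent even in strains of the same genetic background. For example, with targeting homologies of 200-900 bp, ARG1, ARG2, ARG3, HIS1, HIS2, HIS5 and HIS6 in P. pastoris strain GS115 undergo homologous recombination at high efficiencies of 44-90% (Nett et al., (2005) Yeast 22:295–304). But when homologous sequences of about 1 kb or longer are used to knock out the OCH1 and SGS1 genes in the P. pastoris genome, efficiency is below 1% (Choi et al., (2003) Proc Natl Acad Sci U S A 100:5022–5027; Chen et al., (2013) PLoS ONE 8(3):e57952). The molecular mechanism of this gene dependence is not well understood. One possible explanation is that each chromosome has hotspot regions that are prone to homologous recombination (Wahls et al., PLoS One 3:e2887).
Although gene targeting efficiency depends on competition between the HR and NHEJ pathways, and on the genomic gene and the biological system, little is known about HR gene targeting efficiency at different positions within a gene's coding and regulatory regions, particularly in genes that are difficult to disrupt.
The invention uses OCH1, a gene that is targeted inefficiently, and ADE1, a gene that is commonly targeted, as models to analyze systematically homologous gene targeting at various positions within genomic genes. This study helps apply the gene targeting technology to gene expression regulation and gene disruption.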
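The invention developed targeting vectors comprising marker gene, homologous sequence and replication origin parts. These parts can be joined to form a circular vector. If desired, the circular vector may contain other parts and linkers between the parts. The invention is, however, also intended to include other, functionally equivalent forms of targeting vector. A targeting vector may also be called a vector. Vectors used in recombinant DNA technology are usually in the form of "plasmids". In this specification, the terms "vector" and "plasmid" are used interchangeably.
A "marker" is a gene or sequence whose presence or absence gives the organism a detectable phenotype. One or more markers can be used to select and screen for gene targeting events. Types of marker that can be used in the invention include, but are not limited to, selection markers, screening markers and molecular markers.
Expression of a selection marker gene gives the organism a phenotype of resistance or sensitivity to a particular set of conditions. Selection markers include genes conferring resistance to antibiotics such as kanamycin, hygromycin, zeocin, bleomycin, spectinomycin, streptomycin and gentamicin.
A selectable marker system consists of an auxotrophic mutant host strain and a wild-type gene that complements the host's defect on incomplete medium, for example the HIS4, LEU2, URA3, ADE1, LYS2 and TRP1 genes in yeast, and other genes known in the art. For example, the S. cerevisiae or P. pastoris HIS4 gene can be used to transform his4 P. pastoris strains.
A screening marker confers an observable and distinguishable trait. Screenable markers include fluorescent proteins such as green fluorescent protein (GFP), and reporter enzymes such as β-galactosidase (lacZ), alkaline phosphatase (AP), β-lactamase, β-glucuronidase, glutathione S-transferase (GST), luciferase and other enzymes known in the art.
A molecular marker is a gene sequence feature that can be identified by DNA analysis.
The marker gene is flanked by two homologous recombination regions. The one on the upstream side is homologous to the upstream region of the target gene, and the one on the downstream side is homologous to the downstream region of the target gene. Between the upstream and downstream homologous recombination regions, one or more marker genes may be linked to one another in the same or opposite orientation.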
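The homologous recombination regions place the recombination site between the first nucleotide of the start codon of the gene to be regulated and 110, preferably 50, nucleotides upstream of that first nucleotide; or the 5' and 3' homologous sequences place the recombination site of the nucleotide construct between 100, 50 or 20 nucleotides, preferably 50 nucleotides, upstream of the first nucleotide of the stop codon of the gene to be regulated and 300 nucleotides downstream of that first nucleotide.
Herein, a region being homologous to a corresponding gene region means that its base sequence is at least 90%, preferably at least 92%, more preferably at least 94%, still more preferably at least 96%, still more preferably at least 98%, still more preferably at least 99%, and most preferably 100% identical to that gene region. Preferably, such a "homologous region" is derived from the gene region in question.
The length of the homologous recombination region is not particularly limited. It is preferably a length suitable for homologous recombination to occur. The region is therefore at least 40 bp long.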
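The two criteria stated for a homologous region (at least 90% identical to the corresponding gene region, and at least 40 bp long) can be checked with a short Python sketch. The functions below are hypothetical and assume the two sequences are already aligned and of equal length; how identity is measured when gaps are present is not specified in the text.

```python
def percent_identity(region, reference):
    """Percentage of identical bases between two pre-aligned,
    equal-length sequences (case-insensitive)."""
    if len(region) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    if not region:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(region.upper(), reference.upper()))
    return 100.0 * matches / len(region)


def is_usable_homology_region(region, reference, min_identity=90.0, min_length=40):
    """Apply the identity (>= 90% by default) and length (>= 40 bp) criteria."""
    return (len(region) >= min_length
            and percent_identity(region, reference) >= min_identity)


if __name__ == "__main__":
    reference = "ACGT" * 10                  # 40-nt toy reference
    region = "TCGT" * 4 + "ACGT" * 6         # same length, 4 mismatches
    print(percent_identity(region, reference))
    print(is_usable_homology_region(region, reference))
```

Raising `min_identity` to 92, 94, 96, 98, 99 or 100 corresponds to the progressively preferred thresholds.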
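When the vector of the invention is to be propagated in bacterial cells, it preferably contains a bacterial origin of replication and an antibiotic resistance gene to ensure that the vector is retained during bacterial passage. Bacterial origins of replication (ori) include f1-ori, colisin, ColE1 and other origins known in the art. Antibiotic resistance genes include ampicillin, kanamycin, tetracycline and Zeocin resistance genes and others known in the art. The origin of replication and the antibiotic resistance gene may be placed between different parts.
The invention provides a linear "targeting cassette", which can be linearized from the targeting vector by restriction enzyme digestion or obtained by chemical gene synthesis. For brevity, the targeting cassette may also be called a "targeting fragment", "gene disruption fragment" or "gene integration fragment" herein. The targeting cassette is used to disrupt the target gene and to integrate an exogenous gene into the host's chromosomal genome so that the exogenous gene can function in the host.
The essential parts of the targeting cassette are the marker gene and the homologous regions. The cassette may contain other parts and, if desired, linkers between parts. The marker gene is flanked on its upstream and downstream sides by the homologous regions.
The targeting cassette or vector is introduced into a host cell for homologous recombination. Transformation and transfection of host cells can be carried out by methods well known to those skilled in the art.
Suitable transformation methods include viral infection, transfection, conjugation, protoplast fusion, electroporation, gene gun technology, calcium phosphate precipitation, direct microinjection and the like. The choice of method generally depends on the type of cell being transformed and the conditions under which transformation takes place. These methods are discussed generally in Ausubel et al., Short Protocols in Molecular Biology, 3rd ed., Wiley & Sons, 1995.
For example, yeast can be transformed by various methods, including the spheroplast method, electroporation, the polyethylene glycol method and the alkali cation method [Cregg JM (2010) Pichia Protocols, Second edition. Totowa, New Jersey: Humana Press].
Examples of host cells that can be used in the invention include typical eukaryotic and prokaryotic hosts, such as E. coli, Pseudomonas spp., Bacillus spp., Streptomyces spp., fungi, yeasts such as S. cerevisiae and P. pastoris, insect cells such as Spodoptera frugiperda (SF9), animal cells such as Chinese hamster ovary (CHO) cells and mouse cells, African green monkey cells, cultured human cells and plant cells. Yeast is the preferred host cell of the invention. P. pastoris is the more preferred host cell.
Transformed cells can then be selected on the basis of the phenotype of the selectable marker.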
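The invention evaluated site-specific integration by homologous recombination in the 5' regulatory region, the coding region and the 3' regulatory region, and identified the regions where genes integrate with high efficiency. Previous reports indicated that, with homologous regions of 1 kb or more, HR gene targeting efficiency depends on the genomic gene. In contrast, the invention found that, with homologous regions shorter than 1 kb, the efficiency of HR integration in the 5' and 3' regulatory regions is independent of the gene and markedly higher. In addition, integration at the 3' end of the coding region is more efficient than elsewhere in the coding region. The efficiency of targeted integration in the different regions of a genomic gene can be ranked as follows: 5' regulatory region and 3' regulatory region >> 3' end of the coding region > rest of the coding region.
The invention developed a method for precisely controlling target gene expression by gene integration in the 5' regulatory region. The integrated gene can be any marker gene that helps identify transformants carrying the integration, including selectable marker systems, screening markers and molecular markers. The marker gene ORF can be arranged in a defined position and orientation with various promoters, secretion signal sequences (if needed) and transcription terminators and fused into an expression cassette. The arrangement of these segments is known to those of ordinary skill in the art. These marker expression cassettes can be integrated at any position in the 5' regulatory region, on either strand of the genomic DNA duplex and in the same or opposite orientation relative to the target ORF, more preferably at a position in the 5' regulatory region close to the target ORF, and most preferably on the same strand and in the same orientation immediately upstream of the target ORF. Alternatively, a fusion of the marker gene ORF and a transcription terminator can be integrated in the 5' regulatory region on the same strand and in the same orientation close to the target ORF, most preferably immediately upstream of the target ORF, so that the 5' regulatory region of the target gene drives expression of the marker gene.
Gene integration in the 5' regulatory region, particularly immediately upstream of the target ORF, can suppress the efficiency of both transcription and translation and so effectively inhibit expression of a specific target gene. Current methods of regulating gene transcription mainly either alter template function through binding of a repressor molecule to DNA, or inhibit transcriptional activity through binding of a repressor molecule to RNA polymerase. Methods of regulating protein translation mainly use small RNAs and RNA-binding proteins that bind mRNA and change its translatability. By comparison, gene integration within genomic genes, particularly upstream of the ORF, is a more effective way to control target gene expression precisely.
In another aspect, expression of any target ORF can be up-regulated by introducing a strong promoter through integration of a targeting cassette at any position in the 5' region close to the start codon of the target ORF, more preferably upstream of the start codon, and most preferably 3-10 bases upstream of the start codon; the targeting cassette consists of a marker gene expression cassette fused at its downstream end to a strong promoter.
In another aspect, the invention developed a method of reducing expression of the target ORF by integrating a selection marker cassette in the 3' regulatory region, most preferably immediately downstream of the stop codon, on either strand of the genomic DNA duplex and in the same or opposite orientation.
In another aspect, the invention developed a method of reducing expression of the target ORF by integrating a selection marker cassette near the 3' end of the coding region, on either strand of the genomic DNA duplex and in the same or opposite orientation.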
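Table 1 below compares the efficiency of integration by homologous recombination at different positions in the OCH1 and ADE1 genes of P. pastoris. Positions are identified by the nucleotide numbering of the genomic gene, in which the start codon of the coding region is nucleotides 1-3 and the stop codon is nucleotides +1 to +3. Correct integrants are clones confirmed by PCR to carry the correct gene integration. Targeting efficiency is defined as the ratio of PCR-confirmed correct integrants to the total number of clones examined.
Table 1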
Figure PCTCN2016080788-appb-000001
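Using the gene targeting method of the invention, a skilled person can modify strains, specifically genes in a strain whose homologous recombination efficiency with conventional targeting methods is below 3%, preferably below 1%. As the table above shows, the targeting efficiency of homologous recombination integration at the specified integration positions with the method of the invention is far higher than with conventional targeting methods.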
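The targeting efficiency used in Table 1 is the number of PCR-confirmed correct integrants divided by the number of clones examined. The Python sketch below writes out that ratio, together with the "below 3%, preferably below 1%" criterion for genes that are hard to target by conventional methods. The function names are hypothetical, and the counts in the usage example are made up for illustration; they are not values from Table 1.

```python
def targeting_efficiency(correct_integrants, clones_examined):
    """Targeting efficiency in percent: PCR-confirmed correct integrants
    divided by the total number of clones examined."""
    if clones_examined <= 0:
        raise ValueError("clones_examined must be positive")
    if not 0 <= correct_integrants <= clones_examined:
        raise ValueError("correct_integrants must be between 0 and clones_examined")
    return 100.0 * correct_integrants / clones_examined


def is_hard_to_target(efficiency_percent, threshold=3.0):
    """True if the efficiency is below the threshold
    (3% by default; use 1.0 for the preferred criterion)."""
    return efficiency_percent < threshold


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    conventional = targeting_efficiency(1, 200)
    improved = targeting_efficiency(9, 10)
    print(conventional, is_hard_to_target(conventional))
    print(improved, is_hard_to_target(improved))
```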
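Further, the invention provides a method of modifying strains; strains so modified can be used to produce recombinant proteins. In a specific embodiment, the glycosylation pattern of the recombinant protein is altered. For example, disrupting the OCH1 gene alters the strain's protein glycosylation pathway; disrupting protease genes in the strain can reduce degradation of the recombinant protein; and so on. The gene targeting method of the invention can be applied to modify a strain's metabolic reactions so that it produces metabolites more efficiently. It can also be applied to alter enzyme activities in an organism so that the modified organism carries out biocatalytic reactions more efficiently. Strains modified by the method of the invention can further be used in metabolic engineering, genetic research, biotechnological applications and other fields.
Advantages of the invention:
1. The invention identifies gene-independent regions of efficient gene targeting by homologous recombination;
2. The method of the invention can regulate or modify any gene of an organism;
3. In the biotechnology industry and in biological research, the method of the invention can be widely used to regulate gene expression, improve cell function and produce heterologous proteins.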
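Examples
Materials
The chemical reagents, enzymes, media and solutions used for library generation, validation and application are in common use and well known to those skilled in molecular and cell biology; they are available from many companies, including Thermo Fisher Scientific, Invitrogen, Sigma, New England BioLabs, Takara Biotechnology, Toyobo, TransGen Biotech and Generay Biotechnology. Many are supplied as kits. The pPIC3.5K and pPICZ vectors were obtained from Invitrogen. The pBLHIS-SX, pBLURA-SX and pBLADE-SX vectors were obtained from Keck Graduate Institute, Claremont, USA. E. coli strain Trans1-T1 was obtained from TransGen Biotech. The P. pastoris auxotrophic strains JC301 (ade1 his4 ura3) and JC307 (his4 ura3) were obtained from Keck Graduate Institute (KGI), and GS115 (his4) from Invitrogen. Nucleotide sequence data were obtained mainly from the public database NCBI (www.ncbi.nlm.nih.gov).
Methods
Unless otherwise indicated, the methods used in the invention, including polymerase chain reaction (PCR), restriction enzyme cloning, DNA purification, bacterial and prokaryotic cell culture, transformation, transfection and Western blotting, were carried out by standard methods well known to those skilled in molecular and cell biology, as described for example in the following manuals: Sambrook J et al. (Molecular Cloning A Laboratory Manual (Third Edition), Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 2001), Ausubel F M et al. (Current Protocols in Molecular Biology, Wiley InterScience, 2010), and Cregg JM (Pichia Protocols, (Second Edition), Totowa, New Jersey: Humana Press, 2010).
E. coli strain Trans1-T1 was used to construct and amplify plasmids. The strain was grown in Luria-Bertani (LB) medium (10 g/L tryptone, 5 g/L yeast extract and 5 g/L sodium chloride) or on LB plates (15 g/L agar) containing the appropriate antibiotic. Antibiotics were added at the following concentrations: 100 mg/L ampicillin, 50 mg/L kanamycin and 25 mg/L Zeocin.
P. pastoris strains were grown in YPD medium (10 g/L yeast extract, 20 g/L peptone, 20 g/L glucose) and on YPD plates (10 g/L yeast extract, 20 g/L peptone, 20 g/L glucose, 15 g/L agar). P. pastoris auxotrophic strains were selected using YNB medium without amino acids (67 g/L yeast nitrogen base, 5 g/L glucose) and YNB plates without amino acids (67 g/L yeast nitrogen base, 5 g/L glucose, 15 g/L agar), supplemented as needed (antibiotics). Some P. pastoris auxotrophic strains were selected using SC medium (8 g/L SC without histidine and uracil, 20 g/L glucose) and SC plates (8 g/L SC without histidine and uracil, 20 g/L glucose, 15 g/L agar), supplemented as needed (antibiotics). Antibiotics were added at the following concentrations: 500 mg/L G-418 sulfate and 100 mg/L Zeocin.
Genomic DNA was extracted from P. pastoris by lithium acetate-SDS lysis followed by ethanol precipitation, as described in Looke et al. 2011, Biotechniques 50:325–328.
P. pastoris was transformed by electroporation using a MicroPulser™ electroporation apparatus according to the manufacturer's (BioRad) operating instructions.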
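Example 1
Construction of the base vector
Figure 2 is a schematic of the construction of the pUO vector.
PCR1: using genomic DNA as the template, the primer pair KpnIOch1(+54)F (SEQ ID NO: 1; carries a Kpn I restriction site) and Och1(+801)BamHI R (SEQ ID NO: 2; carries a BamH I restriction site) was used to amplify the P. pastoris OCH1 3' sequence (3'H).
PCR2: using the pBlunt-URA3SK vector as the template, the primer pair XhoIURA3F (SEQ ID NO: 3; carries an Xho I restriction site) and DRKpnI R (SEQ ID NO: 4; carries a Kpn I restriction site) was used to amplify the P. pastoris URA3 expression cassette together with the SacI-KpnI fragment. The pBlunt-URA3SK vector was obtained by ligating the PCR fusion fragment of URA3 and the SacI-KpnI fragment into the pBlunt vector (TransGen Biotech, China).
The OCH1 3'H PCR product was then digested with Kpn I and BamH I, and the URA3 expression cassette with Xho I and Kpn I. The KpnI-BamHI fragment of OCH1 3'H and the XhoI-KpnI fragment of the URA3 expression cassette were inserted into the Xho I and BamH I sites of the pBlunt-XB vector to generate the pUO3H vector. pBlunt-XB was obtained by ligating a fragment containing XhoI and BamHI sites into the pBlunt vector (TransGen Biotech).
PCR3: using genomic DNA as the template, the primer pair SphIOch1(274)F (SEQ ID NO: 5; carries an Sph I restriction site) and Och1(+53)XhoI R (SEQ ID NO: 6; carries an Xho I restriction site) was used to amplify the P. pastoris OCH1 5' sequence (5'H).
The OCH1 5'H PCR product was then digested with Sph I and Xho I, and the pUO3H vector with Xho I and BamH I. The SphI-XhoI fragment of OCH1 5'H and the XhoI-BamHI fragment carrying the URA3 expression cassette joined to OCH1 3'H were inserted into the BamH I and Sph I sites of the pUC19-EBSH vector. The pUC19-EBSH vector was obtained by replacing the EcoRI-HindIII fragment of pUC19, which contains the multiple cloning site, with a fragment containing EcoR I, BamH I, Sph I and Hind III restriction sites. The resulting pUO vector served as the base vector for constructing the various OCH1 targeting vectors.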
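Example 2
Construction of OCH1 targeting vectors
Figure 3 is a schematic of the construction of a targeting vector for integration into the P. pastoris OCH1 gene.
PCR4: using P. pastoris genomic DNA as the template, the primer pair SacIADE1F (SEQ ID NO: 7; carries a Sac I restriction site) and ADE1KpnI R (SEQ ID NO: 8; carries a Kpn I restriction site) was used to amplify the ADE1 expression cassette.
PCR5: using pBLURA-SX (Keck Graduate Institute) as the template, the primer pair SacIURA3F (SEQ ID NO: 9; carries a Sac I restriction site) and URA3XhoI R (SEQ ID NO: 10; carries an Xho I restriction site) was used to amplify the P. pastoris URA3 expression cassette.
The ADE1 expression cassette PCR product was then digested with Sac I and Kpn I, and the URA3 expression cassette PCR product with Sac I and Xho I. The SacI-KpnI fragment of ADE1 and the SacI-XhoI fragment of URA3 were inserted into the Xho I and KpnI sites of the pUO vector to generate the pUAH vector.
PCR6: using genomic DNA as the template, the primer pair SphIOch1(-733)F (SEQ ID NO: 11; carries an Sph I restriction site) and Och1(-1)XhoI R (SEQ ID NO: 12; carries an Xho I restriction site) was used to amplify the P. pastoris OCH1 5' homologous sequence (5'H, -733/-1).
After restriction digestion of the PCR product, the SphI-XhoI fragment of OCH1 5'H (-733/-1) was inserted into the same restriction sites of pUAH to generate pUA5H.
PCR7: using genomic DNA as the template, the primer pair KpnIOch1(1)F (SEQ ID NO: 13; carries a Kpn I restriction site) and Och1(646)BamHI R (SEQ ID NO: 14; carries a BamH I restriction site) was used to amplify the P. pastoris OCH1 3' homologous sequence (3'H, 1/646).
After restriction digestion of the PCR product, the KpnI-BamHI fragment of OCH1 3'H (1/646) was inserted into the same restriction sites of pUA5H to generate the OCH1 targeting vector pUAH(1), used for integration at the position (-1/1) in the OCH1 5' regulatory region immediately upstream of the start codon.
In the same way, a series of OCH1 targeting vectors that integrate at different positions in the OCH1 gene was constructed by inserting the corresponding PCR products of the OCH1 5' and 3' homologous sequences into pUAH.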
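The primer pair SphIOch1(-118)F (SEQ ID NO:15) and Och1(553)XhoI R (SEQ ID NO:16) and the primer pair KpnIOch1(554)F (SEQ ID NO:17) and Och1(+103)BamHI R (SEQ ID NO:18) were used to construct pUAH(554) for integration at a position in the OCH1 coding region (553/554).
The primer pair SphIOch1(274)F (SEQ ID NO:19) and Och1(1096)XhoI R (SEQ ID NO:20) and the primer pair KpnIOch1(1097)F (SEQ ID NO:21) and Och1(+801)BamHI R (SEQ ID NO:22) were used to construct pUAH(1097) for integration at a position in the OCH1 coding region (1096/1097).
The primer pair SphIOch1(274)F (SEQ ID NO:19) and Och1(1165)XhoI R (SEQ ID NO:23) and the primer pair KpnIOch1(1166)F (SEQ ID NO:24) and Och1(+801)BamHI R (SEQ ID NO:22) were used to construct pUAH(1166) for integration at a position in the OCH1 coding region (1165/1166).
The primer pair SphIOch1(274)F (SEQ ID NO:19) and Och1(1212)XhoI R (SEQ ID NO:25) and the primer pair KpnIOch1(+1)F (SEQ ID NO:26) and Och1(+801)BamHI R (SEQ ID NO:22) were used to construct pUAH(+1) for integration at the position in the OCH1 coding region immediately upstream of the stop codon (1212/+1).
The primer pair SphIOch1(274)F (SEQ ID NO:19) and Och1(+3)XhoI R (SEQ ID NO:27) and the primer pair KpnIOch1(+4)F (SEQ ID NO:28) and Och1(+801)BamHI R (SEQ ID NO:22) were used to construct pUAH(+4) for integration at the position in the OCH1 3’ regulatory region immediately downstream of the stop codon (+3/+4).
The primer pair SphIOch1(274)F (SEQ ID NO:19) and Och1(+203)XhoI R (SEQ ID NO:29) and the primer pair KpnIOch1(+204)F (SEQ ID NO:30) and Och1(+801)BamHI R (SEQ ID NO:22) were used to construct pUAH(+204) for integration at a position in the OCH1 3’ regulatory region (+203/+204).
The primer pair SphIOch1(-860)F (SEQ ID NO:31) and Och1(-110)XhoI R (SEQ ID NO:32) and the primer pair KpnIOch1(-109)F (SEQ ID NO:33) and Och1(+641)BamHI R (SEQ ID NO:34) were used to construct pUAH(-109) for integration at a position upstream of the start codon in the OCH1 5’ regulatory region (-110/-109).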
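The integration sites above are written as pairs of flanking nucleotides such as -1/1, 1212/+1 and +3/+4. The short Python sketch below spells out the numbering convention that these labels appear to follow; the convention is inferred from the descriptions in this example (negative numbers count upstream of the start codon, plain numbers count coding nucleotides from the start codon, and "+" numbers count from the first nucleotide of the stop codon) and is not stated explicitly in the text. The function names are hypothetical.

```python
def describe(coord):
    """Describe one coordinate under the inferred numbering convention."""
    if coord.startswith("-"):
        return f"{coord[1:]} nt upstream of the start codon"
    if coord.startswith("+"):
        n = int(coord[1:])
        if n <= 3:
            return f"stop codon nt {n}"
        return f"{n - 3} nt downstream of the stop codon"
    return f"coding nt {coord}"

def describe_site(site):
    """Describe an integration site written as 'left/right'."""
    left, right = site.split("/")
    return f"between {describe(left)} and {describe(right)}"

for site in ["-110/-109", "-1/1", "553/554", "1212/+1", "+3/+4", "+203/+204"]:
    print(site, "->", describe_site(site))
```

Read this way, 1212/+1 is the junction between the last coding nucleotide and the stop codon, and +3/+4 is the junction directly after the stop codon, which matches how the two sites are described above.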
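Example 3
Integration of targeting cassettes at different positions in the OCH1 gene
Figure 4A maps the targeted integration positions in the OCH1 locus of P. pastoris. For integration, the series of constructed OCH1 targeting vectors pUAH(-109, 1, 554, 1097, 1166, +1, +4 and +204) was amplified by PCR with the following primer pairs to produce the linear OCH1 targeting cassettes UAH(-109, 1, 554, 1097, 1166, +1, +4 and +204):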
Och1(-709)F(SEQ ID NO:35)/Och1(491)R(SEQ ID NO:36)
Och1(-600)F(SEQ ID NO:37)/Och1(600)R(SEQ ID NO:38)
Och1(-47)F(SEQ ID NO:39)/Och1(1153)R(SEQ ID NO:40)
Och1(496)F(SEQ ID NO:41)/Och1(+484)R(SEQ ID NO:42)
Och1(565)F(SEQ ID NO:43)/Och1(+553)R(SEQ ID NO:44)
Och1(612)F(SEQ ID NO:45)/Och1(+600)R(SEQ ID NO:46)
Och1(615)F(SEQ ID NO:47)/Och1(+603)R(SEQ ID NO:48)
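Och1(816)F(SEQ ID NO:49)/Och1(+803)R(SEQ ID NO:50).
The OCH1 targeting cassettes contain the URA3 and ADE1 expression cassettes, located next to each other in opposite orientations on the two DNA strands. The two expression cassettes are flanked by 5’ and 3’ integration homologous sequences (5’H and 3’H) of the same length, 600 bp; these gene-specific homologous sequences ensure precise integration at the targeted position in the OCH1 gene.
The targeting cassettes were transformed by electroporation into the auxotrophic P. pastoris strain JC301 (ade1 his4 ura3) (Keck Graduate Institute) with a MicroPulser™ electroporation apparatus according to the manufacturer's (BioRad, USA) operating instructions. Transformed cells were grown on YNB plates supplemented with 20 mg/L histidine to select for adenine and uracil prototrophy.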
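The cassette layout described above (a 5’ homology arm, the marker cassettes, a 3’ homology arm, both arms taken from the sequence directly flanking the chosen site) can be sketched in a few lines of Python. This is a minimal illustration, assuming the locus is available as a plain string and the site is given as a 0-based index; the function name, the toy locus and the placeholder marker are all hypothetical and are not sequences from this document.

```python
def build_targeting_cassette(locus, site, insert, arm_len=600):
    """Return 5'H + insert + 3'H for an insertion between locus[site-1] and locus[site].
    Hypothetical helper; arm_len defaults to the 600 bp used in this example."""
    if site - arm_len < 0 or site + arm_len > len(locus):
        raise ValueError("locus too short for the requested homology arms")
    five_h = locus[site - arm_len:site]      # 5' homologous sequence
    three_h = locus[site:site + arm_len]     # 3' homologous sequence
    return five_h + insert + three_h

# Toy data only: a 1600 nt placeholder locus and a 50 nt placeholder marker
locus = "ACGT" * 400
marker = "N" * 50
cassette = build_targeting_cassette(locus, 800, marker)
print(len(cassette))
```

Because both arms are read from the same locus on either side of one junction, moving the site by a single nucleotide shifts both arms together, which is how the series of cassettes for different positions is defined.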
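Figure 4B depicts a representative homologous integration of the targeting cassette at the position immediately upstream of the start codon of the OCH1 gene in P. pastoris (-1/1). Colonies were picked at random from each transformation plate and cultured for extraction of genomic DNA, which was used to verify genomic integration by PCR. Two primer pairs, P1 (SEQ ID NO:51, located upstream of the 5’ homologous region in the genome)/P2 (SEQ ID NO:52, located within URA3 of the targeting fragment) and P3 (SEQ ID NO:53, located within ADE1 of the targeting fragment)/P4 (SEQ ID NO:54, located downstream of the 3’ homologous region in the genome), were used to verify homologous integration at the targeted positions (Figure 4B). Figure 4C shows the PCR verification of integration at different positions in the OCH1 locus. The P1/P2 primer pair amplified the expected bands of 1300, 2550 and 2550 bp, respectively. PCR with the P3/P4 primer pair on these strains also amplified the expected bands of 3500, 2300 and 2300 bp. The PCR results verify that the corresponding targeting fragments were integrated at the designated positions (-1/1, 1212/+1 and +3/+4, respectively). The parental strain JC301 served as the negative control: it gave a 1433 bp band with the P1/P4 primer pair but no band with the P1/P2 or P3/P4 primer pairs. The strain with integration immediately upstream of the start codon of the OCH1 locus is designated the och1(-1/+1, ADE1URA3) strain.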
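The decision logic behind this junction PCR can be written down as a small rule: P1/P2 and P3/P4 each span one genome-cassette junction, so both products appear only after targeted integration, while the parental locus yields only the P1/P4 product. The Python sketch below encodes that rule; the function name and the "inconclusive" category for mixed results are assumptions added for illustration.

```python
def classify_colony(p1_p2_band, p3_p4_band, p1_p4_band):
    """Interpret presence/absence of the three PCR products (booleans)."""
    if p1_p2_band and p3_p4_band:
        return "targeted integration"            # both junctions detected
    if p1_p4_band and not (p1_p2_band or p3_p4_band):
        return "parental locus (no integration)"
    return "inconclusive"                        # e.g. only one junction detected

print(classify_colony(True, True, False))
print(classify_colony(False, False, True))
```

The second call reproduces the pattern reported for the parental strain JC301 (a P1/P4 band only).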
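This example demonstrates that, with homologous sequences of the same 600 bp length, the efficiency of homologous-recombination integration differs markedly between positions in the OCH1 gene (Table 1). No integrants at the positions at the 5’ end and in the middle of the coding region (553/554, 1096/1097 and 1165/1166) could be identified among 100 screened colonies, indicating a very low efficiency of homologous-recombination integration. Consistent with this result, knockout of P. pastoris OCH1 with homologous sequences of about 1 kb or longer was previously reported to have an efficiency of only 0.1%, and other laboratories consider the efficiency to be possibly even lower (Choi, 2003, Proc Natl Acad Sci U S A 100:5022–5027; Chen, 2013, PLoS ONE 8(3):e57952).
The very low efficiency of homologous-recombination gene targeting in P. pastoris is well known. The efficiency of homologous-recombination gene replacement events depends strongly on the length of the targeting fragment. When the targeting homologous sequences are shorter than 500 bp, the homologous recombination efficiency is below 0.1%.
In contrast to earlier reports, however, the present invention found that with short homologous sequences of 600 bp the efficiency of homologous integration at the position immediately upstream of the stop codon (1212/+1) was about 7%. This may be attributed to incomplete disruption of OCH1 function: the integration causes the OCH1 gene to be transcribed into a nonstop OCH1 mRNA, which is then translated into a C-terminally extended OCH1 product with some functional activity. When the C-terminally extended product retains some activity, the efficiency of homologous integration at sites near the 3’ end of the coding region may be higher than at other sites in the coding region.
In particular, and again unlike earlier reports, the present invention also found that the efficiency of homologous recombination at positions in the OCH1 5’ and 3’ regulatory regions is markedly higher: 40% and 35% at the positions upstream of the start codon (-110/-109 and -1/1), and 80% and 25% at the two positions downstream of the stop codon (+3/+4 and +203/+204).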
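Example 4
Integration of targeting cassettes at different positions in the ADE1 gene
Targeting cassettes were integrated at different positions in the ADE1 gene to further verify the results of OCH1 gene targeting.
Figure 5A shows the integration positions of the targeting cassettes in the genomic ADE1 gene.
Figure 5B shows a schematic of the construction of the ADE1 targeting cassettes by PCR.
PCR1: the 5’ homologous sequence of the ADE1 locus (5’H, -800/-1) was amplified by PCR from P. pastoris genomic DNA with the primer pair ADE1(-800)F (SEQ ID NO:55) and ADE1(-1)U R (SEQ ID NO:56, carrying a URA3 overlap sequence for fusion PCR).
PCR2: the URA3 expression cassette was amplified from the pBLURA-SX vector with the primer pair A(-21)URA3F (SEQ ID NO:57) and URA3A(19)R (SEQ ID NO:58), both carrying ADE1 overlap sequences for fusion PCR.
PCR3: the 3’ homologous sequence of the ADE1 gene (3’H, 1/800) was amplified by PCR from P. pastoris genomic DNA with the primer pair UADE1(1)F (SEQ ID NO:59, carrying a URA3 overlap sequence for fusion PCR) and ADE1(800)R (SEQ ID NO:60).
The three PCR products above (1, 2 and 3) were joined by overlap-extension PCR with the primer pair ADE1(-800)F (SEQ ID NO:55) and ADE1(800)R (SEQ ID NO:60). This gave the linear targeting cassette UH(1), which integrates at the position immediately upstream of the start codon in the ADE1 5’ regulatory region (-1/1).
In the same way, a series of ADE1 targeting cassettes that integrate at different positions in the ADE1 gene was constructed by PCR amplification and fusion with the following corresponding primer pairs: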
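Overlap-extension PCR joins fragments whose ends share a short identical stretch introduced by the primers. The Python sketch below imitates only that sequence logic on toy strings; the `fuse` helper, the 18 nt overlaps and all three sequences are hypothetical stand-ins, not the primers or templates of this example.

```python
def fuse(left, right, min_overlap=15):
    """Join two fragments when the end of `left` equals the start of `right`.
    Hypothetical helper that mimics the product of overlap-extension PCR."""
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    raise ValueError(f"no overlap of at least {min_overlap} nt")

# Toy sequences standing in for the 5' arm, the URA3 cassette and the 3' arm
five_h = "ACGTTGCAAGCTTGGATCCA"
ura3 = "TTGACCGTAATGGGATAGGTCACGTTGGTG"
three_h = "CCATGGTACCGAGCTCGAATTC"

# Each product carries an overlap with its neighbour, as primer tails would add
pcr1 = five_h + ura3[:18]
pcr2 = five_h[-18:] + ura3 + three_h[:18]
pcr3 = ura3[-18:] + three_h

cassette = fuse(fuse(pcr1, pcr2), pcr3)
print(cassette == five_h + ura3 + three_h)
```

The fused product contains each overlap once, so the final cassette reads arm, marker, arm without duplicated junction sequence.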
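The primer pair ADE1(-98)F (SEQ ID NO:61) and ADE1(703)U R (SEQ ID NO:62, carrying a URA3 overlap sequence for fusion PCR), the primer pair A(684)URA3F (SEQ ID NO:63) and URA3A(728)R (SEQ ID NO:64) (both carrying ADE1 overlap sequences for fusion PCR), and the primer pair ADE1(704)F (SEQ ID NO:65, carrying a URA3 overlap sequence for fusion PCR) and ADE1(+591)R (SEQ ID NO:66) were used to construct the targeting cassette UH(704) for integration at a position in the ADE1 coding region (703/704).
The primer pair ADE1(62)F (SEQ ID NO:67) and ADE1(862)U R (SEQ ID NO:68, carrying a URA3 overlap sequence for fusion PCR), the primer pair A(842)URA3F (SEQ ID NO:69) and URA3A(881)R (SEQ ID NO:70) (both carrying ADE1 overlap sequences for fusion PCR), and the primer pair UADE1(863)F (SEQ ID NO:71, carrying a URA3 overlap sequence for fusion PCR) and ADE1(+750)R (SEQ ID NO:72) were used to construct the targeting cassette UH(863) for integration at a position in the ADE1 coding region (862/863).
The primer pair ADE1(62)F (SEQ ID NO:67) and ADE1(912)U R (SEQ ID NO:73, carrying a URA3 overlap sequence for fusion PCR), the primer pair A(896)URA3F (SEQ ID NO:74) and URA3A(+21)R (SEQ ID NO:75) (both carrying ADE1 overlap sequences for fusion PCR), and the primer pair UADE1(+1)F (SEQ ID NO:76, carrying a URA3 overlap sequence for fusion PCR) and ADE1(+750)R (SEQ ID NO:72) were used to construct the targeting cassette UH(+1) for integration at the position in the coding region immediately upstream of the ADE1 stop codon (912/+1).
The primer pair ADE1(62)F (SEQ ID NO:67) and ADE1(+3)U R (SEQ ID NO:77, carrying a URA3 overlap sequence for fusion PCR), the primer pair A(896)URA3F (SEQ ID NO:78) and URA3A(+23)R (SEQ ID NO:79) (both carrying ADE1 overlap sequences for fusion PCR), and the primer pair UADE1(+4)F (SEQ ID NO:80, carrying a URA3 overlap sequence for fusion PCR) and ADE1(+750)R (SEQ ID NO:72) were used to construct the targeting cassette UH(+4) for integration at the position in the 3’ regulatory region immediately downstream of the ADE1 stop codon (+3/+4).
The primer pair ADE1(298)F (SEQ ID NO:81) and ADE1(+203)U R (SEQ ID NO:82, carrying a URA3 overlap sequence for fusion PCR), the primer pair A(+186)URA3F (SEQ ID NO:83) and URA3A(+226)R (SEQ ID NO:84) (both carrying ADE1 overlap sequences for fusion PCR), and the primer pair ADE1(+204)F (SEQ ID NO:85, carrying a URA3 overlap sequence for fusion PCR) and ADE1(+1004)R (SEQ ID NO:86) were used to construct the targeting cassette UH(+204) for integration at a position in the ADE1 3’ regulatory region (+203/+204).
The primer pair ADE1(-895)F (SEQ ID NO:87) and ADE1(-110)U R (SEQ ID NO:88, carrying a URA3 overlap sequence for fusion PCR), the primer pair A(-133)URA3F (SEQ ID NO:89) and URA3A(-87)R (SEQ ID NO:90) (both carrying ADE1 overlap sequences for fusion PCR), and the primer pair UADE1(-109)F (SEQ ID NO:91, carrying a URA3 overlap sequence for fusion PCR) and ADE1(691)R (SEQ ID NO:92) were used to construct the targeting cassette UH(-109) for integration at a position upstream of the start codon in the ADE1 5’ regulatory region (-110/-109).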
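These linear targeting cassettes contain the URA3 expression gene flanked on both sides by 5’ and 3’ homologous sequences of similar length, about 800 bp (750-850 bp); these gene-specific homologous sequences allow precise integration at the targeted positions in the P. pastoris ADE1 gene.
The targeting cassettes were transformed by electroporation into cells of the auxotrophic P. pastoris strain JC307 (his4 ura3) (Keck Graduate Institute, USA). Transformed cells were grown on SC plates (8 g/L SC without histidine and uracil, 20 g/L glucose, 15 g/L agar) supplemented with 20 mg/L histidine to select for uracil prototrophy. Colonies were picked at random from the plates during the 2-3 days of incubation to avoid picking bias caused by white versus pink colonies, because accumulation of the red pigment in ade1 strains and the appearance of pink colonies require longer incubation. Genomic DNA was extracted from overnight cultures of the colonies and used for PCR verification of integration at the positions in the ADE1 gene.
Primer pairs P5/P6 (located upstream of the 5’ homologous region and downstream of the 3’ homologous region in the genome) were used to verify genomic integration (Figure 5B). The corresponding P5/P6 primer pairs are further designated P5-1 (SEQ ID NO:93)/P6-1 (SEQ ID NO:94), P5-2 (SEQ ID NO:95)/P6-2 (SEQ ID NO:96), P5-3 (SEQ ID NO:97)/P6-3 (SEQ ID NO:98) and P5-4 (SEQ ID NO:99)/P6-4 (SEQ ID NO:100) and were used to verify integration at the different positions in the ADE1 gene. For example, successful amplification of a band of the expected size of 3763 bp indicates correct chromosomal integration at position 912/+1, whereas amplification of a 2398 bp band indicates no chromosomal integration (Figure 5C).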
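Because P5 and P6 both anneal outside the homology arms, the two outcomes differ only by the length of the inserted sequence: 3763 bp minus 2398 bp gives 1365 bp, which presumably corresponds to the integrated URA3 cassette. A minimal Python sketch of this size check follows; the function name and the 100 bp gel-reading tolerance are assumptions for illustration.

```python
def interpret_band(observed_bp, wild_type_bp, insert_bp, tolerance_bp=100):
    """Classify a P5/P6 product by size (tolerance is an assumed gel resolution)."""
    if abs(observed_bp - (wild_type_bp + insert_bp)) <= tolerance_bp:
        return "integration"
    if abs(observed_bp - wild_type_bp) <= tolerance_bp:
        return "no integration"
    return "unexpected size"

wild_type_bp = 2398                   # band without integration, from the text
insert_bp = 3763 - wild_type_bp       # size difference between the two bands
print(insert_bp)
print(interpret_band(3763, wild_type_bp, insert_bp))
print(interpret_band(2400, wild_type_bp, insert_bp))
```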
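This example demonstrates that, with homologous sequences of similar length (about 800 bp), the efficiency of homologous-recombination integration differs markedly between sites in the ADE1 gene (Table 1). The integration efficiency at the positions near the 3’ end of the coding region (862/863 and 912/+1) exceeded 15%, whereas at the position in the middle of the coding region (703/704) it fell to 3%. As with OCH1 integration, the integration efficiency at sites near the 3’ end of the coding region is higher than in other parts of the coding region.
Consistent with the OCH1 integration results, the efficiency of homologous integration at positions in the ADE1 5’ and 3’ regulatory regions is markedly higher: 50% and 65% at the two positions upstream of the start codon (-110/-109 and -1/1), and 30% and 45% at the two positions downstream of the stop codon (+3/+4 and +203/+204).
Earlier reports indicated that the efficiency of homologous-recombination gene targeting is gene-dependent; for example, the efficiency of OCH1 targeting is extremely low, whereas that of ADE1 targeting is higher. The present invention found, however, that the efficiency of homologous-recombination gene targeting depends mainly on the region of the target gene. With homologous sequences shorter than 1 kb, homologous-recombination integration into the coding regions of OCH1 and ADE1 to disrupt gene function has completely different efficiencies for the two genes, but integration into the 5’ and 3’ regulatory regions of the OCH1 and ADE1 loci has a similarly high efficiency of 25% or more in both cases. The present invention thus found that the efficiency of homologous-recombination integration in the 5’ and 3’ regulatory regions is high and independent of the gene. In addition, the integration efficiency at sites near the 3’ end of the coding region is higher than in other parts of the coding region. The efficiency of homologous-recombination targeted integration in the different regions of a genomic gene can be ranked as follows: 5’ regulatory region and 3’ regulatory region >> 3’ end of the coding region > rest of the coding region. These findings provide new opportunities for regulating the function of target genes.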
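The ranking of regions can be checked against the efficiencies quoted in Examples 3 and 4. The Python sketch below collects those values and averages them by region; it is a rough tabulation, not a statistical analysis. Two entries are assumptions: sites described as "exceeding 15%" are entered as 15 (a lower bound), and sites with no integrant among 100 colonies are entered as 0. The region labels follow the wording of the text rather than a computed distance.

```python
from statistics import mean

# (gene, site, region as described in the text, efficiency in %)
reported = [
    ("OCH1", "-110/-109", "5' regulatory", 40),
    ("OCH1", "-1/1", "5' regulatory", 35),
    ("OCH1", "553/554", "coding, other", 0),      # 0 of 100 colonies (assumed 0)
    ("OCH1", "1096/1097", "coding, other", 0),
    ("OCH1", "1165/1166", "coding, other", 0),
    ("OCH1", "1212/+1", "coding, 3' end", 7),
    ("OCH1", "+3/+4", "3' regulatory", 80),
    ("OCH1", "+203/+204", "3' regulatory", 25),
    ("ADE1", "-110/-109", "5' regulatory", 50),
    ("ADE1", "-1/1", "5' regulatory", 65),
    ("ADE1", "703/704", "coding, other", 3),
    ("ADE1", "862/863", "coding, 3' end", 15),    # ">15%" entered as lower bound
    ("ADE1", "912/+1", "coding, 3' end", 15),     # ">15%" entered as lower bound
    ("ADE1", "+3/+4", "3' regulatory", 30),
    ("ADE1", "+203/+204", "3' regulatory", 45),
]

by_region = {}
for gene, site, reg, eff in reported:
    by_region.setdefault(reg, []).append(eff)

ranking = sorted(by_region, key=lambda r: mean(by_region[r]), reverse=True)
for reg in ranking:
    print(f"{reg}: mean {mean(by_region[reg]):.1f}% over {len(by_region[reg])} sites")
```

With these inputs the regulatory regions average well above the coding-region sites, in line with the order stated above.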
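Example 5
Gene integration to regulate β-galactosidase activity
Although the present invention has established that homologous-recombination integration in the 5’ and 3’ regulatory regions of OCH1 and ADE1 is highly efficient, no systematic analysis has yet clarified how gene integration in the 5’ and 3’ regulatory regions affects gene transcription and protein expression.
The lacZ gene of Escherichia coli encodes β-galactosidase, which hydrolyzes various β-D-galactosides, including chromogenic substrates, to give colored products. Because the β-galactosidase activity assay in liquid culture is convenient and sensitive, the enzyme is a commonly used reporter for monitoring the regulation of gene expression. A lacZ reporter for P. pastoris can be constructed by fusing the lacZ ORF to the 5’ and 3’ regulatory regions of a gene.
(1) Construction of 5’AOX1-induced lacZ expression vectors
Figures 6 and 7 show schematics of the construction of a series of 5’AOX1-induced lacZ expression vectors in which URA3 is positioned adjacent to the start and stop codons of the lacZ ORF to regulate its expression.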
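PCR1: the lacZ ORF (SEQ ID NO:128) was amplified by PCR from E. coli BL21(DE3) genomic DNA with the primer pair BamHIlacZ F (SEQ ID NO:101, carrying a BamH I restriction site) and lacZNotI R (SEQ ID NO:102, carrying Not I and Xho I restriction sites).
PCR2: the E. coli lacZ ORF was amplified by PCR from E. coli BL21(DE3) genomic DNA with the primer pair BamHIlacZ F (SEQ ID NO:101) and lacZnsNotI R (SEQ ID NO:103, lacking the lacZ stop codon and carrying Not I and Xho I restriction sites).
The PCR products of lacZ and lacZns were then each digested with BamH I and Not I. The BamHI/NotI fragments of lacZ and lacZns were inserted into the BamH I and Not I sites of the pPIC3.5K vector (Invitrogen) to give the p5’AOX1-lacZ and p5’AOX1-lacZns vectors (Figure 6).
PCR3: the P. pastoris URA3 expression cassette was amplified by PCR from the pBLURA-SX vector with the primer pair BamHIURA3F (SEQ ID NO:104, carrying a BamH I restriction site) and URA3BamHI R (SEQ ID NO:105, carrying a BamH I restriction site).
The PCR product of the URA3 expression cassette was digested with BamH I and inserted into the BamH I site of the p5’AOX1-lacZ vector. The ligated vectors, containing URA3 in either orientation, were transformed into E. coli strain Trans1-T1 (TransGen Biotech, China), and colony PCR was used to select the vector p5’AOX1-URA3-lacZ, in which URA3 lies on the same strand and in the same orientation immediately upstream of the lacZ ORF (Figure 6).
PCR4: the P. pastoris URA3 expression cassette was amplified by PCR from the pBLURA-SX vector with the primer pair NotIURA3F (SEQ ID NO:106, carrying a Not I restriction site) and URA3NotI R (SEQ ID NO:107, carrying a Not I restriction site).
The PCR product of the URA3 expression cassette was digested with Not I and inserted into the Not I site of the p5’AOX1-lacZ and p5’AOX1-lacZns vectors, respectively. The ligated vectors, containing URA3 in either orientation, were transformed into the phage-resistant E. coli strain Trans1-T1, and colony PCR was used to select the vectors p5’AOX1-lacZ-URA3, p5’AOX1-lacZ-URA3(-), p5’AOX1-lacZns-URA3 and p5’AOX1-lacZns-URA3(-). p5’AOX1-lacZ-URA3 and p5’AOX1-lacZns-URA3 both contain the URA3 expression cassette on the same strand and in the same orientation immediately downstream of the lacZ ORF and the lacZns ORF, respectively (Figure 6). The other two vectors, p5’AOX1-lacZ-URA3(-) and p5’AOX1-lacZns-URA3(-), contain the URA3 expression cassette on the opposite strand and in the opposite orientation immediately downstream of the lacZ ORF and the lacZns ORF, respectively (Figure 7).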
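(2) Construction of 5’OCH1-mediated lacZ expression vectors
Figures 8, 9, 10 and 11 show schematics of the construction of a series of 5’OCH1-mediated lacZ expression vectors in which URA3 is positioned adjacent to the start and stop codons of the lacZ ORF to regulate its expression.
PCR1: the 5’ regulatory region of P. pastoris OCH1 (5’OCH1, -731/-1) was amplified by PCR from genomic DNA with the primer pair BamHIOCH1(-731)F (SEQ ID NO:108, carrying a BamH I restriction site) and OCH1(-1)L R (SEQ ID NO:109, carrying a lacZ overlap sequence for fusion PCR).
PCR2: the lacZ ORF was amplified by PCR from E. coli BL21(DE3) genomic DNA with the primer pair OLacZ F (SEQ ID NO:110, carrying a 5’OCH1 overlap sequence for fusion PCR) and lacZXhoI R (SEQ ID NO:111, carrying an Xho I restriction site).
PCR3: the lacZ ORF without a stop codon (lacZns) was amplified by PCR from E. coli BL21(DE3) genomic DNA with the primer pair OLacZ F (SEQ ID NO:110) and lacZnsNotI R (SEQ ID NO:103, lacking the lacZ stop codon and carrying Not I and Xho I restriction sites).
PCR4: the PCR1 and PCR2 products were fused by overlap-extension PCR with the primer pair BamHIOCH1(-731)F (SEQ ID NO:108) and LacZXhoI R (SEQ ID NO:111), giving the 5’OCH1-lacZ fragment.
PCR5: the PCR1 and PCR3 products were fused by overlap-extension PCR with the primer pair BamHIOCH1(-731)F (SEQ ID NO:108) and lacZnsNotI R (SEQ ID NO:103), giving the 5’OCH1-lacZns fragment.
PCR6: the 3’ regulatory region of P. pastoris OCH1 (3’OCH1, +4/+798) was amplified by PCR from genomic DNA with the primer pair XhoIOCH1(+4)F (SEQ ID NO:112, carrying an Xho I restriction site) and OCH1(+798)SacI R (SEQ ID NO:113, carrying a Sac I restriction site).
The PCR product of the 5’OCH1-lacZ fragment was then digested with BamH I and Xho I, and the PCR product of 3’OCH1 with Xho I and Sac I. The BamHI-XhoI fragment of 5’OCH1-lacZ and the XhoI-SacI fragment of 3’OCH1 were inserted into the Sac I and BamH I sites of the pBLHIS-SX vector to give the p5’OCH1-lacZ vector (Figure 8).
The PCR product of the 5’OCH1-lacZns fragment was digested with BamH I and Xho I, and the PCR product of 3’OCH1 with Xho I and Sac I. The BamHI-XhoI fragment of 5’OCH1-lacZns and the XhoI-SacI fragment of 3’OCH1 were inserted into the Sac I and BamH I sites of the pBLHIS-SX vector to give the p5’OCH1-lacZns vector (Figure 8).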
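PCR7: the 5’ regulatory region of OCH1 (5’OCH1, -731/-1) was amplified by PCR from genomic DNA with the primer pair BamHIOCH1(-731)F (SEQ ID NO:108) and OCH1(-1)U R (SEQ ID NO:114, carrying a URA3 overlap sequence for fusion PCR).
PCR8: the P. pastoris URA3 expression cassette was amplified by PCR from the pBLURA-SX vector with the primer pair OURA3F (SEQ ID NO:115, carrying an OCH1 overlap sequence for fusion PCR) and URA3SphIXhoI R (SEQ ID NO:116, carrying Sph I and Xho I restriction sites).
PCR9: the PCR7 and PCR8 products were fused by overlap-extension PCR with the primer pair BamHIOCH1(-731)F (SEQ ID NO:108) and URA3SphIXhoI R (SEQ ID NO:116), giving the 5’OCH1-URA3 fragment.
PCR10: the lacZ ORF was amplified by PCR from E. coli BL21(DE3) genomic DNA with the primer pair SphILacZ F (SEQ ID NO:117, carrying an Sph I restriction site) and LacZXhoI R (SEQ ID NO:111).
The PCR product of the 5’OCH1-URA3 fragment was then digested with BamH I and Xho I, and the PCR product of the lacZ ORF with Sph I and Xho I. The BamHI-XhoI fragment of 5’OCH1-URA3 and the SphI-XhoI fragment of lacZ were inserted into the BamHI and XhoI sites of the p5’OCH1-lacZ vector to give the p5’OCH1-URA3-lacZ vector (Figure 9).
PCR11: the P. pastoris URA3 expression cassette was amplified by PCR from the pBLURA-SX vector with the primer pair XhoIURA3F (SEQ ID NO:3, carrying an Xho I restriction site) and URA3XhoI R (SEQ ID NO:10, carrying an Xho I restriction site).
The PCR product of the URA3 expression cassette was digested with Xho I and inserted into the Xho I site of the p5’OCH1-lacZ and p5’OCH1-lacZns vectors. The resulting vectors, containing URA3 in either orientation, were transformed into E. coli strain Trans1-T1, and colony PCR was used to select the vectors p5’OCH1-lacZ-URA3, p5’OCH1-lacZ-URA3(-), p5’OCH1-lacZns-URA3 and p5’OCH1-lacZns-URA3(-). p5’OCH1-lacZ-URA3 and p5’OCH1-lacZns-URA3 both contain the URA3 expression cassette on the same strand and in the same orientation immediately downstream of the lacZ ORF and the lacZns ORF, respectively. The other two vectors, p5’OCH1-lacZ-URA3(-) and p5’OCH1-lacZns-URA3(-), both contain the URA3 expression cassette on the opposite strand and in the opposite orientation immediately downstream of the lacZ ORF and the lacZns ORF, respectively (Figures 10 and 11).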
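(3) Transformation of the lacZ expression vectors
The 5’AOX1-induced lacZ expression vectors, namely p5’AOX1-lacZ, p5’AOX1-URA3-lacZ, p5’AOX1-lacZ-URA3, p5’AOX1-lacZ-URA3(-), p5’AOX1-lacZns-URA3 and p5’AOX1-lacZns-URA3(-), were linearized by Sac I digestion and transformed by electroporation into P. pastoris strain GS115 (his4) (Invitrogen). Transformed cells were grown on YNB plates to select for histidine prototrophy. As described by the manufacturer (Invitrogen), the linearized expression vectors integrate into the genome by single-crossover (roll-in) recombination.
The 5’OCH1-mediated lacZ expression vectors, namely p5’OCH1-lacZ, p5’OCH1-URA3-lacZ, p5’OCH1-lacZ-URA3, p5’OCH1-lacZ-URA3(-), p5’OCH1-lacZns-URA3 and p5’OCH1-lacZns-URA3(-), were linearized by Stu I digestion and transformed by electroporation into P. pastoris strain GS115. Transformed cells were grown on YNB plates to select for histidine prototrophy. The linearized expression vectors integrate at the his4 gene by single-crossover (roll-in) recombination.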
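(4) Real-time PCR analysis of lacZ mRNA
Cells transformed with the 5’AOX1-induced lacZ expression vectors were grown in 5 ml BMGY medium (10 g/L yeast extract, 20 g/L peptone, 13.4 g/L YNB without amino acids, 100 mM potassium phosphate buffer pH 6.0, 0.4 mg/L biotin, 10 ml/L glycerol) at 30 °C with shaking at 225 rpm for 48 hours. The cells were pelleted by centrifugation at 3000 g for 5 minutes and resuspended in 5 ml BMMY medium (10 g/L yeast extract, 20 g/L peptone, 13.4 g/L YNB without amino acids, 100 mM potassium phosphate buffer pH 6.0, 0.4 mg/L biotin, 10 ml/L methanol) at 30 °C with shaking at 225 rpm to induce lacZ expression. Induction was maintained for a further 48 hours by adding 50 μl of 100% methanol (1% final concentration) to the culture twice daily. The cells were then centrifuged at 3000 g for 10 minutes, washed with 5 ml of water, and centrifuged again to collect the cell pellet. The cell pellets were used for the β-galactosidase assay or stored at -80 °C for total RNA isolation.
Cells transformed with the 5’OCH1-mediated lacZ expression vectors were grown in 5 ml YPD medium at 30 °C with shaking at 225 rpm for 72 hours. The cells were then centrifuged at 3000 g for 10 minutes, washed with 5 ml of water, and centrifuged again to collect the cell pellet. The cell pellets were used for the β-galactosidase assay or stored at -80 °C for total RNA isolation.
Total RNA was isolated according to the manufacturer's instructions with the reagent whose name is given as an image in the source (Figure PCTCN2016080788-appb-000002; Life Technologies).
RNA was reverse-transcribed with the ReverTra Ace-α- first-strand cDNA synthesis kit (Toyobo) according to the manufacturer's instructions.
Real-time PCR reactions contained 10 μl of 2× iTaq™ Universal SYBR® Green supermix (BioRad, Hercules, CA), 1 μl of cDNA and 100 nM each of the GAPDH F (SEQ ID NO:118)/R (SEQ ID NO:119) and LacZ F (SEQ ID NO:120)/R (SEQ ID NO:121) primers in a total reaction volume of 20 μl. PCR was run on a LightCycler LC480 (Roche) with the following parameters: 1 cycle of 95 °C for 1 minute; 40 cycles of 95 °C for 10 seconds, 58 °C for 10 seconds and 72 °C for 10 seconds. All samples were run in triplicate and tested several times. Real-time PCR data were analyzed with the manufacturer's (Roche) software. Relative mRNA expression was determined by the comparative CT method (ΔΔCT method). Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) served as the endogenous control for quantifying gene expression.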
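The methanol feed stated above is simple volume arithmetic: 1% (v/v) of a 5 ml culture is 50 μl of 100% methanol. A small Python helper makes the calculation explicit for other culture volumes; the function name is hypothetical, and the small change in total volume caused by the addition itself is ignored.

```python
def methanol_dose_ul(culture_ml, final_percent):
    """Volume of 100% methanol (in microlitres) giving the stated final % (v/v)."""
    return culture_ml * 1000 * final_percent / 100

dose = methanol_dose_ul(5, 1)   # 5 ml culture, 1% final concentration
print(dose)
```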
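The comparative CT calculation normalizes the target gene to the endogenous control in each sample and then compares the treated sample with the reference sample. The Python sketch below shows the standard 2^(-ΔΔCT) form, which assumes roughly 100% amplification efficiency for both primer pairs; the CT values in the example call are invented for illustration and are not data from this document.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of the target gene by the comparative CT (delta-delta CT) method."""
    delta_sample = ct_target_sample - ct_ref_sample       # e.g. lacZ minus GAPDH
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    return 2 ** (-delta_delta)

# Hypothetical CT values: target comes up two cycles later in the test strain
fold = relative_expression(24.0, 18.0, 22.0, 18.0)
print(fold)
```

With these made-up values the target mRNA is at one quarter of the control level, i.e. a 75% reduction.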
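Figures 12A and B show that integration of the URA3 expression cassette near the start and stop codons has different inhibitory effects on the expression of lacZ mRNA driven by 5’AOX1 and 5’OCH1. When the URA3 expression cassette was integrated on the same strand and in the same orientation immediately upstream of the start codon of the lacZ ORF, it reduced lacZ mRNA levels by 60% and 70%, respectively. This reduction can be attributed to the 3’ URA3 terminator which, being located upstream of the lacZ ORF, effectively blocks lacZ transcription initiated by the 5’AOX1 or 5’OCH1 promoter. However, the URA3 3’ terminator does not terminate transcription completely, so a low level of aberrant lacZ mRNA is produced that lacks a proper 5’UTR for translation.
When the URA3 expression cassette is integrated on either DNA strand, in either orientation, immediately upstream of the stop codon of the lacZ ORF, an aberrant lacZ mRNA lacking a stop codon (nonstop lacZ mRNA, lacZns) is expected. When the URA3 expression cassette is integrated on either DNA strand, in either orientation, immediately downstream of the stop codon of the lacZ ORF, a lacZ mRNA with an aberrant 3’UTR is expected. RT-PCR analysis showed that integration near the stop codon can either increase or decrease the level of aberrant lacZ mRNA driven by 5’AOX1 or 5’OCH1 (Figures 12A and B). These results are not entirely in line with earlier reports, according to which cells have surveillance systems that recognize and eliminate aberrant mRNAs to avoid producing potentially harmful protein products (van Hoof A, Frischmeyer PA, Dietz HC, Parker R (2002) Exosome mediated recognition and degradation of mRNAs lacking a termination codon. Science 295:2262–2264). It is therefore necessary to evaluate integration near the stop codon on both DNA strands and in both orientations in order to obtain the best effect in regulating transcription of the target gene.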
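(5) β-Galactosidase activity assay
To further examine the inhibitory effect of the integrated URA3 expression cassette on protein expression mediated by 5’AOX1 and 5’OCH1, the intracellular specific activity of β-galactosidase in the collected cell pellets was measured by a published protocol (Ausubel F M et al. Current Protocols in Molecular Biology, Wiley InterScience, 2010).
Figures 13A and B show the relative intracellular specific activity of β-galactosidase for integration of the URA3 expression cassette near the start and stop codons. When URA3 was integrated on the same strand and in the same orientation immediately upstream of the start codon of the lacZ ORF, no β-galactosidase activity was detectable in the cells. This complete suppression of β-galactosidase activity can be attributed to regulation of both transcription and translation. First, termination by 3’URA3 markedly reduces transcription of the aberrant lacZ mRNA lacking a 5’UTR. Second, translation of the protein cannot be initiated on an aberrant lacZ mRNA without a proper 5’UTR.
The present invention shows that gene integration in the 5’ regulatory region, in particular immediately upstream of the ORF, can specifically suppress expression of the target gene by repressing both transcription and translation. Current methods for regulating gene transcription mainly either alter template function through binding of inhibitory molecules to DNA or suppress transcriptional activity through binding of inhibitory molecules to RNA polymerase. Methods for regulating protein translation mainly use small RNAs and RNA-binding proteins that bind to the mRNA and alter its translatability. These methods, however, are not very specific in regulating target gene expression. By comparison, gene integration within a genomic gene, in particular upstream of the ORF, is a more effective method for specifically controlling target gene expression.
Because URA3 integration has different effects on the level of aberrant lacZ mRNA, URA3 integration near the stop codon on different strands and in different orientations inhibits 5’AOX1- and 5’OCH1-mediated translation of β-galactosidase in cells to varying degrees. When URA3 is integrated near the stop codon in a suitable orientation, β-galactosidase activity can be reduced by up to 70%. This inhibition is more effective than the inhibition by microRNA reported previously: microRNAs were reported to reduce mRNA stability and translation by only 32% and 4%, respectively (Noah Spies, Christopher B. Burge, and David P. Bartel (2013). 3’UTR-isoform choice has limited influence on the stability and translational efficiency of most mRNAs in mouse fibroblasts, Genome Research, 23:2078–2090).
The present invention thus shows a further method of controlling target gene expression, by integration near the stop codon. Gene integration near the stop codon must be evaluated on both DNA strands and in both orientations to obtain the best regulatory result.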
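Example 6
Gene integration to suppress OCH1 expression
To further examine the effect of gene integration on the regulation of gene transcription in the genome, representative integration strains from Example 3 were selected for analysis of OCH1 mRNA transcription. Cells of the control strain JC307 and of the three integration strains och1(-1/+1), (1212/+1) and (+3/+4) were grown in 5 ml YPD medium at 30 °C with shaking at 225 rpm for 72 hours. The cells were then centrifuged at 3000 g for 10 minutes, washed with 5 ml of water, centrifuged again to collect the cell pellet, and stored at -80 °C for total RNA isolation.
Total RNA was isolated with the reagent whose name is given as an image in the source (Figure PCTCN2016080788-appb-000004; Life Technologies), and the RNA was reverse-transcribed into cDNA with the ReverTra Ace-α- first-strand cDNA synthesis kit (Toyobo).
Real-time PCR reactions contained 10 μl of 2× iTaq™ Universal SYBR® Green supermix (BioRad, Hercules, CA), 1 μl of cDNA and 100 nM each of the GAPDH F (SEQ ID NO:118)/R (SEQ ID NO:119) and OCH1F (SEQ ID NO:122)/R (SEQ ID NO:123) primers in a total reaction volume of 20 μl. PCR was run on a LightCycler LC480 (Roche) with the following parameters: 1 cycle of 95 °C for 1 minute; 40 cycles of 95 °C for 10 seconds, 58 °C for 10 seconds and 72 °C for 10 seconds. All samples were run in triplicate and tested several times. Real-time PCR data were analyzed with the manufacturer's (Roche) software. Relative mRNA expression was determined by the comparative CT method (ΔΔCT method). Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) served as the endogenous control for quantifying gene expression.
Figure 14 shows the relative expression of OCH1 mRNA in these strains. Gene integration upstream of the start codon suppressed the OCH1 mRNA level by more than 90%. Gene integration downstream of the stop codon also reduced the OCH1 mRNA level. Gene integration upstream of the stop codon, however, markedly increased the OCH1 mRNA level. These results resemble the observations on lacZ mRNA regulation in Example 5.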
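Example 7
Protein glycosylation in the OCH1-suppressed strain
Mammalian cells and yeast share the same initial steps and processing of N-glycosylation in the endoplasmic reticulum. While the nascent peptide chain is synthesized in the ER lumen, the N-glycosylation precursor oligosaccharide Glc3Man9GlcNAc2 is attached to the Asn residue in the conserved sequence Asn-X-Thr/Ser (X being any amino acid except Pro) of the nascent peptide; glycosidases such as glucosidases I and II then process the glycan to the Man8GlcNAc2 structure, and the protein carrying this glycan is transported to the Golgi apparatus. In the Golgi, however, further processing of the glycan differs markedly between mammalian cells and yeast. In the mammalian Golgi, a series of mannosidases and glycosyltransferases progressively process the glycan into hybrid and complex glycan structures. In the P. pastoris Golgi, the α-1,6-mannosyltransferase encoded by the OCH1 gene (Och1p) first adds one α-1,6-mannose to the glycan to form Man9GlcNAc2; various mannosyltransferases then continue to add mannose residues, in many cases tens to more than a hundred, finally forming high-mannose glycan structures and hyperglycosylating the protein. Och1p is thus the first and most critical enzyme by which yeast, unlike mammalian cells, hypermannosylates proteins, so disrupting the OCH1 gene is expected to block hypermannosylation of proteins in P. pastoris (Kornfeld, R. & Kornfeld, S. Assembly of asparagine-linked oligosaccharides. Annu. Rev. Biochem. 54, 631–664, 1985).
A great deal of work has been done to knock out the OCH1 gene in yeast genomes in order to block hypermannosylation. However, homologous gene targeting that directly disrupts the OCH1 coding region and its function is extremely inefficient. In the present invention, gene integration in the 5’OCH1 regulatory region can be used to block the Och1p-initiated hypermannosylation of proteins.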
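Codon-optimized mouse interleukin-22 (mIL-22; the yeast-codon-optimized DNA sequence of the His-tagged mature mouse IL-22 peptide is shown in SEQ ID NO:129), synthesized by Generay, was used as the template for PCR amplification with the MIL22F (SEQ ID NO:124)/R (SEQ ID NO:125) primer pair. The PCR product was digested with the restriction enzymes Xho I and Not I and cloned into the Xho I/Not I sites of pPICZα (Invitrogen) to give the mIL-22 expression vector, which expresses and secretes His-tagged mIL-22. The expression vector was linearized with the restriction enzyme Sac I and electroporated into the GS115 and och1(-1/+1) strains. Transformed cells were grown on YPD plates supplemented with 100 mg/L Zeocin. As described by the manufacturer (Invitrogen), the linearized vector integrates at the AOX1 locus by single-crossover (roll-in) recombination.
The transformed cells were grown in 5 ml YPD medium at 30 °C with shaking at 225 rpm for 24 hours. The cells were pelleted by centrifugation at 3000 g for 5 minutes, resuspended in 5 ml BMGY medium and grown at 30 °C with shaking at 225 rpm for 24 hours. The cells were then pelleted by centrifugation at 3000 g for 5 minutes, resuspended in 5 ml BMMY medium and grown at 30 °C with shaking at 225 rpm to induce mIL-22 expression. Induction was maintained for a further 72 hours by adding 50 μl of 100% methanol (1% final concentration) to the culture twice daily. The culture supernatant was then harvested by centrifugation (3000 g, 10 minutes) and frozen at -20 °C until use.
His-tagged mIL-22 protein was purified from the supernatant by Ni-affinity chromatography according to the manufacturer's instructions (Nanjing GenScript Biotech Co., Ltd.).
The glycans were released from the His-tagged mIL-22 protein by digestion with N-glycosidase F (PNGase F) (New England Biolabs, Beverly, MA) using a previously reported method (Cregg JM (2010) Pichia Protocols, Second edition. Totowa, New Jersey: Humana Press).
The molecular weights of the glycans were determined with an Ultraflex MALDI-TOF mass spectrometer (Bruker Daltonics, Bremen, Germany) according to the manufacturer's instructions.
Figure 15A shows the mass spectrum of the N-glycans released from mIL-22 produced in the GS115 strain. The main high-mannose N-glycans are Man9-15GlcNAc2 (m/z: 1907, 2069, 2231, 2393, 2555, 2717, 2880), indicating that Och1p initiates complete hypermannosylation of Man8GlcNAc2. Figure 15B shows the mass spectrum of the N-glycans released from mIL-22 produced in the och1(-1/+1) strain. The main N-glycans are Man8-15GlcNAc2 (m/z: 1744, 1907, 2069, 2231, 2393, 2555, 2717, 2880); the high-mannose glycans among them may be formed by the action of other mannosyltransferases. The presence of Man8GlcNAc2 (m/z: 1744) indicates that gene integration upstream of the OCH1 coding region can effectively block Och1p-initiated hypermannosylation (Choi, et al. (2003) Proc Natl Acad Sci U S A 100:5022–5027).
All documents mentioned in the present invention are incorporated by reference in this application as if each document were individually incorporated by reference. It should further be understood that, after reading the above teachings of the present invention, those skilled in the art may make various changes or modifications to the invention, and such equivalents likewise fall within the scope defined by the claims appended to this application.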
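The peak lists above form a ladder in which each step is one additional mannose residue (about 162 mass units). The Python sketch below assigns ManN GlcNAc2 compositions to the listed m/z values by counting hexose increments from the Man8GlcNAc2 peak at m/z 1744. The average hexose residue mass of 162.14 and the 2-unit tolerance are assumptions chosen for this illustration; the assignment is relative to the reported Man8 peak and does not model the ion type.

```python
HEXOSE_RESIDUE = 162.14   # average mass of one mannose residue (C6H10O5), assumed
MAN8_MZ = 1744            # Man8GlcNAc2 peak as reported in the text

def annotate(mz, tolerance=2.0):
    """Return 'ManNGlcNAc2' for a peak on the hexose ladder, else None."""
    extra = round((mz - MAN8_MZ) / HEXOSE_RESIDUE)   # mannoses beyond Man8
    expected = MAN8_MZ + extra * HEXOSE_RESIDUE
    if extra < 0 or abs(mz - expected) > tolerance:
        return None
    return f"Man{8 + extra}GlcNAc2"

peaks = [1744, 1907, 2069, 2231, 2393, 2555, 2717, 2880]
labels = [annotate(mz) for mz in peaks]
for mz, label in zip(peaks, labels):
    print(mz, label)
```

Every listed peak falls on the ladder from Man8 to Man15, which is the basis for reading the 1744 peak in the och1(-1/+1) strain as unextended Man8GlcNAc2.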

Claims (12)

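  1. A nucleotide construct for regulating a gene, having the structure shown in the following formula:
    5’-A-B-C-3’
    wherein A is a 5’ homologous sequence, B is an interfering gene, and C is a 3’ homologous sequence;
    the 5’ and 3’ homologous sequences are such that the recombination site of the nucleotide construct lies between the first nucleotide of the start codon of the gene to be regulated and 110, preferably 50, nucleotides upstream of the first nucleotide of the start codon of the gene to be regulated, or the 5’ and 3’ homologous sequences are such that the recombination site of the nucleotide construct lies between 100, 50 or 20 nucleotides, preferably 50 nucleotides, upstream of the first nucleotide of the stop codon of the gene to be regulated and 300 nucleotides downstream of the first nucleotide of the stop codon of the gene to be regulated.
  2. The nucleotide construct according to claim 1, characterized in that the recombination sites are separated by 0-20 nucleotides, preferably 0-5 nucleotides, most preferably 0 nucleotides.
  3. The nucleotide construct according to claim 1 or 2, characterized in that the gene to be regulated may be a gene with low recombination efficiency, preferably a recombination efficiency of <3%, more preferably <1%.
  4. The nucleotide construct according to claim 3, characterized in that the gene to be regulated is the OCH1 or ADE1 gene.
  5. A host cell comprising the nucleotide construct according to any one of claims 1-4.
  6. The host cell according to claim 5, characterized in that the host cell is a yeast cell.
  7. A method for regulating gene expression, the method comprising:
    a) constructing the nucleotide construct according to any one of claims 1-4; and
    b) introducing the nucleotide construct constructed in step a) into a cell so that it integrates into the gene to be regulated by homologous recombination.
  8. The method according to claim 7, characterized in that the gene to be regulated may be a gene with low recombination efficiency, preferably a recombination efficiency of <3%, more preferably <1%.
  9. A method for engineering a strain, comprising:
    a) constructing the nucleotide construct according to any one of claims 1-4; and
    b) transforming the nucleotide construct constructed in step a) into the strain to be engineered.
  10. Use of a strain engineered by the method according to claim 9, the strain being used for producing a recombinant protein.
  11. Use of a strain engineered by the method according to claim 9, the strain being used for producing a metabolite.
  12. Use of a strain engineered by the method according to claim 9, the strain being used for a biocatalytic reaction.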
PCT/CN2016/080788 2015-04-30 2016-04-29 新型基因打靶方法 WO2016173556A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US15/570,656 US11466280B2 (en) 2015-04-30 2016-04-29 Gene targeting method
EP16785984.2A EP3290519A4 (en) 2015-04-30 2016-04-29 NOVEL GEN-TARGETING PROCESS

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510218188.1 2015-04-30
CN201510218188.1A CN106191040B (zh) 2015-04-30 2015-04-30 基因打靶方法

Publications (1)

Publication Number Publication Date
WO2016173556A1 true WO2016173556A1 (zh) 2016-11-03

Family

ID=57198173

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/080788 WO2016173556A1 (zh) 2015-04-30 2016-04-29 新型基因打靶方法

Country Status (4)

Country Link
US (1) US11466280B2 (zh)
EP (1) EP3290519A4 (zh)
CN (1) CN106191040B (zh)
WO (1) WO2016173556A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108866050B (zh) * 2017-05-11 2022-12-27 杭州菁因康生物科技有限公司 一种高效的基因工程载体
CN107480473B (zh) * 2017-07-18 2021-02-26 中国石油大学(华东) 一种基于密码子模板的真核生物功能基因序列搜索方法

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1498270A (zh) * 2001-01-19 2004-05-19 钟渊化学工业株式会社 基因已被破坏的酵母
CN101679991A (zh) * 2007-02-02 2010-03-24 格莱科德公司 用于产生同质糖蛋白的遗传改良酵母
CN102120967A (zh) * 2010-12-09 2011-07-13 江南大学 Och1基因缺陷型毕赤酵母x-33菌株的构建及应用

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0981637B1 (en) * 1997-03-14 2005-05-25 Biogen Idec Inc. Method for integrating genes at specific sites in mammalian cells via homologous recombination and vectors for accomplishing the same
MX2012003816A (es) * 2009-10-01 2012-07-04 Toto Ltd Construccion de adn y procedimiento para la generacion de celulas cho recombinantes usando la misma.
CN102154188B (zh) * 2010-12-22 2013-05-08 中国人民解放军第三军医大学 大肠杆菌DH5α的nfi基因敲除突变株及其制备方法和应用
EP4234696A3 (en) * 2012-12-12 2023-09-06 The Broad Institute Inc. Crispr-cas component systems, methods and compositions for sequence manipulation

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CHEN, ZAO ET AL.: "Enhancement of the Gene Targeting Efficiency of Non-Conventional Yeasts by Increasing Genetic Redundancy", PLOS ONE, vol. 8, no. 3, 7 March 2013 (2013-03-07), pages 1 - 9, XP055325946, ISSN: 1932-6203 *
See also references of EP3290519A4 *

Also Published As

Publication number Publication date
EP3290519A4 (en) 2019-01-23
US11466280B2 (en) 2022-10-11
CN106191040A (zh) 2016-12-07
CN106191040B (zh) 2021-09-14
EP3290519A1 (en) 2018-03-07
US20190093115A1 (en) 2019-03-28

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16785984

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2016785984

Country of ref document: EP