CN104372016A - System and method for introducing variation in target genomic sequences of recipient cells - Google Patents



Publication number
CN104372016A
Authority
CN
China
Prior art keywords
sequence
expression cassette
gsms
thermoscript
dna
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410559802.6A
Other languages
Chinese (zh)
Inventor
刘立新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US14/058,214 (US20140113375A1)
Application filed by Individual
Publication of CN104372016A


Abstract

The invention provides a system and a method for introducing variation into a target genomic sequence of a recipient cell. The system comprises a) a genomic sequence modification sequence (GSMS) expression cassette, which comprises a polynucleotide sequence homologous to the target genomic sequence and a primer binding sequence, and which produces GSMS RNAs; and b) a reverse transcriptase expression cassette, which comprises a polynucleotide sequence encoding a reverse transcriptase. The invention also provides a method for introducing variation into a target genomic sequence of a recipient cell and a method for obtaining a cell population that carries random variation in the target genomic sequence. The invention can supply large amounts of gene modification sequence continuously in recipient cells over a relatively long period, which raises the success rate of target gene variation, and it can introduce a large number of random variations into a target genomic region.

Description

System and method for introducing variation in target genomic sequences of recipient cells
Technical field
The present invention belongs to the field of molecular genetics. Specifically, it uses a transient expression and reverse transcription system to introduce specific variation into the genomic sequence of a recipient cell.
Background art
Genetic modification techniques make it possible to introduce variation into the genetic information encoded by the genomic sequences of living cells or tissues. Traditional genetic modification techniques integrate an exogenous nucleic acid sequence at a random site in the recipient genome. These methods introduce a vector containing the exogenous sequence into recipient cells, rely on random integration of the foreign DNA into the recipient genome, and then isolate the cells that contain the exogenous sequence. Such non-natural integration of genetic information into the recipient genome has several limitations. Because integration is random, the exogenous sequence may land in a gene that is essential for survival of the tissue and disrupt its function. Even if the insertion does not damage a host gene, expression of the foreign gene can be affected by the surrounding genomic DNA (position effect). In some cases the negative influence of the surrounding genomic environment is so strong that the foreign gene cannot be expressed. In other cases the surrounding genomic environment may cause overexpression of the foreign gene and thereby harm the recipient cell. Integration of the foreign gene at multiple sites sometimes causes RNA interference (RNAi). Another problem is that these methods add unnecessary or useless genetic material to the recipient genome, such as viral or other vector remnants, regulatory sequences, and marker genes. This material may have unpredictable long-term effects on the recipient tissue, and the possible impact of marker genes, such as herbicide resistance genes and antibiotic resistance genes, on health and the environment is a cause for concern.
To solve these problems of traditional genetic modification, targeted gene modification techniques have been developed that introduce variation at specific sites in the host genome. These techniques modify a target sequence in an episome or chromosome by site-specific correction or site-directed mutagenesis. Some of them use different types of oligonucleotides or polynucleotides, such as double-stranded DNA (dsDNA), single-stranded DNA (ssDNA), oligonucleotides with 5' and/or 3' end modifications that resist degradation by intracellular nucleases (Campbell et al., 1989), chimeric RNA/DNA or DNA/DNA molecules (Igoucheva et al., 2004a; Parekh-Olmedo et al., 2005), RNA oligonucleotides (Storici, 2008), and triplex-forming oligonucleotides (Simon et al., 2008). Campbell and colleagues used a single-stranded nucleic acid to induce a variation in a plasmid-encoded kanamycin phosphotransferase gene by co-transformation. Zorin and colleagues inserted an exogenous single-stranded DNA at a specific site in the genome of the green alga Chlamydomonas reinhardtii and reported that single-stranded DNA undergoes non-homologous recombination far less often than double-stranded DNA (Zorin et al., 2005). Kmiec and colleagues developed a method that uses chimeric RNA/DNA duplex oligonucleotides to induce variation at a specific site of a target gene in higher eukaryotic cells, and reported that the conversion efficiency of the chimeric RNA/DNA duplex oligonucleotides is higher than that of unmodified single-stranded oligonucleotides (U.S. Patent No. 5,565,350).
The molecular mechanism of homologous gene modification is not completely understood. One possible mechanism is that the variation-inducing sequence hybridizes with the target gene sequence and produces a mismatch bubble that triggers the DNA repair machinery, which then uses the variation-inducing sequence as a template to "correct" the target genomic sequence of the recipient cell. Other mechanisms may involve strand invasion or homologous recombination.
A major drawback of oligonucleotide-based targeted modification is its low success rate, which can be attributed in part to degradation and limited supply of the exogenous oligonucleotides or polynucleotides (Zorin et al., 2005). Kmiec et al. described a technique for targeted modification that uses single-stranded oligonucleotides carrying nuclease-resistant modifications. Compared with unmodified single-stranded oligonucleotides, oligonucleotides carrying nuclease-resistant modifications such as phosphorothioate linkages, LNA linkages, and 2'-O-methyl modifications are degraded less in the cell and give a higher mutagenesis success rate. The mutagenesis success rate of optimally modified single-stranded oligonucleotides can be 2 to 3 times that of chimeric RNA/DNA duplex oligonucleotides (U.S. Patent No. 6,936,497). However, the effect of the chemical modifications on the enzymes involved in inducing variation is not fully understood. Even with optimized modifications, the induction rate is in most cases still relatively low, at about 1 × 10⁻⁴ (Zhu, 2000). Another problem of targeted gene modification is competition from non-homologous integration of the foreign DNA. The ratio of homologous to non-homologous integration varies with tissue and cell type. For example, the probability of homologous integration is relatively high in yeast, whereas in higher eukaryotes such as plants and animals the frequency of non-homologous integration is significantly higher than that of homologous integration. Zorin et al. found that single-stranded DNA (ssDNA) and double-stranded DNA (dsDNA) undergo homologous recombination with similar efficiency, but that ssDNA is much less prone to non-homologous recombination than dsDNA (Zorin, 2005). This finding offers a simple way to reduce non-homologous integration.
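To put an induction rate of about 1 × 10⁻⁴ in perspective, the following minimal sketch estimates how many treated cells would have to be screened to find at least one variant. It assumes that every cell is converted independently with the same probability, which is a simplification not stated in the source, and the function name `cells_to_screen` is hypothetical.

```python
import math


def cells_to_screen(rate: float, confidence: float = 0.95) -> int:
    """Smallest n such that P(at least one variant among n cells) >= confidence.

    Assumes independent, identically distributed conversion events
    (an illustrative assumption, not a claim from the source).
    """
    if not (0.0 < rate < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("rate and confidence must lie strictly between 0 and 1")
    # Solve 1 - (1 - rate)**n >= confidence for n.
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-rate))


n_cells = cells_to_screen(1e-4)
print(n_cells)
```

Under these assumptions the required screen is on the order of tens of thousands of cells, which illustrates why a higher induction rate or a selectable phenotype matters in practice.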
Other gene targeting techniques use targeting endonucleases, such as transcription activator-like effector nucleases (Miller et al., 2010), zinc finger nucleases (Bibikova, Beumer, Trautman, & Carroll, 2003), and homing endonucleases (Grizot et al., 2009), in combination with a gene modification vector that contains a sequence homologous to the target gene. The vector also contains reporter and/or selectable marker genes. The mechanism is believed to be homologous recombination. These techniques are quite effective, but a single treatment can produce only one specific variation.
Targeted gene modification therefore needs a technique that can supply gene modification sequences in large quantity and continuously over a relatively long period, and thereby raise the success rate of target gene variation. The present invention not only meets this need but also makes it possible to introduce random mutations into a target genomic region.
Summary of the invention
The first object of the present invention is to provide a system that produces predetermined or randomly generated variation at a specific genomic site.
The second object of the present invention is to provide a method for introducing variation into a target genomic sequence of a recipient cell.
The third object of the present invention is to provide a method for obtaining a cell population that carries random variation in a target genomic sequence.
To achieve the first object, the present invention provides a system for introducing variation into a target genomic sequence of a recipient cell, the system comprising: a) a genomic sequence modification sequence (GSMS) expression cassette, wherein the GSMS expression cassette comprises a polynucleotide sequence homologous to the target genomic sequence and a primer binding sequence, and wherein the GSMS expression cassette produces GSMS RNAs; and b) a reverse transcriptase expression cassette, wherein the reverse transcriptase expression cassette comprises a polynucleotide sequence encoding a reverse transcriptase, the encoded reverse transcriptase can be a natural or an engineered reverse transcriptase, and the reverse transcriptase can have good or poor proofreading function (Figure 10). In the recipient cell, the GSMS RNAs are reverse transcribed by the reverse transcriptase into ssGSMS cDNAs (Fig. 7), and in the nucleus the ssGSMS cDNAs change the target genomic sequence through the cell's DNA repair or homologous recombination mechanisms. The target genomic sequence can be any genomic sequence of the recipient cell, including, for example, exon sequences, intron sequences, non-transcribed regulatory sequences, direct or inverted repeats, and recombination hotspot sequences.
In some embodiments, the GSMS and the reverse transcriptase can be placed together in the same expression cassette, with the reverse transcriptase coding sequence nearer the 5' end. The reverse transcriptase coding sequence and the GSMS are separated by a sequence that forms a hairpin structure after transcription into RNA, and this hairpin secondary structure can stop reverse transcription (Figure 12).
In some embodiments, the GSMS comprises a sequence that is fully complementary to the target genomic sequence and that, after the GSMS is introduced into the recipient cell, can pair and hybridize with the recipient cell's genomic sequence. Alternatively, the GSMS comprises a sequence that is fully complementary to the target genomic sequence except for differences of one or several nucleotides at preset positions, where the nucleotide differences are mismatches, deletions, insertions, or a combination of these. In some embodiments, the GSMS comprises a non-homologous polynucleotide sequence flanked by sequences homologous to the recipient cell's genomic sequence; preferably, the non-homologous polynucleotide sequence encodes a selectable marker gene. In some embodiments, the GSMS comprises a sequence homologous to a homologous recombination hotspot region of the target genomic sequence, or a sequence homologous to a target genomic sequence that contains direct and inverted repeats.
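The three kinds of preset nucleotide difference described above (mismatch, deletion, insertion) can be illustrated with a small sequence-editing sketch. It only shows how a homology segment differing from the target at one preset position could be derived in silico; the function name `apply_preset_edit`, its parameters, and the example sequence are hypothetical and are not taken from the source.

```python
def apply_preset_edit(target: str, pos: int, kind: str,
                      bases: str = "", length: int = 1) -> str:
    """Return a copy of `target` carrying one preset difference at index `pos`.

    kind = "mismatch":  replace the base at `pos` with the single base in `bases`
    kind = "deletion":  remove `length` bases starting at `pos`
    kind = "insertion": insert `bases` immediately before `pos`
    """
    target = target.upper()
    bases = bases.upper()
    if not 0 <= pos < len(target):
        raise ValueError("pos is outside the target sequence")
    if kind == "mismatch":
        if len(bases) != 1 or bases == target[pos]:
            raise ValueError("a mismatch needs one base that differs from the target")
        return target[:pos] + bases + target[pos + 1:]
    if kind == "deletion":
        return target[:pos] + target[pos + length:]
    if kind == "insertion":
        return target[:pos] + bases + target[pos:]
    raise ValueError(f"unknown edit kind: {kind}")


target = "ATGGCCAAGT"  # arbitrary illustrative sequence
print(apply_preset_edit(target, 3, "mismatch", bases="T"))
print(apply_preset_edit(target, 3, "deletion", length=2))
print(apply_preset_edit(target, 3, "insertion", bases="TT"))
```

A combination of differences can be modeled by applying the function repeatedly, taking care that earlier insertions or deletions shift later positions.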
In some embodiments, a GSMS that matches the target genomic sequence exactly and a reverse transcriptase with weak proofreading are introduced into the recipient cell at the same time. Because the reverse transcriptase proofreads poorly, random mutations are introduced into the GSMS cDNA during reverse transcription. These single-stranded GSMS cDNAs carrying random mutations can be incorporated into the target genomic sequence through DNA repair or homologous recombination, which yields a library of cells with random mutations in the target genomic sequence. This random mutation library can be used to screen for valuable mutations in the target genomic region. In other embodiments, the reverse transcriptase has good proofreading function.
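How an error-prone reverse transcriptase turns one exact GSMS RNA into a population of differing cDNAs can be sketched with a toy simulation. The model below is an assumption made for illustration only: it introduces substitutions alone, at a uniform per-base `error_rate` chosen by the caller, and ignores insertions, deletions, and sequence context. The names `reverse_transcribe` and `mutant_library` are hypothetical.

```python
import random

# RNA template base -> complementary DNA base
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}


def reverse_transcribe(rna: str, error_rate: float, rng: random.Random) -> str:
    """Return the cDNA (5'->3') of `rna`, misincorporating bases at `error_rate`."""
    cdna = []
    # Synthesis starts at the 3' end of the template, so read it in reverse.
    for base in reversed(rna.upper()):
        dna_base = COMPLEMENT[base]
        if rng.random() < error_rate:
            # Substitute one of the three incorrect bases (toy error model).
            dna_base = rng.choice([b for b in "ACGT" if b != dna_base])
        cdna.append(dna_base)
    return "".join(cdna)


def mutant_library(rna: str, error_rate: float, size: int, seed: int = 0) -> list[str]:
    """Simulate `size` independent reverse transcription events."""
    rng = random.Random(seed)
    return [reverse_transcribe(rna, error_rate, rng) for _ in range(size)]


faithful = reverse_transcribe("AUGGCC", 0.0, random.Random(1))
library = mutant_library("AUGGCC" * 10, error_rate=0.01, size=5)
print(faithful, len(set(library)))
```

Setting `error_rate` to zero corresponds to the good-proofreading case, in which every cDNA matches the design.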
In some embodiments, the 5' end of the GSMS RNA can form a secondary structure, and reverse transcription terminates when the reverse transcriptase reaches this structure (Fig. 9).
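Whether a candidate RNA contains a sequence capable of folding into such a hairpin can be checked roughly by looking for a perfect inverted repeat separated by a short loop. The sketch below is a deliberately naive assumption: it considers only Watson-Crick pairs, requires a perfect stem, and does no thermodynamic folding, so it is not a substitute for a structure prediction tool. The function name and the default stem and loop sizes are hypothetical.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def find_hairpin(rna: str, stem: int = 6, min_loop: int = 3, max_loop: int = 8):
    """Return (stem_start, loop_length) of the first perfect stem-loop, or None."""
    rna = rna.upper()
    for start in range(len(rna) - 2 * stem - min_loop + 1):
        left_arm = rna[start:start + stem]
        # The right arm must be the reverse complement of the left arm.
        right_arm = "".join(RNA_PAIR[b] for b in reversed(left_arm))
        for loop in range(min_loop, max_loop + 1):
            right_start = start + stem + loop
            if rna[right_start:right_start + stem] == right_arm:
                return start, loop
    return None


# Illustrative sequence: 6-nt stem, 4-nt loop, then unstructured tail.
print(find_hairpin("GGGCGC" + "UUUU" + "GCGCCC" + "AAAA"))
print(find_hairpin("A" * 20))
```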
In some embodiments, the system provided by the present invention further comprises a primer expression cassette, wherein the primer expression cassette produces a natural or artificial primer tRNA, and the primer tRNA binds to the primer binding sequence on the GSMS RNAs and initiates reverse transcription.
The GSMS has a primer binding site that can bind a primer to start reverse transcription. In some embodiments, the primer binding site sequence is complementary to the 3' terminal sequence of a natural or artificial tRNA. The GSMS RNA can use a natural tRNA, or a synthetic tRNA expressed together with it, as the primer to start reverse transcription (Fig. 7). In some embodiments, the 3' end of the GSMS RNA comprises a poly(U) tail. In eukaryotes, a poly(A) tail can be added to the 3' end of the transcribed RNA. A GSMS RNA with a poly(U)-poly(A) tail can self-anneal and start reverse transcription (Fig. 8).
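Because the primer binding site is complementary to the 3' terminal sequence of the primer tRNA, it can be derived as the reverse complement of that terminal segment. The sketch below shows this derivation; the default length of 18 nucleotides is an assumption borrowed from the primer binding sites of retroviruses and is not specified in the source, and the tRNA-like sequence used in the example is made up.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def primer_binding_site(trna: str, length: int = 18) -> str:
    """RNA sequence (5'->3') complementary to the last `length` nt of `trna`."""
    tail = trna.upper()[-length:]
    return "".join(RNA_PAIR[b] for b in reversed(tail))


def append_primer_binding_site(gsms_rna: str, trna: str, length: int = 18) -> str:
    """Place the primer binding site at the 3' end of the GSMS RNA."""
    return gsms_rna.upper() + primer_binding_site(trna, length)


toy_trna = "GGGAAAUUUCCCGCGCCA"  # made-up sequence ending in CCA
pbs = primer_binding_site(toy_trna, length=6)
print(pbs)
print(append_primer_binding_site("AUGGCCAAG", toy_trna, length=6))
```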
In some embodiments, the reverse transcriptase is a natural reverse transcriptase with RNA-dependent DNA polymerase activity, such as HIV, M-MLV, or AMV reverse transcriptase. Naturally occurring reverse transcriptases generally have little or no proofreading activity and can therefore be used to produce GSMS cDNAs that carry random mutations. If random mutations in the GSMS cDNA are not desired, an engineered or natural reverse transcriptase with high proofreading activity should be used.
In some embodiments, the system can comprise a single-stranded DNA (ssDNA) binding protein expression cassette. The expressed ssDNA binding protein can bind to the ssGSMS cDNAs, protect them from degradation, and help transport them into the nucleus. In addition, ssDNA binding proteins can play an important role in DNA repair and homologous recombination. These ssDNA binding proteins include replication protein A and its homologs, RecA and its homologs, Rad51 and its homologs, DMC1 and its homologs, ICP8 and its homologs, and SSB and its homologs. Among them, Rad51 and DMC1 can bind to ssDNA to form nucleoprotein filaments and thereby take part in important steps of homologous recombination such as homology search, DNA strand invasion, and homologous sequence pairing (Holthausen, 2010) (Fig. 3).
In some embodiments, the system comprises a targeting endonuclease expression cassette that encodes a targeting endonuclease (Fig. 2). The targeting endonuclease attacks the region of the target genomic sequence that is homologous to the GSMS. The targeting endonuclease is an engineered endonuclease that comprises a sequence recognition domain for recognizing a preset DNA sequence and a DNA endonuclease domain, and the targeting endonuclease and the GSMS are directed at the same homologous region of the target genomic sequence. The DNA sequence recognition domain can be derived from a zinc finger protein DNA recognition domain, a transcription activator-like effector (TALE) DNA recognition domain, or the sequence recognition domain of a meganuclease. The DNA endonuclease domain has endonuclease activity: it can cut a double-stranded polynucleotide sequence to produce a double-strand break, or cut one strand of a double-stranded DNA to produce a nick.
In some embodiments, the system comprises an siRNA expression cassette. The siRNA produced by the siRNA expression cassette promotes degradation of the mRNA transcribed from the target genomic sequence, but the siRNA has no region of homology with the GSMS RNAs. The siRNA can keep the target genomic sequence in an unwound state and thereby increase the chance that the ssGSMS cDNAs interact with the target genomic sequence. The siRNA can also be chemically synthesized and introduced into the recipient cell together with the GSMS and reverse transcriptase expression cassettes.
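The requirement that the siRNA target the mRNA of the target gene without sharing homology with the GSMS RNA can be expressed as a simple interval constraint: candidate siRNA windows must lie entirely outside the part of the mRNA that the GSMS covers. The sketch below enumerates such windows. The 21-nucleotide window size is a typical siRNA length used here as an assumption, and the coordinates and function name are hypothetical.

```python
def sirna_window_starts(mrna_length: int, gsms_start: int, gsms_end: int,
                        size: int = 21) -> list[int]:
    """Start positions of `size`-nt windows that do not overlap [gsms_start, gsms_end).

    Coordinates are 0-based positions on the target mRNA; the GSMS-homologous
    region is the half-open interval [gsms_start, gsms_end).
    """
    if not 0 <= gsms_start <= gsms_end <= mrna_length:
        raise ValueError("GSMS interval must lie within the mRNA")
    return [
        start
        for start in range(mrna_length - size + 1)
        if start + size <= gsms_start or start >= gsms_end
    ]


starts = sirna_window_starts(mrna_length=100, gsms_start=30, gsms_end=60)
print(starts[:3], starts[-3:], len(starts))
```

A real design would additionally score each window for efficacy and off-target matches, which this sketch does not attempt.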
In some embodiments, the GSMS, reverse transcriptase, ssDNA binding protein, targeting endonuclease, and other expression cassettes can be provided in RNA form, to be translated directly into protein or used as a template for cDNA synthesis.
In some embodiments, the GSMS expression cassette, reverse transcriptase expression cassette, primer expression cassette, ssDNA binding protein expression cassette, targeting endonuclease expression cassette, and siRNA expression cassette can be located on different expression vectors or together on the same vector. At least two of the reverse transcriptase coding sequence, the ssDNA binding protein coding sequence, and the targeting endonuclease coding sequence can be present in one expression cassette, forming a fusion expression cassette. In one arrangement, a single promoter is operably linked to two or more protein coding sequences, and adjacent protein coding sequences are separated by a translation skipping sequence. In another arrangement, a single promoter is operably linked to two or more protein coding sequences followed by the GSMS, adjacent protein coding sequences are separated by a translation skipping sequence, and the GSMS is separated from the upstream protein coding sequence by a sequence encoding a hairpin RNA structure. For example, the reverse transcriptase and ssDNA binding protein sequences can be linked linearly to the GSMS sequence and driven by the same promoter. The protein coding sequences are nearer the 5' end and the GSMS is nearer the 3' end; a translation skipping sequence is inserted between the two protein coding sequences, and a sequence that forms a hairpin RNA after transcription is inserted at the 5' end of the GSMS to stop reverse transcription (Figure 12).
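The element order of the fusion expression cassette described above can be written down as a small data structure, which makes the 5' to 3' layout explicit. This is only an organizational sketch: the `Element` class, the `fusion_cassette` function, and the element labels are hypothetical, and a terminator is appended on the assumption that the fusion cassette, like the individual cassettes, ends with one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    kind: str  # "promoter", "cds", "skip", "hairpin", "gsms", or "terminator"
    name: str


def fusion_cassette(promoter: str, coding_sequences: list[str],
                    gsms: str, terminator: str) -> list[Element]:
    """Lay out a fusion cassette in 5'->3' order.

    Protein coding sequences come first, separated by translation skipping
    sequences; a hairpin-forming sequence then separates them from the GSMS.
    """
    if not coding_sequences:
        raise ValueError("at least one protein coding sequence is required")
    parts = [Element("promoter", promoter)]
    for index, cds in enumerate(coding_sequences):
        if index > 0:
            parts.append(Element("skip", "translation skipping sequence"))
        parts.append(Element("cds", cds))
    parts.append(Element("hairpin", "hairpin-forming sequence"))
    parts.append(Element("gsms", gsms))
    parts.append(Element("terminator", terminator))
    return parts


layout = fusion_cassette(
    "promoter",
    ["reverse transcriptase", "ssDNA binding protein"],
    "GSMS",
    "terminator",
)
print(" / ".join(e.name for e in layout))
```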
The system provided by the present invention can produce, in large quantity and continuously, single-stranded DNA homologous to the target genome. This single-stranded DNA can be transported into the nucleus, where it changes the genetic information of the target genomic sequence through DNA repair or homologous recombination pathways.
To achieve the second object, the present invention provides a method for introducing variation into a target genomic sequence of a recipient cell, comprising: a) constructing a GSMS expression cassette, wherein the GSMS contains a polynucleotide sequence homologous to the target genomic sequence and a primer binding sequence, and wherein the GSMS expression cassette produces GSMS RNAs; b) constructing a reverse transcriptase expression cassette, wherein the reverse transcriptase expression cassette encodes a reverse transcriptase; c) introducing the GSMS expression cassette and the reverse transcriptase expression cassette into the recipient cell at the same time, wherein the GSMS RNAs are reverse transcribed by the reverse transcriptase into ssGSMS cDNAs, and the ssGSMS cDNAs cause variation in the target genomic sequence; and d) selecting recipient cells in which the target genomic sequence has been changed.
In some embodiments, the method provided by the present invention further comprises selecting one or more expression cassettes from the following group and introducing them into the recipient cell at the same time: a primer expression cassette, an ssDNA binding protein expression cassette, a targeting endonuclease expression cassette, and an siRNA expression cassette.
Cells with a changed genome can be selected by conventional means. For example, a growth advantage, resistance, metabolic change, or fluorescence caused by the genomic change can serve as the selected trait. In addition, PCR and DNA sequencing of the target genomic sequence can be used to select cell clones that carry the variation.
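Selection by sequencing amounts to comparing each clone's sequence of the target region with the unmodified reference. A minimal sketch of that comparison follows. It assumes the clone sequences are already aligned to the reference and of equal length, so it reports substitutions only; insertions and deletions would need a proper alignment step. The function names and the example sequences are hypothetical.

```python
def substitutions(reference: str, clone: str) -> list[tuple[int, str, str]]:
    """List (position, reference_base, clone_base) for every differing position."""
    if len(reference) != len(clone):
        raise ValueError("sequences must be pre-aligned and of equal length")
    return [
        (pos, ref_base, clone_base)
        for pos, (ref_base, clone_base) in enumerate(zip(reference.upper(), clone.upper()))
        if ref_base != clone_base
    ]


def select_variant_clones(reference: str, clones: dict[str, str]) -> dict[str, list]:
    """Keep only the clones whose target region differs from the reference."""
    selected = {}
    for name, sequence in clones.items():
        diffs = substitutions(reference, sequence)
        if diffs:
            selected[name] = diffs
    return selected


reference = "ATGGCC"  # arbitrary illustrative target region
clones = {"clone_1": "ATGGCC", "clone_2": "ATGTCC"}
print(select_variant_clones(reference, clones))
```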
To achieve the third object, the present invention provides a method for obtaining a cell population that carries random variation in a target genomic sequence, comprising the following steps: a) constructing a GSMS expression cassette, wherein the GSMS contains a polynucleotide sequence fully homologous to the target genomic sequence and a primer binding sequence, and wherein the GSMS expression cassette produces GSMS RNAs; b) constructing a reverse transcriptase expression cassette, wherein the reverse transcriptase expression cassette encodes a reverse transcriptase with weak proofreading; c) introducing the GSMS expression cassette and the reverse transcriptase expression cassette into the recipient cell at the same time, wherein the GSMS RNAs are reverse transcribed by the weakly proofreading reverse transcriptase into ssGSMS cDNAs containing random variation, and the ssGSMS cDNAs cause the random variation to be integrated into the target genomic sequence; and d) selecting cells that carry random variation in the target genomic sequence.
In certain embodiments of the present invention, the method further comprises selecting one or more expression cassettes from the following group and introducing them into the recipient cell at the same time: a primer expression cassette, an ssDNA binding protein expression cassette, a targeting endonuclease expression cassette, and an siRNA expression cassette.
The present invention provides a transient expression system that can continuously produce and supply large amounts of single-stranded cDNA. At least part of the single-stranded cDNA produced is homologous to the target genome, and the cDNA can be transported into the nucleus to change the genomic sequence. The invention also provides a method for generating a library of cells that carry random mutations in a target genomic region. In addition, the invention combines ssGSMS cDNA with targeting endonucleases, such as transcription activator-like effector nucleases (Miller et al., 2010), zinc finger nucleases (Bibikova, Beumer, Trautman, & Carroll, 2003), and homing endonucleases (Grizot et al., 2009), to introduce mutations into a specific sequence. Cytological and molecular biology components that improve the efficiency of targeted genome modification are also provided.
The core of the present invention is a transient expression system that imitates the replication systems of retroviruses and retrotransposons and can amplify the Genomic Sequence Modification Sequence (GSMS) in the cell through transcription and reverse transcription. Because there are two rounds of amplification, the system can continuously supply large amounts of ssGSMS cDNA, and the interaction between the ssGSMS cDNA and the cellular genome produces the desired variation in the target genome. After several days, the transient expression system in the cell is completely degraded, and no exogenous genetic material remains apart from the variation in the target sequence. Compared with oligonucleotide techniques, the present invention can supply relatively sustained, high levels of ssGSMS cDNA for several days or even weeks rather than a few hours. The invention is particularly suitable for hard-to-transfect cells, such as plant cells: a very small amount of expression vector delivered into the cell can supply a large amount of ssGSMS cDNA for genetic modification by homologous recombination. In addition, because double-stranded DNA is about 100 times more likely than single-stranded DNA to take part in non-homologous recombination (Zorin et al., 2005), genetic modification with single-stranded DNA according to the present invention greatly reduces the occurrence of unintended variation compared with methods that use double-stranded DNA (such as plasmids).
The present invention also makes use of reverse transcriptases with poor proofreading ability. Such a reverse transcriptase can continuously produce ssGSMS cDNAs with random mutations in the cell, and these ssGSMS cDNAs can introduce the random mutations into the target gene through homologous recombination mechanisms. The resulting library of cells with random mutations in the target genome can be used to screen for desired phenotypes and traits.
The system of the invention also provides single-stranded DNA binding proteins that stabilize the ssGSMS cDNA and help it enter the nucleus, and targeting endonucleases that produce single-strand or double-strand breaks at the site where variation is desired. The invention further provides a way of using siRNA to keep the target sequence in a relaxed state that favors DNA hybridization and enzymatic processing. The invention can be used to introduce variation into a specific endogenous gene, and also to insert a segment of foreign DNA at a specific site in a chromosome or episome.
Description of the drawings
Fig. 1. Basic GSMS transient expression system. The plasmid comprises a GSMS expression cassette, a reverse transcriptase expression cassette, and an ssDNA binding protein expression cassette; each expression cassette has its own promoter and terminator.
Fig. 2. GSMS transient expression system carrying a targeting endonuclease. The plasmid comprises a GSMS expression cassette, a reverse transcriptase expression cassette, an ssDNA binding protein expression cassette, a tRNA expression cassette, and a targeting endonuclease expression cassette; each expression cassette has its own promoter and terminator.
Fig. 3. Production of ssGSMS cDNAs in the cell. The GSMS is first transcribed into RNA and then, with the co-expressed tRNA as primer, reverse transcribed into single-stranded cDNA by the co-expressed reverse transcriptase. The ssGSMS cDNA binds to ssDNA binding protein and enters the nucleus. A targeting endonuclease can take part in this process.
Fig. 4. Possible fates of the ssGSMS cDNA after entering the nucleus when a targeting endonuclease is included in the system. The ssGSMS cDNA binding protein not only protects the ssGSMS cDNA from degradation and from forming secondary structure, but also helps it search for the homologous genomic region and align with it. Once aligned, single-nucleotide variations in the ssGSMS cDNA, whether preset or introduced by an error-prone reverse transcriptase, form mismatch bubbles with the genomic sequence. The mismatch bubbles trigger the recipient cell's mismatch repair (MMR) system, which incorporates the mutation into the genomic DNA. The ssGSMS cDNA can also be integrated into the recipient cell genome by homologous recombination or similar mechanisms, such as strand invasion and strand exchange.
Fig. 5. Possible fates of the ssGSMS cDNA after entering the nucleus when a double-strand targeting endonuclease is included in the system. The ssGSMS cDNA binding protein not only protects the ssGSMS cDNA from degradation and from forming secondary structure, but also helps it search for the homologous genomic region and align with it. The targeting endonuclease cuts the genomic target site to produce a double-strand break (DSB), and the ssGSMS cDNA hybridizes with the target genomic sequence. After hybridization, the ssGSMS cDNA can serve as a template for synthesis of the target genomic sequence or be integrated into the genomic sequence by homologous strand exchange.
Fig. 6. Possible fates of the ssGSMS cDNA after entering the nucleus when a single-strand targeting endonuclease is included in the system. The ssGSMS cDNA binding protein not only protects the ssGSMS cDNA from degradation and from forming secondary structure, but also helps it search for the homologous genomic region and align with it. The targeting endonuclease cuts the genomic target site to produce a single-strand break (SSB), and the ssGSMS cDNA hybridizes with the target genomic sequence. After hybridization, the ssGSMS cDNA can serve as a template for synthesis of the target genomic sequence or be integrated into the genomic sequence by homologous strand exchange. Compared with a double-strand targeting endonuclease, a single-strand targeting endonuclease produces fewer off-target mutations and chromosome rearrangements.
Fig. 7. Use of tRNA as the primer to produce ssGSMS cDNA. The 3' end of the GSMS RNA has a primer binding site that is complementary to the 3' end of the tRNA. The tRNA binds to the primer binding site of the GSMS RNA and serves as the primer for reverse transcription.
Fig. 8. Use of a poly(T/A) primer in ssGSMS cDNA synthesis. In this design, a poly(dT) stretch is added to the 3' end of the GSMS sense strand, so the 3' end of the GSMS RNA contains a poly(U) stretch. During transcription, a poly(A) tail is added after the poly(U) at the 3' end of the GSMS RNA. The poly(A) tail can fold back, pair with the poly(U), and serve as the primer for reverse transcription to synthesize ssGSMS cDNA.
Fig. 9. Use of secondary structure at the 5' end of the GSMS RNA to stop reverse transcription. The GSMS is designed so that a hairpin secondary structure forms at the 5' end of the GSMS RNA. When the reverse transcriptase reaches this secondary structure, reverse transcription terminates.
Figure 10. Behavior of reverse transcriptases with good and poor proofreading. With a reverse transcriptase that proofreads well, the GSMS RNA is reverse transcribed into ssGSMS cDNA that matches the design. With a reverse transcriptase that proofreads poorly, random mutations are frequently introduced into the ssGSMS cDNA, which produces a random mutation library of ssGSMS cDNAs in the recipient cells.
Figure 11. can comprise in systems in which, assist some expression cassettes of goal gene group transformation.
Expression cassette 1: promoter/GSMS/terminator
Expression cassette 2: promoter/siRNA directed against the GSMS target gene/terminator
Expression cassette 3: promoter/reverse transcriptase/terminator
Expression cassette 4: promoter/artificial tRNA complementary to the primer binding site on the GSMS/terminator
Expression cassette 5: promoter/natural tRNA complementary to the primer binding site on the GSMS/terminator
Expression cassette 6: fluorescent protein expression cassette
Expression cassette 7: other RNAi expression cassettes that interfere with RNA transport and degradation
Expression cassette 8: promoter/zinc finger nuclease or transcription activator-like effector nuclease/terminator
Expression cassette 9: promoter/single-stranded DNA (ssDNA) binding protein/terminator
Fig. 12. Design of a fusion expression cassette in which more than one protein coding sequence (such as reverse transcriptase and single-stranded DNA binding protein) and the GSMS are joined linearly and controlled by the same promoter. The protein coding sequences lie toward the 5' end of the expression cassette and the GSMS toward the 3' end. A translation-skipping sequence is inserted between two adjacent protein coding sequences, and a sequence that forms a hairpin structure when transcribed into RNA is added at the 5' end of the GSMS. After the expression cassette is transcribed into RNA, the ribosome synthesizes protein starting from the 5' end of the RNA and stops at the stop codon, while the reverse transcriptase synthesizes cDNA starting from the 3' end of the RNA and stops at the hairpin secondary structure, so that it seldom enters the protein coding region. The translation-skipping sequence encodes a self-cleaving polypeptide, which ensures that an independent reverse transcriptase and an independent single-stranded DNA binding protein are produced rather than a larger fusion protein combining the two.
Detailed Description
Unless otherwise defined, the technical and scientific terms used herein have the meanings commonly understood by a person of ordinary skill in the field to which the invention pertains. As used herein, the following terms have the meanings given below unless a different definition is expressly stated in the text.
The term "genomic sequence modifying sequence" (GSMS), as used herein, refers to a nucleic acid that can cause a change at a specific site in a chromosomal or episomal sequence of the recipient cell. The change can be one or more nucleotide substitutions, nucleotide insertions, deletions, or any combination of these. The change can also be the insertion of a segment of foreign DNA that confers a desired phenotype. The term "episomal DNA" refers to extrachromosomal DNA that replicates autonomously in the cell, such as plasmids, cosmids, bacterial artificial chromosomes (BACs), and yeast artificial chromosomes (YACs). A GSMS contains a primer binding sequence and a sequence homologous to the chromosomal or episomal target sequence of the recipient. The homologous sequence of the GSMS can be fully complementary to the target sequence except for one or a few bases. The GSMS can also be complementary to the target sequence to different degrees, such as 10-20%, 30-40%, 50-60%, 70%, 80%, 90%, or 95%. In some embodiments, the GSMS comprises an exogenous sequence joined at one or both ends to sequences homologous to the target sequence. The homologous sequence of the GSMS should be long enough to recognize the target sequence of the recipient cell and hybridize with it. Its length can be, for example, 10 bases, 40-60 bases, or several hundred to several thousand bases. The target sequence can be any sequence of interest that is to be changed or modified, including coding sequences and non-coding sequences (regulatory sequences, introns, and repetitive sequences). The target sequence can also be a sequence with a higher rate of homologous recombination, such as a recombination hotspot or a direct or inverted repeat.
Terms such as "heterologous DNA", "exogenous polynucleotide", "exogenous nucleic acid", and "exogenous sequence", as used herein, refer to sequences that originate from a source foreign to the particular recipient cell, or that originate from the same source but sit at a site where they are not naturally found in the recipient cell. Expression of foreign DNA produces an exogenous polypeptide, such as the product of a marker gene.
As used herein, the terms "homologous sequence", "homologous nucleic acid", and "homologous polynucleotide" refer to a nucleotide sequence (DNA or RNA) endogenous to the particular recipient cell. A homologous sequence can be extracted and cloned from the recipient cell or chemically synthesized from sequence data. A homologous sequence can be fully complementary to the endogenous DNA or RNA of the recipient cell. It can also differ from the endogenous sequence at some bases, provided that its homology with the corresponding endogenous sequence is high enough for it to recognize and hybridize with that sequence when introduced into the recipient cell. Homologous gene modification refers to changing the corresponding endogenous gene of the recipient cell by means of a homologous nucleotide sequence.
An "expression cassette", as used herein, refers to the part of a vector's nucleic acid sequence that contains the genetic components needed to direct the cellular machinery to produce an RNA or a protein. The basic components of a DNA expression cassette are a promoter sequence, a sequence encoding an RNA or a protein, and a transcription termination sequence. A promoter is a DNA region that binds the proteins required to initiate transcription of a specific gene (such as RNA polymerase and transcription factors). Different promoters can be used in an expression cassette depending on the type of recipient cell (for example, prokaryotic or eukaryotic). Prokaryotic promoters include the Lac, Trp, Tac, and T7 promoters. Efficient viral promoters can be used in eukaryotic cells, such as the CMV-IE promoter, the SV40 promoter, the RSV-LTR promoter, the MoMLV LTR promoter, and the CaMV 35S promoter. Eukaryotic promoters are generally weaker than viral promoters but have the advantage of tissue-specific expression: the Apo A-I promoter (De Geest et al., 2000) and the ApoE promoter (Kim et al., 2001) are liver-specific, and the MCK promoter (Hauser et al., 2000) and the myosin heavy-chain promoter (Skarli et al., 1998) are muscle-specific. A transcription terminator is a DNA sequence that causes RNA polymerase to stop transcribing. Prokaryotes and eukaryotes have different transcription termination signals and therefore use different terminator sequences. The promoter and the terminator in one expression cassette can come from the same gene or from different genes. For example, the T7 terminator and the T7 promoter are used in expression cassettes for bacterial cells, and the CMV-IE promoter and the rabbit beta-globin terminator are used in expression cassettes for animal cells (for example, the pTandem-1 vector, Novagen, Madison, WI). A synthetic poly(A) terminator sequence is sometimes used instead (for example, the pTargeT™ vector, Promega, Madison, WI).
Besides expressing multiple proteins from expression cassettes arranged in series, two to three proteins can be integrated into one expression unit driven by the same promoter. In this case, the DNA sequences encoding the different proteins are separated by a translation-skipping sequence, which encodes a self-cleaving 2A polypeptide (Halpin et al., 1999) (Zhang, 2013). The 2A polypeptide is a self-cleaving picornavirus polypeptide of 18-22 amino acids. During translation, the ribosome skips the synthesis of one peptide bond at the C-terminus of the 2A peptide, so that the 2A peptide and the downstream peptide are separated (Kim, 2011). As shown in Figure 12, one or more protein coding sequences can be placed in the same expression cassette together with the GSMS sequence. The protein coding sequences are separated by translation-skipping sequences. The GSMS lies toward the 3' end of the expression cassette and is separated from the upstream protein coding sequence by a hairpin-forming sequence. For example, when reverse transcriptase, single-stranded DNA binding protein, and the GSMS are combined in one expression cassette, the RNA molecule transcribed from the cassette contains, from the 5' end to the 3' end, the reverse transcriptase coding sequence, the translation-skipping sequence, the single-stranded DNA binding protein coding sequence, the hairpin-forming sequence, and the GSMS sequence (Figure 12). The translation-skipping sequence is translated into a self-cleaving polypeptide, which yields an independent reverse transcriptase and an independent single-stranded DNA binding protein. The stop codon at the end of the single-stranded DNA binding protein sequence prevents the downstream GSMS RNA from being translated, and the hairpin structure at the 5' end of the GSMS prevents reverse transcription from entering the protein coding region. This arrangement allows multiple proteins or protein subunits to be expressed from the same expression cassette, simplifies cassette construction, and facilitates coordinated expression of the associated proteins.
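The ordering rules for the fusion cassette can be written down as a small consistency check. This is only an illustrative sketch: the element labels and the function name `layout_problems` are assumptions introduced here, and the check covers just the three rules stated in the text (GSMS at the 3' end, a hairpin-forming sequence directly 5' of it, and a translation-skipping sequence between adjacent protein coding sequences).

```python
# Elements of the transcribed RNA, listed 5' -> 3'. Labels are illustrative only.
TRANSCRIPT = [
    ("reverse transcriptase CDS", "protein"),
    ("translation-skipping (2A) sequence", "skip"),
    ("ssDNA-binding protein CDS with stop codon", "protein"),
    ("hairpin-forming sequence", "hairpin"),
    ("GSMS", "gsms"),
]


def layout_problems(elements):
    """Return a list of rule violations; an empty list means the layout is consistent."""
    kinds = [kind for _, kind in elements]
    problems = []
    # Rule 1: exactly one GSMS, and it is the 3'-most element.
    if kinds.count("gsms") != 1 or kinds[-1] != "gsms":
        problems.append("GSMS must be the single 3'-most element")
    # Rule 2: the hairpin-forming sequence sits directly 5' of the GSMS.
    elif len(kinds) < 2 or kinds[-2] != "hairpin":
        problems.append("a hairpin-forming sequence must sit directly 5' of the GSMS")
    # Rule 3: adjacent protein CDSs are separated by a translation-skipping sequence.
    for left, right in zip(kinds, kinds[1:]):
        if left == "protein" and right == "protein":
            problems.append("adjacent protein CDSs need a translation-skipping sequence")
    return problems


print(layout_problems(TRANSCRIPT))
```

Removing the 2A element from the list makes the check report the missing separator, which corresponds to the unwanted larger fusion protein described for Figure 12.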
Expression cassettes are used to generate RNAs or functional proteins. For example, the GSMS expression cassette generates GSMS RNA, which is then reverse-transcribed into ssGSMS cDNA. The reverse transcriptase expression cassette, the ssDNA binding protein expression cassette, and the targeted endonuclease expression cassette are used to express functional proteins. These protein expression cassettes serve to produce proteins or polypeptides with the required function and are not restricted to any specific protein or polypeptide sequence. The reverse transcriptase expression cassette generates a natural or engineered reverse transcriptase that synthesizes complementary DNA from an RNA template. The reverse transcriptase may or may not have RNase H activity (cleavage of the RNA strand in an RNA-DNA hybrid) and DNA-dependent DNA polymerase activity. Reverse transcriptases can be obtained from retroviruses such as Moloney murine leukemia virus (M-MuLV), human immunodeficiency virus (HIV), and avian myeloblastosis virus (AMV). Naturally occurring reverse transcriptases generally have little or no proofreading function and are therefore error-prone during reverse transcription. Engineered reverse transcriptases with good proofreading function are available commercially, such as SuperScript™ reverse transcriptase (Life Technologies, Carlsbad, CA) and AccuScript high-fidelity reverse transcriptase (Agilent Technologies, Santa Clara, CA). A high-fidelity reverse transcriptase is needed when a specific variation must be maintained in the ssGSMS cDNA. A reverse transcriptase with poor proofreading ability can be used to create random mutations in the ssGSMS cDNA.
The ssDNA binding protein expression cassette generates natural or engineered ssDNA binding proteins that bind single-stranded DNA in order to protect and stabilize it and to assist its transport to the nucleus. These ssDNA binding proteins may also be part of the cell's homologous recombination machinery and take part in homologous recombination of the GSMS. ssDNA binding proteins include, for example, replication protein A, RecA, Rad51, DMC1, ICP8, SSB, and any homologous protein with a similar function. The Escherichia coli (E. coli) RecA protein and its homologs are important homologous recombination proteins that take part in homology recognition, homologous DNA pairing, and strand exchange. RecA and its eukaryotic homolog RAD51 can bind single-stranded DNA to form dynamic nucleoprotein filaments, find the homologous region on double-stranded DNA, and then assist homologous DNA strand invasion (Li, 2008). Overexpression of RecA and similar proteins may increase the efficiency of GSMS-directed homologous gene modification. The targeted endonuclease expression cassette generates an endonuclease that recognizes a predetermined specific nucleic acid sequence. A targeted endonuclease is an engineered endonuclease in which a DNA sequence recognition domain is linked to a DNA endonuclease domain. Sequence recognition domains include the transcription activator-like effector (TALE) DNA sequence recognition domain, the zinc finger protein DNA sequence recognition domain, and the sequence recognition domain of meganucleases.
In some embodiments, an siRNA expression cassette is used to generate a small interfering RNA (siRNA) that causes degradation of the RNA transcribed from the target gene. The siRNA is designed to recognize a sequence outside the GSMS homology region, so that it degrades the target gene transcript and not the GSMS RNA. Degradation of the target gene RNA can send a signal to the nucleus that maintains transcriptional activity of the target gene, which keeps the target gene sequence in an unwound or partially unwound state. This facilitates hybridization between the GSMS and the homologous sequence at the target site. An siRNA expression cassette can be built in several ways. For example, an RNA polymerase III promoter can drive a short hairpin siRNA, with transcription terminated by a stretch of guanylate residues and without the need for a polyadenylation signal (Sui et al., 2002; Brummelkamp et al., 2002). siRNA expression plasmids and viral vectors are available commercially from sources such as Life Technologies, Gene Script, and BMC Biotechnology.
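The requirement that the siRNA recognize a sequence outside the GSMS homology region amounts to an interval check on transcript coordinates. The sketch below assumes 0-based, half-open coordinates on the target gene transcript; the function names and the example coordinates are hypothetical.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) share a base."""
    return a_start < b_end and b_start < a_end


def sirna_spares_gsms(sirna_site, gsms_region):
    """True if the siRNA target site lies entirely outside the GSMS homology region.

    Both arguments are (start, end) tuples in transcript coordinates.
    """
    return not overlaps(*sirna_site, *gsms_region)


# Hypothetical coordinates: GSMS homology covers bases 100-499 of the transcript.
print(sirna_spares_gsms((900, 921), (100, 500)))  # site well downstream of the region
print(sirna_spares_gsms((480, 501), (100, 500)))  # site straddles the region boundary
```

A site that fails this check would also match the GSMS RNA, which is the outcome the design is meant to avoid.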
Expression cassettes are preferably DNA but can also be RNA. The protein coding sequence in an RNA expression cassette can be translated directly into protein, and the GSMS RNA in an RNA expression cassette can be used directly as the template for ssGSMS cDNA synthesis. For example, an integration-deficient lentiviral vector (Chick, 2012) containing GSMS and reverse transcriptase expression cassettes can be used to produce ssGSMS cDNA. Integration-deficient retroviral vectors are more suitable because they reduce unwanted integration of the retroviral vector itself. A synthetic RNA containing multiple protein coding sequences and a GSMS sequence driven by the same promoter can be introduced into recipient cells to generate protein and ssGSMS cDNA in a coordinated manner.
Generation of GSMS: The GSMS is first cloned into an expression cassette together with a promoter and a terminator, and is then introduced into recipient cells together with a reverse transcriptase expression cassette located on the same vector or on a different vector (Fig. 1). The GSMS can also be combined with the reverse transcriptase in the same expression cassette (Figure 12). A single-stranded DNA binding protein expression cassette (for example, RecA, Rad51, or RPA) on the same or a different vector is introduced into the recipient cells at the same time. Once inside the recipient cell, the GSMS is first transcribed into RNA, and the GSMS RNA is then reverse-transcribed into ssGSMS cDNA by the co-expressed reverse transcriptase (for example, with tRNA as the primer). The ssGSMS cDNA subsequently binds the co-expressed single-stranded DNA binding protein to form a nucleic acid-protein complex or nucleoprotein filament. The single-stranded DNA binding protein protects the ssGSMS cDNA against degradation, reduces the formation of secondary structure, and assists the ssGSMS cDNA in entering the nucleus and aligning with the homologous genomic sequence (Robyn L. Maher, 2013). Over the following days or weeks, as long as the vector is present, the GSMS is amplified thousands of times through transcription and reverse transcription.
This process delivers a sustained and large supply of ssGSMS cDNA into the nucleus of the recipient cell, where it pairs with the specific genomic region and facilitates the change of a specific gene. The cellular mechanisms of homology-based genetic modification are not fully understood. Possible mechanisms include different types of homologous recombination and DNA repair pathways. One possible mechanism of DNA modification works through the DNA mismatch repair (MMR) pathway. The ssGSMS cDNA first hybridizes with the homologous sequence in the specific region of the recipient cell's genomic DNA. If the ssGSMS cDNA contains variations, whether designed in advance or introduced randomly by an error-prone reverse transcriptase, mismatched base pairs arise when the ssGSMS cDNA aligns with the target sequence. The mismatched base pairs can trigger the gene repair or error correction system of the recipient cell, which sometimes mistakenly uses the ssGSMS cDNA as the template during repair and thereby changes the recipient cell's genomic sequence and introduces the mutation (Fig. 4). The part of the GSMS that is homologous to the target genomic sequence can also take part in different homologous recombination processes, particularly when the target genomic sequence has been damaged by a targeted endonuclease or by natural causes. Homologous recombination generally involves homology search, DNA strand invasion, template-directed DNA synthesis, and the resulting genetic modification of the target sequence. Homologous recombination includes, but is not limited to, the following pathways: double-strand break repair (DSBR), synthesis-dependent strand annealing (SDSA), single-strand annealing (SSA), break-induced replication (BIR), the RecBCD pathway, and the RecF pathway.
GSMS design for different applications: If the system is used to change a preset base pair in the target gene, the GSMS should be fully complementary to the target gene except for the preset nucleotide variation. The variant nucleotide is preferably located at or near the center of the GSMS homologous sequence. A high-fidelity reverse transcriptase should be used to reduce the probability of introducing random, unwanted variations during ssGSMS cDNA synthesis.
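Placing the preset variant at the center of the homologous sequence can be sketched as follows. This is a simplified illustration, not the procedure of the invention: it works on a single strand written 5' to 3', ignores the primer binding sequence and strand orientation, and the function name `centered_variant_homology` and the toy sequence are assumptions.

```python
def centered_variant_homology(target: str, pos: int, new_base: str, flank: int) -> str:
    """Return a window of the target with `new_base` substituted at its center.

    target   -- target sequence (one strand, 5' -> 3')
    pos      -- 0-based index of the base to change
    new_base -- replacement base
    flank    -- number of unchanged bases kept on each side of the variant
    """
    if pos - flank < 0 or pos + flank >= len(target):
        raise ValueError("flanks run past the ends of the target sequence")
    left = target[pos - flank:pos]            # unchanged bases 5' of the variant
    right = target[pos + 1:pos + flank + 1]   # unchanged bases 3' of the variant
    return left + new_base + right


# Toy target; change the G at index 8 to A with 3 matching bases on each side.
print(centered_variant_homology("AAAACCCCGGGGTTTT", 8, "A", 3))  # CCCAGGG
```

In practice the flanks would be far longer than in this toy case, in line with the homology lengths discussed in the definition of the GSMS.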
An attractive feature of the invention is that random mutations can be introduced into the target gene, so that a cell library containing random mutations in the target gene region can be built. For this purpose, a GSMS fully complementary to the target gene can be used together with a reverse transcriptase that lacks proofreading ability to generate ssGSMS cDNA containing random mutations.
The invention can also be used to integrate foreign DNA at a target site in the recipient genome. For example, it can knock out an endogenous gene by inserting a segment of junk DNA, or insert a desired gene at a preset genomic site. For this purpose, the GSMS comprises an exogenous sequence and sequences homologous to the target sequence, with the homologous sequence joined to one end or both ends of the exogenous sequence. The homologous sequences at the two ends are generally 100% complementary to the target sequence and long enough to support homologous recombination (Fig. 5, Fig. 6). The target site can be any site in the genome. For example, to increase the probability of targeted integration, the target sequence can be a recombination hotspot or a direct or inverted repeat.
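The layout of an insertion-type GSMS, an exogenous sequence flanked by homology arms, can be sketched as a string operation. This is a toy illustration under stated assumptions: sequences are written as one strand, the primer binding sequence is left out, and the function name `insertion_gsms`, the arm length, and the example sequences are hypothetical.

```python
def insertion_gsms(target: str, site: int, insert: str, arm: int) -> str:
    """Flank `insert` with the `arm` bases on each side of `site` in `target`.

    site -- 0-based position in the target before which the insert is placed
    arm  -- length of each homology arm
    """
    if site - arm < 0 or site + arm > len(target):
        raise ValueError("homology arms run past the target sequence")
    left_arm = target[site - arm:site]     # matches the target 5' of the insertion site
    right_arm = target[site:site + arm]    # matches the target 3' of the insertion site
    return left_arm + insert + right_arm


# Toy example: insert TAGTAG between the C and G blocks with 4-base arms.
print(insertion_gsms("AAAACCCCGGGGTTTT", 8, "TAGTAG", 4))  # CCCCTAGTAGGGGG
```

Passing an empty arm on one side would correspond to the variant in which the homologous sequence is joined to only one end of the exogenous sequence.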
The GSMS DNA sequence can be designed so that, when transcribed into RNA, it forms a hairpin secondary structure at the 5' end of the GSMS RNA. During reverse transcription, the reverse transcriptase falls off the RNA when it meets this secondary structure, which stops cDNA synthesis. In this way, unwanted nucleotides are not added to the 3' end of the ssGSMS cDNA, which corresponds to the 5' end of the GSMS RNA (Fig. 9).
Primers for ssGSMS cDNA synthesis: The primers commonly used to generate ssGSMS cDNA are natural tRNAs and artificial tRNAs. Other primers, such as the poly(T/A) primer, can also be used. Retroviruses and retrotransposons have long used the 3' end of a tRNA as a primer (Kleiman, 1997) (R Marquet, 1995) (Voytas, 1993) (S.B. Sandmeyer, 1996) (Wakefield, 1995). In general, retroviruses and retrotransposons use 10-18 bases at the 3' end of the tRNA as the initial primer for reverse transcription (Kleiman, 1997). The reverse transcriptases of different retroviruses use different tRNAs as the initial primer. For example, the reverse transcriptases of human immunodeficiency virus (HIV), Moloney murine leukemia virus (M-MuLV), and avian myeloblastosis virus (AMV) use tRNALys3, tRNAPro, and tRNATrp, respectively, as the initial primer for cDNA synthesis (Kleiman, 1997). The primer binding sequence on the GSMS can be designed to be fully complementary to the 3' end of the tRNA required by the selected reverse transcriptase. The GSMS RNA can then use the recipient cell's own tRNA directly to initiate reverse transcription. A natural tRNA expression cassette can be added to the transient expression system containing the GSMS and reverse transcriptase expression cassettes to increase the amount of primer tRNA and thereby the yield of ssGSMS cDNA.
Besides using the recipient cell's own tRNA as the primer, an artificial tRNA that is fully complementary to the GSMS primer binding site can be designed and used (for example, as described by A.H. Lund, 1997). The artificial tRNA expression cassette can be built on the same vector as the GSMS and reverse transcriptase expression cassettes, or on a different vector, and introduced into the recipient cell at the same time. The artificial tRNA serves as the primer that initiates ssGSMS cDNA synthesis (Fig. 9).
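Designing a primer binding sequence that is fully complementary to the 3' end of a tRNA is a reverse-complement operation. The sketch below is a minimal illustration; the function name `primer_binding_site` is an assumption, and the tRNA tail used is a made-up toy sequence (not a real tRNA) that keeps only the CCA 3' terminus common to mature tRNAs.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def primer_binding_site(trna_rna: str, length: int = 18) -> str:
    """RNA sequence (5' -> 3') complementary to the last `length` bases of the tRNA.

    Because paired strands are antiparallel, the primer binding site is the
    reverse complement of the tRNA 3' tail.
    """
    tail = trna_rna.upper()[-length:]
    return "".join(RNA_COMPLEMENT[base] for base in reversed(tail))


def as_dna(rna: str) -> str:
    """Sense-strand DNA that would be transcribed into `rna`."""
    return rna.replace("U", "T")


toy_trna_tail = "ACGUACGUACGUACGUACCA"      # hypothetical sequence ending in CCA
pbs = primer_binding_site(toy_trna_tail)   # 18 nt, begins with UGG opposite the CCA end
print(pbs, as_dna(pbs))
```

The default of 18 bases matches the upper end of the 10-18 base range cited above; a shorter `length` can be passed to model a shorter primer binding site.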
The poly(T/A) primer method uses the poly(A) tail added to the 3' end of an mRNA as the primer. To achieve this, a poly(U) tract must be present at the 3' end of the GSMS RNA; accordingly, a dT tract is added to the 3' end of the sense strand of the GSMS DNA. After a poly(A) tail is added behind the poly(U) region at the 3' end of the GSMS RNA, the poly(A) tail folds back and anneals to the poly(U) tract, where it serves as the primer that initiates cDNA synthesis (Fig. 8). One possible problem is that the added poly(U) may hinder transfer of the RNA to the cytoplasm. If that is the case, a nuclear localization sequence (NLS) can be added to the reverse transcriptase so that it is transported into the nucleus and synthesizes the GSMS cDNA there.
Selection of reverse transcriptase: The system preferably uses a reverse transcriptase with RNase H activity, which releases the ssGSMS cDNA from the RNA/cDNA hybrid molecule. If the goal is to change a specific nucleotide of the target sequence or to insert a foreign gene at the target site, the original GSMS sequence must be strictly maintained during transcription and reverse transcription, and a high-fidelity reverse transcriptase should be used so that unwanted variations are not introduced into the ssGSMS cDNA during reverse transcription. Most natural reverse transcriptases lack 3'-5' exonuclease activity (that is, proofreading function) and are therefore error-prone. Engineered reverse transcriptases with better proofreading function, such as SuperScript reverse transcriptase from Life Technologies, can be used for this purpose.
Reverse transcriptases with poor proofreading ability have their own uses (Battula N, 1974) (Battula N, 1976) (Kunkel TA, 1981). For example, the error rate of the error-prone HIV reverse transcriptase is about 1 per 1500 bases. For a GSMS DNA 2 kb long, each ssGSMS cDNA molecule therefore carries about one random variation on average. An error-prone reverse transcriptase can thus be used to generate random variations in the ssGSMS cDNA during reverse transcription. The ssGSMS cDNA carrying random variations can introduce them into the target gene, producing a cell library with random mutations in the target gene region, and this library can be screened for the desired gene mutations.
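The arithmetic behind the figure of about one random variation per molecule can be checked with a short calculation. The sketch assumes that errors occur independently at each base, so that the number of errors per cDNA is approximately Poisson distributed; that model and the function names are assumptions added here, not statements from the source.

```python
import math


def expected_errors(length_nt: int, error_rate: float) -> float:
    """Mean number of misincorporations per cDNA of the given length."""
    return length_nt * error_rate


def fraction_with_mutation(length_nt: int, error_rate: float) -> float:
    """Fraction of cDNA molecules with at least one error (Poisson approximation)."""
    return 1.0 - math.exp(-expected_errors(length_nt, error_rate))


# Values from the text: error rate of about 1 per 1500 bases, GSMS of 2 kb.
mean = expected_errors(2000, 1 / 1500)
frac = fraction_with_mutation(2000, 1 / 1500)
print(f"mean errors per cDNA: {mean:.2f}")
print(f"fraction of cDNAs with at least one error: {frac:.2f}")
```

Under these assumptions the mean is 2000/1500, or about 1.3 errors per molecule, and roughly three quarters of the cDNA molecules carry at least one variation.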
Applications of ssGSMS cDNA: ssGSMS cDNA can be used to modify the target gene with or without the assistance of a targeted endonuclease. In some embodiments, where no targeted endonuclease is included in the transient expression system, the ssGSMS cDNA can still enter the nucleus and hybridize with the homologous genomic sequence of the recipient cell. If the ssGSMS cDNA contains variations that were designed in advance or introduced by an error-prone reverse transcriptase, mismatched base pairs arise when the ssGSMS cDNA aligns with the target sequence. The mismatched base pairs can trigger the gene repair or error correction system of the recipient cell, such as MMR, which may mistakenly use the ssGSMS cDNA as the template and thereby change the recipient cell's genomic sequence and introduce the mutation (Fig. 4). It is also possible that the ssGSMS cDNA hybridizes with the homologous target sequence and directs the targeted modification, or integrates an exogenous DNA sequence into the recipient cell genome, through homologous recombination.
In some embodiments, the transient expression system includes a targeted endonuclease expression cassette. A targeted endonuclease is an engineered endonuclease in which a customized DNA sequence recognition domain is linked to a non-specific DNA endonuclease domain. The DNA sequence recognition domain can be designed to recognize a specific genomic sequence of 10 to 50 bases, so that the targeted endonuclease cuts the DNA near the target site where the variation is to be introduced. When a large amount of ssGSMS cDNA has entered the nucleus, the DNA strand break at the target sequence can activate the homologous recombination pathways to repair the DNA. The ssGSMS cDNA and the single-stranded DNA binding protein form a nucleic acid-protein complex or nucleoprotein filament that finds the DNA break in the homologous target region by homology search, invades the homologous region (probably with the assistance of single-stranded DNA binding proteins such as RecA and similar proteins), and serves as the template that directs DNA synthesis, thereby introducing the variation at the target site. Combining ssGSMS cDNA with a targeted endonuclease can greatly improve the efficiency of gene modification at a specific site by homologous recombination. The homologous recombination mechanisms that may be involved include, but are not limited to, double-strand break repair (DSBR), synthesis-dependent strand annealing (SDSA), single-strand annealing (SSA), break-induced replication (BIR), the RecBCD pathway, and the RecF pathway.
Examples of targeted endonucleases include transcription activator-like effector nucleases (TALENs), zinc finger nucleases (ZFNs), and meganucleases. These targeted endonucleases can be engineered to recognize and cut a specific double-stranded nucleotide sequence 10 to 15 bases long, which makes them notable tools for targeted gene modification. TALENs are a very good choice: because the DNA recognition domain of a TALEN contains repeated small motifs that each correspond to a single nucleotide, it is possible to combine small motifs into a design that recognizes the required sequence (U.S. Patent Nos. 8,440,431 and 8,440,432). TALENs and ZFNs can be engineered either to cut double-stranded DNA and produce a double-strand break (DSB) or to produce a nick on one strand of the double-stranded DNA (Kim et al., 2012). Kim found that nicking endonucleases give a lower integration efficiency than double-strand break (DSB) endonucleases, but that they also cause lower rates of non-specific integration and chromosome rearrangement. If avoiding non-specific integration and chromosome rearrangement is the more important consideration, as in gene therapy, a targeted nicking endonuclease is the better choice. In some embodiments, polypeptides with high affinity for each other can be linked to the targeted endonuclease (such as a ZFN or TALEN) and to the single-stranded DNA binding protein, respectively. Through the interaction between the affinity polypeptides, the ssGSMS cDNA bound to the single-stranded DNA binding protein can be drawn to the cut or nick at the target site.
Components of the transient expression system for modifying a target gene:
In addition to the basic components described above, other expression cassettes can be added to the system for different purposes (Figure 11). The expression cassettes of the system include, but are not limited to:
Expression cassette 1: promoter/GSMS/terminator
Expression cassette 2: promoter/siRNA used to degrade the RNA of the target gene/terminator
Expression cassette 3: promoter/reverse transcriptase/terminator
Expression cassette 4: promoter/artificial tRNA used as the reverse transcription primer/terminator
Expression cassette 5: promoter/natural tRNA used as the reverse transcription primer/terminator
Expression cassette 6: promoter/fluorescent protein gene/terminator (the fluorescent protein expression cassette is used to monitor transformation and transient expression and to select transformed cells)
Expression cassette 7: other RNAi expression cassettes used to interfere with processes such as RNA transport and degradation; other RNA interference methods, such as siRNA, can also be used
Expression cassette 8: promoter/ZFN or TALEN/terminator
Expression cassette 9: promoter/single-stranded DNA binding protein/terminator
The most basic components of the transient expression system of the invention are the GSMS and reverse transcriptase expression cassettes, which can be on the same expression vector, on different expression vectors, or combined in the same expression cassette. A natural or artificial tRNA expression cassette is preferably added on the same or a different vector to produce the primer tRNA needed to initiate reverse transcription. A single-stranded DNA binding protein expression cassette can also be introduced into the system; the single-stranded DNA binding protein it produces binds the ssGSMS cDNA, stabilizes it, and assists its transport to the nucleus. It is also useful to add an RNAi expression cassette specially designed to degrade the target gene RNA without interfering with the ssGSMS cDNA. Degradation of the target gene RNA may feed back to the nucleus and stimulate transcription of the target gene, which keeps the target gene region in an unwound or relaxed state. This gives the ssGSMS cDNA more opportunity to approach the target gene in the chromosome. In addition, a targeted endonuclease expression cassette can be included in the system to increase the efficiency of homology-based genetic modification. As shown in Figure 12, one or more proteins can also be combined with the GSMS in a single expression cassette.
The expression cassettes described above can be built on one vector, or on different vectors that are introduced into the recipient cell at the same time. The vector of the transient expression system can be any vector suitable for RNA transcription and protein expression. It can be DNA or RNA, circular or linear, viral or non-viral. Methods for introducing vectors into cells are well known to those working in this field. Examples include electroporation, PEG transformation, DEAE-dextran or CaPO4 precipitation, lipofection, nanoparticle transformation, viral infection, and direct injection. The recipient cell can be any suitable prokaryotic or eukaryotic cell, such as a mammalian, plant, insect, blue-green alga, fungal, yeast, or bacterial cell, or a cultured cell. The sequence to be modified or changed can be any genomic sequence of the recipient cell, including exons, introns, promoters, terminators, UTRs, and regulatory sequences or motifs.
This technology may be useful for treating infectious diseases, genetic diseases, and other diseases such as cancer; for developing new traits in crops, forest trees, livestock, and fish; for producing new bacterial, yeast, blue-green alga, and cultured animal cell types for drug and trait screening; and for fermentation, the production of oils, proteins, other pharmaceuticals, and bio-industrial raw materials, and scientific research.
Embodiments
The following embodiments are for illustration only and do not limit or restrict the invention.
Embodiment 1. Restoring GFP fluorescence using GSMS ssDNAs.
In this embodiment, GFP GSMS ssDNA is used to correct a frameshift-deletion GFP mutant gene stably integrated into the yeast genome. A single-base deletion near the 5' end of the complete GFP coding sequence causes a frameshift, and this frameshifted GFP sequence is stably integrated into yeast cells. The GFP mutation in the yeast genome was verified by sequencing, which confirmed that the frameshift deletion abolishes GFP function.
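The effect of a single-base deletion on the reading frame can be illustrated with a minimal sketch. The 18-base sequence below is a hypothetical open reading frame chosen for illustration, not the GFP sequence used in this embodiment, and the helper names are assumptions.

```python
def codons(seq: str) -> list[str]:
    """Split a coding sequence into complete triplets, dropping any trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def delete_base(seq: str, index: int) -> str:
    """Return seq with the base at 0-based position `index` removed."""
    return seq[:index] + seq[index + 1:]


# Hypothetical 18-base open reading frame (not the patent's GFP sequence).
wild_type = "ATGGCTAAAGGTGAAGAA"
# Remove one base near the 5' end, inside the second codon.
mutant = delete_base(wild_type, 4)

wt_codons = codons(wild_type)
mut_codons = codons(mutant)

# Print the two readings side by side; codons downstream of the deletion no longer line up.
for wt, mut in zip(wt_codons, mut_codons):
    print(wt, mut)
```

Only the codon upstream of the deletion is read unchanged; every later triplet is shifted by one base, which is why a single deletion near the 5' end can disable the whole protein.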
The system used in this embodiment comprises a GSMS expression cassette, a reverse transcriptase expression cassette, an ssDNA binding protein expression cassette, and a primer expression cassette.
The GSMS expression cassette is constructed first: a segment of wild-type GFP that spans the base deletion site of the mutant GFP is linked to the strong constitutive yeast promoter TEF1, followed by a primer binding sequence and a yeast transcription terminator. The 400 bp wild-type GFP coding sequence lacks a start codon and therefore cannot be translated into protein. Normally fluorescent GFP protein is produced only after the mutant GFP in the yeast genome has been corrected. The primer binding sequence of the GSMS consists of 18 bases complementary to the 3' end of tRNAPro, which M-MuLV reverse transcriptase uses as its tRNA primer. Because this example requires an accurate wild-type GFP coding sequence to correct the mutant GFP sequence, the high-fidelity reverse transcriptase SuperScript™ II is used to build the reverse transcriptase expression cassette. This cassette contains the strong yeast promoter TEF1, the sequence encoding SuperScript™ II M-MuLV reverse transcriptase, and a yeast transcription termination sequence (yeast ADH1 terminator). The tRNAPro expression cassette consists of a yeast promoter (yeast ADH promoter) linked to a DNA sequence encoding tRNAPro, immediately followed by a transcription termination sequence (yeast ADH1 terminator).
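As a minimal sketch of how an 18-base primer binding sequence could be derived from the 3' end of a primer tRNA, the function below takes the last 18 bases of a tRNA and returns their reverse complement written as DNA. The tRNA string is a hypothetical placeholder, not the actual tRNAPro sequence, and the function names are assumptions made for illustration.

```python
# Map each base to its DNA complement; U (RNA) pairs with A just as T does.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement_dna(seq: str) -> str:
    """Return the reverse complement of a DNA or RNA sequence, written as DNA."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def primer_binding_sequence(trna: str, length: int = 18) -> str:
    """DNA-level sequence complementary to the last `length` bases of a tRNA."""
    if len(trna) < length:
        raise ValueError("tRNA sequence is shorter than the requested primer binding length")
    return reverse_complement_dna(trna[-length:])


# Hypothetical placeholder for a tRNA 3' region (NOT the real tRNAPro sequence).
PLACEHOLDER_TRNA = "GGCUCGUUGGUCUAGGUAUGAUUCUCGCACCA"
pbs = primer_binding_sequence(PLACEHOLDER_TRNA)
print(pbs)
```

Placing this sequence downstream of the GSMS in the cassette means the transcribed RNA carries a site that can pair with the 3' end of the tRNA; a real design would substitute the verified tRNA sequence for the placeholder.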
In the merged expression cassette, a yeast promoter is linked to the RAD51 coding sequence, followed by a sequence that forms a hairpin secondary structure, then the GSMS sequence, and a transcription termination sequence at the 3' end. Merging RAD51 and the GSMS in one cassette allows the RAD51 protein and the ssGSMS cDNA to be produced at nearly the same time and place, which favors formation of the GSMS-RAD51 nucleoprotein filament, the search for homologous GFP sequence, GSMS strand invasion, and correction of the mutant GFP gene.
The above expression cassettes may be on the same yeast vector or on separate vectors. The yeast vector can be circular or linear; to avoid integration of the whole vector into the yeast genome, as would occur with an integrative vector, a linear yeast vector is used in this embodiment. The vectors carrying the SuperScript™ II reverse transcriptase, RAD51, and GSMS expression cassettes are introduced into yeast cells containing the mutant GFP by the LiAc/SS carrier DNA/PEG method. The corrected genomic GFP gene produces fluorescent GFP protein, which is detected by fluorescence microscopy; cells containing the corrected GFP gene are collected by flow cytometry; and the corrected GFP sequence is confirmed by DNA sequencing.
Embodiment 2. Knocking out the GFP gene of recipient cells using single-stranded GSMS DNA and GFP siRNA.
This embodiment demonstrates the use of GSMS ssDNA to knock out a functional GFP gene in animal cells.
The system used in this embodiment comprises a GSMS expression cassette, a reverse transcriptase expression cassette, an ssDNA binding protein expression cassette, a primer expression cassette, and an siRNA expression cassette.
A 293T-GFP cell line is selected that carries a functional GFP cDNA integrated into its nuclear genome and produces green fluorescence. The GSMS is a 300-base GFP coding sequence near the 5' end of the GFP cDNA that contains a frameshift deletion and two premature stop codons (ATG). The primer binding sequence of the GSMS (+) strand is a run of 20 dT at the 3' end, which yields 20 U in the GSMS RNA after transcription. After transcription terminates, a poly(A) tail is added to the 3' end of the GSMS RNA; it self-anneals with the preceding 20 U sequence and serves as the primer for reverse transcription. A GFP sequence that also lies near the 5' end of the GFP cDNA, but outside this 300-base region, is used to design a double-stranded GFP siRNA and is cloned into an expression cassette. The GFP siRNA induces degradation of GFP mRNA, which signals the cell to keep the GFP gene actively transcribed and keeps the genomic locus of the GFP gene in an unwound or partially unwound state. The SuperScript™ II reverse transcriptase expression cassette is constructed as in Embodiment 1.
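The requirement that the siRNA be drawn from outside the GSMS region can be checked computationally. The sketch below is one possible check, under the assumption that "no overlap" is approximated by the absence of any shared 8-base word on either strand; the threshold, the function names, and the toy sequences are all hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def _as_dna(seq: str) -> str:
    """Normalize a DNA or RNA string to upper-case DNA."""
    return seq.upper().replace("U", "T")


def _kmers(seq: str, k: int) -> set[str]:
    """All length-k substrings of seq (empty set if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shares_kmer(sirna_target: str, gsms: str, k: int = 8) -> bool:
    """True if any k-mer of the siRNA target occurs in the GSMS on either strand."""
    a = _as_dna(sirna_target)
    b = _as_dna(gsms)
    b_rc = b.translate(_COMPLEMENT)[::-1]
    return not _kmers(a, k).isdisjoint(_kmers(b, k) | _kmers(b_rc, k))


# Toy sequences (hypothetical, chosen only to make the check easy to follow).
gsms_region = "ACCACACCAACCCACAACCACCAAC"
downstream = "GAGGAAGGAGAGGAGAAGGAG"

inside = shares_kmer(gsms_region[3:24], gsms_region)  # candidate taken from inside the GSMS region
outside = shares_kmer(downstream, gsms_region)        # candidate taken from outside the GSMS region
```

A candidate for which the check returns True would be rejected, since an siRNA matching the GSMS could also target the GSMS RNA itself.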
Chemically synthesized GFP siRNA and the vectors containing the GSMS and SuperScript™ II reverse transcriptase expression cassettes are introduced into 293T-GFP cells using a liposome transfection reagent (Lipofectamine). After transfection, the 293T-GFP cells are cultured normally and screened for cells lacking GFP fluorescence. Non-fluorescent cells are isolated by flow cytometry, and knockout of the GFP gene is then confirmed by PCR and DNA sequencing.
Embodiment 3. Targeted integration of a GFP gene into a homologous recombination hotspot using GSMS ssDNA.
The system used in this embodiment comprises a GSMS expression cassette, a reverse transcriptase expression cassette, an ssDNA binding protein expression cassette, a targeting endonuclease expression cassette, and a primer expression cassette.
This embodiment uses GSMS ssDNA to integrate a segment of foreign DNA into a specific site in the recipient cell, specifically a homologous recombination hotspot region. First, the genomic homologous recombination hotspot to be targeted is identified. The GSMS is designed as a functional GFP coding sequence with its promoter, flanked at each end by a 100-base sequence complementary to the genomic hotspot sequence. As described in Embodiment 1, the primer binding sequence of the GSMS is an 18-base sequence complementary to the 3' end of tRNAPro. A SuperScript™ II reverse transcriptase expression cassette and a merged expression cassette combining the above GSMS with an animal RAD51 protein coding sequence are constructed as described in Embodiment 1. Vectors containing these cassettes are transfected into the HEK293 cell line, and the cells are cultured for one week, until the transient expression vectors are degraded and lost. GFP-fluorescent cell clones are then identified and isolated, and homologous recombination is confirmed by DNA sequencing.
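The layout of such a GSMS (a payload flanked by two 100-base arms, followed by the primer binding sequence) can be sketched as simple string assembly. In this illustration the arms are copied directly from the sequence on either side of a chosen insertion point; the strand orientation, the function name, the toy genome, and the stand-in payload and primer binding sequence are all assumptions, not details from the patent.

```python
import random


def build_gsms(genome: str, site: int, payload: str, pbs: str, arm_len: int = 100) -> str:
    """Assemble left arm + payload + right arm + primer binding sequence, written 5'->3'."""
    if site - arm_len < 0 or site + arm_len > len(genome):
        raise ValueError("not enough flanking sequence for the requested arm length")
    left_arm = genome[site - arm_len:site]
    right_arm = genome[site:site + arm_len]
    return left_arm + payload + right_arm + pbs


# Hypothetical inputs: a random toy "genome" and stand-in payload and PBS strings.
rng = random.Random(0)
toy_genome = "".join(rng.choice("ACGT") for _ in range(500))
toy_payload = "N" * 30  # stands in for promoter + GFP coding sequence
toy_pbs = "N" * 18      # stands in for the 18-base primer binding sequence

gsms = build_gsms(toy_genome, site=250, payload=toy_payload, pbs=toy_pbs)
print(len(gsms))
```

The length check guards against choosing an insertion point too close to the end of the available sequence to supply full-length arms.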
Embodiment 4. Building a cell library containing random mutations in a target genomic sequence using GSMS ssDNA and TALEN.
This embodiment demonstrates a method of building a cell library containing random mutations in a target genomic sequence using GSMS ssDNA, a reverse transcriptase with poor proofreading, and the targeting endonuclease TALEN. The system used in this embodiment comprises a GSMS expression cassette, an expression cassette for a reverse transcriptase with poor proofreading, an ssDNA binding protein expression cassette, and a primer expression cassette.
This method can be used to introduce random mutations into the DNA sequence of a target genomic site, building a random-mutation cell library for screening new traits or desired phenotypes. The target genomic site to be altered can be a contiguous exon sequence, a regulatory region, or a cDNA sequence integrated into the genome. In this embodiment the target genomic site is a normal GFP coding sequence integrated into the genome of the recipient cell. Mutants can be screened easily by their weakened or absent GFP fluorescence.
The GSMS is identical to the target genomic sequence, and the error-prone HIV reverse transcriptase (average error rate of 1 per 1,500 bases) is used to introduce random mutations into the ssGSMS cDNA during reverse transcription. To match tRNA3Lys, the primer used by HIV reverse transcriptase, the GSMS primer binding sequence contains 18 bases complementary to the 3' end of tRNA3Lys. The DNA binding domain of the TALEN is customized to recognize the GFP protein coding region in the target genomic sequence and to make a double-strand cut in that region. Using TALEN to generate double-strand breaks in the recipient cell greatly increases the efficiency of homologous integration and facilitates generation of a cell population in which each cell carries a different variation. The GSMS, HIV reverse transcriptase, RAD51, and TALEN expression cassettes are constructed as described in Embodiment 1. The expression cassettes containing GSMS, HIV reverse transcriptase, and TALEN are introduced into recipient cells by electroporation. Once inside the cell, the GSMS expression cassette expresses GSMS RNA, which is reverse transcribed into ssGSMS cDNA containing random mutations; the ssGSMS cDNA binds RAD51 protein to form a nucleoprotein filament that is transported to the nucleus. The TALEN enzyme, expressed at the same time, enters the nucleus and makes a double-strand break in the target genomic sequence; the GSMS-RAD51 nucleoprotein filament carries out a homology search and anneals to the homologous region of the target genomic sequence, promoting homologous integration of the GSMS containing random mutations. After several days of culture, the cells are analyzed by fluorescence flow cytometry, and viable cells with weakened or no fluorescence are selected. These cells are cultured and their GFP coding regions sequenced to determine the sites of the mutations that cause loss or reduction of GFP function.
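The stated error rate of 1 per 1,500 bases can be turned into a rough estimate of how many mutations each cDNA carries. The sketch below assumes, purely for illustration, that errors are independent single-base substitutions occurring at a uniform rate; the function names and the 750-base template length are hypothetical, since the embodiment does not state the GSMS length.

```python
import random


def expected_mutations(length: int, error_rate: float = 1 / 1500) -> float:
    """Mean number of errors in one reverse-transcribed copy of the given length."""
    return length * error_rate


def prob_at_least_one_mutation(length: int, error_rate: float = 1 / 1500) -> float:
    """Probability that a single copy contains one or more errors."""
    return 1.0 - (1.0 - error_rate) ** length


def mutate(seq: str, error_rate: float, rng: random.Random) -> str:
    """Copy seq, replacing each base with a different base with probability error_rate."""
    out = []
    for base in seq:
        if rng.random() < error_rate:
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)


# Hypothetical 750-base template: about 0.5 errors per copy on average.
template_len = 750
print(expected_mutations(template_len))
print(prob_at_least_one_mutation(template_len))
```

Under these assumptions many copies are error-free and most mutated copies carry a single change, which suits a library in which each cell is meant to carry a different variation.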
Although some details of the invention have been described for purposes of illustration, clarity, and ease of understanding, anyone of ordinary skill in the art will recognize that many changes in form and detail remain within the scope of the invention. All referenced figures, tables, appendices, patents, patent applications, and publications are incorporated by reference.

Claims (16)

1. A system for introducing variation in a target genomic sequence of a recipient cell, the system comprising:
a) a genomic DNA mutation sequence (GSMS) expression cassette, wherein the GSMS expression cassette comprises a polynucleotide sequence homologous to the target genomic sequence and a primer binding sequence, and wherein the GSMS expression cassette produces GSMS RNAs;
b) a reverse transcriptase expression cassette, wherein the reverse transcriptase expression cassette comprises a polynucleotide sequence encoding a reverse transcriptase, the encoded reverse transcriptase may be natural or engineered, and the reverse transcriptase may have good or poor proofreading function; and
wherein the GSMS expression cassette and the reverse transcriptase expression cassette are introduced into the recipient cell on the same expression vector or on different expression vectors; in the recipient cell, the GSMS RNAs are reverse transcribed by the reverse transcriptase into ssGSMS cDNAs, and the ssGSMS cDNAs cause variation in the target genomic sequence.
2. The system according to claim 1, wherein the GSMS comprises a sequence fully complementary to the target genomic sequence that can pair and hybridize with the recipient cell genomic sequence after the GSMS is introduced into the recipient cell; or the GSMS comprises a sequence fully complementary to the target genomic sequence except for differences of one or several nucleotides at preset positions of the GSMS, the nucleotide differences being mismatches, deletions, insertions, or combinations thereof.
3. The system according to claim 1, wherein the GSMS comprises a non-homologous polynucleotide sequence flanked by sequences homologous to the recipient cell genomic sequence; preferably, the non-homologous polynucleotide sequence encodes a selectable marker gene.
4. The system according to claim 1, wherein the GSMS comprises a sequence homologous to a homologous recombination hotspot region of the target genomic sequence; or the GSMS comprises a sequence homologous to a target genomic sequence containing direct repeats and inverted repeats.
5. The system according to any one of claims 1-4, wherein the primer binding sequence of the GSMS is complementary to the 3' end of a natural or artificial tRNA sequence.
6. The system according to claim 5, wherein the 5' end of the GSMS RNA can form a secondary structure, and reverse transcription stops when the reverse transcriptase encounters the secondary structure.
7. The system according to claim 6, further comprising a primer expression cassette, wherein the primer expression cassette produces a natural or artificial primer tRNA, and the primer tRNA binds the primer binding sequence on the GSMS RNAs to initiate reverse transcription.
8. The system according to any one of claims 1-4, 6, or 7, further comprising
an ssDNA binding protein expression cassette, wherein the ssDNA binding protein expression cassette encodes an ssDNA binding protein;
wherein the ssDNA binding protein comprises: RecA, homologs of RecA, Rad51, homologs of Rad51, DMC1, homologs of DMC1, ICP8, homologs of ICP8, SSB, and homologs of SSB.
9. The system according to any one of claims 1-4 or 6-8, further comprising
a targeting endonuclease expression cassette, wherein the targeting endonuclease expression cassette encodes a targeting endonuclease, the targeting endonuclease comprises a DNA sequence recognition domain and a DNA endonuclease domain, and the targeting endonuclease and the GSMS are directed to the same homologous region of the target genomic sequence;
wherein the DNA sequence recognition domain of the targeting endonuclease is selected from the group consisting of a zinc finger protein DNA sequence recognition domain, a transcription activator-like effector DNA sequence recognition domain, and a sequence recognition domain of a meganuclease.
10. The system according to claim 9, wherein the DNA endonuclease domain of the targeting endonuclease can cut a double-stranded polynucleotide sequence to produce a double-strand break, or can cut one strand of a double-stranded DNA to produce a nick in the double-stranded DNA.
11. The system according to any one of claims 1-4 and 6-10, further comprising an siRNA expression cassette, wherein the siRNA expression cassette produces siRNA that causes degradation of mRNA of the target genomic sequence, and the siRNA shares no homologous sequence with the GSMS RNAs.
12. The system according to any one of claims 6-11, wherein the system comprises at least one of a primer expression cassette, an ssDNA binding protein expression cassette, a targeting endonuclease expression cassette, and an siRNA expression cassette;
wherein the GSMS expression cassette, reverse transcriptase expression cassette, primer expression cassette, ssDNA binding protein expression cassette, targeting endonuclease expression cassette, and siRNA expression cassette can be located on different expression vectors or together on the same vector, and at least two of the reverse transcriptase coding sequence, the ssDNA binding protein coding sequence, and the targeting endonuclease coding sequence can be present together in one expression cassette, forming a merged expression cassette;
in the merged expression cassette, a single promoter is operably linked to two or more protein coding sequences, wherein adjacent protein coding sequences are separated by a translational skipping sequence; or a single promoter is operably linked to two or more protein coding sequences followed by the GSMS, wherein adjacent protein coding sequences are separated by a translational skipping sequence, and the GSMS is separated from the upstream protein coding sequence by a sequence encoding a hairpin RNA structure.
13. A method for introducing variation in a target genomic sequence of a recipient cell, comprising the following steps:
a) constructing a GSMS expression cassette, wherein the GSMS contains a polynucleotide sequence homologous to the target genomic sequence and a primer binding sequence, and wherein the GSMS expression cassette produces GSMS RNAs;
b) constructing a reverse transcriptase expression cassette, wherein the reverse transcriptase expression cassette encodes a reverse transcriptase;
c) introducing the GSMS expression cassette and the reverse transcriptase expression cassette into the recipient cell at the same time, wherein the GSMS RNAs are reverse transcribed by the reverse transcriptase into ssGSMS cDNAs, and the ssGSMS cDNAs cause variation in the target genomic sequence; and
d) selecting recipient cells in which the target genomic sequence has been altered.
14. The method according to claim 13, further comprising selecting one or more expression cassettes from the following group and introducing them into the recipient cell at the same time: a primer expression cassette, an ssDNA binding protein expression cassette, a targeting endonuclease expression cassette, and an siRNA expression cassette.
15. A method for obtaining a cell population having random variation in a target genomic sequence, comprising the following steps:
a) constructing a GSMS expression cassette, wherein the GSMS contains a polynucleotide sequence fully homologous to the target genomic sequence and a primer binding sequence, and wherein the GSMS expression cassette produces GSMS RNAs;
b) constructing a reverse transcriptase expression cassette, wherein the reverse transcriptase expression cassette encodes a reverse transcriptase with weak error correction;
c) introducing the GSMS expression cassette and the reverse transcriptase expression cassette into the recipient cell at the same time, wherein the GSMS RNAs are reverse transcribed by the reverse transcriptase with weak error correction into ssGSMS cDNAs containing random variation, and the ssGSMS cDNAs cause the random variation to be integrated into the target genomic sequence; and
d) selecting cells having random variation in the target genomic sequence.
16. The method according to claim 15, further comprising selecting one or more expression cassettes from the following group and introducing them into the recipient cell at the same time: a primer expression cassette, an ssDNA binding protein expression cassette, a targeting endonuclease expression cassette, and an siRNA expression cassette.
CN201410559802.6A 2013-10-19 2014-10-20 System and method for introducing variation in target genomic sequences of recipient cells Pending CN104372016A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US14/058,214 US20140113375A1 (en) 2012-10-21 2013-10-19 Transient Expression And Reverse Transcription Aided Genome Alteration System
US14/058,214 2013-10-19

Publications (1)

Publication Number Publication Date
CN104372016A true CN104372016A (en) 2015-02-25

Family

ID=52569778

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410559802.6A Pending CN104372016A (en) 2013-10-19 2014-10-20 System and method for introducing variation in target genomic sequences of recipient cells

Country Status (1)

Country Link
CN (1) CN104372016A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1316878A (en) * 1998-08-11 2001-10-10 安东尼C·F·佩里 Method of performing transgenesis
WO2003104470A2 (en) * 2002-06-05 2003-12-18 Her Majesty In Right Of Canada As Represented By The Minister Of Agriculture And Agri-Food Canada Retrons for gene targeting
CN102634534A (en) * 2012-03-30 2012-08-15 深圳市中联生物科技开发有限公司 Nucleic acid molecular cloning method and related kit based on homologous recombination

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
EUNJI KIM 等: "Precision genome engineering with programmable DNA-nicking enzymes", 《GENOME RESEARCH》 *
JIN HEE KIM 等: "High Cleavage Efficiency of a 2A Peptide Derived from Porcine Teschovirus-1 in Human Cell Lines, Zebrafish and Mice", 《PLOS ONE》 *
MASAYORI INOUYE 等: "Retrons and Multicopy Single-stranded DNA", 《JOURNAL OF BACTERIOLOGY》 *

Similar Documents

Publication Publication Date Title
Ryan et al. Multiplex engineering of industrial yeast genomes using CRISPRm
US10612043B2 (en) Methods of in vivo engineering of large sequences using multiple CRISPR/cas selections of recombineering events
KR102098915B1 (en) Chimeric genome engineering molecules and methods
CN103068995B (en) Direct cloning
US10287590B2 (en) Methods for generating libraries with co-varying regions of polynuleotides for genome modification
CN104520429A (en) RNA-directed DNA cleavage by the Cas9-crRNA complex
Moyer et al. Generation of a conditional analog-sensitive kinase in human cells using CRISPR/Cas9-mediated genome engineering
WO2018123134A1 (en) Method for editing filamentous fungal genome through direct introduction of genome-editing protein
Bao et al. Accelerated genome engineering through multiplexing
Velázquez et al. Targetron-assisted delivery of exogenous DNA sequences into Pseudomonas putida through CRISPR-aided counterselection
CA3129869A1 (en) Pooled genome editing in microbes
Adjalley et al. CRISPR/Cas9 editing of the Plasmodium falciparum genome
Teng et al. The expanded CRISPR toolbox for constructing microbial cell factories
US20190264216A1 (en) Fungal artificial chromosomes, compositions, methods and uses therefor
Zhang et al. CRISPR/dCas9-mediated gene silencing in two plant fungal pathogens
Yamashita et al. CRISPR toolbox for genome editing in Dictyostelium
US20190359991A1 (en) Method for Producing Mutant Filamentous Fungi
Cooper et al. One-day construction of multiplex arrays to harness natural CRISPR-Cas systems
CN104372016A (en) System and method for introducing variation in target genomic sequences of recipient cells
Ostermeier et al. Construction of hybrid gene libraries involving the circular permutation of DNA
Liao et al. One-step assembly of large CRISPR arrays enables multi-functional targeting and reveals constraints on array design
Zhang et al. Efficient site-specific editing of the C. elegans genome
Gomaa et al. CRISPR/Cas9‐induced disruption of Bodo saltans paraflagellar rod‐2 gene reveals its importance for cell survival
Abbott et al. Evolution at the cutting edge: CRISPR-mediated directed evolution
US20240052370A1 (en) Modulating cellular repair mechanisms for genomic editing

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20150225