CN116171326A - Nucleic acid therapy for genetic disorders - Google Patents

Nucleic acid therapy for genetic disorders

Publication number
CN116171326A
Authority
CN
China
Prior art keywords
virus
composition
nucleic acid
protein
encoding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202180062136.3A
Other languages
Chinese (zh)
Inventor
C·马特兰加
K·A·豪夫
A·J·扎鲁尔
J·R·阿布希尔
K·J·威廉姆斯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Greenlight Biosciences Inc
Original Assignee
Greenlight Biosciences Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Greenlight Biosciences Inc

Classifications

    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K31/00 Medicinal preparations containing organic active ingredients
    • A61K31/70 Carbohydrates; Sugars; Derivatives thereof
    • A61K31/7088 Compounds having three or more nucleosides or nucleotides
    • A61K31/7115 Nucleic acids or oligonucleotides having modified bases, i.e. other than adenine, guanine, cytosine, uracil or thymine
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/005 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from viruses
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K9/00 Medicinal preparations characterised by special physical form
    • A61K9/0012 Galenical forms characterised by the site of application
    • A61K9/0019 Injectable compositions; Intramuscular, intravenous, arterial, subcutaneous administration; Compositions to be administered through the skin in an invasive manner
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K9/00 Medicinal preparations characterised by special physical form
    • A61K9/48 Preparations in capsules, e.g. of gelatin, of chocolate
    • A61K9/50 Microcapsules having a gas, liquid or semi-solid filling; Solid microparticles or pellets surrounded by a distinct coating layer, e.g. coated microspheres, coated drug crystals
    • A61K9/51 Nanocapsules; Nanoparticles
    • A61K9/5107 Excipients; Inactive ingredients
    • A61K9/5123 Organic compounds, e.g. fats, sugars
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C12N15/86 Viral vectors
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00 Medicinal preparations containing peptides
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • A61K48/005 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'active' part of the composition delivered, i.e. the nucleic acid delivered
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • A61K48/005 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'active' part of the composition delivered, i.e. the nucleic acid delivered
    • A61K48/0066 Manipulation of the nucleic acid to modify its expression pattern, e.g. enhance its duration of expression, achieved by the presence of particular introns in the delivered nucleic acid
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2510/00 Genetically modified cells
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2740/00 Reverse transcribing RNA viruses
    • C12N2740/00011 Details
    • C12N2740/10011 Retroviridae
    • C12N2740/16011 Human Immunodeficiency Virus, HIV
    • C12N2740/16041 Use of virus, viral particle or viral elements as a vector
    • C12N2740/16043 Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2740/00 Reverse transcribing RNA viruses
    • C12N2740/00011 Details
    • C12N2740/10011 Retroviridae
    • C12N2740/16011 Human Immunodeficiency Virus, HIV
    • C12N2740/16211 Human Immunodeficiency Virus, HIV concerning HIV gagpol
    • C12N2740/16222 New viral proteins or individual genes, new structural or functional aspects of known viral proteins or genes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2800/00 Nucleic acids vectors
    • C12N2800/24 Vectors characterised by the absence of particular element, e.g. selectable marker, viral origin of replication
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2800/00 Nucleic acids vectors
    • C12N2800/60 Vectors containing traps for, e.g. exons, promoters
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2800/00 Nucleic acids vectors
    • C12N2800/70 Vectors containing special elements for cloning, e.g. topoisomerase, adaptor sites
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2830/00 Vector systems having a special element relevant for transcription
    • C12N2830/20 Vector systems having a special element relevant for transcription transcription of more than one cistron
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2840/00 Vectors comprising a special translation-regulating system
    • C12N2840/20 Vectors comprising a special translation-regulating system translation of more than one cistron
    • C12N2840/203 Vectors comprising a special translation-regulating system translation of more than one cistron having an IRES

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Genetics & Genomics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Medicinal Chemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biomedical Technology (AREA)
  • Biotechnology (AREA)
  • Organic Chemistry (AREA)
  • Public Health (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Veterinary Medicine (AREA)
  • Animal Behavior & Ethology (AREA)
  • Epidemiology (AREA)
  • Biochemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Plant Pathology (AREA)
  • Microbiology (AREA)
  • Virology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Dermatology (AREA)
  • Nanotechnology (AREA)
  • Optics & Photonics (AREA)
  • Immunology (AREA)
  • Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)
  • Pharmaceuticals Containing Other Organic And Inorganic Compounds (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Abstract

Provided herein are retroviral (e.g., lentiviral)-based compositions comprising one or more nucleic acid molecules encoding a retroviral Pol polyprotein component and a nucleic acid molecule comprising one or more transgene sequences flanked by long terminal repeats, for ex vivo or in vivo delivery of one or more transgenes to a target cell. The compositions are useful for delivery to target cells (e.g., hematopoietic stem cells (HSCs), hepatocytes, vision cells, muscle cells, epithelial cells, T cells, etc.) and/or for stable expression of any transgene (e.g., β-globin, factor VIII, RP GTPase regulator (RPGR), dystrophin, cystic fibrosis transmembrane conductance regulator (CFTR), chimeric antigen receptor, etc.), with biological effects that treat and/or ameliorate any condition associated with gene expression (e.g., sickle cell disease, β-thalassemia, hemophilia B, retinitis pigmentosa, Duchenne muscular dystrophy, cystic fibrosis, cancer, etc.).

Description

Nucleic acid therapy for genetic disorders
Background
Examples of current gene therapy methods include, but are not limited to, CRISPR-based systems such as CRISPR/Cas9, retroviruses, lentiviruses, herpesviruses, adenoviruses, and adeno-associated viruses (AAV), as shown in FIG. 2. While these approaches have many advantages, they also have significant limitations. For example, retroviruses and some CRISPR-based systems typically require cell division and homologous recombination to succeed and therefore do not function in non-dividing cells. AAV vectors have transgene size limitations that restrict their capacity and utility. Limitations of adenoviruses include, for example, toxicity associated with the high doses needed because of limited efficacy, and reduced effectiveness of repeated dosing due to immune responses to the virus. Furthermore, producing therapeutic doses of virus can be very expensive.
As shown in FIG. 2, lentiviruses have previously been used as a model for gene therapy programs. Lentiviruses are viruses with ssRNA genomes (e.g., HIV) that infect a broad host range and integrate into the genome, sometimes in the form of latent DNA. In current gene therapy programs, lentiviral particles are produced using a safer split-construct design. For example, third-generation designs use four plasmids: a transfer plasmid carrying the gene of interest that is integrated into the genome, a packaging plasmid providing particle formation, reverse transcription and integration functions, an envelope plasmid providing a broad host range, and a plasmid providing nuclear export of mRNA. A common limitation of lentiviral methods is that virus production requires amplification in cells, which is both slow and expensive, and that the therapeutic efficiency of the virus particles is low.
Thus, there remains a need for improved gene delivery systems over currently available systems.
Summary of the Invention
Disclosed herein are nucleic acid compositions for in vivo gene therapy. The composition comprises a minimal design that provides a nucleic acid molecule encoding a transgene and a nucleic acid molecule expressing a reverse transcriptase and/or an integrase, and can contain an auxiliary nucleic acid molecule that supports reverse transcription and integration of the transgene, as well as a delivery system. The compositions do not require nucleic acid sequences that express the proteins encoded by the retroviral rev and env genes. However, the composition can contain a nucleic acid sequence that expresses one or more proteins encoded by the retroviral rev gene, the retroviral env gene, or both. The invention also provides nucleic acid molecules, methods of use, methods of treatment, nucleic acid templates, cells, and kits. The invention also provides modifications to increase the stability and/or functionality of the nucleic acid, and/or the stability and/or functionality of the encoded protein. These modifications can include mutations in one or more of the intrinsic instability (INS) elements; functional units, such as integrase, having an N-terminal methionine-glycine dipeptide; fusion of functional units (e.g., integrase) with homing proteins; the use of codon optimization for expression in a host cell; and/or the use of priming oligonucleotides. Other modifications are disclosed herein.
Aspects of the invention include one or more nucleic acid molecules encoding one or more Pol polyprotein components flanked by 5' and 3' untranslated regions (UTRs), as well as a nucleic acid molecule comprising one or more reverse transcriptase priming elements located between the 5' long terminal repeat (LTR) and the 3' LTR and one or more promoter sequences operably linked to one or more transgenes. Some aspects of the invention include one or more nucleic acid molecules encoding one or more Pol polyprotein components flanked by 5' and 3' UTRs, and a nucleic acid molecule comprising one or more reverse transcriptase priming elements located between the 5' LTR and the 3' LTR and one or more promoter sequences operably linked to one or more transgenes, but excluding nucleic acid sequences that express proteins encoded by at least one of a retroviral rev gene and a retroviral env gene. Some aspects of the invention include one or more nucleic acid molecules encoding one or more Pol polyprotein components flanked by 5' and 3' UTRs, as well as a nucleic acid molecule comprising one or more reverse transcriptase priming elements located between the 5' LTR and the 3' LTR and one or more promoter sequences operably linked to one or more transgenes, but excluding nucleic acid sequences that express proteins encoded by both the retroviral rev and env genes.
Some aspects of the invention include one or more nucleic acid molecules encoding one or more Pol polyprotein components flanked by 5' and 3' untranslated regions (UTRs), as well as a nucleic acid molecule comprising one or more reverse transcriptase priming elements located between the 5' long terminal repeat (LTR) and the 3' LTR and one or more promoter sequences operably linked to one or more transgenes, wherein the composition is capable of integrating the one or more transgenes into a host genome in the absence of functional retroviral Rev and/or Env proteins. Expression of the Pol polyprotein component does not necessarily require translational slippage from an in-line gag gene. The one or more nucleic acid molecules encoding the Pol polyprotein component can be a nucleic acid molecule comprising a 5' UTR, a nucleic acid sequence encoding the Pol polyprotein, and a 3' UTR. Alternatively, the one or more nucleic acid molecules encoding the Pol polyprotein component may be a nucleic acid molecule comprising a 5' UTR, a nucleic acid sequence encoding at least the Pol polyprotein components reverse transcriptase and integrase, and a 3' UTR. The Pol polyprotein components reverse transcriptase and integrase can be expressed on polycistronic constructs. In some embodiments, the polycistronic construct is a bicistronic construct. In some embodiments, the polycistronic construct is a tricistronic construct. The Pol polyprotein components reverse transcriptase and integrase can be expressed together with one or more polycistronic elements. The polycistronic element may be an inserted internal ribosome entry site (IRES), a 2A peptide coding sequence, or another polycistronic element.
The one or more nucleic acid molecules encoding the Pol polyprotein component may be two nucleic acid molecules, one of which comprises a 5' UTR, a nucleic acid sequence encoding at least the Pol polyprotein component reverse transcriptase, and a 3' UTR, and the other of which comprises a 5' UTR, a nucleic acid sequence encoding at least the Pol polyprotein component integrase, and a 3' UTR. FIGS. 3-5C illustrate exemplary embodiments of certain aspects of the present invention.
Some aspects of the invention provide a composition packaged in a non-viral delivery system, wherein the composition comprises a first RNA molecule and a second RNA molecule, the first RNA molecule comprising a 5' untranslated region (UTR), a nucleic acid sequence encoding a retroviral Pol polyprotein, and a 3' UTR; the second RNA molecule comprises one or more reverse transcriptase priming elements located between the 5' long terminal repeat (LTR) and the 3' LTR and one or more promoter sequences operably linked to one or more transgenes. Some aspects of the invention provide a composition packaged in a non-viral delivery system, wherein the composition comprises a first RNA molecule and a second RNA molecule, the first RNA molecule comprising a 5' UTR, a nucleic acid sequence encoding a retroviral Pol polyprotein, and a 3' UTR; the second RNA molecule comprises one or more reverse transcriptase priming elements located between the 5' LTR and the 3' LTR and one or more promoter sequences operably linked to one or more transgenes, but the composition does not include a nucleic acid sequence that expresses a protein encoded by at least one of a retroviral rev gene and a retroviral env gene. Some aspects of the invention provide a composition packaged in a non-viral delivery system, wherein the composition comprises a first RNA molecule and a second RNA molecule, the first RNA molecule comprising a 5' UTR, a nucleic acid sequence encoding a retroviral Pol polyprotein, and a 3' UTR; the second RNA molecule comprises one or more reverse transcriptase priming elements located between the 5' LTR and the 3' LTR and one or more promoter sequences operably linked to one or more transgenes, but the composition does not include a nucleic acid sequence that expresses a protein encoded by both the retroviral rev and env genes.
Some aspects of the invention provide a composition packaged in a non-viral delivery system, wherein the composition comprises a first RNA molecule and a second RNA molecule, the first RNA molecule comprising a 5' untranslated region (UTR), a nucleic acid sequence encoding a retroviral Pol polyprotein, and a 3' UTR; the second RNA molecule comprises one or more reverse transcriptase priming elements located between the 5' long terminal repeat (LTR) and the 3' LTR and one or more promoter sequences operably linked to one or more transgenes, and the composition is capable of integrating the one or more transgenes into the host genome in the absence of functional retroviral Rev and/or Env proteins.
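The two-molecule layout described above can be written down as a small data model. The following Python sketch is illustrative only: the `Element` class, the element labels (for example "RT_PRIMING"), and the three checking functions are hypothetical names introduced here, not part of the disclosure, and the checks encode only the ordering stated in the text (UTR-flanked Pol coding sequence; LTR-flanked priming element and promoter-transgene unit; no rev or env coding sequence).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    # Hypothetical labels: "5'UTR", "CDS", "3'UTR", "5'LTR",
    # "RT_PRIMING", "PROMOTER", "TRANSGENE", "3'LTR"
    kind: str
    name: str = ""


def is_pol_rna(elements):
    """First RNA: a Pol coding sequence flanked by a 5' UTR and a 3' UTR."""
    if len(elements) < 3:
        return False
    has_pol = any(e.kind == "CDS" and e.name.lower() == "pol"
                  for e in elements[1:-1])
    return (elements[0].kind == "5'UTR"
            and elements[-1].kind == "3'UTR"
            and has_pol)


def is_transfer_rna(elements):
    """Second RNA: priming element and promoter->transgene unit between the LTRs."""
    if len(elements) < 5:
        return False
    if elements[0].kind != "5'LTR" or elements[-1].kind != "3'LTR":
        return False
    inner = [e.kind for e in elements[1:-1]]
    promoter_drives_transgene = any(
        a == "PROMOTER" and b == "TRANSGENE" for a, b in zip(inner, inner[1:])
    )
    return "RT_PRIMING" in inner and promoter_drives_transgene


def encodes_rev_or_env(*molecules):
    """True if any molecule carries a coding sequence named rev or env."""
    return any(e.kind == "CDS" and e.name.lower() in {"rev", "env"}
               for molecule in molecules for e in molecule)


# Example composition (element names are placeholders).
pol_rna = [Element("5'UTR"), Element("CDS", "pol"), Element("3'UTR")]
transfer_rna = [
    Element("5'LTR"),
    Element("RT_PRIMING"),
    Element("PROMOTER", "hCMV"),
    Element("TRANSGENE", "reporter"),
    Element("3'LTR"),
]
print(is_pol_rna(pol_rna), is_transfer_rna(transfer_rna),
      encodes_rev_or_env(pol_rna, transfer_rna))
```

A model like this only captures element order and presence; it says nothing about whether a given construct would function in a cell.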
In some aspects, the one or more nucleic acid molecules encoding the Pol polyprotein component comprise a nucleic acid sequence encoding one or more functional units. In some aspects, the one or more nucleic acid molecules encoding the Pol polyprotein component can also comprise nucleic acid sequences encoding one or more accessory proteins. The one or more accessory proteins may be selected from the group consisting of nucleocapsid (NC), capsid protein (CA), matrix protein (MA), p6, viral infectivity factor (Vif), transactivator of transcription (Tat), negative regulatory factor (Nef), viral protein R (Vpr) and viral protein U (Vpu). In some aspects, the accessory protein may contain one or more mutations. In some aspects, the CA is a mutant CA. In some aspects, the mutant CA has an N74D and/or E45A mutation, the numbering of which corresponds to the wild-type HIV-1 capsid protein. In some aspects, the wild-type HIV-1 strain may be NL4-3. In some aspects, the one or more nucleic acid molecules encoding the Pol polyprotein component may also comprise nucleic acid sequences encoding one or more Gag polyprotein accessory proteins (i.e., accessory proteins that are Gag polyprotein components). One or more accessory proteins may be encoded on the same nucleic acid molecule as the Pol polyprotein component. One or more accessory proteins may also be expressed with one or more polycistronic elements. The polycistronic element may be an inserted IRES, a 2A peptide coding sequence, or another polycistronic element. Alternatively, one or more accessory proteins may be encoded by one or more nucleic acid molecules that are different from the nucleic acid molecules encoding the Pol polyprotein component, wherein each nucleic acid molecule comprises a 5' UTR and a 3' UTR. In some embodiments, the Gag polyprotein accessory protein may be selected from NC, CA, MA and p6. In some embodiments, the Gag polyprotein accessory protein may be encoded as part of a Gag polyprotein.
In other embodiments, the one or more nucleic acid molecules encoding the Pol polyprotein component do not encode a Gag polyprotein. In some embodiments, the accessory protein MA is not encoded by any of the nucleic acid molecules. In some embodiments, the one or more nucleic acid molecules encoding the Pol polyprotein component comprise a gag-pol gene. In some embodiments, the gag-pol gene comprises a frameshift mutation. In some embodiments, the frameshift mutation is a single nucleotide insertion or deletion.
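To make the effect of a single-nucleotide insertion or deletion concrete, the short Python sketch below splits a sequence into codons before and after such an edit. It is a generic illustration of a reading-frame shift; the toy sequence and all function names are assumptions made here and do not come from the disclosure or from any gag-pol sequence.

```python
def split_codons(seq):
    """Split a sequence into complete codons from position 0; trailing bases are dropped."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def insert_base(seq, pos, base):
    """Return seq with one base inserted before index pos."""
    return seq[:pos] + base + seq[pos:]


def delete_base(seq, pos):
    """Return seq with the base at index pos removed."""
    return seq[:pos] + seq[pos + 1:]


def first_changed_codon(reference, mutant):
    """Index of the first codon that differs, or None if the shared codons all match."""
    pairs = zip(split_codons(reference), split_codons(mutant))
    for index, (ref_codon, mut_codon) in enumerate(pairs):
        if ref_codon != mut_codon:
            return index
    return None


reference = "ATGGCCAAATTTGGG"          # toy sequence: ATG GCC AAA TTT GGG
deleted = delete_base(reference, 4)     # remove one base inside the second codon
inserted = insert_base(reference, 3, "T")

print(split_codons(reference))
print(split_codons(deleted))
print(split_codons(inserted))
```

Every codon downstream of the edit is read in a new frame, which is why a single inserted or deleted nucleotide can change which downstream open reading frame is translated.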
Some aspects of the invention provide one or more nucleic acid molecules encoding Pol polyprotein components and/or accessory proteins that are modified to increase the stability of the nucleic acid molecule and/or the stability of the encoded protein. In some embodiments, the one or more nucleic acid molecules encoding the Pol polyprotein component and/or the accessory protein are RNA molecules (e.g., mRNA molecules) and can optionally include one or more modifications to increase the stability of the RNA molecules (e.g., mRNA molecules). In some embodiments, the one or more nucleic acid molecules encoding the Pol polyprotein component and/or accessory protein are DNA molecules and may optionally include one or more modifications to increase the stability of the DNA molecules and/or the corresponding mRNA molecules. In some embodiments, the nucleic acid molecules encoding one or more Pol polyprotein components and/or accessory proteins comprise one or more mutations in one or more internal instability (INS) elements. The one or more INS elements may be selected from TAGAT, ATAGA, AAAAG, ATAAA, TTATA, etc. In some embodiments, codons in the INS element are mutated to alternative codons encoding the same amino acid, such that the functional protein sequence is retained (i.e., silent mutations) while the INS element is altered or removed. In some embodiments, one or more nucleic acid molecules encoding one or more Pol polyprotein components encode an integrase polypeptide comprising an N-terminal methionine-glycine dipeptide. In some embodiments, the nucleic acid molecule encoding one or more Pol polyprotein components encodes a methionine-glycine dipeptide at the 5' end. In some embodiments, the nucleic acid molecule encoding one or more Pol polyprotein components encodes a methionine-glycine dipeptide internally.
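The silent-mutation idea for INS elements can be sketched in a few lines of Python. The five motifs are the ones listed above; everything else is an assumption made for illustration: the function names, the use of the standard genetic code, and the greedy one-codon-at-a-time search, which is not stated to be the method used in the disclosure. The sketch expects an uppercase, in-frame DNA coding sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# INS motifs as listed in the text.
INS_MOTIFS = ("TAGAT", "ATAGA", "AAAAG", "ATAAA", "TTATA")


def translate(cds):
    """Translate an in-frame, uppercase DNA coding sequence."""
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def find_ins(cds):
    """Return (motif, position) for every INS motif occurrence."""
    return [(m, i) for m in INS_MOTIFS
            for i in range(len(cds) - len(m) + 1) if cds.startswith(m, i)]


def synonyms(codon):
    """Other codons encoding the same amino acid."""
    aa = CODON_TABLE[codon]
    return [c for c, a in CODON_TABLE.items() if a == aa and c != codon]


def _first_improving_swap(codons, hits):
    """Try single synonymous swaps in codons overlapping a motif; keep the first that lowers the hit count."""
    for motif, pos in hits:
        for k in range(pos // 3, (pos + len(motif) - 1) // 3 + 1):
            for alt in synonyms(codons[k]):
                trial = codons[:k] + [alt] + codons[k + 1:]
                if len(find_ins("".join(trial))) < len(hits):
                    return trial
    return None


def remove_ins(cds):
    """Greedily remove INS motifs with silent mutations; some motifs may remain."""
    translate(cds)  # validates length and alphabet
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    while True:
        trial = _first_improving_swap(codons, find_ins("".join(codons)))
        if trial is None:
            return "".join(codons)
        codons = trial


example = "ATGATAAAAGGT"   # toy sequence containing ATAAA and AAAAG
cleaned = remove_ins(example)
print(find_ins(example), cleaned, translate(cleaned))
```

Because each accepted swap strictly lowers the number of motif hits, the loop always terminates. A motif that can only be removed by changing several codons at once, or that spans codons with no synonyms (ATG, TGG), is left in place by this simple search.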
Some aspects of the invention provide one or more Pol polyprotein components (e.g., integrases) fused to a homing protein, which can direct the component, e.g., an integrase, to specific sequences in a host cell. In some embodiments, the homing protein recognizes palindromic sequences. In some embodiments, the homing protein recognizes restriction sites. In some embodiments, the homing protein is a restriction enzyme. In some embodiments, the restriction enzyme is I-PpoI. In some embodiments, the homing protein is a nuclease-inactivated restriction enzyme. In some embodiments, the homing protein is I-PpoI N119A.
Some aspects of the invention provide for increased translation in a host cell, e.g., by codon-optimizing one or more nucleic acid molecules of the composition for expression in the host cell. In some aspects, the host cell is a mammalian cell. In some aspects, the host cell is a human cell. In some aspects, the host cell is an avian cell.
Some aspects of the invention provide compositions further comprising priming oligonucleotides. In some embodiments, the priming oligonucleotide is GUCCCUGUUCGGGCGCCA (SEQ ID NO: 18) or GTCCCTGTTCGGGCGCCA (SEQ ID NO: 19). In some embodiments, the priming oligonucleotide may be an engineered sequence that is complementary to a reverse transcriptase priming element.
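The two priming oligonucleotide sequences above differ only in U versus T, and an oligonucleotide that is complementary to a priming element anneals to the reverse complement of its own sequence. The Python sketch below checks both points. The sequences are copied from SEQ ID NO: 18 and SEQ ID NO: 19 as given in the text; the function names and the toy template are hypothetical and used only for illustration.

```python
PRIMER_RNA = "GUCCCUGUUCGGGCGCCA"  # SEQ ID NO: 18, as given in the text
PRIMER_DNA = "GTCCCTGTTCGGGCGCCA"  # SEQ ID NO: 19, as given in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def rna_to_dna(seq):
    """Write an RNA sequence in the DNA alphabet (U -> T)."""
    return seq.upper().replace("U", "T")


def reverse_complement(dna):
    """Reverse complement of a DNA sequence."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def anneal_position(primer, template):
    """Index in template (DNA alphabet) where the primer would base-pair, or -1."""
    site = reverse_complement(rna_to_dna(primer))
    return rna_to_dna(template).find(site)


binding_site = reverse_complement(PRIMER_DNA)
print(binding_site)

# Toy template: arbitrary flanks around the computed binding site.
toy_template = "AAACCC" + binding_site + "GGGTTT"
print(anneal_position(PRIMER_RNA, toy_template))
```

Exact-match search is a simplification; real primer binding tolerates some mismatches and depends on secondary structure, which this sketch does not model.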
Some aspects of the invention provide that the nucleic acid molecule is an RNA molecule or a DNA molecule. For example, the nucleic acid molecule may be a ssDNA molecule, a dsDNA molecule, a ssRNA molecule, or a dsRNA molecule.
Some aspects of the invention provide that a nucleic acid molecule comprising one or more transgenes can comprise two or more transgenes, wherein the transgenes are separated by one or more polycistronic elements. The polycistronic element may be one or more IRES and/or one or more sequences encoding a 2A peptide and/or other polycistronic elements. The nucleic acid molecule may comprise one or more enhancers. For example, the one or more enhancers may include woodchuck hepatitis virus posttranscriptional regulatory elements (WPREs).
Some aspects of the invention provide that the nucleic acid molecule is an RNA molecule and comprises one or more modifications selected from the group consisting of modified ribonucleosides, 5' 7mG cap structures, and poly(rA) tails. In some embodiments, the modified ribonucleoside is pseudouridine or a derivative of pseudouridine (e.g., N1-methylpseudouridine). In some embodiments, the modified ribonucleoside is N6-methyladenosine. In other embodiments, the nucleic acid molecule comprising one or more transgenes may be an RNA molecule and further comprise one or more modifications selected from the group consisting of a 5' 7mG cap structure and a poly(rA) tail.
Some aspects of the invention include Pol polyprotein components, accessory proteins and/or LTRs based on Pol polyprotein components, accessory proteins and/or LTRs from: human immunodeficiency virus (HIV), simian immunodeficiency virus (SIV), visna/maedi virus (VMV), caprine arthritis encephalitis virus (CAEV), equine infectious anemia virus (EIAV), feline immunodeficiency virus (FIV), bovine immunodeficiency virus (BIV), human foamy virus (HFV), murine leukemia virus (MLV), Moloney murine leukemia virus (MoLV), Friend virus (FV), Abelson murine leukemia virus (A-MLV), murine stem cell virus (MSCV), mouse mammary tumor virus (MMTV), Moloney murine sarcoma virus (MoMSV), Rous sarcoma virus (RSV), Fujinami sarcoma virus (FuSV), FBR murine osteosarcoma virus (FBR MSV), avian myelocytomatosis virus 29 (MC29), avian erythroblastosis virus (AEV), human T cell leukemia virus (HTLV), Friend MLV (FrMLV), avian sarcoma virus (ASV), avian leukosis virus, avian myeloblastosis virus, UR2 sarcoma virus, Y73 sarcoma virus, Jaagsiekte sheep retrovirus, langur virus, Mason-Pfizer monkey virus, squirrel monkey retrovirus, avian carcinoma Mill Hill virus 2, bovine leukemia virus, primate T-lymphotropic virus 1, primate T-lymphotropic virus 2, primate T-lymphotropic virus 3, walleye dermal sarcoma virus, walleye epidermal hyperplasia virus 1, walleye epidermal hyperplasia virus 2, chick syncytial virus, feline leukemia virus, ehrlich Lelch Virus, pachyrhizus hilsa Epstein-Barr Virus, Finkel-Biskis-Jinkins murine sarcoma virus, Gardner-Arnstein feline sarcoma virus, gibbon ape leukemia virus, guinea pig type C oncovirus, Hardy-Zuckerman feline sarcoma virus, Harvey murine sarcoma virus, Kirsten murine sarcoma virus, koala retrovirus, Moloney murine sarcoma virus, porcine type C oncovirus, reticuloendotheliosis virus, Snyder-Theilen feline sarcoma virus, Trager duck spleen necrosis virus, viper retrovirus, woolly monkey sarcoma virus, Jembrana disease virus, puma lentivirus, bovine foamy virus, feline foamy virus, brown greater galago prosimian foamy virus, Bornean orangutan simian foamy virus, central chimpanzee simian foamy virus, cynomolgus macaque simian foamy virus, eastern chimpanzee simian foamy virus, grivet simian foamy virus, guenon simian foamy virus, Japanese macaque simian foamy virus, rhesus macaque simian foamy virus, spider monkey simian foamy virus, squirrel monkey simian foamy virus, Taiwanese macaque simian foamy virus, western chimpanzee simian foamy virus, western lowland gorilla simian foamy virus, white-tufted-ear marmoset simian foamy virus, yellow-breasted capuchin simian foamy virus, or any combination thereof.
Other aspects of the invention provide for the encapsulation of nucleic acid molecules in non-viral delivery systems. For example, they may be packaged in lipid nanoparticles.
Aspects of the invention provide that the one or more promoters comprise one or more tissue-specific or cell-specific promoters. For example, the promoter may be specific for bone marrow, hematopoietic stem cells (HSCs), epithelial cells, hepatocytes, visual cells, muscle cells, or T cells. The one or more promoters may include the hCMV promoter.
Some aspects of the invention provide that one or more transgenes encode one or more therapeutic, diagnostic, or reporter proteins, or fragments thereof. The therapeutic protein may be beta globin, cystic fibrosis transmembrane conductance regulator (CFTR), factor VIII, dystrophin, or retinitis pigmentosa GTPase regulator (RPGR). The reporter protein may be a fluorescent protein or a luciferase.
Other aspects of the invention provide for non-viral delivery systems that target specific tissues or cell types. The specific tissue or cell type may be bone marrow, HSCs, epithelial cells, hepatocytes, visual cells, muscle cells, or T cells. The non-viral delivery system may be a lipid nanoparticle, liposome, polypeptide nanoparticle, silica nanoparticle, gold nanoparticle, polymer nanoparticle, dendrimer, cationic nanoemulsion, inorganic carrier (e.g., CaP), polymer, or lipid-mixed carrier.
Also provided are methods of expressing a gene in a subject in need thereof, wherein the method comprises administering to the subject an effective amount of the above-described nucleic acids or a composition thereof (e.g., lipid nanoparticle) resulting in expression of one or more transgenes in the subject. The method results in expression of the gene in the cell by delivering the nucleic acid into the cell. The method can include using the nucleic acids or compositions thereof (e.g., lipid nanoparticles) by delivering them to a subject, thereby expressing one or more transgenes in the subject.
Aspects of the invention provide methods of treating a disease or disorder in a subject in need thereof by delivering a nucleic acid or a composition thereof (e.g., a lipid nanoparticle) to the subject, thereby expressing one or more transgenes in the subject. The disease or disorder to be treated may be a genetic disease or disorder. The disease or disorder may be sickle cell disease, beta-thalassemia, hemophilia B, retinitis pigmentosa, Duchenne muscular dystrophy, cystic fibrosis, or cancer.
Some aspects of the invention provide that one or more transgenes are integrated into the genome of a target cell. For example, one or more transgenes may be stably expressed for at least one week, at least two weeks, at least one month, at least 6 months, at least one year, or for the lifetime of the subject.
Some aspects of the invention provide methods of eliciting an immune response in a subject in need thereof by administering to the subject an effective amount of a nucleic acid or a composition thereof (e.g., a lipid nanoparticle) to express one or more transgenes in the subject. For example, the subject may have cancer, and one or more transgenes encode a tumor antigen. Alternatively, the subject may have or be at risk of having an infectious disease, and one or more transgenes encode an antigen associated with the infectious disease. Embodiments include local or systemic delivery of nucleic acids.
Aspects of the invention also provide one or more nucleic acid templates comprising a 5' UTR, a nucleic acid sequence encoding one or more retroviral Pol polyprotein components, and a 3' UTR, wherein expression of the Pol polyprotein components does not require translational slippage from an in-line gag gene. Other aspects of the invention provide one or more nucleic acid templates comprising a 5' UTR, a nucleic acid sequence encoding a gag-pol gene comprising a frameshift mutation, and a 3' UTR. Other aspects of the invention provide one or more nucleic acid templates comprising a 5' UTR, a nucleic acid sequence encoding a gag-pol gene, and a 3' UTR, wherein the gag-pol gene does not encode a matrix protein. The nucleic acid templates may also comprise a nucleic acid sequence encoding one or more accessory proteins selected from the group consisting of MA, p6, NC, CA, Vif, Tat, Nef, Vpr, and Vpu. In some embodiments, the Pol polyprotein component and/or helper protein of the template is based on a polyprotein and helper protein from: human immunodeficiency virus (HIV), simian immunodeficiency virus (SIV), visna/maedi virus (VMV), caprine arthritis encephalitis virus (CAEV), equine infectious anemia virus (EIAV), feline immunodeficiency virus (FIV), bovine immunodeficiency virus (BIV), human foamy virus (HFV), murine leukemia virus (MLV), Moloney murine leukemia virus (MoLV), Friend virus (FV), Abelson murine leukemia virus (A-MLV), murine stem cell virus (MSCV), mouse mammary tumor virus (MMTV), Moloney murine sarcoma virus (MoMSV), Rous sarcoma virus (RSV), Fujinami sarcoma virus (FuSV), FBR murine osteosarcoma virus (FBR MSV), avian myelocytomatosis virus 29 (MC29), avian erythroblastosis virus (AEV), human T cell leukemia virus (HTLV), Friend MLV (FrMLV), avian sarcoma virus (ASV), avian leukosis virus, avian myeloblastosis virus, UR2 sarcoma virus, Y73 sarcoma virus, Jaagsiekte sheep retrovirus, langur virus, Mason-Pfizer monkey virus, squirrel monkey retrovirus, avian carcinoma Mill Hill virus 2, bovine leukemia virus, primate T-lymphotropic virus 1, primate T-lymphotropic virus 2, primate T-lymphotropic virus 3, walleye dermal sarcoma virus, walleye epidermal hyperplasia virus 1, walleye epidermal hyperplasia virus 2, chick syncytial virus, feline leukemia virus, ehrlich Lelch Virus, pachyrhizus hilsa Epstein-Barr Virus, Finkel-Biskis-Jinkins murine sarcoma virus, Gardner-Arnstein feline sarcoma virus, gibbon ape leukemia virus, guinea pig type C oncovirus, Hardy-Zuckerman feline sarcoma virus, Harvey murine sarcoma virus, Kirsten murine sarcoma virus, koala retrovirus, Moloney murine sarcoma virus, porcine type C oncovirus, reticuloendotheliosis virus, Snyder-Theilen feline sarcoma virus, Trager duck spleen necrosis virus, viper retrovirus, woolly monkey sarcoma virus, Jembrana disease virus, puma lentivirus, bovine foamy virus, feline foamy virus, brown greater galago prosimian foamy virus, Bornean orangutan simian foamy virus, central chimpanzee simian foamy virus, cynomolgus macaque simian foamy virus, eastern chimpanzee simian foamy virus, grivet simian foamy virus, guenon simian foamy virus, Japanese macaque simian foamy virus, rhesus macaque simian foamy virus, spider monkey simian foamy virus, squirrel monkey simian foamy virus, Taiwanese macaque simian foamy virus, western chimpanzee simian foamy virus, western lowland gorilla simian foamy virus, white-tufted-ear marmoset simian foamy virus, yellow-breasted capuchin simian foamy virus, or any combination thereof. Also provided are methods of producing a nucleic acid, e.g., RNA, wherein the method comprises in vitro transcription of a nucleic acid template.
Other aspects of the invention provide kits comprising the above nucleic acids or nucleic acid templates. The nucleic acid or nucleic acid template may be provided in one or more containers.
Other advantages, features, and uses of the invention will be apparent from the detailed description of certain exemplary, non-limiting embodiments; the drawings; the non-limiting working examples; and the claims.
Drawings
FIG. 1 is a schematic diagram summarizing a method of an mRNA-based gene therapy platform.
FIG. 2 is a table depicting examples of current gene therapy methods and major advantages and challenges associated with each method.
FIG. 3 is a schematic diagram outlining an exemplary embodiment of a transgene-encoding transfer nucleic acid design. The RNA-producing element includes an RNA polymerase promoter (e.g., T7 RNA polymerase promoter) and a restriction digestion site for template linearization.
FIG. 4 is a schematic diagram illustrating an exemplary embodiment of a reverse transcription and integration complex design compared to a canonical design. In contrast to the in-line gag/pol arrangement of the canonical design, the Pol polyprotein is produced from a separate construct, thus eliminating the need for translational slippage. In some embodiments, one or more helper retroviral proteins are encoded in cis with the Pol polyprotein. The RNA-producing element includes an RNA polymerase promoter (e.g., T7 RNA polymerase promoter) and, optionally, one or more restriction digestion sites for template linearization.
FIGS. 5A-5C illustrate exemplary embodiments of reverse transcription and integration machinery designs in which the Pol polyprotein components are encoded as separate proteins on one or more nucleic acid molecules. The RNA-producing element includes an RNA polymerase promoter (e.g., T7 RNA polymerase promoter) and, optionally, one or more restriction digestion sites for template linearization. FIG. 5A is a schematic diagram illustrating an exemplary embodiment, wherein protease (PR), reverse transcriptase/RNase H (RT), and integrase (IN) are each encoded on a single nucleic acid construct, and wherein each component is separated by PR self-processing. FIG. 5B is a schematic diagram illustrating an exemplary embodiment, wherein the individual units RT and IN are located on a bicistronic construct. FIG. 5C is a schematic diagram illustrating an exemplary embodiment, wherein the individual units RT and IN are located on separate constructs.
FIG. 6 is another schematic diagram depicting an mRNA-based gene therapy approach. The viral machinery required for reverse transcription and integration (e.g., viral enzymes and/or one or more accessory proteins) and the transgene are provided on separate RNAs. Optionally, a surface-labeled non-viral delivery system is used to deliver the RNA to target cells. Transgene-encoding RNAs are also referred to as lentiviral "genomic" RNAs. Expression of the transgene depends on translation of the viral enzymes, conversion of the lentiviral "genomic" RNA into DNA, and integration of the DNA into the genomic DNA of the target cell, resulting in persistent expression of the transgene.
FIGS. 7A and 7B show that RNA-derived functional units drive the expression of a reporter gene from a transgene-encoding RNA. FIG. 7A shows schematic diagrams of the functional units Gag polyprotein RNA and Pol polyprotein RNA (functional unit mRNA) and of the transgene-encoding RNA ("genomic" RNA). FIG. 7B shows the expression of the reporter transgene upon transfection with functional unit RNA.
FIGS. 8A-8D show that the optimized functional unit RNAs exhibit enhanced expression and localization to target organelles. FIG. 8A is a schematic representation of RNA constructs encoding WT integrase (IN); stabilizing Met-Gly, human codon-optimized (hCO) IN; and stabilizing Met-Gly, hCO ΔINS IN. FIG. 8B shows a capillary electrophoresis pattern of Met-Gly, hCO ΔINS IN mRNA before and after polyadenylation. FIG. 8C shows Western blot analysis of expression from WT IN mRNA; Met-Gly, hCO IN mRNA; and Met-Gly, hCO ΔINS IN mRNA. FIG. 8D shows confocal microscopy images of the subcellular localization of optimized integrase following RNA transfection into 293FT cells.
FIG. 9 illustrates an exemplary embodiment in which the reverse transcriptase is encoded as individual p51 and p66 subunits.
FIG. 10 illustrates an exemplary embodiment in which an integrase is fused to a homing protein for site-directed genomic integration. The integrase may be fused to the homing protein whether it is expressed as part of a Pol polyprotein or as an individual unit.
Detailed Description
According to some aspects of the present disclosure, minimally designed nucleic acid compositions are effective for in vivo gene therapy. This minimal design can overcome some of the limitations of lentiviral methods, including the need to produce virus by amplification in cells, which is both slow and expensive, and the need to handle virus particles, which may be inefficient. These are examples of advantages of the disclosed nucleic acid compositions, and one of skill in the art will recognize other advantages. The disclosure also provides nucleic acid molecules, methods of use, methods of treatment, nucleic acid templates, cells, and kits. The present disclosure also provides modifications to increase the stability and/or function of a nucleic acid and/or the stability and/or function of an encoded protein. These modifications may include mutations in one or more intrinsic instability (INS) elements; functional units, such as integrase, having an N-terminal methionine-glycine dipeptide; fusion of functional units such as integrase with a homing protein; codon optimization for expression in the host cell; and/or the use of priming oligonucleotides. Other modifications are disclosed herein.
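As a rough illustration of one modification listed above, the N-terminal methionine-glycine dipeptide, the following sketch edits a DNA coding sequence so that the encoded protein begins with Met-Gly. It is only a sketch under stated assumptions: the disclosure does not specify how the dipeptide is introduced, and the function name `add_met_gly`, the default glycine codon GGC, and the toy sequences are hypothetical choices made for this example.

```python
# Hypothetical helper, not taken from the disclosure: make a coding sequence
# start with ATG followed by a glycine codon (Met-Gly at the N-terminus).

GLYCINE_CODONS = {"GGT", "GGC", "GGA", "GGG"}  # standard genetic code


def add_met_gly(cds: str, gly_codon: str = "GGC") -> str:
    """Return the coding sequence with an N-terminal Met-Gly dipeptide.

    cds: DNA coding sequence whose length is a multiple of three.
    gly_codon: glycine codon to insert (assumed default: GGC).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if gly_codon not in GLYCINE_CODONS:
        raise ValueError("gly_codon must encode glycine")
    # Drop an existing start codon so it is not duplicated.
    body = cds[3:] if cds.startswith("ATG") else cds
    # If glycine already follows the start position, only restore ATG.
    if body[:3] in GLYCINE_CODONS:
        return "ATG" + body
    return "ATG" + gly_codon + body


if __name__ == "__main__":
    print(add_met_gly("ATGTTTTAA"))
```

The sketch operates on the DNA template rather than on the transcribed RNA, which is one possible design choice; it does not address codon optimization or INS-element mutation.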
In certain aspects, provided herein are nucleic acids and nucleic acid compositions, kits, and methods of use for delivering transgenes such as, but not limited to, beta globin, cystic fibrosis transmembrane conductance regulator (CFTR), factor VIII, dystrophin, or retinitis pigmentosa GTPase regulator (RPGR). In some aspects, the present disclosure provides a retroviral-based (e.g., lentiviral-based) non-viral composition comprising: at least one first nucleic acid molecule encoding a retroviral Pol polyprotein, the retroviral Pol polyprotein being processed to provide protease activity, reverse transcription, and integration of transgene-encoding nucleic acid molecules, and at least one second nucleic acid molecule comprising one or more transgenes flanked by long terminal repeat (LTR) sequences for delivering the one or more transgenes to a target cell. Nucleic acid molecules comprising one or more transgenes may also be referred to herein as "transgene-encoding nucleic acid molecules". In some aspects, the present disclosure provides for encoding components of Pol polyproteins as separate functional units, including but not limited to units encoding reverse transcription and integration activity. These multiple functional units may be encoded on one nucleic acid or on two or more nucleic acids. The protease of the Pol polyprotein may also be encoded on the same construct as, or a different construct from, one or more other Pol polyprotein components, or may be omitted if the desired enzymatic activities (e.g., reverse transcriptase, integrase) are not encoded as a viral polyprotein. In some embodiments, the Pol polyprotein components are encoded on one or more nucleic acids with intervening polycistronic elements. The polycistronic element may be an IRES and/or a 2A sequence and/or another polycistronic element. 
In some embodiments, the compositions of the present disclosure further comprise one or more nucleic acid sequences encoding one or more helper retroviral proteins (helper proteins) that may enhance reverse transcription and integration of the transgene-encoding nucleic acid. As used herein, "helper protein" refers to any retroviral protein, other than the Pol polyprotein components, capable of enhancing reverse transcription and/or integration of a transgene-encoding nucleic acid. The helper proteins may be structural proteins including, but not limited to, MA, CA, NC, and Env, or non-structural proteins including, but not limited to, Tat, Rev, Nef, Vpr, Vpu, and Vif. The nucleic acid sequence encoding one or more helper retroviral proteins may be located on one or more nucleic acids encoding the Pol polyprotein components or may be located on one or more nucleic acids not encoding the Pol polyprotein components. In some embodiments, the stability of the mRNA is increased by, for example, introducing mutations in one or more intrinsic instability (INS) elements. In some embodiments, the stability of the translated protein is increased by, for example, incorporating an N-terminal methionine-glycine dipeptide. In some embodiments, codon optimization for expression in a host cell (e.g., codon optimization for expression in a human cell) is used to increase the translation efficiency of the mRNA. In some embodiments, a protein of the disclosure, e.g., an integrase, is fused to a homing protein. In some embodiments, the composition comprises a priming oligonucleotide. The nucleic acids and compositions of the present disclosure can be used to express one or more transgenes ex vivo or in vivo in a cell, tissue, organ, or subject, or to produce recombinant proteins in vitro. 
Furthermore, the nucleic acid molecules can be packaged into non-viral delivery systems (e.g., lipid nanoparticles), avoiding the need to generate viral particles, which require amplification in cells and are slow and expensive, and avoiding the risks, e.g., immunogenicity, associated with viral-based methods.
Definitions
General methods of molecular and cellular biochemistry can be found in textbooks such as: Molecular Cloning: A Laboratory Manual, 3rd edition (Sambrook et al., CSH Laboratory Press 2001); Short Protocols in Molecular Biology, 4th edition (Ausubel et al., John Wiley & Sons 1999); Protein Methods (Bollag et al., John Wiley & Sons 1996); Nonviral Vectors for Gene Therapy (Wagner et al., Academic Press 1999); Viral Vectors (Kaplitt & Loewy eds., Academic Press 1995); Immunology Methods Manual (I. Lefkovits ed., Academic Press 1997); and Cell and Tissue Culture: Laboratory Procedures in Biotechnology (Doyle & Griffiths, John Wiley & Sons 1998), the disclosures of which are incorporated herein by reference.
An antibody (used interchangeably in plural form) is an immunoglobulin molecule capable of specifically binding to a target, such as a carbohydrate, polynucleotide, lipid, polypeptide, etc., through at least one antigen recognition site located in the variable region of the immunoglobulin molecule. As used herein, the term "antibody" encompasses not only intact (e.g., full-length) polyclonal or monoclonal antibodies, but also antigen-binding fragments thereof (e.g., Fab', F(ab')2, Fv), single chain (scFv), mutants thereof, fusion proteins comprising an antibody portion, humanized antibodies, chimeric antibodies, bivalent antibodies, nanobodies, linear antibodies, single chain antibodies, as well as any other modified configuration of an immunoglobulin molecule comprising an antigen recognition site having a desired specificity, including glycosylated variants of an antibody, amino acid sequence variants of an antibody, and covalently modified antibodies.
An "expression cassette" or "nucleic acid template" comprises any nucleic acid construct capable of acting as a template for the production of DNA or RNA, including genes/coding sequences of interest as well as non-translated RNA.
"Polyprotein" refers to a polypeptide comprising covalently linked smaller proteins that may occur naturally and is typically cleaved into constituent proteins with different biological functions.
"Gag polyprotein" refers to a precursor polyprotein encoded by the gag (group-specific antigen) gene. The Gag polyprotein is processed during maturation to form matrix protein (MA), capsid protein (CA), spacer peptide 1 (SP1), nucleocapsid protein p7 (NC), spacer peptide 2 (SP2), and p6 protein. The terms "Gag protein" and "Gag polyprotein component" refer to one or more of the proteins encoded by the gag gene, i.e., MA, CA, SP1, NC, SP2, or p6.
"Pol polyprotein" refers to a precursor polyprotein encoded by the pol gene. The Pol polyprotein is processed to form protease (PR), reverse transcriptase/RNase H (RT/RH), and integrase (IN). Each component is separated by self-processing via PR. The terms "Pol protein" and "Pol polyprotein component" refer to one or more of the proteins encoded by the pol gene, i.e., RT/RH, IN, or PR.
The "percent identity" of two nucleic acid sequences may be determined by any method known in the art. In some embodiments, the percent identity of two nucleic acid sequences is determined using the algorithm of Karlin and Altschul, Proc. Natl. Acad. Sci. USA 87:2264-68, 1990, as modified in Karlin and Altschul, Proc. Natl. Acad. Sci. USA 90:5873-77, 1993. This algorithm is incorporated into the NBLAST and XBLAST programs (version 2.0) of Altschul et al., J. Mol. Biol. 215:403-10, 1990. BLAST nucleotide searches can be performed with the NBLAST program, score = 100, wordlength = 12, to obtain guide sequences homologous to the target nucleic acid. Where gaps exist between the two sequences, Gapped BLAST can be used as described in Altschul et al., Nucleic Acids Res. 25(17):3389-3402, 1997. When using the BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used.
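To make the quantity concrete, the sketch below computes a simple percent identity for two sequences that have already been aligned. It is not the Karlin-Altschul algorithm or BLAST; the function name `percent_identity`, the gap character, and the choice to count only columns without gaps in the denominator are assumptions made for this illustration, and other conventions for the denominator exist.

```python
# Hypothetical illustration: percent identity of two pre-aligned sequences.
# This is a plain column-by-column comparison, not BLAST.


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of ungapped alignment columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    print(percent_identity("ACGT", "ACGA"))
```

In practice the alignment itself would come from a tool such as BLAST, and the identity reported by that tool should be preferred over this simplified count.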
The terms "multicistronic" and "polycistronic" are used interchangeably and refer to nucleic acid molecules that individually encode more than one polypeptide within the same nucleic acid molecule. For example, a polycistronic mRNA is an mRNA encoding more than one polypeptide in the same transcript. "Polycistronic element" or "multicistronic element" refers to an element that separates individual polypeptides in a polycistronic nucleic acid to allow independent translation and/or post-translational processing.
The terms "polypeptide" and "protein" are used interchangeably to refer to a polymer of amino acid residues joined by peptide bonds and having, for purposes of this disclosure, a minimum length of 5 amino acids. This definition encompasses full-length proteins and fragments thereof of greater than 5 amino acids. The terms also include polypeptides having co-translational (e.g., signal peptide cleavage) and post-translational modifications, such as disulfide bond formation, glycosylation, acetylation, phosphorylation, proteolytic cleavage (e.g., cleavage by furin or a metalloprotease), and the like. Furthermore, as used herein, "polypeptide" or "protein" refers to a protein that includes modifications to the native sequence, such as deletions, additions, and substitutions (typically conservative in nature, as known to those skilled in the art), so long as the protein retains the desired activity associated with the purpose of the method. These modifications may be deliberate, such as through site-directed mutagenesis, or accidental, such as through mutation of the host producing the protein or errors due to PCR amplification or other recombinant DNA methods.
As used herein, the term "nucleic acid" or "nucleic acid molecule" generally refers to deoxyribonucleic acid (DNA) or ribonucleic acid (RNA). In some embodiments, "nucleic acid" refers to individual nucleic acid residues (e.g., nucleotides and/or nucleosides). In some embodiments, "nucleic acid" refers to an oligonucleotide strand comprising individual nucleic acid residues. As used herein, the terms "oligonucleotide" and "polynucleotide" are used interchangeably to refer to a polymer of nucleotides (e.g., a string of at least two nucleotides). The nucleic acid may be single-stranded or double-stranded. The nucleotide monomers in the nucleic acid molecule can be naturally occurring nucleotides, modified nucleotides, or a combination thereof. In some embodiments, the modified nucleotide comprises a modification of a sugar moiety and/or a pyrimidine or purine base.
The terms "subject" and "patient" are used interchangeably herein. In some embodiments, the subject is a mammal. "Mammal" refers to any animal classified as a mammal, including humans, domestic animals, and farm animals, as well as zoo, sport, or pet animals, such as dogs, horses, cats, cattle, sheep, goats, pigs, camels, and the like. In some embodiments, the mammal is a human. In some embodiments, the subject is an avian.
A "therapeutically effective amount" of a composition of the present disclosure generally refers to an amount sufficient to elicit a desired biological response (e.g., expression of a transgene in a target cell, treatment of a disorder, etc.). As will be appreciated by one of ordinary skill in the art, the effective amount of the agents described herein may vary depending on factors such as the condition being treated, the mode of administration, and the age, body composition, and health of the subject.
The term "transgene" or "transgene sequence" refers to an exogenous nucleotide sequence encoding a protein or RNA product of interest.
The terms "treatment" and "therapy" encompass actions taken, when a subject has a disorder, to reduce the severity of the disorder (or symptoms associated with the disorder) or to delay or slow the progression of the disorder (or symptoms associated with the disorder).
Minimal gene therapy platform
The present disclosure is based in part on the finding that retroviral-derived (e.g., lentiviral-derived) nucleic acid molecules are useful for delivering, integrating, and stably expressing transgenes (e.g., for gene therapy) using a minimal system comprising at least one first nucleic acid encoding a retroviral Pol polyprotein that is processed to provide, among other things, reverse transcription and integration of the transgene-encoding nucleic acid, and at least one second nucleic acid molecule comprising one or more transgenes flanked by LTR sequences. In some embodiments, the components of the Pol polyprotein are encoded as separate units on one or more nucleic acid molecules. In some embodiments, the one or more nucleic acid molecules encoding the retroviral Pol polyprotein components further encode one or more helper retroviral proteins. In some embodiments, the system comprises one or more additional nucleic acid molecules encoding one or more helper retroviral proteins. In some embodiments, the minimal system comprises two nucleic acid molecules, one nucleic acid molecule providing reverse transcriptase and integrase functions (either as a polyprotein processed by a protease or as separately encoded units), and one nucleic acid molecule encoding one or more transgenes. In some embodiments, the minimal system comprises two or more nucleic acid molecules (e.g., 2, 3, 4, or 5 or more). The nucleic acid molecules can be packaged into non-viral delivery systems (e.g., lipid nanoparticles), thereby bypassing the production of viral particles, which requires amplification in cells and is slow and expensive, and avoiding the risks, e.g., immunogenicity, associated with virus-based methods.
The compositions of the present disclosure have several advantages over current gene and cell therapy approaches. Some of the challenges associated with current gene therapy approaches such as CRISPR/Cas9, retroviruses, lentiviruses, herpesviruses, adenoviruses, and adeno-associated viruses (AAV) are shown in FIG. 2. In contrast to prior methods, the compositions of the present disclosure can function in dividing and non-dividing cells, have a high transgene size capacity (e.g., 4-5 kb, 5-6 kb, 6-7 kb, 7-8 kb, 8-9 kb, 9-10 kb, or greater than 4 kb, greater than 5 kb, greater than 6 kb, greater than 7 kb, greater than 8 kb, greater than 9 kb, or greater than 10 kb), and/or do not require the production of expensive viruses and/or viral particles. The compositions of the present disclosure also overcome some of the limitations associated with HIV-1 and third-generation lentiviral vectors. In an HIV-1 vector, all components required for the HIV-1 life cycle are on a single RNA genome. HIV-1 vectors are also replication-competent (i.e., they can replicate after integration), have packaging signals that allow the genome to be packaged into viral particles, have mRNA nuclear export elements for HIV-1 mRNA, and carry the viral machinery in cis. HIV-1 vectors present safety concerns and therefore cannot be used in vivo. Next-generation laboratory-based lentiviruses, on the other hand, are characterized by self-inactivation after the integration step, viral machinery provided in trans on multiple separate constructs, and transgenes with promoter/enhancer regions. However, these require co-transfection of packaging cells with multiple plasmids, resulting in inefficient viral particle production. Partially formed or misfolded viral particles are also highly immunogenic. 
As an improvement over the lentiviral vectors described above, the minimal RNA designs described herein do not require particle formation or packaging and, in some embodiments, are characterized by having only the reverse transcription and integration functions provided in trans on separate constructs. The compositions of the present disclosure overcome challenges associated with current lentiviral methods by requiring only minimal components that provide reverse transcription, integration, and transfer (i.e., transgene-encoding) functions, while self-inactivating and bypassing the production of viral particles. In addition, non-viral delivery systems can be easily functionalized with targeting moieties, simplifying administration, and both nucleic acid synthesis (e.g., RNA synthesis) and delivery system generation can be cell-free, thereby simplifying the manufacturing process.
The compositions described herein comprise the following essential components: one or more nucleic acid molecules encoding retroviral Pol polyprotein components for reverse transcription (e.g., RT/RH) and integration (e.g., IN) of a transgene-encoding nucleic acid, and a nucleic acid molecule comprising one or more transgene sequences flanked by LTR sequences. The Pol polyprotein can be expressed and processed to provide protease activity (e.g., using PR protein), reverse transcriptase activity (e.g., using RT/RH protein), and integration of transgene-encoding nucleic acid molecules (e.g., using IN protein). The Pol polyprotein components can alternatively be expressed as a plurality of separate functional units on one or more nucleic acid molecules. When the components are expressed as separate units, the Pol polyprotein component that provides protease activity may or may not be expressed. The compositions of the present disclosure may optionally further comprise one or more nucleic acid sequences encoding one or more helper retroviral proteins to enhance reverse transcription and integration of the transgene-encoding nucleic acid. One or more helper proteins may be encoded in cis or in trans with one or more Pol polyprotein components. In some embodiments, the nucleic acid molecule encoding the Pol polyprotein further encodes one or more helper retroviral proteins. In some embodiments, the compositions of the present disclosure comprise one or more additional nucleic acid molecules encoding one or more helper retroviral proteins. In some embodiments, the composition comprises two nucleic acid molecules, one nucleic acid molecule providing reverse transcriptase and integrase functions (e.g., without limitation, a Pol polyprotein and/or one or more helper retroviral proteins) and one nucleic acid molecule encoding one or more transgenes. In some embodiments, the composition comprises two or more nucleic acid molecules (e.g., 2, 3, 4, or 5 or more). 
In some embodiments, the composition does not comprise any helper retroviral proteins. The compositions of the present disclosure are packaged in a non-viral delivery system (e.g., lipid nanoparticles) for in vivo or ex vivo delivery.
The nucleic acid molecule may be an RNA molecule or a DNA molecule, genomic DNA and/or cDNA, or a hybrid of RNA and DNA, wherein the nucleic acid contains any combination of deoxyribonucleotides and ribonucleotides (e.g., artificial or natural). The nucleic acid molecule may be single-stranded (ss) or double-stranded (ds), or may contain portions of both single- and double-stranded sequence. In some embodiments, the nucleic acid molecule is a DNA molecule. The DNA molecule may be a ssDNA molecule or a dsDNA molecule (e.g., a linear dsDNA molecule). In some embodiments, the nucleic acid molecule is an RNA molecule. The RNA molecule may be a ssRNA or dsRNA molecule. In some embodiments, the RNA molecule may be a ssRNA molecule. In some embodiments, the nucleic acid molecule may be an mRNA molecule.
In some embodiments, the compositions of the present disclosure comprise one or more RNA molecules encoding retroviral Pol polyprotein components, and an RNA molecule comprising one or more transgene sequences flanked by LTR sequences. In some embodiments, the compositions of the present disclosure comprise two RNA molecules: a first RNA molecule encoding a retroviral Pol polyprotein and a second RNA molecule comprising a transgene sequence flanked by LTR sequences. In some embodiments, the RNA molecule encoding the Pol polyprotein expresses the polyprotein as a canonical unit, wherein each component is separated by protease processing. In some embodiments, the RNA molecules encoding the Pol polyprotein components express them as a plurality of individual units, e.g., from a bicistronic construct. In some embodiments, the RNA molecule encoding the Pol polyprotein further encodes one or more helper retroviral proteins as described herein. In some embodiments, the composition comprises one or more additional RNA molecules encoding one or more helper retroviral proteins. In vivo methods for embodiments in which the nucleic acid molecule is an RNA molecule are summarized in FIGS. 1 and 6.
Reverse transcription and integration functions
One or more nucleic acids of the system encode a retroviral Pol polyprotein that is processed to provide functional units responsible for protease activity, reverse transcription, and integration of one or more transgenes. In some embodiments, a nucleic acid molecule comprising a 5' UTR, a nucleic acid sequence encoding a retroviral Pol polyprotein (retroviral pol gene), and a 3' UTR provides reverse transcription and integration functions. Thus, in some embodiments, all Pol polyprotein components are encoded on one nucleic acid molecule with 5' and 3' UTRs. In some embodiments, the Pol polyprotein components are encoded on two or more nucleic acid molecules each having a 5' UTR and a 3' UTR. In some embodiments, the Pol polyprotein components are encoded on one or more nucleic acid molecules each having 5' and 3' UTRs, and are separated by one or more polycistronic elements (e.g., one or more IRES and/or 2A peptide coding sequences and/or other polycistronic elements). Any other polycistronic element that facilitates translation may also be used. In some embodiments, the nucleic acid molecule does not encode a Pol protease, e.g., wherein the Pol polyprotein components reverse transcriptase and integrase are encoded as separate units on one or more nucleic acid molecules or are separated by an intervening polycistronic element (e.g., IRES and/or 2A peptide coding sequence, and/or other polycistronic element). In further embodiments, the Pol polyprotein component reverse transcriptase can be encoded as separate p51 and p66 subunits (fig. 9). In some embodiments, p51 and p66 are encoded on separate nucleic acid molecules each having a 5' UTR and a 3' UTR. In some embodiments, p51 and p66 are encoded as separate units on the same nucleic acid molecule, separated by a polycistronic element (e.g., an IRES or 2A peptide coding sequence, or other polycistronic element).
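The alternative arrangements described above (one molecule carrying several coding units joined by a polycistronic element, or several molecules each with its own UTRs) can be illustrated with a minimal Python sketch. The function name `layout` and the unit labels are hypothetical and serve only to illustrate the architecture; the sketch models only IRES and 2A linkers and carries no sequence information.

```python
from typing import List


def layout(units: List[str], linker: str = "2A") -> str:
    """Return a 5'->3' schematic for one molecule carrying the given coding units.

    Hypothetical helper: it only draws the order of elements on a molecule.
    """
    if not units:
        raise ValueError("at least one coding unit is required")
    if linker not in ("2A", "IRES"):
        raise ValueError("this sketch only models 2A and IRES linkers")
    # Adjacent coding units are joined by the chosen polycistronic element.
    body = f" -[{linker}]- ".join(units)
    # Every molecule is bracketed by its own 5' and 3' UTRs.
    return f"5'UTR - {body} - 3'UTR"


# One molecule: p66 and p51 as separate units joined by an IRES.
print(layout(["p66", "p51"], linker="IRES"))

# Two molecules: reverse transcriptase (RT) and integrase (IN) encoded separately.
for unit in ("RT", "IN"):
    print(layout([unit]))
```

The same helper covers the single-molecule and multi-molecule embodiments simply by changing how the units are grouped into calls.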
In contrast to canonical designs, in which Pol expression requires ribosomal frameshifting from gag, in some embodiments the expression of Pol proteins can be increased by expressing the Pol polyprotein or individual Pol polyprotein components from a single construct and by using different UTRs for mRNA stability and high levels of translation. In some embodiments, heterologous UTRs are selected to increase mRNA stability and/or translation levels. In some embodiments, the UTR comprises a 3' or 5' sequence from a stable and highly translated mRNA molecule (such as, but not limited to, β-globin, actin, GAPDH, tubulin, histone, or a citric acid cycle enzyme). Engineered or synthetic UTRs (i.e., rationally designed UTRs that do not occur naturally, or mutated UTRs that differ from naturally occurring UTRs) may also be used to enhance stability and/or translation. The sequences of human UTRs are available from GenBank.
In some embodiments, in addition to encoding functional units, the nucleic acid sequence encodes one or more accessory retroviral proteins to enhance reverse transcription and/or integration. In some embodiments, one or more accessory proteins are encoded in cis (i.e., encoded by the same nucleic acid molecule) with one or more Pol polyprotein components. In some embodiments, one or more accessory proteins are encoded in cis with all Pol polyprotein components. In some embodiments, one or more accessory proteins are encoded in cis with the Pol polyprotein. Thus, in some embodiments, all Pol polyprotein components and accessory proteins are encoded by one nucleic acid molecule with 5' and 3' UTRs. In some embodiments, the nucleotide sequences encoding the Pol polyprotein components and/or accessory proteins are separated by one or more polycistronic elements (e.g., one or more IRES and/or 2A peptide coding sequences, and/or other polycistronic elements). In some embodiments, the nucleotide sequence encoding the Pol polyprotein component and/or accessory protein encodes a fusion protein comprising the Pol polyprotein component and/or accessory protein. In some embodiments, one or more accessory proteins are encoded in trans (i.e., encoded by separate nucleic acid molecules) with the Pol polyprotein components. In some embodiments, the Pol polyprotein components and accessory proteins are encoded by two or more (e.g., two, three, four, or five or more) nucleic acid molecules each having 5' and 3' UTRs. In some embodiments, the UTR comprises a 3' or 5' sequence from a stable and highly translated mRNA molecule (including but not limited to globin, actin, GAPDH, tubulin, histone, or a citric acid cycle enzyme).
In some embodiments, the one or more accessory proteins are selected from the group consisting of Gag matrix protein (MA) (p17), nucleocapsid (NC) (p9), capsid protein (CA) (p24), p6, viral infectivity factor (Vif), transactivator of transcription (Tat), negative regulatory factor (Nef), viral protein R (Vpr), and viral protein U (Vpu). In some embodiments, the accessory protein may be a wild-type protein or may be a mutant protein. In some embodiments, the mutant accessory protein is CA N74D or CA E45A. The numbering corresponds to the wild-type HIV-1 CA protein, and the corresponding mutations in other retroviral CA proteins can be readily determined by the skilled person. In some aspects, the wild-type HIV-1 strain may be NL4-3. In some embodiments, the system comprises one accessory protein selected from MA, NC, CA, p6, Vif, Tat, Nef, Vpr and Vpu. In some embodiments, the accessory protein is NC. In some embodiments, the accessory protein is CA. In some embodiments, the compositions of the present disclosure comprise two accessory proteins selected from MA, NC, CA, p6, Vif, Tat, Nef, Vpr and Vpu. In some embodiments, the compositions of the present disclosure comprise three accessory proteins selected from MA, NC, CA, p6, Vif, Tat, Nef, Vpr and Vpu. In some embodiments, the compositions of the present disclosure comprise four, five, six, seven, or eight accessory proteins selected from MA, NC, CA, p6, Vif, Tat, Nef, Vpr and Vpu. In some embodiments, the system comprises MA, NC, CA, p6, Vif, Tat, Nef, Vpr and Vpu.
In some embodiments, one or more accessory proteins are expressed from the Gag polyprotein (p55). Thus, in some embodiments, the nucleic acid molecule comprises a retroviral gag gene. In some embodiments, the nucleic acid molecules of the present disclosure comprise a nucleic acid sequence comprising gag and pol genes in the canonical overlapping gag-pol orientation. In some embodiments, the nucleic acid molecules of the present disclosure comprise a nucleic acid sequence encoding a fusion of Gag and Pol polyproteins. In some embodiments, the nucleic acids of the present disclosure comprise nucleic acid sequences encoding Gag and Pol polyproteins separated by a 2A peptide coding sequence. In some embodiments, the nucleic acid molecules of the present disclosure encode one or more Pol polyprotein components and one or more Gag polyprotein components as separate units. In some embodiments, the individual units are linked by one or more polycistronic elements (e.g., one or more 2A peptide coding sequences and/or IRES and/or other polycistronic elements). In some embodiments, the nucleic acids of the present disclosure comprise a modified gag-pol gene that does not encode a matrix protein (MA). In some embodiments, the nucleic acids of the present disclosure comprise a modified gag-pol gene comprising a frameshift mutation. In some embodiments, the nucleic acids of the present disclosure comprise a modified gag-pol gene comprising a single nucleotide insertion that disrupts the gag-pol frameshift mechanism. In some embodiments, the nucleic acids of the disclosure comprise a gag gene that does not encode MA.
In some embodiments, the nucleic acid molecules or compositions of the present disclosure are capable of integrating one or more transgenes into a host genome in the absence of functional retroviral Rev and/or Env proteins. In some embodiments, the nucleic acid molecules or compositions of the present disclosure are free of nucleic acid sequences that express a protein encoded by at least one of a retroviral rev gene and a retroviral env gene. In some embodiments, the nucleic acid molecules or compositions of the present disclosure are free of nucleic acid sequences that express proteins encoded by both retroviral rev and env genes. In some embodiments, the nucleic acid molecules or compositions of the present disclosure do not comprise any nucleic acid sequences encoding retroviral Gag proteins. In other embodiments, the nucleic acid molecules or compositions of the present disclosure do not comprise a nucleic acid sequence capable of expressing all retroviral Gag proteins. In some embodiments, the nucleic acid molecules or compositions of the present disclosure do not comprise a nucleic acid sequence encoding the following retroviral Gag proteins: MA, SP1, SP2 and p6. In some embodiments, the nucleic acid molecules or compositions of the present disclosure do not comprise a nucleic acid sequence encoding MA.
In some embodiments, the compositions of the present disclosure further comprise a nucleic acid sequence encoding NC. In some embodiments, NC is encoded by a separate nucleic acid molecule comprising a 5' UTR and a 3' UTR. In some embodiments, the one or more nucleic acid molecules encoding the retroviral Pol polyprotein components further comprise a nucleic acid sequence encoding NC in cis. In some embodiments, the nucleic acid molecule encoding the Pol polyprotein encodes NC in cis. In some embodiments, there are no nucleic acid sequences encoding other Gag proteins (including MA, CA, SP1, SP2, and p6). In some embodiments, nucleic acid sequences encoding MA, SP1, SP2, and p6 are not present.
In some embodiments, the compositions of the present disclosure further comprise a nucleic acid sequence encoding CA. In some embodiments, CA is encoded by a separate nucleic acid molecule comprising a 5' UTR and a 3' UTR. In some embodiments, the one or more nucleic acid molecules encoding the retroviral Pol polyprotein components further comprise a nucleic acid sequence encoding CA in cis. In some embodiments, the nucleic acid molecule encoding the Pol polyprotein encodes CA in cis. In some embodiments, there are no nucleic acid sequences encoding other Gag proteins (including MA, NC, SP1, SP2 and p6). In some embodiments, nucleic acid sequences encoding MA, SP1, SP2, and p6 are not present.
In some embodiments, the compositions of the present disclosure further comprise a nucleic acid sequence encoding Vif. In some embodiments, Vif is encoded by a separate nucleic acid molecule comprising a 5' UTR and a 3' UTR. In some embodiments, the one or more nucleic acid molecules encoding the retroviral Pol polyprotein components further comprise a nucleic acid sequence encoding Vif in cis. In some embodiments, the nucleic acid molecule encoding the Pol polyprotein encodes Vif in cis.
In some embodiments, the compositions of the present disclosure further comprise a nucleic acid sequence encoding Tat. In some embodiments, Tat is encoded by a separate nucleic acid molecule comprising a 5' UTR and a 3' UTR. In some embodiments, the one or more nucleic acid molecules encoding the retroviral Pol polyprotein components further comprise a nucleic acid sequence encoding Tat in cis. In some embodiments, the nucleic acid molecule encoding the Pol polyprotein encodes Tat in cis.
In some embodiments, the compositions of the present disclosure further comprise a nucleic acid sequence encoding Nef. In some embodiments, Nef is encoded by a separate nucleic acid molecule comprising a 5' UTR and a 3' UTR. In some embodiments, the one or more nucleic acid molecules encoding the retroviral Pol polyprotein components further comprise a nucleic acid sequence encoding Nef in cis. In some embodiments, the nucleic acid molecule encoding the Pol polyprotein encodes Nef in cis.
In some embodiments, the compositions of the present disclosure further comprise a nucleic acid sequence encoding Vpr. In some embodiments, Vpr is encoded by a separate nucleic acid molecule comprising a 5' UTR and a 3' UTR. In some embodiments, the one or more nucleic acid molecules encoding the retroviral Pol polyprotein components further comprise a nucleic acid sequence encoding Vpr in cis. In some embodiments, the nucleic acid molecule encoding the Pol polyprotein encodes Vpr in cis.
In some embodiments, the compositions of the present disclosure further comprise a nucleic acid sequence encoding Vpu. In some embodiments, Vpu is encoded by a separate nucleic acid molecule comprising a 5' UTR and a 3' UTR. In some embodiments, the one or more nucleic acid molecules encoding the retroviral Pol polyprotein components further comprise a nucleic acid sequence encoding Vpu in cis. In some embodiments, the nucleic acid molecule encoding the Pol polyprotein encodes Vpu in cis.
In some embodiments, the compositions of the present disclosure further comprise a nucleic acid sequence encoding MA. In some embodiments, MA is encoded by a separate nucleic acid molecule comprising a 5' UTR and a 3' UTR. In some embodiments, the one or more nucleic acid molecules encoding the retroviral Pol polyprotein components further comprise a nucleic acid sequence encoding MA in cis. In some embodiments, the nucleic acid molecule encoding the Pol polyprotein encodes MA in cis.
In some embodiments, the compositions of the present disclosure further comprise a nucleic acid sequence encoding p6. In some embodiments, p6 is encoded by a separate nucleic acid molecule comprising a 5' UTR and a 3' UTR. In some embodiments, the one or more nucleic acid molecules encoding the retroviral Pol polyprotein components further comprise a nucleic acid sequence encoding p6 in cis. In some embodiments, the nucleic acid molecule encoding the Pol polyprotein encodes p6 in cis.
In some embodiments, the nucleic acid sequences encoding the Pol polyprotein component and/or the one or more accessory proteins are based on nucleic acid sequences from one or more retroviruses. In some embodiments, the nucleic acid sequences encoding the Pol polyprotein component and/or the one or more accessory proteins are based on nucleic acid sequences from the same retrovirus. In some embodiments, the nucleic acid sequences encoding the Pol polyprotein component and/or the one or more accessory proteins are based on nucleic acid sequences from two or more different retroviruses. In some embodiments, the nucleic acid sequence encoding the Pol polyprotein component and/or the one or more accessory proteins is based on a nucleic acid sequence from: murine leukemia virus (MLV), Moloney murine leukemia virus (MoLV), Friend virus (FV), Abelson murine leukemia virus (A-MLV), murine stem cell virus (MSCV), mouse mammary tumor virus (MMTV), Moloney murine sarcoma virus (MoMSV), Rous sarcoma virus (RSV), Fujinami sarcoma virus (FuSV), FBR murine osteosarcoma virus (FBR MSV), avian myelocytomatosis virus 29 (MC29), avian erythroblastosis virus (AEV), human T-cell leukemia virus (HTLV), Friend MLV (FrMLV), avian sarcoma virus (ASV), avian leukosis virus, avian myeloblastosis virus, UR2 sarcoma virus, Y73 sarcoma virus, Jaagsiekte sheep retrovirus, langur virus, Mason-Pfizer monkey virus, squirrel monkey retrovirus, fungiocarcinoma virus, furthogonal virus, fungiocarcinoma virus, Equi-Pogostemavirus, avian carcinoma Mill Hill virus 2, bovine leukemia virus, primate T-lymphotropic virus 1, primate T-lymphotropic virus 2, primate T-lymphotropic virus 3, walleye dermal sarcoma virus, walleye epidermal hyperplasia virus 1, walleye epidermal hyperplasia virus 2, chick syncytial virus, feline leukemia virus, Finkel-Biskis-Jinkins murine sarcoma virus, Gardner-Arnstein feline sarcoma virus, gibbon ape leukemia virus, guinea pig type-C oncovirus, Hardy-Zuckerman feline sarcoma virus, Harvey murine sarcoma virus, Kirsten murine sarcoma virus, koala retrovirus, Moloney murine sarcoma virus, porcine type-C oncovirus, reticuloendotheliosis virus, Snyder-Theilen feline sarcoma virus, Trager duck spleen necrosis virus, viper retrovirus, woolly monkey sarcoma virus, Jembrana disease virus, puma lentivirus, bovine foamy virus, equine foamy virus, feline foamy virus, brown greater galago prosimian foamy virus, salfollows simian foamy virus, eastern chimpanzee simian foamy virus, grivet simian foamy virus, guenon simian foamy virus, Japanese macaque simian foamy virus, rhesus macaque simian foamy virus, spider monkey simian foamy virus, squirrel monkey simian foamy virus, Taiwanese macaque simian foamy virus, western chimpanzee simian foamy virus, western lowland gorilla simian foamy virus, marmoset simian foamy virus, yellow-breasted capuchin simian foamy virus, or a combination thereof. In some embodiments, the nucleic acid sequence is based on a nucleic acid sequence from HFV. In some embodiments, the nucleic acid sequences encoding the Pol polyprotein component and/or the one or more accessory proteins are based on nucleic acid sequences from one or more lentiviruses. In some embodiments, the nucleic acid sequence encoding the Pol polyprotein component and/or the one or more accessory proteins is based on a nucleic acid sequence from: human immunodeficiency virus (HIV) (e.g., HIV-1, HIV-2), simian immunodeficiency virus (SIV), visna/maedi virus (VMV), caprine arthritis encephalitis virus (CAEV), equine infectious anemia virus (EIAV), feline immunodeficiency virus (FIV), bovine immunodeficiency virus (BIV), Jembrana disease virus, puma lentivirus, or combinations thereof. In some embodiments, the nucleic acid sequence is based on a nucleic acid sequence from HIV-1 (e.g., GenBank accession No. AF033819). The skilled person can obtain nucleotide sequences from the HIV genome.
See, for example, the HIV genome browser (Los Alamos National Laboratory): www.hiv.lanl.gov/content/sequence/genome_browser/browser and "Numbering Positions in HXB2": Korber et al., unknown journal, attached to UCSD and referenced by the LANL database. www.hiv.lanl.gov/content/sequence/HIV/MAP/analysis also provides a deeply annotated resource for identifying genes within HIV.
It is understood that the term "based on" or "derived from" means that a nucleic acid sequence (or amino acid sequence) may comprise one or more modifications relative to a base sequence. Thus, a nucleic acid sequence may comprise one or more modifications (e.g., a deletion of one or more nucleotides, an addition of one or more nucleotides, a substitution of one or more nucleotides, or a combination thereof) relative to the corresponding wild-type nucleic acid sequence. In some embodiments, the nucleic acid sequence encoding the Pol polyprotein component and/or one or more accessory proteins has at least 60% identity (e.g., 70%, 80%, 90% to 100%, 95% to 100%, 98% to 100%, or 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity) with the corresponding wild-type nucleic acid sequence. In some aspects, the wild-type Pol polyprotein is from HIV-1 strain NL4-3.
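As an illustration of the identity thresholds above, percent identity between a modified sequence and its wild-type counterpart can be estimated from a pairwise alignment. The following is a minimal sketch, assuming the two sequences have already been aligned (with `-` marking gaps) and assuming identity is counted over all alignment columns; other conventions for the denominator exist, and the function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Assumption: both inputs are the same length and use '-' for gaps.
    Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    # A column counts as a match only if both residues are identical non-gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Toy example: one substitution in four aligned positions.
print(percent_identity("ACGT", "ACGA"))
```

Under this convention a gap opposite a nucleotide counts as a mismatch, so insertions and deletions lower the reported identity.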
The one or more nucleic acid molecules encoding the Pol polyprotein component and/or the one or more accessory proteins can comprise one or more chemical or biological modifications relative to the naturally occurring nucleic acid molecule. Modifications may enhance nucleic acid or protein stability and/or transcription/translation efficiency and/or reduce immunogenicity. Modifications may include, but are not limited to, modified nucleobases and modified backbones (e.g., phosphoramides, phosphorothioates, phosphorodithioates, O-methylphosphite linkages, and/or peptide nucleic acids).
In some embodiments, one or more nucleic acid molecules encoding the Pol polyprotein component and/or one or more accessory proteins comprise one or more mutations that increase the stability of the corresponding mRNA. In some embodiments, the one or more nucleic acid molecules encoding the Pol polyprotein component and/or the one or more accessory proteins comprise one or more mutations that increase Rev independence. In some embodiments, the nucleic acid sequence comprises one or more mutations in an intrinsic instability (INS) element (e.g., a deletion of one or more nucleotides, an addition of one or more nucleotides, a substitution of one or more nucleotides, or a combination thereof). In some embodiments, codons in the INS element are replaced with alternative codons to preserve the protein sequence while altering or removing the INS element. In some embodiments, the INS element is replaced with codons optimized for expression in humans. In some embodiments, the one or more INS elements are selected from TAGAT, ATAGA, AAAAG, ATAAA and TTATA or other INS elements (e.g., those described by Wolff et al., Nucleic Acids Res. 31(11): 2839-51 (2003)). In some embodiments, the INS element is completely removed. In some embodiments, the INS element is partially removed. In some embodiments, the nucleic acid sequence is codon optimized for expression in humans. These modifications may be present in the nucleotide sequence encoding one or more of reverse transcriptase, integrase, capsid, matrix, nucleocapsid, p6, Vif, Nef, Vpu or Vpr.
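To make the INS-element language concrete, the following minimal sketch scans a coding sequence for the five motifs named above (TAGAT, ATAGA, AAAAG, ATAAA and TTATA). The function name and the toy sequences are hypothetical; the sketch only locates motifs and does not itself choose replacement codons.

```python
from typing import List, Tuple

# Motifs listed in the text; other INS elements are not covered by this sketch.
INS_MOTIFS = ("TAGAT", "ATAGA", "AAAAG", "ATAAA", "TTATA")


def find_ins_motifs(seq: str) -> List[Tuple[int, str]]:
    """Return sorted (0-based start, motif) pairs, including overlapping hits."""
    # Accept RNA or DNA input by mapping U to T.
    seq = seq.upper().replace("U", "T")
    hits = []
    for motif in INS_MOTIFS:
        start = seq.find(motif)
        while start != -1:
            hits.append((start, motif))
            # Step forward by one base so overlapping occurrences are found.
            start = seq.find(motif, start + 1)
    return sorted(hits)


# Toy sequence containing the codons ATA GAT (Ile-Asp).
before = "GGATAGATCC"
# Synonymous replacement ATC GAC still encodes Ile-Asp.
after = "GGATCGACCC"
print(find_ins_motifs(before))
print(find_ins_motifs(after))
```

In the toy example, swapping ATA GAT for the synonymous codons ATC GAC preserves the encoded Ile-Asp dipeptide while removing both overlapping motifs, which is the kind of codon replacement the paragraph describes.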
In some embodiments, one or more nucleic acid molecules encoding the Pol polyprotein component and/or one or more accessory proteins comprise one or more mutations that increase the stability of the respective proteins. In some embodiments, the nucleic acid molecules of the present disclosure encode an integrase polypeptide comprising a stabilizing methionine-glycine dipeptide. In some embodiments, the nucleic acids of the present disclosure encode an integrase polypeptide comprising an N-terminal methionine-glycine dipeptide.
In some embodiments, the one or more nucleic acid molecules encoding the Pol polyprotein component and/or the one or more accessory proteins comprise one or more mutations that increase translational efficiency in the host cell. In some embodiments, the codons of the nucleic acid molecules of the disclosure are optimized for expression in humans.
In some embodiments, the one or more nucleic acid molecules encoding the Pol polyprotein component and/or the one or more accessory proteins comprise one or more modifications selected from the group consisting of: codon optimization for expression in host cells (e.g., human or other mammalian or avian cells), partial or complete removal of INS elements, and modification of the integrase with a stabilizing methionine-glycine dipeptide. In some embodiments, one or more nucleic acid molecules encoding the Pol polyprotein component and/or one or more accessory proteins are codon-optimized for expression in a host cell (e.g., a human or other mammalian or avian cell), have INS elements completely or partially removed, and encode an integrase polypeptide modified with a methionine-glycine dipeptide.
In some embodiments, one or more nucleic acid molecules encoding the Pol polyprotein component encode an integrase fused to a homing protein for site-directed genomic integration (fig. 10). Homing proteins recognize specific DNA sequences, thereby directing the integrase to specific sequences. This approach has the potential to reduce genotoxicity. In some embodiments, the integrase is encoded by a Pol polyprotein. In some embodiments, the integrase is encoded as a separate unit. In some embodiments, the Pol polyprotein is fused to the homing protein. In some embodiments, the homing protein targets a genomic locus of the transgene. In some embodiments, the homing protein is an enzyme that recognizes a restriction site. Enzymes that recognize restriction sites are known in the art. See, e.g., www.neb.com/tools-and-resources/selections-charts/alphabetized-list-of-recogination-features. Non-limiting examples of restriction site-recognizing enzymes include AatII, AbaSI, Acc65I, AccI, AciI, AclI, AcuI, AfeI, AflII, AflIII, AgeI, AhdI, AleI-v2, AluI, AlwI, AlwNI, ApaI, ApaLI, ApoI, AscI, AseI, AsiSI, AvaI, BsoBI, AvaII, AvrII, BaeGI, BaeI, BamHI, BanI, BanII, BbsI, BbvCI, BbvI, BccI, BceAI, BcgI, BciVI, BclI, BfaI, BglI, BglII, BlpI, BmgBI, BmrI, BmtI, BpmI, BpuEI, Bpu10I, BsaAI, BsaBI, BsaHI, BsaJI, BsaWI, BsaXI, BseRI, BseYI, BsgI, BsiEI, BsiHKAI, BsiWI, BslI, BsmAI, BcoDI, BsmBI-v2, BsmFI, BsmI, BspCNI, BspEI, BspHI, Bsp1286I, BspMI, BfuAI, BsrBI, BsrDI, BsrFI-v2, BsrGI, BssHII, BssSI-v2, BstAPI, BstBI, BstEII, BstNI, BstUI, BstXI, BstYI, Bsu36I, BtgI, BtgZI, BtsCI, BtsIMutI, BtsI-v2, Cac8I, ClaI, BspDI, CspCI, CviAII, CviKI-1, CviQI, DdeI, DpnI, DraI, DrdI, EaeI, EarI, EciI, Eco53kI, EcoNI, EcoO109I, EcoP15I, EcoRI, EcoRV, Esp3I, FatI, FauI, Fnu4HI, FokI, FseI, FspEI, FspI, HaeII, HaeIII, HgaI, HhaI, HincII, HindIII, HinfI, HinP1I, HpyCH4IV, Hpy166II, Hpy188III, I-CeuI, Nb.BbvCI, Nb.BsmI, Nb.BsrDI, Nb.BssSI, Nb.BtsI, Nt.AlwI, Nt.BbvCI, Nt.BspQI, Nt.BstNBI, Nt.CviPII, PI-PspI, SapI, SrfI, SspI, StuI, StyD4I, SwaI, TaqI-v2, TfiI, TseI, ApeKI, Tsp45I, TspRI, Tth111I, PflFI, XbaI, XcmI, XhoI, PaeR7I, XmaI, TspMI, XmnI and ZraI. In some embodiments, the homing protein is a homing endonuclease. Homing endonucleases are known in the art, such as those disclosed in Stoddard, B.L., Homing endonucleases from mobile group I introns: discovery to genome engineering, Mobile DNA 5, 7 (2014). Non-limiting examples of homing endonucleases include I-TevI, I-HmuI, I-Bth0305I, I-CreI, I-AniI, I-PpoI and I-Ssp6803I. In some embodiments, the homing protein is I-PpoI. In some embodiments, homing proteins may be modified to remove any unwanted nuclease activity to prevent any genomic damage. In some embodiments, the homing protein is I-PpoI (N119A), or the like. In some embodiments, the integrase polypeptide is modified to remove any unwanted nuclease activity. In some embodiments, the integrase polypeptide comprises a D64V mutation or the like.
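Point substitutions such as D64V or N119A are written as wild-type residue, 1-based position, and replacement residue. The following minimal sketch shows how such a notation can be applied to a protein sequence while checking that the stated wild-type residue is actually present at that position. The function name and the four-residue toy sequence are hypothetical and do not correspond to any real integrase or homing endonuclease sequence.

```python
import re


def apply_substitution(protein: str, mutation: str) -> str:
    """Apply a substitution written like 'D64V' (wild type, 1-based position, new residue)."""
    parsed = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if parsed is None:
        raise ValueError(f"unrecognized mutation format: {mutation!r}")
    wild_type, position, replacement = parsed.group(1), int(parsed.group(2)), parsed.group(3)
    index = position - 1  # convert 1-based residue numbering to a 0-based index
    if index < 0 or index >= len(protein):
        raise ValueError(f"position {position} is outside the sequence")
    if protein[index] != wild_type:
        # Guards against applying a mutation to a sequence with different numbering.
        raise ValueError(
            f"expected {wild_type} at position {position}, found {protein[index]}"
        )
    return protein[:index] + replacement + protein[index + 1:]


# Toy sequence (hypothetical): replace the aspartate at position 3 with valine.
print(apply_substitution("MKDL", "D3V"))
```

The wild-type check matters when a mutation numbered against one reference protein is transferred to a homolog, where the equivalent residue may sit at a different position.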
In certain embodiments, the nucleic acid molecule encoding the Pol polyprotein component and/or the one or more accessory proteins is an RNA molecule (e.g., an ssRNA molecule, such as an mRNA molecule). In some embodiments, an RNA molecule comprising a 5' UTR, a nucleic acid sequence encoding a retroviral Pol polyprotein, and a 3' UTR provides RT and integrase functions. In some embodiments, a nucleic acid molecule comprising a 5' UTR, a nucleic acid sequence encoding a separate Pol polyprotein component, and a 3' UTR provides RT and integrase functions. In some embodiments, all Pol polyprotein components are encoded on one RNA molecule with 5' and 3' UTRs. In some embodiments, the Pol polyprotein components are encoded on two or more RNA molecules each having 5' and 3' UTRs. In some embodiments, the Pol polyprotein components are encoded on one or more RNA molecules each having 5' and 3' UTRs and are separated by one or more polycistronic elements. The one or more polycistronic elements may be IRES and/or 2A peptide coding sequences and/or other polycistronic elements. In some embodiments, the RNA molecules do not encode a Pol polyprotein, e.g., the Pol polyprotein components RT and integrase are encoded on separate RNA molecules or are separated by polycistronic elements such as intervening IRES and/or 2A peptide coding sequences and/or other polycistronic elements. In some embodiments, there are additionally nucleic acid molecules encoding one or more accessory proteins to enhance reverse transcription and/or integration. In some embodiments, one or more accessory proteins are encoded in cis with (i.e., encoded by the same RNA molecule as) one or more Pol polyprotein components. In some embodiments, one or more accessory proteins are encoded in cis with all Pol polyprotein components. In some embodiments, one or more accessory proteins are encoded in cis with the Pol polyprotein. 
Thus, in some embodiments, all Pol polyprotein components and accessory proteins are encoded on one RNA molecule with 5' and 3' UTRs. In some embodiments, the nucleotide sequences encoding the Pol polyprotein components and/or accessory proteins are separated by one or more polycistronic elements, such as IRES and/or 2A peptide coding sequences and/or other polycistronic elements. In some embodiments, one or more accessory proteins are encoded in trans (i.e., encoded by separate nucleic acid molecules) with the Pol polyprotein components. In some embodiments, the Pol polyprotein components and the accessory proteins are encoded by two or more (e.g., two, three, four, or five or more) RNA molecules each having a 5' UTR and a 3' UTR. In some embodiments, the UTR comprises a 3' or 5' sequence from a stable mRNA molecule (including but not limited to globin, actin, GAPDH, tubulin, histone, or a citric acid cycle enzyme).
In some embodiments, the nucleic acid molecules or compositions of the present disclosure are capable of integrating one or more transgenes into a host genome in the absence of functional retroviral Rev and/or Env proteins. In some embodiments, the compositions of the present disclosure are free of nucleic acid sequences that express a protein encoded by at least one of a retroviral rev gene and a retroviral env gene. In some embodiments, the nucleic acid molecules or compositions of the present disclosure are free of nucleic acid sequences that express proteins encoded by both retroviral rev and env genes. In some embodiments, the compositions of the present disclosure do not comprise a nucleic acid sequence encoding the following retroviral Gag proteins: MA, SP1, SP2 and p6. In some embodiments, the compositions of the present disclosure do not comprise any nucleic acid sequences encoding retroviral Gag proteins. In other embodiments, the compositions of the present disclosure do not comprise a nucleic acid sequence capable of expressing all retroviral Gag proteins.
In some embodiments, the RNA molecule encoding the retroviral Pol polyprotein component and/or the one or more accessory proteins comprises one or more modified ribonucleosides. In some embodiments, the modified ribonucleoside is selected from the group consisting of 2-thiouridine, 5-azauridine, pseudouridine, 4-thiouridine, 5-methyluridine, 5-methylpseudouridine, 5-aminouridine, 5-aminopseudouridine, 5-hydroxyuridine, 5-hydroxypseudouridine, 5-methoxyuridine, 5-methoxypseudouridine, 5-ethoxyuridine, 5-ethoxypseudouridine, 5-hydroxymethyl uridine, 5-hydroxymethyl pseudouridine, 5-carboxyuridine, 5-carboxypseudouridine, 5-formyl uridine, 5-formyl pseudouridine, 5-methyl-5-azauridine, 5-amino-5-azauridine, 5-hydroxy-5-azauridine, 5-methylpseudouridine, 5-aminopseudouridine, 5-hydroxy-pseudouridine, 4-thio-5-azauridine, 4-thiopseudouridine, 4-thio-5-methyluridine, 4-hydroxy-5-thiouridine, 4-hydroxy-5-azauridine, 4-methyl-5-amino-5-thiouridine, 4-hydroxy-5-azauridine, 4-thiouridine, 4-amino-hydroxy-5-thiouridine, 4-methylpseudouridine and 5-hydroxy-azauridine, 2-thiocytidine, 5-azacytidine, pseudoisocytidine, N4-methylcytidine, N4-aminocytidine, N4-hydroxycytidine, 5-methylcytidine, 5-aminocytidine, 5-hydroxycytidine, 5-methoxycytidine, 5-ethoxycytidine, 5-hydroxymethylcytidine, 5-carboxycytidine, 5-formylcytidine, 5-methyl-5-azacytidine, 5-amino-5-azacytidine, 5-hydroxy-5-azacytidine, 5-methylisocytidine, 5-aminopseudoisocytidine, 5-hydroxy-pseudoisocytidine, N4-methyl-5-azacytidine, N4-methylpseudoisocytidine 2-thio-5-azacytidine, 2-thio-pseudoisocytidine, 2-thio-N4-methylcytidine, 2-thio-N4-aminocytidine, 2-thio-N4-hydroxycytidine, 2-thio-5-methylcytidine, 2-thio-5-aminocytidine, 2-thio-5-hydroxycytidine, 2-thio-5-methyl-5-azacytidine, 2-thio-5-amino-5-azacytidine, 2-thio-5-hydroxy-5-azacytidine, 2-thio-5-methylpseudoisocytidine, 2-thio-5-aminopseudoisocytidine, 2-thio-5-hydroxy-pseudoisocytidine, 2-thio-N4-methyl-5-azacytidine, 2-thio-N4-methyl pseudoisocytidine, 
N4-methyl-5-methylcytidine, N4-methyl-5-aminocytidine, N4-methyl-5-hydroxycytidine, N4-methyl-5-azacytidine, N4-methyl-5-amino-5-azacytidine, N4-methyl-5-hydroxy-5-azacytidine, N4-methyl-5-methyl pseudoisocytidine, N4-methyl-5-amino pseudoisocytidine, N4-methyl-5-hydroxy pseudoisocytidine, N4-amino-5-azacytidine N4-amino-pseudoisocytosine, N4-amino-5-methylcytidine, N4-amino-5-aminocytidine, N4-amino-5-hydroxycytidine, N4-amino-5-methyl-5-azacytidine, N4-amino-5-azacytidine, N4-amino-5-hydroxy-5-azacytidine, N4-amino-5-methylpseudoisocytidine, N4-amino-5-amino-pseudoisocytidine, N4-amino-5-hydroxy-pseudoisocytidine, N4-hydroxy-5-azacytidine, N4-hydroxy-5-methylcytidine, N4-hydroxy-5-aminocytidine, N4-hydroxy-5-hydroxycytidine, N4-hydroxy-5-methyl-5-azacytidine, N4-hydroxy-5-amino-5-azacytidine, N4-hydroxy-5-azacytidine, N4-hydroxy-5-methylpseudoisocytidine, N4-hydroxy-5-aminopseudoisocytidine, N4-hydroxy-5-hydroxy-pseudoisocytidine, 2-thio-N4-methyl-5-methylcytidine, 2-thio-N4-methyl-5-aminocytidine, 2-thio-N4-methyl-5-hydroxycytidine 2-thio-N4-methyl-5-azacytidine, 2-thio-N4-methyl-5-amino-5-azacytidine, 2-thio-N4-methyl-5-hydroxy-5-azacytidine, 2-thio-N4-methyl-5-methyl pseudoisocytidine, 2-thio-N4-methyl-5-amino pseudoisocytidine, 2-thio-N4-methyl-5-hydroxy pseudoisocytidine, 2-thio-N4-amino-5-azacytidine, 2-thio-N4-amino pseudoisocytidine, 2-thio-N4-amino-5-methyl cytidine, 2-thio-N4-amino-5-aminocytidine, 2-thio-N4-amino-5-hydroxycytidine, 2-thio-N4-amino-5-methyl-5-azacytidine, 2-thio-N4-amino-5-azacytidine, 2-thio-N4-amino-5-hydroxy-5-azacytidine, 2-thio-N4-amino-5-methylpseudoisocytidine, 2-thio-N4-amino-5-aminopseudoisocytidine, 2-thio-N4-amino-5-hydroxy-pseudoisocytidine, 2-thio-N4-hydroxy-5-azacytidine, 2-thio-N4-hydroxy-5-methylcytidine, N4-hydroxy-5-azacytidine, 2-thio-N4-hydroxy-5-hydroxy-cytidine, 2-thio-N4-hydroxy-5-methylcytidine, 2-thio-N4-hydroxy-5-azacytidine, 2-thio-N4-hydroxy-5-azacytidine, 2-thio-N4-hydroxy-5-amino-pseudoisocytosine, 
2-thio-N4-hydroxy-5-hydroxy-pseudoisocytosine, N6-methyladenosine, N6-aminoadenosine, N6-hydroxyadenosine, 7-deazaadenosine, 8-azaadenosine, N6-methyl-7-deazaadenosine, N6-methyl-8-azaadenosine, 7-deaza-8-azaadenosine, N6-methyl-7-deaza-8-azaadenosine, N6-amino-7-deazaadenosine, N6-amino-8-aza-adenosine, N6-amino-7-deaza-8-azaadenosine, N6-hydroxy-7-deazaadenosine, N6-hydroxy-8-deazaadenosine, N6-hydroxy-7-deaza-8-azaadenosine, 6-thioguanosine, 8-aza-guanosine, 6-thio-7-deazaguanosine, 6-thioguanosine, 6-amino-8-deazaguanosine and pseudothioguanosine. In some embodiments, the modified ribonucleoside is pseudouridine or a derivative of pseudouridine. In some embodiments, the derivative of pseudouridine is N1-methyl pseudouridine. In some embodiments, the modified ribonucleoside is N6-methyladenosine.
In some embodiments, the RNA molecule encoding the retroviral Pol polyprotein component and/or one or more accessory proteins comprises one or more modifications, including, but not limited to: inclusion of a 5' end cap (e.g., a 5' 7mG cap structure); inclusion of a poly(rA) tail; alteration of the 3' UTR or the 5' UTR; complexing of the mRNA with an agent (e.g., a protein or complementary nucleic acid molecule); inclusion of elements that alter the structure (e.g., form a secondary structure) of the mRNA molecule; and reduction of the number of C and/or U residues. In some embodiments, the poly(rA) sequence is about 10-500 nucleotides. In some embodiments, the poly(rA) sequence is at least 10, at least 20, at least 30, at least 40, at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 200, at least 300, at least 400, or at least 500 nucleotides. Other chemical and biological modifications to RNA molecules are known in the art.
Nucleic acid templates for Pol polyprotein components and accessory proteins
In one aspect, the present disclosure also provides nucleic acid templates (e.g., recombinant DNA constructs) for producing nucleic acid molecules encoding the Pol polyprotein components and/or one or more accessory proteins described herein. In some embodiments, the nucleic acid templates are used to produce RNA molecules. In some embodiments, the RNA molecules are produced using cell-free methods (e.g., in vitro transcription or cell-free RNA synthesis). See, for example, U.S. Patent Publication Nos. 2017-0292138, 2018-0087045, and 2019-0144489, which are incorporated by reference herein in their entireties. In some embodiments, the nucleic acid templates comprise, in addition to RNA-producing elements (e.g., T7 promoter, digestion site, etc.), a 5' UTR, a nucleic acid sequence encoding one or more Pol polyprotein components, and a 3' UTR. In some embodiments, the nucleic acid template comprises a T7 promoter, a 5' UTR, a nucleic acid sequence encoding a retroviral Pol polyprotein, and a 3' UTR, and optionally, one or more restriction digest sites for template linearization. In some embodiments, the nucleic acid template comprises a T7 promoter, a 5' UTR, a nucleic acid sequence encoding the Pol polyprotein components alone, and a 3' UTR, and optionally, one or more restriction digest sites for template linearization. In some embodiments, two or more nucleic acid templates, each having a T7 promoter and 5' and 3' UTRs, encode Pol polyprotein components. In some embodiments, one or more templates, each having a T7 promoter and 5' and 3' UTRs, encode Pol polyprotein components, wherein the Pol polyprotein components are separated by one or more polycistronic elements. The one or more polycistronic elements may be IRES and/or 2A peptide coding sequences and/or other polycistronic elements. In some embodiments, the nucleic acid template does not encode a Pol polyprotein enzyme.
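By way of illustration only, the element order recited for such a template (T7 promoter, 5' UTR, Pol coding sequence, 3' UTR, and an optional linearization site) can be sketched as a small data structure. The class and function names below are hypothetical, the UTR and coding sequences passed in are placeholders rather than sequences from this disclosure, and the T7 promoter string is the commonly cited minimal T7 promoter sequence, used here as an assumption.

```python
from dataclasses import dataclass

# Commonly cited minimal T7 promoter sequence (assumption; not taken from the disclosure).
T7_PROMOTER = "TAATACGACTCACTATAG"


@dataclass(frozen=True)
class Element:
    """One named segment of a DNA template (hypothetical helper type)."""
    name: str
    sequence: str


def build_template(five_utr, coding, three_utr, linearization_site=""):
    """Return template elements in the order recited in the text.

    All arguments are placeholder DNA strings supplied by the caller.
    """
    elements = [
        Element("T7 promoter", T7_PROMOTER),
        Element("5' UTR", five_utr),
        Element("Pol coding sequence", coding),
        Element("3' UTR", three_utr),
    ]
    # The restriction digest site for linearization is optional.
    if linearization_site:
        elements.append(Element("linearization site", linearization_site))
    for element in elements:
        # Reject empty segments and anything that is not plain A/C/G/T.
        if not element.sequence or set(element.sequence) - set("ACGT"):
            raise ValueError(f"{element.name}: expected a non-empty DNA sequence")
    return elements


def template_sequence(elements):
    """Concatenate the element sequences into one linear template string."""
    return "".join(element.sequence for element in elements)
```

For example, `build_template("GGGAAA", "ATGAAATAA", "CCCTTT", "GAATTC")` returns five elements whose concatenation begins with the T7 promoter and ends with the placeholder linearization site. Keeping the elements as named segments, rather than one flat string, makes it easy to check which elements a given template does or does not contain.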
In some embodiments, the nucleic acid templates further comprise nucleic acid sequences encoding one or more retroviral accessory proteins to enhance reverse transcription and/or integration. In some embodiments, one or more accessory proteins are encoded in cis (i.e., encoded by the same nucleic acid molecule) with one or more Pol polyprotein components. In some embodiments, one or more accessory proteins are encoded in cis with all Pol polyprotein components. In some embodiments, one or more accessory proteins are encoded in cis with the Pol polyprotein. In some embodiments, one nucleic acid template encodes all Pol polyprotein components and accessory proteins. In some embodiments, the nucleotide sequences encoding the Pol polyprotein component and/or the accessory protein are separated by one or more polycistronic elements. The polycistronic element may be an IRES and/or 2A peptide coding sequence and/or other polycistronic element. In some embodiments, one or more accessory proteins are encoded in trans (i.e., encoded by separate nucleic acid molecules) with the Pol polyprotein component. In some embodiments, each of the Pol polyprotein component and the accessory protein is encoded by two or more (e.g., two, three, four, or five or more) nucleic acid templates, each having a T7 promoter and 5' and 3' UTRs. In some embodiments, the nucleic acid template comprises a T7 promoter, a 5' UTR, a nucleic acid sequence encoding a gag-pol gene, a 3' UTR, and optionally one or more restriction digest sites for template linearization. In some embodiments, the nucleic acid template comprises a T7 promoter, a 5' UTR, a nucleic acid sequence encoding a gag-pol gene comprising a frameshift mutation, a 3' UTR, and optionally one or more restriction digest sites for template linearization. In some embodiments, the gag-pol gene does not encode a matrix protein.
In some embodiments, the UTR comprises a 3' or 5' sequence from a stable and highly translated mRNA molecule (including but not limited to an mRNA encoding globin, actin, GAPDH, tubulin, histone, or a citric acid cycle enzyme).
In some embodiments, the nucleic acid templates do not contain nucleic acid sequences that express proteins encoded by retroviral rev and env genes. In some embodiments, the nucleic acid template is free of nucleic acid sequences encoding the following retroviral Gag proteins: MA, SP1, SP2 and p6. In some embodiments, the nucleic acid template does not comprise a nucleic acid sequence encoding MA. In some embodiments, the nucleic acid template comprises a nucleic acid sequence encoding one or more of NC and CA. In some embodiments, the compositions of the present disclosure do not comprise any nucleic acid sequences encoding retroviral Gag proteins (MA, CA, NC, SP1, SP2, p6). In other embodiments, the compositions of the present disclosure do not comprise a nucleic acid sequence capable of expressing all retroviral Gag proteins. In some embodiments, the nucleic acid templates comprise nucleic acid sequences encoding one or more of NC, CA, MA, p, Vif, Tat, Nef, Vpr, and Vpu. Exemplary nucleic acid templates encoding the Pol polyprotein and/or accessory proteins are shown in FIGS. 4-5. In some embodiments, the RNA molecules are produced using cell-free methods (e.g., in vitro transcription, cell-free RNA synthesis).
Transgene-encoding nucleic acids
In one aspect, the present disclosure provides nucleic acid molecules comprising one or more transgenes flanked by LTR sequences that facilitate integration of the one or more transgenes into a host genome. The nucleic acid molecule may not comprise a viral packaging signal (e.g., psi packaging element) and/or a Rev Response Element (RRE). Thus, a transgene-encoding nucleic acid molecule encodes one or more transgenes and contains sequences that facilitate integration of the one or more transgenes into the host cell genome; however, this system does not allow any viral particles to be produced. In some embodiments, the transgene-encoding nucleic acid molecule comprises one or more reverse transcriptase priming elements located between the 5' LTR and the 3' LTR and one or more promoter sequences operably linked to one or more transgenes.
As used herein, the term "long terminal repeat" or "LTR" refers to an RNA or DNA sequence that is repeated hundreds or thousands of times and is present at either end of proviral DNA formed by the reverse transcription of retroviral RNA. The virus uses the LTRs to integrate its genetic material into the host genome (e.g., the ends of the LTRs are involved in integrating the provirus into the host genome). In some embodiments, the LTR is self-inactivating. The 3' LTR may have a deletion that is transferred to the 5' LTR after a single round of reverse transcription (e.g., a deletion of a transcriptional enhancer or enhancers and promoters in the U3 region of the 3' LTR). Deletions in the 3' LTR (e.g., deletions in the transcriptional enhancer or enhancers and promoters in the U3 region of the 3' LTR) may be 10-50, 50-100, 100-150, 150-200, or more nucleotides. In some embodiments, the deletion in the 3' LTR is 100-150 nucleotides. In some embodiments, the deletion in the 3' LTR is 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, or 150 nucleotides. In some embodiments, the LTR, Pol polyprotein component, and/or one or more accessory proteins are based on nucleic acid sequences from the same retrovirus. In some embodiments, the LTR, Pol polyprotein component, and/or one or more accessory proteins are based on nucleic acid sequences from two or more different retroviruses.
In some embodiments, the LTR is based on an LTR from: murine Leukemia Virus (MLV), moloney murine leukemia virus (MoLV), friedel Virus (FV), ebensen murine leukemia virus (A-MLV), murine Stem Cell Virus (MSCV), murine Mammary Tumor Virus (MMTV), moloney murine sarcoma virus (MoMSV), rous Sarcoma Virus (RSV), bow sarcoma virus (FuSV), FBR murine osteosarcoma virus (FBR MSV), avian myeloblastoma virus 29 (MC 29), avian Erythroblastosis Virus (AEV), human T cell leukemia virus (HTLV), friend MLV (FrMLV), avian Sarcoma Virus (ASV), avian leukemia virus, avian myeloblastoma virus, UR2 sarcoma virus, Y73 sarcoma virus, jaagsiekte sheep retrovirus, phylloto virus, mason-pfizer monkey virus, pinus monkey retrovirus, fungiocarcinoma virus, furthogonal virus, fungiocarcinoma virus, and Equi-Pogostemavirus avian cancer Mier Hill Virus 2, bovine leukemia Virus, primate T lymphocyte Virus 1, primate T lymphocyte Virus 2, primate T lymphocyte Virus 3, hilsa herring dermis sarcoma Virus, hilsa herring epidermosis Virus 1, hilsa herring epidermosis Virus 2, chicken syncytial Virus, feline leukemia Virus, finkel-biskiss-jinks murine sarcoma Virus, gardner-Arnstein feline sarcoma Virus, gibbon ape leukemia Virus, guinea pig C tumor Virus, hardy-Zuckerman cat sarcoma Virus, harvey murine sarcoma Virus, kirsten murine sarcoma Virus, kola retrovirus, moloney murine sarcoma Virus, porcine C tumor Virus, reticuloendotheliosis Virus, snyder-Theilen cat sarcoma Virus, trager duck spleen necrosis Virus, the virus may be selected from the group consisting of a venomous retrovirus, a majoram sarcoma virus, a jenbrana disease virus, a american lion lentivirus, a bovine foamy virus, a equine foamy virus, a feline foamy virus, a brown da Cong simian foamy virus, a salfollows simian foamy virus, an eastern chimpanzee foamy virus, a green simian foamy virus, a long tailmonkey foamy virus, a japanese macaque foamy virus, a rhesus foamy virus, a spider simian foamy virus, a squirrel foamy virus, a 
taiwan macaque foamy virus, a western chimpanzee foamy virus, a western low-ground gorilla foamy virus, a marmoset simian foamy virus, a yellow chest simian foamy virus, or a combination thereof. In some embodiments, the LTR is based on an LTR from an HFV. In some embodiments, the LTR is a lentiviral LTR. In some embodiments, the lentiviral LTR is based on an LTR from: human Immunodeficiency Virus (HIV) (e.g., HIV-1, HIV-2), simian Immunodeficiency Virus (SIV), visna/maedi virus (VMV), caprine Arthritis Encephalitis Virus (CAEV), equine Infectious Anemia Virus (EIAV), feline Immunodeficiency Virus (FIV), bovine Immunodeficiency Virus (BIV), jembran disease virus, american lion lentivirus, or combinations thereof. In some embodiments, the LTR is based on LTR from HIV-1 (e.g., HIV-1 having genomic sequence in GenBank accession No. AF 033819). Other LTRs can be found in the art, for example on GenBank, as described above.
It is understood that the term "based on" or "derived from" means that the nucleic acid sequence may comprise one or more modifications relative to the base sequence. Thus, an LTR may comprise one or more modifications (e.g., a deletion of one or more nucleotides, an addition of one or more nucleotides, a substitution of one or more nucleotides, or a combination thereof) relative to the corresponding wild-type nucleic acid sequence.
In some embodiments, the LTR has at least 60% identity (e.g., 70%, 80%, 90% to 100%, 95% to 100%, 98% to 100%, or 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity) with the wild-type LTR.
HIV-1 RT relies on cytoplasmic tRNA to prime the viral genome before reverse transcription. In some embodiments, the transgene-encoding nucleic acid molecule may comprise one or more reverse transcriptase priming elements (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more). In some embodiments, the reverse transcriptase priming element is a binding site for an endogenous tRNA. In some embodiments, the priming element is a binding site for lysyl tRNA. In some embodiments, the reverse transcriptase priming element is a synthetic primer binding site. In some embodiments, the reverse transcriptase priming element is a synthetic primer binding site, and one or more synthetic primers are provided for use in conjunction with a transgene-encoding nucleic acid molecule and a nucleic acid molecule encoding a Pol polyprotein component. The synthetic primers are rationally designed, non-naturally occurring primers. It will be appreciated that the synthetic primers may be based on naturally occurring sequences. The synthetic primer binding sites are sequences complementary to the synthetic primers. In some embodiments, the nucleic acid molecules of the present disclosure are primed with one or more RNA or DNA priming oligonucleotides prior to delivery to a target cell. In some embodiments, the nucleic acid molecules of the present disclosure are co-delivered with one or more short RNA or DNA priming oligonucleotides. In some embodiments, the RNA or DNA priming oligonucleotide is 5-10, 10-15, 15-20, 20-25, 25-30, 5-15, 5-20, 5-25, 5-30, 10-20, 10-25, 10-30, 15-25, or 15-30 nucleotides in length. In some embodiments, the RNA or DNA priming oligonucleotide is 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides in length. In some embodiments, the RNA or DNA priming oligonucleotide may be 30-50, 50-80, 80-100, 100-150, or up to about 200 nucleotides.
In some embodiments, the RNA or DNA priming oligonucleotide may be about 70-80 nucleotides. In some embodiments, the RNA or DNA priming oligonucleotide may be about 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, or 80 nucleotides. In some embodiments, the RNA or DNA priming oligonucleotide may be about 76 nucleotides. In some embodiments, the priming oligonucleotide is an engineered oligonucleotide that is complementary to an RT priming element. In some embodiments, the priming oligonucleotide is GUCCCUGUUCGGGCGCCA (SEQ ID NO: 18) or GTCCCTGTTCGGGCGCCA (SEQ ID NO: 19).
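As a minimal sketch of the complementarity described above, the following computes the DNA-sense sequence that would pair along its full length with the recited priming oligonucleotides. It assumes simple Watson-Crick pairing over the whole oligonucleotide. The function names are hypothetical, and only the two SEQ ID NO strings come from the text.

```python
RNA_PRIMER = "GUCCCUGUUCGGGCGCCA"  # SEQ ID NO: 18, as recited in the text
DNA_PRIMER = "GTCCCTGTTCGGGCGCCA"  # SEQ ID NO: 19, as recited in the text

# Watson-Crick complement table for DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def rna_to_dna(seq):
    """Rewrite an RNA sequence in the DNA alphabet (U becomes T)."""
    return seq.upper().replace("U", "T")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def binding_site_for(primer):
    """DNA-sense site fully complementary to an RNA or DNA primer (assumed full-length pairing)."""
    return reverse_complement(rna_to_dna(primer))
```

Here `rna_to_dna(RNA_PRIMER)` equals `DNA_PRIMER`, so the two recited oligonucleotides are the RNA and DNA forms of the same 18-nucleotide sequence, and `binding_site_for` returns the same site for either one.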
Expression of one or more transgenes may be controlled using one or more control sequences operably linked to one or more transgenes, wherein the control sequences include promoters, enhancers, and other regulatory sequences. In some embodiments, the transfer nucleic acid comprises a single transgene linked to a promoter. In some embodiments, the transfer nucleic acid comprises two or more transgenes. In some embodiments, each transgene is operably linked to its own promoter. In some embodiments, a single promoter is operably linked to two or more transgenes. In some embodiments, some of the two or more transgenes are operably linked to their own promoters, and some of the two or more transgenes are operably linked to a common promoter.
Two or more transgenes may be separated by one or more polycistronic elements. The one or more polycistronic elements may be one or more IRES and/or one or more 2A peptide coding sequences, and/or other polycistronic elements. An IRES is a nucleotide sequence that allows protein translation to initiate in the middle of a messenger RNA (mRNA) sequence. A 2A peptide is a small (18-22 amino acid) sequence that allows efficient, stoichiometric production of discrete protein products from a single reading frame through a ribosomal skipping event within the 2A peptide sequence. Any other polycistronic element that facilitates translation may also be used.
Any promoter that is functional in eukaryotic cells may be used in the present invention. In some embodiments, the promoter may be a promoter naturally associated with the transgene, and may be obtained by isolating the 5' non-coding sequence located upstream of the coding segment and/or exon of a given gene. Such promoters may be referred to as endogenous promoters or native promoters. In other embodiments, the transgene may be placed under the control of a recombinant or heterologous promoter, where a heterologous promoter is a promoter that is not normally associated with the encoding nucleic acid sequence in its natural environment. In some embodiments, the promoter is tissue-specific or cell-specific (e.g., specific for bone marrow, hematopoietic stem cells (HSCs), T cells, liver, ocular tissue, or muscle, etc.). In some embodiments, the promoter is a constitutively active promoter. In some embodiments, the promoter is an inducible promoter. In some embodiments, the promoter may be a chimeric promoter comprising sequence elements from two or more different promoters. Suitable promoters include promoters derived from viral genomes, such as polyoma virus, adenovirus, fowlpox virus, bovine papilloma virus, avian sarcoma virus, cytomegalovirus (CMV) (e.g., hCMV, mCMV), Rous sarcoma virus (RSV), and simian virus 40 (SV40), or heterologous mammalian promoters, such as an actin promoter, human elongation factor-1α (EF1α) promoter, CAG promoter, thymidine kinase (TK) promoter, ubiquitin promoter, human phosphoglycerate kinase (PGK) promoter, human glyceraldehyde 3-phosphate dehydrogenase (GAPDH) promoter, or ribosomal protein promoter.
Alternatively, tissue-specific promoters such as rhodopsin (Rho) promoter, rhodopsin kinase (RhoK) promoter, cone-bar homology box-containing gene (CRX) promoter, neural retina-specific leucine zipper protein (NRL) promoter, vitelline-like macular dystrophy 2 (VMD 2) promoter, tyrosine hydroxylase promoter, neuron-specific enolase (NSE) promoter, astrocyte-specific Glial Fibrillary Acidic Protein (GFAP) promoter, human alpha 1-antitrypsin (hAAT) promoter, phosphoenolpyruvate carboxykinase (PEPCK) promoter, hepatic fatty acid binding protein promoter, flt-1 promoter, IFN-beta promoter (e.g., mfn-beta promoter), mb promoter, SP-B promoter, SYN1 promoter, WASP promoter, SV40/hAlb promoter, SV40/CD43, SV40/CD45, NSE/5' promoter, ICAM-2 gfrp promoter, gpi promoter, cdi-14 promoter, and the promoters useful for the transcription of the endothelial proteins, the splice protein, the CD45, the CD-promoter. In some embodiments, the promoter is the hCMV promoter.
Transcription of one or more transgenes may be further increased or modulated by inserting one or more enhancer sequences into the transgene-encoding nucleic acid molecule. Enhancers may be located 5 'or 3' of the promoter. Suitable enhancers may include enhancers from eukaryotic viruses, such as the SV40 enhancer, the CMV early promoter enhancer, or the woodchuck hepatitis virus post-transcriptional regulatory element (WPRE). In some embodiments, the enhancer is WPRE.
In some embodiments, the transgene-encoding nucleic acid molecule comprises a 5' LTR, one or more reverse transcriptase priming elements, one or more promoters operably linked to one or more transgenes, and a 3' LTR. In some embodiments, the promoter is the hCMV promoter. In some embodiments, the transgene-encoding nucleic acid molecule further comprises one or more enhancers operably linked to the promoter and/or the transgene. In some embodiments, the enhancer is a WPRE element. In some embodiments, the transgene-encoding nucleic acid molecule comprises a 5' LTR, one or more reverse transcriptase priming elements, a promoter operably linked to the transgene, an enhancer, and a 3' LTR. In some embodiments, the transgene-encoding nucleic acid molecule comprises a 5' LTR, one or more reverse transcriptase priming elements, an hCMV promoter operably linked to a transgene, a WPRE enhancer, and a 3' LTR.
A transgene-encoding nucleic acid molecule may comprise one or more chemical or biological modifications relative to a naturally occurring nucleic acid molecule. Modifications may increase stability and/or transcription/translation efficiency and/or reduce immunogenicity. Modifications may include, but are not limited to, modified nucleobases and modified backbones (e.g., phosphoramides, phosphorothioates, phosphorodithioates, O-methylphosphite linkages, and/or peptide nucleic acids).
In some embodiments, the transgene-encoding nucleic acid molecule is an RNA molecule (e.g., an ssRNA molecule). In some embodiments, the transgene-encoding RNA molecule comprises a 5' LTR, one or more reverse transcriptase priming elements, one or more promoters operably linked to one or more transgenes, and a 3' LTR. In some embodiments, the promoter is the hCMV promoter. In some embodiments, the transgene-encoding RNA further comprises one or more enhancers operably linked to the promoter and/or the transgene. In some embodiments, the enhancer is a WPRE element. In some embodiments, the transgene-encoding RNA molecule comprises a 5' LTR, one or more reverse transcriptase priming elements, a promoter operably linked to the transgene, an enhancer, and a 3' LTR. In some embodiments, the transgene-encoding RNA molecule comprises a 5' LTR, one or more reverse transcriptase priming elements, an hCMV promoter operably linked to the transgene, a WPRE enhancer, and a 3' LTR.
In some embodiments, the transgene-encoding RNA molecule comprises one or more chemical or biological modifications relative to the naturally occurring RNA. In some embodiments, the transgene-encoding RNA molecule does not comprise any modified ribonucleoside. In some embodiments, the transgene-encoding RNA molecule comprises one or more modified ribonucleosides. Examples of suitable modified ribonucleosides are provided elsewhere in the specification. In some embodiments, the RNA molecule comprises one or more modifications, including, but not limited to: inclusion of modified ribonucleosides; inclusion of a 5' end cap (e.g., a 5' 7mG cap structure); inclusion of a poly(rA) tail; complexing of the mRNA with an agent (e.g., a protein or complementary nucleic acid molecule); inclusion of elements that alter the structure (e.g., form a secondary structure) of the RNA molecule; and reduction of the number of C and/or U residues. In some embodiments, the poly(rA) sequence is about 10-500 nucleotides. In some embodiments, the poly(rA) sequence is at least 10, at least 20, at least 30, at least 40, at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 200, at least 300, at least 400, or at least 500 nucleotides. In some embodiments, the transgene-encoding RNA molecule is uncapped. Other chemical and biological modifications to RNA molecules are known in the art.
Transgene-encoding nucleic acid templates
In one aspect, the disclosure also provides nucleic acid templates (e.g., recombinant DNA constructs) for producing a transgene-encoding nucleic acid molecule. In some embodiments, the nucleic acid templates are used to generate a transgene-encoding RNA molecule. In some embodiments, the transgene-encoding RNA molecule is produced using a cell-free method (e.g., in vitro transcription or cell-free RNA synthesis). See, for example, U.S. Patent Publication Nos. 20170292138, 20180087045, and 20190144489, the entire contents of which are incorporated herein by reference. In some embodiments, the nucleic acid template comprises, in addition to an RNA-producing element (e.g., T7 promoter, digestion site, etc.), a 5' UTR, one or more reverse transcriptase priming elements, one or more promoters operably linked to one or more transgenes, and a 3' LTR. In some embodiments, the nucleic acid template comprises a T7 promoter, a 5' LTR, one or more reverse transcriptase priming elements, one or more promoters operably linked to one or more transgenes, and a 3' LTR, and optionally one or more restriction sites for template linearization. In some embodiments, the promoter is the hCMV promoter. In some embodiments, the nucleic acid template further comprises one or more enhancers operably linked to the promoter and/or transgene. In some embodiments, the enhancer is a WPRE element. In some embodiments, the nucleic acid template comprises a T7 promoter, a 5' LTR, one or more reverse transcriptase priming elements, a promoter operably linked to a transgene, an enhancer, a 3' LTR, and optionally one or more restriction digestion sites for template linearization. In some embodiments, the transgene-encoding nucleic acid template comprises a 5' LTR, one or more reverse transcriptase priming elements, an hCMV promoter operably linked to the transgene, a WPRE enhancer, a 3' LTR, and optionally one or more restriction digestion sites for template linearization.
An exemplary nucleic acid template for producing a transgene-encoding nucleic acid molecule is shown in FIG. 3.
Transgenes
In some embodiments, the transgene encoding nucleic acid molecule encodes one or more transgenes. In some embodiments, the transgene-encoding nucleic acid molecule encodes two or more transgenes (e.g., 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more). In certain embodiments, the transgene encoding nucleic acid molecule encodes only one transgene.
In some embodiments, the transgene encodes a therapeutic molecule, a diagnostic molecule, or a reporter molecule.
In some embodiments, the transgene encodes a therapeutic molecule. The therapeutic molecule may be a nucleic acid or a polypeptide. The therapeutic molecules of the present disclosure may be used, for example, to replace defective or abnormal proteins in cells, tissues, organs, or subjects, to enhance existing biological pathways, or to provide new functions or activities. Therapeutic molecules may also be used to elicit an immune response. Exemplary types of transgenes include, but are not limited to, sequences encoding enzymes, cofactors, carrier proteins, transport proteins, cytokines, signaling proteins, suicide gene products, drug-resistance proteins, tumor suppressors, hormones, peptides with immunomodulatory properties, tolerogenic peptides, immunogenic peptides, antibodies and antigen-binding fragments thereof, antioxidant molecules, engineered immunoglobulin-like molecules, fusion proteins, immune co-stimulatory molecules, immunomodulatory molecules, chimeric antigen receptors, toxins, tumor suppressor proteins, growth factors, membrane proteins, receptors, vasoactive proteins, ligand proteins, antiviral proteins, ribozymes, RNAs, riboswitches, mRNAs, RNA interference (e.g., shRNA, siRNA, microRNA) molecules, and derivatives thereof.
In some embodiments, the therapeutic protein encoded by the transgene includes, but is not limited to, a cytokine, cystic fibrosis transmembrane conductance regulator (CFTR), dystrophin, a dystrophin-related protein, a blood coagulation (clotting) factor (e.g., factor XIII, factor IX, factor X, factor VIII, factor VIIa, protein C, factor VII, B-domain-deleted factor VIII, or a highly active or long half-life variant of a coagulation factor, or an active or inactive form of a coagulation factor), retinal pigment epithelium-specific 65 kDa protein (RPE65), erythropoietin, LDL receptor, lipoprotein lipase, ornithine carbamoyltransferase, beta-globin, alpha-globin, spectrin, alpha-antitrypsin, adenosine deaminase (ADA), a metal transporter (ATP7A or ATP7), sulfamidase, an enzyme involved in a lysosomal storage disease, hypoxanthine guanine phosphoribosyl transferase, beta-glucocerebrosidase, sphingomyelinase, lysosomal hexosaminidase, branched-chain keto acid dehydrogenase, insulin-like growth factor 1 or 2, platelet-derived growth factor, epidermal growth factor, nerve growth factor, neurotrophic factors 3 and 4, brain-derived neurotrophic factor, glial-derived growth factor, transforming growth factors alpha and beta, alpha-interferon, beta-interferon, interferon-gamma, interleukin 2, interleukin 4, interleukin-12, granulocyte-macrophage colony stimulating factor, lymphotoxin, herpes simplex virus thymidine kinase, cytosine deaminase, diphtheria toxin, cytochrome P450, deoxycytidine kinase, tumor necrosis factor, p53, Rb, WT-1, NF1, von Hippel-Lindau (VHL), SERCA2a, adenomatous polyposis coli (APC), VEGF, microdystrophin, lysosomal acid lipase, arylsulfatase A and B, ATP7A and B, insulin, glucokinase, guanylate cyclase 2D (GUCY2D), Rab escort protein 1, LCA5, ornithine ketoacid aminotransferase, retinoschisin 1, USH1C, retinitis pigmentosa GTPase regulator (RPGR), MERTK, DFNB1, ACHM2, 3 and 4, PKD-1 or PKD-2, TPP1, CLN2, gene products associated with lysosomal storage diseases (e.g., sulfatase, N-acetylglucosamine-1-phosphotransferase, cathepsin A, GM2-AP, NPC1, VPC2, sphingolipid activator protein), and any other desired peptide or protein.
It is to be understood that the term "derivative thereof" includes any therapeutically or functionally active fragment, or modification, of the base polypeptide. Derivatives of the polypeptides may comprise one or more modifications (e.g., a deletion of one or more amino acids, an addition of one or more amino acids, a substitution of one or more amino acids, or a combination thereof) relative to the corresponding wild-type polypeptide sequence. In some embodiments, the derivative is a homolog. Homology is typically inferred from sequence similarity between two or more nucleic acids or proteins (or sequences thereof). The exact percentage of similarity between sequences that can be used to establish homology varies with the nucleic acid and protein in question, but as little as 25% sequence similarity is routinely used to establish homology. Higher levels of sequence similarity, such as 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% or 99% or more, may also be used to establish homology. In some embodiments, a derivative is a polypeptide that is at least 60% identical (e.g., 70%, 80%, 90% to 100%, 95% to 100%, 98% to 100%, or 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical) to a base polypeptide. Methods for determining the percent sequence similarity are well known to those of ordinary skill in the art. In some embodiments, the similarity may be determined using algorithms such as those described herein, including, for example, the BLASTP and BLASTN algorithms, e.g., using default parameters. In some aspects, the wild-type HIV-1 strain is NL4-3.
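For illustration, a percent identity figure of the kind recited above can be computed from two sequences that have already been aligned. The sketch below is a simplified stand-in and not the BLASTP or BLASTN algorithm: it assumes the caller supplies equal-length aligned strings with "-" as the gap character, and it counts identical residues over all alignment columns. The function names and the convention of counting gap columns in the denominator are assumptions made for the example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must already be aligned (equal length, gaps marked with `gap`).
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no alignment columns to compare")
    # A column counts as a match only if both residues are identical and not gaps.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a, aligned_b, threshold=60.0):
    """True if the pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two ten-residue toy strings that differ at one position score 90.0, which satisfies an "at least 60% identical" criterion. Different tools define the denominator differently (alignment length, shorter sequence, or ungapped columns), so a reported value should state which convention was used.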
In some embodiments, the transgene encodes β-globin or a derivative thereof. The β-globin gene can be used in gene therapy to treat β-thalassemia or sickle cell disease. In some embodiments, the β-globin gene is a human β-globin gene. In some embodiments, the transgene encoding β-globin or a derivative thereof is operably linked to the hCMV promoter. In some embodiments, the transgene encoding β-globin or a derivative thereof is operably linked to a tissue-specific or cell-specific promoter. In some embodiments, the promoter is specific for bone marrow or HSCs. In some embodiments, the transgene encoding β-globin or a derivative thereof is operably linked to a native β-globin promoter. In some embodiments, the transgene encoding β-globin or a derivative thereof is further operably linked to a WPRE enhancer. In some embodiments, the transgene encodes a β-globin polypeptide having an amino acid sequence corresponding to NCBI reference sequence number NP_000509.1 or a fragment thereof. In some embodiments, the transgene encodes a β-globin polypeptide having an amino acid sequence that is at least 60% identical (e.g., 70%, 80%, 90% to 100%, 95% to 100%, 98% to 100%, or 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical) to NCBI reference sequence number NP_000509.1 or a fragment thereof. In some embodiments, the transgene comprises a β-globin gene having a nucleic acid sequence corresponding to NCBI reference sequence number NM_000518.5 or a fragment thereof. In some embodiments, the transgene comprises a β-globin gene encoding an amino acid sequence that is at least 60% identical (e.g., 70%, 80%, 90% to 100%, 95% to 100%, 98% to 100%, or 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical) to NCBI reference sequence number NP_000509.1 or a fragment thereof.
In some embodiments, the transgene encodes CFTR or a derivative thereof. The CFTR gene may be used in gene therapy to treat cystic fibrosis. In some embodiments, the transgene encoding CFTR or a derivative thereof is operably linked to the hCMV promoter. In some embodiments, the transgene encoding CFTR or a derivative thereof is operably linked to a tissue-specific or cell-specific promoter. In some embodiments, the promoter is specific for an epithelial cell. In some embodiments, the transgene encoding CFTR or a derivative thereof is operably linked to a native CFTR promoter. In some embodiments, the transgene encoding CFTR or a derivative thereof is further operably linked to a WPRE enhancer. In some embodiments, the transgene encodes a CFTR polypeptide having an amino acid sequence corresponding to NCBI reference sequence number NP_000483.3 or a fragment thereof. In some embodiments, the transgene encodes a CFTR polypeptide having an amino acid sequence that is at least 60% identical (e.g., 70%, 80%, 90% to 100%, 95% to 100%, 98% to 100%, or 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical) to NCBI reference sequence number NP_000483.3 or a fragment thereof. In some embodiments, the transgene comprises a CFTR gene having a nucleic acid sequence corresponding to NCBI reference sequence number NM_000492.4 or a fragment thereof. In some embodiments, the transgene comprises a CFTR gene having a nucleic acid sequence that is at least 60% identical (e.g., 70%, 80%, 90% to 100%, 95% to 100%, 98% to 100%, or 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical) to NCBI reference sequence number NM_000492.4 or a fragment thereof.
In some embodiments, the transgene encodes factor VIII or a derivative thereof. The factor VIII gene may be used in gene therapy to treat hemophilia A. In some embodiments, the factor VIII gene is a human factor VIII gene. In some embodiments, the transgene encoding factor VIII or a derivative thereof is operably linked to the hCMV promoter. In some embodiments, the transgene encoding factor VIII or a derivative thereof is operably linked to a tissue-specific or cell-specific promoter. In some embodiments, the promoter is specific for the liver. In some embodiments, the promoter is specific for bone marrow or HSCs. In some embodiments, the transgene encoding factor VIII or a derivative thereof is operably linked to a native factor VIII promoter. In some embodiments, the transgene encoding factor VIII or a derivative thereof is further operably linked to a WPRE enhancer. In some embodiments, the transgene encodes a factor VIII polypeptide having an amino acid sequence corresponding to NCBI reference sequence number NP_000123.1 or a fragment thereof. In some embodiments, the transgene encodes a factor VIII polypeptide having an amino acid sequence corresponding to NCBI reference sequence number NP_063916.1 or a fragment thereof. In some embodiments, the transgene encodes a factor VIII polypeptide having an amino acid sequence that is at least 60% identical (e.g., 70%, 80%, 90% to 100%, 95% to 100%, 98% to 100%, or 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical) to NCBI reference sequence number NP_000123.1 or NCBI reference sequence number NP_063916.1 or a fragment thereof. In some embodiments, the transgene comprises a factor VIII gene having a nucleic acid sequence corresponding to NCBI reference sequence number NM_000132.4 or a fragment thereof. In some embodiments, the transgene comprises a factor VIII gene having a nucleic acid sequence corresponding to NCBI reference sequence number NM_019863.2 or a fragment thereof. In some embodiments, the transgene comprises a factor VIII gene having a nucleic acid sequence that is at least 60% identical (e.g., 70%, 80%, 90% to 100%, 95% to 100%, 98% to 100%, or 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical) to NCBI reference sequence number NM_000132.4 or NCBI reference sequence number NM_019863.2 or a fragment thereof.
In some embodiments, the transgene encodes a dystrophin protein or derivative thereof. The dystrophin gene may be used in gene therapy to treat Duchenne muscular dystrophy (DMD). In some embodiments, the dystrophin gene is a minidystrophin or microdystrophin gene (Vincent et al., Nature Genetics 5:130 (1993); Wang et al., Proc Natl Acad Sci USA 97:13714-9 (2000) [microdystrophin]; Harper et al., Nat Med. 8:253-61 (2002) [microdystrophin]). In some embodiments, the dystrophin gene is a human dystrophin gene. In some embodiments, the transgene encoding a dystrophin protein or derivative thereof is operably linked to the hCMV promoter. In some embodiments, the transgene encoding a dystrophin protein or derivative thereof is operably linked to a tissue-specific or cell-specific promoter. In some embodiments, the promoter is specific for a muscle cell. In some embodiments, the transgene encoding a dystrophin protein or derivative thereof is operably linked to a native dystrophin promoter. In some embodiments, the transgene encoding a dystrophin protein or derivative thereof is further operably linked to a WPRE enhancer. In some embodiments, the transgene encodes a dystrophin polypeptide having an amino acid sequence corresponding to NCBI reference sequence number NP_000100.3 or a fragment thereof. In some embodiments, the transgene encodes a dystrophin polypeptide having an amino acid sequence that is at least 60% identical (e.g., 70%, 80%, 90% to 100%, 95% to 100%, 98% to 100%, or 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical) to NCBI reference sequence number NP_000100.3 or a fragment thereof. In some embodiments, the transgene comprises a dystrophin gene having a nucleic acid sequence corresponding to NCBI reference sequence number NM_000109.4 or a fragment thereof. In some embodiments, the transgene comprises a dystrophin gene having a nucleic acid sequence that is at least 60% identical (e.g., 70%, 80%, 90% to 100%, 95% to 100%, 98% to 100%, or 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical) to NCBI reference sequence number NM_000109.4 or a fragment thereof.
In some embodiments, the transgene encodes a retinitis pigmentosa GTPase regulator (RPGR) or derivative thereof. The RPGR gene can be used in gene therapy to treat retinitis pigmentosa. In some embodiments, the RPGR gene is a human RPGR gene. In some embodiments, the transgene encoding RPGR or a derivative thereof is operably linked to the hCMV promoter. In some embodiments, the transgene encoding RPGR or a derivative thereof is operably linked to a tissue-specific or cell-specific promoter. In some embodiments, the promoter is specific for ocular tissue. In some embodiments, the transgene encoding RPGR or a derivative thereof is operably linked to a native RPGR promoter. In some embodiments, the transgene encoding RPGR or a derivative thereof is further operably linked to a WPRE enhancer. In some embodiments, the transgene encodes an RPGR polypeptide having an amino acid sequence corresponding to NCBI reference sequence number NP_000319.1 or a fragment thereof. In some embodiments, the transgene encodes an RPGR polypeptide having an amino acid sequence that is at least 60% identical (e.g., 70%, 80%, 90% to 100%, 95% to 100%, 98% to 100%, or 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical) to NCBI reference sequence number NP_000319.1 or a fragment thereof. In some embodiments, the transgene comprises an RPGR gene having a nucleic acid sequence corresponding to NCBI reference sequence number NM_000328.3 or a fragment thereof. In some embodiments, the transgene comprises an RPGR gene having a nucleic acid sequence that is at least 60% identical (e.g., 70%, 80%, 90% to 100%, 95% to 100%, 98% to 100%, or 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical) to NCBI reference sequence number NM_000328.3 or a fragment thereof.
In some embodiments, the transgene encodes a Chimeric Antigen Receptor (CAR). The CAR can be used to treat cancer. In some embodiments, the transgene encoding the CAR is operably linked to the hCMV promoter. In some embodiments, the transgene encoding the CAR is operably linked to a tissue-specific or cell-specific promoter. In some embodiments, the promoter is specific for HSCs or T cells (e.g., cytotoxic T cells). In some embodiments, the transgene encoding the CAR is further operably linked to a WPRE enhancer.
In some embodiments, the transgene encodes a reporter molecule. The reporter molecule may be a nucleic acid or a polypeptide. A reporter molecule refers to a molecule that can be used to measure gene expression and generally produces a measurable signal such as fluorescence, luminescence, or color. In some embodiments, the polypeptide is a luciferase. In some embodiments, the polypeptide is a fluorescent protein. Fluorescent proteins are known in the art and are a subset of fluorophores, which are fluorescent chemical compounds that re-emit light upon excitation. A fluorophore absorbs excitation light energy of a first specific wavelength and then re-emits light energy at a second, longer specific wavelength. Each type of fluorophore absorbs and emits light at different wavelengths, depending on its chemical structure and environment. In some embodiments, fluorescent proteins useful in the present invention include, but are not limited to, green fluorescent proteins (e.g., wt-GFP, EGFP, Emerald, Superfolder GFP, Azami Green, mWasabi, TagGFP, TurboGFP, AcGFP, ZsGreen, T-Sapphire, etc.), blue fluorescent proteins (e.g., EBFP2, Azurite, mTagBFP, etc.), cyan fluorescent proteins (e.g., ECFP, mECFP, Cerulean, mTurquoise, CyPet, AmCyan1, Midori-Ishi Cyan, TagCFP, mTFP1 (Teal), etc.), yellow fluorescent proteins (e.g., EYFP, Topaz, Venus, mCitrine, YPet, TagYFP, PhiYFP, ZsYellow1, mBanana, etc.), orange fluorescent proteins (e.g., Kusabira Orange2, mOrange, mOrange2, dTomato, dTomato-Tandem, TagRFP, TagRFP-T, DsRed, DsRed2, DsRed-Express (T1), DsRed-Monomer, mTangerine, etc.), or red fluorescent proteins (e.g., mRuby, mApple, mStrawberry, AsRed2, mRFP1, JRed, mCherry, HcRed, mRaspberry, dKeima-Tandem, HcRed-Tandem, mPlum, AQ143, etc.). In some embodiments, the transgene encodes GFP.
In some embodiments, the reporter molecule may be fused in-frame to sequences encoding other proteins to identify the location of the protein in a cell, tissue, organ or organism. Reporter molecules used in accordance with the present disclosure include any reporter molecule described herein or known to one of ordinary skill in the art.
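The excitation and emission relationship described for fluorophores (emission at a longer wavelength than excitation) can be sketched numerically as follows. The wavelengths used, 488 nm and 507 nm, are illustrative values of the kind commonly quoted for EGFP and are not taken from the disclosure; the function names are hypothetical.

```python
PLANCK_J_S = 6.62607015e-34        # Planck constant (J*s)
LIGHT_M_S = 299_792_458.0          # speed of light (m/s)
JOULES_PER_EV = 1.602176634e-19    # joules per electronvolt


def photon_energy_ev(wavelength_nm: float) -> float:
    """Photon energy E = h*c/lambda, in electronvolts."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return PLANCK_J_S * LIGHT_M_S / (wavelength_nm * 1e-9) / JOULES_PER_EV


def stokes_shift_nm(excitation_nm: float, emission_nm: float) -> float:
    """Difference between emission and excitation wavelengths."""
    if emission_nm <= excitation_nm:
        raise ValueError("emission is expected at a longer wavelength than excitation")
    return emission_nm - excitation_nm


# Illustrative values only (assumed, approximately those quoted for EGFP).
excitation, emission = 488.0, 507.0
print(stokes_shift_nm(excitation, emission))
print(photon_energy_ev(excitation) > photon_energy_ev(emission))
```

Because photon energy is inversely proportional to wavelength, the emitted photon carries less energy than the absorbed one, which is why distinct excitation and emission filters can separate the reporter signal from the excitation light.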
Methods of use
The nucleic acids and compositions of the present disclosure can be used to deliver any transgene having a biological effect to treat and/or ameliorate symptoms associated with any disorder associated with gene expression. The methods of the present disclosure can be used, for example, to enhance existing biological pathways, to replace defective or abnormal proteins, or to provide new functions or activities by integrating and stably expressing transgenes. The methods of the present disclosure may also be used to transiently express transgenes and elicit an immune response.
In one aspect, the present disclosure provides a method of expressing a transgene in a subject in need thereof, the method comprising administering to the subject an effective amount of a composition comprising (i) one or more nucleic acid molecules encoding retroviral Pol polyprotein components for reverse transcription and integration of a transgene-encoding nucleic acid and (ii) a transgene-encoding nucleic acid comprising one or more transgene sequences flanked by LTR sequences, thereby expressing one or more transgenes in the subject. In some embodiments, the nucleic acid molecule encoding the retroviral Pol polyprotein components also encodes one or more helper retroviral proteins. In some embodiments, the composition comprises one or more additional nucleic acid molecules encoding one or more helper retroviral proteins. In some aspects, the disclosed transgenes can be expressed in a subject in need thereof. Compositions comprising one or more nucleic acid molecules encoding retroviral Pol polyprotein components for reverse transcription and integration of a transgene-encoding nucleic acid, and a transgene-encoding nucleic acid comprising one or more transgene sequences flanked by LTR sequences, can be used to express one or more transgenes in a subject.
In one aspect, the present disclosure provides a method for expressing a transgene in a cell, the method comprising delivering to the cell a composition comprising (i) one or more nucleic acid molecules encoding retroviral Pol polyprotein components for reverse transcription and integration of a transgene-encoding nucleic acid and (ii) a transgene-encoding nucleic acid comprising one or more transgene sequences flanked by LTR sequences, thereby expressing one or more transgenes in the cell. In some embodiments, the one or more nucleic acid molecules encoding the retroviral Pol polyprotein components further encode one or more helper retroviral proteins. In some embodiments, the composition comprises one or more additional nucleic acid molecules encoding one or more helper retroviral proteins. Suitable cell types include, but are not limited to, HSCs, hepatocytes, ocular cells (e.g., retinal cells), muscle cells, epithelial cells, T cells (e.g., cytotoxic T cells), and the like. In some aspects, the disclosed transgenes can be expressed in cells. Compositions comprising one or more nucleic acid molecules encoding retroviral Pol polyprotein components for reverse transcription and integration of a transgene-encoding nucleic acid, and a transgene-encoding nucleic acid comprising one or more transgene sequences flanked by LTR sequences, can be used to express one or more transgenes in a cell.
In one aspect, the present disclosure provides a method for expressing a transgene in a tissue, the method comprising delivering to the tissue a composition comprising (i) one or more nucleic acid molecules encoding retroviral Pol polyprotein components for reverse transcription and integration of a transgene-encoding nucleic acid and (ii) a transgene-encoding nucleic acid comprising one or more transgene sequences flanked by LTR sequences, thereby expressing one or more transgenes in the tissue. In some embodiments, the nucleic acid molecule encoding the retroviral Pol polyprotein components also encodes one or more helper retroviral proteins. In some embodiments, the composition comprises one or more additional nucleic acid molecules encoding one or more helper retroviral proteins. Suitable tissues include, but are not limited to, bone marrow, muscle, ocular tissue, cardiac tissue, liver tissue, epithelial tissue, connective tissue, neural tissue, gastrointestinal tissue, and the like. In some aspects, the disclosed transgenes can be expressed in tissue. Compositions comprising one or more nucleic acid molecules encoding retroviral Pol polyprotein components for reverse transcription and integration of a transgene-encoding nucleic acid, and a transgene-encoding nucleic acid comprising one or more transgene sequences flanked by LTR sequences, can be used to express one or more transgenes in tissue.
In one aspect, the present disclosure provides a method for expressing a transgene in an organ, the method comprising delivering to the organ a composition comprising (i) one or more nucleic acid molecules encoding retroviral Pol polyprotein components for reverse transcription and integration of a transgene-encoding nucleic acid and (ii) a transgene-encoding nucleic acid comprising one or more transgene sequences flanked by LTR sequences, thereby expressing one or more transgenes in the organ. In some embodiments, the nucleic acid molecule encoding the retroviral Pol polyprotein components also encodes one or more helper retroviral proteins. In some embodiments, the composition comprises one or more additional nucleic acid molecules encoding one or more helper retroviral proteins. Suitable organs include, but are not limited to, eyes, heart, lung, liver, stomach, spleen, pancreas, small intestine, large intestine, kidney, bone marrow, brain, and the like. In some aspects, the disclosed transgenes can be expressed in an organ. Compositions comprising one or more nucleic acid molecules encoding retroviral Pol polyprotein components for reverse transcription and integration of a transgene-encoding nucleic acid, and a transgene-encoding nucleic acid comprising one or more transgene sequences flanked by LTR sequences, can be used to express one or more transgenes in an organ.
In one aspect, the present disclosure provides a method of treating a disorder using a composition comprising (i) one or more nucleic acid molecules encoding retroviral Pol polyprotein components for reverse transcription and integration of a transgene-encoding nucleic acid and (ii) a transgene-encoding nucleic acid comprising one or more transgene sequences flanked by LTR sequences, the method comprising administering the composition to a subject in need of treatment, thereby expressing the transgene in the subject. In some embodiments, the nucleic acid molecule encoding the retroviral Pol polyprotein components also encodes one or more helper retroviral proteins. In some embodiments, the composition comprises one or more additional nucleic acid molecules encoding one or more helper retroviral proteins. In some aspects, the disclosed transgenes are useful for treating disorders. Compositions comprising one or more nucleic acid molecules encoding retroviral Pol polyprotein components for reverse transcription and integration of a transgene-encoding nucleic acid, and a transgene-encoding nucleic acid comprising one or more transgene sequences flanked by LTR sequences, are useful in treating disorders.
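The two-part composition recited in the methods above can be summarized as a simple data model. This is only a sketch under stated assumptions: the class and field names are hypothetical, the 5'-to-3' ordering of the promoter and WPRE relative to the transgene is a common arrangement rather than one specified here, and naming reverse transcriptase and integrase as the Pol components is an assumption based on the stated functions (reverse transcription and integration).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class TransgeneCassette:
    """Transgene-encoding nucleic acid flanked by LTR sequences (illustrative)."""
    transgene: str
    promoter: str = "hCMV"              # could also be tissue-specific or native
    enhancer: Optional[str] = "WPRE"    # optional element

    def elements(self) -> List[str]:
        """Cassette elements in an assumed 5'-to-3' order."""
        parts = ["5' LTR", self.promoter, self.transgene]
        if self.enhancer:
            parts.append(self.enhancer)
        parts.append("3' LTR")
        return parts


@dataclass(frozen=True)
class Composition:
    """Pol-encoding nucleic acid(s) plus one LTR-flanked transgene cassette."""
    cassette: TransgeneCassette
    pol_components: Tuple[str, ...] = ("reverse transcriptase", "integrase")
    helper_proteins: Tuple[str, ...] = ()   # optional helper retroviral proteins

    def is_complete(self) -> bool:
        """True if both functions named in the text are covered."""
        required = {"reverse transcriptase", "integrase"}
        return required.issubset(self.pol_components)


composition = Composition(TransgeneCassette("CFTR"))
print(composition.cassette.elements())
print(composition.is_complete())
```

The model keeps the Pol-encoding component and the LTR-flanked cassette separate, mirroring the description of them as distinct nucleic acid molecules within one composition.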
In some embodiments, one or more transgenes are integrated into the genome of a target cell (e.g., in a subject). In some embodiments, one or more transgenes are stably expressed for at least one week, at least two weeks, at least one month, at least 6 months, at least one year, at least 2 years, at least 3 years, at least 4 years, or at least 5 years. In some embodiments, one or more transgenes are stably expressed for 1-2 weeks, 2-4 weeks, 1-3 months, 3-6 months, 6-9 months, 9-12 months, 1-2 years, 2-3 years, 3-4 years, or 4-5 years or more. In some embodiments, one or more transgenes are stably expressed throughout the lifetime of the subject. In some embodiments, the compositions of the present disclosure are used to integrate one or more transgenes into the genome of a target cell (e.g., in a subject).
Any disease, disorder or condition associated with gene expression (e.g., where integration and stable expression of a transgene may be desired) may be treated. In some embodiments, the disease, disorder, or condition to be treated is a genetic disorder. A genetic disease, disorder or condition may be hereditary or non-hereditary. In some embodiments, the disease, disorder, or condition is a neurodegenerative, proliferative, inflammatory, or autoimmune disease, disorder, or condition. In some embodiments, the condition to be treated includes, but is not limited to, cystic fibrosis (and other pulmonary diseases), hemophilia A, hemophilia B, β-thalassemia, sickle cell disease, anemia and other blood disorders, Alzheimer's disease, Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis, epilepsy and other neurological disorders, cancer, diabetes, muscular dystrophy (e.g., Duchenne, Becker), Leber congenital amaurosis, lysosomal storage disease, Usher syndrome 1C, gyrate atrophy, connexin 26 deafness, achromatopsia, X-linked retinoschisis, polycystic kidney disease, Gaucher's disease, Hurler's disease, adenosine deaminase deficiency, glycogen storage disease and other metabolic defects, Pompe's disease, congestive heart failure, retinitis pigmentosa, solid organ (e.g., brain, liver, kidney, heart) disease, and the like. In some embodiments, the condition to be treated is cancer. In some embodiments, the cancer is a blood cancer (e.g., lymphoma, leukemia, multiple myeloma, etc.), breast cancer, prostate cancer, digestive system cancer (e.g., esophageal cancer, gastric cancer, colorectal cancer), liver cancer, cervical cancer, ovarian cancer or uterine cancer, pancreatic cancer, lung cancer, brain cancer (e.g., glioblastoma), skin cancer (e.g., melanoma), or sarcoma of muscle or nerve, etc.
In some embodiments, the present disclosure provides methods of treating β-thalassemia or sickle cell disease in a subject in need thereof, comprising administering a composition comprising one or more nucleic acid molecules encoding retroviral Pol polyprotein components for reverse transcription and integration of a transgene-encoding nucleic acid, and a nucleic acid molecule encoding β-globin or a derivative thereof flanked by LTR sequences. In some embodiments, the present disclosure provides methods of treating sickle cell disease in a subject in need thereof, comprising administering a composition comprising one or more nucleic acid molecules encoding retroviral Pol polyprotein components for reverse transcription and integration of a transgene-encoding nucleic acid, and a nucleic acid molecule encoding β-globin or a derivative thereof flanked by LTR sequences. In some embodiments, the nucleic acid molecule encoding the retroviral Pol polyprotein components also encodes one or more helper retroviral proteins. In some embodiments, the composition comprises one or more additional nucleic acid molecules encoding one or more helper retroviral proteins. In some aspects, the compositions of the present disclosure may be used to treat β-thalassemia or sickle cell disease.
In some embodiments, the present disclosure provides methods of treating cystic fibrosis in a subject in need thereof, comprising administering a composition comprising one or more nucleic acid molecules encoding a retroviral Pol polyprotein component for reverse transcription and integration of a transgene encoding nucleic acid, and a nucleic acid molecule encoding a CFTR or derivative thereof comprising flanking LTR sequences. In some embodiments, the nucleic acid molecule encoding the retroviral Pol polyprotein component also encodes one or more helper retroviral proteins. In some embodiments, the composition comprises one or more additional nucleic acid molecules encoding one or more helper retroviral proteins. In some aspects, the compositions of the present disclosure may be used to treat cystic fibrosis.
In some embodiments, the present disclosure provides methods of treating hemophilia A in a subject in need thereof, comprising administering a composition comprising one or more nucleic acid molecules encoding retroviral Pol polyprotein components for reverse transcription and integration of a transgene-encoding nucleic acid, and a nucleic acid molecule encoding factor VIII or a derivative thereof flanked by LTR sequences. In some embodiments, the nucleic acid molecule encoding the retroviral Pol polyprotein components also encodes one or more helper retroviral proteins. In some embodiments, the composition comprises one or more additional nucleic acid molecules encoding one or more helper retroviral proteins. In some aspects, the compositions of the present disclosure may be used to treat hemophilia A.
In some embodiments, the present disclosure provides methods of treating DMD in a subject in need thereof, comprising administering a composition comprising one or more nucleic acid molecules encoding retroviral Pol polyprotein components for reverse transcription and integration of a transgene-encoding nucleic acid, and a nucleic acid molecule encoding a dystrophin protein or derivative thereof flanked by LTR sequences. In some embodiments, the nucleic acid molecule encoding the retroviral Pol polyprotein components also encodes one or more helper retroviral proteins. In some embodiments, the composition comprises one or more additional nucleic acid molecules encoding one or more helper retroviral proteins. In some aspects, the compositions of the present disclosure may be used to treat DMD.
In some embodiments, the present disclosure provides methods of treating retinitis pigmentosa in a subject in need thereof, comprising administering a composition comprising one or more nucleic acid molecules encoding retroviral Pol polyprotein components for reverse transcription and integration of a transgene-encoding nucleic acid, and a nucleic acid molecule encoding RPGR or a derivative thereof flanked by LTR sequences. In some embodiments, the nucleic acid molecule encoding the retroviral Pol polyprotein components also encodes one or more helper retroviral proteins. In some embodiments, the composition comprises one or more additional nucleic acid molecules encoding one or more helper retroviral proteins. In some aspects, the compositions of the present disclosure are useful for treating retinitis pigmentosa.
In some embodiments, the present disclosure provides methods of treating cancer in a subject in need thereof, comprising administering a composition comprising one or more nucleic acid molecules encoding a retroviral Pol polyprotein component for reverse transcription and integration of a transgene encoding nucleic acid, and a nucleic acid molecule encoding a Chimeric Antigen Receptor (CAR) comprising flanking LTR sequences. In some embodiments, the disorder to be treated is cancer. In some embodiments, the cancer is a blood cancer (e.g., lymphoma, leukemia, multiple myeloma, etc.), breast cancer, prostate cancer, digestive system cancer (e.g., esophageal cancer, gastric cancer, colorectal cancer), liver cancer, cervical cancer, ovarian or uterine cancer, pancreatic cancer, lung cancer, brain cancer (e.g., glioblastoma), skin cancer (e.g., melanoma), sarcomas of muscles or nerves, and the like. In some embodiments, the nucleic acid molecule encoding the retroviral Pol polyprotein component also encodes one or more helper retroviral proteins. In some embodiments, the composition comprises one or more additional nucleic acid molecules encoding one or more helper retroviral proteins. In some aspects, the compositions of the present disclosure are useful for treating cancer.
In some embodiments, one or more transgenes are transiently expressed once integrated into the genome of a target cell (e.g., in a subject). Transient expression is sufficient to elicit an immune response. Any disease, disorder or condition that would benefit from an immune response may be treated. Thus, in some embodiments, the present disclosure provides a method of eliciting an immune response in a subject in need thereof by transiently expressing one or more transgenes, the method comprising administering to the subject an effective amount of a composition comprising one or more nucleic acid molecules encoding a retroviral Pol polyprotein component and a nucleic acid molecule comprising a transgene flanked by long-terminal repeat sequences, thereby expressing the transgenes in the subject. In some embodiments, the disease, disorder, or condition is cancer and the transgene may encode a tumor antigen. In some embodiments, the disease, disorder, or condition is an infectious disease (e.g., a disease caused by an infectious agent such as a pathogen) and the transgene may encode an antigen associated with the infectious disease. The subject may be at risk of contracting an infectious disease and the composition may be administered prophylactically. In some aspects, the compositions of the present disclosure may be used to elicit an immune response.
Non-viral delivery system
The nucleic acid molecules of the present disclosure are delivered to a cell, tissue, or subject via a non-viral delivery system. The nucleic acid molecules of the present disclosure can be delivered to a target tissue and/or cell ex vivo or in vivo using any suitable non-viral delivery system known in the art.
In some embodiments, the nucleic acid molecule (e.g., RNA, ssDNA, linear dsDNA) is packaged in a lipid nanoparticle (LNP). Suitable LNPs for packaging DNA molecules and RNA molecules (e.g., mRNA molecules) are known in the art. In some embodiments, the LNP comprises one or more ionizable or cationic lipids (e.g., N-[1-(2,3-dioleoyloxy)propyl]-N,N,N-trimethylammonium chloride (DOTMA), 1,2-dioleoyl-3-trimethylammonium-propane (DOTAP), 1,2-dioleoyl-3-dimethylammonium-propane (DODAP), DODMA, DLin-MC3-DMA (MC3), LP-01, diketopiperazine-based ionizable lipids, cKK-E12), one or more neutral lipids (e.g., cholesterol), one or more zwitterionic lipid molecules (e.g., 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine (DOPE)), and/or one or more PEG-modified lipids. In some embodiments, the LNP comprises polylactic acid (PLA) or poly(lactic-co-glycolic acid) (PLGA) polymers within a lipid monolayer. In some embodiments, the ionizable lipid is a pH-dependent ionizable material.
In some embodiments, non-viral delivery systems including, but not limited to, LNPs are conjugated to polypeptides (e.g., viral envelope or capsid proteins, antibodies or antigen-binding fragments thereof, or other targeting moieties) that specifically bind to receptors on target cells or tissues. In some embodiments, the LNP is conjugated to a small molecule that facilitates targeting to a particular cell, tissue or organ. Suitable cell types include, but are not limited to, HSCs, liver cells (e.g., hepatocytes), ocular cells (e.g., retinal cells), muscle cells, epithelial cells, T cells (e.g., cytotoxic T cells), and the like. Suitable tissues include, but are not limited to, bone marrow, muscle, ocular tissue, cardiac tissue, liver tissue, epithelial tissue, connective tissue, neural tissue, gastrointestinal tissue, and the like. Suitable organs include, but are not limited to, eyes, heart, lung, liver, stomach, spleen, pancreas, small intestine, large intestine, kidney, and the like. In some embodiments, the target cells or tissues include, but are not limited to, bone marrow, HSCs, epithelial cells, liver cells (e.g., hepatocytes), ocular cells (e.g., retinal cells), muscle cells, and T cells (e.g., cytotoxic T cells).
In some embodiments, the nucleic acid molecules of the present disclosure may be delivered naked, in an aqueous solution (e.g., a buffer, such as sucrose citrate buffer), or in combination with a lipid, polymer, peptide, or other compound that facilitates entry into a cell. The nucleic acid molecule can be introduced into the target cell or subject using any suitable technique, such as by direct injection, microinjection, transfection, nucleofection, electroporation, lipofection, high-pressure spraying, biolistics, and the like. In some embodiments, the nucleic acid molecule is complexed with a cationic amphiphile. In some embodiments, the nucleic acid molecule is complexed with a transfection agent (e.g., Lipofectamine™, Lipofectin™, jetPEI, RNAiMAX, Invivofectamine, MegaFectin™, TransIT™). In some embodiments, the nucleic acid molecule is complexed with a cell penetrating peptide (e.g., a polycationic peptide or an amphiphilic peptide).
In some embodiments, the nucleic acid molecules of the present disclosure are packaged in liposomes, lipid nanoparticles, polypeptide nanoparticles, inorganic materials (e.g., silica nanoparticles, gold nanoparticles, and inorganic-based carriers such as CaP), synthetic polymers, dendrimers, cationic nanoemulsions, polymeric nanoparticles, polymeric and lipid hybrid carriers, or combinations thereof. In some embodiments, the nucleic acid molecules of the disclosure are packaged in a viral capsid.
In some embodiments, the nucleic acid molecules of the present disclosure are packaged in polymeric nanoparticles, including but not limited to nanoparticles based on polyethylenimine (PEI), polyacrylate, poly(β-amino ester) (PBAE), and poly(asparagine) (PAsp).
In some embodiments, the nucleic acid molecules of the present disclosure are packaged in liposomes. Liposomes are artificially prepared vesicles, which may consist mainly of lipid bilayers. Liposomes can be of different sizes, such as, but not limited to, multilamellar vesicles (MLVs), which can be hundreds of nanometers in diameter and can contain a series of concentric bilayers separated by narrow aqueous compartments; small unilamellar vesicles (SUVs), which can be less than 50 nm in diameter; and large unilamellar vesicles (LUVs), which can be between 50 and 500 nm in diameter. Liposome designs may include, but are not limited to, opsonins or ligands to improve the attachment of liposomes to unhealthy tissues or to activate events such as, but not limited to, endocytosis. Liposomes can have a low or high internal pH to improve delivery of nucleic acid molecules. In some embodiments, the nucleic acid molecules of the present disclosure are packaged in liposomes, such as those formed from 1,2-dioleyloxy-N,N-dimethylaminopropane (DODMA) liposomes, DiLa2 liposomes from Marina Biotech (Bothell, Wash.), 1,2-dilinoleyloxy-3-dimethylaminopropane (DLin-DMA), 2,2-dilinoleyl-4-(2-dimethylaminoethyl)-[1,3]-dioxolane (DLin-KC2-DMA), and MC3 (US 20100324120; the contents of which are incorporated herein by reference in their entirety). In some embodiments, the liposome is pegylated.
The nucleic acid molecules of the present disclosure (e.g., packaged in liposomes, lipid nanoparticles, cationic nanoemulsions, polymeric nanoparticles, etc.) can be administered systemically or locally, intravenously, intradermally, intraarterially, intralesionally, intratumorally, intracranially, intra-articularly, intraprostatically, intrapleurally, intratracheally, intranasally, intravitreally, intravaginally, intrarectally, topically, intramuscularly, intraperitoneally, subcutaneously, subconjunctivally, intracapsularly, transmucosally, intracardially, intraumbilically, intraocularly, orally, by inhalation (e.g., aerosol inhalation), by injection, by infusion, by continuous infusion, by localized perfusion bathing target cells directly, via a catheter, via lavage, in creams, or by any other method known to one of ordinary skill in the art or any combination of the foregoing (see, e.g., Remington's Pharmaceutical Sciences (1990)).
In some embodiments, the nucleic acid molecules of the present disclosure (e.g., packaged in liposomes, lipid nanoparticles, cationic nanoemulsions, polymeric nanoparticles) are administered in the form of a pharmaceutical composition comprising a pharmaceutically acceptable carrier. As used herein, "pharmaceutically acceptable carrier" includes any and all solvents, dispersion media, coatings, surfactants, antioxidants, preservatives (e.g., antibacterial agents, antifungal agents), isotonic agents, absorption delaying agents, salts, preservatives, pharmaceuticals, pharmaceutical stabilizers (e.g., antioxidants), gels, binders, excipients, disintegrants, lubricants, sweeteners, flavoring agents, dyes, and the like, as well as combinations thereof, known to those of ordinary skill in the art (see, e.g., Remington's Pharmaceutical Sciences (1990), incorporated herein by reference). Except insofar as any conventional carrier is incompatible with the active ingredient, its use in a therapeutic or pharmaceutical composition is contemplated. Compositions for lipid nanoparticles with bioactive molecules and suitable carriers are disclosed, for example, in U.S. Patent No. 7,404,969, which is incorporated herein by reference.
The nucleic acid molecules of the present disclosure may be administered once, or alternatively they may be administered in multiple administrations. If administered multiple times, the compounds may be administered via different routes. For example, the first (or first few) administrations may be directed into the affected tissue, and the subsequent administrations may be systemic.
Kits
In one aspect, the present disclosure provides a kit for delivering one or more transgenes. In some embodiments, the kit comprises a composition according to the present disclosure comprising an effective amount of one or more nucleic acid molecules encoding a retroviral Pol polyprotein component and one or more nucleic acid molecules encoding a transgene flanked by LTR sequences. In some embodiments, the kit comprises a composition according to the present disclosure in unit dosage form comprising an effective amount of one or more nucleic acid molecules encoding a retroviral Pol polyprotein component and one or more nucleic acid molecules encoding a transgene flanked by LTR sequences. In some embodiments, the nucleic acid molecule encoding the retroviral Pol polyprotein component also encodes one or more helper retroviral proteins. In some embodiments, the composition comprises one or more additional nucleic acid molecules encoding one or more helper retroviral proteins. In some embodiments, the kit comprises a composition comprising two nucleic acid molecules. In some embodiments, the kit comprises a composition comprising two or more nucleic acid molecules (e.g., 2, 3, 4, or 5). In some embodiments, the nucleic acid molecule is packaged or formulated for delivery in a lipid nanoparticle. In some embodiments, the kit comprises a sterile container comprising the composition. Such containers may be in the form of boxes, ampoules, bottles, vials, tubes, bags, pouches, blister packs, or other suitable containers known in the art. Such containers may be made of plastic, glass, laminated paper, metal foil, or other materials suitable for containing medicaments. If desired, the composition is provided with instructions for administering the composition to a subject in need thereof.
Examples
In order that the invention described herein may be more fully understood, the following examples are set forth. The examples described in this application are provided to illustrate the methods, compositions, and systems provided herein and should not be construed as limiting the scope thereof in any way.
Example 1: mRNA-encoded minimal lentiviral machinery for in vivo gene therapy
The transgene-encoding template is designed as shown in FIG. 3. The transgene-encoding nucleic acid template comprises, from 5' to 3', a 5' LTR, a reverse transcriptase initiation element, a promoter operably linked to the transgene, an enhancer, and a 3' LTR. Additional elements for RNA production include a 5' T7 RNA polymerase promoter and a 3' digestion site. The RNA molecules produced from the transgene-encoding nucleic acid templates contain unmodified nucleosides because retroviral reverse transcriptase is unable to copy modified ribonucleosides (Swanstrom et al. J. Biol. Chem. 256:1115-1121 (1981)). In addition, the RNA has a 5'-7mG cap structure and a poly(rA) tail to enhance RNA stability.
Two minimal nucleic acid templates encoding Pol polyproteins (for reverse transcription/integration) were designed as shown in FIG. 4. The canonical gene architecture expresses gag and pol in-line, so that Pol polyprotein expression requires translational slippage. The novel designs increase pol expression by expressing pol from a separate construct as a polyprotein unit (or even as individual units, as described below). These novel designs encode different UTRs to increase mRNA stability and translation. The novel designs optionally encode, together with pol, helper protein units necessary for reverse transcription and integration, which can be in cis with a 2A sequence. The first nucleic acid template comprises, from 5' to 3', a 5' UTR, a nucleic acid sequence encoding a retroviral Pol polyprotein component in the form of a Pol polyprotein, and a 3' UTR. Additional elements for RNA production include a 5' T7 RNA polymerase promoter and a 3' digestion site. The second nucleic acid template comprises, from 5' to 3', a 5' UTR, a nucleic acid sequence encoding one or more retroviral helper proteins, a nucleic acid sequence encoding a retroviral Pol polyprotein component in the form of a Pol polyprotein, and a 3' UTR. The helper protein units include retroviral elements not encoded by the pol polyprotein gene; examples include, but are not limited to, Gag polyprotein (p55), MA (p17), CA (p24), NC (p9), p6, Tat, Nef, Vpr, Vif, and Vpu. Additional elements for RNA production include a 5' T7 RNA polymerase promoter and a 3' digestion site. The mRNA molecules produced from these nucleic acid templates comprise both modified nucleosides (e.g., pseudouridine) and unmodified nucleosides. The use of modified nucleosides reduces the immunogenicity of the final mRNA molecule. In addition, the mRNA has a 5'-7mG cap structure and a poly(rA) tail to ensure efficient translation and stability. For example, the 3' UTR may encode a poly(A) tail.
At least three different constructs for expressing the functional units of the Pol polyprotein are designed as shown in FIG. 5. In the first, as shown in FIG. 5A, the construct provides protease (PR), reverse transcriptase/RNase H (RT) and integrase (IN) as a canonical unit, where the components are separated by PR self-processing. In the second, as shown in FIG. 5B, the individual Pol polyprotein component units RT and IN are encoded on a bicistronic construct. In the third, as shown in FIG. 5C, the individual Pol polyprotein component units are encoded on separate constructs. There are various scenarios in which these constructs may be used. First, all Pol polyprotein components can be encoded as a polyprotein on one mRNA with 5' and 3' UTRs (represented by FIG. 5A); the PR protease is required when the components are encoded as a polyprotein (Pol). Second, the Pol polyprotein component RT and IN units may be encoded on one or more constructs (represented by FIG. 5B), each having 5' and 3' UTRs and an inserted IRES or 2A sequence; the PR protease is not required when the elements are encoded as individual, non-polyprotein units. Third, the Pol polyprotein component RT and IN units can be encoded on two or more constructs, each with 5' and 3' UTRs (represented by FIG. 5C). Fourth, the Pol polyprotein RT and IN units and the helper protein units may be encoded on one construct with 5' and 3' UTRs and an inserted IRES or 2A sequence. In this case, the PR protease would be required when the helper elements are encoded as a polyprotein, such as the Gag polyprotein. Fifth, as a non-limiting example, the Pol polyprotein component RT and IN units and the helper protein units may be encoded on two or more constructs, each having 5' and 3' UTRs.
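By way of non-limiting illustration only, the construct layouts described for FIG. 5 can be sketched as ordered element lists. The class name, field names, element labels, and the consistency rule below are hypothetical conveniences for this sketch and are not part of the disclosure; the rule simply restates that a polyprotein layout carries PR while individually encoded units do not need it.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Construct:
    """One mRNA construct with elements listed 5' to 3' (hypothetical model)."""
    label: str
    elements: Tuple[str, ...]
    polyprotein: bool  # True when the coding units are fused into one polyprotein


# Layouts paraphrased from the FIG. 5 description; labels are illustrative.
FIG_5A = Construct("Pol polyprotein", ("5'UTR", "PR", "RT", "IN", "3'UTR"), True)
FIG_5B = Construct("bicistronic RT/IN", ("5'UTR", "RT", "IRES/2A", "IN", "3'UTR"), False)
FIG_5C = (
    Construct("RT alone", ("5'UTR", "RT", "3'UTR"), False),
    Construct("IN alone", ("5'UTR", "IN", "3'UTR"), False),
)


def is_consistent(construct: Construct) -> bool:
    """Check UTR flanking and that a polyprotein layout includes PR."""
    flanked = construct.elements[0] == "5'UTR" and construct.elements[-1] == "3'UTR"
    has_pr = "PR" in construct.elements
    return flanked and (has_pr or not construct.polyprotein)
```

Under these assumptions, a polyprotein layout that omits PR is flagged as inconsistent, whereas the bicistronic and separate-construct layouts pass without PR.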
Example 2: in vivo delivery of beta-globin for treatment of sickle cell disease
A transgene-encoding nucleic acid template as described in Example 1, comprising a β-globin transgene, is used to produce an RNA molecule comprising a β-globin gene flanked by LTRs for integration into the host cell genome. A nucleic acid template as described in Example 1, comprising from 5' to 3' a 5' UTR, a nucleic acid sequence encoding a retroviral Pol polyprotein component in the form of a Pol polyprotein, and a 3' UTR, is used to produce the RNA molecule encoding the retroviral Pol proteins.
An effective amount of a lipid nanoparticle composition comprising the two RNA molecules is administered to a patient suffering from sickle cell disease to treat sickle cell disease.
Example 3: improving RNA stability and translation in human cells
The lentivirus-derived genes underlying the RNA-mediated transgene integration systems described herein may present challenges for synthetic gene expression in cells. Lentiviral replication steps are tightly controlled spatiotemporal processes coordinated by the complete set of viral genes and regulatory elements, structural proteins, replicative enzymes, and interactions with host factors critical to replication. This example describes several modifications to improve the stability and efficiency of RNA-based integration systems.
Increased stability of functional unit RNA
One or more nucleic acid molecules encoding one or more Pol polyprotein components and/or helper proteins are also referred to as "functional unit nucleic acids". The generation of a minimal RNA integration system typically requires the removal of viral genes that do not directly affect reverse transcription and proviral integration. However, in the absence of a full-length viral genome, expression of individual wild-type viral RNAs (i.e., reverse transcriptase (RT) and integrase (IN)) is challenging without accessory proteins (e.g., Rev). In addition, wild-type HIV-1 RNA is inherently unstable due to the presence of internal destabilizing elements (INS), short 5-nucleotide repeats present throughout the transcript (Wolff et al. Nucleic Acids Res. 31(11): 2839-51 (2003)). Rev can counteract the effects of INS during HIV-1 infection, as Rev can bind and stabilize RNA transcripts. To increase RNA stability without Rev, five major INS sequences were identified, removed and replaced throughout the functional unit mRNA: TAGAT, ATAGA, AAAAG, ATAAA, TTATA, or other INS elements (e.g., those described in Wolff et al. Nucleic Acids Res. 31(11): 2839-51 (2003)). INS removal may be partial or complete. The INS sequences may be replaced with alternative codons that increase RNA stability while preserving the protein sequence.
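As a minimal sketch of the INS removal described above, the following Python snippet scans an in-frame coding sequence for the five listed motifs and greedily swaps in synonymous codons whenever a swap lowers the motif count, so that the encoded protein is unchanged. The function names, the greedy strategy, and the example sequence are assumptions made for illustration; the disclosure does not specify an algorithm, and a practical design would also weigh codon usage and expression data.

```python
from itertools import product

INS_MOTIFS = ("TAGAT", "ATAGA", "AAAAG", "ATAAA", "TTATA")  # motifs listed in the text

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
SYNONYMS = {
    codon: [alt for alt, aa in CODON_TABLE.items() if aa == CODON_TABLE[codon] and alt != codon]
    for codon in CODON_TABLE
}


def find_ins(seq):
    """Return sorted (position, motif) pairs for every INS motif occurrence."""
    hits = []
    for motif in INS_MOTIFS:
        start = seq.find(motif)
        while start != -1:
            hits.append((start, motif))
            start = seq.find(motif, start + 1)  # allow overlapping hits
    return sorted(hits)


def translate(seq):
    """Translate an in-frame DNA coding sequence to one-letter amino acids."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def remove_ins(seq):
    """Greedily swap synonymous codons while each swap lowers the INS count."""
    if len(seq) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    improved = True
    while improved:
        improved = False
        for i in range(len(codons)):
            current = len(find_ins("".join(codons)))
            for alt in SYNONYMS[codons[i]]:
                trial = codons[:i] + [alt] + codons[i + 1:]
                if len(find_ins("".join(trial))) < current:
                    codons, improved = trial, True
                    break
    return "".join(codons)


example = "ATGATAGATAAAAGTTATAAA"  # made-up 7-codon sequence, not from Table 1
cleaned = remove_ins(example)
```

Because every accepted swap strictly reduces the motif count, the loop terminates; removal may still be partial where no synonymous alternative eliminates a motif, consistent with the statement that INS removal may be partial or complete.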
Furthermore, human codon optimization (hCO) results in increased gene expression in human cells. Previous studies using the full-length gag-pol gene demonstrated Rev independence after human codon optimization (Kotsopoulou et al J Virol.74 (10): 4839-52 (2000)). hCO functional unit mRNA was generated to increase translation and stability in target cells. In another iteration, the remaining INS elements are removed from the hCO sequence and replaced.
These improvements have been applied to all Pol polyprotein components and accessory proteins, including RT, IN, matrix (MA), capsid (CA), nucleocapsid (NC) and Vpr RNA.
Increased stability of functional unit proteins
Functional units in this system (e.g., Pol polyprotein components and accessory proteins) may lack the spatiotemporal coordination observed in incoming viral particles because they are translated in the cytoplasm from the input mRNA. Unlike incoming viral particles, in which active IN and RT are packaged near the viral genome, this RNA-based system first requires translation of the incoming functional unit mRNA and subsequent interaction of the newly translated functional units with the transgene-encoding RNA. Thus, the protein lifetime of the functional units is important.
Wild-type IN protein has a short lifetime in cells because the N-terminal residue of wild-type IN is a substrate for proteolytic degradation (Mulder et al. J Biol Chem. 275(38): 29749-53 (2000)). Previous studies using expression plasmids have shown that this degradation can be counteracted by the addition of an N-terminal methionine-glycine dipeptide (Cherepanov et al. FASEB J. 14(10): 1389-99 (2000)). Several N-terminal Met-Gly IN constructs were generated. These mRNAs are derived from the WT HIV gene, are hCO, and/or have had the INS elements removed and replaced.
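For illustration only, the N-terminal Met-Gly addition can be sketched at the sequence level as follows. The function name, the default glycine codon (GGC), and the short input sequences in the usage note are hypothetical placeholders rather than sequences from the disclosure.

```python
GLY_CODONS = {"GGT", "GGC", "GGA", "GGG"}


def add_met_gly(cds, gly_codon="GGC"):
    """Return the coding sequence with an N-terminal Met-Gly dipeptide.

    The glycine codon is an assumption of this sketch; any codon in
    GLY_CODONS encodes glycine.
    """
    cds = cds.upper()
    if gly_codon not in GLY_CODONS:
        raise ValueError("gly_codon must encode glycine")
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    if cds.startswith("ATG"):
        if cds[3:6] in GLY_CODONS:
            return cds  # already begins with Met-Gly
        return "ATG" + gly_codon + cds[3:]  # keep the existing start codon
    return "ATG" + gly_codon + cds
```

For a placeholder input such as "TTTTTAGATGGA", the function prepends ATG and GGC; applying it a second time leaves the sequence unchanged.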
Cytoplasmic redirection of pol-derived functional units in target cells
In some embodiments, RNAs encoding individual gag or pol genes are generated, which encode key structural proteins or RT and IN, respectively, that are necessary for nuclear pore crossing during the pre-integration replication step. The advantage of this strategy is that the functional unit RNA sequence is translated into the native Gag or Pol polyprotein, whereas the separation of the gag and pol genes allows the stoichiometry to be optimized in favor of intracellular activity. Expression of gag and pol genes at wild-type ratios results in the assembly, budding and release of new virus from the producer cell into the extracellular space (Sundquist et al. Cold Spring Harb Perspect Med. 2(7): a006924 (2012)). Although there is some evidence for functional RT and IN activity in producer cells (Al Mosabbi et al. Biotechnol. Lett. 38(10): 1715-21 (2016)), expression of new viral gene products in producer cells is primarily understood to favor the formation of new viral particles. Several studies of expression plasmids encoding gag and pol have shown that viral gene expression can be shifted from particle formation to intracellular activity of the pol products (Al Mosabbi et al. Biotechnol. Lett. 38(10): 1715-21 (2016); Karacostas V et al. Virology 193(2): 661-71 (1993)). To this end, several mechanisms involving DNA plasmid transfection have been established: (1) deletion of the gag sequence encoding the matrix protein, which anchors the Gag polyprotein to the plasma membrane during viral budding (Al Mosabbi et al. Biotechnol. Lett. 38(10): 1715-21 (2016)), and (2) a single nucleotide insertion disrupting the gag-pol frameshift mechanism (Karacostas V et al. Virology 193(2): 661-71 (1993)). While these modifications have been described in the context of a full-length viral genome, they have not been explored for (1) extragenomic single gene expression or (2) synthetic functional RNAs.
To facilitate intracellular activity of functional units expressed as gag- and pol-derived polyproteins, several novel full-length gag and pol RNAs were engineered: (1) a matrix-deleted gag-pol RNA to prevent particle budding, (2) a matrix-deleted single gag RNA, (3) an hCO matrix-deleted gag-pol RNA to prevent particle budding, (4) an RNA containing a mutation that places gag and pol in a single reading frame, (5) an hCO gag-pol frameshift RNA for intracellular Pol activity, and (6) transfection of single gag or pol RNAs at an optimized ratio that results in accumulation of intracellular Pol products.
Priming the RNA genome to promote the reverse transcription reaction
HIV-1 RT relies on cytoplasmic tRNA to prime the viral genome prior to reverse transcription. The efficiency of in vitro reverse transcription reactions has been observed to increase when the viral genome is primed with short RNA or DNA oligonucleotides (Iwatani et al. J Biol Chem. 278(16): 14185-95 (2003)). As a strategy to increase the efficiency of reverse transcription reactions in the integration system, transgene-encoding RNAs are primed with short RNA or DNA oligonucleotides prior to delivery to target cells with functional unit mRNAs.
Uncapped transgene-encoding RNAs
The HIV-1 RNA genome is capped after transcription in the producer cell. After viral entry, the viral genome is uncapped by a cellular host factor, such as RNA debranching enzyme 1 (DBR1), prior to reverse transcription (Galvis et al. J Virol. 91(23): e01377-17 (2017)). To increase the efficiency of the reverse transcription step, uncapped transgene-encoding RNAs were generated and co-transfected with functional unit mRNA.
Example 4: materials and methods
The materials and methods used in the studies described in Examples 5 and 6 are described below.
IVT template design
DNA template sequences for IVT of functional unit RNA, including UTR and 5' T7 promoter, were cloned into pTWIST Amp cloning vector (Twist Bioscience). IVT template sequences of transgenic coding RNA, including LTR and 5' T7 promoter, were cloned into pMK scaffold from GeneArt (ThermoFisher).
RNA preparation and purification
Plasmids containing the IVT template were linearized by digestion with BsmBI or Bsu36I and SfiI (New England Biolabs). Digested DNA was precipitated with isopropanol. Digested DNA templates (30 ng/µL) were incubated for 4 h at 37 °C in the presence of NTPs (transgene-encoding RNA: 7.5 mM CTP, 1.5 mM GTP, 7.5 mM UTP; functional unit RNA: 4 mM ATP, 4 mM CTP, 4 mM GTP, 4 mM UTP), CleanCap (transgene-encoding RNA: CleanCap GG, 6 mM; functional unit RNA: CleanCap AG, 5 mM), 0.1 M MgSO4, 0.1 M spermidine, T7 RNA polymerase (transgene-encoding RNA: 0.2 mg/mL; functional unit RNA: 0.1 mg/mL) and 40 U/mL TIPP (New England Biolabs). The reaction was treated with DNase (Thermo Fisher) and incubated at 37 °C for 15 min. RNA was precipitated with LiCl and resuspended in THE RNA Storage Solution (Thermo Fisher). The in vitro transcribed RNA was enzymatically polyadenylated using E. coli poly(A) polymerase (New England Biolabs). RNA (0.25 µg/µL) was incubated for 20 min at 37 °C in the presence of E. coli poly(A) polymerase (0.1 U/µL), 1X poly(A) polymerase reaction buffer and 10 mM ATP. RNA was precipitated with LiCl and resuspended in THE RNA Storage Solution (functional unit RNA) or purified by HPLC (transgene-encoding RNA). Polyadenylation was confirmed using a BioAnalyzer (Agilent Technologies).
RNA transfection
293FT cells (Thermo Fisher) were transfected with RNA using Lipofectamine 2000 (Thermo Fisher). For transgene expression experiments, cells seeded in 48-well plates were transfected with 500 ng total RNA per well at the specified ratios, complexed with 0.5 µL Lipofectamine 2000 per well. For western blotting, 293FT cells seeded in 12-well plates were transfected with 1 µg of integrase RNA and 2 µL of Lipofectamine 2000 per well. For confocal microscopy analysis, 293FT cells were seeded on poly-D-lysine coated #1 coverslips (Electron Microscopy Sciences) in 24-well plates. Cells were transfected with 1 µg of integrase RNA per well in 24-well plates.
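As an illustrative sketch of the arithmetic behind transfecting a fixed total RNA mass at a specified ratio, the helper below divides the 500 ng per-well total among RNA species. The function name and the 1:1:2 ratio in the usage line are hypothetical; the ratios actually tested are those shown in FIG. 7B.

```python
def split_rna_mass(total_ng, ratio):
    """Split a total RNA mass (ng) among species according to ratio parts."""
    if total_ng <= 0:
        raise ValueError("total_ng must be positive")
    if not ratio or any(share <= 0 for share in ratio.values()):
        raise ValueError("ratio must contain positive parts")
    parts = sum(ratio.values())
    return {name: total_ng * share / parts for name, share in ratio.items()}


# Hypothetical 1:1:2 ratio of Gag RNA : Pol RNA : transgene-encoding RNA.
per_well = split_rna_mass(500, {"gag": 1, "pol": 1, "transgene": 2})
```

With these assumed parts, the 500 ng total divides into 125 ng, 125 ng, and 250 ng per well.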
Western blot
48 hours after transfection, transfected cells were resuspended in RIPA buffer (Pierce, Thermo Fisher) containing Halt protease inhibitor (Pierce, Thermo Fisher). After two freeze-thaw cycles at -80 °C, the cell suspension was lysed in the presence of Benzonase nuclease (Sigma Aldrich) at 4 °C. Lysates were incubated at 37 °C for 5 min, treated with EDTA, and clarified by centrifugation at 21,000 x g at 4 °C for 20 min. Lysates were electrophoresed on NuPage 4-12% Bis-Tris gels (Thermo Fisher) and transferred to 0.2 µm PVDF membranes (Trans-Blot Turbo Mini, Bio-Rad). The membranes were probed with primary antibodies to integrase (Abcam, [IN-2], ab66645) and β-actin (Abcam, ab8227). The integrase and β-actin were detected with fluorescent secondary goat anti-mouse and goat anti-rabbit antibodies (Abcam), respectively. The blots were imaged on a Li-Cor imager.
Immunocytochemistry and confocal microscopy
24 hours after transfection, transfected 293FT cells were washed with PBS and fixed in 4% paraformaldehyde (Electron Microscopy Sciences) for 7 minutes at room temperature. After fixation, cells were washed with PBS. Fixed cells were stained overnight at room temperature with 300 nM DAPI (Biotium), ViaFluor-647 (Biotium) and 1:100 anti-integrase antibody (Abcam, [IN-2], ab66645) conjugated to CF-568 dye (Mix-n-Stain, Biotium) in blocking buffer (5% FBS, 1% BSA and 1X TBS Tween (Thermo Fisher)). The stained coverslips were washed with PBS and mounted onto Superfrost Plus microscope slides (Fisher) using ProLong Glass (Thermo Fisher). The samples were visualized using a Nikon/Yokogawa CSU-W1 spinning disk confocal microscope.
Nano luciferase assay
Cells were lysed in Nano-Glo buffer (Promega) without furimazine at 37℃for 10 min. The Nano-luciferase assay was performed in 96-well opaque plates (Costar) by adding 25 μl of Nano-Glo buffer and 1:50 of furimazine (Promega) to 20 μl of cell lysate per well. Immediately after furimazine addition, luminescence was read on a BioTek HTX plate reader.
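The reagent volumes for the assay can be sketched as follows, as an illustration only. The function name and the overage parameter are hypothetical, and the sketch assumes that "1:50" means one volume of furimazine per 50 volumes of Nano-Glo buffer; if a different dilution convention was intended, the divisor would change accordingly.

```python
def nano_glo_mix(wells, buffer_per_well_ul=25.0, dilution=50.0, overage=1.1):
    """Return total buffer and furimazine volumes (in µL) for a set of wells.

    Assumes 1 volume of furimazine per `dilution` volumes of buffer, plus a
    fractional overage for pipetting loss (overage is a hypothetical parameter).
    """
    if wells <= 0 or buffer_per_well_ul <= 0 or dilution <= 0 or overage < 1.0:
        raise ValueError("invalid assay parameters")
    buffer_ul = wells * buffer_per_well_ul * overage
    furimazine_ul = buffer_ul / dilution
    return {"buffer_ul": buffer_ul, "furimazine_ul": furimazine_ul}


# Ten wells with no overage, each receiving 25 µL of buffer.
mix = nano_glo_mix(10, overage=1.0)
```

Under the stated assumption, ten wells without overage require 250 µL of buffer and 5 µL of furimazine, added to 20 µL of lysate per well.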
Example 5: RNA-derived functional units drive expression of reporter genes from transgene-encoding RNA
293FT cells were transfected with RNA encoding Gag polyprotein, RNA encoding Pol polyprotein and transgene-encoding RNA encoding nano-luciferase (Nluc), or with transgene-encoding RNA alone. The RNA constructs are shown in FIG. 7A. Cells were harvested 3 days after transfection for the Nluc assay. Nluc activity was assessed with a luminometer and is shown as relative light units (RLU) over background (FIG. 7B). The proportion of each RNA species in the 500 ng RNA transfection per well of a 48-well plate is shown. Statistical significance was determined by Student's t-test; n=3. These results indicate that lentiviral RNA-derived functional units are capable of driving reporter gene expression from LTR-flanked transgene-encoding RNAs. The results further guide the stoichiometry for transfection of individual functional unit mRNAs and transgene-encoding RNA.
Example 6: stabilization of functional unit expression
An optimized integrase (IN)-encoding functional unit mRNA with one or more of the modifications described in Example 3 to increase RNA and/or protein stability is more stably expressed than wild-type integrase and localizes to the nucleus of human cells. The RNA construct design elements encoding wild-type IN, stabilized Met-Gly IN, hCO IN and stabilized Met-Gly hCO ΔINS IN are shown in FIG. 8A. In ΔINS IN, instability sequences (INS) were identified and codon optimized to increase RNA stability.
293FT cells were transfected with each indicated RNA, harvested 48 hours post-transfection (p.t.), and lysed for western blotting. The level of integrase expression from the optimized constructs was higher than from the construct encoding wild-type IN. At 48 h p.t., no wild-type IN protein band was detected, in contrast to the bands from the optimized constructs (FIG. 8C). Subcellular localization of optimized IN (Met-Gly hCO ΔINS IN) after RNA transfection into 293FT cells was also determined (FIG. 8D). Cells were transfected with optimized IN RNA, fixed 24 hours after transfection, and processed for immunofluorescence. The fixed cells were stained with conjugated anti-IN antibody, DAPI (nucleus) and phalloidin-647 (actin). Integrase is strongly expressed and localized to the nucleus. These results indicate that the optimized integrase-encoding functional unit mRNA exhibits enhanced expression and localization to the target organelle.
A summary of the exemplary nucleotide sequences described in the examples is provided in table 1 below.
TABLE 1 exemplary nucleotide sequences
[Table 1 is provided as an image in the original publication.]
Description of the embodiments
The following embodiments are within the scope of the present disclosure. Furthermore, this disclosure encompasses all variations, combinations, and permutations of these embodiments, wherein one or more of the limitations, elements, clauses, and descriptive terms of one or more of the listed embodiments are introduced into another embodiment listed in this section. For example, any listed embodiment that depends on another embodiment may be modified to include one or more limitations found in any other listed embodiment in this section that depends on the same basic embodiment. Where elements are presented in a list format, such as in Markush group format, each subgroup of elements is also disclosed, and any elements may be deleted from the group. It should be understood that, in general, where the disclosure or aspects of the disclosure are referred to as comprising particular elements and/or features, certain embodiments of the invention or aspects of the invention consist of or consist essentially of such elements and/or features. It should also be noted that the terms "comprising" and "including" are intended to be open-ended and allow for the inclusion of additional elements or steps. Where ranges are given, endpoints are also included. Furthermore, unless otherwise indicated or apparent from the context and understanding of one of ordinary skill in the art, values expressed as ranges can employ any particular value or subrange within the ranges described in the various embodiments of the invention to one tenth of the unit of the lower limit of the range unless the context clearly dictates otherwise.
1. A composition comprising:
(i) One or more nucleic acid molecules encoding one or more Pol polyprotein components flanked by 5' and 3' untranslated regions (UTRs); and
(ii) A nucleic acid molecule comprising one or more reverse transcriptase initiation elements located between a 5' long terminal repeat (LTR) and a 3' LTR and one or more promoter sequences operably linked to one or more transgenes,
wherein the composition is free of nucleic acid sequences expressing proteins encoded by the retroviral rev and env genes.
2. The composition of embodiment 1, wherein expression of the Pol polyprotein component does not require translational slippage from the in-line gag gene.
3. The composition of embodiments 1-2, wherein one or more of the nucleic acid molecules of (i) does not encode a Gag polyprotein.
4. The composition of embodiments 1-3, wherein the one or more nucleic acid molecules of (i) are nucleic acid molecules comprising a 5' UTR, a nucleic acid sequence encoding a Pol polyprotein, and a 3' UTR.
5. The composition of embodiments 1-3, wherein the one or more nucleic acid molecules of (i) are nucleic acid molecules comprising a 5' UTR, nucleic acid sequences encoding at least the Pol polyprotein components reverse transcriptase and integrase, and a 3' UTR.
6. The composition of embodiment 5, wherein the Pol polyprotein component reverse transcriptase and integrase are expressed on a bicistronic construct.
7. The composition of embodiments 5-6, wherein the Pol polyprotein component reverse transcriptase and integrase are expressed together with an inserted Internal Ribosome Entry Site (IRES) or 2A peptide coding sequence.
8. The composition of embodiments 1-3, wherein the one or more nucleic acid molecules of (i) are two nucleic acid molecules, wherein one of the two nucleic acid molecules comprises a 5' UTR, a nucleic acid sequence encoding at least the Pol polyprotein component reverse transcriptase, and a 3' UTR, and wherein the second of the two nucleic acid molecules comprises a 5' UTR, a nucleic acid sequence encoding at least the Pol polyprotein component integrase, and a 3' UTR.
9. The composition of embodiments 1-8, wherein the one or more nucleic acid molecules of (i) further comprise a nucleic acid encoding one or more Gag polyprotein helper proteins.
10. The composition of embodiment 9, wherein the one or more Gag polyprotein auxiliary proteins are encoded on the same nucleic acid molecule as the Pol polyprotein component.
11. The composition of embodiment 10, wherein the one or more Gag polyprotein auxiliary proteins are expressed together with an inserted Internal Ribosome Entry Site (IRES) or 2A peptide coding sequence.
12. The composition of embodiment 9, wherein the one or more Gag polyprotein auxiliary proteins are encoded by one or more nucleic acid molecules that are different from the Pol polyprotein component, wherein each nucleic acid molecule comprises a 5'utr and a 3' utr.
13. The composition of any of embodiments 9-12, wherein the Gag multimeric protein helper protein is selected from the group consisting of Nucleocapsid (NC), capsid protein (CA), matrix protein (MA), p6, viral infectious agent (Vif), transcription transactivator (Tat), negative regulator (Nef), viral protein R (Vpr) and viral protein u (Vpu).
14. The composition of embodiment 13, wherein the Gag multimeric protein helper protein comprises a Nucleocapsid (NC), a capsid protein (CA), a matrix protein (MA), p6, a viral infectious agent (Vif), a transcription transactivator (Tat), a negative regulator (Nef), a viral protein R (Vpr) and a viral protein u (Vpu).
15. The composition of embodiment 14, wherein the Gag polyprotein auxiliary protein is encoded by a Gag polyprotein.
16. The composition of any of embodiments 1-15, wherein the nucleic acid molecules of (i) and (ii) are RNA molecules or DNA molecules.
17. The composition of embodiment 16, wherein the nucleic acid molecules of (i) and (ii) are ssDNA molecules or dsDNA molecules.
18. The composition of embodiment 16, wherein the nucleic acid molecules of (i) and (ii) are ssRNA molecules.
19. The composition of any of embodiments 1-18, wherein the nucleic acid molecule of (ii) comprises two or more transgenes, and the transgenes are separated by one or more Internal Ribosome Entry Sites (IRES) and/or one or more 2A peptide coding sequences.
20. The composition of any of embodiments 1-19, wherein the nucleic acid molecule of (ii) further comprises one or more enhancers.
21. The composition of embodiment 20, wherein the one or more enhancers comprise a woodchuck hepatitis virus post-transcriptional regulatory element (WPRE).
22. The composition of any one of embodiments 1-21, wherein one or more of the nucleic acid molecules of (i) and/or (ii) is an RNA molecule and comprises one or more modifications selected from the group consisting of modified ribonucleosides, a 5' 7mG cap structure, and a poly(rA) tail.
23. The composition of embodiment 22, wherein the modified ribonucleoside is pseudouridine or a derivative of pseudouridine.
24. The composition of any of embodiments 1-23, wherein the nucleic acid molecule of (ii) is an RNA molecule and further comprises one or more modifications selected from the group consisting of a 5' 7mG cap structure and a poly(rA) tail.
25. The composition of any one of embodiments 1-24, wherein the Pol proteins and/or LTRs are based on Pol proteins and/or LTRs from: human immunodeficiency virus (HIV), simian immunodeficiency virus (SIV), visna/maedi virus (VMV), caprine arthritis-encephalitis virus (CAEV), equine infectious anemia virus (EIAV), feline immunodeficiency virus (FIV), bovine immunodeficiency virus (BIV), human foamy virus (HFV), murine leukemia virus (MLV), Moloney murine leukemia virus (MoLV), Friend virus (FV), Abelson murine leukemia virus (A-MLV), murine stem cell virus (MSCV), mouse mammary tumor virus (MMTV), Moloney murine sarcoma virus (MoMSV), Rous sarcoma virus (RSV), Fujinami sarcoma virus (FuSV), FBR murine osteosarcoma virus (FBR MSV), avian myelocytomatosis virus 29 (MC29), avian erythroblastosis virus (AEV), or human T-cell leukemia virus (HTLV).
26. The composition of any of embodiments 1-25, wherein the composition is packaged in a non-viral delivery system.
27. The composition of embodiment 26, wherein the composition is packaged in a lipid nanoparticle.
28. A composition comprising:
(i) A first RNA molecule comprising: a 5 'untranslated region (UTR), a nucleic acid sequence encoding a retroviral Pol polyprotein, and a 3' UTR; and
(ii) A second RNA molecule comprising one or more reverse transcriptase initiation elements located between a 5 'Long Terminal Repeat (LTR) and a 3' LTR and one or more promoter sequences operably linked to one or more transgenes;
wherein the composition is free of nucleic acid sequences expressing proteins encoded by retroviral rev and env genes, and wherein the composition is packaged in a non-viral delivery system.
29. The composition of embodiment 28, wherein the RNA molecules of (i) and (ii) are ssRNA molecules.
30. The composition of any of embodiments 28-29, further comprising a third RNA molecule encoding one or more accessory proteins selected from the group consisting of nucleocapsid (NC), capsid protein (CA), viral infectivity factor (Vif), transactivator of transcription (Tat), negative regulatory factor (Nef), viral protein R (Vpr), and viral protein U (Vpu).
31. The composition of embodiment 28, wherein the first RNA molecule further comprises a nucleic acid sequence encoding one or more accessory proteins selected from NC, CA, Vif, Tat, Nef, Vpr, and Vpu.
32. The composition of any of embodiments 28-31, wherein the second RNA molecule comprises two or more transgenes and the transgenes are separated by one or more internal ribosome entry sites (IRES) and/or one or more 2A peptide coding sequences.
33. The composition of any one of embodiments 28-32, wherein the second RNA molecule further comprises one or more enhancers.
34. The composition of embodiment 33, wherein the one or more enhancers comprise a woodchuck hepatitis virus post-transcriptional regulatory element (WPRE).
35. The composition of any one of embodiments 28-34, wherein the first RNA molecule comprises one or more modifications selected from the group consisting of a modified nucleoside, a 5' 7mG cap structure, and a poly(rA) tail.
36. The composition of any one of embodiments 28-35, wherein the second RNA molecule comprises one or more modifications selected from the group consisting of a 5' 7mG cap structure and a poly(rA) tail.
37. The composition of any of embodiments 1-36, wherein the one or more promoters comprise one or more tissue-specific or cell-specific promoters.
38. The composition of embodiment 37, wherein the one or more tissue-specific or cell-specific promoters are specific for bone marrow, hematopoietic Stem Cells (HSCs), epithelial cells, hepatocytes, vision cells, muscle cells, or T cells.
39. The composition of any one of embodiments 1-36, wherein one or more promoters comprise the hCMV promoter.
40. The composition of any of embodiments 1-39, wherein the one or more transgenes encode one or more therapeutic, diagnostic, or reporter molecules, or fragments thereof.
41. The composition of embodiment 40, wherein the one or more transgenes encode one or more therapeutic, diagnostic or reporter proteins, or fragments thereof.
42. The composition of embodiment 41, wherein the therapeutic protein is β-globin, cystic fibrosis transmembrane conductance regulator (CFTR), factor VIII, dystrophin, or retinitis pigmentosa GTPase regulator (RPGR).
43. The composition of embodiment 41, wherein the reporter protein is a fluorescent protein or a luciferase.
44. The composition of any one of embodiments 38-43, wherein the Pol proteins, accessory proteins, and/or LTRs are based on Pol proteins, accessory proteins, and/or LTRs from: human immunodeficiency virus (HIV), simian immunodeficiency virus (SIV), visna/maedi virus (VMV), caprine arthritis-encephalitis virus (CAEV), equine infectious anemia virus (EIAV), feline immunodeficiency virus (FIV), bovine immunodeficiency virus (BIV), human foamy virus (HFV), murine leukemia virus (MLV), Moloney murine leukemia virus (MoLV), Friend virus (FV), Abelson murine leukemia virus (A-MLV), murine stem cell virus (MSCV), mouse mammary tumor virus (MMTV), Moloney murine sarcoma virus (MoMSV), Rous sarcoma virus (RSV), Fujinami sarcoma virus (FuSV), FBR murine osteosarcoma virus (FBR MSV), avian myelocytomatosis virus 29 (MC29), avian erythroblastosis virus (AEV), or human T-cell leukemia virus (HTLV).
45. The composition of any of embodiments 26 or 28-44, wherein the non-viral delivery system targets a specific tissue or cell type.
46. The composition of embodiment 45, wherein the specific tissue or cell type is bone marrow, HSC, epithelial cells, hepatocytes, vision cells, muscle cells, or T cells.
47. The composition of any of embodiments 28-46, wherein the non-viral delivery system is a lipid nanoparticle, a liposome, a polypeptide nanoparticle, a silica nanoparticle, a gold nanoparticle, a polymer nanoparticle, a dendrimer, or a cationic nanoemulsion.
48. A method of expressing a gene in a subject in need thereof, the method comprising administering to the subject an effective amount of the composition of any one of embodiments 1-47, thereby expressing one or more transgenes in the subject.
49. A method for expressing a gene in a cell, the method comprising delivering the composition of any one of embodiments 1-47 to the cell, thereby expressing one or more transgenes in the cell.
50. A method of using the composition of any of embodiments 1-47, the method comprising delivering the composition to a subject, thereby expressing one or more transgenes in the subject.
51. A method of treating a disease or disorder in a subject in need thereof, the method comprising delivering the composition of any one of embodiments 1-47 to the subject, thereby expressing one or more transgenes in the subject.
52. The method of embodiment 51, wherein the disease or disorder is a genetic disease or disorder.
53. The method of embodiment 51, wherein the disease or disorder is a genetic disease or disorder.
54. The method of embodiment 51, wherein the disease or disorder is sickle cell disease, β-thalassemia, hemophilia B, retinitis pigmentosa, Duchenne muscular dystrophy, cystic fibrosis, or cancer.
55. The method of any one of embodiments 48-54, wherein one or more transgenes are integrated into the genome of the target cell.
56. The method of embodiment 55, wherein stable expression of one or more transgenes persists for at least one week, at least two weeks, at least one month, at least 6 months, at least one year, or for the lifetime of the subject.
57. A method of eliciting an immune response in a subject in need thereof, the method comprising administering to the subject an effective amount of the composition of any one of embodiments 1-47, thereby expressing one or more transgenes in the subject.
58. The method of embodiment 57, wherein the subject has cancer and the one or more transgenes encode a tumor antigen.
59. The method of embodiment 57, wherein the subject has or is at risk of having an infectious disease, and the one or more transgenes encode an antigen associated with the infectious disease.
60. The method of any of embodiments 48-59, wherein the composition is delivered locally or systemically.
61. The method of embodiment 60, wherein the composition is delivered by injection or inhalation, or by intravenous, intraperitoneal, subcutaneous, intramuscular, oral, intranasal, pulmonary, transdermal, transmucosal, or intratumoral administration.
62. One or more nucleic acid expression cassettes for use in the composition of any one of embodiments 1-47, the nucleic acid expression cassettes comprising a 5' UTR, a nucleic acid sequence encoding one or more retroviral Pol polyprotein components, and a 3' UTR, wherein expression of the Pol polyprotein components does not require translational slippage from an in-line gag gene.
63. The nucleic acid expression cassette of embodiment 62, further comprising a nucleic acid sequence encoding one or more accessory proteins selected from the group consisting of nucleocapsid (NC), capsid protein (CA), viral infectivity factor (Vif), transactivator of transcription (Tat), negative regulatory factor (Nef), viral protein R (Vpr), and viral protein U (Vpu).
64. The nucleic acid expression cassette of embodiment 62 or 63, wherein the Pol proteins and/or accessory proteins are based on Pol proteins and accessory proteins from: human immunodeficiency virus (HIV), simian immunodeficiency virus (SIV), visna/maedi virus (VMV), caprine arthritis-encephalitis virus (CAEV), equine infectious anemia virus (EIAV), feline immunodeficiency virus (FIV), bovine immunodeficiency virus (BIV), human foamy virus (HFV), murine leukemia virus (MLV), Moloney murine leukemia virus (MoLV), Friend virus (FV), Abelson murine leukemia virus (A-MLV), murine stem cell virus (MSCV), mouse mammary tumor virus (MMTV), Moloney murine sarcoma virus (MoMSV), Rous sarcoma virus (RSV), Fujinami sarcoma virus (FuSV), FBR murine osteosarcoma virus (FBR MSV), avian myelocytomatosis virus 29 (MC29), avian erythroblastosis virus (AEV), or human T-cell leukemia virus (HTLV).
65. A method of producing an RNA molecule comprising in vitro transcribing the nucleic acid expression cassette of any one of embodiments 62-64.
66. A kit comprising one or more containers comprising the composition of any one of embodiments 1-47 or the nucleic acid expression cassette of any one of embodiments 62-65.
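As a reading aid only, and not as part of the disclosure or as claim construction, the two-RNA layout recited in embodiment 28 can be sketched as a small data model. This is a minimal Python sketch under stated assumptions: the `RnaMolecule` record, the element label strings, and the function names are all hypothetical; "operably linked" is simplified to "promoter listed upstream of the transgene"; and packaging in a non-viral delivery system is not modeled.

```python
from dataclasses import dataclass

# Hypothetical labels for gene products that embodiment 28 excludes.
EXCLUDED_GENES = {"rev", "env"}


@dataclass(frozen=True)
class RnaMolecule:
    """Illustrative record: element labels listed in 5'-to-3' order."""
    name: str
    elements: tuple


def is_pol_molecule(m):
    """True if the molecule reads 5' UTR ... pol ... 3' UTR."""
    e = m.elements
    return len(e) >= 3 and e[0] == "5' UTR" and e[-1] == "3' UTR" and "pol" in e[1:-1]


def is_transgene_molecule(m):
    """True if an RT initiation element sits between the LTRs and a
    promoter is listed upstream of a transgene (a simplification)."""
    e = m.elements
    if "5' LTR" not in e or "3' LTR" not in e:
        return False
    between_ltrs = e[e.index("5' LTR") + 1 : e.index("3' LTR")]
    if "rt initiation element" not in between_ltrs:
        return False
    if "promoter" not in e or "transgene" not in e:
        return False
    return e.index("promoter") < e.index("transgene")


def matches_two_rna_layout(molecules):
    """Check the simplified layout: no rev/env, one Pol RNA, one transgene RNA."""
    if any(EXCLUDED_GENES & set(m.elements) for m in molecules):
        return False
    return any(is_pol_molecule(m) for m in molecules) and any(
        is_transgene_molecule(m) for m in molecules
    )


pol_rna = RnaMolecule("first RNA", ("5' UTR", "pol", "3' UTR"))
transgene_rna = RnaMolecule(
    "second RNA",
    ("5' LTR", "rt initiation element", "promoter", "transgene", "3' LTR"),
)
print(matches_two_rna_layout([pol_rna, transgene_rna]))
```

The check is deliberately coarse: it works on labels rather than sequences, so it can only illustrate which elements the embodiment names and which gene products it excludes.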
Equivalents and scope
Although several inventive embodiments have been described and illustrated herein, one of ordinary skill in the art will readily envision a variety of other means and/or structures for performing the functions and/or obtaining the results and/or one or more of the advantages described herein, and each of such variations and/or modifications is deemed to be within the scope of the inventive embodiments described herein. More generally, those skilled in the art will readily appreciate that all parameters, dimensions, materials, and configurations described herein are meant to be exemplary and that the actual parameters, dimensions, materials, and/or configurations will depend upon the specific application or applications for which the teachings of the invention are used. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific inventive embodiments described herein. It is, therefore, to be understood that the foregoing embodiments are presented by way of example only and that, within the scope of the appended claims and equivalents thereto, inventive embodiments may be practiced otherwise than as specifically described and claimed. Inventive embodiments of the present disclosure are directed to each individual feature, system, article, material, kit, and/or method described herein. Furthermore, any combination of two or more such features, systems, articles, materials, kits, and/or methods is included within the scope of the present invention, provided that such features, systems, articles, materials, kits, and/or methods are not mutually inconsistent.
All definitions as defined and used herein should be understood to control over dictionary definitions, definitions in documents incorporated by reference, and/or ordinary meanings of the defined terms.
All references, patents, and patent applications disclosed herein are incorporated by reference with respect to the subject matter for which each is cited, which in some cases may encompass the entire document.
The indefinite articles "a" and "an" as used in the specification and claims should be understood to mean "at least one" unless explicitly indicated to the contrary.
The phrase "and/or" as used in the specification and claims should be understood to mean "either or both" of the elements so combined, i.e., elements that are in some cases present in combination and in other cases present separately. Multiple elements listed with "and/or" should be interpreted in the same manner, i.e., as "one or more" of the elements so combined. In addition to the elements specifically identified by the "and/or" clause, other elements may optionally be present, whether or not related to those specifically identified elements. Thus, as a non-limiting example, a reference to "A and/or B," when used in conjunction with open-ended language such as "comprising," may refer in one embodiment to A only (optionally including elements other than B); in another embodiment, to B only (optionally including elements other than A); in yet another embodiment, to both A and B (optionally including other elements); etc.
As used herein in the specification and claims, "or" should be understood to have the same meaning as "and/or" as defined above. For example, when separating items in a list, "or" or "and/or" should be construed as inclusive, i.e., including at least one element, but also including more than one element, of a number or list of elements, and optionally other unlisted items. Only terms clearly indicating the contrary, such as "only one of" or "exactly one of," or, when used in the claims, "consisting of," refer to the inclusion of exactly one element of a number or list of elements. In general, the term "or" as used herein is to be interpreted as indicating exclusive alternatives (i.e., "one or the other, but not both") only when preceded by terms of exclusivity, such as "either," "one of," "only one of," or "exactly one of." "Consisting essentially of," when used in the claims, has its ordinary meaning as used in the field of patent law.
As used herein in the specification and claims, the phrase "at least one," in reference to a list of one or more elements, should be understood to mean at least one element selected from any one or more of the elements in the list, but not necessarily including at least one of each element specifically listed, and not excluding any combination of elements in the list. This definition also allows that elements other than those specifically identified within the list to which the phrase "at least one" refers may optionally be present, whether or not related to those elements specifically identified. Thus, as a non-limiting example, "at least one of A and B" (or, equivalently, "at least one of A or B," or, equivalently, "at least one of A and/or B") may refer in one embodiment to at least one, optionally including more than one, A, with no B present (and optionally including elements other than B); in another embodiment, to at least one, optionally including more than one, B, with no A present (and optionally including elements other than A); in yet another embodiment, to at least one, optionally including more than one, A, and at least one, optionally including more than one, B (and optionally including other elements); etc.
It should also be understood that, unless explicitly indicated to the contrary, in any method claimed herein that includes more than one step or act, the order of the steps or acts of the method is not necessarily limited to the order in which the steps or acts are recited.
In the claims and in the description above, all transitional phrases such as "comprising," "including," "carrying," "having," "containing," "involving," "holding," "composed of," and the like are to be construed as open-ended, i.e., to mean including but not limited to. Only the transitional phrases "consisting of" and "consisting essentially of" shall be closed or semi-closed transitional phrases, respectively, as set forth in Section 2111.03 of the United States Patent Office Manual of Patent Examining Procedures. It should be understood that embodiments described in this document using an open-ended transitional phrase (e.g., "comprising") are also contemplated, in alternative embodiments, as "consisting of" and "consisting essentially of" the feature described by the open-ended transitional phrase. For example, if the present disclosure describes "a composition comprising A and B," the present disclosure also contemplates the alternative embodiments "a composition consisting of A and B" and "a composition consisting essentially of A and B."
Sequence listing
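In the listing below, each residue line ends with a running base count and the <211> field gives the total length of the sequence. As a reading aid only, the following minimal Python sketch shows one way such lines could be joined and the counts cross-checked. The function name `join_residue_lines` and its error messages are hypothetical and not part of the disclosure; the two sample lines are the first two residue lines of SEQ ID NO: 1.

```python
import re


def join_residue_lines(lines):
    """Join residue lines of the form '<groups of bases> <running count>'.

    Raises ValueError if a line contains unexpected characters or if the
    running count at the end of a line does not match the bases seen so far.
    """
    pieces = []
    total = 0
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        *groups, count = parts
        chunk = "".join(groups)
        if not re.fullmatch(r"[acgtun]+", chunk):
            raise ValueError(f"unexpected residue characters in: {line!r}")
        total += len(chunk)
        if total != int(count):
            raise ValueError(f"running count {count} does not match {total}")
        pieces.append(chunk)
    return "".join(pieces)


sample = [
    "agaccagatc tgagcctggg agctctctgg ctaactaggg aacccactgc ttaagcctca 60",
    "ataaagcttg ccttgagtgc ttcaagtagt gtgtgcccgt ctgttgtgtg actctggtaa 120",
]
sequence = join_residue_lines(sample)
print(len(sequence))
```

Applied to a complete entry, the length of the joined string can then be compared with the value stated in that entry's <211> field.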
<110> Greenlight Biosciences Inc
<120> Nucleic acid therapy for genetic disorders
<130> G0830.70039WO00
<140> Not yet assigned
<141> Concurrently herewith
<150> US 63/053,474
<151> 2020-07-17
<160> 19
<170> PatentIn version 3.5
<210> 1
<211> 4664
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 1
agaccagatc tgagcctggg agctctctgg ctaactaggg aacccactgc ttaagcctca 60
ataaagcttg ccttgagtgc ttcaagtagt gtgtgcccgt ctgttgtgtg actctggtaa 120
ctagagatcc ctcagaccct tttagtcagt gtggaaaatc tctagcagtg gcgcccgaac 180
agggacttga aagcgaaagg gaaaccagag gagctctctc gacgcaggac tcggcttgct 240
gaagcgcgca cggcaagagg cgaggggcgg cgactggtga gtacgccaaa aattttgact 300
agcggaggct agaaggagag agatgggtgc gagagcgtca gtattaagcg ggggagaatt 360
agatcgcgat gggaaaaaat tcggttaagg ccagggggaa agaaaaaata taaattaaaa 420
catatagtat gggcaagcag ggagctagaa cgattcgcag ttaatcctgg cctgttagaa 480
acatcagaag gctgtagaca aatactggga cagctacaac catcccttca gacaggatca 540
gaagaactta gatcattata taatacagta gcaaccctct attgtgtgca tcaaaggata 600
gagataaaag acaccaagga agctttagac aagatagagg aagagcaaaa caaaagtaag 660
accaccgcac agcaagcggc cgctgatctt cagacctgga ggaggagata tgagggacaa 720
ttggagaagt gaattatata aatataaagt agtaaaaatt gaaccattag gagtagcacc 780
caccaaggca aagagaagag tggtgcagag agaaaaaaga gcagtgggaa taggagcttt 840
gttccttggg ttcttgggag cagcaggaag cactatgggc gcagcgtcaa tgacgctgac 900
ggtacaggcc agacaattat tgtctggtat agtgcagcag cagaacaatt tgctgagggc 960
tattgaggcg caacagcatc tgttgcaact cacagtctgg ggcatcaagc agctccaggc 1020
aagaatcctg gctgtggaaa gatacctaaa ggatcaacag ctcctgggga tttggggttg 1080
ctctggaaaa ctcatttgca ccactgctgt gccttggaat gctagttgga gtaataaatc 1140
tctggaacag atttggaatc acacgacctg gatggagtgg gacagagaaa ttaacaatta 1200
cacaagctta atacactcct taattgaaga atcgcaaaac cagcaagaaa agaatgaaca 1260
agaattattg gaattagata aatgggcaag tttgtggaat tggtttaaca taacaaattg 1320
gctgtggtat ataaaattat tcataatgat agtaggaggc ttggtaggtt taagaatagt 1380
ttttgctgta ctttctatag tgaatagagt taggcaggga tattcaccat tatcgtttca 1440
gacccacctc ccaaccccga ggggacccga caggcccgaa ggaatagaag aagaaggtgg 1500
agagagagac agagacagat ccattcgatt agtgaacgga tctcgacggt atcggttaac 1560
ttttaaaaga aaagggggga ttggggggta cagtgcaggg gaaagaatag tagacataat 1620
agcaacagac atacaaacta aagaattaca aaaacaaatt acaaaaattc aaaattttat 1680
cgataagctt gggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg 1740
cccaacgacc cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata 1800
gggactttcc attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta 1860
catcaagtgt atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc 1920
gcctggcatt atgcccagta catgacctta tgggactttc ctacttggca gtacatctac 1980
gtattagtca tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga 2040
tagcggtttg actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg 2100
ttttggcacc aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg 2160
caaatgggcg gtaggcgtgt acggtgggag gtctatataa gcagagctcg tttagtgaac 2220
cgtcagatcg cctggagacg ccatccacgc tgttttgacc tccatagaag acaccgactc 2280
tagaggatcc actagtccag tgtggtggaa ttctgcagat atcaacaagt ttgtacaaaa 2340
aagcaggctc caccatggtg agcaagggcg aggagctgtt caccggggtg gtgcccatcc 2400
tggtcgagct ggacggcgac gtaaacggcc acaagttcag cgtgtccggc gagggcgagg 2460
gcgatgccac ctacggcaag ctgaccctga agttcatctg caccaccggc aagctgcccg 2520
tgccctggcc caccctcgtg accaccttca cctacggcgt gcagtgcttc gcccgctacc 2580
ccgaccacat gaagcagcac gacttcttca agtccgccat gcccgaaggc tacgtccagg 2640
agcgcaccat cttcttcaag gacgacggca actacaagac ccgcgccgag gtgaagttcg 2700
agggcgacac cctggtgaac cgcatcgagc tgaagggcat cgacttcaag gaggacggca 2760
acatcctggg gcacaagctg gagtacaact acaacagcca caaggtctat atcaccgccg 2820
acaagcagaa gaacggcatc aaggtgaact tcaagacccg ccacaacatc gaggacggca 2880
gcgtgcagct cgccgaccac taccagcaga acacccccat cggcgacggc cccgtgctgc 2940
tgcccgacaa ccactacctg agcacccagt ccgccctgag caaagacccc aacgagaagc 3000
gcgatcacat ggtcctgctg gagttcgtga ccgccgccgg gatcactctc ggcatggacg 3060
agctgtacaa gtaaacccag ctttcttgta caaagtggtt gatatccagc acagtggcgg 3120
ccgctcgagt ctagagggcc cgcggttcga aggtaagcct atccctaacc ctctcctcgg 3180
tctcgattct acgcgtaccg gttagtaatg atcgacaatc aacctctgga ttacaaaatt 3240
tgtgaaagat tgactggtat tcttaactat gttgctcctt ttacgctatg tggatacgct 3300
gctttaatgc ctttgtatca tgctattgct tcccgtatgg ctttcatttt ctcctccttg 3360
tataaatcct ggttgctgtc tctttatgag gagttgtggc ccgttgtcag gcaacgtggc 3420
gtggtgtgca ctgtgtttgc tgacgcaacc cccactggtt ggggcattgc caccacctgt 3480
cagctccttt ccgggacttt cgctttcccc ctccctattg ccacggcgga actcatcgcc 3540
gcctgccttg cccgctgctg gacaggggct cggctgttgg gcactgacaa ttccgtggtg 3600
ttgtcgggga agctgacgtc ctttccatgg ctgctcgcct gtgttgccac ctggattctg 3660
cgcgggacgt ccttctgcta cgtcccttcg gccctcaatc cagcggacct tccttcccgc 3720
ggcctgctgc cggctctgcg gcctcttccg cgtcttcgcc ttcgccctca gacgagtcgg 3780
atctcccttt gggccgcctc cccgcctggc gatggtaccg gtgtggaaag tccccaggct 3840
ccccagcagg cagaagtatg caaagcatgc atctcaatta gtcagcaacc aggtgtggaa 3900
agtccccagg ctccccagca ggcagaagta tgcaaagcat gcatctcaat tagtcagcaa 3960
ccatagtccc gcccctaact ccgcccatcc cgcccctaac tccgcccagt tccgcccatt 4020
ctccgcccca tggctgacta atttttttta tttatgcaga ggccgaggcc gcctctgcct 4080
ctgagctatt ccagaagtag tgaggaggct tttttggagg cctaggcttt tgcaaaaagc 4140
tcccgggagc ttgtatatcc attttcggat ctgatcagca cgtgcacaat tcgagctcgg 4200
tacctttaag accaatgact tacaaggcag ctgtagatct tagccacttt ttaaaagaaa 4260
aggggggact ggaagggcta attcactccc aacgaagaca agatctgctt tttgcttgta 4320
ctgggtctct ctggttagac cagatctgag cctgggagct ctctggctaa ctagggaacc 4380
cactgcttaa gcctcaataa agcttgcctt gagtgcttca agtagtgtgt gcccgtctgt 4440
tgtgtgactc tggtaactag agatccctca gaccctttta gtcagtgtgg aaaatctcta 4500
gcagtagtag ttcatgtcat cttattattc agtatttata acttgcaaag aaatgaatat 4560
cagagagtga gaggaacttg tttattgcag cttataatgg ttacaaataa agcaatagca 4620
tcacaaattt cacaaataaa gcattttttt cactgcacct aagg 4664
<210> 2
<211> 4664
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 2
agaccagatc tgagcctggg agctctctgg ctaactaggg aacccactgc ttaagcctca 60
ataaagcttg ccttgagtgc ttcaagtagt gtgtgcccgt ctgttgtgtg actctggtaa 120
ctagagatcc ctcagaccct tttagtcagt gtggaaaatc tctagcagtg gcgcccgaac 180
agggacttga aagcgaaagg gaaaccagag gagctctctc gacgcaggac tcggcttgct 240
gaagcgcgca cggcaagagg cgaggggcgg cgactggtga gtacgccaaa aattttgact 300
agcggaggct agaaggagag agatgggtgc gagagcgtca gtattaagcg ggggagaatt 360
agatcgcgat gggaaaaaat tcggttaagg ccagggggaa agaaaaaata taaattaaaa 420
catatagtat gggcaagcag ggagctagaa cgattcgcag ttaatcctgg cctgttagaa 480
acatcagaag gctgtagaca aatactggga cagctacaac catcccttca gacaggatca 540
gaagaactta gatcattata taatacagta gcaaccctct attgtgtgca tcaaaggata 600
gagataaaag acaccaagga agctttagac aagatagagg aagagcaaaa caaaagtaag 660
accaccgcac agcaagcggc cgctgatctt cagacctgga ggaggagata tgagggacaa 720
ttggagaagt gaattatata aatataaagt agtaaaaatt gaaccattag gagtagcacc 780
caccaaggca aagagaagag tggtgcagag agaaaaaaga gcagtgggaa taggagcttt 840
gttccttggg ttcttgggag cagcaggaag cactatgggc gcagcgtcaa tgacgctgac 900
ggtacaggcc agacaattat tgtctggtat agtgcagcag cagaacaatt tgctgagggc 960
tattgaggcg caacagcatc tgttgcaact cacagtctgg ggcatcaagc agctccaggc 1020
aagaatcctg gctgtggaaa gatacctaaa ggatcaacag ctcctgggga tttggggttg 1080
ctctggaaaa ctcatttgca ccactgctgt gccttggaat gctagttgga gtaataaatc 1140
tctggaacag atttggaatc acacgacctg gatggagtgg gacagagaaa ttaacaatta 1200
cacaagctta atacactcct taattgaaga atcgcaaaac cagcaagaaa agaatgaaca 1260
agaattattg gaattagata aatgggcaag tttgtggaat tggtttaaca taacaaattg 1320
gctgtggtat ataaaattat tcataatgat agtaggaggc ttggtaggtt taagaatagt 1380
ttttgctgta ctttctatag tgaatagagt taggcaggga tattcaccat tatcgtttca 1440
gacccacctc ccaaccccga ggggacccga caggcccgaa ggaatagaag aagaaggtgg 1500
agagagagac agagacagat ccattcgatt agtgaacgga tctcgacggt atcggttaac 1560
ttttaaaaga aaagggggga ttggggggta cagtgcaggg gaaagaatag tagacataat 1620
agcaacagac atacaaacta aagaattaca aaaacaaatt acaaaaattc aaaattttat 1680
cgataagctt gggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg 1740
cccaacgacc cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata 1800
gggactttcc attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta 1860
catcaagtgt atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc 1920
gcctggcatt atgcccagta catgacctta tgggactttc ctacttggca gtacatctac 1980
gtattagtca tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga 2040
tagcggtttg actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg 2100
ttttggcacc aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg 2160
caaatgggcg gtaggcgtgt acggtgggag gtctatataa gcagagctcg tttagtgaac 2220
cgtcagatcg cctggagacg ccatccacgc tgttttgacc tccatagaag acaccgactc 2280
tagaggatcc actagtccag tgtggtggaa ttctgcagat atcaacaagt ttgtacaaaa 2340
aagcaggctc caccatggtg agcaagggcg aggagctgtt caccggggtg gtgcccatcc 2400
tggtcgagct ggacggcgac gtaaacggcc acaagttcag cgtgtccggc gagggcgagg 2460
gcgatgccac ctacggcaag ctgaccctga agttcatctg caccaccggc aagctgcccg 2520
tgccctggcc caccctcgtg accaccttcg gctacggcct gatgtgcttc gcccgctacc 2580
ccgaccacat gaagcagcac gacttcttca agtccgccat gcccgaaggc tacgtccagg 2640
agcgcaccat cttcttcaag gacgacggca actacaagac ccgcgccgag gtgaagttcg 2700
agggcgacac cctggtgaac cgcatcgagc tgaagggcat cgacttcaag gaggacggca 2760
acatcctggg gcacaagctg gagtacaact acaacagcca caacgtctat atcatggccg 2820
acaagcagaa gaacggcatc aaggtgaact tcaagatccg ccacaacatc gaggacggca 2880
gcgtgcagct cgccgaccac taccagcaga acacccccat cggcgacggc cccgtgctgc 2940
tgcccgacaa ccactacctg agctaccagt ccaaactgag caaagacccc aacgagaagc 3000
gcgatcacat ggtcctgctg gagttcgtga ccgccgccgg gatcactctc ggcatggacg 3060
agctgtacaa gtaaacccag ctttcttgta caaagtggtt gatatccagc acagtggcgg 3120
ccgctcgagt ctagagggcc cgcggttcga aggtaagcct atccctaacc ctctcctcgg 3180
tctcgattct acgcgtaccg gttagtaatg atcgacaatc aacctctgga ttacaaaatt 3240
tgtgaaagat tgactggtat tcttaactat gttgctcctt ttacgctatg tggatacgct 3300
gctttaatgc ctttgtatca tgctattgct tcccgtatgg ctttcatttt ctcctccttg 3360
tataaatcct ggttgctgtc tctttatgag gagttgtggc ccgttgtcag gcaacgtggc 3420
gtggtgtgca ctgtgtttgc tgacgcaacc cccactggtt ggggcattgc caccacctgt 3480
cagctccttt ccgggacttt cgctttcccc ctccctattg ccacggcgga actcatcgcc 3540
gcctgccttg cccgctgctg gacaggggct cggctgttgg gcactgacaa ttccgtggtg 3600
ttgtcgggga agctgacgtc ctttccatgg ctgctcgcct gtgttgccac ctggattctg 3660
cgcgggacgt ccttctgcta cgtcccttcg gccctcaatc cagcggacct tccttcccgc 3720
ggcctgctgc cggctctgcg gcctcttccg cgtcttcgcc ttcgccctca gacgagtcgg 3780
atctcccttt gggccgcctc cccgcctggc gatggtaccg gtgtggaaag tccccaggct 3840
ccccagcagg cagaagtatg caaagcatgc atctcaatta gtcagcaacc aggtgtggaa 3900
agtccccagg ctccccagca ggcagaagta tgcaaagcat gcatctcaat tagtcagcaa 3960
ccatagtccc gcccctaact ccgcccatcc cgcccctaac tccgcccagt tccgcccatt 4020
ctccgcccca tggctgacta atttttttta tttatgcaga ggccgaggcc gcctctgcct 4080
ctgagctatt ccagaagtag tgaggaggct tttttggagg cctaggcttt tgcaaaaagc 4140
tcccgggagc ttgtatatcc attttcggat ctgatcagca cgtgcacaat tcgagctcgg 4200
tacctttaag accaatgact tacaaggcag ctgtagatct tagccacttt ttaaaagaaa 4260
aggggggact ggaagggcta attcactccc aacgaagaca agatctgctt tttgcttgta 4320
ctgggtctct ctggttagac cagatctgag cctgggagct ctctggctaa ctagggaacc 4380
cactgcttaa gcctcaataa agcttgcctt gagtgcttca agtagtgtgt gcccgtctgt 4440
tgtgtgactc tggtaactag agatccctca gaccctttta gtcagtgtgg aaaatctcta 4500
gcagtagtag ttcatgtcat cttattattc agtatttata acttgcaaag aaatgaatat 4560
cagagagtga gaggaacttg tttattgcag cttataatgg ttacaaataa agcaatagca 4620
tcacaaattt cacaaataaa gcattttttt cactgcacct aagg 4664
<210> 3
<211> 4346
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 3
agaccagatc tgagcctggg agctctctgg ctaactaggg aacccactgc ttaagcctca 60
ataaagcttg ccttgagtgc ttcaagtagt gtgtgcccgt ctgttgtgtg actctggtaa 120
ctagagatcc ctcagaccct tttagtcagt gtggaaaatc tctagcagtg gcgcccgaac 180
agggacttga aagcgaaagg gaaaccagag gagctctctc gacgcaggac tcggcttgct 240
gaagcgcgca cggcaagagg cgaggggcgg cgactggtga gtacgccaaa aattttgact 300
agcggaggct agaaggagag agatgggtgc gagagcgtca gtattaagcg ggggagaatt 360
agatcgcgat gggaaaaaat tcggttaagg ccagggggaa agaaaaaata taaattaaaa 420
catatagtat gggcaagcag ggagctagaa cgattcgcag ttaatcctgg cctgttagaa 480
acatcagaag gctgtagaca aatactggga cagctacaac catcccttca gacaggatca 540
gaagaactta gatcattata taatacagta gcaaccctct attgtgtgca tcaaaggata 600
gagataaaag acaccaagga agctttagac aagatagagg aagagcaaaa caaaagtaag 660
accaccgcac agcaagcggc cgctgatctt cagacctgga ggaggagata tgagggacaa 720
ttggagaagt gaattatata aatataaagt agtaaaaatt gaaccattag gagtagcacc 780
caccaaggca aagagaagag tggtgcagag agaaaaaaga gcagtgggaa taggagcttt 840
gttccttggg ttcttgggag cagcaggaag cactatgggc gcagcgtcaa tgacgctgac 900
ggtacaggcc agacaattat tgtctggtat agtgcagcag cagaacaatt tgctgagggc 960
tattgaggcg caacagcatc tgttgcaact cacagtctgg ggcatcaagc agctccaggc 1020
aagaatcctg gctgtggaaa gatacctaaa ggatcaacag ctcctgggga tttggggttg 1080
ctctggaaaa ctcatttgca ccactgctgt gccttggaat gctagttgga gtaataaatc 1140
tctggaacag atttggaatc acacgacctg gatggagtgg gacagagaaa ttaacaatta 1200
cacaagctta atacactcct taattgaaga atcgcaaaac cagcaagaaa agaatgaaca 1260
agaattattg gaattagata aatgggcaag tttgtggaat tggtttaaca taacaaattg 1320
gctgtggtat ataaaattat tcataatgat agtaggaggc ttggtaggtt taagaatagt 1380
ttttgctgta ctttctatag tgaatagagt taggcaggga tattcaccat tatcgtttca 1440
gacccacctc ccaaccccga ggggacccga caggcccgaa ggaatagaag aagaaggtgg 1500
agagagagac agagacagat ccattcgatt agtgaacgga tctcgacggt atcggttaac 1560
ttttaaaaga aaagggggga ttggggggta cagtgcaggg gaaagaatag tagacataat 1620
agcaacagac atacaaacta aagaattaca aaaacaaatt acaaaaattc aaaattttat 1680
cgataagctt gggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg 1740
cccaacgacc cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata 1800
gggactttcc attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta 1860
catcaagtgt atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc 1920
gcctggcatt atgcccagta catgacctta tgggactttc ctacttggca gtacatctac 1980
gtattagtca tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga 2040
tagcggtttg actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg 2100
ttttggcacc aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg 2160
caaatgggcg gtaggcgtgt acggtgggag gtctatataa gcagagctcg tttagtgaac 2220
cgtcagatcg cctggagacg ccatccacgc tgttttgacc tccatagaag acaccgactc 2280
tagaggatcc actagtccag tgtggtggaa ttctgcagat atcaacaagt ttgtacaaaa 2340
aagcaggctc caccatggct aagacttccg aacagagggt gaacattgct acactgctga 2400
cagaaaataa gaagaaaatc gtggataagg cttcccagga tctgtggcgg agacacccag 2460
acctgatcgc accaggagga attgctttct ctcagaggga ccgcgctctg tgcctgcgag 2520
attacggctg gttcctgcat ctgatcacct tttgtctgct ggccggagat aagggcccca 2580
tcgagtctat tgggctgatc agtattcgag aaatgtataa ctcactggga gtgcccgtcc 2640
ctgcaatgat ggagagcatt agatgcctga aagaagccag cctgtccctg ctggacgaag 2700
aggacgccaa cgagaccgca ccctactttg attacattat taaggctatg agctaaaccc 2760
agctttcttg tacaaagtgg ttgatatcca gcacagtggc ggccgctcga gtctagaggg 2820
cccgcggttc gaaggtaagc ctatccctaa ccctctcctc ggtctcgatt ctacgcgtac 2880
cggttagtaa tgatcgacaa tcaacctctg gattacaaaa tttgtgaaag attgactggt 2940
attcttaact atgttgctcc ttttacgcta tgtggatacg ctgctttaat gcctttgtat 3000
catgctattg cttcccgtat ggctttcatt ttctcctcct tgtataaatc ctggttgctg 3060
tctctttatg aggagttgtg gcccgttgtc aggcaacgtg gcgtggtgtg cactgtgttt 3120
gctgacgcaa cccccactgg ttggggcatt gccaccacct gtcagctcct ttccgggact 3180
ttcgctttcc ccctccctat tgccacggcg gaactcatcg ccgcctgcct tgcccgctgc 3240
tggacagggg ctcggctgtt gggcactgac aattccgtgg tgttgtcggg gaagctgacg 3300
tcctttccat ggctgctcgc ctgtgttgcc acctggattc tgcgcgggac gtccttctgc 3360
tacgtccctt cggccctcaa tccagcggac cttccttccc gcggcctgct gccggctctg 3420
cggcctcttc cgcgtcttcg ccttcgccct cagacgagtc ggatctccct ttgggccgcc 3480
tccccgcctg gcgatggtac cggtgtggaa agtccccagg ctccccagca ggcagaagta 3540
tgcaaagcat gcatctcaat tagtcagcaa ccaggtgtgg aaagtcccca ggctccccag 3600
caggcagaag tatgcaaagc atgcatctca attagtcagc aaccatagtc ccgcccctaa 3660
ctccgcccat cccgccccta actccgccca gttccgccca ttctccgccc catggctgac 3720
taattttttt tatttatgca gaggccgagg ccgcctctgc ctctgagcta ttccagaagt 3780
agtgaggagg cttttttgga ggcctaggct tttgcaaaaa gctcccggga gcttgtatat 3840
ccattttcgg atctgatcag cacgtgcaca attcgagctc ggtaccttta agaccaatga 3900
cttacaaggc agctgtagat cttagccact ttttaaaaga aaagggggga ctggaagggc 3960
taattcactc ccaacgaaga caagatctgc tttttgcttg tactgggtct ctctggttag 4020
accagatctg agcctgggag ctctctggct aactagggaa cccactgctt aagcctcaat 4080
aaagcttgcc ttgagtgctt caagtagtgt gtgcccgtct gttgtgtgac tctggtaact 4140
agagatccct cagacccttt tagtcagtgt ggaaaatctc tagcagtagt agttcatgtc 4200
atcttattat tcagtattta taacttgcaa agaaatgaat atcagagagt gagaggaact 4260
tgtttattgc agcttataat ggttacaaat aaagcaatag catcacaaat ttcacaaata 4320
aagcattttt ttcactgcac ctaagg 4346
<210> 4
<211> 4663
<212> DNA
<213> artificial sequence
<220>
<223> Synthetic
<400> 4
gggtctctct ggttagacca gatctgagcc tgggagctct ctggctaact agggaaccca 60
ctgcttaagc ctcaataaag cttgccttga gtgcttcaag tagtgtgtgc ccgtctgttg 120
tgtgactctg gtaactagag atccctcaga cccttttagt cagtgtggaa aatctctagc 180
agtggcgccc gaacagggac ttgaaagcga aagggaaacc agaggagctc tctcgacgca 240
ggactcggct tgctgaagcg cgcacggcaa gaggcgaggg gcggcgactg gtgagtacgc 300
caaaaatttt gactagcgga ggctagaagg agagagatgg gtgcgagagc gtcagtatta 360
agcgggggag aattagatcg cgatgggaaa aaattcggtt aaggccaggg ggaaagaaaa 420
aatataaatt aaaacatata gtatgggcaa gcagggagct agaacgattc gcagttaatc 480
ctggcctgtt agaaacatca gaaggctgta gacaaatact gggacagcta caaccatccc 540
ttcagacagg atcagaagaa cttagatcat tatataatac agtagcaacc ctctattgtg 600
tgcatcaaag gatagagata aaagacacca aggaagcttt agacaagata gaggaagagc 660
aaaacaaaag taagaccacc gcacagcaag cggccgctga tcttcagacc tggaggagga 720
gatatgaggg acaattggag aagtgaatta tataaatata aagtagtaaa aattgaacca 780
ttaggagtag cacccaccaa ggcaaagaga agagtggtgc agagagaaaa aagagcagtg 840
ggaataggag ctttgttcct tgggttcttg ggagcagcag gaagcactat gggcgcagcg 900
tcaatgacgc tgacggtaca ggccagacaa ttattgtctg gtatagtgca gcagcagaac 960
aatttgctga gggctattga ggcgcaacag catctgttgc aactcacagt ctggggcatc 1020
aagcagctcc aggcaagaat cctggctgtg gaaagatacc taaaggatca acagctcctg 1080
gggatttggg gttgctctgg aaaactcatt tgcaccactg ctgtgccttg gaatgctagt 1140
tggagtaata aatctctgga acagatttgg aatcacacga cctggatgga gtgggacaga 1200
gaaattaaca attacacaag cttaatacac tccttaattg aagaatcgca aaaccagcaa 1260
gaaaagaatg aacaagaatt attggaatta gataaatggg caagtttgtg gaattggttt 1320
aacataacaa attggctgtg gtatataaaa ttattcataa tgatagtagg aggcttggta 1380
ggtttaagaa tagtttttgc tgtactttct atagtgaata gagttaggca gggatattca 1440
ccattatcgt ttcagaccca cctcccaacc ccgaggggac ccgacaggcc cgaaggaata 1500
gaagaagaag gtggagagag agacagagac agatccattc gattagtgaa cggatctcga 1560
cggtatcggt taacttttaa aagaaaaggg gggattgggg ggtacagtgc aggggaaaga 1620
atagtagaca taatagcaac agacatacaa actaaagaat tacaaaaaca aattacaaaa 1680
attcaaaatt ttatcgataa gcttgggagt tccgcgttac ataacttacg gtaaatggcc 1740
cgcctggctg accgcccaac gacccccgcc cattgacgtc aataatgacg tatgttccca 1800
tagtaacgcc aatagggact ttccattgac gtcaatgggt ggagtattta cggtaaactg 1860
cccacttggc agtacatcaa gtgtatcata tgccaagtac gccccctatt gacgtcaatg 1920
acggtaaatg gcccgcctgg cattatgccc agtacatgac cttatgggac tttcctactt 1980
ggcagtacat ctacgtatta gtcatcgcta ttaccatggt gatgcggttt tggcagtaca 2040
tcaatgggcg tggatagcgg tttgactcac ggggatttcc aagtctccac cccattgacg 2100
tcaatgggag tttgttttgg caccaaaatc aacgggactt tccaaaatgt cgtaacaact 2160
ccgccccatt gacgcaaatg ggcggtaggc gtgtacggtg ggaggtctat ataagcagag 2220
ctcgtttagt gaaccgtcag atcgcctgga gacgccatcc acgctgtttt gacctccata 2280
gaagacaccg actctagagg atccactagt ccagtgtggt ggaattctgc agatatcaac 2340
aagtttgtac aaaaaagcag gctccaccat ggtgagcaag ggcgaggagc tgatcaagga 2400
gaacatgaga agcaagctgt acctggaagg cagcgtgaac ggccaccagt tcaagtgcac 2460
ccacgaaggg gagggcaagc cctacgaggg caagcagacc gcgaggatca aggtggtgga 2520
gggaggcccc ctgccgttcg cattcgacat cctggccacc atgtttatgt acgggagcaa 2580
ggtgttcatc aagtaccccg ccgacctccc cgattatttt aagcagtcct tccctgaggg 2640
cttcacatgg gagagagtca tggtgttcga agacgggggc gtgctgaccg ccacccagga 2700
caccagcctc caggacggcg agctcatcta caacgtcaag ctcagagggg tgaacttccc 2760
agccaacggc cccgtgatgc agaagaaaac actgggctgg gagcccagca ccgagaccat 2820
gtaccccgct gacggcggcc tggaaggcag atgcgacaag aagctgaagc tcgtgggcgg 2880
gggccacctg cacgtcaact tcaagaccac atacaagtcc aagaaacccg tgaagatgcc 2940
cggcgtccac tacgtggacc gcagactgga aagaatcaag gaggccgaca acgagaccta 3000
cgtcgagcag tacgagcacg ctgtggccag atactccaac ctgggcggag gcatggacga 3060
gctgtacaag taaacccagc tttcttgtac aaagtggttg atatccagca cagtggcggc 3120
cgctcgagtc tagagggccc gcggttcgaa ggtaagccta tccctaaccc tctcctcggt 3180
ctcgattcta cgcgtaccgg ttagtaatga tcgacaatca acctctggat tacaaaattt 3240
gtgaaagatt gactggtatt cttaactatg ttgctccttt tacgctatgt ggatacgctg 3300
ctttaatgcc tttgtatcat gctattgctt cccgtatggc tttcattttc tcctccttgt 3360
ataaatcctg gttgctgtct ctttatgagg agttgtggcc cgttgtcagg caacgtggcg 3420
tggtgtgcac tgtgtttgct gacgcaaccc ccactggttg gggcattgcc accacctgtc 3480
agctcctttc cgggactttc gctttccccc tccctattgc cacggcggaa ctcatcgccg 3540
cctgccttgc ccgctgctgg acaggggctc ggctgttggg cactgacaat tccgtggtgt 3600
tgtcggggaa gctgacgtcc tttccatggc tgctcgcctg tgttgccacc tggattctgc 3660
gcgggacgtc cttctgctac gtcccttcgg ccctcaatcc agcggacctt ccttcccgcg 3720
gcctgctgcc ggctctgcgg cctcttccgc gtcttcgcct tcgccctcag acgagtcgga 3780
tctccctttg ggccgcctcc ccgcctggcg atggtaccgg tgtggaaagt ccccaggctc 3840
cccagcaggc agaagtatgc aaagcatgca tctcaattag tcagcaacca ggtgtggaaa 3900
gtccccaggc tccccagcag gcagaagtat gcaaagcatg catctcaatt agtcagcaac 3960
catagtcccg cccctaactc cgcccatccc gcccctaact ccgcccagtt ccgcccattc 4020
tccgccccat ggctgactaa ttttttttat ttatgcagag gccgaggccg cctctgcctc 4080
tgagctattc cagaagtagt gaggaggctt ttttggaggc ctaggctttt gcaaaaagct 4140
cccgggagct tgtatatcca ttttcggatc tgatcagcac gtgcacaatt cgagctcggt 4200
acctttaaga ccaatgactt acaaggcagc tgtagatctt agccactttt taaaagaaaa 4260
ggggggactg gaagggctaa ttcactccca acgaagacaa gatctgcttt ttgcttgtac 4320
tgggtctctc tggttagacc agatctgagc ctgggagctc tctggctaac tagggaaccc 4380
actgcttaag cctcaataaa gcttgccttg agtgcttcaa gtagtgtgtg cccgtctgtt 4440
gtgtgactct ggtaactaga gatccctcag acccttttag tcagtgtgga aaatctctag 4500
cagtagtagt tcatgtcatc ttattattca gtatttataa cttgcaaaga aatgaatatc 4560
agagagtgag aggaacttgt ttattgcagc ttataatggt tacaaataaa gcaatagcat 4620
cacaaatttc acaaataaag catttttttc actgcaccta agg 4663
<210> 5
<211> 5611
<212> DNA
<213> artificial sequence
<220>
<223> Synthetic
<400> 5
gggtctctct ggttagacca gatctgagcc tgggagctct ctggctaact agggaaccca 60
ctgcttaagc ctcaataaag cttgccttga gtgcttcaag tagtgtgtgc ccgtctgttg 120
tgtgactctg gtaactagag atccctcaga cccttttagt cagtgtggaa aatctctagc 180
agtggcgccc gaacagggac ttgaaagcga aagggaaacc agaggagctc tctcgacgca 240
ggactcggct tgctgaagcg cgcacggcaa gaggcgaggg gcggcgactg gtgagtacgc 300
caaaaatttt gactagcgga ggctagaagg agagagatgg gtgcgagagc gtcagtatta 360
agcgggggag aattagatcg cgatgggaaa aaattcggtt aaggccaggg ggaaagaaaa 420
aatataaatt aaaacatata gtatgggcaa gcagggagct agaacgattc gcagttaatc 480
ctggcctgtt agaaacatca gaaggctgta gacaaatact gggacagcta caaccatccc 540
ttcagacagg atcagaagaa cttagatcat tatataatac agtagcaacc ctctattgtg 600
tgcatcaaag gatagagata aaagacacca aggaagcttt agacaagata gaggaagagc 660
aaaacaaaag taagaccacc gcacagcaag cggccgctga tcttcagacc tggaggagga 720
gatatgaggg acaattggag aagtgaatta tataaatata aagtagtaaa aattgaacca 780
ttaggagtag cacccaccaa ggcaaagaga agagtggtgc agagagaaaa aagagcagtg 840
ggaataggag ctttgttcct tgggttcttg ggagcagcag gaagcactat gggcgcagcg 900
tcaatgacgc tgacggtaca ggccagacaa ttattgtctg gtatagtgca gcagcagaac 960
aatttgctga gggctattga ggcgcaacag catctgttgc aactcacagt ctggggcatc 1020
aagcagctcc aggcaagaat cctggctgtg gaaagatacc taaaggatca acagctcctg 1080
gggatttggg gttgctctgg aaaactcatt tgcaccactg ctgtgccttg gaatgctagt 1140
tggagtaata aatctctgga acagatttgg aatcacacga cctggatgga gtgggacaga 1200
gaaattaaca attacacaag cttaatacac tccttaattg aagaatcgca aaaccagcaa 1260
gaaaagaatg aacaagaatt attggaatta gataaatggg caagtttgtg gaattggttt 1320
aacataacaa attggctgtg gtatataaaa ttattcataa tgatagtagg aggcttggta 1380
ggtttaagaa tagtttttgc tgtactttct atagtgaata gagttaggca gggatattca 1440
ccattatcgt ttcagaccca cctcccaacc ccgaggggac ccgacaggcc cgaaggaata 1500
gaagaagaag gtggagagag agacagagac agatccattc gattagtgaa cggatctcga 1560
cggtatcggt taacttttaa aagaaaaggg gggattgggg ggtacagtgc aggggaaaga 1620
atagtagaca taatagcaac agacatacaa actaaagaat tacaaaaaca aattacaaaa 1680
attcaaaatt ttatcgataa gcttgggagt tccgcgttac ataacttacg gtaaatggcc 1740
cgcctggctg accgcccaac gacccccgcc cattgacgtc aataatgacg tatgttccca 1800
tagtaacgcc aatagggact ttccattgac gtcaatgggt ggagtattta cggtaaactg 1860
cccacttggc agtacatcaa gtgtatcata tgccaagtac gccccctatt gacgtcaatg 1920
acggtaaatg gcccgcctgg cattatgccc agtacatgac cttatgggac tttcctactt 1980
ggcagtacat ctacgtatta gtcatcgcta ttaccatggt gatgcggttt tggcagtaca 2040
tcaatgggcg tggatagcgg tttgactcac ggggatttcc aagtctccac cccattgacg 2100
tcaatgggag tttgttttgg caccaaaatc aacgggactt tccaaaatgt cgtaacaact 2160
ccgccccatt gacgcaaatg ggcggtaggc gtgtacggtg ggaggtctat ataagcagag 2220
ctcgtttagt gaaccgtcag atcgcctgga gacgccatcc acgctgtttt gacctccata 2280
gaagacaccg actctagagg atccactagt ccagtgtggt ggaattctgc agatatcaac 2340
aagtttgtac aaaaaagcag gctccaccat ggaagacgcc aaaaacataa agaaaggccc 2400
ggcgccattc tatcctctag aggatggaac cgctggagag caactgcata aggctatgaa 2460
gagatacgcc ctggttcctg gaacaattgc ttttacagat gcacatatcg aggtgaacat 2520
cacgtacgcg gaatacttcg aaatgtccgt tcggttggca gaagctatga aacgatatgg 2580
gctgaataca aatcacagaa tcgtcgtatg cagtgaaaac tctcttcaat tctttatgcc 2640
ggtgttgggc gcgttattta tcggagttgc agttgcgccc gcgaacgaca tttataatga 2700
acgtgaattg ctcaacagta tgaacatttc gcagcctacc gtagtgtttg tttccaaaaa 2760
ggggttgcaa aaaattttga acgtgcaaaa aaaattacca ataattcaga aaattattat 2820
catggattct aaaacggatt accagggatt tcagtcgatg tacacgttcg tcacatctca 2880
tctacctccc ggttttaatg agtacgattt tgtaccagag tcctttgatc gtgacaaaac 2940
aattgcactg ataatgaatt cctctggatc tactgggtta cctaagggtg tggcccttcc 3000
gcatagaact gcctgcgtca gattctcgca tgccagagat cctatttttg gcaatcaaat 3060
cattccggat actgcgattt taagtgttgt tccattccat cacggttttg gaatgtttac 3120
tacactcgga tatttgatat gtggatttcg agtcgtctta atgtatagat ttgaagaaga 3180
gctgttttta cgatcccttc aggattacaa aattcaaagt gcgttgctag taccaaccct 3240
attttcattc ttcgccaaaa gcactctgat tgacaaatac gatttatcta atttacacga 3300
aattgcttct gggggcgcac ctctttcgaa agaagtcggg gaagcggttg caaaacgctt 3360
ccatcttcca gggatacgac aaggatatgg gctcactgag actacatcag ctattctgat 3420
tacacccgag ggggatgata aaccgggcgc ggtcggtaaa gttgttccat tttttgaagc 3480
gaaggttgtg gatctggata ccgggaaaac gctgggcgtt aatcagagag gcgaattatg 3540
tgtcagagga cctatgatta tgtccggtta tgtaaacaat ccggaagcga ccaacgcctt 3600
gattgacaag gatggatggc tacattctgg agacatagct tactgggacg aagacgaaca 3660
cttcttcata gttgaccgct tgaagtcttt aattaaatac aaaggatatc aggtggcccc 3720
cgctgaattg gaatcgatat tgttacaaca ccccaacatc ttcgacgcgg gcgtggcagg 3780
tcttcccgac gatgacgccg gtgaacttcc cgccgccgtt gttgttttgg agcacggaaa 3840
gacgatgacg gaaaaagaga tcgtggatta cgtcgccagt caagtaacaa ccgcgaaaaa 3900
gttgcgcgga ggagttgtgt ttgtggacga agtaccgaaa ggtcttaccg gaaaactcga 3960
cgcaagaaaa atcagagaga tcctcataaa ggccaagaag ggcggaaagt ccaaattgta 4020
aacccagctt tcttgtacaa agtggttgat atccagcaca gtggcggccg ctcgagtcta 4080
gagggcccgc ggttcgaagg taagcctatc cctaaccctc tcctcggtct cgattctacg 4140
cgtaccggtt agtaatgatc gacaatcaac ctctggatta caaaatttgt gaaagattga 4200
ctggtattct taactatgtt gctcctttta cgctatgtgg atacgctgct ttaatgcctt 4260
tgtatcatgc tattgcttcc cgtatggctt tcattttctc ctccttgtat aaatcctggt 4320
tgctgtctct ttatgaggag ttgtggcccg ttgtcaggca acgtggcgtg gtgtgcactg 4380
tgtttgctga cgcaaccccc actggttggg gcattgccac cacctgtcag ctcctttccg 4440
ggactttcgc tttccccctc cctattgcca cggcggaact catcgccgcc tgccttgccc 4500
gctgctggac aggggctcgg ctgttgggca ctgacaattc cgtggtgttg tcggggaagc 4560
tgacgtcctt tccatggctg ctcgcctgtg ttgccacctg gattctgcgc gggacgtcct 4620
tctgctacgt cccttcggcc ctcaatccag cggaccttcc ttcccgcggc ctgctgccgg 4680
ctctgcggcc tcttccgcgt cttcgccttc gccctcagac gagtcggatc tccctttggg 4740
ccgcctcccc gcctggcgat ggtaccggtg tggaaagtcc ccaggctccc cagcaggcag 4800
aagtatgcaa agcatgcatc tcaattagtc agcaaccagg tgtggaaagt ccccaggctc 4860
cccagcaggc agaagtatgc aaagcatgca tctcaattag tcagcaacca tagtcccgcc 4920
cctaactccg cccatcccgc ccctaactcc gcccagttcc gcccattctc cgccccatgg 4980
ctgactaatt ttttttattt atgcagaggc cgaggccgcc tctgcctctg agctattcca 5040
gaagtagtga ggaggctttt ttggaggcct aggcttttgc aaaaagctcc cgggagcttg 5100
tatatccatt ttcggatctg atcagcacgt gcacaattcg agctcggtac ctttaagacc 5160
aatgacttac aaggcagctg tagatcttag ccacttttta aaagaaaagg ggggactgga 5220
agggctaatt cactcccaac gaagacaaga tctgcttttt gcttgtactg ggtctctctg 5280
gttagaccag atctgagcct gggagctctc tggctaacta gggaacccac tgcttaagcc 5340
tcaataaagc ttgccttgag tgcttcaagt agtgtgtgcc cgtctgttgt gtgactctgg 5400
taactagaga tccctcagac ccttttagtc agtgtggaaa atctctagca gtagtagttc 5460
atgtcatctt attattcagt atttataact tgcaaagaaa tgaatatcag agagtgagag 5520
gaacttgttt attgcagctt ataatggtta caaataaagc aatagcatca caaatttcac 5580
aaataaagca tttttttcac tgcacctaag g 5611
<210> 6
<211> 6770
<212> DNA
<213> artificial sequence
<220>
<223> Synthetic
<400> 6
agaccaggaa ttacatttgc ttctgacaca actgtgttca ctagcaacct caaacagagc 60
cgccatgggt gcgagagcgt cagtattaag cgggggagaa ttagatcgat gggaaaaaat 120
tcggttaagg ccagggggaa agaaaaaata taaattaaaa catatagtat gggcaagcag 180
ggagctagaa cgattcgcag ttaatcctgg cctgttagaa acatcagaag gctgtagaca 240
aatactggga cagctacaac catcccttca gacaggatca gaagaactta gatcattata 300
taatacagta gcaaccctct attgtgtgca tcaaaggata gagataaaag acaccaagga 360
agctttagac aagatagagg aagagcaaaa caaaagtaag aaaaaagcac agcaagcagc 420
agctgacaca ggacacagca atcaggtcag ccaaaattac cctatagtgc agaacatcca 480
ggggcaaatg gtacatcagg ccatatcacc tagaacttta aatgcatggg taaaagtagt 540
agaagagaag gctttcagcc cagaagtgat acccatgttt tcagcattat cagaaggagc 600
caccccacaa gatttaaaca ccatgctaaa cacagtgggg ggacatcaag cagccatgca 660
aatgttaaaa gagaccatca atgaggaagc tgcagaatgg gatagagtgc atccagtgca 720
tgcagggcct attgcaccag gccagatgag agaaccaagg ggaagtgaca tagcaggaac 780
tactagtacc cttcaggaac aaataggatg gatgacacat aatccaccta tcccagtagg 840
agaaatctat aaaagatgga taatcctggg attaaataaa atagtaagaa tgtatagccc 900
taccagcatt ctggacataa gacaaggacc aaaggaaccc tttagagact atgtagaccg 960
attctataaa actctaagag ccgagcaagc ttcacaagag gtaaaaaatt ggatgacaga 1020
aaccttgttg gtccaaaatg cgaacccaga ttgtaagact attttaaaag cattgggacc 1080
aggagcgaca ctagaagaaa tgatgacagc atgtcaggga gtggggggac ccggccataa 1140
agcaagagtt ttggctgaag caatgagcca agtaacaaat ccagctacca taatgataca 1200
gaaaggcaat tttaggaacc aaagaaagac tgttaagtgt ttcaattgtg gcaaagaagg 1260
gcacatagcc aaaaattgca gggcccctag gaaaaagggc tgttggaaat gtggaaagga 1320
aggacaccaa atgaaagatt gtactgagag acaggctaat tttttaggga agatctggcc 1380
ttcccacaag ggaaggccag ggaattttct tcagagcaga ccagagccaa cagccccacc 1440
agaagagagc ttcaggtttg gggaagagac aacaactccc tctcagaagc aggagccgat 1500
agacaaggaa ctgtatcctt tagcttccct cagatcactc tttggcagcg acccctcgtc 1560
acaataaaga taggggggca attaaaggaa gctctattag atacaggagc agatgataca 1620
gtattagaag aaatgaattt gccaggaaga tggaaaccaa aaatgatagg gggaattgga 1680
ggttttatca aagtaagaca gtatgatcag atactcatag aaatctgcgg acataaagct 1740
ataggtacag tattagtagg acctacacct gtcaacataa ttggaagaaa tctgttgact 1800
cagattggct gcactttaaa ttttcccatt agtcctattg agactgtacc agtaaaatta 1860
aagccaggaa tggatggccc aaaagttaaa caatggccat tgacagaaga aaaaataaaa 1920
gcattagtag aaatttgtac agaaatggaa aaggaaggaa aaatttcaaa aattgggcct 1980
gaaaatccat acaatactcc agtatttgcc ataaagaaaa aagacagtac taaatggaga 2040
aaattagtag atttcagaga acttaataag agaactcaag atttctggga agttcaatta 2100
ggaataccac atcctgcagg gttaaaacag aaaaaatcag taacagtact ggatgtgggc 2160
gatgcatatt tttcagttcc cttagataaa gacttcagga agtatactgc atttaccata 2220
cctagtataa acaatgagac accagggatt agatatcagt acaatgtgct tccacaggga 2280
tggaaaggat caccagcaat attccagtgt agcatgacaa aaatcttaga gccttttaga 2340
aaacaaaatc cagacatagt catctatcaa tacatggatg atttgtatgt aggatctgac 2400
ttagaaatag ggcagcatag aacaaaaata gaggaactga gacaacatct gttgaggtgg 2460
ggatttacca caccagacaa aaaacatcag aaagaacctc cattcctttg gatgggttat 2520
gaactccatc ctgataaatg gacagtacag cctatagtgc tgccagaaaa ggacagctgg 2580
actgtcaatg acatacagaa attagtggga aaattgaatt gggcaagtca gatttatgca 2640
gggattaaag taaggcaatt atgtaaactt cttaggggaa ccaaagcact aacagaagta 2700
gtaccactaa cagaagaagc agagctagaa ctggcagaaa acagggagat tctaaaagaa 2760
ccggtacatg gagtgtatta tgacccatca aaagacttaa tagcagaaat acagaagcag 2820
gggcaaggcc aatggacata tcaaatttat caagagccat ttaaaaatct gaaaacagga 2880
aagtatgcaa gaatgaaggg tgcccacact aatgatgtga aacaattaac agaggcagta 2940
caaaaaatag ccacagaaag catagtaata tggggaaaga ctcctaaatt taaattaccc 3000
atacaaaagg aaacatggga agcatggtgg acagagtatt ggcaagccac ctggattcct 3060
gagtgggagt ttgtcaatac ccctccctta gtgaagttat ggtaccagtt agagaaagaa 3120
cccataatag gagcagaaac tttctatgta gatggggcag ccaataggga aactaaatta 3180
ggaaaagcag gatatgtaac tgacagagga agacaaaaag ttgtccccct aacggacaca 3240
acaaatcaga agactgagtt acaagcaatt catctagctt tgcaggattc gggattagaa 3300
gtaaacatag tgacagactc acaatatgca ttgggaatca ttcaagcaca accagataag 3360
agtgaatcag agttagtcag tcaaataata gagcagttaa taaaaaagga aaaagtctac 3420
ctggcatggg taccagcaca caaaggaatt ggaggaaatg aacaagtaga taaattggtc 3480
agtgctggaa tcaggaaagt actattttta gatggaatag ataaggccca agaagaacat 3540
gagaaatatc acagtaattg gagagcaatg gctagtgatt ttaacctacc acctgtagta 3600
gcaaaagaaa tagtagccag ctgtgataaa tgtcagctaa aaggggaagc catgcatgga 3660
caagtagact gtagcccagg aatatggcag ctagattgta cacatttaga aggaaaagtt 3720
atcttggtag cagttcatgt agccagtgga tatatagaag cagaagtaat tccagcagag 3780
acagggcaag aaacagcata cttcctctta aaattagcag gaagatggcc agtaaaaaca 3840
gtacatacag acaatggcag caatttcacc agtactacag ttaaggccgc ctgttggtgg 3900
gcggggatca agcaggaatt tggcattccc tacaatcccc aaagtcaagg agtaatagaa 3960
tctatgaata aagaattaaa gaaaattata ggacaggtaa gagatcaggc tgaacatctt 4020
aagacagcag tacaaatggc agtattcatc cacaatttta aaagaaaagg ggggattggg 4080
gggtacagtg caggggaaag aatagtagac ataatagcaa cagacataca aactaaagaa 4140
ttacaaaaac aaattacaaa aattcaaaat tttcgggttt attacaggga cagcagagat 4200
ccagtttgga aaggaccagc aaagctcctc tggaaaggtg aaggggcagt agtaatacaa 4260
gataatagtg acataaaagt agtgccaaga agaaaagcaa agatcatcag ggattatgga 4320
aaacagatgg caggtgatga ttgtgtggca agtagacagg atgaggatta aagctcgctt 4380
tcttgctgtc caatttctat taaaggttcc tttgttccct aagtccaact actaaactgg 4440
gggatattat gaagggcctt gagcatctgg attctgccta ataagaaaca tttattgtca 4500
ttgcagagac gcggccgcgc gtctcaatga agagcgtcga cgcatgcgga ggctaggtgg 4560
aggctcagtg atgataagtc tgcgatggtg gatgcatgtg tcatggtcat agctgtttcc 4620
tgtgtgaaat tgttatccgc tcagagggca caatcctatt ccgcgctatc cgacaatctc 4680
caagacatta ggtggagttc agttcggcgt atggcatatg tcgctggaaa gaacatgtga 4740
gcaaaaggcc agcaaaaggc caggaaccgt aaaaaggccg cgttgctggc gtttttccat 4800
aggctccgcc cccctgacga gcatcacaaa aatcgacgct caagtcagag gtggcgaaac 4860
ccgacaggac tataaagata ccaggcgttt ccccctggaa gctccctcgt gcgctctcct 4920
gttccgaccc tgccgcttac cggatacctg tccgcctttc tcccttcggg aagcgtggcg 4980
ctttctcata gctcacgctg taggtatctc agttcggtgt aggtcgttcg ctccaagctg 5040
ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg ccttatccgg taactatcgt 5100
cttgagtcca acccggtaag acacgactta tcgccactgg cagcagccac tggtaacagg 5160
attagcagag cgaggtatgt aggcggtgct acagagttct tgaagtggtg gcctaactac 5220
ggctacacta gaagaacagt atttggtatc tgcgctctgc tgaagccagt taccttcgga 5280
aaaagagttg gtagctcttg atccggcaaa caaaccaccg ctggtagcgg tggttttttt 5340
gtttgcaagc agcagattac gcgcagaaaa aaaggatctc aagaagatcc tttgatcttt 5400
tctacggggt ctgacgctct attcaacaaa gccgccgtcc cgtcaagtca gcgtaaatgg 5460
gtagggggct tcaaatcgtc ctcgtgatac caattcggag cctgcttttt tgtacaaact 5520
tgttgataat ggcaattcaa ggatcttcac ctagatcctt ttaaattaaa aatgaagttt 5580
taaatcaatc taaagtatat atgagtaaac ttggtctgac agttaccaat gcttaatcag 5640
tgaggcacct atctcagcga tctgtctatt tcgttcatcc atagttgcct gactccccgt 5700
cgtgtagata actacgatac gggagggctt accatctggc cccagtgctg caatgatacc 5760
gcgagagcca cgctcaccgg ctccagattt atcagcaata aaccagccag ccggaagggc 5820
cgagcgcaga agtggtcctg caactttatc cgcctccatc cagtctatta attgttgccg 5880
ggaagctaga gtaagtagtt cgccagttaa tagtttgcgc aacgttgttg ccattgctac 5940
aggcatcgtg gtgtcacgct cgtcgtttgg tatggcttca ttcagctccg gttcccaacg 6000
atcaaggcga gttacatgat cccccatgtt gtgcaaaaaa gcggttagct ccttcggtcc 6060
tccgatcgtt gtcagaagta agttggccgc agtgttatca ctcatggtta tggcagcact 6120
gcataattct cttactgtca tgccatccgt aagatgcttt tctgtgactg gtgagtactc 6180
aaccaagtca ttctgagaat agtgtatgcg gcgaccgagt tgctcttgcc cggcgtcaat 6240
acgggataat accgcgccac atagcagaac tttaaaagtg ctcatcattg gaaaacgttc 6300
ttcggggcga aaactctcaa ggatcttacc gctgttgaga tccagttcga tgtaacccac 6360
tcgtgcaccc aactgatctt cagcatcttt tactttcacc agcgtttctg ggtgagcaaa 6420
aacaggaagg caaaatgccg caaaaaaggg aataagggcg acacggaaat gttgaatact 6480
catactcttc ctttttcaat attattgaag catttatcag ggttattgtc tcatgagcgg 6540
atacatattt gaatgtattt agaaaaataa acaaataggg gttccgcgca catttccccg 6600
aaaagtgcca gatacctgaa acaaaaccca tcgtacggcc aaggaagtct ccaataactg 6660
tgatccacca caagcgccag ggttttccca gtcacgacgt tgtaaaacga cggccagtca 6720
tgcataatcc gcacgcatct ggaataagga agtgccattc cgcctgacct 6770
<210> 7
<211> 1700
<212> DNA
<213> artificial sequence
<220>
<223> Synthetic
<400> 7
agaccaggaa ttacatttgc ttctgacaca actgtgttca ctagcaacct caaacagagc 60
cgccatgggt gcgagagcgt cagtattaag cgggggagaa ttagatcgat gggaaaaaat 120
tcggttaagg ccagggggaa agaaaaaata taaattaaaa catatagtat gggcaagcag 180
ggagctagaa cgattcgcag ttaatcctgg cctgttagaa acatcagaag gctgtagaca 240
aatactggga cagctacaac catcccttca gacaggatca gaagaactta gatcattata 300
taatacagta gcaaccctct attgtgtgca tcaaaggata gagataaaag acaccaagga 360
agctttagac aagatagagg aagagcaaaa caaaagtaag aaaaaagcac agcaagcagc 420
agctgacaca ggacacagca atcaggtcag ccaaaattac cctatagtgc agaacatcca 480
ggggcaaatg gtacatcagg ccatatcacc tagaacttta aatgcatggg taaaagtagt 540
agaagagaag gctttcagcc cagaagtgat acccatgttt tcagcattat cagaaggagc 600
caccccacaa gatttaaaca ccatgctaaa cacagtgggg ggacatcaag cagccatgca 660
aatgttaaaa gagaccatca atgaggaagc tgcagaatgg gatagagtgc atccagtgca 720
tgcagggcct attgcaccag gccagatgag agaaccaagg ggaagtgaca tagcaggaac 780
tactagtacc cttcaggaac aaataggatg gatgacacat aatccaccta tcccagtagg 840
agaaatctat aaaagatgga taatcctggg attaaataaa atagtaagaa tgtatagccc 900
taccagcatt ctggacataa gacaaggacc aaaggaaccc tttagagact atgtagaccg 960
attctataaa actctaagag ccgagcaagc ttcacaagag gtaaaaaatt ggatgacaga 1020
aaccttgttg gtccaaaatg cgaacccaga ttgtaagact attttaaaag cattgggacc 1080
aggagcgaca ctagaagaaa tgatgacagc atgtcaggga gtggggggac ccggccataa 1140
agcaagagtt ttggctgaag caatgagcca agtaacaaat ccagctacca taatgataca 1200
gaaaggcaat tttaggaacc aaagaaagac tgttaagtgt ttcaattgtg gcaaagaagg 1260
gcacatagcc aaaaattgca gggcccctag gaaaaagggc tgttggaaat gtggaaagga 1320
aggacaccaa atgaaagatt gtactgagag acaggctaat tttttaggga agatctggcc 1380
ttcccacaag ggaaggccag ggaattttct tcagagcaga ccagagccaa cagccccacc 1440
agaagagagc ttcaggtttg gggaagagac aacaactccc tctcagaagc aggagccgat 1500
agacaaggaa ctgtatcctt tagcttccct cagatcactc tttggcagcg acccctcgtc 1560
acaataaagc tcgctttctt gctgtccaat ttctattaaa ggttcctttg ttccctaagt 1620
ccaactacta aactggggga tattatgaag ggccttgagc atctggattc tgcctaataa 1680
gaaacattta ttgtcattgc 1700
<210> 8
<211> 1619
<212> DNA
<213> artificial sequence
<220>
<223> Synthetic
<400> 8
agaccaggaa ttacatttgc ttctgacaca actgtgttca ctagcaacct caaacagagc 60
cgccatggga ttccttgacg gcatcgacaa ggcgcaagag gagcacgaga aatatcatag 120
caactggagg gcgatggcat cagacttcaa cttgccgcca gtcgtggcta aagagatagt 180
agcaagctgc gataagtgtc aactgaaggg tgaagcgatg cacggtcaag tggactgttc 240
ccctggtata tggcaactgg actgtaccca tctcgaaggt aaagtcattt tggttgctgt 300
acatgttgcc tctgggtata tcgaggctga ggtgattcca gcagaaacgg gacaggaaac 360
agcttatttc ttgctcaagc ttgccgggag atggcctgtg aaaactgttc atacagataa 420
tggtagcaat tttactagta cgaccgtgaa agcagcttgt tggtgggcag ggattaagca 480
agaattcgga atcccgtata acccgcagag tcagggcgtt attgagagca tgaacaagga 540
actgaaaaaa ataattgggc aagtgagaga tcaggctgag catcttaaaa ctgctgtaca 600
aatggcggtg ttcatacata attttaagcg gaagggagga attggaggat actctgcggg 660
agagaggata gtggatataa tagcgaccga tattcagaca aaggagctgc aaaaacagat 720
aacgaaaata caaaattttc gagtttacta tcgggactcc cgcgatcccg tgtggaaagg 780
tccagcgaaa ttgctttgga agggcgaagg cgcggtagtg atccaggaca attctgatat 840
caaagtggtc ccaaggcgga aagcaaagat aatccgcgac tacggcaagc aaatggcagg 900
agatgattgc gtcgcatcac gacaggacga agatggcggc ggcggctcca tggccctgac 960
caacgcccag atcctggccg tgatcgacag ctgggaggaa accgtgggcc agttccccgt 1020
gatcacccac catgtgcctc tgggcggagg cctccaggga accctgcact gttacgagat 1080
ccccctggcc gctccttacg gcgtgggctt tgccaagaac ggccccacca gatggcagta 1140
caagcggacc atcaaccagg tggtgcacag atggggcagc cacaccgtgc cctttctgct 1200
ggaacccgac aacatcaacg gcaagacctg caccgccagc cacctgtgcc acaacaccag 1260
atgccacaac cccctgcacc tgtgctggga gagcctggac gacgccaagg gccggaattg 1320
gtgccctggc cctaatggcg gatgtgtgca tgccgtcgtg tgcctgagac agggacctct 1380
gtatggccct ggcgctacag tggctggccc tcagcagagg ggctcccact tcgtggtgta 1440
aagctcgctt tcttgctgtc caatttctat taaaggttcc tttgttccct aagtccaact 1500
actaaactgg gggatattat gaagggcctt gagcatctgg attctgccta ataagaaaca 1560
tttattgtca ttgcagagac gcggccgcgc gtctcaatga agagcgtcga cgcatgcgg 1619
<210> 9
<211> 3764
<212> DNA
<213> artificial sequence
<220>
<223> Synthetic
<400> 9
agaccaggaa ttacatttgc ttctgacaca actgtgttca ctagcaacct caaacagagc 60
cgccatgggc ttttttaggg aagatctggc cttcccacaa gggaaggcca gggaattttc 120
ttcagagcag accagagcca acagccccac cagaagagag cttcaggttt ggggaagaga 180
caacaactcc ctctcagaag caggagccga tagacaagga actgtatcct ttagcttccc 240
tcagatcact ctttggcagc gacccctcgt cacaataaag ataggggggc agctcaagga 300
ggctctcctg gacaccggag cagacgacac cgtgctggag gagatgtcgt tgccaggccg 360
ctggaagccg aagatgatcg ggggaatcgg cggtttcatc aaggtgcgcc agtatgacca 420
gatcctcatc gaaatctgcg gccacaaggc tatcggtacc gtgctggtgg gccccacacc 480
cgtcaacatc atcggacgca acctgttgac gcagatcggt tgcacgctga acttccccat 540
tagccctatc gagacggtac cggtgaagct gaagcccggg atggacggcc cgaaggtcaa 600
gcaatggcca ttgacagagg agaagatcaa ggcactggtg gagatttgca cagagatgga 660
aaaggaaggg aaaatctcca agattgggcc tgagaacccg tacaacacgc cggtgttcgc 720
aatcaagaag aaggactcga cgaaatggcg caagctggtg gacttccgcg agctgaacaa 780
gcgcacgcaa gacttctggg aggttcagct gggcatcccg caccccgcag ggctgaagaa 840
gaagaaatcc gtgaccgtac tggatgtggg tgatgcctac ttctccgttc ccctggacga 900
agacttcagg aagtacactg ccttcacaat cccttcgatc aacaacgaga caccggggat 960
tcgatatcag tacaacgtgc tgccccaggg ctggaaaggc tctcccgcaa tcttccagag 1020
tagcatgacc aaaatcctgg agcctttccg caaacagaac cccgacatcg tcatctatca 1080
gtacatggat gacttgtacg tgggctctga tctagagata gggcagcacc gcaccaagat 1140
cgaggagctg cgccagcacc tgttgaggtg gggactgacc acacccgaca agaagcacca 1200
gaaggagcct cccttcctct ggatgggtta cgagctgcac cctgacaaat ggaccgtgca 1260
gcctatcgtg ctgccagaga aagacagctg gactgtcaac gacatacaga agctggtggg 1320
gaagttgaac tgggccagtc agatttaccc agggattaag gtgaggcagc tgtgcaaact 1380
cctccgcgga accaaggcac tcacagaggt gatcccccta accgaggagg ccgagctcga 1440
actggcagaa aaccgagaga tcctaaagga gcccgtgcac ggcgtgtact atgacccctc 1500
caaggacctg atcgccgaga tccagaagca ggggcaaggc cagtggacct atcagattta 1560
ccaggagccc ttcaagaacc tgaagaccgg caagtacgcc cggatgaggg gtgcccacac 1620
taacgacgtc aagcagctga ccgaggccgt gcagaagatc accaccgaaa gcatcgtgat 1680
ctggggaaag actcctaagt tcaagctgcc catccagaag gaaacctggg aaacctggtg 1740
gacagagtat tggcaggcca cctggattcc tgagtgggag ttcgtcaaca cccctcccct 1800
ggtgaagctg tggtaccagc tggagaagga gcccatagtg ggcgccgaaa ccttctacgt 1860
ggatggggcc gctaacaggg agactaagct gggcaaagcc ggatacgtca ctaaccgggg 1920
cagacagaag gttgtcaccc tcactgacac caccaaccag aagactgagc tgcaggccat 1980
ttacctcgct ttgcaggact cgggcctgga ggtgaacatc gtgacagact ctcagtatgc 2040
cctgggcatc attcaagccc agccagacca gagtgagtcc gagctggtca atcagatcat 2100
cgagcagctg atcaagaagg aaaaggtcta tctggcctgg gtacccgccc acaaaggcat 2160
tggcggcaat gagcaggtcg acaagctggt ctcggctggc atcaggaagg tgctattcct 2220
ggatggcatc gacaaggccc aggacgagca cgagaaatac cacagcaact ggcgggccat 2280
ggctagcgac ttcaacctgc cccctgtggt ggccaaagag atcgtggcca gctgtgacaa 2340
gtgtcagctc aagggcgaag ccatgcatgg ccaggtggac tgtagccccg gcatctggca 2400
actcgattgc acccatctgg agggcaaggt tatcctggta gccgtccatg tggccagtgg 2460
ctacatcgag gccgaggtca ttcccgccga aacagggcag gagacagcct acttcctcct 2520
gaagctggca ggccggtggc cagtgaagac catccatact gacaatggca gcaatttcac 2580
cagtgctacg gttaaggccg cctgctggtg ggcgggaatc aagcaggagt tcgggatccc 2640
ctacaatccc cagagtcagg gcgtcgtcga gtctatgaat aaggagttaa agaagattat 2700
cggccaggtc agagatcagg ctgagcatct caagaccgcg gtccaaatgg cggtattcat 2760
ccacaatttc aagcggaagg gggggattgg ggggtacagt gcgggggagc ggatcgtgga 2820
catcatcgcg accgacatcc agactaagga gctgcaaaag cagattacca agattcagaa 2880
tttccgggtc tactacaggg acagcagaaa tcccctctgg aaaggcccag cgaagctcct 2940
ctggaagggt gagggggcag tagtgatcca ggataatagc gacatcaagg tggtgcccag 3000
aagaaaggcg aagatcatta gggattatgg caaacagatg gcgggtgatg attgcgtggc 3060
gagcagacag gatgaggatg gcggcggcgg ctccatggcc ctgaccaacg cccagatcct 3120
ggccgtgatc gacagctggg aggaaaccgt gggccagttc cccgtgatca cccaccatgt 3180
gcctctgggc ggaggcctcc agggaaccct gcactgttac gagatccccc tggccgctcc 3240
ttacggcgtg ggctttgcca agaacggccc caccagatgg cagtacaagc ggaccatcaa 3300
ccaggtggtg cacagatggg gcagccacac cgtgcccttt ctgctggaac ccgacaacat 3360
caacggcaag acctgcaccg ccagccacct gtgccacaac accagatgcc acaaccccct 3420
gcacctgtgc tgggagagcc tggacgacgc caagggccgg aattggtgcc ctggccctaa 3480
tggcggatgt gtgcatgccg tcgtgtgcct gagacaggga cctctgtatg gccctggcgc 3540
tacagtggct ggccctcagc agaggggctc ccacttcgtg gtgtaaagct cgctttcttg 3600
ctgtccaatt tctattaaag gttcctttgt tccctaagtc caactactaa actgggggat 3660
attatgaagg gccttgagca tctggattct gcctaataag aaacatttat tgtcattgca 3720
gagacgcggc cgcgcgtctc aatgaagagc gtcgacgcat gcgg 3764
<210> 10
<211> 4474
<212> DNA
<213> artificial sequence
<220>
<223> Synthetic
<400> 10
gggtctctct ggttagacca gatctgagcc tgggagctct ctggctaact agggaaccca 60
ctgcttaagc ctcaataaag cttgccttga gtgcttcaag tagtgtgtgc ccgtctgttg 120
tgtgactctg gtaactagag atccctcaga cccttttagt cagtgtggaa aatctctagc 180
agtggcgccc gaacagggac ttgaaagcga aagggaaacc agaggagctc tctcgacgca 240
ggactcggct tgctgaagcg cgcacggcaa gaggcgaggg gcggcgactg gtgagtacgc 300
caaaaatttt gactagcgga ggctagaagg agagagatgg gtgcgagagc gtcagtatta 360
agcgggggag aattagatcg cgatgggaaa aaattcggtt aaggccaggg ggaaagaaaa 420
aatataaatt aaaacatata gtatgggcaa gcagggagct agaacgattc gcagttaatc 480
ctggcctgtt agaaacatca gaaggctgta gacaaatact gggacagcta caaccatccc 540
ttcagacagg atcagaagaa cttagatcat tatataatac agtagcaacc ctctattgtg 600
tgcatcaaag gatagagata aaagacacca aggaagcttt agacaagata gaggaagagc 660
aaaacaaaag taagaccacc gcacagcaag cggccgctga tcttcagacc tggaggagga 720
gatatgaggg acaattggag aagtgaatta tataaatata aagtagtaaa aattgaacca 780
ttaggagtag cacccaccaa ggcaaagaga agagtggtgc agagagaaaa aagagcagtg 840
ggaataggag ctttgttcct tgggttcttg ggagcagcag gaagcactat gggcgcagcg 900
tcaatgacgc tgacggtaca ggccagacaa ttattgtctg gtatagtgca gcagcagaac 960
aatttgctga gggctattga ggcgcaacag catctgttgc aactcacagt ctggggcatc 1020
aagcagctcc aggcaagaat cctggctgtg gaaagatacc taaaggatca acagctcctg 1080
gggatttggg gttgctctgg aaaactcatt tgcaccactg ctgtgccttg gaatgctagt 1140
tggagtaata aatctctgga acagatttgg aatcacacga cctggatgga gtgggacaga 1200
gaaattaaca attacacaag cttaatacac tccttaattg aagaatcgca aaaccagcaa 1260
gaaaagaatg aacaagaatt attggaatta gataaatggg caagtttgtg gaattggttt 1320
aacataacaa attggctgtg gtatataaaa ttattcataa tgatagtagg aggcttggta 1380
ggtttaagaa tagtttttgc tgtactttct atagtgaata gagttaggca gggatattca 1440
ccattatcgt ttcagaccca cctcccaacc ccgaggggac ccgacaggcc cgaaggaata 1500
gaagaagaag gtggagagag agacagagac agatccattc gattagtgaa cggatctcga 1560
cggtatcggt taacttttaa aagaaaaggg gggattgggg ggtacagtgc aggggaaaga 1620
atagtagaca taatagcaac agacatacaa actaaagaat tacaaaaaca aattacaaaa 1680
attcaaaatt ttatcgataa gcttgggagt tccgcgttac ataacttacg gtaaatggcc 1740
cgcctggctg accgcccaac gacccccgcc cattgacgtc aataatgacg tatgttccca 1800
tagtaacgcc aatagggact ttccattgac gtcaatgggt ggagtattta cggtaaactg 1860
cccacttggc agtacatcaa gtgtatcata tgccaagtac gccccctatt gacgtcaatg 1920
acggtaaatg gcccgcctgg cattatgccc agtacatgac cttatgggac tttcctactt 1980
ggcagtacat ctacgtatta gtcatcgcta ttaccatggt gatgcggttt tggcagtaca 2040
tcaatgggcg tggatagcgg tttgactcac ggggatttcc aagtctccac cccattgacg 2100
tcaatgggag tttgttttgg caccaaaatc aacgggactt tccaaaatgt cgtaacaact 2160
ccgccccatt gacgcaaatg ggcggtaggc gtgtacggtg ggaggtctat ataagcagag 2220
ctcgtttagt gaaccgtcag atcgcctgga gacgccatcc acgctgtttt gacctccata 2280
gaagacaccg actctagagg atccactagt ccagtgtggt ggaattctgc agatatcaac 2340
aagtttgtac aaaaaagcag gctccaccat ggtcttcaca ctcgaagatt tcgttgggga 2400
ctggcgacag acagccggct acaacctgga ccaagtcctt gaacagggag gtgtgtccag 2460
tttgtttcag aatctcgggg tgtccgtaac tccgatccaa aggattgtcc tgagcggtga 2520
aaatgggctg aagatcgaca tccatgtcat catcccgtat gaaggtctga gcggcgacca 2580
aatgggccag atcgaaaaaa tttttaaggt ggtgtaccct gtggatgatc atcactttaa 2640
ggtgatcctg cactatggca cactggtaat cgacggggtt acgccgaaca tgatcgacta 2700
tttcggacgg ccgtatgaag gcatcgccgt gttcgacggc aaaaagatca ctgtaacagg 2760
gaccctgtgg aacggcaaca aaattatcga cgagcgcctg atcaaccccg acggctccct 2820
gctgttccga gtaaccatca acggagtgac cggctggcgg ctgtgcgaac gcattctggc 2880
gtaaacccag ctttcttgta caaagtggtt gatatccagc acagtggcgg ccgctcgagt 2940
ctagagggcc cgcggttcga aggtaagcct atccctaacc ctctcctcgg tctcgattct 3000
acgcgtaccg gttagtaatg atcgacaatc aacctctgga ttacaaaatt tgtgaaagat 3060
tgactggtat tcttaactat gttgctcctt ttacgctatg tggatacgct gctttaatgc 3120
ctttgtatca tgctattgct tcccgtatgg ctttcatttt ctcctccttg tataaatcct 3180
ggttgctgtc tctttatgag gagttgtggc ccgttgtcag gcaacgtggc gtggtgtgca 3240
ctgtgtttgc tgacgcaacc cccactggtt ggggcattgc caccacctgt cagctccttt 3300
ccgggacttt cgctttcccc ctccctattg ccacggcgga actcatcgcc gcctgccttg 3360
cccgctgctg gacaggggct cggctgttgg gcactgacaa ttccgtggtg ttgtcgggga 3420
agctgacgtc ctttccatgg ctgctcgcct gtgttgccac ctggattctg cgcgggacgt 3480
ccttctgcta cgtcccttcg gccctcaatc cagcggacct tccttcccgc ggcctgctgc 3540
cggctctgcg gcctcttccg cgtcttcgcc ttcgccctca gacgagtcgg atctcccttt 3600
gggccgcctc cccgcctggc gatggtaccg gtgtggaaag tccccaggct ccccagcagg 3660
cagaagtatg caaagcatgc atctcaatta gtcagcaacc aggtgtggaa agtccccagg 3720
ctccccagca ggcagaagta tgcaaagcat gcatctcaat tagtcagcaa ccatagtccc 3780
gcccctaact ccgcccatcc cgcccctaac tccgcccagt tccgcccatt ctccgcccca 3840
tggctgacta atttttttta tttatgcaga ggccgaggcc gcctctgcct ctgagctatt 3900
ccagaagtag tgaggaggct tttttggagg cctaggcttt tgcaaaaagc tcccgggagc 3960
ttgtatatcc attttcggat ctgatcagca cgtgcacaat tcgagctcgg tacctttaag 4020
accaatgact tacaaggcag ctgtagatct tagccacttt ttaaaagaaa aggggggact 4080
ggaagggcta attcactccc aacgaagaca agatctgctt tttgcttgta ctgggtctct 4140
ctggttagac cagatctgag cctgggagct ctctggctaa ctagggaacc cactgcttaa 4200
gcctcaataa agcttgcctt gagtgcttca agtagtgtgt gcccgtctgt tgtgtgactc 4260
tggtaactag agatccctca gaccctttta gtcagtgtgg aaaatctcta gcagtagtag 4320
ttcatgtcat cttattattc agtatttata acttgcaaag aaatgaatat cagagagtga 4380
gaggaacttg tttattgcag cttataatgg ttacaaataa agcaatagca tcacaaattt 4440
cacaaataaa gcattttttt cactgcacct aagg 4474
<210> 11
<211> 3260
<212> DNA
<213> artificial sequence
<220>
<223> Synthetic
<400> 11
agaccaggaa ttacatttgc ttctgacaca actgtgttca ctagcaacct caaacagagc 60
cgccatgggc ttttttaggg aagatctggc cttcccacaa gggaaggcca gggaattttc 120
ttcagagcag accagagcca acagccccac cagaagagag cttcaggttt ggggaagaga 180
caacaactcc ctctcagaag caggagccga tagacaagga actgtatcct ttagcttccc 240
tcagatcact ctttggcagc gacccctcgt cacaataaag ataggggggc agctcaagga 300
ggctctcctg gacaccggag cagacgacac cgtgctggag gagatgtcgt tgccaggccg 360
ctggaagccg aagatgatcg ggggaatcgg cggtttcatc aaggtgcgcc agtatgacca 420
gatcctcatc gaaatctgcg gccacaaggc tatcggtacc gtgctggtgg gccccacacc 480
cgtcaacatc atcggacgca acctgttgac gcagatcggt tgcacgctga acttccccat 540
tagccctatc gagacggtac cggtgaagct gaagcccggg atggacggcc cgaaggtcaa 600
gcaatggcca ttgacagagg agaagatcaa ggcactggtg gagatttgca cagagatgga 660
aaaggaaggg aaaatctcca agattgggcc tgagaacccg tacaacacgc cggtgttcgc 720
aatcaagaag aaggactcga cgaaatggcg caagctggtg gacttccgcg agctgaacaa 780
gcgcacgcaa gacttctggg aggttcagct gggcatcccg caccccgcag ggctgaagaa 840
gaagaaatcc gtgaccgtac tggatgtggg tgatgcctac ttctccgttc ccctggacga 900
agacttcagg aagtacactg ccttcacaat cccttcgatc aacaacgaga caccggggat 960
tcgatatcag tacaacgtgc tgccccaggg ctggaaaggc tctcccgcaa tcttccagag 1020
tagcatgacc aaaatcctgg agcctttccg caaacagaac cccgacatcg tcatctatca 1080
gtacatggat gacttgtacg tgggctctga tctagagata gggcagcacc gcaccaagat 1140
cgaggagctg cgccagcacc tgttgaggtg gggactgacc acacccgaca agaagcacca 1200
gaaggagcct cccttcctct ggatgggtta cgagctgcac cctgacaaat ggaccgtgca 1260
gcctatcgtg ctgccagaga aagacagctg gactgtcaac gacatacaga agctggtggg 1320
gaagttgaac tgggccagtc agatttaccc agggattaag gtgaggcagc tgtgcaaact 1380
cctccgcgga accaaggcac tcacagaggt gatcccccta accgaggagg ccgagctcga 1440
actggcagaa aaccgagaga tcctaaagga gcccgtgcac ggcgtgtact atgacccctc 1500
caaggacctg atcgccgaga tccagaagca ggggcaaggc cagtggacct atcagattta 1560
ccaggagccc ttcaagaacc tgaagaccgg caagtacgcc cggatgaggg gtgcccacac 1620
taacgacgtc aagcagctga ccgaggccgt gcagaagatc accaccgaaa gcatcgtgat 1680
ctggggaaag actcctaagt tcaagctgcc catccagaag gaaacctggg aaacctggtg 1740
gacagagtat tggcaggcca cctggattcc tgagtgggag ttcgtcaaca cccctcccct 1800
ggtgaagctg tggtaccagc tggagaagga gcccatagtg ggcgccgaaa ccttctacgt 1860
ggatggggcc gctaacaggg agactaagct gggcaaagcc ggatacgtca ctaaccgggg 1920
cagacagaag gttgtcaccc tcactgacac caccaaccag aagactgagc tgcaggccat 1980
ttacctcgct ttgcaggact cgggcctgga ggtgaacatc gtgacagact ctcagtatgc 2040
cctgggcatc attcaagccc agccagacca gagtgagtcc gagctggtca atcagatcat 2100
cgagcagctg atcaagaagg aaaaggtcta tctggcctgg gtacccgccc acaaaggcat 2160
tggcggcaat gagcaggtcg acaagctggt ctcggctggc atcaggaagg tgctattcct 2220
ggatggcatc gacaaggccc aggacgagca cgagaaatac cacagcaact ggcgggccat 2280
ggctagcgac ttcaacctgc cccctgtggt ggccaaagag atcgtggcca gctgtgacaa 2340
gtgtcagctc aagggcgaag ccatgcatgg ccaggtggac tgtagccccg gcatctggca 2400
actcgattgc acccatctgg agggcaaggt tatcctggta gccgtccatg tggccagtgg 2460
ctacatcgag gccgaggtca ttcccgccga aacagggcag gagacagcct acttcctcct 2520
gaagctggca ggccggtggc cagtgaagac catccatact gacaatggca gcaatttcac 2580
cagtgctacg gttaaggccg cctgctggtg ggcgggaatc aagcaggagt tcgggatccc 2640
ctacaatccc cagagtcagg gcgtcgtcga gtctatgaat aaggagttaa agaagattat 2700
cggccaggtc agagatcagg ctgagcatct caagaccgcg gtccaaatgg cggtattcat 2760
ccacaatttc aagcggaagg gggggattgg ggggtacagt gcgggggagc ggatcgtgga 2820
catcatcgcg accgacatcc agactaagga gctgcaaaag cagattacca agattcagaa 2880
tttccgggtc tactacaggg acagcagaaa tcccctctgg aaaggcccag cgaagctcct 2940
ctggaagggt gagggggcag tagtgatcca ggataatagc gacatcaagg tggtgcccag 3000
aagaaaggcg aagatcatta gggattatgg caaacagatg gcgggtgatg attgcgtggc 3060
gagcagacag gatgaggatt agagctcgct ttcttgctgt ccaatttcta ttaaaggttc 3120
ctttgttccc taagtccaac tactaaactg ggggatatta tgaagggcct tgagcatctg 3180
gattctgcct aataagaaac atttattgtc attgcagaga cgcggccgcg cgtctcaatg 3240
aagagcgtcg acgcatgcgg 3260
<210> 12
<211> 3212
<212> DNA
<213> artificial sequence
<220>
<223> Synthetic
<400> 12
agaccaggaa ttacatttgc ttctgacaca actgtgttca ctagcaacct caaacagagc 60
cgccatgttt tttagggaag atctggcctt cccacaaggg aaggccaggg aattttcttc 120
agagcagacc agagccaaca gccccaccag aagagagctt caggtttggg gaagagacaa 180
caactccctc tcagaagcag gagccgatag acaaggaact gtatccttta gcttccctca 240
gatcactctt tggcagcgac ccctcgtcac aataaagata ggggggcaat taaaggaagc 300
tctattagat acaggagcag atgatacagt attagaagaa atgaatttgc caggaagatg 360
gaaaccaaaa atgatagggg gaattggagg ttttatcaaa gtaagacagt atgatcagat 420
actcatagaa atctgcggac ataaagctat aggtacagta ttagtaggac ctacacctgt 480
caacataatt ggaagaaatc tgttgactca gattggctgc actttaaatt ttcccattag 540
tcctattgag actgtaccag taaaattaaa gccaggaatg gatggcccaa aagttaaaca 600
atggccattg acagaagaaa aaataaaagc attagtagaa atttgtacag aaatggaaaa 660
ggaaggaaaa atttcaaaaa ttgggcctga aaatccatac aatactccag tatttgccat 720
aaagaaaaaa gacagtacta aatggagaaa attagtagat ttcagagaac ttaataagag 780
aactcaagat ttctgggaag ttcaattagg aataccacat cctgcagggt taaaacagaa 840
aaaatcagta acagtactgg atgtgggcga tgcatatttt tcagttccct tagataaaga 900
cttcaggaag tatactgcat ttaccatacc tagtataaac aatgagacac cagggattag 960
atatcagtac aatgtgcttc cacagggatg gaaaggatca ccagcaatat tccagtgtag 1020
catgacaaaa atcttagagc cttttagaaa acaaaatcca gacatagtca tctatcaata 1080
catggatgat ttgtatgtag gatctgactt agaaataggg cagcatagaa caaaaataga 1140
ggaactgaga caacatctgt tgaggtgggg atttaccaca ccagacaaaa aacatcagaa 1200
agaacctcca ttcctttgga tgggttatga actccatcct gataaatgga cagtacagcc 1260
tatagtgctg ccagaaaagg acagctggac tgtcaatgac atacagaaat tagtgggaaa 1320
attgaattgg gcaagtcaga tttatgcagg gattaaagta aggcaattat gtaaacttct 1380
taggggaacc aaagcactaa cagaagtagt accactaaca gaagaagcag agctagaact 1440
ggcagaaaac agggagattc taaaagaacc ggtacatgga gtgtattatg acccatcaaa 1500
agacttaata gcagaaatac agaagcaggg gcaaggccaa tggacatatc aaatttatca 1560
agagccattt aaaaatctga aaacaggaaa gtatgcaaga atgaagggtg cccacactaa 1620
tgatgtgaaa caattaacag aggcagtaca aaaaatagcc acagaaagca tagtaatatg 1680
gggaaagact cctaaattta aattacccat acaaaaggaa acatgggaag catggtggac 1740
agagtattgg caagccacct ggattcctga gtgggagttt gtcaataccc ctcccttagt 1800
gaagttatgg taccagttag agaaagaacc cataatagga gcagaaactt tctatgtaga 1860
tggggcagcc aatagggaaa ctaaattagg aaaagcagga tatgtaactg acagaggaag 1920
acaaaaagtt gtccccctaa cggacacaac aaatcagaag actgagttac aagcaattca 1980
tctagctttg caggattcgg gattagaagt aaacatagtg acagactcac aatatgcatt 2040
gggaatcatt caagcacaac cagataagag tgaatcagag ttagtcagtc aaataataga 2100
gcagttaata aaaaaggaaa aagtctacct ggcatgggta ccagcacaca aaggaattgg 2160
aggaaatgaa caagtagata aattggtcag tgctggaatc aggaaagtac tatttttaga 2220
tggaatagat aaggcccaag aagaacatga gaaatatcac agtaattgga gagcaatggc 2280
tagtgatttt aacctaccac ctgtagtagc aaaagaaata gtagccagct gtgataaatg 2340
tcagctaaaa ggggaagcca tgcatggaca agtagactgt agcccaggaa tatggcagct 2400
agattgtaca catttagaag gaaaagttat cttggtagca gttcatgtag ccagtggata 2460
tatagaagca gaagtaattc cagcagagac agggcaagaa acagcatact tcctcttaaa 2520
attagcagga agatggccag taaaaacagt acatacagac aatggcagca atttcaccag 2580
tactacagtt aaggccgcct gttggtgggc ggggatcaag caggaatttg gcattcccta 2640
caatccccaa agtcaaggag taatagaatc tatgaataaa gaattaaaga aaattatagg 2700
acaggtaaga gatcaggctg aacatcttaa gacagcagta caaatggcag tattcatcca 2760
caattttaaa agaaaagggg ggattggggg gtacagtgca ggggaaagaa tagtagacat 2820
aatagcaaca gacatacaaa ctaaagaatt acaaaaacaa attacaaaaa ttcaaaattt 2880
tcgggtttat tacagggaca gcagagatcc agtttggaaa ggaccagcaa agctcctctg 2940
gaaaggtgaa ggggcagtag taatacaaga taatagtgac ataaaagtag tgccaagaag 3000
aaaagcaaag atcatcaggg attatggaaa acagatggca ggtgatgatt gtgtggcaag 3060
tagacaggat gaggattaaa gctcgctttc ttgctgtcca atttctatta aaggttcctt 3120
tgttccctaa gtccaactac taaactgggg gatattatga agggccttga gcatctggat 3180
tctgcctaat aagaaacatt tattgtcatt gc 3212
<210> 13
<211> 1115
<212> DNA
<213> artificial sequence
<220>
<223> Synthetic
<400> 13
agaccaggaa ttacatttgc ttctgacaca actgtgttca ctagcaacct caaacagagc 60
cgccatggga ttcctggacg gcattgacaa ggctcaggag gagcacgaga agtaccactc 120
gaattggcgg gccatggcct ccgacttcaa cctgccaccc gtcgtcgcta aggagatcgt 180
tgctagctgc gacaagtgcc agctgaaagg cgaggctatg cacgggcagg ttgattgctc 240
tcccggcatc tggcagctcg actgtactca cctggagggc aaggtcatcc tggtcgccgt 300
gcacgtggcc tctggttaca tcgaggctga ggtcatccct gcagagactg gccaggagac 360
tgcctatttc ctgctgaaac tggccggccg gtggcctgtg aagacagtgc acacagataa 420
cggctccaac ttcacctcca ccactgtgaa ggctgcctgc tggtgggctg ggatcaagca 480
ggagttcggg atcccctata acccacagtc tcagggcgtg atcgaatcca tgaacaagga 540
gctgaagaag atcatcggcc aggttccgga ccaggcagag cacctgaaga ctgcagtgca 600
catggccgtg ttcatccaca acttcaagcg aaagggcgcc atcggtgcct actcagccgg 660
cgagcggatc gtggacatca tcgccactga catccagacc aaagagctgc agaagcagat 720
caccaagatc cagaacttcc gtgtgtacta ccgggactcc cgggaccctg tgtggaaggg 780
ccctgccaag ctgctgtgga agggcgaggg cgccgtggtc attcaggaca actgtgacat 840
caaggttgtg cccaggcgca aggccaagat tatccgggac tacggcaagc agatggctgg 900
cgacgactgt gtggcctctc gtcaagatga ggactaaagc tcgctttctt gctgtccaat 960
ttctattaaa ggttcctttg ttccctaagt ccaactacta aactggggga tattatgaag 1020
ggccttgagc atctggattc tgcctaataa gaaacattta ttgtcattgc agagacgcgg 1080
ccgcgcgtct caatgaagag cgtcgacgca tgcgg 1115
<210> 14
<211> 1115
<212> DNA
<213> artificial sequence
<220>
<223> Synthetic
<400> 14
agaccaggaa ttacatttgc ttctgacaca actgtgttca ctagcaacct caaacagagc 60
cgccatggga ttccttgacg gcatcgacaa ggcgcaagag gagcacgaga aatatcatag 120
caactggagg gcgatggcat cagacttcaa cttgccgcca gtcgtggcta aagagatagt 180
agcaagctgc gataagtgtc aactgaaggg tgaagcgatg cacggtcaag tggactgttc 240
ccctggtata tggcaactgg actgtaccca tctcgaaggt aaagtcattt tggttgctgt 300
acatgttgcc tctgggtata tcgaggctga ggtgattcca gcagaaacgg gacaggaaac 360
agcttatttc ttgctcaagc ttgccgggag atggcctgtg aaaactgttc atacagataa 420
tggtagcaat tttactagta cgaccgtgaa agcagcttgt tggtgggcag ggattaagca 480
agaattcgga atcccgtata acccgcagag tcagggcgtt attgagagca tgaacaagga 540
actgaaaaaa ataattgggc aagtgagaga tcaggctgag catcttaaaa ctgctgtaca 600
aatggcggtg ttcatacata attttaagcg gaagggagga attggaggat actctgcggg 660
agagaggata gtggatataa tagcgaccga tattcagaca aaggagctgc aaaaacagat 720
aacgaaaata caaaattttc gagtttacta tcgggactcc cgcgatcccg tgtggaaagg 780
tccagcgaaa ttgctttgga agggcgaagg cgcggtagtg atccaggaca attctgatat 840
caaagtggtc ccaaggcgga aagcaaagat aatccgcgac tacggcaagc aaatggcagg 900
agatgattgc gtcgcatcac gacaggacga agattaaagc tcgctttctt gctgtccaat 960
ttctattaaa ggttcctttg ttccctaagt ccaactacta aactggggga tattatgaag 1020
ggccttgagc atctggattc tgcctaataa gaaacattta ttgtcattgc agagacgcgg 1080
ccgcgcgtct caatgaagag cgtcgacgca tgcgg 1115
<210> 15
<211> 1067
<212> DNA
<213> artificial sequence
<220>
<223> Synthetic
<400> 15
agaccaggaa ttacatttgc ttctgacaca actgtgttca ctagcaacct caaacagagc 60
cgccatgttt ttagatggaa tagataaggc ccaagaagaa catgagaaat atcacagtaa 120
ttggagagca atggctagtg attttaacct accacctgta gtagcaaaag aaatagtagc 180
cagctgtgat aaatgtcagc taaaagggga agccatgcat ggacaagtag actgtagccc 240
aggaatatgg cagctagatt gtacacattt agaaggaaaa gttatcttgg tagcagttca 300
tgtagccagt ggatatatag aagcagaagt aattccagca gagacagggc aagaaacagc 360
atacttcctc ttaaaattag caggaagatg gccagtaaaa acagtacata cagacaatgg 420
cagcaatttc accagtacta cagttaaggc cgcctgttgg tgggcgggga tcaagcagga 480
atttggcatt ccctacaatc cccaaagtca aggagtaata gaatctatga ataaagaatt 540
aaagaaaatt ataggacagg taagagatca ggctgaacat cttaagacag cagtacaaat 600
ggcagtattc atccacaatt ttaaaagaaa aggggggatt ggggggtaca gtgcagggga 660
aagaatagta gacataatag caacagacat acaaactaaa gaattacaaa aacaaattac 720
aaaaattcaa aattttcggg tttattacag ggacagcaga gatccagttt ggaaaggacc 780
agcaaagctc ctctggaaag gtgaaggggc agtagtaata caagataata gtgacataaa 840
agtagtgcca agaagaaaag caaagatcat cagggattat ggaaaacaga tggcaggtga 900
tgattgtgtg gcaagtagac aggatgagga ttaaagctcg ctttcttgct gtccaatttc 960
tattaaaggt tcctttgttc cctaagtcca actactaaac tgggggatat tatgaagggc 1020
cttgagcatc tggattctgc ctaataagaa acatttattg tcattgc 1067
<210> 16
<211> 1568
<212> DNA
<213> artificial sequence
<220>
<223> Synthetic
<400> 16
agaccaggaa ttacatttgc ttctgacaca actgtgttca ctagcaacct caaacagagc 60
cgccatgccc attagcccta tcgagacggt accggtgaag ctgaagcccg ggatggacgg 120
cccgaaggtc aagcaatggc cattgacaga ggagaagatc aaggcactgg tggagatttg 180
cacagagatg gaaaaggaag ggaaaatctc caagattggg cctgagaacc cgtacaacac 240
gccggtgttc gcaatcaaga agaaggactc gacgaaatgg cgcaagctgg tggacttccg 300
cgagctgaac aagcgcacgc aagacttctg ggaggttcag ctgggcatcc cgcaccccgc 360
agggctgaag aagaagaaat ccgtgaccgt actggatgtg ggtgatgcct acttctccgt 420
tcccctggac gaagacttca ggaagtacac tgccttcaca atcccttcga tcaacaacga 480
gacaccgggg attcgatatc agtacaacgt gctgccccag ggctggaaag gctctcccgc 540
aatcttccag agtagcatga ccaaaatcct ggagcctttc cgcaaacaga accccgacat 600
cgtcatctat cagtacatgg atgacttgta cgtgggctct gatctagaga tagggcagca 660
ccgcaccaag atcgaggagc tgcgccagca cctgttgagg tggggactga ccacacccga 720
caagaagcac cagaaggagc ctcccttcct ctggatgggt tacgagctgc accctgacaa 780
atggaccgtg cagcctatcg tgctgccaga gaaagacagc tggactgtca acgacataca 840
gaagctggtg gggaagttga actgggccag tcagatttac ccagggatta aggtgaggca 900
gctgtgcaaa ctcctccgcg gaaccaaggc actcacagag gtgatccccc taaccgagga 960
ggccgagctc gaactggcag aaaaccgaga gatcctaaag gagcccgtgc acggcgtgta 1020
ctatgacccc tccaaggacc tgatcgccga gatccagaag caggggcaag gccagtggac 1080
ctatcagatt taccaggagc ccttcaagaa cctgaagacc ggcaagtacg cccggatgag 1140
gggtgcccac actaacgacg tcaagcagct gaccgaggcc gtgcagaaga tcaccaccga 1200
aagcatcgtg atctggggaa agactcctaa gttcaagctg cccatccaga aggaaacctg 1260
ggaaacctgg tggacagagt attggcaggc cacctggatt cctgagtggg agttcgtcaa 1320
cacccctccc ctggtgaagc tgtggtacca gctggagaag gagcccatag tgggcgccga 1380
aaccttctaa agctcgcttt cttgctgtcc aatttctatt aaaggttcct ttgttcccta 1440
agtccaacta ctaaactggg ggatattatg aagggccttg agcatctgga ttctgcctaa 1500
taagaaacat ttattgtcat tgcagagacg cggccgcgcg tctcaatgaa gagcgtcgac 1560
gcatgcgg 1568
<210> 17
<211> 1928
<212> DNA
<213> artificial sequence
<220>
<223> Synthetic
<400> 17
agaccaggaa ttacatttgc ttctgacaca actgtgttca ctagcaacct caaacagagc 60
cgccatgccc attagcccta tcgagacggt accggtgaag ctgaagcccg ggatggacgg 120
cccgaaggtc aagcaatggc cattgacaga ggagaagatc aaggcactgg tggagatttg 180
cacagagatg gaaaaggaag ggaaaatctc caagattggg cctgagaacc cgtacaacac 240
gccggtgttc gcaatcaaga agaaggactc gacgaaatgg cgcaagctgg tggacttccg 300
cgagctgaac aagcgcacgc aagacttctg ggaggttcag ctgggcatcc cgcaccccgc 360
agggctgaag aagaagaaat ccgtgaccgt actggatgtg ggtgatgcct acttctccgt 420
tcccctggac gaagacttca ggaagtacac tgccttcaca atcccttcga tcaacaacga 480
gacaccgggg attcgatatc agtacaacgt gctgccccag ggctggaaag gctctcccgc 540
aatcttccag agtagcatga ccaaaatcct ggagcctttc cgcaaacaga accccgacat 600
cgtcatctat cagtacatgg atgacttgta cgtgggctct gatctagaga tagggcagca 660
ccgcaccaag atcgaggagc tgcgccagca cctgttgagg tggggactga ccacacccga 720
caagaagcac cagaaggagc ctcccttcct ctggatgggt tacgagctgc accctgacaa 780
atggaccgtg cagcctatcg tgctgccaga gaaagacagc tggactgtca acgacataca 840
gaagctggtg gggaagttga actgggccag tcagatttac ccagggatta aggtgaggca 900
gctgtgcaaa ctcctccgcg gaaccaaggc actcacagag gtgatccccc taaccgagga 960
ggccgagctc gaactggcag aaaaccgaga gatcctaaag gagcccgtgc acggcgtgta 1020
ctatgacccc tccaaggacc tgatcgccga gatccagaag caggggcaag gccagtggac 1080
ctatcagatt taccaggagc ccttcaagaa cctgaagacc ggcaagtacg cccggatgag 1140
gggtgcccac actaacgacg tcaagcagct gaccgaggcc gtgcagaaga tcaccaccga 1200
aagcatcgtg atctggggaa agactcctaa gttcaagctg cccatccaga aggaaacctg 1260
ggaaacctgg tggacagagt attggcaggc cacctggatt cctgagtggg agttcgtcaa 1320
cacccctccc ctggtgaagc tgtggtacca gctggagaag gagcccatag tgggcgccga 1380
aaccttctac gtggatgggg ccgctaacag ggagactaag ctgggcaaag ccggatacgt 1440
cactaaccgg ggcagacaga aggttgtcac cctcactgac accaccaacc agaagactga 1500
gctgcaggcc atttacctcg ctttgcagga ctcgggcctg gaggtgaaca tcgtgacaga 1560
ctctcagtat gccctgggca tcattcaagc ccagccagac cagagtgagt ccgagctggt 1620
caatcagatc atcgagcagc tgatcaagaa ggaaaaggtc tatctggcct gggtacccgc 1680
ccacaaaggc attggcggca atgagcaggt cgacaagctg gtctcggctg gcatcaggaa 1740
ggtgctataa agctcgcttt cttgctgtcc aatttctatt aaaggttcct ttgttcccta 1800
agtccaacta ctaaactggg ggatattatg aagggccttg agcatctgga ttctgcctaa 1860
taagaaacat ttattgtcat tgcagagacg cggccgcgcg tctcaatgaa gagcgtcgac 1920
gcatgcgg 1928
<210> 18
<211> 18
<212> RNA
<213> artificial sequence
<220>
<223> Synthetic
<400> 18
gucccuguuc gggcgcca 18
<210> 19
<211> 18
<212> DNA
<213> artificial sequence
<220>
<223> Synthetic
<400> 19
gtccctgttc gggcgcca 18
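
The entries above use numeric identifiers in the style of a WIPO ST.25 sequence listing: <210> is the sequence identifier, <211> the declared length, <212> the molecule type, and <400> introduces the residues, each line of which ends with a running position count. The following is a minimal sketch, not part of the original filing, of how a reader might check a declared length against the residues actually listed and confirm that SEQ ID NO: 18 and SEQ ID NO: 19 are the RNA and DNA forms of the same 18-nucleotide sequence. The function name `parse_listing` and the `SAMPLE` string are illustrative assumptions; only the two short entries are reproduced here.

```python
import re

# Two short entries copied from the listing above (SEQ ID NO: 18 and 19).
SAMPLE = """\
<210> 18
<211> 18
<212> RNA
<213> artificial sequence
<220>
<223> Synthesis
<400> 18
gucccuguuc gggcgcca 18
<210> 19
<211> 18
<212> DNA
<213> artificial sequence
<220>
<223> Synthesis
<400> 19
gtccctgttc gggcgcca 18
"""


def parse_listing(text):
    """Hypothetical helper: return {seq_id: {"length", "type", "residues"}}."""
    entries = {}
    current = None
    in_sequence = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        tag = re.match(r"<(\d{3})>\s*(.*)", line)
        if tag:
            code, value = tag.group(1), tag.group(2).strip()
            if code == "210":
                # A new sequence entry starts here.
                current = {"length": None, "type": None, "residues": ""}
                entries[int(value)] = current
                in_sequence = False
            elif current is not None and code == "211":
                current["length"] = int(value)
            elif current is not None and code == "212":
                current["type"] = value
            elif current is not None and code == "400":
                in_sequence = True
            continue
        if in_sequence and current is not None:
            # Keep residue groups only; the trailing position count is skipped.
            for token in line.split():
                if re.fullmatch(r"[acgtun]+", token):
                    current["residues"] += token
    return entries


entries = parse_listing(SAMPLE)
for seq_id, entry in entries.items():
    # The declared length (<211>) should equal the number of listed residues.
    assert entry["length"] == len(entry["residues"]), seq_id

# SEQ ID NO: 18 (RNA) becomes SEQ ID NO: 19 (DNA) when u is replaced by t.
same_sequence = entries[18]["residues"].replace("u", "t") == entries[19]["residues"]
print(same_sequence)
```

The same length check can be applied to the longer entries by passing their full text to `parse_listing`; it makes no claim about the biological function of any sequence.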

Claims (129)

1. A composition comprising:
(i) one or more nucleic acid molecules encoding one or more Pol polyprotein components flanked by 5' and 3' untranslated regions (UTRs); and
(ii) a nucleic acid molecule comprising one or more reverse transcriptase priming elements located between a 5' long terminal repeat (LTR) and a 3' LTR, and one or more promoter sequences operably linked to one or more transgenes.
2. The composition of claim 1, wherein the composition is free of nucleic acid sequences that express proteins encoded by at least one of a retrovirus rev gene and a retrovirus env gene.
3. The composition of claim 1 or 2, wherein the composition is free of nucleic acid sequences expressing proteins encoded by retroviral rev and env genes.
4. The composition of any one of claims 1-3, wherein the composition is capable of integrating the one or more transgenes into a host genome in the absence of functional retroviral Rev and/or Env proteins.
5. The composition of any one of claims 1-4, wherein expression of the Pol polyprotein component does not require translational slippage from an in-line gag gene.
6. The composition of any one of claims 1-5, wherein the one or more nucleic acid molecules of (i) are nucleic acid molecules comprising a 5' UTR, a nucleic acid sequence encoding a Pol polyprotein, and a 3' UTR.
7. The composition of any one of claims 1-6, wherein the one or more nucleic acid molecules of (i) are nucleic acid molecules comprising a 5' UTR, nucleic acid sequences encoding at least the Pol polyprotein components reverse transcriptase and integrase, and a 3' UTR.
8. The composition of any one of claims 1-5 and 7, wherein the Pol polyprotein components reverse transcriptase and integrase are expressed on a polycistronic construct.
9. The composition of claim 8, wherein the polycistronic construct is a bicistronic or tricistronic construct.
10. The composition of any one of claims 1-5 and 7-9, wherein the Pol polyprotein components reverse transcriptase and integrase are expressed together with one or more polycistronic elements.
11. The composition of claim 10, wherein the polycistronic element is an inserted Internal Ribosome Entry Site (IRES) or a 2A peptide coding sequence.
12. The composition of any one of claims 1-5 and 7-11, wherein the one or more nucleic acid molecules of (i) are two nucleic acid molecules, wherein one of the two nucleic acid molecules comprises a 5' UTR, a nucleic acid sequence encoding at least a Pol polyprotein component reverse transcriptase, and a 3' UTR, and wherein the second of the two nucleic acid molecules comprises a 5' UTR, a nucleic acid sequence encoding at least a Pol polyprotein component integrase, and a 3' UTR.
13. The composition of any one of claims 1-12, wherein the one or more nucleic acid molecules of (i) further comprise a nucleic acid sequence encoding one or more accessory proteins.
14. The composition of claim 13, wherein the one or more accessory proteins are selected from the group consisting of nucleocapsid (NC), capsid protein (CA), matrix protein (MA), p6, viral infectivity factor (Vif), trans-activator of transcription (Tat), negative regulatory factor (Nef), viral protein R (Vpr), and viral protein U (Vpu).
15. The composition of claim 14, wherein the one or more accessory proteins are mutant accessory proteins.
16. The composition of claim 15, wherein the mutant accessory protein is a mutant capsid protein (CA), optionally wherein the mutant capsid protein comprises an N74D mutation, an E45A mutation, or both, wherein numbering corresponds to wild-type HIV-1 capsid protein.
17. The composition of any one of claims 1-12, wherein the one or more nucleic acid molecules of (i) further comprise a nucleic acid sequence encoding one or more Gag polyprotein accessory proteins.
18. The composition of any one of claims 13-17, wherein the one or more accessory proteins are encoded on the same nucleic acid molecule as the Pol polyprotein component.
19. The composition of claim 18, wherein the one or more accessory proteins are expressed with one or more polycistronic elements.
20. The composition of claim 19, wherein the polycistronic element is an inserted Internal Ribosome Entry Site (IRES) or a 2A peptide coding sequence.
21. The composition of any one of claims 13-17, wherein the one or more accessory proteins are encoded by one or more nucleic acid molecules that are different from the nucleic acid molecules encoding the Pol polyprotein component, wherein each nucleic acid molecule comprises a 5' UTR and a 3' UTR.
22. The composition of any one of claims 13-21, wherein the Gag polyprotein accessory protein is encoded by a Gag polyprotein.
23. The composition of any one of claims 1-21, wherein the one or more nucleic acid molecules of (i) do not encode Gag polyprotein.
24. The composition of any one of claims 1-4, 6, 7, and 13-22, wherein the one or more nucleic acid molecules of (i) comprise a gag-pol gene.
25. The composition of claim 24, wherein the gag-pol gene comprises a frameshift mutation.
26. The composition of claim 25, wherein the frameshift mutation is a single nucleotide insertion or deletion.
27. The composition of any one of claims 1-26, wherein the composition does not comprise a nucleic acid sequence encoding a matrix protein.
28. The composition of any one of claims 1-27, wherein the one or more nucleic acid molecules of (i) comprise at least one mutation in one or more internal instability (INS) elements.
29. The composition of claim 28, wherein the one or more INS elements are selected from TAGAT, ATAGA, AAAAG, ATAAA and TTATA.
30. The composition of any one of claims 1-29, wherein the one or more nucleic acid molecules of (i) encode an integrase polypeptide comprising an N-terminal methionine-glycine dipeptide.
31. The composition of any one of claims 1-30, wherein the one or more Pol polyprotein components are fused to a homing protein.
32. The composition of claim 31, wherein at least one of the Pol polyprotein components fused to the homing protein is an integrase polypeptide.
33. The composition of claim 31 or 32, wherein the homing protein is I-PpoI.
34. The composition of any one of claims 1-33, wherein the one or more nucleic acid molecules of (i) and/or the nucleic acid molecule of (ii) are codon optimized for expression in a host cell.
35. The composition of claim 34, wherein the host cell is a mammalian cell.
36. The composition of claim 35, wherein the mammalian cell is a human cell.
37. The composition of claim 34, wherein the host cell is an avian cell.
38. The composition of any one of claims 1-37, wherein the composition further comprises a priming oligonucleotide.
39. The composition of claim 38, wherein the priming oligonucleotide is GUCCCUGUUCGGGCGCCA or GTCCCTGTTCGGGCGCCA.
40. The composition of claim 38, wherein the priming oligonucleotide is engineered to be complementary to an RT priming element.
41. The composition of any one of claims 1-40, wherein said nucleic acid molecules of (i) and (ii) are RNA molecules or DNA molecules.
42. The composition of any one of claims 1-40, wherein said nucleic acid molecules of (i) and (ii) are ssDNA molecules or dsDNA molecules.
43. The composition of any one of claims 1-40, wherein said nucleic acid molecules of (i) and (ii) are ssRNA molecules or dsRNA molecules.
44. The composition of any one of claims 1-43, wherein said nucleic acid molecule of (ii) comprises two or more transgenes, and said transgenes are separated by one or more polycistronic elements.
45. The composition of claim 44, wherein said one or more polycistronic elements comprise one or more Internal Ribosome Entry Sites (IRES) and/or one or more 2A peptide coding sequences.
46. The composition of any one of claims 1-45, wherein said nucleic acid molecule of (ii) further comprises one or more enhancers.
47. The composition of claim 46, wherein the one or more enhancers comprises a woodchuck hepatitis virus post-transcriptional regulatory element (WPRE).
48. The composition of any one of claims 1-47, wherein the one or more nucleic acid molecules of (i) are RNA molecules and comprise one or more modifications selected from the group consisting of modified ribonucleosides, 5'-7mG cap structures, and poly(rA) tails.
49. The composition of claim 48, wherein said one or more modifications comprise pseudouridine or a derivative of pseudouridine.
50. The composition of claim 49, wherein said pseudouridine derivative is N1-methylpseudouridine.
51. The composition of claim 48, wherein said one or more modifications comprise N6-methyladenosine.
52. The composition of any one of claims 1-51, wherein the nucleic acid molecule of (ii) is an RNA molecule and further comprises one or more modifications selected from the group consisting of modified ribonucleosides, 5'-7mG cap structures, and poly(rA) tails.
53. The composition of any one of claims 1-52, wherein the Pol polyprotein component, accessory protein, and/or LTR are based on the Pol polyprotein component and/or LTR from: human immunodeficiency virus (HIV), simian immunodeficiency virus (SIV), visna/maedi virus (VMV), caprine arthritis encephalitis virus (CAEV), equine infectious anemia virus (EIAV), feline immunodeficiency virus (FIV), bovine immunodeficiency virus (BIV), human foamy virus (HFV), murine leukemia virus (MLV), Moloney murine leukemia virus (MoLV), Friend virus (FV), Abelson murine leukemia virus (A-MLV), murine stem cell virus (MSCV), mouse mammary tumor virus (MMTV), Moloney murine sarcoma virus (MoMSV), Rous sarcoma virus (RSV), Fujinami sarcoma virus (FuSV), FBR murine osteosarcoma virus (FBR MSV), avian myelocytomatosis virus 29 (MC29), avian erythroblastosis virus (AEV), human T cell leukemia virus (HTLV), Friend MLV (FrMLV), avian sarcoma virus (ASV), avian leukosis virus, avian myeloblastosis virus, UR2 sarcoma virus, Y73 sarcoma virus, Jaagsiekte sheep retrovirus, langur virus, Mason-Pfizer monkey virus, squirrel monkey retrovirus, avian carcinoma Mill Hill virus 2, bovine leukemia virus, primate T-lymphotropic virus 1, primate T-lymphotropic virus 2, primate T-lymphotropic virus 3, walleye dermal sarcoma virus, walleye epidermal hyperplasia virus 1, walleye epidermal hyperplasia virus 2, chick syncytial virus, feline leukemia virus, Ehrlich Lelch Virus, Pachyrhizus hilsa Epstein-Barr Virus, Finkel-Biskis-Jinkins murine sarcoma virus, Gardner-Arnstein feline sarcoma virus, gibbon ape leukemia virus, guinea pig type-C oncovirus, Hardy-Zuckerman feline sarcoma virus, Harvey murine sarcoma virus, Kirsten murine sarcoma virus, koala retrovirus, Moloney murine sarcoma virus, porcine type-C oncovirus, reticuloendotheliosis virus, Snyder-Theilen feline sarcoma virus, Trager duck spleen necrosis virus, viper retrovirus, woolly monkey sarcoma virus, Jembrana disease virus, puma lentivirus, bovine foamy virus, feline foamy virus, brown greater galago prosimian foamy virus, verrucan simian foamy virus, central chimpanzee simian foamy virus, cynomolgus macaque simian foamy virus, eastern chimpanzee simian foamy virus, grivet simian foamy virus, guenon simian foamy virus, Japanese macaque simian foamy virus, rhesus macaque simian foamy virus, spider monkey simian foamy virus, squirrel monkey simian foamy virus, Taiwanese macaque simian foamy virus, western chimpanzee simian foamy virus, western lowland gorilla simian foamy virus, white-tufted-ear marmoset simian foamy virus, or yellow-breasted capuchin simian foamy virus.
54. The composition of any one of claims 1-53, wherein the composition is packaged in a non-viral delivery system.
55. The composition of claim 54, wherein the non-viral delivery system is a lipid nanoparticle, a liposome, a polypeptide nanoparticle, a silica nanoparticle, a gold nanoparticle, a polymer nanoparticle, a dendrimer, or a cationic nanoemulsion.
56. The composition of claim 54, wherein the non-viral delivery system is a lipid nanoparticle.
57. A composition comprising:
(i) A first RNA molecule comprising: a 5' untranslated region (UTR), a nucleic acid sequence encoding a retroviral Pol polyprotein, and a 3' UTR; and
(ii) A second RNA molecule comprising one or more reverse transcriptase initiation elements located between a 5' long terminal repeat (LTR) and a 3' LTR and one or more promoter sequences operably linked to one or more transgenes;
wherein the composition is packaged in a non-viral delivery system.
58. The composition of claim 57, wherein the composition is free of nucleic acid sequences that express proteins encoded by at least one of a retroviral rev gene and a retroviral env gene.
59. The composition of claim 57 or 58, wherein said composition is free of nucleic acid sequences that express proteins encoded by both retroviral rev and env genes.
60. The composition of claim 57, wherein said composition is capable of integrating said one or more transgenes into a host genome in the absence of functional retroviral Rev and/or Env proteins.
61. The composition of any one of claims 57-60, wherein said RNA molecules of (i) and (ii) are ssRNA molecules or dsRNA molecules.
62. The composition of any one of claims 57-61, further comprising a third RNA molecule encoding one or more accessory proteins.
63. The composition of claim 62, wherein the one or more accessory proteins are selected from the group consisting of nucleocapsid (NC), capsid protein (CA), matrix protein (MA), p6, viral infectivity factor (Vif), transactivator of transcription (Tat), negative regulatory factor (Nef), viral protein R (Vpr), and viral protein U (Vpu).
64. The composition of claim 63, wherein the one or more accessory proteins are mutant accessory proteins.
65. The composition of claim 64, wherein the mutant accessory protein is a mutant capsid protein (CA), optionally wherein the mutant capsid protein comprises an N74D mutation, an E45A mutation, or both, wherein numbering corresponds to wild-type HIV-1 capsid protein.
66. The composition of any one of claims 57-61, wherein said first RNA molecule further comprises a nucleic acid sequence encoding one or more accessory proteins.
67. The composition of claim 66, wherein the one or more accessory proteins are selected from the group consisting of NC, CA, MA, p6, Vif, Tat, Nef, Vpr, and Vpu.
68. The composition of any one of claims 57-61, further comprising a third RNA molecule encoding one or more Gag polyprotein accessory proteins.
69. The composition of any one of claims 57-61, wherein said first RNA molecule further comprises a nucleic acid sequence encoding one or more Gag polyprotein accessory proteins.
70. The composition of claim 68 or 69, wherein said Gag polyprotein accessory protein is encoded by a Gag polyprotein.
71. The composition of any one of claims 57-70, wherein said first RNA molecule comprises a gag-pol gene.
72. The composition of claim 71, wherein the gag-pol gene comprises a frameshift mutation.
73. The composition of claim 72, wherein said frameshift mutation is a single nucleotide insertion or deletion.
74. The composition of any one of claims 57-73, wherein the composition does not comprise a nucleic acid sequence encoding a Matrix (MA) protein.
75. The composition of any one of claims 57-74, wherein said first RNA molecule or said third RNA molecule, if present, comprises a mutation in one or more intrinsic instability (INS) elements.
76. The composition of claim 75, wherein said one or more INS elements are selected from TAGAT, ATAGA, AAAAG, ATAAA, and TTATA.
77. The composition of any one of claims 57-76, wherein said first RNA molecule encodes an integrase polypeptide comprising an N-terminal methionine-glycine dipeptide.
78. The composition of any one of claims 57-77, wherein said Pol polyprotein is fused to a homing protein.
79. The composition of claim 78, wherein said integrase polypeptide is fused to said homing protein.
80. The composition of claim 78 or 79, wherein said homing protein is I-PpoI.
81. The composition of any one of claims 57-80, wherein said first and/or second RNA molecules are codon optimized for expression in a host cell.
82. The composition of claim 81, wherein said host cell is a mammalian cell.
83. The composition of claim 82, wherein said mammalian cell is a human cell.
84. The composition of claim 81, wherein said host cell is an avian cell.
85. The composition of any one of claims 57-84, wherein the composition further comprises a priming oligonucleotide.
86. The composition of claim 85, wherein the priming oligonucleotide is GUCCCUGUUCGGGCGCCA (SEQ ID NO: 18) or GTCCCTGTTCGGGCGCCA (SEQ ID NO: 19).
87. The composition of claim 85, wherein the priming oligonucleotide is engineered to be complementary to an RT priming element.
88. The composition of any one of claims 57-87, wherein the second RNA molecule comprises two or more transgenes and the transgenes are separated by one or more polycistronic elements.
89. The composition of claim 88, wherein said one or more polycistronic elements comprise one or more Internal Ribosome Entry Sites (IRES) and/or one or more 2A peptide coding sequences.
90. The composition of any one of claims 57-89, wherein the second RNA molecule further comprises one or more enhancers.
91. The composition of claim 90, wherein the one or more enhancers comprise woodchuck hepatitis virus posttranscriptional regulatory elements (WPREs).
92. The composition of any one of claims 57-91, wherein the first or second RNA molecule comprises one or more modifications selected from the group consisting of a modified nucleoside, a 5'-7mG cap structure, and a poly(rA) tail.
93. The composition of any one of claims 57-92, wherein the second RNA molecule comprises one or more modifications selected from the group consisting of a 5'-7mG cap structure and a poly(rA) tail.
94. The composition of claim 92, wherein the one or more modifications comprise pseudouridine or a derivative of pseudouridine.
95. The composition of claim 94, wherein said derivative of pseudouridine is N1-methylpseudouridine.
96. The composition of claim 92, wherein the one or more modifications comprise N6-methyladenosine.
97. The composition of any one of claims 1-96, wherein the one or more promoters comprise one or more tissue-specific or cell-specific promoters.
98. The composition of claim 97, wherein the one or more tissue-specific or cell-specific promoters are specific for bone marrow, hematopoietic stem cells (HSCs), epithelial cells, hepatocytes, photoreceptor cells, muscle cells, or T cells.
99. The composition of claim 97 or claim 98, wherein the one or more promoters comprise the hCMV promoter.
100. The composition of any one of claims 1-99, wherein the one or more transgenes encode one or more therapeutic, diagnostic or reporter molecules, or fragments thereof.
101. The composition of any one of claims 1-99, wherein the one or more transgenes encode one or more therapeutic, diagnostic or reporter proteins, or fragments thereof.
102. The composition of claim 101, wherein the therapeutic protein is beta-globin, cystic fibrosis transmembrane conductance regulator (CFTR), factor VIII, dystrophin, or retinitis pigmentosa GTPase regulator (RPGR).
103. The composition of claim 101, wherein the reporter protein is a fluorescent protein or a luciferase.
104. The composition of any one of claims 57-103, wherein the Pol polyprotein component, accessory protein, and/or LTR is based on a Pol polyprotein component, accessory protein, and/or LTR from the group consisting of: human immunodeficiency virus (HIV), simian immunodeficiency virus (SIV), visna/maedi virus (VMV), caprine arthritis encephalitis virus (CAEV), equine infectious anemia virus (EIAV), feline immunodeficiency virus (FIV), bovine immunodeficiency virus (BIV), human foamy virus (HFV), murine leukemia virus (MLV), Moloney murine leukemia virus (MoLV), Friend virus (FV), Abelson murine leukemia virus (A-MLV), murine stem cell virus (MSCV), mouse mammary tumor virus (MMTV), Moloney murine sarcoma virus (MoMSV), Rous sarcoma virus (RSV), Fujinami sarcoma virus (FuSV), FBR murine osteosarcoma virus (FBR MSV), avian myelocytomatosis virus 29 (MC29), avian erythroblastosis virus (AEV), human T-cell leukemia virus (HTLV), Friend MLV (FrMLV), avian sarcoma virus (ASV), avian leukosis virus, avian myeloblastosis virus, UR2 sarcoma virus, Y73 sarcoma virus, Jaagsiekte sheep retrovirus, langur virus, Mason-Pfizer monkey virus, squirrel monkey retrovirus, avian carcinoma Mill Hill virus 2, bovine leukemia virus, primate T-lymphotropic virus 1, primate T-lymphotropic virus 2, primate T-lymphotropic virus 3, walleye dermal sarcoma virus, walleye epidermal hyperplasia virus 1, walleye epidermal hyperplasia virus 2, chick syncytial virus, feline leukemia virus, ehrlich Lelch Virus, pachyrhizus hilsa Epstein-Barr Virus, Finkel-Biskis-Jinkins murine sarcoma virus, Gardner-Arnstein feline sarcoma virus, gibbon ape leukemia virus, guinea pig type-C oncovirus, Hardy-Zuckerman feline sarcoma virus, Harvey murine sarcoma virus, Kirsten murine sarcoma virus, koala retrovirus, Moloney murine sarcoma virus, porcine type-C oncovirus, reticuloendotheliosis virus, Snyder-Theilen feline sarcoma virus, Trager duck spleen necrosis virus, viper retrovirus, woolly monkey sarcoma virus, Jembrana disease virus, puma lentivirus, bovine foamy virus, feline foamy virus, brown greater galago simian foamy virus, Bornean orangutan simian foamy virus, central chimpanzee simian foamy virus, cynomolgus macaque simian foamy virus, eastern chimpanzee simian foamy virus, grivet simian foamy virus, guenon simian foamy virus, Japanese macaque simian foamy virus, rhesus macaque simian foamy virus, spider monkey simian foamy virus, squirrel monkey simian foamy virus, Taiwanese macaque simian foamy virus, western chimpanzee simian foamy virus, western lowland gorilla simian foamy virus, white-tufted-ear marmoset simian foamy virus, or yellow-breasted capuchin simian foamy virus.
105. The composition of any one of claims 54 and 57-104, wherein said non-viral delivery system targets a specific tissue or cell type.
106. The composition of claim 105, wherein the specific tissue or cell type is bone marrow, HSCs, epithelial cells, hepatocytes, photoreceptor cells, muscle cells, or T cells.
107. The composition of any one of claims 57-104, wherein the non-viral delivery system is a lipid nanoparticle, a liposome, a polypeptide nanoparticle, a silica nanoparticle, a gold nanoparticle, a polymer nanoparticle, a dendrimer, or a cationic nanoemulsion.
108. A method of expressing a gene in a subject in need thereof, the method comprising administering to the subject an effective amount of the composition of any one of claims 1-107, thereby expressing the one or more transgenes in the subject.
109. A method for expressing a gene in a cell, the method comprising delivering the composition of any one of claims 1-107 to the cell, thereby expressing the one or more transgenes in the cell.
110. A method of using the composition of any one of claims 1-107, the method comprising delivering the composition to a subject, thereby expressing the one or more transgenes in the subject.
111. A method of treating a disease or disorder in a subject in need thereof, the method comprising delivering the composition of any one of claims 1-107 to the subject, thereby expressing the one or more transgenes in the subject.
112. The method of claim 111, wherein the disease or disorder is a genetic disease or disorder.
113. The method of claim 111, wherein the disease or disorder is a genetic gene disease or disorder.
114. The method of claim 111, wherein the disease or disorder is sickle cell disease, β-thalassemia, hemophilia B, retinitis pigmentosa, Duchenne muscular dystrophy, cystic fibrosis, or cancer.
115. The method of any one of claims 108-114, wherein the one or more transgenes are integrated into the genome of the target cell.
116. The method of claim 115, wherein the stable expression of the one or more transgenes continues for at least one week, at least two weeks, at least one month, at least 6 months, at least one year, or for the lifetime of the subject.
117. A method of eliciting an immune response in a subject in need thereof, the method comprising administering to the subject an effective amount of the composition of any one of claims 1-107, thereby expressing the one or more transgenes in the subject.
118. The method of claim 117, wherein the subject has cancer and the one or more transgenes encode a tumor antigen.
119. The method of claim 117, wherein the subject has or is at risk of an infectious disease and the one or more transgenes encode an antigen associated with the infectious disease.
120. The method of any one of claims 108-119, wherein the composition is delivered locally or systemically.
121. The method of claim 120, wherein the composition is delivered by injection, by inhalation, intravenously, intraperitoneally, subcutaneously, intramuscularly, orally, intranasally, by pulmonary administration, transdermally, transmucosally, or intratumorally.
122. One or more nucleic acid templates for use in the composition of any one of claims 1-107, the nucleic acid templates comprising a 5' UTR, a nucleic acid sequence encoding one or more retroviral Pol polyprotein components, and a 3' UTR, wherein expression of the Pol polyprotein components does not require translational slippage from an in-line gag gene.
123. One or more nucleic acid templates for use in the composition of any one of claims 1-107, the nucleic acid templates comprising a 5' UTR, a nucleic acid sequence encoding a gag-pol gene, and a 3' UTR.
124. The one or more nucleic acid templates of claim 123, wherein the gag-pol gene comprises a frameshift mutation.
125. One or more nucleic acid templates for use in the composition of any one of claims 1-107, the nucleic acid templates comprising a 5' UTR, a nucleic acid sequence encoding a gag-pol gene, and a 3' UTR, wherein the gag-pol gene does not encode a matrix protein.
126. The nucleic acid template of any one of claims 122-125, further comprising a nucleic acid sequence encoding one or more accessory proteins selected from the group consisting of NC, CA, MA, p6, Vif, Tat, Nef, Vpr, and Vpu.
127. The nucleic acid template of any one of claims 122-126, wherein the Pol polyprotein component and/or accessory protein is based on a Pol polyprotein component and accessory protein from the group consisting of: human immunodeficiency virus (HIV), simian immunodeficiency virus (SIV), visna/maedi virus (VMV), caprine arthritis encephalitis virus (CAEV), equine infectious anemia virus (EIAV), feline immunodeficiency virus (FIV), bovine immunodeficiency virus (BIV), human foamy virus (HFV), murine leukemia virus (MLV), Moloney murine leukemia virus (MoLV), Friend virus (FV), Abelson murine leukemia virus (A-MLV), murine stem cell virus (MSCV), mouse mammary tumor virus (MMTV), Moloney murine sarcoma virus (MoMSV), Rous sarcoma virus (RSV), Fujinami sarcoma virus (FuSV), FBR murine osteosarcoma virus (FBR MSV), avian myelocytomatosis virus 29 (MC29), avian erythroblastosis virus (AEV), human T-cell leukemia virus (HTLV), Friend MLV (FrMLV), avian sarcoma virus (ASV), avian leukosis virus, avian myeloblastosis virus, UR2 sarcoma virus, Y73 sarcoma virus, Jaagsiekte sheep retrovirus, langur virus, Mason-Pfizer monkey virus, squirrel monkey retrovirus, avian carcinoma Mill Hill virus 2, bovine leukemia virus, primate T-lymphotropic virus 1, primate T-lymphotropic virus 2, primate T-lymphotropic virus 3, walleye dermal sarcoma virus, walleye epidermal hyperplasia virus 1, walleye epidermal hyperplasia virus 2, chick syncytial virus, feline leukemia virus, ehrlich Lelch Virus, pachyrhizus hilsa Epstein-Barr Virus, Finkel-Biskis-Jinkins murine sarcoma virus, Gardner-Arnstein feline sarcoma virus, gibbon ape leukemia virus, guinea pig type-C oncovirus, Hardy-Zuckerman feline sarcoma virus, Harvey murine sarcoma virus, Kirsten murine sarcoma virus, koala retrovirus, Moloney murine sarcoma virus, porcine type-C oncovirus, reticuloendotheliosis virus, Snyder-Theilen feline sarcoma virus, Trager duck spleen necrosis virus, viper retrovirus, woolly monkey sarcoma virus, Jembrana disease virus, puma lentivirus, bovine foamy virus, feline foamy virus, brown greater galago simian foamy virus, Bornean orangutan simian foamy virus, central chimpanzee simian foamy virus, cynomolgus macaque simian foamy virus, eastern chimpanzee simian foamy virus, grivet simian foamy virus, guenon simian foamy virus, Japanese macaque simian foamy virus, rhesus macaque simian foamy virus, spider monkey simian foamy virus, squirrel monkey simian foamy virus, Taiwanese macaque simian foamy virus, western chimpanzee simian foamy virus, western lowland gorilla simian foamy virus, white-tufted-ear marmoset simian foamy virus, or yellow-breasted capuchin simian foamy virus.
128. A method of producing an RNA molecule comprising in vitro transcribing the nucleic acid template of any one of claims 122-127.
129. A kit comprising one or more containers comprising the composition of any one of claims 1-107, or the nucleic acid template of any one of claims 122-127.
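Claim 86 recites the priming oligonucleotide twice, once in the RNA alphabet (SEQ ID NO: 18) and once in the DNA alphabet (SEQ ID NO: 19), and claim 87 requires complementarity to an RT priming element. The following is a minimal illustrative sketch in Python, not part of the claims, that checks that the two recited sequences differ only by U/T and computes the strand to which the oligonucleotide would be perfectly complementary. The function names are hypothetical, and the example assumes simple Watson-Crick pairing with no mismatches or modified bases.

```python
# Sequences copied from claim 86.
SEQ_ID_18 = "GUCCCUGUUCGGGCGCCA"  # RNA form
SEQ_ID_19 = "GTCCCTGTTCGGGCGCCA"  # DNA form

# Translation table for Watson-Crick complements in the DNA alphabet.
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def rna_to_dna(seq: str) -> str:
    """Rewrite a sequence in the DNA alphabet (U becomes T)."""
    return seq.upper().replace("U", "T")


def reverse_complement_dna(seq: str) -> str:
    """Return the reverse complement of a DNA-alphabet sequence, read 5' to 3'."""
    return seq.upper().translate(_DNA_COMPLEMENT)[::-1]


def is_complementary(oligo: str, target: str) -> bool:
    """True if oligo pairs perfectly with target (RNA or DNA alphabet accepted).

    Hypothetical helper: exact-match complementarity only.
    """
    return rna_to_dna(oligo) == reverse_complement_dna(rna_to_dna(target))


if __name__ == "__main__":
    print(rna_to_dna(SEQ_ID_18) == SEQ_ID_19)   # same 18-mer in both alphabets
    print(reverse_complement_dna(SEQ_ID_19))    # strand the oligonucleotide pairs with
```

Under these assumptions, a candidate RT priming element can be screened by passing it as `target` to `is_complementary`; anything short of an exact reverse-complement match returns `False`.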
CN202180062136.3A 2020-07-17 2021-07-16 Nucleic acid therapy for genetic disorders Pending CN116171326A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US202063053474P 2020-07-17 2020-07-17
US63/053,474 2020-07-17
PCT/US2021/042015 WO2022016077A1 (en) 2020-07-17 2021-07-16 Nucleic acid therapeutics for genetic disorders

Publications (1)

Publication Number Publication Date
CN116171326A (en) 2023-05-26

Family

ID=77519749

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202180062136.3A Pending CN116171326A (en) 2020-07-17 2021-07-16 Nucleic acid therapy for genetic disorders

Country Status (10)

Country Link
US (1) US20220096525A1 (en)
EP (1) EP4182332A1 (en)
JP (1) JP2023535381A (en)
KR (1) KR20230040353A (en)
CN (1) CN116171326A (en)
AU (1) AU2021308688A1 (en)
BR (1) BR112023000784A2 (en)
CA (1) CA3184980A1 (en)
IL (1) IL299446A (en)
WO (1) WO2022016077A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024201067A1 (en) * 2023-03-31 2024-10-03 Dawn Therapeutics Limited Transfection system

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050048652A1 (en) * 2003-08-19 2005-03-03 Iowa State University Research Foundation, Inc. Retroelement vector system for amplification and delivery of nucleotide sequences in plants
US7404969B2 (en) 2005-02-14 2008-07-29 Sirna Therapeutics, Inc. Lipid nanoparticle based compositions and methods for the delivery of biologically active molecules
PL2440183T3 (en) 2009-06-10 2019-01-31 Arbutus Biopharma Corporation Improved lipid formulation
WO2012156839A2 (en) * 2011-05-19 2012-11-22 Ospedale San Raffaele S.R.L. New generation of splice-less lentiviral vectors for safer gene therapy applications
US11274284B2 (en) 2015-03-30 2022-03-15 Greenlight Biosciences, Inc. Cell-free production of ribonucleic acid
WO2017083356A1 (en) * 2015-11-09 2017-05-18 Immune Design Corp. A retroviral vector for the administration and expression of replicon rna expressing heterologous nucleic acids
WO2017176963A1 (en) 2016-04-06 2017-10-12 Greenlight Biosciences, Inc. Cell-free production of ribonucleic acid
SG11202003192UA (en) 2017-10-11 2020-05-28 Greenlight Biosciences Inc Methods and compositions for nucleoside triphosphate and ribonucleic acid production

Also Published As

Publication number Publication date
US20220096525A1 (en) 2022-03-31
BR112023000784A2 (en) 2023-03-28
KR20230040353A (en) 2023-03-22
IL299446A (en) 2023-02-01
WO2022016077A1 (en) 2022-01-20
CA3184980A1 (en) 2022-01-20
AU2021308688A1 (en) 2023-02-16
EP4182332A1 (en) 2023-05-24
JP2023535381A (en) 2023-08-17

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination