CN116782878A

CN116782878A - Improved compositions for delivery of codon optimized mRNA

Info

Publication number: CN116782878A
Application number: CN202180089274.0A
Authority: CN
Inventors: J·安德罗萨维奇; L·博格林; S·夏尔马; 孙刚; N·考沙尔; S·卡尔维
Original assignee: Translate Bio, Inc.
Current assignee: Translate Bio, Inc.
Priority date: 2020-11-09
Filing date: 2021-11-09
Publication date: 2023-09-19

Abstract

The present application provides, inter alia, improved pharmaceutical compositions comprising a codon-optimized mRNA encoding a peptide or polypeptide, encapsulated in a lipid nanoparticle comprising one or more cationic lipids that are particularly effective for pulmonary delivery.

Description

Improved compositions for delivery of codon optimized mRNA

Cross Reference to Related Applications

The present application claims priority to U.S. Provisional Application Ser. No. 63/111,321, filed November 9, 2020, and U.S. Provisional Application Ser. No. 63/195,581, filed 6/2021, the disclosures of each of which are hereby incorporated by reference.

Sequence listing

This specification refers to a sequence listing (submitted electronically as a .txt file named "MRT-2205WO_ST25" at month 9 of 2021). The .txt file was generated on November 5, 2021, and is 92 KB in size. The entire contents of the sequence listing are incorporated herein by reference.

Background

Messenger RNA Therapy (MRT) is becoming an increasingly important method for treating a variety of diseases. MRT involves administering messenger RNA (mRNA) to a patient in need of therapy for producing a protein encoded by the mRNA in the patient. Lipid nanoparticles are commonly used to encapsulate mRNA for efficient in vivo delivery of mRNA.

The success of an mRNA therapeutic depends on carefully designed delivery systems that direct the mRNA into the desired compartments of the selected cells. However, humans and other organisms have evolved natural barriers that protect their bodies against pathogens and other intruders. For example, the lungs contain mucus that traps microorganisms and particles, which are then cleared from the lungs by the coordinated beating of motile cilia.

Disclosure of Invention

The invention provides, inter alia, pharmaceutical compositions comprising messenger RNA (mRNA) and methods of making and using the same. Notably, the cationic lipids described herein exhibit increased efficacy of pulmonary delivery and increased protein expression. In addition, the mRNAs described herein are codon optimized and exhibit increased levels of protein expression and activity. These pharmaceutical compositions may be used for improved treatment of pulmonary diseases.

The invention provides, inter alia, pharmaceutical compositions for delivery to a subject (e.g., a human subject or a cell of a human subject or a cell treated and delivered to a human subject) comprising a codon-optimized mRNA molecule encoding a peptide or polypeptide encapsulated in a lipid nanoparticle comprising one or more of the cationic lipids described herein.

In one aspect, the invention provides, inter alia, a composition for pulmonary delivery comprising mRNA encapsulated in a lipid nanoparticle, wherein the lipid nanoparticle comprises a cationic lipid that is GL-TES-SA-DMP-E18-2.

In one aspect, the invention provides, inter alia, a composition for pulmonary delivery comprising mRNA encapsulated in a lipid nanoparticle, wherein the lipid nanoparticle comprises a cationic lipid that is GL-TES-SA-DME-E18-2.

In one aspect, the invention provides, inter alia, a composition for pulmonary delivery comprising mRNA encapsulated in a lipid nanoparticle, wherein the lipid nanoparticle comprises a cationic lipid that is TL1-01D-DMA.

In one aspect, the invention provides, inter alia, a composition for pulmonary delivery comprising mRNA encapsulated in a lipid nanoparticle, wherein the lipid nanoparticle comprises a cationic lipid that is TL1-04D-DMA.

In one aspect, the invention provides, inter alia, a composition for pulmonary delivery comprising mRNA encapsulated in a lipid nanoparticle, wherein the lipid nanoparticle comprises a cationic lipid that is SY-3-E14-DMAPr.

In one aspect, the invention provides, inter alia, a composition for pulmonary delivery comprising mRNA encapsulated in a lipid nanoparticle, wherein the lipid nanoparticle comprises a cationic lipid that is TL1-10D-DMA.

In one aspect, the invention provides, inter alia, a composition for pulmonary delivery comprising mRNA encapsulated in a lipid nanoparticle, wherein the lipid nanoparticle comprises a cationic lipid that is HEP-E3-E10.

In one aspect, the invention provides, inter alia, a composition for pulmonary delivery comprising mRNA encapsulated in a lipid nanoparticle, wherein the lipid nanoparticle comprises a cationic lipid that is HEP-E4-E10.

In one aspect, the invention provides, inter alia, a composition for pulmonary delivery comprising mRNA encapsulated in a lipid nanoparticle, wherein the lipid nanoparticle comprises a cationic lipid that is Guan-SS-Chol.

In some embodiments, the mRNA is codon optimized.

In some embodiments, the lipid nanoparticle comprises one or more non-cationic lipids and one or more PEG-modified lipids. In some embodiments, the liposome comprises no more than three different lipid components. In some embodiments, the liposome comprises no more than four different lipid components.

In some embodiments, the liposome comprises four different lipid components, namely cationic lipids, non-cationic lipids, cholesterol, and PEG-modified lipids. In some embodiments, the molar ratio of cationic lipid to non-cationic lipid to cholesterol to PEG-modified lipid is between about 30-60:25-35:20-30:1-15, respectively.
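The four-component ratio above can be checked mechanically. The following is a minimal sketch, assuming the ranges 30-60:25-35:20-30:1-15 are read as mol % of total lipid and ignoring the tolerance implied by "about"; the names `RANGES`, `to_mol_percent`, and `within_ranges` are hypothetical and do not appear in the specification.

```python
# Ranges from the four-component embodiment, read here as mol % of total lipid.
# This reading is an assumption; the qualifier "about" is not modeled.
RANGES = {
    "cationic": (30, 60),
    "non_cationic": (25, 35),
    "cholesterol": (20, 30),
    "peg": (1, 15),
}


def to_mol_percent(parts):
    """Convert molar parts given on any scale to percentages of the total."""
    total = sum(parts.values())
    return {name: 100.0 * value / total for name, value in parts.items()}


def within_ranges(parts, ranges=RANGES):
    """Return True if every component's mol % lies inside its range."""
    percent = to_mol_percent(parts)
    return all(low <= percent[name] <= high for name, (low, high) in ranges.items())


# Hypothetical formulation expressed as molar parts (40:30:25:5).
example = {"cationic": 40, "non_cationic": 30, "cholesterol": 25, "peg": 5}
print(within_ranges(example))
```

Because the parts are normalized first, the same check applies whether a formulation is written as 40:30:25:5 or as 8:6:5:1.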

In some embodiments, the liposome comprises three different lipid components, namely a cationic lipid (typically a sterol-based cationic lipid), a non-cationic lipid, and a PEG-modified lipid. In some embodiments, the molar ratio of cationic lipid to non-cationic lipid to PEG-modified lipid is about 60:35:5, respectively.

In some embodiments, the lipid nanoparticle comprises one or more cholesterol-based lipids.

In some embodiments, the lipid nanoparticle comprises a non-cationic lipid that is DOPE.

In some embodiments, the lipid nanoparticle comprises a non-cationic lipid that is DEPE.

In some embodiments, the lipid nanoparticle comprises a PEG-modified lipid that is DMG-PEG 2K.

In some embodiments, the mRNA encodes a cystic fibrosis transmembrane conductance regulator (CFTR) protein. In some embodiments, the mRNA encodes an ATP-binding cassette subfamily A member 3 protein. In some embodiments, the mRNA encodes a dynein axonemal intermediate chain 1 (DNAI1) protein. In some embodiments, the mRNA encodes a dynein axonemal heavy chain 5 (DNAH5) protein. In some embodiments, the mRNA encodes an alpha-1-antitrypsin protein. In some embodiments, the mRNA encodes a forkhead box P3 (FOXP3) protein. In some embodiments, the mRNA encodes one or more surfactant proteins. In some embodiments, the mRNA encodes a surfactant A protein, a surfactant B protein, a surfactant C protein, or a surfactant D protein.

In certain embodiments, the invention provides a pharmaceutical composition for delivery to the lung or lung cells of a subject, the pharmaceutical composition comprising a codon-optimized mRNA molecule encoding a peptide or polypeptide encapsulated in a lipid nanoparticle comprising one or more of the cationic lipids described herein. In certain embodiments, the codon-optimized mRNA encapsulated in the lipid nanoparticle encodes a cystic fibrosis transmembrane conductance regulator (CFTR) protein. In certain embodiments, the codon-optimized mRNA encapsulated in the lipid nanoparticle encodes an ATP-binding cassette subfamily A member 3 protein. In certain embodiments, the codon-optimized mRNA encapsulated in the lipid nanoparticle encodes a dynein axonemal intermediate chain 1 (DNAI1) protein. In certain embodiments, the codon-optimized mRNA encapsulated in the lipid nanoparticle encodes a dynein axonemal heavy chain 5 (DNAH5) protein. In certain embodiments, the codon-optimized mRNA encapsulated in the lipid nanoparticle encodes an alpha-1-antitrypsin protein. In certain embodiments, the codon-optimized mRNA encapsulated in the lipid nanoparticle encodes a forkhead box P3 (FOXP3) protein. In certain embodiments, the codon-optimized mRNA encapsulated in such lipid nanoparticles encodes one or more surfactant proteins, such as one or more of surfactant A protein, surfactant B protein, surfactant C protein, and surfactant D protein.

In certain embodiments, the codon-optimized mRNA encapsulated in such lipid nanoparticles encodes an antigen. In certain embodiments, the codon-optimized mRNA encapsulated in such lipid nanoparticles encodes an antigen associated with cancer in a subject or an antigen identified from cancer cells in a subject. In certain embodiments, the codon-optimized mRNA encapsulated in such lipid nanoparticles encodes an antigen determined from the subject's own cancer cells, i.e., to provide a personalized cancer vaccine.

In certain embodiments, the codon-optimized mRNA encapsulated in such lipid nanoparticles encodes an antibody. In certain embodiments, the antibody may be a bispecific antibody. In certain embodiments, the antibody may be part of a fusion protein. In certain embodiments, the codon-optimized mRNA encapsulated in such lipid nanoparticles encodes an antibody to OX40. In certain embodiments, the codon-optimized mRNA encapsulated in such lipid nanoparticles encodes an antibody to VEGF. In certain embodiments, the codon-optimized mRNA encapsulated in such lipid nanoparticles encodes an antibody to tissue necrosis factor alpha. In certain embodiments, the codon-optimized mRNA encapsulated in such lipid nanoparticles encodes an antibody to CD3. In certain embodiments, the codon-optimized mRNA encapsulated in such lipid nanoparticles encodes an antibody to CD19.

In certain embodiments, the codon-optimized mRNA encapsulated in the lipid nanoparticle encodes an immunomodulatory agent. In certain embodiments, the codon-optimized mRNA encapsulated in the lipid nanoparticle encodes interleukin 12. In certain embodiments, the codon-optimized mRNA encapsulated in the lipid nanoparticle encodes interleukin 23. In certain embodiments, the codon-optimized mRNA encapsulated in the lipid nanoparticle encodes interleukin 36γ. In certain embodiments, the codon-optimized mRNA encapsulated in such lipid nanoparticles encodes a constitutively active variant of one or more stimulator of interferon genes (STING) proteins.

In certain embodiments, the codon-optimized mRNA encapsulated in the lipid nanoparticle encodes an endonuclease. In certain embodiments, the codon-optimized mRNA encapsulated in such lipid nanoparticles encodes an RNA-guided DNA endonuclease protein, such as a Cas9 protein. In certain embodiments, the codon-optimized mRNA encapsulated in the lipid nanoparticle encodes a meganuclease protein. In certain embodiments, the codon-optimized mRNA encapsulated in the lipid nanoparticle encodes a transcription activator-like effector nuclease protein. In certain embodiments, the codon-optimized mRNA encapsulated in the lipid nanoparticle encodes a zinc finger nuclease protein.

In some embodiments, the mRNA molecule comprises a 5' untranslated region (UTR). In some embodiments, the mRNA molecule comprises a 3' untranslated region (UTR). In some embodiments, the 5' untranslated region (UTR) comprises SEQ ID NO:12. In some embodiments, the 3' untranslated region (UTR) comprises SEQ ID NO:13. In some embodiments, the 3' untranslated region (UTR) comprises SEQ ID NO:14.

In some embodiments, the mRNA molecule further comprises a poly(A) tail. In some embodiments, the mRNA molecule further comprises a poly(A) tail that is at least 70 residues in length. In some embodiments, the mRNA molecule further comprises a poly(A) tail that is at least 100 residues in length. In some embodiments, the mRNA molecule further comprises a poly(A) tail that is at least 120 residues in length. In some embodiments, the mRNA molecule further comprises a poly(A) tail that is at least 150 residues in length. In some embodiments, the mRNA molecule further comprises a poly(A) tail that is at least 200 residues in length. In some embodiments, the mRNA molecule further comprises a poly(A) tail that is at least 250 residues in length.

In some embodiments, the mRNA molecule comprises a 5' cap.

In some embodiments, the mRNA molecule comprises at least one non-standard nucleobase. In some embodiments, the nonstandard nucleobase is selected from one or more of 5-methyl-cytidine, pseudouridine, and 2-thio-uridine.

In some embodiments, the mRNA molecules are used to induce functional CFTR expression in a mammal or mammalian cell.

In some embodiments, the functional protein expression induced by the codon-optimized mRNA molecule is at least 1.2-fold greater than the protein expression induced by a non-codon-optimized mRNA molecule. In some embodiments, the functional protein expression induced by the codon-optimized mRNA molecule is at least 1.5-fold greater than the protein expression induced by a non-codon-optimized mRNA molecule. In some embodiments, the functional protein expression induced by the codon-optimized mRNA molecule is at least 1.8-fold greater than the protein expression induced by a non-codon-optimized mRNA molecule. In some embodiments, the functional protein expression induced by the codon-optimized mRNA molecule is at least 2-fold greater than the protein expression induced by a non-codon-optimized mRNA molecule. In some embodiments, the functional protein expression induced by the codon-optimized mRNA molecule is at least 2.3-fold greater than the protein expression induced by a non-codon-optimized mRNA molecule. In some embodiments, the functional protein expression induced by the codon-optimized mRNA molecule is at least 2.5-fold greater than the protein expression induced by a non-codon-optimized mRNA molecule. In some embodiments, the functional protein expression induced by the codon-optimized mRNA molecule is at least 2.8-fold greater than the protein expression induced by a non-codon-optimized mRNA molecule. In some embodiments, the functional protein expression induced by the codon-optimized mRNA molecule is at least 3.0-fold greater than the protein expression induced by a non-codon-optimized mRNA molecule. In some embodiments, the functional protein expression induced by the codon-optimized mRNA molecule is at least 3.2-fold greater than the protein expression induced by a non-codon-optimized mRNA molecule. In some embodiments, the functional protein expression induced by the codon-optimized mRNA molecule is at least 3.5-fold greater than the protein expression induced by a non-codon-optimized mRNA molecule. In some embodiments, the functional protein expression induced by the codon-optimized mRNA molecule is at least 3.7-fold greater than the protein expression induced by a non-codon-optimized mRNA molecule. In some embodiments, the functional protein expression induced by the codon-optimized mRNA molecule is at least 4.0-fold greater than the protein expression induced by a non-codon-optimized mRNA molecule. In some embodiments, the functional protein expression induced by the codon-optimized mRNA molecule is at least 4.5-fold greater than the protein expression induced by a non-codon-optimized mRNA molecule. In some embodiments, the functional protein expression induced by the codon-optimized mRNA molecule is at least 5.0-fold greater than the protein expression induced by a non-codon-optimized mRNA molecule.

In some embodiments, the polynucleotide is a linear polynucleotide comprising deoxynucleotide residues. In some embodiments, the polynucleotide is a circular polynucleotide comprising deoxynucleotide residues.

In some embodiments, the codon optimized mRNA is encapsulated within a nanoparticle. In some embodiments, the nanoparticle is a liposome.

In some embodiments, the liposome comprises one or more cationic lipids, one or more non-cationic lipids, and one or more PEG-modified lipids. In some embodiments, the liposome comprises one or more cholesterol-based lipids. In some embodiments, the liposome comprises one or more cationic lipids, one or more non-cationic lipids, one or more cholesterol-based lipids, and one or more PEG-modified lipids. In some embodiments, the liposome comprises no more than three different lipid components. In some embodiments, one of the different lipid components is a sterol-based cationic lipid. In some embodiments, the sterol-based cationic lipid is an Imidazole Cholesterol Ester (ICE).

In some embodiments, the one or more cationic lipids comprise GL-TES-SA-DME-E18-2. In some embodiments, the one or more cationic lipids comprise TL1-01D-DMA. In some embodiments, the one or more cationic lipids comprise SY-3-E14-DMAPr. In some embodiments, the one or more cationic lipids comprise TL1-10D-DMA. In some embodiments, the one or more cationic lipids comprise Guan-SS-Chol. In some embodiments, the one or more cationic lipids comprise GL-TES-SA-DMP-E18-2. In some embodiments, the one or more cationic lipids comprise HEP-E4-E10. In some embodiments, the one or more cationic lipids comprise HEP-E3-E10. In some embodiments, the one or more cationic lipids comprise TL1-04D-DMA.

In some embodiments, the lipid nanoparticle comprises a cationic lipid that is GL-TES-SA-DME-E18-2. In some embodiments, the lipid nanoparticle comprises a cationic lipid that is TL1-01D-DMA. In some embodiments, the lipid nanoparticle comprises a cationic lipid that is SY-3-E14-DMAPr. In some embodiments, the lipid nanoparticle comprises a cationic lipid that is TL1-10D-DMA. In some embodiments, the lipid nanoparticle comprises a cationic lipid that is Guan-SS-Chol. In some embodiments, the lipid nanoparticle comprises a cationic lipid that is GL-TES-SA-DMP-E18-2. In some embodiments, the lipid nanoparticle comprises a cationic lipid that is HEP-E4-E10. In some embodiments, the lipid nanoparticle comprises a cationic lipid that is HEP-E3-E10. In some embodiments, the lipid nanoparticle comprises a cationic lipid that is TL1-04D-DMA.

In some embodiments, the liposome has a size of less than about 200 nm. In some embodiments, the liposomes have a size of less than about 150 nm. In some embodiments, the liposomes have a size of less than about 120 nm. In some embodiments, the liposomes have a size of less than about 110 nm. In some embodiments, the liposome has a size of less than about 100 nm. In some embodiments, the liposome has a size of less than about 80 nm. In some embodiments, the liposomes have a size of less than about 60 nm. In some embodiments, the liposomes have a size of less than about 50 nm. In some embodiments, the liposomes have a size of less than about 40 nm. In some embodiments, the liposome has a size of less than about 30 nm.

In some embodiments, the pharmaceutical composition further comprises a CFTR potentiator. In some embodiments, the pharmaceutical composition further comprises a CFTR corrector. In some embodiments, the pharmaceutical composition further comprises a CFTR activator. In some embodiments, the pharmaceutical composition further comprises a CFTR potentiator, corrector, and/or activator. Suitable CFTR potentiators, correctors, and/or activators include ivacaftor, lumacaftor, tezacaftor, VX-659, VX-445, VX-152, VX-440, VX-371, VX-561, GLPG1837, GLPG2222, GLPG2737, GLPG2451, GLPG1837, PTI-428, PTI-801, PTI-808, and eluforsen. In some embodiments, the pharmaceutical composition further comprises ivacaftor. In some embodiments, the pharmaceutical composition further comprises lumacaftor. In some embodiments, the pharmaceutical composition further comprises tezacaftor. In some embodiments, the pharmaceutical composition further comprises ivacaftor, lumacaftor, tezacaftor, or a combination thereof. In some embodiments, the pharmaceutical composition further comprises VX-659. In some embodiments, the pharmaceutical composition further comprises VX-445. In some embodiments, the pharmaceutical composition further comprises VX-152. In some embodiments, the pharmaceutical composition further comprises VX-440. In some embodiments, the pharmaceutical composition further comprises VX-371. In some embodiments, the pharmaceutical composition further comprises VX-561. In some embodiments, the pharmaceutical composition further comprises GLPG1837. In some embodiments, the pharmaceutical composition further comprises GLPG2222. In some embodiments, the pharmaceutical composition further comprises GLPG2737. In some embodiments, the pharmaceutical composition further comprises GLPG2451. In some embodiments, the pharmaceutical composition further comprises GLPG1837. In some embodiments, the pharmaceutical composition further comprises PTI-428. In some embodiments, the pharmaceutical composition further comprises PTI-801. In some embodiments, the pharmaceutical composition further comprises PTI-808. In some embodiments, the pharmaceutical composition further comprises eluforsen. In some embodiments, the pharmaceutical composition further comprises any combination of CFTR potentiators, correctors, and/or activators.

In one aspect, the invention provides a method of inducing protein expression in epithelial cells in the lung of a mammal, the method comprising the step of contacting the epithelial cells in the lung of the mammal with a pharmaceutical composition of the invention.

In some embodiments, the codon optimized mRNA is administered via pulmonary delivery. In some embodiments, the codon optimized mRNA is administered via intravenous delivery. In some embodiments, the codon optimized mRNA is administered via: oral, rectal, vaginal, transmucosal or intestinal administration; parenteral delivery, including intradermal, transdermal (topical), intramuscular, subcutaneous, intramedullary injections, as well as intrathecal, direct intraventricular, intravenous, intraperitoneal, and/or intranasal administration.

In some embodiments, the pulmonary delivery is nebulization. In some embodiments, the codon optimized mRNA is administered via aerosolization.

In some embodiments, treating the subject is achieved at a lower therapeutically effective dose as compared to treating the subject with a non-codon optimized mRNA encoding a wild-type protein.

In some embodiments, treating a subject in need thereof results in a shorter nebulization time to administer a therapeutically effective dose compared to treatment with a non-codon optimized mRNA encoding a wild-type protein.

Drawings

The drawings are for illustration purposes only and are not intended to be limiting.

FIGS. 1A and 1B illustrate a method of generating an optimized nucleotide sequence according to the present invention. As shown in FIG. 1A, the method receives an amino acid sequence of interest and a first codon usage table reflecting the frequency of each codon in a given organism (e.g., a mammal or human). If a codon is associated with a codon usage frequency below a threshold frequency (e.g., 10%), the method removes that codon from the first codon usage table. The codon usage frequencies of the codons not removed in the first step are normalized to generate a normalized codon usage table. The method uses the normalized codon usage table to generate a list of optimized nucleotide sequences. Each optimized nucleotide sequence encodes the amino acid sequence of interest. As shown in FIG. 1B, the list of optimized nucleotide sequences is further processed by applying a motif screening filter, a guanine-cytosine (GC) content analysis filter, and a Codon Adaptation Index (CAI) analysis filter, in that order, to generate an updated list of optimized nucleotide sequences.
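The workflow of FIGS. 1A and 1B can be sketched in a few lines of Python. This is an illustrative sketch only, not the implementation of the invention. The usage fractions in `USAGE` are made-up placeholders rather than real human codon frequencies, all function names are hypothetical, sampling codons in proportion to their normalized frequency is an assumed way of generating the candidate list, and computing CAI against the normalized table is likewise an assumption.

```python
import math
import random

# Illustrative (NOT real) codon usage fractions for four amino acids.
USAGE = {
    "M": {"AUG": 1.0},
    "F": {"UUU": 0.45, "UUC": 0.55},
    "L": {"UUA": 0.07, "UUG": 0.13, "CUU": 0.13,
          "CUC": 0.20, "CUA": 0.07, "CUG": 0.40},
    "K": {"AAA": 0.42, "AAG": 0.58},
}


def normalize_table(usage, threshold=0.10):
    """Drop codons used below the threshold, then renormalize per amino acid."""
    normalized = {}
    for amino_acid, codons in usage.items():
        kept = {c: f for c, f in codons.items() if f >= threshold}
        if not kept:  # safeguard: keep the most frequent codon
            best = max(codons, key=codons.get)
            kept = {best: codons[best]}
        total = sum(kept.values())
        normalized[amino_acid] = {c: f / total for c, f in kept.items()}
    return normalized


def generate_candidates(protein, table, count, seed=0):
    """Sample candidate sequences, one codon per residue, weighted by frequency."""
    rng = random.Random(seed)
    candidates = []
    for _ in range(count):
        codons = [
            rng.choices(list(table[aa]), weights=list(table[aa].values()))[0]
            for aa in protein
        ]
        candidates.append("".join(codons))
    return candidates


def gc_content(sequence):
    """Fraction of G and C nucleotides in the sequence."""
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def codon_adaptation_index(sequence, table):
    """Geometric mean of each codon's frequency relative to its best synonym."""
    weights = {}
    for codons in table.values():
        top = max(codons.values())
        for codon, freq in codons.items():
            weights[codon] = freq / top
    codons = [sequence[i:i + 3] for i in range(0, len(sequence), 3)]
    return math.exp(sum(math.log(weights[c]) for c in codons) / len(codons))


def apply_filters(candidates, table, motifs, gc_range, min_cai):
    """Apply the motif, GC content, and CAI filters in that order."""
    passed = [s for s in candidates if not any(m in s for m in motifs)]
    passed = [s for s in passed if gc_range[0] <= gc_content(s) <= gc_range[1]]
    return [s for s in passed if codon_adaptation_index(s, table) >= min_cai]


table = normalize_table(USAGE)
candidates = generate_candidates("MFLK", table, count=20)
# Hypothetical filter settings chosen only for demonstration.
shortlist = apply_filters(candidates, table, motifs=["UUUUU"],
                          gc_range=(0.3, 0.7), min_cai=0.5)
print(len(candidates), len(shortlist))
```

In this toy table the two leucine codons at 7% fall below the 10% threshold and are removed, and the remaining four are rescaled so that their frequencies again sum to one.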

FIG. 2A shows an exemplary Western blot for determining the protein expression yield of CFTR protein encoded by an optimized nucleotide sequence generated according to the method of the invention in a time course experiment after transfection of the optimized nucleotide sequence into human cells. Fig. 2B shows an exemplary line graph depicting quantification of western blot data depicted in fig. 2A.

Fig. 3A shows an exemplary plot of data obtained from a bioassay for testing mRNA containing optimized nucleotide sequences encoding hCFTR. It depicts the short-circuit current (I_SC) output measured in an Ussing epithelial voltage clamp apparatus for each mRNA tested. Fig. 3B shows an exemplary bar graph demonstrating the change in hCFTR activity depicted in Fig. 3A, expressed as a percentage of the activity of a reference mRNA encoding hCFTR.

Fig. 4A is an exemplary gel depicting the banding pattern of various CFTR sequences. Fig. 4B is a bar graph depicting the relative expression of various CFTR sequences in the C-band. Fig. 4C is a bar graph depicting CFTR mRNA efficacy of various CFTR sequences.

Fig. 5A and 5B are exemplary graphs showing short-circuit conductivities of various codon-optimized and non-codon-optimized CFTR constructs measured in an Ussing chamber. Fig. 5C is an exemplary bar graph of the maximum activation current of various CFTR constructs in the Ussing chamber assay at 22 and 44 hours.

Fig. 6 is an exemplary bar graph depicting the results of an assay to evaluate cytotoxicity of various CFTR mRNA sequences.

FIG. 7 is an exemplary bar graph depicting the radiance produced by luciferase protein expressed in mice after administration of mRNA-LNPs (each comprising a different cationic lipid component). The horizontal line in the figure at 10⁴ p/s/cm²/sr represents the historical radiance/expression of pulmonary delivered (e.g., aerosolized) FFL mRNA encapsulated in LNP comprising ICE as the cationic lipid. The horizontal line in the figure at 10⁶ p/s/cm²/sr represents the historical radiance/expression of pulmonary delivered (e.g., aerosolized) FFL mRNA encapsulated in LNP comprising ML2 as the cationic lipid. These thresholds may be used to screen lipids for pulmonary delivery.

Fig. 8A is an exemplary bar graph depicting the amount of mRNA delivered to lung tissue as determined by RT-qPCR. Fig. 8B is an exemplary graph depicting the amount of mCherry protein produced by each amount of mCherry mRNA delivered to the lung (x-axis).

Fig. 9A is an exemplary whole-mouse image showing the radiance of firefly luciferase expressed from delivered mRNA-LNP. The radiance shows that the mRNA-LNP was efficiently delivered to the mouse lung for in vivo expression. Fig. 9B is an exemplary image of a mouse obtained by cryofluorescence tomography, showing Cre recombinase mRNA expression. The imaging shows that expression of the delivered protein was detected in the lung and the branches of the airways, as indicated by the arrows. Fig. 9C is an exemplary immunohistochemistry image of lungs from mice administered saline or Cre-mRNA LNP via nebulization. Positive bronchiolar epithelial cells, including secretory and/or ciliated cells, are indicated by the arrow labeled "1", and type I and type II pneumocytes are indicated by the arrow labeled "2". Fig. 9D is an exemplary immunofluorescence image of lung sections at 40x and 100x magnification. The CFTR protein expressed from the delivered mRNA is evident at the apical surface of the airways, as indicated by the arrow.

Fig. 10A is an exemplary HBEC-ALI (human bronchial epithelial cell-air-liquid interface) model and timeline for ALI culture growth. Fig. 10B is an exemplary image of differentiated epithelium in the HBEC-ALI model after staining with hematoxylin and eosin (H&E).

FIG. 11A is an exemplary bar graph depicting the amount of luminescence (on a logarithmic scale) produced by luciferase protein in cells in the HBEC-ALI model after transfection with mRNA-LNP. Fig. 11B is an exemplary bar graph depicting cell integrity as measured by transepithelial electrical resistance (TEER), which is a strong indicator of epithelial integrity.

Fig. 12 is an exemplary receiver operating characteristic (ROC) curve demonstrating that the HBEC-ALI model shows meaningful performance as a classification model for screening and filtering lipids prior to in vivo evaluation.

Fig. 13A is an exemplary graph showing the relative concentration of lipids in the HBEC-ALI model after transfection with mRNA-LNP. The half-life determined by the graph is about 2.9 hours. Fig. 13B is an exemplary table showing half-life values determined by mouse lung and human lung homogenates.

Definitions

In order that the invention may be more readily understood, certain terms are first defined below. Additional definitions of the following terms and other terms are set forth throughout the specification. Publications and other reference materials mentioned herein to describe the background of the invention and to provide additional details concerning the practice of the invention are hereby incorporated by reference.

About or approximately: As used herein, the term "about" or "approximately," as applied to one or more values of interest, refers to a value that is similar to a stated reference value. In certain embodiments, the term "about" or "approximately" refers to a range of values that fall within 25%, 20%, 19%, 18%, 17%, 16%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, or less in either direction (greater than or less than) of the stated reference value, unless otherwise stated or otherwise evident from the context (except where such a number would exceed 100% of a possible value).
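By way of illustration, the percentage window in this definition can be expressed as a simple check. The sketch below assumes a symmetric tolerance around the reference value; the function name `is_about` is hypothetical, and the exception for values that would exceed 100% of a possible value is not modeled.

```python
def is_about(value, reference, tolerance_percent=25):
    """True if value lies within tolerance_percent of reference, in either direction."""
    # Compare scaled integers/floats directly to avoid dividing by the reference.
    return abs(value - reference) * 100 <= tolerance_percent * abs(reference)


# A 108 nm particle against a reference of 100 nm, at a 10% tolerance.
print(is_about(108, 100, tolerance_percent=10))
```

The default of 25% corresponds to the widest window listed in the definition; narrower embodiments simply pass a smaller `tolerance_percent`.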

Batch: As used herein, the term "batch" refers to a quantity or amount of mRNA synthesized at one time (e.g., produced according to a single manufacturing order during the same manufacturing cycle). A batch may refer to an amount of mRNA synthesized in a single reaction that is performed under one set of conditions via a single aliquot of enzyme and/or a single aliquot of DNA template for continuous synthesis. In some embodiments, a batch includes mRNA produced by a reaction in which not all reagents and/or components are supplemented and/or replenished as the reaction proceeds. The term "not in a single batch" does not mean mRNA synthesized at different times that is combined to achieve the desired amount.

Delivery: as used herein, the term "delivery" encompasses both local and systemic delivery. For example, delivery of mRNA encompasses situations where mRNA is delivered to a target tissue and the encoded protein is expressed and retained within the target tissue (also referred to as "local distribution" or "local delivery"); and cases where mRNA is delivered to a target tissue and the encoded protein is expressed and secreted into the patient's circulatory system (e.g., serum), and is distributed systemically and taken up by other tissues (also referred to as "systemic distribution" or "systemic delivery"). In some embodiments, the delivery is pulmonary delivery, including, for example, nebulization.

Encapsulation: as used herein, the term "encapsulation" or grammatical equivalents refers to the process of confining mRNA molecules within a nanoparticle.

Engineered or mutant: as used herein, the term "engineered" or "mutant" or grammatical equivalents refers to a nucleotide or protein sequence that contains one or more modifications as compared to its naturally occurring sequence, including but not limited to deletions, insertions of heterologous nucleic acids or amino acids, inversions, substitutions, or combinations thereof.

Expression: as used herein, "expression" of a nucleic acid sequence refers to translation of mRNA into a polypeptide, assembly of multiple polypeptides (e.g., heavy or light chains of an antibody) into an intact protein (e.g., an antibody), and/or post-translational modification of the polypeptide or the fully assembled protein (e.g., an antibody). In the present application, the terms "express" and "produce" and their grammatical equivalents are used interchangeably.

Functionality: as used herein, a "functional" biomolecule is a biomolecule in a form in which it exhibits its characteristic properties and/or activity.

Half-life: as used herein, the term "half-life" is the time required for a quantity, such as nucleic acid or protein concentration or activity, to fall to half of its value as measured at the beginning of a time period.

Improvement, increase or decrease: as used herein, the terms "improve," "increase," or "decrease," or grammatical equivalents, refer to a value relative to a baseline measurement, such as a measurement in the same individual prior to initiation of a treatment described herein, or a measurement in a control subject (or multiple control subjects) in the absence of a treatment described herein. A "control subject" is a subject afflicted with the same form of disease as the subject being treated, and is approximately the same age as the subject being treated.

Impurity: as used herein, the term "impurity" refers to a substance within a confined amount of liquid, gas, or solid that differs from the chemical composition of the target material or compound. Impurities are also known as contaminants.

In vitro: as used herein, the term "in vitro" refers to events that occur in an artificial environment (e.g., in a tube or reaction vessel, in cell culture, etc.), rather than within a multicellular organism.

In vivo: as used herein, the term "in vivo" refers to events that occur within multicellular organisms (e.g., humans and non-human animals). In the case of a cell-based system, the term may be used to refer to events that occur within living cells (as opposed to, for example, in vitro systems).

Isolated: as used herein, the term "isolated" refers to a substance and/or entity that has been (1) separated from at least some of the components with which it was originally associated (in nature and/or in an experimental setting), and/or (2) artificially created, prepared and/or manufactured. The isolated substance and/or entity may be separated from about 10%, about 20%, about 30%, about 40%, about 50%, about 60%, about 70%, about 80%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or more than about 99% of the other components with which it was originally associated. In some embodiments, the isolated agent is about 80%, about 85%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or more than about 99% pure. As used herein, a substance is "pure" if the substance is substantially free of other components. As used herein, calculation of the percent purity of an isolated substance and/or entity should not include excipients (e.g., buffers, solvents, water, etc.).

Messenger RNA (mRNA): as used herein, the term "messenger RNA (mRNA)" refers to a polynucleotide encoding at least one polypeptide. mRNA as used herein encompasses both modified and unmodified RNAs. An mRNA may contain one or more coding and non-coding regions. mRNA can be purified from natural sources, produced using recombinant expression systems and optionally purified, chemically synthesized, and the like. Where appropriate, for example, in the case of a chemically synthesized molecule, the mRNA may comprise nucleoside analogs (e.g., analogs of bases or sugars with chemical modifications), backbone modifications, and the like. Unless otherwise indicated, mRNA sequences are presented in the 5' to 3' direction.

Nucleic acid: as used herein, the term "nucleic acid" is used in its broadest sense to refer to any compound and/or substance that is or can be incorporated into a polynucleotide strand. In some embodiments, the nucleic acid is a compound and/or substance that is incorporated or can be incorporated into the polynucleotide strand via a phosphodiester linkage. In some embodiments, "nucleic acid" refers to individual nucleic acid residues (e.g., nucleotides and/or nucleosides). In some embodiments, "nucleic acid" refers to a polynucleotide strand comprising individual nucleic acid residues. In some embodiments, "nucleic acid" encompasses RNA as well as single and/or double stranded DNA and/or cDNA. Furthermore, the terms "nucleic acid," "DNA," "RNA," and/or similar terms include nucleic acid analogs, i.e., analogs having a backbone other than a phosphodiester. For example, so-called "peptide nucleic acids" known in the art and having peptide bonds in the backbone in place of phosphodiester bonds are considered to be within the scope of the present invention. The term "nucleotide sequence encoding an amino acid sequence" includes all nucleotide sequences that are degenerate versions of each other and/or encode the same amino acid sequence. The nucleotide sequence encoding the protein and/or RNA may include introns. Nucleic acids can be purified from natural sources, produced using recombinant expression systems and optionally purified, chemically synthesized, and the like. Where appropriate, for example, in the case of a chemically synthesized molecule, the nucleic acid may comprise nucleoside analogs (e.g., analogs of bases or sugars with chemical modifications), backbone modifications, and the like. Unless otherwise indicated, the nucleic acid sequences are presented in the 5 'to 3' direction. 
In some embodiments, the nucleic acid is or comprises a natural nucleoside (e.g., adenosine, thymidine, guanosine, cytidine, uridine, deoxyadenosine, deoxythymidine, deoxyguanosine, and deoxycytidine); nucleoside analogs (e.g., 2-aminoadenosine, 2-thiothymidine, inosine, pyrrolo-pyrimidine, 3-methyladenosine, 5-methylcytidine, C-5 propynyl-cytidine, C-5 propynyl-uridine, 2-aminoadenosine, C5-bromouridine, C5-fluorouridine, C5-iodouridine, C5-propynyl-uridine, C5-propynyl-cytidine, C5-methylcytidine, 2-aminoadenosine, 7-deazaadenosine, 7-deazaguanosine, 8-oxo-guanosine, O(6)-methylguanine, and 2-thiocytidine); chemically modified bases; biologically modified bases (e.g., methylated bases); intercalated bases; modified sugars (e.g., 2'-fluororibose, ribose, 2'-deoxyribose, arabinose, and hexose); and/or modified phosphate groups (e.g., phosphorothioate and 5'-N-phosphoramidite linkages). In some embodiments, the invention relates specifically to "unmodified nucleic acids," meaning nucleic acids (e.g., polynucleotides and residues, including nucleotides and/or nucleosides) that have not been chemically modified to facilitate or effect delivery. In some embodiments, the nucleotides T and U are used interchangeably in the sequence description.

Patient: as used herein, the term "patient" or "subject" refers to any organism to which the provided compositions can be administered, e.g., for experimental, diagnostic, prophylactic, cosmetic, and/or therapeutic purposes. Typical patients include animals (e.g., mammals such as mice, rats, rabbits, non-human primates, and/or humans). In some embodiments, the patient is a human. Humans include prenatal and postnatal forms.

Pharmaceutically acceptable: the term "pharmaceutically acceptable" as used herein refers to the following: suitable for use in contact with human and animal tissue without undue toxicity, irritation, allergic response, or other problem or complication, commensurate with a reasonable benefit/risk ratio, within the scope of sound medical judgment.

Stable: as used herein, the term "stable" protein or grammatical equivalents thereof refers to a protein that retains its physical stability and/or biological activity. In one embodiment, protein stability is determined based on the percentage of monomeric protein in the solution, with a low percentage of degraded (e.g., fragmented) and/or aggregated protein. In one embodiment, the stabilized engineered protein retains or exhibits an increased half-life as compared to the wild-type protein. In one embodiment, the stabilized engineered protein is less prone than the wild-type protein to ubiquitination that results in proteolysis.

Subject: as used herein, the term "subject" refers to a human or any non-human animal (e.g., mouse, rat, rabbit, dog, cat, cow, pig, sheep, horse, or primate). Humans include prenatal and postnatal forms. In many embodiments, the subject is a human. The subject may be a patient, which refers to a person presenting to a medical provider for diagnosis or treatment of a disease. The term "subject" is used interchangeably herein with "individual" or "patient." The subject may be afflicted with or susceptible to a disease or disorder, but may or may not exhibit symptoms of the disease or disorder.

Substantially: as used herein, the term "substantially" refers to the qualitative condition of exhibiting total or near-total extent or degree of a characteristic or property of interest. Those of ordinary skill in the biological arts will appreciate that biological and chemical phenomena rarely, if ever, go to completion and/or proceed to completeness or achieve or avoid an absolute result. Thus, the term "substantially" is used herein to capture the potential lack of completeness inherent in many biological and chemical phenomena.

Treatment: as used herein, the term "treat," "treatment," or "treating" refers to any method used to partially or completely alleviate, ameliorate, relieve, inhibit, prevent, delay the onset of, reduce the severity of, and/or reduce the incidence of one or more symptoms or features of a particular disease, disorder, and/or condition. Treatment may be administered to a subject who does not exhibit signs of a disease and/or exhibits only early signs of the disease, for the purpose of reducing the risk of developing a condition associated with the disease.

Detailed Description

The present invention provides pharmaceutical compositions for delivery to a subject (e.g., a human subject or cells of a human subject or cells treated and delivered to a human subject) comprising a codon-optimized mRNA molecule encoding a peptide or polypeptide encapsulated in a lipid nanoparticle comprising one or more of the cationic lipids described herein. The pharmaceutical compositions described herein comprising codon optimized mRNA and/or cationic lipids are particularly effective in treating diseases by pulmonary administration.

Thus, in certain embodiments, the invention provides a pharmaceutical composition for delivery to the lung or lung cells of a subject, the pharmaceutical composition comprising a codon-optimized mRNA molecule encoding a peptide or polypeptide encapsulated in a lipid nanoparticle comprising one or more of the cationic lipids described herein. In certain embodiments, the codon-optimized mRNA encapsulated in the lipid nanoparticle encodes a cystic fibrosis transmembrane conductance regulator (CFTR) protein. In certain embodiments, the codon-optimized mRNA encapsulated in the lipid nanoparticle encodes an ATP-binding cassette subfamily A member 3 (ABCA3) protein. In certain embodiments, the codon-optimized mRNA encapsulated in the lipid nanoparticle encodes a dynein axonemal intermediate chain 1 (DNAI1) protein. In certain embodiments, the codon-optimized mRNA encapsulated in the lipid nanoparticle encodes a dynein axonemal heavy chain 5 (DNAH5) protein. In certain embodiments, the codon-optimized mRNA encapsulated in the lipid nanoparticle encodes an alpha-1-antitrypsin protein. In certain embodiments, the codon-optimized mRNA encapsulated in the lipid nanoparticle encodes a forkhead box P3 (FOXP3) protein. In certain embodiments, the codon-optimized mRNA encapsulated in such lipid nanoparticles encodes one or more surfactant proteins, such as one or more of surfactant A protein, surfactant B protein, surfactant C protein, and surfactant D protein.

Cystic fibrosis

The invention may be used to treat subjects suffering from or susceptible to cystic fibrosis. Cystic fibrosis is a genetic disorder characterized by mutations in the cystic fibrosis transmembrane conductance regulator (CFTR) gene. The CFTR protein functions as a channel across the membrane of cells that produce mucus, sweat, saliva, tears, and digestive enzymes. The channel transports negatively charged particles, called chloride ions, into and out of the cell. The transport of chloride ions helps control the movement of water in tissues, which is necessary for the production of thin, freely flowing mucus. Mucus is a slippery substance that lubricates and protects the lining of the airways, digestive system, reproductive system, and other organs and tissues.

Respiratory symptoms of cystic fibrosis include: a persistent cough that produces thick mucus (sputum), wheezing, shortness of breath, exercise intolerance, recurrent lung infections, and inflamed nasal passages or a stuffy nose. Digestive symptoms of cystic fibrosis include: foul-smelling, greasy stools, poor weight gain and growth, intestinal blockage, particularly in newborns (meconium ileus), and severe constipation.

Codon optimized mRNA

In some embodiments, the invention provides methods and compositions for delivering codon-optimized mRNA encoding a peptide or polypeptide to a subject for treating a disease. Suitable codon-optimized mRNAs encode any full length, fragment, or portion of a protein that can replace naturally occurring protein activity and/or reduce the intensity, severity, and/or frequency of one or more symptoms associated with a disease.

According to an increasing number of studies, mRNA contains multiple layers of information that overlap with the amino acid code. Traditionally, codon optimization has been used to remove rare codons that are believed to be rate-limiting for protein expression. While both fast-growing bacteria and yeast exhibit strong codon bias in highly expressed genes, higher eukaryotes exhibit significantly less codon bias, which makes it more difficult to discern codons that may be rate-limiting. Furthermore, it has been found that codon bias by itself does not necessarily yield high expression; other features are also required.

For example, rare codons are associated with slowing translation and creating pause sites, which may be necessary for proper protein folding. Thus, changes in codon usage can provide a mechanism to fine-tune the temporal pattern of elongation, thereby increasing the time available for the protein to adopt its correct conformation. Codon optimization may interfere with this fine-tuning mechanism, resulting in reduced protein translation efficiency or increased numbers of misfolded proteins. Similarly, codon optimization can disrupt the normal pattern of cognate and wobble tRNA usage, thereby affecting protein structure and function, because wobble-dependent slowing of elongation may likewise have been selected as a mechanism for achieving proper protein folding.

Various methods of performing codon optimization are known in the art; however, each has significant drawbacks and limitations from a computational and/or therapeutic standpoint. In particular, known codon optimization methods generally involve, for each amino acid, replacing every codon with the codon having the highest usage for that amino acid, such that the "optimized" sequence contains only one codon encoding each amino acid (and thus may be referred to as a one-to-one sequence).

Despite these obstacles, the inventors have obtained improved codon optimized sequences that enhance protein expression at least two-fold relative to the coding sequence of the wild-type gene. The observed improvement in expression of the codon optimized protein coding sequences is expected to result in improved, more cost-effective mRNA replacement therapy for patients with disease, as it does not require the use of modified nucleotides to make the mRNA, and allows treatment at reduced doses and/or at prolonged dosing intervals.

The genetic code has 64 possible codons. Each codon comprises a sequence of three nucleotides. The frequency of use of each codon in the protein coding region of the genome can be calculated as follows: the number of instances of a particular codon present in the protein-encoding region of the genome is determined, and the obtained value is then divided by the total number of codons encoding the same amino acid within the protein-encoding region of the genome.
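
By way of illustration only, the frequency calculation described above can be sketched as follows. The function name `codon_usage`, the partial codon map `CODON_TO_AA` and the toy input sequences are hypothetical and are not taken from the present application; a real calculation would use the full genetic code and the complete set of protein coding regions of the genome.

```python
from collections import Counter, defaultdict

# Partial codon-to-amino-acid map, kept short for illustration only (assumption).
CODON_TO_AA = {
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "GAT": "D", "GAC": "D",
}


def codon_usage(coding_sequences):
    """Return {amino_acid: {codon: fraction}} for the given coding sequences."""
    counts = Counter()
    for seq in coding_sequences:
        # Walk the sequence codon by codon, ignoring any incomplete trailing codon.
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TO_AA:
                counts[codon] += 1

    # Total number of codons observed for each amino acid.
    totals = defaultdict(int)
    for codon, n in counts.items():
        totals[CODON_TO_AA[codon]] += n

    # Frequency of use = count of the codon / count of all synonymous codons.
    table = defaultdict(dict)
    for codon, n in counts.items():
        aa = CODON_TO_AA[codon]
        table[aa][codon] = n / totals[aa]
    return dict(table)


usage = codon_usage(["GCTGCCGCCAAA", "GCCAAGAAA"])
```

In this toy input, alanine is encoded once by GCT and three times by GCC, so the two codons receive frequencies of one quarter and three quarters respectively.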

A codon usage table contains experimentally derived data on the frequency with which each codon is used to encode a certain amino acid in the particular biological source for which the table was generated. For each codon, this information is expressed as a percentage (0 to 100%) or fraction (0 to 1) of the number of times the codon is used to encode a certain amino acid relative to the total number of times that amino acid is encoded.

The codon usage tables are stored in publicly available databases such as the Codon Usage Database (Nakamura et al (2000) Nucleic Acids Research (1), 292; available online: https://www.kazusa.or.jp/codon/) and the High-performance Integrated Virtual Environment-Codon Usage Tables (HIVE-CUT) database (Athey et al, (2017), BMC Bioinformatics (1), 391; available online: http://hive.biochem.gwu.edu/review/codon).

During the first step of codon optimization, if a codon is associated with a frequency of codon usage below a threshold frequency (e.g., 10%), the codon is removed from a first codon usage table reflecting the frequency of each codon in a given organism (e.g., mammal or human). The codon usage frequencies of the codons not removed in the first step are normalized to generate a normalized codon usage table. An optimized nucleotide sequence encoding an amino acid sequence of interest is generated by selecting a codon for each amino acid in the amino acid sequence based on the frequency of use of the one or more codons associated with the given amino acid in the normalized codon usage table. The probability of selecting a certain codon for a given amino acid is equal to the frequency of use associated with that codon for that amino acid in the normalized codon usage table.

The codon optimized sequences of the invention are generated by a computer-implemented method for generating optimized nucleotide sequences. The method comprises the following steps: (i) receiving an amino acid sequence, wherein the amino acid sequence encodes a peptide, polypeptide, or protein; (ii) receiving a first codon usage table, wherein the first codon usage table comprises a list of amino acids, wherein each amino acid in the table is associated with at least one codon, and each codon is associated with a frequency of use; (iii) removing from the codon usage table any codons that are associated with a frequency of use below a threshold frequency; (iv) generating a normalized codon usage table by normalizing the frequency of use of the codons not removed in step (iii); and (v) generating an optimized nucleotide sequence encoding the amino acid sequence by selecting a codon for each amino acid in the amino acid sequence based on the frequency of use of the one or more codons associated with that amino acid in the normalized codon usage table. The threshold frequency may be in the range of 5%-30%, in particular 5%, 10%, 15%, 20%, 25% or 30%. In the case of the present invention, the threshold frequency is typically 10%.

The step of generating a normalized codon usage table comprises: (a) assigning the frequency of use of each codon associated with a first amino acid and removed in step (iii) to the remaining codons associated with the first amino acid; and (b) repeating step (a) for each amino acid to produce a normalized codon usage table. In some embodiments, the frequency of use of the removed codons is divided equally among the remaining codons. In some embodiments, the frequency of use of the removed codons is apportioned among the remaining codons in proportion to the frequency of use of each remaining codon. "Assigning" in this context may be defined as taking the combined magnitude of the frequencies of use of the removed codons associated with a certain amino acid and apportioning a portion of that combined frequency to each of the remaining codons encoding that amino acid.
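
A minimal sketch of the removal and normalization steps described above is given below, covering both the equal and the proportional apportioning embodiments. The function name `normalize_usage_table`, its parameters and the example frequencies are hypothetical and chosen only for illustration; the 10% default reflects the typical threshold mentioned above.

```python
def normalize_usage_table(table, threshold=0.10, mode="proportional"):
    """Drop codons below `threshold` and reassign their frequency.

    `table` maps amino acid -> {codon: frequency (0 to 1)}.
    `mode` is "equal" or "proportional" (hypothetical parameter names).
    """
    normalized = {}
    for aa, codons in table.items():
        kept = {c: f for c, f in codons.items() if f >= threshold}
        if not kept:
            raise ValueError(f"all codons removed for amino acid {aa!r}")
        # Combined frequency of the codons removed for this amino acid.
        removed_total = sum(f for f in codons.values() if f < threshold)

        if mode == "equal":
            # Each remaining codon receives the same share.
            share = removed_total / len(kept)
            normalized[aa] = {c: f + share for c, f in kept.items()}
        elif mode == "proportional":
            # Each remaining codon receives a share proportional to its own frequency.
            kept_total = sum(kept.values())
            normalized[aa] = {
                c: f + removed_total * f / kept_total for c, f in kept.items()
            }
        else:
            raise ValueError(f"unknown mode: {mode!r}")
    return normalized


# Illustrative frequencies only; not taken from any real codon usage table.
first_table = {"A": {"GCT": 0.26, "GCC": 0.40, "GCA": 0.26, "GCG": 0.08}}
proportional = normalize_usage_table(first_table, mode="proportional")
equal = normalize_usage_table(first_table, mode="equal")
```

In either mode the frequencies of the remaining codons for each amino acid again sum to 1, so that they can be used directly as selection probabilities.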

The step of selecting a codon for each amino acid comprises: (a) identifying one or more codons associated with a first amino acid of the amino acid sequence in the normalized codon usage table; (b) selecting a codon associated with the first amino acid, wherein the probability of selecting a given codon is equal to the frequency of use associated with that codon for the first amino acid in the normalized codon usage table; and (c) repeating steps (a) and (b) until a codon has been selected for each amino acid in the amino acid sequence.

The step of generating an optimized nucleotide sequence by selecting the codon of each amino acid in the amino acid sequence (step (v) in the above method) is performed n times to generate a list of optimized nucleotide sequences.
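
The probabilistic codon selection, repeated n times to produce a list of candidate sequences, might be sketched as follows. The names `generate_sequence` and `generate_candidates`, the fixed random seed and the small table of frequencies are assumptions made for illustration and are not part of the described method.

```python
import random

# Toy normalized codon usage table (illustrative values only).
NORMALIZED_TABLE = {
    "M": {"ATG": 1.0},
    "A": {"GCC": 0.5, "GCT": 0.3, "GCA": 0.2},
    "K": {"AAG": 0.6, "AAA": 0.4},
}


def generate_sequence(protein, table, rng):
    """Pick one codon per amino acid, weighted by its normalized frequency."""
    codons = []
    for aa in protein:
        options = table[aa]
        # random.choices draws with probability proportional to the weights.
        codon = rng.choices(list(options), weights=list(options.values()), k=1)[0]
        codons.append(codon)
    return "".join(codons)


def generate_candidates(protein, table, n, seed=0):
    """Run the selection step n times to build a list of candidate sequences."""
    rng = random.Random(seed)  # seeded only so that the sketch is reproducible
    return [generate_sequence(protein, table, rng) for _ in range(n)]


candidates = generate_candidates("MAKA", NORMALIZED_TABLE, n=5)
```

Because each codon is drawn at random, repeated runs generally yield different nucleotide sequences that all encode the same amino acid sequence, which is what makes the subsequent filtering steps meaningful.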

Motif screening

A motif screening filter is applied to the list of optimized nucleotide sequences. Optimized nucleotide sequences encoding any known negative cis-regulatory elements and negative repeat elements are removed from the list to produce an updated list.

For each optimized nucleotide sequence in the list, it is also determined whether it contains a termination signal. Any nucleotide sequence containing one or more termination signals is removed from the list, resulting in an updated list. In some embodiments, the termination signal has the following nucleotide sequence: 5'-X₁ATCTX₂TX₃-3', wherein X₁, X₂ and X₃ are independently selected from A, C, T or G. In some embodiments, the termination signal has one of the following nucleotide sequences: TATCTGTT; and/or TTTTTT; and/or AAGCTT; and/or GAAGAG; and/or TCTAGA. In some embodiments, the termination signal has the following nucleotide sequence: 5'-X₁AUCUX₂UX₃-3', wherein X₁, X₂ and X₃ are independently selected from A, C, U or G. In some embodiments, the termination signal has one of the following nucleotide sequences: UAUCUGUU; and/or UUUUUU; and/or AAGCUU; and/or GAAGAG; and/or UCUAGA.
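
As a non-limiting sketch, the termination signal screen can be expressed as a pattern search. The regular expression below encodes the general pattern with three variable positions, and `EXPLICIT_SIGNALS` holds a subset of the specific motifs listed above; the function names, and the choice to convert U to T so that DNA and RNA sequences are handled alike, are assumptions made for this example.

```python
import re

# General pattern: X1-ATCT-X2-T-X3, where each X is any of A, C, G or T.
GENERIC_SIGNAL = re.compile(r"[ACGT]ATCT[ACGT]T[ACGT]")

# A subset of the specific motifs listed in the text, in DNA notation.
EXPLICIT_SIGNALS = ("TATCTGTT", "TTTTTT", "AAGCTT", "TCTAGA")


def has_termination_signal(seq):
    """Return True if the sequence contains any screened termination signal."""
    dna = seq.upper().replace("U", "T")  # treat RNA and DNA notation alike
    if GENERIC_SIGNAL.search(dna):
        return True
    return any(motif in dna for motif in EXPLICIT_SIGNALS)


def screen_termination_signals(candidates):
    """Keep only the candidate sequences that are free of termination signals."""
    return [seq for seq in candidates if not has_termination_signal(seq)]


updated = screen_termination_signals(["GCCAAGGCC", "GCCAAGCTTGCC"])
```

Here the second candidate contains AAGCTT and is therefore dropped, leaving only the first candidate in the updated list.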

Guanine-cytosine (GC) content

The method further includes determining a guanine-cytosine (GC) content of each optimized nucleotide sequence in the updated list of optimized nucleotide sequences. The GC content of a sequence is the percentage of bases in the nucleotide sequence that are guanine or cytosine. The list of optimized nucleotide sequences is further updated by: if the GC content of any nucleotide sequence falls outside a predetermined GC content range, the sequence is removed from the list.

Determining the GC content of each optimized nucleotide sequence includes, for each nucleotide sequence: determining the GC content of a first portion of the nucleotide sequence and of one or more additional portions of the nucleotide sequence, wherein the additional portions do not overlap with each other or with the first portion, and wherein updating the list of optimized sequences comprises: removing the nucleotide sequence if the GC content of any portion falls outside a predetermined GC content range, optionally wherein determining the GC content of the nucleotide sequence is stopped as soon as the GC content of any portion is determined to be outside the predetermined GC content range. In some embodiments, the first portion and/or the one or more additional portions of the nucleotide sequence comprise a predetermined number of nucleotides, optionally wherein the predetermined number of nucleotides is within the following range: 5 to 300 nucleotides, or 10 to 200 nucleotides, or 15 to 100 nucleotides, or 20 to 50 nucleotides. In the case of the present invention, the predetermined number of nucleotides is typically 30 nucleotides. The predetermined GC content range may be 15%-75%, or 40%-60%, or 30%-70%. In the case of the present invention, the predetermined GC content range is typically 30%-70%.

In the case of the present invention, a suitable GC content filter may first analyze the first 30 nucleotides of the optimized nucleotide sequence, i.e. nucleotides 1 to 30 of the optimized nucleotide sequence. The analysis may include determining the number of nucleotides in the portion that are G or C, and determining the GC content of the portion may include dividing the number of G or C nucleotides in the portion by the total number of nucleotides in the portion. The result of this analysis is a value describing the proportion of nucleotides in the portion that are G or C, and may be expressed as a percentage, for example 50%, or a fraction, for example 0.5. If the GC content of the first portion falls outside a predetermined GC content range, the optimized nucleotide sequence may be removed from the list of optimized nucleotide sequences.

If the GC content of the first portion falls within the predetermined GC content range, the GC content filter may then analyze the second portion of the optimized nucleotide sequence. In this example, this may be the second 30 nucleotides of the optimized nucleotide sequence, i.e., nucleotides 31 to 60. The portion-by-portion analysis may be repeated until either of the following occurs: a portion is found whose GC content falls outside the predetermined GC content range, in which case the optimized nucleotide sequence can be removed from the list; or the entire optimized nucleotide sequence has been analyzed and no such portion is found, in which case the GC content filter retains the optimized nucleotide sequence in the list and can move to the next optimized nucleotide sequence in the list.
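
The portion-by-portion GC content filter described above can be sketched as follows, using the typical values of 30 nucleotides per portion and a 30%-70% GC content range. The function names are hypothetical, and the handling of a trailing portion shorter than 30 nucleotides (here simply skipped) is an assumption, since the text does not address it.

```python
def gc_fraction(portion):
    """Fraction (0 to 1) of nucleotides in the portion that are G or C."""
    portion = portion.upper()
    return sum(1 for base in portion if base in "GC") / len(portion)


def passes_gc_filter(seq, window=30, low=0.30, high=0.70):
    """Check consecutive non-overlapping portions; stop at the first failure."""
    # Assumption: a trailing portion shorter than `window` is not evaluated.
    for start in range(0, len(seq) - window + 1, window):
        portion = seq[start:start + window]
        if not low <= gc_fraction(portion) <= high:
            return False  # early exit: the sequence would be removed from the list
    return True


def gc_filter(candidates, **kwargs):
    """Keep only the candidates whose every full portion is within range."""
    return [seq for seq in candidates if passes_gc_filter(seq, **kwargs)]


balanced = "GCAT" * 15                     # 60 nt, 50% GC in both portions
at_rich_tail = "GCAT" * 7 + "GC" + "AT" * 15  # second portion has 0% GC
kept = gc_filter([balanced, at_rich_tail])
```

The early return mirrors the optional feature that the determination stops as soon as any portion is found to be outside the predetermined range.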

Codon Adaptation Index (CAI)

The method further includes determining a codon adaptation index for each optimized nucleotide sequence in the most recently updated list of optimized nucleotide sequences. The codon adaptation index of a sequence is a measure of codon usage bias and may be a value between 0 and 1. The most recently updated list of optimized nucleotide sequences is further updated by: removing any nucleotide sequence if its codon adaptation index is less than or equal to a predetermined codon adaptation index threshold. The codon adaptation index threshold may be 0.7, or 0.75, or 0.8, or 0.85, or 0.9. The inventors have found that optimized nucleotide sequences with a codon adaptation index equal to or greater than 0.8 deliver very high protein yields. Thus, in the case of the present invention, the codon adaptation index threshold is typically 0.8.

For each optimized nucleotide sequence, the codon adaptation index may be calculated in any manner apparent to a person skilled in the art, for example as described in the following document: "The codon adaptation index-a measure of directional synonymous codon usage bias, and its potential applications" (Sharp and Li, 1987. Nucleic Acids Research 15(3), pages 1281-1295).

Implementing the codon adaptation index calculation may include the following or a similar method. For each amino acid in the sequence, the weight of each codon in the sequence can be represented by a parameter known as the relative adaptiveness (w_i). The relative adaptiveness can be calculated from the set of reference sequences as the ratio between the observed frequency f_i of the codon and the frequency f_j of the most frequently used synonymous codon for that amino acid. The codon adaptation index of the sequence can then be calculated as the geometric mean of the weights associated with each codon over the length of the sequence (measured in codons). The set of reference sequences used to calculate the codon adaptation index may be the same set of reference sequences from which the codon usage table used with the method of the invention was derived.
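
One possible way to carry out this calculation is sketched below. The function names and the toy reference frequencies are hypothetical. As simplifications, the sketch assumes that every codon in the sequence appears in the reference table with a nonzero frequency, and it does not exclude codons for amino acids that have only a single codon.

```python
import math


def relative_adaptiveness(usage_table):
    """Compute w_i = f_i / f_j, with f_j the frequency of the most used synonymous codon."""
    weights = {}
    for codons in usage_table.values():
        f_max = max(codons.values())
        for codon, f in codons.items():
            weights[codon] = f / f_max
    return weights


def codon_adaptation_index(seq, weights):
    """Geometric mean of the codon weights over the sequence length in codons."""
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    # Summing logarithms avoids numerical underflow for long sequences.
    log_sum = sum(math.log(weights[codon]) for codon in codons)
    return math.exp(log_sum / len(codons))


# Illustrative reference frequencies only.
reference = {
    "A": {"GCC": 0.4, "GCT": 0.2, "GCA": 0.3, "GCG": 0.1},
    "K": {"AAG": 0.6, "AAA": 0.4},
}
w = relative_adaptiveness(reference)
cai_best = codon_adaptation_index("GCCAAG", w)   # only the most used codons
cai_mixed = codon_adaptation_index("GCTAAG", w)  # GCT has a weight of 0.5
```

A sequence built entirely from the most frequently used codons has an index of 1, and replacing GCC with GCT in this toy example lowers the index to the square root of 0.5, i.e. about 0.71, which would fall below a threshold of 0.8.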

Codon optimized CFTR mRNA

In some embodiments, a suitable codon optimized mRNA sequence is an mRNA sequence encoding the human CFTR (hCFTR) protein of SEQ ID NO. 1.

TABLE 1 exemplary codon optimized human CFTR

In some embodiments, a suitable mRNA can be a codon optimized sequence as shown in SEQ ID NOS.2-11.

In some embodiments, a suitable mRNA sequence may be an mRNA sequence of a homolog or analog of a human CFTR protein. For example, a homolog or analog of a human CFTR protein may be a modified human CFTR protein that contains one or more amino acid substitutions, deletions, and/or insertions as compared to a wild-type or naturally occurring human CFTR protein, while substantially retaining the activity of the CFTR protein. In some embodiments, mRNAs suitable for the present invention encode an amino acid sequence that is at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% homologous or more homologous to SEQ ID NO. 1. In some embodiments, mRNA suitable for the present invention encodes a protein that is substantially identical to a human CFTR protein. In some embodiments, mRNAs suitable for the present invention encode an amino acid sequence that is at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% identical or more identical to SEQ ID NO. 1. In general, mRNA according to the invention encodes a CFTR protein having the same amino acid sequence as SEQ ID NO. 1.

In some embodiments, mRNA suitable for the present invention encodes a fragment or portion of a human CFTR protein. In some embodiments, mRNA suitable for the present invention encodes a fragment or portion of a human CFTR protein, wherein the fragment or portion of the protein still retains CFTR activity similar to that of a wild-type protein.

In some embodiments, a suitable mRNA encodes a fusion protein comprising the full length, fragment, or portion of a CFTR protein fused (e.g., N-terminal or C-terminal fused) to another protein. In some embodiments, the protein fused to the full length, fragment or portion of the CFTR protein comprises a signal sequence or a cell targeting sequence.

In some embodiments, mRNAs suitable for the present invention comprise nucleotide sequences that are at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% identical or more to SEQ ID NO. 2, SEQ ID NO. 3, SEQ ID NO. 4, SEQ ID NO. 5, SEQ ID NO. 6, SEQ ID NO. 7, SEQ ID NO. 8, SEQ ID NO. 9, SEQ ID NO. 10 or SEQ ID NO. 11.

mRNA synthesis

mRNA according to the present invention can be synthesized according to any of a variety of known methods. For example, mRNA according to the invention may be synthesized by in vitro transcription (IVT). Briefly, IVT is typically performed with: a linear or circular DNA template containing a promoter, a pool of ribonucleotide triphosphates, a buffer system that can include DTT and magnesium ions, and a suitable RNA polymerase (e.g., T3, T7, or SP6 RNA polymerase), DNase I, pyrophosphatase, and/or an RNase inhibitor. The exact conditions will vary depending on the particular application.

Exemplary codon-optimized human cystic fibrosis transmembrane conductance regulator (CFTR) mRNA

Construct design:

X-SEQ ID NO:1-Y

5 'and 3' UTR sequences:

X (5' UTR sequence) =

GGACAGAUCGCCUGGAGACGCCAUCCACGCUGUUUUGACCUCCAUAGAAGAC

ACCGGGACCGAUCCAGCCUCCGCGGCCGGGAACGGUGCAUUGGAACGCGGAUUCCCCGUGCCAAGAGUGACUCACCGUCCUUGACACG(SEQ ID NO:12)

Y (3' UTR sequence) =

CGGGUGGCAUCCCUGUGACCCCUCCCCAGUGCCUCUCCUGGCCCUGGAAGUU

GCCACUCCAGUGCCCACCAGCCUUGUCCUAAUAAAAUUAAGUUGCAUCAAGCU(SEQ ID NO:13)

OR

GGGUGGCAUCCCUGUGACCCCUCCCCAGUGCCUCUCCUGGCCCUGGAAGUUG

CCACUCCAGUGCCCACCAGCCUUGUCCUAAUAAAAUUAAGUUGCAUCAAAGCU(SEQ ID NO:14)

Exemplary codon optimized human CFTR mRNA sequences include any of SEQ ID NO. 2 through SEQ ID NO. 11, as described in the detailed description section.

In some embodiments, the activity of the CFTR protein is assessed by an Ussing chamber assay. In some embodiments, the duration of activity of the CFTR protein is assessed by a time-course Ussing chamber assay. In some embodiments, protein expression and stability are assessed by pulse-chase. In some embodiments, protein expression and stability are assessed by surface biotinylation.

In some embodiments, for the preparation of mRNA according to the invention, the DNA template is transcribed in vitro. Suitable DNA templates typically have a promoter for in vitro transcription (e.g., a T3, T7, or SP6 promoter) followed by the desired nucleotide sequence for the desired mRNA and a termination signal.

Synthesis of mRNA using SP6 RNA polymerase

In some embodiments, the CFTR mRNA is produced using SP6 RNA polymerase. SP6 RNA polymerase is a DNA-dependent RNA polymerase that has high sequence specificity for the SP6 promoter sequence. SP6 polymerase catalyzes 5'→3' in vitro RNA synthesis on single-stranded DNA or double-stranded DNA downstream of its promoter; it incorporates natural ribonucleotides and/or modified ribonucleotides and/or labeled ribonucleotides into the polymerized transcript. Examples of such labeled ribonucleotides include biotin-, fluorescein-, digoxigenin-, aminoallyl-, and isotope-labeled nucleotides.

The sequence of bacteriophage SP6 RNA polymerase was originally described (GenBank: Y00105.1) as having the following amino acid sequence:

MQDLHAIQLQLEEEMFNGGIRRFEADQQRQIAAGSESDTAWNRRLLSELIAPMAEGIQ

AYKEEYEGKKGRAPRALAFLQCVENEVAAYITMKVVMDMLNTDATLQAIAMSVAER

IEDQVRFSKLEGHAAKYFEKVKKSLKASRTKSYRHAHNVAVVAEKSVAEKDADFDR

WEAWPKETQLQIGTTLLEILEGSVFYNGEPVFMRAMRTYGGKTIYYLQTSESVGQWIS

AFKEHVAQLSPAYAPCVIPPRPWRTPFNGGFHTEKVASRIRLVKGNREHVRKLTQKQ

MPKVYKAINALQNTQWQINKDVLAVIEEVIRLDLGYGVPSFKPLIDKENKPANPVPVE

FQHLRGRELKEMLSPEQWQQFINWKGECARLYTAETKRGSKSAAVVRMVGQARKYS

AFESIYFVYAMDSRSRVYVQSSTLSPQSNDLGKALLRFTEGRPVNGVEALKWFCINGA

NLWGWDKKTFDVRVSNVLDEEFQDMCRDIAADPLTFTQWAKADAPYEFLAWCFEY

AQYLDLVDEGRADEFRTHLPVHQDGSCSGIQHYSAMLRDEVGAKAVNLKPSDAPQDI

YGAVAQVVIKKNALYMDADDATTFTSGSVTLSGTELRAMASAWDSIGITRSLTKKPV

MTLPYGSTRLTCRESVIDYIVDLEEKEAQKAVAEGRTANKVHPFEDDRQDYLTPGAA

YNYMTALIWPSISEVVKAPIVAMKMIRQLARFAAKRNEGLMYTLPTGFILEQKIMATE

MLRVRTCLMGDIKMSLQVETDIVDEAAMMGAAAPNFVHGHDASHLILTVCELVDKG

VTSIAVIHDSFGTHADNTLTLRVALKGQMVAMYIDGNALQKLLEEHEVRWMVDTGIEVPEQGEFDLNEIMDSEYVFA(SEQ ID NO:16).

The SP6 RNA polymerase suitable for the present invention may be any enzyme having substantially the same polymerase activity as the bacteriophage SP6 RNA polymerase. Thus, in some embodiments, an SP6 RNA polymerase suitable for the present invention may be modified from SEQ ID NO. 16. For example, a suitable SP6 RNA polymerase may contain one or more amino acid substitutions, deletions or additions. In some embodiments, a suitable SP6 RNA polymerase has an amino acid sequence that is about 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 91%, 90%, 89%, 88%, 87%, 86%, 85%, 84%, 83%, 82%, 81%, 80%, 75%, 70%, 65%, or 60% identical or homologous to SEQ ID NO. 16. In some embodiments, a suitable SP6 RNA polymerase may be a truncated protein (truncated from the N-terminus, the C-terminus, or internally) that retains polymerase activity. In some embodiments, a suitable SP6 RNA polymerase is a fusion protein.

SP6 RNA polymerase suitable for the present invention may be a commercially available product, such as those from Aldevron, Ambion, New England Biolabs (NEB), Promega, and Roche. SP6 may be ordered and/or custom designed from commercial or non-commercial sources based on the amino acid sequence of SEQ ID NO. 16 or a variant of SEQ ID NO. 16 as described herein. SP6 may be a standard-fidelity polymerase, or may be a high-fidelity/high-efficiency/high-capacity polymerase that has been modified to promote RNA polymerase activity, e.g., through mutations in the SP6 RNA polymerase gene or post-translational modification of the SP6 RNA polymerase itself. Examples of such modified SP6 include SP6 RNA Polymerase-Plus™ from Ambion, HiScribe SP6 from NEB, and the RiboMAX™ system from Promega.

In some embodiments, a suitable SP6 RNA polymerase is a fusion protein. For example, the SP6 RNA polymerase may comprise one or more tags to facilitate isolation, purification or solubility of the enzyme. Suitable tags may be located at the N-terminus, the C-terminus and/or internally. Non-limiting examples of suitable tags include calmodulin-binding protein (CBP); Fasciola hepatica 8-kDa antigen (Fh8); FLAG tag peptide; glutathione-S-transferase (GST); a histidine tag (e.g., a hexahistidine tag (His6)); maltose-binding protein (MBP); N-utilization substance (NusA); small ubiquitin-related modifier (SUMO) fusion tag; streptavidin-binding peptide (STREP); tandem affinity purification (TAP); and thioredoxin (TrxA). Other tags may be used in the present invention. These and other fusion tags have been described, for example, in Costa et al., Frontiers in Microbiology (2014): 63 and in PCT/US16/57044, the contents of which are incorporated herein by reference in their entirety. In certain embodiments, a His tag is located at the N-terminus of SP6.

DNA template

Typically, CFTR DNA templates are fully double-stranded, or mostly single-stranded and have a double-stranded SP6 promoter sequence.

Linearized plasmid DNA (linearized via one or more restriction enzymes), linearized genomic DNA fragments (via restriction enzymes and/or physical means), PCR products and/or synthetic DNA oligonucleotides can be used as templates for SP6 in vitro transcription, provided that they contain a double stranded SP6 promoter upstream (and in the correct orientation) of the DNA sequence to be transcribed.

In some embodiments, the linearized DNA template has blunt ends.

In some embodiments, the DNA sequence to be transcribed may be optimized to promote more efficient transcription and/or translation. For example, the DNA sequence may be optimized with respect to: cis-regulatory elements (e.g., TATA box, termination signals, and protein binding sites), artificial recombination sites, χ sites, CpG dinucleotide content, negative CpG islands, GC content, polymerase slippage sites, and/or other elements related to transcription; the DNA sequence may be optimized with respect to: cryptic splice sites, mRNA secondary structure, stable free energy of mRNA, repetitive sequences, RNA instability motifs, and/or other factors related to mRNA processing and stability; the DNA sequence may be optimized with respect to: codon usage bias, codon adaptability, internal χ sites, ribosome binding sites (e.g., IRES), premature poly-A sites, Shine-Dalgarno (SD) sequences, and/or other elements associated with translation; and/or the DNA sequence may be optimized with respect to: codon context, codon-anticodon interaction, translational pause sites, and/or other elements related to protein folding. Optimization methods known in the art may be used in the present invention, such as those of Thermo Fisher and OptimumGene™, the latter being described in US 20110081708, the contents of which are incorporated herein by reference in their entirety.

In some embodiments, the DNA template comprises 5' and/or 3' untranslated regions. In some embodiments, the 5' untranslated region includes one or more elements that affect the stability or translation of the mRNA, such as an iron-responsive element. In some embodiments, the 5' untranslated region may be between about 50 and 500 nucleotides in length.

In some embodiments, the 3' untranslated region comprises one or more of the following: a polyadenylation signal, a binding site for proteins that affect the stability or location of the mRNA in a cell, or one or more binding sites for miRNAs. In some embodiments, the 3' untranslated region may be between 50 and 500 nucleotides in length or longer.

Exemplary 3' and/or 5' UTR sequences may be derived from stable mRNA molecules (e.g., globin, actin, GAPDH, tubulin, histone, or citric acid cycle enzymes) to increase the stability of the sense mRNA molecule. For example, the 5' UTR sequence may include a partial sequence of the CMV immediate-early 1 (IE1) gene, or a fragment thereof, to improve nuclease resistance and/or the half-life of the polynucleotide. It is also contemplated that a sequence encoding human growth hormone (hGH), or a fragment thereof, is included at the 3' end or in the untranslated region of a polynucleotide (e.g., mRNA) to further stabilize the polynucleotide. In general, such modifications improve the stability and/or pharmacokinetic properties (e.g., half-life) of the polynucleotide relative to its unmodified counterpart, and include, for example, modifications made to improve the resistance of such polynucleotides to nuclease digestion in vivo.

Large-scale mRNA synthesis

The present invention relates to the large-scale production of codon optimized CFTR mRNA. In some embodiments, the method according to the invention synthesizes at least 100mg, 150mg, 200mg, 300mg, 400mg, 500mg, 600mg, 700mg, 800mg, 900mg, 1g, 5g, 10g, 25g, 50g, 75g, 100g, 250g, 500g, 750g, 1kg, 5kg, 10kg, 50kg, 100kg, 1000kg or more mRNA in a single batch. As used herein, the term "batch" refers to the amount or quantity of mRNA synthesized at one time (e.g., produced according to a single manufacturing set-up). A batch may refer to the amount of mRNA synthesized in a single reaction that is performed under a set of conditions via a single aliquot of enzyme and/or a single aliquot of DNA template for continuous synthesis. mRNA synthesized in a single batch does not include mRNA synthesized at different times and combined to achieve the desired amount. Typically, the reaction mixture includes SP6 RNA polymerase, a linear DNA template, and an RNA polymerase reaction buffer (which may include ribonucleotides or may require the addition of ribonucleotides).

According to the invention, 1-100mg of SP6 polymerase is generally used per gram (g) of mRNA produced. In some embodiments, about 1-90mg, 1-80mg, 1-60mg, 1-50mg, 1-40mg, 10-100mg, 10-80mg, 10-60mg, or 10-50mg of SP6 polymerase is used per gram of mRNA produced. In some embodiments, about 5-20mg of SP6 polymerase is used to produce about 1 gram of mRNA. In some embodiments, about 0.5 to 2 grams of SP6 polymerase is used to produce about 100 grams of mRNA. In some embodiments, about 5 to 20 grams of SP6 polymerase is used to produce about 1 kilogram of mRNA. In some embodiments, at least 5mg of SP6 polymerase is used to produce at least 1 gram of mRNA. In some embodiments, at least 500mg of SP6 polymerase is used to produce at least 100 grams of mRNA. In some embodiments, at least 5 grams of SP6 polymerase is used to produce at least 1 kilogram of mRNA. In some embodiments, about 10mg, 20mg, 30mg, 40mg, 50mg, 60mg, 70mg, 80mg, 90mg, or 100mg of plasmid DNA is used per gram of mRNA produced. In some embodiments, about 10-30mg of plasmid DNA is used to produce about 1 gram of mRNA. In some embodiments, about 1 to 3 grams of plasmid DNA is used to produce about 100 grams of mRNA. In some embodiments, about 10 to 30 grams of plasmid DNA is used to produce about 1 kilogram of mRNA. In some embodiments, at least 10mg of plasmid DNA is used to produce at least 1 gram of mRNA. In some embodiments, at least 1 gram of plasmid DNA is used to produce at least 100 grams of mRNA. In some embodiments, at least 10 grams of plasmid DNA is used to produce at least 1 kilogram of mRNA.
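The per-gram ratios above scale linearly with batch size, which the following minimal sketch makes explicit. The function name `scale_inputs` is hypothetical, and the default ratios (5-20mg SP6 polymerase and 10-30mg plasmid DNA per gram of mRNA) are simply one of the ranges recited above, chosen here for illustration.

```python
# Minimal sketch: scale enzyme and template amounts linearly with batch size.
# Default ratios are one of the recited ranges; the function name is hypothetical.
from typing import Dict, Tuple

Range = Tuple[float, float]  # (low, high)


def scale_inputs(mrna_grams: float,
                 sp6_mg_per_g: Range = (5.0, 20.0),
                 plasmid_mg_per_g: Range = (10.0, 30.0)) -> Dict[str, Range]:
    """Return the (low, high) mass in mg of SP6 polymerase and plasmid DNA."""
    if mrna_grams <= 0:
        raise ValueError("mrna_grams must be positive")
    return {
        "sp6_polymerase_mg": (sp6_mg_per_g[0] * mrna_grams,
                              sp6_mg_per_g[1] * mrna_grams),
        "plasmid_dna_mg": (plasmid_mg_per_g[0] * mrna_grams,
                           plasmid_mg_per_g[1] * mrna_grams),
    }


batch = scale_inputs(100.0)  # a 100 g batch
print(batch)
```

For a 100 g batch this gives 500-2000mg (0.5 to 2 grams) of SP6 polymerase and 1000-3000mg (1 to 3 grams) of plasmid DNA, matching the 100-gram embodiments recited above.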

In some embodiments, the concentration of SP6 RNA polymerase in the reaction mixture may be about 1 to 100nM, 1 to 90nM, 1 to 80nM, 1 to 70nM, 1 to 60nM, 1 to 50nM, 1 to 40nM, 1 to 30nM, 1 to 20nM, or about 1 to 10nM. In certain embodiments, the concentration of the SP6 RNA polymerase is about 10 to 50nM, 20 to 50nM, or 30 to 50nM. SP6 RNA polymerase concentrations of 100 to 10000 units/ml may be used; for example, the following concentrations may be used: 100 to 9000 units/ml, 100 to 8000 units/ml, 100 to 7000 units/ml, 100 to 6000 units/ml, 100 to 5000 units/ml, 100 to 1000 units/ml, 200 to 2000 units/ml, 500 to 1000 units/ml, 500 to 2000 units/ml, 500 to 3000 units/ml, 500 to 4000 units/ml, 500 to 5000 units/ml, 500 to 6000 units/ml, 1000 to 7500 units/ml, and 2500 to 5000 units/ml.

The concentration of each ribonucleotide (e.g., ATP, UTP, GTP and CTP) in the reaction mixture is between about 0.1mM and about 10mM, such as between about 1mM and about 10mM, between about 2mM and about 10mM, between about 3mM and about 10mM, between about 1mM and about 8mM, between about 1mM and about 6mM, between about 3mM and about 8mM, between about 3mM and about 6mM, or between about 4mM and about 5mM. In some embodiments, each ribonucleotide is about 5mM in the reaction mixture. In some embodiments, the total concentration of rNTPs (e.g., ATP, GTP, CTP and UTP combined) used in the reaction is in the range between 1mM and 40mM. In some embodiments, the total concentration of rNTPs (e.g., ATP, GTP, CTP and UTP combined) used in the reaction is in the range between 1mM and 30mM, between 1mM and 28mM, between 1mM and 25mM, or between 1mM and 20mM. In some embodiments, the total rNTP concentration is less than 30mM. In some embodiments, the total rNTP concentration is less than 25mM. In some embodiments, the total rNTP concentration is less than 20mM. In some embodiments, the total rNTP concentration is less than 15mM. In some embodiments, the total rNTP concentration is less than 10mM.

RNA polymerase reaction buffers typically include salts/buffers such as Tris, HEPES, ammonium sulfate, sodium bicarbonate, sodium citrate, sodium acetate, potassium phosphate, sodium chloride, and magnesium chloride.

The pH of the reaction mixture may be about 6 to 8.5, 6.5 to 8.0, or 7.0 to 7.5; in some embodiments, the pH is 7.5.

The linear or linearized DNA template (e.g., as described above and in an amount/concentration sufficient to provide the desired amount of RNA), the RNA polymerase reaction buffer, and the SP6 RNA polymerase are combined to form a reaction mixture. The reaction mixture is incubated at between about 37℃ and about 42℃ for thirty minutes to six hours, for example about sixty minutes to about ninety minutes.

In some embodiments, about 5mM NTP, about 0.05mg/mL RNA polymerase, and about 0.1mg/mL DNA template are incubated in a suitable SP6 polymerase reaction buffer (final reaction mixture pH of about 7.5) at about 37℃ to about 42℃ for sixty to ninety minutes.

In some embodiments, the reaction mixture contains a linearized double-stranded DNA template with an SP6 polymerase-specific promoter, SP6 RNA polymerase, RNase inhibitor, pyrophosphatase, 29mM NTP, 10mM DTT, and a reaction buffer (when at 10x: 800mM HEPES, 20mM spermidine, 250mM MgCl₂, pH 7.7), and is brought to the desired reaction volume with RNase-free water (quantity sufficient, QS); the reaction mixture is then incubated at 37℃ for 60 minutes. The polymerase reaction is then quenched by adding DNase I and a DNase I buffer (when at 10x: 100mM Tris-HCl, 5mM MgCl₂, and 25mM CaCl₂, pH 7.6) to promote digestion of the double-stranded DNA template in preparation for purification. This embodiment has been shown to be sufficient to produce 100 grams of mRNA.
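Because the reaction buffer above is specified at 10x strength, the working (1x) concentrations and the volume of stock needed follow from a simple dilution. The sketch below assumes the recited values (800mM HEPES, 20mM spermidine, 250mM MgCl₂) are the 10x stock concentrations and that the buffer is used at 1x in the final mixture; the function names are hypothetical.

```python
# Minimal sketch: dilute a 10x buffer stock to its working concentration.
# Assumes the recited values are 10x stock concentrations used at 1x.

TEN_X_BUFFER_MM = {"HEPES": 800.0, "spermidine": 20.0, "MgCl2": 250.0}


def working_concentrations(stock_mM: dict, fold: float = 10.0) -> dict:
    """Concentration of each component (mM) after a `fold`-fold dilution."""
    return {name: conc / fold for name, conc in stock_mM.items()}


def stock_volume_ml(final_volume_ml: float, fold: float = 10.0) -> float:
    """Volume of concentrated stock needed for a given final reaction volume."""
    return final_volume_ml / fold


one_x = working_concentrations(TEN_X_BUFFER_MM)
print(one_x)                    # working concentrations in mM
print(stock_volume_ml(500.0))   # mL of 10x stock for a 500 mL reaction
```

Under these assumptions the working buffer contains 80mM HEPES, 2mM spermidine, and 25mM MgCl₂, and a 500 mL reaction would use 50 mL of the 10x stock.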

In some embodiments, the reaction mixture comprises NTP at a concentration ranging from 1 to 10mM, DNA template at a concentration ranging from 0.01 to 0.5mg/ml, and SP6 RNA polymerase at a concentration ranging from 0.01 to 0.1mg/ml; e.g., the reaction mixture comprises NTP at a concentration of 5mM, DNA template at a concentration of 0.1mg/ml, and SP6 RNA polymerase at a concentration of 0.05mg/ml.
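The concentrations above translate into absolute amounts once a reaction volume is chosen. The following sketch is illustrative only: the class and function names are hypothetical, and it assumes that the NTP concentration is a per-nucleotide value (so the total rNTP concentration is four times that figure), which the text does not state explicitly for this embodiment.

```python
# Minimal sketch: convert recited concentrations into amounts for a chosen volume.
# Assumption: the NTP concentration applies to EACH of ATP, CTP, GTP and UTP.
from dataclasses import dataclass


@dataclass(frozen=True)
class IvtRecipe:
    ntp_each_mM: float = 5.0          # recited range: 1 to 10 mM
    template_mg_per_ml: float = 0.1   # recited range: 0.01 to 0.5 mg/ml
    sp6_mg_per_ml: float = 0.05       # recited range: 0.01 to 0.1 mg/ml

    def validate(self) -> None:
        if not 1.0 <= self.ntp_each_mM <= 10.0:
            raise ValueError("NTP concentration outside 1-10 mM")
        if not 0.01 <= self.template_mg_per_ml <= 0.5:
            raise ValueError("template concentration outside 0.01-0.5 mg/ml")
        if not 0.01 <= self.sp6_mg_per_ml <= 0.1:
            raise ValueError("SP6 concentration outside 0.01-0.1 mg/ml")


def amounts_for_volume(recipe: IvtRecipe, volume_ml: float) -> dict:
    """Amounts of each component needed for `volume_ml` of reaction mixture."""
    recipe.validate()
    return {
        "template_mg": recipe.template_mg_per_ml * volume_ml,
        "sp6_mg": recipe.sp6_mg_per_ml * volume_ml,
        "each_ntp_umol": recipe.ntp_each_mM * volume_ml,  # mM x mL = micromoles
        "total_ntp_mM": 4 * recipe.ntp_each_mM,
    }


amounts = amounts_for_volume(IvtRecipe(), volume_ml=1000.0)
print(amounts)
```

For a 1 L reaction at the exemplary concentrations, this corresponds to 100mg of DNA template, 50mg of SP6 RNA polymerase, and 5000 micromoles of each nucleotide.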

Nucleotide(s)

According to the present invention, various naturally occurring or modified nucleosides can be used to produce mRNA. In some embodiments, the mRNA is or comprises natural nucleosides (e.g., adenosine, guanosine, cytidine, uridine); nucleoside analogs (e.g., 2-aminoadenosine, 2-thiothymidine, inosine, pyrrolo-pyrimidine, 3-methyladenosine, 5-methylcytidine, C-5 propynyl-cytidine, C-5 propynyl-uridine, 2-aminoadenosine, C5-bromouridine, C5-fluorouridine, C5-iodouridine, C5-propynyl-uridine, C5-propynyl-cytidine, C5-methylcytidine, 2-aminoadenosine, 7-deazaadenosine, 7-deazaguanosine, 8-oxo-guanosine, O(6)-methylguanine, pseudouridine (e.g., N-1-methyl-pseudouridine), 2-thiouridine, and 2-thiocytidine); chemically modified bases; biologically modified bases (e.g., methylated bases); intercalated bases; modified sugars (e.g., 2'-fluororibose, ribose, 2'-deoxyribose, arabinose, and hexose); and/or modified phosphate groups (e.g., phosphorothioate and 5'-N-phosphoramidite linkages).

In some embodiments, the mRNA comprises one or more non-standard nucleotide residues. Non-standard nucleotide residues may include, for example, 5-methyl-cytidine ("5mC"), pseudouridine ("ψU"), and/or 2-thiouridine ("2sU"). For a discussion of such residues and their incorporation into mRNA see, e.g., U.S. Patent No. 8,278,036 or WO 2011012316. The mRNA can be RNA in which 25% of the U residues are 2-thiouridine and 25% of the C residues are 5-methylcytidine. Teachings regarding the use of such RNA are disclosed in U.S. Patent Publication US 20120195936 and International Publication WO 2011012316, both of which are incorporated herein by reference in their entirety. The presence of non-standard nucleotide residues may render the mRNA more stable and/or less immunogenic than a control mRNA having the same sequence but containing only standard residues. In other embodiments, the mRNA may comprise one or more non-standard nucleotide residues selected from the group consisting of: isocytosine, pseudoisocytosine, 5-bromouracil, 5-propynyluracil, 6-aminopurine, 2-aminopurine, inosine, diaminopurine, and 2-chloro-6-aminopurine cytosine, and combinations of these modifications and other nucleobase modifications. Some embodiments may further include additional modifications to the furanose ring or nucleobase. Additional modifications may include, for example, sugar modifications or substitutions (e.g., one or more of a 2'-O-alkyl modification or a locked nucleic acid (LNA)). In some embodiments, the RNA may be complexed or hybridized with additional polynucleotides and/or peptide nucleic acids (PNA). In some embodiments where the sugar modification is a 2'-O-alkyl modification, such modifications may include, but are not limited to, 2'-deoxy-2'-fluoro modifications, 2'-O-methyl modifications, 2'-O-methoxyethyl modifications, and 2'-deoxy modifications.
In some embodiments, any of these modifications may be present in 0-100% of the nucleotides, alone or in combination, e.g., more than 0%, 1%, 10%, 25%, 50%, 75%, 85%, 90%, 95% or 100% of the constituent nucleotides.

Post synthesis treatment

Typically, the 5' cap and/or 3' tail may be added after synthesis. The presence of the cap is important in providing resistance to nucleases found in most eukaryotic cells. The presence of a "tail" serves to protect the mRNA from exonuclease degradation.

The 5' cap is typically added as follows: first, an RNA terminal phosphatase removes one of the terminal phosphate groups from the 5' nucleotide, leaving two terminal phosphates; guanosine triphosphate (GTP) is then added to the terminal phosphates via a guanylyl transferase, producing a 5'-5' triphosphate linkage; and the 7-nitrogen of guanine is then methylated by a methyltransferase. Examples of cap structures include, but are not limited to, m7G(5')ppp(5')A, G(5')ppp(5')A and G(5')ppp(5')G. Additional cap structures are described in published U.S. Application No. US 2016/0032356 and U.S. Provisional Application 62/464,327, filed on February 27, 2017, which are incorporated herein by reference.

Typically, the tail structure comprises poly (a) and/or poly (C) tails. The poly a or poly C tail on the 3' end of an mRNA typically comprises at least 50 adenosine or cytosine nucleotides, at least 150 adenosine or cytosine nucleotides, at least 200 adenosine or cytosine nucleotides, at least 250 adenosine or cytosine nucleotides, at least 300 adenosine or cytosine nucleotides, at least 350 adenosine or cytosine nucleotides, at least 400 adenosine or cytosine nucleotides, at least 450 adenosine or cytosine nucleotides, at least 500 adenosine or cytosine nucleotides, at least 550 adenosine or cytosine nucleotides, at least 600 adenosine or cytosine nucleotides, at least 650 adenosine or cytosine nucleotides, at least 700 adenosine or cytosine nucleotides, at least 750 adenosine or cytosine nucleotides, at least 800 adenosine or cytosine nucleotides, at least 850 adenosine or cytosine nucleotides, at least 900 adenosine or cytosine nucleotides, at least 950 adenosine or cytosine nucleotides, or at least 1kb adenosine or cytosine nucleotides, respectively. 
In some embodiments, the poly A or poly C tail can be about 10 to 800 adenosine or cytosine nucleotides (e.g., about 10 to 200 adenosine or cytosine nucleotides, about 10 to 300 adenosine or cytosine nucleotides, about 10 to 400 adenosine or cytosine nucleotides, about 10 to 500 adenosine or cytosine nucleotides, about 10 to 550 adenosine or cytosine nucleotides, about 10 to 600 adenosine or cytosine nucleotides, about 50 to 600 adenosine or cytosine nucleotides, about 100 to 600 adenosine or cytosine nucleotides, about 150 to 600 adenosine or cytosine nucleotides, about 200 to 600 adenosine or cytosine nucleotides, about 250 to 600 adenosine or cytosine nucleotides, about 300 to 600 adenosine or cytosine nucleotides, about 350 to 600 adenosine or cytosine nucleotides, about 400 to 600 adenosine or cytosine nucleotides, about 450 to 600 adenosine or cytosine nucleotides, about 500 to 600 adenosine or cytosine nucleotides, or about 20 to 60 adenosine or cytosine nucleotides). In some embodiments, the tail structure comprises a combination of poly(A) and poly(C) tails of various lengths as described herein. In some embodiments, the tail structure comprises at least 50%, 55%, 65%, 70%, 75%, 80%, 85%, 90%, 92%, 94%, 95%, 96%, 97%, 98% or 99% adenosine nucleotides. In some embodiments, the tail structure comprises at least 50%, 55%, 65%, 70%, 75%, 80%, 85%, 90%, 92%, 94%, 95%, 96%, 97%, 98% or 99% cytosine nucleotides.
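The tail length and composition criteria above can be checked for a given tail sequence with a short calculation. The sketch below is illustrative: the function names are hypothetical, and the example tail (95 adenosines followed by 5 cytosines) is an invented input rather than a tail disclosed in this description.

```python
# Minimal sketch: report the length and A/C composition of a 3' tail sequence.
# Function names and the example tail are hypothetical.


def tail_composition(tail: str) -> dict:
    """Return tail length and the fraction of adenosine and cytosine residues."""
    tail = tail.upper()
    if not tail:
        raise ValueError("tail sequence is empty")
    n = len(tail)
    return {
        "length": n,
        "A_fraction": tail.count("A") / n,
        "C_fraction": tail.count("C") / n,
    }


def meets_threshold(tail: str, base: str, min_fraction: float) -> bool:
    """True if residues of type `base` ('A' or 'C') make up at least `min_fraction`."""
    return tail_composition(tail)[f"{base}_fraction"] >= min_fraction


example_tail = "A" * 95 + "C" * 5  # invented example: mixed poly(A)/poly(C) tail
info = tail_composition(example_tail)
print(info, meets_threshold(example_tail, "A", 0.90))
```

With this example input, the tail is 100 nucleotides long, falls within the "about 10 to 200" range, and satisfies an "at least 90% adenosine" criterion but not an "at least 50% cytosine" one.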

As described herein, the addition of a 5' cap and/or 3' tail helps to detect abortive transcripts generated during in vitro synthesis, as the size of those prematurely aborted mRNA transcripts may be too small to be detected without capping and/or tailing. Thus, in some embodiments, a 5' cap and/or 3' tail is added to the synthesized mRNA prior to testing the purity of the mRNA (e.g., the level of abortive transcripts present in the mRNA). In some embodiments, the 5' cap and/or 3' tail is added to the synthesized mRNA prior to purifying the mRNA as described herein. In other embodiments, the 5' cap and/or 3' tail is added to the synthesized mRNA after purification of the mRNA as described herein.

mRNA synthesized according to the present invention can be used without further purification. In particular, mRNA synthesized according to the present invention can be used without a step of removing shortmers. In some embodiments, mRNA synthesized according to the invention may be further purified. Various methods can be used to purify the mRNA synthesized according to the present invention. For example, purification of mRNA can be performed using centrifugation, filtration, and/or chromatography. In some embodiments, the synthesized mRNA is purified by ethanol precipitation, filtration, chromatography, gel purification, or any other suitable means. In some embodiments, the mRNA is purified by HPLC. In some embodiments, the mRNA is extracted in a standard phenol:chloroform:isoamyl alcohol solution, as is well known to those skilled in the art. In some embodiments, the mRNA is purified using tangential flow filtration. Suitable purification methods include those described in US 2016/0040154, US 2015/0376220, PCT application PCT/US18/19954 entitled "METHODS FOR PURIFICATION OF MESSENGER RNA" filed on February 27, 2018, and PCT application PCT/US18/19978 entitled "METHODS FOR PURIFICATION OF MESSENGER RNA" filed on February 27, 2018, all of which are incorporated herein by reference in their entirety and may be used to practice the invention.

In some embodiments, the mRNA is purified prior to capping and tailing. In some embodiments, the mRNA is purified after capping and tailing. In some embodiments, the mRNA is purified both before and after capping and tailing.

In some embodiments, the mRNA is purified by centrifugation before or after capping and tailing, or both.

In some embodiments, the mRNA is purified by filtration before or after capping and tailing, or both.

In some embodiments, the mRNA is purified by Tangential Flow Filtration (TFF) before or after capping and tailing, or both.

In some embodiments, the mRNA is purified by chromatography before or after capping and tailing or both.

Characterization of mRNA

Full-length or abortive transcripts of mRNA may be detected and quantified using any method available in the art. In some embodiments, the synthetic mRNA molecules are detected using the following method: blotting, capillary electrophoresis, chromatography, fluorescence, gel electrophoresis, HPLC, silver staining, spectroscopy, ultraviolet (UV) or UPLC or combinations thereof. Other detection methods known in the art are included in the present invention. In some embodiments, the synthesized mRNA molecules are detected using UV absorbance spectroscopy and separated by capillary electrophoresis. In some embodiments, mRNA is first denatured by glyoxal dye prior to gel electrophoresis ("glyoxal gel electrophoresis"). In some embodiments, the synthesized mRNA is characterized prior to capping or tailing. In some embodiments, the synthesized mRNA is characterized after capping and tailing.

In some embodiments, the mRNA produced by the methods disclosed herein comprises less than 10%, less than 9%, less than 8%, less than 7%, less than 6%, less than 5%, less than 4%, less than 3%, less than 2%, less than 1%, less than 0.5%, or less than 0.1% of impurities other than full-length mRNA. The impurities include IVT contaminants such as proteins, enzymes, free nucleotides and/or shortmers.

In some embodiments, the mRNA produced according to the invention is substantially free of shortmers or abortive transcripts. In particular, the mRNA produced according to the invention contains shortmers or abortive transcripts at levels undetectable by capillary electrophoresis or glyoxal gel electrophoresis. As used herein, the term "shortmer" or "abortive transcript" refers to any transcript that is less than full length. In some embodiments, a "shortmer" or "abortive transcript" is less than 100 nucleotides in length, less than 90, less than 80, less than 70, less than 60, less than 50, less than 40, less than 30, less than 20, or less than 10 nucleotides in length. In some embodiments, shortmers are detected or quantified after addition of the 5' cap and/or 3' poly A tail.

mRNA solution

In some embodiments, mRNA may be provided in a solution to be mixed with a lipid solution, such that the mRNA may be encapsulated in a lipid nanoparticle. A suitable mRNA solution may be any aqueous solution containing the mRNA to be encapsulated at various concentrations. For example, a suitable mRNA solution may contain mRNA at a concentration of or greater than about 0.01mg/ml, 0.05mg/ml, 0.06mg/ml, 0.07mg/ml, 0.08mg/ml, 0.09mg/ml, 0.1mg/ml, 0.15mg/ml, 0.2mg/ml, 0.3mg/ml, 0.4mg/ml, 0.5mg/ml, 0.6mg/ml, 0.7mg/ml, 0.8mg/ml, 0.9mg/ml or 1.0mg/ml. In some embodiments, a suitable mRNA solution may contain mRNA at a concentration ranging from about 0.01-1.0mg/ml, 0.01-0.9mg/ml, 0.01-0.8mg/ml, 0.01-0.7mg/ml, 0.01-0.6mg/ml, 0.01-0.5mg/ml, 0.01-0.4mg/ml, 0.01-0.3mg/ml, 0.01-0.2mg/ml, 0.01-0.1mg/ml, 0.05-1.0mg/ml, 0.05-0.9mg/ml, 0.05-0.8mg/ml, 0.05-0.7mg/ml, 0.05-0.6mg/ml, 0.05-0.5mg/ml, 0.05-0.4mg/ml, 0.05-0.3mg/ml, 0.05-0.2mg/ml, 0.05-0.1mg/ml, or 0.1-0.2mg/ml. In some embodiments, a suitable mRNA solution may contain mRNA at a concentration of up to about 5.0mg/ml, 4.0mg/ml, 3.0mg/ml, 2.0mg/ml, 1.0mg/ml, 0.09mg/ml, 0.08mg/ml, 0.07mg/ml, 0.06mg/ml or 0.05mg/ml.

In general, suitable mRNA solutions may also contain buffers and/or salts. Typically, buffers may include HEPES, ammonium sulfate, sodium bicarbonate, sodium citrate, sodium acetate, potassium phosphate, and sodium phosphate. In some embodiments, suitable concentrations of buffer may range from about 0.1mM to 100mM, 0.5mM to 90mM, 1.0mM to 80mM, 2mM to 70mM, 3mM to 60mM, 4mM to 50mM, 5mM to 40mM, 6mM to 30mM, 7mM to 20mM, 8mM to 15mM, or 9mM to 12mM. In some embodiments, a suitable concentration of buffer is at or greater than about 0.1mM, 0.5mM, 1mM, 2mM, 4mM, 6mM, 8mM, 10mM, 15mM, 20mM, 25mM, 30mM, 35mM, 40mM, 45mM or 50mM.

Exemplary salts may include sodium chloride, magnesium chloride, and potassium chloride. In some embodiments, suitable salt concentrations in the mRNA solution may range from about 1mM to 500mM, 5mM to 400mM, 10mM to 350mM, 15mM to 300mM, 20mM to 250mM, 30mM to 200mM, 40mM to 190mM, 50mM to 180mM, 50mM to 170mM, 50mM to 160mM, 50mM to 150mM, or 50mM to 100mM. Suitable salt concentrations in the mRNA solutions are at or above about 1mM, 5mM, 10mM, 20mM, 30mM, 40mM, 50mM, 60mM, 70mM, 80mM, 90mM or 100mM.

In some embodiments, suitable mRNA solutions may have a pH in the range of about 3.5-6.5, 3.5-6.0, 3.5-5.5, 3.5-5.0, 3.5-4.5, 4.0-5.5, 4.0-5.0, 4.0-4.9, 4.0-4.8, 4.0-4.7, 4.0-4.6, or 4.0-4.5. In some embodiments, a suitable mRNA solution may have a pH of, or no greater than, about 3.5, 4.0, 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8, 4.9, 5.0, 5.2, 5.4, 5.6, 5.8, 6.0, 6.1, 6.3, or 6.5.

Various methods can be used to prepare mRNA solutions suitable for the present invention. In some embodiments, mRNA can be directly dissolved in a buffer solution as described herein. In some embodiments, the mRNA solution may be generated by mixing an mRNA stock solution with a buffer solution prior to mixing with the lipid solution for encapsulation. In some embodiments, the mRNA solution may be generated by mixing the mRNA stock solution with the buffer solution immediately before mixing with the lipid solution for encapsulation. In some embodiments, a suitable mRNA stock solution may contain mRNA in water at a concentration of or greater than about 0.2mg/ml, 0.4mg/ml, 0.5mg/ml, 0.6mg/ml, 0.8mg/ml, 1.0mg/ml, 1.2mg/ml, 1.4mg/ml, 1.5mg/ml, 1.6mg/ml, 2.0mg/ml, 2.5mg/ml, 3.0mg/ml, 3.5mg/ml, 4.0mg/ml, 4.5mg/ml, or 5.0mg/ml.

In some embodiments, the mRNA stock solution is mixed with the buffer solution using a pump. Exemplary pumps include, but are not limited to, gear pumps, peristaltic pumps, and centrifugal pumps.

Typically, the buffer solution is mixed at a rate greater than the rate of the mRNA stock solution. For example, the buffer solution may be mixed at a rate at least 1x, 2x, 3x, 4x, 5x, 6x, 7x, 8x, 9x, 10x, 15x, or 20x greater than the rate of the mRNA stock solution. In some embodiments, the buffer solution is mixed at a flow rate ranging between about 100-6000 ml/min (e.g., about 100-300 ml/min, 300-600 ml/min, 600-1200 ml/min, 1200-2400 ml/min, 2400-3600 ml/min, 3600-4800 ml/min, 4800-6000 ml/min, or 60-420 ml/min). In some embodiments, the buffer solution is mixed at a flow rate of at or greater than about 60 ml/min, 100 ml/min, 140 ml/min, 180 ml/min, 220 ml/min, 260 ml/min, 300 ml/min, 340 ml/min, 380 ml/min, 420 ml/min, 480 ml/min, 540 ml/min, 600 ml/min, 1200 ml/min, 2400 ml/min, 3600 ml/min, 4800 ml/min, or 6000 ml/min.

In some embodiments, the mRNA stock solution is mixed at a flow rate ranging between about 10-600 ml/min (e.g., about 5-50 ml/min, about 10-30 ml/min, about 30-60 ml/min, about 60-120 ml/min, about 120-240 ml/min, about 240-360 ml/min, about 360-480 ml/min, or about 480-600 ml/min). In some embodiments, the mRNA stock solution is mixed at a flow rate of about or greater than 5 ml/min, 10 ml/min, 15 ml/min, 20 ml/min, 25 ml/min, 30 ml/min, 35 ml/min, 40 ml/min, 45 ml/min, 50 ml/min, 60 ml/min, 80 ml/min, 100 ml/min, 200 ml/min, 300 ml/min, 400 ml/min, 500 ml/min, or 600 ml/min.
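The relationship between the two flow rates determines how much the mRNA stock is diluted when it meets the buffer stream. The sketch below is a simplified illustration that assumes ideal volumetric mixing of two continuous streams and reads "Nx greater" as a buffer flow rate equal to N times the stock flow rate; the function name and the example values are hypothetical.

```python
# Minimal sketch: in-line dilution of an mRNA stock stream by a buffer stream.
# Assumes ideal volumetric mixing; "fold" = buffer flow rate / stock flow rate.


def mixing_plan(stock_mg_per_ml: float,
                stock_ml_per_min: float,
                buffer_fold: float) -> dict:
    """Return the buffer flow rate, combined flow rate, and diluted mRNA concentration."""
    if min(stock_mg_per_ml, stock_ml_per_min, buffer_fold) <= 0:
        raise ValueError("all inputs must be positive")
    buffer_ml_per_min = buffer_fold * stock_ml_per_min
    total_ml_per_min = stock_ml_per_min + buffer_ml_per_min
    # Mass flow of mRNA is conserved; it is spread over the combined volume flow.
    diluted_mg_per_ml = stock_mg_per_ml * stock_ml_per_min / total_ml_per_min
    return {
        "buffer_ml_per_min": buffer_ml_per_min,
        "total_ml_per_min": total_ml_per_min,
        "diluted_mg_per_ml": diluted_mg_per_ml,
    }


# Example values chosen from within the recited ranges.
plan = mixing_plan(stock_mg_per_ml=1.0, stock_ml_per_min=50.0, buffer_fold=4.0)
print(plan)
```

With a 1.0mg/ml stock pumped at 50 ml/min and buffer pumped at four times that rate (200 ml/min), the combined stream flows at 250 ml/min and carries mRNA at about 0.2mg/ml.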

Delivery vehicle

According to the invention, mRNA encoding a peptide, polypeptide, or protein (e.g., full length, fragment, or portion of a protein) as described herein can be delivered via a delivery vehicle. As used herein, the terms "delivery vehicle," "transfer vehicle," "nanoparticle," or grammatical equivalents are used interchangeably.

The delivery vehicle may be formulated in combination with one or more additional nucleic acids, carriers, targeting ligands or stabilizing agents, or in a pharmaceutical composition in which the vehicle is admixed with a suitable excipient. Techniques for drug formulation and administration can be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, Pa., latest edition. The particular delivery vehicle is selected based on its ability to facilitate transfection of the nucleic acid into the target cell.

In some embodiments, the delivery vehicle comprising mRNA is administered by pulmonary delivery (e.g., including nebulization). In these embodiments, the delivery vehicle may be in the form of an aerosolized composition that can be inhaled. In some embodiments, mRNA is expressed in tissue to which the delivery vehicle is administered, e.g., nasal, tracheal, bronchial, bronchiole, and/or other pulmonary system-related cells or tissues. Other teachings of pulmonary delivery and nebulization are described in related international application PCT/US17/61100 filed on 10/11/2017, entitled "NOVEL ICE-BASED LIPID NANOPARTICLE FORMULATION FOR DELIVERY OF MRNA" by the applicant, and U.S. provisional application USSN 62/507,061, each of which is incorporated by reference in its entirety.

In some embodiments, mRNA encoding a protein may be delivered via a single delivery vehicle. In some embodiments, mRNA encoding a protein may be delivered via one or more delivery vehicles (each having a different composition). According to various embodiments, suitable delivery vehicles include, but are not limited to, polymer-based carriers (such as polyethylenimine (PEI)), lipid nanoparticles and liposomes, nanoliposomes, ceramide-containing nanoliposomes, proteoliposomes, exosomes of both natural and synthetic origin, natural, synthetic and semisynthetic lamellar bodies, nanoparticles, calcium phosphor-silicate nanoparticles, calcium phosphate nanoparticles, silica nanoparticles, nanocrystal particles, semiconductor nanoparticles, poly(D-arginine), sol-gels, nanodendrimers, starch-based delivery systems, micelles, emulsions, niosomes, multidomain-block polymers (vinyl polymers, polyacrylic acid polymers, dynamic polyconjugates), dry powder formulations, plasmids, viruses, calcium phosphate nucleotides, aptamers, peptides and other vector tags. The use of biological nanocapsules and other viral capsid protein assemblies as suitable transfer vehicles is also contemplated (Hum. Gene Ther. 2008; 19(9): 887-95).

The delivery vehicle comprising mRNA can be administered and dosed according to current medical practice, taking into account the clinical condition of the subject, the site and method of administration (e.g., local and systemic, including oral, pulmonary, and via injection), the schedule of administration, the age, sex, weight of the subject, and other factors relevant to the clinician of ordinary skill in the art. An "effective amount" for purposes herein may be determined by such relevant considerations as are known to those of ordinary skill in the experimental clinical study, pharmacology, clinical and medical arts. In some embodiments, the amount administered is effective to achieve at least some stabilization, improvement or elimination of symptoms and other indicia, as selected by one of skill in the art as an appropriate measure of disease progression, regression or improvement. For example, suitable amounts and dosing regimens are those that result in at least temporary protein production.

In some embodiments, CFTR mRNA is administered in combination with one or more CFTR potentiators and/or correctors. Suitable CFTR potentiators and/or correctors include ivacaftor, lumacaftor, or a combination of ivacaftor and lumacaftor. In some embodiments, CFTR mRNA is administered in combination with one or more other CF therapies during treatment (e.g., hormone replacement therapy, thyroid hormone replacement therapy, non-steroidal inflammatory drugs, and prescription dronabinol).

In some embodiments, the human subject receives concomitant CFTR modulator therapy. In some embodiments, the concomitant CFTR modulator therapy comprises ivacaftor. In some embodiments, the concomitant CFTR modulator therapy comprises lumacaftor. In some embodiments, the concomitant CFTR modulator therapy comprises tezacaftor. In some embodiments, the concomitant CFTR modulator therapy is selected from ivacaftor, lumacaftor, tezacaftor, or a combination thereof. In some embodiments, the concomitant CFTR modulator therapy comprises VX-659. In some embodiments, the concomitant CFTR modulator therapy comprises VX-445. In some embodiments, the concomitant CFTR modulator therapy comprises VX-152. In some embodiments, the concomitant CFTR modulator therapy comprises VX-440. In some embodiments, the concomitant CFTR modulator therapy comprises VX-371. In some embodiments, the concomitant CFTR modulator therapy comprises VX-561. In some embodiments, the concomitant CFTR modulator therapy comprises GLPG1837. In some embodiments, the concomitant CFTR modulator therapy comprises GLPG2222. In some embodiments, the concomitant CFTR modulator therapy comprises GLPG2737. In some embodiments, the concomitant CFTR modulator therapy comprises GLPG2451. In some embodiments, the concomitant CFTR modulator therapy comprises PTI-428. In some embodiments, the concomitant CFTR modulator therapy comprises PTI-801. In some embodiments, the concomitant CFTR modulator therapy comprises PTI-808. In some embodiments, the concomitant CFTR modulator therapy comprises eluforsen.

In some embodiments, the human subject is not suitable for treatment with one or more of ivacaftor, lumacaftor, tezacaftor, VX-659, VX-445, VX-152, VX-440, VX-371, VX-561, or a combination thereof. In some embodiments, the human subject is not suitable for treatment with one or more of ivacaftor, lumacaftor, tezacaftor, VX-659, VX-445, VX-152, VX-440, VX-371, VX-561, GLPG1837, GLPG2222, GLPG2737, GLPG2451, PTI-428, PTI-801, PTI-808, eluforsen, or a combination thereof.

In some embodiments, the delivery vehicles are formulated such that they are suitable for prolonged release of the mRNA contained therein. Such extended release compositions can be conveniently administered to a subject at extended dosing intervals.

Liposome delivery vehicles

In some embodiments, a suitable delivery vehicle is a liposomal delivery vehicle, such as a lipid nanoparticle. As used herein, a liposomal delivery vehicle (e.g., a lipid nanoparticle) is generally characterized as a microscopic vesicle having an internal aqueous space separated from an external medium by one or more bilayer membranes. Bilayer membranes of liposomes are typically formed from amphiphilic molecules, such as lipids of synthetic or natural origin, that comprise spatially separated hydrophilic and hydrophobic domains (Lasic, Trends Biotechnol., 16: 307-321, 1998). The bilayer membranes of liposomes may also be formed from amphiphilic polymers and surfactants (e.g., polymersomes, niosomes, etc.). In the context of the present invention, a liposomal delivery vehicle is typically used to transport the desired mRNA to a target cell or tissue. In some embodiments, the nanoparticle delivery vehicle is a liposome. In some embodiments, the liposome comprises one or more cationic lipids, one or more non-cationic lipids, one or more cholesterol-based lipids, and one or more PEG-modified lipids. In some embodiments, the liposome comprises no more than three different lipid components. In some embodiments, one of the different lipid components is a sterol-based cationic lipid.

Cationic lipids

As used herein, the phrase "cationic lipid" refers to any of a number of lipid species that have a net positive charge at a selected pH, such as physiological pH.

Suitable cationic lipids for use in the compositions and methods of the present invention include cationic lipids as described in International patent publication WO 2010/144740, which is incorporated herein by reference. In certain embodiments, the compositions and methods of the present invention comprise a cationic lipid, (6Z,9Z,28Z,31Z)-heptatriaconta-6,9,28,31-tetraen-19-yl 4-(dimethylamino)butanoate, having the following compound structure:

and pharmaceutically acceptable salts thereof.

Other suitable cationic lipids for use in the compositions and methods of the present invention include ionizable cationic lipids as described in international patent publication WO 2013/149440, which is incorporated herein by reference. In some embodiments, the compositions and methods of the invention comprise a cationic lipid of one of the following formulas:

or a pharmaceutically acceptable salt thereof, wherein R₁ and R₂ are each independently selected from hydrogen, optionally substituted, variably saturated or unsaturated C₁-C₂₀ alkyl, and optionally substituted, variably saturated or unsaturated C₆-C₂₀ acyl; wherein L₁ and L₂ are each independently selected from hydrogen, optionally substituted C₁-C₃₀ alkyl, optionally substituted, variably unsaturated C₁-C₃₀ alkenyl, and optionally substituted C₁-C₃₀ alkynyl; wherein m and o are each independently selected from zero and any positive integer (e.g., where m is three); and wherein n is zero or any positive integer (e.g., where n is one). In certain embodiments, the compositions and methods of the present invention comprise a cationic lipid, (15Z,18Z)-N,N-dimethyl-6-((9Z,12Z)-octadeca-9,12-dien-1-yl)tetracosa-15,18-dien-1-amine ("HGT5000"), having the following compound structure:

and pharmaceutically acceptable salts thereof. In certain embodiments, the compositions and methods of the present invention comprise a cationic lipid, (15Z,18Z)-N,N-dimethyl-6-((9Z,12Z)-octadeca-9,12-dien-1-yl)tetracosa-4,15,18-trien-1-amine ("HGT5001"), having the following compound structure:

and pharmaceutically acceptable salts thereof. In certain embodiments, the compositions and methods of the present invention comprise a cationic lipid, (15Z,18Z)-N,N-dimethyl-6-((9Z,12Z)-octadeca-9,12-dien-1-yl)tetracosa-5,15,18-trien-1-amine ("HGT5002"), having the following compound structure:

and pharmaceutically acceptable salts thereof.

Other suitable cationic lipids for use in the compositions and methods of the present invention include the cationic lipids described as aminoalcohol lipidoids in International patent publication WO 2010/053572, which is incorporated herein by reference. In certain embodiments, the compositions and methods of the present invention comprise a cationic lipid having the following compound structure:

and pharmaceutically acceptable salts thereof.

Other suitable cationic lipids for use in the compositions and methods of the present invention include cationic lipids as described in International patent publication WO 2016/118725, which is incorporated herein by reference. In certain embodiments, the compositions and methods of the present invention comprise a cationic lipid having the following compound structure:

and pharmaceutically acceptable salts thereof.

Other suitable cationic lipids for use in the compositions and methods of the present invention include cationic lipids as described in International patent publication WO 2016/118724, which is incorporated herein by reference. In certain embodiments, the compositions and methods of the present invention comprise a cationic lipid having the following compound structure:

and pharmaceutically acceptable salts thereof.

Other suitable cationic lipids for use in the compositions and methods of the present invention include cationic lipids having the formula 14,25-ditridecyl 15,18,21,24-tetraaza-octatriacontane, and pharmaceutically acceptable salts thereof.

Other suitable cationic lipids for use in the compositions and methods of the present invention include cationic lipids as described in International patent publications WO 2013/063284 and WO 2016/205691, each of which is incorporated herein by reference. In some embodiments, the compositions and methods of the invention comprise a cationic lipid of the formula:

or a pharmaceutically acceptable salt thereof, wherein each instance of R^L is independently optionally substituted C₆-C₄₀ alkenyl. In certain embodiments, the compositions and methods of the present invention comprise a cationic lipid having the following compound structure:

and pharmaceutically acceptable salts thereof. In certain embodiments, the compositions and methods of the present invention comprise a cationic lipid having the following compound structure:

and pharmaceutically acceptable salts thereof.

Other suitable cationic lipids for use in the compositions and methods of the present invention include cationic lipids as described in International patent publication WO 2015/184356, which is incorporated herein by reference. In some embodiments, the compositions and methods of the invention comprise a cationic lipid of the formula:

or a pharmaceutically acceptable salt thereof, wherein each X is independently O or S; each Y is independently O or S; each m is independently 0 to 20; each n is independently 1 to 6; each R_A is independently hydrogen, optionally substituted C1-50 alkyl, optionally substituted C2-50 alkenyl, optionally substituted C2-50 alkynyl, optionally substituted C3-10 carbocyclyl, optionally substituted 3-14 membered heterocyclyl, optionally substituted C6-14 aryl, optionally substituted 5-14 membered heteroaryl, or halogen; and each R_B is independently hydrogen, optionally substituted C1-50 alkyl, optionally substituted C2-50 alkenyl, optionally substituted C2-50 alkynyl, optionally substituted C3-10 carbocyclyl, optionally substituted 3-14 membered heterocyclyl, optionally substituted C6-14 aryl, optionally substituted 5-14 membered heteroaryl, or halogen. In certain embodiments, the compositions and methods of the present invention comprise a cationic lipid, "Target 23", having the following compound structure:

and pharmaceutically acceptable salts thereof.

Other suitable cationic lipids for use in the compositions and methods of the present invention include cationic lipids as described in International patent publication WO 2016/004202, which is incorporated herein by reference. In some embodiments, the compositions and methods of the present invention comprise a cationic lipid having the following compound structure:

or a pharmaceutically acceptable salt thereof. In some embodiments, the compositions and methods of the present invention comprise a cationic lipid having the following compound structure:

or a pharmaceutically acceptable salt thereof.

Other suitable cationic lipids for use in the compositions and methods of the present invention include cationic lipids as described in U.S. provisional patent application Ser. No. 62/758,179, which is incorporated herein by reference. In some embodiments, the compositions and methods of the invention comprise a cationic lipid of the formula:

or a pharmaceutically acceptable salt thereof, wherein each R¹ and R² is independently H or C₁-C₆ aliphatic; each m is independently an integer having a value of 1 to 4; each A is independently a covalent bond or arylene; each L¹ is independently an ester, thioester, disulfide, or anhydride group; each L² is independently C₂-C₁₀ aliphatic; each X¹ is independently H or OH; and each R³ is independently C₆-C₂₀ aliphatic. In some embodiments, the compositions and methods of the invention comprise a cationic lipid of the formula:

or a pharmaceutically acceptable salt thereof. In some embodiments, the compositions and methods of the invention comprise a cationic lipid of the formula:

or a pharmaceutically acceptable salt thereof.

Other suitable cationic lipids for use in the compositions and methods of the present invention include cationic lipids as described in the following documents: J. McClellan, M. C. King, Cell 2010, 141, 210-217 and Whitehead et al., Nature Communications (2014) 5:4277, which are incorporated herein by reference. In certain embodiments, the cationic lipids of the compositions and methods of the present invention include cationic lipids having the following compound structure:

and pharmaceutically acceptable salts thereof.

Other suitable cationic lipids for use in the compositions and methods of the present invention include cationic lipids as described in International patent publication WO 2015/199952, which is incorporated herein by reference. In some embodiments, the compositions and methods of the present invention comprise a cationic lipid having the following compound structure:

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention comprise a cationic lipid having the following compound structure:

and pharmaceutically acceptable salts thereof.

Other suitable cationic lipids for use in the compositions and methods of the present invention include cationic lipids as described in International patent publication WO 2017/004143, which is incorporated herein by reference. In some embodiments, the compositions and methods of the present invention comprise a cationic lipid having the following compound structure:

and pharmaceutically acceptable salts thereof.

Other suitable cationic lipids for use in the compositions and methods of the present invention include cationic lipids as described in International patent publication WO 2017/075531, which is incorporated herein by reference. In some embodiments, the compositions and methods of the invention comprise a cationic lipid of the formula:

or a pharmaceutically acceptable salt thereof, wherein one of L¹ or L² is -O(C=O)-, -(C=O)O-, -C(=O)-, -O-, -S(O)ₓ-, -S-S-, -C(=O)S-, -SC(=O)-, -NR^aC(=O)-, -C(=O)NR^a-, -NR^aC(=O)NR^a-, -OC(=O)NR^a-, or -NR^aC(=O)O-; and the other of L¹ or L² is -O(C=O)-, -(C=O)O-, -C(=O)-, -O-, -S(O)ₓ-, -S-S-, -C(=O)S-, -SC(=O)-, -NR^aC(=O)-, -C(=O)NR^a-, -NR^aC(=O)NR^a-, -OC(=O)NR^a-, -NR^aC(=O)O-, or a direct bond; G¹ and G² are each independently unsubstituted C₁-C₁₂ alkylene or C₁-C₁₂ alkenylene; G³ is C₁-C₂₄ alkylene, C₁-C₂₄ alkenylene, C₃-C₈ cycloalkylene, or C₃-C₈ cycloalkenylene; R^a is H or C₁-C₁₂ alkyl; R¹ and R² are each independently C₆-C₂₄ alkyl or C₆-C₂₄ alkenyl; R³ is H, OR⁵, CN, -C(=O)OR⁴, -OC(=O)R⁴, or -NR⁵C(=O)R⁴; R⁴ is C₁-C₁₂ alkyl; R⁵ is H or C₁-C₆ alkyl; and x is 0, 1 or 2.

Other suitable cationic lipids for use in the compositions and methods of the present invention include cationic lipids as described in International patent publication WO 2017/117528, which is incorporated herein by reference. In some embodiments, the compositions and methods of the present invention comprise a cationic lipid having the following compound structure:

and pharmaceutically acceptable salts thereof.

Other suitable cationic lipids for use in the compositions and methods of the present invention include cationic lipids as described in international patent publication WO 2017/049245, which is incorporated herein by reference. In some embodiments, the cationic lipids of the compositions and methods of the present invention include compounds of one of the following formulas:

and pharmaceutically acceptable salts thereof. For any of these four formulas, R₄ is independently selected from -(CH₂)ₙQ and -(CH₂)ₙCHQR; Q is selected from the group consisting of -OR, -OH, -O(CH₂)ₙN(R)₂, -OC(O)R, -CX₃, -CN, -N(R)C(O)R, -N(H)C(O)R, -N(R)S(O)₂R, -N(H)S(O)₂R, -N(R)C(O)N(R)₂, -N(H)C(O)N(R)₂, -N(H)C(O)N(H)(R), -N(R)C(S)N(R)₂, -N(H)C(S)N(R)₂, -N(H)C(S)N(H)(R), and a heterocycle; and n is 1, 2 or 3. In certain embodiments, the compositions and methods of the present invention comprise a cationic lipid having the following compound structure:

and pharmaceutically acceptable salts thereof.

Other suitable cationic lipids for use in the compositions and methods of the present invention include cationic lipids as described in International patent publications WO 2017/173054 and WO 2015/095340, each of which is incorporated herein by reference. In certain embodiments, the compositions and methods of the present invention comprise a cationic lipid having the following compound structure:

and pharmaceutically acceptable salts thereof.

Other suitable cationic lipids for use in the compositions and methods of the present invention include cleavable cationic lipids as described in international patent publication WO 2012/170889, which is incorporated herein by reference. In some embodiments, the compositions and methods of the invention comprise a cationic lipid of the formula:

wherein R₁ is selected from imidazole, guanidine, amino, imine, enamine, optionally substituted alkylamino (e.g., an alkylamino such as dimethylamino), and pyridyl; wherein R₂ is selected from one of the following two formulas:

and wherein R₃ and R₄ are each independently selected from optionally substituted, variably saturated or unsaturated C₆-C₂₀ alkyl and optionally substituted, variably saturated or unsaturated C₆-C₂₀ acyl; and wherein n is zero or any positive integer (e.g., one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty or more). In certain embodiments, the compositions and methods of the present invention comprise a cationic lipid, "HGT4001", having the following compound structure:

and pharmaceutically acceptable salts thereof. In certain embodiments, the compositions and methods of the present invention comprise a cationic lipid, "HGT4002" (also referred to herein as "Guan-SS-Chol"), having the following compound structure:

and pharmaceutically acceptable salts thereof. In certain embodiments, the compositions and methods of the present invention comprise a cationic lipid, "HGT4003", having the following compound structure:

and pharmaceutically acceptable salts thereof. In certain embodiments, the compositions and methods of the present invention comprise a cationic lipid, "HGT4004", having the following compound structure:

and pharmaceutically acceptable salts thereof. In certain embodiments, the compositions and methods of the present invention comprise a cationic lipid, "HGT4005", having the following compound structure:

and pharmaceutically acceptable salts thereof.

Other suitable cationic lipids for use in the compositions and methods of the present invention include cleavable cationic lipids as described in U.S. provisional application No. 62/672,194, filed May 16, 2018, which is incorporated herein by reference. In certain embodiments, the compositions and methods of the present invention include cationic lipids having any of the general formulas or any of structures (1a)-(21a), (1b)-(21b), and (22)-(237) described in U.S. provisional application No. 62/672,194. In certain embodiments, the compositions and methods of the present invention comprise cationic lipids having a structure according to formula (I'),

wherein:

R^X is independently -H, -L¹-R¹, or -L^5A-L^5B-B';

each of L¹, L², and L³ is independently a covalent bond, -C(O)-, -C(O)O-, -C(O)S-, or -C(O)NR^L-;

each L^4A and L^5A is independently -C(O)-, -C(O)O-, or -C(O)NR^L-;

each L^4B and L^5B is independently C₁-C₂₀ alkylene, C₂-C₂₀ alkenylene, or C₂-C₂₀ alkynylene;

each of B and B' is NR⁴R⁵ or a 5 to 10 membered nitrogen-containing heteroaryl;

each R¹, R², and R³ is independently C₆-C₃₀ alkyl, C₆-C₃₀ alkenyl, or C₆-C₃₀ alkynyl;

each R⁴ and R⁵ is independently hydrogen, C₁-C₁₀ alkyl, C₂-C₁₀ alkenyl, or C₂-C₁₀ alkynyl; and

each R^L is independently hydrogen, C₁-C₂₀ alkyl, C₂-C₂₀ alkenyl, or C₂-C₂₀ alkynyl.

In certain embodiments, the compositions and methods of the present invention include a cationic lipid that is Compound (139) of U.S. provisional application No. 62/672,194, having the following compound structure:

In some embodiments, the compositions and methods of the present invention include the cationic lipid N-[1-(2,3-dioleyloxy)propyl]-N,N,N-trimethylammonium chloride ("DOTMA") (Felgner et al., Proc. Nat'l Acad. Sci. 84, 7413 (1987); U.S. Pat. No. 4,897,355, incorporated herein by reference). Other cationic lipids suitable for the compositions and methods of the present invention include, for example, 5-carboxyspermylglycinedioctadecylamide ("DOGS"); 2,3-dioleyloxy-N-[2-(spermine-carboxamido)ethyl]-N,N-dimethyl-1-propanaminium ("DOSPA") (Behr et al., Proc. Nat'l Acad. Sci. 86, 6982 (1989); U.S. Pat. No. 5,171,678; U.S. Pat. No. 5,334,761); 1,2-dioleoyl-3-dimethylammonium-propane ("DODAP"); and 1,2-dioleoyl-3-trimethylammonium-propane ("DOTAP").

Additional exemplary cationic lipids suitable for the compositions and methods of the present invention also include: 1,2-distearyloxy-N,N-dimethyl-3-aminopropane ("DSDMA"); 1,2-dioleyloxy-N,N-dimethyl-3-aminopropane ("DODMA"); 1,2-dilinoleyloxy-N,N-dimethyl-3-aminopropane ("DLinDMA"); 1,2-dilinolenyloxy-N,N-dimethyl-3-aminopropane ("DLenDMA"); N-dioleyl-N,N-dimethylammonium chloride ("DODAC"); N,N-distearyl-N,N-dimethylammonium bromide ("DDAB"); N-(1,2-dimyristyloxyprop-3-yl)-N,N-dimethyl-N-hydroxyethylammonium bromide ("DMRIE"); 3-dimethylamino-2-(cholest-5-en-3-β-oxybutan-4-oxy)-1-(cis,cis-9,12-octadecadienoxy)propane ("CLinDMA"); 2-[5'-(cholest-5-en-3-β-oxy)-3'-oxapentoxy)-3-dimethyl-1-(cis,cis-9',1-2'-octadecadienoxy)propane ("CpLinDMA"); N,N-dimethyl-3,4-dioleyloxybenzylamine ("DMOBA"); 1,2-N,N'-dioleylcarbamyl-3-dimethylaminopropane ("DOcarbDAP"); 2,3-dilinoleoyloxy-N,N-dimethylpropylamine ("DLinDAP"); 1,2-N,N'-dilinoleylcarbamyl-3-dimethylaminopropane ("DLincarbDAP"); 1,2-dilinoleoylcarbamyl-3-dimethylaminopropane ("DLinCDAP"); 2,2-dilinoleyl-4-dimethylaminomethyl-[1,3]-dioxolane ("DLin-K-DMA"); 2-((8-[(3β)-cholest-5-en-3-yloxy]octyl)oxy)-N,N-dimethyl-3-[(9Z,12Z)-octadeca-9,12-dien-1-yloxy]propan-1-amine ("Octyl-CLinDMA"); (2R)-2-((8-[(3β)-cholest-5-en-3-yloxy]octyl)oxy)-N,N-dimethyl-3-[(9Z,12Z)-octadeca-9,12-dien-1-yloxy]propan-1-amine ("Octyl-CLinDMA (2R)"); (2S)-2-((8-[(3β)-cholest-5-en-3-yloxy]octyl)oxy)-N,N-dimethyl-3-[(9Z,12Z)-octadeca-9,12-dien-1-yloxy]propan-1-amine ("Octyl-CLinDMA (2S)"); 2,2-dilinoleyl-4-dimethylaminoethyl-[1,3]-dioxolane ("DLin-K-XTC2-DMA"); and 2-(2,2-di((9Z,12Z)-octadeca-9,12-dien-1-yl)-1,3-dioxolan-4-yl)-N,N-dimethylethanamine ("DLin-KC2-DMA") (see WO 2010/042877, which is incorporated herein by reference; Semple et al., Nature Biotech. 28: 172-176 (2010); Heyes, J., et al., J Controlled Release 107: 276-287 (2005); Morrissey, D.V., et al., Nat. Biotechnol. 23(8): 1003-1007 (2005); International patent publication WO 2005/121348). In some embodiments, the one or more cationic lipids comprise at least one of an imidazole, a dialkylamino, or a guanidine moiety. In some embodiments, one or more cationic lipids suitable for the compositions and methods of the present invention include 2,2-dilinoleyl-4-dimethylaminoethyl-[1,3]-dioxolane ("XTC"); (3aR,5s,6aS)-N,N-dimethyl-2,2-di((9Z,12Z)-octadeca-9,12-dienyl)tetrahydro-3aH-cyclopenta[d][1,3]dioxol-5-amine ("ALNY-100"); and/or 4,7,13-tris(3-oxo-3-(undecylamino)propyl)-N1,N16-diundecyl-4,7,10,13-tetraazahexadecane-1,16-diamide ("NC98-5").

In some embodiments, one or more cationic lipids suitable for the compositions and methods of the present invention include a cationic lipid that is TL1-04D-DMA, having the following compound structure:

In some embodiments, one or more cationic lipids suitable for the compositions and methods of the present invention include a cationic lipid that is GL-TES-SA-DME-E18-2, having the following compound structure:

In some embodiments, one or more cationic lipids suitable for the compositions and methods of the present invention include a cationic lipid that is SY-3-E14-DMAPR, having the following compound structure:

In some embodiments, one or more cationic lipids suitable for the compositions and methods of the present invention include a cationic lipid that is TL1-01D-DMA, having the following compound structure:

In some embodiments, one or more cationic lipids suitable for the compositions and methods of the present invention include a cationic lipid that is TL1-10D-DMA, having the following compound structure:

In some embodiments, one or more cationic lipids suitable for the compositions and methods of the present invention include a cationic lipid that is GL-TES-SA-DMP-E18-2, having the following compound structure:

In some embodiments, one or more cationic lipids suitable for the compositions and methods of the present invention include a cationic lipid that is HEP-E4-E10, having the following compound structure:

In some embodiments, one or more cationic lipids suitable for the compositions and methods of the present invention include a cationic lipid that is HEP-E3-E10, having the following compound structure:

In some embodiments, the compositions of the present invention comprise one or more cationic lipids that constitute at least about 5%, 10%, 20%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, or 70% of the total lipid content in the composition (e.g., lipid nanoparticle), as measured by weight. In some embodiments, the compositions of the present invention comprise one or more cationic lipids that constitute at least about 5%, 10%, 20%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, or 70% of the total lipid content in the composition (e.g., lipid nanoparticle), as measured in mol%. In some embodiments, the compositions of the present invention comprise one or more cationic lipids that constitute about 30-70% (e.g., about 30-65%, about 30-60%, about 30-55%, about 30-50%, about 30-45%, about 30-40%, about 35-50%, about 35-45%, or about 35-40%) of the total lipid content in the composition (e.g., lipid nanoparticle), as measured by weight. In some embodiments, the compositions of the present invention comprise one or more cationic lipids that constitute about 30-70% (e.g., about 30-65%, about 30-60%, about 30-55%, about 30-50%, about 30-45%, about 30-40%, about 35-50%, about 35-45%, or about 35-40%) of the total lipid content in the composition (e.g., lipid nanoparticle), as measured in mol%.
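Because the lipid content is expressed both by weight and in mol%, it may be helpful to see how the two measures relate. The following sketch converts a composition given in mol% into weight percent using the molar mass of each component. The component names, the molar masses, and the 40/30/25/5 composition are hypothetical placeholders chosen only for illustration and are not values from the present disclosure.

```python
def mol_to_weight_percent(mol_percent, molar_mass_g_per_mol):
    """Convert a lipid composition from mol% to weight%.

    mol_percent: mapping of component name to mol% (need not sum to exactly 100).
    molar_mass_g_per_mol: mapping of the same component names to molar mass.
    """
    # Mass contributed by each component is proportional to moles x molar mass.
    masses = {
        name: mol_percent[name] * molar_mass_g_per_mol[name]
        for name in mol_percent
    }
    total_mass = sum(masses.values())
    if total_mass <= 0:
        raise ValueError("composition must contain a positive amount of lipid")
    return {name: 100.0 * mass / total_mass for name, mass in masses.items()}


if __name__ == "__main__":
    # Hypothetical composition and placeholder molar masses (illustration only).
    composition_mol = {
        "cationic": 40.0,
        "non_cationic": 30.0,
        "cholesterol_based": 25.0,
        "peg_modified": 5.0,
    }
    placeholder_molar_mass = {
        "cationic": 700.0,
        "non_cationic": 750.0,
        "cholesterol_based": 400.0,
        "peg_modified": 2500.0,
    }
    for name, weight in mol_to_weight_percent(
        composition_mol, placeholder_molar_mass
    ).items():
        print(f"{name}: {weight:.1f} wt%")
```

As the sketch suggests, a component with a high molar mass, such as a PEG-modified lipid, can account for a considerably larger share by weight than its mol% alone would indicate.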

Non-cationic/helper lipids

In some embodiments, provided liposomes contain one or more non-cationic ("helper") lipids. As used herein, the phrase "non-cationic lipid" refers to any neutral, zwitterionic, or anionic lipid. As used herein, the phrase "anionic lipid" refers to any of a number of lipid species that carry a net negative charge at a selected pH (e.g., physiological pH). Non-cationic lipids include, but are not limited to, distearoylphosphatidylcholine (DSPC), dioleoylphosphatidylcholine (DOPC), dipalmitoylphosphatidylcholine (DPPC), dioleoylphosphatidylglycerol (DOPG), dipalmitoylphosphatidylglycerol (DPPG), dioleoylphosphatidylethanolamine (DOPE), palmitoyloleoylphosphatidylcholine (POPC), palmitoyloleoylphosphatidylethanolamine (POPE), dioleoylphosphatidylethanolamine 4-(N-maleimidomethyl)-cyclohexane-1-carboxylate (DOPE-mal), dipalmitoylphosphatidylethanolamine (DPPE), dimyristoylphosphatidylethanolamine (DMPE), distearoylphosphatidylethanolamine (DSPE), phosphatidylserine, sphingolipids, cerebrosides, gangliosides, 16-O-monomethyl PE, 16-O-dimethyl PE, 18-1-trans PE, 1-stearoyl-2-oleoyl-phosphatidylethanolamine (SOPE), or mixtures thereof.

In some embodiments, such non-cationic lipids may be used alone, but preferably in combination with other lipids (e.g., cationic lipids). In some embodiments, the non-cationic lipid may comprise the following molar ratios of total lipids present in the liposome: about 5% to about 90%, or about 10% to about 70%. In some embodiments, the non-cationic lipid is a neutral lipid, i.e., a lipid that does not carry a net charge under the conditions of formulation and/or administration of the composition. In some embodiments, the percentage of non-cationic lipids in the liposomes can be greater than about 5%, greater than 10%, greater than 20%, greater than 30%, or greater than 40%.

Cholesterol-based lipids

In some embodiments, provided liposomes comprise one or more cholesterol-based lipids. Suitable cholesterol-based cationic lipids include, for example, DC-Chol (N,N-dimethyl-N-ethylcarboxamidocholesterol), 1,4-bis(3-N-oleylamino-propyl)piperazine (Gao et al., Biochem. Biophys. Res. Comm. 179, 280 (1991); Wolf et al., BioTechniques 23, 139 (1997); U.S. Pat. No. 5,744,335), or ICE. In some embodiments, the cholesterol-based lipid may comprise a molar ratio of about 2% to about 30%, or about 5% to about 20%, of the total lipid present in the liposome. In some embodiments, the percentage of cholesterol-based lipids in the lipid nanoparticle is greater than 5%, greater than 10%, greater than 20%, greater than 30%, or greater than 40%.

PEG modified lipids

The present invention also contemplates the use of polyethylene glycol (PEG)-modified phospholipids and derivatized lipids, such as derivatized ceramides (PEG-CER), including N-octanoyl-sphingosine-1-[succinyl(methoxypolyethylene glycol)-2000] (C8 PEG-2000 ceramide), alone or preferably in combination with other lipid formulations that together constitute a transfer vehicle (e.g., lipid nanoparticles). Contemplated PEG-modified lipids include, but are not limited to, a polyethylene glycol chain of up to 5 kDa in length covalently attached to a lipid with one or more alkyl chains of C₆-C₂₀ length. The addition of such components can prevent complex aggregation and may also provide a means for increasing the circulation lifetime of the lipid-nucleic acid composition and increasing its delivery to the target tissue (Klibanov et al. (1990) FEBS Letters, 268(1): 235-237), or they may be selected to rapidly exchange out of the formulation in vivo (see U.S. Pat. No. 5,885,613). Particularly useful exchangeable lipids are PEG-ceramides with shorter acyl chains (e.g., C14 or C18). The PEG-modified phospholipids and derivatized lipids of the invention may comprise a molar ratio of about 0% to about 20%, about 0.5% to about 20%, about 1% to about 15%, about 4% to about 10%, or about 2% of the total lipid present in the liposomal transfer vehicle.

According to various embodiments, the selection of the cationic lipids, non-cationic lipids, and/or PEG-modified lipids that make up the lipid nanoparticle, as well as the relative molar ratio of such lipids to each other, is based on the characteristics of the one or more selected lipids, the nature of the intended target cells, and the characteristics of the MCNA to be delivered. Additional considerations include, for example, the saturation of the alkyl chains, as well as the size, charge, pH, pKa, fusogenicity, and toxicity of the one or more selected lipids. The molar ratios may be adjusted accordingly.

Polymer

In some embodiments, suitable delivery vehicles are formulated using polymers as carriers, either alone or in combination with other carriers, including the various lipids described herein. Thus, in some embodiments, liposome delivery vehicles as used herein also encompass nanoparticles comprising a polymer. Suitable polymers may include, for example, polyacrylates, polyalkylcyanoacrylates, polylactides, polylactide-polyglycolide copolymers, polycaprolactone, dextran, albumin, gelatin, alginate, collagen, chitosan, cyclodextrin, protamine, PEGylated protamine, PLL, PEGylated PLL, and polyethylenimine (PEI). When PEI is present, it may be branched PEI with a molecular weight in the range of 10 to 40 kDa, such as 25 kDa branched PEI (Sigma #408727).

Liposomes suitable for use in the present invention

Suitable liposomes for use in the present invention can include one or more of any of the cationic lipids, non-cationic lipids, cholesterol lipids, PEG-modified lipids and/or polymers described herein in varying ratios. As non-limiting examples, suitable liposome formulations can include a combination selected from the group consisting of: cKK-E12, DOPE, cholesterol and DMG-PEG2K; C12-200, DOPE, cholesterol and DMG-PEG2K; HGT4003, DOPE, cholesterol and DMG-PEG2K; ICE, DOPE, cholesterol and DMG-PEG2K; or ICE, DOPE and DMG-PEG2K.

In various embodiments, the cationic lipid (e.g., cKK-E12, C12-200, ICE, and/or HGT4003) comprises about 30-60% (e.g., about 30-55%, about 30-50%, about 30-45%, about 30-40%, about 35-50%, about 35-45%, or about 35-40%) of the liposome. In some embodiments, the percentage of cationic lipids (e.g., cKK-E12, C12-200, ICE, and/or HGT4003) is at or above about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, or about 60% of the liposome.

In some embodiments, the ratio of the one or more cationic lipids to the one or more non-cationic lipids to the one or more cholesterol-based lipids to the one or more PEG-modified lipids may be between about 30-60:25-35:20-30:1-15, respectively. In some embodiments, the ratio of the one or more cationic lipids to the one or more non-cationic lipids to the one or more cholesterol-based lipids to the one or more PEG-modified lipids is about 40:30:20:10, respectively. In some embodiments, the ratio of the one or more cationic lipids to the one or more non-cationic lipids to the one or more cholesterol-based lipids to the one or more PEG-modified lipids is about 40:30:25:5, respectively. In some embodiments, the ratio of the one or more cationic lipids to the one or more non-cationic lipids to the one or more cholesterol-based lipids to the one or more PEG-modified lipids is about 40:32:25:3, respectively. In some embodiments, the ratio of the one or more cationic lipids to the one or more non-cationic lipids to the one or more cholesterol-based lipids to the one or more PEG-modified lipids is about 50:25:20:5.
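The four-component ratios listed above can be checked against the stated window of about 30-60:25-35:20-30:1-15 with a few lines of Python. This is a minimal sketch for illustration only: the names `RANGES`, `within_window`, `as_fractions` and `EXAMPLES` are hypothetical, and the word "about" in the text is treated here as an exact bound, which is an assumption.

```python
# Illustrative sketch; names are hypothetical and "about" is treated as exact.
# Window stated in the text for cationic : non-cationic : cholesterol-based : PEG-modified.
RANGES = {
    "cationic": (30, 60),
    "non_cationic": (25, 35),
    "cholesterol_based": (20, 30),
    "peg_modified": (1, 15),
}

# The four example ratios recited in the paragraph above.
EXAMPLES = [(40, 30, 20, 10), (40, 30, 25, 5), (40, 32, 25, 3), (50, 25, 20, 5)]


def within_window(ratio):
    """Return True if every part of the ratio lies inside its stated range."""
    return all(lo <= part <= hi for part, (lo, hi) in zip(ratio, RANGES.values()))


def as_fractions(ratio):
    """Convert ratio parts to fractions of the total lipid content."""
    total = sum(ratio)
    return [part / total for part in ratio]


if __name__ == "__main__":
    for ratio in EXAMPLES:
        print(ratio, within_window(ratio), as_fractions(ratio))
```

Each recited example ratio sums to 100, so its parts can be read directly as percentages of total lipid.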

In particular embodiments, the liposomes for use in the present invention include a lipid component consisting of a cationic lipid, a non-cationic lipid (e.g., DOPE or DEPE), a PEG-modified lipid (e.g., DMG-PEG 2K), and optionally cholesterol. Cationic lipids particularly suitable for inclusion in such liposomes include GL-TES-SA-DME-E18-2, TL1-01D-DMA, SY-3-E14-DMAPR, TL1-10D-DMA, HGT4002 (also referred to herein as Guan-SS-Chol), GL-TES-SA-DMP-E18-2, HEP-E4-E10, HEP-E3-E10 and TL1-04D-DMA. These cationic lipids have been found to be particularly suitable for use in liposomes for administration by pulmonary delivery via nebulization. Among these cationic lipids, HEP-E4-E10, HEP-E3-E10, GL-TES-SA-DME-E18-2, GL-TES-SA-DMP-E18-2, TL1-01D-DMA and TL1-04D-DMA perform particularly well.

Exemplary liposomes include one of GL-TES-SA-DME-E18-2, TL1-01D-DMA, SY-3-E14-DMAPR, TL1-10D-DMA, GL-TES-SA-DMP-E18-2, HEP-E4-E10, HEP-E3-E10, and TL1-04D-DMA as a cationic lipid component, DOPE as a non-cationic lipid component, cholesterol as a helper lipid component, and DMG-PEG2K as a PEG modified lipid component. In some embodiments, the molar ratio of cationic lipid to non-cationic lipid to cholesterol to PEG-modified lipid may be between about 30-60:25-35:20-30:1-15, respectively. In some embodiments, the molar ratio of cationic lipid to non-cationic lipid to cholesterol to PEG-modified lipid is about 40:30:20:10, respectively. In some embodiments, the molar ratio of cationic lipid to non-cationic lipid to cholesterol to PEG-modified lipid is about 40:30:25:5, respectively. In some embodiments, the molar ratio of cationic lipid to non-cationic lipid to cholesterol to PEG-modified lipid is about 40:32:25:3, respectively. In some embodiments, the molar ratio of cationic lipid to non-cationic lipid to cholesterol to PEG-modified lipid is about 50:25:20:5, respectively.

In some embodiments, the lipid component of liposomes particularly suitable for pulmonary delivery consists of HGT4002 (also referred to herein as Guan-SS-Chol), DOPE, and DMG-PEG 2K. In some embodiments, the molar ratio of cationic lipid to non-cationic lipid to PEG-modified lipid is about 60:35:5.

Ratios of different lipid components

In embodiments where the lipid nanoparticle comprises three and no more than three different components of lipid, the ratio of total lipid content (i.e., the ratio of lipid component (1): lipid component (2): lipid component (3)) may be expressed as x: y: z, where

(y + z) = 100 - x.

In some embodiments, "x", "y" and "z" each represent mole percentages of three different components of the lipid, and the ratio is a molar ratio.

In some embodiments, "x", "y" and "z" each represent weight percentages of three different components of the lipid, and the ratio is a weight ratio.

In some embodiments, lipid component (1) represented by variable "x" is a sterol-based cationic lipid.

In some embodiments, lipid component (2) represented by variable "y" is a helper lipid.

In some embodiments, lipid component (3) represented by the variable "z" is a PEG lipid.

In some embodiments, the variable "x" representing the mole percent of lipid component (1) (e.g., a sterol-based cationic lipid) is at least about 10%, about 20%, about 30%, about 40%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, or about 95%.

In some embodiments, the variable "x" representing the mole percent of lipid component (1) (e.g., a sterol-based cationic lipid) is no more than about 95%, about 90%, about 85%, about 80%, about 75%, about 70%, about 65%, about 60%, about 55%, about 50%, about 40%, about 30%, about 20%, or about 10%. In embodiments, the variable "x" is no more than about 65%, about 60%, about 55%, about 50%, or about 40%.

In some embodiments, the variable "x" representing the mole percent of lipid component (1) (e.g., a sterol-based cationic lipid) is: at least about 50% but less than about 95%; at least about 50% but less than about 90%; at least about 50% but less than about 85%; at least about 50% but less than about 80%; at least about 50% but less than about 75%; at least about 50% but less than about 70%; at least about 50% but less than about 65%; or at least about 50% but less than about 60%. In embodiments, the variable "x" is at least about 50% but less than about 70%; at least about 50% but less than about 65%; or at least about 50% but less than about 60%.

In some embodiments, the variable "x" representing the weight percent of lipid component (1) (e.g., a sterol-based cationic lipid) is at least about 10%, about 20%, about 30%, about 40%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, or about 95%.

In some embodiments, the variable "x" representing the weight percent of lipid component (1) (e.g., a sterol-based cationic lipid) is no more than about 95%, about 90%, about 85%, about 80%, about 75%, about 70%, about 65%, about 60%, about 55%, about 50%, about 40%, about 30%, about 20%, or about 10%. In embodiments, the variable "x" is no more than about 65%, about 60%, about 55%, about 50%, or about 40%.

In some embodiments, the variable "x" representing the weight percent of lipid component (1) (e.g., a sterol-based cationic lipid) is: at least about 50% but less than about 95%; at least about 50% but less than about 90%; at least about 50% but less than about 85%; at least about 50% but less than about 80%; at least about 50% but less than about 75%; at least about 50% but less than about 70%; at least about 50% but less than about 65%; or at least about 50% but less than about 60%. In embodiments, the variable "x" is at least about 50% but less than about 70%; at least about 50% but less than about 65%; or at least about 50% but less than about 60%.

In some embodiments, the variable "z" representing the mole percent of lipid component (3) (e.g., a PEG lipid) is no more than about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, or 25%. In embodiments, the variable "z" representing the mole percent of lipid component (3) (e.g., a PEG lipid) is about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, or 10%. In embodiments, the variable "z" representing the mole percent of lipid component (3) (e.g., a PEG lipid) is from about 1% to about 10%, from about 2% to about 10%, from about 3% to about 10%, from about 4% to about 10%, from about 1% to about 7.5%, from about 2.5% to about 10%, from about 2.5% to about 7.5%, from about 2.5% to about 5%, from about 5% to about 7.5%, or from about 5% to about 10%.

In some embodiments, the variable "z" representing the weight percent of lipid component (3) (e.g., a PEG lipid) is no more than about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, or 25%. In embodiments, the variable "z" representing the weight percent of lipid component (3) (e.g., a PEG lipid) is about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, or 10%. In embodiments, the variable "z" representing the weight percent of lipid component (3) (e.g., a PEG lipid) is from about 1% to about 10%, from about 2% to about 10%, from about 3% to about 10%, from about 4% to about 10%, from about 1% to about 7.5%, from about 2.5% to about 10%, from about 2.5% to about 7.5%, from about 2.5% to about 5%, from about 5% to about 7.5%, or from about 5% to about 10%.

For compositions having three and only three different lipid components, the variables "x", "y" and "z" may be in any combination, so long as the sum of the three variables adds up to 100% of the total lipid content.
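As a worked illustration of the three-component constraint, the following Python sketch derives the helper lipid percentage "y" from chosen values of "x" and "z" so that the three sum to 100%. The function names are hypothetical. The window tested by `in_example_window` is just one combination of the ranges recited above ("x" at least about 50% but less than about 70%, "z" from about 1% to about 10%), with "about" treated as exact for simplicity.

```python
# Illustrative sketch; function names are hypothetical.

def helper_percent(x, z):
    """Return y such that x + y + z = 100, i.e. (y + z) = 100 - x."""
    y = 100.0 - x - z
    if min(x, y, z) < 0:
        raise ValueError("x, y and z must each be non-negative and sum to 100")
    return y


def in_example_window(x, z):
    """Check one recited window: 50 <= x < 70 and 1 <= z <= 10 (mol% or wt%)."""
    return 50 <= x < 70 and 1 <= z <= 10


if __name__ == "__main__":
    # The 60:35:5 ratio recited earlier for HGT4002 : DOPE : DMG-PEG2K.
    x, z = 60, 5
    print(x, helper_percent(x, z), z, in_example_window(x, z))
```

The same arithmetic applies whether the percentages are molar or by weight, provided all three variables use the same basis.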

Formation of liposomes encapsulating mRNA

Liposome transfer vehicles for use in the compositions of the present invention can be prepared by a variety of techniques presently known in the art. For example, multilamellar vesicles (MLVs) can be prepared according to conventional techniques, such as by depositing the selected lipid on the inner wall of a suitable container or vessel (by dissolving the lipid in a suitable solvent, then evaporating the solvent to leave a thin film inside the vessel) or by spray drying. An aqueous phase may then be added to the vessel with a swirling motion, which results in the formation of MLVs. Unilamellar vesicles (ULVs) can then be formed by homogenization, sonication, or extrusion of the multilamellar vesicles. Alternatively, unilamellar vesicles may be formed by detergent removal techniques.

In certain embodiments, provided compositions comprise liposomes, wherein mRNA is both associated on the surface of the liposome and encapsulated within the same liposome. For example, during the preparation of the compositions of the invention, cationic liposomes can associate with mRNA via electrostatic interactions.

In some embodiments, the compositions and methods of the invention comprise mRNA encapsulated in liposomes. In some embodiments, one or more mRNA species may be encapsulated in the same liposome. In some embodiments, one or more mRNA species may be encapsulated in different liposomes. In some embodiments, mRNA is encapsulated in one or more liposomes that differ in their lipid composition, molar ratio of lipid components, size, charge (zeta potential), targeting ligand, and/or combinations thereof. In some embodiments, one or more liposomes can have different compositions of sterol-based cationic lipids, neutral lipids, PEG-modified lipids, and/or combinations thereof. In some embodiments, one or more of the liposomes can have different molar ratios of cholesterol-based cationic lipids, neutral lipids, and PEG-modified lipids used to produce the liposomes.

The process of incorporating the desired mRNA into liposomes is often referred to as "loading". Exemplary methods are described in Lasic et al., FEBS Lett., 312: 255-258, 1992, which is incorporated herein by reference. The nucleic acid incorporated into the liposome may be located wholly or partially within the interior space of the liposome, within the bilayer membrane of the liposome, or attached to the outer surface of the liposome membrane. Incorporation of a nucleic acid into a liposome is also referred to herein as "encapsulation," wherein the nucleic acid is contained entirely within the interior space of the liposome. The purpose of incorporating mRNA into a transfer vehicle (e.g., a liposome) is often to protect the nucleic acid from an environment that may contain enzymes or chemicals that degrade the nucleic acid and/or systems or receptors that cause rapid excretion of the nucleic acid. Thus, in some embodiments, a suitable delivery vehicle is capable of enhancing the stability of the mRNA contained therein and/or facilitating delivery of the mRNA to a target cell or tissue.

Suitable liposomes according to the invention can be made in a variety of sizes. In some embodiments, provided liposomes can be made smaller than previously known mRNA-encapsulating liposomes. In some embodiments, reduced liposome size is associated with more efficient delivery of mRNA. The selection of the appropriate liposome size may take into account the site of the target cell or tissue and, to some extent, the application for which the liposome is being prepared.

In some embodiments, liposomes of suitable size are selected to promote systemic distribution of antibodies encoded by the mRNA. In some embodiments, it may be desirable to limit transfection of mRNA to certain cells or tissues. For example, to target hepatocytes, liposomes can be sized such that they are smaller than the fenestrations of the endothelial layer lining the hepatic sinusoids in the liver; in this case, the liposomes can readily pass through such endothelial fenestrations to reach the target hepatocytes.

Alternatively or additionally, the liposomes may be sized such that their diameter is sufficiently large to limit or expressly avoid distribution into certain cells or tissues.

Various alternative methods known in the art are available for sizing a population of liposomes. One such sizing method is described in U.S. Pat. No. 4,737,323 (incorporated herein by reference). Sonicating a liposome suspension by either bath or probe sonication produces a progressive size reduction down to small ULVs of less than about 0.05 microns in diameter. Homogenization is another method that relies on shearing energy to fragment large liposomes into smaller ones. In a typical homogenization procedure, MLVs are recirculated through a standard emulsion homogenizer until selected liposome sizes (typically between about 0.1 and 0.5 microns) are observed. The size of the liposomes can be determined by quasi-elastic light scattering (QELS) as described in Bloomfield, Ann. Rev. Biophys. Bioeng., 10:421-450 (1981), incorporated herein by reference. The average liposome diameter can be reduced by sonicating the formed liposomes. Intermittent sonication cycles can be alternated with QELS assessment to guide efficient liposome synthesis.

Therapeutic use of compositions

In one aspect, the invention provides, inter alia, a method of inducing expression of a protein in vivo by administering a codon optimized nucleic acid encoding the protein, or by administering the protein. In some embodiments, the composition comprises a nucleic acid encapsulated with or complexed with a delivery vehicle. In some embodiments, the delivery vehicle is selected from the group consisting of liposomes, lipid nanoparticles, solid-lipid nanoparticles, polymers, viruses, sol-gels, and nanogels. In some embodiments, the codon-optimized nucleic acid encoding the protein is packaged in a viral particle.

For oral administration, the pharmaceutical formulation is in the form of, for example, a tablet or capsule prepared by known methods with pharmaceutically acceptable excipients such as binders (e.g., pregelatinized cornstarch, polyvinylpyrrolidone, or methylcellulose); fillers (e.g., lactose, microcrystalline cellulose, or calcium hydrogen phosphate); lubricants (e.g., magnesium stearate, talc, silica); disintegrants (e.g., potato starch); and/or wetting agents (e.g., sodium lauryl sulfate). The tablets may be coated using known methods. Liquid formulations for oral administration take the form of, for example, solutions, syrups or suspensions, or may be presented as a dry product to be dissolved in water or another liquid prior to use. Such formulations are prepared by known methods with pharmaceutically acceptable additives such as suspending agents (e.g., sorbitol, cellulose derivatives, edible hydrogenated fats); emulsifying agents (e.g., lecithin or acacia); non-aqueous liquids (e.g., almond oil, oily esters, ethyl alcohol, or fractionated vegetable oil); and/or preservatives (e.g., methyl or propyl hydroxybenzoate, sorbic acid, or ascorbic acid). Where appropriate, the formulations may also contain buffer salts, colorants, flavors and/or sweeteners.

Formulations for oral administration are formulated in known manner to provide controlled release of the active compound.

Examples

Although certain compounds, compositions, and methods of the present invention have been specifically described according to certain embodiments, the following examples are illustrative of the compounds of the present invention and are not intended to be limiting.

Example 1. Production of optimized nucleotide sequences

This example illustrates a method of producing an optimized nucleotide sequence according to the invention that is optimized to produce full length transcripts and result in high levels of expression of the encoded protein during in vitro synthesis.

The method combines the codon optimization method of FIG. 1A with the series of filtering steps shown in FIG. 1B to generate a list of optimized nucleotide sequences. Specifically, as shown in FIG. 1A, the method receives an amino acid sequence of interest and a first codon usage table reflecting the frequency of each codon in a given organism (i.e., human codon usage bias in the case of the present example). If a codon is associated with a codon usage frequency below a threshold frequency (10%), the method removes that codon from the first codon usage table. The codon usage frequencies of the codons not removed in this first step are then normalized to generate a normalized codon usage table.

Normalizing the codon usage table involves reassigning the usage frequency of each removed codon: the usage frequency of a removed codon is added to the usage frequencies of the remaining codons that encode the same amino acid. In this example, the reassignment is proportional to the usage frequencies of the codons not removed from the table. The method uses the normalized codon usage table to generate a list of optimized nucleotide sequences. Each optimized nucleotide sequence encodes the amino acid sequence of interest.
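The removal and proportional reassignment steps described above can be sketched in Python as follows. Redistributing the removed frequency in proportion to the remaining frequencies is arithmetically the same as dividing each remaining frequency by the sum of the remaining frequencies, which is what `normalize` does. Everything named here is hypothetical: the `USAGE` values are illustrative placeholders rather than a curated human codon usage table, and drawing codons at random according to the normalized frequencies is an assumption about how candidate sequences might be generated, not a step confirmed by the text.

```python
import random

THRESHOLD = 0.10  # threshold frequency given in this example (10%)

# Toy table: amino acid -> {codon: frequency among its synonymous codons}.
# Placeholder values for illustration only.
USAGE = {
    "M": {"ATG": 1.00},
    "L": {"CTG": 0.40, "CTC": 0.20, "TTG": 0.13, "CTT": 0.13, "CTA": 0.07, "TTA": 0.07},
    "K": {"AAG": 0.57, "AAA": 0.43},
}


def normalize(usage, threshold=THRESHOLD):
    """Drop codons below the threshold, then rescale the rest to sum to 1."""
    normalized = {}
    for aa, codons in usage.items():
        kept = {c: f for c, f in codons.items() if f >= threshold}
        if not kept:
            raise ValueError(f"no codon left for amino acid {aa}")
        total = sum(kept.values())
        # Dividing by the remaining total redistributes the removed
        # frequency in proportion to each remaining codon's frequency.
        normalized[aa] = {c: f / total for c, f in kept.items()}
    return normalized


def generate(protein, table, n, seed=0):
    """Assumed generation step: sample one codon per residue, n times."""
    rng = random.Random(seed)
    sequences = []
    for _ in range(n):
        codons = [
            rng.choices(list(table[aa]), weights=list(table[aa].values()))[0]
            for aa in protein
        ]
        sequences.append("".join(codons))
    return sequences


if __name__ == "__main__":
    table = normalize(USAGE)
    print(table["L"])
    print(generate("MLK", table, n=3))
```

With the placeholder values, the two leucine codons at 7% are removed and the four remaining leucine codons are rescaled so that their frequencies again sum to 1.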

As shown in FIG. 1B, the list of optimized nucleotide sequences was further processed by applying a motif screening filter, a guanine-cytosine (GC) content analysis filter, and a Codon Adaptation Index (CAI) analysis filter, in that order, to generate an updated list of optimized nucleotide sequences.
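The three filters can be pictured as successive passes over the candidate list, as in the Python sketch below. Several things here are assumptions rather than details taken from the text: CAI is computed with the commonly used definition (the geometric mean, over all codons, of each codon's frequency divided by the frequency of the most used synonymous codon); the motif list and GC window are hypothetical placeholders; and the CAI cutoff of 0.8 is borrowed from the figure discussed in Example 2. The `USAGE` values are illustrative only.

```python
import math

# Toy table: amino acid -> {codon: frequency}. Placeholder values only.
USAGE = {
    "M": {"ATG": 1.00},
    "L": {"CTG": 0.40, "CTC": 0.20, "TTG": 0.13, "CTT": 0.13},
    "K": {"AAG": 0.57, "AAA": 0.43},
}


def gc_content(seq):
    """Fraction of G and C nucleotides in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def cai(seq, usage):
    """Geometric mean of relative adaptiveness (assumed, common CAI definition)."""
    weights = {}
    for codons in usage.values():
        top = max(codons.values())
        for codon, freq in codons.items():
            weights[codon] = freq / top
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return math.exp(sum(math.log(weights[c]) for c in codons) / len(codons))


def apply_filters(seqs, usage, motifs, gc_range, min_cai):
    """Apply motif screen, then GC content filter, then CAI filter, in that order."""
    kept = [s for s in seqs if not any(m in s for m in motifs)]
    kept = [s for s in kept if gc_range[0] <= gc_content(s) <= gc_range[1]]
    kept = [s for s in kept if cai(s, usage) >= min_cai]
    return kept


if __name__ == "__main__":
    candidates = ["ATGCTGAAG", "ATGTTGAAA", "ATGCTCAAG"]
    passed = apply_filters(
        candidates,
        USAGE,
        motifs=("AATAAA",),      # hypothetical motif to screen out
        gc_range=(0.30, 0.70),   # hypothetical GC window
        min_cai=0.8,             # cutoff borrowed from Example 2
    )
    print(passed)
```

Running the filters in sequence means each later, more expensive check only sees candidates that survived the earlier ones; the actual criteria used in FIG. 1B may differ from these placeholders.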

As shown in the examples below, this method results in an optimized nucleotide sequence encoding the amino acid sequence of interest. The nucleotide sequences produce full length transcripts during in vitro synthesis and result in high levels of expression of the encoded proteins.

Example 2. Codon optimization of CFTR mRNA sequence to increase CAI resulted in higher protein expression

This example demonstrates that a codon optimized protein coding sequence with a Codon Adaptation Index (CAI) of about 0.8 or higher is superior to a codon optimized protein coding sequence with a CAI of less than 0.8.

Human cystic fibrosis transmembrane conductance regulator (hCFTR) was codon optimized as described in example 1. hCFTR is encoded by a sequence of 4440 nucleotides.

Mutations in the gene encoding the hCFTR protein lead to cystic fibrosis (CF), which is the most common genetic disease in the Caucasian population. It is characterized by abnormal transport of chloride and sodium ions in epithelial cells, resulting in thick, viscous secretions that most severely affect the lung, as well as the pancreas, liver and intestine. mRNA comprising a codon optimized hCFTR coding sequence is being developed as a novel therapeutic for the treatment of CF.

The natural hCFTR amino acid sequence was codon optimized according to the method of the invention as shown in Example 1. Three sequences designated hCFTR#1 (SEQ ID NO: 16), hCFTR#2 (SEQ ID NO: 2) and hCFTR#3 (SEQ ID NO: 3) were selected for further analysis. For reference, a nucleotide sequence having an hCFTR coding sequence codon optimized with a different algorithm is provided (SEQ ID NO: 15). The reference nucleotide sequence (SEQ ID NO: 15) has previously been validated experimentally both in vitro and in vivo. It was found to provide superior protein yields relative to other, earlier tested codon optimized nucleotide sequences encoding the hCFTR protein. The CAI and GC content (%) of the codon optimized hCFTR#2 and hCFTR#3 sequences are significantly increased when compared to the reference nucleotide sequence. Furthermore, their codon frequency distribution (CFD) was 0%, compared to 6% for the reference nucleotide sequence, indicating that rare codon clusters detrimental to translation efficiency were successfully removed. Additional filtering to remove negative regulatory motifs resulted in a significant reduction in the number of negative cis-regulatory (CIS) elements in hCFTR#2 and hCFTR#3 (see Table 2).

TABLE 2

To test the protein yield from each codon optimized sequence, 4 nucleic acid vectors were prepared, each comprising an expression cassette containing one of the 4 nucleotide sequences encoding hCFTR protein flanked by identical 3' and 5' untranslated sequences (3' and 5' UTRs) and preceded by an RNA polymerase promoter. These nucleic acid vectors were used as templates for in vitro transcription reactions to provide 4 batches of mRNA containing the 4 codon optimized nucleotide sequences (reference and hCFTR#1 to #3). Capping and tailing were performed separately.

Each capped and tailed mRNA was transfected individually into a cell line (HEK293). Cell lysates were collected 24 and 48 hours after transfection. Protein samples were extracted and processed for SDS-PAGE. The expression level of the encoded hCFTR protein was assessed by western blotting. Proteins were visualized and quantified using the LI-COR system. Protein yield is expressed in relative fluorescence units (RFU). The results of this experiment are summarized in FIG. 2A. The codon optimized nucleotide sequences hCFTR#2 and hCFTR#3 (both having a CAI of 0.89) produced significantly higher yields of encoded hCFTR protein than the reference nucleotide sequence and hCFTR#1 (both having a CAI of 0.7). This effect was more pronounced at the 24 hour time point (see FIG. 2B), probably due to relatively rapid degradation of mRNA in HEK293 cells after transfection.

The data in this example demonstrate that codon optimization of a therapeutically relevant nucleotide sequence (hCFTR) to achieve a CAI of about 0.8 or higher results in higher protein yields, especially when combined with optimization of its CFD and GC content and with removal of any negative CIS elements from the nucleic acid sequence. The data in this example also demonstrate that codon optimization of mRNA according to the method of the invention results in very high protein yields in human cells compared to nucleotide sequences that are codon optimized with different algorithms.

Example 3. Codon optimization of CFTR nucleotide sequence leads to increased functional activity in cells

This example illustrates that codon optimization of the nucleotide sequence encoding a protein of interest according to the method of the invention does not affect the functional activity of the protein in human cells. To illustrate this, mRNA encoding CFTR was used in this study.

Administration of hCFTR mRNA is intended to result in uptake of hCFTR mRNA by airway epithelial cells of CF patients, followed by internalization into the cytoplasm of the target cells. After cellular uptake is achieved, hCFTR mRNA is translated into normal hCFTR protein, which is then processed by the cell's endogenous secretory pathway, resulting in localization of the hCFTR protein in the apical cell membrane. In this way, hCFTR mRNA administration produces functional hCFTR protein in airway epithelial cells, thereby correcting the lack of functional CFTR in the lungs of CF patients. Codon optimization of the hCFTR mRNA nucleotide sequence can increase expression of functional hCFTR protein, which is thought to result in higher amounts of functional hCFTR protein in target airway epithelial cells of CF patients.

It has been reported that codon optimization can come at the expense of reduced functional activity and an associated loss of efficacy of the encoded protein, as the process may remove information encoded in the nucleotide sequence that is important for controlling translation of the protein and ensuring correct folding of the nascent polypeptide chain (Mauro and Chappell, Trends Mol Med. 2014; 20(11): 604-13). To test the functional activity of hCFTR protein expressed from codon optimized sequences generated using the codon optimization method shown in Example 1, the hCFTR mRNAs produced in Example 2 were tested in an Ussing chamber assay. The assay uses an epithelial voltage clamp to monitor the chloride transport function of epithelial cells transfected with hCFTR mRNA in order to assess the functional activity of the protein expressed from the mRNA. Specifically, the functional activity of hCFTR protein expressed from mRNA having the control hCFTR coding sequence (SEQ ID NO: 15) or the coding sequence of hCFTR#1 (SEQ ID NO: 16), hCFTR#2 (SEQ ID NO: 2) or hCFTR#3 (SEQ ID NO: 3) was measured in Fischer Rat Thyroid (FRT) epithelial cells. FRT epithelial cells are commonly used as a model to study human airway epithelial cell function. Monolayers of FRT epithelial cells were grown on Snapwell™ filter inserts and transfected with the 4 hCFTR mRNAs. The 4 hCFTR mRNAs were generated as described in Example 2. The control mRNA has previously been validated in this assay and was used as a reference standard.

When CFTR agonists (forskolin and VX-770) are administered, correctly translated and localized hCFTR protein produced from hCFTR mRNA increases the short-circuit current (I_SC) output in the Ussing epithelial voltage clamp apparatus. Administration of the CFTR antagonist CFTRinh-172 brings hCFTR into a blocked state. The I_SC current polarity convention in the assay records apical-to-basolateral sodium current and basolateral-to-apical chloride current as negative values; thus, if transfection with a test hCFTR mRNA generates high negative values, it can be concluded that the encoded hCFTR protein is functional (FIG. 3A). Furthermore, by transfecting equal amounts of mRNA, it can be assessed whether an mRNA produces a higher hCFTR protein yield, as protein yield and activity are related. Transfection of FRT epithelial cells with mRNA having the hCFTR#1 coding sequence resulted in activity comparable to that obtained by transfection with mRNA having the control hCFTR coding sequence (FIG. 3B). mRNA having an hCFTR coding sequence produced by the method of the invention resulted in significantly increased activity. Consistent with the higher protein yields observed in Example 2, the activity resulting from hCFTR protein produced from mRNA encoding hCFTR#2 was more than 2 times that of the control mRNA, and the activity resulting from hCFTR protein produced from mRNA encoding hCFTR#3 was 3 times that of the control mRNA. This demonstrates that the higher protein yield observed in Example 2 for hCFTR#2 and hCFTR#3 is directly related to higher functional activity, indicating that codon optimization according to the methods of the invention does not negatively affect the functional activity of the encoded protein.

In summary, codon optimisation according to the method of the invention results in higher expression of the encoded protein in human cells and the expressed protein provides complete functional activity in a model system, which is a highly relevant model for human therapy.

Example 4 evaluation of codon optimized wild-type CFTR constructs against activated CFTR constructs

In this example, to examine the efficacy of the codon optimized mRNA of the present invention, the expression and activity of the codon optimized wild-type CFTR construct was compared to the non-codon optimized wild-type CFTR construct and the activated CFTR mutant construct.

The inventors developed mRNA encoding an engineered or mutant CFTR protein that showed increased activity and/or stability. In particular, the engineered CFTR protein may contain one or more modifications that mimic phosphorylated residues in the R domain (R domain phosphomimetic mutations). These mutations result in activation and opening of the CFTR chloride channel. Another strategy for engineering activating mutants of CFTR is to mutate residues involved in ATP gating (e.g., E1371Q). CFTR proteins undergo ubiquitination at lysine residues. Mutations that replace a lysine residue with another amino acid residue (e.g., K14R) result in enhanced stability and expression of the CFTR protein. Example 3 shows that the codon optimized wild type CFTR construct of the invention has higher activity than the reference CFTR mRNA construct. To further evaluate the expression and activity of the codon optimized CFTR constructs of the invention, they were compared in this experiment with the activated CFTR constructs listed in table 3 and with non-codon optimized CFTR constructs.

TABLE 3 wild-type and mutant constructs of various CFTR

First, studies were performed to evaluate in vitro translation of codon optimized wild-type CFTR and CFTR mutants. The data from these studies indicate that the codon optimized WT sequence, as well as the 13E and E1371Q K14R variants, showed increased expression of the C band in HEK293 lysates (fig. 4A and 4B). "C band" refers to the mature, complex-glycosylated form of CFTR. In addition, codon optimized WT CFTR showed higher efficacy than non-codon optimized WT CFTR (fig. 4C).

Next, the codon optimized CFTR sequences (both WT and mutant) were evaluated against non-codon optimized WT CFTR in an Ussing chamber assay. The data from these assays indicate that codon optimized WT CFTR showed increased activity in the Ussing chamber assay compared to non-codon optimized CFTR (fig. 5A). Notably, codon optimized (CO) WT CFTR showed activity comparable to that of the codon optimized activated CFTR mutant constructs (fig. 5B). Surprisingly, it was found that the activity of wild-type CFTR protein can be significantly enhanced without introducing amino acid mutations.

The duration of activity of the various CFTR constructs listed in table 3 was measured in the Ussing chamber. The activity of the CFTR protein was measured at 22 and 44 hours, and the short-circuit current (I_SC; ion movement measured from active transport in the Ussing chamber) was plotted for each CFTR protein at both time points. Figure 5C shows that codon optimized WT CFTR has high residual activity at 44 hours, significantly higher than its non-codon optimized counterpart.

In vitro tolerance of mutant CFTR mRNA was also assessed in HEK293 cells using a commercially available cytotoxicity assay. The data from these studies indicate that none of the CFTR variants, including codon optimized WT CFTR, showed increased cytotoxicity in HEK293 cells when compared to vehicle controls (fig. 6).

Taken together, these data demonstrate that mRNAs that are codon optimized according to the present invention exhibit significantly higher activity, which is particularly useful for treating pulmonary diseases by means of mRNA therapeutics.

EXAMPLE 5 Synthesis of lipids for pulmonary delivery

The synthesis of the cationic lipids of the present invention is described in this example.

1.GL-TES-SA-DME-E18-2

Synthetic scheme

Synthesis of (9Z,12Z)-octadeca-9,12-dienoyl chloride (2)

To a solution of linoleic acid (1.0 g, 3.6 mmol) in 10 mL of dichloromethane were added N,N-dimethylformamide (0.1 mL) and oxalyl chloride (1.2 mL, 14.3 mmol) at 0 °C. The reaction mixture was warmed to room temperature and stirred for 3 h. The solvent was removed under reduced pressure and the crude product was used in the next step without further purification.

Synthesis of 2-((1,3-bis(((9Z,12Z)-octadeca-9,12-dienoyl)oxy)-2-((((9Z,12Z)-octadeca-9,12-dienoyl)oxy)methyl)propan-2-yl)amino)ethane-1-sulfonic acid (3)

To a solution of (9Z,12Z)-octadeca-9,12-dienoyl chloride 2 (1.1 g, 3.6 mmol) in anhydrous N,N-dimethylacetamide (5.0 mL) and N-methylmorpholine (3.0 mL) was added 2-((1,3-dihydroxy-2-(hydroxymethyl)propan-2-yl)amino)ethane-1-sulfonic acid (1, TES) (200 mg, 0.87 mmol). The reaction mixture was heated to 55 °C for 3 h. MS analysis showed the formation of the desired product. The reaction mixture was cooled to room temperature, diluted with water (100 mL) and extracted with dichloromethane (2 × 100 mL). The combined organic layers were washed with saturated brine (100 mL) and dried over anhydrous sodium sulfate. The solvent was removed under vacuum and the residue was purified by column chromatography (40 g SiO₂: gradient of 0 to 10% methanol in dichloromethane) to give 2-((1,3-bis(((9Z,12Z)-octadeca-9,12-dienoyl)oxy)-2-((((9Z,12Z)-octadeca-9,12-dienoyl)oxy)methyl)propan-2-yl)amino)ethane-1-sulfonic acid (292 mg, 47% yield) as a colorless solid.

Synthesis of 2-((2-(chlorosulfonyl)ethyl)amino)-2-((((9Z,12Z)-octadeca-9,12-dienoyl)oxy)methyl)propane-1,3-diyl (9Z,9'Z,12Z,12'Z)-bis(octadeca-9,12-dienoate) (3-Cl)

To a solution of 2-((1,3-bis(((9Z,12Z)-octadeca-9,12-dienoyl)oxy)-2-((((9Z,12Z)-octadeca-9,12-dienoyl)oxy)methyl)propan-2-yl)amino)ethane-1-sulfonic acid 3 (210 mg, 0.82 mmol) in anhydrous dichloromethane (5.0 mL) were added N,N-dimethylformamide (0.05 mL) and oxalyl chloride (0.08 mL, 2.1 mmol) at 0 °C. The reaction mixture was warmed to room temperature and stirred for 3 h. The solvent was removed under reduced pressure to dryness to give 2-((2-(chlorosulfonyl)ethyl)amino)-2-((((9Z,12Z)-octadeca-9,12-dienoyl)oxy)methyl)propane-1,3-diyl (9Z,9'Z,12Z,12'Z)-bis(octadeca-9,12-dienoate), which was used in the next step without further purification.

Synthesis of 2-((2-(N-(2-(dimethylamino)ethyl)sulfamoyl)ethyl)amino)-2-((((9Z,12Z)-octadeca-9,12-dienoyl)oxy)methyl)propane-1,3-diyl (9Z,9'Z,12Z,12'Z)-bis(octadeca-9,12-dienoate) (Compound I)

To a solution of 2-((2-(chlorosulfonyl)ethyl)amino)-2-((((9Z,12Z)-octadeca-9,12-dienoyl)oxy)methyl)propane-1,3-diyl (9Z,9'Z,12Z,12'Z)-bis(octadeca-9,12-dienoate) 3-Cl (210 mg, 0.82 mmol) in dry dichloromethane (5.0 mL) was added N¹,N¹-dimethylethane-1,2-diamine (182 mg, 2.1 mmol) at 0 °C. The reaction mixture was warmed to room temperature and stirred for 3 h. The reaction was quenched by the addition of water and the mixture was extracted with dichloromethane (2 × 100 mL). The combined organic layers were washed with saturated brine (100 mL) and dried over anhydrous sodium sulfate. The solvent was removed and the crude product was purified by column chromatography (40 g SiO₂: gradient of 0 to 15% methanol in dichloromethane) to give 2-((2-(N-(2-(dimethylamino)ethyl)sulfamoyl)ethyl)amino)-2-((((9Z,12Z)-octadeca-9,12-dienoyl)oxy)methyl)propane-1,3-diyl (9Z,9'Z,12Z,12'Z)-bis(octadeca-9,12-dienoate) (139 mg, 62% yield) as a yellow oil.

¹H NMR (300 MHz, chloroform-d) δ 5.26-5.44 (m, 12H), 4.09 (s, 6H), 3.06-3.18 (m, 6H), 2.75 (t, 6H), 2.47 (t, 2H), 2.32 (t, 6H), 2.24 (s, 6H), 2.00-2.10 (m, 12H), 1.52-1.65 (m, 4H), 1.20-1.40 (m, 44H), 0.88 (t, 9H).

APCI-MS analysis: C64H115N3O8S, [M+H] calculated = 1186.7, observed = 1186.8.

2.GL-TES-SA-DMP-E18-2

Synthetic route

Synthetic scheme

Compound II was prepared following the representative procedure described above in similar yields to those obtained for compound I.

Linoleic acid is treated with a chlorinating agent (e.g., oxalyl chloride) to provide the acid chloride compound 2. The reaction of compound 2 with a nucleophilic compound (e.g., buffer compound 1) provides compound 3. Compound 3 is treated with a chlorinating agent (e.g., oxalyl chloride) to provide the electrophilic compound 3-Cl. The reaction of 3-Cl with a nucleophile such as compound 4b then provides compound II.

The reaction conditions used were as follows:

¹H NMR (300 MHz, chloroform-d) δ 5.24-5.42 (m, 12H), 4.08 (s, 6H), 3.17 (t, 2H), 3.06 (bs, 4H), 2.75 (t, 6H), 2.43 (t, 2H), 2.31 (t, 6H), 2.23 (s, 6H), 1.98-2.08 (m, 12H), 1.70 (quint, 2H), 1.52-1.63 (m, 4H), 1.17-1.45 (m, 44H), 0.87 (t, 9H).

APCI-MS analysis: C65H117N3O8S, [M+H] calculated = 1100.7, observed = 1100.8.

3.TL1-01D-DMA

Synthetic scheme

Synthesis of trioctyl 2-hydroxypropane-1,2,3-tricarboxylate

To a solution of citric acid A1 (2.1 g, 11.0 mmol) and 1-octanol A2-1 (9.4 g, 72.6 mmol) in dichloromethane (40 mL) were added DMAP (1.34 g, 11.0 mmol) and EDCI (14.3 g, 72.6 mmol), and the resulting mixture was stirred at room temperature for 24 h. The reaction mixture was evaporated under vacuum. The residue was dissolved in dichloromethane (200 mL) and washed with brine (100 mL × 3). After drying over anhydrous Na₂SO₄, the solvent was evaporated and the crude product was purified by column chromatography (220 g SiO₂: gradient of 0 to 20% ethyl acetate in hexane) to give trioctyl 2-hydroxypropane-1,2,3-tricarboxylate as a coloured oil (5.2 g, 90%).

Synthesis of trioctyl 2-((3-(dimethylamino)propionyl)oxy)propane-1,2,3-tricarboxylate

To a solution of trioctyl 2-hydroxypropane-1,2,3-tricarboxylate A3-1 (0.528 g, 1.0 mmol), DMAP (122 mg, 1.0 mmol) and pyridine (316 mg, 4.0 mmol) in 10 mL of dichloromethane was added 3-(dimethylamino)propionyl chloride A4-1 (271 mg, 2.0 mmol) at 0 °C, and the resulting mixture was stirred at room temperature for 24 h. The reaction mixture was evaporated under vacuum. The residue was dissolved in dichloromethane (100 mL) and washed with brine (80 mL × 3). After drying over anhydrous Na₂SO₄, the solvent was evaporated and the crude product was purified by column chromatography (80 g SiO₂: gradient of 0 to 10% methanol in dichloromethane) to give trioctyl 2-((3-(dimethylamino)propionyl)oxy)propane-1,2,3-tricarboxylate (210 mg, 33%) as a colorless oil.

Alternatively, EDCI (13.1 g, 68.5 mmol) and DMAP (2.09 g, 17.1 mmol) were added to a suspension of 3-(dimethylamino)propionic acid (8.02 g, 68.5 mmol) in 150 mL of dichloromethane at 0 °C, and the resulting mixture was stirred at that temperature for 5 min. A solution of trioctyl 2-hydroxypropane-1,2,3-tricarboxylate A3-1 (9.05 g, 17.1 mmol) in 10 mL of dichloromethane was added, and the resulting mixture was stirred at room temperature for 48 h. The reaction mixture was diluted with dichloromethane and washed with saturated sodium bicarbonate and brine. After drying over sodium sulfate, the organic layer was evaporated under vacuum. The residue was purified by column chromatography (220 g SiO₂: gradient of 0 to 10% methanol in dichloromethane) to give trioctyl 2-((3-(dimethylamino)propionyl)oxy)propane-1,2,3-tricarboxylate (4.2 g, 38%) as a colorless oil.

¹H NMR (300 MHz, CDCl₃) δ 4.56 (s, br., 6H), 4.24 (t, 2H), 4.12 (s, 2H), 2.55 (t, 2H), 2.28-2.17 (m, 14H), 1.63-1.48 (m, 8H), 1.25 (s, br., 32H), 0.86 (t, 12H).

APCI-MS analysis: C35H65NO8, [M+H] calculated = 627.9, observed = 628.5.
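
The reported mass and isolated yield can be cross-checked with a short calculation. The sketch below computes the average molar mass of the product formula C35H65NO8 and the percent yield of the first route (210 mg of product from 1.0 mmol of A3-1). The atomic weights are standard rounded values, the helper names are illustrative, and A3-1 is assumed to be the limiting reagent.

```python
import re

# Standard atomic weights in g/mol (rounded); only the elements needed here.
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def molar_mass(formula):
    """Average molar mass of a simple formula such as 'C35H65NO8' (no brackets)."""
    total = 0.0
    matched = 0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_WEIGHT[symbol] * (int(count) if count else 1)
        matched += len(symbol) + len(count)
    if matched != len(formula):
        raise ValueError(f"cannot parse formula: {formula!r}")
    return total


def percent_yield(product_mass_g, product_formula, limiting_mmol):
    """Isolated yield in percent, given the limiting reagent amount in mmol."""
    product_mmol = product_mass_g / molar_mass(product_formula) * 1000.0
    return 100.0 * product_mmol / limiting_mmol


mw = molar_mass("C35H65NO8")
first_route = percent_yield(0.210, "C35H65NO8", 1.0)
print(round(mw, 1), round(first_route))
```

Under these assumptions the computed molar mass agrees with the calculated value of 627.9 given above, and the computed yield rounds to the reported 33%.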

4.TL1-04D-DMA

TL1-04D-DMA may be prepared in a similar manner as TL1-01D-DMA described above.

5.SY-3-E14-DMAPr

Synthetic scheme

R^A = R^B = C₁₂H₂₅

Synthesis of 3-(dimethylamino)propyl 4-hydroxy-3,5-dimethoxybenzoate (6)

To a suspension of syringic acid 5 (7.5 g, 0.04 mol) in 100 mL of dichloromethane was added oxalyl chloride (12.8 mL, 0.15 mol) at 0 °C, followed by dimethylformamide (5 drops), and the resulting mixture was stirred at that temperature for 2 h. The reaction mixture was evaporated to dryness and the residue was dissolved in 100 mL of dichloromethane. After cooling to 0 °C, 3-(dimethylamino)propan-1-ol 2 (4.5 mL, 40 mmol) was slowly added and the reaction mixture was stirred at room temperature overnight. The precipitate was filtered to give 3-(dimethylamino)propyl 4-hydroxy-3,5-dimethoxybenzoate 6 (6.2 g, 58%) as a white solid.

6.TL1-10D-DMA

TL1-10D-DMA may be prepared in a similar manner as TL1-01D-DMA described above.

7.HEP-E3-E10

Synthetic scheme

Scheme 1

Synthesis of [3]

As described in scheme 1: to a solution containing HEP [1] (0.100 g, 0.284 mmol, 1.0 eq), E3-E10 [2] (0.668 g, 1.038 mmol, 2.1 eq), 1 mL dimethylformamide, 3 mL dichloroethane, diisopropylethylamine (0.344 μL, 1.98 mmol, 4.0 eq) and N,N-dimethylaminopyridine (0.024 g, 0.198 mmol, 0.4 eq) was added 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide (0.284 g, 1.48 mmol, 3.0 eq), and the mixture was allowed to react overnight (18 h) at room temperature. The reaction mixture was then concentrated using a rotary evaporator and purified using a Buchi Combi-flash system on a 12 g, 40 μm silica gel column using hexane/ethyl acetate as the mobile phase to give a colorless oil (70% yield).

Synthesis of HEP-E3-E10[4]

As described in scheme 1: to a 20ml polypropylene scintillation vial equipped with a PTFE stirring rod was added [3] (0.500 g,0.344mmol,1.0 eq.) together with 4ml dry tetrahydrofuran. The vial was cooled to 0 ℃ to 5 ℃ on an ice bath and HF/pyridine (1.76 ml,67.86mmol,197.3 eq.) was added dropwise. After addition, the reaction vial was allowed to warm to room temperature and stirred overnight (18 h). The reaction mixture was then neutralized with saturated sodium bicarbonate at 0 ℃. Extraction was performed using ethyl acetate (3×). The organic layers were combined, washed with saturated sodium chloride (4×), dried over sodium sulfate, filtered and rotary evaporated to give a pale yellow oil. The oil was further purified using a Buchi Combi-flash system on a 12g,40 μm size silica gel column using dichloromethane/methanol (3% methanol) as the mobile phase to give a colorless oil (60% yield).

¹H NMR (400 MHz, CDCl₃) δ 4.16 (m, 4H), 3.60 (m, 4H), 2.97 (m, 3H), 2.78 (d, 3H), 2.58 (m, 9H), 2.37 (m, 12H), 2.15 (m, 2H), 1.78 (m, 4H), 1.44 (m, 7H), 1.36 (m, 9H), 1.26 (br, 45H), 1.05 (d, 6H), 0.87 (t, 12H).

M/Z expected = 998.59, observed = 998.0.

8.HEP-E4-E10

Synthetic scheme

Scheme 2

Synthesis of [12]

As described in scheme 2: to a solution of HEP [1] (0.100 g, 0.284 mmol, 1.0 eq), E4-E10 [11] (0.683 g, 1.038 mmol, 2.1 eq), 1 mL dimethylformamide, 3 mL dichloroethane, diisopropylethylamine (0.344 μL, 1.98 mmol, 4.0 eq) and N,N-dimethylaminopyridine (0.024 g, 0.198 mmol, 0.4 eq) was added 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide (0.284 g, 1.48 mmol, 3.0 eq), and the reaction was allowed to proceed overnight (18 h) at room temperature. The reaction mixture was then concentrated using a rotary evaporator and purified using a Buchi Combi-flash system on a 12 g, 40 μm silica gel column using hexane/ethyl acetate as the mobile phase to give a colorless oil (63.3% yield).

Synthesis of HEP-E4-E10[13]

As described in scheme 2: to a 20ml polypropylene scintillation vial equipped with a PTFE stirring rod was added [12] (0.450 g,0.303mmol,1.0 eq.) together with 4ml dry tetrahydrofuran. The vial was cooled to 0 ℃ to 5 ℃ on an ice bath and HF/pyridine (1.55 ml,59.920mmol,197.3 eq.) was added dropwise. After addition, the reaction vial was allowed to warm to room temperature and stirred overnight (18 h). The reaction mixture was then neutralized with saturated sodium bicarbonate at 0 ℃. Extraction was performed using ethyl acetate (3×). The organic layers were combined, washed with saturated sodium chloride (4×), dried over sodium sulfate, filtered and rotary evaporated to give a pale yellow oil. The oil was further purified using a Buchi Combi-flash system on a 12g,40 μm size silica gel column using dichloromethane/methanol (3%) as the mobile phase to give a colorless oil (48.4% yield).

¹H NMR (400 MHz, CDCl₃) δ 4.16 (t, 4H), 3.62 (br, 4H), 2.96 (q, 3H), 2.76 (d, 4H), 2.56 (m, 8H), 2.40 (m, 4H), 2.32 (t, 4H), 2.13 (t, 2H), 1.61 (m, 4H), 1.46 (m, 8H), 1.37 (m, 8H), 1.28 (br, 44H), 1.03 (d, 6H), 0.87 (t, 12H).

¹³C NMR (400 MHz, CDCl₃) δ 173.65 (2C), 69.65 (2C), 68.04 (2C), 62.84 (2C), 61.82 (2C), 61.44 (2C), 60.89 (2C), 55.57 (4C), 51.55 (2C), 35.35 (4C), 34.20 (2C), 32.09 (7C), 30.00 (5C), 29.77 (6C), 29.47 (6C), 26.93 (2C), 25.84 (5C), 22.84 (9C), 17.77 (2C), 14.30 (7C).

M/Z expected = 1025.64, observed = 1025.8.

9.Guan-SS-Chol

Guan-SS-Chol may be prepared according to the method described in International publication No. WO 2018/089801, which is hereby incorporated by reference in its entirety. Guan-SS-Chol and formula (V) (HGT 4002) are used interchangeably.

Example 6 evaluation of cationic lipids for pulmonary delivery

In this example, the in vivo efficacy of various cationic lipids was tested when mRNA encapsulated in lipid nanoparticles (mRNA-LNP) was administered to mice by pulmonary delivery. Both the potency (as determined by the level of protein production) and the tolerability (as determined by side effects associated with clearance and metabolism) of the cationic lipids were tested.

About 150 cationic lipids were tested (FIG. 7). Each cationic lipid was used to prepare lipid nanoparticles encapsulating mRNA encoding firefly luciferase protein (FFL mRNA) according to methods known in the art. For example, suitable methods for mRNA encapsulation include those described in international publication nos. WO 2016/004318 and WO 2018/089801, which are hereby incorporated by reference in their entireties. The lipid nanoparticles tested comprised a lipid component consisting of a cationic lipid, a non-cationic lipid (DOPE), a PEG-modified lipid (DMG-PEG 2K), and optionally cholesterol.

Lipid nanoparticle formulations comprising FFL mRNA were administered to male CD1 mice as a single intratracheal dose delivered by nebulization. Approximately 5 hours after dosing, animals were given luciferin by intraperitoneal injection, and all animals were imaged using an IVIS imaging system to measure luciferase production in the lungs. Figure 7 shows the efficacy of each cationic lipid in the lung, as measured by in vivo protein expression. Some cationic lipids produced a more than 50-fold increase in pulmonary protein expression compared to other cationic lipids.

Based on their performance in this in vivo screen, the following nine cationic lipids were selected for further study: GL-TES-SA-DME-E18-2, TL1-01D-DMA, SY-3-E14-DMAPr, TL1-10D-DMA, HGT4002 (also referred to herein as Guan-SS-Chol), GL-TES-SA-DMP-E18-2, HEP-E4-E10, HEP-E3-E10 and TL1-04D-DMA. Among these cationic lipids, HEP-E4-E10, HEP-E3-E10, GL-TES-SA-DME-E18-2, GL-TES-SA-DMP-E18-2, TL1-01D-DMA and TL1-04D-DMA exhibited particularly high potency as determined by the average radiance detected in the mouse lungs.

Example 7 evaluation of protein expression of lipid nanoparticles

In this example, both in vivo mRNA delivery and protein expression were tested for selected cationic lipids to evaluate potency and biodistribution. In this study, the cationic lipids TL1-10D-DMA, SY-3-E14-DMAPr and TL1-04D-DMA were used to prepare lipid nanoparticles LNP-A, LNP-B and LNP-C, respectively, encapsulating mCherry mRNA.

mRNA-LNPs were administered to mice by intratracheal administration, and the amount of mRNA delivered to lung tissue was determined. As shown in fig. 8A, all mRNA-LNPs tested efficiently delivered mRNA to lung cells. To check whether the amount of mRNA delivered to the lung cells correlated with protein expression in the lung, the amount of mCherry protein was determined by ELISA. Fig. 8B shows the amount of mRNA in lung tissue on the x-axis and the amount of protein expressed in the lung on the y-axis. The data indicate that certain LNPs have higher efficacy, as shown by increased protein expression, even when an equal amount of mRNA is delivered to the tissue. For example, for an equal delivered amount of 10⁵ CN/mg tissue of mRNA, LNP-B produced about 10 pg/mg protein, and LNP-A produced about 10² pg/mg protein.
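
The comparison in this paragraph amounts to normalizing protein expression by the amount of mRNA delivered. A minimal sketch of that arithmetic follows, using the approximate values quoted above and taking "CN" to mean mRNA copy number (an assumption); the function name is illustrative.

```python
def protein_per_copy(protein_pg_per_mg, mrna_copies_per_mg):
    """Protein produced per delivered mRNA copy (pg per copy)."""
    if mrna_copies_per_mg <= 0:
        raise ValueError("delivered mRNA amount must be positive")
    return protein_pg_per_mg / mrna_copies_per_mg


delivered_copies = 1e5                              # CN per mg tissue, same for both LNPs
lnp_a = protein_per_copy(1e2, delivered_copies)     # about 10^2 pg/mg protein
lnp_b = protein_per_copy(10.0, delivered_copies)    # about 10 pg/mg protein
ratio = lnp_a / lnp_b
print(ratio)
```

With equal mRNA delivery, the ratio reduces to the ratio of the protein levels, i.e. roughly a ten-fold difference in expression per delivered copy between LNP-A and LNP-B.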

Example 8 evaluation of the biodistribution of lipid nanoparticles by pulmonary delivery

In this example, the biodistribution of LNP-encapsulated mRNA (mRNA-LNP) was tested following administration to mice by pulmonary delivery.

First, a study was conducted to examine whether the mRNA-LNP of the present invention was effectively delivered to the lung in vivo. LNP encapsulating FFL mRNA was administered to CD-1 mice by intratracheal delivery, and radiance was measured 24 hours after administration. As shown in fig. 9A, the results demonstrate that mRNA-LNP was efficiently delivered to the lungs of mice.

To identify which types of cells were transfected with mRNA-LNP in vivo, a genetically modified mouse was used whose cells express fluorescent tdTomato protein after successful transfection with Cre recombinase. After in vivo administration of mRNA encoding Cre recombinase, successfully transfected cells in bulk tissue can be visualized at single-cell resolution by detecting Cre-induced tdTomato expression. LNP encapsulating Cre recombinase mRNA was administered to tdTomato transgenic mice via aerosol inhalation. A single inhalation exposure was performed on a nose-only nebulization tower, with the product delivered as a liquid aerosol. On day 3, approximately 48 hours after exposure, mice were euthanized and whole-body 3D imaging was performed using cryofluorescence tomography (CFT), which allows high-resolution imaging of the spatial distribution of the tdTomato signal. Where delivery is efficient, Cre protein is expressed and results in tdTomato expression. Fig. 9B shows that mRNA-LNP was efficiently delivered to the lung, and protein expression was observed even in the branches of the airways, as indicated by the arrows. Along the respiratory tract, mice treated with Cre mRNA-LNP displayed positive tdTomato fluorescence in the nasal epithelium, trachea and bronchi. Animals treated with saline did not show tdTomato signal in these tissues. Any signal observed in both saline-treated and Cre mRNA-LNP-treated mice was considered background and not specific to delivery of the test sample. The tdTomato signal was identified at the cellular level using lung immunohistochemistry (IHC), which maintains tissue architecture. As shown in fig. 9C, positive IHC staining was observed throughout the lung, with bronchiolar epithelium and alveolar pneumocytes being the specific anatomical sites exhibiting tdTomato positivity. Positive bronchiolar epithelial cells included secretory cells and/or ciliated cells (arrows with "1"). Type I (arrow with "2") and type II pneumocytes were often positive, as were scattered macrophages in the bronchioles, alveolar ducts and alveoli.

Next, to examine the biodistribution and expression of mRNA-LNP at high resolution, LNP encapsulating CFTR mRNA was prepared. mRNA-LNP was administered by pulmonary delivery to CFTR knockout (KO) mice. Protein expression was detected by immunofluorescence. Fig. 9D shows that CFTR protein expressed from the delivered mRNA-LNP is present on the apical surface of the airway, as indicated by the arrow, demonstrating the effectiveness of the mRNA-LNP of the invention.

Example 9 evaluation of protein expression of lipid nanoparticles by HBEC-ALI

In this example, protein expression from mRNA-encapsulating LNP was tested using the HBEC-ALI (human bronchial epithelial cell air-liquid interface) system. The HBEC-ALI technique is advantageous because it reproduces a well-differentiated airway epithelium with different functional cell types, allowing it to be used as a highly translatable airway cell model.

Obtaining a successful HBEC-ALI culture is important for its use in subsequent experiments. Briefly, human bronchial epithelial cells were seeded onto wells and grown in culture. After confluence was reached, the apical medium was removed and replaced with growth medium. Prior to performing the experiments with mRNA-LNP, cells were grown to allow polarization and differentiation. An exemplary schematic of the HBEC-ALI system is shown in fig. 10A. Differentiated epithelium was sectioned and stained with hematoxylin and eosin (H&E), as shown in fig. 10B, which indicates the presence of multi-ciliated cells and supports use of the culture as an airway cell model.

Cationic lipids ML2, GL-TES-SA-DMP-E18-2, GL-TES-SA-DME-E18-2, TL1-01D-DMA, TL1-04D-DMA, SY-3-E14-DMAPr, HEP-E3-E10 and HEP-E4-E10 were used to prepare LNPs encapsulating FFL mRNA. LNP encapsulating FFL mRNA was added to the apical side of the HBEC-ALI culture. Luminescence was then measured to assess the amount of luciferase protein expressed in the cells. As shown in fig. 11A, all mRNA-LNPs tested showed dose-dependent protein expression. In addition, mRNA-LNP showed robust protein expression in lung cells in the HBEC-ALI model. To examine whether human bronchial epithelial cells in the HBEC-ALI model maintained cell integrity during the experiment, transepithelial electrical resistance (TEER), which is a strong indicator of epithelial integrity, was measured. As shown in fig. 11B, TEER was not significantly different, indicating that the monolayer remained intact for most treatments with mRNA-LNP. Thus, the HBEC-ALI model can be used as a robust in vitro system for evaluating protein expression of mRNA-LNP in lung cells.

To further examine whether data from the HBEC-ALI model is a good predictor of protein expression in vivo, a receiver operating characteristic (ROC) curve was plotted. In general, the closer the ROC curve is to the upper left corner, the more effective the test. Statistics from the ROC curve showed a high AUC (area under the ROC curve) of 0.827 and a low p-value (< 0.013) (fig. 12), indicating that data from the HBEC-ALI model are informative for determining the in vivo efficacy of mRNA-LNP, and that the HBEC-ALI model can be used to select mRNA-LNPs that warrant further study in in vivo applications.
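
For readers unfamiliar with the statistic, the sketch below shows one standard way to compute an AUC: as the fraction of positive/negative pairs in which the positive item has the higher score (the Mann-Whitney formulation), with ties counted as one half. The data are hypothetical and are not the values underlying fig. 12; the text does not state how the AUC and p-value were computed, and no p-value is calculated here.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    scores: predictor values (e.g. in vitro HBEC-ALI luminescence).
    labels: True if the item is positive by the reference test
            (e.g. classified as active in vivo), otherwise False.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative item")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical screen: in vitro signal and whether each lipid was a hit in vivo.
in_vitro_signal = [9.1, 8.4, 7.9, 6.5, 6.0, 5.2, 4.8, 3.3]
in_vivo_hit = [True, True, False, True, False, True, False, False]
auc = roc_auc(in_vitro_signal, in_vivo_hit)
print(auc)
```

An AUC of 0.5 corresponds to a predictor no better than chance and 1.0 to perfect separation, which is why a value such as 0.827 supports using the in vitro model as a screening filter.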

Next, the lipid degradation rate after HBEC-ALI transfection was determined and compared with the results obtained with mouse lung and human lung homogenates. It is desirable that the lipid degrade rapidly to reduce the potential toxicity of the LNP component (including cationic lipids). The concentration of lipids in HBEC-ALI sample cultures was measured over time and plotted as shown in fig. 13A. The results show that after transfection with mRNA-LNP, the lipid rapidly degraded over time with a half-life of about 2.9 hours. The half-life values determined by the HBEC-ALI model were comparable to those determined by mouse lung and human lung homogenates (4.5 hours and 3.6 hours, respectively), as shown in fig. 13B. These results demonstrate that the HBEC-ALI model is a useful indicator of in vivo expression of mRNA-LNP.
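
A half-life of this kind can be estimated from a concentration time course by assuming first-order decay, C(t) = C0·exp(-kt), fitting a straight line to ln(C) versus t, and taking t½ = ln(2)/k. The sketch below illustrates this; the first-order model is an assumption, as the fitting procedure is not described in the text, and the data points are synthetic values generated from a 2.9 hour half-life rather than measurements from fig. 13A.

```python
import math


def half_life_from_timecourse(times_h, concentrations):
    """Estimate a first-order half-life (hours) by least-squares fit of ln(C) vs t."""
    if len(times_h) != len(concentrations) or len(times_h) < 2:
        raise ValueError("need at least two matched time/concentration points")
    logs = [math.log(c) for c in concentrations]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    slope = sxy / sxx  # slope equals -k for C(t) = C0 * exp(-k * t)
    if slope >= 0:
        raise ValueError("concentration does not decay over time")
    return math.log(2) / -slope


# Synthetic time course generated from an exact 2.9 h half-life (not measured data).
times = [0.0, 1.0, 2.0, 4.0, 8.0]
conc = [100.0 * 0.5 ** (t / 2.9) for t in times]
t_half = half_life_from_timecourse(times, conc)
print(round(t_half, 2))
```

Fitting on the log scale keeps the calculation linear and makes deviations from first-order behaviour visible as curvature in the ln(C) versus t plot.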

Overall, the data in this example demonstrate that HBEC-ALI shows meaningful performance as a classification model for screening and filtering lipids prior to in vivo evaluation. In addition, the mRNA-LNP of the invention has robust protein expression and rapid degradation. In combination with the in vivo data provided herein, it is predicted that the mRNA-LNP of the present invention performs extremely well in both increasing potency and improving tolerability in vivo applications involving repeated delivery of mRNA to the lung via nebulization.

Equivalents

Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. The scope of the invention is not intended to be limited by the foregoing description, but is instead as set forth in the following claims:

SEQUENCE LISTING

<110> Translate Bio, Inc.

<120> improved compositions for delivery of codon optimized mRNA

<130> MRT-2205USP1

<140> PCT/US21/58623

<141> 2021-11-09

<150> 63/111,321

<151> 2020-11-09

<150> 63/195,581

<151> 2021-06-01

<160> 17

<170> PatentIn version 3.5

<210> 1

<211> 1480

<212> PRT

<213> Homo sapiens

<400> 1

Met Gln Arg Ser Pro Leu Glu Lys Ala Ser Val Val Ser Lys Leu Phe

1 5 10 15

Phe Ser Trp Thr Arg Pro Ile Leu Arg Lys Gly Tyr Arg Gln Arg Leu

20 25 30

Glu Leu Ser Asp Ile Tyr Gln Ile Pro Ser Val Asp Ser Ala Asp Asn

35 40 45

Leu Ser Glu Lys Leu Glu Arg Glu Trp Asp Arg Glu Leu Ala Ser Lys

50 55 60

Lys Asn Pro Lys Leu Ile Asn Ala Leu Arg Arg Cys Phe Phe Trp Arg

65 70 75 80

Phe Met Phe Tyr Gly Ile Phe Leu Tyr Leu Gly Glu Val Thr Lys Ala

85 90 95

Val Gln Pro Leu Leu Leu Gly Arg Ile Ile Ala Ser Tyr Asp Pro Asp

100 105 110

Asn Lys Glu Glu Arg Ser Ile Ala Ile Tyr Leu Gly Ile Gly Leu Cys

115 120 125

Leu Leu Phe Ile Val Arg Thr Leu Leu Leu His Pro Ala Ile Phe Gly

130 135 140

Leu His His Ile Gly Met Gln Met Arg Ile Ala Met Phe Ser Leu Ile

145 150 155 160

Tyr Lys Lys Thr Leu Lys Leu Ser Ser Arg Val Leu Asp Lys Ile Ser

165 170 175

Ile Gly Gln Leu Val Ser Leu Leu Ser Asn Asn Leu Asn Lys Phe Asp

180 185 190

Glu Gly Leu Ala Leu Ala His Phe Val Trp Ile Ala Pro Leu Gln Val

195 200 205

Ala Leu Leu Met Gly Leu Ile Trp Glu Leu Leu Gln Ala Ser Ala Phe

210 215 220

Cys Gly Leu Gly Phe Leu Ile Val Leu Ala Leu Phe Gln Ala Gly Leu

225 230 235 240

Gly Arg Met Met Met Lys Tyr Arg Asp Gln Arg Ala Gly Lys Ile Ser

245 250 255

Glu Arg Leu Val Ile Thr Ser Glu Met Ile Glu Asn Ile Gln Ser Val

260 265 270

Lys Ala Tyr Cys Trp Glu Glu Ala Met Glu Lys Met Ile Glu Asn Leu

275 280 285

Arg Gln Thr Glu Leu Lys Leu Thr Arg Lys Ala Ala Tyr Val Arg Tyr

290 295 300

Phe Asn Ser Ser Ala Phe Phe Phe Ser Gly Phe Phe Val Val Phe Leu

305 310 315 320

Ser Val Leu Pro Tyr Ala Leu Ile Lys Gly Ile Ile Leu Arg Lys Ile

325 330 335

Phe Thr Thr Ile Ser Phe Cys Ile Val Leu Arg Met Ala Val Thr Arg

340 345 350

Gln Phe Pro Trp Ala Val Gln Thr Trp Tyr Asp Ser Leu Gly Ala Ile

355 360 365

Asn Lys Ile Gln Asp Phe Leu Gln Lys Gln Glu Tyr Lys Thr Leu Glu

370 375 380

Tyr Asn Leu Thr Thr Thr Glu Val Val Met Glu Asn Val Thr Ala Phe

385 390 395 400

Trp Glu Glu Gly Phe Gly Glu Leu Phe Glu Lys Ala Lys Gln Asn Asn

405 410 415

Asn Asn Arg Lys Thr Ser Asn Gly Asp Asp Ser Leu Phe Phe Ser Asn

420 425 430

Phe Ser Leu Leu Gly Thr Pro Val Leu Lys Asp Ile Asn Phe Lys Ile

435 440 445

Glu Arg Gly Gln Leu Leu Ala Val Ala Gly Ser Thr Gly Ala Gly Lys

450 455 460

Thr Ser Leu Leu Met Val Ile Met Gly Glu Leu Glu Pro Ser Glu Gly

465 470 475 480

Lys Ile Lys His Ser Gly Arg Ile Ser Phe Cys Ser Gln Phe Ser Trp

485 490 495

Ile Met Pro Gly Thr Ile Lys Glu Asn Ile Ile Phe Gly Val Ser Tyr

500 505 510

Asp Glu Tyr Arg Tyr Arg Ser Val Ile Lys Ala Cys Gln Leu Glu Glu

515 520 525

Asp Ile Ser Lys Phe Ala Glu Lys Asp Asn Ile Val Leu Gly Glu Gly

530 535 540

Gly Ile Thr Leu Ser Gly Gly Gln Arg Ala Arg Ile Ser Leu Ala Arg

545 550 555 560

Ala Val Tyr Lys Asp Ala Asp Leu Tyr Leu Leu Asp Ser Pro Phe Gly

565 570 575

Tyr Leu Asp Val Leu Thr Glu Lys Glu Ile Phe Glu Ser Cys Val Cys

580 585 590

Lys Leu Met Ala Asn Lys Thr Arg Ile Leu Val Thr Ser Lys Met Glu

595 600 605

His Leu Lys Lys Ala Asp Lys Ile Leu Ile Leu His Glu Gly Ser Ser

610 615 620

Tyr Phe Tyr Gly Thr Phe Ser Glu Leu Gln Asn Leu Gln Pro Asp Phe

625 630 635 640

Ser Ser Lys Leu Met Gly Cys Asp Ser Phe Asp Gln Phe Ser Ala Glu

645 650 655

Arg Arg Asn Ser Ile Leu Thr Glu Thr Leu His Arg Phe Ser Leu Glu

660 665 670

Gly Asp Ala Pro Val Ser Trp Thr Glu Thr Lys Lys Gln Ser Phe Lys

675 680 685

Gln Thr Gly Glu Phe Gly Glu Lys Arg Lys Asn Ser Ile Leu Asn Pro

690 695 700

Ile Asn Ser Ile Arg Lys Phe Ser Ile Val Gln Lys Thr Pro Leu Gln

705 710 715 720

Met Asn Gly Ile Glu Glu Asp Ser Asp Glu Pro Leu Glu Arg Arg Leu

725 730 735

Ser Leu Val Pro Asp Ser Glu Gln Gly Glu Ala Ile Leu Pro Arg Ile

740 745 750

Ser Val Ile Ser Thr Gly Pro Thr Leu Gln Ala Arg Arg Arg Gln Ser

755 760 765

Val Leu Asn Leu Met Thr His Ser Val Asn Gln Gly Gln Asn Ile His

770 775 780

Arg Lys Thr Thr Ala Ser Thr Arg Lys Val Ser Leu Ala Pro Gln Ala

785 790 795 800

Asn Leu Thr Glu Leu Asp Ile Tyr Ser Arg Arg Leu Ser Gln Glu Thr

805 810 815

Gly Leu Glu Ile Ser Glu Glu Ile Asn Glu Glu Asp Leu Lys Glu Cys

820 825 830

Phe Phe Asp Asp Met Glu Ser Ile Pro Ala Val Thr Thr Trp Asn Thr

835 840 845

Tyr Leu Arg Tyr Ile Thr Val His Lys Ser Leu Ile Phe Val Leu Ile

850 855 860

Trp Cys Leu Val Ile Phe Leu Ala Glu Val Ala Ala Ser Leu Val Val

865 870 875 880

Leu Trp Leu Leu Gly Asn Thr Pro Leu Gln Asp Lys Gly Asn Ser Thr

885 890 895

His Ser Arg Asn Asn Ser Tyr Ala Val Ile Ile Thr Ser Thr Ser Ser

900 905 910

Tyr Tyr Val Phe Tyr Ile Tyr Val Gly Val Ala Asp Thr Leu Leu Ala

915 920 925

Met Gly Phe Phe Arg Gly Leu Pro Leu Val His Thr Leu Ile Thr Val

930 935 940

Ser Lys Ile Leu His His Lys Met Leu His Ser Val Leu Gln Ala Pro

945 950 955 960

Met Ser Thr Leu Asn Thr Leu Lys Ala Gly Gly Ile Leu Asn Arg Phe

965 970 975

Ser Lys Asp Ile Ala Ile Leu Asp Asp Leu Leu Pro Leu Thr Ile Phe

980 985 990

Asp Phe Ile Gln Leu Leu Leu Ile Val Ile Gly Ala Ile Ala Val Val

995 1000 1005

Ala Val Leu Gln Pro Tyr Ile Phe Val Ala Thr Val Pro Val Ile

1010 1015 1020

Val Ala Phe Ile Met Leu Arg Ala Tyr Phe Leu Gln Thr Ser Gln

1025 1030 1035

Gln Leu Lys Gln Leu Glu Ser Glu Gly Arg Ser Pro Ile Phe Thr

1040 1045 1050

His Leu Val Thr Ser Leu Lys Gly Leu Trp Thr Leu Arg Ala Phe

1055 1060 1065

Gly Arg Gln Pro Tyr Phe Glu Thr Leu Phe His Lys Ala Leu Asn

1070 1075 1080

Leu His Thr Ala Asn Trp Phe Leu Tyr Leu Ser Thr Leu Arg Trp

1085 1090 1095

Phe Gln Met Arg Ile Glu Met Ile Phe Val Ile Phe Phe Ile Ala

1100 1105 1110

Val Thr Phe Ile Ser Ile Leu Thr Thr Gly Glu Gly Glu Gly Arg

1115 1120 1125

Val Gly Ile Ile Leu Thr Leu Ala Met Asn Ile Met Ser Thr Leu

1130 1135 1140

Gln Trp Ala Val Asn Ser Ser Ile Asp Val Asp Ser Leu Met Arg

1145 1150 1155

Ser Val Ser Arg Val Phe Lys Phe Ile Asp Met Pro Thr Glu Gly

1160 1165 1170

Lys Pro Thr Lys Ser Thr Lys Pro Tyr Lys Asn Gly Gln Leu Ser

1175 1180 1185

Lys Val Met Ile Ile Glu Asn Ser His Val Lys Lys Asp Asp Ile

1190 1195 1200

Trp Pro Ser Gly Gly Gln Met Thr Val Lys Asp Leu Thr Ala Lys

1205 1210 1215

Tyr Thr Glu Gly Gly Asn Ala Ile Leu Glu Asn Ile Ser Phe Ser

1220 1225 1230

Ile Ser Pro Gly Gln Arg Val Gly Leu Leu Gly Arg Thr Gly Ser

1235 1240 1245

Gly Lys Ser Thr Leu Leu Ser Ala Phe Leu Arg Leu Leu Asn Thr

1250 1255 1260

Glu Gly Glu Ile Gln Ile Asp Gly Val Ser Trp Asp Ser Ile Thr

1265 1270 1275

Leu Gln Gln Trp Arg Lys Ala Phe Gly Val Ile Pro Gln Lys Val

1280 1285 1290

Phe Ile Phe Ser Gly Thr Phe Arg Lys Asn Leu Asp Pro Tyr Glu

1295 1300 1305

Gln Trp Ser Asp Gln Glu Ile Trp Lys Val Ala Asp Glu Val Gly

1310 1315 1320

Leu Arg Ser Val Ile Glu Gln Phe Pro Gly Lys Leu Asp Phe Val

1325 1330 1335

Leu Val Asp Gly Gly Cys Val Leu Ser His Gly His Lys Gln Leu

1340 1345 1350

Met Cys Leu Ala Arg Ser Val Leu Ser Lys Ala Lys Ile Leu Leu

1355 1360 1365

Leu Asp Glu Pro Ser Ala His Leu Asp Pro Val Thr Tyr Gln Ile

1370 1375 1380

Ile Arg Arg Thr Leu Lys Gln Ala Phe Ala Asp Cys Thr Val Ile

1385 1390 1395

Leu Cys Glu His Arg Ile Glu Ala Met Leu Glu Cys Gln Gln Phe

1400 1405 1410

Leu Val Ile Glu Glu Asn Lys Val Arg Gln Tyr Asp Ser Ile Gln

1415 1420 1425

Lys Leu Leu Asn Glu Arg Ser Leu Phe Arg Gln Ala Ile Ser Pro

1430 1435 1440

Ser Asp Arg Val Lys Leu Phe Pro His Arg Asn Ser Ser Lys Cys

1445 1450 1455

Lys Ser Lys Pro Gln Ile Ala Ala Leu Lys Glu Glu Thr Glu Glu

1460 1465 1470

Glu Val Gln Asp Thr Arg Leu

1475 1480

<210> 2

<211> 4443

<212> DNA

<213> Artificial Sequence

<220>

<223> Synthetic polynucleotide

<400> 2

atgcagcgtt ctcccctgga gaaggcttct gtggtgagta aacttttttt ctcctggacc 60

agacctatcc tgaggaaagg ctacaggcag agactggagc tctctgacat ataccagata 120

ccttcagtcg atagcgccga caacctgagc gagaagctgg aacgcgagtg ggacagagag 180

ctggcaagca agaagaaccc aaagctgatt aatgccctga gaaggtgttt cttctggaga 240

ttcatgttct acggaatctt tctgtatctg ggggaggtta caaaggctgt gcaacccctg 300

ctgctcggca gaatcatcgc ctcatacgat ccagacaaca aggaagaaag aagcatcgcc 360

atctacctgg gcattggcct ctgcctcctg tttattgtgc ggactctgct gctgcaccca 420

gcaattttcg ggttgcatca tattggcatg cagatgcgca ttgctatgtt ttccctcatc 480

tacaaaaaga cactgaaact cagctcccgg gtgctggaca agatctccat cggccaactg 540

gtgtctctcc tgagcaataa cttgaataag ttcgacgaag ggctggccct ggcacacttc 600

gtgtggattg cccccctgca ggtggccctg ctgatgggac tgatttggga actgctgcag 660

gctagcgctt tctgcggcct ggggttcctg atcgtgctgg cactgtttca ggcaggcctg 720

ggccgtatga tgatgaagta cagagaccag agggccggga agatctccga acggctcgtt 780

attacctctg agatgatcga gaacattcag tctgtgaaag cctactgctg ggaggaggct 840

atggagaaga tgatcgagaa tctgagacag accgagctga agctgaccag aaaggccgcc 900

tacgtgaggt acttcaacag cagtgccttc ttcttctctg ggttcttcgt tgtgtttctg 960

agcgtgctgc catacgctct catcaaaggc atcatcctgc ggaagatctt caccaccatc 1020

agcttttgca tcgtgcttag aatggccgtg acacggcagt tcccatgggc cgttcaaact 1080

tggtatgatt ccctgggcgc catcaacaaa atccaggatt tcctgcagaa gcaggaatac 1140

aagacactcg aatataacct cacaactact gaggtggtta tggagaacgt gactgccttc 1200

tgggaggagg ggttcggaga gctttttgag aaggccaaac agaataataa taaccgcaaa 1260

accagcaacg gcgacgacag cctgttcttc tccaattttt ctctcctggg aacacccgtc 1320

ctcaaagaca tcaactttaa gatcgagagg ggccagctgc tcgccgtcgc cggatccaca 1380

ggcgccggca agacctctct gctgatggtt atcatgggcg aactggagcc ctccgagggc 1440

aagattaagc actcaggaag aatctccttt tgtagccagt tcagttggat tatgcccggc 1500

actattaagg agaatatcat ttttggggtg agctatgatg agtatcggta tcggagcgtt 1560

atcaaagcct gtcagctgga ggaggatatc agcaagttcg cagagaagga taatattgtg 1620

ctgggagagg gaggaatcac cctgagcgga ggccagagag ccagaatctc actggcccgg 1680

gccgtctaca aggacgccga cctttacctt ctggacagtc cctttggata tctggatgtg 1740

ctgactgaaa aggagatctt cgagtcttgt gtgtgcaagc tgatggctaa caagacccgg 1800

atcctagtga ctagtaagat ggagcacctg aagaaggcag acaagatctt gattctgcac 1860

gagggatcct cttactttta cggcaccttt agcgagctgc agaacctcca gcccgatttc 1920

tcatctaagc tgatgggctg tgatagcttc gaccagttct ctgccgagcg cagaaacagc 1980

atcctgacag agacactgca ccggttttca ctggagggcg acgcccctgt cagctggacc 2040

gagaccaaaa agcagtcttt caagcagaca ggcgagttcg gcgagaagcg caaaaacagc 2100

atcctgaatc caatcaactc tataaggaag tttagcatcg tgcagaagac acccctccag 2160

atgaacggca tcgaagagga cagtgacgag cccctggagc ggcgcctgag cctcgtgcct 2220

gacagcgaac agggcgaggc catcctgcct aggatcagcg tgatttcaac cgggccaaca 2280

ctgcaggcta ggagaagaca gtcagtgctt aacctgatga cacatagcgt gaatcaggga 2340

cagaacatcc atcgaaaaac cacagcctct actcgcaaag tgtcactggc tcctcaggct 2400

aatctgacag agctggacat ctatagcagg aggctgagcc aggagacagg cctggagatc 2460

agtgaggaga tcaacgaaga ggacctgaag gagtgctttt tcgatgacat ggagagtatc 2520

cccgccgtca ccacctggaa tacctacctc cggtacatca cagtgcacaa gtccctcatc 2580

tttgtgctga tttggtgcct cgtgatcttt ctcgcagaag tggccgcctc cctggtggtg 2640

ctgtggctgt tggggaatac tccactgcag gacaaaggca attctacaca cagcaggaat 2700

aattcctatg ccgtgattat caccagcaca tcctcttact acgtgttcta catctacgtg 2760

ggagtggcag atactctgct tgcaatgggc ttcttcaggg ggctgcccct ggtgcacaca 2820

ctgatcacag tgtccaagat cctccaccat aaaatgctcc acagcgtgct gcaggcaccc 2880

atgagcaccc tgaacacact gaaggccggc ggcatcctga atcgcttttc caaagacatc 2940

gccatcctcg acgatctcct gccactgacc atcttcgatt ttatccagct gctgctgatc 3000

gtgatcgggg ccatcgccgt ggtggccgtg ctgcagccat acattttcgt ggctacagtg 3060

cccgtgatcg ttgcctttat catgctgaga gcctacttcc tgcagacttc tcagcagctg 3120

aagcagctgg agagcgaagg gagaagcccc atcttcactc acctggtgac aagcctgaag 3180

ggactctgga ccctgagagc cttcggccgg cagccctatt tcgagaccct gtttcacaag 3240

gccctcaacc tgcacacagc caactggttc ctctacctgt ccaccctgag gtggttccag 3300

atgaggattg aaatgatctt cgtgattttt ttcatcgccg tgacattcat tagcattctg 3360

accaccggcg agggggaggg gagagtgggc atcatcctga cccttgccat gaacattatg 3420

agcacactgc agtgggccgt gaatagtagt atcgacgtgg acagtctgat gaggtccgtg 3480

agccgggtgt tcaagttcat tgacatgccc acagaaggga aacccaccaa aagcaccaag 3540

ccctacaaga acgggcagct gtccaaggtt atgatcatcg agaactctca cgtgaagaag 3600

gacgacattt ggcccagcgg cggccagatg acagtgaaag atctgaccgc caaatacacc 3660

gagggaggca acgccatcct cgaaaacatt agcttctcta tcagccctgg acagagggtg 3720

ggcctgctgg gccggacagg ctcagggaag agtactctgc tgtcagcatt cctgaggctc 3780

ctgaacacag agggcgagat ccagattgac ggcgtgtcct gggactccat caccctgcag 3840

cagtggcgga aggctttcgg ggtgatcccc cagaaggtgt tcatctttag cggcactttc 3900

agaaagaatc tggaccctta tgagcagtgg agtgaccagg agatctggaa agtggccgat 3960

gaggtcggac tgaggagcgt gatcgagcag tttccaggga agctggactt tgtgctggtg 4020

gatggcggat gcgtgctgtc tcacggccat aaacagctga tgtgtctggc ccggtccgtg 4080

ctgtctaagg ccaagatcct gctgctggac gaaccctccg cccacctgga ccccgtgaca 4140

taccagatca tcaggagaac tctcaagcag gccttcgccg actgtaccgt gattctgtgc 4200

gagcaccgca ttgaagctat gctggagtgt cagcagttcc tggtgatcga ggaaaataag 4260

gtgaggcagt acgacagcat ccagaagctg ctgaacgagc gctccctgtt ccgccaggct 4320

atctccccat cagaccgggt gaagctcttc ccccacagaa actcctcaaa gtgcaagtcc 4380

aagccccaga tcgccgccct gaaggaggag accgaggagg aggtgcagga caccaggctg 4440

tga 4443

<210> 3

<211> 4443

<212> DNA

<213> Artificial Sequence

<220>

<223> Synthetic polynucleotide

<400> 3

atgcagcgct cgcctctgga aaaggcgagc gtcgtgtcaa agctattctt ttcttggacc 60

cggcccattc tcaggaaggg ctacaggcag aggctggagt tgagcgacat ctatcagatt 120

ccttccgtgg acagcgccga caacctgagc gagaagctgg aaagggagtg ggaccgcgaa 180

ctggcaagca aaaagaaccc caagctgatc aatgccctga gaaggtgttt cttttggaga 240

ttcatgttct acgggatctt tctgtatctg ggcgaggtta caaaggctgt gcagcccctg 300

ctgctcggca gaatcatcgc ctcatacgat ccagacaaca aggaagaaag aagcatcgcc 360

atctacctgg gcattggcct ctgcctcctg tttattgtgc ggactctgct gctgcaccca 420

gcaattttcg ggttgcatca tattggcatg cagatgcgca ttgctatgtt ttccctcatc 480

tacaaaaaga cactgaaact cagctcccgg gtgctggaca agatctccat cggccaactg 540

gtgtctctcc tgagcaataa cttgaataag ttcgacgaag ggctggccct ggcacacttc 600

gtgtggattg cccccctgca ggtggccctg ctgatgggac tgatttggga actgctgcag 660

gctagcgctt tctgcggcct ggggttcctg atcgtgctgg cactgtttca ggcaggcctg 720

ggccgtatga tgatgaagta cagagaccag agggccggga agatctccga acggctcgtt 780

attacctctg agatgatcga gaacattcag tctgtgaaag cctactgctg ggaggaggct 840

atggagaaga tgatcgagaa tctgagacag accgagctga agctgaccag aaaggccgcc 900

tacgtgaggt acttcaacag cagtgccttc ttcttctctg gcttcttcgt tgtgtttctg 960

agcgtgctgc catacgctct catcaaaggc atcatcctgc ggaagatctt caccaccatc 1020

agcttttgca tcgtgcttag aatggccgtg acccggcagt tcccatgggc cgtgcaaact 1080

tggtatgatt ccctgggcgc catcaacaaa atccaggatt tcctgcagaa gcaggaatac 1140

aagacactcg aatataatct cacaactact gaggtggtta tggagaacgt gactgccttc 1200

tgggaggagg ggttcggaga gctttttgag aaggcaaaac agaataacaa caaccgcaaa 1260

accagcaacg gcgacgacag cctgttcttc tccaattttt ctctcctggg aacacccgtc 1320

ctcaaagaca tcaactttaa gatcgagagg ggacagctgc tcgcagtcgc cggatccaca 1380

ggcgccggca agacctctct gctgatggtt atcatgggcg aactggagcc atccgagggc 1440

aagattaagc acagtggaag aatctccttt tgtagccagt tcagttggat tatgcccggc 1500

actattaagg agaatatcat ttttggggtg agctatgatg agtatcggta tcggagcgtt 1560

atcaaagcct gtcagctgga ggaggatatc agcaaattcg cagagaagga taatatcgtg 1620

ctgggggagg ggggaatcac cctgagcgga ggccagagag ccagaatctc actggcccgg 1680

gccgtctaca aggacgccga cctttacctt ctggacagtc cctttggata tctggatgtg 1740

ctgactgaaa aggagatctt cgagtcttgt gtgtgcaagc tgatggctaa taagacccgg 1800

atcctagtga ccagtaagat ggagcacctg aagaaggcag acaagatctt gattctgcac 1860

gagggatcct cttactttta cggcaccttt agcgagctgc agaatctcca gcccgatttc 1920

tcatctaagc tgatgggctg tgatagcttc gaccagttct ctgccgagcg cagaaacagc 1980

atcctgacag agacactgca ccggttttca ctggagggcg acgcccctgt cagctggacc 2040

gagaccaaaa agcagtcttt caagcagaca ggcgagttcg gcgagaagcg caaaaacagc 2100

atcctgaatc caatcaactc tataaggaag tttagcatcg tgcagaagac acccctccag 2160

atgaacggca tcgaagagga cagtgacgag cccctggagc ggcgcctgag cctcgtgcct 2220

gacagcgaac agggcgaggc catcctgcct aggatcagcg tgatttcaac cgggccaaca 2280

ctgcaggcta ggagaagaca gtcagtgctt aacctgatga cacatagcgt gaatcaggga 2340

cagaacatcc atcgaaaaac cacagcctct actcgcaaag tgtcactggc tcctcaggct 2400

aatctgacag agctggacat ctatagcagg aggctgagcc aggagacagg cctggagatc 2460

agtgaggaga tcaacgaaga ggacctgaag gagtgctttt tcgatgacat ggagagtatc 2520

cccgccgtca ccacctggaa tacctacctc cggtacatca cagtgcacaa gtccctcatc 2580

tttgtgctga tttggtgcct cgtgatcttt ctcgcagaag tggccgcctc cctggtggtg 2640

ctgtggctgt tggggaatac tccactgcag gacaaaggca attctacaca cagcaggaat 2700

aattcctatg ccgtgattat caccagcaca tcctcttact acgtgttcta catctacgtg 2760

ggagtggcag atactctgct tgcaatgggc ttcttcaggg ggctgcccct ggtgcacaca 2820

ctgatcacag tgtccaagat cctccaccat aaaatgctcc acagcgtgct gcaggcaccc 2880

atgagcaccc tgaacacact gaaggccggc ggcatcctga atcgcttttc caaagacatc 2940

gccatcctcg acgatctcct gccactgacc atcttcgatt ttatccagct gctgctgatc 3000

gtgatcgggg ccatcgccgt ggtggccgtg ctgcagccat acattttcgt ggctacagtg 3060

cccgtgatcg ttgcctttat catgctgaga gcctacttcc tgcagacttc tcagcagctg 3120

aagcagctgg agagcgaagg gagaagcccc atcttcactc acctggtgac aagcctgaag 3180

ggactctgga ccctgagagc cttcggccgg cagccctatt tcgagaccct gtttcacaag 3240

gccctcaacc tgcacacagc caactggttt ctctacctgt ccaccctgag gtggttccag 3300

atgaggattg aaatgatctt cgtgattttt ttcatcgccg tgacattcat tagcattctg 3360

accaccggcg agggggaggg gagagtgggc atcatcctga cccttgccat gaacattatg 3420

tccacactgc agtgggccgt gaatagttca atcgacgtgg acagtctgat gaggtccgtg 3480

agccgggtgt tcaagttcat tgacatgccc acagagggga aacccaccaa aagcaccaag 3540

ccctacaaga acgggcagct gtccaaggtt atgatcatcg agaactctca cgtgaagaag 3600

gacgacattt ggcccagcgg cggccagatg acagtgaaag atctgaccgc caaatacacc 3660

gagggaggca acgccatcct cgaaaacatt agcttctcta tcagccctgg acagagggtg 3720

ggcctgctgg gccggacagg ctcagggaag agtactctgc tgtcagcatt cctgaggctc 3780

ctgaacacag agggcgagat ccagattgac ggcgtgtcct gggactccat caccctgcag 3840

cagtggcgga aggctttcgg ggtgatcccc cagaaggtgt tcatctttag cggcactttc 3900

agaaagaatc tggaccctta tgagcagtgg agtgaccagg agatctggaa agtggccgat 3960

gaggtcggac tgaggagcgt gatcgagcag tttccaggga agctggactt tgtgctggtg 4020

gatggcggat gcgtgctgtc tcacggccat aaacagctga tgtgtctggc ccggtccgtg 4080

ctgtctaagg ccaagatcct gctgctggac gaaccctccg cccacctgga ccccgtgaca 4140

taccagatca tcaggagaac tctcaagcag gccttcgccg actgtaccgt gattctgtgc 4200

gagcaccgca ttgaagctat gctggagtgt cagcagttcc tggtgatcga ggaaaataag 4260

gtgaggcagt acgacagcat ccagaagctg ctgaacgagc gctccctgtt ccgccaggct 4320

atctccccat cagaccgggt gaagctcttc ccccacagaa actcctcaaa gtgcaagtcc 4380

aagccccaga tcgccgccct gaaggaggag accgaggagg aggtgcagga caccaggctg 4440

tga 4443

<210> 4

<211> 4443

<212> DNA

<213> Artificial Sequence

<220>

<223> Synthetic polynucleotide

<400> 4

atgcagcgtt ctcccctgga gaaggcttct gtggtgagta aacttttttt ctcctggacc 60

agacctatcc tgaggaaagg ctacaggcag agactggagc tctctgacat ataccagata 120

ccttcagtcg atagcgccga caacctgagc gagaagctgg aacgcgagtg ggacagagag 180

ctggcaagca agaagaaccc aaagctgatt aatgccctga gaaggtgttt cttctggaga 240

ttcatgttct acggaatctt tctgtatctg ggggaggtta caaaggctgt gcaacccctg 300

ctgctcggca gaatcatcgc ctcatacgat ccagacaaca aggaagaaag aagcatcgcc 360

atctacctgg gcattggcct ctgcctcctg tttattgtgc ggactctgct gctgcaccca 420

gcaattttcg ggttgcatca tattggcatg cagatgcgca ttgctatgtt ttccctcatc 480

tacaaaaaga cactgaaact cagctcccgg gtgctggaca agatctccat cggccaactg 540

gtgtctctcc tgagcaataa cttgaataag ttcgacgaag ggctggccct ggcacacttc 600

gtgtggattg cccccctgca ggtggccctg ctgatgggac tgatttggga actgctgcag 660

gctagcgctt tctgcggcct ggggttcctg atcgtgctgg cactgtttca ggcaggcctg 720

ggccgtatga tgatgaagta cagagaccag agggccggga agatctccga acggctcgtt 780

attacctctg agatgatcga gaacattcag tctgtgaaag cctactgctg ggaggaggct 840

atggagaaga tgatcgagaa tctgagacag accgagctga agctgaccag aaaggccgcc 900

tacgtgaggt acttcaacag cagtgccttc ttcttctctg ggttcttcgt tgtgtttctg 960

agcgtgctgc catacgctct catcaaaggc atcatcctgc ggaagatctt caccaccatc 1020

agcttttgca tcgtgcttag aatggccgtg acccggcagt tcccatgggc cgtgcaaact 1080

tggtatgatt ccctgggcgc catcaacaaa atccaggatt tcctgcagaa gcaggaatac 1140

aagacactcg aatataatct cacaactact gaggtggtta tggagaacgt gactgccttc 1200

tgggaggagg ggttcggaga gctttttgag aaggcaaaac agaataacaa caaccgcaaa 1260

accagcaacg gcgacgacag cctgttcttc tccaattttt ctctcctggg aacacccgtc 1320

ctcaaagaca tcaactttaa gatcgagagg ggccagctgc tcgccgtcgc cggatccaca 1380

ggcgccggca agacctctct gctgatggtt atcatgggcg aactggagcc ctccgagggc 1440

aagattaagc actcaggaag aatctccttt tgtagccagt tcagttggat tatgcccggc 1500

actattaagg agaatatcat ttttggggtg agctatgatg agtatcggta tcggagcgtt 1560

atcaaagcct gtcagctgga ggaggatatc agcaagttcg cagagaagga taatattgtg 1620

ctgggagagg gaggaatcac cctgagcgga ggccagagag ccagaatctc actggcccgg 1680

gccgtctaca aggacgccga cctttacctt ctggacagtc cctttggata tctggatgtg 1740

ctgactgaaa aggagatctt cgagtcttgt gtgtgcaagc tgatggctaa taagacccgg 1800

atcctagtga ccagtaagat ggagcacctg aagaaggcag acaagatctt gattctgcac 1860

gagggatcct cttactttta cggcaccttt agcgagctgc agaatctcca gcccgatttc 1920

tcatctaagc tgatgggctg tgatagcttc gaccagttct ctgccgagcg cagaaacagc 1980

atcctgacag agacactgca ccggttttca ctggagggcg acgcccctgt cagctggacc 2040

gagaccaaaa agcagtcttt caagcagaca ggcgagttcg gcgagaagcg caaaaacagc 2100

atcctgaatc caatcaactc tataaggaag tttagcatcg tgcagaagac acccctccag 2160

atgaacggca tcgaagagga cagtgacgag cccctggagc ggcgcctgag cctcgtgcct 2220

gacagcgaac agggcgaggc catcctgcct aggatcagcg tgatttcaac cgggccaaca 2280

ctgcaggcta ggagaagaca gtcagtgctt aacctgatga cacatagcgt gaatcaggga 2340

cagaacatcc atcgaaaaac cacagcctct actcgcaaag tgtcactggc tcctcaggct 2400

aatctgacag agctggacat ctatagcagg aggctgagcc aggagacagg cctggagatc 2460

agtgaggaga tcaacgaaga ggacctgaag gagtgctttt tcgatgacat ggagagtatc 2520

cccgccgtca ccacctggaa tacctacctc cggtacatca cagtgcacaa gtccctcatc 2580

tttgtgctga tttggtgcct cgtgatcttt ctcgcagaag tggccgcctc cctggtggtg 2640

ctgtggctgt tggggaatac tccactgcag gacaaaggca attctacaca cagcaggaat 2700

aattcctatg ccgtgattat caccagcaca tcctcttact acgtgttcta catctacgtg 2760

ggagtggcag atactctgct tgcaatgggc ttcttcaggg ggctgcccct ggtgcacaca 2820

ctgatcacag tgtccaagat cctccaccat aaaatgctcc acagcgtgct gcaggcaccc 2880

atgagcaccc tgaacacact gaaggccggc ggcatcctga atcgcttttc caaagacatc 2940

gccatcctcg acgatctcct gccactgacc atcttcgatt ttatccagct gctgctgatc 3000

gtgatcgggg ccatcgccgt ggtggccgtg ctgcagccat acattttcgt ggctacagtg 3060

cccgtgatcg ttgcctttat catgctgaga gcctacttcc tgcagacttc tcagcagctg 3120

aagcagctgg agagcgaagg gagaagcccc atcttcactc acctggtgac aagcctgaag 3180

ggactctgga ccctgagagc cttcggccgg cagccctatt tcgagaccct gtttcacaag 3240

gccctcaacc tgcacacagc caactggttt ctctacctgt ccaccctgag gtggttccag 3300

atgaggattg aaatgatctt cgtgattttt ttcatcgccg tgacattcat tagcattctg 3360

accaccggcg agggggaggg gagagtgggc atcatcctga cccttgccat gaacattatg 3420

agcacactgc agtgggccgt gaatagtagt atcgacgtgg acagtctgat gaggtccgtg 3480

agccgggtgt tcaagttcat tgacatgccc acagaaggga aacccaccaa aagcaccaag 3540

ccctacaaga acgggcagct gtccaaggtt atgatcatcg agaactctca cgtgaagaag 3600

gacgacattt ggcccagcgg cggccagatg acagtgaaag atctgaccgc caaatacacc 3660

gagggaggca acgccatcct cgaaaacatt agcttctcta tcagccctgg acagagggtg 3720

ggcctgctgg gccggacagg ctcagggaag agtactctgc tgtcagcatt cctgaggctc 3780

ctgaacacag agggcgagat ccagattgac ggcgtgtcct gggactccat caccctgcag 3840

cagtggcgga aggctttcgg ggtgatcccc cagaaggtgt tcatctttag cggcactttc 3900

agaaagaatc tggaccctta tgagcagtgg agtgaccagg agatctggaa agtggccgat 3960

gaggtcggac tgaggagcgt gatcgagcag tttccaggga agctggactt tgtgctggtg 4020

gatggcggat gcgtgctgtc tcacggccat aaacagctga tgtgtctggc ccggtccgtg 4080

ctgtctaagg ccaagatcct gctgctggac gaaccctccg cccacctgga ccccgtgaca 4140

taccagatca tcaggagaac tctcaagcag gccttcgccg actgtaccgt gattctgtgc 4200

gagcaccgca ttgaagctat gctggagtgt cagcagttcc tggtgatcga ggaaaataag 4260

gtgaggcagt acgacagcat ccagaagctg ctgaacgagc gctccctgtt ccgccaggct 4320

atctccccat cagaccgggt gaaactcttc ccccacagaa actcctcaaa gtgcaagtcc 4380

aagccccaga tcgccgccct gaaggaggag accgaggagg aggtgcagga caccaggctg 4440

tga 4443

<210> 5

<211> 4443

<212> DNA

<213> Artificial Sequence

<220>

<223> Synthetic polynucleotide

<400> 5

atgcagcgct cgcctctgga aaaggcgagc gtcgtgtcaa agctattctt ttcttggacc 60

cggcccattc tcaggaaggg ctacaggcag aggctggagt tgagcgacat ctatcagatt 120

ccttccgtgg acagcgccga caacctgagc gagaagctgg aaagggagtg ggaccgcgaa 180

ctggcaagca aaaagaaccc caagctgatc aatgccctga gaaggtgttt cttttggaga 240

ttcatgttct acgggatctt tctgtatctg ggcgaggtta caaaggctgt gcagcccctg 300

ctgctcggca gaatcatcgc ctcatacgat ccagacaaca aggaagaaag aagcatcgcc 360

atctacctgg gcattggcct ctgcctcctg tttattgtgc ggactctgct gctgcaccca 420

gcaattttcg ggttgcatca tattggcatg cagatgcgca ttgctatgtt ttccctcatc 480

tacaaaaaga cactgaaact cagctcccgg gtgctggaca agatctccat cggccaactg 540

gtgtctctcc tgagcaataa cttgaataag ttcgacgaag ggctggccct ggcacacttc 600

gtgtggattg cccccctgca ggtggccctg ctgatgggac tgatttggga actgctgcag 660

gctagcgctt tctgcggcct ggggttcctg atcgtgctgg cactgtttca ggcaggcctg 720

ggccgtatga tgatgaagta cagagaccag agggccggga agatctccga acggctcgtt 780

attacctctg agatgatcga gaacattcag tctgtgaaag cctactgctg ggaggaggct 840

atggagaaga tgatcgagaa tctgagacag accgagctga agctgaccag aaaggccgcc 900

tacgtgaggt acttcaacag cagtgccttc ttcttctctg gcttcttcgt tgtgtttctg 960

agcgtgctgc catacgctct catcaaaggc atcatcctgc ggaagatctt caccaccatc 1020

agcttttgca tcgtgcttag aatggccgtg acacggcagt tcccatgggc cgttcaaact 1080

tggtatgatt ccctgggcgc catcaacaaa atccaggatt tcctgcagaa gcaggaatac 1140

aagacactcg aatataacct cacaactact gaggtggtta tggagaacgt gactgccttc 1200

tgggaggagg ggttcggaga gctttttgag aaggccaaac agaataataa taaccgcaaa 1260

accagcaacg gcgacgacag cctgttcttc tccaattttt ctctcctggg aacacccgtc 1320

ctcaaagaca tcaactttaa gatcgagagg ggacagctgc tcgcagtcgc cggatccaca 1380

ggcgccggca agacctctct gctgatggtt atcatgggcg aactggagcc atccgagggc 1440

aagattaagc acagtggaag aatctccttt tgtagccagt tcagttggat tatgcccggc 1500

actattaagg agaatatcat ttttggggtg agctatgatg agtatcggta tcggagcgtt 1560

atcaaagcct gtcagctgga ggaggatatc agcaaattcg cagagaagga taatatcgtg 1620

ctgggggagg ggggaatcac cctgagcgga ggccagagag ccagaatctc actggcccgg 1680

gccgtctaca aggacgccga cctttacctt ctggacagtc cctttggata tctggatgtg 1740

ctgactgaaa aggagatctt cgagtcttgt gtgtgcaagc tgatggctaa caagacccgg 1800

atcctagtga ctagtaagat ggagcacctg aagaaggcag acaagatctt gattctgcac 1860

gagggatcct cttactttta cggcaccttt agcgagctgc agaacctcca gcccgatttc 1920

tcatctaagc tgatgggctg tgatagcttc gaccagttct ctgccgagcg cagaaacagc 1980

atcctgacag agacactgca ccggttttca ctggagggcg acgcccctgt cagctggacc 2040

gagaccaaaa agcagtcttt caagcagaca ggcgagttcg gcgagaagcg caaaaacagc 2100

atcctgaatc caatcaactc tataaggaag tttagcatcg tgcagaagac acccctccag 2160

atgaacggca tcgaagagga cagtgacgag cccctggagc ggcgcctgag cctcgtgcct 2220

gacagcgaac agggcgaggc catcctgcct aggatcagcg tgatttcaac cgggccaaca 2280

ctgcaggcta ggagaagaca gtcagtgctt aacctgatga cacatagcgt gaatcaggga 2340

cagaacatcc atcgaaaaac cacagcctct actcgcaaag tgtcactggc tcctcaggct 2400

aatctgacag agctggacat ctatagcagg aggctgagcc aggagacagg cctggagatc 2460

agtgaggaga tcaacgaaga ggacctgaag gagtgctttt tcgatgacat ggagagtatc 2520

cccgccgtca ccacctggaa tacctacctc cggtacatca cagtgcacaa gtccctcatc 2580

tttgtgctga tttggtgcct cgtgatcttt ctcgcagaag tggccgcctc cctggtggtg 2640

ctgtggctgt tggggaatac tccactgcag gacaaaggca attctacaca cagcaggaat 2700

aattcctatg ccgtgattat caccagcaca tcctcttact acgtgttcta catctacgtg 2760

ggagtggcag atactctgct tgcaatgggc ttcttcaggg ggctgcccct ggtgcacaca 2820

ctgatcacag tgtccaagat cctccaccat aaaatgctcc acagcgtgct gcaggcaccc 2880

atgagcaccc tgaacacact gaaggccggc ggcatcctga atcgcttttc caaagacatc 2940

gccatcctcg acgatctcct gccactgacc atcttcgatt ttatccagct gctgctgatc 3000

gtgatcgggg ccatcgccgt ggtggccgtg ctgcagccat acattttcgt ggctacagtg 3060

cccgtgatcg ttgcctttat catgctgaga gcctacttcc tgcagacttc tcagcagctg 3120

aagcagctgg agagcgaagg gagaagcccc atcttcactc acctggtgac aagcctgaag 3180

ggactctgga ccctgagagc cttcggccgg cagccctatt tcgagaccct gtttcacaag 3240

gccctcaacc tgcacacagc caactggttc ctctacctgt ccaccctgag gtggttccag 3300

atgaggattg aaatgatctt cgtgattttt ttcatcgccg tgacattcat tagcattctg 3360

accaccggcg agggggaggg gagagtgggc atcatcctga cccttgccat gaacattatg 3420

tccacactgc agtgggccgt gaatagttca atcgacgtgg acagtctgat gaggtccgtg 3480

agccgggtgt tcaagttcat tgacatgccc acagagggga aacccaccaa aagcaccaag 3540

ccctacaaga acgggcagct gtccaaggtt atgatcatcg agaactctca cgtgaagaag 3600

gacgacattt ggcccagcgg cggccagatg acagtgaaag atctgaccgc caaatacacc 3660

gagggaggca acgccatcct cgaaaacatt agcttctcta tcagccctgg acagagggtg 3720

ggcctgctgg gccggacagg ctcagggaag agtactctgc tgtcagcatt cctgaggctc 3780

ctgaacacag agggcgagat ccagattgac ggcgtgtcct gggactccat caccctgcag 3840

cagtggcgga aggctttcgg ggtgatcccc cagaaggtgt tcatctttag cggcactttc 3900

agaaagaatc tggaccctta tgagcagtgg agtgaccagg agatctggaa agtggccgat 3960

gaggtcggac tgaggagcgt gatcgagcag tttccaggga agctggactt tgtgctggtg 4020

gatggcggat gcgtgctgtc tcacggccat aaacagctga tgtgtctggc ccggtccgtg 4080

ctgtctaagg ccaagatcct gctgctggac gaaccctccg cccacctgga ccccgtgaca 4140

taccagatca tcaggagaac tctcaagcag gccttcgccg actgtaccgt gattctgtgc 4200

gagcaccgca ttgaagctat gctggagtgt cagcagttcc tggtgatcga ggaaaataag 4260

gtgaggcagt acgacagcat ccagaagctg ctgaacgagc gctccctgtt ccgccaggct 4320

atctccccat cagaccgggt gaaactcttc ccccacagaa actcctcaaa gtgcaagtcc 4380

aagccccaga tcgccgccct gaaggaggag accgaggagg aggtgcagga caccaggctg 4440

tga 4443

<210> 6

<211> 4443

<212> DNA

<213> Artificial Sequence

<220>

<223> Synthetic polynucleotide

<400> 6

atgcagcgtt ctcccctgga gaaggcttct gtggtgagta aacttttttt ctcctggacc 60

agacctatcc tgaggaaagg ctacaggcag agactggagc tctctgacat ataccagata 120

ccttcagtcg atagcgccga caacctgagc gagaagctgg aacgcgagtg ggacagagag 180

ctggcaagca agaagaaccc aaagctgatt aatgccctga gaaggtgttt cttctggaga 240

ttcatgttct acggaatctt tctgtatctg ggggaggtta caaaggctgt gcaacccctg 300

ctgctcggca gaatcatcgc ctcatacgat ccagacaaca aggaagaaag aagcatcgcc 360

atctacctgg gcattggcct ctgcctcctg tttattgtgc ggactctgct gctgcaccca 420

gcaattttcg ggttgcatca tattggcatg cagatgcgca ttgctatgtt ttccctcatc 480

tacaaaaaga cactgaaact cagctcccgg gtgctggaca agatctccat cggccaactg 540

gtgtctctcc tgagcaataa cttgaataag ttcgacgaag ggctggccct ggcacacttc 600

gtgtggattg cccccctgca ggtggccctg ctgatgggac tgatttggga actgctgcag 660

gctagcgctt tctgcggcct ggggttcctg atcgtgctgg cactgtttca ggcaggcctg 720

ggccgtatga tgatgaagta cagagaccag agggccggga agatctccga acggctcgtt 780

attacctctg agatgatcga gaacattcag tctgtgaaag cctactgctg ggaggaggct 840

atggagaaga tgatcgagaa tctgagacag accgagctga agctgaccag aaaggccgcc 900

tacgtgaggt acttcaacag cagtgccttc ttcttctctg gcttcttcgt tgtgtttctg 960

agcgtgctgc catacgctct catcaaaggc atcatcctgc ggaagatctt caccaccatc 1020

agcttttgca tcgtgcttag aatggccgtg acccggcagt tcccatgggc cgtgcaaact 1080

tggtatgatt ccctgggcgc catcaacaaa atccaggatt tcctgcagaa gcaggaatac 1140

aagacactcg aatataatct cacaactact gaggtggtta tggagaacgt gactgccttc 1200

tgggaggagg ggttcggaga gctttttgag aaggcaaaac agaataacaa caaccgcaaa 1260

accagcaacg gcgacgacag cctgttcttc tccaattttt ctctcctggg aacacccgtc 1320

ctcaaagaca tcaactttaa gatcgagagg ggccagctgc tcgccgtcgc cggatccaca 1380

ggcgccggca agacctctct gctgatggtt atcatgggcg aactggagcc ctccgagggc 1440

aagattaagc actcaggaag aatctccttt tgtagccagt tcagttggat tatgcccggc 1500

actattaagg agaatatcat ttttggggtg agctatgatg agtatcggta tcggagcgtt 1560

atcaaagcct gtcagctgga ggaggatatc agcaagttcg cagagaagga taatattgtg 1620

ctgggagagg gaggaatcac cctgagcgga ggccagagag ccagaatctc actggcccgg 1680

gccgtctaca aggacgccga cctttacctt ctggacagtc cctttggata tctggatgtg 1740

ctgactgaaa aggagatctt cgagtcttgt gtgtgcaagc tgatggctaa caagacccgg 1800

atcctagtga ctagtaagat ggagcacctg aagaaggcag acaagatctt gattctgcac 1860

gagggatcct cttactttta cggcaccttt agcgagctgc agaacctcca gcccgatttc 1920

tcatctaagc tgatgggctg tgatagcttc gaccagttct ctgccgagcg cagaaacagc 1980

atcctgacag agacactgca ccggttttca ctggagggcg acgcccctgt cagctggacc 2040

gagaccaaaa agcagtcttt caagcagaca ggcgagttcg gcgagaagcg caaaaacagc 2100

atcctgaatc caatcaactc tataaggaag tttagcatcg tgcagaagac acccctccag 2160

atgaacggca tcgaagagga cagtgacgag cccctggagc ggcgcctgag cctcgtgcct 2220

gacagcgaac agggcgaggc catcctgcct aggatcagcg tgatttcaac cgggccaaca 2280

ctgcaggcta ggagaagaca gtcagtgctt aacctgatga cacatagcgt gaatcaggga 2340

cagaacatcc atcgaaaaac cacagcctct actcgcaaag tgtcactggc tcctcaggct 2400

aatctgacag agctggacat ctatagcagg aggctgagcc aggagacagg cctggagatc 2460

agtgaggaga tcaacgaaga ggacctgaag gagtgctttt tcgatgacat ggagagtatc 2520

cccgccgtca ccacctggaa tacctacctc cggtacatca cagtgcacaa gtccctcatc 2580

tttgtgctga tttggtgcct cgtgatcttt ctcgcagaag tggccgcctc cctggtggtg 2640

ctgtggctgt tggggaatac tccactgcag gacaaaggca attctacaca cagcaggaat 2700

aattcctatg ccgtgattat caccagcaca tcctcttact acgtgttcta catctacgtg 2760

ggagtggcag atactctgct tgcaatgggc ttcttcaggg ggctgcccct ggtgcacaca 2820

ctgatcacag tgtccaagat cctccaccat aaaatgctcc acagcgtgct gcaggcaccc 2880

atgagcaccc tgaacacact gaaggccggc ggcatcctga atcgcttttc caaagacatc 2940

gccatcctcg acgatctcct gccactgacc atcttcgatt ttatccagct gctgctgatc 3000

gtgatcgggg ccatcgccgt ggtggccgtg ctgcagccat acattttcgt ggctacagtg 3060

cccgtgatcg ttgcctttat catgctgaga gcctacttcc tgcagacttc tcagcagctg 3120

aagcagctgg agagcgaagg gagaagcccc atcttcactc acctggtgac aagcctgaag 3180

ggactctgga ccctgagagc cttcggccgg cagccctatt tcgagaccct gtttcacaag 3240

gccctcaacc tgcacacagc caactggttc ctctacctgt ccaccctgag gtggttccag 3300

atgaggattg aaatgatctt cgtgattttt ttcatcgccg tgacattcat tagcattctg 3360

accaccggcg agggggaggg gagagtgggc atcatcctga cccttgccat gaacattatg 3420

tccacactgc agtgggccgt gaatagttca atcgacgtgg acagtctgat gaggtccgtg 3480

agccgggtgt tcaagttcat tgacatgccc acagagggga aacccaccaa aagcaccaag 3540

ccctacaaga acgggcagct gtccaaggtt atgatcatcg agaactctca cgtgaagaag 3600

gacgacattt ggcccagcgg cggccagatg acagtgaaag atctgaccgc caaatacacc 3660

gagggaggca acgccatcct cgaaaacatt agcttctcta tcagccctgg acagagggtg 3720

ggcctgctgg gccggacagg ctcagggaag agtactctgc tgtcagcatt cctgaggctc 3780

ctgaacacag agggcgagat ccagattgac ggcgtgtcct gggactccat caccctgcag 3840

cagtggcgga aggctttcgg ggtgatcccc cagaaggtgt tcatctttag cggcactttc 3900

agaaagaatc tggaccctta tgagcagtgg agtgaccagg agatctggaa agtggccgat 3960

gaggtcggac tgaggagcgt gatcgagcag tttccaggga agctggactt tgtgctggtg 4020

gatggcggat gcgtgctgtc tcacggccat aaacagctga tgtgtctggc ccggtccgtg 4080

ctgtctaagg ccaagatcct gctgctggac gaaccctccg cccacctgga ccccgtgaca 4140

taccagatca tcaggagaac tctcaagcag gccttcgccg actgtaccgt gattctgtgc 4200

gagcaccgca ttgaagctat gctggagtgt cagcagttcc tggtgatcga ggaaaataag 4260

gtgaggcagt acgacagcat ccagaagctg ctgaacgagc gctccctgtt ccgccaggct 4320

atctccccat cagaccgggt gaaactcttc ccccacagaa actcctcaaa gtgcaagtcc 4380

aagccccaga tcgccgccct gaaggaggag accgaggagg aggtgcagga caccaggctg 4440

tga 4443

<210> 7

<211> 4443

<212> DNA

<213> Artificial Sequence

<220>

<223> Synthetic polynucleotide

<400> 7

atgcagcgtt ctcccctgga gaaggcttct gtggtgagta aacttttttt ctcctggacc 60

agacctatcc tgaggaaagg ctacaggcag agactggagc tctctgacat ataccagata 120

ccttcagtcg atagcgccga caacctgagc gagaagctgg aacgcgagtg ggacagagag 180

ctggcaagca agaagaaccc aaagctgatt aatgccctga gaaggtgttt cttctggaga 240

ttcatgttct acggaatctt tctgtatctg ggggaggtta caaaggctgt gcaacccctg 300

ctgctcggca gaatcatcgc ctcatacgat ccagacaaca aggaagaaag aagcatcgcc 360

atctacctgg gcattggcct ctgcctcctg tttattgtgc ggactctgct gctgcaccca 420

gcaattttcg ggttgcatca tattggcatg cagatgcgca ttgctatgtt ttccctcatc 480

tacaaaaaga cactgaaact cagctcccgg gtgctggaca agatctccat cggccaactg 540

gtgtctctcc tgagcaataa cttgaataag ttcgacgaag ggctggccct ggcacacttc 600

gtgtggattg cccccctgca ggtggccctg ctgatgggac tgatttggga actgctgcag 660

gctagcgctt tctgcggcct ggggttcctg atcgtgctgg cactgtttca ggcaggcctg 720

ggccgtatga tgatgaagta cagagaccag agggccggga agatctccga acggctcgtt 780

attacctctg agatgatcga gaacattcag tctgtgaaag cctactgctg ggaggaggct 840

atggagaaga tgatcgagaa tctgagacag accgagctga agctgaccag aaaggccgcc 900

tacgtgaggt acttcaacag cagtgccttc ttcttctctg ggttcttcgt tgtgtttctg 960

agcgtgctgc catacgctct catcaaaggc atcatcctgc ggaagatctt caccaccatc 1020

agcttttgca tcgtgcttag aatggccgtg acacggcagt tcccatgggc cgttcaaact 1080

tggtatgatt ccctgggcgc catcaacaaa atccaggatt tcctgcagaa gcaggaatac 1140

aagacactcg aatataacct cacaactact gaggtggtta tggagaacgt gactgccttc 1200

tgggaggagg ggttcggaga gctttttgag aaggccaaac agaataataa taaccgcaaa 1260

accagcaacg gcgacgacag cctgttcttc tccaattttt ctctcctggg aacacccgtc 1320

ctcaaagaca tcaactttaa gatcgagagg ggccagctgc tcgccgtcgc cggatccaca 1380

ggcgccggca agacctctct gctgatggtt atcatgggcg aactggagcc ctccgagggc 1440

aagattaagc actcaggaag aatctccttt tgtagccagt tcagttggat tatgcccggc 1500

actattaagg agaatatcat ttttggggtg agctatgatg agtatcggta tcggagcgtt 1560

atcaaagcct gtcagctgga ggaggatatc agcaagttcg cagagaagga taatattgtg 1620

ctgggagagg gaggaatcac cctgagcgga ggccagagag ccagaatctc actggcccgg 1680

gccgtctaca aggacgccga cctttacctt ctggacagtc cctttggata tctggatgtg 1740

ctgactgaaa aggagatctt cgagtcttgt gtgtgcaagc tgatggctaa caagacccgg 1800

atcctagtga ctagtaagat ggagcacctg aagaaggcag acaagatctt gattctgcac 1860

gagggatcct cttactttta cggcaccttt agcgagctgc agaacctcca gcccgatttc 1920

tcatctaagc tgatgggctg tgatagcttc gaccagttct ctgccgagcg cagaaacagc 1980

atcctgacag agacactgca ccggttttca ctggagggcg acgcccctgt cagctggacc 2040

gagaccaaaa agcagtcttt caagcagaca ggcgagttcg gcgagaagcg caaaaacagc 2100

atcctgaatc caatcaactc tataaggaag tttagcatcg tgcagaagac acccctccag 2160

atgaacggca tcgaagagga cagtgacgag cccctggagc ggcgcctgag cctcgtgcct 2220

gacagcgaac agggcgaggc catcctgcct aggatcagcg tgatttcaac cgggccaaca 2280

ctgcaggcta ggagaagaca gtcagtgctt aacctgatga cacatagcgt gaatcaggga 2340

cagaacatcc atcgaaaaac cacagcctct actcgcaaag tgtcactggc tcctcaggct 2400

aatctgacag agctggacat ctatagcagg aggctgagcc aggagacagg cctggagatc 2460

agtgaggaga tcaacgaaga ggacctgaag gagtgctttt tcgatgacat ggagagtatc 2520

cccgccgtca ccacctggaa tacctacctc cggtacatca cagtgcacaa gtccctcatc 2580

tttgtgctga tttggtgcct cgtgatcttt ctcgcagaag tggccgcctc cctggtggtg 2640

ctgtggctgt tggggaatac tccactgcag gacaaaggca attctacaca cagcaggaat 2700

aattcctatg ccgtgattat caccagcaca tcctcttact acgtgttcta catctacgtg 2760

ggagtggcag atactctgct tgcaatgggc ttcttcaggg ggctgcccct ggtgcacaca 2820

ctgatcacag tgtccaagat cctccaccat aaaatgctcc acagcgtgct gcaggcaccc 2880

atgagcaccc tgaacacact gaaggccggc ggcatcctga atcgcttttc caaagacatc 2940

gccatcctcg acgatctcct gccactgacc atcttcgatt ttatccagct gctgctgatc 3000

gtgatcgggg ccatcgccgt ggtggccgtg ctgcagccat acattttcgt ggctacagtg 3060

cccgtgatcg ttgcctttat catgctgaga gcctacttcc tgcagacttc tcagcagctg 3120

aagcagctgg agagcgaagg gagaagcccc atcttcactc acctggtgac aagcctgaag 3180

ggactctgga ccctgagagc cttcggccgg cagccctatt tcgagaccct gtttcacaag 3240

gccctcaacc tgcacacagc caactggttc ctctacctgt ccaccctgag gtggttccag 3300

atgaggattg aaatgatctt cgtgattttt ttcatcgccg tgacattcat tagcattctg 3360

accaccggcg agggggaggg gagagtgggc atcatcctga cccttgccat gaacattatg 3420

agcacactgc agtgggccgt gaatagtagt atcgacgtgg acagtctgat gaggtccgtg 3480

agccgggtgt tcaagttcat tgacatgccc acagaaggga aacccaccaa aagcaccaag 3540

ccctacaaga acgggcagct gtccaaggtt atgatcatcg agaactctca cgtgaagaag 3600

gacgacattt ggcccagcgg cggccagatg acagtgaaag atctgaccgc caaatacacc 3660

gagggaggca acgccatcct cgaaaacatt agcttctcta tcagccctgg acagagggtg 3720

ggcctgctgg gccggacagg ctcagggaag agtactctgc tgtcagcatt cctgaggctc 3780

ctgaacacag agggcgagat ccagattgac ggcgtgtcct gggactccat caccctgcag 3840

cagtggcgga aggctttcgg ggtgatcccc cagaaggtgt tcatctttag cggcactttc 3900

agaaagaatc tggaccctta tgagcagtgg agtgaccagg agatctggaa agtggccgat 3960

gaggtcggac tgaggagcgt gatcgagcag tttccaggga agctggactt tgtgctggtg 4020

gatggcggat gcgtgctgtc tcacggccat aaacagctga tgtgtctggc ccggtccgtg 4080

ctgtctaagg ccaagatcct gctgctggac gaaccctccg cccacctgga ccccgtgaca 4140

taccagatca tcaggagaac tctcaagcag gccttcgccg actgtaccgt gattctgtgc 4200

gagcaccgca ttgaagctat gctggagtgt cagcagttcc tggtgatcga ggaaaataag 4260

gtgaggcagt acgacagcat ccagaagctg ctgaacgagc gctccctgtt ccgccaggct 4320

atctccccat cagaccgggt gaagctcttt ccccacagaa actcctcaaa gtgcaagtcc 4380

aagccccaga tcgccgccct gaaggaggag accgaggagg aggtgcagga caccaggctg 4440

tga 4443

<210> 8

<211> 4443

<212> DNA

<213> Artificial Sequence

<220>

<223> Synthetic polynucleotide

<400> 8

atgcagcgtt ctcccctgga gaaggcttct gtggtgagta aacttttttt ctcctggacc 60

agacctatcc tgaggaaagg ctacaggcag agactggagc tctctgacat ataccagata 120

ccttcagtcg atagcgccga caacctgagc gagaagctgg aacgcgagtg ggacagagag 180

ctggcaagca agaagaaccc aaagctgatt aatgccctga gaaggtgttt cttctggaga 240

ttcatgttct acggaatctt tctgtatctg ggggaggtta caaaggctgt gcaacccctg 300

ctgctcggca gaatcatcgc ctcatacgat ccagacaata aggaagagag atctatcgcc 360

atctacctgg gaattggcct gtgtctgctg ttcatcgtgc gcaccctgct cctccaccca 420

gccatttttg ggctgcatca catcggaatg cagatgagga ttgctatgtt ttccctgatc 480

tataagaaga ccctgaaact ctcaagcaga gtgctggaca aaatttccat tggccagctg 540

gtgtctctgc tgtccaataa tctcaataag tttgacgagg gcctggccct ggcacacttc 600

gtctggattg cccctctcca ggtcgctctg ctgatgggcc tgatctggga gctgctgcag 660

gcatccgctt tctgcggcct ggggttcctg atcgtgctgg cactgtttca ggcaggcctg 720

ggccgtatga tgatgaagta cagagaccag agggccggga agatctccga acggctcgtt 780

attacctctg agatgatcga gaacattcag tctgtgaaag cctactgctg ggaggaggct 840

atggagaaga tgatcgagaa tctgagacag accgagctga agctgaccag aaaggccgcc 900

tacgtgaggt acttcaacag cagtgccttc ttcttctctg gcttcttcgt tgtgtttctg 960

agcgtgctgc catacgctct catcaaaggc atcatcctga ggaagatctt cactacaatc 1020

tccttctgca tcgtactcag aatggccgtg acccgccagt ttccctgggc cgtgcagaca 1080

tggtacgact ccctcggcgc cattaataag atccaggatt ttctgcagaa acaggaatac 1140

aagacactgg aatacaacct gacaacaaca gaggtggtca tggaaaacgt gaccgcattt 1200

tgggaggaag gcttcggaga gctctttgaa aaagctaagc agaacaacaa taacaggaaa 1260

acctctaatg gggacgacag cctgtttttc agcaattttt ctctgctggg gacacctgtg 1320

ctgaaggaca ttaactttaa gatcgagagg ggacagctgc tcgcagtcgc cggatccaca 1380

ggcgccggca agacctctct gctgatggtt atcatgggcg aactggagcc atccgagggc 1440

aagattaagc acagtggaag aatctccttt tgtagccagt tcagttggat tatgcccggc 1500

actattaagg agaatatcat ttttggggtg agctatgatg agtatcggta tcggagcgtt 1560

atcaaagcct gtcagctgga ggaggatatc agcaaattcg cagagaagga taatatcgtg 1620

ctgggggagg ggggaatcac cctgagcgga ggccagagag ccagaatcag cctggcaagg 1680

gcagtgtata aagacgctga cctgtacttg ctggactccc cttttggcta cctggacgtg 1740

ctgaccgaaa aggaaatctt tgagtcctgc gtctgcaagc tgatggcaaa caagaccaga 1800

atcctggtga cctccaagat ggaacatctg aagaaggcag ataaaatcct catcctgcat 1860

gagggatctt cttactttta tggaactttt agcgagctgc agaacctgca gccagacttc 1920

tccagcaagc tgatgggatg cgactccttt gaccagttct ccgccgaacg gcgcaattct 1980

atcctgaccg aaaccctgca ccggttttca ctggagggcg acgcccctgt cagctggacc 2040

gagaccaaaa agcagtcttt caagcagaca ggcgagttcg gcgagaagcg caaaaacagc 2100

atcctgaatc caatcaactc tataaggaag tttagcatcg tgcagaagac acccctccag 2160

atgaacggca tcgaagagga cagtgacgag cccctggagc ggcgcctgag cctcgtgcct 2220

gacagcgaac agggcgaggc catcctgcct aggatcagcg tgatttcaac cgggccaaca 2280

ctgcaggcta ggagaagaca gtcagtgctt aacctgatga cacatagcgt gaatcaggga 2340

cagaacatcc atcgaaaaac cacagcctct actcgcaaag tgtcactggc tcctcaggct 2400

aatctgacag agctggacat ctatagcagg aggctgagcc aggagacagg cctggagatc 2460

agtgaggaga tcaacgaaga ggacctgaag gagtgctttt tcgatgacat ggagagtatc 2520

cccgccgtca ccacctggaa tacctacctc cggtacatca cagtgcacaa gtccctcatc 2580

tttgtgctga tttggtgcct cgtgatcttt ctcgcagaag tggccgcctc cctggtggtg 2640

ctgtggctgt tggggaatac tccactgcag gacaaaggca attctacaca cagcaggaat 2700

aattcctatg ccgtgattat caccagcaca tcctcttact acgtgttcta catctacgtg 2760

ggagtggcag atactctgct tgcaatgggc ttcttcaggg ggctgcccct ggtgcacaca 2820

ctgatcacag tgtccaagat cctccaccat aaaatgctcc acagcgtgct gcaggcaccc 2880

atgagcaccc tgaacacact gaaggccggc ggcatcctga atcgcttttc caaagacatc 2940

gccatcctcg acgatctcct gccactgacc atcttcgatt ttatccagct gctgctgatc 3000

gtgatcgggg ccatcgccgt ggtggccgtg ctgcagccat acattttcgt ggctacagtg 3060

cccgtgatcg ttgcctttat catgctgaga gcctacttcc tgcagacttc tcagcagctg 3120

aagcagctgg agagcgaagg gagaagcccc atcttcactc acctggtgac aagcctgaag 3180

ggactctgga ccctgagagc cttcggccgg cagccctatt tcgagaccct gtttcacaag 3240

gccctcaacc tgcacacagc caactggttt ctctacctgt ccaccctgag gtggttccag 3300

atgaggattg aaatgatctt cgtgattttt ttcatcgccg tgacattcat tagcattctg 3360

accaccggcg agggggaggg gagagtgggc atcatcctga cccttgccat gaacattatg 3420

tccacactgc agtgggccgt gaatagttca atcgacgtgg acagtctgat gaggtccgtg 3480

agccgggtgt tcaagttcat tgacatgccc acagagggga aacccaccaa aagcaccaag 3540

ccctacaaga acgggcagct gtccaaggtt atgatcatcg agaactctca cgtgaagaag 3600

gacgacattt ggcccagcgg cggccagatg acagtgaaag atctgaccgc caaatacacc 3660

gagggaggca acgccatcct cgaaaacatt agcttctcta tcagccctgg acagagggtg 3720

ggcctgctgg gccggacagg ctcagggaag agtactctgc tgtcagcatt cctgaggctc 3780

ctgaacacag agggcgagat ccagattgac ggcgtgtcct gggactccat caccctgcag 3840

cagtggcgga aggctttcgg ggtgatcccc cagaaggtgt tcatctttag cggcactttc 3900

agaaagaatc tggaccctta tgagcagtgg agtgaccagg agatctggaa agtggccgat 3960

gaggtcggac tgaggagcgt gatcgagcag tttccaggga agctggactt tgtgctggtg 4020

gatggcggat gcgtgctgtc tcacggccat aaacagctga tgtgtctggc ccggtccgtg 4080

ctgtctaagg ccaagatcct gctgctggac gaaccctccg cccacctgga ccccgtgaca 4140

taccagatca tcaggagaac tctcaagcag gccttcgccg actgtaccgt gattctgtgc 4200

gagcaccgca ttgaagctat gctggagtgt cagcagttcc tggtgatcga ggaaaataag 4260

gtgaggcagt acgacagcat ccagaagctg ctgaacgagc gctccctgtt ccgccaggct 4320

atctccccat cagaccgggt gaaactcttc ccccacagaa actcctcaaa gtgcaagtcc 4380

aagccccaga tcgccgccct gaaggaggag accgaggagg aggtgcagga caccaggctg 4440

tga 4443

<210> 9

<211> 4443

<212> DNA

<213> Artificial Sequence

<220>

<223> Synthetic polynucleotide

<400> 9

atgcagcgct cgcctctgga aaaggcgagc gtcgtgtcaa agctattctt ttcttggacc 60

cggcccattc tcaggaaggg ctacaggcag aggctggagt tgagcgacat ctatcagatt 120

ccttccgtgg acagcgccga caacctgagc gagaagctgg aaagggagtg ggaccgcgaa 180

ctggcaagca aaaagaaccc caagctgatc aatgccctga gaaggtgttt cttttggaga 240

ttcatgttct acgggatctt tctgtatctg ggcgaggtta caaaggctgt gcagcccctg 300

ctgctcggca gaatcatcgc ctcatacgat ccagacaaca aggaagaaag aagcatcgcc 360

atctacctgg gcattggcct ctgcctcctg tttattgtgc ggactctgct gctgcaccca 420

gcaattttcg ggttgcatca tattggcatg cagatgcgca ttgctatgtt ttccctcatc 480

tacaaaaaga cactgaaact cagctcccgg gtgctggaca agatctccat cggccaactg 540

gtgtctctcc tgagcaataa cttgaataag ttcgacgaag ggctggccct ggcacacttc 600

gtgtggattg cccccctgca ggtggccctg ctgatgggac tgatttggga actgctgcag 660

gctagcgctt tctgcggcct ggggttcctg atcgtgctgg cactgtttca ggcaggcctg 720

ggccgtatga tgatgaagta cagagaccag agggccggga agatctccga acggctcgtt 780

attacctctg agatgatcga gaacattcag tctgtgaaag cctactgctg ggaggaggct 840

atggagaaga tgatcgagaa tctgagacag accgagctga agctgaccag aaaggccgcc 900

tacgtgaggt acttcaacag cagtgccttc ttcttctctg ggttcttcgt tgtgtttctg 960

agcgtgctgc catacgctct catcaaaggc atcatcctgc ggaagatctt caccaccatc 1020

agcttttgca tcgtgcttag aatggccgtg acacggcagt tcccatgggc cgttcaaact 1080

tggtatgatt ccctgggcgc catcaacaaa atccaggatt tcctgcagaa gcaggaatac 1140

aagacactcg aatataacct cacaactact gaggtggtta tggagaacgt gactgccttc 1200

tgggaggagg ggttcggaga gctttttgag aaggccaaac agaataataa taaccgcaaa 1260

accagcaacg gcgacgacag cctgttcttc tccaattttt ctctcctggg aacacccgtc 1320

ctcaaagaca tcaactttaa gatcgagagg ggccagctgc tcgccgtcgc cggatccaca 1380

ggcgccggca agacctctct gctgatggtt atcatgggcg aactggagcc ctccgagggc 1440

aagattaagc actcaggaag aatctccttt tgtagccagt tcagttggat tatgcccggc 1500

actattaagg agaatatcat ttttggggtg agctatgatg agtatcggta tcggagcgtt 1560

atcaaagcct gtcagctgga ggaggatatc agcaagttcg cagagaagga taatattgtg 1620

ctgggagagg gaggaatcac cctgagcgga ggccagagag ccagaatctc actggcccgg 1680

gccgtctaca aggacgccga cctttacctt ctggacagtc cctttggata tctggatgtg 1740

ctgactgaaa aggagatctt cgagtcttgt gtgtgcaagc tgatggctaa caagacccgg 1800

atcctagtga ctagtaagat ggagcacctg aagaaggcag acaagatctt gattctgcac 1860

gagggatcct cttactttta cggcaccttt agcgagctgc agaacctcca gcccgatttc 1920

tcatctaagc tgatgggctg tgatagcttc gaccagttct ctgccgagcg cagaaacagc 1980

atcctgacag agacactgca ccggttttca ctggagggcg acgcccctgt cagctggacc 2040

gagaccaaaa agcagtcttt caagcagaca ggcgagttcg gcgagaagcg caaaaacagc 2100

atcctgaatc caatcaactc tataaggaag tttagcatcg tgcagaagac acccctccag 2160

atgaacggca tcgaagagga cagtgacgag cccctggagc ggcgcctgag cctcgtgcct 2220

gacagcgaac agggcgaggc catcctgcct aggatcagcg tgatttcaac cgggccaaca 2280

ctgcaggcta ggagaagaca gtcagtgctt aacctgatga cacatagcgt gaatcaggga 2340

cagaacatcc atcgaaaaac cacagcctct actcgcaaag tgtcactggc tcctcaggct 2400

aatctgacag agctggacat ctatagcagg aggctgagcc aggagacagg cctggagatc 2460

agtgaggaga tcaacgaaga ggacctgaag gagtgctttt tcgatgacat ggagagtatc 2520

cccgccgtca ccacctggaa tacctacctc cggtacatca cagtgcacaa gtccctcatc 2580

tttgtgctga tttggtgcct cgtgatcttt ctcgcagaag tggccgcctc cctggtggtg 2640

ctgtggctgt tggggaatac tccactgcag gacaaaggca attctacaca cagcaggaat 2700

aattcctatg ccgtgattat caccagcaca tcctcttact acgtgttcta catctacgtg 2760

ggagtggcag atactctgct tgcaatgggc ttcttcaggg ggctgcccct ggtgcacaca 2820

ctgatcacag tgtccaagat cctccaccat aaaatgctcc acagcgtgct gcaggcaccc 2880

atgagcaccc tgaacacact gaaggccggc ggcatcctga atcgcttttc caaagacatc 2940

gccatcctcg acgatctcct gccactgacc atcttcgatt ttatccagct gctgctgatc 3000

gtgatcgggg ccatcgccgt ggtggccgtg ctgcagccat acattttcgt ggctacagtg 3060

cccgtgatcg ttgcctttat catgctgaga gcctacttcc tgcagacttc tcagcagctg 3120

aagcagctgg agagcgaagg gagaagcccc atcttcactc acctggtgac aagcctgaag 3180

ggactctgga ccctgagagc cttcggccgg cagccctatt tcgagaccct gtttcacaag 3240

gccctcaacc tgcacacagc caactggttc ctctacctgt ccaccctgag gtggttccag 3300

atgaggattg aaatgatctt cgtgattttt ttcatcgccg tgacattcat tagcattctg 3360

accaccggcg agggggaggg gagagtgggc atcatcctga cccttgccat gaacattatg 3420

agcacactgc agtgggccgt gaatagtagt atcgacgtgg acagtctgat gaggtccgtg 3480

agccgggtgt tcaagttcat tgacatgccc acagaaggga aacccaccaa aagcaccaag 3540

ccctacaaga acgggcagct gtccaaggtt atgatcatcg agaactctca cgtgaagaag 3600

gacgacattt ggcccagcgg cggccagatg acagtgaaag atctgaccgc caaatacacc 3660

gagggaggca acgccatcct cgaaaacatt agcttctcta tcagccctgg acagagggtg 3720

ggcctgctgg gccggacagg ctcagggaag agtactctgc tgtcagcatt cctgaggctc 3780

ctgaacacag agggcgagat ccagattgac ggcgtgtcct gggactccat caccctgcag 3840

cagtggcgga aggctttcgg ggtgatcccc cagaaggtgt tcatctttag cggcactttc 3900

agaaagaatc tggaccctta tgagcagtgg agtgaccagg agatctggaa agtggccgat 3960

gaggtcggac tgaggagcgt gatcgagcag tttccaggga agctggactt tgtgctggtg 4020

gatggcggat gcgtgctgtc tcacggccat aaacagctga tgtgtctggc ccggtccgtg 4080

ctgtctaagg ccaagatcct gctgctggac gaaccctccg cccacctgga ccccgtgaca 4140

taccagatca tcaggagaac tctcaagcag gccttcgccg actgtaccgt gattctgtgc 4200

gagcaccgca ttgaagctat gctggagtgt cagcagttcc tggtgatcga ggaaaataag 4260

gtgaggcagt acgacagcat ccagaagctg ctgaacgagc gctccctgtt ccgccaggct 4320

atctccccat cagaccgggt gaaactcttc ccccacagaa actcctcaaa gtgcaagtcc 4380

aagccccaga tcgccgccct gaaggaggag accgaggagg aggtgcagga caccaggctg 4440

tga 4443

<210> 10

<211> 4443

<212> DNA

<213> Artificial Sequence

<220>

<223> Synthetic polynucleotide

<400> 10

atgcagcgtt ctcccctgga gaaggcttct gtggtgagta aacttttttt ctcctggacc 60

agacctatcc tgaggaaagg ctacaggcag agactggagc tctctgacat ataccagata 120

ccttcagtcg atagcgccga caacctgagc gagaagctgg aacgcgagtg ggacagagag 180

ctggcaagca agaagaaccc aaagctgatt aatgccctga gaaggtgttt cttctggaga 240

ttcatgttct acggaatctt tctgtatctg ggggaggtta caaaggctgt gcaacccctg 300

ctgctcggca gaatcatcgc ctcatacgat ccagacaata aggaagagag atctatcgcc 360

atctacctgg gaattggcct gtgtctgctg ttcatcgtgc gcaccctgct cctccaccca 420

gccatttttg ggctgcatca catcggaatg cagatgagga ttgctatgtt ttccctgatc 480

tataagaaga ccctgaaact ctcaagcaga gtgctggaca aaatttccat tggccagctg 540

gtgtctctgc tgtccaataa tctcaataag tttgacgagg gcctggccct ggcacacttc 600

gtctggattg cccctctcca ggtcgctctg ctgatgggcc tgatctggga gctgctgcag 660

gcatccgctt tctgcggcct ggggttcctg atcgtgctgg cactgtttca ggcaggcctg 720

ggccgtatga tgatgaagta cagagaccag agggccggga agatctccga acggctcgtt 780

attacctctg agatgatcga gaacattcag tctgtgaaag cctactgctg ggaggaggct 840

atggagaaga tgatcgagaa tctgagacag accgagctga agctgaccag aaaggccgcc 900

tacgtgaggt acttcaacag cagtgccttc ttcttctctg ggttcttcgt tgtgtttctg 960

agcgtgctgc catacgctct catcaaaggc atcatcctga ggaagatctt cactacaatc 1020

tccttctgca tcgtactcag aatggccgtg acccgccagt ttccctgggc cgtgcagaca 1080

tggtacgact ccctcggcgc cattaataag atccaggatt ttctgcagaa acaggaatac 1140

aagacactgg aatacaacct gacaacaaca gaggtggtca tggaaaacgt gaccgcattt 1200

tgggaggaag gcttcggaga gctctttgaa aaagctaagc agaacaacaa taacaggaaa 1260

acctctaatg gggacgacag cctgtttttc agcaattttt ctctgctggg gacacctgtg 1320

ctgaaggaca ttaactttaa gatcgagagg ggacagctgc tcgcagtcgc cggatccaca 1380

ggcgccggca agacctctct gctgatggtt atcatgggcg aactggagcc atccgagggc 1440

aagattaagc acagtggaag aatctccttt tgtagccagt tcagttggat tatgcccggc 1500

actattaagg agaatatcat ttttggggtg agctatgatg agtatcggta tcggagcgtt 1560

atcaaagcct gtcagctgga ggaggatatc agcaaattcg cagagaagga taatatcgtg 1620

ctgggggagg ggggaatcac cctgagcgga ggccagagag ccagaatttc tctggccaga 1680

gccgtgtaca aagatgccga cctgtacctg ctggacagcc catttggcta tctggacgtg 1740

ctgaccgaaa aagagatttt cgagtcatgc gtttgtaagc tgatggccaa caagactcgc 1800

atcctggtga cttcgaagat ggaacatctg aagaaagctg ataagattct gatcctgcac 1860

gaaggcagct cctactttta cgggaccttc tccgagctcc agaacctgca gcctgatttc 1920

agctctaagc tgatgggctg cgatagcttt gaccagttta gcgcagaaag gcgcaactct 1980

attctgactg agacactgca ccggttttca ctggagggcg acgcccctgt cagctggacc 2040

gagaccaaaa agcagtcttt caagcagaca ggcgagttcg gcgagaagcg caaaaacagc 2100

atcctgaatc caatcaactc tataaggaag tttagcatcg tgcagaagac acccctccag 2160

atgaacggca tcgaagagga cagtgacgag cccctggagc ggcgcctgag cctcgtgcct 2220

gacagcgaac agggcgaggc catcctgcct aggatcagcg tgatttcaac cgggccaaca 2280

ctgcaggcta ggagaagaca gtcagtgctt aacctgatga cacatagcgt gaatcaggga 2340

cagaacatcc atcgaaaaac cacagcctct actcgcaaag tgtcactggc tcctcaggct 2400

aatctgacag agctggacat ctatagcagg aggctgagcc aggagacagg cctggagatc 2460

agtgaggaga tcaacgaaga ggacctgaag gagtgctttt tcgatgacat ggagagtatc 2520

cccgccgtca ccacctggaa tacctacctc cggtacatca cagtgcacaa gtccctcatc 2580

tttgtgctga tttggtgcct cgtgatcttt ctcgcagaag tggccgcctc cctggtggtg 2640

ctgtggctgt tggggaatac tccactgcag gacaaaggca attctacaca cagcaggaat 2700

aattcctatg ccgtgattat caccagcaca tcctcttact acgtgttcta catctacgtg 2760

ggagtggcag atactctgct tgcaatgggc ttcttcaggg ggctgcccct ggtgcacaca 2820

ctgatcacag tgtccaagat cctccaccat aaaatgctcc acagcgtgct gcaggcaccc 2880

atgagcaccc tgaacacact gaaggccggc ggcatcctga atcgcttttc caaagacatc 2940

gccatcctcg acgatctcct gccactgacc atcttcgatt ttatccagct gctgctgatc 3000

gtgatcgggg ccatcgccgt ggtggccgtg ctgcagccat acattttcgt ggctacagtg 3060

cccgtgatcg ttgcctttat catgctgaga gcctacttcc tgcagacttc tcagcagctg 3120

aagcagctgg agagcgaagg gagaagcccc atcttcactc acctggtgac aagcctgaag 3180

ggactctgga ccctgagagc cttcggccgg cagccctatt tcgagaccct gtttcacaag 3240

gccctcaacc tgcacacagc caactggttt ctctacctgt ccaccctgag gtggttccag 3300

atgaggattg aaatgatctt cgtgattttt ttcatcgccg tgacattcat tagcattctg 3360

accaccggcg agggggaggg gagagtgggc atcatcctga cccttgccat gaacattatg 3420

tccacactgc agtgggccgt gaatagttca atcgacgtgg acagtctgat gaggtccgtg 3480

agccgggtgt tcaagttcat tgacatgccc acagagggga aacccaccaa aagcaccaag 3540

ccctacaaga acgggcagct gtccaaggtt atgatcatcg agaactctca cgtgaagaag 3600

gacgacattt ggcccagcgg cggccagatg acagtgaaag atctgaccgc caaatacacc 3660

gagggaggca acgccatcct cgaaaacatt agcttctcta tcagccctgg acagagggtg 3720

ggcctgctgg gccggacagg ctcagggaag agtactctgc tgtcagcatt cctgaggctc 3780

ctgaacacag agggcgagat ccagattgac ggcgtgtcct gggactccat caccctgcag 3840

cagtggcgga aggctttcgg ggtgatcccc cagaaggtgt tcatctttag cggcactttc 3900

agaaagaatc tggaccctta tgagcagtgg agtgaccagg agatctggaa agtggccgat 3960

gaggtcggac tgaggagcgt gatcgagcag tttccaggga agctggactt tgtgctggtg 4020

gatggcggat gcgtgctgtc tcacggccat aaacagctga tgtgtctggc ccggtccgtg 4080

ctgtctaagg ccaagatcct gctgctggac gaaccctccg cccacctgga ccccgtgaca 4140

taccagatca tcaggagaac tctcaagcag gccttcgccg actgtaccgt gattctgtgc 4200

gagcaccgca ttgaagctat gctggagtgt cagcagttcc tggtgatcga ggaaaataag 4260

gtgaggcagt acgacagcat ccagaagctg ctgaacgagc gctccctgtt ccgccaggct 4320

atctccccat cagaccgggt gaaactcttc ccccacagaa actcctcaaa gtgcaagtcc 4380

aagccccaga tcgccgccct gaaggaggag accgaggagg aggtgcagga caccaggctg 4440

tga 4443

<210> 11

<211> 4443

<212> DNA

<213> Artificial Sequence

<220>

<223> Synthetic polynucleotide

<400> 11

atgcagcgct cgcctctgga aaaggcgagc gtcgtgtcaa agctattctt ttcttggacc 60

cggcccattc tcaggaaggg ctacaggcag aggctggagt tgagcgacat ctatcagatt 120

ccttccgtgg acagcgccga caacctgagc gagaagctgg aaagggagtg ggaccgcgaa 180

ctggcaagca aaaagaaccc caagctgatc aatgccctga gaaggtgttt cttttggaga 240

ttcatgttct acgggatctt tctgtatctg ggcgaggtta caaaggctgt gcagcccctg 300

ctgctcggca gaatcatcgc ctcatacgat ccagacaata aggaagagag atctatcgcc 360

atctacctgg gaattggcct gtgtctgctg ttcatcgtgc gcaccctgct cctccaccca 420

gccatttttg ggctgcatca catcggaatg cagatgagga ttgctatgtt ttccctgatc 480

tataagaaga ccctgaaact ctcaagcaga gtgctggaca aaatttccat tggccagctg 540

gtgtctctgc tgtccaataa tctcaataag tttgacgagg gcctggccct ggcacacttc 600

gtctggattg cccctctcca ggtcgctctg ctgatgggcc tgatctggga gctgctgcag 660

gcatccgctt tctgcggcct ggggttcctg atcgtgctgg cactgtttca ggcaggcctg 720

ggccgtatga tgatgaagta cagagaccag agggccggga agatctccga acggctcgtt 780

attacctctg agatgatcga gaacattcag tctgtgaaag cctactgctg ggaggaggct 840

atggagaaga tgatcgagaa tctgagacag accgagctga agctgaccag aaaggccgcc 900

tacgtgaggt acttcaacag cagtgccttc ttcttctctg gcttcttcgt tgtgtttctg 960

agcgtgctgc catacgctct catcaaaggc atcatcctga gaaaaatttt cacaaccatc 1020

tccttttgca tcgtgctgag aatggccgtg acaaggcagt tcccttgggc tgtgcagacc 1080

tggtacgaca gcctgggagc tattaataag attcaagatt tcctgcagaa gcaggaatac 1140

aaaacactgg aatacaacct gacaactact gaggtcgtta tggagaacgt gacagcattt 1200

tgggaggagg ggttcgggga actcttcgag aaggcaaagc agaacaacaa caatcggaag 1260

acatccaacg gcgacgacag cctgttcttt tccaacttca gcctgctggg aactccagtg 1320

ctcaaagaca ttaactttaa gatcgagagg ggccagctgc tcgccgtcgc cggatccaca 1380

ggcgccggca agacctctct gctgatggtt atcatgggcg aactggagcc ctccgagggc 1440

aagattaagc actcaggaag aatctccttt tgtagccagt tcagttggat tatgcccggc 1500

actattaagg agaatatcat ttttggggtg agctatgatg agtatcggta tcggagcgtt 1560

atcaaagcct gtcagctgga ggaggatatc agcaagttcg cagagaagga taatattgtg 1620

ctgggagagg gaggaatcac cctgagcgga ggccagagag ccagaattag cctcgcccgg 1680

gcagtctaca aagatgccga cctgtacctg ctggacagcc cttttggcta tttggatgtg 1740

ctgactgaaa aggaaatctt cgagagctgc gtgtgcaagc tgatggccaa caagacccgc 1800

atcctcgtca ctagcaagat ggaacacctg aagaaggccg acaagatcct gattctgcac 1860

gaggggagca gctacttcta tggcactttt tccgagctgc aaaatctcca gcctgacttc 1920

tcttccaagc tgatgggatg tgacagcttt gaccagtttt ccgctgagcg gcgcaatagc 1980

atcctgaccg aaaccctgca ccggttttca ctggagggcg acgcccctgt cagctggacc 2040

gagaccaaaa agcagtcttt caagcagaca ggcgagttcg gcgagaagcg caaaaacagc 2100

atcctgaatc caatcaactc tataaggaag tttagcatcg tgcagaagac acccctccag 2160

atgaacggca tcgaagagga cagtgacgag cccctggagc ggcgcctgag cctcgtgcct 2220

gacagcgaac agggcgaggc catcctgcct aggatcagcg tgatttcaac cgggccaaca 2280

ctgcaggcta ggagaagaca gtcagtgctt aacctgatga cacatagcgt gaatcaggga 2340

cagaacatcc atcgaaaaac cacagcctct actcgcaaag tgtcactggc tcctcaggct 2400

aatctgacag agctggacat ctatagcagg aggctgagcc aggagacagg cctggagatc 2460

agtgaggaga tcaacgaaga ggacctgaag gagtgctttt tcgatgacat ggagagtatc 2520

cccgccgtca ccacctggaa tacctacctc cggtacatca cagtgcacaa gtccctcatc 2580

tttgtgctga tttggtgcct cgtgatcttt ctcgcagaag tggccgcctc cctggtggtg 2640

ctgtggctgt tggggaatac tccactgcag gacaaaggca attctacaca cagcaggaat 2700

aattcctatg ccgtgattat caccagcaca tcctcttact acgtgttcta catctacgtg 2760

ggagtggcag atactctgct tgcaatgggc ttcttcaggg ggctgcccct ggtgcacaca 2820

ctgatcacag tgtccaagat cctccaccat aaaatgctcc acagcgtgct gcaggcaccc 2880

atgagcaccc tgaacacact gaaggccggc ggcatcctga atcgcttttc caaagacatc 2940

gccatcctcg acgatctcct gccactgacc atcttcgatt ttatccagct gctgctgatc 3000

gtgatcgggg ccatcgccgt ggtggccgtg ctgcagccat acattttcgt ggctacagtg 3060

cccgtgatcg ttgcctttat catgctgaga gcctacttcc tgcagacttc tcagcagctg 3120

aagcagctgg agagcgaagg gagaagcccc atcttcactc acctggtgac aagcctgaag 3180

ggactctgga ccctgagagc cttcggccgg cagccctatt tcgagaccct gtttcacaag 3240

gccctcaacc tgcacacagc caactggttc ctctacctgt ccaccctgag gtggttccag 3300

atgaggattg aaatgatctt cgtgattttt ttcatcgccg tgacattcat tagcattctg 3360

accaccggcg agggggaggg gagagtgggc atcatcctga cccttgccat gaacattatg 3420

agcacactgc agtgggccgt gaatagtagt atcgacgtgg acagtctgat gaggtccgtg 3480

agccgggtgt tcaagttcat tgacatgccc acagaaggga aacccaccaa aagcaccaag 3540

ccctacaaga acgggcagct gtccaaggtt atgatcatcg agaactctca cgtgaagaag 3600

gacgacattt ggcccagcgg cggccagatg acagtgaaag atctgaccgc caaatacacc 3660

gagggaggca acgccatcct cgaaaacatt agcttctcta tcagccctgg acagagggtg 3720

ggcctgctgg gccggacagg ctcagggaag agtactctgc tgtcagcatt cctgaggctc 3780

ctgaacacag agggcgagat ccagattgac ggcgtgtcct gggactccat caccctgcag 3840

cagtggcgga aggctttcgg ggtgatcccc cagaaggtgt tcatctttag cggcactttc 3900

agaaagaatc tggaccctta tgagcagtgg agtgaccagg agatctggaa agtggccgat 3960

gaggtcggac tgaggagcgt gatcgagcag tttccaggga agctggactt tgtgctggtg 4020

gatggcggat gcgtgctgtc tcacggccat aaacagctga tgtgtctggc ccggtccgtg 4080

ctgtctaagg ccaagatcct gctgctggac gaaccctccg cccacctgga ccccgtgaca 4140

taccagatca tcaggagaac tctcaagcag gccttcgccg actgtaccgt gattctgtgc 4200

gagcaccgca ttgaagctat gctggagtgt cagcagttcc tggtgatcga ggaaaataag 4260

gtgaggcagt acgacagcat ccagaagctg ctgaacgagc gctccctgtt ccgccaggct 4320

atctccccat cagaccgggt gaaactcttc ccccacagaa actcctcaaa gtgcaagtcc 4380

aagccccaga tcgccgccct gaaggaggag accgaggagg aggtgcagga caccaggctg 4440

tga 4443

<210> 12

<211> 140

<212> RNA

<213> Artificial Sequence

<220>

<223> Synthetic polynucleotide

<400> 12

ggacagaucg ccuggagacg ccauccacgc uguuuugacc uccauagaag acaccgggac 60

cgauccagcc uccgcggccg ggaacggugc auuggaacgc ggauuccccg ugccaagagu 120

gacucaccgu ccuugacacg 140

<210> 13

<211> 105

<212> RNA

<213> Artificial Sequence

<220>

<223> Synthetic polynucleotide

<400> 13

cggguggcau cccugugacc ccuccccagu gccucuccug gcccuggaag uugccacucc 60

agugcccacc agccuugucc uaauaaaauu aaguugcauc aagcu 105

<210> 14

<211> 105

<212> RNA

<213> Artificial Sequence

<220>

<223> Synthetic polynucleotide

<400> 14

ggguggcauc ccugugaccc cuccccagug ccucuccugg cccuggaagu ugccacucca 60

gugcccacca gccuuguccu aauaaaauua aguugcauca aagcu 105

<210> 15

<211> 4443

<212> DNA

<213> Artificial Sequence

<220>

<223> Synthetic polynucleotide

<400> 15

atgcaacgct ctcctcttga aaaggcctcg gtggtgtcca agctcttctt ctcgtggact 60

agacccatcc tgagaaaggg gtacagacag cgcttggagc tgtccgatat ctatcaaatc 120

ccttccgtgg actccgcgga caacctgtcc gagaagctcg agagagaatg ggacagagaa 180

ctcgcctcaa agaagaaccc gaagctgatt aatgcgctta ggcggtgctt tttctggcgg 240

ttcatgttct acggcatctt cctctacctg ggagaggtca ccaaggccgt gcagcccctg 300

ttgctgggac ggattattgc ctcctacgac cccgacaaca aggaagaaag aagcatcgct 360

atctacttgg gcatcggtct gtgcctgctt ttcatcgtcc ggaccctctt gttgcatcct 420

gctattttcg gcctgcatca cattggcatg cagatgagaa ttgccatgtt ttccctgatc 480

tacaagaaaa ctctgaagct ctcgagccgc gtgcttgaca agatttccat cggccagctc 540

gtgtccctgc tctccaacaa tctgaacaag ttcgacgagg gcctcgccct ggcccacttc 600

gtgtggatcg cccctctgca agtggcgctt ctgatgggcc tgatctggga gctgctgcaa 660

gcctcggcat tctgtgggct tggattcctg atcgtgctgg cactgttcca ggccggactg 720

gggcggatga tgatgaagta cagggaccag agagccggaa agatttccga acggctggtg 780

atcacttcgg aaatgatcga aaacatccag tcagtgaagg cctactgctg ggaagaggcc 840

atggaaaaga tgattgaaaa cctccggcaa accgagctga agctgacccg caaggccgct 900

tacgtgcgct atttcaactc gtccgctttc ttcttctccg ggttcttcgt ggtgtttctc 960

tccgtgctcc cctacgccct gattaaggga atcatcctca ggaagatctt caccaccatt 1020

tccttctgta tcgtgctccg catggccgtg acccggcagt tcccatgggc cgtgcagact 1080

tggtacgact ccctgggagc cattaacaag atccaggact tccttcaaaa gcaggagtac 1140

aagaccctcg agtacaacct gactactacc gaggtcgtga tggaaaacgt caccgccttt 1200

tgggaggagg gatttggcga actgttcgag aaggccaagc agaacaacaa caaccgcaag 1260

acctcgaacg gtgacgactc cctcttcttt tcaaacttca gcctgctcgg gacgcccgtg 1320

ctgaaggaca ttaacttcaa gatcgaaaga ggacagctcc tggcggtggc cggatcgacc 1380

ggagccggaa agacttccct gctgatggtg atcatgggag agcttgaacc tagcgaggga 1440

aagatcaagc actccggccg catcagcttc tgtagccagt tttcctggat catgcccgga 1500

accattaagg aaaacatcat cttcggcgtg tcctacgatg aataccgcta ccggtccgtg 1560

atcaaagcct gccagctgga agaggatatt tcaaagttcg cggagaaaga taacatcgtg 1620

ctgggcgaag ggggtattac cttgtcgggg ggccagcggg ctagaatctc gctggccaga 1680

gccgtgtata aggacgccga cctgtatctc ctggactccc ccttcggata cctggacgtc 1740

ctgaccgaaa aggagatctt cgaatcgtgc gtgtgcaagc tgatggctaa caagactcgc 1800

atcctcgtga cctccaaaat ggagcacctg aagaaggcag acaagattct gattctgcat 1860

gaggggtcct cctactttta cggcaccttc tcggagttgc agaacttgca gcccgacttc 1920

tcatcgaagc tgatgggttg cgacagcttc gaccagttct ccgccgaaag aaggaactcg 1980

atcctgacgg aaaccttgca ccgcttctct ttggaaggcg acgcccctgt gtcatggacc 2040

gagactaaga agcagagctt caagcagacc ggggaattcg gcgaaaagag gaagaacagc 2100

atcttgaacc ccattaactc catccgcaag ttctcaatcg tgcaaaagac gccactgcag 2160

atgaacggca ttgaggagga ctccgacgaa ccccttgaga ggcgcctgtc cctggtgccg 2220

gacagcgagc agggagaagc catcctgcct cggatttccg tgatctccac tggtccgacg 2280

ctccaagccc ggcggcggca gtccgtgctg aacctgatga cccacagcgt gaaccagggc 2340

caaaacattc accgcaagac taccgcatcc acccggaaag tgtccctggc acctcaagcg 2400

aatcttaccg agctcgacat ctactcccgg agactgtcgc aggaaaccgg gctcgaaatt 2460

tccgaagaaa tcaacgagga ggatctgaaa gagtgcttct tcgacgatat ggagtcgata 2520

cccgccgtga cgacttggaa cacttatctg cggtacatca ctgtgcacaa gtcattgatc 2580

ttcgtgctga tttggtgcct ggtgattttc ctggccgagg tcgcggcctc actggtggtg 2640

ctctggctgt tgggaaacac gcctctgcaa gacaagggaa actccacgca ctcgagaaac 2700

aacagctatg ccgtgattat cacttccacc tcctcttatt acgtgttcta catctacgtc 2760

ggagtggcgg ataccctgct cgcgatgggt ttcttcagag gactgccgct ggtccacacc 2820

ttgatcaccg tcagcaagat tcttcaccac aagatgttgc atagcgtgct gcaggccccc 2880

atgtccaccc tcaacactct gaaggccgga ggcattctga acagattctc caaggacatc 2940

gctatcctgg acgatctcct gccgcttacc atctttgact tcatccagct gctgctgatc 3000

gtgattggag caatcgcagt ggtggcggtg ctgcagcctt acattttcgt ggccactgtg 3060

ccggtcattg tggcgttcat catgctgcgg gcctacttcc tccaaaccag ccagcagctg 3120

aagcaactgg aatccgaggg acgatccccc atcttcactc accttgtgac gtcgttgaag 3180

ggactgtgga ccctccgggc tttcggacgg cagccctact tcgaaaccct cttccacaag 3240

gccctgaacc tccacaccgc caattggttc ctgtacctgt ccaccctgcg gtggttccag 3300

atgcgcatcg agatgatttt cgtcatcttc ttcatcgcgg tcacattcat cagcatcctg 3360

actaccggag agggagaggg acgggtcgga ataatcctga ccctcgccat gaacattatg 3420

agcaccctgc agtgggcagt gaacagctcg atcgacgtgg acagcctgat gcgaagcgtc 3480

agccgcgtgt tcaagttcat cgacatgcct actgagggaa aacccactaa gtccactaag 3540

ccctacaaaa atggccagct gagcaaggtc atgatcatcg aaaactccca cgtgaagaag 3600

gacgatattt ggccctccgg aggtcaaatg accgtgaagg acctgaccgc aaagtacacc 3660

gagggaggaa acgccattct cgaaaacatc agcttctcca tttcgccggg acagcgggtc 3720

ggccttctcg ggcggaccgg ttccgggaag tcaactctgc tgtcggcttt cctccggctg 3780

ctgaataccg agggggaaat ccaaattgac ggcgtgtctt gggattccat tactctgcag 3840

cagtggcgga aggccttcgg cgtgatcccc cagaaggtgt tcatcttctc gggtaccttc 3900

cggaagaacc tggatcctta cgagcagtgg agcgaccaag aaatctggaa ggtcgccgac 3960

gaggtcggcc tgcgctccgt gattgaacaa tttcctggaa agctggactt cgtgctcgtc 4020

gacgggggat gtgtcctgtc gcacggacat aagcagctca tgtgcctcgc acggtccgtg 4080

ctctccaagg ccaagattct gctgctggac gaaccttcgg cccacctgga tccggtcacc 4140

taccagatca tcaggaggac cctgaagcag gcctttgccg attgcaccgt gattctctgc 4200

gagcaccgca tcgaggccat gctggagtgc cagcagttcc tggtcatcga ggagaacaag 4260

gtccgccaat acgactccat tcaaaagctc ctcaacgagc ggtcgctgtt cagacaagct 4320

atttcaccgt ccgatagagt gaagctcttc ccgcatcgga acagctcaaa gtgcaaatcg 4380

aagccgcaga tcgcagcctt gaaggaagag actgaggaag aggtgcagga cacccggctt 4440

taa 4443

<210> 16

<211> 4443

<212> DNA

<213> Artificial Sequence

<220>

<223> Synthetic polynucleotide

<400> 16

atgcagcggt ccccgctcga aaaggccagt gtcgtgtcca aactcttctt ctcatggact 60

cggcctatcc ttagaaaggg gtatcggcag aggcttgagt tgtctgacat ctaccagatc 120

ccctcggtag attcggcgga taacctctcg gagaagctcg aacgggaatg ggaccgcgaa 180

ctcgcgtcta agaaaaaccc gaagctcatc aacgcactga gaaggtgctt cttctggcgg 240

ttcatgttct acggtatctt cttgtatctc ggggaggtca caaaagcagt ccaacccctg 300

ttgttgggtc gcattatcgc ctcgtacgac cccgataaca aagaagaacg gagcatcgcg 360

atctacctcg ggatcggact gtgtttgctt ttcatcgtca gaacactttt gttgcatcca 420

gcaatcttcg gcctccatca catcggtatg cagatgcgaa tcgctatgtt tagcttgatc 480

tacaaaaaga cactgaaact ctcgtcgcgg gtgttggata agatttccat cggtcagttg 540

gtgtccctgc ttagtaataa cctcaacaaa ttcgatgagg gactggcgct ggcacatttc 600

gtgtggattg ccccgttgca agtcgccctt ttgatgggcc ttatttggga actcttgcag 660

gcatctgcct tttgtggcct gggatttctg attgtgttgg cattgtttca ggctgggctt 720

gggcggatga tgatgaagta tcgcgaccag agagcgggta aaatctcgga aagactcgtc 780

atcacttcgg aaatgatcga aaacatccag tcggtcaaag cctattgctg ggaagaagct 840

atggagaaga tgattgaaaa cctccgccaa actgagctga aactgacccg caaggcggcg 900

tatgtccggt atttcaattc gtcagcgttc ttcttttccg ggttcttcgt tgtctttctc 960

tcggttttgc cttatgcctt gattaagggg attatcctcc gcaagatttt caccacgatt 1020

tcgttctgca ttgtattgcg catggcagtg acacggcaat ttccgtgggc cgtgcagaca 1080

tggtatgact cgcttggagc gatcaacaaa atccaagact tcttgcaaaa gcaagagtac 1140

aagaccctgg agtacaatct tactactacg gaggtagtaa tggagaatgt gacggctttt 1200

tgggaagagg gttttggaga gctcttcgag aaagcaaagc agaataacaa caaccgcaag 1260

acctcaaatg gggacgattc cctgtttttc tcgaacttct ccctgctcgg aacacccgtg 1320

ttgaaggaca tcaatttcaa gattgagagg ggacagcttc tcgcggtagc gggaagcact 1380

ggtgcgggaa aaactagcct cttgatggtg attatggggg agcttgagcc cagcgagggg 1440

aagattaaac actccgggcg tatctcattc tgtagccagt tttcatggat catgcccgga 1500

accattaaag agaacatcat tttcggagta tcctatgatg agtaccgata cagatcggtc 1560

attaaggcgt gccagttgga agaggacatt tctaagttcg ccgagaagga taacatcgtc 1620

ttgggagaag ggggtattac attgtcggga gggcagcgag cgcggatcag cctcgcgaga 1680

gcggtataca aagatgcaga tttgtacctg ctcgattcac cgtttggata cctcgacgta 1740

ttgacagaaa aagaaatctt cgagtcgtgc gtgtgtaaac ttatggctaa taagacgaga 1800

atcctggtga catcaaaaat ggaacacctt aagaaggcgg acaagatcct gatcctccac 1860

gaaggatcgt cctactttta cggcactttc tcagagttgc aaaacttgca gccggacttc 1920

tcaagcaaac tcatggggtg tgactcattc gaccagttca gcgcggaacg gcggaactcg 1980

atcttgacgg aaacgctgca ccgattctcg cttgagggtg atgccccggt atcgtggacc 2040

gagacaaaga agcagtcgtt taagcagaca ggagaatttg gtgagaaaag aaagaacagt 2100

atcttgaatc ctattaactc aattcgcaag ttctcaatcg tccagaaaac tccactgcag 2160

atgaatggaa ttgaagagga ttcggacgaa cccctggagc gcaggcttag cctcgtgccg 2220

gattcagagc aaggggaggc cattcttccc cggatttcgg tgatttcaac cggacctaca 2280

cttcaggcga ggcgaaggca atccgtgctc aacctcatga cgcattcggt aaaccagggg 2340

caaaacattc accgcaaaac gacggcctca acgagaaaag tgtcacttgc accccaggcg 2400

aatttgactg aactcgacat ctacagccgt aggctttcgc aagaaaccgg acttgagatc 2460

agcgaagaaa tcaatgaaga agatttgaaa gagtgtttct ttgatgacat ggaatcaatc 2520

ccagcggtga caacgtggaa cacatacttg cgttacatca cggtgcacaa gtccttgatt 2580

ttcgtcctca tttggtgcct cgtgatcttt ctcgctgagg tcgcagcgtc acttgtggtc 2640

ctctggctgc ttggtaatac gcccttgcaa gacaaaggca attctacaca ctcaagaaac 2700

aattcctatg ccgtgattat cacttctaca agctcgtatt acgtgtttta catctacgta 2760

ggagtggccg acactctgct cgcgatgggt ttcttccgag gactcccact cgttcacacg 2820

cttatcactg tctccaagat tctccaccat aagatgcttc atagcgtact gcaggctccc 2880

atgtccacct tgaatacgct caaggcggga ggtattttga atcgcttctc aaaagatatt 2940

gcaattttgg atgaccttct gcccctgacg atcttcgact tcatccagtt gttgctgatc 3000

gtgattgggg ctattgcagt agtcgctgtc ctccagcctt acatttttgt cgcgaccgtt 3060

ccggtgatcg tggcgtttat catgctgcgg gcctatttct tgcagacgtc acagcagctt 3120

aagcaactgg agtctgaagg gaggtcgcct atctttacgc atcttgtgac cagtttgaag 3180

ggattgtgga cgttgcgcgc ctttggcagg cagccctact ttgaaacact gttccacaaa 3240

gcgctgaatc tccatacggc aaattggttt ttgtatttga gtaccctccg atggtttcag 3300

atgcgcattg agatgatttt tgtgatcttc tttatcgcgg tgacttttat ctccatcttg 3360

accacgggag agggcgaggg acgggtcggt attatcctga cactcgccat gaacattatg 3420

agcactttgc agtgggcagt gaacagctcg attgatgtgg atagcctgat gaggtccgtt 3480

tcgagggtct ttaagttcat cgacatgccg acggagggaa agcccacaaa aagtacgaaa 3540

ccctataaga atgggcaatt gagtaaggta atgatcatcg agaacagtca cgtgaagaag 3600

gatgacatct ggcctagcgg gggtcagatg accgtgaagg acctgacggc aaaatacacc 3660

gagggaggga acgcaatcct tgaaaacatc tcgttcagca ttagccccgg tcagcgtgtg 3720

gggttgctcg ggaggaccgg gtcaggaaaa tcgacgttgc tgtcggcctt cttgagactt 3780

ctgaatacag agggtgagat ccagatcgac ggcgtttcgt gggatagcat caccttgcag 3840

cagtggcgga aagcgtttgg agtaatcccc caaaaggtct ttatctttag cggaaccttc 3900

cgaaagaatc tcgatcctta tgaacagtgg tcagatcaag agatttggaa agtcgcggac 3960

gaggttggcc ttcggagtgt aatcgagcag tttccgggaa aactcgactt tgtccttgta 4020

gatgggggat gcgtcctgtc gcatgggcac aagcagctca tgtgcctggc gcgatccgtc 4080

ctctctaaag cgaaaattct tctcttggat gaaccttcgg cccatctgga cccggtaacg 4140

tatcagatca tcagaaggac acttaagcag gcgtttgccg actgcacggt gattctctgt 4200

gagcatcgta tcgaggccat gctcgaatgc cagcaatttc ttgtcatcga agagaataag 4260

gtccgccagt acgactccat ccagaagctg cttaatgaga gatcattgtt ccggcaggcg 4320

atttcaccat ccgatagggt gaaacttttt ccacacagaa attcgtcgaa gtgcaagtcc 4380

aaaccgcaga tcgcggcctt gaaagaagag actgaagaag aagttcaaga cacgcgtctt 4440

taa 4443

<210> 17

<211> 874

<212> PRT

<213> Artificial Sequence

<220>

<223> Synthetic polypeptide

<400> 17

Met Gln Asp Leu His Ala Ile Gln Leu Gln Leu Glu Glu Glu Met Phe

1 5 10 15

Asn Gly Gly Ile Arg Arg Phe Glu Ala Asp Gln Gln Arg Gln Ile Ala

20 25 30

Ala Gly Ser Glu Ser Asp Thr Ala Trp Asn Arg Arg Leu Leu Ser Glu

35 40 45

Leu Ile Ala Pro Met Ala Glu Gly Ile Gln Ala Tyr Lys Glu Glu Tyr

50 55 60

Glu Gly Lys Lys Gly Arg Ala Pro Arg Ala Leu Ala Phe Leu Gln Cys

65 70 75 80

Val Glu Asn Glu Val Ala Ala Tyr Ile Thr Met Lys Val Val Met Asp

85 90 95

Met Leu Asn Thr Asp Ala Thr Leu Gln Ala Ile Ala Met Ser Val Ala

100 105 110

Glu Arg Ile Glu Asp Gln Val Arg Phe Ser Lys Leu Glu Gly His Ala

115 120 125

Ala Lys Tyr Phe Glu Lys Val Lys Lys Ser Leu Lys Ala Ser Arg Thr

130 135 140

Lys Ser Tyr Arg His Ala His Asn Val Ala Val Val Ala Glu Lys Ser

145 150 155 160

Val Ala Glu Lys Asp Ala Asp Phe Asp Arg Trp Glu Ala Trp Pro Lys

165 170 175

Glu Thr Gln Leu Gln Ile Gly Thr Thr Leu Leu Glu Ile Leu Glu Gly

180 185 190

Ser Val Phe Tyr Asn Gly Glu Pro Val Phe Met Arg Ala Met Arg Thr

195 200 205

Tyr Gly Gly Lys Thr Ile Tyr Tyr Leu Gln Thr Ser Glu Ser Val Gly

210 215 220

Gln Trp Ile Ser Ala Phe Lys Glu His Val Ala Gln Leu Ser Pro Ala

225 230 235 240

Tyr Ala Pro Cys Val Ile Pro Pro Arg Pro Trp Arg Thr Pro Phe Asn

245 250 255

Gly Gly Phe His Thr Glu Lys Val Ala Ser Arg Ile Arg Leu Val Lys

260 265 270

Gly Asn Arg Glu His Val Arg Lys Leu Thr Gln Lys Gln Met Pro Lys

275 280 285

Val Tyr Lys Ala Ile Asn Ala Leu Gln Asn Thr Gln Trp Gln Ile Asn

290 295 300

Lys Asp Val Leu Ala Val Ile Glu Glu Val Ile Arg Leu Asp Leu Gly

305 310 315 320

Tyr Gly Val Pro Ser Phe Lys Pro Leu Ile Asp Lys Glu Asn Lys Pro

325 330 335

Ala Asn Pro Val Pro Val Glu Phe Gln His Leu Arg Gly Arg Glu Leu

340 345 350

Lys Glu Met Leu Ser Pro Glu Gln Trp Gln Gln Phe Ile Asn Trp Lys

355 360 365

Gly Glu Cys Ala Arg Leu Tyr Thr Ala Glu Thr Lys Arg Gly Ser Lys

370 375 380

Ser Ala Ala Val Val Arg Met Val Gly Gln Ala Arg Lys Tyr Ser Ala

385 390 395 400

Phe Glu Ser Ile Tyr Phe Val Tyr Ala Met Asp Ser Arg Ser Arg Val

405 410 415

Tyr Val Gln Ser Ser Thr Leu Ser Pro Gln Ser Asn Asp Leu Gly Lys

420 425 430

Ala Leu Leu Arg Phe Thr Glu Gly Arg Pro Val Asn Gly Val Glu Ala

435 440 445

Leu Lys Trp Phe Cys Ile Asn Gly Ala Asn Leu Trp Gly Trp Asp Lys

450 455 460

Lys Thr Phe Asp Val Arg Val Ser Asn Val Leu Asp Glu Glu Phe Gln

465 470 475 480

Asp Met Cys Arg Asp Ile Ala Ala Asp Pro Leu Thr Phe Thr Gln Trp

485 490 495

Ala Lys Ala Asp Ala Pro Tyr Glu Phe Leu Ala Trp Cys Phe Glu Tyr

500 505 510

Ala Gln Tyr Leu Asp Leu Val Asp Glu Gly Arg Ala Asp Glu Phe Arg

515 520 525

Thr His Leu Pro Val His Gln Asp Gly Ser Cys Ser Gly Ile Gln His

530 535 540

Tyr Ser Ala Met Leu Arg Asp Glu Val Gly Ala Lys Ala Val Asn Leu

545 550 555 560

Lys Pro Ser Asp Ala Pro Gln Asp Ile Tyr Gly Ala Val Ala Gln Val

565 570 575

Val Ile Lys Lys Asn Ala Leu Tyr Met Asp Ala Asp Asp Ala Thr Thr

580 585 590

Phe Thr Ser Gly Ser Val Thr Leu Ser Gly Thr Glu Leu Arg Ala Met

595 600 605

Ala Ser Ala Trp Asp Ser Ile Gly Ile Thr Arg Ser Leu Thr Lys Lys

610 615 620

Pro Val Met Thr Leu Pro Tyr Gly Ser Thr Arg Leu Thr Cys Arg Glu

625 630 635 640

Ser Val Ile Asp Tyr Ile Val Asp Leu Glu Glu Lys Glu Ala Gln Lys

645 650 655

Ala Val Ala Glu Gly Arg Thr Ala Asn Lys Val His Pro Phe Glu Asp

660 665 670

Asp Arg Gln Asp Tyr Leu Thr Pro Gly Ala Ala Tyr Asn Tyr Met Thr

675 680 685

Ala Leu Ile Trp Pro Ser Ile Ser Glu Val Val Lys Ala Pro Ile Val

690 695 700

Ala Met Lys Met Ile Arg Gln Leu Ala Arg Phe Ala Ala Lys Arg Asn

705 710 715 720

Glu Gly Leu Met Tyr Thr Leu Pro Thr Gly Phe Ile Leu Glu Gln Lys

725 730 735

Ile Met Ala Thr Glu Met Leu Arg Val Arg Thr Cys Leu Met Gly Asp

740 745 750

Ile Lys Met Ser Leu Gln Val Glu Thr Asp Ile Val Asp Glu Ala Ala

755 760 765

Met Met Gly Ala Ala Ala Pro Asn Phe Val His Gly His Asp Ala Ser

770 775 780

His Leu Ile Leu Thr Val Cys Glu Leu Val Asp Lys Gly Val Thr Ser

785 790 795 800

Ile Ala Val Ile His Asp Ser Phe Gly Thr His Ala Asp Asn Thr Leu

805 810 815

Thr Leu Arg Val Ala Leu Lys Gly Gln Met Val Ala Met Tyr Ile Asp

820 825 830

Gly Asn Ala Leu Gln Lys Leu Leu Glu Glu His Glu Val Arg Trp Met

835 840 845

Val Asp Thr Gly Ile Glu Val Pro Glu Gln Gly Glu Phe Asp Leu Asn

850 855 860

Glu Ile Met Asp Ser Glu Tyr Val Phe Ala

865 870

Claims

1. A composition for pulmonary delivery, the composition comprising mRNA encapsulated in a lipid nanoparticle, wherein the lipid nanoparticle comprises a cationic lipid selected from the group consisting of: GL-TES-SA-DMP-E18-2, GL-TES-SA-DME-E18-2, TL1-01D-DMA, TL1-04D-DMA, SY-3-E14-DMAPR, TL1-10D-DMA, HEP-E3-E10, HEP-E4-E10 and Guan-SS-Chol.

2. The composition of claim 1, wherein the mRNA is codon optimized.

3. The composition of claim 1 or 2, wherein the lipid nanoparticle further comprises one or more non-cationic lipids and one or more PEG-modified lipids.

4. The composition of any one of the preceding claims, wherein the lipid nanoparticle further comprises one or more cholesterol-based lipids.

5. The composition of claim 3 or 4, wherein the non-cationic lipid is DOPE or DEPE.

6. The composition of any one of the preceding claims, wherein the liposomes have a size of less than about 100 nm.

7. The composition of any one of the preceding claims, wherein the liposome comprises no more than three different lipid components.

8. The composition of claim 7, wherein the three different lipid components are the cationic lipid, a non-cationic lipid, and a PEG-modified lipid.

9. The composition of claim 8, wherein the non-cationic lipid is DOPE or DEPE.

10. The composition of claim 8 or 9, wherein the PEG-modified lipid is DMG-PEG2K.

11. The composition of any one of claims 8-10, wherein the three different lipid components are Guan-SS-Chol, DOPE, and DMG-PEG2K.

12. The composition of claim 11, wherein the Guan-SS-Chol, DOPE, and DMG-PEG2K are present in a molar ratio of about 60:35:5, respectively.

13. The composition of any one of claims 1-6, wherein the liposome comprises four different lipid components.

14. The composition of claim 13, wherein the four different lipid components are the cationic lipid, non-cationic lipid, cholesterol, and PEG-modified lipid.

15. The composition of claim 11 or 12, wherein the non-cationic lipid is DOPE or DEPE.

16. The composition of claim 15, wherein the non-cationic lipid is DOPE.

17. The composition of any one of claims 11-16, wherein the PEG-modified lipid is DMG-PEG2K.

18. The composition of any one of claims 13-15, wherein the molar ratio of cationic lipid to non-cationic lipid to cholesterol to PEG-modified lipid is between about 30-60:25-35:20-30:1-15, respectively.

19. The composition of any one of the preceding claims, wherein the mRNA encodes a cystic fibrosis transmembrane conductance regulator, an ATP-binding cassette sub-family A member 3 protein, a dynein axonemal intermediate chain 1 (DNAI1) protein, a dynein axonemal heavy chain 5 (DNAH5) protein, an alpha-1-antitrypsin protein, a forkhead box P3 (FOXP3) protein, or one or more surfactant proteins.

20. A method of inducing protein expression in epithelial cells in the lung of a mammal, the method comprising contacting epithelial cells in the lung of the mammal with the composition of any one of claims 1-19.

21. The method of claim 20, wherein the composition is administered by nebulization.