CN117915957A - Compositions and methods for modulating mRNA splicing - Google Patents

Compositions and methods for modulating mRNA splicing

Publication number
CN117915957A
Authority
CN
China
Prior art keywords
compound
side chain
amino acid
ccpp
peg
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202280041112.4A
Other languages
Chinese (zh)
Inventor
Ziqing Qian (钱自清)
Natarajan Sethuraman (纳塔拉詹·塞瑟拉曼)
Xiulong Shen (沈秀龙)
Haoming Liu (刘皓明)
Xiang Li (李翔)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Entrada Therapeutics, Inc.
Original Assignee
Entrada Therapeutics, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from PCT/US2022/028357 (WO2022240760A2)

Abstract

The present invention provides compounds comprising at least one cyclic cell penetrating peptide (cCPP) conjugated to an Antisense Compound (AC). The AC regulates splicing of RNA transcripts. For example, the AC induces exon skipping. Exon skipping can lead to down-regulation of protein expression or activity. Exon skipping can cause frame shifting of the resulting mRNA. The frame shift may result in a premature stop codon. This frameshift can lead to nonsense-mediated decay.

Description

Compositions and methods for modulating mRNA splicing
Cross Reference to Related Applications
This application claims the benefit of the following provisional applications: U.S. Provisional Application Ser. No. 63/186,664, filed May 10, 2021; U.S. Provisional Application Ser. No. 63/210,882, filed June 15, 2021; U.S. Provisional Application Ser. No. 63/321,921, filed March 21, 2022; U.S. Provisional Application Ser. No. 63/362,295, filed on 3/2022; U.S. Provisional Application Ser. No. 63/239,671, filed September 1, 2021; U.S. Provisional Application Ser. No. 63/210,866, filed June 15, 2021; U.S. Provisional Application Ser. No. 63/298,587, filed January 11, 2022; and U.S. Provisional Application Ser. No. 63/318,201, filed March 9, 2022, each of which is incorporated herein by reference in its entirety to the extent it does not conflict with the disclosure provided herein.
Technical Field
Provided herein are compositions and methods for modulating mRNA splicing. In particular, compositions and methods are provided for modulating expression or activity of a protein of interest by inducing exon skipping, for example, to introduce frame shifts in RNA transcripts that can result in nonsense-mediated decay of RNA transcripts.
Background
A gene is a deoxyribonucleic acid (DNA) sequence that encodes a functional gene product, such as a protein. The process of converting the genetic code into a functional gene product includes transcribing RNA (a transcript) from genomic DNA and translating the RNA into protein. RNA is first transcribed from DNA as immature "pre-mRNA," which undergoes processing to become mature messenger RNA (mRNA) that can be translated into protein. In eukaryotes, the processing steps include adding a modified guanine (G) nucleotide cap to the 5' end of the RNA; adding a polyadenylation sequence (poly(A) tail) to the 3' end of the RNA; and RNA splicing.
Splicing refers to the process of removing introns (intervening sequences) from a pre-mRNA and joining exons (coding sequences) together to form a mature mRNA.
Many mammalian genes are alternatively spliced, wherein different exons in the pre-mRNA sequence are included in or excluded from the mature mRNA transcript, such that one gene can generate different mRNA messages that are translated into proteins (isoforms) of different size and/or function.
Alternative splicing may involve cryptic splice sites within the exon and/or intron regions of the transcript. Cryptic splice sites are splice sites that are not normally used, but can be used when the usual splice site is blocked or unavailable or when a mutation causes a normally inactive site to become an active splice site. In cryptic splicing, the splicing machinery recognizes cryptic splice sites rather than canonical splice sites. Typically, cryptic splicing results in the inclusion or exclusion of part or all of an intron or exon sequence in the mRNA.
Antisense modulation of pre-mRNA splicing has been used to restore cryptic splicing, to alter the levels of alternatively spliced genes (isoform switching), and for exon skipping, e.g., to restore a disrupted reading frame or to knock down the function of unwanted genes (Aartsma-Rus and van Ommen, RNA (2007), 13:1609-1624).
The main problems with the use of antisense compounds as therapeutic agents include their limited ability to gain access to intracellular compartments when administered systemically, their limited ability to achieve broad or specifically targeted tissue distribution, and the challenge of achieving sufficient specificity for the targeted RNA to minimize off-target effects. Intracellular delivery of antisense compounds can be facilitated by the use of carrier systems (such as polymers or cationic liposomes) or by chemical modification of the construct (e.g., by covalent attachment of cholesterol molecules). However, intracellular delivery remains inefficient and tissue distribution may be narrow. In addition, existing approaches are still hampered by off-target interactions. Thus, there remains a need for improved delivery systems to increase the effectiveness of these antisense approaches, and there remains an unmet need for effective compositions that broadly deliver antisense compounds to the intracellular compartments of all affected tissue types to specifically target a given gene product, thereby treating diseases caused by, for example, aberrant gene transcription, splicing, and/or translation.
Disclosure of Invention
The present disclosure relates generally to compounds, compositions, and methods for modulating splicing of genes, such as target transcripts (e.g., pre-mRNAs) of genes associated with disease. In embodiments, the present disclosure relates to compounds and compositions comprising a Therapeutic Moiety (TM) and a Cell Penetrating Peptide (CPP). TM may be an Antisense Compound (AC) that binds to a target transcript to modulate splicing of the target transcript. In embodiments, AC binds to at least a portion of, or is in proximity to, a Splice Element (SE) or cis-acting Splice Regulatory Element (SRE) of the target transcript to regulate splicing of the target transcript. In embodiments, binding of AC to a target transcript results in down-regulation of expression or activity of a protein expressed by the target transcript.
In embodiments, binding of AC to the target transcript results in skipping of the exon. In embodiments, the skipping of the exon results in a frame shift. In embodiments, the frameshift results in a premature stop codon. In embodiments, the frameshift results in nonsense-mediated decay. In embodiments, the frameshift results in premature stop codons and nonsense-mediated decay.
Described herein are methods wherein the compounds or compositions described herein are used to treat a disease. In embodiments, the disease is a genetic disease. In embodiments, the compounds or compositions are used to treat a genetic disorder by modulating splicing of a gene associated with the disorder. In embodiments, the compounds or compositions treat a genetic disorder by modulating splicing of a gene transcript associated with the disorder. In embodiments, the methods comprise administering a compound or composition described herein to a subject in need thereof. In embodiments, the subject in need thereof is a patient suffering from or at risk of suffering from a genetic disease. In embodiments, the methods comprise administering to a subject in need thereof a therapeutically effective amount of a compound or composition described herein. In embodiments, the genetic disease is a disease associated with aberrant expression of IRF-5, DUX4 or GYS1 or a genetic variant thereof.
CPPs can enhance intracellular delivery of AC to enhance the effectiveness of AC to regulate splicing of target transcripts. The CPP may be a cyclic CPP (cCPP).
The compounds described herein may comprise an Endosomal Escape Vector (EEV) configured to allow a compound or portion thereof internalized into a cell in the endosome to escape the endosome and enter the cytosol or cellular compartment to allow AC to act on the target transcript and modulate splicing. In embodiments, the EEV comprises a CPP, such as cCPP.
In embodiments, the cCPP has the following formula (A):
or a protonated form thereof, wherein:
R1, R2, and R3 are each independently H or an aromatic or heteroaromatic side chain of an amino acid;
at least one of R1, R2, and R3 is an aromatic or heteroaromatic side chain of an amino acid;
R4, R5, R6, and R7 are each independently H or an amino acid side chain;
at least one of R4, R5, R6, and R7 is a side chain of 3-guanidino-2-aminopropionic acid, 4-guanidino-2-aminobutyric acid, arginine, homoarginine, N-methylarginine, N,N-dimethylarginine, 2,3-diaminopropionic acid, 2,4-diaminobutyric acid, lysine, N-methyllysine, N,N-dimethyllysine, N-ethyllysine, N,N,N-trimethyllysine, 4-guanidinophenylalanine, citrulline, N,N-dimethyllysine, β-homoarginine, or 3-(1-piperidinyl)alanine;
AASC is an amino acid side chain; and
q is 1, 2, 3, or 4.
In embodiments, the cCPP of formula (A) has the following formula (I):
or a protonated form or salt thereof,
wherein each m is independently an integer from 0 to 3.
In embodiments, the cCPP of formula (A) has the following formula (I-1):
or a protonated form or salt thereof.
In embodiments, the cCPP of formula (A) has the following formula (I-2):
or a protonated form or salt thereof.
In embodiments, the cCPP of formula (A) has the following formula (I-3):
or a protonated form or salt thereof.
In embodiments, the cCPP of formula (A) has the following formula (I-4):
or a protonated form or salt thereof.
In embodiments, the cCPP of formula (A) has the following formula (I-5):
or a protonated form or salt thereof.
In embodiments, the cCPP of formula (A) has the following formula (I-6):
or a protonated form or salt thereof.
In embodiments, the cCPP has the following formula (II):
wherein:
AASC is an amino acid side chain;
R1a, R1b, and R1c are each independently a 6- to 14-membered aryl or a 6- to 14-membered heteroaryl;
R2a, R2b, R2c, and R2d are each independently an amino acid side chain;
at least one of R2a, R2b, R2c, and R2d is [structure not reproduced] or a protonated form or salt thereof;
at least one of R2a, R2b, R2c, and R2d is guanidine or a protonated form or salt thereof;
each n" is independently an integer from 0 to 5;
each n' is independently an integer from 0 to 3; and
if n' is 0, then R2a, R2b, R2c, or R2d is absent.
In embodiments, the cCPP of formula (II) has the following formula (II-1):
In embodiments, the cCPP of formula (II) has the following formula (IIa):
In embodiments, the cCPP of formula (II) has the following formula (IIb):
In embodiments, the cCPP of formula (II) has the following formula (IIc):
or a protonated form or salt thereof.
In embodiments, the cCPP has the following structure:
or a protonated form or salt thereof, wherein at least one atom of the amino acid side chain is replaced by the therapeutic moiety or linker, or at least one lone pair forms a bond with the therapeutic moiety or the linker.
In embodiments, the cCPP has the following structure:
or a protonated form or salt thereof, wherein at least one atom of the amino acid side chain is replaced by the therapeutic moiety or linker, or at least one lone pair forms a bond with the therapeutic moiety or the linker.
In embodiments, the compound comprises an Exocyclic Peptide (EP). In embodiments, the EP comprises one of the following sequences: KK, KR, RR, HH, HK, HR, RH, KKK, KGK, KBK, KBR, KRK, KRR, RKK, RRR, KKH, KHK, HKK, HRR, HRH, HHR, HBH, HHH, HHHH, KHKK, KKHK, KKKH, KHKH, HKHK, KKKK, KKRK, KRKK, KRRK, RKKR, RRRR, KGKK, KKGK, HBHBH, HBKBH, RRRRR, KKKKK, KKKRK, RKKKK, KRKKK, KKRKK, KKKKR, KBKBK, RKKKKG, KRKKKG, KKRKKG, KKKKRG, RKKKKB, KRKKKB, KKRKKB, KKKKRB, KKKRKV, RRRRRR, HHHHHH, RHRHRH, HRHRHR, KRKRKR, RKRKRK, RBRBRB, KBKBKB, PKKKRKV, PGKKRKV, PKGKRKV, PKKGRKV, PKKKGKV, PKKKRGV, or PKKKRKG, wherein B is β-alanine.
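The one-letter notation used for the EP sequences can be made explicit with a short sketch. The Python below is illustrative only and is not part of the disclosure: the names RESIDUE_NAMES, expand_ep, and basic_residue_count are hypothetical, the standard one-letter amino acid codes are assumed for K, R, H, G, P, and V, and B is read as β-alanine as stated above.

```python
from typing import List

# Hypothetical lookup table: standard one-letter codes plus B = beta-alanine,
# which is the convention stated in the text for the EP sequences.
RESIDUE_NAMES = {
    "K": "lysine",
    "R": "arginine",
    "H": "histidine",
    "G": "glycine",
    "P": "proline",
    "V": "valine",
    "B": "beta-alanine",
}


def expand_ep(sequence: str) -> List[str]:
    """Return the residue names of an EP sequence; reject unknown codes."""
    unknown = sorted(set(sequence) - set(RESIDUE_NAMES))
    if unknown:
        raise ValueError(f"unexpected residue code(s): {unknown}")
    return [RESIDUE_NAMES[code] for code in sequence]


def basic_residue_count(sequence: str) -> int:
    """Count lysine, arginine, and histidine residues in an EP sequence."""
    return sum(sequence.count(code) for code in "KRH")


# Two sequences taken from the list above.
kbk_residues = expand_ep("KBK")
pkkkrkv_basic = basic_residue_count("PKKKRKV")
```

Here expand_ep("KBK") reads the middle residue as β-alanine rather than as an ambiguous Asx code, which is the only non-standard convention in the list.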
In embodiments, the compound has the following formula (C):
or a protonated form or salt thereof, wherein:
R1, R2, and R3 are each independently H or a side chain comprising an aryl or heteroaryl group, wherein at least one of R1, R2, and R3 is a side chain comprising an aryl or heteroaryl group;
R4 and R7 are independently H or an amino acid side chain;
EP is an exocyclic peptide;
each m is independently an integer from 0 to 3;
n is an integer from 0 to 2;
x' is an integer from 1 to 23;
y is an integer from 1 to 5;
q is an integer from 1 to 4;
z' is an integer from 1 to 23; and
the cargo is an AC.
In embodiments, the compound comprises a structure of formula (C-1), (C-2), (C-3), or (C-4):
or a protonated form or salt thereof, wherein EP is an exocyclic peptide and the oligonucleotide is an AC.
Drawings
FIGS. 1A-1B are schematic diagrams showing splice regulatory elements, including splice site (A) and general splice reactions (two transesterification reactions) (B).
FIG. 2 is a schematic diagram showing antisense compound-mediated exon skipping to produce premature stop codons that ultimately lead to nonsense-mediated decay of a target transcript.
FIG. 3 shows modified nucleotides used in the antisense oligonucleotides described herein. Structures 1-3 (1 = phosphorothioate; 2 = (SC5'-Rp)-α,β-CNA; 3 = PMO) are phosphate backbone modifications; 4 (2-thio-dT) is a base modification; 5-8 (5 = 2'-OMe-RNA; 6 = 2'-O-MOE-RNA; 7 = 2'-F-RNA; 8 = 2'-F-ANA) are 2' sugar modifications; 9-11 (9 = LNA; 10 = (S)-cEt; 11 = tcDNA) are constrained nucleotides; 12-14 (12 = FHNA; 13 = (S)-5'-C-methyl; 14 = UNA) are further sugar modifications; 15-18 (15 = E-VP; 16 = methyl phosphonate; 17 = 5' phosphorothioate; 18 = (S)-5'-C-methyl with phosphate) are 5'-phosphate-stabilizing modifications; and 19 is a morpholino sugar. From Khvorova, A. et al., Nat. Biotechnol. (2017) Mar; 35(3):238-248.
FIGS. 4A-4D provide structures of adenine (A), cytosine (B), guanine (C) and thymine (D) morpholino subunit monomers for the synthesis of phosphorodiamidate-linked morpholino oligomers (PMOs).
Fig. 5A-5D illustrate conjugation chemistry for linking an Antisense Compound (AC) to a peptide such as a cyclic cell penetrating peptide (cCPP). FIG. 5A shows reagents for an amide bond formation reaction between a peptide with an N-hydroxysuccinimide activated ester (upper panel) or a peptide with a free carboxylic acid (lower panel) and a primary amine at the 5' end of AC. FIG. 5B shows reagents for an amide bond formation reaction of a primary or secondary amine at the 3' end of AC with a peptide having a Tetrafluorophenyl (TFP) activated ester. Figure 5C shows reagents for conjugation of peptide-azide via copper-free azide-alkyne cycloaddition to 5' cyclooctyne modified AC. FIG. 5D shows another exemplary reagent for conjugation between a 3 'modified cyclooctyne AC or a 3' modified azide AC and a peptide containing a linker-azide or linker-alkyne/cyclooctyne moiety such as cCPP via copper-free azide-alkyne cycloaddition or copper-catalyzed azide-alkyne cycloaddition (click reaction), respectively.
FIG. 6 shows the conjugation chemistry of linking AC and CPP with additional linker forms containing polyethylene glycol (PEG) moieties using the conjugation chemistry shown in FIG. 5. Purification methods are indicated.
FIGS. 7A-7D show the levels of GYS1 protein (A and C) and GYS1 mRNA (B and D) in the diaphragm (A and B) and heart (C and D) of untreated mice, PMO-treated mice, and mice treated with various concentrations of EEV-PMO in a GAA knockout mouse model. (P > 0.05 = NS; P ≤ 0.05 = *; P ≤ 0.01 = **; P ≤ 0.001 = ***)
FIGS. 8A-8D are graphs showing GYS1 mRNA levels in heart (A), diaphragm (B), quadriceps (C) and triceps (D) at various time points after treatment in untreated mice, mice treated with PMO and mice treated with EEV-PMO. (P > 0.05 = NS; P ≤ 0.05 = *; P ≤ 0.01 = **; P ≤ 0.001 = ***)
FIGS. 9A-9D are graphs showing GYS1 protein levels in heart (A), diaphragm (B), quadriceps (C) and triceps (D) at various time points after treatment in untreated mice, mice treated with PMO and mice treated with EEV-PMO. (P > 0.05 = NS; P ≤ 0.05 = *; P ≤ 0.01 = **; P ≤ 0.001 = ***)
FIGS. 10A-10C are graphs showing IRF5 mRNA expression levels in liver (A), small intestine (B) and tibialis anterior (C) of mice treated with various concentrations of EEV-PMO. (P > 0.05 = NS; P ≤ 0.05 = *; P ≤ 0.01 = **; P ≤ 0.001 = ***) mpk = mg/kg.
FIGS. 11A-11B are graphs showing IRF5 protein expression levels in in vitro experiments in which mouse macrophages were treated with various concentrations of EEV#1-PMO, EEV#2-PMO, EEV#3-PMO, and EEV#4-PMO. (P > 0.05 = NS; P ≤ 0.05 = *; P ≤ 0.01 = **; P ≤ 0.001 = ***)
FIG. 12 is a graph showing knockdown of GYS1 mRNA levels in the wild-type mouse myoblast cell line C2C12 after treatment with various concentrations of PMO 220 or EEV-PMO 220-814. N = 3; *p < 0.05, **p < 0.01 relative to 0 (untreated) by Student's t-test.
FIGS. 13A-13B are graphs showing knockdown of GYS1 mRNA levels in mouse myoblasts (A) and mouse fibroblasts (B) after treatment with various concentrations of PMO 220. N = 2; *p < 0.05 relative to NT (untreated) by Student's t-test.
FIGS. 14A-14D are graphs showing GYS1 mRNA levels in heart (A), diaphragm (B), triceps (C) and quadriceps (D) after treatment of GAA knockout mice with PMO 220 or various concentrations of PMO-EEV 220-814. mpk = mg/kg.
FIG. 15 is a graph showing GYS2 mRNA levels in liver after treatment of GAA knockout mice with PMO 220 or various concentrations of PMO-EEV 220-814. mpk = mg/kg.
FIGS. 16A-16D are graphs showing GYS1 mRNA levels in heart (A), diaphragm (B), triceps (C) and quadriceps (D) after treatment of GAA knockout mice with PMO 220 or various concentrations of PMO-EEV 220-1055.
FIGS. 17A-17D are graphs showing GYS1 protein levels in heart (A), diaphragm (B), triceps (C) and quadriceps (D) at various time points after treatment of GAA knockout mice with 20 mpk PMO-EEV 220-1055. mpk = mg/kg.
FIGS. 18A-18D are graphs showing drug exposure levels in heart (A), diaphragm (B), triceps (C) and quadriceps (D) at various time points after treatment of GAA knockout mice with either 20 mpk PMO 220 or 20 mpk PMO-EEV 220-1055. mpk = mg/kg.
FIGS. 19A-19D are graphs showing GYS1 mRNA levels in heart (A), diaphragm (B), triceps (C) and quadriceps (D) of wild-type mice, GAA knockout mice and GAA knockout mice treated with EEV-PMO 220-1120 at various concentrations. mpk = mg/kg.
FIGS. 20A-20D are graphs showing GYS1 protein levels in heart (A), diaphragm (B), triceps (C) and quadriceps (D) of wild-type mice, GAA knockout mice and GAA knockout mice treated with EEV-PMO 220-1120 at various concentrations. mpk = mg/kg.
FIGS. 21A-21D are graphs showing GYS1 protein levels in heart (A), diaphragm (B) and quadriceps (C) of wild-type mice, GAA knockout mice and GAA knockout mice treated with multiple doses of EEV-PMO 220-1055.
FIGS. 22A-22B are graphs showing GYS1 (A) and GYS2 (B) levels in the livers of wild-type mice, GAA knockout mice, and GAA knockout mice treated with multiple doses of EEV-PMO 220-1055.
FIGS. 23A-23C show the expression levels of IRF-5 in mouse tibialis anterior (TA) tissue (A), liver tissue (B) and small intestine tissue (C) after treatment of mice with two doses of PMO or EEV-PMO 278-1120. mpk = mg/kg.
FIGS. 24A-24C show IRF-5 expression levels in mouse liver (A), kidney (B) and tibialis anterior (C) tissues after treatment of mice with one dose of PMO 278 or PMO-EEV 278-1120. (P > 0.05 = NS; P ≤ 0.05 = *; P ≤ 0.01 = **; P ≤ 0.001 = ***)
FIGS. 25A-25B show GYS1 protein levels in quadriceps (A) and triceps (B), measured using GYS antibodies that are not specific for GYS1, after treatment of mice with various concentrations of EEV-PMO construct 220-814.
FIGS. 26A-26C show GYS1 protein levels in diaphragm (A), heart (B) and triceps (C), measured using GYS1-specific antibodies, after treatment of mice with various concentrations of EEV-PMO construct 220-814.
FIG. 27A shows IRF-5 expression levels in RAW 264.7 monocytes/macrophages after treatment with various concentrations of PMO-EEVs 277-1120 and 278-1120. (P > 0.05 = NS; P ≤ 0.05 = *; P ≤ 0.01 = **; P ≤ 0.001 = ***)
FIG. 27B is a bar graph of percent exon skipping at various time points after RAW 264.7 monocytes/macrophages were treated with EEV-PMO 278-1120. NT = untreated.
FIGS. 28A-28B are bar graphs showing IRF-5 expression levels (A) and percent exon 4 skipping (B) in RAW 264.7 monocytes/macrophages following treatment with various EEV-PMO concentrations followed by R848 stimulation.
FIGS. 29A-29B are graphs showing IRF-5 exon 4 and exon 5 skipping levels in human THP1 cells after treatment with various EEV-PMO concentrations.
Detailed Description
Splicing
Pre-mRNA molecules are made in the nucleus and processed before or during transport to the cytoplasm for translation. Processing of the pre-mRNA includes the addition of a 5' methylated guanine cap and the addition of a poly(A) tail of about 200-250 bases to the 3' end of the transcript. Pre-mRNA processing also includes splicing, which occurs in the maturation of about 90% to about 95% of mammalian mRNAs. Introns (or intervening sequences) are regions of the primary transcript (or the DNA encoding it) that are not included in the coding sequence of the mature mRNA. Exons are regions of the primary transcript that remain in the mature mRNA when it reaches the cytoplasm. Transcripts may have multiple introns and exons. Exons are spliced together to form the mature mRNA sequence. Splice junctions are also referred to as splice sites, wherein the 5' side of the junction is commonly referred to as the "5' splice site" or "splice donor site" and the 3' side is referred to as the "3' splice site" or "splice acceptor site". In splicing, the 3' end of the upstream exon is joined to the 5' end of the downstream exon. Thus, a transcript (e.g., a pre-mRNA) has an exon/intron junction at the 5' end of an intron and an intron/exon junction at the 3' end of the intron. After removal of the introns, the exons are contiguous in the mature mRNA at positions sometimes referred to as exon/exon junctions or boundaries. Cryptic splice sites are those that are used less often but can be used when the usual splice site is blocked or otherwise unavailable. Alternative splicing (defined as splicing together different combinations of exons) generally results in multiple mRNA transcripts from a single gene.
The removal of introns from pre-mRNA is catalyzed by the spliceosome, a ribonucleoprotein (RNP) complex comprising five small nuclear ribonucleoproteins (snRNPs) and many other proteins (Will and Lührmann, Cold Spring Harb. Perspect. Biol. (2011), 3(7):a003707; Havens et al., Wiley Interdiscip. Rev. RNA (2014), 4(3):247-266, doi:10.1002/wrna.1158). Splicing is controlled in part by Splice Elements (SEs). As used herein, a "splice element" is a sequence element present in a pre-mRNA that is necessary for splicing, such as canonical splicing, to occur (FIG. 1A). SEs include the 5' splice site (5' ss) and the 3' splice site (3' ss). The 5' ss is also referred to as the donor splice site, and includes a nearly invariant "GU" dinucleotide sequence as well as less conserved downstream residues. The 5' splice site also comprises an exon/intron junction. As used herein, an exon/intron junction is the nucleotide sequence extending from 10 nucleotides upstream to 10 nucleotides downstream (-10 to +10) of the G of the GU sequence of the 5' ss. The 3' ss, or acceptor splice site, comprises three conserved elements: the branch point sequence (BPS) (sometimes referred to as the branch point), the polypyrimidine or Py tract, and a terminal "AG". The BPS typically contains an adenosine located between about 18 and about 40 nucleotides upstream of the 3' ss. The Py tract typically contains about 15 to about 20 pyrimidine residues, in particular uracil (U) (shown as Xn in FIG. 1A). However, noncanonical branch points exist; they are further away from the 3' splice site and/or use non-adenosine bases (Montes et al., Trends Genet. (2019), 35(1):68-87). The 3' ss also contains an intron/exon junction. As used herein, an intron/exon junction is the nucleotide sequence extending from 10 nucleotides upstream to 10 nucleotides downstream (-10 to +10) of the G of the AG sequence of the 3' ss.
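The junction definitions above (10 nucleotides on either side of the G of the GU or AG dinucleotide) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names, the 0-based indexing, the choice to include the reference G itself in the window (giving up to 21 nucleotides), the truncation at transcript ends, and the toy sequence, which is not taken from any real gene.

```python
def junction_window(pre_mrna: str, g_index: int, flank: int = 10) -> str:
    """Return the bases from `flank` nt upstream to `flank` nt downstream of a G.

    The window is truncated if it would run past either end of the transcript.
    """
    if pre_mrna[g_index] != "G":
        raise ValueError("g_index must point at a G")
    start = max(0, g_index - flank)
    end = min(len(pre_mrna), g_index + flank + 1)
    return pre_mrna[start:end]


def exon_intron_junction(pre_mrna: str, gu_index: int, flank: int = 10) -> str:
    """Window around the G of the intronic GU at the 5' splice site."""
    if pre_mrna[gu_index:gu_index + 2] != "GU":
        raise ValueError("expected GU at the 5' splice site")
    return junction_window(pre_mrna, gu_index, flank)


def intron_exon_junction(pre_mrna: str, ag_g_index: int, flank: int = 10) -> str:
    """Window around the G of the terminal intronic AG at the 3' splice site."""
    if pre_mrna[ag_g_index - 1:ag_g_index + 1] != "AG":
        raise ValueError("expected AG at the 3' splice site")
    return junction_window(pre_mrna, ag_g_index, flank)


# Toy transcript (hypothetical): exon 1, one short intron, exon 2.
exon1 = "AUGGCCAAGCUGACC"
intron = "GUAAGUCUAACUUUUUUUUUUCAG"
exon2 = "GGAUCCUAA"
pre_mrna = exon1 + intron + exon2

five_prime_window = exon_intron_junction(pre_mrna, len(exon1))
three_prime_window = intron_exon_junction(pre_mrna, len(exon1) + len(intron) - 1)
```

In this toy case the 3' window is shorter than 21 nucleotides because exon 2 ends 9 nucleotides after the AG; a real analysis would decide explicitly how to treat such edge cases.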
Exons are recognized in most splicing reactions by specific base-pairing interactions with the small nuclear RNA (snRNA) components of five small nuclear ribonucleoproteins (snRNPs): U1, U2, U4, U5 and U6 (Havens et al., (2014) Wiley Interdiscip. Rev. RNA 2013, 4(3):247-266, doi:10.1002/wrna.1158; Wahl M.C. et al., Cell (2009), 136:701-718). Each snRNP comprises a small nuclear RNA and one or more proteins configured to recognize a specific nucleotide sequence. Exon splicing involves two consecutive spliceosome-catalyzed transesterification reactions (FIG. 1B). Generally, the splicing reaction is initiated by U1 binding to the 5' ss, followed by U2 binding to the branch point sequence (BPS), and finally U4, U5 and U6 binding near the 5' and 3' splice sites. U1 and U4 are then displaced, followed by a first transesterification reaction in which the 2'-OH of the branch point nucleotide within the intron (A as shown in FIG. 1B) carries out a nucleophilic attack on the first nucleotide of the intron at the 5' splice site (G as shown in FIG. 1B), thereby forming a lariat intermediate. In the second reaction, the 3'-OH of the released 5' exon carries out a nucleophilic attack on the last nucleotide of the intron at the 3' splice site (G as shown in FIG. 1B), thereby ligating the exons and releasing the intron lariat. U4, U5 and U6 are also released.
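The two transesterification steps described above can be followed at the level of sequence bookkeeping with a short sketch. This is a string-level illustration only, not a model of spliceosome chemistry: the function name splice, the SpliceProducts tuple, the 0-based indices, and the toy sequence are all hypothetical, and the toy intron is far shorter than a real one (its branch point A lies closer to the 3' ss than the 18 to 40 nucleotides mentioned in the text).

```python
from typing import NamedTuple


class SpliceProducts(NamedTuple):
    mrna: str           # ligated exons
    lariat_intron: str  # excised intron (lariat topology is not represented)


def splice(pre_mrna: str, five_ss: int, branch_point: int, three_ss_g: int) -> SpliceProducts:
    """Apply the two-step splicing reaction to a single-intron transcript.

    five_ss      -- index of the G of the intronic GU (first intron base)
    branch_point -- index of the branch point A inside the intron
    three_ss_g   -- index of the G of the terminal AG (last intron base)
    """
    if pre_mrna[five_ss:five_ss + 2] != "GU":
        raise ValueError("intron must start with GU")
    if pre_mrna[three_ss_g - 1:three_ss_g + 1] != "AG":
        raise ValueError("intron must end with AG")
    if not five_ss < branch_point < three_ss_g or pre_mrna[branch_point] != "A":
        raise ValueError("branch point must be an A inside the intron")

    # Step 1: the 2'-OH of the branch point A attacks the first intron base,
    # freeing the 5' exon and leaving a lariat intermediate (intron + 3' exon).
    five_prime_exon = pre_mrna[:five_ss]
    lariat_intermediate = pre_mrna[five_ss:]

    # Step 2: the 3'-OH of the freed 5' exon attacks the 3' splice site,
    # ligating the exons and releasing the intron lariat.
    intron_length = three_ss_g - five_ss + 1
    lariat_intron = lariat_intermediate[:intron_length]
    three_prime_exon = lariat_intermediate[intron_length:]
    return SpliceProducts(five_prime_exon + three_prime_exon, lariat_intron)


# Toy transcript (hypothetical): exon 1, one short intron, exon 2.
exon1 = "AUGGCCAAGCUGACC"
intron = "GUAAGUCUAACUUUUUUUUUUCAG"
exon2 = "GGAUCCUAA"
products = splice(exon1 + intron + exon2, five_ss=15, branch_point=24, three_ss_g=38)
```

The validation checks mirror the sequence features named in the text (GU at the 5' ss, a branch point A, AG at the 3' ss); snRNP binding and release are deliberately left out.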
In addition to SEs, splicing is regulated in part by Splice Regulatory Elements (SREs). SREs comprise cis-regulatory elements and trans-acting splicing factors. Cis-regulatory elements and trans-acting splicing factors may promote canonical splicing, alternative splicing or cryptic splicing.
Cis-regulatory elements are nucleotide sequences within a transcript that inhibit or enhance splicing. Trans-acting splicing factors are proteins and/or oligonucleotides that are not located within the transcript and that act to enhance or inhibit splicing. Cis-regulatory elements typically recruit trans-acting splicing factors that activate or inhibit splicing. Trans-acting splicing factors regulate splicing by binding to cis-regulatory elements. Trans-acting splicing factors include serine/arginine-rich (SR) proteins and heterogeneous nuclear ribonucleoproteins (hnRNPs).
Splice cis-regulatory elements include an Exon Splice Enhancer (ESE) sequence, an Exon Splice Silencer (ESS) sequence, an Intron Splice Enhancer (ISE) sequence, and an Intron Splice Silencer (ISS) sequence (FIG. 1A). ESE sequences facilitate inclusion of the exons in mRNA in which they reside. ESS sequences inhibit the inclusion of exons in mRNA in which they reside. ISE sequences enhance the use of alternative splice sites from their positions within introns. The ISS sequence inhibits the use of alternative splice sites from its position within the intron. Typically, ISS is between 8 and 16 nucleotides in length and is less conserved than the splice site at the exon-intron junction.
Pre-mRNA splicing may also be regulated by forming secondary structures within the transcript, such as Terminal Stem Loops (TSLs), that can affect the binding of spliceosomes or other regulatory proteins. The terminal stem-loop sequence may be an SRE and is typically from about 12 to about 24 nucleotides, and forms a secondary loop structure due to complementarity and thus binding within the 12 to 24 nucleotide sequence.
Each SE and/or cis-acting SRE IS separated from adjacent cis-acting SREs and/or SEs by an Intervening Sequence (IS).
Exon skipping
Most eukaryotic pre-mRNAs can be spliced in different ways, typically by skipping exons, to produce different mature mRNA isoforms in a process called alternative splicing. The term "alternative splicing" refers to joining exons in different combinations (e.g., joining different 5' and 3' splice sites). Alternative splicing may insert or remove amino acids, shift the reading frame and/or introduce stop codons, which contributes to the complexity, flexibility and abundance of genes and the proteins they express. Alternative splicing may also affect gene expression by removing or inserting regulatory elements that control translation, mRNA stability and/or localization. Mutations that disrupt splicing are estimated to account for up to one third of all pathogenic mutations (Havens et al., (2014) Wiley Interdiscip. Rev. RNA 2013, 4(3):247-266, doi:10.1002/wrna.1158; Lim K.H. et al., Proc. Natl. Acad. Sci. USA (2011), 108:11093-11098; Faustino and Cooper, Genes & Dev. (2003), 17:419-437; and Sterne-Weiler T. et al., Genome Res. (2011), 21:1563-1571).
Mutations that affect the splicing process can occur in many different ways (Havens et al., (2014) Wiley Interdiscip. Rev. RNA 2013, 4(3):247-266, doi:10.1002/wrna.1158). For example, an intronic mutation may disrupt a core splice site (a sequence within the 5' ss or 3' ss, the Py tract, or the BPS), resulting in exon skipping or intron retention upstream or downstream of the mutated splice site (5' ss and/or 3' ss). Typically, when a splice site is mutated, a pseudo-splice site is activated within a flanking exon or intron, which upon splicing produces an alternative transcript. Mutations within introns may also disrupt or create de novo splice silencers and/or enhancers and/or create de novo cryptic splice sites. Intronic splice site mutations may account for about 10%-15% of disease mutations (Havens et al., (2014) Wiley Interdiscip. Rev. RNA 2013, 4(3):247-266, doi:10.1002/wrna.1158; Stenson P.D. et al., The Human Gene Mutation Database: 2008 update, Genome Med. (2009), 1:13). Mutations that occur within coding exons (exonic mutations) can create de novo cryptic splice sites, disrupt RNA secondary structure that has a regulatory function, and/or disrupt splice silencers or enhancers such that splice sites are not recognized by the sequence-specific RNA-binding proteins required for splicing. Analysis of exonic mutations predicts that up to 25% of mutations within exons may alter splicing (Ibid; Proc. Natl. Acad. Sci. USA (2011), 108:11093-11098). Cryptic splicing is caused by sequences in pre-mRNAs that are not normally used as splice sites, but are activated by mutations that inactivate canonical splice sites or create splice sites that were not previously present (Arechavala-Gomeza et al., The Application of Clinical Genetics (2014), 4(7):245-252; Roca X. et al., Genes Dev. (2013), 27(2):129-144).
In addition, alternative splicing of pre-mRNA, which contributes to the production of different proteins, can cause disease by shifting expression from one isoform to a different isoform associated with the disease (Ibid).
Targeting splice reactions or the splice elements involved in splicing (e.g., SEs and/or SREs) to induce aberrant splicing may be useful for disrupting expression of proteins involved in disease pathogenesis. For example, splicing can be targeted to cause exon skipping, introducing a frameshift or stop codon that results in a nonfunctional or truncated protein or a degraded RNA transcript (Stenson P.D. et al., Genome Med. 2008; 1(13)). Splice-induced reading frame correction, re-framing, and/or nonsense-mediated decay of target transcripts provide an opportunity to treat a number of diseases and conditions.
Compounds of formula (I)
Disclosed herein are compounds that modulate the expression and/or activity of a gene of interest. In embodiments, the compound modulates splicing of a target transcript of a target gene. In embodiments, the compound comprises at least one Cell Penetrating Peptide (CPP) and at least one Therapeutic Moiety (TM) that binds to a target nucleotide sequence. In embodiments, TM is an Antisense Compound (AC). In embodiments, the target nucleotide sequence comprises a nucleotide sequence that is proximal to or comprises at least a portion of a cis-acting Splice Regulatory Element (SRE) and/or proximal to or comprises at least a portion of a Splice Element (SE).
As used herein, "modulation of splicing" and "modulating splicing" refer to altering processing of a pre-mRNA transcript such that a spliced mRNA molecule contains a different combination of exons due to exon skipping or exon inclusion, deletion or addition of one or more exons, or a sequence (e.g., an intron sequence) that is not normally present in a spliced mRNA. Modulating splicing may include interrupting or facilitating one or more steps of the splicing process. As used herein, the term "splicing process" encompasses all steps of a splicing reaction, including, for example, the binding of various snRNPs (e.g., U1, U2, U3, U4, and U5) to splice elements and/or cis-acting splice regulatory elements, the binding of various proteins and/or oligonucleotides to cis-regulatory elements, and two consecutive transesterification reactions, as shown, for example, in FIG. 1B.
Therapeutic moiety
In embodiments, the present disclosure describes compounds comprising one or more Therapeutic Moieties (TM) capable of modulating splicing of a transcript of interest from a gene of interest. In embodiments, the gene of interest may be a pathogenic gene.
TM binds (e.g., hybridizes) to the target nucleotide sequence. The target nucleotide sequence is typically contained in a target transcript of the gene of interest. For example, a TM targeting a gene of interest may bind to a target nucleotide sequence (e.g., a splice element) within a target transcript.
The TM may be an Antisense Compound (AC), one or more elements associated with Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) gene editing mechanisms, a polypeptide, or a combination thereof.
Antisense Compounds (AC)
In embodiments, the therapeutic moiety comprises an Antisense Compound (AC) that can modulate splicing of a target transcript of a target gene. AC is an oligonucleotide comprising a DNA base, a modified DNA base, an RNA base, a modified internucleoside linkage, a traditional DNA sugar, a modified DNA sugar, a traditional RNA sugar, a modified RNA sugar, or a combination thereof. In embodiments, the AC comprises a nucleotide sequence that is complementary to a target nucleotide sequence present within a target transcript. In embodiments, AC comprises a nucleotide sequence complementary to a target nucleotide sequence that is proximal to or comprises at least a portion of a splice element and/or splice regulatory element within a target transcript.
The ACs described herein may contain one or more asymmetric centers and thus produce enantiomers, diastereomers, and other stereoisomeric configurations, which may be defined as (R) or (S) depending on absolute stereochemistry; alpha or beta; or (D) or (L). Antisense compounds provided herein include all such possible isomers, as well as their racemic and optically pure forms.
In embodiments, AC induces alternative splicing that results in the addition or deletion of a nucleotide in the target transcript. In some embodiments, AC induces alternative splicing that results in the addition or deletion of nucleotides within a single exon of the target transcript. In embodiments, AC induces alternative splicing resulting in nucleotide deletions within a single exon of the target transcript. In embodiments, the deletion of nucleotides within the single exon results in translation of a truncated protein. In embodiments, the truncated protein is less toxic to cells than the untruncated protein.
In embodiments, the AC is designed such that exons are skipped (sometimes referred to as exon skipping), resulting in increased or decreased expression or activity of the target protein and/or downstream proteins regulated by the target gene. In embodiments, AC is provided that generates mRNA encoding a truncated protein and/or a nonfunctional protein. In embodiments, AC is provided that produces mRNA encoding a truncated protein and/or a nonfunctional protein by alternative splicing. In embodiments, an AC is provided that triggers degradation of the target transcript, such as by nonsense-mediated decay. In embodiments, Antisense Compounds (ACs) are provided that produce alternative mRNA isoforms with beneficial properties.
Antisense Compounds (AC) may be used to modulate splicing in any suitable manner. In embodiments, AC may be designed to sterically block access to a splice site, or at least a portion of a Splice Element (SE) and/or cis-acting Splice Regulatory Element (SRE), thereby redirecting splicing to a cryptic or de novo splice site. In embodiments, AC may target a splice enhancer sequence (e.g., ESE and/or ISE) or a splice silencer sequence (e.g., ESS and/or ISS) to prevent trans-acting regulatory splicing factors from binding at a target site and effectively block or promote splicing. In embodiments, AC may be designed to base pair across the base of a splice-regulating stem loop to strengthen the stem loop structure.
In embodiments, AC induces the addition or deletion of one or more nucleotides in the resulting processed transcript, such as mRNA. If the number of nucleotides added to or removed from an open reading frame is divisible by three to produce an integer, the resulting transcript can be translated into a functional or nonfunctional protein having more or fewer amino acids than the corresponding protein expressed from the unaltered transcript, but otherwise having the same amino acid sequence (except for added or deleted amino acids) as the protein expressed from the transcript from which the nucleotides were not added or removed. If the number of nucleotides added to or removed from the open reading frame is not divisible by three to produce an integer, the open reading frame of the resulting processed transcript, such as mRNA, is shifted. For example, the number of nucleotides added or deleted to induce such a "frameshift" change may be 1, 2, 4, 5, 7, 8, 10, 11, 13, 14, 16, 17, 19, 20, 22, 23, etc. Due to the triplet nature of the genetic code, additions or deletions of a number of nucleotides that is not divisible by three shift the reading frame of the resulting processed transcript, such as mRNA, downstream of the frameshift. The shifted reading frame may result in nonsense-mediated decay, may result in a premature stop codon downstream of the frameshift, and/or may result in expression of proteins having entirely different amino acid sequences downstream of the frameshift.
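By way of illustration only, the divisibility rule described above can be sketched in a few lines of Python. The function name `shifts_reading_frame` is a hypothetical label chosen for this example and is not part of the disclosure.

```python
def shifts_reading_frame(n_nucleotides: int) -> bool:
    """Return True if adding or deleting this many nucleotides shifts the frame.

    A count that is not divisible by three shifts the reading frame;
    a count that is divisible by three leaves the frame intact.
    """
    if n_nucleotides < 0:
        raise ValueError("nucleotide count must be non-negative")
    return n_nucleotides % 3 != 0


# Counts between 1 and 23 that would induce a frameshift.
frameshift_counts = [n for n in range(1, 24) if shifts_reading_frame(n)]
print(frameshift_counts)
```

Run as written, the list contains the same counts recited in the paragraph above (1, 2, 4, 5, 7, 8, and so on up to 23).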
In embodiments, AC induces the introduction of a premature stop codon (PTC) into the open reading frame. As used herein, a "premature stop codon" is a stop codon that is in frame with the translation initiation codon and that is upstream of a physiological stop codon that is in frame with the translation initiation codon. Target transcripts with a PTC can be destabilized and degraded by a variety of mechanisms including nonsense-mediated decay.
Nonsense-mediated decay is a surveillance mechanism that recognizes mRNA transcripts with a PTC and initiates exonucleolytic and endonucleolytic pathways to remove them, thereby preventing the expression of truncated proteins that may have deleterious effects on cells. Several nonsense-mediated decay pathways have been proposed and reviewed (Lejeune et al., Biomedicines (2020), 10(1): 141; Brogna et al., Nature Structural and Molecular Biology (2009), 16, 108-113; Karousis et al., Wiley Interdiscip. Rev. RNA (2016), 7(5): 661-682). In embodiments where the target gene is overexpressed in the disease, induction of nonsense-mediated decay may be used to reduce the concentration of the target protein and thus treat the disease.
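The definition of a premature stop codon given above can be illustrated with a minimal Python sketch. It assumes an RNA sequence written 5' to 3', treats the first AUG as the translation initiation codon, and takes the position of the physiological stop codon as a known input. The function names and the toy sequences are hypothetical and serve only as an example.

```python
from typing import Optional

STOP_CODONS = {"UAA", "UAG", "UGA"}


def first_in_frame_stop(mrna: str) -> Optional[int]:
    """Return the 0-based index of the first stop codon in frame with the first AUG."""
    start = mrna.find("AUG")
    if start == -1:
        return None  # no initiation codon found
    # Step through the sequence one codon (three nucleotides) at a time.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return i
    return None


def has_premature_stop(mrna: str, physiological_stop: int) -> bool:
    """True if an in-frame stop codon lies upstream of the physiological stop."""
    stop = first_in_frame_stop(mrna)
    return stop is not None and stop < physiological_stop


# Toy transcripts (hypothetical): the physiological stop begins at index 12.
normal = "AUGGCUGGAGCAUAA"      # AUG GCU GGA GCA UAA
with_ptc = "AUGGCUUGAGCAUAA"    # AUG GCU UGA GCA UAA
print(has_premature_stop(normal, 12), has_premature_stop(with_ptc, 12))
```

In this sketch only the second toy transcript carries an in-frame stop codon upstream of the physiological stop.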
In embodiments, AC induces exon skipping to result in nonsense-mediated decay of the target transcript. This is in contrast to conventional exon skipping, which is intended to skip exons to induce expression of a particular protein isoform, thereby correcting for mis-splicing, alternative splicing, and/or avoiding deleterious mutations in a particular exon.
In embodiments, the AC induces exon skipping of an exon within the target transcript, wherein the exon has a number of nucleotides that is not divisible by three. In embodiments, AC induces skipping of an exon having a number of nucleotides that is not divisible by three, resulting in a PTC within the target transcript. In embodiments, AC induces skipping of an exon having a number of nucleotides that is not divisible by three, resulting in a PTC within the target transcript, which leads to nonsense-mediated decay of the target transcript. In embodiments, inducing nonsense-mediated decay of the target transcript results in a decrease in target transcript concentration. In embodiments, inducing nonsense-mediated decay of the target transcript results in a decrease in the concentration of the target protein encoded by the target transcript. In embodiments, inducing nonsense-mediated decay of the target transcript results in an increase and/or decrease in protein levels of downstream genes regulated by the target gene.
FIG. 2 shows an example of AC-induced exon skipping leading to nonsense-mediated decay of target transcripts or premature termination of translation of proteins. AC binds to pre-mRNA. In the exemplary embodiment, AC binds at the intron/exon junction of exon three. In other embodiments, the AC may bind to the target transcript at various other positions to induce exon skipping, resulting in nonsense-mediated decay of the target transcript (as discussed elsewhere). The number of nucleotides in exon three is not divisible by three, e.g., 52, 106, 232, 365, etc. Binding of AC to the intron/exon junction induces skipping of exon three by any of a number of possible mechanisms. For example, the binding of AC to the intron/exon junction prevents the splicing machinery from accessing the splice element. In addition or alternatively, the binding of AC to the intron/exon junction prevents completion of one or both transesterification reactions required to complete the splicing process. As a result of AC binding to the target transcript, exon three is skipped and the resulting transcript contains exon two joined to exon four. As a result of AC binding to the target transcript and skipping of exon three, the reading frame in exon four of the resulting transcript is shifted. In the illustrated embodiment, the shift in the reading frame introduces a PTC into the resulting transcript. As a result of AC binding to the target transcript, skipping of exon three, and the PTC in exon four, the resulting transcript is targeted for and undergoes nonsense-mediated decay.
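The scenario of FIG. 2 can be mimicked with a toy model, offered here only as a sketch. The four exon sequences below are invented for illustration and do not correspond to any real transcript; exon three is given a length of four nucleotides so that its length is not divisible by three. The helper names `splice` and `first_in_frame_stop` are likewise hypothetical.

```python
from typing import List, Optional

STOP_CODONS = {"UAA", "UAG", "UGA"}


def splice(exons: List[str], skip: Optional[int] = None) -> str:
    """Join exons in order, optionally omitting the exon at 0-based index `skip`."""
    return "".join(e for i, e in enumerate(exons) if i != skip)


def first_in_frame_stop(mrna: str) -> Optional[int]:
    """Return the 0-based index of the first stop codon in frame with the first AUG."""
    start = mrna.find("AUG")
    if start == -1:
        return None
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return i
    return None


# Invented exons one through four; exon three (index 2) has 4 nucleotides.
exons = ["AUGGCU", "GCAGGC", "GCUG", "UGACCGGCUCUUAA"]

full = splice(exons)             # exons 1-2-3-4
skipped = splice(exons, skip=2)  # exon two joined directly to exon four

print(len(exons[2]) % 3)             # nonzero, so skipping shifts the frame
print(first_in_frame_stop(full))     # stop at the end of the full transcript
print(first_in_frame_stop(skipped))  # earlier stop created by the frameshift
```

In the toy full-length transcript the only in-frame stop codon is the final codon, whereas in the exon-three-skipped transcript the shifted frame reads a stop codon at the start of exon four. The sketch models only the reading-frame arithmetic and makes no attempt to model the decay machinery itself.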
Determination of target sequences and design of Antisense Compounds (ACs) to induce exon skipping can be accomplished using a variety of different methods, including, for example, those described by Aartsma-Rus, A. et al., Molecular Therapy (2008), 17(3) 548-553; and Aartsma-Rus, A. et al., RNA (2007), 13(10) 1609-1624. In embodiments, AC hybridizes to a target nucleotide sequence comprising at least a portion of a Splice Element (SE) of a target transcript. In embodiments, AC hybridizes to a target nucleotide sequence comprising the entire SE of a target transcript. In embodiments, AC hybridizes to a target nucleotide sequence comprising a plurality of SEs of a target transcript. In embodiments, the AC hybridizes to a target nucleotide sequence comprising a plurality of SEs of a target transcript and intervening sequences between the SEs.
In embodiments, the AC hybridizes to a target nucleotide sequence comprising at least a portion of an SRE of a target transcript. In embodiments, AC hybridizes to a target nucleotide sequence comprising an entire SRE of a target transcript. In embodiments, the AC hybridizes to a target nucleotide sequence comprising a plurality of SREs of a target transcript. In embodiments, the AC hybridizes to a target nucleotide sequence comprising a plurality of SREs of a target transcript and intervening sequences between the SREs.
In embodiments, the target nucleotide sequence comprises the entire SE and/or SRE and one or more flanking sequences upstream and/or downstream of the SE and/or SRE of the target transcript. In embodiments, the target nucleotide sequence comprises a portion, but not all, of the SE and/or SRE, and one or more flanking sequences located upstream and/or downstream of the SE and/or SRE of the target transcript.
In embodiments, the flanking sequences comprise 1 or more, 2 or more, 3 or more, 4 or more, 5 or more, 10 or more, 15 or more, or 20 or more bases on one or both sides of the SE and/or SRE. In embodiments, the flanking sequences comprise 25 or fewer, 20 or fewer, 15 or fewer, 10 or fewer, 5 or fewer, 4 or fewer, 3 or fewer, or 2 or fewer bases on one or both sides of the SE and/or SRE. In embodiments, the flanking sequences comprise 1 to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 4, 1 to 3, or 1 to 2 bases on one or both sides of the SE and/or SRE. In embodiments, the flanking sequences comprise 2 to 25, 2 to 20, 2 to 15, 2 to 10, 2 to 5, 2 to 4, or 2 to 3 bases on one or both sides of the SE and/or SRE. In embodiments, the flanking sequences comprise 3 to 25, 3 to 20, 3 to 15, 3 to 10, 3 to 5, or 3 to 4 bases on one or both sides of the SE and/or SRE. In embodiments, the flanking sequences comprise 4 to 25, 4 to 20, 4 to 15, 4 to 10, or 4 to 5 bases on one or both sides of the SE and/or SRE. In embodiments, the flanking sequences comprise 5 to 25, 5 to 20, 5 to 15, or 5 to 10 bases on one or both sides of the SE and/or SRE. In embodiments, the flanking sequences comprise 10 to 25, 10 to 20, or 10 to 15 bases on one or both sides of the SE and/or SRE. In embodiments, the flanking sequences comprise 15 to 25 or 15 to 20 bases on one or both sides of the SE and/or SRE. In embodiments, the flanking sequences comprise 20 to 25 bases on one or both sides of the SE and/or SRE. In embodiments, the flanking sequence comprises an intervening sequence or a portion thereof.
In embodiments, AC hybridizes to a target nucleotide sequence comprising at least a portion of the 5' ss of a target transcript. In embodiments, the AC hybridizes to a target nucleotide sequence comprising at least a portion of an exon/intron junction of a target transcript. In embodiments, AC hybridizes to a target nucleotide sequence comprising at least a portion of the 3' ss of a target transcript. In embodiments, AC hybridizes to a target nucleotide sequence comprising at least a portion of a Py tract, BPS, terminal "AG" and/or intron/exon junction of a target transcript.
In embodiments, AC hybridizes to a target nucleotide sequence comprising at least a portion of a Splice Regulatory Element (SRE) of a target transcript. In embodiments, AC hybridizes to a target nucleotide sequence comprising an entire SRE of a target transcript. In embodiments, the AC hybridizes to a target nucleotide sequence comprising a plurality of SREs of a target transcript. In embodiments, the AC hybridizes to a target nucleotide sequence comprising a plurality of SREs of a target transcript and intervening sequences between SREs of the target transcript. In embodiments, the AC hybridizes to a target nucleotide sequence comprising at least a portion of an ESE of a target transcript. In embodiments, AC hybridizes to a target nucleotide sequence comprising at least a portion of an ISE of a target transcript. In embodiments, the AC hybridizes to a target nucleotide sequence comprising at least a portion of an ESS of a target transcript. In embodiments, the AC hybridizes to a target nucleotide sequence comprising at least a portion of an ISS of a target transcript.
In embodiments, AC hybridizes to a target nucleotide sequence comprising at least a portion of a terminal stem loop (TSL) of a target transcript.
In embodiments, the AC hybridizes to at least a portion of an abnormal SE and/or SRE of the target transcript, wherein the abnormal SE and/or SRE is generated by a mutation in the target gene.
In embodiments, the AC hybridizes to a target nucleotide sequence comprising at least a portion of a SE and/or SRE, an exon/intron junction, or an intron/exon junction of a target transcript. In embodiments, AC hybridizes to a target nucleotide sequence comprising an aberrant fusion junction due to rearrangement or deletion of the target transcript. In embodiments, AC hybridizes to a specific exon in an alternatively spliced mRNA of a target transcript.
In embodiments, AC hybridizes to a target nucleotide sequence comprising at least a portion of a Splice Element (SE) of an IRF-5, GYS1 and/or DUX4 target transcript. In embodiments, AC hybridizes to a target nucleotide sequence comprising an entire SE of an IRF-5, GYS1 and/or DUX4 target transcript. In embodiments, the AC hybridizes to a target nucleotide sequence comprising a plurality of SEs of an IRF-5, GYS1 and/or DUX4 target transcript. In embodiments, the AC hybridizes to a target nucleotide sequence comprising a plurality of SEs of an IRF-5, GYS1 and/or DUX4 target transcript and an intervening sequence between the SEs. In embodiments, the AC hybridizes to at least a portion of the SE and one or more flanking sequences of the SE of the IRF-5, GYS1 and/or DUX4 target transcript.
In embodiments, the AC hybridizes to a target nucleotide sequence comprising at least a portion of the 5' ss of an IRF-5 target transcript. In embodiments, the AC hybridizes to a target nucleotide sequence comprising at least a portion of an exon/intron junction of an IRF-5 target transcript. In embodiments, the AC hybridizes to a target nucleotide sequence comprising at least a portion of the 3' ss of an IRF-5 target transcript. In embodiments, the AC hybridizes to a target nucleotide sequence comprising at least a portion of a Py tract, BPS, terminal "AG" and/or intron/exon junction of an IRF-5 target transcript.
In embodiments, the AC binds a target nucleotide sequence that does not comprise at least a portion of SE or at least a portion of SRE of the target transcript. In embodiments, AC binds to a target nucleotide sequence sufficiently close to SE and/or SRE to modulate splicing of the target transcript. In embodiments, AC that binds to a target nucleotide sequence that does not comprise at least a portion of SE or at least a portion of SRE of a target transcript and modulates splicing of the target transcript may bind to the target transcript and sterically block binding of a translation factor or trans-acting modulator to SE or SRE.
In embodiments, the AC binds to a target nucleotide sequence that is 1 or more, 2 or more, 3 or more, 4 or more, 5 or more, 10 or more, 15 or more, or 20 or more nucleotides from the 5' end and/or 3' end of the SE and/or SRE of the target transcript. In embodiments, the AC binds to a target nucleotide sequence that is 25 or fewer, 20 or fewer, 15 or fewer, 10 or fewer, 5 or fewer, 4 or fewer, 3 or fewer, or 2 or fewer nucleotides from the 5' end and/or 3' end of the SE and/or SRE of the target transcript. In embodiments, the AC binds to a target nucleotide sequence that is 1 to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 4, 1 to 3, or 1 to 2 nucleotides from the 5' end and/or 3' end of the SE and/or SRE of the target transcript. In embodiments, the AC binds to a target nucleotide sequence that is 2 to 25, 2 to 20, 2 to 15, 2 to 10, 2 to 5, 2 to 4, or 2 to 3 nucleotides from the 5' end and/or 3' end of the SE and/or SRE of the target transcript. In embodiments, the AC binds to a target nucleotide sequence that is 3 to 25, 3 to 20, 3 to 15, 3 to 10, 3 to 5, or 3 to 4 nucleotides from the 5' end and/or 3' end of the SE and/or SRE of the target transcript. In embodiments, the AC binds to a target nucleotide sequence that is 4 to 25, 4 to 20, 4 to 15, 4 to 10, or 4 to 5 nucleotides from the 5' end and/or 3' end of the SE and/or SRE of the target transcript. In embodiments, the AC binds to a target nucleotide sequence that is 5 to 25, 5 to 20, 5 to 15, or 5 to 10 nucleotides from the 5' end and/or 3' end of the SE and/or SRE of the target transcript. In embodiments, the AC binds to a target nucleotide sequence that is 10 to 25 or 10 to 20 nucleotides from the 5' end and/or 3' end of the SE and/or SRE of the target transcript.
In embodiments, the AC binds to a target nucleotide sequence that is 20 to 25 nucleotides from the 5' end and/or 3' end of the SE and/or SRE of the target transcript.
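As a rough sketch of how the distances recited above might be checked, the following Python example measures the separation between an AC binding site and a splice element on a transcript. It rests on several assumptions that are not drawn from the disclosure: positions are 0-based, half-open intervals; the distance is counted as the number of nucleotides lying between the binding site and the nearest end of the element; and the names `gap_nt` and `within_range` and all coordinates are hypothetical.

```python
from typing import Tuple

Interval = Tuple[int, int]  # 0-based, half-open [start, end) on the transcript


def gap_nt(binding_site: Interval, element: Interval) -> int:
    """Number of nucleotides between a binding site and a non-overlapping element."""
    s_start, s_end = binding_site
    e_start, e_end = element
    if s_start >= e_end:   # binding site lies 3' of the element
        return s_start - e_end
    if s_end <= e_start:   # binding site lies 5' of the element
        return e_start - s_end
    raise ValueError("binding site overlaps the element")


def within_range(gap: int, low: int, high: int) -> bool:
    """True if the gap falls in the inclusive range [low, high]."""
    return low <= gap <= high


element = (100, 108)      # hypothetical SE occupying positions 100-107
downstream = (111, 131)   # hypothetical 20-nt binding site 3' of the SE
upstream = (70, 95)       # hypothetical 25-nt binding site 5' of the SE

print(gap_nt(downstream, element), within_range(gap_nt(downstream, element), 1, 25))
print(gap_nt(upstream, element), within_range(gap_nt(upstream, element), 1, 25))
```

A different counting convention (for example, measuring from the first bound nucleotide rather than counting intervening nucleotides) would shift each value by one, so the convention should be fixed before ranges are compared.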
In embodiments, AC hybridizes to a target nucleotide sequence of about 5 to about 50 nucleic acids in length. In embodiments, AC is the same length as the target nucleotide sequence. In embodiments, the AC differs in length from the target nucleotide sequence. In embodiments, AC is longer than the target nucleotide sequence.
In embodiments, the AC is 5 or more, 10 or more, 15 or more, 20 or more, 25 or more, 30 or more, 35 or more, 40 or more, or 45 or more nucleic acids in length. In embodiments, the AC is 50 or less, 45 or less, 40 or less, 35 or less, 30 or less, 25 or less, 20 or less, 15 or less, or 10 or less nucleic acids in length. In embodiments, the AC is 5 to 50, 5 to 45, 5 to 40, 5 to 35, 5 to 30, 5 to 25, 5 to 20, 5 to 15, or 5 to 10 nucleic acids in length. In embodiments, the AC is 10 to 50, 10 to 45, 10 to 40, 10 to 35, 10 to 30, 10 to 25, 10 to 20, or 10 to 15 nucleic acids in length. In embodiments, the AC is 15 to 50, 15 to 45, 15 to 40, 15 to 35, 15 to 30, 15 to 25, or 15 to 20 nucleic acids in length. In embodiments, the AC is 20 to 50, 20 to 45, 20 to 40, 20 to 35, 20 to 30, or 20 to 25 nucleic acids in length. In embodiments, the AC is 25 to 50, 25 to 45, 25 to 40, 25 to 35, or 25 to 30 nucleic acids in length. In embodiments, the AC is 30 to 50, 30 to 45, 30 to 40, or 30 to 35 nucleic acids in length. In embodiments, the AC is 35 to 50, 35 to 45, or 35 to 40 nucleic acids in length. In embodiments, the AC is 40 to 50 or 40 to 45 nucleic acids in length. In embodiments, the AC is 45 to 50 nucleic acids in length. In embodiments, the AC is 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49 or 50 nucleic acids in length.
In embodiments, AC has 100% complementarity to the target nucleotide sequence. In embodiments, AC does not have 100% complementarity to the target nucleotide sequence. As used herein, the term "percent complementary" refers to the number of nucleobases of AC that have nucleobase complementarity to the corresponding nucleobase of an oligomeric compound or nucleic acid (e.g., target nucleotide sequence) divided by the total length of AC (the number of nucleobases). One skilled in the art recognizes that it is possible to include mismatches without eliminating the activity of the antisense compound.
In embodiments, AC comprises 20% or less, 15% or less, 10% or less, 5% or less, or zero mismatches with the target nucleotide sequence. In some embodiments, AC comprises 5% or more, 10% or more, or 15% or more mismatches with the target nucleotide sequence. In embodiments, AC comprises zero to 5%, zero to 10%, zero to 15%, or zero to 20% mismatches with the target nucleotide sequence. In embodiments, AC comprises 5% to 10%, 5% to 15%, or 5% to 20% mismatches with the target nucleotide sequence. In embodiments, the AC comprises 10% to 15% or 10% to 20% mismatches with the target nucleotide sequence. In embodiments, the AC comprises 15% to 20% mismatches with the target nucleotide sequence.
In embodiments, AC has 80% or greater, 85% or greater, 90% or greater, 95% or greater, 96% or greater, 97% or greater, 98% or greater, or 99% or greater complementarity to the target nucleotide sequence. In embodiments, AC has 100% or less, 99% or less, 98% or less, 97% or less, 96% or less, 95% or less, 90% or less, 85% or less complementarity to the target nucleotide sequence. In embodiments, AC has 80% to 100%, 80% to 99%, 80% to 98%, 80% to 97%, 80% to 96%, 80% to 95%, 80% to 90%, or 80% to 85% complementarity to the target nucleotide sequence. In embodiments, AC has 85% to 100%, 85% to 99%, 85% to 98%, 85% to 97%, 85% to 96%, 85% to 95%, or 85% to 90% complementarity to the target nucleotide sequence. In embodiments, AC has 90% to 100%, 90% to 99%, 90% to 98%, 90% to 97%, 90% to 96%, or 90% to 95% complementarity to the target nucleotide sequence. In embodiments, AC has 95% to 100%, 95% to 99%, 95% to 98%, 95% to 97%, or 95% to 96% complementarity to the target nucleotide sequence. In embodiments, AC has 96% to 100%, 96% to 99%, 96% to 98%, or 96% to 97% complementarity to the target nucleotide sequence. In embodiments, AC has 97% to 100%, 97% to 99%, or 97% to 98% complementarity to the target nucleotide sequence. In embodiments, AC has 98% to 100% or 98% to 99% complementarity to the target nucleotide sequence. In embodiments, AC has 99% to 100% complementarity to the target nucleotide sequence. The percent complementarity of an oligonucleotide is calculated by dividing the number of complementary nucleobases by the total number of nucleobases of the oligonucleotide.
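The percent complementarity calculation described above (complementary nucleobases divided by the total number of nucleobases of the oligonucleotide) can be sketched as follows. This is an illustrative example under stated assumptions: both sequences are written 5' to 3' in an RNA alphabet, they are of equal length and aligned without gaps, and only Watson-Crick pairs count as complementary. The function names and the 20-nucleotide target sequence are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the Watson-Crick reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def percent_complementarity(ac: str, target: str) -> float:
    """Complementary nucleobases of the AC divided by its total length, times 100."""
    if len(ac) != len(target):
        raise ValueError("this sketch assumes equal-length, ungapped sequences")
    # The AC pairs antiparallel to the target, so read the target 3' to 5'.
    matches = sum(
        1 for a, t in zip(ac, reversed(target)) if COMPLEMENT.get(t) == a
    )
    return 100.0 * matches / len(ac)


target = "AUGGCUGCAGGCGCUGUGAC"          # hypothetical 20-nt target sequence
perfect_ac = reverse_complement(target)  # fully complementary AC
one_mismatch_ac = "A" + perfect_ac[1:]   # first base changed from G to A

print(percent_complementarity(perfect_ac, target))
print(percent_complementarity(one_mismatch_ac, target))
```

With a 20-nucleotide AC, each mismatch lowers the complementarity by five percentage points, which is how the mismatch percentages and the complementarity percentages recited above relate to one another.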
In embodiments, AC comprises 1, 2, 3, 4, or 5 mismatches relative to the target nucleic acid sequence to which AC hybridizes. In embodiments, AC comprises 1 or 2 mismatches relative to the target nucleic acid sequence to which AC hybridizes. In embodiments, AC does not contain mismatches relative to the target nucleic acid sequence to which AC hybridizes.
In embodiments, incorporating nucleotide affinity modifications allows for a greater number of mismatches than in unmodified compounds. Similarly, certain oligonucleotide sequences may be more tolerant of mismatches than other oligonucleotide sequences. One of ordinary skill in the art can determine the appropriate number of mismatches between AC and the target nucleotide sequence, such as by determining the thermal melting temperature (Tm). Tm or ΔTm may be calculated by techniques familiar to those of ordinary skill in the art. For example, Freier et al. (Nucleic Acids Research (1997), 25(22): 4429-4443) describe techniques that allow one of ordinary skill in the art to evaluate the ability of nucleotide modifications to increase the melting temperature of an RNA:DNA duplex.
In embodiments, the AC comprises a sequence that hybridizes to the target transcript under stringent conditions and comprises a sequence that does not hybridize to the target transcript under stringent conditions. In embodiments, the AC comprises a first sequence that does not hybridize to a target sequence under stringent conditions, a second sequence that does not hybridize to a target sequence under stringent conditions, and a third sequence that hybridizes to a target sequence under stringent conditions, wherein the third sequence is located between the first sequence and the second sequence.
In embodiments, AC hybridizes to a target nucleotide sequence comprising at least a portion of a Splice Regulatory Element (SRE) of an IRF-5, GYS1, and/or DUX4 target transcript. In embodiments, AC hybridizes to a target nucleotide sequence comprising an entire SRE of an IRF-5, GYS1 and/or DUX4 target transcript. In embodiments, the AC hybridizes to a target nucleotide sequence comprising a plurality of SREs of an IRF-5, GYS1 and/or DUX4 target transcript. In embodiments, the AC hybridizes to a target nucleotide sequence comprising a plurality of SREs of an IRF-5, GYS1 and/or DUX4 target transcript and intervening sequences between the SREs. In embodiments, the AC hybridizes to at least a portion of the SE and one or more flanking sequences of the SE of the IRF-5, GYS1 and/or DUX4 target transcript.
In embodiments, the AC hybridizes to a target nucleotide sequence comprising at least a portion of an ESE of an IRF-5, GYS1 and/or DUX4 target transcript. In embodiments, the AC hybridizes to a target nucleotide sequence comprising at least a portion of an ISE of an IRF-5, GYS1 and/or DUX4 target transcript. In embodiments, the AC hybridizes to a target nucleotide sequence comprising at least a portion of an ESS of an IRF-5, GYS1 and/or DUX4 target transcript. In embodiments, the AC hybridizes to a target nucleotide sequence comprising at least a portion of an ISS of an IRF-5, GYS1 and/or DUX4 target transcript.
In embodiments, the AC hybridizes to a target nucleotide sequence comprising at least a portion of a terminal stem loop (TSL) of an IRF-5, GYS1 and/or DUX4 target transcript.
In embodiments, the AC hybridizes to at least a portion of an aberrant SE and/or SRE of the IRF-5, GYS1 and/or DUX4 target transcript, wherein the aberrant SE and/or SRE is generated by a mutation in the IRF-5, GYS1 and/or DUX4 target transcript.
In embodiments, the AC hybridizes to a target nucleotide sequence comprising at least a portion of an exon-exon junction, an intron-exon junction, and/or an exon-intron junction of an IRF-5, GYS1, and/or DUX4 target transcript. In embodiments, the AC hybridizes to a target nucleotide sequence that comprises an aberrant fusion junction due to rearrangement or deletion of a portion of IRF-5, GYS1 and/or DUX4 target transcripts. In embodiments, AC hybridizes to a particular exon in an alternatively spliced mRNA in an IRF-5, GYS1 and/or DUX4 target transcript.
In embodiments, the AC binds to a target nucleotide sequence that does not comprise at least a portion of the SE or at least a portion of the SRE of the ISS of IRF-5, GYS1 and/or DUX4 target transcripts. In embodiments, AC binds to a target nucleotide sequence sufficiently close to SE and/or SRE to modulate splicing of ISS of IRF-5, GYS1 and/or DUX4 target transcripts.
In embodiments, the AC binds to a target nucleotide sequence that is 1 or more, 2 or more, 3 or more, 4 or more, 5 or more, 10 or more, 15 or more, or 20 or more nucleotides from the 3 'and/or 5' end of the SE and/or SRE of the IRF-5, GYS1, and/or DUX4 target transcript. In embodiments, the AC binds to a target nucleotide sequence that is 25 or fewer, 20 or fewer, 15 or fewer, 10 or fewer, 5 or fewer, 4 or fewer, 3 or fewer, or 2 or fewer nucleotides from the 3 'and/or 5' end of the SE and/or SRE of the IRF-5, GYS1, and/or DUX4 target transcript. In embodiments, the AC binds to a target nucleotide sequence that is 1 to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5,1 to 4,1 to 3, or 1 to 2 nucleotides from the 3 'end and/or the 5' end of the SE and/or SRE of the IRF-5, GYS1, and/or DUX4 target transcript. In embodiments, the AC binds to a target nucleotide sequence that is 2 to 25, 2 to 20, 2 to 15, 2 to 10, 2 to 5, 2 to 4, or 2 to 3 nucleotides from the 3 'and/or 5' end of the SE and/or SRE of the IRF-5, GYS1, and/or DUX4 target transcript. In embodiments, the AC binds to a target nucleotide sequence that is 3 to 25, 3 to 20, 3 to 15, 3 to 10, 3 to 5, or 3 to 4 nucleotides from the 3 'and/or 3' end of the SE and/or SRE of the IRF-5, GYS1, and/or DUX4 target transcript. In embodiments, the AC binds to a target nucleotide sequence that is 4 to 25, 4 to 20, 4 to 15, 4 to 10, or 4 to 5 nucleotides from the 3 'and/or 5' end of the SE and/or SRE of the IRF-5, GYS1, and/or DUX4 target transcript. In embodiments, the AC binds to a target nucleotide sequence that is 5 to 25, 5 to 20, 5 to 15, or 5 to 10 nucleotides from the 3 'and/or 5' end of the SE and/or SRE of the IRF-5, GYS1 and/or DUX4 target transcript. In embodiments, the AC binds to a target nucleotide sequence that is 10 to 25 or 10 to 20 nucleotides from the 3 'and/or 5' end of the SE and/or SRE of IRF-5, GYS1 and/or DUX4 target transcripts. 
In embodiments, the AC binds to a target nucleotide sequence that is 20 to 25 nucleotides from the 3' and/or 5' end of the SE and/or SRE of the IRF-5, GYS1, and/or DUX4 target transcript.
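By way of non-limiting illustration, the distance ranges recited above can be checked with a short Python sketch. The function names, the use of 1-based inclusive transcript coordinates, and the choice to count only the nucleotides lying strictly between the binding site and the splice element are assumptions made for this example and are not taken from the disclosure.

```python
def gap_to_element(site, element):
    """Count the nucleotides lying strictly between two intervals.

    Both arguments are (start, end) tuples in 1-based inclusive transcript
    coordinates (an assumed convention). Returns 0 when the intervals
    overlap or directly abut.
    """
    site_start, site_end = site
    elem_start, elem_end = element
    if site_start > elem_end:      # binding site lies 3' of the element
        return site_start - elem_end - 1
    if elem_start > site_end:      # binding site lies 5' of the element
        return elem_start - site_end - 1
    return 0                       # intervals overlap


def within_range(site, element, low, high):
    """True if the gap falls within the inclusive range [low, high]."""
    return low <= gap_to_element(site, element) <= high


# Hypothetical coordinates: element at 90-95, binding site at 101-120.
print(gap_to_element((101, 120), (90, 95)))       # nucleotides 96-100 lie between
print(within_range((101, 120), (90, 95), 1, 25))
```

Other conventions (for example, measuring from the first hybridized nucleotide to the terminal nucleotide of the element) would shift the result by one and could be substituted.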
In embodiments, the AC hybridizes to a target nucleotide sequence of IRF-5, GYS1, and/or DUX4 target transcripts that is from about 5 to about 50 nucleotides in length. In embodiments, the AC is the same length as the target nucleotide sequence. In embodiments, the AC differs in length from the target nucleotide sequence. In embodiments, the AC is longer than the target nucleotide sequence.
In embodiments, AC has 100% complementarity to the target nucleotide sequence of IRF-5, GYS1 and/or DUX4 target transcripts. In embodiments, AC does not have 100% complementarity to the target nucleotide sequence. As used herein, the term "percent complementary" refers to the number of nucleobases of AC that have nucleobase complementarity to the corresponding nucleobase of an oligomeric compound or nucleic acid (e.g., target nucleotide sequence) divided by the total length of AC (the number of nucleobases). One skilled in the art recognizes that it is possible to include mismatches without eliminating the activity of the antisense compound.
In embodiments, the AC comprises 20% or less, 15% or less, 10% or less, 5% or less, or zero mismatches with the target nucleotide sequence of IRF-5, GYS1, and/or DUX4 target transcripts. In embodiments, the AC comprises 5% or more, 10% or more, or 15% or more mismatches with the target nucleotide sequence of IRF-5, GYS1, and/or DUX4 target transcripts. In embodiments, the AC comprises zero to 5%, zero to 10%, zero to 15%, or zero to 20% mismatches with the target nucleotide sequence of IRF-5, GYS1, and/or DUX4 target transcripts. In embodiments, the AC comprises 5% to 10%, 5% to 15%, or 5% to 20% mismatches with the target nucleotide sequence of IRF-5, GYS1, and/or DUX4 target transcripts. In embodiments, the AC comprises 10% to 15% or 10% to 20% mismatches with the target nucleotide sequence of IRF-5, GYS1, and/or DUX4 target transcripts. In embodiments, the AC comprises 15% to 20% mismatches with the target nucleotide sequence of IRF-5, GYS1, and/or DUX4 target transcripts.
In embodiments, the AC has 80% or greater, 85% or greater, 90% or greater, 95% or greater, 96% or greater, 97% or greater, 98% or greater, or 99% or greater complementarity to the target nucleotide sequence of the IRF-5, GYS1, and/or DUX4 target transcript. In embodiments, the AC has 100% or less, 99% or less, 98% or less, 97% or less, 96% or less, 95% or less, 90% or less, 85% or less complementarity to the target nucleotide sequence of the IRF-5, GYS1 and/or DUX4 target transcript. In embodiments, AC has 80% to 100%, 80% to 99%, 80% to 98%, 80% to 97%, 80% to 96%, 80% to 95%, 80% to 90%, or 80% to 85% complementarity to the target nucleotide sequence of the IRF-5, GYS1, and/or DUX4 target transcript. In embodiments, AC has 85% to 100%, 85% to 99%, 85% to 98%, 85% to 97%, 85% to 96%, 85% to 95%, or 85% to 90% complementarity to the target nucleotide sequence of the IRF-5, GYS1, and/or DUX4 target transcript. In embodiments, AC has 90% to 100%, 90% to 99%, 90% to 98%, 90% to 97%, 90% to 96%, or 90% to 95% complementarity to the target nucleotide sequence of the IRF-5, GYS1, and/or DUX4 target transcript. In embodiments, AC has 95% to 100%, 95% to 99%, 95% to 98%, 95% to 97%, or 95% to 96% complementarity to the target nucleotide sequence of IRF-5, GYS1, and/or DUX4 target transcripts. In embodiments, AC has 96% to 100%, 96% to 99%, 96% to 98%, or 96% to 97% complementarity to the target nucleotide sequence of IRF-5, GYS1, and/or DUX4 target transcripts. In embodiments, AC has 97% to 100%, 97% to 99%, or 97% to 98% complementarity to the target nucleotide sequence of IRF-5, GYS1, and/or DUX4 target transcripts. In embodiments, AC has 98% to 100% or 98% to 99% complementarity to the target nucleotide sequence. In embodiments, AC has 99% to 100% complementarity to the target nucleotide sequence of the IRF-5, GYS1 and/or DUX4 target transcript. 
The percent complementarity of an oligonucleotide is calculated by dividing the number of complementary nucleobases by the total number of nucleobases of the oligonucleotide.
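By way of non-limiting illustration, the percent complementarity calculation defined above can be sketched in Python as follows. The sketch assumes an ungapped, antiparallel alignment of an AC and a target nucleotide sequence of equal length, both written 5' to 3', and counts only Watson-Crick pairs; the function name and the example sequences are hypothetical.

```python
# Watson-Crick partners; thymine is treated as uracil before comparison.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def percent_complementarity(ac, target):
    """Complementary nucleobases divided by the total length of the AC, times 100.

    Both sequences are written 5'->3'; the AC is compared with the target
    read 3'->5' (antiparallel, ungapped, equal length: assumed simplifications).
    """
    ac = ac.upper().replace("T", "U")
    target = target.upper().replace("T", "U")
    if len(ac) != len(target):
        raise ValueError("this sketch requires sequences of equal length")
    matches = sum(
        1 for a, t in zip(ac, reversed(target)) if COMPLEMENT.get(a) == t
    )
    return 100.0 * matches / len(ac)


target = "AUGGCCAUUG"                                  # hypothetical 10-nt target
print(percent_complementarity("CAAUGGCCAU", target))   # fully complementary AC
print(percent_complementarity("CAAUGGCCAA", target))   # one mismatch in ten
```

Under this definition a single mismatch in a 10-nucleobase AC corresponds to 10% mismatches, and the same mismatch in a 20-nucleobase AC to 5%.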
Antisense mechanism
In embodiments, AC modulates one or more aspects of protein transcription, translation, and expression. In embodiments, hybridization of AC to a target nucleotide sequence of a target transcript modulates one or more aspects of pre-mRNA splicing. In embodiments, hybridization of AC to a target nucleotide sequence of a target transcript restores native splicing to the mutated transcript sequence. In embodiments, hybridization of AC to a target nucleotide sequence of a target transcript results in alternative splicing of the target transcript.
In embodiments, AC hybridization results in exon inclusion or exon skipping of one or more exons. In embodiments, exon skipping increases the activity of a protein expressed by the resulting mRNA. In embodiments, exon skipping reduces the activity of a protein expressed by the resulting mRNA. In embodiments, skipping one or more exons induces a frame shift in the mRNA transcript. In embodiments, the frame shift results in mRNA encoding a protein with reduced activity. In embodiments, the frame shift results in a truncated or nonfunctional protein. In embodiments, skipping one or more exons results in the introduction of premature stop codons in the mRNA. In embodiments, skipping one or more exons results in degradation of the mRNA transcript by nonsense-mediated decay. In embodiments, the skipped exon sequence comprises a nucleic acid deletion, substitution, or insertion. In embodiments, the skipped exons do not contain sequence mutations. In embodiments, hybridization of the antisense oligonucleotide to a target nucleotide sequence within a target pre-mRNA transcript results in expression of different protein isoforms.
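By way of non-limiting illustration, the relationship between exon skipping, frame shifting, and premature stop codons described above can be sketched in Python. Skipping an exon whose length is not a multiple of three shifts the reading frame of the downstream sequence, which may bring a stop codon into frame. The exon sequences below are invented toy sequences, not sequences from IRF-5, GYS1, or DUX4, and the function names are hypothetical. The sketch does not model nonsense-mediated decay itself.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def skip_exon(exons, index):
    """Join the exon sequences in order, omitting the exon at `index`."""
    return "".join(seq for i, seq in enumerate(exons) if i != index)


def causes_frameshift(exon_seq):
    """An exon whose length is not a multiple of 3 shifts the frame when skipped."""
    return len(exon_seq) % 3 != 0


def first_stop(cds):
    """Return the 0-based index of the first in-frame stop codon, or None."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i
    return None


# Toy coding sequence split into three exons (illustrative only).
exons = ["ATGGCC", "AAGT", "TGACTGCATAA"]
full_length = "".join(exons)
skipped = skip_exon(exons, 1)

print(causes_frameshift(exons[1]))   # exon 2 is 4 nt long, so the frame shifts
print(first_stop(full_length))       # position of the normal stop codon
print(first_stop(skipped))           # earlier, premature stop after skipping
```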
In embodiments, hybridization of AC to a target nucleotide sequence of a target transcript prevents inclusion of an intron sequence in the mature mRNA molecule. In embodiments, hybridization of AC to a target nucleotide sequence of a target transcript results in increased expression of a protein isoform. In embodiments, hybridization of AC to a target nucleotide sequence of a target transcript results in reduced expression of a protein isoform. In embodiments, hybridization of AC to a target nucleotide sequence of a target transcript results in expression of a re-spliced protein comprising an inactive fragment of the protein.
In embodiments, the AC comprises DNA, and hybridization of the AC to the target transcript results in degradation of the transcript via RNase H. In embodiments, the AC comprises nucleotide modifications designed not to support RNase H activity. Nucleotide modifications of antisense compounds that do not support RNase H activity are known and include, but are not limited to, 2'-O-methoxyethyl (MOE)/phosphorothioate modifications. Advantageously, an AC with MOE modifications has increased affinity for the target RNA and increased nuclease stability.
In embodiments, the AC modulates transcription, translation, or protein expression by steric blocking. The following review article describes the mechanisms of steric blocking and their applications, and is incorporated herein by reference in its entirety: Roberts et al., Nature Reviews Drug Discovery (2020) 19:673-694.
Efficacy of AC can be assessed by evaluating the antisense activity affected by its administration. As used herein, the term "antisense activity" refers to any detectable and/or measurable activity attributable to hybridization of an antisense compound to its target nucleotide sequence. Such detection and/or measurement may be direct or indirect. In embodiments, antisense activity is assessed by detecting and/or measuring the amount of protein expressed by the transcript of interest. In embodiments, antisense activity is assessed by detecting and/or measuring the amount of a transcript of interest. In embodiments, antisense activity is assessed by detecting and/or measuring the amount of alternatively spliced RNA and/or the amount of protein isoforms translated from the target transcript. In embodiments, antisense activity is assessed by detecting and/or measuring the amount of downstream transcripts and/or proteins regulated by a gene of interest.
Antisense compound design
The design of the AC will depend on the target gene. Targeting AC to a specific target nucleotide sequence can be a multi-step process. The method generally begins with identifying a gene of interest. Transcripts of the gene of interest are analyzed and target nucleotide sequences are identified. In embodiments, the target nucleotide sequence comprises at least a portion of a splice element and/or a splice regulatory element. In embodiments, the target gene is IRF-5. In embodiments, the target gene is GYS1. In embodiments, the target gene is DUX4.
Those skilled in the art will be able to design, synthesize, and screen ACs of different nucleobase sequences to identify sequences that produce antisense activity. For example, antisense compounds that inhibit the expression of a target gene can be designed. Methods for designing, synthesizing, and screening ACs for antisense activity against preselected target nucleic acids and/or target genes can be found, for example, in "Antisense Drug Technology: Principles, Strategies, and Applications," edited by Stanley T. Crooke, CRC Press, Boca Raton, Florida, which is incorporated by reference in its entirety for all purposes.
AC structure
ACs include oligonucleotides and/or oligonucleosides. Oligonucleotides and/or oligonucleosides are nucleosides linked by internucleoside linkages. Nucleosides include a pentose sugar (e.g., ribose or deoxyribose) and a nitrogen-containing base covalently attached to the sugar. The naturally occurring (conventional) bases present in DNA and/or RNA are adenine (A), guanine (G), thymine (T), cytosine (C), and uracil (U). The naturally occurring (conventional) sugars present in DNA and/or RNA are deoxyribose (DNA) and ribose (RNA). The naturally occurring (conventional) internucleoside linkage is the phosphodiester linkage. In embodiments, the AC of the present disclosure may have all natural sugars, bases, and internucleoside linkages.
Chemically modified nucleosides are typically used for incorporation into antisense compounds to enhance one or more properties, such as nuclease resistance, pharmacokinetics, or affinity for a target RNA. In embodiments, the AC of the present disclosure may have one or more modified nucleosides. In embodiments, the AC of the present disclosure may have one or more modified sugars. In embodiments, the AC of the invention may have one or more modified bases. In embodiments, the AC of the present disclosure may have one or more modified internucleoside linkages.
In general, a nucleobase is any group containing one or more atoms or groups of atoms capable of hydrogen bonding with the base of another nucleic acid. In addition to "unmodified" or "natural" nucleobases (A, G, T, C, and U), many modified nucleobases or nucleobase mimetics known to those of skill in the art are suitable for use in the compounds described herein. In general, a modified nucleobase is a nucleobase that is fairly similar in structure to the parent nucleobase, such as 7-deazapurine, 5-methylcytosine, 2-thio-dT (FIG. 3), or a G-clamp. In general, a nucleobase mimetic is a nucleobase with a more complex structure than a modified nucleobase, such as a tricyclic phenoxazine nucleobase mimetic. Methods for preparing the modified nucleobases described above are well known to those skilled in the art.
In embodiments, the AC may comprise one or more nucleosides having a modified sugar moiety. In embodiments, the furanosyl sugar of a natural nucleoside can have a 2' modification, a modification that yields a constrained nucleoside, and the like (see FIG. 3). For example, in embodiments, the furanosyl sugar ring of a natural nucleoside can be modified in a variety of ways, including but not limited to adding substituents; bridging two non-geminal ring atoms to form a bicyclic nucleic acid (BNA) or locked nucleic acid; replacing the oxygen of the furanosyl ring with C or N; and/or substituting such atoms or groups (see FIG. 3). Modified sugars are well known and can be used to increase or decrease the affinity of the AC for its target nucleotide sequence. Modified sugars may also be used to increase the resistance of the AC to nucleases. The sugar may also be replaced with a sugar mimetic group or the like. In embodiments, one or more sugars of the nucleosides of the AC are replaced with a methylenemorpholine ring, as shown at 19 in FIG. 3.
In embodiments, the AC comprises one or more nucleosides that comprise a bicyclic modified sugar (BNA; sometimes referred to as a bridged nucleic acid). Examples of BNAs include, but are not limited to, LNA (4'-(CH2)-O-2' bridge), 2'-thio-LNA (4'-(CH2)-S-2' bridge), 2'-amino-LNA (4'-(CH2)-NR-2' bridge), ENA (4'-(CH2)2-O-2' bridge), 4'-(CH2)3-2' bridged BNA, 4'-(CH2CH(CH3))-2' bridged BNA, cEt (4'-(CH(CH3))-O-2' bridge), and cMOE BNA (4'-(CH(CH2OCH3))-O-2' bridge). Some examples are shown in FIG. 3. BNAs have been prepared and disclosed in the patent literature as well as the scientific literature (Srivastava et al., J. Am. Chem. Soc. (2007), ACS Advanced online publication, 10.1021/ja071106y; Albaek et al., J. Org. Chem. (2006), 71, 7731-7740; Fluiter et al., ChemBioChem (2005), 6, 1104-1109; Singh et al., Chem. Commun. (1998), 4, 455-456; Koshkin et al., Tetrahedron (1998), 54, 3607-3630; Wahlestedt et al., Proc. Natl. Acad. Sci. U.S.A. (2000), 97, 5633-5638; Kumar et al., Bioorg. Med. Chem. Lett. (1998), 8, 2219-2222; WO 94/14226; WO 2005/021570; Singh et al., J. Org. Chem. (1998), 63, 10035-10039; WO 2007/090071; U.S. Pat. Nos. 7,053,207; 6,268,490; 6,770,748; 6,794,499; 7,034,133; and 6,525,191; and U.S. Pre-Grant Publication Nos. 2004-0171570; 2004-0219565; 2004-0014959; 2003-0207841; 2004-0143114; and 2003-0082807).
In embodiments, the AC comprises one or more nucleosides comprising a Locked Nucleic Acid (LNA). In LNA, the 2'-hydroxyl group of the ribosyl sugar ring is linked to the 4' carbon atom of the sugar ring, forming a 2'-C,4'-C-oxymethylene linkage and thereby a bicyclic sugar moiety (reviewed in Elayadi et al., Curr. Opinion Investig. Drugs (2001), 2, 558-561; Braasch et al., Chem. Biol. (2001), 8, 1-7; and Orum et al., Curr. Opinion Mol. Ther. (2001), 3, 239-243; see also U.S. Pat. Nos. 6,268,490 and 6,670,461). The linkage may be a methylene (-CH2-) group bridging the 2' oxygen atom and the 4' carbon atom, for which the term LNA is used for the bicyclic moiety; in the case of an ethylene group at this position, the term ENA™ is used (Singh et al., Chem. Commun. (1998), 4, 455-456; ENA™: Morita et al., Bioorganic & Medicinal Chemistry (2003), 11, 2211-2226). LNAs and other bicyclic sugar analogs exhibit very high duplex thermal stability with complementary DNA and RNA (Tm = +3 °C to +10 °C), stability toward 3'-exonucleolytic degradation, and good solubility properties. Potent and nontoxic antisense oligonucleotides containing LNA have been described (Wahlestedt et al., Proc. Natl. Acad. Sci. U.S.A. (2000), 97, 5633-5638).
An LNA isomer that has also been studied is alpha-L-LNA, which has been shown to have excellent stability against 3'-exonucleases. alpha-L-LNA has been incorporated into antisense gapmers and chimeras that showed potent antisense activity (Frieden et al., Nucleic Acids Research (2003), 21, 6365-6372).
The synthesis and preparation of LNA monomers adenine, cytosine, guanine, 5-methyl-cytosine, thymine and uracil and their oligomerization and nucleic acid recognition properties have been described (Koshkin et al, tetrahedron (1998), 54, 3607-3630). LNA and its preparation are also described in WO 98/39352 and WO 99/14226.
Analogs of LNA have also been prepared, such as phosphorothioate-LNA and 2'-thio-LNA (Kumar et al., Bioorg. Med. Chem. Lett. (1998), 8, 2219-2222). The preparation of LNA analogs containing oligodeoxyribonucleotide duplexes as substrates for nucleic acid polymerases has also been described (WO 99/14226). Furthermore, the synthesis of 2'-amino-LNA, a conformationally constrained high-affinity oligonucleotide analog, has been described (Singh et al., J. Org. Chem. (1998), 63, 10035-10039). In addition, 2'-amino-LNAs and 2'-methylamino-LNAs have been prepared, and the thermal stability of their duplexes with complementary RNA and DNA strands has been previously reported.
In embodiments, the antisense compound is a "tricyclo-DNA (tc-DNA)", which refers to a class of constrained DNA analogs in which each nucleotide is modified by the introduction of a cyclopropane ring to restrict the conformational flexibility of the backbone and to optimize the backbone geometry of the torsion angle γ. Homobasic adenine- and thymine-containing tc-DNAs form very stable A-T base pairs with complementary RNA.
Methods for preparing modified sugars are well known to those skilled in the art. Representative patents and publications that teach the preparation of such modified sugars include, but are not limited to, U.S. Pat. Nos. 4,981,957; 5,118,800; 5,319,080; 5,359,044; 5,393,878; 5,446,137; 5,466,786; 5,514,785; 5,519,134; 5,567,811; 5,576,427; 5,591,722; 5,597,909; 5,610,300; 5,627,053; 5,639,873; 5,646,265; 5,658,873; 5,670,633; 5,792,747; 5,700,920; and 6,600,032; and WO 2005/121371.
Internucleoside linkage
Described herein are internucleoside linking groups that link nucleosides or otherwise modified nucleoside monomer units together, thereby forming ACs that comprise oligonucleotides and/or oligonucleosides. The AC may include naturally occurring internucleoside linkages, non-natural internucleoside linkages, or both.
In naturally occurring DNA and RNA, internucleoside linkages are phosphodiesters that covalently link adjacent nucleosides to each other to form linear polymeric compounds. In naturally occurring DNA and RNA, phosphodiester is linked to the 2', 3' or 5' hydroxyl moiety of a sugar. Within an oligonucleotide, phosphate groups are commonly referred to as forming the internucleoside backbone of the oligonucleotide. In naturally occurring DNA and RNA, the linkage or backbone of RNA and DNA is a 3 'to 5' phosphodiester linkage. In embodiments, the internucleoside linking group of AC is a phosphodiester. In embodiments, the internucleoside linking group of AC is a 3 'to 5' phosphodiester linkage.
Two main classes of non-natural internucleoside linkages are defined by the presence or absence of a phosphorus atom. Representative phosphorus-containing internucleoside linkages include, but are not limited to, phosphotriesters, methylphosphonates, phosphoramidates, and phosphorothioates. Representative non-phosphorus-containing internucleoside linkages include, but are not limited to, methylenemethylimino (-CH2-N(CH3)-O-CH2-), thiodiester (-O-C(O)-S-), thiocarbamate (-O-C(O)(NH)-S-), siloxane (-O-Si(H)2-O-), and N,N'-dimethylhydrazine (-CH2-N(CH3)-N(CH3)-). ACs having phosphorus-containing internucleoside linkages are referred to as oligonucleotides. ACs having non-phosphorus internucleoside linkages are referred to as oligonucleosides. Modified internucleoside linkages can be used to alter (typically increase) the nuclease resistance of antisense compounds compared to native phosphodiester linkages. Internucleoside linkages having a chiral atom can be prepared in racemic form, in chirally pure form, or as a mixture. Representative chiral internucleoside linkages include, but are not limited to, alkyl phosphonates and phosphorothioates. Methods for preparing phosphorus-containing and non-phosphorus-containing linkages are well known to those skilled in the art.
In embodiments, two or more nucleosides with modified sugars and/or modified nucleobases can be linked using phosphoramidates. In embodiments, two or more nucleosides having a methylenemorpholine ring can be linked by a phosphorodiamidate internucleoside linkage as shown at 20 in FIG. 3, wherein B1 and B2 are modified or natural nucleobases. Antisense compounds comprising nucleosides having methylenemorpholine rings linked by phosphorodiamidate internucleoside linkages can be referred to as phosphorodiamidate morpholino oligomers (PMOs).
Conjugation group
In embodiments, AC is modified by covalent attachment of one or more conjugate groups. Generally, the conjugate group modifies one or more properties of the AC, including, but not limited to, pharmacodynamics, pharmacokinetics, binding, absorption, cellular distribution, cellular uptake, charge and clearance. The conjugate groups are conventionally used in the chemical arts and are attached to a parent compound, such as AC, either directly or via an optional linking moiety or linking group. Conjugation groups include, but are not limited to, intercalators, reporter molecules, polyamines, polyamides, polyethylene glycols, thioethers, polyethers, cholesterol, thiocholesterols, cholic acid moieties, folic acid, lipids, phospholipids, biotin, phenazine, phenanthridine, anthraquinone, adamantane, acridine, fluorescein, rhodamine, coumarin, and dyes. In embodiments, the conjugation group is polyethylene glycol (PEG), and the PEG is conjugated to AC or CPP (CPP discussed elsewhere herein).
In embodiments, the conjugate group includes a lipid moiety, such as a cholesterol moiety (Letsinger et al., Proc. Natl. Acad. Sci. USA (1989), 86, 6553); cholic acid (Manoharan et al., Bioorg. Med. Chem. Lett. (1994), 4, 1053); a thioether, for example hexyl-S-tritylthiol (Manoharan et al., Ann. N.Y. Acad. Sci. (1992), 660, 306; Manoharan et al., Bioorg. Med. Chem. Lett. (1993), 3, 2765); a thiocholesterol (Oberhauser et al., Nucleic Acids Res. (1992), 20, 533); an aliphatic chain, such as a dodecanediol or undecyl residue (Saison-Behmoaras et al., EMBO J. (1991), 10, 111; Kabanov et al., FEBS Lett. (1990), 259, 327; Svinarchuk et al., Biochimie (1993), 75, 49); a phospholipid, such as di-hexadecyl-rac-glycerol or triethylammonium-1,2-di-O-hexadecyl-rac-glycero-3-H-phosphonate (Manoharan et al., Tetrahedron Lett. (1995), 36, 3651; Shea et al., Nucleic Acids Res. (1990), 18, 3777); a polyamine or polyethylene glycol chain (Manoharan et al., Nucleosides & Nucleotides (1995), 14, 969); adamantane acetic acid (Manoharan et al., Tetrahedron Lett. (1995), 36, 3651); a palmityl moiety (Mishra et al., Biochim. Biophys. Acta (1995), 1264, 229); or an octadecylamine or hexylamino-carbonyl-oxycholesterol moiety (Crooke et al., J. Pharmacol. Exp. Ther. (1996), 277, 923).
Type of antisense compound
Various types of ACs may be used, including, for example, antisense oligonucleotides, siRNAs, microRNAs, antagomirs, aptamers, ribozymes, supermirs, miRNA mimics, miRNA inhibitors, or combinations thereof.
Antisense oligonucleotides
In various embodiments, the antisense compound (AC) is an antisense oligonucleotide (ASO) complementary to a target nucleotide sequence. The term "antisense oligonucleotide (ASO)" or simply "antisense" is intended to include oligonucleotides complementary to a target nucleotide sequence. The term also includes ASOs that may not be fully complementary to the desired target nucleotide sequence. ASOs include single-stranded DNA and/or RNA complementary to a selected target nucleotide sequence or target gene. ASOs may comprise one or more modified DNA and/or RNA bases, modified sugars, and/or non-natural internucleoside linkages. In embodiments, the ASO may comprise one or more phosphoramidate internucleoside linkages. In embodiments, the ASO is a phosphorodiamidate morpholino oligomer (PMO). ASOs can have any feature, have any length, bind any splice element, and act by any of the mechanisms described with respect to the AC. In embodiments, the ASO induces exon skipping that introduces a premature stop codon and ultimately leads to nonsense-mediated decay of the target transcript. In embodiments, the ASO is a PMO and induces exon skipping that introduces a premature stop codon and ultimately leads to nonsense-mediated decay of the target transcript.
Antisense oligonucleotides have proven effective as targeted inhibitors of protein synthesis and are therefore useful for specifically inhibiting protein synthesis by targeting genes. The efficacy of ASO in inhibiting protein synthesis has been well established. To date, these compounds have shown promise in several in vitro and in vivo models, including models of inflammatory disease, cancer and HIV (Agrawal, trends in Biotech. (1996), 14:376-387). Antisense can also affect cellular activity by specifically hybridizing to chromosomal DNA.
Methods of producing ASOs are known in the art and can be readily adapted to produce ASOs that bind to target nucleotide sequences of the present disclosure. The selection of ASO sequences specific for a given target sequence is based on analysis of the selected target nucleotide sequence and determination of secondary structure, Tm, binding energy, and relative stability. Antisense oligonucleotides can be selected based on their relative inability to form dimers, hairpins, or other secondary structures that would reduce or inhibit specific binding to target nucleotide sequences in host cells. These secondary structure analyses and target site selection considerations may be performed, for example, using version 4 of the OLIGO primer analysis software (Molecular Biology Insights) and/or the BLASTN 2.0.5 algorithm software (Altschul et al., Nucleic Acids Res. 1997, 25(17):3389-402).
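By way of non-limiting illustration, a first-pass sequence analysis of a candidate ASO can be sketched in Python. The sketch derives the antisense sequence as the reverse complement of a target site and estimates Tm with the Wallace rule (2 °C per A/T, 4 °C per G/C), which is a crude approximation for short, unmodified DNA oligonucleotides. It is not the method used by the OLIGO or BLASTN software named above, and it does not account for modified chemistries such as PMO or LNA. The function names and the example sequence are hypothetical.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_of(target_dna):
    """Reverse complement of a target site written 5'->3' in the DNA alphabet."""
    return target_dna.upper().translate(DNA_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C nucleobases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Wallace-rule Tm estimate in degrees C: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


target_site = "ATGGCCATTG"           # hypothetical target site
aso = antisense_of(target_site)
print(aso, gc_fraction(aso), wallace_tm(aso))
```

In practice, nearest-neighbor thermodynamic models and dedicated secondary structure prediction would replace these estimates when ranking candidate sequences.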
RNA interference
In embodiments, the AC comprises a molecule that mediates RNA interference (RNAi). As used herein, the phrase "mediate RNAi" refers to the ability to silence a target transcript in a sequence-specific manner. While not wishing to be bound by theory, it is believed that silencing uses RNAi machinery or methods and guide RNAs, e.g., siRNA and/or miRNA compounds of about 21 to about 23 nucleotides. In embodiments, the AC targets the target transcript for degradation. Thus, in embodiments, RNAi molecules can be used to disrupt expression of a gene or polynucleotide of interest. In embodiments, RNAi molecules are used to induce degradation of target transcripts such as pre-mRNA or mature mRNA.
In embodiments, the AC comprises a small interfering RNA (siRNA) that elicits an RNAi response. In embodiments, the AC comprises a microRNA (miRNA) that elicits an RNAi response.
Small interfering RNAs (siRNAs) are nucleic acid duplexes, typically about 16 to about 30 nucleotides long, that can associate with a cytoplasmic multi-protein complex known as the RNA-induced silencing complex (RISC). RISC loaded with siRNA mediates degradation of homologous transcripts, so siRNA can be designed to knock down protein expression with high specificity. Unlike other antisense technologies, siRNA acts through a natural mechanism that evolved to control gene expression through non-coding RNA. Various RNAi agents, including siRNAs targeting clinically relevant targets, are currently under pharmaceutical development, as described, for example, in de Fougerolles, A. et al., Nature Reviews (2007) 6:443-453.
RNAi has a very wide range of therapeutic applications, as siRNA and miRNA constructs can be synthesized using any nucleotide sequence directed against a target protein. To date, siRNA constructs have been shown to specifically down-regulate target proteins in vitro and in vivo models as well as in clinical studies.
Although the first described RNAi molecules were RNA:RNA hybrids comprising both an RNA sense strand and an RNA antisense strand, it has now been demonstrated that DNA sense:RNA antisense hybrids, RNA sense:DNA antisense hybrids, and DNA:DNA hybrids are capable of mediating RNAi (Lamberton, J.S. and Christian, A.T., Molecular Biotechnology (2003), 24:111-119). In embodiments, RNAi molecules are used that include any of these different types of double-stranded molecules. Furthermore, it should be understood that RNAi molecules can be used and introduced into cells in a variety of forms. Thus, as used herein, RNAi molecules include any and all molecules capable of mediating RNAi in cells, including, but not limited to, double-stranded oligonucleotides comprising two separate strands, a sense strand and an antisense strand, such as small interfering RNAs (siRNAs); double-stranded oligonucleotides comprising two separate strands joined together by a non-nucleotide linker; oligonucleotides comprising a hairpin loop of complementary sequences that forms a double-stranded region, e.g., shRNA molecules; and expression vectors that express one or more polynucleotides capable of forming a double-stranded polynucleotide alone or in combination with another polynucleotide.
As used herein, a "single stranded siRNA compound" is an siRNA compound consisting of a single molecule. It may comprise a duplex region formed by intra-strand pairing; for example, it may be or comprise a hairpin or pan-handle structure. The single stranded siRNA compound may be antisense to the target molecule.
The single stranded siRNA compound may be long enough to be able to enter RISC and participate in RISC-mediated cleavage of target mRNA. The single stranded siRNA compound is at least about 14, at least about 15, at least about 20, at least about 25, at least about 30, at least about 35, at least about 40, or at most about 50 nucleotides in length. In certain embodiments, the single stranded siRNA is less than about 200, about 100 or about 60 nucleotides in length.
Hairpin siRNA compounds can have a duplex region equal to or at least about 17, about 18, about 19, about 20, about 21, about 22, about 23, about 24, or about 25 nucleotide pairs. The duplex region may be equal to or less than about 200, about 100, or about 50 nucleotide pairs in length. In certain embodiments, the duplex region ranges in length from about 15 to about 30, from about 17 to about 23, from about 19 to about 23, and from about 19 to about 21 nucleotide pairs. The hairpin may have a single stranded overhang or a terminal unpaired region. In certain embodiments, the length of the overhang is about 2 to about 3 nucleotides. In embodiments, the overhangs are on the same side of the hairpin, and in embodiments on the antisense side of the hairpin.
As used herein, a "double stranded siRNA compound" is an siRNA compound comprising more than one, and in some cases two strands, wherein inter-strand hybridization can form a region of duplex structure.
The antisense strand of the double stranded siRNA compound can be equal to or at least about 14, about 15, about 16, about 17, about 18, about 19, about 20, about 25, about 30, about 40 or about 60 nucleotides in length. It may be equal to or less than about 200, about 100, or about 50 nucleotides in length. The length may range from about 17 to about 25, from about 19 to about 23, and from about 19 to about 21 nucleotides. As used herein, the term "antisense strand" means a strand of an siRNA compound that is sufficiently complementary to a target molecule (e.g., a target nucleotide sequence of a target transcript).
The sense strand of the double stranded siRNA compound may be equal to or at least about 14, about 15, about 16, about 17, about 18, about 19, about 20, about 25, about 30, about 40 or about 60 nucleotides in length. It may be equal to or less than about 200, about 100, or about 50 nucleotides in length. The length may range from about 17 to about 25, from about 19 to about 23, and from about 19 to about 21 nucleotides.
The double stranded portion of the double stranded siRNA compound may be equal to or at least about 14, about 15, about 16, about 17, about 18, about 19, about 20, about 21, about 22, about 23, about 24, about 25, about 30, about 40 or about 60 nucleotide pairs in length, which may be equal to or less than about 200, about 100 or about 50 nucleotide pairs in length. The length can range from about 15 to about 30, about 17 to about 23, about 19 to about 23, and about 19 to about 21 nucleotide pairs.
In embodiments, the siRNA compound is sufficiently large to enable it to be cleaved by an endogenous molecule (e.g., by Dicer) to produce a smaller siRNA compound, such as an siRNA agent.
The sense strand and the antisense strand can be selected such that the double stranded siRNA compound comprises a single stranded or unpaired region at one or both ends of the molecule. Thus, a double stranded siRNA compound may contain sense and antisense strands paired so as to contain an overhang, such as one or two 5' or 3' overhangs, or a 3' overhang of 1 to 3 nucleotides. An overhang may be the result of one strand being longer than the other, or the result of two strands of the same length being staggered. Some embodiments will have at least one 3' overhang. In embodiments, both ends of the siRNA molecule will have a 3' overhang. In embodiments, the overhang is 2 nucleotides.
In embodiments, the duplex region is about 15 to about 30, or about 18, about 19, about 20, about 21, about 22, or about 23 nucleotides in length, e.g., within the ssiRNA (cohesive overhang-bearing siRNA) compound range discussed above. ssiRNA compounds can be similar in length and structure to the natural Dicer processing products from long dsiRNA. Also included are embodiments wherein the two strands of the ssiRNA compound are linked (e.g., covalently linked). In embodiments, hairpins or other single stranded structures providing a double stranded region and 3' overhangs are included.
siRNA compounds (including double stranded siRNA compounds and single stranded siRNA compounds) described herein can mediate silencing of a target RNA (e.g., mRNA, e.g., a transcript of a gene encoding a protein). For convenience, such mRNA is also referred to herein as mRNA to be silenced. Such genes are also referred to as target genes. Typically, the RNA to be silenced is an endogenous gene.
In embodiments, the siRNA compound is "sufficiently complementary" to the target transcript such that the siRNA compound silences production of a protein encoded by the target mRNA. In embodiments, the siRNA compound is "sufficiently complementary" to at least a portion of the target transcript such that the siRNA compound silences the production of a gene product encoded by the target transcript. In another embodiment, the siRNA compound is "exactly complementary" to the target nucleotide sequence (e.g., a portion of the target transcript) such that the target nucleotide sequence and the siRNA compound anneal, e.g., to form a hybrid consisting only of Watson-Crick base pairs in the region of exact complementarity. A sequence that is "substantially complementary" to a target nucleotide sequence can include an internal region (e.g., of at least about 10 nucleotides) that is exactly complementary to the target nucleotide sequence. Furthermore, in certain embodiments, the siRNA compound specifically discriminates a single-nucleotide difference. In this case, the siRNA compound only mediates RNAi if exact complementarity is found in the region of the single-nucleotide difference (e.g., within 7 nucleotides of it).
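The notion of an exactly complementary internal region can be illustrated with a short Python sketch. It is not part of the disclosed methods: the function names and the example sequences are hypothetical, and only canonical Watson-Crick pairs (A-U, G-C) are considered, so G-U wobble pairs are deliberately ignored.

```python
# Illustrative sketch only; names and sequences below are hypothetical.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the Watson-Crick reverse complement of an RNA sequence (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def longest_complementary_run(antisense: str, target: str) -> int:
    """Length of the longest contiguous stretch of the antisense strand that is
    exactly complementary to some region of the target (longest common substring
    between the antisense reverse complement and the target)."""
    site = reverse_complement(antisense)  # sequence the antisense strand pairs with
    target = target.upper()
    best = 0
    prev = [0] * (len(target) + 1)  # run lengths ending at the previous site base
    for i in range(1, len(site) + 1):
        curr = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            if site[i - 1] == target[j - 1]:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


target = "GGAUCCGAUUACGGCAUAGC"      # hypothetical target region
perfect = "AUGCCGUAAUC"              # exactly complementary to part of the target
mismatched = "AUGCCCUAAUC"           # same strand with one internal mismatch
print(longest_complementary_run(perfect, target))
print(longest_complementary_run(mismatched, target))
```

Under these assumptions, a single internal mismatch shortens the longest exactly complementary run, which is one simple way to compare a candidate strand against a threshold such as the roughly 10-nucleotide internal region mentioned above.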
RNAi has a very wide range of therapeutic applications, as siRNA and miRNA constructs can be synthesized using any nucleotide sequence directed against a target protein. To date, siRNA constructs have been shown to specifically down-regulate target proteins in vitro and in vivo models as well as in clinical studies.
MicroRNA
In embodiments, the AC comprises a microRNA molecule. MicroRNAs (miRNAs) are a highly conserved class of small RNA molecules that are transcribed from DNA in plant and animal genomes, but are not translated into proteins. Processed miRNAs are single-stranded 17-25 nucleotide RNA molecules that are incorporated into the RNA-induced silencing complex (RISC) and have been identified as key regulators of development, cell proliferation, apoptosis, and differentiation. They are believed to play a role in the regulation of gene expression by binding to the 3'-untranslated region of specific mRNAs. RISC mediates down-regulation of gene expression by translational inhibition, transcript cleavage, or both. RISC is also implicated in transcriptional silencing in the nucleus of a wide range of eukaryotes.
Antagomir
In embodiments, the AC is an antagomir. Antagomirs are RNA-like oligonucleotides that carry various modifications for RNase protection and for pharmacological properties such as enhanced tissue and cellular uptake. They differ from normal RNA by, for example, complete 2'-O-methylation of the sugar, a phosphorothioate backbone, and, for example, a cholesterol moiety at the 3'-end. Antagomirs can be used to efficiently silence endogenous miRNAs by forming a duplex comprising the antagomir and the endogenous miRNA, thereby preventing miRNA-induced gene silencing. An example of antagomir-mediated miRNA silencing is the silencing of miR-122, described in Krutzfeldt et al., Nature (2005), 438:685-689, which is expressly incorporated herein by reference in its entirety. Antagomir RNAs can be synthesized using standard solid phase oligonucleotide synthesis protocols (U.S. patent application Ser. Nos. 11/502,158 and 11/657,341; the respective disclosures of which are incorporated herein by reference).
An antagomir may include ligand-conjugated monomer subunits and monomers for oligonucleotide synthesis. Monomers are described in U.S. application Ser. No. 10/916,185. An antagomir may have a ZXY structure, such as described in PCT application No. PCT/US 2004/07070. An antagomir may be complexed with an amphiphilic moiety. Amphiphilic moieties for use with oligonucleotide agents are described in PCT application No. PCT/US 2004/07070.
Aptamer
In embodiments, the AC includes an aptamer. Aptamers are nucleic acid or peptide molecules that bind with high affinity and specificity to a particular molecule of interest (Tuerk and Gold, Science 249:505 (1990); Ellington and Szostak, Nature 346:818 (1990)). DNA or RNA aptamers that bind many different entities, from large proteins to small organic molecules, have been successfully produced (Eaton, Curr. Opin. Chem. Biol. (1997), 1:10-16; Famulok, Curr. Opin. Struct. Biol. (1999), 9:324-9; and Hermann and Patel, Science (2000), 287:820-5). The aptamer may be RNA-based or DNA-based and may comprise a riboswitch. A riboswitch is a part of an mRNA molecule that can directly bind a small target molecule, and whose binding to the target affects the activity of the gene. Thus, an mRNA containing a riboswitch is directly involved in regulating its own activity, depending on the presence or absence of its target molecule. Typically, aptamers are engineered to bind to a variety of molecular targets, such as small molecules, proteins, nucleic acids, and even cells, tissues, and organisms, through repeated rounds of in vitro selection or, equivalently, SELEX (systematic evolution of ligands by exponential enrichment). The aptamer may be prepared by any known method, including synthetic, recombinant, and purification methods, and may be used alone or in combination with other aptamers specific for the same target. In addition, the term "aptamer" also includes "secondary aptamers" that contain a consensus sequence derived from comparing two or more known aptamers to a given target. In embodiments, the aptamer is an "intracellular aptamer" or "intramer" that specifically recognizes an intracellular target (Famulok et al., Chem. Biol. (2001), 8 (10): 931-939; Yoon and Rossi, Adv. Drug Deliv. Rev. (2018), 134:22-35; each of which is incorporated herein by reference).
Ribozyme
In embodiments, the AC is a ribozyme. Ribozymes are RNA molecule complexes having specific catalytic domains that possess endonuclease activity (Kim and Cech, Proc. Natl. Acad. Sci. USA (1987), 84 (24): 8788-92; Forster and Symons, Cell (1987) 24,49 (2): 211-20). For example, a large number of ribozymes accelerate phosphoester transfer reactions with a high degree of specificity, often cleaving only one of several phosphoesters in an oligonucleotide substrate (Cech et al., Cell (1981), 27 (3 Pt 2): 487-96; Michel and Westhof, J. Mol. Biol. (1990), 5,216 (3): 585-610; Reinhold-Hurek and Shub, Nature (1992), 14,357 (6374): 173-6). This specificity is attributed to the requirement that the substrate bind via specific base-pairing interactions to the internal guide sequence (IGS) of the ribozyme prior to the chemical reaction.
At least six basic classes of naturally occurring enzymatic RNAs are currently known. Under physiological conditions, each can catalyze the hydrolysis of RNA phosphodiester bonds in trans (and thus can cleave other RNA molecules). Typically, enzymatic nucleic acids act by first binding to the target RNA. Such binding occurs through a target binding portion of the enzymatic nucleic acid that is held in close proximity to the enzymatic portion of the molecule that acts to cleave the target RNA. Thus, the enzymatic nucleic acid first recognizes and then binds to the target RNA by complementary base pairing, and once bound to the correct site, acts enzymatically to cleave the target RNA. Strategic cleavage of such a target RNA destroys its ability to direct the synthesis of the encoded protein. After an enzymatic nucleic acid has bound and cleaved its RNA target, it is released from that RNA to search for another target and can repeatedly bind and cleave new targets.
For example, the enzymatic nucleic acid molecule can be formed in a hammerhead, hairpin, hepatitis delta virus, group I intron, RNase P RNA (in association with an RNA guide sequence), or Neurospora VS RNA motif. Specific examples of hammerhead motifs are described by Rossi et al., Nucleic Acids Res. (1992), 20 (17): 4559-65. Examples of hairpin motifs are described by Hampel et al. (European patent application publication No. EP 0360257); Hampel and Tritz, Biochemistry (1989), 28 (12): 4929-33; Hampel et al., Nucleic Acids Res. (1990), 18 (2): 299-304; and U.S. Pat. No. 5,631,359. An example of the hepatitis delta virus motif is described by Perrotta and Been, Biochemistry (1992), 31 (47): 11843-52; an example of the RNase P motif is described by Guerrier-Takada et al., Cell (1983), 35 (3 Pt 2): 849-57; the Neurospora VS RNA ribozyme motif is described by Collins (Saville and Collins, Cell (1990), 61 (4): 685-96; Saville and Collins, Proc. Natl. Acad. Sci. USA (1991), 88 (19): 8826-30; Collins and Olive, Biochemistry (1993), 32 (11): 2795-9); and examples of group I introns are described in U.S. Pat. No. 4,987,071. In embodiments, the enzymatic nucleic acid molecules have a specific substrate binding site that is complementary to one or more regions of the target gene DNA or RNA, and they have nucleotide sequences within or surrounding that substrate binding site that impart RNA cleaving activity to the molecule. Thus, the ribozyme constructs need not be limited to the specific motifs mentioned herein.
Ribozymes can be designed as described in International patent application publication No. WO 93/23569 and International patent application publication No. WO 94/02595, each specifically incorporated herein by reference, and synthesized as described herein for in vitro and in vivo testing. In embodiments, the ribozyme targets a target nucleotide sequence of a target transcript.
Ribozyme activity can be increased by altering the length of the ribozyme binding arms or by chemically synthesizing ribozymes having the following modifications: modifications that prevent their degradation by serum ribonucleases (see, e.g., international patent application publication No. WO 92/07065; international patent application publication No. WO 93/15187; international patent application publication No. WO 91/03162; European patent application publication No. 92110298.4; U.S. Pat. No. 5,334,711; and international patent application publication No. WO 94/13688, which describe various chemical modifications that can be made to the sugar moieties of enzymatic RNA molecules), modifications that enhance their efficacy in cells, and removal of stem II bases to shorten RNA synthesis times and reduce chemical requirements.
Supermir
In an embodiment, the AC is a supermir. A supermir refers to a single-stranded, double-stranded, or partially double-stranded oligomer or polymer of RNA or DNA or both, or modifications thereof, which has a nucleotide sequence that is substantially identical to that of a miRNA and that is antisense with respect to its target. The term includes oligonucleotides composed of naturally occurring nucleobases, sugars, and covalent internucleoside (backbone) linkages, and which contain at least one non-naturally occurring portion that functions similarly. Such modified or substituted oligonucleotides have desirable properties such as, for example, enhanced cellular uptake, enhanced affinity for nucleic acid targets, and increased stability in the presence of nucleases. In embodiments, the supermir does not comprise a sense strand, and in another embodiment, the supermir does not self-hybridize to a significant extent. A supermir may have secondary structure, but it is substantially single-stranded under physiological conditions. A substantially single-stranded supermir is single-stranded to the extent that less than about 50% (e.g., less than about 40%, about 30%, about 20%, about 10%, or about 5%) of the supermir is duplexed with itself. The supermir may comprise a hairpin segment, e.g., a sequence, for example at the 3' end, that can self-hybridize and form a duplex region, e.g., a duplex region of at least about 1, about 2, about 3, or about 4, or less than about 8, about 7, about 6, or about 5 nucleotides. The duplex regions may be joined by a linker, such as a nucleotide linker, e.g., about 3, about 4, about 5, or about 6 dT, e.g., modified dT. In another embodiment, the supermir is duplexed with a shorter oligonucleotide, e.g., of about 5, about 6, about 7, about 8, about 9, or about 10 nucleotides in length, e.g., at one or both of the 3' and 5' ends of the supermir, or at one end and in the non-terminal or middle region of the supermir.
miRNA mimics
In embodiments, the AC is a miRNA mimic. miRNA mimics represent a class of molecules that can be used to imitate the gene silencing ability of one or more miRNAs. Thus, the term "microRNA mimic" refers to a synthetic non-coding RNA (i.e., a miRNA that is not obtained by purification from a source of endogenous miRNA) that is capable of entering the RNAi pathway and regulating gene expression. miRNA mimics may be designed as mature molecules (e.g., single-stranded) or as mimic precursors (e.g., pri-miRNAs or pre-miRNAs). miRNA mimics may comprise nucleic acids (unmodified or modified nucleic acids), including oligonucleotides, including but not limited to RNA, modified RNA, DNA, modified DNA, locked nucleic acids, or 2'-O,4'-C-ethylene-bridged nucleic acids (ENA), or any combination of the above (including DNA-RNA hybrids). Furthermore, miRNA mimics may include conjugates capable of affecting delivery, intracellular compartmentalization, stability, specificity, functionality, strand usage, and/or potency. In one design, the miRNA mimic is a double-stranded molecule (e.g., having a duplex region between about 16 and about 31 nucleotides in length) and contains one or more sequences having identity to the mature strand of a given miRNA. Modifications may include 2' modifications (including 2'-O-methyl modifications and 2'-F modifications) and internucleoside modifications (e.g., phosphorothioate modifications) on one or both strands of the molecule that enhance nucleic acid stability and/or specificity. In addition, the miRNA mimic may comprise an overhang. The overhang may comprise about 1 to about 6 nucleotides at the 3' or 5' end of either strand, and may be modified to enhance stability or functionality.
In embodiments, the miRNA mimic comprises a duplex region of about 16 to about 31 nucleotides and one or more of the following chemical modification patterns: the sense strand contains 2'-O-methyl modifications of nucleotides 1 and 2 (counted from the 5' end of the sense oligonucleotide) and of all C and U nucleotides; the antisense strand modifications can include 2'-F modification of all C and U nucleotides, phosphorylation of the 5' end of the oligonucleotide, and stabilized internucleoside linkages associated with a 2-nucleotide 3' overhang.
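The sugar modification pattern just described can be sketched in Python as a position-by-position annotation. This is a simplified illustration rather than a synthesis specification: the function name and the example strands are hypothetical, positions are counted from the 5' end of each strand, and the 5' phosphorylation and stabilized internucleoside linkages are not modeled.

```python
# Illustrative sketch only; annotate_mimic and the strands below are hypothetical.
def annotate_mimic(sense: str, antisense: str):
    """Map 1-based positions (from the 5' end) to the sugar modification
    implied by the pattern described above."""
    sense_mods = {}
    for pos, base in enumerate(sense.upper(), start=1):
        # Sense strand: nucleotides 1 and 2, plus every C and U, are 2'-O-methyl.
        if pos <= 2 or base in "CU":
            sense_mods[pos] = "2'-O-methyl"
    # Antisense strand: every C and U carries a 2'-F modification.
    antisense_mods = {
        pos: "2'-F"
        for pos, base in enumerate(antisense.upper(), start=1)
        if base in "CU"
    }
    return sense_mods, antisense_mods


sense_mods, antisense_mods = annotate_mimic("AGCUA", "UAGCU")
print(sorted(sense_mods))      # positions modified on the sense strand
print(sorted(antisense_mods))  # positions modified on the antisense strand
```

The short five-nucleotide strands are used only to keep the output readable; an actual mimic would have the duplex lengths stated above.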
miRNA inhibitors
In embodiments, the AC is a miRNA inhibitor. The terms "antimir," "microRNA inhibitor," "miR inhibitor," and "miRNA inhibitor" are synonymous and refer to an oligonucleotide or modified oligonucleotide that interferes with the ability of a particular miRNA. Typically, the inhibitor is a nucleic acid or modified nucleic acid in nature, including oligonucleotides, including RNA, modified RNA, DNA, modified DNA, locked nucleic acid (LNA), or any combination of the above.
Modifications include 2' modifications (including 2'-O-alkyl modifications and 2'-F modifications) and internucleoside modifications (e.g., phosphorothioate modifications) that can affect delivery, stability, specificity, intracellular compartmentalization, or potency. Furthermore, miRNA inhibitors may include conjugates capable of affecting delivery, intracellular compartmentalization, stability, and/or potency. Inhibitors can adopt a variety of configurations, including single-stranded, double-stranded (RNA/RNA or RNA/DNA duplex), and hairpin designs. Typically, microRNA inhibitors include one or more sequences or portions of sequences that are complementary or partially complementary to the mature strand (or strands) of the miRNA to be targeted. In addition, miRNA inhibitors can include additional sequences located 5' and 3' to the sequence that is the reverse complement of the mature miRNA. The additional sequences may be the reverse complements of the sequences adjacent to the mature miRNA in the pri-miRNA from which the mature miRNA is derived, or the additional sequences may be arbitrary sequences (having a mixture of A, G, C, or U). In embodiments, one or both of the additional sequences are arbitrary sequences capable of forming hairpins. Thus, in embodiments, the sequence that is the reverse complement of the miRNA is flanked on the 5' side and on the 3' side by hairpin structures. When double-stranded, microRNA inhibitors can include mismatches between nucleotides on opposite strands. In addition, microRNA inhibitors can be linked to a conjugate moiety to facilitate uptake of the inhibitor into cells. For example, the microRNA inhibitor can be linked to cholesteryl 5-(bis(4-methoxyphenyl)(phenyl)methoxy)-3-hydroxypentylcarbamate, which allows passive uptake of the microRNA inhibitor into cells.
MicroRNA inhibitors, including hairpin miRNA inhibitors, are described in detail in Vermeulen et al., "Double-Stranded Regions Are Essential Design Components Of Potent Inhibitors of RISC Function," RNA 13:723-730 (2007), and in WO 2007/095387 and WO 2008/036825, each of which is incorporated herein by reference in its entirety. One of ordinary skill in the art can select the sequence of the desired miRNA from a database and design inhibitors useful in the methods disclosed herein.
CRISPR gene editing mechanism
In embodiments, the therapeutic moiety comprises one or more elements of a CRISPR gene editing mechanism. As used herein, "CRISPR gene editing mechanism" refers to a protein, nucleic acid, or combination thereof that can be used to edit a genome. Non-limiting examples of gene editing mechanisms include guide RNAs (gRNAs), nucleases, nuclease inhibitors, and combinations and complexes thereof. The following patent documents describe CRISPR gene editing mechanisms: U.S. Pat. No. 8,697,359, U.S. Pat. No. 8,771,945, U.S. Pat. No. 8,795,965, U.S. Pat. No. 8,865,406, U.S. Pat. No. 8,871,445, U.S. Pat. No. 8,889,356, U.S. Pat. No. 8,895,308, U.S. Pat. No. 8,906,616, U.S. Pat. No. 8,932,814, U.S. Pat. No. 8,945,839, U.S. Pat. No. 8,993,233, U.S. Pat. No. 8,999,641, U.S. patent application Ser. No. 14/704,551, and U.S. patent application Ser. No. 13/842,859. The above patent documents are each incorporated by reference in their entirety.
gRNA
In embodiments, the TM comprises a gRNA. The gRNA targets a genomic locus in a prokaryotic or eukaryotic cell.
In embodiments, the gRNA is a single molecule guide RNA (sgRNA). The sgRNA includes a spacer sequence and a scaffold sequence. The spacer sequence is a short nucleic acid sequence used to target a nuclease (e.g., a Cas9 nuclease) to a particular nucleotide region of interest (e.g., a genomic DNA sequence to be cleaved). In embodiments, the spacer may be about 17-24 bases in length, such as about 20 bases in length. In embodiments, the length of the spacer can be about 15, about 16, about 17, about 18, about 19, about 20, about 21, about 22, about 23, about 24, about 25, about 26, about 27, about 28, about 29, or about 30 bases. In embodiments, the length of the spacer can be at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, at least 25, at least 26, at least 27, at least 28, at least 29, or at least 30 bases. In embodiments, the spacer sequence has a GC content of between about 40% and about 80%.
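As a minimal sketch of how the stated spacer properties might be checked computationally, the following Python functions compute GC content and test a candidate spacer against the about 17-24 base length range and the about 40% to about 80% GC range. The function names, the example sequences, and the treatment of the "about" ranges as hard cutoffs are assumptions made only for illustration.

```python
# Illustrative sketch only; names, sequences, and hard cutoffs are assumptions.
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def spacer_in_range(seq: str, min_len: int = 17, max_len: int = 24,
                    min_gc: float = 0.40, max_gc: float = 0.80) -> bool:
    """True if the spacer length and GC content fall inside the given bounds."""
    return (min_len <= len(seq) <= max_len
            and min_gc <= gc_content(seq) <= max_gc)


candidate = "GACGTTACCGGATCAGCTAA"  # hypothetical 20-base spacer
print(gc_content(candidate), spacer_in_range(candidate))
print(spacer_in_range("ATATATATATATATATATAT"))  # hypothetical AT-only sequence
```

The default bounds are parameters so that the wider length lists recited above (for example, about 15 to about 30 bases) can be substituted without changing the function.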
In embodiments, the spacer binds to a target nucleotide sequence immediately preceding the 5' protospacer adjacent motif (PAM). PAM sequences may be selected based on the desired nuclease. For example, the PAM sequence may be any one of the PAM sequences shown in Table 13 below, wherein N refers to any nucleotide, R refers to A or G, Y refers to C or T, W refers to A or T, and V refers to A or C or G.
TABLE 13 nuclease and PAM sequences
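Independently of the specific entries of Table 13, the degenerate PAM notation defined above (N, R, Y, W, V) can be expanded into a regular expression and used to locate candidate target sites. The Python sketch below is a hedged illustration: the function names and the DNA sequence are hypothetical, the motif NGG is used purely as an example pattern, the PAM is assumed to lie immediately 3' of a 20-base target sequence on the strand being searched, and only that one strand is scanned.

```python
import re

# Degenerate codes as defined in the text, expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "R": "[AG]", "Y": "[CT]", "W": "[AT]", "V": "[ACG]",
}


def pam_to_regex(pam: str) -> str:
    """Translate a degenerate PAM string into a regular expression fragment."""
    return "".join(IUPAC[code] for code in pam.upper())


def find_target_sites(dna: str, pam: str, spacer_len: int = 20):
    """Return (start, target, pam) tuples for every position on the given strand
    where a target of spacer_len bases is immediately followed by the PAM.
    A lookahead is used so that overlapping sites are all reported."""
    pattern = re.compile(
        r"(?=([ACGT]{%d})(%s))" % (spacer_len, pam_to_regex(pam))
    )
    return [(m.start(), m.group(1), m.group(2))
            for m in pattern.finditer(dna.upper())]


dna = "TTGACGTTACCGGATCAGCTAATGGCA"  # hypothetical sequence
for site in find_target_sites(dna, "NGG"):
    print(site)
```

A fuller treatment would also scan the reverse complement strand and would take the PAM and its orientation from the nuclease actually selected.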
In embodiments, the spacer binds to a target nucleotide sequence of a mammalian target transcript of a target gene, such as a human gene. In embodiments, the spacer region can bind to a target nucleotide sequence of a target transcript. In embodiments, the spacer region may bind to a target nucleotide sequence comprising at least a portion of a Splice Element (SE) and/or a Splice Regulatory Element (SRE) of the target transcript or in sufficient proximity to the SE and/or SRE of the target transcript to regulate splicing.
The scaffold sequence is the sequence within the sgRNA responsible for nuclease (e.g., Cas9) binding. The scaffold sequence does not include the spacer/targeting sequence. In embodiments, the length of the scaffold may be about 1 to about 10, about 10 to about 20, about 20 to about 30, about 30 to about 40, about 40 to about 50, about 50 to about 60, about 60 to about 70, about 70 to about 80, about 80 to about 90, about 90 to about 100, about 100 to about 110, about 110 to about 120, or about 120 to about 130 nucleotides. In embodiments, the length of the scaffold can be about 1, about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9, about 10, about 11, about 12, about 13, about 14, about 15, about 16, about 17, about 18, about 19, about 20, about 21, about 22, about 23, about 24, about 25, about 26, about 27, about 28, about 29, about 30, about 31, about 32, about 33, about 34, about 35, about 36, about 37, about 38, about 39, about 40, about 41, about 42, about 43, about 44, about 45, about 46, about 47, about 48, about 49, about 50, about 51, about 52, about 53, about 54, about 55, about 56, about 57, about 58, about 59, about 60, about 61, about 62, about 63, about 64, about 65, about 66, about 67, about 68, about 69, about 70, about 71, about 72, about 73, about 74, about 75, about 76, about 77, about 78, about 79, about 80, about 81, about 82, about 83, about 84, about 85, about 86, about 87, about 88, about 89, about 90, about 91, about 92, about 93, about 94, about 95, about 96, about 97, about 98, about 99, about 100, about 101, about 102, about 103, about 104, about 105, about 106, about 107, about 108, about 109, about 110, about 111, about 112, about 113, about 114, about 115, about 116, about 117, about 118, about 119, about 120, about 121, about 122, about 123, about 124, or about 125 nucleotides.
In embodiments, the length of the scaffold may be at least 10, at least 20, at least 30, at least 40, at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 110, at least 120, or at least 125 nucleotides.
In embodiments, the gRNA is a bimolecular guide RNA, such as crRNA and tracrRNA. In embodiments, the gRNA may further comprise a poly (a) tail.
In embodiments, multiple gRNAs may be used as TMs in a single compound. In embodiments, a TM comprises about 1, about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9, about 10, about 11, about 12, about 13, about 14, about 15, about 16, about 17, about 18, about 19, or about 20 gRNAs. In embodiments, the gRNAs recognize the same target. In embodiments, the gRNAs recognize different targets. In embodiments, the nucleic acid comprising the gRNA comprises a sequence encoding a promoter, wherein the promoter drives expression of the gRNA.
Nuclease
In embodiments, the TM includes a nuclease. In embodiments, the nuclease is a type II, type V-A, type V-B, type V-C, type V-U, or type VI-B nuclease. In embodiments, the nuclease is a transcription activator-like effector nuclease (TALEN), meganuclease, or zinc finger nuclease. In embodiments, the nuclease is a Cas9, Cas12a (Cpf1), Cas12b, Cas12c, Tnp-B like, Cas13a (C2c2), Cas13b, or Cas14 nuclease. For example, in embodiments, the nuclease is a Cas9 nuclease or a Cpf1 nuclease.
In embodiments, the nuclease is a modified form or variant of a Cas9, Cas12a (Cpf1), Cas12b, Cas12c, Tnp-B like, Cas13a (C2c2), Cas13b, or Cas14 nuclease. In embodiments, the nuclease is a modified form or variant of a TAL nuclease, meganuclease, or zinc finger nuclease. A "modified" or "variant" nuclease is, for example, a nuclease that is truncated, fused to another protein (such as another nuclease), catalytically inactivated, or the like. In embodiments, the nuclease may have at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or about 100% sequence identity to a naturally occurring Cas9, Cas12a (Cpf1), Cas12b, Cas12c, Tnp-B like, Cas13a (C2c2), Cas13b, or Cas14 nuclease, or to a TALEN, meganuclease, or zinc finger nuclease. In embodiments, the nuclease is a Cas9 nuclease derived from Streptococcus pyogenes (SpCas9). In embodiments, the nuclease has at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% sequence identity with a Cas9 nuclease derived from Streptococcus pyogenes (SpCas9). In embodiments, the nuclease is a Cas9 derived from Staphylococcus aureus (SaCas9). In embodiments, the nuclease has at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% sequence identity with a Cas9 derived from Staphylococcus aureus (SaCas9). In an embodiment, the Cpf1 is a Cpf1 enzyme from Acidaminococcus (species BV3L6, UniProt accession number U2UMQ6). In embodiments, the nuclease has at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% sequence identity with a Cpf1 enzyme from Acidaminococcus (species BV3L6, UniProt accession number U2UMQ6).
In an embodiment, the Cpf1 is a Cpf1 enzyme from Lachnospiraceae (species ND2006, UniProt accession number A0A182DWE3). In embodiments, the nuclease has at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% sequence identity with a Cpf1 enzyme from Lachnospiraceae. In embodiments, the sequence encoding the nuclease is codon optimized for expression in mammalian cells. In embodiments, the sequence encoding the nuclease is codon optimized for expression in a human cell or a mouse cell.
In embodiments, the nuclease is a soluble protein.
In embodiments, TM is a nucleotide sequence encoding a nuclease. In embodiments, the nucleic acid encoding the nuclease comprises a sequence encoding a promoter, wherein the promoter drives expression of the nuclease.
gRNA and nuclease combinations
In embodiments, the compounds of the present disclosure comprise a gRNA and a nuclease, or a nucleotide sequence encoding a nuclease, as TM. In embodiments, the nucleic acid encoding the nuclease and the gRNA comprises a sequence encoding a promoter, wherein the promoter drives expression of the nuclease and the gRNA. In embodiments, the nucleic acid encoding the nuclease and the gRNA comprises two promoters, wherein a first promoter controls expression of the nuclease and a second promoter controls expression of the gRNA. In embodiments, the nucleic acid encoding the gRNA and the nuclease encodes about 1 to about 20 gRNAs, or about 1, about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9, about 10, about 11, about 12, about 13, about 14, about 15, about 16, about 17, about 18, or about 19, and up to about 20 gRNAs. In embodiments, the gRNAs recognize different targets. In embodiments, the gRNAs recognize the same target.
In embodiments, the compounds of the present disclosure comprise, as TM, a ribonucleoprotein (RNP) comprising a gRNA and a nuclease.
In embodiments, a composition is delivered to a cell, the composition comprising: (a) a first compound comprising a gRNA as TM, and (b) a second compound that is or comprises a nuclease. In embodiments, a composition is delivered to a cell, the composition comprising: (a) a first compound comprising a nuclease as TM and a CPP, and (b) a second molecule that is or comprises a gRNA. In embodiments, a composition is delivered to a cell, the composition comprising: (a) a first compound comprising a gRNA as TM, and (b) a second compound comprising a nuclease as TM.
Genetic element of interest
In embodiments, the compounds disclosed herein comprise a genetic element of interest as a TM. In embodiments, the genetic element of interest replaces a genomic DNA sequence that is cleaved by a nuclease. Non-limiting examples of genetic elements of interest include genes, single nucleotide polymorphisms, promoters or terminators.
Nuclease inhibitors
In embodiments, the compounds disclosed herein comprise a nuclease inhibitor as TM. A limitation of gene editing is potential off-target editing. Delivery of a nuclease inhibitor can limit off-target editing. In embodiments, the nuclease inhibitor is a polypeptide, polynucleotide, or small molecule. Nuclease inhibitors are described in U.S. publication No. 2020/087354, international publication No. 2018/085288, U.S. publication No. 2018/0382741, international publication No. 2019/089761, international publication No. 2020/068304, international publication No. 2020/04384, and international publication No. 2019/076651, each of which is incorporated herein by reference in its entirety.
Endosomal escape vehicle (EEV)
Endosomal escape vehicles (EEVs) may be used to transport cargo across a cell membrane, for example, to deliver cargo to the cytosol or nucleus of a cell. The cargo may comprise a TM. An EEV may comprise a cell penetrating peptide (CPP), such as a cyclic cell penetrating peptide (cCPP). In an embodiment, the EEV comprises a cCPP conjugated to an exocyclic peptide (EP). The EP is interchangeably referred to as a modulatory peptide (MP). The EP may comprise a nuclear localization signal (NLS) sequence. The EP may be coupled to the cargo. The EP may be coupled to the cCPP. The EP may be coupled to both the cargo and the cCPP. The coupling between the EP, cargo, cCPP, or combinations thereof may be non-covalent or covalent. The EP may be attached to the N-terminus of the cCPP by a peptide bond. The EP may be attached to the C-terminus of the cCPP by a peptide bond. The EP may be attached to the cCPP through a side chain of an amino acid in the cCPP. The EP may be attached to the cCPP through the side chain of a lysine, which may be conjugated to the side chain of a glutamine in the cCPP. The EP may be conjugated to the 5' or 3' end of an oligonucleotide cargo. The EP may be coupled to a linker. The exocyclic peptide may be conjugated to an amino group of the linker. The EP may be coupled to the linker via the C-terminus of the EP and to the cCPP via a side chain on the cCPP and/or the EP. For example, an EP may comprise a terminal lysine, which may then be coupled to a glutamine-containing cCPP via an amide linkage. When the EP contains a terminal lysine and the side chain of the lysine is available for attachment to the cCPP, the C-terminus or N-terminus can be attached to a linker on the cargo.
Cyclic exopeptides
The Exocyclic Peptide (EP) may comprise 2 to 10 amino acid residues, for example 2, 3, 4, 5, 6, 7, 8, 9 or 10 amino acid residues, including all ranges and values therebetween. An EP may comprise 6 to 9 amino acid residues. An EP may comprise 4 to 8 amino acid residues.
Each amino acid in the exocyclic peptide may be a natural or unnatural amino acid. The term "unnatural amino acid" refers to an organic compound that is a congener of a natural amino acid in that it has a structure similar to that of a natural amino acid, so that it mimics the structure and reactivity of a natural amino acid. The unnatural amino acid can be a modified amino acid and/or an amino acid analog that is not one of the 20 common naturally occurring amino acids or the rare natural amino acids selenocysteine or pyrrolysine. The unnatural amino acid can also be a D-isomer of a natural amino acid. Examples of suitable amino acids include, but are not limited to, alanine, alloisoleucine, arginine, citrulline, asparagine, aspartic acid, cysteine, glutamine, glutamic acid, glycine, histidine, isoleucine, leucine, lysine, methionine, naphthylalanine, phenylalanine, proline, pyroglutamic acid, serine, threonine, tryptophan, tyrosine, valine, derivatives thereof, or combinations thereof. These and other amino acids are listed in Table 1 along with their abbreviations used herein. For example, the amino acid may be A, G, P, K, R, V, F, H, Nal, or citrulline.
The EP may comprise at least one positively charged amino acid residue, e.g. at least one lysine residue and/or at least one amino acid residue comprising a side chain comprising a guanidine group or a protonated form thereof. An EP may comprise 1 or 2 amino acid residues comprising a side chain comprising a guanidine group or a protonated form thereof. The amino acid residue comprising a guanidine-containing side chain may be an arginine residue. Protonated forms may refer to salts thereof throughout the disclosure.
The EP may comprise at least two, at least three, or at least four or more lysine residues. An EP may comprise 2, 3, or 4 lysine residues. The amino group on the side chain of each lysine residue may be substituted with a protecting group including, for example, trifluoroacetyl (-COCF3), allyloxycarbonyl (Alloc), 1-(4,4-dimethyl-2,6-dioxocyclohexylidene)ethyl (Dde), or (4,4-dimethyl-2,6-dioxocyclohex-1-ylidene-3)-methylbutyl (ivDde). The amino group on the side chain of each lysine residue may be substituted with a trifluoroacetyl group (-COCF3). Protecting groups may be included to enable amide conjugation. The protecting group may be removed after conjugation of the EP to the cCPP.
An EP may comprise at least 2 amino acid residues with hydrophobic side chains. Amino acid residues having hydrophobic side chains may be selected from valine, proline, alanine, leucine, isoleucine and methionine. The amino acid residue having a hydrophobic side chain may be valine or proline.
The EP may comprise at least one positively charged amino acid residue, e.g. at least one lysine residue and/or at least one arginine residue. The EP may comprise at least two, at least three or at least four or more lysine residues and/or arginine residues.
The EP may comprise KK, KR, RR, HH, HK, HR, RH, KKK, KGK, KBK, KBR, KRK, KRR, RKK, RRR, KKH, KHK, HKK, HRR, HRH, HHR, HBH, HHH, HHHH (SEQ ID NO: 1), KHKK (SEQ ID NO: 2), KKHK (SEQ ID NO: 3), KKKH (SEQ ID NO: 4), KHKH (SEQ ID NO: 5), HKHK (SEQ ID NO: 6), KKKK (SEQ ID NO: 7), KKRK (SEQ ID NO: 8), KRKK (SEQ ID NO: 9), KRRK (SEQ ID NO: 10), RKKR (SEQ ID NO: 11), RRRR (SEQ ID NO: 12), KGKK (SEQ ID NO: 13), KKGK (SEQ ID NO: 14), HBHBH (SEQ ID NO: 15), HBKBH (SEQ ID NO: 16), RRRRR (SEQ ID NO: 17), KKKKK (SEQ ID NO: 18), KKKRK (SEQ ID NO: 19), RKKKK (SEQ ID NO: 20), KRKKK (SEQ ID NO: 21), KKRKK (SEQ ID NO: 22), KKKKR (SEQ ID NO: 23), KBKBK (SEQ ID NO: 24), RKKKKG (SEQ ID NO: 25), KRKKKG (SEQ ID NO: 26), KKRKKG (SEQ ID NO: 27), KKKKRG (SEQ ID NO: 28), RKKKKB (SEQ ID NO: 29), KRKKKB (SEQ ID NO: 30), KKRKKB (SEQ ID NO: 31), KKKKRB (SEQ ID NO: 32), KKKRKV (SEQ ID NO: 33), RRRRRR (SEQ ID NO: 34), HHHHHH (SEQ ID NO: 35), RHRHRH (SEQ ID NO: 36), HRHRHR (SEQ ID NO: 37), KRKRKR (SEQ ID NO: 38), RKRKRK (SEQ ID NO: 39), RBRBRB (SEQ ID NO: 40), KBKBKB (SEQ ID NO: 41), PKKKRKV (SEQ ID NO: 42), PGKKRKV (SEQ ID NO: 43), PKGKRKV (SEQ ID NO: 44), PKKGRKV (SEQ ID NO: 45), PKKKGKV (SEQ ID NO: 46), PKKKRGV (SEQ ID NO: 47), or PKKKRKG (SEQ ID NO: 48), wherein B is beta-alanine. The amino acids in the EP may have D or L stereochemistry.
The EP may comprise KK, KR, RR, KKK, KGK, KBK, KBR, KRK, KRR, RKK, RRR, KKKK (SEQ ID NO: 7), KKRK (SEQ ID NO: 8), KRKK (SEQ ID NO: 9), KRRK (SEQ ID NO: 10), RKKR (SEQ ID NO: 11), RRRR (SEQ ID NO: 12), KGKK (SEQ ID NO: 13), KKGK (SEQ ID NO: 14), KKKKK (SEQ ID NO: 18), KKKRK (SEQ ID NO: 19), KBKBK (SEQ ID NO: 24), KKKRKV (SEQ ID NO: 33), PKKKRKV (SEQ ID NO: 42), PGKKRKV (SEQ ID NO: 43), PKGKRKV (SEQ ID NO: 44), PKKGRKV (SEQ ID NO: 45), PKKKGKV (SEQ ID NO: 46), PKKKRGV (SEQ ID NO: 47), or PKKKRKG (SEQ ID NO: 48). The EP may comprise PKKKRKV (SEQ ID NO: 42), RR, RRR, RHR, RBR, RBRBR (SEQ ID NO: 49), RBHBR (SEQ ID NO: 50), or HBRBH (SEQ ID NO: 51), wherein B is beta-alanine. The amino acids in the EP may have D or L stereochemistry.
The EP may consist of KK, KR, RR, KKK, KGK, KBK, KBR, KRK, KRR, RKK, RRR, KKKK (SEQ ID NO: 7), KKRK (SEQ ID NO: 8), KRKK (SEQ ID NO: 9), KRRK (SEQ ID NO: 10), RKKR (SEQ ID NO: 11), RRRR (SEQ ID NO: 12), KGKK (SEQ ID NO: 13), KKGK (SEQ ID NO: 14), KKKKK (SEQ ID NO: 18), KKKRK (SEQ ID NO: 19), KBKBK (SEQ ID NO: 24), KKKRKV (SEQ ID NO: 33), PKKKRKV (SEQ ID NO: 42), PGKKRKV (SEQ ID NO: 43), PKGKRKV (SEQ ID NO: 44), PKKGRKV (SEQ ID NO: 45), PKKKGKV (SEQ ID NO: 46), PKKKRGV (SEQ ID NO: 47), or PKKKRKG (SEQ ID NO: 48). The EP may consist of PKKKRKV (SEQ ID NO: 42), RR, RRR, RHR, RBR, RBRBR (SEQ ID NO: 49), RBHBR (SEQ ID NO: 50), or HBRBH (SEQ ID NO: 51), wherein B is beta-alanine. The amino acids in the EP may have D or L stereochemistry.
An EP may comprise an amino acid sequence identified in the art as a nuclear localization sequence (NLS). An EP may consist of an amino acid sequence identified in the art as a nuclear localization sequence (NLS). The EP may comprise an NLS comprising the amino acid sequence PKKKRKV (SEQ ID NO: 42). The EP may consist of an NLS comprising the amino acid sequence PKKKRKV (SEQ ID NO: 42). The EP may comprise an NLS comprising an amino acid sequence selected from NLSKRPAAIKKAGQAKKKK (SEQ ID NO: 52), PAAKRVKLD (SEQ ID NO: 53), RQRRNELKRSF (SEQ ID NO: 54), RMRKFKNKGKDTAELRRRRVEVSVELR (SEQ ID NO: 55), KAKKDEQILKRRNV (SEQ ID NO: 56), VSRKRPRP (SEQ ID NO: 57), PPKKARED (SEQ ID NO: 58), PQPKKKPL (SEQ ID NO: 59), SALIKKKKKMAP (SEQ ID NO: 60), DRLRR (SEQ ID NO: 61), PKQKKRK (SEQ ID NO: 62), RKLKKKIKKL (SEQ ID NO: 63), REKKKFLKRR (SEQ ID NO: 64), KRKGDEVDGVDEVAKKKSKK (SEQ ID NO: 65), and RKCLQAGMNLEARKTKK (SEQ ID NO: 66). The EP may consist of an NLS comprising an amino acid sequence selected from NLSKRPAAIKKAGQAKKKK (SEQ ID NO: 52), PAAKRVKLD (SEQ ID NO: 53), RQRRNELKRSF (SEQ ID NO: 54), RMRKFKNKGKDTAELRRRRVEVSVELR (SEQ ID NO: 55), KAKKDEQILKRRNV (SEQ ID NO: 56), VSRKRPRP (SEQ ID NO: 57), PPKKARED (SEQ ID NO: 58), PQPKKKPL (SEQ ID NO: 59), SALIKKKKKMAP (SEQ ID NO: 60), DRLRR (SEQ ID NO: 61), PKQKKRK (SEQ ID NO: 62), RKLKKKIKKL (SEQ ID NO: 63), REKKKFLKRR (SEQ ID NO: 64), KRKGDEVDGVDEVAKKKSKK (SEQ ID NO: 65), and RKCLQAGMNLEARKTKK (SEQ ID NO: 66).
All exocyclic sequences may also contain an N-terminal acetyl group. Thus, for example, an EP may have the following structure: Ac-PKKKRKV (SEQ ID NO: 42).
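By way of illustration only, the compositional features of an EP described above (2 to 10 residues in certain embodiments, lysine and arginine content, hydrophobic residues, an optional N-terminal acetyl group, and the presence of the NLS of SEQ ID NO: 42) can be tallied from a one-letter sequence. The following Python sketch is not part of the disclosure; the function name `describe_ep`, its return keys, and the `Ac-` prefix handling are assumptions made for the example.

```python
# Illustrative sketch only; describe_ep and its keys are hypothetical names.
HYDROPHOBIC = set("VPALIM")  # valine, proline, alanine, leucine, isoleucine, methionine
NLS_SEQ_ID_42 = "PKKKRKV"    # SEQ ID NO: 42 as listed in the text


def describe_ep(sequence: str) -> dict:
    """Summarize an exocyclic peptide given in one-letter code.

    'B' denotes beta-alanine, as in the sequence lists above.
    An optional 'Ac-' prefix marks an N-terminal acetyl group.
    """
    acetylated = sequence.startswith("Ac-")
    residues = sequence[3:] if acetylated else sequence
    return {
        "length": len(residues),
        # 2 to 10 residues is one embodiment; NLS-based EPs may be longer.
        "length_2_to_10": 2 <= len(residues) <= 10,
        "lysines": residues.count("K"),
        "arginines": residues.count("R"),
        "hydrophobic": sum(1 for r in residues if r in HYDROPHOBIC),
        "contains_seq_id_42": NLS_SEQ_ID_42 in residues,
        "acetylated": acetylated,
    }


summary = describe_ep("Ac-PKKKRKV")
print(summary)
```

The sketch only counts residues; it does not assess stereochemistry, protecting groups, or conjugation chemistry.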
Cell Penetrating Peptide (CPP)
The cell penetrating peptide (CPP) may comprise from 6 to 20 amino acid residues. The cell penetrating peptide may be a cyclic cell penetrating peptide (cCPP). The cCPP is capable of penetrating a cell membrane. An exocyclic peptide (EP) may be conjugated to the cCPP, and the resulting construct may be referred to as an endosomal escape vehicle (EEV). The cCPP can direct cargo (e.g., a therapeutic moiety (TM), such as an oligonucleotide, peptide, or small molecule) to penetrate a cell membrane. The cCPP can deliver cargo to the cytosol of a cell. The cCPP can deliver cargo to the cellular location where the target (e.g., pre-mRNA) is located. To conjugate the cCPP to a cargo (e.g., peptide, oligonucleotide, or small molecule), at least one bond or lone pair of electrons on the cCPP may be replaced.
The total number of amino acid residues in the cCPP is in the range of 6 to 20 amino acid residues, e.g., 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acid residues, including all ranges and subranges therebetween. The cCPP may comprise from 6 to 13 amino acid residues. The cCPP disclosed herein may comprise from 6 to 10 amino acids. By way of example, a cCPP comprising 6-10 amino acid residues may have a structure according to any one of Formulas I-A to I-E:
wherein AA1, AA2, AA3, AA4, AA5, AA6, AA7, AA8, AA9, and AA10 are amino acid residues.
The cCPP may comprise from 6 to 8 amino acids. The cCPP may comprise 8 amino acids.
Each amino acid in the cCPP may be a natural or unnatural amino acid. The term "unnatural amino acid" refers to an organic compound that is a congener of a natural amino acid in that it has a structure similar to that of a natural amino acid, so that it mimics the structure and reactivity of a natural amino acid. The unnatural amino acid can be a modified amino acid and/or an amino acid analog that is not one of the 20 common naturally occurring amino acids or the rare natural amino acids selenocysteine or pyrrolysine. The unnatural amino acid can also be a D-isomer of a natural amino acid. Examples of suitable amino acids include, but are not limited to, alanine, alloisoleucine, arginine, citrulline, asparagine, aspartic acid, cysteine, glutamine, glutamic acid, glycine, histidine, isoleucine, leucine, lysine, methionine, naphthylalanine, phenylalanine, proline, pyroglutamic acid, serine, threonine, tryptophan, tyrosine, valine, derivatives thereof, or combinations thereof. These and other amino acids are listed in Table 1 along with their abbreviations used herein.
TABLE 1 amino acid abbreviations
As used herein, "polyethylene glycol" and "PEG" are used interchangeably. In embodiments, n is 1 or 2. In embodiments, n is 1. In embodiments, n is 2. In embodiments, n is 1 and m is 4. In embodiments, n is 2 and m is 4. In embodiments, n is 1 and m is 12.
As used herein, "miniPEGm" or "miniPEG m" is or derives from a molecule of the formula HO(CO)-(CH2)n-(OCH2CH2)m-NH2, where n is 1 and m is any integer from 1 to 23. For example, "miniPEG2" or "miniPEG 2" is or derives from 2-[2-(2-aminoethoxy)ethoxy]acetic acid, and "miniPEG4" or "miniPEG 4" is or derives from HO(CO)-(CH2)n-(OCH2CH2)m-NH2, where n is 1 and m is 4.
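As a worked illustration of the general formula HO(CO)-(CH2)n-(OCH2CH2)m-NH2, the atom counts and average molecular mass of a miniPEG can be derived directly from n and m. The Python sketch below is illustrative only; the function names are hypothetical and the atomic masses are standard average values rather than values taken from this disclosure.

```python
# Illustrative sketch only; function names are hypothetical.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}  # average masses, g/mol


def minipeg_atoms(m: int, n: int = 1) -> dict:
    """Atom counts for HO(CO)-(CH2)n-(OCH2CH2)m-NH2."""
    if not 1 <= m <= 23:
        raise ValueError("m must be an integer from 1 to 23")
    return {
        "C": 1 + n + 2 * m,          # carbonyl C + (CH2)n + two C per ethylene oxide unit
        "H": 1 + 2 * n + 4 * m + 2,  # acid OH + (CH2)n + four H per unit + NH2
        "N": 1,
        "O": 2 + m,                  # two acid O + one ether O per unit
    }


def minipeg_mass(m: int, n: int = 1) -> float:
    """Average molecular mass in g/mol."""
    atoms = minipeg_atoms(m, n)
    return sum(ATOMIC_MASS[element] * count for element, count in atoms.items())


minipeg2 = minipeg_atoms(2)   # C6H13NO4
mass2 = minipeg_mass(2)       # about 163.17 g/mol
print(minipeg2, round(mass2, 2))
```

For miniPEG2 (n = 1, m = 2) the counts give C6H13NO4, which agrees with 2-[2-(2-aminoethoxy)ethoxy]acetic acid.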
The cCPP may comprise from 4 to 20 amino acids, wherein: (i) at least one amino acid has a side chain comprising a guanidino group or a protonated form thereof; (ii) at least one amino acid has no side chain or has a side chain comprising [structure not shown] or a protonated form thereof; and (iii) at least two amino acids independently have side chains comprising aromatic or heteroaromatic groups.
At least two amino acids may have no side chain or may have a side chain comprising [structure not shown] or a protonated form thereof. As used herein, when a side chain is not present, the amino acid has two hydrogen atoms (e.g., -CH2-) on the carbon atom linking the amine and the carboxylic acid.
The amino acid without a side chain may be glycine or β -alanine.
The cCPP may comprise 6 to 20 amino acid residues that form the cCPP, wherein: (i) at least one amino acid may be a glycine, beta-alanine, or 4-aminobutyric acid residue; (ii) at least one amino acid may have a side chain comprising an aryl or heteroaryl group; and (iii) at least one amino acid has a side chain comprising a guanidine group, [structure not shown], or a protonated form thereof.
The cCPP may comprise 6 to 20 amino acid residues that form the cCPP, wherein: (i) at least two amino acids may independently be glycine, beta-alanine, or 4-aminobutyric acid residues; (ii) at least one amino acid may have a side chain comprising an aryl or heteroaryl group; and (iii) at least one amino acid has a side chain comprising a guanidine group, [structure not shown], or a protonated form thereof.
The cCPP may comprise 6 to 20 amino acid residues that form the cCPP, wherein: (i) at least three amino acids may independently be glycine, beta-alanine, or 4-aminobutyric acid residues; (ii) at least one amino acid may have a side chain comprising an aromatic or heteroaromatic group; and (iii) at least one amino acid may have a side chain comprising a guanidine group, [structure not shown], or a protonated form thereof.
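By way of illustration only, the three compositional criteria recited above can be expressed as a simple check over a list of residue abbreviations. The Python sketch below is not part of the disclosure: the function name, the residue groupings, and the example sequence are assumptions, and the check recognizes only guanidine-bearing residues for criterion (iii) because the alternative groups are not reproduced here.

```python
# Illustrative sketch only; names, groupings, and the example sequence are hypothetical.
NO_SIDE_CHAIN = {"Gly", "bAla", "GABA"}   # glycine, beta-alanine, 4-aminobutyric acid
AROMATIC = {"Phe", "Nal", "Trp", "Tyr"}   # a few residues with aryl/heteroaryl side chains
GUANIDINE = {"Arg", "hArg"}               # arginine, homoarginine


def meets_composition(residues, min_no_side_chain=1):
    """Return True if a residue list satisfies criteria (i) to (iii).

    min_no_side_chain selects the 'at least one/two/three' variant of (i).
    """
    if not 6 <= len(residues) <= 20:
        return False
    n_plain = sum(r in NO_SIDE_CHAIN for r in residues)
    n_aromatic = sum(r in AROMATIC for r in residues)
    n_guanidine = sum(r in GUANIDINE for r in residues)
    return n_plain >= min_no_side_chain and n_aromatic >= 1 and n_guanidine >= 1


# Arbitrary example sequence, not a compound taken from the disclosure.
example = ["Phe", "Nal", "Arg", "Gly", "Arg", "Gly"]
print(meets_composition(example, 1), meets_composition(example, 3))
```

The `min_no_side_chain` argument corresponds to the three successive embodiments, which differ only in how many glycine, beta-alanine, or 4-aminobutyric acid residues are required.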
Glycine and related amino acid residues
The cCPP may comprise (i) 1, 2, 3, 4, 5, or 6 glycine, β-alanine, or 4-aminobutyric acid residues, or a combination thereof. The cCPP may comprise (i) 2, 3, 4, 5, or 6 such residues. The cCPP may comprise (i) 3, 4, or 5 such residues. The cCPP may comprise (i) 3 or 4 such residues.
The cCPP may comprise (i) 1, 2, 3, 4, 5, or 6 glycine residues. The cCPP may comprise (i) 2, 3, 4, 5, or 6 glycine residues. The cCPP may comprise (i) 3, 4, or 5 glycine residues. The cCPP may comprise (i) 3 or 4 glycine residues. The cCPP may comprise (i) 2 or 3 glycine residues. The cCPP may comprise (i) 1 or 2 glycine residues.
The cCPP may comprise (i) 3, 4, 5, or 6 glycine, β-alanine, or 4-aminobutyric acid residues, or a combination thereof. The cCPP may comprise (i) 3 such residues, 4 such residues, 5 such residues, or 6 such residues. The cCPP may comprise (i) 3, 4, or 5 such residues. The cCPP may comprise (i) 3 or 4 such residues.
The cCPP may comprise at least three glycine residues. The cCPP may comprise (i) 3, 4, 5, or 6 glycine residues. The cCPP may comprise (i) 3 glycine residues, 4 glycine residues, 5 glycine residues, or 6 glycine residues. The cCPP may comprise (i) 3, 4, or 5 glycine residues. The cCPP may comprise (i) 3 or 4 glycine residues.
In embodiments, none of the glycine, β-alanine, or 4-aminobutyric acid residues in the cCPP are contiguous. Two or three glycine, β-alanine, or 4-aminobutyric acid residues may be contiguous. Two glycine, β-alanine, or 4-aminobutyric acid residues may be contiguous.
In embodiments, none of the glycine residues in the cCPP are contiguous. Each glycine residue in the cCPP may be separated by an amino acid residue that is not glycine. Two or three glycine residues may be contiguous. Two glycine residues may be contiguous.
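Because the peptide is cyclic, a contiguity check has to treat the last and first residues as neighbors. The Python sketch below shows one way to find the longest run of selected residues around a ring. It is illustrative only: the function name is hypothetical, and it assumes a head-to-tail ring in which the list order follows the ring order.

```python
# Illustrative sketch only; assumes list order follows the ring, with last and first adjacent.
def longest_cyclic_run(residues, targets):
    """Length of the longest contiguous run of target residues around a cyclic peptide."""
    flags = [r in targets for r in residues]
    if all(flags):
        return len(flags)
    if not any(flags):
        return 0
    # Rotate so the list starts at a non-target residue; no run can then wrap around.
    start = flags.index(False)
    rotated = flags[start:] + flags[:start]
    best = run = 0
    for is_target in rotated:
        run = run + 1 if is_target else 0
        best = max(best, run)
    return best


GLY_LIKE = {"Gly", "bAla", "GABA"}
wrapped = longest_cyclic_run(["Gly", "Phe", "Arg", "Nal", "Arg", "Gly"], GLY_LIKE)
separated = longest_cyclic_run(["Gly", "Phe", "Gly", "Arg", "Nal", "Arg"], GLY_LIKE)
print(wrapped, separated)
```

A result of 1 corresponds to the embodiment in which none of the selected residues are contiguous; a result of 2 or 3 corresponds to the embodiments in which two or three are contiguous.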
Amino acid side chains with aromatic or heteroaromatic groups
The cCPP may comprise (ii) 2, 3, 4, 5, or 6 amino acid residues independently having a side chain comprising an aromatic or heteroaromatic group. The cCPP may comprise (ii) 2 such residues, 3 such residues, 4 such residues, 5 such residues, or 6 such residues. The cCPP may comprise (ii) 2, 3, or 4 such residues. The cCPP may comprise (ii) 2 or 3 such residues.
The cCPP may comprise (ii) 2, 3, 4, 5, or 6 amino acid residues independently having a side chain comprising an aromatic group. The cCPP may comprise (ii) 2 such residues, 3 such residues, 4 such residues, 5 such residues, or 6 such residues. The cCPP may comprise (ii) 2, 3, or 4 such residues. The cCPP may comprise (ii) 2 or 3 such residues.
The aromatic group may be a 6 to 14 membered aryl group. Aryl groups may be phenyl, naphthyl or anthracenyl, each of which is optionally substituted. Aryl groups may be phenyl or naphthyl, each of which is optionally substituted. The heteroaromatic group may be a 6 to 14 membered heteroaryl group having 1, 2 or 3 heteroatoms selected from N, O and S. Heteroaryl may be pyridinyl, quinolinyl or isoquinolinyl.
Amino acid residues having a side chain comprising an aromatic or heteroaromatic group may each independently be bis(homonaphthylalanine), homonaphthylalanine, naphthylalanine, phenylglycine, bis(homophenylalanine), homophenylalanine, phenylalanine, tryptophan, 3-(3-benzothienyl)-alanine, 3-(2-quinolinyl)-alanine, O-benzylserine, 3-(4-(benzyloxy)phenyl)-alanine, S-(4-methylbenzyl)cysteine, N-(naphthalen-2-yl)glutamine, 3-(1,1'-biphenyl-4-yl)-alanine, or tyrosine, each of which is optionally substituted with one or more substituents. Amino acids having side chains comprising aromatic or heteroaromatic groups may each be independently selected from:
wherein the H on the N-terminus and/or the H on the C-terminus is replaced by a peptide bond.
The amino acid residues having a side chain comprising an aromatic or heteroaromatic group may each independently be a residue of phenylalanine, naphthylalanine, phenylglycine, homophenylalanine, homonaphthylalanine, bis(homophenylalanine), bis(homonaphthylalanine), tryptophan, or tyrosine, each of which is optionally substituted with one or more substituents. The amino acid residues having a side chain comprising an aromatic group may each independently be a residue of tyrosine, phenylalanine, 1-naphthylalanine, 2-naphthylalanine, tryptophan, 3-benzothienylalanine, 4-phenylphenylalanine, 3,4-difluorophenylalanine, 4-trifluoromethylphenylalanine, 2,3,4,5,6-pentafluorophenylalanine, homophenylalanine, β-homophenylalanine, 4-tert-butyl-phenylalanine, 4-pyridylalanine, 3-pyridylalanine, 4-methylphenylalanine, 4-fluorophenylalanine, 4-chlorophenylalanine, or 3-(9-anthryl)-alanine. The amino acid residues having a side chain comprising an aromatic group may each independently be a residue of phenylalanine, naphthylalanine, phenylglycine, homophenylalanine, or homonaphthylalanine, each of which is optionally substituted with one or more substituents. The amino acid residues having a side chain comprising an aromatic group may each independently be a residue of phenylalanine, naphthylalanine, homophenylalanine, homonaphthylalanine, or bis(homonaphthylalanine), each of which is optionally substituted with one or more substituents. The amino acid residues having a side chain comprising an aromatic group may each independently be a residue of phenylalanine or naphthylalanine, each of which is optionally substituted with one or more substituents. At least one amino acid residue having a side chain comprising an aromatic group may be a residue of phenylalanine. At least two amino acid residues having a side chain comprising an aromatic group may be residues of phenylalanine.
Each amino acid residue having a side chain comprising an aromatic group may be a residue of phenylalanine.
In embodiments, none of the amino acids having side chains comprising aromatic or heteroaromatic groups are contiguous. Two amino acids having side chains comprising aromatic or heteroaromatic groups may be contiguous. The two contiguous amino acids may have opposite stereochemistry. The two contiguous amino acids may have the same stereochemistry. Three amino acids having side chains comprising aromatic or heteroaromatic groups may be contiguous. The three contiguous amino acids may have the same stereochemistry. The three contiguous amino acids may have alternating stereochemistry.
The amino acid residue comprising an aromatic or heteroaromatic group may be an L-amino acid. The amino acid residue comprising an aromatic or heteroaromatic group may be a D-amino acid. The amino acid residue comprising an aromatic or heteroaromatic group may be a mixture of D-amino acids and L-amino acids.
The optional substituent may be any atom or group that does not significantly reduce (e.g., more than 50%) the cytoplasmic delivery efficiency of cCPP, e.g., as compared to an otherwise identical sequence without the substituent. The optional substituents may be hydrophobic or hydrophilic substituents. The optional substituents may be hydrophobic substituents. Substituents may increase the solvent accessible surface area (as defined herein) of the hydrophobic amino acid. The substituents may be halogen, alkyl, alkenyl, alkynyl, cycloalkyl, cycloalkenyl, cycloalkynyl, heterocyclyl, aryl, heteroaryl, alkoxy, aryloxy, acyl, alkylcarbamoyl, alkylcarboxamide, alkoxycarbonyl, alkylthio or arylthio. The substituent may be halogen.
While not wishing to be bound by theory, it is believed that amino acids having side chains comprising aromatic or heteroaromatic groups with higher hydrophobicity values may improve the cytosolic delivery efficiency of the cCPP relative to amino acids having lower hydrophobicity values. Each hydrophobic amino acid may independently have a hydrophobicity value greater than that of glycine. Each hydrophobic amino acid may independently have a hydrophobicity value greater than that of alanine. Each hydrophobic amino acid may independently have a hydrophobicity value greater than or equal to that of phenylalanine. Hydrophobicity can be measured using a hydrophobicity scale known in the art. The hydrophobicity values of the various amino acids are listed in Table 2, as reported by Eisenberg and Weiss (Proc. Natl. Acad. Sci. U.S.A. 1984; 81(1): 140-144), Engelman et al. (Ann. Rev. of Biophys. Biophys. Chem. 1986; 15: 321-53), Kyte and Doolittle (J. Mol. Biol. 1982; 157(1): 105-132), Hopp and Woods (Proc. Natl. Acad. Sci. U.S.A. 1981; 78(6): 3824-3828), and Janin (Nature. 1979; 277(5696): 491-492), each of which is incorporated herein by reference in its entirety. Hydrophobicity can be measured using the hydrophobicity scale reported by Engelman et al.
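As a hedged illustration of comparing hydrophobicity values against a reference residue, the Python sketch below uses the published Kyte and Doolittle hydropathy index, one of the scales cited above. It is chosen here only for illustration, it is not a reproduction of Table 2, and the ranking may differ on other scales such as that of Engelman et al. The function name is hypothetical.

```python
# Illustrative sketch only. Values are the published Kyte-Doolittle hydropathy indices;
# the helper name is hypothetical and other scales may rank residues differently.
KYTE_DOOLITTLE = {
    "Ile": 4.5, "Val": 4.2, "Leu": 3.8, "Phe": 2.8, "Cys": 2.5,
    "Met": 1.9, "Ala": 1.8, "Gly": -0.4, "Thr": -0.7, "Ser": -0.8,
    "Trp": -0.9, "Tyr": -1.3, "Pro": -1.6, "His": -3.2, "Glu": -3.5,
    "Gln": -3.5, "Asp": -3.5, "Asn": -3.5, "Lys": -3.9, "Arg": -4.5,
}


def more_hydrophobic_than(reference, inclusive=False, scale=KYTE_DOOLITTLE):
    """Residues whose value exceeds (or, if inclusive, equals or exceeds) the reference."""
    threshold = scale[reference]
    hits = [
        residue for residue, value in scale.items()
        if value > threshold or (inclusive and value == threshold)
    ]
    return sorted(hits, key=lambda residue: scale[residue], reverse=True)


above_gly = more_hydrophobic_than("Gly")
above_ala = more_hydrophobic_than("Ala")
at_least_phe = more_hydrophobic_than("Phe", inclusive=True)
print(above_gly, above_ala, at_least_phe)
```

On this particular scale tryptophan and tyrosine fall below glycine, which illustrates why the choice of scale matters when applying the thresholds described above.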
TABLE 2 amino acid hydrophobicity
The size of the aromatic or heteroaromatic groups may be selected to improve the cytosolic delivery efficiency of the cCPP. While not wishing to be bound by theory, it is believed that larger aromatic or heteroaromatic groups on the amino acid side chains may improve cytosolic delivery efficiency compared to otherwise identical sequences with smaller hydrophobic amino acids. The size of the hydrophobic amino acid may be measured in terms of the molecular weight of the hydrophobic amino acid, the steric effect of the hydrophobic amino acid, the solvent accessible surface area (SASA) of the side chain, or a combination thereof. The size of the hydrophobic amino acid may be measured in terms of molecular weight, in which case the larger hydrophobic amino acid has a side chain with a molecular weight of at least about 90 g/mol, at least about 130 g/mol, or at least about 141 g/mol. The size of an amino acid may be measured in terms of the SASA of the hydrophobic side chain. The hydrophobic amino acid may have a side chain with a SASA greater than or equal to that of alanine, or greater than or equal to that of glycine. The larger hydrophobic amino acid may have a side chain with a SASA greater than that of alanine or greater than that of glycine. The hydrophobic amino acid may have an aromatic or heteroaromatic group with a SASA greater than or equal to about that of piperidine-2-carboxylic acid, greater than or equal to about that of tryptophan, greater than or equal to about that of phenylalanine, or greater than or equal to about that of naphthylalanine. The first hydrophobic amino acid (AAH1) may have a side chain with a SASA of at least about a specified threshold [threshold values in Å² not shown].
The second hydrophobic amino acid (AAH2) may have a side chain with a SASA of at least about a specified threshold [threshold values in Å² not shown]. The side chains of AAH1 and AAH2 may have a combined SASA of at least about a specified threshold [threshold values in Å² not shown]. AAH2 may be a hydrophobic amino acid residue whose side chain SASA is less than or equal to the SASA of the hydrophobic side chain of AAH1. By way of example and not limitation, a cCPP having a Nal-Arg motif may exhibit improved cytosolic delivery efficiency compared to an otherwise identical cCPP having a Phe-Arg motif; a cCPP having a Phe-Nal-Arg motif may exhibit improved cytosolic delivery efficiency compared to an otherwise identical cCPP having a Nal-Phe-Arg motif; and the Phe-Nal-Arg motif may exhibit improved cytosolic delivery efficiency compared to an otherwise identical cCPP with the Nal-Phe-Arg motif.
As used herein, "hydrophobic surface area" or "SASA" refers to the surface area of an amino acid side chain that is accessible to a solvent (reported in square angstroms; Å²). SASA can be calculated using the "rolling ball" algorithm developed by Shrake & Rupley (J Mol Biol. 79(2): 351-71), which is incorporated by reference in its entirety for all purposes. This algorithm uses a solvent "sphere" of a specific radius to probe the surface of the molecule. A typical value for the sphere radius is 1.4 Å, which approximates the radius of a water molecule.
The SASA values for some of the side chains are shown in Table 3 below. The SASA values described herein are based on the theoretical values listed in Table 3 below, as reported by Tien et al. (PLOS ONE 8(11): e80635, available from doi.org/10.1371/journal.pone.0080635), which is incorporated herein by reference in its entirety for all purposes.
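The rolling-ball idea can be sketched in a few lines of Python. The example below is a minimal, unoptimized illustration of the Shrake and Rupley approach, not the implementation used to obtain any value in this disclosure. The function names, the golden-spiral point layout, the default of 960 test points, and the example coordinates and radii are assumptions made for the sketch; the default probe radius of 1.4 Å is the value commonly used to approximate a water molecule.

```python
import math

# Illustrative sketch only; names, point count, and example atoms are assumptions.


def sphere_points(n):
    """Approximately uniform points on a unit sphere (golden-spiral layout)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n
        ring = math.sqrt(1.0 - z * z)
        points.append((ring * math.cos(golden_angle * i),
                       ring * math.sin(golden_angle * i), z))
    return points


def shrake_rupley(atoms, probe=1.4, n_points=960):
    """Per-atom SASA in square angstroms.

    atoms: list of (x, y, z, radius) tuples in angstroms.
    A test point on an atom's probe-expanded sphere counts as exposed
    if it lies outside the probe-expanded sphere of every other atom.
    """
    unit = sphere_points(n_points)
    areas = []
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        expanded = ri + probe
        exposed = 0
        for ux, uy, uz in unit:
            px, py, pz = xi + expanded * ux, yi + expanded * uy, zi + expanded * uz
            buried = False
            for j, (xj, yj, zj, rj) in enumerate(atoms):
                if j == i:
                    continue
                cutoff = rj + probe
                if (px - xj) ** 2 + (py - yj) ** 2 + (pz - zj) ** 2 < cutoff * cutoff:
                    buried = True
                    break
            if not buried:
                exposed += 1
        areas.append(4.0 * math.pi * expanded ** 2 * exposed / n_points)
    return areas


isolated = shrake_rupley([(0.0, 0.0, 0.0, 1.0)])
bonded = shrake_rupley([(0.0, 0.0, 0.0, 1.7), (1.5, 0.0, 0.0, 1.7)])
print(isolated, bonded)
```

An isolated atom is fully exposed, so its area equals that of the probe-expanded sphere; two overlapping atoms each lose the portion of their sphere that the neighbor occludes. Summing per-atom areas over the atoms of a side chain would give a side chain SASA.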
TABLE 3 amino acid SASA values
Residue        Theoretical  Empirical  Miller et al. (1987)  Rose et al. (1985)
Alanine        129.0        121.0      113.0                 118.1
Arginine       274.0        265.0      241.0                 256.0
Asparagine     195.0        187.0      158.0                 165.5
Aspartate      193.0        187.0      151.0                 158.7
Cysteine       167.0        148.0      140.0                 146.1
Glutamate      223.0        214.0      183.0                 186.2
Glutamine      225.0        214.0      189.0                 193.2
Glycine        104.0        97.0       85.0                  88.1
Histidine      224.0        216.0      194.0                 202.5
Isoleucine     197.0        195.0      182.0                 181.0
Leucine        201.0        191.0      180.0                 193.1
Lysine         236.0        230.0      211.0                 225.8
Methionine     224.0        203.0      204.0                 203.4
Phenylalanine  240.0        228.0      218.0                 222.8
Proline        159.0        154.0      143.0                 146.8
Serine         155.0        143.0      122.0                 129.8
Threonine      172.0        163.0      146.0                 152.5
Tryptophan     285.0        264.0      259.0                 266.3
Tyrosine       263.0        255.0      229.0                 236.8
Valine         174.0        165.0      160.0                 164.5
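By way of illustration only, the theoretical values of Table 3 can be used to apply the SASA comparisons described above, such as selecting residues whose value is at least that of a reference residue, summing the values for a pair AAH1 and AAH2, and checking that AAH2 is no larger than AAH1. The Python sketch below uses hypothetical function names and covers only the residues listed in Table 3, so residues such as naphthylalanine are outside its scope.

```python
# Illustrative sketch only; function names are hypothetical.
# Theoretical SASA values (square angstroms) copied from Table 3.
THEORETICAL_SASA = {
    "Ala": 129.0, "Arg": 274.0, "Asn": 195.0, "Asp": 193.0, "Cys": 167.0,
    "Glu": 223.0, "Gln": 225.0, "Gly": 104.0, "His": 224.0, "Ile": 197.0,
    "Leu": 201.0, "Lys": 236.0, "Met": 224.0, "Phe": 240.0, "Pro": 159.0,
    "Ser": 155.0, "Thr": 172.0, "Trp": 285.0, "Tyr": 263.0, "Val": 174.0,
}


def sasa_at_least(reference):
    """Residues whose theoretical SASA is >= that of the reference, smallest first."""
    threshold = THEORETICAL_SASA[reference]
    hits = [r for r, area in THEORETICAL_SASA.items() if area >= threshold]
    return sorted(hits, key=THEORETICAL_SASA.get)


def combined_sasa(aa_h1, aa_h2):
    """Sum of the theoretical SASA values for a pair of residues."""
    return THEORETICAL_SASA[aa_h1] + THEORETICAL_SASA[aa_h2]


def h2_not_larger(aa_h1, aa_h2):
    """True if AAH2 has a SASA less than or equal to that of AAH1."""
    return THEORETICAL_SASA[aa_h2] <= THEORETICAL_SASA[aa_h1]


large = sasa_at_least("Phe")
pair_total = combined_sasa("Trp", "Phe")
print(large, pair_total, h2_not_larger("Trp", "Phe"))
```

The same functions could be pointed at the empirical column or at either of the two older scales by swapping the dictionary.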
Amino acid residues having a side chain comprising a guanidino group, a guanidine replacement group or a protonated form thereof
Guanidine, as used herein, refers to the following structure:
As used herein, the protonated form of guanidine refers to the following structure:
Guanidine replacement groups refer to functional groups on the side chains of amino acids that will be positively charged at or above physiological pH, or that reproduce the hydrogen bond donating and accepting activity of guanidinium groups.
The guanidine replacement groups facilitate cell penetration and delivery of therapeutic agents while reducing the toxicity associated with guanidine groups or protonated forms thereof. The cCPP may comprise at least one amino acid having a side chain comprising a guanidine or guanidinium replacement group. The cCPP may comprise at least two amino acids having a side chain comprising a guanidine or guanidinium replacement group. The cCPP may comprise at least three amino acids having a side chain comprising a guanidine or guanidinium replacement group.
The guanidine or guanidinium replacement group may be an isostere of guanidine or guanidinium. The guanidine or guanidinium replacement group may be less basic than guanidine.
As used herein, a guanidine replacement group refers to [structure not shown] or a protonated form thereof.
The present disclosure relates to a cCPP comprising 4 to 20 amino acid residues, wherein: (i) at least one amino acid has a side chain comprising a guanidino group or a protonated form thereof; (ii) at least one amino acid residue has no side chain or has a side chain comprising [structure not shown] or a protonated form thereof; and (iii) at least two amino acid residues independently have a side chain comprising an aromatic or heteroaromatic group.
At least two amino acid residues may have no side chain or may have a side chain comprising [structure not shown] or a protonated form thereof. As used herein, when no side chain is present, the amino acid residue has two hydrogen atoms (e.g., -CH2-) on the carbon atom connecting the amine and the carboxylic acid.
The cCPP may comprise at least one amino acid having a side chain comprising one of the following moieties: [structures not shown], or a protonated form thereof.
The cCPP may comprise at least two amino acids, each independently having a side chain comprising one of the following moieties: [structures not shown], or a protonated form thereof. At least two amino acids may have side chains comprising the same moiety selected from the group consisting of [structures not shown] or a protonated form thereof. At least one amino acid may have a side chain comprising [structure not shown] or a protonated form thereof. At least two amino acids may have a side chain comprising [structure not shown] or a protonated form thereof. One, two, three, or four amino acids may have a side chain comprising [structure not shown] or a protonated form thereof. One amino acid may have a side chain comprising [structure not shown] or a protonated form thereof. Two amino acids may have a side chain comprising [structure not shown] or a protonated form thereof. [Structure not shown] or a protonated form thereof may be attached to the terminus of the amino acid side chain. [Structure not shown] may be attached to the terminus of the amino acid side chain.
CCPP may comprise (iii) 2, 3, 4, 5 or 6 amino acid residues independently having a side chain comprising a guanidino group, a guanidine replacement group or a protonated form thereof. cCPP may comprise (iii) 2 amino acid residues independently having a side chain comprising a guanidino group, a guanidine replacement group or a protonated form thereof. cCPP may comprise (iii) 3 amino acid residues independently having a side chain comprising a guanidino group, a guanidine replacement group or a protonated form thereof. cCPP may comprise (iii) 4 amino acid residues independently having a side chain comprising a guanidino group, a guanidine replacement group or a protonated form thereof. cCPP may comprise (iii) 5 amino acid residues independently having a side chain comprising a guanidino group, a guanidine replacement group or a protonated form thereof. cCPP may comprise (iii) 6 amino acid residues independently having a side chain comprising a guanidino group, a guanidine replacement group or a protonated form thereof. cCPP may comprise (iii) 2, 3, 4, or 5 amino acid residues independently having a side chain comprising a guanidino group, a guanidine replacement group, or a protonated form thereof. cCPP may comprise (iii) 2, 3 or 4 amino acid residues independently having a side chain comprising a guanidino group, a guanidine replacement group or a protonated form thereof. cCPP may comprise (iii) 2 or 3 amino acid residues independently having a side chain comprising a guanidino group, a guanidine replacement group or a protonated form thereof. cCPP may comprise (iii) at least one amino acid residue having a side chain comprising a guanidino group or a protonated form thereof. cCPP may comprise (iii) two amino acid residues having a side chain comprising a guanidino group or a protonated form thereof. cCPP may comprise (iii) three amino acid residues having a side chain comprising a guanidino group or a protonated form thereof.
The amino acid residues independently having side chains comprising guanidine groups, guanidine replacement groups, or protonated forms thereof may be non-contiguous. Two amino acid residues independently having side chains comprising guanidine groups, guanidine replacement groups, or protonated forms thereof may be contiguous. Three amino acid residues independently having side chains comprising guanidine groups, guanidine replacement groups, or protonated forms thereof may be contiguous. Four amino acid residues independently having side chains comprising guanidine groups, guanidine replacement groups, or protonated forms thereof may be contiguous. Contiguous amino acid residues may have the same stereochemistry. Contiguous amino acids may have alternating stereochemistry.
The amino acid residues independently having a side chain comprising a guanidino group, a guanidine replacement group or a protonated form thereof may be L-amino acids. The amino acid residues independently having a side chain comprising a guanidino group, a guanidine replacement group or a protonated form thereof may be D-amino acids. The amino acid residues independently having a side chain comprising a guanidino group, a guanidine replacement group or a protonated form thereof may be a mixture of L- and D-amino acids.
Each amino acid residue having a side chain comprising a guanidino group or a protonated form thereof may independently be a residue of arginine, homoarginine, 2-amino-3-guanidinopropionic acid, 2-amino-4-guanidinobutyric acid, or a protonated form thereof. Each amino acid residue having a side chain comprising a guanidino group or a protonated form thereof can independently be an arginine residue or a protonated form thereof.
Each amino acid having a side chain comprising a guanidine replacement group or a protonated form thereof can independently be [structure] or a protonated form thereof.
Without being bound by theory, it is hypothesized that the guanidine replacement group has a reduced basicity relative to arginine, and in some cases is uncharged at physiological pH (e.g., -N(H)C(O)), and is capable of sustaining bidentate hydrogen bond interactions with phospholipids on the plasma membrane, which is believed to promote efficient membrane binding and subsequent internalization. Removal of the positive charge is also believed to reduce the toxicity of cCPP.
Those skilled in the art will appreciate that the N-terminus and/or C-terminus of the above-described unnatural aromatic hydrophobic amino acids form amide linkages upon incorporation into the peptides disclosed herein.
cCPP may comprise a first amino acid having a side chain comprising an aromatic or heteroaromatic group and a second amino acid having a side chain comprising an aromatic or heteroaromatic group, wherein the N-terminus of a first glycine forms a peptide bond with the first amino acid having a side chain comprising an aromatic or heteroaromatic group and the C-terminus of the first glycine forms a peptide bond with the second amino acid having a side chain comprising an aromatic or heteroaromatic group. Although the term "first amino acid" generally refers to the N-terminal amino acid of a peptide sequence by convention, as used herein, a "first amino acid" is used to distinguish the referred amino acid from another amino acid (e.g., "second amino acid") in cCPP, such that the term "first amino acid" may or may not refer to an amino acid located at the N-terminal end of a peptide sequence.
cCPP may comprise a second glycine, wherein the N-terminus of the second glycine forms a peptide bond with an amino acid having a side chain comprising an aromatic or heteroaromatic group, and the C-terminus of the second glycine forms a peptide bond with an amino acid having a side chain comprising a guanidino group or protonated form thereof.
cCPP may comprise a first amino acid having a side chain comprising a guanidino group or a protonated form thereof, and a second amino acid having a side chain comprising a guanidino group or a protonated form thereof, wherein the N-terminus of a third glycine forms a peptide bond with the first amino acid having a side chain comprising a guanidino group or a protonated form thereof, and the C-terminus of the third glycine forms a peptide bond with the second amino acid having a side chain comprising a guanidino group or a protonated form thereof.
cCPP may comprise residues of asparagine, aspartic acid, glutamine, glutamic acid or homoglutamine. cCPP may comprise residues of asparagine. cCPP may comprise residues of glutamine.
cCPP may comprise residues of tyrosine, phenylalanine, 1-naphthylalanine, 2-naphthylalanine, tryptophan, 3-benzothienylalanine, 4-phenylphenylalanine, 3, 4-difluorophenylalanine, 4-trifluoromethylphenylalanine, 2,3,4,5, 6-pentafluorophenylalanine, homophenylalanine, β -homophenylalanine, 4-tert-butyl-phenylalanine, 4-pyridylalanine, 3-pyridylalanine, 4-methylphenylalanine, 4-fluorophenylalanine, 4-chlorophenylalanine, or 3- (9-anthracenyl) -alanine.
While not wishing to be bound by theory, it is believed that the chirality of the amino acids in cCPP can affect cytoplasmic uptake efficiency. cCPP may comprise at least one D amino acid. cCPP may comprise one to fifteen D amino acids. cCPP may comprise one to ten D amino acids. cCPP may comprise 1, 2, 3 or 4 D amino acids. cCPP may comprise 2, 3, 4, 5, 6, 7 or 8 contiguous amino acids with alternating D and L chiralities. cCPP may comprise three contiguous amino acids with the same chirality. cCPP may comprise two contiguous amino acids having the same chirality. At least two amino acids may have opposite chiralities. At least two amino acids having opposite chiralities may be adjacent to each other. At least three amino acids may have alternating stereochemistry with respect to each other. At least three amino acids having alternating chiralities relative to each other may be adjacent to each other. At least four amino acids may have alternating stereochemistry relative to each other. At least four amino acids having alternating chiralities relative to each other may be adjacent to each other. At least two amino acids may have the same chirality. At least two amino acids having the same chirality may be adjacent to each other. At least two amino acids may have the same chirality and at least two amino acids may have opposite chiralities. At least two amino acids having opposite chiralities may be adjacent to at least two amino acids having the same chirality. Thus, adjacent amino acids in cCPP may have any of the following sequences: D-L; L-D; D-L-L-D; L-D-D-L; L-D-L-L-D; D-L-D-D-L; D-L-L-D-L; or L-D-D-L-D. The amino acid residues forming cCPP may all be L-amino acids. The amino acid residues forming cCPP may all be D-amino acids.
At least two amino acids may have different chiralities. At least two amino acids having different chiralities may be adjacent to each other. At least three amino acids may have different chiralities relative to adjacent amino acids. At least four amino acids may have different chiralities relative to adjacent amino acids. At least two amino acids may have the same chirality and at least two amino acids may have different chiralities. One or more of the amino acid residues forming cCPP may be achiral. cCPP may comprise a 3, 4 or 5 amino acid motif, wherein two amino acids having the same chirality may be separated by an achiral amino acid. cCPP may comprise the following sequence: D-X-D; D-X-D-X; D-X-D-X-D; L-X-L; L-X-L-X; or L-X-L-X-L, wherein X is an achiral amino acid. The achiral amino acid may be glycine.
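As an illustration only, the chirality arrangements listed above can be checked mechanically. The following minimal sketch assumes a hypothetical string encoding in which "D" and "L" denote residue chirality, "X" denotes an achiral residue such as glycine, and residues are separated by hyphens; the function names are invented for this example and do not appear in the disclosure.

```python
from typing import List

def parse_pattern(pattern: str) -> List[str]:
    """Split a pattern such as 'D-L-L-D' into ['D', 'L', 'L', 'D']."""
    tokens = pattern.split("-")
    for tok in tokens:
        if tok not in ("D", "L", "X"):
            raise ValueError(f"unknown chirality token: {tok!r}")
    return tokens

def longest_alternating_run(tokens: List[str]) -> int:
    """Length of the longest contiguous run in which D and L strictly alternate.

    An achiral residue ('X') interrupts the run (an assumption of this sketch).
    """
    best = run = 0
    prev = None
    for tok in tokens:
        if tok == "X":
            run, prev = 0, None
        elif prev is not None and tok != prev:
            run += 1
        else:
            run = 1
        if tok != "X":
            prev = tok
        best = max(best, run)
    return best

def is_achiral_spaced_motif(tokens: List[str]) -> bool:
    """True for motifs such as D-X-D, D-X-D-X or L-X-L-X-L, in which residues
    of one chirality are separated by achiral residues."""
    if len(tokens) < 3 or tokens[0] not in ("D", "L"):
        return False
    chiral, achiral = tokens[0::2], tokens[1::2]
    return all(t == tokens[0] for t in chiral) and all(t == "X" for t in achiral)
```

For example, under these assumptions `longest_alternating_run(parse_pattern("L-D-L-L-D"))` counts the leading L-D-L stretch, and `is_achiral_spaced_motif(parse_pattern("D-X-D-X"))` recognizes one of the glycine-separated motifs.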
An amino acid having a side chain comprising [structure] or a protonated form thereof may be adjacent to an amino acid having a side chain comprising an aromatic or heteroaromatic group. An amino acid having a side chain comprising [structure] or a protonated form thereof may be adjacent to at least one amino acid having a side chain comprising guanidine or a protonated form thereof. An amino acid having a side chain comprising guanidine or a protonated form thereof can be adjacent to an amino acid having a side chain comprising an aromatic or heteroaromatic group. Two amino acids having side chains comprising [structure] or a protonated form thereof may be adjacent to each other. Two amino acids having side chains comprising guanidine or a protonated form thereof are adjacent to each other. cCPP may comprise at least two contiguous amino acids having a side chain which may comprise an aromatic or heteroaromatic group, and at least two non-contiguous amino acids having a side chain comprising [structure] or a protonated form thereof. cCPP can comprise at least two contiguous amino acids having a side chain comprising an aromatic or heteroaromatic group and at least two amino acids having a side chain comprising [structure] or a protonated form thereof. Adjacent amino acids may have the same chirality. Adjacent amino acids may have opposite chirality. Other combinations of amino acids may have any arrangement of D and L amino acids, e.g., any of the sequences described in the preceding paragraphs.
At least two amino acids having side chains comprising:
Or a protonated form thereof, with at least two amino acids having side chains comprising a guanidino group or a protonated form thereof.
cCPP can have the structure of formula (A):
or a protonated form thereof, wherein:
R 1, R 2 and R 3 are each independently H or an aromatic or heteroaromatic side chain of an amino acid;
at least one of R 1, R 2 and R 3 is an aromatic or heteroaromatic side chain of an amino acid;
R 4, R 5, R 6 and R 7 are independently H or an amino acid side chain;
at least one of R 4, R 5, R 6 and R 7 is a side chain of 3-guanidino-2-aminopropionic acid, 4-guanidino-2-aminobutyric acid, arginine, homoarginine, N-methylarginine, N, N-dimethylarginine, 2, 3-diaminopropionic acid, 2, 4-diaminobutyric acid, lysine, N-methyllysine, N, N-dimethyllysine, N-ethyllysine, N, N, N-trimethyllysine, 4-guanidinophenylalanine, citrulline, N, N-dimethyllysine, β -homoarginine, or 3- (1-piperidinyl) alanine;
AA SC is an amino acid side chain; and
q is 1, 2, 3 or 4.
In an embodiment, the cyclic peptide of formula (A) is not FfΦ RrRrQ (SEQ ID NO: 67). In an embodiment, the cyclic peptide of formula (A) is FfΦ RrRrQ (SEQ ID NO: 67).
cCPP may have the structure of formula (I):
or a protonated form thereof, wherein:
R 1, R 2 and R 3 may each independently be H or an amino acid residue having a side chain comprising an aromatic group;
at least one of R 1, R 2 and R 3 is an aromatic or heteroaromatic side chain of an amino acid;
R 4 and R 7 are independently H or an amino acid side chain;
AA SC is an amino acid side chain;
q is 1, 2, 3 or 4; and
each m is independently an integer of 0, 1, 2 or 3.
R 1, R 2 and R 3 may each independently be H, -alkylene-aryl or -alkylene-heteroaryl. R 1, R 2 and R 3 may each independently be H, -C 1-3 alkylene-aryl or -C 1-3 alkylene-heteroaryl. R 1, R 2 and R 3 may each independently be H or -alkylene-aryl. R 1, R 2 and R 3 may each independently be H or -C 1-3 alkylene-aryl. The C 1-3 alkylene group may be methylene. The aryl group may be a 6 to 14 membered aryl group. The heteroaryl group may be a 6 to 14 membered heteroaryl group having one or more heteroatoms selected from N, O and S. The aryl group may be selected from phenyl, naphthyl or anthracenyl. Aryl may be phenyl or naphthyl. The aryl group may be phenyl. Heteroaryl groups may be pyridinyl, quinolinyl and isoquinolinyl. R 1, R 2 and R 3 may each independently be H, -C 1-3 alkylene-Ph or -C 1-3 alkylene-naphthyl. R 1, R 2 and R 3 may each independently be H, -CH 2 Ph or -CH 2 naphthyl. R 1, R 2 and R 3 may each independently be H or -CH 2 Ph.
R 1, R 2 and R 3 may each independently be a side chain of tyrosine, phenylalanine, 1-naphthylalanine, 2-naphthylalanine, tryptophan, 3-benzothienylalanine, 4-phenylphenylalanine, 3, 4-difluorophenylalanine, 4-trifluoromethylphenylalanine, 2,3,4,5, 6-pentafluorophenylalanine, homophenylalanine, β -homophenylalanine, 4-tert-butyl-phenylalanine, 4-pyridylalanine, 3-pyridylalanine, 4-methylphenylalanine, 4-fluorophenylalanine, 4-chlorophenylalanine, or 3- (9-anthracenyl) -alanine.
R 1 can be a side chain of tyrosine. R 1 can be the side chain of phenylalanine. R 1 can be the side chain of 1-naphthylalanine. R 1 can be the side chain of 2-naphthylalanine. R 1 can be the side chain of tryptophan. R 1 can be the side chain of 3-benzothiophenylalanine. R 1 can be the side chain of 4-phenylphenylalanine. R 1 can be the side chain of 3, 4-difluorophenylalanine. R 1 can be the side chain of 4-trifluoromethylphenylalanine. R 1 can be the side chain of 2,3,4,5, 6-pentafluorophenylalanine. R 1 can be the side chain of homophenylalanine. R 1 can be the side chain of beta-homophenylalanine. R 1 may be the side chain of 4-tert-butyl-phenylalanine. R 1 can be the side chain of 4-pyridylalanine. R 1 can be the side chain of 3-pyridylalanine. R 1 can be the side chain of 4-methylphenylalanine. R 1 can be the side chain of 4-fluorophenylalanine. R 1 can be the side chain of 4-chlorophenylalanine. R 1 can be the side chain of 3- (9-anthryl) -alanine.
R 2 can be a side chain of tyrosine. R 2 can be the side chain of phenylalanine. R 2 can be the side chain of 1-naphthylalanine. R 2 can be the side chain of 2-naphthylalanine. R 2 can be the side chain of tryptophan. R 2 can be the side chain of 3-benzothiophenylalanine. R 2 can be the side chain of 4-phenylphenylalanine. R 2 can be the side chain of 3, 4-difluorophenylalanine. R 2 can be the side chain of 4-trifluoromethylphenylalanine. R 2 can be the side chain of 2,3,4,5, 6-pentafluorophenylalanine. R 2 can be the side chain of homophenylalanine. R 2 can be the side chain of beta-homophenylalanine. R 2 may be the side chain of 4-tert-butyl-phenylalanine. R 2 can be the side chain of 4-pyridylalanine. R 2 can be the side chain of 3-pyridylalanine. R 2 can be the side chain of 4-methylphenylalanine. R 2 can be the side chain of 4-fluorophenylalanine. R 2 can be the side chain of 4-chlorophenylalanine. R 2 can be the side chain of 3- (9-anthryl) -alanine.
R 3 can be a side chain of tyrosine. R 3 can be the side chain of phenylalanine. R 3 can be the side chain of 1-naphthylalanine. R 3 can be the side chain of 2-naphthylalanine. R 3 can be the side chain of tryptophan. R 3 can be the side chain of 3-benzothiophenylalanine. R 3 can be the side chain of 4-phenylphenylalanine. R 3 can be the side chain of 3, 4-difluorophenylalanine. R 3 can be the side chain of 4-trifluoromethylphenylalanine. R 3 can be the side chain of 2,3,4,5, 6-pentafluorophenylalanine. R 3 can be the side chain of homophenylalanine. R 3 can be the side chain of beta-homophenylalanine. R 3 may be the side chain of 4-tert-butyl-phenylalanine. R 3 can be the side chain of 4-pyridylalanine. R 3 can be the side chain of 3-pyridylalanine. R 3 can be the side chain of 4-methylphenylalanine. R 3 can be the side chain of 4-fluorophenylalanine. R 3 can be the side chain of 4-chlorophenylalanine. R 3 can be the side chain of 3- (9-anthryl) -alanine.
R 4 can be H, -alkylene-aryl, -alkylene-heteroaryl. R 4 can be H, -C 1-3 alkylene-aryl, or-C 1-3 alkylene-heteroaryl. R 4 can be H or-alkylene-aryl. R 4 can be H or-C 1-3 alkylene-aryl. The C 1-3 alkylene group may be methylene. The aryl group may be a 6 to 14 membered aryl group. The heteroaryl group may be a 6 to 14 membered heteroaryl group having one or more heteroatoms selected from N, O and S. The aryl group may be selected from phenyl, naphthyl or anthracenyl. Aryl may be phenyl or naphthyl. The aryl group may be phenyl. Heteroaryl groups may be pyridinyl, quinolinyl and isoquinolinyl. R 4 can be H, -C 1-3 alkylene-Ph or-C 1-3 alkylene-naphthyl. R 4 can be H or the side chain of an amino acid in Table 1 or Table 3. R 4 can be H or an amino acid residue having a side chain comprising an aromatic group. R 4 can be H, -CH 2 Ph or-CH 2 naphthyl. R 4 can be H or-CH 2 Ph.
R 5 can be H, -alkylene-aryl, -alkylene-heteroaryl. R 5 can be H, -C 1-3 alkylene-aryl, or-C 1-3 alkylene-heteroaryl. R 5 can be H or-alkylene-aryl. R 5 can be H or-C 1-3 alkylene-aryl. The C 1-3 alkylene group may be methylene. The aryl group may be a 6 to 14 membered aryl group. The heteroaryl group may be a 6 to 14 membered heteroaryl group having one or more heteroatoms selected from N, O and S. The aryl group may be selected from phenyl, naphthyl or anthracenyl. Aryl may be phenyl or naphthyl. The aryl group may be phenyl. Heteroaryl groups may be pyridinyl, quinolinyl and isoquinolinyl. R 5 can be H, -C 1-3 alkylene-Ph or-C 1-3 alkylene-naphthyl. R 5 can be H or the side chain of an amino acid in Table 1 or Table 3. R 5 can be H or an amino acid residue having a side chain comprising an aromatic group. R 5 can be H, -CH 2 Ph or-CH 2 naphthyl. R 5 can be H or-CH 2 Ph.
R 6 can be H, -alkylene-aryl, -alkylene-heteroaryl. R 6 can be H, -C 1-3 alkylene-aryl, or-C 1-3 alkylene-heteroaryl. R 6 can be H or-alkylene-aryl. R 6 can be H or-C 1-3 alkylene-aryl. The C 1-3 alkylene group may be methylene. The aryl group may be a 6 to 14 membered aryl group. The heteroaryl group may be a 6 to 14 membered heteroaryl group having one or more heteroatoms selected from N, O and S. The aryl group may be selected from phenyl, naphthyl or anthracenyl. Aryl may be phenyl or naphthyl. The aryl group may be phenyl. Heteroaryl groups may be pyridinyl, quinolinyl and isoquinolinyl. R 6 can be H, -C 1-3 alkylene-Ph or-C 1-3 alkylene-naphthyl. R 6 can be H or the side chain of an amino acid in Table 1 or Table 3. R 6 can be H or an amino acid residue having a side chain comprising an aromatic group. R 6 can be H, -CH 2 Ph or-CH 2 naphthyl. R 6 can be H or-CH 2 Ph.
R 7 can be H, -alkylene-aryl, -alkylene-heteroaryl. R 7 can be H, -C 1-3 alkylene-aryl, or-C 1-3 alkylene-heteroaryl. R 7 can be H or-alkylene-aryl. R 7 can be H or-C 1-3 alkylene-aryl. The C 1-3 alkylene group may be methylene. The aryl group may be a 6 to 14 membered aryl group. The heteroaryl group may be a 6 to 14 membered heteroaryl group having one or more heteroatoms selected from N, O and S. The aryl group may be selected from phenyl, naphthyl or anthracenyl. Aryl may be phenyl or naphthyl. The aryl group may be phenyl. Heteroaryl groups may be pyridinyl, quinolinyl and isoquinolinyl. R 7 can be H, -C 1-3 alkylene-Ph or-C 1-3 alkylene-naphthyl. R 7 can be H or the side chain of an amino acid in Table 1 or Table 3. R 7 can be H or an amino acid residue having a side chain comprising an aromatic group. R 7 can be H, -CH 2 Ph or-CH 2 naphthyl. R 7 can be H or-CH 2 Ph.
One, two or three of R 1, R 2, R 3, R 4, R 5, R 6 and R 7 may be -CH 2 Ph. One of R 1, R 2, R 3, R 4, R 5, R 6 and R 7 may be -CH 2 Ph. Two of R 1, R 2, R 3, R 4, R 5, R 6 and R 7 may be -CH 2 Ph. Three of R 1, R 2, R 3, R 4, R 5, R 6 and R 7 may be -CH 2 Ph. At least one of R 1, R 2, R 3, R 4, R 5, R 6 and R 7 may be -CH 2 Ph. No more than four of R 1, R 2, R 3, R 4, R 5, R 6 and R 7 may be -CH 2 Ph.
One, two or three of R 1, R 2, R 3 and R 4 are -CH 2 Ph. One of R 1, R 2, R 3 and R 4 is -CH 2 Ph. Two of R 1, R 2, R 3 and R 4 are -CH 2 Ph. Three of R 1, R 2, R 3 and R 4 are -CH 2 Ph. At least one of R 1, R 2, R 3 and R 4 is -CH 2 Ph.
One, two or three of R 1, R 2, R 3, R 4, R 5, R 6 and R 7 may be H. One of R 1, R 2, R 3, R 4, R 5, R 6 and R 7 may be H. Two of R 1, R 2, R 3, R 4, R 5, R 6 and R 7 are H. Three of R 1, R 2, R 3, R 5, R 6 and R 7 may be H. At least one of R 1, R 2, R 3, R 4, R 5, R 6 and R 7 may be H. No more than three of R 1, R 2, R 3, R 4, R 5, R 6 and R 7 may be -CH 2 Ph.
One, two or three of R 1, R 2, R 3 and R 4 are H. One of R 1, R 2, R 3 and R 4 is H. Two of R 1, R 2, R 3 and R 4 are H. Three of R 1, R 2, R 3 and R 4 are H. At least one of R 1, R 2, R 3 and R 4 is H.
At least one of R 4, R 5, R 6 and R 7 may be a side chain of 3-guanidino-2-aminopropionic acid. At least one of R 4, R 5, R 6 and R 7 may be a side chain of 4-guanidino-2-aminobutyric acid. At least one of R 4, R 5, R 6 and R 7 may be a side chain of arginine. At least one of R 4, R 5, R 6 and R 7 may be a side chain of homoarginine. At least one of R 4, R 5, R 6 and R 7 may be a side chain of N-methyl arginine. At least one of R 4, R 5, R 6 and R 7 may be a side chain of N, N-dimethylarginine. At least one of R 4, R 5, R 6 and R 7 may be a side chain of 2, 3-diaminopropionic acid. At least one of R 4, R 5, R 6 and R 7 may be a side chain of 2, 4-diaminobutyric acid or lysine. At least one of R 4, R 5, R 6 and R 7 may be a side chain of N-methyl lysine. At least one of R 4, R 5, R 6 and R 7 may be a side chain of N, N-dimethyl lysine. At least one of R 4, R 5, R 6 and R 7 may be a side chain of N-ethyl lysine. At least one of R 4, R 5, R 6 and R 7 may be a side chain of N, N, N-trimethyllysine or 4-guanidinophenylalanine. At least one of R 4, R 5, R 6 and R 7 may be a side chain of citrulline. At least one of R 4, R 5, R 6 and R 7 may be a side chain of N, N-dimethyl lysine or β -homoarginine. At least one of R 4, R 5, R 6 and R 7 may be the side chain of 3- (1-piperidinyl) alanine.
At least two of R 4, R 5, R 6 and R 7 may be side chains of 3-guanidino-2-aminopropionic acid. At least two of R 4, R 5, R 6 and R 7 may be side chains of 4-guanidino-2-aminobutyric acid. At least two of R 4, R 5, R 6 and R 7 may be side chains of arginine. At least two of R 4, R 5, R 6 and R 7 may be side chains of homoarginine. At least two of R 4, R 5, R 6 and R 7 may be side chains of N-methyl arginine. At least two of R 4, R 5, R 6 and R 7 may be side chains of N, N-dimethylarginine. At least two of R 4, R 5, R 6 and R 7 may be side chains of 2, 3-diaminopropionic acid. At least two of R 4, R 5, R 6 and R 7 may be side chains of 2, 4-diaminobutyric acid or lysine. At least two of R 4, R 5, R 6 and R 7 may be side chains of N-methyl lysine. At least two of R 4, R 5, R 6 and R 7 may be side chains of N, N-dimethyl lysine. At least two of R 4, R 5, R 6 and R 7 may be side chains of N-ethyl lysine. At least two of R 4, R 5, R 6 and R 7 may be side chains of N, N, N-trimethyllysine or 4-guanidinophenylalanine. At least two of R 4, R 5, R 6 and R 7 may be side chains of citrulline. At least two of R 4, R 5, R 6 and R 7 may be side chains of N, N-dimethyl lysine or β -homoarginine. At least two of R 4, R 5, R 6 and R 7 may be side chains of 3- (1-piperidinyl) alanine.
At least three of R 4, R 5, R 6 and R 7 may be side chains of 3-guanidino-2-aminopropionic acid. At least three of R 4, R 5, R 6 and R 7 may be side chains of 4-guanidino-2-aminobutyric acid. At least three of R 4, R 5, R 6 and R 7 may be side chains of arginine. At least three of R 4, R 5, R 6 and R 7 may be side chains of homoarginine. At least three of R 4, R 5, R 6 and R 7 may be side chains of N-methyl arginine. At least three of R 4, R 5, R 6 and R 7 may be side chains of N, N-dimethylarginine. At least three of R 4, R 5, R 6 and R 7 may be side chains of 2, 3-diaminopropionic acid. At least three of R 4, R 5, R 6 and R 7 may be side chains of 2, 4-diaminobutyric acid or lysine. At least three of R 4, R 5, R 6 and R 7 may be side chains of N-methyllysine. At least three of R 4, R 5, R 6 and R 7 may be side chains of N, N-dimethyl lysine. At least three of R 4, R 5, R 6 and R 7 may be side chains of N-ethyl lysine. At least three of R 4, R 5, R 6 and R 7 may be side chains of N, N, N-trimethyllysine or 4-guanidinophenylalanine. At least three of R 4, R 5, R 6 and R 7 may be side chains of citrulline. At least three of R 4, R 5, R 6 and R 7 may be side chains of N, N-dimethyl lysine or β -homoarginine. At least three of R 4, R 5, R 6 and R 7 may be side chains of 3- (1-piperidinyl) alanine.
AA SC can be a side chain of an asparagine, glutamine or homoglutamine residue. AA SC can be a side chain of a glutamine residue. cCPP can also comprise a linker conjugated to AA SC (e.g., residues of asparagine, glutamine, or homoglutamine). Thus cCPP may also comprise linkers conjugated to asparagine, glutamine or homoglutamine residues. cCPP may also comprise a linker conjugated to the glutamine residue.
q may be 1, 2 or 3. q may be 1 or 2. q may be 1. q may be 2. q may be 3. q may be 4.
m may be 1-3. m may be 1 or 2. m may be 0. m may be 1. m may be 2. m may be 3.
cCPP of formula (A) may have the structure of formula (I):
or a protonated form thereof, wherein AA SC, R 2, R 3, R 4, R 7, m, and q are as defined herein.
cCPP of formula (A) may have the structure of formula (I-a) or formula (I-b):
or a protonated form thereof, wherein AA SC, R 1, R 2, R 3, R 4 and m are as defined herein.
cCPP of formula (A) may have the structure of formula (I-1), (I-2), (I-3) or (I-4):
or a protonated form thereof, wherein AA SC and m are as defined herein. cCPP of formula (A) may have the structure of formula (I-5) or (I-6):
or a protonated form thereof, wherein AA SC is as defined herein.
cCPP of formula (A) may have the structure of formula (I-1):
or a protonated form thereof, wherein AA SC and m are as defined herein.
cCPP of formula (A) may have the structure of formula (I-2):
or a protonated form thereof, wherein AA SC and m are as defined herein.
cCPP of formula (A) may have the structure of formula (I-3):
or a protonated form thereof, wherein AA SC and m are as defined herein.
cCPP of formula (A) may have the structure of formula (I-4):
or a protonated form thereof, wherein AA SC and m are as defined herein.
cCPP of formula (A) may have the structure of formula (I-5):
or a protonated form thereof, wherein AA SC and m are as defined herein.
cCPP of formula (A) may have the structure of formula (I-6):
or a protonated form thereof, wherein AA SC and m are as defined herein.
cCPP may comprise one of the following sequences: FGFGRGR (SEQ ID NO: 68); gfFGrGr (SEQ ID NO: 69); ffΦGRGR (SEQ ID NO: 70); ffFGRGR (SEQ ID NO: 71); or FfΦGrGr (SEQ ID NO: 72). cCPP may have one of the following sequences: FGF phi (SEQ ID NO: 73); gfFGrGrQ (SEQ ID NO: 74); ffΦGRGRQ (SEQ ID NO: 75); ffFGRGRQ (SEQ ID NO: 76); or FfΦGrGrQ (SEQ ID NO: 77).
The present disclosure also relates to cCPP having the structure of formula (II):
Wherein:
AA SC is an amino acid side chain;
R 1a, R 1b and R 1c are each independently 6 to 14 membered aryl or 6 to 14 membered heteroaryl;
R 2a, R 2b, R 2c and R 2d are independently amino acid side chains;
at least one of R 2a, R 2b, R 2c and R 2d is [structure] or a protonated form thereof;
at least one of R 2a, R 2b, R 2c and R 2d is guanidine or a protonated form thereof;
each n" is independently an integer of 0, 1, 2, 3, 4, or 5;
each n' is independently an integer of 0, 1, 2 or 3; and
if n' is 0, then R 2a, R 2b, R 2c or R 2d is absent.
At least two of R 2a, R 2b, R 2c and R 2d may be [structure] or a protonated form thereof. Two or three of R 2a, R 2b, R 2c and R 2d may be [structure] or a protonated form thereof. One of R 2a, R 2b, R 2c and R 2d may be [structure] or a protonated form thereof. At least one of R 2a, R 2b, R 2c and R 2d may be [structure] or a protonated form thereof, and the remainder of R 2a, R 2b, R 2c and R 2d may be guanidine or a protonated form thereof. At least two of R 2a, R 2b, R 2c and R 2d may be [structure] or a protonated form thereof, and the remainder of R 2a, R 2b, R 2c and R 2d may be guanidine or a protonated form thereof.
All of R 2a, R 2b, R 2c and R 2d may be [structure] or a protonated form thereof. At least one of R 2a, R 2b, R 2c and R 2d may be [structure] or a protonated form thereof, and the remainder of R 2a, R 2b, R 2c and R 2d may be guanidine or a protonated form thereof. At least two of R 2a, R 2b, R 2c and R 2d may be [structure] or a protonated form thereof, and the remainder of R 2a, R 2b, R 2c and R 2d are guanidine or a protonated form thereof.
Each of R 2a, R 2b, R 2c and R 2d may independently be a side chain of 2, 3-diaminopropionic acid, 2, 4-diaminobutyric acid, ornithine, lysine, methyllysine, dimethyllysine, trimethyllysine, homolysine, serine, homoserine, threonine, allothreonine, histidine, 1-methylhistidine, 2-aminobutyric acid, aspartic acid, glutamic acid, or homoglutamic acid.
AA SC can be [structure], wherein t may be an integer from 0 to 5. AA SC can be [structure], wherein t may be an integer from 0 to 5. t may be 1 to 5. t is 2 or 3. t may be 2. t may be 3.
R 1a, R 1b and R 1c may each independently be a 6 to 14 membered aryl group. R 1a, R 1b and R 1c may each independently be a 6 to 14 membered heteroaryl group having one or more heteroatoms selected from N, O or S. R 1a, R 1b and R 1c may each be independently selected from phenyl, naphthyl, anthracenyl, pyridinyl, quinolinyl or isoquinolinyl. R 1a, R 1b and R 1c may each be independently selected from phenyl, naphthyl or anthracenyl. R 1a, R 1b and R 1c may each independently be phenyl or naphthyl. R 1a, R 1b and R 1c may each be independently selected from pyridinyl, quinolinyl or isoquinolinyl.
Each n' may independently be 1 or 2. Each n' may be 1. Each n' may be 2. At least one n' may be 0. At least one n' may be 1. At least one n' may be 2. At least one n' may be 3. At least one n' may be 4. At least one n' may be 5.
Each n" may independently be an integer from 1 to 3. Each n" may independently be 2 or 3. Each n" may be 2. Each n" may be 3. At least one n" may be 0. At least one n" may be 1. At least one n" may be 2. At least one n" may be 3.
Each n" may independently be 1 or 2, and each n' may independently be 2 or 3. Each n" may be 1 and each n' may independently be 2 or 3. Each n" may be 1 and each n' may be 2. Each n" is 1 and each n' is 3.
cCPP of formula (II) may have the structure of formula (II-1):
wherein R 1a, R 1b, R 1c, R 2a, R 2b, R 2c, R 2d, AA SC, n' and n" are as defined herein.
cCPP of formula (II) may have the structure of formula (IIa):
wherein R 1a, R 1b, R 1c, R 2a, R 2b, R 2c, R 2d, AA SC and n' are as defined herein.
cCPP of formula (II) may have the structure of formula (IIb):
wherein R 2a, R 2b, AA SC and n' are as defined herein.
cCPP can have the structure of formula (IIc):
or a protonated form thereof, wherein:
AA SC and n' are as defined herein.
cCPP of formula (IIa) has one of the following structures:
wherein AA SC and n are as defined herein. cCPP of formula (IIa) has one of the following structures:
wherein AA SC and n are as defined herein. cCPP of formula (IIa) has one of the following structures:
wherein AA SC and n are as defined herein. cCPP of formula (II) may have the following structure:
cCPP of formula (II) may have the following structure:
cCPP can have the structure of formula (III):
Wherein:
AA SC is an amino acid side chain;
R 1a, R 1b and R 1c are each independently 6 to 14 membered aryl or 6 to 14 membered heteroaryl;
R 2a and R 2c are each independently H, [structure], or a protonated form thereof;
R 2b and R 2d are each independently guanidine or a protonated form thereof;
each n" is independently an integer from 1 to 3;
each n' is independently an integer from 1 to 5; and
Each p' is independently an integer from 0 to 5.
cCPP of formula (III) may have the structure of formula (III-1):
wherein:
AA SC, R 1a, R 1b, R 1c, R 2a, R 2c, R 2b, R 2d, n', n", and p' are as defined herein.
cCPP of formula (III) may have the structure of formula (IIIa):
wherein:
AA SC, R 2a, R 2c, R 2b, R 2d, n', n", and p' are as defined herein.
In formulas (III), (III-1) and (IIIa), R a and R c may be H. R a and R c may be H and R b and R d may each independently be guanidine or a protonated form thereof. R a may be H. R b may be H. p' may be 0. R a and R c may be H and each p' may be 0.
In formulas (III), (III-1) and (IIIa), R a and R c may be H, R b and R d may each independently be guanidine or a protonated form thereof, n" may be 2 or 3, and each p' may be 0.
p' may be 0. p' may be 1. p' may be 2. p' may be 3. p' may be 4. p' may be 5.
cCPP can have the following structure:
cCPP of formula (A) may be selected from:
CPP sequence SEQ ID NO:
(FfΦRrRrQ) 78
(FfΦCit-r-Cit-rQ) 79
(FfΦGrGrQ) 80
(FfFGRGRQ) 81
(FGFGRGRQ) 82
(GfFGrGrQ) 83
(FGFGRRRQ) 84
(FGFRRRRQ) 85
cCPP of formula (A) may be selected from:
CPP sequence SEQ ID NO:
FΦRRRRQ 86
fΦRrRrQ 87
FfΦRrRrQ 78
FfΦCit-r-Cit-rQ 79
FfΦGrGrQ 80
FfΦRGRGQ 88
FfFGRGRQ 81
FGFGRGRQ 82
GfFGrGrQ 83
FGFGRRRQ 84
FGFRRRRQ 85
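As an illustration only, sequences written in the notation of the tables above can be split into residues and screened for the residue counts discussed earlier (for example, the number of guanidino-bearing residues). The sketch below assumes that each letter is one residue, that "Cit" is a single citrulline residue, that hyphens and surrounding parentheses are only separators, and that R/r denote arginine and F/f/Φ denote aromatic residues; the function names are hypothetical and not part of the disclosure.

```python
import re
from typing import List

# Match the multi-letter token "Cit" first, then any single residue letter.
_TOKEN = re.compile(r"Cit|[A-Za-zΦφΩ]")

def tokenize(sequence: str) -> List[str]:
    """Split a sequence such as '(FfΦCit-r-Cit-rQ)' into residue tokens."""
    return _TOKEN.findall(sequence.strip("()"))

def count_guanidino(tokens: List[str]) -> int:
    """Count arginine residues (R or r), whose side chain carries a guanidino group.

    Citrulline ("Cit") carries a urea group instead and is not counted.
    """
    return sum(1 for t in tokens if t in ("R", "r"))

def count_aromatic(tokens: List[str]) -> int:
    """Count phenylalanine (F or f) and naphthylalanine (Φ) residues."""
    return sum(1 for t in tokens if t in ("F", "f", "Φ"))
```

Under these assumptions, `tokenize("FfΦRrRrQ")` yields eight residues, of which `count_guanidino` reports four and `count_aromatic` reports three.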
In embodiments cCPP is selected from:
Wherein Φ=l-naphthylalanine; -naphthylalanine; Ω=l-norleucine
In embodiments, cCPP is not selected from:
Wherein Φ=l-naphthylalanine; -naphthylalanine; Ω=l-norleucine
AA SC may be conjugated to a linker.
Linker
cCPP of the present disclosure may be conjugated to a linker. The linker may link a cargo to cCPP. The linker may be attached to the side chain of an amino acid of cCPP, and the cargo may be attached at a suitable position on the linker.
The linker may be any suitable moiety that can conjugate cCPP to one or more additional moieties, such as an exocyclic peptide (EP) and/or cargo. Prior to conjugation with cCPP and one or more additional moieties, the linker has two or more functional groups, each of which is capable of independently forming a covalent bond with cCPP and one or more additional moieties. If the cargo is an oligonucleotide, the linker may be covalently bound to the 5' end of the cargo or the 3' end of the cargo. The linker may be covalently bound to the 5' end of the cargo. The linker may be covalently bound to the 3' end of the cargo. If the cargo is a peptide, the linker may be covalently bound to the N-terminus or the C-terminus of the cargo. The linker may be covalently bound to the backbone of the oligonucleotide or peptide cargo. The linker may be any suitable moiety that conjugates cCPP described herein with cargo such as oligonucleotides, peptides, or small molecules.
The linker may comprise a hydrocarbon linker.
The linker may comprise a cleavage site. The cleavage site may be a disulfide or a caspase cleavage site (e.g., Val-Cit-PABC).
The linker may comprise: (i) one or more D or L amino acids, each of which is optionally substituted; (ii) optionally substituted alkylene; (iii) optionally substituted alkenylene; (iv) optionally substituted alkynylene; (v) optionally substituted carbocyclyl; (vi) optionally substituted heterocyclyl; (vii) one or more -(R1-J-R2)z"- subunits, wherein R1 and R2 are each independently selected from alkylene, alkenylene, alkynylene, carbocyclyl, and heterocyclyl, each J is independently C, NR3, -NR3C(O)-, S, or O, wherein R3 is independently selected from H, alkyl, alkenyl, alkynyl, carbocyclyl, and heterocyclyl, each of which is optionally substituted, and z" is an integer from 1 to 50; (viii) -(R1-J)z"- or -(J-R1)z"-, wherein each R1 is independently alkylene, alkenylene, alkynylene, carbocyclyl, or heterocyclyl, each J is independently C, NR3, -NR3C(O)-, S, or O, wherein R3 is H, alkyl, alkenyl, alkynyl, carbocyclyl, or heterocyclyl, each of which is optionally substituted, and z" is an integer from 1 to 50; or (ix) a combination of one or more of (i) to (viii).
The linker may comprise one or more D or L amino acids and/or -(R1-J-R2)z"-, wherein R1 and R2 are each independently alkylene, each J is independently C, NR3, -NR3C(O)-, S, or O, wherein R3 is independently selected from H and alkyl, and z" is an integer from 1 to 50; or a combination thereof.
The linker may comprise -(OCH2CH2)z'- (e.g., as a spacer), wherein z' is an integer from 1 to 23, e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, or 23. "-(OCH2CH2)z'-" may also be referred to as polyethylene glycol (PEG).
The linker may comprise one or more amino acids. The linker may comprise a peptide. The linker may comprise -(OCH2CH2)z'- and a peptide, wherein z' is an integer from 1 to 23. The peptide may comprise from 2 to 10 amino acids. The linker may further comprise a functional group (FG) capable of undergoing a click chemistry reaction. The FG may be an azide or an alkyne, and a triazole is formed when the cargo is conjugated to the linker.
The linker may comprise (i) a beta-alanine residue and a lysine residue; (ii) -(J-R1)z"-; or (iii) a combination thereof. Each R1 may independently be alkylene, alkenylene, alkynylene, carbocyclyl, or heterocyclyl; each J is independently C, NR3, -NR3C(O)-, S, or O, wherein R3 is H, alkyl, alkenyl, alkynyl, carbocyclyl, or heterocyclyl, each of which is optionally substituted; and z" may be an integer from 1 to 50. Each R1 may be alkylene and each J may be O.
The linker may comprise (i) residues of beta-alanine, glycine, lysine, 4-aminobutyric acid, 5-aminopentanoic acid, 6-aminocaproic acid, or a combination thereof; and (ii) -(R1-J)z"- or -(J-R1)z"-. Each R1 may independently be alkylene, alkenylene, alkynylene, carbocyclyl, or heterocyclyl; each J is independently C, NR3, -NR3C(O)-, S, or O, wherein R3 is H, alkyl, alkenyl, alkynyl, carbocyclyl, or heterocyclyl, each of which is optionally substituted; and z" may be an integer from 1 to 50. Each R1 may be alkylene and each J may be O. The linker may comprise glycine, beta-alanine, 4-aminobutyric acid, 5-aminopentanoic acid, 6-aminocaproic acid, or a combination thereof.
The linker may be a trivalent linker. The linker may have the following structure: wherein A1, B1, and C1 may independently be a hydrocarbon linker (e.g., NRH-(CH2)n-COOH), a PEG linker (e.g., NRH-(CH2O)n-COOH, wherein R is H, methyl, or ethyl), or one or more amino acid residues, and Z is independently a protecting group.
The hydrocarbon may be a glycine or beta-alanine residue.
The linker may be divalent and connect the cCPP to the cargo. The linker may be divalent and connect the cCPP to the exocyclic peptide (EP).
The linker may be trivalent and connect the cCPP to the cargo and the EP.
The linker may be a divalent or trivalent C1-C50 alkylene group, wherein 1-25 methylene groups are optionally and independently replaced by -N(H)-, -N(C1-C4 alkyl)-, -N(cycloalkyl)-, -O-, -C(O)O-, -S(O)2-, -S(O)2N(C1-C4 alkyl)-, -S(O)2N(cycloalkyl)-, -N(H)C(O)-, -N(C1-C4 alkyl)C(O)-, -N(cycloalkyl)C(O)-, -C(O)N(H)-, -C(O)N(C1-C4 alkyl), -C(O)N(cycloalkyl), aryl, heterocyclyl, heteroaryl, cycloalkyl, or cycloalkenyl. The linker may be a divalent or trivalent C1-C50 alkylene group in which 1-25 methylene groups are optionally and independently replaced by -N(H)-, -O-, -C(O)N(H)-, or a combination thereof.
The linker may have the following structure:
wherein: each AA is independently an amino acid residue; * is the attachment point to AA SC, and AA SC is a side chain of an amino acid residue of the cCPP; x is an integer from 1 to 10; y is an integer from 1 to 5; and z is an integer from 1 to 10. x may be an integer from 1 to 5. x may be an integer from 1 to 3. x may be 1. y may be an integer from 2 to 4. y may be 4. z may be an integer from 1 to 5. z may be an integer from 1 to 3. z may be 1. Each AA may be independently selected from glycine, beta-alanine, 4-aminobutyric acid, 5-aminopentanoic acid, and 6-aminocaproic acid.
The cCPP may be attached to the cargo through a linker ("L"). The linker may be conjugated to the cargo through a binding group ("M").
The linker may have the following structure:
wherein: x is an integer from 1 to 10; y is an integer from 1 to 5; z is an integer from 1 to 10; each AA is independently an amino acid residue; * is the attachment point to AA SC, and AA SC is a side chain of an amino acid residue of the cCPP; and M is a binding group as defined herein.
The linker may have the following structure:
wherein: x' is an integer from 1 to 23; y is an integer from 1 to 5; z' is an integer from 1 to 23; * is the attachment point to AA SC, and AA SC is a side chain of an amino acid residue of the cCPP; and M is a binding group as defined herein.
The linker may have the following structure:
wherein: x' is an integer from 1 to 23; y is an integer from 1 to 5; and z' is an integer from 1 to 23; * is the attachment point to AA SC, and AA SC is a side chain of an amino acid residue of the cCPP.
x may be an integer from 1 to 10, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10, including all ranges and subranges therebetween.
x' may be an integer from 1 to 23, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, or 23, including all ranges and subranges therebetween. x' may be an integer from 5 to 15. x' may be an integer from 9 to 13. x' may be an integer from 1 to 5. x' may be 1.
y may be an integer from 1 to 5, such as 1, 2, 3, 4, or 5, including all ranges and subranges therebetween. y may be an integer from 2 to 5. y may be an integer from 3 to 5. y may be 3 or 4. y may be 4 or 5. y may be 3. y may be 4. y may be 5.
z may be an integer from 1 to 10, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10, including all ranges and subranges therebetween.
z' may be an integer from 1 to 23, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, or 23, including all ranges and subranges therebetween. z' may be an integer from 5 to 15. z' may be an integer from 9 to 13. z' may be 11.
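The integer ranges stated above for x', y, and z' lend themselves to a simple bookkeeping check. The following is a minimal sketch for illustration only. The class name, field names, and the `peg_units` helper are hypothetical and are not part of the disclosure. The ranges (1 to 23 for x' and z', 1 to 5 for y) and the example values (x' = 1, y = 4, z' = 11) are taken from the text.

```python
from dataclasses import dataclass

# Allowed ranges for the linker indices, as stated in the text above.
_RANGES = {"x_prime": (1, 23), "y": (1, 5), "z_prime": (1, 23)}


@dataclass(frozen=True)
class LinkerIndices:
    """Hypothetical container for the indices x', y and z' of a PEG-type linker."""

    x_prime: int
    y: int
    z_prime: int

    def __post_init__(self):
        # Reject any index that is not an integer within its stated range.
        for name, (low, high) in _RANGES.items():
            value = getattr(self, name)
            if not isinstance(value, int) or not low <= value <= high:
                raise ValueError(
                    f"{name} must be an integer from {low} to {high}, got {value!r}"
                )

    def peg_units(self):
        """Total number of -(OCH2CH2)- repeat units across both PEG segments."""
        return self.x_prime + self.z_prime


if __name__ == "__main__":
    example = LinkerIndices(x_prime=1, y=4, z_prime=11)
    print(example.peg_units())  # prints 12
```

A frozen dataclass is used here so that a validated set of indices cannot later be changed to an out-of-range value.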
As discussed above, the linker or M (where M is part of the linker) may be covalently bound to any suitable location on the cargo. The linker or M (where M is part of the linker) may be covalently bound to the 3' end of the oligonucleotide cargo or the 5' end of the oligonucleotide cargo. The linker or M (where M is part of the linker) may be covalently bound to the N-terminus or the C-terminus of the peptide cargo. The linker or M (where M is part of the linker) may be covalently bound to the backbone of the oligonucleotide or peptide cargo.
The linker may be bound to a side chain of aspartic acid, glutamic acid, glutamine, asparagine, or lysine on cCPP, or a modified side chain of glutamine or asparagine (e.g., a reduced side chain having an amino group). The linker may be attached to the side chain of the lysine on cCPP.
The linker may be bound to a side chain of aspartic acid, glutamic acid, glutamine, asparagine, or lysine on the peptide cargo, or a modified side chain of glutamine or asparagine (e.g., a reduced side chain having an amino group). The linker may be bound to the side chain of lysine on the peptide cargo.
The linker may have the following structure:
wherein:
M is a group that conjugates L to a cargo, such as an oligonucleotide;
AA s is a side chain or terminus of an amino acid on the cCPP;
each AA x is independently an amino acid residue;
o is an integer from 0 to 10; and
p is an integer from 0 to 5.
The linker may have the following structure:
wherein:
M is a group that conjugates L to a cargo, such as an oligonucleotide;
AA s is a side chain or terminus of an amino acid on the cCPP;
each AA x is independently an amino acid residue;
o is an integer from 0 to 10; and
p is an integer from 0 to 5.
M may include alkylene, alkenylene, alkynylene, carbocyclyl, or heterocyclyl, each of which is optionally substituted. M may be selected from:
Wherein R is alkyl, alkenyl, alkynyl, carbocyclyl or heterocyclyl.
M may be selected from:
wherein: R10 is alkylene, cycloalkyl, or [structure], wherein a is 0 to 10.
M may be [structure]. R10 may be [structure], and a is 0 to 10. M may be [structure].
M may be a heterobifunctional crosslinker, e.g., [structure], which is disclosed in Williams et al., Curr. Protoc. Nucleic Acid Chem. 2010, 42, 4.41.1-4.41.20, incorporated herein by reference in its entirety.
M may be-C (O) -.
AA s may be a side chain or a terminal end of an amino acid on cCPP. Non-limiting examples of AA s include aspartic acid, glutamic acid, glutamine, asparagine, or lysine, or modified side chains of glutamine or asparagine (e.g., reduced side chains having an amino group). AA s may be AA SC as defined herein.
Each AA x is independently a natural or unnatural amino acid. One or more AA x may be a natural amino acid. One or more AA x may be an unnatural amino acid. One or more AA x may be a β -amino acid. The beta-amino acid may be beta-alanine.
o may be an integer from 0 to 10, such as 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10. o may be 0, 1, 2, or 3. o may be 0. o may be 1. o may be 2. o may be 3.
p may be 0 to 5, for example 0, 1, 2, 3, 4, or 5. p may be 0. p may be 1. p may be 2. p may be 3. p may be 4. p may be 5.
The linker may have the following structure:
wherein M, AA s, each -(R1-J-R2)z"-, o, and z" are as defined herein; r may be 0 or 1.
r may be 0. r may be 1.
The linker may have the following structure:
wherein M, AA s, o, p, q, r, and z" may each be as defined herein.
z" may be an integer from 1 to 50, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50, including all ranges and values therebetween. z" may be an integer from 5 to 20. z" may be an integer from 10 to 15.
The linker may have the following structure:
wherein:
M, AA s, and o are as defined herein.
Other non-limiting examples of suitable linkers include:
Wherein M and AA s are as defined herein.
Provided herein are compounds comprising cCPP and AC complementary to a target in a pre-mRNA sequence, the compounds further comprising L, wherein the linker is conjugated to AC through a binding group (M), wherein M is
Provided herein are compounds comprising cCPP and a cargo comprising an Antisense Compound (AC), such as an antisense oligonucleotide, complementary to a target in a pre-mRNA sequence, wherein the compound further comprises L, wherein the linker is conjugated to AC through a binding group (M), wherein M is selected from the group consisting of:
wherein: R1 is alkylene, cycloalkyl, or [structure], wherein t' is 0 to 10, wherein each R is independently alkyl, alkenyl, alkynyl, carbocyclyl, or heterocyclyl, wherein R1 is [structure] and t' is 2.
The linker may have the following structure:
Wherein AA s is as defined herein, and m' is 0-10.
The linker may have the formula:
The linker may have the formula: wherein "base" is the nucleobase at the 3' end of the cargo phosphorodiamidate morpholino oligomer.
The linker may have the formula:
wherein the "base" corresponds to the nucleobase at the 3' end of the cargo phosphorodiamidate morpholino oligomer.
The linker may have the formula:
wherein "base" is the nucleobase at the 3' end of the cargo phosphorodiamidate morpholino oligomer.
The linker may have the formula: wherein "base" is the nucleobase at the 3' end of the cargo phosphorodiamidate morpholino oligomer.
The linker may have the formula:
The linker may be covalently bound to the cargo at any suitable location on the cargo. The linker may be covalently bound to the 3' end or the 5' end of the oligonucleotide cargo. The linker may be covalently bound to the backbone of the cargo.
The linker may be bound to a side chain of aspartic acid, glutamic acid, glutamine, asparagine, or lysine on cCPP, or a modified side chain of glutamine or asparagine (e.g., a reduced side chain having an amino group). The linker may be attached to the side chain of the lysine on cCPP.
cCPP-linker conjugates
The cCPP may be conjugated to a linker as defined herein. The linker may be conjugated to AA SC of the cCPP as defined herein.
The linker may comprise a -(OCH2CH2)z'- subunit (e.g., as a spacer), wherein z' is an integer from 1 to 23, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, or 23. "-(OCH2CH2)z'-" is also known as PEG. The cCPP-linker conjugate may have a structure selected from Table 4:
Table 4: cCPP-linker conjugates and SEQ ID NOs
The linker may comprise a -(OCH2CH2)z'- subunit and a peptide subunit, wherein z' is an integer from 1 to 23. The peptide subunit may comprise from 2 to 10 amino acids. The cCPP-linker conjugate may have a structure selected from Table 5:
Table 5: cCPP-linker conjugates and SEQ ID NOs
EEVs comprising a cyclic cell penetrating peptide (cCPP), a linker, and an Exocyclic Peptide (EP) are provided. The EEV may have the structure of formula (B):
Or a protonated form thereof, wherein:
R1, R2, and R3 are each independently H or an aromatic or heteroaromatic side chain of an amino acid;
R 4 and R 7 are independently H or an amino acid side chain;
EP is an exocyclic peptide as defined herein;
Each m is independently an integer from 0 to 3;
n is an integer from 0 to 2;
x' is an integer from 1 to 20;
y is an integer from 1 to 5;
q is 1-4; and
z' is an integer from 1 to 23.
R1, R2, R3, R4, R7, EP, m, q, y, x', and z' are as described herein.
n may be 0. n may be 1. n may be 2.
EEVs may have the structure of formula (B-a) or (B-b):
Or a protonated form thereof, wherein EP (shown as "PE"), R 1、R2、R3、R4, m and z' are as defined above in formula (B).
EEVs may have the structure of formula (B-c):
Or a protonated form thereof, wherein EP, R1, R2, R3, R4, and m are as defined above in formula (B); AA is an amino acid as defined herein; M is as defined herein; n is an integer from 0 to 2; x is an integer from 1 to 10; y is an integer from 1 to 5; and z is an integer from 1 to 10.
EEVs may have the structure of formula (B-1), (B-2), (B-3) or (B-4):
or a protonated form thereof, wherein EP is as defined above in formula (B).
The EEV may comprise formula (B) and may have the following structure: Ac-PKKKRKVAEEA-K (cyclo [ FGFGRGRQ ]) -PEG 12 -OH (Ac-SEQ ID NO:132-K (cyclo [ SEQ ID NO:82 ]) -PEG 12 -OH) or Ac-PK-KKR-KV-AEEA-K (cyclo [ GfFGrGrQ ]) -PEG 12 -OH (Ac-SEQ ID NO:133-K (cyclo [ SEQ ID NO:83 ]) -PEG 12 -OH).
The EEV may comprise cCPP of the formula:
The EEV may comprise the formula: Ac-PKKKKRKV-miniPEG-Lys (cyclo (FfFGRGRQ)) -miniPEG-K (N3) (Ac-SEQ ID NO:42-PEG 2 -Lys (cyclo (SEQ ID NO:81)) -PEG 2-K(N3)).
The EEV may be:
EEVs may be
EEV may be Ac-P-K (Tfa) -K (Tfa) -K (Tfa) -R-K (Tfa) -V-miniPEG 2 -K (cyclo (Ff-Nal-GrGrQ)) -PEG 12 -OH (Ac-SEQ ID NO:134-miniPEG 2 -K (cyclo (SEQ ID NO:135)) -PEG 12 -OH).
EEVs may be
EEV may be Ac-P-K-K-K-R-K-V-miniPEG 2 -K (cyclo (Ff-Nal-GrGrQ)) -PEG 12 -OH (Ac-SEQ ID NO:42-PEG 2 -K (cyclo (SEQ ID NO:135)) -PEG 12 -OH).
EEVs may be
EEVs may be
EEVs may be
EEVs may be
EEVs may be
EEVs may be
The EEV may be:
EEVs may be
EEVs may be
EEVs may be
EEVs may be
EEVs may be selected from
EEV may be selected from:
Ac-PKKKKRKV-Lys (cyclo [ FfΦ GrGrQ ]) -PEG 12-K(N3)-NH2
(Ac-SEQ ID NO:42-Lys (cyclo [ SEQ ID NO:80 ]) -PEG 12-K(N3)-NH2)
Ac-PKKKKRKV-miniPEG 2 -Lys (cyclo [ FfΦ GrGrQ ]) -miniPEG 2-K(N3)-NH2
(Ac-SEQ ID NO:42-miniPEG 2 -Lys (cyclo [ SEQ ID NO:80 ]) -miniPEG 2-K(N3)-NH2)
Ac-PKKKKRKV-miniPEG 2 -Lys (cyclo [ FGFGRGRQ ]) -miniPEG 2-K(N3)-NH2
(Ac-SEQ ID NO:42-miniPEG 2 -Lys (cyclo [ SEQ ID NO:82 ]) -miniPEG 2-K(N3)-NH2)
Ac-KR-PEG 2 -K (cyclo [ FGFGRGRQ ]) -PEG 2-K(N3)-NH2
(Ac-KR-PEG 2 -K (cyclo [ SEQ ID NO:82 ]) -PEG 2-K(N3)-NH2)
Ac-PKKKGKV-PEG 2 -K (cyclo [ FGFGRGRQ ]) -PEG 2-K(N3)-NH2
(Ac-SEQ ID NO:46-PEG 2 -K (cyclo [ SEQ ID NO:82 ]) -PEG 2-K(N3)-NH2)
Ac-PKKKRKG-PEG 2 -K (cyclo [ FGFGRGRQ ]) -PEG 2-K(N3)-NH2
(Ac-SEQ ID NO:48-PEG 2 -K (cyclo [ SEQ ID NO:82 ]) -PEG 2-K(N3)-NH2)
Ac-KKKRK-PEG 2 -K (cyclo [ FGFGRGRQ ]) -PEG 2-K(N3)-NH2
(Ac-SEQ ID NO:19-PEG 2 -K (cyclo [ SEQ ID NO:82 ]) -PEG 2-K(N3)-NH2)
Ac-PKKKKRKV-miniPEG 2 -Lys (cyclo [ FFΦGRGRQ ]) -miniPEG 2-K(N3)-NH2
(Ac-SEQ ID NO:42-miniPEG 2 -Lys (cyclo [ SEQ ID NO:80 ]) -miniPEG 2-K(N3)-NH2)
Ac-PKKKKRKV-miniPEG 2 -Lys (cyclo [ βhFfΦGrGrQ ]) -miniPEG 2-K(N3)-NH2
(Ac-SEQ ID NO:42-miniPEG 2 -Lys (cyclo [ SEQ ID NO:142 ]) -miniPEG 2-K(N3)-NH2)
Ac-PKKKKRKV-miniPEG 2 -Lys (cyclo [ FfΦ SrSrQ ]) -miniPEG 2-K(N3)-NH2
(Ac-SEQ ID NO:42-miniPEG 2 -Lys (cyclo [ SEQ ID NO:143 ]) -miniPEG 2-K(N3)-NH2).
EEV may be selected from:
Ac-PKKKKRKV-miniPEG 2 -Lys (cyclo [ GfFGrGrQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:42-miniPEG 2 -Lys (cyclo [ SEQ ID NO:133 ]) -PEG 12 -OH)
Ac-PKKKKRKV-miniPEG 2 -Lys (cyclo [ FGFKRKRQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:42-miniPEG 2 -Lys (cyclo [ SEQ ID NO:144 ]) -PEG 12 -OH)
Ac-PKKKKRKV-miniPEG 2 -Lys (cyclo [ FGFRGRGQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:42-miniPEG 2 -Lys (cyclo [ SEQ ID NO:145 ]) -PEG 12 -OH)
Ac-PKKKKRKV-miniPEG 2 -Lys (cyclo [ FGFGRGRGRQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:42-miniPEG 2 -Lys (cyclo [ SEQ ID NO:146 ]) -PEG 12 -OH)
Ac-PKKKKRKV-miniPEG 2 -Lys (cyclo [ FGFGRrRQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:42-miniPEG 2 -Lys (cyclo [ SEQ ID NO:147 ]) -PEG 12 -OH)
Ac-PKKKKRKV-miniPEG 2 -Lys (cyclo [ FGFGRRRQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:42-miniPEG 2 -Lys (cyclo [ SEQ ID NO:84 ]) -PEG 12 -OH), and
Ac-PKKKKRKV-miniPEG 2 -Lys (cyclo [ FGFRRRRQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:42-miniPEG 2 -Lys (cyclo [ SEQ ID NO:85 ]) -PEG 12 -OH).
EEV may be selected from:
Ac-K-K-K-R-K-G-miniPEG 2 -K (cyclo [ FGFGRGRQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:148-miniPEG 2 -K (cyclo [ SEQ ID NO:82 ]) -PEG 12 -OH)
Ac-K-K-K-R-K-miniPEG 2 -K (cyclo [ FGFGRGRQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:19-miniPEG 2 -K (cyclo [ SEQ ID NO:82 ]) -PEG 12 -OH)
Ac-K-K-R-K-K-PEG 4 -K (cyclo [ FGFGRGRQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:22-PEG 4 -K (cyclo [ SEQ ID NO:82 ]) -PEG 12 -OH)
Ac-K-R-K-K-K-PEG 4 -K (cyclo [ FGFGRGRQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:21-PEG 4 -K (cyclo [ SEQ ID NO:82 ]) -PEG 12 -OH)
Ac-K-K-K-K-R-PEG 4 -K (cyclo [ FGFGRGRQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:23-PEG 4 -K (cyclo [ SEQ ID NO:82 ]) -PEG 12 -OH)
Ac-R-K-K-K-K-PEG 4 -K (cyclo [ FGFGRGRQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:20-PEG 4 -K (cyclo [ SEQ ID NO:82 ]) -PEG 12 -OH), and
Ac-K-K-K-R-K-PEG 4 -K (cyclo [ FGFGRGRQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:19-PEG 4 -K (cyclo [ SEQ ID NO:82 ]) -PEG 12 -OH).
EEV may be selected from:
Ac-PKKKRKV-PEG 2 -K (cyclo [ FGFGRGRQ ]) -PEG 2-K(N3)-NH2
(Ac-SEQ ID NO:42-PEG 2 -K (cyclo [ SEQ ID NO:82 ]) -PEG 2-K(N3)-NH2)
Ac-PKKKRKV-PEG 2 -K (cyclo [ FGFGRGRQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:42-PEG 2 -K (cyclo [ SEQ ID NO:82 ]) -PEG 12 -OH)
Ac-PKKKRKV-PEG 2 -K (cyclo [ GfFGrGrQ ]) -PEG 2-K(N3)-NH2
(Ac-SEQ ID NO:42-PEG 2 -K (cyclo [ SEQ ID NO:133 ]) -PEG 2-K(N3)-NH2), and
Ac-PKKKRKV-PEG 2 -K (cyclo [ GfFGrGrQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:42-PEG 2 -K (cyclo [ SEQ ID NO:133 ]) -PEG 12 -OH).
The cargo may be a protein and the EEV may be selected from:
Ac-PKKKRKV-PEG 2 -K (cyclo [ FfPhi GrGrQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:42-PEG 2 -K (cyclo [ SEQ ID NO:80 ])) -PEG 12 -OH
Ac-PKKKKRKV-PEG 2 -K (cyclo [ FfΦCit-r-Cit-rQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:42-PEG 2 -K (cyclo [ SEQ ID NO:79 ])) -PEG 12 -OH
Ac-PKKKRKV-PEG 2 -K (cyclo [ FfFGRGRQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:42-PEG 2 -K (cyclo [ SEQ ID NO:81 ]) -PEG 12 -OH)
Ac-PKKKRKV-PEG 2 -K (cyclo [ FGFGRGRQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:42-PEG 2 -K (cyclo [ SEQ ID NO:82 ])) -PEG 12 -OH
Ac-PKKKRKV-PEG 2 -K (cyclo [ GfFGrGrQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:42-PEG 2 -K (cyclo [ SEQ ID NO:133 ])) -PEG 12 -OH
Ac-PKKKRKV-PEG 2 -K (cyclo [ FGFGRRRQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:42-PEG 2 -K (cyclo [ SEQ ID NO:84 ]) -PEG 12-OH)Ac-PKKKRKV-PEG2 -K (cyclo [ FGFRRRRQ ]) -PEG 12-OH(Ac-SEQ ID NO:42-PEG2 -K (cyclo [ SEQ ID NO:85 ]) -PEG 12-OH)Ac-rr-PEG2 -K (cyclo [ FfPhi GrGrQ ]) -PEG 12 -OH
(Ac-rr-PEG 2 -K (cyclo [ SEQ ID NO:80 ]) -PEG 12 -OH)
Ac-rr-PEG 2 -K (cyclo [ FfΦCit-r-Cit-rQ ]) -PEG 12 -OH
(Ac-rr-PEG 2 -K (cyclo [ SEQ ID NO:79 ]) -PEG 12 -OH)
Ac-rr-PEG 2 -K (cyclo [ FfF-GRGRQ ]) -PEG 12 -OH
(Ac-rr-PEG 2 -K (cyclo [ SEQ ID NO:81 ]) -PEG 12 -OH)
Ac-rr-PEG 2 -K (cyclo [ FGFGRGRQ ]) -PEG 12 -OH
(Ac-rr-PEG 2 -K (cyclo [ SEQ ID NO:82 ]) -PEG 12 -OH)
Ac-rr-PEG 2 -K (cyclo [ GfFGrGrQ ]) -PEG 12 -OH
(Ac-rr-PEG 2 -K (cyclo [ SEQ ID NO:133 ]) -PEG 12 -OH)
Ac-rr-PEG 2 -K (cyclo [ FGFGRRRQ ]) -PEG 12 -OH
(Ac-rr-PEG 2 -K (cyclo [ SEQ ID NO:84 ]) -PEG 12 -OH)
Ac-rr-PEG 2 -K (cyclo [ FGFRRRRQ ]) -PEG 12 -OH
(Ac-rr-PEG 2 -K (cyclo [ SEQ ID NO:85 ]) -PEG 12 -OH)
Ac-rrr-PEG 2 -K (cyclo [ FfPhi GrGrQ ]) -PEG 12 -OH
(Ac-rrr-PEG 2 -K (cyclo [ SEQ ID NO:80 ]) -PEG 12 -OH)
Ac-rrr-PEG 2 -K (cyclo [ FfΦCit-r-Cit-rQ ]) -PEG 12 -OH
(Ac-rrr-PEG 2 -K (cyclo [ SEQ ID NO:79 ]) -PEG 12 -OH)
Ac-rrr-PEG 2 -K (cyclo [ FfFGRGRQ ]) -PEG 12 -OH
(Ac-rrr-PEG 2 -K (cyclo [ SEQ ID NO:81 ]) -PEG 12 -OH)
Ac-rrr-PEG 2 -K (cyclo [ FGFGRGRQ ]) -PEG 12 -OH
(Ac-rrr-PEG 2 -K (cyclo [ SEQ ID NO:82 ]) -PEG 12 -OH)
Ac-rrr-PEG 2 -K (cyclo [ GfFGrGrQ ]) -PEG 12 -OH
(Ac-rrr-PEG 2 -K (cyclo [ SEQ ID NO:133 ]) -PEG 12 -OH)
Ac-rrr-PEG 2 -K (cyclo [ FGFGRRRQ ]) -PEG 12 -OH
(Ac-rrr-PEG 2 -K (cyclo [ SEQ ID NO:84 ]) -PEG 12 -OH)
Ac-rrr-PEG 2 -K (cyclo [ FGFRRRRQ ]) -PEG 12 -OH
(Ac-rrr-PEG 2 -K (cyclo [ SEQ ID NO:85 ]) -PEG 12 -OH)
Ac-rhr-PEG 2 -K (cyclo [ FfΦ GrGrQ ]) -PEG 12 -OH
(Ac-rhr-PEG 2 -K (cyclo [ SEQ ID NO:80 ]) -PEG 12 -OH)
Ac-rhr-PEG 2 -K (cyclo [ FfΦCit-r-Cit-rQ ]) -PEG 12 -OH
(Ac-rhr-PEG 2 -K (cyclo [ SEQ ID NO:79 ]) -PEG 12 -OH)
Ac-rhr-PEG 2 -K (cyclo [ FfFGRGRQ ]) -PEG 12 -OH
(Ac-rhr-PEG 2 -K (cyclo [ SEQ ID NO:81 ]) -PEG 12 -OH)
Ac-rhr-PEG 2 -K (cyclo [ FGFGRGRQ ]) -PEG 12 -OH
(Ac-rhr-PEG 2 -K (cyclo [ SEQ ID NO:82 ]) -PEG 12 -OH)
Ac-rhr-PEG 2 -K (cyclo [ GfFGrGrQ ]) -PEG 12 -OH
(Ac-rhr-PEG 2 -K (cyclo [ SEQ ID NO:133 ]) -PEG 12 -OH)
Ac-rhr-PEG 2 -K (cyclo [ FGFGRRRQ ]) -PEG 12 -OH
(Ac-rhr-PEG 2 -K (cyclo [ SEQ ID NO:84 ]) -PEG 12 -OH)
Ac-rhr-PEG 2 -K (cyclo [ FGFRRRRQ ]) -PEG 12 -OH
(Ac-rhr-PEG 2 -K (cyclo [ SEQ ID NO:85 ]) -PEG 12 -OH)
Ac-rbr-PEG 2 -K (cyclo [ FfΦ GrGrQ ]) -PEG 12 -OH
(Ac-rbr-PEG 2 -K (cyclo [ SEQ ID NO:80 ]) -PEG 12 -OH)
Ac-rbr-PEG 2 -K (cyclo [ FfΦCit-r-Cit-rQ ]) -PEG 12 -OH
(Ac-rbr-PEG 2 -K (cyclo [ SEQ ID NO:79 ]) -PEG 12 -OH)
Ac-rbr-PEG 2 -K (cyclo [ FfFGRGRQ ]) -PEG 12 -OH
(Ac-rbr-PEG 2 -K (cyclo [ SEQ ID NO:81 ]) -PEG 12 -OH)
Ac-rbr-PEG 2 -K (cyclo [ FGFGRGRQ ]) -PEG 12 -OH
(Ac-rbr-PEG 2 -K (cyclo [ SEQ ID NO:82 ]) -PEG 12 -OH)
Ac-rbr-PEG 2 -K (cyclo [ GfFGrGrQ ]) -PEG 12 -OH
(Ac-rbr-PEG 2 -K (cyclo [ SEQ ID NO:133 ]) -PEG 12 -OH)
Ac-rbr-PEG 2 -K (cyclo [ FGFGRRRQ ]) -PEG 12 -OH
(Ac-rbr-PEG 2 -K (cyclo [ SEQ ID NO:84 ]) -PEG 12 -OH)
Ac-rbr-PEG 2 -K (cyclo [ FGFRRRRQ ]) -PEG 12 -OH
(Ac-rbr-PEG 2 -K (cyclo [ SEQ ID NO:85 ]) -PEG 12 -OH)
Ac-rbrbr-PEG 2 -K (cyclo [ FfΦ GrGrQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:138-PEG 2 -K (cyclo [ SEQ ID NO:80 ])) -PEG 12-OH)Ac-rbrbr-PEG2 -K (cyclo [ FfΦCit-r-Cit-rQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:138-PEG 2 -K (cyclo [ SEQ ID NO:79 ]) -PEG 12-OH)Ac-rbrbr-PEG2 -K (cyclo [ FfFGRGRQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:138-PEG 2 -K (cyclo [ SEQ ID NO:81 ]) -PEG 12-OH)Ac-rbrbr-PEG2 -K (cyclo [ FGFGRGRQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:138-PEG 2 -K (cyclo [ SEQ ID NO:82 ]) -PEG 12-OH)Ac-rbrbr-PEG2 -K (cyclo [ GfFGrGrQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:138-PEG 2 -K (cyclo [ SEQ ID NO:133 ]) -PEG 12-OH)Ac-rbrbr-PEG2 -K (cyclo [ FGFGRRRQ ])) -PEG 12 -OH
(Ac-SEQ ID NO:138-PEG 2 -K (cyclo [ SEQ ID NO:84 ]) -PEG 12-OH)Ac-rbrbr-PEG2 -K (cyclo [ FGFRRRRQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:138-PEG 2 -K (cyclo [ SEQ ID NO:85 ])) -PEG 12-OH)Ac-rbhbr-PEG2 -K (cyclo [ FfPhi GrGrQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:149-PEG 2 -K (cyclo [ SEQ ID NO:80 ])) -PEG 12-OH)Ac-rbhbr-PEG2 -K (cyclo [ FfΦCit-r-Cit-rQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:149-PEG 2 -K (cyclo [ SEQ ID NO:79 ]) -PEG 12-OH)Ac-rbhbr-PEG2 -K (cyclo [ FfFGRGRQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:149-PEG 2 -K (cyclo [ SEQ ID NO:81 ]) -PEG 12-OH)Ac-rbhbr-PEG2 -K (cyclo [ FGFGRGRQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:149-PEG 2 -K (cyclo [ SEQ ID NO:82 ]) -PEG 12-OH)Ac-rbhbr-PEG2 -K (cyclo [ GfFGrGrQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:149-PEG 2 -K (cyclo [ SEQ ID NO:133 ]) -PEG 12-OH)Ac-rbhbr-PEG2 -K (cyclo [ FGFGRRRQ ])) -PEG 12 -OH
(Ac-SEQ ID NO:149-PEG 2 -K (cyclo [ SEQ ID NO:84 ]) -PEG 12-OH)Ac-rbhbr-PEG2 -K (cyclo [ FGFRRRRQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:149-PEG 2 -K (cyclo [ SEQ ID NO:85 ])) -PEG 12-OH)Ac-hbrbh-PEG2 -K (cyclo [ FfPhi GrGrQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:141-PEG 2 -K (cyclo [ SEQ ID NO:80 ])) -PEG 12-OH)Ac-hbrbh-PEG2 -K (cyclo [ FfΦCit-r-Cit-rQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:141-PEG 2 -K (cyclo [ SEQ ID NO:79 ]) -PEG 12-OH)Ac-hbrbh-PEG2 -K (cyclo [ FfFGRGRQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:141-PEG 2 -K (cyclo [ SEQ ID NO:81 ]) -PEG 12-OH)Ac-hbrbh-PEG2 -K (cyclo [ FGFGRGRQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:141-PEG 2 -K (cyclo [ SEQ ID NO:82 ])) -PEG 12 -OH
Ac-hbrbh-PEG 2 -K (cyclo [ GfFGrGrQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:141-PEG 2 -K (cyclo [ SEQ ID NO:133 ])) -PEG 12 -OH
Ac-hbrbh-PEG 2 -K (cyclo [ FGFGRRRQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:141-PEG 2 -K (cyclo [ SEQ ID NO:84 ]) -PEG 12 -OH)
Ac-hbrbh-PEG 2 -K (cyclo [ FGFRRRRQ ]) -PEG 12 -OH
(Ac-SEQ ID NO:141-PEG 2 -K (cyclo [ SEQ ID NO:85 ])) -PEG 12 -OH,
Wherein b is beta-alanine and the exocyclic sequence may be D or L stereochemistry.
Cargo
A cell penetrating peptide (CPP), such as a cyclic cell penetrating peptide (cCPP), may be conjugated to a cargo. As used herein, a "cargo" is a compound or moiety that is desired to be delivered into a cell. The cargo may be conjugated to the terminal carbonyl group of the linker. At least one atom of the cyclic peptide may be replaced by the cargo, or at least one lone pair may form a bond with the cargo. The cargo may be conjugated to the cCPP via a linker. The cargo may be conjugated to AA SC via a linker. At least one atom of the cCPP may be replaced by a therapeutic moiety, or at least one lone pair of the cCPP may form a bond with a therapeutic moiety. A hydroxyl group on an amino acid side chain of the cCPP may be replaced by a bond to the cargo. A hydroxyl group on the glutamine side chain of the cCPP may be replaced by a bond to the cargo.
In embodiments, the amino acid side chain comprises a chemically reactive group that is conjugated to the linker or the cargo. Chemically reactive groups may include amine groups, carboxylic acids, amides, hydroxyl groups, sulfhydryl groups, guanidine groups, phenol groups, thioether groups, imidazole groups, or indole groups. In embodiments, the amino acid of the cCPP that is conjugated to the cargo includes lysine, arginine, aspartic acid, glutamic acid, asparagine, glutamine, homoglutamine, serine, threonine, tyrosine, cysteine, methionine, histidine, or tryptophan.
The cargo may comprise one or more detectable moieties, one or more therapeutic moieties (TM), one or more targeting moieties, or any combination thereof. In embodiments, the cargo comprises a TM. In embodiments, the TM comprises an antisense compound (AC). In embodiments, the AC binds to at least a portion of a splice element (SE) of a target gene transcript, or binds in sufficient proximity to the SE of the target gene transcript to modulate splicing of the target gene transcript. In embodiments, the AC binds to at least a portion of the SE of a target IRF-5, DMPK, or DUX4 gene transcript. In embodiments, the AC binds in sufficient proximity to the SE of the target IRF-5, DMPK, or DUX4 gene transcript to modulate splicing of the target IRF-5, DMPK, or DUX4 gene transcript.
Cyclic cell penetrating peptides (cCPPs) conjugated to cargo moieties
The cyclic cell penetrating peptide (cCPP) may be conjugated to a cargo moiety.
The cargo moiety may be conjugated to a linker at the terminal carbonyl group to provide the following structure:
Wherein:
EP is an exocyclic peptide, and M, AA SC, cargo, x', y, and z' are as defined above; * is the attachment point to AA SC. x' may be 1. y may be 4. z' may be 11. -(OCH2CH2)x'- and/or -(OCH2CH2)z'- may independently be replaced by one or more amino acids including, for example, glycine, beta-alanine, 4-aminobutyric acid, 5-aminopentanoic acid, 6-aminocaproic acid, or combinations thereof.
An Endosomal Escape Vector (EEV) may comprise a cyclic cell penetrating peptide (cCPP), an Exocyclic Peptide (EP), and a linker, and may be conjugated to cargo to form an EEV-conjugate comprising a structure of formula (C):
or a protonated form thereof,
Wherein:
R1, R2, and R3 may each independently be H or an amino acid residue having a side chain comprising an aromatic group;
R4 is H or an amino acid side chain;
EP is an exocyclic peptide as defined herein;
Cargo is a moiety as defined herein;
Each m is independently an integer from 0 to 3;
n is an integer from 0 to 2;
x' is an integer from 2 to 20;
y is an integer from 1 to 5;
q is an integer from 1 to 4; and
z' is an integer from 2 to 20.
R1, R2, R3, R4, EP, cargo, m, n, x', y, q, and z' are as defined herein.
EEV may be conjugated to cargo, and EEV-conjugates may comprise a structure of formula (C-a) or (C-b):
or a protonated form thereof, wherein EP, m and z are as defined above in formula (C).
EEV may be conjugated to cargo, and EEV-conjugates may comprise a structure of formula (C-c):
Or a protonated form thereof, wherein EP, R 1、R2、R3、R4 and m are as defined above in formula (III); AA may be an amino acid as defined herein; n may be an integer from 0 to 2; x may be an integer from 1 to 10; y may be an integer from 1 to 5; z may be an integer from 1 to 10.
EEV may be conjugated to an oligonucleotide cargo, and EEV-oligonucleotide conjugates may comprise a structure of formula (C-1), (C-2), (C-3), or (C-4):
EEV can be conjugated to an oligonucleotide cargo, and EEV-conjugates can comprise the following structure:
Cytoplasmic delivery efficiency
Modification of the cyclic cell penetrating peptide (cCPP) may improve cytoplasmic delivery efficiency. Improved cytoplasmic uptake efficiency can be measured by comparing the cytoplasmic delivery efficiency of a cCPP having the modified sequence to that of a control sequence. The control sequence does not include the particular replacement amino acid residues of the modified sequence (including, but not limited to, arginine, phenylalanine, and/or glycine), but is otherwise identical.
As used herein, cytoplasmic delivery efficiency refers to the ability of a cCPP to cross the cell membrane and enter the cytosol of a cell. A cCPP is not necessarily dependent on a particular receptor or cell type. Cytoplasmic delivery efficiency may refer to absolute cytoplasmic delivery efficiency or relative cytoplasmic delivery efficiency.
Absolute cytosolic delivery efficiency is the ratio of the cytosolic concentration of a cCPP (or a cCPP-cargo conjugate) to its concentration in the growth medium. Relative cytosolic delivery efficiency is the concentration of a cCPP in the cytosol compared to the concentration of a control cCPP in the cytosol. Quantification may be achieved by fluorescently labeling the cCPP (e.g., with a FITC dye) and measuring the fluorescence intensity using techniques well known in the art.
Relative cytoplasmic delivery efficiency is determined by comparing the amount of a cCPP of the invention internalized by a cell type (e.g., HeLa cells) to the amount of a control cCPP internalized by the same cell type. To measure relative cytoplasmic delivery efficiency, the cell type can be incubated in the presence of the cCPP for a specified period of time (e.g., 30 minutes, 1 hour, 2 hours, etc.), after which the amount of cCPP internalized by the cells is quantified using methods known in the art, such as fluorescence microscopy. Separately, the same concentration of the control cCPP is incubated in the presence of the same cell type for the same period of time, and the amount of control cCPP internalized by the cells is quantified.
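The two definitions above reduce to simple ratios. The following Python sketch is illustrative only: the function names and the numeric inputs are hypothetical and are not taken from this disclosure, and it assumes that the measured fluorescence signal is proportional to the cytosolic concentration of the labeled cCPP.

```python
def absolute_efficiency(cytosolic_conc: float, medium_conc: float) -> float:
    """Ratio of the cytosolic concentration to the growth-medium concentration."""
    if medium_conc <= 0:
        raise ValueError("medium concentration must be positive")
    return cytosolic_conc / medium_conc


def relative_efficiency(test_signal: float, control_signal: float) -> float:
    """Cytosolic signal of the test cCPP as a percentage of the control cCPP.

    Both signals are assumed to come from the same cell type, the same
    cCPP concentration, and the same incubation time.
    """
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return 100.0 * test_signal / control_signal


# Hypothetical measurements (arbitrary units), for illustration only.
abs_eff = absolute_efficiency(cytosolic_conc=2.0, medium_conc=5.0)
rel_eff = relative_efficiency(test_signal=1500.0, control_signal=1000.0)
print(f"absolute: {abs_eff:.0%}, relative: {rel_eff:.0f}%")
```

With these made-up inputs, a relative value above 100% would indicate that the test cCPP reaches the cytosol more efficiently than the control under the same conditions.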
Relative cytoplasmic delivery efficiency can also be determined by measuring the IC50 of a cCPP having the modified sequence against an intracellular target and comparing it to the IC50 of the control sequence (as described herein).
The relative cytoplasmic delivery efficiency of a cCPP, as compared to cyclo(FfΦRrRQ, SEQ ID NO: 150), can be in the range of about 50% to about 450%, for example, about 60%, about 70%, about 80%, about 90%, about 100%, about 110%, about 120%, about 130%, about 140%, about 150%, about 160%, about 170%, about 180%, about 190%, about 200%, about 210%, about 220%, about 230%, about 240%, about 250%, about 260%, about 270%, about 280%, about 290%, about 300%, about 310%, about 320%, about 330%, about 340%, about 350%, about 360%, about 370%, about 380%, about 390%, about 400%, about 410%, about 420%, about 430%, about 440%, about 450%, about 460%, about 470%, about 480%, about 490%, about 500%, about 510%, about 520%, about 530%, about 540%, about 550%, about 560%, about 570%, about 580%, or about 590%, including all ranges and values therebetween. The relative cytoplasmic delivery efficiency of a cCPP can be improved by greater than about 600% as compared to a cyclic peptide comprising cyclo(FfΦRrRQ, SEQ ID NO: 150).
The absolute cytoplasmic delivery efficiency can be about 40% to about 100%, for example about 45%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, or about 99%, including all values and subranges therebetween.
The cCPPs of the present disclosure can improve cytoplasmic delivery efficiency by about 1.1-fold to about 30-fold, e.g., about 1.2, about 1.3, about 1.4, about 1.5, about 1.6, about 1.7, about 1.8, about 1.9, about 2.0, about 2.5, about 3.0, about 3.5, about 4.0, about 4.5, about 5.0, about 5.5, about 6.0, about 6.5, about 7.0, about 7.5, about 8.0, about 8.5, about 9.0, about 10, about 10.5, about 11.0, about 11.5, about 12.0, about 12.5, about 13.0, about 13.5, about 14.0, about 14.5, about 15.0, about 15.5, about 16.0, about 16.5, about 17.0, about 17.5, about 18.0, about 18.5, about 19.0, about 19.5, about 20.0, about 20.5, about 21.0, about 21.5, about 22.0, about 22.5, about 23.5, about 25.0, about 25.5, about 26.0, or about 26.5-fold, including all values and subranges therebetween.
Detectable moiety
In embodiments, the compounds disclosed herein comprise a detectable moiety. In embodiments, the detectable moiety is attached to the cell penetrating peptide at the amino group, the carboxylate group, or the side chain of any amino acid of the cell penetrating peptide (e.g., at the amino group, the carboxylate group, or the side chain of any amino acid in the CPP). In embodiments, the therapeutic moiety comprises a detectable moiety. The detectable moiety may comprise any detectable label. Examples of suitable detectable labels include, but are not limited to, UV-Vis labels, near infrared labels, luminescent groups, phosphorescent groups, magnetic spin resonance labels, photosensitizers, photocleavable moieties, chelating centers, heavy atoms, radioisotopes, isotopically detectable spin resonance labels, paramagnetic moieties, chromophores, or any combination thereof. In embodiments, the label is detectable without the addition of other reagents.
In embodiments, the detectable moiety is a biocompatible detectable moiety, such that the compound is suitable for use in a variety of biological applications. As used herein, "biocompatible" and "biologically compatible" generally refer to compounds that, along with any metabolites or degradation products thereof, are generally non-toxic to cells and tissues and do not cause any significant adverse effects to cells and tissues when they are incubated (e.g., cultured) in the presence thereof.
The detectable moiety may contain a luminophore, such as a fluorescent label or a near infrared label. Examples of suitable luminophores include, but are not limited to, metalloporphyrins; benzoporphyrins; azabenzoporphyrins; naphthoporphyrins; phthalocyanines; polycyclic aromatic hydrocarbons such as perylene diimine and pyrene; azo dyes; xanthene dyes; boron dipyrromethene, aza-boron dipyrromethene, cyanine dyes, metal-ligand complexes such as the bipyridine, bipyridyl, phenanthroline, coumarin, and acetylacetonate complexes of ruthenium and iridium; acridine; oxazine derivatives such as benzophenoxazine; aza-annulene; squaraine; 8-hydroxyquinoline; polymethines; luminescent nanoparticles such as quantum dots and nanocrystals; quinolones; terbium complexes; inorganic phosphors; ionophores such as crown ether-affiliated or derivatized dyes; or combinations thereof. Specific examples of suitable luminophores include, but are not limited to, octaethylporphyrin Pd(II); octaethylporphyrin Pt(II); tetraphenylporphyrin Pd(II); tetraphenylporphyrin Pt(II); meso-tetraphenylporphyrin tetrabenzoporphin Pd(II); meso-tetraphenylmethyl benzoporphyrin Pt(II); octaethylporphyrin ketone Pd(II); octaethylporphyrin ketone Pt(II); meso-tetrakis(pentafluorophenyl)porphyrin Pd(II); meso-tetrakis(pentafluorophenyl)porphyrin Pt(II); tris(4,7-diphenyl-1,10-phenanthroline) Ru(II) (Ru(dpp)3); tris(1,10-phenanthroline) Ru(II) (Ru(phen)3); tris(2,2'-bipyridine)ruthenium(II) chloride hexahydrate (Ru(bpy)3); erythrosine B; fluorescein; fluorescein isothiocyanate (FITC); eosin; ((N-methyl-benzimidazol-2-yl)-7-(diethylamino)-coumarin) iridium(III); ((benzothiazol-2-yl)-7-(diethylamino)-coumarin)-2-(acetylacetonate); Lumogen dyes; Macroflex fluorescent red; Macrolex fluorescent yellow; Texas Red; rhodamine B; rhodamine 6G; thiorhodamine; m-cresol; thymol blue; xylenol blue; cresol red; chlorophenol blue; bromocresol green; bromocresol red; bromothymol blue; Cy2; Cy3; Cy5; Cy5.5; Cy7; 4-nitrophenol; alizarin; phenolphthalein; o-cresolphthalein; chlorophenol red; calmagite; bromoxylenol; phenol red; neutral red; nitrazine; 3,4,5,6-tetrabromophenolphthalein; Congo red; fluorescein; eosin; 2',7'-dichlorofluorescein; 5(6)-carboxyfluorescein; carboxynaphthofluorescein; 8-hydroxypyrene-1,3,6-trisulfonic acid; semi-naphthorhodafluor; semi-naphthofluorescein; tris(4,7-diphenyl-1,10-phenanthroline)ruthenium(II) dichloride; (4,7-diphenyl-1,10-phenanthroline)ruthenium(II) tetraphenylboron; platinum(II) octaethylporphyrin; dialkylcarbocyanines; dioctadecyl epoxy carbocyanine; fluorenylmethoxycarbonyl chloride; 7-amino-4-methylcoumarin (Amc); green fluorescent protein (GFP); and derivatives or combinations thereof.
In some examples, the detectable moiety may include rhodamine B (Rho), fluorescein isothiocyanate (FITC), 7-amino-4-methylcoumarin (Amc), green fluorescent protein (GFP), or derivatives or combinations thereof.
Preparation method
The compounds described herein may be prepared in a number of ways known to those skilled in the art of organic synthesis or in variations thereof as understood by those skilled in the art. The compounds described herein can be prepared from readily available starting materials. The optimal reaction conditions may vary with the particular reactants or solvents used, but such conditions may be determined by one skilled in the art.
Variations of the compounds described herein include addition, subtraction, or movement of the various components as described for each compound. Similarly, the chirality of a molecule may change when one or more chiral centers are present in the molecule. In addition, compound synthesis may involve protection and deprotection of various chemical groups. The use of protection and deprotection and the selection of appropriate protecting groups can be determined by one skilled in the art. The chemistry of protecting groups can be found, for example, in Wuts and Greene, Protective Groups in Organic Synthesis, 4th edition, Wiley & Sons, 2006, which is incorporated herein by reference in its entirety.
Starting materials and reagents for preparing the disclosed compounds and compositions are available from commercial suppliers such as Aldrich Chemical Co. (Milwaukee, WI), Acros Organics (Morris Plains, NJ), Fisher Scientific (Pittsburgh, PA), Sigma (St. Louis, MO), Pfizer (New York, NY), GlaxoSmithKline (Raleigh, NC), Merck (Whitehouse Station, NJ), Johnson & Johnson (New Brunswick, NJ), Aventis (Bridgewater, NJ), AstraZeneca (Wilmington, DE), Novartis (Basel, Switzerland), Wyeth (Madison, NJ), Bristol-Myers-Squibb (New York, NY), Roche (Basel, Switzerland), Lilly (Indianapolis, IN), Abbott (Abbott Park, IL), Schering Plough (Kenilworth, NJ), or Boehringer Ingelheim (Ingelheim, Germany), or are prepared by methods known to those skilled in the art following procedures described in references such as Fieser and Fieser's Reagents for Organic Synthesis, volumes 1-17 (John Wiley and Sons, 1991); Rodd's Chemistry of Carbon Compounds, volumes 1-5 and supplements (Elsevier Science Publishers, 1989); Organic Reactions, volumes 1-40 (John Wiley and Sons, 1991); March's Advanced Organic Chemistry (John Wiley and Sons, 4th edition); and Larock's Comprehensive Organic Transformations (VCH Publishers Inc., 1989). Other materials, such as the pharmaceutical carriers disclosed herein, are available from commercial sources.
The reactions to prepare the compounds described herein may be carried out in a solvent, which may be selected by one skilled in the art of organic synthesis. The solvent may be substantially non-reactive with the starting materials (reactants), intermediates, or products under the conditions (e.g., temperature and pressure) under which the reaction is carried out. The reaction may be carried out in one solvent or in a mixture of more than one solvent. Product or intermediate formation may be monitored according to any suitable method known in the art. For example, product formation may be monitored by spectroscopic means such as nuclear magnetic resonance spectroscopy (e.g., 1H or 13C), infrared spectroscopy, spectrophotometry (e.g., UV-visible), or mass spectrometry, or by chromatography such as high performance liquid chromatography (HPLC) or thin layer chromatography.
The disclosed compounds can be prepared by solid phase peptide synthesis in which the amino acid α-N-terminus is protected by an acid- or base-sensitive protecting group. Such protecting groups should be stable to the conditions of peptide bond formation while being readily removable without destroying the growing peptide chain or racemizing any of the chiral centers contained therein. Suitable protecting groups are 9-fluorenylmethoxycarbonyl (Fmoc), t-butoxycarbonyl (Boc), benzyloxycarbonyl (Cbz), biphenylisopropoxycarbonyl, t-pentyloxycarbonyl, isobornyloxycarbonyl, α,α-dimethyl-3,5-dimethoxybenzyloxycarbonyl, o-nitrophenylsulfenyl, 2-cyano-t-butoxycarbonyl, and the like. The 9-fluorenylmethoxycarbonyl (Fmoc) protecting group is particularly preferred for the synthesis of the disclosed compounds. Other preferred side chain protecting groups are: for side chain amino groups such as those of lysine and arginine, 2,2,5,7,8-pentamethylchroman-6-sulfonyl (Pmc), nitro, p-toluenesulfonyl, 4-methoxybenzenesulfonyl, Cbz, Boc, and adamantyloxycarbonyl; for tyrosine, benzyl, o-bromobenzyloxycarbonyl, 2,6-dichlorobenzyl, isopropyl, t-butyl (t-Bu), cyclohexyl, cyclopentyl, and acetyl (Ac); for serine, t-butyl, benzyl, and tetrahydropyranyl; for histidine, trityl, benzyl, Cbz, p-toluenesulfonyl, and 2,4-dinitrophenyl; for tryptophan, formyl; for aspartic acid and glutamic acid, benzyl and t-butyl; and for cysteine, triphenylmethyl (trityl).
In the solid phase peptide synthesis method, the α-C-terminal amino acid is attached to a suitable solid support or resin. Suitable solid supports for the above synthesis are those materials that are inert to the reagents and reaction conditions of the stepwise condensation-deprotection reactions and insoluble in the media used. Solid supports for the synthesis of α-C-terminal carboxy peptides are 4-hydroxymethylphenoxymethyl-copoly(styrene-1% divinylbenzene) or 4-(2',4'-dimethoxyphenyl-Fmoc-aminomethyl)phenoxyacetamidoethyl resin, available from Applied Biosystems (Foster City, Calif.). The α-C-terminal amino acid is coupled to the resin by means of N,N'-dicyclohexylcarbodiimide (DCC), N,N'-diisopropylcarbodiimide (DIC), or O-benzotriazol-1-yl-N,N,N',N'-tetramethyluronium hexafluorophosphate (HBTU), with or without 4-dimethylaminopyridine (DMAP), 1-hydroxybenzotriazole (HOBT), benzotriazol-1-yloxy-tris(dimethylamino)phosphonium hexafluorophosphate (BOP), or bis(2-oxo-3-oxazolidinyl)phosphine chloride (BOPCl), for about 1 to about 24 hours at a temperature of 10 °C to 50 °C in a solvent such as dichloromethane or DMF. When the solid support is 4-(2',4'-dimethoxyphenyl-Fmoc-aminomethyl)phenoxy-acetamidoethyl resin, the Fmoc group is cleaved with a secondary amine (preferably piperidine) prior to coupling with the α-C-terminal amino acid as described above. One method for coupling to the deprotected 4-(2',4'-dimethoxyphenyl-Fmoc-aminomethyl)phenoxy-acetamidoethyl resin uses O-benzotriazol-1-yl-N,N,N',N'-tetramethyluronium hexafluorophosphate (HBTU, 1 eq.) and 1-hydroxybenzotriazole (HOBT, 1 eq.) in DMF. The coupling of successive protected amino acids can be carried out in an automated polypeptide synthesizer. In one example, the α-N-terminus of the amino acids of the growing peptide chain is protected with Fmoc.
Removal of the Fmoc protecting group from the α-N-terminal side of the growing peptide is accomplished by treatment with a secondary amine, preferably piperidine. Each protected amino acid is then introduced in about a 3-fold molar excess, and the coupling is preferably carried out in DMF. The coupling agent may be O-benzotriazol-1-yl-N,N,N',N'-tetramethyluronium hexafluorophosphate (HBTU, 1 eq.) and 1-hydroxybenzotriazole (HOBT, 1 eq.). At the end of the solid phase synthesis, the polypeptide is removed from the resin and deprotected, either successively or in a single operation. Removal and deprotection of the polypeptide can be accomplished in a single operation by treating the resin-bound polypeptide with a cleavage reagent comprising anisole, water, ethanedithiol, and trifluoroacetic acid. In cases where the α-C-terminus of the polypeptide is an alkylamide, the resin is cleaved by aminolysis with an alkylamine. Alternatively, the peptide may be removed by transesterification (e.g., with methanol), followed by aminolysis, or by direct transamidation. The protected peptide may be purified at this point or taken directly to the next step. Removal of the side chain protecting groups can be accomplished using the cleavage mixture described above. The fully deprotected peptide may be purified by a sequence of chromatographic steps employing any or all of the following types: ion exchange on a weakly basic resin (acetate form); hydrophobic adsorption chromatography on underivatized polystyrene-divinylbenzene (e.g., Amberlite XAD); silica gel adsorption chromatography; ion exchange chromatography on carboxymethylcellulose; partition chromatography (e.g., on Sephadex G-25 or LH-20) or countercurrent distribution; and high performance liquid chromatography (HPLC), especially reverse-phase HPLC on octyl- or octadecylsilyl-silica bonded phase column packing.
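The stoichiometry of a single coupling cycle can be worked out with a few lines of Python. This is a minimal sketch under stated assumptions, not a protocol from this disclosure: the resin scale (0.1 mmol), the choice of Fmoc-Phe-OH as the amino acid, and the approximate molar masses are hypothetical inputs, and "1 eq." of HBTU and HOBT is interpreted here as one equivalent relative to the protected amino acid, which the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reagent:
    name: str
    molar_mass: float   # g/mol (approximate, supplied by the user)
    eq_vs_resin: float  # equivalents relative to the resin loading


def coupling_amounts(resin_mmol: float, reagents: list[Reagent]) -> dict[str, tuple[float, float]]:
    """Return {reagent name: (mmol, mg)} for one coupling cycle."""
    if resin_mmol <= 0:
        raise ValueError("resin_mmol must be positive")
    amounts = {}
    for r in reagents:
        mmol = resin_mmol * r.eq_vs_resin
        amounts[r.name] = (mmol, mmol * r.molar_mass)  # mmol x g/mol = mg
    return amounts


AA_EXCESS = 3.0  # "about a 3-fold molar excess" of protected amino acid

# Assumption: HBTU and HOBT at 1 eq. each relative to the amino acid.
reagents = [
    Reagent("Fmoc-Phe-OH", 387.4, AA_EXCESS),
    Reagent("HBTU", 379.2, AA_EXCESS * 1.0),
    Reagent("HOBT", 135.1, AA_EXCESS * 1.0),
]

for name, (mmol, mg) in coupling_amounts(0.1, reagents).items():
    print(f"{name}: {mmol:.2f} mmol, {mg:.1f} mg")
```

If the equivalents are instead meant relative to the resin loading, only the `eq_vs_resin` values for HBTU and HOBT need to change.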
The above-described polymers, such as PEG groups, may be attached to an oligonucleotide, such as an AC, under any suitable conditions. Any method known in the art may be used, including acylation, reductive alkylation, Michael addition, thiol alkylation, or other chemoselective conjugation/ligation methods through a reactive group on the PEG moiety (e.g., an aldehyde, amino, ester, thiol, α-haloacetyl, maleimido, or hydrazino group) to a reactive group on the AC (e.g., an aldehyde, amino, ester, thiol, α-haloacetyl, maleimido, or hydrazino group). Activating groups that may be used to attach the water-soluble polymer to one or more proteins include, but are not limited to, sulfones, maleimides, thiols, triflates, aziridines, oxiranes, 5-pyridinyl, and α-haloacyl groups (e.g., α-iodoacetic acid, α-bromoacetic acid, α-chloroacetic acid). If attached to the AC by reductive alkylation, the polymer selected should have a single reactive aldehyde, thereby controlling the degree of polymerization. See, e.g., Kinstler et al., Adv. Drug Delivery Rev. (2002), 54:477-485; Roberts et al., Adv. Drug Delivery Rev. (2002), 54:459-476; and Zalipsky et al., Adv. Drug Delivery Rev. (1995), 16:157-182.
In order to covalently attach an AC or linker directly to a CPP, a suitable amino acid residue of the CPP may be reacted with an organic derivatizing agent capable of reacting with a selected side chain or N-terminus or C-terminus of the amino acid. Reactive groups on the peptide or conjugate moiety include, for example, aldehyde, amino, ester, thiol, α -haloacetyl, maleimide, or hydrazine groups. Derivatizing agents include, for example, maleimide benzoyl sulfosuccinimidyl ester (conjugated via a cysteine residue), N-hydroxysuccinimide (conjugated via a lysine residue), glutaraldehyde, succinic anhydride, or other agents known in the art.
Methods of preparing and conjugating AC to linear CPPs are generally described in U.S. publication No. 2018/0298383, which is incorporated herein by reference for all purposes. The method is applicable to the cyclic CPPs disclosed herein.
Synthesis schemes are provided in FIGS. 5A-5D and FIG. 6.
Non-limiting examples of compounds comprising a CPP and a reactive group useful for conjugation to an AC are shown in Table 6. Exemplary linker groups are also shown. Examples of reactive groups include tetrafluorophenyl esters (TFP), free carboxylic acids (COOH), and azides (N3). In Table 6, n is an integer from 0 to 20; pipa6 is AcRXRRBRRXRYQFLIRXRBRXRB, wherein B is β-alanine and X is aminocaproic acid; Dap is 2,3-diaminopropionic acid; NLS is a nuclear localization sequence; βA is β-alanine; -ss- is a disulfide; PABC is the C-terminal domain of the poly(A) binding protein; Cx is an alkyl chain of length x, where x is a number; and BCN is bicyclo[6.1.0]nonyne.
TABLE 6 Compounds comprising CPP and reactive groups
In embodiments, the CPP has free carboxylic acid groups that are useful for conjugation to AC. In embodiments, EEVs have free carboxylic acid groups available for conjugation to AC.
The following structure is a 3' cyclooctyne modified PMO for click reaction with azide containing compounds:
Exemplary schemes for conjugation of CPPs and linkers to the 3' end of an AC via an amide bond are shown below.
Exemplary schemes for conjugation of CPPs and linkers to 3' -cyclooctyne modified PMOs via strain-promoted azide-alkyne cycloaddition are shown below:
Examples of conjugation chemistries for attaching AC and CPP to additional linkers containing polyethylene glycol moieties are shown below:
Examples of conjugation of CPP-linkers via strain-promoted azide-alkyne cycloaddition (click chemistry) to 5' -cyclooctyne modified PMOs are shown below:
Methods of synthesizing oligomeric antisense compounds are known in the art. The present disclosure is not limited by the method of synthesizing the AC. In embodiments, provided herein are compounds having reactive phosphorus groups that can be used to form internucleoside linkages, including, for example, phosphodiester and phosphorothioate internucleoside linkages. The methods of preparation and/or purification of the precursor or antisense compound are not limitations of the compositions or methods provided herein. Methods for the synthesis and purification of DNA, RNA and antisense compounds are well known to those skilled in the art.
Oligomerization of modified and unmodified nucleosides can be routinely performed according to literature procedures for DNA (Protocols for Oligonucleotides and Analogs, ed. Agrawal (1993), Humana Press) and/or RNA (Scaringe, Methods (2001), 23, 206-217; Gait et al., Applications of Chemically Synthesized RNA in RNA: Protein Interactions, ed. Smith (1998), 1-36; Gallo et al., Tetrahedron (2001), 57, 5707-5713).
Antisense compounds provided herein can be conveniently and routinely prepared by well-known solid phase synthesis techniques. Equipment for such synthesis is sold by several suppliers including, for example, Applied Biosystems (Foster City, CA). Any other means known in the art for such synthesis may additionally or alternatively be used. The use of similar techniques to prepare oligonucleotides such as phosphorothioates and alkylated derivatives is well known. The present disclosure is not limited by the method of synthesizing the antisense compounds.
Methods of oligonucleotide purification and analysis are known to those skilled in the art. Analytical methods include Capillary Electrophoresis (CE) and electrospray mass spectrometry. Such synthetic and analytical methods can be performed in multiwell plates. The method of the present invention is not limited by the method of oligomer purification.
Diseases
In some embodiments, various diseases or conditions may be treated, prevented, or ameliorated by the administration of a composition comprising one or more compounds described herein. In embodiments, diseases treated, prevented or ameliorated with the compositions of the present disclosure are associated with dysregulation of splicing, protein expression and/or protein activity.
In embodiments, the compounds disclosed herein are useful for treating, preventing, or ameliorating a disease or disorder. Exemplary diseases or conditions that may be treated, prevented, or modulated using the compounds of the present disclosure include, but are not limited to: cancers, including, for example, acute myeloid leukemia, B-cell leukemia/lymphoma, bladder cancer, breast cancer, chronic lymphocytic leukemia, colon cancer, colorectal cancer, Duchenne muscular dystrophy, esophageal squamous cell carcinoma, Fanconi anemia, gastric cancer, glioblastoma, hepatocellular carcinoma, lung cancer, Lynch syndrome, mantle cell lymphoma, melanoma, nasopharyngeal carcinoma, neuroblastoma, ovarian cancer, pancreatic ductal adenocarcinoma, proliferative disorders, prostate cancer, and small intestine neuroendocrine cancer; cardiovascular disorders, including, for example, atherosclerosis, cardiac hypertrophy, dilated cardiomyopathy, hypertension, ischemia/reperfusion injury, thrombosis (deep vein), and thrombosis (vein); congenital abnormalities, including microphthalmia, miaole's dysplasia, bone fragility (osteogenesis imperfecta), and rickets; endocrine disorders, including neonatal diabetes and type 2 diabetes; hematological disorders, including thrombocytopenia, α-thalassemia, and β-thalassemia; immune disorders, including IPEX syndrome, nasal polyps, severe combined immunodeficiency disease, systemic lupus erythematosus, and Wiskott-Aldrich syndrome; pulmonary disorders, including pulmonary fibrosis; musculoskeletal disorders, including myofibrosis, facioscapulohumeral muscular dystrophy, oculopharyngeal muscular dystrophy, and myotonic dystrophy; neurological disorders, including Alzheimer's disease, amyotrophic lateral sclerosis, anxiety, Fabry disease, fragile X syndrome, Friedreich's ataxia, Huntington's disease, metachromatic leukodystrophy, pseudodeficiency, neuropsychiatric disease, Parkinson's disease, and suicidal behavior; stress; Jersey syndrome; glycogen storage diseases such as Pompe disease; and combinations thereof.
In embodiments, the compounds disclosed herein are useful for treating, preventing, or ameliorating diseases associated with aberrant gene transcription, splicing, and/or translation. In embodiments, the compounds disclosed herein are useful for treating, preventing, or ameliorating diseases associated with aberrant IRF-5, GYS1, and/or DUX4 transcription, splicing, and/or translation. In embodiments, the compounds disclosed herein are useful for treating, preventing, or ameliorating a disease associated with: IRF-5, GYS1, and/or DUX4 upregulation; IRF-5, GYS1, and/or DUX4 polymorphisms; accumulation of mutant IRF-5, GYS1, and/or DUX4; or a combination thereof.
Glycogen storage disease
Glycogen synthesis and degradation are multi-step processes involving many different enzymatic reactions. For example, α-glucosidase (GAA) catalyzes the hydrolysis of glycogen by cleaving α-1,4 and α-1,6 glycosidic bonds, allowing glucose to be released into the cytoplasm. In the absence of GAA, glycogen accumulates in lysosomes in various tissues (mainly cardiac and skeletal muscle). The disorder caused by this protein deficiency is known as glycogen storage disease (GSD) (Douillard-Guilloux et al., Hum. Mol. Genet. (2010), 19(4): 684-96).
GSD is an inherited disorder of glycogen metabolism. There are more than 12 types of glycogen storage disease, which are classified based on the enzyme deficiency and the affected tissue (mainly liver or muscle). Type 0 GSD is due to a deficiency of glycogen synthase. Type I is due to a deficiency of glucose-6-phosphatase α. Type II is due to a deficiency of α-glucosidase (GAA). Type III is due to a deficiency of glycogen debranching enzyme (GDE). Type IV is due to a lack of glycogen branching activity. Type V is due to a deficiency of the muscle isoform of glycogen phosphorylase (encoded by PYGM). Type VI is due to a deficiency of the liver isoform of glycogen phosphorylase (encoded by PYGL). There is little information about the remaining GSDs, and several former GSDs have been reclassified as other disorders. A list of glycogen storage diseases is provided in Table 7 (Ellingwood S. et al. (2018), J. Endocrinol. 238(3): R131-R141, doi: 10.1530/JOE-18-0120).
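The type-to-enzyme classification in the preceding paragraph can be captured as a small lookup table. The Python sketch below is illustrative only: the dictionary covers just the types named in that paragraph (types 0 through VI), and the names `GSD_DEFICIENCY` and `deficient_enzyme` are hypothetical.

```python
# GSD type -> deficient enzyme or activity, as listed in the paragraph above.
GSD_DEFICIENCY = {
    "0": "glycogen synthase",
    "I": "glucose-6-phosphatase alpha",
    "II": "alpha-glucosidase (GAA)",
    "III": "glycogen debranching enzyme (GDE)",
    "IV": "glycogen branching activity",
    "V": "muscle glycogen phosphorylase (PYGM)",
    "VI": "liver glycogen phosphorylase (PYGL)",
}


def deficient_enzyme(gsd_type: str) -> str:
    """Look up the deficiency for a GSD type given as '0' or a Roman numeral."""
    key = gsd_type.strip().upper()
    try:
        return GSD_DEFICIENCY[key]
    except KeyError:
        raise ValueError(f"GSD type {gsd_type!r} is not in this table") from None


print(deficient_enzyme("II"))
```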
Table 7: glycogen storage disease
* OMIM (Online Mendelian Inheritance in Man); in early sources, GSD type XI was associated with lactate dehydrogenase A (OMIM 612933) deficiency.
Glycogen storage disease type II (GSDII), or Pompe disease, is an autosomal recessive lysosomal storage disease caused by mutations in the gene encoding acid α-glucosidase (GAA) that result in the absence or deficiency of the GAA protein, which is essential for the breakdown of the complex carbohydrate glycogen. Normally, the body uses GAA to break down glycogen and convert it to glucose. Failure to break down glycogen properly, together with abnormal glycogen metabolism, results in excessive accumulation of glycogen in the body's cells, particularly in cardiac, smooth, and skeletal muscle cells, which can lead to impairment and deterioration of normal tissue and organ function. Patients with Pompe disease experience serious muscle-related problems, including progressive muscle weakness throughout the body, especially in the legs, trunk, and diaphragm. As the condition progresses, breathing problems may lead to respiratory failure. To date, over 300 pathogenic mutations have been identified in GAA. It is generally estimated that a total of 5,000 to 10,000 patients in the United States and Europe have Pompe disease; however, the advent of newborn screening suggests that the disease is underdiagnosed.
Pompe disease is generally classified as infantile-onset Pompe disease (IOPD) or late-onset Pompe disease (LOPD) based on the age of onset and the severity of symptoms. IOPD is characterized by severe muscle weakness and abnormally reduced muscle tone, and typically presents within the first few months of life. If left untreated, IOPD is often fatal due to progressive heart failure, respiratory distress, or malnutrition caused by feeding difficulties. LOPD presents in childhood, adolescence, or adulthood. Patients with LOPD often have milder symptoms, such as reduced mobility and respiratory problems, and experience progressive walking difficulty and respiratory decline. The initial symptoms of LOPD may be subtle and may go unnoticed for many years.
At the time of filing, the only therapies approved for Pompe disease are alglucosidase alfa (Lumizyme in the United States and Myozyme in other regions) and avalglucosidase alfa-ngpt (Nexviazyme in the United States), both of which are forms of enzyme replacement therapy (ERT) delivered via IV infusion. Although infant patients with Pompe disease treated with ERT have shown improved survival, ERT is not curative, and in long-term observational studies many patients continue to have an increased risk of both cardiomyopathy and heart failure. These patients may also experience residual muscle weakness, including dysphagia and the accompanying increased risk of aspiration. ERT has a particularly limited ability to ameliorate skeletal myopathy and respiratory dysfunction, mainly due to its inability to penetrate key tissues affected by the disease, its lack of activity in the cytosol, and its potential immunogenicity. Despite the availability of ERT, there remains a significant unmet medical need in patients with IOPD or LOPD.
GAA catalyzes the hydrolysis of glycogen by cleaving α-1,4 and α-1,6 glycosidic bonds, allowing glucose to be released into the cytoplasm. In the absence of GAA, glycogen accumulates in lysosomes in various tissues (mainly cardiac and skeletal muscle). The disorder caused by this protein deficiency is known as glycogen storage disease (GSD) (Douillard-Guilloux et al., Hum. Mol. Genet. (2010), 19(4): 684-96).
One way in which glycogen storage disease can be treated is by downregulating glycogen synthesis, for example, by downregulating the expression and/or activity of glycogen synthase. Glycogen synthase has two major isozymes, GYS1 and GYS2. GYS1 is ubiquitously expressed in skeletal and cardiac muscle (NCBI Gene ID 2997). GYS2 is expressed mainly in liver and adipose tissue (NCBI Gene ID 2998). GYS1 acts on ingested glucose to provide a glycogen energy reserve for muscle. In contrast, GYS2 serves to maintain blood glucose levels. Alignment of the GYS1 and GYS2 mRNAs shows that the two isozymes share 71% homology over 54% of their sequence.
Downregulation of glycogen synthase (GYS1) expression has been shown to reverse glycogen accumulation (Douillard-Guilloux et al., Hum. Mol. Genet. (2010), 19(4): 684-96). The structure and mechanism of action of GYS1 have been reviewed (Palm, D.C. et al., FEBS (2013), 280(1), 2-27; and Baskaran S. et al., Proc. Natl. Acad. Sci. USA (2010), 107, 17563-17568). Because GYS1 and GYS2 differ in function, it is important that GYS1 be selectively targeted for downregulation.
In an embodiment, a method for treating a glycogen storage disease is provided. In embodiments, the method comprises administering a compound that down-regulates glycogen synthesis. In embodiments, the method comprises administering a compound that down-regulates expression of glycogen synthase. In embodiments, the method comprises administering a compound that down-regulates expression of a muscle form of glycogen synthase (GYS1). In embodiments, the compound comprises AC. The AC may be any AC and have any AC characteristics as described elsewhere herein. In embodiments, the AC may bind to at least a portion of a SE or SRE of a target transcript as described elsewhere herein. In embodiments, the AC may bind near a SE or SRE of the target transcript as described elsewhere herein. In an embodiment, the AC is an ASO. In embodiments, the ASO is a PMO. The AC may bind to any splice element of the GYS1 target transcript as described elsewhere herein.
In an embodiment, a method for treating a glycogen storage disease is provided. In embodiments, a method for treating a glycogen storage disease associated with glycogen accumulation in muscle tissue is provided. In an embodiment, a method for treating a glycogen storage disease associated with glycogen accumulation in myocardial tissue is provided. In embodiments, a method for treating a glycogen storage disease associated with glycogen accumulation in skeletal muscle tissue is provided. In an embodiment, a method for treating glycogen storage disease type II is provided. In an embodiment, a method for treating Pompe disease is provided. In an embodiment, a method for treating Andersen's disease is provided. In an embodiment, a method for treating McArdle disease is provided. In an embodiment, a method for treating Lafora disease is provided. In an embodiment, a method for treating Tarui disease is provided.
In embodiments, GYS1 is encoded by a nucleotide sequence encoding isoform 1 or isoform 2. The nucleotide sequences are publicly available through the online NCBI database (isoform 1 = NM_002103.5; isoform 2 = NM_001161587.2). In embodiments, the nucleotide sequence encoding GYS1 differs from the nucleotide sequence encoding isoform 1 or isoform 2 by one or more nucleic acids. In embodiments, the nucleotide sequences encoding GYS1 differ in one or more polymorphisms (e.g., single nucleotide polymorphisms or SNPs). In embodiments, the nucleotide sequence encoding GYS1 shares less than 100% sequence identity with the nucleotide sequence encoding isoform 1 or isoform 2. In embodiments, GYS1 is encoded by a nucleotide sequence having at least about 80%, about 85%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98% or about 99% identity with the nucleotide sequence encoding isoform 1 or isoform 2. In embodiments, GYS1 is encoded by a nucleotide sequence having 80% to 100%, 90% to 100%, 95% to 100%, or 99% to 100% identity to a nucleic acid sequence encoding isoform 1 or isoform 2.
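As a rough illustration of the identity thresholds recited above, percent identity between two pre-aligned nucleotide sequences could be computed as in the following sketch. The function name, the gap handling, and the toy sequences are assumptions made for illustration only; they are not taken from the NCBI records, and other conventions for percent identity (for example, dividing by the length of the shorter sequence) exist.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption: both inputs are already aligned (equal length, '-' for gaps).
    Columns where both sequences have a gap are ignored; a gap opposite a
    base counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy 10-nt sequences (hypothetical, not GYS1) differing at one position.
reference = "ATGGCCTTAG"
variant = "ATGGCGTTAG"
identity = percent_identity(reference, variant)
print(identity)           # 9 of 10 columns match
print(identity >= 80.0)   # would satisfy an "at least about 80%" threshold
```

In practice the alignment itself would come from a dedicated alignment tool; this sketch only covers the final counting step.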
In embodiments, the method comprises administering a compound that induces exon skipping of one or more exons in the GYS1 target transcript. In embodiments, the method comprises administering a compound comprising an Antisense Compound (AC) that induces skipping of one or more exons in the GYS1 target transcript. In embodiments, hybridization of AC to the target nucleotide sequence of the GYS1 transcript results in the inclusion or skipping of one or more exons in the target transcript. In embodiments, the skipping of one or more exons or the inclusion induces a frame shift in the GYS1 target transcript. In embodiments, the frame shift results in a GYS1 transcript encoding a glycogen synthase with reduced activity. In embodiments, the frame shift results in truncated or nonfunctional glycogen synthase. In embodiments, the frame shift results in the introduction of a premature stop codon in the GYS1 transcript. In embodiments, the introduction of a premature stop codon results in degradation of the GYS1 mRNA transcript by nonsense-mediated decay.
In embodiments, the compound comprises an Antisense Compound (AC) that induces skipping of one or more of exons 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 or 15 of human and/or mouse GYS1. In embodiments, the compound comprises AC that induces skipping of one or more exons to produce an out-of-frame transcript, resulting in degradation (e.g., nonsense-mediated decay) of the GYS1 target transcript or translation into a GYS1 protein with reduced or no activity. In embodiments, the compound comprises AC that induces skipping of one or more of exons 2, 5, 6, 7, 8, 10, 12, and/or 14 to produce an out-of-frame frameshift. In embodiments, the compound comprises AC that induces skipping of one or more exons to produce an in-frame deletion in the GYS1 target transcript. In embodiments, the compound comprises AC that induces skipping of one or more of exons 3, 4, 9, 11, 13 and/or 15. In embodiments, the compound comprises AC that induces skipping of one or more of exons 3, 4, 9, 11, 13 and/or 15 to produce an in-frame deletion in the GYS1 target transcript.
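The distinction between out-of-frame and in-frame skipping follows from the triplet reading frame: removing a number of coding nucleotides that is not a multiple of three shifts the frame downstream, whereas removing a multiple of three leaves it intact. The sketch below illustrates that arithmetic. The exon lengths are hypothetical placeholders, not the actual GYS1 exon lengths, and the function assumes that every skipped exon lies entirely within the coding sequence.

```python
def skip_effect(exon_lengths: dict[int, int], skipped: list[int]) -> str:
    """Classify the reading-frame effect of skipping the given exons.

    Assumption: every skipped exon lies entirely within the coding sequence.
    """
    removed = sum(exon_lengths[exon] for exon in skipped)
    if removed % 3 == 0:
        return "in-frame deletion"
    return "out-of-frame (frameshift)"


# Hypothetical exon lengths in nucleotides; NOT the real GYS1 exon lengths.
toy_lengths = {2: 100, 3: 90, 4: 101}

print(skip_effect(toy_lengths, [2]))     # 100 nt removed, 100 % 3 == 1
print(skip_effect(toy_lengths, [3]))     # 90 nt removed, 90 % 3 == 0
print(skip_effect(toy_lengths, [2, 4]))  # 201 nt removed, 201 % 3 == 0
```

The last call shows why the combined length matters when more than one exon is skipped: two exons that each shift the frame on their own can together remove a multiple of three nucleotides.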
In embodiments, the compounds comprise AC that binds one or more exon/intron and/or intron/exon junctions to induce exon skipping. In embodiments, the AC comprises any one of the sequences in Table 8, wherein capital letters represent exon nucleotides and lowercase letters represent intron nucleotides. In Table 8, SEQ ID NOS 151-247 are designed to induce exon skipping to produce frameshift changes. In embodiments, the frameshift change results in a premature stop codon. In embodiments, the frameshift change results in nonsense-mediated decay of the GYS1 target transcript. In Table 8, SEQ ID NOS 249-318 are designed to induce exon skipping to produce in-frame deletions. The ACs listed in Table 8 are designed to bind target nucleotide sequences comprising exons, exon/intron junctions, and/or intron/exon junctions.
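The capital/lowercase convention described above can be read mechanically. The following sketch splits a sequence into its exon (uppercase) and intron (lowercase) runs. The function name and the example sequence are hypothetical and are not taken from Table 8.

```python
from itertools import groupby


def split_by_case(seq: str) -> list[tuple[str, str]]:
    """Split a sequence into consecutive exon (uppercase) and intron (lowercase) runs."""
    return [
        ("exon" if is_upper else "intron", "".join(run))
        for is_upper, run in groupby(seq, key=str.isupper)
    ]


# Hypothetical junction-spanning sequence, for illustration only.
toy_sequence = "GATTCCAGgtaagtc"
for region, bases in split_by_case(toy_sequence):
    print(region, bases, len(bases))
```

A sequence that yields both an exon run and an intron run spans a junction; a sequence that yields a single exon run lies entirely within an exon.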
Table 8: various AC sequences targeting GYS1
In some embodiments, the AC comprises PMO sequences from U.S. application Ser. No. 16/867,261 and/or Clayton et al, Molecular Therapy-Nucleic Acids (2014) 3, e206, such as those listed in Table 9, or a portion thereof.
The PMO sequences are designed to induce exon skipping, resulting in a frameshift change. In embodiments, the frameshift change results in a premature stop codon that results in nonsense-mediated decay of the GYS1 target transcript. SEQ ID NOS 321-327 are designed to bind to a target nucleotide sequence comprising an intron/exon and/or exon/intron junction of a GYS1 target transcript. SEQ ID NOS 319 and 327 are designed to bind to a target nucleotide sequence comprising an intron sequence of the GYS1 target transcript.
Table 9: various AC sequences targeting GYS1
SEQ ID NO | Sequence (5' to 3') | Exons to be skipped
319 | TCAGGGTTGTGGACTCAATCATGCC | 8
320 | AAGGACCAGGGTAAGACTAGGGACT | 5
321 | GTCCTGGACAAGGATTGCTGACCAT | 8
322 | CTGCTTCCTTGTCTACATTGAACTG | 5
323 | ATACCCGGCCCAGGTACTTCCAATC | 14
324 | CTGGACAAGGATTGCTGACCATAGT | 8
325 | TCCCACCGAGCAGGCCTTACTCTGA | 7
326 | GACCACAGCTCAGACCCTACCTGGT | 5
327 | TCACTGTCTGGCTCACATACCCATA | 6
In embodiments, the AC comprises 10 or more, 15 or more, or 20 or more consecutive bases of any of the sequences in table 8 and/or table 9. In embodiments, the AC comprises 25 or fewer, 20 or fewer, or 15 or fewer consecutive bases of any of the sequences in table 8 and/or table 9. In embodiments, AC comprises 10 to 25, 10 to 20, or 10 to 15 consecutive bases of any of the sequences of table 8 and/or table 9. In embodiments, AC comprises 15 to 25 or 15 to 20 consecutive bases of any of the sequences in table 8 and/or table 9. In embodiments, AC comprises 20-25 consecutive bases of any one of the sequences in table 8 and/or table 9.
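Whether a candidate AC contains a given number of consecutive bases of a listed sequence can be checked by finding the longest stretch the two sequences share. The sketch below uses SEQ ID NO: 319 from Table 9 as the reference. The candidate is a hypothetical fragment constructed for illustration, and the function name is an assumption.

```python
def longest_shared_run(candidate: str, reference: str) -> int:
    """Length of the longest contiguous stretch of bases present in both sequences."""
    best = 0
    previous = [0] * (len(reference) + 1)
    for base_c in candidate:
        current = [0] * (len(reference) + 1)
        for j, base_r in enumerate(reference, start=1):
            if base_c == base_r:
                # Extend the run that ended at the previous base of both sequences.
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best


SEQ_ID_319 = "TCAGGGTTGTGGACTCAATCATGCC"  # from Table 9
candidate = SEQ_ID_319[4:20]              # hypothetical 16-base fragment

run = longest_shared_run(candidate, SEQ_ID_319)
print(run)
print(run >= 10, run >= 15, run >= 20)    # thresholds recited in the text
```

The comparison here is on exact base identity; modified bases or backbone chemistry are outside the scope of the sketch.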
In embodiments, a mouse model using mouse GYS1 is used to study the effect of compounds that induce exon skipping in GYS1 target transcripts. Mouse and human GYS1 share 97% homology on chromosome 19. In addition, both mouse and human GYS1 have 16 exons and identical splicing patterns, resulting in a full-length protein of 737 amino acids.
In embodiments, the compound comprises an Antisense Compound (AC) that induces down-regulation of human and/or mouse GYS1 by targeting its initiation codon. Examples of such sequences include those in table 10.
Table 10: various AC sequences targeting GYS1
Interferon regulatory factor 5 (IRF-5)
In embodiments, compounds for modulating the activity of interferon regulatory factor 5 (IRF-5) are provided. IRF-5 is a member of the IRF transcription factor family that is highly expressed in monocytes, macrophages, B cells and dendritic cells, and whose expression can be induced by type I interferons in other types of cells (Almuttaqi and Udalova, FEBS J. (2018), 286:1624-1637). IRF-5 is involved in innate and adaptive immunity, antiviral defense, pro-inflammatory cytokine production, macrophage polarization, cell growth regulation, differentiation and apoptosis.
Aberrant IRF-5 expression is associated with a variety of diseases. Furthermore, increased IRF-5 mRNA levels are highly correlated with disease pathology. For example, upregulation of IRF-5 can lead to increased production of IFN, which is associated with the development of a variety of inflammatory diseases including autoimmune diseases, infectious diseases, cancer, obesity, neuropathic pain, cardiovascular diseases (e.g., atherosclerosis) and metabolic dysfunction (Banga et al, Sci. Adv. (2020), 6:eaay1057). In addition, IRF-5 gene polymorphisms associated with higher IRF-5 expression are associated with susceptibility to inflammatory and autoimmune diseases including rheumatoid arthritis (RA), inflammatory bowel disease (IBD), multiple sclerosis (MS), systemic lupus erythematosus (SLE) and Sjogren's syndrome (Almuttaqi and Udalova, FEBS J. (2018), 286:1624-1637; Thompson et al, Front. Immunol. (2018), 9:2622; Ban et al, International Immunology (2018), 30, 11:529-536; Chehimi et al, J. Clin. Med. (2017), 6,712, doi.org/10.3390/jcm6070068). In addition, IRF-5 is involved in type I interferon and Toll-like receptor signaling pathways and is a downstream regulator of cytokine expression (Kristjansdottir et al, J. Med. Genet. (2008), 45:362-369).
IRF-5 exists in multiple isoforms generated from three alternative non-coding 5' exons and at least nine alternatively spliced mRNAs. The sequences of the IRF-5 isoforms are publicly available, for example, through the online NCBI database. Isoforms exhibit cell type-specific expression, subcellular localization and function. Some isoforms are associated with a risk of autoimmune disease. For example, isoform 2 is associated with overexpression of IRF-5 and susceptibility to autoimmune diseases such as systemic lupus erythematosus. In addition, polymorphisms leading to higher mRNA expression, including single nucleotide polymorphisms in the gene encoding IRF-5, are associated with many autoimmune diseases (Krausgruber et al, Nat. Immunol. (2010), 12 (3): 231-238; Kozyrev et al, Arthritis and Rheumatology (2007), 56 (4): 1234-1241).
IRF-5 activation, mechanism of action, signaling pathways and regulatory elements have been reviewed (Song et al, J. Clin. Invest. (2020), 130 (12): 6700-6717; Almuttaqi and Udalova, FEBS J. (2018), 286:1624-1637; Banga et al, Sci. Adv. (2020), 6:eaay1057; Thompson et al, Front. Immunol. (2018), 9:2622).
The gene encoding IRF-5 comprises 9 exons (exon 1, exon 2, exon 3, exon 4, exon 5, exon 6, exon 7, exon 8 and exon 9). Exon 1 is located in the 5'-untranslated region (5'-UTR) and has three variants, exon 1A, exon 1B, exon 1C and exon 1D. The major isoform comprises exon 1A. Exon 1B is associated with IRF-5 hyperactivation and disease progression. Single nucleotide polymorphisms (SNPs) (e.g., rs2004640) introduced into a donor splice site can result in increased exon 1B transcript expression and decreased exon 1C-derived transcript expression. Other SNPs (e.g., rs2280714) are also associated with increased IRF-5 expression (Kozyrev et al, Arthritis and Rheumatology (2007), 56 (4): 1234-1241).
Six isoforms of IRF-5 are provided below.
Human interferon regulatory factor 5 (IRF-5) (isoform 1)
Human interferon regulatory factor 5 (IRF-5) (isoform 2)
Human interferon regulatory factor 5 (IRF-5) (isoform 3)
Human interferon regulatory factor 5 (IRF-5) (isoform 4)
Human interferon regulatory factor 5 (IRF-5) (isoform 5)
Human interferon regulatory factor 5 (IRF-5) (isoform 6)
In embodiments, IRF-5 is encoded by a nucleotide sequence encoding IRF-5 isoform 1, IRF-5 isoform 2, IRF-5 isoform 3, IRF-5 isoform 4, IRF-5 isoform 5, or IRF-5 isoform 6. In embodiments, the nucleotide sequence encoding IRF-5 differs from the nucleotide sequence encoding IRF-5 isoform 1, IRF-5 isoform 2, IRF-5 isoform 3, IRF-5 isoform 4, IRF-5 isoform 5 or IRF-5 isoform 6 by one or more nucleic acids. In embodiments, the nucleotide sequences encoding IRF-5 differ in one or more polymorphisms (e.g., single nucleotide polymorphisms or SNPs). In embodiments, the nucleotide sequence encoding IRF-5 shares less than 100% sequence identity with a nucleotide sequence encoding IRF-5 isoform 1, IRF-5 isoform 2, IRF-5 isoform 3, IRF-5 isoform 4, IRF-5 isoform 5, or IRF-5 isoform 6. In embodiments, IRF-5 is encoded by a nucleotide sequence that is at least about 80%, about 85%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, or about 99% identical to a nucleic acid sequence encoding IRF-5 isoform 1, IRF-5 isoform 2, IRF-5 isoform 3, IRF-5 isoform 4, IRF-5 isoform 5, or IRF-5 isoform 6. In embodiments, IRF-5 is encoded by a nucleotide sequence that is 80% to 100%, 90% to 100%, 95% to 100%, or 99% to 100% identical to a nucleic acid sequence encoding IRF-5 isoform 1, IRF-5 isoform 2, IRF-5 isoform 3, IRF-5 isoform 4, IRF-5 isoform 5, or IRF-5 isoform 6.
IRF-5 has been demonstrated to affect inflammatory macrophage phenotypes (Almuttaqi and Udalova, FEBS J. (2018), 286:1624-1637). Macrophages can be classified as M1 (classically activated macrophages) or M2 (alternatively activated macrophages) and can interconvert depending on the tissue microenvironment. There are three classes of alternatively activated macrophages (M2a, M2b and M2c). In normal tissues, the ratio of M1 to M2 macrophages is highly regulated. Imbalance between M1 and M2 macrophages can lead to conditions such as asthma, chronic lung disease, atherosclerosis, or osteoclastogenesis in rheumatoid arthritis. IRF-5 is the primary regulator of pro-inflammatory M1 macrophage polarization (Weiss et al, Mediators of Inflammation (2013), dx.doi.org/10.1155/2013/245804).
Exposure of naive monocytes or recruited macrophages to the Th1 cytokine IFN-γ, TNF or LPS promotes M1 development; M1 macrophages secrete pro-inflammatory cytokines such as TNF, IL-1β, IL-6, IL-12 and IL-23 and promote development of Th1 lymphocytes. Exposure of monocytes to IL-4 and IL-13 promotes the M2a phenotype, which expresses chemokines that promote growth of Th2 cells, eosinophils, and basophils. M2b macrophages are induced by a combination of LPS, immune complexes, apoptotic cells and IL-1Ra. M2b macrophages secrete high levels of IL-10 and the pro-inflammatory cytokines TNF and IL-6 and express iNOS. M2c macrophages are induced by a combination of IL-10, TGF-beta and glucocorticoids and secrete IL-10 and TGF-beta, which promote the development of Th2 lymphocytes (Duque and Descoteaux, Front. Immunol. (2014), 5:491, doi:10.3389/fimmu.2014.00491).
Expression of IRF-5 in macrophages is reversibly induced by inflammatory stimuli and contributes to macrophage polarization. IRF-5 up-regulates M1 macrophage expression and down-regulates M2 macrophage expression (Krausgruber et al, nat. Immunol. (2010), 12 (3): 231-238).
In an embodiment, a method for treating an inflammatory disease is provided. In embodiments, the disease is associated with aberrant expression of IRF-5. In embodiments, the disease is associated with IRF-5 overexpression. In embodiments, the methods comprise administering a compound that down-regulates IRF-5 expression. In embodiments, the compound comprises AC. The AC may be any AC and have any AC characteristics as described elsewhere herein. In embodiments, the AC may bind to at least a portion of a SE or SRE of a target transcript as described elsewhere herein. In embodiments, the AC may bind near a SE or SRE of the target transcript as described elsewhere herein. In an embodiment, the AC is an ASO. In embodiments, the ASO is a PMO. The AC may bind to any splice element (SE) of the IRF-5 target transcript as described elsewhere herein.
In embodiments, the methods comprise administering a compound that induces exon skipping of one or more exons in an IRF-5 mRNA transcript. In embodiments, the method comprises administering a compound comprising an Antisense Compound (AC) that induces skipping of one or more exons in an IRF-5 target transcript. In embodiments, hybridization of AC to a target nucleotide sequence comprising at least a portion of an IRF-5 target transcript results in the inclusion or skipping of one or more exons in an mRNA transcript. In embodiments, the skipping or inclusion of one or more exons induces a frame shift in the IRF-5 target transcript. In embodiments, the frame shift results in an IRF-5 target transcript encoding a protein having reduced activity. In embodiments, the frame shift results in a truncated or nonfunctional IRF-5. In embodiments, the frame shift results in the introduction of a premature stop codon in the IRF-5 mRNA transcript. In embodiments, the frame shift results in degradation of IRF-5 mRNA transcripts by nonsense-mediated decay. In embodiments, the compound comprises an Antisense Compound (AC) that induces skipping of one or more of exons 2, 3, 4, 5, 6, 7 and/or 8 of human and/or mouse IRF-5. In embodiments, the compound comprises an AC that induces skipping of one or more exons to produce an out-of-frame transcript, resulting in the IRF-5 target transcript being degraded (e.g., by nonsense-mediated decay) or translated into an IRF-5 protein with reduced or no activity. In embodiments, the compound comprises AC that induces skipping of one or more of exons 3, 4, 5, and/or 8 to produce an out-of-frame frameshift.
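How skipping an exon can introduce a premature stop codon can be illustrated with a toy transcript. In the sketch below, the three exon sequences are invented for illustration and are not IRF-5 sequences, the function names are assumptions, and the coding sequence is written in the DNA alphabet (T rather than U). Whether a premature stop codon actually triggers nonsense-mediated decay depends on additional factors that this sketch does not model.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def spliced(exons: list[str], skip=()) -> str:
    """Join exons in order, omitting the 1-based exon numbers listed in skip."""
    return "".join(seq for number, seq in enumerate(exons, start=1) if number not in skip)


def first_stop_codon(cds: str):
    """Return the 0-based index of the first in-frame stop codon, or None."""
    for start in range(0, len(cds) - 2, 3):
        if cds[start:start + 3] in STOP_CODONS:
            return start // 3
    return None


# Toy exons (hypothetical). Exon 2 is 7 nt long, so skipping it shifts the frame.
toy_exons = ["ATGGCC", "GATTGCT", "CACTGAAGTAA"]

full_length = spliced(toy_exons)        # ATG GCC GAT TGC TCA CTG AAG TAA
exon2_skipped = spliced(toy_exons, {2})  # ATG GCC CAC TGA ...

print(first_stop_codon(full_length))     # normal stop at the end of the transcript
print(first_stop_codon(exon2_skipped))   # earlier, premature stop after the frameshift
```

In the full-length toy transcript the only in-frame stop is the final codon; after exon 2 is skipped, the shifted frame exposes an in-frame TGA within exon 3, upstream of the original stop position.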
In an embodiment, the AC comprises any one of SEQ ID NOS 157-161 in Table 11. In embodiments, AC comprises 10 to 25, 10 to 20, or 10 to 15 consecutive bases of any of the sequences in Table 11. In embodiments, SEQ ID NOS 340, 365, 369 or fragments thereof induce skipping of exon 4 to generate a premature stop codon in exon 5. In embodiments, SEQ ID NOS 340, 365, 369 or fragments thereof induce skipping of exon 4, resulting in nonsense-mediated decay of the IRF-5 target transcript. In embodiments, SEQ ID NOS 340 and 365 or fragments thereof induce skipping of exon 4 to generate a premature stop codon. In embodiments, SEQ ID NOS 366-368 or fragments thereof induce skipping of exon 5, resulting in a premature stop codon in exon 6. In embodiments, SEQ ID NOS 366-368 or fragments thereof induce skipping of exon 5, resulting in nonsense-mediated decay of the IRF-5 target transcript.
Table 11: AC sequences inducing exon skipping
In embodiments, the compound comprises an AC that induces skipping of one or more exons to produce an in-frame deletion in the IRF-5 target transcript. In embodiments, the compounds comprise AC that induces skipping of one or more of exons 6 and/or 7 to produce an in-frame deletion in IRF-5 target transcripts.
In embodiments, a method for treating a disease or disorder associated with IRF-5 is provided. In embodiments, the disease or disorder is associated with IRF-5 genetic variation. In embodiments, the disease or disorder is associated with a genetic mutation in the IRF-5 gene. In embodiments, the genetic mutation in IRF-5 results in overexpression of IRF-5. In embodiments, the genetic mutation results in expression of an alternative isoform. In embodiments, the disease or disorder is associated with IRF-5 overexpression. In embodiments, the disease or disorder is associated with IRF-5 isoform expression.
In embodiments, a method for treating inflammation, autoantibody production, inflammatory cell infiltration, collagen deposition, or inflammatory cytokine production in a patient is provided.
The involvement of IRF-5 in a variety of diseases has been described (see, e.g., Graham et al, Nat. Genet. (2006), 38 (5): 550-5; Rueda et al, Arthritis Rheum. (2006), 54 (12): 3815-9; Henrique da Mota, Clin. Rheumatol. (2015), 34 (9): 1495-501; Sigurdsson et al, Hum. Mol. Genet. (2008), 17 (6): 872-81; Peng et al, Nephrology (Carlton) (2010), 15 (7): 710-3; Ishimura et al, J. Clin. Immunol. (2011), 31 (6): 946-51; Summers et al, J. Rheumatol. (2008), 35 (11): 2106-18; Ni et al, Inflammation (2019), 2 (5): 1821-1829; Dideberg et al, Hum. Mol. Genet. (2007), 16 (24): 3008-16; Lim et al, J. Dig. Dis. (2015), 16 (4): 205-16; Flesch et al, Tissue Antigens (2011), 78 (1): 65-8; Heijde et al, Arthritis Rheum. (2007), 56 (12): 3989-94; Hafler et al, Genes Immun. (2009), 10 (1): 68-76; Balasa et al, Eur. Cytokine Netw. (2012), 23 (4): 166-72; Byrne et al, Mucosal Immunol. (2017), 10 (3): 716-726; Wang et al, Gene (2012), 10, 504 (2): 220-5; Pimenta et al, Mol. Cancer (2015), 14 (1): 32; Rambod et al, Clin. Rheumatol. (2018), 37 (10): 2661-2665; Davi et al, J. Rheumatol. (2011), 38 (4): 769-74; Zimmerman et al, Kidney360 (2020), 1 (3): 179-190; Pandey et al, Mucosal Immunol. (2019), 12 (4): 874-887; Masuda et al, Nat. Commun. (2014), 5:3771; Alzaid et al, JCI Insight (2016), 1 (20): e88689; Seneviratne et al, Circulation (2017), 136 (12): 1140-1154; Cevik et al, J. Biol. Chem. (2017), 292 (52): 21676-21689; Sharif et al, Ann. Rheum. Dis. (2012), 71 (7): 1197-1202; and Yang et al, J. Pediatr. Surg. (2017), 52 (12): 1984-1988).
In embodiments, methods of down-regulating IRF-5 expression in a patient using one or more compounds disclosed herein are provided. In embodiments, IRF-5 expression is reduced in macrophages. In embodiments, IRF-5 expression is reduced in Kupffer cells. In embodiments, IRF-5 expression is reduced in the gastrointestinal tract. In embodiments, IRF-5 expression is reduced in the liver. In embodiments, IRF-5 expression is reduced in the lung. In embodiments, IRF-5 expression is reduced in the kidney. In embodiments, IRF-5 expression is reduced in a joint. In embodiments, IRF-5 expression is reduced in the central nervous system.
In embodiments, the compounds disclosed herein are useful for treating diseases associated with IRF-5. Examples of diseases associated with IRF-5 include, but are not limited to, inflammatory Bowel Disease (IBD), ulcerative colitis, crohn's disease, systemic Lupus Erythematosus (SLE), rheumatoid arthritis, primary biliary cirrhosis, systemic sclerosis, sjogren's syndrome, multiple sclerosis, scleroderma, interstitial lung disease (SSc-ILD), polycystic Kidney Disease (PKD), chronic Kidney Disease (CKD), non-alcoholic steatohepatitis (NASH), liver fibrosis, asthma, severe asthma, and combinations thereof. In embodiments, the compounds disclosed herein are useful for reducing inflammation, sclerosis, fibrosis, proteinuria, joint inflammation, autoantibody production, inflammatory cell infiltration, collagen deposition, inflammatory cytokine production, or a combination thereof in a patient. In embodiments, the compounds disclosed herein are useful for reducing inflammation, diarrhea, pain, fatigue, abdominal cramps, hematochezia, intestinal inflammation, disruption of the gastrointestinal epithelial barrier, dysbiosis, increased frequency of bowel movement, tenesmus or painful cramps of the anal sphincter muscle, constipation, unexpected weight loss, or combinations thereof in the gastrointestinal tract.
In embodiments, the compounds disclosed herein are useful for treating inflammatory diseases. "Inflammatory disease" means a disease in which activation of an innate or adaptive immune response is the primary cause of a clinical condition. Inflammatory diseases include, but are not limited to, acne vulgaris, asthma, COPD, autoimmune diseases, celiac disease, chronic (plaque) prostatitis, glomerulonephritis, hypersensitivity reactions, inflammatory bowel disease (IBD, Crohn's disease, ulcerative colitis), pelvic inflammatory disease, reperfusion injury, rheumatoid arthritis, sarcoidosis, graft rejection, vasculitis, interstitial cystitis, atherosclerosis, allergies (type 1, 2 and 3 hypersensitivity reactions, pollinosis), inflammatory myopathies (such as systemic sclerosis, and including dermatomyositis, polymyositis and inclusion body myositis), Chediak-Higashi syndrome, chronic granulomatosis, vitamin A deficiency, cancer (solid tumor, gallbladder cancer), periodontitis, granulomatous inflammation (tuberculosis, leprosy, sarcoidosis and syphilis), fibrinous inflammation, suppurative inflammation, serous inflammation, ulcerative inflammation, ischemic heart disease, type I diabetes, diabetic nephropathy, and combinations thereof.
In embodiments, the compounds disclosed herein are useful for treating autoimmune diseases. An "autoimmune disease" refers to a disease or condition in which the patient's immune system attacks the patient's own tissues. Examples of autoimmune diseases or disorders include, but are not limited to, inflammatory reactions, such as inflammatory skin diseases, including psoriasis and dermatitis (e.g., atopic dermatitis); systemic scleroderma and sclerosis; responses associated with inflammatory bowel disease (such as Crohn's disease and ulcerative colitis); respiratory distress syndrome (including adult respiratory distress syndrome; ARDS); dermatitis; meningitis; encephalitis; uveitis; colitis; glomerulonephritis; allergic disorders such as eczema and asthma and other diseases involving T cell infiltration and chronic inflammatory responses; atherosclerosis; leukocyte adhesion deficiency; rheumatoid arthritis; systemic lupus erythematosus (SLE) (including but not limited to lupus nephritis, cutaneous lupus); systemic sclerosis (scleroderma); diabetes (e.g., type I diabetes or insulin-dependent diabetes); multiple sclerosis; Raynaud's syndrome; autoimmune thyroiditis; Hashimoto thyroiditis; allergic encephalomyelitis; Sjogren's syndrome; juvenile diabetes mellitus; immune responses associated with cytokine and T lymphocyte mediated acute and delayed hypersensitivity reactions, which are commonly found in tuberculosis, sarcoidosis, polymyositis, dermatomyositis, granulomatosis and vasculitis; primary biliary cirrhosis; pernicious anemia (Addison's disease); autoimmune gastritis; autoimmune hepatitis; diseases involving leukocyte exudation; inflammatory disorders of the central nervous system (CNS); vitiligo; multiple organ injury syndrome; hemolytic anemia (including but not limited to cryoglobulinemia or Coombs-positive anemia); myasthenia gravis; antigen-antibody complex mediated diseases; anti-glomerular basement membrane disease; antiphospholipid syndrome; allergic neuritis; Graves' disease; Lambert-Eaton myasthenic syndrome; bullous pemphigoid; pemphigus; autoimmune polyendocrinopathies; Reiter's disease; stiff person syndrome; Behcet's disease; giant cell arteritis; immune complex nephritis; IgA nephropathy; IgM polyneuropathy; immune thrombocytopenic purpura (ITP) or autoimmune thrombocytopenia; autoimmune encephalomyelitis; non-alcoholic steatohepatitis (NASH); ankylosing spondylitis; pulmonary fibrosis; or a combination thereof.
In embodiments, the compounds disclosed herein are useful for treating autoimmune diseases, such as Systemic Lupus Erythematosus (SLE), systemic sclerosis (scleroderma), polymyositis/dermatomyositis, crohn's disease, ulcerative colitis, rheumatoid arthritis, sjogren's syndrome, autoimmune encephalomyelitis, nonalcoholic steatohepatitis (NASH), sarcoidosis, behcet's disease, myasthenia gravis, lupus nephritis, inflammatory Bowel Disease (IBD), ankylosing spondylitis, primary biliary cirrhosis, colitis, pulmonary fibrosis, antiphospholipid syndrome, or psoriasis.
In embodiments, the compounds disclosed herein are useful for treating cardiovascular diseases. In embodiments, the cardiovascular disease is associated with inflammation. In embodiments, the cardiovascular disease comprises systemic scleroderma. In embodiments, the cardiovascular disease comprises an aneurysm; angina pectoris; atherosclerosis; cerebrovascular accident (stroke); cerebrovascular diseases; congestive heart failure; coronary artery disease; myocardial infarction (heart attack); peripheral vascular disease; or a combination thereof. In embodiments, the cardiovascular disease comprises atherosclerosis.
In embodiments, the compounds disclosed herein are useful for treating gastrointestinal disorders. In embodiments, the gastrointestinal disease comprises crohn's disease, primary biliary cirrhosis, sclerosing cholangitis, ulcerative colitis, inflammatory bowel disease, sjogren's syndrome, or a combination thereof.
In embodiments, the compounds disclosed herein are used to treat urinary system disorders. In embodiments, the urinary system disease includes systemic lupus erythematosus, systemic scleroderma, or a combination thereof.
In embodiments, the compounds disclosed herein are useful for treating genetic, familial, or congenital diseases. In embodiments, the genetic, familial, or congenital disease comprises crohn's disease, primary biliary cirrhosis, systemic scleroderma, systemic lupus erythematosus, ulcerative colitis, psoriasis, inflammatory bowel disease, or a combination thereof.
In embodiments, the compounds disclosed herein are useful for treating diseases of the endocrine system. In embodiments, the endocrine system disorders include thyroid adenocarcinoma, primary biliary cirrhosis, sclerosing cholangitis, hypothyroidism, or a combination thereof.
In embodiments, the compounds disclosed herein are useful for treating cell proliferative disorders. In embodiments, the cell proliferative disorder comprises primary biliary cirrhosis, thyroid adenocarcinoma, tumor, or a combination thereof.
In embodiments, the compounds disclosed herein are useful for treating immune system disorders. In embodiments, the immune system disorder comprises sjogren's syndrome, inflammatory bowel disease, psoriasis, myositis, systemic scleroderma, autoimmune disease, systemic lupus erythematosus, rheumatoid arthritis, crohn's disease, ulcerative colitis, ankylosing spondylitis, or a combination thereof.
In embodiments, the compounds disclosed herein are useful for treating hematological disorders. In embodiments, the hematological disorder comprises systemic lupus erythematosus.
In embodiments, the compounds disclosed herein are useful for treating musculoskeletal or connective tissue diseases. In embodiments, musculoskeletal or connective tissue diseases include myositis, systemic scleroderma, systemic lupus erythematosus, rheumatoid arthritis, ankylosing spondylitis, juvenile idiopathic scoliosis, or combinations thereof.
In embodiments, the compounds disclosed herein are useful for treating neuroinflammatory disorders. In embodiments, the neuroinflammatory disease or disorder includes inflammation due to traumatic brain injury, acute Disseminated Encephalomyelitis (ADEM), autoimmune encephalitis, acute Optic Neuritis (AON), chronic meningitis, anti-Myelin Oligodendrocyte Glycoprotein (MOG) disease, transverse myelitis, neuromyelitis optica (NMO), alzheimer's disease, parkinson's disease, multiple Sclerosis (MS), or a combination thereof.
In embodiments, the compounds disclosed herein are useful for treating inflammation due to infection by a microorganism such as a virus, bacterium, fungus, parasite, or combination thereof.
In embodiments, the compounds disclosed herein are useful for treating diseases associated with fibrosis, which are referred to herein as fibrotic diseases. "Fibrosis" refers to the pathological formation of fibrous connective tissue, e.g., due to injury, irritation, or chronic inflammation, and includes greater than normal fibroblast accumulation and collagen deposition in tissue. "Fibrotic disease" refers to a disease associated with pathological fibrosis. Examples of fibrotic diseases include, but are not limited to, idiopathic pulmonary fibrosis; scleroderma; scleroderma of the skin; scleroderma of the lungs; collagen vascular diseases (e.g., lupus; rheumatoid arthritis; scleroderma); hereditary pulmonary fibrosis (e.g., Hermansky-Pudlak syndrome); radiation pneumonitis; asthma; asthma with airway remodeling; chemotherapy-induced pulmonary fibrosis (e.g., bleomycin-, methotrexate-, or cyclophosphamide-induced); radiation fibrosis; Gaucher disease; interstitial lung disease; retroperitoneal fibrosis; myelofibrosis; interstitial or pulmonary vascular disease; fibrosis or interstitial lung disease associated with drug exposure; interstitial lung diseases associated with exposures, such as asbestosis, silicosis, and particle exposure; chronic hypersensitivity pneumonitis; adhesions; intestinal or abdominal adhesions; cardiac fibrosis; kidney fibrosis; sclerosis disease; non-alcoholic steatohepatitis (NASH)-induced fibrosis; or a combination thereof. In embodiments, the fibrotic disease comprises non-alcoholic steatohepatitis (NASH).
In embodiments, the compounds disclosed herein are useful for treating respiratory or thoracic diseases, such as systemic scleroderma. In embodiments, the compounds disclosed herein are useful for treating skin system disorders, such as psoriasis or systemic scleroderma. In embodiments, the compounds disclosed herein are useful for treating a disease of the visual system, such as Sjogren's syndrome or systemic scleroderma. In embodiments, the compounds disclosed herein are useful for treating conditions associated with eosinophil count, glomerular filtration rate, systolic blood pressure, eosinophil percentage of leukocytes, or a combination thereof. In embodiments, the compounds disclosed herein are used to treat an ulcer disease or a dental ulcer.
Inflammatory Bowel Disease (IBD)
Inflammatory Bowel Disease (IBD) refers to two disorders characterized by chronic inflammation of the gastrointestinal (GI) tract: Crohn's disease and ulcerative colitis. Common symptoms of IBD include persistent diarrhea, abdominal pain, rectal bleeding/hematochezia, weight loss, and fatigue. In 2015, an estimated 1.3% of U.S. adults (3 million people) reported a diagnosis of IBD (Crohn's disease or ulcerative colitis). IBD is associated with an inflammatory macrophage phenotype in intestinal macrophages promoted by IRF-5.
Rheumatoid Arthritis (RA)
Rheumatoid Arthritis (RA) is an autoimmune disease affecting 0.5% to 1% of the population worldwide. It can cause joint pain and damage throughout the patient's body. Treatment of RA generally involves drugs that slow the disease and prevent joint deformity, known as disease-modifying antirheumatic drugs (DMARDs), and biological agents (antibodies) that target the parts of the immune system that trigger the inflammation that causes joint and tissue damage. IRF-5 polymorphisms have been identified as risk factors for RA. Reduced IRF-5 levels are associated with reduced disease phenotypes. IRF-5 activation of TLR3 and TLR7 promotes inflammatory cytokine and chemokine production.
Sjogren's Syndrome (SS)
Sjogren's Syndrome (SS) is an immune disorder identified by dry eyes and dry mouth. The disorder is often accompanied by other immune system disorders such as Rheumatoid Arthritis (RA) and Systemic Lupus Erythematosus (SLE). The disease mainly affects women between 40 and 60 years of age. In the United States, the prevalence of primary SS is estimated to be 2 to 10 per 10,000 residents. Existing SS therapies treat the symptoms of dry eyes and dry mouth. There is not yet a disease-modifying therapy. In several studies, the IRF-5 rs2004640 T allele and CGGGG insertions/deletions were associated with SS.
Multiple Sclerosis (MS)
Multiple Sclerosis (MS) is a debilitating disease of the central nervous system (brain and spinal cord). In MS, the immune system attacks the protective sheath (myelin) covering nerve fibers and causes communication problems between the brain and the rest of the patient's body. Multiple sclerosis causes a wide range of neurological symptoms including sensory or motor paralysis, vision disorders, ataxia, impaired coordination, pain, cognitive dysfunction and fatigue. Current estimates indicate that 300,000 to 400,000 individuals are affected in the United States and that over 2 million individuals are affected worldwide. Treatment of MS is generally limited to corticosteroids and plasmapheresis therapy.
Two single nucleotide polymorphisms (SNPs) (rs4728142, rs3807306) and a 5 bp insertion-deletion polymorphism in the promoter and first intron of the IRF-5 gene are highly correlated with MS. Kristjansdottir et al. (2008) "Interferon regulatory factor 5 (IRF-5) gene variants are associated with multiple sclerosis in three distinct populations," J. Med. Genet. 45(6):362-369.
Scleroderma or systemic sclerosis (SSc)
Scleroderma is a chronic connective tissue disease associated with extensive fibrosis of the skin and internal organs, small vascular lesions, and immune disorders that produce autoantibodies. Sharif et al. (2012) "IRF-5 polymorphism predicts prognosis in patients with systemic sclerosis," Ann. Rheum. Dis. 71(7):1197-1202.
IRF-5 variant rs4728142 is associated with lower IRF-5 transcript levels in SSc patients and predicts longer survival and milder Interstitial Lung Disease (ILD) in SSc patients. Patients without a copy of the IRF-5 rs4728142 variant had increased IRF-5 expression levels and experienced more severe ILD and shorter survival. Additional single nucleotide polymorphisms (rs10488631 and rs12537284) were identified in a genome-wide association study (GWAS) of systemic sclerosis (SSc). Sharif et al. (2012) "IRF-5 polymorphism predicts prognosis in patients with systemic sclerosis," Ann. Rheum. Dis. 71(7):1197-1202.
Double homeobox 4 (DUX4) gene
Facioscapulohumeral muscular dystrophy (FSHD) is the third most common form of hereditary muscular dystrophy. It is caused by incomplete repression of the transcription factor double homeobox 4 (DUX4) in skeletal muscle. DUX4 overexpression in myoblasts induces different toxic cascades, including increased oxidative stress, inhibition of nonsense-mediated decay, and inhibition of myogenesis (Bouwman et al., Curr. Opin. Neurol. (2020), 33 (5): 635-640).
The DUX4 gene is located in a region called D4Z4 near the end of chromosome 4. The region comprises 11 to more than 100 repeat segments, each segment being about 3,300 DNA bases (3.3 kb) long. Each repeat segment in the D4Z4 region contains a copy of the DUX4 gene. The copy closest to the chromosome end is called DUX4, while the other copies are called "DUX4-like" or DUX4L.
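As a rough arithmetic illustration of the repeat sizes given above, the following sketch estimates the total length of a D4Z4 array from its repeat count. It assumes each repeat unit is exactly 3,300 bases (the text says "about"), and the function name is hypothetical.

```python
# Approximate length of one D4Z4 repeat unit (about 3.3 kb, per the text above).
REPEAT_UNIT_BASES = 3300


def d4z4_array_length_kb(repeat_count: int) -> float:
    """Approximate length, in kilobases, of a D4Z4 array with the given number of repeat units."""
    if repeat_count < 0:
        raise ValueError("repeat_count must be non-negative")
    return repeat_count * REPEAT_UNIT_BASES / 1000


# The two repeat counts mentioned in the text: 11 and 100 units.
for n in (11, 100):
    print(f"{n} repeats: about {d4z4_array_length_kb(n):.1f} kb")
```

Under this assumption, an array of 11 units spans roughly 36 kb and an array of 100 units roughly 330 kb.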
DUX4c has also been identified as being up-regulated in FSHD (Ansseau et al., PLoS One (2009), 4 (10): e7482, doi:10.1371/journal.pone.0007482). DUX4c has been located 42 kb centromeric to the D4Z4 region. DUX4c encodes a 47 kDa protein identical to DUX4, except for the carboxy-terminal region.
FSHD is characterized by contraction of the D4Z4 array in the subtelomeric region of chromosome 4, resulting in aberrant expression of the DUX4 transcription factor and deregulation of hundreds of genes (Marsollier et al., Int. J. Mol. Sci. (2018), 19, 1347, doi:10.3390/ijms19051347).
There are four variants of the human DUX4 gene, the nucleotide sequences of which are publicly available through the NCBI database: variant 1 (NM_001306068.3), variant 2 (NM_001293798.3), variant 3 (NR_137167.1) and variant 4 (NM_001363820.2). Both DUX4 variant 1 and variant 2 encode full-length DUX4 (DUX4-fl). Overexpression of full-length DUX4 is associated with FSHD. The difference between variant 1 and variant 2 is that variant 2 lacks an alternate segment in the 3' UTR compared to variant 1. Compared to variant 1, DUX4 variant 3 has multiple differences in the 3' region, including a different 3' end. Variant 3 is represented as non-coding because use of the 5'-most expected translation initiation codon makes the transcript a candidate for nonsense-mediated mRNA decay (NMD). Variant 4 lacks a majority of the coding region compared to variant 1. The resulting truncated DUX4 isoform (DUX4-s) has a shorter and different C-terminus compared to isoform DUX4-fl. The DUX4-s protein has been shown to be non-toxic to cells.
DUX4 contains three exons. Exon 1 is the coding exon of the DUX4 protein, and exons 2 and 3 are untranslated. The full-length DUX4 protein comprises two DNA binding domains and a C-terminal transactivation domain. The truncated isoform of DUX4 comprises the two DNA binding domains, but does not comprise the C-terminal transactivation domain. The first exon comprises two 5' splice sites (5'ss). Transcripts encoding full-length or truncated DUX4 protein are generated depending on which 5'ss is used. To generate the full-length isoform, a first 5'ss located at the 3' end of the first exon is used. To generate the truncated isoform, a second 5'ss located within exon 1 and closer to the 5' end of the transcript than the first 5'ss is used.
Variants 1, 2 and 4 share the last exon. The sequences of variants 1, 2 and 4 are shown below.
Variant 1 (DUX4-fl2):
Variant 2 (DUX4-fl1):
Variant 4 (DUX4-s):
Since FSHD is caused by gain-of-function mutations, DUX4 and/or DUX4c inhibition is a promising therapeutic strategy. Additionally or alternatively, since DUX4-s has been shown to be non-toxic, downregulating the expression of DUX4-fl by upregulating the expression of DUX4-s is a possible therapeutic strategy. However, many highly homologous copies of DUX4 can exist in the human genome, and the D4Z4 repeat segment is extremely GC-rich, making DUX4 and DUX4c difficult targets. At present, there is no therapy that prevents or delays disease progression in FSHD patients (Bouwman et al., Curr. Opin. Neurol. (2020), 33 (5): 635-640).
U.S. Patent No. 10,907,157 and Canadian Patent No. 2999192 describe the use of antisense and RNA interference agents to reduce expression of DUX4 or DUX4c. Published PCT application PCT/US2017/019422 describes the use of small nuclear RNA to induce exon skipping of DUX4, resulting in expression of DUX4-s. Phosphorodiamidate morpholino oligomers targeting various SEs of DUX4 have been shown to alter expression of genes downstream of DUX4 (Marsollier et al., Hum. Mol. Genet. (2016), 25 (8): 1468-1478; and Lu-Nguyen et al., Hum. Mol. Genet. (2021), 30 (15): 1398-1412).
Provided herein are compositions and methods for modulating DUX4 and/or DUX4c expression. In embodiments, compounds for use in the treatment of FSHD are provided. In embodiments, compounds are provided that induce exon skipping in DUX4 transcripts, thereby resulting in expression of DUX4-s but not DUX4-fl. In embodiments, the compound comprises at least one AC and at least one CPP.
In embodiments, the AC hybridizes to a target nucleotide sequence comprising at least a portion of a splice element of a DUX4 transcript. In embodiments, the AC hybridizes to a target nucleotide sequence comprising at least a portion of a DUX4 transcript and induces exon skipping to produce a transcript encoding DUX4-s. In embodiments, the exon skipping upregulates expression of DUX4-s. In embodiments, the exon skipping downregulates expression of DUX4-fl.
In embodiments, compounds and methods are provided that induce alternative splicing of DUX4 target transcripts. In embodiments, compounds and methods are provided that shift the splicing of DUX4 to the second 5'ss to produce a transcript encoding a truncated DUX4 protein. In embodiments, compounds and methods are provided that downregulate the production of full-length DUX4 mRNA transcripts and/or proteins. In embodiments, compounds and methods are provided that upregulate production of truncated DUX4 mRNA transcripts and/or proteins. In embodiments, the compound comprises an AC. The AC may be any AC and have any AC characteristics as described elsewhere herein. In embodiments, the AC is an ASO. In embodiments, the ASO is a PMO. The AC can bind to any splice element of the DUX4 target transcript as described elsewhere herein.
In embodiments, the AC comprises any portion of the small nuclear RNA in publication PCT/US2017/019422 (U.S. Patent No. 11,180,755). In embodiments, the AC comprises any portion of the sequences in Table 12.
Table 12: various AC sequences targeting DUX4
In embodiments, the AC may comprise 10 or more, 15 or more, or 20 or more consecutive bases of any of the sequences in table 12. In embodiments, AC may comprise 25 or fewer, 20 or fewer, or 15 or fewer consecutive bases of any of the sequences in table 12. In embodiments, the AC may comprise 10-25, 10-20, or 10-15 consecutive bases of any of the sequences in Table 12. In embodiments, the AC may comprise 15-25 or 10-20 consecutive bases of any of the sequences in table 12. In embodiments, AC may comprise 20-25 consecutive bases of any of the sequences in table 12.
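The length ranges above can be illustrated with a short sketch that enumerates every run of consecutive bases of a given length range within a sequence. This is an illustration only: the function name is hypothetical, and the 25-base string is an invented placeholder, not a sequence from Table 12.

```python
def contiguous_windows(sequence: str, min_len: int = 10, max_len: int = 25) -> list[str]:
    """Return every run of consecutive bases in `sequence` with length in [min_len, max_len]."""
    if min_len < 1 or max_len < min_len:
        raise ValueError("require 1 <= min_len <= max_len")
    windows = []
    # Windows longer than the sequence itself cannot exist, so cap the length.
    for length in range(min_len, min(max_len, len(sequence)) + 1):
        for start in range(len(sequence) - length + 1):
            windows.append(sequence[start:start + length])
    return windows


# Invented 25-base placeholder; NOT a sequence from Table 12.
placeholder = "ACGTACGTACGTACGTACGTACGTA"
candidates = contiguous_windows(placeholder, min_len=20, max_len=25)
print(len(candidates))  # number of 20- to 25-base runs in the placeholder
```

For a 25-base parent sequence, the 20-25 range yields 6 + 5 + 4 + 3 + 2 + 1 = 21 candidate runs, which shows how quickly the number of covered fragments grows as the lower bound drops.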
Therapeutic method
The present disclosure provides a method of treating a disease in a patient in need thereof, the method comprising administering a compound disclosed herein. In embodiments, the disease is any disease provided in the present disclosure. In embodiments, a method of treating a disease comprises administering a compound disclosed herein to a patient, thereby treating the disease. In embodiments, a method of treating a disease associated with IRF-5, GYS1, or DUX4 comprises administering a compound disclosed herein to a patient, thereby treating the disease.
In embodiments, the patient is identified as having or at risk of having a disease associated with IRF-5, GYS1 or DUX4. In embodiments, the disease or disorder is associated with IRF-5, GYS1 or DUX4 genetic variation. In embodiments, the disease or disorder is associated with a genetic mutation of the IRF-5 gene, the GYS1 gene, or the DUX4 gene. In embodiments, the genetic mutation results in overexpression of IRF-5, GYS1 or DUX4 (e.g., DUX4-fl). In embodiments, the genetic mutation results in expression of an alternative isoform of IRF-5, GYS1 or DUX4. In embodiments, the disease or disorder is associated with overexpression of IRF-5, GYS1 or DUX4 (e.g., DUX4-fl).
In various embodiments, treating refers to partially or completely alleviating, ameliorating, relieving, or inhibiting one or more symptoms in a patient, delaying the onset of the symptoms, or reducing the severity and/or incidence of the symptoms.
In embodiments, a method of treating a disease or disorder by down-regulating expression of a target protein is provided. In embodiments, expression of the target protein is down-regulated by inducing exon skipping. In embodiments, exon skipping induces frame shifts that result in reduced expression or activity of the target protein. In embodiments, exon skipping results in premature stop codons and degradation of the target transcript. In embodiments, the treatment results in reduced expression of the protein isoform.
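The link between exon skipping, frame shifts, and premature stop codons can be sketched in a few lines. The example below is a simplified illustration under stated assumptions: the three toy exons are invented (they are not DUX4, IRF-5, or GYS1 sequences), the function names are hypothetical, and nonsense-mediated decay itself is not modeled.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def skipping_shifts_frame(exon_length: int) -> bool:
    """Removing an exon shifts the reading frame when its length is not a multiple of 3."""
    return exon_length % 3 != 0


def first_stop_codon_index(cds: str):
    """Return the 0-based codon index of the first in-frame stop codon, or None if absent."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


# Invented toy exons for illustration only; exon 2 is 5 bases long.
exons = ["ATGGCT", "GGTGA", "GCTTAGCAAATGA"]

full_length = "".join(exons)          # all three exons retained
skipped = exons[0] + exons[2]         # exon 2 skipped

print(skipping_shifts_frame(len(exons[1])))   # True: 5 is not a multiple of 3
print(first_stop_codon_index(full_length))    # stop only at the final codon
print(first_stop_codon_index(skipped))        # earlier stop created by the frame shift
```

In this toy transcript the full-length reading frame runs to its final codon, whereas skipping the 5-base exon shifts the frame and exposes a stop codon several codons earlier.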
In embodiments, the treatment modulates IRF-5 activity in a patient in need thereof. In embodiments, the treatment modulates IRF-5 activity in cells of the patient. In embodiments, the treatment modulates IRF-5 activity in immune cells of the patient. In embodiments, the immune cell is a monocyte, lymphocyte, or dendritic cell. In embodiments, the lymphocyte is a B-lymphocyte. In embodiments, the monocyte is a macrophage. In embodiments, the macrophage is a tissue resident macrophage. In embodiments, the macrophage is a monocyte-derived macrophage. In embodiments, the macrophage is a Kupffer cell, intraglomerular mesangial cell, alveolar macrophage, sinus histiocyte, Hofbauer cell, microglial cell, or Langerhans cell. In embodiments, the immune cell is a Kupffer cell.
In embodiments, the treatment modulates DUX4 activity in a patient in need thereof. In embodiments, the treatment modulates DUX4 activity in cells of the patient. In embodiments, the treatment modulates DUX4 activity in a patient's myocytes. In embodiments, the muscle cells are skeletal muscle cells.
In embodiments, treatment according to the present disclosure results in a reduction of IRF-5, DUX4-fl or GYS1 activity and/or expression in a patient by 5% to 10%, 5% to 20%, 5% to 30%, 5% to 40%, 5% to 50%, 5% to 60%, 5% to 70%, 5% to 80%, 5% to 90%, or 5% to 100% compared to the average level and/or activity of a target protein in a pre-treated patient, untreated one or more control individuals with a similar disease, or compared to treatment with a therapeutic moiety not conjugated to a CPP disclosed herein. In embodiments, treatment according to the present disclosure results in a decrease in IRF-5, DUX4-fl, or GYS1 activity and/or expression in a patient of 10% to 20%, 10% to 30%, 10% to 40%, 10% to 50%, 10% to 60%, 10% to 70%, 10% to 80%, 10% to 90%, or 10% to 100% compared to the average level and/or activity of a target protein in a pre-treated patient, untreated one or more control individuals with a similar disease, or compared to treatment with a therapeutic moiety not conjugated to a CPP disclosed herein. In embodiments, treatment according to the present disclosure results in a 20% to 30%, 20% to 40%, 20% to 50%, 20% to 60%, 20% to 70%, 20% to 80%, 20% to 90%, or 20% to 100% decrease in IRF-5, DUX4-fl, or GYS1 activity and/or expression in a patient as compared to the average level and/or activity of a target protein in a pre-treatment patient, untreated one or more control individuals with a similar disease, or as compared to treatment with a therapeutic moiety not conjugated to a CPP as disclosed herein. 
In embodiments, treatment according to the present disclosure results in a 30% to 40%, 30% to 50%, 30% to 60%, 30% to 70%, 30% to 80%, 30% to 90%, or 30% to 100% decrease in IRF-5, DUX4-fl, or GYS1 activity and/or expression in a patient compared to the average level and/or activity of a target protein in a pre-treatment patient, untreated one or more control individuals with a similar disease, or compared to treatment with a therapeutic moiety not conjugated to a CPP as disclosed herein. In embodiments, treatment according to the present disclosure results in a reduction of IRF-5, DUX4-fl or GYS1 activity and/or expression in a patient by 40% to 50%, 40% to 60%, 40% to 70%, 40% to 80%, 40% to 90%, or 40% to 100% compared to the average level and/or activity of a target protein in a pre-treatment patient, untreated one or more control individuals with a similar disease, or compared to treatment with a therapeutic moiety not conjugated to a CPP as disclosed herein. In embodiments, treatment according to the present disclosure results in a 50% to 60%, 50% to 70%, 50% to 80%, 50% to 90%, or 50% to 100% decrease in IRF-5, DUX4-fl, or GYS1 activity and/or expression in a patient compared to the average level and/or activity of a target protein in a pre-treatment patient, untreated one or more control individuals with a similar disease, or compared to treatment with a therapeutic moiety not conjugated to a CPP disclosed herein. In embodiments, treatment according to the present disclosure results in a reduction of IRF-5, DUX4-fl, or GYS1 activity and/or expression in a patient by 60% to 70%, 60% to 80%, 60% to 90%, or 60% to 100% compared to the average level and/or activity of a target protein in a pre-treatment patient, untreated one or more control individuals with a similar disease, or compared to treatment with a therapeutic moiety not conjugated to a CPP as disclosed herein. 
In embodiments, treatment according to the present disclosure results in a 70% to 80%, 70% to 90% or 70% to 100% decrease in IRF-5, DUX4-fl or GYS1 activity and/or expression in a patient compared to the average level and/or activity of a target protein in a pre-treatment patient, untreated one or more control individuals with a similar disease, or compared to treatment with a therapeutic moiety not conjugated to a CPP as disclosed herein. In embodiments, treatment according to the present disclosure results in a reduction of 80% to 90% or 80% to 100% in IRF-5, DUX4-fl or GYS1 activity and/or expression in a patient compared to the average level and/or activity of a target protein in a pre-treatment patient, untreated one or more control individuals with a similar disease, or compared to treatment with a therapeutic moiety not conjugated to a CPP as disclosed herein. In embodiments, treatment according to the present disclosure results in a 90% to 100% reduction in IRF-5, DUX4-fl or GYS1 activity and/or expression in a patient compared to the average level and/or activity of a target protein in a pre-treatment patient, untreated one or more control individuals with a similar disease, or compared to treatment with a therapeutic moiety not conjugated to a CPP as disclosed herein.
The terms "improving," "increasing," "decreasing," "reducing," and the like as used herein refer to values relative to a control. In embodiments, a suitable control is a baseline measurement, such as a measurement in the same individual prior to initiation of the treatment described herein, or a measurement in a control individual (or individuals) in the absence of the treatment described herein. A "control individual" is an individual with the same disease that is about the same age and/or sex as the treated individual (to ensure that the disease stages of the treated individual and the control individual are comparable).
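The percentage ranges recited above are all expressed relative to a control value. As a minimal sketch of that arithmetic, assuming expression or activity is available as a single positive number for both the control and the treated measurement, the reduction can be computed as follows. The function names and the example values are hypothetical, not data from this disclosure.

```python
def percent_reduction(control_level: float, treated_level: float) -> float:
    """Percent decrease of the treated measurement relative to the control measurement."""
    if control_level <= 0:
        raise ValueError("control_level must be positive")
    return 100.0 * (control_level - treated_level) / control_level


def within_range(value: float, low: float, high: float) -> bool:
    """True when `value` lies in the closed interval [low, high]."""
    return low <= value <= high


# Hypothetical measurements in arbitrary units (e.g., a baseline and a post-treatment value).
baseline = 200.0
after_treatment = 120.0

reduction = percent_reduction(baseline, after_treatment)
print(f"{reduction:.1f}% reduction")
print(within_range(reduction, 30.0, 50.0))
```

The control passed as `control_level` may be any of the controls described above, such as a baseline measurement in the same individual or an average across control individuals.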
The individual being treated (also referred to as a "patient" or "subject") is an individual (fetus, infant, child, adolescent or adult) who has a disease or who has a potential to develop a disease. An individual may have a disease mediated by aberrant gene expression or aberrant gene splicing. In various embodiments, the wild-type target protein expression or activity level of an individual suffering from a disease may be about 1% to 99% lower than the normal protein expression or activity level of an individual not suffering from the disease. In embodiments, this range includes, but is not limited to, less than about 80% -99%, less than about 65% -80%, less than about 50% -65%, less than about 30% -50%, less than about 25% -30%, less than about 20% -25%, less than about 15% -20%, less than about 10% -15%, less than about 5% -10%, less than about 1% -5% of the normal thymidine phosphorylase expression or activity level. In embodiments, the individual's target protein expression or activity level may be 1% to 500% higher than the normal wild-type target protein expression or activity level. In embodiments, this range includes, but is not limited to, about 1% -10%, about 10% -50%, about 50% -100%, about 100% -200%, about 200% -300%, about 300% -400%, about 400% -500%, or about 500% -1000% higher.
In embodiments, the individual is a patient that has been recently diagnosed with a disease. In general, early treatment (starting treatment as soon as possible after diagnosis) reduces the impact of the disease and increases the beneficial effects of the treatment.
Compositions and methods of administration
In embodiments, compositions comprising one or more of the compounds described herein are provided.
In embodiments, pharmaceutically acceptable salts and/or prodrugs of the disclosed compounds are provided. Pharmaceutically acceptable salts include salts of the disclosed compounds prepared with acids or bases, depending on the particular substituents found on the compound. Where the compounds disclosed herein are sufficiently basic or acidic to form stable, non-toxic acid or base salts, it may be appropriate to administer the compounds as salts. Examples of pharmaceutically acceptable base addition salts include sodium, potassium, calcium, ammonium or magnesium salts. Examples of physiologically acceptable acid addition salts include salts of hydrochloric acid, hydrobromic acid, nitric acid, phosphoric acid, carbonic acid, sulfuric acid, and organic acids such as acetic acid, propionic acid, benzoic acid, succinic acid, fumaric acid, mandelic acid, oxalic acid, citric acid, tartaric acid, malonic acid, ascorbic acid, alpha-ketoglutaric acid, alpha-glycerophosphoric acid, maleic acid, toluenesulfonic acid, methanesulfonic acid, and the like. Thus, disclosed herein are hydrochlorides, nitrates, phosphates, carbonates, bicarbonates, sulfates, acetates, propionates, benzoates, succinates, fumarates, mandelates, oxalates, citrates, tartrates, malonates, ascorbates, alpha-ketoglutarates, alpha-glycerophosphates, maleates, tosylates and methanesulfonates. Pharmaceutically acceptable salts of the compounds may be obtained using standard methods well known in the art, for example, by reacting a sufficiently basic compound such as an amine with a suitable acid that provides a physiologically acceptable anion. Alkali metal (e.g., sodium, potassium, or lithium) or alkaline earth metal (e.g., calcium) salts of carboxylic acids may also be prepared.
In vivo application of the disclosed compounds and compositions containing them may be accomplished by any suitable method and technique presently or prospectively known to those skilled in the art. For example, the disclosed compounds may be formulated in a physiologically or pharmaceutically acceptable form and administered by any suitable route known in the art, including, for example, oral and parenteral routes of administration. As used herein, the term parenteral includes subcutaneous, intradermal, intravenous, intramuscular, intraperitoneal, intrasternal and intrathecal administration, such as by injection. The disclosed compounds or compositions may be administered as a single administration, or at continuous or distinct intervals, as can be readily determined by one of skill in the art.
The compounds disclosed herein and compositions comprising them may also be administered using liposome technology, slow release capsules, implantable pumps, and biodegradable containers. These delivery methods can advantageously provide uniform doses over an extended period of time. The compounds may also be administered in the form of their salt derivatives or in crystalline form.
The compounds disclosed herein may be formulated according to known methods for preparing pharmaceutically acceptable compositions. Formulations are described in detail in many sources well known and readily available to those skilled in the art. For example, Remington's Pharmaceutical Science by E.W. Martin (1995) describes formulations that can be used in conjunction with the disclosed methods. In general, the compounds disclosed herein can be formulated such that an effective amount of the compound is combined with a suitable carrier in order to facilitate effective administration of the compound. The compositions used may also be in various forms. Such dosage forms include, for example, solid, semi-solid, and liquid dosage forms such as tablets, pills, powders, liquid solutions or suspensions, suppositories, injectable and infusible solutions, and sprays. The form depends on the intended mode of administration and the therapeutic application. The composition may also comprise conventional pharmaceutically acceptable carriers and diluents known to those skilled in the art. Examples of carriers or diluents for use with the compounds include ethanol, dimethylsulfoxide, glycerol, alumina, starch, saline and equivalent carriers and diluents. To provide for administration of such doses for the desired therapeutic treatment, the compositions disclosed herein may advantageously comprise between about 0.1% and 100% by weight of one or more of the subject compounds in total, based on the weight of the total composition including the carrier or diluent.
Formulations suitable for administration include, for example, sterile injectable aqueous solutions which may contain antioxidants, buffers, bacteriostats and solutes which render the formulation isotonic with the blood of the intended recipient; and aqueous and non-aqueous sterile suspensions which may contain suspending agents and thickening agents. The formulations may be presented in unit-dose or multi-dose containers, for example sealed ampoules and vials, and may be stored in a freeze-dried (lyophilized) condition requiring only a sterile liquid carrier, for example water for injections, immediately prior to use. Extemporaneous injection solutions and suspensions may be prepared from sterile powders, granules, tablets and the like. It should be understood that the compositions disclosed herein may contain other conventional agents in the art regarding the type of formulation in question, in addition to the ingredients specifically mentioned above.
The compounds disclosed herein and compositions comprising them may be delivered to cells by direct contact with the cells or via carrier means. Carrier means for delivering the compounds and compositions to cells are known in the art and include, for example, encapsulation of the compositions in a liposome moiety. Another means for delivering the compounds and compositions disclosed herein to cells includes attaching the compounds to proteins or nucleic acids that are targeted for delivery to the target cells. U.S. Patent No. 6,960,648 and U.S. Application Publication Nos. 20030032594 and 20020120100 disclose amino acid sequences that can be coupled to another composition and allow translocation of the composition across a biological membrane. U.S. Application Publication No. 20020035243 also describes compositions for transporting biological moieties across cell membranes for intracellular delivery. The compounds may also be incorporated into polymers, examples of which include poly(D-L lactide-co-glycolide) polymers for intracranial tumors; poly[bis(p-carboxyphenoxy)propane:sebacic acid] at a molar ratio of 20:80 (as used in GLIADEL); chondroitin; chitin; and chitosan.
The compounds and compositions disclosed herein, including pharmaceutically acceptable salts or prodrugs thereof, may be administered intravenously, intramuscularly, or intraperitoneally by infusion or injection. Solutions of the active agent or salt thereof may be prepared in water, optionally mixed with a non-toxic surfactant. Dispersions can also be prepared in glycerol, liquid polyethylene glycols, triacetin, and mixtures thereof and in oils. Under ordinary conditions of storage and use, these formulations may contain preservatives to prevent microbial growth.
Pharmaceutical dosage forms suitable for injection or infusion may comprise sterile aqueous solutions or dispersions or sterile powders containing the active ingredient which are suitable for the extemporaneous preparation of sterile injectable or infusible solutions or dispersions optionally encapsulated in liposomes. The final dosage form should be sterile, fluid and stable under the conditions of manufacture and storage. The liquid carrier or vehicle may be a solvent or liquid dispersion medium including, for example, water, ethanol, polyols (e.g., glycerol, propylene glycol, liquid polyethylene glycol, and the like), vegetable oils, non-toxic glycerides, and suitable mixtures thereof. Proper fluidity can be maintained, for example, by the formation of liposomes, by the maintenance of the required particle size in the case of dispersions, or by the use of surfactants. Optionally, the action of microorganisms may be prevented by various other antibacterial and antifungal agents, such as parabens, chlorobutanol, phenol, sorbic acid, thimerosal, and the like. In many cases, it is desirable to include isotonic agents, for example, sugars, buffers, or sodium chloride. Prolonged absorption of the injectable compositions can be brought about by the inclusion of agents which delay absorption, for example, aluminum monostearate and gelatin.
Sterile injectable solutions are prepared by incorporating the compounds and/or agents disclosed herein in the required amounts with various other ingredients enumerated above, as required, followed by filter sterilization. In the case of sterile powders for the preparation of sterile injectable solutions, the methods of preparation include vacuum-drying and freeze-drying techniques which yield a powder of the active ingredient plus any additional desired ingredient present in the previously sterile-filtered solution thereof.
For topical administration, the compounds and agents disclosed herein may be applied as liquid or solid forms. However, it is generally desirable to apply them topically to the skin as a composition in combination with a dermatologically acceptable carrier, which may be solid or liquid. The compounds and agents and compositions disclosed herein may be topically applied to the skin of a patient to reduce the size of malignant or benign growths (and may include complete removal), or to treat the site of an infection. The compounds and agents disclosed herein may be applied directly to the locus of growth or infection. In embodiments, the compounds and agents are applied to the growth or infection site in a formulation such as an ointment, cream, lotion, solution, tincture, or the like.
Useful solid carriers include finely divided solids such as talc, clay, microcrystalline cellulose, silica, alumina and the like. Useful liquid carriers include water, alcohols or glycols or water-alcohol/glycol mixtures in which the compounds can be dissolved or dispersed at an effective level, optionally with the aid of non-toxic surfactants. Adjuvants such as fragrances and additional antimicrobial agents may be added to improve the characteristics for a given use. The resulting liquid composition may be applied from an absorbent pad, used to impregnate bandages and other dressings, or sprayed onto the affected area using, for example, a pump or aerosol sprayer.
Thickeners such as synthetic polymers, fatty acids, fatty acid salts and esters, fatty alcohols, modified celluloses or modified mineral materials may also be used with the liquid carrier to form spreadable pastes, gels, ointments, soaps, and the like for direct application to the skin of a user.
Useful dosages of the compounds and agents and pharmaceutical compositions disclosed herein can be determined by comparing their in vitro activity and their in vivo activity in animal models. Methods for extrapolating effective dosages in mice and other animals to humans are known in the art.
The dosage range in which the composition is administered is a dosage range large enough to produce the desired effect affecting the symptom or disorder. The dosage should not be so large as to cause adverse side effects such as undesired cross-reactions, allergic reactions, etc. Generally, the dosage will vary with the age, condition, sex and degree of disease of the patient and can be determined by one skilled in the art. In the case of any contraindications, the dosage can be adjusted by the individual physician. The dosage may vary, and may be administered in one or more doses per day for one or more days.
Also disclosed are pharmaceutical compositions comprising a combination of a compound disclosed herein and a pharmaceutically acceptable carrier. In embodiments, the pharmaceutical composition is suitable for oral, topical or parenteral administration. The dose administered to a patient, particularly a human, should be sufficient to achieve a therapeutic response in the patient within a reasonable time frame without lethal toxicity and without causing side effects or morbidity exceeding acceptable levels. Those skilled in the art will recognize that the dosage will depend on a variety of factors including the condition (health) of the patient, the weight of the patient, the type of concurrent therapy (if any), the frequency of treatment, the rate of treatment, and the severity and stage of the pathological condition.
Kits comprising the compounds disclosed herein in one or more containers are also disclosed. The disclosed kits may optionally include a pharmaceutically acceptable carrier and/or diluent. In embodiments, the kit includes one or more additional components, adjuncts or adjuvants as described herein. In another embodiment, the kit includes one or more anti-cancer agents, such as those described herein. In embodiments, the kit includes instructions or packaging materials describing how to administer the compounds or compositions of the kit. The container of the kit may be of any suitable material, such as glass, plastic, metal, etc., and may be of any suitable size, shape or configuration. In embodiments, the compounds and/or agents disclosed herein are provided in a kit as a solid (such as in the form of a tablet, pill, or powder). In another embodiment, the compounds and/or agents disclosed herein are provided in a kit as a liquid or solution. In embodiments, the kit comprises an ampoule or syringe containing a compound and/or an agent disclosed herein in liquid or solution form.
Certain definitions
As used in the specification and the appended claims, the singular forms "a," "an," and "the" include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to "a composition" includes a mixture of two or more such compositions, reference to "an agent" includes a mixture of two or more such agents, reference to "the component" includes a mixture of two or more such components, and so forth.
The term "about" when immediately preceding a numerical value means a range (e.g., plus or minus 10% of that value). For example, "about 50" may mean 45 to 55, "about 25,000" may mean 22,500 to 27,500, etc., unless the context of the present disclosure indicates otherwise or is inconsistent with such an interpretation. For example, in a list of values such as "about 49, about 50, about 55, …", "about 50" means a range extending to less than half the interval between the preceding and subsequent values, e.g., greater than 49.5 to less than 52.5. Furthermore, the phrases "less than about" a value or "greater than about" a value should be understood in accordance with the definition of the term "about" provided herein. Similarly, the term "about" when preceding a series of values or a range of values (e.g., "about 10, 20, 30" or "about 10-30") refers to all values in the series or to the endpoints of the range, respectively.
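By way of non-limiting illustration, the two readings of "about" described above can be sketched in Python. The function names `about_range` and `about_in_list` are hypothetical and are introduced here only for illustration; the 10% figure is the example tolerance given above, not a fixed requirement.

```python
def about_range(value, percent=10):
    """Return (low, high) for "about value", read as plus or minus `percent` percent.

    The 10% default mirrors the example in the definition; it is an
    illustrative assumption, not a required tolerance.
    """
    delta = abs(value) * percent / 100
    return (value - delta, value + delta)


def about_in_list(previous_value, value, next_value):
    """Return the open interval for "about value" when it appears in a list.

    The interval extends less than half way toward each neighbouring value,
    so both returned endpoints are exclusive.
    """
    low = value - (value - previous_value) / 2
    high = value + (next_value - value) / 2
    return (low, high)


if __name__ == "__main__":
    print(about_range(50))            # 45 to 55
    print(about_range(25000))         # 22,500 to 27,500
    print(about_in_list(49, 50, 55))  # greater than 49.5 to less than 52.5
```

The list reading yields an asymmetric interval because the neighbouring values (49 and 55) are not equally spaced around 50.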
As used herein, "cell penetrating peptide" or "CPP" refers to a peptide that facilitates delivery of cargo, such as a Therapeutic Moiety (TM), into a cell. In embodiments, the CPP is cyclic and is denoted "cCPP". In embodiments, the cCPP is capable of directing the therapeutic moiety to penetrate the cell membrane. In embodiments, the cCPP delivers the therapeutic moiety to the cytosol of the cell. In embodiments, the cCPP delivers an Antisense Compound (AC) to the cellular location where the pre-mRNA is located.
As used herein, the term "endosomal escape vector" (EEV) refers to a cCPP conjugated to a linker and/or an Exocyclic Peptide (EP) by chemical linkage (i.e., a covalent bond or non-covalent interaction). The EEV may be an EEV of formula (B).
As used herein, the term "EEV-conjugate" refers to an endosomal escape carrier as defined herein that is conjugated to cargo by chemical attachment (i.e., covalent bond or non-covalent interaction). The cargo may be a therapeutic moiety (e.g., an oligonucleotide, peptide, or small molecule) that can be delivered into the cell by EEV. The EEV-conjugate may be an EEV-conjugate of formula (C).
As used herein, the terms "exocyclic peptide" (EP) and "modulator peptide" (MP) are used interchangeably to refer to two or more amino acid residues joined by peptide bonds that can be conjugated to a cyclic cell penetrating peptide (cCPP) as disclosed herein. When conjugated to the cyclic peptides disclosed herein, an EP may alter the tissue distribution and/or retention of the compound. Typically, an EP comprises at least one positively charged amino acid residue, e.g., at least one lysine residue and/or at least one arginine residue. Non-limiting examples of EPs are described herein. An EP may be a peptide identified in the art as a "nuclear localization sequence" (NLS). Non-limiting examples of nuclear localization sequences include the nuclear localization sequence of the SV40 virus large T antigen, whose minimal functional unit is the seven-amino-acid sequence PKKKRKV (SEQ ID NO: 42); the bipartite nucleoplasmin NLS with the sequence NLSKRPAAIKKAGQAKKKK (SEQ ID NO: 52); the c-myc nuclear localization sequence with the amino acid sequence PAAKRVKLD (SEQ ID NO: 53) or RQRRNELKRSF (SEQ ID NO: 54); the sequence RMRKFKNKGKDTAELRRRRVEVSVELRKAKKDEQILKRRNV (SEQ ID NO: 50) from the IBB domain of importin-alpha; the sequences VSRKRPRP (SEQ ID NO: 57) and PPKKARED (SEQ ID NO: 58) of the myoma T protein; the sequence PQPKKKPL (SEQ ID NO: 59) of human p53; the sequence SALIKKKKKMAP (SEQ ID NO: 60) of mouse c-abl IV; the sequences DRLRR (SEQ ID NO: 61) and PKQKKRK (SEQ ID NO: 62) of influenza virus NS1; the sequence RKLKKKIKKL (SEQ ID NO: 63) of hepatitis virus delta antigen; the sequence REKKKFLKRR (SEQ ID NO: 64) of mouse protein; and sequences of human poly(ADP-ribose) polymerase and of the human glucocorticoid receptor. Additional examples of NLS are described in International Publication No. 2001/038547, which is incorporated herein by reference in its entirety.
As used herein, "linker" or "L" refers to a moiety that covalently binds one or more moieties (e.g., Exocyclic Peptides (EPs) and cargo, such as oligonucleotides, peptides, or small molecules) to a cyclic cell penetrating peptide (cCPP). The linker may comprise a natural or unnatural amino acid or polypeptide. The linker may be a synthetic compound containing two or more functional groups suitable for binding the cCPP to the cargo moiety to form the compounds disclosed herein. The linker may comprise a polyethylene glycol (PEG) moiety. The linker may comprise one or more amino acids. The cCPP may be covalently bound to the cargo via a linker.
The terms "peptide," "protein," and "polypeptide" are used interchangeably to refer to a natural or synthetic molecule comprising two or more amino acids linked by the carboxyl group of one amino acid to the alpha amino group of another amino acid. Two or more amino acid residues may be linked by the carboxyl group of one amino acid to the alpha amino group of another. Two or more amino acids of a polypeptide may be joined by peptide bonds. A polypeptide may include peptide backbone modifications in which two or more amino acids are covalently linked by a bond other than a peptide bond. The polypeptide may include one or more unnatural amino acids, amino acid analogs, or other synthetic molecules that are capable of being integrated into the polypeptide. The term polypeptide includes naturally occurring and artificially occurring amino acids. The term polypeptide includes, for example, peptides comprising about 2 to about 100 amino acid residues as well as proteins comprising more than about 100 amino acid residues or more than about 1000 amino acid residues, including but not limited to therapeutic proteins such as antibodies, enzymes, receptors, soluble proteins, and the like.
As used herein, the term "contiguous" refers to two amino acids joined by a covalent bond. For example, in a representative cyclic cell penetrating peptide (cCPP) such as [structure not reproduced], AA1/AA2, AA2/AA3, AA3/AA4 and AA5/AA1 exemplify pairs of contiguous amino acids.
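By way of non-limiting illustration, the contiguous pairs of a cyclic peptide can be enumerated as in the following Python sketch. The function name `contiguous_pairs` is hypothetical, and the sketch assumes a ring of residues listed in order in which the last residue is bonded back to the first, so that the final pair wraps around.

```python
def contiguous_pairs(residues):
    """List each pair of neighbouring residues in a cyclic peptide.

    `residues` is the ordered ring of residue labels. The ring is assumed
    to close between the last and first residue, so the final pair wraps.
    """
    count = len(residues)
    if count < 2:
        return []  # a pair needs at least two residues
    return [(residues[i], residues[(i + 1) % count]) for i in range(count)]


if __name__ == "__main__":
    ring = ["AA1", "AA2", "AA3", "AA4", "AA5"]
    for first, second in contiguous_pairs(ring):
        print(f"{first}/{second}")
```

For a five-residue ring this yields every neighbouring pair, including the ring-closing pair AA5/AA1 mentioned above.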
As used herein, a residue of a chemical substance refers to a derivative of the chemical substance that is present in a particular product. To form the product, at least one atom of the substance is replaced by a bond to another moiety, such that the product contains a derivative or residue of the chemical substance. For example, the cyclic cell penetrating peptides (cCPP) described herein have amino acids (e.g., arginine) incorporated therein by formation of one or more peptide bonds. The amino acid incorporated into the cCPP may be referred to as a residue, or simply as an amino acid. Thus, arginine or an arginine residue refers to [structure not reproduced].
The term "protonated form thereof" refers to a protonated form of an amino acid. For example, the guanidine group on the arginine side chain can be protonated to form a guanidinium group. The structure of the protonated form of arginine is [structure not reproduced].
As used herein, the term "chiral" refers to a molecule having more than one stereoisomer that differs in the three-dimensional arrangement of atoms, wherein one stereoisomer is a non-superimposable mirror image of the other stereoisomer. With the exception of glycine, amino acids have a chiral carbon atom adjacent to the carboxyl group. The term "enantiomer" refers to a chiral stereoisomer. Chiral molecules may be amino acid residues having "D" and "L" enantiomers. Molecules without chiral centers, such as glycine, may be referred to as "achiral".
As used herein, the term "hydrophobic" refers to a moiety that is insoluble or has minimal solubility in water. Typically, the neutral and/or non-polar moiety, or predominantly neutral and/or non-polar moiety, is hydrophobic. Hydrophobicity can be measured by one of the methods disclosed herein.
As used herein, "aromatic" refers to an unsaturated ring molecule having 4n+2 pi electrons, wherein n is any integer. The term "non-aromatic" refers to any unsaturated ring molecule that does not fall within the definition of aromatic.
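By way of non-limiting illustration, the 4n+2 electron count in the definition above can be checked as in the following Python sketch. The function name `satisfies_4n_plus_2` is hypothetical, and the sketch assumes that n is a non-negative integer. It tests only the pi electron count and does not assess whether a molecule is cyclic, planar, or conjugated.

```python
def satisfies_4n_plus_2(pi_electrons):
    """Return True if the pi electron count equals 4n + 2 for some n >= 0.

    Assumption: n is restricted to non-negative integers (0, 1, 2, ...),
    giving counts of 2, 6, 10, 14, and so on.
    """
    return pi_electrons >= 2 and (pi_electrons - 2) % 4 == 0


if __name__ == "__main__":
    # Benzene has 6 pi electrons; cyclobutadiene has 4; naphthalene has 10.
    for name, count in [("benzene", 6), ("cyclobutadiene", 4), ("naphthalene", 10)]:
        print(name, count, satisfies_4n_plus_2(count))
```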
"Alkyl", "alkyl chain" or "alkyl group" refers to a fully saturated, straight or branched hydrocarbon chain group having one to forty carbon atoms and linked to the remainder of the molecule by a single bond. Alkyl groups containing any number of carbon atoms from 1 to 40 are included. An alkyl group containing up to 40 carbon atoms is a C1-C40 alkyl group, an alkyl group containing up to 10 carbon atoms is a C1-C10 alkyl group, an alkyl group containing up to 6 carbon atoms is a C1-C6 alkyl group, and an alkyl group containing up to 5 carbon atoms is a C1-C5 alkyl group. C1-C5 alkyl includes C5 alkyl, C4 alkyl, C3 alkyl, C2 alkyl, and C1 alkyl (i.e., methyl). C1-C6 alkyl includes all moieties described above for C1-C5 alkyl, but also includes C6 alkyl. C1-C10 alkyl includes all of the moieties described above for C1-C5 alkyl and C1-C6 alkyl, but also includes C7, C8, C9 and C10 alkyl. Similarly, C1-C12 alkyl includes all of the foregoing moieties, but also includes C11 and C12 alkyl. Non-limiting examples of C1-C12 alkyl groups include methyl, ethyl, n-propyl, isopropyl, sec-propyl, n-butyl, isobutyl, sec-butyl, tert-butyl, n-pentyl, tert-pentyl, n-hexyl, n-heptyl, n-octyl, n-nonyl, n-decyl, n-undecyl and n-dodecyl. Unless specifically stated otherwise in the specification, an alkyl group may be optionally substituted.
"Alkylene", "alkylene chain" or "alkylene group" refers to a fully saturated straight or branched divalent hydrocarbon chain group having one to forty carbon atoms. Non-limiting examples of C2-C40 alkylene include ethylene, propylene, n-butylene, vinylene, propenylene, n-butenylene, propynylene, n-butynylene, and the like. Unless specifically stated otherwise in the specification, the alkylene chain may be optionally substituted.
"Alkenyl", "alkenyl chain" or "alkenyl group" refers to a straight or branched hydrocarbon chain group having two to forty carbon atoms and having one or more carbon-carbon double bonds. Each alkenyl group is attached to the remainder of the molecule by a single bond. Alkenyl groups containing any number of carbon atoms from 2 to 40 are included. An alkenyl group containing up to 40 carbon atoms is a C2-C40 alkenyl group, an alkenyl group containing up to 10 carbon atoms is a C2-C10 alkenyl group, an alkenyl group containing up to 6 carbon atoms is a C2-C6 alkenyl group, and an alkenyl group containing up to 5 carbon atoms is a C2-C5 alkenyl group. C2-C5 alkenyl includes C5 alkenyl, C4 alkenyl, C3 alkenyl and C2 alkenyl. C2-C6 alkenyl includes all moieties described above with respect to C2-C5 alkenyl, but also includes C6 alkenyl. C2-C10 alkenyl includes all moieties described above for C2-C5 alkenyl and C2-C6 alkenyl, but also includes C7, C8, C9 and C10 alkenyl. Similarly, C2-C12 alkenyl includes all of the foregoing moieties, but also includes C11 and C12 alkenyl. Non-limiting examples of C2-C12 alkenyl include ethenyl (vinyl), 1-propenyl, 2-propenyl (allyl), isopropenyl, 2-methyl-1-propenyl, 1-butenyl, 2-butenyl, 3-butenyl, 1-pentenyl, 2-pentenyl, 3-pentenyl, 4-pentenyl, 1-hexenyl, 2-hexenyl, 3-hexenyl, 4-hexenyl, 5-hexenyl, 1-heptenyl, 2-heptenyl, 3-heptenyl, 4-heptenyl, 5-heptenyl, 6-heptenyl, 1-octenyl, 2-octenyl, 3-octenyl, 4-octenyl, 5-octenyl, 6-octenyl, 7-octenyl, 1-nonenyl, 2-nonenyl, 3-nonenyl, 4-nonenyl, 5-nonenyl, 6-nonenyl, 7-nonenyl, 8-nonenyl, 1-decenyl, 2-decenyl, 3-decenyl, 4-decenyl, 5-decenyl, 6-decenyl, 7-decenyl, 8-decenyl, 9-decenyl, 1-undecenyl, 2-undecenyl, 3-undecenyl, 4-undecenyl, 5-undecenyl, 6-undecenyl, 7-undecenyl, 8-undecenyl, 9-undecenyl, 10-undecenyl, 1-dodecenyl, 2-dodecenyl, 3-dodecenyl, 4-dodecenyl, 5-dodecenyl, 6-dodecenyl, 7-dodecenyl, 8-dodecenyl, 9-dodecenyl, 10-dodecenyl and 11-dodecenyl. Unless specifically stated otherwise in the specification, an alkenyl group may be optionally substituted.
"Alkenylene", "alkenylene chain" or "alkenylene group" refers to a straight or branched divalent hydrocarbon chain group having two to forty carbon atoms and having one or more carbon-carbon double bonds. Non-limiting examples of C2-C40 alkenylene groups include ethenylene, propenylene, butenylene, and the like. Unless specifically stated otherwise in the specification, an alkenylene chain may be optionally substituted.
"Alkoxy" or "alkoxy group" refers to the group -OR, wherein R is alkyl, alkenyl, alkynyl, cycloalkyl, or heterocyclyl as defined herein. Unless specifically stated otherwise in the specification, an alkoxy group may be optionally substituted.
"Acyl" or "acyl group" refers to the group -C(O)R, wherein R is hydrogen, alkyl, alkenyl, alkynyl, carbocyclyl, or heterocyclyl, as defined herein. Unless specifically stated otherwise in the specification, an acyl group may be optionally substituted.
"Alkylcarbamoyl" or "alkylcarbamoyl group" refers to the group -O-C(O)-NRaRb, wherein Ra and Rb are the same or different and are independently alkyl, alkenyl, alkynyl, aryl, or heteroaryl as defined herein, or Ra and Rb may together form a cycloalkyl group or a heterocyclyl group as defined herein. Unless specifically stated otherwise in the specification, an alkylcarbamoyl group may be optionally substituted.
"Alkylcarboxamide" or "alkylcarboxamide group" refers to the group -C(O)-NRaRb, wherein Ra and Rb are the same or different and are independently an alkyl, alkenyl, alkynyl, aryl, heteroaryl, cycloalkyl, cycloalkenyl, cycloalkynyl or heterocyclyl group as defined herein, or Ra and Rb may together form a cycloalkyl group as defined herein. Unless specifically stated otherwise in the specification, an alkylcarboxamide group may be optionally substituted.
"Aryl" refers to a hydrocarbon ring system group comprising hydrogen, 6 to 18 carbon atoms, and at least one aromatic ring. For the purposes of the present invention, aryl groups may be monocyclic, bicyclic, tricyclic or tetracyclic ring systems, which may include fused or bridged ring systems. Aryl groups include, but are not limited to, aryl groups derived from aceanthrylene, acenaphthylene, acephenanthrylene, anthracene, azulene, benzene, chrysene, fluoranthene, fluorene, as-indacene, s-indacene, indane, indene, naphthalene, phenalene, phenanthrene, pleiadene, pyrene, and triphenylene. Unless specifically stated otherwise in the specification, the term "aryl" is meant to include optionally substituted aryl groups.
"Heteroaryl" refers to a 5 to 20 membered ring system group comprising hydrogen atoms, one to thirteen carbon atoms, one to six heteroatoms selected from nitrogen, oxygen and sulfur, and at least one aromatic ring. For the purposes of the present invention, heteroaryl groups may be monocyclic, bicyclic, tricyclic or tetracyclic ring systems, which may include fused or bridged ring systems; the nitrogen, carbon or sulfur atoms in the heteroaryl group may optionally be oxidized; and the nitrogen atom may optionally be quaternized. Examples include, but are not limited to, azepinyl, acridinyl, benzimidazolyl, benzothiazolyl, benzindolyl, benzodioxolyl, benzofuranyl, benzooxazolyl, benzothiazolyl, benzothiadiazolyl, benzo[b][1,4]dioxepinyl, 1,4-benzodioxanyl, benzonaphthofuranyl, benzoxazolyl, benzodioxolyl, benzodioxinyl, benzopyranyl, benzopyranonyl, benzofuranyl, benzofuranonyl, benzothienyl (benzothiophenyl), benzotriazolyl, benzo[4,6]imidazo[1,2-a]pyridinyl, carbazolyl, cinnolinyl, dibenzofuranyl, dibenzothiophenyl, furanyl, furanonyl, isothiazolyl, imidazolyl, indazolyl, indolyl, indazolyl, isoindolyl, indolinyl, isoindolinyl, isoquinolyl, indolizinyl, isoxazolyl, naphthyridinyl, oxadiazolyl, 2-oxoazepinyl, oxazolyl, oxiranyl, 1-oxidopyridinyl, 1-oxidopyrimidinyl, 1-oxidopyrazinyl, 1-oxidopyridazinyl, 1-phenyl-1H-pyrrolyl, phenazinyl, phenothiazinyl, phenoxazinyl, phthalazinyl, pteridinyl, purinyl, pyrrolyl, pyrazolyl, pyridinyl, pyrazinyl, pyrimidinyl, pyridazinyl, quinazolinyl, quinoxalinyl, quinolinyl, quinuclidinyl, isoquinolinyl, tetrahydroquinolinyl, thiazolyl, thiadiazolyl, triazolyl, tetrazolyl, triazinyl, and thiophenyl (i.e., thienyl). Unless specifically stated otherwise in the specification, heteroaryl groups may be optionally substituted.
The term "substituted" as used herein means any of the above groups (i.e., alkyl, alkenyl, alkynyl, cycloalkyl, cycloalkenyl, cycloalkynyl, heterocyclyl, aryl, heteroaryl, alkoxy, aryloxy, acyl, alkylcarbamoyl, alkylcarboxamide, alkoxycarbonyl, alkylthio, or arylthio) in which at least one atom is replaced by a non-hydrogen atom such as, but not limited to: a halogen atom such as F, Cl, Br, and I; an oxygen atom in groups such as hydroxyl groups, alkoxy groups, ester groups, and the like; a sulfur atom in groups such as thiol groups, thioalkyl groups, sulfone groups, sulfonyl groups, and sulfoxide groups; a nitrogen atom in groups such as amines, amides, alkylamines, dialkylamines, arylamines, alkylarylamines, diarylamines, N-oxides, imides, and enamines; a silicon atom in groups such as trialkylsilyl groups, dialkylarylsilyl groups, alkyldiarylsilyl groups, and triarylsilyl groups; and other heteroatoms in various other groups. "Substituted" also means any of the above groups in which one or more atoms are replaced by a higher-order bond (e.g., a double or triple bond) to a heteroatom, such as oxygen in oxo, carbonyl, carboxyl, and ester groups, and nitrogen in groups such as imines, oximes, hydrazones, and nitriles. For example, "substituted" includes any of the above groups in which one or more atoms are replaced by -NRgRh, -NRgC(=O)Rh, -NRgC(=O)NRgRh, -NRgC(=O)ORh, -NRgSO2Rh, -OC(=O)NRgRh, -ORg, -SRg, -SORg, -SO2Rg, -OSO2Rg, -SO2ORg, =NSO2Rg, and -SO2NRgRh. "Substituted" also means any of the above groups in which one or more hydrogen atoms are replaced by -C(=O)Rg, -C(=O)ORg, -C(=O)NRgRh, -CH2SO2Rg, or -CH2SO2NRgRh. In the above, Rg and Rh are the same or different and are independently hydrogen, alkyl, alkenyl, alkynyl, alkoxy, alkylamino, thioalkyl, aryl, aralkyl, cycloalkyl, cycloalkenyl, cycloalkynyl, cycloalkylalkyl, haloalkyl, haloalkenyl, haloalkynyl, heterocyclyl, N-heterocyclyl, heterocyclylalkyl, heteroaryl, N-heteroaryl and/or heteroarylalkyl. 
"substituted" also means any of the foregoing groups in which one or more atoms is replaced with an amino, cyano, hydroxy, imino, nitro, oxo, thio, halo, alkyl, alkenyl, alkynyl, alkoxy, alkylamino, thioalkyl, aryl, aralkyl, cycloalkyl, cycloalkenyl, cycloalkynyl, cycloalkylalkyl, haloalkyl, haloalkenyl, haloalkynyl, heterocyclyl, N-heterocyclyl, heterocyclylalkyl, heteroaryl, N-heteroaryl, and/or heteroarylalkyl group. "substituted" may also mean an amino acid in which one or more atoms in the side chain are replaced by alkyl, alkenyl, alkynyl, acyl, alkylcarboxamide, alkoxycarbonyl, carbocyclyl, heterocyclyl, aryl or heteroaryl groups. Furthermore, each of the foregoing substituents may also be optionally substituted with one or more of the substituents described above.
As used herein, the symbol [symbol not reproduced] (hereinafter may be referred to as an "attachment point bond") means a bond that is an attachment point between two chemical entities, one of which is depicted as attached to the attachment point bond and the other of which is not depicted as attached to the attachment point bond. For example, [structure not reproduced] means that the chemical entity "XY" is bonded to another chemical entity via the attachment point bond. Furthermore, the specific point of attachment to the non-depicted chemical entity may be specified by inference. For example, for the compound CH3-R3, wherein R3 is H or [structure not reproduced], it is inferred that when R3 is "XY", the attachment point bond is the same bond as the one by which R3 is depicted as being bonded to CH3.
As used herein, "subject" refers to an individual. Thus, a "subject" may include domestic animals (e.g., cats, dogs, etc.), farm animals (e.g., cattle, horses, pigs, sheep, goats, etc.), laboratory animals (e.g., mice, rabbits, rats, guinea pigs, etc.), and birds. "subject" may also include mammals, such as primates or humans. Thus, the subject may be a human or veterinary patient. The term "patient" refers to a subject under treatment by a clinician (e.g., physician).
The term "inhibition" refers to a decrease in activity, response, disorder, disease or other biological parameter. This may include, but is not limited to, complete elimination of an activity, reaction, condition or disease. This may also include, for example, a 10% reduction in activity, response, disorder or disease as compared to a natural or control level. Thus, a decrease may be a 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100% or any amount decrease therebetween as compared to a native or control level.
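By way of non-limiting illustration, a decrease relative to a control level can be expressed as a percentage as in the following Python sketch. The function name `percent_decrease` and the numeric inputs are hypothetical and are not data from the present disclosure.

```python
def percent_decrease(control_level, treated_level):
    """Return the decrease of treated_level relative to control_level, in percent.

    A result of 100 corresponds to complete elimination; a negative result
    means the treated level is higher than the control level.
    """
    if control_level <= 0:
        raise ValueError("control_level must be positive")
    return 100 * (control_level - treated_level) / control_level


if __name__ == "__main__":
    # Hypothetical activity readouts in arbitrary units.
    print(percent_decrease(200, 180))  # a 10% decrease
    print(percent_decrease(200, 0))    # complete elimination
```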
"Reduce" or other forms of the word, such as "reducing" or "reduction", means a decrease in an event or feature (e.g., tumor growth). It will be appreciated that this is typically in relation to some standard or expected value; in other words, it is relative, but reference to a standard or relative value is not always required. For example, "reducing tumor growth" means reducing the growth rate of a tumor relative to a standard or control (e.g., an untreated tumor).
The term "treatment" refers to the medical management of a patient with the aim of curing, ameliorating, stabilizing or preventing a disease, pathological condition or disorder. The term includes active therapies, i.e. therapies directed specifically to ameliorating a disease, pathological condition or disorder, and also includes causal therapies, i.e. therapies directed to eliminating the etiology of the associated disease, pathological condition or disorder. Furthermore, the term includes palliative treatment, i.e. treatment intended to alleviate symptoms rather than cure a disease, pathological condition or disorder; prophylactic treatment, i.e., treatment intended to minimize or partially or completely inhibit the development of a related disease, pathological condition, or disorder; and supportive treatment, i.e., treatment for supplementing another specific therapy aimed at ameliorating the associated disease, pathological condition, or disorder.
The term "therapeutically effective" means that the amount of the composition used is sufficient to ameliorate one or more causes or symptoms of the disease or disorder. Such amelioration requires only a reduction or alteration, not necessarily elimination.
The term "pharmaceutically acceptable" refers to those compounds, materials, compositions, and/or dosage forms which are, within the scope of sound medical judgment, suitable for use in contact with the tissues of human beings and animals without excessive toxicity, irritation, allergic response, or other problems or complications, commensurate with a reasonable benefit/risk ratio.
The term "carrier" means a compound, composition, substance or structure that, when combined with a compound or composition, aids or facilitates the preparation, storage, administration, delivery, availability, selectivity or any other characteristic of the compound or composition for its intended use or purpose. For example, the carrier may be selected to minimize any degradation of the active ingredient and to minimize any adverse side effects in the subject.
As used herein, the term "pharmaceutically acceptable carrier" refers to a carrier suitable for administration to a patient. Pharmaceutically acceptable carrier means sterile aqueous or non-aqueous solutions, dispersions, suspensions or emulsions, as well as sterile powders for reconstitution into sterile injectable solutions or dispersions just prior to use. Examples of suitable aqueous and nonaqueous carriers, diluents, solvents or vehicles include water, ethanol, polyols (such as glycerol, propylene glycol, polyethylene glycol, and the like), carboxymethyl cellulose and suitable mixtures thereof, vegetable oils (such as olive oil), and injectable organic esters such as ethyl oleate. Proper fluidity can be maintained, for example, by the use of a coating material such as lecithin, by the maintenance of the required particle size in the case of dispersions and by the use of surfactants. These compositions may also contain adjuvants such as preserving, wetting, emulsifying and dispersing agents. Prevention of the action of microorganisms can be ensured by the inclusion of various antibacterial and antifungal agents such as parabens, chlorobutanol, phenol, sorbic acid, and the like. It may also be desirable to include isotonic agents, such as sugars, sodium chloride, and the like. The injectable formulation may be sterilized, for example, by filtration through a bacterial-retaining filter, or by incorporating sterilizing agents in the form of sterile solid compositions which may be dissolved or dispersed in sterile water or other sterile injectable medium immediately prior to use. Suitable inert carriers may include sugars such as lactose.
The term "pharmaceutically acceptable salts" includes those obtained by reacting an active compound functioning as a base with an inorganic or organic acid to form a salt, such as salts of hydrochloric acid, sulfuric acid, phosphoric acid, methanesulfonic acid, camphorsulfonic acid, oxalic acid, maleic acid, succinic acid, citric acid, formic acid, hydrobromic acid, benzoic acid, tartaric acid, fumaric acid, salicylic acid, mandelic acid, carbonic acid and the like. Those skilled in the art will further recognize that acid addition salts may be prepared by reacting a compound with an appropriate inorganic or organic acid via any of a variety of known methods. The term "pharmaceutically acceptable salts" also includes those obtained by reacting an active compound functioning as an acid with an inorganic or organic base to form a salt, such as salts of ethylenediamine, N-methyl-glucamine, lysine, arginine, ornithine, choline, N,N'-dibenzylethylenediamine, chloroprocaine, diethanolamine, procaine, N-benzylphenethylamine, diethylamine, piperazine, tris-(hydroxymethyl)-aminomethane, tetramethylammonium hydroxide, triethylamine, dibenzylamine, ephenamine, dehydroabietylamine, N-ethylpiperidine, benzylamine, tetramethylammonium, tetraethylammonium, methylamine, dimethylamine, trimethylamine, ethylamine, basic amino acids, and the like. Non-limiting examples of inorganic or metal salts include lithium, sodium, calcium, potassium, magnesium salts, and the like.
As used herein, the term "parenteral administration" refers to administration by injection or infusion. Parenteral administration includes, but is not limited to, subcutaneous, intravenous, or intramuscular administration.
As used herein, the term "subcutaneous administration" refers to administration under the skin. By "intravenous administration" is meant administration into a vein.
As used herein, the term "dose" refers to a specified amount of an agent provided in a single administration. In embodiments, the dose may be administered in two or more boluses, tablets, or injections. In embodiments where subcutaneous administration is desired, the desired dose may require a volume that is not readily accommodated by a single injection. In such embodiments, two or more injections may be used to achieve the desired dose. In embodiments, the dose may be administered in two or more injections to reduce injection site reactions in the patient.
As used herein, the term "dosage unit" refers to a form in which a pharmaceutical agent is provided. In embodiments, the dosage unit is a vial comprising lyophilized antisense oligonucleotide. In embodiments, the dosage unit is a vial comprising the reconstituted antisense oligonucleotide.
The term "therapeutic moiety" (TM) refers to a compound useful for treating at least one symptom of a disease or disorder, and may include, but is not limited to, therapeutic polypeptides, oligonucleotides, small molecules, and other agents useful for treating at least one symptom of a disease or disorder. In embodiments, the therapeutic moiety modulates the expression or activity of the target protein. In embodiments, the therapeutic moiety modulates splicing. In embodiments, the therapeutic moiety induces exon skipping in the target mRNA transcript. In embodiments, the therapeutic moiety down-regulates the expression or activity of the target protein. In embodiments, the therapeutic moiety down-regulates expression or activity of the target protein by inducing exon skipping in the target transcript.
The terms "modulate", "modulating" and "modulation" refer to a perturbation of expression, function or activity when compared to the level of expression, function or activity prior to modulation. Modulation may include an increase (stimulation or induction) or a decrease (inhibition or reduction) in expression, function or activity. In embodiments, the compounds disclosed herein comprise a Therapeutic Moiety (TM) that down-regulates the expression, function and/or activity of a target protein. In embodiments, the compounds disclosed herein comprise a therapeutic moiety that upregulates expression, function and/or activity of a target protein.
"Amino acid" means a compound comprising an amino group and a carboxylic acid group and having the general formula H2N-CH(R)-COOH, wherein R may be any organic group. The amino acid may be a naturally occurring amino acid or a non-naturally occurring amino acid. The amino acid may be a proteinogenic amino acid or a non-proteinogenic amino acid. The amino acid may be an L-amino acid or a D-amino acid. The term "amino acid side chain" or "side chain" refers to the characteristic substituent ("R") that is bound to the α-carbon of a natural or unnatural α-amino acid. Amino acids may be incorporated into polypeptides via peptide bonds.
As used herein, the term "sequence identity" refers to the percentage of nucleic acids or amino acids between two oligonucleotide or polypeptide sequences, respectively, that are identical and in the same relative position. Thus, one sequence has a certain percentage of sequence identity compared to another sequence. For sequence comparison, one sequence is typically used as a reference sequence to which the test sequence is compared. One of ordinary skill in the art will appreciate that two sequences are generally considered "substantially identical" if they contain the same residues at the corresponding positions. In embodiments, sequence identity between sequences may be determined using the Needleman-Wunsch algorithm (Needleman and Wunsch, 1970, J. Mol. Biol. 48:443-453) as implemented in the Needle program of the EMBOSS software package (EMBOSS: The European Molecular Biology Open Software Suite, Rice et al., Trends Genet. (2000), 16:276-277), in the version that existed as of the filing date. The parameters used are a gap opening penalty of 10, a gap extension penalty of 0.5, and the EBLOSUM62 (EMBOSS version of BLOSUM62) substitution matrix. The output of Needle labeled "longest identity" (obtained using the -nobrief option) is used as the percent identity and is calculated as follows: (Identical Residues x 100)/(Length of Alignment - Total Number of Gaps in Alignment).
In other embodiments, sequence identity may be determined using the Smith-Waterman algorithm, in the version that existed as of the filing date.
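The percent identity formula given above can be illustrated with a short sketch. This is not the EMBOSS Needle implementation: it assumes the two sequences have already been aligned (for example, by Needle), that gaps are written as "-", and that every alignment column containing a gap counts once toward the total number of gaps. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Computes (identical residues x 100) / (alignment length - gap columns).
    Assumption: a column counts as a gap if either sequence has a gap there.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    identical = 0
    gap_columns = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            gap_columns += 1
        elif residue_a.upper() == residue_b.upper():
            identical += 1

    ungapped_length = len(aligned_a) - gap_columns
    if ungapped_length == 0:
        raise ValueError("alignment contains no ungapped columns")
    return 100.0 * identical / ungapped_length


# Toy alignment (hypothetical sequences): 9 columns, 1 gap column, 7 identities.
print(percent_identity("ACGT-ACGT", "ACGTTACCT"))
```

For the toy alignment, seven identical residues over eight ungapped columns give 87.5 percent identity.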
As used herein, "sequence homology" refers to the percentage of amino acids that are homologous and in the same relative position between two polypeptide sequences. Thus, one polypeptide sequence has a certain percentage of sequence homology to another polypeptide sequence. As will be appreciated by those of ordinary skill in the art, two sequences are generally considered "substantially homologous" if they contain homologous residues at corresponding positions. Homologous residues may be identical residues. Alternatively, homologous residues may be different residues having suitably similar structural and/or functional characteristics. For example, as is well known to those of ordinary skill in the art, certain amino acids are generally classified as "hydrophobic" or "hydrophilic" amino acids, and/or have "polar" or "nonpolar" side chains, and amino acid substitutions of one amino acid for another amino acid of the same type can generally be considered "homologous" substitutions.
As is well known in the art, amino acid sequences can be compared using any of a variety of algorithms, including those available in commercial computer programs, such as BLASTP, gapped BLAST, and PSI-BLAST, in the versions that existed as of the filing date. Such programs are described in Altschul et al., J. Mol. Biol. (1990), 215(3):403-410; Altschul et al., Nucleic Acids Res. (1997), 25:3389-3402; Baxevanis et al., Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, Wiley, 1998; and Misener et al. (eds.), Bioinformatics Methods and Protocols (Methods in Molecular Biology, Vol. 132), Humana Press, 1999. In addition to identifying homologous sequences, the programs mentioned above typically provide an indication of the degree of homology.
As used herein, a "cell targeting moiety" refers to a molecule or macromolecule that specifically binds to a molecule, such as a receptor, on the surface of a target cell. In embodiments, the cell surface molecules are expressed only on the surface of the target cells. In embodiments, the cell surface molecules are also present on the surface of one or more non-target cells, but the amount of cell surface molecule expression is higher on the surface of the target cells. Examples of cell targeting moieties include, but are not limited to, antibodies, peptides, proteins, aptamers, or small molecules.
As used herein, the terms "antisense compound" and "AC" are used interchangeably to refer to a polymeric nucleic acid structure that is at least partially complementary to a target nucleic acid molecule to which it (the AC) hybridizes. The AC may be a short (in embodiments, less than 50 bases) polynucleotide or polynucleotide analog that comprises a sequence that is complementary to a target sequence. In embodiments, the AC is a polynucleotide or polynucleotide analog comprising a sequence that is complementary to a target sequence in a target pre-mRNA strand. The AC may be formed from natural nucleic acids, synthetic nucleic acids, nucleic acid analogs, or any combination thereof. In embodiments, the AC comprises an oligonucleotide. In embodiments, the AC comprises an antisense oligonucleotide. In embodiments, the AC comprises a conjugate group. Non-limiting examples of ACs include primers, probes, antisense oligonucleotides, external guide sequence (EGS) oligonucleotides, siRNAs, oligonucleotides, oligonucleotide analogs, oligonucleotide mimetics, and chimeric combinations of these. These compounds may be introduced in single-stranded, double-stranded, circular, branched or hairpin form, and may contain structural elements such as internal or terminal bulges or loops. An oligomeric double-stranded compound may be two strands that hybridize to form a double-stranded compound, or a single strand that has sufficient self-complementarity to allow hybridization and formation of a fully or partially double-stranded compound. In embodiments, the AC modulates (increases, decreases, or alters) expression of a target nucleic acid.
As used herein, the term "targeting" refers to the association of a therapeutic moiety, such as an antisense compound, with a target nucleic acid molecule or region of a target nucleic acid molecule. In embodiments, the therapeutic moiety comprises an antisense compound capable of hybridizing to the target nucleic acid under physiological conditions. In embodiments, the antisense compound targets a particular portion or site within the target nucleic acid, e.g., a portion of the target nucleic acid having at least one identifiable structure, function, or feature, such as a particular exon or intron, or a selected nucleobase or motif within an exon or intron, such as a splice element or cis-acting splice regulatory element.
As used herein, the terms "target nucleic acid sequence" and "target nucleotide sequence" refer to a nucleic acid sequence or nucleotide sequence to which a therapeutic moiety, such as an antisense compound, binds or hybridizes. Target nucleic acids include, but are not limited to, a portion of a target transcript, a target RNA (including, but not limited to, pre-mRNA and mRNA or portions thereof), a portion of a target cDNA derived from such RNA, and a portion of a target untranslated RNA (such as miRNA). For example, in embodiments, a target nucleic acid may be a portion of a target cellular gene (or mRNA transcribed from such a gene) whose expression is associated with a particular disorder or disease state. The term "portion" refers to a defined number of contiguous (i.e., linked) nucleotides of a nucleic acid.
As used herein, the term "transcript" or "gene transcript" refers to an RNA molecule transcribed from DNA and includes, but is not limited to, mRNA, pre-mRNA, and partially processed RNA.
The terms "target transcript" and "target RNA" refer to a pre-mRNA or mRNA transcript that binds to a therapeutic moiety. The target transcript may comprise the target nucleotide sequence. In one embodiment, the target transcript comprises a splice site.
The terms "target gene" and "gene of interest" refer to genes whose expression and/or activity is or is desired to be regulated. The target gene may be transcribed into a target transcript comprising the target nucleotide sequence. The target transcript may be translated into a protein of interest.
The term "target protein" refers to a polypeptide or protein encoded by a target transcript (e.g., a target mRNA).
As used herein, the term "mRNA" refers to an RNA molecule encoding a protein, and includes pre-mRNA and mature mRNA. "Pre-mRNA" refers to a newly synthesized eukaryotic mRNA molecule immediately after transcription of the DNA. In embodiments, the pre-mRNA is capped with a 5' cap, modified with a 3' poly-A tail, and/or spliced to produce a mature mRNA sequence. In embodiments, the pre-mRNA comprises one or more introns. In one embodiment, the pre-mRNA undergoes a process known as splicing to remove introns and join exons. In embodiments, the pre-mRNA comprises one or more splice elements or splice regulatory elements. In embodiments, the pre-mRNA comprises a polyadenylation site.
As used herein, the terms "expression," "gene expression," "expression of a gene," and the like refer to all functions and steps by which information encoded in a gene is converted in a cell into a functional gene product (such as a polypeptide or non-coding RNA). Examples of non-coding RNAs include transfer RNAs (tRNAs) and ribosomal RNAs. Gene expression of a polypeptide includes transcription of the gene to form a pre-mRNA, processing of the pre-mRNA to form a mature mRNA, translocation of the mature mRNA from the nucleus to the cytoplasm, translation of the mature mRNA into the polypeptide, and assembly of the encoded polypeptide. Expression includes partial expression. For example, expression of a gene may refer herein to production of a gene transcript. Translation of mature mRNA may be referred to herein as expression of the mature mRNA.
As used herein, "modulation of gene expression" and the like refer to modulation of one or more processes associated with gene expression. For example, modulation of gene expression may include modulation of one or more of gene transcription, RNA processing, translocation of RNA from the nucleus to the cytoplasm, and translation of mRNA into protein.
As used herein, the term "gene" refers to a nucleic acid sequence comprising a 5 'promoter region associated with expression of a gene product, as well as any intronic and exonic regions and 3' untranslated regions ("UTRs") associated with expression of a gene product.
The term "immune cell" refers to a cell that is of hematopoietic origin and that plays a role in the immune response. Immune cells include, but are not limited to, lymphocytes (e.g., B cells and T cells), natural killer (NK) cells, and myeloid cells. The term "myeloid cells" includes monocytes, macrophages and granulocytes (e.g., basophils, neutrophils, eosinophils and mast cells). Monocytes are leukocytes that circulate in the blood for 1-3 days, after which they either migrate into tissues and differentiate into macrophages or inflammatory dendritic cells, or die. The term "macrophage" as used herein includes embryonically derived macrophages (which may also be referred to as tissue-resident macrophages) and macrophages derived from monocytes that have migrated from the bloodstream into tissue (which may be referred to as monocyte-derived macrophages). Depending on the tissue in which they are located, macrophages are referred to as Kupffer cells (liver), mesangial cells (kidney), alveolar macrophages (lung), sinus histiocytes (lymph nodes), Hofbauer cells (placenta), microglia (brain and spinal cord), Langerhans cells (skin), or the like.
As used herein, "proximal" with respect to an AC and a splice regulatory element means that the AC binds to a nucleic acid sequence within about 25, about 20, about 15, about 10, about 5, about 4, about 3, about 2, or about 1 nucleotide of the splice regulatory element, including, for example, a 5' splice site (5' ss), a branch point sequence (BPS), a polypyrimidine (Py) tract, or a 3' splice site (3' ss).
As used herein, "splice regulatory element (SRE)" and "splice element (SE)" are used interchangeably and refer to any nucleotide sequence within a transcript at which splicing occurs or that promotes, inhibits or alters splicing. Examples of splice elements include terminal stem loop sequences (TLS), branch point sequences (BPS), polypyrimidine sequences (Py), 5' splice sites (5' ss), 3' splice sites (3' ss), and cis-regulatory elements such as intronic splice silencer (ISS) sequences, intronic splice enhancer (ISE) sequences, exonic splice enhancer (ESE) sequences, exonic splice silencer (ESS) sequences, and sequences comprising exon/intron junctions.
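The "proximal" definition above can be pictured as a simple distance test between two stretches of a transcript. The following is a minimal sketch under stated assumptions: positions are 1-based inclusive transcript coordinates, the distance is taken as the number of nucleotides lying between the AC binding site and the splice regulatory element (zero if they overlap or abut), and the qualifier "about" is not modeled. All function names and coordinates are hypothetical.

```python
def intervening_nucleotides(ac_start: int, ac_end: int,
                            sre_start: int, sre_end: int) -> int:
    """Nucleotides lying between two inclusive intervals (0 if they overlap or abut)."""
    if ac_end < sre_start:
        return sre_start - ac_end - 1
    if sre_end < ac_start:
        return ac_start - sre_end - 1
    return 0


def is_proximal(ac_start: int, ac_end: int,
                sre_start: int, sre_end: int,
                max_distance: int = 25) -> bool:
    """True if the AC binding site lies within max_distance nucleotides of the element."""
    distance = intervening_nucleotides(ac_start, ac_end, sre_start, sre_end)
    return distance <= max_distance


# Hypothetical example: a 25-mer bound at positions 100-124 and a
# splice regulatory element at positions 130-135 (5 nucleotides in between).
print(intervening_nucleotides(100, 124, 130, 135))
print(is_proximal(100, 124, 130, 135, max_distance=5))
```

Other distance conventions (for example, measuring from the center of the binding site) are equally plausible; the text does not specify one.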
As used herein, the term "splicing" refers to the post-transcriptional modification of pre-mRNA in which introns are removed and exons are joined. Splicing occurs in a series of reactions catalyzed by a large RNA-protein complex, called the spliceosome, that contains five small nuclear ribonucleoproteins (snRNPs). Splice regulatory elements include 3' splice sites, 5' splice sites and branch sites. The 5' splice site is bound by U1 snRNP, followed by U6 snRNP. The RNA binding protein SF1 binds to the branch point sequence but is then replaced by U2 snRNP (see, e.g., Ward and Cooper (2011) "The pathobiology of splicing," J. Pathol. 220(2):152-163).
As used herein, "splice site" refers to the junction between an exon and an intron in a pre-mRNA molecule. A "cryptic splice site" is a splice site that is not normally used, but that can be used when the normal splice site is blocked or unavailable, or when a mutation causes a normally inactive site to become an active splice site. An "aberrant splice site" is a splice site that results from a mutation in the native DNA and mRNA. An antisense compound that "targets a splice site" refers to a compound that hybridizes to at least a portion of a target nucleotide sequence that comprises the splice site, or to an intron or exon near the splice site, thereby modulating splicing of the mRNA. The targeted splice site may be a normal splice site, a cryptic splice site, or an aberrant splice site.
As used herein, a "splice donor site" is used interchangeably with the term "5' splice site" and refers to the nucleotide sequence immediately surrounding the exon-intron boundary at the 5' end of an intron. The term "splice acceptor site" is used interchangeably with the term "3' splice site" and refers to the nucleotide sequence immediately surrounding the intron-exon boundary at the 3' end of an intron. A number of splice donor and acceptor sites have been characterized (see, e.g., Ohshima et al. (1987) "Signals for the selection of a splice site in pre-mRNA: computer analysis of splice junction sequences and like sequences," J. Mol. Biol., 195:247-259).
As used herein, the term "oligonucleotide" refers to an oligomeric compound comprising a plurality of linked nucleotides or nucleosides. One or more nucleotides of the oligonucleotide may be modified. The oligonucleotides may include ribonucleic acid (RNA) or deoxyribonucleic acid (DNA). Oligonucleotides may consist of natural and/or modified nucleobases, sugars, and covalent internucleoside linkages, and may further include non-nucleic acid conjugates.
As used herein, the term "nucleoside" refers to a glycosylamine comprising a nucleobase and a sugar. Nucleosides include, but are not limited to, natural nucleosides, abasic nucleosides, modified nucleosides, and nucleosides having mimetic bases and/or sugar groups. A "natural nucleoside" or "unmodified nucleoside" is a nucleoside comprising a natural nucleobase and a natural sugar. Natural nucleosides include RNA and DNA nucleosides.
As used herein, the term "natural sugar" refers to a sugar of a nucleoside that has not been modified in the form in which it naturally occurs in RNA (2 '-OH) or DNA (2' -H).
As used herein, the term "nucleotide" refers to a nucleoside having a phosphate group covalently linked to a sugar. The nucleotide may be modified with any of a variety of substituents.
As used herein, the term "nucleobase" refers to the base portion of a nucleoside or nucleotide. A nucleobase may comprise any atom or group of atoms capable of hydrogen bonding with a base of another nucleic acid. A natural nucleobase is a nucleobase that is unmodified from its naturally occurring form in RNA or DNA.
As used herein, the term "heterocyclic base moiety" refers to a nucleobase comprising a heterocycle.
As used herein, "internucleoside linkage" refers to a covalent linkage between adjacent nucleosides.
As used herein, "natural internucleoside linkage" refers to a 3' to 5' phosphodiester linkage.
As used herein, the term "modified internucleoside linkage" refers to any linkage between nucleosides or nucleotides other than naturally occurring internucleoside linkages.
As used herein, the term "chimeric antisense compound" refers to an antisense compound having at least one sugar, nucleobase, and/or internucleoside linkage that is modified differently from the other sugars, nucleobases, and internucleoside linkages within the same oligomeric compound. The remainder of the sugars, nucleobases and internucleoside linkages may be independently modified or unmodified. Generally, a chimeric oligomeric compound will have modified nucleosides that can be located at isolated positions or grouped together in regions that define a particular motif. Any combination of modifications and/or mimetic groups can make up a chimeric oligomeric compound as described herein.
As used herein, the term "mixed backbone antisense oligonucleotide" refers to an antisense oligonucleotide in which at least one internucleoside linkage of the antisense oligonucleotide is different from at least one other internucleoside linkage of the antisense oligonucleotide.
As used herein, the term "nucleobase complementarity" refers to the ability of a nucleobase to base pair with another nucleobase. For example, in DNA, adenine (A) is complementary to thymine (T). In RNA, adenine (A) is complementary to uracil (U). In embodiments, complementary nucleobases refer to nucleobases of an antisense compound that are capable of base pairing with nucleobases of its target nucleic acid. For example, if a nucleobase at a particular position of an antisense compound is capable of hydrogen bonding with a nucleobase at a particular position of a target nucleic acid, then the position of hydrogen bonding between the oligonucleotide and the target nucleic acid is considered complementary at that nucleobase pair.
As used herein, the term "non-complementary nucleobases" refers to a pair of nucleobases that do not form hydrogen bonds with each other or otherwise support hybridization.
As used herein, the term "complementary" refers to the ability of an oligomeric compound to hybridize to another oligomeric compound or nucleic acid by nucleobase complementarity. In embodiments, an antisense compound is complementary to its target when a sufficient number of corresponding positions in each molecule are occupied by nucleobases that can be bonded to each other to allow stable association between the antisense compound and the target. Those skilled in the art recognize that it is possible to include mismatches without eliminating the ability of the oligomeric compounds to remain associated. Thus, antisense compounds described herein can comprise up to about 20% mismatched nucleotides (i.e., nucleobases that are not complementary to the corresponding nucleotides of the target). In embodiments, the antisense compounds contain no more than about 15%, such as no more than about 10%, such as no more than 5% mismatch or no mismatch. The remaining nucleotides are nucleobase complementary or otherwise do not disrupt hybridization (e.g., universal bases). One of ordinary skill in the art will recognize that the compounds provided herein are at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% nucleobases complementary to a target nucleic acid.
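The mismatch thresholds in the preceding paragraph can be illustrated with a short calculation. The sketch below is an assumption-laden simplification: it considers only Watson-Crick pairs (A-T, A-U, G-C), requires the antisense compound and the target stretch to have the same length, and ignores universal bases, bulges and loops. The function names and sequences are hypothetical and are not sequences from this disclosure.

```python
# Watson-Crick pairs only (an assumption; wobble pairs and universal bases are not modeled).
WATSON_CRICK = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(antisense_5to3: str, target_5to3: str) -> float:
    """Percentage of antisense nucleobases that pair with the target.

    Both sequences are written 5' to 3'; the antisense strand is reversed
    so that it is compared with the target in antiparallel orientation.
    """
    antisense = antisense_5to3.upper()
    target = target_5to3.upper()
    if not antisense or len(antisense) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum((a, t) in WATSON_CRICK for a, t in zip(reversed(antisense), target))
    return 100.0 * paired / len(antisense)


def within_mismatch_limit(antisense_5to3: str, target_5to3: str,
                          max_mismatch_percent: float = 20.0) -> bool:
    """True if the mismatch percentage does not exceed the stated limit."""
    mismatch = 100.0 - percent_complementarity(antisense_5to3, target_5to3)
    return mismatch <= max_mismatch_percent


target = "AUGGCCAUUG"  # hypothetical 10-nucleotide RNA target
print(percent_complementarity("CAATGGCCAT", target))  # fully complementary
print(percent_complementarity("GAATGGCCAT", target))  # one mismatch in ten
print(within_mismatch_limit("GTTTGGCCAT", target))    # three mismatches in ten
```

In this toy case one mismatch in ten positions corresponds to 90 percent complementarity, which falls within the 20 percent mismatch limit, whereas three mismatches in ten do not.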
As used herein, "hybridization" means pairing of complementary oligomeric compounds (e.g., antisense compounds and their target nucleic acids). Although not limited to a particular mechanism, the most common pairing mechanism involves hydrogen bonding between complementary nucleoside or nucleotide bases (nucleobases), which may be Watson-Crick, Hoogsteen or reverse Hoogsteen hydrogen bonding. For example, the natural base adenine is complementary to the natural nucleobases thymine and uracil, with which it pairs by forming hydrogen bonds. The natural base guanine is complementary to the natural bases cytosine and 5-methylcytosine. Hybridization may occur under different conditions.
As used herein, the term "specific hybridization" refers to the ability of an oligomeric compound to hybridize with one nucleic acid site with greater affinity than it hybridizes with another nucleic acid site. In embodiments, the antisense oligonucleotide specifically hybridizes to more than one target site. In embodiments, the oligomeric compounds specifically hybridize to their targets under stringent hybridization conditions.
In the context of nucleic acid hybridization, "stringent hybridization conditions" and "stringent hybridization wash conditions" are sequence-dependent and differ under different environmental parameters. Extensive guidance for nucleic acid hybridization can be found in Tijssen, Laboratory Techniques in Biochemistry and Molecular Biology - Hybridization with Nucleic Acid Probes, chapter 2, "Overview of principles of hybridization and the strategy of nucleic acid probe assays," Elsevier, New York (1993). Generally, highly stringent hybridization and wash conditions are selected to be about 5 ℃ below the thermal melting point (Tm) for a particular sequence at a defined ionic strength and pH. Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Very stringent conditions are selected to be equal to the Tm of the particular probe. An example of stringent hybridization conditions for hybridization of complementary nucleotide sequences having more than 100 complementary residues on a filter in a Southern or Northern blot is 50% formamide with 1 mg heparin at 42 ℃, wherein hybridization is performed overnight. An example of highly stringent wash conditions is 0.15 M NaCl at 72 ℃ for about 15 minutes.
An example of stringent wash conditions is a 0.2x SSC wash at 65 ℃ for 15 minutes (for a description of SSC buffer, see Sambrook and Russell, Molecular Cloning: A Laboratory Manual, 3rd ed., Cold Spring Harbor Laboratory Press, 2001). Typically, a high stringency wash is preceded by a low stringency wash to remove background probe signal. An example of a medium stringency wash for a duplex of, e.g., more than 100 nucleotides is a 1x SSC wash at 45 ℃ for 15 minutes. An example of a low stringency wash for a duplex of, e.g., more than 100 nucleotides is a 4-6x SSC wash at 40 ℃ for 15 minutes. Stringent conditions typically involve salt concentrations of less than about 1.0 M Na ion, typically about 0.01 M to 1.0 M Na ion concentration (or other salts), at pH 7.0 to 8.3, and temperatures that are typically at least about 30 ℃. Stringent conditions can also be achieved by the addition of destabilizing agents such as formamide.
As used herein, the term "2'-modified" or "2'-substituted" refers to a sugar that contains a substituent other than H or OH at the 2' position. 2'-modified monomers include, but are not limited to, BNA and monomers (e.g., nucleosides and nucleotides) having 2'-substituents such as allyl, amino, azido, thio, O-allyl, O-C1-C10 alkyl, -OCF3, O-(CH2)2-O-CH3, 2'-O(CH2)2SCH3, O-(CH2)2-O-N(Rm)(Rn) or O-CH2-C(=O)-N(Rm)(Rn), wherein each Rm and Rn is independently H or substituted or unsubstituted C1-C10 alkyl.
As used herein, the term "MOE" refers to a 2' -O-methoxyethyl substituent.
As used herein, the term "high affinity modified nucleotide" refers to a nucleotide having at least one modified nucleobase, internucleoside linkage, or sugar moiety such that the modification increases the affinity of an antisense compound comprising the modified nucleotide for a target nucleic acid. High affinity modifications include, but are not limited to, BNA, LNA, and 2' -MOE.
As used herein, the term "mimetic" refers to a group that replaces a sugar, nucleobase, and/or internucleoside linkage in an AC. Typically, a mimetic is used in place of a sugar or sugar-internucleoside linkage combination, and the nucleobase is maintained for hybridization to the selected target. Representative examples of sugar mimetics include, but are not limited to, cyclohexenyl or morpholino. Representative examples of mimetics of sugar-internucleoside linkage combinations include, but are not limited to, peptide nucleic acids (PNAs) and morpholino groups linked by uncharged achiral linkages. In some cases, a mimetic is used in place of a nucleobase. Representative nucleobase mimetics are well known in the art and include, but are not limited to, tricyclic phenoxazine analogs and universal bases (Berger et al., Nuc. Acid Res. 2000, 28:2911-14, incorporated herein by reference). Methods of synthesis of sugar, nucleoside and nucleobase mimetics are well known to those skilled in the art.
As used herein, the term "bicyclic nucleoside" or "BNA" refers to a nucleoside in which the furanose portion of the nucleoside comprises a bridge connecting two atoms on the furanose ring, thereby forming a bicyclic ring system. BNA includes, but is not limited to, α-L-LNA, β-D-LNA, ENA, oxyBNA (2'-O-N(CH3)-CH2-4') and aminooxyBNA (2'-N(CH3)-O-CH2-4').
As used herein, the term "4' to 2' bicyclic nucleoside" refers to a BNA in which the bridge connecting two atoms of the furanose ring bridges the 4' carbon atom and the 2' carbon atom of the furanose ring, thereby forming a bicyclic ring system.
As used herein, "locked nucleic acid" or "LNA" refers to a nucleotide modified such that the 2'-hydroxyl group of the ribosyl sugar ring is linked to the 4' carbon atom of the sugar ring via a methylene group, thereby forming a 2'-C,4'-C-oxymethylene linkage. LNAs include, but are not limited to, α-L-LNA and β-D-LNA.
As used herein, the term "cap structure" or "terminal cap portion" refers to a chemical modification that has been incorporated at either end of an AC.
All publications, patents and patent applications mentioned in this specification are indicative of the level of skill of those skilled in the art to which this invention pertains. All publications, patents, and patent applications are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.
Examples
Example 1 construction of cell penetrating peptide-antisense compound conjugates
Antisense compounds (ACs) of any one of SEQ ID NOs: 155 to 333 and/or 340, designed to bind to and block expression of IRF-5 and/or GYS1, were constructed as phosphorodiamidate morpholino oligomers (PMOs) with a C6-thiol 5' modification. Antisense compounds of any one of SEQ ID NOs: 344 to 364, designed to produce nontoxic isoforms of DUX4, were constructed as phosphorodiamidate morpholino oligomers (PMOs) with a C6-thiol 5' modification.
EEVs comprising cCPPs are prepared. Cell penetrating peptides are synthesized using Fmoc chemistry and conjugated to the AC, for example, as described in International Application No. PCT/US20/66459, filed by Entrada Therapeutics, Inc. on December 21, 2020, the disclosure of which is hereby incorporated herein in its entirety. In an embodiment, the cCPP comprises the amino acid sequence FfΦRrRrQ (SEQ ID NO: 78). In an embodiment, the EEV comprises an exocyclic peptide having the sequence KKKRKV (SEQ ID NO: 33). In an embodiment, the EEV comprises KKKRKV-PEG2-K-(cyclo(FfΦRrRrQ))-PEG12-K(N3) (SEQ ID NO: 33-PEG2-K-(cyclo(SEQ ID NO: 78))-PEG12-K(N3)). In embodiments, the AC is conjugated to the EEV using click chemistry. In an embodiment, the compound comprises KKKRKV-PEG2-K-(cyclo(FfΦRrRrQ))-PEG12-K-linker-3'-AC-5' (SEQ ID NO: 33-PEG2-K-(cyclo(SEQ ID NO: 78))-PEG12-K-linker-3'-AC-5'), wherein the linker comprises the product of a strain-promoted click reaction between an azide and a cyclooctyne. The linker may also include other groups such as carbon chains, PEG chains, carbamates, ureas, and the like.
Example 2 knockdown of GYS1 expression via exon skipping
EEV-PMO is used to induce skipping of exon 6, resulting in a premature stop codon and nonsense-mediated decay of the GYS1 target transcript. The EEV used is Ac-PKKKKRKV-PEG2-K(cyclo[Ff-Nal-GrGrQ])-PEG2-K(N3)-NH2 (SEQ ID NO: 42-PEG2-K(cyclo[SEQ ID NO: 135])-PEG2-K(N3)-NH2). The PMO sequence is TCA CTG TCT GGC TCA CAT ACC CAT A (SEQ ID NO: 327). The PMO and EEV were conjugated using azide-alkyne click chemistry.
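The reasoning that links exon skipping to a premature stop codon can be sketched as follows: if the skipped exon has a length that is not a multiple of three, the downstream reading frame shifts and a stop codon may be encountered early. The sketch uses made-up toy exons, not GYS1 sequence, assumes the open reading frame starts at the first nucleotide of the first exon, and does not model nonsense-mediated decay itself. All names are hypothetical.

```python
from typing import List, Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def skipping_shifts_frame(exon_length: int) -> bool:
    """Skipping an exon shifts the reading frame unless its length is a multiple of 3."""
    return exon_length % 3 != 0


def splice(exons: List[str], skip: Optional[int] = None) -> str:
    """Join exons in order, optionally leaving out one exon (1-based number)."""
    return "".join(exon for number, exon in enumerate(exons, start=1) if number != skip)


def first_stop_codon(coding_sequence: str) -> Optional[int]:
    """0-based index of the first in-frame stop codon, reading from position 0."""
    sequence = coding_sequence.upper()
    for start in range(0, len(sequence) - 2, 3):
        if sequence[start:start + 3] in STOP_CODONS:
            return start // 3
    return None


# Toy transcript (hypothetical): exon 2 is 4 nucleotides long, so skipping it shifts the frame.
TOY_EXONS = ["ATGGCA", "GATC", "GCTTGACCGGTTAA"]

full_length = splice(TOY_EXONS)
exon2_skipped = splice(TOY_EXONS, skip=2)

print(skipping_shifts_frame(len(TOY_EXONS[1])))
print(first_stop_codon(full_length))    # stop codon at the end of the toy reading frame
print(first_stop_codon(exon2_skipped))  # earlier, premature stop codon
```

In the toy transcript the normal stop codon is the eighth codon, whereas after skipping exon 2 a TGA appears as the fourth codon, illustrating a premature stop.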
GYS1/GAA double knockout mice show a significant reduction in the amount of glycogens in heart and skeletal muscle, lysosomal swelling and autophagy accumulation when compared to GAA single knockout mice. These cell grade changes lead to cardiac hypertrophy correction, glucose metabolism normalization and muscle atrophy correction. Although GAA is absent, GYS1 elimination may play an important role, and iGAA knockout mice (GAA -/-) are injected with a single IV dose of 13.5mg/kg EEV-PMO, 27mg/kg PMO, or negative control (vehicle). GYS1 mRNA and protein levels were measured one week after injection. GYS1 levels were also assessed one, two, four and eight weeks after the IV dose of 13.5mg/kg EEV-PMO.
FIGS. 7A to 7D show significant knockdown of GYS1 expression in the diaphragm and myocardium in the EEV-PMO group, but not in the PMO-only group. This pharmacodynamic result is notable because it was obtained with a single dose administered at a very low dose level, and it suggests that GYS1 is an addressable target. In addition, the reduction in GYS1 protein and mRNA levels persisted for up to eight weeks after injection in the heart, diaphragm, quadriceps, and triceps (FIGS. 8A-8D and FIGS. 9A-9D). Protein levels are relative to total protein. mRNA levels are relative to mouse β-actin and mouse GAPDH (two housekeeping control genes).
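One common way to express a target mRNA level relative to housekeeping genes is the 2^-ΔΔCt calculation. The sketch below assumes that approach, which the text above does not specify, and uses hypothetical Ct values. Averaging the Ct values of the two reference genes is likewise an assumption.

```python
from statistics import mean


def relative_expression(ct_target, ct_refs, ct_target_ctrl, ct_refs_ctrl):
    """Relative mRNA level (treated vs. control) by the 2^-ddCt method.

    ct_refs / ct_refs_ctrl hold Ct values for the reference (housekeeping)
    genes, e.g. beta-actin and GAPDH, in the treated and control samples.
    """
    d_ct_treated = ct_target - mean(ct_refs)            # normalize treated sample
    d_ct_control = ct_target_ctrl - mean(ct_refs_ctrl)  # normalize control sample
    return 2 ** -(d_ct_treated - d_ct_control)


# Hypothetical Ct values, for illustration only (not data from the study).
level = relative_expression(26.0, [18.0, 20.0], 24.0, [18.0, 20.0])
print(f"{level:.2f}")  # 0.25, i.e. 25% of the control level
```

In this made-up example the target amplifies two cycles later in the treated sample than in the control at equal reference levels, which corresponds to a four-fold lower transcript level.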
Example 3: Knockdown of IRF5 expression via exon skipping
Skipping of exon 4 was induced using four EEV-PMO conjugates to introduce premature stop codons, resulting in nonsense-mediated decay of IRF-5 target transcripts. The PMO sequence of each of the four conjugates was 5'-AGA ACG TAA TCA TCA GTG GGT TGG C-3' (SEQ ID NO: 340). The EEVs used were Ac-PKKKRKV-miniPEG2-K(cyclo[FGFGRGRQ])-PEG12-OH (EEV #1; 1120) (Ac-SEQ ID NO: 42-miniPEG2-K(cyclo[SEQ ID NO: 82])-PEG12-OH); Ac-PKKKKRKV-miniPEG2-K(cyclo[Ff-Nal-GrGrQ])-PEG12-OH (EEV #2; 1113) (Ac-SEQ ID NO: 42-miniPEG2-K(cyclo[SEQ ID NO: 135])-PEG12-OH); Ac-PKKKRKV-miniPEG2-K(cyclo[FGFGRRRQ])-PEG12-OH (EEV #3; 1184) (Ac-SEQ ID NO: 42-miniPEG2-K(cyclo[SEQ ID NO: 84])-PEG12-OH); and Ac-PKKKRKV-miniPEG2-K(cyclo[FGFRRRRQ])-PEG12-OH (EEV #4; 1185) (Ac-SEQ ID NO: 42-miniPEG2-K(cyclo[SEQ ID NO: 85])-PEG12-OH). Each EEV was conjugated to the PMO using amide conjugation chemistry.
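The link between exon skipping and a premature stop codon can be illustrated with a toy transcript: removing an exon whose length is not a multiple of three shifts the downstream reading frame, which can bring a stop codon into frame early. The sequences below are invented for illustration and are not IRF-5 or GYS1 sequences; the function names are likewise hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def skipping_shifts_frame(exon_length_nt: int) -> bool:
    """Skipping an exon shifts the frame unless its length is a multiple of 3."""
    return exon_length_nt % 3 != 0


def first_stop_codon(cds: str):
    """Index (in codons) of the first in-frame stop codon, or None if absent."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


# Toy exons (hypothetical): the middle exon is 5 nt long, so skipping it
# shifts the reading frame of everything downstream.
exon_a, exon_b, exon_c = "ATGGCCA", "AGCTG", "CTTGACCGGTAA"
full_length = exon_a + exon_b + exon_c
skipped = exon_a + exon_c

print(skipping_shifts_frame(len(exon_b)))   # True
print(first_stop_codon(full_length))        # 7 (normal stop, last codon)
print(first_stop_codon(skipped))            # 3 (premature stop)
```

In the toy example the skipped transcript terminates after three codons instead of seven, which is the kind of premature termination that can trigger nonsense-mediated decay.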
For in vivo studies, wild-type mice were treated with two doses of EEV #1-PMO, on day 0 and day 3. Samples were collected on day 7 for qPCR measurement of mRNA levels. For in vitro studies, mouse macrophages were either treated with EEV #1-PMO alone, or pretreated with 2 μM of EEV-PMOs #1-4 for 4 hours and then stimulated overnight with R848, an imidazoquinoline compound that is a specific activator of Toll-like receptor (TLR) 7/8. Cells were harvested 24 hours after treatment and evaluated by western blotting.
After treatment with EEV #1-PMO, a significant knockdown of IRF5 levels was observed in the liver (FIG. 10A), small intestine (FIG. 10B) and tibialis anterior (FIG. 10C). In all tissues, knockdown was dose dependent. In addition, mouse macrophages treated with EEV #1-PMO at doses of 30 μM, 10 μM and 3 μM showed statistically significant reductions in IRF5 protein levels (FIG. 11A). However, in mouse macrophages that were pretreated and then stimulated with R848, EEV #2-PMO, EEV #3-PMO and EEV #4-PMO showed significantly improved relative efficacy compared to EEV #1-PMO, as measured by IRF-5 protein expression (FIG. 11B). mRNA levels are expressed relative to vehicle controls set at 100%.
Example 4: In vitro knockdown of GYS1 via exon skipping
The effectiveness of PMO 220 and PMO-EEV 220-814 in knocking down mouse GYS1 (mGYS1) levels in mouse cell lines was evaluated. The PMO was designed to induce skipping of exon 6 of mouse GYS1 to introduce premature stop codons, resulting in nonsense-mediated decay of GYS1 target transcripts. Construct 220-814 is PMO 220 (5'-TCACTGTCTGGCTCACATACCCATA-3'; SEQ ID NO: 327) conjugated to EEV 814, which has the sequence Ac-PKKKKRKV-PEG2-K(cyclo[FfΦCit-r-Cit-rQ])-PEG12-K(N3)-NH2 (Ac-SEQ ID NO: 42-PEG2-K(cyclo[SEQ ID NO: 79])-PEG12-K(N3)-NH2). The PMO was conjugated to the EEV using click chemistry.
The wild-type mouse myoblast line C2C12 and the wild-type mouse fibroblast line 3T3 were treated with different concentrations of PMO 220 and PMO-EEV 220-814 by Endoporter transfection (6 μL/mL; 6 μM). Two days after treatment, GYS1 mRNA levels in the cell lines were evaluated.
PMO 220 and PMO-EEV 220-814 both decreased GYS1 mRNA levels in the C2C12 myoblast line (FIGS. 12 and 13A). PMO 220 also decreased GYS1 mRNA levels in the 3T3 fibroblast line (FIG. 13B).
Example 5: In vivo evaluation of GYS1-targeting PMO-EEV constructs
The in vivo effectiveness of the PMO-EEV construct (PMO-EEV 220-814) in modulating GYS1 levels and the Pompe phenotype was determined using a mouse model of Pompe disease. PMO-EEV 220-814 is PMO 220 (5'-TCACTGTCTGGCTCACATACCCATA-3'; SEQ ID NO: 327) conjugated to EEV 814 (Ac-PKKKKRKV-PEG2-K(cyclo[FfΦCit-r-Cit-rQ])-PEG12-K(N3)-NH2) (Ac-SEQ ID NO: 42-PEG2-K(cyclo[SEQ ID NO: 79])-PEG12-K(N3)-NH2).
Wild-type B6-129 mice and acid alpha-glucosidase (GAA) knockout mice (GAA-/-) were treated with a single dose of 27 mpk PMO 220, 20 mpk PMO-EEV 220-814 or 40 mpk PMO-EEV 220-814 via intravenous injection. One week after treatment, mice were sacrificed and tissues were collected for western blot and RT-qPCR analysis. The PMO was conjugated to the EEV using click chemistry.
Effective knockdown of mouse GYS1 mRNA was observed in cardiac and skeletal muscle after treatment with PMO-EEV 220-814, but not after treatment with PMO 220 alone (FIGS. 14A-14D).
GYS2 is the predominant GYS protein in the liver. Treatment with PMO 220 or PMO-EEV 220-814 did not affect GYS2 mRNA levels at dose levels below 30 mpk (FIG. 15), indicating selective GYS1 target engagement. The group treated with 40 mpk 220-814 showed higher GYS2 mRNA levels in the liver. This may be due to a feedback loop between GYS1 and GYS2, in which down-regulation of GYS1 results in up-regulation of GYS2.
GYS1 levels in quadriceps and triceps were measured using a GYS antibody that is not specific for GYS1 (FIGS. 25A-25B). After treatment with the EEV-PMO construct, a decrease in GYS1 protein in quadriceps was observed (FIG. 25A). Because of inconsistent gel loading, no conclusion could be drawn about the level of GYS1 protein in the triceps (FIG. 25B).
GYS1-specific antibodies were also used to measure GYS1 protein levels in the diaphragm, heart and triceps (FIG. 26). After treatment with the EEV-PMO construct, a clear trend toward decreased GYS1 in the diaphragm and heart was observed. There was no significant decrease in GYS1 levels in the triceps. Notably, protein levels fluctuated widely in untreated animals, and no GYS1 signal was observed in the liver using GYS1-specific antibodies, possibly due to low GYS1 expression in the liver.
Example 6: In vivo evaluation of a second PMO-EEV construct targeting GYS1
The in vivo effectiveness of PMO-EEV construct 220-1055 was determined using a mouse model of Pompe disease. Construct 220-1055 is PMO 220 (5'-TCACTGTCTGGCTCACATACCCATA-3'; SEQ ID NO: 327) conjugated to EEV 1055 (Ac-PKKKRKV-K(cyclo[FfΦGrGrQ])-PEG2-K(N3)-NH2) (Ac-SEQ ID NO: 42-K(SEQ ID NO: 77)-PEG2-K(N3)-NH2). The PMO was conjugated to the EEV using click chemistry.
The same knockout mouse model as in Example 5 was used. Mice were treated with a single dose of 15 mpk PMO 220, 10 mpk PMO-EEV 220-1055 or 20 mpk PMO-EEV 220-1055 via intravenous injection. Mice were sacrificed 1 week, 2 weeks, 4 weeks, or 8 weeks after treatment. Tissues were harvested for western blot, RT-qPCR and glycogen storage analysis. Mice sacrificed at the 2-week and 8-week time points were fasted overnight before sacrifice.
A decrease in GYS1 mRNA levels was observed in the heart 1 week, 2 weeks and 4 weeks after treatment with PMO-EEV 220-1055 (FIG. 16A). A slight decrease in GYS1 mRNA levels was observed in the heart 8 weeks after treatment with PMO-EEV 220-1055 (FIG. 16A). A decrease in GYS1 mRNA levels in the diaphragm (FIG. 16B), quadriceps (FIG. 16C) and triceps (FIG. 16D) was observed 1 week, 2 weeks, 4 weeks and 8 weeks after treatment with PMO-EEV 220-1055.
Similar to RNA levels, the heart (FIG. 17A), diaphragm (FIG. 17B), triceps (FIG. 17C) and quadriceps (FIG. 17D) all showed a large decrease in GYS1 protein levels at 2 and 4 weeks after PMO-EEV 220-1055 treatment. The decrease in GYS1 appeared to strengthen over time until 4 weeks after treatment and to weaken thereafter.
Drug exposure declined in the heart (FIG. 18A), diaphragm (FIG. 18B), triceps (FIG. 18C) and quadriceps (FIG. 18D) from 1 to 8 weeks after treatment with PMO-EEV 220-1055. However, drug was still detectable 8 weeks after injection in all muscle tissues analyzed.
Glycogen storage levels in selected tissues were determined using the Amplex Red Glucose/Glucose Oxidase Assay Kit (available from Thermo Fisher, Waltham, MA). Glycogen levels were determined by subtracting the glucose level of an undigested sample from the glucose level of the same sample after digestion with alpha-amyloglucosidase.
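The subtraction described above can be sketched as follows. The function name, the units and the numeric values are illustrative assumptions; the only step taken from the text is that glycogen is reported as glucose measured after amyloglucosidase digestion minus the free glucose measured in the same sample without digestion.

```python
def glycogen_as_glucose(digested_glucose: float, free_glucose: float) -> float:
    """Glycogen-derived glucose: digested-sample glucose minus free glucose.

    Both inputs must be in the same units (for example, nmol glucose per mg
    tissue). Digestion can only add glucose, so a lower digested reading
    points to a measurement problem and is rejected.
    """
    if digested_glucose < free_glucose:
        raise ValueError("digested reading is below the undigested reading")
    return digested_glucose - free_glucose


# Hypothetical readings, for illustration only (not data from the study).
print(glycogen_as_glucose(digested_glucose=12.5, free_glucose=2.0))  # 10.5
```

Running the undigested control alongside each digested sample matters because tissue free glucose can vary between animals, for example with fasting state.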
A slight but not significant decrease in glycogen was observed in heart, diaphragm, triceps and quadriceps at 1 week, 2 weeks and 4 weeks after treatment with PMO-EEV 220-1055. Similar observations were made 8 weeks after treatment. The lack of a reduction in glycogen levels may be due to incomplete development of the glycogen storage phenotype at the time the mice were treated. For example, comparison of wild-type mice and GAA knockout mice showed that glycogen storage levels in the heart, diaphragm, quadriceps, and triceps increased with age. This finding is consistent with the literature (Raben, N. et al., Journal of Biological Chemistry (1998), 273(30), p. 19086). If the glycogen phenotype was still developing when the mice were treated, the mice may not have accurately reflected the Pompe model at the time of testing.
Example 7: In vivo evaluation of a third PMO-EEV construct targeting GYS1
The in vivo effectiveness of PMO-EEV construct 220-1120 was determined using a mouse model of Pompe disease. Construct 220-1120 is PMO 220 (5'-TCACTGTCTGGCTCACATACCCATA-3'; SEQ ID NO: 327) conjugated to EEV 1120 (Ac-PKKKRKV-AEEA-Lys(cyclo[FGFGRGRQ]-PEG12-OH)) (Ac-SEQ ID NO: 42-AEEA-Lys(cyclo[SEQ ID NO: 82]-PEG12-OH)). The PMO was conjugated to the EEV using amide conjugation chemistry.
The same mouse model as in Example 5 was used. Wild-type mice and GAA knockout mice were treated with a single dose of 40 mpk PMO 220, 5 mpk PMO-EEV 220-1120, 10 mpk PMO-EEV 220-1120, 20 mpk PMO-EEV 220-1120, or 40 mpk PMO-EEV 220-1120 via intravenous injection. Mice were sacrificed 2 weeks after treatment. Tissues were harvested for western blot, RT-qPCR and glycogen storage analysis. Mice were fasted overnight prior to sacrifice.
Dose-dependent knockdown of GYS1 mRNA (FIG. 19) and protein levels (FIG. 20) was observed in triceps (C) and quadriceps (D) in mice treated with PMO-EEV 220-1120. Only moderate knockdown of GYS1 mRNA and protein levels was observed in heart (A) and diaphragm (B), probably due to the limited number of mice tested.
Similar studies were performed in which mice received multiple doses of PMO-EEV 220-1120. Briefly, GAA knockout mice were given 7 mpk PMO 220 or 10 mpk PMO-EEV 220-1120 via intravenous injection every two weeks for 8 weeks (total dose of PMO = 35 mpk; total dose of PMO-EEV = 50 mpk). Mice were sacrificed ten weeks after the first treatment. Tissues were harvested for western blot, RT-qPCR and glycogen storage analysis. The mice were assessed for grip strength, forelimb suspension time and heart function (via echocardiography) before the first treatment, after the third treatment and after the last treatment.
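The stated cumulative doses are consistent with five injections per animal. The sketch below reproduces that arithmetic under the assumption, inferred from the totals rather than stated in the text, that injections were given at weeks 0, 2, 4, 6 and 8. The function names are illustrative.

```python
def dosing_days(interval_days: int, duration_days: int) -> list[int]:
    """Dosing days from day 0 through the final day of the dosing window."""
    return list(range(0, duration_days + 1, interval_days))


def cumulative_dose(dose_mpk: float, n_doses: int) -> float:
    """Total dose delivered over n_doses equal injections, in mpk."""
    return dose_mpk * n_doses


# Assumed schedule: every 14 days over 56 days, first and last day included.
schedule = dosing_days(interval_days=14, duration_days=56)
pmo_total = cumulative_dose(7, len(schedule))
conjugate_total = cumulative_dose(10, len(schedule))
print(schedule, pmo_total, conjugate_total)
```

Under this assumed schedule the totals come to 35 mpk for PMO 220 and 50 mpk for PMO-EEV 220-1120, matching the figures given in the text.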
Robust GYS1 mRNA knockdown was observed in both the myocardium and skeletal muscle (fig. 21A-21C). mRNA levels of GYS1 and GYS2 were not affected in the liver (fig. 22A-22B).
Robust knockdown of glycogen synthase activity was observed in the heart and quadriceps. However, repeated doses of EEV-PMO 220-1120 did not reduce tissue glycogen storage in cardiac and skeletal muscle tissue. In addition, no significant change in grip strength or forelimb suspension time was observed in the treated mice compared to the untreated mice.
Example 8: IRF-5 ablation in a two-dose EEV-PMO mouse study
A two-dose mouse study was used to evaluate the effectiveness of PMO-EEV 278-1120. On day zero and day three, mice were given either 40 milligrams per kilogram (mpk) or 20 mpk of PMO 278 (AGA ACG TAA TCA TCA GTG GGT TGG C; SEQ ID NO: 340) or of EEV-PMO compound 278-1120. PMO-EEV 278-1120 comprises PMO 278 conjugated to EEV 1120 (Ac-PKKKRKV-PEG2-K(cyclo[FGFGRGRQ])-PEG12-OH) (Ac-SEQ ID NO: 42-PEG2-K(cyclo[SEQ ID NO: 82])-PEG12-OH). Five days after the second dose, mice were sacrificed and blood and tissues were collected.
FIGS. 23A-23C show IRF-5 expression levels in various tissues after treatment. Knockdown of IRF-5 expression was observed in mouse tibialis anterior and liver tissues, but not in small intestine tissue.
Example 9: IRF-5 ablation in a single-dose EEV-PMO mouse study
A single-dose mouse study was used to evaluate the effectiveness of PMO-EEV 278-1120. PMO-EEV 278-1120 is PMO 278 (AGA ACG TAA TCA TCA GTG GGT TGG C; SEQ ID NO: 340) conjugated to EEV 1120 (Ac-PKKKRKV-PEG2-K(cyclo[FGFGRGRQ])-PEG12-OH) (Ac-SEQ ID NO: 42-PEG2-K(cyclo[SEQ ID NO: 82])-PEG12-OH). On day zero, mice were given 80 milligrams per kilogram (mpk) PMO 278; 40 mpk or 80 mpk PMO-EEV 278-1120 intravenously (IV); or 120 mpk PMO-EEV 278-1120 subcutaneously (SC). Seven days after dosing, mice were sacrificed and blood and tissues were collected.
FIG. 24 shows IRF-5 expression levels in liver (A), kidney (B) and tibialis anterior (C) tissues. The treatment groups are labeled as follows: A is 80 mpk PMO 278; B is 40 mpk PMO-EEV 278-1120 delivered via IV; C is 80 mpk PMO-EEV 278-1120 delivered via IV; and D is 120 mpk PMO-EEV 278-1120 delivered subcutaneously. Compound 278-1120, administered as a single dose at 40 mpk or 80 mpk, significantly reduced IRF-5 protein expression in mouse liver tissue, by 40% and 53%, respectively (A). IRF-5 levels in kidney tissue were low compared to the other tissues examined. The kidney data fluctuated, probably because of the difficulty of quantifying band intensity against background. In addition, fluctuations were observed in the tibialis anterior tissue data (data not shown) because the samples ran poorly on the gel. Overall, the data show a similar trend in kidney and liver; IRF-5 protein levels were significantly reduced after single-dose administration.
Example 10: In vitro evaluation of exon skipping by various EEV-PMOs targeting IRF-5
IRF-5 expression and exon skipping were assessed in unstimulated RAW 264.7 monocytes/macrophages after treatment with two EEV-PMO compounds, 277-1120 and 278-1120. PMO-EEV 277-1120 is the PMO sequence ACG TAA TCA TCA GTG GGT TGG CTC T (SEQ ID NO: 365) conjugated to EEV 1120 (Ac-PKKKRKV-AEEA-Lys-(cyclo[FGFGRGRQ])-PEG12-OH) (Ac-SEQ ID NO: 42-AEEA-Lys-cyclo(SEQ ID NO: 82)-PEG12-OH) by amide conjugation chemistry. PMO-EEV 278-1120 is the PMO sequence AGA ACG TAA TCA TCA GTG GGT TGG C (SEQ ID NO: 340) conjugated to EEV 1120 (Ac-PKKKRKV-AEEA-Lys-(cyclo[FGFGRGRQ])-PEG12-OH) (Ac-SEQ ID NO: 42-AEEA-Lys-cyclo(SEQ ID NO: 82)-PEG12-OH) by amide conjugation chemistry. Briefly, 150K cells/well were seeded in 0.5 mL DMEM in a 24-well plate. After 4 hours, EEV-PMO compound was added to the cells to give a total volume of 500 μL. The cells were then incubated for 24 hours. After incubation, the cell culture medium was collected for detection of the cytokines IL-6 and TNF-α. RNA was extracted and used for IRF-5 transcript quantification. Changes in IRF-5 protein levels were measured using protein lysates. IRF-5 expression levels were determined relative to β-tubulin.
For exon skipping studies, cells were treated as described above. After incubation with EEV-PMO compounds, cells were washed with fresh medium and then incubated overnight. After the second incubation, RNA was harvested and RT-PCR was performed using primers that detect exon 5 skipping in the IRF-5 gene.
Compounds 277-1120 and 278-1120 each showed target engagement in RAW 264.7 mouse macrophages/monocytes and significantly reduced IRF-5 protein levels in a dose-dependent manner (FIG. 27A). Compound 277-1120 significantly depleted IRF-5 protein, by about 80% at 30 μM and about 50% at 10 μM, with no significant change observed at the lower dose of 3.3 μM. Compound 278-1120 had a stronger effect on IRF-5 depletion than 277-1120, reducing IRF-5 protein levels by about 80% at 30 μM and about 65% at 10 μM. Compound 278-1120 still depleted about 40% of IRF-5 protein at the lower dose of 3.3 μM.
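A percent depletion of this kind is typically derived from western blot band intensities by first normalizing the target band to a loading control (β-tubulin in this example) and then comparing treated to untreated cells. The sketch below illustrates that calculation with hypothetical densitometry values; the function name and the numbers are assumptions, not data from the study.

```python
def percent_reduction(target_treated: float, loading_treated: float,
                      target_control: float, loading_control: float) -> float:
    """Percent reduction of a loading-normalized band versus the control."""
    treated = target_treated / loading_treated   # normalize treated lane
    control = target_control / loading_control   # normalize control lane
    return 100.0 * (1.0 - treated / control)


# Hypothetical band intensities (arbitrary densitometry units).
reduction = percent_reduction(target_treated=200.0, loading_treated=1000.0,
                              target_control=1000.0, loading_control=1000.0)
print(round(reduction, 1))
```

With these made-up values the normalized target signal falls to one fifth of the control, corresponding to a reduction of about 80%.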
EEV-PMO compound 278-1120 induced partial exon skipping as early as 30 minutes after exposure, and its efficacy increased with longer exposure time (FIG. 27B).
A similar experiment was performed with additional EEV-PMO compounds, in which the PMO is PMO 278 (AGA ACG TAA TCA TCA GTG GGT TGG C; SEQ ID NO: 340). PMO 278 was conjugated using amide conjugation chemistry to each of the following EEVs: Ac-PKKKRKV-PEG2-K(cyclo[FGFGRGRQ])-PEG12-OH (EEV #1; 1120; Ac-SEQ ID NO: 42-PEG2-K(cyclo[SEQ ID NO: 82])-PEG12-OH); Ac-PKKKRKV-PEG2-K(cyclo[Ff-Nal-GrGrQ])-PEG12-OH (EEV #2; 1113; Ac-SEQ ID NO: 42-PEG2-K(cyclo[SEQ ID NO: 135])-PEG12-OH); Ac-PKKKRKV-PEG2-K(cyclo[FGFGRRRQ])-PEG12-OH (EEV #3; 1184; Ac-SEQ ID NO: 42-PEG2-K(cyclo[SEQ ID NO: 84])-PEG12-OH); and Ac-PKKKRKV-PEG2-K(cyclo[FGFRRRRQ])-PEG12-OH (EEV #4; 1185; Ac-SEQ ID NO: 42-PEG2-K(cyclo[SEQ ID NO: 85])-PEG12-OH).
Methods similar to those described above were used, except that cells were pretreated with EEV-PMO compound and then stimulated overnight with R848. R848 is a Toll-like receptor agonist and induces IRF-5 expression. The total treatment time was 24 hours.
R848 significantly increased IRF-5 protein expression in RAW 264.7 cells. All EEV-PMO treated samples at all concentrations tested showed a significant decrease in IRF-5 protein expression when compared to R848 stimulated cells (FIG. 28A). EEV-PMO compounds 278-1113, 278-1184 and 278-1185 were 5-fold more potent than 278-1120, with IRF-5 protein decreasing by about 80% at concentrations as low as 2 μM when compared to IRF-5 levels in cells stimulated with R848.
EEV-PMO compounds 278-1113, 278-1184 and 278-1185 showed higher exon skipping than 278-1120 at 5 μM (FIG. 12B). No significant difference in exon skipping was observed between 278-1113, 278-1184 and 278-1185.
Example 11: Evaluation of various EEV-PMO compounds in human THP1 cells
Human THP1 cells were used to evaluate IRF-5 expression and exon skipping after treatment with various PMO compounds and various EEV-PMO compounds. The PMO compounds tested included 344 (TTGGCAACATCCTCTGCAGCTGAAG; SEQ ID NO: 366, Hs-IRF-5-E4N6); 345 (GCAACATCCTCTGCAGCTG; SEQ ID NO: 367, Hs-IRF-5-E4N3); 346 (TCAGGCTTGGCAACATCCTCTGCAG; SEQ ID NO: 368, Hs-IRF-5-E5P0); IRF5-E4N3 (TAATCATCAGTGGGTTGGCTCTCTG; SEQ ID NO: 369); 278 (AGA ACG TAA TCA TCA GTG GGT TGG C; SEQ ID NO: 340, Hs-IRF-5-E4P3); and 277 (ACG TAA TCA TCA GTG GGT TGG CTC T; SEQ ID NO: 365, Hs-IRF-5-E4PO). For the EEV-PMO compounds, PMOs 344, 345 and 346 were each conjugated to EEV 1120 (Ac-PKKKRKV-AEEA-Lys-(cyclo[FGFGRGRQ])-PEG12-OH) (Ac-SEQ ID NO: 42-AEEA-Lys-cyclo(SEQ ID NO: 82)-PEG12-OH) via amide conjugation chemistry.
Briefly, for PMO-only studies, PMO compounds were transfected into THP1 cells by nucleofection. After nucleofection, cells were plated in culture medium containing PMA and incubated for 24 hours prior to harvest. RNA was harvested and RT-PCR was performed using primers that detect exon 4 and exon 5 skipping in the IRF-5 gene.
Briefly, for EEV-PMO studies, THP1 cells were differentiated overnight with PMA. Cells were then treated with various EEV-PMO conjugates and incubated for 24 hours prior to harvest. RNA was harvested and RT-PCR was performed using primers that detect exon 5 skipping in the IRF-5 gene.
FIG. 29A shows the levels of exon 4 and exon 5 skipping after treatment with various PMO compounds. PMO compounds that work well in mouse cells do not necessarily work as well in human cells. For example, only low levels of exon skipping were observed for Hs-IRF-5-E4P3 (PMO 278) and Hs-IRF-5-E4PO (PMO 277). Exon skipping was observed for Hs-IRF-5-E5N6 (PMO 344), Hs-IRF-5-E5N3 (PMO 345) and Hs-IRF-5-E5P0 (PMO 346).
FIG. 29B shows exon 5 skipping levels after treatment with various PMO-EEV compounds. The results indicate that EEV-PMO conjugates can induce exon skipping and down-regulation of target genes in THP1 cells.
Various embodiments have been described herein. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the invention. Accordingly, other embodiments are within the scope of the following claims.

Claims (135)

1. A compound, the compound comprising:
A cyclic cell penetrating peptide (cCPP); and
An Antisense Compound (AC) comprising a nucleotide sequence complementary to a target nucleotide sequence of a target transcript of a target gene, wherein the AC specifically hybridizes to the target nucleotide sequence and modulates splicing of the target transcript to down-regulate expression or activity of a protein expressed by the target transcript.
2. The compound of claim 1, wherein the AC induces exon skipping.
3. The compound of claim 2, wherein the exon skipping introduces a frameshift.
4. The compound of claim 1 or 2, wherein the AC comprises a nucleotide sequence that is complementary to at least a portion of a splice element or is sufficiently close to the splice element to modulate splicing of the target transcript.
5. The compound of claim 4, wherein the splice element is one or more of a terminal stem loop sequence, a branch point sequence, a polypyrimidine sequence, a 5 'splice site, a 3' splice site, and an intronic splice silencer sequence, an intronic splice enhancer sequence, an exonic splice silencer sequence, and a sequence comprising an exon/intron junction.
6. The compound of claim 4, wherein the splice element is a 5 'splice site or a 3' splice site.
7. The compound of any one of claims 2 to 6, wherein the AC comprises a nucleotide sequence complementary to at least a portion of the splice element.
8. The compound according to any one of claims 1 to 7, wherein the target gene is involved in the pathogenesis of a disease.
9. The compound of any one of claims 1 to 8, wherein the AC has a length of about 5 to about 1000, about 5 to about 500, about 5 to about 100, about 5 to about 50, or about 5 to about 25 nucleotides.
10. The compound of any one of claims 1 to 9, wherein the AC comprises at least one modified nucleotide or nucleic acid comprising Phosphorothioate (PS) nucleotides, phosphorodiamidate morpholino nucleotides, locked Nucleic Acids (LNAs), peptide Nucleic Acids (PNAs), nucleotides comprising a backbone modified with 2' -O-methyl (2 ' -OMe), 2' -O-methoxy-ethyl (2 ' -MOE) nucleotides, 2',4' -constrained ethyl (cEt) nucleotides, 2' -deoxy-2 ' -fluoro- β -D-arabinonucleotides (2 ' f-ANA), or a combination thereof.
11. The compound of any one of claims 1 to 9, wherein the AC comprises one or more phosphorodiamidate morpholino nucleosides, 2' -O-methylated nucleosides, locked Nucleic Acids (LNAs), or combinations thereof.
12. The compound of any one of claims 1 to 11, further comprising a linker that conjugates the cCPP with the AC.
13. The compound of any one of claims 1 to 12, wherein the linker is conjugated to a chemically reactive side chain of the cCPP amino acid.
14. The compound of claim 13, wherein the chemically reactive side chain of the cCPP comprises an amine group, a carboxylic acid, an amide, a hydroxyl group, a sulfhydryl group, a guanidine group, a phenol group, a thioether group, an imidazole group, or an indole group.
15. The compound of claim 13 or 14, wherein the amino acid of the cCPP conjugated to the AC comprises lysine, arginine, aspartic acid, glutamic acid, asparagine, glutamine, homoglutamine, serine, threonine, tyrosine, cysteine, methionine, histidine, or tryptophan.
16. The compound of any one of claims 12 to 15, wherein the linker is conjugated to the 5' end or the 3' end of the AC.
17. The compound of any one of claims 12 to 16, wherein the linker comprises one or more D or L amino acids, each optionally substituted; alkylene, alkenylene, alkynylene, carbocyclyl, or heterocyclyl, each of which is optionally substituted; or -(R1-J-R2)z"-, wherein R1 and R2 are each independently selected from alkylene, alkenylene, alkynylene, carbocyclyl and heterocyclyl, each J is independently selected from NR3, -NR3C(O)-, S and O, wherein R3 is independently selected from H, alkyl, alkenyl, alkynyl, carbocyclyl and heterocyclyl, each of which is optionally substituted, and z" is an integer from 1 to 50; or a combination thereof.
18. The compound of any one of claims 1 to 17, wherein the cCPP comprises 4-12 amino acids, wherein
At least two of the amino acids are arginine,
At least two of the amino acids comprise a hydrophobic side chain,
And at least 1 amino acid is a D amino acid.
19. The compound of any one of claims 1 to 17, wherein the cCPP is of formula (A):
Or a protonated form thereof, wherein:
R1, R2 and R3 are each independently H or an aromatic or heteroaromatic side chain of an amino acid;
At least one of R1, R2 and R3 is an aromatic or heteroaromatic side chain of an amino acid;
R4, R5, R6 and R7 are independently H or an amino acid side chain;
At least one of R4, R5, R6 and R7 is a side chain of 3-guanidino-2-aminopropionic acid, 4-guanidino-2-aminobutyric acid, arginine, homoarginine, N-methylarginine, N,N-dimethylarginine, 2,3-diaminopropionic acid, 2,4-diaminobutyric acid, lysine, N-methyllysine, N,N-dimethyllysine, N-ethyllysine, N,N,N-trimethyllysine, 4-guanidinophenylalanine, citrulline, β-homoarginine, or 3-(1-piperidinyl)alanine;
AA SC is an amino acid side chain; and
q is 1, 2, 3 or 4.
20. The compound of claim 19, wherein the cCPP is of formula (I):
Or a protonated form or salt thereof,
Wherein each m is independently an integer from 0 to 3.
21. The compound of claim 19 or 20, wherein R1, R2 and R3 are independently H or a side chain comprising an aryl group.
22. The compound of any one of claims 19 to 21, wherein the side chain comprising an aryl group is a side chain of tyrosine, phenylalanine, 1-naphthylalanine, 2-naphthylalanine, tryptophan, 3-benzothienyl alanine, 4-phenylphenylalanine, 3, 4-difluorophenylalanine, 4-trifluoromethylphenylalanine, 2,3,4,5, 6-pentafluorophenylalanine, homophenylalanine, β -homophenylalanine, 4-tert-butyl-phenylalanine, 4-pyridylalanine, 3-pyridylalanine, 4-methylphenylalanine, 4-fluorophenylalanine, 4-chlorophenylalanine, or 3- (9-anthracenyl) -alanine.
23. The compound of any one of claims 19 to 22, wherein the side chain comprising an aryl group is a side chain of phenylalanine.
24. The compound of any one of claims 19 to 23, wherein two of R1, R2 and R3 are side chains of phenylalanine.
25. The compound of any one of claims 19 to 24, wherein two of R1, R2, R3 and R4 are H.
26. The compound of claim 19, wherein the cCPP is of the formula (I-1),
Or a protonated form or salt thereof.
27. The compound of claim 19, wherein the cCPP is of formula (I-2) below:
Or a protonated form or salt thereof.
28. The compound of claim 19, wherein the cCPP is of formula (I-3):
Or a protonated form or salt thereof.
29. The compound of claim 19, wherein the cCPP is of formula (I-4):
Or a protonated form or salt thereof.
30. The compound of claim 19, wherein the cCPP is of formula (I-5):
Or a protonated form or salt thereof.
31. The compound of claim 19, wherein the cCPP is of formula (I-6):
Or a protonated form or salt thereof.
32. The compound of any one of claims 1 to 17, wherein the cCPP is of formula (II):
Wherein:
AA SC is an amino acid side chain;
R1a, R1b and R1c are each independently 6 to 14 membered aryl or 6 to 14 membered heteroaryl;
R2a, R2b, R2c and R2d are independently amino acid side chains;
At least one of R 2a、R2b、R2c and R 2d is Or a protonated form or salt thereof;
At least one of R 2a、R2b、R2c and R 2d is guanidine or a protonated form or salt thereof;
Each n "is independently an integer from 0 to 5;
Each n' is independently an integer from 0 to 3; and
If n' is 0, then R2a, R2b, R2c or R2d is absent.
33. The compound of claim 32, wherein the cCPP is of formula (II-1) below:
34. The compound of claim 32 or 33, wherein R1a, R1b and R1c are each independently selected from the group consisting of phenyl, naphthyl and anthracenyl.
35. The compound of claim 32, wherein said cCPP is of formula (IIa):
36. the compound of any one of claims 32 to 35, wherein R 2a、R2b、R2c
Is guanidine or a protonated form or salt thereof.
37. The compound of any one of claims 32 to 36, wherein R 2a、R2b、R2c
Is guanidine or a protonated form or salt thereof.
38. The compound of claim 32, wherein the cyclic peptide has the following formula (IIb):
39. The cyclic peptide of any one of claims 32 to 38, wherein R 2a and R 2c are each
40. The compound of claim 32, wherein the cCPP is of formula (IIc):
Or a protonated form or salt thereof.
41. The compound of any one of claims 19 to 40, wherein AA SC is a side chain of an asparagine residue, an aspartic acid residue, a glutamic acid residue, a homoglutamic acid residue, or a homoglutamate residue.
42. The compound of any one of claims 19 to 40, wherein AA SC is a side chain of a glutamic acid residue.
43. The compound of any one of claims 19 to 40, wherein AA SC is: Wherein t is an integer from 0 to 5.
44. A compound according to any one of claims 1 to 17, wherein said cCPP has the structure:
Or a protonated form or salt thereof, wherein at least one atom of the amino acid side-chain is replaced by a therapeutic moiety or linker or at least one lone pair forms a bond with the therapeutic moiety or linker.
45. A compound according to any one of claims 1 to 17, wherein said cCPP has the structure:
Or a protonated form or salt thereof, wherein at least one atom of the amino acid side-chain is replaced by a therapeutic moiety or linker or at least one lone pair forms a bond with the therapeutic moiety or linker.
46. The compound of any one of claims 19 to 45, wherein at least one atom on the AA SC is replaced with the therapeutic moiety or linker or at least one lone pair forms a bond with the therapeutic moiety or linker.
47. The compound of any one of claims 44 to 46, wherein the linker comprises a -(OCH2CH2)z'- subunit, wherein z' is an integer from 1 to 23.
48. The compound of any one of claims 44 to 46, wherein the linker comprises:
(i) a -(OCH2CH2)z'- subunit, wherein z' is an integer from 1 to 23;
(ii) One or more amino acid residues, such as residues of glycine, beta-alanine, 4-aminobutyric acid, 5-aminopentanoic acid or 6-aminocaproic acid, or combinations thereof; or
(iii) A combination of (i) and (ii).
49. The compound of any one of claims 44 to 46, wherein the linker comprises:
(i) a -(OCH2CH2)z- subunit, wherein z is an integer from 2 to 20;
(ii) Residues of one or more of glycine, beta-alanine, 4-aminobutyric acid, 5-aminopentanoic acid, 6-aminocaproic acid, or combinations thereof; or
(iii) A combination of (i) and (ii).
50. The compound of any one of claims 44 to 46, wherein the linker comprises a divalent or trivalent C1-C50 alkylene group, wherein 1-25 methylene groups are optionally and independently replaced by -N(H)-, -N(C1-C4 alkyl)-, -N(cycloalkyl)-, -O-, -C(O)O-, -S(O)2-, -S(O)2N(C1-C4 alkyl)-, -S(O)2N(cycloalkyl)-, -N(H)C(O)-, -N(C1-C4 alkyl)C(O)-, -N(cycloalkyl)C(O)-, -C(O)N(H)-, -C(O)N(C1-C4 alkyl)-, -C(O)N(cycloalkyl)-, aryl, heteroaryl, cycloalkyl, or cycloalkenyl.
51. The compound of any one of claims 44 to 46, wherein the linker has the structure:
Wherein:
x' is an integer from 1 to 23; y is an integer from 1 to 5; z' is an integer from 1 to 23; * is the point of attachment to the AA SC, and AA SC is a side chain of an amino acid residue of the cyclic peptide; and M is a binding group.
52. The compound of claim 51, wherein z' is 11.
53. The compound of claim 51 or 52, wherein x' is 1.
54. The compound of any one of claims 1 to 53, further comprising an exocyclic peptide conjugated to the cCPP.
55. The compound of claim 54 when dependent on claim 51, wherein the exocyclic peptide is conjugated to the linker at the amino terminus of the linker.
56. The compound of claim 54 or 55, wherein the exocyclic peptide comprises 2 to 10 amino acid residues.
57. The compound of claim 54 or 55, wherein the exocyclic peptide comprises 4 to 8 amino acid residues.
58. The compound of any one of claims 54 to 57, wherein the exocyclic peptide comprises 1 or 2 amino acid residues comprising a side chain comprising a guanidino group or protonated form or salt thereof.
59. The compound of any one of claims 54 to 58, wherein the exocyclic peptide comprises 2, 3 or 4 lysine residues.
60. The compound of claim 59, wherein the amino group on the side chain of each lysine residue is substituted with a trifluoroacetyl (-COCF3), allyloxycarbonyl (Alloc), 1-(4,4-dimethyl-2,6-dioxocyclohexylidene)ethyl (Dde) or (4,4-dimethyl-2,6-dioxocyclohex-1-ylidene-3)-methylbutyl (ivDde) group.
61. The compound of any one of claims 54 to 60, wherein the exocyclic peptide comprises at least 2 amino acid residues having a hydrophobic side chain.
62. The compound of claim 61, wherein the amino acid residue having a hydrophobic side chain is selected from the group consisting of valine, proline, alanine, leucine, isoleucine and methionine.
63. The compound of claim 54 or 55, wherein the exocyclic peptide comprises one of the following sequences: KK, KR, RR, HH, HK, HR, RH, KKK, KGK, KBK, KBR, KRK, KRR, RKK, RRR, KKH, KHK, HKK, HRR, HRH, HHR, HBH, HHH, HHHH, KHKK, KKHK, KKKH, KHKH, HKHK, KKKK, KKRK, KRKK, KRRK, RKKR, RRRR, KGKK, KKGK, HBHBH, HBKBH, RRRRR, KKKKK, KKKRK, RKKKK, KRKKK, KKRKK, KKKKR, KBKBK, RKKKKG, KRKKKG, KKRKKG, KKKKRG, RKKKKB, KRKKKB, KKRKKB, KKKKRB, KKKRKV, RRRRRR, HHHHHH, RHRHRH, HRHRHR, KRKRKR, RKRKRK, RBRBRB, KBKBKB, PKKKRKV, PGKKRKV, PKGKRKV, PKKGRKV, PKKKGKV, PKKKRGV or PKKKRKG, wherein B is β-alanine.
64. The compound of claim 54 or 55, wherein the exocyclic peptide comprises one of the following sequences: PKKKRKV, RR, RRR, RHR, RBR, RBRBR, RBHBR or HBRBH, wherein B is β -alanine.
65. The compound of claim 54 or 55, wherein the exocyclic peptide comprises one of the following sequences: KK, KR, RR, KKK, KGK, KBK, KBR, KRK, KRR, RKK, RRR, KKKK, KKRK, KRKK, KRRK, RKKR, RRRR, KGKK, KKGK, KKKKK, KKKRK, KBKBK, KKKRKV, PKKKRKV, PGKKRKV, PKGKRKV, PKKGRKV, PKKKGKV, PKKKRGV or PKKKRKG.
66. The compound of claim 54 or 55, wherein the exocyclic peptide comprises PKKKRKV.
67. The compound of claim 66, wherein the exocyclic peptide comprises one of the following sequences: NLSKRPAAIKKAGQAKKKK, PAAKRVKLD, RQRRNELKRSF, RMRKFKNKGKDTAELRRRRVEVSVELR, KAKKDEQILKRRNV, VSRKRPRP, PPKKARED, PQPKKKPL, SALIKKKKKMAP, DRLRR, PKQKKRK, RKLKKKIKKL, REKKKFLKRR, KRKGDEVDGVDEVAKKKSKK or RKCLQAGMNLEARKTKK.
68. The compound of any one of claims 1 to 21, wherein the compound has the following formula (C):
Or a protonated form or salt thereof,
Wherein:
R1, R2 and R3 are each independently H or a side chain comprising an aryl or heteroaryl group, wherein at least one of R1, R2 and R3 is a side chain comprising an aryl or heteroaryl group;
R4 and R7 are independently H or an amino acid side chain;
EP is an exocyclic peptide;
Each m is independently an integer from 0 to 3;
n is an integer from 0 to 2;
x' is an integer from 1 to 23;
y is an integer from 1 to 5;
q is an integer from 1 to 4;
z' is an integer from 1 to 23; and
The cargo is the therapeutic moiety.
69. The compound of claim 68, wherein R1, R2 and R3 are H or side chains comprising an aryl group.
70. The compound of claim 68 or 69, wherein the side chain comprising an aryl group is that of phenylalanine.
71. The compound of any one of claims 68 to 70, wherein two of R1, R2 and R3 are side chains of phenylalanine.
72. The compound of any one of claims 68 to 70, wherein two of R1, R2, R3 and R4 are H.
73. The compound of any one of claims 68 to 72, wherein z' is 11.
74. The compound of any one of claims 68 to 73, wherein x' is 1.
75. The compound of any one of claims 68 to 74, wherein the EP comprises 2 to 10 amino acid residues.
76. The compound of any one of claims 68 to 74, wherein the EP comprises 4 to 8 amino acid residues.
77. The compound of any one of claims 68 to 76, wherein the EP comprises 1 or 2 amino acid residues comprising a side chain comprising a guanidino group or protonated form or salt thereof.
78. The compound of any one of claims 68 to 77, wherein the EP comprises at least 1 lysine residue.
79. The compound of any one of claims 68 to 77, wherein the EP comprises 2, 3 or 4 lysine residues.
80. The compound of any one of claims 68 to 78, wherein the EP comprises at least 2 amino acids having hydrophobic side chains.
81. The compound of claim 80, wherein the amino acid residue having a hydrophobic side chain is selected from valine, proline, alanine, leucine, isoleucine and methionine residues.
82. The compound of any one of claims 68 to 74, wherein the EP comprises one of the following sequences: PKKKRKV, KR, RR, KKK, KGK, KBK, KBR, KRK, KRR, RKK, RRR, KKKK, KKRK, KRKK, KRRK, RKKR, RRRR, KGKK, KKGK, KKKKK, KKKRK, KBKBK, KKKRKV, PGKKRKV, PKGKRKV, PKKGRKV, PKKKGKV, PKKKRGV or PKKKRKG.
83. The compound of any one of claims 68 to 74, wherein the EP has the structure: Ac-PKKKRKV.
84. The compound of any one of claims 1 to 21, comprising a structure of formula (C-1), (C-2), (C-3), or (C-4):
Or a protonated form or salt thereof,
Wherein EP is an exocyclic peptide, and
the oligonucleotide is the therapeutic moiety.
85. The compound of claim 84, wherein the EP comprises 2 to 10 amino acid residues.
86. The compound of claim 84, wherein the EP comprises 4 to 8 amino acid residues.
87. The compound of any one of claims 84 to 86, wherein the EP comprises 1 or 2 amino acid residues comprising a side chain comprising a guanidino group or protonated form or salt thereof.
88. The compound of any one of claims 84 to 87, wherein the EP comprises at least 1 lysine residue.
89. The compound of any one of claims 84 to 87, wherein the EP comprises 2, 3 or 4 lysine residues.
90. The compound of any one of claims 84 to 89, wherein the EP comprises at least 2 amino acids having a hydrophobic side chain.
91. The compound of claim 90, wherein the amino acid residue having a hydrophobic side chain is selected from valine, proline, alanine, leucine, isoleucine and methionine residues.
92. The compound of claim 84, wherein the EP comprises one of the following sequences: PKKKRKV, KR, RR, KKK, KGK, KBK, KBR, KRK, KRR, RKK, RRR, KKKK, KKRK, KRKK, KRRK, RKKR, RRRR, KGKK, KKGK, KKKKK, KKKRK, KBKBK, KKKRKV, PGKKRKV, PKGKRKV, PKKGRKV, PKKKGKV, PKKKRGV or PKKKRKG.
93. The compound of claim 84, wherein the EP has the structure: Ac-PKKKRKV.
94. The compound of any one of claims 1 to 93, wherein the compound down-regulates expression of glycogen synthase.
95. The compound of claim 94, wherein the compound down-regulates expression of GYS1.
96. The compound of claim 94 or 95, wherein the compound induces exon skipping of an exon in a GYS1 transcript.
97. The compound of any one of claims 94-96, wherein the AC hybridizes to the target nucleotide sequence, wherein the target nucleotide sequence comprises at least a portion of a GYS1 transcript, and the AC induces skipping of an exon in the GYS1 transcript.
98. The compound of claim 97, wherein skipping of the exon induces a frame shift in the GYS1 transcript.
99. The compound of claim 98, wherein the frame shift results in a GYS1 transcript encoding a glycogen synthase having reduced activity.
100. The compound of claim 98, wherein the frame shift results in a truncated or nonfunctional glycogen synthase.
101. The compound of claim 98, wherein the frameshift results in the introduction of a premature stop codon in the GYS1 transcript.
102. The compound of claim 98, wherein the frameshift results in degradation of the GYS1 transcript by nonsense-mediated decay.
103. The compound of claim 96, wherein the compound is designed to skip a frame exon, such as GYS1 exon 2, GYS1 exon 5, GYS1 exon 6, GYS1 exon 7, GYS1 exon 8, GYS1 exon 10, or GYS1 exon 12.
104. The compound of claim 103, wherein skipping the frame exon results in a frame shift in the GYS1 transcript.
105. The compound of any one of claims 96 to 104, wherein the compound comprises one or more of the following sequences:
Sequence (5'-3')
ctgtgggcccaagcgtgtgagggca
ACCCActgtgggcccaagcgtgtga
ATGCCACCCActgtgggcccaagcg
TGTAGATGCCACCCActgtgggccc
CACCGTGTAGATGCCACCCActgtg
TGCAGCACCGTGTAGATGCCACCCA
CTTGCAGCCCTTGCTGTTCATGGAA
cccacCTTGCAGCCCTTGCTGTTCA
cacgtcccacCTTGCAGCCCTTGCT
tgggccacgtcccacCTTGCAGCCC
tgggctgggccacgtcccacCTTGC
tgccctgggctgggccacgtcccac
ctgggtgggaggggacagcagtccg
TTGAActgggtgggaggggacagca
CCACGTTGAActgggtgggagggga
CTTGTCCACGTTGAActgggtggga
GCTTCCTTGTCCACGTTGAActggg
CCCCTGCTTCCTTGTCCACGTTGAA
CTGGTTTCCTCTTGAGCAAGTGCTG
cctacCTGGTTTCCTCTTGAGCAAG
cagcccctacCTGGTTTCCTCTTGA
cagcccagcccctacCTGGTTTCCT
gacttcagcccagcccctacCTGGT
ctggggacttcagcccagcccctac
ctgggattgggggtgagggtcccat
AATATctgggattgggggtgagggt
GTCACAATATctgggattgggggtg
TGGGGGTCACAATATctgggattgg
CCCATTGGGGGTCACAATATctggg
TTCAGCCCATTGGGGGTCACAATAT
CCATAAAAATGGCCCCGCACAAACT
cgtacCCATAAAAATGGCCCCGCAC
ccccacgtacCCATAAAAATGGCCC
atatgccccacgtacCCATAAAAAT
taggtatatgccccacgtacCCATA
agacctaggtatatgccccacgtac
ctaaagaacccacaaggcacggtaa
GATGCctaaagaacccacaaggcac
GTCCAGATGCctaaagaacccacaa
TTGAAGTCCAGATGCctaaagaacc
CCAAGTTGAAGTCCAGATGCctaaa
CTTGTCCAAGTTGAAGTCCAGATGC
TCTGAGCAGATAGTTGAGCCGAGCC
ctcacTCTGAGCAGATAGTTGAGCC
caggcctcacTCTGAGCAGATAGTT
tagcccaggcctcacTCTGAGCAGA
cctcatagcccaggcctcacTCTGA
tgtcccctcatagcccaggcctcac
ctgcgcagaaagaaaggagggggag
TTCACctgcgcagaaagaaaggagg
TGCCGTTCACctgcgcagaaagaaa
CTCGCTGCCGTTCACctgcgcagaa
GTCTGCTCGCTGCCGTTCACctgcg
CCACTGTCTGCTCGCTGCCGTTCAC
CAAAGCTGTTTGCGCACAGCTTGGC
ctaacCAAAGCTGTTTGCGCACAGC
ggctgctaacCAAAGCTGTTTGCGC
gcgagggctgctaacCAAAGCTGTT
ccggagcgagggctgctaacCAAAG
aggggccggagcgagggctgctaac
ctgcaaggcaagcaggggcatgcat
TCCCActgcaaggcaagcaggggca
AAGGCTCCCActgcaaggcaagcag
TCGGGAAGGCTCCCActgcaaggca
TCATGTCGGGAAGGCTCCCActgca
CTTGTTCATGTCGGGAAGGCTCCCA
CTGCGTTGCAAAGATGGCTCTCTTC
catacCTGCGTTGCAAAGATGGCTC
caatccatacCTGCGTTGCAAAGAT
aggtccaatccatacCTGCGTTGCA
cacagaggtccaatccatacCTGCG
ctctgcacagaggtccaatccatac
ctggtagtgaaaaagaaggactcag
ATCACctggtagtgaaaaagaagga
GGAAAATCACctggtagtgaaaaag
CGGGTGGAAAATCACctggtagtga
AACTCCGGGTGGAAAATCACctggt
AGAGGAACTCCGGGTGGAAAATCAC
CCGGTGTGTAGCCCCAAGGCTCATA
ctcacCCGGTGTGTAGCCCCAAGGC
ctacactcacCCGGTGTGTAGCCCC
gcccactacactcacCCGGTGTGTA
cccctgcccactacactcacCCGGT
gctgtcccctgcccactacactcac
ctgtggaggccaggacccaggttca
GATACctgtggaggccaggacccag
ATGTAGATACctgtggaggccagga
CAAGAATGTAGATACctgtggaggc
CCGGTCAAGAATGTAGATACctgtg
AACCGCCGGTCAAGAATGTAGATAC
CCGGCCTAGGTATTTCCAGTCCAGA
cctacCCGGCCTAGGTATTTCCAGT
ggggtcctacCCGGCCTAGGTATTT
aagtgggggtcctacCCGGCCTAGG
taggaaagtgggggtcctacCCGGC
gaggataggaaagtgggggtcctac
106. The compound of claim 94 or 95, wherein the compound comprises a moiety designed to hybridize to the start codon of GYS1.
107. The compound of claim 106, wherein the compound comprises one or more of the following sequences:
Sequence (5'-3')
GCGGTTTAAAGGCATGGCTGGCGCA
TGGACAAAGTGCGGTTTAAAGGCAT
GTGAGGACATGGACAAAGTGCGGTT
CAGTCCTGGCAGTGAGGACATGGAC
CCAGTCCTCCAGTCCTGGCAGTGAG
CACCTGGGATTCTTAAATATAGATG
108. A compound according to any one of claims 1 to 93, wherein the compound down-regulates expression of interferon regulatory factor 5 (IRF-5).
109. The compound of claim 108, wherein the compound induces exon skipping of an exon in an IRF-5 mRNA transcript.
110. The compound of claim 108 or 109, wherein the AC hybridizes to the target nucleotide sequence, wherein the target nucleotide sequence comprises at least a portion of an IRF-5 transcript, and the AC induces skipping of an exon in the IRF-5 transcript.
111. The compound of claim 108, wherein skipping of the exon induces a frame shift in the IRF-5 transcript.
112. The compound of claim 111, wherein the frameshift results in an IRF-5 transcript encoding IRF-5 having reduced activity.
113. The compound of claim 111, wherein the frameshift results in truncated or nonfunctional IRF-5.
114. The compound of claim 111, wherein the frame shift results in the introduction of a premature stop codon in the IRF-5 mRNA transcript.
115. The compound of claim 111, wherein the frame shift results in degradation of the IRF-5 mRNA transcript by nonsense-mediated decay.
116. The compound of any one of claims 108 to 115, wherein the compound is designed to skip a frame exon, such as IRF-5 exon 3, IRF-5 exon 4, IRF-5 exon 5 or IRF-5 exon 8.
117. The compound of claim 116, wherein skipping the frame exon results in a frame shift in the IRF-5 mRNA transcript.
118. The compound of any one of claims 1 to 93, wherein the compound down-regulates expression of double homeobox 4 (DUX4).
119. The compound of claim 118, wherein the target transcript is DUX4, and wherein the AC induces alternative splicing in the target transcript.
120. The compound of claim 119, wherein the alternative splicing upregulates expression of DUX4-s.
121. The compound of claim 119, wherein exon skipping down-regulates expression of DUX4-fl.
122. A pharmaceutical composition comprising a compound according to any one of claims 1 to 121 and a pharmaceutically acceptable carrier.
123. A cell comprising a compound according to any one of claims 1 to 121.
124. A method of down-regulating the activity of a target protein in a cell, the method comprising administering to the cell a compound according to any one of claims 1 to 121 or a pharmaceutical composition according to claim 122.
125. A method of down-regulating the activity of a target protein in a patient, the method comprising administering to the patient a therapeutically effective amount of a compound according to any one of claims 1 to 121 or a pharmaceutical composition according to claim 122.
126. A method of treating a disease or disorder associated with a target gene in a patient, the method comprising administering to the patient a therapeutically effective amount of a compound according to any one of claims 1 to 121 or a pharmaceutical composition according to claim 122.
127. The method of any one of claims 124-126, wherein the target gene comprises interferon regulatory factor 5 (IRF-5).
128. The method of any one of claims 124-126, wherein the target gene comprises glycogen synthase (GYS1).
129. The method of claim 128, wherein the disease or disorder comprises a glycogen storage disease.
130. The method of claim 129, wherein the glycogen storage disease is associated with glycogen accumulation in muscle tissue.
131. The method of claim 130, wherein the glycogen storage disease is associated with glycogen accumulation in myocardial tissue.
132. The method of claim 130, wherein the glycogen storage disease is associated with glycogen accumulation in skeletal muscle tissue.
133. The method of claim 130, wherein the glycogen storage disease comprises a type II glycogen storage disease.
134. The method of claim 129, wherein the glycogen storage disease comprises Pompe disease.
135. The method of claim 129, wherein the glycogen storage disease comprises Andersen disease, McArdle disease, Lafora disease, or Tarui disease.
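The sequence and reading-frame relationships recited in claims 97 to 105 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions and not a description of how the claimed compounds were designed. The helper names (`reverse_complement`, `tiling_step`, `skipping_shifts_frame`, `first_stop_codon_index`) are hypothetical, and the toy exons are invented and are not taken from the GYS1 transcript. The five-nucleotide offset between consecutive listed sequences is an observation read off the list in claim 105, not something the claims state. The target site is written in the DNA alphabet (T in place of U).

```python
from typing import Optional

# Hypothetical helpers for illustration; none of these names come from the patent.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons written in the DNA alphabet


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA-alphabet sequence (letter case is preserved)."""
    return seq.translate(_COMPLEMENT)[::-1]


def tiling_step(first: str, second: str) -> Optional[int]:
    """Smallest k such that `second` is k new 5' bases followed by the
    first len(first) - k bases of `first` (case-insensitive), else None."""
    a, b = first.upper(), second.upper()
    if len(a) != len(b):
        return None
    for k in range(1, len(a)):
        if b[k:] == a[: len(a) - k]:
            return k
    return None


def skipping_shifts_frame(exon_length: int) -> bool:
    """Removing an exon shifts the downstream reading frame unless its
    length is a multiple of three."""
    return exon_length % 3 != 0


def first_stop_codon_index(cds: str) -> Optional[int]:
    """0-based index of the first in-frame stop codon, reading from position 0."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i : i + 3].upper() in STOP_CODONS:
            return i // 3
    return None


# First two sequences listed in claim 105 (5'-3').
AC_1 = "ctgtgggcccaagcgtgtgagggca"
AC_2 = "ACCCActgtgggcccaagcgtgtga"

# Toy exons (assumed for illustration, NOT GYS1): the middle exon is 5 nt long.
TOY_EXONS = ["ATGGCA", "GGTCA", "TGACCGGTAA"]

if __name__ == "__main__":
    print("target site of AC_1 (DNA alphabet):", reverse_complement(AC_1))
    print("offset between AC_1 and AC_2:", tiling_step(AC_1, AC_2))
    full = "".join(TOY_EXONS)
    skipped = TOY_EXONS[0] + TOY_EXONS[2]
    print("frame shift on skipping:", skipping_shifts_frame(len(TOY_EXONS[1])))
    print("first stop codon, full length:", first_stop_codon_index(full))
    print("first stop codon, exon skipped:", first_stop_codon_index(skipped))
```

With the toy exons, the full-length reading frame reaches its stop codon at codon index 6, whereas the exon-skipped frame meets a premature stop codon at index 2, which is the situation claims 101 and 102 describe. For the first two listed sequences, `tiling_step` returns 5.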
CN202280041112.4A 2021-05-10 2022-05-09 Compositions and methods for modulating mRNA splicing Pending CN117915957A (en)

Applications Claiming Priority (10)

Application Number Priority Date Filing Date Title
US63/186,664 2021-05-10
US63/210,882 2021-06-15
US63/210,866 2021-06-15
US63/239,671 2021-09-01
US63/298,587 2022-01-11
US63/318,201 2022-03-09
US63/321,921 2022-03-21
US202263362295P 2022-03-31 2022-03-31
US63/362,295 2022-03-31
PCT/US2022/028357 WO2022240760A2 (en) 2021-05-10 2022-05-09 COMPOSITIONS AND METHODS FOR MODULATING mRNA SPLICING

Publications (1)

Publication Number Publication Date
CN117915957A true CN117915957A (en) 2024-04-19

Family

ID=90646052

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202280044682.9A Pending CN117897176A (en) 2021-05-10 2022-05-09 Compositions and methods for modulating tissue distribution of intracellular therapeutic agents
CN202280041112.4A Pending CN117915957A (en) 2021-05-10 2022-05-09 Compositions and methods for modulating mRNA splicing

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN202280044682.9A Pending CN117897176A (en) 2021-05-10 2022-05-09 Compositions and methods for modulating tissue distribution of intracellular therapeutic agents

Country Status (1)

Country Link
CN (2) CN117897176A (en)

Also Published As

Publication number Publication date
CN117897176A (en) 2024-04-16

Legal Events

Date Code Title Description
PB01 Publication