CN116635523A - Alternatives to RAG1 for use in therapy - Google Patents


Info

Publication number
CN116635523A
Authority
CN
China
Prior art keywords
homologous
identity
region
cells
chr
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202180082483.2A
Other languages
Chinese (zh)
Inventor
A·维拉
P·吉诺维斯
L·纳尔迪尼
N·萨切蒂
M·C·卡斯蒂洛
S·费拉里
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ospedale San Raffaele SRL
Original Assignee
Ospedale San Raffaele SRL
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ospedale San Raffaele SRL
Priority claimed from PCT/EP2021/078222 (WO2022079054A1)
Publication of CN116635523A

Landscapes

  • Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)
  • Medicines Containing Material From Animals Or Micro-Organisms (AREA)

Abstract

The present invention relates to isolated polynucleotides comprising, from 5' to 3': a first homologous region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, and a second homologous region, for use in treating RAG-deficient immunodeficiency.

Description

Alternatives to RAG1 for use in therapy
Technical Field
The present invention relates to methods for gene editing of cells to introduce RAG1 polypeptides, for example as a treatment of severe combined immunodeficiency. The invention also relates to polynucleotides, vectors, guide RNAs, kits, compositions and gene editing systems for use in the methods. The invention also relates to genomes and cells obtained or obtainable by said method.
Background
RAG1 and RAG2 proteins initiate V(D)J recombination, generating a diverse repertoire of T cells and B cells (Teng G, Schatz DG. Advances in Immunology. 2015;128:1-39). Human RAG mutations result in a broad spectrum of phenotypes, including T-B- SCID, Omenn syndrome (OS), atypical SCID (AS) and combined immunodeficiency with granuloma/autoimmunity (CID-G/AI) (Notarangelo LD, et al. Nat Rev Immunol. 2016;16(4):234-246).
Hematopoietic stem cell transplantation (HSCT) is the mainstay of treatment for severe forms of RAG1 deficiency, including T-B- SCID, OS and AS, with an overall survival of about 80% after transplantation from donors other than matched siblings (Haddad E, et al. Blood. 2018;132(17):1737-49). However, overall survival is lower with non-matched sibling donors, and high rates of graft failure and poor T-cell and B-cell immune reconstitution are observed in the absence of myeloablative or reduced-intensity conditioning. In addition to donor type and conditioning, other factors associated with poor outcome after HSCT include age (>3.5 months of life) and infections at the time of transplantation.
Gene therapy represents an alternative approach to overcome the obstacles associated with HSCT. The selective advantage of gene-corrected hematopoietic stem cells (HSCs), which overcome the block in T and B cell development that occurs in the absence of RAG activity, is the rationale for developing such strategies. In recent years, lentiviral vectors have become the platform of choice for delivering and expressing a transgene of interest under the control of a suitable promoter (Naldini L. Nature. 2015;526:351-360). In the case of RAG1 deficiency, endogenous RAG1 gene expression is tightly regulated during the cell cycle and lymphoid development, so ectopic or deregulated expression carries a risk of immune dysfunction or leukemia (Lagresle-Peyrou C, et al. Blood. 2006;107(1):63-72; Pike-Overzet K, et al. Leukemia. 2011;25(9):1471-83; and Pike-Overzet K, et al. Journal of Allergy and Clinical Immunology. 2014;134:242-243). Several groups have examined the safety and efficacy of lentiviral-mediated gene therapy for RAG deficiency in preclinical models, which showed poor immune reconstitution or signs of severe inflammation, with cellular infiltrates in the skin, lung, liver and kidney and the presence of circulating anti-double-stranded DNA antibodies (van Til NP, et al. J Allergy Clin Immunol. 2014;133(4):1116-23).
Overall, these data raise significant concerns about the clinical use of conventional RAG1 gene therapy vectors, which result in suboptimal levels and deregulated patterns of gene expression.
Thus, there is a need for improved treatments for RAG1 deficiency.
Disclosure of Invention
The inventors have developed a gene editing strategy to correct mutations in the RAG1 gene by targeting the genomic region 5' to the second exon (which contains the entire coding sequence of the gene).
The present inventors have designed and selected a set of CRISPR-Cas9 nucleases and identified specific sites in the non-repeat region of the first intron of the human RAG1 gene. The present inventors have identified guide RNAs and optimal conditions for delivery of CRISPR-Cas9 nuclease ribonucleoprotein complexes. Meanwhile, the present inventors developed donor DNA carrying human RAG1 cDNA.
The gene editing strategy achieves high levels of activity (measured as the frequency of NHEJ-induced indels) and targeting efficiency (measured as GFP expression) both in surrogate cell lines that lack RAG1 expression and carry a recombination cassette, and in human CD34+ HSCs obtained from mobilized peripheral blood (mPB). Using the gene editing strategy, high editing efficiency is achieved in mobilized peripheral blood (mPB) CD34+ cells.
In one aspect, the invention provides a polynucleotide comprising, from 5' to 3': a first homologous region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, and a second homologous region.
In another aspect, the invention provides a polynucleotide comprising, from 5' to 3': a first homologous region, a nucleotide sequence encoding a RAG1 polypeptide, and a second homologous region.
In some embodiments:
(i) The first homologous region is homologous to a first region of RAG1 intron 1 and the second homologous region is homologous to a second region of RAG1 intron 1; or
(ii) The first homologous region is homologous to a first region of RAG1 intron 1 or RAG1 exon 2, and the second homologous region is homologous to a second region of RAG1 exon 2.
In some embodiments, the first homologous region is homologous to a first region of RAG1 intron 1 and the second homologous region is homologous to a second region of RAG1 intron 1.
In some embodiments, the first homologous region is homologous to a first region of RAG1 intron 1 and the second homologous region is homologous to a second region of RAG1 exon 2.
In some embodiments, the first homologous region is homologous to a first region of RAG1 exon 2 and the second homologous region is homologous to a second region of RAG1 exon 2.
In some embodiments:
(i) The first homologous region is homologous to a region upstream of chr11:36569295 and the second homologous region is homologous to a region downstream of chr11:36569298;
(ii) The first homologous region is homologous to a region upstream of chr11:36573790 and the second homologous region is homologous to a region downstream of chr11:36573793;
(iii) The first homologous region is homologous to a region upstream of chr11:36573641 and the second homologous region is homologous to a region downstream of chr11:36573644;
(iv) The first homologous region is homologous to a region upstream of chr11:36573351 and the second homologous region is homologous to a region downstream of chr11:36573354;
(v) The first homologous region is homologous to a region upstream of chr11:36569080 and the second homologous region is homologous to a region downstream of chr11:36569083;
(vi) The first homologous region is homologous to a region upstream of chr11:36572472 and the second homologous region is homologous to a region downstream of chr11:36572475;
(vii) The first homologous region is homologous to a region upstream of chr11:36571458 and the second homologous region is homologous to a region downstream of chr11:36571461;
(viii) The first homologous region is homologous to a region upstream of chr11:36571366 and the second homologous region is homologous to a region downstream of chr11:36571369;
(ix) The first homologous region is homologous to a region upstream of chr11:36572859 and the second homologous region is homologous to a region downstream of chr11:36572862;
(x) The first homologous region is homologous to a region upstream of chr11:36571457 and the second homologous region is homologous to a region downstream of chr11:36571460;
(xi) The first homologous region is homologous to a region upstream of chr11:36569351 and the second homologous region is homologous to a region downstream of chr11:36569354; or
(xii) The first homologous region is homologous to a region upstream of chr11:36572375 and the second homologous region is homologous to a region downstream of chr11:36572378.
In some embodiments:
(i) The first homologous region is homologous to a region upstream of chr11:36569295 and the second homologous region is homologous to a region downstream of chr11:36569298;
(ii) The first homologous region is homologous to a region upstream of chr11:36573351 and the second homologous region is homologous to a region downstream of chr11:36573354; or
(iii) The first homologous region is homologous to a region upstream of chr11:36571366 and the second homologous region is homologous to a region downstream of chr11:36571369.
In a preferred embodiment, the first homologous region is homologous to a region upstream of chr11:36569295 and the second homologous region is homologous to a region downstream of chr11:36569298.
In some embodiments, the first homologous region is homologous to a region upstream of chr11:36573790 and the second homologous region is homologous to a region downstream of chr11:36573793.
In some embodiments, the first homologous region is homologous to a region upstream of chr11:36573641 and the second homologous region is homologous to a region downstream of chr11:36573644.
In some embodiments, the first homologous region is homologous to a region upstream of chr11:36573351 and the second homologous region is homologous to a region downstream of chr11:36573354.
In some embodiments, the first homologous region is homologous to a region upstream of chr11:36569080 and the second homologous region is homologous to a region downstream of chr11:36569083.
In some embodiments, the first homologous region is homologous to a region upstream of chr11:36572472 and the second homologous region is homologous to a region downstream of chr11:36572475.
In some embodiments, the first homologous region is homologous to a region upstream of chr11:36571458 and the second homologous region is homologous to a region downstream of chr11:36571461.
In some embodiments, the first homologous region is homologous to a region upstream of chr11:36571366 and the second homologous region is homologous to a region downstream of chr11:36571369.
In some embodiments, the first homologous region is homologous to a region upstream of chr11:36572859 and the second homologous region is homologous to a region downstream of chr11:36572862.
In some embodiments, the first homologous region is homologous to a region upstream of chr11:36571457 and the second homologous region is homologous to a region downstream of chr11:36571460.
In some embodiments, the first homologous region is homologous to a region upstream of chr11:36569351 and the second homologous region is homologous to a region downstream of chr11:36569354.
In some embodiments, the first homologous region is homologous to a region upstream of chr11:36572375 and the second homologous region is homologous to a region downstream of chr11:36572378.
In a preferred embodiment, the first homologous region is homologous to a region comprising chr11:36569245-36569294 and/or the second homologous region is homologous to a region comprising chr11:36569299-36569348.
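The relationship between a target interval and its flanking homology windows can be sketched in a few lines of Python. This is only an illustration under stated assumptions: coordinates are treated as 1-based and inclusive, the helper names `parse_position` and `flanking_windows` are hypothetical, and the 50 bp window size is chosen to match the preferred embodiment above. The genome build is not stated in this section and is not assumed here.

```python
import re
from typing import NamedTuple, Tuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive (assumption)
    end: int    # 1-based, inclusive (assumption)


def parse_position(text: str) -> Tuple[str, int]:
    """Parse 'chr11:36569295' into (chrom, position), tolerating stray spaces."""
    match = re.fullmatch(r"\s*(chr\s*\w+)\s*:\s*(\d+)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized coordinate: {text!r}")
    chrom = re.sub(r"\s+", "", match.group(1))
    return chrom, int(match.group(2))


def flanking_windows(upstream_of: str, downstream_of: str,
                     size: int = 50) -> Tuple[Interval, Interval]:
    """Return the windows immediately upstream and downstream of an interval."""
    chrom_a, first = parse_position(upstream_of)
    chrom_b, last = parse_position(downstream_of)
    if chrom_a != chrom_b or first > last:
        raise ValueError("positions must lie on one chromosome, in order")
    left = Interval(chrom_a, first - size, first - 1)   # ends just before 'first'
    right = Interval(chrom_a, last + 1, last + size)    # starts just after 'last'
    return left, right


left, right = flanking_windows("chr11:36569295", "chr11:36569298")
print(left)
print(right)
```

With the preferred positions as input, the two windows returned are chr11:36569245-36569294 and chr11:36569299-36569348, the same intervals named in the preferred embodiment.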
In some embodiments, the 3'-terminal sequence of the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity to SEQ ID NO. 7, and/or the 5'-terminal sequence of the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity to SEQ ID NO. 19.
In some embodiments, the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity to SEQ ID NO. 31 or a fragment thereof and/or the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity to SEQ ID NO. 32 or a fragment thereof.
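A percent-identity threshold such as "at least 70% identity" can be illustrated with a short sketch. It assumes the two sequences have already been aligned (gaps written as `-`) and that identity is counted as matching columns divided by aligned columns. This section does not say which alignment program or identity definition applies, so both the definition and the function names below are assumptions, and the sequences are toy examples rather than any SEQ ID NO.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns")
    # A gap opposite a residue counts as a mismatch (assumption).
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str,
                    threshold: float = 70.0) -> bool:
    """True if the aligned pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
print(meets_threshold("ACGT-ACG", "ACGTTAGG"))
```

Other conventions (for example, dividing by the length of the shorter sequence) give different values, so the denominator should be fixed before comparing against a threshold.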
In some embodiments, the first and second homologous regions are each 50-1000bp in length, 100-500bp in length, or 200-400bp in length.
In some embodiments, the nucleotide sequence encoding a RAG1 polypeptide comprises or consists of a nucleotide sequence encoding an amino acid sequence having at least 70% identity to SEQ ID NO. 4 or SEQ ID NO. 5.
In some embodiments, the nucleotide sequence encoding a RAG1 polypeptide comprises or consists of a nucleotide sequence having at least 70% identity to SEQ ID NO. 6.
In some embodiments, the splice acceptor site comprises or consists of a nucleotide sequence having at least 70% identity to SEQ ID NO. 33.
In preferred embodiments, the nucleotide sequence encoding a RAG1 polypeptide is operably linked to a polyadenylation sequence, optionally wherein the polyadenylation sequence is a bGH polyadenylation sequence.
In some embodiments, the nucleotide sequence encoding a RAG1 polypeptide is operably linked to a polyadenylation sequence comprising or consisting of a nucleotide sequence having at least 70% identity to SEQ ID NO. 35.
In some embodiments, the nucleotide sequence encoding a RAG1 polypeptide is operably linked to a Kozak sequence, optionally wherein the Kozak sequence comprises or consists of a nucleotide sequence having at least 70% identity to SEQ ID NO. 36.
In some embodiments, the polynucleotide comprises or consists of a nucleotide sequence having at least 70% identity to SEQ ID NO. 39.
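The 5'-to-3' element order of the donor polynucleotide can be sketched as a simple concatenation. All sequences below are short hypothetical placeholders, not the sequences of any SEQ ID NO. Placing the Kozak sequence directly upstream of the coding sequence and the polyadenylation sequence directly downstream is an assumption made for illustration, consistent with "operably linked"; the names `Element`, `assemble_donor` and `donor_sequence` are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Element:
    name: str
    sequence: str


def assemble_donor(left_arm: str, splice_acceptor: str, coding_sequence: str,
                   right_arm: str, kozak: str = "",
                   poly_a: str = "") -> List[Element]:
    """Return donor elements in 5'-to-3' order; Kozak and polyA are optional."""
    parts = [Element("homologous_region_1", left_arm),
             Element("splice_acceptor", splice_acceptor)]
    if kozak:
        parts.append(Element("kozak", kozak))
    parts.append(Element("RAG1_coding_sequence", coding_sequence))
    if poly_a:
        parts.append(Element("polyadenylation", poly_a))
    parts.append(Element("homologous_region_2", right_arm))
    for part in parts:
        if not part.sequence or set(part.sequence.upper()) - set("ACGTN"):
            raise ValueError(f"invalid DNA sequence for {part.name}")
    return parts


def donor_sequence(parts: List[Element]) -> str:
    """Concatenate the elements into a single 5'-to-3' sequence."""
    return "".join(part.sequence for part in parts)


# Toy placeholder sequences (hypothetical).
parts = assemble_donor("AAAA", "CAG", "ATGTAA", "TTTT",
                       kozak="GCCACC", poly_a="AATAAA")
print([part.name for part in parts])
print(donor_sequence(parts))
```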
In another aspect, the invention provides a vector comprising a polynucleotide of the invention.
In some embodiments, the vector is a viral vector, optionally an adeno-associated virus (AAV) vector, such as an AAV6 vector. In some embodiments, the vector is a lentiviral vector, such as an Integration Defective Lentiviral Vector (IDLV).
In another aspect, the invention provides a guide RNA comprising or consisting of a nucleotide sequence having at least 90% identity to any one of SEQ ID NOs. 41-52.
In another aspect, the invention provides a guide RNA comprising or consisting of a nucleotide sequence having at least 90% identity to any one of SEQ ID NOs. 53-55.
In a preferred embodiment, the guide RNA comprises or consists of a nucleotide sequence having at least 90% identity with SEQ ID NO. 41.
In a preferred embodiment, the guide RNA comprises or consists of a nucleotide sequence having at least 90% identity with SEQ ID NO. 53.
In some embodiments, the guide RNA comprises or consists of a nucleotide sequence having at least 90% identity to SEQ ID NO. 42.
In some embodiments, the guide RNA comprises or consists of a nucleotide sequence having at least 90% identity to SEQ ID NO. 43.
In some embodiments, the guide RNA comprises or consists of a nucleotide sequence having at least 90% identity to SEQ ID NO. 44.
In some embodiments, the guide RNA comprises or consists of a nucleotide sequence having at least 90% identity to SEQ ID NO. 45.
In some embodiments, the guide RNA comprises or consists of a nucleotide sequence having at least 90% identity to SEQ ID NO. 46.
In some embodiments, the guide RNA comprises or consists of a nucleotide sequence having at least 90% identity to SEQ ID NO. 47.
In some embodiments, the guide RNA comprises or consists of a nucleotide sequence having at least 90% identity to SEQ ID NO. 48.
In some embodiments, the guide RNA comprises or consists of a nucleotide sequence having at least 90% identity to SEQ ID NO. 49.
In some embodiments, the guide RNA comprises or consists of a nucleotide sequence having at least 90% identity to SEQ ID NO. 50.
In some embodiments, the guide RNA comprises or consists of a nucleotide sequence having at least 90% identity to SEQ ID NO. 51.
In some embodiments, the guide RNA comprises or consists of a nucleotide sequence having at least 90% identity to SEQ ID NO. 52.
In some embodiments, the guide RNA comprises or consists of a nucleotide sequence having at least 90% identity to SEQ ID NO. 54.
In some embodiments, the guide RNA comprises or consists of a nucleotide sequence having at least 90% identity to SEQ ID NO. 55.
In some embodiments, one to five of the terminal nucleotides at the 5' and/or 3' end of the guide RNA are chemically modified to enhance stability, optionally wherein three of the terminal nucleotides at the 5' and/or 3' end of the guide RNA are chemically modified to enhance stability, optionally wherein the chemical modification is a modification with 2'-O-methyl 3'-phosphorothioate.
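The terminal-modification pattern can be illustrated by marking which residues of a guide sequence would carry the modification. The `[MS]` tag is a hypothetical annotation invented for this sketch (it is not a vendor ordering syntax), the function name is hypothetical, and the example sequence is a made-up placeholder rather than any SEQ ID NO. The default of three residues per end follows the optional embodiment above.

```python
def annotate_terminal_modifications(rna: str, n_five_prime: int = 3,
                                    n_three_prime: int = 3) -> str:
    """Tag the first and last residues of an RNA with a '[MS]' marker.

    '[MS]' is a hypothetical label standing for a 2'-O-methyl
    3'-phosphorothioate modified nucleotide.
    """
    rna = rna.upper()
    if not rna or set(rna) - set("ACGU"):
        raise ValueError("expected a non-empty RNA sequence over A, C, G, U")
    for count in (n_five_prime, n_three_prime):
        if not 0 <= count <= 5:
            raise ValueError("this sketch allows 0 to 5 modified residues per end")
    tokens = []
    for index, base in enumerate(rna):
        at_five_prime_end = index < n_five_prime
        at_three_prime_end = index >= len(rna) - n_three_prime
        tokens.append(base + "[MS]" if at_five_prime_end or at_three_prime_end
                      else base)
    return "".join(tokens)


# Placeholder 10-nt sequence (hypothetical).
print(annotate_terminal_modifications("ACGUACGUAC"))
```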
In another aspect, the invention provides a kit comprising a polynucleotide or vector of the invention.
In another aspect, the invention provides a composition comprising a polynucleotide or vector of the invention.
In another aspect, the invention provides a gene editing system comprising a polynucleotide or vector of the invention.
In some embodiments, the kit, composition or gene editing system further comprises a guide RNA of the invention. In some embodiments, the kit, composition, or gene editing system further comprises an RNA-guided nuclease, optionally wherein the RNA-guided nuclease is a Cas9 endonuclease.
In another aspect, the invention provides the use of a polynucleotide, vector, kit, composition or gene editing system for gene editing of a cell or population of cells. In some embodiments, the use is an ex vivo or in vitro use.
In another aspect, the invention provides a genome comprising a polynucleotide of the invention.
In another aspect, the invention provides a genome comprising a splice acceptor sequence located in RAG1 intron 1 or RAG1 exon 2 and a nucleotide sequence encoding a RAG1 polypeptide. In some embodiments, the splice acceptor sequence and the nucleotide sequence encoding RAG1 are located in RAG1 intron 1.
In some embodiments:
(i) The splice acceptor sequence and the nucleotide sequence encoding RAG1 replace the region from chr11:36569295 to chr11:36569298;
(ii) The splice acceptor sequence and the nucleotide sequence encoding RAG1 replace the region from chr11:36573790 to chr11:36573793;
(iii) The splice acceptor sequence and the nucleotide sequence encoding RAG1 replace the region from chr11:36573641 to chr11:36573644;
(iv) The splice acceptor sequence and the nucleotide sequence encoding RAG1 replace the region from chr11:36573351 to chr11:36573354;
(v) The splice acceptor sequence and the nucleotide sequence encoding RAG1 replace the region from chr11:36569080 to chr11:36569083;
(vi) The splice acceptor sequence and the nucleotide sequence encoding RAG1 replace the region from chr11:36572472 to chr11:36572475;
(vii) The splice acceptor sequence and the nucleotide sequence encoding RAG1 replace the region from chr11:36571458 to chr11:36571461;
(viii) The splice acceptor sequence and the nucleotide sequence encoding RAG1 replace the region from chr11:36571366 to chr11:36571369;
(ix) The splice acceptor sequence and the nucleotide sequence encoding RAG1 replace the region from chr11:36572859 to chr11:36572862;
(x) The splice acceptor sequence and the nucleotide sequence encoding RAG1 replace the region from chr11:36571457 to chr11:36571460;
(xi) The splice acceptor sequence and the nucleotide sequence encoding RAG1 replace the region from chr11:36569351 to chr11:36569354; or
(xii) The splice acceptor sequence and the nucleotide sequence encoding RAG1 replace the region from chr11:36572375 to chr11:36572378.
In some embodiments:
(i) The splice acceptor sequence and the nucleotide sequence encoding RAG1 replace the region from chr11:36569295 to chr11:36569298;
(ii) The splice acceptor sequence and the nucleotide sequence encoding RAG1 replace the region from chr11:36573351 to chr11:36573354; or
(iii) The splice acceptor sequence and the nucleotide sequence encoding RAG1 replace the region from chr11:36571366 to chr11:36571369.
In some embodiments, the splice acceptor sequence and the nucleotide sequence encoding RAG1 replace the region from chr11:36569295 to chr11:36569298.
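Replacing a genomic interval with an inserted cassette can be sketched on a toy sequence. The function name `replace_region`, the 14-nt reference string, and the lowercase insert are all hypothetical; coordinates are assumed to be 1-based and inclusive, and the reference string is assumed to start at a known genomic position.

```python
def replace_region(reference: str, reference_start: int,
                   first: int, last: int, insert: str) -> str:
    """Replace genomic positions first..last (1-based, inclusive) with insert.

    'reference' is assumed to cover positions reference_start through
    reference_start + len(reference) - 1.
    """
    reference_end = reference_start + len(reference) - 1
    if not reference_start <= first <= last <= reference_end:
        raise ValueError("interval must lie inside the reference window")
    left = reference[: first - reference_start]          # bases before 'first'
    right = reference[last - reference_start + 1:]       # bases after 'last'
    return left + insert + right


# Toy window starting at position 36569290 (sequence is a placeholder).
toy_reference = "AAAAACCCCGGGGG"
edited = replace_region(toy_reference, 36569290, 36569295, 36569298, "tgca")
print(edited)
```

In this toy case the four bases at positions 36569295 to 36569298 are removed and the lowercase insert takes their place, while the flanking bases on both sides are left unchanged.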
In another aspect, the invention provides a cell comprising a polynucleotide, vector or genome of the invention.
In another aspect, the invention provides a population of cells comprising one or more cells of the invention.
In another aspect, the invention provides a method of gene editing a population of cells comprising delivering a polynucleotide or vector of the invention to a population of cells to obtain a population of gene edited cells. In some embodiments, the method is an ex vivo or in vitro method.
In another aspect, the invention provides a method of treating an immunodeficiency in a subject in need thereof, comprising delivering a polynucleotide or vector of the invention to a population of cells to obtain a population of genetically edited cells and administering the population of genetically edited cells to the subject.
In another aspect, the invention provides a population of genetically edited cells obtainable by the method of the invention.
In another aspect, the invention provides a polynucleotide, vector, guide RNA, kit, composition or gene editing system for treating an immunodeficiency in a subject.
In another aspect, the invention provides a method of treating a subject comprising administering to the subject a cell, population of cells, or population of genetically edited cells of the invention.
In another aspect, the invention provides a method of treating an immunodeficiency in a subject in need thereof, comprising administering to the subject a cell, population of cells, or population of genetically edited cells of the invention.
In another aspect, the invention provides a cell, population of cells or population of genetically edited cells of the invention for use as a medicament.
In another aspect, the invention provides a cell, population of cells or population of genetically engineered cells of the invention for use in treating an immunodeficiency in a subject.
Drawings
FIG. 1. Generation of NALM6 Cas9 and K562 Cas9 cell lines
A) Schematic representation of the gene correction strategy; B) Schematic representation of the protocol used to generate the K562 Cas9 and NALM6 Cas9 cell lines; C) Vector copy number (VCN) of the integrated Cas9-containing cassette, measured by ddPCR; telomerase was used as normalizer; D) Cas9 expression at escalating doses of doxycycline, measured by qPCR in NALM6 Cas9 (left panel) and K562 Cas9 (right panel) cell lines and expressed as fold change relative to actin.
FIG. 2. Selection of the best-performing gRNA
A) Schematic representation of the intronic and exonic sites targeted by the different gRNAs tested; B) Schematic representation of the experimental protocol; C) Percentage of NHEJ-induced insertions/deletions (indels) in K562 Cas9 cells treated with different doses of plasmids encoding the different guides, 7 days post transfection, n=1; D) Percentage of NHEJ-induced indels in NALM6 Cas9 cells treated with different doses of plasmids encoding guides 3, 7 and 9, 7 days post transfection, n=1; E) Percentage of NHEJ-induced indels in NALM6 Cas9 cells treated with different doses of guides 3 and 9 pre-assembled in vitro, 7 days post transfection, n=1.
FIG. 3. Donor DNA optimization
A) RAG1 gene expression measured by RT-qPCR, expressed as fold change relative to RAG1 expression in the 293T cell line; actin was used as normalizer; B) Schematic representation of the different SA_GFP DNA donors tested; C) Schematic representation of the splicing mechanism of the SA_GFP_SD donor; D) Percentage of targeted cells, measured by flow cytometry as GFP+ cells 7 days after transfection; E) GFP expression levels, measured as mean fluorescence intensity (MFI) gated on GFP+ events; F) Representative FlowJo plots. One-way ANOVA with Geisser-Greenhouse correction for multiple comparisons, n=3. P value: *<0.05; **<0.005; ***<0.0005; ****<0.0001. Mean ± SD are shown.
FIG. 4. Off-target analysis
A) Table showing the top 10 off-target sites predicted in silico by the COSMID tool for guide 9. Off-target sequence, PAM type, score, number of mismatches and chromosomal location are shown. B-C) Cleavage efficiency, measured as percentage of NHEJ (D) and of dsDNA tag (ODN) integration at the target site, assessed by RFLP in K562 cells. D-E) Graphs showing coverage of on-target reads (chromosome 11) for guide 9 (D) and guide 7 (E), and of off-target reads (chromosomes 20 and 9) identified with relaxed constraints for guide 7.
FIG. 5. Optimization of the gene editing protocol: targeting efficiency of guide 3
A) Schematic representation of the gene editing protocol; B) Schematic representation of the gating strategy; C) Percentage of NHEJ-induced indels in hCB-CD34+ cells treated with different doses of guides 3 and 9 as in vitro pre-assembled RNPs, n=2; D) Percentage of targeted cells using guide 3, measured by flow cytometry as GFP+ cells within the hCD34+ gate, n=1; E) Percentage of targeted cells using guide 3, measured by flow cytometry as GFP+ cells within the three main hCD34+ cell subsets defined by hCD133 and hCD90 expression, n=1.
FIG. 6. Optimization of the gene editing protocol: targeting efficiency of guide 9
A) Percentage of viable cells, measured by flow cytometry on day 4 as 7AAD-/AnnexinV- cells; B) Total number of cells on day 7, expressed as fold increase compared to day 3; C) Frequency of hCD34+ cells measured by flow cytometry on day 7; D) Distribution of the 3 hCD34+ cell subsets, defined by hCD133 and hCD90 expression, measured by flow cytometry on day 7; E) Frequency of targeted cells on day 7, measured by flow cytometry as GFP+ cells within the 3 hCD34+ cell subsets defined by hCD133 and hCD90 expression; F) Percentage of targeted cells measured by ddPCR on day 7; the telomerase genomic locus was used as normalizer; G) Total number of edited cells on day 7, calculated from the frequency of targeted cells measured by ddPCR. One-way ANOVA with Geisser-Greenhouse correction for multiple comparisons, n=3. P value: *<0.05; **<0.005; ***<0.0005; ****<0.0001. Mean ± SD are shown.
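The ddPCR-based readouts in panels F and G can be illustrated with a small calculation. The exact formulas are not given in this section, so the sketch below rests on assumptions: the percentage of targeted cells is taken as the ratio of target-amplicon copies to reference-locus (telomerase) copies, and the total number of edited cells as the total cell count multiplied by that fraction. Function names and input values are hypothetical.

```python
def percent_targeted(target_copies: float, reference_copies: float) -> float:
    """Targeted percentage as target copies normalized to a reference locus."""
    if reference_copies <= 0:
        raise ValueError("reference copies must be positive")
    if target_copies < 0:
        raise ValueError("target copies cannot be negative")
    return 100.0 * target_copies / reference_copies


def total_edited_cells(total_cells: float, targeted_percent: float) -> float:
    """Estimated number of edited cells from a total count and a percentage."""
    return total_cells * targeted_percent / 100.0


# Invented example values, for illustration only.
pct = percent_targeted(target_copies=250.0, reference_copies=1000.0)
print(pct)
print(total_edited_cells(total_cells=2_000_000, targeted_percent=pct))
```

Whether the ratio should be read per allele or per cell depends on the assay design, which is not described here.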
FIG. 7. In vivo transplantation of gene-edited hCB-CD34+ cells
A) Percentage of targeted cells measured by ddPCR on day 4; the telomerase genomic locus was used as normalizer; B) Engraftment of treated cells, measured by flow cytometry as frequency of hCD45+ cells in peripheral blood (PB); C) Engraftment of targeted cells, measured by flow cytometry as frequency of GFP+ cells within the hCD45+ gate in PB; D, F, H) Frequencies of B cells, T cells and myeloid cells in PB, measured as percentage of hCD19+ cells (D), hCD3+ cells (F) and hCD13+ cells (H) within the hCD45+ gate; E, G, I) Targeted cells within the B cell, T cell and myeloid compartments in PB, measured as GFP+ cells within the hCD19+ gate (E), hCD3+ gate (G) and hCD13+ gate (I); L) Frequency of hCD34+ cells among hCD45+ cells in bone marrow, measured by flow cytometry; M) Frequency of targeted cells in bone marrow, measured by flow cytometry as GFP+ cells among hCD34+ cells; N) Frequency of GFP+ cells measured by flow cytometry 17 weeks after transplantation across the different stages of T cell development in the thymus (according to hCD4 and hCD8 expression), and in peripheral blood and spleen (according to hCD3, hCD4 and hCD8 expression). Mann-Whitney test at 17 weeks post-transplantation. Group size: SA_GFP n=5; PGK_GFP n=4. P value: *<0.05; **<0.005; ***<0.0005; ****<0.0001. Mean ± SD are shown.
FIG. 8. Testing the corrective donor on hMPB-CD34+ cells
A) Schematic representation of the corrective donor; B) Schematic representation of the experimental protocol; C) Percentage of targeted cells measured by ddPCR on day 4 in hCD34+ cell subpopulations sorted according to hCD133 and hCD90 expression; the telomerase genomic region was used as normalizer; D) Total number of cells on day 4, expressed as fold increase compared to day 0. n=3.
FIG. 9. In vivo transplantation of edited hMPB-CD34+ cells from HD and RAG1 patients
A) Schematic representation of the experimental groups; B) Percentage of targeted cells measured by ddPCR on day 4; the telomerase genomic region was used as normalizer; C) Engraftment, measured by flow cytometry as frequency of hCD45+ cells in PB; D) Frequency of targeted cells among human cells, measured by ddPCR in PB 8 weeks after transplantation; the telomerase genomic region was used as normalizer; E) Distribution of immune cells in PB of mice transplanted with treated and untreated HD MPB-CD34+ cells, measured by flow cytometry according to the expression of hCD19, hCD3 and hCD13 within the hCD45+ gate; F) Distribution of immune cells in PB of mice transplanted with treated and untreated MPB-CD34+ cells derived from RAG1 patients, measured by flow cytometry according to the expression of hCD19, hCD3 and hCD13 within the hCD45+ gate; G, H) Analysis of human engraftment and targeting efficiency in bone marrow (G) and spleen (H); human engraftment was measured by flow cytometry as frequency of hCD45+ cells (left panels), and targeting efficiency was measured by ddPCR as HDR (right panels). Mean ± SD are shown.
FIG. 10. Multiparameter analysis of hMPB-CD34+ cells from HD and RAG1 patients before and after the gene editing procedure.
A, B) Analysis of HSPC composition by flow cytometry, performed on MPB-CD34+ cells derived from healthy donors (HD, A) and RAG1 patients (Pt, B). Analysis was performed 1 day before the amplification stage (day 3) and after the gene editing protocol (GE). Untreated cells (UT) were also analyzed on the same day as the edited cells. The graphs show the 20 subsets analyzed within the lineage-negative (Lin-) CD34+ gate, including: hematopoietic stem cells (HSC), multipotent progenitor cells (MPP), multi-lymphoid progenitor cells (MLP), early T progenitor cells (ETP), B and NK cell precursors (Pre-B/NK), common myeloid progenitor cells (CMP), granulocyte-monocyte progenitor cells (GMP), megakaryocyte-erythroid progenitor cells (MEP), megakaryocyte progenitor cells (MKp) and erythroid progenitor cells (EP).
FIG. 11. Donor screening for RAG1 editing.
A) Schematic representation of the donor constructs. HA_L, left homology arm; HA_R, right homology arm; SA, splice acceptor; SD, splice donor; BGHpA, bovine growth hormone polyA; WPRE, woodchuck hepatitis virus posttranscriptional regulatory element; IRES, internal ribosome entry site sequence; PEST, proline (P), glutamic acid (E), serine (S) and threonine (T). B) Schematic representation of the experimental protocol. C) GFP expression levels measured by flow cytometry over time, shown as mean fluorescence intensity (MFI) gated on GFP+ events (d, days post-editing). D) Modulation of GFP expression in serum-starved cells, shown as the ratio of the GFP MFI of starved cells (-FBS) to the GFP MFI of non-starved cells (+FBS) (1 representative experiment out of 3).
Figure 12 effects of edit enhancers on HDR efficiency of RAG1 locus.
A) Schematic representations of the gene editing protocol (upper panels) and the artificial thymus organoid protocol (ATO) (lower panels). B) HDR efficiency is shown as the percentage of edited alleles measured by ddPCR 7 days after editing; c) hCD34 by flow cytometry 7 days after editing + GFP between subsets + Cell-measured frequency of targeted cells; d) Analysis of HSPC composition by flow cytometry on MPB or BM CD34 derived from healthy donors + Performed in cells. Analysis was performed prior to the amplification stage (day 0) and 1 day after the gene editing protocol (GE, day 4). Untreated cells (UTs) were also analyzed on the same day as the edited cells. The graph shows the effect of the negative (Lin - )CD34 + 20 subtypes analyzed in gating, including: hematopoietic Stem Cells (HSCs), multipotent progenitor cells (MPPs), multiple lymphoid progenitor cells (MLPs), early T progenitor cells (ETPs), B and NK cell precursors (Pre-B/NK), common myeloid progenitor Cells (CMP), granulocyte-monocyte progenitor cells (GMPs), megakaryocyte-erythroid progenitor cells (MEPs), megakaryocyte progenitor cells (MKp) and erythroid progenitor cells (EP).
FIG. 13 effects of edit enhancers on T cell differentiation potential.
Representative images of Artificial Thymus Organoids (ATO) 4 weeks after ATO inoculation with untreated cells (UT) or edited cells with or without HDR enhancers. B) Total number of cells harvested from ATO 4 weeks after ATO inoculation. C) HDR efficiency is shown as the percentage of edited allele measured in T cells differentiated in subjects by ddPCR 4 weeks after ATO inoculation. D) HDR efficiency was measured by flow cytometry as the percentage of gfp+ cells within different T cell subsets 4 weeks after ATO inoculation.
FIG. 14 Donor constructs for the intron correction strategy.
Schematic representations of the SA_coRAG1 CDS_BGHpA (A) and SA_coRAG1 CDS_SD (B) donor templates for the intron correction strategy. HA, homology arm; SA, splice acceptor; SD, splice donor; coRAG1 CDS, codon-optimized RAG1 coding sequence; BGHpA, bovine growth hormone polyA; Ex, exon; gRNA, guide RNA; 3'UTR, 3' untranslated region; HDR, homology-directed repair.
FIG. 15 Comparison of corrective donors in NALM6.Rag1KO cells.
(A) Schematic representation of the experiment performed to compare the correction efficiency of two donors: SA_coRAG1 CDS_BGHpA and SA_coRAG1 CDS_SD. (B) RAG1 CDS expression was assessed by RT-qPCR in various edited NALM6.Rag1KO clones and is expressed relative to the housekeeping gene β-actin. (C) Recombination activity was assessed by flow cytometry 7 days after serum starvation as the proportion of GFP+ cells, gated on transduced cells.
FIG. 16 Comparison of corrective donors in HD HSPCs.
(A) Hematopoietic stem and progenitor cells (HSPCs) were edited with guide 9 and Cas9, delivered as RNP, in combination with the SA_coRAG1 CDS_BGHpA or SA_coRAG1 CDS_SD donor. 4 days after editing, the proportion of edited alleles was assessed by ddPCR in bulk HSPCs. (B) The proportion of edited alleles was assessed by ddPCR in HSPC subsets isolated by cell sorting. (C) Kinetics of cell growth in untreated (UT) or edited HSPCs according to the indicated donor, dose and days after gene editing (GE). (D) Colony-forming unit (CFU) assays were performed on untreated or edited HSPCs by counting the number of red (erythroid), white (myeloid) and mixed colonies under a microscope 14 days after plating. (E) Distribution of CD34+ cell subsets and CD34- cells, measured by flow cytometry based on hCD133 and hCD90 expression 4 days after editing. (F) Representative plots of the T cell differentiation stages analyzed by flow cytometry 7 weeks after ATO seeding. (G) HDR efficiency was measured by flow cytometry 6 weeks after ATO seeding as the proportion of edited alleles in bulk CD4+CD8+ double-positive (DP) cells and CD4-CD8- double-negative (DN) cells.
Detailed Description
It must be noted that, as used herein and in the appended claims, the singular forms "a," "an," and "the" include plural referents unless the context clearly dictates otherwise.
As used herein, the term "comprising" is synonymous with "including" or "containing", is inclusive or open-ended, and does not exclude additional, non-recited members, elements or method steps. The term "comprising" also includes the term "consisting of".
Numerical ranges are inclusive of the numbers defining the range. Unless otherwise indicated, any nucleic acid sequence is written left to right in the 5' to 3' direction, and any amino acid sequence is written left to right in the amino to carboxyl direction.
All cited genomic positions are based on the human genome assembly GRCh38.p13 (GCF_000001405.39). Those skilled in the art will be able to identify the corresponding genomic positions in alternative genome assemblies and convert the listed genomic positions accordingly. For example, RAG1 is located at chr 11:36510353-36579762 in assembly GRCh38.p13 and at chr 11:36532053-36601312 in assembly GRCh37.p13.
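As an illustration of why positions need to be converted rather than shifted, the following minimal Python sketch parses the two RAG1 intervals quoted above and compares them. The helper names (`Region`, `parse_region`, `length`) are hypothetical, and the coordinates are assumed to be 1-based and inclusive. The sketch only shows that the two intervals differ in span, so a single fixed offset cannot reproduce both end points; it is not a conversion method.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int  # assumed 1-based, inclusive
    end: int    # assumed 1-based, inclusive


def parse_region(text: str) -> Region:
    """Parse strings such as 'chr 11:36510353-36579762' (spaces tolerated)."""
    match = re.fullmatch(r"chr\s*(\w+)\s*:\s*(\d+)\s*-\s*(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised region: {text!r}")
    return Region(match.group(1), int(match.group(2)), int(match.group(3)))


def length(region: Region) -> int:
    """Number of positions covered by an inclusive interval."""
    return region.end - region.start + 1


rag1_grch38 = parse_region("chr 11:36510353-36579762")
rag1_grch37 = parse_region("chr 11:36532053-36601312")

# Shift of each end point when moving from GRCh38.p13 to GRCh37.p13.
start_shift = rag1_grch37.start - rag1_grch38.start
end_shift = rag1_grch37.end - rag1_grch38.end

print(length(rag1_grch38), length(rag1_grch37), start_shift, end_shift)
```

Because the two shifts are not equal, a dedicated coordinate-conversion tool would normally be used to map individual positions between assemblies.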
The publications discussed herein are provided solely for their disclosure prior to the filing date of the present application. Nothing herein is to be construed as an admission that such publication forms the prior art with respect to the appended claims.
Recombination activating gene 1 (RAG1)
The present invention relates to methods for gene editing of cells to introduce RAG1 polypeptides, for example as a treatment for severe combined immunodeficiency. The invention also relates to polynucleotides, vectors, guide RNAs, kits, compositions and gene editing systems for use in the methods, as well as genomes and cells obtained or obtainable by the methods.
"RAG1" is an abbreviation for the polypeptide encoded by recombination activating gene 1, which is also known as RAG-1, RNF74 and recombination activating 1.
RAG1 is the catalytic component of the RAG complex, a multiprotein complex that mediates the DNA cleavage phase during V(D)J recombination. V(D)J recombination assembles a diverse repertoire of immunoglobulin and T cell receptor genes in developing B and T lymphocytes through the rearrangement of different V (variable), in some cases D (diversity), and J (joining) gene segments. In the RAG complex, RAG1 mediates DNA binding to the conserved recombination signal sequences (RSS) and catalyzes DNA cleavage by introducing a double-strand break between the RSS and the adjacent coding segment. RAG2 is not a catalytic component, but is required for all known catalytic activities.
A "RAG1 polypeptide" is a polypeptide having RAG1 activity, such as a polypeptide that is capable of forming a RAG complex, mediating DNA binding to RSS, and introducing a double strand break between the RSS and adjacent coding segments. Suitably, the RAG1 polypeptide may have the same or similar activity as the wild-type RAG1, e.g. may have an activity of at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 100%, at least 110%, at least 120%, at least 130%, at least 140% or at least 150% of the wild-type RAG1 polypeptide.
The RAG1 polypeptide can be a fragment of RAG1 and/or a RAG1 variant.
A "fragment of RAG1" may refer to a portion or region of a full-length RAG1 polypeptide that has the same or similar activity as the full-length RAG1 polypeptide, i.e., the fragment may be a functional fragment. Fragments may have an activity of at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95% or 100% of the full-length RAG1 polypeptide. Those skilled in the art will be able to generate fragments based on the known structural and functional features of RAG1. These are described, for example, in Arbuckle, J.L., et al., 2011. BMC Biochemistry, 12(1), p.23; Ru, H., et al., 2015. Cell, 163(5), pp.1138-1152; and Kim, M.S., et al., 2015. Nature, 518(7540), pp.507-511.
The minimal regions of RAG1 required for catalysis have been identified. These regions are referred to as the core protein. Core RAG1 is composed of multiple domains, called the nonamer-binding domain (NBD; residues 389-464), the central domain (residues 528-760) and the C-terminal domain (residues 761-980). In addition to its ability to recognize the RSS nonamer and heptamer via the NBD and the central domain, respectively, core RAG1 also contains the essential acidic active site residues (Arbuckle, J.L., et al., 2011. BMC Biochemistry, 12(1), p.23). Suitably, the fragment of RAG1 comprises the nonamer-binding domain, the central domain and/or the C-terminal domain.
A "RAG1 variant" may comprise an amino acid sequence or a nucleotide sequence which may be at least 50%, at least 55%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85% or at least 90% identical, optionally at least 95% or at least 97% or at least 99% identical to a wild-type RAG1 polypeptide. The RAG1 variant may have the same or similar activity as the wild-type RAG1 polypeptide, e.g. may have an activity of at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 100%, at least 110%, at least 120%, at least 130%, at least 140% or at least 150% of the wild-type RAG1 polypeptide. Those skilled in the art will be able to generate RAG1 variants based on known structural and functional features of RAG1 and/or using conservative substitutions.
The gene encoding RAG1 (NCBI gene ID: 5896) is located in the human genome at chr 11:36510353-36579762.
Several alternative mRNAs are transcribed from the RAG1 gene. Transcript variant 1 (NM_000448) has two exons and one intron. As used herein, the region of the RAG1 gene corresponding to the first exon of transcript variant 1 is referred to as "RAG1 exon 1", the region of the RAG1 gene corresponding to the intron of transcript variant 1 is referred to as "RAG1 intron 1", and the region of the RAG1 gene corresponding to the second exon (which encodes the RAG1 polypeptide) is referred to as "RAG1 exon 2".
Suitably, RAG1 exon 1 is from chr 11:36568006 to chr 11:36568122; RAG1 intron 1 is from chr 11:36568123 to chr 11:36573290; and/or RAG1 exon 2 is from chr 11:36573291 to chr 11:36579762.
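The three intervals above can be checked for internal consistency with a short Python sketch. It assumes 1-based, inclusive coordinates on GRCh38.p13; the names `RAG1_REGIONS` and `region_length` are hypothetical and introduced only for illustration.

```python
# Coordinates as listed above (chr 11, GRCh38.p13); assumed 1-based, inclusive.
RAG1_REGIONS = {
    "exon 1": (36568006, 36568122),
    "intron 1": (36568123, 36573290),
    "exon 2": (36573291, 36579762),
}


def region_length(start: int, end: int) -> int:
    """Length in bp of an inclusive interval."""
    return end - start + 1


lengths = {name: region_length(*bounds) for name, bounds in RAG1_REGIONS.items()}

# Consecutive regions should abut without gaps or overlaps.
ordered = list(RAG1_REGIONS.values())
contiguous = all(
    ordered[i][1] + 1 == ordered[i + 1][0] for i in range(len(ordered) - 1)
)

for name, bp in lengths.items():
    print(f"RAG1 {name}: {bp} bp")
print("contiguous:", contiguous)
```

Under these assumptions the three regions tile a single uninterrupted span from the start of exon 1 to the end of exon 2.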
Suitably, RAG1 exon 1 consists of the nucleotide sequence of SEQ ID NO. 1 or a variant thereof; RAG1 intron 1 consists of the nucleotide sequence of SEQ ID NO. 2 or a variant thereof; and/or RAG1 exon 2 consists of the nucleotide sequence of SEQ ID NO. 3 or a variant thereof.
Illustrative RAG1 exon 1 (SEQ ID NO: 1)
Illustrative RAG1 intron 1 (SEQ ID NO: 2)
Illustrative RAG1 exon 2 (SEQ ID NO: 3)
In the illustrative RAG1 exon 2 (SEQ ID NO: 3), capital letters indicate the nucleotide sequence encoding the RAG1 polypeptide.
RAG1 polypeptides
The RAG1 polypeptide may be a human RAG1 polypeptide. Suitably, the RAG1 polypeptide may comprise or consist of the polypeptide sequence of UniProtKB accession number P15918, or a fragment or variant thereof.
In some embodiments of the invention, the RAG1 polypeptide comprises or consists of an amino acid sequence that is at least 70% identical to SEQ ID NO. 4 or a fragment thereof. Suitably, the RAG1 polypeptide comprises or consists of an amino acid sequence that is at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% identical to SEQ ID NO. 4 or a fragment thereof.
In some embodiments, the RAG1 polypeptide comprises or consists of SEQ ID NO. 4 or a fragment thereof.
RAG1 polypeptide isoform 1, UniProtKB accession No. P15918 (SEQ ID NO: 4)
In some embodiments of the invention, the RAG1 polypeptide comprises or consists of an amino acid sequence that is at least 70% identical to SEQ ID NO. 5 or a fragment thereof. Suitably, the RAG1 polypeptide comprises or consists of an amino acid sequence that is at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% identical to SEQ ID NO. 5 or a fragment thereof.
In some embodiments, the RAG1 polypeptide comprises or consists of SEQ ID NO. 5 or a fragment thereof.
RAG1 polypeptide isoform 2, UniProtKB accession No. P15918 (SEQ ID NO: 5)
RAG1 polynucleotides
The nucleotide sequence encoding a RAG1 polypeptide may be codon optimized. Suitably, the nucleotide sequence encoding a RAG1 polypeptide may be codon optimized for expression in human cells.
Different cells differ in their usage of particular codons. This codon bias corresponds to a bias in the relative abundance of particular tRNAs in the cell type. By altering the codons in a sequence so that they are tailored to match the relative abundance of the corresponding tRNAs, expression can be increased. By the same token, expression can be decreased by deliberately choosing codons for which the corresponding tRNAs are known to be rare in the particular cell type. Thus, an additional degree of translational control is available. Codon usage tables are known in the art for mammalian cells (e.g., human cells), as well as for a variety of other organisms.
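The principle of synonymous recoding can be sketched as follows. This is a toy Python example, not the procedure used to obtain SEQ ID NO: 6: the codon table covers only five amino acids, the `PREFERRED` mapping is an illustrative assumption rather than a measured codon usage table, and the function names are hypothetical.

```python
# Standard genetic code, restricted to the five amino acids used in this toy.
CODONS = {
    "M": ["ATG"],
    "L": ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"],
    "K": ["AAA", "AAG"],
    "E": ["GAA", "GAG"],
    "A": ["GCT", "GCC", "GCA", "GCG"],
}

# Illustrative assumption: one "preferred" synonymous codon per amino acid.
# A real workflow would take these from a codon usage table for the host cell.
PREFERRED = {"M": "ATG", "L": "CTG", "K": "AAG", "E": "GAG", "A": "GCC"}

CODON_TO_AA = {codon: aa for aa, codons in CODONS.items() for codon in codons}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence using the restricted table."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna: str) -> str:
    """Replace every codon with the preferred synonymous codon."""
    return "".join(PREFERRED[aa] for aa in translate(dna))


original = "ATGTTAAAAGAAGCT"   # encodes M-L-K-E-A
optimized = recode(original)   # same peptide, different codons
print(original, "->", optimized, translate(optimized))
```

The nucleotide sequence changes while the encoded amino acid sequence is unchanged, which is the property a codon-optimized coding sequence is expected to retain.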
In some embodiments of the invention, the nucleotide sequence encoding a RAG1 polypeptide comprises or consists of a nucleotide sequence that is at least 70% identical to SEQ ID NO. 6 or a fragment thereof. Suitably, the nucleotide sequence encoding a RAG1 polypeptide comprises or consists of a nucleotide sequence that is at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% identical to SEQ ID NO. 6 or a fragment thereof.
In some embodiments of the invention, the nucleotide sequence encoding a RAG1 polypeptide comprises or consists of the nucleotide sequence SEQ ID NO. 6 or a fragment thereof.
Exemplary nucleotide sequence encoding RAG1 polypeptide (SEQ ID NO: 6)
Polynucleotides and genomes
In one aspect, the invention provides a polynucleotide comprising, from 5' to 3': a first homologous region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, and a second homologous region. The polynucleotide may be an isolated polynucleotide. The polynucleotide may be a DNA molecule, such as a double-stranded DNA molecule.
Suitably, the polynucleotides of the invention may be limited in size to fit into a vector (e.g., an adeno-associated virus (AAV) vector, such as AAV6). Suitably, the total size of the polynucleotide of the present invention may be 5.0 kb or less, 4.9 kb or less, 4.8 kb or less, 4.7 kb or less, 4.6 kb or less, 4.5 kb or less, 4.4 kb or less, 4.3 kb or less, 4.2 kb or less, 4.1 kb or less, or 4.0 kb or less. In some embodiments, the polynucleotides of the invention are 4.1 kb or less in size or 4.0 kb or less in size.
In another aspect, the invention provides a genome comprising a splice acceptor sequence and a nucleotide sequence encoding a RAG1 polypeptide. Suitably, the genome may comprise a polynucleotide of the invention. The genome may be an isolated genome. The genome may be a mammalian genome, such as a human genome.
Homologous regions
A "homology region" (also referred to as a "homology arm") is a nucleotide sequence located upstream or downstream of the nucleotide sequence to be inserted (the "nucleotide sequence insert", e.g., the splice acceptor sequence and the nucleotide sequence encoding RAG1). The polynucleotides of the invention comprise two homologous regions, one upstream of the nucleotide sequence insert (the "first homologous region") and one downstream of the nucleotide sequence insert (the "second homologous region").
Each "homology region" is designed such that the nucleotide sequence insert can be introduced into the genome at a double-strand break (DSB) site by homology-directed repair (HDR). Those skilled in the art will be able to design homology arms based on the desired insertion site (i.e., the DSB site) (see, e.g., Ran, F.A., et al., 2013. Nature Protocols, 8(11), pp.2281-2308). Each "homology region" is homologous to a region on either side of the DSB. For example, the first homologous region may be homologous to a region upstream of the DSB and the second homologous region may be homologous to a region downstream of the DSB.
As used herein, the term "homologous" refers to nucleotide sequences that are similar or identical. For example, the nucleotide sequences may be at least 70% identical, at least 75% identical, at least 80% identical, at least 85% identical, at least 90% identical, at least 95% identical, at least 98% identical, at least 99% identical, or 100% identical.
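A minimal sketch of how a percent identity value could be computed and compared against the thresholds listed above is given below. It assumes the two sequences have already been aligned to equal length (with "-" marking gaps) and that identity is expressed relative to the alignment length; both are assumptions of this example, since other denominators are also in common use. The function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Gap characters ('-') never count as matches. The denominator is the
    alignment length, which is an assumption of this sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to equal length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Identity thresholds listed in the definition of "homologous".
THRESHOLDS = (70, 75, 80, 85, 90, 95, 98, 99, 100)


def thresholds_met(identity: float) -> list:
    """Return every listed threshold that the identity value satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
print(identity, thresholds_met(identity))
```

In practice the alignment itself would be produced by a sequence alignment program; only the counting step is shown here.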
As used herein, "upstream" and "downstream" both refer to relative positions in DNA or RNA. Each strand of DNA or RNA has a 5' end and a 3' end, and "upstream" and "downstream" are conventionally defined relative to the 5' to 3' direction in which RNA transcription occurs. For example, when double-stranded DNA is considered, "upstream" is toward the 5' end of the coding strand of the relevant gene (e.g., RAG1), and "downstream" is toward the 3' end of the coding strand of the relevant gene (e.g., RAG1).
The homology region may be any length suitable for HDR. The length of the homologous regions may be the same or different. Suitably, the homologous regions are each independently 50-1000bp in length, 100-500bp in length or 200-400bp in length. For example, the first homologous region may be 50-1000bp in length and homologous to a region upstream of the DSB, and the second homologous region may be 50-1000bp in length and homologous to a region downstream of the DSB.
In some embodiments:
(i) The first homologous region is homologous to a first region of RAG1 intron 1 and the second homologous region is homologous to a second region of RAG1 intron 1; or
(ii) The first homologous region is homologous to a first region of RAG1 intron 1 or RAG1 exon 2, and the second homologous region is homologous to a second region of RAG1 exon 2.
In some embodiments, the first homologous region is homologous to a first region of RAG1 intron 1 and the second homologous region is homologous to a second region of RAG1 intron 1.
In some embodiments:
(i) The first homologous region is homologous to a region upstream of chr 11:36569295 and the second homologous region is homologous to a region downstream of chr 11:36569298;
(ii) The first homologous region is homologous to a region upstream of chr 11:36573790 and the second homologous region is homologous to a region downstream of chr 11:36573793;
(iii) The first homologous region is homologous to a region upstream of chr 11:36573641 and the second homologous region is homologous to a region downstream of chr 11:36573644;
(iv) The first homologous region is homologous to a region upstream of chr 11:36573351 and the second homologous region is homologous to a region downstream of chr 11:36573354;
(v) The first homologous region is homologous to a region upstream of chr 11:36569080 and the second homologous region is homologous to a region downstream of chr 11:36569083;
(vi) The first homologous region is homologous to a region upstream of chr 11:36572472 and the second homologous region is homologous to a region downstream of chr 11:36572475;
(vii) The first homologous region is homologous to a region upstream of chr 11:36571458 and the second homologous region is homologous to a region downstream of chr 11:36571461;
(viii) The first homologous region is homologous to a region upstream of chr 11:36571366 and the second homologous region is homologous to a region downstream of chr 11:36571369;
(ix) The first homologous region is homologous to a region upstream of chr 11:36572859 and the second homologous region is homologous to a region downstream of chr 11:36572862;
(x) The first homologous region is homologous to a region upstream of chr 11:36571457 and the second homologous region is homologous to a region downstream of chr 11:36571460;
(xi) The first homologous region is homologous to a region upstream of chr 11:36569351 and the second homologous region is homologous to a region downstream of chr 11:36569354; or
(xii) The first homologous region is homologous to a region upstream of chr 11:36572375 and the second homologous region is homologous to a region downstream of chr 11:36572378.
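To relate the twelve options above to the intron 1 and exon 2 boundaries given earlier, the following Python sketch assigns each upstream reference position to the region that contains it. The region bounds and positions are copied from the text; the dictionary and function names are hypothetical, and coordinates are assumed to be 1-based and inclusive.

```python
# Boundaries stated earlier for RAG1 (chr 11, GRCh38.p13).
INTRON_1 = (36568123, 36573290)
EXON_2 = (36573291, 36579762)

# Upstream reference position of each option (i)-(xii) listed above.
UPSTREAM_REFERENCE = {
    "i": 36569295, "ii": 36573790, "iii": 36573641, "iv": 36573351,
    "v": 36569080, "vi": 36572472, "vii": 36571458, "viii": 36571366,
    "ix": 36572859, "x": 36571457, "xi": 36569351, "xii": 36572375,
}


def locate(position: int) -> str:
    """Name the RAG1 region that contains a chr 11 position."""
    if INTRON_1[0] <= position <= INTRON_1[1]:
        return "intron 1"
    if EXON_2[0] <= position <= EXON_2[1]:
        return "exon 2"
    return "outside intron 1 / exon 2"


location_by_option = {
    option: locate(position) for option, position in UPSTREAM_REFERENCE.items()
}

for option, location in location_by_option.items():
    print(f"({option}) {location}")
```

Under these assumptions the listed positions fall into two groups, one within RAG1 intron 1 and one within RAG1 exon 2, which matches the two alternatives set out in the preceding embodiments.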
In some embodiments:
(i) The first homologous region is homologous to a region upstream of chr 11:36569295 and the second homologous region is homologous to a region downstream of chr 11:36569298;
(ii) The first homologous region is homologous to a region upstream of chr 11:36573351 and the second homologous region is homologous to a region downstream of chr 11:36573354; or
(iii) The first homologous region is homologous to a region upstream of chr 11:36571366 and the second homologous region is homologous to a region downstream of chr 11:36571369.
In some embodiments, the first homologous region is homologous to a region upstream of chr 11:36569295 and the second homologous region is homologous to a region downstream of chr 11:36569298.
In some embodiments:
(i) The first homologous region is homologous to a region comprising chr 11:36569245-36569294 and the second homologous region is homologous to a region comprising chr 11:36569299-36569348;
(ii) The first homologous region is homologous to a region comprising chr 11:36573740-36573789, and the second homologous region is homologous to a region comprising chr 11:36573794-36573843;
(iii) The first homologous region is homologous to a region comprising chr 11:36573591-36573640 and the second homologous region is homologous to a region comprising chr 11:36573645-36573694;
(iv) The first homologous region is homologous to a region comprising chr 11:36573301-36573350, and the second homologous region is homologous to a region comprising chr 11:36573355-36573404;
(v) The first homologous region is homologous to a region comprising chr 11:36569030-36569079 and the second homologous region is homologous to a region comprising chr 11:36569084-36569133;
(vi) The first homologous region is homologous to a region comprising chr 11:36572422-36572471 and the second homologous region is homologous to a region comprising chr 11:36572476-36572525;
(vii) The first homologous region is homologous to a region comprising chr 11:36571408-36571457 and the second homologous region is homologous to a region comprising chr 11:36571462-36571511;
(viii) The first homologous region is homologous to a region comprising chr 11:36571316-36571365, and the second homologous region is homologous to a region comprising chr 11:36571370-36571419;
(ix) The first homologous region is homologous to a region comprising chr 11:36572809-36572858, and the second homologous region is homologous to a region comprising chr 11:36572863-36572912;
(x) The first homologous region is homologous to a region comprising chr 11:36571407-36571456 and the second homologous region is homologous to a region comprising chr 11:36571461-36571510;
(xi) The first homologous region is homologous to a region comprising chr 11:36569301-36569350 and the second homologous region is homologous to a region comprising chr 11:36569355-36569404; or
(xii) The first homologous region is homologous to a region comprising chr 11:36572325-36572374 and the second homologous region is homologous to a region comprising chr 11:36572379-36572428.
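The intervals in this list appear to follow a simple pattern relative to the upstream and downstream positions given in the earlier embodiments: 50 bp ending immediately before the upstream position and 50 bp starting immediately after the downstream position. The Python sketch below reproduces that arithmetic. The function name and the default window size are assumptions inferred from the listed coordinates, not a statement of how the regions were designed.

```python
from typing import Tuple

Interval = Tuple[int, int]  # assumed 1-based, inclusive


def flanking_windows(
    upstream_of: int, downstream_of: int, size: int = 50
) -> Tuple[Interval, Interval]:
    """Return the windows flanking a pair of reference positions.

    The first window ends one base before `upstream_of`; the second
    starts one base after `downstream_of`. Both span `size` bases.
    """
    first = (upstream_of - size, upstream_of - 1)
    second = (downstream_of + 1, downstream_of + size)
    return first, second


# Option (i): upstream of chr 11:36569295, downstream of chr 11:36569298.
first_i, second_i = flanking_windows(36569295, 36569298)
# Option (xii): upstream of chr 11:36572375, downstream of chr 11:36572378.
first_xii, second_xii = flanking_windows(36572375, 36572378)

print(first_i, second_i)
print(first_xii, second_xii)
```

Because each embodiment recites a region "comprising" the stated interval, these 50 bp windows are best read as a minimum; a homologous region may extend further away from the reference positions.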
In some embodiments:
(i) The first homologous region is homologous to a region comprising chr 11:36569245-36569294 and the second homologous region is homologous to a region comprising chr 11:36569299-36569348;
(ii) The first homologous region is homologous to a region comprising chr 11:36573301-36573350, and the second homologous region is homologous to a region comprising chr 11:36573355-36573404; or
(iii) The first homologous region is homologous to a region comprising chr 11:36571316-36571365 and the second homologous region is homologous to a region comprising chr 11:36571370-36571419.
In some embodiments, the first homologous region is homologous to a region comprising chr 11:36569245-36569294 and the second homologous region is homologous to a region comprising chr 11:36569299-36569348.
In some embodiments, the first homologous region is homologous to a region comprising chr 11:36573740-36573789, and the second homologous region is homologous to a region comprising chr 11:36573794-36573843.
In some embodiments, the first homologous region is homologous to a region comprising chr 11:36573591-36573640, and the second homologous region is homologous to a region comprising chr 11:36573645-36573694.
In some embodiments, the first homologous region is homologous to a region comprising chr 11:36573301-36573350, and the second homologous region is homologous to a region comprising chr 11:36573355-36573404.
In some embodiments, the first homologous region is homologous to a region comprising chr 11:36569030-36569079 and the second homologous region is homologous to a region comprising chr 11:36569084-36569133.
In some embodiments, the first homologous region is homologous to a region comprising chr 11:36572422-36572471, and the second homologous region is homologous to a region comprising chr 11:36572476-36572525.
In some embodiments, the first homologous region is homologous to a region comprising chr 11:36571408-36571457, and the second homologous region is homologous to a region comprising chr 11:36571462-36571511.
In some embodiments, the first homologous region is homologous to a region comprising chr 11:36571316-36571365, and the second homologous region is homologous to a region comprising chr 11:36571370-36571419.
In some embodiments, the first homologous region is homologous to a region comprising chr 11:36572809-36572858, and the second homologous region is homologous to a region comprising chr 11:36572863-36572912.
In some embodiments, the first homologous region is homologous to a region comprising chr 11:36571407-36571456, and the second homologous region is homologous to a region comprising chr 11:36571461-36571510.
In some embodiments, the first homologous region is homologous to a region comprising chr 11:36569301-36569350 and the second homologous region is homologous to a region comprising chr 11:36569355-36569404.
In some embodiments, the first homologous region is homologous to a region comprising chr 11:36572325-36572374, and the second homologous region is homologous to a region comprising chr 11:36572379-36572428.
Exemplary homology regions are shown in table 1 below.
In some embodiments, the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to any of SEQ ID NOS: 7-18 and/or the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to any of SEQ ID NOS: 19-30.
TABLE 1 exemplary homology regions
Preferably, the first and second homologous regions comprise or consist of nucleotide sequences having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to the first and second homologous regions in the same row of table 1. Suitably, the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to any one of SEQ ID NOS: 7-18, and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to the corresponding nucleotide sequence in Table 1 (i.e., SEQ ID NOS: 19-30). For example, in some embodiments:
(i) The first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 7, and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 19;
(ii) The first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 8, and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 20;
(iii) The first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 9, and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 21;
(iv) The first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 10 and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 22;
(v) The first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 11, and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 23;
(vi) The first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 12 and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 24;
(vii) The first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 13, and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 25;
(viii) The first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 14, and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 26;
(ix) The first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 15, and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 27;
(x) The first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 16, and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 28;
(xi) The first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 17, and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 29; or
(xii) The first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 18, and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 30.
In some embodiments:
(i) The first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 7, and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 19;
(ii) The first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 10, and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 22; or
(iii) The first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 14 and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 26.
In some embodiments, the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID NO. 7, and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID NO. 19.
In some embodiments, the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID NO. 8, and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID NO. 20.
In some embodiments, the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID NO. 9, and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID NO. 21.
In some embodiments, the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID NO. 10, and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID NO. 22.
In some embodiments, the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID NO. 11, and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID NO. 23.
In some embodiments, the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID NO. 12, and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID NO. 24.
In some embodiments, the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID NO. 13, and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID NO. 25.
In some embodiments, the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID NO. 14, and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID NO. 26.
In some embodiments, the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID NO. 15, and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID NO. 27.
In some embodiments, the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID NO. 16, and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID NO. 28.
In some embodiments, the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID NO. 17, and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID NO. 29.
In some embodiments, the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID NO. 18, and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID NO. 30.
In some embodiments, the first homologous region comprises or consists of a nucleotide sequence having at least 98% identity to SEQ ID NO. 7, and the second homologous region comprises or consists of a nucleotide sequence having at least 98% identity to SEQ ID NO. 19.
In some embodiments, the first homologous region comprises or consists of the nucleotide sequence of SEQ ID NO. 7 and the second homologous region comprises or consists of the nucleotide sequence of SEQ ID NO. 19.
In some embodiments, the 3' terminal sequence of the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to any of SEQ ID NOS: 7-18 and/or the 5' terminal sequence of the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to any of SEQ ID NOS: 19-30.
Suitably, the 3' terminal sequence of the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to any of SEQ ID NOS: 7-18, and the 5' terminal sequence of the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to the corresponding nucleotide sequence in Table 1 (i.e. SEQ ID NOS: 19-30).
For example, in some embodiments:
(i) The 3' terminal sequence of the first homology region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 7, and the 5' terminal sequence of the second homology region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 19;
(ii) The 3' terminal sequence of the first homology region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 8, and the 5' terminal sequence of the second homology region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 20;
(iii) The 3' terminal sequence of the first homology region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 9, and the 5' terminal sequence of the second homology region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 21;
(iv) The 3' terminal sequence of the first homology region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 10, and the 5' terminal sequence of the second homology region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 22;
(v) The 3' terminal sequence of the first homology region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 11, and the 5' terminal sequence of the second homology region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 23;
(vi) The 3' terminal sequence of the first homology region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 12, and the 5' terminal sequence of the second homology region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 24;
(vii) The 3' terminal sequence of the first homology region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 13, and the 5' terminal sequence of the second homology region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 25;
(viii) The 3' terminal sequence of the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 14, and the 5' terminal sequence of the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 26;
(ix) The 3' terminal sequence of the first homology region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 15, and the 5' terminal sequence of the second homology region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 27;
(x) The 3' terminal sequence of the first homology region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 16, and the 5' terminal sequence of the second homology region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 28;
(xi) The 3' terminal sequence of the first homology region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 17, and the 5' terminal sequence of the second homology region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 29; or
(xii) The 3' terminal sequence of the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 18, and the 5' terminal sequence of the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 30.
In some embodiments:
(i) The 3' terminal sequence of the first homology region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 7, and the 5' terminal sequence of the second homology region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 19;
(ii) The 3' terminal sequence of the first homology region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 10, and the 5' terminal sequence of the second homology region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 22; or
(iii) The 3' terminal sequence of the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 14, and the 5' terminal sequence of the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 26.
In some embodiments, the 3' terminal sequence of the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 7, and the 5' terminal sequence of the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 19.
In some embodiments, the 3' terminal sequence of the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 8, and the 5' terminal sequence of the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 20.
In some embodiments, the 3' terminal sequence of the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 9, and the 5' terminal sequence of the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 21.
In some embodiments, the 3' terminal sequence of the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 10, and the 5' terminal sequence of the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 22.
In some embodiments, the 3' terminal sequence of the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 11, and the 5' terminal sequence of the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 23.
In some embodiments, the 3' terminal sequence of the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 12, and the 5' terminal sequence of the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 24.
In some embodiments, the 3' terminal sequence of the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 13, and the 5' terminal sequence of the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 25.
In some embodiments, the 3' terminal sequence of the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 14, and the 5' terminal sequence of the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 26.
In some embodiments, the 3' terminal sequence of the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 15, and the 5' terminal sequence of the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 27.
In some embodiments, the 3' terminal sequence of the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 16, and the 5' terminal sequence of the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 28.
In some embodiments, the 3' terminal sequence of the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 17, and the 5' terminal sequence of the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 29.
In some embodiments, the 3' terminal sequence of the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 18, and the 5' terminal sequence of the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID No. 30.
In some embodiments, the 3' terminal sequence of the first homologous region comprises or consists of a nucleotide sequence having at least 98% identity to SEQ ID NO. 7, and the 5' terminal sequence of the second homologous region comprises or consists of a nucleotide sequence having at least 98% identity to SEQ ID NO. 19.
In some embodiments, the 3' terminal sequence of the first homologous region comprises or consists of the nucleotide sequence of SEQ ID NO. 7 and the 5' terminal sequence of the second homologous region comprises or consists of the nucleotide sequence of SEQ ID NO. 19.
In some embodiments, the first homology region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity with SEQ ID NO. 31 or a fragment thereof; and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID NO. 32, or a fragment thereof. Suitably, the fragment is at least 50bp in length, for example 50-250bp or 100-200bp in length.
In some embodiments, the first homologous region comprises or consists of a nucleotide sequence having at least 98% identity to SEQ ID NO. 31 or a fragment thereof; and the second homologous region comprises or consists of a nucleotide sequence having at least 98% identity with SEQ ID NO. 32 or a fragment thereof.
In some embodiments, the first homologous region comprises or consists of the nucleotide sequence of SEQ ID NO. 31 or a fragment thereof, and the second homologous region comprises or consists of the nucleotide sequence of SEQ ID NO. 32 or a fragment thereof.
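The identity levels recited above (70%, 80%, 90%, 95%, 98% and 100%) can be illustrated with a minimal Python sketch. This passage does not specify how identity is calculated. The sketch therefore assumes the simplest case: an ungapped comparison of two sequences of equal length. In practice identity is usually derived from a pairwise alignment. The function names and the test sequences are hypothetical and are not taken from the sequence listing.

```python
# Illustrative only: percent identity over an ungapped, equal-length comparison.
# Assumption: no alignment step; real identity values are normally alignment-based.

THRESHOLDS = (70, 80, 90, 95, 98, 100)  # identity levels recited in the text


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Return the percentage of matching positions between two equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def thresholds_met(seq_a: str, seq_b: str) -> list:
    """List the recited identity thresholds that the pair of sequences satisfies."""
    identity = percent_identity(seq_a, seq_b)
    return [t for t in THRESHOLDS if identity >= t]
```

For example, two hypothetical 10-nucleotide sequences differing at one position are 90% identical and so satisfy the 70%, 80% and 90% levels but not the 95% level.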
Illustrative first homology region of guide RNA 9 (SEQ ID NO: 31)
Illustrative second homology region of guide RNA 9 (SEQ ID NO: 32)
Genomic insertion sites
Double-strand breaks (DSBs) may be introduced at specific sites by any suitable technique, for example using the CRISPR/Cas9 system and the guide RNAs disclosed herein. In the present invention, DSBs are introduced into RAG1 intron 1 or RAG1 exon 2. For example, DSBs may be introduced at any of the sites listed in Table 2 below. Optionally, DSBs are introduced into RAG1 intron 1.
Table 2. Exemplary DSB sites in RAG1 intron 1 or RAG1 exon 2
Guide Exemplary DSB site
9 Between chr 11:36569296 and 36569297
1 Between chr 11:36573791 and 36573792
2 Between chr 11:36573642 and 36573643
3 Between chr 11:36573352 and 36573353
4 Between chr 11:36569081 and 36569082
5 Between chr 11:36572473 and 36572474
6 Between chr 11:36571459 and 36571460
7 Between chr 11:36571367 and 36571368
8 Between chr 11:36572860 and 36572861
10 Between chr 11:36571458 and 36571459
11 Between chr 11:36569352 and 36569353
12 Between chr 11:36572376 and 36572377
Suitably, each homology region is homologous to a fragment of RAG1 intron 1 and/or RAG1 exon 2 on either side of the DSB. For example, the first homologous region may be homologous to regions in RAG1 intron 1 and/or RAG1 exon 2 upstream of the DSB, and the second homologous region may be homologous to regions downstream of the DSB.
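As a rough illustration of how the Table 2 sites relate to the regions on either side of a DSB, the following Python sketch stores each site as a pair of adjacent chr 11 positions and derives a window of a chosen length on each side. The sketch makes several assumptions. Coordinates are treated as 1-based and inclusive. "Upstream" is taken to mean the lower coordinate. The window length is a hypothetical parameter. This passage does not state whether the actual homologous regions abut the DSB exactly.

```python
# DSB sites from Table 2: guide number -> (position left of cut, position right of cut), chr 11.
DSB_SITES = {
    9: (36569296, 36569297),
    1: (36573791, 36573792),
    2: (36573642, 36573643),
    3: (36573352, 36573353),
    4: (36569081, 36569082),
    5: (36572473, 36572474),
    6: (36571459, 36571460),
    7: (36571367, 36571368),
    8: (36572860, 36572861),
    10: (36571458, 36571459),
    11: (36569352, 36569353),
    12: (36572376, 36572377),
}


def flanking_windows(guide: int, length: int) -> tuple:
    """Return (upstream, downstream) windows of `length` bp on either side of the DSB.

    Assumption: 1-based inclusive coordinates, upstream = lower coordinate.
    """
    if length < 1:
        raise ValueError("length must be positive")
    left, right = DSB_SITES[guide]
    upstream = (left - length + 1, left)       # ends at the base left of the cut
    downstream = (right, right + length - 1)   # starts at the base right of the cut
    return upstream, downstream
```

Calling `flanking_windows(9, 100)` gives the two 100 bp windows that meet at the guide 9 DSB site.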
In the present invention, a nucleotide sequence insert (e.g., a splice acceptor sequence and a nucleotide sequence encoding a RAG1 polypeptide) may be introduced at the DSB site by homology-directed repair (HDR). Thus, the nucleotide sequence insert may replace the region of the genome that is flanked by the homologous regions and contains the DSB.
As used herein, a "nucleotide sequence insert" may consist of a polynucleotide region flanked by a first and a second homologous region. For example, the nucleotide sequence insert may comprise a splice acceptor sequence and a nucleotide sequence encoding a RAG1 polypeptide.
The nucleotide sequence insert may be introduced into the genome at any of the sites listed in Table 2 above. In other words, the genome of the present invention may comprise a nucleotide sequence insert at any one of the sites listed in Table 2 above.
In some embodiments, the nucleotide sequence insert is introduced between:
(i) chr 11:36569296 and 36569297;
(ii) chr 11:36573352 and 36573353; or
(iii) chr 11:36571367 and 36571368.
In some embodiments, a nucleotide sequence insert is introduced between chr 11:36569296 and 36569297.
In some embodiments, the genome of the invention comprises a nucleotide sequence comprising a splice acceptor sequence and a nucleotide sequence encoding a RAG1 polypeptide, which is introduced between:
(i) chr 11:36569296 and 36569297;
(ii) chr 11:36573352 and 36573353; or
(iii) chr 11:36571367 and 36571368.
In some embodiments, the genome of the invention comprises a nucleotide sequence comprising a splice acceptor sequence and a nucleotide sequence encoding a RAG1 polypeptide, which is introduced between chr 11:36569296 and 36569297.
The nucleotide sequence insert may replace any of the regions listed in Table 3 below. In other words, the genome of the present invention may comprise a nucleotide sequence insert that replaces any of the regions listed in Table 3.
Table 3. Exemplary insertion sites in RAG1 intron 1 or RAG1 exon 2
In some embodiments, the nucleotide sequence insert replaces:
(i) chr 11:36569295-36569298;
(ii) chr 11:36573351-36573354; or
(iii) chr 11:36571366-36571369.
In some embodiments, the nucleotide sequence insert replaces chr 11:36569295 to 36569298.
In some embodiments, the genome of the invention comprises a nucleotide sequence comprising a splice acceptor sequence and a nucleotide sequence encoding a RAG1 polypeptide, which replaces:
(i) chr 11:36569295-36569298;
(ii) chr 11:36573351-36573354; or
(iii) chr 11:36571366-36571369.
In some embodiments, the genome of the invention comprises a nucleotide sequence comprising a splice acceptor sequence and a nucleotide sequence encoding a RAG1 polypeptide that replaces chr 11:36569295 to 36569298.
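The three replaced regions listed above each extend one position beyond the pair of positions that defines the corresponding DSB site. For example, the DSB between 36569296 and 36569297 corresponds to the replaced region 36569295-36569298. The short Python sketch below encodes that observed relationship. It is an inference from the three listed examples only. Table 3 is not reproduced in this text, so applying the rule to other guides is an assumption. The function name is hypothetical.

```python
def replaced_region(left: int, right: int) -> tuple:
    """Return the (start, end) region replaced for a DSB between `left` and `right`.

    Assumption: the region spans one position beyond each side of the cut,
    as observed for the three examples listed in the text.
    """
    if right != left + 1:
        raise ValueError("a DSB site is defined by two adjacent positions")
    return (left - 1, right + 1)


# The three DSB sites and replaced regions recited in the text (chr 11).
LISTED_EXAMPLES = {
    (36569296, 36569297): (36569295, 36569298),
    (36573352, 36573353): (36573351, 36573354),
    (36571367, 36571368): (36571366, 36571369),
}
```

Each entry in `LISTED_EXAMPLES` is reproduced by `replaced_region`, which is the only check the sketch supports.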
Splice acceptor and donor sequences
RNA splicing is a form of RNA processing in which newly made precursor messenger RNA (pre-mRNA) transcripts are converted into mature messenger RNA (mRNA). During splicing, introns (non-coding regions) are removed and exons (coding regions) are joined together.
In an intron, splicing requires a donor site (5 ' end of the intron), a branching site (near the 3' end of the intron), and an acceptor site (3 ' end of the intron). The splice donor site includes a nearly constant sequence GU 5' to the intron, within a larger, less highly conserved region. The splice acceptor site 3' to the intron terminates the intron with a nearly constant AG sequence. Upstream (5' to) the AG region with high pyrimidine (C and U) or bundles of polypyrimidine. Further upstream of the bundles of polypyrimidines is a branching point.
A "splice acceptor sequence" is a nucleotide sequence that can serve as an acceptor site at the 3' end of an intron. The consensus sequence and frequency of the human splice site region is described in Ma, s.l., et al, 2015.Plos one,10 (6), p.e 0130229.
Suitably, the splice acceptor sequence may comprise the nucleotide sequence (Y)nNYAG, wherein n is 10-20, or a variant having at least 90% or at least 95% sequence identity. Suitably, the splice acceptor sequence may comprise the sequence (Y)nNCAG, wherein n is 10-20, or a variant having at least 90% or at least 95% sequence identity.
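As a rough illustration only, the consensus (Y)nNYAG with n = 10-20 can be written as a regular expression and checked against a candidate DNA sequence. The following Python sketch is not part of the disclosure: the function name `matches_splice_acceptor_consensus` and the decision to search for the motif anywhere within the candidate (rather than requiring the whole candidate to match) are assumptions made for this example, and the sketch does not evaluate the "at least 90%" or "at least 95%" identity variants.

```python
import re

# IUPAC codes: Y = pyrimidine (C or T in DNA), N = any nucleotide.
# (Y)nNYAG with n = 10-20, written for DNA.
NYAG_PATTERN = re.compile(r"[CT]{10,20}[ACGT][CT]AG")
# (Y)nNCAG with n = 10-20.
NCAG_PATTERN = re.compile(r"[CT]{10,20}[ACGT]CAG")


def matches_splice_acceptor_consensus(seq: str, require_cag: bool = False) -> bool:
    """Return True if seq contains a stretch matching the chosen consensus.

    Hypothetical helper for illustration; the search is case-insensitive
    because the input is upper-cased first.
    """
    pattern = NCAG_PATTERN if require_cag else NYAG_PATTERN
    return pattern.search(seq.upper()) is not None


# Exemplary splice acceptor sequence given in the text (SEQ ID NO: 33).
SEQ_ID_NO_33 = "ctgacctcttctcttcctcccacag"

print(matches_splice_acceptor_consensus(SEQ_ID_NO_33))
print(matches_splice_acceptor_consensus(SEQ_ID_NO_33, require_cag=True))
```

Under these assumptions both calls report a match for SEQ ID NO: 33, whose 3' end is a run of 17 pyrimidines followed by ACAG.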
In some embodiments of the invention, the splice acceptor sequence comprises or consists of a nucleotide sequence that is at least 70% identical to SEQ ID NO. 33 or a fragment thereof. Suitably, the splice acceptor sequence comprises or consists of a nucleotide sequence which is at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% identical to SEQ ID NO 33 or a fragment thereof.
In some embodiments of the invention, the splice acceptor sequence comprises or consists of the nucleotide sequence SEQ ID NO 33 or a fragment thereof.
Exemplary splice acceptor sequences (SEQ ID NO: 33)
ctgacctcttctcttcctcccacag
The polynucleotides of the invention may comprise a splice donor sequence. The genome may comprise a splice donor sequence in RAG1 intron 1. Suitably, the splice donor sequence is 3' to the nucleotide sequence encoding the RAG1 polypeptide. The splice donor sequence can be used to provide an mRNA comprising the nucleotide sequence encoding the RAG1 polypeptide and RAG1 exon 2.
A "splice donor sequence" is a nucleotide sequence that can serve as a donor site at the 5' end of an intron. The consensus sequences and frequencies of human splice site regions are described in Ma, S.L., et al., 2015. PLoS One, 10(6), p.e0130229.
In some embodiments of the invention, the splice donor sequence comprises or consists of a nucleotide sequence that is at least 85% identical to SEQ ID NO 34 or a fragment thereof. In some embodiments of the invention, the splice donor sequence comprises or consists of the nucleotide sequence SEQ ID NO 34 or a fragment thereof.
Exemplary splice donor sequences (SEQ ID NO: 34)
aggtaagt
In some embodiments of the invention, the polynucleotides of the invention do not comprise a splice donor sequence.
Regulatory elements
Polynucleotides of the invention may comprise one or more regulatory elements, which may act pre-or post-transcriptionally. Suitably, the nucleotide sequence encoding a RAG1 polypeptide is operably linked to one or more regulatory elements, which may function either pre-or post-transcriptionally. One or more regulatory elements may promote expression of a RAG1 polypeptide in a cell of the invention.
A "regulatory element" is any nucleotide sequence that facilitates expression of a polypeptide, e.g., functions to increase transcript expression or enhance mRNA stability. Suitable regulatory elements include, for example, promoters, enhancer elements, post-transcriptional regulatory elements and polyadenylation sites.
Polyadenylation sequences
The polynucleotides of the invention may comprise polyadenylation sequences. Suitably, the nucleotide sequence encoding a RAG1 polypeptide is operably linked to a polyadenylation sequence. Polyadenylation sequences may improve gene expression.
Suitable polyadenylation sequences will be well known to those skilled in the art. Suitable polyadenylation sequences include Bovine Growth Hormone (BGH) polyadenylation sequences or early SV40 polyadenylation signals. In some embodiments of the invention, the polyadenylation sequence is a BGH polyadenylation sequence.
In some embodiments of the invention, the polyadenylation sequence comprises or consists of a nucleotide sequence which is at least 70% identical to SEQ ID NO. 35, 62 or 65 or a fragment thereof. Suitably, the polyadenylation sequence comprises or consists of a nucleotide sequence which is at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% identical to SEQ ID NO. 35, 62 or 65 or a fragment thereof.
In some embodiments of the invention, the polyadenylation sequence comprises or consists of the nucleotide sequence SEQ ID NO 35, 62 or 65 or fragments thereof.
Exemplary BGH polyadenylation sequence (SEQ ID NO: 35)
Exemplary BGH polyadenylation sequence (SEQ ID NO: 62)
Exemplary BGH polyadenylation sequence (SEQ ID NO: 65)
Kozak sequence
The polynucleotides of the invention may comprise a Kozak sequence. Suitably, the nucleotide sequence encoding a RAG1 polypeptide is operably linked to a Kozak sequence. A Kozak sequence may be inserted before the initiation codon of the RAG1 polypeptide to improve the initiation of translation.
Suitable Kozak sequences will be well known to those skilled in the art.
In some embodiments of the invention, the Kozak sequence comprises or consists of a nucleotide sequence that is at least 70% identical to SEQ ID NO. 36 or a fragment thereof. Suitably, the Kozak sequence comprises or consists of a nucleotide sequence which is at least 80% or at least 90% identical to SEQ ID NO. 36 or a fragment thereof.
In some embodiments of the invention, the Kozak sequence comprises or consists of the nucleotide sequence SEQ ID NO. 36 or a fragment thereof.
Exemplary Kozak sequences (SEQ ID NO: 36)
gccgccaccatg
Post-transcriptional regulatory elements
Polynucleotides of the invention may comprise post-transcriptional regulatory elements. Suitably, the nucleotide sequence encoding a RAG1 polypeptide is operably linked to a post-transcriptional regulatory element. Post-transcriptional regulatory elements may improve gene expression.
Suitable post-transcriptional regulatory elements will be well known to those skilled in the art.
The polynucleotides of the invention may comprise woodchuck hepatitis virus post-transcriptional regulatory elements (WPREs). Suitably, the nucleotide sequence encoding a RAG1 polypeptide is operably linked to WPRE.
In some embodiments of the invention, WPRE comprises or consists of a nucleotide sequence that is at least 70% identical to SEQ ID NO 37 or a fragment thereof. Suitably, WPRE comprises or consists of a nucleotide sequence that is at least 80% or at least 90% identical to SEQ ID NO. 37 or a fragment thereof.
In some embodiments of the invention, the WPRE comprises or consists of the nucleotide sequence SEQ ID NO 37 or a fragment thereof.
Exemplary WPRE (SEQ ID NO: 37)
In some embodiments of the invention, the RAG1 polypeptide is not operably linked to a post-transcriptional regulatory element. In some embodiments of the invention, the RAG1 polypeptide is not operably linked to the WPRE.
Endogenous 3' UTR
The polynucleotides of the invention may comprise an endogenous RAG1 3' UTR. Suitably, the nucleotide sequence encoding a RAG1 polypeptide is operably linked to an endogenous RAG1 3' UTR.
In some embodiments of the invention, the RAG1 3' UTR comprises or consists of a nucleotide sequence which is at least 70% identical to SEQ ID NO. 38 or a fragment thereof. Suitably, the RAG1 3' UTR comprises or consists of a nucleotide sequence which is at least 80% or at least 90% identical to SEQ ID NO. 38 or a fragment thereof.
In some embodiments of the invention, the RAG1 3' UTR comprises or consists of the nucleotide sequence SEQ ID NO. 38 or a fragment thereof.
Exemplary RAG1 3' UTR (SEQ ID NO: 38)
In some embodiments of the invention, the RAG1 polypeptide is not operably linked to the RAG1 3' UTR.
Further coding sequences
The polynucleotides of the invention may comprise further coding sequences. The polynucleotides of the invention may comprise an internal ribosome entry site sequence (IRES). IRES may increase or allow expression of further coding sequences. IRES may be operably linked to additional coding sequences.
In some embodiments of the invention, the IRES comprises or consists of a nucleotide sequence that is at least 70% identical to SEQ ID NO. 63 or a fragment thereof. Suitably, IRES comprises or consists of a nucleotide sequence which is at least 80% or at least 90% identical to SEQ ID NO. 63 or a fragment thereof.
In some embodiments of the invention, the IRES comprises or consists of the nucleotide sequence SEQ ID NO. 63 or a fragment thereof.
Exemplary IRES (SEQ ID NO: 63)
Further coding sequences may encode a selector, e.g., an NGFR receptor, e.g., a low affinity NGFR, such as a C-terminally truncated low affinity NGFR. The selector may be used to enrich cells.
In some embodiments of the invention, the NGFR coding sequence comprises or consists of a nucleotide sequence that is at least 70% identical to SEQ ID NO. 64 or a fragment thereof. Suitably, the NGFR coding sequence comprises or consists of a nucleotide sequence which is at least 80% or at least 90% identical to SEQ ID NO. 64 or a fragment thereof.
In some embodiments of the invention, the NGFR coding sequence comprises or consists of the nucleotide sequence SEQ ID NO. 64 or a fragment thereof.
Exemplary NGFR coding sequence (SEQ ID NO: 64)
Further coding sequences may encode destabilizing domains, such as peptide sequences rich in proline (P), glutamic acid (E), serine (S) and threonine (T) (PEST). The endogenous RAG1 protein may be destroyed by way of the destabilizing domain, e.g. through proteasome-mediated degradation directed by a PEST signal peptide.
In some embodiments of the invention, the PEST coding sequence comprises or consists of a nucleotide sequence at least 70% identical to SEQ ID NO. 66 or a fragment thereof. Suitably, the PEST coding sequence comprises or consists of a nucleotide sequence which is at least 80% or at least 90% identical to SEQ ID No. 66 or a fragment thereof.
In some embodiments of the invention, the PEST coding sequence comprises or consists of the nucleotide sequence SEQ ID NO. 66 or a fragment thereof.
Exemplary PEST coding sequence (SEQ ID NO: 66)
Promoters and enhancers
Suitably, the nucleotide sequence encoding a RAG1 polypeptide is operably linked to a promoter and/or enhancer element.
A "promoter" is a region of DNA that leads to initiation of transcription of a gene. Promoters are located near the transcription start sites of genes, upstream on the DNA (towards the 5' region of the sense strand). Any suitable promoter may be used, the selection of which may be readily made by the skilled artisan.
An "enhancer" is a region of DNA that can be bound by a protein (activator) to increase the likelihood that transcription of a particular gene will occur. Enhancers are cis-acting. They may be located up to 1Mbp (1,000,000 bp) from the gene, either upstream or downstream of the start site. Any suitable enhancer may be used, the selection of which may be readily made by the skilled artisan.
Transcription of the nucleotide sequence encoding the RAG1 polypeptide may be driven by an endogenous promoter. For example, if a polynucleotide of the invention is inserted into RAG1 intron 1, transcription of the nucleotide sequence encoding the RAG1 polypeptide may be driven by an endogenous RAG1 promoter.
In some embodiments of the invention, the polynucleotides of the invention do not comprise promoter and/or enhancer elements. In some embodiments of the invention, the genome of the invention does not comprise promoter and/or enhancer elements (e.g., exogenous promoter and/or enhancer elements) in RAG1 intron 1.
Exemplary polynucleotides and genomes
In some embodiments, a polynucleotide of the invention comprises, consists essentially of, or consists of, from 5' to 3': a first homologous region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, a polyadenylation sequence and a second homologous region.
In some embodiments, a polynucleotide of the invention comprises, consists essentially of, or consists of, from 5' to 3': a first homologous region, a splice acceptor sequence, a Kozak sequence, a nucleotide sequence encoding a RAG1 polypeptide, a polyadenylation sequence and a second homologous region.
In some embodiments, a polynucleotide of the invention comprises, consists essentially of, or consists of, from 5' to 3': a first homologous region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, a WPRE, a polyadenylation sequence and a second homologous region.
In some embodiments, a polynucleotide of the invention comprises, consists essentially of, or consists of, from 5' to 3': a first homologous region, a splice acceptor sequence, a Kozak sequence, a nucleotide sequence encoding a RAG1 polypeptide, a WPRE, a polyadenylation sequence and a second homologous region.
In some embodiments, a polynucleotide of the invention comprises, consists essentially of, or consists of, from 5' to 3': a first homologous region, a splice acceptor sequence, a Kozak sequence, a nucleotide sequence encoding a RAG1 polypeptide, a 3' UTR, a polyadenylation sequence and a second homologous region.
In some embodiments, a polynucleotide of the invention comprises, consists essentially of, or consists of, from 5' to 3': a first homologous region, a splice acceptor sequence, a Kozak sequence, a nucleotide sequence encoding a RAG1 polypeptide, an IRES, a nucleotide sequence encoding a selector (e.g., NGFR), a polyadenylation sequence, and a second homologous region.
In some embodiments, a polynucleotide of the invention comprises, consists essentially of, or consists of, from 5' to 3': a first homologous region, a splice acceptor sequence, a Kozak sequence, a nucleotide sequence encoding a RAG1 polypeptide, an IRES, a nucleotide sequence encoding a destabilizing domain (e.g., PEST sequence), a splice donor sequence, and a second homologous region.
In some embodiments, a polynucleotide of the invention comprises, consists essentially of, or consists of, from 5' to 3': a first homologous region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, a splice donor sequence, and a second homologous region.
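The 5' to 3' element orders listed above can be summarized as ordered lists. The Python sketch below is purely illustrative: the element labels, the `LAYOUTS` list and the function `is_consistent_layout` are hypothetical names chosen for this example, and the function only checks the features that the listed arrangements share (homologous regions at both ends, the splice acceptor 5' of the RAG1 coding sequence, and any Kozak sequence immediately 5' of the coding sequence). It is not a definition of the claimed polynucleotides.

```python
# Illustrative labels for the elements named in the text.
FIRST_HR = "first homologous region"
SECOND_HR = "second homologous region"
SA = "splice acceptor"
KOZAK = "Kozak"
RAG1 = "RAG1 coding sequence"

# The eight exemplary 5'-to-3' arrangements, transcribed from the text.
LAYOUTS = [
    [FIRST_HR, SA, RAG1, "polyA", SECOND_HR],
    [FIRST_HR, SA, KOZAK, RAG1, "polyA", SECOND_HR],
    [FIRST_HR, SA, RAG1, "WPRE", "polyA", SECOND_HR],
    [FIRST_HR, SA, KOZAK, RAG1, "WPRE", "polyA", SECOND_HR],
    [FIRST_HR, SA, KOZAK, RAG1, "3' UTR", "polyA", SECOND_HR],
    [FIRST_HR, SA, KOZAK, RAG1, "IRES", "selector (e.g. NGFR)", "polyA", SECOND_HR],
    [FIRST_HR, SA, KOZAK, RAG1, "IRES", "destabilizing domain (e.g. PEST)",
     "splice donor", SECOND_HR],
    [FIRST_HR, SA, RAG1, "splice donor", SECOND_HR],
]


def is_consistent_layout(elements: list) -> bool:
    """Check the ordering features common to the listed arrangements."""
    # Homologous regions must flank the cassette.
    if len(elements) < 4 or elements[0] != FIRST_HR or elements[-1] != SECOND_HR:
        return False
    if SA not in elements or RAG1 not in elements:
        return False
    sa_index = elements.index(SA)
    cds_index = elements.index(RAG1)
    # The splice acceptor lies 5' of the RAG1 coding sequence.
    if sa_index >= cds_index:
        return False
    # If present, the Kozak sequence sits immediately 5' of the coding sequence.
    if KOZAK in elements and elements.index(KOZAK) != cds_index - 1:
        return False
    return True


print(all(is_consistent_layout(layout) for layout in LAYOUTS))
```

A layout with the splice acceptor placed before the first homologous region, for instance, would be rejected by this check.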
In some embodiments, polynucleotides of the invention comprise or consist of a nucleotide sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% identity to SEQ ID NO 39.
In some embodiments, the polynucleotides of the invention comprise or consist of the nucleotide sequence of SEQ ID NO. 39.
In some embodiments, the genome of the invention comprises a nucleotide sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, or at least 99% identity to SEQ ID NO 39.
In some embodiments, the genome of the present invention comprises the nucleotide sequence of SEQ ID NO. 39.
In some embodiments, the genome of the invention comprises a nucleotide sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, or at least 99% identical to nucleotides 297-3687 of SEQ ID NO:39 or nucleotides 291-3693 of SEQ ID NO: 39.
In some embodiments, the genome of the invention comprises the nucleotide sequence of nucleotides 297-3687 of SEQ ID NO:39 or nucleotides 291-3693 of SEQ ID NO: 39.
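Nucleotide ranges such as "nucleotides 297-3687" are usually read as 1-based and inclusive of both ends, whereas many programming languages index from 0 with an exclusive end. The Python sketch below shows the conversion under that assumption; the helper names `subsequence` and `range_length` are hypothetical, and the short toy string stands in for a real sequence (SEQ ID NO: 39 itself is not reproduced here).

```python
def subsequence(seq: str, start: int, end: int) -> str:
    """Return residues start..end of seq, counting from 1 and including both ends."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("range lies outside the sequence")
    # Python slices are 0-based with an exclusive end, hence start - 1.
    return seq[start - 1:end]


def range_length(start: int, end: int) -> int:
    """Number of residues in a 1-based inclusive range."""
    return end - start + 1


toy = "ACGTACGT"  # placeholder sequence, not taken from the document
print(subsequence(toy, 2, 4))
print(range_length(297, 3687), range_length(291, 3693))
```

On this reading, nucleotides 297-3687 span 3391 residues and nucleotides 291-3693 span 3403 residues.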
Exemplary Polynucleotide (SEQ ID NO: 39)
In some embodiments, the genome of the invention comprises a nucleotide sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, or at least 99% identity to SEQ ID NO. 40.
In some embodiments, the genome of the present invention comprises the nucleotide sequence of SEQ ID NO. 40.
Exemplary nucleotide sequence inserts (SEQ ID NO: 40)
Variants, derivatives, analogs and fragments
In addition to the specific proteins and nucleotides mentioned herein, variants, derivatives and fragments thereof are also encompassed by the present invention.
In the context of the present invention, a "variant" of any given sequence is a sequence in which a particular sequence of residues (whether amino acid or nucleic acid residues) has been modified in such a way that the polypeptide or polynucleotide in question retains at least one of its endogenous functions. For example, variants of RAG1 may retain the ability to form RAG complexes, mediate DNA binding to RSS, and introduce double strand breaks between RSS and adjacent coding segments. Variant sequences may be obtained by addition, deletion, substitution, modification, replacement and/or variation of at least one residue present in a naturally occurring polypeptide or polynucleotide.
As used herein, the term "derivative" in connection with a protein or polypeptide of the invention includes any substitution, variation, modification, replacement, deletion and/or addition of one (or more) amino acid residues from or to a sequence, provided that the resulting protein or polypeptide retains at least one of its endogenous functions. For example, derivatives of RAG1 may retain the ability to form RAG complexes, mediate DNA binding to RSS, and introduce double strand breaks between RSS and adjacent coding segments.
Typically, amino acid substitutions, e.g., from 1, 2, or 3 to 10 or 20 substitutions, can be made, provided that the modified sequence retains the desired activity or ability. Amino acid substitutions may include the use of non-naturally occurring analogs.
Proteins used in the present invention may also have deletions, insertions or substitutions of amino acid residues which produce a silent change and produce a functionally equivalent protein. Deliberate amino acid substitutions may be made based on similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues as long as endogenous function is retained. For example, negatively charged amino acids include aspartic acid and glutamic acid; positively charged amino acids include lysine and arginine; and amino acids having uncharged polar head groups with similar hydrophilicity values include asparagine, glutamine, serine, threonine and tyrosine.
Conservative substitutions may be made, for example according to the table below. The amino acids in the same block in the second column and the same line in the third column may be substituted for each other:
In general, variants may have some identity to a wild-type amino acid sequence or a wild-type nucleotide sequence.
In this context, variant sequences are considered to include amino acid sequences that may be at least 50%, 55%, 65%, 75%, 85% or 90% identical, suitably at least 95%, 96% or 97% or 98% or 99% identical, to the subject sequence. Although variants may also be considered in terms of similarity (i.e. amino acid residues with similar chemical properties/functions), they are preferably expressed in terms of sequence identity in the context of the present invention.
In this context, variant sequences are considered to include nucleotide sequences that may be at least 50%, 55%, 65%, 75%, 85% or 90% identical, suitably at least 95%, 96% or 97% or 98% or 99% identical, to the subject sequence. In the context of the present invention, it is preferably expressed in terms of sequence identity, although variants may also be considered in terms of similarity.
Suitably, reference to a sequence having a percent identity to any one of the SEQ ID NOs detailed herein refers to a sequence having said percent identity over the entire length of the SEQ ID NOs referred to.
Sequence identity comparisons may be made by eye or, more commonly, with the aid of readily available sequence comparison programs. These commercially available computer programs can calculate the percent identity between two or more sequences.
The percent identity can be calculated over contiguous sequences, i.e., one sequence is aligned with another sequence and each amino acid or nucleotide in one sequence is directly compared to the corresponding amino acid or nucleotide in the other sequence, one residue at a time. This is referred to as an "ungapped" alignment. Typically, such ungapped alignments are performed only over a relatively short number of residues.
Although this is a very simple and consistent method, it does not take into account that, for example, in a pair of otherwise identical sequences, an insertion or deletion in an amino acid or nucleotide sequence may result in the following residues or codons being placed out of alignment, thus potentially resulting in a substantial reduction in the percentage of identity when performing a global alignment. Thus, most sequence comparison methods aim to produce optimal alignments, taking into account possible insertions and deletions, without unduly penalizing the overall identity score. This is achieved by inserting "gaps" in the sequence alignment in an attempt to maximize local identity.
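The effect described above can be seen in a small Python sketch. It is illustrative only: the function name `ungapped_identity`, the choice to compare over the length of the shorter sequence, and the two toy sequences are assumptions made for this example.

```python
def ungapped_identity(a: str, b: str) -> float:
    """Percent of matching positions when a and b are compared residue by
    residue from their first positions, with no gaps, over the shorter length."""
    n = min(len(a), len(b))
    if n == 0:
        raise ValueError("sequences must not be empty")
    # zip stops at the end of the shorter sequence.
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / n


reference = "ACGTACGTAC"
with_insertion = "ACGGTACGTAC"  # same sequence with one extra G after the third residue

print(ungapped_identity(reference, reference))       # 100.0
print(ungapped_identity(reference, with_insertion))  # 30.0
```

A single inserted residue shifts every downstream position out of register, so only the first three positions still match and the ungapped score falls from 100% to 30%, even though the two sequences are otherwise identical.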
However, these more complex methods assign a "gap penalty" to each gap that occurs in an alignment, so that, for the same number of identical amino acids or nucleotides, a sequence alignment with as few gaps as possible, reflecting a higher relatedness between the two compared sequences, will achieve a higher score than one with many gaps. "Affine gap costs" are typically used, which charge a relatively high cost for the existence of a gap and a smaller penalty for each subsequent residue in the gap. This is the most commonly used gap scoring system. High gap penalties will of course produce optimized alignments with fewer gaps. Most alignment programs allow the gap penalties to be modified. However, when using such software for sequence comparison, default values are preferably used. For example, when using the GCG Wisconsin Bestfit package, the default gap penalty for amino acid sequences is -12 for a gap and -4 for each extension.
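A minimal Python sketch of an affine gap cost follows. It assumes the convention described above, in which the first gapped position is charged the gap penalty and each subsequent position in the same gap is charged the extension penalty; individual alignment programs may count the first position differently. The function name `affine_gap_score` is hypothetical, and the default values are the -12 and -4 quoted above for amino acid sequences.

```python
def affine_gap_score(gap_length: int, gap_open: int = -12, gap_extend: int = -4) -> int:
    """Score of a single gap spanning gap_length residues.

    Assumed convention: the first gapped residue costs gap_open and every
    further residue in the same gap costs gap_extend.
    """
    if gap_length < 1:
        raise ValueError("a gap spans at least one residue")
    return gap_open + gap_extend * (gap_length - 1)


one_long_gap = affine_gap_score(3)         # -12 + 2 * -4 = -20
three_short_gaps = 3 * affine_gap_score(1)  # 3 * -12 = -36
print(one_long_gap, three_short_gaps)
```

With these values, a single three-residue gap is penalized less than three separate one-residue gaps, which is why affine costs favor alignments with fewer, longer gaps.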
Thus, calculation of the maximum percent identity first requires the generation of an optimal alignment, taking gap penalties into consideration. A suitable computer program for carrying out such an alignment is the GCG Wisconsin Bestfit package (University of Wisconsin, USA; Devereux et al. (1984) Nucleic Acids Research 12:387). Examples of other software that can perform sequence comparisons include, but are not limited to, the BLAST package (see Ausubel et al. (1999) supra, Chapter 18), FASTA (Altschul et al. (1990) J. Mol. Biol. 403-410), EMBOSS Needle (Madeira, F., et al., 2019. Nucleic Acids Research, 47(W1), pp.W636-W641) and the GENEWORKS suite of comparison tools. Both BLAST and FASTA are available for offline and online searching (see Ausubel et al. (1999) supra, pages 7-58 to 7-60). However, for some applications, the GCG Bestfit program is preferred. Another tool, BLAST 2 Sequences, is also available for comparing protein and nucleotide sequences (FEMS Microbiol. Lett. (1999) 174(2):247-50; FEMS Microbiol. Lett. (1999) 177(1):187-8).
Although the final percent identity can be measured, the alignment process itself is typically not based on an all-or-nothing pair comparison. Instead, a scaled similarity score matrix is generally used that assigns a score to each pairwise comparison based on chemical similarity or evolutionary distance. An example of such a matrix in common use is the BLOSUM62 matrix (the default matrix for the BLAST suite of programs). GCG Wisconsin programs generally use either the public default values or a custom symbol comparison table, if supplied (see the user manual for further details). For some applications, it is preferred to use the public default values for the GCG package or, in the case of other software, the default matrix, such as BLOSUM62.
Once the software has produced an optimal alignment, it is possible to calculate the percent sequence identity. The software typically does this as part of the sequence comparison and generates a numerical result. The percent sequence identity can be calculated as the number of identical residues as a percentage of the total residues in the SEQ ID NO referred to.
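The calculation in the preceding paragraph can be sketched in Python as follows. The sketch assumes the alignment has already been produced by some alignment program and is supplied as two equal-length strings in which "-" marks a gap; the function name `percent_identity` and the toy alignments are hypothetical and serve only to illustrate counting identical residues against the full length of the reference sequence.

```python
def percent_identity(aligned_reference: str, aligned_query: str, gap: str = "-") -> float:
    """Identical residues as a percentage of all residues in the reference.

    Both arguments are rows of one pairwise alignment, so they must have the
    same length; gap characters in the reference row do not count towards
    the reference length.
    """
    if len(aligned_reference) != len(aligned_query):
        raise ValueError("aligned rows must have equal length")
    reference_length = sum(1 for r in aligned_reference if r != gap)
    if reference_length == 0:
        raise ValueError("reference contains no residues")
    identical = sum(
        1 for r, q in zip(aligned_reference, aligned_query) if r == q and r != gap
    )
    return 100.0 * identical / reference_length


# Query carries one inserted residue relative to the reference.
print(percent_identity("ACG-TACGTAC", "ACGGTACGTAC"))  # 100.0
# Query lacks two residues of the reference.
print(percent_identity("ACGTACGTAC", "ACGTAC--AC"))    # 80.0
```

Dividing by the reference length rather than the alignment length reflects the convention that the percent identity is taken over the entire length of the SEQ ID NO referred to.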
A "fragment" is also a variant, and the term typically refers to a selected region of the polypeptide or polynucleotide that is of interest either functionally or, for example, in an assay. Thus, a "fragment" refers to an amino acid or nucleic acid sequence that is a portion of a full-length polypeptide or polynucleotide.
Such variants, derivatives and fragments may be prepared using standard recombinant DNA techniques such as site-directed mutagenesis. Where an insertion is to be made, synthetic DNA encoding the insertion together with 5' and 3' flanking regions corresponding to the naturally occurring sequence on either side of the insertion site may be prepared. The flanking regions will contain convenient restriction sites corresponding to sites in the naturally occurring sequence, so that the sequence may be cut with the appropriate enzyme(s) and the synthetic DNA ligated into the cut. The DNA is then expressed in accordance with the invention to produce the encoded protein. These methods are merely illustrative of the numerous standard techniques known in the art for manipulating DNA sequences, and other known techniques may also be used.
Vectors
In one aspect, the invention provides a vector comprising a polynucleotide of the invention.
The vector may be suitable for editing a genome using a polynucleotide of the invention. Vectors may be used to deliver polynucleotides into cells. Subsequently, nucleotide sequence inserts can be introduced into the genome at the site of Double Strand Breaks (DSBs) by Homology Directed Repair (HDR).
The vectors of the invention may be capable of transducing mammalian cells, such as human cells. Suitably, the vector of the invention is capable of transducing HSCs, HPCs and/or LPCs. Suitably, the vector of the invention is capable of transducing CD34+ cells. Suitably, the vectors of the invention are capable of transducing NALM6, K562 and/or other human cell lines (e.g. MOLT4, U937, etc.). Suitably, the vector of the invention is capable of transducing T cells.
Suitably, the vector of the invention is a viral vector. The vector of the invention may be an adeno-associated virus (AAV) vector, although it is contemplated that other viral vectors may be used, such as lentiviral vectors (e.g., IDLV vectors), or single-or double-stranded DNA.
The vector of the invention may be in the form of viral vector particles. Suitably, the viral vector of the invention is in the form of an AAV vector particle. Suitably, the viral vector of the invention is in the form of a lentiviral vector particle, for example an IDLV vector particle.
Methods of preparing and modifying viral vectors and viral vector particles, such as those derived from AAV, are well known in the art. Suitable methods are described in Ayuso, E., et al., 2010. Current Gene Therapy, 10(6), pp.423-436; Merten, O.W., et al., 2016. Molecular Therapy - Methods & Clinical Development, 3, p.16017; and Nadeau, I. and Kamen, A., 2003. Biotechnology Advances, 20(7-8), pp.475-489.
Adeno-associated virus (AAV) vectors
The vector of the invention may be an adeno-associated virus (AAV) vector. Optionally, the vector is an AAV6 vector. The vectors of the invention may be in the form of AAV vector particles. Optionally, the vector is in the form of AAV6 vector particles.
The AAV vector or AAV vector particle may comprise an AAV genome or a fragment or derivative thereof. An AAV genome is a polynucleotide sequence that can encode the functions required to produce AAV particles. These functions include those operating in the replication and packaging cycle of AAV in a host cell, including encapsidation of the AAV genome into an AAV particle. Naturally occurring AAV is replication deficient and relies on the provision of helper functions in trans to complete the replication and packaging cycle. Thus, the AAV genome of the AAV vector of the invention is typically replication deficient.
The AAV genome may be in single-stranded form, either positive or negative sense, or in double-stranded form. The use of a double-stranded form allows the DNA replication step in the target cell to be bypassed and thus can accelerate transgene expression.
AAV found in nature can be classified according to various biological systems. AAV genomes may be from any naturally derived serotype, isolate, or clade of AAV.
AAV may be referred to according to its serotype. Serotypes correspond to variant subspecies of AAV, which are uniquely reactive due to the expression profile of capsid surface antigens, which can be used to distinguish them from other variant subspecies. In general, AAV vector particles having a particular AAV serotype do not cross-react efficiently with neutralizing antibodies specific for any other AAV serotype. AAV serotypes include AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, and AAV11. The AAV vector of the invention may be an AAV6 serotype.
AAV may also be referred to in terms of clades or clones. This refers to the phylogenetic relationship of naturally derived AAVs, and typically to a phylogenetic group of AAVs that can be traced back to a common ancestor and includes all descendants thereof. Furthermore, AAV may be referred to in terms of a specific isolate, i.e., a genetic isolate of a specific AAV found in nature. The term genetic isolate describes a population of AAVs that has undergone limited genetic mixing with other naturally occurring AAVs, thereby defining a recognizably distinct population at the genetic level.
Typically, the AAV genome of a naturally derived serotype, isolate, or clade of AAV comprises at least one Inverted Terminal Repeat (ITR). The ITR sequences act in cis to provide a functional origin of replication and allow integration and excision of the vector from the cell genome. ITRs may be the only sequences required in cis next to the therapeutic gene. Suitably, one or more ITR sequences flank a polynucleotide of the invention.
AAV genomes may also comprise packaging genes, such as rep and/or cap genes encoding packaging functions of AAV particles. A promoter may be operably linked to each of the packaging genes. Specific examples of such promoters include the p5, p19 and p40 promoters. For example, the p5 and p19 promoters are typically used to express the rep gene, while the p40 promoter is typically used to express the cap gene. The rep gene encodes one or more of the proteins Rep78, Rep68, Rep52 and Rep40, or variants thereof. The cap gene encodes one or more capsid proteins, such as VP1, VP2 and VP3 or variants thereof.
The AAV genome may be the complete genome of a naturally occurring AAV. For example, vectors comprising whole AAV genomes can be used to prepare AAV vectors or vector particles.
Suitably, the AAV genome is derivatized for the purpose of administration to a patient. Such derivatization is standard in the art, and the present invention encompasses the use of any known derivative of an AAV genome, as well as derivatives that can be generated by applying techniques known in the art. The AAV genome may be a derivative of any naturally occurring AAV. Suitably, the AAV genome is a derivative of AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10 or AAV11. Suitably, the AAV genome is a derivative of AAV6.
Derivatives of the AAV genome include any truncated or modified form of the AAV genome that allows for expression of a transgene from an AAV vector of the invention in vivo. In general, it is possible to truncate the AAV genome significantly to include minimal viral sequences but retain the above functions. This can reduce the risk of recombination of the vector with the wild-type virus and avoid triggering a cellular immune response due to the presence of viral gene proteins in the target cell.
Typically, the derivative will comprise at least one Inverted Terminal Repeat (ITR), optionally more than one ITR, such as two or more ITRs. One or more of the ITRs can be derived from AAV genomes with different serotypes, or can be chimeric or mutant ITRs. Suitable mutant ITRs are those having a deletion of trs (terminal resolution sites). This deletion allows the genome to continue to replicate to produce a single stranded genome containing the coding sequence and complementary sequences, i.e., a self-complementary AAV genome. This allows bypassing DNA replication in the target cell and thus can accelerate transgene expression.
The AAV genome may comprise one or more ITR sequences from any naturally derived serotype, isolate or clade of AAV, or variant thereof. The AAV genome may comprise at least one, such as two AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, or AAV11 ITRs, or variants thereof.
One or more ITRs can flank either end of a nucleotide sequence of the invention. Inclusion of one or more ITRs can aid AAV vectors in forming concatemers in the nucleus of a host cell, e.g., after conversion of single-stranded vector DNA to double-stranded DNA by host cell DNA polymerases. The formation of such episomal concatemers protects the AAV vector during the life of the host cell, allowing for prolonged expression of the transgene in vivo.
Suitably, the ITR element will be the only sequence in the derivative that is retained from the native AAV genome. Suitably, the derivative may not comprise the rep and/or cap genes of the native genome and any other sequences of the native genome. This may reduce the likelihood of integration of the vector into the host cell genome. Furthermore, reducing the size of AAV genomes allows for increased flexibility in incorporating other sequence elements (such as regulatory elements) into the vector in addition to the transgene.
The following portions can thus be removed in the derivatives of the invention: an Inverted Terminal Repeat (ITR) sequence, and the replication (rep) and capsid (cap) genes. However, the derivative may additionally comprise one or more rep and/or cap genes of the AAV genome or other viral sequences. Naturally occurring AAV integrates at a high frequency at a specific site on human chromosome 19 and exhibits a negligible random integration frequency, and thus retention of integration capacity in AAV vectors can be tolerated in a therapeutic setting.
The invention additionally encompasses providing the sequences of the AAV genome in a different order and configuration than the native AAV genome. The invention also encompasses the replacement of one or more AAV sequences or genes with sequences from another virus or with chimeric genes composed of sequences from more than one virus. Such chimeric genes may be composed of sequences of two or more related viral proteins from different viral species.
AAV vector particles may be encapsidated by capsid proteins. Suitably, the AAV vector particles may be in a trans-capsid form, wherein AAV genomes or derivatives having ITRs of one serotype are packaged in capsids of a different serotype. AAV vector particles also include mosaic forms in which a mixture of unmodified capsid proteins from two or more different serotypes constitute the viral capsid. AAV vector particles also include chemically modified forms with ligands adsorbed to the capsid surface. For example, such ligands may include antibodies for targeting specific cell surface receptors.
When the derivative comprises capsid proteins, i.e., VP1, VP2 and/or VP3, the derivative may be a chimeric, shuffled or capsid modified derivative of one or more naturally occurring AAV. In particular, the invention encompasses providing capsid protein sequences from different serotypes, clades, clones, or isolates of AAV within the same vector (i.e., a pseudotyped vector). AAV vectors may be in the form of pseudotyped AAV vector particles.
Chimeric, shuffled or capsid-modified derivatives will typically be selected to provide one or more desired functions to the AAV vector. Thus, these derivatives may exhibit increased gene delivery efficiency and/or reduced immunogenicity (humoral or cellular) as compared to AAV vectors comprising naturally occurring AAV genomes. For example, an increase in gene delivery efficiency may be achieved by improved cell surface receptor or co-receptor binding, improved internalization, improved trafficking within the cell and into the nucleus, improved uncoating of viral particles, and improved conversion of single-stranded genomes into double-stranded forms.
Chimeric capsid proteins include those produced by recombination between two or more capsid coding sequences of naturally occurring AAV serotypes. This may be performed, for example, by a marker rescue method, in which non-infectious capsid sequences of one serotype are co-transfected with capsid sequences of a different serotype, and directional selection is used to select capsid sequences having the desired properties. Capsid sequences of different serotypes can be altered by homologous recombination within the cell to produce novel chimeric capsid proteins.
Chimeric capsid proteins also include chimeric capsid proteins generated by engineering the capsid protein sequence to transfer a particular capsid protein domain, surface loop, or particular amino acid residue between two or more capsid proteins, e.g., of different serotypes.
Shuffled or chimeric capsid proteins can also be generated by DNA shuffling or error-prone PCR. Hybrid AAV capsid genes can be created by randomly fragmenting sequences of related AAV genes, such as those encoding capsid proteins of multiple different serotypes, and then reassembling the fragments in a self-priming polymerase reaction, which can also result in crossovers in regions of sequence homology. Libraries of hybrid AAV genes created in this manner by shuffling the capsid genes of several serotypes can be screened to identify viral clones with the desired function. Similarly, error-prone PCR can be used to randomly mutate AAV capsid genes to create a diverse library of variants, which can then be selected for desired properties.
The sequence of the capsid gene may also be genetically modified to introduce specific deletions, substitutions or insertions relative to the native wild-type sequence. In particular, the capsid gene may be modified by inserting sequences of unrelated proteins or peptides in the open reading frame of the capsid coding sequence or at the N- and/or C-terminus of the capsid coding sequence. The unrelated protein or peptide may advantageously be a protein or peptide that acts as a ligand for a particular cell type, thereby conferring improved binding to the target cell or improved specificity of the vector for targeting a particular cell population. The unrelated protein may also be a protein that aids in the purification of the viral particle as part of the production process, i.e., an epitope or an affinity tag. The insertion site will typically be selected so as not to interfere with other functions of the viral particle, such as internalization or trafficking of the viral particle.
The capsid protein may be an artificial or mutant capsid protein. As used herein, the term "artificial capsid" means that the capsid particles comprise an amino acid sequence that does not exist in nature or comprise an amino acid sequence that has been engineered (e.g., modified) from a naturally occurring capsid amino acid sequence. In other words, in the case of aligning the artificial capsid amino acid sequence with the parent capsid amino acid sequence, the artificial capsid protein comprises a mutation or variation in the amino acid sequence compared to the sequence of the parent capsid from which it was derived. AAV vector particles may comprise AAV6 capsid proteins.
Retroviral and lentiviral vectors
The vector of the present invention may be a retroviral vector or a lentiviral vector. The vector of the present invention may be a retroviral vector particle or a lentiviral vector particle.
The retroviral vector may be derived from or may be derivable from any suitable retrovirus. A number of different retroviruses have been identified. Examples include murine leukemia virus (MLV), human T cell leukemia virus (HTLV), mouse mammary tumor virus (MMTV), Rous sarcoma virus (RSV), Fujinami sarcoma virus (FuSV), Moloney murine leukemia virus (Mo-MLV), FBR murine osteosarcoma virus (FBR MSV), Moloney murine sarcoma virus (Mo-MSV), Abelson murine leukemia virus (A-MLV), avian myelocytomatosis virus-29 (MC29), and avian erythroblastosis virus (AEV).
Retroviruses can be broadly divided into two categories, "simple" and "complex". Retroviruses can be further divided into seven groups, five of which represent retroviruses with oncogenic potential. The remaining two groups are the lentiviruses and the spumaviruses (foamy viruses).
The basic structures of retroviral and lentiviral genomes share many common features, such as a 5' LTR and a 3' LTR. Between or within these are located a packaging signal to enable the genome to be packaged, a primer binding site, integration sites to enable integration into the host cell genome, and the gag, pol and env genes encoding the packaging components, which are polypeptides required for viral particle assembly. Lentiviruses have additional features, such as the rev and RRE sequences in HIV, which enable efficient export of RNA transcripts of the integrated provirus from the nucleus into the cytoplasm of infected target cells.
In the provirus, these genes are flanked at both ends by regions called Long Terminal Repeats (LTRs). The LTRs are responsible for proviral integration and transcription. The LTRs also act as enhancer-promoter sequences and can control the expression of viral genes.
LTRs are themselves identical sequences and can be divided into three elements: u3, R and U5. U3 is derived from a unique sequence at the 3' end of RNA. R is derived from sequences repeated at both ends of the RNA. U5 is derived from a unique sequence at the 5' end of RNA. The sizes of these three elements may vary widely between different retroviruses.
In a defective retroviral vector genome, gag, pol and env may be absent or non-functional.
In a typical retroviral vector, at least a portion of one or more protein coding regions necessary for replication may be removed from the virus. This makes viral vectors replication defective. Portions of the viral genome may also be replaced by a library encoding candidate regulatory portions operably linked to regulatory control regions in the vector genome and reporter portions in order to generate a vector comprising candidate regulatory portions capable of transducing a target host cell and/or integrating its genome into the host genome.
Lentiviral vectors are part of the larger group of retroviral vectors. Briefly, lentiviruses can be divided into primate and non-primate groups. Examples of primate lentiviruses include, but are not limited to, human immunodeficiency virus (HIV), which is the causative agent of human acquired immunodeficiency syndrome (AIDS); and simian immunodeficiency virus (SIV). Examples of non-primate lentiviruses include the prototype "slow virus" visna/maedi virus (VMV), as well as the related caprine arthritis-encephalitis virus (CAEV), equine infectious anemia virus (EIAV), and the more recently described feline immunodeficiency virus (FIV) and bovine immunodeficiency virus (BIV).
The lentivirus family differs from other retroviruses in that lentiviruses have the ability to infect both dividing and non-dividing cells. In contrast, other retroviruses, such as MLV, are unable to infect non-dividing or slowly dividing cells, such as those that make up, for example, muscle, brain, lung, and liver tissue.
As used herein, a lentiviral vector is a vector comprising at least one component that may be derived from a lentivirus. Suitably, the component is involved in the biological mechanisms by which the vector infects cells, expresses genes or is replicated.
The lentiviral vector may be a "primate" vector. The lentiviral vector may be a "non-primate" vector (i.e., derived from a virus that does not primarily infect primates, particularly humans). An example of a non-primate lentivirus may be any member of the lentiviridae family that does not naturally infect primates.
As an example of lentiviral-based vectors, HIV-1 and HIV-2 based vectors are described below.
HIV-1 vectors contain cis-acting elements that are also present in simple retroviruses. It has been shown that sequences extending into the gag open reading frame are important for the packaging of HIV-1. Thus, HIV-1 vectors typically contain the relevant portion of gag in which the translation initiation codon has been mutated. In addition, most HIV-1 vectors also contain a portion of the env gene that includes the RRE. Rev binds to the RRE, which allows full-length or singly spliced mRNAs to be transported from the nucleus to the cytoplasm. In the absence of Rev and/or the RRE, full-length HIV-1 RNAs accumulate in the nucleus. Alternatively, a constitutive transport element from certain simple retroviruses (such as Mason-Pfizer monkey virus) can be used to relieve the requirement for Rev and the RRE. Efficient transcription from the HIV-1 LTR promoter requires the viral protein Tat.
Most HIV-2-based vectors are structurally very similar to HIV-1 vectors. Like HIV-1-based vectors, HIV-2 vectors also require the RRE for efficient transport of full-length or singly spliced viral RNAs.
Optionally, the viral vectors used in the present invention have a minimal viral genome.
By "minimal viral genome" is understood that the viral vector has been manipulated to remove non-essential elements and retain essential elements in order to provide the desired functions of infecting, transducing and delivering the nucleotide sequence of interest to the target host cell. Further details of this strategy can be found in WO 1998/017815.
Optionally, the plasmid vector used to produce the viral genome within the host cell/packaging cell will have sufficient lentiviral genetic information to allow packaging of the RNA genome into a viral particle capable of infecting the target cell, but not replicating independently to produce infectious viral particles within the final target cell, in the presence of packaging components. Optionally, the vector lacks a functional gag-pol and/or env gene and/or other genes necessary for replication.
However, the plasmid vector used to produce the viral genome within the host cell/packaging cell will also include transcriptional regulatory control sequences operably linked to the lentiviral genome to direct transcription of the genome in the host cell/packaging cell. These regulatory sequences may be the native sequences associated with the transcribed viral sequence (i.e., the 5' U3 region), or they may be a heterologous promoter, such as another viral promoter (e.g., the CMV promoter).
The vector may be a self-inactivating (SIN) vector in which the viral enhancer and promoter sequences have been deleted. SIN vectors can be generated and can transduce non-dividing cells in vivo with an efficacy similar to that of wild-type vectors. Transcriptional inactivation of the Long Terminal Repeat (LTR) in the SIN provirus should prevent mobilization by replication-competent virus. This should also enable regulated expression of genes from internal promoters by eliminating any cis-acting effects of the LTR.
The vector may be integration defective. Integration-defective lentiviral vectors (IDLV) can be produced, for example, by packaging the vector with a catalytically inactive integrase, such as an HIV integrase with a D64V mutation at the catalytic site, or by modifying or deleting the essential att sequence from the vector LTR, or a combination of the above.
Adenovirus vector
The vector of the present invention may be an adenovirus vector. The vector of the present invention may be an adenovirus vector particle.
Adenoviruses are double-stranded, linear DNA viruses that do not replicate through an RNA intermediate. There are more than 50 different human serotypes of adenovirus, divided into 6 subgroups based on gene sequence homology. The natural targets of adenoviruses are the respiratory and gastrointestinal epithelia, where infection generally causes only mild symptoms. Serotypes 2 and 5 (with 95% sequence homology) are most commonly used in adenoviral vector systems and are normally associated with upper respiratory tract infections in the young.
Adenoviruses have been used as vectors for gene therapy and for heterologous gene expression. The large (36 kb) genome can accommodate up to 8 kb of foreign insert DNA and can replicate efficiently in complementing cell lines to yield very high titers of up to 10^12. Thus, adenovirus is one of the best systems to study gene expression in primary non-replicating cells.
Expression of viral or foreign genes from the adenovirus genome does not require a replicating cell. Adenovirus vectors enter cells by receptor-mediated endocytosis. Once in the cell, adenovirus vectors rarely integrate into the host chromosome. Instead, they function in episomal form (independently of the host genome) as linear genomes in the nucleus of the host cell. Thus, the use of recombinant adenoviruses alleviates the problems associated with random integration into the host genome.
Herpes simplex virus vector
The vector of the present invention may be a herpes simplex virus vector. The vector of the present invention may be a herpes simplex virus vector particle.
Herpes Simplex Virus (HSV) is a neurotropic DNA virus with advantageous properties as a gene delivery vehicle. HSV is highly infectious, and thus HSV vectors are highly efficient vehicles for delivering exogenous genetic material to cells. Viral replication is easily disrupted by null mutations in the immediate early genes, which can be trans-complemented in vitro, enabling straightforward production of high-titer, pure preparations of non-pathogenic vectors. The genome is large (152 kb) and many viral genes are dispensable for replication in vitro, allowing them to be replaced with large or multiple transgenes. Latent infection with wild-type virus results in the persistence of episomal virus in the nuclei of sensory neurons for the lifetime of the host. The vectors are non-pathogenic, unable to reactivate, and persist long term. The latency-active promoter complex can be used in vector design to achieve long-term, stable transgene expression in the nervous system. HSV vectors transduce a wide range of tissues because the cellular receptors recognized by the virus are broadly expressed. An increasing understanding of the processes involved in cell entry has allowed the tropism of HSV vectors to be targeted.
Vaccinia virus vector
The vector of the invention may be a vaccinia virus vector. The vector of the invention may be a vaccinia virus vector particle.
Vaccinia virus is a large enveloped virus with a linear double-stranded DNA genome of about 190 kb. Vaccinia virus can accommodate up to about 25 kb of foreign DNA, which also makes it useful for the delivery of large genes.
Many attenuated vaccinia virus strains suitable for use in gene therapy applications are known in the art, such as the MVA and NYVAC strains.
RNA-guided gene editing
The vectors of the invention may be used to deliver polynucleotides into cells. Subsequently, nucleotide sequence inserts can be introduced into the cell genome at Double Strand Break (DSB) sites by Homology Directed Repair (HDR). The site of the Double Strand Break (DSB) may be specifically introduced by any suitable technique, for example by using an RNA-guided gene editing system.
An "RNA-guided gene editing system" may be used to introduce DSBs and typically comprises guide RNAs and RNA-guided nucleases. The CRISPR/Cas9 system is an example of a commonly used RNA-guided gene editing system, but other RNA-guided gene editing systems may also be used.
Guide RNA
"Guide RNAs" (gRNAs) confer target sequence specificity to RNA-guided nucleases. Guide RNAs are short, non-coding RNA sequences that bind to complementary target DNA sequences. For example, in a CRISPR/Cas9 system, the guide RNA first binds to the Cas9 enzyme, and the gRNA sequence directs the resulting complex via base pairing to a specific location on the DNA, where Cas9 performs its nuclease activity by cleaving the target DNA strand.
The term "guide RNA" encompasses any suitable gRNA that can be used with any RNA-guided nuclease, not just those gRNAs that are compatible with a particular nuclease (such as Cas9).
The guide RNA may comprise a trans-activating CRISPR RNA (tracrRNA), which provides a stem-loop structure, and a target-specific CRISPR RNA (crRNA) designed to direct cleavage at the gene target site of interest. The tracrRNA and crRNA can be annealed, for example, by heating them at 95 °C for 5 minutes and allowing them to cool slowly to room temperature for 10 minutes. Alternatively, the guide RNA may be a single guide RNA (sgRNA) comprising both the crRNA and the tracrRNA in a single construct.
The guide RNA may comprise a 3' end that forms a scaffold for nuclease binding and a 5' end that is programmable to target different DNA sites. For example, the targeting specificity of CRISPR-Cas9 may be determined by the 15-25 bp sequence at the 5' end of the guide RNA. The desired target sequence is typically located immediately before the Protospacer Adjacent Motif (PAM), a short DNA sequence, typically 2-6 bp in length, that follows the DNA region targeted for cleavage by the CRISPR system (such as CRISPR-Cas9). The PAM is necessary for Cas nuclease cleavage and is typically found 3-4 bp downstream of the cleavage site. After base pairing of the guide RNA with the target, Cas9 mediates a double strand break approximately 3 nt upstream of the PAM.
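The relationship between protospacer, PAM and cleavage position can be illustrated with a short sketch. It is not part of the described method: it assumes SpCas9 with an NGG PAM and a 20-nt protospacer, scans the forward strand only, and the function name `find_spcas9_sites` is hypothetical.

```python
import re

def find_spcas9_sites(seq, protospacer_len=20):
    """List forward-strand candidate sites: a protospacer followed by an NGG PAM.

    Assumption: SpCas9-style targeting (NGG PAM directly 3' of the protospacer).
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidates are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    sites = []
    for m in pattern.finditer(seq):
        pam_start = m.start() + protospacer_len  # 0-based index of first PAM base
        sites.append({
            "protospacer": m.group(1),
            "pam": m.group(2),
            "pam_start": pam_start,
            # The break is expected about 3 nt upstream of the PAM, i.e.
            # between seq[cut_index - 1] and seq[cut_index].
            "cut_index": pam_start - 3,
        })
    return sites

# Hypothetical toy sequence, not a sequence from this document.
for site in find_spcas9_sites("ACGTACGTACGTACGTACGT" + "TGG" + "AAAA"):
    print(site["protospacer"], site["pam"], site["cut_index"])
```

A practical design tool would also scan the reverse strand and score off-target sites, which this sketch does not attempt.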
There are many tools for designing guide RNAs (e.g., Cui, Y., et al. 2018. Interdisciplinary Sciences: Computational Life Sciences, 10(2), pp. 455-465). For example, COSMID is a web-based tool for identifying and validating guide RNAs (Cradick TJ, et al. Mol Ther Nucleic Acids. 2014;3(12):e214).
A list of exemplary guide RNAs for use in the present invention is provided in table 4 below.
TABLE 4 exemplary guide RNA
In one aspect, the invention provides a guide RNA comprising or consisting of a nucleotide sequence having at least 90% identity or at least 95% identity to any one of SEQ ID NOS: 41-52, optionally wherein the guide RNA comprises or consists of a nucleotide sequence having at least 90% identity or at least 95% identity to SEQ ID NO: 41.
In some embodiments, the guide RNA comprises or consists of the nucleotide sequence of any one of SEQ ID NOS: 41-52, optionally wherein the guide RNA comprises or consists of the nucleotide sequence of SEQ ID NO: 41.
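To give a sense of what the identity thresholds mean for a short guide sequence, the following sketch computes an ungapped percent identity and the number of mismatches a threshold tolerates. It is a simplification: sequence identity is normally determined with an alignment tool, whereas this sketch assumes two sequences of equal length compared position by position. The function names and example sequences are hypothetical.

```python
def percent_identity(a, b):
    """Ungapped identity, in percent, between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares equal-length sequences")
    if not a:
        raise ValueError("empty sequence")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)

def max_mismatches(length, threshold_percent):
    """Largest number of mismatches that still meets the identity threshold."""
    return int(length * (100 - threshold_percent) // 100)

# Hypothetical 20-nt sequences differing at one position.
reference = "ACGTACGTACGTACGTACGT"
candidate = "ACGTACGTACGTACGTACGA"
print(percent_identity(reference, candidate))           # 95.0
print(max_mismatches(20, 90), max_mismatches(20, 95))   # 2 1
```

Under this simplified measure, a 20-nt sequence with at least 90% identity to a reference differs at no more than two positions, and one with at least 95% identity at no more than one.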
For example, the sequences of guides 9, 3 and 7 may be extended as follows, for example when used as crRNAs:
In one aspect, the invention provides a guide RNA comprising or consisting of a nucleotide sequence having at least 90% identity or at least 95% identity to any one of SEQ ID NOS: 53-55, optionally wherein the guide RNA comprises or consists of a nucleotide sequence having at least 90% identity or at least 95% identity to SEQ ID NO: 53.
In some embodiments, the guide RNA comprises or consists of the nucleotide sequence of any one of SEQ ID NOS: 53-55, optionally wherein the guide RNA comprises or consists of the nucleotide sequence of SEQ ID NO: 53.
Suitably, the guide RNA is chemically modified. Chemical modification may enhance the stability of the guide RNA. For example, one to five (e.g., three) terminal nucleotides at the 5 'and/or 3' ends of the guide RNA can be chemically modified to enhance stability.
Any chemical modification that enhances the stability of the guide RNA may be used. For example, the chemical modification may be with 2'-O-methyl 3'-phosphorothioate, as described in Hendel A, et al. Nat Biotechnol. 2015;33(9):985-9.
RNA-guided nucleases
A "nuclease" is an enzyme that can cleave the phosphodiester bonds present within a polynucleotide chain. Suitably, the nuclease is an endonuclease. Endonucleases are able to break bonds from the middle of the strand.
An "RNA-guided nuclease" is a nuclease that can be guided by a guide RNA to a specific site. The invention may be practiced using any suitable RNA-guided nuclease, such as any of the RNA-guided nucleases described in Murugan, K., et al., 2017. Molecular Cell, 68(1), pp. 15-25. RNA-guided nucleases include, but are not limited to, type II CRISPR nucleases such as Cas9 and type V CRISPR nucleases such as Cas12a and Cas12b, and other nucleases derived therefrom. In a broad sense, RNA-guided nucleases can be defined by their PAM specificity and cleavage activity.
Suitably, the RNA-guided nuclease is a type II CRISPR nuclease, such as a Cas9 nuclease. Cas9 is a dual RNA-guided endonuclease associated with the Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) adaptive immune system. Cas9 nucleases include the well-characterized ortholog from Streptococcus pyogenes (SpCas9). SpCas9 and other orthologs (including SaCas9, FnCas9 and AnaCas9) have been reviewed by Jiang, F. and Doudna, J.A., 2017. Annual Review of Biophysics, 46, pp. 505-529.
The RNA-guided nuclease may form a complex with the guide RNA, i.e., the guide RNA and the RNA-guided nuclease may together form a ribonucleoprotein (RNP). Suitably, the RNP is a Cas9 RNP. The RNP can be formed by any method known in the art, for example by incubating the RNA-guided nuclease with the guide RNA for 5-30 minutes at room temperature. Delivery of Cas9 as a pre-assembled RNP can protect the guide RNA from intracellular degradation, thereby increasing the stability and activity of the RNA-guided nuclease (Kim S, et al. Genome Res. 2014;24(6):1012-9).
Kit, composition, and gene editing system
In one aspect, the invention provides a kit, composition or gene editing system comprising a polynucleotide of the invention, a vector of the invention and/or a guide RNA of the invention.
As used herein, a "gene editing system" is a system that comprises all components required to edit a genome using a polynucleotide of the invention.
In some embodiments, the kit, composition or gene editing system comprises a polynucleotide and/or vector of the invention and a guide RNA. The guide RNA may correspond to the same DSB site targeted by the homology arm. For example, in some embodiments, a kit, composition, or gene editing system comprises:
(i) A polynucleotide comprising, from 5 'to 3': a first homologous region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, and a second homologous region, wherein the first homologous region is homologous to a region upstream of chr 11:36569295 and the second homologous region is homologous to a region downstream of chr 11:36569298, and/or a vector comprising said polynucleotide; and a guide RNA comprising or consisting of a nucleotide sequence having at least 90% identity, at least 95% identity or 100% identity to SEQ ID NO. 41 or 53 (preferably SEQ ID NO. 41);
(ii) A polynucleotide comprising, from 5 'to 3': a first homologous region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, and a second homologous region, wherein the first homologous region is homologous to a region upstream of chr 11:36573790 and the second homologous region is homologous to a region downstream of chr 11:36573793, and/or a vector comprising said polynucleotide; and a guide RNA comprising or consisting of a nucleotide sequence having at least 90% identity, at least 95% identity or 100% identity to SEQ ID NO. 42;
(iii) A polynucleotide comprising, from 5 'to 3': a first homologous region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, and a second homologous region, wherein the first homologous region is homologous to a region upstream of chr 11:36573641 and the second homologous region is homologous to a region downstream of chr 11:36573644, and/or a vector comprising said polynucleotide; and a guide RNA comprising or consisting of a nucleotide sequence having at least 90% identity, at least 95% identity or 100% identity to SEQ ID NO. 43;
(iv) A polynucleotide comprising, from 5 'to 3': a first homologous region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, and a second homologous region, wherein the first homologous region is homologous to a region upstream of chr 11:36573351 and the second homologous region is homologous to a region downstream of chr 11:36573354, and/or a vector comprising the polynucleotide; and a guide RNA comprising or consisting of a nucleotide sequence having at least 90% identity, at least 95% identity or 100% identity to SEQ ID NO 44 or 54 (preferably SEQ ID NO 44);
(v) A polynucleotide comprising, from 5 'to 3': a first homologous region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, and a second homologous region, wherein the first homologous region is homologous to a region upstream of chr 11:36569080 and the second homologous region is homologous to a region downstream of chr 11:36569083, and/or a vector comprising said polynucleotide; and a guide RNA comprising or consisting of a nucleotide sequence having at least 90% identity, at least 95% identity or 100% identity to SEQ ID NO. 45;
(vi) A polynucleotide comprising, from 5 'to 3': a first homologous region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, and a second homologous region, wherein the first homologous region is homologous to a region upstream of chr 11:36572472 and the second homologous region is homologous to a region downstream of chr 11:36572475, and/or a vector comprising said polynucleotide; and a guide RNA comprising or consisting of a nucleotide sequence having at least 90% identity, at least 95% identity or 100% identity to SEQ ID NO. 46;
(vii) A polynucleotide comprising, from 5 'to 3': a first homologous region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, and a second homologous region, wherein the first homologous region is homologous to a region upstream of chr 11:36571458 and the second homologous region is homologous to a region downstream of chr 11:36571461, and/or a vector comprising said polynucleotide; and a guide RNA comprising or consisting of a nucleotide sequence having at least 90% identity, at least 95% identity or 100% identity to SEQ ID NO. 47;
(viii) A polynucleotide comprising, from 5 'to 3': a first homologous region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, and a second homologous region, wherein the first homologous region is homologous to a region upstream of chr 11:36571366 and the second homologous region is homologous to a region downstream of chr 11:36571369, and/or a vector comprising said polynucleotide; and a guide RNA comprising or consisting of a nucleotide sequence having at least 90% identity, at least 95% identity or 100% identity to SEQ ID NO. 48 or 55 (preferably SEQ ID NO. 48);
(ix) A polynucleotide comprising, from 5 'to 3': a first homologous region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, and a second homologous region, wherein the first homologous region is homologous to a region upstream of chr 11:36572859 and the second homologous region is homologous to a region downstream of chr 11:36572862, and/or a vector comprising said polynucleotide; and a guide RNA comprising or consisting of a nucleotide sequence having at least 90% identity, at least 95% identity or 100% identity to SEQ ID NO. 49;
(x) A polynucleotide comprising, from 5 'to 3': a first homologous region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, and a second homologous region, wherein the first homologous region is homologous to a region upstream of chr 11:36571457 and the second homologous region is homologous to a region downstream of chr 11:36571460, and/or a vector comprising said polynucleotide; and a guide RNA comprising or consisting of a nucleotide sequence having at least 90% identity, at least 95% identity or 100% identity to SEQ ID NO. 50;
(xi) A polynucleotide comprising, from 5' to 3': a first homologous region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, and a second homologous region, wherein the first homologous region is homologous to a region upstream of chr 11:36569351 and the second homologous region is homologous to a region downstream of chr 11:36569354, and/or a vector comprising said polynucleotide; and a guide RNA comprising or consisting of a nucleotide sequence having at least 90% identity, at least 95% identity or 100% identity to SEQ ID NO. 51; or
(xii) A polynucleotide comprising, from 5 'to 3': a first homologous region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, and a second homologous region, wherein the first homologous region is homologous to a region upstream of chr 11:36572375 and the second homologous region is homologous to a region downstream of chr 11:36572378, and/or a vector comprising the polynucleotide; and a guide RNA comprising or consisting of a nucleotide sequence having at least 90% identity, at least 95% identity or 100% identity to SEQ ID NO. 52.
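The twelve pairings in options (i) to (xii) can be collected in a small lookup table for bookkeeping. The sketch below is illustrative only: the coordinates and SEQ ID NOs are transcribed from the list above, while the names `EMBODIMENTS` and `guides_for_site` are hypothetical.

```python
# (option, upstream anchor, downstream anchor, guide SEQ ID NOs), all on chr 11.
# Where two SEQ ID NOs are listed, the first is the one stated as preferred.
EMBODIMENTS = [
    ("i",    36569295, 36569298, (41, 53)),
    ("ii",   36573790, 36573793, (42,)),
    ("iii",  36573641, 36573644, (43,)),
    ("iv",   36573351, 36573354, (44, 54)),
    ("v",    36569080, 36569083, (45,)),
    ("vi",   36572472, 36572475, (46,)),
    ("vii",  36571458, 36571461, (47,)),
    ("viii", 36571366, 36571369, (48, 55)),
    ("ix",   36572859, 36572862, (49,)),
    ("x",    36571457, 36571460, (50,)),
    ("xi",   36569351, 36569354, (51,)),
    ("xii",  36572375, 36572378, (52,)),
]

def guides_for_site(upstream_anchor):
    """Return the guide SEQ ID NOs paired with a given upstream anchor."""
    for _option, up, _down, seq_ids in EMBODIMENTS:
        if up == upstream_anchor:
            return seq_ids
    raise KeyError(upstream_anchor)

# In every option the two anchors are exactly 3 positions apart.
assert all(down - up == 3 for _, up, down, _ in EMBODIMENTS)
print(guides_for_site(36569295))  # (41, 53)
```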
In some embodiments, the kit, composition or gene editing system comprises:
(i) A polynucleotide comprising, from 5 'to 3': a first homologous region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, and a second homologous region, wherein the first homologous region is homologous to a region upstream of chr 11:36569295 and the second homologous region is homologous to a region downstream of chr 11:36569298, and/or a vector comprising said polynucleotide; and
(ii) A guide RNA comprising or consisting of a nucleotide sequence having at least 90% identity, at least 95% identity or 100% identity to SEQ ID No. 41 or 53 (preferably SEQ ID No. 41).
In some embodiments, the kit, composition or gene editing system comprises:
(i) A polynucleotide comprising, from 5 'to 3': a first homologous region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, and a second homologous region, wherein the first homologous region is homologous to a region comprising chr 11:36569245-36569294 and the second homologous region is homologous to a region comprising chr 11:36569299-36569348, and/or a vector comprising the polynucleotide; and
(ii) A guide RNA comprising or consisting of a nucleotide sequence having at least 90% identity, at least 95% identity or 100% identity to SEQ ID No. 41 or 53 (preferably SEQ ID No. 41).
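The relationship between the target-site coordinates and the flanking windows recited above can be checked with simple interval arithmetic. The sketch below is illustrative only: it assumes that the coordinates are 1-based and inclusive, and the names `Interval` and `flanking_windows` are hypothetical helpers, not part of the disclosure.

```python
from typing import NamedTuple, Tuple


class Interval(NamedTuple):
    chrom: str
    start: int  # assumed 1-based, inclusive
    end: int    # assumed 1-based, inclusive

    def length(self) -> int:
        return self.end - self.start + 1


def flanking_windows(chrom: str, upstream_of: int, downstream_of: int,
                     arm_len: int = 50) -> Tuple[Interval, Interval]:
    """Return the windows immediately flanking a target site.

    The left window ends one base before `upstream_of`; the right window
    starts one base after `downstream_of`.
    """
    if arm_len <= 0 or upstream_of > downstream_of:
        raise ValueError("invalid arm length or target coordinates")
    left = Interval(chrom, upstream_of - arm_len, upstream_of - 1)
    right = Interval(chrom, downstream_of + 1, downstream_of + arm_len)
    return left, right


# Target site bounded by chr 11:36569295 and chr 11:36569298, as recited above.
left, right = flanking_windows("chr11", 36569295, 36569298)
print(left)   # chr11:36569245-36569294
print(right)  # chr11:36569299-36569348
```

Under these assumptions, 50-nucleotide windows reproduce the ranges chr 11:36569245-36569294 and chr 11:36569299-36569348 given in this embodiment.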
In some embodiments, the kit, composition or gene editing system comprises:
(i) A polynucleotide comprising, from 5 'to 3': a first homologous region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, and a second homologous region, wherein the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID NO. 7, and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID NO. 19, and/or a vector comprising the polynucleotide; and
(ii) A guide RNA comprising or consisting of a nucleotide sequence having at least 90% identity, at least 95% identity or 100% identity to SEQ ID No. 41 or 53 (preferably SEQ ID No. 41).
In some embodiments, the kit, composition or gene editing system comprises:
(i) A polynucleotide comprising, from 5 'to 3': a first homologous region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, and a second homologous region, wherein the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity, or 100% identity to SEQ ID NO 31 or a fragment thereof; and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to SEQ ID NO. 32 or a fragment thereof, and/or a vector comprising said polynucleotide; and
(ii) A guide RNA comprising or consisting of a nucleotide sequence having at least 90% identity, at least 95% identity or 100% identity to SEQ ID No. 41 or 53 (preferably SEQ ID No. 41).
The kit, composition or gene editing system may further comprise an RNA-guided nuclease. Suitably, the RNA-guided nuclease corresponds to the guide RNA used. For example, if the guide RNA comprises or consists of a nucleotide sequence having at least 90% identity, at least 95% identity, or 100% identity to any of SEQ ID NOS.41-52, the RNA-guided nuclease is suitably a Cas9 endonuclease. For example, if the guide RNA comprises or consists of a nucleotide sequence having at least 90% identity, at least 95% identity, or 100% identity to any of SEQ ID NOS.53-55, the RNA-guided nuclease is suitably a Cas9 endonuclease.
The RNA-guided nuclease may form a complex with the guide RNA, i.e., the guide RNA and the RNA-guided nuclease together form a Ribonucleoprotein (RNP).
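The identity thresholds recited for the guide RNAs (at least 90%, at least 95% or 100%) can be illustrated with a minimal calculation. The sketch below assumes two sequences of equal length that are already aligned without gaps; in practice, sequence identity is usually determined with an alignment tool. The sequences shown are hypothetical placeholders and are not any of the SEQ ID NOs of the invention.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences (no gaps)."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def meets_threshold(candidate: str, reference: str, threshold: float = 90.0) -> bool:
    """True if the candidate has at least `threshold` percent identity."""
    return percent_identity(candidate, reference) >= threshold


# Hypothetical 20-nucleotide sequences, for illustration only.
reference = "ACGTACGTACGTACGTACGT"
one_mismatch = "ACGTACGTACGTACGTACGA"
three_mismatches = "TCGTACGTACCTACGTACGA"

print(percent_identity(one_mismatch, reference))   # 95.0
print(meets_threshold(three_mismatches, reference))  # False (85% identity)
```

For a 20-nucleotide sequence, each mismatch costs five percentage points, so "at least 90% identity" tolerates at most two mismatches under this ungapped definition.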
Cells
In one aspect, the invention provides cells that have been edited using a polynucleotide, vector, kit, composition, or gene editing system of the invention.
In a related aspect, the invention provides a cell comprising a polynucleotide, vector and/or genome of the invention.
Suitably, the cells are isolated cells. Suitably, the cell is a mammalian cell, e.g. a human cell.
Suitably, the cell is a Hematopoietic Stem Cell (HSC), hematopoietic Progenitor Cell (HPC) or Lymphoid Progenitor Cell (LPC). In some embodiments, the cell is a HSC or HPC, optionally the cell is a HSC.
As used herein, a "hematopoietic stem cell" is a stem cell that has no differentiation potential for cells other than hematopoietic cells, a "hematopoietic progenitor cell" is a progenitor cell that has no differentiation potential for cells other than hematopoietic cells, and a "lymphoid progenitor cell" is a progenitor cell that has no differentiation potential for cells other than lymphocytes.
The cells may be obtained from any source. The cells may be autologous or allogeneic. The cells may be obtained or obtainable from any biological sample, such as peripheral blood or umbilical cord blood. The peripheral blood may be treated with a mobilizing agent, i.e., it may be mobilized peripheral blood. The cell may be a universal cell.
The cells may be isolated or isolatable by methods known to those skilled in the art, for example using commercially available antibodies that bind to cell surface antigens such as CD34. For example, antibodies can be conjugated to magnetic beads and the desired cell type recovered using an immunological procedure. Suitably, the cells are identified by the presence or absence of one or more antigenic markers. Suitable antigenic markers include CD34, CD133, CD90, CD45, CD4, CD19, CD13, CD3, CD56, CD14, CD61/41, CD135, CD45RA, CD33, CD66b, CD38, CD45, CD10, CD11c, CD19, CD7 and CD71.
Suitably, the cell is identified by the presence of the antigenic marker CD34 (CD34+), i.e. the cell is a CD34+ cell. For example, the cells may be cord blood CD34+ cells or (mobilized) peripheral blood CD34+ cells. The cells may be CD34+ HSCs, CD34+ HPCs or CD34+ LPCs; optionally the cells are CD34+ HSCs.
In some embodiments, the cell is identified by the presence of CD34 and the presence or absence of one or more further antigenic markers. The further antigenic marker may be selected from one or more of CD133, CD90, CD3, CD56, CD14, CD61/41, CD135, CD45RA, CD33, CD66b, CD38, CD45, CD10, CD11c, CD19, CD7 and CD71. For example, the cell may be a CD34+CD133+CD90+ cell, a CD34+CD133+CD90- cell, or a CD34+CD133-CD90- cell.
Suitably, the cells are NALM6 cells, K562 cells or other human cells (e.g. Molt4 cells, U937 cells, etc.). Suitably, the cell is a T cell.
Cell population
In one aspect, the invention provides a population of cells comprising the cells of the invention. Suitably, at least 1%, at least 2%, at least 5%, at least 10% or at least 20% of the cells in the population are cells of the invention. Suitably, the population of cells comprises at least 10x10^5, at least 50x10^5 or at least 100x10^5 cells of the invention.
In a related aspect, the invention provides a population of cells that has been edited using a polynucleotide, vector, kit, composition or gene editing system of the invention. Suitably, at least 1%, at least 2%, at least 5%, at least 10% or at least 20% of the cells in the population of cells are cells that have been edited using a polynucleotide, vector, kit, composition or gene editing system of the invention. Suitably, the population of cells comprises at least 10x10^5, at least 50x10^5 or at least 100x10^5 cells that have been edited using a polynucleotide, vector, kit, composition, or gene editing system of the invention.
In a related aspect, the invention provides a population of cells comprising a polynucleotide, vector and/or genome of the invention. Suitably, at least 1%, at least 2%, at least 5%, at least 10% or at least 20% of the cells in the population of cells are cells comprising a polynucleotide, vector and/or genome of the invention. Suitably, the population of cells comprises at least 10x10^5, at least 50x10^5 or at least 100x10^5 cells comprising a polynucleotide, vector and/or genome of the invention.
Suitably, the population of cells is mammalian cells, for example human cells. The cell population may be autologous or allogeneic. Suitably, the population of cells is obtained or obtainable from (mobilized) peripheral blood or umbilical cord blood. The cell population may be universal cells.
Suitably, at least 50%, at least 60%, at least 70% or at least 80% of the population of cells are HSCs, HPCs and/or LPCs. Suitably, at least 50%, at least 60%, at least 70% or at least 80% of the population of cells are CD34+ cells.
In some embodiments, at least 1%, at least 2%, at least 5%, at least 10%, or at least 20% of the population of cells are CD34+ cells comprising the polynucleotides, vectors, and/or genomes of the invention. For example, in some embodiments, at least 20% of the population of cells are CD34+ cells comprising the genome of the invention.
In some embodiments, the population of cells comprises at least 10x10^5, at least 50x10^5 or at least 100x10^5 CD34+ cells comprising the polynucleotides, vectors and/or genomes of the invention. For example, in some embodiments, the population of cells comprises at least 100x10^5 CD34+ cells comprising a genome of the invention.
Gene editing method
In one aspect, the invention provides methods of gene editing of a cell or population of cells using the polynucleotides, vectors, guide RNAs, kits, compositions and/or gene editing systems of the invention. The invention also provides a population of genetically edited cells obtained or obtainable by the method.
In another aspect, the invention provides the use of a polynucleotide, vector, guide RNA, kit, composition and/or gene editing system of the invention for gene editing of a cell or cell population.
Suitably, the method of gene editing a cell or population of cells comprises:
(a) Providing a cell or population of cells; and
(b) Using the kit, composition and/or gene editing system described herein to obtain a gene-edited cell or population of gene-edited cells.
For example, the method of gene editing a cell or population of cells may comprise:
(a) Providing a cell or population of cells; and
(b) Delivering the RNA-guided nuclease, guide RNA and/or polynucleotide or vector of the invention to the cell or population of cells to obtain a gene-edited cell or population of gene-edited cells.
The gene-edited cell or population of gene-edited cells may be as defined herein. The invention also provides a gene-edited cell or population of gene-edited cells obtained or obtainable by said method.
Step (a) providing a cell or cell population
The population of cells may be obtained or obtainable from any suitable source. Suitably, the population of cells is obtained or obtainable from (mobilized) peripheral blood or umbilical cord blood. The population of cells may be obtained or obtainable from a subject, such as a subject to be treated. Suitably, the cell population may be isolated and/or enriched from the biological sample by any method known in the art, for example by FACS and/or magnetic bead sorting.
Suitably, the population of cells is mammalian cells, for example human cells. The cell population may be, for example, autologous or allogeneic. The cell population may be, for example, universal cells.
Suitably, the population of cells comprises about 1x10^5 cells/well to about 10x10^5 cells/well, e.g. about 2x10^5 cells/well or about 5x10^5 cells/well.
The cell population may comprise HSCs, HPCs and/or LPCs. Suitably, at least 50%, at least 60%, at least 70% or at least 80% of the population of cells are HSCs, HPCs and/or LPCs. In some embodiments, the population of cells consists essentially of, or consists of, HSCs, HPCs, and/or LPCs.
The cell population may comprise CD34+ cells, such as CD34+ HSCs, HPCs, and/or LPCs. Suitably, at least 50%, at least 60%, at least 70% or at least 80% of the population of cells are CD34+ cells, e.g. CD34+ HSCs, HPCs and/or LPCs. In some embodiments, the population of cells consists essentially of, or consists of, CD34+ cells, e.g., CD34+ HSCs, HPCs, and/or LPCs.
The population of cells may comprise CD34+CD133+CD90+ cells, CD34+CD133+CD90- cells, and/or CD34+CD133-CD90- cells. Suitably, at least 50%, at least 60%, at least 70% or at least 80% of the population of cells are CD34+CD133+CD90+ cells, CD34+CD133+CD90- cells and/or CD34+CD133-CD90- cells. In some embodiments, the population of cells consists essentially of, or consists of, CD34+CD133+CD90+ cells, CD34+CD133+CD90- cells, and/or CD34+CD133-CD90- cells.
The cell or population of cells may be cultured prior to step (b). The pre-incubation step may comprise a pre-activation step and/or a pre-amplification step, optionally the pre-incubation step is a pre-activation step.
As used herein, a "pre-culture step" refers to a culture step that occurs prior to genetic modification of a cell. As used herein, a "pre-activation step" refers to an activation step or stimulation step that occurs prior to genetic modification of a cell. As used herein, a "pre-amplification step" refers to an amplification step that occurs prior to genetic modification of a cell.
Suitably, the method may comprise:
(a1) Providing a population of cells;
(a2) Pre-culturing (e.g., pre-activating and/or pre-expanding) the population of cells to obtain a pre-cultured (e.g., pre-activated and/or pre-expanded) population of cells; and
(b) Delivering the RNA-guided nuclease, guide RNA and/or polynucleotide or vector of the invention to the pre-cultured (e.g., pre-activated and/or pre-expanded) cell population to obtain a gene-edited cell population.
The pre-incubation step (e.g., the pre-activation step and/or the pre-amplification step) may be performed using any suitable conditions.
During the pre-incubation step (e.g., the pre-activation step and/or the pre-amplification step), the cell population may be seeded at a concentration of about 1x10^5 cells/ml to about 10x10^5 cells/ml, e.g. about 2x10^5 cells/ml or about 5x10^5 cells/ml.
Suitably, the pre-incubation step (e.g. the pre-activation step and/or the pre-amplification step) lasts at least 1 day, at least 2 days or at least 3 days. Suitably, the cell population is pre-cultured (e.g., pre-activated and/or pre-expanded) for about 3 days. Suitably, the population of cells is pre-cultured at 37 °C in a humidified atmosphere of 5% CO2.
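The seeding concentrations above translate directly into a culture volume for a given number of cells. The following is a minimal sketch of that arithmetic; the function names and the example cell count of 2x10^6 are assumptions chosen for illustration and do not come from the disclosure.

```python
# Range of seeding concentrations stated above (cells/ml).
LOW_CELLS_PER_ML = 1 * 10**5
HIGH_CELLS_PER_ML = 10 * 10**5


def in_stated_range(cells_per_ml: int) -> bool:
    """True if the concentration lies within about 1x10^5 to 10x10^5 cells/ml."""
    return LOW_CELLS_PER_ML <= cells_per_ml <= HIGH_CELLS_PER_ML


def seeding_volume_ml(total_cells: int, cells_per_ml: int) -> float:
    """Medium volume (ml) needed to seed `total_cells` at `cells_per_ml`."""
    if total_cells < 0 or cells_per_ml <= 0:
        raise ValueError("cell count must be >= 0 and concentration > 0")
    return total_cells / cells_per_ml


# Hypothetical example: 2x10^6 cells seeded at about 5x10^5 cells/ml.
print(seeding_volume_ml(2 * 10**6, 5 * 10**5))  # 4.0
print(in_stated_range(5 * 10**5))               # True
```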
Any suitable medium may be used. For example, commercially available media such as StemSpan media containing bovine serum albumin, insulin, transferrin and supplements in Iscove's MDM may be used. The medium may be supplemented with one or more antibiotics (e.g., penicillin, streptomycin).
The pre-incubation step (e.g., the pre-activation step and/or the pre-amplification step) may be performed in the presence of one or more cytokines and/or growth factors. As used herein, "cytokine" is any cell signaling substance and includes chemokines, interferons, interleukins, lymphokines, and tumor necrosis factors. As used herein, a "growth factor" is any substance capable of stimulating cell proliferation, wound healing, or cell differentiation. The terms "cytokine" and "growth factor" may overlap.
The pre-incubation step (e.g., pre-activation step and/or pre-amplification step) may be performed in the presence of one or more early acting cytokines, one or more transduction enhancers, and/or one or more amplification enhancers.
Early acting cytokines
As used herein, an "early acting cytokine" is a cytokine that stimulates HSCs, HPCs, and/or LPCs or CD34+ cells. Early acting cytokines include thrombopoietin (TPO), stem cell factor (SCF), Flt3-ligand (FLT3-L), interleukin (IL)-3, and IL-6. In some embodiments, the pre-incubation step (e.g., the pre-activation step and/or the pre-amplification step) is performed in the presence of at least one early acting cytokine. Early acting cytokines may be used at any suitable concentration, for example 1-1000ng/ml or 10-500ng/ml.
In some embodiments, the pre-incubation step (e.g., the pre-activation step and/or the pre-amplification step) is performed in the presence of SCF. The concentration of SCF may be about 10-1000ng/ml, about 50-500ng/ml, or about 100-300ng/ml.
In some embodiments, the pre-incubation step (e.g., the pre-activation step and/or the pre-amplification step) is performed in the presence of FLT3-L. The concentration of FLT3-L may be about 10-1000ng/ml, about 50-500ng/ml, or about 100-300ng/ml.
In some embodiments, the pre-incubation step (e.g., the pre-activation step and/or the pre-amplification step) is performed in the presence of TPO. The TPO concentration may be about 5 to 500ng/ml, about 10 to 200ng/ml, or about 20 to 100ng/ml.
In some embodiments, the pre-incubation step (e.g., the pre-activation step and/or the pre-amplification step) is performed in the presence of IL-3. The concentration of IL-3 may be about 10-200ng/ml, about 20-100ng/ml, or about 60ng/ml.
In some embodiments, the pre-incubation step (e.g., the pre-activation step and/or the pre-amplification step) is performed in the presence of IL-6. The concentration of IL-6 may be about 5-100ng/ml, about 10-50ng/ml, or about 20ng/ml.
In some embodiments, the pre-incubation step (e.g., the pre-activation step and/or the pre-amplification step) is performed in the presence of SCF (e.g., at a concentration of about 100 ng/ml), FLT3-L (e.g., at a concentration of about 100 ng/ml), TPO (e.g., at a concentration of about 20 ng/ml), and IL-6 (e.g., at a concentration of about 20 ng/ml), particularly when the cell population is cord blood CD34+ cells.
In some embodiments, the pre-incubation step (e.g., the pre-activation step and/or the pre-amplification step) is performed in the presence of SCF (e.g., at a concentration of about 300 ng/ml), FLT3-L (e.g., at a concentration of about 300 ng/ml), TPO (e.g., at a concentration of about 100 ng/ml), and IL-3 (e.g., at a concentration of about 60 ng/ml), particularly when the cell population is (mobilized) peripheral blood CD34+ cells.
Transduction enhancers
As used herein, a "transduction enhancer" is a substance capable of improving viral transduction of HSCs, HPCs and/or LPCs or CD34+ cells. Suitable transduction enhancers include LentiBOOST, prostaglandin E2 (PGE2), protamine sulfate (PS), Vectofusin-1, ViraDuctin, RetroNectin, staurosporine, 7-hydroxy-staurosporine, human serum albumin, polyvinyl alcohol, and cyclosporin H (CsH). In some embodiments, the pre-incubation step (e.g., the pre-activation step and/or the pre-amplification step) is performed in the presence of at least one transduction enhancer. Transduction enhancers may be used at any suitable concentration, for example as described in Schott, J.W., et al., 2019. Molecular Therapy - Methods & Clinical Development, 14, pp.134-147 or Yang, H., et al., 2020. Molecular Therapy - Nucleic Acids, 20, pp.451-458.
In some embodiments, the pre-incubation step (e.g., the pre-activation step and/or the pre-amplification step) is performed in the presence of PGE2. Suitably, the PGE2 is 16,16-dimethyl prostaglandin E2 (dmPGE2). The concentration of PGE2 may be about 1-100 µM, about 5-20 µM, or about 10 µM.
In some embodiments, the pre-incubation step (e.g., the pre-activation step and/or the pre-amplification step) is performed in the presence of CsH. The concentration of CsH may be about 1-50 µM, about 5-50 µM, about 10-50 µM, or about 10 µM.
Amplification enhancer
As used herein, an "expansion enhancer" is a substance capable of improving expansion of HSCs, HPCs, and/or LPCs or CD34+ cells. Suitable expansion enhancers include UM171, UM729, StemRegenin 1 (SR1), diethylaminobenzaldehyde (DEAB), LG1506, BIO (GSK3β inhibitor), NR-101, trichostatin A (TSA), garcinol (GAR), valproic acid (VPA), copper chelators, tetraethylenepentamine and nicotinamide. In some embodiments, the pre-incubation step (e.g., the pre-activation step and/or the pre-amplification step) is performed in the presence of at least one expansion enhancer. Expansion enhancers may be used at any suitable concentration, for example as described in Huang, X., et al., 2019. F1000Research, 8, 1833.
In some embodiments, the pre-incubation step (e.g., the pre-activation step and/or the pre-amplification step) is performed in the presence of UM171 or UM729. The concentration of UM171 may be about 10-200nM, about 20-100nM, or about 50nM.
In some embodiments, the pre-incubation step (e.g., the pre-activation step and/or the pre-amplification step) is performed in the presence of SR1. The concentration of SR1 may be about 0.1-10 µM, about 0.5-5 µM, or about 1 µM.
In some embodiments, the pre-incubation step (e.g., the pre-activation step and/or the pre-amplification step) is performed in the presence of UM171 (e.g., at a concentration of about 50 nM) or UM729 and SR1 (e.g., at a concentration of about 1 µM).
In some embodiments, the pre-incubation step (e.g., the pre-activation step and/or the pre-amplification step) is performed in the presence of SCF (e.g., at a concentration of about 100 ng/ml), FLT3-L (e.g., at a concentration of about 100 ng/ml), TPO (e.g., at a concentration of about 20 ng/ml), IL-6 (e.g., at a concentration of about 20 ng/ml), PGE2 (e.g., at a concentration of about 10 µM), UM171 (e.g., at a concentration of about 50 nM), and SR1 (e.g., at a concentration of about 1 µM), particularly when the cell population is cord blood CD34+ cells.
In some embodiments, the pre-incubation step (e.g., the pre-activation step and/or the pre-amplification step) is performed in the presence of SCF (e.g., at a concentration of about 300 ng/ml), FLT3-L (e.g., at a concentration of about 300 ng/ml), TPO (e.g., at a concentration of about 100 ng/ml), IL-3 (e.g., at a concentration of about 60 ng/ml), PGE2 (e.g., at a concentration of about 10 µM), UM171 (e.g., at a concentration of about 50 nM), and SR1 (e.g., at a concentration of about 1 µM), particularly when the cell population is (mobilized) peripheral blood CD34+ cells.
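The two exemplary pre-incubation cocktails above differ only in the cytokine concentrations and in the use of IL-6 versus IL-3. One way to keep them side by side is a small lookup table, sketched below. The dictionary keys, the structure and the `describe` helper are hypothetical conveniences; only the component names and the approximate concentrations are taken from the text.

```python
from typing import Dict, List, Tuple

# (approximate concentration, unit) per component, keyed by cell source.
PRECULTURE_COCKTAILS: Dict[str, Dict[str, Tuple[float, str]]] = {
    "cord_blood": {
        "SCF": (100, "ng/ml"), "FLT3-L": (100, "ng/ml"),
        "TPO": (20, "ng/ml"), "IL-6": (20, "ng/ml"),
        "PGE2": (10, "uM"), "UM171": (50, "nM"), "SR1": (1, "uM"),
    },
    "mobilized_peripheral_blood": {
        "SCF": (300, "ng/ml"), "FLT3-L": (300, "ng/ml"),
        "TPO": (100, "ng/ml"), "IL-3": (60, "ng/ml"),
        "PGE2": (10, "uM"), "UM171": (50, "nM"), "SR1": (1, "uM"),
    },
}


def describe(source: str) -> List[str]:
    """Return one 'component: about X unit' line per cocktail component."""
    cocktail = PRECULTURE_COCKTAILS[source]  # KeyError for an unknown source
    return [f"{name}: about {conc} {unit}" for name, (conc, unit) in cocktail.items()]


def differing_components(a: str, b: str) -> List[str]:
    """Components that are absent from one cocktail or differ in concentration."""
    ca, cb = PRECULTURE_COCKTAILS[a], PRECULTURE_COCKTAILS[b]
    return sorted(k for k in set(ca) | set(cb) if ca.get(k) != cb.get(k))


for line in describe("cord_blood"):
    print(line)
print(differing_components("cord_blood", "mobilized_peripheral_blood"))
```

The comparison shows that the enhancers PGE2, UM171 and SR1 are used at the same approximate concentrations for both cell sources, while the cytokines differ.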
Step (b) obtaining a gene-edited cell or a population of gene-edited cells
Kits, compositions and/or gene editing systems comprising RNA-guided nucleases, guide RNAs and/or polynucleotides or vectors of the invention may be used, for example, to obtain a gene-edited cell or population of gene-edited cells.
The RNA-guided nuclease, guide RNA, and/or polynucleotide or vector may be any suitable combination described herein. The guide RNA may correspond to the same DSB site targeted by the homology arm. The RNA-guided nuclease may correspond to the guide RNA used. For example:
(i) The RNA-guided nuclease may be a Cas9 endonuclease;
(ii) The guide RNA may be a guide RNA comprising or consisting of a nucleotide sequence having at least 90% identity or at least 95% identity to any of SEQ ID NOS: 41-52 or 53-55, optionally wherein the guide RNA comprises or consists of a nucleotide sequence having at least 90% identity or at least 95% identity to SEQ ID NO:41 or 53 (preferably SEQ ID NO: 41); and
(iii) The polynucleotide may be a polynucleotide comprising from 5 'to 3' the following: a first homologous region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, and a second homologous region, wherein the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to any of SEQ ID NOS: 7-18, and/or the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity to any of SEQ ID NOS: 19-30; or the vector may be a vector comprising said polynucleotide.
In some embodiments:
(i) The RNA-guided nuclease may be a Cas9 endonuclease;
(ii) The guide RNA comprises or consists of a nucleotide sequence having at least 90% identity, at least 95% identity or 100% identity to SEQ ID NO. 41 or 53 (preferably SEQ ID NO. 41); and
(iii) The polynucleotide comprises from 5 'to 3': a first homologous region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, and a second homologous region, wherein the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity, or 100% identity to SEQ ID NO 31 or a fragment thereof; and the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 98% identity or 100% identity with SEQ ID No. 32 or a fragment thereof; or the vector comprises said polynucleotide.
Delivery of RNA-guided nucleases, guide RNAs and/or polynucleotides or vectors
The RNA-guided nuclease, guide RNA, and/or polynucleotide or vector may be delivered to the cells by any suitable technique. For example, RNA-guided nucleases can be delivered directly using electroporation, microinjection, bead loading, or the like, or indirectly via transfection and/or transduction. The guide RNA and/or polynucleotide or vector may be introduced by transfection and/or transduction.
As used herein, "transfection" is the process of delivering a polypeptide and/or polynucleotide to a target cell using a non-viral vector. Typical transfection methods include electroporation, DNA biolistics, lipid-mediated transfection, compacted DNA-mediated transfection, liposomes, immunoliposomes, lipofectin, cationic agent-mediated transfection, cationic facial amphiphiles (CFAs), and combinations thereof.
As used herein, "transduction" is the process of delivering a polynucleotide to a target cell using a viral vector. Typical transduction methods include infection with recombinant viral vectors, such as adeno-associated virus, retrovirus, lentivirus, adenovirus, baculovirus, and herpes simplex virus vectors.
RNA-guided nucleases and guide RNAs can be delivered by any suitable method, such as any of the methods described in Wilbie, D., et al., 2019. Accounts of Chemical Research, 52(6), pp.1555-1564. Suitably, the RNA-guided nuclease and the guide RNA are delivered pre-assembled together in the form of an RNP complex. RNP complexes can be delivered by electroporation.
Any suitable dose of RNA-guided nuclease and/or guide RNA can be used. For example, the guide RNA may be delivered at a dose of about 10-100 pmol/well, optionally about 50 pmol/well. For example, the RNP may be delivered at a dose of about 1-10 µM, optionally 1-2.5 µM.
The RNA-guided nuclease and/or guide RNA may be delivered prior to and/or concurrently with the polynucleotide or vector of the invention. Suitably, the RNA-guided nuclease and/or guide RNA is delivered prior to the polynucleotide or vector. For example, the RNA-guided nuclease and/or guide RNA can be delivered about 1-100 minutes, about 5-30 minutes, or about 15 minutes before the polynucleotide or vector.
The polynucleotides or vectors of the invention may be delivered by any suitable method. For example, the polynucleotide may be in a viral vector, or the vector may be a viral vector, and may be delivered by transduction.
Any suitable dose of polynucleotide or vector may be used. For example, the vector may be delivered at an MOI of about 10^4 to 10^5 vg/cell, optionally about 10^4 vg/cell.
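The MOI stated above is expressed in vector genomes (vg) per cell, so the total number of vector genomes scales with the number of cells to be transduced. The sketch below illustrates this arithmetic. The cell number per well is taken from the exemplary range given earlier for the cell population, whereas the stock titer of 1x10^12 vg/ml is a purely hypothetical value used to show the volume calculation; both function names are illustrative.

```python
def total_vector_genomes(cells: int, moi_vg_per_cell: int) -> int:
    """Total vector genomes needed to reach the given MOI."""
    if cells < 0 or moi_vg_per_cell < 0:
        raise ValueError("cells and MOI must be non-negative")
    return cells * moi_vg_per_cell


def vector_volume_ul(total_vg: int, titer_vg_per_ml: int) -> float:
    """Volume of vector stock (microlitres) containing `total_vg` genomes."""
    if titer_vg_per_ml <= 0:
        raise ValueError("titer must be positive")
    return total_vg * 1000 / titer_vg_per_ml


cells_per_well = 2 * 10**5    # about 2x10^5 cells/well
moi = 10**4                   # about 10^4 vg/cell
hypothetical_titer = 10**12   # vg/ml; assumed for illustration only

vg = total_vector_genomes(cells_per_well, moi)
print(vg)                                        # 2000000000
print(vector_volume_ul(vg, hypothetical_titer))  # 2.0
```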
Delivery of p53 inhibitors and/or HDR enhancers
The method may further comprise the step of delivering a p53 inhibitor and/or an HDR enhancer. The p53 inhibitor and the HDR enhancer may be delivered simultaneously with each other. The p53 inhibitor and/or HDR enhancer may be delivered simultaneously with or subsequent to the RNA-guided nuclease and/or guide RNA.
As used herein, a "p53 inhibitor" is a substance that inhibits activation of the p53 pathway. The p53 pathway plays a role in the regulation or progression of the cell cycle, apoptosis, and genomic stability by several mechanisms, including activation of DNA repair proteins, arrest of the cell cycle, and initiation of apoptosis. Inhibiting this p53 response during editing has been shown to increase hematopoietic repopulation by treated cells (Schiroli, G., et al., 2019. Cell Stem Cell, 24, pp.551-565). Suitably, the p53 inhibitor is a dominant negative p53 mutant protein, e.g. GSE56.
GSE56 may have the following amino acid sequence:
In one embodiment, the p53 dominant negative peptide is a variant of GSE56 comprising 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acid substitutions, additions, or deletions while retaining GSE56 activity, e.g., reducing or preventing p53 signaling.
In one embodiment, the p53 dominant negative peptide comprises an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID No. 67.
As used herein, an "HDR enhancer" is a substance capable of increasing the HDR efficiency in HSCs, HPCs, and/or LPCs or CD34+ cells. HDR is constrained in long-term repopulating HSCs. Any suitable HDR enhancer may be used, for example as described in Ferrari, S., et al., 2020. Nature Biotechnology, pp.1-11. Suitably, the HDR enhancer is an adenovirus 5 E4orf6/7 protein. The adenovirus 5 E4orf6/7 protein may be as disclosed in WO 2020/002380 (incorporated herein by reference).
The p53 inhibitor and the HDR enhancer may be delivered by any suitable method. The p53 inhibitor and/or HDR enhancer may be transiently expressed, e.g., the p53 inhibitor and/or HDR enhancer may be delivered via mRNA. The p53 inhibitor and HDR enhancer may be delivered either as separate mRNAs or on a single mRNA encoding a fusion protein, optionally together with a self-cleaving peptide (e.g. P2A). Any suitable dose of p53 inhibitor and/or HDR enhancer may be used; for example, the mRNA may be delivered at a concentration of about 10-1000 μg/ml, about 50-500 μg/ml, or about 150 μg/ml.
In some embodiments, step (b) comprises:
(b1) Delivering the RNA-guided nucleases and guide RNAs of the invention by electroporation, optionally preassembled in the form of RNP complexes;
(b2) Optionally, delivering a p53 inhibitor and/or an HDR enhancer; and
(b3) Delivering the polynucleotide or vector of the invention by transduction to provide a population of gene-edited cells.
Culturing a Gene-edited cell or Gene-edited cell population
The method may further comprise the step of culturing the population of gene-edited cells. This may be an expansion step, i.e. the method may further comprise the step of expanding the population of gene-edited cells.
The culturing step (e.g., the amplification step) may be performed using any suitable conditions.
During the culturing step (e.g., the amplification step), the cell population may be seeded at a concentration of about 1x10^5 cells/ml to about 10x10^5 cells/ml, e.g. about 2x10^5 cells/ml or about 5x10^5 cells/ml. Suitably, the culturing step (e.g. the amplification step) lasts at least one day, or one to five days. For example, the culturing step (e.g., the amplification step) may last for about one day. Suitably, the population of cells is cultured at 37 °C in a humidified atmosphere of 5% CO2.
Any suitable medium may be used. For example, commercially available media such as StemSpan media containing bovine serum albumin, insulin, transferrin and supplements in Iscove's MDM may be used. The medium may be supplemented with one or more antibiotics (e.g., penicillin, streptomycin). The culturing step (e.g., the amplifying step) may be performed in the presence of one or more cytokines and/or growth factors.
In some embodiments, step (b) comprises:
(b1) Delivering RNA-guided nucleases and guide RNAs of the invention by electroporation, optionally preassembled in the form of RNP complexes;
(b2) Optionally, delivering a p53 inhibitor and/or an HDR enhancer;
(b3) Delivering a polynucleotide or vector of the invention by transduction to provide a population of genetically edited cells; and
(b4) Culturing (e.g., expanding) the genetically edited cell population.
Therapeutic method
In one aspect, the invention provides methods of treating a subject using the polynucleotides, vectors, guide RNAs, kits, compositions, gene editing systems, cells and/or cell populations of the invention. Suitably, the method of treating a subject may comprise administering a cell or population of cells of the invention.
In related aspects, the invention provides polynucleotides, vectors, guide RNAs, kits, compositions, gene editing systems, cells and/or cell populations of the invention for use as a medicament. Suitably, the cell or cell population of the invention may be used as a medicament.
In a related aspect, the invention provides the use of a polynucleotide, vector, guide RNA, kit, composition, gene editing system, cell and/or cell population of the invention for the manufacture of a medicament. Suitably, the cell or cell population of the invention may be used in the manufacture of a medicament.
Suitably, the method of treating a subject may comprise:
(a) Providing a cell or population of cells;
(b) Obtaining a gene-edited cell or population of gene-edited cells using the kits, compositions, and/or gene-editing systems described herein; and
(c) Administering a population of genetically edited cells to a subject.
For example, a method of treating a subject may comprise:
(a) Providing a cell or population of cells;
(b) Delivering an RNA-guided nuclease, guide RNA and/or polynucleotide or vector of the invention to a cell or population of cells to obtain a gene-edited cell or population of gene-edited cells; and
(c) Administering a population of genetically edited cells to a subject.
Steps (a) and (b) may be the same as described in the preceding section.
Suitably, the cells of the cell population may be isolated and/or enriched from the subject to be treated, for example the cell population may be an autologous CD34+ cell population. Suitably, the cell population is isolated from (mobilized) peripheral blood or umbilical cord blood of the subject to be treated and subsequently enriched (e.g. by FACS and/or magnetic bead sorting).
The subject may be immunocompromised and/or the disease to be treated may be an immunodeficiency, i.e., the medicament may be used to treat an immunodeficiency. As used herein, an "immunodeficiency" is a condition in which the ability of the immune system to fight infectious diseases and cancer is compromised or entirely absent. A subject with an immunodeficiency is referred to as "immunocompromised". In addition to the infections that can affect anyone, immunocompromised persons may be particularly vulnerable to opportunistic infections.
RAG deficiency-immunodeficiency
The subject may have a RAG deficiency, e.g., a RAG1 deficiency. The RAG1 deficiency may be due to a loss-of-function mutation in the RAG1 gene, optionally in exon 2 of RAG1.
The immunodeficiency may be RAG deficiency-immunodeficiency. As used herein, a "RAG deficiency-immunodeficiency" is an immunodeficiency characterized by a loss of RAG1/RAG2 activity. For example, RAG deficiency-immunodeficiency may be caused by mutation of the RAG gene.
Suitably, the RAG deficiency-immunodeficiency may be a RAG1 deficiency. The RAG1 deficiency may be due to a loss-of-function mutation in the RAG1 gene, optionally in exon 2 of RAG1.
Mutations in the RAG genes in humans are associated with distinct clinical phenotypes, characterized by a variable association of infections and autoimmunity. In some cases, environmental factors have been shown to contribute to this phenotypic heterogeneity. In humans, RAG1 deficiency can cause a broad spectrum of phenotypes, including T-B-SCID, Omenn syndrome (OS), atypical SCID (AS), and combined immunodeficiency with granuloma/autoimmunity (CID-G/AI) (Notarangelo, L.D., et al., 2016. Nature Reviews Immunology, 16(4), pp.234-246 and Delmonte, O.M., et al., 2018. Journal of Clinical Immunology, 38(6), pp.646-655).
In some embodiments, the RAG deficiency-immunodeficiency is T-B-SCID, Omenn syndrome, atypical SCID, or CID-G/AI.
Severe combined immunodeficiency (SCID) comprises a heterogeneous group of diseases characterized by profound abnormalities in the development and function of T cells (and also B cells in some forms of SCID) and associated with severe, early-onset infections. The condition is invariably fatal early in life unless immune reconstitution is achieved, usually by HSCT. Following the introduction of newborn screening for SCID in the United States, it has been determined that RAG mutations account for 19% of all cases of SCID and SCID-related conditions and, in particular, are the leading cause of atypical SCID and Omenn syndrome (Notarangelo, L.D., et al., 2016. Nature Reviews Immunology, 16(4), pp.234-246).
In 1996, RAG mutations were identified as the main cause of T-B-SCID with normal cellular radiosensitivity. Omenn syndrome, first described in 1965, has distinctive phenotypic features. Patients present with early-onset generalized erythroderma, lymphadenopathy, hepatosplenomegaly, eosinophilia and severe hypogammaglobulinemia with elevated IgE levels, associated with the presence of autologous, oligoclonal and activated T cells that infiltrate multiple organs. In some patients with hypomorphic RAG mutations, a residual presence of autologous T cells has been demonstrated without the clinical manifestations of Omenn syndrome. This condition is referred to as "atypical" or "leaky" SCID. A distinct SCID phenotype involving oligoclonal expansion of autologous γδ T cells (referred to herein as γδ T+ SCID) has been reported in infants with RAG deficiency and disseminated cytomegalovirus (CMV) infection (Notarangelo, L.D., et al., 2016. Nature Reviews Immunology, 16(4), pp.234-246).
Whereas SCID, atypical SCID and Omenn syndrome are invariably fatal early in life if left untreated, several forms of RAG deficiency with a milder clinical course and delayed presentation have been reported in recent years. In particular, CID-G/AI was reported in 3 unrelated girls with RAG mutations, who presented with granulomas of the skin, mucous membranes and internal organs and developed severe complications after viral infections, including B-cell lymphoma. Following this description, several other cases of CID-G/AI have been reported with various autoimmune manifestations, such as cytopenias, vitiligo, psoriasis, myasthenia gravis and Guillain-Barré syndrome (Notarangelo, L.D., et al., 2016. Nature Reviews Immunology, 16(4), pp.234-246).
Other phenotypes associated with RAG deficiency include idiopathic CD4+ T cell lymphopenia, common variable immunodeficiency, IgA deficiency, selective deficiency of polysaccharide-specific antibody responses, hyper-IgM syndrome, and sterile chronic multifocal osteomyelitis (Notarangelo, L.D., et al., 2016. Nature Reviews Immunology, 16(4), pp.234-246).
The skilled person will appreciate that they can combine all features of the invention disclosed herein without departing from the scope of the invention disclosed.
Preferred features and embodiments of the invention will now be described by way of non-limiting example.
The practice of the present invention will employ, unless otherwise indicated, conventional techniques of chemistry, biochemistry, molecular biology, microbiology and immunology, which are within the capabilities of a person of ordinary skill in the art. Such techniques are explained in the literature. See, e.g., Sambrook, J., Fritsch, E.F. and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press; Ausubel, F.M. et al. (1995 and periodic supplements) Current Protocols in Molecular Biology, Ch. 9, 13 and 16, John Wiley & Sons; Roe, B., Crabtree, J. and Kahn, A. (1996) DNA Isolation and Sequencing: Essential Techniques, John Wiley & Sons; Polak, J.M. and McGee, J.O'D. (1990) In Situ Hybridization: Principles and Practice, Oxford University Press; Gait, M.J. (1984) Oligonucleotide Synthesis: A Practical Approach, IRL Press; and Lilley, D.M. and Dahlberg, J.E. (1992) Methods in Enzymology: DNA Structures Part A: Synthesis and Physical Analysis of DNA, Academic Press. Each of these general texts is incorporated herein by reference.
Examples
Example 1 - Editing of the RAG1 gene
Results
We developed a platform to correct CD34+ hematopoietic stem cells using a gene targeting approach.
In the methods described herein, we deliver Cas9 ribonucleoprotein (RNP) by nucleofection to introduce a DNA double-strand break (DSB) in the first intron of the RAG1 gene. Following the DSB, the corrective donor DNA, delivered by an AAV6 vector, is integrated by homology-directed repair (HDR), owing to two sequences flanking the corrective donor that are homologous to the regions around the Cas9 cleavage site. The alternative splice acceptor (SA) upstream of the corrective DNA allows the endogenous RAG1 promoter to control expression of the transgene (FIG. 1 Panel A). Notably, RAG1 exon 2 contains the entire coding sequence, so integration of the corrective RAG1 coding sequence upstream of exon 2 may be therapeutic for any clinically relevant RAG1 mutation.
Generation of NALM6 and K562 Cas9 cell lines
First, to test our panel of Cas9 guide RNAs, we generated two cell lines with inducible Cas9 expression. The NALM6 and K562 cell lines were transduced with a lentiviral vector carrying a Cas9 cassette and a cassette conferring puromycin resistance under the control of a TET-inducible promoter. Following transduction at MOI 20, both cell lines were incubated with 1.5 μg/ml puromycin for one week to select transduced cells (FIG. 1 Panel B). After puromycin selection, vector copy numbers (VCN) of 3.65 and 4.35 were confirmed by LTR-specific ddPCR in the NALM6 Cas9 and K562 Cas9 cell lines, respectively (FIG. 1 Panel C). Efficient Cas9 expression was also verified by RT-qPCR after two days of induction with escalating doses of doxycycline (FIG. 1 Panel D). In both cell lines, the highest Cas9 expression was found at a doxycycline dose of 1 μg/ml.
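The text does not state how the VCN was derived from the ddPCR readout. A minimal sketch of one common approach is given below, assuming the vector amplicon (here LTR) is normalized to a reference amplicon present at a known number of copies per genome. The function name, the reference copy number of 2 and the example concentrations are assumptions made for illustration; a diploid reference may not hold for aneuploid cell lines.

```python
def vector_copy_number(
    target_copies_per_ul: float,
    reference_copies_per_ul: float,
    reference_copies_per_genome: float = 2.0,  # assumption: diploid reference locus
) -> float:
    """Estimate vector copies per genome from two ddPCR concentrations."""
    if reference_copies_per_ul <= 0:
        raise ValueError("reference concentration must be > 0")
    genomes_per_ul = reference_copies_per_ul / reference_copies_per_genome
    return target_copies_per_ul / genomes_per_ul


# Hypothetical concentrations chosen only to illustrate the calculation.
print(round(vector_copy_number(365.0, 200.0), 2))
```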
RAG1 guide selection
A set of nine guides was first identified to target three non-repetitive sites of RAG1 intron 1. Furthermore, three guides (gRNA 1, 2, 3) targeting the first 200 bp of RAG1 exon 2 were designed, with the final aim of integrating the corrective RAG1 coding sequence in-frame with the endogenous ATG. This strategy would exploit the endogenous splice acceptor, thereby preserving any putative endogenous splicing regulation (FIG. 2A).
The guides were electroporated as plasmid DNA into the K562 Cas9 and NALM6 Cas9 cell lines at two different doses (100 ng/well and 200 ng/well). Doxycycline (1 μg/ml) was added to the medium one day before and two days after electroporation. Genomic DNA was extracted on day 7, and the cleavage frequency was assessed by measuring the percentage of NHEJ-mediated indel mutations with the T7 nuclease assay (scheme shown in FIG. 2B).
Most of the tested guides had good cleavage frequencies and showed similar results in both cell lines. In particular, guide 9 was the best-performing intron-targeting guide, with a cleavage frequency of up to 72.7% in K562 Cas9 and up to 78.5% in NALM6 Cas9. Guide 7 achieved similar cleavage frequencies, of up to 67.5% in K562 Cas9 and up to 70.5% in the NALM6 Cas9 cell line. Guide 3 was the best-performing exon-targeting guide, with a cleavage frequency of up to 58.9% in K562 Cas9 (FIG. 2C) and up to 73.5% in NALM6 Cas9 (FIG. 2D). Notably, although Cas9 expression was higher in K562 Cas9 than in the NALM6 Cas9 cell line, no difference in overall cleavage efficiency was observed. The cleavage frequency was also tested in NALM6 WT using in vitro pre-assembled RNPs for guide 9 and guide 3 at doses of 25 or 50 pmol/well (FIG. 2E). Both guides retained good activity, with the cleavage frequency reaching up to 71.5% for guide 3 and up to 78.5% for guide 9 at the higher RNP dose.
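The quantification formula used for the T7 nuclease assay is not given in the text. One commonly used estimate converts the fraction of cleaved PCR product on a gel into an indel percentage as 100 x (1 - sqrt(1 - f_cut)). The sketch below applies that estimate; the function name and band intensities are hypothetical, and the formula is an assumption about the quantification rather than the method actually used here.

```python
import math
from typing import Sequence


def indel_percent(uncut_intensity: float, cut_intensities: Sequence[float]) -> float:
    """Estimate % indels from band intensities of a mismatch-cleavage assay.

    Uses the common approximation 100 * (1 - sqrt(1 - f_cut)), where f_cut is
    the cleaved fraction of the total PCR product.
    """
    cut = sum(cut_intensities)
    total = uncut_intensity + cut
    if total <= 0:
        raise ValueError("total band intensity must be > 0")
    fraction_cut = cut / total
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cut))


# Hypothetical intensities: one uncut band and two cleavage products.
print(f"{indel_percent(25.0, [45.0, 30.0]):.1f}% indels")
```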
Donor DNA optimization
Guide 9 was further tested in the NALM6 Cas9 and K562 Cas9 cell lines to verify correct integration of a PGK_GFP reporter cassette flanked by two homology arms.
We also assessed the ability of the endogenous RAG1 promoter to drive GFP expression, using a donor plasmid containing a splice acceptor (SA) GFP cassette (SA_GFP) without the PGK promoter. RAG1 is expressed only during lymphocyte differentiation, at the DN2 T cell and pro-B cell stages. To assess whether the endogenous RAG1 promoter was able to drive expression of the GFP cassette, we used the NALM6 cell line, a pre-B cell line that constitutively expresses RAG1 (FIG. 3A). As described above, the RAG1 genomic region consists of two exons; the entire 3.1 Kb coding sequence is encoded by the second exon and is followed by a long 3' UTR of 3.3 Kb. Our correction strategy was designed to provide an AAV6 vector containing the complete coding sequence and targeting the intronic region upstream of exon 2. The 3' UTR (>3 Kb) downstream of the RAG1 coding sequence was not included because of the limited cargo capacity of AAV6 vectors.
To assess whether the 3' UTR of RAG1 is required for efficient expression of our corrective donor, we generated four different SA_GFP donor DNAs (FIG. 3B):
i. a construct carrying the bovine growth hormone (BGH) PolyA downstream of SA_GFP (SA_GFP_BGH);
ii. a construct carrying the woodchuck hepatitis virus post-transcriptional regulatory element (WPRE) downstream of SA_GFP and upstream of the BGH PolyA (SA_GFP_WPRE); WPRE has been reported to generally enhance transgene expression;
iii. a construct carrying the endogenous RAG1 3' UTR downstream of the SA_GFP cassette (SA_GFP_3'UTR);
iv. a construct containing a splice donor downstream of the SA_GFP cassette (SA_GFP_SD), to obtain a fusion transcript comprising the corrective sequence and the endogenous RAG1 sequence followed by its 3' UTR (FIG. 3C).
NALM6 Cas9 and K562 Cas9 cell lines previously stimulated with doxycycline to induce Cas9 were transfected with guide 9 plasmid DNA (100 ng/well) and various linearized DNA donors (1600 ng/well). Stable integration of donor DNA was verified by flow cytometry as GFP expression.
The PGK_GFP positive control was stably integrated in both cell lines. In particular, ten days after transfection, 14% of K562 Cas9 cells and 1.8% of NALM6 Cas9 cells were GFP positive (FIG. 3D). Notably, the NALM6 cell line is particularly difficult to edit, and we expected lower efficiency than in K562. Similar frequencies of GFP+ cells were observed in NALM6 Cas9 cells transfected with the different SA_GFP donors, whereas few GFP+ cells were detected in the K562 cell line transfected with the SA_GFP donors. This observation demonstrates that the endogenous RAG1 promoter efficiently drives expression of the SA_GFP cassette in the NALM6 Cas9 cell line. Notably, the absence of GFP+ cells in the K562 Cas9 cell line, which lacks RAG1 expression, further demonstrates that the GFP expression observed in NALM6 depends specifically on RAG1 promoter activity.
The effect of the constructs carrying different 3' UTRs was assessed in the NALM6 Cas9 cell line by measuring the mean fluorescence intensity (MFI) of GFP+ events by flow cytometry. The analysis indicated that the endogenous RAG1 3' UTR had a negative effect on transgene expression. The GFP MFI obtained after transfection with the SA_GFP_SD and SA_GFP_3'UTR constructs was significantly lower than that obtained with SA_GFP_BGH (FIG. 3E, F). No improvement was found using SA_GFP_WPRE. Based on these GFP expression data, we decided to clone our vector with the BGH PolyA.
Off-target analysis
Preliminary in silico analysis indicated that guide 9 has a promising off-target profile and that its most likely off-targets fall within intronic regions, indicating a low risk of off-target-related gene disruption events (FIG. 4A). The off-target profiles of guides 7 and 9 were characterized in greater depth by an unbiased off-target detection assay (GUIDE-seq, Tsai SQ, et al. Nat Biotechnol. 2015;33(2):187-97). Analysis in K562 cells using 50 pmol of high-fidelity Cas9 nuclease V3 gave cleavage frequencies of 45.3% and 64.6% for guides 7 and 9, respectively (FIG. 4B). ODN integration was low for guide 7 (8.4%) but very high for guide 9 (38.2%), allowing off-target analysis in the samples (FIG. 4C). No off-target sites were identified for either guide in the analysis performed with the R Bioconductor package GUIDE-seq (Zhu LJ, et al. BMC Genomics. 2017;18(1)) using default parameters. To also investigate very weak potential off-targets, a second analysis with relaxed constraints was performed, and only two off-target sites were found, both for guide 7. These off-target sites lie in intronic or intergenic regions, have a number of mismatches >9 and occur at low frequency, indicating a low risk profile for guide 7. Notably, no off-target sites were identified for guide 9.
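Candidate off-target sites are usually described by the number of mismatches between the guide protospacer and the genomic site. The sketch below shows only that counting step; it is not the GUIDE-seq pipeline, and the sequences are hypothetical placeholders rather than the guide sequences of the invention.

```python
def count_mismatches(guide: str, site: str) -> int:
    """Count position-wise mismatches between a guide and an equal-length site."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(1 for g, s in zip(guide.upper(), site.upper()) if g != s)


# Hypothetical 20-nt sequences used only to illustrate the comparison.
guide = "ACGTACGTACGTACGTACGT"
candidate = "ACGTACGAACGTACCTACGT"
print(count_mismatches(guide, candidate))
```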
Optimization of the gene editing protocol in human cord blood CD34+ cells
The editing protocol was then optimized in human cord blood-derived CD34+ cells (hCB-CD34). To this end, hCB-CD34 cells were thawed on day 0 and pre-stimulated at 1x10^6 cells/ml for three days in StemSpan supplemented with cytokines (hTPO 20 ng/ml, hIL6 20 ng/ml, hSCF 100 ng/ml, hFlt3-L 100 ng/ml, SR1 1 uM, UM171 50 nM).
On day 3, guides 3 and 9 were delivered by electroporation as in vitro pre-assembled RNPs at two doses, 25 and 50 pmol/well. To enhance stability in cells, chemical modifications consisting of 2'-O-methyl 3'-phosphorothioate were added to the last three terminal nucleotides at the 5' and 3' ends of the guide RNA. After 15 minutes, the AAV6 vector was added to the medium at three MOIs (10^4, 5x10^4 and 10^5) (FIG. 5A). To easily follow edited cells by flow cytometry, two AAV6 donors were used (one for each guide), each carrying a PGK_GFP_BGH cassette flanked by two arms homologous to the respective cleavage site. Toxicity of the protocol was assessed 24 hours after treatment by staining cells with 7AAD and annexin V and measuring the fractions of necrotic and apoptotic cells by flow cytometry. Four days after electroporation, we performed a multiparameter flow cytometry analysis to evaluate the composition of the cell subsets within the bulk treated culture and to measure the percentage of GFP+ cells in these subsets. For this analysis, we used surface markers that allow identification of primitive (CD34+ CD133+ CD90+), early (CD34+ CD133+ CD90-) and more committed (CD34+ CD133- CD90-) progenitor cells (FIG. 5B). In addition, genomic DNA was extracted to determine nuclease activity by the T7 nuclease assay.
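The three progenitor subsets defined above by CD34, CD133 and CD90 expression can be expressed as a simple decision rule. The sketch below assumes that marker positivity has already been determined by gating; the function name and the "unclassified" label for combinations not defined in the text are illustrative choices.

```python
def classify_progenitor(cd34_pos: bool, cd133_pos: bool, cd90_pos: bool) -> str:
    """Assign a cell to a progenitor subset from marker positivity calls."""
    if not cd34_pos:
        return "CD34-negative"
    if cd133_pos and cd90_pos:
        return "primitive"   # CD34+ CD133+ CD90+
    if cd133_pos and not cd90_pos:
        return "early"       # CD34+ CD133+ CD90-
    if not cd133_pos and not cd90_pos:
        return "committed"   # CD34+ CD133- CD90-
    return "unclassified"    # CD34+ CD133- CD90+ is not defined in the text


print(classify_progenitor(True, True, True))
```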
Guide 9 retained activity comparable to that observed in the NALM6 and K562 cell lines, with cleavage frequencies of 73.9% at 25 pmol/well and 80.1% at 50 pmol/well. Guide 3 showed lower activity in hCB-CD34, with cleavage frequencies of 16.9% and 19.3% at 25 and 50 pmol/well, respectively (FIG. 5C). Consistent with the latter observation, targeted integration efficiency with guide 3 was low: at the 25 pmol/well dose with the highest MOI (10^5), the level of integration was 18.3% in bulk CD34+ cells and 1.25% in the most primitive subpopulation (FIG. 5D, E).
Guide 9 promoted efficient targeted integration of the PGK_GFP cassette. Apoptosis analysis showed low toxicity associated with the editing protocol: the day after editing, viability (7AAD- Annexin V- cells) was higher than 70% under all tested conditions (FIG. 6A). The analysis also showed that AAV6 transduction had a stronger effect on cell viability than Cas9 transfection. Consistent with this observation, AAV6 transduction at MOI 10^5 was more detrimental to cell growth than transfection with 25 pmol Cas9, indicating that cell fitness may be affected (FIG. 6B). Under editing conditions, the frequency of CD34+ cells remained high (87.5%) and was comparable to that of the untreated control (FIG. 6C). No significant differences in the distribution of the three CD34+ subpopulations were observed between conditions (FIG. 6D).
Analysis of the integration frequency showed that the most primitive subpopulation (CD34+ CD133+ CD90+) was the least permissive fraction. The highest editing frequency in this subpopulation (52.8%) was obtained using 25 pmol Cas9 and MOI 10^5. At lower MOI, the higher Cas9 dose (50 pmol) increased editing efficiency, especially in the most primitive subpopulation; indeed, at MOI 10^4 the editing frequency was 24.6% and 40.5% with 25 and 50 pmol Cas9, respectively (FIG. 6E). To confirm at the molecular level the integration observed by flow cytometry, genomic DNA was analyzed by a ddPCR assay using a set of primers specific for the targeted integration. The percentage of GFP+ cells measured by flow cytometry and the percentage of HDR measured by ddPCR were comparable, confirming that the majority of integrations were on-target (FIG. 6F).
Overall, these data indicate that with this platform we can achieve efficient targeting even in the most primitive CD34+ subpopulation. The editing protocol did not affect the phenotype of the cells (in terms of both total CD34+ cells and subpopulation distribution). In particular, we identified a guide RNA that promotes targeted integration at high frequency, and we established editing conditions that allow an optimal trade-off between toxicity and targeting frequency (50 pmol/well Cas9 and MOI 10^4 vg/cell) (FIG. 6G).
In vivo transplantation of gene-edited hCB-CD34+ cells
To assess whether our protocol allows targeted integration in HSCs while preserving their long-term repopulating activity, edited CD34+ cells were transplanted into sublethally irradiated NOD-scid IL2Rg^null (NSG) mice. Following the same protocol used in the previous experiments, hCB-CD34+ cells were electroporated after 3 days of stimulation with 50 pmol/well guide 9 RNP and, 15 minutes later, transduced with AAV6 at MOI 10^4 vg/cell. Two different AAV6 vectors were used in this experiment. The first AAV6 vector, carrying PGK_GFP_BGH, served as a positive control to easily follow the engraftment of edited cells. The second donor, carrying SA_GFP_BGH, was used to assess in vivo expression of the GFP gene under the control of the endogenous RAG1 promoter. The day after the editing protocol, 350,000 treated hCB-CD34+ cells per mouse were injected into groups of 4-5 NSG mice, 6 hours after sublethal whole-body irradiation (120 rad). To assess gene targeting efficiency after treatment, a small number of cells was maintained in culture for an additional 4 days. With both AAV6 vectors, we measured about 80% targeted integration by ddPCR (FIG. 7A), recapitulating the results obtained in the previous experiments. Flow cytometry analysis was performed on peripheral blood collected from transplanted mice at 6, 9 and 13 weeks after transplantation and at sacrifice at week 17. Analysis of the frequency of hCD45+ cells among total viable cells in peripheral blood showed that the treated cells engrafted at normal levels (up to about 56%), indicating long-term engraftment with similar kinetics in both groups (FIG. 7B). Regarding peripheral blood composition, the mice showed no clear skewing of lineage composition, and the normal presence of B, T and myeloid cells in both groups demonstrated that the editing protocol did not affect multilineage differentiation (FIG. 7D, F, H).
In the group of mice receiving cells treated with the PGK_GFP_BGH vector, edited hCD45+ GFP+ cells persisted at a high percentage (about 40-50%) over time, indicating that the treatment was tolerated by the most primitive cells and confirming their long-term survival in vivo (FIG. 7C). Similar levels of edited hCD45+ GFP+ cells were found among B cells, T cells and myeloid cells in peripheral blood, confirming that edited cells retained multilineage differentiation capacity (FIG. 7E, G, I). In mice transplanted with SA_GFP_BGH-treated cells, despite the high targeting frequency observed in vitro, we observed a reduced frequency of GFP+ cells in peripheral blood (FIG. 7C). As expected, myeloid cells and circulating T cells were GFP negative, since neither population expresses RAG1 (FIG. 7G, I). In contrast, a relevant percentage (about 18%) of GFP+ cells was observed among circulating B cells (FIG. 7E), probably owing to their immature phenotype, since most B cells expressed CD24 and CD38.
At sacrifice, analysis of the bone marrow confirmed engraftment of the treated CD34+ stem cells. In addition, in the PGK_GFP_BGH group, a high frequency of GFP+ targeted cells (about 38%) was observed among CD34+ cells, further demonstrating efficient engraftment of long-term repopulating stem cells (FIG. 7L and M). Although NSG mice have an atrophic and dysfunctional thymus, we analyzed GFP expression during thymopoiesis on the basis of CD4 and CD8 expression (FIG. 7N). With the PGK_GFP_BGH cassette, GFP expression was consistent across developmental stages, and no differences were observed between immature thymocytes and mature circulating T cells. In contrast, with the SA_GFP_BGH cassette as donor, GFP expression was found in developing thymocytes but was barely detectable in peripheral blood and spleen T cells (FIG. 7N).
Taken together, these observations show that we have established a highly efficient protocol to edit long-term repopulating stem cells without affecting their engraftment and multilineage differentiation capacity. Our data further indicate a regulated in vivo expression pattern of the transgene in the absence of an exogenous promoter, highlighting that expression is lymphoid-specific and restricted to immature lymphocytes.
Testing of the corrective donor on hMPB-CD34+ cells
Next, we designed and tested a corrective AAV6 vector carrying the RAG1 coding sequence. Specifically, the corrective donor comprised two homology arms at the 5' and 3' ends, a splice acceptor followed by a Kozak sequence, the RAG1 coding sequence and the BGH PolyA, with a total length of 4.1 Kb (FIG. 8A). The RAG1 coding sequence was codon optimized, replacing "rare" codons with more frequent ones without altering the amino acid sequence, in order to enhance protein translation. We tested the new donor DNA on hCD34+ cells obtained from mobilized peripheral blood (MPB) to verify whether the size of the donor DNA would affect integration efficiency and/or the toxicity profile.
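The principle of codon optimization described above, synonymous substitution that leaves the encoded protein unchanged, can be sketched as follows. The table of preferred codons is a small illustrative placeholder and not a real codon usage table, and the input sequence is hypothetical; this is not the optimization actually applied to the RAG1 coding sequence.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Illustrative placeholder: one "preferred" codon for a few amino acids only.
PREFERRED = {"L": "CTG", "R": "AGA", "S": "AGC"}


def translate(cds: str) -> str:
    """Translate a coding sequence (length divisible by 3) into amino acids."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def codon_optimize(cds: str, preferred: dict) -> str:
    """Replace codons with preferred synonymous codons, keeping the protein."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    for aa, codon in preferred.items():
        if CODON_TABLE[codon] != aa:
            raise ValueError(f"{codon} does not encode {aa}")
    optimized = "".join(
        preferred.get(CODON_TABLE[cds[i:i + 3]], cds[i:i + 3])
        for i in range(0, len(cds), 3)
    )
    if translate(optimized) != translate(cds):
        raise RuntimeError("optimization changed the amino acid sequence")
    return optimized


example = "ATGTTACGTTCATAA"  # hypothetical sequence encoding M-L-R-S-stop
print(codon_optimize(example, PREFERRED))
```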
MPB-CD34+ cells (purchased from AllCells, California, USA) were thawed and pre-stimulated for three days. We adjusted the editing protocol as follows: stem cell factor (SCF) 300 ng/ml, Flt3 ligand (Flt-3L) 300 ng/ml, thrombopoietin (TPO) 100 ng/ml, interleukin 3 (IL-3) 60 ng/ml, StemRegenin1 (SR1, 1 uM), 16,16-dimethyl prostaglandin E2 (dmPGE2, 10 uM) and UM171 nM.
Cas9 was electroporated as in vitro pre-assembled RNP at two doses (25 pmol/well and 50 pmol/well). Since our previous observations suggested that a high AAV6 vector MOI impairs cell fitness, we tested two low MOIs (10^4 and 2x10^4).
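For orientation, an MOI expressed in vector genomes (vg) per cell translates into a vector volume once the cell number and the titer of the vector stock are known. The sketch below shows that arithmetic; the function name, cell number and titer are hypothetical values chosen for illustration and do not come from the experiments described.

```python
def aav_volume_ul(cells: float, moi_vg_per_cell: float, titer_vg_per_ml: float) -> float:
    """Return the vector stock volume (microlitres) needed for a given MOI."""
    if cells < 0 or moi_vg_per_cell < 0 or titer_vg_per_ml <= 0:
        raise ValueError("cells and MOI must be >= 0 and titer must be > 0")
    total_vg = cells * moi_vg_per_cell           # vector genomes required
    return total_vg / titer_vg_per_ml * 1000.0   # ml -> microlitres


# Hypothetical example: 1e5 cells at MOI 1e4 vg/cell, stock at 1e12 vg/ml.
print(f"{aav_volume_ul(1e5, 1e4, 1e12):.2f} ul")
```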
The effect of the editing protocol was assessed in terms of cell growth and of cell phenotype by flow cytometry. Since the corrective donor did not include any reporter gene, we assessed integration by molecular assay. Four days after editing, cells were sorted according to CD34, CD133 and CD90 expression to identify and analyze the primitive, early and committed progenitor subsets. Genomic DNA was extracted from the sorted subpopulations, and targeted integration of the corrective donor was verified by a ddPCR assay using a set of primers specific for the on-target integration and the codon-optimized donor sequence (FIG. 8B). Consistent with previous observations, the editing protocol did not affect the cell phenotype based on CD133 and CD90 expression (data not shown), and a high on-target integration frequency was observed in all CD34+ subpopulations. In particular, in the most primitive subpopulation, a targeting frequency of 45.3% was observed using 50 pmol/well Cas9 and the AAV6 vector at MOI 10^4 (FIG. 8C), a condition that also had a lower impact on cell growth than the higher MOI (FIG. 8D). No differences in efficiency or toxicity were found between hCD34+ cells from MPB and from CB.
In vivo transplantation of edited hMPB-CD34+ cells from HD and patients
To assess whether our gene editing protocol affects engraftment capacity, edited hMPB-CD34+ cells were transplanted into sublethally irradiated NSG mice. Following the same protocol used in the previous experiments, hMPB-CD34+ cells were electroporated after 3 days of stimulation with 50 pmol/well guide 9 RNP and, 15 minutes later, transduced with the corrective AAV6 at MOI 10^4 vg/cell. To dampen the previously reported editing-induced p53 response, which reduces hematopoietic reconstitution by edited HSPCs, we added mRNA encoding the dominant-negative p53 inhibitor GSE56 to the electroporation mixture (Schiroli G, et al. Cell Stem Cell. 2019;24(4):551-565.e8).
To evaluate gene correction in vivo, we obtained hMPB-CD34+ cells from a patient (NIHPID0021) carrying hypomorphic mutations in the RAG1 gene. Notably, NIHPID0021 is an adult CID-G/AI patient with residual B-cell and T-cell development due to missense RAG1 mutations (C1228T; G1520A). The patient presented with B cells 23/uL, T cells 665/uL (8% naive) and normal NK counts. Notably, the very low peripheral B cell count was also due to treatment with an anti-CD20 mAb to control severe autoimmune manifestations.
The RAG1 patient received G-CSF/plerixafor; CD34+ cells were collected by the NIH clinical facility, and their purity was verified by flow cytometry (>97% CD34+).
In parallel, we used hMPB-CD34+ cells from two independent healthy donors (commercially sourced). The day after editing, 1x10^6 treated or untreated cells were injected into sublethally irradiated (120 rad) mice (FIG. 9A). To assess gene targeting efficiency after treatment, a small number of cells was maintained in culture for an additional four days. ddPCR showed a targeting frequency of 86% in patient cells, and of 89% and 80% in the two healthy donor batches, respectively, recapitulating the results obtained in the previous experiments (FIG. 9B).
Peripheral blood was analyzed by flow cytometry 5, 8, 12 weeks after transplantation, and mice were sacrificed at 15 weeks.
Analysis of peripheral blood showed that engraftment of hMPB-CD34+ cells was significantly lower than that of hCB-CD34+ cells. The frequency of hCD45+ cells from HD in blood was between 4.4% and 8.7% at all time points, and engraftment of the two batches was superimposable. In contrast, for the CID-G/AI patient NIHPID0021, the frequency of hCD45+ cells in PB was generally lower (between 2.1% and 5.2% at the first two time points) and decreased at later time points, indicating exhaustion of the graft. Notably, in both cases (CID-G/AI patient and HD cells), there was no difference in the frequency of hCD45+ cells in PB between treated and untreated cells, confirming that engraftment capacity was not affected by the editing protocol (FIG. 9C).
Molecular analysis by ddPCR revealed a targeting frequency of 35.3% in human cells retrieved from the peripheral blood of mice that received gene-edited MPB-CD34+ cells from HD, recapitulating the observations previously obtained with the reporter gene and further confirming that the targeting procedure did not affect engraftment (FIG. 9D). A lower targeting frequency (9.3%) was obtained in PB 8 weeks after transplantation of gene-edited MPB CD34+ cells from the patient (FIG. 9D).
Regarding peripheral blood composition, NSG mice transplanted with treated HD cells showed no significant bias in subgroup composition, and comparable frequencies of B, T and myeloid cells were observed in mice receiving treated or untreated cells, confirming that multilineage differentiation was not impaired (fig. 9E). Untreated patient cells showed partial bias in the B-cell and T-cell compartments when compared to HD, consistent with the immunophenotype of patients carrying sub-effect mutations (Delmonte OM, et al blood 2020;135 (9): 610-9). At the last time point, mice received untreated patient cells with a B cell frequency of 17.2% (HD untreated=81.9%) and a T cell frequency of 2.3% (HD untreated=9.2%) with a high myeloid cell frequency of 19.9% (HD untreated=3.0%). These observations confirm that although B and T cell development is defective, some circulating B and T lymphocytes can be detected. No significant differences were found between mice receiving untreated or treated patient cells in terms of peripheral blood immune composition, although we observed that a slight increase in B cell frequency in the treated patient cells was maintained over time (fig. 9F).
Mice were sacrificed 17 weeks after transplantation to analyze the transplantation of edited cells in bone marrow, thymus, and spleen. In bone marrow and spleen, the frequency of human cd45+ cells was higher than those retrieved from mouse peripheral blood (fig. 8G, H left panels and 8C). NSG mice transplanted with edited MPB CD34 cells from HD showed 13.9% hCD45 in bone marrow + And 23.4% in the untreated group (fig. 9G, left panel). In mice receiving edited RAG1 patient cellsA similar level of transplantation (10.2%) was found, but a lower proportion of hCD45 was found in mice receiving untreated RAG1 patient cells + Cells (6.9%) (fig. 9G, left panel). hCD45 in spleen for both HD and patient edited and untreated cells + Cell transplantation is even higher. In mice receiving HD cells hCD45 + The frequency of cells was 37.4% and 43.3% in mice with edited or untreated cells, respectively (fig. 9H, left panel), indicating no difference between edited and unedited cells. Similarly, hCD45 + The frequency of cells was 24% and 23.7% in mice with edited or untreated cells derived from RAG1 patients, respectively (fig. 9H, left panel).
HDR targeting efficiency assessed by ddPCR on DNA samples extracted from bone marrow and spleen showed a range of 1.1% to 19.6% in edited cells from bone marrow, and 2.1% to 8.5% in the case of patient cells (fig. 9G, right panel). Spleen showed the highest targeting frequency, ranging between 6.1% and 22.2% for mice with edited HD cells, and 11.9% and 14.8% for mice with edited patient cells (fig. 9H, right panel).
Taken together, these findings demonstrate the feasibility of gene editing approaches to target human RAG1 loci in HSCs derived from HD and RAG1 mutant patients. The GE protocol does not affect the engraftability and multilineage differentiation of HSCs.
Discussion
Classical gene addition-based gene therapy strategies rely on the use of integrating vectors. The introduction of a new generation of vectors, whose improved design confers a safer integration profile, lessens but does not eliminate the risk of insertional mutagenesis caused by semi-random integration of the vector into the genome (Doi K, Takeuchi Y. Vol. 65, Uirusu. 2015. p. 27-36). Furthermore, the use of ubiquitous promoters greatly impedes the physiological expression of therapeutic transgenes whose expression is cell-specific or strictly controlled during the cell cycle.
RAG1 molecules mediate the site-specific DNA double strand breaks necessary for initiation of V(D)J recombination (Oettinger MA, et al. Science. 1990;248(4962):1517-23). DNA double strand breaks are themselves dangerous lesions that can lead to pathological genomic rearrangements or chromosomal translocations. An important mechanism to ensure fidelity of V(D)J recombination is the fine control of RAG1 expression, which is limited to specific target cells at specific developmental stages. Regulation of RAG1 expression is also essential for the selection of functional, non-autoreactive lymphocytes through "allelic exclusion" or the complex mechanisms of BCR and TCR receptor editing (Ten Boekel E, et al. Immunity. 1998;8(2):199-207).
In the past, several attempts to correct RAG1 deficiency by retrovirus- or lentivirus-mediated gene transfer resulted in variable T and B cell reconstitution, with inflammatory infiltrates and autoimmunity when suboptimal immune reconstitution was achieved (Pike-Overzet K, et al. Leukemia. 2011;25(9):1471-83; Pike-Overzet K, et al. Vol. 134, Journal of Allergy and Clinical Immunology. 2014. p. 242-3; Lagresle-Peyrou C, et al. Blood. 2006;107(1):63-72; and van Til NP, et al. J Allergy Clin Immunol. 2014;133(4):1116-23). In addition, the use of exogenous and ubiquitous promoters may lead to genotoxicity (Zhang Y, et al. Advances in Immunology. 2010. p. 93-133; and Papaemmanuil E, et al. Nat Genet. 2014;46(2):116-25).
The development of gene editing platforms represents a strategy to overcome several problems caused by conventional gene addition approaches. We have focused on HSC-based genome editing strategies to correct a broad spectrum of RAG1 defects. To this end, we devised a strategy to target the first RAG1 intron, thereby replacing the RAG1 coding sequence, which is fully contained in exon 2. Our strategy has the advantage of correcting most pathogenic RAG1 mutations while retaining gene expression driven by the endogenous promoter. We identified the best combination of nuclease reagent and corrective cDNA donor in NALM6 and K562 cell lines. Cas9 was electroporated as an in vitro pre-assembled RNP to ensure robust but short-lived activity in cells, since long-term persistence of Cas9 protein in primary cells may lead to off-target cleavage, potentially affecting cell homeostasis and function (Kim S, et al. Genome Res. 2014;24(6):1012-9). Delivery of Cas9 as a pre-assembled RNP is well tolerated and partially protects the gRNA from intracellular degradation, thereby increasing nuclease stability and activity (Hendel A, et al. Nat Biotechnol. 2015;33(9):985-9). To further improve the Cas9 activity profile, chemically modified gRNAs were used to enhance stability, together with a high-fidelity Cas9 variant to reduce off-target related toxicity (Vakulskas CA, et al. Nat Med. 2018;24(8):1216-24). Predictive analysis of gRNA activity using Cas9-expressing cell lines gave reliable results for the intron-targeting guide (guide 9).
Next, we moved to hCB-CD34+ cells. Because HDR occurs preferentially in the S/G2 phase, HSPCs were pre-stimulated to promote transit through S/G2 (Genovese P, et al. Nature. 2014;510(7504):235-40; and Kass EM, Jasin M. Vol. 584, FEBS Letters. 2010. p. 3703-8), resulting in modest cell expansion while preserving the primitive stemness phenotype, as judged by expression of the CD34, CD133 and CD90 markers.
Using guide 9 (50 pmol/well) as Cas9 RNP and an AAV6 vector carrying the PGK_GFP reporter cassette (MOI 10⁴), we obtained a good targeting frequency (40.5%) in the most primitive CD34+CD133+CD90+ cell subpopulation. Molecular analysis by ddPCR showed that most of the integrations were on target. Notably, during Cas9 and AAV6 dose optimization, we noted that a high MOI of AAV6 had a strong impact on cell fitness. In vivo experiments further confirmed the in vitro data. Treated hCB-CD34+ cells transplanted into sublethally irradiated NSG mice showed long-term engraftment in both bone marrow and peripheral blood, confirming the multilineage differentiation capacity and long-term engraftment of targeted cells. We also tested the SA_GFP cassette, in which GFP expression is controlled by the endogenous RAG1 promoter. In vivo data from NSG mice indicate a regulated, lymphoid-specific expression pattern of the transgene, restricted to immature lymphocytes in which RAG1 is physiologically expressed. To assess the effect of the endogenous RAG1 3'UTR in the donor DNA, we tested different donor constructs carrying the GFP reporter gene. Analysis of the AAV6 donor carrying the endogenous RAG1 3'UTR showed reduced GFP expression compared with the level obtained with the donor carrying BGH_PolyA. These data are consistent with the lack of clinically relevant mutations in the RAG1 3'UTR reported so far in the literature, suggesting that this region may be dispensable in the design of corrective donors. Finally, SA_GFP_WPRE showed no advantage in GFP expression, indicating that WPRE-mediated enhancement of expression may be promoter and cell line dependent. Based on this evidence, the BGH polyA sequence, which allowed the highest level of transgene expression, was cloned into the donor DNA.
In addition, to further enhance protein translation, the human RAG1 coding sequence was codon optimized by replacing "rare" codons with more frequently used ones, without altering the final amino acid sequence.
The newly designed donor AAV6 vector (comprising the SA sequence followed by a Kozak sequence and the codon-optimized RAG1 sequence followed by BGH_PolyA) was also tested in hMPB-CD34+ cells. We observed the same efficiencies obtained with the previous donors, confirming that our protocol is reproducible with several donors and several HSPC sources. Furthermore, multiparameter analysis of HSPC composition in untreated and edited HD cells showed a redistribution of HSPC subtypes in cultured cells compared with cells analyzed before the expansion phase (fig. 10A). In both untreated and edited cells, we observed expansion of hematopoietic stem cells (HSCs), multipotent progenitors (MPPs) and multi-lymphoid progenitors (MLPs) at the expense of common myeloid progenitors (CMPs), indicating that the editing protocol preserved the stemness composition (fig. 10A).
Notably, ddPCR analysis showed over 80% HDR in total CD34+ cells, and a targeting frequency of 45% was observed in the most primitive (CD133+CD90+) subset. In vivo experiments in NSG mice transplanted with treated hMPB-CD34+ cells showed engraftment and multilineage differentiation capacity as good as in mice that received non-edited cells.
We obtained hMPB-CD34+ cells from a CID-G/AI RAG1 patient carrying a hypomorphic mutation and presenting with combined immunodeficiency associated with severe inflammation and signs of autoimmunity. We confirmed that the editing protocol did not affect HSPC composition in RAG1-deficient cells (fig. 10B). Even in this case, we reached a targeting frequency of 86%, as shown by ddPCR analysis. In vivo transplantation of treated and untreated cells showed lower engraftment in the peripheral blood of NSG mice that received patient cells than in mice that received HD cells. In contrast, comparable engraftment was observed in bone marrow and spleen between mice receiving HD cells and mice receiving patient cells, suggesting that gene-edited patient-derived CD34+ cells retain engraftment and multilineage differentiation capacity in vivo, as shown by the comparable analyses of central and peripheral lymphoid organs. The severe inflammatory conditions and/or drug administration (anti-CD20 monoclonal antibodies or high-dose corticosteroids) occurring in CID patients may affect CD34+ cell fitness.
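The codon optimization step described above can be illustrated with a minimal sketch. It assumes a small, hypothetical preference table (`PREFERRED_CODON` is a placeholder for illustration, not measured codon usage data and not the table used for the RAG1 donor) and checks that the encoded amino acid sequence is unchanged.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Hypothetical preference table: one "frequent" codon for a few amino acids.
# Placeholder values for illustration only.
PREFERRED_CODON = {"L": "CTG", "A": "GCC", "G": "GGC", "V": "GTG", "K": "AAG"}


def translate(cds):
    """Translate a coding sequence (length divisible by 3) into one-letter amino acids."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def codon_optimize(cds):
    """Replace each codon with the preferred synonymous codon, when one is listed."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    optimized = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        # Codons for amino acids without a listed preference are left untouched.
        optimized.append(PREFERRED_CODON.get(amino_acid, codon))
    result = "".join(optimized)
    # The protein sequence must be identical before and after optimization.
    if translate(result) != translate(cds):
        raise AssertionError("optimization changed the amino acid sequence")
    return result
```

For a toy sequence such as `ATGTTAGCAGGTGTAAAATAA`, the function swaps the synonymous codons listed in the table and leaves the start and stop codons as they are.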
Overall, we have established an efficient and promising genome editing platform to correct RAG1 defects.
Materials and methods
Production and titration of lentiviral vectors
LV was produced by transient transfection of 293T cells. Twenty-four hours before transfection, 9×10⁶ cells were plated in 15 cm dishes, and Iscove's Modified Dulbecco's Medium (IMDM) was changed 2 hours before transfection. For each 15 cm dish, the desired transfer vector (34 μg) was mixed with 9 μg of VSV-G envelope-encoding plasmid, 12.5 μg pMDLg/pRRE, 6.25 μg REV plasmid and 15 μg pADVANTAGE. This mixture was added to 293T cells by calcium phosphate precipitation. After 12-14 hours, the medium was replaced with fresh complete IMDM supplemented with 1 mM sodium butyrate. The supernatant was collected and filtered 30 hours after the medium change. After collection, LV was concentrated 500-fold by ultracentrifugation (2 hours, 20,000 rpm, 20 °C). Serial dilutions of LV were used to infect a known number of 293T cells. After 3 days, genomic DNA (gDNA) was isolated from the different dilutions using a Blood and Tissue kit. Vector copy number (VCN) of LV was measured by ddPCR. The titer was calculated using the following formula: titer = VCN × dilution factor × number of 293T cells infected. The p24 HIV protein was measured by ELISA (Abcam 218268) to estimate the amount of vector particles and to calculate the relative infectivity of the vector preparation.
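As a worked illustration of the titer formula above, the following sketch applies it literally. The function names and the example numbers are hypothetical, and the definition of relative infectivity as titer divided by the amount of p24 is an assumption that the text does not spell out.

```python
def lv_titer(vcn, dilution_factor, cells_infected):
    """Titer as stated in the text: VCN x dilution factor x number of 293T cells infected."""
    if vcn < 0 or dilution_factor <= 0 or cells_infected <= 0:
        raise ValueError("inputs must be positive (VCN may be zero)")
    return vcn * dilution_factor * cells_infected


def relative_infectivity(titer, p24_ng):
    """Assumed definition: transducing units per ng of p24 measured by ELISA."""
    if p24_ng <= 0:
        raise ValueError("p24 amount must be positive")
    return titer / p24_ng


# Hypothetical example: VCN of 0.5 at a 1:1000 dilution on 1e5 infected cells.
example_titer = lv_titer(vcn=0.5, dilution_factor=1_000, cells_infected=100_000)
example_infectivity = relative_infectivity(example_titer, p24_ng=1_000.0)
```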
Cas9 inducible cell lines
NALM6 Cas9 cell lines were generated by transducing NALM6 cells with a lentiviral vector expressing the Cas9 protein under the control of a TET-inducible promoter and a vector constitutively expressing the TET transactivator (Clackson T. Vol. 7, Gene Therapy. 2000. p. 120-5). When doxycycline is added to the culture medium, the TET transactivator binds to the Cas9 promoter and induces its expression in cells. K562 Cas9 cell lines were generated with the same vectors. Doxycycline was administered 24 hours before nuclease electroporation. The cell lines were maintained in RPMI 1640 medium supplemented with 10% FBS, glutamine and penicillin/streptomycin antibiotics (complete medium).
gRNA and RNP Assembly
Cas9 protein and custom guide RNAs were purchased from Integrated DNA Technologies (IDT) and assembled according to the manufacturer's protocol. To enhance stability in cells, chemically modified guide RNAs were used. Briefly, crRNAs and tracrRNAs were annealed by heating at 95 °C for 5 minutes and then cooling slowly at RT for 10 minutes. The Cas9 protein was then incubated with the annealed guide RNA for 15 minutes at room temperature to assemble the ribonucleoprotein (RNP).
The guide sequences are shown in the following table:
Guide 1 TTTTCCGGATCGATGTGA
Guide 2 GACATCTCTGCCGCATCTG
Guide 3 GTGGGTGCTGAATTTCATC
Guide 4 GATTGTGGGCCAAGTAACG
Guide 5 GAAAGTCACTGTTGGTCGA
Guide 6 CAATTTTGAGGTGTTCGTT
Guide 7 GGGTTGAGTTCAACCTAAG
Guide 8 TTAGCCTCATTGTACTAGC
Guide 9 TCAGATGGCAATGTCGAGA
Guide 10 GCAATTTTGAGGTGTTCGT
Guide 11 ACCAGCCTCGGGATCTCAA
Guide 12 TCAAATCAGTCGGGTTTCC
Guide RAG1KO CCTTCTCAGCATTCCGA
Guide RAG1KO AACATCTTCTGTCGCTGACT
When used directly as RNA, the following guide sequences of guides 3, 7, 9 and RAG1KO may be used:
Guide 3 TGTGGGTGCTGAATTTCATC
Guide 7 GGGGTTGAGTTCAACCTAAG
Guide 9 GTCAGATGGCAATGTCGAGA
Guide RAG1KO GTACCTTCTCAGCATTCCGA
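The relationship between the two guide lists above can be checked with a short sketch. The sequences are copied from the lists (for RAG1KO, the first of the two listed entries is used); the helper names and the conversion of T to U for an RNA representation are illustrative assumptions.

```python
# Sequences as listed in the guide table (DNA alphabet).
GUIDES_TABLE = {
    "3": "GTGGGTGCTGAATTTCATC",
    "7": "GGGTTGAGTTCAACCTAAG",
    "9": "TCAGATGGCAATGTCGAGA",
    "RAG1KO": "CCTTCTCAGCATTCCGA",
}
# Sequences listed for use directly as RNA.
GUIDES_AS_RNA = {
    "3": "TGTGGGTGCTGAATTTCATC",
    "7": "GGGGTTGAGTTCAACCTAAG",
    "9": "GTCAGATGGCAATGTCGAGA",
    "RAG1KO": "GTACCTTCTCAGCATTCCGA",
}


def five_prime_extension(short, extended):
    """Return the extra 5' nucleotides of the extended guide, or raise if they do not match."""
    if not extended.endswith(short):
        raise ValueError("extended guide does not end with the table sequence")
    return extended[: len(extended) - len(short)]


def to_rna(seq):
    """Write a DNA-alphabet sequence in the RNA alphabet (T -> U)."""
    return seq.upper().replace("T", "U")


def gc_content(seq):
    """Fraction of G and C bases in the sequence."""
    return sum(base in "GC" for base in seq.upper()) / len(seq)


extensions = {name: five_prime_extension(GUIDES_TABLE[name], GUIDES_AS_RNA[name])
              for name in GUIDES_TABLE}
```

Each sequence in the second list is 20 nucleotides long and ends with the corresponding entry of the first list, so the two lists differ only by a short 5' extension.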
Mismatch selective endonuclease assay
The T7 endonuclease I (T7E1) assay was used to measure NHEJ-induced indels. Briefly, gDNA of the gene-edited cells was extracted and amplified by PCR with primers flanking the Cas9 RNP target site. The PCR product was denatured, slowly re-annealed and digested with T7 endonuclease (New England BioLabs) for 1 hour at 37 °C. The T7 nuclease cleaves DNA only at sites where there is a mismatch between the DNA strands, that is, in heteroduplexes formed between re-annealed wild-type and mutant alleles. Fragments were separated on a LabChip GXII Touch high-resolution DNA chip and analyzed with the provided software. The ratio of uncleaved parental fragment to cleaved fragments was calculated, giving a good estimate of the NHEJ efficiency of the artificial nuclease. Calculation of % NHEJ: (sum of cleaved fragments) / (sum of cleaved fragments + parental fragment) × 100. Primers for NHEJ assay:
Guides 1, 2, 3 FW CCATAAACACTGTCAGAAGAGG
Guides 1, 2, 3 RV GTGTTGCAGATGTCACAGG
Guides 4, 9, 11 FW GAAGTGGTTCATGCAAGAGG
Guides 4, 9, 11 RV GGATGAACATGGAGAAAGCAG
Guides 6, 7, 10 FW GGGGAGAAATGTGTAGGGAAG
Guides 6, 7, 10 RV CTCAAAAACAAAGAAATGGGCG
Guides 5, 8, 12 FW ATAGGTGGATGGGATGATGG
Guides 5, 8, 12 RV CCTCTTCTGACAGTGTTTATGG
Guide RAG1KO FW GGAAAATGAATGCCAGGCAG
Guide RAG1KO RV AGGTCATCATGCTGTACAAATG
Guide RAG1KO FW TCCATGCTTCCCTACTGAC
Guide RAG1KO RV CTCCCATTCCATCACAAGAC
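The % NHEJ calculation given for the T7E1 assay can be sketched as follows. The function name and the example band intensities are hypothetical; the input is assumed to be the quantified signal of each fragment as reported by the fragment analysis software.

```python
def percent_nhej(cleaved_fragments, parental_fragment):
    """% NHEJ = (sum of cleaved fragments) / (sum of cleaved fragments + parental fragment) x 100."""
    total_cleaved = sum(cleaved_fragments)
    denominator = total_cleaved + parental_fragment
    if denominator <= 0:
        raise ValueError("no signal: cleaved plus parental fragments must be positive")
    return 100.0 * total_cleaved / denominator


# Hypothetical example: two cleavage products (30 and 20 units) and 50 units of parental band.
example_nhej = percent_nhej([30.0, 20.0], 50.0)
```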
Off-target analysis
In silico prediction of the off-target profile was performed with COSMID (CRISPR Off-target Sites with Mismatches, Insertions, and Deletions) to search for potential CRISPR off-target sites in the genome (Cradick TJ, et al. Mol Ther Nucleic Acids. 2014;3(12):e214). For the GUIDE-Seq analysis, K562 cells were electroporated with 50 pmol of high-fidelity Cas9 nuclease V3 with guide 7 or guide 9 (as RNP) and a dsODN that tags the breaks via an end-joining process consistent with NHEJ. The dsODN integration sites in genomic DNA were precisely mapped at the nucleotide level using unbiased amplification and next-generation sequencing (Tsai SQ, et al. Nat Biotechnol. 2015;33(2):187-97). Library construction and GUIDE-Seq sequencing were performed by Creative Biogen Biotechnology (NY, USA) using unique molecular identifiers (UMIs) to track PCR duplicates. Quality checks and trimming of sequencing reads were performed using FastQC and trim_galore, respectively. High-quality reads were aligned against the human reference genome (GRCh38) using Bowtie2 (Langmead B, Salzberg SL. Nat Methods. 2012;9(4):357-9) in "very-sensitive-local" mode to achieve optimal alignment. GUIDE-Seq data analysis was performed using the R/Bioconductor package GUIDEseq (Zhu LJ, et al. BMC Genomics. 2017;18(1)), using the UMIs to deduplicate reads.
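The idea behind UMI-based removal of PCR duplicates mentioned above can be illustrated with a simplified sketch. This is not the logic of the GUIDEseq package, whose implementation may differ; the `Read` record and its fields are hypothetical, and reads are treated as duplicates when they share alignment position, strand and UMI.

```python
from collections import namedtuple

# Hypothetical minimal representation of an aligned read.
Read = namedtuple("Read", ["chrom", "pos", "strand", "umi", "name"])


def deduplicate_by_umi(reads):
    """Keep the first read seen for each (chromosome, position, strand, UMI) combination."""
    unique = {}
    for read in reads:
        key = (read.chrom, read.pos, read.strand, read.umi)
        # Reads with the same key are assumed to be PCR copies of one original molecule.
        unique.setdefault(key, read)
    return list(unique.values())


reads = [
    Read("chr11", 1000, "+", "ACGTACGT", "r1"),
    Read("chr11", 1000, "+", "ACGTACGT", "r2"),  # same position and UMI: PCR duplicate of r1
    Read("chr11", 1000, "+", "TTGCATGC", "r3"),  # same position, different UMI: distinct molecule
]
deduplicated = deduplicate_by_umi(reads)
```

Counting unique molecules rather than raw reads prevents amplification bias from inflating the apparent number of dsODN integration events at a site.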
Donor constructs
Plasmids were cloned using basic molecular biology techniques. Briefly, plasmids were digested with restriction enzymes (New England BioLabs), and the correct fragments were isolated and purified by agarose gel electrophoresis. After purification using the QIAquick PCR purification kit (QIAGEN), the fragments were inserted into the dephosphorylated linearized backbone using either Quick ligase or T4 ligase. After ligation, TOP10 chemically competent E. coli were transformed and plated onto plates containing antibiotics. Plasmid DNA was extracted and purified using the Wizard Plus SV Minipreps DNA purification system (Promega) and the EndoFree Plasmid Maxi Kit (QIAGEN). Colonies were screened by control digestion and sequenced. The sequences of the vector inserts, with the main features, are reported below:
AAV6 vector carrying sa_gfp_bghpolya, guide 9:
insert
HA_left
Splice acceptor
KOZAK
GFP
PolyA
HA_Right
AAV6 vector carrying sa_gfp_wpre_bghpolya, guide 9:
insert
HA_left
Splice acceptor
KOZAK
GFP
WPRE
PolyA
HA_Right
AAV6 vector carrying sa_gfp_sd, guide 9:
insert
HA_left
Splice acceptor
KOZAK
GFP
Splice donor
HA_Right
AAV6 vector carrying pgk_gfp_bghpolya, guide 9:
insert
HA_left
PGK promoter
KOZAK
GFP
PolyA
HA_Right
AAV6 vector carrying sa_gfp_3'utr-RAG1, guide 9:
insert
HA_left
Splice acceptor
KOZAK
GFP
3’UTR
HA_Right
AAV6 vector carrying sa_rag1-cds_bghpolya, guide 9:
insert
HA_left
SA
KOZAK
RAG1-CDS
PolyA
HA_Right
AAV6 vector carrying pgk_gfp_bghpolya, guide 3:
insert
HA_left
PGK promoter
KOZAK
GFP
PolyA
HA_Right
Lentiviral vector carrying TET_Cas9_PGK_Puro:
insert
Cas9
PGK promoter
Puromycin
rtTA
WPRE
Flow cytometry analysis (FACS) and cell sorting
Assays were performed to assess the integration of GFP cassettes in different cell types and cell populations. Unstained and single-stained cells or compensation beads were used as negative and positive controls. For apoptosis/necrosis detection, cells were stained with 7-aminoactinomycin D (7-AAD, BD Pharmingen) and Pacific Blue (PB) annexin V (Biolegend). HSCs were stained with phycoerythrin-cyanine 7 (PECy7) CD34 (clone: AC136, Miltenyi Biotec), phycoerythrin (PE) CD133 (Miltenyi Biotec) and allophycocyanin (APC) CD90 (BD Biosciences). Cell sorting of edited cells based on CD133/CD90 was performed using a MoFlo XDP cell sorter (Beckman Coulter).
For mouse analysis, single cell suspensions were obtained from bone marrow, spleen, thymus and peripheral blood and stained with the following anti-human antibodies: CD45 (clone REA757), CD3 (clone REA613) (Miltenyi Biotec), CD19 (clone SJ25C1), CD13 (clone WM15) (BD Biosciences). Before each staining, human and murine Fc blocking was performed using human Fc Block and murine CD16/CD32 from BD Pharmingen. Live/Dead Fixable Yellow (Thermo Fisher Scientific, Waltham, MA) was added to the antibody mixture to exclude dead cells. Samples were acquired on a FACSCanto II (BD) and analyzed using FlowJo software (TreeStar, Ashland, OR).
HSPC composition analysis of MPB-CD34+ cells was performed according to the protocol described in Basso-Ricci L, et al. Cytom Part A. 2017;91(10):952-65. Briefly, 1.5×10⁵ cells were labeled with fluorescent antibodies against CD3, CD56, CD14, CD61/41, CD135, CD34, CD45RA (Biolegend) and CD33, CD66b, CD38, CD45, CD90, CD10, CD11c, CD19, CD7 and CD71 (BD Biosciences). All samples were acquired on a BD LSR-Fortessa (BD Biosciences) cytofluorimeter after calibration with Rainbow beads (Spherotech), and raw data were collected with DIVA software (BD Biosciences). The data were then analyzed with FlowJo software version 9.3.2 (TreeStar), and the graphical output was generated automatically with Prism 6.0c (GraphPad software).
AAV6 production and titration
AAV vectors were produced by transient triple transfection of HEK293 cells using calcium phosphate. The next day, the medium was changed to serum-free DMEM, and cells were harvested 72 hours after transfection. Cells were lysed by three rounds of freeze-thawing to release the viral particles, and lysates were incubated with DNase I and RNase I to eliminate nucleic acids. AAV vectors were then purified by successive rounds of cesium chloride (CsCl) gradients. For each viral preparation, the physical titer (genome copies/mL) was determined by TaqMan PCR quantification.
AAV6 Gene editing scheme in cell lines
2×10⁵/5×10⁵ cells per well were electroporated with plasmid or RNP (Lonza, SF Cell Line 4D-Nucleofector X kit, program FF120 for K562 or program DS100 for NALM6). Fifteen minutes after electroporation, cells were infected with AAV6 at different MOIs (10⁴; 5×10⁴; 10⁵ vector genomes/cell, vg/cell).
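The amount of AAV6 needed for a given MOI follows from simple arithmetic, sketched below. The function name and the example stock titer of 1×10¹² vg/mL are hypothetical; the cell number and MOI values are taken from the protocol above.

```python
def aav_volume_ul(cells, moi_vg_per_cell, stock_titer_vg_per_ml):
    """Volume of AAV stock (in microliters) that delivers the requested MOI."""
    if cells <= 0 or moi_vg_per_cell <= 0 or stock_titer_vg_per_ml <= 0:
        raise ValueError("all inputs must be positive")
    vector_genomes_needed = cells * moi_vg_per_cell
    # Multiply by 1000 to convert mL of stock to microliters.
    return vector_genomes_needed * 1000.0 / stock_titer_vg_per_ml


# 2e5 cells at each MOI of the protocol, with a hypothetical stock at 1e12 vg/mL.
volumes = {moi: aav_volume_ul(2e5, moi, 1e12) for moi in (1e4, 5e4, 1e5)}
```

Under these assumptions, 2×10⁵ cells at an MOI of 10⁴ vg/cell require 2×10⁹ vector genomes.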
CD34+ cells
Human cord blood CD34+ cells (CB CD34+ cells) were obtained from Lonza (Poietics™ cat#2C101). CB CD34+ cells/ml were stimulated in StemSpan medium supplemented with penicillin/streptomycin antibiotics and the following early-acting cytokines: stem cell factor (SCF) 100 ng/ml, Flt3 ligand (Flt3-L) 100 ng/ml, thrombopoietin (TPO) 20 ng/ml, interleukin 6 (IL-6) 20 ng/ml, StemRegenin1 (SR1) (1 µM), 16,16-dimethyl prostaglandin E2 (dmPGE2) (10 µM) and UM171 50 nM. Patient mobilized peripheral blood CD34+ cells (MPB CD34+ cells) were kindly provided by Dr. Luigi Notarangelo (Laboratory of Clinical Immunology and Microbiology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland, USA). MPB CD34+ cells/ml were stimulated in StemSpan medium supplemented with penicillin/streptomycin antibiotics and the following early-acting cytokines: stem cell factor (SCF) 300 ng/ml, Flt3 ligand (Flt3-L) 300 ng/ml, thrombopoietin (TPO) 100 ng/ml, interleukin 3 (IL-3) 60 ng/ml, StemRegenin1 (SR1) (1 µM), 16,16-dimethyl prostaglandin E2 (dmPGE2) (10 µM) and UM171 50 nM.
AAV6 Gene editing scheme in CD34+ cells
After 3 days of expansion, 2×10⁵ CD34+ cells per condition were electroporated with RNP (Lonza, P3 Primary Cell 4D-Nucleofector X kit, CD34+ program), and GSE56 mRNA (p53 inhibitor) was added at a dose of 150 μg/ml when the cells were intended for transplantation. Fifteen minutes after electroporation, CD34+ cells were infected with AAV6 at different MOIs (10⁴; 5×10⁴; 10⁵ vg/cell).
Digital PCR
Droplet digital PCR (ddPCR) was performed to assess targeted integration. Briefly, gDNA was quantified using a Nanodrop and diluted in H2O to obtain 5-10 ng (1-2 ng/µl) per reaction. The amount of gDNA per reaction can be increased, but it is important to remain below the saturation limit of the system. The ddPCR master mix was prepared with, per reaction, 11 µl ddPCR Supermix for Probes (no dUTP; Bio-Rad), 1.1 µl primer mix (forward primer + reverse primer, final concentration 0.9 µM, + probe, final concentration 0.25 µM), 1.1 µl normalizer primer mix and 4.9 µl H2O. Finally, 17 µl of ddPCR master mix and 5 µl of diluted gDNA were added to each well (untreated (UT) cells and H2O served as negative controls, and mono- or bi-allelic clones served as positive controls to validate the system). Droplets were generated on a Bio-Rad AutoDG automated droplet generator, and the droplet plate was sealed with aluminum foil using a Bio-Rad PX1 PCR plate sealer. The sealed plate was placed in a Bio-Rad T100 thermal cycler and the appropriate PCR program was run. The plate was read on a Bio-Rad QX200 Droplet Reader.
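The per-reaction volumes listed above can be scaled to a plate with a short sketch. The dictionary keys, the function name and the optional `overage_fraction` parameter are hypothetical conveniences; only the four per-reaction volumes come from the text.

```python
# Per-reaction volumes in microliters, as listed in the protocol.
PER_REACTION_UL = {
    "supermix": 11.0,               # ddPCR Supermix for Probes (no dUTP)
    "target_primer_probe_mix": 1.1,
    "normalizer_primer_mix": 1.1,
    "water": 4.9,
}


def scale_master_mix(n_reactions, overage_fraction=0.0):
    """Return the volume of each component for n reactions, with optional extra overage."""
    if n_reactions <= 0:
        raise ValueError("number of reactions must be positive")
    factor = n_reactions * (1.0 + overage_fraction)
    return {name: round(volume * factor, 2) for name, volume in PER_REACTION_UL.items()}


per_reaction_total = sum(PER_REACTION_UL.values())
mix_for_10 = scale_master_mix(10)
```

The listed components add up to 18.1 µl per reaction while 17 µl is dispensed per well, so the recipe as written already includes about 1.1 µl of excess per reaction.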
Copies per genome were calculated as: concentration of gene of interest (copies/µl) / concentration of normalizer gene (copies/µl) × 2. The percentage of HDR was calculated as: copies per genome × 100.
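The two calculations above can be written out as a minimal sketch that applies the formulas exactly as stated. The function names and the example concentrations are hypothetical.

```python
def copies_per_genome(target_copies_per_ul, normalizer_copies_per_ul):
    """Copies per genome = target concentration / normalizer concentration x 2."""
    if normalizer_copies_per_ul <= 0:
        raise ValueError("normalizer concentration must be positive")
    return target_copies_per_ul / normalizer_copies_per_ul * 2


def percent_hdr(copies):
    """Percentage of HDR as stated in the text: copies per genome x 100."""
    return copies * 100


# Hypothetical droplet reader output: 430 target copies/µl and 1000 normalizer copies/µl.
example_copies = copies_per_genome(430.0, 1000.0)
example_hdr = percent_hdr(example_copies)
```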
Optimized PCR program (40 cycles):
·95 °C × 10 min
·40 × 94 °C × 30 sec
·55 °C × 1 min
·72 °C × 2 min
·98 °C × 10 min
·4 °C hold
Primers and probes for ddPCR assay were as follows:
PGK_GFP box FW CAAGAGGTTGTCTGAAGGAAG
PGK_GFP box RV GACGTGAAGAATGTGCGAG
PGK_GFP box PROBE FAM CTGCTGCACCCTGGCCTCCTGAACTAA
Corrective CDS FW GTGGAACAGGTGTGATAATGAG
Corrective CDS RV GGAGGACAATCCAAGGGTAG
Corrective CDS PROBE FAM TGCTGCTGCACCCTGGCCTCCTGAA
Mice and transplantation protocol
NOD-scid IL2Rgnull (NSG) mice were purchased from Charles River Laboratories Inc. (Calco, Italy) and maintained under specific pathogen-free (SPF) conditions. Mice were transplanted at 8-10 weeks of age by intravenous injection of treated HSPCs in phosphate-buffered saline approximately 6 hours after sublethal total-body irradiation (120 rad). Gentamicin sulfate (Italfarmaco, Milan, Italy) was added to the drinking water (8 mg/mL) for the first 2 weeks after transplantation to prevent infection. Mice were monitored until the experimental endpoint and then euthanized for ex vivo analysis.
Statistical analysis
When the normality assumption was not met, non-parametric statistical tests were performed. When more than two groups were compared, a Kruskal-Wallis test with post hoc multiple-comparison testing was performed. When the normality assumption was met, a two-way analysis of variance (ANOVA) was used. For repeated measurements over time, a two-way ANOVA with Bonferroni post hoc multiple-comparison testing was used. Values are expressed as mean ± standard deviation.
Example 2
Results and discussion
Corrective donor screening
To further explore the role and selection strategy of the 3' UTR, the following further corrective donor sequences numbered 5-8 were designed and compared to the following sequences numbered 1-4 (FIG. 11A):
1. a construct carrying Bovine Growth Hormone (BGH) PolyA downstream of SA_GFP (SA_GFP_BGH);
2. a construct carrying a woodchuck hepatitis virus posttranscriptional regulatory element (WPRE) downstream of SA_GFP and upstream of BGH PolyA (SA_GFP_WPRE);
3. a construct containing a splice donor downstream of the SA_GFP cassette (SA_GFP_SD) to obtain a fusion transcript comprising the corrective sequence and the endogenous RAG1 sequence followed by the 3'UTR;
4. a construct with the same endogenous RAG1 3'UTR after the SA_GFP cassette (SA_GFP_3'UTR);
5. a construct with the SA_GFP cassette followed by the endogenous RAG1 3'UTR and BGH PolyA (SA_GFP_3'UTR_BGH);
6. a construct with the SA_GFP cassette followed by an internal ribosome entry site (IRES) sequence, a clinically compatible selector (C-terminally truncated low-affinity NGFR receptor, hereinafter NGFR) and the BGH PolyA sequence (SA_GFP_IRES_NGFR_BGH); this strategy might allow enrichment of edited cells through the NGFR selector and improved GFP expression through IRES and mRNA stabilization;
7. a construct with the SA_GFP cassette followed by an IRES, a PEST sequence (a peptide sequence rich in proline (P), glutamic acid (E), serine (S) and threonine (T)) and a splice donor sequence (SA_GFP_IRES_PEST_SD); this construct will produce a fusion transcript comprising the corrective sequence and the endogenous RAG1 sequence followed by the 3'UTR (the endogenous RAG1 protein is expected to be destabilized by the PEST signal peptide via proteasomal degradation);
8. a construct with GFP expression driven by the PGK promoter, serving as an internal positive control (PGK_GFP_BGH).
To screen for the above donors, NALM6 cells were transfected with guide 9 and Cas9 as RNP (25 pmol) and donor as linearized DNA fragment (1600 ng) and then cultured with RPMI and 10% FBS. To synchronize the cell cycle at the G0/G1 phase, where the RAG1 gene was mainly expressed, cells were serum starved 16 days after transfection (fig. 11B).
We assessed GFP expression over time by flow cytometry as the percentage of GFP+ cells and the GFP mean fluorescence intensity (MFI). As expected, the proportion of GFP+ cells was low under all conditions, because NALM6 cells are poorly permissive to editing. We confirmed the data depicted in fig. 3, showing that cells edited with the SA_GFP_SD and SA_GFP_3'UTR constructs have a lower MFI than that obtained with SA_GFP_BGH (fig. 11C). In addition, SA_GFP_IRES_NGFR_BGH and SA_GFP_IRES_PEST_SD did not improve GFP expression compared with the other constructs (fig. 11C).
We analyzed GFP expression 4, 5 and 7 days after serum starvation to evaluate the modulation of transgene expression when regulated by the RAG1 promoter. We found that all donors carrying 3'utr or using endogenous 3' utr (via SD sequences) resulted in modulation of GFP expression upon starvation (figure 11D).
Effect of edit enhancers on the HDR efficiency and T cell differentiation potential of RAG1 loci
To further understand the efficacy of gene editing methods to correct RAG1 deficiency, we utilized a novel organoid platform called Artificial Thymus Organoid (ATO) based on the DLL4 expressing stromal cell line (MS 5-hDLL 4) and CD34 isolated from bone marrow or mobilized peripheral blood + Aggregation of cells. The ATO platform (setet al (2017) Nat Methods) is a suitable tool for studying the first step of human T cell differentiation. We used this platform to evaluate the effect of the gene editing protocol on T cell differentiation and evaluate the extent to which accurate correction allows overcoming T cell differentiation blockades.
To this end, we established and optimized the ATO system using cd34+ cells obtained from peripheral blood (MPB) or Bone Marrow (BM) mobilized from Healthy Donors (HD). One day after editing, cd34+ cells aggregated with MS5-hDLL4 cells and remained cultured for 4 to 7 weeks to evaluate T cell differentiation potential and editing efficiency (fig. 12). ATO produced with genetically edited CD34+ cells showed lower cell viability than ATO containing untreated CD34+ cells.
To overcome the high toxicity that may be caused by the exacerbated p53 response and simultaneously increase HDR efficiency, we tested the effect of gene editing enhancer compounds: to this end, we utilized the messenger RNA of dominant negative p53GSE56 with or without Ad5-E4orf6/7, or Ad5-E4orf6/7 alone, during editing. Ad5-E4orf6/7 is an adenovirus protein called a helper in Ad-AAV co-infection that interacts with several components involved in survival and cell cycle.
We electroporate cd34+ cells in the presence of the following gene editing enhancers: GSE56 or Ad5-E4orf6/7 alone or a combination of GSE56 and Ad5-E4orf6/7 (COMBO). Cells were then transduced with the following AAV6 vector: a corrected donor vector carrying codon optimized RAG1 downstream of the Splice Acceptor (SA) followed by BGH polyA (sa_corrag1_bgh polyA) or an AAV6 vector carrying pgk_gfp_bghhpolya to track edited cells in a subset of HPSC cells (fig. 12A). The HDR efficiency of CD34+ cells edited with SA-corAG1-BGHpolyA was evaluated by ddPCR 7 days after gene editing, while CD34+ cells edited with PGK_GFP_BGHpolyA were evaluated by flow cytometry. In the presence of a corrective donor, molecular analysis showed a significant increase in the frequency of the edited allele under the gene editing conditions performed in the presence of gse56+ad5-E4orf6/7 (COMBO) (fig. 12B). Notably, cd34+ cells undergoing gene editing with AAV 6pgk_gfp_bghdraa revealed that the frequency of GFP positive cells was 40% in the most primitive HSPC subset (cd133+cd90+) (fig. 12C).
Furthermore, we performed multiparametric analysis of MPB or BM HSPC composition before (day 0) and after (day 4) gene editing (fig. 12D). We confirmed the previous data (fig. 10) showing that the redistribution of HSPC subpopulations is primarily due to the expansion protocol. On day 4, in both untreated and edited CD34+ cells, we observed a relative expansion of hematopoietic stem cells (HSCs), multipotent progenitors (MPPs) and multi-lymphoid progenitors (MLPs) at the expense of common myeloid progenitors (CMPs), indicating that the gene editing protocol using GSE56+Ad5-E4orf6/7 (COMBO) preserved stemness in the cell composition (fig. 12D).
Twenty-four hours after gene editing (day 4), CD34+ cells were washed, counted and seeded in the presence of MS5-hDLL4 cells to form thymic organoids, and T cell differentiation was followed for 4-7 weeks. Starting from the fourth week after seeding, ATOs were dissociated; the HDR efficiency of bulk cells edited with the corrective donor was assessed by molecular analysis (ddPCR), while cells edited with the PGK_GFP_BGHpolyA AAV6 vector were analyzed by flow cytometry to detect GFP+ cells in the different T cell subsets. Evaluation of the ATOs showed that organoid morphology improved in the presence of the GSE56+Ad5-E4orf6/7 combination (fig. 13A). This finding was confirmed by the increased number of cells harvested from ATOs seeded with CD34+ cells edited in the presence of Ad5-E4orf6/7, which reached its highest value with COMBO treatment (fig. 13B).
Molecular analysis of HDR frequency in T cells differentiated from CD34+ cells edited with SA_coRAG1_BGHpolyA further demonstrated the synergy of GSE56+Ad5-E4orf6/7, revealing a higher proportion of edited alleles under COMBO conditions than under the other conditions (fig. 13C). Flow cytometry analysis of double-negative (DN), double-positive (DP) and single-positive (SP) T cells obtained from ATOs seeded with CD34+ cells edited and transduced with AAV6 PGK_GFP_BGHpolyA showed that GFP+ cells were most frequent under COMBO conditions (fig. 13D). The synergistic effect of GSE56+Ad5-E4orf6/7 was more pronounced in the TCR alpha/beta+ cell subset, a relevant subset that is absent in RAG1-deficient patients.
Overall, these data demonstrate that the use of gene editing enhancers significantly increases HDR editing efficiency in CD34+ cells while preserving their ability to differentiate toward the T cell lineage.
Materials and methods
Donor constructs
Cloning of the plasmids was performed using standard molecular biology techniques. Briefly, plasmids were digested with restriction enzymes (New England BioLabs) and the correct fragments were isolated and purified by agarose gel electrophoresis. After purification using the QIAquick PCR purification kit (QIAGEN), the fragments were inserted into the dephosphorylated linearized backbone using either Quick ligase or T4 ligase. After ligation, chemically competent TOP10 E. coli were transformed and plated onto plates containing antibiotics. Plasmid DNA was extracted and purified using the Wizard Plus SV Minipreps DNA purification system (Promega) and the EndoFree Plasmid Maxi Kit (QIAGEN). Colonies were screened with control digests and sequenced. The sequences of the further inserts are shown below:
AAV6 vector carrying SA_GFP_3'UTR_BGH, guide 9:
insert
HA_left
Splice acceptor
KOZAK
GFP
3’UTR
BGH
HA_Right
AAV6 vector carrying SA_GFP_IRES_NGFR_BGH-RAG1, guide 9:
insert
HA_left
Splice acceptor
KOZAK
GFP
IRES
NGFR
BGH
HA_Right
AAV6 vector carrying SA_GFP_IRES_PEST_SD-RAG1, guide 9:
insert
HA_left
Splice acceptor
KOZAK
GFP
IRES
PEST
Splice donor
HA_Right
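For orientation only, the 5' to 3' element order of the three inserts listed above can be written as ordered lists. The sketch below is illustrative: the dictionary name `DONOR_INSERTS` and the helper `flanked_by_homology_arms` are hypothetical, the labels are those given in the lists above, and no nucleotide sequences are represented.

```python
# Element order (5' to 3') of the three donor inserts, using the labels
# from the lists above. These are labels only, not sequences.
DONOR_INSERTS = {
    "SA_GFP_3'UTR_BGH": [
        "HA_left", "Splice acceptor", "KOZAK", "GFP", "3'UTR", "BGH", "HA_Right",
    ],
    "SA_GFP_IRES_NGFR_BGH-RAG1": [
        "HA_left", "Splice acceptor", "KOZAK", "GFP", "IRES", "NGFR", "BGH", "HA_Right",
    ],
    "SA_GFP_IRES_PEST_SD-RAG1": [
        "HA_left", "Splice acceptor", "KOZAK", "GFP", "IRES", "PEST",
        "Splice donor", "HA_Right",
    ],
}


def flanked_by_homology_arms(elements):
    """Return True if the cassette starts with HA_left and ends with HA_Right."""
    return (
        len(elements) >= 3
        and elements[0] == "HA_left"
        and elements[-1] == "HA_Right"
    )


for name, elements in DONOR_INSERTS.items():
    # Every insert above places its payload between the two homology arms.
    print(name, flanked_by_homology_arms(elements))
```

Such a representation makes the shared layout explicit: in each insert the splice acceptor immediately follows the left homology arm, and the right homology arm closes the cassette.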
AAV6 gene editing protocol in cell lines
5x10^5 cells per well were electroporated with plasmid or RNP (Lonza, SF Cell Line 4D-Nucleofector X Kit, program FF120 for K562 or program DC100 for NALM6). The donor DNA was delivered by electroporation, at a dose of 1600 ng, as a plasmid fragment spanning the region between the left and right homology arms.
CD34+ cells
Human MPB or BM CD34+ cells were obtained from Lonza and stimulated in StemSpan medium supplemented with penicillin/streptomycin antibiotics and the following early-acting cytokines: stem cell factor (SCF) 300 ng/ml, Flt3 ligand (Flt3-L) 300 ng/ml, thrombopoietin (TPO) 100 ng/ml, StemRegenin1 (SR1) (1 μM) and 16,16-dimethyl prostaglandin E2 (dmPGE2) (10 μM), UM171 nM.
AAV6 gene editing protocol in CD34+ cells
After 3 days of expansion, 2-5x10^5 CD34+ cells per condition were electroporated with RNP, GSE56 mRNA (3 µg/test), Ad5-E4orf6/7 (1.5 µg/test) or GSE56+Ad5-E4orf6/7 as a fusion protein with a P2A self-cleaving peptide (5 µg/test) (Lonza, P3 Primary Cell 4D-Nucleofector X Kit, CD34+ program). Fifteen minutes after electroporation, CD34+ cells were infected with AAV6 at 10^4 vg/cell and kept in culture in StemSpan medium supplemented with penicillin/streptomycin antibiotics and early-acting cytokines: stem cell factor (SCF) 300 ng/ml, Flt3 ligand (Flt3-L) 300 ng/ml, thrombopoietin (TPO) 100 ng/ml, StemRegenin1 (SR1) (1 μM) and UM171 nM.
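As an illustration of the dosing arithmetic implied above (10^4 vg per cell applied to 2-5x10^5 cells), the following sketch converts a cell number and a dose into a volume of vector stock. The function name `aav_volume_ul` and the stock titer used in the example are assumptions for illustration; no titer is stated in this description.

```python
def aav_volume_ul(n_cells, vg_per_cell, titer_vg_per_ml):
    """Volume of AAV stock (in microliters) needed to deliver the requested dose.

    n_cells          -- number of cells to be transduced
    vg_per_cell      -- dose in vector genomes per cell (10^4 in the protocol above)
    titer_vg_per_ml  -- titer of the vector stock (assumed, batch specific)
    """
    if titer_vg_per_ml <= 0:
        raise ValueError("titer must be positive")
    total_vg = n_cells * vg_per_cell
    return total_vg / titer_vg_per_ml * 1000.0  # ml -> microliters


# Example: 2x10^5 cells at 10^4 vg/cell with a hypothetical 1x10^12 vg/ml stock
# requires 2x10^9 vg in total, i.e. about 2 microliters of stock.
print(aav_volume_ul(2e5, 1e4, 1e12))
```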
Flow cytometry analysis (FACS) and sorting
For analysis of GFP expression, unstained and single-stained cells or compensation beads were used as negative and positive controls. For apoptosis/necrosis detection, cells were stained with 7-aminoactinomycin D (7-AAD, BD Pharmingen). CD34+ cells were stained with phycoerythrin-cyanine 7 (PECy7) CD34 (clone: AC136, Miltenyi Biotec), phycoerythrin (PE) CD133 (Miltenyi Biotec) and allophycocyanin (APC) CD90 (BD Biosciences). Cell sorting of edited cells based on CD133/CD90 was performed using a MoFlo XDP cell sorter (Beckman Coulter).
Analysis of the HSPC composition of MPB/BM CD34+ cells was performed according to the protocol in Basso-Ricci et al. (2017) Cytom Part A. 91:952-65. Briefly, 1.5x10^5 cells were labeled with fluorescent antibodies against CD3, CD56, CD14, CD61/41, CD135, CD34, CD45RA (BioLegend) and CD33, CD66b, CD38, CD45, CD90, CD10, CD11c, CD19, CD7 and CD71 (BD Biosciences). All samples were acquired on a BD LSR-Fortessa (BD Biosciences) cytofluorimeter after Rainbow bead (Spherotech) calibration, and raw data were collected with DIVA software (BD Biosciences).
T cell differentiation was analyzed by flow cytometry after harvesting cells from ATOs, using the following mAbs: TCRab APC (cl. IP26, eBioscience), CD4 Alexa Fluor 700 (cl. OKT4, eBioscience), CD19 PerCP-Cy5.5 (cl. HIB19, BioLegend), CD56 FITC (cl. MEM-188, BioLegend), CD8a PE/Dazzle (cl. RPA-T8, BioLegend), CD45 V500 (cl. HI30, BD Biosciences), CD3 BV421 (cl. UCHT1, BD Biosciences), CD8b PE (cl. 2ST8.5H7, BD Biosciences) and the LIVE/DEAD Fixable Yellow Dead Cell Stain Kit (Invitrogen). All samples were acquired on a BD Canto II (BD Biosciences) cytofluorimeter after Rainbow bead (Spherotech) calibration, and raw data were collected with DIVA software (BD Biosciences).
The data were then analyzed with FlowJo software version 9.3.2 (TreeStar) and graphical output was generated with Prism 6.0c (GraphPad software).
Clonogenic assay
CFU-C assays were performed 24 hours after the editing procedure by plating 600 cells in methylcellulose-based medium (MethoCult H4434, StemCell Technologies) supplemented with 100 IU/ml penicillin and 100 μg/ml streptomycin. Three technical replicates were performed for each condition. Two weeks after plating, colonies were counted and identified according to morphological criteria.
ATO culture system
ATOs were generated as described in Seet et al. (Seet et al. (2017) Nat Methods). Briefly, one day after the editing protocol, 5,000-10,000 CD34+ cells from BM or MPB samples (commercially available, Lonza) were combined with 150,000 MS5-hDLL4 cells per ATO. We normalized the number of "real" live CD34+ cells based on flow cytometry analysis that excluded dead cells and CD34- cells. Each ATO (5 μl) was then plated on a 0.4 μm Millicell Transwell insert and placed in a well of a 6-well plate containing 1 ml of complete RB27 medium supplemented with rhIL-7 (5 ng/ml), rhFlt3-L (5 ng/ml) and 30 μM L-ascorbate 2-phosphate magnesium hydrate. Each insert contained a maximum of two ATOs. The medium was changed every 3-4 days. From week 4 to week 9, ATOs were collected by adding MACS buffer (PBS containing 7.5% BSA and 0.5 M EDTA) to each well and pipetting to dissociate the ATOs. Cells were then resuspended in FACS buffer (PBS with 2% FBS), counted and stained with the following antibodies: CD14 PE, CD45 PerCP-Cy5.5, CD1a APC, CD7 Alexa Fluor 700, CD5 PE-Cy7, CD34 VioBlue, CD56 FITC, CD8a APC, TCRab PerCP-Cy5.5, CD3 APC, CD4 PeVio770, CD8b PE. LIVE/DEAD Fixable Yellow stain was used to exclude dead cells. Samples were analyzed using FlowJo software version 10.5.2 (FlowJo, LLC, Ashland, OR).
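The normalization to "real" live CD34+ cells mentioned above can be sketched as a simple correction of the counted cell number by the fractions measured by flow cytometry. The function name `cells_to_seed` and the example fractions are hypothetical, and the sketch assumes that the CD34+ fraction is measured within the live-cell gate.

```python
import math


def cells_to_seed(target_live_cd34, live_fraction, cd34_fraction_of_live):
    """Total counted cells required so that the seeded aliquot contains the
    target number of live CD34+ cells.

    live_fraction          -- fraction of counted cells that are alive (0-1]
    cd34_fraction_of_live  -- fraction of live cells that are CD34+ (0-1]
    """
    for value in (live_fraction, cd34_fraction_of_live):
        if not 0 < value <= 1:
            raise ValueError("fractions must be in the interval (0, 1]")
    # Round up so that the target number is reached rather than undershot.
    return math.ceil(target_live_cd34 / (live_fraction * cd34_fraction_of_live))


# Example with assumed values: 80% viability and 90% CD34+ among live cells.
print(cells_to_seed(5000, 0.8, 0.9))
```

With a fully viable, pure CD34+ population the function returns the target number unchanged, so the correction only matters when editing has reduced viability or purity.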
Digital PCR
Droplet digital PCR (ddPCR) was performed to assess targeted integration. Briefly, gDNA was quantified using a NanoDrop and diluted in H2O to obtain 5-10 ng per reaction (1-2 ng/µl). The amount of gDNA per reaction can be increased, but it is important to remain below the saturation limit of the system. The ddPCR master mix was prepared with, for each reaction, 11 µl ddPCR Supermix for Probes (no dUTP; Bio-Rad), 1.1 µl of primer mix (forward primer + reverse primer, final concentration 0.9 µM, plus probe, final concentration 0.25 µM), 1.1 µl of normalization primer mix and 4.9 µl H2O. Finally, 17 µl of ddPCR master mix and 5 µl of diluted gDNA were added to each well (UT and H2O served as negative controls, and mono- or bi-allelic clones served as positive controls to validate the system). Droplets were generated on a Bio-Rad AutoDG Automated Droplet Generator and the droplet plate was sealed with foil using a Bio-Rad PX1 PCR Plate Sealer. The sealed plate was placed in a Bio-Rad T100 thermal cycler and the appropriate PCR program was run. The plate was read on a Bio-Rad QX200 Droplet Reader.
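By way of illustration, the per-reaction volumes given above can be scaled to a plate as follows. The dictionary and function names are hypothetical. Note that the listed components sum to 18.1 µl per reaction whereas 17 µl is dispensed per well; the sketch simply scales the listed volumes and does not assume any particular reason for the difference.

```python
# Per-reaction component volumes (microliters) as listed in the protocol above.
PER_REACTION_UL = {
    "ddPCR Supermix for Probes (no dUTP)": 11.0,
    "target primer/probe mix": 1.1,
    "normalization primer mix": 1.1,
    "H2O": 4.9,
}


def master_mix_volumes(n_reactions):
    """Scale each per-reaction volume to the requested number of reactions."""
    if n_reactions < 1:
        raise ValueError("at least one reaction is required")
    return {name: vol * n_reactions for name, vol in PER_REACTION_UL.items()}


mix = master_mix_volumes(24)
for name, vol in mix.items():
    print(f"{name}: {vol:.1f} ul")
print(f"total: {sum(mix.values()):.1f} ul")
```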
Copies per genome were calculated as: [concentration of gene of interest (copies/µl) / concentration of normalizer gene (copies/µl)] x 2.
The percentage of HDR was calculated as: copies per genome x 100.
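The two calculations above may be expressed as a short sketch. The function names and the example concentrations are hypothetical; the formulas are those stated above, in which the target concentration is divided by the normalizer concentration and multiplied by 2.

```python
def copies_per_genome(target_copies_per_ul, normalizer_copies_per_ul):
    """Copies of the gene of interest per genome, per the formula above."""
    if normalizer_copies_per_ul <= 0:
        raise ValueError("normalizer concentration must be positive")
    return target_copies_per_ul / normalizer_copies_per_ul * 2


def percent_hdr(target_copies_per_ul, normalizer_copies_per_ul):
    """Percentage of HDR, defined above as copies per genome x 100."""
    return copies_per_genome(target_copies_per_ul, normalizer_copies_per_ul) * 100


# Example with assumed droplet reader output: 50 copies/ul for the
# integration-specific assay and 1000 copies/ul for the normalizer assay,
# which corresponds to 0.1 copies per genome.
print(copies_per_genome(50, 1000))
print(percent_hdr(50, 1000))
```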
Optimized PCR program (40 cycles):
95 °C x 10 min
40x 94 °C x 30 sec
55 °C x 1 min
72 °C x 2 min
98 °C x 10 min
4 °C hold
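As a planning aid, the programmed hold time of the above PCR program can be estimated as sketched below. The sketch assumes that the 40 cycles apply to the 94, 55 and 72 degree steps, ignores ramp times between steps, and excludes the final 4 degree hold; all names are hypothetical.

```python
# Each step: (temperature in deg C, hold time in seconds).
INITIAL_STEPS = [(95, 600)]
CYCLED_STEPS = [(94, 30), (55, 60), (72, 120)]  # assumed to be repeated 40 times
N_CYCLES = 40
FINAL_STEPS = [(98, 600)]  # the 4 deg C hold is open-ended and not counted


def programmed_hold_seconds(initial, cycled, n_cycles, final):
    """Sum of programmed hold times, ignoring ramping between temperatures."""
    fixed = sum(sec for _, sec in initial) + sum(sec for _, sec in final)
    cycling = n_cycles * sum(sec for _, sec in cycled)
    return fixed + cycling


total = programmed_hold_seconds(INITIAL_STEPS, CYCLED_STEPS, N_CYCLES, FINAL_STEPS)
print(total / 60)  # 160.0 minutes of programmed hold time
```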
Primers and probes for ddPCR assay were as follows:
RT-qPCR
For gene expression analysis, total RNA was extracted using the RNeasy Plus Micro Kit (QIAGEN) and DNase treatment was performed using the RNase-free DNase Set (QIAGEN) according to the manufacturer's instructions. cDNA was synthesized using the High-Capacity cDNA Reverse Transcription Kit (Applied Biosystems). qPCR was then performed on the cDNA using Power SYBR Green PCR Master Mix (Applied Biosystems) in a ViiA 7 real-time PCR thermocycler. Data were analyzed using ViiA 7 real-time PCR software (Applied Biosystems). The relative expression of each target gene was expressed as fold change relative to the β-actin normalizer (2^-ΔCt).
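The 2^-ΔCt calculation referred to above may be sketched as follows, where ΔCt is the Ct of the target gene minus the Ct of the normalizer (β-actin in this protocol). The function name and the example Ct values are hypothetical.

```python
def relative_expression(ct_target, ct_normalizer):
    """Relative expression as 2^-(Ct_target - Ct_normalizer)."""
    delta_ct = ct_target - ct_normalizer
    return 2.0 ** (-delta_ct)


# Example with assumed Ct values: a target detected 5 cycles later than the
# normalizer is expressed at 2^-5 of the normalizer level.
print(relative_expression(25.0, 20.0))  # 0.03125
```

Equal Ct values give a relative expression of 1, and each additional cycle of delay in the target halves the value.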
Example 3
Results and discussion
Two further donor constructs were designed and generated:
i) an SA_coRAG1 CDS_BGHpA donor carrying the Bovine Growth Hormone (BGH) polyA downstream of the SA_coRAG1 CDS to allow transcription termination of the corrective RAG1 CDS (fig. 14A);
ii) an SA_coRAG1 CDS_SD donor containing a splice donor (SD) sequence to obtain a fusion transcript comprising the corrective codon-optimized sequence and endogenous RAG1, followed by the 3' UTR sequence (fig. 14B).
To test these two corrective donors, NALM6.Rag1KO cells were transfected with guide 9 and Cas9 as RNP (50 pmol) and transduced with the SA_coRAG1 CDS_BGHpA or SA_coRAG1 CDS_SD AAV6 donor at two doses (10^4 and 5x10^4) (fig. 15A). As expected, we obtained a low proportion of edited alleles in bulk edited NALM6.Rag1KO cells, because NALM6 cells are poorly permissive to HDR-mediated editing. To evaluate gene editing efficiency in terms of RAG1 expression and recombination activity, bulk edited NALM6.Rag1KO cells were subcloned to isolate single colonies carrying mono-allelic or bi-allelic editing (fig. 15A). We screened 429 clones by ddPCR and identified 5 mono-allelic clones edited with SA_coRAG1 CDS_BGHpA and 11 mono-allelic clones edited with SA_coRAG1 CDS_SD.
To compare the correction efficiency of the two donors in selected edited clones, we analyzed RAG1 CDS expression by RT-qPCR and evaluated recombination activity by transducing cells with an LV carrying an inverted GFP cassette, which is recombined in the presence of functional RAG1 protein (Liang HE, et al. Immunity. 2002;17:639-651; Bredemeyer AL, et al. Nature. 2006;442(7101):466-470; De Ravin SS, et al. Blood. 2010;116:1263-1271; Lee YN, et al. J Allergy Clin Immunol. 2014;133(4):1099-10).
We observed an increase in RAG1 CDS expression (fig. 15B) and recombination activity (fig. 15C) in most clones edited with the SA_coRAG1 CDS_BGHpA or SA_coRAG1 CDS_SD AAV6 donor.
To compare the effect of the two donors on hematopoietic stem and progenitor cells (HSPCs), we edited HSPCs derived from HD mobilized peripheral blood with guide 9 and Cas9 as RNP (50 pmol) in the presence of the combination of editing enhancers (GSE56 and Ad5-E4orf6/7), and then transduced them with three different doses of the SA_coRAG1 CDS_BGHpA or SA_coRAG1 CDS_SD AAV6 donor.
We observed comparable editing efficiency between HSPCs edited with the SA_coRAG1 CDS_BGHpA and the SA_coRAG1 CDS_SD AAV6 donor, which rose with increasing dose (fig. 16A), as also confirmed by analysis of editing efficiency in sorted HSPCs (fig. 16B). Apart from the known effects of gene editing on cell growth (fig. 16C) and colony formation potential (fig. 16D), HSPCs edited with the SA_coRAG1 CDS_BGHpA or SA_coRAG1 CDS_SD AAV6 donor showed similar i) growth kinetics (fig. 16C), ii) erythroid and myeloid colony generation (fig. 16D), and iii) preservation of the cell subset composition of the most primitive CD34+CD133+CD90+ cells (fig. 16E), compared to untreated cells.
To further compare the two AAV6 donor constructs, we used the Artificial Thymic Organoid (ATO) platform to differentiate edited HSPCs toward the T cell lineage by applying the previously described protocol (fig. 12). HSPCs edited with the two donors differentiated similarly into early and late T cell subsets (fig. 16F), with comparable levels of editing efficiency in sorted double-negative CD4-CD8- cells and double-positive CD4+CD8+ cells (fig. 16G).
Taken together, these data demonstrate that both corrective donors can achieve efficient targeting while preserving the most primitive CD34+CD133+CD90+ cell subpopulation.
All publications mentioned in the above specification are herein incorporated by reference. Various modifications and variations of the disclosed polynucleotides, vectors, RNAs, methods, cells, kits, compositions, systems and uses of the invention will be apparent to those of skill in the art without departing from the scope and spirit of the invention. Although the invention has been disclosed in connection with specific preferred embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the disclosed modes for carrying out the invention which are obvious to those skilled in the art are intended to be within the scope of the following claims.

Claims (45)

1. An isolated polynucleotide comprising, from 5 'to 3': a first homologous region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, and a second homologous region.
2. The isolated polynucleotide of claim 1, wherein:
(i) The first homologous region is homologous to a first region of RAG1 intron 1 and the second homologous region is homologous to a second region of RAG1 intron 1; or alternatively
(ii) The first homologous region is homologous to a first region of either RAG1 intron 1 or RAG1 exon 2, and the second homologous region is homologous to a second region of RAG1 exon 2.
3. The isolated polynucleotide of claim 1 or claim 2, wherein the first homologous region is homologous to a first region of RAG1 intron 1 and the second homologous region is homologous to a second region of RAG1 intron 1.
4. The isolated polynucleotide of any of the preceding claims, wherein:
(i) The first homologous region is homologous to a region upstream of chr 11:36569295 and the second homologous region is homologous to a region downstream of chr 11:36569298;
(ii) The first homologous region is homologous to a region upstream of chr 11:36573790 and the second homologous region is homologous to a region downstream of chr 11:36573793;
(iii) The first homologous region is homologous to a region upstream of chr 11:36573641 and the second homologous region is homologous to a region downstream of chr 11:36573644;
(iv) The first homologous region is homologous to a region upstream of chr 11:36573351 and the second homologous region is homologous to a region downstream of chr 11:36573354;
(v) The first homologous region is homologous to a region upstream of chr 11:36569080 and the second homologous region is homologous to a region downstream of chr 11:36569083;
(vi) The first homologous region is homologous to a region upstream of chr 11:36572472 and the second homologous region is homologous to a region downstream of chr 11:36572475;
(vii) The first homologous region is homologous to a region upstream of chr 11:36571458 and the second homologous region is homologous to a region downstream of chr 11:36571461;
(viii) The first homologous region is homologous to a region upstream of chr 11:36571366 and the second homologous region is homologous to a region downstream of chr 11:36571369;
(ix) The first homologous region is homologous to a region upstream of chr 11:36572859 and the second homologous region is homologous to a region downstream of chr 11:36572862;
(x) The first homologous region is homologous to a region upstream of chr 11:36571457 and the second homologous region is homologous to a region downstream of chr 11:36571460;
(xi) The first homologous region is homologous to a region upstream of chr 11:36569351 and the second homologous region is homologous to a region downstream of chr 11:36569354; or alternatively
(xii) The first homologous region is homologous to a region upstream of chr 11:36572375 and the second homologous region is homologous to a region downstream of chr 11:36572378.
5. The isolated polynucleotide of any of the preceding claims, wherein:
(i) The first homologous region is homologous to a region upstream of chr 11:36569295 and the second homologous region is homologous to a region downstream of chr 11:36569298;
(ii) The first homologous region is homologous to a region upstream of chr 11:36573351 and the second homologous region is homologous to a region downstream of chr 11:36573354; or alternatively
(iii) The first homologous region is homologous to a region upstream of chr 11:36571366 and the second homologous region is homologous to a region downstream of chr 11:36571369;
preferably, wherein the first homologous region is homologous to a region upstream of chr 11:36569295 and the second homologous region is homologous to a region downstream of chr 11:36569298.
6. The isolated polynucleotide of any one of the preceding claims, wherein the first homologous region is homologous to a region comprising chr 11:36569245-chr 11:36569294 and/or the second homologous region is homologous to a region comprising chr 11:36569299-chr 11:36569348.
7. The isolated polynucleotide of any of the preceding claims, wherein the 3 'terminal sequence of the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity to SEQ ID No. 7 and/or the 5' terminal sequence of the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity to SEQ ID No. 19.
8. The isolated polynucleotide of any of the preceding claims, wherein the first homologous region comprises or consists of a nucleotide sequence having at least 70% identity to SEQ ID No. 31 or a fragment thereof and/or the second homologous region comprises or consists of a nucleotide sequence having at least 70% identity to SEQ ID No. 32 or a fragment thereof.
9. The isolated polynucleotide of any of the preceding claims, wherein the first and second homologous regions are each 50-1000bp in length, 100-500bp in length, or 200-400bp in length.
10. The isolated polynucleotide of any of the preceding claims, wherein the nucleotide sequence encoding a RAG1 polypeptide comprises or consists of a nucleotide sequence encoding an amino acid sequence having at least 70% identity to SEQ ID No. 4 or SEQ ID No. 5.
11. The isolated polynucleotide of any of the preceding claims, wherein the nucleotide sequence encoding a RAG1 polypeptide comprises or consists of a nucleotide sequence having at least 70% identity to SEQ ID No. 6.
12. The isolated polynucleotide of any of the preceding claims, wherein the splice acceptor sequence comprises or consists of a nucleotide sequence having at least 70% identity to SEQ ID No. 33.
13. The isolated polynucleotide of any one of the preceding claims, wherein the nucleotide sequence encoding a RAG1 polypeptide is operably linked to a polyadenylation sequence, optionally wherein the polyadenylation sequence is a bGH polyadenylation sequence.
14. The isolated polynucleotide of any of the preceding claims, wherein the nucleotide sequence encoding a RAG1 polypeptide is operably linked to a polyadenylation sequence comprising or consisting of a nucleotide sequence having at least 70% identity to SEQ ID No. 35.
15. The isolated polynucleotide of any of the preceding claims, wherein the nucleotide sequence encoding a RAG1 polypeptide is operably linked to a Kozak sequence, optionally wherein the Kozak sequence comprises or consists of a nucleotide sequence having at least 70% identity to SEQ ID No. 36.
16. The isolated polynucleotide of any of the preceding claims, wherein the polynucleotide comprises or consists of a nucleotide sequence having at least 70% identity to SEQ ID No. 39.
17. A vector comprising the polynucleotide of any one of the preceding claims.
18. The vector of claim 17, wherein the vector is a viral vector, optionally an adeno-associated virus (AAV) vector, such as an AAV6 vector.
19. A guide RNA comprising or consisting of a nucleotide sequence having at least 90% identity to any one of SEQ ID NOs 41-52 or 53-55, optionally wherein the guide RNA comprises or consists of a nucleotide sequence having at least 90% identity to SEQ ID NO 41 or 53 (preferably SEQ ID NO: 41).
20. The guide RNA of claim 19, wherein 1 to 5 of the terminal nucleotides of the 5 'and/or 3' end of the guide RNA are chemically modified to enhance stability, optionally wherein the 3 terminal nucleotides of the 5 'and/or 3' end of the guide RNA are chemically modified to enhance stability, optionally wherein the chemical modification is a modification with 2 '-O-methyl 3' phosphorothioate.
21. A kit, composition or gene editing system comprising a polynucleotide according to any one of claims 1 to 16 or a vector according to any one of claims 17 or 18.
22. The kit, composition, gene editing system of claim 21, wherein the kit, composition, or gene editing system further comprises the guide RNA of claim 19 or claim 20.
23. The kit, composition or gene editing system of claim 21 or claim 22, wherein the kit, composition or gene editing system further comprises an RNA-guided nuclease, optionally wherein the RNA-guided nuclease is a Cas9 endonuclease.
24. Use of an isolated polynucleotide according to any one of claims 1 to 16, a vector according to any one of claims 17 or 18, a guide RNA according to any one of claims 19 or 20, or a kit, composition or gene editing system according to any one of claims 21 to 23 for gene editing of a cell or population of cells.
25. An isolated genome comprising the polynucleotide of any one of claims 1 to 16.
26. An isolated cell comprising the polynucleotide of any one of claims 1 to 16 or the genome of claim 25.
27. The isolated cell of claim 26, wherein the cell is a Hematopoietic Stem Cell (HSC), hematopoietic Progenitor Cell (HPC), or Lymphoid Progenitor Cell (LPC).
28. The isolated cell of claim 26 or claim 27, wherein the cell is a CD34+ cell.
29. A population of cells comprising one or more isolated cells according to any one of claims 26 to 28.
30. The population of claim 29, wherein at least 50% of the population of cells are CD34+ cells.
31. The population of cells according to claim 29 or claim 30, wherein at least 20% of the population of cells are CD34+ cells comprising the genome of claim 25.
32. A method of gene editing a population of cells, comprising:
(a) Providing a population of cells; and
(b) Delivering an RNA-guided nuclease, a guide RNA according to claim 19 or claim 20 and a vector according to claim 17 or claim 18 to the cell population to obtain a gene-edited cell population.
33. A method of treating RAG-deficient immunodeficiency in a subject comprising:
(a) Providing a population of cells;
(b) Delivering an RNA-guided nuclease, a guide RNA according to claim 19 or claim 20 and a vector according to claim 17 or claim 18 to the cell population to obtain a gene-edited cell population; and
(c) Administering the population of gene-edited cells to the subject.
34. The method of claim 32 or claim 33, wherein the population of cells comprises or consists of HSCs, HPCs, and/or LPCs and/or wherein the population of cells comprises or consists of CD34+ cells.
35. The method of any one of claims 32 to 34, wherein the population of cells is pre-activated, optionally wherein the population of cells is cultured with one or more cytokines selected from the group consisting of: one or more early acting cytokines such as TPO, IL-6, IL-3, SCF, FLT3-L; one or more transduction enhancers, such as PGE2; and one or more amplification enhancers, such as UM171, UM729, SR1.
36. The method of any one of claims 32 to 35, wherein the RNA-guided nuclease and/or guide RNA is delivered prior to and/or simultaneously with the vector.
37. The method of any one of claims 32 to 36, wherein the RNA-guided nuclease is Cas9, optionally wherein the Cas9 and the guide RNA are pre-assembled for delivery as a Cas9 RNP.
38. The method of any one of claims 32 to 37, wherein the method further comprises delivering a p53 inhibitor and/or an HDR enhancer, optionally wherein the p53 inhibitor and/or HDR enhancer is delivered simultaneously with the RNA-guided nuclease and/or guide RNA.
39. The method according to any one of claims 32 to 38, wherein the population of gene-edited cells is defined according to any one of claims 29 to 31.
40. A population of genetically edited cells obtainable by the method of any one of claims 32 to 39.
41. A method of treating RAG-deficient immunodeficiency comprising administering to a subject in need thereof an isolated cell according to any one of claims 26 to 28, a population of cells according to any one of claims 29 to 31, or a population of gene-edited cells according to claim 40.
42. An isolated cell according to any one of claims 26 to 28, a population of cells according to any one of claims 29 to 31 or a population of gene-edited cells according to claim 40 for use in treating RAG-deficient immunodeficiency in a subject.
43. The method according to claim 41, or the isolated cell, cell population or gene-edited cell population for use according to claim 42, wherein said RAG-deficient immunodeficiency is T-B- Severe Combined Immunodeficiency (SCID), Omenn syndrome, atypical SCID or combined immunodeficiency with granuloma/autoimmunity (CID-G/AI).
44. The method of claim 41 or claim 43, or the isolated cell, cell population or gene-edited cell population for use of claim 42 or claim 43, wherein the subject has a RAG1 deficiency.
45. The method of any one of claims 41, 43 or 44, or the isolated cell, cell population or gene-edited cell population for use according to any one of claims 42 to 44, wherein the subject has a mutation in the RAG1 gene, optionally in RAG1 exon 2.
CN202180082483.2A 2020-10-12 2021-10-12 Alternatives to RAG1 for use in therapy Pending CN116635523A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
GBGB2016139.4A GB202016139D0 (en) 2020-10-12 2020-10-12 Polynucleotide
GB2016139.4 2020-10-12
AU2021202657 2021-04-28
PCT/EP2021/078222 WO2022079054A1 (en) 2020-10-12 2021-10-12 Replacement of rag1 for use in therapy

Publications (1)

Publication Number Publication Date
CN116635523A true CN116635523A (en) 2023-08-22

Family

ID=73460658

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202180082483.2A Pending CN116635523A (en) 2020-10-12 2021-10-12 Alternatives to RAG1 for use in therapy

Country Status (3)

Country Link
CN (1) CN116635523A (en)
CA (1) CA3195268A1 (en)
GB (1) GB202016139D0 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113840918A (en) * 2019-03-11 2021-12-24 莱顿大学医学中心附属莱顿教学医院 Optimized RAG1 deficient gene therapy

Also Published As

Publication number Publication date
GB202016139D0 (en) 2020-11-25
CA3195268A1 (en) 2022-04-21

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination