WO2020150499A1 - Methods to tag and isolate cells infected with the human immunodeficiency virus - Google Patents

Methods to tag and isolate cells infected with the human immunodeficiency virus

Info

Publication number
WO2020150499A1
WO2020150499A1 PCT/US2020/013919 US2020013919W WO2020150499A1 WO 2020150499 A1 WO2020150499 A1 WO 2020150499A1 US 2020013919 W US2020013919 W US 2020013919W WO 2020150499 A1 WO2020150499 A1 WO 2020150499A1
Authority
WO
WIPO (PCT)
Prior art keywords
cells
hiv
sequence
cell
genome
Application number
PCT/US2020/013919
Inventor
Keith JEROME
Daniel Stone
Original Assignee
Fred Hutchinson Cancer Research Center
Application filed by Fred Hutchinson Cancer Research Center
Publication of WO2020150499A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034 Isolating an individual clone by screening libraries
    • C12N15/1065 Preparation or screening of tagged libraries, e.g. tagged microorganisms by STM-mutagenesis, tagged polynucleotides, gene tags
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00 Medicinal preparations containing antigens or antibodies
    • A61K39/46 Cellular immunotherapy
    • A61K39/461 Cellular immunotherapy characterised by the cell type used
    • A61K39/4611 T-cells, e.g. tumor infiltrating lymphocytes [TIL], lymphokine-activated killer cells [LAK] or regulatory T cells [Treg]
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00 Medicinal preparations containing antigens or antibodies
    • A61K39/46 Cellular immunotherapy
    • A61K39/464 Cellular immunotherapy characterised by the antigen targeted or presented
    • A61K39/464838 Viral antigens
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P31/00 Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
    • A61P31/12 Antivirals
    • A61P31/14 Antivirals for RNA viruses
    • A61P31/18 Antivirals for RNA viruses for HIV
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034 Isolating an individual clone by screening libraries
    • C12N15/1086 Preparation or screening of expression libraries, e.g. reporter assays
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C12N15/86 Viral vectors
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N5/00 Undifferentiated human, animal or plant cells, e.g. cell lines; Tissues; Cultivation or maintenance thereof; Culture media therefor
    • C12N5/06 Animal cells or tissues; Human cells or tissues
    • C12N5/0602 Vertebrate cells
    • C12N5/0634 Cells from the blood or the immune system
    • C12N5/0636 T lymphocytes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2510/00 Genetically modified cells
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2740/00 Reverse transcribing RNA viruses
    • C12N2740/00011 Details
    • C12N2740/10011 Retroviridae
    • C12N2740/16011 Human Immunodeficiency Virus, HIV
    • C12N2740/16022 New viral proteins or individual genes, new structural or functional aspects of known viral proteins or genes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2740/00 Reverse transcribing RNA viruses
    • C12N2740/00011 Details
    • C12N2740/10011 Retroviridae
    • C12N2740/16011 Human Immunodeficiency Virus, HIV
    • C12N2740/16111 Human Immunodeficiency Virus, HIV concerning HIV env
    • C12N2740/16122 New viral proteins or individual genes, new structural or functional aspects of known viral proteins or genes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2750/00 ssDNA viruses
    • C12N2750/00011 Details
    • C12N2750/14011 Parvoviridae
    • C12N2750/14111 Dependovirus, e.g. adenoassociated viruses
    • C12N2750/14141 Use of virus, viral particle or viral elements as a vector
    • C12N2750/14143 Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector
    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00 Libraries per se, e.g. arrays, mixtures
    • C40B40/02 Libraries contained in or displayed by microorganisms, e.g. bacteria or animal cells; Libraries contained in or displayed by vectors, e.g. plasmids; Libraries containing only microorganisms or vectors
    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00 Libraries per se, e.g. arrays, mixtures
    • C40B40/04 Libraries containing only organic compounds
    • C40B40/06 Libraries containing nucleotides or polynucleotides, or derivatives thereof

Definitions

  • the current disclosure provides methods to tag and isolate cells infected with the human immunodeficiency virus (HIV) from a human patient.
  • the methods do not affect the viability of the tagged and isolated cells, so that such cells can be expanded and maintained to study the characteristics of cells that harbor virus. Information derived from the studies can be used to further the development of clinical treatments to eradicate HIV infection in individuals.
  • HIV: human immunodeficiency virus
  • AIDS: acquired immunodeficiency syndrome
  • the clinical course of HIV infection can vary according to a number of factors, including the subject's genetic background, age, general health, nutrition, treatment received, and the HIV subtype. In general, most individuals develop flu-like symptoms within a few weeks or months of infection. The symptoms can include fever, headache, muscle aches, rash, chills, sore throat, mouth or genital ulcers, swollen lymph glands, joint pain, night sweats, and diarrhea. The intensity of the symptoms can vary from mild to severe depending upon the individual. During the acute phase, HIV viral particles are attracted to and enter cells expressing the appropriate CD4 receptor molecules, such as CD4-expressing T cells of the immune system.
  • Once the virus has entered a CD4-expressing T cell, HIV-encoded reverse transcriptase generates a proviral DNA copy of the HIV RNA, and the proviral DNA becomes integrated into the genomic DNA of the CD4-expressing T cell. It is this HIV provirus that is replicated by the CD4-expressing T cell. When replicated, new HIV virions are produced, which can then leave the originally infected T cell and proceed to infect additional CD4-expressing T cells. Without treatment, this process kills the originally infected T cell, leading to depletion of T cells in infected patients.
  • the acute phase of HIV infection subsides and is followed by a latent period.
  • the subject's CD4 cell numbers rebound, although not to pre-infection levels.
  • Most patients also begin to show detectable levels of anti-HIV antibody in their blood.
  • ART: anti-retroviral therapies
  • the latent period may extend for several decades or more, and most patients on ART do not have detectable HIV in their blood.
  • ART does not cure HIV infection, and once therapy is stopped, HIV in blood rebounds to its pre-treatment level.
  • the ability to stop treatment without viral rebound would be beneficial because long-term treatment with ART is associated with other serious health considerations such as bone or renal toxicity, insulin resistance, and accelerated cardiovascular disease.
  • stopping treatment is discouraged due to the likelihood of viral rebound from the latent HIV reservoir.
  • the latent HIV reservoir refers to cells infected with the HIV provirus when the provirus is not being actively replicated to create new virions.
  • one in ten thousand resting T cells contains HIV DNA, and one per million resting T cells contains provirus that can be reactivated to produce infectious virus.
  • values can vary over 2 logs, but the HIV provirus is found in 0.01-0.1% of peripheral blood mononuclear cells (PBMCs), and 0.003% of PBMCs contain intact provirus. Based on these numbers, each vial of 5 million PBMCs from ART-treated subjects should contain 500-5,000 HIV+ cells, and 150 cells with intact provirus capable of reactivation.
  • PBMCs: peripheral blood mononuclear cells
  • a cure for HIV is likely to require a thorough understanding of the latent HIV reservoir and the mechanisms by which it is maintained and/or reactivated.
  • Most current methods to quantitatively measure the replication-competent latent HIV reservoir are difficult to perform, time-consuming, and expensive.
  • the difficulty associated with identifying these rare cells within the background of uninfected cells has made it difficult to fully define the biology of the HIV reservoir, including the mechanisms by which HIV latency is maintained, and the processes leading to reactivation and production of infectious HIV.
  • the current disclosure provides use of gene editing systems to efficiently tag and isolate cells of the latent HIV reservoir from a patient. Isolating these cells allows the study of latently infected cells to determine factors associated with HIV reservoir maintenance and reactivation, and conditions allowing potential eradication.
  • the systems and methods disclosed herein can be used to link latent viral infection with integration sites in the genome and/or with particular T cell receptors of a given latently infected cell. This type of information can help to elucidate the types of cells and/or integration sites that allow a latent virus to be replication competent. Factors that result in viral reactivation can also be assessed.
  • the current disclosure provides systems and methods to generate libraries of cells infected with HIV virus that allow the study of the latent HIV reservoir to further the development of clinical treatments.
  • the current disclosure provides tagging latently infected cells with genetic constructs that allow sorting and collection of the tagged cells. Genetic tagging of T cells infected with latent HIV provirus that is capable of reactivation is not easily achieved due to the sequence variability between and among different strains of HIV.
  • the current disclosure overcomes this obstacle by utilizing homology-independent targeted integration (HITI) of genetic constructs. HITI can utilize small micro-regions of homology to a target site but does not require them.
  • HITI: homology-independent targeted integration
  • HITI can be accomplished without homology sequences or with conserved target sites for genetic tag insertion of less than 30 (e.g., 23) nucleotide base pairs (bp).
  • bp: nucleotide base pair
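  • As an illustration of what a short conserved target site means in practice, the following minimal sketch scans sequences from several strains for windows that are identical in every one of them. The function name `conserved_windows`, the toy sequences, and the assumption that the inputs are already aligned to the same coordinates (with `-` marking gaps) are hypothetical and are not taken from the disclosure; the default width of 23 simply follows the example length given above.

```python
from typing import List, Tuple


def conserved_windows(aligned_seqs: List[str], width: int = 23) -> List[Tuple[int, str]]:
    """Return (start, window) for every window identical across all sequences.

    Assumes the sequences are pre-aligned, so position i refers to the same
    column in each of them. Windows containing a gap character are skipped.
    """
    if not aligned_seqs:
        return []
    seqs = [s.upper() for s in aligned_seqs]
    usable_length = min(len(s) for s in seqs)
    reference = seqs[0]
    hits = []
    for start in range(usable_length - width + 1):
        window = reference[start:start + width]
        if "-" in window:
            continue  # a gapped window cannot serve as a target site
        if all(s[start:start + width] == window for s in seqs[1:]):
            hits.append((start, window))
    return hits


# Toy example with a short width so the result is easy to follow by eye.
toy_strains = ["ACGTACGTAA", "ACGTACGTCC", "TCGTACGTAA"]
print(conserved_windows(toy_strains, width=4))
```

  • With real data, the inputs would be aligned provirus sequences and the default 23-nucleotide width would be used; a window that survives this filter is only a candidate and would still need to be checked against the requirements of the chosen nuclease.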
  • the inserted genetic constructs were generated to be polyA signal-less. This approach was recognized as feasible because constructs that properly insert into the HIV provirus genome can be stably expressed using the HIV genome’s endogenous polyA signals. If the construct is not inserted within the HIV genome, however, the lack of a polyA signal results in unstable mRNA that will not subsequently be translated to any significant degree. This approach thus significantly reduces, if not eliminates, background noise in the systems and methods disclosed herein.
  • cells expressing the inserted polyA signal-less reporter constructs utilize endogenous polyA signals within the HIV genome to maintain reporter expression, and thus can be isolated, maintained, and expanded to study HIV, such as the mechanisms of HIV latency and reactivation.
  • FIG. 1 CRISPR-Assisted Provirus Tagging In Vitro (CAPTIV).
  • a DNA tag is inserted into the HIV provirus of infected cells following delivery of HIV-specific Cas9 RNPs and a HITI donor AAV substrate. Tag insertion into the HIV provirus enables isolation of HIV infected cells via cell sorting, and subsequent expansion.
  • FIGs. 2A-2D HITI-mediated provirus tagging in ACH-2 cells.
  • PCR primers and target sites are shown for MND (striped), SV40pA (spotted), and pol (labeled); 5Tia1-MND (SEQ ID NO: 1); 3Tia1-SV40pA (SEQ ID NO: 2); and HIV POL1 site (SEQ ID NO: 3); (2D) Expected 5’ HITI junction PCR products (asterisks) of 255 bp (forward orientation) and 318 bp (reverse orientation) are shown. Sanger chromatograms for sequenced PCR product showing fusion of the pol target site (upper case underlined) with Tia1-MND (lower case underlined) or Tia1-SV40pA (lower case underlined) target sites.
  • MND - MND promoter and pA - SV40 polyA signal are listed; 5’pol1-5’Tia1 (SEQ ID NO: 4) and 5’pol1-3’Tia1 (SEQ ID NO: 5).
  • FIGs. 3A, 3B PolyA signal-less construct validation.
  • (3A) insertion sites in the pol or nef genes of pNL4-3 for the polyA signal-less MND-GFP expression cassette are shown along with locations of known polyA signals (striped triangles) or predicted polyA signals (spotted triangles);
  • FIGs. 4A-4D PolyA signal-less provirus tagging in ACH-2 cells.
  • T7 endonuclease I cleavage of a PCR fragment yields fragments (asterisks) of 390 bp + 166 bp (envO) if the target site is disrupted (SEQ ID NO: 6);
  • (4D) junction PCR with primers between GFP and env (410 bp) shows that tag insertion occurred at the envO locus.
  • FIGs. 5A-5D Tagged ACH-2 cells are viable and functional. Following CAPTIV tagging at the envO target site, GFP+ ACH-2 cells were seen by flow cytometry after 6 days (5A) and were sorted after 8 and 24 days to enrich for GFP+ cells (5B, 5C). GFP+ cells proliferate in culture after 2 rounds of sorting (5D).
  • FIGs. 6A-6C HIV provirus and integration site sequencing in ACH-2 cells via probe capture analysis.
  • FIG. 7 CAPTIV sgRNA target sites in HIV env and nef. Alignment of a consensus ACH-2 HIV sequence with equivalent regions of NL4-3 (Genbank accession: AF324493.2) and HXB2 (Genbank accession: K03455.1). sgRNA target sites (boxes) and PCR primers used to amplify CAPTIV tagging junctions (black arrows) are also shown (SEQ ID NOs: 20-22). Reference sequences for ACH-2 (SEQ ID NO: 80), HXB2 (SEQ ID NO: 81), and NL4-3 (SEQ ID NO: 82) are also provided.
  • FIG. 8 Organization of the human CCR5 locus (Genbank Accession number AH005786.2, nucleotides 2077-8135) showing gRNA target sites CCR5-1 and CCR5-2 (Black boxes), plus the closest putative canonical AATAAA or ATTAAA polyA signals (Open arrows) on the sense or anti-sense strand for each CCR5 gRNA target site. Donor tag insertions and the distance to the nearest orientation-dependent 3’ canonical polyA signals are shown. Locations of primers used to amplify target site-specific PCR products for T7E1 assays are shown (Black arrows).
  • FIG. 9. Activated CD4+ T cells were electroporated with CCR5-1 or CCR5-2 targeting RNPs using the Neon electroporation system. At day 6 post-electroporation, genomic DNA was isolated from cells and a PCR product spanning each target site was amplified and used to determine the levels of gene editing that had occurred via the T7 endonuclease I (T7E1) assay. The T7E1 assay showed that gene editing had occurred at the CCR5-1 or CCR5-2 target sites in 22.9% and 34.1% of cells, respectively. Bands indicative of PCR product cleavage and gene editing are highlighted (asterisks).
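  • The disclosure does not state how the editing percentages were derived from the T7E1 gel. As a hedged sketch, the code below uses one widely used estimate, percent modification = 100 × (1 − √(1 − fraction cleaved)), where the fraction cleaved is taken from band intensities. The function names, the intensity values, and the choice of formula are assumptions for illustration. The second helper only shows how a single cut inside an amplicon gives two fragment sizes; the 556 bp amplicon length is inferred here from the 390 bp + 166 bp envO fragments listed for FIG. 4 and is not stated in the source.

```python
import math
from typing import Tuple


def t7e1_indel_percent(uncut: float, cleaved_a: float, cleaved_b: float) -> float:
    """Estimate percent gene modification from T7E1 band intensities.

    Assumed formula (not given in the source):
    100 * (1 - sqrt(1 - fraction_cleaved)).
    """
    total = uncut + cleaved_a + cleaved_b
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    fraction_cleaved = (cleaved_a + cleaved_b) / total
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))


def expected_fragments(amplicon_bp: int, cut_offset_bp: int) -> Tuple[int, int]:
    """Sizes of the two fragments produced by one cut inside an amplicon."""
    if not 0 < cut_offset_bp < amplicon_bp:
        raise ValueError("cut site must fall inside the amplicon")
    return cut_offset_bp, amplicon_bp - cut_offset_bp


# Hypothetical densitometry values, in arbitrary units.
print(round(t7e1_indel_percent(uncut=60.0, cleaved_a=25.0, cleaved_b=15.0), 1))
print(expected_fragments(556, 390))
```

  • The square-root term reflects the assumption that only heteroduplexes are cleaved, so the cleaved fraction understates the fraction of modified alleles; other quantification schemes exist and may give different numbers.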
  • FIGs. 10A-10C (10A) Diagram of Control AAV and Donor AAV. Open diamonds indicate flanking envO or nef1 gRNA target sites. (10B) Primary human CD4+ T cells were electroporated with Cas9 RNPs containing crRNAs specific for CCR5-1 in combination with a second crRNA that was specific for either the HIV-specific gRNA target sequence envO or the HIV-specific gRNA target sequence nef1. These cells were then infected with scAAV donor vectors scAAV6-envO-GFPApA or scAAV6-nef1-GFPApA at a multiplicity of infection of 350,000 vector genomes per cell.
  • FIGs. 11A-11C (11A) Diagram of Control AAV and Donor AAV. Diamonds indicate flanking envO or nef1 gRNA target sites. (11B) Primary human CD4+ T cells were electroporated with Cas9 RNPs containing crRNAs specific for CCR5-2 in combination with a second crRNA that was specific for either the HIV-specific gRNA target sequence envO or the HIV-specific gRNA target sequence nef1. These cells were then infected with scAAV donor vectors scAAV6-envO-GFPApA or scAAV6-nef1-GFPApA at a multiplicity of infection of 350,000 vector genomes per cell.
  • FIG. 12 Protein and nucleic acid sequences including or encoding: GFP (SEQ ID NO: 23); MND promoter (SEQ ID NO: 24); pCAPTIV-DrA-EnvO (SEQ ID NO: 25); pCAPTIV-DrA-Env1 (SEQ ID NO: 26); pCAPTIV-ApA-Nef1 (SEQ ID NO: 27); pCAPTIV-ApA-Nef2 (SEQ ID NO: 28); pNL4-3-Nef-MND-GFP (SEQ ID NO: 29); pNL4-3-Nef-MND-GFP-R (SEQ ID NO: 30); pNL4-3-Pol-MND-GFP (SEQ ID NO: 31); pNL4-3-Pol-MND-GFP-R (SEQ ID NO: 32); pTia1-MND-GFP (SEQ ID NO: 33); Lachnospiraceae bacterium ND2006 Reference Sequence
  • BV3L6 Reference Sequence (SEQ ID NO: 35); F2A (SEQ ID NO: 36); E2A (SEQ ID NO: 37); T2A (SEQ ID NOs: 38 or 39); and P2A (SEQ ID NOs: 40 or 41); Streptococcus pyogenes serotype M1 Cas9 protein (UniProt Accession Q99ZW2) (SEQ ID NO: 83); Streptococcus pyogenes serotype M1 Cas9 cds (nucleotides 854751-858857 of NCBI Reference Sequence: NC_002737.2) (SEQ ID NO: 84); Francisella tularensis type V CRISPR-associated protein Cpf1 (NCBI Reference Sequence: WP_003040289.1) (SEQ ID NO: 85); Acidaminococcus sp. BV3L6 type V CRISPR-associated protein Cpf1 (AsCpf1) (NCBI Reference Sequence: WP_021736722.1) (SEQ ID NO: 86); Lachnospiraceae bacterium MC2017 type V CRISPR-associated protein Cpf1 (NCBI Reference Sequence: WP_044910712.1) (SEQ ID NO: 87); Staphylococcus aureus Cas9 (saCas9) (GenBank Reference Sequence: AYD60511.1) (SEQ ID NO: 88); Reference sequence for particular variant saCas9 sequences described herein (SEQ ID NO: 89).
  • the need for a cure for HIV is widely recognized, and a number of potentially curative strategies are currently being investigated, including gene therapy, latency reversal, immunotherapy, and others.
  • In patients receiving ART, 0.01-0.1% of PBMCs contain the HIV provirus (Eriksson et al., PLoS Pathog 2013;9(2):e1003174; Besson et al., Clin Infect Dis. 2014;59(9):1312-21), and 0.003% contain intact provirus (Ho et al., Cell. 2013;155(3):540-51). Based on these numbers, in an ART-suppressed patient, one in ten thousand resting T cells contains HIV DNA, and one per million resting T cells contains provirus that can be reactivated and produce infectious virus. Accordingly, each vial of 5 million PBMCs obtained from an untreated subject should contain 5,000-50,000 HIV+ cells, and each vial from treated subjects should contain 500-5,000 HIV+ cells, and 150 cells with intact provirus.
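  • The per-vial estimates for ART-treated subjects follow directly from the quoted frequencies. The short sketch below reproduces that arithmetic; the helper name `expected_cells` is hypothetical, and the vial size and percentages are the ones quoted in the text.

```python
def expected_cells(total_cells: int, percent: float) -> int:
    """Expected number of cells at a frequency given in percent of the total."""
    return round(total_cells * percent / 100)


VIAL_SIZE = 5_000_000  # PBMCs per vial, as stated in the text

# Frequencies reported for ART-treated subjects, in percent of PBMCs.
provirus_low = expected_cells(VIAL_SIZE, 0.01)    # lower bound, any provirus
provirus_high = expected_cells(VIAL_SIZE, 0.1)    # upper bound, any provirus
intact = expected_cells(VIAL_SIZE, 0.003)         # intact provirus

print(provirus_low, provirus_high, intact)  # 500 5000 150
```

  • These are expected values only; because the frequencies are so low, the actual count in any single vial will scatter around them.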
  • a cure for HIV is likely to require a thorough understanding of the latent HIV reservoir and the mechanisms by which it is maintained. Most current methods to measure quantitatively the replication competent latent HIV reservoir are difficult to perform, time-consuming, and expensive.
  • the two most common assays are the quantitative viral outgrowth assay (QVOA) and Tat/Rev Induced Limiting Dilution Assay (TILDA). These two assays have been described in Finzi et al. (14 Nov. 1997) "Identification of a reservoir for HIV-1 in patients on highly active antiretroviral therapy," Science 278(5341):1295-1300; and Procopio et al. (27 Jun. 2015) "A Novel Assay to Measure the Magnitude of the Inducible Viral Reservoir in HIV-infected Individuals," EBioMedicine 2(8):874-83.
  • QVOA: quantitative viral outgrowth assay
  • TILDA: Tat/Rev Induced Limiting Dilution Assay
  • the current disclosure provides systems and methods to allow the isolation and study of viable cells latently infected with HIV provirus. This allows the in-depth genotypic characterization of these cells to a level not previously possible. Such studies will be of widespread utility for HIV research and the development of curative HIV strategies and will also allow addressing fundamental questions about the HIV reservoir. Since the methods disclosed herein allow in vitro proliferation of HIV-infected cells, they also greatly simplify sequencing and other genetic testing, which can be much easier with larger numbers of clonal cells.
  • methods disclosed herein provide ways to obtain this sequence data from 1000 or fewer HIV+ cells, including both full LTR sequences, which are often missed with PCR-based sequencing (Ho et al., Cell. 2013;155(3):540-51).
  • the methods disclosed herein utilize advances in targeted genetic engineering to tag cells infected with HIV that create the latent HIV reservoir.
  • Most genetic engineering approaches include a targeting element for precise genome targeting and a cutting element for cutting the targeted genetic site. If no further elements are provided, the DNA will repair itself through non-homologous end joining (NHEJ), which is error-prone. More particularly, NHEJ is performed on the two cleaved ends of DNA, which can result in an imperfect repair, such as base pairs being inserted or deleted, resulting in insertions and deletions (indels) at the break site.
  • NHEJ: non-homologous end joining
  • gene editing systems typically include a targeting element, a cutting element, and a genetic construct to be inserted which generally includes regions of homology and a functional sequence (e.g., a sequence to be expressed).
  • the targeting element targets the cutting element to a specific genomic site for cutting, and the regions of homology provide for homology directed repair (HDR), which is less error-prone than NHEJ.
  • HDR: homology directed repair
  • regions of homology facilitate HDR based on sequence complementarity with the edited area of the genome.
  • Although HDR is generally precise and efficient, it is cell cycle phase-dependent. Moreover, the high level of sequence heterogeneity between HIV strains, even within the same patient, complicates HDR-mediated gene insertion into the provirus, because even minor differences between the sequences of a targeting element or regions of homology and the sequence of the targeted area of the genome can greatly reduce HDR efficiency (Deyle et al., Nucleic Acids Res. 2014;42(5):3119-24). Thus, HDR is poorly suited to heterogeneous genetic loci like the HIV provirus.
  • homology-independent targeted integration requires target site cleavage.
  • it differs from HDR, however, in that a linear dsDNA donor (marker) is inserted through a homology-independent NHEJ pathway (Lackner et al., Nat Commun. 2015;6:10237; Suzuki et al., Nature. 2016;540(7631):144-9) or a microhomology-mediated end joining (MMEJ) pathway.
  • HITI does not require long and highly conserved regions of homology, and instead can successfully insert genetic constructs with minimal to no regions of homology.
  • HITI can occur with high efficiency, even in non-dividing cells. For at least these reasons that make HITI amenable to tagging of heterogeneous HIV proviruses, embodiments disclosed herein utilize HITI to tag cells of the latent HIV reservoir for isolation.
  • HITI utilizes a targeting element, a cutting element, and a genetic construct to be inserted into the cut genome.
  • the genetic construct to be inserted includes sequences that match or substantially match the targeting element sequences.
  • the genetic constructs to be inserted provide for small or micro-regions of homology to facilitate insertion of the genetic construct.
  • these small or micro-homology sequences are 75 bp or less.
  • the small or micro-homology sequences are 50 bp or less.
  • the small or micro-homology sequences are 40 bp or less.
  • the small or micro-homology sequences are 30 bp or less.
  • the small or micro-homology sequences are 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, or 2 bp.
  • FACS: fluorescence activated cell sorting
  • a polyA signal is a base sequence that leads to the addition of a polyA tail on transcribed mRNA.
  • PolyA signals generally include 6 base pairs, such as ATTAAA and AATAAA. The polyA signal determines where on the mRNA molecule a polyA tail is added.
  • PolyA tails refer to a sequence of adenines that are endogenously added to unprocessed mRNA along with a 5’ cap to stabilize the mRNA.
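  • Because a polyA signal-less construct depends on a polyA signal located downstream of the insertion point, it can be useful to ask how far away the nearest canonical hexamer lies on each strand. The sketch below does this for the two hexamers named above. The function names and the toy sequence are hypothetical, only the canonical AATAAA and ATTAAA motifs are searched, and finding a hexamer does not by itself show that it functions as a polyA signal.

```python
from typing import Dict, Optional, Tuple

POLYA_SIGNALS = ("AATAAA", "ATTAAA")  # canonical hexamers named in the text
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper- or lower-case DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def nearest_downstream_polya(seq: str, insert_pos: int) -> Optional[Tuple[int, str]]:
    """Return (distance, hexamer) for the closest signal at or after insert_pos."""
    seq = seq.upper()
    best = None
    for signal in POLYA_SIGNALS:
        idx = seq.find(signal, insert_pos)
        if idx != -1 and (best is None or idx - insert_pos < best[0]):
            best = (idx - insert_pos, signal)
    return best


def nearest_polya_both_strands(seq: str, insert_pos: int) -> Dict[str, Optional[Tuple[int, str]]]:
    """Search downstream of the insertion point on the sense and antisense strands."""
    antisense = reverse_complement(seq)
    antisense_pos = len(seq) - insert_pos  # same junction, read from the other strand
    return {
        "sense": nearest_downstream_polya(seq, insert_pos),
        "antisense": nearest_downstream_polya(antisense, antisense_pos),
    }


# Toy sequence: the only hexamer lies on the antisense strand.
print(nearest_polya_both_strands("TTTATTGGGGCCCC", 10))
```

  • Reporting both strands matters because a donor can insert in either orientation, so the relevant downstream signal depends on which way the tag is read.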
  • expansion easily provides more than the 1000 cells required for HIV probe capture; typically 50% of T cell clones may expand to >2 million cells by day 28 of the protocol described in Riddell & Greenberg, J Immunol Methods. 1990;128(2):189-201.
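  • To put the expansion figure in perspective, the sketch below converts it into a number of population doublings and a mean doubling time. It assumes, purely for illustration, that each clone starts from a single cell and grows exponentially without lag or cell death over the full 28 days; neither assumption comes from the disclosure, and the function names are hypothetical.

```python
import math


def doublings_required(start_cells: float, end_cells: float) -> float:
    """Number of population doublings needed to go from start to end."""
    if start_cells <= 0 or end_cells <= 0:
        raise ValueError("cell counts must be positive")
    return math.log2(end_cells / start_cells)


def mean_doubling_time_hours(start_cells: float, end_cells: float, days: float) -> float:
    """Average doubling time if growth were purely exponential over the period."""
    return days * 24.0 / doublings_required(start_cells, end_cells)


n_doublings = doublings_required(1, 2_000_000)
hours = mean_doubling_time_hours(1, 2_000_000, 28)
print(f"{n_doublings:.1f} doublings, about {hours:.0f} h per doubling")
```

  • Under these assumptions, reaching 2 million cells takes about 21 doublings, or roughly 32 hours per doubling; the 1000 cells needed for probe capture correspond to only about 10 doublings.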
  • Cells can then be subjected to HIV hybridization probe capture deep sequencing which allows HIV sequence enrichment prior to Next-Generation Sequencing (NGS).
  • NGS: Next-Generation Sequencing
  • HIV integration sites, the complete provirus sequence, the T cell receptor of the cell, and other information can be identified. Identifying HIV integration sites in bulk pools of HIV+ cells can allow determination of the integration site preferences of virus that makes up the latent reservoir. Identifying complete provirus sequences can allow determination of sequence or strain preferences associated with the latent reservoir.
  • Identifying T cell receptors of latently infected cells can determine if particular TCRs are related to maintenance of the latent reservoir.
  • a combination of HIV probe capture/Illumina sequencing and TCR-specific PCR can be used to determine complete provirus genome sequences, paired integration sites, and TCR sequences from individual HIV+ CD4+ T cells. Tens to hundreds of paired sequences can be obtained from isolated HIV+ CD4+ T cell clones, which will provide important insights into the role of antigen specificity in the maintenance of the HIV reservoir.
  • a goal is to tag at least 10% of HIV proviruses containing intact target sites within participant samples, which should provide >1000 clonally expanded or bulk purified HIV+ cells per sample, enabling significant and informative genetic analyses.
  • T cells are cells of the immune system that develop in the thymus. There are different types of T-cells, each type having a distinct function. The majority of T-cells have a T-cell receptor (TCR) existing as a complex of several proteins. The actual T-cell receptor is composed of two separate peptide chains, which are produced from the independent T-cell receptor alpha and beta (TCRα and TCRβ) genes and are called α- and β-TCR chains. γδ T-cells represent a small subset of T-cells that possess a distinct T-cell receptor (TCR) on their surface. In γδ T-cells, the TCR is made up of one γ-chain and one δ-chain. This group of T-cells is much less common (2% of total T-cells) than the αβ T-cells.
  • CD3 is expressed on all mature T cells, and activated T-cells express 4-1BB (CD137).
  • T-cells can further be classified into helper cells (CD4+ T-cells) and cytotoxic T-cells (CTLs, CD8+ T-cells), which include cytolytic T-cells.
  • CD4+ T helper cells assist other white blood cells in immunologic processes, including maturation of B cells into plasma cells and activation of cytotoxic T-cells and macrophages, among other functions. These cells are also known as CD4+ T-cells because they express the CD4 protein on their surface.
  • Helper T-cells become activated when they are presented with peptide antigens by MHC class II molecules that are expressed on the surface of antigen presenting cells (APCs). Once activated, they divide rapidly and secrete small proteins called cytokines that regulate or assist in the active immune response. CD4+ T cells can be infected by the HIV virus.
  • CD8+ cytotoxic T-cells destroy virally infected cells and tumor cells, and are also implicated in transplant rejection. These cells are also known as CD8+ T-cells because they express the CD8 glycoprotein at their surface. These cells recognize their targets by binding to antigen associated with MHC class I, which is present on the surface of nearly every cell of the body.
  • Central memory T-cells refer to T lymphocytes that have previously been exposed to an antigen and express CD62L or CCR7 and CD45RO on the surface and do not express or have decreased expression of CD45RA as compared to naive cells.
  • Effector memory T-cells refer to antigen-experienced T-cells that do not express or have decreased expression of CD62L on the surface thereof as compared to central memory cells and do not express or have decreased expression of CD45RA as compared to a naive cell.
  • effector memory cells are negative for expression of CD62L and CCR7, compared to naive cells or central memory cells, and have variable expression of CD28 and CD45RA.
  • Effector T-cells are positive for granzyme B and perforin as compared to memory or naive T-cells.
  • Regulatory T cells (TREG) are a subpopulation of T cells, which modulate the immune system, maintain tolerance to self-antigens, and abrogate autoimmune disease.
  • TREG express CD25, CTLA-4, GITR, GARP and LAP.
  • Naive T-cells refer to non-antigen-experienced T cells that express CD62L and CD45RA and do not express CD45RO as compared to central or effector memory cells.
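  • By way of a non-limiting illustration, the marker definitions above can be sketched as a simple rule table. The function name `classify_t_cell`, the dictionary input, and the reading of "CD62L or CCR7 and CD45RO" as "(CD62L or CCR7) and CD45RO" are assumptions made for this sketch. The sketch works on marker calls alone, so it does not separate effector memory cells from effector cells (which would need granzyme B and perforin calls).

```python
from typing import Dict


def classify_t_cell(markers: Dict[str, bool]) -> str:
    """Return a coarse subset label from positive/negative marker calls.

    Missing markers are treated as negative (an assumption of this sketch).
    """
    cd62l = markers.get("CD62L", False)
    ccr7 = markers.get("CCR7", False)
    cd45ro = markers.get("CD45RO", False)
    cd45ra = markers.get("CD45RA", False)

    # Naive: CD62L+ CD45RA+ CD45RO-
    if cd62l and cd45ra and not cd45ro:
        return "naive"
    # Central memory: (CD62L+ or CCR7+) CD45RO+ CD45RA-
    if (cd62l or ccr7) and cd45ro and not cd45ra:
        return "central memory"
    # Effector memory: CD62L- CCR7-, with variable CD45RA expression
    if not cd62l and not ccr7:
        return "effector memory"
    return "unclassified"


if __name__ == "__main__":
    print(classify_t_cell({"CD62L": True, "CD45RA": True}))
    print(classify_t_cell({"CCR7": True, "CD45RO": True}))
```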
  • a statement that a cell or population of cells is "positive" for or expressing a particular marker refers to the detectable presence on or in the cell of the particular marker.
  • the term can refer to the presence of surface expression as detected by flow cytometry, for example, by staining with an antibody that specifically binds to the marker and detecting said antibody, wherein the staining is detectable by flow cytometry at a level substantially above the staining detected carrying out the same procedure with an isotype-matched control under otherwise identical conditions and/or at a level substantially similar to that for a cell known to be positive for the marker, and/or at a level substantially higher than that for a cell known to be negative for the marker.
  • a statement that a cell or population of cells is "negative" for a particular marker or lacks expression of a marker refers to the absence of substantial detectable presence on or in the cell of a particular marker.
  • the term can refer to the absence of surface expression as detected by flow cytometry, for example, by staining with an antibody that specifically binds to the marker and detecting said antibody, wherein the staining is not detected by flow cytometry at a level substantially above the staining detected carrying out the same procedure with an isotype-matched control under otherwise identical conditions, and/or at a level substantially lower than that for a cell known to be positive for the marker, and/or at a level substantially similar as compared to that for a cell known to be negative for the marker.
  • cells are derived from T cell lines.
  • T cells are derived from humans.
  • T cells are derived or isolated from samples such as whole blood, peripheral blood mononuclear cells (PBMCs), leukocytes, bone marrow, thymus, tissue biopsy, tumor, leukemia, lymphoma, lymph node, gut associated lymphoid tissue, mucosa associated lymphoid tissue, spleen, other lymphoid tissues, liver, lung, stomach, intestine, colon, kidney, pancreas, breast, bone, prostate, cervix, testes, ovaries, tonsil, or other organ, and/or cells derived therefrom.
  • the T cells are derived or isolated from blood or a blood-derived sample, or are or are derived from an apheresis or leukapheresis product.
  • Primary T cells are those that are isolated from a living organism’s tissue or a sample thereof.
  • blood cells collected from a subject are washed, e.g., to remove the plasma fraction and to place the cells in an appropriate buffer or media for subsequent processing steps.
  • the cells are washed with phosphate buffered saline (PBS).
  • the wash solution lacks calcium and/or magnesium and/or many or all divalent cations. Washing can be accomplished using a semi-automated "flow through" centrifuge (for example, the Cobe 2991 cell processor, Baxter) according to the manufacturer's instructions. Tangential flow filtration (TFF) can also be performed.
  • cells can be re-suspended in a variety of biocompatible buffers after washing, such as Ca++/Mg++ free PBS.
  • a sample can be enriched for T cells by using density-based cell separation methods and related methods.
  • white blood cells can be separated from other cell types in the peripheral blood by lysing red blood cells and centrifuging the sample through a Percoll or Ficoll gradient.
  • a bulk T cell population can be used that has not been enriched for a particular T cell type.
  • a selected T cell type can be enriched for and/or isolated based on cell-marker based positive and/or negative selection.
  • in positive selection, cells having bound cellular markers are retained for further use.
  • in negative selection, cells not bound by a capture agent, such as an antibody to a cellular marker, are retained for further use. In some examples, both fractions can be retained for a further use.
  • multiple rounds of separation steps are carried out, where the positively or negatively selected fraction from one step is subjected to another separation step, such as a subsequent positive or negative selection.
  • cell populations can be isolated and/or analyzed based on light scattering properties of the cells based on side scatter channel (SSC) brightness and forward scatter channel (FSC) brightness.
  • Side scatter refers to the amount of light scattered orthogonally (90° from the direction of the laser source), as measured by flow cytometry.
  • Forward scatter refers to the amount of light scattered generally less than 90° from the direction of the light source. Generally, as cell granularity increases, the side scatter increases and as cell diameter increases, the forward scatter increases.
  • Side scatter and forward scatter are measured as intensity of light.
  • the amount of side scatter can be differentiated with user-defined settings.
  • low (lo) side scatter refers to less than 50% intensity, less than 40% intensity, less than 30% intensity, or even less intensity, in the side scatter channel of the flow cytometer.
  • high (hi) side scatter cells are the reciprocal population of cells that are not low side scatter.
  • Forward scatter is defined in the same manner as side scatter but the light is collected in the forward scatter channel.
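  • By way of a non-limiting illustration, the user-defined low/high scatter split can be sketched as a threshold on channel intensity. The sketch assumes that "less than 50% intensity" means less than 50% of the channel's maximum reading; the function name `split_by_scatter` and its parameters are hypothetical and do not correspond to any cytometer software.

```python
from typing import List, Sequence, Tuple


def split_by_scatter(
    intensities: Sequence[float],
    channel_max: float,
    low_fraction: float = 0.5,
) -> Tuple[List[float], List[float]]:
    """Split events into (low, high) scatter populations.

    An event is "low" when its intensity is below low_fraction * channel_max;
    "high" is the reciprocal population (everything that is not low).
    """
    if channel_max <= 0:
        raise ValueError("channel_max must be positive")
    cutoff = low_fraction * channel_max
    low = [v for v in intensities if v < cutoff]
    high = [v for v in intensities if v >= cutoff]
    return low, high


if __name__ == "__main__":
    low, high = split_by_scatter([10, 49.9, 50, 90], channel_max=100)
    print(low, high)
```

  • The same function applies to the forward scatter channel; tightening `low_fraction` to 0.4 or 0.3 gives the stricter definitions of low scatter.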
  • particular embodiments include selection of cell populations based on precise combinations of cell surface markers (CD markers) and the associated light scattering properties of the cells.
  • an antibody or binding domain for a cellular marker is bound to a solid support or matrix, such as a magnetic bead or paramagnetic bead, to allow for separation of cells for positive and/or negative selection.
  • the cells and cell populations are separated or isolated using immunomagnetic (or affinity magnetic) separation techniques (reviewed in Methods in Molecular Medicine, vol. 58: Metastasis Research Protocols, Vol. 2: Cell Behavior In Vitro and In Vivo, p 17-25 Edited by: S. A. Brooks and U. Schumacher © Humana Press Inc., Totowa, NJ); see also US 4,452,773; US 4,795,698; US 5,200,084; and EP 452342.
  • affinity-based selection is via magnetic-activated cell sorting (MACS) (Miltenyi Biotec, Auburn, CA).
  • MACS systems are capable of high-purity selection of cells having magnetized particles attached thereto.
  • MACS operates in a mode wherein the non-target and target species are sequentially eluted after the application of the external magnetic field. That is, the cells attached to magnetized particles are held in place while the unattached species are eluted. Then, after this first elution step is completed, the species that were trapped in the magnetic field and were prevented from being eluted are freed in some manner such that they can be eluted and recovered.
  • the non-target cells are labelled and depleted from the heterogeneous population of cells.
  • a T cell population is collected and enriched (or depleted) via flow cytometry, in which cells stained for multiple cell surface markers are carried in a fluidic stream.
  • a cell population described herein is collected and enriched (or depleted) via preparative scale (FACS)-sorting.
  • a cell population described herein is collected and enriched (or depleted) by use of microelectromechanical systems (MEMS) chips in combination with a FACS-based detection system (see, e.g., WO 2010/033140; Cho et al. (2010) Lab Chip 10, 1567-1573; and Godin et al. (2008) J Biophoton. 1(5):355-376). In both cases, cells can be labeled with multiple markers, allowing for the isolation of well-defined T cell subsets at high purity.
  • a CD4+ selection step is used to separate CD4+ helper and CD8+ cytotoxic T cells.
  • Such CD4+ populations can be further sorted into sub-populations by positive or negative selection for markers expressed or expressed to a relatively higher degree on one or more naive or memory T cell subpopulations.
  • a CD4+ enriched cell population can further be sorted based on the expression of CCR7, CD45RO, and/or CD62L.
  • an enriched cell population will include, as a percentage of cell types, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, or at least 90%, of the targeted cell type.
  • an enriched cell population includes >70% or >90% of the target cell type.
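  • By way of a non-limiting illustration, the enrichment thresholds above reduce to a purity calculation over cell counts. The function names `purity_percent` and `meets_enrichment` are hypothetical, and the counts in the usage lines are made-up placeholders rather than measured data.

```python
def purity_percent(target_cells: int, total_cells: int) -> float:
    """Percentage of the population that is the targeted cell type."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    if not 0 <= target_cells <= total_cells:
        raise ValueError("target_cells must be between 0 and total_cells")
    return 100.0 * target_cells / total_cells


def meets_enrichment(target_cells: int, total_cells: int, threshold: float = 70.0) -> bool:
    """True when purity is at least the chosen threshold (e.g., 30 to 90%)."""
    return purity_percent(target_cells, total_cells) >= threshold


if __name__ == "__main__":
    # Placeholder counts for illustration only.
    print(purity_percent(720, 800))
    print(meets_enrichment(720, 800, threshold=90.0))
```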
  • any gene editing system capable of precise sequence targeting, cutting, and construct insertion can be used, so long as the system does not require regions of homology (also referred to as homology arms or homology-directed repair templates) of greater than 75 bp yet still results in the insertion of a genetic construct at a targeted site.
  • these systems typically include a targeting element for precise genome targeting, a cutting element for cutting at or near the targeted genetic site, and the genetic construct to be inserted during repair.
  • different gene editing systems can adopt different components and configurations while maintaining the ability to precisely target, cut, and modify selected genomic sites.
  • the CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)/Cas (CRISPR-associated protein) nuclease system is an engineered nuclease system used for genetic engineering that is based on a bacterial system. It is based in part on the adaptive immune response of many bacteria and archaea. When a virus or plasmid invades a bacterium, segments of the invader's DNA are converted into CRISPR RNAs (crRNA) by the bacteria’s "immune" response.
  • the crRNA then associates, through a region of partial complementarity, with another type of RNA called tracrRNA to guide a Cas nuclease to a region homologous to the crRNA in the target DNA called a "protospacer."
  • the Cas nuclease cleaves the DNA to generate blunt ends at the double-strand break at sites specified by a 20-nucleotide complementary strand sequence contained within the crRNA transcript.
  • the Cas nuclease requires both the crRNA and the tracrRNA for site-specific DNA recognition and cleavage.
  • gRNA Guide RNA
  • gRNA can also include additional components.
  • gRNA can include a targeting sequence (e.g., crRNA) and a component to link the targeting sequence to a cutting element.
  • This linking component can be tracrRNA.
  • gRNA including crRNA and tracrRNA can be expressed as a single molecule referred to as single gRNA (sgRNA).
  • gRNA can also be linked to a cutting element through other mechanisms such as through a nanoparticle or through expression or construction of a dual or multi-purpose molecule.
  • targeting elements can include one or more modifications (e.g., a base modification, a backbone modification), to provide the nucleic acid with a new or enhanced feature (e.g., improved stability).
  • Modified backbones may include those that retain a phosphorus atom in the backbone and those that do not have a phosphorus atom in the backbone.
  • Suitable modified backbones containing a phosphorus atom may include, for example, phosphorothioates, chiral phosphorothioates, phosphorodithioates, phosphotriesters, aminoalkylphosphotriesters, methyl and other alkyl phosphonates such as 3'- alkylene phosphonates, 5'-alkylene phosphonates, chiral phosphonates, phosphinates, phosphoramidates including 3'-amino phosphoramidate and aminoalkylphosphoramidates, phosphorodiamidates, thionophosphoramidates, thionoalkylphosphonates, thionoalkylphosphotriesters, selenophosphates, and boranophosphates having normal 3'-5' linkages, 2'-5' linked analogs, and those having inverted polarity wherein one or more internucleotide linkages is a 3' to 3', a 5' to 5' or a 2' to 2
  • Suitable targeting elements having inverted polarity can include a single 3' to 3' linkage at the 3'-most internucleotide linkage (i.e. a single inverted nucleoside residue in which the nucleobase is missing or has a hydroxyl group in place thereof).
  • Various salts (e.g., potassium chloride or sodium chloride), mixed salts, and free acid forms can also be included.
  • targeting elements can include a morpholino backbone structure.
  • the targeting elements can include a 6-membered morpholino ring in place of a ribose ring.
  • a phosphorodiamidate or other non-phosphodiester internucleoside linkage replaces a phosphodiester linkage.
  • targeting elements can include one or more substituted sugar moieties.
  • Suitable polynucleotides can include a sugar substituent group selected from: OH; F; O-, S-, or N-alkyl; O-, S-, or N-alkenyl; O-, S- or N-alkynyl; or O-alkyl-O-alkyl, wherein the alkyl, alkenyl and alkynyl may be substituted or unsubstituted C1 to C10 alkyl or C2 to C10 alkenyl and alkynyl.
  • n and m are from 1 to 10.
  • targeting elements and targeted areas of the genome are based on complementary base pairing between the targeting element and the targeted area.
  • targeting elements and targeted areas will have 100% sequence complementarity.
  • targeting elements and targeted areas will have at least 90%, 95%, 97%, 98%, or 99% sequence complementarity.
  • targeting elements and targeted areas will bind in vitro under stringent hybridization conditions. In vitro stringent hybridization conditions are described in section (x) of this disclosure.
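  • By way of a non-limiting illustration, percent sequence complementarity between a targeting element and a targeted area can be sketched as an ungapped, position-by-position count of Watson-Crick pairs. The function name `percent_complementarity`, the requirement of equal-length inputs written 5' to 3', and the exclusion of wobble pairs are assumptions of this sketch.

```python
# Watson-Crick partners; U is included so an RNA targeting element
# can be scored against a DNA target strand.
PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"), ("U", "A"), ("A", "U")}


def percent_complementarity(targeting: str, target_strand: str) -> float:
    """Percent of positions that base-pair when the two strands anneal.

    Both sequences are written 5'->3'. Strands pair antiparallel, so the
    first base of one faces the last base of the other.
    """
    targeting = targeting.upper()
    target_strand = target_strand.upper()
    if not targeting or len(targeting) != len(target_strand):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(
        (a, b) in PAIRS for a, b in zip(targeting, reversed(target_strand))
    )
    return 100.0 * paired / len(targeting)


if __name__ == "__main__":
    # One mismatch in a 20-nt duplex leaves 19 of 20 positions paired.
    print(percent_complementarity("A" * 20, "T" * 19 + "G"))
```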
  • Examples of cutting elements include nucleases.
  • CRISPR-Cas loci have more than 50 gene families and there are no strictly universal genes, indicating fast evolution and extreme diversity of loci architecture.
  • Exemplary Cas nucleases include Cas1, Cas1B, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9 (also known as Csn1 and Csx12), Cas10, Cpf1, C2c3, C2c2, C2c1, Csy1, Csy2, Csy3, Cse1, Cse2, Csc1, Csc2, Csa5, Csn2, Csm2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Cpf1, Csb1, Csb2, Csb3, Csx17, Csx14, Csx10, Csx16, CsaX, C
  • There are three main types of Cas nucleases (type I, type II, and type III), and 10 subtypes including 5 type I, 3 type II, and 2 type III proteins (see, e.g., Hochstrasser and Doudna, Trends Biochem Sci, 2015, 40(1):58-66).
  • Type II Cas nucleases include Cas1, Cas2, Csn2, and Cas9. These Cas nucleases are known to those skilled in the art.
  • the amino acid sequence of the Streptococcus pyogenes wild-type Cas9 polypeptide is set forth, e.g., in NCBI Ref. Seq. No. NP_269215.
  • the amino acid sequence of the Streptococcus thermophilus wild-type Cas9 polypeptide is set forth, e.g., in NCBI Ref. Seq. No. WP_011681470.
  • Cas9 refers to an RNA-guided double-stranded DNA-binding nuclease protein or nickase protein. Wild-type Cas9 nuclease has two functional domains, e.g., RuvC and HNH, that cut different DNA strands. Cas9 can induce double-strand breaks in genomic DNA (target DNA) when both functional domains are active.
  • the Cas9 enzyme includes one or more catalytic domains of a Cas9 protein derived from bacteria such as Corynebacter, Sutterella, Legionella, Treponema, Filifactor, Eubacterium, Streptococcus, Lactobacillus, Mycoplasma, Bacteroides, Flaviivola, Flavobacterium, Sphaerochaeta, Azospirillum, Gluconacetobacter, Neisseria, Roseburia, Parvibaculum, Staphylococcus, Nitratifractor, and Campylobacter.
  • the Cas9 is a fusion protein, e.g. the two catalytic domains are derived from different bacterial species.
  • the CRISPR/Cas system has been engineered such that, in certain cases, crRNA and tracrRNA can be combined into one molecule called a single gRNA (sgRNA).
  • the sgRNA guides Cas to target any desired sequence (see, e.g., Jinek et al. (2012) Science 337:816-821; Jinek et al. (2013) eLife 2:e00471; Segal (2013) eLife 2:e00563).
  • the CRISPR/Cas system can be engineered to create a double strand break at a desired target in a genome of a cell, and harness the cell's endogenous mechanisms to repair the induced break by HDR, HITI-associated MMEJ or NHEJ (as described herein), or complete NHEJ depending on whether a genetic construct is provided for insertion and the length of any provided homology regions (e.g., greater than or less than 75 bp).
  • Useful variants of the Cas9 nuclease include a single inactive catalytic domain, such as a RuvC⁻ or HNH⁻ enzyme or a nickase.
  • a Cas9 nickase has only one active functional domain and, in some embodiments, cuts only one strand of the target DNA, thereby creating a single strand break or nick.
  • the mutant Cas9 nuclease having at least a D10A mutation is a Cas9 nickase.
  • the mutant Cas9 nuclease having at least a H840A mutation is a Cas9 nickase.
  • a double-strand break is introduced using a Cas9 nickase if at least two DNA-targeting RNAs that target opposite DNA strands are used.
  • a double-nicked induced double-strand break is repaired by NHEJ, HDR or HITI. This gene editing strategy generally favors HDR and decreases the frequency of indel mutations at off-target DNA sites.
  • the Cas9 nuclease or nickase in some embodiments, is codon-optimized for the target cell or target organism.
  • Particular embodiments can utilize Staphylococcus aureus Cas9 (SaCas9).
  • Particular embodiments can utilize SaCas9 with mutations at one or more of the following positions: E782, N968, and/or R1015.
  • Particular embodiments can utilize SaCas9 with mutations at one or more of the following positions: E735, E782, K929, N968, A1021, K1044 and/or R1015.
  • the variant SaCas9 protein includes one or more of the following mutations: R1015Q, R1015H, E782K, N968K, E735K, K929R, A1021T, and/or K1044N.
  • the variant SaCas9 protein includes mutations at D10A, D556A, H557A, N580A, e.g., D10A/H557A and/or D10A/D556A/H557A/N580A.
  • the variant SaCas9 protein includes one or more mutations selected from E735, E782, K929, N968, R1015, A1021, and/or K1044.
  • the SaCas9 variants can include one of the following sets of mutations: E782K/N968K/R1015H (KKH variant); E782K/K929R/R1015H (KRH variant); or E782K/K929R/N968K/R1015H (KRKH variant).
  • An appropriate reference sequence is provided in FIG. 12. However, the skilled person will be able to determine appropriate corresponding residues in other Cas9 proteins.
  • A putative Class II, Type V CRISPR-Cas class exemplified by Cpf1 has been identified (Zetsche et al. (2015) Cell 163(3): 759-771).
  • the Cpf1 nuclease particularly can provide added flexibility in target site selection by means of a short, three base pair recognition sequence (TTN), known as the protospacer-adjacent motif or PAM.
  • Cpf1's cut site is at least 18 bp away from the PAM sequence.
  • staggered DSBs with sticky ends permit orientation-specific donor template insertion, which is advantageous in non-dividing cells.
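  • By way of a non-limiting illustration, candidate Cpf1 target sites can be sketched as a scan for the TTN recognition sequence followed by a protospacer. The function name `find_ttn_pam_sites`, the default 20-nt protospacer length, the placement of the PAM on the 5' side of the protospacer, and the single-strand scan are assumptions of this sketch rather than details stated above.

```python
from typing import List, Tuple


def find_ttn_pam_sites(seq: str, protospacer_len: int = 20) -> List[Tuple[int, str, str]]:
    """Return (pam_start, pam, protospacer) for each TTN PAM on the given strand.

    The protospacer is taken as the protospacer_len bases immediately 3' of
    the three-base PAM. Only the strand as written is scanned; scanning the
    reverse complement as well is left to the caller.
    """
    seq = seq.upper()
    hits = []
    last_start = len(seq) - 3 - protospacer_len
    for i in range(last_start + 1):
        # TTN: two thymines followed by any base.
        if seq[i] == "T" and seq[i + 1] == "T":
            pam = seq[i:i + 3]
            protospacer = seq[i + 3:i + 3 + protospacer_len]
            hits.append((i, pam, protospacer))
    return hits


if __name__ == "__main__":
    # Placeholder sequence for illustration only (not an HIV sequence).
    demo = "GG" + "TTA" + "ACGTACGTAC" + "GG"
    print(find_ttn_pam_sites(demo, protospacer_len=10))
```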
  • Particular embodiments can utilize engineered Cpf1s.
  • US 2018/0030425 describes engineered Cpf1 nucleases from Lachnospiraceae bacterium ND2006 and Acidaminococcus sp. BV3L6 with altered and improved target specificity.
  • Particular variants include Lachnospiraceae bacterium ND2006 of SEQ ID NO: 34, e.g., at least including amino acids 19-1246 of SEQ ID NO: 34, with mutations (i.e.
  • Cpf1 variants can also include Acidaminococcus sp.
  • engineered Cpf1 variants include eCpf1
  • Cpf1 variants include Cpf1 homologs and orthologs of the Cpf1 polypeptides disclosed in Zetsche et al. (2015) Cell 163: 759-771 as well as the Cpf1 polypeptides disclosed in U.S. 2016/0208243.
  • Other engineered Cpf1 variants are known to those of ordinary skill in the art and included within the scope of the current disclosure (see, e.g., WO/2017/184768).
  • CAPTIV CRISPR-assisted provirus tagging in vitro
  • CAPTIV can include any appropriate combination of CRISPR components described herein.
  • CAPTIV may utilize gRNA, sgRNA, Cas9 variants, Cpf1, variants of Cpf1, etc. in any appropriate configuration.
  • CRISPR-Cas systems and components thereof are described in, US8697359, US8771945, US8795965, US8865406, US8871445, US8889356, US8889418, US8895308, US8906616, US8932814, US8945839, US8993233 and US8999641 and applications related thereto; and WO2014/018423, WO2014/093595, WO2014/093622, WO2014/093635, WO2014/093655, WO2014/093661 , WO2014/093694, WO2014/093701 , WO2014/093709, WO2014/093712, WO2014/093718, WO2014/145599, WO2014/204723, WO2014/204724, WO2014/204725, WO2014/204726, WO2014/204727, WO2014/204728, WO2014/204729, WO2015/065964, WO2015/08
  • Zinc finger nucleases (ZFNs) are synthesized by fusing a zinc finger DNA-binding domain to a DNA cleavage domain; the resulting nuclease introduces double strand breaks (DSBs) at its target site.
  • the DNA-binding domain includes three to six zinc finger proteins which are similar to those found in transcription factors.
  • the DNA cleavage domain includes the catalytic domain of, for example, FokI endonuclease.
  • the FokI domain functions as a dimer requiring two constructs with unique DNA binding domains for sites on either side of the target site cleavage sequence.
  • the FokI cleavage domain cleaves within a five or six base pair spacer sequence separating the two inverted half-sites.
  • Transcription activator-like effector nucleases (TALENs), which fuse a transcription activator-like effector (TALE) DNA-binding domain to a DNA cleavage domain, are used to edit genes and genomes by inducing DSBs in the DNA, which induce repair mechanisms in cells.
  • two TALENs must bind and flank each side of the target DNA site for the DNA cleavage domain to dimerize and induce a DSB.
  • TALENs have been engineered to bind a target sequence of, for example, an endogenous genome, and cut DNA at the location of the target sequence.
  • the TALEs of TALENs are DNA binding proteins secreted by Xanthomonas bacteria.
  • the DNA binding domain of TALEs includes a highly conserved 33 or 34 amino acid repeat, with divergent residues at the 12th and 13th positions of each repeat. These two positions, referred to as the Repeat Variable Diresidue (RVD), show a strong correlation with specific nucleotide recognition. Accordingly, targeting specificity can be improved by changing the amino acids in the RVD and incorporating nonconventional RVD amino acids.
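  • By way of a non-limiting illustration, the correlation between RVDs and nucleotide recognition can be sketched as a lookup table. The four-entry mapping below (NI to A, HD to C, NG to T, NN to G) is the commonly cited canonical code from the general TALE literature and is not taken from this disclosure; NN is also reported to recognize A, which this sketch ignores. The function names and the placeholder repeat string are hypothetical.

```python
from typing import List

# Commonly cited canonical RVD code (an assumption of this sketch).
RVD_TO_BASE = {"NI": "A", "HD": "C", "NG": "T", "NN": "G"}


def extract_rvd(repeat: str) -> str:
    """Return the residues at positions 12 and 13 (1-based) of one repeat."""
    if len(repeat) < 13:
        raise ValueError("repeat must contain at least 13 residues")
    return repeat[11:13]


def rvds_to_target(rvds: List[str]) -> str:
    """Translate an ordered list of RVDs into the DNA sequence they would target."""
    return "".join(RVD_TO_BASE[rvd] for rvd in rvds)


if __name__ == "__main__":
    # Placeholder 34-residue repeat: X marks residues not modeled here.
    repeat = "X" * 11 + "HD" + "X" * 21
    print(extract_rvd(repeat))
    print(rvds_to_target(["NI", "HD", "NG", "NN"]))
```

  • An RVD absent from the table raises a `KeyError`, which is one way to flag nonconventional RVDs whose specificity has to be established separately.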
  • DNA cleavage domains that can be used in TALEN fusions are wild-type and variant FokI endonucleases.
  • For TALENs, see U.S. Patent Nos. 8,440,431; 8,440,432; 8,450,471; 8,586,363; and 8,697,853; as well as Joung and Sander, Nat Rev Mol Cell Biol, 2013, 14(1):49-55; Beurdeley et al., Nat Commun, 2013, 4:1762; Scharenberg et al., Curr Gene Ther, 2013, 13(4):291-303; Gaj et al., Nat Methods, 2012, 9(8):805-7; Miller, et al.
  • MegaTALs have a single chain rare-cleaving nuclease structure in which a TALE is fused with the DNA cleavage domain of a meganuclease.
  • Meganucleases, also known as homing endonucleases, are single peptide chains that have both DNA recognition and nuclease function in the same domain. In contrast to the TALEN, the megaTAL only requires the delivery of a single peptide chain for functional activity.
  • Exemplary meganucleases include I-SceI, I-SceII, I-SceIII, I-SceIV, I-SceV, I-SceVI, I-SceVII, I-CeuI, I-CeuAIIP, I-CreI, I-CrepsbIP, I-CrepsbIIP, I-CrepsbIIIP, I-CrepsbIVP, I-TliI, I-PpoI, PI-PspI, F-SceI, F-SceII, F-SuvI, F-TevI, F-TevII, I-AmaI, I-AniI, I-ChuI, I-CmoeI, I-CpaI, I-CpaII, I-CsmI, I-CvuI, I-CvuAIP, I-DdiI, I-DdiII, I-Dir
  • HIV and Targeted HIV Sites
  • Human immunodeficiency virus (HIV) is a member of the genus Lentivirus, which is part of the family Retroviridae. Two species of HIV infect humans: HIV-1 and HIV-2. HIV-1 is the more common and more pathogenic of the two, with more than 90% of HIV/AIDS cases resulting from infection with HIV-1.
  • HIV is categorized into multiple clades with a high degree of genetic divergence.
  • "HIV clade" or "HIV subtype" refers to related human immunodeficiency viruses classified according to their degree of genetic similarity.
  • HIV-1 groups include group M (major strains) and group O (outer strains).
  • Group N is a new HIV-1 isolate that has not been categorized in either group M or O.
  • the HIV-1 genome is 9.8 kb in length, including two viral long-terminal repeats located at both ends when integrated into the host genome.
  • the genome also includes genes that encode for the structural proteins Gag, Pol, and Env, regulatory proteins (Tat and Rev), and accessory proteins Vpu, Vpr, Vif, and Nef.
  • the HIV-1 transactivator of transcription (Tat) is a multifunctional protein that has been proposed to contribute to several pathological consequences of HIV-1 infection. Tat not only plays an important role in viral transcription and replication, it is also capable of inducing the expression of a variety of cellular genes as well as acting as a neurotoxic protein. Tat protein is secreted by HIV-1-infected cells and acts by diffusing through the cell membrane. It may act as a secreted, soluble neurotoxin and induces HIV-1-infected macrophages and microglia to release neurotoxic substances. Tat transcription is driven by the HIV-1 LTR promoter and is required for overall viral replication of HIV.
  • the targeted sequence is 10 to 30 nucleotides in length, from 12 to 28 nucleotides in length, from 16 to 26 nucleotides in length, or from 10 to 40 nucleotides in length.
  • the targeted sequence includes a nuclease binding site.
  • the targeted sequence includes a nick/cleavage site.
  • the targeted sequence includes a protospacer adjacent motif (PAM) sequence.
  • HITI can be performed anywhere in the provirus that can be cut. HIV pol is the most conserved region of the genome but is also further from the 3’ LTR. Potential tag insertion sites can be screened for suitability, defined as efficient tag expression when inserted in either 5’ to 3’ or 3’ to 5’ orientation, by inserting the tag into potential target sites within the HIV plasmid pNL4-3 using the approach described in FIG. 3, and analyzing expression levels. Particular embodiments utilize the following sites within the HIV genome that can be targeted for genetic insertion using HITI along with associated gRNA (e.g., crRNA) targeting sequences:
  • these particular target and guide sites can depend on the particular nuclease selected and can be derived using prediction algorithms for the nuclease.
  • Particular embodiments utilize conserved Cas9 sites within the HIV genome.
  • Particular embodiments target env or nef regions due to their proximity to the 3’ LTR.
  • Particular embodiments may target gag, pol or other HIV regions that show high levels of genetic conservation across different HIV isolates.
  • the genetic constructs inserted into HIV provirus at targeted sites include at least a reporter gene and a promoter, collectively flanked on both sides by the same gRNA target recognition sequence as the HIV provirus gRNA target recognition sequence. These segments include HITI-regions of homology (e.g., micro-homology sequences).
  • Cas9 can excise the tag sequence from the donor so that it becomes linear dsDNA, while also cutting the HIV provirus at the selected target site (e.g., env or nef sequence as described) to generate a site for insertion of the excised linear tag in the HIV provirus via the HITI process. While this process is described in relation to Cas9, as indicated previously, numerous other targeting and/or cutting elements could be used including sgRNA, Cas9 variants, Cpf1, variants of Cpf1, etc.
  • flanking recognition sequence does not have to be HIV-specific. It can be simpler, however, to deliver one ribonucleoprotein. If only one ribonucleoprotein is used, then the donor should be different each time whereas two ribonucleoproteins could be used to insert a universal donor at different target sites.
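  • By way of a non-limiting illustration, the donor layout described above (a promoter and reporter flanked on both sides by the same gRNA target recognition sequence) can be sketched as a small data structure. The class name `DonorConstruct`, the helper `flank_positions`, the forward orientation of both flanks, and all sequences shown are hypothetical placeholders; no polyA signal is appended, consistent with a construct that lacks one.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DonorConstruct:
    target_site: str  # gRNA target recognition sequence shared with the provirus
    promoter: str
    reporter: str

    def sequence(self) -> str:
        """Assemble flank + promoter + reporter + flank (no polyA signal)."""
        return self.target_site + self.promoter + self.reporter + self.target_site


def flank_positions(donor_seq: str, target_site: str) -> List[int]:
    """Return every start index at which the target site occurs in the donor."""
    positions = []
    start = donor_seq.find(target_site)
    while start != -1:
        positions.append(start)
        start = donor_seq.find(target_site, start + 1)
    return positions


if __name__ == "__main__":
    # Placeholder sequences for illustration only.
    donor = DonorConstruct(
        target_site="CCCCGGGGAAAATTTT", promoter="ATATAT", reporter="GCGCGC"
    )
    seq = donor.sequence()
    # Two occurrences are expected: one at each end of the tag.
    print(flank_positions(seq, donor.target_site))
```

  • Finding exactly two occurrences is a quick sanity check that the nuclease would release a single linear tag from the donor; a third occurrence inside the promoter or reporter would split the tag.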
  • gRNA/HITI-micro-homology sequences of the genetic construct can be between 18-30 bp and share at least 95% sequence identity with the crRNA for the target sequence.
  • gRNA/HITI-micro-homology sequences can be 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 bp and can have 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with the crRNA for the target sequence.
  • gRNA/HITI-micro-homology sequences of the genetic construct are 20 bp with 100% sequence identity with the crRNA for the target sequence.
  • gRNA/HITI-micro-homology sequences bind in vitro to a target sequence under stringent hybridization conditions.
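  • By way of a non-limiting illustration, the length and identity criteria for gRNA/HITI-micro-homology sequences can be sketched as a simple check. The function names `percent_identity` and `is_valid_micro_homology`, the ungapped equal-length comparison, and the representation of the crRNA in the DNA alphabet are assumptions of this sketch; the sequences in the usage lines are placeholders.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def is_valid_micro_homology(
    flank: str,
    crrna_target: str,
    min_len: int = 18,
    max_len: int = 30,
    min_identity: float = 95.0,
) -> bool:
    """Check the 18-30 bp length window and the >= 95% identity criterion."""
    if not min_len <= len(flank) <= max_len:
        return False
    if len(flank) != len(crrna_target):
        return False
    return percent_identity(flank, crrna_target) >= min_identity


if __name__ == "__main__":
    target = "ACGTACGTACGTACGTACGT"      # placeholder 20-mer
    one_mismatch = "TCGTACGTACGTACGTACGT"
    print(is_valid_micro_homology(target, target))
    print(is_valid_micro_homology(one_mismatch, target))
```

  • With a 20 bp sequence, a single mismatch gives exactly 95% identity and still passes, while two mismatches (90%) do not.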
  • the reporter gene encodes a fluorescent or light-emitting protein allowing FACS.
  • the promoter can be a T cell specific promoter.
  • the inserted genetic construct lacks a polyA signal. The absence of a polyA signal in the inserted genetic construct significantly reduces background signals of the systems and methods disclosed herein, and in particular embodiments, is critical to selectively isolating HIV-infected cells.
  • Exemplary fluorescent proteins that can be used as reporters include blue fluorescent proteins (e.g. eBFP, eBFP2, Azurite, mKalama1, GFPuv, Sapphire, T-sapphire); cyan fluorescent proteins (e.g. eCFP, Cerulean, CyPet, AmCyan1, Midoriishi-Cyan); green fluorescent proteins (e.g.
  • Genetic constructs also include promoters to drive expression of the reporter gene.
  • the reporter is under control of a promoter.
  • the promoter includes a constitutive promoter.
  • Exemplary constitutive promoters include simian virus 40 early promoter (SV40), cytomegalovirus immediate-early promoter (CMV), human Ubiquitin C promoter (UBC), human elongation factor 1α promoter (EF1α), mouse phosphoglycerate kinase 1 promoter (PGK), and chicken β-actin promoter coupled with CMV early enhancer (CAGG, also known as the CBA promoter).
  • the constitutive promoter is a synthetic or modified promoter.
  • the promoter is an MND promoter.
  • an MND promoter refers to a synthetic promoter that contains the U3 region of a modified MoMuLV LTR with myeloproliferative sarcoma virus enhancer (see Challita et al. (1995) J. Virol. 69(2):748-755).
  • the promoter is a cell-specific promoter.
  • the promoter is a viral promoter.
  • the promoter is a non-viral promoter.
  • the promoter includes the EF1α promoter or a modified form thereof and/or the MND promoter.
  • the promoter is a regulated promoter (e.g., inducible promoter).
  • the promoter is an inducible promoter or a repressible promoter.
  • the promoter includes a Lac operator sequence, a tetracycline operator sequence, a galactose operator sequence or a doxycycline operator sequence, or is an analog thereof or is capable of being bound by or recognized by a Lac repressor or a tetracycline repressor, or an analog thereof.
  • promoters appropriate for use in T cells include an MND promoter, a CD3A promoter, the murine stem cell virus LTR promoter, the distal lck promoter, and the spleen focus forming virus LTR (SFFV) promoter.
  • the genetic construct includes a signal sequence that encodes a signal peptide.
  • the signal sequence may encode a signal peptide derived from a native polypeptide.
  • the signal sequence may encode a heterologous or non-native signal peptide.
  • a single promoter may direct expression of an RNA that contains, in a single open reading frame (ORF), two or three reporter genes separated from one another by sequences encoding a self-cleavage peptide (e.g., 2A sequences) or a protease recognition site (e.g., furin).
  • the ORF thus encodes a single polypeptide, which, either during (in the case of 2A) or after translation, is processed into the individual proteins. This feature can be useful, for example, to tag cells with different combinations of fluorescent markers.
  • T2A can cause the ribosome to skip (ribosome skipping) synthesis of a peptide bond at the C-terminus of a 2A element, leading to separation between the end of the 2A sequence and the next peptide downstream (see, for example, de Felipe, Genetic Vaccines and Ther. 2:13 (2004) and de Felipe et al., Traffic 5:616-626 (2004)).
  • Many 2A elements are known.
  • Examples of 2A sequences that can be used in the methods and nucleic acids disclosed herein include 2A sequences from the foot-and-mouth disease virus (F2A, e.g., SEQ ID NO: 36), equine rhinitis A virus (E2A, e.g., SEQ ID NO: 37), Thosea asigna virus (T2A, e.g., SEQ ID NO: 38 or 39), and porcine teschovirus-1 (P2A, e.g., SEQ ID NO: 40 or 41) as described in U.S. Patent Publication No. 20070116690.
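The processing of a single multi-reporter polypeptide at 2A elements can be sketched as follows. This is a toy illustration, not the constructs of the disclosure: it assumes the commonly cited 2A behavior in which separation occurs between the glycine and proline of a C-terminal "NPGP" motif, the function name is hypothetical, and the example peptide strings are placeholders rather than any of the SEQ ID NOs referenced above.

```python
import re


def split_at_2a(polyprotein: str) -> list:
    """Split a polypeptide between G and P of each 'NPGP' motif (ribosome-skipping sketch).

    The upstream product keeps the 2A residues through the glycine;
    the downstream product begins with the proline.
    """
    return re.split(r"(?<=NPG)(?=P)", polyprotein)


# Placeholder reporters joined by a 2A-like peptide ending in NPGP (illustrative only)
example = "AAAA" + "EGRGSLLTCGDVEENPGP" + "CCCC"
products = split_at_2a(example)
```

Each additional 2A element in the open reading frame yields one more product, which is how two or three reporters could be expressed from one promoter.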
  • inserted genetic constructs may also include other regulatory elements such as enhancer elements.
  • Coding sequences encoding molecules e.g., RNA, proteins
  • Coding sequences can be obtained from publicly available databases and publications. Coding sequences can further include various sequence polymorphisms, mutations, and/or sequence variants wherein such alterations do not affect the function of the encoded molecule.
  • the term “encode” or “encoding” refers to a property of sequences of nucleic acids, such as a vector, a plasmid, a gene, cDNA, mRNA, to serve as templates for synthesis of other molecules such as proteins.
  • the term “gene” may include not only coding sequences but also regulatory regions such as promoters, enhancers, and termination regions.
  • the term further can include all introns and other DNA sequences spliced from the mRNA transcript, along with variants resulting from alternative splice sites.
  • the sequences can also include degenerate codons of a reference sequence or sequences that may be introduced to provide codon preference in a specific organism or cell type.
  • Inserted genetic constructs can be of any suitable size.
  • the inserted genetic construct integrated into a genome is more than 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50, 100, 150, 200, 250, 300, 350, 400, 450, or 500 kb in length.
  • the inserted genetic construct includes micro-homology regions of less than 75 bp or micro-homology regions of 20 bp.
  • Introduction of genetic engineering components with targeting, cutting, and insertion properties into T cells may be by any suitable method described herein or known to one of ordinary skill in the art.
  • introduction of the genetic engineering components with targeting, cutting, and insertion properties into a T cell can be accomplished chemically, biologically, or mechanically.
  • Exemplary methods include calcium phosphate transfection, DEAE-dextran mediated transfection, dendrimer-mediated delivery, electroporation, gene gun delivery, lipotransfection, polyethyleneimine (PEI)-mediated transfection, protoplast fusion, microinjection, nanoparticle-mediated nucleic acid delivery, sonoporation, and viral vector-mediated delivery.
  • Performing electroporation at a temperature below 35°C e.g., at 30°C
  • Particular embodiments utilize one delivery mechanism to deliver targeting elements and cutting elements followed by a second mechanism to deliver genetic constructs for insertion into the HIV genome.
  • particular embodiments can utilize electroporation to deliver targeting elements and cutting elements followed by use of a vector to deliver genetic constructs for insertion into the HIV genome.
  • a vector refers to a composition that facilitates transfer of non-native nucleic acid molecules into a cell and expression of non-native nucleic acid derived molecules within that cell.
  • Viral vector is widely used to refer to a vector that includes virus-derived components that facilitate transfer and expression of non-native nucleic acid molecules within a cell.
  • the term “retroviral vector” refers to a viral vector containing structural and functional genetic elements, or portions thereof, that are primarily derived from a retrovirus.
  • the term “lentiviral vector” refers to a viral vector containing structural and functional genetic elements, or portions thereof, that are primarily derived from a lentivirus, and so on.
  • hybrid viral vector refers to a viral vector including structural and/or functional genetic elements from more than one virus type.
  • Adeno-Associated Virus is a parvovirus, discovered as a contamination of adenoviral stocks. It is a ubiquitous non-pathogenic virus (antibodies are present in up to 85% of the US human population). It is also classified as a dependovirus, because its replication is dependent on the presence of a helper virus, such as adenovirus.
  • a recombinant AAV (rAAV) vector refers to a recombinant AAV-derived virus that is packaged so that it contains a recombinant ssDNA nucleic acid genome. rAAV genomes are deleted of all native AAV coding sequences, contain exogenous DNA, and require two AAV derived inverted terminal repeat sequences (ITRs) for efficient packaging.
  • a self-complementary AAV (scAAV) vector contains a double-stranded vector genome generated by deletion of the terminal resolution site (TR) from one of the rAAV ITRs in the plasmid used to make the vector, which prevents the initiation of replication at the mutated end.
  • constructs generate single-stranded scAAV genomes, with a wild-type (wt) ITR at each end and a mutated ITR in the middle.
  • Each half of the scAAV genome is complementary, which enables self-hybridization and generation of a transcriptionally active dsDNA genome.
  • AAV serotypes include AAV-1, AAV-2, AAV-3A, AAV-3B, AAV-4, AAV-5, AAV-6, AAV-7, AAV-8, AAV-9, AAV-10, and AAV-11 (Choi et al., 2005, Curr. Gene Ther. 5:299-310).
  • a scAAV vector can be generated based on any of these or other serotypes of AAV.
  • rationally engineered variants of naturally occurring AAV capsids, or laboratory derived intra-serotype recombinants have also been widely used to generate AAV vectors.
  • AAV vectors that can be used within the teachings of the current disclosure, see, e.g, West et al., Virology 160:38-47 (1987); U.S. Pat. No. 4,797,368; WO 93/24641 ; Kotin, Human Gene Therapy 5:793-801 (1994); Muzyczka, J. Clin. Invest.
  • genetic constructs may be delivered by plasmids or non-integrating lentiviral vectors.
  • non-integrating lentiviral vectors see, e.g., Ory et al. (1996) Proc. Natl. Acad. Sci. USA 93:11382-11388; Dull et al. (1998) J. Virol. 72:8463-8471; Zufferey et al. (1998) J. Virol. 72:9873-9880; Follenzi et al. (2000) Nature Genetics 25:217-222; U.S. Patent Publication No. 2009/054985.
  • Illustrative lentiviruses include: HIV (human immunodeficiency virus; including HIV type 1 and HIV type 2); visna-maedi virus (VMV); the caprine arthritis-encephalitis virus (CAEV); equine infectious anemia virus (EIAV); feline immunodeficiency virus (FIV); bovine immunodeficiency virus (BIV); and simian immunodeficiency virus (SIV).
  • HIV based vector backbones (i.e., HIV cis-acting sequence elements)
  • exemplary viral vectors that may be used include Friend murine leukemia virus, feline leukemia virus (FLV), gibbon ape leukemia virus (GaLV), Harvey murine sarcoma virus (HaMuSV), Moloney murine leukemia virus (M-MuLV), Moloney murine sarcoma virus (MoMSV), murine mammary tumor virus (MuMTV), spumavirus, Murine Stem Cell Virus (MSCV), and Rous Sarcoma Virus (RSV).
  • Synthetic nanoparticles can also be used to deliver gene-editing components.
  • synthetic nanoparticles can be utilized to deliver all required gene editing components.
  • Different nanoparticles can deliver different gene-editing components and/or all required gene editing components can be delivered to a cell by a single type of nanoparticle (e.g., Nanoblade; see, e.g., Mangeot et al., Nat Commun. 2019, 10(1):45, doi: 10.1038/s41467-018-07845-z).
  • nucleic acids that can be delivered are described in, for example, Hardee et al., Genes (2017), 8, 65. Hardee et al. review non-viral DNA gene delivery vectors including plasmids, minicircles, and minivectors. Particular embodiments include use of double-stranded DNA (dsDNA), single-stranded oligonucleotides (e.g., ssODN), conventional plasmids, minicircles, and/or closed-ended linear ceDNA (see Li et al., PLoS One, Aug. 1, 2013, doi.org/10.1371/journal.pone.0069879).
  • the isolation of tagged cells relies on the positive expression of a reporter.
  • a negative control tube can be analyzed first to set a gate (bitmap) around the population of interest by FSC and SSC, and the photomultiplier tube voltages and gains for fluorescence in the desired emission wavelengths can be adjusted such that a set percentage (e.g., 97%) of the cells appear unstained for the fluorescence marker with the negative control. Once established, these parameters can be used to isolate cells expressing the fluorescent reporter.
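The negative-control gating step can be sketched numerically. This is a simplification under stated assumptions: the text describes adjusting photomultiplier voltages and gains on the instrument, whereas the sketch below instead computes an intensity cutoff from recorded negative-control events so that a set fraction (e.g., 97%) falls at or below it. The function names and the example intensities are hypothetical.

```python
import math


def negative_control_threshold(control_intensities, unstained_fraction=0.97):
    """Smallest cutoff such that at least `unstained_fraction` of control events are <= cutoff."""
    if not control_intensities:
        raise ValueError("negative control must contain at least one event")
    if not 0 < unstained_fraction <= 1:
        raise ValueError("unstained_fraction must be in (0, 1]")
    ordered = sorted(control_intensities)
    # Number of control events that must sit at or below the cutoff
    k = math.ceil(unstained_fraction * len(ordered))
    return ordered[k - 1]


def reporter_positive(sample_intensities, threshold):
    """Events brighter than the cutoff are treated as reporter-positive."""
    return [x for x in sample_intensities if x > threshold]


# Hypothetical negative-control fluorescence values (arbitrary units)
control = list(range(1, 101))
cutoff = negative_control_threshold(control, 0.97)
positives = reporter_positive([50, 97, 98, 150], cutoff)
```

The same cutoff would then be applied unchanged to the tagged sample, mirroring the statement that the established parameters are reused for isolation.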
  • the reporter can be a protein expressed and trafficked to the cell surface.
  • sorting of tagged cells may be based on antibody-based magnetic separation; affinity chromatography; "panning" with antibody attached to a solid matrix (Broxmeyer et al., 1984, J. Clin. Invest. 73:939-953); etc.
  • “antibody” refers to a full antibody protein. In other embodiments,“antibody” can refer to only those portions of an antibody necessary to result in targeted protein binding (e.g., antibody binding fragments).
  • Drug selectable markers (e.g., dihydrofolate reductase, cytosine deaminase, HSV-1 thymidine kinase) may also be used.
  • a tagged and isolated cell population refers to one in which the isolated cell type makes up >70%, >80%, >90%, >95%, >98%, or >99% of the cell population.
  • isolated latently infected T cells can be maintained and expanded in libraries.
  • any method known in the art for expanding the number of isolated T cells can be used.
  • the isolated T cells can be cultured under cell growth conditions such that the T cells grow and divide (proliferate) to obtain a population of cells infected with HIV.
  • the technique used for expansion is one that has been shown to result in an increase in the number of HIV-Infected T-Cells relative to an unexpanded sample.
  • the expansion technique results in a 50-, 75-, 100-, 150-, 200-, 250-, 300-, 350-, 400-, 450-, or 500- fold or more increase relative to an unexpanded sample.
  • Exemplary expansion techniques are described in, for example, WO2018157072, CA2999496A1, and WO2015188119A.
  • isolated HIV-Infected T-Cells are cultured with growth factors and are exposed to cell growth conditions.
  • the cell growth conditions can include an incubation temperature suitable for the growth of human T-Cells, for example, at least 25°C, at least 30°C, or 37°C.
  • the HIV-Infected T-Cells are cultured for 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 days or more. In certain embodiments, the HIV-Infected T-Cells are cultured for at least 7 days.
  • HIV inhibitors within culture media. Inhibiting HIV before genetic tagging of cells occurs can be particularly beneficial to protect cells from viral replication (which can lead to their destruction).
  • HIV inhibitors include efavirenz, nevirapine, lamivudine, and AZT.
  • the HIV-Infected T-Cells are cultured in a culture medium including one or more of Minimal Essential Media (MEM), Roswell Park Memorial Institute (RPMI) Media 1640, X-VIVO 15, X-VIVO 20, Click’s Medium, AIM V Medium, Dulbecco’s Modified Eagle Medium (DMEM), Eagle’s MEM, α-MEM, F-12 nutrient mixture, and human AB serum.
  • the isolated T cells can be cultured in the culture medium in the presence of one or more additional factors.
  • the one or more additional factors may be serum (e.g., fetal bovine or human serum), interleukin-2 (IL-2), insulin, interferon gamma (IFN-γ), IFN-α, IL-4, IL-7, IL-21, granulocyte-macrophage colony-stimulating factor (GM-CSF), IL-10, IL-12, IL-15, TGFβ, and/or tumor necrosis factor alpha (TNF-α).
  • surfactant, plasmanate, N-acetyl-cysteine, 2-mercaptoethanol, added amino acids, sodium pyruvate, vitamins, hormones, cytokine(s), penicillin, streptomycin, L-glutamine, plasma, efavirenz, Phytohemagglutinin-L (PHA-L), and GlutaMAX.
  • the cell culture media can include glucose:galactose in a ratio including 10:90, 15:85, 20:80, 25:75, 30:70, 35:65, 40:60, 45:55, 50:50, 55:45, 60:40, 65:35, 70:30, 75:25, 80:20, 85:15, and 90:10 (e.g., from 10:90 to 90:10, from 20:80 to 90:10, from 30:70 to 90:10, from 40:60 to 90:10, from 10:90 to 80:20, from 10:90 to 70:30, from 10:90 to 60:40, from 20:80 to 80:20, from 30:70 to 70:30, from 40:60 to 60:40, from 45:55 to 55:45, from 47:53 to 53:47, etc.).
  • the cell culture media further includes one or more of fatty acids, cholesterol, arachidonic acid, linoleic acid, linolenic acid, myristic acid, oleic acid, palmitic acid, palmitoleic acid and stearic acid.
  • the media is sterile.
  • isolated T cells can be maintained in RPMI-1640 medium (HyClone™) supplemented with 10% heat inactivated FBS, 100 µg/mL penicillin/streptomycin, and 1 µg/mL combination anti-retroviral drugs (cART) at 37°C in a 5% CO2 incubator in a biosafety level (BSL) 2+ facility.
  • Additional culture conditions that can be used for expanding HIV-Infected T-Cells are set forth in Baxter et al., Cell Host & Microbe, Vol. 20, Issue 3, 14 September 2016, pages 368-380 and include stimulating isolated CD4 T-cells for 36-40 hours in RPMI with PHA-L (10 µg/ml, Sigma) and IL-2 (50 U/ml). The cells can be subsequently washed and maintained for 6-7 days in RPMI with IL-2 (100 U/ml).
  • Another example culture condition for expanding the HIV-Infected T-Cells is set forth in Cillo et al., PNAS 2014 May 13; 111(19):7078-83, and includes seeding 304 cells per well into individual wells of 48-well plates and culturing the cells for 7 days in RPMI medium 1640 without phenol red containing 10% (vol/vol) fetal bovine serum and 0.6% penicillin/streptomycin. 300 nM efavirenz can also be introduced into the culture to prevent subsequent rounds of HIV-1 replication.
  • culture conditions can include those described in Kavanagh et al., Blood 2006 March 1; 107(4):1963-9, wherein the HIV-Infected T-Cells can be grown in HEPES-buffered RPMI medium supplemented with penicillin, streptomycin, L-glutamine, and 5% pooled human AB serum.
  • the cells can be cultured in RPMI 1640 supplemented with 10% human blood group AB serum, penicillin-streptomycin (100 IU/mL and 100 µg/mL, respectively), 2 mM L-glutamine, 1 mM sodium pyruvate, and 1% HEPES (N-2-hydroxyethylpiperazine-N′-2-ethanesulfonic acid) buffer.
  • the systems and methods of the present disclosure do not affect the viability of tagged and isolated cells.
  • a lack of effect on viability can be demonstrated following expansion protocols described herein wherein cell growth and expected cell surface marker expression is maintained for the isolated cell type.
  • Maintenance of cell growth and expected cell surface marker expression can be over a defined number of passages, such as over at least 5 passages, over at least 10 passages, over at least 50 passages, or even indefinitely under appropriate culture conditions.
  • a lack of effect on viability can also be demonstrated utilizing a cell function and/or viability assay.
  • a lack of effect on viability can be demonstrated using an ELISPOT (enzyme-linked immunospot) assay.
  • the ELISPOT assay is capable of detecting viable cytokine producing cells by employing high affinity capture and detection antibodies and enzyme-amplification.
  • cytokines detected in an ELISPOT assay include IFN-γ, IL-2, IL-4, IL-5, IL-6, IL-10, IL-12, IL-13, IL-21, and/or TNF-α.
  • CD4+ T cells can be assessed utilizing an interferon gamma ELISPOT assay.
  • ELISPOT IFN-γ assays and reagents are available from BD Biosciences, 2350 Qume Drive, San Jose, Calif., 95131. Additional information regarding use of ELISPOT assays is provided in J. Immunol. Methods. 2001, 254(1-2):59.
  • methods of assessing cell function and/or viability include measurement of intracellular cytokines.
  • the function of CD4+ cells can be assessed using interferon gamma intracellular cytokine staining (ICS).
  • ICS staining involves permeabilizing the cells and treating them with antibodies that bind cytokines that have accumulated inside the cell.
  • Cell viability also can be determined using Trypan blue and light microscopy or 7-amino-actinomycin D vital dye and flow cytometry. Additionally or alternatively, the function of CD4+ cells can be assessed by the ability of the cell to respond to mitogenic stimulation by quantifying the amount of ATP produced (Kowalski et al., 2007, Journal of Immunotoxicology, 4(3):225-232). For additional assays and techniques to assess T cell function, see McMichael & O'Callaghan, J. Exp. Med. 187(9):1367-1371, 1998; McHeyzer-Williams et al., Immunol. Rev. 150:5-21, 1996; and Lalvani et al., J. Exp. Med. 186:859-865, 1997.
  • cells from an expanded library are assessed for HIV integration sites, HIV sequences, TCR of the latently infected cells, and/or other factors of interest to a particular research program.
  • genomic DNA (gDNA) of the HIV-Infected T-Cells is isolated.
  • the isolated gDNA is sheared into DNA fragments.
  • each of the DNA fragments for example, has a length of 500-700 base pairs.
  • the DNA fragments are amplified.
  • the DNA fragments are amplified using a polymerase chain reaction (PCR) technique, such as allele-specific PCR, assembly PCR, asymmetric PCR, endpoint PCR, hot-start PCR, in situ PCR, intersequence-specific PCR, inverse PCR, linear after exponential PCR, ligation-mediated PCR, methylation-specific PCR, miniprimer PCR, multiplex ligation-dependent probe amplification, multiplex PCR, nested PCR, overlap-extension PCR, polymerase cycling assembly, qualitative PCR, quantitative PCR, real-time PCR, single-cell PCR, solid-phase PCR, thermal asymmetric interlaced PCR, touchdown PCR, universal fast walking PCR, etc.
  • Ligase chain reaction (LCR) may also be used.
  • thermostable polymerase such as Taq DNA polymerase (e.g., wild-type enzyme, a Stoffel fragment, FastStart polymerase, etc.), Pfu DNA polymerase, S-Tbr polymerase, Tth polymerase, Vent polymerase, or a combination thereof, among others.
  • PCR and LCR are driven by thermal cycling.
  • Alternative amplification reactions, which may be performed isothermally, can also be used.
  • Exemplary isothermal techniques include branched-probe DNA assays, cascade-RCA, helicase-dependent amplification, loop-mediated isothermal amplification (LAMP), nucleic acid sequence-based amplification (NASBA), nicking enzyme amplification reaction (NEAR), PAN-AC, Q-beta replicase amplification, rolling circle amplification (RCA), self-sustaining sequence replication, strand-displacement amplification, etc.
  • Amplification may be performed with any suitable reagents (e.g., template nucleic acid (e.g., DNA or RNA), primers, probes, buffers, replication catalyzing enzymes (e.g., DNA polymerase, RNA polymerase), nucleotides, salts (e.g., MgCl2), etc.).
  • an amplification mixture includes any combination of at least one primer or primer pair, at least one probe, at least one replication enzyme (e.g., at least one polymerase), and deoxynucleotide (and/or nucleotide) triphosphates (dNTPs and/or NTPs), etc.
  • HITI specific junction primers can be used to amplify sequences of particular interest:
  • DNA sequencing with commercially available NGS platforms may be conducted with the following steps.
  • First, DNA sequencing libraries may be generated by clonal amplification by PCR in vitro.
  • Second, the DNA may be sequenced by synthesis, such that the DNA sequence is determined by the addition of nucleotides to the complementary strand rather than through chain-termination chemistry.
  • DNA templates may be sequenced simultaneously in a massively parallel fashion without the requirement for a physical separation step. While these steps are followed in most NGS platforms, each utilizes a different strategy (see e.g., Anderson, M. W. and Schrijver, I., 2010,
  • NGS platforms include:
  • DNA segments can undergo an amplification as part of NGS.
  • host-viral junctions and host DNA breakpoints are identified in the DNA fragments.
  • the integration sites of the HIV in the genomes of the HIV-Infected T-Cells can be identified by matching the host-viral junctions and host DNA breakpoints to the human genome.
  • isolated gDNA of T-cells can be fragmented by random shearing into fragments, each 300 to 500 base pairs long.
  • Linker-mediated nested PCR can then be performed on the DNA fragments, in order to amplify the human genomic regions and the linked viral sequences from both the 5’ and 3’ long terminal repeats (LTRs).
  • paired-end sequencing of the amplified DNA can be performed using, for example, the MiSeq.
  • the integration sites are then recovered by mapping host DNA sequences to the human genome.
  • a stringent filter can be used to check quality of the recovered integration sites.
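The junction-based recovery of integration sites can be sketched with a toy example. This is a minimal illustration, not the pipeline used in the disclosure: it assumes reads that run from the end of a viral LTR into host DNA, uses exact string matching against a short made-up reference in place of a real aligner and the human genome, and treats "found exactly once" as a stand-in for the stringent filter. All names and sequences are hypothetical.

```python
def host_flank_after_ltr(read: str, ltr_end: str, min_flank: int = 10):
    """Return the host sequence that follows the LTR terminus in a read, or None."""
    idx = read.rfind(ltr_end)
    if idx == -1:
        return None  # no host-viral junction in this read
    flank = read[idx + len(ltr_end):]
    return flank if len(flank) >= min_flank else None


def map_integration_site(flank: str, reference: str):
    """Map a host flank to a toy reference; None if absent or ambiguous (stringent filter)."""
    first = reference.find(flank)
    if first == -1:
        return None
    if reference.find(flank, first + 1) != -1:
        return None  # maps to more than one position, so it is discarded
    return first  # 0-based position of the first host base after the junction


# Made-up sequences for illustration only
reference = "TTGACCGATAGGCTTAACGGATCCGTAAGCTTGGCA"
ltr_end = "GGAAAATCTCTAGCA"
read = "ACGT" + ltr_end + "GGCTTAACGGATCC"

flank = host_flank_after_ltr(read, ltr_end)
site = map_integration_site(flank, reference)
```

A real analysis would replace the exact-match lookup with a genome aligner and would also use the host DNA breakpoints created by shearing, which this sketch ignores.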
  • for α and β TCR chains, it may be necessary to pair TCR chains following sequencing.
  • Various methods can be utilized to pair isolated α and β TCR chains.
  • post-sequencing pairing may be unnecessary or relatively simple, for example in embodiments in which the α and β chain pairing information is not lost in the procedure, such as if one were to sequence from single cells or from a clonally expanded group of cells.
  • chain pairing may be assisted in silico by computer methods. For example, specialized, publicly available immunology gene alignment software is available from IMGT, JOINSOLVER, VDJSolver, SoDA, iHMMune-align, or other similar tools for annotating VDJ gene segments.
  • chain pairing may be performed using VDJ antibodies. For example, one may obtain antibodies for the identified segments and use the antibodies to purify a subset of cells that express that gene segment in their (surface) receptors (e.g. using FACS, or immunomagnetic selection with microbeads). One may then sequence from this subset of cells which have been purified for the desired gene segments. If necessary, this secondary sequencing may be done more deeply (i.e. at a higher resolution) than the first round of sequencing. In this second sequence data set, there will be far fewer induced clonotypes, greatly easing the task of chain pairing. Depending on the gene segments, there may be only one induced α chain and one induced β chain for example.
  • chain pairing may be performed using multiwell sequencing. For example, one may isolate gene segment purified cells or unpurified cells into a microwell plate, where each microwell has a very low number of cells. One can amplify and sequence the cells in each well individually, which provides another means to pair the chains of interest by sequencing on a single cell basis, facilitating the pairing of induced α and β chains. Assays such as PairSEQ® (Adaptive Biotechnologies Corp., Seattle, WA) have also been developed.
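The multiwell idea can be sketched as a co-occurrence count: chains that repeatedly appear in the same wells are likely to come from the same clone. This is a simplified illustration with hypothetical names and data. It is not the PairSEQ method, and the rule of dropping tied candidates is an assumption made for the sketch.

```python
from collections import Counter
from itertools import product


def pair_chains_by_well(wells):
    """Pair each alpha chain with the beta chain it most often shares a well with.

    `wells` is a list of (alpha_chains, beta_chains) observed per well.
    Alpha chains whose best candidate is tied are left unpaired.
    """
    co_occurrence = Counter()
    for alphas, betas in wells:
        for a, b in product(set(alphas), set(betas)):
            co_occurrence[(a, b)] += 1

    pairs = {}
    for alpha in {a for a, _ in co_occurrence}:
        ranked = sorted(
            ((n, b) for (a, b), n in co_occurrence.items() if a == alpha),
            reverse=True,
        )
        if len(ranked) == 1 or ranked[0][0] > ranked[1][0]:
            pairs[alpha] = ranked[0][1]
    return pairs


# Hypothetical clonotype labels observed in three wells
wells = [
    ({"A1"}, {"B1"}),
    ({"A1", "A2"}, {"B1", "B2"}),
    ({"A2"}, {"B2"}),
]
pairs = pair_chains_by_well(wells)
```

With very few cells per well, as described above, most wells contain a single clone and the counts separate cleanly; with more cells per well, more wells are needed before one candidate dominates.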
  • the identified provirus can be re-assembled using plasmids so that new viruses can be generated in culture for replication studies. This can generate information regarding whether the initially tagged provirus was potentially replication competent. In particular embodiments, such studies would only be undertaken with proviruses that do not contain large deletions, found in many integrated proviruses.
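A first-pass screen for proviruses unlikely to be replication competent can be sketched from sequence alone, using large deletions and premature stop codons as indicators. This is only an illustration: the function names are hypothetical, the 100 bp deletion threshold is an arbitrary assumption, and a sequence passing these checks is not thereby shown to be replication competent, which is what the culture studies described above would address.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_premature_stop(orf: str) -> bool:
    """True if an in-frame stop codon occurs before the final codon of the reading frame."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3  # ignore a trailing partial codon
    codons = [orf[i:i + 3] for i in range(0, usable, 3)]
    return any(codon in STOP_CODONS for codon in codons[:-1])


def has_large_deletion(provirus_len: int, reference_len: int, max_missing: int = 100) -> bool:
    """True if the provirus is shorter than the reference by more than `max_missing` bp.

    The threshold is a hypothetical choice for this sketch.
    """
    return reference_len - provirus_len > max_missing


def worth_reassembling(provirus_len: int, reference_len: int, orfs) -> bool:
    """Keep only proviruses without a large deletion or a premature stop in any supplied ORF."""
    if has_large_deletion(provirus_len, reference_len):
        return False
    return not any(has_premature_stop(orf) for orf in orfs)
```

Only in-frame codons are examined, so a stop triplet that appears out of frame (for example the "TGA" spanning the first two codons of "ATGAAATAA") is not counted.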
  • a method of isolating CD4+ primary T cells latently infected with human immunodeficiency virus (HIV) provirus including HIV
  • genetic engineering components include:
  • a ribonucleoprotein complex including Cas9 and guide RNA (gRNA) including the sequence set forth in SEQ ID NO: 44; and
  • scAAV6 self-complementary adeno-associated virus 6
  • the donor vector genome includes an insertion construct including an MND promoter and a GFP reporter gene collectively flanked at the 5’ and 3’ ends by SEQ ID NO: 43 wherein the insertion construct does not include a polyA signal and
  • the delivering results in insertion of the insertion construct within the env gene of the HIV provirus and expression of the reporter gene;
  • a method of isolating T cells latently infected with human immunodeficiency virus (HIV) provirus including HIV
  • the genetic engineering components integrate a genetic construct into a targeted portion of the HIV provirus genome
  • the genetic construct includes a promoter and a gene encoding a reporter, but lacks a polyA signal and results in expression of the reporter;
  • T cells are CD4+ primary T cells.
  • reporter gene encodes a fluorescent protein, a protein bound by an antibody binding domain, or a drug selectable marker.
  • sorting includes fluorescence activated cell sorting (FACS), magnetic based cell-sorting, affinity chromatography, panning, or drug selection.
  • non-integrating viral vector is an adeno-associated viral vector (AAV) or a lentiviral vector.
  • non-integrating viral vector is a self-complementary AAV6 vector (scAAV6).
  • a method including creating a library of isolated T cells latently infected with human immunodeficiency virus (HIV) including
  • each sample is obtained from a different patient latently infected with HIV at a targeted portion of the HIV provirus genome
  • the genetic engineering components integrate a genetic construct into the HIV provirus genome within the samples at a targeted portion of the HIV provirus genome
  • the genetic construct includes a promoter and a reporter gene but lacks a polyA signal and results in expression of the reporter gene;
  • T cells are CD4+ T cells.
  • T cells are primary T cells.
  • the targeted portion of the HIV provirus genome includes a sequence as set forth in one of SEQ ID NOs: 3, 43, 45, 47, or 49.
  • the genetic engineering components include a guide RNA sequence including a sequence as set forth in one of SEQ ID NOs: 42, 44, 46, 48, or 50.
  • reporter gene encodes a fluorescent protein, a protein bound by an antibody binding domain, or a drug selectable marker.
  • sorting includes fluorescence activated cell sorting (FACS), magnetic based cell-sorting, affinity chromatography, panning, or drug selection.
  • the viral vector is a non-integrating viral vector.
  • the non-integrating viral vector is an adeno-associated viral vector (AAV) or a lentiviral vector.
  • non-integrating viral vector is a self-complementary AAV vector (scAAV).
  • non-integrating viral vector is a self-complementary AAV6 vector (scAAV6).
  • invention 65 further including identifying one or more of whole provirus genome, integration site of the HIV provirus, T cell receptor α and β chains, and deletions or stop codons within the HIV provirus genome that predict replication incompetence based on the sequencing.
  • kits to isolate T cells latently infected with human immunodeficiency virus including a guide RNA sequence including a sequence as set forth in one of SEQ ID NOs: 42, 44, 46, 48, or 50, a Cas or Cpf1 nuclease, and a genetic construct that includes a promoter and a reporter gene but lacks a polyA signal.
  • kit of embodiment 73 including Cas9.
  • a polyA signal-less genetic construct is one that does not include a polyA signal.
  • a polyA signal-less genetic construct is one that does not include a polyA signal within a region that stabilizes unprocessed mRNA.
  • Example 1 CRISPR/Cas9-mediated HITI facilitates HIV provirus tagging.
  • HDR-mediated targeted integration whether HITI could be harnessed to tag the HIV provirus was investigated.
  • experiments using the ACH-2 cell line, which contains 1 copy of the provirus/cell were performed.
  • ACH-2 cells were electroporated with Cas9 RNPs containing gRNAs that target a highly-conserved region of the HIV pol gene and the non-human zebrafish Tia1 sequence, along with a plasmid donor containing a GFP-expressing reporter (under control of MND, a strong promoter in T cells, (Sather et al., Sci Transl Med. 2015;7(307):307ra156)) flanked by Tia1 sgRNA cleavage sites (FIG. 2A).
  • Flow cytometry was performed at day 1 to demonstrate plasmid uptake via GFP expression and Cas9/RNP activity via GFP knockdown using the eGFP gRNA that targets GFP (FIG. 2B).
  • Genomic DNA from treated cells was extracted at day 4 for genetic analysis.
  • HITI junction-specific PCR showed that the reporter had been inserted into the HIV pol target site in ACH-2 cells (FIG. 2C) in forward and reverse orientations (FIG. 2D).
  • Sanger sequencing of PCR products showed that the predicted forward and reverse HITI junction sequences were present (FIG. 2D).
  • polyA signal-less cassette was introduced into unique sites in pol and nef of the pNL4-3 molecular clone, in both the positive and negative orientations, so that sites distal and proximal to the native 3’LTR polyA signal could be evaluated for GFP expression (FIG. 3A). After transfection of these constructs into 293 cells, GFP expression was analyzed after 72 hours by flow cytometry (FIG. 3B).
  • Cells receiving any of the 4 MND-GFP polyA-less pNL4-3 reporter constructs were GFP positive, regardless of reporter orientation, with forward orientation insertion in nef proximal to the native 3’LTR polyA signal producing the highest number of GFP expressing cells. Importantly, cells receiving either pNL4-3 or the linearized polyA-less MND-GFP donor alone did not express GFP.
  • polyA signal-less donor construct could tag the HIV provirus via CAPTIV in ACH-2 cells.
  • a highly-conserved gRNA target site proximal to the 3’LTR polyA signal in env (envO) was identified as a provirus tagging site, and the activity of a gRNA targeting envO was validated in 293 cells following delivery of pNL4-3 and Cas9 RNPs (FIG. 4A).
  • a polyA signal-less CAPTIV donor for envO was then electroporated into ACH-2 cells along with Cas9 RNPs for envO (FIG.
  • Proviruses and HIV integration sites can be sequenced following probe capture enrichment. An attractive possibility arising from CAPTIV isolation of cells containing HIV would be to interrogate the provirus sequence and integration site within clonally expanded cells following probe capture of HIV DNA.
  • a set of 120bp 1X tiling HIV capture probes across the entire length of the NCBI HIV reference genome were designed and used to pull down HIV sequences from fragmented genomic DNA of naive or CAPTIV-tagged ACH-2 cells prior to performing Illumina sequencing (FIG. 6A). Using this method, complete provirus sequences from naive or CAPTIV-tagged cells were obtained.
  • a consensus provirus sequence was generated from an average read coverage of 685 reads/position, which was 99% homologous to HXB2 (FIG. 6B).
  • This ACH-2 consensus sequence and the NL4-3 and HXB2 sequences used for the alignments are provided in FIG. 6C and 7.
  • full provirus sequences from as few as 1000 cells were obtained. Multiple different integration sites with reads of up to >250bp off the end of the HIV LTR were also recovered.
  • LTR integration site junctions in ACH-2 cells were found at a major integration site previously identified within the NT5C3A gene of chromosome 7, and a previously identified minor integration site in the SLC25A25-AS gene of chromosome 9 (Sunshine et al., J Virol. 2016;90(9):4511-9; Symons et al., Retrovirology. 2017;14(1):2) (FIG. 6C).
  • HIV-1 latent T cell clones with one integrated proviral copy, ACH-2 National Institutes of Health (NIH) AIDS reagent program
  • RPMI-1640 medium Hyclone TM
  • FBS heat inactivated FBS
  • penicillin/streptomycin 100ug/mL
  • combinational anti-retrovirus drugs cART
  • BSL biosafety level
  • DMEM Hyclone®, GE Healthcare Life Sciences, Pittsburgh, PA
  • Human CD4+ T cells were isolated from PBMCs using the EasySep Human CD4+ T cell isolation kit (Stemcell) and EasyEights magnet (Stemcell). Isolated CD4+ cells were activated with CD3/CD28 activator beads (Life Technologies) for three days following the manufacturer’s protocol. After the T cell activator was removed, activated CD4+ T cells were maintained in RPMI-1640 medium containing 10% heat inactivated FBS and 30U/mL rIL-2 (PeproTech). rIL-2 was added to the culture every two or three days.
  • CRISPR/Cas9 crRNAs, electroporation enhancer and recombinant S. pyogenes Cas9 nuclease containing nuclear localization sequence and C-terminal 6-His tag were obtained from Integrated DNA Technologies.
  • DNA Plasmids To generate the plasmid pTia1-MND-GFP, a PCR product encompassing a MND-GFP-SV40pA expression cassette was amplified from the plasmid pscAAV-MND-GFP (De Silva et al., Antiviral Res. 2016; 126:90-8) using primers:
  • the constructs pCAPTIV-EnvO-ΔpA, pCAPTIV-Env1-ΔpA, pCAPTIV-Nef1-ΔpA and pCAPTIV-Nef2-ΔpA were developed in which the donor tag is flanked by two identical and conserved HIV-specific gRNA target sites, and contains the MND promoter and a GFP reporter, but does not contain a polyA signal.
  • the gRNA target sequences for EnvO, Env1, Nef1, and Nef2 are provided in Tables 1 and 2.
  • AAV6 vectors 16 million 293 cells were seeded to a 15cm dish the day before transfection, and at least 10 dishes were prepared to make each virus stock. Either pscAAV-EnvO-MND-GFP-ΔpA or pscAAV-Nef1-MND-GFP-ΔpA, pHelper and pRepCap6 plasmid were transfected into 293 cells at a 5:3:2 ratio. A total of 28ug of DNA and 112 ug polyethylenimine (PEI) at 1:4 ratio per plate were added to 500ul OptiMEM. Transfection mix was incubated at room temperature for 15 min and then added to 293 cells.
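The per-plate plasmid and PEI amounts follow directly from the stated ratios. The sketch below works through that arithmetic, assuming the 5:3:2 ratio is by DNA mass (the text does not say whether it is by mass or molarity). The function name and the plasmid labels are illustrative and do not come from the source.

```python
def split_by_ratio(total_ug, ratio):
    """Split a total DNA mass (ug) across plasmids in proportion to `ratio`."""
    parts = sum(ratio.values())
    return {name: total_ug * share / parts for name, share in ratio.items()}

TOTAL_DNA_UG = 28.0   # total DNA per 15cm plate, from the text
PEI_PER_DNA = 4.0     # 1:4 DNA:PEI ratio, from the text

# Assumption: ratio order is donor : pHelper : pRepCap6, split by mass.
dna_ug = split_by_ratio(TOTAL_DNA_UG, {"donor": 5, "pHelper": 3, "pRepCap6": 2})
pei_ug = TOTAL_DNA_UG * PEI_PER_DNA

for name, ug in dna_ug.items():
    print(f"{name}: {ug:.1f} ug")
print(f"PEI: {pei_ug:.0f} ug")
```

Under these assumptions each plate receives 14 ug donor, 8.4 ug pHelper and 5.6 ug pRepCap6, with 112 ug PEI, which matches the PEI amount given in the text.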
  • Cas9 ribonucleoprotein (RNP) electroporation 2.5 x 10^5 ACH-2 cells were resuspended in 9ul buffer R (Thermo Fisher Scientific) before electroporation. Cells were electroporated with 1 ug CAPTIV donor plasmids at 1500V, 10ms, 1 pulse using the Neon transfection system (Invitrogen), together with 1.8 uM electroporation enhancer and 1.8uM Cas9 RNP containing 1.5uM Cas9 nuclease and gRNA. At 6 days post-transfection, GFP tagged provirus was validated by flow cytometry. Genomic DNA (gDNA) was extracted with the QIAamp micro kit (QIAGEN) following the manufacturer’s protocol for next generation sequencing and for tag insertion junction-specific PCR analysis.
  • Genomic DNA was extracted from the ACH-2 cell line with the QIAamp micro kit (QIAGEN) following the manufacturer’s protocol. Libraries were prepared for Illumina sequencing as described previously (Greninger et al., BMC Genomics, 2018 Mar 20;19(1):204; Greninger et al., mSphere, 2018 Jun 13;3(3)). Briefly, 100ng of DNA was fragmented to an average insert size of 500bp using the Kapa HyperPlus kit with a 37°C incubation for 7 minutes.
  • T7 endonuclease I assay The T7 endonuclease I cleavage assay and amplicon-sequencing protocols have been described in (Aubert et al., Molecular Therapy - Nucleic Acids (2014) 3, e146). Primers to amplify the region that contains CRISPR/Cas9 target sites in env and nef are shown in Table 2.
  • Example 2 CAPTIV tagging of the human CCR5 locus.
  • primary CD4+ T cells from HIV naive normal donors were utilized.
  • the CCR5 gene was initially targeted as a proxy for the HIV provirus, since HIV infection of primary human CD4+ T cells in vitro is detrimental to cell viability, and it was desired to identify the experimental parameters that would maximize cell survival when moving into HIV+ patient derived CD4+ T cells.
  • Two spCas9 gRNA target sites were first identified in the CCR5 gene locus (CCR5-1 and CCR5-2) that were located at different distances from their nearest canonical AATAAA or ATTAAA polyA signals in each direction (FIG. 8).
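The distance from a candidate cut site to the nearest canonical polyA signal in each direction can be estimated with a simple motif scan. The sketch below is illustrative only: the function names and the toy sequence are hypothetical, and it assumes that "each direction" means reading downstream on the top strand and downstream on the bottom strand (the reverse complement of the sequence upstream of the cut). It is not the analysis used to produce FIG. 8.

```python
POLYA_SIGNALS = ("AATAAA", "ATTAAA")  # canonical hexamers named in the text
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def _first_signal(seq):
    """Offset of the first polyA hexamer in `seq`, or None if absent."""
    hits = [i for i in (seq.find(motif) for motif in POLYA_SIGNALS) if i >= 0]
    return min(hits) if hits else None

def nearest_polya(seq, cut):
    """Distances (bp) from `cut` to the nearest polyA signal on each strand.

    Returns (top_strand_distance, bottom_strand_distance); each entry is None
    when no signal is found in that direction.
    """
    seq = seq.upper()
    top = _first_signal(seq[cut:])
    bottom = _first_signal(revcomp(seq[:cut]))
    return top, bottom

# Toy sequence (hypothetical, not a CCR5 sequence) with one signal per strand.
toy = "CCTTTATTGG" + "ACGTACGT" + "GGGAATAAACC"
print(nearest_polya(toy, 14))
```

A site where either distance is short would be expected to resemble the provirus sites proximal to the 3’LTR polyA signal, while a site with long distances in both directions would resemble the distal sites.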
  • T7E1 T7 endonuclease I
  • scAAV self-complementary AAV vectors
  • the CCR5-specific RNPs were incorporated to enable cleavage of the host CCR5 locus for donor tag insertion, whilst the envO or nef1 RNPs were incorporated to allow excision of the polyA-less MND-GFP dsDNA tag from the scAAV donor templates that contain either envO or nef1 target sites immediately flanking both the 5’ and 3’ ends of the polyA-less MND-GFP tag.
  • cells were left for 3 hours to recover before being infected with scAAV donor vectors scAAV6-envO-GFPΔpA or scAAV6-nef1-GFPΔpA at a multiplicity of infection of 350,000 vector genomes per cell.
  • treated cells were analyzed by flow cytometry to detect GFP expressing cells that had been tagged with the polyA-less MND-GFP dsDNA tag that was excised from the scAAV donor.
  • Tagging of the CCR5 locus can only occur when the scAAV6 donor reaches the nucleus of a cell and the RNP cleaves the CCR5 target site. Therefore, when the efficiency of AAV transduction (28.9%) is used as a proxy for donor delivery efficiency and the target site gene editing rate (22.9%, CCR5-1; 34.1%, CCR5-2) is used as a proxy for CCR5 locus cleavage, the efficiency of the CCR5 tagging process in primary human CD4+ T cells can be approximated.
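One simple way to make this approximation concrete is to treat donor delivery and target site cleavage as independent events and multiply their rates. Independence is an assumption made here for illustration and is not stated in the text. The sketch below computes only this upper bound on the fraction of cells in which both prerequisites are met, not a measured tagging rate.

```python
AAV_TRANSDUCTION = 0.289                             # proxy for donor delivery
EDITING_RATE = {"CCR5-1": 0.229, "CCR5-2": 0.341}    # proxy for locus cleavage

def joint_fraction(p_delivery, p_cleavage):
    """Fraction of cells with both donor and cleavage, assuming independence."""
    return p_delivery * p_cleavage

both = {site: joint_fraction(AAV_TRANSDUCTION, rate)
        for site, rate in EDITING_RATE.items()}

for site, frac in both.items():
    print(f"{site}: {frac * 100:.1f}% of cells have donor and cleaved target")
```

Under the independence assumption, roughly 6.6% (CCR5-1) and 9.9% (CCR5-2) of cells would have both a delivered donor and a cleaved target site. The observed fraction of GFP-expressing cells can then be compared against these values to gauge how efficiently the excised tag is inserted.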
  • Example 3 Optimization of HIV provirus tagging in DHIV-infected CD4+ primary T cells.
  • Optimal conditions for tagging of primary CD4+ T cells will be determined to maximize the number of cells available for downstream genetic analysis and/or clonal expansion.
  • Initial studies will be carried out using an in vitro model of HIV infection using activated CD4+ T cells that are transduced with the replication-defective HIV molecular clone DHIV (Bosque & Planelles, Blood. 2009; 113(1):58-65) at a multiplicity of 1 infectious unit/cell. This will enable optimization of parameters required for efficient provirus tagging in a setting where HIV+ cells are abundant.
  • AAV efficiently transduces CD4+ T cells (Sather et al., Sci Transl Med. 2015;7(307):307ra156; Wang et al., Nucleic Acids Res. 2016;44(3):e30), does not impact T-cell viability, and can facilitate gene insertion by HDR at levels of over 30% following electroporation of Cas9 mRNA (Gwiazda et al., Mol Ther. 2016;24(9):1570-80).
  • Self-complementary AAV6 (scAAV6) vectors that provide a dsDNA template for CRISPR/Cas9-mediated donor release and subsequent insertion can be used, which can facilitate higher levels of targeted integration than plasmid donors.
  • the scAAV vectors will contain a polyA signal-less MND-GFP donor flanked by target sequences from conserved regions of env or nef proximal to the 3’ LTR polyA signal.
  • Cells will be electroporated with env- or nef-specific Cas9 RNPs, and the AAV6 donor vector will then be delivered at increasing MOIs (20,000; 100,000; 500,000 vg/cell) 2-4 hours post electroporation as previously described (Schumann et al., Proc Natl Acad Sci U S A. 2015;112(33):10437-42; Gwiazda et al., Mol Ther. 2016;24(9):1570-80). Levels of gene tagging will be assessed by flow cytometry at 2 and 6 days post electroporation, and by Illumina sequencing.
  • AAV vectors will be delivered 24 hours before Cas9 RNP electroporation, as this may increase gene insertion levels.
  • Example 4 CAPTIV isolation of CD4+ T cells from HIV+ participant PBMCs. Provirus tagging in samples from HIV+ participants will be performed. Initial analyses will use samples from participants not receiving ART, since the frequency of HIV+ cells will be higher, before moving onto samples from participants receiving ART. De-identified participant samples chosen to include similar numbers of men and women, and reflect the US distribution of minority groups, will be obtained from the University of Washington/Fred Hutch Center for AIDS Research (CFAR) HIV specimen repository, in cryopreserved aliquots of 5 million PBMCs. Participant CD4+ T cells will be electroporated as previously described (Schumann et al., Proc Natl Acad Sci U S A.
  • Example 5 Analysis of HIV integration sites in HIV+ participant samples. The preference for HIV integration in active genes in vitro has been well established (43-45), but the potential role that an HIV integration event may play in the in vivo expansion and/or persistence of an HIV infected cell has only recently been recognized (Cohn et al., Cell. 2015;160(3):420-32; Maldarelli et al., Science. 2014;345(6193):179-83; Wagner et al., Science. 2014;345(6196):570-3). Currently-used methods to isolate HIV integration sites require multiple PCR steps to isolate the integration sites, and multiple reactions to achieve representative sampling of the pool of infected cells (Maldarelli et al., Science.
  • Paired end Illumina sequencing will then be performed with captured DNA, and integration site locations determined using blastn.
  • this method identified known integration sites (Sunshine et al., J Virol. 2016;90(9):4511-9; Symons et al., Retrovirology. 2017; 14(1):2) from as few as 1000 ACH-2 cells, although the ultimate sensitivity of this approach should be much better.
  • a single MiSeq run would, assuming a 100% capture rate, theoretically allow sequencing of all integration sites in a sample of over 10,000 HIV+ cells.
  • Example 6 Linkage of provirus genomes and their integration sites in HIV+ participant samples. Recent studies have provided important information about the completeness of integrated genomes in HIV-infected cells and their location within each infected cell (Ho et al., Cell. 2013;155(3):540-51; Bruner et al., Nat Med. 2016;22(9):1043-9), and these studies have helped inform the composition of the latent HIV reservoir. However, the methods used to obtain this data rely upon a long-range PCR step followed by nested PCR and sequence assembly from multiple overlapping fragments. CAPTIV offers a way to simplify this process, while increasing the number of individual complete provirus sequences that can be obtained from a single participant sample.
  • CAPTIV will be used to collect complete provirus sequences in combination with their integration sites.
  • a HIV probe capture technique will first be used to obtain complete provirus sequences and integration junctions from individual HIV+ CD4+ T cell clones that have been expanded in culture after CAPTIV isolation.
  • 10X Genomics whole genome sequencing (WGS) platform can be used to obtain paired provirus sequences and integration sites from bulk populations of HIV+ participant CD4+ T cells isolated by CAPTIV.
  • the 10X Genomics WGS platform allows the partition of 15-30Kb genomic DNA fragments onto individual beads that are partitioned into droplets and assigned unique barcodes.
  • the partitioning of barcoded DNA fragments larger than the HIV provirus enables downstream Illumina sequencing that allows identification of paired whole provirus sequences and integration sites from a pool of mixed genomic DNA. Partial provirus sequences created during the shearing step can also be used to generate complete proviral genomes through pairing with a matched integration site partial provirus.
  • the 10X Genomics WGS platform can be run using as little as 1 ng (167 cells) of sample DNA (Zheng et al., Nat Biotechnol. 2016;34(3):303-11; Zook et al., Sci Data. 2016;3:160025), which is important in that the minimal input cell number for disclosed hybridization approaches has not yet been defined (although it is ≤1000 cells).
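The equivalence between 1 ng of DNA and 167 cells can be reproduced with a short conversion. The sketch below assumes about 6 pg of genomic DNA per diploid human cell, a commonly used approximation that is not stated in the text. The constant and function names are illustrative.

```python
PG_DNA_PER_DIPLOID_CELL = 6.0  # assumption: ~6 pg genomic DNA per human cell

def cells_from_ng(dna_ng):
    """Approximate number of diploid human cells that yield `dna_ng` of DNA."""
    return dna_ng * 1000.0 / PG_DNA_PER_DIPLOID_CELL

def ng_from_cells(n_cells):
    """Approximate DNA mass (ng) recoverable from `n_cells` diploid cells."""
    return n_cells * PG_DNA_PER_DIPLOID_CELL / 1000.0

print(round(cells_from_ng(1.0)))   # cell equivalents in 1 ng
print(ng_from_cells(1000))         # ng of DNA in 1000 cells
```

Under this assumption, the 1000-cell input used for probe capture corresponds to about 6 ng of genomic DNA, which puts the two input requirements on the same scale.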
  • Example 7 Analysis of provirus genomes, integration sites, and T cell receptor (TCR) sequences in individual HIV+ participant samples.
  • TCR sequence data will be compared with known antigen-specific TCR sequences (Shugay et al., Nucleic Acids Res. 2017. doi: 10.1093/nar/gkx760. PubMed PMID: 28977646), to identify potential clones that are responsive to these viruses. Participants with common and well-studied HLA types will also be evaluated and the epitope specificity of individual clones will be inferred using recently developed methods for TCR specificity prediction (Dash et al., Nature. 2017;547(7661):89-93; Glanville et al., Nature. 2017;547(7661):94-8). Paired provirus/TCR information allows determination of whether replication-competent HIV is truly enriched within CD4+ T cells with a particular antigen specificity.
  • variants of protein and/or nucleic acid sequences disclosed herein can also be used. Variants include sequences with at least 70% sequence identity, 80% sequence identity, 85% sequence identity, 90% sequence identity, 95% sequence identity, 96% sequence identity, 97% sequence identity, 98% sequence identity, or 99% sequence identity to the protein and nucleic acid sequences described or disclosed herein wherein the variant exhibits substantially similar or improved biological function.
  • % sequence identity refers to a relationship between two or more sequences, as determined by comparing the sequences.
  • identity also means the degree of sequence relatedness between protein and nucleic acid sequences as determined by the match between strings of such sequences.
  • Identity (often referred to as “similarity”) can be readily calculated by known methods, including those described in: Computational Molecular Biology (Lesk, A. M., ed.) Oxford University Press, NY (1988); Biocomputing: Informatics and Genome Projects (Smith, D. W., ed.) Academic Press, NY (1994); Computer Analysis of Sequence Data, Part I (Griffin, A. M., and Griffin, H.
  • Variants also include nucleic acid molecules that hybridize under stringent hybridization conditions to a sequence disclosed herein and provide the same function as the reference sequence.
  • Exemplary stringent hybridization conditions include an overnight incubation at 42°C in a solution including 50% formamide, 5XSSC (750 mM NaCl, 75 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5XDenhardt's solution, 10% dextran sulfate, and 20 µg/ml denatured, sheared salmon sperm DNA, followed by washing the filters in 0.1XSSC at 50°C.
  • Changes in the stringency of hybridization and signal detection are primarily accomplished through the manipulation of formamide concentration (lower percentages of formamide result in lowered stringency), salt conditions, or temperature.
  • washes performed following stringent hybridization can be done at higher salt concentrations (e.g. 5XSSC).
  • Variations in the above conditions may be accomplished through the inclusion and/or substitution of alternate blocking reagents used to suppress background in hybridization experiments.
  • Typical blocking reagents include Denhardt's reagent, BLOTTO, heparin, denatured salmon sperm DNA, and commercially available proprietary formulations.
  • the inclusion of specific blocking reagents may require modification of the hybridization conditions described above, due to problems with compatibility.
  • variant proteins include conservative amino acid substitutions.
  • a conservative amino acid substitution may not substantially change the structural characteristics of the reference sequence (e.g., a replacement amino acid should not tend to break a helix that occurs in the reference sequence or disrupt other types of secondary structure that characterizes the reference sequence). Examples of art-recognized polypeptide secondary and tertiary structures are described in Proteins, Structures and Molecular Principles (Creighton, Ed., W. H. Freeman and Company, New York (1984)); Introduction to Protein Structure (C. Branden & J. Tooze, eds., Garland Publishing, New York, N.Y. (1991)); and Thornton et al., Nature, 354:105 (1991).
  • a “conservative substitution” involves a substitution found in one of the following conservative substitutions groups: Group 1: Alanine (Ala), Glycine (Gly), Serine (Ser), Threonine (Thr); Group 2: Aspartic acid (Asp), Glutamic acid (Glu); Group 3: Asparagine (Asn), Glutamine (Gln); Group 4: Arginine (Arg), Lysine (Lys), Histidine (His); Group 5: Isoleucine (Ile), Leucine (Leu), Methionine (Met), Valine (Val); and Group 6: Phenylalanine (Phe), Tyrosine (Tyr), Tryptophan (Trp).
  • amino acids can be grouped into conservative substitution groups by similar function or chemical structure or composition (e.g., acidic, basic, aliphatic, aromatic, sulfur-containing).
  • an aliphatic grouping may include, for purposes of substitution, Gly, Ala, Val, Leu, and Ile.
  • Other groups containing amino acids that are considered conservative substitutions for one another include: sulfur-containing: Met and Cysteine (Cys); acidic: Asp, Glu, Asn, and Gln; small aliphatic, nonpolar or slightly polar residues: Ala, Ser, Thr, Pro, and Gly; polar, negatively charged residues and their amides: Asp, Asn, Glu, and Gln; polar, positively charged residues: His, Arg, and Lys; large aliphatic, nonpolar residues: Met, Leu, Ile, Val, and Cys; and large aromatic residues: Phe, Tyr, and Trp. Additional information is found in Creighton (1984) Proteins, W.H. Freeman and Company.
  • nucleic acid sequences are shown using standard letter abbreviations for nucleotide bases, as defined in 37 C.F.R. 1.822. Only one strand of each nucleic acid sequence is shown, but the complementary strand is understood as included.
  • each embodiment disclosed herein can comprise, consist essentially of or consist of its particular stated element, step, ingredient or component.
  • the terms “include” or “including” should be interpreted to recite: “comprise, consist of, or consist essentially of.”
  • the transition term “comprise” or “comprises” means includes, but is not limited to, and allows for the inclusion of unspecified elements, steps, ingredients, or components, even in major amounts.
  • the transitional phrase “consisting of” excludes any element, step, ingredient or component not specified.
  • the transition phrase “consisting essentially of” limits the scope of the embodiment to the specified elements, steps, ingredients or components and to those that do not materially affect the embodiment. A material effect would cause a statistically significant reduction in the ability to specifically isolate latently infected HIV cells from a biological sample.
  • the term “about” has the meaning reasonably ascribed to it by a person skilled in the art when used in conjunction with a stated numerical value or range, i.e. denoting somewhat more or somewhat less than the stated value or range, to within a range of ±20% of the stated value; ±19% of the stated value; ±18% of the stated value; ±17% of the stated value; ±16% of the stated value; ±15% of the stated value; ±14% of the stated value; ±13% of the stated value; ±12% of the stated value; ±11% of the stated value; ±10% of the stated value; ±9% of the stated value; ±8% of the stated value; ±7% of the stated value; ±6% of the stated value; ±5% of the stated value; ±4% of the stated value; ±3% of the stated value; ±2% of the stated value; or ±1% of the stated value.

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Organic Chemistry (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Biochemistry (AREA)
  • Medicinal Chemistry (AREA)
  • Virology (AREA)
  • Immunology (AREA)
  • Physics & Mathematics (AREA)
  • Veterinary Medicine (AREA)
  • Cell Biology (AREA)
  • Biophysics (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Public Health (AREA)
  • Plant Pathology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Mycology (AREA)
  • Epidemiology (AREA)
  • Oncology (AREA)
  • AIDS & HIV (AREA)
  • Tropical Medicine & Parasitology (AREA)
  • Communicable Diseases (AREA)
  • Hematology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • General Chemical & Material Sciences (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

Methods to tag and isolate cells infected with the human immunodeficiency virus (HIV) provirus from a human patient are described. The methods do not affect the viability of the tagged and isolated cells, so that such cells can be expanded and maintained to study the characteristics of cells that harbor virus. Information derived from the studies can be used to further the development of clinical treatments to eradicate HIV infection in individuals.

Description

METHODS TO TAG AND ISOLATE CELLS
INFECTED WITH THE HUMAN IMMUNODEFICIENCY VIRUS
CROSS-REFERENCE TO RELATED APPLICATION
[0001] This application claims priority to U.S. Provisional Patent Application No. 62/793,323 filed January 16, 2019, the entire contents of which are incorporated by reference herein in their entirety.
STATEMENT OF GOVERNMENT SUPPORT
[0002] This invention was made with government support under AH 40900 awarded by the National Institutes of Health. The government has certain rights in the invention.
STATEMENT REGARDING SEQUENCE LISTING
[0003] The Sequence Listing associated with this application is provided in text format in lieu of a paper copy and is hereby incorporated by reference into the specification. The name of the text file containing the Sequence Listing is F053-0087PCT_ST25.txt. The text file is 258 KB, was created on January 15, 2020, and is being submitted electronically via EFS-Web.
FIELD OF THE DISCLOSURE
[0004] The current disclosure provides methods to tag and isolate cells infected with the human immunodeficiency virus (HIV) from a human patient. The methods do not affect the viability of the tagged and isolated cells, so that such cells can be expanded and maintained to study the characteristics of cells that harbor virus. Information derived from the studies can be used to further the development of clinical treatments to eradicate HIV infection in individuals.
BACKGROUND OF THE DISCLOSURE
[0005] Human immunodeficiency virus (HIV) is a retrovirus that causes acquired immunodeficiency syndrome (AIDS), a medical condition where progressive failure of the immune system leads to life-threatening opportunistic infections. HIV infection, while treatable for long periods of time, remains a largely incurable infection.
[0006] The clinical course of HIV infection can vary according to a number of factors, including the subject's genetic background, age, general health, nutrition, treatment received, and the HIV subtype. In general, most individuals develop flu-like symptoms within a few weeks or months of infection. The symptoms can include fever, headache, muscle aches, rash, chills, sore throat, mouth or genital ulcers, swollen lymph glands, joint pain, night sweats, and diarrhea. The intensity of the symptoms can vary from mild to severe depending upon the individual.
[0007] During the acute phase, HIV viral particles are attracted to and enter cells expressing the appropriate CD4 receptor molecules, such as CD4-expressing T cells of the immune system. Once the virus has entered a CD4-expressing T cell, HIV encoded reverse transcriptase generates a proviral DNA copy of the HIV RNA and the proviral DNA becomes integrated into the CD4-expressing T cell genomic DNA. It is this HIV provirus that is replicated by the CD4-expressing T cell. When replicated, new HIV virions are produced which can then leave the originally infected T cell and proceed to infect additional CD4-expressing T cells. Without treatment, this process kills the originally infected T cell, leading to depletion of T cells in infected patients.
[0008] Generally, the acute phase of HIV infection subsides and is followed by a latent period. During the latent period, the subject's CD4 cell numbers rebound, although not to pre-infection levels. Most patients also begin to show detectable levels of anti-HIV antibody in their blood. Also during the latent period, there is reduced detectable viral replication in peripheral blood mononuclear cells and reduced culturable virus in peripheral blood as compared to during the acute phase of infection.
[0009] Anti-retroviral therapies (ART) have greatly improved the outcome for HIV-infected individuals. For individuals receiving ART, the latent period may extend for several decades or more, and most patients on ART do not have detectable HIV in their blood. However, ART does not cure HIV infection, and once therapy is stopped, HIV in blood rebounds to its pre-treatment level. The ability to stop treatment without viral rebound would be beneficial because long-term treatment with ART is associated with other serious health considerations such as bone or renal toxicity, insulin resistance, and accelerated cardiovascular disease. However, stopping treatment is discouraged due to the likelihood of viral rebound from the latent HIV reservoir.
[0010] The latent HIV reservoir refers to cells infected with the HIV provirus when the provirus is not being actively replicated to create new virions. In an ART-suppressed patient, one in ten thousand resting T cells contains HIV DNA, and one per million resting T cells contains provirus that can be reactivated to produce infectious virus. Although values can vary over 2 logs, the HIV provirus is found in 0.01-0.1% of peripheral blood mononuclear cells (PBMCs), and 0.003% contain intact provirus. Based on these numbers, each vial of 5 million PBMCs from ART-treated subjects should contain 500-5,000 HIV+ cells, and 150 cells with intact provirus capable of reactivation.
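The per-vial estimates in this paragraph follow from multiplying the vial size by the quoted frequencies. The short sketch below reproduces that arithmetic; the variable and function names are illustrative, and the frequencies are the ones stated in the paragraph rather than new data.

```python
PBMC_PER_VIAL = 5_000_000

def expected_cells(total_cells, percent):
    """Expected number of cells given a frequency expressed in percent."""
    return total_cells * percent / 100.0

# Frequencies quoted in the text (percent of PBMCs).
provirus_low = expected_cells(PBMC_PER_VIAL, 0.01)
provirus_high = expected_cells(PBMC_PER_VIAL, 0.1)
intact = expected_cells(PBMC_PER_VIAL, 0.003)

print(f"HIV+ cells per vial: {provirus_low:.0f}-{provirus_high:.0f}")
print(f"Cells with intact provirus per vial: {intact:.0f}")
```

The result, a few hundred to a few thousand target cells among five million, illustrates why even a low rate of background reporter expression can outnumber the true positives.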
[0011] A cure for HIV is likely to require a thorough understanding of the latent HIV reservoir and the mechanisms by which it is maintained and/or reactivated. Most current methods to quantitatively measure the replication competent latent HIV reservoir are difficult to perform, time-consuming, and expensive. The difficulty associated with identifying these rare cells within the background of uninfected cells has made it difficult to fully define the biology of the HIV reservoir, including the mechanisms by which HIV latency is maintained, and the processes leading to reactivation and production of infectious HIV.
[0012] Advances in genetic engineering have allowed the targeted insertion of marker sequences into precise locations within genomic DNA. However, most of these systems rely on homologous recombination between long stretches of homology between a genetic engineering molecule and the area of the genome to be altered. This makes genetic engineering of HIV exceedingly difficult due to the high level of sequence variability between different strains of HIV, and within each host. Even minor sequence differences between a genetic engineering molecule and the area of the genome to be altered can greatly reduce the efficiency of many forms of genetic engineering, and particularly those that rely on homology-directed repair (HDR).
SUMMARY OF THE DISCLOSURE
[0013] The current disclosure provides use of gene editing systems to efficiently tag and isolate cells of the latent HIV reservoir from a patient. Isolating these cells allows the study of latently infected cells to determine factors associated with HIV reservoir maintenance and reactivation, and conditions allowing potential eradication. For example, the systems and methods disclosed herein can be used to link latent viral infection with integration sites in the genome and/or with particular T cell receptors of a given latently infected cell. This type of information can help to elucidate the types of cells and/or integration sites that allow a latent virus to be replication competent. Factors that result in viral reactivation can also be assessed. Thus, the current disclosure provides systems and methods to generate libraries of cells infected with HIV virus that allow the study of the latent HIV reservoir to further the development of clinical treatments.
[0014] In particular embodiments, the current disclosure provides tagging latently infected cells with genetic constructs that allow sorting and collection of the tagged cells. Genetic tagging of T cells infected with latent HIV provirus that is capable of reactivation is not easily achieved due to the sequence variability between and among different strains of HIV. The current disclosure overcomes this obstacle by utilizing homology-independent targeted integration (HITI) of genetic constructs. HITI can utilize small micro-regions of homology to a target site but does not require them. Thus, unlike the more commonly used homology-directed repair for gene editing which requires long stretches of homology, HITI can be accomplished without homology sequences or with conserved target sites for genetic tag insertion of fewer than 30 (e.g., 23) base pairs (bp).
[0015] The use of HITI alone to target and isolate latently infected HIV cells was not sufficient to lead to the systems and methods as they are disclosed herein, however, because of an unacceptable level of background noise in expression of provided reporter constructs. Based on the use of HITI alone, the background noise led to the isolation of too many cells that were not in fact latently infected with HIV. Because of the extremely low number of true latently infected cells, even a relatively small amount of background expression/noise can unacceptably mask the identity and collection of truly latently infected cells.
[0016] To overcome the challenge associated with background expression of reporter genes within provided genetic constructs, the inserted genetic constructs were generated to be polyA signal-less. This approach was recognized as feasible because constructs that properly insert into the HIV provirus genome can be stably expressed using the HIV genome’s endogenous polyA signals. If the construct is not inserted within the HIV genome, however, the lack of the polyA signal results in unstable mRNA that will not subsequently be translated to any significant degree. This approach thus significantly reduces, if not eliminates, the noted background noise from the systems and methods disclosed herein.
[0017] After genetic modification, cells expressing the inserted polyA signal-less reporter constructs utilize endogenous polyA signals within the HIV genome to maintain reporter expression, and thus can be isolated, maintained, and expanded to study HIV, such as the mechanisms of HIV latency and reactivation.
BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS
[0018] Many of the drawings submitted herein are better understood in color. Applicant considers the color versions of the drawings as part of the original submission and reserves the right to present color images of the drawings in later proceedings.
[0019] FIG. 1. CRISPR-Assisted Provirus Tagging In Vitro (CAPTIV). A DNA tag is inserted into the HIV provirus of infected cells following delivery of HIV-specific Cas9 RNPs and a HITI donor AAV substrate. Tag insertion into the HIV provirus enables isolation of HIV infected cells via cell sorting, and subsequent expansion.
[0020] FIGs. 2A-2D. HITI-mediated provirus tagging in ACH-2 cells. (2A) experimental timeline and plasmid donor substrate for ACH-2 provirus tagging; (2B) GFP expression 24 hours post electroporation with pTia1-MND-GFP and eGFP2, Tia1, or HIV pol-specific POL1 Cas9 RNPs; (2C) PCR amplicons used to detect HITI junctions (left) and 5’ to 3’ sequences for provirus (Pol1) and donor flanking (Tia1) Cas9 target sites pre-HITI (right). PCR primers and target sites are shown for MND (striped), SV40pA (spotted), and pol (labeled); 5Tia1-MND (SEQ ID NO: 1); 3Tia1-SV40pA (SEQ ID NO: 2); and HIV POL1 site (SEQ ID NO: 3); (2D) Expected 5’ HITI junction PCR products (asterisks) of 255bp (forward orientation) and 318bp (reverse orientation) are shown. Sanger chromatograms for sequenced PCR product showing fusion of the pol target site (upper case underlined) with Tia1-MND (lower case underlined) or Tia1-SV40pA (lower case underlined) target sites. MND - MND promoter and pA - SV40 polyA signal are listed; 5’pol1-5’Tia1 (SEQ ID NO: 4) and 5’pol1-3’Tia1 (SEQ ID NO: 5).
[0021] FIGs. 3A, 3B. PolyA signal-less construct validation. (3A) insertion sites in the pol or nef genes of pNL4-3 for the polyA signal-less MND-GFP expression cassette are shown along with locations of known polyA signals (striped triangles) or predicted polyA signals (spotted triangles); (3B) GFP expression in 293 cells at 72 hours post transfection with each construct.
[0022] FIGs. 4A-4D. PolyA signal-less provirus tagging in ACH-2 cells. (4A) validation of a conserved env gRNA target site proximal to the 3’ LTR by cleavage of pNL4-3 in 293 cells at 48 hours post Cas9 RNP electroporation. T7 endonuclease I cleavage of a PCR fragment yields fragments (asterisks) of 390bp + 166bp (envO) if the target site is disrupted (SEQ ID NO: 6); (4B) experimental timeline and plasmid donors used for polyA signal-less provirus tagging in ACH-2 cells; (4C) GFP expression at 6 days post electroporation for cells transfected with pTia1-MND-GFP or pCAPTIV-DrA, with or without control or HIV-specific Cas9 RNPs; (4D) junction PCR with primers between GFP and env (410 bp) shows that tag insertion occurred at the envO locus.
[0023] FIGs. 5A-5D. Tagged ACH-2 cells are viable and functional. Following CAPTIV tagging at the envO target site GFP+ ACH-2 cells were seen by flow cytometry after 6 days (5A) and were sorted after 8 and 24 days to enrich for GFP+ cells (5B, 5C). GFP+ cells proliferate in culture after 2 rounds of sorting (5D).
[0024] FIGs. 6A-6C. HIV provirus and integration site sequencing in ACH-2 cells via probe capture analysis. (6A) HIV-probe capture sequencing workflow. HIV-specific DNA was pulled down from ACH-2 genomic DNA using sequential 120bp probes tiled across the HIV reference sequence, and sequenced by Illumina; (6B) Illumina read coverage of the ACH-2 provirus and consensus sequence comparison (identity and deletions) with the HIV HXB2 reference sequence; (6C) Examples of overlapping reads for 5’ LTR and 3’ LTR integration sites and NCBI blastn mapping to chromosome 7 (Gene NT5C3A) and chromosome 9* (Gene SLC25A25-AS) (SEQ ID NOs: 7-19 and 58).
[0025] FIG. 7. CAPTIV sgRNA target sites in HIV env and nef. Alignment of a consensus ACH-2 HIV sequence with equivalent regions of NL4-3 (Genbank accession: AF324493.2) and HXB2 (Genbank accession: K03455.1). sgRNA target sites (boxes) and PCR primers used to amplify CAPTIV tagging junctions (black arrows) are also shown (SEQ ID NOs: 20-22). Reference sequences for ACH-2 (SEQ ID NO: 80), HXB2 (SEQ ID NO: 81), and NL4-3 (SEQ ID NO: 82) are also provided.
[0026] FIG. 8. Organization of the human CCR5 locus (Genbank Accession number AH005786.2, nucleotides 2077-8135) showing gRNA target sites CCR5-1 and CCR5-2 (Black boxes), plus the closest putative canonical AATAAA or ATTAAA polyA signals (Open arrows) on the sense or anti-sense strand for each CCR5 gRNA target site. Donor tag insertions and the distance to the nearest orientation dependent 3’ canonical polyA signals are shown. Locations of primers used to amplify target site-specific PCR products for T7E1 assays are shown (Black arrows).
[0027] FIG. 9. Activated CD4+ T cells were electroporated with CCR5-1 or CCR5-2 targeting RNPs using the Neon electroporation system. At day 6 post-electroporation, genomic DNA was isolated from cells and a PCR product spanning each target site was amplified and used to determine the levels of gene editing that had occurred via the T7 endonuclease I (T7E1) assay. The T7E1 assay showed that gene editing had occurred at the CCR5-1 or CCR5-2 target sites in 22.9% and 34.1% of cells, respectively. Bands indicative of PCR product cleavage and gene editing are highlighted (asterisks).
[0028] FIGs. 10A-10C. (10A) Diagram of Control AAV and Donor AAV. Open diamonds indicate flanking envO or nef1 gRNA target sites. (10B) Primary human CD4+ T cells were electroporated with Cas9 RNPs containing crRNAs specific for CCR5-1 in combination with a second crRNA that was specific for either the HIV-specific gRNA target sequence envO or the HIV-specific gRNA target sequence nef1. These cells were then infected with scAAV donor vectors scAAV6-envO-GFPApA or scAAV6-nef1-GFPApA at a multiplicity of infection of 350,000 vector genomes per cell. At 6 days post electroporation, treated cells were analyzed by flow cytometry to detect GFP expressing cells that had been tagged with the polyA-less MND-GFP dsDNA tag that was excised from the scAAV donor. (10C) Levels of GFP expression in cells that were either untreated, mock-electroporated with no Cas9 RNPs then infected with control AAV vector, mock-electroporated with no Cas9 RNPs then infected with donor AAV vectors, or electroporated with two Cas9 RNPs (CCR5-1 + envO or nef1) then infected with donor AAV vector.
[0029] FIGs. 11A-11C. (11A) Diagram of Control AAV and Donor AAV. Diamonds indicate flanking envO or nef1 gRNA target sites. (11B) Primary human CD4+ T cells were electroporated with Cas9 RNPs containing crRNAs specific for CCR5-2 in combination with a second crRNA that was specific for either the HIV-specific gRNA target sequence envO or the HIV-specific gRNA target sequence nef1. These cells were then infected with scAAV donor vectors scAAV6-envO-GFPApA or scAAV6-nef1-GFPApA at a multiplicity of infection of 350,000 vector genomes per cell. At 6 days post electroporation, treated cells were analyzed by flow cytometry to detect GFP expressing cells that had been tagged with the polyA-less MND-GFP dsDNA tag that was excised from the scAAV donor. (11C) Levels of GFP expression in cells that were either untreated, mock-electroporated with no Cas9 RNPs then infected with control AAV vector, mock-electroporated with no Cas9 RNPs then infected with donor AAV vectors, or electroporated with two Cas9 RNPs (CCR5-2 + envO or nef1) then infected with donor AAV vector.
[0030] FIG. 12. Protein and nucleic acid sequences including or encoding: GFP (SEQ ID NO:23); MND promoter (SEQ ID NO:24); pCAPTIV-DrA-EhnO (SEQ ID NO: 25); pCAPTIV-DrA-Env1 (SEQ ID NO:26); pCAPTIV-ApA-Nef1 (SEQ ID NO:27); pCAPTIV-ApA-Nef2 (SEQ ID NO: 28); pNL4-3-Nef-MND-GFP (SEQ ID NO: 29); pNL4-3-Nef-MND-GFP-R (SEQ ID NO: 30); pNL4-3-Pol-MND-GFP (SEQ ID NO: 31); pNL4-3-Pol-MND-GFP-R (SEQ ID NO: 32); pTia1-MND-GFP (SEQ ID NO: 33); Lachnospiraceae bacterium ND2006 Reference Sequence (SEQ ID NO: 34); Acidaminococcus sp. BV3L6 Reference Sequence (SEQ ID NO: 35); F2A (SEQ ID NO: 36); E2A (SEQ ID NO: 37); T2A (SEQ ID NOs: 38 or 39); and P2A (SEQ ID NOs: 40 or 41); Streptococcus pyogenes serotype M1 Cas9 protein (UniProt Accession Q99ZW2) (SEQ ID NO: 83); Streptococcus pyogenes serotype M1 Cas9 cds (nucleotides 854751-858857 of NCBI Reference Sequence: NC_002737.2) (SEQ ID NO: 84); Francisella tularensis type V CRISPR-associated protein Cpf1 (NCBI Reference Sequence: WP_003040289.1) (SEQ ID NO: 85); Acidaminococcus sp. BV3L6 type V CRISPR-associated protein Cpf1 (AsCpf1) (NCBI Reference Sequence: WP_021736722.1) (SEQ ID NO: 86); Lachnospiraceae bacterium MC2017 type V CRISPR-associated protein Cpf1 (NCBI Reference Sequence: WP_044910712.1) (SEQ ID NO: 87); Staphylococcus aureus Cas9 (saCas9) (GenBank Reference Sequence: AYD60511.1) (SEQ ID NO: 88); Reference sequence for particular variant saCas9 sequences described herein (SEQ ID NO: 89).
DETAILED DESCRIPTION
[0031] Human immunodeficiency virus (HIV) is a retrovirus that causes acquired immunodeficiency syndrome (AIDS), a medical condition where progressive failure of the immune system leads to life-threatening opportunistic infections.
[0032] Anti-retroviral therapies (ART) have greatly improved the outcome for HIV-infected individuals. However, ART is associated with other serious health considerations such as bone or renal toxicity, insulin resistance, and accelerated cardiovascular disease. Thus, the need for a cure for HIV is widely recognized, and a number of potentially curative strategies are currently being investigated, including gene therapy, latency reversal, immunotherapy, and others.
[0033] Finding a cure for HIV is complicated by the fact that it establishes latent infection in long-lived cells, forming a reservoir that persists in infected individuals even after years of ART. In HIV+ patients, 0.01-1% of peripheral blood mononuclear cells (PBMCs) contain HIV depending on treatment status. For example, although values can vary over 2 logs (Escaich et al., AIDS Res Hum Retroviruses. 1992;8(10):1833-7), HIV provirus capable of reactivation is found in 0.1-1% of PBMCs from HIV+ patients not receiving therapy (Jurriaans et al., AIDS. 1992;6(7):635-41). In patients receiving ART, 0.01-0.1% of PBMCs contain the HIV provirus (Eriksson et al., PLoS Pathog. 2013;9(2):e1003174; Besson et al., Clin Infect Dis. 2014;59(9):1312-21), and 0.003% contain intact provirus (Ho et al., Cell. 2013;155(3):540-51). Based on these numbers, in an ART-suppressed patient, one in ten thousand resting T cells contains HIV DNA, and one per million resting T cells contains provirus that can be reactivated and produce infectious virus. Accordingly, each vial of 5 million PBMCs obtained from an untreated subject should contain 5,000-50,000 HIV+ cells, and each vial from treated subjects should contain 500-5,000 HIV+ cells, and 150 cells with intact provirus.
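The per-vial estimates in paragraph [0033] follow directly from the stated frequencies. The following minimal Python sketch reproduces that arithmetic; the function and variable names are illustrative assumptions and are not part of the disclosure.

```python
# Illustrative arithmetic only; all names are assumptions, not from the disclosure.
VIAL_PBMCS = 5_000_000  # PBMCs per vial, as stated in the text


def expected_cells(total_cells, percent):
    """Expected cell count for a frequency given as a percent of total cells."""
    return total_cells * percent / 100.0


# Frequencies (percent of PBMCs) taken from the paragraph above.
untreated_range = (expected_cells(VIAL_PBMCS, 0.1), expected_cells(VIAL_PBMCS, 1.0))
treated_range = (expected_cells(VIAL_PBMCS, 0.01), expected_cells(VIAL_PBMCS, 0.1))
intact_treated = expected_cells(VIAL_PBMCS, 0.003)

print(f"Untreated: {untreated_range[0]:,.0f}-{untreated_range[1]:,.0f} HIV+ cells per vial")
print(f"ART-treated: {treated_range[0]:,.0f}-{treated_range[1]:,.0f} HIV+ cells per vial")
print(f"ART-treated, intact provirus: {intact_treated:,.0f} cells per vial")
```

The sketch treats each frequency as a simple proportion of the vial and does not model donor-to-donor variation, which the text notes can span 2 logs.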
[0034] A cure for HIV is likely to require a thorough understanding of the latent HIV reservoir and the mechanisms by which it is maintained. Most current methods to measure quantitatively the replication competent latent HIV reservoir are difficult to perform, time-consuming, and expensive.
[0035] The two most common assays are the quantitative viral outgrowth assay (QVOA) and Tat/Rev Induced Limiting Dilution Assay (TILDA). These two assays have been described in Finzi et al (14 Nov. 1997) "Identification of a reservoir for HIV-1 in patients on highly active antiretroviral therapy," Science 278(5341): 1295-1300; and Procopio et al. (27 Jun. 2015) "A Novel Assay to Measure the Magnitude of the Inducible Viral Reservoir in HIV-infected Individuals," EBioMedicine 2(8):874-83.
[0036] These assays are difficult to perform due to a host of factors inherent in the HIV infection process. For one, there are no known extracellular markers that are associated with latently infected cells. Additionally, the reservoir is established in the very early stages of HIV infection, but the exact timing remains unknown. See Palmer S. (2014) "HIV Cure 101: Challenges in identifying and targeting the HIV reservoir." AIDS 2014 20th International AIDS Conference. Latency often occurs within infected cells prior to the onset of therapy and once it is established, it is not eliminated by currently available therapies. While chemotherapy can block reactivation of the latent reservoir by blocking the virion reproduction process, the virus can and does reactivate when chemotherapy is discontinued.
[0037] The rarity of latently-infected cells, and the difficulties associated with identifying and isolating them, have greatly hampered studies to define the mechanisms by which HIV latency is maintained, the factors leading to reactivation and production of infectious HIV, and the development of interventions designed to manipulate these processes for therapeutic benefit. Unlike other proposed approaches to identify latently infected cells that have relied upon DNA or RNA in situ hybridization and signal amplification (Deleage et al., Pathog. Immun. 2016; 1(1): 68-106; Puray-Chavez et al., Nat. Commun. 2017; 8(1): 1882), the approach disclosed herein allows the isolation of viable cells. This facilitates studies of the functional status and proliferative potential of reservoir cells.
[0038] The current disclosure provides systems and methods to allow the isolation and study of viable cells latently infected with HIV provirus. This allows the in-depth genotypic characterization of these cells to a level not previously possible. Such studies will be of widespread utility for HIV research and the development of curative HIV strategies and will also allow addressing fundamental questions about the HIV reservoir. Since the methods disclosed herein allow in vitro proliferation of HIV-infected cells, they also greatly simplify sequencing and other genetic testing, which can be much easier with larger numbers of clonal cells. In contrast to existing methods that utilize multiple step PCRs and require extensive sampling to gain representative sequence data, methods disclosed herein provide ways to obtain this sequence data from 1000 or fewer HIV+ cells, and include both full LTR sequences, which are often missed with PCR-based sequencing (Ho et al., Cell. 2013;155(3):540-51).
[0039] The methods disclosed herein utilize advances in targeted genetic engineering to tag cells infected with HIV that create the latent HIV reservoir. Most genetic engineering approaches typically include a targeting element for precise genome targeting and a cutting element for cutting the targeted genetic site. If no further elements are provided, the DNA will repair itself through non-homologous end joining (NHEJ) which is error prone. More particularly, NHEJ is performed on the two cleaved ends of DNA which can result in a non-perfect repair, such as base pairs being inserted or deleted resulting in insertions and deletions (indels) at the break site. Thus, this type of genetic engineering is most often used when disruption or silencing of an existing dysfunctional gene is required.
[0040] When insertion of a genetic construct is required, gene editing systems typically include a targeting element, a cutting element, and a genetic construct to be inserted which generally includes regions of homology and a functional sequence (e.g., a sequence to be expressed). In these approaches, the targeting element targets the cutting element to a specific genomic site for cutting, and the regions of homology provide for homology directed repair (HDR) which is less error prone than NHEJ. Following HDR repair, the provided genetic construct has been inserted within the target site. As is understood by one of ordinary skill in the art, regions of homology facilitate HDR based on sequence complementarity with the edited area of the genome.
[0041] While HDR is generally precise and efficient, it is cell cycle phase-dependent. Moreover, the high level of sequence heterogeneity between HIV strains, even within the same patient, complicates HDR-mediated gene insertion into the provirus, because even minor sequence differences between the sequences of a targeting element or regions of homology and the sequence of the targeted area of the genome can greatly reduce HDR efficiency (Deyle et al., Nucleic Acids Res. 2014;42(5):3119-24). Thus, HDR is poorly suited to heterogeneous genetic loci like the HIV provirus.
[0042] Like HDR, homology-independent targeted integration (HITI) requires target site cleavage. In particular embodiments, it differs from HDR, however, in that a linear dsDNA donor (marker) is inserted through a homology-independent NHEJ pathway (Lackner et al., Nat Commun. 2015;6:10237; Suzuki et al., Nature. 2016;540(7631):144-9) or a microhomology-mediated end joining (MMEJ) pathway. Thus, HITI does not require long and highly conserved regions of homology, and instead can successfully insert genetic constructs with minimal to no regions of homology. In particular embodiments, it can be advantageous to utilize micro-regions of homology, such as those including a conserved target site of 23bp. Moreover, HITI can occur with high efficiency, even in non-dividing cells. For at least these reasons that make HITI amenable to tagging of heterogeneous HIV proviruses, embodiments disclosed herein utilize HITI to tag cells of the latent HIV reservoir for isolation.
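Because HITI needs only a short conserved target site rather than long homology arms, one practical question is how often a candidate 23 bp site is present across variable proviral sequences. The following Python sketch illustrates one simple way to check this by exact matching on either strand. The function names and the toy sequences are hypothetical and are not HIV sequences or sequences from this disclosure; exact matching is a simplifying assumption that ignores any mismatch tolerance of a given nuclease.

```python
# Hypothetical sketch: exact-match screen for a short target site in variant sequences.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def has_target_site(provirus, target_site):
    """True if the target site occurs exactly on either strand of the sequence."""
    provirus = provirus.upper()
    target = target_site.upper()
    return target in provirus or reverse_complement(target) in provirus


def conserved_fraction(proviruses, target_site):
    """Fraction of sequences that carry an exact copy of the target site."""
    if not proviruses:
        return 0.0
    matches = sum(has_target_site(p, target_site) for p in proviruses)
    return matches / len(proviruses)


# Toy 23-character site and toy variants (arbitrary strings for illustration only).
toy_site = "ACGTACGTACGTACGTACGTAGG"
toy_variants = [
    "TT" + toy_site + "AA",                       # site on the forward strand
    "TT" + reverse_complement(toy_site) + "AA",   # site on the reverse strand
    "GG" + toy_site[:10] + "T" + toy_site[11:] + "CC",  # single mismatch
]
print(f"{conserved_fraction(toy_variants, toy_site):.2f}")
```

In this toy input, two of the three variants carry the site, so the screen reports a conserved fraction of about 0.67.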
[0043] In particular embodiments, HITI utilizes a targeting element, a cutting element, and a genetic construct to be inserted into the cut genome. In particular embodiments, the genetic construct to be inserted includes sequences that match or substantially match the targeting element sequences. In this manner, the genetic constructs to be inserted provide for small or micro-regions of homology to facilitate insertion of the genetic construct. To qualify as HITI, rather than HDR, these small or micro-homology sequences are 75 bp or less. In particular embodiments, the small or micro-homology sequences are 50 bp or less. In particular embodiments, the small or micro-homology sequences are 40 bp or less. In particular embodiments, the small or micro-homology sequences are 30 bp or less. In particular embodiments, the small or micro-homology sequences are 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, or 2 bp.
[0044] Using HITI, cells infected with the HIV provirus are tagged by inserting a genetic construct with a gene encoding a reporter within the HIV genome. Only those cells containing the target site should be tagged with the reporter. In particular embodiments, the reporter is a fluorescent protein that allows the tagged cells to be isolated by fluorescence activated cell sorting (FACS). FACS refers to a specialized type of flow cytometry. FACS provides a method for sorting a heterogeneous mixture of biological cells into two or more containers, one cell at a time, based upon the specific light scattering and fluorescent characteristics of each cell. FACS provides fast, objective and quantitative recording of fluorescent signals from individual cells as well as physical separation of cells of particular interest.
[0045] The use of HITI alone to target and isolate latently infected HIV cells was not sufficient to lead to the systems and methods as they are disclosed herein, however, because of an unacceptable level of background noise in expression of provided reporter constructs. Based on the use of HITI alone, the background noise led to the isolation of too many cells that were not in fact latently infected with HIV. Because of the extremely low number of true latently infected cells, even a relatively small amount of background expression/noise can unacceptably mask the identity and collection of truly latently infected cells.
[0046] To overcome the challenge associated with background expression of reporter genes within provided genetic constructs, the inserted genetic constructs were generated to lack a polyA signal. This approach was recognized as feasible because constructs that properly insert into the HIV provirus genome can be stably expressed using the HIV genome’s endogenous polyA signals. If the construct is not inserted within the HIV genome, however, the lack of the polyA signal results in unstable mRNA that will not subsequently be translated to any significant degree. As used herein, a polyA signal is a base sequence that leads to the addition of a polyA tail on transcribed mRNA. PolyA signals generally include 6 base pairs, such as ATTAAA and AATAAA. The polyA signal determines where on the mRNA molecule a polyA tail is added. PolyA tails refer to a sequence of adenines that are endogenously added to unprocessed mRNA along with a 5’ cap to stabilize the mRNA.
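Since the approach depends on where canonical polyA signals lie relative to an insertion site, a simple motif scan can help locate the AATAAA and ATTAAA hexamers on both strands of a sequence. The Python sketch below is one hypothetical way to do this; the function names and the toy sequence are assumptions for illustration. A hexamer match alone does not establish that a site functions as a polyadenylation signal in cells.

```python
# Hypothetical sketch: locate canonical polyA hexamers on both strands.
POLYA_SIGNALS = ("AATAAA", "ATTAAA")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_polya_signals(seq):
    """Return sorted (start, strand, hexamer) tuples.

    start is the 0-based index, on the given strand, of the first base of
    the match; strand is '+' for the given strand and '-' when the hexamer
    is read on the opposite strand.
    """
    seq = seq.upper()
    hits = []
    for signal in POLYA_SIGNALS:
        for strand, motif in (("+", signal), ("-", reverse_complement(signal))):
            start = seq.find(motif)
            while start != -1:
                hits.append((start, strand, signal))
                start = seq.find(motif, start + 1)  # allow overlapping matches
    return sorted(hits)


# Toy sequence (arbitrary, for illustration only).
toy_sequence = "GGCAATAAAGGCTTTATTGG"
for start, strand, signal in find_polya_signals(toy_sequence):
    print(start, strand, signal)
```

The distance from a candidate insertion site to the nearest downstream hit in the relevant orientation can then be computed from the returned start positions.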
[0047] The described genetic constructs that lack a polyA signal significantly reduce, if not eliminate, the noted background noise from the systems and methods disclosed herein. As described above, studies show that 0.01-0.1% of PBMCs contain the HIV provirus in infected patients. Therefore, background noise of 1% or above would result in >90% of events being background noise and not true provirus tagging events. Accordingly, in particular embodiments, background noise of greater than 0.1% is not acceptable.
[0048] Using the systems and methods disclosed herein, tagged and isolated cells remain viable, readily proliferate in culture, and are available for downstream genetic and other analysis. For example, expansion easily provides more than the 1000 cells required for HIV probe capture; typically 50% of T cell clones may expand to >2 million cells by day 28 of the protocol described in Riddell & Greenberg, J Immunol Methods. 1990; 128(2):189-201. Cells can then be subjected to HIV hybridization probe capture deep sequencing which allows HIV sequence enrichment prior to Next-Generation Sequencing (NGS). In this manner HIV integration sites, the complete provirus sequence, the T cell receptor of the cell, and other information can be identified. Identifying HIV integration sites in bulk pools of HIV+ cells can allow determination of integration site preferences of virus that makes up the latent reservoir. Identifying complete provirus sequences can reveal sequence or strain preferences associated with the latent reservoir. Identifying T cell receptors of latently infected cells can determine if particular TCRs are related to maintenance of the latent reservoir. In particular embodiments, a combination of HIV probe capture/Illumina sequencing and TCR-specific PCR can be used to determine complete provirus genome sequences, paired integration sites, and TCR sequences from individual HIV+ CD4+ T cells. Tens to hundreds of paired sequences can be obtained from isolated HIV+ CD4+ T cell clones, which will provide important insights into the role of antigen specificity in the maintenance of the HIV reservoir.
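The background-noise estimate in paragraph [0047] can be checked with a short calculation. The Python sketch below assumes, for illustration only, that background events and true tagging events are non-overlapping, that both are expressed as a percent of all cells, and that every provirus-containing cell is tagged (an optimistic upper bound on true events). The function name is hypothetical.

```python
# Illustrative arithmetic only; the model and names are simplifying assumptions.
def background_fraction(true_percent, background_percent):
    """Fraction of reporter-positive events that are background, not true tags."""
    return background_percent / (background_percent + true_percent)


# Provirus frequencies (percent of PBMCs) from the text: 0.01% and 0.1%.
for true_percent in (0.01, 0.1):
    for background_percent in (1.0, 0.1):
        fraction = background_fraction(true_percent, background_percent)
        print(
            f"true={true_percent}%  background={background_percent}%  "
            f"background share={fraction:.1%}"
        )
```

Under these assumptions, 1% background yields a background share above 90% across the stated 0.01-0.1% provirus range, and even 0.1% background gives a share of 50% or more, which is consistent with treating background above 0.1% as unacceptable.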
[0049] Although sample size will dictate the total number of integration sites and other types of information that can be sequenced per experiment, methods disclosed herein allow the sequencing of hundreds to thousands of, for example, HIV integration sites for each participant. In particular embodiments, a goal is to tag at least 10% of HIV proviruses containing intact target sites within participant samples, which should provide >1000 clonally expanded or bulk purified HIV+ cells per sample, enabling significant and informative genetic analyses.
[0050] Additional aspects and options of the disclosure are now described in more detail as follows: (i) T Cells and T Cell Enrichment; (ii) Gene Editing Systems; (iii) HIV and Targeted HIV Sites; (iv) Genetic Constructs Inserted into the HIV Provirus at Targeted Sites; (v) Isolation of Tagged Cells; (vi) Maintenance and/or Expansion of Isolated Cells; (vii) Analysis of Tagged and Isolated T Cells; (viii) Exemplary Embodiments; (ix) Experimental Examples; and (x) Closing Paragraphs.
[0051] (i) T Cells and T Cell Enrichment. T cells are cells of the immune system that develop in the thymus. There are different types of T-cells, each type having a distinct function. The majority of T-cells have a T-cell receptor (TCR) existing as a complex of several proteins. The actual T-cell receptor is composed of two separate peptide chains, which are produced from the independent T-cell receptor alpha and beta (TCRα and TCRβ) genes and are called α- and β-TCR chains. γδ T-cells represent a small subset of T-cells that possess a distinct T-cell receptor (TCR) on their surface. In γδ T-cells, the TCR is made up of one γ-chain and one δ-chain. This group of T-cells is much less common (2% of total T-cells) than the αβ T-cells.
[0052] CD3 is expressed on all mature T cells and activated T-cells express 4-1BB (CD137). T-cells can further be classified into helper cells (CD4+ T-cells) and cytotoxic T-cells (CTLs, CD8+ T-cells), which include cytolytic T-cells. CD4+ T helper cells assist other white blood cells in immunologic processes, including maturation of B cells into plasma cells and activation of cytotoxic T-cells and macrophages, among other functions. These cells are also known as CD4+ T-cells because they express the CD4 protein on their surface. Helper T-cells become activated when they are presented with peptide antigens by MHC class II molecules that are expressed on the surface of antigen presenting cells (APCs). Once activated, they divide rapidly and secrete small proteins called cytokines that regulate or assist in the active immune response. CD4+ T cells can be infected by the HIV virus.
[0053] CD8+ cytotoxic T-cells destroy virally infected cells and tumor cells, and are also implicated in transplant rejection. These cells are also known as CD8+ T-cells because they express the CD8 glycoprotein at their surface. These cells recognize their targets by binding to antigen associated with MHC class I, which is present on the surface of nearly every cell of the body.
[0054] "Central memory" T-cells (or "TCM") refer to T lymphocytes that have previously been exposed to an antigen and express CD62L or CCR7 and CD45RO on the surface and do not express or have decreased expression of CD45RA as compared to naive cells.
[0055] "Effector memory" T-cells (or "TEM") refer to antigen-experienced T-cells that do not express or have decreased expression of CD62L on the surface thereof as compared to central memory cells and do not express or have decreased expression of CD45RA as compared to a naive cell. In particular embodiments, effector memory cells are negative for expression of CD62L and CCR7, compared to naive cells or central memory cells, and have variable expression of CD28 and CD45RA. Effector T-cells are positive for granzyme B and perforin as compared to memory or naive T-cells.
[0056] Regulatory T cells (“TREG”) are a subpopulation of T cells, which modulate the immune system, maintain tolerance to self-antigens, and abrogate autoimmune disease. TREG express CD25, CTLA-4, GITR, GARP and LAP.
[0057] "Naive" T-cells as used herein refers to a non-antigen experienced T cell that expresses CD62L and CD45RA and does not express CD45RO as compared to central or effector memory cells.
[0058] A statement that a cell or population of cells is "positive" for or expressing a particular marker refers to the detectable presence on or in the cell of the particular marker. When referring to a surface marker, the term can refer to the presence of surface expression as detected by flow cytometry, for example, by staining with an antibody that specifically binds to the marker and detecting said antibody, wherein the staining is detectable by flow cytometry at a level substantially above the staining detected carrying out the same procedure with an isotype-matched control under otherwise identical conditions and/or at a level substantially similar to that for a cell known to be positive for the marker, and/or at a level substantially higher than that for a cell known to be negative for the marker.
[0059] A statement that a cell or population of cells is "negative" for a particular marker or lacks expression of a marker refers to the absence of substantial detectable presence on or in the cell of a particular marker. When referring to a surface marker, the term can refer to the absence of surface expression as detected by flow cytometry, for example, by staining with an antibody that specifically binds to the marker and detecting said antibody, wherein the staining is not detected by flow cytometry at a level substantially above the staining detected carrying out the same procedure with an isotype-matched control under otherwise identical conditions, and/or at a level substantially lower than that for a cell known to be positive for the marker, and/or at a level substantially similar as compared to that for a cell known to be negative for the marker.
[0060] Methods of sample collection and enrichment are known by those skilled in the art. In some embodiments, cells are derived from T cell lines. In particular embodiments, T cells are derived from humans.
[0061] In some embodiments, T cells are derived or isolated from samples such as whole blood, peripheral blood mononuclear cells (PBMCs), leukocytes, bone marrow, thymus, tissue biopsy, tumor, leukemia, lymphoma, lymph node, gut associated lymphoid tissue, mucosa associated lymphoid tissue, spleen, other lymphoid tissues, liver, lung, stomach, intestine, colon, kidney, pancreas, breast, bone, prostate, cervix, testes, ovaries, tonsil, or other organ, and/or cells derived therefrom. In some aspects, the T cells are derived or isolated from blood or a blood-derived sample, or are derived from an apheresis or leukapheresis product. Primary T cells are those that are isolated from a living organism’s tissue or a sample thereof.
[0062] In some embodiments, blood cells collected from a subject are washed, e.g., to remove the plasma fraction and to place the cells in an appropriate buffer or media for subsequent processing steps. In particular embodiments, the cells are washed with phosphate buffered saline (PBS). In some embodiments, the wash solution lacks calcium and/or magnesium and/or many or all divalent cations. Washing can be accomplished using a semi-automated "flow through" centrifuge (for example, the Cobe 2991 cell processor, Baxter) according to the manufacturer's instructions. Tangential flow filtration (TFF) can also be performed. In particular embodiments, cells can be re-suspended in a variety of biocompatible buffers after washing, such as Ca++/Mg++-free PBS.
[0063] In particular embodiments, a sample can be enriched for T cells by using density-based cell separation methods and related methods. For example, white blood cells can be separated from other cell types in the peripheral blood by lysing red blood cells and centrifuging the sample through a Percoll or Ficoll gradient.
[0064] In particular embodiments, a bulk T cell population can be used that has not been enriched for a particular T cell type. In particular embodiments, a selected T cell type can be enriched for and/or isolated based on cell-marker based positive and/or negative selection. In positive selection, cells having bound cellular markers are retained for further use. In negative selection, cells not bound by a capture agent, such as an antibody to a cellular marker are retained for further use. In some examples, both fractions can be retained for a further use.
[0065] In some examples, multiple rounds of separation steps are carried out, where the positively or negatively selected fraction from one step is subjected to another separation step, such as a subsequent positive or negative selection.
[0066] In particular embodiments, cell populations can be isolated and/or analyzed based on light scattering properties of the cells based on side scatter channel (SSC) brightness and forward scatter channel (FSC) brightness. Side scatter refers to the amount of light scattered orthogonally (90° from the direction of the laser source), as measured by flow cytometry. Forward scatter refers to the amount of light scattered generally less than 90° from the direction of the light source. Generally, as cell granularity increases, the side scatter increases and as cell diameter increases, the forward scatter increases.
[0067] Side scatter and forward scatter are measured as intensity of light. Those skilled in the art recognize that the amount of side scatter can be differentiated with user-defined settings. In particular embodiments, low (lo) side scatter refers to less than 50% intensity, less than 40% intensity, less than 30% intensity, or even less intensity, in the side scatter channel of the flow cytometer. Conversely, high (hi) side scatter cells are the reciprocal population of cells that are not low side scatter. Forward scatter is defined in the same manner as side scatter but the light is collected in the forward scatter channel. Thus, particular embodiments include selection of cell populations based on precise combinations of cell surface markers (CD markers) and the associated light scattering properties of the cells.
[0068] In some embodiments, an antibody or binding domain for a cellular marker is bound to a solid support or matrix, such as a magnetic bead or paramagnetic bead, to allow for separation of cells for positive and/or negative selection. For example, in some embodiments, the cells and cell populations are separated or isolated using immunomagnetic (or affinity magnetic) separation techniques (reviewed in Methods in Molecular Medicine, vol. 58: Metastasis Research Protocols, Vol. 2: Cell Behavior In Vitro and In Vivo, p 17-25 Edited by: S. A. Brooks and U. Schumacher © Humana Press Inc., Totowa, NJ); see also US 4,452,773; US 4,795,698; US 5,200,084; and EP 452342.
[0069] In some embodiments, affinity-based selection is via magnetic-activated cell sorting (MACS) (Miltenyi Biotec, Auburn, CA). MACS systems are capable of high-purity selection of cells having magnetized particles attached thereto. In certain embodiments, MACS operates in a mode wherein the non-target and target species are sequentially eluted after the application of the external magnetic field. That is, the cells attached to magnetized particles are held in place while the unattached species are eluted. Then, after this first elution step is completed, the species that were trapped in the magnetic field and were prevented from being eluted are freed in some manner such that they can be eluted and recovered. In certain embodiments, the non-target cells are labelled and depleted from the heterogeneous population of cells.
[0070] In some embodiments, a T cell population is collected and enriched (or depleted) via flow cytometry, in which cells stained for multiple cell surface markers are carried in a fluidic stream. In some embodiments, a cell population described herein is collected and enriched (or depleted) via preparative-scale fluorescence-activated cell sorting (FACS). In certain embodiments, a cell population described herein is collected and enriched (or depleted) by use of microelectromechanical systems (MEMS) chips in combination with a FACS-based detection system (see, e.g., WO 2010/033140; Cho et al. (2010) Lab Chip 10, 1567-1573; and Godin et al. (2008) J Biophoton. 1(5):355-376). In both cases, cells can be labeled with multiple markers, allowing for the isolation of well-defined T cell subsets at high purity.
[0071] In particular embodiments, a CD4+ selection step is used to separate CD4+ helper and CD8+ cytotoxic T cells. Such CD4+ populations can be further sorted into sub-populations by positive or negative selection for markers expressed or expressed to a relatively higher degree on one or more naive or memory T cell subpopulations. For example, a CD4+ enriched cell population can further be sorted based on the expression of CCR7, CD45RO, and/or CD62L.
[0072] Separation need not result in 100% enrichment or removal of a particular cell population or cells expressing a particular marker. For example, positive selection of or enrichment for cells of a particular type refers to increasing the number or percentage of such cells but need not result in a complete absence of cells not expressing the marker. Likewise, negative selection, removal, or depletion of cells of a particular type refers to decreasing the number or percentage of such cells but need not result in a complete removal of all such cells. In particular embodiments, an enriched cell population will include, as a percentage of cell types, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, or at least 90%, of the targeted cell type. In particular embodiments, an enriched cell population includes >70% or >90% of the target cell type.
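The enrichment thresholds in [0072] amount to a simple purity calculation. The sketch below is illustrative only: the function names and the idea of passing cell counts (for example, from a post-sort flow cytometry check) are assumptions, and the 70% default is just one of the thresholds recited above.

```python
def purity(target_cells, total_cells):
    """Fraction of the population that is the targeted cell type."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    if not 0 <= target_cells <= total_cells:
        raise ValueError("target_cells must be between 0 and total_cells")
    return target_cells / total_cells


def is_enriched(target_cells, total_cells, min_fraction=0.70):
    """True if the targeted cell type makes up at least min_fraction."""
    return purity(target_cells, total_cells) >= min_fraction
```

For example, a sorted fraction with 850 target cells among 1,000 events has a purity of 85%, which satisfies an "at least 80%" criterion but not an "at least 90%" criterion.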
[0073] (ii) Gene Editing Systems. Within the teachings of the current disclosure, any gene editing system capable of precise sequence targeting, cutting, and construct insertion can be used, so long as the system does not require regions of homology (also referred to as homology arms or homology-directed repair templates) of greater than 75 bp yet still results in the insertion of a genetic construct at a targeted site. As indicated previously, these systems typically include a targeting element for precise genome targeting, a cutting element for cutting at or near the targeted genetic site, and the genetic construct to be inserted during repair. As detailed further below, however, different gene editing systems can adopt different components and configurations while maintaining the ability to precisely target, cut, and modify selected genomic sites.
[0074] The CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)/Cas (CRISPR-associated protein) nuclease system is an engineered nuclease system used for genetic engineering that is based on a bacterial system. It is based in part on the adaptive immune response of many bacteria and archaea. When a virus or plasmid invades a bacterium, segments of the invader's DNA are converted into CRISPR RNAs (crRNA) by the bacteria’s "immune" response. The crRNA then associates, through a region of partial complementarity, with another type of RNA called tracrRNA to guide a Cas nuclease to a region homologous to the crRNA in the target DNA called a "protospacer." The Cas nuclease cleaves the DNA to generate blunt ends at the double-strand break at sites specified by a 20-nucleotide complementary strand sequence contained within the crRNA transcript. In some instances, the Cas nuclease requires both the crRNA and the tracrRNA for site-specific DNA recognition and cleavage.
[0075] Guide RNA (gRNA) is one example of a targeting element. In its simplest form, gRNA provides a sequence that targets a site within a genome based on complementarity (e.g., crRNA). As explained below, however, gRNA can also include additional components. For example, in particular embodiments, gRNA can include a targeting sequence (e.g., crRNA) and a component to link the targeting sequence to a cutting element. This linking component can be tracrRNA. In particular embodiments, as described below, gRNA including crRNA and tracrRNA can be expressed as a single molecule referred to as single gRNA (sgRNA). gRNA can also be linked to a cutting element through other mechanisms such as through a nanoparticle or through expression or construction of a dual or multi-purpose molecule.
[0076] In particular embodiments, targeting elements (e.g., gRNA) can include one or more modifications (e.g., a base modification, a backbone modification), to provide the nucleic acid with a new or enhanced feature (e.g., improved stability). Modified backbones may include those that retain a phosphorus atom in the backbone and those that do not have a phosphorus atom in the backbone. Suitable modified backbones containing a phosphorus atom may include, for example, phosphorothioates, chiral phosphorothioates, phosphorodithioates, phosphotriesters, aminoalkylphosphotriesters, methyl and other alkyl phosphonates such as 3'-alkylene phosphonates, 5'-alkylene phosphonates, chiral phosphonates, phosphinates, phosphoramidates including 3'-amino phosphoramidate and aminoalkylphosphoramidates, phosphorodiamidates, thionophosphoramidates, thionoalkylphosphonates, thionoalkylphosphotriesters, selenophosphates, and boranophosphates having normal 3'-5' linkages, 2'-5' linked analogs, and those having inverted polarity wherein one or more internucleotide linkages is a 3' to 3', a 5' to 5' or a 2' to 2' linkage. Suitable targeting elements having inverted polarity can include a single 3' to 3' linkage at the 3'-most internucleotide linkage (i.e., a single inverted nucleoside residue in which the nucleobase is missing or has a hydroxyl group in place thereof). Various salts (e.g., potassium chloride or sodium chloride), mixed salts, and free acid forms can also be included.
[0077] Targeting elements can include one or more phosphorothioate and/or heteroatom internucleoside linkages, in particular -CH2-NH-O-CH2-, -CH2-N(CH3)-O-CH2- (i.e., a methylene (methylimino) or MMI backbone), -CH2-O-N(CH3)-CH2-, -CH2-N(CH3)-N(CH3)-CH2- and -O-N(CH3)-CH2-CH2- (wherein the native phosphodiester internucleotide linkage is represented as -O-P(=O)(OH)-O-CH2-).
[0078] In particular embodiments, targeting elements can include a morpholino backbone structure. For example, the targeting elements can include a 6-membered morpholino ring in place of a ribose ring. In some of these embodiments, a phosphorodiamidate or other non-phosphodiester internucleoside linkage replaces a phosphodiester linkage.
[0079] In particular embodiments, targeting elements can include one or more substituted sugar moieties. Suitable polynucleotides can include a sugar substituent group selected from: OH; F; O-, S-, or N-alkyl; O-, S-, or N-alkenyl; O-, S- or N-alkynyl; or O-alkyl-O-alkyl, wherein the alkyl, alkenyl and alkynyl may be substituted or unsubstituted C1 to C10 alkyl or C2 to C10 alkenyl and alkynyl. Particularly suitable are O((CH2)nO)mCH3, O(CH2)nOCH3, O(CH2)nNH2, O(CH2)nCH3, O(CH2)nONH2, and O(CH2)nON((CH2)nCH3)2, where n and m are from 1 to 10.
[0080] As is understood by one of ordinary skill in the art, the specificity of targeting elements to targeted areas of the genome is based on complementary base pairing between the targeting element and the targeted area. In particular embodiments, targeting elements and targeted areas will have 100% sequence complementarity. In particular embodiments, targeting elements and targeted areas will have at least 90%, 95%, 97%, 98%, or 99% sequence complementarity. In particular embodiments, targeting elements and targeted areas will bind in vitro under stringent hybridization conditions. In vitro stringent hybridization conditions are described in section (x) of this disclosure.
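The percent complementarity values in [0080] can be made concrete with a short calculation. The sketch below is a minimal illustration, assuming an RNA targeting element and a DNA target strand of equal length with no gaps or bulges; the function and variable names are hypothetical and nothing here is taken from the disclosure beyond the notion of percent complementarity.

```python
# Watson-Crick partners for an RNA guide base read against a DNA strand.
RNA_TO_DNA_PARTNER = {"A": "T", "U": "A", "G": "C", "C": "G"}


def percent_complementarity(guide_rna, target_dna):
    """Percent of guide positions that base-pair with the target strand.

    guide_rna:  targeting element written 5'->3' (A, C, G, U).
    target_dna: the DNA strand it binds, written 5'->3' (A, C, G, T).
                It is reversed internally because pairing is antiparallel.
    """
    guide = guide_rna.upper()
    target = target_dna.upper()[::-1]
    if not guide or len(guide) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(RNA_TO_DNA_PARTNER.get(g) == t for g, t in zip(guide, target))
    return 100.0 * matches / len(guide)
```

Under these assumptions, a 20-nucleotide targeting element with a single mismatch has 95% complementarity, and one with two mismatches has 90%.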
[0081] Examples of cutting elements include nucleases. CRISPR-Cas loci have more than 50 gene families and there are no strictly universal genes, indicating fast evolution and extreme diversity of loci architecture. Exemplary Cas nucleases include Cas1, Cas1B, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9 (also known as Csn1 and Csx12), Cas10, Cpf1, C2c3, C2c2, C2c1, Csy1, Csy2, Csy3, Cse1, Cse2, Csc1, Csc2, Csa5, Csn2, Csm2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csb3, Csx17, Csx14, Csx10, Csx16, CsaX, Csx3, Csx1, Csx15, Csf1, Csf2, Csf3, and Csf4.
[0082] There are three main types of Cas nucleases (type I, type II, and type III), and 10 subtypes including 5 type I, 3 type II, and 2 type III proteins (see, e.g., Hochstrasser and Doudna, Trends Biochem Sci, 2015;40(1):58-66). Type II Cas nucleases include Cas1, Cas2, Csn2, and Cas9. These Cas nucleases are known to those skilled in the art. For example, the amino acid sequence of the Streptococcus pyogenes wild-type Cas9 polypeptide is set forth, e.g., in NCBI Ref. Seq. No. NP_269215, and the amino acid sequence of Streptococcus thermophilus wild-type Cas9 polypeptide is set forth, e.g., in NCBI Ref. Seq. No. WP_011681470.
[0083] In particular embodiments, Cas9 refers to an RNA-guided double-stranded DNA-binding nuclease protein or nickase protein. Wild-type Cas9 nuclease has two functional domains, e.g., RuvC and HNH, that cut different DNA strands. Cas9 can induce double-strand breaks in genomic DNA (target DNA) when both functional domains are active. The Cas9 enzyme, in some embodiments, includes one or more catalytic domains of a Cas9 protein derived from bacteria such as Corynebacter, Sutterella, Legionella, Treponema, Filifactor, Eubacterium, Streptococcus, Lactobacillus, Mycoplasma, Bacteroides, Flaviivola, Flavobacterium, Sphaerochaeta, Azospirillum, Gluconacetobacter, Neisseria, Roseburia, Parvibaculum, Staphylococcus, Nitratifractor, and Campylobacter. In some embodiments, the Cas9 is a fusion protein, e.g., the two catalytic domains are derived from different bacterial species.
[0084] As indicated previously, the CRISPR/Cas system has been engineered such that, in certain cases, crRNA and tracrRNA can be combined into one molecule called a single gRNA (sgRNA). In this engineered approach, the sgRNA guides Cas to target any desired sequence (see, e.g., Jinek et al. (2012) Science 337:816-821; Jinek et al. (2013) eLife 2:e00471; Segal (2013) eLife 2:e00563). Thus, the CRISPR/Cas system can be engineered to create a double strand break at a desired target in a genome of a cell, and harness the cell's endogenous mechanisms to repair the induced break by HDR, HITI-associated MMEJ or NHEJ (as described herein), or complete NHEJ depending on whether a genetic construct is provided for insertion and the length of any provided homology regions (e.g., > or < 75 bp).
[0085] Useful variants of the Cas9 nuclease include a single inactive catalytic domain, such as a RuvC⁻ or HNH⁻ enzyme or a nickase. A Cas9 nickase has only one active functional domain and, in some embodiments, cuts only one strand of the target DNA, thereby creating a single strand break or nick. In some embodiments, the mutant Cas9 nuclease having at least a D10A mutation is a Cas9 nickase. In other embodiments, the mutant Cas9 nuclease having at least a H840A mutation is a Cas9 nickase. Other examples of mutations present in a Cas9 nickase include N854A and N863A. A double-strand break is introduced using a Cas9 nickase if at least two DNA-targeting RNAs that target opposite DNA strands are used. A double-nick-induced double-strand break is repaired by NHEJ, HDR or HITI. This gene editing strategy generally favors HDR and decreases the frequency of indel mutations at off-target DNA sites. The Cas9 nuclease or nickase, in some embodiments, is codon-optimized for the target cell or target organism.
[0086] Particular embodiments can utilize Staphylococcus aureus Cas9 (SaCas9). Particular embodiments can utilize SaCas9 with mutations at one or more of the following positions: E782, N968, and/or R1015. Particular embodiments can utilize SaCas9 with mutations at one or more of the following positions: E735, E782, K929, N968, A1021, K1044 and/or R1015. In some embodiments, the variant SaCas9 protein includes one or more of the following mutations: R1015Q, R1015H, E782K, N968K, E735K, K929R, A1021T, and/or K1044N. In some embodiments, the variant SaCas9 protein includes mutations at D10A, D556A, H557A, N580A, e.g., D10A/H557A and/or D10A/D556A/H557A/N580A. In some embodiments, the variant SaCas9 protein includes one or more mutations selected from E735, E782, K929, N968, R1015, A1021, and/or K1044. In some embodiments, the SaCas9 variants can include one of the following sets of mutations: E782K/N968K/R1015H (KKH variant); E782K/K929R/R1015H (KRH variant); or E782K/K929R/N968K/R1015H (KRKH variant). An appropriate reference sequence is provided in FIG. 12. However, the skilled person will be able to determine appropriate corresponding residues in other Cas9 proteins.
[0087] A putative Class II, Type V CRISPR-Cas class exemplified by Cpf1 has been identified (Zetsche et al. (2015) Cell 163(3): 759-771). The Cpf1 nuclease particularly can provide added flexibility in target site selection by means of a short, three base pair recognition sequence (TTN), known as the protospacer-adjacent motif or PAM. Cpf1's cut site is at least 18 bp away from the PAM sequence. Moreover, staggered DSBs with sticky ends permit orientation-specific donor template insertion, which is advantageous in non-dividing cells.
[0088] Particular embodiments can utilize engineered Cpf1s. For example, US 2018/0030425 describes engineered Cpf1 nucleases from Lachnospiraceae bacterium ND2006 and Acidaminococcus sp. BV3L6 with altered and improved target specificity. Particular variants include Lachnospiraceae bacterium ND2006 of SEQ ID NO: 34, e.g., at least including amino acids 19-1246 of SEQ ID NO: 34, with mutations (i.e., replacement of the native amino acid with a different amino acid, e.g., alanine, glycine, or serine), at one or more of the following positions: S202, N274, N278, K290, K367, K532, K609, K915, Q962, K963, K966, K1002, and/or S1003 of SEQ ID NO: 34. Particular Cpf1 variants can also include Acidaminococcus sp. BV3L6 Cpf1 (AsCpf1) of SEQ ID NO: 35 with mutations (i.e., replacement of the native amino acid with a different amino acid, e.g., alanine, glycine, or serine (except where the native amino acid is serine)), at one or more of the following positions: N178, S186, N278, N282, R301, T315, S376, N515, K523, K524, K603, K965, Q1013, Q1014, and/or K1054 of SEQ ID NO: 35. In particular embodiments, engineered Cpf1 variants include eCpf1.
[0089] Other Cpf1 variants include Cpf1 homologs and orthologs of the Cpf1 polypeptides disclosed in Zetsche et al. (2015) Cell 163: 759-771 as well as the Cpf1 polypeptides disclosed in U.S. 2016/0208243. Other engineered Cpf1 variants are known to those of ordinary skill in the art and included within the scope of the current disclosure (see, e.g., WO/2017/184768).
[0090] Particular embodiments disclosed herein can utilize a method referred to herein as CRISPR-assisted provirus tagging in vitro (CAPTIV) which, in particular embodiments, utilizes CRISPR/Cas9 and HITI to insert a fluorescent marker construct into the integrated provirus of cells latently infected with HIV. In particular embodiments, CAPTIV can include any appropriate combination of CRISPR components described herein. For example, CAPTIV may utilize gRNA, sgRNA, Cas9 variants, Cpf1, variants of Cpf1, etc. in any appropriate configuration.
[0091] Additional information regarding CRISPR-Cas systems and components thereof is described in US8697359, US8771945, US8795965, US8865406, US8871445, US8889356, US8889418, US8895308, US8906616, US8932814, US8945839, US8993233 and US8999641 and applications related thereto; and WO2014/018423, WO2014/093595, WO2014/093622, WO2014/093635, WO2014/093655, WO2014/093661, WO2014/093694, WO2014/093701, WO2014/093709, WO2014/093712, WO2014/093718, WO2014/145599, WO2014/204723, WO2014/204724, WO2014/204725, WO2014/204726, WO2014/204727, WO2014/204728, WO2014/204729, WO2015/065964, WO2015/089351, WO2015/089354, WO2015/089364, WO2015/089419, WO2015/089427, WO2015/089462, WO2015/089465, WO2015/089473, WO2015/089486, WO2016/205711, WO2017/106657, WO2017/127807 and applications related thereto.
[0092] Particular embodiments utilize zinc finger nucleases (ZFNs) as gene editing agents. ZFNs are a class of site-specific nucleases engineered to bind and cleave DNA at specific positions. ZFNs are used to introduce double strand breaks (DSBs) at a specific site in a DNA sequence which enables the ZFNs to target unique sequences within a genome in a variety of different cells.
[0093] ZFNs are synthesized by fusing a zinc finger DNA-binding domain to a DNA cleavage domain. The DNA-binding domain includes three to six zinc finger proteins which are similar to those found in transcription factors. The DNA cleavage domain includes the catalytic domain of, for example, FokI endonuclease. The FokI domain functions as a dimer requiring two constructs with unique DNA binding domains for sites on either side of the target site cleavage sequence. The FokI cleavage domain cleaves within a five or six base pair spacer sequence separating the two inverted half-sites.
[0094] For additional information regarding ZFNs and ZFNs useful within the teachings of the current disclosure, see, e.g., U.S. Patent Nos. 6,534,261; 6,607,882; 6,746,838; 6,794,136; 6,824,978; 6,866,997; 6,933,113; 6,979,539; 7,013,219; 7,030,215; 7,220,719; 7,241,573; 7,241,574; 7,585,849; 7,595,376; 6,903,185; 6,479,626; and U.S. Application Publication Nos. 2003/0232410 and 2009/0203140 as well as Gaj et al., Nat Methods, 2012, 9(8):805-7; Ramirez et al., Nucl Acids Res, 2012, 40(12):5560-8; Kim et al., Genome Res, 2012, 22(7):1327-33; Urnov et al., Nature Reviews Genetics, 2010, 11:636-646; Miller, et al. Nature biotechnology 25, 778-785 (2007); Bibikova, et al. Science 300, 764 (2003); Bibikova, et al. Genetics 161, 1169-1175 (2002); Wolfe, et al. Annual review of biophysics and biomolecular structure 29, 183-212 (2000); Kim, et al. Proceedings of the National Academy of Sciences of the United States of America 93, 1156-1160 (1996); and Miller, et al. The EMBO journal 4, 1609-1614 (1985).
[0095] Particular embodiments can use transcription activator like effector nucleases (TALENs) as gene editing agents. TALENs refer to fusion proteins including a transcription activator-like effector (TALE) DNA binding protein and a DNA cleavage domain. TALENs are used to edit genes and genomes by inducing DSBs in the DNA, which induce repair mechanisms in cells. Generally, two TALENs must bind and flank each side of the target DNA site for the DNA cleavage domain to dimerize and induce a DSB.
[0096] As indicated, TALENs have been engineered to bind a target sequence of, for example, an endogenous genome, and cut DNA at the location of the target sequence. The TALEs of TALENs are DNA binding proteins secreted by Xanthomonas bacteria. The DNA binding domain of TALEs includes a highly conserved 33 or 34 amino acid repeat, with divergent residues at the 12th and 13th positions of each repeat. These two positions, referred to as the Repeat Variable Diresidue (RVD), show a strong correlation with specific nucleotide recognition. Accordingly, targeting specificity can be improved by changing the amino acids in the RVD and incorporating nonconventional RVD amino acids.
[0097] Examples of DNA cleavage domains that can be used in TALEN fusions are wild-type and variant FokI endonucleases. For additional information regarding TALENs, see U.S. Patent Nos. 8,440,431; 8,440,432; 8,450,471; 8,586,363; and 8,697,853; as well as Joung and Sander, Nat Rev Mol Cell Biol, 2013, 14(1):49-55; Beurdeley et al., Nat Commun, 2013, 4:1762; Scharenberg et al., Curr Gene Ther, 2013, 13(4):291-303; Gaj et al., Nat Methods, 2012, 9(8):805-7; Miller, et al. Nature biotechnology 29, 143-148 (2011); Christian, et al. Genetics 186, 757-761 (2010); Boch, et al. Science 326, 1509-1512 (2009); and Moscou & Bogdanove, Science 326, 1501 (2009).
[0098] Particular embodiments utilize MegaTALs as gene editing agents. MegaTALs have a single chain rare-cleaving nuclease structure in which a TALE is fused with the DNA cleavage domain of a meganuclease. Meganucleases, also known as homing endonucleases, are single peptide chains that have both DNA recognition and nuclease function in the same domain. In contrast to the TALEN, the megaTAL only requires the delivery of a single peptide chain for functional activity.
[0099] Exemplary meganucleases include I-SceI, I-SceII, I-SceIII, I-SceIV, I-SceV, I-SceVI, I-SceVII, I-CeuI, I-CeuAIIP, I-CreI, I-CrepsbIP, I-CrepsbIIP, I-CrepsbIIIP, I-CrepsbIVP, I-TliI, I-PpoI, PI-PspI, F-SceI, F-SceII, F-SuvI, F-TevI, F-TevII, I-AmaI, I-AniI, I-ChuI, I-CmoeI, I-CpaI, I-CpaII, I-CsmI, I-CvuI, I-CvuAIP, I-DdiI, I-DdiII, I-DirI, I-DmoI, I-HmuI, I-HmuII, I-HsNIP, I-LlaI, I-MsoI, I-NaaI, I-NanI, I-NcIIP, I-NgrIP, I-NitI, I-NjaI, I-Nsp236IP, I-PakI, I-PboIP, I-PcuIP, I-PcuAI, I-PcuVI, I-PgrIP, I-PobIP, I-PorI, I-PorIIP, I-PbpIP, I-SpBetaIP, I-ScaI, I-SexIP, I-SneIP, I-SpomI, I-SpomCP, I-SpomIP, I-SpomIIP, I-SquIP, I-Ssp6803I, I-SthPhiJP, I-SthPhiST3P, I-SthPhiSTe3bP, I-TdeIP, I-TevI, I-TevII, I-TevIII, I-UarAP, I-UarHGPAIP, I-UarHGPA13P, I-VinIP, I-ZbiIP, PI-MtuI, PI-MtuHIP, PI-MtuHIIP, PI-PfuI, PI-PfuII, PI-PkoI, PI-PkoII, PI-Rma43812IP, PI-SpBetaIP, PI-SceI, PI-TfuI, PI-TfuII, PI-ThyI, PI-TliI, and PI-TliII.
[0100] (iii) HIV and Targeted HIV Sites. Human immunodeficiency virus (HIV) is a member of the genus Lentivirus, which is part of the family Retroviridae. Two species of HIV infect humans: HIV-1 and HIV-2. HIV-1 is the most common and pathogenic strain of the virus, with more than 90% of HIV/AIDS cases resulting from infection with HIV-1.
[0101] HIV is categorized into multiple clades with a high degree of genetic divergence. As used herein, the term "HIV clade" or "HIV subtype" refers to related human immunodeficiency viruses classified according to their degree of genetic similarity. There are currently three groups of HIV-1 isolates: M, N and O. Group M (major strains) consists of at least ten clades, A through J. Group O (outer strains) can consist of a similar number of clades. Group N is a new HIV-1 isolate that has not been categorized in either group M or O.
[0102] Generally, the HIV-1 genome is 9.8 kb in length, including two viral long-terminal repeats located at both ends when integrated into the host genome. The genome also includes genes that encode for the structural proteins Gag, Pol, and Env, regulatory proteins (Tat and Rev), and accessory proteins Vpu, Vpr, Vif, and Nef. The HIV-1 transactivator of transcription (Tat) is a multifunctional protein that has been proposed to contribute to several pathological consequences of HIV-1 infection. Tat not only plays an important role in viral transcription and replication, it is also capable of inducing the expression of a variety of cellular genes as well as acting as a neurotoxic protein. Tat protein is secreted by HIV-1-infected cells and acts by diffusing through the cell membrane. It may act as a secreted, soluble neurotoxin and induces HIV-1-infected macrophages and microglia to release neurotoxic substances. Tat transcription is driven by the HIV-1 LTR promoter and is required for overall viral replication of HIV.
[0103] In particular embodiments, the targeted sequence is from 10 to 30 nucleotides in length, from 12 to 28 nucleotides in length, from 16 to 26 nucleotides in length, or from 10 to 40 nucleotides in length. In particular embodiments, the targeted sequence includes a nuclease binding site. In particular embodiments, the targeted sequence includes a nick/cleavage site. In particular embodiments, the targeted sequence includes a protospacer adjacent motif (PAM) sequence.
[0104] Within the teachings of the current disclosure, HITI can be performed anywhere in the provirus that can be cut. HIV pol is the most conserved region of the genome but is also farther from the 3’ LTR. Potential tag insertion sites can be screened for suitability, defined as efficient tag expression when inserted in either 5’ to 3’ or 3’ to 5’ orientation, by inserting the tag into potential target sites within the HIV plasmid pNL4-3 using the approach described in FIG. 3, and analyzing expression levels. Particular embodiments utilize the following sites within the HIV genome that can be targeted for genetic insertion using HITI along with associated gRNA (e.g., crRNA) targeting sequences:
Table 1. Sites within the HIV Genome that can be Targeted for Genetic Insertion using HITI along with Associated gRNA
[Table 1 is reproduced as an image in the original publication.]
[0105] As indicated above, however, and as is understood by one of ordinary skill in the art, numerous different target and guide sequences can be selected within the teachings of the current disclosure. For HIV this is highly relevant due to the genetic diversity found both within the HIV-infected population and within individual patients. For example, within the Los Alamos National Laboratory (LANL) database of HIV genomes, 246 unique SpCas9 target sites are found in LTR, 573 in gag, and 897 in pol of HIV-1 group M isolates (Roychoudhury et al. 2018, PMID 29996827), demonstrating the large number of potential target sites for nuclease cleavage across the entire HIV-1 provirus. In particular embodiments, these particular target and guide sites can depend on the particular nuclease selected and can be derived using prediction algorithms for the nuclease. Particular embodiments utilize conserved Cas9 sites within the HIV genome. Particular embodiments target env or nef regions due to their proximity to the 3’ LTR. Particular embodiments may target gag, pol or other HIV regions that show high levels of genetic conservation across different HIV isolates.
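To illustrate how candidate nuclease target sites might be enumerated in a proviral sequence, the sketch below scans both strands for a 20-nucleotide protospacer immediately followed by a 5'-NGG-3' PAM, the motif recognized by Streptococcus pyogenes Cas9. It is a simplified example under stated assumptions, not the prediction algorithm used in the cited study: the function names are hypothetical, and the sketch ignores sequence conservation, off-target scoring, and ambiguous bases.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_cas9_sites(seq, protospacer_len=20):
    """Return a list of (strand, start, protospacer, pam) tuples.

    start is the 0-based index of the protospacer's 5' end on the
    strand that was searched ('+' is seq, '-' its reverse complement).
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    sites = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in pattern.finditer(s):
            sites.append((strand, m.start(), m.group(1), m.group(2)))
    return sites
```

A different nuclease would require a different motif; for example, the TTN PAM described for Cpf1 in [0087] lies on the opposite side of the protospacer, so the pattern would need to be rearranged rather than merely edited.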
[0106] (iv) Genetic Constructs Inserted into the HIV Provirus at Targeted Sites. In particular embodiments, the genetic constructs inserted into HIV provirus at targeted sites include at least a reporter gene and a promoter, collectively flanked on both sides by the same gRNA target recognition sequence as the HIV provirus gRNA target recognition sequence. These segments include HITI-regions of homology (e.g., micro-homology sequences). Therefore, in an HIV infected cell containing the donor plasmid and a single sgRNA/Cas9 ribonucleoprotein (e.g., env or nef sequence as described), Cas9 can excise the tag sequence from the donor so that it becomes linear dsDNA, while also cutting the HIV provirus at the selected target site (e.g., env or nef sequence as described) to generate a site for insertion of the excised linear tag in the HIV provirus via the HITI process. While this process is described in relation to Cas9, as indicated previously, numerous other targeting and/or cutting elements could be used including sgRNA, Cas9 variants, Cpf1, variants of Cpf1, etc. in any appropriate configuration. Moreover, if two sgRNA/Cas9 ribonucleoproteins are delivered simultaneously, then the flanking recognition sequence does not have to be HIV-specific. It can be simpler, however, to deliver one ribonucleoprotein. If only one ribonucleoprotein is used, then the donor should be different for each target site, whereas two ribonucleoproteins could be used to insert a universal donor at different target sites.
[0107] In particular embodiments, gRNA/HITI-micro-homology sequences of the genetic construct can be between 18-30 bp and share at least 95% sequence identity with the crRNA for the target sequence. In particular embodiments, gRNA/HITI-micro-homology sequences can be 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 bp and can have 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with the crRNA for the target sequence. In particular embodiments, gRNA/HITI-micro-homology sequences of the genetic construct are 20 bp with 100% sequence identity with the crRNA for the target sequence. In particular embodiments, gRNA/HITI-micro-homology sequences bind in vitro to a target sequence under stringent hybridization conditions.
[0108] In particular embodiments, the reporter gene encodes a fluorescent or light-emitting protein allowing FACS. In particular embodiments, the promoter can be a T cell specific promoter. Those of ordinary skill in the art will recognize that other components may be included in a genetic construct (and that other types of reporter and isolation systems may be chosen). Within the teachings of the current disclosure, however, the inserted genetic construct lacks a polyA signal. The absence of a polyA signal in the inserted genetic construct significantly reduces background signals of the systems and methods disclosed herein, and in particular embodiments, is critical to selectively isolating HIV-infected cells.
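By way of non-limiting illustration, the length and sequence-identity criteria of the preceding paragraph can be checked with a simple ungapped comparison. The sketch below assumes both sequences are supplied in the DNA alphabet and are of equal length; the function names are hypothetical, and a gapped alignment tool would be needed for sequences of unequal length.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def meets_micro_homology_criteria(
    micro_homology: str,
    crrna_target: str,
    min_len: int = 18,
    max_len: int = 30,
    min_identity: float = 95.0,
) -> bool:
    """True if the sequence is 18-30 bp and shares >= 95% identity.

    The default thresholds mirror the ranges recited above; equal length
    of the two sequences is an assumption of this sketch.
    """
    if not (min_len <= len(micro_homology) <= max_len):
        return False
    if len(micro_homology) != len(crrna_target):
        return False
    return percent_identity(micro_homology, crrna_target) >= min_identity
```

Under these defaults, a 20 bp sequence tolerates at most one mismatch (19/20 = 95%).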
[0109] Exemplary fluorescent proteins that can be used as reporters include blue fluorescent proteins (e.g., eBFP, eBFP2, Azurite, mKalama1, GFPuv, Sapphire, T-sapphire); cyan fluorescent proteins (e.g., eCFP, Cerulean, CyPet, AmCyan1, Midoriishi-Cyan); green fluorescent proteins (e.g., GFP, GFP-2, tagGFP, turboGFP, eGFP, Emerald, Azami Green, Monomeric Azami Green, CopGFP, AceGFP, ZsGreen1); orange fluorescent proteins (e.g., mOrange, mKO, Kusabira-Orange, Monomeric Kusabira-Orange, mTangerine, tdTomato); red fluorescent proteins (e.g., mKate, mKate2, mPlum, DsRed monomer, mCherry, mRFP1, DsRed-Express, DsRed2, DsRed-Monomer, HcRed-Tandem, HcRed1, AsRed2, eqFP611, mRaspberry, mStrawberry, Jred); yellow fluorescent proteins (e.g., YFP, eYFP, Citrine, Venus, YPet, PhiYFP, ZsYellow1); and any other suitable fluorescent proteins known to those of ordinary skill in the art, as well as light-emitting proteins such as firefly luciferase.
[0110] Genetic constructs also include promoters to drive expression of the reporter gene. In particular embodiments, the reporter is under control of a promoter. In some embodiments, the promoter includes a constitutive promoter. Exemplary constitutive promoters include simian virus 40 early promoter (SV40), cytomegalovirus immediate-early promoter (CMV), human Ubiquitin C promoter (UBC), human elongation factor 1α promoter (EF1α), mouse phosphoglycerate kinase 1 promoter (PGK), and chicken β-actin promoter coupled with CMV early enhancer (CAGG, also known as the CBA promoter). In some embodiments, the constitutive promoter is a synthetic or modified promoter. In some embodiments, the promoter is an MND promoter. In particular embodiments, an MND promoter refers to a synthetic promoter that contains the U3 region of a modified MoMuLV LTR with myeloproliferative sarcoma virus enhancer (see Challita et al. (1995) J. Virol. 69(2):748-755). In some embodiments, the promoter is a cell-specific promoter. In particular embodiments, the promoter is a viral promoter. In another embodiment, the promoter is a non-viral promoter. In some embodiments, the promoter includes the EF1α promoter or a modified form thereof and/or the MND promoter.
[0111] In particular embodiments, the promoter is a regulated promoter (e.g., inducible promoter). In some embodiments, the promoter is an inducible promoter or a repressible promoter. In some embodiments, the promoter includes a Lac operator sequence, a tetracycline operator sequence, a galactose operator sequence or a doxycycline operator sequence, or is an analog thereof or is capable of being bound by or recognized by a Lac repressor or a tetracycline repressor, or an analog thereof.
[0112] In particular embodiments, promoters appropriate for use in T cells include an MND promoter, a CD3A promoter, the murine stem cell virus LTR promoter, the distal lck promoter, and the spleen focus forming virus LTR (SFFV) promoter.
[0113] In particular embodiments, the genetic construct includes a signal sequence that encodes a signal peptide. In some aspects, the signal sequence may encode a signal peptide derived from a native polypeptide. In other aspects, the signal sequence may encode a heterologous or non-native signal peptide.
[0114] In particular embodiments, a single promoter may direct expression of an RNA that contains, in a single open reading frame (ORF), two or three reporter genes separated from one another by sequences encoding a self-cleavage peptide (e.g., 2A sequences) or a protease recognition site (e.g., furin). The ORF thus encodes a single polypeptide, which, either during (in the case of 2A) or after translation, is processed into the individual proteins. This feature can be useful, for example, to tag cells with different combinations of fluorescent markers. In some cases, the peptide, such as T2A, can cause the ribosome to skip (ribosome skipping) synthesis of a peptide bond at the C-terminus of a 2A element, leading to separation between the end of the 2A sequence and the next peptide downstream (see, for example, de Felipe, Genetic Vaccines and Ther. 2:13 (2004) and de Felipe et al., Traffic 5:616-626 (2004)). Many 2A elements are known. Examples of 2A sequences that can be used in the methods and nucleic acids disclosed herein include 2A sequences from the foot-and-mouth disease virus (F2A, e.g., SEQ ID NO: 36), equine rhinitis A virus (E2A, e.g., SEQ ID NO: 37), Thosea asigna virus (T2A, e.g., SEQ ID NO: 38 or 39), and porcine teschovirus-1 (P2A, e.g., SEQ ID NO: 40 or 41) as described in U.S. Patent Publication No. 20070116690.
[0115] As indicated, inserted genetic constructs may also include other regulatory elements such as enhancer elements.
[0116] Coding sequences encoding molecules (e.g., RNA, proteins) described herein can be obtained from publicly available databases and publications. Coding sequences can further include various sequence polymorphisms, mutations, and/or sequence variants wherein such alterations do not affect the function of the encoded molecule. The term “encode” or “encoding” refers to a property of sequences of nucleic acids, such as a vector, a plasmid, a gene, cDNA, mRNA, to serve as templates for synthesis of other molecules such as proteins.
[0117] As indicated, the term “gene” may include not only coding sequences but also regulatory regions such as promoters, enhancers, and termination regions. The term further can include all introns and other DNA sequences spliced from the mRNA transcript, along with variants resulting from alternative splice sites. The sequences can also include degenerate codons of a reference sequence or sequences that may be introduced to provide codon preference in a specific organism or cell type.
[0118] Inserted genetic constructs can be of any suitable size. In particular embodiments, the inserted genetic construct integrated into a genome is more than 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50, 100, 150, 200, 250, 300, 350, 400, 450, or 500 kb in length. In particular embodiments, the inserted genetic construct includes micro-homology regions of less than 75 bp or micro-homology regions of 20 bp.
[0119] Introduction of genetic engineering components with targeting, cutting, and insertion properties into T cells may be by any suitable method described herein or known to one of ordinary skill in the art. For example, introduction of the genetic engineering components with targeting, cutting, and insertion properties into a T cell can be accomplished chemically, biologically, or mechanically. Exemplary methods include calcium phosphate transfection, DEAE-dextran mediated transfection, dendrimer-mediated delivery, electroporation, gene gun delivery, lipotransfection, polyethyleneimine (PEI)-mediated transfection, protoplast fusion, microinjection, nanoparticle-mediated nucleic acid delivery, sonoporation, and viral vector-mediated delivery. Performing electroporation at a temperature below 35°C (e.g., at 30°C) can favor transfection efficiency.
[0120] Particular embodiments utilize one delivery mechanism to deliver targeting elements and cutting elements followed by a second mechanism to deliver genetic constructs for insertion into the HIV genome. For example, particular embodiments can utilize electroporation to deliver targeting elements and cutting elements followed by use of a vector to deliver genetic constructs for insertion into the HIV genome.
[0121] As used herein, a vector refers to a composition that facilitates transfer of non-native nucleic acid molecules into a cell and expression of non-native nucleic acid derived molecules within that cell. Many types of vectors exist, with one prevalent example including non-integrating viral vectors, such as recombinant adeno-associated viral (rAAV) vectors.
[0122] Viral vector is widely used to refer to a vector that includes virus-derived components that facilitate transfer and expression of non-native nucleic acid molecules within a cell. The term "retroviral vector" refers to a viral vector containing structural and functional genetic elements, or portions thereof, that are primarily derived from a retrovirus. The term "lentiviral vector" refers to a viral vector containing structural and functional genetic elements, or portions thereof, that are primarily derived from a lentivirus, and so on. The term "hybrid viral vector" refers to a viral vector including structural and/or functional genetic elements from more than one virus type.
[0123] Adeno-Associated Virus (AAV) is a parvovirus, discovered as a contamination of adenoviral stocks. It is a ubiquitous non-pathogenic virus (antibodies are present in up to 85% of the US human population). It is also classified as a dependovirus, because its replication is dependent on the presence of a helper virus, such as adenovirus.
[0124] A recombinant AAV (rAAV) vector refers to a recombinant AAV-derived virus that is packaged so that it contains a recombinant ssDNA nucleic acid genome. rAAV genomes are deleted of all native AAV coding sequences, contain exogenous DNA, and require two AAV derived inverted terminal repeat sequences (ITRs) for efficient packaging. A self-complementary AAV (scAAV) vector contains a double-stranded vector genome generated by deletion of the terminal resolution site (TR) from one of the rAAV ITRs in the plasmid used to make the vector, which prevents the initiation of replication at the mutated end. These constructs generate single-stranded scAAV genomes, with a wild-type (wt) ITR at each end and a mutated ITR in the middle. The two halves of the scAAV genome are complementary, which enables self-hybridization and generation of a transcriptionally active dsDNA genome. Several naturally occurring and hybrid AAV serotypes are known, including AAV-1, AAV-2, AAV-3A, AAV-3B, AAV-4, AAV-5, AAV-6, AAV-7, AAV-8, AAV-9, AAV-10, and AAV-11 (Choi et al., 2005, Curr. Gene Ther. 5:299-310). Those of skill in the art will recognize that a scAAV vector can be generated based on any of these or other serotypes of AAV. Alternatively, rationally engineered variants of naturally occurring AAV capsids, or laboratory derived intra-serotype recombinants have also been widely used to generate AAV vectors. For additional information regarding AAV vectors that can be used within the teachings of the current disclosure, see, e.g., West et al., Virology 160:38-47 (1987); U.S. Pat. No. 4,797,368; WO 93/24641; Kotin, Human Gene Therapy 5:793-801 (1994); Muzyczka, J. Clin. Invest. 94:1351 (1994); U.S. Pat. No. 5,173,414; Tratschin et al., Mol. Cell. Biol. 5:3251-3260 (1985); Tratschin, et al., Mol. Cell. Biol. 4:2072-2081 (1984); Hermonat & Muzyczka, PNAS 81:6466-6470 (1984); and Samulski et al., J. Virol. 63:3822-3828 (1989).
[0125] In particular embodiments, genetic constructs may be delivered by plasmids or non-integrating lentiviral vectors. For non-integrating lentiviral vectors see, e.g., Ory et al. (1996) Proc. Natl. Acad. Sci. USA 93:11382-11388; Dull et al. (1998) J. Virol. 72:8463-8471; Zufferey et al. (1998) J. Virol. 72:9873-9880; Follenzi et al. (2000) Nature Genetics 25:217-222; U.S. Patent Publication No. 2009/054985. For additional information, see e.g., Anderson, Science 256:808-813 (1992); Nabel & Felgner, TIBTECH 11:211-217 (1993); Mitani & Caskey, TIBTECH 11:162-166 (1993); Dillon, TIBTECH 11:167-175 (1993); Miller, Nature 357:455-460 (1992); Van Brunt, Biotechnology 6(10):1149-1154 (1988); Vigne, Restorative Neurology and Neuroscience 8:35-36 (1995); Kremer & Perricaudet, British Medical Bulletin 51(1):31-44 (1995); Haddada et al., in Current Topics in Microbiology and Immunology, Doerfler and Bohm (eds) (1995); and Yu et al., Gene Therapy 1:13-26 (1994). Illustrative lentiviruses include: HIV (human immunodeficiency virus; including HIV type 1 and HIV type 2); visna-maedi virus (VMV); the caprine arthritis-encephalitis virus (CAEV); equine infectious anemia virus (EIAV); feline immunodeficiency virus (FIV); bovine immune deficiency virus (BIV); and simian immunodeficiency virus (SIV). In particular embodiments, HIV based vector backbones (i.e., HIV cis-acting sequence elements) can be used.
[0126] Other exemplary viral vectors that may be used include Friend murine leukemia virus, feline leukemia virus (FLV), gibbon ape leukemia virus (GaLV), Harvey murine sarcoma virus (HaMuSV), Moloney murine leukemia virus (M-MuLV), Moloney murine sarcoma virus (MoMSV), murine mammary tumor virus (MuMTV), spumavirus, Murine Stem Cell Virus (MSCV), and Rous Sarcoma Virus (RSV).
[0127] Synthetic nanoparticles can also be used to deliver gene-editing components. In particular embodiments, synthetic nanoparticles can be utilized to deliver all required gene-editing components. Different nanoparticles can deliver different gene-editing components and/or all required gene-editing components can be delivered to a cell by a single type of nanoparticle (e.g., Nanoblade; see, e.g., Mangeot et al., Nat Commun. 2019, 10(1):45, doi: 10.1038/s41467-018-07845-z).
[0128] Additional forms of nucleic acids that can be delivered are described in, for example, Hardee et al., Genes (2017), 8, 65. Hardee et al. reviews non-viral DNA gene delivery vectors including plasmids, minicircles, and minivectors. Particular embodiments include use of double-stranded DNA (dsDNA), single-stranded oligonucleotides (e.g., ssODN), conventional plasmids, minicircles, and/or closed-ended linear ceDNA (see Li et al., PLoS One, Aug. 1, 2013, doi.org/10.1371/journal.pone.0069879). As is understood by one of ordinary skill in the art, different delivery platforms and nucleic acid forms are associated with different cellular toxicity profiles that should be taken into account when practicing the systems and methods disclosed herein.
[0129] (v) Isolation of Tagged Cells. Following tagging of cells latently infected with the HIV provirus, the tagged cells can be isolated. Tagged cells can be isolated from a sample using any appropriate technique. Appropriate collection and isolation procedures include magnetic separation; fluorescence activated cell sorting (FACS; Williams et al., 1985; Lu et al., 1986); nanosorting based on fluorophore expression; affinity chromatography; cytotoxic agents joined to a monoclonal antibody or used in conjunction with a monoclonal antibody, e.g., complement and cytotoxins; "panning" with antibody attached to a solid matrix (Broxmeyer et al., 1984); selective agglutination using a lectin such as soybean (Reisner et al., 1980); immunomagnetic bead-based sorting or combinations of these techniques, etc.
[0130] In particular embodiments, the isolation of tagged cells relies on the positive expression of a reporter. In these embodiments, a negative control tube can be analyzed first to set a gate (bitmap) around the population of interest by FSC and SSC, and the photomultiplier tube voltages and gains for fluorescence in the desired emission wavelengths can be adjusted such that a set percentage of the cells (e.g., 97%) appear unstained for the fluorescence marker with the negative control. Once established, these parameters can be used to isolate cells expressing the fluorescent reporter.
[0131] When using a technique such as magnetic-based cell sorting, the reporter can be a protein expressed and trafficked to the cell surface. In these embodiments, sorting of tagged cells may be based on antibody-based magnetic separation; affinity chromatography; "panning" with antibody attached to a solid matrix (Broxmeyer et al., 1984, J. Clin. Invest. 73:939-953); etc. In particular embodiments, “antibody” refers to a full antibody protein. In other embodiments, “antibody” can refer to only those portions of an antibody necessary to result in targeted protein binding (e.g., antibody binding fragments). Drug selectable markers (e.g., dihydrofolate reductase, cytosine deaminase, HSV-1 thymidine kinase) may also be used.
[0132] Techniques described in section (ii) of the disclosure related to enrichment can also be used. Regardless of the particular isolation method chosen, the practice of the systems and methods results in isolation of T cells latently infected with the HIV provirus. A tagged and isolated cell population refers to one in which the isolated cell type makes up >70%, >80%, >90%, >95%, >98%, or >99% of the cell population.
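By way of non-limiting illustration, the gating logic of the preceding paragraph can be approximated in software. The sketch below is a simplification: rather than adjusting photomultiplier tube voltages and gains, it places a fluorescence threshold so that a set fraction (e.g., 97%) of negative-control events fall at or below it, and then selects sample events above that threshold. The function names and the use of a single fluorescence channel are assumptions of the sketch.

```python
import math


def gate_threshold(negative_control, fraction_unstained=0.97):
    """Smallest observed intensity with >= `fraction_unstained` of the
    negative-control events at or below it (nearest-rank percentile)."""
    values = sorted(negative_control)
    if not values:
        raise ValueError("negative control contains no events")
    rank = math.ceil(fraction_unstained * len(values))  # 1-based rank
    return values[max(rank, 1) - 1]


def select_reporter_positive(sample, threshold):
    """Return the sample events whose intensity exceeds the gate."""
    return [x for x in sample if x > threshold]
```

In practice, the FSC/SSC gate around the population of interest would be applied before this fluorescence gate.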
[0133] (vi) Maintenance and/or Expansion of Isolated Cells. Following (i) insertion of the genetic constructs, (ii) expression of reporters and (iii) isolation of cells based on reporter expression, isolated latently infected T cells can be maintained and expanded in libraries. In this aspect, any method known in the art for expanding the number of isolated T cells can be used. The isolated T cells can be cultured under cell growth conditions such that the T cells grow and divide (proliferate) to obtain a population of cells infected with HIV.
[0134] In particular embodiments, the technique used for expansion is one that has been shown to result in an increase in the number of HIV-Infected T-Cells relative to an unexpanded sample. In certain embodiments, the expansion technique results in a 50-, 75-, 100-, 150-, 200-, 250-, 300-, 350-, 400-, 450-, or 500-fold or more increase relative to an unexpanded sample. Exemplary expansion techniques are described in, for example, WO2018157072, CA2999496A1, and WO2015188119A.
[0135] In particular embodiments, isolated HIV-Infected T-Cells are cultured with growth factors and are exposed to cell growth conditions. The cell growth conditions can include an incubation temperature suitable for the growth of human T-Cells, for example, at least 25°C, at least 30°C, or 37°C.
[0136] In particular embodiments, the HIV-Infected T-Cells are cultured for 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 days or more. In certain embodiments, the HIV-Infected T-Cells are cultured for at least 7 days.
[0137] In particular embodiments, it is useful to include one or more HIV inhibitors within culture media. Inhibiting HIV before genetic tagging of cells occurs can be particularly beneficial to protect cells from viral replication (which can lead to their destruction). Exemplary HIV inhibitors include efavirenz, nevirapine, lamivudine, and AZT.
[0138] In particular embodiments, the HIV-Infected T-Cells are cultured in a culture medium including one or more of Minimal Essential Media (MEM), Roswell Park Memorial Institute (RPMI) Media 1640, X-VIVO 15, X-VIVO 20, Click’s Medium, AIM V Medium, Dulbecco’s Modified Eagle Medium (DMEM), Eagle’s MEM, α-MEM, F-12 nutrient mixture, and human AB serum. In some instances, the cells are cultured in serum-free media or low-serum media.
[0139] In particular embodiments, the isolated T cells can be cultured in the culture medium in the presence of one or more additional factors. The one or more additional factors may be serum (e.g., fetal bovine or human serum), interleukin-2 (IL-2), insulin, interferon gamma (IFN-γ), IFN-α, IL-4, IL-7, IL-21, granulocyte-macrophage colony-stimulating factor (GM-CSF), IL-10, IL-12, IL-15, TGFβ, tumor necrosis factor alpha (TNF-α), surfactant, plasmanate, N-acetyl-cysteine, 2-mercaptoethanol, added amino acids, sodium pyruvate, vitamins, hormones, cytokine(s), penicillin, streptomycin, L-glutamine, plasma, efavirenz, Phytohemagglutinin-L (PHA-L), and GlutaMAX.
[0140] In particular embodiments, the cell culture media can include glucose:galactose in a ratio including 10:90, 15:85, 20:80, 25:75, 30:70, 35:65, 40:60, 45:55, 50:50, 55:45, 60:40, 65:35, 70:30, 75:25, 80:20, 85:15, and 90:10 (e.g., from 10:90 to 90:10, from 20:80 to 90:10, from 30:70 to 90:10, from 40:60 to 90:10, from 10:90 to 80:20, from 10:90 to 70:30, from 10:90 to 60:40, from 20:80 to 80:20, from 30:70 to 70:30, from 40:60 to 60:40, from 45:55 to 55:45, from 47:53 to 53:47, etc.). In some aspects, the cell culture media further includes one or more of fatty acids, cholesterol, arachidonic acid, linoleic acid, linolenic acid, myristic acid, oleic acid, palmitic acid, palmitoleic acid and stearic acid. In some aspects the media is sterile.
[0141] In particular embodiments, isolated T cells can be maintained in RPMI-1640 medium (HyClone™) supplemented with 10% heat-inactivated FBS, 100 µg/mL penicillin/streptomycin, and 1 µg/mL combination antiretroviral therapy (cART) drugs in an incubator at 37°C and 5% CO2 in a biosafety level (BSL) 2+ facility.
[0142] Additional culture conditions that can be used for expanding HIV-Infected T-Cells are set forth in Baxter et al., Cell Host & Microbe, Vol. 20, Issue 3, 14 September 2016, pages 368-380 and include stimulating isolated CD4 T-cells for 36-40 hours in RPMI with PHA-L (10 µg/ml, Sigma) and IL-2 (50 U/ml). The cells can be subsequently washed and maintained for 6-7 days in RPMI with IL-2 (100 U/ml).
[0143] Another example culture condition for expanding the HIV-Infected T-Cells is set forth in Cillo, et al., PNAS 2014 May 13, 111(19):7078-83 and includes seeding 304 cells per well into individual wells of 48-well plates, and culturing the cells for 7 days in RPMI medium 1640 without phenol red containing 10% (vol/vol) fetal bovine serum and 0.6% penicillin/streptomycin. 300 nM efavirenz can also be introduced into the culture to prevent subsequent rounds of HIV-1 replication.
[0144] According to yet another example, culture conditions can include those described in Kavanagh, et al., Blood 2006 March 1, 107(4):1963-9, wherein the HIV-Infected T-Cells can be grown in HEPES-buffered RPMI medium supplemented with penicillin, streptomycin, L-glutamine, and 5% pooled human AB serum.
[0145] Another example culture condition, as set forth in Weiss, et al., Blood 2004 Nov 15; 104(10):3249-56, can be used. In this example, the cells can be cultured in RPMI 1640 supplemented with 10% human blood group AB serum, penicillin-streptomycin (100 IU/mL and 100 µg/mL, respectively), 2 mM L-glutamine, 1 mM sodium pyruvate, and 1% HEPES (N-2-hydroxyethylpiperazine-N'-2-ethanesulfonic acid) buffer.
[0146] As indicated, the systems and methods of the present disclosure do not affect the viability of tagged and isolated cells. In particular embodiments, a lack of effect on viability can be demonstrated following expansion protocols described herein wherein cell growth and expected cell surface marker expression is maintained for the isolated cell type. Maintenance of cell growth and expected cell surface marker expression can be over a defined number of passages, such as over at least 5 passages, over at least 10 passages, over at least 50 passages, or even indefinitely under appropriate culture conditions.
[0147] A lack of effect on viability can also be demonstrated utilizing a cell function and/or viability assay. In particular embodiments, a lack of effect on viability can be demonstrated using an ELISPOT (enzyme-linked immunospot) assay. The ELISPOT assay is capable of detecting viable cytokine producing cells by employing high affinity capture and detection antibodies and enzyme-amplification. In particular embodiments, cytokines detected in an ELISPOT assay include IFN-γ, IL-2, IL-4, IL-5, IL-6, IL-10, IL-12, IL-13, IL-21, and/or TNF-α. For example, function of CD4+ T cells can be assessed utilizing an interferon gamma ELISPOT assay. ELISPOT IFN-γ assays and reagents are available from BD Biosciences, 2350 Qume Drive, San Jose, Calif., 95131. Additional information regarding use of ELISPOT assays is provided in J. Immunol. Methods. 2001, 254(1-2):59.
[0148] In particular embodiments, methods of assessing cell function and/or viability include measurement of intracellular cytokines. For example, the function of CD4+ cells can be assessed using interferon gamma intracellular cytokine staining (ICS). In particular embodiments, ICS staining involves permeabilizing the cells and treating them with antibodies that bind cytokines that have accumulated inside the cell.
[0149] Cell viability also can be determined using Trypan blue and light microscopy or 7-amino-actinomycin D vital dye and flow cytometry. Additionally or alternatively, the function of CD4+ cells can be assessed by the ability of the cell to respond to mitogenic stimulation by quantifying the amount of ATP produced (Kowalski, et al., 2007, Journal of Immunotoxicology, 4:3:225-232). For additional assays and techniques to assess T cell function, see McMichael & O'Callaghan, J. Exp. Med. 187(9):1367-1371, 1998; McHeyzer-Williams et al., Immunol. Rev. 150:5-21, 1996; and Lalvani, et al., J. Exp. Med. 186:859-865, 1997.
[0150] (vii) Analysis of Tagged and Isolated T Cells. According to various embodiments, cells from an expanded library are assessed for HIV integration sites, HIV sequences, TCR of the latently infected cells, and/or other factors of interest to a particular research program.
[0151] In particular embodiments, genomic DNA (gDNA) of the HIV-Infected T-Cells is isolated. According to various embodiments, the isolated gDNA is sheared into DNA fragments. In particular embodiments, each of the DNA fragments, for example, has a length of 500-700 base pairs.
[0152] In particular embodiments, the DNA fragments are amplified. For example, the DNA fragments are amplified using a polymerase chain reaction (PCR) technique, such as allele-specific PCR, assembly PCR, asymmetric PCR, endpoint PCR, hot-start PCR, in situ PCR, intersequence-specific PCR, inverse PCR, linear after exponential PCR, ligation-mediated PCR, methylation-specific PCR, miniprimer PCR, multiplex ligation-dependent probe amplification, multiplex PCR, nested PCR, overlap-extension PCR, polymerase cycling assembly, qualitative PCR, quantitative PCR, real-time PCR, single-cell PCR, solid-phase PCR, thermal asymmetric interlaced PCR, touchdown PCR, universal fast walking PCR, etc. Ligase chain reaction (LCR) may also be used.
[0153] PCR may be performed with a thermostable polymerase, such as Taq DNA polymerase (e.g., wild-type enzyme, a Stoffel fragment, FastStart polymerase, etc.), Pfu DNA polymerase, S-Tbr polymerase, Tth polymerase, Vent polymerase, or a combination thereof, among others.
[0154] PCR and LCR are driven by thermal cycling. Alternative amplification reactions, which may be performed isothermally, can also be used. Exemplary isothermal techniques include branched-probe DNA assays, cascade-RCA, helicase-dependent amplification, loop-mediated isothermal amplification (LAMP), nucleic acid sequence-based amplification (NASBA), nicking enzyme amplification reaction (NEAR), PAN-AC, Q-beta replicase amplification, rolling circle replication (RCA), self-sustaining sequence replication, strand-displacement amplification, etc.
[0155] Amplification may be performed with any suitable reagents (e.g., template nucleic acid (e.g., DNA or RNA), primers, probes, buffers, replication catalyzing enzymes (e.g., DNA polymerase, RNA polymerase), nucleotides, salts (e.g., MgCl2), etc.). In some embodiments, an amplification mixture includes any combination of at least one primer or primer pair, at least one probe, at least one replication enzyme (e.g., at least one polymerase), and deoxynucleotide (and/or nucleotide) triphosphates (dNTPs and/or NTPs), etc.
[0156] In particular embodiments, the following HITI-specific junction primers can be used to amplify sequences of particular interest:
Table 2. Exemplary HITI-Specific Junction Primers
Figure imgf000037_0001
Figure imgf000038_0001
Figure imgf000039_0002
[0157] In particular embodiments, DNA sequencing with commercially available NGS platforms may be conducted with the following steps. First, DNA sequencing libraries may be generated by clonal amplification by PCR in vitro. Second, the DNA may be sequenced by synthesis, such that the DNA sequence is determined by the addition of nucleotides to the complementary strand rather than through chain-termination chemistry. Third, the spatially segregated, amplified DNA templates may be sequenced simultaneously in a massively parallel fashion without the requirement for a physical separation step. While these steps are followed in most NGS platforms, each utilizes a different strategy (see e.g., Anderson, M. W. and Schrijver, I., 2010, Genes, 1:38-69). Examples of NGS platforms include:
Table 3. Exemplary NGS Platforms
Figure imgf000039_0001
[0158] In particular embodiments, DNA segments can undergo an amplification as part of NGS. [0159] Using the sequences of the amplified DNA fragments, in various embodiments, host-viral junctions and host DNA breakpoints are identified in the DNA fragments. In various embodiments, the integration sites of the HIV in the genomes of the HIV-Infected T-Cells can be identified by matching the host-viral junctions and host DNA breakpoints to the human genome.
[0160] In one particular exemplary method, as described in Simonetti et al. (Proc Natl Acad Sci U S A. 2016 Feb 16; 113(7): 1883-8), isolated gDNA of T-cells can be fragmented by random sheering into fragments, each 300 to 500 base-pairs long. Linker- mediated nested PCR can then be performed on the DNA fragments, in order to amplify the human genomic regions and the linked viral sequences from both the 5’ and 3’ long terminal repeats (LTRs). Then, paired end-sequencing of the amplified DNA can be performed using, for example, the MiSEq. 2 c 150-bp paired end kit (lllumina), and sequences of the host-viral junctions and the host DNA breakpoints can be determined. The integration sites are then recovered by mapping host DNA sequences to the human genome. A stringent filter can be used to check quality of the recovered integration sites.
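By way of non-limiting illustration, recovery of integration sites from sequencing reads can be sketched as follows. The sketch assumes that each informative read contains the terminal sequence of a viral LTR followed directly by host DNA, and it substitutes an exact string search against a reference for a genome aligner. The function names, the minimum flank length, and the uniqueness filter (a stand-in for the stringent filter mentioned above) are assumptions and are not the procedure of Simonetti et al.

```python
from collections import Counter


def extract_host_flank(read, ltr_end, min_flank=10):
    """Return the host sequence following the LTR terminus, or None."""
    i = read.find(ltr_end)
    if i == -1:
        return None  # read does not span a host-viral junction
    flank = read[i + len(ltr_end):]
    return flank if len(flank) >= min_flank else None


def map_integration_sites(reads, ltr_end, reference, min_flank=10):
    """Count reads supporting each integration site.

    Keys are 0-based reference positions of the first host base adjacent
    to the LTR. Flanks that are absent from the reference, or that match
    at more than one position, are discarded.
    """
    reference = reference.upper()
    ltr_end = ltr_end.upper()
    counts = Counter()
    for read in reads:
        flank = extract_host_flank(read.upper(), ltr_end, min_flank)
        if flank is None:
            continue
        pos = reference.find(flank)
        if pos != -1 and reference.find(flank, pos + 1) == -1:
            counts[pos] += 1
    return dict(counts)
```

A production pipeline would instead align reads to the human genome with a dedicated aligner and handle both LTRs, both strands, and sequencing errors.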
[0161] In particular embodiments, it may be necessary to pair TCR chains following sequencing. Various methods can be utilized to pair isolated α and β TCR chains. In particular embodiments post-sequencing pairing may be unnecessary or relatively simple, for example in embodiments in which the α and β chain pairing information is not lost in the procedure, such as if one were to sequence from single cells or from a clonally expanded group of cells. In particular embodiments, chain pairing may be assisted in silico by computer methods. For example, specialized, publicly available immunology gene alignment software is available from IMGT, JOINSOLVER, VDJSolver, SoDA, iHMMune-align, or other similar tools for annotating VDJ gene segments.
[0162] In particular embodiments, chain pairing may be performed using VDJ antibodies. For example, one may obtain antibodies for the identified segments and use the antibodies to purify a subset of cells that express that gene segment in their (surface) receptors (e.g. using FACS, or immunomagnetic selection with microbeads). One may then sequence from this subset of cells which have been purified for the desired gene segments. If necessary, this secondary sequencing may be done more deeply (i.e. at a higher resolution) than the first round of sequencing. In this second sequence data set, there will be far fewer induced clonotypes, greatly easing the task of chain pairing. Depending on the gene segments, there may be only one induced α chain and one induced β chain, for example.
[0163] In particular embodiments, chain pairing may be performed using multiwell sequencing. For example, one may isolate gene segment purified cells or unpurified cells into a microwell plate, where each microwell has a very low number of cells. One can amplify and sequence the cells in each well individually, which provides another means to pair the chains of interest by sequencing on a single cell basis, facilitating the pairing of induced α and β chains. Assays such as PairSEQ® (Adaptive Biotechnologies Corp., Seattle, WA) have also been developed.
[0164] Regarding the study of viral reactivation, once a tagged provirus has been sequenced, the identified provirus can be re-assembled using plasmids so that new viruses can be generated in culture for replication studies. This can generate information regarding whether the initially tagged provirus was potentially replication competent. In particular embodiments, such studies would only be undertaken with proviruses that do not contain large deletions, found in many integrated proviruses.
[0165] While the systems and methods disclosed herein have been described in relation to the tagging and isolation of cells latently infected with the HIV provirus, one of ordinary skill in the art will recognize that the systems and methods can also be applied to tag and isolate other cell types, such as other rare cell types and/or cell types with regions of high sequence variability within the genome (e.g., cancer cells).
[0166] (viii) Exemplary Embodiments.
1. A method of isolating CD4+ primary T cells latently infected with human immunodeficiency virus (HIV) provirus including
acquiring a sample enriched for CD4+ primary T cells obtained from a patient infected with HIV; delivering genetic engineering components to the CD4+ primary T cells within the sample wherein the genetic engineering components include:
a ribonucleoprotein complex including Cas9 and guide RNA (gRNA) including the sequence set forth in SEQ ID NO: 44; and
a self-complementary adeno-associated virus 6 (scAAV6) donor vector wherein the donor vector genome includes an insertion construct including an MND promoter and a GFP reporter gene collectively flanked at the 5’ and 3’ ends by SEQ ID NO: 43 wherein the insertion construct does not include a polyA signal and
wherein the delivering results in insertion of the insertion construct within the env gene of the HIV provirus and expression of the reporter gene; and
sorting the sample based on expression of the GFP reporter gene;
thereby isolating CD4+ primary T cells latently infected with the HIV provirus.
2. A method of isolating T cells latently infected with human immunodeficiency virus (HIV) provirus including
acquiring a sample enriched for T cells obtained from a patient infected with HIV; delivering genetic engineering components to the T cells within the sample
wherein the genetic engineering components integrate a genetic construct into a targeted portion of the HIV provirus genome and
wherein the genetic construct includes a promoter and a gene encoding a reporter, but lacks a polyA signal and results in expression of the reporter; and
sorting the sample based on expression of the reporter;
thereby isolating T cells latently infected with HIV.
3. The method of embodiment 2, wherein the T cells are CD4+ T cells.
4. The method of embodiments 2 or 3, wherein the T cells are primary T cells.
5. The method of any of embodiments 2-4, wherein the T cells are CD4+ primary T cells.
6. The method of any of embodiments 2-5, wherein the genetic construct has regions of homology to the targeted portion of the HIV provirus genome that are less than 75 base pairs.
7. The method of any of embodiments 2-6, wherein the genetic construct has regions of homology to the targeted portion of the HIV provirus genome that are less than 25 base pairs.
8. The method of any of embodiments 2-7, wherein the genetic construct has regions of homology to the targeted portion of the HIV provirus genome that are 20 base pairs.
9. The method of any of embodiments 2-8, wherein the targeted portion of the HIV provirus genome includes a sequence as set forth in SEQ ID NOs: 3, 43, 45, 47, or 49.
10. The method of any of embodiments 2-9, wherein the genetic engineering components include a guide RNA sequence including a sequence as set forth in one of SEQ ID NOs: 42, 44, 46, 48, or 50.
11. The method of any of embodiments 2-10, wherein the genetic engineering components include Cas9 or Cpf1.
12. The method of any of embodiments 2-11, wherein the promoter includes the MND promoter.
13. The method of any of embodiments 2-12, wherein the reporter gene encodes a fluorescent protein, a protein bound by an antibody binding domain, or a drug selectable marker.
14. The method of any of embodiments 2-13, wherein the genetic construct includes a sequence as set forth in one of SEQ ID NOs: 23-33.
15. The method of any of embodiments 2-14, wherein the sorting includes fluorescence activated cell sorting (FACS), magnetic based cell-sorting, affinity chromatography, panning, or drug selection.
16. The method of any of embodiments 2-15, wherein the delivering includes electroporation and viral vector delivery.
17. The method of any of embodiments 2-16, wherein the genetic engineering components include targeting elements and cutting elements delivered by electroporation.
18. The method of any of embodiments 2-17, wherein the genetic engineering components include a genetic construct expressed by a viral vector.
19. The method of embodiment 18, wherein the viral vector is a non-integrating viral vector.
20. The method of embodiment 19, wherein the non-integrating viral vector is an adeno-associated viral vector (AAV) or a lentiviral vector.
21. The method of embodiments 19 or 20, wherein the non-integrating viral vector is a self-complementary AAV vector (scAAV).
22. The method of any of embodiments 19-21, wherein the non-integrating viral vector is a self-complementary AAV6 vector (scAAV6).
23. The method of any of embodiments 2-22, further including expanding the isolated T cells to create a population of T cells having the genetic construct inserted within the HIV provirus genome.
24. The method of any of embodiments 2-23, further including amplifying portions of the genetically modified HIV provirus genome utilizing a primer sequence including a sequence as set forth in one of SEQ ID NOs: 51-57, or 65.
25. The method of any of embodiments 2-24, further including sequencing portions of the genetically modified HIV provirus genome.
26. The method of embodiment 25, further including assessing information regarding HIV integration sites within the T cell genome based on the sequencing.
27. The method of any of embodiments 2-26, further including identifying T cell receptor α and β chains from the isolated T cells.
28. The method of any of embodiments 2-27, further including assessing conditions that trigger viral reactivation.
29. The method of any of embodiments 2-28, further including:
Extracting and fragmenting DNA from the isolated T cells;
Capturing and/or partitioning the extracted and fragmented DNA; and
Sequencing the captured and/or partitioned DNA.
30. The method of embodiment 29, further including identifying one or more of whole provirus genome, integration site of the HIV provirus, T cell receptor α and β chains, and deletions or stop codons within the HIV provirus genome that predict replication incompetence based on the sequencing.
31. The method of embodiment 30, wherein the identifying utilizes a sequence as set forth in one or more of SEQ ID NOs: 20-22.
32. The method of any of embodiments 2-31, wherein the patient is receiving anti-retroviral therapy when the sample is obtained.
33. The method of any of embodiments 2-31, wherein the patient is not receiving anti-retroviral therapy when the sample is obtained.
34. The method of any of embodiments 2-33, wherein the patient is receiving chemotherapy when the sample is obtained.
35. The method of any of embodiments 2-33, wherein the patient is not receiving chemotherapy when the sample is obtained.
36. A population of T cells genetically modified according to a method of any of embodiments 1-35.
37. The population of T cells of embodiment 36, wherein the population of T cells has been expanded.
38. A method including creating a library of isolated T cells latently infected with human immunodeficiency virus (HIV) including
acquiring multiple samples enriched for T cells wherein each sample is obtained from a different patient latently infected with HIV at a targeted portion of the HIV provirus genome;
delivering genetic engineering components to the T cells within each of the samples
wherein the genetic engineering components integrate a genetic construct into the HIV provirus genome within the samples at a targeted portion of the HIV provirus genome and
wherein the genetic construct includes a promoter and a reporter gene but lacks a polyA signal and results in expression of the reporter gene; and
sorting the samples based on expression of the reporter gene;
thereby isolating T cells latently infected with HIV and creating a library of isolated T cells latently infected with HIV.
39. The method of embodiment 38, wherein the T cells are CD4+ T cells.
40. The method of embodiments 38 or 39, wherein the T cells are primary T cells.
41. The method of any of embodiments 38-40, wherein the T cells are CD4+ primary T cells.
42. The method of any of embodiments 38-41, wherein the genetic construct has regions of homology to the targeted portion of the HIV provirus genome that are less than 75 base pairs.
43. The method of any of embodiments 38-42, wherein the genetic construct has regions of homology to the targeted portion of the HIV provirus genome that are less than 25 base pairs.
44. The method of any of embodiments 38-43, wherein the genetic construct has regions of homology to the targeted portion of the HIV provirus genome that are 20 base pairs.
45. The method of any of embodiments 38-44, wherein the targeted portion of the HIV provirus genome includes a sequence as set forth in one of SEQ ID NOs: 3, 43, 45, 47, or 49.
46. The method of any of embodiments 38-45, wherein the genetic engineering components include a guide RNA sequence including a sequence as set forth in one of SEQ ID NOs: 42, 44, 46, 48, or 50.
47. The method of any of embodiments 38-46, wherein the genetic engineering components include Cas9 or Cpf1.
48. The method of any of embodiments 38-47, wherein the promoter includes the MND promoter.
49. The method of any of embodiments 38-48, wherein the reporter gene encodes a fluorescent protein, a protein bound by an antibody binding domain, or a drug selectable marker.
50. The method of any of embodiments 38-49, wherein the genetic construct includes a sequence as set forth in one of SEQ ID NOs: 23-33.
51. The method of any of embodiments 38-50, wherein the sorting includes fluorescence activated cell sorting (FACS), magnetic based cell-sorting, affinity chromatography, panning, or drug selection.
52. The method of any of embodiments 38-51, wherein the delivering includes electroporation and viral vector delivery.
53. The method of any of embodiments 38-52, wherein the genetic engineering components include targeting elements and cutting elements delivered by electroporation.
54. The method of any of embodiments 38-53, wherein the genetic engineering components include a genetic construct expressed by a viral vector.
55. The method of embodiment 54, wherein the viral vector is a non-integrating viral vector.
56. The method of embodiment 55, wherein the non-integrating viral vector is an adeno-associated viral vector (AAV) or a lentiviral vector.
57. The method of embodiments 55 or 56, wherein the non-integrating viral vector is a self-complementary AAV vector (scAAV).
58. The method of any of embodiments 55-57, wherein the non-integrating viral vector is a self-complementary AAV6 vector (scAAV6).
59. The method of any of embodiments 38-58, further including expanding the isolated T cells to create a population of T cells having the genetic construct inserted within the HIV provirus genome.
60. The method of any of embodiments 38-59, further including amplifying portions of the genetically modified HIV provirus genome utilizing a primer sequence including a sequence as set forth in one of SEQ ID NOs: 51-57, or 65.
61. The method of any of embodiments 38-60, further including sequencing portions of the genetically modified HIV provirus genome.
62. The method of any of embodiments 38-61, further including assessing information regarding HIV integration sites within the T cell genome based on the sequencing.
63. The method of any of embodiments 38-62, further including identifying T cell receptor α and β chains from the isolated T cells.
64. The method of any of embodiments 38-63, further including assessing conditions that trigger viral reactivation.
65. The method of any of embodiments 38-64, further including:
Extracting and fragmenting DNA from the isolated T cells;
Capturing and/or partitioning the extracted and fragmented DNA; and
Sequencing the captured and/or partitioned DNA.
66. The method of embodiment 65, further including identifying one or more of whole provirus genome, integration site of the HIV provirus, T cell receptor α and β chains, and deletions or stop codons within the HIV provirus genome that predict replication incompetence based on the sequencing.
67. The method of any of embodiments 38-66, wherein at least one sample is obtained from a patient receiving anti-retroviral therapy when the sample is obtained.
68. The method of any of embodiments 38-66, wherein at least one sample is obtained from a patient not receiving anti-retroviral therapy when the sample is obtained.
69. The method of any of embodiments 38-68, wherein at least one sample is obtained from a patient receiving chemotherapy when the sample is obtained.
70. The method of any of embodiments 38-68, wherein at least one sample is obtained from a patient not receiving chemotherapy when the sample is obtained.
71. A library of populations of T cells genetically modified according to a method of any of embodiments 38-70.
72. The library of populations of T cells of embodiment 71, wherein at least a portion of the populations within the library have been expanded.
73. A kit to isolate T cells latently infected with human immunodeficiency virus (HIV) including a guide RNA sequence including a sequence as set forth in one of SEQ ID NOs: 42, 44, 46, 48, or 50, a Cas or Cpf1 nuclease, and a genetic construct that includes a promoter and a reporter gene but lacks a polyA signal.
74. The kit of embodiment 73, including Cas9.
75. The kit of embodiments 73 or 74, wherein the reporter gene encodes a fluorescent protein, a protein bound by an antibody binding domain, or a drug selectable marker.
76. The kit of any of embodiments 73-75, wherein the promoter is an MND promoter, a CD3A promoter, a murine stem cell virus promoter, or a distal lck promoter.
77. The kit of any of embodiments 73-76, wherein the genetic construct includes a sequence as set forth in one of SEQ ID NOs: 23-33.
78. The kit of any of embodiments 73-77, further including T cell culture and/or expansion components including one or more of a culture medium, interleukin-2 (IL-2), insulin, interferon gamma (IFN-γ), IFN-α, IL-4, IL-7, IL-21, granulocyte-macrophage colony-stimulating factor (GM-CSF), IL-10, IL-12, IL-15, TGFβ, tumor necrosis factor alpha (TNF-α), surfactant, plasmanate, N-acetyl-cysteine, 2-mercaptoethanol, amino acids, sodium pyruvate, vitamins, hormones, penicillin, streptomycin, L-glutamine, plasma, efavirenz, Phytohemagglutinin-L (PHA-L), GlutaMAX, glucose, galactose, fatty acids, cholesterol, arachidonic acid, linoleic acid, linolenic acid, myristic acid, oleic acid, palmitic acid, palmitoleic acid, and stearic acid.
[0167] In particular embodiments, a polyA signal-less genetic construct is one that does not include a polyA signal. In particular embodiments, a polyA signal-less genetic construct is one that does not include a polyA signal within a region that stabilizes unprocessed mRNA.
[0168] (ix) Experimental Examples. Example 1. CRISPR/Cas9-mediated HITI facilitates HIV provirus tagging. As an alternative to HDR-mediated targeted integration, whether HITI could be harnessed to tag the HIV provirus was investigated. To determine whether HITI could be used to tag HIV provirus, experiments using the ACH-2 cell line, which contains 1 copy of the provirus/cell (Clouse et al., J Immunol. 1989;142(2):431-8; Folks et al., Proc Natl Acad Sci U S A. 1989; 86(7): 2365-8) were performed. ACH-2 cells were electroporated with Cas9 RNPs containing gRNAs that target a highly-conserved region of the HIV pol gene and the non-human zebrafish Tia1 sequence, along with a plasmid donor containing a GFP-expressing reporter (under control of MND, a strong promoter in T cells, (Sather et al., Sci Transl Med. 2015;7(307):307ra156)) flanked by Tia1 sgRNA cleavage sites (FIG. 2A). Flow cytometry was performed at day 1 to demonstrate plasmid uptake via GFP expression and Cas9/RNP activity via GFP knockdown using the eGFP gRNA that targets GFP (FIG. 2B).
[0169] Genomic DNA from treated cells was extracted at day 4 for genetic analysis. HITI junction-specific PCR showed that the reporter had been inserted into the HIV pol target site in ACH-2 cells (FIG. 2C) in forward and reverse orientations (FIG. 2D). Sanger sequencing of PCR products showed that the predicted forward and reverse HITI junction sequences were present (FIG. 2D). As would be expected when imprecise DNA repair creates multiple different junctions during NHEJ, multiple peaks were present in the chromatogram around each provirus-reporter junction.
[0170] PolyA signal-less donor constructs eliminate background fluorescence. In initial experiments to demonstrate HITI in ACH-2 cells, high levels of GFP expression remained for more than 7 days in cells transfected with either a plasmid or linear donor, even in the absence of Cas9 RNPs (data not shown). Presumably, this background resulted from transcription and translation of unintegrated donor construct. To selectively isolate rare latently infected cells from a background of uninfected cells, it is critical to eliminate background donor expression. A construct in which the donor tag contains the MND promoter and a GFP reporter but does not contain a polyA signal was therefore developed, so that unintegrated donor-derived mRNA reporter transcripts are unstable and become rapidly degraded so that they are not translated. It was hypothesized that upon forward orientation insertion of a polyA signal-less construct into the HIV provirus, the construct would utilize the native polyA signal within the 3’ LTR, and upon reverse orientation insertion it could use one of several predicted polyA signals present in the reverse strand of the provirus. To evaluate this, the polyA signal-less cassette was introduced into unique sites in pol and nef of the pNL4-3 molecular clone, in both the positive and negative orientations, so that sites distal and proximal to the native 3’LTR polyA signal could be evaluated for GFP expression (FIG. 3A). After transfection of these constructs into 293 cells, GFP expression was analyzed after 72 hours by flow cytometry (FIG. 3B). Cells receiving any of the 4 MND-GFP polyA-less pNL4-3 reporter constructs were GFP positive, regardless of reporter orientation, with forward orientation insertion in nef proximal to the native 3’LTR polyA signal producing the highest number of GFP expressing cells. Importantly, cells receiving either pNL4-3 or the linearized polyA-less MND-GFP donor alone did not express GFP.
[0171] It was then evaluated whether the polyA signal-less donor construct could tag the HIV provirus via CAPTIV in ACH-2 cells. A highly-conserved gRNA target site proximal to the 3’LTR polyA signal in env (envO) was identified as a provirus tagging site, and the activity of a gRNA targeting envO was validated in 293 cells following delivery of pNL4-3 and Cas9 RNPs (FIG. 4A). A polyA signal-less CAPTIV donor for envO was then electroporated into ACH-2 cells along with Cas9 RNPs for envO (FIG. 4B), and the presence of tagged GFP+ cells was validated by flow cytometry 6 days post electroporation (FIG. 4C).
[0172] Independently, similar levels of ACH-2 provirus tagging were seen at an alternative target site in env (Env1, not shown) and 2 target sites in nef (Nef1, Nef2, not shown). HITI-specific junction PCR demonstrated that the polyA signal-less donor had integrated into the envO site in ACH-2 cells (FIG. 4D), and Illumina sequencing of HITI junction PCR products confirmed the presence of HITI target-donor integration in forward and reverse orientations (not shown).
[0173] Cells isolated following CAPTIV remain viable. To confirm that CAPTIV is not detrimental to the expansion of isolated GFP+ ACH-2 cells, cells were sorted following CAPTIV tagging of the envO target site (FIGs. 5A-5C). Linear expansion of tagged cells was seen in culture after 2 rounds of sorting (FIG. 5D), with a doubling time similar to untagged ACH-2 cells (28h vs. 24h).
[0174] Proviruses and HIV integration sites can be sequenced following probe capture enrichment. An attractive possibility arising from CAPTIV isolation of cells containing HIV would be to interrogate the provirus sequence and integration site within clonally expanded cells following probe capture of HIV DNA. A set of 120bp 1X tiling HIV capture probes across the entire length of the NCBI HIV reference genome was designed and used to pull down HIV sequences from fragmented genomic DNA of naive or CAPTIV-tagged ACH-2 cells prior to performing Illumina sequencing (FIG. 6A). Using this method, complete provirus sequences from naive or CAPTIV-tagged cells were obtained. In naive cells a consensus provirus sequence was generated from an average read coverage of 685 reads/position, which was 99% homologous to HXB2 (FIG. 6B). This ACH-2 consensus sequence and the NL4-3 and HXB2 sequences used for the alignments are provided in FIGs. 6C and 7. In CAPTIV-tagged ACH-2 samples, full provirus sequences from as few as 1000 cells were obtained. Multiple different integration sites with reads of up to >250bp off the end of the HIV LTR were also recovered. The majority of 5’ and 3’ LTR integration site junctions in ACH-2 cells were found at a major integration site previously identified within the NT5C3A gene of chromosome 7, and at a previously identified minor integration site in the SLC25A25-AS gene of chromosome 9 (Sunshine et al., J Virol. 2016;90(9):4511-9; Symons et al., Retrovirology. 2017; 14(1):2) (FIG. 6C).
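As an illustration of what a 1X tiling design means, the following Python sketch lays out end-to-end 120-bp probe windows across a target of a given length. It is a sketch under stated assumptions, not the design procedure actually used for the capture panel: the function name `tile_probes` is hypothetical, and anchoring the final probe flush with the 3’ end (so that every probe is full length) is an assumed convention.

```python
def tile_probes(genome_len, probe_len=120):
    """Return (start, end) windows giving 1X tiling of a target sequence.

    Windows are laid end to end. If the target length is not a multiple of
    probe_len, one extra window is anchored flush with the end (assumption),
    which overlaps the previous window but keeps every probe full length.
    """
    if genome_len < probe_len:
        raise ValueError("target is shorter than one probe")
    starts = list(range(0, genome_len - probe_len + 1, probe_len))
    if starts[-1] + probe_len < genome_len:
        starts.append(genome_len - probe_len)
    return [(s, s + probe_len) for s in starts]


# Toy example: a 300-bp target needs three 120-bp probes.
print(tile_probes(300))
```

Applied to a reference genome string `ref`, each probe sequence would be `ref[start:end]` for the windows returned.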
[0175] Materials and Methods. Cells and Reagents. HIV-1 latent T cell clones with one integrated proviral copy, ACH-2 (National Institutes of Health (NIH) AIDS reagent program), were maintained in RPMI-1640 medium (Hyclone™) supplemented with 10% heat inactivated FBS, 100 ug/mL penicillin/streptomycin and 1 ug/mL combinational anti-retrovirus drugs (cART) in a 37°C, 5% CO2 incubator in a biosafety level (BSL) 2+ facility. cART drugs were obtained from the NIH AIDS reagent program and included Nevirapine, Lamivudine and AZT. 293 cells were cultured in DMEM (Hyclone®, GE Healthcare Life Sciences, Pittsburgh, PA) supplemented with 10% heat inactivated FBS. Human CD4+ T cells were isolated from PBMCs using the EasySep Human CD4+ T cell isolation kit (Stemcell) and EasyEights magnet (Stemcell). Isolated CD4+ cells were activated by CD3/CD28 activator beads (Life Technologies) for three days following the manufacturer’s protocol. After the T cell activator was removed, activated CD4+ T cells were maintained in RPMI-1640 medium containing 10% heat inactivated FBS and 30 U/mL rIL-2 (PeproTech). rIL-2 was supplied to the culture every two or three days. CRISPR/Cas9 crRNAs, electroporation enhancer and recombinant S. pyogenes Cas9 nuclease containing a nuclear localization sequence and C-terminal 6-His tag were obtained from Integrated DNA Technologies.
[0176] DNA Plasmids. To generate the plasmid pTia1-MND-GFP, a PCR product encompassing a MND-GFP-SV40pA expression cassette was amplified from the plasmid pscAAV-MND-GFP (De Silva et al., Antiviral Res. 2016; 126:90-8) using primers:
Tia1 l-MND-F:
GCGCCAATTCTGCAGACAAATGGCTGGTATGTCGGGAACCTCTCCAGGCTAGTGAACAGAGAAACAGGAGAATATGGG (SEQ ID NO: 70) and
SV40-R:
CAACTCCATCACTAGGGGTTCCTGCGGTATGTCGGGAACCTCTCCAGGCTAGAAAAAAACCTCCCACACCTCCCCCTG (SEQ ID NO: 71)
that flanked the MND-GFP-SV40pA cassette with inverted copies of the Tia1 sgRNA target sequence. This PCR fragment was cloned into the XbaI/NotI sites of the plasmid pX330 (Addgene plasmid #42230).
[0177] To generate the plasmids pCAPTIV-EnvO-MND-GFP, pCAPTIV-Env1-MND-GFP, pCAPTIV-Nef1-MND-GFP, and pCAPTIV-Nef2-MND-GFP, 4 PCR products encompassing the MND promoter and the GFP gene were amplified from the plasmid pscAAV-MND-GFP (De Silva et al. 2016) using these primer sets,
XbaI-EnvO-MND-F:
GCCCGACGTCGCATGCTCCTTACGTACCTGTGCCTCTTCAGCTACCACCGAACAGAGAAACAGGAGAAT (SEQ ID NO: 72), and
BamHI-EnvO-GFP-R:
AACGCGTTGGGAGCTCTCCGGTTAACCCTGTGCCTCTTCAGCTACCACCTTACTTGTACAGCTCGTCCATGC (SEQ ID NO: 73),
XbaI-Env1-MND-F:
GCCCGACGTCGCATGCTCCTTACGTACCCTGTCTTATTCTTCTAGGTATGAACAGAGAAACAGGAGAAT (SEQ ID NO: 74),
BamHI-Env1-GFP-R:
AACGCGTTGGGAGCTCTCCGGTTAACCCCTGTCTTATTCTTCTAGGTATTTACTTGTACAGCTCGTCCATGC (SEQ ID NO: 75),
XbaI-Nef1-MND-F:
GCCCGACGTCGCATGCTCCTTACGTACCTCTTGTGCTTCTAGCCAGGCAGAACAGAGAAACAGGAGAAT (SEQ ID NO: 76),
BamHI-Nef1-GFP-R:
AACGCGTTGGGAGCTCTCCGGTTAACCCTCTTGTGCTTCTAGCCAGGCATTACTTGTACAGCTCGTCCATGC (SEQ ID NO: 77),
XbaI-Nef2-MND-F:
GCCCGACGTCGCATGCTCCTTACGTACCTCAGGTACCTTTAAGACCAATGAACAGAGAAACAGGAGAAT (SEQ ID NO: 78),
BamHI-Nef2-GFP-R:
AACGCGTTGGGAGCTCTCCGGTTAACCCTCAGGTACCTTTAAGACCAATTTACTTGTACAGCTCGTCCATGC (SEQ ID NO: 79), respectively.
These primer sets flanked the MND-GFP cassette with inverted copies of one of 4 different HIV-specific sgRNA target sites (EnvO, Env1, Nef1 and Nef2). Each PCR fragment was then independently cloned into the XbaI/BamHI sites of the plasmid pGEM-7Zf by Gibson assembly.
[0178] To reduce the background GFP expression derived from un-integrated donor plasmid, the constructs pCAPTIV-EnvO-ΔpA, pCAPTIV-Env1-ΔpA, pCAPTIV-Nef1-ΔpA and pCAPTIV-Nef2-ΔpA were developed, in which the donor tag is flanked by two identical and conserved HIV-specific gRNA target sites and contains the MND promoter and a GFP reporter, but does not contain a polyA signal. The gRNA target sequences for EnvO, Env1, Nef1, and Nef2 are provided in Tables 1 and 2.
[0179] Production of AAV6 vectors. 16 million 293 cells were seeded to a 15cm dish the day before transfection, and at least 10 dishes were prepared to make each virus stock. Either pscAAV-EnvO-MND-GFP-ΔpA or pscAAV-Nef1-MND-GFP-ΔpA, pHelper and pRepCap6 plasmid were transfected into 293 cells at a 5:3:2 ratio. A total of 28 ug of DNA and 112 ug polyethylenimine (PEI), a 1:4 ratio, were added per plate to 500 ul OptiMEM. The transfection mix was incubated at room temperature for 15 min and then added to 293 cells. At 18 hr post-transfection, media was replaced with 25 mL of serum-free DMEM. At 72 hr post-transfection, cells were pelleted by centrifugation at 3000 rpm for 10 min followed by lysis with AAV lysis buffer (50 mM Tris, 150 mM NaCl, pH 8.5). Cell lysate containing AAV was frozen and thawed 4 times between a dry ice-ethanol bath and a 37°C water bath. After the final thaw, Benzonase was added to the lysate at 50 U/mL and incubated at 37°C for 30 min. Cell debris was pelleted by centrifugation at 3000 rpm for 10 min. Crude cell lysate was purified using an iodixanol gradient, and AAV was concentrated into PBS using an Amicon Ultra-15 column (EMD Millipore) before storage at -80°C.
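The per-plate amounts implied by these ratios can be checked with a short Python calculation. This is an illustrative sketch only: it assumes the 5:3:2 ratio is by mass and applies, in the order listed, to the scAAV donor plasmid, pHelper and pRepCap6, which the text does not state explicitly; the function name `split_by_ratio` is hypothetical.

```python
def split_by_ratio(total, ratio):
    """Divide a total amount into parts proportional to the given ratio."""
    whole = sum(ratio)
    return [total * part / whole for part in ratio]


total_dna_ug = 28.0  # total DNA per 15 cm plate, from the protocol
# Assumed order: donor plasmid, pHelper, pRepCap6 (mass ratio 5:3:2)
donor_ug, helper_ug, repcap_ug = split_by_ratio(total_dna_ug, (5, 3, 2))

pei_ug = total_dna_ug * 4  # DNA:PEI at 1:4 by mass

print(donor_ug, helper_ug, repcap_ug, pei_ug)
```

Under these assumptions each plate would receive 14 ug, 8.4 ug and 5.6 ug of the three plasmids, and the 1:4 ratio reproduces the stated 112 ug of PEI.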
[0180] Cas9 ribonucleoprotein (RNP) electroporation. 2.5 x 10^5 ACH-2 cells were resuspended in 9 ul buffer R (Thermo Fisher Scientific) before electroporation. Cells were electroporated with 1 ug CAPTIV donor plasmid at 1500 V, 10 ms, 1 pulse using the Neon transfection system (Invitrogen), together with 1.8 uM electroporation enhancer and 1.8 uM Cas9 RNP containing 1.5 uM Cas9 nuclease and gRNA. At 6 days post-transfection, GFP-tagged provirus was validated by flow cytometry. Genomic DNA (gDNA) was extracted with the QIAamp micro kit (QIAGEN) following the manufacturer’s protocol for next generation sequencing and for tag insertion junction-specific PCR analysis.
[0181] Flow cytometry analysis. Cells were fixed in 1% paraformaldehyde/PBS solution for at least 10 minutes and then analyzed by flow cytometry. GFP expression was validated on a BD FACSCanto II (BD Biosciences), and the flow data was analyzed with FlowJo 10.4.2 (FlowJo, LLC).
[0182] Illumina library preparation and sequencing. Genomic DNA (gDNA) was extracted from the ACH-2 cell line with the QIAamp micro kit (QIAGEN) following the manufacturer’s protocol. Libraries were prepared for Illumina sequencing as described previously (Greninger et al., BMC Genomics, 2018 Mar 20; 19(1): 204; Greninger et al., mSphere, 2018 Jun 13 3(3)). Briefly, 100 ng of DNA was fragmented to an average insert size of 500bp using the Kapa HyperPlus kit with a 37°C incubation for 7 minutes. Libraries were end-repaired, dA-tailed, and ligated to Y-stub adapters, and dual-indexed TruSeq-based adapters were added using 10 cycles of Kapa HiFi PCR, following the manufacturer’s instructions. HIV sequences were captured using a 1X tiling IDT xGen panel synthesized from the reference HIV-1 genome (NC_001802). Libraries were reamplified using 12 cycles of Kapa HiFi PCR, quantified on a Qubit 3.0 fluorometer, and sequenced using a 2x300bp sequencing run on an Illumina MiSeq.
[0183] Sequence analysis. Whole provirus genome sequences were constructed from raw sequencing reads using a modified version of a previously described computational pipeline (Greninger et al., BMC Genomics, 2018 Mar 20; 19(1): 204; Greninger et al., mSphere, 2018 Jun 13 3(3)). Briefly, raw sequencing reads in fastq format were trimmed to remove adapters and low-quality regions and de novo assembled into contigs using SPAdes (Bankevich et al., J. Comput. Biol. 2012 May; 19(5): 455-77). Contigs were ordered by aligning against the HIV-1 reference genome (HXB-2, NC_001802), gaps were filled with reference bases, and reads were re-mapped against this template to obtain the final consensus sequence for the sample.
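The step in which gaps between ordered contigs are filled with reference bases can be sketched in Python as follows. This is a simplified illustration, not the published pipeline: it assumes each contig has already been placed at a known 0-based reference coordinate with no insertions or deletions, and the function name `scaffold_on_reference` and the toy sequences are hypothetical.

```python
def scaffold_on_reference(reference, placed_contigs):
    """Build a remapping template: contig bases where available, reference elsewhere.

    placed_contigs is a list of (start, sequence) pairs in reference
    coordinates. Ungapped placement is assumed; later contigs overwrite
    earlier ones where they overlap.
    """
    template = list(reference)
    for start, contig in placed_contigs:
        end = start + len(contig)
        if start < 0 or end > len(reference):
            raise ValueError("contig placement falls outside the reference")
        template[start:end] = contig
    return "".join(template)


# Toy example: two contigs placed on a 10-base reference.
reference = "AAAAAAAAAA"
contigs = [(2, "CC"), (6, "GGG")]
template = scaffold_on_reference(reference, contigs)
print(template)
```

In the workflow described above, reads would then be re-mapped against a template of this kind to call the final consensus, so that reference-derived bases are retained only where no read evidence replaces them.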
[0184] T7 endonuclease I assay. The T7 endonuclease I cleavage assay and amplicon-sequencing protocols have been described previously (Aubert et al., Molecular Therapy - Nucleic Acids (2014) 3, e146). Primers to amplify the regions that contain CRISPR/Cas9 target sites in env and nef are shown in Table 2.
[0185] Example 2. CAPTIV tagging of the human CCR5 locus. To optimize the conditions required for CAPTIV tagging of an integrated HIV provirus in primary human CD4+ T cells, primary CD4+ T cells from HIV-naive normal donors were utilized. The CCR5 gene was initially targeted as a proxy for the HIV provirus, since HIV infection of primary human CD4+ T cells in vitro is detrimental to cell viability, and it was desired to identify the experimental parameters that would maximize cell survival when moving into HIV+ patient-derived CD4+ T cells. Two spCas9 gRNA target sites were first identified in the CCR5 gene locus (CCR5-1 and CCR5-2) that were located at different distances from their nearest canonical AATAAA or ATTAAA polyA signals in each direction (FIG. 8). The ability of spCas9 RNPs containing crRNAs specific for the CCR5-1 (AGCTGAGAGGTTACTTACCGGGG; SEQ ID NO: 59) or CCR5-2 (CAGGCCACAAGTCTCTCGCCTGG; SEQ ID NO: 60) target sites to cleave their respective target sites in primary human CD4+ T cells was then analyzed. To do this, primary human CD4+ T cells isolated from PBMCs by negative selection were activated for 24 hours using CD3/CD28 beads, then electroporated with CCR5-1 or CCR5-2 targeting RNPs using the Neon electroporation system. At day 6 post electroporation, genomic DNA was isolated from cells and a PCR product spanning each target site was amplified and used to determine the levels of gene editing that had occurred via the T7 endonuclease I (T7E1) assay. The T7E1 assay showed that gene editing had occurred at the CCR5-1 and CCR5-2 target sites in 22.9% and 34.1% of cells, respectively (FIG. 9).
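The two target-site properties discussed in this paragraph (an NGG PAM at the 3’ end of each 23-nt spCas9 site, and the distance from a cut position to the nearest canonical polyA hexamer in each orientation) can be illustrated with a short Python sketch. This is a hedged illustration only, not the procedure used to generate FIG. 8: the function names, the toy sequence in the test, and the convention that "forward" means a signal on the given strand downstream of the position while "reverse" means a signal on the opposite strand upstream of it are all assumptions made for the example.

```python
import re

# Canonical polyA hexamers named in the text.
POLYA_SIGNALS = ("AATAAA", "ATTAAA")

# Target sites as listed in the text (20-nt protospacer + 3-nt PAM).
CCR5_1 = "AGCTGAGAGGTTACTTACCGGGG"  # SEQ ID NO: 59
CCR5_2 = "CAGGCCACAAGTCTCTCGCCTGG"  # SEQ ID NO: 60


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def has_ngg_pam(site):
    """True if a 23-nt site ends in an NGG PAM (the spCas9 requirement)."""
    return len(site) == 23 and site[-2:] == "GG"


def nearest_polya(seq, pos):
    """Distances (bp) from pos to the nearest polyA hexamer.

    'forward': hexamer on the given strand, starting at or after pos.
    'reverse': hexamer on the opposite strand, ending at or before pos
               (it appears as its reverse complement on the given strand).
    A value of None means no signal was found in that orientation.
    """
    seq = seq.upper()
    # Lookahead patterns so overlapping hexamers are all reported.
    fwd_starts = [m.start() for s in POLYA_SIGNALS
                  for m in re.finditer(f"(?={s})", seq) if m.start() >= pos]
    rev_ends = [m.start() + 6 for s in POLYA_SIGNALS
                for m in re.finditer(f"(?={reverse_complement(s)})", seq)
                if m.start() + 6 <= pos]
    return {
        "forward": min(fwd_starts) - pos if fwd_starts else None,
        "reverse": pos - max(rev_ends) if rev_ends else None,
    }
```

In practice the distances would be computed on the genomic sequence surrounding each CCR5 target site; the sketch assumes that sequence is supplied by the user.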
[0186] After demonstrating efficient gene editing at the CCR5-1 and CCR5-2 target sites, insertion of a polyA-less tag containing the MND promoter and the GFP reporter at both CCR5 target sites was attempted. To minimize the levels of cell death, self-complementary AAV vectors (scAAV) were used to introduce a dsDNA CAPTIV donor template, as infection of primary human CD4+ T cells with AAV vectors is known to be less toxic to this cell type than electroporation with double-stranded DNA. scAAV vectors derived from the AAV6 serotype were used, since this serotype is known to efficiently transduce primary human CD4+ T cells in vitro.
[0187] Primary human CD4+ T cells were first electroporated with spCas9 RNPs containing crRNAs specific for either CCR5-1 or CCR5-2, in combination with a second crRNA that was specific for either the HIV-specific gRNA target sequence envO or the HIV-specific gRNA target sequence nef1. The CCR5-specific RNPs were incorporated to enable cleavage of the host CCR5 locus for donor tag insertion, whilst the envO or nef1 RNPs were incorporated to allow excision of the polyA-less MND-GFP dsDNA tag from the scAAV donor templates, which contain either envO or nef1 target sites immediately flanking both the 5’ and 3’ ends of the polyA-less MND-GFP tag. After electroporation with spCas9 RNPs, cells were left for 3 hours to recover before being infected with scAAV donor vectors scAAV6-envO-GFPApA or scAAV6-nef1-GFPApA at a multiplicity of infection of 350,000 vector genomes per cell. At 6 days post electroporation, treated cells were analyzed by flow cytometry to detect GFP-expressing cells that had been tagged with the polyA-less MND-GFP dsDNA tag excised from the scAAV donor. Background levels of expression from the scAAV6-envO-GFPApA or scAAV6-nef1-GFPApA donors were determined in cells that were mock-electroporated with no spCas9 RNPs. The percentage of cells that had been successfully transduced with the scAAV6 donor was determined in cells that had been mock-electroporated and then infected with a scAAV6-MND-GFP-pA vector, which contains a functional polyA signal and can efficiently express GFP, unlike the polyA-less donor scAAV6 vectors. The levels of GFP+ primary CD4+ T cells were analyzed in cells treated with CCR5-1 targeted RNPs in combination with either envO- or nef1-specific RNPs (FIGs. 10A-10C).
[0188] Compared to untreated control cells (0.24% GFP+) and scAAV donor only control cells (0.28% GFP+, scAAV6-envO-GFPApA; 0.32% GFP+, scAAV6-nef1-GFPApA), cells treated with two RNPs in combination with their respective scAAV6 donor contained up to 1.44% (CCR5-1/envO RNPs) or 1% (CCR5-1/nef1 RNPs) GFP+ cells, in conditions where 28.9% of cells likely contained the scAAV6 donor, as determined in the scAAV6-MND-GFP-pA control vector treated cells. These results indicate that, when the donor-only background fluorescence is subtracted, up to 1.16% (CCR5-1/envO) and 0.68% (CCR5-1/nef1) of treated cells were efficiently tagged so that GFP could be expressed and detected by flow cytometry. Furthermore, the higher levels of GFP+ cells in envO treated cells suggest that the polyA-less MND-GFP dsDNA tag is more efficiently excised from its respective scAAV6 donor by the envO targeting RNPs than by the nef1 targeting RNPs.
[0189] The levels of GFP+ primary CD4+ T cells were next analyzed in cells treated with CCR5-2 targeted RNPs in combination with either envO- or nef1-specific RNPs (FIGs. 11A-11C). Compared to untreated control cells (0.24% GFP+) and donor only control cells (0.28% GFP+, scAAV6-envO-GFPApA; 0.32% GFP+, scAAV6-nef1-GFPApA), cells treated with CCR5-2 and envO RNPs in combination with the scAAV6-envO-GFPApA donor contained up to 0.82% GFP+ cells, indicating that, when background fluorescence is subtracted, up to 0.54% of treated cells were efficiently tagged so that GFP could be expressed and detected by flow cytometry. In cells treated with CCR5-2 and nef1 RNPs in combination with the scAAV6-nef1-GFPApA donor, the level of GFP+ cells was below the level of background fluorescence, indicating that no tagging had occurred. As in the cells treated with CCR5-1 RNPs, tagging levels were higher in CCR5-2/envO treated cells than in CCR5-2/nef1 treated cells, supporting the idea that the polyA-less MND-GFP dsDNA tag is more efficiently excised from its respective scAAV6 donor by the envO targeting RNPs than by the nef1 targeting RNPs. Additionally, the observation that tagging levels were lower at the CCR5-2 target site than at the CCR5-1 target site, despite gene editing occurring at higher levels at the CCR5-2 target site in the T7E1 assay, suggests that the CCR5-1 locus is more readily able to allow GFP expression from an inserted polyA-less MND-GFP tag. It is hypothesized that this is due to putative canonical polyA signals being in closer proximity on average to the CCR5-1 target site (308 bp reverse orientation and 449 bp forward orientation) than to the CCR5-2 target site (412 bp reverse orientation and 1664 bp forward orientation). Tagging of the CCR5 locus can only occur when the scAAV6 donor reaches the nucleus of a cell and the RNP cleaves the CCR5 target site. 
Therefore, when the efficiency of AAV transduction (28.9%) is used as a proxy for donor delivery efficiency and the target site gene editing rate (22.9%, CCR5-1; 34.1%, CCR5-2) is used as a proxy for CCR5 locus cleavage, the efficiency of the CCR5 tagging process in primary human CD4+ T cells can be approximated. Using this approximation, only 6.62% of all CD4+ T cells could theoretically be tagged at the CCR5-1 locus and 9.85% at the CCR5-2 locus, so the true tagging rate for CCR5-1 is 17.5% when using envO RNPs [(1.16/6.62)*100] and 10.27% when using nef1 RNPs [(0.68/6.62)*100], whereas the tagging rate for CCR5-2 is 5.48% when using envO RNPs [(0.54/9.85)*100].
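The approximation above combines three steps: subtracting the donor-only background from the observed GFP+ percentage, multiplying the transduction and editing rates to obtain the theoretically taggable fraction, and dividing the first by the second. The following Python sketch reproduces that arithmetic with the values reported in the text. The function names are illustrative assumptions, and intermediate values are rounded to two decimals to mirror the rounding used in the text.

```python
def background_subtracted(gfp_pct, background_pct):
    """Observed GFP+ percentage minus the donor-only background."""
    return round(gfp_pct - background_pct, 2)


def max_taggable_pct(transduction_pct, editing_pct):
    """Percentage of cells with both donor delivery and target cleavage,
    treating the two events as independent (as the text does)."""
    return round(transduction_pct * editing_pct / 100, 2)


def normalized_tagging_pct(tagged_pct, taggable_pct):
    """Tagged cells as a percentage of the theoretically taggable cells."""
    return round(tagged_pct / taggable_pct * 100, 2)


# Values reported in the text.
TRANSDUCTION = 28.9                       # scAAV6-MND-GFP-pA control, % GFP+
EDITING = {"CCR5-1": 22.9, "CCR5-2": 34.1}  # T7E1 editing rates, %

taggable_1 = max_taggable_pct(TRANSDUCTION, EDITING["CCR5-1"])
taggable_2 = max_taggable_pct(TRANSDUCTION, EDITING["CCR5-2"])

tagged_1_env = background_subtracted(1.44, 0.28)  # CCR5-1 / envO
tagged_1_nef = background_subtracted(1.00, 0.32)  # CCR5-1 / nef1
tagged_2_env = background_subtracted(0.82, 0.28)  # CCR5-2 / envO

rate_1_env = normalized_tagging_pct(tagged_1_env, taggable_1)
rate_1_nef = normalized_tagging_pct(tagged_1_nef, taggable_1)
rate_2_env = normalized_tagging_pct(tagged_2_env, taggable_2)
```

Because transduction and editing were measured in separate control populations, the product is only a rough upper bound on the taggable fraction, consistent with the text's description of it as an approximation.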
[0190] Example 3. Optimization of HIV provirus tagging in DHIV-infected CD4+ primary T cells. Optimal conditions for tagging of primary CD4+ T cells will be determined to maximize the number of cells available for downstream genetic analysis and/or clonal expansion. Initial studies will be carried out using an in vitro model of HIV infection using activated CD4+ T cells that are transduced with the replication-defective HIV molecular clone DHIV (Bosque & Planelles, Blood. 2009;113(1):58-65) at a multiplicity of 1 infectious unit/cell. This will enable optimization of parameters required for efficient provirus tagging in a setting where HIV+ cells are abundant. Gene editing in primary T cells following electroporation of Cas9 RNPs and ssDNA oligonucleotide donors has proven successful (Schumann et al., Proc Natl Acad Sci U S A. 2015;112(33):10437-42). However, plasmid DNA delivery into primary CD4+ T cells often results in low cell viability (Van Tendeloo et al., Gene Ther. 2000;7(16):1431-7; Bell et al., Nat Med. 2001;7(10):1155-8). Therefore, AAV vectors will be used as a donor for HIV provirus tagging in CD4+ T cells. AAV efficiently transduces CD4+ T cells (Sather et al., Sci Transl Med. 2015;7(307):307ra156; Wang et al., Nucleic Acids Res. 2016;44(3):e30), does not impact T-cell viability, and can facilitate gene insertion by HDR at levels of over 30% following electroporation of Cas9 mRNA (Gwiazda et al., Mol Ther. 2016;24(9):1570-80). Self-complementary AAV6 (scAAV) vectors that provide a dsDNA template for CRISPR/Cas9-mediated donor release and subsequent insertion can be used, and these can facilitate higher levels of targeted integration than plasmid donors. The scAAV vectors will contain a polyA signal-less MND-GFP donor flanked by target sequences from conserved regions of env or nef proximal to the 3’ LTR polyA signal. 
Cells will be electroporated with env- or nef-specific Cas9 RNPs, and the AAV6 donor vector will then be delivered at increasing MOIs (20,000; 100,000; 500,000 vg/cell) 2-4 hours post electroporation as previously described (Schumann et al., Proc Natl Acad Sci U S A. 2015;112(33):10437-42; Gwiazda et al., Mol Ther. 2016;24(9):1570-80). Levels of gene tagging will be assessed by flow cytometry at 2 and 6 days post electroporation, and by Illumina sequencing.
[0191] Additionally, AAV vectors will be delivered 24 hours before Cas9 RNP electroporation, as this may increase gene insertion levels.
[0192] Example 4. CAPTIV isolation of CD4+ T cells from HIV+ participant PBMCs. Provirus tagging in samples from HIV+ participants will be performed. Initial analyses will use samples from participants not receiving ART, since the frequency of HIV+ cells will be higher, before moving onto samples from participants receiving ART. De-identified participant samples chosen to include similar numbers of men and women, and reflect the US distribution of minority groups, will be obtained from the University of Washington/Fred Hutch Center for AIDS Research (CFAR) HIV specimen repository, in cryopreserved aliquots of 5 million PBMCs. Participant CD4+ T cells will be electroporated as previously described (Schumann et al., Proc Natl Acad Sci U S A. 2015;112(33):10437-42; Gwiazda et al., Mol Ther. 2016;24(9):1570-80) with the optimal Cas9 RNPs identified for HITI, and then transduced with AAV donors at the optimal multiplicity, either 24 hours prior to or 2-4 hours post electroporation. Levels of gene tagging will be followed by flow cytometry 3, 7, and 14 days later, and provirus tagging will be confirmed by Illumina sequencing.
[0193] Example 5. Analysis of HIV integration sites in HIV+ participant samples. The preference for HIV integration in active genes in vitro has been well established (43-45), but the potential role that an HIV integration event may play in the in vivo expansion and/or persistence of an HIV infected cell has only recently been recognized (Cohn et al., Cell. 2015;160(3):420-32; Maldarelli et al., Science. 2014;345(6193):179-83; Wagner et al., Science. 2014;345(6196):570-3). Currently used methods to isolate HIV integration sites require multiple PCR steps to isolate the integration sites, and multiple reactions to achieve representative sampling of the pool of infected cells (Maldarelli et al., Science. 2014;345(6193):179-83; Wagner et al., Science. 2014;345(6196):570-3). Integration site analysis of CAPTIV-isolated HIV+ cells following HIV probe capture offers an advantage over these methods, as minimal PCR amplification is required, and an entire sample can be analyzed in a single run. Probe capture protocols (Johnston et al., PLoS Med. 2017;14(12):e1002475; Koelle et al., Sci Rep. 2017;7:44084; Greninger et al., bioRxiv. 2017(bioRxiv 181248)) have been adapted for HIV and will be used on DNA extracted from bulk pools of HIV+ participant CD4+ T cells isolated by CAPTIV as described. Capture probes (IDT xGen) will be synthesized using provirus consensus sequences of group M subtypes A-K generated from LANL database sequences and pooled to maximize levels of DNA capture. 5’ and 3’ HIV LTR-specific capture probes contained in the HIV genome capture probe library allow recovery of integration sites. Alternatively, a custom SureSelect DNA bait library (Agilent) containing 500 different HIV genomes, including LTR, will be designed to cover the 90% nucleotide identity space occupied by group M HIV. Paired-end Illumina sequencing will then be performed with captured DNA, and integration site locations determined using blastn. 
In studies described in Example 1, this method identified known integration sites (Sunshine et al., J Virol. 2016;90(9):4511-9; Symons et al., Retrovirology. 2017;14(1):2) from as few as 1000 ACH-2 cells, although the ultimate sensitivity of this approach should be much better. Moreover, unlike traditional techniques for integration site analysis, a single MiSeq run would, assuming a 100% capture rate, theoretically allow sequencing of all integration sites in a sample of over 10,000 HIV+ cells.
[0194] Example 6. Linkage of provirus genomes and their integration sites in HIV+ participant samples. Recent studies have provided important information about the completeness of integrated genomes in HIV-infected cells and their location within each infected cell (Ho et al., Cell. 2013;155(3):540-51; Bruner et al., Nat Med. 2016;22(9):1043-9), and these studies have helped inform the composition of the latent HIV reservoir. However, the methods used to obtain these data rely upon a long-range PCR step followed by nested PCR and sequence assembly from multiple overlapping fragments. CAPTIV offers a way to simplify this process, while increasing the number of individual complete provirus sequences that can be obtained from a single participant sample. CAPTIV will be used to collect complete provirus sequences in combination with their integration sites. An HIV probe capture technique will first be used to obtain complete provirus sequences and integration junctions from individual HIV+ CD4+ T cell clones that have been expanded in culture after CAPTIV isolation. As an alternative approach, the 10X Genomics whole genome sequencing (WGS) platform can be used to obtain paired provirus sequences and integration sites from bulk populations of HIV+ participant CD4+ T cells isolated by CAPTIV. The 10X Genomics WGS platform allows the partition of 15-30 kb genomic DNA fragments onto individual beads that are partitioned into droplets and assigned unique barcodes. The partitioning of barcoded DNA fragments larger than the HIV provirus enables downstream Illumina sequencing that allows identification of paired whole provirus sequences and integration sites from a pool of mixed genomic DNA. Partial provirus sequences created during the shearing step can also be used to generate complete proviral genomes through pairing with a matched integration site partial provirus. 
The 10X Genomics WGS platform can be run using as little as 1 ng (167 cells) of sample DNA (Zheng et al., Nat Biotechnol. 2016;34(3):303-11; Zook et al., Sci Data. 2016;3:160025), which is important in that the minimal input cell number for disclosed hybridization approaches has not yet been defined (although it is <1000 cells). Complete provirus sequences generated by either technique can be analyzed for deletions, or stop codons that predict replication incompetence, and would allow reconstruction of full virus and analysis of replication competence (Ho et al., Cell. 2013;155(3):540-51) in future work.
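The equivalence of 1 ng of DNA to 167 cells stated above follows from dividing the DNA mass by the DNA content of a single cell. The Python sketch below shows the conversion. The figure of 6 pg of DNA per diploid human cell is an assumption introduced here because it reproduces the number in the text; commonly quoted values range from roughly 6 to 6.6 pg, and the function name is illustrative.

```python
# Assumed DNA content of one diploid human cell, in picograms.
# This value is an assumption chosen to match the "1 ng (167 cells)"
# figure in the text; it is not stated in the source.
PG_PER_DIPLOID_CELL = 6.0


def cells_from_dna_ng(dna_ng, pg_per_cell=PG_PER_DIPLOID_CELL):
    """Approximate number of cell equivalents in dna_ng nanograms of gDNA."""
    return round(dna_ng * 1000 / pg_per_cell)


def dna_ng_from_cells(n_cells, pg_per_cell=PG_PER_DIPLOID_CELL):
    """Approximate gDNA mass (ng) contained in n_cells cells."""
    return n_cells * pg_per_cell / 1000
```

Under the same assumption, the 1000-cell input mentioned for the hybridization approach corresponds to about 6 ng of genomic DNA.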
[0195] Example 7. Analysis of provirus genomes, integration sites, and T cell receptor (TCR) sequences in individual HIV+ participant samples. The ability to pair the sequence of the TCR expressed by an HIV+ CD4+ T cell with its integrated provirus could provide insights into the mechanisms by which infected clonal T cell populations are maintained. Although previous studies suggest that CD4+ T cells with specificity for CMV, EBV, or HIV (Abana et al., J Immunol. 2017;199(9):3187-201; Henrich et al., J Infect Dis. 2017;216(2):254-62; Demoustier et al., AIDS. 2002;16(13):1749-54) may be enriched within the HIV reservoir, widespread studies of HIV infected CD4+ T cell TCR sequences have not been performed. Using DNA isolated from individual HIV+ CD4+ T cell clones expanded after CAPTIV isolation, provirus and integration site sequences will be obtained using the HIV probe capture technique as described herein. The same DNA can then be used to determine the TCRα and TCRβ gene sequences for individual clones. Molecular TCR sequence analysis can be performed by PCR within the Immune Monitoring shared resource at the Fred Hutchinson Cancer Research Center, so that provirus information can be paired with TCR sequence data. The diversity of TCR sequences from HIV-infected cells compared to uninfected cells will initially be evaluated. If the TCR repertoire of HIV infected cells is restricted or there are dominant TCR sequences, TCR sequence data will be compared with known antigen-specific TCR sequences (Shugay et al., Nucleic Acids Res. 2017. doi: 10.1093/nar/gkx760. PubMed PMID: 28977646), to identify potential clones that are responsive to these viruses. Participants with common and well-studied HLA types will also be evaluated and the epitope specificity of individual clones will be inferred using recently developed methods for TCR specificity prediction (Dash et al., Nature. 2017;547(7661):89-93; Glanville et al., Nature. 2017;547(7661):94-8). 
Paired provirus/TCR information allows determination of whether replication-competent HIV is truly enriched within CD4+ T cells with a particular antigen specificity.
[0196] (x) Closing Paragraphs. Variants of protein and/or nucleic acid sequences disclosed herein can also be used. Variants include sequences with at least 70% sequence identity, 80% sequence identity, 85% sequence identity, 90% sequence identity, 95% sequence identity, 96% sequence identity, 97% sequence identity, 98% sequence identity, or 99% sequence identity to the protein and nucleic acid sequences described or disclosed herein wherein the variant exhibits substantially similar or improved biological function.
[0197] “% sequence identity” refers to a relationship between two or more sequences, as determined by comparing the sequences. In the art, "identity" also means the degree of sequence relatedness between protein and nucleic acid sequences as determined by the match between strings of such sequences. "Identity" (often referred to as "similarity") can be readily calculated by known methods, including those described in: Computational Molecular Biology (Lesk, A. M., ed.) Oxford University Press, NY (1988); Biocomputing: Informatics and Genome Projects (Smith, D. W., ed.) Academic Press, NY (1994); Computer Analysis of Sequence Data, Part I (Griffin, A. M., and Griffin, H. G., eds.) Humana Press, NJ (1994); Sequence Analysis in Molecular Biology (Von Heijne, G., ed.) Academic Press (1987); and Sequence Analysis Primer (Gribskov, M. and Devereux, J., eds.) Oxford University Press, NY (1992). Preferred methods to determine identity are designed to give the best match between the sequences tested. Methods to determine identity and similarity are codified in publicly available computer programs. Sequence alignments and percent identity calculations may be performed using the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR, Inc., Madison, Wisconsin). Multiple alignment of the sequences can also be performed using the Clustal method of alignment (Higgins and Sharp, CABIOS, 5, 151-153 (1989)) with default parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10). Relevant programs also include the GCG suite of programs (Wisconsin Package Version 9.0, Genetics Computer Group (GCG), Madison, Wisconsin); BLASTP, BLASTN, BLASTX (Altschul et al., J. Mol. Biol. 215:403-410 (1990)); DNASTAR (DNASTAR, Inc., Madison, Wisconsin); and the FASTA program incorporating the Smith-Waterman algorithm (Pearson, Comput. Methods Genome Res., [Proc. Int. Symp.] (1994), Meeting Date 1992, 111-20. Editor(s): Suhai, Sandor. Publisher: Plenum, New York, N.Y.). 
Within the context of this disclosure it will be understood that where sequence analysis software is used for analysis, the results of the analysis are based on the "default values" of the program referenced. "Default values" will mean any set of values or parameters, which originally load with the software when first initialized.
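As a minimal illustration of how a percent identity value can be derived once two sequences have been aligned, the following Python sketch counts matching columns in a pairwise alignment. It is not the calculation performed by Megalign, Clustal, BLAST, or FASTA, each of which has its own alignment and scoring conventions. The function names, the use of "-" as the gap character, and the choice to count gapped columns in the denominator are assumptions made for this example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the
    same residue. Columns that are gaps in both sequences are ignored;
    columns with a gap in only one sequence count as mismatches."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == gap and b == gap)]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a, aligned_b, threshold_pct=70.0):
    """True if the aligned pair reaches the given % identity threshold
    (for example, the 70% to 99% levels listed for variants)."""
    return percent_identity(aligned_a, aligned_b) >= threshold_pct
```

Because different programs treat gaps and alignment length differently, the same pair of sequences can yield slightly different identity values depending on the tool and its default parameters.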
[0198] Variants also include nucleic acid molecules that hybridize under stringent hybridization conditions to a sequence disclosed herein and provide the same function as the reference sequence. Exemplary stringent hybridization conditions include an overnight incubation at 42°C in a solution including 50% formamide, 5X SSC (750 mM NaCl, 75 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5X Denhardt's solution, 10% dextran sulfate, and 20 µg/ml denatured, sheared salmon sperm DNA, followed by washing the filters in 0.1X SSC at 50°C. Changes in the stringency of hybridization and signal detection are primarily accomplished through the manipulation of formamide concentration (lower percentages of formamide result in lowered stringency), salt conditions, or temperature. For example, moderately high stringency conditions include an overnight incubation at 37°C in a solution including 6X SSPE (20X SSPE = 3 M NaCl; 0.2 M NaH2PO4; 0.02 M EDTA, pH 7.4), 0.5% SDS, 30% formamide, 100 µg/ml salmon sperm blocking DNA; followed by washes at 50°C with 1X SSPE, 0.1% SDS. In addition, to achieve even lower stringency, washes performed following stringent hybridization can be done at higher salt concentrations (e.g. 5X SSC). Variations in the above conditions may be accomplished through the inclusion and/or substitution of alternate blocking reagents used to suppress background in hybridization experiments. Typical blocking reagents include Denhardt's reagent, BLOTTO, heparin, denatured salmon sperm DNA, and commercially available proprietary formulations. The inclusion of specific blocking reagents may require modification of the hybridization conditions described above, due to problems with compatibility.
[0199] In particular embodiments, variant proteins include conservative amino acid substitutions. In particular embodiments, a conservative amino acid substitution may not substantially change the structural characteristics of the reference sequence (e.g., a replacement amino acid should not tend to break a helix that occurs in the reference sequence or disrupt other types of secondary structure that characterize the reference sequence). Examples of art-recognized polypeptide secondary and tertiary structures are described in Proteins, Structures and Molecular Principles (Creighton, Ed., W. H. Freeman and Company, New York (1984)); Introduction to Protein Structure (C. Branden & J. Tooze, eds., Garland Publishing, New York, N.Y. (1991)); and Thornton et al., Nature, 354:105 (1991).
[0200] In particular embodiments, a “conservative substitution” involves a substitution found in one of the following conservative substitution groups: Group 1: Alanine (Ala), Glycine (Gly), Serine (Ser), Threonine (Thr); Group 2: Aspartic acid (Asp), Glutamic acid (Glu); Group 3: Asparagine (Asn), Glutamine (Gln); Group 4: Arginine (Arg), Lysine (Lys), Histidine (His); Group 5: Isoleucine (Ile), Leucine (Leu), Methionine (Met), Valine (Val); and Group 6: Phenylalanine (Phe), Tyrosine (Tyr), Tryptophan (Trp).
[0201] Additionally, amino acids can be grouped into conservative substitution groups by similar function or chemical structure or composition (e.g., acidic, basic, aliphatic, aromatic, sulfur-containing). For example, an aliphatic grouping may include, for purposes of substitution, Gly, Ala, Val, Leu, and Ile. Other groups containing amino acids that are considered conservative substitutions for one another include: sulfur-containing: Met and Cysteine (Cys); acidic: Asp, Glu, Asn, and Gln; small aliphatic, nonpolar or slightly polar residues: Ala, Ser, Thr, Pro, and Gly; polar, negatively charged residues and their amides: Asp, Asn, Glu, and Gln; polar, positively charged residues: His, Arg, and Lys; large aliphatic, nonpolar residues: Met, Leu, Ile, Val, and Cys; and large aromatic residues: Phe, Tyr, and Trp. Additional information is found in Creighton (1984) Proteins, W.H. Freeman and Company.
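The salt concentrations at each stated buffer strength follow by simple scaling from the compositions given in this paragraph (5X SSC = 750 mM NaCl and 75 mM trisodium citrate; 20X SSPE = 3 M NaCl, 0.2 M NaH2PO4, and 0.02 M EDTA). The Python sketch below performs that scaling. The dictionary and function names are illustrative assumptions, and the 1X values are derived from the stated 5X and 20X compositions rather than quoted from the source.

```python
# 1X compositions in mM, derived by dividing the stated 5X SSC and
# 20X SSPE compositions by 5 and 20, respectively.
SSC_1X_MM = {"NaCl": 750.0 / 5, "trisodium citrate": 75.0 / 5}
SSPE_1X_MM = {"NaCl": 3000.0 / 20, "NaH2PO4": 200.0 / 20, "EDTA": 20.0 / 20}


def at_strength(one_x_mm, fold):
    """Component concentrations (mM) of a buffer at `fold` X strength."""
    return {component: conc * fold for component, conc in one_x_mm.items()}


hybridization_ssc = at_strength(SSC_1X_MM, 5)    # stringent hybridization
stringent_wash = at_strength(SSC_1X_MM, 0.1)     # 0.1X SSC wash
moderate_hyb = at_strength(SSPE_1X_MM, 6)        # 6X SSPE hybridization
```

Lower salt in the wash (for example 0.1X SSC, about 15 mM NaCl) corresponds to higher stringency, while washing at 5X SSC (750 mM NaCl) lowers it, consistent with the trend described in the text.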
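The six conservative substitution groups listed above can be represented as sets, with a substitution treated as conservative when both residues fall within the same group. The Python sketch below encodes only the six groups of this paragraph, using standard three-letter codes. The data structure and function name are illustrative assumptions, and residues that appear in none of the six groups (for example Cys and Pro) are simply reported as non-conservative under this particular grouping.

```python
# The six groups from the paragraph above, in three-letter code.
CONSERVATIVE_GROUPS = (
    frozenset({"Ala", "Gly", "Ser", "Thr"}),  # Group 1
    frozenset({"Asp", "Glu"}),                # Group 2
    frozenset({"Asn", "Gln"}),                # Group 3
    frozenset({"Arg", "Lys", "His"}),         # Group 4
    frozenset({"Ile", "Leu", "Met", "Val"}),  # Group 5
    frozenset({"Phe", "Tyr", "Trp"}),         # Group 6
)


def is_conservative(original, replacement):
    """True if both residues belong to the same conservative group."""
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)
```

A different grouping scheme, such as one based on chemical class, would change which substitutions are reported as conservative; only the tuple of sets needs to be replaced.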
[0202] The disclosed nucleic acid sequences are shown using standard letter abbreviations for nucleotide bases, as defined in 37 C.F.R. 1.822. Only one strand of each nucleic acid sequence is shown, but the complementary strand is understood as included.
[0203] Unless otherwise indicated, the practice of the present disclosure can employ conventional techniques of immunology, molecular biology, microbiology, cell biology and recombinant DNA. These methods are described in the following publications. See, e.g., Sambrook, et al. Molecular Cloning: A Laboratory Manual, 2nd Edition (1989); F. M. Ausubel, et al. eds., Current Protocols in Molecular Biology, (1987); the series Methods in Enzymology (Academic Press, Inc.); M. MacPherson, et al., PCR: A Practical Approach, IRL Press at Oxford University Press (1991); MacPherson et al., eds. PCR 2: Practical Approach, (1995); Harlow and Lane, eds. Antibodies, A Laboratory Manual, (1988); and R. I. Freshney, ed. Animal Cell Culture (1987) and updated editions thereof.
[0204] As will be understood by one of ordinary skill in the art, each embodiment disclosed herein can comprise, consist essentially of or consist of its particular stated element, step, ingredient or component. Thus, the terms “include” or “including” should be interpreted to recite: “comprise, consist of, or consist essentially of.” The transition term “comprise” or “comprises” means includes, but is not limited to, and allows for the inclusion of unspecified elements, steps, ingredients, or components, even in major amounts. The transitional phrase “consisting of” excludes any element, step, ingredient or component not specified. The transition phrase “consisting essentially of” limits the scope of the embodiment to the specified elements, steps, ingredients or components and to those that do not materially affect the embodiment. A material effect would cause a statistically significant reduction in the ability to specifically isolate latently infected HIV cells from a biological sample.
[0205] Unless otherwise indicated, all numbers expressing quantities of ingredients, properties such as molecular weight, reaction conditions, and so forth used in the specification and claims are to be understood as being modified in all instances by the term “about.” Accordingly, unless indicated to the contrary, the numerical parameters set forth in the specification and attached claims are approximations that may vary depending upon the desired properties sought to be obtained by the present invention. At the very least, and not as an attempt to limit the application of the doctrine of equivalents to the scope of the claims, each numerical parameter should at least be construed in light of the number of reported significant digits and by applying ordinary rounding techniques. When further clarity is required, the term “about” has the meaning reasonably ascribed to it by a person skilled in the art when used in conjunction with a stated numerical value or range, i.e. denoting somewhat more or somewhat less than the stated value or range, to within a range of ±20% of the stated value; ±19% of the stated value; ±18% of the stated value; ±17% of the stated value; ±16% of the stated value; ±15% of the stated value; ±14% of the stated value; ±13% of the stated value; ±12% of the stated value; ±11% of the stated value; ±10% of the stated value; ±9% of the stated value; ±8% of the stated value; ±7% of the stated value; ±6% of the stated value; ±5% of the stated value; ±4% of the stated value; ±3% of the stated value; ±2% of the stated value; or ±1% of the stated value.
[0206] Notwithstanding that the numerical ranges and parameters setting forth the broad scope of the invention are approximations, the numerical values set forth in the specific examples are reported as precisely as possible. Any numerical value, however, inherently contains certain errors necessarily resulting from the standard deviation found in their respective testing measurements.
[0207] The terms “a,” “an,” “the” and similar referents used in the context of describing the invention (especially in the context of the following claims) are to be construed to cover both the singular and the plural, unless otherwise indicated herein or clearly contradicted by context. Recitation of ranges of values herein is merely intended to serve as a shorthand method of referring individually to each separate value falling within the range. Unless otherwise indicated herein, each individual value is incorporated into the specification as if it were individually recited herein. All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”) provided herein is intended merely to better illuminate the invention and does not pose a limitation on the scope of the invention otherwise claimed. No language in the specification should be construed as indicating any non-claimed element essential to the practice of the invention.
[0208] Groupings of alternative elements or embodiments of the invention disclosed herein are not to be construed as limitations. Each group member may be referred to and claimed individually or in any combination with other members of the group or other elements found herein. It is anticipated that one or more members of a group may be included in, or deleted from, a group for reasons of convenience and/or patentability. When any such inclusion or deletion occurs, the specification is deemed to contain the group as modified thus fulfilling the written description of all Markush groups used in the appended claims.
[0209] Certain embodiments of this invention are described herein, including the best mode known to the inventors for carrying out the invention. Of course, variations on these described embodiments will become apparent to those of ordinary skill in the art upon reading the foregoing description. The inventors expect skilled artisans to employ such variations as appropriate, and the inventors intend for the invention to be practiced otherwise than specifically described herein. Accordingly, this invention includes all modifications and equivalents of the subject matter recited in the claims appended hereto as permitted by applicable law. Moreover, any combination of the above-described elements in all possible variations thereof is encompassed by the invention unless otherwise indicated herein or otherwise clearly contradicted by context.
[0210] Furthermore, numerous references have been made to patents, printed publications, journal articles and other written text throughout this specification (referenced materials herein). Each of the referenced materials is individually incorporated herein by reference in its entirety for its referenced teaching.
[0211] In closing, it is to be understood that the embodiments of the invention disclosed herein are illustrative of the principles of the present invention. Other modifications that may be employed are within the scope of the invention. Thus, by way of example, but not of limitation, alternative configurations of the present invention may be utilized in accordance with the teachings herein. Accordingly, the present invention is not limited to that precisely as shown and described.
[0212] The particulars shown herein are by way of example and for purposes of illustrative discussion of the preferred embodiments of the present invention only and are presented in the cause of providing what is believed to be the most useful and readily understood description of the principles and conceptual aspects of various embodiments of the invention. In this regard, no attempt is made to show structural details of the invention in more detail than is necessary for the fundamental understanding of the invention, the description taken with the drawings and/or examples making apparent to those skilled in the art how the several forms of the invention may be embodied in practice.
[0213] Definitions and explanations used in the present disclosure are meant and intended to be controlling in any future construction unless clearly and unambiguously modified in the following examples or when application of the meaning renders any construction meaningless or essentially meaningless. In cases where the construction of the term would render it meaningless or essentially meaningless, the definition should be taken from Webster's Dictionary, 3rd Edition or a dictionary known to those of ordinary skill in the art, such as the Oxford Dictionary of Biochemistry and Molecular Biology (Ed. Anthony Smith, Oxford University Press, Oxford, 2004).

Claims

What is claimed is:
1. A method of isolating CD4+ primary T cells latently infected with human immunodeficiency virus (HIV) provirus comprising
acquiring a sample enriched for CD4+ primary T cells obtained from a patient infected with HIV;
delivering genetic engineering components to the CD4+ primary T cells within the sample
wherein the genetic engineering components comprise:
a ribonucleoprotein complex comprising Cas9 and guide RNA (gRNA) comprising the sequence set forth in SEQ ID NO: 44; and
a self-complementary adeno-associated virus 6 (scAAV6) donor vector wherein the donor vector genome comprises an insertion construct comprising an MND promoter and a GFP reporter gene collectively flanked at the 5’ and 3’ ends by SEQ ID NO: 43 wherein the insertion construct does not comprise a polyA signal and
wherein the delivering results in insertion of the insertion construct within the env gene of the HIV provirus and expression of the reporter gene; and sorting the sample based on expression of the GFP reporter gene;
thereby isolating CD4+ primary T cells latently infected with the HIV provirus.
2. A method of isolating T cells latently infected with human immunodeficiency virus (HIV) provirus comprising
acquiring a sample enriched for T cells obtained from a patient infected with HIV;
delivering genetic engineering components to the T cells within the sample
wherein the genetic engineering components integrate a genetic construct into a targeted portion of the HIV provirus genome and
wherein the genetic construct comprises a promoter and a gene encoding a reporter but does not comprise a polyA signal and results in expression of the reporter; and
sorting the sample based on expression of the reporter;
thereby isolating T cells latently infected with HIV.
3. The method of claim 2, wherein the T cells are CD4+ T cells.
4. The method of claim 2, wherein the T cells are primary T cells.
5. The method of claim 2, wherein the T cells are CD4+ primary T cells.
6. The method of claim 2, wherein the genetic construct has regions of homology to the targeted portion of the HIV provirus genome that are less than 75 base pairs.
7. The method of claim 2, wherein the genetic construct has regions of homology to the targeted portion of the HIV provirus genome that are less than 25 base pairs.
8. The method of claim 2, wherein the genetic construct has regions of homology to the targeted portion of the HIV provirus genome that are 20 base pairs.
9. The method of claim 2, wherein the targeted portion of the HIV provirus genome comprises a sequence as set forth in one of SEQ ID NOs: 3, 43, 45, 47, or 49.
10. The method of claim 2, wherein the genetic engineering components comprise a guide RNA sequence comprising a sequence as set forth in one of SEQ ID NOs: 42, 44, 46, 48, or 50.
11. The method of claim 2, wherein the genetic engineering components comprise Cas9 or Cpf1.
12. The method of claim 2, wherein the promoter comprises the MND promoter.
13. The method of claim 2, wherein the reporter gene encodes a fluorescent protein, a protein bound by an antibody binding domain, or a drug selectable marker.
14. The method of claim 2, wherein the genetic construct comprises a sequence as set forth in one of SEQ ID NOs: 23-33.
15. The method of claim 2, wherein the sorting comprises fluorescence activated cell sorting (FACS), magnetic based cell-sorting, affinity chromatography, panning, or drug selection.
16. The method of claim 2, wherein the delivering comprises electroporation and viral vector delivery.
17. The method of claim 2, wherein the genetic engineering components comprise targeting elements and cutting elements delivered by electroporation.
18. The method of claim 2, wherein the genetic engineering components comprise a genetic construct expressed by a viral vector.
19. The method of claim 18, wherein the viral vector is a non-integrating viral vector.
20. The method of claim 19, wherein the non-integrating viral vector is an adeno-associated viral vector (AAV) or a lentiviral vector.
21. The method of claim 19, wherein the non-integrating viral vector is a self-complementary AAV vector (scAAV).
22. The method of claim 19, wherein the non-integrating viral vector is a self-complementary AAV6 vector (scAAV6).
23. The method of claim 2, further comprising expanding the isolated T cells to create a population of T cells having the genetic construct inserted within the HIV provirus genome.
24. The method of claim 2, further comprising amplifying portions of the genetically modified HIV provirus genome utilizing a primer sequence comprising a sequence as set forth in one of SEQ ID NOs: 51-57, or 65.
25. The method of claim 2, further comprising sequencing portions of the genetically modified HIV provirus genome.
26. The method of claim 25, further comprising assessing information regarding HIV integration sites within the T cell genome based on the sequencing.
27. The method of claim 2, further comprising identifying T cell receptor α and β chains from the isolated T cells.
28. The method of claim 2, further comprising assessing conditions that trigger viral reactivation.
29. The method of claim 2, further comprising:
extracting and fragmenting DNA from the isolated T cells;
capturing and/or partitioning the extracted and fragmented DNA; and
sequencing the captured and/or partitioned DNA.
30. The method of claim 29, further comprising identifying one or more of whole provirus genome, integration site of the HIV provirus, T cell receptor α and β chains, and deletions or stop codons within the HIV provirus genome that predict replication incompetence based on the sequencing.
31. The method of claim 30, wherein the identifying utilizes a sequence as set forth in one or more of SEQ ID NOs: 20-22.
32. The method of claim 2, wherein the patient is receiving anti-retroviral therapy when the sample is obtained.
33. The method of claim 2, wherein the patient is not receiving anti-retroviral therapy when the sample is obtained.
34. The method of claim 2, wherein the patient is receiving chemotherapy when the sample is obtained.
35. The method of claim 2, wherein the patient is not receiving chemotherapy when the sample is obtained.
36. A population of T cells genetically modified according to a method of any of claims 1-35.
37. The population of T cells of claim 36, wherein the population of T cells has been expanded.
38. A method comprising creating a library of isolated T cells latently infected with human immunodeficiency virus (HIV) comprising
acquiring multiple samples enriched for T cells wherein each sample is obtained from a different patient latently infected with HIV at a targeted portion of the HIV provirus genome;
delivering genetic engineering components to the T cells within each of the samples
wherein the genetic engineering components integrate a genetic construct into the HIV provirus genome within the samples at a targeted portion of the HIV provirus genome and
wherein the genetic construct comprises a promoter and a reporter gene but does not comprise a polyA signal and results in expression of the reporter gene; and
sorting the samples based on expression of the reporter gene;
thereby isolating T cells latently infected with HIV and creating a library of isolated T cells latently infected with HIV.
39. The method of claim 38, wherein the T cells are CD4+ T cells.
40. The method of claim 38, wherein the T cells are primary T cells.
41. The method of claim 38, wherein the T cells are CD4+ primary T cells.
42. The method of claim 38, wherein the genetic construct has regions of homology to the targeted portion of the HIV provirus genome that are less than 75 base pairs.
43. The method of claim 38, wherein the genetic construct has regions of homology to the targeted portion of the HIV provirus genome that are less than 25 base pairs.
44. The method of claim 38, wherein the genetic construct has regions of homology to the targeted portion of the HIV provirus genome that are 20 base pairs.
45. The method of claim 38, wherein the targeted portion of the HIV provirus genome comprises a sequence as set forth in one of SEQ ID NOs: 3, 43, 45, 47, or 49.
46. The method of claim 38, wherein the genetic engineering components comprise a guide RNA sequence comprising a sequence as set forth in one of SEQ ID NOs: 42, 44, 46, 48, or 50.
47. The method of claim 38, wherein the genetic engineering components comprise Cas9 or Cpf1.
48. The method of claim 38, wherein the promoter comprises the MND promoter.
49. The method of claim 38, wherein the reporter gene encodes a fluorescent protein, a protein bound by an antibody binding domain, or a drug selectable marker.
50. The method of claim 38, wherein the genetic construct comprises a sequence as set forth in one of SEQ ID NOs: 23-33.
51. The method of claim 38, wherein the sorting comprises fluorescence activated cell sorting (FACS), magnetic based cell-sorting, affinity chromatography, panning, or drug selection.
52. The method of claim 38, wherein the delivering comprises electroporation and viral vector delivery.
53. The method of claim 38, wherein the genetic engineering components comprise targeting elements and cutting elements delivered by electroporation.
54. The method of claim 38, wherein the genetic engineering components comprise a genetic construct expressed by a viral vector.
55. The method of claim 54, wherein the viral vector is a non-integrating viral vector.
56. The method of claim 55, wherein the non-integrating viral vector is an adeno-associated viral vector (AAV) or a lentiviral vector.
57. The method of claim 55, wherein the non-integrating viral vector is a self-complementary AAV vector (scAAV).
58. The method of claim 55, wherein the non-integrating viral vector is a self-complementary AAV6 vector (scAAV6).
59. The method of claim 38, further comprising expanding the isolated T cells to create a population of T cells having the genetic construct inserted within the HIV provirus genome.
60. The method of claim 38, further comprising amplifying portions of the genetically modified HIV provirus genome utilizing a primer sequence comprising a sequence as set forth in one of SEQ ID NOs: 51-57, or 65.
61. The method of claim 38, further comprising sequencing portions of the genetically modified HIV provirus genome.
62. The method of claim 38, further comprising assessing information regarding HIV integration sites within the T cell genome based on the sequencing.
63. The method of claim 38, further comprising identifying T cell receptor α and β chains from the isolated T cells.
64. The method of claim 38, further comprising assessing conditions that trigger viral reactivation.
65. The method of claim 38, further comprising:
extracting and fragmenting DNA from the isolated T cells;
capturing and/or partitioning the extracted and fragmented DNA; and
sequencing the captured and/or partitioned DNA.
66. The method of claim 65, further comprising identifying one or more of whole provirus genome, integration site of the HIV provirus, T cell receptor α and β chains, and deletions or stop codons within the HIV provirus genome that predict replication incompetence based on the sequencing.
67. The method of claim 38, wherein at least one sample is obtained from a patient receiving anti-retroviral therapy when the sample is obtained.
68. The method of claim 38, wherein at least one sample is obtained from a patient not receiving anti-retroviral therapy when the sample is obtained.
69. The method of claim 38, wherein at least one sample is obtained from a patient receiving chemotherapy when the sample is obtained.
70. The method of claim 38, wherein at least one sample is obtained from a patient not receiving chemotherapy when the sample is obtained.
71. A library of populations of T cells genetically modified according to a method of any of claims 38-70.
72. The library of populations of T cells of claim 71, wherein at least a portion of the populations within the library have been expanded.
73. A kit to isolate T cells latently infected with human immunodeficiency virus (HIV) comprising a guide RNA sequence comprising a sequence as set forth in one of SEQ ID NOs: 42, 44, 46, 48, or 50, a Cas or Cpf1 nuclease, and a genetic construct that comprises a promoter and a reporter gene but does not comprise a polyA signal.
74. The kit of claim 73, comprising Cas9.
75. The kit of claim 73, wherein the reporter gene encodes a fluorescent protein, a protein bound by an antibody binding domain, or a drug selectable marker.
76. The kit of claim 73, wherein the promoter is an MND promoter, a CD3A promoter, a murine stem cell virus promoter, or a distal lck promoter.
77. The kit of claim 73, wherein the genetic construct comprises a sequence as set forth in one of SEQ ID NOs: 23-33.
78. The kit of claim 73, further comprising T cell culture and/or expansion components comprising one or more of a culture medium, interleukin-2 (IL-2), insulin, interferon gamma (IFN-γ), IFN-α, IL-4, IL-7, IL-21, granulocyte-macrophage colony-stimulating factor (GM-CSF), IL-10, IL-12, IL-15, TGFβ, tumor necrosis factor alpha (TNF-α), surfactant, plasmanate, N-acetyl-cysteine, 2-mercaptoethanol, amino acids, sodium pyruvate, vitamins, hormones, penicillin, streptomycin, L-glutamine, plasma, efavirenz, Phytohemagglutinin-L (PHA-L), GlutaMAX, glucose, galactose, fatty acids, cholesterol, arachidonic acid, linoleic acid, linolenic acid, myristic acid, oleic acid, palmitic acid, palmitoleic acid, and stearic acid.
PCT/US2020/013919 2019-01-16 2020-01-16 Methods to tag and isolate cells infected with the human immunodeficiency virus WO2020150499A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201962793323P 2019-01-16 2019-01-16
US62/793,323 2019-01-16

Publications (1)

Publication Number Publication Date
WO2020150499A1 true WO2020150499A1 (en) 2020-07-23

Family

ID=71613453


Country Status (1)

Country Link
WO (1) WO2020150499A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030017538A1 (en) * 2001-06-08 2003-01-23 Riken Fluorescent protein
US20040053219A1 (en) * 2000-11-10 2004-03-18 Bioalliance Pharma (S.A.) Method for analysing human immunodeficiency virus (HIV) phenotypic characteristics
US20050042747A1 (en) * 2001-10-29 2005-02-24 Frederic Clayton Hivgp120-induced bob/gpr15 activation
US20160016971A1 (en) * 2011-01-10 2016-01-21 Susana Valente Inhibitors of Retroviral Replication
US20180296649A1 (en) * 2015-06-01 2018-10-18 Temple University - Of The Commonwealth System Of Higher Education Methods and compositions for rna-guided treatment of hiv infection
WO2018195360A1 (en) * 2017-04-21 2018-10-25 Seattle Children's Hospital (Dba Seattle Childern's Research Institute) Therapeutic genome editing in wiskott-aldrich syndrome and x-linked thrombocytopenia



Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20741789; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 20741789; Country of ref document: EP; Kind code of ref document: A1)