WO2013159103A1 - Programming and reprogramming of cells - Google Patents


Publication number
WO2013159103A1
Authority
WO
WIPO (PCT)
Prior art keywords
cells
cell
reprogramming
pluripotency
somatic
Application number
PCT/US2013/037623
Other languages
French (fr)
Inventor
Yosef BUGANIM
Dina A. FADDAH
Rudolf Jaenisch
Styliani MARKOULAKI
Original Assignee
Whitehead Institute For Biomedical Research
Application filed by Whitehead Institute For Biomedical Research
Publication of WO2013159103A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N5/00: Undifferentiated human, animal or plant cells, e.g. cell lines; Tissues; Cultivation or maintenance thereof; Culture media therefor
    • C12N5/06: Animal cells or tissues; Human cells or tissues
    • C12N5/0602: Vertebrate cells
    • C12N5/0696: Artificially induced pluripotent stem cells, e.g. iPS
    • A: HUMAN NECESSITIES
    • A01: AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01K: ANIMAL HUSBANDRY; CARE OF BIRDS, FISHES, INSECTS; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
    • A01K67/00: Rearing or breeding animals, not otherwise provided for; New breeds of animals
    • A01K67/027: New breeds of vertebrates
    • A01K67/0275: Genetically modified vertebrates, e.g. transgenic
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/87: Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/873: Techniques for producing new embryos, e.g. nuclear transfer, manipulation of totipotent cells or production of chimeric embryos
    • C12N15/877: Techniques for producing new mammalian cloned embryos
    • C12N15/8775: Murine embryos
    • A: HUMAN NECESSITIES
    • A01: AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01K: ANIMAL HUSBANDRY; CARE OF BIRDS, FISHES, INSECTS; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
    • A01K2227/00: Animals characterised by species
    • A01K2227/10: Mammal
    • A01K2227/105: Murine
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2501/00: Active agents used in cell culture processes, e.g. differentation
    • C12N2501/60: Transcription factors
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2501/00: Active agents used in cell culture processes, e.g. differentation
    • C12N2501/60: Transcription factors
    • C12N2501/605: Nanog
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2501/00: Active agents used in cell culture processes, e.g. differentation
    • C12N2501/60: Transcription factors
    • C12N2501/608: Lin28
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2506/00: Differentiation of animal cells from one lineage to another; Differentiation of pluripotent cells
    • C12N2506/13: Differentiation of animal cells from one lineage to another; Differentiation of pluripotent cells from connective tissue cells, from mesenchymal cells
    • C12N2506/1307: Differentiation of animal cells from one lineage to another; Differentiation of pluripotent cells from connective tissue cells, from mesenchymal cells from adult fibroblasts
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2510/00: Genetically modified cells
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2999/00: Further aspects of viruses or vectors not covered by groups C12N2710/00 - C12N2796/00 or C12N2800/00
    • C12N2999/007: Technological advancements, e.g. new system for producing known virus, cre-lox system for production of transgenic animals

Definitions

  • Stem cells are cells that are capable of self-renewal and of giving rise to more differentiated cells.
  • Embryonic stem (ES) cells, for example, which can be derived from the inner cell mass of a normal embryo in the blastocyst stage, can differentiate into the multiple specialized cell types that collectively comprise the body (see, e.g., U.S. Pat. Nos. 5,843,780 and 6,200,806; Thompson, J. A. et al. Science, 282: 1145-7, 1998). As cells differentiate they undergo a progressive loss of developmental potential that has generally been considered largely irreversible. Somatic cell nuclear transfer (SCNT) experiments, however, showed that nuclei from differentiated adult cells could be reprogrammed to a totipotent state by factors present in the oocyte cytoplasm.
  • SCNT: somatic cell nuclear transfer
  • SCNT and conventional methods of obtaining ES cells suffer from a number of limitations that hamper their use in regenerative medicine applications, and alternatives have been avidly sought. Examples can be found in the scientific literature in which differentiated cells of a particular type have been converted into cells of a different type without apparently being reverted to a fully pluripotent state as an intermediate step. For example, dermal fibroblasts can be converted into muscle-like cells by forced expression of MyoD.
  • such examples do not provide a general approach to generating large numbers of patient-specific cells of numerous diverse types.
  • ES have been produced by introducing genes encoding four transcription factors associated with pluripotency, i.e., Oct3/4, Sox2, c-Myc and Klf4, into mouse skin fibroblasts via retroviral infection, and then selecting cells that expressed a marker of pluripotency, Fbx15, in response to these factors (Takahashi, K. &
  • the present invention provides novel methods and compositions for reprogramming mammalian cells. Certain methods and compositions of the invention are of use to enhance generation of induced pluripotent stem cells by reprogramming somatic cells. Certain methods and compositions of the invention are of use to identify cells destined to become iPSCs. Certain compositions and methods of the invention are of use to enhance reprogramming of pluripotent mammalian cells to a differentiated cell type. Certain compositions and methods of the invention are of use to enhance reprogramming of differentiated mammalian cells of a first cell type to differentiated mammalian cells of a second differentiated cell type. The reprogrammed somatic cells are useful for a number of purposes, including treating or preventing a medical condition in an individual. The invention further provides methods for identifying an agent that enhances or contributes to reprogramming mammalian cells.
  • methods of generating a reprogrammed cell comprising: (a) introducing reprogramming factors Sall4, Nanog, Esrrb, and Lin28 into a mammalian somatic cell; and (b) culturing said cell in a suitable medium under conditions appropriate for and for a time period sufficient to give rise to a reprogrammed cell.
  • methods of generating a reprogrammed cell comprising: (a) introducing reprogramming factors Sall4, Dppa2, Esrrb, and Lin28 into a mammalian somatic cell; and (b) culturing said cell in a suitable medium under conditions appropriate for and for a time period sufficient to give rise to a reprogrammed cell.
  • methods of generating a reprogrammed cell comprising: (a) introducing reprogramming factors Sall4, Nanog, Esrrb, and any one or more of Ezh2, Kdm1, and Utf1 into a mammalian somatic cell; and (b) culturing said cell in a suitable medium under conditions appropriate for and for a time period sufficient to give rise to a reprogrammed cell.
  • methods of generating a reprogrammed cell comprising: (a) introducing reprogramming factors Sall4, Dppa2, Esrrb, and any one or more of Ezh2, Kdm1, and Utf1 into a mammalian somatic cell; and (b) culturing said cell in a suitable medium under conditions appropriate for and for a time period sufficient to give rise to a reprogrammed cell.
  • methods of generating a reprogrammed cell comprising: (a) introducing reprogramming factors Sall4, Nanog and/or Dppa2, and Esrrb, into a mammalian somatic cell; and (b) culturing said cell in a suitable medium under conditions appropriate for and for a time period sufficient to give rise to a reprogrammed cell.
  • said reprogramming factors are introduced into said somatic cell in the form of one or more nucleic acid sequences encoding the reprogramming factors.
  • said one or more nucleic acid sequences comprise DNA.
  • said one or more nucleic acid sequences comprise RNA.
  • said one or more nucleic acid sequences comprises a nucleic acid construct.
  • said one or more nucleic acid sequences comprises a vector.
  • said vector comprises an inducible vector.
  • said inducible vector activates expression of said reprogramming factors in the presence of dox in said medium.
  • said vector integrates into a genome of said somatic cell.
  • said vector comprises a viral vector.
  • said vector comprises a retroviral vector.
  • said vector comprises a lentiviral vector.
  • said vector comprises an excisable vector.
  • said excisable vector comprises a transposon, wherein said excisable vector is excisable from said genome by transient expression of a transposase.
  • said transposon comprises a piggyBac transposon.
  • said excisable vector comprises one or more loxP sites incorporated into said vector, wherein said vector can be excised from said genome by transient expression of a Cre recombinase.
  • said excisable vector comprises a floxed lentiviral vector.
  • said vector does not integrate into the genome of said somatic cell.
  • said vector comprises an adenoviral vector.
  • said vector comprises a Sendai viral vector.
  • said vector comprises a plasmid.
  • said vector comprises an episome.
  • said RNA comprises mRNA. In some embodiments said mRNA is translatable in vitro in said mammalian somatic cell. In some embodiments said mRNA is in vitro transcribed mRNA. In some embodiments said in vitro transcribed mRNA comprises a sequence encoding SV40 large T (LT). In some embodiments said in vitro transcribed mRNA comprises one or more modifications that increase stability or translatability of said mRNA. In some embodiments said in vitro transcribed mRNA comprises a 5' cap. In some embodiments said in vitro transcribed mRNA comprises an open reading frame flanked by a 5' untranslated region and a 3' untranslated region that enhance translation of said open reading frame.
  • said 5' untranslated region comprises a strong Kozak translation initiation signal.
  • said 3' untranslated region comprises an alpha-globin 3' untranslated region.
  • said in vitro transcribed mRNA comprises a polyA tail.
  • said in vitro transcribed mRNA is introduced into said somatic cell via electroporation.
  • said in vitro transcribed mRNA is introduced into said somatic cell complexed with a cationic vehicle that facilitates uptake of said mRNA into said somatic cell via endocytosis.
  • said in vitro transcribed mRNA is introduced into said somatic cell in an amount and for a period of time sufficient to maintain expression of the reprogramming factors until cellular reprogramming of said somatic cell occurs.
  • said in vitro transcribed mRNA is treated with a phosphatase to reduce a cytotoxic response by said somatic cell upon introduction of said mRNA into said somatic cell.
  • said in vitro transcribed mRNA comprises one or more base substitutions.
  • said base substitutions are selected from the group consisting of 5-methylcytidine (5mC), pseudouridine (psi), 5-methyluridine, 2'-O-methyluridine, 2-thiouridine, and N6-methyladenosine.
  • said reprogramming factors are introduced into said somatic cell in the form of one or more proteins or functional variants or fragments thereof.
  • said one or more proteins comprise a recombinant protein.
  • said one or more proteins comprise a fusion protein.
  • said one or more proteins further comprise a cell-penetrating peptide.
  • said cell-penetrating peptide is fused to a C terminus of said one or more proteins.
  • said cell-penetrating peptide comprises HIV tat.
  • said cell-penetrating peptide comprises poly-arginine.
  • said one or more proteins is introduced into said somatic cell in an amount and for a period of time sufficient for reprogramming of said somatic cell to occur.
  • such method further comprises (c) supplementing said medium with one or more agents that increase reprogramming efficiency.
  • said one or more agents are selected from the group consisting of a nucleic acid, an antisense oligonucleotide, siRNA, miRNA, an antibody or a fragment thereof.
  • said one or more agents comprise a histone deacetylase inhibitor.
  • said histone deacetylase inhibitor comprises valproic acid (VPA).
  • said histone deacetylase inhibitor comprises butyrate.
  • said one or more agents comprise an interferon inhibitor.
  • said interferon inhibitor comprises a recombinant B18R protein.
  • said one or more agents comprise a signaling pathway modulator selected from the group consisting of a TGF-beta pathway inhibitor, a MAPK/ERK pathway inhibitor, a GSK3 pathway inhibitor, a WNT pathway activator, a 3'-phosphoinositide-dependent kinase-1 (PDK1) pathway activator, a mitochondrial oxidation modulator, a glycolytic metabolism modulator, a HIF pathway activator, and combinations thereof.
  • such method further comprises (c) monitoring said culture for cells which display one or more markers of pluripotency.
  • said one or more markers of pluripotency is selected from the group consisting of Fbxo15, Nanog, Oct4, Sox2, Sall4 and combinations thereof.
  • said one or more markers of pluripotency comprise early markers of pluripotency selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2.
  • such method further comprises (c) or (d) isolating said reprogrammed cell from said culture.
  • said somatic cell is a terminally differentiated somatic cell.
  • mammalian cells comprising said isolated reprogrammed cell.
  • said isolated reprogrammed mammalian cell is a human cell.
  • said isolated reprogrammed mammalian cell is a non-human mammal cell.
  • said isolated reprogrammed mammalian cell further comprises a reporter gene integrated at a locus whose activation serves as a marker of reprogramming to pluripotency.
  • the locus is selected from Nanog, Sox2, and Oct4.
  • said isolated reprogrammed mammalian cell is an iPS cell.
  • a chimeric mouse is disclosed, such chimeric mouse generated at least in part from said isolated reprogrammed mammalian iPS cell.
  • said mouse is generated by injecting said mammalian iPS cell into a mouse blastocyst and allowing said blastocyst to develop into a mouse in vivo.
  • a cell is disclosed, such cell comprising a cell obtained from said mouse, wherein said cell is derived from said iPS cell.
  • a non-human mammal comprising a non-human mammal generated at least in part from said mammalian iPS cell.
  • said non-human mammal is a mouse.
  • methods of producing a non-human mammal comprising introducing said mammalian iPS cell into tetraploid blastocysts of the same mammalian species under conditions that result in production of an embryo and said resulting embryo is transferred into a foster mother which is maintained under conditions that result in development of live offspring.
  • said non-human mammal is a mouse.
  • said iPS cells are introduced into said tetraploid blastocysts by injection.
  • said injection is a microinjection.
  • a non-human mammal comprising a non-human mammal produced according to said method of producing a non-human mammal.
  • a mouse comprising a mouse produced according to said method of producing a non-human mammal.
  • methods of producing a non-human mammalian embryo comprising injecting non-human iPS cells generated according to a reprogramming method of the present invention into non-human tetraploid blastocysts and maintaining said resulting tetraploid blastocysts under conditions that result in formation of embryos, thereby producing a non-human mammalian embryo.
  • said non-human iPS cells are mouse cells and said non-human mammalian embryo is a mouse.
  • mutant mouse iPS cells are injected into said non-human tetraploid blastocysts by
  • a non-human mammalian embryo produced according to said method of producing a non-human mammalian embryo is disclosed.
  • said non-human mammalian embryo is a mouse embryo.
  • said somatic cells are differentiated cells of a first cell type, and said reprogramming reprograms said somatic cells to a second differentiated cell type.
  • a disclosed method comprises (a) reprogramming somatic cells to a pluripotent state according to a method of generating a reprogrammed cell of the present invention; and (b) reprogramming said pluripotent cells to a desired, differentiated cell type, wherein said differentiated cell type optionally comprises an adult stem cell or a fully differentiated cell.
  • compositions comprising multiple isolated reprogrammed mammalian iPS cells.
  • methods of treating a patient in need of such treatment comprising administering to the patient a composition comprising multiple isolated reprogrammed mammalian iPS cells.
  • methods of treating an individual in need of such treatment comprising: (a) obtaining somatic cells from said individual; (b) reprogramming said somatic cells obtained from said individual according to a method of generating reprogrammed cells of the present invention; and (c) administering at least some of said reprogrammed cells to said individual.
  • the method further comprises separating cells that are reprogrammed to a desired state from cells that are not reprogrammed to a desired state and/or wherein at least some of said reprogrammed cells are differentiated to a selected cell type prior to administration to said individual. In some embodiments, the method further comprises separating reprogrammed cells that have differentiated to a desired cell type from cells that have not differentiated to a desired cell type prior to administering the cells. In some embodiments a method comprises eliminating residual pluripotent cells ex vivo prior to administration. In some embodiments said individual is a human.
  • compositions for identifying a reprogramming agent comprising one or more cells that expresses a subset of reprogramming factors selected from the group consisting of Sall4, Nanog, Esrrb and Lin28, and a test agent.
  • said subset of reprogramming factors consists of at least three of said reprogramming factors.
  • such composition further comprises an agent that induces expression of said subset of reprogramming factors.
  • methods of identifying a reprogramming agent comprising: (a) maintaining a composition comprising one or more cells that expresses a subset of reprogramming factors selected from the group consisting of Sall4, Nanog, Esrrb and Lin28 and a test agent for a time period under conditions in which said reprogramming factors are expressed and cell proliferation occurs; and (b) assessing the extent to which cells become reprogrammed, wherein the test agent is identified as a reprogramming agent if reprogramming occurs at a similar frequency as would be the case if said composition contained all of said reprogramming factors and lacked said test agent.
  • methods of identifying a reprogramming agent comprising: (a) maintaining a composition comprising one or more cells that expresses a subset of reprogramming factors selected from the group consisting of Sall4, Nanog, Esrrb and Lin28 and a test agent for a time period under conditions in which the reprogramming factors are expressed and cell proliferation occurs; and (b) assessing the extent to which cells become reprogrammed, wherein said test agent is identified as a reprogramming agent or enhancer of reprogramming if reprogramming occurs at a significantly greater frequency than would be the case had said composition lacked said test agent.
  • said composition is maintained for at least X days.
  • said test agent is present for at least X days.
  • said test agent is identified as a reprogramming agent if cells do not become reprogrammed at a detectable frequency if maintained for said time period in the absence of said test agent but do become reprogrammed at a detectable frequency if maintained in the presence of said test agent for at least a portion of said time period.
  • said test agent is identified as an enhancer of reprogramming if cells become reprogrammed at a detectable frequency if maintained for said time period in the absence of said test agent and become reprogrammed at a significantly greater frequency if maintained in the presence of said test agent for at least a portion of said time period.
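The two decision rules in the screening methods above (a reprogramming agent versus an enhancer of reprogramming) can be sketched in code. This is only an illustrative sketch, not part of the disclosure: the function names, the choice of a one-sided two-proportion z-test for "significantly greater frequency", the 0.05 threshold, and the reading of "detectable frequency" as at least one reprogrammed cell are all assumptions.

```python
import math


def one_sided_p_value(control_hits, control_n, treated_hits, treated_n):
    """P-value for 'treated frequency > control frequency' (two-proportion z-test).

    The z-test is an assumed stand-in for whatever significance test a lab
    would actually use; the source does not specify one.
    """
    pooled = (control_hits + treated_hits) / (control_n + treated_n)
    se = math.sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / treated_n))
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (treated_hits / treated_n - control_hits / control_n) / se
    return 0.5 * math.erfc(z / math.sqrt(2))


def classify_test_agent(control_hits, control_n, treated_hits, treated_n, alpha=0.05):
    """Classify a test agent from counts of reprogrammed cells.

    control_* : culture maintained without the test agent
    treated_* : culture maintained with the test agent
    """
    if control_n <= 0 or treated_n <= 0:
        raise ValueError("cell counts must be positive")
    if treated_hits == 0:
        return "no effect detected"
    if control_hits == 0:
        # Undetectable without the agent, detectable with it.
        return "reprogramming agent"
    p = one_sided_p_value(control_hits, control_n, treated_hits, treated_n)
    return "enhancer of reprogramming" if p < alpha else "no effect detected"


if __name__ == "__main__":
    print(classify_test_agent(0, 10_000, 25, 10_000))
    print(classify_test_agent(10, 10_000, 60, 10_000))
```

The counts in the usage lines are made-up illustrative numbers, not data from the source.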
  • a nucleic acid construct comprising at least four coding regions linked to each other by nucleic acids that encode a self-cleaving peptide so as to form a single open reading frame, wherein said coding regions encode reprogramming factors Sall4, Nanog, Esrrb, and Lin28, and wherein said reprogramming factors are capable, either alone or in combination with one or more additional reprogramming factors, of reprogramming a mammalian somatic cell to pluripotency.
  • one of the four coding regions encodes Dppa2 instead of Nanog and/or one of the four coding regions encodes Kdm1, Utf1, or Ezh2 instead of Lin28 or is absent.
  • said nucleic acid construct further comprises a fifth coding region that encodes a fifth reprogramming factor, wherein said five coding regions are linked to each other by nucleic acids that encode self-cleaving peptides so as to form a single open reading frame.
  • said fifth reprogramming factor is c-Myc.
  • said nucleic acid construct further comprises fifth and sixth genes that encode fifth and sixth reprogramming factors, wherein said six coding regions are linked to each other by nucleic acids that encode self-cleaving peptides so as to form a single open reading frame.
  • said fifth reprogramming factor is c-Myc and said sixth reprogramming factor is Klf4.
  • said self-cleaving peptide is a viral 2A peptide. In some embodiments said self-cleaving peptide is an aphthovirus 2A peptide.
  • said construct does not encode Oct4. In some embodiments said construct does not encode Klf4. In some embodiments said construct does not encode Sox2. In some embodiments said construct does not encode c-Myc.
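As a rough illustration of the polycistronic layout described above, the following sketch joins coding regions with self-cleaving peptide linkers and checks the stated constraints. The function name `layout_orf`, the `"2A"` label, and the order of the factors are hypothetical choices made for this example; the source does not specify an order, and no real sequences are modeled.

```python
def layout_orf(factors, linker="2A", excluded=()):
    """Return a schematic single-ORF layout such as 'A-2A-B-2A-C-2A-D'.

    factors  : ordered names of the coding regions (order is an assumption)
    linker   : label for the self-cleaving peptide between coding regions
    excluded : factors the construct must not encode (e.g. Oct4)
    """
    if len(factors) < 4:
        raise ValueError("the construct comprises at least four coding regions")
    if len(set(factors)) != len(factors):
        raise ValueError("each coding region should appear only once")
    banned = set(factors) & set(excluded)
    if banned:
        raise ValueError(f"construct must not encode: {sorted(banned)}")
    # n coding regions are joined by n - 1 linkers into one open reading frame.
    return f"-{linker}-".join(factors)


if __name__ == "__main__":
    base = ["Sall4", "Nanog", "Esrrb", "Lin28"]
    print(layout_orf(base, excluded=("Oct4", "Sox2")))
    # Six-factor variant with c-Myc as fifth and Klf4 as sixth factor.
    print(layout_orf(base + ["c-Myc", "Klf4"]))
```

Swapping `"Nanog"` for `"Dppa2"` in `base` gives the alternative four-factor layout mentioned above.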
  • expression cassettes comprising said nucleic acid construct operably linked to a promoter, wherein said promoter drives transcription of a polycistronic message that encodes said reprogramming factors, each reprogramming factor being linked to at least one other reprogramming factor by a self-cleaving peptide.
  • said expression cassette further comprises one or more sites that mediate integration into a genome of a mammalian cell. In some embodiments said expression cassette is integrated into said genome at a locus whose disruption has minimal or no effect on said cell. In some aspects, expression vectors comprising said expression cassette are disclosed. In some embodiments said vector is retroviral. In some embodiments said promoter is inducible.
  • reprogramming compositions are disclosed, such compositions comprising at least two, three, or four reprogramming factors selected from the group consisting of Sall4 protein, Nanog protein, Esrrb protein, and Lin28 protein, or functional variants or fragments thereof or nucleic acids encoding any of the foregoing.
  • reprogramming compositions are disclosed, such compositions comprising at least two, three, or four reprogramming factors selected from the group consisting of Sall4 protein, Dppa2 protein, Esrrb protein, and Lin28 protein, or functional variants or fragments thereof or nucleic acids encoding any of the foregoing.
  • reprogramming compositions comprising at least two, three, or four reprogramming factors selected from the group consisting of Sall4 protein, Nanog protein, Esrrb protein, and any of Kdm1, Utf1, or Ezh2 protein, or functional variants or fragments thereof or nucleic acids encoding any of the foregoing.
  • reprogramming compositions are disclosed, such compositions comprising at least two, three, or four reprogramming factors selected from the group consisting of Sall4 protein, Dppa2 protein, Esrrb protein, and any of Kdm1, Utf1, or Ezh2 protein, or functional variants or fragments thereof or nucleic acids encoding any of the foregoing.
  • each of said reprogramming factors comprises a cell-penetrating peptide fused to its C terminus.
  • said cell-penetrating peptide comprises poly-arginine.
  • the invention provides methods of producing a pluripotent cell from a somatic cell, such methods comprising the steps of: (a) introducing exogenous reprogramming factors Sall4, Nanog, Esrrb, and Lin28 into one or more somatic cells; (b) maintaining said one or more cells under conditions appropriate for and for a period of time sufficient for said exogenous reprogramming factors to activate at least one endogenous pluripotency gene; (c) selecting one or more cells which display an early marker of pluripotency; (d) generating a colony or an embryo utilizing said one or more cells which display the early marker of pluripotency; (e) obtaining one or more somatic cells from said colony or embryo; (f) maintaining said one or more somatic cells under conditions appropriate for and for a period of time sufficient for said exogenous reprogramming factors to activate at least one endogenous pluripotency gene; and (g) differentiating between cells which display one or more markers of pluripotency and cells which do not.
  • said early marker of pluripotency is selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2. In some embodiments said early marker of pluripotency is a group of early markers of pluripotency consisting of Esrrb, Utf1, Lin28, and Dppa2. In some embodiments step (d) comprises selecting one or more cells which display an early marker of pluripotency and at least one marker of pluripotency.
  • the present invention provides isolated pluripotent cells produced by a method comprising: (a) introducing exogenous reprogramming factors Sall4, Nanog, Esrrb, and Lin28 into one or more somatic cells; (b) maintaining said one or more cells under conditions appropriate for and for a period of time sufficient for said exogenous reprogramming factors to activate at least one endogenous pluripotency gene; (c) selecting one or more cells which display an early marker of pluripotency; (d) generating a colony or an embryo utilizing said one or more cells which display the early marker of pluripotency; (e) obtaining one or more differentiated somatic cells from said colony or embryo; (f) maintaining said one or more differentiated somatic cells under conditions appropriate for and for a period of time sufficient for said reprogramming factors to activate at least one endogenous pluripotency gene; and (g) differentiating between cells which display one or more markers of pluripotency and cells which do not.
  • Nanog is replaced by Dppa
  • said early marker of pluripotency is selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2. In some embodiments said early marker of pluripotency is a group of early pluripotency markers consisting of Esrrb, Utf1, Lin28, and Dppa2. In some embodiments step (d) comprises selecting one or more cells which display an early marker of pluripotency and at least one marker of pluripotency.
  • methods of selecting a somatic cell that is likely to be reprogrammed to a pluripotent state comprising (a) measuring expression of one or more early markers of pluripotency in a population of a plurality of somatic cells; (b) sorting the population of the plurality of somatic cells into a plurality of populations of single somatic cells; and (c) measuring expression of the one or more early markers of pluripotency in each population of single somatic cells, wherein increased expression of the one or more early markers of pluripotency in each population of single somatic cells as compared to expression of the one or more early markers of pluripotency in the population of the plurality of somatic cells indicates that the single somatic cell is a somatic cell that is likely to be
  • said one or more early markers of pluripotency are selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2.
  • methods of selecting a cell that is likely to become programmed to a pluripotent state comprising (a) maintaining a population of a plurality of differentiated somatic cells containing at least one exogenously introduced factor that contributes to reprogramming of said cells to a pluripotent state under conditions appropriate for proliferation and for reprogramming of said cells to occur; (b) sorting said population of said plurality of cells into a plurality of populations of single cells; and (c) isolating said sorted cells which display one or more early markers of pluripotency, wherein each sorted cell which displays said one or more early markers of pluripotency is a cell that is likely to become programmed to the pluripotent state.
  • said one or more early markers of pluripotency are selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2.
  • methods for increasing the efficiency of the expansion of induced pluripotent stem cells comprising (a) maintaining a population of differentiated somatic cells that contains at least one exogenously introduced factor that contributes to reprogramming of said population of cells to a pluripotent state under conditions appropriate for proliferation and for reprogramming of said cells to occur; (b) monitoring each cell in said population of cells for the expression of one or more early pluripotency markers, wherein cells expressing the one or more early pluripotency markers are more likely to become programmed to a pluripotent state than cells which do not express the one or more early pluripotency markers; (c) isolating each cell in said population of cells that expresses the one or more early pluripotency markers; and (d) expanding only those cells which express the one or more early pluripotency markers, thereby increasing the efficiency of the expansion of induced pluripotent stem cells.
  • said one or more early pluripotency markers is selected from the group consisting of Esrrb, Utf1, Lin28, Dppa2, and combinations thereof.
  • said monitoring of said cells is performed during a stochastic phase of reprogramming.
  • proliferation of said cell forms a clonal colony of said cell.
  • methods of increasing the likelihood that a differentiated somatic cell subjected to a reprogramming protocol will become reprogrammed to an iPSC comprising introducing into the differentiated somatic cell one or more early pluripotency factors prior to subjecting the differentiated somatic cell to said reprogramming protocol.
  • said one or more early pluripotency factors is selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2.
  • methods of isolating an iPS colony comprising: (a) introducing exogenous reprogramming factors Sall4, Nanog, Esrrb, and Lin28 into a differentiated mammalian somatic cell; (b) culturing said differentiated somatic cell in a suitable medium under conditions appropriate for and for a time period sufficient for proliferation of and reprogramming of said cells to occur; and (c) isolating one or more colonies visible in said culture after said period of time.
  • each of said exogenous reprogramming factors is introduced into said cell in the form of a recombinant protein comprising a cell-penetrating peptide fused to a C terminus of said recombinant protein.
  • each of said exogenous reprogramming factors is introduced into said cell in the form of mRNA optionally complexed with a cationic vehicle, wherein said mRNA comprises in vitro transcribed mRNA comprising one or more of a 5' cap, an open reading frame flanked by a 5' untranslated region containing a strong Kozak translation initiation signal and an alpha-globin 3' untranslated region, a polyA tail, and one or more modifications which confer stability to the mRNA.
  • such method further comprises: (d) growing said isolated one or more colonies on a layer of feeder cells in the absence of an inducer of said inducible transgenes.
  • such method further comprises (e) passaging said one or more grown colonies at least once.
  • methods of enhancing isolation of iPSCs comprising (d) sorting said one or more colonies visible in said culture after said period of time into single cells; (e) differentiating between said sorted cells which display one or more early markers of pluripotency and said sorted cells which do not display one or more early markers of pluripotency; and (f) isolating said sorted cells which display one or more early markers of pluripotency.
  • a mouse iPS cell characterized by an efficiency of said mouse iPS cell of generating live offspring by tetraploid complementation is disclosed, wherein said efficiency is at least 5%.
  • methods of producing a mouse iPS cell characterized by an efficiency of said mouse iPS cell of generating live offspring by tetraploid complementation of at least 5% comprising: (a) transfecting mouse embryonic fibroblasts with a dox-inducible vector comprising reprogramming factors Sall4, Nanog, Esrrb and Lin28 operably linked to a tetracycline operator and a CMV promoter; (b) culturing said mouse embryonic fibroblasts under conditions suitable and for a time period sufficient for proliferation and reprogramming of said mouse embryonic fibroblasts to occur; (c) exposing said culture to an effective amount of doxycycline for a period of time sufficient for one or more iPS colonies to form; (d) isolating said one or more iPS colonies; (e) growing said isolated iPS colonies on feeder cells in the absence of doxycycline; and optionally (f) passaging said grown iPS colonies at least once.
  • the present invention provides a collection of reprogramming factors capable of producing a mouse iPS cell having an efficiency of generating live offspring by tetracomplementation of at least 5%, such collection comprising Sall4, Nanog, Esrrb, and Lin28.
  • kits for generating a reprogrammed cell in vitro comprising: (a) a set of reprogramming factors comprising Sall4, Nanog, Esrrb and Lin28, which are capable alone, or in combination with one or more additional reprogramming factors, of reprogramming said mammalian somatic cells to a pluripotent state, wherein the kit optionally comprises (b) a medium suitable for culturing mammalian iPS cells and/or (c) a population of mammalian somatic cells, and wherein the reprogramming factors are optionally provided as one or more nucleic acids (e.g., one or more vectors) encoding said reprogramming factors.
  • kits further comprise (d) one or more reagents for an assay for detecting one or more markers of pluripotency.
  • the one or more markers of pluripotency is an early marker of pluripotency selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2.
  • kits further comprise one or more of: (e) instructions for preparing the medium; (f) instructions for deriving or culturing pluripotent cells; (g) serum replacement; (h) albumin; (i) at least one protein or small molecule useful for deriving or culturing iPS cells, wherein the protein or small molecule activates or inhibits a signal transduction pathway; (j) a population of mammalian somatic cells; and (k) at least one reagent useful for characterizing pluripotent cells.
  • at least some of the ingredients are dissolved in liquid. In some embodiments at least some of the ingredients are provided in dry form.
  • Figures 1A-1F Experimental scheme used to monitor transcriptional profiles of single cells at defined timepoints during the reprogramming process.
  • A Scheme used for measuring single-cell gene expression with Fluidigm BioMark after the addition of dox at days 2, 4, and 6.
  • B Representative images of Nanog-GFP2 (NGFP2) cells without dox and undergoing the reprogramming process after the addition of dox at days 2, 4, and 6.
  • C Scheme of NGFP2/tdTomato secondary system used to measure single-cell gene expression of clonal dox-dependent (GFP-, GFP+) and independent (GFP+) cells.
  • D Representative images and FACS analysis of dox-dependent and independent cells after the addition of dox at days 12, 32, and 61.
  • FIGS 2A-2C NGFP2-tdTomato system. Representative images of bright field, GFP, and tdTomato in (A) NGFP2-iPSCs-tdTomato and (B) NGFP2-MEFs-tdTomato after six days of dox exposure. (C) Flow cytometric analysis of GFP and tdTomato in NGFP2 cells of Colony 44 on dox for 61 days.
  • FIGS 3A-3B Fluidigm data. Representative (A) raw and (B) normalized Fluidigm data for NGFP2-MEFs Colony 15-day 12 on dox, NGFP2-iPSCs. See Supplemental Methods for detailed explanation of normalization and data analysis.
  • Figures 4A-4D Two defined reprogramming populations
  • A Principal component (PC) projections of individual cells, colored by their sample identification. The blue circle surrounds one population and the red circle surrounds another population. The orange dotted circle surrounds a third potential population.
  • B PC projections of the 48 genes, showing the contribution of each gene to the first two PCs. The first PC can be interpreted as discriminating between cluster 1 and cluster 2; the second between pluripotency genes and cell cycle regulators.
  • C Jensen Shannon Divergence analysis of within-group variability, colored by the same sample identification as in (A).
  • D Jensen Shannon Divergence analysis of within-colony variability, colored by the same sample identification as in (A) and (C).
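The Jensen-Shannon divergence named in panels (C) and (D) can be illustrated with its standard definition. The sketch below assumes each cell's expression profile has been converted to a non-negative vector that is normalized to sum to one; how the profiles were actually prepared is described only in the referenced Supplemental Methods, so the inputs and the function name are illustrative.

```python
import math


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    p and q are equal-length sequences of non-negative weights; each is
    normalized to sum to 1 before the divergence is computed.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("each distribution needs positive total weight")
    p = [x / total_p for x in p]
    q = [x / total_q for x in q]
    # Mixture distribution M = (P + Q) / 2
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence; terms with a_i == 0 contribute 0.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


identical = jensen_shannon_divergence([1, 1], [2, 2])
disjoint = jensen_shannon_divergence([1, 0], [0, 1])
```

With base-2 logarithms the value lies between 0 (identical profiles) and 1 (non-overlapping profiles), which makes within-group and within-colony variability comparable on one scale.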
  • Figures 5A-5D Two defined reprogramming populations
  • Colonies 23 and 44 Representative images of Colonies 23 and 44 and flow cytometric analysis of tdTomato and GFP at day 81.
  • Colony 23 failed to activate GFP in the majority of cells upon continual passaging to day 81 (0.01% tdTomato+/GFP+).
  • Colony 44 contained a few cells with a low level of GFP that disappeared upon continual passaging and dox-withdrawal.
  • B Representative images of stable dox-independent GFP+ colonies after 30 days of treatment with AZA.
  • C Flow cytometric analysis of GFP in Colony 23 (2.2% GFP+) and Colony 44 (0.5% GFP+) after 30 days of treatment with AZA.
  • Figures 9A-9D Model to predict the order of transcriptional events in single cells.
  • A Bayesian network to describe the hierarchy of transcriptional events among a subset of pluripotent genes.
  • B Bar plot of fraction of cells with transcripts, quantified by single molecule mRNA FISH, of Sox2, Sall4, Fgf4 (single positive, purple), Sox2/Sall4, Sall4/Fgf4, Sox2/Fgf4 (double positive, brown), and Sox2/Sall4/Fgf4 (triple positive, blue) expression in NGFP2 cells at day 12 on dox.
  • The number of cells in each category is indicated on top of each bar.
  • C Bar plot of fraction of cells with transcripts, quantified by single molecule mRNA FISH, of Sox2, Lin28, Dnmt3b (single positive, purple), Sox2/Lin28, Lin28/Dnmt3b, Sox2/Dnmt3b (double positive, brown), and Sox2/Lin28/Dnmt3b (triple positive, blue) expression in NGFP2 cells at day 12 on dox.
  • The number of cells in each category is indicated on top of each bar.
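The single, double, and triple positive categories plotted in panels (B) and (C) amount to tallying which genes have detectable transcripts in each cell. A minimal sketch of that tally follows; the threshold of one transcript, the data layout, and the function name are assumptions for illustration, not details given in the figure legend.

```python
from collections import Counter


def coexpression_fractions(cells, genes, threshold=1):
    """Fraction of cells in each co-expression category.

    cells:     list of dicts mapping gene name -> transcript count per cell
    genes:     ordered gene names to score, e.g. ("Sox2", "Sall4", "Fgf4")
    threshold: minimum transcript count to call a gene positive (assumed)
    Returns a dict mapping a tuple of positive genes -> fraction of cells.
    """
    if not cells:
        return {}
    counts = Counter()
    for cell in cells:
        positive = tuple(g for g in genes if cell.get(g, 0) >= threshold)
        counts[positive] += 1
    total = len(cells)
    return {combo: n / total for combo, n in counts.items()}


example_cells = [
    {"Sox2": 3, "Sall4": 0, "Fgf4": 0},  # single positive
    {"Sox2": 2, "Sall4": 5, "Fgf4": 1},  # triple positive
    {"Sox2": 0, "Sall4": 0, "Fgf4": 0},  # negative for all three
    {"Sox2": 1, "Sall4": 0, "Fgf4": 0},  # single positive
]
fractions = coexpression_fractions(example_cells, ("Sox2", "Sall4", "Fgf4"))
```

The length of each key gives the category (one gene for single positive, two for double positive, three for triple positive), so the bars of such a plot can be read directly from the returned dictionary.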
  • Figures 10A-10F Late candidate markers.
  • A and D mRNA expression levels of Gdf3 and Sox2 in populations noted in Figure 1 and legend of Figures 10A-10F (right) are shown in violin plots. Median values are indicated by red line, lower and upper quartiles by blue rectangle, and sample minima/maxima by black line.
  • C and F Quantitative RT-PCR of Gdf3 and Sox2 expression in MEFs, NGFP2 iPSCs, Colony 23, and Colony 44, normalized to the Hprt housekeeping control gene. Error bars are presented as a mean ± standard deviation of two duplicate runs from a typical experiment.
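Normalization of quantitative RT-PCR data to a housekeeping gene such as Hprt is commonly done with the 2^(-ΔCt) calculation. The legend does not state which formula was used, so the sketch below is an assumption: it takes threshold-cycle (Ct) values for the target gene and for Hprt from duplicate runs and reports the mean and sample standard deviation. The function names and Ct values are made up for illustration.

```python
from statistics import mean, stdev


def relative_expression(ct_target, ct_reference):
    """Expression of a target gene relative to a reference gene (2^-dCt).

    Assumes roughly 100% amplification efficiency for both primer pairs.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


def summarize_runs(runs):
    """Mean and sample standard deviation over replicate runs.

    runs: list of (ct_target, ct_reference) pairs, one pair per run;
    at least two runs are needed for a standard deviation.
    """
    values = [relative_expression(t, r) for t, r in runs]
    return mean(values), stdev(values)


# Hypothetical duplicate runs: (Ct of target gene, Ct of Hprt)
duplicate_runs = [(22.0, 20.0), (21.0, 20.0)]
avg, sd = summarize_runs(duplicate_runs)
```

A target that crosses threshold two cycles later than Hprt is scored at one quarter of the Hprt level; the error bar is then the standard deviation across the replicate values.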
  • A Flow cytometric analysis of GFP in Oct4-GFP cells reprogrammed with Oct4, Esrrb, Nanog, Klf4, c-Myc, 25 days on dox, 5 days without dox. Representative images of stable dox-independent GFP+ colonies and bright-field pictures of chimeras derived from these iPSCs are shown.
  • C Flow cytometric analysis of GFP in Oct4-GFP cells reprogrammed with Lin28, Sall4, Esrrb, Nanog, Klf4, and c-Myc, 25 days on dox, 5 days without dox. Representative images of stable dox-independent GFP+ colonies and bright-field pictures of chimeras derived from these iPSCs are shown.
  • D Flow cytometric analysis of GFP in Oct4-GFP cells reprogrammed with Lin28, Sall4, Esrrb, and Nanog, 25 days on dox, 5 days without dox.
  • G Flow cytometric analysis of GFP in Oct4-GFP cells reprogrammed with Lin28, Sall4, Ezh2, Nanog, Klf4 and c-Myc. Representative bright-field pictures of the cells 25 days on dox, 1 day post dox withdrawal, and 7 days post dox withdrawal are shown (bottom). Flow cytometric analysis of GFP at 7 days post dox withdrawal is shown (upper right).
  • FIGS 12A-12F Analysis of Ezh2 and individual factor contributions.
  • A Flow cytometric analysis of GFP upon overexpression of Ezh2 and dox exposure for 7 days followed by 3 days of dox withdrawal.
  • B Quantitative RT-PCR of Ezh2 expression in NGFP2 cells, three days post shRNA knockdown. Two hairpins were used and expression levels were normalized to Hprt.
  • C Alkaline phosphatase immunostaining of NGFP2 cells after 16 days of shRNA knockdown and dox addition.
  • D Flow cytometric analysis of GFP in NGFP2 cells at day 16 upon shRNA knockdown and dox addition. GFP+ cells are gated.
  • E Flow cytometric analysis of GFP upon overexpression of Lin28, Sall4, Esrrb, and Nanog individually in NGFP2 MEFs on dox for 10 days followed by 4 days dox withdrawal.
  • F Flow cytometric analysis of GFP upon overexpression of Nanog individually in NGFP2 MEFs on dox for 16 days followed by 3 days dox withdrawal.
  • Figures 13A-13C Model of the reprogramming process.
  • the reprogramming process can be split into two phases: an early stochastic phase (A and B) of gene activation followed by a later more deterministic phase (C) of gene activation that begins with the activation of the Sox2 locus.
  • the cell can proceed into either one of two stochastic phases.
  • stochastic gene activation can lead to the activation of the Sox2 locus.
  • stochastic gene activation can lead to the activation of "predictive markers" like Utf1, Esrrb, Dppa2, Lin28, which then mark cells that have a higher probability of activating the Sox2 locus.
  • Activation of the Sox2 locus can be via two potential paths: (1) direct activation of the Sox2 locus or (2) sequential gene activation that leads to the activation of the Sox2 locus.
  • probabilistic events decrease and hierarchical events increase as the cell progresses from fibroblast to iPSC.
  • Solid red arrows and black arrows denote hypothetical interactions and interactions supported by our data, respectively.
  • the white gap shown between the stochastic (A and B) and deterministic (C) panels represents the transition from induced fibroblast to iPSC illustrated between the orange dotted cluster and red cluster in Figure 4A.
  • FIG. 14A-14D Characterization of SNEL-iPSC lines.
  • A Schematic presentation of Bayesian network demonstrates the hierarchy of a subset of pluripotent genes that leads to a stable and transgene independent pluripotency state 22.
  • Sall4, Nanog, Esrrb and Lin28 are marked by red circles.
  • B Representative images of two stable dox-independent, GFP-positive colonies (Nanog-GFP SNEL#1 and Oct4-GFP SNEL#3) and immunostaining for Sall4, Sox2, Utf1 and Esrrb.
  • Figure 15A-15D SNEL-iPSCs produce "all-iPSC" mice with high success rate compared to OSKM.
  • A Table summarizing the developmental potential of SNEL-iPSC or OSKM-iPSC lines via 4n complementation assay.
  • Implantation Sites The number was not recorded in all experiments; when not documented, an "N/D" mark was made. The ">" sign denotes that implantations were recorded only in females in which c-section was performed. The "+" sign denotes that implantation sites were recorded for some females only.
  • Dead fetuses and pups This represents the number of fetuses found dead in utero and pups found dead at the time of c-section or right after natural birth.
  • C Representative images of 4n adult mice produced from Oct4-GFP SNEL#1 and Oct4-GFP SNEL#4 lines and their F1 generation.
  • D Confirmation of origin of "all-iPSC" mice by PCR for strain-specific polymorphisms. Two different Simple Sequence Length Polymorphism (SSLP) markers were tested using genomic DNA isolated from tissues of "all-iPSC" mice. Genomic DNA from the parental iPSCs (donor cells), as well as from a 129 Sv/Jae mouse (donor strain) and a B6D2F1 mouse (host blastocyst strain) served as controls.
  • SSLP Simple Sequence Length Polymorphism
  • Figure 16A-16D Unbiased comparative transcriptome analyses distinguish iPSCs according to their 4n proficiency.
  • FIG. 17A-17C SNEL-iPSC lines produce healthy chimeras with high contribution.
  • A Table summarizing the ability of all SNEL-iPSC lines to contribute to chimeras. The percentage of chimerism is estimated qualitatively based on coat color. The incidence of germline transmission is also recorded. "N/D" is used to denote that these mice were not tested for germline transmission.
  • B Representative pictures of chimeras and their estimated percentage of chimerism.
  • C Representative pictures of adult mice and their progeny from two lines that were tested for germline transmission. Germ line transmission is based on the presence of agouti pups in the litters.
  • FIG. 1 Oct4-GFP SNEL#2 secondary MEFs express high levels of Lin28 and Esrrb. Secondary MEFs derived from Oct4-GFP SNEL#1 and Oct4-GFP SNEL#2 were exposed to dox for 48 hours and analyzed for the expression of Lin28, Esrrb, Oct4 and Sox2. MEFs and iPSCs served as controls. Error bars are presented as a mean ± standard deviation (SD) of two duplicate runs from a typical experiment.
  • SD standard deviation
  • Figure 1 "All-iPSC" pups produced from SNEL-iPSC lines. Representative pictures of entire litters after 4n complementation assay for two Oct4-GFP and two Nanog-GFP SNEL-iPSC lines. The female number is shown at the bottom of each litter.
  • Figure 20A-20B SNEL-iPSCs produce "all-iPSC" mice with high success rate compared to OSKM.
  • FIG 21 Pups generated from poor, good and high quality iPSC lines.
  • Representative images of small and abnormal pups from a "poor" quality iPSC line (Nanog-GFP SNEL#2) are shown on the left.
  • Representative photos are shown for pups born live from a "good" quality iPSC line (Nanog-GFP SNEL#3). These pups breathed normally at birth, but died within a few hours.
  • On the right, one-week-old pups are shown from a "high" quality iPSC line (Oct4-GFP SNEL#1). These pups grew to adulthood.
  • FIG 22A-22B Comparative transcriptome analysis demonstrates similar global gene expression profiles across ESC and iPSC samples.
  • A Hierarchical clustering of global gene expression profiles for two microarray technical replicates for every iPSC and ESC (reference) line. Replicate pairs are assigned a shared numerical value. Each group (poor, good, high and ESCs) is marked by different color.
  • B Principal component analysis for expression data from (A). Each of the iPSC and ESC groups is marked by a specific color and is surrounded by a circle. The numbers inside the circles correspond to the numbers in Figure 16A.
  • Figure 23 Comparative DNA methylome analysis of iPSCs and ESCs.
  • Hierarchical clustering by 2628 differentially methylated regions (DMRs) derived from whole genome bisulphite sequencing does not segregate samples by either reprogramming factor combination or ESC versus iPSC status. Each group (poor, good, high and ESCs) is marked by different color.
  • the present invention relates in some aspects to novel methods and compositions for reprogramming mammalian cells. Certain methods and compositions of the invention are of use to enhance generation of induced pluripotent stem cells by reprogramming somatic cells. Certain methods and compositions of the invention are of use to identify cells destined to become iPSCs. Certain compositions and methods of the invention are of use to enhance reprogramming of pluripotent mammalian cells to a differentiated cell type. Certain compositions and methods of the invention are of use to enhance reprogramming of differentiated mammalian cells of a first cell type to differentiated mammalian cells of a second differentiated cell type. The reprogrammed somatic cells are useful for a number of purposes, including treating or preventing a medical condition in an individual. The invention further provides methods for identifying an agent that enhances or contributes to
  • Differentiated cells can be reprogrammed to a pluripotent state by
  • iPSCs Fully reprogrammed induced pluripotent stem cells
  • the reprogramming process is characterized by widespread epigenetic changes (Kim et al., 2010; Koche et al., 2011; Maherali et al., 2007;
  • reprogramming process shows that the immediate response to the reprogramming factors is characterized by de-differentiation of mouse embryonic fibroblasts (MEFs) and upregulation of proliferative genes, consistent with the expression of c-Myc. It has been shown that expression of early markers such as alkaline phosphatase and SSEA1 is followed by activation of endogenous pluripotency markers, Sox2 and Nanog (Brambrink et al., 2008; Stadtfeld et al., 2008). Live imaging analysis of single cells enabled retroactive tracking of reprogramming events and defined transitions within induced cells (Smith et al., 2010).
  • Single-cell analysis can provide a snapshot of the state of individual cells in heterogeneous cell populations and therefore elucidate unknown genes and signaling pathways involved in the reprogramming process (Graf and Stadtfeld, 2008; Hayashi et al., 2008; Kalisky et al., 2011; Kalisky and Quake, 2011; Raj and van Oudenaarden, 2008; Tang et al., 2010; Tang et al., 2009; Tang et al., 2011).
  • nuclear transfer Boiani et al., 2002
  • cell fusion Boiani et al., 2010; Do and Scholer, 2010
  • reprogramming rapidly and possibly as a single event with little heterogeneity observed in somatic cells possibly consistent with a deterministic process (Hanna et al., 2010)
  • So far, the molecular analyses of reprogramming have been based on gene expression measurements over heterogeneous populations of cells, precluding insight into events that occur in the rare single cells that ultimately become iPS cells.
  • Sox2 is indispensable for maintaining ES-cell pluripotency because Sox2-null ES cells differentiated primarily into trophectoderm-like cells and it was suggested, consistent with our hypothesis, that Sox2 was partially responsible for the activation of Oct4 by maintaining high levels of orphan nuclear receptors like Nr5a2 (Lrh1) (Masui et al., 2007).
  • Sox2 activator Esrrb from a cocktail of transcription factors (Lin28, Sall4, Nanog, Ezh2, Klf4 and c-Myc) yielded iPS-like colonies that were unstable due to their failure to activate the core pluripotency circuitry.
  • Activation of the endogenous Sox2 represents a late cell state and can be considered as a first step that drives a consecutive chain of events that allow the cells to enter the pluripotent state.
  • Dppa2 could substitute for Nanog in this combination, i.e., Sall4, Dppa2, Esrrb, and Lin28 (SDEL reprogramming factors) are sufficient to generate fully reprogrammed iPSCs, consistent with our model.
  • Lin28 could be replaced by, e.g., any of Ezh2, Kdm1, and Utf1.
  • single cell gene expression analysis revealed an unanticipated heterogeneity in gene expression between sister cells, consistent with stochastic epigenetic alterations during the early phase of the reprogramming process. This was followed by a more hierarchical mechanism late in the process where activation of some key genes predicts the expression of downstream genes and the establishment of the pluripotency circuitry.
  • Sall4, Nanog, Esrrb, and Lin28 are reprogramming factors (the SNEL reprogramming factors).
  • Also disclosed herein are methods of generating a reprogrammed cell comprising: (a) introducing reprogramming factors Sall4, Dppa2, Esrrb, and Lin28 into a mammalian somatic cell; and (b) culturing said cell in a suitable medium under conditions appropriate for and for a time period sufficient to give rise to a reprogrammed cell.
  • Also disclosed herein are methods of generating a reprogrammed cell comprising: (a) introducing reprogramming factors Sall4, Dppa2, Esrrb, and any one or more of Ezh2, Kdm1, or Utf1 into a mammalian somatic cell; and (b) culturing said cell in a suitable medium under conditions appropriate for and for a time period sufficient to give rise to a reprogrammed cell.
  • reprogramming factor Sall4 refers to GeneID 57167 of NCBI's Gene database or a homolog thereof.
  • reprogramming factor Sall4 refers to a reprogramming factor obtained using the primers in Table 2 below.
  • reprogramming factor Nanog refers to GeneID 79923 of NCBI's Gene database or a homolog thereof.
  • reprogramming factor Nanog refers to a reprogramming factor obtained using the primers in Table 2 below.
  • reprogramming factor Esrrb refers to GeneID 2103 of NCBI's Gene database or a homolog thereof.
  • reprogramming factor Esrrb refers to a reprogramming factor obtained using the primers in Table 2 below.
  • reprogramming factor Lin28 refers to GeneID 79727 of NCBI's Gene database or a homolog thereof. In some embodiments, reprogramming factor Lin28 refers to a reprogramming factor obtained using the primers in Table 2 below. In some embodiments, reprogramming factor Dppa2 refers to GeneID 151871 of NCBI's Gene database or a homolog thereof. In some embodiments, reprogramming factor Dppa2 refers to a
  • reprogramming factor Ezh2 refers to GeneID 2146 of NCBI's Gene database or a homolog thereof. In some embodiments, reprogramming factor Ezh2 refers to a reprogramming factor obtained using the primers in Table 4 below. In some embodiments, reprogramming factor Kdm1 refers to Kdm1a having GeneID 23028 of NCBI's Gene database or a homolog thereof. In some embodiments, reprogramming factor Utf1 refers to GeneID 8433 of NCBI's Gene database or a homolog thereof. In some embodiments, reprogramming factor Utf1 refers to a reprogramming factor obtained using the primers in Table 4 below.
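The GeneID assignments listed above can be gathered into a small lookup table. The sketch below records only the identifiers stated in the text. The gene symbols Ezh2, Kdm1a, and Utf1 are the presumed intended spellings of names that appear garbled in the extracted text, and the dictionary and function names are hypothetical.

```python
# NCBI Gene identifiers as stated in the text; symbols for Ezh2, Kdm1a and
# Utf1 are assumed corrections of garbled spellings.
REPROGRAMMING_FACTOR_GENE_IDS = {
    "Sall4": 57167,
    "Nanog": 79923,
    "Esrrb": 2103,
    "Lin28": 79727,
    "Dppa2": 151871,
    "Ezh2": 2146,
    "Kdm1a": 23028,
    "Utf1": 8433,
}

# Factor combinations named in the text.
FACTOR_SETS = {
    "SNEL": ("Sall4", "Nanog", "Esrrb", "Lin28"),
    "SDEL": ("Sall4", "Dppa2", "Esrrb", "Lin28"),
}


def gene_ids_for(set_name):
    """Return {factor: GeneID} for a named factor combination."""
    return {f: REPROGRAMMING_FACTOR_GENE_IDS[f] for f in FACTOR_SETS[set_name]}


snel_ids = gene_ids_for("SNEL")
```

Each definition also covers homologs of the listed gene, which a flat table like this does not capture.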
  • Reprogramming refers to a process that alters the differentiation state or identity of a cell.
  • Cells are classified into different “types” based on various criteria such as morphological and functional characteristics and gene expression profile.
  • Cell state encompasses the concept of "cell type” or “cell identity” but also refers to any one or more features or characteristics (or sets of features or characteristics) that characterize a cell (e.g., pluripotent state, differentiated state, post-mitotic state, etc.). It will be understood that in at least some aspects the initial cell(s) gives rise to a population of descendants and that reprogramming occurs over time within the population of cells.
  • any aspect herein pertaining to a cell pertains to a population comprising multiple cells.
  • a cell, reprogramming factor or combination thereof, or composition comprising one or more cells and/or reprogramming factors is isolated or ex vivo.
  • the invention provides methods for reprogramming somatic cells to a less differentiated state. The resulting cells thus reprogrammed are sometimes referred to herein as "ES-like" or "iPSCs" if they are pluripotent.
  • reprogramming entails complete reversion of the differentiation state of a somatic cell to a pluripotent state, in which the cell has the ability to differentiate into or give rise to cells derived from all three embryonic germ layers (endoderm, mesoderm and ectoderm) and typically has the potential to divide in vitro for a long period of time, e.g., greater than one year or more than 30 passages.
  • reprogramming entails partial reversion of the differentiation state of a differentiated somatic cell to a multipotent state, in which the cell is able to differentiate into some but not all of the cells derived from all three germ layers.
  • reprogramming entails differentiating a pluripotent cell (e.g., an iPSC) or multipotent cell to a more differentiated cell of a desired cell type.
  • reprogramming entails converting a cell of a first differentiated cell type into a cell of a second differentiated cell type (also referred to as "trans-differentiation"), without apparently going through an intermediate stage of pluripotency.
  • the methods for reprogramming cells are performed in vitro, i.e., they are practiced using cells maintained in culture.
  • reprogramming factor refers to a gene, RNA, or protein that promotes or contributes to cell reprogramming, e.g., in vitro. Many useful reprogramming factors are transcription factors. In aspects of the invention relating to reprogramming factor(s), the invention provides embodiments in which the reprogramming factor(s) are of interest for reprogramming somatic cells to pluripotency in vitro. Examples of reprogramming factors of interest for reprogramming somatic cells to pluripotency in vitro are Sall4, Nanog, Esrrb, Lin28, Klf4, c-Myc, and any gene/RNA/protein that can substitute for one or more of these in a method of reprogramming somatic cells in vitro.
  • "Reprogramming to a pluripotent state in vitro” is used herein to refer to in vitro reprogramming methods that do not require and typically do not include nuclear or cytoplasmic transfer or cell fusion, e.g., with oocytes, embryos, germ cells, or pluripotent cells. Any embodiment or claim of the invention may specifically exclude compositions or methods relating to or involving nuclear or cytoplasmic transfer or cell fusion, e.g., with oocytes, embryos, germ cells, or pluripotent cells.
  • reprogramming protocol refers to any treatment or combination of treatments that causes at least some cells to become reprogrammed.
  • reprogramming protocol can refer to a variation of a known reprogramming protocol, wherein a factor or other agent used in a known reprogramming protocol is omitted or modified.
  • reprogramming protocol can refer to a variation of a known reprogramming protocol, wherein a factor or agent known to be of use for reprogramming is used together with a different agent whose utility in reprogramming has not been established.
  • exogenous reprogramming factors into somatic cells in any form that is capable of maintaining exogenous reprogramming factors for a period of time and at levels sufficient to activate endogenous pluripotency genes and for reprogramming of at least some of the somatic cells into which the exogenous reprogramming factors are introduced to occur.
  • exogenous refers to a substance present in a cell or organism other than its native source.
  • exogenous nucleic acid or “exogenous protein” refer to a nucleic acid or protein that has been introduced by a process involving the hand of man into a biological system such as a cell or organism in which it is not normally found or in which it is found in lower amounts.
  • a substance will be considered exogenous if it is introduced into a cell or an ancestor of the cell that inherits the substance.
  • endogenous refers to a substance that is native to the biological system.
  • Somatic cells of use in aspects of the invention may be primary cells (non-immortalized cells), such as those freshly isolated from an animal, or may be derived from a cell line capable of prolonged proliferation in culture (e.g., for longer than 3 months) or indefinite proliferation (immortalized cells).
  • Adult somatic cells may be obtained from individuals, e.g., human subjects, and cultured according to standard cell culture protocols available to those of ordinary skill in the art.
  • the cells may be maintained in cell culture following their isolation from a subject.
  • the cells are passaged once or more following their isolation from the individual (e.g., between 2-5, 5-10, 10-20, 20-50, 50-100 times, or more) prior to their use in a method of the invention.
  • cells may be frozen and subsequently thawed prior to use.
  • cells will have been passaged no more than 1, 2, 5, 10, 20, or 50 times following their isolation from an individual prior to their use in a method of the invention.
  • Somatic cells of use in aspects of the invention include mammalian cells, such as, for example, human cells, non-human primate cells, or mouse cells. They may be obtained by well-known methods from various organs, e.g., skin, lung, pancreas, liver, stomach, intestine, heart, reproductive organs, bladder, kidney, urethra and other urinary organs, etc., generally from any organ or tissue containing live somatic cells.
  • Mammalian somatic cells useful in various embodiments include, for example, fibroblasts, adult stem cells, Sertoli cells, granulosa cells, neurons, pancreatic cells, epidermal cells, epithelial cells, endothelial cells, hepatocytes, hair follicle cells, keratinocytes, hematopoietic cells, melanocytes, chondrocytes, lymphocytes (B and T lymphocytes), macrophages, monocytes, mononuclear cells, cardiac muscle cells, skeletal muscle cells, etc.
  • reprogramming factors of the present invention are introduced into somatic cells in the form of one or more nucleic acid sequences encoding the reprogramming factors.
  • SNEL reprogramming factors are introduced into somatic cells in the form of one or more nucleic acid sequences encoding the reprogramming factors.
  • the one or more nucleic acid sequences comprise DNA.
  • the one or more nucleic acid sequences comprise RNA.
  • the one or more nucleic acid sequences comprise a nucleic acid construct.
  • the one or more nucleic acid sequences comprise a vector for delivery of the reprogramming factors of the present invention into a target cell (e.g., a mammalian somatic cell, e.g., a human or mouse fibroblast cell).
  • the present invention contemplates the use of any suitable vector.
  • suitable vectors are described by Stadtfeld and Hochedlinger (Genes Dev. 24:2239-2263, 2010, incorporated herein by reference in its entirety). Other suitable vectors are apparent to those skilled in the art.
  • the vector comprises an inducible vector.
  • the inducible vector is a doxycycline inducible vector (i.e., a vector that activates expression of said reprogramming factors in the presence of doxycycline in a culture medium).
  • “Expression” refers to the cellular processes involved in producing RNA and proteins as applicable, for example, transcription, translation, folding, modification and processing.
  • “Expression products” include RNA transcribed from a gene and polypeptides obtained by translation of mRNA transcribed from a gene.
  • the inducible vector is a tamoxifen inducible vector.
  • the vector is an integrating vector that integrates into a genome of a host cell (e.g., a mammalian somatic cell).
  • the vector comprises a viral vector, e.g., a retroviral vector, e.g., a lentiviral vector.
  • the vector comprises an excisable vector.
  • the excisable vector comprises a transposon, wherein said excisable vector is excisable from said genome by transient expression of a transposase.
  • the transposon comprises a piggyBac transposon (See, e.g., Woltjen et al. Nature 458:766-770, 2009; Yusa et al. Nat Methods 6:363-369, 2009, incorporated herein by reference in its entirety).
  • the excisable vector comprises one or more loxP sites incorporated into said vector, wherein said vector can be excised from said genome by transient expression of a Cre recombinase (See, e.g., Kaji et al. Nature 458:771-775, 2009; Soldner et al. Cell 136:964-977, 2009, each of which is incorporated herein by reference in its entirety).
  • the excisable vector comprises a floxed lentiviral vector.
  • the vector does not integrate into the genome of said somatic cell.
  • the vector comprises an adenoviral vector (See, e.g., Zhou and Freed. Stem Cells 27:2667-2674, 2009, the teachings of which are incorporated herein by reference).
  • the vector comprises a Sendai viral vector (See, e.g., Fusaki et al. Proc Jpn Acad 85:348-362, 2009, the teachings of which are incorporated herein by reference).
  • the vector comprises a plasmid.
  • the vector comprises an episome (Yu et al. Science 324(5928):797-801, 2009, the teachings of which are incorporated herein by reference).
  • the one or more nucleic acids for introducing the reprogramming factors of the present invention comprise mRNA that is translatable in a mammalian somatic cell.
  • the mRNA can be introduced in vitro into somatic cells to be reprogrammed and translated by endogenous enzymes into proteins that can activate one or more endogenous pluripotency genes in the cell.
  • pluripotency gene refers to a gene whose expression under normal conditions (e.g., in the absence of genetic engineering or other manipulation designed to alter gene expression) occurs in and is typically restricted to pluripotent stem cells, and is crucial for their functional identity as such.
  • the polypeptide encoded by a pluripotency gene may be present as a maternal factor in the oocyte.
  • the gene may be expressed by at least some cells of the embryo, e.g., throughout at least a portion of the preimplantation period and/or in germ cell precursors of the adult.
  • the gene may be expressed in ES cells and/or in embryonic carcinoma cells.
  • the pluripotency gene is typically substantially not expressed in somatic cell types that constitute the body of an adult animal under normal conditions (with the exception of germ cells or precursors thereof, or possibly in certain disease states such as cancer).
  • the pluripotency gene may be one whose average expression level (based on RNA or protein) in ES cells is at least 50-fold or 100-fold greater than its average level in those terminally differentiated cell types present in the body of an adult mammal.
  • the pluripotency gene is one that encodes multiple splice variants or isoforms of a protein, wherein one or more such variants or isoforms is expressed in at least some adult somatic cell types, while one or more other variants or isoforms is not substantially expressed in adult somatic cells under normal conditions.
  • expression of the pluripotency gene is essential to maintain the viability or pluripotent state of iPSCs.
  • the iPSCs are not formed, die or, in some embodiments, differentiate or cease to be pluripotent.
  • the pluripotency gene is characterized in that its expression in an ES cell or iPS cell decreases (resulting in, e.g., a reduction in the average steady state level of RNA transcript and/or protein encoded by the gene by at least 50%, 60%, 70%, 80%, 90%, 95%, or more) when the cell differentiates into a terminally differentiated cell.
  • Oct4 and Nanog are exemplary pluripotency genes.
  • the mRNA is in vitro transcribed mRNA. A non-limiting example of producing in vitro transcribed mRNA of the present invention is described by Warren et al. (Cell Stem Cell 7(5):618-30, 2010, the teachings of which are incorporated herein by reference).
  • the in vitro transcribed mRNA comprises a sequence encoding SV40 large T (LT).
  • the in vitro transcribed mRNA comprises one or more modifications that increase stability or translatability of said mRNA.
  • the in vitro transcribed mRNA comprises a 5' cap. The cap may be wild-type or modified. Examples of suitable caps and methods of synthesizing in vitro transcribed mRNA containing such caps are apparent to those skilled in the art.
  • the in vitro transcribed mRNA comprises an open reading frame flanked by a 5' untranslated region and a 3' untranslated region that enhance translation of said open reading frame, e.g., a 5' untranslated region that comprises a strong Kozak translation initiation signal, and/or a 3' untranslated region that comprises an alpha-globin 3' untranslated region.
  • the in vitro transcribed mRNA comprises a polyA tail.
  • Methods of adding a polyA tail to in vitro transcribed mRNA are known in the art, e.g., enzymatic addition via polyA polymerase or ligation with a suitable ligase.
  • the present invention contemplates any suitable method for introducing in vitro transcribed mRNA encoding reprogramming factors (e.g., SNEL reprogramming factors) of the present invention into somatic cells.
  • the in vitro transcribed mRNA is introduced into said somatic cell via electroporation.
  • the in vitro transcribed mRNA is introduced into said somatic cell complexed with a cationic vehicle that facilitates uptake of said mRNA into said somatic cell via endocytosis (e.g., a cationic liposome or a nanoparticle).
  • the in vitro transcribed mRNA is introduced into said somatic cell in an amount and for a period of time sufficient to maintain expression of the reprogramming factors until cellular reprogramming of said somatic cell occurs.
  • the period of time sufficient to maintain expression of the reprogramming factors may vary depending on the type of somatic cell and the reprogramming factors employed. One of ordinary skill in the art can readily determine the appropriate period of time by routine experimentation.
  • in vitro transcribed mRNA is introduced into somatic cells at various intervals during the course of reprogramming to maintain sufficient levels of exogenous reprogramming factors in the somatic cells until reprogramming of the cells occurs.
  • the culture medium comprising the somatic cells to be reprogrammed is supplemented or treated with one or more agents that increase the efficiency of reprogramming or enhance the reprogramming process.
  • Cells may be treated in any of a variety of ways to cause reprogramming according to the methods of the present invention.
  • the treatment can comprise contacting the cells with one or more agent(s) that contribute to reprogramming ("reprogramming agent"). Such contacting may be performed by maintaining the cell in culture medium comprising the agent(s).
  • the somatic cells are genetically engineered.
  • the somatic cell may be genetically engineered to express one or more reprogramming factor(s) as described herein and known in the art.
  • the culture medium is supplemented with low oxygen culture conditions (e.g., about 5% O2) to promote more efficient reprogramming of the somatic cells to iPSCs.
  • the in vitro transcribed mRNA is treated with a phosphatase to reduce a cytotoxic response by said somatic cell upon introduction of said mRNA into said somatic cell.
  • the in vitro transcribed mRNA comprises one or more base substitutions.
  • Methods of modifying bases of mRNA are well known in the art.
  • suitable base substitutions include 5-methylcytidine (5mC), pseudouridine (psi), 5-methyluridine, 2'O-methyluridine, 2-thiouridine, and N6-methyladenosine. It should be appreciated that any number of bases in an RNA of the present invention (e.g., in vitro transcribed mRNA) can be substituted.
  • reprogramming factors (e.g., SNEL reprogramming factors) are introduced into somatic cells in the form of one or more proteins or functional variants or fragments thereof that are capable of activating endogenous pluripotency genes in the cells and reprogramming at least some of the cells to iPSCs.
  • Zhou et al. have successfully produced iPSCs derived from both mouse and human fibroblasts using purified recombinant proteins, and such methods can be adapted for use with the inventive reprogramming factors of the present invention to produce iPSCs (Zhou et al. 2009. Cell Stem Cell 4:381-384, incorporated herein by reference in its entirety).
  • the one or more protein reprogramming factors comprise a recombinant protein.
  • the one or more proteins comprise a fusion protein.
  • the one or more proteins further comprise a cell-penetrating peptide that facilitates entry of the one or more proteins into a cell nucleus where the one or more proteins can function to activate endogenous pluripotency genes in the cells.
  • the cell-penetrating peptide is fused to a C terminus of said one or more proteins.
  • Recombinant proteins comprising cell-penetrating peptides fused to their C terminus can be produced according to routine methods, e.g., expression in E. coli inclusion bodies followed by solubilization, refolding, and purification as described by Zhou et al. 2009, or expression in a suitable cell line, for example, an HEK293 cell line as described in Kim et al. (Cell Stem Cell 4(6):472-476, 2009, incorporated herein by reference in its entirety).
  • the cell-penetrating peptide comprises HIV tat.
  • the cell-penetrating peptide comprises poly-arginine.
  • the one or more proteins are introduced into somatic cells in an amount and for a period of time sufficient for reprogramming of said somatic cell to occur. Such amount and period of time would be apparent to those skilled in the art depending on the particular reprogramming factors, the type of somatic cell, and the culture conditions.
  • the one or more protein reprogramming factors are introduced into somatic cells over successive intervals throughout the period of time to maintain levels sufficient to activate endogenous pluripotency genes in at least some of the cells into which the reprogramming proteins have been introduced.
  • the one or more protein reprogramming factors are introduced into somatic cells repeatedly throughout a stochastic phase of reprogramming until a sequential phase of reprogramming begins.
  • a method of generating a reprogrammed cell further comprises (c) supplementing said medium with an agent that increases
  • Agent as used herein means any compound or substance such as, but not limited to, a small molecule, nucleic acid, polypeptide, peptide, drug, ion, etc.
  • agent may increase reprogramming efficiency and/or allow generation of reprogrammed cells under conditions in which detectable generation of reprogrammed cells would not otherwise occur.
  • "increase the efficiency of reprogramming” encompasses causing an increase in the percentage of cells that undergo reprogramming to a desired cell state or cell type (e.g., to iPSCs) when a population of cells is subjected to a
  • the inventive methods decrease the amount of time required to obtain at least some reprogrammed cells or decrease the amount of time required to obtain a given number of colonies of reprogrammed cells from a given number of somatic cells. For example, such time may be decreased by at least 1 , 2, 3, 4, or 5 days, or more.
  • somatic cells are treated (e.g., genetically engineered) so that they express one or more reprogramming factors selected from: Sall4, Nanog, Esrrb and Lin28 (and optionally from: Sox2, Klf family members (e.g., Klf2, Klf4), and c-Myc) at levels greater than would be the case in the absence of such treatment (i.e., they "overexpress" the factor(s)).
  • the cells are treated so that they overexpress Sall4, Nanog, Esrrb and Lin28.
  • Suitable methods of engineering such expression include infecting cells with viruses (e.g., retrovirus, lentivirus) or transfecting the cells with viral vectors (e.g., retroviral, lentiviral) that contain the sequences of the factors operably linked to suitable expression control elements to drive expression in the cells following infection or transfection and, optionally integration into the genome as known in the art.
  • the invention provides the recognition that inhibiting histone methylation, e.g., H3K9 methylation, enhances reprogramming of somatic cells that have not been genetically modified to increase their expression of an oncogene such as c-Myc.
  • said invention thus provides ways to substitute for engineered expression of c-Myc in any method of reprogramming somatic cells that would otherwise involve engineering cells to express c-Myc.
  • said one or more agents comprise a histone deacetylase inhibitor.
  • the histone deacetylase inhibitor comprises valproic acid (VPA).
  • the histone deacetylase inhibitor comprises butyrate.
  • the one or more agents comprise an interferon inhibitor.
  • the one or more agents comprise a recombinant B18R protein.
  • the one or more agents comprise a signaling pathway modulator that is capable of supplementing or substituting for one of the
  • "Modulate" is used consistently with its use in the art, i.e., meaning to cause or facilitate a qualitative or quantitative change, alteration, or modification in a process, pathway, or phenomenon of interest. Without limitation, such change may be an increase, decrease, or change in relative strength or activity of different components or branches of the process, pathway, or phenomenon.
  • a “modulator” is an agent that causes or facilitates a qualitative or quantitative change, alteration, or modification in a process, pathway, or phenomenon of interest.
  • Non-limiting examples of signaling pathway modulators are selected from the group consisting of a TGF-beta pathway inhibitor, a MAPK/ERK pathway inhibitor, a GSK3 pathway inhibitor, a WNT pathway activator, a 3'-phosphoinositide-dependent kinase-1 (PDK1) pathway activator, a mitochondrial oxidation modulator, a glycolytic metabolism modulator, a HIF pathway activator, and combinations thereof.
  • Exemplary TGF-beta pathway inhibitors include SB431542 (4-[4-(1,3-benzodioxol-5-yl)-5-(2-pyridinyl)-1H-imidazol-2-yl]-benzamide) and A-83-01 [3-(6-Methyl-2-pyridinyl)-N-phenyl-4-(4-quinolinyl)-1H-pyrazole-1-carbothioamide].
  • An exemplary MAPK/ERK pathway inhibitor is the mitogen-activated protein kinase/extracellular signal-regulated kinase (MAPK/ERK) pathway inhibitor PD0325901 (N-[(2R)-2,3-dihydroxypropoxy]-3,4-difluoro-2-[(2-fluoro-4-iodophenyl)amino]-benzamide).
  • An exemplary GSK3 pathway inhibitor is the GSK3 inhibitor CHIR99021 [6-((2-((4-(2,4-Dichlorophenyl)-5-(4-methyl-1H-imidazol-2-yl)pyrimidin-2-yl)amino)ethyl)amino)nicotinonitrile], which activates Wnt signalling by stabilizing beta-catenin.
  • An exemplary PDK1 pathway activator is the small molecule activator of 3'-phosphoinositide-dependent kinase-1 (PDK1) PS48 [(2Z)-5-(4-Chlorophenyl)-3-phenyl-2-pentenoic acid].
  • An exemplary small molecule that modulates mitochondrial oxidation is 2,4-dinitrophenol.
  • Exemplary agents that modulate glycolytic metabolism include fructose 2,6-bisphosphate and oxalate.
  • HIF pathway activators include N-oxaloylglycine and Quercetin (See, e.g., Zhu et al., 2010, Cell Stem Cell 7:651-655, incorporated by reference herein in its entirety).
  • a method of generating a reprogrammed cell further comprises (c) monitoring said culture for cells which display one or more markers of pluripotency.
  • the one or more markers of pluripotency are selected from the group consisting of Fbxo15, Nanog, Oct4, Sox2, Sall4 and combinations thereof.
  • the one or more markers of pluripotency comprise an early marker of pluripotency selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2.
  • a method of generating a reprogrammed cell further comprises (c) or (d) isolating said reprogrammed cell from said culture.
  • the reprogramming methods disclosed herein may be used to generate iPS cells for a variety of animal species.
  • the iPS cells generated can be useful to produce desired animals.
  • Animals include, for example, avians and mammals as well as any animal that is an endangered species.
  • Exemplary birds include domesticated birds (e.g., chickens, ducks, geese, turkeys).
  • Exemplary mammals include murine, caprine, ovine, bovine, porcine, canine, feline and non-human primate. Members of these groups include domesticated animals, for example, cattle, pigs, horses, cows, rabbits, guinea pigs, sheep, and goats.
  • a reprogrammed cell isolated according to the inventive methods comprises a mammalian cell.
  • said mammalian cell is a human cell.
  • said mammalian cell is a non-human mammal cell.
  • said mammalian cell further comprises a reporter gene integrated at a locus whose activation serves as a marker of reprogramming to pluripotency.
  • the locus is selected from Nanog, Sox2, and Oct4.
  • said cell is an iPS cell.
  • chimeric mice and methods of generating such mice are disclosed.
  • a chimeric mouse is generated at least in part from a mammalian iPS cell generated according to the inventive methods described herein.
  • the chimeric mouse is generated by injecting said mammalian iPS cell into a mouse blastocyst and allowing said blastocyst to develop into a mouse in vivo.
  • the present invention provides a cell obtained from said mouse wherein said cell is derived from an iPSC of the present invention.
  • non-human mammals and methods of producing such non-human mammals are disclosed, e.g., a non-human mammalian iPSC produced according to the inventive methods can be used, at least in part, to generate a non-human mammal.
  • the non-human mammal is a transgenic non-human mammal generated using iPSCs of the invention.
  • iPSCs are genetically modified.
  • a "genetically modified" or “engineered” cell refers to a cell into which an exogenous nucleic acid has been introduced by a process involving the hand of man (or a descendant of such a cell that has inherited at least a portion of the nucleic acid).
  • the nucleic acid may for example contain a sequence that is exogenous to the cell, it may contain native sequences (i.e., sequences naturally found in the cells) but in a non-naturally occurring arrangement (e.g., a coding region linked to a promoter from a different gene), or altered versions of native sequences, etc.
  • the process of transferring the nucleic acid into the cell can be achieved by any suitable technique. Suitable techniques include calcium phosphate or lipid-mediated transfection, electroporation, and transduction or infection using a viral vector. In some embodiments the polynucleotide or a portion thereof is integrated into the genome of the cell.
  • the nucleic acid may have subsequently been removed or excised from the genome, provided that such removal or excision results in a detectable alteration in the cell relative to an unmodified but otherwise equivalent cell.
  • genetic modification comprises replacing a selected nucleotide or nucleotide sequence with a different nucleotide or nucleotide sequence.
  • a mutant sequence (e.g., a mutant sequence at least in part responsible for a disease) is replaced with a normal or functional sequence (e.g., encoding a protein).
  • resulting iPS cells or differentiated descendants thereof may be used in cell therapy, e.g., to treat a subject suffering from the disease.
  • an integration is targeted to a selected locus.
  • the locus may be selected in order to disable a particular gene or may be a "safe harbour" locus, i.e., a locus where insertion of a nucleic acid is not known to be detrimental to or affect the phenotype of a cell.
  • a nucleic acid that integrated into the genome may have subsequently been at least in part excised from the genome, e.g., by site-specific recombination (e.g., using the Lox/Cre, Flp/Frt, or similar systems).
  • a cell may be genetically modified using an endonuclease that is targeted to selected DNA sequences so as to cause chromosomal double-stranded DNA breaks (DSBs), which stimulate breakage repair mechanisms such as nonhomologous end-joining (NHEJ) or homologous recombination (HR).
  • Proteins that comprise a DNA binding domain (DBD) capable of recognizing a selected target DNA sequence and a cleavage domain (e.g., a cleavage domain of a non-specific endonuclease such as FokI or a variant thereof)
  • ZFNs zinc- finger
  • ZFNs comprise DBDs derived from or designed based on DBDs of zinc finger (ZF) proteins.
  • TALENs comprise DBDs derived from or designed based on DBDs of transcription activator-like (TAL) effectors of plant pathogenic Xanthomonas spp.
  • Modifications of interest may include gene disruption (e.g., by targeted insertions or deletions), introduction of discrete base substitutions specified by a homologous donor DNA, and targeted insertion into a selected native genomic locus of DNA whose expression is desired. In some embodiments such modifications may be performed without using a selectable marker and/or without using donor DNA comprising lengthy sequences homologous to the target locus and/or without requiring donor DNA.
  • the iPSCs are not genetically modified.
  • the non-human mammals can be genetically modified or non-genetically modified.
  • the iPSC has a mutation or polymorphism associated with a trait or disease that has a genetic component.
  • non-human mammals are produced using methods known in the art for producing non-human mammals from non-human ESCs or iPSCs.
  • the non-human mammal serves as a model for a human disease.
  • models are useful, e.g., for studying physiological processes or disease pathogenesis, testing the effect of a compound on the mammal, e.g., testing potential treatments, etc.
  • iPSCs or ESC-like cells could be used to generate farm animals (e.g., cows, pigs, sheep, goats, horses), e.g., farm animals with desired traits. Examples of such traits could be, e.g., reduced susceptibility to disease, increased size, increased milk production, etc.
  • non-human mammals are useful for research on apoptosis, autoimmune disease, cancer, cardiovascular disease, cell biology, dermatology, development, diabetes and/or obesity, endocrine deficiency, hearing (or hearing loss), hematological research, immunology, inflammation, musculoskeletal disorders, neurobiology, neurodegenerative disease, metabolism, vision (or vision loss), reproductive biology, or infectious disease.
  • Research can include, e.g., identification of targets for development of therapeutic agents, testing potential therapeutic agents, toxicity testing, etc.
  • an iPS cell, differentiated cells obtained from the iPS cell, or non-human mammal of the invention is used as a model for a disease, e.g., a disease for which a treatment, e.g., a pharmacological treatment, is sought.
  • a method of identifying a compound to be administered to treat a disease in a mammal comprises providing an iPSC of the invention or a cell obtained by differentiating the iPSC, wherein the iPSC or differentiated cell or descendants thereof manifest at least one indicator of a disease; administering a test compound to the cell, wherein the test compound is to be assessed for its effectiveness in treating the disease; and assessing the ability of the compound to modify the indicator of disease.
  • the iPSC was derived from a somatic cell obtained from a donor suffering from the disease.
  • the iPSC is genetically modified to harbor a mutation at least in part responsible for a disease.
  • a method of producing a non-human mammal comprises introducing an iPSC produced according to the inventive methods disclosed herein into tetraploid blastocysts of the same non-human mammalian species under conditions that result in production of an embryo and said resulting embryo is transferred into a foster mother which is maintained under conditions that result in development of live offspring.
  • said non-human mammal is a mouse.
  • said iPS cells are introduced into said tetraploid blastocysts by injection.
  • said injection is a microinjection.
  • said injection is laser-assisted.
  • the method of producing a non-human mammal employs mouse iPSCs and the resulting non-human mammal is a mouse.
  • non-human mammalian embryos and methods of producing non-human mammalian embryos are disclosed.
  • a method of producing a non-human mammalian embryo comprises injecting non-human mammalian iPSCs generated according to an inventive method of the present invention into non-human tetraploid blastocysts and maintaining said resulting tetraploid blastocysts under conditions that result in formation of embryos, thereby producing a non-human mammalian embryo.
  • said non-human mammalian iPSCs are mouse cells and said non-human mammalian embryo is a mouse.
  • said mouse cells are mutant mouse iPS cells and are injected into said non-human tetraploid blastocysts by microinjection; in some embodiments, laser-assisted micromanipulation or piezo injection is used.
  • a non-human mammalian embryo comprises a mouse embryo.
  • the somatic cell is a terminally differentiated somatic cell.
  • compositions and methods are of use to reprogram somatic cells to a less differentiated cell state.
  • compositions and methods are of use to reprogram somatic cells to pluripotent, embryonic stem cell-like cells, sometimes referred to herein as "induced pluripotent stem cells" ("iPS cells" or "iPSCs").
  • compositions and methods are of use to reprogram pluripotent cells to a more differentiated state.
  • compositions and methods are of use to reprogram pluripotent cells to a desired differentiated cell type.
  • compositions and methods are of use to reprogram mammalian cells from a first differentiated cell type to a second differentiated cell type.
  • the present invention provides a method comprising: (a) reprogramming somatic cells to a pluripotent state according to a reprogramming method or protocol of the present invention; and (b) reprogramming said pluripotent cells to a desired, differentiated cell type, wherein said differentiated cell type optionally comprises an adult stem cell or a fully differentiated cell.
  • iPSCs of the invention may be induced to differentiate into desired cell types. Such differentiated cells are an aspect of the invention.
  • the iPSCs may be induced to differentiate into hematopoietic stem cells, neural lineage cells, striated muscle cells, cardiac muscle cells, liver cells, pancreatic cells, cartilage cells, epithelial cells, urinary tract cells, ocular cells (e.g., retinal cells, limbal epithelial stem cells), vascular cells etc., by culturing such cells in differentiation medium and under conditions which provide for cell differentiation.
  • Cell types of interest include, without limitation, keratinocytes, pigmented retinal epithelium, neural crest cells, motor neurons, dopaminergic neurons, hepatic progenitors, pancreatic islet-like cells (e.g., insulin-secreting beta-like cells), and mesenchymal stem cells.
  • iPSCs are differentiated to the endodermal, mesendodermal, or neuroectoderm lineage.
  • a cell type of interest is a stem cell.
  • a stem cell is capable of self-renewal and of differentiating to at least one more mature cell type.
  • a stem cell is a multipotent stem cell.
  • a multipotent stem cell can give rise to cells of multiple different types but has less potential than a pluripotent cell.
  • Exemplary multipotent stem cells include mesenchymal stem cells, neural stem cells, hematopoietic stem cells and more restricted hematopoietic cells such as myeloid or lymphoid stem cells, endothelial stem cells, etc.
  • Cell types of interest can be identified, e.g., by cell surface markers, expression of reporter genes, gene expression profile, and/or characteristic morphology. If desired, a cell population can be enriched for cell type(s) of interest and/or further cultured to obtain more mature cell type(s). In some embodiments, enrichment comprises selecting cells that express one or more markers associated with the desired cell type(s) and/or selecting cells that do not express one or more markers associated with pluripotency. In some embodiments, enrichment comprises removing at least some cells that express one or more markers associated with pluripotency from the cell population.
  • enrichment comprises selecting cells that express one or more early markers of pluripotency (e.g., Esrrb, Utf1, Lin28, and Dppa2). In some embodiments, enrichment comprises selecting cells that express at least two early markers of pluripotency, at least three early markers of pluripotency, or at least four early markers of pluripotency. In some embodiments, enrichment comprises selecting cells that express a group of early pluripotency markers comprising Esrrb, Utf1, Lin28, and Dppa2. In some embodiments, enrichment comprises removing at least some cells that express one or more early markers of pluripotency.
  • the invention provides a differentiated cell population obtained from iPSCs of the invention, wherein the cell population is substantially free of pluripotent cells. In some embodiments, no more than 5%, 2%, 1%, 0.5%, 0.1% or 0.05% of the cells express a marker associated with pluripotency. In some embodiments, expression of said marker is not significantly greater than a reference level, e.g., a background or control level.
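The enrichment and purity criteria above can be illustrated with a minimal Python sketch. It assumes each cell is represented simply as the set of marker names it was scored positive for; the function names, the default 5% cutoff, and the "Albumin" marker in the usage note are hypothetical choices for illustration, not part of the described methods.

```python
from typing import FrozenSet, Iterable, List, Set

# Early pluripotency markers named in the text; using them as the default
# panel is an illustrative assumption, not a prescribed assay.
EARLY_MARKERS: FrozenSet[str] = frozenset({"Esrrb", "Utf1", "Lin28", "Dppa2"})


def is_marker_positive(cell: Set[str],
                       panel: FrozenSet[str] = EARLY_MARKERS,
                       min_markers: int = 1) -> bool:
    """True if the cell expresses at least `min_markers` markers from `panel`."""
    return len(cell & panel) >= min_markers


def deplete_marker_positive(cells: Iterable[Set[str]],
                            panel: FrozenSet[str] = EARLY_MARKERS) -> List[Set[str]]:
    """Enrichment by removal: keep only cells expressing no marker in `panel`."""
    return [c for c in cells if not is_marker_positive(c, panel)]


def marker_positive_fraction(cells: Iterable[Set[str]],
                             panel: FrozenSet[str] = EARLY_MARKERS) -> float:
    """Fraction of cells that express at least one marker in `panel`."""
    cells = list(cells)
    if not cells:
        raise ValueError("cell population is empty")
    return sum(is_marker_positive(c, panel) for c in cells) / len(cells)


def is_substantially_free(cells: Iterable[Set[str]],
                          max_fraction: float = 0.05,
                          panel: FrozenSet[str] = EARLY_MARKERS) -> bool:
    """True if no more than `max_fraction` of cells are marker-positive."""
    return marker_positive_fraction(cells, panel) <= max_fraction
```

For example, a population of 97 cells scored only for a hypothetical differentiation marker ("Albumin") plus 3 cells that are also Lin28-positive passes a 5% cutoff but not a 1% cutoff; after `deplete_marker_positive`, it passes any of the listed cutoffs.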
  • Medium and methods which result in the differentiation of iPSCs are known in the art, as are suitable culturing conditions.
  • the differentiation of hiPSCs into a variety of cell and tissue types often involves the formation of EBs.
  • Differentiation along lineages of interest can be promoted by a variety of different compounds such as polypeptides, nucleic acids, and small molecules.
  • exemplary compounds include growth factors, morphogenetic factors, and small cell-permeable molecules such as steroids (e.g., dexamethasone), vitamins (e.g., vitamin C), sodium pyruvate, thyroid hormones, prostaglandins, dibutyryl cAMP, concanavalin A, vanadate, and retinoic acids.
  • bone morphogenetic proteins (e.g., BMP-2)
  • Mechanical factors (e.g., mechanical properties of a scaffold or culture substrate, application of forces)
  • a "cell line" refers to a population of largely or substantially identical cells, wherein the cells have often been derived from a single ancestor cell or from a defined and/or substantially identical population of ancestor cells.
  • a cell line may consist of descendants of a single cell.
  • a cell line may have been or may be capable of being maintained in culture for an extended period (e.g., months, years, for an unlimited period of time). It will be appreciated that cells may acquire mutations and possibly epigenetic changes over time such that some individual cells of a cell line may differ with respect to each other.
  • At least 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more of the cells of a cell line or cell culture are at least 90%, 95%, 96%, 97%, 98%, 99%, or more genetically identical.
  • at least 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% of the cells of a cell line or cell culture express a set of cell surface markers.
  • the set of markers could be markers indicative of pluripotency or cell-type specific markers.
  • a cell “clone” refers to a population of cells derived from a single cell. It will be understood that if cells of a clone are subjected to different culture conditions or if some of the cells are subjected to genetic modification, the resulting cells may be considered distinct clones.
  • the term "cell culture" refers to a composition comprising a plurality of viable cells wherein at least some of the cells are proliferating, e.g., not cell cycle arrested. A cell culture could be composed of cells from one or more different cell lines or sources.
  • a pluripotent cell line or cell clone of the invention is stable in culture.
  • a state, condition, or property is “stable” if it remains substantially unchanged over a time period of interest, e.g., exhibits little or no variability over such time period.
  • Stabilize refers to promoting the establishment and/or maintenance of a stable state, condition, or property, e.g., by inhibiting or preventing a change in such state, condition, or property.
  • a cell or cell line or cell clone is stable in culture if it continues to proliferate over multiple passages in culture (e.g., indefinitely), most or all cells in the culture (e.g., at least 90%, 95%, 97%, 98%, or more) are of the same type or differentiation state (e.g., are pluripotent), and cells resulting from cell division are of the same cell type or differentiation state.
  • a stabilized cell or cell line retains its "identity" in culture as long as the culture conditions are not altered, and the cells continue to be passaged appropriately.
  • methods and compositions of the invention enhance or promote existence of a stable pluripotent state.
  • the pluripotent state is an inner cell mass (ICM)-like state.
  • the invention is a method for stabilizing a pluripotent cell in an ICM-like state.
  • the pluripotent state is characterized by cell colonies that morphologically resemble those of ES cells of the 129 strain described in PCT
  • the pluripotent state e.g., in mice
  • the pluripotent state is characterized by ability to participate in chimera formation with frequencies at least 20% of that of ES cells of the 129 strain.
  • the pluripotent state is characterized by ability to contribute to the germ line in chimeras with frequencies at least 20% of that of ES cells of the 129 strain.
  • the pluripotent state is characterized by colonies that morphologically resemble those of ES cells of the 129 strain.
  • the pluripotent state is characterized by maintenance of both X chromosomes (in XX lines) in an activated state.
  • a pluripotent state has at least 2, 3, 4, or more of the foregoing properties.
  • an inventive cell line or clone has a stable pluripotency state.
  • an inventive cell line or clone is karyotypically stable.
  • stage-specific embryonic antigens-1, -3, and -4 are glycoproteins specifically expressed in early embryonic development and are markers for ES cells (Solter and Knowles, 1978, Proc. Natl. Acad. Sci. USA 75:5565-5569; Kannagi et al., 1983, EMBO J 2:2355-2361), with SSEA-1 being a marker of mouse ES cells and SSEA-3 and -4 being markers of human ES cells.
  • Elevated expression of the enzyme alkaline phosphatase is another marker associated with undifferentiated embryonic stem cells (Wobus et al., 1984, Exp. Cell 152:212-219; Pease et al., 1990, Dev. Biol. 141:322-352). Additional ES cell markers are described in Ginis, I., et al., Dev. Biol., 269: 369-380, 2004 and in Adewumi O, et al., Nat Biotechnol., 25(7): 803-16, 2007 and references therein.
  • TRA-1-60, TRA-1-81, GCTM2 and GCT343, and the protein antigens CD9, Thy1 (also known as CD90), NANOG, TDGF1, DNMT3B, GABRB3 and GDF3, REX-1, TERT, UTF-1, TRF-1, TRF-2, connexin43, connexin45, Foxd3, FGFR-4, ABCG-2, and Glut-1 are of use.
  • a mouse pluripotent stem cell line, e.g., a mouse ES cell line, expresses Oct4, Nanog, and SSEA-1.
  • a human pluripotent stem cell line, e.g., a human ES cell line, expresses Tra-1-60, Nanog, Oct4, Sox2, and SSEA3 and/or SSEA4.
  • At least 80%, or at least 90%, of the pluripotent stem cells of a colony, cell line, or cell culture express one or more marker(s), e.g., a set of markers, indicative of pluripotency.
  • Gene expression profiling may be used to assess pluripotency state.
  • Pluripotent cells, such as embryonic stem cells, and multipotent cells, such as adult stem cells, are known to have a distinct pattern of global gene expression.
  • pluripotency state include epigenetic analysis, e.g., analysis of DNA methylation state.
  • a pluripotent stem cell line e.g., an iPS cell line, derived or cultured according to the invention, e.g., a human iPS cell line, a non-human vertebrate iPS cell line, a mouse iPS cell line, has a normal karyotype.
  • at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or greater than 95% of cells in metaphase examined exhibit a normal karyotype.
  • normal karyotype comprises having the correct number of chromosomes without evidence of translocation or deletion or duplication.
  • normal karyotype comprises having a normal banding pattern.
  • a karyotype is determined to be a normal karyotype based on analysis by fluorescence in situ hybridization (FISH).
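As an illustration of the karyotype criteria above, the following Python sketch scores a set of metaphase spreads and checks whether the normal fraction meets a chosen cutoff. The `MetaphaseSpread` record and function names are hypothetical, the expected chromosome count is supplied by the caller (e.g., 46 for human, 40 for mouse), and banding-pattern analysis is not modeled.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class MetaphaseSpread:
    """Hypothetical record of one scored metaphase spread."""
    chromosome_count: int
    has_translocation: bool = False
    has_deletion: bool = False
    has_duplication: bool = False


def is_normal(spread: MetaphaseSpread, expected_count: int) -> bool:
    """Correct chromosome number and no translocation, deletion, or duplication."""
    return (spread.chromosome_count == expected_count
            and not (spread.has_translocation
                     or spread.has_deletion
                     or spread.has_duplication))


def normal_fraction(spreads: Sequence[MetaphaseSpread], expected_count: int) -> float:
    """Fraction of examined spreads that score as normal."""
    if not spreads:
        raise ValueError("no metaphase spreads were examined")
    return sum(is_normal(s, expected_count) for s in spreads) / len(spreads)


def has_normal_karyotype(spreads: Sequence[MetaphaseSpread],
                         expected_count: int,
                         min_fraction: float = 0.5) -> bool:
    """True if at least `min_fraction` of spreads are normal (cutoff is a parameter)."""
    return normal_fraction(spreads, expected_count) >= min_fraction
```

With 20 human spreads of which 18 are normal, the line would satisfy an 85% cutoff but not a 95% cutoff.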
  • a pluripotent stem cell or cell line is an XO cell or cell line which, in some embodiments is otherwise karyotypically normal.
  • the invention provides a composition comprising: (a) one or more iPSCs derived from reprogramming factors Sall4, Nanog, Esrrb and Lin28; and (b) one or more material(s) that promotes differentiation of the iPSC(s) to one or more cell type(s) of interest.
  • the material(s) could be, e.g., compound(s), a substrate, or cells.
  • the invention provides a method of generating a cell type of interest comprising: (a) providing one or more iPSCs from
  • the invention encompasses use of iPSCs of the invention to screen test compounds (e.g., test compounds such as those described herein), to identify compounds that promote differentiation of pluripotent cells (e.g., iPS cells) to one or more desired cell types.
  • Differentiated cells of the invention e.g., differentiated mammalian cells, e.g., differentiated human cells
  • neural lineage cells could be used to treat, prevent, or stabilize a neurological disease such as Alzheimer's disease, Parkinson's disease, Huntington's disease, or ALS, lysosomal storage diseases, multiple sclerosis, or a spinal cord injury.
  • Differentiated cells that produce a hormone may be administered to a mammal for the treatment or prevention of endocrine conditions.
  • a hormone such as a growth factor, thyroid hormone, thyroid-stimulating hormone, parathyroid hormone, steroid, serotonin, epinephrine, or norepinephrine
  • Differentiated cells may be administered to repair damage to the lining of a body cavity or organ, such as a lung, gut, exocrine gland, or urogenital tract or to treat damage or deficiency of cells in an organ or tissue such as the bladder, bone, bone marrow, brain, cartilage, esophagus, eye, fallopian tube, heart, intestines, gallbladder, kidney, liver, lung, muscle, ovaries, pancreas, prostate, skin, spinal cord, spleen, stomach, tendon, testes, thymus, thyroid, trachea, ureter, urethra, or uterus.
  • Differentiated cells could be used in tissue engineering, e.g., the construction of a replacement organ or tissue ex vivo.
  • such cells could be combined with a suitable scaffold, which is optionally three-dimensional and/or biodegradable.
  • the cells are allowed to proliferate and possibly further differentiate ex vivo.
  • Scaffolds could be comprised of a wide variety of materials, including both naturally occurring and artificial materials. See, e.g., Lanza, R., et al. (eds.), Principles of Tissue Engineering, 3rd ed., Academic Press, 2007.
  • the replacement organ, tissue, or portion thereof is transplanted into a recipient in need thereof.
  • iPSCs may be combined with a matrix to form a tissue or organ in vitro or in vivo that may be used to repair or replace a tissue or organ in a recipient mammal (such methods being encompassed by the term "cell therapy").
  • iPSCs may be cultured in vitro in the presence of a matrix to produce a tissue or organ of the urogenital, cardiovascular, or musculoskeletal system.
  • a mixture of the cells and a matrix may be administered to a mammal for the formation of the desired tissue in vivo.
  • the iPSCs produced according to the invention may be used to produce genetically engineered or transgenic differentiated cells, e.g., by introducing a desired gene or genes, or removing all or part of an endogenous gene or genes of iPSCs produced according to the invention, and allowing such cells to differentiate into the desired cell type.
  • One method for achieving such modification is by homologous recombination, which technique can be used to insert, delete or modify a gene or genes at a specific site or sites in the genome.
  • This methodology can be used to replace defective genes or to introduce genes which result in the expression of therapeutically beneficial proteins such as growth factors, hormones, lymphokines, cytokines, enzymes, etc.
  • the gene encoding brain derived growth factor may be introduced into iPSCs or stem-like cells derived from such iPSCs, the cells differentiated into neural cells and the cells transplanted into a Parkinson's patient to retard the loss of neural cells during such disease.
  • iPSCs may be genetically engineered, and the resulting engineered cells differentiated into desired cell types, e.g., hematopoietic cells, neural cells, pancreatic cells, cartilage cells, etc.
  • Genes which may be introduced into the iPSCs include, for example, epidermal growth factor, basic fibroblast growth factor, glial derived neurotrophic growth factor, insulin-like growth factor (I and II), neurotrophin-3, neurotrophin-4/5, ciliary neurotrophic factor, AFT-1, cytokine genes (interleukins, interferons, colony stimulating factors, tumor necrosis factors (alpha and beta), etc.), genes encoding therapeutic enzymes, collagen, human serum albumin, etc.
  • Negative selection systems known in the art can be used for eliminating therapeutic cells from a patient or ex vivo if desired.
  • cells transfected with the thymidine kinase (TK) gene will lead to the production of reprogrammed cells containing the TK gene that also express the TK gene.
  • Such cells may be selectively eliminated at any time from a patient upon ganciclovir administration.
  • Such a negative selection system is described in U.S. Patent No. 5,698,446, incorporated herein by reference in its entirety.
  • the cells are engineered to contain a gene that encodes a toxic product whose expression is under control of an inducible promoter. Administration of the inducer causes production of the toxic product, leading to death of the cells.
  • any of the somatic cells of the invention may comprise a suicide gene, optionally contained in an expression cassette, which may be integrated into the genome.
  • the suicide gene is one whose expression would be lethal to cells. Examples include genes encoding diphtheria toxin, cholera toxin, ricin, etc.
  • the suicide gene may be under control of expression control elements that do not direct expression under normal circumstances in the absence of a specific inducing agent or stimulus.
  • expression can be induced under appropriate conditions, e.g., (i) by administering an appropriate inducing agent to a cell or organism or (ii) if a particular gene (e.g., an oncogene, a gene involved in the cell division cycle, or a gene indicative of dedifferentiation or loss of differentiation) is expressed in the cells, or (iii) if expression of a gene such as a cell cycle control gene or a gene indicative of differentiation is lost.
  • the gene is only expressed following a recombination event mediated by a site-specific recombinase.
  • Such an event may bring the coding sequence into operable association with expression control elements such as a promoter.
  • Expression of the suicide gene may be induced if it is desired to eliminate cells (or their progeny) from the body of a subject after the cells (or their ancestors) have been administered to a subject. For example, if a reprogrammed somatic cell gives rise to a tumor, the tumor can be eliminated by inducing expression of the suicide gene. In some embodiments tumor formation is inhibited because the cells are automatically eliminated upon dedifferentiation or loss of proper cell cycle control.
  • the iPSCs obtained using methods of the present invention may be used as an in vitro model of differentiation, e.g., for the study of genes which are involved in the regulation of early development.
  • Differentiated cell tissues and organs generated using the reprogrammed cells may be used to study effects of drugs and/or identify potentially useful pharmaceutical agents.
  • differentiated cells or organs or tissues comprising them are introduced into a non-human animal that serves as a model of a disease.
  • the term "disease" as used herein encompasses, in various embodiments, art-recognized diseases, disorders, syndromes, injuries, impairments of health or conditions of abnormal functioning, e.g., for which medical/surgical treatment would be desirable.
  • the non-human animal may then be assessed, e.g., to evaluate the effects of the introduced cells, organs, or tissues in the model, thus providing means to assess therapeutic potential.
  • Differentiated cells of the invention can also be used for screening or other testing purposes, e.g., to identify compounds of use for treating diseases, to assess the effects of a compound on such cells (e.g., to assess potential toxicity or explore mechanism of action) or to study a cell biological process of interest.
  • neural cells could be used to study neurotransmitter synthesis, release, or uptake and/or to identify compounds that modulate (e.g., promote or inhibit) such processes.
  • Hepatocytes could be used in the study of drug metabolism and/or drug interactions.
  • cardiomyocytes can be used in study of processes such as action potential generation, repolarization, excitation-contraction coupling or calcium flux and/or to identify compounds that modulate such processes.
  • Compounds so identified could be used in research or in treatment of diseases in which such modulation would be beneficial.
  • the cells could be used in preclinical toxicology studies. For example, they could be used to assess potential cardiotoxicity, hepatotoxicity, neurotoxicity, drug interactions, etc.
  • differentiated cells of the invention could be used in screens to identify compounds useful to direct endogenous cells to participate in the repair or regeneration of damaged tissues in vivo.
  • a composition comprising multiple cells produced by a reprogramming method or protocol of the present invention.
  • cells are considered to be essentially genetically identical if they are generated or descended from a cell or cell sample obtained from a particular subject.
  • non-human mammalian cells are considered to be essentially genetically identical if they are derived from one or more mammals of the same inbred strain (e.g., an inbred mouse strain) or if they are derived from a mammal generated by crossing individuals of two different inbred strains.
  • methods disclosed herein may be used to derive or culture pluripotent cells of any strain, e.g., mouse strain, or substrain of interest.
  • pluripotent cells are derived from somatic cells obtained from F1 hybrid mice produced by crossing mice of two different inbred strains.
  • a composition comprises at least 10, 10^2, 10^3, 10^4, 10^5, 10^6, 10^7, 10^8, 10^9, 10^10, 10^11 cells, or more.
  • iPSCs of the present invention can be used to treat various diseases.
  • "treat", "treating", "therapy" and similar terms can include amelioration (e.g., reducing one or more symptoms of a disorder), cure, and/or maintenance of a cure (i.e., the prevention or delay of recurrence) of a disorder, or preventing a disorder from manifesting as severely as would be expected in the absence of treatment.
  • Treatment after a disorder has started aims to reduce, ameliorate or altogether eliminate the disorder, and/or at least some of its associated symptoms, to prevent it from becoming more severe, to slow the rate of progression, or to prevent the disorder from recurring once it has been initially eliminated.
  • Treatment can be prophylactic, e.g., administered to a subject that has not been diagnosed with the disorder, e.g., a subject with a significant risk of developing the disorder.
  • the subject may have a mutation associated with developing the disorder.
  • treatment can comprise administering a compound to a subject's mother.
  • a method of the invention comprises diagnosing a subject as having or being at risk of developing a disease, or providing such a subject, and treating the subject.
  • a subject diagnosed or treated according to the instant invention is a human.
  • a subject is a non-human mammal, e.g., any of the mammals mentioned herein.
  • a method of treating a patient in need of such treatment comprising administering to the patient a composition comprising multiple iPSCs produced by a reprogramming method or protocol of the present invention.
  • the iPSCs are autologous iPSCs derived from a differentiated cell of the patient (e.g., a fibroblast cell) that has been subjected to a reprogramming protocol or produced by a reprogramming method of the present invention.
  • the iPSCs are autologous iPSCs that have been derived from pathological cells of the patient.
  • the iPSCs are autologous iPSCs that have been derived from normal or healthy cells of the patient.
  • the iPSCs are derived from cells obtained from a donor other than the subject to whom the cells are to be administered.
  • the method of treatment comprises reprogramming a differentiated cell of a first type extracted from a patient into a differentiated cell of a second type utilizing a reprogramming method or protocol of the present invention and administering to the patient a composition comprising the autologous differentiated cells of the second type.
  • a method of treating an individual in need of such treatment comprising: (a) obtaining somatic cells from said individual; (b) reprogramming said somatic cells obtained from said individual with reprogramming factors comprising Sall4, Nanog, Esrrb, and Lin28 according to a reprogramming method or protocol described herein; and (c) administering at least some of said reprogrammed cells to said individual.
  • the method further comprises separating cells that are reprogrammed to a desired state from cells that are not reprogrammed to a desired state.
  • said individual is a human.
  • the methods of treatment using iPSCs of the present invention can be combined with conventional drugs or therapies to treat a patient in need of such treatment.
  • conventional drugs or therapies can be administered to alleviate symptoms associated with a disease or condition which the patient is suffering from.
  • conventional drugs or therapies can be administered to prepare the patient for receiving an iPSC based treatment of the present invention.
  • conventional drugs or therapies can be administered in combination with one or more iPSC based treatments of the present invention to act in concert to ameliorate the disease or condition.
  • the present invention contemplates all modes of administration, including intramuscular, intravenous, intraarticular, intralesional, subcutaneous, or any other route sufficient to provide a dose adequate to prevent or treat a disease.
  • the iPSCs may be administered to the mammal in a single dose or multiple doses. When multiple doses are administered, the doses may be separated from one another by, for example, one week, one month, one year, or ten years.
  • One or more growth factors, hormones, interleukins, cytokines, or other cells may also be administered before, during, or after administration of the cells to further bias them towards a particular cell type.
  • the present invention provides compositions for identifying a reprogramming agent, such compositions comprising one or more cells that express a subset of reprogramming factors selected from the group consisting of Sall4, Nanog, Esrrb and Lin28, and a test agent.
  • a wide variety of compounds or combinations thereof can be used in aspects of the present invention, e.g., as test compounds or agents in the inventive methods.
  • compounds may comprise e.g., polypeptides, peptides, small organic or inorganic molecules, polysaccharides, polynucleotides, oligonucleotides, peptide nucleic acids, or lipids.
  • Polypeptide is used interchangeably herein with "protein".
  • Polypeptides can contain standard amino acids (which refers to the 20 L-amino acids that are most commonly found in naturally occurring proteins) and/or non-standard amino acids or amino acid analogs.
  • One or more of the amino acids in a polypeptide may be modified, for example, by the addition of a moiety such as a carbohydrate group, a phosphate group, a fatty acid group, etc.
  • Peptide is used herein to refer to a polypeptide containing 60 amino acids or less.
  • Polynucleotide is used herein interchangeably with “nucleic acid” and encompasses single-stranded, double-stranded, and partially double-stranded molecules, double-stranded molecules with overhangs, etc.
  • Oligonucleotide refers to a polynucleotide containing 60 nucleotides or less and encompasses antisense oligonucleotides, short interfering RNA (siRNA), and microRNA (miRNA).
  • a polynucleotide can comprise standard nucleosides (which term refers to nucleosides that are most commonly found in DNA or RNA - adenosine, thymidine, guanosine, cytidine, uridine, deoxyadenosine, deoxythymidine, deoxyguanosine, and
  • Non-standard nucleosides can be naturally occurring nucleosides or may not be known to occur naturally.
  • a non-standard nucleoside or nucleoside analog may differ from a standard nucleoside with regard to the base and/or sugar moiety.
  • Variants of the sugar-phosphate backbone found in DNA or RNA can be used such as phosphorothioates, locked nucleic acids, or morpholinos.
  • Modifications e.g., nucleoside and/or backbone modifications
  • non-standard nucleotides e.g., delivery vehicles and systems, etc, known in the art as being useful in the context of siRNA or antisense-based molecules for research or therapeutic purposes
  • modifications may, e.g., increase stability, increase cell uptake, reduce clearance from the body, reduce toxicity, reduce off-target effects, or have other effects that may be desirable.
  • "Small molecule" as used herein refers to a molecule having a molecular weight of not more than 1,500 Da, e.g., not more than 1000 Da, e.g., not more than 500 Da.
  • the candidate compound is a small organic molecule comprising one or more functional groups that mediate structural interactions with proteins, e.g., hydrogen bonding.
  • a compound could comprise amine, carbonyl, hydroxyl or carboxyl group(s).
  • a compound comprises one or more cyclic carbon or heterocyclic rings, e.g., an aromatic or polyaromatic ring substituted with one or more chemical functional groups and/or heteroatoms.
  • a small molecule has between 5 and 50 carbon atoms, e.g., between 7 and 30 carbons.
  • Compounds can be contacted with cells by adding the compound to the culture medium.
  • a range of concentrations can be used. Exemplary concentrations range from picomolar to millimolar, e.g., between 100 pM and 1 mM, e.g., between 10 nM and 500 µM.
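A range spanning picomolar to millimolar is commonly covered with a log-spaced dilution series. The Python sketch below generates one between two bounds; the function name, the choice of log spacing, and the `points_per_decade` parameter are assumptions for illustration and are not specified in the text.

```python
import math
from typing import List


def log_dilution_series(low_molar: float, high_molar: float,
                        points_per_decade: int = 1) -> List[float]:
    """Return concentrations (in mol/L) spaced evenly on a log10 scale.

    The series starts at `low_molar`; it ends at `high_molar` when the range
    spans a whole number of steps, otherwise at the nearest step to it.
    """
    if not (0 < low_molar < high_molar):
        raise ValueError("require 0 < low_molar < high_molar")
    if points_per_decade < 1:
        raise ValueError("points_per_decade must be at least 1")
    decades = math.log10(high_molar / low_molar)
    n_steps = round(decades * points_per_decade)
    return [low_molar * 10 ** (i / points_per_decade) for i in range(n_steps + 1)]
```

Calling `log_dilution_series(100e-12, 1e-3)` covers the 100 pM to 1 mM example with one concentration per decade (eight concentrations in total); raising `points_per_decade` gives a finer series.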
  • a vector that encodes a candidate compound is introduced into cells by an appropriate method and expressed therein to deliver a compound.
  • an expression vector that encodes a short hairpin RNA (shRNA) or microRNA (miRNA) precursor can be introduced into cells.
  • Compounds may be obtained from a wide variety of sources and can comprise compounds found in nature or compounds not known to occur in nature.
  • Compounds can be synthesized or obtained from natural sources.
  • polypeptides may be produced using recombinant DNA technology or synthesized through chemical means such as conventional solid phase peptide synthesis. Numerous techniques are available for the random and directed synthesis of a wide variety of organic compounds.
  • candidate compounds are provided as mixtures of natural compounds in the form of bacterial, fungal, plant and animal extracts, fermentation broths, conditioned media, etc.
  • a library of compounds is screened.
  • a library is typically a collection of compounds that can be presented or displayed such that the compounds can be conveniently used in a screening assay.
  • each compound has associated information stored, e.g., in a database, such as the chemical structure, purity, quantity, physicochemical characteristics of the compound and/or information regarding known or suspected biological or biochemical activity.
  • compounds or mixtures thereof are housed in individual wells (e.g., of microtiter plates), vessels, tubes, etc.
  • Libraries include but are not limited to, for example, phage display libraries, peptide libraries, oligonucleotide libraries, siRNA libraries, shRNA libraries, aptamer libraries, synthetic small molecule libraries, and natural compound libraries. Libraries could comprise multiple different compounds having a similar biological activity of interest. For example, libraries could comprise inhibitors of one or more enzymes or enzyme classes of interest.
  • Exemplary compounds could be kinase inhibitors, phosphatase inhibitors, inhibitors of DNA or histone modifying enzymes (e.g., histone deacetylase inhibitors), etc.
  • Methods for preparing libraries of molecules are well known in the art, and many libraries are available from commercial or noncommercial sources.
  • a library comprises between 1,000 and 1,000,000 compounds, or more, e.g., between 10,000 and 500,000 compounds.
  • the candidate compound to be tested is a compound that is not present in ESC or iPSC culture medium or cryopreservation solutions known in the art.
  • a compound to be tested is a compound that is present in at least some ESC or iPSC culture medium or cryopreservation solutions known in the art but is used in a different, e.g., greater, concentration in a method or composition of the present invention.
  • said subset of reprogramming factors consists of at least three of said reprogramming factors.
  • two different regulatable systems each controlling expression of a subset of the factors can be used to identify reprogramming agents.
  • a first inducible (e.g., dox-inducible) promoter and the 4th factor under control of a second inducible (e.g., tamoxifen-inducible) promoter.
  • fibroblasts would be genetically homogenous and would be reprogrammable without need for viral infection.
  • a number of variations are possible; for example, one might stably induce expression of 3 factors and transiently induce expression of the 4th factor, etc. Any combination of factors can be assessed using the described methods. Also, one can modulate expression levels of the factors by using different concentrations of inducing agent.
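The two-inducer arrangement described above can be sketched as a small lookup from each factor to the inducer that controls it. In this Python sketch the assignment of Lin28 as the fourth, tamoxifen-controlled factor is purely an assumption for illustration (the text does not say which factor is placed under the second promoter), and the function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Set

# Hypothetical assignment: three factors under a dox-inducible promoter and the
# fourth under a tamoxifen-inducible promoter. Which factor is "the 4th" is an
# assumption made only for this example.
PROMOTER_INDUCER: Dict[str, str] = {
    "Sall4": "dox",
    "Nanog": "dox",
    "Esrrb": "dox",
    "Lin28": "tamoxifen",
}


def expressed_factors(inducers_present: Iterable[str],
                      promoter_inducer: Dict[str, str] = PROMOTER_INDUCER) -> Set[str]:
    """Factors whose controlling inducer is present in the culture."""
    present = set(inducers_present)
    return {factor for factor, inducer in promoter_inducer.items() if inducer in present}


def all_conditions(promoter_inducer: Dict[str, str] = PROMOTER_INDUCER
                   ) -> Dict[FrozenSet[str], Set[str]]:
    """Map every combination of inducers to the set of factors it switches on."""
    inducers = sorted(set(promoter_inducer.values()))
    conditions: Dict[FrozenSet[str], Set[str]] = {}
    for size in range(len(inducers) + 1):
        for combo in combinations(inducers, size):
            conditions[frozenset(combo)] = expressed_factors(combo, promoter_inducer)
    return conditions
```

Under this assumed assignment, dox alone switches on three factors, leaving the fourth available to be replaced by a test agent, while dox plus tamoxifen switches on all four and can serve as the reference condition.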
  • the composition further comprises an agent that induces expression of said subset of reprogramming factors.
  • a method of identifying a reprogramming agent comprises: (a) maintaining said composition comprising one or more cells that expresses a subset of reprogramming factors selected from the group consisting of Sall4, Nanog, Esrrb and Lin28, and a test agent for a time period under conditions in which said reprogramming factors are expressed and cell proliferation occurs; and (b) assessing the extent to which cells become reprogrammed, wherein the test agent is identified as a reprogramming agent if reprogramming occurs at a similar frequency as would be the case if said composition contained all of said reprogramming factors and had lacked said test agent.
  • a method of identifying a reprogramming agent comprising: (a) maintaining the composition comprising one or more cells that expresses a subset of reprogramming factors selected from the group consisting of Sall4, Nanog, Esrrb and Lin28, and a test agent for a time period under conditions in which the reprogramming factors are expressed and cell proliferation occurs; and (b) assessing the extent to which cells become
  • the composition is maintained for at least X days, wherein X is the number of days that it takes for one or more markers of pluripotency to be expressed in the cells.
  • the method further comprises determining whether one or more markers of pluripotency are being expressed in the cells.
  • said test agent is present for at least X days.
  • X is equal to the number of days during which the composition is maintained.
  • the test agent is present for a number of days which is less than the number of days in which the composition is maintained.
  • X is between 1 and 365 days or any intervening particular value or subrange, e.g., between 1 and 180 days, between 2 and 60 days, between 3 and 30 days, to name just a few examples.
  • a reprogramming factor, reprogramming agent, or test agent is added to a composition once or more during a time period.
  • medium can be supplemented with a test agent, e.g., prior to or following medium changes.
  • multiple applications of a reprogramming factor, reprogramming agent, or test agent are used.
  • said test agent is identified as a reprogramming agent if cells do not become reprogrammed at a detectable frequency if maintained for said time period in the absence of said test agent but do become reprogrammed at a detectable frequency if maintained in the presence of said test agent for at least a portion of said time period.
  • said test agent is identified as an enhancer of reprogramming agent if cells become reprogrammed at a detectable frequency if maintained for said time period in the absence of said test agent and become reprogrammed at a significantly greater frequency if maintained in the presence of said test agent for at least a portion of said time period.
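  • by way of non-limiting illustration only, the two identification criteria above (a reprogramming agent versus an enhancer of reprogramming) can be sketched in Python as follows. The function name `classify_test_agent`, the `detection_limit` and `min_fold` thresholds, and the use of a simple fold change in place of a statistical test for "significantly greater" are assumptions made for this sketch; the disclosure does not quantify "detectable" or "significantly greater".

```python
from enum import Enum


class Call(Enum):
    REPROGRAMMING_AGENT = "reprogramming agent"
    ENHANCER = "enhancer of reprogramming"
    NOT_IDENTIFIED = "not identified"


def classify_test_agent(freq_without, freq_with, detection_limit=1e-6, min_fold=2.0):
    """Classify a test agent from reprogramming frequencies (fraction of cells).

    freq_without: frequency when cells are maintained without the test agent.
    freq_with: frequency when the test agent is present for at least part of
    the same time period.
    detection_limit and min_fold are hypothetical thresholds chosen for the
    sketch; they are not values taken from the source.
    """
    detectable_without = freq_without >= detection_limit
    detectable_with = freq_with >= detection_limit
    # No detectable reprogramming without the agent, detectable with it.
    if not detectable_without and detectable_with:
        return Call.REPROGRAMMING_AGENT
    # Detectable without the agent, and markedly more frequent with it.
    if detectable_without and freq_with >= min_fold * freq_without:
        return Call.ENHANCER
    return Call.NOT_IDENTIFIED
```

  • in such a sketch, a real screen would likely replace the fold-change cutoff with a replicate-based statistical comparison; the cutoff is used here only to keep the example self-contained.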
  • nucleic acid constructs comprising reprogramming factors
  • a nucleic acid construct comprises a single reprogramming factor packaged into a viral vector.
  • a nucleic acid construct comprises a polycistronic vector that can transduce any combination of reprogramming factors with a goal of reducing the number of proviral integrations.
  • polycistronic nucleic acid constructs, expression cassettes, and vectors that employ internal ribosomal entry sites and self-cleaving peptides and are capable of transducing any combination of reprogramming factors are described in PCT Application Publication No. WO 2009/152529, incorporated herein by reference in its entirety.
  • the present invention provides polycistronic nucleic acid constructs, expression cassettes, and vectors useful for generating iPSCs.
  • the polycistronic nucleic acid constructs comprise a portion that encodes a self-cleaving peptide.
  • the invention provides a polycistronic nucleic acid construct comprising at least two coding regions, wherein the coding regions are linked to each other by a nucleic acid that encodes a self-cleaving peptide so as to form a single open reading frame, and wherein the coding regions encode first and second
  • the construct comprises two coding regions separated by a self-cleaving peptide. In some embodiments of the invention the construct comprises three coding regions each encoding a
  • the construct comprises four coding regions each encoding a reprogramming factor, wherein adjacent coding regions are separated by a self-cleaving peptide.
  • the invention thus provides constructs that encode a polyprotein that comprises 2, 3, or 4 reprogramming factors, separated by self-cleaving peptides.
  • the construct comprises expression control element(s), e.g., a promoter, suitable to direct expression in mammalian cells, wherein the portion of the construct that encodes the polyprotein is operably linked to the expression control element(s).
  • the invention thus provides an expression cassette comprising a nucleic acid that encodes a polyprotein comprising the reprogramming factors, each reprogramming factor being linked to at least one other reprogramming factor by a self-cleaving peptide, operably linked to a promoter (or other suitable expression control element).
  • the promoter drives transcription of a polycistronic message that encodes the reprogramming factors, each reprogramming factor being linked to at least one other reprogramming factor by a self-cleaving peptide.
  • the promoter can be a viral promoter (e.g., a CMV promoter) or a mammalian promoter (e.g., a PGK promoter).
  • the expression cassette or construct can comprise other genetic elements, e.g., to enhance expression or stability of a transcript.
  • any of the foregoing constructs or expression cassettes may further include a coding region that does not encode a reprogramming factor, wherein the coding region is separated from adjacent coding region(s) by a self-cleaving peptide.
  • the additional coding region encodes a selectable marker.
  • a nucleic acid construct comprises at least four coding regions linked to each other by nucleic acids that encode a self-cleaving peptide so as to form a single open reading frame, wherein said coding regions encode reprogramming factors Sall4, Nanog, Esrrb, and Lin28, and wherein said reprogramming factors are capable, either alone or in combination with one or more additional reprogramming factors, of reprogramming a mammalian somatic cell to pluripotency.
  • a nucleic acid construct of the present invention includes a fifth coding region that encodes a fifth reprogramming factor, wherein the five coding regions are linked to each other by nucleic acids that encode self-cleaving peptides so as to form a single open reading frame.
  • said fifth reprogramming factor is c-Myc.
  • a nucleic acid construct of the present invention includes fifth and sixth genes that encode fifth and sixth reprogramming factors, wherein said six coding regions are linked to each other by nucleic acids that encode self-cleaving peptides so as to form a single open reading frame.
  • said fifth reprogramming factor is c-Myc and said sixth reprogramming factor is Klf4.
  • the self-cleaving peptide is a viral 2A peptide. In some embodiments, the self-cleaving peptide is an aphthovirus 2A peptide.
  • the nucleic acid construct of the present invention is capable of reprogramming a somatic cell to a pluripotent state in the absence of one or more of the canonical reprogramming factors.
  • the nucleic acid construct does not encode Oct4.
  • the nucleic acid construct does not encode Klf4.
  • the nucleic acid construct does not encode Sox2.
  • the nucleic acid construct does not encode c-Myc.
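  • by way of non-limiting illustration only, the joining of several coding regions into a single open reading frame by sequences encoding a self-cleaving peptide can be sketched in Python as follows. The helper names (`strip_stop`, `assemble_polycistron`, `is_single_orf`) are hypothetical, and the sequences used with them are toy placeholders rather than real gene or 2A sequences. The sketch assumes that each coding region is supplied in frame and that the linker contains no in-frame stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def strip_stop(cds):
    """Return the coding sequence without its terminal stop codon, if present."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding region length must be a multiple of 3")
    return cds[:-3] if cds[-3:] in STOP_CODONS else cds


def assemble_polycistron(coding_regions, linker_2a):
    """Join coding regions with a 2A-encoding linker into one open reading frame.

    Stop codons are removed from every region so that translation reads
    through each linker; a single stop codon is then appended at the end.
    """
    if len(coding_regions) < 2:
        raise ValueError("at least two coding regions are required")
    linker = linker_2a.upper()
    if len(linker) % 3 != 0:
        raise ValueError("linker length must be a multiple of 3 to stay in frame")
    bodies = [strip_stop(c) for c in coding_regions]
    return linker.join(bodies) + "TAA"


def is_single_orf(seq):
    """True if seq starts with ATG, is in frame, and has its only stop at the end."""
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return (
        len(seq) >= 6
        and len(seq) % 3 == 0
        and codons[0] == "ATG"
        and codons[-1] in STOP_CODONS
        and not any(c in STOP_CODONS for c in codons[:-1])
    )
```

  • for example, calling `assemble_polycistron` on four placeholder coding regions and a placeholder linker yields a sequence with three linkers and one terminal stop codon, whereas simply concatenating complete coding regions leaves internal stop codons and does not form a single open reading frame.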
  • expression cassettes comprising a nucleic acid construct of the present invention are disclosed.
  • an expression cassette comprising a nucleic acid construct is operably linked to a promoter, wherein said promoter drives transcription of a polycistronic message that encodes said reprogramming factors, each reprogramming factor being linked to at least one other reprogramming factor by a self-cleaving peptide.
  • the expression cassette comprises one or more sites that mediate integration into a genome of a mammalian cell. In some embodiments, the expression cassette is integrated into said genome at a locus whose disruption has minimal or no effect on said cell.
  • the construct comprises one or more sites that mediate or facilitate integration of the construct into the genome of a mammalian cell. In some embodiments the construct comprises one or more sites that mediate or facilitate targeting the construct to a selected locus in the genome of a mammalian cell. For example, the construct could comprise one or more regions homologous to a selected locus in the genome.
  • the construct comprises sites for a recombinase that is functional in mammalian cells, wherein the sites flank at least the portion of the construct that comprises the coding regions for the factors (i.e., one site is positioned 5' and a second site is positioned 3' to the portion of the construct that encodes the polyprotein), so that the sequence encoding the factors can be excised from the genome after reprogramming.
  • the recombinase can be, e.g., Cre or Flp, where the corresponding recombinase sites are LoxP sites and Frt sites, respectively.
  • the recombinase is a transposase.
  • the recombinase sites need not be directly adjacent to the region encoding the polyprotein but will be positioned such that a region whose eventual removal from the genome is desired is located between the sites.
  • the recombinase sites are on the 5' and 3' ends of an expression cassette. Excision may result in a residual copy of the recombinase site remaining in the genome, which in some embodiments is the only genetic change resulting from the reprogramming process.
  • the construct comprises a single recombinase site, wherein the site is copied during insertion of the construct into the genome such that at least the portion of the construct that encodes the polyprotein comprising the factors (and, optionally, any other portion of the construct whose eventual removal from the genome is desired) is flanked by two recombinase sites after integration into the genome.
  • the recombinase site can be in the 3' LTR of a retroviral (e.g., lentiviral) vector.
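  • by way of non-limiting illustration only, excision of a cassette flanked by two directly repeated recombinase sites, leaving a single residual site in the genome, can be modeled on strings in Python as follows. The function name `excise_between_sites` and the bracketed site token used with it are hypothetical; the model ignores site orientation, recombinase kinetics, and the biology of the excised circle.

```python
def excise_between_sites(genome, site):
    """Simulate excision of the DNA between the first two copies of a site.

    Returns (genome_after, excised_fragment). One copy of the site remains in
    the genome; the excised fragment carries the other copy. If fewer than two
    sites are present, the genome is returned unchanged.
    """
    first = genome.find(site)
    if first == -1:
        return genome, ""
    end_first = first + len(site)
    second = genome.find(site, end_first)
    if second == -1:
        return genome, ""  # a single site has no partner to recombine with
    end_second = second + len(site)
    excised = genome[end_first:end_second]
    remaining = genome[:end_first] + genome[end_second:]
    return remaining, excised
```

  • in this toy model, a genome written as flank, site, cassette, site, flank is reduced to flank, site, flank, which mirrors the statement above that a residual copy of the recombinase site may be the only remaining genetic change.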
  • the invention provides expression vectors comprising the polycistronic nucleic acid constructs.
  • the expression vectors are retroviral vectors, e.g., lentiviral vectors.
  • the expression vectors are non-retroviral vectors, which may be viral (e.g., adenoviral) or non-viral.
  • the expression vector includes an inducible promoter.
  • the invention provides cells and cell lines (e.g., somatic cells and cell lines such as fibroblasts, keratinocytes, and cells of other types discussed herein) in which a polycistronic nucleic acid construct or expression cassette (e.g., any of the constructs or expression cassettes described herein) is integrated into the genome.
  • the cells are rodent cells, e.g., murine cells.
  • the cells are primate cells, e.g., human cells.
  • At least the portion of the construct that encodes the polyprotein is flanked by sites for a recombinase.
  • a recombinase can be introduced into the cell, e.g., by protein transduction, or a gene encoding the recombinase can be introduced into the cell, e.g., using a vector such as an adenoviral vector.
  • the recombinase excises the sequences encoding the exogenous reprogramming factors from the genome.
  • the cells contain an inducible gene that encodes the recombinase, wherein the recombinase is expressed upon induction and excises the cassette.
  • the inducible gene is integrated into the genome.
  • the inducible gene is on an episome.
  • the cells do not contain an inducible gene encoding the recombinase.
  • the nucleic acid construct or cassette is targeted to a specific locus in the genome, e.g., using homologous recombination.
  • the locus is one that is dispensable for normal development of most or all cell types in the body of a mammal.
  • the locus is one into which insertion does not affect the ability to derive pluripotent iPS cells from a somatic cell having an insertion in the locus.
  • the locus is one into which insertion would not perturb pluripotency of an iPSC.
  • the locus is the COL1A1 locus or the AAV integration locus. In some embodiments the locus comprises a constitutive promoter. In some embodiments the construct or cassette is targeted so that expression of the polycistronic message encoding the polypeptide comprising the factors is driven from an endogenous promoter present in the locus to which the construct or cassette is targeted.
  • the invention further provides pluripotent reprogrammed cells (iPSCs) generated from the somatic cells that harbor the nucleic acid construct or expression cassette in their genome.
  • iPS cells can be used for any purpose contemplated for pluripotent cells.
  • differentiated cell lines (e.g., neural cells, hematopoietic cells, muscle cells, cardiac cells) derived from the pluripotent reprogrammed cells
  • the present invention provides a reprogramming composition, such composition comprising reprogramming factors selected from the group consisting of Sall4 protein, Nanog protein, Esrrb protein, and Lin28 protein, or functional variants or fragments thereof.
  • each of said reprogramming factors comprises a cell-penetrating peptide fused to its C terminus.
  • said cell-penetrating peptide comprises poly-arginine.
  • a method of producing a pluripotent cell from a somatic cell comprising the steps of: (a) introducing exogenous reprogramming factors Sall4, Nanog, Esrrb, and Lin28 into one or more somatic cells; and (b) maintaining said one or more cells under conditions appropriate for and for a period of time sufficient for said exogenous reprogramming factors to activate at least one endogenous pluripotency gene.
  • said period of time comprises a stochastic phase of reprogramming.
  • said cells are maintained for a period of time sufficient for said exogenous reprogramming factors to initiate a sequential phase of reprogramming.
  • such methods further comprise the step of (c) selecting one or more cells which display an early marker of pluripotency.
  • said early marker of pluripotency is selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2.
  • said early marker of pluripotency is a group of early markers of pluripotency consisting of Esrrb, Utf1, Lin28, and Dppa2.
  • step (c) comprises selecting one or more cells which display an early marker of pluripotency and at least one marker of pluripotency.
  • such methods further comprise the step of (d) generating an embryo utilizing said one or more cells which display the early marker of pluripotency.
  • said embryo is a chimeric embryo.
  • such methods further comprise the step of (e) obtaining one or more somatic cells from said embryo.
  • such methods further comprise the step of (f) maintaining said one or more somatic cells under conditions appropriate for and for a period of time sufficient for said exogenous reprogramming factors to activate at least one endogenous pluripotency gene.
  • such methods further comprise the step of (g) differentiating between cells which display one or more markers of pluripotency and cells which do not.
  • the present invention provides an iPSC produced by a method comprising: (a) introducing exogenous reprogramming factors Sall4, Nanog, Esrrb, and Lin28 into one or more somatic cells; and (b) maintaining said one or more cells under conditions appropriate for and for a period of time sufficient for said exogenous reprogramming factors to activate at least one endogenous pluripotency gene.
  • said period of time comprises a stochastic phase of reprogramming.
  • said cells are maintained for a period of time sufficient for said exogenous reprogramming factors to initiate a sequential phase of reprogramming.
  • such method further comprises (c) selecting one or more cells which display an early marker of pluripotency.
  • said early marker of pluripotency is selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2. In some embodiments, said early marker of pluripotency is a group of early pluripotency markers consisting of Esrrb, Utf1, Lin28, and Dppa2. In some embodiments, step (c) comprises selecting one or more cells which display an early marker of pluripotency and at least one marker of pluripotency. In some embodiments, such methods further comprise (d) generating an embryo utilizing said one or more cells which display the early marker of pluripotency. In some embodiments, such embryo comprises a chimeric embryo.
  • such methods further comprise (e) obtaining one or more differentiated somatic cells from said embryo. In some embodiments, such methods further comprise (f) maintaining said one or more differentiated somatic cells under conditions appropriate for and for a period of time sufficient for said reprogramming factors to activate at least one endogenous pluripotency gene. In some embodiments, such methods further comprise (g) differentiating between cells which display one or more markers of pluripotency and cells which do not.
  • said iPSC comprises a primary iPSC. In some embodiments, said iPSC comprises a secondary iPSC.
  • the present invention provides a method of selecting a somatic cell that is likely to be reprogrammed to a pluripotent state, such method comprising (a) measuring expression of one or more early markers of pluripotency in a population of a plurality of somatic cells; (b) sorting the population of the plurality of somatic cells into a plurality of populations of single somatic cells; and (c) measuring expression of the one or more early markers of pluripotency in each population of single somatic cells, wherein increased expression of the one or more early markers of pluripotency in each population of single somatic cells as compared to expression of the one or more early markers of pluripotency in the population of the plurality of somatic cells indicates that the single somatic cell is a somatic cell that is likely to be reprogrammed to the pluripotent state.
  • said one or more early markers of pluripotency are selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2. It should be appreciated that the steps of sorting the somatic cells and measuring expression of the one or more early markers of pluripotency can be accomplished by various methods which are well known in the art (e.g., see Example 4 below).
  • the present invention provides a method of selecting a cell that is likely to become programmed to a pluripotent state, such method comprising (a) maintaining a population of a plurality of differentiated somatic cells containing at least one exogenously introduced factor that contributes to reprogramming of said cells to a pluripotent state under conditions appropriate for proliferation and for reprogramming of said cells to occur; (b) sorting said population of said plurality of cells into a plurality of populations of single cells; and (c) isolating said sorted cells which display one or more early markers of pluripotency, wherein each sorted cell which displays said one or more early markers of pluripotency is a cell that is likely to become programmed to the pluripotent state.
  • said one or more early markers of pluripotency are selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2.
  • the sorting and isolating steps of the inventive method can be accomplished according to routine methods well known to those of ordinary skill in the art. Exemplary methods of sorting and isolating such cells can be found in Example 4 below.
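  • by way of non-limiting illustration only, the comparison of early-marker expression in sorted single cells against the unsorted population can be sketched in Python as follows. The function name `likely_to_reprogram`, the dictionary-based data layout, and the `min_markers` cutoff are assumptions made for this sketch; the disclosure does not prescribe how many markers must be increased or how expression values are normalized.

```python
EARLY_MARKERS = ("Esrrb", "Utf1", "Lin28", "Dppa2")


def likely_to_reprogram(population_level, single_cell_levels,
                        markers=EARLY_MARKERS, min_markers=1):
    """Return ids of single cells whose early-marker expression exceeds the population's.

    population_level: dict marker -> expression measured on the unsorted population.
    single_cell_levels: dict cell_id -> dict marker -> expression in that sorted cell.
    min_markers: hypothetical cutoff for how many markers must be increased.
    """
    selected = []
    for cell_id, levels in single_cell_levels.items():
        # Markers measured in both the cell and the population, and higher in the cell.
        increased = [
            m for m in markers
            if m in levels and m in population_level and levels[m] > population_level[m]
        ]
        if len(increased) >= min_markers:
            selected.append(cell_id)
    return selected
```

  • in practice the expression values might come from single-cell quantitative PCR or a similar assay; the sketch only shows the comparison step and leaves measurement and sorting to the methods referenced above.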
  • the present invention provides a method for increasing the efficiency of the expansion of induced pluripotent stem cells, such method comprising (a) maintaining a population of differentiated somatic cells that contains at least one exogenously introduced factor that contributes to reprogramming of said population of cells to a pluripotent state under conditions appropriate for proliferation and for reprogramming of said cells to occur; (b) monitoring each cell in said population of cells for the expression of one or more early pluripotency markers, wherein cells expressing the one or more early pluripotency markers are more likely to become programmed to a pluripotent state than cells which do not express the one or more early pluripotency markers; (c) isolating each cell in said population of cells that expresses the one or more early pluripotency markers; and (d) expanding only those cells which express the one or more early pluripotency markers, thereby increasing the efficiency of the expansion of induced pluripotent stem cells.
  • said one or more early pluripotency markers are selected from the group consisting of Esrrb, Utf1, Lin28, Dppa2, and combinations thereof.
  • said monitoring of said cells is performed during a stochastic phase of reprogramming.
  • proliferation of said cell forms a clonal colony of said cell.
  • the present invention provides a method of increasing the likelihood that a differentiated somatic cell subjected to a reprogramming protocol will become reprogrammed to an iPSC, comprising introducing into the differentiated somatic cell one or more early pluripotency factors prior to subjecting the differentiated somatic cell to said reprogramming protocol.
  • the early markers of pluripotency of the present invention are more predictive than conventional pluripotency markers in identifying cells which are destined to become iPSCs; for example, when these early pluripotency markers are observed in a cell undergoing a reprogramming protocol, the cell is more likely to become an iPSC as compared to cells undergoing the same reprogramming protocol which do not display the early pluripotency markers.
  • one or more early pluripotency markers can serve as early pluripotency factors that can be introduced into a differentiated cell to increase the likelihood that the cell will become reprogrammed to an iPSC.
  • said one or more early pluripotency factors are selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2.
  • the present invention includes a method of
  • the present invention provides a method of isolating an iPS colony, such method comprising: (a) introducing exogenous reprogramming factors Sall4, Nanog, Esrrb, and Lin28 into a differentiated mammalian somatic cell; (b) culturing said differentiated somatic cell in a suitable medium under conditions appropriate for and for a time period sufficient for proliferation of and reprogramming of said cells to occur; and (c) isolating one or more colonies visible in said culture after said period of time.
  • each of said exogenous reprogramming factors is introduced into said cell in the form of mRNA optionally complexed with a cationic vehicle, wherein said mRNA comprises in vitro transcribed mRNA comprising one or more of a 5' cap, an open reading frame flanked by a 5' untranslated region containing a strong Kozak translation initiation signal and an alpha-globin 3' untranslated region, a polyA tail, and one or more modifications which confer stability to the mRNA.
  • such method further comprises (d) growing said isolated one or more colonies on a layer of feeder cells in the absence of an inducer of said inducible transgenes. In some embodiments, such method further comprises (e) passaging said one or more grown colonies at least once.
  • a method of enhancing isolation of iPSCs comprising (d) sorting said one or more colonies visible in said culture after said period of time according to step (c) of the method of isolating an iPS colony into single cells; (e) differentiating between said sorted cells which display one or more early markers of pluripotency and said sorted cells which do not display one or more early markers of pluripotency; and (f) isolating said sorted cells which display one or more early markers of pluripotency.
  • said early markers of pluripotency are a combination of early pluripotency markers selected from any of Esrrb, Utf1, Lin28, and Dppa2.
  • the present invention provides mouse iPS cells (e.g., cell lines) characterized by an efficiency of said mouse iPS cell of generating live offspring by tetraploid complementation, wherein said efficiency is at least 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, or more, or any intervening particular value or subrange, such as between 5% and 10%, between 10% and 15%, etc.
  • the present invention provides mouse iPS cells characterized by an ability of said mouse iPS cells of generating live offspring by tetraploid
  • rat iPS cells/cell lines/animals are provided.
  • a mouse iPS cell characterized by an efficiency of said mouse iPSC of generating live offspring by tetraploid complementation is produced by a method comprising: (a) transfecting mouse embryonic fibroblasts with a doxycycline-inducible vector comprising reprogramming factors Sall4, Nanog, Esrrb and Lin28 operably linked to a tetracycline operator and a CMV promoter; (b) culturing said mouse embryonic fibroblasts under conditions suitable and for a time period sufficient for proliferation and reprogramming of said mouse embryonic fibroblasts to occur; (c) exposing said culture to an effective amount of doxycycline for a period of time sufficient for one or more iPS colonies to form; (d) isolating said one or more iPS colonies; (e) growing said isolated iPS colonies on feeder cells in the absence of doxycycline; and optionally (f) passaging said grown iPS colonies at least once prior to carrying out t
  • the present invention provides a collection of reprogramming factors capable of producing a mouse iPS cell characterized by an efficiency of said mouse iPSC of generating live offspring by tetraploid complementation of at least 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, or more, or any intervening subrange comprising Sall4, Nanog, Esrrb, and Lin28.
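  • by way of non-limiting illustration only, an efficiency of generating live offspring by tetraploid complementation can be expressed as a percentage and compared with a threshold such as 5% in Python as follows. The function names are hypothetical, and the choice of denominator (the number of complemented embryos transferred) is an assumption made for this sketch; the passages above do not define how the efficiency is calculated.

```python
def tetraploid_complementation_efficiency(live_offspring, embryos_transferred):
    """Percentage of transferred complemented embryos that yield live offspring.

    The denominator is an assumption of this sketch, not a definition from
    the source.
    """
    if embryos_transferred <= 0:
        raise ValueError("embryos_transferred must be positive")
    if not 0 <= live_offspring <= embryos_transferred:
        raise ValueError("live_offspring must be between 0 and embryos_transferred")
    return 100.0 * live_offspring / embryos_transferred


def meets_threshold(live_offspring, embryos_transferred, threshold_percent=5.0):
    """True if the efficiency is at least threshold_percent (e.g., 5, 10, or 15)."""
    efficiency = tetraploid_complementation_efficiency(live_offspring, embryos_transferred)
    return efficiency >= threshold_percent
```

  • under these assumptions, 12 live offspring from 100 transferred embryos would correspond to an efficiency of 12%, which satisfies an "at least 10%" criterion but not an "at least 15%" criterion.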
  • kits can contain any of the cells or compounds described herein or combinations thereof.
  • the invention provides a kit containing cells of an iPSC line of the invention. The cells can be provided frozen.
  • the kit further comprises at least one item selected from the group consisting of (a) instructions for thawing, culturing, and/or characterizing the iPSCs; (b) reagent(s) useful for characterizing the iPSCs.
  • reagent could be, e.g., antibody(ies) for detecting a cell marker or probe(s) (e.g., for performing FISH).
  • the invention further provides a kit for generating a reprogrammed cell in vitro, such kit comprising: (a) a set of reprogramming factors comprising Sall4, Nanog, Esrrb and Lin28, which are capable alone, or in combination with one or more additional reprogramming factors, of reprogramming said mammalian somatic cells to a pluripotent state, wherein the kit optionally comprises (b) a medium suitable for culturing mammalian iPS cells and/or (c) a population of mammalian somatic cells, and wherein the reprogramming factors are optionally provided as one or more nucleic acids (e.g., one or more vectors) encoding said reprogramming factors.
  • the kit further comprises (d) one or more reagents for an assay for detecting one or more markers of pluripotency. Suitable reagents for such an assay for detecting one or more markers of pluripotency are apparent to those skilled in the art.
  • the one or more markers of pluripotency are selected from the group consisting of Fbxo15, Nanog, Oct4, Sox2, Sall4 and combinations thereof.
  • the one or more markers of pluripotency are early markers of pluripotency selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2.
  • the kit further includes (e) instructions for preparing the medium; (f) instructions for deriving or culturing pluripotent cells; (g) serum replacement; (h) albumin; (i) at least one protein or small molecule useful for deriving or culturing iPS cells, wherein the protein or small molecule activates or inhibits a signal transduction pathway; and (j) at least one reagent useful for characterizing pluripotent cells.
  • at least some of the ingredients are dissolved in liquid. In some embodiments, at least some of the ingredients are provided in dry form.
  • Dppa2 as a reprogramming factor, either alone or in combination with one or more additional reprogramming factors or reprogramming agents.
  • Dppa2 is used, either alone or in combination with one or more additional reprogramming factors or reprogramming agents, to replace Nanog.
  • Sall4, Lin28, Esrrb, and Dppa2 are used in any of the compositions, methods, kits, cells, or vectors described herein.
  • Sall4, Lin28, Esrrb, and Dppa2 (SLED) are used to reprogram a cell to a less differentiated state, e.g., a pluripotent state.
  • Esrrb as a reprogramming factor and a ligand (e.g., an agonist) for Esrrb as a reprogramming agent.
  • a ligand may enhance nuclear translocation or activity of Esrrb.
  • Lin28 is supplemented by, or replaced as a reprogramming factor by, any of a variety of different reprogramming factors or reprogramming agents.
  • Ezh2, Kdm1, and/or Utf1 is used instead of, or in addition to, Lin28 in any of the compositions or methods herein.
  • Ezh2, Kdm1, and/or Utf1 is used instead of, or in addition to, Lin28 to reprogram a cell to a less differentiated state, e.g., a pluripotent state.
  • reprogramming is performed using Sall4, Esrrb, Dppa2, and Ezh2.
  • reprogramming is performed using Sall4, Esrrb, Dppa2, and Kdm1.
  • reprogramming is performed using Sall4, Esrrb, Dppa2, and Utf1.
  • Lin28 can be omitted from reprogramming factor combinations without necessarily replacing it by a different reprogramming factor or reprogramming agent.
  • Lin28 is omitted from a composition, kit, or method herein.
  • reprogramming is performed without Lin28, e.g., using a combination comprising or consisting of Sall4, Nanog, and Esrrb or using a combination comprising or consisting of Sall4, Esrrb, and Dppa2.
  • reprogrammed cells, e.g., iPSCs, generated as described herein (e.g., using SNEL reprogramming factors) are more suitable for use in cell therapy as compared with reprogrammed cells generated using at least some other methods, e.g., generated through use of at least 1, 2, 3, or all 4 of the OKSM factors.
  • reprogrammed cells, e.g., iPSCs, generated as described herein (e.g., using SNEL reprogramming factors) have reduced immunogenicity as compared with reprogrammed cells generated using at least some other methods, e.g., generated through use of at least 1, 2, 3, or all 4 of the OKSM factors.
  • reprogrammed cells, e.g., iPSCs, generated as described herein (e.g., using SNEL reprogramming factors) have reduced tumorigenicity as compared with reprogrammed cells generated using at least some other methods, e.g., generated through use of at least 1, 2, 3, or all 4 of the OKSM factors.
  • the disclosure provides a gene expression signature that may be used for a variety of purposes.
  • the gene expression signature comprises expression levels of the genes listed in Table S1 or counterparts thereof (e.g., orthologs in other organisms, e.g., humans).
  • measurement of expression levels of the genes or a subset thereof may be used to identify iPS cells that exhibit high developmental potential (e.g., as compared with iPS cells generated using the OKSM factors).
  • measurement of expression levels of the genes or a subset thereof may be used to identify iPS cells that exhibit superior quality (e.g., as compared with iPS cells generated using the OKSM factors).
  • a subset comprises at least 10, 20, 50, 100, 200, 300, 500, 700, 900, 1100, 1300, or 1500 genes listed in Table S1.
  • Gene expression levels may be measured by measuring mRNA, protein or other gene product. Any suitable method may be used.
  • gene expression may be measured using RNA-Seq, microarray analysis, or quantitative PCR.
  • iPSCs are classified based on the gene expression profile. For example, whether the iPSCs' gene expression profile more closely resembles that of high quality iPSCs or poor quality iPSCs may be determined.
  • Hierarchical clustering or PCA may be used, for example, to determine whether a particular iPSC population (e.g., colony, culture, cell line, etc.) clusters with high quality iPSCs as described herein or clusters with poor quality iPSCs as described herein.
  • iPSCs of superior quality, e.g., those that cluster with high quality iPSCs as described herein
  • a gene expression signature may be used in identifying compounds or conditions that promote formation of superior quality iPSCs.
  • compounds or conditions may be used in a reprogramming protocol and their effect on gene expression profile of somatic cells subjected to the reprogramming protocol may be assessed.
  • Compounds that promote a gene expression profile resembling that of high quality iPSCs may be identified.
  • Such compounds may be used in a reprogramming protocol to generate iPSCs, e.g., high quality iPSCs.
  • Example 1 Single-cell expression profiling at defined time points during the reprogramming process
  • sm-mRNA-FISH single-molecule-mRNA fluorescent in situ hybridization
  • Fluidigm analysis involves the sorting of single cells, lysis, cDNA synthesis, pre-amplification of targets, and quantification of gene expression using TaqMan quantitative real-time polymerase chain reactions (qRT-PCR) on the BioMark system (Guo et al., 2010).
  • sm-mRNA-FISH entails probing each mRNA species with 48 fluorophore-labeled oligonucleotide probes, imaging mRNAs by fluorescence microscopy, and quantifying and assigning mRNAs to single cells (Raj et al., 2008).
  • clonal doxycycline (dox)-inducible 'secondary' NGFP2 MEFs (Wernig et al., 2008). Briefly, these cells contain pro-viral integrations of Oct4, Sox2, Klf4, and c-Myc, each under the TetO promoter, reverse tetracycline transactivator (rtTA) in the Rosa26 locus, and a GFP reporter knocked into the Nanog locus (Silva et al., 2009).
  • rtTA reverse tetracycline transactivator
  • the presence of the tdTomato reporter enabled us to sort single secondary cells in the presence of unmarked feeder cells.
  • Unmarked feeder cells were important both for cell-cell interactions that enable proliferation of the tdTomato-single cells and for the calibration of the FACS machine before sorting (i.e., tdTomato-positive cells vs. tdTomato-negative cells). This system allowed us to trace those rare tdTomato-positive cells that bypassed senescence and contact inhibition and continued to proliferate to form clonal colonies on top of the feeders.
  • tdTomato-NGFP2 MEFs were exposed to dox for six days and sorted for tdTomato-positive cells, which were then each seeded as a single cell in one well of four 24-well plates containing unmarked feeders. At different time points (between one and three weeks) during the reprogramming process, tdTomato-positive colonies that were derived from the single cells were imaged, split to another plate, sorted to single cells and analyzed for their transcriptional profile using the Fluidigm BioMark. Each parental cell was passaged to test its capacity to generate dox-independent, fully
  • Colony 44 contained a few cells with a very low level of GFP (Figures 2A-2C) that disappeared upon continual passaging and dox withdrawal. A few cells (0.01%) from Colony 23 activated GFP at day 81 but those cells did not give rise to stable iPSC colonies.
  • Example 2 Behavior of single cells during the reprogramming process within a cell population
  • BioMark system is a 96 × 96 matrix of cycle threshold (Ct) values (Figures 3A-3B).
  • Ct cycle threshold
  • Normalized expression value of a gene in an individual cell was derived by normalizing the average Ct of the gene replicates to the average Ct values of the control genes Hprt and Gapdh of that cell. Cells with low or absent endogenous control gene expression levels were removed from analysis (For more details see Supplemental Methods).
  • PCA principal component analysis
  • PC1 and PC2 (Figure 4A)
  • the first cluster contains the three control groups, tail tip fibroblasts (TTF), mouse embryonic fibroblasts (MEFs) and NGFP2 MEFs. In addition, it contains GFP- cells exposed to dox for two, four and six days, and dox-dependent GFP- cells (yellow dotted).
  • the second cluster (orange, red, brown, enclosed in the red circle) contains dox-dependent and independent GFP+ cells and the parental NGFP2 iPSCs.
  • the third, rather heterogeneous cluster contains cells primarily from the early colonies prior to the activation of the Nanog-GFP locus, possibly representing an early intermediate state. Importantly, a few cells from earlier time points (green and yellow dots) showed a similar pattern of expression as in the second cluster. This agrees with the observation that iPS colonies appear with different latencies and that early colonies with ES-like morphology may not be dox-independent. Cells on dox for four days cluster very closely to the MEFs, suggesting that the epigenetic changes that characterize a fully reprogrammed iPS cell do not occur early in the reprogramming process (Guo et al., 2010) (Figure 4A).
  • JSD Jensen-Shannon Divergence
  • a bootstrapping method was used to resample the gene expression probability vectors from each group with replacement and derive a 95% confidence interval.
  • a steep decrease in variation was observed after the activation of the Nanog locus (GFP+ cells), suggesting that the activation of the endogenous Nanog locus marks events that drive the cells to pluripotency (Silva et al., 2009).
  • Colony 23 failed to activate GFP in the majority of cells upon continual passaging to day 81. Ultimately, only a very small fraction of these cells activated the endogenous Nanog locus (0.01% GFP+). Colony 44 contained a few cells with a low level of GFP that appeared at day 61 and disappeared upon continual passaging and dox withdrawal. Colonies 23 and 44 were induced cells that did not give rise to iPSCs; thus we termed them 'partially reprogrammed' colonies.
  • Fbxo15, Fgf4, and endogenous Oct4 were expressed in some of these partially reprogrammed colonies at levels similar to those seen in iPS cells (Figure 5A and Figure 7).
  • Fbxo15 showed a bimodal distribution in both colonies 44 and 23, while Fgf4 showed bimodality in colony 44 and unimodality in colony 23.
  • endogenous Oct4 was highly expressed in the partially reprogrammed colony 23.
  • Example 5 Activation of endogenous Sox2 is a late phase in reprogramming that initiates a series of consecutive steps toward pluripotency
  • Late markers of reprogramming cells would be expected to express no or very low transcript levels at early time points and high transcript levels as the cells mature and become iPSCs.
  • Gdf3 and Sox2 as genes that appeared late in the process with very low levels of expression at early time points as measured by Fluidigm BioMark and sm-mRNA-FISH (Figures 10A-10F).
  • Gdf3 was activated in the partially reprogrammed cells while Sox2 was not, suggesting Sox2 may be a sufficient late marker for iPSCs (Figures 10A-10F).
  • a Bayes network model is a probabilistic model that represents a set of variables and their conditional dependencies. For example, given that Sall4 is expressed, the expression of Oct4, Fgf4, Nr6a1, and Fbxo15 is conditionally independent of whether Sox2 is expressed or not.
  • if Sox2 initiates a sequence of activation and first activates Sall4 and then activates the four downstream target genes, one should not find a cell that expresses Sox2 and one of the four downstream genes (Oct4, Fgf4, Nr6a1, and Fbxo15) without Sall4 expression.
  • Sox2 activates Sall4 and then activates the downstream gene Fgf4.
  • Sox2 first activates Lin28 and then induces the downstream gene Dnmt3b.
  • Sox2 activates Sall4 and then activates the downstream gene Fbxo15.
  • Combination 1 While 186 out of a total of 279 cells examined were negative for expression, 25 cells expressed one gene, 38 cells expressed two genes, and 30 cells expressed all three genes. Notably, no double positive cells were seen that co-expressed Sox2 and Fgf4 (Figure 9B).
  • Combination 2 Out of a total of 283 cells examined, 82 cells were positive for any of the genes with 49 cells expressing one, 23 cells expressing two and 10 cells expressing all three genes. No cells expressing just Sox2 and Dnmt3b were detected (Figure 9C).
  • Combination 3 Of 275 cells examined, 101 cells were positive for any of the three genes with 50 cells expressing one, 30 cells expressing two and 20 cells expressing all three genes, but only one cell was found that expressed just Sox2 and Fbxo15 at a very low level (Figure 9D). These data support the sequential activation of Sall4 and Lin28 by Sox2 followed by the activation of Fgf4, Fbxo15, and Dnmt3b, respectively, in the Sox2-positive cells, consistent with a model of a hierarchical activation of key pluripotency genes.
  • Example 6 The hierarchical model of gene activation predicts transcription factor combinations with the capability to induce reprogramming
  • Combination (1) replaced Sox2 with Esrrb because the network predicted that Esrrb could activate Sox2 (Figure 11A).
  • Combination (2) replaced Oct4 with Sall4 because Sall4 was predicted to be upstream of Oct4 (Figure 11B).
  • Combination (3) omitted both Sox2 and Oct4 because the model predicted that Lin28, Sall4, Esrrb, and Nanog can drive the cells to pluripotency independently of the two master regulators Sox2 and Oct4 (Figure 11C). Nanog was co-transduced in all combinations because the model predicted that this gene also functioned independently of Sox2 and Oct4 (Figure 9A). Fibroblasts were transduced with the three different combinations as well as with Klf4 and c-Myc to induce proliferation.
  • Ezh2 Overexpressing Ezh2 enhanced reprogramming and knocking down Ezh2 inhibited reprogramming, consistent with a positive effect of Ezh2 on the reprogramming process.
  • Lin28, Sall4, and Esrrb facilitated the reprogramming process after 10 days of dox exposure followed by 4 days of dox withdrawal, while Nanog facilitated the reprogramming process after 13 days of dox exposure followed by 3 days of dox withdrawal.
  • Each ORF was cloned into the TOPO-TA vector (Invitrogen), and then restricted with EcoRI or MfeI and inserted into the FUW-teto expressing vector.
  • Replication-incompetent lentiviral particles were packaged in 293T cells with a VSV-G coat and used to infect MEFs containing M2rtTA and Oct4-GFP or Nanog-GFP or Sox2-GFP, or TTFs with M2rtTA.
  • Viral supernatants from cultures were filtered through a 0.45 µm filter and added to the cells after 48, 60 and 72 hours post infection. One day after the last infection the cells were exposed to 2 µg/ml doxycycline for 45 days. The cells were cultured in ES medium (DMEM
  • iPS colonies were isolated between 15-45 days post dox exposure and grown on feeder cells in the absence of doxycycline. Stable colonies were passaged twice before used in the functional assay.
  • iPSCs were derived from an agouti mouse and could be identified by coat color as adults.
  • Blastocysts (94-98 hr after hCG injection) were placed in a drop of HEPES-CZB medium under mineral oil.
  • a flat tip microinjection pipette with an internal diameter of 16 µm was used for iPS cell injections.
  • Each blastocyst received 8-10 iPS cells. After injection, blastocysts were cultured in potassium simplex optimization medium (KSOM) and placed at 37°C until transferred to recipient females.
  • KSOM potassium simplex optimization medium
  • each chimeric male was set up for mating with 2 C57BL/6 females.
  • the second Oct4-GFP SNEL iPSC line tested, #1, was even more efficient.
  • Kdm1a (7) Sall4, Esrrb, Dppa2, and Utf1.
  • Dox was stopped at day 25.
  • GFP-expressing stable iPS colonies were detected and were picked 5 days after cessation of dox.
  • the efficiency of reprogramming using these combinations was estimated to be slightly lower than when Lin28 was used in combination with Sall4, Esrrb, and Dppa2.
  • Example 9 Reprogramming by Sall4, Nanog, Esrrb and Lin28 produces high quality iPSCs with a molecular signature of developmental potency that resembles that of ESCs
  • OSKM-derived iPSCs may have reduced differentiation potential as compared to ESCs derived by somatic cell nuclear transfer (SCNT), which are equivalent in their developmental potential to ESCs derived from the fertilized egg (Jiang et al. 2011; Kim et al. 2010; Polo et al. 2010; Brambrink et al. 2006; Wakayama et al. 2006).
  • OSKM-derived iPSCs exhibit genetic and epigenetic aberrations throughout the genome that are distinct from ESCs (Kim et al. 2010; Polo et al.
  • Nanog-GFP or Oct4-GFP MEFs were infected with dox-inducible lentiviruses encoding the four reprogramming factors (SNEL) and cultured until the appearance of iPSC colonies.
  • the efficiency of the reprogramming process was low, producing 2-5 colonies per 1 × 10^5 plated cells with a latency that ranged between 14 and 60 days.
  • 10 SNEL-iPSC colonies (6 from Nanog-GFP and 4 from Oct4-GFP MEFs).
  • the resulting iPSC colonies expressed a bright GFP signal from either the Oct4 or the Nanog locus and upregulated key pluripotency markers such as Sox2, endogenous Sall4, Utf1, endogenous Esrrb, Dppa2, Dppa3, Lin28 and Rex1 as assessed by immunostaining and quantitative real time PCR (qRT-PCR) (Figures 14B and 14C).
  • qRT-PCR quantitative real time PCR
  • SSLP Simple sequence length polymorphism
  • iPSC lines for microarray analysis, i) "Poor quality” iPSCs: This group included the three OSKM-iPSC lines Nanog-GFP OSKM#2, Oct4-GFP OSKM#2 and KH2 OSKM (Stadtfeld et al. 2010), that either did not produce fully developed pups or produced very low number of pups; ii) "Good quality” iPSCs: This group included BC_2 OSKM (Carey et al.
  • Tetraploid complementation is the most stringent assay for pluripotency and only a small fraction of iPSCs have been shown to be 4n competent (Pera, M.F. 2011; Zhao et al. 2009; Jiang et al. 2012; ang et al. 2009; Boland et al. 2009; Jiang et al. 2011).
  • Our experiments show that the quality of iPSCs as assessed by 4n competence is significantly influenced by the choice of factors used to induce conversion.
  • lentiviral vectors containing Sall4, Nanog, Esrrb and Lin28 under control of the tetracycline operator and a minimal CMV promoter has been described previously (Buganim et al. 2012). iPSCs were generated from
  • 129SvJae/C57BL/6 MEFs containing Oct4-GFP or Nanog-GFP reporter and the M2rtTA in the Rosa26 locus were cultured in mESC medium containing 2 µg/ml doxycycline. Twenty colonies were isolated for derivation and all yielded stable cell lines. iPSC lines derived at different time points were further confirmed for pluripotent properties by immunofluorescent analysis of Sox2 (MAB2018, R&D), Sall4 (ab29112, Abcam), Utf1 (ab24273, Abcam) and Esrrb (PP-H6705-00, Perseus Proteomics).
  • Teratoma assays were performed by injecting iPSCs into the subcutaneous flanks of SCID mice, followed by histological examination of the tumors 4-5 weeks later. Microarray analysis, bisulfite genomic sequencing, SSLP analysis, and qRT-PCR were performed. Tetraploid embryo complementation was carried out as described (Carey et al. 2011) by injecting iPSCs (agouti coat origin) into BDF1 tetraploid embryos (4n). Pups were naturally born or delivered by cesarean section at day E19.5, and analyzed for morphology and developmental competency.
  • Mouse embryonic fibroblasts were grown in DMEM supplemented with 10% fetal bovine serum, 1% non-essential amino acids, 2 mM L-glutamine and antibiotics.
  • ESCs and iPSCs were grown in DMEM supplemented with 15% fetal bovine serum, 1% non-essential amino acids, 2 mM L-glutamine, 2 × 10^6 units mLif, 0.1 mM β-mercaptoethanol (Sigma) and antibiotics or in 2i medium.
  • Five hundred milliliters of 2i medium were generated by including: 230 mL DMEM/F12
  • MEFs were isolated from mice heterozygous for the reverse tetracycline-dependent transactivator (M2rtTA) that resides in the ubiquitously expressed Gt(ROSA)26Sor locus (Beard et al. 2006) and carrying GFP knocked into either the Nanog or the Oct4 locus.
  • M2rtTA reverse tetracycline-dependent transactivator
  • lentiviral vectors (FUW-teto) containing Oct4, Sox2, Klf4 and c-Myc (OSKM) or Sall4, Nanog, Esrrb and Lin28 (SNEL) under control of the tetracycline operator and a minimal CMV promoter has been described previously (Brambrink et al. 2006; Buganim et al. 2012).
  • Replication-incompetent lentiviral particles were packaged in 293T cells with a VSV-G coat and used to infect MEFs containing M2rtTA and Oct4-GFP or Nanog-GFP MEFs. Viral supernatants from cultures were filtered through a 0.45 µm filter and added to the cells. To initiate reprogramming the cells were grown in ESC medium + 2mg/ml Doxycycline
  • DMEM DMEM supplemented with 15% FBS (Hyclone), leukemia inhibitory factor, beta-mercaptoethanol (Sigma-Aldrich), penicillin/streptomycin, L-glutamine and nonessential amino acids.
  • electrofusion was performed at approximately 44-47 h post hCG using a BEX LF-101 or LF-301 cell fusion apparatus (Protech International Inc., Boerne, Texas). Both fused and diploid embryos were cultured in KSOM (Millipore) or Zenith culture medium (Zenith Biotech) until they formed blastocysts (94-98 h after hCG injection), at which point they were placed in a drop of xxx (Zenith) medium under mineral oil. A flat tip microinjection pipette with an internal diameter of 16 µm was used for iPSC injections. Each blastocyst received 10-12 iPSCs.
  • blastocysts were transferred to day 2.5 recipient CD1 females (20 blastocysts per female). Pups, when not born naturally, were recovered at day 19.5 by cesarean section and fostered to lactating Balb/c mothers.
  • SNEL-iPSCs were fixed in 4% paraformaldehyde in PBS for 20 min, rinsed 3× with PBS, blocked for 1 h with PBS containing 0.1% Triton X-100 and 5% FBS, and incubated O/N with one of the following antibodies: Sox2 (MAB2018, R&D), Sall4 (ab29112, Abcam), Utf1 (ab24273, Abcam) and Esrrb (PP-H6705-00, Perseus Proteomics). The cells were washed 3× with PBS, incubated with the relevant secondary antibody (Invitrogen) for 1 h and visualized under a fluorescence microscope (Nikon Eclipse Ti-U). Teratoma assay
  • ESCs (1 × 10^6) were injected subcutaneously into SCID mice (Taconic). Mice were euthanized 3 weeks after injection and tumors were collected and fixed in formalin for two days followed by embedding in paraffin, sectioning and staining with hematoxylin and eosin for histological analysis following standard procedures.
  • lentiviral vectors containing Klf4, Sox2, Oct4 and Myc under control of the tetracycline operator and a minimal CMV promoter has been described previously (Brambrink et al., 2008). Lentiviral vectors containing the following factors (Lin28, Sall4, Ezh2, Esrrb, Nanog, Utf1, Dppa2, and Kdm1a) under control of the tetracycline operator and a minimal CMV promoter were generated by cloning the open reading frame of each factor, obtained by reverse transcription with specific primers (see Supplemental Methods), into the TOPO-TA vector (Invitrogen), and then restricted with EcoRI or MfeI and inserted into the FUW-teto expressing vector.
  • factors Lin28, Sall4, Ezh2, Esrrb, Nanog, Utf1, Dppa2, and Kdm1a
  • Replication-incompetent lentiviral particles were packaged in 293T cells with a VSV-G coat and used to infect MEFs containing M2rtTA and Oct4-GFP or NGFP2-MEFs. Viral supernatants from cultures were filtered through a 0.45 µm filter and added to the cells. To initiate reprogramming the cells were grown in ES cell medium + 2mg/ml Doxycycline (DMEM supplemented with 15% FBS (Hyclone), leukemia inhibitory factor, beta-mercaptoethanol (Sigma-Aldrich), penicillin/streptomycin, L-glutamine and nonessential amino acids).
  • DMEM 2mg/ml Doxycycline
  • blastocysts were cultured in potassium simplex optimization medium (KSOM) and placed at 37°C until transferred to recipient females. About 15-20 injected blastocysts were transferred to each uterine horn of 2.5-day-postcoitum pseudopregnant B6D2F1 female.
  • KSOM potassium simplex optimization medium
  • FACS fluorescence-activated cell sorting
  • TetO-tdTomato construct TetO-tdTomato construct.
  • the transduced cells were selected using the Zeocin (400 µg/ml) antibiotic.
  • Zeocin 400ug/ml
  • MEF isolation chimeric embryos were isolated at E13.5, and the head and internal organs were removed. The remaining tissue was physically dissociated and incubated in trypsin at 37 °C for 20 min, after which cells were resuspended in MEF media containing puromycin µg/ml, selection against the Zeocin (400 µg/ml) antibiotic.
  • M2rTtA M2rTtA
  • Secondary MEFs used for the described experiments were thawed and experiments plated 2 days before dox addition.
  • Cells were plated at an optimal density of 50,000 cells per 6-well plate and reprogrammed with mouse ES medium supplemented with 2 µg/ml doxycycline (Sigma).
  • TMR tetramethylrhodamine
  • Alexa 594 Invitrogen
  • Cy5 GE Amersham
  • expression values of the pluripotency-associated genes Oct4, Sox2, Nanog, Lin28, Fbxo15, Zfp42, Fut4, Tbx3, Esrrb, Dppa2, Utf1, Sall4, Gdf3 and Fgf4 that were lower than the maximum values observed in MEF samples are potential false positives and are thus set to zero.
  • PCA Single-cell Data Visualization Principal component analysis
  • bpca Bayesian Principal Component Analysis
  • MVE missing value estimation
  • Inventoried TaqMan assays were pooled to a final concentration of 0.2 for each of the 48 assays.
  • Individual cells were sorted directly into 5 µl RT-PreAmp Master Mix (2.5 µl CellsDirect Reaction Mix (Invitrogen); 1.25 µl 0.2 pooled assays; 0.1 µl RT/Taq enzyme [CellsDirect qRT-PCR kit, Invitrogen]; 1.15 µl water).
  • Cell lysis and sequence-specific reverse transcription were performed at 50°C for 15 min. The reverse transcriptase was inactivated by heating to 95°C for 2 min.
  • cDNA went through sequence-specific amplification by denaturing at 95°C for 15 s, and annealing and amplification at 60°C for 4 min for 18 cycles.
  • preamplified products were diluted 5-fold prior to analysis with Universal PCR Master Mix and inventoried TaqMan gene expression assays (ABI) in 96.96 Dynamic Arrays on a BioMark System (Fluidigm). Ct values were calculated from the system's software (BioMark Real-time PCR Analysis; Fluidigm). Each assay was performed in replicate.
  • JSD Jensen-Shannon Divergence
  • CIs Confidence intervals
  • Bayesian network was constructed using BNFinder (Wilczynski and Dojer, 2009). Cells used are listed below.
  • Esrrb- GCTGGAACACCTGAGGGTAA GGTCTCCACTTGGATCGTGT cDNA (SEQ ID NO. 3) (SEQ ID NO. 4) Lin28- HANNA ET AL.2009 NATURE HANNA ET AL.2009 NATURE cDNA
  • Wdr5 mediates self-renewal and reprogramming via the embryonic stem cell core transcriptional network.
  • mice generated from induced pluripotent stem cells. Nature 461, 91-94.
  • Reprogramming factor stoichiometry influences the epigenetic state and biological properties of induced pluripotent stem cells.
  • Nanog a new recruit to the embryonic stem cell orchestra. Cell 113, 551-552.
  • Citri, A., Pang, Z.P., Sudhof, T.C., Wernig, M., and Malenka, R.C. (2012). Comprehensive qPCR profiling of gene expression in single neuronal cells. Nat Protoc 7, 118-127.
  • Direct cell reprogramming is a stochastic process amenable to acceleration. Nature 462, 595-601.
  • Dppa2 and Dppa4 are closely linked SAP motif genes restricted to pluripotent cells and the germ line. Stem Cells 25, 19-28.
  • Nanog is the gateway to the pluripotent ground state.
  • Singhal, N., Graumann, J., Wu, G., Arauzo-Bravo, M.J., Han, D.W., Greber, B., Gentile, L., Mann, M., and Scholer, H.R. (2010). Chromatin-Remodeling Components of the BAF Complex Facilitate Reprogramming. Cell 141, 943-955.
  • iPS cells produce viable mice through tetraploid complementation. Nature 461, 86-90.
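
The Ct normalization described in the bullets above (the average Ct of a gene's replicates is normalized to the average Ct of the control genes Hprt and Gapdh in the same cell, and cells with low or absent control expression are removed) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name, the sign convention (reference Ct minus gene Ct, so larger values mean higher expression), and the `max_ref_ct` cutoff are assumptions that do not appear in the source.

```python
from statistics import mean

def normalize_cell(ct_by_gene, controls=("Hprt", "Gapdh"), max_ref_ct=30.0):
    """Normalize one cell's Ct values to its control genes.

    ct_by_gene maps gene name -> list of replicate Ct values.
    max_ref_ct is a hypothetical cutoff (not taken from the source):
    a cell whose average control Ct exceeds it is dropped (None).
    """
    # Average the replicates of each control gene, then average the controls.
    ref_ct = mean(mean(ct_by_gene[g]) for g in controls)
    if ref_ct > max_ref_ct:
        return None  # low or absent control-gene expression
    # Assumed convention: reference minus gene, so higher = more transcript.
    return {
        gene: ref_ct - mean(cts)
        for gene, cts in ct_by_gene.items()
        if gene not in controls
    }

cell = {"Hprt": [20.0, 20.0], "Gapdh": [18.0, 18.0], "Nanog": [25.0, 27.0]}
normalized = normalize_cell(cell)
```

Here the reference Ct is 19 and the average Nanog Ct is 26, so the normalized Nanog value is -7 under the assumed sign convention.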

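The Jensen-Shannon divergence (JSD) analysis listed above, in which gene expression probability vectors from each group are resampled with replacement to derive a 95% confidence interval, might look like the sketch below. It is written under stated assumptions: the excerpt does not give the exact JSD formulation, so the equally weighted generalized form (entropy of the mean vector minus the mean entropy) is used, each input vector is assumed to already sum to 1, and the names `jsd` and `bootstrap_ci` and the percentile-style interval are illustrative choices.

```python
import math
import random

def entropy(p):
    """Shannon entropy in bits; zero entries contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def jsd(vectors):
    """Generalized JSD of equally weighted probability vectors."""
    n = len(vectors)
    mixture = [sum(column) / n for column in zip(*vectors)]
    return entropy(mixture) - sum(entropy(v) for v in vectors) / n

def bootstrap_ci(vectors, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the JSD of one group of cells."""
    rng = random.Random(seed)
    stats = sorted(
        jsd([rng.choice(vectors) for _ in vectors])  # resample with replacement
        for _ in range(n_boot)
    )
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

group = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
low, high = bootstrap_ci(group)
```

A group of identical cells has a JSD of 0, while two cells expressing disjoint genes reach the two-vector maximum of 1 bit, so a drop in JSD between groups corresponds to reduced cell-to-cell variation.
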
Abstract

Disclosed herein are novel methods and compositions for reprogramming mammalian cells. Certain methods and compositions of the invention are of use to enhance generation of induced pluripotent stem cells by reprogramming somatic cells. Certain methods and compositions of the invention are of use to identify cells destined to become iPSCs. Certain compositions and methods of the invention are of use to enhance reprogramming of pluripotent mammalian cells to a differentiated cell type. Certain compositions and methods of the invention are of use to enhance reprogramming of differentiated mammalian cells of a first cell type to differentiated mammalian cells of a second differentiated cell type. The reprogrammed somatic cells are useful for a number of purposes, including treating or preventing a medical condition in an individual. The invention further provides methods for identifying an agent that enhances or contributes to reprogramming mammalian cells.

Description

PROGRAMMING AND REPROGRAMMING OF CELLS
CROSS-REFERENCE TO RELATED APPLICATION
This application claims the benefit of U.S. Provisional Application No. 61/636,441, filed April 20, 2012, U.S. Provisional Application No. 61/700,781, filed September 13, 2012 and U.S. Provisional Application No. 61/798,423, filed March 15, 2013. The entire teachings of the above applications are incorporated herein by reference.
GOVERNMENT FUNDING
This invention was made with government support under HD 045022, R37CA084198, 1 F32 GM099153-01A1, and ARR-RC1-CA144872 awarded by the National Institutes of Health. The government has certain rights in the invention.
BACKGROUND OF THE INVENTION
Stem cells are cells that are capable of self-renewal and of giving rise to more differentiated cells. Embryonic stem (ES) cells, for example, which can be derived from the inner cell mass of a normal embryo in the blastocyst stage, can differentiate into the multiple specialized cell types that collectively comprise the body (See, e.g., U.S. Pat. Nos. 5,843,780 and 6,200,806, Thompson, J. A. et al. Science, 282:1145-7, 1998). As cells differentiate they undergo a progressive loss of developmental potential that has generally been considered largely irreversible. Somatic cell nuclear transfer (SCNT) experiments, however, showed that nuclei from differentiated adult cells could be reprogrammed to a totipotent state by factors present in the oocyte cytoplasm.
Mammalian cells with the property of pluripotency hold great clinical promise for applications in regenerative medicine such as cell/tissue replacement therapies for disease. However, SCNT and conventional methods of obtaining ES cells suffer from a number of limitations that hamper their use in regenerative medicine applications, and alternatives have been avidly sought. Examples can be found in the scientific literature in which differentiated cells of a particular type have been converted into cells of a different type without apparently being reverted to a fully pluripotent state as an intermediate step. For example, dermal fibroblasts can be converted into muscle-like cells by forced expression of MyoD. However, such examples do not provide a general approach to generating large numbers of patient-specific cells of numerous diverse types.
ES-like cells have been produced by introducing genes encoding four transcription factors associated with pluripotency, i.e., Oct3/4, Sox2, c-Myc and Klf4, into mouse skin fibroblasts via retroviral infection, and then selecting cells that expressed a marker of pluripotency, Fbx15, in response to these factors (Takahashi, K. & Yamanaka, S. Cell 126, 663-676, 2006). However, the resulting cells differed from ES cells in their gene expression and DNA methylation patterns and when injected into normal mouse blastocysts did not result in live chimeras. Stable reprogrammed cell lines have been derived that, based on reported transcriptional, imprinting, and chromatin-modification profiles, appeared similar to ES cells (Okita, K., et al. Nature 448, 313-317, 2007; Wernig, M. et al. Nature 448, 318-324, 2007; Maherali, N. et al. Cell Stem Cell 1, 55-70, 2007). Human somatic cells have also been reprogrammed to pluripotency using these factors.
There exists a need in the art for alternative and improved methods for reprogramming mammalian cells.
SUMMARY OF THE INVENTION
The present invention provides novel methods and compositions for reprogramming mammalian cells. Certain methods and compositions of the invention are of use to enhance generation of induced pluripotent stem cells by reprogramming somatic cells. Certain methods and compositions of the invention are of use to identify cells destined to become iPSCs. Certain compositions and methods of the invention are of use to enhance reprogramming of pluripotent mammalian cells to a differentiated cell type. Certain compositions and methods of the invention are of use to enhance reprogramming of differentiated mammalian cells of a first cell type to differentiated mammalian cells of a second differentiated cell type. The reprogrammed somatic cells are useful for a number of purposes, including treating or preventing a medical condition in an individual. The invention further provides methods for identifying an agent that enhances or contributes to reprogramming mammalian cells.
In some aspects, methods of generating a reprogrammed cell are disclosed, such methods comprising: (a) introducing reprogramming factors Sall4, Nanog, Esrrb, and Lin28 into a mammalian somatic cell; and (b) culturing said cell in a suitable medium under conditions appropriate for and for a time period sufficient to give rise to a reprogrammed cell. In some aspects, methods of generating a reprogrammed cell are disclosed, such methods comprising: (a) introducing reprogramming factors Sall4, Dppa2, Esrrb, and Lin28 into a mammalian somatic cell; and (b) culturing said cell in a suitable medium under conditions appropriate for and for a time period sufficient to give rise to a reprogrammed cell. In some aspects, methods of generating a reprogrammed cell are disclosed, such methods comprising: (a) introducing reprogramming factors Sall4, Nanog, Esrrb, and any one or more of Ezh2, Kdm1, and Utf1 into a mammalian somatic cell; and (b) culturing said cell in a suitable medium under conditions appropriate for and for a time period sufficient to give rise to a reprogrammed cell. In some aspects, methods of generating a reprogrammed cell are disclosed, such methods comprising: (a) introducing reprogramming factors Sall4, Dppa2, Esrrb, and any one or more of Ezh2, Kdm1, and Utf1 into a mammalian somatic cell; and (b) culturing said cell in a suitable medium under conditions appropriate for and for a time period sufficient to give rise to a reprogrammed cell. In some aspects, methods of generating a reprogrammed cell are disclosed, such methods comprising: (a) introducing reprogramming factors Sall4, Nanog and/or Dppa2, and Esrrb into a mammalian somatic cell; and (b) culturing said cell in a suitable medium under conditions appropriate for and for a time period sufficient to give rise to a reprogrammed cell.
In some embodiments said reprogramming factors are introduced into said somatic cell in the form of one or more nucleic acid sequences encoding the reprogramming factors. In some embodiments said one or more nucleic acid sequences comprise DNA. In some embodiments said one or more nucleic acid sequences comprise RNA. In some embodiments said one or more nucleic acid sequences comprises a nucleic acid construct.
In some embodiments said one or more nucleic acid sequences comprises a vector. In some embodiments said vector comprises an inducible vector. In some embodiments said inducible vector activates expression of said reprogramming factors in the presence of doxycycline (dox) in said medium. In some embodiments said vector integrates into a genome of said somatic cell. In some embodiments said vector comprises a viral vector. In some embodiments said vector comprises a retroviral vector. In some embodiments said vector comprises a lentiviral vector. In some embodiments said vector comprises an excisable vector. In some embodiments said excisable vector comprises a transposon, wherein said excisable vector is excisable from said genome by transient expression of a transposase. In some embodiments said transposon comprises a piggyBac transposon. In some embodiments said excisable vector comprises one or more loxP sites incorporated into said vector, wherein said vector can be excised from said genome by transient expression of a Cre recombinase. In some embodiments said excisable vector comprises a floxed lentiviral vector. In some embodiments said vector does not integrate into the genome of said somatic cell. In some embodiments said vector comprises an adenoviral vector. In some embodiments said vector comprises a Sendai viral vector. In some embodiments said vector comprises a plasmid. In some embodiments said vector comprises an episome.
In some embodiments said RNA comprises mRNA. In some embodiments said mRNA is translatable in vitro in said mammalian somatic cell. In some embodiments said mRNA is in vitro transcribed mRNA. In some embodiments said in vitro transcribed mRNA comprises a sequence encoding SV40 large T (LT). In some embodiments said in vitro transcribed mRNA comprises one or more modifications that increase stability or translatability of said mRNA. In some embodiments said in vitro transcribed mRNA comprises a 5' cap. In some embodiments said in vitro transcribed mRNA comprises an open reading frame flanked by a 5' untranslated region and a 3' untranslated region that enhance translation of said open reading frame. In some embodiments said 5' untranslated region comprises a strong Kozak translation initiation signal. In some embodiments said 3' untranslated region comprises an alpha-globin 3' untranslated region. In some embodiments said in vitro transcribed mRNA comprises a polyA tail. In some embodiments said in vitro transcribed mRNA is introduced into said somatic cell via electroporation. In some embodiments said in vitro transcribed mRNA is introduced into said somatic cell complexed with a cationic vehicle that facilitates uptake of said mRNA into said somatic cell via endocytosis. In some embodiments said in vitro transcribed mRNA is introduced into said somatic cell in an amount and for a period of time sufficient to maintain expression of the reprogramming factors until cellular reprogramming of said somatic cell occurs. In some embodiments said in vitro transcribed mRNA is treated with a phosphatase to reduce a cytotoxic response by said somatic cell upon introduction of said mRNA into said somatic cell. In some embodiments said in vitro transcribed mRNA comprises one or more base substitutions.
In some embodiments said base substitutions are selected from the group consisting of 5-methylcytidine (5mC), pseudouridine (psi), 5-methyluridine, 2'-O-methyluridine, 2-thiouridine, and N6-methyladenosine.
In some embodiments said reprogramming factors are introduced into said somatic cell in the form of one or more proteins or functional variants or fragments thereof. In some embodiments said one or more proteins comprise a recombinant protein. In some embodiments said one or more proteins comprise a fusion protein. In some embodiments said one or more proteins further comprise a cell-penetrating peptide. In some embodiments said cell-penetrating peptide is fused to a C terminus of said one or more proteins. In some embodiments said cell-penetrating peptide comprises HIV tat. In some embodiments said cell-penetrating peptide comprises poly-arginine. In some embodiments said one or more proteins is introduced into said somatic cell in an amount and for a period of time sufficient for reprogramming of said somatic cell to occur.
In some embodiments, such method further comprises (c) supplementing said medium with one or more agents that increase reprogramming efficiency. In some embodiments said one or more agents are selected from the group consisting of a nucleic acid, an antisense oligonucleotide, siRNA, miRNA, an antibody or a fragment thereof. In some embodiments said one or more agents comprise a histone deacetylase inhibitor. In some embodiments said histone deacetylase inhibitor comprises valproic acid (VPA). In some embodiments said histone deacetylase inhibitor comprises butyrate. In some embodiments said one or more agents comprise an interferon inhibitor. In some embodiments said interferon inhibitor comprises a recombinant B18R protein. In some embodiments said one or more agents comprise a signaling pathway modulator selected from the group consisting of a TGF-beta pathway inhibitor, a MAPK/ERK pathway inhibitor, a GSK3 pathway inhibitor, a WNT pathway activator, a 3'-phosphoinositide-dependent kinase-1 (PDK1) pathway activator, a mitochondrial oxidation modulator, a glycolytic metabolism modulator, a HIF pathway activator, and combinations thereof.
In some embodiments, such method further comprises (c) monitoring said culture for cells which display one or more markers of pluripotency. In some embodiments said one or more markers of pluripotency is selected from the group consisting of Fbxo15, Nanog, Oct4, Sox2, Sall4 and combinations thereof. In some embodiments said one or more markers of pluripotency comprise early markers of pluripotency selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2.
In some embodiments such method further comprises (c) or (d) isolating said reprogrammed cell from said culture.
In some embodiments said somatic cell is a terminally differentiated somatic cell.
In some aspects, mammalian cells are disclosed, such mammalian cells comprising said isolated reprogrammed cell. In some embodiments said isolated reprogrammed mammalian cell is a human cell. In some embodiments said isolated reprogrammed mammalian cell is a non-human mammal cell. In some embodiments said isolated reprogrammed mammalian cell further comprises a reporter gene integrated at a locus whose activation serves as a marker of reprogramming to pluripotency. In some embodiments the locus is selected from Nanog, Sox2, and Oct4. In some embodiments said isolated reprogrammed mammalian cell is an iPS cell.
In some aspects, a chimeric mouse is disclosed, such chimeric mouse generated at least in part from said isolated reprogrammed mammalian iPS cell. In some embodiments said mouse is generated by injecting said mammalian iPS cell into a mouse blastocyst and allowing said blastocyst to develop into a mouse in vivo. In some aspects, a cell is disclosed, such cell comprising a cell obtained from said mouse, wherein said cell is derived from said iPS cell.
In some aspects, a non-human mammal is disclosed, such non-human mammal comprising a non-human mammal generated at least in part from said mammalian iPS cell. In some embodiments said non-human mammal is a mouse.
In some aspects, methods of producing a non-human mammal are disclosed, such methods comprising introducing said mammalian iPS cell into tetraploid blastocysts of the same mammalian species under conditions that result in production of an embryo, wherein said resulting embryo is transferred into a foster mother which is maintained under conditions that result in development of live offspring. In some embodiments said non-human mammal is a mouse. In some embodiments said iPS cells are introduced into said tetraploid blastocysts by injection. In some embodiments said injection is a microinjection.
In some aspects, a non-human mammal is disclosed, such non-human mammal comprising a non-human mammal produced according to said method of producing a non-human mammal.
In some aspects, a mouse is disclosed, such mouse comprising a mouse produced according to said method of producing a non-human mammal.
In some aspects, methods of producing a non-human mammalian embryo are disclosed, such methods comprising injecting non-human iPS cells generated according to a reprogramming method of the present invention into non-human tetraploid blastocysts and maintaining said resulting tetraploid blastocysts under conditions that result in formation of embryos, thereby producing a non-human mammalian embryo. In some embodiments said non-human iPS cells are mouse cells and said non-human mammalian embryo is a mouse. In some embodiments mutant mouse iPS cells are injected into said non-human tetraploid blastocysts by microinjection.
In some aspects, a non-human mammalian embryo produced according to said method of producing a non-human mammalian embryo is disclosed. In some embodiments said non-human mammalian embryo is a mouse embryo. In some embodiments, said somatic cells are differentiated cells of a first cell type, and said reprogramming reprograms said somatic cells to a second differentiated cell type.
In some aspects, a disclosed method comprises (a) reprogramming somatic cells to a pluripotent state according to a method of generating a reprogrammed cell of the present invention; and (b) reprogramming said pluripotent cells to a desired, differentiated cell type, wherein said differentiated cell type optionally comprises an adult stem cell or a fully differentiated cell.
In some aspects, compositions are disclosed, such compositions comprising multiple isolated reprogrammed mammalian iPS cells.
In some aspects, methods of treating a patient in need of such treatment are disclosed, such methods comprising administering to the patient a composition comprising multiple isolated reprogrammed mammalian iPS cells.
In some aspects, methods of treating an individual in need of such treatment are disclosed, such methods comprising: (a) obtaining somatic cells from said individual; (b) reprogramming said somatic cells obtained from said individual according to a method of generating reprogrammed cells of the present invention; and (c) administering at least some of said reprogrammed cells to said individual. In some embodiments the method further comprises separating cells that are reprogrammed to a desired state from cells that are not reprogrammed to a desired state and/or wherein at least some of said reprogrammed cells are differentiated to a selected cell type prior to administration to said individual. In some embodiments, the method further comprises separating reprogrammed cells that have differentiated to a desired cell type from cells that have not differentiated to a desired cell type prior to administering the cells. In some embodiments a method comprises eliminating residual pluripotent cells ex vivo prior to administration. In some embodiments said individual is a human.
In some aspects, compositions for identifying a reprogramming agent are disclosed, such compositions comprising one or more cells that expresses a subset of reprogramming factors selected from the group consisting of Sall4, Nanog, Esrrb and Lin28, and a test agent. In some embodiments said subset of reprogramming factors consists of at least three of said reprogramming factors. In some embodiments such composition further comprises an agent that induces expression of said subset of reprogramming factors.
In some aspects, methods of identifying a reprogramming agent are disclosed, such methods comprising: (a) maintaining a composition comprising one or more cells that expresses a subset of reprogramming factors selected from the group consisting of Sall4, Nanog, Esrrb and Lin28 and a test agent for a time period under conditions in which said reprogramming factors are expressed and cell proliferation occurs; and (b) assessing the extent to which cells become reprogrammed, wherein the test agent is identified as a reprogramming agent if reprogramming occurs at a similar frequency as would be the case if said composition contained all of said reprogramming factors and had lacked said test agent.
In some aspects, methods of identifying a reprogramming agent are disclosed, such methods comprising: (a) maintaining a composition comprising one or more cells that expresses a subset of reprogramming factors selected from the group consisting of Sall4, Nanog, Esrrb and Lin28 and a test agent for a time period under conditions in which the reprogramming factors are expressed and cell proliferation occurs; and (b) assessing the extent to which cells become reprogrammed, wherein said test agent is identified as a reprogramming agent or enhancer of reprogramming if reprogramming occurs at a significantly greater frequency than would be the case had said composition lacked said test agent. In some embodiments said composition is maintained for at least X days. In some embodiments said test agent is present for at least X days. In some embodiments said test agent is identified as a reprogramming agent if cells do not become reprogrammed at a detectable frequency if maintained for said time period in the absence of said test agent but do become reprogrammed at a detectable frequency if maintained in the presence of said test agent for at least a portion of said time period. In some embodiments said test agent is identified as an enhancer of reprogramming if cells become reprogrammed at a detectable frequency if maintained for said time period in the absence of said test agent and become reprogrammed at a significantly greater frequency if maintained in the presence of said test agent for at least a portion of said time period.
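The decision rule in the screening methods above can be sketched in code. The following is a minimal illustration only, not a procedure taken from the specification: it assumes that "detectable frequency" means at least one reprogrammed cell was counted, that "significantly greater" is judged with a one-sided Fisher exact test, and that the 0.05 cutoff, the function names, and the example counts are all hypothetical.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows: with agent / without agent. Columns: reprogrammed / not reprogrammed.
    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    row1 = a + b
    col1 = a + c
    n = a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / comb(n, col1)


def classify_test_agent(reprog_with, total_with, reprog_without, total_without, alpha=0.05):
    """Label a test agent from reprogrammed-cell counts with and without the agent.

    Assumptions (not from the source): a count above zero is 'detectable',
    and alpha is an arbitrary significance cutoff.
    """
    if reprog_with == 0:
        return "no effect detected"
    if reprog_without == 0:
        # Undetectable without the agent, detectable with it.
        return "reprogramming agent"
    p = fisher_one_sided(
        reprog_with, total_with - reprog_with,
        reprog_without, total_without - reprog_without,
    )
    return "enhancer of reprogramming" if p < alpha else "no effect detected"


if __name__ == "__main__":
    print(classify_test_agent(12, 1000, 0, 1000))
    print(classify_test_agent(50, 1000, 5, 1000))
    print(classify_test_agent(5, 1000, 5, 1000))
```

In practice the statistical test, the replicate structure, and the definition of a reprogrammed cell (for example, a marker-positive colony) would be chosen by the experimenter; the sketch only shows how the two outcomes named in the text differ.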
In some aspects, a nucleic acid construct is disclosed, such nucleic acid construct comprising at least four coding regions linked to each other by nucleic acids that encode a self-cleaving peptide so as to form a single open reading frame, wherein said coding regions encode reprogramming factors Sall4, Nanog, Esrrb, and Lin28, and wherein said reprogramming factors are capable, either alone or in combination with one or more additional reprogramming factors, of reprogramming a mammalian somatic cell to pluripotency. In some embodiments one of the four coding regions encodes Dppa2 instead of Nanog and/or one of the four coding regions encodes Kdm1, Utf1, or Ezh2 instead of Lin28 or is absent.
In some embodiments said nucleic acid construct further comprises a fifth coding region that encodes a fifth reprogramming factor, wherein said five coding regions are linked to each other by nucleic acids that encode self-cleaving peptides so as to form a single open reading frame. In some embodiments said fifth reprogramming factor is c-Myc.
In some embodiments said nucleic acid construct further comprises fifth and sixth genes that encode fifth and sixth reprogramming factors, wherein said six coding regions are linked to each other by nucleic acids that encode self-cleaving peptides so as to form a single open reading frame. In some embodiments said fifth reprogramming factor is c-Myc and said sixth reprogramming factor is Klf4.
In some embodiments said self-cleaving peptide is a viral 2A peptide. In some embodiments said self-cleaving peptide is an aphthovirus 2A peptide.
In some embodiments said construct does not encode Oct4. In some embodiments said construct does not encode Klf4. In some embodiments said construct does not encode Sox2. In some embodiments said construct does not encode c-Myc.
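The arrangement of coding regions and self-cleaving peptide linkers described above can be illustrated symbolically. The sketch below is an assumption-laden illustration, not the construct design of the specification: it manipulates element names only (no sequences), the order of factors is arbitrary, and the function names and the "2A" label are hypothetical.

```python
def build_polycistronic_layout(factors, linker="2A"):
    """Return the ordered elements of a single open reading frame in which
    each coding region is joined to the next by a self-cleaving peptide linker.

    Works on names only; no nucleotide or peptide sequences are implied.
    """
    if len(factors) < 2:
        raise ValueError("at least two coding regions are needed to place a linker")
    if len(set(factors)) != len(factors):
        raise ValueError("each coding region should appear once")
    layout = []
    for index, factor in enumerate(factors):
        if index > 0:
            layout.append(linker)  # one linker between adjacent coding regions
        layout.append(factor)
    return layout


def encodes_none_of(layout, excluded):
    """True if the layout contains none of the excluded factor names."""
    return not set(layout) & set(excluded)


if __name__ == "__main__":
    four = build_polycistronic_layout(["Sall4", "Nanog", "Esrrb", "Lin28"])
    print("-".join(four))
    # Embodiment in which the construct does not encode Oct4, Klf4, Sox2, or c-Myc.
    print(encodes_none_of(four, {"Oct4", "Klf4", "Sox2", "c-Myc"}))
```

A construct with n coding regions therefore carries n - 1 linkers, which is the relationship the four-, five-, and six-factor embodiments share.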
In some aspects, expression cassettes are disclosed, such expression cassettes comprising said nucleic acid construct operably linked to a promoter, wherein said promoter drives transcription of a polycistronic message that encodes said reprogramming factors, each reprogramming factor being linked to at least one other reprogramming factor by a self-cleaving peptide.
In some embodiments said expression cassette further comprises one or more sites that mediate integration into a genome of a mammalian cell. In some embodiments said expression cassette is integrated into said genome at a locus whose disruption has minimal or no effect on said cell. In some aspects, expression vectors comprising said expression cassette are disclosed. In some embodiments said vector is retroviral. In some embodiments said promoter is inducible.
In some aspects, reprogramming compositions are disclosed, such compositions comprising at least two, three, or four reprogramming factors selected from the group consisting of Sall4 protein, Nanog protein, Esrrb protein, and Lin28 protein, or functional variants or fragments thereof or nucleic acids encoding any of the foregoing. In some aspects, reprogramming compositions are disclosed, such compositions comprising at least two, three, or four reprogramming factors selected from the group consisting of Sall4 protein, Dppa2 protein, Esrrb protein, and Lin28 protein, or functional variants or fragments thereof or nucleic acids encoding any of the foregoing. In some aspects, reprogramming compositions are disclosed, such compositions comprising at least two, three, or four reprogramming factors selected from the group consisting of Sall4 protein, Nanog protein, Esrrb protein, and any of Kdm1, Utf1, or Ezh2 protein, or functional variants or fragments thereof or nucleic acids encoding any of the foregoing. In some aspects, reprogramming compositions are disclosed, such compositions comprising at least two, three, or four reprogramming factors selected from the group consisting of Sall4 protein, Dppa2 protein, Esrrb protein, and any of Kdm1, Utf1, or Ezh2 protein, or functional variants or fragments thereof or nucleic acids encoding any of the foregoing.
In some embodiments each of said reprogramming factors comprises a cell-penetrating peptide fused to its C terminus. In some embodiments said cell-penetrating peptide comprises poly-arginine.
In some aspects, the invention provides methods of producing a pluripotent cell from a somatic cell, such methods comprising the steps of: (a) introducing exogenous reprogramming factors Sall4, Nanog, Esrrb, and Lin28 into one or more somatic cells; (b) maintaining said one or more cells under conditions appropriate for and for a period of time sufficient for said exogenous reprogramming factors to activate at least one endogenous pluripotency gene; (c) selecting one or more cells which display an early marker of pluripotency; (d) generating a colony or an embryo utilizing said one or more cells which display the early marker of pluripotency; (e) obtaining one or more somatic cells from said colony or embryo; (f) maintaining said one or more somatic cells under conditions appropriate for and for a period of time sufficient for said exogenous reprogramming factors to activate at least one endogenous pluripotency gene; and (g) differentiating between cells which display one or more markers of pluripotency and cells which do not. In some embodiments Nanog is replaced by Dppa2 and/or Lin28 is replaced by Kdm1, Utf1, or Ezh2, or omitted.
In some embodiments said early marker of pluripotency is selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2. In some embodiments said early marker of pluripotency is a group of early markers of pluripotency consisting of Esrrb, Utf1, Lin28, and Dppa2. In some embodiments step (d) comprises selecting one or more cells which display an early marker of pluripotency and at least one marker of pluripotency.
In some aspects, the present invention provides isolated pluripotent cells produced by a method comprising: (a) introducing exogenous reprogramming factors Sall4, Nanog, Esrrb, and Lin28 into one or more somatic cells; (b) maintaining said one or more cells under conditions appropriate for and for a period of time sufficient for said exogenous reprogramming factors to activate at least one endogenous pluripotency gene; (c) selecting one or more cells which display an early marker of pluripotency; (d) generating a colony or an embryo utilizing said one or more cells which display the early marker of pluripotency; (e) obtaining one or more differentiated somatic cells from said colony or embryo; (f) maintaining said one or more differentiated somatic cells under conditions appropriate for and for a period of time sufficient for said reprogramming factors to activate at least one endogenous pluripotency gene; and (g) differentiating between cells which display one or more markers of pluripotency and cells which do not. In some embodiments Nanog is replaced by Dppa2 and/or Lin28 is replaced by Kdm1, Utf1, or Ezh2, or omitted.
In some embodiments said early marker of pluripotency is selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2. In some embodiments said early marker of pluripotency is a group of early pluripotency markers consisting of Esrrb, Utf1, Lin28, and Dppa2. In some embodiments step (d) comprises selecting one or more cells which display an early marker of pluripotency and at least one marker of pluripotency.
In some aspects, methods of selecting a somatic cell that is likely to be reprogrammed to a pluripotent state are disclosed, such methods comprising (a) measuring expression of one or more early markers of pluripotency in a population of a plurality of somatic cells; (b) sorting the population of the plurality of somatic cells into a plurality of populations of single somatic cells; and (c) measuring expression of the one or more early markers of pluripotency in each population of single somatic cells, wherein increased expression of the one or more early markers of pluripotency in each population of single somatic cells as compared to expression of the one or more early markers of pluripotency in the population of the plurality of somatic cells indicates that the single somatic cell is a somatic cell that is likely to be reprogrammed to the pluripotent state.
In some embodiments said one or more early markers of pluripotency are selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2.
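The comparison between single-cell and bulk-population marker expression described above can be sketched as follows. This is a hedged illustration rather than the method of the specification: the expression values are invented, "increased expression" is assumed to mean a single-cell value strictly above the bulk value for that marker, and the function name and the `min_markers` parameter are hypothetical.

```python
EARLY_MARKERS = ("Esrrb", "Utf1", "Lin28", "Dppa2")


def likely_to_reprogram(single_cells, population, markers=EARLY_MARKERS, min_markers=1):
    """Return IDs of sorted single cells whose early-marker expression exceeds
    the expression measured in the unsorted population.

    single_cells: {cell_id: {marker: expression}}
    population:   {marker: expression} measured on the bulk population
    min_markers:  how many markers must be elevated (an assumed parameter)
    """
    selected = []
    for cell_id, expression in single_cells.items():
        elevated = sum(
            1 for marker in markers
            if expression.get(marker, 0.0) > population.get(marker, 0.0)
        )
        if elevated >= min_markers:
            selected.append(cell_id)
    return selected


if __name__ == "__main__":
    # Invented example values in arbitrary units.
    bulk = {"Esrrb": 1.0, "Utf1": 0.5, "Lin28": 0.8, "Dppa2": 0.2}
    cells = {
        "cell_01": {"Esrrb": 3.2, "Utf1": 1.1, "Lin28": 0.9, "Dppa2": 0.6},
        "cell_02": {"Esrrb": 0.1, "Utf1": 0.0, "Lin28": 0.2, "Dppa2": 0.0},
        "cell_03": {"Esrrb": 0.4, "Utf1": 0.9, "Lin28": 0.1, "Dppa2": 0.1},
    }
    print(likely_to_reprogram(cells, bulk))
    print(likely_to_reprogram(cells, bulk, min_markers=4))
```

Raising `min_markers` makes the selection stricter; how many markers to require, and how expression is normalized, are experimental choices the sketch leaves open.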
In some aspects, methods of selecting a cell that is likely to become programmed to a pluripotent state are disclosed, such methods comprising (a) maintaining a population of a plurality of differentiated somatic cells containing at least one exogenously introduced factor that contributes to reprogramming of said cells to a pluripotent state under conditions appropriate for proliferation and for reprogramming of said cells to occur; (b) sorting said population of said plurality of cells into a plurality of populations of single cells; and (c) isolating said sorted cells which display one or more early markers of pluripotency, wherein each sorted cell which displays said one or more early markers of pluripotency is a cell that is likely to become programmed to the pluripotent state.
In some embodiments said one or more early markers of pluripotency are selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2.
In some aspects, methods for increasing the efficiency of the expansion of induced pluripotent stem cells are disclosed, such methods comprising (a) maintaining a population of differentiated somatic cells that contains at least one exogenously introduced factor that contributes to reprogramming of said population of cells to a pluripotent state under conditions appropriate for proliferation and for reprogramming of said cells to occur; (b) monitoring each cell in said population of cells for the expression of one or more early pluripotency markers, wherein cells expressing the one or more early pluripotency markers are more likely to become programmed to a pluripotent state than cells which do not express the one or more early pluripotency markers; (c) isolating each cell in said population of cells that expresses the one or more early pluripotency markers; and (d) expanding only those cells which express the one or more early pluripotency markers, thereby increasing the efficiency of the expansion of induced pluripotent stem cells.
In some embodiments said one or more early pluripotency markers is selected from the group consisting of Esrrb, Utf1, Lin28, Dppa2, and combinations thereof. In some embodiments said monitoring of said cells is performed during a stochastic phase of reprogramming.
In some embodiments proliferation of said cell forms a clonal colony of said cell.
In some aspects, methods of increasing the likelihood that a differentiated somatic cell subjected to a reprogramming protocol will become reprogrammed to an iPSC are disclosed, such methods comprising introducing into the differentiated somatic cell one or more early pluripotency factors prior to subjecting the differentiated somatic cell to said reprogramming protocol. In some embodiments said one or more early pluripotency factors is selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2.
In some aspects, methods of isolating an iPS colony are disclosed, such methods comprising: (a) introducing exogenous reprogramming factors Sall4, Nanog, Esrrb, and Lin28 into a differentiated mammalian somatic cell; (b) culturing said differentiated somatic cell in a suitable medium under conditions appropriate for and for a time period sufficient for proliferation of and reprogramming of said cells to occur; and (c) isolating one or more colonies visible in said culture after said period of time. In some embodiments each of said exogenous reprogramming factors is introduced into said cell in the form of a recombinant protein comprising a cell-penetrating peptide fused to a C terminus of said recombinant protein. In some embodiments each of said exogenous reprogramming factors is introduced into said cell in the form of mRNA optionally complexed with a cationic vehicle, wherein said mRNA comprises in vitro transcribed mRNA comprising one or more of a 5' cap, an open reading frame flanked by a 5' untranslated region containing a strong Kozak translation initiation signal and an alpha-globin 3' untranslated region, a polyA tail, and one or more modifications which confer stability to the mRNA. In some embodiments such method further comprises: (d) growing said isolated one or more colonies on a layer of feeder cells in the absence of an inducer of said inducible transgenes. In some embodiments such method further comprises (e) passaging said one or more grown colonies at least once.
In some aspects, methods of enhancing isolation of iPSCs are disclosed, such methods comprising (d) sorting said one or more colonies visible in said culture after said period of time into single cells; (e) differentiating between said sorted cells which display one or more early markers of pluripotency and said sorted cells which do not display one or more early markers of pluripotency; and (f) isolating said sorted cells which display one or more early markers of pluripotency.
In some aspects, a mouse iPS cell characterized by an efficiency of said mouse iPS cell of generating live offspring by tetraploid complementation is disclosed, wherein said efficiency is at least 5%.
In some aspects, methods of producing a mouse iPS cell characterized by an efficiency of said mouse iPS cell of generating live offspring by tetraploid complementation of at least 5% are disclosed, such methods comprising: (a) transfecting mouse embryonic fibroblasts with a dox-inducible vector comprising reprogramming factors Sall4, Nanog, Esrrb and Lin28 operably linked to a tetracycline operator and a CMV promoter; (b) culturing said mouse embryonic fibroblasts under conditions suitable and for a time period sufficient for proliferation and reprogramming of said mouse embryonic fibroblasts to occur; (c) exposing said culture to an effective amount of doxycycline for a period of time sufficient for one or more iPS colonies to form; (d) isolating said one or more iPS colonies; (e) growing said isolated iPS colonies on feeder cells in the absence of doxycycline; and optionally (f) passaging said grown iPS colonies at least once.
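The 5% threshold recited above is a simple ratio, and the arithmetic can be sketched in code. The sketch assumes, as a labeled assumption not stated in this passage, that efficiency is computed as live offspring divided by the number of tetraploid blastocysts injected; the function names and example counts are hypothetical.

```python
def tetraploid_complementation_efficiency(live_offspring, blastocysts_injected):
    """Fraction of injected tetraploid blastocysts that yielded live offspring.

    Assumed definition: live offspring / blastocysts injected.
    """
    if blastocysts_injected <= 0:
        raise ValueError("at least one injected blastocyst is required")
    if live_offspring < 0:
        raise ValueError("live offspring count cannot be negative")
    return live_offspring / blastocysts_injected


def meets_threshold(live_offspring, blastocysts_injected, threshold=0.05):
    """True if the efficiency is at least the threshold (5% by default)."""
    return tetraploid_complementation_efficiency(live_offspring, blastocysts_injected) >= threshold


if __name__ == "__main__":
    # Invented counts for illustration only.
    print(tetraploid_complementation_efficiency(6, 100))
    print(meets_threshold(6, 100))
    print(meets_threshold(4, 100))
```

Under this assumed definition, 6 live pups from 100 injected blastocysts would satisfy the threshold and 4 from 100 would not.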
In some aspects, the present invention provides a collection of reprogramming factors capable of producing a mouse iPS cell having an efficiency of generating live offspring by tetraploid complementation of at least 5%, such collection comprising Sall4, Nanog, Esrrb, and Lin28. In some aspects, kits for generating a reprogrammed cell in vitro are disclosed, such kits comprising: (a) a set of reprogramming factors comprising Sall4, Nanog, Esrrb and Lin28, which are capable alone, or in combination with one or more additional reprogramming factors, of reprogramming mammalian somatic cells to a pluripotent state, wherein the kit optionally comprises (b) a medium suitable for culturing mammalian iPS cells and/or (c) a population of mammalian somatic cells, and wherein the reprogramming factors are optionally provided as one or more nucleic acids (e.g., one or more vectors) encoding said reprogramming factors. In some embodiments such kits further comprise (d) one or more reagents for an assay for detecting one or more markers of pluripotency. In some embodiments the one or more markers of pluripotency is an early marker of pluripotency selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2. In some embodiments such kits further comprise one or more of: (e) instructions for preparing the medium; (f) instructions for deriving or culturing pluripotent cells; (g) serum replacement; (h) albumin; (i) at least one protein or small molecule useful for deriving or culturing iPS cells, wherein the protein or small molecule activates or inhibits a signal transduction pathway; (j) a population of mammalian somatic cells; and (k) at least one reagent useful for characterizing pluripotent cells. In some embodiments at least some of the ingredients are dissolved in liquid. In some embodiments at least some of the ingredients are provided in dry form.
The practice of the present invention will typically employ, unless otherwise indicated, conventional techniques of cell biology, cell culture, molecular biology, transgenic biology, microbiology, recombinant nucleic acid (e.g., DNA) technology, immunology, and RNA interference (RNAi) which are within the skill of the art. Non- limiting descriptions of certain of these techniques are found in the following publications: Ausubel, F., et al., (eds.), Current Protocols in Molecular Biology, Current Protocols in Immunology, Current Protocols in Protein Science, and Current Protocols in Cell Biology, all John Wiley & Sons, N.Y., edition as of December 2008; Sambrook, Russell, and Sambrook, Molecular Cloning: A Laboratory Manual, 3rd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 2001 ; Harlow, E. and Lane, D., Antibodies - A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 1988; Freshney, R.I., "Culture of Animal Cells, A Manual of Basic Technique", 5th ed., John Wiley & Sons, Hoboken, NJ, 2005. Non-limiting information regarding therapeutic agents and human diseases is found in Goodman and Gilman's The Pharmacological Basis of Therapeutics, 1 1 th Ed., McGraw Hill, 2005, Katzung, B. (ed.) Basic and Clinical Pharmacology, McGraw-Hill/Appleton & Lange; 10th ed. (2006) or 1 1 th edition (July 2009). Non-limiting information regarding genes and genetic disorders is found in McKusick, V.A. : Mendelian Inheritance in Man. A Catalog of Human Genes and Genetic Disorders. Baltimore: Johns Hopkins University Press, 1998 (12th edition) or the more recent online database: Online Mendelian Inheritance in Man, OMIM™. McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University (Baltimore, MD) and National Center for Biotechnology Information, National Library of Medicine (Bethesda, MD), as of May 1 , 2010, World Wide Web URL:
http://www.ncbi.nlm.nih.gov/omim/ and in Online Mendelian Inheritance in Animals (OMIA), a database of genes, inherited disorders and traits in animal species (other than human and mouse), at http://omia.angis.org.au/contact.shtml. All patents, patent applications, and other publications (e.g., scientific articles, books, websites, and databases) mentioned herein are incorporated by reference in their entirety. In case of a conflict between the specification and any of the incorporated references, the specification (including any amendments thereof, which may be based on an incorporated reference), shall control. Standard art-accepted meanings of terms are used herein unless indicated otherwise. Standard abbreviations for various terms are used herein. The sequences of the reprogramming factors disclosed herein can be obtained readily, if desired, from public databases, such as those available from NCBI.
BRIEF DESCRIPTION OF THE DRAWINGS
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawings will be provided by the Office upon request and payment of the necessary fee.
Figures 1A-1F. Experimental scheme used to monitor transcriptional profiles of single cells at defined timepoints during the reprogramming process. (A) Scheme used for measuring single-cell gene expression with Fluidigm BioMark after the addition of dox at days 2, 4, and 6. (B) Representative images of Nanog-GFP2 (NGFP2) cells without dox and undergoing the reprogramming process after the addition of dox at days 2, 4, and 6. (C) Scheme of NGFP2/tdTomato secondary system used to measure single-cell gene expression of clonal dox-dependent (GFP-, GFP+) and independent (GFP+) cells. (D) Representative images and FACS analysis of dox-dependent and independent cells after the addition of dox at days 12, 32, and 61. (E) Timeline for reprogramming and analysis of clonal populations (15, 16, 20, 23, 34, 43, 44). Dox-dependent GFP- cells (colony number in black), GFP+ cells (colony number in green), and dox-independent GFP+ cells (colony number in green and starred) were sorted for single-cell profiling at indicated days after the addition of dox. Colony 44 contained a few cells with a low level of GFP that were sorted at day 61 and disappeared upon continual passaging and dox-withdrawal. Colony 15 was picked and analyzed in a different set of experiments at day 12 to compare the transcriptional profiles of early colonies across different experiments. (F) Chart summarizing timeline for reprogramming and analysis of clonal populations (15, 16, 20, 23, 34, 43, 44).
Figures 2A-2C. NGFP2-tdTomato system. Representative images of bright field, GFP, and tdTomato in (A) NGFP2-iPSCs-tdTomato and (B) NGFP2-MEFs-tdTomato after six days of dox exposure. (C) Flow cytometric analysis of GFP and tdTomato in NGFP2 cells of Colony 44 on dox for 61 days.
Figures 3A-3B. Fluidigm data. Representative (A) raw and (B) normalized Fluidigm data for NGFP2-MEFs, Colony 15-day 12 on dox, NGFP2-iPSCs. See Supplemental Methods for detailed explanation of normalization and data analysis.
Figures 4A-4D. Two defined reprogramming populations. (A) Principal component (PC) projections of individual cells, colored by their sample identification. The blue circle surrounds one population and the red circle surrounds another population. The orange dotted circle surrounds a third potential population. (B) PC projections of the 48 genes, showing the contribution of each gene to the first two PCs. The first PC can be interpreted as discriminating between cluster 1 and cluster 2; the second between pluripotency genes and cell cycle regulators. (C) Jensen Shannon Divergence analysis of within-group variability, colored by the same sample identification as in (A). (D) Jensen Shannon Divergence analysis of within-colony variability, colored by the same sample identification as in (A) and (C).
Figures 5A-5D. Established early markers are not sufficient to mark cells that will become iPSCs. mRNA expression levels of (A) Fbxo15, Fgf4 and Oct4, (B) Sall4 and (C) Esrrb, Utf1, Lin28, Dppa2 in populations noted in Figure 1 and legend of Figures 5A-5D (upper right) are shown in violin plots. Median values are indicated by red line, lower and upper quartiles by blue rectangle, and sample minima/maxima by black line. (D) Quantitative RT-PCR of Fbxo15, Fgf4, Oct4, Sall4, Esrrb, Utf1, Lin28, and Dppa2 expression in populations noted in legend (upper right), normalized to the Hprt housekeeping control gene. Error bars are presented as a mean ± standard deviation of two duplicate runs from a typical experiment. The two partially reprogrammed colonies (colonies 23 and 44) are marked by red.
Figures 6A-6C. Analysis of partially reprogrammed populations. (A) Representative images of Colonies 23 and 44 and flow cytometric analysis of tdTomato and GFP at day 81. Colony 23 failed to activate GFP in the majority of cells upon continual passaging to day 81 (0.01% tdTomato+/GFP+). Colony 44 contained a few cells with a low level of GFP that disappeared upon continual passaging and dox-withdrawal. (B) Representative images of stable dox-independent GFP+ colonies after 30 days of treatment with AZA. (C) Flow cytometric analysis of GFP in Colony 23 (2.2% GFP+) and Colony 44 (0.5% GFP+) after 30 days of treatment with AZA.
Figure 7. Gene expression of early candidate markers in partially reprogrammed populations. Quantitative RT-PCR of Fbxo15, Fgf4, Oct4 endogenous, Sall4, Esrrb, Utf1, Lin28, and Dppa2 expression in MEFs, NGFP2 iPSCs, Colony 23, and Colony 44, normalized to the Hprt housekeeping control gene. Error bars are presented as a mean ± standard deviation of two duplicate runs from a typical experiment. Samples are numbered according to legend in Figures 5A-5D.
Figures 8A-8D. Early potential markers for the reprogramming process. (A) Single molecule mRNA FISH of Utf1 (red), Esrrb (purple), Sall4 (blue) expression in NGFP2 cells at day 6 on dox. Each cell is represented as a single dot. Representative images are shown below each plot. (B) Quantitative RT-PCR of Utf1, Esrrb, and Sall4 expression in NGFP2 MEFs following shRNA knockdown at day 16 post dox addition. Two hairpins were used for each gene and expression levels were normalized for Hprt. (C) Alkaline phosphatase immunostaining of NGFP2 MEFs after 16 days of shRNA knockdown. (D) Flow cytometric analysis of GFP in NGFP2 MEFs upon shRNA knockdown. GFP+ cells are gated.
Figures 9A-9D. Model to predict the order of transcriptional events in single cells. (A) Bayesian network to describe the hierarchy of transcriptional events among a subset of pluripotent genes. (B) Bar plot of fraction of cells with transcripts, quantified by single molecule mRNA FISH, of Sox2, Sall4, Fgf4 (single positive, purple), Sox2/Sall4, Sall4/Fgf4, Sox2/Fgf4 (double positive, brown), and Sox2/Sall4/Fgf4 (triple positive, blue) expression in NGFP2 cells at day 12 on dox. The number of cells in each category is indicated on top of each bar. (C) Bar plot of fraction of cells with transcripts, quantified by single molecule mRNA FISH, of Sox2, Lin28, Dnmt3b (single positive, purple), Sox2/Lin28, Lin28/Dnmt3b, Sox2/Dnmt3b (double positive, brown), and Sox2/Lin28/Dnmt3b (triple positive, blue) expression in NGFP2 cells at day 12 on dox. The number of cells in each category is indicated on top of each bar. (D) Bar plot of fraction of cells with transcripts, quantified by single molecule mRNA FISH, of Sox2, Sall4, Fbxo15 (single positive, purple), Sox2/Sall4, Sall4/Fbxo15, Sox2/Fbxo15 (double positive, brown), and Sox2/Sall4/Fbxo15 (triple positive, blue) expression in NGFP2 cells at day 12 on dox. The number of cells in each category is indicated on top of each bar.
Figures 10A-10F. Late candidate markers. (A and D) mRNA expression levels of Gdf3 and Sox2 in populations noted in Figure 1 and legend of Figures 10A-10F (right) are shown in violin plots. Median values are indicated by red line, lower and upper quartiles by blue rectangle, and sample minima/maxima by black line. (C and F) Quantitative RT-PCR of Gdf3 and Sox2 expression in MEFs, NGFP2 iPSCs, Colony 23, and Colony 44, normalized to the Hprt housekeeping control gene. Error bars are presented as a mean ± standard deviation of two duplicate runs from a typical experiment. Samples are numbered according to legend in Figures 5A-5D and to the (right). (B and E) Single molecule mRNA FISH of Gdf3 (green) and Sox2 (red) expression in NGFP2 cells at day 6 on dox. Each cell is represented as a single dot.
Figures 11A-11I. Cellular reprogramming with factors derived from Bayesian network. (A) Flow cytometric analysis of GFP in Oct4-GFP cells reprogrammed with Oct4, Esrrb, Nanog, Klf4, c-Myc, 25 days on dox, 5 days without dox. Representative images of stable dox-independent GFP+ colonies and bright-field pictures of chimeras derived from these iPSCs are shown. (B) Flow cytometric analysis of GFP in Oct4-GFP cells reprogrammed with Sox2, Sall4, Nanog, Klf4, and c-Myc, 25 days on dox, 5 days without dox. Representative images of stable dox-independent GFP+ colonies and bright-field pictures of chimeras derived from these iPSCs are shown. (C) Flow cytometric analysis of GFP in Oct4-GFP cells reprogrammed with Lin28, Sall4, Esrrb, Nanog, Klf4, and c-Myc, 25 days on dox, 5 days without dox. Representative images of stable dox-independent GFP+ colonies and bright-field pictures of chimeras derived from these iPSCs are shown. (D) Flow cytometric analysis of GFP in Oct4-GFP cells reprogrammed with Lin28, Sall4, Esrrb, and Nanog, 25 days on dox, 5 days without dox. Bright-field pictures of chimeras derived from these iPSCs are shown. (E) Flow cytometric analysis of GFP in Oct4-GFP cells and Nanog-GFP cells reprogrammed with Oct4, Esrrb, Dppa2, Klf4, and c-Myc, 25 days on dox, 5 days without dox. Representative images of stable dox-independent GFP+ colonies and bright-field pictures derived from these iPSCs are shown. (F) Flow cytometric analysis of GFP in Oct4-GFP cells and Nanog-GFP cells reprogrammed with Lin28, Sall4, Esrrb, and Dppa2, 25 days on dox, 5 days without dox. Representative images of stable dox-independent GFP+ colonies and bright-field pictures derived from these iPSCs are shown. (G) Flow cytometric analysis of GFP in Oct4-GFP cells reprogrammed with Lin28, Sall4, Ezh2, Nanog, Klf4 and c-Myc. Representative bright-field pictures of the cells 25 days on dox, 1 day post dox withdrawal, and 7 days post dox withdrawal are shown (bottom). Flow cytometric analysis of GFP at 7 days post dox withdrawal is shown (upper right). (H) Alkaline phosphatase immunostaining and flow cytometric analysis of GFP in control NGFP2-MEFs (upper left) and NGFP2 MEFs reprogrammed with Lin28, Sall4, Esrrb, and Nanog by primary infection (upper right), 5 days on dox, 3 days without dox. Flow cytometric analysis of GFP is shown (bottom). (I) Flow cytometric analysis of GFP in control NGFP2 MEFs (upper) and secondary NGFP2-Lin28, Sall4, Esrrb, and Nanog MEFs (bottom), 5 days on dox, 3 days without dox.
Figures 12A-12F. Analysis of Ezh2 and individual factor contributions. (A) Flow cytometric analysis of GFP upon overexpression of Ezh2 and dox exposure for 7 days followed by 3 days of dox withdrawal. (B) Quantitative RT-PCR of Ezh2 expression in NGFP2 cells, three days post shRNA knockdown. Two hairpins were used and expression levels were normalized for Hprt. (C) Alkaline phosphatase immunostaining of NGFP2 cells after 16 days of shRNA knockdown and dox addition. (D) Flow cytometric analysis of GFP in NGFP2 cells at day 16 upon shRNA knockdown and dox addition. GFP+ cells are gated. (E) Flow cytometric analysis of GFP upon overexpression of Lin28, Sall4, Esrrb, and Nanog individually in NGFP2 MEFs on dox for 10 days followed by 4 days dox withdrawal. (F) Flow cytometric analysis of GFP upon overexpression of Nanog individually in NGFP2 MEFs on dox for 16 days followed by 3 days dox withdrawal.
Figures 13A-13C. Model of the reprogramming process. The reprogramming process can be split into two phases: an early stochastic phase (A and B) of gene activation followed by a later more deterministic phase (C) of gene activation that begins with the activation of the Sox2 locus. After a fibroblast is induced with exogenous factors Oct4, Sox2, Klf4, c-Myc, the cell can proceed into either one of two stochastic phases. In scenario A, stochastic gene activation can lead to the activation of the Sox2 locus. In scenario B, stochastic gene activation can lead to the activation of "predictive markers" like Utf1, Esrrb, Dppa2, Lin28, which then mark cells that have a higher probability of activating the Sox2 locus. Activation of the Sox2 locus can be via two potential paths: (1) direct activation of the Sox2 locus or (2) sequential gene activation that leads to the activation of the Sox2 locus. In this model, probabilistic events decrease and hierarchical events increase as the cell progresses from fibroblast to iPSC. Solid red arrows and black arrows denote hypothetical interactions and interactions supported by our data, respectively. The white gap shown between the stochastic (A and B) and deterministic (C) panels represents the transition from induced fibroblast to iPSC illustrated between the orange dotted cluster and red cluster in Figure 4A.
Figure 14A-14D. Characterization of SNEL-iPSC lines. (A) A schematic presentation of Bayesian network demonstrates the hierarchy of a subset of pluripotent genes that leads to a stable and transgene independent pluripotency state22. Sall4, Nanog, Esrrb and Lin28 (SNEL) are marked by red circles. (B) Representative images of two stable dox-independent, GFP-positive colonies (Nanog-GFP SNEL#1 and Oct4-GFP SNEL#3) and immunostaining for Sall4, Sox2, Utf1 and Esrrb. (C) Quantitative RT-PCR of Dppa3, Dppa2, Zfp42 (Rex1) and Lin28 normalized to the Hprt housekeeping control gene in the indicated samples. Error bars are presented as a mean ± standard deviation (SD) of two duplicate runs from a typical experiment. (D) Hematoxylin and eosin staining of teratoma sections generated from Oct4-GFP SNEL#1 showing structures from all three germ layers.
Figure 15A-15D. SNEL-iPSCs produce "all-iPSC" mice with high success rate compared to OSKM. (A) Table summarizing the developmental potential of SNEL-iPSC or OSKM-iPSC lines via 4n complementation assay. Implantation Sites: The number was not recorded in all experiments; when not documented an "N/D" mark was made. The ">" sign denotes that implantations were recorded only in females in which c-section was performed. The "+" sign denotes that implantation sites were recorded for some females only. Dead fetuses and pups: This represents the number of fetuses found dead in utero and pups found dead at the time of c-section or right after natural birth. It also includes pups born with a hernia that were sacrificed immediately (asterisk represents the number of pups). In the case of Oct4-GFP OSKM#3, all pups were E10.5 or less. Live Pups: These exhibited assisted breathing at the time of birth that ceased shortly after birth. Fostered Pups: Pups in this category exhibited independent breathing followed by fostering with lactating mothers. (B) Percent of injected blastocysts surviving to birth are plotted for OSKM and SNEL lines, with the number of blastocysts noted on the x-axis, and blue representing number of pups that merely survived delivery, red the number of pups additionally foster-nursed. Percentages were compared by Chi-Squared test to compute significance. (C) Representative images of 4n adult mice produced from Oct4-GFP SNEL#1 and Oct4-GFP SNEL#4 lines and their F1 generation. (D) Confirmation of origin of "all-iPSC" mice by PCR for strain-specific polymorphisms. Two different Simple Sequence Length Polymorphism (SSLP) markers were tested using genomic DNA isolated from tissues of "all-iPSC" mice. Genomic DNA from the parental iPSCs (donor cells), as well as from a 129 Sv/Jae mouse (donor strain) and a B6D2F1 mouse (host blastocyst strain) served as controls.
Figure 16A-16D. Unbiased comparative transcriptome analyses distinguish iPSCs according to their 4n proficiency. (A) Hierarchical clustering of all genes (1765) exhibiting significant variation (p < 0.01 for F-test) across all ESC and iPSC samples. Each group (poor, good, high and ESCs) is marked by a different color. (B) Principal component analysis for genes from (A). Each of the iPSC and ESC groups is marked by a specific color and is surrounded by a circle. The numbers inside the circles correspond to the numbers in A. (C) qRT-PCR of Col6a1 and Thsb1 normalized to the Hprt housekeeping control gene in the indicated samples. Error bars are presented as a mean ± standard deviation (SD) of two duplicate runs from a typical experiment. The numbers on the X axis correspond to the numbers in A. (D) Gene ontology analysis using the GeneDecks27 algorithm of genes from (A).
Figure 17A-17C. SNEL-iPSC lines produce healthy chimeras with high contribution. (A) Table summarizing the ability of all SNEL-iPSC lines to contribute to chimeras. The percentage of chimerism is estimated qualitatively based on coat color. The incidence of germline transmission is also recorded. "N/D" is used to denote that these mice were not tested for germline transmission. (B) Representative pictures of chimeras and their estimated percentage of chimerism. (C) Representative pictures of adult mice and their progeny from two lines that were tested for germline transmission. Germline transmission is based on the presence of agouti pups in the litters.
Figure 18. Oct4-GFP SNEL#2 secondary MEFs express high levels of Lin28 and Esrrb. Secondary MEFs derived from Oct4-GFP SNEL#1 and Oct4-GFP SNEL#2 were exposed to dox for 48 hours and analyzed for the expression of Lin28, Esrrb, Oct4 and Sox2. MEFs and iPSCs served as controls. Error bars are presented as a mean ± standard deviation (SD) of two duplicate runs from a typical experiment.
Figure 19. "All-iPSC" pups produced from SNEL-iPSC lines. Representative pictures of entire litters after 4n complementation assay for two Oct4-GFP and two Nanog-GFP SNEL-iPSC lines. The female number is shown at the bottom of each litter.
Figure 20A-20B. SNEL-iPSCs produce "all-iPSC" mice with high success rate compared to OSKM. (A) Table summarizing the production of "all-iPSC" mice from SNEL-iPSC or OSKM-iPSC lines grown in 2i medium by tetraploid complementation. (B) The percentage of blastocysts that gave rise to live pups is plotted for OSKM and SNEL lines. The number of blastocysts is noted at the top of the bar. The number of pups that were born live but failed to breathe independently thereafter is represented by the blue bar. The number of pups that were born live, sustained independent breathing, and were fostered is represented by the red bar. Percentages were compared by Chi-Squared test to compute significance.
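The Chi-Squared comparison of percentages mentioned in the legends of Figures 15B and 20B can be illustrated with a minimal sketch. It assumes a 2x2 table (pups obtained versus not obtained, for two groups of lines) and a Pearson statistic without continuity correction; the specification does not state which variant was used. The function name `chi_squared_2x2` and all counts in the usage example are hypothetical and are not data from this disclosure.

```python
import math


def chi_squared_2x2(success_a, total_a, success_b, total_b):
    """Pearson chi-squared test (1 degree of freedom, no continuity
    correction) comparing two proportions, e.g. live pups per injected
    blastocyst for two groups of iPSC lines. Returns (statistic, p_value)."""
    table = [
        [success_a, total_a - success_a],
        [success_b, total_b - success_b],
    ]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    if min(row_totals) == 0 or min(col_totals) == 0:
        raise ValueError("every row and column total must be non-zero")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom the upper-tail probability of the chi-squared
    # distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    stat, p = chi_squared_2x2(success_a=20, total_a=100, success_b=5, total_b=100)
    print(f"chi2 = {stat:.3f}, p = {p:.4f}")
```

With small expected counts, an exact test or a continuity-corrected statistic may be more appropriate than this uncorrected form.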
Figure 21. Pups generated from poor, good and high quality iPSC lines. Representative images of small and abnormal pups from a "poor" quality iPSC line (Nanog-GFP SNEL#2) are shown on the left. Similarly, representative photos are shown for pups born live from a "good" quality iPSC line (Nanog-GFP SNEL#3). These pups breathed normally at birth, but died within a few hours. On the right, one-week-old pups are shown from a "high" quality iPSC line (Oct4-GFP SNEL#1). These pups grew to adulthood.
Figure 22A-22B. Comparative transcriptome analysis demonstrates similar global gene expression profiles across ESC and iPSC samples. (A) Hierarchical clustering of global gene expression profiles for two microarray technical replicates for every iPSC and ESC (reference) line. Replicate pairs are assigned a shared numerical value. Each group (poor, good, high and ESCs) is marked by a different color. (B) Principal component analysis for expression data from (A). Each of the iPSC and ESC groups is marked by a specific color and is surrounded by a circle. The numbers inside the circles correspond to the numbers in Figure 16A.
Figure 23. Comparative DNA methylome analysis of iPSCs and ESCs. Hierarchical clustering by 2628 differentially methylated regions (DMRs) derived from whole genome bisulphite sequencing does not segregate samples by either reprogramming factor combination or ESC versus iPSC status. Each group (poor, good, high and ESCs) is marked by a different color.
DETAILED DESCRIPTION OF CERTAIN EMBODIMENTS
The present invention relates in some aspects to novel methods and compositions for reprogramming mammalian cells. Certain methods and compositions of the invention are of use to enhance generation of induced pluripotent stem cells by reprogramming somatic cells. Certain methods and compositions of the invention are of use to identify cells destined to become iPSCs. Certain compositions and methods of the invention are of use to enhance reprogramming of pluripotent mammalian cells to a differentiated cell type. Certain compositions and methods of the invention are of use to enhance reprogramming of differentiated mammalian cells of a first cell type to differentiated mammalian cells of a second differentiated cell type. The reprogrammed somatic cells are useful for a number of purposes, including treating or preventing a medical condition in an individual. The invention further provides methods for identifying an agent that enhances or contributes to reprogramming mammalian cells.
Differentiated cells can be reprogrammed to a pluripotent state by overexpression of the four transcription factors Oct4, Sox2, Klf4, and c-Myc (Takahashi and Yamanaka, 2006). Fully reprogrammed induced pluripotent stem cells (iPSCs) can contribute to the three germ layers and give rise to fertile mice by tetraploid complementation (Boland et al., 2009; Hanna et al., 2008; Okita et al., 2007; Park et al., 2008; Takahashi et al., 2007; Wernig et al., 2007; Yu et al., 2007; Zhao et al., 2009). The reprogramming process is characterized by widespread epigenetic changes (Kim et al., 2010; Koche et al., 2011; Maherali et al., 2007; Mikkelsen et al., 2008; Sridharan et al., 2009) that generate iPSCs that are functionally and molecularly similar to embryonic stem (ES) cells (Carey et al., 2011).
To further understand the reprogramming process, many groups have analyzed transcriptional and epigenetic changes of populations of cells at different time points after factor induction. Microarray data at defined time points during the reprogramming process (Mikkelsen et al., 2008) show that the immediate response to the reprogramming factors is characterized by de-differentiation of mouse embryonic fibroblasts (MEFs) and upregulation of proliferative genes, consistent with the expression of c-Myc. It has been shown that expression of early markers such as alkaline phosphatase and SSEA1 is followed by activation of endogenous pluripotency markers, Sox2 and Nanog (Brambrink et al., 2008; Stadtfeld et al., 2008). Live imaging analysis of single cells enabled retroactive tracking of reprogramming events and defined transitions within induced cells (Smith et al., 2010). Additionally, analysis of methylation and chromatin of fibroblasts that have undergone a discrete number of cell divisions suggests that early transcriptional changes mostly rely on already existing accessible chromatin and, further, activating chromatin modifications are assigned to pluripotent promoters before transcriptional activation (Koche et al., 2011). Recently, gene expression profiling and RNAi screening in fibroblasts revealed three phases of reprogramming (initiation, maturation, and stabilization), with the earliest initiation phase marked by a mesenchymal-to-epithelial transition (MET) (Samavarchi-Tehrani et al., 2010).
Given these data, a stochastic model has emerged to explain how forced expression of the transcription factors initiates the process that eventually leads to the pluripotent state in only a very small fraction of the transduced cells (Hanna et al., 2009; Hanna et al., 2010; Yamanaka, 2009). Most data have been interpreted to be consistent with the stochastic model (Hanna et al., 2009; Hanna et al., 2010) posing that the reprogramming factors in fibroblasts initiate a sequence of stochastic events that eventually lead to the small and unpredictable fraction of iPSCs (Jaenisch and Young, 2008). The stochastic model is also supported by clonal analyses demonstrating that the activation of pluripotency markers can occur at different times after infection in individual mitotic daughter cells of the same infected fibroblast (Meissner et al., 2007). However, since the molecular changes occurring at the different stages during the reprogramming process were based upon the analysis of heterogeneous cell populations, it has not been possible to clarify the events that occur in the rare single cells that eventually form an iPSC. Moreover, although a stochastic model has been described, there is little insight into the sequence of events that drive the process.
To understand the changes that precede iPSC formation, we established a new experimental paradigm that allows for gene expression analysis on a single-cell level. Single-cell analysis can provide a snapshot of the state of individual cells in heterogeneous cell populations and therefore elucidate unknown genes and signaling pathways involved in the reprogramming process (Graf and Stadtfeld, 2008; Hayashi et al., 2008; Kalisky et al., 2011; Kalisky and Quake, 2011; Raj and van Oudenaarden, 2008; Tang et al., 2010; Tang et al., 2009; Tang et al., 2011). We used gene expression analysis to profile 48 genes in single cells derived from early time points in the reprogramming process, intermediate cells, and fully reprogrammed iPSCs. Our results show that cells at different stages of the reprogramming process can be separated into two defined populations with high variation in gene expression at early time points. A steep decrease in the variation between sister cells is initially observed after the activation of the Nanog locus, which marks the induction of the core pluripotency circuitry. We also demonstrate that activation of genes such as Fbxo15, Fgf4 and Oct4 does not stringently predict successful reprogramming, in contrast to Esrrb, Utf1, Lin28, and Dppa2, which more rigorously mark the rare cells that are destined to become iPSCs. Moreover, our results suggest that stochastic gene expression changes early in the reprogramming process are followed by a "non-stochastic" or more "hierarchical" phase of gene expression responsible for the activation of the endogenous pluripotent circuitry. This late phase begins with the activation of the endogenous Sox2 locus and drives a sequence of events that lead to the activation of other pluripotency genes. Finally, we show that the activation of the pluripotency core circuitry is possible by various combinations of factors based on the events that occur in this late consecutive phase.
During the course of work described herein, we used two different techniques to measure gene expression in single cells of clonal populations during the reprogramming of somatic cells to pluripotency. While single-cell gene expression analysis has been applied previously to studies in the mouse intestine (Itzkovitz et al., 2011), human colon tumors (Dalerba et al., 2011), the mouse zygote and blastocyst (Guo et al., 2010; Tang et al., 2010), and human iPSCs (Narsinh et al., 2011), such an approach has not been used to define the cell states and molecular transitions during the conversion of somatic cells to iPSCs.
Two models, designated as a 'stochastic' or a 'deterministic' process, have been proposed to explain the mechanism of reprogramming (Hanna et al., 2009; Yamanaka, 2009). A number of studies are most consistent with the stochastic model (Hanna et al., 2009; Hanna et al., 2010) posing that the reprogramming factors in fibroblasts initiate a sequence of stochastic events that eventually leads to the small and unpredictable fraction of iPS cells (Jaenisch and Young, 2008). In contrast, nuclear transfer (Boiani et al., 2002) or cell fusion (Bhutani et al., 2010; Do and Scholer, 2010) induce reprogramming rapidly and possibly as a single event with little heterogeneity observed in somatic cells, possibly consistent with a deterministic process (Hanna et al., 2010). So far the molecular analyses of reprogramming were based on gene expression measurements over heterogeneous populations of cells, precluding insight into events that occur in the rare single cells that ultimately become iPS cells.
Our single-cell gene expression data are in agreement with the stochastic model and suggest a sequence of gene activation at later stages (Figure 13). The significant variation between sister cells of initial colonies that does not reveal a specific sequential order of gene expression supports a stochastic mechanism of gene activation early in the process (Figure 13A). Based on the Bayes network model derived from single-cell gene expression data, a second later phase of reprogramming seems to be governed by a more sequential or hierarchical mechanism of gene activation (Figure 13C). Our data suggest that the activation of the Sox2 locus initiates consecutive steps that lead to the pluripotent state. In support of this, we failed to find cells expressing transcripts solely of Sox2/Fgf4, Sox2/Dnmt3b, or Sox2/Fbxo15 (Figure 9). These data support a hierarchical and defined order of gene activation. However, our data are also consistent with the possibility that the activation of "predictive" markers such as Esrrb or Utf1 represents a key event that either directly activates the Sox2 locus or initiates a sequence of gene activations eventually resulting in Sox2 activation (Figure 13B).
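The co-expression argument above (the absence of certain double-positive cells in Figure 9) can be sketched in a few lines of code. The following is only an illustration, assuming that each cell is summarized as the set of genes for which transcripts were detected; the function names and the example cells are hypothetical and are not data from this disclosure.

```python
from collections import Counter


def classify_cells(cells, genes):
    """Count cells by the exact combination of the given genes they express.

    `cells` is an iterable of sets of detected gene names; `genes` fixes the
    genes of interest and their reporting order. Keys of the returned Counter
    are tuples such as ("Sox2",), ("Sox2", "Sall4") or () for negative cells.
    """
    counts = Counter()
    for detected in cells:
        combination = tuple(g for g in genes if g in detected)
        counts[combination] += 1
    return counts


def fractions(counts):
    """Convert category counts into fractions of all profiled cells."""
    total = sum(counts.values())
    return {combo: n / total for combo, n in counts.items()}


if __name__ == "__main__":
    genes = ("Sox2", "Sall4", "Fgf4")
    # Hypothetical per-cell detection calls, for illustration only.
    cells = [
        {"Sox2"},
        {"Sox2", "Sall4"},
        {"Sox2", "Sall4", "Fgf4"},
        {"Sall4"},
        set(),
    ]
    counts = classify_cells(cells, genes)
    for combo, frac in fractions(counts).items():
        label = "/".join(combo) if combo else "negative"
        print(f"{label}: {counts[combo]} cells ({frac:.0%})")
    # A combination that is never observed simply has a count of zero.
    print("Sox2/Fgf4 only:", counts[("Sox2", "Fgf4")])
```

A combination that is consistently absent across many cells, as reported for Sox2/Fgf4, is the kind of observation that a hierarchical order of activation would predict, although a count of zero in a small sample does not by itself establish that order.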
It has been shown that Sox2 is indispensable for maintaining ES-cell pluripotency because Sox2-null ES cells differentiated primarily into trophoectoderm-like cells, and it was suggested, consistent with our hypothesis, that Sox2 was partially responsible for the activation of Oct4 by maintaining high levels of orphan nuclear receptors like Nr5a2 (Lrh1) (Masui et al., 2007). In agreement with this observation, removing the Sox2 activator Esrrb from a cocktail of transcription factors (Lin28, Sall4, Nanog, Ezh2, Klf4 and c-Myc) yielded iPS-like colonies that were unstable due to their failure to activate the core pluripotency circuitry. Thus, early in the reprogramming process the four factors induce the somatic cells to acquire epigenetic changes by a stochastic mechanism leading to an intermediate or partially reprogrammed state (Egli et al., 2008). Activation of the endogenous Sox2 represents a late cell state and can be considered as a first step that drives a consecutive chain of events that allow the cells to enter the pluripotent state.
We show that the activation of the pluripotent circuitry is possible by various subsets of transcription factors even without Oct4 and Sox2. Although Oct4 is a key factor in facilitating the reprogramming process (Figures 11A-11C), its own activation could not predict reactivation of the pluripotent circuitry (Figure 5). We showed that Sall4, Nanog, Esrrb and Lin28 (SNEL reprogramming factors) are sufficient to generate fully reprogrammed iPSCs. Our Bayes model is consistent with these data (Figure 9 and Figure 11). We further showed that Dppa2 could substitute for Nanog in this combination, i.e., Sall4, Dppa2, Esrrb, and Lin28 (SDEL reprogramming factors) are sufficient to generate fully reprogrammed iPSCs, consistent with our model. We also showed that Lin28 could be replaced by, e.g., any of Ezh2, Kdm1, and Utf1.
In summary, single cell gene expression analysis revealed an unanticipated heterogeneity in gene expression between sister cells, consistent with stochastic epigenetic alterations during the early phase of the reprogramming process. This was followed by a more hierarchical mechanism late in the process where activation of some key genes predicts the expression of downstream genes and the establishment of the pluripotency circuitry.
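The heterogeneity between sister cells summarized above is reported in Figures 4C and 4D as a Jensen Shannon Divergence. A minimal sketch of that quantity for two discrete distributions is given below. It assumes base-2 logarithms (so that values fall between 0 and 1) and that each cell or group has already been reduced to a vector of non-negative weights over the same bins or genes. How the expression data were binned or normalized before this step is not specified here, and the function names are hypothetical.

```python
import math


def _normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    if total <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative with a positive sum")
    return [w / total for w in weights]


def _kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits; terms with p_i = 0
    contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions.

    Returns 0 for identical distributions and 1 (with base-2 logs) for
    distributions with no overlapping support.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    p = _normalize(p)
    q = _normalize(q)
    mixture = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * _kl_divergence(p, mixture) + 0.5 * _kl_divergence(q, mixture)


if __name__ == "__main__":
    # Hypothetical expression histograms for two cells, for illustration only.
    cell_a = [4, 3, 2, 1]
    cell_b = [1, 2, 3, 4]
    print(f"JSD = {jensen_shannon_divergence(cell_a, cell_b):.3f}")
```

Averaging such pairwise values over all cells in a group is one plausible way to summarize within-group variability, but that aggregation step is an assumption rather than the procedure used for the figures.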
Disclosed herein are methods of generating a reprogrammed cell, such methods comprising: (a) introducing reprogramming factors Sall4, Nanog, Esrrb, and Lin28 into a mammalian somatic cell; and (b) culturing said cell in a suitable medium under conditions appropriate for and for a time period sufficient to give rise to a reprogrammed cell. Also disclosed herein are methods of generating a reprogrammed cell, such methods comprising: (a) introducing reprogramming factors Sall4, Dppa2, Esrrb, and Lin28 into a mammalian somatic cell; and (b) culturing said cell in a suitable medium under conditions appropriate for and for a time period sufficient to give rise to a reprogrammed cell. Also disclosed herein are methods of generating a reprogrammed cell, such methods comprising: (a) introducing reprogramming factors Sall4, Dppa2, Esrrb, and any one or more of Ezh2, Kdm1, or Utf1 into a mammalian somatic cell; and (b) culturing said cell in a suitable medium under conditions appropriate for and for a time period sufficient to give rise to a reprogrammed cell. Also disclosed herein are methods of generating a reprogrammed cell, such methods comprising: (a) introducing reprogramming factors Sall4, Dppa2 and/or Nanog, and Esrrb into a mammalian somatic cell; and (b) culturing said cell in a suitable medium under conditions appropriate for and for a time period sufficient to give rise to a reprogrammed cell. In some embodiments, reprogramming factor Sall4 refers to GeneID 57167 of NCBI's Gene database or a homolog thereof. In some embodiments, reprogramming factor Sall4 refers to a reprogramming factor obtained using the primers in Table 2 below. In some embodiments, reprogramming factor Nanog refers to GeneID 79923 of NCBI's Gene database or a homolog thereof. In some embodiments, reprogramming factor Nanog refers to a reprogramming factor obtained using the primers in Table 2 below. In some embodiments, reprogramming factor Esrrb refers to GeneID 2103 of NCBI's Gene database or a homolog thereof. In some embodiments, reprogramming factor Esrrb refers to a reprogramming factor obtained using the primers in Table 2 below. In some embodiments, reprogramming factor Lin28 refers to GeneID 79727 of NCBI's Gene database or a homolog thereof. In some embodiments, reprogramming factor Lin28 refers to a reprogramming factor obtained using the primers in Table 2 below. In some embodiments, reprogramming factor Dppa2 refers to GeneID 151871 of NCBI's Gene database or a homolog thereof. In some embodiments, reprogramming factor Dppa2 refers to a reprogramming factor obtained using the primers in Table 4 below. In some embodiments, reprogramming factor Ezh2 refers to GeneID 2146 of NCBI's Gene database or a homolog thereof. In some embodiments, reprogramming factor Ezh2 refers to a reprogramming factor obtained using the primers in Table 4 below. In some embodiments, reprogramming factor Kdm1 refers to Kdm1a having GeneID 23028 of NCBI's Gene database or a homolog thereof. In some embodiments, reprogramming factor Utf1 refers to GeneID 8433 of NCBI's Gene database or a homolog thereof. In some embodiments, reprogramming factor Utf1 refers to a reprogramming factor obtained using the primers in Table 4 below.
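For readers who track these factors computationally, the identifiers quoted above can be gathered into a small lookup. The following is only an illustrative sketch: the GeneIDs and the SNEL/SDEL labels come from the text, while the names `GENE_IDS`, `COCKTAILS`, and `gene_ids_for` are hypothetical and not part of the disclosure.

```python
# GeneIDs as quoted in the text (NCBI Gene database); homologs are not listed here.
GENE_IDS = {
    "Sall4": 57167,
    "Nanog": 79923,
    "Esrrb": 2103,
    "Lin28": 79727,
    "Dppa2": 151871,
    "Ezh2": 2146,
    "Kdm1a": 23028,
    "Utf1": 8433,
}

# Named four-factor combinations from the text (labels SNEL and SDEL are the document's own).
COCKTAILS = {
    "SNEL": ("Sall4", "Nanog", "Esrrb", "Lin28"),
    "SDEL": ("Sall4", "Dppa2", "Esrrb", "Lin28"),
}


def gene_ids_for(cocktail: str) -> dict[str, int]:
    """Return a factor-name -> GeneID mapping for a named combination."""
    return {name: GENE_IDS[name] for name in COCKTAILS[cocktail]}


if __name__ == "__main__":
    for label in COCKTAILS:
        print(label, gene_ids_for(label))
```

The lookup covers only the two named four-factor combinations; the variants in which Lin28 is replaced by Ezh2, Kdm1, or Utf1 are left out deliberately to keep the sketch small.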
Reprogramming, as used herein, refers to a process that alters the differentiation state or identity of a cell. Cells are classified into different "types" based on various criteria such as morphological and functional characteristics and gene expression profile. "Cell state" encompasses the concept of "cell type" or "cell identity" but also refers to any one or more features or characteristics (or sets of features or characteristics) that characterize a cell (e.g., pluripotent state, differentiated state, post-mitotic state, etc.). It will be understood that in at least some aspects the initial cell(s) gives rise to a population of descendants and that reprogramming occurs over time within the population of cells. In some embodiments any aspect herein pertaining to a cell pertains to a population comprising multiple cells. In some embodiments herein, a cell, reprogramming factor or combination thereof, or composition comprising one or more cells and/or reprogramming factors, is isolated or ex vivo. In some embodiments, the invention provides methods for reprogramming somatic cells to a less differentiated state. The resulting cells thus reprogrammed are sometimes referred to herein as "ES-like" or "iPSCs" if they are pluripotent. In some embodiments, reprogramming entails complete reversion of the differentiation state of a somatic cell to a pluripotent state, in which the cell has the ability to differentiate into or give rise to cells derived from all three embryonic germ layers (endoderm, mesoderm and ectoderm) and typically has the potential to divide in vitro for a long period of time, e.g., greater than one year or more than 30 passages. In some embodiments, reprogramming entails partial reversion of the differentiation state of a differentiated somatic cell to a multipotent state, in which the cell is able to differentiate into some but not all of the cells derived from all three germ layers. In some embodiments, reprogramming entails differentiating a pluripotent cell (e.g., an iPSC) or multipotent cell to a more differentiated cell of a desired cell type. In some embodiments, reprogramming entails converting a cell of a first differentiated cell type into a cell of a second differentiated cell type (also referred to as "trans-differentiation"), without apparently going through an intermediate stage of pluripotency. Unless otherwise indicated, the methods for reprogramming cells are performed in vitro, i.e., they are practiced using cells maintained in culture.
As used herein, "reprogramming factor" refers to a gene, RNA, or protein that promotes or contributes to cell reprogramming, e.g., in vitro. Many useful reprogramming factors are transcription factors. In aspects of the invention relating to reprogramming factor(s), the invention provides embodiments in which the reprogramming factor(s) are of interest for reprogramming somatic cells to pluripotency in vitro. Examples of reprogramming factors of interest for reprogramming somatic cells to pluripotency in vitro are Sall4, Nanog, Esrrb, Lin28, Klf4, c-Myc, and any gene/RNA/protein that can substitute for one or more of these in a method of reprogramming somatic cells in vitro. "Reprogramming to a pluripotent state in vitro" or "reprogramming to pluripotency in vitro" is used herein to refer to in vitro reprogramming methods that do not require and typically do not include nuclear or cytoplasmic transfer or cell fusion, e.g., with oocytes, embryos, germ cells, or pluripotent cells. Any embodiment or claim of the invention may specifically exclude compositions or methods relating to or involving nuclear or cytoplasmic transfer or cell fusion, e.g., with oocytes, embryos, germ cells, or pluripotent cells.
As used herein, "reprogramming protocol" refers to any treatment or combination of treatments that causes at least some cells to become reprogrammed. In some embodiments, "reprogramming protocol" can refer to a variation of a known reprogramming protocol, wherein a factor or other agent used in a known reprogramming protocol is omitted or modified. In some embodiments, "reprogramming protocol" can refer to a variation of a known reprogramming protocol, wherein a factor or agent known to be of use for reprogramming is used together with a different agent whose utility in reprogramming has not been established.
It should be appreciated that the present invention contemplates introducing exogenous reprogramming factors into somatic cells in any form that is capable of maintaining exogenous reprogramming factors for a period of time and at levels sufficient to activate endogenous pluripotency genes and for reprogramming of at least some of the somatic cells into which the exogenous reprogramming factors are introduced to occur. As used herein, "exogenous" refers to a substance present in a cell or organism other than its native source. For example, the terms "exogenous nucleic acid" or "exogenous protein" refer to a nucleic acid or protein that has been introduced by a process involving the hand of man into a biological system such as a cell or organism in which it is not normally found or in which it is found in lower amounts. A substance will be considered exogenous if it is introduced into a cell or an ancestor of the cell that inherits the substance. In contrast, the term "endogenous" refers to a substance that is native to the biological system.
Somatic cells of use in aspects of the invention may be primary cells (non-immortalized cells), such as those freshly isolated from an animal, or may be derived from a cell line capable of prolonged proliferation in culture (e.g., for longer than 3 months) or indefinite proliferation (immortalized cells). Adult somatic cells may be obtained from individuals, e.g., human subjects, and cultured according to standard cell culture protocols available to those of ordinary skill in the art. The cells may be maintained in cell culture following their isolation from a subject. In certain embodiments, the cells are passaged once or more following their isolation from the individual (e.g., between 2-5, 5-10, 10-20, 20-50, 50-100 times, or more) prior to their use in a method of the invention. In some embodiments, cells may be frozen and subsequently thawed prior to use. In some embodiments, cells will have been passaged no more than 1, 2, 5, 10, 20, or 50 times following their isolation from an individual prior to their use in a method of the invention. Somatic cells of use in aspects of the invention include mammalian cells, such as, for example, human cells, non-human primate cells, or mouse cells. They may be obtained by well-known methods from various organs, e.g., skin, lung, pancreas, liver, stomach, intestine, heart, reproductive organs, bladder, kidney, urethra and other urinary organs, etc., generally from any organ or tissue containing live somatic cells. Mammalian somatic cells useful in various embodiments include, for example, fibroblasts, adult stem cells, Sertoli cells, granulosa cells, neurons, pancreatic cells, epidermal cells, epithelial cells, endothelial cells, hepatocytes, hair follicle cells, keratinocytes, hematopoietic cells, melanocytes, chondrocytes, lymphocytes (B and T lymphocytes), macrophages, monocytes, mononuclear cells, cardiac muscle cells, skeletal muscle cells, etc.
In some embodiments, reprogramming factors of the present invention are introduced into somatic cells in the form of one or more nucleic acid sequences encoding the reprogramming factors. In some embodiments, SNEL reprogramming factors are introduced into somatic cells in the form of one or more nucleic acid sequences encoding the reprogramming factors. In some embodiments, the one or more nucleic acid sequences comprise DNA. In some embodiments, the one or more nucleic acid sequences comprise RNA. In some embodiments, the one or more nucleic acid sequences comprise a nucleic acid construct.
In some embodiments, the one or more nucleic acid sequences comprise a vector for delivery of the reprogramming factors of the present invention into a target cell (e.g., a mammalian somatic cell, e.g., a human or mouse fibroblast cell). The present invention contemplates the use of any suitable vector. Such suitable vectors are described by Stadtfeld and Hochedlinger (Genes Dev. 24:2239-2263, 2010, incorporated herein by reference in its entirety). Other suitable vectors are apparent to those skilled in the art.
In some embodiments, the vector comprises an inducible vector. In some embodiments, the inducible vector is a doxycycline inducible vector (i.e., a vector that activates expression of said reprogramming factors in the presence of doxycycline in a culture medium). "Expression" refers to the cellular processes involved in producing RNA and proteins as applicable, for example, transcription, translation, folding, modification and processing. "Expression products" include RNA transcribed from a gene and polypeptides obtained by translation of mRNA transcribed from a gene. In some embodiments, the inducible vector is a tamoxifen inducible vector.
In some embodiments, the vector is an integrating vector that integrates into a genome of a host cell (e.g., a mammalian somatic cell).
In some embodiments, the vector comprises a viral vector, e.g., a retroviral vector, e.g., a lentiviral vector.
In some embodiments, the vector comprises an excisable vector. In some embodiments, the excisable vector comprises a transposon, wherein said excisable vector is excisable from said genome by transient expression of a transposase. In certain embodiments, the transposon comprises a piggyBac transposon (See, e.g., Woltjen et al. Nature 458:766-770, 2009; Yusa et al. Nat Methods 6:363-369, 2009, each of which is incorporated herein by reference in its entirety). In some embodiments, the excisable vector comprises one or more loxP sites incorporated into said vector, wherein said vector can be excised from said genome by transient expression of a Cre recombinase (See, e.g., Kaji et al. Nature 458:771-775, 2009; Soldner et al. Cell 136:964-977, 2009, each of which is incorporated herein by reference in its entirety). In some embodiments, the excisable vector comprises a floxed lentiviral vector.
In some embodiments, the vector does not integrate into the genome of said somatic cell. In some embodiments, the vector comprises an adenoviral vector (See, e.g., Zhou and Freed. Stem Cells 27:2667-2674, 2009, the teachings of which are incorporated herein by reference). In some embodiments, the vector comprises a Sendai viral vector (See, e.g., Fusaki et al. Proc Jpn Acad 85:348-362, 2009, the teachings of which are incorporated herein by reference). In some embodiments, the vector comprises a plasmid. In some embodiments, the vector comprises an episome (Yu et al. Science 324(5928):797-801, 2009, the teachings of which are incorporated herein by reference).
In some embodiments, the one or more nucleic acids for introducing the reprogramming factors of the present invention comprise mRNA that is translatable in a mammalian somatic cell. In some embodiments, the mRNA can be introduced in vitro into somatic cells to be reprogrammed and translated by endogenous enzymes into proteins that can activate one or more endogenous pluripotency genes in the cell. As used herein, "pluripotency gene", refers to a gene whose expression under normal conditions (e.g., in the absence of genetic engineering or other manipulation designed to alter gene expression) occurs in and is typically restricted to pluripotent stem cells, and is crucial for their functional identity as such. It will be appreciated that the polypeptide encoded by a pluripotency gene may be present as a maternal factor in the oocyte. The gene may be expressed by at least some cells of the embryo, e.g., throughout at least a portion of the preimplantation period and/or in germ cell precursors of the adult. The gene may be expressed in ES cells and/or in embryonic carcinoma cells. The pluripotency gene is typically substantially not expressed in somatic cell types that constitute the body of an adult animal under normal conditions (with the exception of germ cells or precursors thereof, or possibly in certain disease states such as cancer). For example, the pluripotency gene may be one whose average expression level (based on RNA or protein) in ES cells is at least 50-fold or 100-fold greater than its average level in those terminally differentiated cell types present in the body of an adult mammal. In some embodiments, the pluripotency gene is one that encodes multiple splice variants or isoforms of a protein, wherein one or more such variants or isoforms is expressed in at least some adult somatic cell types, while one or more other variants or isoforms is not substantially expressed in adult somatic cells under normal conditions. 
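The fold-difference criterion in the preceding paragraph (average expression in ES cells at least 50-fold or 100-fold greater than in terminally differentiated cell types) can be expressed as a simple ratio test. This is a minimal sketch under stated assumptions: the function name `meets_fold_threshold`, its parameters, and the handling of a zero baseline are illustrative choices, not part of the disclosure, and both inputs are assumed to be averages in the same linear units.

```python
def meets_fold_threshold(es_level: float,
                         differentiated_level: float,
                         min_fold: float = 50.0) -> bool:
    """Check whether average ES-cell expression is at least `min_fold` times
    the average level in terminally differentiated cells.

    Both levels are assumed to be non-negative averages (RNA or protein)
    measured in the same linear units.
    """
    if es_level < 0 or differentiated_level < 0:
        raise ValueError("expression levels must be non-negative")
    if differentiated_level == 0:
        # Assumption: any detectable ES-cell expression over an undetectable
        # baseline is treated as exceeding the threshold.
        return es_level > 0
    return es_level / differentiated_level >= min_fold


if __name__ == "__main__":
    print(meets_fold_threshold(500.0, 10.0))               # 50-fold difference
    print(meets_fold_threshold(500.0, 10.0, min_fold=100.0))
```

In practice expression data are often log-transformed, in which case the comparison would be made on the difference of log values rather than on a ratio.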
In some embodiments, expression of the pluripotency gene is essential to maintain the viability or pluripotent state of iPSCs. Thus if the gene is knocked out or its expression is inhibited (i.e., its expression is eliminated or substantially reduced, e.g., such that the average steady state level of RNA transcript and/or protein encoded by the gene is decreased by at least 50%, 60%, 70%, 80%, 90%, 95%, or more), the iPSCs are not formed, die or, in some embodiments, differentiate or cease to be pluripotent. In some embodiments the pluripotency gene is characterized in that its expression in an ES cell or iPS cell decreases (resulting in, e.g., a reduction in the average steady state level of RNA transcript and/or protein encoded by the gene by at least 50%, 60%, 70%, 80%, 90%, 95%, or more) when the cell differentiates into a terminally differentiated cell. Oct4 and Nanog are exemplary pluripotency genes. In some embodiments, the mRNA is in vitro transcribed mRNA. A non-limiting example of producing in vitro transcribed mRNA of the present invention is described by Warren et al. (Cell Stem Cell 7(5):618-30, 2010, the teachings of which are incorporated herein by reference). One of ordinary skill in the art can readily adapt the protocol described in Warren to produce one or more mRNAs of the present invention by substituting the reprogramming factors used in Warren with the reprogramming factors of the present invention. In some embodiments, the in vitro transcribed mRNA comprises a sequence encoding SV40 large T (LT). In some embodiments, the in vitro transcribed mRNA comprises one or more modifications that increase stability or translatability of said mRNA. In some embodiments, the in vitro transcribed mRNA comprises a 5' cap. The cap may be wild-type or modified. Examples of suitable caps and methods of synthesizing in vitro transcribed mRNA containing such caps are apparent to those skilled in the art.
In some embodiments, the in vitro transcribed mRNA comprises an open reading frame flanked by a 5' untranslated region and a 3' untranslated region that enhance translation of said open reading frame, e.g., a 5' untranslated region that comprises a strong Kozak translation initiation signal, and/or a 3' untranslated region that comprises an alpha-globin 3' untranslated region.
In some embodiments, the in vitro transcribed mRNA comprises a polyA tail. Methods of adding a polyA tail to in vitro transcribed mRNA are known in the art, e.g., enzymatic addition via polyA polymerase or ligation with a suitable ligase.
The present invention contemplates any suitable method for introducing in vitro transcribed mRNA encoding reprogramming factors (e.g., SNEL reprogramming factors) of the present invention into somatic cells. In some embodiments, the in vitro transcribed mRNA is introduced into said somatic cell via electroporation. In some embodiments, the in vitro transcribed mRNA is introduced into said somatic cell complexed with a cationic vehicle that facilitates uptake of said mRNA into said somatic cell via endocytosis (e.g., a cationic liposome or a nanoparticle).
In some embodiments, the in vitro transcribed mRNA is introduced into said somatic cell in an amount and for a period of time sufficient to maintain expression of the reprogramming factors until cellular reprogramming of said somatic cell occurs. The period of time sufficient to maintain expression of the reprogramming factors may vary depending on the type of somatic cell and the reprogramming factors employed. One of ordinary skill in the art can readily determine the appropriate period of time by routine experimentation. In some embodiments, in vitro transcribed mRNA is introduced into somatic cells at various intervals during the course of reprogramming to maintain sufficient levels of exogenous reprogramming factors in the somatic cells until reprogramming of the cells occurs.
In some embodiments, the culture medium comprising the somatic cells to be reprogrammed is supplemented or treated with one or more agents that increase the efficiency of reprogramming or enhance the reprogramming process. Cells may be treated in any of a variety of ways to cause reprogramming according to the methods of the present invention. The treatment can comprise contacting the cells with one or more agent(s) that contribute to reprogramming ("reprogramming agent"). Such contacting may be performed by maintaining the cell in culture medium comprising the agent(s). In some embodiments the somatic cells are genetically engineered. The somatic cell may be genetically engineered to express one or more reprogramming factor(s) as described herein and known in the art. In some embodiments, the cells are cultured under low oxygen conditions (e.g., about 5% O2) to promote more efficient reprogramming of the somatic cells to iPSCs. In some embodiments, the in vitro transcribed mRNA is treated with a phosphatase to reduce a cytotoxic response by said somatic cell upon introduction of said mRNA into said somatic cell.
In some embodiments, the in vitro transcribed mRNA comprises one or more base substitutions. Methods of modifying bases of mRNA are well known in the art. Non-limiting examples of suitable base substitutions include 5-methylcytidine (5mC), pseudouridine (psi), 5-methyluridine, 2'O-methyluridine, 2-thiouridine, and N6-methyladenosine. It should be appreciated that any number of bases in an RNA of the present invention (e.g., in vitro transcribed mRNA) can be substituted.
In some embodiments, reprogramming factors (e.g., SNEL reprogramming factors) of the present invention are introduced into somatic cells in the form of one or more proteins or functional variants or fragments thereof that are capable of activating endogenous pluripotency genes in the cells and reprogramming at least some of the cells to iPSCs. Zhou et al. have successfully produced iPSCs derived from both mouse and human fibroblasts using purified recombinant proteins, and such methods can be adapted for use with the inventive reprogramming factors of the present invention to produce iPSCs (Zhou et al. 2009. Cell Stem Cell 4:381-384, incorporated herein by reference in its entirety). In some embodiments, the one or more protein reprogramming factors comprise a recombinant protein. In some embodiments, the one or more proteins comprise a fusion protein. In some embodiments, the one or more proteins further comprise a cell-penetrating peptide that facilitates entry of the one or more proteins into a cell nucleus where the one or more proteins can function to activate endogenous pluripotency genes in the cells. In some embodiments, the cell-penetrating peptide is fused to a C terminus of said one or more proteins.
Recombinant proteins comprising cell-penetrating peptides fused to their C terminus can be produced according to routine methods, e.g., expression in E. coli inclusion bodies followed by solubilization, refolding, and purification as described by Zhou et al. 2009, or expression in a suitable cell line, for example, an HEK293 cell line as described in Kim et al. (Cell Stem Cell 4(6):472-476, 2009, incorporated herein by reference in its entirety). In some embodiments, the cell-penetrating peptide comprises HIV tat. In some embodiments, the cell-penetrating peptide comprises poly-arginine. In some embodiments, the one or more proteins are introduced into somatic cells in an amount and for a period of time sufficient for reprogramming of said somatic cell to occur. Such amount and period of time would be apparent to those skilled in the art depending on the particular reprogramming factors, the type of somatic cell, and the culture conditions. In some embodiments, the one or more protein reprogramming factors are introduced into somatic cells over successive intervals throughout the period of time to maintain levels sufficient to activate endogenous pluripotency genes in at least some of the cells into which the reprogramming proteins have been introduced. In some embodiments, the one or more protein reprogramming factors are introduced into somatic cells repeatedly throughout a stochastic phase of reprogramming until a sequential phase of reprogramming begins.
In some embodiments, a method of generating a reprogrammed cell further comprises (c) supplementing said medium with an agent that increases reprogramming efficiency. "Agent" as used herein means any compound or substance such as, but not limited to, a small molecule, nucleic acid, polypeptide, peptide, drug, ion, etc. For example, such agent may increase reprogramming efficiency and/or allow generation of reprogrammed cells under conditions in which detectable generation of reprogrammed cells would not otherwise occur. In some embodiments, "increase the efficiency of reprogramming" encompasses causing an increase in the percentage of cells that undergo reprogramming to a desired cell state or cell type (e.g., to iPSCs) when a population of cells is subjected to a reprogramming treatment, typically resulting in a greater number of individual colonies of reprogrammed cells after a given time period than would otherwise be the case ("colony enrichment"). For example, the number of colonies may be increased by a factor ("enrichment factor") of at least 2, e.g., between 2 and 50, e.g., about 2, 4, 8, 16, etc. In some embodiments, the inventive methods decrease the amount of time required to obtain at least some reprogrammed cells or decrease the amount of time required to obtain a given number of colonies of reprogrammed cells from a given number of somatic cells. For example, such time may be decreased by at least 1, 2, 3, 4, or 5 days, or more. In some embodiments of the invention, wherein it is desired to reprogram somatic cells to iPSCs, somatic cells are treated (e.g., genetically engineered) so that they express one or more reprogramming factors selected from: Sall4, Nanog, Esrrb and Lin28 (and optionally from: Sox2, Klf family members (e.g., Klf2, Klf4), and c-Myc) at levels greater than would be the case in the absence of such treatment (i.e., they "overexpress" the factor(s)). In some embodiments of the invention the cells are treated so that they overexpress Sall4, Nanog, Esrrb and Lin28. Suitable methods of engineering such expression include infecting cells with viruses (e.g., retrovirus, lentivirus) or transfecting the cells with viral vectors (e.g., retroviral, lentiviral) that contain the sequences of the factors operably linked to suitable expression control elements to drive expression in the cells following infection or transfection and, optionally, integration into the genome as known in the art. The invention provides the recognition that inhibiting histone methylation, e.g., H3K9 methylation, enhances reprogramming of somatic cells that have not been genetically modified to increase their expression of an oncogene such as c-Myc.
The invention thus provides ways to substitute for engineered expression of c-Myc in any method of reprogramming somatic cells that would otherwise involve engineering cells to express c-Myc. In some embodiments, said one or more agents comprise a histone deacetylase inhibitor. In some embodiments, the histone deacetylase inhibitor comprises valproic acid (VPA). In some embodiments, the histone deacetylase inhibitor comprises butyrate. In some embodiments, the one or more agents comprise an interferon inhibitor. In some embodiments, the one or more agents comprise a recombinant B18R protein.
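The "enrichment factor" described above, i.e., the factor by which an agent increases the number of reprogrammed colonies, can be computed as a ratio of colony counts from otherwise matched cultures. The sketch below is illustrative only: the function names and the decision to raise an error for a zero-colony control are assumptions, not part of the disclosure.

```python
def reprogramming_efficiency(colonies: int, cells_plated: int) -> float:
    """Percentage of plated somatic cells that gave rise to a reprogrammed colony."""
    if cells_plated <= 0:
        raise ValueError("cells_plated must be positive")
    return 100.0 * colonies / cells_plated


def enrichment_factor(colonies_with_agent: int, colonies_without_agent: int) -> float:
    """Ratio of colony counts with and without the agent.

    Assumes both cultures started from the same number of somatic cells and
    were scored after the same time period.
    """
    if colonies_without_agent <= 0:
        # With no colonies in the control the ratio is undefined; the agent
        # then enables reprogramming rather than enriching it.
        raise ValueError("control culture produced no colonies; ratio undefined")
    return colonies_with_agent / colonies_without_agent


if __name__ == "__main__":
    # Hypothetical counts chosen purely for illustration.
    print(enrichment_factor(32, 4))
    print(reprogramming_efficiency(20, 10_000))
```

The zero-control case is handled separately because the text also contemplates agents that allow reprogramming under conditions where no reprogrammed cells would otherwise be detected.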
In some embodiments, the one or more agents comprise a signaling pathway modulator that is capable of supplementing or substituting for one of the reprogramming factors of the present invention. "Modulate" is used consistently with its use in the art, i.e., meaning to cause or facilitate a qualitative or quantitative change, alteration, or modification in a process, pathway, or phenomenon of interest. Without limitation, such change may be an increase, decrease, or change in relative strength or activity of different components or branches of the process, pathway, or phenomenon. A "modulator" is an agent that causes or facilitates a qualitative or quantitative change, alteration, or modification in a process, pathway, or phenomenon of interest. Non-limiting examples of signaling pathway modulators are selected from the group consisting of a TGF-beta pathway inhibitor, a MAPK/ERK pathway inhibitor, a GSK3 pathway inhibitor, a WNT pathway activator, a 3'-phosphoinositide-dependent kinase-1 (PDK1) pathway activator, a mitochondrial oxidation modulator, a glycolytic metabolism modulator, a HIF pathway activator, and combinations thereof. Exemplary TGF-beta pathway inhibitors include SB431542 (4-[4-(1,3-benzodioxol-5-yl)-5-(2-pyridinyl)-1H-imidazol-2-yl]-benzamide) and A-83-01 [3-(6-Methyl-2-pyridinyl)-N-phenyl-4-(4-quinolinyl)-1H-pyrazole-1-carbothioamide]. An exemplary MAPK/ERK pathway inhibitor is the extracellular signal-regulated kinases (ERK) and microtubule-associated protein kinase (MAPK/ERK) pathway inhibitor PD0325901 (N-[(2R)-2,3-dihydroxypropoxy]-3,4-difluoro-2-[(2-fluoro-4-iodophenyl)amino]-benzamide). An exemplary GSK3 pathway inhibitor is the GSK3 inhibitor CHIR99021 [6-((2-((4-(2,4-Dichlorophenyl)-5-(4-methyl-1H-imidazol-2-yl)pyrimidin-2-yl)amino)ethyl)amino)nicotinonitrile], which activates Wnt signalling by stabilizing beta-catenin. An exemplary PDK1 pathway activator is the small molecule activator of 3'-phosphoinositide-dependent kinase-1 (PDK1) PS48 [(2Z)-5-(4-Chlorophenyl)-3-phenyl-2-pentenoic acid]. An exemplary small molecule that modulates mitochondrial oxidation is 2,4-dinitrophenol. Exemplary agents that modulate glycolytic metabolism include fructose 2,6-bisphosphate and oxalate. Exemplary HIF pathway activators include N-oxaloylglycine and Quercetin (See, e.g., Zhu et al., 2010, Cell Stem Cell 7:651-655, incorporated by reference herein in its entirety).
In some embodiments, a method of generating a reprogrammed cell further comprises (c) monitoring said culture for cells which display one or more markers of pluripotency. In some embodiments, the one or more markers of pluripotency are selected from the group consisting of Fbxo15, Nanog, Oct4, Sox2, Sall4 and combinations thereof. In some embodiments, the one or more markers of pluripotency comprise an early marker of pluripotency selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2.
In some embodiments, a method of generating a reprogrammed cell further comprises (c) or (d) isolating said reprogrammed cell from said culture.
In some embodiments, the reprogramming methods disclosed herein may be used to generate iPS cells for a variety of animal species. The iPS cells generated can be useful to produce desired animals. Animals include, for example, avians and mammals as well as any animal that is an endangered species. Exemplary birds include domesticated birds (e.g., chickens, ducks, geese, turkeys). Exemplary mammals include murine, caprine, ovine, bovine, porcine, canine, feline and non-human primate. These include domesticated animals, for example, cattle, pigs, horses, cows, rabbits, guinea pigs, sheep, and goats. In some embodiments, a reprogrammed cell isolated according to the inventive methods comprises a mammalian cell. In some embodiments, said mammalian cell is a human cell. In some embodiments, said mammalian cell is a non-human mammal cell. In some embodiments, said mammalian cell further comprises a reporter gene integrated at a locus whose activation serves as a marker of reprogramming to pluripotency. In some embodiments, the locus is selected from Nanog, Sox2, and Oct4. In some embodiments, said cell is an iPS cell.
In some aspects, chimeric mice and methods of generating such mice are disclosed. In some embodiments, a chimeric mouse is generated at least in part from a mammalian iPS cell generated according to the inventive methods described herein. In some embodiments, the chimeric mouse is generated by injecting said mammalian iPS cell into a mouse blastocyst and allowing said blastocyst to develop into a mouse in vivo. In some embodiments, the present invention provides a cell obtained from said mouse wherein said cell is derived from an iPSC of the present invention.
In some aspects, non-human mammals and methods of producing such non- human mammals are disclosed, e.g., a non-human mammalian iPSC produced according to the inventive methods can be used, at least in part, to generate a non- human mammal.
In some embodiments, the non-human mammal is a transgenic non-human mammal generated using iPSCs of the invention. In some embodiments, such iPSCs are genetically modified. A "genetically modified" or "engineered" cell refers to a cell into which an exogenous nucleic acid has been introduced by a process involving the hand of man (or a descendant of such a cell that has inherited at least a portion of the nucleic acid). The nucleic acid may for example contain a sequence that is exogenous to the cell, it may contain native sequences (i.e., sequences naturally found in the cells) but in a non-naturally occurring arrangement (e.g., a coding region linked to a promoter from a different gene), or altered versions of native sequences, etc. The process of transferring the nucleic acid into the cell can be achieved by any suitable technique. Suitable techniques include calcium phosphate or lipid-mediated transfection, electroporation, and transduction or infection using a viral vector. In some embodiments the polynucleotide or a portion thereof is integrated into the genome of the cell. The nucleic acid may have subsequently been removed or excised from the genome, provided that such removal or excision results in a detectable alteration in the cell relative to an unmodified but otherwise equivalent cell. In some embodiments genetic modification comprises replacing a selected nucleotide or nucleotide sequence with a different nucleotide or nucleotide sequence. For example, it is contemplated to replace a mutant sequence (e.g., a mutant sequence at least in part responsible for a disease) with a normal sequence or provide a normal or functional sequence (e.g., encoding a protein) to a cell that lacks such sequence. In some embodiments resulting iPS cells or differentiated descendants thereof may be used in cell therapy, e.g., to treat a subject suffering from the disease. In some embodiments, an integration is targeted to a selected locus.
The locus may be selected in order to disable a particular gene or may be a "safe harbour" locus, i.e., a locus where insertion of a nucleic acid is not known to be detrimental to or affect the phenotype of a cell. In some embodiments, a nucleic acid that integrated into the genome may have subsequently been at least in part excised from the genome, e.g., by site-specific recombination (e.g., using the Lox/Cre, Flp/Frt, or similar systems). In some embodiments, a cell may be genetically modified using an endonuclease that is targeted to selected DNA sequences so as to cause chromosomal double-stranded DNA breaks (DSBs), which stimulate breakage repair mechanisms such as nonhomologous end-joining (NHEJ) or homologous recombination (HR). Proteins that comprise a DNA binding domain (DBD) capable of recognizing a selected target DNA sequence and a cleavage domain (e.g., a cleavage domain of a non-specific endonuclease such as FokI or a variant thereof) may be used. Examples include zinc-finger nucleases (ZFNs) and TALENs. ZFNs comprise DBDs derived from or designed based on DBDs of zinc finger (ZF) proteins. TALENs comprise DBDs derived from or designed based on DBDs of transcription activator-like (TAL) effectors of plant pathogenic Xanthomonas spp. (See, e.g., WO2011097036; Urnov, FD, et al., Nature Reviews Genetics (2010), 11: 636-646; Miller JC, et al., Nat Biotechnol. (2011) 29(2): 143-8; Cermak, T., et al. Nucleic Acids Research, 2011, Vol. 39, No. 12 e82 and references in any of the foregoing). Modifications of interest may include gene disruption (e.g., by targeted insertions or deletions), introduction of discrete base substitutions (specified by a homologous donor DNA), and targeted insertion into a selected native genomic locus of DNA whose expression is desired.
In some embodiments such modifications may be performed without using a selectable marker and/or without using donor DNA comprising lengthy sequences homologous to the target locus and/or without requiring donor DNA. In some embodiments, the iPSCs are not genetically modified. The non-human mammals can be genetically modified or non-genetically modified. In some embodiments the iPSC has a mutation or polymorphism associated with a trait or disease that has a genetic component. In some embodiments, non-human mammals are produced using methods known in the art for producing non-human mammals from non-human ESCs or iPSCs.
In some embodiments, the non-human mammal serves as a model for a human disease. Such models are useful, e.g., for studying physiological processes or disease pathogenesis, testing the effect of a compound on the mammal, e.g., testing potential treatments, etc. In other embodiments iPSCs or ESC-like cells could be used to generate farm animals (e.g., cows, pigs, sheep, goats, horses), e.g., farm animals with desired traits. Examples of such traits could be, e.g., reduced susceptibility to disease, increased size, increased milk production, etc.
In some embodiments, non-human mammals are useful for research on apoptosis, autoimmune disease, cancer, cardiovascular disease, cell biology, dermatology, development, diabetes and/or obesity, endocrine deficiency, hearing (or hearing loss), hematological research, immunology, inflammation, musculoskeletal disorders, neurobiology, neurodegenerative disease, metabolism, vision (or vision loss), reproductive biology, or infectious disease. Research can include, e.g., identification of targets for development of therapeutic agents, testing potential therapeutic agents, toxicity testing, etc.
In some embodiments, an iPS cell, differentiated cells obtained from the iPS cell, or non-human mammal of the invention is used as a model for a disease, e.g., a disease for which a treatment, e.g., a pharmacological treatment, is sought. In some embodiments, a method of identifying a compound to be administered to treat a disease in a mammal comprises providing an iPSC of the invention or a cell obtained by differentiating the iPSC, wherein the iPSC or differentiated cell or descendants thereof manifest at least one indicator of a disease; administering a test compound to the cell, wherein the test compound is to be assessed for its effectiveness in treating the disease; and assessing the ability of the compound to modify the indicator of disease. In some embodiments the iPSC was derived from a somatic cell obtained from a donor suffering from the disease. In some embodiments the iPSC is genetically modified to harbor a mutation at least in part responsible for a disease.
In some embodiments, a method of producing a non-human mammal comprises introducing an iPSC produced according to the inventive methods disclosed herein into tetraploid blastocysts of the same non-human mammalian species under conditions that result in production of an embryo and said resulting embryo is transferred into a foster mother which is maintained under conditions that result in development of live offspring. In some embodiments, said non-human mammal is a mouse. In some embodiments, said iPS cells are introduced into said tetraploid blastocysts by injection. In some embodiments, said injection is a microinjection. In some embodiments said injection is laser-assisted micromanipulation or piezo injection.
In some embodiments, the method of producing a non-human mammal employs mouse iPSCs and the resulting non-human mammal is a mouse.
In some aspects, non-human mammalian embryos and methods of producing non-human mammalian embryos are disclosed. A method of producing a non-human mammalian embryo comprises injecting non-human mammalian iPSCs generated according to an inventive method of the present invention into non-human tetraploid blastocysts and maintaining said resulting tetraploid blastocysts under conditions that result in formation of embryos, thereby producing a non-human mammalian embryo. In some embodiments, said non-human mammalian iPSCs are mouse cells and said non-human mammalian embryo is a mouse embryo. In some embodiments, said mouse cells are mutant mouse iPS cells and are injected into said non-human tetraploid blastocysts by microinjection; in some embodiments, laser-assisted micromanipulation or piezo injection is used. In some embodiments, a non-human mammalian embryo comprises a mouse embryo.
In some embodiments, the somatic cell is a terminally differentiated somatic cell.
In certain embodiments the compositions and methods are of use to reprogram somatic cells to a less differentiated cell state. In certain embodiments the compositions and methods are of use to reprogram somatic cells to pluripotent, embryonic stem cell-like cells, sometimes referred to herein as "induced pluripotent stem cells" ("iPS cells" or "iPSCs"). In certain embodiments the compositions and methods are of use to reprogram pluripotent cells to a more differentiated state. In certain embodiments the compositions and methods are of use to reprogram pluripotent cells to a desired differentiated cell type. In certain embodiments the compositions and methods are of use to reprogram mammalian cells from a first differentiated cell type to a second differentiated cell type.
In some embodiments, the present invention provides a method comprising: (a) reprogramming somatic cells to a pluripotent state according to a reprogramming method or protocol of the present invention; and (b) reprogramming said pluripotent cells to a desired, differentiated cell type, wherein said differentiated cell type optionally comprises an adult stem cell or a fully differentiated cell.
iPSCs of the invention may be induced to differentiate into desired cell types. Such differentiated cells are an aspect of the invention. For example, the iPSCs may be induced to differentiate into hematopoietic stem cells, neural lineage cells, striated muscle cells, cardiac muscle cells, liver cells, pancreatic cells, cartilage cells, epithelial cells, urinary tract cells, ocular cells (e.g., retinal cells, limbal epithelial stem cells), vascular cells etc., by culturing such cells in differentiation medium and under conditions which provide for cell differentiation. Cell types of interest include, without limitation, keratinocytes, pigmented retinal epithelium, neural crest cells, motor neurons, dopaminergic neurons, hepatic progenitors, pancreatic islet-like cells (e.g., insulin-secreting beta-like cells), and mesenchymal stem cells.
In some embodiments iPSCs are differentiated to the endodermal, mesendodermal, or neuroectoderm lineage. In some embodiments a cell type of interest is a stem cell. A stem cell is capable of self-renewal and of differentiating to at least one more mature cell type. In some embodiments, a stem cell is a multipotent stem cell. A multipotent stem cell can give rise to cells of multiple different types but has less potential than a pluripotent cell. Exemplary multipotent stem cells include mesenchymal stem cells, neural stem cells, hematopoietic stem cells and more restricted hematopoietic cells such as myeloid or lymphoid stem cells, endothelial stem cells, etc. Cell types of interest can be identified, e.g., by cell surface markers, expression of reporter genes, gene expression profile, and/or characteristic morphology. If desired, a cell population can be enriched for cell type(s) of interest and/or further cultured to obtain more mature cell type(s). In some embodiments, enrichment comprises selecting cells that express one or more markers associated with the desired cell type(s) and/or selecting cells that do not express one or more markers associated with pluripotency. In some embodiments, enrichment comprises removing at least some cells that express one or more markers associated with pluripotency from the cell population. In some embodiments, enrichment comprises selecting cells that express one or more early markers of pluripotency (e.g., Esrrb, Utf1, Lin28, and Dppa2). In some embodiments, enrichment comprises selecting cells that express at least two early markers of pluripotency, at least three early markers of pluripotency, or at least four early markers of pluripotency. In some embodiments, enrichment comprises selecting cells that express a group of early pluripotency markers comprising Esrrb, Utf1, Lin28, and Dppa2. In some embodiments, enrichment comprises removing at least some cells that express one or more early markers of pluripotency.
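The marker-based enrichment criteria above can be illustrated computationally. The following is a minimal sketch only, assuming each cell has already been scored for which markers it expresses (e.g., by a reporter or sorting assay); the `Cell` record and the function names are hypothetical illustrations and are not part of the disclosed methods.

```python
from dataclasses import dataclass, field

# Early pluripotency markers named in the text.
EARLY_MARKERS = frozenset({"Esrrb", "Utf1", "Lin28", "Dppa2"})


@dataclass(frozen=True)
class Cell:
    """Hypothetical record: a cell ID plus the set of markers scored as expressed."""
    cell_id: str
    expressed: frozenset = field(default_factory=frozenset)


def count_early_markers(cell):
    """Number of the four early markers that this cell expresses."""
    return len(cell.expressed & EARLY_MARKERS)


def select_by_early_markers(cells, min_markers=1):
    """Keep cells expressing at least `min_markers` early markers (1 to 4)."""
    return [c for c in cells if count_early_markers(c) >= min_markers]


def deplete_early_marker_cells(cells):
    """Remove every cell that expresses one or more early markers."""
    return [c for c in cells if count_early_markers(c) == 0]
```

Setting `min_markers` to 2, 3, or 4 corresponds to the "at least two", "at least three", and "at least four" embodiments; the depletion function removes all marker-positive cells, whereas the text only requires removing at least some of them.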
In some embodiments, the invention provides a differentiated cell population obtained from iPSCs of the invention, wherein the cell population is substantially free of pluripotent cells. In some embodiments, no more than 5%, 2%, 1%, 0.5%, 0.1% or 0.05% of the cells express a marker associated with pluripotency. In some embodiments, expression of said marker is not significantly greater than a reference level, e.g., a background or control level.
Medium and methods which result in the differentiation of iPSCs are known in the art as are suitable culturing conditions. The differentiation of hiPSCs into a variety of cell and tissue types often involves the formation of EBs.
Differentiation along lineages of interest can be promoted by a variety of different compounds such as polypeptides, nucleic acids, and small molecules. Exemplary compounds include growth factors, morphogenetic factors, and small cell-permeable molecules such as steroids (e.g., dexamethasone), vitamins (e.g., vitamin C), sodium pyruvate, thyroid hormones, prostaglandins, dibutyryl cAMP, concanavalin A, vanadate, and retinoic acids. For example, bone morphogenetic proteins such as BMP-2 are useful to promote chondrogenic differentiation. Mechanical factors (e.g., mechanical properties of a scaffold or culture substrate, application of forces) can also promote differentiation along particular pathways. By way of example, methods useful for generating iPS cell-derived dopaminergic neurons are described in Kriks S, Studer L., Adv Exp Med Biol., 651: 101-11, 2009; methods useful for directing chondrogenic differentiation of ES cells using various growth factors such as BMP2 and TGFβ1 are described in Toh, WS, et al., "Differentiation of Human Embryonic Stem Cells Toward the Chondrogenic Lineage", in Stem Cell Assays, Methods in Molecular Biology, Volume 407, pp. 333-359, Humana Press, Totowa, NJ, 2007; methods useful for directing chondrogenic differentiation of iPS cells using various growth factors such as BMP2 and TGFβ1 are described in Toh, WS, et al., "Differentiation of Human Embryonic Stem Cells Toward the Chondrogenic Lineage", in Stem Cell Assays, Methods in Molecular Biology, Volume 407, pp. 333-359, Humana Press, Totowa, NJ, 2007; methods useful for generating cardiomyocytes are described in Cao F, et al., "Transcriptional and functional profiling of human embryonic stem cell-derived cardiomyocytes" PLoS One, 3(10): e3474, 2008. These references are merely exemplary of reported methods for obtaining differentiated cells from iPS cells.
In some aspects, cell lines, cell clones, and cell cultures derived or cultured using methods, reprogramming factors, and/or compositions disclosed herein are provided. In some aspects, a "cell line" refers to a population of largely or substantially identical cells, wherein the cells have often been derived from a single ancestor cell or from a defined and/or substantially identical population of ancestor cells. For example, a cell line may consist of descendants of a single cell. In some embodiments a cell line may have been or may be capable of being maintained in culture for an extended period (e.g., months, years, for an unlimited period of time). It will be appreciated that cells may acquire mutations and possibly epigenetic changes over time such that some individual cells of a cell line may differ with respect to each other. In some embodiments at least 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more of the cells of a cell line or cell culture are at least 90%, 95%, 96%, 97%, 98%, 99%, or more genetically identical. In some embodiments at least 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% of the cells of a cell line or cell culture express a set of cell surface markers. The set of markers could be markers indicative of pluripotency or cell-type specific markers. In some aspects, a cell "clone" refers to a population of cells derived from a single cell. It will be understood that if cells of a clone are subjected to different culture conditions or if some of the cells are subjected to genetic modification, the resulting cells may be considered distinct clones. In some embodiments, the term "cell culture" refers to a composition comprising a plurality of viable cells wherein at least some of the cells are proliferating, e.g., not cell cycle arrested. A cell culture could be composed of cells from one or more different cell lines or sources.
In certain embodiments of the invention, a pluripotent cell line or cell clone of the invention is stable in culture. As used herein, a state, condition, or property is "stable" if it remains substantially unchanged over a time period of interest, e.g., exhibits little or no variability over such time period. "Stabilize" refers to promoting the establishment and/or maintenance of a stable state, condition, or property, e.g., by inhibiting or preventing a change in such state, condition, or property. A cell or cell line or cell clone is stable in culture if it continues to proliferate over multiple passages in culture (e.g., indefinitely), most or all cells in the culture (e.g., at least 90%, 95%, 97%, 98%, or more) are of the same type or differentiation state (e.g., are pluripotent), and cells resulting from cell division are of the same cell type or differentiation state. Thus a stabilized cell or cell line retains its "identity" in culture as long as the culture conditions are not altered, and the cells continue to be passaged appropriately. In some embodiments, methods and compositions of the invention enhance or promote existence of a stable pluripotent state. In some embodiments the pluripotent state is an inner cell mass (ICM)-like state. Thus in some embodiments the invention is a method for stabilizing a pluripotent cell in an ICM-like state. In some embodiments, the pluripotent state is characterized by cell colonies that morphologically resemble those of ES cells of the 129 strain described in PCT Application Publication No. WO 2010/124290, incorporated herein by reference in its entirety. In some embodiments, the pluripotent state, e.g., in mice, is characterized by ability to participate in chimera formation with frequencies at least 20% of that of ES cells of the 129 strain. In some embodiments, the pluripotent state, e.g., in mice, is characterized by ability to contribute to the germ line in chimeras with frequencies at least 20% of that of ES cells of the 129 strain. In some embodiments the pluripotent state is characterized by colonies that morphologically resemble those of ES cells of the 129 strain. In some embodiments the pluripotent state is characterized by maintenance of both X chromosomes (in XX lines) in an activated state. In some embodiments a pluripotent state has at least 2, 3, 4, or more of the foregoing properties. In some embodiments an inventive cell line or clone has a stable pluripotency state. In some embodiments an inventive cell line or clone is karyotypically stable.
One of skill in the art will be aware of ways to assess the stability of a cell population. One suitable method is to examine the expression of "markers" known in the art to be characteristic of cells of a particular type or differentiation state. For example, stage-specific embryonic antigens-1, -3, and -4 (SSEA-1, SSEA-3, SSEA-4) are glycoproteins specifically expressed in early embryonic development and are markers for ES cells (Solter and Knowles, 1978, Proc. Natl. Acad. Sci. USA 75:5565-5569; Kannagi et al., 1983, EMBO J 2:2355-2361), with SSEA-1 being a marker of mouse ES cells and SSEA-3 and -4 being markers of human ES cells. Elevated expression of the enzyme alkaline phosphatase (AP) is another marker associated with undifferentiated embryonic stem cells (Wobus et al., 1984, Exp. Cell Res. 152:212-219; Pease et al., 1990, Dev. Biol. 141:322-352). Additional ES cell markers are described in Ginis, I., et al., Dev. Biol., 269: 369-380, 2004 and in Adewumi O, et al., Nat Biotechnol., 25(7): 803-16, 2007 and references therein. For example, TRA-1-60, TRA-1-81, GCTM2 and GCT343, and the protein antigens CD9, Thy1 (also known as CD90), NANOG, TDGF1, DNMT3B, GABRB3 and GDF3, REX-1, TERT, UTF-1, TRF-1, TRF-2, connexin43, connexin45, Foxd3, FGFR-4, ABCG-2, and Glut-1 are of use. In an exemplary embodiment a mouse pluripotent stem cell line, e.g., a mouse ES cell line, expresses Oct4, Nanog, and SSEA-1. In an exemplary embodiment a human pluripotent stem cell line, e.g., a human ES cell line, expresses Tra-1-60, Nanog, Oct4, Sox2, and SSEA3 and/or SSEA4.
In some embodiments, at least 80% or at least 90% of the pluripotent stem cells of a colony, cell line, or cell culture express one or more marker(s), e.g., a set of markers, indicative of pluripotency. In some embodiments at least 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more of the cells of a colony, cell line, or cell culture express the marker(s). Gene expression profiling may be used to assess pluripotency state. Pluripotent cells, such as embryonic stem cells, and multipotent cells, such as adult stem cells, are known to have a distinct pattern of global gene expression. See, for example, Ramalho-Santos et al., Science 298: 597-600, 2002; Ivanova et al., Science 298: 601-604, 2002; Boyer, LA, et al. Nature 441, 349, 2006, and Bernstein, BE, et al., Cell 125 (2), 315, 2006. One may assess DNA methylation, gene expression, and/or epigenetic state of cellular DNA, and/or developmental potential of the cells, e.g., as described in Wernig, M., et al., Nature, 448:318-24, 2007. Other methods of assessing pluripotency state include epigenetic analysis, e.g., analysis of DNA methylation state.
In certain embodiments of the invention, a pluripotent stem cell line, e.g., an iPS cell line, derived or cultured according to the invention, e.g., a human iPS cell line, a non-human vertebrate iPS cell line, a mouse iPS cell line, has a normal karyotype. In certain embodiments at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or greater than 95% of cells in metaphase examined exhibit a normal karyotype. In certain embodiments, at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or greater than 95% of cells exhibit a normal karyotype after at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, or more passages, e.g., 25 passages, or 30 passages, or more. In some embodiments normal karyotype comprises having the correct number of chromosomes without evidence of translocation or deletion or duplication. In some embodiments normal karyotype comprises having a normal banding pattern. In some embodiments a karyotype is a normal karyotype based on analysis by fluorescence in situ hybridization (FISH). In some embodiments a pluripotent stem cell or cell line is an XO cell or cell line which, in some embodiments, is otherwise karyotypically normal.
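The percentage criteria above reduce to simple arithmetic on scored metaphase spreads. The following is a minimal sketch, assuming each examined metaphase cell has already been called normal or abnormal by an analyst; the function names and the boolean input format are hypothetical illustrations and are not part of the disclosed methods.

```python
def normal_karyotype_fraction(metaphase_calls):
    """Fraction of examined metaphase cells scored as karyotypically normal.

    `metaphase_calls` is an iterable of booleans (True = normal karyotype).
    """
    calls = list(metaphase_calls)
    if not calls:
        raise ValueError("no metaphase cells were scored")
    return sum(calls) / len(calls)


def meets_karyotype_threshold(metaphase_calls, min_fraction=0.5):
    """True if the normal fraction is at least `min_fraction` (e.g., 0.5 to 0.9)."""
    return normal_karyotype_fraction(metaphase_calls) >= min_fraction
```

For example, 18 normal spreads out of 20 examined is a fraction of 0.9, which satisfies the "at least 85%" and "at least 90%" embodiments but not "greater than 95%". The comparison is inclusive, so the strict "greater than 95%" case would need a strict comparison instead.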
In some aspects, the invention provides a composition comprising: (a) one or more iPSCs derived using reprogramming factors Sall4, Nanog, Esrrb and Lin28; and (b) one or more material(s) that promotes differentiation of the iPSC(s) to one or more cell type(s) of interest. The material(s) could be, e.g., compound(s), a substrate, or cells. In other aspects, the invention provides a method of generating a cell type of interest comprising: (a) providing one or more iPSCs derived using reprogramming factors Sall4, Nanog, Esrrb and Lin28; and (b) culturing the one or more iPSCs of step (a) under conditions that promote differentiation of said cell(s) to one or more cell types of interest. In some embodiments, the conditions comprise culturing the cell(s) in culture medium comprising one or more compound(s) that promote differentiation to a desired cell lineage or cell type. Furthermore, the invention encompasses use of iPSCs of the invention to screen test compounds (e.g., test compounds such as those described herein), to identify compounds that promote differentiation of pluripotent cells (e.g., iPS cells) to one or more desired cell types.
Differentiated cells of the invention, e.g., differentiated mammalian cells, e.g., differentiated human cells, have a variety of uses. In some embodiments, such cells are used for therapeutic purposes. For example, neural lineage cells could be used to treat, prevent, or stabilize a neurological disease such as Alzheimer's disease, Parkinson's disease, Huntington's disease, or ALS, lysosomal storage diseases, multiple sclerosis, or a spinal cord injury. Differentiated cells that produce a hormone, such as a growth factor, thyroid hormone, thyroid-stimulating hormone, parathyroid hormone, steroid, serotonin, epinephrine, or norepinephrine may be administered to a mammal for the treatment or prevention of endocrine conditions. Differentiated cells may be administered to repair damage to the lining of a body cavity or organ, such as a lung, gut, exocrine gland, or urogenital tract or to treat damage or deficiency of cells in an organ or tissue such as the bladder, bone, bone marrow, brain, cartilage, esophagus, eye, fallopian tube, heart, intestines, gallbladder, kidney, liver, lung, muscle, ovaries, pancreas, prostate, skin, spinal cord, spleen, stomach, tendon, testes, thymus, thyroid, trachea, ureter, urethra, or uterus.
Differentiated cells could be used in tissue engineering, e.g., the construction of a replacement organ or tissue ex vivo. For example, such cells could be combined with a suitable scaffold, which is optionally three-dimensional and/or biodegradable. Optionally, the cells are allowed to proliferate and possibly further differentiate ex vivo. Scaffolds could be comprised of a wide variety of materials, including both naturally occurring and artificial materials. See, e.g., Lanza, R., et al. (eds.), Principles of Tissue Engineering, 3rd ed., Academic Press, 2007. The replacement organ, tissue, or portion thereof is transplanted into a recipient in need thereof. In some embodiments, iPSCs may be combined with a matrix to form a tissue or organ in vitro or in vivo that may be used to repair or replace a tissue or organ in a recipient mammal (such methods being encompassed by the term "cell therapy"). For example, iPSCs may be cultured in vitro in the presence of a matrix to produce a tissue or organ of the urogenital, cardiovascular, or musculoskeletal system. Alternatively, a mixture of the cells and a matrix may be administered to a mammal for the formation of the desired tissue in vivo. The iPSCs produced according to the invention may be used to produce genetically engineered or transgenic differentiated cells, e.g., by introducing a desired gene or genes, or removing all or part of an endogenous gene or genes of iPSCs produced according to the invention, and allowing such cells to differentiate into the desired cell type. One method for achieving such modification is by homologous recombination, which technique can be used to insert, delete or modify a gene or genes at a specific site or sites in the genome.
This methodology can be used to replace defective genes or to introduce genes which result in the expression of therapeutically beneficial proteins such as growth factors, hormones, lymphokines, cytokines, enzymes, etc. For example, the gene encoding brain derived growth factor may be introduced into iPSCs or stem-like cells derived from such iPSCs, the cells differentiated into neural cells and the cells transplanted into a Parkinson's patient to retard the loss of neural cells during such disease. Using known methods to introduce desired genes/mutations into ES cells, iPSCs may be genetically engineered, and the resulting engineered cells differentiated into desired cell types, e.g., hematopoietic cells, neural cells, pancreatic cells, cartilage cells, etc. Genes which may be introduced into the iPSCs include, for example, epidermal growth factor, basic fibroblast growth factor, glial derived neurotrophic growth factor, insulin-like growth factor (I and II), neurotrophin-3, neurotrophin-4/5, ciliary neurotrophic factor, AFT-1, cytokine genes (interleukins, interferons, colony stimulating factors, tumor necrosis factors (alpha and beta), etc.), genes encoding therapeutic enzymes, collagen, human serum albumin, etc.
Negative selection systems known in the art can be used for eliminating therapeutic cells from a patient or ex vivo if desired. For example, transfecting cells with the thymidine kinase (TK) gene will lead to the production of reprogrammed cells that contain and express the TK gene. Such cells may be selectively eliminated at any time from a patient upon ganciclovir administration. Such a negative selection system is described in U.S. Patent No. 5,698,446, incorporated herein by reference in its entirety. In other embodiments the cells are engineered to contain a gene that encodes a toxic product whose expression is under control of an inducible promoter. Administration of the inducer causes production of the toxic product, leading to death of the cells. Thus any of the somatic cells of the invention may comprise a suicide gene, optionally contained in an expression cassette, which may be integrated into the genome. The suicide gene is one whose expression would be lethal to cells. Examples include genes encoding diphtheria toxin, cholera toxin, ricin, etc. The suicide gene may be under control of expression control elements that do not direct expression under normal circumstances in the absence of a specific inducing agent or stimulus. However, expression can be induced under appropriate conditions, e.g., (i) by administering an appropriate inducing agent to a cell or organism or (ii) if a particular gene (e.g., an oncogene, a gene involved in the cell division cycle, or a gene indicative of dedifferentiation or loss of differentiation) is expressed in the cells, or (iii) if expression of a gene such as a cell cycle control gene or a gene indicative of differentiation is lost. See, e.g., U.S. Pat. No. 6,761,884, incorporated herein by reference in its entirety. In some embodiments the gene is only expressed following a recombination event mediated by a site-specific recombinase.
Such an event may bring the coding sequence into operable association with expression control elements such as a promoter. Expression of the suicide gene may be induced if it is desired to eliminate cells (or their progeny) from the body of a subject after the cells (or their ancestors) have been administered to a subject. For example, if a reprogrammed somatic cell gives rise to a tumor, the tumor can be eliminated by inducing expression of the suicide gene. In some embodiments tumor formation is inhibited because the cells are automatically eliminated upon dedifferentiation or loss of proper cell cycle control.
The iPSCs obtained using methods of the present invention may be used as an in vitro model of differentiation, e.g., for the study of genes which are involved in the regulation of early development. Differentiated cells, tissues, and organs generated using the reprogrammed cells may be used to study effects of drugs and/or identify potentially useful pharmaceutical agents. In some embodiments, differentiated cells or organs or tissues comprising them are introduced into a non-human animal that serves as a model of a disease. The term "disease" as used herein, encompasses, in various embodiments, art-recognized diseases, disorders, syndromes, injuries, impairments of health or conditions of abnormal functioning, e.g., for which medical/surgical treatment would be desirable. The non-human animal may then be assessed, e.g., to evaluate the effects of the introduced cells, organs, or tissues in the model, thus providing means to assess therapeutic potential. Differentiated cells of the invention can also be used for screening or other testing purposes, e.g., to identify compounds of use for treating diseases, to assess the effects of a compound on such cells (e.g., to assess potential toxicity or explore mechanism of action) or to study a cell biological process of interest. For example, neural cells could be used to study neurotransmitter synthesis, release, or uptake and/or to identify compounds that modulate (e.g., promote or inhibit) such processes. Hepatocytes could be used in the study of drug metabolism and/or drug interactions. As another example, cardiomyocytes can be used in study of processes such as action potential generation, repolarization, excitation-contraction coupling or calcium flux and/or to identify compounds that modulate such processes. Compounds so identified could be used in research or in treatment of diseases in which such modulation would be beneficial. The cells could be used in preclinical toxicology studies.
For example, they could be used to assess potential cardiotoxicity, hepatoxicity, neurotoxicity, drug interactions, etc. In still other embodiments, differentiated cells of the invention could be used in screens to identify compounds useful to direct endogenous cells to participate in the repair or regeneration of damaged tissues in vivo.
In some aspects, a composition is disclosed, such composition comprising multiple cells produced by a reprogramming method or protocol of the present invention. In some embodiments cells are considered to be essentially genetically identical if they are generated or descended from a cell or cell sample obtained from a particular subject. In some embodiments non-human mammalian cells are considered to be essentially genetically identical if they are derived from one or more mammals of the same inbred strain (e.g., an inbred mouse strain) or if they are derived from a mammal generated by crossing individuals of two different inbred strains. In some embodiments, methods disclosed herein may be used to derive or culture pluripotent cells of any strain, e.g., mouse strain, or substrain of interest. Numerous strains and substrains are available from The Jackson Laboratory (Bar Harbor, Maine) (http://www.jax.org), e.g., those strains and substrains listed in the JAX® Mice database, which is incorporated herein by reference, or from Taconic (Hudson, NY) or other commercial suppliers. In some embodiments, pluripotent cells are derived from somatic cells obtained from F1 hybrid mice produced by crossing mice of two different inbred strains. In some embodiments a composition comprises at least 10, 10^2, 10^3, 10^4, 10^5, 10^6, 10^7, 10^8, 10^9, 10^10, 10^11 cells, or more.
In some aspects, iPSCs of the present invention can be used to treat various diseases. As used herein, "treat", "treating", "therapy" and similar terms can include amelioration (e.g., reducing one or more symptoms of a disorder), cure, and/or maintenance of a cure (i.e., the prevention or delay of recurrence) of a disorder, or preventing a disorder from manifesting as severely as would be expected in the absence of treatment. Treatment after a disorder has started aims to reduce, ameliorate or altogether eliminate the disorder, and/or at least some of its associated symptoms, to prevent it from becoming more severe, to slow the rate of progression, or to prevent the disorder from recurring once it has been initially eliminated. Treatment can be prophylactic, e.g., administered to a subject that has not been diagnosed with the disorder, e.g., a subject with a significant risk of developing the disorder. For example, the subject may have a mutation associated with developing the disorder. In some embodiments, e.g., in the case of a disorder diagnosed prior to birth, treatment can comprise administering a compound to a subject's mother. In some embodiments, a method of the invention comprises diagnosing a subject as having or being at risk of developing a disease, or providing such a subject, and treating the subject. In some embodiments, a subject diagnosed or treated according to the instant invention is a human. In some embodiments a subject is a non-human mammal, e.g., any of the mammals mentioned herein.
In some embodiments, a method of treating a patient in need of such treatment is disclosed, such method comprising administering to the patient a composition comprising multiple iPSCs produced by a reprogramming method or protocol of the present invention. In some embodiments, the iPSCs are autologous iPSCs derived from a differentiated cell of the patient (e.g., a fibroblast cell) that has been subjected to a reprogramming protocol or produced by a reprogramming method of the present invention. In some embodiments, the iPSCs are autologous iPSCs that have been derived from pathological cells of the patient. In some embodiments, the iPSCs are autologous iPSCs that have been derived from normal or healthy cells of the patient. In some embodiments the iPSCs are derived from cells obtained from a donor other than the subject to whom the cells are to be administered. In some embodiments, the method of treatment comprises reprogramming a differentiated cell of a first type extracted from a patient into a differentiated cell of a second type utilizing a reprogramming method or protocol of the present invention and administering to the patient a composition comprising the autologous differentiated cells of the second type.
In some embodiments, a method of treating an individual in need of such treatment is disclosed, such method comprising: (a) obtaining somatic cells from said individual; (b) reprogramming said somatic cells obtained from said individual with reprogramming factors comprising Sall4, Nanog, Esrrb, and Lin28 according to a reprogramming method or protocol described herein; and (c) administering at least some of said reprogrammed cells to said individual. In some embodiments, the method further comprises separating cells that are reprogrammed to a desired state from cells that are not reprogrammed to a desired state. In some embodiments, said individual is a human.
In some embodiments, the methods of treatment using iPSCs of the present invention can be combined with conventional drugs or therapies to treat a patient in need of such treatment. In some instances, conventional drugs or therapies can be administered to alleviate symptoms associated with a disease or condition from which the patient is suffering. In some instances, conventional drugs or therapies can be administered to prepare the patient for receiving an iPSC based treatment of the present invention. In some instances, conventional drugs or therapies can be administered in combination with one or more iPSC based treatments of the present invention to act in concert to ameliorate the disease or condition.
The present invention contemplates all modes of administration, including intramuscular, intravenous, intraarticular, intralesional, subcutaneous, or any other route sufficient to provide a dose adequate to prevent or treat a disease. The iPSCs may be administered to the mammal in a single dose or multiple doses. When multiple doses are administered, the doses may be separated from one another by, for example, one week, one month, one year, or ten years. One or more growth factors, hormones, interleukins, cytokines, or other cells may also be administered before, during, or after administration of the cells to further bias them towards a particular cell type.
In some aspects, the present invention provides compositions for identifying a reprogramming agent, such compositions comprising one or more cells that express a subset of reprogramming factors selected from the group consisting of Sall4, Nanog, Esrrb and Lin28, and a test agent. A wide variety of compounds or combinations thereof can be used in aspects of the present invention, e.g., as test compounds or agents in the inventive methods. For example, compounds may comprise, e.g., polypeptides, peptides, small organic or inorganic molecules, polysaccharides, polynucleotides, oligonucleotides, peptide nucleic acids, or lipids. "Polypeptide" is used interchangeably herein with "protein". Polypeptides can contain standard amino acids (which refers to the 20 L-amino acids that are most commonly found in naturally occurring proteins) and/or non-standard amino acids or amino acid analogs. One or more of the amino acids in a polypeptide may be modified, for example, by the addition of a moiety such as a carbohydrate group, a phosphate group, a fatty acid group, etc. "Peptide" is used herein to refer to a polypeptide containing 60 amino acids or less. "Polynucleotide" is used herein interchangeably with "nucleic acid" and encompasses single-stranded, double-stranded, and partially double-stranded molecules, double-stranded molecules with overhangs, etc. "Oligonucleotide" refers to a polynucleotide containing 60 nucleotides or less and encompasses antisense oligonucleotides, short interfering RNA (siRNA), and microRNA (miRNA). A polynucleotide can comprise standard nucleosides (which term refers to nucleosides that are most commonly found in DNA or RNA - adenosine, thymidine, guanosine, cytidine, uridine, deoxyadenosine, deoxythymidine, deoxyguanosine, and deoxycytidine), non-standard nucleosides, and/or nucleoside analog(s). Non-standard nucleosides can be naturally occurring nucleosides or may not be known to occur naturally. A non-standard nucleoside or nucleoside analog may differ from a standard nucleoside with regard to the base and/or sugar moiety. Variants of the sugar-phosphate backbone found in DNA or RNA can be used such as phosphorothioates, locked nucleic acids, or morpholinos. Modifications (e.g., nucleoside and/or backbone modifications), non-standard nucleotides, delivery vehicles and systems, etc., known in the art as being useful in the context of siRNA or antisense-based molecules for research or therapeutic purposes are contemplated for use in various embodiments of the invention. Such modifications may, e.g., increase stability, increase cell uptake, reduce clearance from the body, reduce toxicity, reduce off-target effects, or have other effects that may be desirable. "Small molecule" as used herein refers to a molecule having a molecular weight of not more than 1,500 Da, e.g., not more than 1000 Da, e.g., not more than 500 Da. In some embodiments, the candidate compound is a small organic molecule comprising one or more functional groups that mediate structural interactions with proteins, e.g., hydrogen bonding. For example, a compound could comprise amine, carbonyl, hydroxyl or carboxyl group(s). In some embodiments a compound comprises one or more cyclic carbon or heterocyclic rings, e.g., an aromatic or polyaromatic ring substituted with one or more chemical functional groups and/or heteroatoms. In some embodiments, a small molecule has between 5 and 50 carbon atoms, e.g., between 7 and 30 carbons.
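The size-based definitions above ("peptide", "oligonucleotide", "small molecule") can be expressed as simple threshold checks. The following is a minimal sketch only; the function and constant names are hypothetical and are not part of the disclosure, while the numeric limits are the ones stated in the text.

```python
# Illustrative sketch: thresholds come from the definitions in the text;
# all function and constant names are hypothetical.
PEPTIDE_MAX_RESIDUES = 60       # "Peptide": polypeptide of 60 amino acids or less
OLIGO_MAX_NUCLEOTIDES = 60      # "Oligonucleotide": polynucleotide of 60 nucleotides or less
SMALL_MOLECULE_MAX_DA = 1500.0  # "Small molecule": not more than 1,500 Da


def classify_polypeptide(n_residues: int) -> str:
    """Return 'peptide' for chains of 60 residues or less, else 'polypeptide'."""
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return "peptide" if n_residues <= PEPTIDE_MAX_RESIDUES else "polypeptide"


def classify_polynucleotide(n_nucleotides: int) -> str:
    """Return 'oligonucleotide' for 60 nucleotides or less, else 'polynucleotide'."""
    if n_nucleotides <= 0:
        raise ValueError("n_nucleotides must be positive")
    return "oligonucleotide" if n_nucleotides <= OLIGO_MAX_NUCLEOTIDES else "polynucleotide"


def is_small_molecule(mol_weight_da: float, limit_da: float = SMALL_MOLECULE_MAX_DA) -> bool:
    """True if the molecular weight does not exceed the chosen limit.

    The text also mentions narrower limits (1000 Da, 500 Da); pass them via limit_da.
    """
    return 0 < mol_weight_da <= limit_da
```

For instance, `is_small_molecule(900.0, limit_da=500.0)` applies the narrowest limit mentioned in the text rather than the default of 1,500 Da.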
Compounds can be contacted with cells by adding the compound to the culture medium. A range of concentrations can be used. Exemplary concentrations range from picomolar to millimolar, e.g., between 100 pM and 1 mM, e.g., between 10 nM and 500 μM. In some embodiments, a vector that encodes a candidate compound (an RNA or protein) is introduced into cells by an appropriate method and expressed therein to deliver a compound. For example, an expression vector that encodes a short hairpin RNA (shRNA) or microRNA (miRNA) precursor can be introduced into cells.
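As an illustration of covering such a concentration range, the sketch below generates a log-spaced series between two molar concentrations. Log spacing and the number of points per decade are assumptions made for the example; the text specifies only the end points of the exemplary ranges, and the function name is hypothetical.

```python
import math


def dilution_series(low_molar: float, high_molar: float, points_per_decade: int = 1) -> list[float]:
    """Return log-spaced concentrations (in mol/L) from low_molar up to high_molar.

    Assumption: concentrations are spaced evenly on a log10 scale.
    """
    if not (0 < low_molar < high_molar):
        raise ValueError("require 0 < low_molar < high_molar")
    if points_per_decade < 1:
        raise ValueError("points_per_decade must be at least 1")
    decades = math.log10(high_molar / low_molar)
    n_steps = round(decades * points_per_decade)
    return [low_molar * 10 ** (i / points_per_decade) for i in range(n_steps + 1)]


# Exemplary range from the text: 100 pM (1e-10 M) to 1 mM (1e-3 M).
series = dilution_series(1e-10, 1e-3)
```

With one point per decade, the 100 pM to 1 mM range spans seven decades and therefore yields eight concentrations.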
Compounds may be obtained from a wide variety of sources and can comprise compounds found in nature or compounds not known to occur in nature. Compounds can be synthesized or obtained from natural sources. For example, polypeptides may be produced using recombinant DNA technology or synthesized through chemical means such as conventional solid phase peptide synthesis. Numerous techniques are available for the random and directed synthesis of a wide variety of organic compounds. In some embodiments, candidate compounds are provided as mixtures of natural compounds in the form of bacterial, fungal, plant and animal extracts, fermentation broths, conditioned media, etc. In some embodiments, a library of compounds is screened. A library is typically a collection of compounds that can be presented or displayed such that the compounds can be conveniently used in a screening assay. Often, each compound has associated information stored, e.g., in a database, such as the chemical structure, purity, quantity, physicochemical characteristics of the compound and/or information regarding known or suspected biological or biochemical activity. In some embodiments, compounds or mixtures thereof are housed in individual wells (e.g., of microtiter plates), vessels, tubes, etc. Libraries include but are not limited to, for example, phage display libraries, peptide libraries, oligonucleotide libraries, siRNA libraries, shRNA libraries, aptamer libraries, synthetic small molecule libraries, and natural compound libraries. Libraries could comprise multiple different compounds having a similar biological activity of interest. For example, libraries could comprise inhibitors of one or more enzymes or enzyme classes of interest. Exemplary compounds could be kinase inhibitors, phosphatase inhibitors, inhibitors of DNA or histone modifying enzymes (e.g., histone deacetylase inhibitors), etc.
Methods for preparing libraries of molecules are well known in the art, and many libraries are available from commercial or noncommercial sources. In some embodiments, a library comprises between 1,000 and 1,000,000 compounds, or more, e.g., between 10,000 and 500,000 compounds. In many embodiments the candidate compound to be tested is a compound that is not present in ESC or iPSC culture medium or cryopreservation solutions known in the art. In some embodiments a compound to be tested is a compound that is present in at least some ESC or iPSC culture medium or cryopreservation solutions known in the art but is used in a different, e.g., greater, concentration in a method or composition of the present invention.
In some embodiments, said subset of reprogramming factors consists of at least three of said reprogramming factors. For example, two different regulatable systems, each controlling expression of a subset of the factors, can be used to identify reprogramming agents. For example, one might place 3 of the factors under control of a first inducible (e.g., dox-inducible) promoter and the 4th factor under control of a second inducible (e.g., tamoxifen-inducible) promoter. Then, one could generate an iPS cell by inducing expression from both promoters, generate a mouse from this iPS cell, and isolate fibroblasts (or any other cell type) from the mouse. These fibroblasts would be genetically homogeneous and would be reprogrammable without need for viral infection. One would then attempt to reprogram the fibroblasts under conditions in which only the first promoter is active, in the presence of different small molecules that could potentially substitute for the 4th factor, in order to identify small molecule "reprogramming agents" or optimize transient transfection or other protocols for introducing the 4th factor. A number of variations are possible; for example, one might stably induce expression of 3 factors and transiently induce expression of the 4th factor, etc. Any combination of factors can be assessed using the described methods. Also, one can modulate expression levels of the factors by using different concentrations of inducing agent.
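The two-promoter design described above can be enumerated programmatically. The sketch below lists every way of placing three of the four factors under a first inducible promoter and the remaining factor under a second one; the function name and dictionary keys are hypothetical and serve only to illustrate the bookkeeping.

```python
from itertools import combinations

FACTORS = ("Sall4", "Nanog", "Esrrb", "Lin28")


def promoter_splits(factors=FACTORS, n_first=3):
    """Enumerate assignments of factors to two inducible promoters.

    n_first factors go under the first promoter (e.g., dox-inducible);
    the rest go under the second promoter (e.g., tamoxifen-inducible).
    """
    splits = []
    for first in combinations(factors, n_first):
        second = tuple(f for f in factors if f not in first)
        splits.append({"first_promoter": first, "second_promoter": second})
    return splits


splits = promoter_splits()
```

Each entry in `splits` identifies the factor whose function a candidate small molecule would be tested to replace when only the first promoter is active.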
In some embodiments, the composition further comprises an agent that induces expression of said subset of reprogramming factors.
In some embodiments, a method of identifying a reprogramming agent comprises: (a) maintaining said composition comprising one or more cells that express a subset of reprogramming factors selected from the group consisting of Sall4, Nanog, Esrrb and Lin28, and a test agent for a time period under conditions in which said reprogramming factors are expressed and cell proliferation occurs; and (b) assessing the extent to which cells become reprogrammed, wherein the test agent is identified as a reprogramming agent if reprogramming occurs at a similar frequency as would be the case if said composition contained all of said reprogramming factors and lacked said test agent.
In some embodiments, a method of identifying a reprogramming agent is disclosed, such method comprising: (a) maintaining the composition comprising one or more cells that express a subset of reprogramming factors selected from the group consisting of Sall4, Nanog, Esrrb and Lin28, and a test agent for a time period under conditions in which the reprogramming factors are expressed and cell proliferation occurs; and (b) assessing the extent to which cells become reprogrammed, wherein said test agent is identified as a reprogramming agent or enhancer of reprogramming if reprogramming occurs at a significantly greater frequency than would be the case had said composition lacked said test agent. In some embodiments, the composition is maintained for at least X days, wherein X is the number of days that it takes for one or more markers of pluripotency to be expressed in the cells. In some embodiments, the method further comprises determining whether one or more markers of pluripotency are being expressed in the cells.
Suitable methods for determining expression of pluripotency markers are apparent to those skilled in the art. In some embodiments, said test agent is present for at least X days. In some embodiments, X is equal to the number of days during which the composition is maintained. In some embodiments, the test agent is present for a number of days which is less than the number of days in which the composition is maintained. In some embodiments, X is between 1 and 365 days or any intervening particular value or subrange, e.g., between 1 and 180 days, between 2 and 60 days, between 3 and 30 days, to name just a few examples.
It will be understood that in some embodiments of any aspect herein, a reprogramming factor, reprogramming agent, or test agent is added to a composition once or more during a time period. For example, medium can be supplemented with a test agent, e.g., prior to or following medium changes. In some embodiments multiple applications of a reprogramming factor, reprogramming agent, or test agent are used.
In some embodiments, said test agent is identified as a reprogramming agent if cells do not become reprogrammed at a detectable frequency if maintained for said time period in the absence of said test agent but do become reprogrammed at a detectable frequency if maintained in the presence of said test agent for at least a portion of said time period.
In some embodiments, said test agent is identified as an enhancer of reprogramming if cells become reprogrammed at a detectable frequency if maintained for said time period in the absence of said test agent and become reprogrammed at a significantly greater frequency if maintained in the presence of said test agent for at least a portion of said time period.
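The two identification criteria above can be sketched as a small decision rule. The text does not name a statistical test or a threshold, so the one-sided Fisher's exact test, the significance level of 0.05, and the reading of "detectable frequency" as at least one reprogrammed cell counted are all assumptions made for this illustration; the function names are hypothetical.

```python
from math import comb


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    a, b: reprogrammed / not reprogrammed cells WITH the test agent.
    c, d: reprogrammed / not reprogrammed cells WITHOUT the test agent.
    Returns P(X >= a) under the hypergeometric null of equal frequencies.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    numerator = sum(comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, upper + 1))
    return numerator / comb(n, row1)


def classify_test_agent(treated_hits, treated_total, control_hits, control_total, alpha=0.05):
    """Apply the two criteria from the text under the stated assumptions."""
    if control_hits == 0:
        # No detectable reprogramming without the agent (assumed: zero counted).
        return "reprogramming agent" if treated_hits > 0 else "not identified"
    p = fisher_one_sided(treated_hits, treated_total - treated_hits,
                         control_hits, control_total - control_hits)
    return "enhancer of reprogramming" if p < alpha else "not identified"
```

In practice the detection limit of the assay and the choice of test would be set by the experimenter; the rule above only shows how the two definitions partition the possible outcomes.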
In some aspects, nucleic acid constructs comprising reprogramming factors Sall4, Nanog, Esrrb, and Lin28 are disclosed. Some protocols to generate iPSCs call for transduction of the reprogramming factors by as many different viral vectors as there are reprogramming factors to be transduced into a cell to be reprogrammed (e.g., if there are four reprogramming factors, as many as four viral vectors are employed). Reprogramming in this manner involves the selection for the small fraction of infected cells that carry multiple integrated vectors (up to 15 or more proviruses), raising concerns of cancer due to the use of powerful oncogenes and/or retrovirus induced insertional mutagenesis. In some embodiments, a nucleic acid construct comprises a single reprogramming factor packaged into a viral vector.
In some embodiments, to minimize the number of independent proviral integrations required for reprogramming, a nucleic acid construct comprises a polycistronic vector that can transduce any combination of reprogramming factors with a goal of reducing the number of proviral integrations. Such polycistronic nucleic acid constructs, expression cassettes, and vectors that employ internal ribosomal entry sites and self-cleaving peptides and are capable of transducing any combination of reprogramming factors are described in PCT Application Publication No. WO 2009/152529, incorporated herein by reference in its entirety.
The present invention provides polycistronic nucleic acid constructs, expression cassettes, and vectors useful for generating iPSCs. In certain embodiments the polycistronic nucleic acid constructs comprise a portion that encodes a self-cleaving peptide. The invention provides a polycistronic nucleic acid construct comprising at least two coding regions, wherein the coding regions are linked to each other by a nucleic acid that encodes a self-cleaving peptide so as to form a single open reading frame, and wherein the coding regions encode first and second reprogramming factors capable, either alone or in combination with one or more additional reprogramming factors, of reprogramming a mammalian somatic cell to pluripotency. In some embodiments of the invention the construct comprises two coding regions separated by a self-cleaving peptide. In some embodiments of the invention the construct comprises three coding regions each encoding a reprogramming factor, wherein adjacent coding regions are separated by a self-cleaving peptide. In some embodiments of the invention the construct comprises four coding regions each encoding a reprogramming factor, wherein adjacent coding regions are separated by a self-cleaving peptide. The invention thus provides constructs that encode a polyprotein that comprises 2, 3, or 4 reprogramming factors, separated by self-cleaving peptides. In some embodiments the construct comprises expression control element(s), e.g., a promoter, suitable to direct expression in mammalian cells, wherein the portion of the construct that encodes the polyprotein is operably linked to the expression control element(s). The invention thus provides an expression cassette comprising a nucleic acid that encodes a polyprotein comprising the reprogramming factors, each reprogramming factor being linked to at least one other reprogramming factor by a self-cleaving peptide, operably linked to a promoter (or other suitable expression control element). The promoter drives transcription of a polycistronic message that encodes the reprogramming factors, each reprogramming factor being linked to at least one other reprogramming factor by a self-cleaving peptide. The promoter can be a viral promoter (e.g., a CMV promoter) or a mammalian promoter (e.g., a PGK promoter). The expression cassette or construct can comprise other genetic elements, e.g., to enhance expression or stability of a transcript. In some embodiments of the invention any of the foregoing constructs or expression cassettes may further include a coding region that does not encode a reprogramming factor, wherein the coding region is separated from adjacent coding region(s) by a self-cleaving peptide. In some embodiments the additional coding region encodes a selectable marker.
Specific reprogramming factors that may be encoded by the polycistronic construct include transcription factors Sall4, Nanog, Esrrb, and Lin28. The invention encompasses all combinations of two or more of the foregoing factors, in each possible order. For purposes of brevity, not all of these combinations are individually listed herein. In certain embodiments, a nucleic acid construct comprises at least four coding regions linked to each other by nucleic acids that encode a self-cleaving peptide so as to form a single open reading frame, wherein said coding regions encode reprogramming factors Sall4, Nanog, Esrrb, and Lin28, and wherein said reprogramming factors are capable, either alone or in combination with one or more additional reprogramming factors, of reprogramming a mammalian somatic cell to pluripotency.
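Because the combinations "in each possible order" are not listed individually, a short enumeration may help show their extent. The sketch below lists every ordered selection of two or more of the four factors and lays out a cassette with a self-cleaving peptide between adjacent coding regions. The names are hypothetical, and "2A" is a symbolic placeholder rather than an actual sequence.

```python
from itertools import permutations

FACTORS = ("Sall4", "Nanog", "Esrrb", "Lin28")


def ordered_combinations(factors=FACTORS, min_size=2):
    """All ordered selections of min_size or more factors, without repetition."""
    return [
        order
        for size in range(min_size, len(factors) + 1)
        for order in permutations(factors, size)
    ]


def cassette_layout(order, linker="2A"):
    """Interleave coding regions with a symbolic self-cleaving peptide linker."""
    layout = []
    for index, factor in enumerate(order):
        if index > 0:
            layout.append(linker)
        layout.append(factor)
    return layout


all_orders = ordered_combinations()
four_factor_layout = cassette_layout(FACTORS)
```

For four factors this gives 12 ordered pairs, 24 ordered triples, and 24 ordered quadruples, i.e., 60 arrangements in total.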
In some embodiments, a nucleic acid construct of the present invention includes a fifth coding region that encodes a fifth reprogramming factor, wherein the five coding regions are linked to each other by nucleic acids that encode self-cleaving peptides so as to form a single open reading frame. In some embodiments, said fifth reprogramming factor is c-Myc.
In some embodiments, a nucleic acid construct of the present invention includes fifth and sixth coding regions that encode fifth and sixth reprogramming factors, wherein said six coding regions are linked to each other by nucleic acids that encode self-cleaving peptides so as to form a single open reading frame. In some embodiments, said fifth reprogramming factor is c-Myc and said sixth reprogramming factor is Klf4.
In some embodiments, the self-cleaving peptide is a viral 2A peptide. In some embodiments, the self-cleaving peptide is an aphthovirus 2A peptide.
In some embodiments, the nucleic acid construct of the present invention is capable of reprogramming a somatic cell to a pluripotent state in the absence of one or more of the canonical reprogramming factors. In some embodiments, the nucleic acid construct does not encode Oct4. In some embodiments, the nucleic acid construct does not encode Klf4. In some embodiments, the nucleic acid construct does not encode Sox2. In some embodiments, the nucleic acid construct does not encode c-Myc.
In some aspects, expression cassettes comprising a nucleic acid construct of the present invention are disclosed. In some embodiments, an expression cassette comprising a nucleic acid construct is operably linked to a promoter, wherein said promoter drives transcription of a polycistronic message that encodes said reprogramming factors, each reprogramming factor being linked to at least one other reprogramming factor by a self-cleaving peptide. In some embodiments, the expression cassette comprises one or more sites that mediate integration into a genome of a mammalian cell. In some embodiments, the expression cassette is integrated into said genome at a locus whose disruption has minimal or no effect on said cell.
In some embodiments the construct comprises one or more sites that mediate or facilitate integration of the construct into the genome of a mammalian cell. In some embodiments the construct comprises one or more sites that mediate or facilitate targeting the construct to a selected locus in the genome of a mammalian cell. For example, the construct could comprise one or more regions homologous to a selected locus in the genome.
In some embodiments the construct comprises sites for a recombinase that is functional in mammalian cells, wherein the sites flank at least the portion of the construct that comprises the coding regions for the factors (i.e., one site is positioned 5' and a second site is positioned 3' to the portion of the construct that encodes the polyprotein), so that the sequence encoding the factors can be excised from the genome after reprogramming. The recombinase can be, e.g., Cre or Flp, where the corresponding recombinase sites are LoxP sites and Frt sites. In some embodiments the recombinase is a transposase. It will be understood that the recombinase sites need not be directly adjacent to the region encoding the polyprotein but will be positioned such that a region whose eventual removal from the genome is desired is located between the sites. In some embodiments the recombinase sites are on the 5' and 3' ends of an expression cassette. Excision may result in a residual copy of the recombinase site remaining in the genome, which in some embodiments is the only genetic change resulting from the reprogramming process.
In some embodiments the construct comprises a single recombinase site, wherein the site is copied during insertion of the construct into the genome such that at least the portion of the construct that encodes the polyprotein comprising the factors (and, optionally, any other portion of the construct whose eventual removal from the genome is desired) is flanked by two recombinase sites after integration into the genome. For example, the recombinase site can be in the 3' LTR of a retroviral (e.g., lentiviral) vector.
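The excision of a cassette flanked by two recombinase sites, and the residual single site it can leave behind, can be pictured with a purely symbolic model. In the sketch below a locus is a list of named elements; the element names are hypothetical, and the model assumes two sites in the same orientation, which is the arrangement that produces excision rather than inversion.

```python
def excise(elements, site="loxP"):
    """Remove everything between two recombinase sites, keeping one residual site.

    `elements` is an ordered list of symbolic names for a locus.
    Assumption: exactly two copies of `site`, in direct (same) orientation.
    """
    positions = [i for i, name in enumerate(elements) if name == site]
    if len(positions) != 2:
        raise ValueError(f"expected exactly two {site} sites, found {len(positions)}")
    first, second = positions
    # Keep everything up to and including the first site, then resume after the second.
    return elements[: first + 1] + elements[second + 1:]


integrated = ["genome_5prime", "loxP", "promoter", "polyprotein_ORF", "loxP", "genome_3prime"]
after_excision = excise(integrated)
```

The result retains one copy of the site between the flanking genomic elements, matching the statement that a residual recombinase site may be the only genetic change left after reprogramming.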
In some aspects, the invention provides expression vectors comprising the polycistronic nucleic acid constructs. In some embodiments the expression vectors are retroviral vectors, e.g., lentiviral vectors. In other embodiments the expression vectors are non-retroviral vectors, e.g., which may be viral (e.g., adenoviral) or non-viral. In some embodiments, the expression vector includes an inducible promoter.
In some aspects, the invention provides cells and cell lines (e.g., somatic cells and cell lines such as fibroblasts, keratinocytes, and cells of other types discussed herein) in which a polycistronic nucleic acid construct or expression cassette (e.g., any of the constructs or expression cassettes described herein) is integrated into the genome. In some embodiments the cells are rodent cells, e.g., murine cells. In some embodiments the cells are primate cells, e.g., human cells.
In some embodiments at least the portion of the construct that encodes the polyprotein is flanked by sites for a recombinase. After a reprogrammed cell is derived, a recombinase can be introduced into the cell, e.g., by protein transduction, or a gene encoding the recombinase can be introduced into the cell, e.g., using a vector such as an adenoviral vector. The recombinase excises the sequences encoding the exogenous reprogramming factors from the genome. In some embodiments the cells contain an inducible gene that encodes the recombinase, wherein the recombinase is expressed upon induction and excises the cassette. In some embodiments the inducible gene is integrated into the genome. In some embodiments the inducible gene is on an episome. In some embodiments the cells do not contain an inducible gene encoding the recombinase.
In some embodiments, the nucleic acid construct or cassette is targeted to a specific locus in the genome, e.g., using homologous recombination. In some embodiments the locus is one that is dispensable for normal development of most or all cell types in the body of a mammal. In some embodiments the locus is one into which insertion does not affect the ability to derive pluripotent iPS cells from a somatic cell having an insertion in the locus. In some embodiments the locus is one into which insertion would not perturb pluripotency of an iPSC. In some embodiments the locus is the COL1A1 locus or the AAV integration locus. In some embodiments the locus comprises a constitutive promoter. In some embodiments the construct or cassette is targeted so that expression of the polycistronic message encoding the polypeptide comprising the factors is driven from an endogenous promoter present in the locus to which the construct or cassette is targeted.
The invention further provides pluripotent reprogrammed cells (iPSCs) generated from the somatic cells that harbor the nucleic acid construct or expression cassette in their genome. The iPS cells can be used for any purpose contemplated for pluripotent cells. Further provided are differentiated cell lines (e.g., neural cells, hematopoietic cells, muscle cells, cardiac cells), derived from the pluripotent reprogrammed cells.
In some aspects, the present invention provides a reprogramming composition, such composition comprising reprogramming factors selected from the group consisting of Sall4 protein, Nanog protein, Esrrb protein, and Lin28 protein, or functional variants or fragments thereof. In some embodiments, each of said reprogramming factors comprises a cell-penetrating peptide fused to its C terminus. In some embodiments, said cell-penetrating peptide comprises poly-arginine.
In some aspects, methods of producing a pluripotent cell from a somatic cell are disclosed, such methods comprising the steps of: (a) introducing exogenous reprogramming factors Sall4, Nanog, Esrrb, and Lin28 into one or more somatic cells; and (b) maintaining said one or more cells under conditions appropriate for and for a period of time sufficient for said exogenous reprogramming factors to activate at least one endogenous pluripotency gene. In some embodiments, said period of time comprises a stochastic phase of reprogramming. In some embodiments, said cells are maintained for a period of time sufficient for said exogenous reprogramming factors to initiate a sequential phase of reprogramming. In some embodiments, such methods further comprise the step of (c) selecting one or more cells which display an early marker of pluripotency. In some embodiments, said early marker of pluripotency is selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2. In some embodiments, said early marker of pluripotency is a group of early markers of pluripotency consisting of Esrrb, Utf1, Lin28, and Dppa2. In some embodiments, step (c) comprises selecting one or more cells which display an early marker of pluripotency and at least one marker of pluripotency. In some embodiments, such methods further comprise the step of (d) generating an embryo utilizing said one or more cells which display the early marker of pluripotency. In some embodiments, said embryo is a chimeric embryo. In some embodiments, such methods further comprise the step of (e) obtaining one or more somatic cells from said embryo. In some embodiments, such methods further comprise the step of (f) maintaining said one or more somatic cells under conditions appropriate for and for a period of time sufficient for said exogenous reprogramming factors to activate at least one endogenous pluripotency gene. In some embodiments, such methods further comprise the step of (g) differentiating between cells which display one or more markers of pluripotency and cells which do not.
In some aspects, the present invention provides an iPSC produced by a method comprising: (a) introducing exogenous reprogramming factors Sall4, Nanog, Esrrb, and Lin28 into one or more somatic cells; and (b) maintaining said one or more cells under conditions appropriate for and for a period of time sufficient for said exogenous reprogramming factors to activate at least one endogenous pluripotency gene. In some embodiments, said period of time comprises a stochastic phase of reprogramming. In some embodiments, said cells are maintained for a period of time sufficient for said exogenous reprogramming factors to initiate a sequential phase of reprogramming. In some embodiments, such method further comprises (c) selecting one or more cells which display an early marker of pluripotency. In some embodiments, said early marker of pluripotency is selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2. In some embodiments, said early marker of pluripotency is a group of early pluripotency markers consisting of Esrrb, Utf1, Lin28, and Dppa2. In some embodiments, step (c) comprises selecting one or more cells which display an early marker of pluripotency and at least one marker of pluripotency. In some embodiments, such method further comprises (d) generating an embryo utilizing said one or more cells which display the early marker of pluripotency. In some embodiments, such embryo comprises a chimeric embryo. In some embodiments, such method further comprises (e) obtaining one or more differentiated somatic cells from said embryo. In some embodiments, such method further comprises (f) maintaining said one or more differentiated somatic cells under conditions appropriate for and for a period of time sufficient for said reprogramming factors to activate at least one endogenous pluripotency gene. In some embodiments, such method further comprises (g) differentiating between cells which display one or more markers of pluripotency and cells which do not.
In some embodiments, said iPSC comprises a primary iPSC. In some embodiments, said iPSC comprises a secondary iPSC.
In some aspects, the present invention provides a method of selecting a somatic cell that is likely to be reprogrammed to a pluripotent state, such method comprising (a) measuring expression of one or more early markers of pluripotency in a population of a plurality of somatic cells; (b) sorting the population of the plurality of somatic cells into a plurality of populations of single somatic cells; and (c) measuring expression of the one or more early markers of pluripotency in each population of single somatic cells, wherein increased expression of the one or more early markers of pluripotency in each population of single somatic cells as compared to expression of the one or more early markers of pluripotency in the population of the plurality of somatic cells indicates that the single somatic cell is a somatic cell that is likely to be reprogrammed to the pluripotent state. In some embodiments, said one or more early markers of pluripotency are selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2. It should be appreciated that the steps of sorting the somatic cells and measuring expression of the one or more early markers of pluripotency can be accomplished by various methods which are well known in the art (e.g., see Example 4 below).
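The comparison in step (c) can be illustrated with a minimal Python sketch. It is an illustration only, not the measurement procedure of the invention: the function name, the dictionary layout, and the rule that every measured marker must exceed the population value are assumptions made for the example, and the expression values are arbitrary units.

```python
# Hypothetical helper: flags single cells whose early-marker expression
# exceeds the value measured in the unsorted population.
EARLY_MARKERS = ("Esrrb", "Utf1", "Lin28", "Dppa2")

def likely_to_reprogram(population_expr, single_cell_expr, markers=EARLY_MARKERS):
    """Return ids of single cells with increased early-marker expression.

    population_expr: dict mapping marker name to population-level expression.
    single_cell_expr: dict mapping cell id to a dict of marker expression.
    Assumption: a cell is selected only if every marker measured in both
    the cell and the population is higher in the cell.
    """
    selected = []
    for cell_id, expr in single_cell_expr.items():
        measured = [m for m in markers if m in expr and m in population_expr]
        if measured and all(expr[m] > population_expr[m] for m in measured):
            selected.append(cell_id)
    return selected
```

A looser rule (for example, any single marker increased) could be substituted by replacing `all` with `any`; the text above does not fix that choice.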
In some aspects, the present invention provides a method of selecting a cell that is likely to become programmed to a pluripotent state, such method comprising (a) maintaining a population of a plurality of differentiated somatic cells containing at least one exogenously introduced factor that contributes to reprogramming of said cells to a pluripotent state under conditions appropriate for proliferation and for reprogramming of said cells to occur; (b) sorting said population of said plurality of cells into a plurality of populations of single cells; and (c) isolating said sorted cells which display one or more early markers of pluripotency, wherein each sorted cell which displays said one or more early markers of pluripotency is a cell that is likely to become programmed to the pluripotent state. In some embodiments, said one or more early markers of pluripotency are selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2. The sorting and isolating steps of the inventive method can be accomplished according to routine methods well known to those of ordinary skill in the art. Exemplary methods of sorting and isolating such cells can be found in Example 4 below.
In some aspects, the present invention provides a method for increasing the efficiency of the expansion of induced pluripotent stem cells, such method comprising (a) maintaining a population of differentiated somatic cells that contains at least one exogenously introduced factor that contributes to reprogramming of said population of cells to a pluripotent state under conditions appropriate for proliferation and for reprogramming of said cells to occur; (b) monitoring each cell in said population of cells for the expression of one or more early pluripotency markers, wherein cells expressing the one or more early pluripotency markers are more likely to become programmed to a pluripotent state than cells which do not express the one or more early pluripotency markers; (c) isolating each cell in said population of cells that expresses the one or more early pluripotency markers; and (d) expanding only those cells which express the one or more early pluripotency markers, thereby increasing the efficiency of the expansion of induced pluripotent stem cells. In some embodiments, said one or more early pluripotency markers are selected from the group consisting of Esrrb, Utf1, Lin28, Dppa2, and combinations thereof. In some embodiments, said monitoring of said cells is performed during a stochastic phase of reprogramming. In some embodiments, proliferation of said cell forms a clonal colony of said cell. The steps of the method for increasing the efficiency of the expansion of iPSCs (e.g., maintaining, monitoring, isolating, and expanding) can be accomplished by performing methods routinely performed by those of ordinary skill in the art, some of which are described with particularity in the experimental methods section below.
In some aspects, the present invention provides a method of increasing the likelihood that a differentiated somatic cell subjected to a reprogramming protocol will become reprogrammed to an iPSC, comprising introducing into the differentiated somatic cell one or more early pluripotency factors prior to subjecting the differentiated somatic cell to said reprogramming protocol. During the course of work described herein, Applicants have observed that the early markers of pluripotency of the present invention are more predictive than conventional pluripotency markers in identifying cells which are destined to become iPSCs. For example, when these early pluripotency markers are observed in a cell undergoing a reprogramming protocol, the cell is more likely to become an iPSC as compared to cells undergoing the same reprogramming protocol which do not display the early pluripotency markers.
Accordingly, without wishing to be bound by theory, it is believed that one or more early pluripotency markers can serve as early pluripotency factors that can be introduced into a differentiated cell to increase the likelihood that the cell will become reprogrammed to an iPSC. In some embodiments, said one or more early pluripotency factors are selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2.
In some embodiments, the present invention includes a method of reprogramming a cell primed according to the priming method described above, such method comprising subjecting the primed cell to a reprogramming method of generating a reprogrammed cell of the present invention. In some aspects, the present invention provides a method of isolating an iPS colony, such method comprising: (a) introducing exogenous reprogramming factors Sall4, Nanog, Esrrb, and Lin28 into a differentiated mammalian somatic cell; (b) culturing said differentiated somatic cell in a suitable medium under conditions appropriate for and for a time period sufficient for proliferation of and reprogramming of said cells to occur; and (c) isolating one or more colonies visible in said culture after said period of time. In some embodiments, each of said exogenous reprogramming factors is introduced into said cell in the form of a recombinant protein comprising a cell-penetrating peptide fused to a C terminus of said recombinant protein. In some embodiments, each of said exogenous reprogramming factors is introduced into said cell in the form of mRNA optionally complexed with a cationic vehicle, wherein said mRNA comprises in vitro transcribed mRNA comprising one or more of a 5' cap, an open reading frame flanked by a 5' untranslated region containing a strong Kozak translation initiation signal and an alpha-globin 3' untranslated region, a polyA tail, and one or more modifications which confer stability to the mRNA. In some embodiments, such method further comprises (d) growing said isolated one or more colonies on a layer of feeder cells in the absence of an inducer of said inducible transgenes. In some embodiments, such method further comprises (e) passaging said one or more grown colonies at least once.
In some aspects, a method of enhancing isolation of iPSCs is disclosed, such method comprising (d) sorting said one or more colonies visible in said culture after said period of time according to step (c) of the method of isolating an iPS colony into single cells; (e) differentiating between said sorted cells which display one or more early markers of pluripotency and said sorted cells which do not display one or more early markers of pluripotency; and (f) isolating said sorted cells which display one or more early markers of pluripotency. In some embodiments, said early markers of pluripotency are a combination of early pluripotency markers selected from any of Esrrb, Utf1, Lin28, and Dppa2.
In some aspects, the present invention provides mouse iPS cells (e.g., cell lines) characterized by an efficiency of said mouse iPS cell of generating live offspring by tetraploid complementation, wherein said efficiency is at least 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, or more, or any intervening particular value or subrange, such as between 5% and 10%, between 10% and 15%, etc. In some aspects, the present invention provides mouse iPS cells characterized by an ability of said mouse iPS cells of generating live offspring by tetraploid complementation, wherein at least some of said live offspring survive to adulthood. In some aspects, the present invention provides mouse iPS cells characterized by an ability of said mouse iPS cells of generating live offspring by tetraploid complementation, wherein at least some of said live offspring are born naturally (without requiring C-section).
In some embodiments of any aspect herein relating to mouse iPS cells, cell lines, or animals derived therefrom, comparable rat iPS cells/cell lines/animals are provided.
In some embodiments, a mouse iPS cell characterized by an efficiency of said mouse iPSC of generating live offspring by tetraploid complementation is produced by a method comprising: (a) transfecting mouse embryonic fibroblasts with a doxycycline-inducible vector comprising reprogramming factors Sall4, Nanog, Esrrb and Lin28 operably linked to a tetracycline operator and a CMV promoter; (b) culturing said mouse embryonic fibroblasts under conditions suitable for and for a time period sufficient for proliferation and reprogramming of said mouse embryonic fibroblasts to occur; (c) exposing said culture to an effective amount of doxycycline for a period of time sufficient for one or more iPS colonies to form; (d) isolating said one or more iPS colonies; (e) growing said isolated iPS colonies on feeder cells in the absence of doxycycline; and optionally (f) passaging said grown iPS colonies at least once prior to carrying out tetraploid complementation.
In some aspects, the present invention provides a collection of reprogramming factors comprising Sall4, Nanog, Esrrb, and Lin28, said collection being capable of producing a mouse iPS cell characterized by an efficiency of said mouse iPSC of generating live offspring by tetraploid complementation of at least 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, or more, or any intervening subrange.
In some aspects, the invention provides a variety of kits. A kit can contain any of the cells or compounds described herein or combinations thereof. In some aspects, the invention provides a kit containing cells of an iPSC line of the invention. The cells can be provided frozen. In some embodiments, the kit further comprises at least one item selected from the group consisting of (a) instructions for thawing, culturing, and/or characterizing the iPSCs; (b) reagent(s) useful for characterizing the iPSCs. Such reagent could be, e.g., antibody(ies) for detecting a cell marker or probe(s) (e.g., for performing FISH).
The invention further provides a kit for generating a reprogrammed cell in vitro, such kit comprising: (a) a set of reprogramming factors comprising Sall4, Nanog, Esrrb and Lin28, which are capable alone, or in combination with one or more additional reprogramming factors, of reprogramming said mammalian somatic cells to a pluripotent state, wherein the kit optionally comprises (b) a medium suitable for culturing mammalian iPS cells and/or (c) a population of mammalian somatic cells, and wherein the reprogramming factors are optionally provided as one or more nucleic acids (e.g., one or more vectors) encoding said reprogramming factors. In some embodiments, the kit further comprises (d) one or more reagents for an assay for detecting one or more markers of pluripotency. Suitable reagents for such an assay for detecting one or more markers of pluripotency are apparent to those skilled in the art. In some embodiments, the one or more markers of pluripotency are selected from the group consisting of Fbxo15, Nanog, Oct4, Sox2, Sall4 and combinations thereof. In some embodiments, the one or more markers of pluripotency are early markers of pluripotency selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2. In some embodiments, the kit further includes (e) instructions for preparing the medium; (f) instructions for deriving or culturing pluripotent cells; (g) serum replacement; (h) albumin; (i) at least one protein or small molecule useful for deriving or culturing iPS cells, wherein the protein or small molecule activates or inhibits a signal transduction pathway; and (j) at least one reagent useful for characterizing pluripotent cells. In some embodiments, at least some of the ingredients are dissolved in liquid. In some embodiments, at least some of the ingredients are provided in dry form.
In some embodiments of any aspect herein, it is contemplated to use Dppa2 as a reprogramming factor, either alone or in combination with one or more additional reprogramming factors or reprogramming agents. In some embodiments Dppa2 is used, either alone or in combination with one or more additional reprogramming factors or reprogramming agents, to replace Nanog. For example, in some embodiments Sall4, Lin28, Esrrb, and Dppa2 (SLED) are used in any of the compositions, methods, kits, cells, or vectors described herein. In some embodiments Sall4, Lin28, Esrrb, and Dppa2 (SLED) are used to reprogram a cell to a less differentiated state, e.g., a pluripotent state.
In some embodiments of any aspect herein, it is contemplated to use Esrrb as a reprogramming factor and a ligand (e.g., an agonist) for Esrrb as a reprogramming agent. For example, in some embodiments a ligand may enhance nuclear translocation or activity of Esrrb.
In some embodiments of any aspect herein, Lin28 is supplemented by, or replaced as a reprogramming factor by, any of a variety of different reprogramming factors or reprogramming agents. For example, in some embodiments Ezh2, Kdm1, and/or Utf1 is used instead of, or in addition to, Lin28 in any of the compositions or methods herein. In some embodiments Ezh2, Kdm1, and/or Utf1 is used instead of, or in addition to, Lin28 to reprogram a cell to a less differentiated state, e.g., a pluripotent state. For example, in some embodiments reprogramming is performed using Sall4, Esrrb, Dppa2, and Ezh2. In some embodiments reprogramming is performed using Sall4, Esrrb, Dppa2, and Kdm1. In some embodiments reprogramming is performed using Sall4, Esrrb, Dppa2, and Utf1.
It is contemplated that Lin28 can be omitted from reprogramming factor combinations without necessarily replacing it by a different reprogramming factor or reprogramming agent. In some embodiments Lin28 is omitted from a composition, kit, or method herein. For example, in some embodiments reprogramming is performed without Lin28, e.g., using a combination comprising or consisting of Sall4, Nanog, and Esrrb or using a combination comprising or consisting of Sall4, Esrrb, and Dppa2.
In some embodiments of any aspect herein, it is contemplated that reprogrammed cells, e.g., iPSCs, generated as described herein (e.g., using SNEL reprogramming factors) are more suitable for use in cell therapy as compared with reprogrammed cells generated using at least some other methods, e.g., generated through use of at least 1, 2, 3, or all 4 of the OKSM factors.
In some embodiments of any aspect herein, it is contemplated that reprogrammed cells, e.g., iPSCs, generated as described herein (e.g., using SNEL reprogramming factors) have reduced immunogenicity as compared with reprogrammed cells generated using at least some other methods, e.g., generated through use of at least 1, 2, 3, or all 4 of the OKSM factors.
In some embodiments of any aspect herein, it is contemplated that reprogrammed cells, e.g., iPSCs, generated as described herein (e.g., using SNEL reprogramming factors) have reduced tumorigenicity as compared with reprogrammed cells generated using at least some other methods, e.g., generated through use of at least 1, 2, 3, or all 4 of the OKSM factors.
In some embodiments the disclosure provides a gene expression signature that may be used for a variety of purposes. In some embodiments the gene expression signature comprises expression levels of the genes listed in Table S1 or counterparts thereof (e.g., orthologs in other organisms, e.g., humans). In some embodiments measurement of expression levels of the genes or a subset thereof may be used to identify iPS cells that exhibit high developmental potential (e.g., as compared with iPS cells generated using the OKSM factors). In some embodiments measurement of expression levels of the genes or a subset thereof may be used to identify iPS cells that exhibit superior quality (e.g., as compared with iPS cells generated using the OKSM factors). In some embodiments a subset comprises at least 10, 20, 50, 100, 200, 300, 500, 700, 900, 1100, 1300, or 1500 genes listed in Table S1. Gene expression levels may be measured by measuring mRNA, protein or other gene product. Any suitable method may be used. In some embodiments gene expression may be measured using RNA-Seq, microarray analysis, or quantitative PCR. In some embodiments iPSCs are classified based on the gene expression profile. For example, whether the gene expression profile of the iPSCs more closely resembles that of high quality iPSCs or that of poor quality iPSCs may be determined. Hierarchical clustering or PCA may be used, for example, to determine whether a particular iPSC population (e.g., colony, culture, cell line, etc.) clusters with high quality iPSCs as described herein or clusters with poor quality iPSCs as described herein. iPSCs of superior quality (e.g., that cluster with high quality iPSCs as described herein) may, for example, have lower tumorigenic potential, have lower immunogenicity, be easier to maintain or manipulate in culture, or have increased developmental potential.
In some embodiments a gene expression signature may be used in identifying compounds or conditions that promote formation of superior quality iPSCs. For example, compounds or conditions may be used in a reprogramming protocol and their effect on gene expression profile of somatic cells subjected to the reprogramming protocol may be assessed. Compounds that promote a gene expression profile resembling that of high quality iPSCs may be identified. Such compounds may be used in a reprogramming protocol to generate iPSCs, e.g., high quality iPSCs.
* * *
Examples
Example 1: Single-cell expression profiling at defined time points during the reprogramming process
To measure gene expression in single cells at defined time points during the reprogramming process, we combined two novel complementary tools: (i) 96.96 Dynamic Array chips (Fluidigm), which allow quantitative analysis of 48 genes in duplicate in 96 single cells (Citri et al., 2012; Diehn et al., 2009; Guo et al., 2010; Narsinh et al., 2011), and (ii) single-molecule-mRNA fluorescent in situ hybridization (sm-mRNA-FISH), which allows the quantification of mRNA transcripts of up to three genes in hundreds of cells (Raj et al., 2008). Fluidigm analysis involves the sorting of single cells, lysis, cDNA synthesis, pre-amplification of targets, and quantification of gene expression using TaqMan quantitative real-time polymerase chain reactions (qRT-PCR) on the BioMark system (Guo et al., 2010). sm-mRNA-FISH entails probing each mRNA species with 48 fluorophore-labeled oligonucleotide probes, imaging mRNAs by fluorescence microscopy, and quantifying and assigning mRNAs to single cells (Raj et al., 2008).
We selected the gene candidates based on the major events that occur during the reprogramming process (Table 1) (Gonzalez et al., 2011).
Table 1. Selection of 48 Candidate Genes
[Table 1: image not reproduced]
Recently, a study using short hairpin RNAs targeting genes in DNA and histone methylation pathways in human fibroblasts found chromatin-modifying proteins to be positive and negative regulators of reprogramming (Onder et al., 2012). Because reprogramming requires a vast number of epigenetic changes, we selected a group of ES-associated chromatin remodeling genes and modification enzymes: Myst3, Kdm1, Hdac1, Dnmt1, Prmt7, Ctcf, Myst4, Dnmt3b, Ezh2, Bmi1 (Branco et al., 2008; Farthing et al., 2008; Feng et al., 2010; Hemberger et al., 2009; Kurukuti et al., 2006; Moon et al., 2011; Morgan et al., 2005; Reik, 2007; Singhal et al., 2010; Surani et al., 2007). Since high proliferative capacity is essential to facilitate the reprogramming process, we selected ESC cell cycle regulator genes: Bub1, Cdc20, Mad2l1, Ccnf (Ballabeni et al., 2011; Banito et al., 2009; Edel and Izpisua Belmonte, 2010; Hong et al., 2009; Kawamura et al., 2009; Li et al., 2009; Utikal et al., 2009). We also profiled key genes that are active in signal transduction pathways important for ES cell maintenance and differentiation (Bmpr1a, Stat3, Ctnnbl1, Nes, Wnt1, Gsk3b, Csnk2a1, Lifr, Hes1, Jag1, Notch1, Fgf5, Fgf4) (Barrero et al., 2010; Boiani and Scholer, 2005; Marson et al., 2008; Samavarchi-Tehrani et al., 2010; Varga and Wrana, 2005).
Finally, we selected a large number of pluripotency marker genes in an attempt to detect early and late markers for the reprogramming process (Oct4, Sox2, Nanog, Lin28, Fbxo15, Zfp42, Fut4, Tbx3, Esrrb, Dppa2, Utf1, Sall4, Gdf3, Grb2, Slc2a1, Fthl17, Nr6a1) (Ang et al., 2011; Cavaleri and Scholer, 2003; Furusawa et al., 2006; Ivanova et al., 2006; Kim et al., 2009a; Kim et al., 2009b; Lu et al., 2009; Macarthur et al., 2009; Maldonado-Saldivia et al., 2007; Moore and Lemischka, 2006; Ng and Surani, 2011; Pritsker et al., 2006; Ramalho-Santos et al., 2002; Scholer, 1991; Scholer et al., 1990; Silva et al., 2009; Sterneckert et al., 2012; Tiscornia and Izpisua Belmonte, 2010; Viswanathan and Daley, 2010; Viswanathan et al., 2008; West et al., 2009). We chose Gapdh and Hprt as housekeeping control genes and Thy1 and Col5a2 as gene markers for MEFs.
For analysis of reprogramming at the single-cell level, we used previously characterized, clonal doxycycline (dox)-inducible 'secondary' NGFP2 MEFs (Wernig et al., 2008). Briefly, these cells contain pro-viral integrations of Oct4, Sox2, Klf4, and c-Myc, each under the TetO promoter, reverse tetracycline transactivator (rtTA) in the Rosa26 locus, and a GFP reporter knocked into the Nanog locus (Silva et al., 2009). To detect early transcriptional changes in the reprogramming process, NGFP2 MEFs were exposed to dox for two, four, and six days. At each time point, the cells were imaged and sorted to single cells, and gene expression was profiled using the Fluidigm BioMark system (Figures 1A and 1B). In the NGFP2 system, the first colonies appeared around seven days after the addition of dox. However, at that time point, the majority of the cells in the culture were senescent, contact inhibited, or transformed, and single sorting of these cells would fail to identify those rare cells that are destined to become iPSCs. To overcome this technical limitation, we generated secondary cells that, in addition to the Nanog-GFP gene, carried a tdTomato reporter (tdTomato-NGFP2 MEFs) (Figures 2A-2C). The presence of the tdTomato reporter enabled us to sort single secondary cells in the presence of unmarked feeder cells. Unmarked feeder cells were important both for cell-cell interactions that enable proliferation of the tdTomato-single cells and for the calibration of the FACS machine before sorting (i.e., tdTomato-positive cells vs. tdTomato-negative cells). This system allowed us to trace those rare tdTomato-positive cells that bypassed senescence and contact inhibition and continued to proliferate to form clonal colonies on top of the feeders. tdTomato-NGFP2 MEFs were exposed to dox for six days and sorted for tdTomato-positive cells, which were then seeded each as a single cell in one well of four 24-well plates containing unmarked feeders.
At different time points (between one and three weeks) during the reprogramming process, tdTomato-positive colonies that were derived from the single cells were imaged, split to another plate, sorted to single cells and analyzed for their transcriptional profile using the Fluidigm BioMark. Each parental cell was passaged to test its capacity to generate dox-independent, fully reprogrammed iPSCs. This system allowed us to trace gene expression changes in multiple clonally related single sister cells over different times during the reprogramming process. Clonal populations were passaged on dox and gene expression was profiled in single cells as a function of time in three subpopulations: (i) early dox-dependent GFP- cells, (ii) intermediate dox-dependent GFP- and GFP+ cells, and (iii) dox-independent GFP+ cells (Figures 1C and 1D).
Out of 96 tdTomato-positive single cells, only seven cells generated a colony, reflecting the low efficiency of the process. Single cells in these seven clonal populations (colonies 15, 16, 20, 23, 34, 43 and 44) were profiled over the course of 94 days (Figure 1E). Cells were sorted for GFP when GFP was bright enough to be seen using an inverted fluorescence microscope. Colonies 34, 20, and 43 gave rise to dox-independent cells relatively early in the process, whereas Colony 16 gave rise to dox-independent cells very late in the process. Colonies 23 and 44 did not give rise to stable GFP colonies during constant exposure to dox up to day 81. During the process, Colony 44 contained a few cells with a very low level of GFP (Figures 2A-2C) that disappeared upon continual passaging and dox withdrawal. A few cells (0.01%) from Colony 23 activated GFP at day 81, but those cells did not give rise to stable iPSC colonies.
Example 2: Behavior of single cells during the reprogramming process within a cell population
For each profiled subpopulation we obtained replicate gene expression data for 48 genes in 96 single cells. The Fluidigm microfluidics system combines samples and primer-probe sets into 9216 qRT-PCR reactions. The output of one run on the BioMark system is a 96x96 matrix of cycle threshold (Ct) values (Figures 3A-3B). The normalized expression value of a gene in an individual cell was derived by normalizing the average Ct of the gene replicates to the average Ct values of the control genes Hprt and Gapdh of that cell. Cells with low or absent endogenous control gene expression levels were removed from analysis (for more details, see Supplemental Methods). To globally visualize the gene expression data and attempt to identify meaningful clusters among cell populations, we used principal component analysis (PCA) to transform the 48-dimensional gene expression data into 48 principal components (PCs) and projected the gene expression data onto the two most important principal components, PC1 and PC2 (Figure 4A). Applying PCA to the expression data of 1864 cells from different stages during the reprogramming process, we found that the first principal component (PC1) explains 22.5% of the observed variance while the second principal component (PC2) explains 5.8% (Supplemental Methods). These values are lower relative to a recent single-cell study performed in 64-celled embryos (Guo et al., 2010). Two factors may account for this discrepancy. First, we applied expression data derived from 12 times as many cells as the previous study. Second, the high degree of heterogeneity during the reprogramming process is predicted to strongly contribute to the overall lower values. A projection of the cells' expression patterns onto PC1 and PC2 separates individual cells into two distinct clusters (Figure 4A, blue and red circles). We identified a third potential cluster enclosed in the orange dotted line within the first cluster that represents the early transition from fibroblasts to iPSC precursors. The first cluster (dark and light blue, turquoise, green, and yellow, enclosed in the blue circle) contains the three control groups, tail tip fibroblasts (TTF), mouse embryonic fibroblasts (MEFs) and NGFP2 MEFs.
In addition, it contains GFP- cells exposed to dox for two, four and six days, and dox-dependent GFP- cells (yellow dotted). The second cluster (orange, red, brown, enclosed in the red circle) contains dox-dependent and dox-independent GFP+ cells and the parental NGFP2 iPSCs. The third, rather heterogeneous cluster (orange dotted circle) contains cells primarily from the early colonies prior to the activation of the Nanog-GFP locus, possibly representing an early intermediate state. Importantly, a few cells from earlier time points (green and yellow dots) showed a similar pattern of expression as in the second cluster. This agrees with the observation that iPS colonies appear with different latencies and that early colonies with ES-like morphology may not be dox-independent. Cells on dox for four days cluster very closely to the MEFs, suggesting that the epigenetic changes that characterize a fully reprogrammed iPS cell do not occur early in the reprogramming process (Guo et al., 2010) (Figure 4A). Because the principal components consist of weighted contributions from all 48 genes, it is possible to identify the most informative genes in classifying the two clusters (Figure 4B). By this criterion, Thy1, Col5a2, Bmi1, Gsk3b, and Hes1 were the most specific markers of the first cluster. For the second cluster, the most specific markers were Dppa2, Sox2, Nanog, Esrrb, Oct4, Sall4, Utf1, Lin28, and Nr6a1. We found that several pluripotency genes are not strictly associated with the second cluster. For example, Grb2 does not significantly differentiate between the two clusters. Similarly, genes such as Stat3, Hes1, Jag1, Gsk3b, Bmpr1a, Nes, and Wnt1, which are known to be important for the ES cell state, are less indicative of the second cluster (Figure 4B).
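The normalization and projection steps described in this Example can be sketched in Python with NumPy as follows. This is a minimal sketch under stated assumptions, not the analysis code used for the figures: the array layout (cells x genes x replicates), the delta-Ct sign convention (control Ct minus gene Ct, so that larger values mean higher expression), and the `max_control_ct` cutoff used to drop cells with weak control-gene signal are all hypothetical choices; the actual procedure is given in the Supplemental Methods.

```python
import numpy as np

def normalize_ct(ct, control_idx, max_control_ct=30.0):
    """Normalize replicate Ct values to per-cell control genes.

    ct: array of shape (cells, genes, replicates).
    control_idx: column indices of the control genes (e.g., Hprt, Gapdh).
    max_control_ct: hypothetical cutoff; cells whose average control Ct
        is above it are treated as having low or absent control expression.
    Returns (delta, keep): delta has shape (kept cells, genes) and
    keep is the boolean mask of retained cells.
    """
    ct = np.asarray(ct, dtype=float)
    mean_ct = ct.mean(axis=2)                          # average the replicates
    control_ct = mean_ct[:, control_idx].mean(axis=1)  # per-cell control average
    keep = control_ct <= max_control_ct
    delta = control_ct[keep, None] - mean_ct[keep]     # assumed sign convention
    return delta, keep

def pca(x):
    """PCA by singular value decomposition of the column-centered data.

    Returns (scores, components, explained), where explained is the
    fraction of total variance carried by each principal component.
    """
    centered = x - x.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    scores = centered @ vt.T      # coordinates of each cell on each PC
    return scores, vt, explained
```

With a 48-gene matrix, `scores[:, :2]` gives the PC1/PC2 coordinates plotted per cell, `explained[:2]` the corresponding variance fractions, and the rows of `components` the per-gene weights that indicate which genes separate the clusters.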
To examine within-group variability combining all genes, we measured the Jensen-Shannon Divergence (JSD) on the gene expression probability vectors derived by normalizing each of the gene expression vectors of cells within a group to a sum of one (Figure 4C and Figure 4D). JSD is a measure of the similarity between two probability distributions, which can be generalized to measure the similarity between a finite number of distributions. This method quantifies within-group variation by measuring the information radius of samples against the group average using Shannon entropy. The final JSD score for each group is bounded by 0 and 1. Higher values mean more divergence within a group and lower values mean less divergence within a group. A bootstrapping method was used to resample the gene expression probability vectors from each group with replacement and derive a 95% confidence interval. We observed an increase in variation in MEFs when dox was added to the cells. A steep decrease in variation was observed after the activation of the Nanog locus (GFP+ cells), suggesting that the activation of the endogenous Nanog locus marks events that drive the cells to pluripotency (Silva et al., 2009). We observed the parental NGFP2 iPSCs to be the least variable group. Notably, although the dox-independent cells were derived from the same parental cells, they exhibited a higher variation in their transcriptional profile (red) than their parental cells (brown), indicating that each reprogramming event (colony) results in a slightly different epigenetic state (Figure 4C).
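One plausible reading of the within-group JSD score and its bootstrap interval is sketched below. The exact formula is not given in the text, so this example assumes the score is the mean base-2 JSD between each cell's probability vector and the group-average vector (which keeps it between 0 and 1), and that the confidence interval is a simple percentile interval. All function names are hypothetical, and every cell is assumed to have non-negative expression values with a positive sum.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits, ignoring zero entries."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def jsd_pair(p, q):
    """Base-2 Jensen-Shannon divergence between two distributions (0 to 1)."""
    m = 0.5 * (p + q)
    return entropy(m) - 0.5 * (entropy(p) + entropy(q))

def group_jsd(expr):
    """Mean JSD of each cell against the group-average distribution."""
    probs = expr / expr.sum(axis=1, keepdims=True)  # each row sums to one
    centre = probs.mean(axis=0)
    return float(np.mean([jsd_pair(p, centre) for p in probs]))

def bootstrap_ci(expr, n_boot=500, seed=0):
    """95% percentile interval from resampling cells with replacement."""
    rng = np.random.default_rng(seed)
    n = expr.shape[0]
    scores = [group_jsd(expr[rng.integers(0, n, size=n)]) for _ in range(n_boot)]
    low, high = np.percentile(scores, [2.5, 97.5])
    return float(low), float(high)
```

A group of identical cells scores zero under this definition, and the score grows as cells diverge from the group average, matching the qualitative behavior described above.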
We further examined the variation within and between colonies using JSD (Figure 4D). Generally, the variation observed within a colony between the GFP- and GFP+ cells was similar to the variation observed among the colonies collectively (Figure 4C). Colony 44, which contained only a few cells with a low level of GFP (Figures 3A-3C), exhibited high variation between the GFP+ cells. Colonies 20 and 34, which gave rise to early stable dox-independent iPS colonies, showed low variation between late GFP- cells (Figure 4D) relatively early in the process. Notably, all of the colonies that gave rise to fully reprogrammed iPSCs (Colonies 43, 16, 20, 34) exhibited a similarly low variation between GFP+ dox-independent cells, which indicates that when the core circuitry is activated the variation between single cells is reduced significantly.
Example 3: Analysis of induced cells that do not give rise to iPSCs
Upon retrospective tracing, we found two colonies, 23 and 44, that failed to give rise to stable iPSCs (Figures 6A-6C). Both exhibited early de-differentiating morphological changes associated with reprogramming (Smith et al., 2010); however, Colony 23 produced homogeneous cultures of cells with mainly epiblast stem cell-like morphology (flat colonies), whereas Colony 44 produced transformed-like cells.
Colony 23 failed to activate GFP in the majority of cells upon continual passaging to day 81. Ultimately, only a very small fraction of these cells activated the endogenous Nanog locus (0.01% GFP+). Colony 44 contained a few cells with a low level of GFP that appeared at day 61 and disappeared upon continual passaging and dox withdrawal. Colonies 23 and 44 were induced cells that did not give rise to iPSCs; thus we termed them 'partially reprogrammed colonies.' We tested whether methylation of pluripotency genes contributed to the partially reprogrammed state by treating colonies 23 and 44 with the DNA methyltransferase inhibitor 5-aza-cytidine (azaC) (Mikkelsen et al., 2008). After thirty days of azaC and dox treatment followed by eight days of azaC and dox withdrawal, GFP+ cells appeared at a frequency of 2.2% in Colony 23 and 0.5% in Colony 44, compared to none in untreated cells (Figures 6A-6C). These partially reprogrammed colonies were used as a control for fully reprogrammed colonies.
Example 4: Early markers of reprogramming
To determine whether the variability in single-cell gene expression was a result of differences between distinct cell populations or just stochastic noise, we analyzed our data with violin plots. Population noise and gene expression noise should exhibit unimodal distribution around a reference level in these density plots, whereas a multimodal distribution is indicative of distinct gene expression differences between cell populations.
To examine the expression of established early markers of the reprogramming process, we analyzed the expression profiles of three well-known reprogramming markers, Fbxo15, Fgf4 and endogenous Oct4 (Brambrink et al., 2008; Takahashi and Yamanaka, 2006) (Figure 5A). All three genes exhibited high expression levels very early in the process (day 2, 4 or 6) in a few cells (1 to 8 cells) and were highly expressed in the GFP+ cells, as can be expected from potential early markers. Very early and late in the reprogramming process, the distribution in expression levels of Fbxo15, Fgf4 and endogenous Oct4 was unimodal, with a very narrow peak indicative of low variation between individual cells.
We noted that Fbxo15, Fgf4, and endogenous Oct4 were expressed in some of the partially reprogrammed colonies at levels similar to those seen in iPS cells (Figure 5A and Figure 7). Fbxo15 showed a bimodal distribution in both colonies 44 and 23, while Fgf4 showed bimodality in colony 44 and unimodality in colony 23. Of particular interest is the observation that endogenous Oct4 was highly expressed in the partially reprogrammed colony 23. These data suggest that the activation of the Oct4 locus can occur in partially reprogrammed cells with incomplete reactivation of the core regulatory circuitry (Jaenisch and Young, 2008). Although exogenous Oct4 is one of the key factors in the reprogramming process, its endogenous activation was insufficient to identify cells as fully reprogrammed and thus is a less than optimal predictive marker for reprogramming.
Further analysis for other early markers revealed five genes, Sall4, Esrrb, Utf1, Lin28, and Dppa2, that were activated very early in the process in a few cells and were highly expressed in the GFP+ cells (Figures 5B and 5C). We separated these genes into two classes: (i) non-predictive, like Sall4, which was activated very early in the process in a few cells but was also activated robustly in the partially reprogrammed cells (Figure 5B and Figure 7), and (ii) more predictive, like Esrrb, Utf1, Lin28, and Dppa2, which were activated early in the process in a small fraction of cells but exhibited only low levels of expression in a few partially reprogrammed cells (Figure 5C and Figure 7). The distribution in expression levels of Esrrb, Utf1, Lin28, and Dppa2 was unimodal early and late in the reprogramming process, with a narrow peak indicative of low variation between individual cells (Figure 5C). Of note, the variability between single cells at early time points was masked in whole cell populations as detected by qRT-PCR (Figure 5D).
To validate the Fluidigm BioMark results, we utilized the single molecule mRNA FISH technique and quantified transcripts of the non-predictive marker, Sall4, and two potential predictive markers, Esrrb and Utf1, in single NGFP2 MEFs on dox for six days. In agreement with the Fluidigm BioMark analysis, only a few cells exhibited high transcript levels of these genes. As demonstrated in the violin plots (Figures 5B and 5C), Sall4 exhibited the highest number of cells with high expression levels. Only 1 to 2 cells out of 125 examined cells showed relatively high levels of Utf1 and Esrrb, reflecting the low efficiency of the reprogramming process (Figure 8A). Our analysis found that only 2-4% of the cells contained more than 10 transcripts of Utf1 and Esrrb, whereas 20% of the cells expressed more than 10 transcripts of Sall4. These data suggest that Esrrb and Utf1 are expressed in a few cells very early in the process and thus may represent early markers that predict the eventual reprogramming of a given cell.
To differentiate between "predictive" marker genes and "non-predictive" marker genes in their potential to enhance reprogramming, we infected the NGFP2 MEFs with shRNA for Utf1, Esrrb and Sall4 (Figure 8B) and initiated reprogramming by exposing the cells to dox for 13 days followed by three days of dox withdrawal. Alkaline phosphatase staining and flow cytometry for GFP at day 16 indicated that inhibition of any of the genes resulted in a reduction of colonies and GFP-positive cells (Figures 8C and 8D). However, the effect of Utf1 and Esrrb knockdown was significantly stronger than that of Sall4 knockdown, as measured by alkaline phosphatase staining and flow cytometry for GFP (Figures 8C and 8D). Our data support the view that the previously characterized Utf1 represents a predictive early marker for the reprogramming process (Morshedi et al., 2011) and add Esrrb, Lin28 and Dppa2 as additional potential early markers.
Example 5: Activation of endogenous Sox2 is a late phase in reprogramming that initiates a series of consecutive steps toward pluripotency
To investigate the later phases of reprogramming, we wished to identify potential late markers for the reprogramming process. Late markers of reprogramming would be expected to show no or very low transcript levels at early time points and high transcript levels as the cells mature and become iPSCs. We identified Gdf3 and Sox2 as genes that appeared late in the process with very low levels of expression at early time points, as measured by Fluidigm BioMark and sm-mRNA FISH (Figures 10A-10F). However, Gdf3 was activated in the partially reprogrammed cells while Sox2 was not, suggesting Sox2 may be a sufficient late marker for iPSCs (Figures 10A-10F).
To examine whether reprogramming involves random or sequential activation of marker genes, we derived a Bayes network model using a subset of cells from different time points along the reprogramming process that have valid expression values for all 48 genes. The Bayes network model predicted that the activation of the endogenous Sox2 locus initiates a series of consecutive steps leading to the activation of many pluripotency genes (Figure 9A). A Bayes network is a probabilistic model that represents a set of variables and their conditional dependencies. For example, given that Sall4 is expressed, the expression of Oct4, Fgf4, Nr6a1, and Fbxo15 is conditionally independent of whether Sox2 is expressed or not. In contrast, if Sox2 initiates a sequence of activation and first activates Sall4 and then activates the four downstream target genes, one should not find a cell that expresses Sox2 and one of the four downstream genes (Oct4, Fgf4, Nr6a1, and Fbxo15) without Sall4 expression. To examine whether the Bayes network predicted true consecutive steps in reprogramming, we decided to investigate three possible sequences of events: (i) Sox2 activates Sall4 and then activates the downstream gene Fgf4; (ii) Sox2 first activates Lin28 and then induces the downstream gene Dnmt3b; (iii) Sox2 activates Sall4 and then activates the downstream gene Fbxo15. To test these possibilities, we quantified transcripts by sm-mRNA-FISH of the three combinations of genes simultaneously in single NGFP2 MEFs on dox for 12 days, a time point when both fully reprogrammed cells and partially reprogrammed cells have appeared.
Combination 1: While 186 of a total of 279 cells examined were negative for expression, 25 cells expressed one gene, 38 cells expressed two genes, and 30 cells expressed all three genes. Notably, no double-positive cells were seen that co-expressed Sox2 and Fgf4 (Figure 9B). Combination 2: Out of a total of 283 cells examined, 82 cells were positive for any of the genes, with 49 cells expressing one, 23 cells expressing two and 10 cells expressing all three genes. No cells expressing just Sox2 and Dnmt3b were detected (Figure 9C). Combination 3: Of 275 cells examined, 101 cells were positive for any of the three genes, with 50 cells expressing one, 30 cells expressing two and 20 cells expressing all three genes, but only one cell was found that expressed just Sox2 and Fbxo15, at a very low level (Figure 9D). These data support the sequential activation of Sall4 and Lin28 by Sox2 followed by the activation of Fgf4, Fbxo15, and Dnmt3b, respectively, in the Sox2-positive cells, consistent with a model of a hierarchical activation of key pluripotency genes.
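The logic of the three-gene test can be illustrated with a short sketch. It assumes each cell is summarized as a tuple of transcript counts in the order (upstream, intermediate, downstream), for example (Sox2, Sall4, Fgf4), and that a gene counts as expressed at or above a chosen threshold. The threshold, the function names and the example cells are hypothetical and are not taken from the experiments.

```python
from collections import Counter

def classify(cells, threshold=1):
    """Tally on/off expression patterns across cells.

    cells: iterable of (upstream, intermediate, downstream) transcript counts.
    Returns a Counter keyed by tuples of booleans, e.g. (True, True, False).
    """
    patterns = Counter()
    for counts in cells:
        patterns[tuple(c >= threshold for c in counts)] += 1
    return patterns

def violations(patterns):
    """Cells expressing upstream and downstream genes without the intermediate.

    Under a strictly sequential model these cells should be absent or rare.
    """
    return patterns[(True, False, True)]

# Hypothetical cells: negative, upstream only, two genes, all three, and one violation.
cells = [(0, 0, 0), (5, 0, 0), (5, 3, 0), (5, 3, 2), (4, 0, 6)]
patterns = classify(cells)
n_violations = violations(patterns)
```

A count of zero (or near zero) for the skipping pattern is what the sequential model predicts, which is the observation reported for all three combinations above.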
Example 6: The hierarchical model of gene activation predicts transcription factor combinations with the capability to induce reprogramming
To assess whether the sequential activation of key pluripotency genes can predict their role in inducing reprogramming, we infected Oct4-GFP MEFs with transcription factor combinations derived from the top node of the network (Sox2), the middle nodes (Esrrb, Sall4, Lin28), and the bottom nodes (Oct4 and Nanog). We chose three combinations of genes that were predicted to allow the activation of the pluripotency circuitry and generate fully reprogrammed iPSCs when transduced into somatic cells: (1) Oct4, Esrrb, Nanog; (2) Sox2, Sall4, Nanog; and (3) Lin28, Sall4, Esrrb, Nanog. These three combinations omitted either Sox2 or Oct4 or both.
Combination (1) replaced Sox2 with Esrrb because the network predicted that Esrrb could activate Sox2 (Figure 11A). Combination (2) replaced Oct4 with Sall4 because Sall4 was predicted to be upstream of Oct4 (Figure 11B). Combination (3) omitted both Sox2 and Oct4 because the model predicted that Lin28, Sall4, Esrrb, and Nanog can drive the cells to pluripotency independently of the two master regulators Sox2 and Oct4 (Figure 11C). Nanog was co-transduced in all combinations because the model predicted that this gene also functioned independently of Sox2 and Oct4 (Figure 9A). Fibroblasts were transduced with the three different combinations as well as with Klf4 and c-Myc to induce proliferation. After 25 days on dox, GFP was detected by flow cytometry at a frequency of 22.2%, 0.3%, and 0.4%, respectively, in the three combinations (Figures 11A-11C). A single passage increased the fraction of GFP+ cells to 20.7%, 17.3%, and 18.3%, respectively, suggesting that cell division was essential to convert the somatic state to the pluripotent state in the absence of exogenous Oct4 (Figures 11A-11C). These data support the role of Oct4 in facilitating the reactivation of the endogenous circuitry. Finally, we transduced the cells with combination (3) but without Klf4 and c-Myc. GFP was detected by flow cytometry after 25 days on dox at a frequency of 0.6%, indicating that Klf4 and c-Myc were not required to drive the cells toward pluripotency (Figure 11D). Dox-independent iPSCs from all combinations were GFP-positive as detected by microscopy and generated chimeras (Figures 11A-11D).
To test whether Dppa2 has a role in the activation of the core pluripotency circuitry as predicted by the model, we infected both Oct4-GFP and Nanog-GFP MEFs with modified combinations (1) and (4), whereby Nanog was replaced by Dppa2 (Figures 11E and 11F). For modified combination 1 (Oct4, Esrrb, Dppa2, Klf4, c-Myc), GFP was detected by flow cytometry after 16 days on dox followed by five days of dox withdrawal at a frequency of 0.6% and 0.2% in the Oct4-GFP MEFs and Nanog-GFP MEFs, respectively. For modified combination 4 (Lin28, Sall4, Esrrb, Dppa2), GFP was detected by flow cytometry after 16 days on dox followed by five days of dox withdrawal at a frequency of 0.2% and 0.1% in the Oct4-GFP MEFs and Nanog-GFP MEFs, respectively.
To determine the importance of a particular functional link in the network, we transduced the Oct4-GFP MEFs with Lin28, Sall4, Ezh2, Nanog, Klf4 and c-Myc (a modified combination 3), replacing Esrrb with its downstream target Ezh2 as predicted from the model (Figure 11G). After 25 days on dox, abundant transformed cells were found on the plate, and 1 day after dox withdrawal there appeared to be some cells that morphologically resembled iPSCs. However, 7 days after dox withdrawal, no stable iPS colonies were found, suggesting incomplete reactivation of the core circuitry required for fully reprogrammed iPSCs, consistent with the failure to detect GFP-positive cells (Figure 11G). It is tempting to speculate that the absence of Esrrb from the combination prevented the activation of endogenous Sox2 and the pluripotency circuitry. To test whether Ezh2 has a negative effect on the reprogramming process that might be responsible for the observed incomplete reprogramming, we transduced NGFP2 MEFs with a viral construct expressing Ezh2 and monitored its effect on the reprogramming process. In parallel, we transduced the cells with shRNA for Ezh2 and monitored its effect on the reprogramming process. Both overexpressing and knocking down Ezh2 revealed a positive effect of Ezh2 on the reprogramming process (Figures 12A-12D). Overexpressing Ezh2 enhanced reprogramming and knocking down Ezh2 inhibited reprogramming, consistent with a positive effect of Ezh2 on the reprogramming process.
To strengthen the importance of our findings and test the synergistic effects of our factors and the OKSM factors, we transduced NGFP2 MEFs that harbor the four canonical reprogramming factors with Lin28, Sall4, Esrrb, and Nanog and found stable dox-independent iPS colonies with GFP-positive cells at a frequency of 2.2% after only five days of dox exposure (Figure 11H). Flow cytometric analysis of a secondary system of the cells after 5 days of dox exposure detected GFP at a frequency of 1.9% compared to control (Figure 11I). To examine the effect of each of the four transcription factors in facilitating the reprogramming process, we transduced NGFP2 MEFs with each of these factors (Lin28, Sall4, Esrrb, Nanog) individually, and found that each factor contributed differently to the reprogramming process. Lin28, Sall4, and Esrrb facilitated the reprogramming process after 10 days of dox exposure followed by 4 days of dox withdrawal, while Nanog facilitated the reprogramming process after 13 days of dox exposure followed by 3 days of dox withdrawal (Figures 12E and 12F). Our results show that the activation of the pluripotency circuitry is possible by various combinations of factors, even in the absence of exogenous Oct4 and Sox2, and support our model of activation that drives the cell toward transgene independence.
Example 7: A Mouse iPS Cell Line Produced by SNEL Reprogramming Factor Combination Characterized by a High Efficiency of Generating Live Offspring by Tetraploid Complementation
Work described herein resulted, in some aspects, in a mouse iPS cell line produced by the reprogramming factor combination Sall4, Nanog, Esrrb and Lin28 (SNEL), which is characterized by a high efficiency of generating live offspring by tetraploid complementation.
Viral preparation, infection and iPS colony isolation
Lentiviral vectors containing the factors Sall4, Nanog, Esrrb and Lin28 (SNEL combination) under the control of the tetracycline operator and a minimal CMV promoter were generated by cloning the open reading frame of each factor, obtained by reverse transcription with the primers shown in Table 2 below.
Table 2. Primers for SNEL Reprogramming Factors
Figure imgf000092_0001
Each ORF was cloned into the TOPO-TA vector (Invitrogen), and then restricted with EcoRI or MfeI and inserted into the FUW-tetO expression vector. Replication-incompetent lentiviral particles were packaged in 293T cells with a VSV-G coat and used to infect MEFs containing M2rtTA and Oct4-GFP or Nanog-GFP or Sox2-GFP, or TTFs with M2rtTA. Viral supernatants from cultures were filtered through a 0.45 μm filter and added to the cells at 48, 60 and 72 hours post infection. One day after the last infection, the cells were exposed to 2 μg/ml doxycycline for 45 days. The cells were cultured in ES medium (DMEM supplemented with 15% FBS (Hyclone), leukemia inhibitory factor, beta-mercaptoethanol (Sigma-Aldrich), penicillin/streptomycin, L-glutamine and nonessential amino acids). iPS colonies were isolated between 15 and 45 days post dox exposure and grown on feeder cells in the absence of doxycycline. Stable colonies were passaged twice before being used in the functional assays.
Functional assays (chimera formation, germ line transmission and tetraploid complementation).
All animal procedures were performed according to NIH guidelines and were approved by the Committee on Animal Care at MIT. All 2n and 4n injections were performed using B6D2F2 embryos. iPSCs were derived from an agouti mouse and could be identified by coat color as adults. Blastocysts (94-98 hr after hCG injection) were placed in a drop of HEPES-CZB medium under mineral oil. A flat tip microinjection pipette with an internal diameter of 16 μm was used for iPS cell injections. Each blastocyst received 8-10 iPS cells. After injection, blastocysts were cultured in potassium simplex optimization medium (KSOM) and placed at 37°C until transferred to recipient females. About 15-20 injected blastocysts were transferred to each uterine horn of 2.5-day-postcoitum pseudopregnant B6D2F1 females. Seventeen days after transfer to pseudopregnant females, fetuses were recovered by c-section and fostered to lactating Swiss or Balb/c females. In some instances the fetuses were delivered by natural birth.
To test for germ line transmission, each chimeric male was set up for mating with 2 C57BL/6 females.
Preliminary results
The efficiency of the reprogramming process using the SNEL combination is very low. Out of 2 x 10^5 infected cells, we could observe 2-20 iPS colonies (0.001-0.01% efficiency). These colonies were positive for the three pluripotency markers Nanog, Sox2 and Oct4. They were stable in the absence of doxycycline and exhibited a morphology that resembled that of ESCs. The functional assays were performed on two independent clones that were isolated from Oct4-GFP M2rtTA MEFs. In the chimera formation assay, SNEL clone #2 generated 5 highly chimeric mice, with contribution of >90%. At least two of them transmitted the transgenes through the germline (the experiment is still ongoing for the other 3 chimeric mice and for SNEL clone #1).
For the tetraploid complementation assay, 202 blastocysts were injected with iPS cells from SNEL clone #2. On the due date, all fetuses were delivered by c-section and 19 live pups were obtained. Out of those, 14 were fostered to lactating mothers; the remaining pups either had umbilical cord hernias, a condition not compatible with survival but typical for tetraploid complementation, or died perinatally of respiratory failure, another condition typically observed with tetraploid complementation. Two weeks after birth, 5 pups were still surviving, while the majority of the remaining pups had been cannibalized by the foster mothers. The rest were found dead. Three of the pups survived to adulthood, whereas two were small and did not survive to adulthood. Two of the pups that developed to adulthood were all-iPSC-derived and are phenotypically normal. One additional pup that developed to adulthood is chimeric, which is not atypical for experiments involving tetraploid complementation. Germline transmission is currently being examined.
The second Oct4-GFP SNEL iPSC line tested, #1, was even more efficient. 112 blastocysts have been injected thus far. In this clone, some fetuses were obtained by c-section, the typical method for tetraploid complementation. The remaining fetuses were delivered by natural birth, which is not common for tetraploid complementation assays. We chose to test the ability of the mice to give birth naturally, given that we obtained pregnancies with a high number of embryos. Such pregnancies are not typical for tetraploid assays, and low-fetus-number pregnancy is one factor contributing to the inability of mothers to give birth naturally. In total, we obtained 11 live pups; 7 were born naturally. All 7 pups from this group developed to adulthood. Three are chimeric, but 4 are all-iPSC-derived and phenotypically normal animals.
The results obtained are unexpected and superior to what has been published in the literature. After testing several iPSC lines over several years, this is the first time that animals are derived that survive to adulthood. Moreover, from the particular combination of factors, only two cell lines were tested. Both cell lines consistently gave rise not only to high-contribution chimeras but also to adult mice after tetraploid complementation. This has no precedent in the literature. At most, in Zhao et al., Nature 2009, out of 37 iPSC lines tested only 3 lines were successful in producing 4n pups (<10% efficiency), compared with 2 of 2 lines using the SNEL combination (100% efficiency). In addition, using the three best Yamanaka iPSC lines, Zhao et al. could obtain only 22 live births out of a total of 624 injected embryos. That is a success rate of only 3.5%, compared with 10% efficiency with the two SNEL lines (19 live pups out of 202 blastocysts for SNEL #2 and 11 live pups out of 112 blastocysts for SNEL #1). Also, thus far, the efficiency of live and adult pups produced with our lines is by far superior. In some instances, the rate of adult pups compared to blastocysts injected could be close to an order of magnitude higher than that reported in the literature (4 all-iPSC-derived adult animals out of 60 blastocysts injected in one experiment).
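The percentages quoted in this comparison follow directly from the counts given in the text, as the short calculation below shows. The helper name `percent` and the variable names are illustrative only; the counts are the ones stated above.

```python
def percent(successes, total):
    """Return successes / total as a percentage."""
    return 100.0 * successes / total

# Counts as stated in the text.
zhao_lines = percent(3, 37)                # iPSC lines producing 4n pups (Zhao et al.)
zhao_births = percent(22, 624)             # live births per injected embryo (Zhao et al.)
snel_births = percent(19 + 11, 202 + 112)  # live pups per injected blastocyst, both SNEL lines
snel_adults = percent(4, 60)               # all-iPSC-derived adults in one experiment

for label, value in [
    ("Zhao et al. lines", zhao_lines),
    ("Zhao et al. live births", zhao_births),
    ("SNEL live births", snel_births),
    ("SNEL adults (one experiment)", snel_adults),
]:
    print(f"{label}: {value:.1f}%")
```

Pooling the two SNEL lines gives 30 live pups from 314 blastocysts, which rounds to the approximately 10% figure cited.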
Example 8: Replacement of Lin28 in Reprogramming Factor Combinations
We envisioned that it should be possible to replace Lin28 in reprogramming factor combinations. To that end, we transduced both Oct4-GFP and Nanog-GFP MEFs with modified versions of modified combination 4, in which Lin28 was replaced with either Ezh2, Kdm1a, or Utf1. Specifically, we used the following three combinations of factors: (5) Sall4, Esrrb, Dppa2, Ezh2; (6) Sall4, Esrrb, Dppa2, Kdm1a; (7) Sall4, Esrrb, Dppa2, and Utf1. Dox was stopped at day 25. GFP-expressing stable iPS colonies were detected and were picked 5 days after cessation of dox. The efficiency of reprogramming using these combinations was estimated to be slightly lower than when Lin28 was used in combination with Sall4, Esrrb, and Dppa2.
Example 9: Reprogramming by Sall4, Nanog, Esrrb and Lin28 produces high quality iPSCs with a molecular signature of developmental potency that resembles that of ESCs
Recent reports indicate that the majority of Oct4, Sox2, Klf4 and c-Myc (OSKM)-derived iPSCs may have reduced differentiation potential as compared to ESCs derived by somatic cell nuclear transfer (SCNT), which are equivalent in their developmental potential to ESCs derived from the fertilized egg (Jiang et al. 2011; Kim et al. 2010; Polo et al. 2010; Brambrink et al. 2006; Wakayama et al. 2006). In addition, it has been suggested that OSKM-derived iPSCs exhibit genetic and epigenetic aberrations throughout the genome that are distinct from ESCs (Kim et al. 2010; Polo et al. 2010; Hussein et al. 2011; Laurent et al. 2011; Mayshar et al. 2010; Lister et al. 2011; Doi et al. 2009; Ohi et al. 2011; Kim et al. 2011; Chin et al. 2009; Phanstiel et al. 2011). These data are consistent with the particular reprogramming method used affecting the quality of the resulting pluripotent cells. We hypothesized that different reprogramming factor combinations might affect the developmental potential of the iPSCs. Recently, using two complementary single-cell techniques, we demonstrated that the reprogramming process involves a late hierarchical/deterministic phase that starts with the activation of the Sox2 locus and continues with a series of gene activation events that lead to a stable and transgene-independent pluripotency state (Figure 14A) (Buganim et al. 2012; Pan, G. and Pei, D. 2012). We reasoned that a combination of key factors derived from this later phase would reprogram cells in a more controlled way and therefore might uniformly yield iPSCs of high quality. We chose Sall4, Esrrb and Lin28 as key downstream players during the late reprogramming phase, and Nanog because it acted in a separate pathway (Figure 14A) (Buganim et al. 2012). Nanog-GFP or Oct4-GFP MEFs were infected with dox-inducible lentiviruses encoding the four reprogramming factors (SNEL) and cultured until the appearance of iPSC colonies. 
The efficiency of the reprogramming process was low, producing 2-5 colonies per 1 x 10^5 plated cells with a latency that ranged between 14 and 60 days. In total, we isolated 10 SNEL-iPSC colonies (6 from Nanog-GFP and 4 from Oct4-GFP MEFs). The resulting iPSC colonies expressed a bright GFP signal from either the Oct4 or the Nanog locus and upregulated key pluripotency markers such as Sox2, endogenous Sall4, Utf1, endogenous Esrrb, Dppa2, Dppa3, Lin28 and Rex1, as assessed by immunostaining and quantitative real time PCR (qRT-PCR) (Figures 14B and 14C). When injected into NOD/SCID mice, the cells formed well-differentiated teratomas with structures from all three germ layers (Figure 14D).
The potential of SNEL-iPSCs to generate chimeras was tested by injecting cells from all 10 clones into BDF1 host blastocysts that were subsequently transferred into ICR/SWISS pseudopregnant recipient females. All the lines gave rise to chimeras, where 8/10 (80%) examined clones generated high-grade chimeras (50-95%), qualitatively assessed by coat color (Figure 17). Germ line transmission was noted in 4 out of 4 examined high-grade chimera lines. Chimeric mice from one of the iPSC clones (Oct4-GFP SNEL#2) suffered from an eye problem and one adult mouse developed a tumor. However, these isolated events were likely the result of leaky expression of Esrrb and Lin28, which have been linked to similar phenotypes (Stadtfeld et al. 2010; Carey et al. 2011). Consistent with that, MEFs from Oct4-GFP SNEL#2 E13.5 chimeric embryos exhibited very high levels of Esrrb and Lin28 compared to MEFs derived from a clone that gave rise to normal mice (Figure 18). All the other chimeras that were generated from independent clones grew to old age without any obvious evidence of tumorigenicity or other abnormality.
To stringently compare the developmental potential of SNEL- and OSKM-derived iPSCs, we used 4n complementation. Utilizing infection and culture conditions identical to those used for derivation of the SNEL-iPSCs, 10 iPSC lines were derived by infection of MEFs with OSKM lentiviruses, all of which expressed high levels of GFP and pluripotency markers (6/10 of the colonies are presented in Figure 14C). Cells from the 10 SNEL-iPSC and 10 OSKM-iPSC lines were injected into 4n blastocysts and transferred into ICR/SWISS pseudopregnant recipient females. Relative to the OSKM-iPSCs, SNEL-iPSCs produced approximately 5 times as many live 4n pups that survived to birth and postnatally (p = 3.46 x 10^-12 by chi-squared test; Figures 15A and 15B). From a total of 1495 OSKM-iPSC-injected blastocysts only 21 (1.4%) were delivered, 11 (0.7%) of which sustained normal breathing and were foster nursed. In contrast, from 2138 blastocysts injected with SNEL-iPSCs, 149 (7%) survived to birth, 109 (5%) of which were breathing normally and were foster nursed. In total, about 40% of the OSKM-iPSC lines gave rise to live pups, compared to 80% of the SNEL-iPSC lines (Figure 15A). The adult "all-SNEL-iPSC" mice were healthy and fertile (Figure 15C), although some pups exhibited some delay in development. To exploit the maximum potential of the cells and determine whether the developmental potential differences between these two types of iPSCs would be further exacerbated, we cultured the 20 iPSC lines in 2i medium (LIF-containing medium supplemented with selective GSK3β and Mek1/2 inhibitors) for two passages and then injected each line into 60 4n blastocysts. The percentage of live-born pups in the SNEL combination was significantly higher (Figure 19), reaching 23-25% in some SNEL-iPSC lines (Figure 20). From a total of 600 OSKM-iPSC-injected blastocysts only 13 (2.2%) were delivered, 8 (1.7%) of which sustained normal breathing and were foster nursed. 
In contrast, from 600 blastocysts injected with SNEL-iPSCs, 64 (10.7%) survived to birth, 51 (8.5%) of which were breathing normally and were foster nursed (Figure 20A and 20B). Simple sequence length polymorphism (SSLP) analysis for 10 randomly selected 4n embryos (PCR-based assay for two loci) confirmed that the examined embryos were completely derived from the injected iPSCs (Figure 2D). Our data suggest that reprogramming with SNEL, in contrast to reprogramming using OSKM under the same conditions, produces high quality iPSCs at high rates as assessed by the most stringent test of 4n complementation.
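The birth-rate comparison above can be checked with a short calculation. The following is a minimal sketch, assuming a plain Pearson chi-squared test without continuity correction on the pooled counts of pups delivered versus blastocysts injected in standard medium; the text does not state which counts or correction were used for the reported p-value, so the result need not reproduce it exactly. The function names are illustrative only.

```python
import math

def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

def p_value_df1(chi2):
    """Upper-tail p-value of a chi-squared distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# Counts quoted in the text: pups delivered / blastocysts injected.
oskm_born, oskm_injected = 21, 1495
snel_born, snel_injected = 149, 2138

oskm_rate = oskm_born / oskm_injected
snel_rate = snel_born / snel_injected
chi2 = chi_squared_2x2(oskm_born, oskm_injected - oskm_born,
                       snel_born, snel_injected - snel_born)
p = p_value_df1(chi2)

print(f"OSKM birth rate: {oskm_rate:.1%}")
print(f"SNEL birth rate: {snel_rate:.1%}")
print(f"fold difference: {snel_rate / oskm_rate:.1f}")
print(f"chi-squared = {chi2:.1f}, p = {p:.2e}")
```

The fold difference computed this way agrees with the "approximately 5 times" stated in the text.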
To reveal a "4n competency signature", we selected the following groups of iPSC lines for microarray analysis. i) "Poor quality" iPSCs: This group included the three OSKM-iPSC lines Nanog-GFP OSKM#2, Oct4-GFP OSKM#2 and KH2 OSKM (Stadtfeld et al. 2010), which either did not produce fully developed pups or produced a very low number of pups; ii) "Good quality" iPSCs: This group included BC_2 OSKM (Carey et al. 2011) and Nanog-GFP SNEL#3, both of which gave rise to live, normal pups that survived only a few hours; iii) "High quality" iPSCs consisting of Nanog-GFP SNEL#2 and Oct4-GFP SNEL#1, both of which generated live pups that survived postnatally (representative pups from each iPSC group are presented in Figure 21); iv) As controls we used Nanog-GFP, Oct4-GFP and KH2 (Beard et al. 2006) ESCs.
Whole genome transcriptional analysis did not distinguish between the groups as assessed by hierarchical clustering and principal component analysis (PCA), consistent with their common identities as pluripotent cells. However, the SNEL lines clustered closer to the ESCs than the OSKM lines (Figure 22). In contrast, hierarchical clustering and PCA of 1,765 differentially expressed genes (F-test, p<0.01, Table S1) perfectly separated the different groups and clustered the "poor quality" group far from the other 3 groups (Figure 16A and 16B). qRT-PCR for two differentially expressed genes, Col6a1 and Thbs1, validated the microarray results (Figure 16C).
To assess whether this gene expression pattern may be associated with underlying epigenetic alterations, we profiled the methylomes of these samples by whole genome bisulphite sequencing. Although over 2500 differentially methylated regions (DMRs) were identified, these were largely specific to individual iPSC lines (i.e. Nanog-GFP vs Oct4-GFP lines). Nonetheless, methylation profiles demonstrated higher correlations between SNEL and ESCs relative to OSKMs. However, the exclusively intronic and intergenic genomic distribution of DMRs precluded accurate assessment of any contribution of DNA methylation to the observed gene expression pattern (Figure 23).
Gene ontologies and pathways (GeneDecks (Stelzer et al. 2009)) for the 1,765 differentially expressed genes revealed enrichment not only for categories associated with control of cellular growth and division, but also for more refined and specific developmental pathways and phenotypes: respiratory, immune, musculature, and aortic integrity phenotypes; hypoxia, myocardial infarction, and pulmonary disease; abnormal limb/digit/tail morphology; genes involved in extracellular matrix composition and TGFβ signaling; and defective embryogenesis (Figure 16D).
Tetraploid complementation is the most stringent assay for pluripotency and only a small fraction of iPSCs have been shown to be 4n competent (Pera, M.F. 2011; Zhao et al. 2009; Jiang et al. 2012; Kang et al. 2009; Boland et al. 2009; Jiang et al. 2011). Our experiments show that the quality of iPSCs as assessed by 4n competence is significantly influenced by the choice of factors used to induce conversion. We demonstrate that the SNEL factors, which are downstream targets of the late pluripotency factor Sox2 (Buganim et al. 2012), produce iPSCs that have a considerably higher competence to generate "all-iPSC" mice by 4n complementation than iPSCs produced by the conventional OSKM factors. The various OSKM and SNEL lines were generated and tested under identical conditions to rule out effects caused by variations in cell culture, method of factor delivery or blastocyst injections. While reprogramming by OSKM produced more colonies with shorter latency, reprogramming by SNEL yielded fewer iPSCs of superior quality. This is reminiscent of previous studies in which similar transgenic OSKM systems producing iPSCs with different efficiencies resulted in iPSCs of different quality, with the more efficient system (Stadtfeld et al. 2010) producing iPSCs of lower quality, as assessed by 4n complementation, than the less efficient system (Carey et al. 2011). These two studies combined clearly suggest that reprogramming efficiency should not be considered a determinant of iPSC quality.
To define molecular signatures that could differentiate between high and low quality iPSCs we compared global gene expression and DNA methylation patterns. Genes involved in 'Respiratory', 'Ischemia' and 'myocardial infarction' separated high, good and poor quality iPSCs, consistent with the observation that poor "all-iPSC" pups were retarded in development and died of incomplete lung maturation. DNA methylation patterns, as revealed by whole genome bisulfite sequencing, did not reveal a unique and specific methylation signature that can separate the various groups, suggesting that it is difficult to identify epigenetic alterations of specific loci that are responsible for the biological differences. Instead, it may be that epigenetic alterations of multiple genomic regions are responsible for the differences between high and low quality iPSCs.
In summary, our study provides a proof of principle that different combinations of reprogramming factors do not equally affect the biological characteristics of iPSCs, with some combinations consistently resulting in high-quality cells and others in low-quality cells. Based on these results it will be important to define the optimal factor combinations for reprogramming and to assess how different factor combinations might affect the quality of human iPSCs.
Example #9 Material and Methods
Summary
Construction of lentiviral vectors containing Sall4, Nanog, Esrrb and Lin28 under control of the tetracycline operator and a minimal CMV promoter has been described previously (Buganim et al. 2012). iPSCs were generated from 129SvJae/C57BL/6 MEFs containing an Oct4-GFP or Nanog-GFP reporter and the M2rtTA in the Rosa26 locus. During reprogramming the cells were cultured in mESC medium containing 2 μg/ml doxycycline. Twenty colonies were isolated for derivation and all yielded stable cell lines. iPSC lines derived at different time points were further confirmed for pluripotent properties by immunofluorescent analysis of Sox2 (MAB2018, R&D), Sall4 (ab29112, Abcam), Utf1 (ab24273, Abcam) and Esrrb (PP-H6705-00, Perseus Proteomics). Teratoma assays were performed by injecting iPSCs into the subcutaneous flanks of SCID mice, followed by histological examination of the tumors 4-5 weeks later. Microarray analysis, bisulfite genomic sequencing, SSLP analysis, and qRT-PCR were performed. Tetraploid embryo complementation was carried out as described (Carey et al. 2011) by injecting iPSCs (agouti coat origin) into BDF1 tetraploid embryos (4n). Pups were naturally born or delivered by cesarean section at day E19.5, and analyzed for morphology and developmental competency.
Cell culture and mice
Mouse embryonic fibroblasts (MEFs) were grown in DMEM supplemented with 10% fetal bovine serum, 1% non-essential amino acids, 2 mM L-glutamine and antibiotics. ESCs and iPSCs were grown in DMEM supplemented with 15% fetal bovine serum, 1% non-essential amino acids, 2 mM L-glutamine, 2 x 10^6 units mLif, 0.1 mM β-mercaptoethanol (Sigma) and antibiotics, or in 2i medium. Five hundred milliliters of 2i medium were prepared by combining: 230 mL DMEM/F12 (Invitrogen; 11320), 230 mL Neurobasal (Invitrogen; 21103), 5 mL N2 supplement (Invitrogen; 17502048), 10 mL B27 supplement (Invitrogen; 17504044), 10 mL (2%) fetal bovine serum, 2 x 10^6 units mLif, 1 mM glutamine (Invitrogen), 1% nonessential amino acids (Invitrogen), 0.1 mM β-mercaptoethanol (Sigma), penicillin-streptomycin (Invitrogen), 5 mg/mL BSA (Sigma), PD0325901 (PD, 1 μM) and CHIR99021 (CH, 3 μM). All the cells were maintained in a humidified incubator at 37°C and 5% CO2. For the primary infection, MEFs were isolated from mice heterozygous for the reverse tetracycline-dependent transactivator (M2rtTA), which resides in the ubiquitously expressed Gt(ROSA)26Sor locus (Beard et al. 2006), and carrying GFP knocked into either the Nanog or the Oct4 locus.
Viral preparation and infection
Construction of lentiviral vectors (FUW-tetO) containing Oct4, Sox2, Klf4 and c-Myc (OSKM) or Sall4, Nanog, Esrrb and Lin28 (SNEL) under control of the tetracycline operator and a minimal CMV promoter has been described previously (Brambrink et al. 2008; Buganim et al. 2012). Replication-incompetent lentiviral particles were packaged in 293T cells with a VSV-G coat and used to infect MEFs containing M2rtTA and either Oct4-GFP or Nanog-GFP. Viral supernatants from cultures were filtered through a 0.45 μm filter and added to the cells. To initiate reprogramming the cells were grown in ESC medium + 2 μg/ml doxycycline (DMEM supplemented with 15% FBS (Hyclone), leukemia inhibitory factor, beta-mercaptoethanol (Sigma-Aldrich), penicillin/streptomycin, L-glutamine and nonessential amino acids).
Tetraploid Embryo Complementation and Chimera Formation
All animal procedures were performed according to NIH guidelines and were approved by the Committee on Animal Care at MIT. Blastocyst injections were performed using (C57/Bl6xDBA) B6D2F2 host embryos. All injected iPSC lines were derived from crosses of 129Sv/Jae to C57/Bl6 mice and could be identified by agouti coat color. Embryos were obtained 24 (1-cell stage) or 40 (2-cell stage) h post hCG hormone priming. To obtain tetraploid (4n) blastocysts, electrofusion was performed at approximately 44-47 h post hCG using a BEX LF-101 or LF-301 cell fusion apparatus (Protech International Inc., Boerne, Texas). Both fused and diploid embryos were cultured in KSOM (Millipore) or Zenith culture medium (Zenith Biotech) until they formed blastocysts (94-98 h after hCG injection), at which point they were placed in a drop of xxx (Zenith) medium under mineral oil. A flat tip microinjection pipette with an internal diameter of 16 μm was used for iPSC injections. Each blastocyst received 10-12 iPSCs. Shortly after injection, blastocysts were transferred to day 2.5 recipient CD1 females (20 blastocysts per female). Pups, when not born naturally, were recovered at day 19.5 by cesarean section and fostered to lactating Balb/c mothers.
Immunofluorescence.
SNEL-iPSCs were fixed in 4% paraformaldehyde in PBS for 20 min, rinsed 3 x with PBS, blocked for 1 h with PBS containing 0.1% Triton X-100 and 5% FBS, and incubated O/N with one of the following antibodies: Sox2 (MAB2018, R&D), Sall4 (ab29112, Abcam), Utf1 (ab24273, Abcam) and Esrrb (PP-H6705-00, Perseus Proteomics). The cells were washed 3 x with PBS, incubated with the relevant secondary antibody (Invitrogen) for 1 h and visualized under a fluorescence microscope (Nikon Eclipse Ti-U).
Teratoma assay
ESCs (1 x 10^6) were injected subcutaneously into SCID mice (Taconic). Mice were euthanized 3 weeks after injection and tumors were collected and fixed in formalin for two days, followed by embedding in paraffin, sectioning and staining with hematoxylin and eosin for histological analysis following standard procedures.
Quantitative real-time PCR
Total RNA was isolated using the RNeasy Kit (QIAGEN). One microgram of DNase-treated RNA was reverse transcribed using a First Strand Synthesis kit (Invitrogen). Quantitative PCR analysis was performed in duplicate using 1/100 of the reverse transcription reaction in an ABI Prism 7300 (Applied Biosystems) with Platinum SYBR green qPCR SuperMix-UDG with ROX (Invitrogen). Specific primers flanking an intron were designed to the different genes (see Table 7).
Table 7. Quantitative real-time PCR primers
Figure imgf000103_0001
Experimental procedures used in the Examples
Quantitative real-time PCR
Total RNA was isolated using the RNeasy Kit (QIAGEN). One microgram of RNA was reverse transcribed using a First Strand Synthesis kit (Invitrogen). Quantitative PCR analysis was performed in duplicate using 1/100 of the reverse transcription reaction in an ABI Prism 7300 (Applied Biosystems) with Platinum SYBR green qPCR SuperMix-UDG with ROX (Invitrogen). Specific primers flanking an intron were designed to the different genes (see Supplemental Methods). Error bars represent s.d. of the mean of duplicate reactions.
Viral preparation and infection
Construction of lentiviral vectors containing Klf4, Sox2, Oct4 and Myc under control of the tetracycline operator and a minimal CMV promoter has been described previously (Brambrink et al., 2008). Lentiviral vectors containing the following factors (Lin28, Sall4, Ezh2, Esrrb, Nanog, Utf1, Dppa2, and Kdm1a) under control of the tetracycline operator and a minimal CMV promoter were generated by cloning the open reading frame of each factor, obtained by reverse transcription with specific primers (see Supplemental Methods), into the TOPO-TA vector (Invitrogen); the insert was then excised with EcoRI or MfeI and inserted into the FUW-tetO expression vector. Replication-incompetent lentiviral particles were packaged in 293T cells with a VSV-G coat and used to infect MEFs containing M2rtTA and Oct4-GFP or NGFP2 MEFs. Viral supernatants from cultures were filtered through a 0.45 μm filter and added to the cells. To initiate reprogramming the cells were grown in ES cell medium + 2 μg/ml doxycycline (DMEM supplemented with 15% FBS (Hyclone), leukemia inhibitory factor, beta-mercaptoethanol (Sigma-Aldrich), penicillin/streptomycin, L-glutamine and nonessential amino acids).
Chimera Formation
All animal procedures were performed according to NIH guidelines and were approved by the Committee on Animal Care at MIT. All 2n injections were performed using B6D2F2 embryos. Oct4-GFP or NGFP-2 iPSCs were derived from an agouti mouse and could be identified by coat color as adults. Diploid blastocysts (94-98 hr after hCG injection) were placed in a drop of HEPES-CZB medium under mineral oil. A flat tip microinjection pipette with an internal diameter of 16 μm was used for iPS cell injections. Each blastocyst received 8-10 iPS cells. After injection, blastocysts were cultured in potassium simplex optimization medium (KSOM) and placed at 37°C until transferred to recipient females. About 15-20 injected blastocysts were transferred to each uterine horn of 2.5-day-postcoitum pseudopregnant B6D2F1 females.
Flow cytometry
Cells were trypsinized, washed once in PBS and resuspended in fluorescence-activated cell sorting (FACS) buffer (PBS + 5% FBS). The percentage of GFP-positive cells (Nanog-GFP or Oct4-GFP) was analyzed using a FACSCalibur.
Secondary somatic cell isolation and culture
Primary NGFP2 iPSCs were electroporated with 25 μg of linearized FUW-TetO-tdTomato construct. The transduced cells were selected using the antibiotic Zeocin (400 μg/ml). For MEF isolation, chimeric embryos were isolated at E13.5, and the head and internal organs were removed. The remaining tissue was physically dissociated and incubated in trypsin at 37°C for 20 min, after which cells were resuspended in MEF media containing puromycin (μg/ml, selection against the M2rtTA) and expanded for two passages before freezing. Secondary MEFs used for the described experiments were thawed and plated 2 days before dox addition. Cells were plated at an optimal density of 50,000 cells per 6-well plate and reprogrammed with mouse ES medium supplemented with 2 μg/ml doxycycline (Sigma).
FISH and imaging
We performed FISH as outlined in (Raj et al., 2010; Raj et al., 2008). All hybridizations were performed in solution using probes coupled to either tetramethylrhodamine (TMR) (Invitrogen), Alexa 594 (Invitrogen) or Cy5 (GE Amersham). We used TMR for the probes against Esrrb, Utf1, Sox2 3' UTR, and Dnmt3b mRNA, Alexa 594 for Sall4 and Lin28 mRNA and Cy5 for Fgf4, Fbxo15, and Sox2 3' UTR. Optimal probe concentrations during hybridization were determined empirically. Imaging involved taking stacks of images spaced 0.3 μm apart using filters appropriate for DAPI, TMR, Alexa 594 and Cy5. All images were taken with a Nikon Ti-E inverted fluorescence microscope equipped with a 100X oil-immersion objective and a Photometrics Pixis 1024 CCD camera using MetaMorph software (Molecular Devices, Downingtown, PA). During imaging, we minimized photobleaching through the use of an oxygen-scavenging solution using glucose oxidase.
Image analysis
We segmented the cells manually and counted the number of fluorescent spots, each of which corresponds to an individual mRNA, using a combination of a semi-automated method described in (Itzkovitz et al., 2011; Raj et al., 2008) and custom software written in MATLAB (Mathworks). We estimate our mRNA counts to be accurate to within 10-20%.
Single-cell Data Processing
Ct values obtained from the BioMark System were converted into log-based expression values according to a set of rules provided in the Supplemental Methods. Briefly, for each gene, inconsistent readings or "Failed" quality control readings were filtered out. Cells with failed or inconsistent detection of control genes (Hprt, Gapdh) were removed from the analysis. Expression values were calculated by subtracting the average gene Ct values from the average control Ct values in the corresponding cell. An arbitrary value of 20 was added to make all values non-negative. These values are called AC20 (Average Control at 20) to reflect the property that this quantity is a log-based representation of gene expression values such that the average control gene values are rescaled to 20. Expression values of pluripotency-associated genes (Oct4, Sox2, Nanog, Lin28, Fbxo15, Zfp42, Fut4, Tbx3, Esrrb, Dppa2, Utf1, Sall4, Gdf3 and Fgf4) that were lower than the maximum values observed in MEF samples are potential false positives and were therefore set to zero.
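The AC20 calculation described above can be sketched as follows. This is a minimal illustration, not the authors' software: the function names and the Ct readings are hypothetical, and the fold-change helper assumes that one Ct unit corresponds to a two-fold difference (perfect amplification efficiency).

```python
from statistics import mean

OFFSET = 20.0  # the arbitrary value of 20 mentioned in the text

def ac_value(gene_cts, control_cts, offset=OFFSET):
    """Offset + mean(control Ct) - mean(gene Ct) for one gene in one cell."""
    return offset + mean(control_cts) - mean(gene_cts)

def mask_background(value, mef_max):
    """Set a pluripotency-gene value to zero when it is below the MEF maximum."""
    return 0.0 if value < mef_max else value

def fold_change(ac_from, ac_to):
    """Fold change between two AC values, assuming a log2 scale (assumption)."""
    return 2.0 ** (ac_to - ac_from)

# Hypothetical Ct readings for one cell (illustrative numbers only).
control_cts = [18.0, 19.0]   # e.g. averaged readings of Hprt and Gapdh
nanog_cts = [22.0, 23.0]     # duplicate readings of one gene

nanog_ac20 = ac_value(nanog_cts, control_cts)
print(nanog_ac20)
print(mask_background(nanog_ac20, mef_max=17.0))
print(fold_change(14.0, 16.0))
```

A gene whose average Ct equals the average control Ct gets the value 20, and larger values indicate higher expression.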
Single-cell Data Visualization
Principal component analysis (PCA) was performed in R using the Bayesian Principal Component Analysis (bpca) function with missing value estimation (MVE) provided in the pcaMethods package. The PCA scores of principal component 1 (PC1) and PC2 are color coded according to cell type, and the loadings of each variable (gene) are represented in scatter plots.
Single-cell gene expression qPCR
Inventoried TaqMan assays (Applied Biosystems, see Supplemental Methods) were pooled to a final concentration of 0.2X for each of the 48 assays. Individual cells were sorted directly into 5 μl RT-PreAmp Master Mix (2.5 μl CellsDirect Reaction Mix (Invitrogen); 1.25 μl 0.2X pooled assays; 0.1 μl RT/Taq enzyme [CellsDirect qRT-PCR kit, Invitrogen]; 1.15 μl water). Cell lysis and sequence-specific reverse transcription were performed at 50°C for 15 min. The reverse transcriptase was inactivated by heating to 95°C for 2 min. Subsequently, in the same tube, cDNA went through sequence-specific amplification by denaturing at 95°C for 15 s, and annealing and amplification at 60°C for 4 min, for 18 cycles. These preamplified products were diluted 5-fold prior to analysis with Universal PCR Master Mix and inventoried TaqMan gene expression assays (ABI) in 96.96 Dynamic Arrays on a BioMark System (Fluidigm). Ct values were calculated by the system's software (BioMark Real-time PCR Analysis; Fluidigm). Each assay was performed in replicate.
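As a quick arithmetic check, the four listed components add up to the stated 5 μl per cell. The sketch below scales the mix to a batch of sorted cells; the 10% pipetting overage and the function name are assumptions made for illustration and are not part of the described protocol.

```python
# Per-cell RT-PreAmp Master Mix volumes (microliters), as listed in the text.
PER_CELL_UL = {
    "CellsDirect Reaction Mix": 2.5,
    "pooled assays": 1.25,
    "RT/Taq enzyme": 0.1,
    "water": 1.15,
}

def master_mix(n_cells, overage=0.10):
    """Scale each component to n_cells reactions plus a fractional overage (assumed)."""
    factor = n_cells * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in PER_CELL_UL.items()}

per_cell_total = sum(PER_CELL_UL.values())
plate_mix = master_mix(96)

print(f"per-cell total: {per_cell_total:.2f} ul")
for name, vol in plate_mix.items():
    print(f"{name}: {vol} ul")
```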
Jensen-Shannon Divergence
Jensen-Shannon Divergence (JSD) was calculated to assess within-group similarity of gene expression of cells within each sample according to [Lin, J. (1991). "Divergence measures based on the Shannon entropy". IEEE Transactions on Information Theory 37 (1): 145-151. doi:10.1109/18.61115]. Expression values of genes (with missing values predicted by bpca as described in the PCA analysis) were transformed so that they sum to 1 in each cell. Each cell is thus represented as a vector of probabilities Pi. Cells from the same sample were grouped together and, for each group, the Jensen-Shannon Divergence (JSD) was calculated from the probability vectors (P1, P2, ..., Pn) of the cells in that group.
Figure imgf000107_0001
where H is the Shannon entropy, given by:
Figure imgf000108_0001
Confidence intervals (CIs) were estimated by bootstrapping (sampling with replacement). The 95% CIs were shown as error bars.
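The within-group divergence described above can be sketched using the generalized Jensen-Shannon divergence of Lin (1991), that is, the entropy of the weighted mixture minus the weighted mean of the individual entropies. Equal weights for all cells and base-2 logarithms are assumptions here, since the original formulas are not legible in this text; the expression values are hypothetical.

```python
import math

def shannon_entropy(p):
    """Shannon entropy in bits of a probability vector."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def normalize(values):
    """Rescale non-negative expression values so they sum to 1."""
    total = sum(values)
    return [v / total for v in values]

def jensen_shannon_divergence(vectors, weights=None):
    """Generalized JSD: H(sum_i w_i P_i) - sum_i w_i H(P_i)."""
    n = len(vectors)
    if weights is None:
        weights = [1.0 / n] * n  # assumption: every cell weighted equally
    mixture = [sum(w * p[k] for w, p in zip(weights, vectors))
               for k in range(len(vectors[0]))]
    return shannon_entropy(mixture) - sum(
        w * shannon_entropy(p) for w, p in zip(weights, vectors))

# Hypothetical expression values for three cells over four genes.
cells = [[10.0, 5.0, 0.0, 5.0], [8.0, 6.0, 1.0, 5.0], [0.0, 0.0, 12.0, 8.0]]
group_jsd = jensen_shannon_divergence([normalize(c) for c in cells])
print(round(group_jsd, 3))
```

A group of identical cells gives a divergence of zero, and larger values indicate more heterogeneous expression within the group.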
Supplemental Methods
Single-cell Data Processing Methods
Ct values obtained from the BioMark System were converted into log-based expression values according to the following set of rules.
Ct value processing filters (in order of execution)
Primary filter:
1) For each gene,
a. For each gene, including controls, remove data with CtCall = FAILED and CtQuality < threshold (--Ct-quality-threshold, Default: No Threshold)
b. For each gene, including controls, remove CtValues >= CtValueThreshold (--Ct-value-threshold, Default: 30.0) to filter out low expression genes (they will be treated as not expressed)
c. At this point, no values that are FAILs remain.
d. For each gene, including controls, set all the CtCall to "INC" (inconsistent) if the difference between the maximum CtValue and the minimum CtValue > MaxCtRepDev (--max-Ct-deviation-between-replicates, Default: 2.0)
Sample filter:
2) For each control gene (in control gene list):
a. if it is not found, remove the whole sample row.
b. if that gene is marked as "INC" (in 2 of primary filter), remove the whole sample row
c. if no more CtValues are retained after primary filter or the number of CtValues < minValidReplicatesControl (--min-number-of-valid-data-point-per-control, Default: 1), remove the whole sample row.
d. If the mean of the CtValues > CtValueThresholdPerControl (--Ct-value-threshold-for-per-control-average, default: 25.0), remove the whole sample row.
Gene filter:
3) For each non-control gene:
a. if that gene is marked as "INC" (in 2 of primary filter) or has all CtValues removed, don't do anything here. Do not continue to next step.
b. If number of CtValues retained after primary filter is not zero but is < minValidReplicates (--min-number-of-valid-data-point-per-gene, default: 2), mark gene as "INC"
c. If the mean of the CtValues > CtValueThresholdPerData (--Ct-value-threshold-for-data-average, default: 30.0), remove gene (remove all CtValues)
ACx Output:
4) For each sample:
a. If sample is invalidated by sample filter, don't continue to next step.
b. For each non-control gene:
i. If gene not found for this sample, output NA
ii. If "INC", output NA
iii. If no CtValues (i.e., removed by primary filter or gene filter), output 0.0 (for genes that don't express, CtCall will be highly likely to FAIL in most/all replicates).
iv. Else output ACx(g,s), where x is set by --offset-output (default: 20)
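The primary filter and the per-gene output rules listed above can be sketched as follows. This is an illustrative reading of the rules, not the original tool: the record field names ("call", "quality", "ct") and function names are hypothetical, step a is interpreted as dropping a reading that either failed or falls below an optional quality cut-off, and the sample filter and gene-filter step c are omitted for brevity.

```python
CT_VALUE_THRESHOLD = 30.0  # default CtValueThreshold from the text
MAX_CT_REP_DEV = 2.0       # default MaxCtRepDev from the text

def primary_filter(replicates, ct_quality_threshold=None):
    """Apply primary-filter steps a, b and d to one gene's replicate readings.

    Returns (retained Ct values, inconsistent flag).
    """
    kept = []
    for rep in replicates:
        if rep["call"] == "FAILED":
            continue  # step a: failed calls are dropped
        if ct_quality_threshold is not None and rep["quality"] < ct_quality_threshold:
            continue  # step a: optional quality cut-off (off by default)
        if rep["ct"] >= CT_VALUE_THRESHOLD:
            continue  # step b: treated as not expressed
        kept.append(rep["ct"])
    # step d: replicates that disagree by more than MaxCtRepDev are inconsistent
    inconsistent = bool(kept) and (max(kept) - min(kept) > MAX_CT_REP_DEV)
    return kept, inconsistent

def gene_output(kept, inconsistent, control_mean_ct, offset=20.0,
                min_valid_replicates=2):
    """Output rules ii-iv for one non-control gene in a valid sample."""
    if inconsistent:
        return "NA"
    if not kept:
        return 0.0
    if len(kept) < min_valid_replicates:
        return "NA"  # gene filter step b marks the gene as inconsistent
    return offset + control_mean_ct - sum(kept) / len(kept)

reps = [{"call": "PASS", "quality": 0.9, "ct": 21.0},
        {"call": "PASS", "quality": 0.9, "ct": 22.0}]
kept, inc = primary_filter(reps)
print(gene_output(kept, inc, control_mean_ct=18.5))
```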
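Average Control at x (ACx) values
\[ \mathrm{ACx}(g, C, s) = x + \overline{Ct}(C, s) - \overline{Ct}(g, s) \]
for gene g, controls c ∈ C and sample s, where \overline{Ct}(C, s) is the average Ct of the control genes in sample s and \overline{Ct}(g, s) is the average Ct of gene g in sample s.
Property:
When \overline{Ct}(C, s) = \overline{Ct}(g, s), ACx(g, C, s) = x. Larger value => higher expression.
To find the fold change of gene g from sample 1 (s1) to sample 2 (s2):
\[ \frac{\text{expression of } g \text{ in sample 2}}{\text{expression of } g \text{ in sample 1}} = 2^{\mathrm{ACx}(g, C, s_2) - \mathrm{ACx}(g, C, s_1)} \]
Bayesian Network Analysis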
A Bayesian network was constructed using BNFinder (Wilczynski and Dojer, 2009). The cells used are listed below.
Sample name
1 NGFP_2_iPS_table.S16
2 NGFP_2_iPS_table.S15
3 NGFP_2_iPS_table.S24
4 NGFP_2_iPS_table.S76
5 NGFP_2_iPS_table.S75
6 NGFP_2_iPS_table.S21
7 NGFP_2_iPS_table.S30
8 NGFP_2_iPS_table.S25
9 NGFP_2_iPS_table.S64
10 NGFP_2_iPS_table.S59
11 NGFP_2_iPS_table.S55
12 NGFP_2_iPS_table.S37
13 NGFP_2_iPS_table.S54
14 NGFP_2_iPS_table.S43
15 NGFP_2_MEFs_2_days_dox_table.S80
16 NGFP_2_MEFs_2_days_dox_table.S64
17 NGFP_2_MEFs_4_days_dox_table.S02
18 NGFP_2_MEFs_4_days_dox_table.S72
19 NGFP_2_MEFs_4_days_dox_table.S59
20 NGFP_2_MEFs_4_days_dox_table.S57
21 NGFP_2_MEFs_6_days_dox_table.S81
22 NGFP_2_MEFs_6_days_dox_table.S20
23 NGFP_2_MEFs_6_days_dox_table.S69
24 NGFP_2_MEFs_6_days_dox_table.S26
25 NGFP_2_MEFs_p3_table.S77
26 NGFP_2_MEFs_p3_table.S30
27 NGFP_2_MEFs_p3_table.S58
28 NGFP_2_T6_MEFs_2_days_dox_table.S94
29 NGFP_2_T6_MEFs_2_days_dox_table.S91
30 NGFP_2_T6_MEFs_2_days_dox_table.S31
31 NGFP_2_T6_MEFs_2_days_dox_table.S40
32 NGFP_2_T6_MEFs_2_days_dox_table.S54
33 NGFP_2_T6_MEFs_6_days_dox_table.S05
34 NGFP_2_T6_colony_15_table.S81
35 NGFP_2_T6_colony_15_table.S77
36 NGFP_2_T6_colony_15_table.S34
37 NGFP_2_T6_colony_15_table.S57
38 NGFP_2_T6_colony_15_table.S45
39 NGFP_2_T6_colony_15_table.S50
40 NGFP_2_T6_colony_16_dox_independent_table.S12
41 NGFP_2_T6_colony_16_dox_independent_table.S21
42 NGFP_2_T6_colony_16_dox_independent_table.S69
43 NGFP_2_T6_colony_16_dox_independent_table.S68
44 NGFP_2_T6_colony_16_dox_independent_table.S59
45 NGFP_2_T6_colony_16_dox_independent_table.S56
46 NGFP_2_T6_colony_16_dox_independent_table.S51
47 NGFP_2_T6_colony_16_gfp_negative_tdtomato_positive_table.S94
48 NGFP_2_T6_colony_16_gfp_negative_tdtomato_positive_table.S92
49 NGFP_2_T6_colony_16_gfp_negative_tdtomato_positive_table.S12
Table 3. Quantitative real-time PCR primers
Figure imgf000112_0001
Table 4. Primers used for cloning of cDNA for lentiviral vectors
Sall4-cDNA: Forward GCAAGTCACCAGGGCTCTT (SEQ ID NO. 1); Reverse CCTCCTTAGCTGACAGCAAT (SEQ ID NO. 2)
Esrrb-cDNA: Forward GCTGGAACACCTGAGGGTAA (SEQ ID NO. 3); Reverse GGTCTCCACTTGGATCGTGT (SEQ ID NO. 4)
Lin28-cDNA: Forward Hanna et al. 2009, Nature; Reverse Hanna et al. 2009, Nature
Nanog-cDNA: Forward CGCCATCACACTGACATGA (SEQ ID NO. 5); Reverse TGGAAGAAGGAAGGAACCTG (SEQ ID NO. 6)
Ezh2-cDNA: Forward GAAGAATAATCATGGGCCAGAC (SEQ ID NO. 51); Reverse TGCCCACAGTACTCAAGGTTC (SEQ ID NO. 52)
Utf1-cDNA: Forward CTACCTGGCTCAGGGATGCT (SEQ ID NO. 53); Reverse GACTGGGAGTCGTTTCTGGA (SEQ ID NO. 54)
Dppa2-cDNA: Forward AAAGAAGTCGGCATTCATTCA (SEQ ID NO. 55); Reverse ATTCTTCCATTCCCTTTAGATCA (SEQ ID NO. 56)
Ezh2-cDNA: Forward GAAGAATAATCATGGGCCAGAC (SEQ ID NO. 57); Reverse TGCCCACAGTACTCAAGGTTC (SEQ ID NO. 58)
Table 5. 48 Inventoried TaqMan assays (obtained from Applied Biosystems)
1 Mm00650983_g1 Inventoried Cdc20
2 Mm00660135_m1 Inventoried Bub1
3 Mm00432385_m1 Inventoried Ccnf
4 Mm00786984_s1 Inventoried Mad2l1
5 Mm02391771_g1 Inventoried Hdac1
6 Mm00484020_m1 Inventoried Ctcf
7 Mm01211941_m1 Inventoried Myst3
8 Mm03053249_g1 Inventoried Myst4
9 Mm03053308_g1 Inventoried Bmi1
10 Mm01181033_m1 Inventoried Kdm1
11 Mm00599763_m1 Inventoried Dnmt1
12 Mm03053759_s1 Inventoried Nr6a1
13 Mm03053810_s1 Inventoried Sox2
14 Mm03053707_s1 Inventoried Bmpr1a
15 Mm03053495_s1 Inventoried Dnmt3b
16 Mm00487448_s1 Inventoried Fut4 (SSEA-1)
17 Mm00442942_m1 Inventoried Lifr
18 Mm00473214_s1 Inventoried Lin28
19 Mm02384862_g1 Inventoried Nanog
20 Mm03053917_g1 Inventoried Pou5f1 (Oct4)
21 Mm01265526_m1 Inventoried Fbxo15
22 Mm01192270_m1 Inventoried Slc2a1
23 Mm00447703_g1 Inventoried Utf1
24 Mm03053975_g1 Inventoried Zfp42 (Rex-1)
25 Mm03053853_s1 Inventoried Esrrb
26 Mm03053490_s1 Inventoried Stat3
27 Mm03023989_g1 Inventoried Grb2
28 Mm00809779_s1 Inventoried Tbx3
29 Mm01343391_gH Inventoried Dppa2
30 Mm01615680_sH Inventoried Fthl17
31 Mm00453037_s1 Inventoried Sall4
32 Mm03023988_m1 Inventoried Gdf3
33 Mm00499427_m1 Inventoried Ctnnb1 (β-catenin)
34 Mm01243796_g1 Inventoried Csnk2a1
35 Mm03053261_s1 Inventoried Gsk3b
36 Mm00810320_s1 Inventoried Wnt1
37 Mm01342805_m1 Inventoried Hes1
38 Mm03053874_s1 Inventoried Jag1
39 Mm03053614_s1 Inventoried Notch1
40 Mm01159248_m1 Inventoried Ezh2
41 Mm03053244_s1 Inventoried Nes
42 Mm03053741_s1 Inventoried Fgf4
43 Mm03053745_s1 Inventoried Fgf5
44 Mm00493681_m1 Inventoried Thy1
45 Mm00483675_m1 Inventoried Col5a2
46 Mm00446968_m1 Inventoried Hprt
47 Mm03302249_g1 Inventoried Gapdh
48 Mm01250624_m1 Inventoried Prmt7
Table 6. shRNA lentiviral vector set (obtained from Open Biosystems):
Esrrb RMM4534-NM_011934
Sall4 RMM4534-NM_201396
Utf1 RMM4534-NM_009482
Ezh2 RMM4534-NM_007971
References
1. Ang, Y.S., Tsai, S.Y., Lee, D.F., Monk, J., Su, J., Ratnakumar, K., Ding, J., Ge, Y., Darr, H., Chang, B., et al. (2011). Wdr5 mediates self-renewal and reprogramming via the embryonic stem cell core transcriptional network. Cell 145, 183-197.
2. Ballabeni, A., Park, I.H., Zhao, R., Wang, W., Lerou, P.H., Daley, G.Q., and Kirschner, M.W. (2011). Cell cycle adaptations of embryonic stem cells. Proc Natl Acad Sci U S A 108, 19252-19257.
3. Banito, A., Rashid, S.T., Acosta, J.C., Li, S., Pereira, C.F., Geti, I., Pinho, S., Silva, J.C., Azuara, V., Walsh, M., et al. (2009). Senescence impairs successful reprogramming to pluripotent stem cells. Genes Dev 23, 2134-2139.
4. Barrero, M.J., Boue, S., and Izpisua Belmonte, J.C. (2010). Epigenetic mechanisms that regulate cell identity. Cell Stem Cell 7, 565-570.
5. Bhutani, N., Brady, J.J., Damian, M., Sacco, A., Corbel, S.Y., and Blau, H.M. (2010). Reprogramming towards pluripotency requires AID-dependent DNA demethylation. Nature 463, 1042-1047.
6. Boiani, M., Eckardt, S., Scholer, H.R., and McLaughlin, K.J. (2002). Oct4 distribution and level in mouse clones: consequences for pluripotency. Genes Dev 16, 1209-1219.
7. Boiani, M., and Scholer, H.R. (2005). Regulatory networks in embryo-derived pluripotent stem cells. Nat Rev Mol Cell Biol 6, 872-884.
8. Boland, M.J., Hazen, J.L., Nazor, K.L., Rodriguez, A.R., Gifford, W., Martin, G., Kupriyanov, S., and Baldwin, K.K. (2009). Adult mice generated from induced pluripotent stem cells. Nature 461, 91-94.
9. Boyer, L.A., Lee, T.I., Cole, M.F., Johnstone, S.E., Levine, S.S., Zucker, J.P., Guenther, M.G., Kumar, R.M., Murray, H.L., Jenner, R.G., et al. (2005). Core transcriptional regulatory circuitry in human embryonic stem cells. Cell 122, 947-956.
10. Brambrink, T., Foreman, R., Welstead, G.G., Lengner, C.J., Wernig, M., Suh, H., and Jaenisch, R. (2008). Sequential expression of pluripotency markers during direct reprogramming of mouse somatic cells. Cell Stem Cell 2, 151-159.
11. Branco, M.R., Oda, M., and Reik, W. (2008). Safeguarding parental identity: Dnmt1 maintains imprints during epigenetic reprogramming in early embryogenesis. Genes Dev 22, 1567-1571.
12. Carey, B.W., Markoulaki, S., Hanna, J.H., Faddah, D.A., Buganim, Y., Kim, J., Ganz, K., Steine, E.J., Cassady, J.P., Creyghton, M.P., et al. (2011). Reprogramming factor stoichiometry influences the epigenetic state and biological properties of induced pluripotent stem cells. Cell Stem Cell 9, 588-598.
13. Cavaleri, F., and Scholer, H.R. (2003). Nanog: a new recruit to the embryonic stem cell orchestra. Cell 113, 551-552.
14. Citri, A., Pang, Z.P., Sudhof, T.C., Wernig, M., and Malenka, R.C. (2012). Comprehensive qPCR profiling of gene expression in single neuronal cells. Nat Protoc 7, 118-127.
15. Dalerba, P., Kalisky, T., Sahoo, D., Rajendran, P.S., Rothenberg, M.E., Leyrat, A.A., Sim, S., Okamoto, J., Johnston, D.M., Qian, D., et al. (2011). Single-cell dissection of transcriptional heterogeneity in human colon tumors. Nat Biotechnol 29, 1120-1127.
16. Diehn, M., Cho, R.W., Lobo, N.A., Kalisky, T., Dorie, M.J., Kulp, A.N., Qian, D., Lam, J.S., Ailles, L.E., Wong, M., et al. (2009). Association of reactive oxygen species levels and radioresistance in cancer stem cells. Nature 458, 780-783.
17. Do, J.T., and Scholer, H.R. (2010). Cell fusion-induced reprogramming. Methods Mol Biol 636, 179-190.
18. Edel, M.J., and Izpisua Belmonte, J.C. (2010). The cell cycle and pluripotency: Is there a direct link? Cell Cycle 9, 2694-2695.
19. Egli, D., Birkhoff, G., and Eggan, K. (2008). Mediators of reprogramming: transcription factors and transitions through mitosis. Nat Rev Mol Cell Biol 9, 505-516.
20. Farthing, C.R., Ficz, G., Ng, R.K., Chan, C.F., Andrews, S., Dean, W., Hemberger, M., and Reik, W. (2008). Global mapping of DNA methylation in mouse promoters reveals epigenetic reprogramming of pluripotency genes. PLoS Genet 4, e1000116.
21. Feng, B., Jiang, J., Kraus, P., Ng, J.H., Heng, J.C., Chan, Y.S., Yaw, L.P., Zhang, W., Loh, Y.H., Han, J., et al. (2009). Reprogramming of fibroblasts into induced pluripotent stem cells with orphan nuclear receptor Esrrb. Nat Cell Biol 11, 197-203.
22. Feng, S., Jacobsen, S.E., and Reik, W. (2010). Epigenetic reprogramming in plant and animal development. Science 330, 622-627.
23. Furusawa, T., Ikeda, M., Inoue, F., Ohkoshi, K., Hamano, T., and Tokunaga, T. (2006). Gene expression profiling of mouse embryonic stem cell subpopulations. Biol Reprod 75, 555-561.
24. Gonzalez, F., Boue, S., and Izpisua Belmonte, J.C. (2011). Methods for making induced pluripotent stem cells: reprogramming a la carte. Nat Rev Genet 12, 231-242.
25. Graf, T., and Stadtfeld, M. (2008). Heterogeneity of embryonic and adult stem cells. Cell Stem Cell 3, 480-483.
26. Guo, G., Huss, M., Tong, G.Q., Wang, C., Li Sun, L., Clarke, N.D., and Robson, P. (2010). Resolution of cell fate decisions revealed by single-cell gene expression analysis from zygote to blastocyst. Dev Cell 18, 675-685.
27. Hanna, J., Markoulaki, S., Schorderet, P., Carey, B.W., Beard, C., Wernig, M., Creyghton, M.P., Steine, E.J., Cassady, J.P., Foreman, R., et al. (2008). Direct reprogramming of terminally differentiated mature B lymphocytes to pluripotency. Cell 133, 250-264.
28. Hanna, J., Saha, K., Pando, B., van Zon, J., Lengner, C.J., Creyghton, M.P., van Oudenaarden, A., and Jaenisch, R. (2009). Direct cell reprogramming is a stochastic process amenable to acceleration. Nature 462, 595-601.
29. Hanna, J.H., Saha, K., and Jaenisch, R. (2010). Pluripotency and cellular reprogramming: facts, hypotheses, unresolved issues. Cell 143, 508-525.
30. Hayashi, K., Lopes, S.M., Tang, F., and Surani, M.A. (2008). Dynamic equilibrium and heterogeneity of mouse pluripotent stem cells with distinct functional and epigenetic states. Cell Stem Cell 3, 391-401.
31. Hemberger, M., Dean, W., and Reik, W. (2009). Epigenetic dynamics of stem cells and cell lineage commitment: digging Waddington's canal. Nat Rev Mol Cell Biol 10, 526-537.
Hong, H., Takahashi, K., Ichisaka, T., Aoi, T., Kanagawa, O., Nakagawa, M., Okita, ., and Yamanaka, S. (2009). Suppression of induced pluripotent stem cell generation by the p53-p21 pathway. Nature 460, 1 132- 1 135.
Itzkovitz, S., Lyubimova, A., Blat, I.C., Maynard, M., van Es, J., Lees, J., Jacks, T., Clevers, H., and van Oudenaarden, A. (201 1 ). Single-molecule transcript counting of stem-cell markers in the mouse intestine. Nat Cell Biol 14, 1 06-1 14.
Ivanova, N., Dobrin, R., Lu, R., Kotenko, I., Levorse, J., DeCoste, C, Schafer, X., Lun, Y., and Lemischka, I.R. (2006). Dissecting self-renewal in stem cel ls with RNA interference. Nature 442, 533-538.
Jaenisch, R., and Young, R. (2008). Stem cells, the molecular circuitry of pluripotency and nuclear reprogramming. Cell 132, 567-582.
Kalisky, T., Blainey, P., and Quake, S.R. (201 1 ). Genomic analysis at the single-cell level. Annu Rev Genet 45, 43 1 -445.
Kalisky, T., and Quake, S.R. (201 1 ). Single-cell genomics. Nat Methods 8, 31 1 -3 14.
Kawamura, T., Suzuki, J., Wang, Y.V., Menendez, S., Morera, L.B., Raya, A., Wahl, G.M., and Izpisua Belmonte, J.C. (2009). Linking the p53 tumour suppressor pathway to somatic cell reprogramming. Nature 460, 1 140- 1 144. Kim, J.B., Greber, B., Arauzo-Bravo, M.J., Meyer, J., Park, K.I., Zaehres, H., and Scholer, H.R. (2009a). Direct reprogramming of human neural stem cells by OCT4. Nature 461, 649-643.
Kim, J.B., Sebastiano, V., Wu, G., Arauzo-Bravo, M.J., Sasse, P., Gentile, L., Ko, K., Ruau, D., Ehrich, M., van den Boom, D., et al. (2009b). Oct4-induced pluripotency in adult neural stem cells. Cell 136, 41 1 -419.
Kim, K., Doi, A., Wen, B., Ng, K., Zhao, R., Cahan, P., Kim, J., Aryee, M.J., Ji, H., Ehrlich, L. L, et al. (2010). Epigenetic memory in induced pluripotent stem cells. Nature 467, 285-290. Koche, R.P., Smith, Z.D., Adli, M., Gu, H., Ku, M ., Gnirke, A., Bernstein, B. E., and Meissner, A. (201 1 ). Reprogramming factor expression initiates widespread targeted chromatin remodeling. Cell Stem Cell 8, 96- 105.
Kurukuti, S., Tiwari, V. ., Tavoosidana, G., Pugacheva, E., Murrell, A., Zhao, Z., Lobanenkov, V., Reik, W., and Ohlsson, R. (2006). CTCF binding at the H I 9 imprinting control region mediates maternally inherited higher-order chromatin conformation to restrict enhancer access to Igf2. Proc Natl Acad Sci U S A 103, 10684- 1 0689.
Lewitzky, M., and Yamanaka, S. (2007). Reprogramming somatic cells towards pluripotency by defined factors. Curr Opin Biotechnol 18, 467-473. Li, H., Collado, M., Villasante, A., Strati, K., Ortega, S., Canamero, M., Blasco, M.A., and Serrano, M. (2009). The Ink4/Arf locus is a barrier for iPS cell reprogramm ing. Nature 460, 1 136- 1 139.
Lu, R., Markowetz, F., Unwin, R.D., Leek, J.T., Airoldi, E.M., MacArthur,
B. D., Lachmann, A., Rozov, R., Ma'ayan, A., Boyer, L.A., et al. (2009). Systems-level dynamic analyses of fate change in murine embryonic stem cells. Nature 462, 358-362.
Macarthur, B.D., Ma'ayan, A., and Lemischka, I.R. (2009). Systems biology of stem cell fate and cellular reprogramming. Nat Rev Mol Cell Biol 10, 672- 681 .
Maherali, N., Sridharan, R., Xie, W., Utikal, J., Eminli, S., Arnold, K., Stadtfeld, M., Yachechko, R„ Tchieu, J., Jaenisch, R., el al. (2007). Directly reprograrnrned fibroblasts show global epigenetic remodeling and widespread tissue contribution. Cell Stem Cell 1, 55-70.
Maldonado-Saldivia, J., van den Bergen, J., Krouskos, M., Gilchrist, M., Lee,
C, Li, R., Sinclair, A.H., Surani, M.A., and Western, P. S. (2007). Dppa2 and Dppa4 are closely linked SAP motif genes restricted to pluripotent cells and the germ line. Stem Cells 25, 19-28.
Marson, A., Foreman, R., Chevalier, B., Bilodeau, S., Kahn, M., Young, R.A., and Jaenisch, R. (2008). Wnt signaling promotes reprogramming of somatic cells to pluripotency. Cell Stem Cell 3, 1 32- 135. 5 1 . Masui, S., Nakatake, Y., Toyooka, Y., Shimosato, D., Yagi, R., Takahashi, K., Okochi, H., Okuda, A., Matoba, R., Sharov, A. A., et al. (2007). Pluripotency governed by Sox2 via regulation of Oct3/4 expression in mouse embryonic stem cells. Nat Cell Biol 9, 625-635.
52. Meissner, A., Wernig, M., and Jaenisch, R. (2007). Direct reprogramming of genetically unmodified fibroblasts into pluripotent stem cells. Nat Biotechnol 25, 1 1 77- 1 1 81 .
53. Mikkelsen, T. S., Hanna, J., Zhang, X., Ku, M., Wernig, M., Schorderet, P., Bernstein, B.E., Jaenisch, R., Lander, E.S., and Meissner, A. (2008).
Dissecting direct reprogramm ing through integrative genomic analysis. Nature
454, 49-55.
54. Moon, J.H., Heo, J.S., Kim, J.S., Jun, E.K., Lee, J.H., Kim, A., Kim, J.,
Whang, K.Y., Kang, Y.K., Yeo, S., et al. (201 1 ). Reprogramming fibroblasts into induced pluripotent stem cells with Bmi l . Cell Res 21, 1305- 13 1 5.
55. Moore, K.A., and Lemischka, I.R. (2006). Stem cells and their niches. Science 311, 1 880- 1 885.
56. Morgan, H.D., Santos, F., Green, K., Dean, W., and Reik, W. (2005).
Epigenetic reprogramming in mammals. Hum Mol Genet 14 Spec No 1, R47- 58.
57. Morshedi, A., Soroush Noghabi, M., and Droge, P. (201 1 ). Use of UTF1 Genetic Control Elements as iPSC Reporter. Stem Cell Rev.
58. Narsinh, K.H., Sun, N., Sanchez-Freire, V., Lee, A.S., Almeida, P., Hu, S., Jan, T., Wilson, K.D., Leong, D., Rosenberg, J., et al. (201 1 ). Single cell transcriptional profiling reveals heterogeneity of human induced pluripotent stem cells. J Clin Invest 121, 1 217- 1221 .
59. Ng, H.LI., and Surani, M.A. (201 1 ). The transcriptional and signalling
networks of pluripotency. Nat Cell Biol 13, 490-496.
60. Okita, K., Ichisaka, T., and Yamanaka, S. (2007). Generation of germline- competent induced pluripotent stem cells. Nature 448, 313-3 17.
61 . Onder, T.T., Kara, N., Cherry, A., Sinha, A.U., Zhu, N„ Bemt, K.M., Cahan, P., Mancarci, O.B., Unternaehrer, J., Gupta, P.B., et al. (2012). Chromatin- modifying enzymes as modulators of reprogramming. Nature. Park, I.H., Zhao, R., West, J. A., Yabuuchi, A., Huo, H., Ince, T.A., Lerou, P.H., Lensch, M.W., and Daley, G.Q. (2008). Reprogramming of human somatic cells to pluripotency with defined factors. Nature 451, 141 - 1 46.
Pritsker, M, Ford, N.R., Jenq, H.T., and Lemischka, I.R. (2006). Genomewide gain-of-function genetic screen identifies functionally active genes in mouse embryonic stem cells. Proc Natl Acad Sci U S A 103, 6946-6951 .
Raj, A., Rifkin, S.A., Andersen, E., and van Oudenaarden, A. (201 0).
Variability in gene expression underlies incomplete penetrance. Nature 463, 913-91 8.
Raj, A., van den Bogaard, P., Rifkin, S.A., van Oudenaarden, A., and Tyagi, S, (2008). Imaging individual mRNA molecules using multiple singly labeled probes. Nat Methods 5, 877-879.
Raj, A., and van Oudenaarden, A. (2008). Nature, nurture, or chance:
stochastic gene expression and its consequences. Cell 135, 216-226.
Ramalho-Santos, M., Yoon, S., Matsuzaki, Y., Mulligan, R.C., and Melton, D.A. (2002). "Sternness": transcriptional profiling of embryonic and adult stem cells. Science 298, 597-600.
Reik, W. (2007). Stability and flexibility of epigenetic gene regulation in mammalian development. Nature 447, 425-432.
Samavarchi-Tehrani, P., Golipour, A., David, L., Sung, H. K., Beyer, T.A., Datti, A., Woltjen, K., Nagy, A., and Wrana, J.L. (2010). Functional genomics reveals a BMP-driven mesenchymal-to-epithelial transition in the initiation of somatic cell reprogramming. Cell Stem Cell 7, 64-77.
Scholer, H.R. (1 991 ). Octamania: the POU factors in murine development. Trends Genet 7, 323-329.
Scholer, H.R., Ruppert, S„ Suzuki, N., Chowdhury, K., and Gross, P. (1990). New type of POU domain in germ line-specific protein Oct-4. Nature 344, 435-439.
Silva, J., Nichols, J., Theunissen, T.W., Guo, G., van Oosten, A.L., Barrandon, O., Wray, J., Yamanaka, S., Chambers, I., and Smith, A. (2009). Nanog is the gateway to the pluripotent ground state. Cell 138, 722-737. Singhal, N., Graumann, J., Wu, G., Arauzo-Bravo, M.J., Han, D.W., Greber, B., Gentile, L„ Mann, M, and Scholer, H.R. (2010), Chromatin-Remodeling Components of the BAF Complex Facilitate Reprogramming. Cell 141, 943- 955.
Smith, Z.D., Nachman, I., Regev, A., and Meissner, A. (201 0). Dynamic single-cell imaging of direct reprogramming reveals an early specifying event. Nat Biotechnol 28, 521 -526.
Sridharan, R., Tchieu, J., Mason, M.J., Yachechko, R., Kuoy, E., Horvath, S., Zhou, Q., and Plath, K. (2009). Role of the murine reprogramming factors in the induction of pluripotency. Cell 136, 364-377.
Stadtfeld, M., Maherali, N., Breault, D.T., and Hochedlinger, . (2008). Defining molecular cornerstones during fibroblast to iPS cell reprogramming in mouse. Cell Stem Cell 2, 230-240.
Sterneckert, J., Hoing, S., and Scholer, H. R. (2012). Concise review: Oct4 and more: the reprogramming expressway. Stem Cells 30, 15-21 .
Surani, M.A., Hayashi, K., and Hajkova, P. (2007). Genetic and epigenetic regulators of pluripotency. Cell 128, 1 '4 '-762.
Takahashi, ., Tanabe, ., Ohnuki, M., Narita, M., Ichisaka, T., Tomoda, ., and Yamanaka, S. (2007). Induction of pluripotent stem cells from adult human fibroblasts by defined factors. Cell 131, 861 -872.
Takahashi, K., and Yamanaka, S. (2006). Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors. Cell 126, 663-676.
Tang, F., Barbacioru, C, Bao, S., Lee, C, Nordman, E., Wang, X., Lao, ., and Surani, M.A. (2010). Tracing the derivation of embryonic stem cells from the inner cell mass by single-cell RNA-Seq analysis. Cell Stem Cell 6, 468- 478.
Tang, F., Barbacioru, C, Wang, Y., Nordman, E., Lee, C, Xu, N., Wang, X., Bodeau, J., Tuch, B.B., Siddiqui, A., et al. (2009). mRNA-Seq whole- transcriptome analysis of a single cell. Nat Methods 6, 377-382.
Tang, F., Lao, K., and Surani, M.A. (201 1 ). Development and applications of single-cell transcriptome analysis. Nat Methods 8, S6- 1 1 . 84. Tiscornia, G., and Izpisua Belmonte, J.C. (201 0). M icroRNAs in embryonic stem cell function and fate. Genes Dev 24, 2732-2 '41 .
85. Utikal, J., Polo, J .M., Stadtfeld, M., Maherali, N., ulalert, W„ Walsh, R.M., halit, A., Rheinwald, J.G., and Hochedlinger, K. (2009). Immortal ization eliminates a roadblock during cellular reprogramming into iPS cells. Nature
460, 1 145- 1 148.
86. Varga, A.C., and Wrana, J. L. (2005). The disparate role of BMP in stem cell biology. Oncogene 24, 571 3-5721 .
87. Viswanathan, S.R., and Daley, G.Q. (2010). Lin28: A microRNA regulator with a macro role. Cell 140, 445-449.
88. Viswanathan, S. R., Daley, G.Q., and Gregory, R.I. (2008). Selective blockade of microRNA processing by Lin28. Science 320, 97- 1 00.
89. Wernig, M., Lengner, C.J., Hanna, J., Lodato, M.A., Steine, E., Foreman, R., Staerk, J., Markoulaki, S., and Jaenisch, R. (2008). A drug-inducible transgenic system for direct reprogramming of multiple somatic cell types. Nat
Biotechnol 26, 91 6-924.
90. Wernig, M, Meissner, A., Foreman, R., Brambrink, T., Ku, M., Hochedlinger, K., Bernstein, B.E., and Jaenisch, R. (2007). In vitro reprogramming of fibroblasts into a pluripotent ES-cell-like state. Nature 448, 3 18-324.
91 . West, J. A., Viswanathan, S. R., Yabuuchi, A., Cunniff, K., Takeuchi, A., Park, I.H., Sero, J.E., Zhu, H., Perez-Atayde, A., Frazier, A.L., et al. (2009). A role for Lin28 in primordial germ-cell development and germ-cell malignancy. Nature 460, 909-913.
92. Wilczynski, B., and Dojer, N. (2009). BNFinder: exact and efficient method for learning Bayesian networks. Bioinformatics 25, 286-287.
93. Yamanaka, S. (2009). Elite and stochastic models for induced pluripotent stem cell generation. Nature 460, 49-52.
94. Yu, J., Vodyanik, M.A., Smuga-Otto, K., Antosiewicz-Bourget, J., Frane, J. L., Tian, S., Nie, J., Jonsdottir, G.A., Ruotti, V., Stewart, R., et al. (2007).
Induced pluripotent stem cell lines derived from human somatic cells. Science
318, 191 7- 1 920. 95. Zhang, J„ Tam, W.L., Tong, G.Q., Wu, Q., Chan, H.Y., Soh, B.S., Lou, Y., Yang, J., Ma, Y., Chai, L„ et al, (2006). Sa! l4 modulates embryonic stem cell pluripotency and early embryonic development by the transcriptional regulation of Pou5fl . Nat Cell Biol 8, 1 1 1 4- 1 1 23.
96. Zhao, X.Y., Li, W., Lv, Z., Liu, L., Tong, M., Hai, T., Hao, J., Guo, C.L., Ma, Q.W., Wang, L., et al. (2009). iPS cells produce viable mice through tetraploid complementation. Nature 461, 86-90.
97. Pera, M. F. Stem cells: The dark side of induced pluripotency. Nature 471, 46-47, (201 1 ).
98. Zhao, X. Y. et al. iPS cells produce viable mice through tetraploid complementation. Nature 461 , 86-90, (2009).
99. Jiang, J. et al. Zscan4 promotes genomic stability during reprogramm ing and dramatically improves the quality of iPS cells as demonstrated by tetraploid complementation. Cell Res, (2012).
1 00. Kang, L., Wang, J., Zhang, Y., Kou, Z. & Gao, S. iPS cells can support full- term development of tetraploid blastocyst-complemented embryos. Cell Stem Cell 5, 1 35- 138, (2009).
1 01 . Boland, M. J. et al. Adult mice generated from induced pluripotent stem cells.
Nature 461 , 91 -94, (2009).
1 02. Jiang, J. el al. Different developmental potential of pluripotent stem cells generated by different reprogramming strategies. J Mol Cell Biol 3, 1 97- 199, (201 1 ).
103. Kim, . et al. Epigenetic memory in induced pluripotent stem cells. Nature 467, 285-290, (2010).
1 04. Polo, J. M. et al. Cell type of origin influences the molecular and functional properties of mouse induced pluripotent stem cells. Nat Biotechnol 28, 848- 855, (201 0).
105. Brambrink, T., Hochedlinger, K., Bell, G. & Jaenisch, R. ES cells derived from cloned and fertilized blastocysts are transcriptionally and functionally indistinguishable. Proc Natl Acad Sci U SA 103, 933-938, (2006). 106. Wakayama, S. et al. Equivalency of nuclear transfer-derived embryonic stem cells to those derived from fertilized mouse blastocysts. Stem Cells 24, 2023- 2033, (2006).
107. Hussein, S. M. et al. Copy number variation and selection during reprogramming to pluripotency. Nature 471 , 58-62, (2011).
108. Laurent, L. C. et al. Dynamic changes in the copy number of pluripotency and cell proliferation genes in human ESCs and iPSCs during reprogramming and time in culture. Cell Stem Cell 8, 106-118, (2011 ).
109. Gore, A. et al. Somatic coding mutations in human induced pluripotent stem cells. Nature All, 63-67, (2011).
110. Mayshar, Y. et al. Identification and classification of chromosomal aberrations in human induced pluripotent stem cells. Cell Stem Cell 7, 521-531, (2010).
111. Lister, R. et al. Hotspots of aberrant epigenomic reprogramming in human induced pluripotent stem cells. Nature 471, 68-73, (2011).
112. Doi, A. et al. Differential methylation of tissue- and cancer-specific CpG island shores distinguishes human induced pluripotent stem cells, embryonic stem cells and fibroblasts. Nat Genet 41, 1350-1353, (2009).
113. Ohi, Y. et al. Incomplete DNA methylation underlies a transcriptional memory of somatic cells in human iPS cells. Nat Cell Biol 13, 541-549, (2011).
114. Bar-Nur, O., Russ, H. A., Efrat, S. & Benvenisty, N. Epigenetic memory and preferential lineage-specific differentiation in induced pluripotent stem cells derived from human pancreatic islet beta cells. Cell Stem Cell 9, 17-23, (2011).
115. Kim, K. et al. Donor cell type can influence the epigenome and differentiation potential of human induced pluripotent stem cells. Nat Biotechnol 29, 1117- 1119, (2011).
116. Chin, M. H. et al. Induced pluripotent stem cells and embryonic stem cells are distinguished by gene expression signatures. Cell Stem Cell 5, 111-123, (2009).
117. Phanstiel, D. H. et al. Proteomic and phosphoproteomic comparison of human ES and IPS cells. Nat Methods 8, 821-827, (2011). Buganim, Y. et al. Single-cell expression analyses during cellular reprogramming reveal an early stochastic and a late hierarchic phase. Cell 1 50, 1209- 1222, (2012).
Pan, G. & Pei, D. Order from chaos: single cell reprogramming in two phases. Cell Stem Cell 1 1 , 445-447, (2012).
Stadtfeld, M. et al Aberrant silencing of imprinted genes on chromosome 12qFl in mouse induced pluripotent stem cells. Nature 465, 175- 1 81 , (2010). Carey, B. W. et al. Reprogramming factor stoichiometry influences the epigenetic state and biological properties of induced pluripotent stem cells. Cell Stem Cell 9, 588-598, (201 1 ).
Beard, C, Hochedlinger, K., Plath, K., Wutz, A. & Jaenisch, R. Efficient method to generate single-copy transgenic mice by site-specific integration in embryonic stem cells. Genesis 44, 23-28, (2006).
Stelzer, G. et al. GeneDecks: paralog hunting and gene-set distillation with GeneCards annotation. OMICS 13, 477-487, (2009).
Beard, C, Hochedlinger, K., Plath, K., Wutz, A. & Jaenisch, R. Efficient method to generate single-copy transgenic mice by site-specific integration in embryonic stem cells. Genesis 44, 23-28, (2006).
Brambrink, T. et al. Sequential expression of pluripotency markers during direct reprogramming of mouse somatic cells. Cell Stem Cell 2, 1 51 - 1 59,
(2008).
Buganim, Y. et al. Single-cell expression analyses during cellular reprogramming reveal an early stochastic and a late hierarchic phase. Cell 150, 1209- 1222, (2012).

Claims

What is claimed is:
1. A method of generating a reprogrammed cell comprising: (a) introducing reprogramming factors Sall4, Nanog, Esrrb, and Lin28 into a mammalian somatic cell; and (b) culturing said cell in a suitable medium under conditions appropriate for and for a time period sufficient to give rise to a reprogrammed cell.
2. A method according to claim 1 wherein said reprogramming factors are introduced into said somatic cell in the form of one or more nucleic acid sequences encoding the reprogramming factors.
3. A method according to claim 2 wherein said one or more nucleic acid sequences comprise DNA.
4. A method according to claim 2 wherein said one or more nucleic acid sequences comprise RNA.
5. A method according to claim 2 wherein said one or more nucleic acid sequences comprise a nucleic acid construct.
6. A method according to claim 2 wherein said one or more nucleic acid sequences comprise a vector.
7. A method according to claim 6 wherein said vector comprises an inducible vector.
8. A method according to claim 7 wherein said inducible vector activates expression of said reprogramming factors in the presence of dox in said medium.
9. A method according to claim 6 wherein said vector integrates into a genome of said somatic cell.
10. A method according to claim 9 wherein said vector comprises a viral vector.
11. A method according to claim 10 wherein said vector comprises a retroviral vector.
12. A method according to claim 10 wherein said vector comprises a lentiviral vector.
13. A method according to claim 6 wherein said vector comprises an excisable vector.
14. A method according to claim 13 wherein said excisable vector comprises a transposon, wherein said excisable vector is excisable from said genome by transient expression of a transposase.
15. A method according to claim 14 wherein said transposon comprises a piggyBac transposon.
16. A method according to claim 13 wherein said excisable vector comprises one or more loxP sites incorporated into said vector, wherein said vector can be excised from said genome by transient expression of a Cre recombinase.
17. A method according to claim 13 wherein said excisable vector comprises a floxed lentiviral vector.
18. A method according to claim 6 wherein said vector does not integrate into the genome of said somatic cell.
19. A method according to claim 18 wherein said vector comprises an adenoviral vector.
20. A method according to claim 18 wherein said vector comprises a Sendai viral vector.
21. A method according to claim 18 wherein said vector comprises a plasmid.
22. A method according to claim 18 wherein said vector comprises an episome.
23. A method according to claim 4 wherein said RNA comprises mRNA.
24. A method according to claim 23 wherein said mRNA is translatable in vitro in said mammalian somatic cell.
25. A method according to claim 24 wherein said mRNA is in vitro transcribed mRNA.
26. A method according to claim 25 wherein said in vitro transcribed mRNA comprises a sequence encoding SV40 large T (LT).
27. A method according to claim 25 wherein said in vitro transcribed mRNA comprises one or more modifications that increase stability or translatability of said mRNA.
28. A method according to claim 25 wherein said in vitro transcribed mRNA comprises a 5' cap.
29. A method according to claim 25 wherein said in vitro transcribed mRNA comprises an open reading frame flanked by a 5' untranslated region and a 3' untranslated region that enhance translation of said open reading frame.
30. A method according to claim 29 wherein said 5' untranslated region comprises a strong Kozak translation initiation signal.
31. A method according to claim 29 wherein said 3' untranslated region comprises an alpha-globin 3' untranslated region.
32. A method according to claim 25 wherein said in vitro transcribed mRNA comprises a polyA tail.
33. A method according to claim 25 wherein said in vitro transcribed mRNA is introduced into said somatic cell via electroporation.
34. A method according to claim 25 wherein said in vitro transcribed mRNA is introduced into said somatic cell complexed with a cationic vehicle that facilitates uptake of said mRNA into said somatic cell via endocytosis.
35. A method according to claim 25 wherein said in vitro transcribed mRNA is introduced into said somatic cell in an amount and for a period of time sufficient to maintain expression of the reprogramming factors until cellular reprogramming of said somatic cell occurs.
36. A method according to claim 25 wherein said in vitro transcribed mRNA is treated with a phosphatase to reduce a cytotoxic response by said somatic cell upon introduction of said mRNA into said somatic cell.
37. A method according to claim 25 wherein said in vitro transcribed mRNA comprises one or more base substitutions.
38. A method according to claim 37 wherein said base substitutions are selected from the group consisting of 5-methylcytidine (5mC), pseudouridine (psi), 5-methyluridine, 2'-O-methyluridine, 2-thiouridine, and N6-methyladenosine.
39. A method according to claim 1 wherein said reprogramming factors are introduced into said somatic cell in the form of one or more proteins or functional variants or fragments thereof.
40. A method according to claim 39 wherein said one or more proteins comprise a recombinant protein.
41. A method according to claim 39 wherein said one or more proteins comprise a fusion protein.
42. A method according to claim 39 wherein said one or more proteins further comprise a cell-penetrating peptide.
43. A method according to claim 42 wherein said cell-penetrating peptide is fused to a C terminus of said one or more proteins.
44. A method according to claim 39 wherein said cell-penetrating peptide comprises HIV tat.
45. A method according to claim 39 wherein said cell-penetrating peptide comprises poly-arginine.
46. A method according to claim 39 wherein said one or more proteins is introduced into said somatic cell in an amount and for a period of time sufficient for reprogramming of said somatic cell to occur.
47. A method according to claim 1 further comprising (c) supplementing said medium with one or more agents that increase reprogramming efficiency.
48. A method according to claim 47 wherein said one or more agents are selected from the group consisting of a nucleic acid, an antisense oligonucleotide, siRNA, miRNA, an antibody or a fragment thereof.
49. A method according to claim 47 wherein said one or more agents comprise a histone deacetylase inhibitor.
50. A method according to claim 49 wherein said histone deacetylase inhibitor comprises valproic acid (VPA).
51. A method according to claim 49 wherein said histone deacetylase inhibitor comprises butyrate.
52. A method according to claim 47 wherein said one or more agents comprise an interferon inhibitor.
53. A method according to claim 52 wherein said interferon inhibitor comprises a recombinant B18R protein.
54. A method according to claim 47 wherein said one or more agents comprise a signaling pathway modulator selected from the group consisting of a TGF-beta pathway inhibitor, a MAPK/ERK pathway inhibitor, a GSK3 pathway inhibitor, a WNT pathway activator, a 3'-phosphoinositide-dependent kinase-1 (PDK1) pathway activator, a mitochondrial oxidation modulator, a glycolytic metabolism modulator, a HIF pathway activator, and combinations thereof.
55. A method according to claim 1 further comprising (c) monitoring said culture for cells which display one or more markers of pluripotency.
56. A method according to claim 55 wherein said one or more markers of pluripotency is selected from the group consisting of Fbxo15, Nanog, Oct4, Sox2 and combinations thereof.
57. A method according to claim 55 wherein said one or more markers of pluripotency comprises an early marker of pluripotency selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2.
58. A method according to claim 55 further comprising (d) isolating said reprogrammed cell from said culture.
59. A method according to claim 1 further comprising (c) isolating said reprogrammed cell from said culture.
60. A method according to claim 1 wherein said somatic cell is a terminally differentiated somatic cell.
61. A mammalian cell comprising said reprogrammed cell isolated according to claim 59.
62. A mammalian cell according to claim 61 wherein said cell is a human cell.
63. A mammalian cell according to claim 61 wherein said cell is a non-human mammal cell.
64. A mammalian cell according to claim 61 wherein said cell further comprises a reporter gene integrated at a locus whose activation serves as a marker of reprogramming to pluripotency.
65. A mammalian cell according to claim 64, wherein the locus is selected from Nanog, Sox2, and Oct4.
66. A mammalian cell according to claim 61, wherein said cell is an iPS cell.
67. A chimeric mouse generated at least in part from said mammalian iPS cell of claim 66.
68. A chimeric mouse according to claim 67 wherein said mouse is generated by injecting said mammalian iPS cell into a mouse blastocyst and allowing said blastocyst to develop into a mouse in vivo.
69. A cell obtained from said mouse according to claim 68 wherein said cell is derived from said iPS cell.
70. A non-human mammal generated at least in part from said mammalian cell of claim 66.
71. A non-human mammal according to claim 70 wherein said non-human mammal is a mouse.
72. A method of producing a non-human mammal comprising introducing an iPS cell according to claim 66 into tetraploid blastocysts of the same mammalian species under conditions that result in production of an embryo and said resulting embryo is transferred into a foster mother which is maintained under conditions that result in development of live offspring.
73. A method according to claim 72 wherein said non-human mammal is a mouse.
74. A method according to claim 72 wherein said iPS cells are introduced into said tetraploid blastocysts by injection.
75. A method according to claim 74 wherein said injection is a microinjection.
76. A non-human mammal produced according to a method of claim 72.
77. A mouse produced according to a method of claim 72.
78. A method of producing a non-human mammalian embryo comprising injecting non-human iPS cells generated according to a method of claim 66 into non-human tetraploid blastocysts and maintaining said resulting tetraploid blastocysts under conditions that result in formation of embryos, thereby producing a non-human mammalian embryo.
79. A method according to claim 78 wherein said non-human iPS cells are mouse cells and said non-human mammalian embryo is a mouse.
80. A method according to claim 78 wherein mutant mouse iPS cells are injected into said non-human tetraploid blastocysts by microinjection.
81. A non-human mammalian embryo produced according to a method of claim 78.
82. A mouse embryo produced according to a method of claim 78.
83. A method according to claim 1 wherein said somatic cells are differentiated cells of a first cell type, and said reprogramming reprograms said somatic cells to a second differentiated cell type.
84. A method comprising: (a) reprogramming somatic cells to a pluripotent state according to a method of claim 1; and (b) reprogramming said pluripotent cells to a desired, differentiated cell type, wherein said differentiated cell type comprises optionally an adult stem cell or a fully differentiated cell.
85. A composition comprising multiple cells of claim 59 or claim 84.
86. A method of treating a patient in need of such treatment, comprising administering to the patient a composition according to claim 85.
87. A method of treating an individual in need of such treatment comprising: (a) obtaining somatic cells from said individual; (b) reprogramming said somatic cells obtained from said individual according to a method of claim 1; and (c) administering at least some of said reprogrammed cells to said individual.
88. A method according to claim 87 wherein the method further comprises separating cells that are reprogrammed to a desired state from cells that are not reprogrammed to a desired state.
89. A method according to claim 87 wherein said individual is a human.
90. A composition for identifying a reprogramming agent, the composition comprising one or more cells that expresses a subset of reprogramming factors selected from the group consisting of Sall4, Nanog, Esrrb and Lin28, and a test agent.
91. A composition according to claim 90 wherein said subset of reprogramming factors consists of at least three of said reprogramming factors.
92. A composition according to claim 90 further comprising an agent that induces expression of said subset of reprogramming factors.
93. A method of identifying a reprogramming agent comprising: (a) maintaining said composition of claim 90 for a time period under conditions in which said reprogramming factors are expressed and cell proliferation occurs; and (b) assessing the extent to which cells become reprogrammed, wherein the test agent is identified as a reprogramming agent if reprogramming occurs at a similar frequency as would be the case if said composition contained all of said reprogramming factors and had lacked said test agent.
94. A method of identifying a reprogramming agent comprising: (a) maintaining the composition of claim 90 for a time period under conditions in which the reprogramming factors are expressed and cell proliferation occurs; and (b) assessing the extent to which cells become reprogrammed, wherein said test agent is identified as a reprogramming agent or enhancer of reprogramming if reprogramming occurs at a significantly greater frequency than would be the case had said composition lacked said test agent.
95. A method according to claim 94 wherein said composition is maintained for at least X days.
96. A method according to claim 94 wherein said test agent is present for at least X days.
97. A method according to claim 94 wherein said test agent is identified as a reprogramming agent if cells do not become reprogrammed at a detectable frequency if maintained for said time period in the absence of said test agent, but do become reprogrammed at a detectable frequency if maintained in the presence of said test agent for at least a portion of said time period.
98. A method according to claim 94 wherein said test agent is identified as an enhancer of reprogramming agent if cells become reprogrammed at a detectable frequency if maintained for said time period in the absence of said test agent and become reprogrammed at a significantly greater frequency if maintained in the presence of said test agent for at least a portion of said time period.
99. A nucleic acid construct comprising at least four coding regions linked to each other by nucleic acids that encode a self-cleaving peptide so as to form a single open reading frame, wherein said coding regions encode reprogramming factors Sall4, Nanog, Esrrb, and Lin28, and wherein said reprogramming factors are capable, either alone or in combination with one or more additional reprogramming factors, of reprogramming a mammalian somatic cell to pluripotency.
100. A nucleic acid construct according to claim 99 further comprising a fifth coding region that encodes a fifth reprogramming factor, wherein said five coding regions are linked to each other by nucleic acids that encode self-cleaving peptides so as to form a single open reading frame.
101. A nucleic acid construct according to claim 100 wherein said fifth reprogramming factor is c-Myc.
102. A nucleic acid construct according to claim 99 further comprising fifth and sixth genes that encode fifth and sixth reprogramming factors, wherein said six coding regions are linked to each other by nucleic acids that encode self-cleaving peptides so as to form a single open reading frame.
103. A nucleic acid construct according to claim 102 wherein said fifth reprogramming factor is c-Myc and said sixth reprogramming factor is Klf4.
104. A nucleic acid construct according to claim 99 wherein said self-cleaving peptide is a viral 2A peptide.
105. A nucleic acid construct according to claim 104 wherein said self-cleaving peptide is an aphthovirus 2A peptide.
106. A nucleic acid construct according to claim 99 wherein said construct does not encode Oct4.
107. A nucleic acid construct according to claim 99 wherein said construct does not encode Klf4.
108. A nucleic acid construct according to claim 99 wherein said construct does not encode Sox2.
109. A nucleic acid construct according to claim 99 wherein said construct does not encode c-Myc.
110. An expression cassette comprising said nucleic acid construct of claim 99 operably linked to a promoter, wherein said promoter drives transcription of a polycistronic message that encodes said reprogramming factors, each reprogramming factor being linked to at least one other reprogramming factor by a self-cleaving peptide.
111. An expression cassette according to claim 110 further comprising one or more sites that mediate integration into a genome of a mammalian cell.
112. An expression cassette according to claim 111 wherein said expression cassette is integrated into said genome at a locus whose disruption has minimal or no effect on said cell.
113. An expression vector comprising the expression cassette of claim 112.
114. An expression vector according to claim 113 wherein said vector is retroviral.
115. An expression vector according to claim 113 wherein said promoter is inducible.
116. A reprogramming composition comprising at least two, three, or four reprogramming factors selected from the group consisting of Sall4 protein, Nanog protein, Esrrb protein, and Lin28 protein, or functional variants or fragments thereof or nucleic acids encoding any of the foregoing.
117. A composition according to claim 116 wherein each of said reprogramming factors comprises a cell-penetrating peptide fused to its C terminus.
118. A composition according to claim 117 wherein said cell-penetrating peptide comprises poly-arginine.
119. A method of producing a pluripotent cell from a somatic cell, comprising the steps of: (a) introducing exogenous reprogramming factors Sall4, Nanog, Esrrb, and Lin28 into one or more somatic cells; (b) maintaining said one or more cells under conditions appropriate and for a period of time sufficient for said exogenous reprogramming factors to activate at least one endogenous pluripotency gene; (c) selecting one or more cells which display an early marker of pluripotency; (d) generating a colony or an embryo utilizing said one or more cells which display the early marker of pluripotency; (e) obtaining one or more somatic cells from said colony or embryo; (f) maintaining said one or more somatic cells under conditions appropriate for and for a period of time sufficient for said exogenous reprogramming factors to activate at least one endogenous pluripotency gene; and (g) differentiating between cells which display one or more markers of pluripotency and cells which do not.
120. A method according to claim 119 wherein said early marker of pluripotency is selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2.
121. A method according to claim 119 wherein said early marker of pluripotency is a group of early markers of pluripotency consisting of Esrrb, Utf1, Lin28, and Dppa2.
122. A method according to claim 119 wherein step (d) comprises selecting one or more cells which display an early marker of pluripotency and at least one marker of pluripotency.
123. An isolated pluripotent cell produced by a method comprising: (a) introducing exogenous reprogramming factors Sall4, Nanog, Esrrb, and Lin28 into one or more somatic cells; (b) maintaining said one or more cells under conditions appropriate for and for a period of time sufficient for said exogenous reprogramming factors to activate at least one endogenous pluripotency gene; (c) selecting one or more cells which display an early marker of pluripotency; (d) generating a colony or an embryo utilizing said one or more cells which display the early marker of pluripotency; (e) obtaining one or more differentiated somatic cells from said colony or embryo; (f) maintaining said one or more differentiated somatic cells under conditions appropriate for and for a period of time sufficient for said reprogramming factors to activate at least one endogenous pluripotency gene; and (g) differentiating cells which display one or more markers of pluripotency and cells which do not.
124. A method according to claim 123 or an isolated pluripotent cell produced thereby, wherein said early marker of pluripotency is selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2.
125. A method according to claim 123 or an isolated pluripotent cell produced thereby, wherein said early marker of pluripotency is a group of early pluripotency markers consisting of Esrrb, Utf1, Lin28, and Dppa2.
126. A method according to claim 123 or an isolated pluripotent cell produced thereby, wherein step (d) comprises selecting one or more cells which display an early marker of pluripotency and at least one marker of pluripotency.
127. A method of selecting a somatic cell that is likely to be reprogrammed to a pluripotent state, comprising (a) measuring expression of one or more early markers of pluripotency in a population of a plurality of somatic cells; (b) sorting the population of the plurality of somatic cells into a plurality of populations of single somatic cells; and (c) measuring expression of the one or more early markers of pluripotency in each population of single somatic cells, wherein increased expression of the one or more early markers of pluripotency in each population of single somatic cells as compared to expression of the one or more early markers of pluripotency in the population of the plurality of somatic cells indicates that the single somatic cell is a somatic cell that is likely to be reprogrammed to the pluripotent state.
128. A method according to claim 127 wherein said one or more early markers of pluripotency are selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2.
129. A method of selecting a cell that is likely to become programmed to a pluripotent state, comprising (a) maintaining a population of a plurality of differentiated somatic cells containing at least one exogenously introduced factor that contributes to reprogramming of said cells to a pluripotent state under conditions appropriate for proliferation and for reprogramming of said cells to occur; (b) sorting said population of said plurality of cells into a plurality of populations of single cells; and (c) isolating said sorted cells which display one or more early markers of pluripotency, wherein each sorted cell which displays said one or more early markers of pluripotency is a cell that is likely to become programmed to the pluripotent state.
130. A method according to claim 129 wherein said one or more early markers of pluripotency are selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2.
131. A method for increasing the efficiency of the expansion of induced pluripotent stem cells, comprising (a) maintaining a population of differentiated somatic cells that contains at least one exogenously introduced factor that contributes to reprogramming of said population of cells to a pluripotent state under conditions appropriate for proliferation and for reprogramming of said cells to occur; (b) monitoring each cell in said population of cells for the expression of one or more early pluripotency markers, wherein cells expressing the one or more early pluripotency markers are more likely to become programmed to a pluripotent state than cells which do not express the one or more early pluripotency markers; (c) isolating each cell in said population of cells that expresses the one or more early pluripotency markers; and (d) expanding only those cells which express the one or more early pluripotency markers, thereby increasing the efficiency of the expansion of induced pluripotent stem cells.
132. A method according to claim 131 wherein said one or more early pluripotency markers is selected from the group consisting of Esrrb, Utf1, Lin28, Dppa2, and combinations thereof.
133. A method according to claim 131 wherein said monitoring of said cells is performed during a stochastic phase of reprogramming.
134. A method according to claim 131 wherein proliferation of said cell forms a clonal colony of said cell.
135. A method of increasing the likelihood that a differentiated somatic cell subjected to a reprogramming protocol will become reprogrammed to an iPSC comprising, introducing into the differentiated somatic cell one or more early pluripotency factors prior to subjecting the differentiated somatic cell to said reprogramming protocol.
136. A method according to claim 135 wherein said one or more early pluripotency factors is selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2.
137. A method of isolating an iPS colony, comprising: (a) introducing exogenous reprogramming factors Sall4, Nanog, Esrrb, and Lin28 into a differentiated mammalian somatic cell; (b) culturing said differentiated somatic cell in a suitable medium under conditions appropriate for and for a time period sufficient for proliferation of and reprogramming of said cells to occur; and (c) isolating one or more colonies visible in said culture after said period of time.
138. A method according to claim 137 wherein each of said exogenous reprogramming factors is introduced into said cell in the form of a recombinant protein comprising a cell-penetrating peptide fused to a C terminus of said recombinant protein.
139. A method according to claim 137 wherein each of said exogenous reprogramming factors is introduced into said cell in the form of mRNA optionally complexed with a cationic vehicle, wherein said mRNA comprises in vitro transcribed mRNA comprising one or more of a 5' cap, an open reading frame flanked by a 5' untranslated region containing a strong Kozak translation initiation signal and an alpha-globin 3' untranslated region, a polyA tail, and one or more modifications which confer stability to the mRNA.
140. A method according to claim 137 further comprising: (d) growing said isolated one or more colonies on a layer of feeder cells in the absence of an inducer of said inducible transgenes.
141. A method according to claim 140 further comprising (e) passaging said one or more grown colonies at least once.
142. A method of enhancing isolation of iPSCs, comprising (d) sorting said one or more colonies visible in said culture after said period of time according to step (c) of claim 137 into single cells; (e) differentiating between said sorted cells which display one or more early markers of pluripotency and said sorted cells which do not display one or more early markers of pluripotency; and (f) isolating said sorted cells which display one or more early markers of pluripotency.
143. A mouse iPS cell characterized by an efficiency of said mouse iPS cell of generating live offspring by tetraploid complementation, wherein said efficiency is at least 5%.
144. A method of producing a mouse iPS cell according to claim 143, comprising: (a) transfecting mouse embryonic fibroblasts with a dox-inducible vector comprising reprogramming factors Sall4, Nanog, Esrrb and Lin28 operably linked to a tetracycline operator and a CMV promoter; (b) culturing said mouse embryonic fibroblasts under conditions suitable and for a time period sufficient for proliferation and reprogramming of said mouse embryonic fibroblasts to occur; (c) exposing said culture to an effective amount of doxycycline for a period of time sufficient for one or more iPS colonies to form; (d) isolating said one or more iPS colonies; (e) growing said isolated iPS colonies on feeder cells in the absence of doxycycline; and optionally (f) passaging said grown iPS colonies at least once.
145. A collection of reprogramming factors capable of producing a mouse iPS cell according to claim 143, comprising Sall4, Nanog, Esrrb, and Lin28.
146. A kit for generating a reprogrammed cell in vitro, comprising: (a) a set of reprogramming factors comprising Sall4, Nanog, Esrrb and Lin28, which are capable alone, or in combination with one or more additional reprogramming factors, of reprogramming said mammalian somatic cells to a pluripotent state, wherein the kit optionally comprises (b) a medium suitable for culturing mammalian iPS cells and/or (c) a population of mammalian somatic cells, and wherein the reprogramming factors are optionally provided as one or more nucleic acids encoding said reprogramming factors.
147. A kit according to claim 146 further comprising (d) one or more reagents for an assay for detecting one or more markers of pluripotency.
148. A kit according to claim 146 wherein the one or more markers of pluripotency is an early marker of pluripotency selected from the group consisting of Esrrb, Utf1, Lin28, and Dppa2.
149. A kit according to claim 146 further comprising one or more of: (e) instructions for preparing the medium; (f) instructions for deriving or culturing pluripotent cells; (g) serum replacement; (h) albumin; (i) at least one protein or small molecule useful for deriving or culturing iPS cells, wherein the protein or small molecule activates or inhibits a signal transduction pathway; (j) a population of mammalian somatic cells and (k) at least one reagent useful for characterizing pluripotent cells.
150. The kit of claim 149 wherein at least some of the ingredients are dissolved in liquid.
151. The kit of claim 149 wherein at least some of the ingredients are provided in dry form.
152. A method according to any of claims 1-60, 72-75, 78-84, 86-89, 92-99, 119-122, 124-126, 137-142, or 144, with the proviso that Nanog is replaced by or supplemented by Dppa2.
153. A method according to any of claims 1-60, 72-75, 78-84, 86-89, 92-99, 119-122, 124-126, 137-142, 144, or 152 with the proviso that Lin28 is omitted or is replaced by one or more other reprogramming factors or reprogramming agents.
154. A method according to claim 153, wherein the one or more other reprogramming factors is Etz2, Kdm1, or Utf1.
155. A cell, chimeric mouse, non-human mammal, composition, nucleic acid construct, nucleic acid cassette, vector, collection, or kit according to any of claims 61-71, 76, 77, 85, 90, 91, 100-118, 124-126, or 145-151, as applicable, with the proviso that Nanog is replaced by or supplemented by Dppa2.
156. A cell, chimeric mouse, non-human mammal, composition, nucleic acid construct, nucleic acid cassette, vector, collection, or kit according to any of claims 61-71, 76, 77, 85, 90, 91, 100-118, 124-126, 145-151, or 155, as applicable, with the proviso that Lin28 is omitted or is replaced by one or more other reprogramming factors or reprogramming agents.
157. A cell, chimeric mouse, non-human mammal, composition, nucleic acid construct, nucleic acid cassette, vector, collection, or kit according to claim 156, wherein the one or more other reprogramming factors is Etz2, Kdm1, or Utf1.
PCT/US2013/037623 2012-04-20 2013-04-22 Programming and reprogramming of cells WO2013159103A1 (en)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US201261636441P 2012-04-20 2012-04-20
US61/636,441 2012-04-20
US201261700781P 2012-09-13 2012-09-13
US61/700,781 2012-09-13
US201361798423P 2013-03-15 2013-03-15
US61/798,423 2013-03-15

Publications (1)

Publication Number Publication Date
WO2013159103A1 true WO2013159103A1 (en) 2013-10-24

Family

ID=49384148

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2013/037623 WO2013159103A1 (en) 2012-04-20 2013-04-22 Programming and reprogramming of cells

Country Status (1)

Country Link
WO (1) WO2013159103A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2698091A1 (en) * 2007-08-31 2009-03-12 Brett Chevalier Wnt pathway stimulation in reprogramming somatic cells
US20100003757A1 (en) * 2008-06-04 2010-01-07 Amanda Mack Methods for the production of ips cells using non-viral approach
WO2011055851A1 (en) * 2009-11-06 2011-05-12 Kyoto University Method of efficiently establishing induced pluripotent stem cells

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11268069B2 (en) * 2014-03-04 2022-03-08 Fate Therapeutics, Inc. Reprogramming methods and cell culture platforms
CN112111446A (en) * 2014-03-19 2020-12-22 V 细胞治疗公司 Methods relating to pluripotent cells
WO2016005985A2 (en) 2014-07-09 2016-01-14 Yissum Research Development Company Of The Hebrew University Of Jerusalem Ltd. Method for reprogramming cells
US10920199B2 (en) 2015-02-27 2021-02-16 Salk Institute For Biological Studies Reprogramming progenitor compositions and methods of use therefore
JP2019107026A (en) * 2015-02-27 2019-07-04 ソーク インスティチュート フォー バイオロジカル スタディーズ Reprogramming progenitor compositions and methods of use therefore
JP7449648B2 (en) 2015-02-27 2024-03-14 ソーク インスティチュート フォー バイオロジカル スタディーズ Reprogramming precursor composition and method of use thereof
AU2016225076B2 (en) * 2015-02-27 2018-09-13 Salk Institute For Biological Studies Reprogramming progenitor compositions and methods of use therefore
JP2018506294A (en) * 2015-02-27 2018-03-08 ソーク インスティチュート フォー バイオロジカル スタディーズ Reprogramming precursor composition and method of use thereof
AU2018271254B2 (en) * 2015-02-27 2021-05-20 Salk Institute For Biological Studies Reprogramming progenitor compositions and methods of use therefore
WO2016138464A1 (en) * 2015-02-27 2016-09-01 Salk Institute For Biological Studies Reprogramming progenitor compositions and methods of use therefore
EP3262157A4 (en) * 2015-02-27 2018-12-05 Salk Institute for Biological Studies Reprogramming progenitor compositions and methods of use therefore
US11441126B2 (en) 2015-10-16 2022-09-13 Fate Therapeutics, Inc. Platform for the induction and maintenance of ground state pluripotency
US11760977B2 (en) 2016-05-25 2023-09-19 Salk Institute For Biological Studies Compositions and methods for organoid generation and disease modeling
US11685901B2 (en) 2016-05-25 2023-06-27 Salk Institute For Biological Studies Compositions and methods for organoid generation and disease modeling
AU2019211465B2 (en) * 2018-01-23 2022-07-21 Southern Eye Equipment Pty Ltd Expression vector and method
WO2019144186A1 (en) * 2018-01-23 2019-08-01 Southern Eye Equipment Expression vector and method
CN113692442A (en) * 2019-04-17 2021-11-23 学校法人庆应义塾 Method for producing induced pluripotent stem cell and kit
CN114269899A (en) * 2019-07-11 2022-04-01 巴布拉罕姆研究所 Novel reprogramming method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13777906

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13777906

Country of ref document: EP

Kind code of ref document: A1