WO2022253022A1 - A method for detecting iPSC residues based on single-cell sequencing data analysis
- Publication number: WO2022253022A1
- Application number: PCT/CN2022/094411
- Authority: WO, WIPO (PCT)
- Prior art keywords: ipsc, cells, cell, data, biomarkers
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6888—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/2135—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B25/00—ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
- G16B25/20—Polymerase chain reaction [PCR]; Primer or probe design; Probe optimisation
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
- G16B30/10—Sequence alignment; Homology search
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/158—Expression markers
Definitions
- the invention belongs to the technical field of biomedicine and relates to a method for detecting iPSC residues, in particular to a method for detecting iPSC residues based on single-cell sequencing data analysis.
- Pluripotent stem cells are cells that can differentiate into various cell types.
- PSCs Pluripotent stem cells
- iPSC Induced pluripotent stem cell
- iPSCs are similar to embryonic stem cells (ESCs): they have strong self-renewal ability and multi-lineage differentiation potential, and they are characterized by an undifferentiated or poorly differentiated state.
- ESC embryonic stem cell
- iPSCs are derived from autologous somatic cells or other cell types, so they avoid the immune rejection caused by allogeneic transplantation; in addition, they do not need to be obtained from the inner cell mass of early mammalian embryos, which avoids the ethical controversy associated with ESCs.
- iPSC-derived therapies hold great promise in patient-specific cell therapy, potentially enabling regenerative medicine for many life-threatening diseases. A growing number of cell therapies are in clinical development with promising clinical outcomes.
- a key safety concern in the development of iPSC-derived therapies is the potential for residual undifferentiated iPSCs to persist in the final cell therapy product, eventually spreading and forming teratomas. Therefore, it is crucial to establish a highly sensitive assay for the detection of residual undifferentiated hiPSCs.
- the detection principle of flow cytometry is to perform flow cytometric detection of 2-3 stem cell-specific genes in iPSC-derived functional cells to obtain the proportion of residual iPSCs;
- the detection principle of quantitative real-time PCR (qRT-PCR) analysis, digital PCR, and miRNA target analysis is to apply these assays to 2-3 stem cell-specific genes in iPSC-derived functional cells to obtain the proportion of residual iPSCs; the detection principle of the high-efficiency culture system is to expand iPSC-derived functional cells in stem cell culture medium and then apply the same assays to 2-3 stem cell-specific genes in the expanded cells.
- the present invention is based on single-cell sequencing technology: it performs single-cell mRNA sequencing on each iPSC-derived functional cell and, combined with bioinformatics analysis, analyzes iPSC residues at the whole-transcriptome level to obtain more accurate results.
- the present invention applies single-cell sequencing technology to the detection of iPSC residues for the first time and achieves a good detection effect.
- the present invention provides a method for detecting iPSC residues based on single-cell sequencing data analysis.
- the method is based on single-cell sequencing technology: it performs single-cell mRNA sequencing on each iPSC-derived functional cell and, combined with bioinformatics analysis, analyzes iPSC residues at the whole-transcriptome level to obtain more accurate results.
- compared with conventional assays, the assay method of the present invention has the advantages of high accuracy, high sensitivity, and high detection efficiency.
- a first aspect of the present invention provides a set of biomarkers for iPSC residue detection.
- the biomarkers include Alcam, Arid1b, Ars2, Ash2l, Axin2, Bmi1, Brix, Cbx1, Cbx5, Ccna1, Ccnd1, Ccnd2, Ccne1, Ccnf, Cd24, Cd44, Cd9, Cdh3, Cdk2, Cdk4, Cdk6, Cdkn1b, Cdyl, Cldn6, Cnot1, Cnot2, Cnot3, Cops2, Cops4, Cpsf3, rabp1, Dazap1, Dnmt3b, Dppa2, Dppa3, Dppa4, Dppa5, Dpy30, E2f1, Eed, Ehmt2, Eif2b1, Eif2b2, Eif2b3, Eif2s2, Epcam, Eras, ESRG, Esrrb, Ewsr1, Ezh1, Ezh2, Fbxo15, Fgf13, Fgf4, Flt3, Foxd3, Foxh1, Fry, Fut
- the biomarker is one or more of LIN28A, ESRG, SOX2, POU5F1, NANOG.
- the biomarkers include any one of LIN28A, ESRG, SOX2, POU5F1, and NANOG.
- the biomarkers include any two of LIN28A, ESRG, SOX2, POU5F1, and NANOG.
- the biomarkers include any three of LIN28A, ESRG, SOX2, POU5F1, and NANOG.
- the biomarkers include any four of LIN28A, ESRG, SOX2, POU5F1 and NANOG.
- the biomarkers include all five of LIN28A, ESRG, SOX2, POU5F1, and NANOG.
- the second aspect of the present invention provides a method for screening biomarkers for iPSC residue detection.
- the method includes the steps of:
- step (2) Perform bioinformatics analysis on the results obtained in step (1), compare all expressed genes, and screen out iPSC residual biomarkers;
- the sample described in step (1) includes iPSC differentiated cells
- the sample includes endothelial progenitor cells, cardiomyocytes, endothelial cells, cardiac fibroblasts, neural stem cells, microglia, mesenchymal stem cells, retinal pigment epithelial cells, liver cells, hematopoietic stem cells, and pancreatic islet cells , red blood cells, B lymphocytes, T lymphocytes, natural killer cells, neutrophils, basophils, eosinophils, monocytes, macrophages;
- the sample includes endothelial progenitor cells, cardiomyocytes, and islet cells;
- comparing all expressed genes described in step (2) includes comparing the differences in the expression levels of all genes in iPSC cells and samples, and screening out iPSC residual biomarkers;
- the screening process includes the following steps: among the iPSC stemness genes, genes whose positive-cell ratio in iPSCs is > 50% are selected as iPSC residue candidate genes; among the iPSC stemness genes, genes whose positive-cell ratio in the sample is < 10% are selected as iPSC residue candidate genes; and the iPSC residue biomarkers are determined on the basis of the candidate genes.
- the iPSC stemness genes include: POU5F1, CD24, TERF1, DPPA4, L1TD1, LIN28A, SFRP2, GAL, SOX2, SALL4, EPCAM, ESRG, PIM2, NR6A1, THY1, JARID2, TOP2A, GNL3, PCNA, FOXH1, ZIC2, DNMT3B, PODXL, NANOG, PHC1, ZSCAN10, MYBL2, PTPRZ1, MTHFD1, E2F1.
- bioinformatics analysis includes the following steps:
- the cellranger count tool is used in step a, and the reference genome version is GRCh38-2020-A.
- step b includes applying the R function Read10X to read the single-cell transcriptome expression matrix to obtain a sparse matrix, creating a Seurat object, and setting conditions to filter cells.
- the proportion of mitochondrial genes in step c should be sufficiently small.
- the third aspect of the present invention provides a method for detecting iPSC residues.
- the method includes the following steps: detecting the expression level of the biomarker in the sample to be tested;
- the biomarker is the biomarker described in the first aspect of the present invention.
- the method also includes the steps of:
- the biomarker is the biomarker described in the first aspect of the present invention.
- the sample described in step (1) includes iPSC differentiated cells
- the sample includes endothelial progenitor cells, cardiomyocytes, endothelial cells, cardiac fibroblasts, neural stem cells, microglia, mesenchymal stem cells, retinal pigment epithelial cells, liver cells, hematopoietic stem cells, islet cells, Red blood cells, B lymphocytes, T lymphocytes, natural killer cells, neutrophils, basophils, eosinophils, monocytes, macrophages;
- the sample includes endothelial progenitor cells, cardiomyocytes, and islet cells.
- step (1) also includes the following steps:
- the biomarker is the biomarker described in the first aspect of the present invention.
- step a includes using the ScaleData function to make the mean value of the expression level of each gene among all cells be 0, and make the variance of the expression level of each gene among all cells be 1.
- step b also includes screening the scaled data, extracting cells expressing one or more of the biomarkers in the data as suspected iPSC cells, and obtaining expression matrix data;
- the biomarkers include LIN28A, ESRG, SOX2, POU5F1, NANOG.
- the Kmeans analysis described in step d performs Kmeans clustering on the data obtained from the PCA analysis, and the clustering results are then visualized to obtain the tSNE result.
- the fourth aspect of the present invention provides a kit for iPSC residual detection.
- the kit includes reagents for detecting the expression level of one or more of the biomarkers LIN28A, ESRG, SOX2, POU5F1, NANOG;
- the reagents include primers that specifically amplify one or more of the biomarkers LIN28A, ESRG, SOX2, POU5F1, NANOG, or probes that specifically recognize one or more of the biomarkers LIN28A, ESRG, SOX2, POU5F1, NANOG;
- the kit also includes dNTPs, Mg²⁺ ions, and DNA polymerase, or a PCR system comprising dNTPs, Mg²⁺ ions, and DNA polymerase.
- the fifth aspect of the present invention provides a detection system for iPSC residues, the system includes a unit for detecting the expression level of one or more of the biomarkers LIN28A, ESRG, SOX2, POU5F1, NANOG in the sample to be tested;
- the system further comprises a unit for culturing iPSCs;
- the system further includes an iPSC-induced differentiation unit;
- the unit for culturing iPSCs includes E8 complete medium, Y-27632;
- the concentration of Y-27632 is 10 μM
- the unit for detecting the expression level of one or more of biomarkers LIN28A, ESRG, SOX2, POU5F1, NANOG in the sample to be tested comprises the method described in the third aspect of the present invention
- the unit for detecting the expression level of one or more of the biomarkers LIN28A, ESRG, SOX2, POU5F1, NANOG in the sample to be tested is to analyze whether there are iPSC residues according to the results of PCA and tSNE;
- sample to be tested includes iPSC differentiated cells
- the sample includes endothelial progenitor cells, cardiomyocytes, endothelial cells, cardiac fibroblasts, neural stem cells, microglia, mesenchymal stem cells, retinal pigment epithelial cells, liver cells, hematopoietic stem cells, islet cells, Red blood cells, B lymphocytes, T lymphocytes, natural killer cells, neutrophils, basophils, eosinophils, monocytes, macrophages;
- the sample includes endothelial progenitor cells, cardiomyocytes, and islet cells.
- the sixth aspect of the present invention provides the application of any of the following aspects:
- the kit is the kit described in the fourth aspect of the present invention.
- the iPSC residual detection system is the system described in the fifth aspect of the present invention.
- iPSC induced pluripotent stem cell
- iPSCs have very similar characteristics to ESCs but avoid the ethical issues associated with ESCs because iPSCs are not derived from embryos; instead, iPSCs are usually derived from fully differentiated adult cells that have been "reprogrammed" back to a pluripotent state.
- differentiation refers to the process by which a cell changes from one cell type to another, in particular a less specialized type of cell becomes a more specialized type of cell.
- mesenchymal stem cells refers to a specific type of stem cell that can be isolated from various tissues, including bone marrow, adipose tissue (fat), placenta, and umbilical cord blood, and that can differentiate into bone cells, cartilage cells, fat cells, and other types of connective tissue cells.
- the present invention provides a brand-new detection method based on single-cell sequencing: it performs single-cell mRNA sequencing on each iPSC-derived functional cell and, combined with bioinformatics analysis, analyzes iPSC residues at the whole-transcriptome level; the detection method yields more accurate results and has the advantages of high accuracy, high sensitivity, and high detection efficiency.
- Figure 1 shows flow charts of the detection method of the present invention and of other detection methods, in which, panel A: detection method of the present invention, panel B: flow cytometry method, panel C: qRT-PCR analysis, digital PCR, and miRNA target methods, panel D: high-efficiency culture system method;
- Figure 2 shows the results of EPC single-cell sequencing data quality control
- Figure 3 shows the PCA plot of the combined analysis of EPC and iPSC single-cell data
- Figure 4 shows the tSNE diagram of combined analysis of EPC and iPSC single-cell data, in which, diagram A: Mahalanobis, diagram B: Cosine, diagram C: Chebychev, diagram D: Euclidean;
- Figure 5 shows the results of quality control of myocardial single-cell sequencing data
- Figure 6 shows the PCA plot of combined analysis of myocardial single cell and iPSC single cell data
- Figure 7 shows the tSNE diagram of combined analysis of myocardial single cell and iPSC single cell data, in which, Figure A: Mahalanobis, Figure B: Cosine, Figure C: Chebychev, Figure D: Euclidean;
- Figure 8 shows the results of quality control of islet single-cell sequencing data
- Figure 9 shows the PCA plot of combined analysis of islet single cell and iPSC single cell data
- Figure 10 shows the tSNE graph of combined analysis of islet single cell and iPSC single cell data, in which, graph A: Mahalanobis, graph B: Cosine, graph C: Chebychev, graph D: Euclidean.
- the initial seeding density for induced differentiation of iPSCs needs to be controlled at 3.0×10⁴ to 4.0×10⁴ cells/cm².
- mesoderm induction complete medium-2 includes: stem pro 34 basal medium, 8 μM CHIR99021, 25 ng/mL Recombinant Human BMP-4;
- the EPCs induction complete medium includes: stem pro 34 basal medium, 200 ng/mL Human Recombinant VEGF165 (VEGFA), 10 μM SB431542, 2 μM Forskolin;
- EPC maintenance complete medium includes: stem pro 34 basal medium, 200 ng/mL Human Recombinant VEGF165 (VEGFA);
- the Seurat software package can analyze single-cell data. First, use the R function Read10X to read the EPC single-cell transcriptome expression matrix to obtain a sparse matrix, create a Seurat object, and set conditions to filter cells:
- data_dir is the directory where the single-cell transcriptome expression matrix results are located
- project_name is the name of the data set
- QC_min_cells is the minimum number of cells in which a gene must be detected for the gene to be kept
- QC_min_features is the minimum number of genes that must be detected in a cell for the cell to be kept
- pbmc2 <- subset(pbmc1, subset = nFeature_RNA > QC_min_features & nFeature_RNA < QC_max_features & percent.mt < QC_percent_mt)
- QC_max_features is the maximum number of genes that may be detected in a cell
- QC_percent_mt is the upper limit on the percentage of mitochondrial gene counts in a cell
- the data are processed with LogNormalize, which normalizes the feature expression measurements of each cell by the total expression, multiplies the result by a scale factor (10,000 by default), and then log-transforms it.
- the experimental results show the following: the flow chart of the detection method of the present invention is shown in Figure 1A-D; the positive-cell ratios of the 30 iPSC stemness genes in iPSCs were determined, and genes with a positive-cell ratio above 50% were taken as preliminary iPSC candidate genes (see Table 1); the positive-cell ratios of the 30 iPSC stemness genes in EPC cells were determined, and genes with a positive-cell ratio below 10% were taken as preliminary iPSC candidate genes (see Table 2); the quality control results of the EPC single-cell sequencing data are shown in Figure 2;
- the PCA plot of the combined analysis of EPC and iPSC single-cell data is shown in Figure 3, in which red marks the iPSC single cells and blue marks the EPC single-cell data;
- the tSNE plots of the combined analysis of EPC and iPSC single-cell data are shown in Figure 4A-D, where red marks the iPSC single cells, blue marks the EPC single-cell data, and there
- DMEM/F-12 medium, GlutaMAX™ Supplement, Penicillin-streptomycin (double antibody), BMP4, and B27 were purchased from Thermo Fisher; RPMI-1640 medium was purchased from Hyclone; TESR-E8 was purchased from STEMCELL Technologies; Y-27632, CHIR99021, C59, IWR1, thioglycerol, and L-ascorbic acid were purchased from Sigma.
- the cardiac progenitor cell induction differentiation medium (CIM) is obtained by adding the cytokine bone morphogenetic protein 4 (BMP4) and the GSK-3 inhibitor CHIR99021 to the cardiac progenitor cell induction differentiation basal medium; in the CIM, the BMP4 concentration is 25 ng/mL and the CHIR99021 concentration is 3-5 μM; the basal medium consists of DMEM/F-12 medium, GlutaMAX™ Supplement, vitamin A-free B27 (B27-Minus VA), thioglycerol, L-ascorbic acid, and Penicillin-streptomycin (double antibody);
- cardiomyocyte differentiation medium to induce differentiation cardiomyocytes
- the medium used here is the cardiomyocyte differentiation medium containing Wnt pathway inhibitor (C59 or IWR-1);
- the cardiomyocyte-induced differentiation medium is obtained by adding insulin-free B27 (B27-Minus insulin), the cytokine bone morphogenetic protein 4 (BMP4), and a Wnt pathway inhibitor (C59 or IWR-1) to the cardiomyocyte-induced differentiation basal medium;
- the content of B27-Minus insulin is 2%
- the concentration of BMP4 is 10 ng/mL
- the concentration of C59 is 2 μM
- the concentration of IWR-1 is 5 μM;
- the cardiomyocyte-induced differentiation basal medium is composed of RPMI-1640 medium, GlutaMAX™ Supplement, and Penicillin-streptomycin (double antibody);
- the cardiomyocyte differentiation basal medium is specifically composed of, by volume, 98% RPMI-1640 medium, 1% GlutaMAX™ Supplement, and 1% double antibody;
- cardiomyocyte maturation medium to induce cardiomyocyte maturation
- cardiomyocyte maturation medium for full replacement continue to culture
- during the culture period, the medium is fully replaced with cardiomyocyte maturation medium (CDM2) every other day for the first 6 days, and then every two days;
- the cardiomyocyte maturation medium (CDM2) is a medium obtained after adding B27 to the cardiomyocyte-induced differentiation basal medium;
- the content of B27 is 2%.
- the Seurat software package can analyze single-cell data. First, use the R function Read10X to read the EPC single-cell transcriptome expression matrix to obtain a sparse matrix, create a Seurat object, and set conditions to filter cells:
- data_dir is the directory where the single-cell transcriptome expression matrix results are located
- project_name is the name of the data set
- QC_min_cells is the minimum number of cells in which a gene must be detected for the gene to be kept
- QC_min_features is the minimum number of genes that must be detected in a cell for the cell to be kept
- pbmc2 <- subset(pbmc1, subset = nFeature_RNA > QC_min_features & nFeature_RNA < QC_max_features & percent.mt < QC_percent_mt)
- QC_max_features is the maximum number of genes that may be detected in a cell
- QC_percent_mt is the upper limit on the percentage of mitochondrial gene counts in a cell
- the data are processed with LogNormalize, which normalizes the feature expression measurements of each cell by the total expression, multiplies the result by a scale factor (10,000 by default), and then log-transforms it.
- the experimental results show the following: the positive-cell ratios of the 30 iPSC stemness genes in cardiomyocytes were determined, and genes with a positive-cell ratio below 10% were taken as preliminary iPSC candidate genes (see Table 3); the quality control results of the myocardial single-cell sequencing data are shown in Figure 5;
- the PCA plot of the combined analysis of myocardial and iPSC single-cell data is shown in Figure 6, in which red marks the iPSC single cells and blue marks the myocardial single-cell data;
- the tSNE plots of the combined analysis of myocardial and iPSC single-cell data are shown in Figure 7A-D, where red marks the iPSC single cells and blue marks the myocardial single-cell data; the red and blue parts do not overlap, indicating that no iPSCs remain in the cardiomyocytes.
- the Seurat software package can analyze single-cell data. First, use the R function Read10X to read the EPC single-cell transcriptome expression matrix to obtain a sparse matrix, create a Seurat object, and set conditions to filter cells:
- data_dir is the directory where the single-cell transcriptome expression matrix results are located
- project_name is the name of the data set
- QC_min_cells is the minimum number of cells in which a gene must be detected for the gene to be kept
- QC_min_features is the minimum number of genes that must be detected in a cell for the cell to be kept
- pbmc2 <- subset(pbmc1, subset = nFeature_RNA > QC_min_features & nFeature_RNA < QC_max_features & percent.mt < QC_percent_mt)
- QC_max_features is the maximum number of genes that may be detected in a cell
- QC_percent_mt is the upper limit on the percentage of mitochondrial gene counts in a cell
- the data are processed with LogNormalize, which normalizes the feature expression measurements of each cell by the total expression, multiplies the result by a scale factor (10,000 by default), and then log-transforms it.
- the experimental results show the following: the positive-cell ratios of the 30 iPSC stemness genes in islet cells were determined, and genes with a positive-cell ratio below 10% were taken as preliminary iPSC candidate genes (see Table 4); LIN28A was finally determined as the preliminary iPSC candidate gene.
- the red part is the iPSC single-cell data
- the blue part is the islet single-cell data
- the tSNE plots of the combined analysis of islet and iPSC single-cell data are shown in Figure 10A-D, where red marks the iPSC single cells and blue marks the islet single-cell data; the red and blue parts do not intersect, indicating that no iPSCs remain in the islets.
Abstract
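An iPSC residue detection method. The method is based on single-cell sequencing technology: single-cell mRNA sequencing is performed on each iPSC-derived functional cell and, combined with bioinformatics analysis, iPSC residues are analyzed at the whole-transcriptome level to obtain more accurate results. Compared with conventional assays, the method has the advantages of high accuracy, high sensitivity, and high detection efficiency.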
Description
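The present invention belongs to the technical field of biomedicine and relates to a method for detecting iPSC residues, in particular to a method for detecting iPSC residues based on single-cell sequencing data analysis.
Pluripotent stem cells (PSCs) are cells that can differentiate into many cell types. In 2006, Japanese scientists reversed the differentiation of mouse somatic cells back to pluripotent stem cells by overexpressing the specific inducing factors Oct4, Sox2, c-Myc, and Klf4 (the OSKM system), and named the resulting cells induced pluripotent stem cells (iPSCs) (Takahashi K, Yamanaka S. Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors. Cell. 2006;126:663-676.). iPSCs resemble embryonic stem cells (ESCs): they have strong self-renewal ability and multi-lineage differentiation potential, and they are characterized by an undifferentiated or poorly differentiated state. Compared with other stem cells, iPSCs are derived from autologous somatic cells or other cell types, so they avoid the immune rejection caused by allogeneic transplantation; in addition, they do not need to be taken from the inner cell mass of early mammalian embryos, which avoids the ethical controversy raised by ESCs. iPSC-derived therapies hold great promise for patient-specific cell therapy and may provide regenerative medicine for many life-threatening diseases. A growing number of cell therapies are in clinical development and show good clinical efficacy. A key safety concern in developing iPSC-derived therapies is the possibility that residual undifferentiated iPSCs persist in the final cell therapy product, eventually spreading and forming teratomas. It is therefore crucial to establish a highly sensitive assay for detecting residual undifferentiated hiPSCs.
At present, various assays have been developed to detect residual undifferentiated iPSCs in derived cell therapies in vitro, such as flow cytometry, quantitative real-time PCR (qRT-PCR) analysis, digital PCR, miRNA targets, and the high-efficiency culture system. Flow cytometry detects 2-3 stem cell-specific genes in iPSC-derived functional cells to obtain the proportion of residual iPSCs. qRT-PCR analysis, digital PCR, and miRNA target analysis are applied to 2-3 stem cell-specific genes in iPSC-derived functional cells to obtain the proportion of residual iPSCs. The high-efficiency culture system expands iPSC-derived functional cells in stem cell culture medium and then applies qRT-PCR analysis, digital PCR, or miRNA target analysis to 2-3 stem cell-specific genes in the expanded cells to obtain the proportion of residual iPSCs. The high-efficiency culture system requires 10 to 14 days, so it is inefficient and time-consuming; in addition, residual iPSCs may differentiate spontaneously and produce false negatives, and because detection relies on expanding the cells, cells present at very low concentrations may fail to survive. Apart from the high-efficiency culture system, most of these assays detect the expression of markers of undifferentiated cells; they can detect only about 3 markers, so they suffer from a high false-negative rate.
To solve the high false-negative rate of the above assays, the present invention is based on single-cell sequencing technology: single-cell mRNA sequencing is performed on each iPSC-derived functional cell and, combined with bioinformatics analysis, iPSC residues are analyzed at the whole-transcriptome level to obtain more accurate results. No report has so far described applying single-cell sequencing technology to iPSC residue detection; the present invention applies it to iPSC residue detection for the first time and achieves a good detection effect.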
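Summary of the Invention
To solve the above problems currently facing the field, the present invention provides a method for detecting iPSC residues based on single-cell sequencing data analysis. The method is based on single-cell sequencing technology: single-cell mRNA sequencing is performed on each iPSC-derived functional cell and, combined with bioinformatics analysis, iPSC residues are analyzed at the whole-transcriptome level to obtain more accurate results. Compared with conventional assays, the assay of the present invention has the advantages of high accuracy, high sensitivity, and high detection efficiency.
The above object of the present invention is achieved by the following technical solutions: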
A first aspect of the present invention provides a set of biomarkers for iPSC residue detection.
Further, the biomarkers include one or more of Alcam, Arid1b, Ars2, Ash2l, Axin2, Bmi1, Brix, Cbx1, Cbx5, Ccna1, Ccnd1, Ccnd2, Ccne1, Ccnf, Cd24, Cd44, Cd9, Cdh3, Cdk2, Cdk4, Cdk6, Cdkn1b, Cdyl, Cldn6, Cnot1, Cnot2, Cnot3, Cops2, Cops4, Cpsf3, rabp1, Dazap1, Dnmt3b, Dppa2, Dppa3, Dppa4, Dppa5, Dpy30, E2f1, Eed, Ehmt2, Eif2b1, Eif2b2, Eif2b3, Eif2s2, Epcam, Eras, ESRG, Esrrb, Ewsr1, Ezh1, Ezh2, Fbxo15, Fgf13, Fgf4, Flt3, Foxd3, Foxh1, Fry, Fut4, SSEA1, Gabrb3, Gal, Gbx2, Gdf3, Gja1, Gli1, Gli2, Gli3, Glis1, Gnl3, Grb7, H2afz, Has2, Hcfc1, Herc5, Hesx1, Hira, Hmga1, Hspa4, Hspb1, Id1, Ing5, Itga6, Jarid2, Kat2a, Kat5, Kat6a, Kdm1a, Kdm3a, Kdm4a, Kdm4c, Kdm5b, Kit, Kitlg, Klf12, Klf2, Klf4, Klf5, L1td1, Lefty1, Lefty2, LIN28A, Lin28b, Ly6e, Mapk1, Max, Mcm2, Mcrs1, Med1, Med10, Med12, Med13, Med13l, Med14, Med17, Med19, Med24, Med28, Metap2, Mga, Mll, Mll2, Mll3, Mll5, Msi1, Mt1a, Mt2a, Mthfd1, Mybl2, Myc, Mycn, Nacc1, NANOG, Nanos1, Ncam, Ncoa2, Ncoa3, Nfrkb, Nodal, Npr1, Nr0b1, Nr6a1, Nts, Otx1, Otx2, Paf1, Pcgf6, Pcid2, Pcna, Phc1, Phc2, Phc3, Pim2, Podxl, POU5F1, Ppp1r3d, Prdm14, Prdm16, Prdm5, Prmt6, Prom1, Ptprz1, Pum1, Pum2, Rad21, Rb1, Rbbp4, Rbbp5, Rbbp7, Rbbp9, Rbl2, Rbx1, Rest, Rif1, Ring1, Rnf2, Rtf1, Sall1, Sall4, Sema4a, Setdb1, Setdb2, Sf3a1, Sf3a3, Sfrp2, Sirt2, Skil, Smad1, Smad2, Smad3, Smarca4, Smarca5, Smarcd1, Smarcb1, Smarcc1, Smarcd1, Smc1a, Smo, SOX2, Sox3, Sp1, Spp1, Stag1, Stat3, Sub1, Suv39h2, Suz12, Taf2, Taf7, Tcf3, Tcf7l1, Tcl1a, Tdgf1, Terf1, Tert, Tgif, Thap11, Thy1, Tle1, Tnfrsf8, Top2a, Trim16, Trim24, Trim28, Utf1, Wdr18, Wdr5, Wnt2b, Wnt8a, Xpo7, Yy1, Zfhx3, Zfp41, Zfp42, Zfx, Zic2, Zic3, Zic5, Znf143, Znf219, Znf281, and Zscan10;
Preferably, the biomarker is one or more of LIN28A, ESRG, SOX2, POU5F1, and NANOG.
In one embodiment of the present invention, the biomarkers include any one of LIN28A, ESRG, SOX2, POU5F1, and NANOG.
In one embodiment of the present invention, the biomarkers include any two of LIN28A, ESRG, SOX2, POU5F1, and NANOG.
In one embodiment of the present invention, the biomarkers include any three of LIN28A, ESRG, SOX2, POU5F1, and NANOG.
In one embodiment of the present invention, the biomarkers include any four of LIN28A, ESRG, SOX2, POU5F1, and NANOG.
In one embodiment of the present invention, the biomarkers include all five of LIN28A, ESRG, SOX2, POU5F1, and NANOG.
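A second aspect of the present invention provides a method for screening biomarkers for iPSC residue detection.
Further, the method includes the following steps:
(1) performing single-cell sequencing on the sample to be tested;
(2) performing bioinformatics analysis on the sequencing results obtained in step (1), comparing all expressed genes, and screening out the iPSC residue biomarkers;
Preferably, the sample in step (1) includes iPSC-differentiated cells;
More preferably, the sample includes endothelial progenitor cells, cardiomyocytes, endothelial cells, cardiac fibroblasts, neural stem cells, microglia, mesenchymal stem cells, retinal pigment epithelial cells, hepatocytes, hematopoietic stem cells, islet cells, red blood cells, B lymphocytes, T lymphocytes, natural killer cells, neutrophils, basophils, eosinophils, monocytes, and macrophages;
Most preferably, the sample includes endothelial progenitor cells, cardiomyocytes, and islet cells;
Preferably, comparing all expressed genes in step (2) includes comparing the differences in the expression levels of all genes between iPSCs and the sample, and screening out the iPSC residue biomarkers;
More preferably, the screening process includes the following steps: among the iPSC stemness genes, genes whose positive-cell ratio in iPSCs is > 50% are selected as iPSC residue candidate genes; among the iPSC stemness genes, genes whose positive-cell ratio in the sample is < 10% are selected as iPSC residue candidate genes; and the iPSC residue biomarkers are determined on the basis of the candidate genes.
Further, the iPSC stemness genes include: POU5F1, CD24, TERF1, DPPA4, L1TD1, LIN28A, SFRP2, GAL, SOX2, SALL4, EPCAM, ESRG, PIM2, NR6A1, THY1, JARID2, TOP2A, GNL3, PCNA, FOXH1, ZIC2, DNMT3B, PODXL, NANOG, PHC1, ZSCAN10, MYBL2, PTPRZ1, MTHFD1, E2F1.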
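The two-threshold screen described above can be illustrated with a minimal sketch. It assumes that a cell counts as "positive" for a gene when its count is nonzero and that a candidate must satisfy both thresholds at once; neither detail is spelled out in the text. The function names and the toy counts are hypothetical and are not taken from the patent's tables.

```python
# Hypothetical sketch of the two-threshold candidate screen (not the patent's code).

def positive_ratio(counts):
    """Fraction of cells with a nonzero count for one gene."""
    return sum(1 for c in counts if c > 0) / len(counts)


def screen_candidates(ipsc_expr, sample_expr, ipsc_min=0.5, sample_max=0.1):
    """Return genes positive in >50% of iPSCs and in <10% of sample cells.

    ipsc_expr and sample_expr map gene name -> list of per-cell counts.
    """
    candidates = []
    for gene, ipsc_counts in ipsc_expr.items():
        if gene not in sample_expr:
            continue  # only genes measured in both data sets can be compared
        if (positive_ratio(ipsc_counts) > ipsc_min
                and positive_ratio(sample_expr[gene]) < sample_max):
            candidates.append(gene)
    return sorted(candidates)


# Toy data invented for illustration only.
ipsc = {"LIN28A": [3, 5, 0, 2], "PCNA": [4, 1, 2, 2], "GAL": [0, 0, 1, 0]}
epc = {"LIN28A": [0] * 20, "PCNA": [1, 2, 0, 3] * 5, "GAL": [0] * 20}
print(screen_candidates(ipsc, epc))
```

In this toy input, PCNA is rejected because it is also widely expressed in the sample, and GAL is rejected because it is rarely expressed in the iPSCs.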
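Further, the bioinformatics analysis includes the following steps:
a. analyzing the single-cell transcriptome raw data with cellranger-5.0.0;
b. analyzing the single-cell data with the Seurat software package;
c. adding a mitochondrial percentage column, calculated with the PercentageFeatureSet function, and filtering the data;
d. processing the data with the global-scaling normalization method LogNormalize;
e. using FindVariableFeatures to complete the variability analysis and select the feature genes with high variability.
Further, the cellranger count tool is used in step a, and the reference genome version is GRCh38-2020-A.
Further, step b includes using the R function Read10X to read the single-cell transcriptome expression matrix, which yields a sparse matrix, creating a Seurat object, and setting conditions to filter cells.
Further, the proportion of mitochondrial genes in step c should be sufficiently small.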
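Steps c and d above can be sketched in plain Python to show the arithmetic behind the quality filter and LogNormalize. This is not Seurat code: it only mirrors the logic, assuming that mitochondrial genes are recognized by an "MT-" name prefix and using invented thresholds and toy counts. The function names are hypothetical.

```python
import math


def percent_mt(cell_counts):
    """Percentage of a cell's counts that come from genes named 'MT-*' (assumed prefix)."""
    total = sum(cell_counts.values())
    if total == 0:
        return 0.0
    mt = sum(n for gene, n in cell_counts.items() if gene.startswith("MT-"))
    return 100.0 * mt / total


def passes_qc(cell_counts, min_features, max_features, max_percent_mt):
    """Keep a cell whose detected-gene count is in range and whose mito share is low."""
    n_features = sum(1 for n in cell_counts.values() if n > 0)
    return (min_features < n_features < max_features
            and percent_mt(cell_counts) < max_percent_mt)


def log_normalize(cell_counts, scale_factor=10000):
    """log1p(count / total * scale_factor) for every gene in one cell."""
    total = sum(cell_counts.values())
    if total == 0:
        return {gene: 0.0 for gene in cell_counts}
    return {gene: math.log1p(n / total * scale_factor)
            for gene, n in cell_counts.items()}


# Toy cells and thresholds, invented for illustration only.
cells = {
    "cell_ok": {"ACTB": 6, "LIN28A": 3, "MT-CO1": 1},
    "cell_high_mt": {"ACTB": 2, "MT-CO1": 5, "MT-ND1": 3},
}
kept = {name: log_normalize(c) for name, c in cells.items()
        if passes_qc(c, min_features=1, max_features=10, max_percent_mt=20)}
print(sorted(kept))
```

The second toy cell is dropped because 80% of its counts are mitochondrial, which is the situation the requirement of a sufficiently small mitochondrial proportion is meant to exclude.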
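A third aspect of the present invention provides a method for detecting iPSC residues.
Further, the method includes the following step: detecting the expression level of the biomarkers in the sample to be tested;
Preferably, the biomarkers are the biomarkers described in the first aspect of the present invention.
Further, the method also includes the following steps:
(1) performing PCA analysis and Kmeans analysis on the biomarkers in the sample to be tested;
(2) determining the level of residual iPSCs from the PCA result and the tSNE result obtained in step (1);
Preferably, the biomarkers are the biomarkers described in the first aspect of the present invention.
Further, the sample in step (1) includes iPSC-differentiated cells;
Preferably, the sample includes endothelial progenitor cells, cardiomyocytes, endothelial cells, cardiac fibroblasts, neural stem cells, microglia, mesenchymal stem cells, retinal pigment epithelial cells, hepatocytes, hematopoietic stem cells, islet cells, red blood cells, B lymphocytes, T lymphocytes, natural killer cells, neutrophils, basophils, eosinophils, monocytes, and macrophages;
More preferably, the sample includes endothelial progenitor cells, cardiomyocytes, and islet cells.
Further, step (1) also includes the following steps:
a. scaling the biomarkers by applying a linear transformation;
b. performing PCA analysis on the scaled data to obtain expression matrix data;
c. merging the sample expression matrix data with the expression matrix data obtained from iPSC single-cell sequencing analysis and taking the intersection to obtain a new expression matrix;
d. performing PCA analysis and Kmeans analysis on the data of the new expression matrix to obtain the PCA result and the tSNE result;
Preferably, the biomarkers are the biomarkers described in the first aspect of the present invention.
Further, the scaling in step a uses the ScaleData function so that, for each gene, the mean expression across all cells is 0 and the variance of expression across all cells is 1.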
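The effect of the scaling in step a can be illustrated with a per-gene z-score. This is a minimal sketch of the stated goal (mean 0, variance 1 across cells), not the ScaleData implementation; it assumes the sample standard deviation is used, and the function name and toy values are hypothetical.

```python
import statistics


def scale_gene(values):
    """Center one gene's expression to mean 0 and scale it to unit variance."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (assumption)
    if sd == 0:
        return [0.0 for _ in values]  # a constant gene carries no signal
    return [(v - mean) / sd for v in values]


# Toy normalized expression of one gene across three cells.
scaled = scale_gene([1.0, 2.0, 3.0])
print(scaled)
```

Scaling each gene this way keeps highly expressed genes from dominating the principal components computed in the later steps.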
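Further, step b also includes filtering the scaled data and extracting the cells that express one or more of the biomarkers as suspected iPSCs to obtain the expression matrix data;
Preferably, the biomarkers include LIN28A, ESRG, SOX2, POU5F1, and NANOG.
Further, the Kmeans analysis in step d performs Kmeans clustering on the data obtained from the PCA analysis, and the clustering results are then visualized to obtain the tSNE result.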
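Before the PCA and Kmeans analysis of step d, step c merges the sample matrix with the iPSC matrix on their shared genes. A minimal sketch of that merge follows, assuming each matrix is held as a mapping from gene name to per-cell values; the data layout, the function name, and the toy values are assumptions for illustration.

```python
def merge_on_shared_genes(sample, ipsc):
    """Combine two gene -> per-cell-values mappings, keeping only shared genes.

    Returns the merged mapping and a per-cell origin label list
    (sample cells first, then iPSC cells).
    """
    shared = sorted(set(sample) & set(ipsc))
    n_sample = len(next(iter(sample.values()), []))
    n_ipsc = len(next(iter(ipsc.values()), []))
    merged = {gene: list(sample[gene]) + list(ipsc[gene]) for gene in shared}
    origins = ["sample"] * n_sample + ["iPSC"] * n_ipsc
    return merged, origins


# Toy matrices invented for illustration only.
sample_expr = {"LIN28A": [0, 0], "SOX2": [0, 1], "KDR": [5, 6]}
ipsc_expr = {"LIN28A": [3, 4, 5], "SOX2": [2, 2, 2], "NANOG": [1, 1, 1]}
merged, origins = merge_on_shared_genes(sample_expr, ipsc_expr)
print(sorted(merged), origins)
```

Keeping the origin labels alongside the merged matrix is what later allows the clustered cells to be colored by source, as in the red and blue groups described for the figures.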
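A fourth aspect of the present invention provides a kit for iPSC residue detection.
Further, the kit includes reagents for detecting the expression level of one or more of the biomarkers LIN28A, ESRG, SOX2, POU5F1, and NANOG;
Preferably, the reagents include primers that specifically amplify one or more of the biomarkers LIN28A, ESRG, SOX2, POU5F1, and NANOG, or probes that specifically recognize one or more of the biomarkers LIN28A, ESRG, SOX2, POU5F1, and NANOG;
Preferably, the kit also includes dNTPs, Mg²⁺ ions, and DNA polymerase, or a PCR system comprising dNTPs, Mg²⁺ ions, and DNA polymerase.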
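A fifth aspect of the present invention provides a detection system for iPSC residues, the system including a unit for detecting the expression level of one or more of the biomarkers LIN28A, ESRG, SOX2, POU5F1, and NANOG in the sample to be tested;
Preferably, the system further includes a unit for culturing iPSCs;
Preferably, the system further includes an iPSC induced-differentiation unit;
More preferably, the unit for culturing iPSCs includes E8 complete medium and Y-27632;
Most preferably, the concentration of Y-27632 is 10 μM;
More preferably, the unit for detecting the expression level of one or more of the biomarkers LIN28A, ESRG, SOX2, POU5F1, and NANOG in the sample to be tested includes the method described in the third aspect of the present invention;
Most preferably, the unit for detecting the expression level of one or more of the biomarkers LIN28A, ESRG, SOX2, POU5F1, and NANOG in the sample to be tested determines whether iPSC residues are present from the PCA result and the tSNE result;
Most preferably, if the PCA result and the tSNE result show no intersection between the iPSC data and the single-cell data of the sample to be tested, the sample contains no residual iPSCs; if they show an intersection, the sample contains residual iPSCs.
Further, the sample to be tested includes iPSC-differentiated cells;
Preferably, the sample includes endothelial progenitor cells, cardiomyocytes, endothelial cells, cardiac fibroblasts, neural stem cells, microglia, mesenchymal stem cells, retinal pigment epithelial cells, hepatocytes, hematopoietic stem cells, islet cells, red blood cells, B lymphocytes, T lymphocytes, natural killer cells, neutrophils, basophils, eosinophils, monocytes, and macrophages;
More preferably, the sample includes endothelial progenitor cells, cardiomyocytes, and islet cells.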
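The decision rule of this aspect (no intersection means no residual iPSCs) is stated in terms of the PCA and tSNE plots. One way to express it programmatically is sketched below, under the assumption that "intersection" can be approximated by a cluster containing both iPSC and sample cells. That criterion, the function names, and the toy labels are illustrative stand-ins, not the patent's stated procedure.

```python
from collections import defaultdict


def mixed_clusters(cluster_ids, origins):
    """Return the ids of clusters that contain both iPSC and sample cells."""
    members = defaultdict(set)
    for cluster_id, origin in zip(cluster_ids, origins):
        members[cluster_id].add(origin)
    return sorted(cid for cid, seen in members.items()
                  if {"iPSC", "sample"} <= seen)


def has_residual_ipsc(cluster_ids, origins):
    """Flag residue when any cluster mixes iPSC and sample cells (assumed criterion)."""
    return bool(mixed_clusters(cluster_ids, origins))


# Toy cluster assignments invented for illustration only.
origins = ["iPSC", "iPSC", "sample", "sample"]
print(has_residual_ipsc([0, 0, 1, 1], origins))  # groups fully separated
print(has_residual_ipsc([0, 0, 0, 1], origins))  # one sample cell sits with the iPSCs
```

A cluster-based check like this is only a proxy for inspecting the plots; the number of clusters and the distance metric would need to be chosen and validated for real data.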
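A sixth aspect of the present invention provides any one of the following uses:
(1) use of single-cell sequencing technology in iPSC residue detection;
(2) use of the biomarkers described in the first aspect of the present invention in iPSC residue detection;
(3) use of the biomarkers described in the first aspect of the present invention in the preparation of an iPSC residue detection reagent;
(4) use of a reagent for detecting the expression level of the biomarkers described in the first aspect of the present invention in the preparation of an iPSC residue detection kit;
Preferably, the kit is the kit described in the fourth aspect of the present invention;
(5) use of a reagent for detecting the expression level of the biomarkers described in the first aspect of the present invention in the preparation of a system for dynamic monitoring of iPSC differentiation;
Preferably, the iPSC residue detection system is the system described in the fifth aspect of the present invention;
(6) use of the kit described in the fourth aspect of the present invention in iPSC residue detection;
(7) use of the system described in the fifth aspect of the present invention in iPSC residue detection;
(8) use of PCA analysis and Kmeans analysis in iPSC residue detection.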
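Unless otherwise defined, all technical and scientific terms used in the context of the present invention have the same meaning as commonly understood by those of ordinary skill in the art. The terms used in the description of the present invention serve only to describe specific embodiments and are not intended to limit the invention. Some terms are explained below.
The term "induced pluripotent stem cell" or "iPSC" as used in the present invention refers to an ESC-like cell derived from an adult cell. iPSCs have very similar characteristics to ESCs but avoid the ethical issues associated with ESCs because iPSCs are not derived from embryos; instead, iPSCs are usually derived from fully differentiated adult cells that have been "reprogrammed" back to a pluripotent state.
The term "differentiation" as used in the present invention refers to the process by which a cell changes from one cell type to another, in particular a less specialized type of cell becoming a more specialized type of cell.
The term "mesenchymal stem cell" as used in the present invention refers to a specific type of stem cell that can be isolated from various tissues, including bone marrow, adipose tissue (fat), placenta, and umbilical cord blood, and that can differentiate into bone cells, cartilage cells, fat cells, and other types of connective tissue cells.
Advantages and beneficial effects of the present invention:
Compared with the assays already developed for in vitro detection of residual undifferentiated iPSCs in derived cell therapies, such as flow cytometry, quantitative real-time PCR (qRT-PCR) analysis, digital PCR, miRNA targets, and the high-efficiency culture system, the present invention provides a brand-new detection method. The method is based on single-cell sequencing: single-cell mRNA sequencing is performed on each iPSC-derived functional cell and, combined with bioinformatics analysis, iPSC residues are analyzed at the whole-transcriptome level. The detection method of the present invention yields more accurate results and has the advantages of high accuracy, high sensitivity, and high detection efficiency.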
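Embodiments of the present invention are described in detail below with reference to the accompanying drawings, in which:
Figure 1 shows flowcharts of the detection method of the present invention and of other detection methods, where panel A: the detection method of the present invention; panel B: flow cytometry; panel C: qRT-PCR analysis, digital PCR, and miRNA target methods; panel D: highly efficient culture system method;
Figure 2 shows the quality control results for the EPC single-cell sequencing data;
Figure 3 shows the PCA plot from the combined analysis of EPC and iPSC single-cell data;
Figure 4 shows the tSNE plots from the combined analysis of EPC and iPSC single-cell data, where panel A: Mahalanobis; panel B: Cosine; panel C: Chebychev; panel D: Euclidean;
Figure 5 shows the quality control results for the cardiomyocyte single-cell sequencing data;
Figure 6 shows the PCA plot from the combined analysis of cardiomyocyte and iPSC single-cell data;
Figure 7 shows the tSNE plots from the combined analysis of cardiomyocyte and iPSC single-cell data, where panel A: Mahalanobis; panel B: Cosine; panel C: Chebychev; panel D: Euclidean;
Figure 8 shows the quality control results for the islet single-cell sequencing data;
Figure 9 shows the PCA plot from the combined analysis of islet and iPSC single-cell data;
Figure 10 shows the tSNE plots from the combined analysis of islet and iPSC single-cell data, where panel A: Mahalanobis; panel B: Cosine; panel C: Chebychev; panel D: Euclidean.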
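The present invention is further illustrated below with reference to specific examples, which serve only to explain the invention and are not to be construed as limiting it. A person of ordinary skill in the art will understand that various changes, modifications, substitutions, and variations may be made to these examples without departing from the principles and spirit of the invention; the scope of the invention is defined by the claims and their equivalents. Experimental methods for which no specific conditions are given in the following examples are generally carried out under conventional conditions or under the conditions recommended by the manufacturer.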
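Example 1 Detection of residual iPSCs in endothelial progenitor cells (EPCs)
1. Materials
E8 complete medium, StemPro-34 basal medium, DMEM/F12 medium, TrypLE, BMP4, Human Recombinant VEGF165 (VEGFA), Forskolin, and Human Recombinant Activin A were purchased from Thermo Fisher; Y-27632, CHIR99021, and SB431542 were purchased from Sigma; Matrigel and Fibronectin were purchased from Corning.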
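2. Procedure for differentiating iPSCs into EPCs
(1) Following the iPSC passaging procedure, centrifuge the cells as usual, remove the supernatant, add an appropriate volume of E8 complete medium containing 10 μM Y-27632 prewarmed to 37 °C, and pipette gently to resuspend the cell pellet; then count the resuspended cells and determine their viability;
(2) Take four Matrigel-coated T75 flasks out of the 37 °C, 5% CO₂ incubator, remove the liquid, and add 13 mL of E8 complete medium containing 10 μM Y-27632 prewarmed to 37 °C to each flask;
(3) The initial seeding density for iPSC induced differentiation must be kept at 3.0×10⁴ to 4.0×10⁴ cells/cm². Based on the counted density of the cell suspension, add an appropriate volume of the suspension to the Matrigel-coated T75 flasks prepared in step (2);
(4) Place the culture vessels in the 37 °C, 5% CO₂ incubator and rock them about 10 times each forwards and backwards and left and right, so that the cells are distributed as evenly as possible over the growth surface; then leave them undisturbed overnight;
(5) After 24 h, check the confluence of the seeded iPSCs. If confluence has reached 15-25%, proceed directly to the formal induction steps. If confluence has not reached 15%, replace the medium with fresh E8 complete medium prewarmed to 37 °C and extend the iPSC culture time as appropriate (12-24 h);
(6) Once the seeded iPSCs have reached about 15-25% confluence, start formal induced differentiation; this is defined as Day 0. Remove the old medium from the T75 flasks, wash once with 10 mL DPBS, then add 30 mL of mesoderm induction complete medium-1 prewarmed to 37 °C to each flask and incubate for 17-18 h in the 37 °C, 5% CO₂ incubator. Mesoderm induction complete medium-1 consists of: StemPro-34 basal medium, 8 μM CHIR99021, 25 ng/mL Recombinant Human BMP-4, and 50 ng/mL Human Recombinant Activin A;
(7) After 17-18 h of incubation (Day 1), remove the old medium from the T75 flasks, wash once with 10 mL DPBS, add 50 mL of mesoderm induction complete medium-2 prewarmed to 37 °C to each flask, and incubate for 2 days in the 37 °C, 5% CO₂ incubator without changing the medium. Mesoderm induction complete medium-2 consists of: StemPro-34 basal medium, 8 μM CHIR99021, and 25 ng/mL Recombinant Human BMP-4;
(8) After lateral plate mesoderm cells have formed (Day 3), remove the old medium from the T75 flasks, wash once with 10 mL DPBS, add 30 mL of EPC induction complete medium prewarmed to 37 °C to each flask, return the flasks to the 37 °C, 5% CO₂ incubator, and incubate for 24 h. EPC induction complete medium consists of: StemPro-34 basal medium, 200 ng/mL Human Recombinant VEGF165 (VEGFA), 10 μM SB431542, and 2 μM Forskolin;
(9) One day later (Day 4), repeat the Day 3 medium change;
(10) On Day 5, at least 1 h before enzymatic dissociation and replating of the EPCs, prepare eight Fibronectin-coated T175 flasks. Remove the old medium from all T75 flasks and wash twice with DPBS, then add 3 mL TrypLE to each flask and place in the 37 °C, 5% CO₂ incubator for 3-5 min; monitor cell detachment under the microscope until most cells begin to float;
(11) Tap the bottom of the flask gently. Once the vast majority of cells have detached in a quicksand-like flow, add 12 mL DMEM/F12 Medium to neutralize the TrypLE, pipette gently to resuspend the detached cells, transfer them to a centrifuge tube, and take an aliquot for counting;
(12) Centrifuge at 200 g for 5 min at room temperature, resuspend the cells in 1 mL EPC maintenance complete medium, and count the cells and determine viability. EPC maintenance complete medium consists of: StemPro-34 basal medium and 200 ng/mL Human Recombinant VEGF165 (VEGFA);
(13) Keeping the cells at 4 °C, send 100,000 cells to a sequencing company for sequencing.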
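Step (3) of the procedure above asks for "an appropriate volume" of cell suspension to reach 3.0×10⁴ to 4.0×10⁴ cells/cm². The arithmetic can be sketched in Python as follows. The nominal 75 cm² growth area of a T75 flask and the example concentration of 1.0×10⁶ cells/mL are assumptions made for illustration, and the function name is hypothetical.

```python
def seeding_volume_ml(target_density_per_cm2, growth_area_cm2, cells_per_ml):
    """Volume of suspension (mL) that delivers the target density to one vessel."""
    if cells_per_ml <= 0:
        raise ValueError("cell concentration must be positive")
    cells_needed = target_density_per_cm2 * growth_area_cm2
    return cells_needed / cells_per_ml


if __name__ == "__main__":
    # Assumed values: 75 cm^2 per T75 flask, suspension counted at 1.0e6 cells/mL.
    low = seeding_volume_ml(3.0e4, 75, 1.0e6)
    high = seeding_volume_ml(4.0e4, 75, 1.0e6)
    print(f"{low:.2f}-{high:.2f} mL per T75 flask")  # prints 2.25-3.00 mL per T75 flask
```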
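3. Analysis workflow for EPC single-cell sequencing data
(1) Analyze the raw single-cell transcriptome data with cellranger-5.0.0, using the cellranger count tool and reference genome version GRCh38-2020-A, to obtain the EPC single-cell transcriptome expression matrix:
cellranger count --id=EPC --fastqs=rawdata_dir --sample=EPC --localcores=8 --localmem=64 --transcriptome=refdata-gex-GRCh38-2020-A
(2) Single-cell data can be analyzed with the Seurat package. First use the R function Read10X to read the EPC single-cell transcriptome expression matrix into a sparse matrix, then create a Seurat object and set the cell-filtering conditions:
pbmc.data <- Read10X(data.dir = data_dir)
pbmc1 <- CreateSeuratObject(counts = pbmc.data, project = project_name, min.cells = QC_min_cells, min.features = QC_min_features)
where data_dir is the directory containing the single-cell transcriptome expression matrix, project_name is the name of the dataset, QC_min_cells is the minimum number of cells in which a gene must be detected, and QC_min_features is the minimum number of genes that must be detected in each cell;
(3) Add a mitochondrial percentage column. The proportion of mitochondrial genes should be sufficiently small; it is computed with the PercentageFeatureSet function, where genes whose names start with MT- are mitochondrial genes. Then filter the data:
pbmc1[["percent.mt"]] <- PercentageFeatureSet(pbmc1, pattern = "^MT-")
pbmc2 <- subset(pbmc1, subset = nFeature_RNA > QC_min_features & nFeature_RNA < QC_max_features & percent.mt < QC_percent_mt)
where QC_max_features is the maximum number of genes detected per cell and QC_percent_mt is the threshold for the mitochondrial percentage of a cell;
(4) Apply the global-scaling normalization method LogNormalize, which normalizes the feature expression measurements of each cell by the cell's total expression, multiplies the result by a scale factor (10,000 by default), and then log-transforms it:
pbmc3 <- NormalizeData(pbmc2, normalization.method = "LogNormalize", scale.factor = 10000)
(5) Use FindVariableFeatures to identify variable features, selecting the genes with the highest variability in the dataset (2,000 by default) for downstream analysis:
pbmc4 <- FindVariableFeatures(pbmc3, selection.method = "vst", nfeatures = 2000)
(6) Apply a linear transformation to scale the data, a standard preprocessing step. The ScaleData function shifts the expression of each gene so that its mean across all cells is 0 and scales it so that its variance across all cells is 1:
all.genes <- rownames(pbmc4)
pbmc5 <- ScaleData(pbmc4, features = all.genes)
(7) Perform PCA on the scaled data from the previous step. Once that step is complete, a data matrix of cells and expressed genes, data_pbmc11_RunTSNE.txt, is generated. From this matrix, extract the cells that express at least one of the iPSC marker genes (LIN28A, ESRG, SOX2, POU5F1, NANOG) as suspected iPSCs, forming a new matrix, sub_epc.txt;
(8) Merge the expression matrix sub_epc.txt with the expression matrix produced by applying the same processing steps to the iPSC single-cell sequencing data, taking their intersection, to obtain ips_EPC.txt;
(9) Perform PCA on the ips_EPC.txt expression matrix to obtain the PCA result; perform Kmeans clustering on the data obtained from the PCA; visualize the clustered data to obtain the tSNE result; and determine from the PCA result and the tSNE result whether residual iPSCs are present.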
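To make the arithmetic of steps (4) and (6) concrete, the sketch below reproduces the two transformations in plain Python for a single cell and a single gene. It assumes the documented behavior of LogNormalize (natural log of 1 plus the scaled proportion) and a plain z-score using the sample standard deviation. It is not the Seurat implementation, it omits any clipping of extreme values, and the function names are hypothetical.

```python
import math
import statistics


def log_normalize(counts_for_one_cell, scale_factor=10000):
    """log1p(count / total counts of the cell * scale_factor) for each gene."""
    total = sum(counts_for_one_cell)
    if total == 0:
        return [0.0 for _ in counts_for_one_cell]
    return [math.log1p(c / total * scale_factor) for c in counts_for_one_cell]


def scale_gene(values_across_cells):
    """Center one gene to mean 0 and scale it to unit variance across cells."""
    mean = statistics.fmean(values_across_cells)
    sd = statistics.stdev(values_across_cells)  # sample standard deviation
    if sd == 0:
        return [0.0 for _ in values_across_cells]
    return [(v - mean) / sd for v in values_across_cells]


if __name__ == "__main__":
    print(log_normalize([1, 1, 2]))     # one cell, three genes (made-up counts)
    print(scale_gene([1.0, 2.0, 3.0]))  # one gene, three cells (made-up values)
```

Normalization works per cell (each cell is divided by its own total), whereas scaling works per gene (each gene is standardized across cells), which is why the two helpers take differently oriented inputs.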
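4. Results
The flowcharts of the detection methods are shown in Figures 1A-D. Table 1 gives the proportion of iPSCs expressing each of 30 iPSC stemness genes; genes with a positive-cell proportion above 50% were taken as preliminary iPSC candidate genes. Table 2 gives the proportion of EPCs expressing each of the 30 iPSC stemness genes; genes with a positive-cell proportion below 10% were taken as preliminary iPSC candidate genes. The quality control results for the EPC single-cell sequencing data are shown in Figure 2. The PCA plot from the combined analysis of EPC and iPSC single-cell data is shown in Figure 3, where red marks iPSC single cells and blue marks the EPC single-cell data. The tSNE plots from the combined analysis are shown in Figures 4A-D, where red marks iPSC single cells and blue marks the EPC single-cell data. The red and blue regions do not overlap, indicating that the EPCs contain no residual iPSCs.
Table 1 Proportion of cells expressing each of the 30 iPSC stemness genes in iPSCs
Table 2 Proportion of cells expressing each of the 30 iPSC stemness genes in EPCs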
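The two screening thresholds described in the results (positive in more than 50% of iPSCs, positive in fewer than 10% of differentiated cells) can be sketched in Python as below. Two points are assumptions rather than statements from the text: a cell is counted as "positive" when its expression value is greater than 0, and the two criteria are combined with a logical AND. The gene names and values in the test data are placeholders.

```python
def positive_fraction(values):
    """Fraction of cells whose expression value is above zero."""
    if not values:
        return 0.0
    return sum(1 for v in values if v > 0) / len(values)


def screen_candidates(ipsc_expr, sample_expr, min_ipsc=0.5, max_sample=0.1):
    """Keep genes positive in > min_ipsc of iPSCs and < max_sample of sample cells.

    Both arguments map a gene name to a list of per-cell expression values.
    A gene absent from sample_expr is treated as 0% positive in the sample.
    """
    candidates = []
    for gene, ipsc_values in ipsc_expr.items():
        in_ipsc = positive_fraction(ipsc_values)
        in_sample = positive_fraction(sample_expr.get(gene, []))
        if in_ipsc > min_ipsc and in_sample < max_sample:
            candidates.append(gene)
    return sorted(candidates)
```

With real data the per-gene value lists would come from the filtered expression matrices of the iPSC reference and the differentiated sample.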
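Example 2 Detection of residual iPSCs in cardiomyocytes
1. Materials
DMEM/F-12 medium, GlutaMAX™ Supplement, Penicillin-Streptomycin, BMP4, and B27 were purchased from Thermo Fisher; RPMI-1640 medium was purchased from Hyclone; TeSR-E8 was purchased from STEMCELL Technologies; Y-27632, CHIR99021, C59, IWR1, thioglycerol, and L-ascorbic acid were purchased from Sigma.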
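2. Procedure for differentiating iPSCs into cardiomyocytes
(1) Begin passaging when the iPSCs have expanded to 75-85% confluence. Taking a T25 vessel as an example: aspirate the old medium, wash twice with room-temperature PBS, add 3 mL of EDTA working solution prewarmed to 37 °C, and place in the 37 °C, 5% CO₂ incubator for 5 min. Check under the microscope that gaps have appeared between individual cells, discard the EDTA, and add 3 mL of TeSR-E8 complete medium to stop the digestion. Transfer to a 15 mL centrifuge tube, centrifuge at 1000 rpm for 5 min at room temperature, and discard the supernatant. Gently pipette and resuspend the cells in 1 mL of TeSR-E8 medium containing 10 μM ROCK inhibitor prewarmed to 37 °C, count them, and plate them on Matrigel-coated culture plates. Taking a 6-well plate as an example, use 2 mL of cell suspension per well at a plating density of 5×10⁴ cells/cm². Wash the undifferentiated iPSCs three times with DPBS to remove dead cells, add TeSR-E8 medium, and photograph the cells under an inverted microscope at 4× magnification to record their state. The medium used is TeSR-E8 + 10 μM Y-27632. This is recorded as DAY 0;
(2) DAY 1-3: induce differentiation into cardiac progenitor cells with cardiac progenitor induction medium. The cardiac progenitor induction medium (CIM) is obtained by adding the cytokine bone morphogenetic protein 4 (BMP4) and the GSK-3 inhibitor CHIR99021 to the cardiac progenitor induction basal medium. In the CIM, the BMP4 concentration is 25 ng/mL and the CHIR99021 concentration is 3-5 μM. The cardiac progenitor induction basal medium consists of DMEM/F-12 medium, GlutaMAX™ Supplement, B27 without vitamin A (B27-Minus VA), thioglycerol, L-ascorbic acid, and Penicillin-Streptomycin;
(3) DAY 4-6: induce differentiation into cardiomyocytes with cardiomyocyte induction medium. The medium used here is cardiomyocyte induction medium containing a Wnt pathway inhibitor (C59 or IWR-1);
The cardiomyocyte induction medium (CDM1) is obtained by adding B27 without insulin (B27-Minus insulin), the cytokine bone morphogenetic protein 4 (BMP4), and the Wnt pathway inhibitor C59 or IWR-1 to the cardiomyocyte induction basal medium;
In the CDM1, the B27-Minus insulin content is 2%, the BMP4 concentration is 10 ng/mL, the C59 concentration is 2 μM, and the IWR-1 concentration is 5 μM;
The cardiomyocyte induction basal medium consists of RPMI-1640 medium, GlutaMAX™ Supplement, and Penicillin-Streptomycin;
Specifically, the cardiomyocyte induction basal medium consists of 98% (v/v) RPMI-1640 medium, 1% (v/v) GlutaMAX™ Supplement, and 1% (v/v) Penicillin-Streptomycin;
(4) DAY 7-16: induce cardiomyocyte maturation with cardiomyocyte maturation medium. Perform a full medium change with cardiomyocyte maturation medium and continue the culture. During the first 6 days of this period, perform a full medium change with cardiomyocyte maturation medium (CDM2) at one-day intervals; thereafter, perform a full medium change with CDM2 every two days;
The cardiomyocyte maturation medium (CDM2) is obtained by adding B27 to the cardiomyocyte induction basal medium;
In the CDM2, the B27 content is 2%.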
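3. Analysis workflow for cardiomyocyte single-cell sequencing data
(1) Analyze the raw single-cell transcriptome data with cellranger-5.0.0, using the cellranger count tool and reference genome version GRCh38-2020-A, to obtain the sample's single-cell transcriptome expression matrix:
cellranger count --id=EPC --fastqs=rawdata_dir --sample=EPC --localcores=8 --localmem=64 --transcriptome=refdata-gex-GRCh38-2020-A
(2) Single-cell data can be analyzed with the Seurat package. First use the R function Read10X to read the sample's single-cell transcriptome expression matrix into a sparse matrix, then create a Seurat object and set the cell-filtering conditions:
pbmc.data <- Read10X(data.dir = data_dir)
pbmc1 <- CreateSeuratObject(counts = pbmc.data, project = project_name, min.cells = QC_min_cells, min.features = QC_min_features)
where data_dir is the directory containing the single-cell transcriptome expression matrix, project_name is the name of the dataset, QC_min_cells is the minimum number of cells in which a gene must be detected, and QC_min_features is the minimum number of genes that must be detected in each cell;
(3) Add a mitochondrial percentage column. The proportion of mitochondrial genes should be sufficiently small; it is computed with the PercentageFeatureSet function, where genes whose names start with MT- are mitochondrial genes. Then filter the data:
pbmc1[["percent.mt"]] <- PercentageFeatureSet(pbmc1, pattern = "^MT-")
pbmc2 <- subset(pbmc1, subset = nFeature_RNA > QC_min_features & nFeature_RNA < QC_max_features & percent.mt < QC_percent_mt)
where QC_max_features is the maximum number of genes detected per cell and QC_percent_mt is the threshold for the mitochondrial percentage of a cell;
(4) Apply the global-scaling normalization method LogNormalize, which normalizes the feature expression measurements of each cell by the cell's total expression, multiplies the result by a scale factor (10,000 by default), and then log-transforms it:
pbmc3 <- NormalizeData(pbmc2, normalization.method = "LogNormalize", scale.factor = 10000)
(5) Use FindVariableFeatures to identify variable features, selecting the genes with the highest variability in the dataset (2,000 by default) for downstream analysis:
pbmc4 <- FindVariableFeatures(pbmc3, selection.method = "vst", nfeatures = 2000)
(6) Apply a linear transformation to scale the data, a standard preprocessing step. The ScaleData function shifts the expression of each gene so that its mean across all cells is 0 and scales it so that its variance across all cells is 1:
all.genes <- rownames(pbmc4)
pbmc5 <- ScaleData(pbmc4, features = all.genes)
(7) Perform PCA on the scaled data from the previous step. Once that step is complete, a data matrix of cells and expressed genes, data_pbmc11_RunTSNE.txt, is generated. From this matrix, extract the cells that express at least one of the iPSC marker genes (LIN28A, ESRG, SOX2, POU5F1, NANOG) as suspected iPSCs, forming a new matrix, sub_HeartMuscle.txt;
(8) Merge the expression matrix sub_HeartMuscle.txt with the expression matrix produced by applying the same processing steps to the iPSC single-cell sequencing data, taking their intersection, to obtain ips_HeartMuscle.txt;
(9) Perform PCA on the ips_HeartMuscle.txt expression matrix to obtain the PCA result; perform Kmeans clustering on the data obtained from the PCA; visualize the clustered data to obtain the tSNE result; and determine from the PCA result and the tSNE result whether residual iPSCs are present.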
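Steps (7) and (8) describe extracting suspected iPSCs by marker expression and then merging with the iPSC reference matrix "taking their intersection". A minimal Python sketch of that logic follows. It makes several assumptions that the text does not state: the matrix is held as a nested dictionary (gene to cell to value), "expressing" a marker means a value greater than 0 on non-negative (unscaled) data, the intersection is taken over gene names, and cell identifiers are unique across the two matrices. All function names are hypothetical.

```python
IPSC_MARKERS = ("LIN28A", "ESRG", "SOX2", "POU5F1", "NANOG")


def suspected_ipsc_cells(matrix, markers=IPSC_MARKERS):
    """Return the sorted ids of cells expressing at least one marker gene."""
    cells = set()
    for gene in markers:
        for cell, value in matrix.get(gene, {}).items():
            if value > 0:
                cells.add(cell)
    return sorted(cells)


def subset_cells(matrix, keep):
    """Restrict every gene row to the given cell ids."""
    keep = set(keep)
    return {g: {c: v for c, v in row.items() if c in keep} for g, row in matrix.items()}


def merge_on_shared_genes(sample_matrix, ipsc_matrix):
    """Combine two matrices, keeping only genes present in both."""
    merged = {}
    for gene in sorted(set(sample_matrix) & set(ipsc_matrix)):
        row = dict(sample_matrix[gene])
        row.update(ipsc_matrix[gene])  # assumes cell ids do not collide
        merged[gene] = row
    return merged
```

A typical call sequence would be `subset_cells(sample, suspected_ipsc_cells(sample))` followed by `merge_on_shared_genes(...)` with the reference matrix, giving the input for the PCA in step (9).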
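4. Results
Table 3 gives the proportion of cardiomyocytes expressing each of the 30 iPSC stemness genes; genes with a positive-cell proportion below 10% were taken as preliminary iPSC candidate genes. The quality control results for the cardiomyocyte single-cell sequencing data are shown in Figure 5. The PCA plot from the combined analysis of cardiomyocyte and iPSC single-cell data is shown in Figure 6, where red marks iPSC single cells and blue marks the cardiomyocyte single-cell data. The tSNE plots from the combined analysis are shown in Figures 7A-D, where red marks iPSC single cells and blue marks the cardiomyocyte single-cell data. The red and blue regions do not overlap, indicating that the cardiomyocytes contain no residual iPSCs.
Table 3 Proportion of cells expressing each of the 30 iPSC stemness genes in cardiomyocytes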
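Example 3 Single-cell sequencing of islet cells
1. Analysis workflow for islet cell single-cell sequencing data
(1) Analyze the raw single-cell transcriptome data with cellranger-5.0.0, using the cellranger count tool and reference genome version GRCh38-2020-A, to obtain the sample's single-cell transcriptome expression matrix:
cellranger count --id=EPC --fastqs=rawdata_dir --sample=EPC --localcores=8 --localmem=64 --transcriptome=refdata-gex-GRCh38-2020-A
(2) Single-cell data can be analyzed with the Seurat package. First use the R function Read10X to read the sample's single-cell transcriptome expression matrix into a sparse matrix, then create a Seurat object and set the cell-filtering conditions:
pbmc.data <- Read10X(data.dir = data_dir)
pbmc1 <- CreateSeuratObject(counts = pbmc.data, project = project_name, min.cells = QC_min_cells, min.features = QC_min_features)
where data_dir is the directory containing the single-cell transcriptome expression matrix, project_name is the name of the dataset, QC_min_cells is the minimum number of cells in which a gene must be detected, and QC_min_features is the minimum number of genes that must be detected in each cell;
(3) Add a mitochondrial percentage column. The proportion of mitochondrial genes should be sufficiently small; it is computed with the PercentageFeatureSet function, where genes whose names start with MT- are mitochondrial genes. Then filter the data:
pbmc1[["percent.mt"]] <- PercentageFeatureSet(pbmc1, pattern = "^MT-")
pbmc2 <- subset(pbmc1, subset = nFeature_RNA > QC_min_features & nFeature_RNA < QC_max_features & percent.mt < QC_percent_mt)
where QC_max_features is the maximum number of genes detected per cell and QC_percent_mt is the threshold for the mitochondrial percentage of a cell;
(4) Apply the global-scaling normalization method LogNormalize, which normalizes the feature expression measurements of each cell by the cell's total expression, multiplies the result by a scale factor (10,000 by default), and then log-transforms it:
pbmc3 <- NormalizeData(pbmc2, normalization.method = "LogNormalize", scale.factor = 10000)
(5) Use FindVariableFeatures to identify variable features, selecting the genes with the highest variability in the dataset (2,000 by default) for downstream analysis:
pbmc4 <- FindVariableFeatures(pbmc3, selection.method = "vst", nfeatures = 2000)
(6) Apply a linear transformation to scale the data, a standard preprocessing step. The ScaleData function shifts the expression of each gene so that its mean across all cells is 0 and scales it so that its variance across all cells is 1:
all.genes <- rownames(pbmc4)
pbmc5 <- ScaleData(pbmc4, features = all.genes)
(7) Perform PCA on the scaled data from the previous step. Once that step is complete, a data matrix of cells and expressed genes, data_pbmc11_RunTSNE.txt, is generated. From this matrix, extract the cells that express at least one of the iPSC marker genes (LIN28A, ESRG, SOX2, POU5F1, NANOG) as suspected iPSCs, forming a new matrix, sub_islet.txt;
(8) Merge the expression matrix sub_islet.txt with the expression matrix produced by applying the same processing steps to the iPSC single-cell sequencing data, taking their intersection, to obtain ips_islet.txt;
(9) Perform PCA on the ips_islet.txt expression matrix to obtain the PCA result; perform Kmeans clustering on the data obtained from the PCA; visualize the clustered data to obtain the tSNE result; and determine from the PCA result and the tSNE result whether residual iPSCs are present.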
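The quality-control filter of step (3) can be illustrated for a single cell with the Python sketch below. It mirrors the condition passed to `subset` (number of detected genes strictly between a minimum and a maximum, mitochondrial percentage strictly below a threshold) and computes the mitochondrial percentage from genes whose names start with MT-. The representation of a cell as a gene-to-count dictionary, the function names, and the threshold values in the example are assumptions for illustration only.

```python
def percent_mt(gene_counts):
    """Percentage of a cell's counts that come from genes named MT-*."""
    total = sum(gene_counts.values())
    if total == 0:
        return 0.0
    mito = sum(count for gene, count in gene_counts.items() if gene.startswith("MT-"))
    return 100.0 * mito / total


def passes_qc(gene_counts, min_features, max_features, max_percent_mt):
    """Apply the same three strict inequalities as the subset() call above."""
    n_features = sum(1 for count in gene_counts.values() if count > 0)
    return (
        min_features < n_features < max_features
        and percent_mt(gene_counts) < max_percent_mt
    )


if __name__ == "__main__":
    # Made-up cell with three detected genes, one of them mitochondrial.
    cell = {"MT-CO1": 10, "ACTB": 80, "GAPDH": 10}
    print(percent_mt(cell))            # prints 10.0
    print(passes_qc(cell, 1, 10, 20))  # prints True
```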
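2. Results
Table 4 gives the proportion of islet cells expressing each of the 30 iPSC stemness genes; genes with a positive-cell proportion below 10% were taken as preliminary iPSC candidate genes. The iPSC candidate genes finally selected from the preliminary screening were LIN28A, ESRG, SOX2, POU5F1, and NANOG (Table 5). The quality control results for the islet single-cell sequencing data are shown in Figure 8. The PCA plot from the combined analysis of islet and iPSC single-cell data is shown in Figure 9, where red marks iPSC single cells and blue marks the islet single-cell data. The tSNE plots from the combined analysis are shown in Figures 10A-D, where red marks iPSC single cells and blue marks the islet single-cell data. The red and blue regions do not overlap, indicating that the islets contain no residual iPSCs.
Table 4 Proportion of cells expressing each of the 30 iPSC stemness genes in islet cells
Table 5 iPSC candidate genes finally selected from the preliminary screening
The description of the above examples is intended only to aid understanding of the method of the present invention and its core idea. It should be noted that a person of ordinary skill in the art may make improvements and modifications to the present invention without departing from its principles, and such improvements and modifications also fall within the scope of protection of the claims.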
Claims (10)
- A set of biomarkers for detecting residual iPSCs, characterized in that the biomarkers comprise one or more of Alcam, Arid1b, Ars2, Ash2l, Axin2, Bmi1, Brix, Cbx1, Cbx5, Ccna1, Ccnd1, Ccnd2, Ccne1, Ccnf, Cd24, Cd44, Cd9, Cdh3, Cdk2, Cdk4, Cdk6, Cdkn1b, Cdyl, Cldn6, Cnot1, Cnot2, Cnot3, Cops2, Cops4, Cpsf3, rabp1, Dazap1, Dnmt3b, Dppa2, Dppa3, Dppa4, Dppa5, Dpy30, E2f1, Eed, Ehmt2, Eif2b1, Eif2b2, Eif2b3, Eif2s2, Epcam, Eras, ESRG, Esrrb, Ewsr1, Ezh1, Ezh2, Fbxo15, Fgf13, Fgf4, Flt3, Foxd3, Foxh1, Fry, Fut4, SSEA1, Gabrb3, Gal, Gbx2, Gdf3, Gja1, Gli1, Gli2, Gli3, Glis1, Gnl3, Grb7, H2afz, Has2, Hcfc1, Herc5, Hesx1, Hira, Hmga1, Hspa4, Hspb1, Id1, Ing5, Itga6, Jarid2, Kat2a, Kat5, Kat6a, Kdm1a, Kdm3a, Kdm4a, Kdm4c, Kdm5b, Kit, Kitlg, Klf12, Klf2, Klf4, Klf5, L1td1, Lefty1, Lefty2, LIN28A, Lin28b, Ly6e, Mapk1, Max, Mcm2, Mcrs1, Med1, Med10, Med12, Med13, Med13l, Med14, Med17, Med19, Med24, Med28, Metap2, Mga, Mll, Mll2, Mll3, Mll5, Msi1, Mt1a, Mt2a, Mthfd1, Mybl2, Myc, Mycn, Nacc1, NANOG, Nanos1, Ncam, Ncoa2, Ncoa3, Nfrkb, Nodal, Npr1, Nr0b1, Nr6a1, Nts, Otx1, Otx2, Paf1, Pcgf6, Pcid2, Pcna, Phc1, Phc2, Phc3, Pim2, Podxl, POU5F1, Ppp1r3d, Prdm14, Prdm16, Prdm5, Prmt6, Prom1, Ptprz1, Pum1, Pum2, Rad21, Rb1, Rbbp4, Rbbp5, Rbbp7, Rbbp9, Rbl2, Rbx1, Rest, Rif1, Ring1, Rnf2, Rtf1, Sall1, Sall4, Sema4a, Setdb1, Setdb2, Sf3a1, Sf3a3, Sfrp2, Sirt2, Skil, Smad1, Smad2, Smad3, Smarca4, Smarca5, Smarcd1, Smarcb1, Smarcc1, Smarcd1, Smc1a, Smo, SOX2, Sox3, Sp1, Spp1, Stag1, Stat3, Sub1, Suv39h2, Suz12, Taf2, Taf7, Tcf3, Tcf7l1, Tcl1a, Tdgf1, Terf1, Tert, Tgif, Thap11, Thy1, Tle1, Tnfrsf8, Top2a, Trim16, Trim24, Trim28, Utf1, Wdr18, Wdr5, Wnt2b, Wnt8a, Xpo7, Yy1, Zfhx3, Zfp41, Zfp42, Zfx, Zic2, Zic3, Zic5, Znf143, Znf219, Znf281, and Zscan10; preferably, the biomarkers are one or more of LIN28A, ESRG, SOX2, POU5F1, and NANOG.
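- A method for screening biomarkers for detecting residual iPSCs, characterized in that the method comprises the following steps: (1) performing single-cell sequencing on a test sample; (2) performing bioinformatics analysis on the sequencing results from step (1), comparing all expressed genes, and screening out biomarkers of residual iPSCs; preferably, the sample in step (1) comprises iPSC-differentiated cells; more preferably, the sample comprises endothelial progenitor cells, cardiomyocytes, endothelial cells, cardiac fibroblasts, neural stem cells, microglia, mesenchymal stem cells, retinal pigment epithelial cells, hepatocytes, hematopoietic stem cells, islet cells, erythrocytes, B lymphocytes, T lymphocytes, natural killer cells, neutrophils, basophils, eosinophils, monocytes, and macrophages; most preferably, the sample comprises endothelial progenitor cells, cardiomyocytes, and islet cells; preferably, comparing all expressed genes in step (2) comprises comparing the differences in expression of all genes between iPSCs and the sample, and screening out biomarkers of residual iPSCs; more preferably, the screening comprises the following steps: selecting, among the iPSC stemness genes, genes expressed in more than 50% of iPSCs (positive-cell proportion > 50%) as candidate genes for residual iPSCs; selecting, among the iPSC stemness genes, genes expressed in fewer than 10% of cells in the sample (positive-cell proportion < 10%) as candidate genes for residual iPSCs; and determining the biomarkers of residual iPSCs on the basis of the candidate genes.
- The method according to claim 2, characterized in that the bioinformatics analysis comprises the following steps: a. analyzing the raw single-cell transcriptome data with cellranger-5.0.0; b. analyzing the single-cell data with the Seurat package; c. adding a mitochondrial percentage column, computed with the PercentageFeatureSet function, and filtering the data; d. processing the data with the global-scaling normalization method LogNormalize; e. using FindVariableFeatures to identify variable features and selecting the highly variable feature genes.
- A method for detecting residual iPSCs, characterized in that the method comprises the following step: detecting the expression level of biomarkers in a test sample; preferably, the biomarkers are the biomarkers of claim 1.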
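- The method according to claim 4, characterized in that the method further comprises the following steps: (1) performing PCA analysis and Kmeans analysis on the biomarkers in the test sample; (2) determining the level of residual iPSCs from the PCA result and the tSNE result obtained in step (1); preferably, the biomarkers are the biomarkers of claim 1.
- The method according to claim 5, characterized in that the sample in step (1) comprises iPSC-differentiated cells; preferably, the sample comprises endothelial progenitor cells, cardiomyocytes, endothelial cells, cardiac fibroblasts, neural stem cells, microglia, mesenchymal stem cells, retinal pigment epithelial cells, hepatocytes, hematopoietic stem cells, islet cells, erythrocytes, B lymphocytes, T lymphocytes, natural killer cells, neutrophils, basophils, eosinophils, monocytes, and macrophages; more preferably, the sample comprises endothelial progenitor cells, cardiomyocytes, and islet cells.
- The method according to any one of claims 4-6, characterized in that step (1) further comprises the following steps: a. scaling the biomarkers by applying a linear transformation; b. performing PCA analysis on the scaled data to obtain expression matrix data; c. merging the sample expression matrix data with the expression matrix data obtained from iPSC single-cell sequencing analysis, taking their intersection, to obtain a new expression matrix; d. performing PCA analysis and Kmeans analysis on the data of the new expression matrix to obtain the PCA result and the tSNE result; preferably, the biomarkers are the biomarkers of claim 1; preferably, step b further comprises filtering the scaled data and extracting the cells that express one or more of the biomarkers as suspected iPSCs, to obtain the expression matrix data; preferably, the biomarkers include LIN28A, ESRG, SOX2, POU5F1, and NANOG.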
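- A kit for detecting residual iPSCs, characterized in that the kit comprises reagents for detecting the expression level of one or more of the biomarkers LIN28A, ESRG, SOX2, POU5F1, and NANOG; preferably, the reagents comprise primers that specifically amplify one or more of the biomarkers LIN28A, ESRG, SOX2, POU5F1, and NANOG, or probes that specifically recognize one or more of the biomarkers LIN28A, ESRG, SOX2, POU5F1, and NANOG; preferably, the kit further comprises dNTPs, Mg²⁺ ions, and DNA polymerase, or a PCR system containing dNTPs, Mg²⁺ ions, and DNA polymerase.
- A system for detecting residual iPSCs, characterized in that the system comprises a unit for detecting the expression level of one or more of the biomarkers LIN28A, ESRG, SOX2, POU5F1, and NANOG in a test sample; preferably, the system further comprises a unit for culturing iPSCs; preferably, the system further comprises an iPSC induced-differentiation unit; more preferably, the unit for culturing iPSCs comprises E8 complete medium and Y-27632; most preferably, the concentration of Y-27632 is 10 μM; more preferably, the unit for detecting the expression level of one or more of the biomarkers LIN28A, ESRG, SOX2, POU5F1, and NANOG in the test sample implements the method of any one of claims 4-7; most preferably, the unit for detecting the expression level of one or more of the biomarkers LIN28A, ESRG, SOX2, POU5F1, and NANOG in the test sample determines whether residual iPSCs are present on the basis of the PCA result and the tSNE result; most preferably, if the PCA result and the tSNE result show no overlap between the iPSC data and the single-cell data of the test sample, the test sample contains no residual iPSCs; if the PCA result and the tSNE result show overlap between the iPSC data and the single-cell data of the test sample, the test sample contains residual iPSCs.
- A use according to any one of the following, characterized in that the use comprises: (1) use of single-cell sequencing technology in detecting residual iPSCs; (2) use of the biomarkers of claim 1 in detecting residual iPSCs; (3) use of the biomarkers of claim 1 in preparing a reagent for detecting residual iPSCs; (4) use of a reagent for detecting the expression level of the biomarkers of claim 1 in preparing a kit for detecting residual iPSCs; preferably, the kit is the kit of claim 8; (5) use of a reagent for detecting the expression level of the biomarkers of claim 1 in a system for detecting residual iPSCs; preferably, the system for detecting residual iPSCs is the system of claim 9; (6) use of the kit of claim 8 in detecting residual iPSCs; (7) use of the system of claim 9 in detecting residual iPSCs; (8) use of PCA analysis and Kmeans analysis in detecting residual iPSCs.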
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110612182.8A CN113355433B (zh) | 2021-06-02 | 2021-06-02 | 一种基于单细胞测序数据分析的iPSC残留检测方法 |
CN202110612182.8 | 2021-06-02 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022253022A1 true WO2022253022A1 (zh) | 2022-12-08 |
Family
ID=77531098
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/094411 WO2022253022A1 (zh) | 2021-06-02 | 2022-05-23 | 一种基于单细胞测序数据分析的iPSC残留检测方法 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN113355433B (zh) |
WO (1) | WO2022253022A1 (zh) |
Also Published As
Publication number | Publication date |
---|---|
CN113355433A (zh) | 2021-09-07 |
CN113355433B (zh) | 2022-07-19 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22815069 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC |