WO2020224040A1 - Method for capturing RNA in situ higher-order structures and interactions - Google Patents
- Publication number
- WO2020224040A1 (application PCT/CN2019/094790)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- rna
- cell
- minutes
- solution
- situ
- Prior art date
Links
- 238000000034 method Methods 0.000 title claims abstract description 103
- 230000003993 interaction Effects 0.000 title claims abstract description 93
- 238000011065 in-situ storage Methods 0.000 title claims abstract description 40
- 108091032973 (ribonucleotides)n+m Proteins 0.000 claims abstract description 189
- 108090000623 proteins and genes Proteins 0.000 claims abstract description 51
- 239000011616 biotin Substances 0.000 claims abstract description 42
- 229960002685 biotin Drugs 0.000 claims abstract description 42
- 102000004169 proteins and genes Human genes 0.000 claims abstract description 25
- 230000001404 mediated effect Effects 0.000 claims abstract description 10
- 108091092236 Chimeric RNA Proteins 0.000 claims abstract description 9
- 238000012165 high-throughput sequencing Methods 0.000 claims abstract description 6
- 239000012528 membrane Substances 0.000 claims abstract description 6
- 210000004027 cell Anatomy 0.000 claims description 163
- 238000006243 chemical reaction Methods 0.000 claims description 74
- 239000011324 bead Substances 0.000 claims description 43
- WSFSSNUMVMOOMR-UHFFFAOYSA-N Formaldehyde Chemical group O=C WSFSSNUMVMOOMR-UHFFFAOYSA-N 0.000 claims description 37
- 239000000872 buffer Substances 0.000 claims description 28
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 claims description 24
- 239000002904 solvent Substances 0.000 claims description 21
- 239000003550 marker Substances 0.000 claims description 20
- 102000002260 Alkaline Phosphatase Human genes 0.000 claims description 14
- 108020004774 Alkaline Phosphatase Proteins 0.000 claims description 14
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 claims description 14
- 102000004190 Enzymes Human genes 0.000 claims description 12
- 108090000790 Enzymes Proteins 0.000 claims description 12
- 230000008823 permeabilization Effects 0.000 claims description 12
- 230000008569 process Effects 0.000 claims description 12
- 239000011780 sodium chloride Substances 0.000 claims description 12
- 108020005198 Long Noncoding RNA Proteins 0.000 claims description 11
- 108010059724 Micrococcal Nuclease Proteins 0.000 claims description 11
- 108010067770 Endopeptidase K Proteins 0.000 claims description 10
- 238000010276 construction Methods 0.000 claims description 9
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 claims description 8
- 125000002467 phosphate group Chemical group [H]OP(=O)(O[H])O[*] 0.000 claims description 8
- 239000003161 ribonuclease inhibitor Substances 0.000 claims description 8
- 239000004471 Glycine Substances 0.000 claims description 7
- 239000008098 formaldehyde solution Substances 0.000 claims description 7
- 125000002887 hydroxy group Chemical group [H]O* 0.000 claims description 7
- 210000000170 cell membrane Anatomy 0.000 claims description 6
- 238000004132 cross linking Methods 0.000 claims description 6
- 230000029087 digestion Effects 0.000 claims description 6
- 239000003599 detergent Substances 0.000 claims description 5
- 229940042399 direct acting antivirals protease inhibitors Drugs 0.000 claims description 5
- 239000000137 peptide hydrolase inhibitor Substances 0.000 claims description 5
- 229920001213 Polysorbate 20 Polymers 0.000 claims description 4
- 238000013467 fragmentation Methods 0.000 claims description 4
- 238000006062 fragmentation reaction Methods 0.000 claims description 4
- 238000010438 heat treatment Methods 0.000 claims description 4
- 239000000256 polyoxyethylene sorbitan monolaurate Substances 0.000 claims description 4
- 235000010486 polyoxyethylene sorbitan monolaurate Nutrition 0.000 claims description 4
- 230000015556 catabolic process Effects 0.000 claims description 3
- 238000006731 degradation reaction Methods 0.000 claims description 3
- 230000033444 hydroxylation Effects 0.000 claims description 3
- 238000005805 hydroxylation reaction Methods 0.000 claims description 3
- 238000012545 processing Methods 0.000 claims description 3
- 241001465754 Metazoa Species 0.000 claims description 2
- 108010021757 Polynucleotide 5'-Hydroxyl-Kinase Proteins 0.000 claims description 2
- 102000008422 Polynucleotide 5'-hydroxyl-kinase Human genes 0.000 claims description 2
- 239000013504 Triton X-100 Substances 0.000 claims description 2
- 229920004890 Triton X-100 Polymers 0.000 claims description 2
- 210000004102 animal cell Anatomy 0.000 claims description 2
- 230000026731 phosphorylation Effects 0.000 claims description 2
- 238000006366 phosphorylation reaction Methods 0.000 claims description 2
- 102000003960 Ligases Human genes 0.000 claims 1
- 108090000364 Ligases Proteins 0.000 claims 1
- 230000000640 hydroxylating effect Effects 0.000 claims 1
- 238000002372 labelling Methods 0.000 abstract description 3
- 230000000593 degrading effect Effects 0.000 abstract description 2
- 101100191561 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) PRP3 gene Proteins 0.000 abstract 1
- 230000006037 cell lysis Effects 0.000 abstract 1
- 239000000243 solution Substances 0.000 description 108
- 239000000523 sample Substances 0.000 description 65
- 239000000047 product Substances 0.000 description 57
- 239000006228 supernatant Substances 0.000 description 54
- 238000005406 washing Methods 0.000 description 54
- 239000000203 mixture Substances 0.000 description 52
- 108020004414 DNA Proteins 0.000 description 31
- 239000008188 pellet Substances 0.000 description 30
- 108091046869 Telomeric non-coding RNA Proteins 0.000 description 28
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 27
- 108091007767 MALAT1 Proteins 0.000 description 22
- 102100038895 Myc proto-oncogene protein Human genes 0.000 description 22
- 230000014509 gene expression Effects 0.000 description 21
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 20
- 238000012163 sequencing technique Methods 0.000 description 19
- 239000012634 fragment Substances 0.000 description 17
- DEFVIWRASFVYLL-UHFFFAOYSA-N ethylene glycol bis(2-aminoethyl)tetraacetic acid Chemical compound OC(=O)CN(CC(O)=O)CCOCCOCCN(CC(O)=O)CC(O)=O DEFVIWRASFVYLL-UHFFFAOYSA-N 0.000 description 16
- 102000040650 (ribonucleotides)n+m Human genes 0.000 description 15
- 108091027881 NEAT1 Proteins 0.000 description 15
- 239000003623 enhancer Substances 0.000 description 15
- 238000005516 engineering process Methods 0.000 description 14
- 210000001519 tissue Anatomy 0.000 description 14
- 108010077544 Chromatin Proteins 0.000 description 11
- 101710086015 RNA ligase Proteins 0.000 description 11
- 238000004458 analytical method Methods 0.000 description 11
- 210000003483 chromatin Anatomy 0.000 description 11
- 210000000349 chromosome Anatomy 0.000 description 11
- 239000002244 precipitate Substances 0.000 description 11
- 238000012546 transfer Methods 0.000 description 10
- 108091093018 PVT1 Proteins 0.000 description 9
- 239000002243 precursor Substances 0.000 description 9
- 238000000746 purification Methods 0.000 description 9
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 8
- 206010028980 Neoplasm Diseases 0.000 description 8
- 230000027455 binding Effects 0.000 description 8
- 230000002255 enzymatic effect Effects 0.000 description 8
- 238000006911 enzymatic reaction Methods 0.000 description 8
- 108020004999 messenger RNA Proteins 0.000 description 8
- 238000002156 mixing Methods 0.000 description 8
- 108091027963 non-coding RNA Proteins 0.000 description 8
- 102000042567 non-coding RNA Human genes 0.000 description 8
- 239000012266 salt solution Substances 0.000 description 8
- 238000003559 RNA-seq method Methods 0.000 description 7
- 239000007984 Tris EDTA buffer Substances 0.000 description 7
- 230000000694 effects Effects 0.000 description 7
- 108020003224 Small Nucleolar RNA Proteins 0.000 description 6
- 102000042773 Small Nucleolar RNA Human genes 0.000 description 6
- 108010090804 Streptavidin Proteins 0.000 description 6
- 108091007416 X-inactive specific transcript Proteins 0.000 description 6
- 108091035715 XIST (gene) Proteins 0.000 description 6
- 201000011510 cancer Diseases 0.000 description 6
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical class NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 6
- 238000007405 data analysis Methods 0.000 description 6
- 238000009826 distribution Methods 0.000 description 6
- 230000006870 function Effects 0.000 description 6
- 238000001727 in vivo Methods 0.000 description 6
- 230000009878 intermolecular interaction Effects 0.000 description 6
- 239000002679 microRNA Substances 0.000 description 6
- ZCCUUQDIBDJBTK-UHFFFAOYSA-N psoralen Chemical compound C1=C2OC(=O)C=CC2=CC2=C1OC=C2 ZCCUUQDIBDJBTK-UHFFFAOYSA-N 0.000 description 6
- 230000002829 reductive effect Effects 0.000 description 6
- 206010009944 Colon cancer Diseases 0.000 description 5
- AHCYMLUZIRLXAA-SHYZEUOFSA-N Deoxyuridine 5'-triphosphate Chemical compound O1[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)C[C@@H]1N1C(=O)NC(=O)C=C1 AHCYMLUZIRLXAA-SHYZEUOFSA-N 0.000 description 5
- 241000255581 Drosophila <fruit fly, genus> Species 0.000 description 5
- 101000900499 Homo sapiens Glutamate receptor ionotropic, delta-2 Proteins 0.000 description 5
- 101001098818 Homo sapiens cGMP-inhibited 3',5'-cyclic phosphodiesterase A Proteins 0.000 description 5
- 206010027476 Metastases Diseases 0.000 description 5
- 238000012408 PCR amplification Methods 0.000 description 5
- 102100037093 cGMP-inhibited 3',5'-cyclic phosphodiesterase A Human genes 0.000 description 5
- 239000003086 colorant Substances 0.000 description 5
- 238000002474 experimental method Methods 0.000 description 5
- 238000003197 gene knockdown Methods 0.000 description 5
- 230000008863 intramolecular interaction Effects 0.000 description 5
- 230000009401 metastasis Effects 0.000 description 5
- 108091070501 miRNA Proteins 0.000 description 5
- 108020004418 ribosomal RNA Proteins 0.000 description 5
- 238000013518 transcription Methods 0.000 description 5
- HEDRZPFGACZZDS-UHFFFAOYSA-N Chloroform Chemical compound ClC(Cl)Cl HEDRZPFGACZZDS-UHFFFAOYSA-N 0.000 description 4
- 208000001333 Colorectal Neoplasms Diseases 0.000 description 4
- 102000053602 DNA Human genes 0.000 description 4
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 4
- 102100022192 Glutamate receptor ionotropic, delta-2 Human genes 0.000 description 4
- 102100034343 Integrase Human genes 0.000 description 4
- KFZMGEQAYNKOFK-UHFFFAOYSA-N Isopropanol Chemical compound CC(C)O KFZMGEQAYNKOFK-UHFFFAOYSA-N 0.000 description 4
- 108091027974 Mature messenger RNA Proteins 0.000 description 4
- 229910019142 PO4 Inorganic materials 0.000 description 4
- 230000003321 amplification Effects 0.000 description 4
- 230000015572 biosynthetic process Effects 0.000 description 4
- 235000020958 biotin Nutrition 0.000 description 4
- 230000004663 cell proliferation Effects 0.000 description 4
- 238000011161 development Methods 0.000 description 4
- 230000009545 invasion Effects 0.000 description 4
- 238000003199 nucleic acid amplification method Methods 0.000 description 4
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 4
- 239000010452 phosphate Substances 0.000 description 4
- 229920000642 polymer Polymers 0.000 description 4
- 230000001105 regulatory effect Effects 0.000 description 4
- 230000035897 transcription Effects 0.000 description 4
- 238000011144 upstream manufacturing Methods 0.000 description 4
- VXGRJERITKFWPL-UHFFFAOYSA-N 4',5'-Dihydropsoralen Natural products C1=C2OC(=O)C=CC2=CC2=C1OCC2 VXGRJERITKFWPL-UHFFFAOYSA-N 0.000 description 3
- 206010008342 Cervix carcinoma Diseases 0.000 description 3
- 108010008532 Deoxyribonuclease I Proteins 0.000 description 3
- 102000007260 Deoxyribonuclease I Human genes 0.000 description 3
- 108010033040 Histones Proteins 0.000 description 3
- 101710203526 Integrase Proteins 0.000 description 3
- 238000000636 Northern blotting Methods 0.000 description 3
- 108091034117 Oligonucleotide Proteins 0.000 description 3
- 102000009572 RNA Polymerase II Human genes 0.000 description 3
- 108010009460 RNA Polymerase II Proteins 0.000 description 3
- HEMHJVSKTPXQMS-UHFFFAOYSA-M Sodium hydroxide Chemical compound [OH-].[Na+] HEMHJVSKTPXQMS-UHFFFAOYSA-M 0.000 description 3
- 108700009124 Transcription Initiation Site Proteins 0.000 description 3
- 208000006105 Uterine Cervical Neoplasms Diseases 0.000 description 3
- 230000033228 biological regulation Effects 0.000 description 3
- 201000010881 cervical cancer Diseases 0.000 description 3
- YTRQFSDWAXHJCC-UHFFFAOYSA-N chloroform;phenol Chemical compound ClC(Cl)Cl.OC1=CC=CC=C1 YTRQFSDWAXHJCC-UHFFFAOYSA-N 0.000 description 3
- 238000010835 comparative analysis Methods 0.000 description 3
- 230000005014 ectopic expression Effects 0.000 description 3
- 230000002068 genetic effect Effects 0.000 description 3
- 239000004615 ingredient Substances 0.000 description 3
- 238000013507 mapping Methods 0.000 description 3
- 239000011159 matrix material Substances 0.000 description 3
- 230000007246 mechanism Effects 0.000 description 3
- 210000000633 nuclear envelope Anatomy 0.000 description 3
- 210000004940 nucleus Anatomy 0.000 description 3
- 238000002360 preparation method Methods 0.000 description 3
- -1 rRNA Proteins 0.000 description 3
- 238000011160 research Methods 0.000 description 3
- 239000011734 sodium Substances 0.000 description 3
- 239000007787 solid Substances 0.000 description 3
- 238000003786 synthesis reaction Methods 0.000 description 3
- 238000012353 t test Methods 0.000 description 3
- 230000008685 targeting Effects 0.000 description 3
- 238000012360 testing method Methods 0.000 description 3
- 108010039627 Aprotinin Proteins 0.000 description 2
- 238000001353 Chip-sequencing Methods 0.000 description 2
- 108010017826 DNA Polymerase I Proteins 0.000 description 2
- 102000004594 DNA Polymerase I Human genes 0.000 description 2
- 239000003298 DNA probe Substances 0.000 description 2
- 108010053770 Deoxyribonucleases Proteins 0.000 description 2
- 102000016911 Deoxyribonucleases Human genes 0.000 description 2
- LTLYEAJONXGNFG-DCAQKATOSA-N E64 Chemical compound NC(=N)NCCCCNC(=O)[C@H](CC(C)C)NC(=O)[C@H]1O[C@@H]1C(O)=O LTLYEAJONXGNFG-DCAQKATOSA-N 0.000 description 2
- 101000960626 Homo sapiens Mitochondrial inner membrane protease subunit 2 Proteins 0.000 description 2
- 101150039798 MYC gene Proteins 0.000 description 2
- 102100039840 Mitochondrial inner membrane protease subunit 2 Human genes 0.000 description 2
- 108020005093 RNA Precursors Proteins 0.000 description 2
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 2
- 108020004688 Small Nuclear RNA Proteins 0.000 description 2
- 102000039471 Small Nuclear RNA Human genes 0.000 description 2
- VMHLLURERBWHNL-UHFFFAOYSA-M Sodium acetate Chemical compound [Na+].CC([O-])=O VMHLLURERBWHNL-UHFFFAOYSA-M 0.000 description 2
- 108091027544 Subgenomic mRNA Proteins 0.000 description 2
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 2
- 239000002253 acid Substances 0.000 description 2
- 230000002378 acidificating effect Effects 0.000 description 2
- 230000009471 action Effects 0.000 description 2
- MGSKVZWGBWPBTF-UHFFFAOYSA-N aebsf Chemical compound NCCC1=CC=C(S(F)(=O)=O)C=C1 MGSKVZWGBWPBTF-UHFFFAOYSA-N 0.000 description 2
- 238000000137 annealing Methods 0.000 description 2
- 229960004405 aprotinin Drugs 0.000 description 2
- 230000008859 change Effects 0.000 description 2
- 230000008045 co-localization Effects 0.000 description 2
- 208000029742 colonic neoplasm Diseases 0.000 description 2
- 230000000295 complement effect Effects 0.000 description 2
- 239000002299 complementary DNA Substances 0.000 description 2
- 230000009089 cytolysis Effects 0.000 description 2
- NHVNXKFIZYSCEB-XLPZGREQSA-N dTTP Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)C1 NHVNXKFIZYSCEB-XLPZGREQSA-N 0.000 description 2
- 230000007423 decrease Effects 0.000 description 2
- 230000007547 defect Effects 0.000 description 2
- 238000001514 detection method Methods 0.000 description 2
- 238000010586 diagram Methods 0.000 description 2
- 201000010099 disease Diseases 0.000 description 2
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 2
- 239000003480 eluent Substances 0.000 description 2
- 210000001671 embryonic stem cell Anatomy 0.000 description 2
- 238000009472 formulation Methods 0.000 description 2
- 230000030279 gene silencing Effects 0.000 description 2
- 239000001963 growth medium Substances 0.000 description 2
- 108010051779 histone H3 trimethyl Lys4 Proteins 0.000 description 2
- 238000009396 hybridization Methods 0.000 description 2
- 238000003384 imaging method Methods 0.000 description 2
- ZPNFWUPYTFPOJU-LPYSRVMUSA-N iniprol Chemical compound C([C@H]1C(=O)NCC(=O)NCC(=O)N[C@H]2CSSC[C@H]3C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@H](C(N[C@H](C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC=4C=CC(O)=CC=4)C(=O)N[C@@H](CC=4C=CC=CC=4)C(=O)N[C@@H](CC=4C=CC(O)=CC=4)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CSSC[C@H](NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC=4C=CC=CC=4)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@H](CCCCN)NC(=O)[C@H](C)NC(=O)[C@H](CCCNC(N)=N)NC2=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CSSC[C@H](NC(=O)[C@H](CC=2C=CC=CC=2)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H]2N(CCC2)C(=O)[C@@H](N)CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N2[C@@H](CCC2)C(=O)N2[C@@H](CCC2)C(=O)N[C@@H](CC=2C=CC(O)=CC=2)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N2[C@@H](CCC2)C(=O)N3)C(=O)NCC(=O)NCC(=O)N[C@@H](C)C(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(=O)N[C@@H](CC=2C=CC=CC=2)C(=O)N[C@H](C(=O)N1)C(C)C)[C@@H](C)O)[C@@H](C)CC)=O)[C@@H](C)CC)C1=CC=C(O)C=C1 ZPNFWUPYTFPOJU-LPYSRVMUSA-N 0.000 description 2
- 238000007169 ligase reaction Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 239000002773 nucleotide Substances 0.000 description 2
- 125000003729 nucleotide group Chemical group 0.000 description 2
- 108010091212 pepstatin Proteins 0.000 description 2
- FAXGPCHRFPCXOO-LXTPJMTPSA-N pepstatin A Chemical compound OC(=O)C[C@H](O)[C@H](CC(C)C)NC(=O)[C@H](C)NC(=O)C[C@H](O)[C@H](CC(C)C)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C(C)C)NC(=O)CC(C)C FAXGPCHRFPCXOO-LXTPJMTPSA-N 0.000 description 2
- 230000010399 physical interaction Effects 0.000 description 2
- 230000004962 physiological condition Effects 0.000 description 2
- 238000001556 precipitation Methods 0.000 description 2
- 239000011535 reaction buffer Substances 0.000 description 2
- 230000009711 regulatory function Effects 0.000 description 2
- 230000035945 sensitivity Effects 0.000 description 2
- 238000004088 simulation Methods 0.000 description 2
- 239000001632 sodium acetate Substances 0.000 description 2
- 235000017281 sodium acetate Nutrition 0.000 description 2
- UCSJYZPVAKXKNQ-HZYVHMACSA-N streptomycin Chemical compound CN[C@H]1[C@H](O)[C@@H](O)[C@H](CO)O[C@H]1O[C@@H]1[C@](C=O)(O)[C@H](C)O[C@H]1O[C@@H]1[C@@H](NC(N)=N)[C@H](O)[C@@H](NC(N)=N)[C@H](O)[C@H]1O UCSJYZPVAKXKNQ-HZYVHMACSA-N 0.000 description 2
- 238000000547 structure data Methods 0.000 description 2
- 230000002194 synthesizing effect Effects 0.000 description 2
- 108010057210 telomerase RNA Proteins 0.000 description 2
- UHDGCWIWMRVCDJ-UHFFFAOYSA-N 1-beta-D-Xylofuranosyl-NH-Cytosine Natural products O=C1N=C(N)C=CN1C1C(O)C(O)C(CO)O1 UHDGCWIWMRVCDJ-UHFFFAOYSA-N 0.000 description 1
- 108020004463 18S ribosomal RNA Proteins 0.000 description 1
- 101710159080 Aconitate hydratase A Proteins 0.000 description 1
- 101710159078 Aconitate hydratase B Proteins 0.000 description 1
- 229920001817 Agar Polymers 0.000 description 1
- VGGGPCQERPFHOB-MCIONIFRSA-N Bestatin Chemical compound CC(C)C[C@H](C(O)=O)NC(=O)[C@@H](O)[C@H](N)CC1=CC=CC=C1 VGGGPCQERPFHOB-MCIONIFRSA-N 0.000 description 1
- VGGGPCQERPFHOB-UHFFFAOYSA-N Bestatin Natural products CC(C)CC(C(O)=O)NC(=O)C(O)C(N)CC1=CC=CC=C1 VGGGPCQERPFHOB-UHFFFAOYSA-N 0.000 description 1
- XGDFITZJGKUSDK-UDYGKFQRSA-N Bestatin (hydrochloride) Chemical compound Cl.CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](O)[C@H](N)CC1=CC=CC=C1 XGDFITZJGKUSDK-UDYGKFQRSA-N 0.000 description 1
- 108010014064 CCCTC-Binding Factor Proteins 0.000 description 1
- 102000016897 CCCTC-Binding Factor Human genes 0.000 description 1
- 108091033409 CRISPR Proteins 0.000 description 1
- 238000010354 CRISPR gene editing Methods 0.000 description 1
- 208000005623 Carcinogenesis Diseases 0.000 description 1
- UHDGCWIWMRVCDJ-PSQAKQOGSA-N Cytidine Natural products O=C1N=C(N)C=CN1[C@@H]1[C@@H](O)[C@@H](O)[C@H](CO)O1 UHDGCWIWMRVCDJ-PSQAKQOGSA-N 0.000 description 1
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 1
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 1
- 241000206602 Eukaryota Species 0.000 description 1
- 108700024394 Exon Proteins 0.000 description 1
- 108091007695 FTX Proteins 0.000 description 1
- 108700039691 Genetic Promoter Regions Proteins 0.000 description 1
- 102100031181 Glyceraldehyde-3-phosphate dehydrogenase Human genes 0.000 description 1
- 102100021196 Glypican-5 Human genes 0.000 description 1
- 101000904691 Homo sapiens Bcl-2-like protein 2 Proteins 0.000 description 1
- 101001040711 Homo sapiens Glypican-5 Proteins 0.000 description 1
- 101000616738 Homo sapiens NAD-dependent protein deacetylase sirtuin-6 Proteins 0.000 description 1
- 101000651197 Homo sapiens Sphingosine kinase 2 Proteins 0.000 description 1
- 101000649014 Homo sapiens Triple functional domain protein Proteins 0.000 description 1
- 108091092195 Intron Proteins 0.000 description 1
- GDBQQVLCIARPGH-UHFFFAOYSA-N Leupeptin Natural products CC(C)CC(NC(C)=O)C(=O)NC(CC(C)C)C(=O)NC(C=O)CCCN=C(N)N GDBQQVLCIARPGH-UHFFFAOYSA-N 0.000 description 1
- 108700011259 MicroRNAs Proteins 0.000 description 1
- 241000699666 Mus <mouse, genus> Species 0.000 description 1
- 241000699670 Mus sp. Species 0.000 description 1
- 102100021840 NAD-dependent protein deacetylase sirtuin-6 Human genes 0.000 description 1
- 238000005481 NMR spectroscopy Methods 0.000 description 1
- 108091029480 NONCODE Proteins 0.000 description 1
- 101710163270 Nuclease Proteins 0.000 description 1
- 108091028043 Nucleic acid sequence Proteins 0.000 description 1
- 108700020796 Oncogene Proteins 0.000 description 1
- 102000043276 Oncogene Human genes 0.000 description 1
- 108091007412 Piwi-interacting RNA Proteins 0.000 description 1
- 206010060862 Prostate cancer Diseases 0.000 description 1
- 208000000236 Prostatic Neoplasms Diseases 0.000 description 1
- 238000002123 RNA extraction Methods 0.000 description 1
- 239000003391 RNA probe Substances 0.000 description 1
- 102000044126 RNA-Binding Proteins Human genes 0.000 description 1
- 230000004570 RNA-binding Effects 0.000 description 1
- 101710105008 RNA-binding protein Proteins 0.000 description 1
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 1
- 108090000621 Ribonuclease P Proteins 0.000 description 1
- 102000004167 Ribonuclease P Human genes 0.000 description 1
- 108020001027 Ribosomal DNA Proteins 0.000 description 1
- 108020004682 Single-Stranded DNA Proteins 0.000 description 1
- 108020004459 Small interfering RNA Proteins 0.000 description 1
- 102100027662 Sphingosine kinase 2 Human genes 0.000 description 1
- 238000012233 TRIzol extraction Methods 0.000 description 1
- 102100028101 Triple functional domain protein Human genes 0.000 description 1
- 108091023045 Untranslated Region Proteins 0.000 description 1
- 210000001766 X chromosome Anatomy 0.000 description 1
- 230000002159 abnormal effect Effects 0.000 description 1
- 238000006640 acetylation reaction Methods 0.000 description 1
- 239000008186 active pharmaceutical agent Substances 0.000 description 1
- 239000008272 agar Substances 0.000 description 1
- 239000011543 agarose gel Substances 0.000 description 1
- 210000003719 b-lymphocyte Anatomy 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- 238000003766 bioinformatics method Methods 0.000 description 1
- 230000008827 biological function Effects 0.000 description 1
- 230000000903 blocking effect Effects 0.000 description 1
- 239000007853 buffer solution Substances 0.000 description 1
- 230000036952 cancer formation Effects 0.000 description 1
- 230000000711 cancerogenic effect Effects 0.000 description 1
- 150000001720 carbohydrates Chemical class 0.000 description 1
- 231100000504 carcinogenesis Toxicity 0.000 description 1
- 230000004709 cell invasion Effects 0.000 description 1
- 210000003855 cell nucleus Anatomy 0.000 description 1
- 239000006285 cell suspension Substances 0.000 description 1
- 238000007385 chemical modification Methods 0.000 description 1
- 239000003153 chemical reaction reagent Substances 0.000 description 1
- 238000007621 cluster analysis Methods 0.000 description 1
- 201000010897 colon adenocarcinoma Diseases 0.000 description 1
- 230000005757 colony formation Effects 0.000 description 1
- 150000001875 compounds Chemical class 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 210000004748 cultured cell Anatomy 0.000 description 1
- 238000005520 cutting process Methods 0.000 description 1
- UHDGCWIWMRVCDJ-ZAKLUEHWSA-N cytidine Chemical compound O=C1N=C(N)C=CN1[C@H]1[C@H](O)[C@@H](O)[C@H](CO)O1 UHDGCWIWMRVCDJ-ZAKLUEHWSA-N 0.000 description 1
- SUYVUBYJARFZHO-RRKCRQDMSA-N dATP Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-RRKCRQDMSA-N 0.000 description 1
- SUYVUBYJARFZHO-UHFFFAOYSA-N dATP Natural products C1=NC=2C(N)=NC=NC=2N1C1CC(O)C(COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-UHFFFAOYSA-N 0.000 description 1
- 238000013524 data verification Methods 0.000 description 1
- 238000013079 data visualisation Methods 0.000 description 1
- 230000003247 decreasing effect Effects 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- FRTGEIHSCHXMTI-UHFFFAOYSA-N dimethyl octanediimidate Chemical compound COC(=N)CCCCCCC(=N)OC FRTGEIHSCHXMTI-UHFFFAOYSA-N 0.000 description 1
- VAYGXNSJCAHWJZ-UHFFFAOYSA-N dimethyl sulfate Chemical compound COS(=O)(=O)OC VAYGXNSJCAHWJZ-UHFFFAOYSA-N 0.000 description 1
- 230000019975 dosage compensation by inactivation of X chromosome Effects 0.000 description 1
- 238000002003 electron diffraction Methods 0.000 description 1
- 238000001493 electron microscopy Methods 0.000 description 1
- 238000001962 electrophoresis Methods 0.000 description 1
- 238000010828 elution Methods 0.000 description 1
- 230000006862 enzymatic digestion Effects 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6806—Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/1034—Isolating an individual clone by screening libraries
- C12N15/1065—Preparation or screening of tagged libraries, e.g. tagged microorganisms by STM-mutagenesis, tagged polynucleotides, gene tags
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
Definitions
- The invention relates to the field of biotechnology, and in particular to a method for capturing the in situ high-level structure and interactions of RNA.
- RNA is mainly used to encode and guide protein synthesis.
- This type of protein-encoding RNA is collectively referred to as messenger RNA (mRNA).
- ncRNA noncoding RNA
- The non-coding RNAs found to have regulatory effects include tRNA, rRNA, siRNA, miRNA, piRNA, snoRNA, circRNA, and lncRNA. Their abnormal expression and mutation are associated with major diseases, such as the occurrence and development of cancer and reproductive defects.
- As a key regulator of genetic information, RNA often needs to form complex high-level structures through intramolecular base pairing, and then interact with other RNA molecules to perform important biological regulatory functions.
- With sequencing technology, detailed RNA sequence information can already be obtained, but obtaining RNA structure, especially high-level structural information, remains a worldwide problem.
- Although some physical methods, such as nuclear magnetic resonance, cryo-electron microscopy, and crystallography, can resolve the high-resolution structure of RNA and its intramolecular and intermolecular interactions, the throughput of these technologies is too low.
- Few high-resolution structures of human RNA are included in the international Protein Data Bank (PDB). Therefore, how to systematically and accurately analyze the intramolecular and intermolecular interactions of RNA remains a major challenge.
- RNA in single-stranded regions is easily modified by the compound DMS (dimethyl sulfate) or NAI-N3, and the bases located in single-stranded regions can be deduced by analyzing where the reverse transcriptase stops.
- For the double-stranded paired regions in the RNA structure, several analysis methods are currently available, such as PARIS, LIGR-seq, and SPLASH.
- The basic principle of these three methods is as follows: psoralen or AMT is added to the culture medium, passes through the cell membrane into the cell, and quickly binds to the double-stranded paired regions of RNA. After irradiation with 365 nm ultraviolet (UV) light, the paired RNA in the cell is covalently cross-linked by psoralen or AMT; the enriched RNA is then fragmented and proximally ligated in solution.
- After irradiation with 254 nm ultraviolet (UV) light, the covalent bond between psoralen or AMT and double-stranded RNA can be opened, and then library construction and sequencing can be performed.
- Although the above methods can study the single-stranded and double-stranded regions of RNA with high throughput, they also have some disadvantages. First, they cannot capture non-Watson-Crick pairing and long-distance RNA loop-loop interactions. Second, the ligation reactions are all carried out in solution, where non-specific ligation occurs; they therefore cannot reflect the true structure of RNA in the cell and produce a large number of false-positive intermolecular ligations.
- RNA proximal ligation (RPL) technology can theoretically overcome the above technical defects, but because it lacks cross-linking and chimeric RNA enrichment, RPL can only identify intramolecular interactions and cannot identify intermolecular RNA-RNA interactions.
- lncRNA long non-coding RNA
- The number of human lncRNAs included in NONCODE has exceeded 160,000, which is 8 times the number of protein-coding genes, but the function, target and mechanism of most non-coding RNAs are still unclear.
- Commonly used methods for identifying lncRNA targets include ChIRP, CHART and RAP.
- The ChIRP, CHART, and RAP methods focus only on DNA targets and ignore functionally important RNA target sites. They can identify all potential DNA targets of only one lncRNA in the cell at a time (one to all), so their throughput is too low. Therefore, how to systematically identify all the binding sites of all lncRNAs in the cell on the genome remains a difficult problem.
- RNA in situ conformation sequencing, RIC-seq for short.
- The basic principle is to cross-link cells with formaldehyde to immobilize protein-mediated RNA-RNA close-range interactions, to perforate the membrane while keeping the cells intact, and then to use micrococcal nuclease digestion to remove free RNA that is not protected by proteins.
- The RNA fragments are then labeled with pCp-biotin at the 3' end and proximally ligated in situ.
- The chimeric RNA containing C-biotin is purified and a strand-specific library is constructed.
- RIC-seq performs in situ RNA-RNA ligation while maintaining cell integrity. It can capture all direct RNA-RNA close contacts at one time, and can detect all the RNA binding targets of lncRNAs in situ in vivo (all to all). Most importantly, the high-level structure of RNA can be reconstructed from the proximal spatial distance information of RNA.
- The present invention claims a method for in situ capture of RNA high-level structure and/or identification of in situ RNA-RNA interactions (the RIC-seq method).
- The method for in situ capture of RNA high-level structure and/or identification of in situ RNA-RNA interactions (RIC-seq method) claimed in the present invention may include the following steps:
- Process cell or tissue samples to fix protein-mediated RNA-RNA close-range interactions.
- the volume of the tissue sample may be 1 cubic millimeter; the close distance may be within 50 angstroms.
- Label the RNA protected by the protein with "pCp-marker 1" and perform proximal ligation in situ.
- the proximal end may be within 50 angstroms.
- The "pCp-marker 1" is a cytosine nucleotide with phosphate groups at both ends and labeled with the marker 1.
- The "Cp-marker 1" appearing below is a cytosine nucleotide whose 3' end is a phosphate group and which is labeled with the marker 1;
- The "C-marker 1" is a cytosine nucleotide labeled with the marker 1.
- the "pCp-marker 1" is specifically pCp-biotin.
- the "Cp-marker 1" is specifically Cp-biotin; the “C-marker 1” is specifically C-biotin.
- the pCp-biotin is a cytosine nucleotide with phosphate groups at both ends and labeled with biotin.
- Cp-biotin appearing below is a cytosine nucleotide with a phosphate group at the 3' end and labeled with biotin;
- C-biotin is a cytosine nucleotide labeled with biotin.
- In step (1) of the method, a step of washing the cell or tissue sample may also be included.
- The washing can specifically be carried out as follows: add pre-cooled PBS solution (pH 7.4) to the cell or tissue sample; after washing, centrifuge at 4°C and 2500 rpm for 10 minutes and remove the PBS solution to obtain a washed cell sample.
- In step (1) of the method, the processing of the cell or tissue sample is formaldehyde cross-linking of the cell or tissue sample.
- step (1) can be performed according to a method including the following steps:
- (a1) Put the cell or tissue sample in a formaldehyde solution and place it at room temperature for 10 minutes.
- the formaldehyde solution is a 1% formaldehyde solution (the solvent is a PBS solution).
- After step (a1), the following step (a2) may be included:
- (a2) Add glycine solution to the cell or tissue sample processed in step (a1) to terminate the reaction, and mix for 10 minutes.
- the glycine solution is a glycine solution with a concentration of 0.125 mol/L (the solvent is DEPC water).
- the permeabilization solution used when perforating the membrane is a Permeabilization solution.
- step (2) can be performed according to a method including the following steps:
- (b1) Place the cell or tissue sample processed in step (1) in the Permeabilization solution, keep it at 0°C-4°C (such as in an ice bath) for 15 minutes, and mix it every 2 minutes.
- The solvent of the Permeabilization solution is 10 mM Tris-HCl buffer with pH 7.5, and the solutes and concentrations are as follows: 10 mM NaCl, 0.5% (volume percentage) NP-40, 0.3% (volume percentage) Triton X-100, 0.1% (volume percentage) Tween 20, 1× protease inhibitors and 2 U/ml SUPERase•In™ RNase Inhibitor.
- The 1× protease inhibitors are specifically a Sigma product, product number P8340-5ML (the specific components are AEBSF, Aprotinin, Bestatin, E-64, Leupeptin, and Pepstatin A).
- The 1× protease inhibitors can also be other products with the same composition.
- The SUPERase•In™ RNase Inhibitor is a Thermo Fisher product, product number AM2694.
- The SUPERase•In™ RNase Inhibitor can also be other products with the same composition.
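- The recipe above can be turned into pipetting volumes with the dilution relation C1·V1 = C2·V2. The following is a minimal sketch, not part of the patent: the final concentrations are taken from the text, while every stock concentration and the use of DEPC water as the diluent are assumptions chosen only for illustration.

```python
# Final concentrations are from the text; all stock concentrations are
# hypothetical values chosen for illustration only.
RECIPE = [
    # (component, stock concentration, final concentration) in matching units
    ("Tris-HCl pH 7.5 (mM)", 1000.0, 10.0),
    ("NaCl (mM)", 5000.0, 10.0),
    ("NP-40 (% v/v)", 10.0, 0.5),
    ("Triton X-100 (% v/v)", 10.0, 0.3),
    ("Tween 20 (% v/v)", 10.0, 0.1),
    ("protease inhibitors (x)", 100.0, 1.0),
    ("SUPERase-In (U/ml)", 20000.0, 2.0),
]


def stock_volumes_ul(total_ml, recipe=RECIPE):
    """Return {component: microlitres of stock} using C1*V1 = C2*V2."""
    total_ul = total_ml * 1000.0
    volumes = {name: total_ul * final / stock for name, stock, final in recipe}
    # Assumption: the remaining volume is made up with DEPC water.
    volumes["DEPC water"] = total_ul - sum(volumes.values())
    return volumes


if __name__ == "__main__":
    for name, ul in stock_volumes_ul(10.0).items():
        print(f"{name:26s} {ul:8.1f} ul")
```

Replacing the stock column with the concentrations actually on the bench is the only change needed to reuse the sketch for the other buffers in this section.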
- After step (b1), the following step (b2) may be included:
- (b2) Wash the cell or tissue sample treated in step (b1) with 1× PNK solution.
- The solvent of the 1× PNK solution is 50 mM Tris-Cl buffer with pH 7.4, and the solutes and concentrations are: 10 mM MgCl2, 0.1 mg/ml BSA, 0.2% (volume percentage) NP-40.
- the washing may be multiple washings, such as 3 times.
- Each washing may include the following steps: 4°C, rotating (such as 20 rpm) and mixing for 5 minutes, and centrifuging at 4°C and 3500 rpm for 5 minutes to remove the washing solution.
- In step (3) of the method, micrococcal nuclease is used to achieve the "degradation of free RNA that is not protected by protein".
- step (3) can be performed according to a method including the following steps:
- (c1) Put the sample processed in step (2) in a 1× micrococcal nuclease solution for reaction.
- The concentration of the micrococcal nuclease in the 1× micrococcal nuclease solution may be 0.03 U/μl.
- the conditions of the reaction may be: reaction at 37°C for 10 minutes, shaking at 1000 rpm for 15 seconds every 2 minutes.
- After step (c1), the following step (c2) may be included:
- (c2) Wash the sample processed in step (c1) with 1× PNK+EGTA solution and 1× PNK solution successively.
- The solvent of the 1× PNK+EGTA solution is 50 mM Tris-Cl buffer with pH 7.4, and the solutes and concentrations are: 20 mM EGTA, 0.5% (volume percentage) NP-40.
- The solvent of the 1× PNK solution is 50 mM Tris-Cl buffer with pH 7.4, and the solutes and concentrations are as follows: 10 mM MgCl2, 0.1 mg/ml BSA, 0.2% (volume percentage) NP-40.
- The washing may be multiple washings, such as washing with the 1× PNK+EGTA solution twice, and washing with the 1× PNK solution twice.
- Each washing may include the following steps: 4°C, rotating (such as 20 rpm) and mixing for 5 minutes, and centrifuging at 4°C and 3500 rpm for 5 minutes to remove the washing solution.
- the step (4) can be performed according to a method including the following steps:
- (d1) The 3' end of the protein-protected RNA can be hydroxylated by alkaline phosphatase treatment.
- The content of alkaline phosphatase in the reaction system may be 0.1 U/μl.
- the reaction conditions can be: reaction at 37°C for 10 minutes, shaking at 1000 rpm for 15 seconds every 3 minutes.
- After the reaction in step (d1) is complete, a washing step may also be included; the washing specifically includes washing the cell sample successively with the 1× PNK+EGTA solution (the formula is the same as before), the High-salt solution and the 1× PNK solution.
- The solvent of the High-salt solution is 5× PBS without Mg2+ and Ca2+ (that is, 5× PBS buffer, pH 7.4: NaCl 685 mmol/L, KCl 13.5 mmol/L, Na2HPO4 50 mmol/L, KH2PO4 10 mmol/L), and the solute and concentration are 0.5% (volume percentage) NP-40.
- The solvent of the 1× PNK solution is 50 mM Tris-Cl buffer with pH 7.4, and the solutes and concentrations are as follows: 10 mM MgCl2, 0.1 mg/ml BSA, 0.05% (volume percentage) NP-40.
- The washing may be multiple washings, such as washing with the 1× PNK+EGTA solution twice, washing with the High-salt solution twice, and washing with the 1× PNK solution twice.
- Each washing may include the following steps: 4°C, rotating (such as 20 rpm) and mixing for 5 minutes, and centrifuging at 4°C and 3500 rpm for 5 minutes to remove the washing solution.
- (d2) pCp-biotin can be added to the sample processed in step (d1) to perform a ligation reaction, so that the 3' end of the RNA is labeled with Cp-biotin.
- the enzyme used in the ligation reaction may be T4 RNA ligase.
- The final concentration of the pCp-biotin may be 40 μM; the final concentration of the T4 RNA ligase may be 1 U/μl.
- the reaction conditions can be: reaction at 16°C for 12-16 hours, shaking at 1000 rpm for 15 seconds every 3 minutes.
- A washing step may be included; the washing specifically includes washing the cell sample with the 1× PNK solution (for the formula, see the section preceding Example 1 in the specific embodiments); the washing can be multiple washings, such as 3 times.
- Each washing may include the following steps: 4°C, rotating (such as 20 rpm) and mixing for 5 minutes, and centrifuging at 4°C and 3500 rpm for 5 minutes to remove the washing solution.
- (d3) The sample processed in step (d2) can be treated with alkaline phosphatase to convert the phosphate group in the Cp-biotin at the 3' end of the RNA to a hydroxyl group;
- The content of alkaline phosphatase in the reaction system may be 0.1 U/μl.
- the reaction conditions can be: reaction at 37°C for 10 minutes, shaking at 1000 rpm for 15 seconds every 3 minutes.
- A washing step may also be included; the washing may specifically be washing the cell sample successively with the 1× PNK+EGTA solution (the formula is the same as before), the High-salt solution (the formula is the same as before) and the 1× PNK solution (the formula is the same as in step (d1)).
- The washing may be multiple washings, such as washing with the 1× PNK+EGTA solution twice, washing with the High-salt solution twice, and washing with the 1× PNK solution twice.
- Each washing may include the following steps: 4°C, rotating (such as 20 rpm) and mixing for 5 minutes, and centrifuging at 4°C and 3500 rpm for 5 minutes to remove the washing solution.
- (d4) The RNA can be phosphorylated by subjecting the sample processed in step (d3) to T4 PNK enzyme treatment.
- The content of T4 PNK enzyme in the reaction system may be 1 U/μl.
- the reaction conditions can be: reaction at 37°C for 45 minutes, shaking at 1000 rpm for 15 seconds every 3 minutes.
- A washing step may be included; the washing may specifically use the 1× PNK+EGTA solution (the formula is the same as before) and the 1× PNK solution (the formula is the same as in step (d1)).
- The washing may be multiple washings, such as washing with the 1× PNK+EGTA solution twice, and washing with the 1× PNK solution twice.
- Each washing may include the following steps: 4°C, rotating (such as 20 rpm) and mixing for 5 minutes, and centrifuging at 4°C and 3500 rpm for 5 minutes to remove the washing solution.
- Perform proximal ligation in situ.
- the proximal end may be within 50 angstroms.
- By adding T4 RNA ligase to the sample processed in step (d4), the proximal ligation is realized in situ.
- The content of T4 RNA ligase in the reaction system may be 0.5 U/μl.
- the reaction conditions can be: reaction at 16°C for 12-16 hours, shaking at 1000 rpm for 15 seconds every 3 minutes.
- After the reaction in step (d2) is complete, a washing step may also be included; the washing may specifically be washing the cell sample with the 1× PNK solution (the formula is the same as before); the washing may be multiple washings, such as 3 times.
- Each washing may include the following steps: 4°C, rotating (such as 20 rpm) and mixing for 5 minutes, and centrifuging at 4°C and 3500 rpm for 5 minutes to remove the washing solution.
- the step (5) can be performed according to a method including the following steps:
- The content of proteinase K in the reaction system may be 0.12 U/μl.
- the reaction conditions can be: 37°C for 60 minutes, 56°C for 15 minutes.
- the total RNA can be extracted using TRIzol LS and chloroform.
- 500 μl isopropanol and 15 μg GlycoBlue can be added when precipitating RNA at -20°C overnight.
- After the total RNA is extracted, steps of removing genomic DNA (for example, by DNase I treatment) and removing ribosomal RNA (for example, using probes complementary to ribosomal RNA) may also be included.
- The steps of removing ribosomal RNA using DNA probes complementary to ribosomal RNA can be as follows: add ribosomal RNA probes in a mass equal to that of the RNA, react at 95°C for 2 minutes, cool to 22°C at -0.1°C/s, and react at 22°C for 5 minutes (the sample can be placed on ice immediately after the reaction is over).
- Degrade the RNA in the DNA-RNA hybrid strands (for example, by adding RNase H), and degrade the DNA probes (for example, by adding Turbo DNase). The RNA is then purified (for example, with the Zymo RNA clean kit).
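- The probe annealing program described above (95°C for 2 minutes, cooling at 0.1°C/s to 22°C, then 22°C for 5 minutes) implies a total run time that is easy to underestimate. The sketch below works through that arithmetic; the function names are hypothetical and only the temperatures, hold times and ramp rate come from the text.

```python
def anneal_program_seconds(start_c=95.0, hold_start_s=120.0,
                           ramp_c_per_s=0.1, end_c=22.0, hold_end_s=300.0):
    """Total run time of: hold at start_c, linear cool-down, hold at end_c."""
    ramp_s = (start_c - end_c) / ramp_c_per_s
    return hold_start_s + ramp_s + hold_end_s


def temperature_at(t_s, start_c=95.0, hold_start_s=120.0,
                   ramp_c_per_s=0.1, end_c=22.0):
    """Block temperature in degrees C, t_s seconds after the program starts."""
    if t_s <= hold_start_s:
        return start_c
    return max(end_c, start_c - ramp_c_per_s * (t_s - hold_start_s))


if __name__ == "__main__":
    total = anneal_program_seconds()
    print(f"total program time: {total:.0f} s ({total / 60:.1f} min)")
```

The cool-down alone takes (95 - 22) / 0.1 = 730 seconds, so the whole program runs for roughly 19 minutes.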
- the fragmentation treatment of RNA may specifically adopt an alkaline lysis method.
- 1× first strand buffer (formulation: 50 mM Tris-HCl, pH 8.3; 75 mM KCl; 3 mM MgCl2) is used to fragment RNA in a PCR machine at 94°C for 5 minutes.
- RNA labeled with "C-marker 1", such as C-biotin
- The marker 2 can specifically bind to the marker 1.
- the marker 1 is specifically biotin, and the marker 2 is specifically streptavidin.
- the magnetic beads on which the marker 2 is immobilized are streptavidin magnetic beads.
- a step of blocking the streptavidin magnetic beads is also included.
- The specific steps can be as follows: take 20 μl C1 magnetic beads, place the centrifuge tube on the magnetic stand, wait for the solution to clear, and aspirate the supernatant; add 20 μl solution A, resuspend the magnetic beads, leave at room temperature for 2 minutes, put the centrifuge tube on the magnetic stand, wait for the solution to clear, and aspirate the supernatant; repeat this step once; add 20 μl solution B, resuspend the magnetic beads, put the centrifuge tube on the magnetic stand, wait for the solution to clear, and aspirate the supernatant; add 32 μl yeast RNA (50 μg), 68 μl DEPC water and 100 μl 2× TWB solution, resuspend the magnetic beads, put the centrifuge tube on the rotary mixer, and rotate and mix for 1 hour; then put the centrifuge tube
- a step of eluting RNA from the magnetic beads is also included.
- This step mainly includes: synthesizing first-strand cDNA; synthesizing second-strand DNA; repairing the ends of the dsDNA; adding 'A' to the repaired DNA; ligating the adapter; performing PCR amplification with the adapter-ligated DNA as the template and recovering PCR products of a specific fragment size from an agarose gel to obtain the strand-specific library; and high-throughput sequencing.
- For the method of constructing a strand-specific library according to the conventional process, refer to "Levin, J.Z., Yassour, M., Adiconis, X., Nusbaum, C., Thompson, D.A., Friedman, N., Gnirke, A., and Regev, A. (2010). Comprehensive comparative analysis of strand-specific RNA sequencing methods. Nature Methods 7, 709-715."
- a mixture of 25 mM dNTPs and dUTP is used, wherein the molar ratio of dTTP to dUTP is 4:1.
- DNA purification steps can be included.
- the purification method may be magnetic bead purification.
- the specific method for purification of the magnetic beads can be carried out according to the following steps: Mix the AMPure XP magnetic beads (XP magnetic beads for short) in advance at room temperature for 30 minutes. Then add XP magnetic beads to the eluent and mix gently.
- The DNA purification step (such as magnetic bead purification) after the "connection of the adapter" can be performed twice.
- the upstream and downstream primers used in the PCR amplification in this step are a primer pair composed of two single-stranded DNAs shown in SEQ ID No. 1 and SEQ ID No. 2.
- The PCR amplification reaction system in this step is specifically: supernatant (obtained in the magnetic bead DNA purification step after "connecting the adapter") 15.7 μl, 10× Pfx buffer (Invitrogen) 2.5 μl, 10 μM upstream and downstream primers (SEQ ID No. 1 and SEQ ID No. 2) 1 μl each, 50 mM MgSO4 solution 1 μl, 25 mM dNTP 0.4 μl, Pfx enzyme (Invitrogen) 0.4 μl, USER enzyme (NEB) 3 μl.
- The PCR amplification reaction program is specifically: 37°C for 15 minutes; 94°C for 2 minutes; 12 cycles of 94°C for 15 seconds, 62°C for 30 seconds, and 72°C for 30 seconds; 72°C for 10 minutes.
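- As a quick consistency check on the reaction system listed above, the component volumes can be summed and the final concentrations derived from the stated stocks. This is an illustrative sketch only; the dictionary keys and function names are hypothetical, while the volumes and stock concentrations are those given in the text.

```python
# Volumes (ul) as listed in the text for one PCR reaction.
MIX_UL = {
    "supernatant": 15.7,
    "10x Pfx buffer": 2.5,
    "upstream primer (10 uM)": 1.0,
    "downstream primer (10 uM)": 1.0,
    "MgSO4 (50 mM)": 1.0,
    "dNTP (25 mM)": 0.4,
    "Pfx enzyme": 0.4,
    "USER enzyme": 3.0,
}


def total_volume_ul(mix=MIX_UL):
    """Sum of all component volumes, rounded to suppress float noise."""
    return round(sum(mix.values()), 6)


def final_conc(stock, volume_ul, mix=MIX_UL):
    """Final concentration (same unit as stock) after dilution into the mix."""
    return stock * volume_ul / total_volume_ul(mix)


if __name__ == "__main__":
    print("total volume (ul):", total_volume_ul())
    print("buffer (x):", final_conc(10, MIX_UL["10x Pfx buffer"]))
    print("each primer (uM):", final_conc(10, 1.0))
    print("MgSO4 (mM):", final_conc(50, 1.0))
    print("dNTP (mM):", final_conc(25, 0.4))
```

The volumes add up to 25 μl, which gives 1× buffer, 0.4 μM of each primer, 2 mM MgSO4 and 0.4 mM dNTP in the final reaction.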
- the high-throughput sequencing can use an Illumina HiSeq X Ten sequencer to sequence the library obtained in step (5), and can perform PE150 paired-end sequencing.
- The maximum starting amount of the cells is 1×10^7 cells.
- the cell may be an animal cell (such as a human-derived cell), and the tissue may be an animal tissue.
- the cell is specifically a HeLa cell.
- the present invention claims a method for library construction.
- the library construction method claimed in the present invention includes steps (1) to (5) of the method described in the first aspect of the preceding paragraph.
- the present invention claims the application of the library constructed by the method described in the second aspect in capturing the high-level structure of RNA in situ and/or identifying in situ RNA-RNA interactions.
- The present invention also claims any of the following applications:
- (A2) pCp-biotin is used in the identification of RNA-RNA close-range interactions; wherein, the close-range can be within 50 angstroms.
- the present invention also claims any of the following:
- (B1) Detergent is the Permeabilization solution described above.
- (B2) The use of the detergent described in (B1) in perforating the cell membrane.
- the in-situ connection is an in-situ connection under non-denaturing conditions.
- FIG. 1 shows the RIC-seq process and data verification.
- A is a schematic diagram of RIC-seq technology.
- the in-situ treatment part includes formaldehyde cross-linking, cell perforation, RNA digestion, pCp-biotin labeling and proximal connection.
- the in vitro part includes the enrichment of chimeric RNA and the construction of double-end strand specific libraries.
- RBP stands for RNA-binding protein.
- B is the base composition of the junction of chimera RNA.
- C is the comparison of RIC-seq chimeric RNA (light part) with the known structure of miR-3064 (curved line). The blank area between the chimeric RNA segments represents the gap.
- A stem-loop structure composed of RIC-seq reads is shown below, and the solid circle marks where pCp-biotin is inserted.
- D shows that pCp-biotin is enriched in the apical loop region of miRNA precursors.
- E-G show that RIC-seq reproduces the known structures and interactions of U4, U6, RPPH1 and U3 snoRNA.
- H shows the binding sites of U1 on MALAT1 identified by RIC-seq.
- The dark shaded areas are the sites jointly identified by the RIC-seq, PARIS and RAP methods.
- the light shade is the site specifically identified by RIC-seq.
- the dashed box area is shown in I.
- I is the new interaction site between U1 and MALAT1, which is conserved and supported by the chimeric RNA cluster (arrow).
- the U1 motif is represented by a solid box.
- J is the RNA map showing all the RNA-RNA interactions in HeLa cells, the bottom left is the +pCp sample, and the top right is the -pCp sample.
- the interaction between NEAT1 and MALAT1 is magnified and shown on the right.
- K is a super-resolution imaging analysis revealing that the 5' ends of MALAT1 and NEAT1 are co-localized. Box 1 and Box 2 are enlarged and displayed in the middle. Boxes 3 and 4 show the sites of direct interaction. The relative distance between them is shown on the right.
- Figure 2 shows the RIC-seq analysis process and repeatability.
- A is the RIC-seq data analysis process. PCR duplicates, adapter sequences and sequencing reads containing poly-N are first removed, and then the reads are aligned to the hg19 reference genome using STAR software.
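- The pre-alignment filtering described for panel A can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the pipeline used in the patent: duplicates are treated as read pairs with identical sequences, the adapter string and the poly-N cutoff are placeholder values, and the STAR alignment step itself is not shown.

```python
ADAPTER = "AGATCGGAAGAGC"  # assumed adapter prefix, for illustration only
MAX_N_RUN = 2              # assumed cutoff: discard reads with a longer run of N


def trim_adapter(seq, adapter=ADAPTER):
    """Cut the read at the first exact adapter match, if any."""
    idx = seq.find(adapter)
    return seq if idx == -1 else seq[:idx]


def has_poly_n(seq, max_run=MAX_N_RUN):
    """True if the read contains more than max_run consecutive N bases."""
    return "N" * (max_run + 1) in seq


def preprocess(read_pairs):
    """Deduplicate, trim and filter (read1, read2) sequence pairs, keeping order."""
    seen, kept = set(), []
    for pair in read_pairs:
        if pair in seen:  # identical pair: treated here as a PCR duplicate
            continue
        seen.add(pair)
        r1, r2 = (trim_adapter(r) for r in pair)
        if r1 and r2 and not (has_poly_n(r1) or has_poly_n(r2)):
            kept.append((r1, r2))
    return kept
```

The surviving pairs would then be written back to FASTQ and passed to the aligner.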
- B is the high correlation between the two replicates of RIC-seq.
- C shows that the cell hybrid strategy detects an overall false positive rate of 0.6% for RIC-seq (dotted box area).
- D shows that RIC-seq successfully captures the action and modification sites of snoRNA on 28S rRNA. The arrow indicates the known modification site, and the box area represents the D' box.
- E is the distribution of snoRNA interaction sites on the genome detected by RIC-seq.
- F is the site where SNORD22 interacts with SPHK2 and BCL2L2 genes. The underlined area indicates D-box.
- G is a statistical pie chart of intramolecular and intermolecular RNA-RNA interactions.
- H is a violin plot showing that the target genes of MALAT1 and NEAT1 are expressed at higher levels than other genes. Significance was calculated with a two-tailed t test.
- I is the motif of MALAT1 or NEAT1 target enrichment.
- J is a summary of MALAT1 sites in 15 cells.
- K is the summary of NEAT1 locus detected by smFISH in 15 cells and its overlap with MALAT1 locus. The dark bars represent the direct overlap of NEAT1 and MALAT1 sites.
- Figure 3 shows the 3D structure of 28S rRNA accurately captured by RIC-seq.
- A is a physical interaction map drawn based on the 28S rRNA cryo-electron microscope structure. The light gray area indicates that the spatial distance is greater than (remote).
- Interactions including Watson-Crick base pairing, non-Watson-Crick base pairing, and other types of proximal interactions are marked with different colors. "Not available" means that no structure data is available.
- B is a 28S rRNA 3D map drawn based on HeLa cell RIC-seq data. The boxed area shows Watson-Crick base pairing interactions and long-distance non-Watson-Crick base pairing interactions.
- C is the true positive and true negative data set generated from the 28S rRNA cryo-EM structure. Dark colors represent true positives, and light colors represent true negatives.
- D is ROC (receiver operating characteristic curve) analysis of the accuracy of RIC-seq in 28S rRNA 3D structure prediction. RIC-seq technology is represented by a dark line, while PARIS is represented by a light line. The dotted line is a randomly distributed area line. The missing part of the cryo-EM structure is not used to generate the ROC curve.
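- The ROC analysis described for panel D can be reproduced in outline as follows. This is a minimal sketch under assumptions: each candidate contact carries a label (1 for a true positive and 0 for a true negative from the cryo-EM reference set) and a score, assumed here to be the supporting chimeric read count; the function names are hypothetical.

```python
def roc_curve(labels, scores):
    """Return ROC points [(fpr, tpr), ...] for binary labels (1 = true contact)."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative")
    ranked = sorted(zip(scores, labels), key=lambda x: -x[0])
    points, tp, fp, i = [(0.0, 0.0)], 0, 0, 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume all contacts tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            tp += ranked[i][1]
            fp += 1 - ranked[i][1]
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

An area of 0.5 corresponds to the random diagonal, and values closer to 1 indicate that high-scoring contacts are preferentially true contacts.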
- Figure 4 shows the topological domains and folding principles of RNA in vivo.
- B shows the actual contact probability of precursor mRNA and lncRNA as a function of linear distance. The slope of -1 is consistent with the theoretical fractal globule model.
- C shows the actual contact probability of mature mRNA and lncRNA as a function of linear distance. The dashed line indicates a slope of -1.
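- The slope referred to in panels B and C is that of contact probability against linear distance on a log-log scale. The sketch below shows one way such a slope could be estimated by ordinary least squares; the function name and the synthetic data are assumptions for illustration and do not come from the patent.

```python
import math


def loglog_slope(distances, probabilities):
    """Least-squares slope of log10(P) against log10(distance)."""
    xs = [math.log10(d) for d in distances]
    ys = [math.log10(p) for p in probabilities]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    d = [10, 100, 1000, 10000]   # synthetic linear distances
    p = [0.5 / x for x in d]     # synthetic P(s) proportional to 1/s
    print(round(loglog_slope(d, p), 3))
```

For data that decay as 1/s the fitted slope is close to -1, while a steeper decay gives a more negative value.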
- Figure 5 shows the RNA 3D interaction maps drawn in different cell lines.
- A is the interaction matrix of all RNAs in GM12878, IMR-90, H1 hESC, hNPC and HT29 cells. Below is a magnified view of the RNA interactions in the chr4:93.2Mb-94.8Mb region.
- B is a magnified view of the RNA-RNA interactions common to the five cell lines in panel A.
- C-D show cell line-specific and shared RNA-RNA interactions, respectively.
- RNA-seq, ChIP-seq and TAD signals come from H1 hESC ENCODE data.
- E is the Cas9-KRAB system knocking down LncPRESS2.
- Figure 6 shows the characteristics of in situ RNA-RNA interaction.
- A is the percentage of RNA-RNA interactions within and between chromosomes in 6 cell lines.
- B is the percentage of RNA-RNA interactions within and between genes.
- C is the percentage of intrachromosomal chimeric reads in the 6 cell lines as a function of the distance they span. The solid and dashed lines represent interactions within and between genes, respectively.
- D is the compartment distribution of Hi-C, RIC-seq and RNA-seq signals on chromosome 1; the data shown are from GM12878.
- E is the percentage of RNA-RNA interactions between different compartments, within and between genes.
- Figure 7 shows the identification of cell-specific hub-RNA.
- A shows each RNA in HeLa cells ranked by its chimeric read density and its number of interacting genes. The top 5% are considered hub-RNAs. GAPDH serves as a negative control.
- B is the interaction of MALAT1, CCAT1 and PDE3A with the RNA on 23 chromosomes. The arrow indicates the location of the gene.
- C is a meta-analysis of the RIC-seq signal intensity and distribution for hub-RNAs and other RNAs. The RIC-seq signals around the transcription start site and transcription termination site are displayed.
- D indicates that hub-RNAs are more conserved than other RNAs.
- E-H are the ChIP-seq signal distributions of RNA polymerase II, H3K4me3, H3K27ac and H3K27me3 on hub-RNAs and other RNAs. I shows that most hub-RNAs are cell-specific.
- Figure 8 shows that Hub-RNA CCAT1-5L cooperates with MYC promoter and enhancer RNA to positively regulate MYC gene expression.
- A is the distribution of RIC-seq, RNA-seq and H3K27ac signals on 8q24.
- the CCAT1 transcripts obtained by 5’ and 3’ RACE are shown below.
- the Northern blot probe is marked with a black line.
- the amplification of CCAT1, MYC and PVT1 genes is indicated in the figure.
- the chimera reads of CCAT1-5L and MYC are shown in the figure.
- B is a Northern blot analysis of CCAT1-5L expression in different cell lines. Detection with the 5L probe showed that CCAT1-5L was expressed only in HeLa cells.
- C shows smFISH results indicating that CCAT1-5L is located in the nucleus.
- the 5’ end probes are marked with different colors. Scale bar: 5 μm.
- D shows that when CCAT1 was knocked down with a 5L-specific LNA probe, MYC expression was significantly down-regulated. 5L- and Exon2-specific primers were used to detect the expression level of CCAT1.
- E shows smFISH detection of the co-localization of CCAT1-5L, MYC promoter and MYC enhancer RNA.
- CCAT1, MYC and PVT1 are marked with different colors. Scale bar: 5 μm.
- F is the effect of CCAT1-5L knockdown or ectopic expression on cell proliferation rate.
- G shows that CCAT1-5L knockdown or ectopic expression affects clone formation.
- PBS buffer (pH 7.4): the solvent is water; the solutes and concentrations are as follows: NaCl 137 mmol/L, KCl 2.7 mmol/L, Na2HPO4 10 mmol/L, KH2PO4 2 mmol/L.
- 1× PNK solution: the solvent is 50 mM Tris-Cl buffer, pH 7.4; the solutes and concentrations are as follows: 10 mM MgCl2, 0.1 mg/ml BSA, 0.2% (volume percentage) NP-40.
- 1× PNK+EGTA solution: the solvent is 50 mM Tris-Cl buffer, pH 7.4; the solutes and concentrations are as follows: 20 mM EGTA, 0.5% (volume percentage) NP-40.
- High-salt solution: the solvent is 5× PBS (no Mg2+, Ca2+); the solute and concentration are as follows: 0.5% (volume percentage) NP-40.
- The 5× PBS (no Mg2+, Ca2+) is 5× PBS buffer (pH 7.4): NaCl 685 mmol/L, KCl 13.5 mmol/L, Na2HPO4 50 mmol/L, KH2PO4 10 mmol/L.
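The 5× PBS concentrations listed above are each five times the corresponding 1× PBS value. A minimal Python sketch of that scaling follows; the function name `scale_buffer` is hypothetical and only illustrates the arithmetic.

```python
# 1x PBS (pH 7.4) solute concentrations in mmol/L, as listed in the text.
PBS_1X = {"NaCl": 137.0, "KCl": 2.7, "Na2HPO4": 10.0, "KH2PO4": 2.0}


def scale_buffer(recipe, fold):
    """Return a recipe with every solute concentration multiplied by `fold`."""
    # Rounding suppresses floating-point noise such as 13.500000000000002.
    return {solute: round(conc * fold, 6) for solute, conc in recipe.items()}


PBS_5X = scale_buffer(PBS_1X, 5)
print(PBS_5X)
```

The result can be compared directly with the 5× PBS recipe given for the high-salt solution.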
- Permeabilization solution: 10 mM Tris-HCl (pH 7.5), 10 mM NaCl, 0.5% (volume percentage) NP-40, 0.3% (volume percentage) Triton X-100, 0.1% (volume percentage) Tween 20, 1× protease inhibitors (Sigma, item number P8340-5ML; the specific components are AEBSF, Aprotinin, Bestatin hydrochloride, E-64, Leupeptin hemisulfate salt and Pepstatin A) and 2 U/ml SUPERase•In™ RNase Inhibitor (Thermo Fisher, item number AM2694).
- 1× MN reaction solution: the solvent is 50 mM Tris-Cl buffer, pH 8.0; the solute and concentration are as follows: 5 mM CaCl2.
- Proteinase K solution: the solvent is 10 mM Tris-Cl buffer, pH 7.5; the solutes and concentrations are as follows: 10 mM EDTA, 0.5% (volume percentage) SDS.
- Solution A: 0.1 M NaOH, 0.05 M NaCl.
- Solution B: 0.1 M NaCl.
- TWB solution: 10 mM Tris-HCl (pH 7.5), 1 mM EDTA, 2 M NaCl, 0.02% (volume percentage) Tween 20.
- PK solution: 100 mM NaCl, 10 mM Tris-HCl (pH 7.0), 1 mM EDTA, 0.5% (0.5 g/100 mL) SDS.
- TE buffer: 10 mM Tris-HCl (pH 8.0), 1 mM EDTA.
- The construction process of the RIC-seq library of the present invention is shown in A of Figure 1. It includes cell culture, formaldehyde cross-linking, permeabilization of the cell membrane and nuclear membrane, MNase treatment, hydroxylation of RNA 3' ends, pCp-biotin ligation, hydroxylation of RNA 3' ends and phosphorylation of 5' ends, proximity ligation, total RNA extraction, removal of genomic DNA with DNase I, ribosomal RNA removal, RNA fragmentation, enrichment on C1 magnetic beads and elution of the enriched RNA, first-strand cDNA synthesis, second-strand DNA synthesis, end repair, addition of "A", adapter ligation, PCR amplification and other steps. The specific steps are as follows:
- step 2 After completing step 1, add 10 ml of 1% (volume percentage) formaldehyde solution (the solvent is PBS solution) to the washed cells obtained in step 1, and let stand at room temperature for 10 minutes. Then add glycine solution (final concentration is 0.125mol/L, solvent is DEPC water) to terminate the reaction, and place at room temperature for 10 minutes to obtain formaldehyde crosslinked and terminated cells.
- step 3 After completing step 2, wash the formaldehyde-crosslinked and quenched cells obtained in step 2 three times with 10 ml of pre-cooled PBS (pH 7.4), scrape the cells with a cell scraper and transfer them to a 50 ml centrifuge tube. Centrifuge at 4°C and 2500 rpm for 10 minutes, discard the supernatant, add 2 ml of pre-cooled PBS (pH 7.4) to resuspend the cell pellet, and transfer the cell suspension to two 1.5 ml centrifuge tubes, 1 ml per tube. Centrifuge at 4°C and 2500 rpm for 10 minutes, discard the supernatant, and either continue with the next step or store the cell pellet at -80°C.
- step 4 After completing step 3, add 1 ml of permeabilization solution to the cell pellet obtained in step 3, incubate on ice for 15 minutes, and mix every 2 minutes. Centrifuge at 4°C and 3500 rpm for 5 minutes, discard the supernatant, add 600 μl 1× PNK solution to resuspend the cell pellet, rotate (20 rpm) at 4°C for 5 minutes to mix, centrifuge at 4°C and 3500 rpm for 5 minutes, and discard the supernatant. Repeat this wash 2 times.
- step 5 After completing step 4, resuspend the cell pellet obtained in step 4 in 200 μl of micrococcal nuclease (Thermo Fisher, catalog number EN0181) diluted at a volume ratio of 1:10000 in 1× MN reaction solution (MNase concentration 0.03 U/μl). React in a ThermoMixer at 37°C for 10 minutes, with the program set to shake at 1000 rpm for 15 seconds every 2 minutes.
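The dilution figures in the MNase step can be cross-checked with simple arithmetic. The sketch below assumes that the stated 0.03 U/μl refers to the enzyme after the 1:10000 dilution, which implies a stock of about 300 U/μl; the stock value is inferred from the numbers in the text and is not stated in it. The function names are hypothetical.

```python
def stock_concentration(working_u_per_ul, dilution_factor):
    """Stock concentration implied by a working concentration and a dilution factor."""
    return working_u_per_ul * dilution_factor


def total_units(conc_u_per_ul, volume_ul):
    """Total enzyme units present in a given volume."""
    return conc_u_per_ul * volume_ul


# Values from the MNase step: 1:10000 dilution, 0.03 U/ul working, 200 ul used.
implied_stock = stock_concentration(0.03, 10000)   # about 300 U/ul (inferred)
units_per_reaction = total_units(0.03, 200)        # about 6 U per reaction

print(round(implied_stock, 3), round(units_per_reaction, 3))
```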
- step 6 After completing step 5, add 10 μl 10× FastAP buffer (product of Thermo Fisher), 10 μl Fast Alkaline Phosphatase (product of Thermo Fisher, product number EF0651; final concentration in the reaction system 0.1 U/μl) and 80 μl DEPC water, resuspend the cell pellet, and react in a ThermoMixer at 37°C for 10 minutes, with the program set to shake at 1000 rpm for 15 seconds every 3 minutes.
- step 7 After completing step 6, add 10 μl 10× RNA ligase reaction buffer (product of Thermo Fisher), 6 μl RNase inhibitor, 4 μl Biotinylated Cytidine (Bis)phosphate (i.e. pCp-biotin, product of Thermo Fisher, article number 20160) (1 mM), 10 μl T4 RNA ligase (product of Thermo Fisher, article number EL0021; final concentration in the reaction system 1 U/μl), 20 μl DEPC water and 50 μl 30% PEG to the cell pellet obtained in step 6, resuspend the cell pellet, and react overnight at 16°C in a ThermoMixer, with the program set to shake at 1000 rpm for 15 seconds every 3 minutes.
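The component volumes of the pCp-biotin ligation add up to 100 μl, which is consistent with a 1× buffer. A small Python sketch of that bookkeeping follows. It ignores the volume of the cell pellet itself, which is an assumption, and the helper name `final_concentration` is hypothetical.

```python
# Component volumes (ul) of the pCp-biotin ligation, as listed in the text.
PCP_LIGATION_UL = {
    "10x RNA ligase reaction buffer": 10,
    "RNase inhibitor": 6,
    "pCp-biotin (1 mM)": 4,
    "T4 RNA ligase": 10,
    "DEPC water": 20,
    "30% PEG": 50,
}


def final_concentration(stock, volume_ul, total_ul):
    """Final concentration after diluting `volume_ul` of stock into `total_ul`."""
    return stock * volume_ul / total_ul


total_ul = sum(PCP_LIGATION_UL.values())
buffer_fold = final_concentration(10, PCP_LIGATION_UL["10x RNA ligase reaction buffer"], total_ul)
peg_percent = final_concentration(30, PCP_LIGATION_UL["30% PEG"], total_ul)
pcp_micromolar = final_concentration(1000, PCP_LIGATION_UL["pCp-biotin (1 mM)"], total_ul)

print(total_ul, buffer_fold, peg_percent, pcp_micromolar)
```

Under these assumptions the reaction contains 15% PEG and 40 μM pCp-biotin; neither figure is stated in the text, both follow from the listed volumes.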
- step 8 After completing step 7, add 10 μl 10× FastAP buffer (product of Thermo Fisher), 10 μl Fast Alkaline Phosphatase (product of Thermo Fisher, product number EF0651; final concentration in the reaction system 0.1 U/μl) and 80 μl DEPC water to the cell pellet obtained in step 7, resuspend the cell pellet, and react in a ThermoMixer at 37°C for 10 minutes, with the program set to shake at 1000 rpm for 15 seconds every 3 minutes.
- step 9 After completing step 8, add 10 μl 10× PNK buffer (product of Thermo Fisher), 15 μl 10 mM ATP, 10 μl T4 PNK (product of Thermo Fisher, product number EK0032; final concentration 1 U/μl) and 65 μl DEPC water to the cell pellet obtained in step 8, resuspend the cell pellet, and react in a ThermoMixer at 37°C for 45 minutes, with the program set to shake at 1000 rpm for 15 seconds every 3 minutes.
- step 10 After completing step 9, add 20 μl 10× RNA ligase reaction buffer (product of Thermo Fisher), 8 μl RNase inhibitor, 10 μl T4 RNA ligase (product of Thermo Fisher, product number EL0021; final concentration in the system 0.5 U/μl) and 142 μl DEPC water to the cell pellet obtained in step 9, resuspend the cell pellet, and react overnight at 16°C in a ThermoMixer, with the program set to shake at 1000 rpm for 15 seconds every 3 minutes.
- step 11 After completing step 10, add 200 μl Proteinase K solution and 50 μl Proteinase K (product of Takara, article number 9034; final concentration in the reaction system 0.12 U/μl) to the cell pellet obtained in step 10 and mix well. React in the ThermoMixer at 37°C for 60 minutes and then at 56°C for 15 minutes. After the reaction, cool the sample to room temperature, add 750 μl Trizol LS (product of Thermo Fisher, item number 10296028), mix well, leave at room temperature for 5 minutes, add 220 μl chloroform, mix vigorously for 15 seconds, and leave at room temperature for 3 minutes.
- step 12 After completing step 11, centrifuge the sample obtained in step 11 at 4°C and 13000 rpm for 20 minutes, discard the supernatant, add 500 μl 75% ethanol to wash the precipitate, and centrifuge at 4°C and 13000 rpm for 5 minutes. Repeat the wash once, air-dry the precipitate, add 20 μl DEPC water to dissolve it, and take 1 μl of sample for quantification with a NanoDrop.
- step 13 After completing step 12, take 20 μg of total RNA from the sample obtained in step 12, add 10 μl 10× RQ1 DNase I buffer (Promega product), 3 μl RNasin (Thermo Fisher product, item number EO0381) and 5 μl DNase I (Promega product, product number M6101), make up with water to a total volume of 100 μl, and react in a ThermoMixer at 37°C for 20 minutes. After the reaction, add 100 μl DEPC water, then add 200 μl acid phenol-chloroform, mix well, and leave at room temperature for 3 minutes.
- step 14 centrifuge the sample obtained in step 13 at 4°C and 13000 rpm for 20 minutes, discard the supernatant, add 500 μl 75% ethanol to wash the precipitate, and centrifuge at 4°C and 13000 rpm for 5 minutes. Repeat the wash once, air-dry the precipitate, add 6 μl DEPC water to dissolve it, and transfer the sample to a PCR tube.
- step 15 After completing step 14, add 10 μl rRNA probe mix to the sample obtained in step 14 (refer to published literature (Adiconis, X., Borges-Rivera, D., Satija, R., DeLuca for the design and synthesis of probe sequences).
- step 16 After completing step 15, add 3 μl 10× RNase H buffer (product of Thermo Fisher), 5 μl RNase H (product of Thermo Fisher, product number EN0202) (25 U) and 2 μl DEPC water to the sample obtained in step 15, mix well, put the sample into the PCR machine and set the program: 37°C for 30 minutes. Place the sample on ice immediately after the reaction.
- step 17 After completing step 16, add 4 μl 10× TURBO buffer (product of Thermo Fisher), 5 μl TURBO DNase (product of Thermo Fisher, product number AM2238; final concentration in the reaction system 0.25 U/μl) and 1 μl DEPC water, mix well, put the sample into the PCR machine and set the program: 37°C for 30 minutes. Place the sample on ice immediately after the reaction.
- step 18 After completing step 17, transfer the sample obtained in step 17 to a 1.5 ml centrifuge tube, add 160 μl DEPC water and 200 μl acid phenol-chloroform, mix well, leave at room temperature for 3 minutes, and centrifuge at 4°C and 13000 rpm for 15 minutes. Transfer the supernatant to a 1.5 ml centrifuge tube, add 20 μl 3 M sodium acetate (pH 5.5), 1 μl GlycoBlue and 500 μl 100% ethanol, mix well, and precipitate overnight at -20°C.
- step 19 After completing step 18, centrifuge the sample obtained in step 18 at 4°C and 13000 rpm for 20 minutes, discard the supernatant, add 500 μl 75% ethanol to wash the precipitate, and centrifuge at 4°C and 13000 rpm for 5 minutes. Repeat the wash once, air-dry the precipitate, add 16 μl DEPC water to dissolve it, transfer the sample to a PCR tube, add 4 μl 5× First-strand buffer (Thermo Fisher product, catalog number 18064-014), mix well, react at 94°C for 5 minutes in a PCR machine, and place on ice immediately after the reaction.
- step 21 Take the sample from step 19, add 30 μl DEPC water, and add the resulting 50 μl of sample to the blocked magnetic beads. Mix, rotate at room temperature for 30 minutes, place the centrifuge tube on the magnetic stand, wait for the solution to clear, aspirate the supernatant, and wash 4 times with 500 μl 1× TWB solution.
- step 22 After completing step 21, add 100 μl of PK solution to the washed magnetic beads obtained in step 21, mix well, and react in a ThermoMixer at 95°C with shaking at 1000 rpm for 10 minutes. Place the centrifuge tube on the magnetic stand and wait for the solution to clear.
- step 23 After completing step 22, centrifuge the sample obtained in step 22 at 4°C and 13000 rpm for 20 minutes, discard the supernatant, add 500 μl 75% ethanol to wash the precipitate, and centrifuge at 4°C and 13000 rpm for 5 minutes. Repeat the wash once, air-dry the pellet, add 10 μl DEPC water to dissolve it, transfer the sample to a PCR tube, add 0.5 μl N6 primer (sequence NNNNNN, where N represents A, T, C or G) (0.1 μg/μl), mix well, put the PCR tube into the PCR machine, and react at 65°C for 5 minutes. Place on ice immediately after the reaction.
- step 24 After completing step 23, add 3 μl 5× First-strand buffer (Thermo Fisher, product number 18064-014), 1 μl dNTP mix (10 mM), 0.5 μl 100 mM DTT, 0.5 μl RNase Inhibitor (40 U/μl) and 0.5 μl Superscript II (Thermo Fisher, product number 18064-014) (200 U/μl) to the sample obtained in step 23, mix well, put the PCR tube into the PCR machine, and set the program: 25°C for 10 minutes, 42°C for 40 minutes, 70°C for 15 minutes. After the reaction is complete, put the sample on ice.
- step 25 After completing step 24, transfer the sample obtained in step 24 to a new 1.5 ml centrifuge tube, add 10 μl 5× Second-strand buffer (product of Thermo Fisher, item number 10812-014), 0.8 μl dNTP(dUTP) (25 mM) (i.e. a mixture of 25 mM dNTPs and dUTP, in which the molar ratio of dTTP to dUTP is 4:1), 0.2 μl RNase H (product of Thermo Fisher, product number EN0202) (5 U/μl) and 2.5 μl DNA pol I (product of Enzymatics, item number P705-500) (10 U/μl), put the centrifuge tube into the ThermoMixer, and set the reaction program: 16°C for 2 hours, shaking at 300 rpm for 15 seconds every 2 minutes.
- step 26 After completing step 25, mix AMPure XP magnetic beads (Beckman; XP magnetic beads for short) at room temperature for 30 minutes in advance. Then add 90 μl (1.8×) XP magnetic beads to the reaction solution obtained in step 25 and mix gently. Let stand at room temperature for 5 minutes, transfer to a magnetic stand and let stand for 5 minutes, remove the supernatant, and rinse the magnetic beads twice with 200 μl of fresh 80% ethanol. Dry the magnetic beads on the magnetic stand for 2 minutes, add 43 μl TE buffer to resuspend the beads, and pipette 50 times. Let stand at room temperature for 5 minutes, place on the magnetic stand for 5 minutes, then draw 42 μl of supernatant and add it to a 1.5 ml centrifuge tube.
- step 27 After completing step 26, add 5 μl 10× PNK solution (the reaction solution supplied with T4 PNK), 0.4 μl dNTPs (25 mM), 1.2 μl T4 DNA polymerase (product of Enzymatics, product number P7080L) (3 U/μl), 0.2 μl Klenow fragment (product of Enzymatics, item number P7060L) (5 U/μl) and 1.2 μl T4 PNK (product of Enzymatics, item number Y9040L) (10 U/μl) to the sample obtained in step 26, mix well, put the centrifuge tube into the ThermoMixer, and react at 20°C for 30 minutes.
- After the reaction is over, add 90 μl AMPure XP magnetic beads for purification (the specific steps are the same as in step 26), elute with 20.5 μl TE buffer, aspirate 19.7 μl of supernatant, and add it to a new 1.5 ml centrifuge tube.
- step 28 After completing step 27, add 2.3 μl 10× blue buffer (product of Enzymatics, product number B0110L), 0.5 μl dATP (5 mM) and 0.5 μl Klenow exo- (3′ to 5′ exo minus) (product of Enzymatics, P7010-LC-L) (5 U/μl) to the sample obtained in step 27, mix well, put the centrifuge tube in the ThermoMixer, and react at 37°C for 30 minutes.
- step 29 After completing step 28, add 1.4 μl 2× Rapid ligation buffer (product of Enzymatics, product number B1010L), 0.1 μl 10 mM ATP, 1 μl Adapter (2 μM; PEI Adapter oligo A: GATCGGAAGAGCACACGTCT, PEI Adapter oligo B: ACACTCTTTCCCTACACGACGCTCTTCCGATCT; the adapter used in the reaction is formed by annealing the two oligos) and 1 μl T4 quick DNA ligase (product of Enzymatics, item number L6030-HC-L) (600 U/μl) to the sample obtained in step 28, mix well, put the centrifuge tube in the ThermoMixer, and react at 20°C for 15 minutes.
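Because the adapter is formed by annealing oligo A and oligo B, the two sequences should share a complementary stretch. The Python sketch below checks this for the sequences quoted in the text. It assumes that the 3' end of oligo B pairs with the 5' end of oligo A and that the final T of oligo B is left as a single-base overhang; both points are inferred from the sequences rather than stated in the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def shared_prefix_length(a, b):
    """Number of leading positions at which two sequences are identical."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


OLIGO_A = "GATCGGAAGAGCACACGTCT"
OLIGO_B = "ACACTCTTTCCCTACACGACGCTCTTCCGATCT"

rc_b = reverse_complement(OLIGO_B)
# Skip the first base of rc_b: it corresponds to the 3' T of oligo B,
# assumed here to stay unpaired as an overhang.
duplex_length = shared_prefix_length(OLIGO_A, rc_b[1:])

print(rc_b)
print(duplex_length)
```

Under these assumptions the annealed stem is 12 bp long and the remaining bases of both oligos are unpaired, which is the shape expected of a forked adapter.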
- After the reaction is over, add 47.7 μl AMPure XP magnetic beads for purification (the specific steps are the same as in step 26), elute with 26 μl TE buffer, aspirate 25 μl of supernatant, and add it to a new 1.5 ml centrifuge tube.
- Add 45 μl AMPure XP magnetic beads for a second purification (the specific steps are the same as in step 26), elute with 16.5 μl TE buffer, aspirate 15.7 μl of supernatant, and add it to a PCR tube.
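The bead volumes used in the successive AMPure XP purifications (90 μl, 47.7 μl and 45 μl) are all consistent with the 1.8× ratio stated for the first purification. The Python sketch below reproduces them from the reaction volumes listed in the text. Summing the component volumes to obtain each sample volume is an assumption, and `bead_volume_ul` is a hypothetical helper.

```python
def bead_volume_ul(sample_ul, ratio=1.8):
    """Bead volume for a given sample volume at a fixed bead-to-sample ratio."""
    return round(sample_ul * ratio, 2)


# Sample volumes (ul) reconstructed by summing the listed components.
end_repair_ul = round(42 + 5 + 0.4 + 1.2 + 0.2 + 1.2, 1)      # end-repair reaction
a_tailing_ul = round(19.7 + 2.3 + 0.5 + 0.5, 1)               # A-tailing reaction
ligation_ul = round(a_tailing_ul + 1.4 + 0.1 + 1 + 1, 1)      # adapter ligation
second_cleanup_ul = 25.0                                      # eluate of first cleanup

for name, volume in [
    ("end repair", end_repair_ul),
    ("adapter ligation", ligation_ul),
    ("second cleanup", second_cleanup_ul),
]:
    print(name, volume, bead_volume_ul(volume))
```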
- step 30 After completing step 29, use the supernatant obtained in step 29 as a template to perform a PCR reaction in a PCR tube to obtain a PCR reaction solution (25 μl).
- the PCR reaction system is 25 μl: supernatant 15.7 μl, 10× Pfx buffer (Invitrogen) 2.5 μl, 10 μM upstream and downstream primers 1 μl, 50 mM MgSO4 solution 1 μl, 25 mM dNTP 0.4 μl, Pfx enzyme (Invitrogen) 0.4 μl, USER enzyme (NEB) 3 μl.
- the PCR reaction procedure is: 37°C for 15 minutes; 94°C for 2 minutes; 94°C for 15 seconds, 62°C for annealing for 30 seconds, 72°C for 30 seconds, reaction for 12 cycles; 72°C for 10 minutes.
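The PCR system and program above can be checked with a short calculation. The Python sketch below assumes that "upstream and downstream primers 1 μl" means 1 μl of each primer, since that reading is what brings the listed components to 25 μl. The run time it computes covers hold times only and ignores ramping between temperatures.

```python
# PCR mix in microliters. Assumption: 1 ul of EACH primer.
PCR_MIX_UL = {
    "template supernatant": 15.7,
    "10x Pfx buffer": 2.5,
    "upstream primer (10 uM)": 1.0,
    "downstream primer (10 uM)": 1.0,
    "50 mM MgSO4": 1.0,
    "25 mM dNTP": 0.4,
    "Pfx enzyme": 0.4,
    "USER enzyme": 3.0,
}

HOLDS_BEFORE_CYCLING_S = [15 * 60, 2 * 60]  # 37 C for 15 min, 94 C for 2 min
CYCLE_STEPS_S = [15, 30, 30]                # 94 C, 62 C, 72 C
N_CYCLES = 12
FINAL_EXTENSION_S = 10 * 60                 # 72 C for 10 min


def total_volume_ul(mix):
    """Sum of component volumes, rounded to avoid floating-point noise."""
    return round(sum(mix.values()), 1)


def program_minutes():
    """Total programmed hold time in minutes (ramp times not included)."""
    seconds = (
        sum(HOLDS_BEFORE_CYCLING_S)
        + N_CYCLES * sum(CYCLE_STEPS_S)
        + FINAL_EXTENSION_S
    )
    return seconds / 60


print(total_volume_ul(PCR_MIX_UL), program_minutes())
```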
- step 31 After completing step 30, run the PCR reaction solution obtained in step 30 on a 2% agarose gel, excise the products in the 200-450 bp range and recover them with the Qiagen MinElute kit. Refer to the kit instructions for the operating steps, and finally elute with 16 μl TE buffer to obtain the PCR eluate.
- step 32 After completing step 31, draw 1 μl of the PCR eluate obtained in step 31 and quantify it with Qubit 3.0. Samples that pass quantification are used for sequencing analysis.
- the HeLa cells cultured in the laboratory were used as samples.
- the initial amount of cell samples was 1 × 10^7 cells.
- Drosophila S2 cells were used as a spike-in to evaluate the specificity of the proximity ligation.
- the RIC-seq library was constructed according to the method in Example 1 based on the cell sample in step 1. Among them, the upstream and downstream primers in step 30 are as follows (NNNNNNN is the library Index sequence)
- N represents A or T or C or G.
- Illumina HiSeq X Ten sequencer was used to perform PE150 paired-end sequencing on the RIC-seq library constructed in step 2.
- the data analysis process is shown in Figure 2 A.
- RIC-seq (RNA In situ Conformation Sequencing)
- the specific process is shown in A in Figure 1.
- the cells are treated with formaldehyde to fix the interactions between protein and RNA, protein and DNA, and protein and protein, so that different RNA fragments that are close in space are fixed.
- the cell membrane and nuclear membrane of the fixed cells are permeabilized with multiple detergents, and the cells are treated with micrococcal nuclease to remove free RNA that is not protected by protein.
- the 3' end of the RNA is a phosphate group, and the 5' end is a hydroxyl group (A in Figure 1).
- we use alkaline phosphatase to first convert the 3' end phosphate into a hydroxyl group, and then use T4 RNA ligase to connect pCp-biotin to the 3' end of the RNA.
- the samples were treated with alkaline phosphatase and T4 PNK to convert the 3' end of Cp-biotin into a hydroxyl group and the 5' end of the RNA into a phosphate group (A in Figure 1).
- T4 RNA ligase is used to connect the spatially close RNAs. Total RNA is then obtained by proteinase K digestion combined with TRIzol extraction. After genomic DNA and rRNA are removed, the RNA is subjected to alkaline fragmentation.
- RNAs include microRNA, snRNA, snoRNA and lncRNA (C-I in Figure 1, D-F in Figure 2).
- RIC-seq can accurately capture the classic stem-loop structure of miRNA precursors at single-base resolution (C in Figure 1, pCp insertion positions are shown below), and the expression levels of these miRNAs (RPM, reads per million) range from 0.05 to 31,067 (D in Figure 1), indicating the dynamic detection range of RIC-seq technology.
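RPM (reads per million) normalizes a read count by the total number of reads in the library. The Python sketch below shows the calculation. The library size and read counts are hypothetical values chosen only so that the results land on the two extremes quoted above; they are not taken from the patent's data.

```python
def reads_per_million(read_count, total_reads):
    """Normalize a read count to reads per million (RPM)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return read_count * 1_000_000 / total_reads


# Hypothetical library of 20 million reads.
TOTAL_READS = 20_000_000
low_rpm = reads_per_million(1, TOTAL_READS)          # a single supporting read
high_rpm = reads_per_million(621_340, TOTAL_READS)   # a very abundant miRNA

print(low_rpm, high_rpm)
```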
- pCp-biotin is mainly enriched on the top loop of miRNA precursor (C in Figure 1, D in Figure 1), indicating that the top loop may be rarely protected by protein.
- RIC-seq successfully detected the known intermolecular and intramolecular interactions of snRNA, snoRNA, RPPH1 (the RNA component of Ribonuclease P) and TERC (telomerase RNA, data not shown) (E-I in Figure 1, D-F in Figure 2).
- RNA-RNA interactions enable specific targeting of noncoding RNAs to nascent Pre-mRNAs and chromatin sites. Cell 159(1):188-199.); (Lu, Z., Zhang, QC, Lee, B., Flynn, RA, Smith, MA, Robinson, JT, Davidovich, C., Gooding, AR, Goodrich, KJ, Mattick, JS, et al. (2016). RNA Duplex Map in Living Cells Reveals Higher-Order Transcriptome Structure. Cell 165, 1267-1279.), RIC-seq can not only capture the known U1-MALAT1 interactions, but also identify some specific U1 and MALAT1 interaction sites in
- the +pCp library covers complex intramolecular (~7M) and intermolecular (~6M) interactions (G in Figure 2), indicating that RNA in vivo is not only highly structured but also extensively intertwined (J in Figure 1). Interestingly, some lncRNAs, such as NEAT1 and MALAT1, bind extensively across all chromosomes. To identify the true binding sites of these high-abundance RNAs, we performed cluster analysis on the chimeric reads and identified 0.74M high-confidence RNA-RNA interaction sites in HeLa cells.
- MALAT1 and NEAT1 can interact not only with each other but also with thousands of other targets (J in Figure 1). Consistent with recent reports (West, JA, Davis, CP, Sunwoo, H., Simon, MD, Sadreyev, RI, Wang, PI, Tolstorukov, MY, and Kingston, RE (2014). The long noncoding RNAs NEAT1 and MALAT1 bind active chromatin sites. Molecular cell 55, 791-802.), we also found that MALAT1 and NEAT1 are more likely to bind to transcriptionally active genes (H in Figure 2, p < 2.2e-16), and their binding motifs are highly similar (I in Figure 2).
- RIC-seq revealed that MALAT1 can bind to the 5'end of NEAT1 (NEAT1_5', J right in Figure 1).
- NEAT1_5 single molecule in situ hybridization
- SIM ultra-high resolution imaging
- a HeLa cell has an average of 7.5 paraspeckles, of which ~63.7% are co-localized with MALAT1 (K in Figure 2).
- RIC-seq is a new method for identifying in situ RNA-RNA interactions with high specificity, high reproducibility and high precision.
- The RNA proximity information detected by RIC-seq was compared with the data obtained from the cryo-EM structure of the human 80S ribosome (Anger, AM, Armache, JP, Berninghausen, O., Habeck, M., Subklewe, M., Wilson, DN, and Beckmann, R. (2013). Structures of the human and Drosophila 80S ribosome. Nature 497, 80-85.).
- the AUC value obtained by ROC analysis is 0.89, indicating that RIC-seq has high accuracy in the identification of high-level RNA structures (D in Figure 3, dark line).
- As a control, we used the same data set to evaluate the performance of PARIS. Because a large number of 28S rRNA pairing regions and distant interaction sites cannot be captured by PARIS, a complete curve could not be obtained and no AUC value could be generated (D in Figure 3, light-colored line).
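For readers unfamiliar with ROC analysis, the Python sketch below shows how a ROC curve and its AUC can be computed from a list of scores and true/false labels. It is a generic illustration on toy data, not the evaluation pipeline used for the 28S rRNA comparison. Here a score would stand for the RIC-seq signal of a candidate contact and a label for whether the cryo-EM structure supports it; that mapping is an assumption.

```python
def roc_points(scores, labels):
    """ROC points (FPR, TPR) obtained by lowering the score threshold stepwise."""
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    n_pos = sum(1 for _, label in pairs if label)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative")
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        j = i
        # Items with tied scores cross the threshold together.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1]:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy data: higher score = stronger evidence; label 1 = true contact.
toy_scores = [9, 8, 7, 6, 5, 4]
toy_labels = [1, 0, 1, 1, 0, 0]
toy_auc = auc(roc_points(toy_scores, toy_labels))
print(round(toy_auc, 3))
```

An AUC of 0.5 corresponds to the random reference line and 1.0 to perfect separation of true positives from true negatives.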
- RNA-RNA intramolecular interaction data generated by RIC-seq technology allows us to detect RNA folding rules in vivo.
- As in PDE3A and IMMP2L precursor RNA. To systematically identify similar topological domains, we devised an iterative algorithm that identifies the boundaries of topological domains by maximizing the ratio of the RIC-seq signal density within domains to that between domains.
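The text does not give the details of the iterative algorithm, so the following Python sketch is only one possible reading of "maximizing the ratio of density within domains to density between domains": a greedy search that keeps adding the boundary that most increases that ratio. The function names, the greedy strategy, the pseudocount and the toy matrix are all assumptions made for illustration.

```python
def density_ratio(matrix, boundaries, pseudocount=1e-9):
    """Mean contact density within domains divided by that between domains."""
    n = len(matrix)
    edges = [0] + sorted(boundaries) + [n]
    domain_of = [0] * n
    for d, (start, end) in enumerate(zip(edges, edges[1:])):
        for i in range(start, end):
            domain_of[i] = d
    intra = inter = 0.0
    n_intra = n_inter = 0
    for i in range(n):
        for j in range(i + 1, n):
            if domain_of[i] == domain_of[j]:
                intra += matrix[i][j]
                n_intra += 1
            else:
                inter += matrix[i][j]
                n_inter += 1
    if n_intra == 0 or n_inter == 0:
        return 0.0  # ratio undefined without both kinds of bin pairs
    return (intra / n_intra) / (inter / n_inter + pseudocount)


def find_boundaries(matrix):
    """Greedily add boundaries while the intra/inter density ratio improves."""
    n = len(matrix)
    boundaries = []
    best = density_ratio(matrix, boundaries)
    while True:
        scored = [
            (density_ratio(matrix, boundaries + [b]), b)
            for b in range(1, n)
            if b not in boundaries
        ]
        if not scored:
            break
        score, candidate = max(scored)
        if score <= best:
            break
        boundaries.append(candidate)
        best = score
    return sorted(boundaries)


def toy_matrix():
    """Six bins forming two domains (bins 0-2 and 3-5) with dense internal contacts."""
    return [[10 if (i < 3) == (j < 3) else 1 for j in range(6)] for i in range(6)]


print(find_boundaries(toy_matrix()))
```

On the toy matrix the search places a single boundary between bins 2 and 3 and then stops, because any further split lowers the ratio.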
- RNA is highly structured in vivo and in specific regions, and RNA co-transcription processing may occur in independent topological domains.
- RNA polymers can also exist in the form of random coils, equilibrium globules or fractal globules (fractal spheres).
- the specific conformation that RNA adopts can be deduced by calculating the contact probability between RNA fragments separated by different nucleotide distances (Fudenberg, G., and Mirny, LA (2012). Higher-order chromatin structure: bridging physics and biology. Current opinion in genetics & development 22, 115-124.).
- RNA precursors may be folded in vivo in a fractal globule (fractal sphere) conformation similar to that of genomic DNA. This conformation keeps the RNA free of knots while maximizing packing, and preserves its ability to open and refold local structures.
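The slope referred to in Figure 4 is the exponent of the contact probability when plotted against linear distance on log-log axes. The Python sketch below estimates that slope by ordinary least squares. The input is synthetic data generated as P(s) = 1/s, used only to show that the fit returns -1; it is not RIC-seq data.

```python
import math


def loglog_slope(distances, probabilities):
    """Least-squares slope of log10(probability) against log10(distance)."""
    xs = [math.log10(d) for d in distances]
    ys = [math.log10(p) for p in probabilities]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


# Synthetic contact probabilities following P(s) = 1/s.
distances_nt = [100, 1_000, 10_000, 100_000]
contact_prob = [1 / s for s in distances_nt]

slope = loglog_slope(distances_nt, contact_prob)
print(round(slope, 6))
```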
- RNA 3D maps were drawn in various cell lines, including human neural precursor cells (hNPC) and the colon adenocarcinoma cell line HT29, as well as the ENCODE cell types human B lymphocyte line GM12878, H1 human embryonic stem cells (hESCs) and human fetal lung fibroblasts IMR-90, so as to facilitate the later integration of genomic data generated by the ENCODE project.
- For the RIC-seq libraries from these 5 cell lines, after removing PCR duplicates, we obtained a total of 1,001M uniquely aligned reads, of which chimeric reads accounted for 8.4%.
- the RNA-RNA interactions in these five new cell types are also extremely complex (A in Figure 5).
- LncPRESS2 an embryonic stem cell-specific lncRNA regulated by P53 (Jain, AK, Xi, Y., McCarthy, R., Allton, K., Akdemir, KC, Patel, LR, Aronow, B., Lin, C., Li, W., Yang, L., et al. (2016). LncPRESS1 Is a p53-Regulated LncRNA that Safeguards Pluripotency by Disrupting SIRT6-Mediated De-acetylation of Histone H3K56. Molecular cell 64, 967-981.), RIC-seq detected extensive interaction between LncPRESS2 and its neighboring gene GRID2 in H1 hESC (C in Figure 5).
- LncRNA Jpx induces Xist expression in mice using both trans and cis mechanisms.
- the long noncoding RNA, Jpx is a molecular switch for X chromosome inactivation. Cell 143,390-403.), and is essential for XIST-mediated X chromosome silencing.
- the tight binding between these two lncRNAs suggests that they may regulate XIST in the form of a complex.
- Guided by sgRNA, Cas9-KRAB can be specifically targeted to the promoter region of a lncRNA, and KRAB, as a transcription inhibitor of RNA polymerase II (E in Figure 5), can efficiently block the transcription of the lncRNA.
- In order to reveal the general characteristics of RNA-RNA interactions in different cell types, we first calculated the frequency of intrachromosomal and interchromosomal interactions. Using the RIC-seq data generated in the above six cell types, we found that ~70% of RNA-RNA interactions occur within the same chromosome, while the remaining ~30% occur between chromosomes (A in Figure 6). Since RNA can act on other RNA molecules either in cis or in trans, we also calculated the frequency of intra- and inter-gene RNA interactions. About 60% of chimeric reads are derived from cis interactions within genes.
- This part of the data can be used to infer the 3D structure of RNA; the remaining 40% show the characteristics of trans RNA-RNA interactions (B in Figure 6). This indicates that a large number of RNAs in a cell can interact over long distances with other RNAs on the same chromosome or on different chromosomes. If we count only chimeric reads on the same chromosome, this trend is also obvious, and we detect two distinct peaks: the first peak corresponds to RNA-RNA interactions within genes, which can span hundreds of nucleotides; the second peak corresponds to interactions between genes, which span a distance of more than 1 Mb (C in Figure 6).
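The two breakdowns described above (within versus between chromosomes, and within versus between genes) amount to classifying each chimeric read by the annotation of its two arms. The Python sketch below illustrates this on ten hypothetical reads, constructed so that the proportions mirror the ~70%/30% and ~60%/40% splits reported in the text; the reads themselves are invented for illustration.

```python
from collections import Counter, namedtuple

# One arm of a chimeric read: the chromosome and gene it maps to.
Arm = namedtuple("Arm", ["chrom", "gene"])


def classify(arm_a, arm_b):
    """Classify a chimeric read by where its two arms map."""
    if arm_a.gene == arm_b.gene:
        return "intragenic"
    if arm_a.chrom == arm_b.chrom:
        return "intergenic_same_chromosome"
    return "interchromosomal"


PDE3A = Arm("chr12", "PDE3A")
MALAT1 = Arm("chr11", "MALAT1")
NEAT1 = Arm("chr11", "NEAT1")
CCAT1 = Arm("chr8", "CCAT1")
MYC = Arm("chr8", "MYC")

# Hypothetical chimeric reads (pairs of arms).
TOY_READS = [
    (PDE3A, PDE3A), (PDE3A, PDE3A), (PDE3A, PDE3A),
    (MALAT1, MALAT1), (MALAT1, MALAT1), (CCAT1, CCAT1),
    (MALAT1, NEAT1),
    (MALAT1, MYC), (CCAT1, PDE3A), (NEAT1, PDE3A),
]

counts = Counter(classify(a, b) for a, b in TOY_READS)
intrachromosomal = counts["intragenic"] + counts["intergenic_same_chromosome"]

print(dict(counts))
print(intrachromosomal, "of", len(TOY_READS), "reads are intrachromosomal")
```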
- Similar to the organization of chromatin, RNA interactions also appear to be compartmentalized and largely reproduce the compartments of DNA (D in Figure 6), indicating that, owing to spatial proximity, RNAs in the same compartment may be more inclined to interact with each other. We next quantified the interactions between the different compartments. Interestingly, among interactions within genes, chimeric reads are mainly enriched within the same compartment: A-to-A interactions account for ~90% of the total chimeric reads (E in Figure 6), which may be due to the active transcription in compartment A and the relatively closer spatial distance within genes.
- Among interactions between genes, the share from compartment A to A decreased to about 65%, while the interaction between compartments A and B increased to ~30% (E in Figure 6), indicating that such trans RNA interactions may have some unknown functions and may regulate the activity of genes in compartment B.
- RNA-RNA interactions can span more than 1Mb or even across different chromosomes
- this analysis unexpectedly revealed ~500 highly abundant RNA-RNA interaction hubs in HeLa cells (A in Figure 7), including well-known lncRNAs such as MALAT1, NEAT1, CCAT1, and PVT1.
- many protein-coding genes also show complex RNA-RNA interactions, such as PDE3A, GPC5 and TRIO (A in Figure 7).
- RNA polymerase II is also enriched in the TSS (transcription start site) region of these genes (E in Figure 7).
- the active histone marks H3K4me3 (F in Figure 7) and H3K27ac (G in Figure 7) show slightly higher signals at the loci corresponding to hub-RNAs.
- the repressive histone mark H3K27me3 shows a slightly lower signal (H in Figure 7).
- hub-RNAs are cell-specific (I in Figure 7). Therefore, RIC-seq unexpectedly revealed a group of tissue-specific hub-RNAs that may play an important role in gene regulation.
- CCAT1: To study the role of hub-RNAs, we chose CCAT1 for further analysis because it has extensive trans-RNA interactions (B in Figure 7) and potential super-enhancer activity (Hnisz, D., Abraham, B.J., Lee, T.I., Lau, A., Saint-Andre, V., Sigova, A.A., Hoke, H.A., and Young, R.A. (2013). Super-enhancers in the control of cell identity and disease. Cell 155, 934-947; Loven, J., Hoke, H.A., Lin, C.Y., Lau, A., Orlando, D.A., Vakoc, C.R., Bradner, J.E., Lee, T.I., and Young, R.A. (2013). Selective inhibition of tumor oncogenes by disruption of super-enhancers. Cell 153, 320-334).
- CCAT1 is located in the human 8q24 region and is aberrantly highly expressed in many cancers, such as colorectal, prostate and liver cancer (Chen, H., He, Y., Hou, Y.S., Chen, D.Q., He, S.L., Cao, Y.F., and Wu, X.M. (2018a). Long non-coding RNA CCAT1 promotes the migration and invasion of prostate cancer PC-3 cells. European Review for Medical and Pharmacological Sciences 22, 2991-2996; Deng, L., Yang, S.B., Xu, F.F., and Zhang, J.H. (2015). Long noncoding RNA CCAT1 promotes hepatocellular carcinoma progression by functioning as let-7 sponge. Journal of Experimental & Clinical Cancer Research 34, 18; Tseng, Y.Y., Moriarity, B.S., Gong, W., Akiyama, R., Tiwari, A., Kawakami, H., Ronning, P., Reuland
- CCAT1 transcript: In colorectal cancer cells, a CCAT1 transcript with an additional 3' extension has been reported to regulate the DNA interaction between the promoter and enhancers of the MYC gene (Xiang, J.F., Yin, Q.F., Chen, T., Zhang, Y., Zhang, X.O., Wu, Z., Zhang, S., Wang, H.B., Ge, J., Lu, X., et al. (2014). Human colorectal cancer-specific CCAT1-L lncRNA regulates long-range chromatin interactions at the MYC locus. Cell Research 24, 513-531), but the exact mechanism is unknown.
- CCAT1 may have an additional 5' extension, rather than the 3' extension previously reported in colon cancer (A in Figure 8).
- 5' and 3' RACE in HeLa cells also confirmed that CCAT1 does have an additional extension at the 5' end (A in Figure 8, marked in brown at the bottom); the full transcript is ~4,700 nt long and directly overlaps the super-enhancer.
- CCAT1-5L is a nuclear-localized lncRNA and there are 2-3 spots in each nucleus (C in Figure 8).
- CCAT1-5L seems to be functional, because most of the interactions between CCAT1 and RNA in other regions of the 8q24 "gene desert" come from the first exon and the additional 5'end extension (Figure 8A).
- As shown in A of Figure 8, we detected extensive long-range RNA-RNA interactions among CCAT1-5L, MYC promoter RNA and PVT1. More importantly, the CCAT1-5L binding sites observed in the PVT1 locus lie mainly in the intronic sequences containing the MYC enhancers (A in Figure 8). These data suggest that CCAT1-5L may act as a super-enhancer RNA that interacts with promoter and enhancer RNAs to regulate expression of the MYC oncogene.
- CCAT1-5L can directly regulate MYC expression.
- CCAT1-5L was knocked down with two LNA oligonucleotides targeting the 5'end extension region (D in Figure 8)
- the RNA level of MYC was significantly reduced by ~40% (D in Figure 8), indicating that CCAT1-5L can indeed regulate MYC expression in HeLa cells.
- the positive regulator of MYC, PVT1 (Tseng, Y.Y., Moriarity, B.S., Gong, W., Akiyama, R., Tiwari, A., Kawakami, H., Ronning, P., Reuland, B.
- Since CCAT1-5L is highly expressed in cervical cancer patients, we next checked whether CCAT1-5L can promote cell proliferation and metastasis, two hallmarks of cancer (Hanahan, D., and Weinberg, R.A. (2011). Hallmarks of cancer: the next generation. Cell 144, 646-674). Compared with the control LNA, knockdown of CCAT1-5L with 5L-specific LNA oligonucleotides significantly reduced the proliferation rate of HeLa cells (F in Figure 8); conversely, ectopic expression of CCAT1-5L from a lentiviral plasmid significantly enhanced cell proliferation (F in Figure 8), consistent with an oncogenic role for CCAT1-5L.
- the method for capturing the higher-order structure and interactions of RNA in situ processes intracellular RNA in place without damaging cell structure, preserves cell integrity, and captures intramolecular and intermolecular RNA interactions under physiological conditions.
- the method for capturing RNA in situ higher-order structure and interactions provided by the present invention uses pCp-biotin to label RNA ends and performs in situ ligation under non-denaturing conditions, which greatly improves labeling efficiency and reduces nonspecific ligation between molecules;
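The cis/trans breakdown quoted above (intrachromosomal versus interchromosomal, intragenic versus intergenic) comes down to classifying each chimeric read by where its two arms map. The Python sketch below illustrates that bookkeeping only; the read representation, the function names and the five toy reads are assumptions made for illustration and are not taken from the RIC-seq pipeline.

```python
from collections import Counter


def classify_chimeric(read):
    """Classify one chimeric read by where its two arms map.

    `read` is ((chrom_a, gene_a), (chrom_b, gene_b)); this layout is hypothetical.
    """
    (chrom_a, gene_a), (chrom_b, gene_b) = read
    if chrom_a != chrom_b:
        return "interchromosomal"
    if gene_a == gene_b:
        return "intragenic"
    return "intergenic_same_chromosome"


def fractions(reads):
    """Return the fraction of reads falling in each class."""
    counts = Counter(classify_chimeric(r) for r in reads)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Toy input, invented for illustration (gene-to-chromosome assignments are real).
reads = [
    (("chr8", "CCAT1"), ("chr8", "CCAT1")),
    (("chr8", "CCAT1"), ("chr8", "MYC")),
    (("chr8", "CCAT1"), ("chr8", "PVT1")),
    (("chr11", "MALAT1"), ("chr11", "NEAT1")),
    (("chr11", "MALAT1"), ("chr8", "MYC")),
]
summary = fractions(reads)
intrachromosomal = 1 - summary.get("interchromosomal", 0.0)
```

On real data the same counts would be taken over all chimeric reads of a library, which is how percentages such as those in Figure 6 can be obtained.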
Abstract
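A method for capturing the in situ higher-order structure and interactions of RNA is provided. The method comprises: fixing protein-mediated RNA interactions in cells or tissues; perforating membranes while keeping the cells intact; degrading free RNA; labeling RNA 3' ends with pCp-biotin and performing proximity ligation in situ; purifying chimeric RNA containing C-biotin after digestion of the cells; constructing a strand-specific library; and high-throughput sequencing.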
Description
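The present invention relates to the field of biotechnology, and in particular to a method for capturing the in situ higher-order structure and interactions of RNA.
DNA, the carrier of genetic information, must first be transcribed into RNA and then translated into protein to exert its biological function. As a messenger of genetic information, RNA mainly encodes and directs protein synthesis; RNAs that encode proteins are collectively called messenger RNAs (mRNAs). In addition, the human genome is transcribed into a vast number of RNAs that do not encode proteins, called non-coding RNAs (ncRNAs). Non-coding RNAs found so far to have regulatory roles include tRNA, rRNA, siRNA, miRNA, piRNA, snoRNA, circRNA and lncRNA, and their abnormal expression and mutation are associated with major diseases such as cancer and developmental and reproductive defects. As key regulators of genetic information, RNAs usually need to form complex higher-order structures through intramolecular base pairing and then interact with other RNA molecules in order to carry out important regulatory functions. Sequencing already provides detailed RNA sequence information, but obtaining RNA structure, and higher-order structure in particular, remains a worldwide challenge. Physical methods such as nuclear magnetic resonance, cryo-electron microscopy and crystallography can resolve high-resolution RNA structures and intramolecular and intermolecular interactions, but their throughput is too low. Only a handful of high-resolution human RNA structures are currently deposited in the Protein Data Bank (PDB). How to resolve intramolecular and intermolecular RNA interactions systematically and accurately therefore remains a major challenge.
In recent years, many techniques for analyzing RNA secondary structure have been developed. They first treat RNA by chemical modification or enzymatic cleavage and then build sequencing libraries; examples include DMS-seq, Structure-seq and icSHAPE. These methods exploit the fact that single-stranded RNA regions are readily modified by compounds such as DMS (dimethyl sulfate) or NAI-N3, and infer which bases are single-stranded from the positions where reverse transcriptase stops. Several methods can also resolve the base-paired, double-stranded regions of RNA structure, such as PARIS, LIGR-seq and SPLASH. Their basic principle is as follows: psoralen or AMT is added to the culture medium, crosses the cell membrane and rapidly binds to base-paired RNA regions; upon irradiation with 365 nm ultraviolet (UV) light, paired RNAs in the cell are covalently crosslinked by psoralen or AMT. The enriched RNA is then fragmented and proximity-ligated in solution. Irradiating the ligated RNA with 254 nm UV reverses the covalent bonds between psoralen or AMT and the double-stranded RNA, after which libraries are built and sequenced. Although these methods allow high-throughput study of single-stranded and double-stranded RNA regions, they have shortcomings. First, they cannot capture non-Watson-Crick pairing or long-range RNA loop-loop interactions. Second, the ligation reactions are performed in solution, where nonspecific ligation occurs; the results do not reflect the true structure of RNA in the cell and contain many false-positive intermolecular ligations. Third, the proportion of chimeric reads (products of ligation between different RNA fragments) in the sequencing data is low, so much of the data is uninformative. RNA proximity ligation (RPL) could in theory overcome these shortcomings, but because it lacks crosslinking and chimeric RNA enrichment, RPL can only identify intramolecular interactions and not intermolecular RNA-RNA interactions.
Recent high-throughput transcriptome sequencing has shown that more than 90% of the genome is transcribed, producing a vast number of non-coding RNAs, some of which, such as lncRNAs (long non-coding RNAs), are tightly associated with chromatin. lncRNAs are RNAs longer than 200 nt that do not encode proteins. NONCODE currently lists more than 160,000 human lncRNAs, eight times the number of protein-coding genes, yet the functions, targets and mechanisms of the vast majority of non-coding RNAs remain unclear. Commonly used methods for identifying lncRNA targets include CHIRP, CHART and RAP. Their principle is as follows: under physiological conditions, cells are first treated with formaldehyde to fix RNAs and their interacting target molecules; the cells are then lysed and the chromatin is fragmented by sonication or enzymes; biotin-modified DNA probes are used to enrich the DNA fragments that interact with the RNA of interest; adapters are added to the DNA fragments, followed by PCR amplification of the library and high-throughput sequencing; finally, bioinformatic analysis identifies the target DNA that interacts with a specific lncRNA. CHIRP, CHART and RAP consider only DNA targets and ignore functionally important RNA target sites, and each experiment identifies all potential DNA targets of only one lncRNA in the cell (one to all), so the throughput is too low. How to systematically identify all binding sites of all lncRNAs in the cell therefore remains an unsolved problem.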
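Disclosure of the Invention
In view of the problems of the above techniques, the present invention provides a new technique, RNA in situ conformation sequencing (RIC-seq). Its basic principle is to crosslink cells with formaldehyde to fix protein-mediated proximal RNA-RNA interactions, perforate the membranes while keeping the cells intact, digest with micrococcal nuclease to remove free fragments not protected by protein, and then label the RNA 3' ends with pCp-biotin and perform proximity ligation in situ. After digestion of the cells, chimeric RNAs containing C-biotin are purified and a strand-specific library is constructed; this step greatly increases the proportion of chimeric reads in the data, reduces uninformative data and lowers sequencing cost. Because RIC-seq ligates RNA to RNA in situ while preserving cell integrity, it captures all direct proximal RNA-RNA contacts in a single experiment and detects in situ the in vivo RNA targets of all lncRNAs (all to all). Most importantly, the higher-order structure of RNA can be reconstructed from the spatial proximity information.
In a first aspect, the present invention claims a method for capturing RNA higher-order structure in situ and/or identifying in situ RNA-RNA interactions (the RIC-seq method).
The claimed method for capturing RNA higher-order structure in situ and/or identifying in situ RNA-RNA interactions (the RIC-seq method) may comprise the following steps:
(1) Treating a cell or tissue sample to fix protein-mediated proximal RNA-RNA interactions. The tissue sample may have a volume of 1 cubic millimeter; "proximal" may mean within 50 Å.
(2) Perforating the membranes (cell membrane and nuclear membrane) while keeping the cells intact.
(3) Degrading free RNA that is not protected by protein.
(4) Labeling the 3' ends of the protein-protected RNA with "pCp-label 1" and performing proximity ligation in situ. "Proximity" may mean within 50 Å.
Here, "pCp-label 1" is a cytidine nucleotide that bears phosphate groups at both ends and carries label 1. Accordingly, "Cp-label 1" as used below is a cytidine nucleotide that bears a phosphate group at its 3' end and carries label 1, and "C-label 1" is a cytidine nucleotide that carries label 1.
In a specific embodiment of the invention, "pCp-label 1" is pCp-biotin. Accordingly, "Cp-label 1" is Cp-biotin and "C-label 1" is C-biotin.
pCp-biotin is a biotin-labeled cytidine nucleotide bearing phosphate groups at both ends. Accordingly, Cp-biotin as used below is a biotin-labeled cytidine nucleotide bearing a phosphate group at its 3' end, and C-biotin is a biotin-labeled cytidine nucleotide.
(5) After digestion of the cells, purifying the chimeric RNA containing "C-label 1" (chimeric RNA being the product of ligation between different RNA fragments), and constructing a strand-specific library.
(6) Performing high-throughput sequencing. RNA higher-order structure and/or RNA-RNA interactions can then be derived from the sequencing results.
Before step (1), the method may further comprise washing the cell or tissue sample, specifically as follows: add pre-chilled PBS (pH 7.4) to the cell or tissue sample, centrifuge at 4°C and 2500 rpm for 10 minutes, and remove the PBS to obtain the washed sample.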
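In step (1) of the method, the treatment of the cell or tissue sample is formaldehyde crosslinking.
Further, step (1) may be carried out by a method comprising the following step:
(a1) Placing the cell or tissue sample in formaldehyde solution at room temperature for 10 minutes. The formaldehyde solution is 1% (v/v) formaldehyde in PBS.
Further still, step (a1) may be followed by step (a2):
(a2) Adding glycine solution to the sample treated in step (a1) to stop the reaction and mixing for 10 minutes. The glycine solution is 0.125 mol/L glycine in DEPC water.
In step (2) of the method, the perforation solution used for membrane perforation is the Permeabilization solution.
Further, step (2) may be carried out by a method comprising the following step:
(b1) Placing the sample treated in step (1) in the Permeabilization solution at 0-4°C (for example on ice) for 15 minutes, mixing every 2 minutes. The solvent of the Permeabilization solution is 10 mM Tris-HCl buffer, pH 7.5, and the solutes and concentrations are: 10 mM NaCl, 0.5% (v/v) NP-40, 0.3% (v/v) Triton X-100, 0.1% (v/v) Tween 20, 1× protease inhibitors and 2 U/ml SUPERase·In™ RNase Inhibitor.
In a specific embodiment of the invention, the 1× protease inhibitors are a Sigma product, catalog no. P8340-5ML (components: AEBSF, Aprotinin, Bestatin, E-64, Leupeptin, Pepstatin A). Other products of the same composition may of course be used.
In a specific embodiment of the invention, the SUPERase·In™ RNase Inhibitor is a Thermo Fisher product, catalog no. AM2694. Other products of the same composition may of course be used.
Further still, step (b1) may be followed by step (b2):
(b2) Washing the sample treated in step (b1) with 1× PNK solution. The solvent of the 1× PNK solution is 50 mM Tris-Cl buffer, pH 7.4, and the solutes and concentrations are: 10 mM MgCl2, 0.1 mg/ml BSA, 0.2% (v/v) NP-40.
In step (b2), the wash may be repeated, for example 3 times. Each wash may comprise: mixing by rotation (for example at 20 rpm) at 4°C for 5 minutes, centrifuging at 4°C and 3500 rpm for 5 minutes, and removing the wash solution.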
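In step (3) of the method, micrococcal nuclease is used to degrade the free RNA that is not protected by protein.
Further, step (3) may be carried out by a method comprising the following step:
(c1) Reacting the sample treated in step (2) in 1× micrococcal nuclease solution. The concentration of micrococcal nuclease in the 1× micrococcal nuclease solution may be 0.03 U/μl. The reaction conditions may be: 37°C for 10 minutes, shaking at 1000 rpm for 15 seconds every 2 minutes.
Further still, step (c1) may be followed by step (c2):
(c2) Washing the sample treated in step (c1) with 1× PNK+EGTA solution and then with 1× PNK solution. The solvent of the 1× PNK+EGTA solution is 50 mM Tris-Cl buffer, pH 7.4, and the solutes and concentrations are: 20 mM EGTA, 0.5% (v/v) NP-40. The solvent of the 1× PNK solution is 50 mM Tris-Cl buffer, pH 7.4, and the solutes and concentrations are: 10 mM MgCl2, 0.1 mg/ml BSA, 0.2% (v/v) NP-40.
In step (c2), the washes may be repeated, for example twice with the 1× PNK+EGTA solution and twice with the 1× PNK solution. Each wash may comprise: mixing by rotation (for example at 20 rpm) at 4°C for 5 minutes, centrifuging at 4°C and 3500 rpm for 5 minutes, and removing the wash solution.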
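In the method, step (4) may be carried out by a method comprising the following steps:
(d1) Converting the 3' ends of the protein-protected RNA to hydroxyl groups.
Further, this may be achieved by treating the sample from step (3) with alkaline phosphatase.
Further still, in this alkaline phosphatase treatment the alkaline phosphatase content of the reaction may be 0.1 U/μl. The reaction conditions may be: 37°C for 10 minutes, shaking at 1000 rpm for 15 seconds every 3 minutes.
Further still, the reaction of step (d1) may be followed by washing, specifically washing the cell sample successively with the 1× PNK+EGTA solution (recipe as above), High-salt solution and 1× PNK solution. The solvent of the High-salt solution is 5× PBS (no Mg2+, Ca2+), that is, 5× PBS buffer (pH 7.4): NaCl 685 mmol/L, KCl 13.5 mmol/L, Na2HPO4 50 mmol/L, KH2PO4 10 mmol/L; the solute and its concentration are 0.5% (v/v) NP-40. The solvent of this 1× PNK solution is 50 mM Tris-Cl buffer, pH 7.4, and the solutes and concentrations are: 10 mM MgCl2, 0.1 mg/ml BSA, 0.05% (v/v) NP-40. The washes may be repeated, for example twice with the 1× PNK+EGTA solution, twice with the High-salt solution and twice with the 1× PNK solution. Each wash may comprise: mixing by rotation (for example at 20 rpm) at 4°C for 5 minutes, centrifuging at 4°C and 3500 rpm for 5 minutes, and removing the wash solution.
(d2) Labeling the 3' ends of the RNA with Cp-biotin.
Further, this may be achieved by adding pCp-biotin to the sample from step (d1) and performing a ligation reaction.
Further still, the enzyme used for the ligation may be T4 RNA ligase. In the reaction, the final concentration of pCp-biotin may be 40 μM and the final concentration of T4 RNA ligase may be 1 U/μl. The reaction conditions may be: 16°C for 12-16 h, shaking at 1000 rpm for 15 seconds every 3 minutes.
Further still, the reaction of step (d2) may be followed by washing, specifically washing the cell sample with the 1× PNK solution (recipe given before Example 1 in the detailed description). The wash may be repeated, for example 3 times. Each wash may comprise: mixing by rotation (for example at 20 rpm) at 4°C for 5 minutes, centrifuging at 4°C and 3500 rpm for 5 minutes, and removing the wash solution.
(d3) Converting the phosphate group of the Cp-biotin at the RNA 3' ends to a hydroxyl group.
Further, this may be achieved by treating the sample from step (d2) with alkaline phosphatase.
Further still, in this alkaline phosphatase treatment the alkaline phosphatase content of the reaction may be 0.1 U/μl. The reaction conditions may be: 37°C for 10 minutes, shaking at 1000 rpm for 15 seconds every 3 minutes.
Further still, the reaction of step (d3) may be followed by washing, specifically washing the cell sample successively with the 1× PNK+EGTA solution (recipe as above), the High-salt solution (recipe as above) and the 1× PNK solution (recipe as in step (d1)). The washes may be repeated, for example twice with the 1× PNK+EGTA solution, twice with the High-salt solution and twice with the 1× PNK solution. Each wash may comprise: mixing by rotation (for example at 20 rpm) at 4°C for 5 minutes, centrifuging at 4°C and 3500 rpm for 5 minutes, and removing the wash solution.
(d4) Phosphorylating the 5' ends of the RNA.
Further, this may be achieved by treating the sample from step (d3) with T4 PNK.
Further still, in this T4 PNK treatment the T4 PNK content of the reaction may be 1 U/μl. The reaction conditions may be: 37°C for 45 minutes, shaking at 1000 rpm for 15 seconds every 3 minutes.
Further still, the reaction of step (d4) may be followed by washing, specifically washing successively with the 1× PNK+EGTA solution (recipe as above) and the 1× PNK solution (recipe as in step (d1)). The washes may be repeated, for example twice with the 1× PNK+EGTA solution and twice with the 1× PNK solution. Each wash may comprise: mixing by rotation (for example at 20 rpm) at 4°C for 5 minutes, centrifuging at 4°C and 3500 rpm for 5 minutes, and removing the wash solution.
(d5) Performing proximity ligation in situ. "Proximity" may mean within 50 Å.
Further, this is achieved by adding T4 RNA ligase to the sample from step (d4).
Further still, when T4 RNA ligase is added to the sample from step (d4), the T4 RNA ligase content of the reaction may be 0.5 U/μl. The reaction conditions may be: 16°C for 12-16 h, shaking at 1000 rpm for 15 seconds every 3 minutes.
Further still, the reaction of step (d5) may be followed by washing, specifically washing the cell sample with the 1× PNK solution (recipe as above). The wash may be repeated, for example 3 times. Each wash may comprise: mixing by rotation (for example at 20 rpm) at 4°C for 5 minutes, centrifuging at 4°C and 3500 rpm for 5 minutes, and removing the wash solution.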
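In the method, step (5) may be carried out by a method comprising the following steps:
(e1) Digesting the cells with proteinase K.
Further, in this digestion the proteinase K content of the reaction may be 0.12 U/μl. The reaction conditions may be: 37°C for 60 minutes, then 56°C for 15 minutes.
(e2) Extracting total RNA and fragmenting it.
In this step, total RNA may be extracted with TRIzol LS and chloroform. To precipitate the RNA, 500 μl isopropanol and 15 μg glycoblue may be added, followed by incubation at -20°C overnight.
Further, after the total RNA is obtained, the method may also comprise removing genomic DNA (for example by DNase I treatment) and removing ribosomal RNA (for example with probes complementary to ribosomal RNA).
Ribosomal RNA may be removed with complementary DNA probes as follows: add an equal mass of ribosomal RNA probes to the RNA, incubate at 95°C for 2 minutes, cool to 22°C at -0.1°C/s, and incubate at 22°C for 5 minutes (the sample may be placed on ice immediately afterwards). Degrade the RNA in the DNA:RNA hybrids (for example by adding RNase H), then degrade the DNA probes (for example by adding Turbo DNase). Then purify the RNA (for example with the Zymo RNA clean kit).
In this step, the RNA may be fragmented by alkaline hydrolysis. In a specific embodiment of the invention, the RNA is fragmented in 1× first strand buffer (recipe: 50 mM Tris-HCl, pH 8.3; 75 mM KCl; 3 mM MgCl2) at 94°C for 5 minutes in a PCR machine.
(e3) Enriching the chimeric RNA labeled with "C-label 1" (such as C-biotin) using magnetic beads carrying immobilized label 2; label 2 is capable of binding specifically to label 1.
In a specific embodiment of the invention, label 1 is biotin and label 2 is streptavidin. The magnetic beads carrying immobilized label 2 are thus streptavidin magnetic beads.
In this step, the streptavidin beads are blocked before being used to enrich the C-biotin-labeled chimeric RNA. The procedure may be as follows: take 20 μl C1 beads, place the tube on a magnetic rack until the solution clears and remove the supernatant; add 20 μl solution A, resuspend the beads, leave at room temperature for 2 minutes, place the tube on the magnetic rack until the solution clears and remove the supernatant; repeat this step once; add 20 μl solution B, resuspend the beads, place the tube on the magnetic rack until the solution clears and remove the supernatant; add 32 μl yeast RNA (50 μg), 68 μl DEPC water and 100 μl 2× TWB solution, resuspend the beads and rotate the tube on a rotary mixer for 1 hour; then place the tube on the magnetic rack until the solution clears and remove the supernatant; add 500 μl 1× TWB solution, resuspend the beads, place the tube on the magnetic rack until the solution clears and remove the supernatant; repeat this step twice.
In this step, after the streptavidin beads have been used to enrich the C-biotin-labeled chimeric RNA, the RNA is eluted from the beads.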
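(e4) Constructing the strand-specific library.
This step mainly comprises: first-strand cDNA synthesis; second-strand DNA synthesis; end repair of the dsDNA; A-tailing of the end-repaired DNA; adapter ligation; PCR amplification using the adapter-ligated DNA as template, followed by recovery of PCR products of a defined size from an agarose gel to obtain the strand-specific library; and high-throughput sequencing. These are all routine operations in the art. A conventional procedure for constructing strand-specific libraries is described in "Levin, J.Z., Yassour, M., Adiconis, X., Nusbaum, C., Thompson, D.A., Friedman, N., Gnirke, A., and Regev, A. (2010). Comprehensive comparative analysis of strand-specific RNA sequencing methods. Nature Methods 7, 709-715."
In a specific embodiment of the invention, second-strand DNA synthesis uses a 25 mM mixture of dNTPs and dUTP in which the molar ratio of dTTP to dUTP is 4:1.
In this step, a DNA purification step may be included between second-strand synthesis and end repair, between end repair and A-tailing, and after adapter ligation. The purification may be bead-based and may proceed as follows: equilibrate AMPure XP beads (XP beads for short) at room temperature with mixing for 30 minutes in advance. Add the XP beads to the eluate and mix gently. Leave at room temperature for 5 minutes, transfer to a magnetic rack for 5 minutes, remove the supernatant, and rinse the beads twice with fresh 80% ethanol. Air-dry the beads on the magnetic rack for 2 minutes, add TE buffer to resuspend the beads, and pipette up and down 50 times. Leave at room temperature for 5 minutes, then place on the magnetic rack for 5 minutes and collect the supernatant, which is the purified DNA product. The DNA purification (for example bead purification) after adapter ligation may be performed twice.
In a specific embodiment of the invention, the forward and reverse primers used for the PCR amplification in this step are the primer pair consisting of the two single-stranded DNAs shown in SEQ ID No.1 and SEQ ID No.2. Specifically, the PCR reaction contains: 15.7 μl supernatant (obtained from the bead purification after adapter ligation), 2.5 μl 10× Pfx buffer (Invitrogen), 1 μl each of the 10 μM forward and reverse primers (SEQ ID No.1 and SEQ ID No.2), 1 μl 50 mM MgSO4 solution, 0.4 μl 25 mM dNTP, 0.4 μl Pfx enzyme (Invitrogen) and 3 μl USER enzyme (NEB). The PCR program is: 37°C for 15 minutes; 94°C for 2 minutes; 12 cycles of denaturation at 94°C for 15 seconds, annealing at 62°C for 30 seconds and extension at 72°C for 30 seconds; 72°C for 10 minutes.
In step (6) of the method, the library obtained in step (5) may be sequenced on an Illumina HiSeq X Ten sequencer, for example by PE150 paired-end sequencing.
In the method, the maximum starting amount of cells is 1×10^7 cells.
Further, the cells may be animal cells (such as human cells) and the tissue may be animal tissue. In a specific embodiment of the invention, the cells are HeLa cells.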
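In a second aspect, the present invention claims a library construction method.
The claimed library construction method comprises steps (1) to (5) of the method of the first aspect.
In a third aspect, the present invention claims the use of a library constructed by the method of the second aspect for capturing RNA higher-order structure in situ and/or identifying in situ RNA-RNA interactions.
In a fourth aspect, the present invention also claims any of the following uses:
(A1) Use of the method of the first aspect for identifying lncRNA targets in living cells.
(A2) Use of pCp-biotin for identifying proximal RNA-RNA interactions, where "proximal" may mean within 50 Å.
(A3) Use of pCp-biotin in in situ proximity ligation of RNA, where "proximity" may mean within 50 Å.
(A4) Use of pCp-biotin in the enrichment of chimeric RNA.
In a fifth aspect, the present invention also claims any of the following:
(B1) A detergent, which is the Permeabilization solution described above.
(B2) Use of the detergent of (B1) as an aid in perforating the membranes of cells.
(B3) Use of micrococcal nuclease, alkaline phosphatase and/or T4 polynucleotide kinase (T4 PNK) in in situ ligation of RNA (such as in situ proximity ligation).
(B4) Use of proteinase K combined with heating for extracting RNA from formaldehyde-fixed cell or tissue samples, where the heating is a reaction at 37°C for 60 minutes followed by 56°C for 15 minutes.
In the present invention, the in situ ligation is in situ ligation under non-denaturing conditions.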
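Figure 1 shows the RIC-seq workflow and data validation. A: schematic of the RIC-seq technique. The in situ part comprises formaldehyde crosslinking, cell permeabilization, RNA digestion, pCp-biotin labeling and proximity ligation. The in vitro part comprises enrichment of chimeric RNA and construction of paired-end strand-specific libraries. RBP denotes RNA-binding protein. B: base composition at the junctions of chimeric RNAs. C: comparison of RIC-seq chimeric RNAs (light-colored) with the known structure of miR-3064 (arcs). Blank regions between chimeric RNA arms represent gaps. The stem-loop structure formed by one RIC-seq read is shown below; the filled circle marks where pCp-biotin was inserted. D: pCp-biotin is enriched in the terminal loop of miRNA precursors. E-G: RIC-seq recapitulates the known structures and interactions of U4, U6, RPPH1 and U3 snoRNA. H: RIC-seq identifies U1 binding sites on MALAT1. Dark shaded regions are sites identified by RIC-seq, PARIS and RAP alike; light shaded regions are sites identified specifically by RIC-seq. The dashed box is shown in I. I: the new U1-MALAT1 interaction site is conserved and supported by a cluster of chimeric RNAs (arrow). The U1 motif is indicated by a filled box. J: RNA map showing all RNA-RNA interactions in HeLa cells; the lower left shows the +pCp sample and the upper right the -pCp sample. The NEAT1-MALAT1 interaction is enlarged on the right. K: super-resolution imaging shows that MALAT1 colocalizes with the 5' end of NEAT1. Boxes 1 and 2 are enlarged in the middle. Boxes 3 and 4 show sites of direct interaction. The relative distances between them are shown on the right.
Figure 2 shows the RIC-seq analysis pipeline and reproducibility. A: RIC-seq data analysis pipeline. PCR duplicates, adapter sequences and reads containing poly-N are removed first, and the reads are then aligned to the hg19 reference genome with STAR. B: the two RIC-seq replicates are highly correlated. C: the cell-mixing strategy gives an overall false-positive rate of 0.6% for RIC-seq (dashed box). D: RIC-seq captures snoRNA interaction and modification sites on 28S rRNA. Arrows indicate known modification sites; boxed regions represent the D' box. E: genomic distribution of the snoRNA interaction sites detected by RIC-seq. F: sites where SNORD22 interacts with the SPHK2 and BCL2L2 genes. Underlined regions indicate the D box. G: pie chart of intramolecular versus intermolecular RNA-RNA interactions. H: violin plot showing that the target genes of MALAT1 and NEAT1 are expressed at higher levels than other genes. Significance was computed with a two-tailed t-test. I: motifs enriched in MALAT1 or NEAT1 targets. J: summary of MALAT1 foci in 15 cells. K: summary of NEAT1 foci detected by smFISH in 15 cells and their overlap with MALAT1 foci. Dark bars represent NEAT1 foci that directly overlap MALAT1 foci.
Figure 3 shows that RIC-seq accurately captures the 3D structure of 28S rRNA. A: physical interaction map drawn from the cryo-EM structure of 28S rRNA. Light gray regions indicate spatial distances greater than [value missing] (distal). Within [value missing], Watson-Crick base interactions, non-Watson-Crick base interactions, interactions involving both Watson-Crick and non-Watson-Crick base pairing, and other types of proximal interactions are shown in different colors. "Not available" indicates that no structural data are available. B: 3D map of 28S rRNA drawn from HeLa RIC-seq data. The boxed regions show Watson-Crick base-pairing interactions and long-range non-Watson-Crick base-pairing interactions. C: true-positive and true-negative data sets generated from the cryo-EM structure of 28S rRNA. Dark color represents true positives and light color true negatives. D: ROC (receiver operating characteristic) analysis of the accuracy of RIC-seq in predicting the 3D structure of 28S rRNA. RIC-seq is shown as the dark line and PARIS as the light line. The dashed line is the random expectation. Parts missing from the cryo-EM structure were not used to generate the ROC curves.
Figure 4 shows topological domains and folding principles of RNA in vivo. A: topological domains (dashed triangles) observed in the PDE3A, IMMP2L, FTX and PVT1 precursor RNAs. The heat maps were generated by dividing each transcript into 100 bins (each bin representing 1% of the length = 1 pixel) and normalizing the total to 1. B: contact probability as a function of linear distance for precursor mRNAs and lncRNAs. A slope of -1 is consistent with the theoretical fractal globule model. C: contact probability as a function of linear distance for mature mRNAs and lncRNAs. The dashed line has a slope of -1.
Figure 5 shows RNA 3D interaction maps drawn in different cell lines. A: interaction matrices of all RNAs in GM12878, IMR-90, H1 hESC, hNPC and HT29 cells. Below is an enlargement of RNA interactions in the region chr4:93.2Mb-94.8Mb. B: enlargement of RNA-RNA interactions shared by the 5 cell lines in A. C-D: cell line-specific and shared RNA-RNA interactions, respectively. RNA-seq, ChIP-seq and TAD signals are from H1 hESC ENCODE data. E: knockdown of LncPRESS2 with the Cas9-KRAB system. F: quantification of GRID2 and OCT4 expression after LncPRESS2 knockdown. *P<0.05, **P<0.01 and ***P<0.001, two-tailed t-test (n=3).
Figure 6 shows the characteristics of in situ RNA-RNA interactions. A: percentages of intrachromosomal and interchromosomal RNA-RNA interactions in 6 cell lines. B: percentages of intragenic and intergenic RNA-RNA interactions. C: percentage of intrachromosomal chimeric reads and the distances they span in 6 cell lines. Solid and dashed lines indicate intragenic and intergenic interactions, respectively. D: compartment distribution of Hi-C, RIC-seq and RNA-seq on chromosome 1; the data shown are from GM12878. E: percentages of intragenic and intergenic RNA-RNA interactions originating from different compartments.
Figure 7 shows the identification of cell-specific hub-RNAs. A: each RNA in HeLa cells is ranked by the density of its chimeric reads and the number of interacting genes; those ranked in the top 5% are considered hub-RNAs. GAPDH serves as a negative control. B: interactions of MALAT1, CCAT1 and PDE3A with RNAs on the 23 chromosomes. Arrows indicate the positions of the genes. C: meta-analysis of the intensity and distribution of RIC-seq signal for hub-RNAs and other RNAs. The RIC-seq signal around transcription start sites and transcription termination sites is shown. D: hub-RNAs are more conserved than other RNAs. E-H: distribution of ChIP-seq signal for RNA polymerase II, H3K4me3, H3K27ac and H3K27me3 on hub-RNAs and other RNAs. I: most hub-RNAs are cell-specific.
Figure 8 shows that the hub-RNA CCAT1-5L cooperates with MYC promoter and enhancer RNAs to positively regulate MYC expression. A: distribution of RIC-seq, RNA-seq and H3K27ac signal across 8q24. The CCAT1 transcripts obtained by 5' and 3' RACE are shown below. The Northern blot probes are marked with black lines. The CCAT1, MYC and PVT1 genes are enlarged and labeled in the figure. The chimeric reads between CCAT1-5L and MYC are shown. B: Northern blot analysis of CCAT1-5L expression in different cell lines. Detection with the 5L probe shows that CCAT1-5L is expressed only in HeLa cells. 18S rRNA and 28S rRNA serve as loading controls. C: smFISH shows that CCAT1-5L localizes to the nucleus. Probes for CCAT1-5L, CCAT1-Exon2 (CCAT1-E) and the 5' end of NEAT1 are shown in different colors. Scale bar: 5 μm. D: MYC expression is significantly downregulated after CCAT1 knockdown with 5L-specific LNA probes. 5L- and Exon2-specific primers were used to measure CCAT1 expression. E: smFISH detection of the colocalization of CCAT1-5L, MYC promoter RNA and MYC enhancer RNA. CCAT1, MYC and PVT1 are shown in different colors. Scale bar: 5 μm. F: effect of CCAT1-5L knockdown or ectopic expression on cell proliferation rate. G: CCAT1-5L knockdown or ectopic expression affects colony formation. H: Transwell assays show that CCAT1-5L is important for cell invasion and metastasis. Scale bar: 50 μm. *P<0.05 and **P<0.01, two-tailed t-test (n=3).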
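Best Mode for Carrying Out the Invention
The following examples are provided for a better understanding of the invention and do not limit it. Unless otherwise specified, the experimental methods in the examples are conventional methods, and the materials used were purchased from conventional biochemical reagent suppliers.
The solutions used in the following examples are formulated as follows:
PBS buffer (pH 7.4): the solvent is water, and the solutes and concentrations are: NaCl 137 mmol/L, KCl 2.7 mmol/L, Na2HPO4 10 mmol/L, KH2PO4 2 mmol/L.
1× PNK solution: the solvent is 50 mM Tris-Cl buffer, pH 7.4, and the solutes and concentrations are: 10 mM MgCl2, 0.1 mg/ml BSA, 0.2% (v/v) NP-40.
1× PNK+EGTA solution: the solvent is 50 mM Tris-Cl buffer, pH 7.4, and the solutes and concentrations are: 20 mM EGTA, 0.5% (v/v) NP-40.
High-salt solution: the solvent is 5× PBS (no Mg2+, Ca2+), and the solute and its concentration are: 0.5% (v/v) NP-40. Here, 5× PBS (no Mg2+, Ca2+) is 5× PBS buffer (pH 7.4): NaCl 685 mmol/L, KCl 13.5 mmol/L, Na2HPO4 50 mmol/L, KH2PO4 10 mmol/L.
Permeabilization solution: 10 mM Tris-HCl (pH 7.5), 10 mM NaCl, 0.5% (v/v) NP-40, 0.3% (v/v) Triton X-100, 0.1% (v/v) Tween 20, 1× protease inhibitors (Sigma, catalog no. P8340-5ML; components: AEBSF, Aprotinin, Bestatin hydrochloride, E-64, Leupeptin hemisulfate salt and Pepstatin A) and 2 U/ml SUPERase·In™ RNase Inhibitor (Thermo Fisher, catalog no. AM2694).
1× MN reaction solution: the solvent is 50 mM Tris-Cl buffer, pH 8.0, and the solute and its concentration are: 5 mM CaCl2.
Proteinase K solution: the solvent is 10 mM Tris-Cl buffer, pH 7.5, and the solutes and concentrations are: 10 mM EDTA, 0.5% (v/v) SDS.
5× hybridization solution: 1 M NaCl, 500 mM Tris-HCl (pH 7.4).
Solution A: 0.1 M NaOH, 0.05 M NaCl.
Solution B: 0.1 M NaCl.
2× TWB solution: 10 mM Tris-HCl (pH 7.5), 1 mM EDTA, 2 M NaCl, 0.02% (v/v) Tween 20.
PK solution: 100 mM NaCl, 10 mM Tris-HCl (pH 7.0), 1 mM EDTA, 0.5% (0.5 g/100 mL) SDS.
TE buffer: 10 mM Tris-HCl (pH 8.0), 1 mM EDTA.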
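As a quick consistency check on the recipes above, the 5× PBS concentrations should be exactly five times the 1× PBS values. The sketch below encodes the two recipes as dictionaries and verifies this; the helper names are illustrative. The text does not give a recipe for the 1× TWB solution used later, so deriving it as a two-fold dilution of 2× TWB is an assumption.

```python
# Concentrations in mmol/L, copied from the recipes above.
PBS_1X_MM = {"NaCl": 137.0, "KCl": 2.7, "Na2HPO4": 10.0, "KH2PO4": 2.0}
PBS_5X_MM = {"NaCl": 685.0, "KCl": 13.5, "Na2HPO4": 50.0, "KH2PO4": 10.0}

# 2x TWB as listed above (mM for salts and buffer, % v/v for Tween 20).
TWB_2X = {"Tris-HCl_mM": 10.0, "EDTA_mM": 1.0, "NaCl_mM": 2000.0, "Tween20_pct": 0.02}


def scale(recipe, factor):
    """Multiply every solute concentration by `factor`."""
    return {name: conc * factor for name, conc in recipe.items()}


def same_recipe(a, b, tol=1e-9):
    """True if both recipes list the same solutes at the same concentrations."""
    return a.keys() == b.keys() and all(abs(a[k] - b[k]) <= tol for k in a)


pbs_is_consistent = same_recipe(scale(PBS_1X_MM, 5), PBS_5X_MM)

# Assumption: 1x TWB is simply 2x TWB diluted 1:1 with water.
TWB_1X = scale(TWB_2X, 0.5)
```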
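Example 1. Method for preparing RIC-seq libraries
The RIC-seq library construction workflow of the present invention is shown in A of Figure 1. It comprises culturing cells, formaldehyde crosslinking, perforation of the cell and nuclear membranes, MNase treatment, conversion of RNA 3' ends to hydroxyl groups, pCp-biotin ligation, conversion of RNA 3' ends to hydroxyl groups and phosphorylation of 5' ends, proximity ligation, total RNA extraction, removal of genomic DNA with DNase I, ribosomal RNA removal, RNA fragmentation, enrichment on C1 beads and elution of the enriched RNA, first-strand cDNA synthesis, second-strand DNA synthesis, end repair, A-tailing, adapter ligation and PCR amplification. The specific steps are as follows:
1. Take a 15 cm dish of cells at about 80-90% confluence, discard the medium, wash the cells with 10 ml pre-chilled PBS (pH 7.4) and discard the PBS; repeat this step 3 times to obtain washed cells.
2. After step 1, add 10 ml of 1% (v/v) formaldehyde solution (in PBS) to the washed cells from step 1 and leave at room temperature for 10 minutes. Then add glycine solution (final concentration 0.125 mol/L, in DEPC water) to stop the reaction and leave at room temperature for 10 minutes to obtain crosslinked and quenched cells.
3. After step 2, add 10 ml pre-chilled PBS (pH 7.4) to the crosslinked and quenched cells from step 2 and wash 3 times. Scrape the cells off with a cell scraper and transfer them to a 50 ml centrifuge tube, centrifuge at 4°C and 2500 rpm for 10 minutes, and discard the supernatant. Resuspend the cell pellet in 2 ml pre-chilled PBS (pH 7.4), transfer the suspension to two 1.5 ml tubes (1 ml each), centrifuge at 4°C and 2500 rpm for 10 minutes, and discard the supernatant. Proceed to the next step or store the cell pellet at -80°C.
4. After step 3, add 1 ml Permeabilization solution to the cell pellet from step 3 and incubate on ice for 15 minutes, mixing every 2 minutes. Centrifuge at 4°C and 3500 rpm for 5 minutes and discard the supernatant. Resuspend the cell pellet in 600 μl 1× PNK solution, mix by rotation (20 rpm) at 4°C for 5 minutes, centrifuge at 4°C and 3500 rpm for 5 minutes, and discard the supernatant; repeat this step twice.
5. After step 4, add to the cell pellet from step 4 200 μl micrococcal nuclease (Thermo Fisher, catalog no. EN0181) diluted 1:10000 by volume in 1× MN reaction solution (working MNase concentration 0.03 U/μl) and resuspend the pellet. React in a ThermoMixer at 37°C for 10 minutes, programmed to shake at 1000 rpm for 15 seconds every 2 minutes. After the reaction, centrifuge at 4°C and 3500 rpm for 5 minutes and discard the supernatant. Resuspend the cell pellet in 600 μl 1× PNK+EGTA solution, mix by rotation (20 rpm) at 4°C for 5 minutes, centrifuge at 4°C and 3500 rpm for 5 minutes, and discard the supernatant; repeat this step once. Resuspend the cell pellet in 600 μl 1× PNK solution, mix by rotation (20 rpm) at 4°C for 5 minutes, centrifuge at 4°C and 3500 rpm for 5 minutes, and discard the supernatant; repeat this step once.
6. After step 5, add to the cell pellet from step 5 10 μl 10× FastAP buffer (Thermo Fisher), 10 μl Fast Alkaline Phosphatase (Thermo Fisher, catalog no. EF0651; final concentration in the reaction 0.1 U/μl) and 80 μl DEPC water, and resuspend the pellet. React in a ThermoMixer at 37°C for 10 minutes, programmed to shake at 1000 rpm for 15 seconds every 3 minutes. After the reaction, centrifuge at 4°C and 3500 rpm for 5 minutes and discard the supernatant. Resuspend the cell pellet in 600 μl 1× PNK+EGTA solution, mix by rotation (20 rpm) at 4°C for 5 minutes, centrifuge at 4°C and 3500 rpm for 5 minutes, and discard the supernatant; repeat this step once. Resuspend the cell pellet in 600 μl High-salt solution, mix by rotation (20 rpm) at 4°C for 5 minutes, centrifuge at 4°C and 3500 rpm for 5 minutes, and discard the supernatant; repeat this step once. Resuspend the cell pellet in 600 μl 1× PNK solution (as in the recipe above except that NP-40 is adjusted to 0.05% v/v; all other components and concentrations unchanged), mix by rotation (20 rpm) at 4°C for 5 minutes, centrifuge at 4°C and 3500 rpm for 5 minutes, and discard the supernatant; repeat this step once.
7. After step 6, add to the cell pellet from step 6 10 μl 10× RNA ligase reaction buffer (Thermo Fisher), 6 μl RNase inhibitor, 4 μl Biotinylated Cytidine (Bis)phosphate (pCp-biotin; Thermo Fisher, catalog no. 20160) (1 mM), 10 μl T4 RNA ligase (Thermo Fisher, catalog no. EL0021; final concentration in the reaction 1 U/μl), 20 μl DEPC water and 50 μl 30% PEG, and resuspend the pellet. React in a ThermoMixer at 16°C overnight, programmed to shake at 1000 rpm for 15 seconds every 3 minutes. After the reaction, centrifuge at 4°C and 3500 rpm for 5 minutes and discard the supernatant. Resuspend the cell pellet in 600 μl 1× PNK solution, mix by rotation (20 rpm) at 4°C for 5 minutes, centrifuge at 4°C and 3500 rpm for 5 minutes, and discard the supernatant; repeat this step twice.
8. After step 7, add to the cell pellet from step 7 10 μl 10× FastAP buffer (Thermo Fisher), 10 μl Fast Alkaline Phosphatase (Thermo Fisher, catalog no. EF0651; final concentration in the reaction 0.1 U/μl) and 80 μl DEPC water, and resuspend the pellet. React in a ThermoMixer at 37°C for 10 minutes, programmed to shake at 1000 rpm for 15 seconds every 3 minutes. After the reaction, centrifuge at 4°C and 3500 rpm for 5 minutes and discard the supernatant. Resuspend the cell pellet in 600 μl 1× PNK+EGTA solution, mix by rotation (20 rpm) at 4°C for 5 minutes, centrifuge at 4°C and 3500 rpm for 5 minutes, and discard the supernatant; repeat this step once. Resuspend the cell pellet in 600 μl High-salt solution, mix by rotation (20 rpm) at 4°C for 5 minutes, centrifuge at 4°C and 3500 rpm for 5 minutes, and discard the supernatant; repeat this step once. Resuspend the cell pellet in 600 μl 1× PNK solution (as in the recipe above except that NP-40 is adjusted to 0.05% v/v; all other components and concentrations unchanged), mix by rotation (20 rpm) at 4°C for 5 minutes, centrifuge at 4°C and 3500 rpm for 5 minutes, and discard the supernatant; repeat this step twice.
9. After step 8, add to the cell pellet from step 8 10 μl 10× PNK buffer (Thermo Fisher), 15 μl 10 mM ATP, 10 μl T4 PNK (Thermo Fisher, catalog no. EK0032; final concentration in the reaction 1 U/μl) and 65 μl DEPC water, and resuspend the pellet. React in a ThermoMixer at 37°C for 45 minutes, programmed to shake at 1000 rpm for 15 seconds every 3 minutes. After the reaction, centrifuge at 4°C and 3500 rpm for 5 minutes and discard the supernatant. Resuspend the cell pellet in 600 μl 1× PNK+EGTA solution, mix by rotation at 4°C for 5 minutes, centrifuge at 4°C and 3500 rpm for 5 minutes, and discard the supernatant; repeat this step once. Resuspend the cell pellet in 600 μl 1× PNK solution (as in the recipe above except that NP-40 is adjusted to 0.05% v/v; all other components and concentrations unchanged), mix by rotation (20 rpm) at 4°C for 5 minutes, centrifuge at 4°C and 3500 rpm for 5 minutes, and discard the supernatant; repeat this step once.
10. After step 9, add to the cell pellet from step 9 20 μl 10× RNA ligase reaction buffer (Thermo Fisher), 8 μl RNase inhibitor, 10 μl T4 RNA ligase (Thermo Fisher, catalog no. EL0021; final concentration in the reaction 0.5 U/μl), 20 μl BSA (1 mg/ml) and 142 μl DEPC water, and resuspend the pellet. React in a ThermoMixer at 16°C overnight, programmed to shake at 1000 rpm for 15 seconds every 3 minutes. After the reaction, centrifuge at 4°C and 3500 rpm for 5 minutes and discard the supernatant. Resuspend the cell pellet in 600 μl 1× PNK solution, mix by rotation (20 rpm) at 4°C for 5 minutes, centrifuge at 4°C and 3500 rpm for 5 minutes, and discard the supernatant; repeat this step twice.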
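11. After step 10, add to the cell pellet from step 10 200 μl Proteinase K solution and 50 μl proteinase K (Takara, catalog no. 9034; final concentration in the reaction 0.12 U/μl) and mix. React in a ThermoMixer at 37°C for 60 minutes and then at 56°C for 15 minutes. After the reaction, let the sample cool to room temperature, add 750 μl Trizol LS (Thermo Fisher, catalog no. 10296028), mix, and leave at room temperature for 5 minutes. Add 220 μl chloroform, mix vigorously for 15 seconds, and leave at room temperature for 3 minutes. Centrifuge at 4°C and 13000 rpm for 15 minutes, transfer the supernatant to a 1.5 ml tube, add 500 μl isopropanol and 1 μl glycoblue (15 μg/μl), mix, and precipitate overnight at -20°C.
12. After step 11, centrifuge the sample from step 11 at 4°C and 13000 rpm for 20 minutes and discard the supernatant. Wash the pellet with 500 μl 75% ethanol and centrifuge at 4°C and 13000 rpm for 5 minutes; repeat this step once. Air-dry the pellet, dissolve it in 20 μl DEPC water, and quantify 1 μl of the sample with a NanoDrop.
13. After step 12, take 20 μg total RNA from the sample obtained in step 12, add 10 μl 10× RQ1 DNase I buffer (Promega), 3 μl RNAsin (Thermo Fisher, catalog no. EO0381) and 5 μl DNase I (Promega, catalog no. M6101), and add water to a total volume of 100 μl. React in a ThermoMixer at 37°C for 20 minutes. After the reaction, add 100 μl DEPC water and then 200 μl acid phenol-chloroform, mix, leave at room temperature for 3 minutes, and centrifuge at 4°C and 13000 rpm for 15 minutes. Transfer the supernatant to a 1.5 ml tube, add 20 μl 3 M sodium acetate (pH 5.5), 1 μl glycoblue and 500 μl 100% ethanol, mix, and precipitate overnight at -20°C.
14. After step 13, centrifuge the sample from step 13 at 4°C and 13000 rpm for 20 minutes and discard the supernatant. Wash the pellet with 500 μl 75% ethanol and centrifuge at 4°C and 13000 rpm for 5 minutes; repeat this step once. Air-dry the pellet, dissolve it in 6 μl DEPC water, and transfer the sample to a PCR tube.
15. After step 14, add to the sample from step 14 10 μl rRNA probe mix (2 μg/μl; for probe design and synthesis see Adiconis, X., Borges-Rivera, D., Satija, R., DeLuca, D.S., Busby, M.A., Berlin, A.M., Sivachenko, A., Thompson, D.A., Wysoker, A., Fennell, T., et al. (2013). Comparative analysis of RNA sequencing methods for degraded or low-input samples. Nature Methods 10, 623-629) and 4 μl 5× Hybridization buffer, and mix. Place the PCR tube in a PCR machine and run the program: 95°C for 2 minutes, cool to 22°C at -0.1°C/s, 22°C for 5 minutes. Place the sample on ice immediately after the reaction.
16. After step 15, add to the sample from step 15 3 μl 10× RNase H buffer (Thermo Fisher), 5 μl RNase H (Thermo Fisher, catalog no. EN0202) (25 U) and 2 μl DEPC water, and mix. Place the sample in a PCR machine and run the program: 37°C for 30 minutes. Place the sample on ice immediately after the reaction.
17. After step 16, add to the sample from step 16 4 μl 10× TURBO buffer (Thermo Fisher), 5 μl TURBO DNase (Thermo Fisher, catalog no. AM2238; final concentration in the reaction 0.25 U/μl) and 1 μl DEPC water, and mix. Place the sample in a PCR machine and run the program: 37°C for 30 minutes. Place the sample on ice immediately after the reaction.
18. After step 17, transfer the sample from step 17 to a 1.5 ml tube, add 160 μl DEPC water and 200 μl acid phenol-chloroform, mix, leave at room temperature for 3 minutes, and centrifuge at 4°C and 13000 rpm for 15 minutes. Transfer the supernatant to a 1.5 ml tube, add 20 μl 3 M sodium acetate (pH 5.5), 1 μl glycoblue and 500 μl 100% ethanol, mix, and precipitate overnight at -20°C.
19. After step 18, centrifuge the sample from step 18 at 4°C and 13000 rpm for 20 minutes and discard the supernatant. Wash the pellet with 500 μl 75% ethanol and centrifuge at 4°C and 13000 rpm for 5 minutes; repeat this step once. Air-dry the pellet, dissolve it in 16 μl DEPC water, and transfer the sample to a PCR tube. Add 4 μl 5× First-strand buffer (Thermo Fisher, catalog no. 18064-014), mix, incubate in a PCR machine at 94°C for 5 minutes, and place on ice immediately after the reaction.
20. Take a 1.5 ml tube, add 20 μl C1 beads, place the tube on a magnetic rack until the solution clears and remove the supernatant. Add 20 μl solution A, resuspend the beads, leave at room temperature for 2 minutes, place the tube on the magnetic rack until the solution clears and remove the supernatant; repeat this step once. Add 20 μl solution B, resuspend the beads, place the tube on the magnetic rack until the solution clears and remove the supernatant. Add 32 μl yeast RNA (Roche, catalog no. 10109223001) (50 μg), 68 μl DEPC water and 100 μl 2× TWB solution, resuspend the beads, and rotate the tube on a rotary mixer for 1 hour. Then place the tube on the magnetic rack until the solution clears and remove the supernatant. Add 500 μl 1× TWB solution, resuspend the beads, place the tube on the magnetic rack until the solution clears and remove the supernatant; repeat this step twice.
21. Take the sample from step 19, add 30 μl DEPC water, and add the resulting 50 μl sample to the blocked beads. Mix and rotate at room temperature for 30 minutes. Place the tube on the magnetic rack until the solution clears, remove the supernatant, and wash 4 times with 500 μl 1× TWB solution.
22. After step 21, add 100 μl PK solution to the washed beads from step 21, mix, and shake in a ThermoMixer at 95°C and 1000 rpm for 10 minutes. Place the tube on the magnetic rack until the solution clears and transfer the supernatant to a new 1.5 ml tube. Add another 100 μl PK solution to the original tube, mix, and shake in a ThermoMixer at 95°C and 1000 rpm for 10 minutes; place the tube on the magnetic rack until the solution clears and transfer the supernatant to the same 1.5 ml tube. Add a further 100 μl PK solution to the original tube, mix, place the tube on the magnetic rack until the solution clears and transfer the supernatant to the same 1.5 ml tube, giving 300 μl eluate in total. Add 300 μl acid phenol-chloroform, mix, leave at room temperature for 3 minutes, and centrifuge at 4°C and 13000 rpm for 15 minutes. Transfer the supernatant to a new 1.5 ml tube, add 18 μl 5 M NaCl and mix, then add 1 μl glycoblue and 900 μl 100% ethanol, mix, and precipitate overnight at -20°C.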
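23. After step 22, centrifuge the sample from step 22 at 4°C and 13000 rpm for 20 minutes and discard the supernatant. Wash the pellet with 500 μl 75% ethanol and centrifuge at 4°C and 13000 rpm for 5 minutes; repeat this step once. Air-dry the pellet, dissolve it in 10 μl DEPC water, and transfer the sample to a PCR tube. Add 0.5 μl N6 primer (sequence NNNNNN, where N is A, T, C or G) (0.1 μg/μl), mix, incubate in a PCR machine at 65°C for 5 minutes, and place on ice immediately after the reaction.
24. After step 23, add to the sample from step 23 3 μl 5× First-strand buffer (Thermo Fisher, catalog no. 18064-014), 1 μl dNTP mix (10 mM), 0.5 μl 100 mM DTT, 0.5 μl RNase Inhibitor (40 U/μl) and 0.5 μl Superscript II (Thermo Fisher, catalog no. 18064-014) (200 U/μl), and mix. Place the PCR tube in a PCR machine and run the program: 25°C for 10 minutes, 42°C for 40 minutes, 70°C for 15 minutes. Place the sample on ice after the reaction.
25. After step 24, transfer the sample from step 24 to a new 1.5 ml tube and add 10 μl 5× Second-strand buffer (Thermo Fisher, catalog no. 10812-014), 0.8 μl dNTP (dUTP) (25 mM) (a 25 mM mixture of dNTPs and dUTP in which the molar ratio of dTTP to dUTP is 4:1), 0.2 μl RNase H (Thermo Fisher, catalog no. EN0202) (5 U/μl) and 2.5 μl DNA pol I (Enzymatics, catalog no. P705-500) (10 U/μl). Place the tube in a ThermoMixer and run the program: 16°C for 2 hours, shaking at 300 rpm for 15 seconds every 2 minutes.
26. After step 25, equilibrate AMPure XP beads (XP beads for short; Beckman) at room temperature with mixing for 30 minutes in advance. Add 90 μl (1.8×) XP beads to the reaction from step 25 and mix gently. Leave at room temperature for 5 minutes, transfer to a magnetic rack for 5 minutes, remove the supernatant, and rinse the beads twice with 200 μl fresh 80% ethanol. Air-dry the beads on the magnetic rack for 2 minutes, add 43 μl TE buffer to resuspend the beads, and pipette up and down 50 times. Leave at room temperature for 5 minutes, then place on the magnetic rack for 5 minutes and transfer 42 μl of the supernatant to a 1.5 ml tube.
27. After step 26, add to the sample from step 26 5 μl 10× PNK solution (the reaction buffer supplied with T4 PNK), 0.4 μl dNTPs (25 mM), 1.2 μl T4 DNA polymerase (Enzymatics, catalog no. P7080L) (3 U/μl), 0.2 μl Klenow fragment (Enzymatics, catalog no. P7060L) (5 U/μl) and 1.2 μl T4 PNK (Enzymatics, catalog no. Y9040L) (10 U/μl), and mix. Place the tube in a ThermoMixer and react at 20°C for 30 minutes. After the reaction, purify with 90 μl AMPure XP beads as in step 26, elute with 20.5 μl TE buffer, and transfer 19.7 μl of the supernatant to a new 1.5 ml tube.
28. After step 27, add to the sample from step 27 2.3 μl 10× blue buffer (Enzymatics, catalog no. B0110L), 0.5 μl dATP (5 mM) and 0.5 μl Klenow exo- (3' to 5' exo minus) (Enzymatics, catalog no. P7010-LC-L) (5 U/μl), and mix. Place the tube in a ThermoMixer and react at 37°C for 30 minutes.
29. After step 28, add to the sample from step 28 1.4 μl 2× Rapid ligation buffer (Enzymatics, catalog no. B1010L), 0.1 μl 10 mM ATP, 1 μl Adapter (PEI Adapter oligo A: GATCGGAAGAGCACACGTCT; PEI Adapter oligo B: ACACTCTTTCCCTACACGACGCTCTTCCGATCT; the adapter used in the reaction is formed by annealing the two oligos) (2 μM) and 1 μl T4 quick DNA ligase (Enzymatics, catalog no. L6030-HC-L) (600 U/μl), and mix. Place the tube in a ThermoMixer and react at 20°C for 15 minutes. After the reaction, purify with 47.7 μl AMPure XP beads as in step 26, elute with 26 μl TE buffer, and transfer 25 μl of the supernatant to a new 1.5 ml tube. Purify a second time with 45 μl AMPure XP beads as in step 26, elute with 16.5 μl TE buffer, and transfer 15.7 μl of the supernatant to a PCR tube.
30. After step 29, perform PCR in the PCR tube using the supernatant from step 29 as template to obtain the PCR reaction (25 μl).
The 25 μl PCR reaction contains: 15.7 μl supernatant, 2.5 μl 10× Pfx buffer (Invitrogen), 1 μl each of the 10 μM forward and reverse primers, 1 μl 50 mM MgSO4 solution, 0.4 μl 25 mM dNTP, 0.4 μl Pfx enzyme (Invitrogen) and 3 μl USER enzyme (NEB).
The PCR program is: 37°C for 15 minutes; 94°C for 2 minutes; 12 cycles of denaturation at 94°C for 15 seconds, annealing at 62°C for 30 seconds and extension at 72°C for 30 seconds; 72°C for 10 minutes.
31. After step 30, run the PCR reaction from step 30 on a 2% agarose gel and recover the 200-450 bp products by gel extraction with the Qiagen MinElute kit according to the kit instructions, eluting with 16 μl TE buffer to obtain the PCR eluate.
32. After step 31, quantify 1 μl of the PCR eluate from step 31 with a Qubit 3.0. Samples that pass quantification are used for sequencing.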
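The final concentrations quoted in steps 7, 10 and 30 follow from the dilution relation C1·V1 = C2·V2 applied to the listed volumes. The sketch below recomputes a few of them. The volumes are taken from the steps above; the 10 U/μl T4 RNA ligase stock is an assumption inferred from the stated final concentrations, the volume of the cell pellet is ignored, and the helper names are illustrative.

```python
def total_volume(mix):
    """Sum the component volumes (in microlitres) of a reaction mix."""
    return sum(mix.values())


def final_concentration(stock, volume_added, mix):
    """Final concentration from C1*V1 = C2*V2 (cell pellet volume ignored)."""
    return stock * volume_added / total_volume(mix)


# Volumes in microlitres, as listed in steps 7, 10 and 30.
STEP7 = {"10x RNA ligase buffer": 10, "RNase inhibitor": 6, "pCp-biotin (1 mM)": 4,
         "T4 RNA ligase": 10, "DEPC water": 20, "30% PEG": 50}
STEP10 = {"10x RNA ligase buffer": 20, "RNase inhibitor": 8, "T4 RNA ligase": 10,
          "BSA (1 mg/ml)": 20, "DEPC water": 142}
STEP30_PCR = {"template": 15.7, "10x Pfx buffer": 2.5, "Primer1.0 (10 uM)": 1,
              "Index primer (10 uM)": 1, "MgSO4 (50 mM)": 1, "dNTP (25 mM)": 0.4,
              "Pfx enzyme": 0.4, "USER enzyme": 3}

pcp_biotin_um = final_concentration(1000.0, 4, STEP7)   # 1 mM stock expressed in uM
ligase_step7 = final_concentration(10.0, 10, STEP7)     # assumed 10 U/ul stock
ligase_step10 = final_concentration(10.0, 10, STEP10)   # assumed 10 U/ul stock
primer_um = final_concentration(10.0, 1, STEP30_PCR)    # each primer, in uM
```

With these inputs the pCp-biotin concentration matches the 40 μM given in the method description, and the ligase concentrations match the 1 U/μl and 0.5 U/μl stated in steps 7 and 10.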
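Example 2. Application of the RIC-seq library preparation method
I. Culture of HeLa cell and Drosophila S2 cell samples
HeLa cells cultured in the laboratory were used as the sample, with a starting amount of 1×10^7 cells; Drosophila S2 cells were used as a spike-in to assess the specificity of proximity ligation.
II. Preparation of RIC-seq libraries
RIC-seq libraries were constructed from the cell samples of part I according to the method of Example 1. The forward and reverse primers used in step 30 are as follows (NNNNNNN is the library index sequence):
Primer1.0
5'-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT-3' (SEQ ID No.1);
Index primer
5'-CAAGCAGAAGACGGCATACGAGATANNNNNNNGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT-3' (SEQ ID No.2).
Here, N is A, T, C or G.
III. Sequencing
The RIC-seq libraries constructed in part II were sequenced on an Illumina HiSeq X Ten sequencer with PE150 paired-end reads.
IV. Data analysis and results
1. Data analysis method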
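The two PCR primers have to match the adapter ligated in step 29 of Example 1: Primer1.0 ends with adapter oligo B, and the Index primer contains the reverse complement of adapter oligo A. The sketch below checks this on the sequences given in the text and fills the 7-nt index placeholder. The index `ACGTACG` and the helper names are hypothetical and are not taken from the patent.

```python
# Sequences copied from Example 1 (adapter oligos) and Example 2 (primers).
PRIMER_1_0 = "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT"
INDEX_PRIMER = "CAAGCAGAAGACGGCATACGAGATANNNNNNNGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT"
ADAPTER_OLIGO_A = "GATCGGAAGAGCACACGTCT"
ADAPTER_OLIGO_B = "ACACTCTTTCCCTACACGACGCTCTTCCGATCT"

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (N stays N)."""
    return seq.translate(_COMPLEMENT)[::-1]


def with_index(template, index):
    """Replace the single run of seven N's in `template` with a 7-nt index."""
    placeholder = "N" * 7
    if len(index) != 7 or set(index) - set("ACGT"):
        raise ValueError("index must be 7 nt of A/C/G/T")
    if template.count(placeholder) != 1:
        raise ValueError("template must contain exactly one NNNNNNN placeholder")
    return template.replace(placeholder, index)


primer_matches_oligo_b = PRIMER_1_0.endswith(ADAPTER_OLIGO_B)
index_primer_matches_oligo_a = reverse_complement(ADAPTER_OLIGO_A) in INDEX_PRIMER
indexed_primer = with_index(INDEX_PRIMER, "ACGTACG")  # hypothetical index
```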
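The data analysis pipeline is shown in A of Figure 2. First, Trimmomatic (0.36) is used to filter adapter sequences and low-quality reads from the raw RIC-seq data. After redundant reads are removed, Cutadapt (v1.15) is used to trim low-complexity sequences such as poly(A). STAR (2.5.2b) is then used to align the high-quality data to the human reference genome (hg19), and reads derived from RNA ligation products (defined as chimeric reads) are selected from the alignments. Reproducibility is assessed by comparing the number of chimeric reads per gene and computing the Pearson correlation coefficient. IGVtools and Juicebox are used to visualize the RIC-seq data.
2. Data analysis results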
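The reproducibility measure described above is the Pearson correlation of per-gene chimeric read counts between two replicates. The sketch below computes it with the Python standard library only. The counts are invented for illustration, and treating a gene missing from one replicate as a count of 0 is an assumption rather than a detail stated in the text.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def replicate_correlation(rep1, rep2):
    """Correlate per-gene chimeric read counts of two replicates.

    Genes absent from one replicate are counted as 0 (assumption).
    """
    genes = sorted(set(rep1) | set(rep2))
    return pearson([rep1.get(g, 0) for g in genes],
                   [rep2.get(g, 0) for g in genes])


# Invented per-gene counts, for illustration only.
rep1 = {"MALAT1": 120, "NEAT1": 80, "CCAT1": 15}
rep2 = {"MALAT1": 110, "NEAT1": 90, "CCAT1": 10, "PVT1": 2}
r = replicate_correlation(rep1, rep2)
```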
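To capture protein-mediated RNA proximity ligation, we developed the RIC-seq method (RNA In situ Conformation Sequencing). The workflow is shown in A of Figure 1. First, cells are treated with formaldehyde to fix protein-RNA, protein-DNA and protein-protein interactions, so that different RNA fragments that are close in space are fixed in place. Second, a combination of detergents is used to perforate the cell and nuclear membranes of the fixed cells, and micrococcal nuclease treatment removes free RNA not protected by protein. After micrococcal nuclease digestion, the RNA carries a phosphate group at its 3' end and a hydroxyl group at its 5' end (A in Figure 1). For pCp-biotin labeling, alkaline phosphatase is first used to convert the 3' phosphate to a hydroxyl group, and T4 RNA ligase is then used to ligate pCp-biotin to the RNA 3' end. Next, the sample is treated with alkaline phosphatase and then with T4 PNK, converting the 3' end of the Cp-biotin to a hydroxyl group and the RNA 5' end to a phosphate group (A in Figure 1). T4 RNA ligase is then used, in situ and under non-denaturing conditions, to ligate RNAs that are close to each other in space. Total RNA is obtained by proteinase K digestion combined with TRIzol extraction and, after removal of genomic DNA and rRNA, is fragmented by alkaline hydrolysis. Finally, chimeric RNAs containing C-biotin are enriched on streptavidin beads, and a strand-specific library is constructed by the conventional procedure (Levin, J.Z., Yassour, M., Adiconis, X., Nusbaum, C., Thompson, D.A., Friedman, N., Gnirke, A., and Regev, A. (2010). Comprehensive comparative analysis of strand-specific RNA sequencing methods. Nature Methods 7, 709-715), sequenced and analyzed.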
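We constructed two RIC-seq libraries in HeLa cells and obtained 155M mappable reads in total. To facilitate data analysis and visualization, we integrated several algorithms and software tools into a complete analysis pipeline (A in Figure 2). Chimeric reads account for about 9% of all sequenced reads, and more than 90% of the chimeric reads contain an extra "C" at the junction, demonstrating the high efficiency of pCp-biotin labeling and the high specificity of streptavidin bead enrichment (B in Figure 1). Each RIC-seq chimeric read represents a proximal interaction between two different RNA fragments (C-G in Figure 1). Multiple RIC-seq chimeric reads can reveal a common structure or a specific trans RNA interaction. RIC-seq is highly reproducible: the Pearson correlation coefficient between two biological replicates is 0.963 (B in Figure 2). To determine the false-positive rate, we used a cell-mixing strategy (Li, X., Zhou, B., Chen, L., Gou, L.T., Li, H., and Fu, X.D. (2017). GRID-seq reveals the global RNA-chromatin interactome. Nature Biotechnology 35, 940-950): HeLa cells and Drosophila S2 cells were mixed at a 1:5 ratio, and a RIC-seq library was then constructed and sequenced. Only about 0.6% of the chimeric reads had one end from Drosophila RNA and the other end from human HeLa RNA (C in Figure 2), indicating that the false-positive rate of RIC-seq is below 1%.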
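The cell-mixing estimate above is the fraction of chimeric reads whose two arms map to different species. The sketch below computes that fraction. The read representation and the counts are invented for illustration; only the definition of a cross-species read follows the text.

```python
def cross_species_rate(chimeric_reads):
    """Fraction of chimeric reads whose two arms map to different species.

    `chimeric_reads` is an iterable of (species_arm1, species_arm2) pairs;
    this representation is hypothetical.
    """
    total = 0
    cross = 0
    for species_a, species_b in chimeric_reads:
        total += 1
        if species_a != species_b:
            cross += 1
    if total == 0:
        raise ValueError("no chimeric reads given")
    return cross / total


# Invented counts: 994 same-species chimeras and 6 human-fly chimeras.
toy_reads = ([("human", "human")] * 600
             + [("fly", "fly")] * 394
             + [("human", "fly")] * 6)
rate = cross_species_rate(toy_reads)
```

A cross-species chimera cannot arise from a genuine in-cell interaction, so this fraction serves as a proxy for nonspecific ligation between molecules.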
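We next systematically assessed the resolution, sensitivity, and specificity of RIC-seq by comparison with known RNA structures and interactions, including those of microRNAs, snRNAs, snoRNAs, and lncRNAs (Figure 1C-I, Figure 2D-F). RIC-seq captured the canonical stem-loop structures of miRNA precursors at single-nucleotide resolution (Figure 1C; pCp insertion positions are shown below), and the expression levels of these miRNAs ranged from 0.05 to 31,067 RPM (reads per million) (Figure 1D), illustrating the dynamic range of RIC-seq. Unexpectedly, the pCp-biotin labeling positions were mainly enriched in the terminal loops of miRNA precursors (Figure 1C, Figure 1D), suggesting that terminal loops are rarely protected by proteins. Furthermore, RIC-seq detected known intermolecular and intramolecular interactions of snRNAs, snoRNAs, RPPH1 (the RNA component of Ribonuclease P), and TERC (telomerase RNA; data not shown) (Figure 1E-I and Figure 2D-F). Compared with the PARIS and RAP methods (Engreitz, J.M., Sirokman, K., McDonel, P., Shishkin, A.A., Surka, C., Russell, P., Grossman, S.R., Chow, A.Y., Guttman, M., Lander, E.S. (2014). RNA-RNA interactions enable specific targeting of noncoding RNAs to nascent Pre-mRNAs and chromatin sites. Cell 159, 188-199.; Lu, Z., Zhang, Q.C., Lee, B., Flynn, R.A., Smith, M.A., Robinson, J.T., Davidovich, C., Gooding, A.R., Goodrich, K.J., Mattick, J.S., et al. (2016). RNA Duplex Map in Living Cells Reveals Higher-Order Transcriptome Structure. Cell 165, 1267-1279.), RIC-seq not only captured the known U1-MALAT1 interaction but also identified several HeLa-specific U1-MALAT1 interaction sites (Figure 1H). As expected, these interaction sites are conserved, contain U1 motifs, and are supported by chimeric reads (marked by arrows) (Figure 1I), suggesting that they are functional.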
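After thoroughly validating the RIC-seq method and data, we merged the two biological replicates and used Juicebox to plot a genome-wide interaction matrix (Durand, N.C., Robinson, J.T., Shamim, M.S., Machol, I., Mesirov, J.P., Lander, E.S., and Aiden, E.L. (2016). Juicebox Provides a Visualization System for Hi-C Contact Maps with Unlimited Zoom. Cell systems 3, 99-101.). In the matrix, pairwise interactions are visualized as a two-dimensional heat map (IGV/Juicebox) whose intensity indicates the frequency of chimeric RNA ligation (Figure 1J and Figure 2A). Compared with the -pCp control, in which very few chimeric reads were detected, the +pCp library covered complex intramolecular (~7 M) and intermolecular (~6 M) interactions (Figure 2G), indicating that RNA in vivo is not only highly structured but also extensively intertwined (Figure 1J). Interestingly, some lncRNAs, such as NEAT1 and MALAT1, showed extensive binding across all chromosomes. To identify the true binding sites of these highly abundant RNAs, we clustered the chimeric reads and identified 0.74 M high-confidence RNA-RNA interaction sites in HeLa cells. Among these sites, MALAT1 and NEAT1 not only interacted with each other but also had thousands of other targets (Figure 1J). Consistent with a recent report (West, J.A., Davis, C.P., Sunwoo, H., Simon, M.D., Sadreyev, R.I., Wang, P.I., Tolstorukov, M.Y., and Kingston, R.E. (2014). The long noncoding RNAs NEAT1 and MALAT1 bind active chromatin sites. Molecular cell 55, 791-802.), we also found that MALAT1 and NEAT1 preferentially bind transcriptionally active genes (Figure 2H, p<2.2e-16) and that their binding motifs are highly similar (Figure 2I).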
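The interaction matrix behind such a heat map can be thought of as a count of chimeric reads per pair of genomic bins. The following sketch illustrates that idea for positions on a single coordinate axis. The bin size, the function name, and the toy positions are assumptions, and the actual matrix was produced with Juicebox rather than with code like this.

```python
from collections import Counter


def contact_matrix(pairs, bin_size):
    """Count chimeric read pairs per (bin_i, bin_j), stored with bin_i <= bin_j.

    `pairs` is an iterable of (position_of_arm1, position_of_arm2).
    """
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    counts = Counter()
    for pos1, pos2 in pairs:
        i, j = sorted((pos1 // bin_size, pos2 // bin_size))
        counts[(i, j)] += 1
    return counts


# Toy junction coordinates, for illustration only.
toy_pairs = [(10, 250), (260, 20), (120, 130)]
matrix = contact_matrix(toy_pairs, bin_size=100)
```

Storing only the upper triangle treats the two arms as unordered, which matches a symmetric heat map whose intensity is the ligation frequency.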
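RIC-seq revealed that MALAT1 can bind the 5' end of NEAT1 (NEAT1_5', Figure 1J, right). To verify this interaction, we examined the two lncRNAs by single-molecule in situ hybridization (smFISH) and super-resolution imaging (SIM). NEAT1_5' formed ring-like structures, whereas MALAT1 showed a punctate distribution (248 puncta per HeLa nucleus on average) (Figure 1K and Figure 2J). Some NEAT1 and MALAT1 fluorescence signals overlapped directly (Figure 1K, boxes 2-4). A HeLa cell contained 7.5 paraspeckles on average, of which ~63.7% co-localized with MALAT1 (Figure 2K). In summary, these data show that RIC-seq is a new method for identifying in situ RNA-RNA interactions with high specificity, high reproducibility, and high accuracy.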
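To examine whether RIC-seq can capture higher-order RNA structure, we compared the RNA proximity information detected by RIC-seq with data from the cryo-EM structure of the human 80S ribosome (Anger, A.M., Armache, J.P., Berninghausen, O., Habeck, M., Subklewe, M., Wilson, D.N., and Beckmann, R. (2013). Structures of the human and Drosophila 80S ribosome. Nature 497, 80-85.). We first plotted a physical interaction map of the 28S rRNA based on the relative spatial distance between each pair of 5-nt windows (Figure 3A). We likewise plotted a 3D map of the 28S rRNA from the RIC-seq data (Figure 3B). The two maps were highly similar at both global and fine scales (Figure 3A, Figure 3B). Unexpectedly, RIC-seq captured not only Watson-Crick (WC) base pairing (Figure 3B, box 1 and box 2) but also long-range loop-loop interactions, such as the interaction between nucleotides 50-200 and 4300-4400 of the 28S rRNA (Figure 3B, box 3). About 70% of the non-WC base-pairing interactions in the 28S rRNA structure were detected by RIC-seq (Figure 3A, Figure 3B). These data indicate that RIC-seq faithfully captures the 3D structural information of RNA.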
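To quantify the performance of RIC-seq in detecting higher-order RNA structure, we generated two data sets from the cryo-EM structure of the 28S rRNA: a true-positive set (3D distance between the regions covered by a pair of 5-nt windows less than [threshold missing in source]) and a true-negative set (distance greater than [threshold missing in source]) (4,847 vs 369,698) (Figure 3C). These two sets were used to evaluate the sensitivity (true positives successfully detected) and specificity (true negatives successfully excluded) of RIC-seq. We compared the 28S rRNA proximal interactions detected by RIC-seq with the true-positive and true-negative sets and generated an ROC curve. The ROC analysis gave an AUC of 0.89, indicating that RIC-seq identifies higher-order RNA structure with high accuracy (Figure 3D, dark line). As a control, we evaluated PARIS with the same data sets. Because PARIS failed to capture many paired regions and long-range interaction sites in the 28S rRNA, a complete curve could not be obtained and no AUC could be calculated (Figure 3D, light line).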
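For readers less familiar with ROC analysis, the AUC can be computed directly as the probability that a randomly chosen true-positive window pair receives a higher score than a randomly chosen true-negative pair, with ties counted as one half. The sketch below assumes the score is something like the number of chimeric reads supporting each window pair. The text does not say which score was used, so that choice and the toy values are assumptions.

```python
from bisect import bisect_left, bisect_right


def roc_auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise ranking (ties count as 0.5)."""
    if not positive_scores or not negative_scores:
        raise ValueError("both score lists must be non-empty")
    neg_sorted = sorted(negative_scores)
    wins = 0.0
    for score in positive_scores:
        below = bisect_left(neg_sorted, score)            # negatives strictly lower
        tied = bisect_right(neg_sorted, score) - below    # negatives with equal score
        wins += below + 0.5 * tied
    return wins / (len(positive_scores) * len(neg_sorted))


# Toy scores: read support for true-positive and true-negative window pairs.
true_positive_scores = [3, 2, 2]
true_negative_scores = [0, 2, 1, 0]
auc = roc_auc(true_positive_scores, true_negative_scores)
```

Sorting the negatives once keeps the cost near O((n + m) log m), which matters when the negative set is far larger than the positive set, as with the 4,847 vs 369,698 split above.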
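The high-quality intramolecular RNA-RNA interaction data produced by RIC-seq allowed us to examine how RNA folds in vivo. For this purpose, we focused on 5,179 pre-mRNAs, each with at least 100 intramolecular ligation events. Interestingly, we found many independent topological domains in the intronic and exonic regions of mRNAs (Figure 4A). Their common feature is an unusually dense set of RNA-RNA interactions within a given interval, as in the PDE3A and IMMP2L precursor RNAs. To identify such topological domains systematically, we devised an iterative algorithm that determines domain boundaries by maximizing the ratio of RIC-seq density within domains to that between domains. Similar topological domains were also evident in nascent lncRNAs such as FTX and PVT1 (Figure 4A). These data indicate that RNA is highly structured in vivo within specific intervals and that co-transcriptional RNA processing may take place within independent topological domains.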
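The text gives only the objective of the domain-calling algorithm (maximize the ratio of within-domain to between-domain RIC-seq density), not its details. The following toy sketch illustrates that objective for a single boundary in a small binned contact matrix. It is not the iterative algorithm of the specification, and the pseudocount, function names, and matrix are assumptions made for illustration.

```python
def block_sum(matrix, rows, cols):
    """Sum of matrix entries over the given row and column index ranges."""
    return sum(matrix[i][j] for i in rows for j in cols)


def best_boundary(matrix, pseudocount=1.0):
    """Return (split, ratio) maximizing intra-domain over inter-domain mean signal.

    Bins [0, split) form the first domain and bins [split, n) the second.
    """
    n = len(matrix)
    if n < 2:
        raise ValueError("need at least two bins")
    best_split, best_ratio = None, None
    for split in range(1, n):
        left, right = range(split), range(split, n)
        intra_cells = len(left) ** 2 + len(right) ** 2
        intra = (block_sum(matrix, left, left) + block_sum(matrix, right, right)) / intra_cells
        inter = block_sum(matrix, left, right) / (len(left) * len(right))
        ratio = intra / (inter + pseudocount)  # pseudocount avoids division by zero
        if best_ratio is None or ratio > best_ratio:
            best_split, best_ratio = split, ratio
    return best_split, best_ratio


# Toy 4-bin matrix with two obvious 2-bin domains.
toy_matrix = [
    [5, 5, 0, 0],
    [5, 5, 0, 0],
    [0, 0, 5, 5],
    [0, 0, 5, 5],
]
split, ratio = best_boundary(toy_matrix)
```

An iterative version could apply the same scoring recursively inside each resulting domain, but how the original method iterates and when it stops are not described in the text.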
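The discovery of topological domains suggests that large RNA molecules may form complex local structures as they are transcribed and then fold hierarchically into specific higher-order structures, but the actual rules of RNA folding in vivo remain unclear. Like DNA polymers, RNA polymers can exist as a random coil, an equilibrium globule, or a fractal globule. Which conformation RNA adopts can be inferred by calculating the ligation probability between RNA fragments separated by different nucleotide distances (Fudenberg, G., and Mirny, L.A. (2012). Higher-order chromatin structure: bridging physics and biology. Current opinion in genetics & development 22, 115-124.).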
Using the RIC-seq data and a similar modeling approach (Lieberman-Aiden, E., van Berkum, N.L., Williams, L., Imakaev, M., Ragoczy, T., Telling, A., Amit, I., Lajoie, B.R., Sabo, P.J., Dorschner, M.O., et al. (2009). Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289-293.), we examined the relationship between the contact probability (or ligation frequency) of any two fragments within the same RNA molecule and their linear distance. In polymer physics, if a polymer exists as a random coil, the contact probability between two loci decays rapidly with increasing linear distance, and the slope of the curve on a log-log scale is expected to be -3/2 (Fudenberg, G., and Mirny, L.A. (2012). Higher-order chromatin structure: bridging physics and biology. Current opinion in genetics & development 22, 115-124.). In contrast, if the polymer exists as an equilibrium globule, the contact probability first falls at a rate similar to that of a random coil but then plateaus, so that the ligation frequency becomes independent of linear distance (Fudenberg and Mirny, 2012). However, neither the random coil nor the equilibrium globule model fits the behavior observed with RIC-seq. Whether or not introns were included, the RIC-seq data showed that the contact probability between RNA fragments decreased gradually with distance, with a slope close to -1 (Figure 4B and Figure 4C), in very good agreement with the fractal globule model (Lieberman-Aiden et al., 2009). We therefore propose that precursor RNAs, like genomic DNA, may fold in vivo into a fractal globule conformation, which keeps the RNA free of knots while allowing maximal compaction and preserving the ability of the RNA to open and refold its local structures at any time.
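The slopes quoted above are exponents of a power law, P(s) ∝ s^α, so they can be estimated as the slope of a straight-line fit of log P against log s. The sketch below shows such a fit with ordinary least squares on synthetic curves. The function name and the synthetic data are assumptions, and how the RIC-seq contact probabilities were binned or fitted is not stated in the text.

```python
from math import log10


def loglog_slope(distances, contact_probabilities):
    """Least-squares slope of log10(probability) against log10(distance)."""
    if len(distances) != len(contact_probabilities) or len(distances) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    if min(distances) <= 0 or min(contact_probabilities) <= 0:
        raise ValueError("distances and probabilities must be positive")
    xs = [log10(d) for d in distances]
    ys = [log10(p) for p in contact_probabilities]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("distances must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic curves for the two scaling regimes discussed in the text.
distances = [10, 100, 1000, 10000]
fractal_like = [d ** -1.0 for d in distances]     # expected slope -1
random_coil_like = [d ** -1.5 for d in distances]  # expected slope -3/2
slope_fractal = loglog_slope(distances, fractal_like)
slope_coil = loglog_slope(distances, random_coil_like)
```

A fitted slope near -1 would be read as fractal-globule-like and one near -3/2 as random-coil-like, following the argument in the paragraph above.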
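We next examined the conformation of mature mRNA using only chimeric reads from exons and untranslated regions. Modeling showed that the folding of mature mRNA follows a power-law dependence with a slope also close to -1 (Figure 4C), suggesting that mature mRNA is likewise compacted into a fractal globule state. Similarly, intronless lncRNAs such as NEAT1 and MALAT1 fold like pre-mRNAs and intron-containing lncRNAs (Figure 4C). Together, these results suggest that both mRNAs and ncRNAs may follow the fractal globule folding pathway to form complex 3D structures.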
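Highly structured RNAs in turn exert regulatory functions through intermolecular interactions with other RNA molecules. To explore new features of intermolecular interactions, we constructed RNA 3D maps in several cell lines, including human neural progenitor cells (hNPC) and the colon adenocarcinoma cell line HT29. We also included three cell types commonly used by ENCODE, namely the human B-lymphocyte line GM12878, H1 human embryonic stem cells (hESCs), and the human fetal lung fibroblast line IMR-90, to allow later integration with the genomic data generated by the ENCODE project. RIC-seq libraries were constructed in these five cell lines as well, yielding 1,001 M uniquely mapped reads after removal of PCR duplicates, of which 8.4% were chimeric. As expected, RNA-RNA interactions in these five additional cell types were also highly complex (Figure 5A).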
With these high-quality data, we identified about 3 M cell-type-specific interaction clusters (fragment cutoff=2) and a large number of constitutive interaction sites across the six cell types (Figure 5B). For LncPRESS2, a P53-regulated embryonic stem cell-specific lncRNA shown in Figure 5A (Jain, A.K., Xi, Y., McCarthy, R., Allton, K., Akdemir, K.C., Patel, L.R., Aronow, B., Lin, C., Li, W., Yang, L., et al. (2016). LncPRESS1 Is a p53-Regulated LncRNA that Safeguards Pluripotency by Disrupting SIRT6-Mediated De-acetylation of Histone H3K56. Molecular cell 64, 967-981.), RIC-seq detected extensive interactions with its neighboring gene GRID2 in H1 hESCs (Figure 5C). In contrast, at the ChrX:73.1Mb-73.6Mb locus we observed constitutive binding between the lncRNAs FTX and JPX (Figure 5B, Figure 5D). Both lncRNAs are known to positively regulate XIST expression (Carmona, S., Lin, B., Chou, T., Arroyo, K., and Sun, S. (2018). LncRNA Jpx induces Xist expression in mice using both trans and cis mechanisms. PLoS genetics 14, e1007378.; Chureau, C., Chantalat, S., Romito, A., Galvani, A., Duret, L., Avner, P., and Rougeulle, C. (2011). Ftx is a non-coding RNA which affects Xist expression and chromatin structure within the X-inactivation center region. Human molecular genetics 20, 705-718.; Sun, S., Del Rosario, B.C., Szanto, A., Ogawa, Y., Jeon, Y., and Lee, J.T. (2013). Jpx RNA activates Xist by evicting CTCF. Cell 153, 1537-1551.; Tian, D., Sun, S., and Lee, J.T. (2010). The long noncoding RNA, Jpx, is a molecular switch for X chromosome inactivation. Cell 143, 390-403.) and are essential for XIST-mediated X-chromosome silencing. The tight association between these two lncRNAs suggests that they may regulate XIST as a complex. These cell-type-specific and constitutive interactions further highlight the specificity of RIC-seq and indicate that RIC-seq can be used to identify lncRNA targets in living cells.
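To further test the function of the LncPRESS2-GRID2 interaction, we used a Cas9-KRAB-mediated lncRNA silencing strategy (Gilbert, L.A., Larson, M.H., Morsut, L., Liu, Z., Brar, G.A., Torres, S.E., Stern-Ginossar, N., Brandman, O., Whitehead, E.H., Doudna, J.A., et al. (2013). CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes. Cell 154, 442-451.), in which an sgRNA specifically targets Cas9-KRAB to the promoter region of the lncRNA and KRAB, acting as a transcriptional repressor of RNA polymerase II (Figure 5E), efficiently blocks lncRNA transcription. Knockdown of LncPRESS2 with three different sgRNAs significantly reduced GRID2 expression (Figure 5F), indicating that LncPRESS2 positively regulates GRID2 expression even though the linear distance between them is more than 25 kb. Unexpectedly, expression of OCT4, a key pluripotency factor of stem cells, was also significantly reduced in LncPRESS2 knockdown cells (Figure 5F), implying that LncPRESS2-mediated regulation of GRID2 may be closely linked to stemness. These data show that RIC-seq can indeed identify functional lncRNA targets.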
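To reveal general features of RNA-RNA interactions across cell types, we first calculated the frequencies of intra-chromosomal and inter-chromosomal interactions. Using the RIC-seq data from the six cell types above, we found that ~70% of RNA-RNA interactions occurred within the same chromosome and the remaining ~30% between chromosomes (Figure 6A). Because RNA can act on other RNA molecules either in cis or in trans, we also calculated the frequencies of intragenic and intergenic RNA interactions. Similarly, about 60% of chimeric reads came from cis interactions within a gene, and these data can be used to infer the 3D structure of RNA; the remaining 40% had the character of trans RNA-RNA interactions (Figure 6B), indicating that many RNAs in the cell interact over long distances with other RNAs on the same chromosome or on different chromosomes. This trend was also evident when only chimeric reads on the same chromosome were counted: we detected two distinct peaks, the first corresponding to intragenic RNA-RNA interactions spanning hundreds of nucleotides and the second to intergenic interactions spanning more than 1 Mb (Figure 6C).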
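The two tallies described above (same chromosome versus different chromosomes, and same gene versus different genes) can be sketched as a single pass over annotated chimeric reads. The data layout, function name, and toy annotations below are assumptions, and in practice each arm would first have to be assigned to a chromosome and gene from its alignment.

```python
def summarize_chimeras(chimeras):
    """Return the fractions of intra-chromosomal and intragenic chimeric reads.

    `chimeras` is an iterable of ((chrom1, gene1), (chrom2, gene2)) tuples.
    """
    total = same_chrom = same_gene = 0
    for (chrom1, gene1), (chrom2, gene2) in chimeras:
        total += 1
        if chrom1 == chrom2:
            same_chrom += 1
            if gene1 == gene2:
                same_gene += 1
    if total == 0:
        raise ValueError("no chimeric reads supplied")
    return {
        "intra_chromosomal": same_chrom / total,
        "intragenic": same_gene / total,
    }


# Toy annotations; gene and chromosome names are placeholders.
toy_chimeras = [
    (("chr1", "G1"), ("chr1", "G1")),   # intragenic
    (("chr1", "G1"), ("chr1", "G2")),   # intergenic, same chromosome
    (("chr1", "G1"), ("chr2", "G3")),   # inter-chromosomal
    (("chr2", "G3"), ("chr2", "G3")),   # intragenic
]
summary = summarize_chimeras(toy_chimeras)
```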
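Chromatin folds hierarchically in vivo and is partitioned into compartments A and B according to differences in transcriptional activity (Lieberman-Aiden, E., van Berkum, N.L., Williams, L., Imakaev, M., Ragoczy, T., Telling, A., Amit, I., Lajoie, B.R., Sabo, P.J., Dorschner, M.O., et al. (2009). Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289-293.). Like chromatin organization, RNA interactions also appear to be compartmentalized and largely recapitulate the DNA compartments (Figure 6D), suggesting that RNAs in the same compartment tend to interact with each other because of their spatial proximity. We next quantified the interactions between compartments. Interestingly, for intragenic interactions, chimeric reads were mainly enriched within the same compartment, with A-to-A accounting for ~90% of all chimeric reads (Figure 6E), probably because of active transcription in compartment A and the closer spatial distances within a gene. In contrast, for intergenic interactions, A-to-A interactions fell to about 65% while A-to-B interactions rose to ~30% (Figure 6E), suggesting that such trans RNA interactions may have unknown functions and may regulate the activity of genes in compartment B.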
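Because trans RNA-RNA interactions can span more than 1 Mb or even different chromosomes, we next classified RNA-RNA interactions by two criteria: the number of target genes and the density of chimeric reads (the number of chimeric reads divided by the expression level of the RNA). Interestingly, this analysis revealed ~500 highly enriched RNA-RNA interaction hubs in HeLa cells (Figure 7A), including well-known lncRNAs such as MALAT1, NEAT1, CCAT1, and PVT1. Unexpectedly, many protein-coding genes, such as PDE3A, GPC5, and TRIO, also showed complex RNA-RNA interactions (Figure 7A). The interaction patterns and genomic positions of MALAT1, CCAT1, and PDE3A are visualized as Circos plots (Figure 7B). Because the RNAs transcribed from these loci appear to organize RNA-RNA interactions genome-wide, we named them hub-RNAs, comprising hub-mRNAs and hub-lncRNAs.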
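The two criteria can be sketched as a simple filter: compute the chimeric read density of each RNA as chimeric reads divided by expression level, then keep RNAs that exceed cutoffs on both target count and density. The cutoffs used to define the ~500 hubs are not given in the text, so the thresholds, names, and numbers below are purely illustrative assumptions.

```python
def interaction_density(chimeric_reads, expression):
    """Chimeric read count normalized by the expression level of the RNA."""
    if expression <= 0:
        raise ValueError("expression must be positive")
    return chimeric_reads / expression


def candidate_hubs(stats, min_targets, min_density):
    """Names of RNAs meeting both cutoffs.

    `stats` maps an RNA name to (n_target_genes, chimeric_reads, expression).
    """
    hubs = []
    for name, (n_targets, reads, expression) in stats.items():
        if n_targets >= min_targets and interaction_density(reads, expression) >= min_density:
            hubs.append(name)
    return sorted(hubs)


# Placeholder RNAs and hypothetical cutoffs, not values from the specification.
toy_stats = {
    "rnaA": (500, 10000, 100.0),   # many targets, high density
    "rnaB": (5, 50, 100.0),        # few targets
    "rnaC": (600, 100, 1000.0),    # many targets, low density
}
hubs = candidate_hubs(toy_stats, min_targets=100, min_density=10.0)
```

Dividing by expression keeps highly expressed RNAs from being called hubs merely because they contribute more reads.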
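To characterize hub-RNAs, we divided all RNAs expressed in HeLa cells into two groups, hub-RNAs and other RNAs. Based on the RIC-seq signal alone, hub-RNAs had stronger trans RNA-RNA interactions, and their interaction sites were significantly enriched in gene bodies (Figure 7C). In addition, hub-RNAs were more evolutionarily conserved than other RNAs (Figure 7D) and were actively transcribed, with RNA polymerase II enriched at the TSS (transcription start site) regions of these genes (Figure 7E). Correspondingly, the loci of hub-RNAs showed slightly higher signals for the active histone marks H3K4me3 (Figure 7F) and H3K27ac (Figure 7G) and a slightly lower signal for the repressive histone mark H3K27me3 (Figure 7H). We also found that hub-RNAs are cell-type-specific (Figure 7I). RIC-seq thus unexpectedly revealed a set of tissue-specific hub-RNAs that may play important roles in gene regulation.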
To investigate the role of hub-RNAs, we selected CCAT1 for further analysis because it has extensive trans RNA interactions (Figure 7B) and potential super-enhancer activity (Hnisz, D., Abraham, B.J., Lee, T.I., Lau, A., Saint-Andre, V., Sigova, A.A., Hoke, H.A., and Young, R.A. (2013). Super-enhancers in the control of cell identity and disease. Cell 155, 934-947.; Loven, J., Hoke, H.A., Lin, C.Y., Lau, A., Orlando, D.A., Vakoc, C.R., Bradner, J.E., Lee, T.I., and Young, R.A. (2013). Selective inhibition of tumor oncogenes by disruption of super-enhancers. Cell 153, 320-334.). CCAT1 is located in the human 8q24 region and is aberrantly overexpressed in several cancers, such as colorectal, prostate, and liver cancer (Chen, H., He, Y., Hou, Y.S., Chen, D.Q., He, S.L., Cao, Y.F., and Wu, X.M. (2018a). Long non-coding RNA CCAT1 promotes the migration and invasion of prostate cancer PC-3 cells. European review for medical and pharmacological sciences 22, 2991-2996.; Deng, L., Yang, S.B., Xu, F.F., and Zhang, J.H. (2015). Long noncoding RNA CCAT1 promotes hepatocellular carcinoma progression by functioning as let-7 sponge. Journal of experimental & clinical cancer research: CR 34, 18.; Tseng, Y.Y., Moriarity, B.S., Gong, W., Akiyama, R., Tiwari, A., Kawakami, H., Ronning, P., Reuland, B., Guenther, K., Beadnell, T.C., et al. (2014). PVT1 dependence in cancer with MYC copy-number increase. Nature 512, 82-86.; Xiang, J.F., Yin, Q.F., Chen, T., Zhang, Y., Zhang, X.O., Wu, Z., Zhang, S., Wang, H.B., Ge, J., Lu, X., et al. (2014). Human colorectal cancer-specific CCAT1-L lncRNA regulates long-range chromatin interactions at the MYC locus. Cell research 24, 513-531.). In colorectal cancer cells, a CCAT1 transcript with an additional 3' extension has been reported to regulate DNA interactions between the promoter and enhancers of the MYC gene (Xiang et al., 2014), but the exact mechanism is unknown.
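Because CCAT1 partially overlaps a reported super-enhancer (Khan, A., and Zhang, X. (2016). dbSUPER: a database of super-enhancers in mouse and human genome. Nucleic acids research 44, D164-171.), we examined the RIC-seq and RNA-seq data from HeLa cells and unexpectedly found that CCAT1 may have a transcript with an additional 5' extension, rather than the 3' extension previously reported in colon cancer (Figure 8A). 5' and 3' RACE in HeLa cells confirmed that CCAT1 does carry an additional 5' extension (Figure 8A, bottom, marked in brown); the transcript is ~4,700 nt long in total and directly overlaps the super-enhancer. We named this newly discovered lncRNA CCAT1-5L. Northern blotting with an Exon2 (E2) probe (see Figure 8A) confirmed that CCAT1 is expressed in different cell types, whereas a 5L-specific probe detected CCAT1-5L only in HeLa cells (Figure 8B), suggesting that CCAT1-5L may be a transcript specific to cervical cancer, which was further supported by RNA-seq data from cervical cancer patients (data not shown).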
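smFISH showed that CCAT1-5L is a nuclear lncRNA present as 2-3 puncta per nucleus (Figure 8C). CCAT1-5L appears to be functional, because most of the RNA interactions between CCAT1 and other regions of the 8q24 "gene desert" originate from the first exon and the additional 5' extension (Figure 8A). In addition, we detected extensive long-range RNA-RNA interactions among CCAT1-5L, MYC promoter RNA, and PVT1. More importantly, the CCAT1-5L binding sites observed in the PVT1 locus were mainly located in intronic sequences containing MYC enhancers (Figure 8A). These data suggest that CCAT1-5L may act as a super-enhancer RNA that interacts with promoter and enhancer RNAs to regulate expression of the MYC oncogene.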
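We next asked whether CCAT1-5L can directly regulate MYC expression. When CCAT1-5L was knocked down with two LNA oligonucleotides targeting the 5' extension (Figure 8D), MYC RNA levels fell significantly, by ~40% (Figure 8D), indicating that CCAT1-5L does regulate MYC expression in HeLa cells. Unexpectedly, expression of PVT1, a positive regulator of MYC (Tseng, Y.Y., Moriarity, B.S., Gong, W., Akiyama, R., Tiwari, A., Kawakami, H., Ronning, P., Reuland, B., Guenther, K., Beadnell, T.C., et al. (2014). PVT1 dependence in cancer with MYC copy-number increase. Nature 512, 82-86.), was also significantly reduced (Figure 8D). We therefore hypothesized that the CCAT1-5L hub-lncRNA may act together with MYC promoter and enhancer RNAs to coordinate their expression levels.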
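To test this hypothesis, we first examined whether CCAT1-5L, MYC promoter RNA, and MYC enhancer RNA co-localize in vivo. We synthesized smFISH probes targeting the CCAT1-5L portion detected by RIC-seq, the first exon and first intron of MYC, and the enhancer located in the PVT1 intron. The three RNAs co-localized precisely (Figure 8E), further supporting the regulatory effect of CCAT1-5L on MYC and PVT1. In addition, the co-localization pattern between MYC promoter and enhancer RNAs was unchanged after CCAT1-5L knockdown with LNA oligonucleotides (Figure 8E).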
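Because CCAT1-5L is highly expressed in cervical cancer patients, we next examined whether CCAT1-5L promotes cell proliferation and metastasis, two hallmarks of cancer (Hanahan, D., and Weinberg, R.A. (2011). Hallmarks of cancer: the next generation. Cell 144, 646-674.). Compared with the control LNA, knockdown of CCAT1-5L with 5L-specific LNA oligonucleotides significantly reduced the proliferation rate of HeLa cells (Figure 8F); conversely, ectopic expression of CCAT1-5L from a lentiviral plasmid significantly enhanced cell proliferation (Figure 8F), consistent with an oncogenic role for CCAT1-5L. Colony formation assays further supported the effect of CCAT1-5L on cell proliferation (Figure 8G). To test whether CCAT1-5L affects cell metastasis and invasion, we measured the invasive capacity of HeLa cells in transwell assays: knockdown of CCAT1-5L significantly reduced the metastatic capacity of HeLa cells, whereas overexpression of CCAT1-5L significantly increased cell invasion and metastasis (Figure 8H). Together, these data indicate that the CCAT1-5L hub-lncRNA can directly regulate MYC expression to promote tumorigenesis.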
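Industrial applicability
The method provided by the present invention for capturing RNA in situ higher-order structures and interactions treats intracellular RNA in situ without disrupting cell structure and while maintaining cell integrity, and thereby captures intramolecular and intermolecular RNA interactions under physiological conditions. The method labels RNA ends with pCp-biotin and performs in situ ligation under non-denaturing conditions, which greatly improves labeling efficiency while reducing non-specific intermolecular ligation. It uses C1 magnetic beads to enrich chimeric RNAs labeled with C-biotin for library construction, which enriches chimeric RNAs efficiently, increases the proportion of useful data, and lowers sequencing cost.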
Claims (23)
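- A method for capturing higher-order RNA structure in situ and/or identifying in situ RNA-RNA interactions, comprising the following steps: (1) treating a cell or tissue sample to fix protein-mediated proximal RNA-RNA interactions; (2) permeabilizing membranes while keeping the cells intact; (3) degrading free RNA that is not protected by protein; (4) labeling the 3' ends of the protein-protected RNA with "pCp-marker 1" and performing proximity ligation in situ; (5) after cell digestion, purifying chimeric RNA containing "C-marker 1" and constructing a strand-specific library; (6) performing high-throughput sequencing.
- The method according to claim 1, characterized in that: in step (1), the treatment of the cell or tissue sample is formaldehyde cross-linking of the cell or tissue sample.
- The method according to claim 2, characterized in that: step (1) is carried out by a method comprising the following step: (a1) placing the cell or tissue sample in a formaldehyde solution at room temperature for 10 minutes.
- The method according to claim 3, characterized in that: the formaldehyde solution is a 1% (by volume) formaldehyde solution.
- The method according to claim 3 or 4, characterized in that: step (a1) is followed by step (a2): (a2) adding a glycine solution to the cell or tissue sample treated in step (a1) and leaving it for 10 minutes.
- The method according to claim 5, characterized in that: the glycine solution is a glycine solution with a concentration of 0.125 mol/L.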
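- The method according to any one of claims 1-6, characterized in that: in step (2), the solution used for the membrane permeabilization is a Permeabilization solution; the solvent of the Permeabilization solution is 10 mM Tris-HCl buffer, pH 7.5, and the solutes and their concentrations are as follows: 10 mM NaCl, 0.5% NP-40, 0.3% Triton X-100, 0.1% Tween 20, 1× protease inhibitors, and 2 U/ml SUPERase·In™ RNase Inhibitor; all % values are percentages by volume.
- The method according to claim 7, characterized in that: step (2) is carried out by a method comprising the following step: (b1) placing the cell or tissue sample treated in step (1) in the Permeabilization solution at 0-4℃ for 15 minutes.
- The method according to claim 8, characterized in that: step (b1) is followed by step (b2): (b2) washing the cell or tissue sample treated in step (b1) with 1× PNK solution; wherein the solvent of the 1× PNK solution is 50 mM Tris-Cl buffer, pH 7.4, and the solutes and their concentrations are as follows: 10 mM MgCl2, 0.1 mg/ml BSA, 0.2% NP-40; % denotes percentage by volume.
- The method according to any one of claims 1-9, characterized in that: in step (3), micrococcal nuclease is used to degrade the free RNA that is not protected by protein.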
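- The method according to any one of claims 1-10, characterized in that: step (4) is carried out by a method comprising the following steps: (d1) converting the 3' ends of the protein-protected RNA to hydroxyl groups; (d2) labeling the 3' ends of the RNA with "Cp-marker 1"; (d3) converting the phosphate group in the "Cp-marker 1" at the 3' ends of the RNA to a hydroxyl group; (d4) phosphorylating the 5' ends of the RNA; (d5) performing proximity ligation in situ.
- The method according to claim 11, characterized in that: in step (d1), the 3' ends of the protein-protected RNA are converted to hydroxyl groups by treating the sample from step (3) with alkaline phosphatase.
- The method according to claim 11 or 12, characterized in that: in step (d2), the 3' ends of the RNA are labeled with "Cp-marker 1" by adding the "pCp-marker 1" to the sample treated in step (d1) and performing a ligation reaction.
- The method according to any one of claims 11-13, characterized in that: in step (d3), the phosphate group in the "Cp-marker 1" at the 3' ends of the RNA is converted to a hydroxyl group by treating the sample from step (d2) with alkaline phosphatase.
- The method according to any one of claims 11-14, characterized in that: in step (d4), the 5' ends of the RNA are phosphorylated by treating the sample from step (d3) with T4 PNK.
- The method according to any one of claims 11-15, characterized in that: in step (d5), proximity ligation is performed in situ by adding T4 RNA ligase to the sample treated in step (d4).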
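- The method according to any one of claims 1-16, characterized in that: step (5) is carried out by a method comprising the following steps: (e1) digesting the cells with proteinase K; (e2) extracting total RNA and fragmenting it; (e3) enriching the chimeric RNA labeled with "C-marker 1" using magnetic beads on which marker 2 is immobilized, marker 2 being capable of binding specifically to marker 1; (e4) constructing a strand-specific library.
- The method according to any one of claims 1-17, characterized in that: in the method, the maximum starting amount of cells is 1×10^7 cells.
- The method according to any one of claims 1-18, characterized in that: the cells are animal cells and the tissue is animal tissue.
- A library construction method, comprising steps (1) to (5) of the method according to any one of claims 1-19.
- Use of a library constructed by the method of claim 20 in capturing higher-order RNA structure in situ and/or identifying in situ RNA-RNA interactions.
- Any of the following uses: (A1) use of the method according to any one of claims 1-19 in identifying lncRNA targets in living cells; (A2) use of pCp-biotin in identifying proximal RNA-RNA interactions; (A3) use of pCp-biotin in in situ proximity ligation of RNA; (A4) use of pCp-biotin in enriching chimeric RNA.
- Any of the following: (B1) a detergent, which is the Permeabilization solution described in claim 7; (B2) auxiliary use of the detergent of (B1) in permeabilizing cell membranes; (B3) use of micrococcal nuclease, alkaline phosphatase, and/or T4 polynucleotide kinase in in situ ligation of RNA; (B4) use of proteinase K in combination with heating for extracting RNA from formaldehyde-fixed cell or tissue samples, wherein the heating refers to reaction at 37℃ for 60 minutes and at 56℃ for 15 minutes.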
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/297,414 US20220033807A1 (en) | 2019-05-09 | 2019-07-05 | Method for capturing rna in situ higher-order structures and interactions |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910384194.2A CN111909991B (zh) | 2019-05-09 | 2019-05-09 | Method for capturing RNA in situ higher-order structures and interactions |
CN201910384194.2 | 2019-05-09 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020224040A1 true WO2020224040A1 (zh) | 2020-11-12 |
Family
ID=73051442
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/094790 WO2020224040A1 (zh) | 2019-05-09 | 2019-07-05 | Method for capturing RNA in situ higher-order structures and interactions |
Country Status (3)
Country | Link |
---|---|
US (1) | US20220033807A1 (zh) |
CN (1) | CN111909991B (zh) |
WO (1) | WO2020224040A1 (zh) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114507721B (zh) * | 2020-11-16 | 2024-04-09 | 寻鲸生科(北京)智能技术有限公司 | 一种全转录组rna结构探测的方法及其应用 |
CN112592971B (zh) * | 2020-11-26 | 2022-04-22 | 南京大学 | 一种与系统性红斑狼疮相关的生物标志物及其应用 |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108300767A (zh) * | 2017-10-27 | 2018-07-20 | 清华大学 | 一种核酸复合体中核酸区段相互作用的分析方法 |
CN109505012A (zh) * | 2019-01-15 | 2019-03-22 | 依科赛生物科技(太仓)有限公司 | 一种针对FFPE样本的mRNA二代测序文库构建的试剂盒及其应用 |
WO2019071262A1 (en) * | 2017-10-06 | 2019-04-11 | The Regents Of The University Of California | RAPID SITU DETECTION OF DNA AND RNA |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050042621A1 (en) * | 2002-09-27 | 2005-02-24 | Affymetrix, Inc. | Method for preparing a nucleic acid sample for hybridization to an array |
CN113151407A (zh) * | 2014-12-22 | 2021-07-23 | 西格马-奥尔德里奇有限责任公司 | 显现单个细胞中的修饰的核苷酸和核酸相互作用 |
CN105986324B (zh) * | 2015-02-11 | 2018-08-14 | 深圳华大智造科技有限公司 | 环状小rna文库构建方法及其应用 |
CN106282314A (zh) * | 2015-05-11 | 2017-01-04 | 中国科学院遗传与发育生物学研究所 | 一种在植物中鉴定与蛋白结合的rna种类和rna位点的方法 |
EP3455379B1 (en) * | 2016-05-12 | 2023-07-05 | Agency For Science, Technology And Research | Ribonucleic acid (rna) interactions |
CN107674870A (zh) * | 2016-08-01 | 2018-02-09 | 武汉生命之美科技有限公司 | 一种改进的鉴定细胞样品中rna结合蛋白的靶标rna序列的方法 |
-
2019
- 2019-05-09 CN CN201910384194.2A patent/CN111909991B/zh active Active
- 2019-07-05 US US17/297,414 patent/US20220033807A1/en active Pending
- 2019-07-05 WO PCT/CN2019/094790 patent/WO2020224040A1/zh active Application Filing
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019071262A1 (en) * | 2017-10-06 | 2019-04-11 | The Regents Of The University Of California | RAPID SITU DETECTION OF DNA AND RNA |
CN108300767A (zh) * | 2017-10-27 | 2018-07-20 | 清华大学 | 一种核酸复合体中核酸区段相互作用的分析方法 |
CN109505012A (zh) * | 2019-01-15 | 2019-03-22 | 依科赛生物科技(太仓)有限公司 | 一种针对FFPE样本的mRNA二代测序文库构建的试剂盒及其应用 |
Non-Patent Citations (2)
Title |
---|
BODO, M. ET AL.: "Simple Methods for the 3' biotinylation of RNA", RNA, vol. 20, 31 March 2014 (2014-03-31), pages 421 - 427, XP055354650, DOI: 20200207115244X * |
BODO, M. ET AL.: "Simple Methods for the 3' biotinylation of RNA", RNA, vol. 20, 31 March 2014 (2014-03-31), pages 421 - 427, XP055354650, DOI: 20200207125259A * |
Also Published As
Publication number | Publication date |
---|---|
US20220033807A1 (en) | 2022-02-03 |
CN111909991A (zh) | 2020-11-10 |
CN111909991B (zh) | 2021-08-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP7532455B2 (ja) | 連続性を維持した転位 | |
Fu et al. | N6-methyldeoxyadenosine marks active transcription start sites in Chlamydomonas | |
CN106604994B (zh) | 通过测序评估的DSBs的全基因组无偏鉴定(GUIDE-Seq) | |
Deng et al. | Bipartite structure of the inactive mouse X chromosome | |
KR102332522B1 (ko) | 삼차원 dna 구조에서 뉴클레오티드 서열의 상호작용을 분석하기 위한 방법 | |
Guffanti et al. | A transcriptional sketch of a primary human breast cancer by 454 deep sequencing | |
US9085798B2 (en) | Nucleic acid constructs and methods of use | |
US20100311602A1 (en) | Sequencing method | |
CN111954720A (zh) | 用于分析核酸的方法和组合物 | |
JP6430631B2 (ja) | リンカー要素、及び、それを使用してシーケンシングライブラリーを構築する方法 | |
Ma et al. | Using DNase Hi-C techniques to map global and local three-dimensional genome architecture at high resolution | |
JP2014506788A (ja) | 大量並列連続性マッピング | |
CA2661640A1 (en) | Mapping of genomic interactions | |
CN107109698B (zh) | Rna stitch测序:用于直接映射细胞中rna:rna相互作用的测定 | |
US20190062736A1 (en) | In situ and in vivo analysis of chromatin interactions by biotinylated dcas9 protein | |
WO2020224040A1 (zh) | 一种捕获rna原位高级结构及相互作用的方法 | |
US10287621B2 (en) | Targeted chromosome conformation capture | |
US20230032136A1 (en) | Method for determination of 3d genome architecture with base pair resolution and further uses thereof | |
US20110091939A1 (en) | Methods and Compositions for Removing Specific Target Nucleic Acids | |
EP4001429A1 (en) | Analysis of crispr-cas binding and cleavage sites followed by high-throughput sequencing (abc-seq) | |
US20240150830A1 (en) | Phased genome scale epigenetic maps and methods for generating maps | |
US20240141325A1 (en) | Generation of novel crispr genome editing agents using combinatorial chemistry | |
Parasyraki et al. | 5-Formylcytosine is an activating epigenetic mark for RNA Pol III during zygotic reprogramming | |
Gavrilov et al. | Studying RNA–DNA interactome by Red-C identifies noncoding RNAs associated with repressed chromatin compartment and reveals transcription dynamics | |
WO2005058931A2 (en) | Methods and algorithms for identifying genomic regulatory sites |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19927875 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 19927875 Country of ref document: EP Kind code of ref document: A1 |