WO2023237066A1 - Method for screening RNA aptamers - Google Patents
Method for screening RNA aptamers
- Publication number
- WO2023237066A1 (PCT/CN2023/099222)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- rna
- rna aptamer
- elution
- library
- aptamer
Classifications
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K31/00—Medicinal preparations containing organic active ingredients
- A61K31/70—Carbohydrates; Sugars; Derivatives thereof
- A61K31/7088—Compounds having three or more nucleosides or nucleotides
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K47/00—Medicinal preparations characterised by the non-active ingredients used, e.g. carriers or inert additives; Targeting or modifying agents chemically bound to the active ingredient
- A61K47/50—Medicinal preparations characterised by the non-active ingredients used, e.g. carriers or inert additives; Targeting or modifying agents chemically bound to the active ingredient the non-active ingredient being chemically bound to the active ingredient, e.g. polymer-drug conjugates
- A61K47/51—Medicinal preparations characterised by the non-active ingredients used, e.g. carriers or inert additives; Targeting or modifying agents chemically bound to the active ingredient the non-active ingredient being chemically bound to the active ingredient, e.g. polymer-drug conjugates the non-active ingredient being a modifying agent
- A61K47/54—Medicinal preparations characterised by the non-active ingredients used, e.g. carriers or inert additives; Targeting or modifying agents chemically bound to the active ingredient the non-active ingredient being chemically bound to the active ingredient, e.g. polymer-drug conjugates the non-active ingredient being a modifying agent the modifying agent being an organic compound
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P31/00—Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P31/00—Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
- A61P31/12—Antivirals
- A61P31/14—Antivirals for RNA viruses
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/115—Aptamers, i.e. nucleic acids binding a target molecule specifically and with high affinity without hybridising therewith ; Nucleic acids binding to non-nucleic acids, e.g. aptamers
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
-
- C—CHEMISTRY; METALLURGY
- C40—COMBINATORIAL TECHNOLOGY
- C40B—COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
- C40B30/00—Methods of screening libraries
- C40B30/04—Methods of screening libraries by measuring the ability to specifically bind a target molecule, e.g. antibody-antigen binding, receptor-ligand binding
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/48—Biological material, e.g. blood, urine; Haemocytometers
- G01N33/50—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
- G01N33/53—Immunoassay; Biospecific binding assay; Materials therefor
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/48—Biological material, e.g. blood, urine; Haemocytometers
- G01N33/50—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
- G01N33/53—Immunoassay; Biospecific binding assay; Materials therefor
- G01N33/573—Immunoassay; Biospecific binding assay; Materials therefor for enzymes or isoenzymes
Definitions
- the present invention relates to the field of biotechnology. Specifically, the present invention relates to a method for screening RNA aptamers.
- Second-generation high-throughput gene sequencing (essentially independent of the screening process, providing only sequence information), microfluidic chips (precision equipment whose empirical parameters must be tuned manually by specialists), capillary electrophoresis (precision equipment, suitable only for screening aptamers that bind macromolecules), and bioinformatic models of subsequences and structures (data-driven, still dependent on data quality, with high false-positive rates) have further optimized the screening of RNA aptamers, but a fast, efficient and universal screening technology is still lacking.
- RNA drug candidates developed can have dual identities of vaccine and treatment, are highly specific and safe, are less affected by the mutation of the new coronavirus, have a short development cycle and low product costs.
- RNA drug candidates can also have the advantages of dynamic increase and decrease regulation, strong targeting and safety, and intelligent precision medicine.
- RNA aptamer screening technology still has obvious shortcomings that restrict the screening and subsequent application of RNA aptamers: a high false positive rate; non-optimal binding ability of the selected aptamers; a high time cost, often requiring 10-16 rounds of repeated screening and 2-6 months of research and development; poor reproducibility; and the need for manual operation.
- the purpose of the present invention is to provide a screening method for RNA aptamers that reduces the false positive rate; optimizes screening conditions and enhances screening capability; shortens experimental time and reduces time cost; improves the screening process and achieves repeatability; and at the same time simplifies experimental operations so that they can be automated.
- the present invention provides a method for screening RNA aptamers, the method comprising the following steps:
- step 2) Incubate the library in step 1) with the solid carrier bound to the target, thereby promoting the full binding of the RNA aptamer in the library to the target;
- step 3) Use the buffer gradient to elute the RNA aptamer that binds to the target on the solid support in step 2), and collect the eluate of each elution separately;
- step 7) Amplify and perform high-throughput sequencing on the cDNA obtained in step 6) to obtain sequencing data;
- step 8) Analyze the sequencing data obtained in step 7) and sort the RNA aptamer candidate sequences from high to low according to their binding potential, thereby obtaining high-affinity RNA aptamer sequences.
- providing an RNA aptamer library to be screened includes preparing the RNA aptamer library by yourself, purchasing it commercially, or obtaining an RNA aptamer library through gifts from others.
- step 2) after the RNA aptamer in the RNA aptamer library binds to the target, the solid carrier can be blocked to control and reduce non-specific background binding.
- the blocking is to block the solid support with non-target-specific random RNA; or to block the solid support with target-specific RNA.
- the solid carrier includes but is not limited to: magnetic beads and matrix.
- the matrix includes but is not limited to: agarose gel matrix, cephalosporin beads, nitrocellulose, polyvinylidene fluoride membrane, octyltrehalose and other carrier matrices.
- the target is a small molecule, including but not limited to: steroids, dopamine, kanamycin, digoxin, antoxin, dinitroaniline, melamine, quinolones, aflatoxin; or a macromolecule, including but not limited to: polypeptides, proteins (such as enzymes and antibodies), complexes (proteins combined with RNA), high molecular polymers and compounds.
- the gradient elution is performed with an increasing volume of buffer or a buffer with increasing elution strength; preferably, an increasing volume of buffer is used for elution.
- the buffer with increased elution strength refers to a buffer that increases the concentration of salt ions or chelating agents to prevent RNA from folding to form a spatial structure.
- background elution is performed several times until the number of RNA aptamer molecules contained in the eluate is no more than 1% of the high-throughput sequencing threshold.
- the volume of buffer for background elution should be no greater than the initial volume of buffer used for gradient elution.
- the elution can be static elution (discontinuous elution, collecting the complete eluate at one time) or dynamic elution (continuous elution, continuously collecting small portions of the eluate), preferably static elution.
- the last background elution is performed in a new vessel.
- the volume of buffer for background elution may or may not be increased, preferably not.
- the buffer for background elution and the buffer for gradient elution may be the same or different; preferably they are the same.
- the buffer used for gradient elution contains magnesium ions, preferably 5mM magnesium ions, a pH value below 8.5, preferably a pH value of 7-8, and a NaCl or KCl concentration between 75mM and 200mM.
- RNA aptamer molecules contained in the eluate is suitable for sequencing
- the theoretical minimum number of molecules in the library is reduced to less than 10^5, in step 4) to completely elute the RNA aptamer bound to the target on the solid support.
- the buffer used for complete elution contains a reagent capable of releasing the RNA aptamer, including a reagent capable of destroying the binding of the target to the solid support, and/or a reagent capable of destroying the binding of the RNA aptamer to the target, and/or a reagent that directly destroys the target.
- a 0-6nt compensation sequence is randomly inserted between the sequencing adapter and the cDNA constant region.
- step 7 during the mixing of multiple samples, custom-designed PhiX is also introduced to further compensate for the unbalanced base distribution in the constant region
- the binding potential refers to the fastest-increasing degree of enrichment across the eluates, rather than the highest degree of enrichment in any single eluate considered alone.
- the binding potential is judged based on one or more of the following information of the RNA aptamer: the abundance of the RNA aptamer in each eluate, the presence of the RNA aptamer in each eluate.
- the frequency of individually detected RNA aptamers in the eluate was better than that in the initial eluate.
- the above information is integrated and fitted into a standard curve to evaluate the binding potential of the RNA aptamer according to the area under the curve (AUC).
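- The AUC-based evaluation described above can be illustrated with a minimal sketch. It assumes that each sequence has a normalized abundance in every eluate, that fold change is taken relative to the first eluate, and that the curve is integrated with the trapezoidal rule at unit spacing. The function names, the pseudocount and the example values are hypothetical, and the gamma weighting mentioned elsewhere in the description is not modeled.

```python
def binding_potential_auc(abundances, pseudocount=1.0):
    """Area under the fold-enrichment curve of one sequence.

    abundances: normalized abundances in eluates g1..gN, in elution order.
    The pseudocount (an assumption) avoids division by zero when a
    sequence is absent from the first eluate.
    """
    baseline = abundances[0] + pseudocount
    folds = [(a + pseudocount) / baseline for a in abundances]
    # Trapezoidal rule with unit spacing between consecutive eluates.
    return sum((left + right) / 2.0 for left, right in zip(folds, folds[1:]))


def rank_by_auc(table):
    """Return sequence names sorted from highest to lowest AUC."""
    return sorted(table, key=lambda name: binding_potential_auc(table[name]),
                  reverse=True)


# Hypothetical abundances over four eluates.
example = {
    "steady_riser": [1, 2, 4, 8],
    "late_spike": [1, 1, 1, 8],
    "flat": [1, 1, 1, 1],
}
ranking = rank_by_auc(example)
```

- In this toy input the sequence that rises in every eluate outranks the one that only peaks in the final eluate, which mirrors the stated preference for the enrichment trend over a single highest value.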
- the RNA aptamer comprises a chemically modified sequence.
- the chemically modified sequence is a fluorine modified sequence.
- the present invention provides an RNA aptamer, which is screened using the method described in the first aspect.
- the RNA aptamer includes one whose sequence is known but has random modifications on different bases (e.g., A, U, G, C).
- the RNA aptamers do not include conventional RNA aptamers whose sequences are known and without additional modifications.
- the RNA aptamer comprises a chemically modified sequence; preferably a fluorine modified sequence.
- the invention provides an apparatus for running the method described in the first aspect above.
- the device includes the following modules:
- Preparation module which prepares an RNA aptamer library to be screened
- Incubation module which incubates the prepared library with a solid carrier (magnetic beads or matrix) bound to the target;
- Elution and collection module which uses a buffer to perform gradient elution of the above-mentioned module of the RNA aptamer that binds to the target on the solid support, and collects the eluate of each elution respectively;
- Optional concentration and purification module which concentrates and purifies the RNA aptamer of the above eluate
- Reverse transcription module which reversely transcribes the above-mentioned RNA aptamer to obtain cDNA
- Amplification and high-throughput sequencing module which performs amplification and high-throughput sequencing on the cDNA obtained above to obtain sequencing data
- Analysis module which analyzes the above-mentioned sequencing data and sorts RNA aptamer candidate sequences from high to low according to their binding potential, thereby obtaining RNA aptamer sequences with high binding affinity.
- the present invention provides a biochip, which includes the RNA aptamer described in the second aspect.
- the present invention provides a method for manufacturing a biochip, which method includes the following steps:
- the present invention provides a pharmaceutical composition
- a pharmaceutical composition comprising the RNA aptamer described in the second aspect and a pharmaceutically acceptable excipient or drug delivery carrier.
- the present invention provides a drug delivery carrier to which the RNA aptamer described in the second aspect is connected.
- the drug delivery vehicle is liposomes.
- the RNA aptamer of the present invention can be attached or connected to a delivery carrier (such as a nanoliposome), so that the drug carried by the carrier can be specifically delivered to designated cells.
- the present invention provides a diagnostic reagent, which includes the RNA aptamer described in the second aspect and other auxiliary reagents required for diagnosis.
- the present invention provides the use of RNA aptamers obtained by screening using the method described in the first aspect in preparing biochips, pharmaceutical compositions or diagnostic reagents.
- Figure 1 shows the construction and modeling analysis of the RNA library of the present invention.
- a shows the screening RNA and sequencing library construction of the present invention.
- a random RNA library is incubated with targeting molecules attached to magnetic beads.
- the mixture of RNA-bound targeting molecules is washed sequentially with the same binding buffer with increasing volume gradient, and the washing liquid is collected, which is called group 1-10 (g1-10).
- the mixture is subjected to final elution (group 11, g11) with corresponding chemical reagents or enzymes to completely detach the strongly bound target RNA from the magnetic beads.
- the purified RNA from g1-11 was further subjected to reverse transcription, offset PCR amplification and Illumina PCR amplification.
- b shows the sequencing data mining high-affinity aptamer of the present invention.
- the raw sequence was preprocessed into a 67nt core region. Data cleaned series were counted and merged into the same data frame.
- the fold ratio weights are adjusted according to the normalization of the sequences on each non-initial group (g2-11 relative to g1) according to the default initial gamma baseline.
- for each sequence, the change ratios across the groups and their gamma rate of change (gf) are summarized by the area under the curve (AUC). Based on the subsequence characteristics of the top-ranked sequences and their enrichment routes in each group, the ranking model can be further fine-tuned.
- the aptamer sequence with the highest value AUC is selected for downstream functional applications.
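- The counting and merging step described for panel b can be sketched as follows. This is a minimal illustration, assuming the reads of each group have already been trimmed to the core region; the function name and the toy sequences are hypothetical, and a plain dictionary stands in for the data frame.

```python
from collections import Counter


def merge_group_counts(groups):
    """Count core sequences per group and merge them into one table.

    groups: dict mapping a group name (e.g. "g1") to a list of core
    sequences. Returns (names, table), where table[seq] lists the count
    of seq in each group, in the order given by names.
    """
    names = list(groups)
    counters = [Counter(groups[name]) for name in names]
    all_sequences = set().union(*counters)
    # A Counter returns 0 for sequences that a group never contained.
    table = {seq: [c[seq] for c in counters] for seq in sorted(all_sequences)}
    return names, table


# Toy input: three groups with very short stand-in sequences.
names, table = merge_group_counts({
    "g1": ["AAA", "CCC", "AAA"],
    "g2": ["CCC"],
    "g11": ["GGG", "CCC"],
})
```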
- Figure 2 shows the construction characteristics of the RNA library of the present invention.
- a is the sequence composition of the random RNA library source of the present invention.
- the RNA sequence (103nt) contains the primer A binding region (19nt, purple square, sequence marked below) from the 5'-end to the 3'-end, the left arm random region (N26, 26nt), and the pre-loop (L12, 12nt ) region, the right arm random region (N26, 26nt), and the primer B binding region (20nt, dark green square, the sequence is marked below).
- b is the reverse transcribed single-stranded cDNA template sequence used for offset PCR amplification. The dark blue and purple shaded sequences represent the partial binding regions to primer B and primer A respectively.
- c is the sequence composition of the dsDNA library of the present invention used for sequencing.
- An RNA sequence consisting of a fixed-sequence pre-looped (L12) region and two random 26nt (N26) regions is reverse complementary to dsDNA.
- d shows two versions of the offset PCR multiplex primer combinations.
- the forward primer (“Frw”) includes the sequencing 5'-end adapter sequence (gray shading), the 0-6nt compensation sequence (orange letters) and the partial primer B region sequence (dark green letters), while the reverse primer (“Rev”) includes the sequencing 3'-end adapter sequence (green shading), the 0-6nt compensation sequence (orange letters), and the partial primer A region sequence (purple letters).
- the version 1 (V1) compensation primer of this design is 2 nt longer than the version 2 (V2) compensation primer.
- e is a PhiX sequence specially designed and customized by the present invention to balance the uneven distribution of bases.
- Custom PhiX includes sequencing adapter sequences at the 5' and 3' ends, with random nucleotides (indicated by “N”). Nucleotides shaded in light blue are used to compensate for base bias.
- f is the Bioanalyzer electrophoretic analysis of the offset PCR and Illumina PCR products. The x-axis represents the length of dsDNA, while the y-axis corresponds to the fluorophore signal intensity.
- V1 and V2 are the same as in Figure d, while “52.5C”, “51C” and “68.5C” represent PCR annealing temperatures.
- g is the compensation sequence distribution map of the sequencing library.
- the x-axis represents the length of the compensation sequence identified from the raw reads, while the y-axis represents the percentage of compensation sequences in the library with the specified length.
- “RLA”, “RLB” and “RLC” are the abbreviated names of the "+None", “tRNA” and “cRNA” background blocking systems used in the RNA library of the present invention.
- the dashed line represents the average distribution of the ideal compensation sequence percentage (14.28%, 100/7).
- h is the percentage distribution diagram after processing the corresponding data of the library sequence.
- the x-axis represents the process from the original sequence number ("raw") to the compensation sequence pruning ("after_offset") to the establishment of the pre-loop area (“after_bridge”) to the unknown sequence signal processing ("after_N”), while the y-axis represents The percentage of sequences that passed this process.
- the name abbreviation is the same as the f diagram.
- the dashed line represents the 94% percentage.
- Figure 3 shows a comparison of the system gradient reproducibility of enriched ligands (SGRELI) of the present invention with that of the prior art.
- the x-axis represents the order of the sublibrary, while the y-axis is the relative sequence abundance in eligible sequences per million (RPM).
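- The relative abundance unit used on the y-axis (eligible sequences per million, RPM) can be computed as in the following sketch, assuming raw counts per sequence within one sublibrary; the function name and the example counts are hypothetical.

```python
def reads_per_million(counts):
    """Scale raw counts of one sublibrary to reads per million (RPM).

    counts: dict mapping sequence -> raw count of eligible reads.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sublibrary contains no eligible reads")
    return {seq: n * 1_000_000 / total for seq, n in counts.items()}


rpm = reads_per_million({"seq_x": 1, "seq_y": 3})
```

- Scaling each sublibrary to the same total makes abundances comparable between eluates that were sequenced to different depths.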
- b is the average Pearson similarity coefficient between the subsequences of the enriched sequences from the inventive/prior art sublibrary and the corresponding subsequences from the Dope directed random library validation data set.
- the abbreviated names, arrows, and x-axis of the sublibraries are the same as in Figure 2a.
- the y-axis represents the average Pearson correlation coefficient value, and the gray dotted line represents the reference value of 0.6.
- Solid lines of different colors represent the number of top-enriched sequences used for the subsequence calculation (“t1k” represents the first 1000, “t10k” the first 10000, “t100k” the first 100000, and “all” all sequences), while the points on each line represent the average Pearson correlation coefficient against each validation data set, calculated for the specified sub-eluent group using the specified number of enriched sequences.
- the first line of analysis is based on a subsequence length of 6nt ("n6” in the legend), while the second line uses 10nt ("n10" in the legend).
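- The subsequence comparison in panel b can be illustrated with a minimal sketch. It assumes that subsequences are counted as overlapping windows of fixed length (6nt or 10nt) and that the two libraries are compared by the Pearson correlation of those counts over the union of observed subsequences. The function names and toy sequences are hypothetical, and the actual analysis may weight or filter subsequences differently.

```python
from collections import Counter
from math import sqrt


def subsequence_counts(sequences, k):
    """Count every overlapping subsequence of length k."""
    counts = Counter()
    for seq in sequences:
        for start in range(len(seq) - k + 1):
            counts[seq[start:start + k]] += 1
    return counts


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


def subsequence_similarity(library_a, library_b, k=6):
    """Correlate subsequence counts over the union of observed k-mers."""
    counts_a = subsequence_counts(library_a, k)
    counts_b = subsequence_counts(library_b, k)
    keys = sorted(set(counts_a) | set(counts_b))
    return pearson([counts_a[key] for key in keys],
                   [counts_b[key] for key in keys])


toy_library = ["ACGUACGUAC", "ACGUACGGGG"]
self_similarity = subsequence_similarity(toy_library, toy_library, k=6)
```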
- Figure 4 shows the maximum threshold and trend characteristics of SGRELI in the present invention.
- a shows the abundance correlation of the sub-features (6nt) of the first 10,000 sequences enriched in the sub-library of the present invention/prior art and the sub-library of the Dope directed random verification data set.
- the x-axis represents the log 10-transformed relative abundance of the sub-features of the inventive/conventional techniques, while the y-axis represents the relative abundance of the Dope validation data set.
- the blue points represent the correlation calculated based on the seq A directed verification library, while the red and green points represent the correlations based on the seq B and seq C directed random libraries respectively.
- Figure 5 shows the application of the high-affinity silicon rhodamine (SiR) RNA aptamers screened by the present invention to turn-on fluorescence imaging in cells.
- a shows the overlapping intersection of the top 25 RNA aptamers in three SiR libraries. The percentage is the proportion of RNA aptamers that bind silicon rhodamine and activate its fluorescence (turn-on).
- b shows the percentage of overlap among the three SiR libraries based on the number of top-ranked aptamers.
- the x-axis represents the number of RNA aptamers for each library selected during the analysis, while the y-axis is the percentage.
- the green line represents the percentage of RNA aptamers present in all three libraries, while orange and blue represent occurrences in only two libraries and a single library, respectively.
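- The overlap analysis in panel b can be sketched as follows. This is an illustration only: it assumes each library is given as a list of sequences ranked from best to worst and that percentages are taken over the union of the three top-N sets, which the description does not state. The function name and toy identifiers are hypothetical.

```python
def overlap_percentages(ranked_a, ranked_b, ranked_c, top_n):
    """Percentage of top-N aptamers found in one, two or all three libraries.

    Returns a dict {1: pct, 2: pct, 3: pct}. The denominator is the union
    of the three top-N sets (an assumption of this sketch).
    """
    tops = [set(ranked[:top_n]) for ranked in (ranked_a, ranked_b, ranked_c)]
    union = set().union(*tops)
    if not union:
        raise ValueError("no sequences selected")
    tally = {1: 0, 2: 0, 3: 0}
    for seq in union:
        tally[sum(seq in top for top in tops)] += 1
    return {k: 100.0 * v / len(union) for k, v in tally.items()}


shares = overlap_percentages(
    ["s1", "s2", "s3"],
    ["s1", "s2", "s4"],
    ["s1", "s5", "s6"],
    top_n=3,
)
```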
- c shows the KD curves of the top-ranked high-affinity aptamers RLB2 and RLB15. 50nM SiR-PEG2-NH2 probe was incubated with different concentrations of RNA aptamer. The y-axis represents the measured relative fluorescence intensity.
- d shows the turn-on fluorescence intensity fold of the top 25 RNA aptamers in the RLB library. The x-axis represents the top “N” RNA aptamers (in descending order, red columns), the pure dye reference (blue column) and pure buffer (gray column).
- the height of the column represents the change in fluorescence intensity, with the signal intensity change of the pure dye as the unit (shown as a dotted line). Error bars are mean ± standard deviation.
- e shows that the library of the present invention activates higher fluorescence multiples than the top-ranked RNA aptamers of the prior art library.
- f shows the sorting distribution of RNA aptamers with SiR turned on and off in the library of the present invention.
- the y-axis represents the rank distribution of RNA aptamers corresponding to the library.
- g shows that the RLB aptamer has a higher fluorescence quantum yield than SiRA and is more suitable for RNA molecular imaging of living HEK293T cells. With 200nM SiR-PEG-NH2 dye, the target-cell RNA is shown in red, and nuclear regions are shown in blue with Hoechst dye.
- Figure 6 shows the characteristic details of screening silicon rhodamine (SiR) RNA aptamers by the present invention.
- a shows that SiR with large gamma value exhibits relatively high screening accuracy.
- the F1 score is calculated by comparing the top-ranked sequences in the inventive library (predicted “binding”) and the rest (predicted “non-binding”) against the top-ranked sequences in the directed random Dope validation library (actual “binding”) and the rest (actual “non-binding”).
- for the baseline boundary value range 0-0.25, the highest F1 separating the true “binding” and true “non-binding” buckets (used to adjust enrichment trends) and the ruler sequence percentage (used to determine where to reweight) are plotted on a heat map.
- the maximum F1 for all given combinations is shown in the heat map as a white box and labeled with a number, while the F1 score based on gamma1.0 is shown in red.
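- The F1 evaluation described for panel a can be sketched as follows. This is a minimal illustration, assuming both libraries are given as lists ranked from best to worst and that the same top fraction defines “binding” in each; sequences outside that fraction, or absent from a list, count as “non-binding”. The function name, the cutoff handling and the toy identifiers are hypothetical, and the baseline boundary, gamma and ruler sequence percentage are not modeled.

```python
def f1_top_ranked(predicted_ranked, reference_ranked, top_fraction):
    """F1 score of predicted binders against reference binders.

    The top `top_fraction` of each ranked list is labeled "binding".
    """
    def top(ranked):
        # Keep at least one sequence so that the score is defined.
        cutoff = max(1, int(len(ranked) * top_fraction))
        return set(ranked[:cutoff])

    predicted, actual = top(predicted_ranked), top(reference_ranked)
    true_pos = len(predicted & actual)
    false_pos = len(predicted - actual)
    false_neg = len(actual - predicted)
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


score = f1_top_ranked(["a", "b", "c", "d"], ["a", "c", "b", "d"], 0.5)
```

- Repeating this calculation over a grid of cutoffs and ranking parameters would produce the kind of heat map described for panel a.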
- b shows, over the range of sorting baseline boundaries, the details of how changing gamma values affects SiR ranking accuracy at a fixed scale sequence percentage (0.01).
- the top row represents the effect of the entire baseline range, while the bottom row represents the most valuable high ranking baseline range (0-0.02).
- Different gamma values are represented in the figure with different colors and line formats.
- the Sankey plot in c shows the enrichment route for SiR aptamers with the highest AUC.
- the colored bucket "A" represents the rank that is the top 0.01% of the specified sublibrary.
- the maximum auc value is scaled to 1.0.
- the bar height represents the number of sequences within the specified auc range.
- e shows the increasing trend of the normalized ranking factor values for three high-affinity SiR aptamers.
- the x-axis represents the gamma correction of the description
- the ratio change (gf) of the subclass is the same as in Figure 1a.
- the colored line represents the enrichment trend of the corresponding library, and the points on the line represent the ranking factor value of the specified subclass.
- the specific value of AUC is used in the legend.
- Figure 7 shows the high-affinity novel coronavirus replicase RNA aptamer screened by the present invention and its application in inhibiting RNA replication.
- a shows the overlapping intersection of the top 25 RNA aptamers in three libraries of the new coronavirus replicase (nsp12). The percentage in the figure is the ratio of the number of high-affinity RNA aptamers.
- b shows the same experimental and analytical procedures as Figure 5b, but using the nsp12 library as the data source.
- c shows the binding details between the RNA aptamer of the invention and nsp12 protein, characterized using biolayer interferometry (BLI). The association process (left) and the dissociation process (right) are separated by vertical dashed lines.
- Curves in different colors represent the use of corresponding RNA concentrations as measurement conditions.
- d shows the KD values of 77 randomly selected NLB aptamers of the present invention and their ranking statistics in the library.
- Horizontal and vertical histograms represent the distribution of library ranking and KD values, respectively.
- the numbers near the dots indicate their order in the library.
- the horizontal dashed line indicates a KD of 10 nM, while the gray percentage values indicate the proportion of aptamers above and below this boundary.
- f and g show that adding the 3' end-blocked RNA aptamer of the present invention to the nsp12/7/8 (RdRp) replicase complex species can effectively inhibit extension.
- The concentration ratio of RNA template:aptamer:nsp12/7/8 is 2.5 μM:1.25 μM:1.25 μM.
- RNA was separated in denaturing PAGE electrophoresis and visualized (panel g). Corresponding statistical values from three independent replicate experiments are shown in panel f. Error bars represent mean ± standard deviation.
- h and i show the visualization of reaction kinetics (panel h) and quantification (panel i) of the inhibitory effect of the 3'-end blocked aptamer of the invention. Error bars represent mean ± standard deviation.
- Figure 8 shows the characteristic details of the screening RNA aptamer of the present invention applied to the new crown replicase nsp12 and the inhibitory effect of 3'-end modification on the polymerase.
- a shows that nsp12 with small gamma values has relatively high screening accuracy.
- the analysis process is the same as Figure 6a, but the nsp12 library is used for evaluation, and the last round of sequence abundance of the nsp12 traditional library is used as a reference verification.
- the analysis process in b is the same as that in Figure 6b, but the nsp12 library is used for evaluation.
- the last round of sequence abundance of the nsp12 traditional library is used as a reference for verification, and the analysis baseline boundary range is 0-0.2.
- The concentration ratio of RNA template:aptamer:nsp12/7/8 is 2.5 μM:5 μM:2.5 μM.
- RNA was isolated and visualized as in Figure 7g.
- blocking the 3'-end of the RNA aptamer of the present invention increases the inhibitory effect on RdRp elongation activity. ratio of inducers.
- concentration ratios of RNA template (2.5 ⁇ M) and aptamer after addition of original (“o") or blocked treatment ("b") were used as 2, 1, 0.5, 0.25 and 0.125 respectively.
- the concentration of nsp12/7/8 is 2.5 ⁇ M.
- RNA was isolated and visualized as in Figure 7g.
- Figure 9 shows the screening effect of the present invention on chemically fluorinated modified RNA aptamers (five-carbon sugar 2'-end) applied to type I AIDS (HIV-1) reverse transcription replicase inhibition in vitro.
- a shows the inhibition effect of RNA aptamers obtained from three different screening libraries when added to the HIV-1 replicase system.
- the libraries include: library RT-F (“RTF”; a single library whose RNA contains 30 random sequences (“30F”), with fluorinated modifications on C and U); library RT-FF (“RT-FF”; a mixed library of RNA containing 30 random sequences (“30F”) and RNA containing 20 random sequences (“20F”), both with fluorinated modifications on C and U); and library RT-FN (“30FN”; a mixed library of the 30F and 20F libraries in which only the 30F library RNA uses fluorinated modifications on C and U, while the 20F library has no fluorinated modifications).
- ubique is an RNA aptamer that is highly repeated in the library, and the reference sequence 70N89 is a published nucleic acid aptamer that inhibits HIV-1 replicase.
- Cy3 labeling at the 5'-terminus of DNA is used for instrumental measurement and visualization.
- the concentration ratio of DNA template: DNA primer: RNA aptamer: HIV replicase (p66) is 100nM: 100nM: 10nM: 60nM.
- the percentage rate of RNA aptamer inhibition of replication is a visualization of panel a. The ratio is calculated as the percentage of the unextended band brightness as a percentage of the total extended and unextended brightness, and then divided into the positive control (“PC”) and negative control (“-Enzyme”) for normalization, and three groups independently repeated the experiment.
- The inventors conducted extensive and in-depth research on methods for screening RNA aptamers. During this exploration, they found that collecting every elution preserves all information about the RNA aptamers, allowing all "background" wash information to be combined to determine the system gradient reproducibility of the enriched ligands (SGRELI).
- the method of the present invention can achieve a low false positive rate, screened aptamers with strong binding ability, short library preparation time, can perform only one round of enrichment, high library preparation reproducibility, and is suitable for automated robotic arms and other technical effects. On this basis, the present invention was completed.
- RNA aptamer and "RNA ligand” used herein have the same meaning and refer to RNA substances that can interact with various biological and chemical targets to regulate their functions. Similar to antibodies, artificially screened short single-stranded RNA aptamers specifically recognize and bind targets by folding into specific three-dimensional structures.
- RNA aptamers in this field mainly relies on iterative and repeated screening and enrichment of RNA aptamers in RNA libraries.
- This technology has promoted the wide range of applications of RNA aptamers, such as live cell super-resolution RNA imaging, SARS-CoV-2 RNA detection and spike protein blocking, etc.
- researchers typically need to conduct 8-16 iterative rounds of screening followed by extensive Sanger sequencing, which can take anywhere from weeks to months.
- Although second-generation high-throughput sequencing (NGS), capillary electrophoresis and microfluidic chip separation have reduced the number of iterations of RNA screening and improved binding specificity, a method for rapidly screening high-affinity RNA aptamers in a single round is still lacking.
- The inventors developed a method for rapid screening of high-affinity RNA aptamers and applied it to fluorescent silicon rhodamine (0.6kDa) for live cell RNA imaging; it was also applied to SARS-CoV-2 polymerase nsp12 (~110kDa) to inhibit replication by the RNA-dependent RNA polymerase (RdRp). After 5 hours of wet experimental screening of RNA aptamers, an NGS library of 11 gradient washed RNA solutions was established within one day, and sequencing and mathematical modeling analysis were completed.
- the method of the present invention combines all "background" wash information to determine the system gradient reproducibility of its enriched ligands (SGRELI).
- the RNA aptamers screened by the method of the present invention are high-affinity, effective and reproducible. As verified by silicon rhodamine-activated fluorescence and by SARS-CoV-2 RNA polymerase activity, they can be successfully applied to the functional regulation of targets.
- the method of the present invention combines wet and dry experiments.
- the end-to-end method of the present invention consists of two parts: wet experiment and dry experiment (Fig. 1a, b).
- ~2×10^14 random RNA molecules (Figure 2a)
- Biotin-labeled target molecules or His-labeled target molecules are captured by streptavidin and Ni-NTA magnetic beads, respectively.
- This study compared three types of magnetic bead blocking systems in parallel experiments: Type A (no blocking), Type B (tRNA blocking) and Type C (blocking with RNA of known binding capacity). Within 1 hour, RNAs with different binding and dissociation abilities were harvested as 11 groups by increasing the washing pressure (Fig. 1a). The selected RNA from each group was reverse transcribed into single-stranded cDNA (Fig. 2b).
- This cDNA carries constant-region sequences at its 5'- and 3'-ends and in the middle pre-loop region, which serve mainly as PCR primer binding sites. However, every sequencing cluster then produces the same fluorescence signal at the same base position, which in Illumina high-throughput sequencing causes a large area of single-color overexposure that masks other signals.
- offset PCR was designed here to randomly insert a 0-6nt compensation sequence between the sequencing adapter and the cDNA constant region (Figure 2c). Therefore, the sequences derived from the bases derived from 7 cDNAs of different lengths were correspondingly translated and the base composition was balanced ( Figures 2d and 2f).
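The effect of the 0-6 nt compensation sequence can be illustrated with a minimal sketch. The constant-region sequence below is hypothetical (the actual primer and constant sequences are not reproduced here), and the function only counts how many different constant-region bases are read in the same sequencing cycle across the offset variants.

```python
# Hypothetical 20-nt constant region, used only for illustration.
CONSTANT = "GATCGGAAGAGCACACGTCT"


def bases_seen_per_cycle(constant, offsets):
    """For each sequencing cycle, collect the constant-region bases
    that are read across all offset variants of the library."""
    offsets = list(offsets)
    max_offset = max(offsets)
    seen = []
    # Only consider cycles in which every variant is already reading
    # inside the constant region (cycle >= largest offset).
    for cycle in range(max_offset, len(constant)):
        seen.append({constant[cycle - k] for k in offsets})
    return seen


no_offset = bases_seen_per_cycle(CONSTANT, [0])
with_offset = bases_seen_per_cycle(CONSTANT, range(7))  # 0-6 nt offsets

print("distinct bases per cycle, no offset :", [len(s) for s in no_offset])
print("distinct bases per cycle, 0-6 offset:", [len(s) for s in with_offset])
```

Without an offset every cycle in the constant region shows a single base, while mixing the 7 offset variants exposes several different bases in the same cycle, which is the balancing effect described above.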
- SGRELI makes the method of the present invention more advantageous than conventional methods.
- a conventional procedure involves multiple washes of non-specifically bound ligands and then collecting the final eluate as a collection of the most strongly bound ligands. Unlike conventional procedures, the method of the present invention collects information from all eluates, which would normally be discarded and ignored in conventional procedures.
- this work generated 7 SiR-based deep sequencing libraries: one conventional library (RC) derived from previous work; three libraries of the present invention using magnetic bead blocking systems of type A (RLA), type B (RLB) and type C (RLC); and three validation libraries based on directed random generation from the known SiR-binding aptamer sequences seqA, seqB and seqC (Table 1).
- RNA ligands harvested from the elution step may not be the most ideal enriched set of ligands, doped with more background.
- the inventors used the validation library as a reference to compare the top enrichment sub-characteristics of the sequences of the inventive method and the conventional method. As the selection groups or rounds increase, the similarity between those sequences with the highest enrichment in the method of the present invention and the conventional method also gradually increases ( Figure 3b and Figure 4a). It is worth noting that the peak of sub-feature similarity for the inventive method occurs before the final elution step and is higher than that of the conventional method library, regardless of the length of the sub-feature ( Figure 3b and Figure 4c).
- the inventors of the present invention studied the effect of the RNA silicon rhodamine aptamer obtained by the method of the present invention on turning on the dye fluorescence signal.
- this work analyzed the SGRELI ranking model using 380 combinations of the hyperparameter gamma value and the scale quantile in the three SiR libraries of the present invention. Taking the validation library as a reference and comparing the rank order of these sequences in the libraries of the invention and in the conventional library, the best F1 scores were ordered RLC (61) > RLB (59) ⁇ RLA (51) > RC (36), and models with a medium gamma value (4) predicted the rank order more accurately (Fig. 6a). In addition, tuning the scale quantile has only a small impact on prediction accuracy.
- the inventor further analyzed the prediction accuracy of hundreds of different buckets, ranging from (top abundance ranking 0.00025-0.25). Likewise, within the top small scale, the accuracy of the predicted rank order is the same as the best f1 score order at the global scale (Fig. 6b).
- the default gamma value (1.0) was used to analyze the libraries, and it performed well in both the inventive method and the conventional method. Comparing the enrichment routes of the top sequences in the libraries of the method of the present invention, the top-ranked sequences were completely covered first by the RLA library, followed by the RLB and RLC libraries (Fig. 6c).
- Due to limited manpower, usually only a small proportion of aptamers are selected for downstream validation, so it is important to understand the performance of the top-ranked aptamers in all three libraries of the invention.
- 100% of the top 25 aptamers of RLB bound to SiR and activated fluorescence, compared with 52% and 68% for RLA and RLC, respectively (Fig. 3a).
- Most aptamers of RLB were also found in RLA and RLC (Fig. 5b), with similar ranking order (Fig. 5f).
- aptamers can also bind to SiR without activating dye fluorescence. Since the main goal of this application is to screen for aptamers that can activate fluorescence, aptamers that do not activate dye fluorescence are not analyzed separately.
- RLB is the most abundant aptamer library that activates fluorescence signals, followed by RLC, RLA, and RC ( Figure 5e).
- aptamers with opening capabilities are ranked higher than aptamers without opening capabilities.
- the RLB aptamers in the top 1000 may contain more other applicable aptamers.
- RLB aptamers were further used for live cell RNA imaging. It showed a higher and cleaner signal than SiRA (Fig. 5g).
- the method of the present invention yields high-quality aptamers that activate SiR and can be applied to live-cell RNA imaging.
- the inventor also verified the ability of the RNA new coronavirus replicase aptamer obtained by the method of the present invention to inhibit enzyme activity.
- the new coronavirus uses the RdRp complex to replicate the RNA genome and transcribe genes. Rather than blocking the spike protein to combat viral infection, viral replication may be disabled using high-affinity aptamers that compete with the RdRp's natural substrates.
- nsp12 libraries of the present invention were generated using the type A (NLA), type B (NLB) and type C (NLC) magnetic bead blocking systems.
- the top-ranked aptamer NLB2 showed a KD of 827 pM, while the lower-ranked aptamer NLB113 even had a KD of 32 pM (Figure 7c and Figure 8f). In contrast, neither the RNA background control without nsp12 nor the random-RNA background control with nsp12 showed any obvious signal (Fig. 8g).
- 86 of the top 200 mixtures were randomly selected from the NLB library. More than 90% of them were found to have strong binding ability to nsp12, and more than 10% of these binding ligands Half of the KDs were below 10 nM (Fig. 7d).
- These high-affinity aptamers ranked higher in the libraries of the present invention than in conventional libraries (Fig. 7e). This shows that the method of the present invention can enrich high-quality aptamers more effectively than traditional methods.
- NLB2 showed higher inhibitory effect than other aptamers (Fig. 8h).
- Figure 8i oxidatively blocking the 3'-end of RNA aptamers can significantly enhance their inhibitory effect.
- NLB30 has a lower inhibitory effect before blocking the 3'-end but a higher inhibitory effect after blocking. This change in blocking effect suggests that RNA aptamers may bind to nsp12 and/or RdRp in different regions throughout the complex, but that the main inhibitory function is partially dependent on the RNA 3'-end.
- When the RNA aptamer is present in fewer molecules than the nsp12 protein, the complete inhibitory effect is reduced. This suggests that the 3'-end blocked RNA aptamer interacts with nsp12 in a 1:1 ratio. Furthermore, multiple RNA aptamers (such as NLB30) inhibited the viral replication elongation reaction by more than 98%, whereas control RNA did not (Figures 7f and 7g). Moreover, this strong inhibitory effect was consistent over time (Figures 7h and 7i). Therefore, based on the inhibitory effect of RNA on nsp12, the method of the present invention can quickly screen high-affinity aptamers to inhibit the replication of the rapidly mutating SARS-CoV-2.
- the present invention also verifies unconventional RNA aptamer screening.
- the present invention establishes three different screening libraries, including a single-length library ("RT-F") and mixed-length libraries ("RT-FF", "RT-FN").
- the top 12 RNA aptamers screened from these libraries can effectively inhibit the replication efficiency of HIV-1 reverse transcriptase ( Figure 9).
- Other aptamers containing the reported "CGGG" pairing inhibitory functional domain also have inhibitory effects. Therefore, this method has reliable screening effect for RNA chemical modification.
- the RNA aptamers of the invention may comprise chemically modified sequences.
- the chemically modified sequence is a fluorine modified sequence.
- the present invention provides a method for screening RNA aptamers, which method includes the following steps:
- step 2) Incubate the library in step 1) with the solid carrier bound to the target, thereby promoting the full binding of the RNA aptamer in the library to the target;
- step 3) Use a buffer gradient to elute the RNA aptamers bound to the target on the solid support in step 2), and collect the eluate of each elution separately;
- step 7) Amplify the cDNA obtained in step 6) and perform high-throughput sequencing to obtain sequencing data;
- step 8) Analyze the sequencing data obtained in step 7) and sort the RNA aptamer candidate sequences from high to low according to their binding potential, thereby obtaining RNA aptamer sequences with high binding affinity.
- RNA aptamer libraries used for screening can be libraries from various sources, including but not limited to self-prepared, commercially purchased, or RNA aptamer libraries obtained through gifts from others.
- the solid carrier used in the method of the present invention can be various solid carriers well known to those skilled in the art, including but not limited to: magnetic beads, matrix, etc.
- the matrix includes but is not limited to: agarose gel matrix, cephalosporin beads, nitrocellulose, polyvinylidene fluoride membrane, octyltrehalose and other carrier matrices.
- the method of the present invention is suitable for screening RNA aptamers that bind to various targets.
- the target may be a small molecule, including but not limited to: steroids, dopamine, kanamycin, digoxin, antoxin, dinitroaniline, melamine, quinolone, aflatoxin ;
- the target can also be a macromolecule, including but not limited to: polypeptides, proteins (such as enzymes and antibodies, etc.) and complexes (proteins combined with RNA), high molecular polymers and compounds, etc.
- the gradient elution can be achieved by using an increasing volume of buffer or a buffer with increasing elution strength; preferably, using an increasing volume of buffer solution to achieve gradient elution.
- the buffer with increased elution strength refers to a buffer that increases the concentration of salt ions or chelating agents to prevent RNA from folding to form a spatial structure.
- background elution can be performed several times before gradient elution until the number of RNA aptamer molecules contained in the eluate is no more than 1% of the high-throughput sequencing threshold.
- the volume of buffer used for background elution should be no greater than the initial volume of buffer used for gradient elution.
- the elution may be static elution (i.e., discontinuous elution, collecting the entire eluate at once) or dynamic elution (i.e., continuous elution, continuously collecting small portions of the eluate), preferably static elution.
- buffer can be used to soak magnetic beads to achieve static elution; or buffer can be used to flow through a solid carrier, which is similar to column chromatography, to achieve dynamic elution.
- the last background elution is performed in a new vessel.
- the volume of buffer for background elution may or may not be increased, preferably not increased.
- the buffer for background elution and the buffer for gradient elution may be the same or different; preferably they are the same.
- the buffer used for gradient elution contains magnesium ions, preferably 5mM magnesium ions, a pH value below 8.5, preferably a pH value of 7-8, and a NaCl or KCl concentration between 75mM and 200mM.
- the number of background elutions can be determined based on the actual size of the library and the subsequent sequencing throughput. For example, if the number of molecules in the RNA aptamer library to be screened is 10^14 and the NextSeq sequencing throughput is approximately 10^7 sequences per set of eluates, the number of molecules needs to be reduced to 1% of the sequencing throughput or below (if 10^7 molecules are sequenced to obtain 10^7 sequences, an average of 1 sequence per molecule is obtained, whereas if ⁇ 10^5 molecules are sequenced to obtain 10^7 sequences, an average of 100 sequences per molecule is obtained, which reduces the error caused by random sampling).
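The arithmetic in this example can be checked with a short sketch. The function names are illustrative only; the numbers (10^7 reads per set of eluates, a 1% target) are taken from the paragraph above.

```python
def molecule_target(reads_per_eluate, percent=1):
    """Molecule count equal to `percent` % of the read throughput."""
    return reads_per_eluate * percent // 100


def mean_reads_per_molecule(reads_per_eluate, molecules):
    """Average number of sequencing reads obtained per input molecule."""
    return reads_per_eluate / molecules


READS = 10**7  # approximate NextSeq reads per set of eluates (from the text)

for molecules in (10**7, molecule_target(READS)):
    depth = mean_reads_per_molecule(READS, molecules)
    print(f"{molecules:>10d} molecules -> {depth:g} reads per molecule")
```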
- the number of RNA aptamer molecules contained in the eluate is suitable for sequencing, preferably so that the theoretical minimum number of molecules in the library is reduced to less than 10 5 , and complete elution is performed.
- the RNA aptamer bound to the target on the solid support is completely eluted.
- the predetermined numbers of background elutions and gradient elutions are each increased by 1-2.
- the advantage of the present invention is that no RNA aptamer information in the library is lost. Therefore, the method of the present invention not only retains the eluates obtained from the above background elution and gradient elution, but also performs a complete elution after gradient elution, thereby completely eluting the RNA aptamers bound to the target on the solid carrier.
- the buffer used for complete elution contains reagents capable of releasing the RNA aptamer, including reagents capable of destroying the binding of the target to the solid support, and/or reagents capable of destroying the binding of the RNA aptamer to the target, and/ or agents that directly destroy the target.
- the buffer used for complete elution contains a reagent that separates the small molecule target from the solid support, such as DTT, and EDTA for isolation of RNA aptamers from small molecule targets.
- the target is a macromolecule, such as a protein
- a reagent that binds the protein such as a protease
- protease it is necessary to pay attention that the presence of the protease cannot have an adverse effect on the subsequent reverse transcriptase.
- the RNA aptamer in the obtained eluate can be concentrated and purified. Whether to concentrate and purify the RNA aptamer in the eluate obtained can be decided by technicians based on specific needs. Without concentration and purification, the instrument can be directly used to automatically perform reverse transcription reaction on the RNA aptamer in the eluent, but the solution volume is large and reagents are wasted. If the RNA aptamer in the eluate is concentrated and purified, reagents can be saved, but the concentration time will be several hours.
- the inventor designed offset PCR to randomly insert a 0-6 nt compensation sequence between the sequencing adapter and the cDNA constant region.
- the amplified DNA will still have an unbalanced base distribution at a small number of positions (for example, in the 5'-end fixed sequence the 10th base lacks A and the 11th to 13th bases also all lack A, as shown in the V1 sequencing 5'-end starting sequence in Figure 2d (marked with dark green and orange letters); the 3'-end fixed sequence lacks G at positions 13-14; and positions 8-12 of the middle pre-loop region lack T). Therefore, random sequences of similar length to the library average need to be added to the sequencing library, with the bases that are missing from the library fixed at the unbalanced positions (for example, at the 5'-end, random N for bases 1-7, G for base 8 to avoid fluorescence sequencing signals of 6 consecutive identical bases, and fixed base A for positions 9-13; at the 3'-end, random N for bases 1-7 and C, the complementary base of G, at position 13). In short, if bases are still missing at some positions after adding the offset, a custom sequence is used to supply the missing bases. For positions where no base is missing, random N can be used in the custom sequence.
- PhiX was designed to further compensate for the unbalanced base distribution in constant regions
- the method of the present invention considers the binding potential of RNA aptamer candidate sequences for sorting when performing sequencing data analysis, thereby obtaining RNA aptamer sequences with high binding affinity.
- the binding potential refers to the fastest increasing degree of enrichment in each eluate, rather than the highest degree of enrichment in a single consideration.
- the binding potential is judged according to one or more of the following information of the RNA aptamer: the abundance of the RNA aptamer in each eluate, the individual presence of the RNA aptamer in each eluent.
- RNA aptamer appears in the subsequent eluate is better than that in the initial eluate.
- the above information is integrated and fitted into a standard curve to evaluate the binding potential of the RNA aptamer according to the area under the curve (AUC).
- the present invention provides an equipment for implementing the method of the present invention. Based on the teachings of the present invention, those skilled in the art can know how to construct equipment for implementing the method of the present invention.
- the device may include modules for performing steps in the method of the invention.
- the RNA aptamers screened using the method of the present invention can be made into various products.
- the RNA aptamers screened using the method of the present invention can be made into biochips.
- the RNA aptamers screened using the method of the present invention can be made into pharmaceutical compositions.
- When the RNA aptamer screened by the method of the present invention has high binding affinity to a specific recognition receptor on the cell surface, the RNA aptamer of the present invention can be connected to a liposome, so that drugs in the liposome can be delivered specifically to designated cells. Therefore, the RNA aptamers screened by the method of the present invention can be made into drug delivery carriers.
- the drug delivery vehicles are liposomes.
- RNA aptamers screened using the method of the present invention can also be made into diagnostic reagents.
- the false positive rate of the library is low (as low as 0%);
- the aptamers screened from the library have strong binding ability (KD of 25 nM for the small molecule SiR and 32 pM for the macromolecule nsp12);
- Library preparation can be further optimized and suitable for automated robotic arms (96 samples).
- the template (5' end-3' end) is composed of the primer A binding region (19nt) [Famulok, M. Molecular Recognition of Amino Acids by RNA-Aptamers: An L-Citrulline Binding RNA Motif and Its Evolution into an L-Arginine Binder.J Am Chem Soc 116,1698-1706,doi:10.1021/ja00084a010(2002)], left arm random region (26nt), pre-looped fixed region (12nt)[ Davis,J.H.&Szostak,J.W.Isolation of high-affinity GTP aptamers from partially structured RNA libraries.Proc Natl Acad Sci USA 99,11616-11621,doi:10.1073/pnas.182095699(2002)], right arm random region (26nt), Primer B binding region (20nt) [Famulok, M.Molecular Recognition of Amino Acids by RNA-Aptamers: An L-Citrulline Binding RNA Mo
- the PCR reaction was first heated to 94°C (3 minutes), and then performed for 6 cycles (each cycle included denaturation at 94°C for 1 minute, annealing at 52°C for 1 minute, and extension at 72°C for 3 minutes), and the final extension step (72°C, 20 minutes ), cool to 4°C.
- RNA was obtained by performing P/C/I purification and two chloroform extractions, followed by isopropyl alcohol precipitation for 1 hour at -20°C and two 75% ethanol washes. Then dissolve the RNA in dH 2 O and store at -20°C or -80°C for long-term storage.
- the preparation process of directional random RNA library is similar to that of random RNA library preparation except for the synthesis of ssDNA template and the process of PCR amplification and purification of dsDNA.
- For amplification of dsDNA, PCR was performed with 50 nM ssDNA as template in 1X Taq buffer together with 5 μM Primer A, 5 μM Primer B, 500 μM dNTPs, 1.5 mM MgCl2 and 0.05 U/μL Taq DNA polymerase in a total volume of 1 mL. The PCR process is the same as for random RNA library preparation.
- PCR products were purified using QIAquick PCR purification kit (QIAGEN). In vitro transcription reactions were performed with 400 nM purified dsDNA in a total volume of 2 mL. After RNA was purified by PAGE, it was dissolved in dH 2 O and stored under the same storage conditions as above.
- RNA, yeast-tRNA (Invitrogen) and competitor RNA from random or directed random libraries were incubated at 75°C for 5 minutes, then slowly cooled to 4°C at a rate of 0.1°C/s, and then placed on ice.
- Hydrophilic streptavidin magnetic beads (New England Biolabs) were collected using a 6-tube magnetic separation rack (New England Biolabs), washed and equilibrated 5 times with 4 bead volumes of 1X ASB buffer (20 mM HEPES pH 7.4, 125 mM KCl, 5 mM MgCl2), and then resuspended in 0.5 bead volume of 1X ASB buffer.
- use a magnetic rack to collect the magnetic beads, remove the supernatant, and then wash the RNA-bound magnetic beads 4 times with 200 μL of 1X ASB buffer, resuspending the beads each time, placing them on the magnetic rack for 30 seconds, and collecting each wash separately. Then resuspend the beads in 200 μL of 1X ASB buffer, transfer them to a new 1.5 mL centrifuge tube, collect the beads on the magnetic rack, and collect the fifth wash. Continue with 5 washes using 250 μL, 300 μL, 350 μL, 400 μL and 450 μL of 1X ASB buffer, again collecting each wash separately, for a subtotal of 10 washes.
- the magnetic beads were first incubated with 200 μL of 50 mM DTT solution at 25°C for 20 minutes with shaking at 650 rpm. This eluate was collected and combined with the subsequent second eluate as the 11th wash. For the second elution, the beads were incubated with 100 μL of 50 mM DTT and 5 mM EDTA for 5 minutes at 25°C and 650 rpm.
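The wash and elution steps above produce the 11 groups that are later sequenced. A minimal sketch of that schedule follows; the labels "background", "gradient" and "complete" reflect one reading of the protocol (washes 1-5 at constant volume, washes 6-10 with increasing volume, and the two pooled DTT eluates as group 11), and the dictionary layout is purely illustrative.

```python
def elution_schedule():
    """Return the 11 collected groups with their buffer volumes in microliters."""
    groups = []
    # Groups 1-5: constant-volume washes (the 5th is collected after
    # transferring the beads to a new tube).
    for group in range(1, 6):
        groups.append({"group": group, "kind": "background", "volume_ul": 200})
    # Groups 6-10: washes with increasing buffer volume (250-450 uL).
    for group, volume in enumerate(range(250, 451, 50), start=6):
        groups.append({"group": group, "kind": "gradient", "volume_ul": volume})
    # Group 11: complete elution, two eluates pooled (200 uL + 100 uL).
    groups.append({"group": 11, "kind": "complete", "volume_ul": 200 + 100})
    return groups


schedule = elution_schedule()
for entry in schedule:
    print(entry["group"], entry["kind"], entry["volume_ul"])
```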
- RNA that binds to macromolecules taking His-tagged novel coronavirus replicase as an example
- HisPur™ Ni-NTA magnetic beads were collected using a 6-tube magnetic separation rack (New England Biolabs), washed and equilibrated 5 times with 4 bead volumes of 1X ERB buffer (100 mM NaCl, 20 mM Na-HEPES pH 7.5, 5% (v/v) glycerol, 10 mM MgCl2 and, optionally, 0.5 mM β-mercaptoethanol), and then resuspended in 0.3 bead volume of 1X ERB buffer.
- use a magnetic rack to collect the magnetic beads, remove the supernatant, and then wash the RNA-bound magnetic beads 4 times with 200 μL of 1X ERB buffer, resuspending the beads each time, placing them on the magnetic rack for 30 seconds, and collecting each wash separately. Then resuspend the beads in 200 μL of 1X ERB buffer, transfer them to a new 1.5 mL centrifuge tube, collect the beads on the magnetic rack, and collect the fifth wash. Continue with 5 washes using 250 μL, 300 μL, 350 μL, 400 μL and 450 μL of 1X ASB buffer, again collecting each wash separately, for a subtotal of 10 washes.
- the sample was first incubated in 400 μL of 1X ERB buffer containing 0.1 U/μL Proteinase K (New England Biolabs) and 2 mM CaCl2 at 37°C for 45 minutes, with occasional flicking of the tube to mix. This eluate was collected and combined with the subsequent second eluate as the 11th wash.
- For the second elution, use 100 μL of 1X ERB buffer, incubate at room temperature for 1 minute, and then collect using a magnetic rack. The combined eluate was brought to a volume of 500 μL, P/C/I extraction and purification were performed twice, and the supernatant was recovered.
- For reverse transcription in a final volume of 20 μL, combine 2.6 μM RNA with 0.5 μM primer B, 0.5 mM dNTPs and dH2O, and incubate at 65°C for 5 minutes. Immediately afterwards, cool the mixture on ice for 2 minutes. Then add 1X SSIV buffer (Thermo Fisher Scientific), 5 mM DTT and 10 U/μL SuperScript IV reverse transcriptase (Thermo Fisher Scientific), and incubate at 53°C for 1 hour.
- the PCR reaction was first heated to 94°C (3 minutes), and then performed for 11 cycles (each cycle included denaturation at 94°C for 1 minute, annealing at 52.5°C for 1 minute, and extension at 72°C for 2 minutes), and the final extension step (72°C, 5 minutes ), cool to 4°C.
- Sequencing primers and index sequences were further added to the dsDNA by PCR, using ⁇ 4.5 nM offset-PCR dsDNA template, 500 nM sequencing universal primer (New England Biolabs), 500 nM sequencing index primer (New England Biolabs), 200 nM dNTPs, 1X Q5 Reaction buffer (New England Biolabs), 0.02 U/μL Hot Start Q5 High-Fidelity DNA Polymerase (New England Biolabs), and dH2O in a total volume of 50 μL.
- the PCR reaction was first heated to 98°C (40 seconds), and then performed for 6 cycles (each cycle included denaturation at 98°C for 10 seconds, annealing at 68.5°C for 20 seconds, and extension at 72°C for 30 seconds), and the final extension step (72°C, 2 minutes ), cool to 4°C.
- step 5 dissolve dsDNA with 15 μL dH2O and store at -20°C.
- the raw sequencing data is decoded and classified into corresponding sample sequences.
- zero mismatch shall prevail, and at the same time, low phred quality sequences shall be filtered.
- the compensation sequences (7 types, 0-6nt) at the 5'-end of the sequence were pruned, and the balanced distribution of the compensation sequences was calculated.
- the pruned sequence further removes the sequence corresponding to primer B, retains the 67nt source sequence, and performs reverse complementation on the source sequence to adjust it back to a DNA sequence consistent with the original RNA sequence.
- background data with no compensation sequence, no primer B sequence, more than 25 consecutive identical bases, and a trimmed length less than 65 nt are removed.
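The trimming and filtering steps above can be sketched as follows. This is an illustration under stated assumptions, not the inventors' pipeline: `PRIMER_B` is a hypothetical 20-nt sequence standing in for primer B as it appears in the read, the read is assumed to be laid out as compensation sequence (0-6 nt), primer B, then the source sequence, and Phred-quality filtering is left out.

```python
PRIMER_B = "CTGAGCTTGACGCATTGCAC"  # hypothetical primer B sequence (20 nt)
SOURCE_LEN = 67                    # length of the retained source sequence
MAX_OFFSET = 6                     # compensation sequences are 0-6 nt long
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse-complement a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_run(seq):
    """Length of the longest stretch of consecutive identical bases."""
    best = run = 0
    previous = None
    for base in seq:
        run = run + 1 if base == previous else 1
        previous = base
        best = max(best, run)
    return best


def trim_read(read):
    """Return the source sequence in the orientation of the original RNA,
    or None if the read is treated as background."""
    if longest_run(read) > 25:
        return None
    # Locate primer B (zero mismatches) after a 0-6 nt compensation sequence.
    for offset in range(MAX_OFFSET + 1):
        if read.startswith(PRIMER_B, offset):
            start = offset + len(PRIMER_B)
            source = read[start:start + SOURCE_LEN]
            if len(source) < 65:
                return None
            return reverse_complement(source)
    return None  # no compensation sequence / primer B found


original = "GGA" + "ACGT" * 16  # 67-nt toy source sequence
read = "TAC" + PRIMER_B + reverse_complement(original)
print(trim_read(read) == original)
```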
- each row represents an independent and exclusive sequence statistics in a library.
- groups 1-11, in order, g1-11 each row represents an independent and exclusive sequence statistics in a library.
- the parent sequences that have more than 4 edit distances in the fixed area of the pre-looping region compared with the theoretical pre-looping sequence are removed from the merged data frame.
- the parent sequences containing unknown "N" in the random areas of the left and right arms are removed.
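The two filters above (edit distance in the pre-looped region, unknown bases in the random arms) can be sketched as below. The 12-nt `PRELOOP` sequence is hypothetical, and the fixed slice positions assume that the 67-nt sequence consists of 3 primer A residue bases, a 26-nt left arm, the 12-nt pre-looped region and a 26-nt right arm; that layout is inferred from the lengths given elsewhere in the description and ignores position shifts caused by insertions or deletions.

```python
def edit_distance(a, b):
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + (char_a != char_b)  # substitution or match
            ))
        previous = current
    return previous[-1]


PRELOOP = "CTGCTTCGGCAG"  # hypothetical theoretical pre-looped sequence
LEFT_ARM = slice(3, 29)   # assumed layout: 3 + 26 + 12 + 26 = 67 nt
LOOP = slice(29, 41)
RIGHT_ARM = slice(41, 67)


def keep_sequence(seq, max_distance=4):
    """Keep a sequence unless an arm contains N or the pre-looped region
    differs from the theoretical one by more than `max_distance` edits."""
    if "N" in seq[LEFT_ARM] or "N" in seq[RIGHT_ARM]:
        return False
    return edit_distance(seq[LOOP], PRELOOP) <= max_distance


good = "GGA" + "A" * 26 + PRELOOP + "C" * 26
print(keep_sequence(good))
```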
- the merged database storing the normalized sequences is compressed.
- the 1st column is the sequencing sequence itself
- the 2nd column is the change ratio of group 2 (f2), that is, the abundance of the sequence in group 2 divided by the abundance of the same sequence in group 1
- the third column is the change ratio of group 3 (f3), that is, the abundance of the sequence in group 3 is divided by the corresponding abundance of the sequence in group 1, and so on.
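The fold change table described above can be sketched as follows. Two points are assumptions made for this illustration: abundance is taken to be the read count divided by the total reads of the group, and only sequences observed in group 1 are kept so that the division is always defined.

```python
def fold_changes(counts_by_group):
    """Build {sequence: [f2, f3, ...]} from per-group read counts.

    counts_by_group is a list of dicts {sequence: read count}, ordered as
    group 1, group 2, ... Every group is assumed to contain at least one read.
    """
    totals = [sum(group.values()) for group in counts_by_group]
    first_group, first_total = counts_by_group[0], totals[0]
    table = {}
    for seq, first_count in first_group.items():
        first_abundance = first_count / first_total
        table[seq] = [
            (group.get(seq, 0) / total) / first_abundance
            for group, total in zip(counts_by_group[1:], totals[1:])
        ]
    return table


toy_counts = [
    {"AAAA": 10, "CCCC": 90},  # group 1
    {"AAAA": 50, "CCCC": 50},  # group 2
    {"AAAA": 90, "CCCC": 10},  # group 3
]
toy_table = fold_changes(toy_counts)
print(toy_table)
```

In this toy input "AAAA" is enriched in the later groups (its ratios rise above 1), while "CCCC" is depleted.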
- in the fold change database, the change ratios of each group are arranged from high to low, and the sequence located at the 1% position (the scale quantile) is selected as the ruler sequence of that group. A gamma trend line is then set (constant c * interval ratio gamma - 0.0000001); with the default gamma value of 1, it generates the 10 target ratios 0.1, 0.2, 0.3, ..., 0.9 and 1. As a weighting step, the ratios of each group are rescaled so that the ratio of the group's ruler sequence equals the corresponding target ratio. For example, if the original ratio of the ruler sequence of the first group is 5, all ratios of the first group are divided by 50; if the original ratio of the ruler sequence of the second group is 7, all ratios of the second group are divided by 35; and so on.
- each sequence corresponds to 10 gamma change rates (gf) in 10 sets of change ratios, with 1-10 as the abscissa, and the corresponding gamma change rates as the ordinate, and the area under the curve (auc) is calculated.
- the binding ability of the sequence is predicted based on its AUC value. The larger the value, the stronger the potential binding ability, and vice versa.
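The scaling and area-under-the-curve steps can be sketched as below. Several details are assumptions made for this illustration: gamma is treated as an exponent on the interval ratio k/10 (which reproduces the targets 0.1 to 1 for gamma = 1 and c = 1), the ruler is taken as the ratio at the given quantile of the descending-sorted ratios, a group whose ruler ratio is not positive is left unscaled, and the area is computed with the trapezoid rule over the abscissa 1 to 10.

```python
def gamma_targets(n_groups=10, gamma=1.0, c=1.0, eps=1e-7):
    """Target ratios on the gamma trend line, one per group of change ratios."""
    return [c * (k / n_groups) ** gamma - eps for k in range(1, n_groups + 1)]


def ruler_value(ratios, quantile=0.01):
    """Ratio found at the scale quantile after sorting from high to low."""
    ordered = sorted(ratios, reverse=True)
    index = min(int(quantile * len(ordered)), len(ordered) - 1)
    return ordered[index]


def sgreli_auc(fold_table, gamma=1.0, quantile=0.01):
    """Score each sequence by the area under its rescaled change-ratio curve."""
    sequences = list(fold_table)
    n_groups = len(fold_table[sequences[0]])
    targets = gamma_targets(n_groups, gamma)
    divisors = []
    for k in range(n_groups):
        ruler = ruler_value([fold_table[s][k] for s in sequences], quantile)
        # Example from the text: ruler 5 with target 0.1 gives a divisor of 50.
        divisors.append(ruler / targets[k] if ruler > 0 else 1.0)
    scores = {}
    for seq in sequences:
        gf = [fold_table[seq][k] / divisors[k] for k in range(n_groups)]
        # Trapezoid rule with unit spacing on the abscissa 1..n_groups.
        scores[seq] = sum((gf[i] + gf[i + 1]) / 2 for i in range(n_groups - 1))
    return scores


toy = {"rising": [float(k) for k in range(1, 11)], "flat": [1.0] * 10}
scores = sgreli_auc(toy)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking, scores)
```

A sequence whose change ratio keeps rising across the groups receives a larger area than one whose ratio stays flat, which matches the ranking rule stated above (larger AUC, stronger predicted binding).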
- the hyperparameters gamma and scale quantile can be adjusted according to the distribution and ratios of the original abundances of some potential strong-binding sequences in each group and their enrichment routes, or further optimized according to the Pearson correlation coefficient of the subsequences of the strong-binding sequences. For small molecules, gamma ⁇ 1 is recommended, and for large molecules, gamma ⁇ 1 is recommended. A scale quantile of 1% is recommended. For subsequence analysis, first select several strong-binding sequences in the merged data frame, split each sequence into a left-arm random region, a pre-looped region and a right-arm random region, and further remove the primer A residue bases from the left-arm random region.
- NDCG normalized discounted cumulative gain
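For reference, the standard definition of NDCG can be written as a short function. How relevance values are assigned to aptamer sequences is not stated here, so the relevance lists below are purely illustrative.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of relevances listed in predicted rank order."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


print(ndcg([3, 2, 1]))  # ordering already ideal
print(ndcg([1, 2, 3]))  # worst ordering of the same items
```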
- the dissociation constant (KD) of the RNA aptamer for silicon rhodamine was determined from a series of fluorescence intensities measured on a JASCO fluorometer at different RNA concentrations. Briefly, the RNA ligand was structurally folded according to step 2.1, and the RNA was then mixed with 50 nM SiR-PEG2-NH2 probe in 1X ASB buffer in a fluorometer cuvette at 25°C, and the fluorescence intensity was recorded as the RNA concentration was increased to the specified values. The excitation and emission wavelengths were set to 647 nm and 662 nm respectively, and the excitation and emission slit widths were set to ⁇ 5nm. For data calculations, dissociation constants were determined by fitting the binding curves with the Hill equation.
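A dissociation constant can be estimated from such a titration by fitting the Hill equation. The sketch below uses only the standard library: it scans a logarithmic grid of candidate KD values and, for each one, solves the maximal signal in closed form by least squares. The grid, the fixed Hill coefficient of 1 and the synthetic data points are assumptions made for illustration and are not measurements from this work.

```python
def hill(conc, fmax, kd, n=1.0):
    """Hill equation: signal as a function of ligand concentration."""
    return fmax * conc ** n / (kd ** n + conc ** n)


def fit_kd(concs, signals, kd_grid, n=1.0):
    """Grid search over KD; Fmax is the least-squares optimum for each KD."""
    best = None
    for kd in kd_grid:
        x = [c ** n / (kd ** n + c ** n) for c in concs]
        denom = sum(v * v for v in x)
        if denom == 0:
            continue
        fmax = sum(s * v for s, v in zip(signals, x)) / denom
        sse = sum((s - fmax * v) ** 2 for s, v in zip(signals, x))
        if best is None or sse < best[0]:
            best = (sse, kd, fmax)
    return best[1], best[2]


# Synthetic, noise-free titration generated with KD = 25 nM and Fmax = 100.
concs = [1, 3, 10, 30, 100, 300, 1000]                # nM
signals = [hill(c, 100.0, 25.0) for c in concs]
kd_grid = [10 ** (i / 50) for i in range(0, 201)]     # 1 nM to 10 uM
kd_fit, fmax_fit = fit_kd(concs, signals, kd_grid)
print(f"KD ~ {kd_fit:.1f} nM, Fmax ~ {fmax_fit:.1f}")
```

For real data a nonlinear least-squares routine that also fits the Hill coefficient would normally replace the grid search.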
- HEK293T human embryonic kidney-derived cells 293
- DMEM Dulbecco's Modified Eagle's medium
- FBS fetal bovine serum
- Some of the activated cells were inoculated into 300 ⁇ L of culture medium and transferred to an 8-well glass chamber coated with poly-D-lysine for overnight growth.
- the cells were then transfected with an appropriate amount of expression plasmid using FuGENE HD transfection reagent (Promega) according to standard methods. After 48 hours, the culture medium was replaced with Leibovitz's L-15 medium containing 200 nM SiR-PEG-NH2. Cells were imaged at 37°C, and corresponding visual adjustments were made.
- Before dissociation constant measurement, Ni-NTA (NTA) biosensors (Sartorius) were pre-incubated in 1X ERBL buffer (20 mM Tris-HCl pH 7.4, 100 mM KCl, 5% (v/v) glycerol, 10 mM Mg(OAc)2, 1 mM TCEP, 0.02% TWEEN 20 (Carl Roth)) for 5 minutes. The RNA in each well was serially diluted 2-fold from the specified concentration in 1X ERBL buffer, and sample wells containing 1X ERBL buffer without RNA were set as blank controls.
- the entire assay consists of a 60-second baseline-1 step, a 180-240 second protein loading step, a 60-second baseline-2 step, a 900-1800 second binding step, and a 600-3600 second dissociation step.
- the measured data were analyzed with the Octet data analysis software. Briefly, data were preprocessed by reference control subtraction, Y-axis alignment based on baseline averages, inter-step correction at the dissociation step, and Savitzky-Golay filtering. A 1:1 binding model was used, and the dissociation constants were then calculated by the corresponding fitting method (local or global). Ni-NTA biosensors are reusable.
- the biosensor was regenerated by three cycles of a wash step with 10 mM glycine (pH 1.7) (10 sec) and neutralization with 1X ERB buffer (10 sec), followed by regeneration with 10 mM NiCl2 (70 sec) and a wash with 1X ERB buffer (60 sec); all of the above steps were performed with shaking at 1000 rpm.
- RNA aptamer was added to a 200 μL reaction solution containing 10 mM NaOAc pH 4.5, 50 mM freshly prepared NaIO4 and dH2O, and incubated at room temperature for 2 minutes. Then 10% (v/v) ethylene glycol was added, mixed repeatedly, and left at room temperature for 5 minutes to quench the oxidation reaction. To the quenched reaction were further added 222 mM Tris-HCl pH 8.9, 0.15 M NaOAc pH 5.5, 2 μL glycogen (Thermo Fisher Scientific) and 50% (v/v) isopropanol. The reaction mixture was incubated for an additional 30 minutes at room temperature.
- RNA was precipitated by centrifugation (16000g, 20 min, 4°C) and washed twice with 75% EtOH. Dissolve RNA in dH 2 O and store at -20°C or -80°C for long-term storage.
- RNA aptamer sequence The relevant sequences of the sequencing library of the present invention, the RNA aptamer sequence and the control sequence are shown in Table 1 and Table 2 respectively.
- RLB7 KD 250nM
- RLB15 KD 194nM
- RLB3 KD 208nM
- RLB4 KD 195nM
- RLB8 KD 700nM
- RLB12 KD 370nM
- RLB13 KD 461nM
- seqB (also RLB108) KD 25nM
- NLB113, NLB41, NLB30, NLB79, NLB34, NLB69, NLB32, NLB2, NLB58, and NLB5 showed excellent affinity.
- aptamer in addition to requiring complex instrumentation and fabrication techniques, one limitation is that when an aptamer binds to a target with a smaller molecular weight than itself, it does not produce sufficient mobility shift signal to distinguish the binding characteristics. Similarly, it is not easy to optimize the selection conditions (such as bead aggregation, microbubbles, RNA stability) of the microfluidic chip separation system for different binding targets.
- computational prediction of aptamer binding mainly uses subsequence and substructure information of RNA sequences, but this type of data-driven analysis depends heavily on data from manually chosen selection rounds and on the quality of the traditional screening experiments.
- the RNA aptamer screening method developed in this invention, which applies to both small molecules and macromolecules, takes only a few hours. It offers direct single-round RNA screening, requires no demanding instrumentation, and includes efficient construction of deep-sequencing libraries.
- the method of the present invention can be used for end-to-end analysis and is extremely easy to use.
- the characteristics of SGRELI maximize the useful information generated during the selection process. High-affinity RNA aptamers can be observed multiple times and have a tendency to be gradient sorted, so the predicted binding aptamers are low in false positives.
- the SiR RNA aptamer screened by the method of the present invention has a better KD and fluorescence activation ability. Compared with SiRA, the best aptamer reported to date, it increases specificity while reducing background in live-cell RNA imaging. Additionally, sequence length and composition can be further optimized based on structural interaction information.
- the nsp12 RNA aptamer with a pM KD obtained by the method of the present invention offers prospects for application in inhibiting replication by the SARS-CoV-2 polymerase. Additional 3'-end blocking modification of the aptamer is necessary to completely inhibit polymerase extension.
- the aptamer obtained in the present invention can achieve the same inhibitory effect at the same concentration of RdRp polymerase, and only requires one thousandth of the working concentration of remdesivir.
- the RNA aptamer with a 3'-end structure entering and occupying the catalytic center of the replicase complex may be the key to inhibiting viral replication.
- the present invention is applied to screening chemically modified RNA aptamers, and the obtained aptamers effectively inhibit HIV-1 reverse transcription replicase, further expanding the applicable scope of screening of the present invention. Further optimization of the method of the present invention can use machine learning and feature engineering (such as substructure, subsequence) to predict binding affinity, in addition to using automated robots for high-throughput screening.
- the present invention highlights a method with SGRELI characteristics for rapid screening of RNA aptamers, and provides a theoretical basis for developing functionalized RNA aptamers that activate chemical dyes and inhibit SARS-CoV-2 polymerase and HIV-1 reverse transcriptase.
- RNA aptamer sequences of the present invention and control sequences (fC and fU in a sequence denote fluorinated nucleotides, i.e. C and U are fluorinated, as in 2'-F-CTP and 2'-F-UTP)
Abstract
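The present invention provides a method for screening aptamers from an RNA library. The method collects the eluate of every elution and combines this with a specific elution procedure. As a result, no information on any RNA aptamer is lost and the false-positive rate is low; the selected aptamers bind strongly, library preparation is fast, a single round of enrichment can suffice, library preparation is highly reproducible, and the workflow is suited to automated robotic arms.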
Description
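The present invention relates to the field of biotechnology. Specifically, it relates to a method for screening RNA aptamers.
Over the past 30 years, several technologies have further optimized and shortened the screening of RNA aptamers: next-generation high-throughput sequencing (essentially independent of the selection process, providing only sequence information), microfluidic chips (precision equipment, with empirical parameters tuned manually by specialists), capillary electrophoresis (precision equipment, suited only to aptamers that bind macromolecules), and bioinformatic models of subsequences and structures (data-driven, still dependent on data quality, with high false-positive rates). A fast, efficient, and general screening technique is nevertheless still lacking.
In the face of the current global SARS-CoV-2 pandemic, RNA drug candidates can serve as both vaccines and therapeutics; they are highly specific and safe, little affected by viral mutation, quick to develop, and cheap to produce. Moreover, for senile dementia, which is becoming more prominent as the population ages, RNA drug candidates also offer dynamic up- and down-regulation, strong targeting, safety, and intelligent precision medicine.
However, existing RNA aptamer screening techniques still have clear limitations: high false-positive rates; selected aptamers with suboptimal binding; high time cost, often 10-16 rounds of repeated selection over 2-6 months; poor reproducibility; and the need for manual operation. These shortcomings constrain both the screening of RNA aptamers and their subsequent application.
The field therefore urgently needs a new RNA aptamer screening method that overcomes the shortcomings of the prior art.
Summary of the Invention
The object of the present invention is to provide a method for screening RNA aptamers that lowers the false-positive rate; optimizes the selection conditions and strengthens selection; shortens the experiment and reduces time cost; improves the selection process so that it is reproducible; and simplifies the experimental operations so that they can be automated.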
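In a first aspect, the present invention provides a method for screening RNA aptamers, the method comprising the following steps:
1) providing an RNA aptamer library to be screened;
2) incubating the library of step 1) with a solid support to which a target is bound, so that the RNA aptamers in the library bind the target sufficiently;
3) eluting, with a buffer gradient, the RNA aptamers bound to the target on the solid support in step 2), and collecting the eluate of each elution separately;
4) completely eluting the RNA aptamers that remain on the solid support after step 3), and collecting this eluate as the final eluate group;
5) optionally concentrating and purifying the RNA aptamers in the eluates obtained in steps 3) and 4);
6) reverse-transcribing the RNA aptamers obtained in step 5) to obtain cDNA;
7) amplifying the cDNA obtained in step 6) and subjecting it to high-throughput sequencing to obtain sequencing data;
8) analyzing the sequencing data obtained in step 7) and ranking the candidate RNA aptamer sequences from high to low binding potential, thereby obtaining high-affinity RNA aptamer sequences.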
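In a preferred embodiment, providing the RNA aptamer library to be screened includes preparing the library in-house, purchasing it commercially, or receiving it as a gift from others.
In a specific embodiment, in step 2), after the RNA aptamers in the library have bound the target, the solid support may be blocked to control and reduce non-specific background binding.
In a specific embodiment, the blocking is performed by blocking the solid support with random RNA that is not target-specific, or with target-specific RNA.
In a preferred embodiment, in step 2), the solid support includes but is not limited to magnetic beads and matrices.
In a preferred embodiment, the matrix includes but is not limited to agarose gel matrix, Sepharose beads, nitrocellulose, polyvinylidene fluoride membrane, octyl-Sepharose, and similar support matrices.
In a preferred embodiment, in step 2), the target is a small molecule, including but not limited to steroids, dopamine, kanamycin, digoxin, 安托辛, dinitroaniline, melamine, quinolones, and aflatoxin; or a macromolecule, including but not limited to polypeptides, proteins (such as enzymes and antibodies), complexes (RNA-bound proteins), high-molecular-weight polymers, and compounds.
In a specific embodiment, in step 3), the gradient elution is performed with buffer of increasing volume or with buffer of increasing elution strength; elution with buffer of increasing volume is preferred.
In a preferred embodiment, buffer of increasing elution strength means a buffer that prevents RNA from folding into its spatial structure, for example by raising the concentration of salt ions or chelating agents.
In a specific embodiment, several background elutions are performed before the gradient elution, until the number of RNA aptamer molecules in the eluate is no more than 1% of the high-throughput sequencing threshold.
In a preferred embodiment, the buffer volume of a background elution should be no greater than the initial buffer volume of the gradient elution.
In a preferred embodiment, the elution may be static (discontinuous elution, with the whole eluate collected at once) or dynamic (continuous elution, with small fractions collected continuously); static elution is preferred.
In a preferred embodiment, if static elution is used, the last background elution is performed in a new container.
In a preferred embodiment, the buffer volume of the background elutions may or may not increase; preferably it does not.
In a preferred embodiment, the background elution buffer and the gradient elution buffer may be the same or different; preferably they are the same.
In a preferred embodiment, the gradient elution buffer contains magnesium ions, preferably 5 mM; its pH is below 8.5, preferably 7-8; and its NaCl or KCl concentration is between 75 mM and 200 mM.
In a specific embodiment, once several gradient elutions have brought the number of RNA aptamer molecules in the eluate to a level suitable for sequencing, preferably with the theoretical minimum number of molecules in the library reduced to below 10^5, a complete elution is performed in step 4) so that the RNA aptamers bound to the target on the solid support are completely eluted.
In a preferred embodiment, the buffer for complete elution contains a reagent capable of releasing the RNA aptamers, including a reagent that disrupts the binding of the target to the solid support, and/or a reagent that disrupts the binding of the RNA aptamer to the target, and/or a reagent that directly destroys the target.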
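In a specific embodiment, in step 7), a compensation (offset) sequence of 0-6 nt is inserted at random between the sequencing adapter and the constant region of the cDNA.
In a specific embodiment, in step 7), a custom-designed PhiX is also introduced when multiplexed samples are pooled, to further compensate for the unbalanced base distribution of the constant regions.
In a specific embodiment, in step 8), binding potential refers to a rapid increase in enrichment across successive eluates, rather than to the highest enrichment considered alone.
In a specific embodiment, binding potential is judged from one or more of the following: the abundance of the RNA aptamer in each eluate; the number of eluates in which the RNA aptamer is individually detected; and the preference for an RNA aptamer that appears in later eluates over one that appears in early eluates.
In a preferred embodiment, the above information is combined and fitted into a standard curve, and the binding potential of the RNA aptamer is judged by the area under the curve (AUC).
In a preferred embodiment, the RNA aptamer comprises a chemically modified sequence.
In a preferred embodiment, the chemically modified sequence is a fluorine-modified sequence.
In a second aspect, the present invention provides an RNA aptamer obtained by screening with the method of the first aspect.
In a preferred embodiment, the RNA aptamers include RNA aptamers whose sequence is known but which carry random modifications on different bases (e.g. A, U, G, C).
In a preferred embodiment, the RNA aptamers do not include conventional RNA aptamers whose sequence is known and which carry no additional modification.
In a preferred embodiment, the RNA aptamer comprises a chemically modified sequence, preferably a fluorine-modified sequence.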
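In a third aspect, the present invention provides an apparatus for running the method of the first aspect.
In a preferred embodiment, the apparatus comprises the following modules:
1) a preparation module, which prepares the RNA aptamer library to be screened;
2) an incubation module, which incubates the prepared library with a solid support (magnetic beads or a matrix) to which the target is bound;
3) an elution and collection module, which elutes with a buffer gradient the RNA aptamers bound to the target on the solid support and collects the eluate of each elution separately;
4) an optional concentration and purification module, which concentrates and purifies the RNA aptamers in the above eluates;
5) a reverse transcription module, which reverse-transcribes the above RNA aptamers to obtain cDNA;
6) an amplification and high-throughput sequencing module, which amplifies and sequences the cDNA obtained above to produce sequencing data;
7) an analysis module, which analyzes the sequencing data and ranks the candidate RNA aptamer sequences from high to low binding potential, thereby obtaining RNA aptamer sequences with high binding affinity.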
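In a fourth aspect, the present invention provides a biochip comprising the RNA aptamer of the second aspect.
In a fifth aspect, the present invention provides a method for manufacturing a biochip, comprising the following steps:
1) screening for an RNA aptamer with the method of the first aspect; and
2) preparing a biochip with the RNA aptamer obtained in step 1).
In a sixth aspect, the present invention provides a pharmaceutical composition comprising the RNA aptamer of the second aspect and a pharmaceutically acceptable excipient or drug delivery vehicle.
In a seventh aspect, the present invention provides a drug delivery vehicle to which the RNA aptamer of the second aspect is attached.
In a preferred embodiment, the drug delivery vehicle is a liposome.
In a specific embodiment, where a cell carries a specific recognition receptor on its surface, the RNA aptamer of the present invention can be attached or linked to a delivery vehicle (such as a nanoliposome), so that the drug encapsulated in the vehicle is delivered specifically to the designated cells.
In an eighth aspect, the present invention provides a diagnostic reagent comprising the RNA aptamer of the second aspect and the other auxiliary reagents required for diagnosis.
In a ninth aspect, the present invention provides the use of an RNA aptamer obtained by the method of the first aspect in the preparation of a biochip, a pharmaceutical composition, or a diagnostic reagent.
It should be understood that, within the scope of the present invention, the technical features described above and those described in detail below (for example in the Examples) can be combined with one another to form new or preferred technical solutions. For reasons of space, these are not listed one by one.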
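Figure 1 shows the construction of the RNA library of the present invention and the modelling analysis. Panel a shows RNA selection and sequencing-library construction. First, the random RNA pool is incubated with target molecules attached to magnetic beads. The mixture of RNA-bound target molecules is washed successively with increasing volumes of the same binding buffer, and the washes are collected as groups 1-10 (g1-10). The mixture is then subjected to a final elution with an appropriate chemical reagent or enzyme (group 11, g11), which completely releases the strongly bound RNA from the beads. The purified RNA of g1-11 is reverse-transcribed and amplified by offset PCR and Illumina PCR; finally, before sequencing, a custom PhiX is added to balance the base distribution of the library. Panel b shows how high-affinity aptamers are mined from the sequencing data. Raw reads are pre-processed to the 67 nt core region. After data cleaning, the sequences are counted and merged into a single data frame. The normalized values of the sequences in each non-initial group (g2-11 relative to g1) are re-weighted as fold ratios according to the default initial gamma baseline. For each sequence, the gamma-corrected fold changes (gf) across the ratio groups are summarized by the area under the curve (auc). The ranking model can be fine-tuned using the subsequence features of the top-ranked sequences and their enrichment routes across the groups; finally, the aptamer sequences with the highest auc values are selected for downstream functional applications.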
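Figure 2 shows the construction features of the RNA library of the present invention. a: sequence composition of the random RNA pool. From 5' to 3', the RNA sequence (103 nt) comprises the primer A binding region (19 nt, purple square, sequence given below), the left-arm random region (N26, 26 nt), the pre-formed loop region (L12, 12 nt), the right-arm random region (N26, 26 nt), and the primer B binding region (20 nt, dark green square, sequence given below). b: the reverse-transcribed single-stranded cDNA template used for offset PCR amplification. Sequences shaded dark blue and purple are the partial binding regions for primer B and primer A, respectively. c: sequence composition of the dsDNA library used for sequencing. The RNA sequence, consisting of the fixed pre-formed loop (L12) region and two random 26 nt (N26) regions, is the reverse complement of the dsDNA. d: the two versions of the offset PCR multiplex primer sets. The forward primers ("Frw") comprise the 5' sequencing adapter sequence (grey shading), a 0-6 nt offset sequence (orange letters) and part of the primer B region (dark green letters); the reverse primers ("Rev") comprise the 3' sequencing adapter sequence (green shading), a 0-6 nt offset sequence (orange letters), and part of the primer A region (purple letters). In this design the version 1 (V1) offset primers are 2 nt longer than the version 2 (V2) offset primers. e: the PhiX sequence custom-designed for the present invention to balance the uneven base distribution. The custom PhiX comprises the 5' and 3' sequencing adapter sequences and random nucleotides (denoted "N"). Nucleotides shaded light blue compensate for base bias. f: Bioanalyzer electrophoresis of the offset PCR and Illumina PCR products. The x-axis is dsDNA length and the y-axis is fluorescence signal intensity. The abbreviations "V1" and "V2" are as in panel d; "52.5C", "51C" and "68.5C" are PCR annealing temperatures. g: distribution of offset sequences in the sequencing libraries. The x-axis is the offset length identified in the raw reads and the y-axis is the percentage of the library with that offset length. The torch-shaped box plots are built from the offsets of each length in the 11 groups (n=11) of the present invention. "RLA", "RLB" and "RLC" abbreviate the RNA libraries of the present invention applied to silicon rhodamine with the "+None", "tRNA" and "cRNA" background blocking systems. Likewise, "NLA", "NLB" and "NLC" abbreviate the corresponding blocking systems applied to the SARS-CoV-2 replicase nsp12. The dashed line marks the ideal percentage for an even distribution (14.28%, 100/7). h: percentage of library sequences remaining after each data-processing step. The x-axis follows the process from the raw sequence count ("raw") through offset trimming ("after_offset") and pre-formed loop identification ("after_bridge") to the handling of unknown bases ("after_N"); the y-axis is the percentage of sequences passing each step. The torch-shaped box plots are built from the percentage of sequences in the 11 groups (n=11) that satisfy the selection criteria at each step. Abbreviations are as in panel g. The dashed line marks 94%.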
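Figure 3 compares the systematic gradient reproducibility of enriched ligands (SGRELI) of the present invention with the prior art. a: enrichment trajectories of three representative high-affinity aptamers in each sub-library (group (g) / round (r)) of the present invention and of the prior art. RLA, RLB and RLC abbreviate the libraries of the present invention, as in Fig. 2g, and RC abbreviates the conventional library. Lines of different colours are the enrichment trends of different aptamers, and the points on each line give the abundance of that aptamer in the indicated sub-library. Arrows mark the change in abundance from the second- or third-to-last sub-library to the last one. The x-axis is the order of the sub-libraries and the y-axis is relative sequence abundance (reads per million qualified sequences, RPM). b: mean Pearson correlation between the subsequences of sequences enriched in the sub-libraries of the present invention or the prior art and the subsequences of the corresponding doped (directed random) validation data sets. Sub-library abbreviations, arrows and x-axis are as in panel a. The y-axis is the mean Pearson correlation coefficient, and the grey dashed line marks the reference value 0.6. Solid lines of different colours correspond to the number of top-enriched sequences used to compute the subsequences ("t1k" for the top 1000, "t10k" for the top 10000, "t100k" for the top 100000, "all" for all sequences); each point is the mean Pearson correlation coefficient against each validation data set for the indicated eluate group and number of enriched sequences. The first row uses a subsequence length of 6 nt ("n6" in the legend) and the second row 10 nt ("n10").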
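Figure 4 shows the maximum threshold and trend characteristics of SGRELI in the present invention. a: correlation between the abundances of sub-features (6 nt) of the top 10000 enriched sequences in the sub-libraries of the present invention or the prior art and in the doped (directed random) validation data sets. The x-axis is the log10-transformed relative abundance of the sub-features in the present invention or the conventional technique, and the y-axis is the relative abundance in the doped validation data set. Blue points are correlations computed against the seq A directed validation library; red and green points are those against the seq B and seq C directed random libraries. The legend gives the Pearson correlation coefficient over all points between the present invention or the conventional technique and the corresponding doped pool ("pA" for the directed seq A library, "pB" and "pC" for the directed random seq B and seq C libraries). Sub-library ("gx" and "rx") and library ("RLX" and "RC") abbreviations are as in Fig. 3a. b: the same experiment and analysis as Fig. 3b, but with subsequence lengths of 5 nt, 7 nt, 8 nt and 9 nt. c: mean Pearson correlation coefficients for subsequences of different lengths among the top 10000 sequences of the sub-libraries of the present invention or the prior art and the doped validation data sets. Sub-library and library abbreviations, x-axis, y-axis and arrows are as in Fig. 3b. Lines of different colours are computed from the subsequences of the corresponding pool, and each point is the mean Pearson correlation coefficient between the indicated sub-library and each validation library. "x-gram" means that subsequences of x nt were used for the Pearson calculation. d: the gamma baselines. Lines of different colours (gamma>=1) and dashed lines (gamma<1) are the weighting baselines for each fold comparison (relative to group 1). The y-axis is the expected enrichment weight.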
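Figure 5 shows the use of the high-affinity silicon rhodamine RNA aptamers selected by the present invention for fluorescence-activating imaging in cells. a: overlap of the top 25 RNA aptamers from the three silicon rhodamine (SiR) libraries. The percentages give the fraction of RNA aptamers that bind SiR and activate its fluorescence ("turn-on"). b: percentage overlap among the three SiR libraries as a function of the number of top-ranked aptamers considered. The x-axis is the number of RNA aptamers taken from each library and the y-axis is the percentage. The green line is the percentage of aptamers present in all three libraries; orange and blue are those present in only two libraries or in a single library. c: KD curves of the top-ranked high-affinity aptamers RLB2 and RLB15. 50 nM SiR-PEG2-NH2 probe was incubated with RNA aptamer at different concentrations. The y-axis is the measured relative fluorescence intensity. d: fold turn-on of fluorescence for the top 25 RNA aptamers of the RLB library. The x-axis shows the top "N" RNA aptamers (in descending rank, red bars), the dye-only reference (blue bar) and buffer only (grey bar). Bar height is the fold change in fluorescence intensity, with the dye-only signal set to 1 (dashed line). Error bars are mean ± standard deviation. e: top-ranked RNA aptamers from the libraries of the present invention activate fluorescence more strongly than those from the prior-art library. Top-ranked aptamers from RLA (blue, n=25), RLB (orange, n=25) and RLC (green, n=25) are compared with the top aptamers from the conventional RC library (red, n=42) by relative fold change in fluorescence. f: rank distribution, in the libraries of the present invention, of RNA aptamers that turn SiR on or off. The y-axis is the rank of the RNA aptamer in the corresponding library. Purple boxes (n=20) and grey boxes (n=21) are the turn-on and turn-off RNA aptamer groups. g: RLB aptamers have a higher fluorescence quantum yield than SiRA and are better suited to RNA imaging in live HEK293T cells. With 200 nM SiR-PEG-NH2 dye, the target RNA in the cells is shown in red and the nuclear region, stained with Hoechst dye, in blue.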
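Figure 6 shows details of the selection of RNA silicon rhodamine aptamers by the present invention. a: for SiR, larger gamma values give relatively higher selection accuracy. The F1 score is computed by comparing the top-ranked sequences in the library of the present invention (predicted "binding") and the rest (predicted "non-binding") with the top-ranked sequences in the doped validation library (actual "binding") and the rest (actual "non-binding"). The highest F1 within the baseline boundary range (0-0.25, used to separate the true "binding" and true "non-binding" buckets) is plotted on a heat map against gamma (used to adjust the enrichment trend) and the ruler quantile (used to set the re-weighting position). The maximum F1 over all combinations is marked with a white box and a number, and the F1 score for gamma 1.0 is shown in red. b: detailed effect of gamma on SiR ranking accuracy across the ranking baseline boundary range, at a fixed ruler quantile (0.01). The top row covers the whole baseline range and the bottom row the most valuable high-rank range (0-0.02). Different gamma values are shown with different colours and line styles. c: Sankey diagrams of the enrichment routes of the SiR aptamers with the highest AUC. Coloured bucket "A" holds sequences ranked in the top 0.01% of the indicated sub-library, "B" the top 0.01-0.05%, "C" the top 0.05-0.1%, "D" the top 0.1-0.5%, "E" the top 0.5-1%, "F" the top 1-5%, "G" the top 5-10%, "H" the top 10-50%, "I" the top 50-100%, and "J" sequences absent from the current sub-library. Bucket size scales linearly with the number of sequences in that rank range, and flow size scales linearly with the number of sequences passing between nodes. d: distribution of gamma-corrected auc values for SiR. To allow comparison across the three SiR libraries, the maximum auc was scaled to 1.0. Bar height is the number of sequences in the indicated auc range. e: increasing trend of the normalized ranking-factor values for three high-affinity SiR aptamers. The x-axis gives the gamma-corrected fold-change (gf) sub-classes, as in Fig. 1a. Coloured lines are the enrichment trends in the corresponding libraries, and the points are the ranking-factor values for each sub-class. The legend gives the auc values.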
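Figure 7 shows the high-affinity SARS-CoV-2 replicase RNA aptamers selected by the present invention and their use in inhibiting RNA replication. a: overlap of the top 25 RNA aptamers from the three SARS-CoV-2 replicase (nsp12) libraries. The percentages give the fraction of high-affinity RNA aptamers. b: the same experiment and analysis as Fig. 5b, with the nsp12 libraries as data source. c: binding of RNA aptamers of the present invention to nsp12 protein characterized by bio-layer interferometry (BLI). The association phase (left) and dissociation phase (right) are separated by a vertical dashed line. Curves of different colours correspond to the RNA concentrations used. d: KD values of 77 randomly chosen NLB aptamers of the present invention and their library ranks. The horizontal and vertical histograms show the distributions of library rank and KD. Dark green points (n=17) are KD values from three independent replicates with multiple measurements; light green points (n=60) are from two independent replicates with single measurements. The number beside each point is its rank in the library. The horizontal dashed line marks a KD of 10 nM, and the grey percentages give the proportions of aptamers above and below it. e: rank distribution of the high-affinity RNA aptamers of Fig. 7d in the libraries of the present invention and in the conventional libraries. NCA (n=68), NCB (n=66) and NCC (n=66) are nsp12 libraries from 11 rounds of conventional selection, with the same blocking systems as the NLA (n=77), NLB (n=77) and NLC (n=77) libraries of the present invention. f and g: adding 3'-end-blocked RNA aptamers of the present invention to the nsp12/7/8 (RdRp) replicase complex effectively inhibits extension. The RNA template : aptamer : nsp12/7/8 concentration ratio was 2.5 μM : 1.25 μM : 1.25 μM. RNA was separated by denaturing PAGE and visualized (panel g). Statistics from three independent replicates are shown in panel f. Error bars are mean ± standard deviation. h and i: visualization (panel h) and quantification (panel i) of the kinetics of inhibition by 3'-end-blocked aptamers of the present invention. Error bars are mean ± standard deviation.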
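Figure 8 shows details of the selection of RNA aptamers against the SARS-CoV-2 replicase nsp12 and the effect of 3'-end modification on polymerase inhibition. a: for nsp12, smaller gamma values give relatively higher selection accuracy. The analysis is as in Fig. 6a, but evaluated on the nsp12 libraries, with the sequence abundance in the last round of the conventional nsp12 library as reference. b: the same analysis as Fig. 6b, evaluated on the nsp12 libraries with the same reference and a baseline boundary range of 0-0.2. c: the same analysis as Fig. 6c, evaluated on the nsp12 libraries. Coloured bucket "A" holds sequences ranked in the top 0.00001% of the indicated sub-library, "B" the top 0.0001-0.0005%, "C" the top 0.0005-0.001%, "D" the top 0.001-0.005%, "E" the top 0.005-0.01%, "F" the top 0.01-0.1%, "G" the top 0.1-1%, "H" the top 1-10%, "I" the top 10-100%, and "J" sequences absent from the current sub-library. d: the same analysis as Fig. 6d, for the nsp12 libraries. e: the same analysis as Fig. 6e, for the nsp12 RNA aptamers. f: BLI KD measurements at lower RNA aptamer concentrations agree with those at higher concentrations; the experiment and analysis are as in Fig. 5c, but with lower RNA concentrations. g: low background signal of the RNA aptamers in the BLI KD assay; the experiment and analysis are as in Fig. 5c, but with no protein added (left) or with a randomly chosen RNAI as reference (right). h: incubation of unmodified nsp12 RNA aptamers of the present invention with RdRp produces an observable inhibition of extension. The RNA template : aptamer : nsp12/7/8 concentration ratio was 2.5 μM : 5 μM : 2.5 μM. RNA separation and visualization are as in Fig. 7g. i: blocking the 3' end of the RNA aptamers of the present invention increases inhibition of RdRp extension activity. In the inhibition assay, the ratios of added original ("o") or blocked ("b") aptamer to RNA template (2.5 μM) were 2, 1, 0.5, 0.25 and 0.125, and the nsp12/7/8 concentration was 2.5 μM. RNA separation and visualization are as in Fig. 7g.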
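Figure 9 shows the in vitro inhibition of the HIV-1 reverse transcriptase by chemically fluorinated RNA aptamers (modified at the 2' position of the pentose) selected by the present invention. a: inhibition of replication when RNA aptamers obtained from three different selection libraries are added to the HIV-1 reverse transcriptase system. The libraries are RT-F ("RTF"; the RNA contains a central stretch of 30 random positions ("30F"); single library; C and U fluorinated), RT-FF ("RT-FF"; a mixed library of RNAs with a central stretch of 30 random positions ("30F") or 20 random positions ("20F"); C and U fluorinated in both), and RT-FN ("30FN"; a mixed library of templates with a central stretch of 30 random positions ("30F") and 20 random positions ("20F"); only the 30F RNA carries fluorinated C and U, and the 20F library is unmodified). "ubique" denotes RNA aptamers that recur at high frequency in all libraries, and the reference sequence 70N89 is a published nucleic acid aptamer that inhibits the HIV-1 reverse transcriptase. In the assay, a Cy3 label at the 5' end of the DNA is used for measurement and visualization. The DNA template : DNA primer : RNA aptamer : HIV reverse transcriptase (p66) concentration ratio was 100 nM : 100 nM : 10 nM : 60 nM. b: percentage inhibition of replication by the RNA aptamers, quantified from panel a as the intensity of the unextended band as a percentage of the total intensity of extended and unextended bands, then normalized to the positive control ("PC") and the negative control ("-Enzyme"); three independent replicates.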
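The inventors conducted extensive and in-depth research on methods for screening RNA aptamers and found that collecting the eluate of every elution preserves all information on the RNA aptamers, so that all of the "background" wash information can be combined to determine the systematic gradient reproducibility of enriched ligands (SGRELI). The method of the present invention achieves a low false-positive rate, strongly binding aptamers, short library preparation time, enrichment in a single round, highly reproducible library preparation, and suitability for automated robotic arms. The present invention was completed on this basis.
RNA aptamers and screening method
As used herein, the terms "RNA aptamer" and "RNA ligand" have the same meaning and refer to an RNA species that interacts with various biological and chemical targets and thereby modulates their function. Like antibodies, artificially selected short single-stranded RNA aptamers recognize and bind their targets specifically by folding into a particular three-dimensional structure.
The aptamer selection techniques commonly used in the field work mainly by iteratively selecting and enriching RNA aptamers from an RNA pool. This has driven wide application of RNA aptamers, for example in super-resolution RNA imaging in live cells, SARS-CoV-2 RNA detection, and blocking of the spike protein. However, researchers typically need 8-16 rounds of repeated selection followed by extensive Sanger sequencing, which takes from weeks to months. Although counter-selection, next-generation high-throughput sequencing (NGS), capillary electrophoresis, and microfluidic separation have reduced the number of iterations and improved binding specificity, a method that quickly yields high-affinity RNA aptamers in a single round is still lacking.
The inventors developed a method for rapid screening of high-affinity RNA aptamers and applied it to the fluorogenic dye silicon rhodamine (0.6 kDa) for live-cell RNA imaging, and to the SARS-CoV-2 polymerase nsp12 (~110 kDa) to inhibit replication by the RNA-dependent RNA polymerase (RdRp). After 5 hours of wet-lab selection, NGS libraries for the 11 gradient wash solutions were built, sequenced, and analyzed by mathematical ranking models within one day. Unlike other methods, which identify high-affinity aptamers from their abundance in the final enriched eluate, the method of the present invention combines all of the "background" wash information to determine the systematic gradient reproducibility of enriched ligands (SGRELI). The RNA aptamers obtained are high-affinity, effective and reproducible, and activation of silicon rhodamine fluorescence and inhibition of SARS-CoV-2 RNA polymerase activity confirm that they can be applied to functional regulation of their targets.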
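Specifically, the end-to-end method of the present invention consists of a wet-lab part and a dry-lab part (Fig. 1a, b). To ensure sufficient library diversity, ~2×10^14 random RNA molecules (Fig. 2a) were used for aptamer selection. Biotin-labelled or His-tagged target molecules were captured on streptavidin or Ni-NTA magnetic beads, respectively. To assess non-specific binding systematically, three bead blocking systems were compared in parallel: type A (no blocking), type B (tRNA blocking), and type C (blocking with a known binding RNA). Within 1 hour, RNAs with different binding and dissociation behaviour were harvested separately in 11 groups under increasing wash stringency (Fig. 1a). The RNA of each group was reverse-transcribed into single-stranded cDNA (Fig. 2b).
This cDNA carries constant sequences at its 5' and 3' ends and in the central pre-formed loop, mainly for PCR primer binding. As a result, every sequencing cluster emits the same fluorescence signal at the same base position, which in Illumina high-throughput sequencing causes large-area single-colour overexposure that masks other signals. To solve this base imbalance, offset PCR was designed: an offset sequence of 0-6 nt is inserted at random between the sequencing adapter and the cDNA constant region (Fig. 2c). Bases from cDNAs of 7 different lengths are therefore shifted relative to one another, and the base composition at each cycle is balanced (Fig. 2d and 2f).
The balanced dsDNAs are then joined to sequencing primers and sample indexes by PCR. When multiplexed samples are pooled, a custom-designed PhiX is also introduced to further compensate for the unbalanced base distribution of the constant regions (Fig. 2e and 2g).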
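The balancing effect of the 0-6 nt offsets can be illustrated with a small sketch. It is not the inventors' software: the constant region below is a made-up placeholder, and the bases of the offset spacer itself are ignored, so the sketch only shows how shifting one constant sequence by several lengths mixes different bases into the same sequencing cycle.

```python
def distinct_bases_per_cycle(constant, offsets, n_cycles):
    """Count how many different constant-region bases are read at each cycle.

    `offsets` lists the spacer lengths placed in front of the constant region.
    The spacer bases themselves are ignored here for simplicity.
    """
    diversity = []
    for cycle in range(n_cycles):
        seen = set()
        for off in offsets:
            pos = cycle - off  # constant-region position read at this cycle
            if 0 <= pos < len(constant):
                seen.add(constant[pos])
        diversity.append(len(seen))
    return diversity


# Hypothetical constant region, NOT the real primer sequence of the patent.
CONSTANT = "ACGTACGTACGT"

no_offset = distinct_bases_per_cycle(CONSTANT, [0], 8)
with_offset = distinct_bases_per_cycle(CONSTANT, range(7), 8)
print(no_offset)    # one base per cycle without offsets
print(with_offset)  # several bases per cycle once offsets are mixed
```

Without offsets every cluster reports the same base at a given cycle; with seven offset lengths, later cycles see a mixture. Positions where a base is still missing are what the custom PhiX is meant to fill.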
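Once sequencing is complete and the dry-lab part begins, the raw reads of each group are cleaned and merged into a high-quality data frame (Fig. 2h). Data transformation, ranking models and parameter tuning follow, and the candidate RNA aptamer sequences are finally ranked from high to low binding potential; the whole analysis takes about 30 minutes (Fig. 1b).
SGRELI gives the method of the present invention an advantage over conventional methods. A conventional procedure washes away non-specifically bound ligands several times and then collects the final eluate as the set of strongest binders. The method of the present invention, by contrast, collects information from all of the eluates, which a conventional procedure would discard and ignore. To compare the two approaches in enriching high-affinity RNA aptamers, 7 SiR-based deep-sequencing libraries were generated: one conventional library from earlier work (RC); three libraries of the present invention using bead blocking systems type A (RLA), type B (RLB) and type C (RLC); and three validation libraries generated by directed randomization of the known SiR-binding aptamers seqA, seqB and seqC (Table 1).
Across the 11 groups of a single round of the present method, the enrichment trends of aptamers seqA (KD 430 nM) and seqC (KD 1456 nM) were similar to those in the last 11 rounds of the conventional method, but seqB (KD 25 nM) was found specifically by the present method (Fig. 3a). This indicates that the present method is effective at finding high-affinity RNA aptamers. Consistently, in the last few eluate groups of the present method the abundance of all three aptamers first rose and then fell, although in theory the last group should show the strongest enrichment. This sudden drop suggests that the RNA ligands harvested in the elution step may not be the ideal set of enriched ligands and contain more background. To remove bias from small-sample analysis, the inventors used the validation libraries as reference and compared the top enriched sub-features of sequences from the present method and the conventional method. As the selection groups or rounds progressed, the similarity of the most enriched sequences also increased in both methods (Fig. 3b and Fig. 4a). Notably, the peak sub-feature similarity of the present method occurred before the final elution step and was higher than that of the conventional library, regardless of sub-feature length (Fig. 3b and Fig. 4c). All other analyses of subsequence similarity at different lengths also showed a sudden drop in similarity at the final elution step of the present method (Fig. 4b). Gamma baselines were therefore used to model this enrichment trend (Fig. 4d). Taken together, the sudden drop and the higher sub-feature similarity show that the present method observes enriched aptamers repeatedly and in a graded manner in eluates that the conventional method discards, and that the enriched aptamers carry less background signal.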
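The inventors studied how the RNA silicon rhodamine aptamers obtained by the present method turn on the fluorescence signal of the dye.
To study how well high-quality aptamers are enriched, the SGRELI ranking model was analyzed on the three SiR libraries of the present invention with 380 combinations of the hyperparameters gamma and ruler quantile. Taking the validation libraries as reference and comparing the rank order of sequences in the libraries of the present invention and the conventional library, the best f1 scores were RLC (61) > RLB (59) ≥ RLA (51) > RC (36), and models with a moderate gamma value (4) predicted the rank order more accurately (Fig. 6a). The ruler quantile had only a small effect on prediction accuracy. To remove bias from different rank ranges in the validation library, the inventors further analyzed prediction accuracy over several hundred different bucket boundaries (top 0.00025-0.25 by abundance rank). In the narrow top-ranked ranges as well, the accuracy of the predicted rank order followed the same order as the best global f1 scores (Fig. 6b). To favour generalization, the default gamma value (1.0), which performed well for both the present method and the conventional method, was used to analyze the libraries. Comparing the enrichment routes of top-ranked sequences in the libraries of the present method, the RLA library was the first in which a group fully covered the top-ranked sequences, followed by RLB and RLC (Fig. 6c). This matches the order of competitor strength in the systems: no blocking < tRNA blocking < blocking with known binding RNA. In addition, the buckets of predicted top-ranked sequences were much denser in the RLB library than in RLA and RLC (Fig. 6d), and the RLB library gave larger AUC scores for its ranks (Fig. 6e). The RLB library may therefore have the better enrichment capability.
Because resources are limited, only a small fraction of aptamers is usually chosen for downstream validation, so it is important to know how the top-ranked aptamers of all three libraries perform. All of the top 25 RLB aptamers (100%) bound SiR and activated fluorescence, compared with 52% and 68% for RLA and RLC (Fig. 5a). Most RLB aptamers were also found in RLA and RLC (Fig. 5b), with similar rank order (Fig. 5f). Aptamers can also bind SiR without activating its fluorescence, but because the main goal of this application was to find aptamers that activate fluorescence, non-activating aptamers were not analyzed further. The two most promising aptamers, RLB7 and RLB15, have a KD about 2-fold lower than that of the best reported aptamer (SiRA, KD = 430±70 nM, ~7-fold fluorescence activation) (Fig. 5c), with up to 20% higher fluorescence (Fig. 5d). By fluorescence activation, RLB is the library richest in activating aptamers, followed by RLC, RLA and RC (Fig. 5e). Aptamers with turn-on ability also rank higher than those without. Because SiRA ranks at around 1000 in all libraries of the present invention, the top 1000 RLB aptamers may contain further useful aptamers. RLB aptamers were also used for live-cell RNA imaging and gave a stronger and cleaner signal than SiRA (Fig. 5g). In summary, the method of the present invention identified high-quality aptamers that activate SiR and can be used for live-cell RNA imaging.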
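The inventors also verified that RNA aptamers against the SARS-CoV-2 replicase obtained by the present method inhibit the enzyme.
Given the pandemic of respiratory disease caused by SARS-CoV-2, there is an urgent need to develop, and keep updating, an effective drug against the constantly mutating virus within a short time. The virus uses the RdRp complex to replicate its RNA genome and transcribe its genes. Unlike blocking the spike protein to prevent infection, high-affinity aptamers that compete with the natural substrate of RdRp might disable viral replication.
To test this hypothesis, the inventors first applied both the present method and the conventional method to the viral catalytic replicase nsp12 to find high-quality aptamers. Three nsp12 libraries of the present invention were generated with bead blocking systems type A (NLA), type B (NLB) and type C (NLC), together with three conventional nsp12 libraries, type A (NCA), type B (NCB) and type C (NCC), with the same blocking systems. As with the SiR libraries, NLA (75), NLB (71) and NLC (73) had higher f1 scores than the conventional library NCA (54) (Fig. 8a). Prediction of top-ranked aptamers with the default gamma value (1.0) again performed well (Fig. 8b). According to the enrichment routes of the top-ranked aptamers in each group, differences in enrichment capability were small and followed the order NLC > NLB ≥ NLA (Fig. 8c). Although NLB and NLC had similar auc scores (Fig. 8e), the NLB library still held a denser set of top predicted aptamer sequences than the other libraries (Fig. 8d). Notably, a high proportion of top-ranked aptamers bound nsp12 and were found in all three libraries (Fig. 7a and 7b). The top-ranked aptamer NLB2 had a KD of 827 pM, and the lower-ranked aptamer NLB113 had a KD of only 32 pM (Fig. 7c and Fig. 8f). In contrast, neither the RNA background control without nsp12 nor the control of random RNA with nsp12 showed any appreciable signal (Fig. 8g). To assess the binding of top-ranked aptamers further, 86 of the top 200 candidates in the NLB library were chosen at random. More than 90% of them bound nsp12 strongly, and over half of these binders had a KD below 10 nM (Fig. 7d). These high-affinity aptamers ranked higher in the libraries of the present invention than in the conventional libraries (Fig. 7e). The method of the present invention therefore enriches high-quality aptamers more effectively than the traditional method.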
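The inventors also studied inhibition by letting these high-affinity aptamers compete with the RNA substrate in the RdRp extension reaction. NLB2 inhibited more strongly than the other aptamers (Fig. 8h). Importantly, blocking the 3' end of the RNA aptamers by oxidation markedly increased their inhibition (Fig. 8i). Compared with NLB2, NLB30 inhibited less before its 3' end was blocked but more afterwards. This reversal suggests that RNA aptamers may bind nsp12 and/or the whole RdRp complex at different regions, but that the main inhibitory function depends in part on the RNA 3' end. In addition, complete inhibition weakened when the RNA aptamer was present in fewer molecules than the nsp12 protein, which suggests that the 3'-end-blocked RNA aptamer interacts with nsp12 at a 1:1 ratio. Several RNA aptamers (such as NLB30) inhibited more than 98% of the viral replication extension reaction, whereas control RNA did not (Fig. 7f and 7g), and this strong inhibition was consistent over time (Fig. 7h and 7i). On the basis of this RNA-nsp12 inhibition, the method of the present invention can therefore rapidly screen high-affinity aptamers that inhibit replication of the fast-mutating SARS-CoV-2.
Furthermore, because chemical modifications such as fluorination improve RNA stability, the present invention was also validated for the selection of non-conventional RNA aptamers. With the HIV-1 reverse transcriptase as target, three different selection libraries were built: a single-length library ("RT-F") and mixed-length libraries ("RT-FF", "RT-FN"). The top 12 RNA aptamers from these libraries efficiently inhibited replication by the HIV-1 reverse transcriptase (Fig. 9), and other aptamers containing the previously reported "CGGG" pairing inhibitory domain were also inhibitory. The method is therefore reliable for selecting chemically modified RNA. In a specific embodiment, the RNA aptamer of the present invention may comprise a chemically modified sequence. In a preferred embodiment, the chemically modified sequence is a fluorine-modified sequence.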
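In a specific embodiment, the present invention provides a method for screening RNA aptamers, the method comprising the following steps:
1) providing an RNA aptamer library to be screened;
2) incubating the library of step 1) with a solid support to which a target is bound, so that the RNA aptamers in the library bind the target sufficiently;
3) eluting, with a buffer gradient, the RNA aptamers bound to the target on the solid support in step 2), and collecting the eluate of each elution separately;
4) completely eluting the RNA aptamers that remain on the solid support after step 3), and collecting this eluate as the final eluate group;
5) optionally concentrating and purifying the RNA aptamers in the eluates obtained in steps 3) and 4);
6) reverse-transcribing the RNA aptamers obtained in step 5) to obtain cDNA;
7) amplifying the cDNA obtained in step 6) and subjecting it to high-throughput sequencing to obtain sequencing data;
8) analyzing the sequencing data obtained in step 7) and ranking the candidate RNA aptamer sequences from high to low binding potential, thereby obtaining RNA aptamer sequences with high binding affinity.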
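Based on the teaching of the present invention, those skilled in the art will know that the RNA aptamer library used for screening can come from various sources, including but not limited to in-house preparation, commercial purchase, or a gift from others.
The solid support used in the method can be any solid support familiar to those skilled in the art, including but not limited to magnetic beads and matrices. In a preferred embodiment, the matrix includes but is not limited to agarose gel matrix, Sepharose beads, nitrocellulose, polyvinylidene fluoride membrane, octyl-Sepharose, and similar support matrices.
The method is suitable for screening RNA aptamers that bind a wide range of targets. In a specific embodiment, the target may be a small molecule, including but not limited to steroids, dopamine, kanamycin, digoxin, 安托辛, dinitroaniline, melamine, quinolones, and aflatoxin; the target may also be a macromolecule, including but not limited to polypeptides, proteins (such as enzymes and antibodies), complexes (RNA-bound proteins), high-molecular-weight polymers, and compounds.
When the RNA aptamers bound to the target on the solid support are eluted with a buffer gradient, the gradient can be achieved with buffer of increasing volume or with buffer of increasing elution strength; buffer of increasing volume is preferred.
In a preferred embodiment, buffer of increasing elution strength means a buffer that prevents RNA from folding into its spatial structure, for example by raising the concentration of salt ions or chelating agents.
To facilitate the subsequent high-throughput sequencing, several background elutions can be performed before the gradient elution, until the number of RNA aptamer molecules in the eluate is no more than 1% of the high-throughput sequencing threshold. The buffer volume of a background elution should be no greater than the initial buffer volume of the gradient elution.
In a specific embodiment, the elution may be static (i.e. discontinuous, with the whole eluate collected at once) or dynamic (i.e. continuous, with small fractions collected continuously); static elution is preferred. Those skilled in the art know the technical means for both. For example, static elution can be achieved by soaking the magnetic beads in buffer, and dynamic elution by letting buffer flow over the solid support, much as in column chromatography. In a preferred embodiment, if static elution is used, the last background elution is performed in a new container. The buffer volume of the background elutions may or may not increase; preferably it does not. The background elution buffer and the gradient elution buffer may be the same or different; preferably they are the same.
In a specific embodiment, the gradient elution buffer contains magnesium ions, preferably 5 mM; its pH is below 8.5, preferably 7-8; and its NaCl or KCl concentration is between 75 mM and 200 mM.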
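The number of background elutions can be determined from the actual size of the library and the subsequent sequencing throughput. For example, if the RNA aptamer library to be screened contains 10^14 molecules and the chosen NextSeq run yields about 10^7 reads per eluate group, the number of molecules must be reduced to below 1% of the sequencing throughput, i.e. below 10^5 (the exponent being 5 = 14-7-2 orders of magnitude). The reasoning is that sequencing 10^7 molecules with 10^7 reads gives on average 1 read per molecule, whereas sequencing fewer than 10^5 molecules with 10^7 reads gives on average 100 reads per molecule, which lowers the chance that random sampling returns only 1 read for a molecule. If each wash uses 200 μL and the residual volume (including the solid support) is about 2 μL, each wash dilutes by 200/2 = 10^2, so at least 5 background washes of 10^2 each are needed to bring the library from 10^14 down to below 10^5.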
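The arithmetic of this example can be written out as a short calculation. The function below is only an illustrative sketch; its name and parameters are hypothetical and simply restate the quantities given in the text (library size, reads per eluate group, wash volume, residual volume, and the 1% target).

```python
import math


def background_washes_needed(library_log10, reads_log10,
                             wash_volume_ul, residual_volume_ul,
                             target_fraction=0.01):
    """Smallest number of washes that brings the library below the target.

    library_log10: log10 of the number of molecules in the starting library
    reads_log10:   log10 of the sequencing reads available per eluate group
    target_fraction: molecules allowed, as a fraction of the read count
    """
    target_log10 = reads_log10 + math.log10(target_fraction)
    dilution_log10 = math.log10(wash_volume_ul / residual_volume_ul)
    return math.ceil((library_log10 - target_log10) / dilution_log10)


# Values from the worked example: 10^14 molecules, 10^7 reads,
# 200 uL washes with about 2 uL carried over.
washes = background_washes_needed(14, 7, 200, 2)
print(washes)
```

With these inputs the required reduction is 9 orders of magnitude at 2 orders per wash, so four washes are not enough and five are needed.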
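Similarly, once several gradient elutions have brought the number of RNA aptamer molecules in the eluate to a level suitable for sequencing, preferably with the theoretical minimum number of molecules in the library reduced to below 10^5, a complete elution is performed so that the RNA aptamers bound to the target on the solid support are completely eluted.
From the molecular mass and length, those skilled in the art can work out the number of molecules in the library to be screened and thus decide in advance the number of background elutions and of gradient elutions. In a specific embodiment, 1-2 elutions can be added to the predetermined numbers of background and gradient elutions.
In the subsequent gradient elution, if static elution is used, the selection pressure of the system increases with the elution volume, and the liquid level of each elution is always higher than that of the previous one, which reduces contamination from the container wall.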
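Compared with the prior art, the advantage of the present invention is that no information on any RNA aptamer in the library is lost. The method therefore not only keeps the eluates from the background and gradient elutions described above, but also performs a complete elution after the gradient elution, so that the RNA aptamers bound to the target on the solid support are completely eluted. Accordingly, the buffer for complete elution contains a reagent capable of releasing the RNA aptamers, including a reagent that disrupts the binding of the target to the solid support, and/or a reagent that disrupts the binding of the RNA aptamer to the target, and/or a reagent that directly destroys the target. Those skilled in the art can choose a suitable releasing reagent for the case at hand. For example, if a small-molecule target is attached through biotin to a streptavidin-coated solid support, the complete elution buffer contains a reagent that separates the small-molecule target from the solid support, such as DTT, and EDTA to separate the RNA aptamer from the small-molecule target. As another example, if the target is a macromolecule such as a protein, a reagent matched to that protein, such as a protease, can be used directly to separate the RNA aptamer from the target. When a protease is used, care must be taken that it does not impair the subsequent reverse transcriptase; for example, a protease inhibitor can be added, or a thermolabile protease can be used and inactivated by heating once complete elution is finished, so that reverse transcriptase activity in later steps is unaffected.
After the eluates from the background, gradient and complete elutions have been obtained, the RNA aptamers in them can be concentrated and purified. Whether to do so is decided by the practitioner according to need. Without concentration and purification, the RNA aptamers in the eluates can be reverse-transcribed directly and automatically by instrument, but the solution volumes are large and reagents are wasted. Concentrating and purifying the RNA aptamers saves reagents, but concentration can take several hours.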
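As described above, to deal with base imbalance when the RNA aptamers in the eluates are amplified and sequenced, the inventors designed offset PCR, which inserts an offset sequence of 0-6 nt at random between the sequencing adapter and the cDNA constant region. Even with the 0-6 nt offsets, however, a few positions of the amplified DNA still have an unbalanced base distribution (for example, position 10 of the 5' fixed sequence lacks A, as do positions 11-13, as shown by the 5' start of the V1 reads in Fig. 2d (dark green and orange letters); similarly, positions 13-14 of the 3' fixed sequence lack G, and positions 8-12 of the central pre-formed loop lack T). A random sequence of about the average library length therefore needs to be added to the sequencing library, carrying at the unbalanced positions the fixed bases that the library lacks (for example, at the 5' end, random N at positions 1-7, G at position 8 to avoid six consecutive identical base signals, and fixed A at positions 9-13; at the 3' end, random N at positions 1-7 and C, the complement of G, at position 13). In short, where bases are still missing after the offsets are added, a custom sequence supplies the missing bases, and at positions where no base is missing the custom sequence simply uses random N.
Therefore, in the amplification and sequencing step of the present method, a custom-designed PhiX is also introduced when multiplexed samples are pooled, to further compensate for the unbalanced base distribution of the constant regions.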
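Besides losing no information on any RNA aptamer in the library, the method ranks candidate RNA aptamer sequences by binding potential during sequencing data analysis, and thereby obtains RNA aptamer sequences with high binding affinity. In a specific embodiment, binding potential refers to a rapid increase in enrichment across successive eluates, rather than to the highest enrichment considered alone. In a preferred embodiment, binding potential is judged from one or more of the following: the abundance of the RNA aptamer in each eluate; the number of eluates in which the RNA aptamer is individually detected; and the preference for an RNA aptamer that appears in later eluates over one that appears in early eluates. In a preferred embodiment, this information is combined and fitted into a standard curve, and the binding potential of the RNA aptamer is judged by the area under the curve (AUC).
On the basis of the RNA aptamer screening method, the present invention provides an apparatus for carrying out the method. Based on the teaching of the present invention, those skilled in the art will know how to build such an apparatus. For example, the apparatus may comprise modules for performing each step of the method.
Based on the teaching of the present invention, those skilled in the art will know that RNA aptamers obtained by the method can be made into various products. For example, in a specific embodiment they can be made into a biochip. In other embodiments they can be made into a pharmaceutical composition.
Those skilled in the art will know that if an RNA aptamer obtained by the method has high binding affinity for a specific recognition receptor on the cell surface, it can be attached to a liposome so that the drug inside the liposome is delivered specifically to the designated cells. RNA aptamers obtained by the method can therefore be made into drug delivery vehicles. In a specific embodiment, the drug delivery vehicle is a liposome.
In other embodiments, RNA aptamers obtained by the method can also be made into diagnostic reagents.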
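The beneficial effects of the present invention are as follows:
1. low library false-positive rate (down to 0%);
2. strong binding of the aptamers selected from the library (KD 25 nM for the small molecule SiR, 32 pM for the macromolecule nsp12);
3. short library preparation time, with 1 round of selection (about 4 hours);
4. highly reproducible library preparation; and
5. library preparation that can be further optimized for automated robotic arms (96 samples).
The present invention is further described below with reference to specific examples. These examples only illustrate the invention and do not limit its scope. Experimental methods for which no specific conditions are given in the following examples generally follow conventional conditions or the conditions recommended by the manufacturer. Unless otherwise stated, percentages and parts are by weight.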
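Examples
Example 1. Preparation of the RNA screening pool
1.1. Random RNA screening pool
1.1.1. Synthesis of the single-stranded DNA template
A 103 nt ssDNA template library was ordered from an oligonucleotide supplier. From 5' to 3', the template consists of the primer A binding region (19 nt) [Famulok,M.Molecular Recognition of Amino Acids by RNA-Aptamers:An L-Citrulline Binding RNA Motif and Its Evolution into an L-Arginine Binder.J Am Chem Soc 116,1698-1706,doi:10.1021/ja00084a010(2002)], the left-arm random region (26 nt), the fixed pre-formed loop region (12 nt) [Davis,J.H.&Szostak,J.W.Isolation of high-affinity GTP aptamers from partially structured RNA libraries.Proc Natl Acad Sci USA 99,11616-11621,doi:10.1073/pnas.182095699(2002)], the right-arm random region (26 nt), and the primer B binding region (20 nt) [Famulok,M.Molecular Recognition of Amino Acids by RNA-Aptamers:An L-Citrulline Binding RNA Motif and Its Evolution into an L-Arginine Binder.J Am Chem Soc 116,1698-1706,doi:10.1021/ja00084a010(2002)] (see Fig. 2a). The random regions were synthesized from a manually pre-mixed nucleotide substrate with an A:C:G:T ratio of 3:3:2:2. The whole template was synthesized commercially at the 1 μmol scale.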
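To make the template layout concrete, the sketch below draws in-silico templates with the stated region lengths (19 + 26 + 12 + 26 + 20 = 103 nt) and the 3:3:2:2 A:C:G:T mix in the random arms. The primer and loop strings are placeholders of the right length, not the sequences of Table 1, and real synthesis bias is not modelled.

```python
import random

BASES = "ACGT"
WEIGHTS = (3, 3, 2, 2)  # A:C:G:T ratio of the pre-mixed substrate


def random_arm(length, rng):
    """Draw one random region with the 3:3:2:2 base mix."""
    return "".join(rng.choices(BASES, weights=WEIGHTS, k=length))


def library_template(primer_a, loop, primer_b, rng, arm_len=26):
    """Assemble primer A + N26 + pre-formed loop + N26 + primer B."""
    return (primer_a + random_arm(arm_len, rng) + loop
            + random_arm(arm_len, rng) + primer_b)


# Placeholder constant regions (hypothetical; lengths match the text only).
PRIMER_A = "A" * 19
LOOP = "C" * 12
PRIMER_B = "G" * 20

rng = random.Random(0)
template = library_template(PRIMER_A, LOOP, PRIMER_B, rng)
print(len(template), template)
```

Two 26 nt arms give 52 random positions, far more possible sequences than the ~2×10^14 molecules used, so nearly every molecule in the pool is expected to be unique.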
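1.1.2. Purification of the single-stranded DNA template
The ssDNA template library (~200 μg, 0.5 mL) was purified on a 10% polyacrylamide (PAGE) gel (20 cm × 40 cm) at 20 W for 1.5 h. Under a UV lamp (254 nm for DNA imaging, 365 nm for RNA imaging), the by-products of synthesis separated by electrophoresis were distinguished from the main product, which was marked and excised. The gel slice was eluted overnight in 5 mL of 0.3 M NaOAc (pH 5.5) with rotation (550 rpm), precipitated overnight with 3 volumes of EtOH, and centrifuged at 4°C (14,000g) for 30 minutes. The pellet was rinsed with 2 mL of 70% EtOH and centrifuged at 4°C (14,000g) for 10 minutes, and the rinse was repeated with 0.5 mL of 70% EtOH. Finally, the DNA pellet was dried at room temperature and dissolved in TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8.0).
1.1.3. Amplification of the double-stranded DNA template
The reaction contained 0.16 μM purified ssDNA, 5 μM primer A (see Table 1), 5 μM primer B (see Table 1), 1x Taq buffer (10 mM Tris-HCl, pH 8.4, 50 mM KCl, 0.1% (v/v) Triton X-100), 500 μM dNTPs, 6 mM MgCl2 and 0.1 U/μL Taq DNA polymerase in a total volume of 30 mL. The PCR was heated to 94°C (3 minutes), run for 6 cycles (each of 94°C denaturation for 1 minute, 52°C annealing for 1 minute, and 72°C extension for 3 minutes), given a final extension (72°C, 20 minutes), and cooled to 4°C.
1.1.4. Purification of the double-stranded DNA template
An equal volume of phenol/chloroform/isoamyl alcohol (P/C/I) reagent was added, and the mixture was vortexed for 30 s and centrifuged at 14,000g for 20 min at room temperature. The aqueous supernatant was transferred to a new 50 mL centrifuge tube, an equal volume of chloroform was added, and the mixture was vortexed, centrifuged, and the aqueous phase transferred; this chloroform extraction was repeated once more. Finally, the dsDNA was precipitated in 70% EtOH, 0.3 M NaOAc, washed with ethanol as above, dried, and dissolved in 200 μL of TE buffer.
1.1.5. In vitro transcription of RNA
The reaction contained 534 nM purified dsDNA in 1X transcription buffer (40 mM Tris-HCl pH 8.1, 1 mM spermidine, 22 mM MgCl2, 0.01% Triton X-100), together with 10 mM DTT, 5% (v/v) DMSO, 1 U/mL pyrophosphatase (optional), 4 mM ATP/CTP/GTP/UTP, 40 μg/mL acetylated BSA (Sigma-Aldrich) (optional), and 0.7 μM T7 polymerase (Thermo Fisher Scientific), in a total volume of 10 mL. For in vitro transcription of 2'-fluoro-modified RNA, the CTP and UTP in the same reaction were replaced with equal concentrations of 2'-F-CTP (Jena Bioscience) and 2'-F-UTP (Jena Bioscience), and the T7 polymerase with an equal concentration of mutant T7 polymerase (Y639F, prepared in-house); the other components were unchanged. The transcription reaction was incubated at 37°C for 4 hours. Then 50 U/mL DNase I was added and the reaction was incubated at 37°C for a further 1 hour. The RNA was recovered by P/C/I purification and two chloroform extractions, followed by isopropanol precipitation at -20°C for 1 hour and two washes with 75% ethanol. The RNA was then dissolved in dH2O and stored at -20°C, or at -80°C for long-term storage.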
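1.2. Preparation of the directed random (doped) RNA pool
The directed random RNA pool was prepared like the random RNA pool, except for the synthesis of the ssDNA template and the PCR amplification and purification of the dsDNA.
1.2.1. Synthesis of the single-stranded DNA template
The ssDNA template library was synthesized from the known parent sequence (103 nt, see Table 1). At each position of the random regions, a manually pre-mixed nucleotide substrate was used in which the ratio of the original base to the three non-original bases was 85:5:5:5. Synthesis was again commercial, at the 1 μmol scale.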
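The 85:5:5:5 doping can be mimicked in silico as follows. This is a minimal sketch under the assumption that every doped position is drawn independently; the parent sequence shown is a placeholder, not seqA, seqB or seqC.

```python
import random


def dope(parent, rng, keep=0.85):
    """Return one doped variant of `parent`.

    Each position keeps the parent base with probability `keep`; otherwise
    one of the three other bases is chosen with equal probability (5% each
    when keep is 0.85).
    """
    variant = []
    for base in parent:
        if rng.random() < keep:
            variant.append(base)
        else:
            variant.append(rng.choice([b for b in "ACGT" if b != base]))
    return "".join(variant)


rng = random.Random(1)
parent = "ACGT" * 2500  # placeholder parent sequence, long enough for statistics
variant = dope(parent, rng)
identity = sum(p == v for p, v in zip(parent, variant)) / len(parent)
print(round(identity, 3))
```

For the 52 doped positions of one template, this mix gives about 8 substitutions per molecule on average (52 × 0.15), so the pool samples a neighbourhood around the parent sequence.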
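1.2.2. Amplification of the double-stranded DNA template
For dsDNA amplification, 50 nM ssDNA was used as template in 1X Taq buffer with 5 μM primer A, 5 μM primer B, 500 μM dNTPs, 1.5 mM MgCl2 and 0.05 U/μL Taq DNA polymerase in a total volume of 1 mL. The PCR programme was the same as for the random RNA pool.
1.2.3. Purification of the double-stranded DNA template and in vitro transcription
The PCR product was purified with the QIAquick PCR purification kit (QIAGEN). In vitro transcription was performed with 400 nM purified dsDNA in a total volume of 2 mL. The RNA was PAGE-purified, dissolved in dH2O, and stored under the conditions described above.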
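Example 2. Selection and sequencing of RNA aptamers
2.1. Folding of RNA higher-order structure
RNA from the random or directed random pool, yeast tRNA (Invitrogen) and competitor RNA (see Table 1) were incubated at 75°C for 5 minutes, cooled slowly to 4°C at 0.1°C/s, and placed on ice.
2.2.a. Selection of RNA binding a small molecule (biotin-labelled silicon rhodamine as example)
2.2.1.a. Pre-equilibration of the magnetic beads
Hydrophilic streptavidin magnetic beads (New England Biolabs) were collected on a 6-tube magnetic separation rack (New England Biolabs), equilibrated by 5 washes with 4 bead volumes of 1X ASB buffer (20 mM HEPES pH 7.4, 125 mM KCl, 5 mM MgCl2), and resuspended in 0.5 bead volume of 1X ASB buffer.
2.2.2.a. Blocking of bead background
5 μM biotin-linked silicon rhodamine was incubated with 25% (v/v) concentrated beads (2.1.a) in 100 μL of 1X ASB buffer: group A received no additional reagent ("+None", RLA); group B received 0.15 μg/μL folded yeast tRNA ("+tRNA", RLB); group C received 4 μM folded competitor RNA 1 (see Table 1) ("+cRNA", RLC). After thorough mixing, the reaction was incubated at 25°C for 30 minutes at 1000 rpm.
2.2.3.a. Equilibration of the blocked beads
The beads were washed five times with 200 μL of 1X ASB buffer. For each wash the beads were mixed gently, left at room temperature for 30 s, and placed on the magnetic rack; the liquid was removed and 1X ASB buffer was added quickly to keep the beads from drying.
2.2.4.a. RNA binding
4 μM RNA from the random pool was prepared in advance in 100 μL of 1X ASB buffer. After the blocked beads had been washed, they were resuspended in this RNA solution, mixed thoroughly, and incubated at 25°C and 1000 rpm for 1 hour.
2.2.5.a. Gradient selection washes
The beads were collected on the magnetic rack and the supernatant removed. The RNA-bound beads were first washed 4 times with 200 μL of 1X ASB buffer, each wash being collected separately; after each resuspension the beads were left for 30 seconds before being placed on the rack. The beads were then resuspended in 200 μL of 1X ASB buffer and transferred to a new 1.5 mL tube, collected on the rack, and the 5th wash was collected. Five further washes followed with 250 μL, 300 μL, 350 μL, 400 μL and 450 μL of 1X ASB buffer, each again collected separately, for a subtotal of 10 washes. Finally, complete elution was performed. The beads were first incubated with 200 μL of 50 mM DTT at 25°C for 20 minutes at 650 rpm. This eluate was collected and combined with the second eluate as the 11th wash. The second elution used 100 μL of 50 mM DTT and 5 mM EDTA, incubated at 25°C for 5 minutes at 650 rpm.
2.2.6.a. Concentration and purification of RNA
The 10 washes and 1 combined eluate from 2.5.a were precipitated at -20°C for 2 hours or overnight with 3 volumes of cold EtOH, 0.1 volume of 3 M NaOAc and 1 μL of glycogen (Thermo Fisher Scientific). The precipitated RNA was centrifuged (4°C, >20,000g, 1 h), washed, and dissolved in 10 μL of dH2O. [Optional: the reaction volume of the RNA binding step 2.4.a can be scaled down to 20 μL, and the 5 mM EDTA can be omitted from the second elution in 2.5.a. In that case the 10 washes and 1 combined eluate can be added directly to the reverse transcription reaction without the RNA precipitation of 2.6.a.]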
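2.2.b. Selection of RNA binding a macromolecule (His-tagged SARS-CoV-2 replicase as example)
2.2.1.b. Pre-equilibration of the magnetic beads
HisPurTM Ni-NTA magnetic beads (Thermo Fisher Scientific) were collected on a 6-tube magnetic separation rack (New England Biolabs), equilibrated by 5 washes with 4 bead volumes of 1X ERB buffer (100 mM NaCl, 20 mM Na-HEPES pH 7.5, 5% (v/v) glycerol, 10 mM MgCl2 and 0.5 mM β-mercaptoethanol (optional)), and resuspended in 0.3 bead volume of 1X ERB buffer.
2.2.2.b. Blocking of bead background
50% (v/v) concentrated equilibrated beads (2.1.a) were added to 30 μL of 1X ERB buffer: group A received no additional reagent ("+None", NLA); group B received 0.4 μg/μL folded yeast tRNA ("+tRNA", NLB); group C received 10 μM folded competitor RNA 2 (see Table 1) ("+cRNA", NLC). After thorough mixing, the reaction was left at 25°C for 2 minutes. Then 50 μL of 30 μM nsp12 in 1X ERB buffer was added, mixed thoroughly but gently, and incubated at 25°C for 10 minutes with occasional flicking by finger.
2.2.3.b. Equilibration of the blocked beads
The beads were washed five times with 200 μL of 1X ERB buffer. For each wash the beads were mixed gently, left at room temperature for 30 s, and placed on the magnetic rack; the liquid was removed and 1X ERB buffer was added quickly to keep the beads from drying.
2.2.4.b. RNA binding
4 μM RNA from the random pool was prepared in advance in 100 μL of 1X ERB buffer. After the blocked beads had been washed, they were resuspended in this RNA solution, mixed thoroughly but gently, and incubated at room temperature (25°C) for 10 minutes with occasional flicking, then at 37°C for 5 minutes, and finally at 25°C for 2 minutes.
2.2.5.b. Gradient selection washes
The beads were collected on the magnetic rack and the supernatant removed. The RNA-bound beads were first washed 4 times with 200 μL of 1X ERB buffer, each wash being collected separately; after each resuspension the beads were left for 30 seconds before being placed on the rack. The beads were then resuspended in 200 μL of 1X ERB buffer and transferred to a new 1.5 mL tube, collected on the rack, and the 5th wash was collected. Five further washes followed with 250 μL, 300 μL, 350 μL, 400 μL and 450 μL of 1X ASB buffer, each again collected separately, for a subtotal of 10 washes. Finally, complete elution was performed. The beads were first incubated in 400 μL of 1X ERB buffer containing 0.1 U/μL proteinase K (New England Biolabs) and 2 mM CaCl2 at 37°C for 45 minutes with occasional flicking. This eluate was collected and combined with the second eluate as the 11th wash. The second elution used 100 μL of 1X ERB buffer, incubated at room temperature for 1 minute and recovered on the magnetic rack. The combined eluate was made up to 500 μL and purified by two P/C/I extractions, and the supernatant was recovered.
2.2.6.b. Concentration and purification of RNA
The 10 washes and 1 combined eluate from 2.5.a were precipitated at -20°C for 2 hours or overnight with 3 volumes of cold EtOH, 0.1 volume of 3 M NaOAc and 1 μL of glycogen (Thermo Fisher Scientific). The precipitated RNA was centrifuged (4°C, >20,000g, 1 h), washed, and dissolved in 10 μL of dH2O. [Optional: the reaction volume of the RNA binding step 2.4.b can be scaled down to 20 μL. In the complete elution step of 2.5.b, proteinase K can be replaced with thermolabile proteinase K (New England Biolabs) at 0.015 U/μL, incubated at 25°C for 1 hour with occasional flicking; the reaction is then incubated at 55°C for 10 minutes to inactivate the thermolabile proteinase K. In that case the 10 washes and 1 combined eluate can be added directly to the reverse transcription reaction without the RNA precipitation of 2.6.b.]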
2.3.逆转录RNA
对于最终体积为20μL的反转录,将2.6中RNA加入0.5μM引物B、0.5mM dNTPs和dH2O在65℃下反应5分钟。反应结束后,立即将体系放置冰上冷却2分钟。然后,向反应中加入1X SSIV缓冲液(Thermo Fisher Scientific),5mM DTT,10U/μL SuperScript IV逆转录酶(Thermo Fisher Scientific),然后将混合物在53℃下孵育1小时。
2.4.offset PCR dsDNA补偿
为了在文库中加入补偿序列,将完成的反转录反应的10μL与1X Taq缓冲液、0.2mM dNTPs、3μM offset selex frw primer_mix v1(或v2)、3μM offset selex frw primer mix v1(或v2)(参见图2d和表1)、2mM MgCl2、0.05U/μL Taq DNA聚合酶进一步PCR扩增,总体积为
50μL。PCR反应先加热到94℃(3分钟),然后进行11个循环(每个循环包含94℃变性1分钟,52.5℃退火1分钟,72℃延伸2分钟),最后延伸步骤(72℃,5分钟),冷却到4℃。
2.5.offset PCR dsDNA纯化
提前室温预平衡Ampure XP磁珠(Beckman Coulter)30分钟以上,在PCR反应中,加入1.2倍反应体积的磁珠,移液枪上下吹吸40次,充分混匀纯化,室温静置10分钟,放至SMARTer-Seq PCR磁架(Takara Bio),待磁珠充分富集后,移除,用180μL 80%EtOH洗涤磁珠两次,室温干燥磁珠约5分钟,加入30μL dH2O,移液枪上下吹吸充分混合,室温静置5分钟,放至磁架,收集dsDNA溶解液,用Nanodrop核酸浓度仪(Thermo Fisher Scientific)测量大致浓度。
2.6.Illumina PCR dsDNA标记
通过PCR,进一步添加测序引物和标记序列到dsDNA上,使用~4.5nM offset-PCR dsDNA模板、500nM测序Universal引物(New England Biolabs)、500nM测序index引物(New England Biolabs)、200nM dNTPs、1×Q5反应缓冲液(New England Biolabs)、0.02U/μL热启动Q5高保真DNA聚合酶(New England Biolabs)和dH2O,总体积为50μL。PCR反应先加热到98℃(40秒),然后进行6个循环(每个循环包含98℃变性10秒,68.5℃退火20秒,72℃延伸30秒),最后延伸步骤(72℃,2分钟),冷却到4℃。
2.7. Purification of the Illumina PCR dsDNA
Proceed as in step 5, but dissolve the dsDNA in 15 μL of dH2O. Store at -20 °C.
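2.8. Quality control of the sequencing library
Use 1 μL of sample to measure the exact concentration with a Qubit Fluorometer (Invitrogen) and confirm that the sample concentration is above 2 ng/μL. Then use a further 1 μL of sample (diluted to 1-2 ng/μL) for high-resolution electrophoretic quality control on a Bioanalyzer (Agilent) and confirm a single peak at about 225 bp.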
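2.9. Multiplexed sequencing
Pool up to 47 quality-controlled samples at equal concentration and add 5% selexPhiX_v1 (when offset_selex_primer v1 was used in step 5) or selexPhiX_v2 (when offset_selex_primer v2 was used in step 5). Dilute the pool to a final concentration of 20 nM, then dilute further and denature with NaOH. Sequence on a NextSeq 500 in single-end (SE) high-throughput 75 bp mode at a loading concentration of 1.8 pM. The reagent normally reserved for reading the 10 bp index is used to sequence the insert itself, which gives a total read length of 86 bp and approximately 320-400 million reads.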
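Example 3. Data modeling, analysis, and statistics
3.1. Data cleaning
The raw sequencing data are demultiplexed into per-sample reads according to the primer tag sequence (6 nt) used for each sample. Tags are matched with zero mismatches, and reads of low Phred quality are filtered out. The offset sequence at the 5' end of each read (7 variants, 0-6 nt) is then trimmed off, and the balance of the offset distribution is recorded. The sequence corresponding to primer B is removed from the trimmed read, leaving the 67 nt source sequence, which is reverse-complemented to give a DNA sequence in the same orientation as the original RNA. During trimming, background data are discarded: reads without an offset sequence, reads without the primer B sequence, reads with a run of more than 25 identical bases, and reads shorter than 65 nt after trimming.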
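The trimming logic of 3.1 can be sketched for a single read as follows. This is a minimal illustration, not the implementation used in the invention: it assumes the read layout is offset (0-6 nt), then the primer B sequence, then the source sequence, and it locates the offset by searching for primer B near the 5' end. The function names, the parameter names, and the primer sequence in the usage example are hypothetical, and demultiplexing and Phred filtering are left out.

```python
import re

# Complement table for DNA; "N" is kept as "N".
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def clean_read(read, primer_b, max_offset=6, source_len=67, min_len=65, max_run=25):
    """Trim one read; return (offset_length, source_sequence) or None if discarded."""
    # Primer B must start within the first max_offset + 1 positions,
    # i.e. after an offset of 0 to max_offset nucleotides.
    start = read.find(primer_b, 0, max_offset + len(primer_b))
    if start < 0:
        return None  # no recognizable offset / primer B
    begin = start + len(primer_b)
    source = read[begin:begin + source_len]
    if len(source) < min_len:
        return None  # too short after trimming
    # Discard reads with a run of more than max_run identical bases.
    if re.search(r"(.)\1{%d,}" % max_run, source):
        return None
    # Flip back to the orientation of the original RNA.
    return start, reverse_complement(source)


if __name__ == "__main__":
    primer = "TTGACCA"                 # hypothetical primer B sequence
    source = "ACGT" * 16 + "ACG"       # 67 nt toy source sequence
    print(clean_read("GG" + primer + source, primer))
```

Returning the offset length alongside the sequence makes it straightforward to tabulate the offset distribution that the text asks to record.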
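3.2. Data merging
Create a data frame (12 columns × n rows). Column 1 holds the sequence itself, and columns 2-12 hold the 10 wash fractions and the 1 combined eluate defined in 2.2.5 (groups 1-11, g1-g11, in order). Each row holds the statistics for one unique sequence in a library. For the data cleaned in 3.1, the abundance of each sequence in groups 1-11 is counted and recorded in the data frame. Parent sequences whose fixed pre-formed loop region differs from the theoretical pre-formed loop sequence by an edit distance of 4 or more are then removed from the merged data frame, as are parent sequences whose left-arm or right-arm random region contains an unknown base "N". Finally, the merged data frame of normalized sequences is compressed and stored.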
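A minimal sketch of the merging and filtering in 3.2 is given below, using plain dictionaries in place of a data frame. Several points are assumptions made for illustration: the loop region is taken from a fixed window (`loop_start`, hypothetical), the "N" check is applied to the whole sequence rather than only to the two random arms, and "an edit distance of 4 or more" is read as `distance >= 4`. All function and parameter names are hypothetical.

```python
from collections import Counter


def merge_groups(groups):
    """groups: list of read lists (g1..g11). Returns {sequence: [count per group]}."""
    table = {}
    for index, reads in enumerate(groups):
        for seq, count in Counter(reads).items():
            table.setdefault(seq, [0] * len(groups))[index] = count
    return table


def edit_distance(a, b):
    """Levenshtein distance computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + (char_a != char_b)))  # substitution
        previous = current
    return previous[-1]


def filter_table(table, loop_start, loop_ref, drop_at=4):
    """Drop sequences containing N or whose loop window is too far from loop_ref."""
    kept = {}
    for seq, counts in table.items():
        if "N" in seq:
            continue
        loop = seq[loop_start:loop_start + len(loop_ref)]
        if edit_distance(loop, loop_ref) >= drop_at:
            continue
        kept[seq] = counts
    return kept


if __name__ == "__main__":
    toy = merge_groups([["AAGGCCTT", "AAGGCCTT", "AANNCCTT"], ["AAGGCCTT"], []])
    print(filter_table(toy, loop_start=2, loop_ref="GGCC"))
```

For real data volumes, a data frame library and a vectorized edit-distance routine would likely be preferable; the dictionary form is used here only to keep the example self-contained.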
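3.3. Data transformation
In the merged data frame, for any sequence whose abundance in group 1 is 0, the abundance is replaced with the initial non-zero value 0.5. A fold-change table (11 columns × m rows) is then created. Column 1 holds the sequence itself; column 2 holds the group 2 change ratio (f2), i.e., the abundance of the sequence in group 2 divided by its abundance in group 1; column 3 holds the group 3 change ratio (f3), i.e., the abundance of the sequence in group 3 divided by its abundance in group 1; and so on. Ratios equal to 0 are replaced with 0.1. Finally, the log2 of each ratio is taken and the fold-change table is saved.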
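The transformation in 3.3 amounts to a few lines per sequence. The sketch below follows the stated rules (0.5 as the substitute for a zero group 1 abundance, 0.1 as the substitute for a zero ratio, then log2); the function and parameter names are hypothetical.

```python
import math


def fold_changes(counts, zero_baseline=0.5, zero_ratio=0.1):
    """counts: abundances in g1..g11. Returns log2 ratios f2..f11 relative to g1."""
    # A zero abundance in group 1 is replaced by the initial non-zero value.
    baseline = counts[0] if counts[0] > 0 else zero_baseline
    result = []
    for count in counts[1:]:
        ratio = count / baseline
        if ratio == 0:
            ratio = zero_ratio  # avoid log2(0)
        result.append(math.log2(ratio))
    return result


if __name__ == "__main__":
    # Toy abundances for one sequence across 11 groups.
    print(fold_changes([4, 8, 2, 0, 4, 4, 8, 16, 16, 32, 64]))
```

For example, a sequence with abundances 4, 8, and 2 in groups 1-3 gets f2 = log2(8/4) = 1 and f3 = log2(2/4) = -1.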
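3.4. Ranking model
In the fold-change table, the change ratios of each group are sorted from high to low, and the sequence at the 1% position (the benchmark quantile) is taken as the benchmark sequence of that group. Then, according to the initially configured gamma trend line (constant c × interval ratio^gamma − 0.0000001), with gamma defaulting to 1, the 10 proportions 0.1, 0.2, 0.3, ..., 0.9, and 1 are generated. As a weighting step, the ratio of the benchmark sequence of each group is scaled to the corresponding proportion. For example, if the raw ratio of the group 1 benchmark sequence is 5, all group 1 ratios are divided by 50; if the raw ratio of the group 2 benchmark sequence is 7, all group 2 ratios are divided by 35; and so on. Finally, each sequence has 10 gamma-scaled change ratios (gf), one for each of the 10 groups of change ratios. With 1-10 as the x-axis and the corresponding gamma-scaled change ratios as the y-axis, the area under the curve (AUC) is calculated. The binding ability of each sequence is predicted from its AUC value: the larger the value, the stronger the potential binding ability, and vice versa.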
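The scaling and AUC steps of 3.4 can be sketched as below. This is an illustration under stated assumptions rather than the implementation of the invention: the trend line is read as c × (i/n)^gamma, the small 0.0000001 offset is exposed as `eps` and defaults to 0 so that the worked example in the text (a benchmark ratio of 5 in group 1 leads to division by 50) is reproduced exactly, the benchmark is picked by simple index truncation, the benchmark value is required to be positive, and the AUC uses the trapezoidal rule with unit spacing. All names are hypothetical.

```python
def gamma_targets(n_groups=10, gamma=1.0, c=1.0, eps=0.0):
    """Target proportions along the gamma trend line (0.1 ... 1 for gamma = 1)."""
    return [c * ((i + 1) / n_groups) ** gamma - eps for i in range(n_groups)]


def benchmark_value(values, quantile=0.01):
    """Value found at the given quantile position after sorting high to low."""
    ranked = sorted(values, reverse=True)
    index = min(int(len(ranked) * quantile), len(ranked) - 1)
    return ranked[index]


def trapezoid_auc(ys):
    """Area under the curve for x = 1, 2, ..., len(ys)."""
    return sum((ys[i] + ys[i + 1]) / 2 for i in range(len(ys) - 1))


def rank_by_auc(ratios, gamma=1.0, quantile=0.01):
    """ratios: {sequence: [f2..f11]}. Returns [(sequence, auc)] sorted high to low."""
    sequences = list(ratios)
    n_groups = len(ratios[sequences[0]])
    targets = gamma_targets(n_groups, gamma)
    scaled = {seq: [] for seq in sequences}
    for g in range(n_groups):
        bench = benchmark_value([ratios[seq][g] for seq in sequences], quantile)
        if bench <= 0:
            raise ValueError("benchmark ratio must be positive to be scaled")
        divisor = bench / targets[g]  # e.g. 5 / 0.1 = 50 for group 1
        for seq in sequences:
            scaled[seq].append(ratios[seq][g] / divisor)
    auc = {seq: trapezoid_auc(scaled[seq]) for seq in sequences}
    return sorted(auc.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    toy = {"a": [5, 7], "b": [1, 1], "c": [10, 14]}
    print(rank_by_auc(toy, quantile=0.34))
```

Because every group is rescaled so that its benchmark sits on the trend line, a sequence that keeps outpacing the benchmark in later groups accumulates a larger AUC than one that is merely abundant in a single group.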
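3.5. Model fine-tuning
In the model, the hyperparameters gamma and benchmark quantile can be adjusted according to the distribution and ratios of the raw abundances of some potentially strong binders across the groups, as well as their enrichment trajectories, and can be further optimized using the Pearson correlation coefficients of subsequences of strong binders. For small molecules, gamma ≥ 1 is recommended; for macromolecules, gamma ≤ 1 is recommended. A benchmark quantile of 1% is recommended. For the subsequence analysis, several strong binders are first selected from the merged data frame. Each sequence is split into the left-arm random region, the pre-formed loop region, and the right-arm random region, and the residual primer A sequence is removed from the left-arm random region. A sliding window of size n (6-10) with a step of 1 is applied to each region to compute the abundance of the subsequence features of the high-abundance sequences in each group, and the trend of the abundance correlation coefficients is used as a reference for the hyperparameters. In addition, once the binding strengths of a small number of candidate sequences have been measured, the normalized discounted cumulative gain (NDCG) can be used to further optimize the ranking-related hyperparameters gamma and benchmark quantile.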
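As an illustration of how NDCG could score a predicted ranking against a handful of measured candidates, the sketch below uses the common formulation with a log2 position discount. The text does not specify the relevance scale; here each candidate is assumed to carry a non-negative measured binding score in which larger means stronger binding (for example, a score derived from KD), and all names are hypothetical.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain with a log2 position discount."""
    return sum(rel / math.log2(position + 2)
               for position, rel in enumerate(relevances))


def ndcg(predicted_order, measured):
    """predicted_order: sequences ranked by the model (best first).
    measured: {sequence: non-negative binding score, larger = stronger}."""
    relevances = [measured[seq] for seq in predicted_order]
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    measured = {"a": 3.0, "b": 2.0, "c": 1.0}   # toy scores
    print(ndcg(["a", "b", "c"], measured))      # model order agrees with measurement
    print(ndcg(["c", "b", "a"], measured))      # model order is reversed
```

One way to use this is to recompute the ranking for a small grid of gamma and benchmark quantile values and keep the setting with the highest NDCG on the measured candidates.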
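Example 4. Validation of the screening
4.1. Determination of the dissociation constant for the interaction between RNA aptamers and silicon rhodamine
The dissociation constant (KD) of an RNA aptamer for silicon rhodamine was determined from fluorescence intensities measured on a JASCO instrument over a series of RNA concentrations. Briefly, the RNA aptamer was folded as in step 2.1 and then mixed with 50 nM SiR-PEG2-NH2 probe in 1X ASB buffer in a fluorometer cuvette at 25 °C, and the fluorescence intensity was recorded as the RNA concentration was increased to the indicated values. The excitation and emission wavelengths were set to 647 nm and 662 nm, respectively, and the excitation and emission slit widths were set to ±5 nm. For data analysis, the binding curve was fitted with the Hill equation to determine the dissociation constant.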
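The Hill-equation fit can be illustrated with a small, dependency-free sketch. The text does not describe the fitting procedure, so the approach here is an assumption chosen for simplicity: the fluorescence is modeled as F = F_min + (F_max − F_min) · c^n / (KD^n + c^n), KD and n are scanned over a coarse grid, and F_min and F_max are obtained by linear least squares for each grid point. The grid ranges, the function names, and the synthetic data are hypothetical; a dedicated nonlinear fitting routine would normally be used for real measurements.

```python
def hill_fraction(conc, kd, n):
    """Fraction bound according to the Hill equation."""
    return conc ** n / (kd ** n + conc ** n)


def fit_hill(concs, signals):
    """Grid search over KD and n; F_min and F_max by linear least squares."""
    kd_grid = [10 ** (k / 50) for k in range(0, 201)]  # 1 to 10^4, same unit as concs
    n_grid = [0.5 + 0.1 * j for j in range(16)]        # Hill coefficient 0.5 to 2.0
    mean_f = sum(signals) / len(signals)
    best = None
    for kd in kd_grid:
        for n in n_grid:
            theta = [hill_fraction(c, kd, n) for c in concs]
            mean_t = sum(theta) / len(theta)
            var_t = sum((t - mean_t) ** 2 for t in theta)
            if var_t == 0:
                continue
            # Closed-form straight-line fit: signal = intercept + slope * theta.
            slope = sum((t - mean_t) * (f - mean_f)
                        for t, f in zip(theta, signals)) / var_t
            intercept = mean_f - slope * mean_t
            sse = sum((intercept + slope * t - f) ** 2
                      for t, f in zip(theta, signals))
            if best is None or sse < best[0]:
                best = (sse, kd, n, intercept, slope)
    _, kd, n, intercept, slope = best
    return {"kd": kd, "n": n, "f_min": intercept, "f_max": intercept + slope}


if __name__ == "__main__":
    concs = [0, 25, 50, 100, 200, 400, 800, 1600, 3200]           # nM, synthetic
    signals = [10 + 90 * hill_fraction(c, 200, 1.0) for c in concs]
    print(fit_hill(concs, signals))
```

The sketch treats the RNA concentration as the free concentration, which is a simplification when the probe concentration is not small compared with KD.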
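4.2. Measurement of silicon rhodamine fluorescence activation by RNA aptamers and live-cell imaging
The RNA aptamer (5 μM) was folded as in step 2.1. The RNA solution was then taken up in 1X ASB buffer, 5 nM SiR-PEG2-NH2 probe was added, and the mixture was incubated at room temperature for 10 minutes. Fluorescence was measured on a JASCO spectrophotometer (λex = 647 nm; λem = 662 nm; ±5 nm slit width). For live-cell imaging, human embryonic kidney 293 cells (HEK293T) were cultured in Dulbecco's Modified Eagle's Medium (DMEM, high glucose, without phenol red) (Gibco) supplemented with 10% fetal bovine serum (FBS) (Gibco), 100 U/mL penicillin (Thermo Fisher Scientific), and 100 μg/mL streptomycin (Thermo Fisher Scientific). A portion of the activated cells was seeded in 300 μL of medium and transferred to a poly-D-lysine-coated 8-well glass chamber to grow overnight. The cells were then transfected with an appropriate amount of expression plasmid using FuGeneHD transfection reagent (Promega) according to the standard protocol. After 48 hours, the culture medium was exchanged for Leibovitz's (L15) medium containing 200 nM SiR-PEG-NH2. The cells were imaged and photographed at 37 °C, and the images were adjusted for visualization accordingly.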
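4.3. Determination of the dissociation constant for the interaction between RNA aptamers and the SARS-CoV-2 replicase
Dissociation constants were measured by biolayer interferometry on an R8 system (Sartorius). Before the measurement, Ni-NTA (NTA) biosensors (Sartorius) were equilibrated for 5 minutes in 1X ERBL buffer (20 mM Tris-HCl pH 7.4, 100 mM KCl, 5% (v/v) glycerol, 10 mM Mg(OAc)2, 1 mM TCEP, 0.02% TWEEN20 (Carl Roth)). The RNA was serially diluted 2-fold across the wells in 1X ERBL buffer, starting from the indicated concentration, and a sample well containing 1X ERBL buffer without RNA served as the blank control. His10-nsp12 at 20 ng/μL was used for the protein loading step. The assay comprised a 60-second baseline-1 step, a 180-240-second protein loading step, a 60-second baseline-2 step, a 900-1800-second association step, and a 600-3600-second dissociation step. The measured data were analyzed with the Octet data analysis software. Briefly, the data were preprocessed by reference subtraction, Y-axis alignment to the baseline average, inter-step correction at the dissociation step, and Savitzky-Golay filtering. A 1:1 binding model was used, and the dissociation constant was then calculated with the corresponding fitting method (local or global). The Ni-NTA biosensors can be reused. Briefly, the biosensors were washed for three cycles, each consisting of a wash in 10 mM glycine (pH 1.7) (10 seconds) and neutralization in 1X ERB buffer (10 seconds), followed by regeneration in 10 mM NiCl2 (70 seconds) and a wash in 1X ERB buffer (60 seconds). All of the above steps were performed with shaking at 1000 rpm.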
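4.4. 3'-end blocking modification of RNA aptamers
RNA aptamer at 3.77 μM was added to a 200 μL reaction containing 10 mM NaOAc pH 4.5, 50 mM freshly prepared NaIO4, and dH2O, and incubated at room temperature for 2 minutes. Then 10% (v/v) ethylene glycol was added, mixed repeatedly, and left at room temperature for 5 minutes to quench the oxidation reaction. To the quenched reaction were added 222 mM Tris-HCl pH 8.9, 0.15 M NaOAc pH 5.5, 2 μL of glycogen (Thermo Fisher Scientific), and 50% (v/v) isopropanol, and the mixture was incubated at room temperature for a further 30 minutes. Finally, the RNA was pelleted by centrifugation (16,000 g, 20 minutes, 4 °C), and the RNA pellet was washed twice with 75% EtOH. The RNA was dissolved in dH2O and stored at -20 °C or -80 °C for long-term storage.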
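The sequences related to the sequencing library, and the RNA aptamer sequences and control sequences of the present invention, are shown in Table 1 and Table 2, respectively.
Among these, for the SiR small molecule, RLB7 (KD 250 nM), RLB15 (KD 194 nM), RLB3 (KD 208 nM), RLB4 (KD 195 nM), RLB8 (KD 700 nM), RLB12 (KD 370 nM), RLB13 (KD 461 nM), and seqB (also RLB108, KD 25 nM) all showed excellent affinity;
for the nsp12 protein, NLB113, NLB41, NLB30, NLB79, NLB34, NLB69, NLB32, NLB2, NLB58, and NLB5 showed excellent affinity.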
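Discussion
Since 1990, techniques for discovering high-affinity aptamers have been developed in the field with considerable success. However, running the whole evolutionary selection process is regarded as operating a "black box" and typically takes weeks or even months. High-throughput sequencing makes the sequences selected in each round visible, but it has not solved the screening problem that the selected aptamers have a high false-positive rate. With advances in precision instruments and algorithms, the number of iterations in the screening process has decreased, and aptamers with higher affinity and specificity have been obtained. One promising "partitioning" approach uses capillary electrophoresis for the rapid screening of DNA aptamers. However, besides requiring complex instrumentation and fabrication techniques, one of its limitations is that when an aptamer binds a target with a lower molecular weight than its own, the mobility shift signal is not sufficient to distinguish binding. Likewise, optimizing the selection conditions of microfluidic chip separation systems (e.g., bead aggregation, microbubbles, RNA stability) for different binding targets is not easy. In addition, computational prediction of aptamer binding mainly uses subsequence and substructure information of the RNA sequences, but this kind of data-driven analysis depends heavily on the manual choice of which selection rounds supply the data and on the quality of the conventional screening experiment.
The RNA aptamer screening method developed in the present invention for small molecules and macromolecules takes only a few hours. It is a direct, single-round RNA screen that requires no demanding instrumentation and includes efficient construction of deep-sequencing libraries. The method of the present invention can be used for end-to-end analysis and is extremely simple to use. In the method of the present invention, the SGRELI features maximize the useful information generated during the selection process. High-affinity RNA aptamers can be observed multiple times and show a gradient ranking trend, so the predicted binding aptamers have a low false-positive rate.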
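As the above examples show, the SiR RNA aptamers obtained by screening with the method of the present invention have a better KD and better fluorescence activation; compared with SIRA, the best aptamer reported to date, they show increased specificity and reduced background in live-cell RNA images. In addition, the sequence length and composition can be further optimized using structural interaction information. The nsp12 RNA aptamers with pM KD obtained with the method of the present invention offer application prospects for inhibiting replication by the SARS-CoV-2 polymerase. A further 3'-end blocking modification of the aptamer is necessary for complete inhibition of polymerase extension. Compared with remdesivir, at the same concentration of RdRp polymerase the aptamer obtained by the present invention achieves the same inhibitory effect at only one-thousandth of the working concentration of remdesivir. The entry of an RNA aptamer with a 3'-end structure into the catalytic center of the replicase complex, and its occupation of that center, may be the key to inhibiting viral replication. The present invention has also been applied to the screening of chemically modified RNA aptamers, and the resulting aptamers efficiently inhibit the HIV-1 reverse transcriptase, which further broadens the range of application of the screening. The method of the present invention can be further optimized by using machine learning with feature engineering (e.g., substructures, subsequences) to predict binding affinity, and automated robots can be used for high-throughput screening.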
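In summary, the present invention highlights a method with SGRELI features for the rapid screening of RNA aptamers, and provides a theoretical basis for developing functional RNA aptamers that activate chemical dyes and inhibit the SARS-CoV-2 polymerase and the HIV-1 reverse transcriptase.
All documents mentioned in the present invention are incorporated by reference in this application, as if each document were individually incorporated by reference. It should further be understood that, after reading the above teachings of the present invention, those skilled in the art may make various changes or modifications to the present invention, and these equivalent forms likewise fall within the scope defined by the claims appended to this application.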
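Table 1. Sequences related to the sequencing library
Table 2. RNA aptamer sequences and control sequences of the present invention (fC and fU in a sequence indicate that the nucleotide, i.e., C or U, is fluorinated, e.g., 2'-F-CTP and 2'-F-UTP)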
Claims (37)
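- A method for screening RNA aptamers, the method comprising the following steps: 1) providing an RNA aptamer library to be screened; 2) incubating the library of step 1) with a solid support to which a target is bound, so that the RNA aptamers in the library bind the target sufficiently; 3) gradient-eluting, with buffer, the RNA aptamers bound to the target on the solid support in step 2), and collecting the eluate of each elution separately; 4) completely eluting the RNA aptamers still retained on the solid support after step 3), and collecting the eluate as the last group of eluate; 5) optionally concentrating and purifying the RNA aptamers in the eluates obtained in steps 3) and 4); 6) reverse-transcribing the RNA aptamers obtained in step 5) to obtain cDNA; 7) amplifying and high-throughput sequencing the cDNA obtained in step 6) to obtain sequencing data; 8) analyzing the sequencing data obtained in step 7) and ranking the candidate RNA aptamer sequences by binding potential from high to low, thereby obtaining high-affinity RNA aptamer sequences.
- The method of claim 1, wherein providing the RNA aptamer library to be screened comprises preparing the RNA aptamer library in-house, purchasing it commercially, or obtaining it as a gift from others.
- The method of claim 1, wherein in step 2), after the RNA aptamers in the RNA aptamer library have bound the target, the solid support may be blocked to control and reduce non-specific background binding.
- The method of claim 3, wherein the blocking is blocking of the solid support with random RNA that is not target-specific, or blocking of the solid support with target-specific RNA.
- The method of claim 1, wherein in step 2) the solid support includes, but is not limited to, magnetic beads and matrices.
- The method of claim 5, wherein the matrix includes, but is not limited to, agarose gel matrix, cephalosporin beads (头孢菌素珠), nitrocellulose, polyvinylidene fluoride membrane, octyl trehalose (辛基海藻糖), and other support matrices.
- The method of claim 1, wherein in step 2) the target is a small molecule, including but not limited to steroids, dopamine, kanamycin, digoxin, 安托辛, dinitroaniline, melamine, quinolones, and aflatoxin; or a macromolecule, including but not limited to polypeptides, proteins (such as enzymes and antibodies) and complexes (proteins with bound RNA), high-molecular-weight polymers, and compounds.
- The method of claim 1, wherein in step 3) the gradient elution is performed with buffer of increasing volume or with buffer of increasing elution strength, preferably with buffer of increasing volume.
- The method of claim 8, wherein the buffer of increasing elution strength is a buffer that prevents RNA from folding into a three-dimensional structure, for example through an increased concentration of salt ions or chelating agent.
- The method of claim 8, wherein several background elutions are performed before the gradient elution, until the number of RNA aptamer molecules contained in the eluate is no more than 1% of the high-throughput sequencing threshold.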
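- The method of claim 10, wherein the volume of buffer used for the background elution is no greater than the initial volume of buffer used for the gradient elution.
- The method of claim 1, wherein the elution may be static elution (discontinuous elution, in which the complete eluate is collected at once) or dynamic elution (continuous elution, in which small portions of eluate are collected continuously), preferably static elution.
- The method of claim 12, wherein, if static elution is used, the last background elution is performed in a new container.
- The method of claim 10, wherein the volume of buffer used for the background elution may or may not increase, and preferably does not increase.
- The method of claim 10, wherein the buffer for the background elution and the buffer for the gradient elution may be the same or different, and are preferably the same.
- The method of claim 1, wherein the buffer used for the gradient elution contains magnesium ions, preferably 5 mM magnesium ions, has a pH of 8.5 or below, preferably pH 7-8, and has a NaCl or KCl concentration between 75 mM and 200 mM.
- The method of claim 8, wherein, after several gradient elutions have brought the number of RNA aptamer molecules contained in the eluate to a level suitable for sequencing, preferably after the theoretical minimum number of molecules in the library has been reduced to below 10^5, complete elution is performed in step 4), so that the RNA aptamers bound to the target on the solid support are completely eluted.
- The method of claim 1, wherein the buffer used for the complete elution contains a reagent capable of releasing the RNA aptamers, including a reagent capable of disrupting the binding between the target and the solid support, and/or a reagent capable of disrupting the binding between the RNA aptamers and the target, and/or a reagent that directly destroys the target.
- The method of claim 1, wherein in step 7) an offset sequence of 0-6 nt is randomly inserted between the sequencing adapter and the constant region of the cDNA.
- The method of claim 19, wherein in step 7), when the multiplexed samples are pooled, a custom-designed PhiX is also introduced to further compensate for the unbalanced base distribution of the constant regions.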
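- The method of claim 1, wherein in step 8) the binding potential refers to a rapid increase in the degree of enrichment across the successive eluates, rather than considering only the highest degree of enrichment.
- The method of claim 21, wherein the binding potential is judged according to one or more of the following items of information about an RNA aptamer: the abundance of the RNA aptamer in each eluate, the number of eluates in which the RNA aptamer is individually detected, and the appearance of the RNA aptamer in later eluates being favored over its appearance in early eluates.
- The method of claim 22, wherein the information is combined and fitted to a standard curve, and the binding potential of the RNA aptamer is evaluated by the area under the curve (AUC).
- The method of any one of claims 1-23, wherein the RNA aptamer comprises a chemically modified sequence, preferably a fluorine-modified sequence.
- An RNA aptamer, the RNA aptamer being obtained by screening with the method of any one of claims 1-24.
- The RNA aptamer of claim 25, wherein the RNA aptamer includes RNA aptamers whose sequence is known but which carry random modifications on different bases (e.g., A, U, G, C).
- The RNA aptamer of claim 25, wherein the RNA aptamer does not include conventional RNA aptamers whose sequence is known and which carry no additional modification.
- The RNA aptamer of claim 25, wherein the RNA aptamer comprises a chemically modified sequence, preferably a fluorine-modified sequence.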
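- A device for carrying out the method of any one of claims 1-24.
- The device of claim 29, wherein the device comprises the following modules: 1) a preparation module, which prepares the RNA aptamer library to be screened; 2) an incubation module, which incubates the prepared library with a solid support (magnetic beads or a matrix) to which a target is bound; 3) an elution and collection module, which gradient-elutes, with buffer, the RNA aptamers bound to the target on the solid support and collects the eluate of each elution separately; 4) an optional concentration and purification module, which concentrates and purifies the RNA aptamers in the eluates; 5) a reverse transcription module, which reverse-transcribes the RNA aptamers to obtain cDNA; 6) an amplification and high-throughput sequencing module, which amplifies and high-throughput sequences the cDNA to obtain sequencing data; 7) an analysis module, which analyzes the sequencing data and ranks the candidate RNA aptamer sequences by binding potential from high to low, thereby obtaining RNA aptamer sequences with high binding affinity.
- A biochip, the biochip comprising the RNA aptamer of any one of claims 25-28.
- A method of manufacturing a biochip, the method comprising the following steps: 1) obtaining an RNA aptamer by screening with the method of any one of claims 1-24; and 2) preparing the biochip using the RNA aptamer obtained by the screening in step 1).
- A pharmaceutical composition, the pharmaceutical composition comprising the RNA aptamer of any one of claims 25-28 and a pharmaceutically acceptable excipient or drug delivery vehicle.
- A drug delivery vehicle, the drug delivery vehicle having the RNA aptamer of any one of claims 25-28 attached to it.
- The drug delivery vehicle of claim 34, wherein the drug delivery vehicle is a liposome.
- A diagnostic reagent, the diagnostic reagent comprising the RNA aptamer of any one of claims 25-28 and other auxiliary reagents required for diagnosis.
- Use of an RNA aptamer obtained by screening with the method of any one of claims 1-24 in the preparation of a biochip, a pharmaceutical composition, or a diagnostic reagent.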
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210648004.5A CN117230068A (zh) | 2022-06-08 | 2022-06-08 | 一种筛选rna适配体的方法 |
CN202210648004.5 | 2022-06-08 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023237066A1 true WO2023237066A1 (zh) | 2023-12-14 |
Family
ID=89091756
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2023/099222 WO2023237066A1 (zh) | 2022-06-08 | 2023-06-08 | 一种筛选rna适配体的方法 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN117230068A (zh) |
WO (1) | WO2023237066A1 (zh) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018198013A1 (en) * | 2017-04-24 | 2018-11-01 | Centre National De La Recherche Scientifique | Fluorogen-binding rna aptamers |
CN110938632A (zh) * | 2020-01-02 | 2020-03-31 | 郑州大学 | 一种特异结合tnf-r1的适配体及其筛选方法和应用 |
CN112501174A (zh) * | 2020-12-01 | 2021-03-16 | 苏州方舟生物科技有限公司 | 与cd44-透明质酸结合域蛋白特异结合的rna适配体及其筛选方法和应用 |
CN112695037A (zh) * | 2020-11-24 | 2021-04-23 | 江苏为真生物医药技术股份有限公司 | Egfr特异性的核酸适配体及其应用 |
CN113564155A (zh) * | 2021-07-22 | 2021-10-29 | 华侨大学 | 适配体筛选方法及其应用 |
-
2022
- 2022-06-08 CN CN202210648004.5A patent/CN117230068A/zh active Pending
-
2023
- 2023-06-08 WO PCT/CN2023/099222 patent/WO2023237066A1/zh unknown
Non-Patent Citations (1)
Title |
---|
LI MIN-SI; SONG ZHAN-YUN; FENG XIN; YANG QIAN; ZHENG YAN; WEI CHUN-YAN; WANG WEI-LI; WANG ZHEN-GUO; FU XIAO-PING: "Selection and characterization of RNA aptamer against Newcastle disease virus hemagglutinin protein RNA aptamer", ZHONGGUO-SHOUYI-XUEBAO = CHINESE JOURNAL OF VETERINARY SCIENCE, JILIN DAXUE ZHUBAN. ZHONGGUO-SHOUYI-XUEBAO BIANJIBU BIANJI, CHINA, vol. 34, no. 3, 15 March 2014 (2014-03-15), CHINA , pages 395 - 401, XP009550756, ISSN: 1005-4545, DOI: 10.16303/j.cnki.1005-4545.2014.03.006 * |
Also Published As
Publication number | Publication date |
---|---|
CN117230068A (zh) | 2023-12-15 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 23819230 Country of ref document: EP Kind code of ref document: A1 |