IT201900015902A1

IT201900015902A1 - PROCEDURE TO PREPARE AN RNA SAMPLE FOR SEQUENCING AND RELATED KIT

Info

Publication number: IT201900015902A1
Application number: IT102019000015902A
Authority: IT
Inventors: Claudia Firrito; Piano Alessia Del; Massimiliano Clamer
Original assignee: Immagina Biotechnology S R L
Priority date: 2019-09-09
Filing date: 2019-09-09
Publication date: 2021-03-09

Description

Description of the industrial invention entitled:

"Method for preparing an RNA sample for sequencing and related kit"

TEXT OF THE DESCRIPTION

Field of the invention

The present description relates to a new method for preparing an RNA sample for sequencing and to a kit for carrying out this method.

Background

RNA-protein interactions play a fundamental role in controlling crucial aspects of cell biology, from mRNA transcription, pre-mRNA splicing and RNA signaling functions to translation and protein localization <1>. Given the importance of understanding these biological processes, considerable effort has been devoted to developing methods for studying and characterizing these interactions, from chemical labeling of RNA and proteins <2> to genome-wide high-throughput sequencing of RNA footprints <3-6>. However, sequencing approaches generally suffer from several limitations during sample preparation, such as extensive handling steps, PCR amplification, and the inability to selectively capture RNA sequences bearing a phosphate or 2',3'-cyclic phosphate group at the 3' end (3'-P/cP), which reduces the accuracy of the result <7>. This leads to cross-reactivity of the libraries with unwanted RNA targets, high background noise and poor library quality, obscuring important biological information about 3'-P/cP-terminated RNA products. The 3'-P/cP is generated by enzymatic cleavage, and 3'-P/cP RNAs play a key role in many disease states (such as cancer and amyotrophic lateral sclerosis <8,9>), biological processes (such as the unfolded protein response <10>, stress granule production <8>, RNA metabolism <11>, rRNA and tRNA biogenesis <12>, and mRNA splicing <13>), and biological functions (such as neuronal survival <14> and the inflammatory response <15>).
Although the 3'-end phosphate signature is an important functional marker, most sequencing procedures do not preserve this chemical feature during library preparation. Few methods are available for detecting 3'-P or 3'-cP, and they either allow only indirect detection of 3'-P <16> or are exclusively selective for 3'-cP <16,17>. Moreover, these protocols are laborious and time-consuming, and involve PCR amplification steps that can lead to uneven sequence coverage or sequencing errors (for example within repetitive regions).

From a technical perspective, many RNA footprinting techniques employ endoribonucleases to characterize RNA-protein interactions, large RNA-protein complexes <18>, or the interaction of small molecules <19> with RNA. One experimental setting that is strongly affected by the lack of library preparation protocols capable of selectively capturing 3'-P termini is ribosome profiling (Ribo-seq), an RNA footprinting method based on deep sequencing of ribosome-protected fragments (RPFs) 25-35 nt long, i.e. the mRNA fragments generated after nuclease digestion of unprotected single-stranded RNA. By providing a snapshot of the positions of ribosomes along transcripts at a given moment, this technique is a powerful approach for studying the biology of protein synthesis <20>. Current ribosome profiling protocols involve many sequential steps and are based on the Illumina sequencing platform.
In particular, after RPF isolation, two alternative library preparation workflows are currently available: (i) a workflow based on the sequential steps of adapter ligation to the 3' end of the RPFs, cDNA synthesis, circularization and PCR amplification, with a total of four gel extraction steps <21>; and (ii) a ligation-independent workflow for low-input material, for which commercial products are available, involving the sequential steps of 3' polyadenylation of the RNA, template-switching cDNA synthesis and PCR amplification <22>, and requiring one gel extraction step. The main drawbacks of the available protocols are (i) PCR amplification bias and (ii) the failure to preserve the 3'-P/cP termini (which are a hallmark of actual digestion), with consequent under-representation of RNA species bearing 3'-P/cP (which represent the actual RPFs) in the sequencing datasets. Indeed, both workflows require a dephosphorylation step before adapter ligation <3> or polyadenylation. This handling step lowers the specificity of the ligation reactions, leading to the capture of any short RNA molecule bearing an -OH group at its 3' terminus.

In addition, recent studies have revealed important differences among biological fluids (umbilical cord blood plasma, bronchoalveolar lavage, adult blood plasma, parotid saliva, ovarian follicle fluid, serum, amniotic fluid, seminal plasma, urine, bile, submandibular/sublingual saliva, cerebrospinal fluid) in the relative amount and type of small RNA populations such as tRNA-derived RNAs, piwi-interacting RNAs, and Y RNAs. Importantly, some of them are known to bear a 3'-P or 2',3'-cP and have been associated with cancer and with neurological and immunological disorders <23>. In this clinical scenario, these RNA species may have a potential role as biomarkers, with predictive and/or prognostic significance in patient stratification <24,25>.

There is therefore a need for new methods for preparing an RNA sample for sequencing that are free from the drawbacks of the known methods.

Summary of the invention

The purpose of the present description is to provide a new method for preparing an RNA sample for sequencing and a kit for carrying out this method.

According to the invention, the above purpose is achieved by the subject matter specifically recited in the claims that follow, which are to be understood as an integral part of the present description.

The present invention relates to a method for preparing at least one RNA molecule contained in a biological sample for sequencing, comprising the following steps:

(i) obtaining a biological sample comprising at least one RNA molecule, wherein the at least one RNA molecule has a phosphate or 2',3'-cyclic phosphate group at the 3' end;

(ii) phosphorylating the at least one RNA molecule at the 5' end, thereby introducing a phosphate group at the 5' end of the at least one RNA molecule and obtaining at least one RNA molecule phosphorylated at both ends;

(iii) ligating the 5' end of the at least one phosphorylated RNA molecule to the 3' end of a first random RNA linker, wherein the first random RNA linker has an -OH group at the 3' end and a terminal blocking group at the 5' end, obtaining at least one first ligation product; and

(iv) ligating the 3' end of the at least one first ligation product to the 5' end of a second random RNA linker having -OH groups at both ends, obtaining at least one second ligation product;

wherein the at least one second ligation product is suitable for sequencing, preferably single-molecule sequencing.
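As an illustration only, the end-group bookkeeping of steps (i) to (iv) can be sketched in Python. This is a minimal model under stated assumptions, not part of the claimed method: the class `Rna`, the functions and the end-state labels ("OH", "P", "cP", "blocked") are hypothetical names introduced here, and the sketch tracks only which chemical group sits at each terminus as recited in the steps above, without modeling enzymes or reaction conditions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rna:
    """Hypothetical record of an RNA molecule and its terminal groups."""
    name: str
    five_prime: str   # "OH", "P" or "blocked"
    three_prime: str  # "OH", "P" or "cP"


def phosphorylate_5p(rna: Rna) -> Rna:
    """Step (ii): add a 5' phosphate; the 3'-P/cP end is left untouched."""
    if rna.three_prime not in ("P", "cP"):
        raise ValueError("step (i) requires a 3'-P or 2',3'-cP terminus")
    return Rna(rna.name, "P", rna.three_prime)


def ligate_first_linker(linker: Rna, rna: Rna) -> Rna:
    """Step (iii): join the linker 3'-OH to the RNA 5'-P."""
    if linker.five_prime != "blocked" or linker.three_prime != "OH":
        raise ValueError("first linker must be 5'-blocked with a 3'-OH")
    if rna.five_prime != "P":
        raise ValueError("RNA must carry a 5' phosphate")
    return Rna(f"{linker.name}-{rna.name}", linker.five_prime, rna.three_prime)


def ligate_second_linker(product: Rna, linker: Rna) -> Rna:
    """Step (iv): join the product 3'-P/cP to the second linker 5'-OH."""
    if product.three_prime not in ("P", "cP"):
        raise ValueError("first ligation product must end in 3'-P or cP")
    if linker.five_prime != "OH" or linker.three_prime != "OH":
        raise ValueError("second linker must have -OH at both ends")
    return Rna(f"{product.name}-{linker.name}", product.five_prime,
               linker.three_prime)


def prepare(fragment: Rna, linker_a: Rna, linker_b: Rna) -> Rna:
    """Chain steps (ii) to (iv) for a fragment obtained in step (i)."""
    phosphorylated = phosphorylate_5p(fragment)
    first_product = ligate_first_linker(linker_a, phosphorylated)
    return ligate_second_linker(first_product, linker_b)
```

In this model a fragment with a 3'-OH terminus is rejected at step (ii), which mirrors the selectivity for 3'-P/cP-terminated RNA that the method aims for; a fragment such as `Rna("RPF", "OH", "cP")` passes through all steps and ends up flanked by both linkers.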

The present method is PCR-free and can be applied to any 3'-P/cP-terminated RNA footprint or to any RNA fragment bearing a 3'-P/cP. The method that is the subject of the present application, named CLA-p-seq#1, allows post-transcriptional RNA modifications to be preserved and detected using nanopore-based single-molecule direct RNA sequencing platforms. This method overcomes some of the limitations that have traditionally affected the study of RNA footprints, such as time-consuming protocols and PCR bias, and provides a powerful procedure for deep sequencing of RNA fragments bearing 3'-P/cP with the Oxford Nanopore platform, thus allowing real-time, single-molecule detection of biologically relevant RNA species.

In a further embodiment, the present description relates to a kit for carrying out the method for preparing at least one RNA molecule contained in a biological sample for sequencing, wherein the kit comprises a first random RNA linker, a second random RNA linker, and a first and a second ligase enzyme, wherein:

(i) the first random RNA linker has an -OH group at the 3' end and a terminal blocking group at the 5' end;

(ii) the second random RNA linker has -OH groups at both ends;

(iii) the first ligase enzyme is suitable for ligating the 5' end of an RNA molecule, which has a phosphate or 2',3'-cyclic phosphate group at the 3' end and a phosphate group at the 5' end, to the 3' end of the first random RNA linker; and

(iv) the second ligase enzyme is suitable for ligating the 3' end of the RNA molecule to the 5' end of the second random RNA linker.

Brief description of the drawings

The invention will now be described in detail, purely by way of illustrative and non-limiting example, with reference to the attached figures, in which:

- Figure 1. A) TBE-urea PAGE analysis of the polyadenylation treatment of fragments deriving from RNase I digestion. B) Schematic representation of the method named CLA-p-seq#1. C) TBE-urea PAGE analysis of all steps of CLA-p-seq#1.

- Figure 2. Direct RNA sequencing of the 5'-linkerA-GFP-linkerB-3' library. A) Length distribution of the sequencing reads. B) Representative image of the mapping of the reads against the reference sequence. Bioinformatic analysis was performed with CLC Genomics Workbench (v12; QIAGEN).

- Figure 3. Nucleotide sequences.

Detailed description of the invention

In the following description, numerous specific details are provided to give a thorough understanding of the embodiments. The embodiments can be practiced without one or more of the specific details, or with other methods, components, materials, etc. In other cases, well-known structures, materials, or operations are not shown or described in detail, to avoid obscuring aspects of the embodiments.

Throughout the present description, reference to "one embodiment" or "an embodiment" indicates that a particular version, structure, or feature described in connection with the embodiment is included in at least one embodiment. Thus, the expressions "in one embodiment" or "in an embodiment" appearing at various points within the present description do not necessarily all refer to the same embodiment. Furthermore, the particular versions, structures, or features can be combined in any suitable way in one or more embodiments.

The headings provided herein are for convenience only and do not interpret the scope or meaning of the various embodiments.

The present invention relates to a new method for preparing at least one RNA molecule contained in a biological sample for sequencing, comprising the following steps:

(i) obtaining a biological sample comprising at least one RNA molecule, wherein the at least one RNA molecule has a phosphate or 2',3'-cyclic phosphate group at the 3' end;

(ii) phosphorylating the at least one RNA molecule at the 5' end, thereby introducing a phosphate group at the 5' end of the at least one RNA molecule and obtaining at least one RNA molecule phosphorylated at both ends;

(iii) ligating the 5' end of the at least one phosphorylated RNA molecule to the 3' end of a first random RNA linker, wherein the first random RNA linker has an -OH group at the 3' end and a terminal blocking group at the 5' end, obtaining at least one first ligation product; and

(iv) ligating the 3' end of the at least one first ligation product to the 5' end of a second random RNA linker having -OH groups at both ends, obtaining at least one second ligation product;

wherein the at least one second ligation product is suitable for sequencing, preferably single-molecule sequencing. More preferably, the sequencing is performed with the Oxford Nanopore sequencing platform (nanopore-based sequencing).

In one embodiment, the biological sample can be selected from a cell lysate of eukaryotes (plants, animals, fungi and unicellular organisms such as protists), viruses or prokaryotes; tissue (including blood and biopsies, in vitro and ex vivo cells); biological fluids (umbilical cord blood plasma, bronchoalveolar lavage, adult blood plasma, parotid saliva, ovarian follicle fluid, serum, amniotic fluid, seminal plasma, urine, bile, submandibular/sublingual saliva, cerebrospinal fluid); and 3D cell cultures.

In one embodiment, the at least one RNA molecule having a phosphate or 2',3'-cyclic phosphate group at the 3' end is generated by treating the biological sample with an endoribonuclease, an exoribonuclease, a ribozyme or a toxin capable of cleaving mRNA, tRNA, snRNA, snoRNA, Y RNA, lncRNA, piRNA, siRNA, viral RNA (from positive-sense RNA viruses, negative-sense RNA viruses, reverse-transcribing viruses, and other RNA species produced by viruses) or rRNA.

In one embodiment, the at least one RNA molecule having a phosphate or 2',3'-cyclic phosphate group at the 3' end is physiologically or pathologically present in the biological sample as a consequence of the action of an endoribonuclease, an exoribonuclease, a ribozyme or a toxin capable of cleaving mRNA, tRNA, snRNA, snoRNA, Y RNA, lncRNA, piRNA, siRNA, viral RNA (from positive-sense RNA viruses, negative-sense RNA viruses, reverse-transcribing viruses, and other RNA species produced by viruses) or rRNA present in the biological sample.

In one embodiment, the endoribonuclease is preferably selected from RNase A; RNase T1; RNase T2; RNase I; micrococcal nuclease S7; staphylococcal nuclease; RNase L; angiogenin; colicin E5; tRNA splicing endonuclease (SE2, SEN34); ferredoxin-like Cas6 and ferredoxin-like CasE; IRE1; poly(U)-specific endoribonuclease (PP11); Las1; RtcA; type IB topoisomerase; Cue2 endonuclease <26>; and Cas proteins.

In one embodiment, the exoribonuclease is preferably USB1.

In one embodiment, the ribozyme is preferably selected from hammerhead ribozyme, hairpin ribozyme, hepatitis delta ribozymes, and Varkud satellite (VS) ribozyme.

In one embodiment, the toxin is preferably selected from colicin D, colicin E5, alpha-sarcin, zymocin, PaT, MazF, ChpBK, and prrC.

In one embodiment, the at least one RNA molecule to be sequenced is single-stranded.

In one embodiment, the at least one RNA molecule to be sequenced is contained in the biological sample at a concentration between 10 pM and 100 µM, preferably between 1 nM and 10 µM.

In one embodiment, the phosphorylation step (ii) is carried out using a phosphorylating enzyme selected from T4 PNK 3' minus, T4 PNK and other recombinant versions of the T4 PNK enzyme (for example Optikinase<TM>).

In one embodiment, the ligation step (iii) is carried out using a ligase enzyme selected from RtcB, Archease, Arabidopsis thaliana tRNA ligase, and eukaryotic tRNA ligase.

In one embodiment, the ligation step (iv) is carried out using a ligase enzyme selected from T4 RNA ligase 1, T4 RNA ligase 2, truncated T4 RNA ligase 2, T4 RNA ligase 2 K227Q, and Mth RNA ligase.

In a further embodiment, the present invention provides a kit for carrying out the method for preparing at least one RNA molecule contained in a biological sample for sequencing, wherein the kit comprises a first random RNA linker, a second random RNA linker, and a first and a second ligase enzyme, wherein:

(i) the first random RNA linker has an -OH group at the 3' end and a terminal blocking group at the 5' end;

(ii) the second random RNA linker has -OH groups at both ends;

(iii) the first ligase enzyme is suitable for ligating the 5' end of an RNA molecule, which has a phosphate or 2',3'-cyclic phosphate group at the 3' end and a phosphate group at the 5' end, to the 3' end of the first random RNA linker; and

(iv) the second ligase enzyme is suitable for ligating the 3' end of the RNA molecule to the 5' end of the second random RNA linker.

In one embodiment, the first ligase enzyme is selected from RtcB, Archease, Arabidopsis thaliana tRNA ligase, and eukaryotic tRNA ligase.

In one embodiment, the second ligase enzyme is selected from T4 RNA ligase 1, T4 RNA ligase 2, truncated T4 RNA ligase 2, T4 RNA ligase 2 K227Q, and Mth RNA ligase.

In one embodiment, the kit further comprises (a) a phosphorylating enzyme, and/or (b) an endoribonuclease, a ribozyme, or a toxin capable of cleaving mRNA, tRNA, snRNA, snoRNA, Y RNA, lncRNA, piRNA, siRNA, viral RNA (from positive-sense RNA viruses, negative-sense RNA viruses, reverse-transcribing viruses, and other RNA species produced by viruses) or rRNA. Preferably, the kit further comprises an endoribonuclease, wherein the endoribonuclease is RNase I.

In one or more embodiments, the first and second random RNA linkers each have a length between 50 and 500 nucleotides, provided that the sum of the lengths of the first and second random RNA linkers is between 200 and 1000 nucleotides.

In one or more embodiments, the first and second random RNA linkers have a minimum free energy between -3 and -150 kcal/mol. Preferably, each random RNA linker is designed to have a minimum free energy between -6 kcal/mol and -24 kcal/mol, without significant secondary structures. Some secondary structure is allowed in the internal portion of the sequence, but not at the 5'/3' termini. The minimum free energy can be calculated using software available to the skilled person.
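The length and free-energy ranges stated above can be checked mechanically once a linker pair has been designed. The following is a minimal sketch, not part of the described process: the names `Linker` and `check_linker_pair` are hypothetical, the ranges are assumed to be inclusive, and the minimum free energy values are assumed to have been computed beforehand with an RNA folding program.

```python
from dataclasses import dataclass

# Ranges quoted in the description (treated as inclusive; an assumption).
LENGTH_RANGE_NT = (50, 500)          # each linker
TOTAL_LENGTH_RANGE_NT = (200, 1000)  # sum of both linkers
MFE_RANGE = (-150.0, -3.0)           # kcal/mol, broad range
PREFERRED_MFE_RANGE = (-24.0, -6.0)  # kcal/mol, preferred range


@dataclass
class Linker:
    name: str
    length_nt: int
    mfe_kcal_mol: float  # supplied by the user, computed with external software


def in_range(value, bounds):
    low, high = bounds
    return low <= value <= high


def check_linker_pair(first, second):
    """Return a list of human-readable problems; an empty list means all checks pass."""
    problems = []
    for linker in (first, second):
        if not in_range(linker.length_nt, LENGTH_RANGE_NT):
            problems.append(f"{linker.name}: length {linker.length_nt} nt is outside 50-500 nt")
        if not in_range(linker.mfe_kcal_mol, MFE_RANGE):
            problems.append(f"{linker.name}: MFE {linker.mfe_kcal_mol} kcal/mol is outside -150 to -3")
        elif not in_range(linker.mfe_kcal_mol, PREFERRED_MFE_RANGE):
            problems.append(f"{linker.name}: MFE {linker.mfe_kcal_mol} kcal/mol is outside the preferred -24 to -6")
    total = first.length_nt + second.length_nt
    if not in_range(total, TOTAL_LENGTH_RANGE_NT):
        problems.append(f"total length {total} nt is outside 200-1000 nt")
    return problems


# Illustrative values only; the MFE figures are invented for the example.
print(check_linker_pair(Linker("first", 120, -10.0), Linker("second", 120, -12.0)))
```

Reporting every violated condition at once, rather than stopping at the first, makes it easier to adjust a candidate design in a single pass.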

In one or more embodiments, the terminal blocking group present at the 5' end of the first random RNA linker is selected from biotin; a primary C1-12 alkylamine; a 7-methyl-guanylate mono-, bis-, or tri-phosphate; an azide group; a cholesteryl-TEG; a 5'-hexynyl group; a 5-octadiynyl-dU; a thiol group; a carboxyfluorescein (FAM); and a cyanine (Cy3, Cy5).

In one embodiment, the first random RNA linker has a nucleotide sequence as described in SEQ ID No.: 2, and the second random RNA linker has a nucleotide sequence selected from SEQ ID No.: 3 and SEQ ID No.: 4.

The first and second random RNA linkers can be chemically synthesized or transcribed in vitro with a modified 5' end.

The first random RNA linker, which has a terminal blocking group at the 5' end, can be generated chemically or enzymatically, according to the common general knowledge of the skilled person.

The second random RNA linker, which has an -OH group at the 5' end, can be generated chemically or enzymatically, according to the common general knowledge of the skilled person. If generated enzymatically, the 5'-OH can be obtained through the catalytic activity of (i) a ribozyme acting in cis (encoded by an in vitro transcribed sequence) or in trans (acting on the in vitro transcribed sequence), (ii) enzymes that leave a 5'-OH group, such as calf intestinal phosphatase, or (iii) a toxin (for example colicin D and colicin E5, alpha-sarcin, zymocin, Pichia acaciae killer toxin (PaT), MazF, ChpBK, prrC).

The random RNA linkers may contain at least one, preferably between 2 and 120, modified nucleotides bearing at least one of the following modifications: LNA, PNA, 2-Aminopurine, 2,6-Diaminopurine (2-Amino-dA), 6mA, 5-Bromo-dU, inverted dT, 5-Methyl-dC, 8-aza-7-deazaguanosine, 5-hydroxybutynyl-2'-deoxyuridine, 5-Nitroindole, 2'-O-Methyl-A, 2'-O-Methyl-G, 2'-O-Methyl-C, 2'-O-Methyl-U, 2'-fluoro-A, 2'-fluoro-C, 2'-fluoro-G, 2'-fluoro-U, 2-MethoxyEthoxy-A, 2-MethoxyEthoxy-MeC, 2-MethoxyEthoxy-G, 2-MethoxyEthoxy-T, 5-Bromo-dU, 2-Aminopurine, inverted dT, 2,6-Diaminopurine, deoxyuridine, inverted dideoxy-T, 5-Methyl-dC, dideoxy-C, deoxyinosine, a universal base comprising 5-Nitroindole, morpholino, a 2'-O-Methyl-R-A base, iso-dC, iso-dG, a ribonucleotide, a threose nucleotide analog, a protein nucleotide analog, a glycol nucleotide analog, a locked nucleotide analog, a chain-terminating nucleotide analog, phosphorylated ribonucleotide dihydrouridine, thiouridine, pseudouridine, queuosine and wyosine, a modified sugar, a non-natural linkage, an abasic site, a dideoxy base, a 5-methyl base, or a spacer selected from a carbon spacer (RNA 5'-O-(CH2)3-PO4-3' RNA), a photocleavable spacer, RNA 5'-O-triethylene glycol-PO4-3' RNA, 18-atom hexaethylene glycol and 1',2'-dideoxyribose. Preferably, the linkers may contain from 1 to 25 modified nucleotides in the first 25 bases from the 5' end and in the last 25 bases from the 3' end.

The inventors have developed a library preparation process for nanopore-based sequencing of short RNA molecules bearing a 3'-P/cP signature, which has been validated in the context of ribosome profiling (Ribo-seq). In particular, CLA-p-seq#1 is a PCR-free process for direct RNA sequencing, which has the advantage of bypassing RT or PCR biases and allows the detection of RNA modifications (for example 6mA or base analogs incorporated de novo into RNA species).

The present process considerably shortens the time needed to prepare Ribo-seq libraries, which currently takes more than a week with the standard protocol<3>, and significantly reduces the number of technical steps (dephosphorylation, gel extraction, purification), thereby lowering the likelihood of introducing biases. The inventors' process enables any RNA footprinting study that relies on enzymatic cleavage leaving 3'-P/cP termini. Furthermore, studies related to cancer and to neurodegenerative, autoimmune and infectious disorders, as well as to a range of cellular functions in which the involvement of 3'-P/cP-terminated RNA molecules has been reported, will particularly benefit from the present process. In particular, the present process is suitable for characterizing the endonucleolytic activity of specific enzymes, ribozymes or toxins<27>, including CRISPR-Cas RNA editing systems<28>. Finally, the process described here allows rapid sequencing procedures without the need for expensive laboratory equipment, even in resource-limited settings.

RESULTS

Cellular RNAs can carry a hydroxyl group (-OH), a phosphate group (-P), or a 2',3'-cyclic phosphate group (cP) at their termini. Cleavage of RNA by many endoribonucleases often leaves 3'-P or 3'-cP ends, which are not a compatible substrate for ATP-dependent ligases (for example T4 RNA ligase). A methodologically relevant context that involves the use of endoribonucleases to cleave RNA strands, followed by ligation events, is Ribo-seq for RNA footprinting, which is based on the following steps: (i) cell lysis, (ii) digestion of ssRNA with an endonuclease (for example RNase I), (iii) collection of 25-35 nt long fragments (the actual RPFs), (iv) library preparation, (v) deep sequencing and (vi) final alignment to a protein-coding transcriptome.

To determine the fraction of genuine RPFs in the total population of fragments resulting from RNase I cleavage, the inventors took advantage of a 3' polyadenylation treatment. The results show that about 50% of the 25-35 nt long fragments obtained from cultured cells (MCF7) reacted in the polyadenylation reaction (Figure 1A). This indicates that size-selected RPFs are contaminated by RNA species bearing 3'-OH ends, which can be captured by standard ligation procedures and thus generate higher background noise.

To overcome the limitations of the currently available library preparation strategies for Ribo-seq, the present inventors sought to develop a process that allows (i) preservation of the 3'-P/cP signature and (ii) independence from PCR amplification steps. To achieve this, the inventors used (i) an enzyme capable of ligating 5'-OH termini to 3'-P/cP termini (RtcB ligase)<29,30>, and (ii) random linkers having the characteristics indicated above and suitable for PCR-free nanopore-based sequencing. In particular, the present inventors designed a process dedicated to nanopore-based direct RNA sequencing. To provide a proof of concept of the feasibility of the approach, the inventors first used a 30 nt long synthetic RNA fragment bearing -P groups at both the 5' and 3' ends (5'P-GFP-3'P) in place of cell-derived phosphorylated RPFs. The GFP fragment has the nucleotide sequence described in SEQ ID No.: 1.

In the CLA-p-seq#1 process (Figure 1B and C), the GFP fragment was ligated to a first random RNA linker bearing a 3'-OH end (linker A, having the nucleotide sequence described in SEQ ID No.: 2) by means of T4 RNA ligase 1. To ensure selective reactivity of the 3'-OH terminus of linker A with the 5'-P end of the GFP fragment, a biotin moiety (terminal blocking group) was covalently attached to the 5' end of linker A, thereby preventing any unwanted intramolecular reactivity of linker A and of the ligation products. The ligation product was then separated by TBE-urea PAGE, size-selected and gel-purified for the subsequent reaction (Figure 1C). The resulting 5'-biotin-linkerA-GFP-3'P construct was then ligated to a second random RNA linker bearing -OH groups at both ends (linker B, having the nucleotide sequence described in SEQ ID No.: 3 or 4) using RtcB ligase, which catalyzes the reaction of the 5'-OH end of linker B with the 3'-P terminus of the 5'-biotin-linkerA-GFP-3'P product. The final ligation product (5'-biotin-linkerA-GFP-linkerB-3'OH) was then separated by TBE-urea PAGE, size-selected and gel-purified (Figure 1C); this ligation product is suitable for sequencing.
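The end-group logic of the two ligation steps described above can be illustrated with a small model. This is only a sketch under simplifying assumptions: the class `Rna` and the functions `t4_rna_ligase_join` and `rtcb_join` are hypothetical names, each molecule is reduced to its two terminal groups, and enzyme efficiency, side reactions and purification are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rna:
    name: str
    five_prime: str   # "P", "OH" or "biotin"
    three_prime: str  # "P", "cP" or "OH"


def t4_rna_ligase_join(upstream: Rna, downstream: Rna) -> Rna:
    """ATP-dependent ligation: a 3'-OH (upstream) is joined to a 5'-P (downstream)."""
    if upstream.three_prime != "OH" or downstream.five_prime != "P":
        raise ValueError("T4 RNA ligase needs a 3'-OH upstream and a 5'-P downstream")
    return Rna(f"{upstream.name}-{downstream.name}", upstream.five_prime, downstream.three_prime)


def rtcb_join(upstream: Rna, downstream: Rna) -> Rna:
    """RtcB ligation: a 3'-P or 3'-cP (upstream) is joined to a 5'-OH (downstream)."""
    if upstream.three_prime not in ("P", "cP") or downstream.five_prime != "OH":
        raise ValueError("RtcB needs a 3'-P/cP upstream and a 5'-OH downstream")
    return Rna(f"{upstream.name}-{downstream.name}", upstream.five_prime, downstream.three_prime)


# The three molecules used in the proof of concept.
linker_a = Rna("linkerA", five_prime="biotin", three_prime="OH")
gfp = Rna("GFP", five_prime="P", three_prime="P")
linker_b = Rna("linkerB", five_prime="OH", three_prime="OH")

intermediate = t4_rna_ligase_join(linker_a, gfp)   # 5'-biotin-linkerA-GFP-3'P
final = rtcb_join(intermediate, linker_b)          # 5'-biotin-linkerA-GFP-linkerB-3'OH
print(final)
```

In this model a fragment with a 3'-OH end fails the `rtcb_join` check, which mirrors why the second step selects for fragments bearing the 3'-P/cP signature.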

The present process enriches for RNA fragments bearing 3'-P/3'-cP, because this signature is essential for the efficiency of the protocol as a whole. The present approach is compatible with nanopore-based direct RNA sequencing without downstream PCR and offers the possibility of multiplexed assays when combined with barcoded linkers.

The library preparation process described here represents the first PCR-free protocol for the selective incorporation of 3'-P/3'-cP-terminated RNA molecules to be sequenced on a nanopore-based single-molecule sequencing platform.

Nanopore-based sequencing of short 3'-P-terminated RNA fragments.

The GFP-based library described above was sequenced on a MinION from Oxford Nanopore Technologies (ONT) using R9.4 flow cells and the 1D chemistry for direct RNA sequencing.

For CLA-p-seq#1, the inventors used a primer capable of annealing specifically to the 3' end of linker A (having the nucleotide sequence described in SEQ ID No.: 5) for the synthesis of the complementary cDNA strand needed to stabilize the ssRNA molecule according to the ONT protocol for direct RNA sequencing. From sequencing a library input of 50 ng, about 80,000 reads (failed reads < 5%) were obtained in 2 hours, with a mean basecalling quality score of around 8.5 and a basecalling accuracy of about 90%. Basecalling is the identification of the nucleotide sequence by software associated with the sequencing platform, which converts the chemical/physical signals produced by the platform into a sequence. The read length distribution was consistent with the size of the reference sequence (270 nt). Of the reads that passed basecalling with MinKNOW, 76% mapped back to the reference sequence 5'-linkerA-GFP-linkerB-3', with alignment on linker A (72%) similar to that on linker B (67%) and coverage of the GFP insert by >50% of the reads (Table 1). Consistent with the fact that nanopore-based sequencing works better with longer RNA strands<31>, the inventors carried out CLA-p-seq#1 with two different lengths of linker B (60 nt and 120 nt, respectively) and observed that coverage increases with the length of linker B (Table 1). This makes it possible to reduce the error at the 3' end of the sequences (where nanopore-based reading of the RNA strand begins).
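The 270 nt reference size quoted above follows from the lengths of the three ligated parts. The short sketch below makes the arithmetic explicit for both linker B lengths; the function name `reference_length` is hypothetical, and the 120 nt length of linker A is taken from the description of the custom linkers as 120-mers.

```python
def reference_length(linker_a_nt: int, insert_nt: int, linker_b_nt: int) -> int:
    """Length of the 5'-linkerA-insert-linkerB-3' reference, in nucleotides."""
    return linker_a_nt + insert_nt + linker_b_nt


LINKER_A_NT = 120  # linker A is described as a 120-mer
GFP_INSERT_NT = 30  # synthetic GFP fragment

for library, linker_b_nt in (("library 1", 60), ("library 2", 120)):
    total = reference_length(LINKER_A_NT, GFP_INSERT_NT, linker_b_nt)
    print(f"{library}: linker B {linker_b_nt} nt -> reference {total} nt")
# library 1: linker B 60 nt -> reference 210 nt
# library 2: linker B 120 nt -> reference 270 nt
```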

Table 1. Percentage of reads mapping to different portions of the reference sequence. Library 1 was prepared using a 60 nt linker B, while library 2 was prepared with a 120 nt linker B.

Taken together, the present results provide evidence that the library preparation process (i) can incorporate short synthetic RNA molecules that resemble cleaved endogenous RNAs bearing a 3'-P signature, and (ii) is indeed applicable to the ONT sequencing platform.

MATERIALS AND METHODS

Ribosome-protected fragments and linkers

Custom linkers A and B were synthesized by Integrated DNA Technologies (Coralville) and consist of 120-mer oligonucleotides (having the nucleotide sequences described in SEQ ID No.: 2 and 3, respectively) with a 5'-biotin blocking group or a 5'-OH terminal group.

The ribosome-protected fragments (RPFs), which consist of 30-mer oligonucleotides having the nucleotide sequence described in SEQ ID No.: 1 with 5'-P and 3'-P, were synthesized by Integrated DNA Technologies (Coralville) or generated in vitro.

The in vitro generated RPFs were obtained from HEK293T (Human Embryonic Kidney) cells (Sigma, catalog number 12022001). The cells were treated with CHX (10 µg/ml, Sigma, catalog number 01810) for 5 min at 37°C and lysed. The RPFs were generated by incubating 0.3 AU (260 nm) of CHX-treated cell lysate with 2.25 U of RNase I (Ambion, catalog number AM2295) in W-buffer (Immagina BioTechnology, catalog number #RL001-4) at room temperature for 45 min (as described in Clamer et al., 2018)<32>.
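The digestion above uses 2.25 U of RNase I for 0.3 AU (260 nm) of lysate. The sketch below shows how that ratio could be carried over to a different lysate input; linear scaling is an assumption made here for illustration, not a statement from the description, and the function name `rnase_i_units` is hypothetical.

```python
# Values stated in the text.
REFERENCE_RNASE_I_UNITS = 2.25
REFERENCE_LYSATE_AU260 = 0.3


def rnase_i_units(lysate_au260: float) -> float:
    """RNase I units for a given lysate input, assuming the stated U/AU ratio is kept constant."""
    if lysate_au260 < 0:
        raise ValueError("lysate input must be non-negative")
    return REFERENCE_RNASE_I_UNITS * lysate_au260 / REFERENCE_LYSATE_AU260


for au in (0.3, 0.6, 1.0):
    print(f"{au} AU -> {rnase_i_units(au):.2f} U RNase I")
```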

The RNase I digestion was stopped by adding 10 U of Superase Inhibitor™ (Thermo Scientific, catalog number AM2696) for 10 min on ice. After digestion, the lysate was purified (as described in Ingolia et al., 2009)<33> and treated with 1% SDS (Sigma, catalog number 05030) and 0.1 mg of Proteinase K (Euroclone, catalog number EMR022001) at 37°C for 75 min. Total RNA was extracted with acid phenol:chloroform, pH 4.5 (Ambion, catalog number AM9722). The RNA was precipitated with isopropanol, air-dried, resuspended in 10 mM Tris-HCl pH 8 and analyzed on a 15% TBE-urea polyacrylamide gel (Invitrogen, catalog number EC6885BOX). The 30-mer RPFs were size-selected and gel-extracted (according to the Ribolace™ protocol, Immagina BioTechnology) (Clamer et al., 2018)<32>.

Following purification, the in vitro generated RPF fragments were subjected to 5' phosphorylation with T4 PNK 3' minus (NEB, catalog number M0236S) before capture with the first and second random RNA linkers.

Ligation protocol

Ligation of the RPFs to the 5' end of linker A (having the nucleotide sequence described in SEQ ID No.: 2) was performed with 2 µM RPF, 1 µM linker A, 2 µl of T4 RNA ligase buffer, 6 µl of 50% PEG8000, 1 µl of T4 RNA ligase (NEB, catalog number M0204L), 0.5 µl of Superase Inhibitor™ (Thermo Fisher, catalog number AM2696), 1 mM ATP and nuclease-free water in a reaction volume of 20 µl. The reaction was carried out at room temperature for 2 hours.

Ligation of the ligated product to the 3' end of linker B (having the nucleotide sequence described in SEQ ID No.: 3) was obtained with RtcB ligase. The ligation was performed with 1 µl of RtcB ligase (NEB, catalog number M0458S), 2 µl of RtcB ligase buffer, 0.1 mM GTP, 1 mM MnCl2, ligase buffer, 0.5 µM linker B and 0.5 µM ligated product in a reaction volume of 20 µl. The reaction was incubated at 37°C for 2 hours in a thermal cycler (Eppendorf).
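The concentrations and volumes given for the two 20 µl ligation reactions can be converted into final concentrations and molar amounts with two one-line formulas. The following sketch is illustrative only; the helper names `final_concentration` and `picomoles` are hypothetical, and only values stated in the text are used as inputs.

```python
REACTION_VOLUME_UL = 20.0  # both ligation reactions are run in 20 µl


def final_concentration(stock_conc: float, stock_volume_ul: float, total_volume_ul: float) -> float:
    """Dilution rule C1*V1 = C2*V2 solved for C2 (same unit as stock_conc)."""
    return stock_conc * stock_volume_ul / total_volume_ul


def picomoles(conc_um: float, volume_ul: float) -> float:
    """Amount in pmol, using 1 µM x 1 µl = 1 pmol."""
    return conc_um * volume_ul


# First ligation (T4 RNA ligase): 6 µl of 50% PEG8000 in 20 µl.
peg_percent = final_concentration(50.0, 6.0, REACTION_VOLUME_UL)
rpf_pmol = picomoles(2.0, REACTION_VOLUME_UL)
linker_a_pmol = picomoles(1.0, REACTION_VOLUME_UL)

# Second ligation (RtcB): linker B and ligated product at 0.5 µM each.
linker_b_pmol = picomoles(0.5, REACTION_VOLUME_UL)
ligated_product_pmol = picomoles(0.5, REACTION_VOLUME_UL)

print(f"PEG8000 final concentration: {peg_percent}%")
print(f"RPF: {rpf_pmol} pmol, linker A: {linker_a_pmol} pmol")
print(f"linker B: {linker_b_pmol} pmol, ligated product: {ligated_product_pmol} pmol")
```

With these inputs the first reaction contains the RPF in twofold molar excess over linker A, while the second reaction is equimolar.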

Gel analysis (optional)

The ligation products were analyzed on precast 6% Tris-borate-EDTA (TBE)-urea acrylamide gels (Invitrogen, catalog number EC6865BOX). The samples were mixed 1:1 with gel loading II (Thermo Fisher Scientific, catalog number AM8547), denatured at 70°C for 90 s before loading onto the gel, and run at 200 V. The gels were then stained with Sybr Gold (Invitrogen, catalog number S11494) and scanned using a ChemiDoc (GE Healthcare, Piscataway, NJ). The gel images were analyzed using ImageLab (Biorad). If necessary, the bands corresponding to the ligated products were excised from the gel, crushed and soaked overnight in buffer I (Immagina BioTechnology, catalog number #RL001-10) at room temperature with constant rotation. The aqueous phase was filtered with Millipore Ultrafree MC tubes and then precipitated with isopropanol (Sigma, catalog number I9516) at -80°C for 2 hours or overnight. The pellet was washed with 70% ethanol, centrifuged at 12000 g for 5 min at 4°C and air-dried for 20 min before further processing.

Sequencing

For direct RNA sequencing, library preparation was carried out according to the protocol of the ONT SQK-RNA002 kit (Oxford Nanopore Technologies).

For the reverse transcription step, a custom primer, "Oligo B luc" (having the nucleotide sequence described in SEQ ID No.: 5), was used; it anneals to the 3' region of the ligated product. RNA sequencing was performed using ONT R9.4 flow cells and the supplied MinKNOW script.

Data analysis

Bioinformatic analysis of the direct RNA sequencing data was performed with CLC Genomics Workbench (v12; Qiagen).

Schematic description of the CLA-p-seq process #1

Step 1. 5' phosphorylation of gel-extracted RPFs (bearing a 3'-P or 3'-cP) with T4 PNK 3' minus (NEB, catalog number M0236S). The phosphorylation reaction is carried out as follows:

- in a PCR tube, mix the following reagents in the quantities indicated in Table 2 below;

- incubate the reaction at 37 °C for 60 minutes; and

- heat-inactivate by incubating at 65 °C for 20 minutes.

Table 2.

n.d., not specified

Step 2. Ligation of 5'P-RPF-3'P to the first linker (5'-biotin-linkerA-3'OH) using T4 Rnl1 (NEB, catalog number M0204L). T4 Rnl1 joins 3'-OH ends to 5'-P ends. Add the reagents in the quantities indicated in Table 3 to the reaction mixture from step 1; the T4 Rnl1 ligation is carried out at room temperature for 2 hours.

Table 3.

Step 3. Gel extraction and precipitation of the linkerA-RPF product (optional step).

Step 4. Ligation of 5'-biotin-linkerA-RPF-3'P to 5'OH-linkerB-3'OH using RtcB (NEB, catalog number M0458S). RtcB ligase joins 5'-OH ends to 3'-P ends. The RtcB ligation is carried out at 37 °C for 2 hours in a thermal cycler (Eppendorf), under the reaction conditions indicated in Table 4.

Table 4.

Step 5. Gel extraction and precipitation of the 5'-biotin-linkerA-RPF-linkerB-3'OH product (optional step).

Step 6. Library preparation for direct RNA sequencing (ONT kit) using the custom oligo B (SEQ ID No.: 5).
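The end-group logic of steps 1 to 5 can be illustrated with a deliberately simplified model. The sketch below is not part of the described procedure: the class and function names are hypothetical, the starting fragment is assumed to carry a 5'-OH, and side reactions, reaction efficiency and the optional gel purifications are ignored. It only encodes the rules stated above: T4 PNK 3' minus adds a 5'-phosphate without removing the 3'-P/cP, T4 Rnl1 joins a 3'-OH to a 5'-P, and RtcB joins a 3'-P/cP to a 5'-OH.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rna:
    name: str
    five_prime: str   # "OH", "P", or a blocking group such as "biotin"
    three_prime: str  # "OH", "P", or "cP"


def pnk_3_minus(rna: Rna) -> Rna:
    """Step 1: add a 5'-phosphate; the 3' end is left untouched."""
    if rna.five_prime != "OH":
        return rna
    return Rna(rna.name, "P", rna.three_prime)


def t4_rnl1(fragment: Rna, linker: Rna) -> Rna:
    """Step 2: join the linker's 3'-OH to the fragment's 5'-P."""
    if linker.three_prime != "OH" or fragment.five_prime != "P":
        raise ValueError("T4 Rnl1 needs a 3'-OH and a 5'-P")
    return Rna(f"{linker.name}-{fragment.name}",
               linker.five_prime, fragment.three_prime)


def rtcb(upstream: Rna, linker: Rna) -> Rna:
    """Step 4: join the upstream 3'-P/cP to the linker's 5'-OH."""
    if upstream.three_prime not in ("P", "cP") or linker.five_prime != "OH":
        raise ValueError("RtcB needs a 3'-P/cP and a 5'-OH")
    return Rna(f"{upstream.name}-{linker.name}",
               upstream.five_prime, linker.three_prime)


def prepare(fragment: Rna, linker_a: Rna, linker_b: Rna) -> Rna:
    """Run steps 1, 2 and 4 in order (optional gel steps omitted)."""
    return rtcb(t4_rnl1(pnk_3_minus(fragment), linker_a), linker_b)


linker_a = Rna("linkerA", "biotin", "OH")
linker_b = Rna("linkerB", "OH", "OH")

product = prepare(Rna("RPF", "OH", "cP"), linker_a, linker_b)
print(product)
```

In this model, a fragment ending in 3'-P or 3'-cP yields the 5'-biotin-linkerA-RPF-linkerB-3'OH product, whereas a fragment ending in 3'-OH is rejected at the RtcB step, which mirrors the selectivity the procedure aims for.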

Claims

1. A process for preparing at least one RNA molecule contained in a biological sample for sequencing, comprising the following steps: (i) obtaining a biological sample comprising at least one RNA molecule, wherein the at least one RNA molecule has a phosphate group or a 2',3'-cyclic phosphate group at the 3' end; (ii) phosphorylating the at least one RNA molecule at the 5' end, thereby introducing a phosphate group at the 5' end of the at least one RNA molecule and obtaining at least one RNA molecule phosphorylated at both ends; (iii) ligating the 5' end of the at least one phosphorylated RNA molecule to the 3' end of a first random RNA linker, wherein the first random RNA linker has an -OH group at the 3' end and a terminal blocking group at the 5' end, obtaining at least one first ligation product; and (iv) ligating the 3' end of the at least one first ligation product to the 5' end of a second random RNA linker that has -OH groups at both ends, obtaining at least one second ligation product; wherein the at least one second ligation product is suitable for sequencing, preferably for single-molecule sequencing.

2. Process according to claim 1, wherein the phosphorylation step (ii) is carried out using a phosphorylating enzyme selected from T4 PNK 3' minus, T4 PNK, and recombinant versions of T4 PNK (for example OptiKinase™).

3. Process according to claim 1 or claim 2, wherein the ligation step (iii) is carried out using a ligase enzyme selected from RtcB, Archease, Arabidopsis thaliana tRNA ligase, and eukaryotic tRNA ligase.

4. Process according to any one of the preceding claims, wherein the ligation step (iv) is carried out using a ligase enzyme selected from T4 RNA ligase 1, T4 RNA ligase 2, truncated T4 RNA ligase 2, T4 RNA ligase 2 K227Q, and Mth RNA ligase.

5. Process according to any one of the preceding claims, wherein the first and second random RNA linkers have a length between 50 and 500 nucleotides, provided that the sum of the lengths of the first and second random RNA linkers is between 200 and 1000 nucleotides.

6. Process according to any one of the preceding claims, wherein the first and second random RNA linkers have a minimum free energy between -3 and -150 kcal/mol.

7. Process according to any one of the preceding claims, wherein the terminal blocking group present at the 5' end of the first random RNA linker is selected from: biotin; a primary C1-12 alkylamine; a 7-methyl-guanylate mono-, bis-, or tri-phosphate; an azide group; a cholesteryl-TEG; a 5'-hexynyl group; a 5-octadiynyl-dU; a thiol group; a carboxyfluorescein (FAM); and a cyanine (Cy3, Cy5).

8. Process according to any one of the preceding claims, wherein the at least one RNA molecule having a phosphate group or a 2',3'-cyclic phosphate group at the 3' end is generated by treating the biological sample with an endoribonuclease, an exoribonuclease, a ribozyme or a toxin capable of cleaving mRNA, tRNA, snRNA, snoRNA, Y RNA, lncRNA, piRNA, siRNA, viral RNA (positive-sense RNA virus, negative-sense RNA virus, reverse transcription virus, and other RNA species produced by viruses) or rRNA.

9. Kit comprising a first random RNA linker, a second random RNA linker, and a first and a second ligase enzyme, wherein: (i) the first random RNA linker has an -OH group at the 3' end and a terminal blocking group at the 5' end; (ii) the second random RNA linker has -OH groups at both ends; (iii) the first ligase enzyme is suitable for ligating the 5' end of an RNA molecule, which has a phosphate group or a 2',3'-cyclic phosphate group at the 3' end and a phosphate group at the 5' end, to the 3' end of the first random RNA linker; and (iv) the second ligase enzyme is suitable for ligating the 3' end of the RNA molecule to the 5' end of the second random RNA linker.

10. Kit according to claim 9, wherein the first ligase enzyme is selected from RtcB, Archease, Arabidopsis thaliana tRNA ligase, and eukaryotic tRNA ligase.

11. Kit according to claim 9 or claim 10, wherein the second ligase enzyme is selected from T4 RNA ligase 1, T4 RNA ligase 2, truncated T4 RNA ligase 2, T4 RNA ligase 2 K227Q, and Mth RNA ligase.

12. Kit according to any one of claims 9 to 11, wherein the first and second random RNA linkers have a length between 50 and 500 nucleotides, provided that the sum of the lengths of the first and second random RNA linkers is between 200 and 1000 nucleotides.

13. Kit according to any one of claims 9 to 12, wherein the first and second random RNA linkers have a minimum free energy between -3 and -150 kcal/mol.

14. Kit according to any one of claims 9 to 13, wherein the terminal blocking group present at the 5' end of the first random RNA linker is selected from: biotin; a primary C1-12 alkylamine; a 7-methyl-guanylate mono-, bis-, or tri-phosphate; an azide group; a cholesteryl-TEG; a 5'-hexynyl group; a 5-octadiynyl-dU; a thiol group; a carboxyfluorescein (FAM); and a cyanine (Cy3, Cy5).

15. Kit according to any one of claims 9 to 14, further comprising (a) a phosphorylating enzyme, and/or (b) an endoribonuclease, a ribozyme or a toxin capable of cleaving mRNA, tRNA, snRNA, snoRNA, Y RNA, lncRNA, piRNA, siRNA, viral RNA (positive-sense RNA virus, negative-sense RNA virus, reverse transcription virus, and other virus-produced RNA species) or rRNA.

16. Kit according to any one of claims 9 to 14, wherein the first random RNA linker has a nucleotide sequence as described in SEQ ID No.: 2, and the second random RNA linker has a nucleotide sequence selected from SEQ ID No.: 3 and 4.