WO2020015621A1 - Method and kit for constructing a platelet nucleic acid library for gene detection - Google Patents

Method and kit for constructing a platelet nucleic acid library for gene detection

Info

Publication number
WO2020015621A1
Authority
WO
WIPO (PCT)
Prior art keywords
nucleic acid
platelet
sequence
sample
data
Prior art date
Application number
PCT/CN2019/096097
Other languages
English (en)
French (fr)
Inventor
肖剑萍
叶国栋
许剑雄
陈茂立
韩大雄
郭奇伟
蔡逸民
杨燕燕
李顺杰
董康梅
朱莎莎
张丽芳
宋丹
Original Assignee
厦门生命互联科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 厦门生命互联科技有限公司
Publication of WO2020015621A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806 Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1096 Processes for the isolation, preparation or purification of DNA or RNA; cDNA synthesis; Subtracted cDNA library construction, e.g. RT, RT-PCR
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869 Methods for sequencing
    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00 Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06 Biochemical methods, e.g. using enzymes or whole viable microorganisms

Definitions

  • The invention relates to the field of sequencing, and in particular to a method and a kit for constructing a platelet nucleic acid library for gene detection.
  • Lung cancer is the tumor with the highest incidence and mortality in China and worldwide. Late staging at the time of diagnosis is an important cause of poor lung cancer prognosis, whereas early-stage lung cancer can achieve a better prognosis, or even a cure, through multidisciplinary comprehensive treatment.
  • Lung cancer is mainly diagnosed and staged using low-dose spiral CT screening, contrast-enhanced chest CT, contrast-enhanced upper abdominal CT (or B-mode ultrasound), contrast-enhanced head MR (or contrast-enhanced CT), and whole-body bone scans. If the CT scan shows suspicious malignant features, the doctor proceeds to a tissue biopsy and samples the tumor tissue for pathological diagnosis.
  • Non-invasive cancer screening currently relies mainly on tumor markers such as alpha-fetoprotein (AFP), carcinoembryonic antigen (CEA), and CA199. Their diagnostic sensitivity and specificity are low, so several tumor markers must be measured together, and they are generally used only to assist diagnosis.
  • Liquid biopsy is a branch of in vitro diagnostics. Its main analytes include free circulating tumor cells (CTCs), circulating tumor DNA (ctDNA), and exosomes in the blood.
  • RNA testing of conditioned platelets, such as tumor-conditioned platelets (Tumor Conditioned Platelets), to detect whether a subject has cancer has become a new liquid biopsy method.
  • Chinese patent application 201610911677.X discloses a quantitative tumor platelet RNA detection model and method for early tumor screening. The model includes specific PCR primers for a combination of tumor platelet RNA biomarkers used in clinical diagnosis: CD79A, CD81, SYTL1, CENPC, TTN, RHOH, ZNF101, TRABD2A, and TRAC.
  • The method includes preparing a sample, extracting RNA, reverse transcription, PCR detection, calculating a Y value with a formula, and evaluating the result.
  • That application uses the combined RNA markers to diagnose tumors with a sensitivity of 92.5%, which is higher than that of the biomarkers commonly used in clinical practice.
  • However, it uses quantitative PCR and detects only 9 RNA biomarkers at a time, so it can only distinguish cancer patients from healthy people and cannot further distinguish tumor types.
  • Chinese patent application 201710731914.9 discloses a quantitative detection method for platelet lncRNA for the diagnosis of non-small cell lung cancer (NSCLC). It confirms that expression of the platelet long non-coding RNAs MAGI2-AS3 and ZFAS1 is lower in NSCLC patients than in healthy people, and on this basis provides a real-time fluorescent quantitative PCR kit for NSCLC diagnosis.
  • By combining the expression data obtained by real-time fluorescent quantitative PCR amplification of MAGI2-AS3 and ZFAS1, a logistic regression model for NSCLC diagnosis was established; the model is reported to have relatively high diagnostic efficacy and sensitivity for NSCLC.
  • However, that application detects only the expression of the platelet long non-coding RNAs MAGI2-AS3 and ZFAS1. Its scope of application is limited to the diagnosis of NSCLC, which makes it difficult to meet clinical needs.
  • An object of the present invention is to provide a method for constructing a platelet nucleic acid library.
  • To this end, the present invention provides a nucleic acid capture probe, characterized in that, starting from the 5' end, the probe comprises in order: a 5'-end biotin modification, an amplification primer sequence P1, a sequencing adapter sequence P5, a sample tag sequence, a single-molecule tag sequence, and a polythymine Oligo(dT) sequence.
  • Further, the amplification primer sequence P1 is shown in SEQ ID NO: 1, the sequencing adapter sequence P5 is shown in SEQ ID NO: 2, the sample tag sequence consists of 3 to 4 nucleotides, the single-molecule tag sequence consists of 10 nucleotides, and the polythymine Oligo(dT) sequence consists of 20 T bases.
  • the invention also provides a kit comprising the nucleic acid capture probe.
  • The invention also provides a method for constructing a platelet nucleic acid library, characterized in that the method comprises:
  • collecting whole blood;
  • isolating ultra-pure platelets;
  • micro-amplification of platelet RNA: using the nucleic acid capture probe of claim 1 or 2, or the nucleic acid capture probe in the kit of claim 3, to perform micro-amplification and obtain an amplification product of platelet full-length cDNA;
  • constructing the platelet nucleic acid library.
  • Further, collecting whole blood means collecting venous blood with a vacuum blood collection tube containing an anticoagulant, and gently inverting the tube several times after collection so that the anticoagulant and the whole blood are thoroughly mixed.
  • Further, the ultra-pure platelets are isolated by centrifugation so that the leukocyte contamination rate in the resulting ultra-pure platelets is below 0.0001%; preferably, a two-step centrifugation method is used; more preferably, dual immunomagnetic beads are used between the two centrifugation steps to remove leukocytes and red blood cells.
  • Further, in the micro-amplification of platelet RNA, the RNA of the ultra-pure platelets serves as the template and the nucleic acid capture probe of claim 1 or 2, or the nucleic acid capture probe in the kit of claim 3, serves as the primer. Reverse transcriptase synthesizes a first-strand cDNA complementary to the platelet RNA, and the template-switching activity of the reverse transcriptase adds the amplification primer sequence P1, shown in SEQ ID NO: 1, to the 3' end of the first-strand cDNA.
  • The synthesized first-strand cDNA complementary to the platelet RNA is then used as the template, with the amplification primer sequence P2 shown in SEQ ID NO: 4 as the primer, for multiple rounds of amplification followed by purification, yielding the amplification product of platelet full-length cDNA.
  • Preferably, first-strand cDNA from multiple different samples can be mixed and amplified in the same reaction system to obtain amplification products of platelet full-length cDNA from different sources.
  • Further, to construct the platelet nucleic acid library, a transposase and a sequencing adapter are used to fragment the platelet full-length cDNA amplification product and add adapters, and sequencing primers are used to PCR-amplify the fragmented product and enrich the 5' end of the cDNA.
  • AmPure XP Beads are used to size-select and purify the amplification product, yielding a platelet nucleic acid library carrying molecular tags at the 5' end; preferably, the sequence of the sequencing adapter is shown in SEQ ID NO: 4, and the sequences of the sequencing primers are shown in SEQ ID NO: 5 and SEQ ID NO: 6.
  • The present invention also provides a method for obtaining gene expression level data, characterized in that, after a platelet nucleic acid library is constructed by the above method, the library fragments are subjected to high-throughput sequencing, the sequencing data are split by sample tag to separate platelet nucleic acid data of the same origin, and each sample's sequencing data undergo quality control, alignment to a reference genome, and bioinformatic calculation of gene expression levels, giving the gene expression level data of each sample.
  • The invention also provides a method for analyzing the gene expression level of platelets, characterized in that the obtained gene expression level data are analyzed in the following steps:
  • Building the learning sample library: the matlab module bioma.data.DataMatrix is used to generate an n*m1 data matrix Cancer_healthy;
  • Building the test sample library: the matlab module bioma.data.DataMatrix is used to generate an n*m2 data matrix Test_sample;
  • Selecting the differential gene matrix: the Bioinformatics Toolbox in matlab is called to analyze the differential genes between the two sample types in the data matrix Cancer_healthy, and the differential genes are selected to give an m1*k matrix and a k*1 matrix cancer_healthy_k1;
  • Data formatting: for the differential genes matched by the differential gene matrix cancer_healthy_k1, the Cancer_healthy and Test_sample matrices undergo normalization of the gene expression level data and PCA principal component analysis, and the resulting data are reduced in dimension by LDA (linear discriminant analysis) to an m1*w learning sample library matrix and an m2*w test sample library matrix;
  • Interpretation by Gaussian process classifier: the gp toolbox in matlab is called to build a mathematical model from the formatted m1*w learning sample library matrix and m2*w test sample library matrix, and samples are classified according to the predicted class probability X;
  • where n is the number of genes; m1 is the number of samples, made up of m1 healthy and lung cancer cases; m2 is the number of samples, made up of m2 healthy and lung cancer cases; k is the number of differential genes; and w is the dimension.
  • The sequence of SEQ ID NO: 1 in the present invention is TAGCAGTCGATTCAACGCAGACATC;
  • the sequence of SEQ ID NO: 2 is CTCTTATACACATCTGACGCTGCCGACGA;
  • the sequence of SEQ ID NO: 3 is given in Figure PCTCN2019096097-appb-000001;
  • the sequence of SEQ ID NO: 4 is TAGCAGTCGATTCAACGCAGACA;
  • the sequence of SEQ ID NO: 5 is GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG;
  • the sequence of SEQ ID NO: 6 is AATGATACGGCGACCACCGAGATCTACACNNNNNNNNTCGTCGGCAGCGTC;
  • the sequence of SEQ ID NO: 7 is CAAGCAGAAGACGGCATACGAGATNNNNNNNNGTCTCGTGGGCTCGG.
  • The nucleic acid capture probe carrying the molecular tags includes a 5'-end biotin modification (5'-Biotin), an amplification primer sequence P1, a sequencing adapter sequence P5, a sample tag sequence, a single-molecule tag sequence, and a polythymine Oligo(dT) sequence.
  • The amplification primer sequence P1 is shown in SEQ ID NO: 1, the sequencing adapter sequence P5 is shown in SEQ ID NO: 2, the sample tag sequence consists of 3 to 4 nucleotides (A, G, C, T), the single-molecule tag sequence consists of 10 nucleotides, and the polythymine Oligo(dT) sequence consists of 20 T bases.
  • The nucleic acid capture probe specifically binds PolyA-tailed RNA released from platelets and, during the subsequent reverse transcription, introduces a sample tag and a single-molecule tag at the 5' end of the first-strand cDNA. These are used, respectively, to identify platelets from different sources and to identify different RNA molecules in platelets from the same source.
  • Compared with existing methods, the present invention is based on platelet RNA sequencing and comprehensively analyzes platelet gene expression levels, obtaining far more information than existing methods.
  • The present invention analyzes the subject's platelet RNA sequencing data to determine whether the donor has cancer, with an accuracy of 96.67%, a sensitivity of 93.33%, and a specificity of 100%.
  • The present invention does not need to extract platelet RNA: platelets can be lysed directly and their PolyA-tailed RNA captured specifically, which avoids the RNA degradation and loss that can occur during RNA extraction.
  • The invention also greatly reduces the starting amount of platelets: platelets can be isolated from a small amount of whole blood and used directly for micro-amplification and library construction, which suits the needs of liquid biopsy and has important clinical significance and application value.
  • The present invention introduces a sample tag, which labels the platelet nucleic acids of the same subject during platelet RNA capture and reverse transcription, so that in subsequent experiments samples from different subjects can be mixed into the same reaction system, reducing experimental workload and increasing sample throughput.
  • The present invention introduces a single-molecule tag, which labels the platelet nucleic acids of the same subject one by one during platelet RNA capture and reverse transcription, so that the label of each nucleic acid molecule is unique.
  • In subsequent data analysis, duplicate sequences are removed on the basis of tag uniqueness, correcting the erroneous information introduced by PCR amplification bias.
  • The invention provides a detection kit for tumor diagnosis that combines platelet RNA sequencing (TCPseq) with a machine learning algorithm and can distinguish the origin of different tumors with a single test.
  • The invention can be used not only to distinguish cancer patients from healthy people, for early tumor detection and risk assessment, but also to distinguish different primary tumor types.
  • The nucleic acid capture probe carrying the molecular tags contains the following elements from the 5' end to the 3' end:
  • a 5'-end biotin modification: streptavidin has a very high affinity for biotin, so superparamagnetic beads with streptavidin covalently bound to their surface can bind the biotin at the 5' end of the probe and thereby capture the probe;
  • the amplification primer sequence P1, shown in SEQ ID NO: 1, used for amplification of the full-length cDNA; the specific sequence is: TAGCAGTCGATTCAACGCAGACATC;
  • the sequencing adapter sequence P5, shown in SEQ ID NO: 2, used for the 5' end in platelet nucleic acid library construction; the specific sequence is: CTCTTATACACATCTGACGCTGCCGACGA;
  • the sample tag sequence, composed of 3 random nucleotides (A, G, C, T), giving 64 different combinations, so that platelets from up to 64 different subjects can be labeled at once and mixed into the same reaction system for micro-amplification and library construction;
  • the single-molecule tag sequence, composed of 10 random nucleotides (A, G, C, T), giving 1,048,576 different combinations, used during platelet RNA capture and reverse transcription to label the platelet nucleic acids of the same subject one by one, so that the label of each nucleic acid molecule is unique;
  • the 3'-end polythymine Oligo(dT) sequence, consisting of 20 T bases, which specifically binds PolyA-tailed RNA released from platelets, so that the magnetic beads bind the probe and the probe binds the RNA.
  • The nucleic acid capture probe carrying the molecular tags was synthesized by Xiamen Nuoketai Biotechnology Co., Ltd.; the specific sequence is shown in SEQ ID NO: 3.
  • In that sequence, the single solid underline marks the amplification primer sequence P1, the double solid underline marks the sequencing adapter sequence, the single wavy underline marks the sample tag sequence, the single dotted underline marks the single-molecule tag sequence, and the final 20 unmarked T bases are the 3'-end polythymine Oligo(dT) sequence.
  • Use a BD dipotassium EDTA blood collection tube to collect 2 mL of venous blood from the subject. After collection, gently invert the tube several times so that the anticoagulant and whole blood are thoroughly mixed. The whole blood should be processed within 96 hours of collection.
  • First centrifugation: place the blood collection tube in the centrifuge rotor and centrifuge at 800g for 5 minutes at room temperature. Use a pipette to draw off 600 μL of the upper platelet-rich plasma and transfer it to a new 1.5 mL centrifuge tube. Pipette as gently as possible to avoid disturbing the buffy coat in the middle, which would cause leukocytes to float up and increase the contamination rate.
  • Magnetic bead pretreatment: CD45 immunomagnetic beads (Invitrogen, 11153D) and CD235a immunomagnetic beads (Lifeint, A5005M) are vortexed before use to ensure thorough mixing.
  • Leukocyte removal: add 60 μL of the mixed CD45 and CD235a immunomagnetic beads to the platelet-rich plasma from the first centrifugation and mix by pipetting so that the beads bind the corresponding cells. Place the tube on a magnetic stand for 2 minutes to capture the beads and remove leukocytes and red blood cells from the platelet-rich plasma; the supernatant is the further purified platelet-rich plasma.
  • Second centrifugation: transfer the further purified platelet-rich plasma to a new 1.5 mL centrifuge tube, centrifuge at 2800g for 5 min at room temperature, discard the supernatant, collect the platelet pellet, and resuspend it in 10 μL of phosphate buffer (pH 7.2) to obtain a platelet suspension.
  • M-280 magnetic bead pretreatment: take 100 μL of magnetic beads (Invitrogen, 11205D), add an equal volume of Solution A (DEPC-treated 0.1M NaOH, DEPC-treated 0.05M NaCl), and wash by pipetting; capture the beads on a magnetic stand, discard the supernatant, and repeat the wash once. Then wash the beads once with an equal volume of Solution B (DEPC-treated 0.1M NaCl).
  • M-280 magnetic bead probe binding: to the pretreated M-280 magnetic beads, add 30 different 10 μM nucleic acid capture probes, each carrying a different sample tag, at 1 μL of probe per 4 μL of beads, and incubate at room temperature for 5 min. The probes are then bound to the M-280 magnetic beads.
  • RNA capture: mix the probe-bound magnetic beads with platelet lysates from 30 different subjects, one sample tag per subject, incubate at room temperature for 5 minutes, hold the beads on a magnetic stand for 2 minutes, and remove 10 μL of supernatant. The RNA is now bound to the magnetic beads, and subsequent steps should be performed immediately.
  • The cDNA amplification products were purified with 50 μL of VAHTS™ DNA Clean Beads (Vazyme, N411): the beads were washed with freshly prepared 80% ethanol and eluted with Elution Buffer. The resulting product is full-length cDNA carrying sample tags and single-molecule tags.
  • An Illumina HiSeq X series sequencer was used for high-throughput sequencing with a PE150 strategy.
  • The 30 sample tags described in step 3 of Example 2 were used to split the raw sequencer output. Trimmomatic was used for quality control, STAR was used for alignment and annotation against the reference genome version GRCh37.75, and featureCounts was used to count gene expression; shell tools such as awk, grep, and sort were used to format the data.
  • The final data format is 57735 genes and their corresponding expression levels.
  • The matlab module bioma.data.DataMatrix was used to generate a 57735 * 864 data matrix Cancer_healthy (57735 is the number of genes and 864 is the number of samples, consisting of 440 healthy and 424 lung cancer cases);
  • the matlab module bioma.data.DataMatrix was used to generate a 57735 * 30 data matrix Test_sample (57735 is the number of genes and 30 is the number of samples, consisting of 15 healthy and 15 lung cancer cases);
  • for the differential genes matched by the differential gene matrix cancer_healthy_m1, the Cancer_healthy and Test_sample matrices undergo normalization of the gene expression level data and PCA principal component analysis, and the resulting data are reduced in dimension by LDA (linear discriminant analysis) to an 864 * 500 learning sample library matrix and a 30 * 500 test sample library matrix (864 is the number of learning samples, 30 is the number of test samples, and 500 is the dimension).
  • Table 1: X values of 15 healthy people and 15 lung cancer patients
  • An X value greater than 0.5 is interpreted as a healthy person. Classification by the probability X agrees with the actual status, with an accuracy of 96.67%.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Genetics & Genomics (AREA)
  • Biochemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biotechnology (AREA)
  • Molecular Biology (AREA)
  • Microbiology (AREA)
  • General Engineering & Computer Science (AREA)
  • Analytical Chemistry (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Immunology (AREA)
  • Biomedical Technology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Plant Pathology (AREA)
  • General Chemical & Material Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

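The invention discloses a method and a kit for constructing a platelet nucleic acid library for gene detection. Starting from the 5' end, the nucleic acid capture probe comprises in order a 5'-end biotin modification, an amplification primer sequence P1, a sequencing adapter sequence P5, a sample tag sequence, a single-molecule tag sequence, and a polythymine Oligo(dT) sequence. Also provided are a kit containing the nucleic acid capture probe and a method for constructing a platelet nucleic acid library using the probe. The invention greatly reduces the starting amount of platelets: platelets can be isolated from a small amount of whole blood and used directly for micro-amplification and library construction, which suits the needs of liquid biopsy. In addition, samples from different subjects can be mixed into the same reaction system, which increases detection throughput.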

Description

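Method and kit for constructing a platelet nucleic acid library for gene detection
Technical Field
The invention relates to the field of sequencing, and in particular to a method and a kit for constructing a platelet nucleic acid library for gene detection.
Background
Early diagnosis of cancer allows earlier treatment, which is critical to patient prognosis and survival and is the best way to improve cancer survival rates. Take lung cancer as an example: it is the tumor with the highest incidence and mortality in China and worldwide. Late staging at the time of diagnosis is an important cause of poor prognosis, whereas early-stage lung cancer can achieve a better prognosis, or even a cure, through multidisciplinary comprehensive treatment. At present, the basic strategy for diagnosing and staging lung cancer relies on low-dose spiral CT screening, contrast-enhanced chest CT, contrast-enhanced upper abdominal CT (or B-mode ultrasound), contrast-enhanced head MR (or contrast-enhanced CT), and whole-body bone scans. If the CT scan shows suspicious malignant features, the doctor proceeds to a tissue biopsy and samples the tumor tissue for pathological diagnosis.
Because low-dose spiral CT involves some ionizing radiation, screening carries a small added risk of radiation-induced cancer, and guidelines recommend that high-risk populations undergo low-dose spiral CT screening once a year. The method also has a certain false-positive rate: it finds abnormalities that require further tests to confirm and that turn out not to be cancer, which adds to both the physical and the psychological burden on the subject. A lower-risk non-invasive screening and diagnostic method is therefore urgently needed. Non-invasive cancer screening currently relies mainly on tumor markers such as alpha-fetoprotein (AFP), carcinoembryonic antigen (CEA), and CA199. Their diagnostic sensitivity and specificity are low, so several tumor markers must be measured together, and they are generally used only to assist diagnosis.
In recent years, as clinical evidence has accumulated, using molecular diagnostic techniques to guide individualized precision treatment has gradually become a consensus. Liquid biopsy, a branch of in vitro diagnostics, mainly detects circulating tumor cells (CTCs), circulating tumor DNA (ctDNA), and exosomes free in the blood; its advantage is that non-invasive sampling greatly reduces the drawbacks of tissue biopsy. However, these analytes are present at low levels and are costly to isolate, which has limited the rapid development of detection methods. In a healthy body, platelets mainly promote coagulation and wound healing through their release and aggregation functions. In the microenvironment of major diseases such as acute or chronic inflammation or tumors, specific platelet pre-mRNAs can be spliced, which in turn alters the platelet gene expression profile. In addition, platelets are the second most abundant cell type in blood, are easy to obtain and simple to isolate, and can serve as a new analyte. RNA testing of conditioned platelets, such as tumor-conditioned platelets (Tumor Conditioned Platelets), to detect whether a subject has cancer has therefore become a new liquid biopsy method.
Chinese patent application 201610911677.X discloses a quantitative tumor platelet RNA detection model and method for early tumor screening. The model includes specific PCR primers for a combination of tumor platelet RNA biomarkers used in clinical diagnosis: CD79A, CD81, SYTL1, CENPC, TTN, RHOH, ZNF101, TRABD2A, and TRAC. The method includes preparing a sample, extracting RNA, reverse transcription, PCR detection, calculating a Y value with a formula, and evaluating the result. That application uses the combined RNA markers to diagnose tumors with a sensitivity of 92.5%, which is higher than that of the biomarkers commonly used in clinical practice. However, it uses quantitative PCR and detects only 9 RNA biomarkers at a time, so it can only distinguish cancer patients from healthy people and cannot further distinguish tumor types.
Chinese patent application 201710731914.9 discloses a quantitative detection method for platelet lncRNA for the diagnosis of non-small cell lung cancer (NSCLC). It confirms that expression of the platelet long non-coding RNAs MAGI2-AS3 and ZFAS1 is lower in NSCLC patients than in healthy people, and on this basis provides a real-time fluorescent quantitative PCR kit for NSCLC diagnosis. By combining the expression data obtained by real-time fluorescent quantitative PCR amplification of MAGI2-AS3 and ZFAS1, a logistic regression model for NSCLC diagnosis was established; the model is reported to have relatively high diagnostic efficacy and sensitivity for NSCLC. However, that application detects only the expression of the platelet long non-coding RNAs MAGI2-AS3 and ZFAS1. Its scope of application is limited to the diagnosis of NSCLC, which makes it difficult to meet clinical needs.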
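Summary of the Invention
An object of the present invention is to provide a method for constructing a platelet nucleic acid library.
To achieve the above object, the present invention provides a nucleic acid capture probe, characterized in that, starting from the 5' end, the nucleic acid capture probe comprises in order: a 5'-end biotin modification, an amplification primer sequence P1, a sequencing adapter sequence P5, a sample tag sequence, a single-molecule tag sequence, and a polythymine Oligo(dT) sequence;
Further, the amplification primer sequence P1 is shown in SEQ ID NO: 1, the sequencing adapter sequence P5 is shown in SEQ ID NO: 2, the sample tag sequence consists of 3 to 4 nucleotides, the single-molecule tag sequence consists of 10 nucleotides, and the polythymine Oligo(dT) sequence consists of 20 T bases.
The invention also provides a kit, characterized in that it contains the nucleic acid capture probe.
The invention also provides a method for constructing a platelet nucleic acid library, characterized in that the method comprises:
collecting whole blood;
isolating ultra-pure platelets;
micro-amplification of platelet RNA: using the nucleic acid capture probe of claim 1 or 2, or the nucleic acid capture probe in the kit of claim 3, to perform micro-amplification and obtain an amplification product of platelet full-length cDNA;
constructing the platelet nucleic acid library.
Further, collecting whole blood means collecting venous blood with a vacuum blood collection tube containing an anticoagulant, and gently inverting the tube several times after collection so that the anticoagulant and the whole blood are thoroughly mixed.
Further, the ultra-pure platelets are isolated by centrifugation so that the leukocyte contamination rate in the resulting ultra-pure platelets is below 0.0001%; preferably, a two-step centrifugation method is used; more preferably, dual immunomagnetic beads are used between the two centrifugation steps to remove leukocytes and red blood cells.
Further, in the micro-amplification of platelet RNA, the RNA of the ultra-pure platelets serves as the template and the nucleic acid capture probe of claim 1 or 2, or the nucleic acid capture probe in the kit of claim 3, serves as the primer; reverse transcriptase synthesizes a first-strand cDNA complementary to the platelet RNA, and the template-switching activity of the reverse transcriptase adds the amplification primer sequence P1, shown in SEQ ID NO: 1, to the 3' end of the first-strand cDNA; the synthesized first-strand cDNA complementary to the platelet RNA is then used as the template, with the amplification primer sequence P2 shown in SEQ ID NO: 4 as the primer, for multiple rounds of amplification followed by purification, yielding the amplification product of platelet full-length cDNA; preferably, first-strand cDNA from multiple different samples can be mixed and amplified in the same reaction system to obtain amplification products of platelet full-length cDNA from different sources.
Further, to construct the platelet nucleic acid library, a transposase and a sequencing adapter are used to fragment the platelet full-length cDNA amplification product and add adapters, and sequencing primers are used to PCR-amplify the fragmented product and enrich the 5' end of the cDNA; AmPure XP Beads are used to size-select and purify the amplification product, yielding a platelet nucleic acid library carrying molecular tags at the 5' end; preferably, the sequence of the sequencing adapter is shown in SEQ ID NO: 4, and the sequences of the sequencing primers are shown in SEQ ID NO: 5 and SEQ ID NO: 6.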
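The present invention also provides a method for obtaining gene expression level data, characterized in that, after a platelet nucleic acid library is constructed by the above method, the library fragments are subjected to high-throughput sequencing, the sequencing data are split by sample tag to separate platelet nucleic acid data of the same origin, and each sample's sequencing data undergo quality control, alignment to a reference genome, and bioinformatic calculation of gene expression levels, giving the gene expression level data of each sample.
The invention also provides a method for analyzing the gene expression level of platelets, characterized in that the obtained gene expression level data are analyzed in the following steps:
Building the learning sample library: the matlab module bioma.data.DataMatrix is used to generate an n*m1 data matrix Cancer_healthy;
Building the test sample library: the matlab module bioma.data.DataMatrix is used to generate an n*m2 data matrix Test_sample;
Selecting the differential gene matrix: the Bioinformatics Toolbox in matlab is called to analyze the differential genes between the two sample types in the data matrix Cancer_healthy, and the differential genes are selected to give an m1*k matrix and a k*1 matrix cancer_healthy_k1;
Data formatting: for the differential genes matched by the differential gene matrix cancer_healthy_k1, the Cancer_healthy and Test_sample matrices undergo normalization of the gene expression level data and PCA principal component analysis, and the resulting data are reduced in dimension by LDA (linear discriminant analysis) to an m1*w learning sample library matrix and an m2*w test sample library matrix;
Interpretation by Gaussian process classifier: the gp toolbox in matlab is called to build a mathematical model from the formatted m1*w learning sample library matrix and m2*w test sample library matrix, and samples are classified according to the predicted class probability X;
where n is the number of genes; m1 is the number of samples, made up of m1 healthy and lung cancer cases; m2 is the number of samples, made up of m2 healthy and lung cancer cases; k is the number of differential genes; and w is the dimension.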
The sequence of SEQ ID NO: 1 in the present invention is TAGCAGTCGATTCAACGCAGACATC;
the sequence of SEQ ID NO: 2 is CTCTTATACACATCTGACGCTGCCGACGA;
the sequence of SEQ ID NO: 3 is:
Figure PCTCN2019096097-appb-000001
the sequence of SEQ ID NO: 4 is TAGCAGTCGATTCAACGCAGACA;
the sequence of SEQ ID NO: 5 is GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG;
the sequence of SEQ ID NO: 6 is AATGATACGGCGACCACCGAGATCTACACNNNNNNNNTCGTCGGCAGCGTC;
the sequence of SEQ ID NO: 7 is CAAGCAGAAGACGGCATACGAGATNNNNNNNNGTCTCGTGGGCTCGG.
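The nucleic acid capture probe carrying the molecular tags includes a 5'-end biotin modification (5'-Biotin), an amplification primer sequence P1, a sequencing adapter sequence P5, a sample tag sequence, a single-molecule tag sequence, and a polythymine Oligo(dT) sequence. The amplification primer sequence P1 is shown in SEQ ID NO: 1, the sequencing adapter sequence P5 is shown in SEQ ID NO: 2, the sample tag sequence consists of 3 to 4 nucleotides (A, G, C, T), the single-molecule tag sequence consists of 10 nucleotides, and the polythymine Oligo(dT) sequence consists of 20 T bases. The probe specifically binds PolyA-tailed RNA released from platelets and, during the subsequent reverse transcription, introduces a sample tag and a single-molecule tag at the 5' end of the first-strand cDNA. These are used, respectively, to identify platelets from different sources and to identify different RNA molecules in platelets from the same source.
Compared with existing methods, the present invention is based on platelet RNA sequencing and comprehensively analyzes platelet gene expression levels, obtaining far more information than existing methods. The invention analyzes the subject's platelet RNA sequencing data to determine whether the donor has cancer, with an accuracy of 96.67%, a sensitivity of 93.33%, and a specificity of 100%.
Compared with existing methods, the present invention does not need to extract platelet RNA: platelets can be lysed directly and their PolyA-tailed RNA captured specifically, which avoids the RNA degradation and loss that can occur during RNA extraction. The invention also greatly reduces the starting amount of platelets: platelets can be isolated from a small amount of whole blood and used directly for micro-amplification and library construction, which suits the needs of liquid biopsy and has important clinical significance and application value.
Compared with existing methods, the present invention introduces a sample tag, which labels the platelet nucleic acids of the same subject during platelet RNA capture and reverse transcription, so that in subsequent experiments samples from different subjects can be mixed into the same reaction system, reducing experimental workload and increasing sample throughput.
Compared with existing methods, the present invention introduces a single-molecule tag, which labels the platelet nucleic acids of the same subject one by one during platelet RNA capture and reverse transcription, so that the label of each nucleic acid molecule is unique. In subsequent data analysis, duplicate sequences are removed on the basis of tag uniqueness, correcting the erroneous information introduced by PCR amplification bias.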
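The duplicate-removal idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's pipeline: it assumes each read of one sample has already been assigned to a gene and carries its 10-nt single-molecule tag, and it counts distinct tags per gene so that PCR copies of the same original molecule are counted once. The function name `count_unique_molecules` and the gene labels are hypothetical.

```python
from collections import defaultdict


def count_unique_molecules(records):
    """Count distinct single-molecule tags per gene for one sample.

    records: iterable of (gene, tag) pairs, one per sequenced read.
    Reads sharing both gene and tag are treated as PCR duplicates
    of one original RNA molecule and are counted once.
    """
    tags_per_gene = defaultdict(set)
    for gene, tag in records:
        tags_per_gene[gene].add(tag)
    return {gene: len(tags) for gene, tags in tags_per_gene.items()}


# Hypothetical reads: GENE_A was sequenced three times, but two reads
# carry the same tag and therefore come from the same molecule.
reads = [
    ("GENE_A", "AAAAAAAAAA"),
    ("GENE_A", "AAAAAAAAAA"),
    ("GENE_A", "CCCCCCCCCC"),
    ("GENE_B", "GGGGGGGGGG"),
]
counts = count_unique_molecules(reads)
print(counts)
```

Exact matching of tags is the simplest possible rule; handling sequencing errors inside the tag would need an additional step that the source does not describe.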
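The invention provides a detection kit for tumor diagnosis that combines platelet RNA sequencing (TCPseq) with a machine learning algorithm and can distinguish the origin of different tumors with a single test. The invention can be used not only to distinguish cancer patients from healthy people, for early tumor detection and risk assessment, but also to distinguish different primary tumor types, and it has broad application prospects in diagnostic typing, companion diagnostics for drugs, and monitoring of patient condition.
Detailed Description
Embodiments of the present invention are described in detail below. The examples are intended to explain the invention and should not be construed as limiting it. In this description, "first", "second", "third", and so on are used for convenience of reference or description and do not imply an order or relative importance; unless otherwise stated, "multiple", "multiple groups", and "multiplex" mean two or more (groups or plexes). Where specific techniques or conditions are not given in the examples, the techniques or conditions described in the literature of the field, or the product instructions, are followed. Reagents or instruments whose manufacturer is not stated are conventional products available commercially.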
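Example 1: Preparation of the nucleic acid capture probe carrying molecular tags
The nucleic acid capture probe carrying the molecular tags contains the following elements from the 5' end to the 3' end:
a 5'-end biotin modification: streptavidin has a very high affinity for biotin, so superparamagnetic beads with streptavidin covalently bound to their surface can bind the biotin at the 5' end of the probe and thereby capture the probe;
the amplification primer sequence P1, shown in SEQ ID NO: 1, used for amplification of the full-length cDNA; the specific sequence is: TAGCAGTCGATTCAACGCAGACATC;
the sequencing adapter sequence P5, shown in SEQ ID NO: 2, used for the 5' end in platelet nucleic acid library construction; the specific sequence is: CTCTTATACACATCTGACGCTGCCGACGA;
the sample tag sequence, composed of 3 random nucleotides (A, G, C, T), giving 64 different combinations, so that platelets from up to 64 different subjects can be labeled at once and mixed into the same reaction system for micro-amplification and library construction;
the single-molecule tag sequence, composed of 10 random nucleotides (A, G, C, T), giving 1,048,576 different combinations, used during platelet RNA capture and reverse transcription to label the platelet nucleic acids of the same subject one by one, so that the label of each nucleic acid molecule is unique;
the 3'-end polythymine Oligo(dT) sequence, consisting of 20 T bases, which specifically binds PolyA-tailed RNA released from platelets, so that the magnetic beads bind the probe and the probe binds the RNA.
The nucleic acid capture probe carrying the molecular tags was synthesized by 厦门纽克泰生物科技有限公司; the specific sequence is shown in SEQ ID NO: 3: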
Figure PCTCN2019096097-appb-000002
Figure PCTCN2019096097-appb-000003
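In that sequence, the single solid underline marks the amplification primer sequence P1, the double solid underline marks the sequencing adapter sequence, the single wavy underline marks the sample tag sequence, the single dotted underline marks the single-molecule tag sequence, and the final 20 unmarked T bases are the 3'-end polythymine Oligo(dT) sequence.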
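The element order and the tag arithmetic in Example 1 can be checked with a short sketch. It is illustrative only: it concatenates the elements named in the text (P1 from SEQ ID NO: 1, P5 from SEQ ID NO: 2, a 3-nt sample tag, a 10-nt single-molecule tag, and 20 T bases) and confirms the stated 64 and 1,048,576 combinations. The concatenation is an assumption about layout, not a reproduction of SEQ ID NO: 3, which appears only as a figure; the 5' biotin modification is not represented, and the helper names are hypothetical.

```python
from itertools import product

P1 = "TAGCAGTCGATTCAACGCAGACATC"      # SEQ ID NO: 1, amplification primer P1
P5 = "CTCTTATACACATCTGACGCTGCCGACGA"  # SEQ ID NO: 2, sequencing adapter P5
SAMPLE_TAG_LEN = 3                    # sample tag length used in Example 1
UMI_LEN = 10                          # single-molecule tag length
OLIGO_DT = "T" * 20                   # 3'-end Oligo(dT)


def all_sample_tags(length=SAMPLE_TAG_LEN):
    """Enumerate every possible sample tag of the given length."""
    return ["".join(bases) for bases in product("ACGT", repeat=length)]


def build_probe(sample_tag, umi):
    """Concatenate the probe elements 5'->3' (biotin not represented)."""
    if len(sample_tag) != SAMPLE_TAG_LEN or len(umi) != UMI_LEN:
        raise ValueError("unexpected tag length")
    return P1 + P5 + sample_tag + umi + OLIGO_DT


tags = all_sample_tags()
print(len(tags))       # number of distinct 3-nt sample tags
print(4 ** UMI_LEN)    # number of distinct 10-nt single-molecule tags
print(build_probe("ACG", "ACGTACGTAC"))
```

With 3-nt tags, the 30 subjects processed in Example 2 use fewer than half of the available tags.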
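Example 2: Method for constructing the platelet nucleic acid library
1. Collection of whole blood
Use a BD dipotassium EDTA blood collection tube to collect 2 mL of venous blood from the subject. After collection, gently invert the tube several times so that the anticoagulant and whole blood are thoroughly mixed. The whole blood should be processed within 96 h of collection.
2. Isolation of ultra-pure platelets
First centrifugation: place the blood collection tube in the centrifuge rotor and centrifuge at 800g for 5 min at room temperature. Use a pipette to draw off 600 μL of the upper platelet-rich plasma and transfer it to a new 1.5 mL centrifuge tube. Pipette as gently as possible to avoid disturbing the buffy coat in the middle, which would cause leukocytes to float up and increase the contamination rate.
Magnetic bead pretreatment: vortex the CD45 immunomagnetic beads (Invitrogen, 11153D) and CD235a immunomagnetic beads (Lifeint, A5005M) before use to ensure thorough mixing. Transfer 60 μL of each to the same new 1.5 mL centrifuge tube and add 1 mL of phosphate buffer A (0.1% BSA, 2 mM EDTA, pH 7.4) to wash. Place the tube on a DynaMag™-2 magnetic stand for 1 min to capture the beads, then remove the tube and add 60 μL of phosphate buffer A to resuspend the beads.
Leukocyte removal: add 60 μL of the mixed CD45 and CD235a immunomagnetic beads to the platelet-rich plasma from the first centrifugation and mix by pipetting so that the beads bind the corresponding cells. Place the tube on a magnetic stand for 2 min to capture the beads and remove leukocytes and red blood cells from the platelet-rich plasma; the supernatant is the further purified platelet-rich plasma.
Second centrifugation: transfer the further purified platelet-rich plasma to a new 1.5 mL centrifuge tube, centrifuge at 2800g for 5 min at room temperature, discard the supernatant, collect the platelet pellet, and resuspend it in 10 μL of phosphate buffer (pH 7.2) to obtain a platelet suspension.
3. Micro-amplification of platelet RNA
(1) Platelet lysis
Prepare 10 μL of platelet lysis buffer (1.6% Triton X-100, 6 U/μL RNase inhibitor). Take platelets from 30 different subjects, 5 μL each, add 1 μL of lysis buffer to each, mix by pipetting, spin down briefly, and incubate at room temperature for 5 min.
(2) Platelet RNA capture and labeling
M-280 magnetic bead pretreatment: take 100 μL of magnetic beads (Invitrogen, 11205D), add an equal volume of Solution A (DEPC-treated 0.1M NaOH, DEPC-treated 0.05M NaCl), and wash by pipetting; capture the beads on a magnetic stand, discard the supernatant, and repeat the wash once. Wash the beads once with an equal volume of Solution B (DEPC-treated 0.1M NaCl), resuspend them in 40 μL NF-water, and aliquot into 0.2 ml RNase-free PCR tubes, 4 μL per tube.
M-280 magnetic bead probe binding: to the pretreated M-280 magnetic beads, add 30 different 10 μM nucleic acid capture probes, each carrying a different sample tag, at 1 μL of probe per 4 μL of beads, and incubate at room temperature for 5 min. The probes are then bound to the M-280 magnetic beads.
RNA capture: mix the probe-bound magnetic beads with platelet lysates from 30 different subjects, one sample tag per subject, incubate at room temperature for 5 min, hold the beads on a magnetic stand for 2 min, and remove 10 μL of supernatant. The RNA is now bound to the magnetic beads, and subsequent steps should be performed immediately.
(3) First-strand cDNA synthesis
Prepare 300 μL of reverse transcription mix (1× First-Strand Buffer, 1 M Betaine, 1 mM dNTPs, 9 mM MgCl2, 2.5 mM DTT, 1 μM amplification primer P1 as shown in SEQ ID NO: 1, 1 U/μL RNase inhibitor, 10 U/μL SSII) and add 10 μL of the mix to each bead aliquot. Run the following program: 42 °C for 90 min, then hold at 4 °C. Pool the 30 reverse transcription products, capture the beads on a magnetic stand, discard the supernatant, and resuspend the beads in 24.5 μL NF-water to obtain the first-strand cDNA.
(4) Full-length cDNA amplification
Prepare 25.5 μL of amplification mix (1× KAPA HiFi HotStart ReadyMix, 1 μM amplification primer P2 as shown in SEQ ID NO: 4), add it to the first-strand cDNA solution, and run the following program: 98 °C for 3 min; 15 cycles of (98 °C for 15 s, 65 °C for 20 s, 72 °C for 6 min); 72 °C for 5 min; hold at 4 °C.
Purify the cDNA amplification product with 50 μL of VAHTS™ DNA Clean Beads (Vazyme, N411): wash the beads with freshly prepared 80% ethanol and elute with Elution Buffer. The resulting product is full-length cDNA carrying sample tags and single-molecule tags.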
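As a quick arithmetic check on the amplification step above, the sketch below adds up the reaction volume and the programmed hold times of the thermocycler program. It is only a bookkeeping aid under stated assumptions: ramp times between temperatures and the final 4 °C hold are ignored, and the function name `program_seconds` is hypothetical.

```python
def program_seconds(initial_s, cycles, cycle_steps_s, final_s):
    """Sum the programmed hold times of a simple PCR program, in seconds.

    Ramp times and the final 4 degC hold are not included.
    """
    return initial_s + cycles * sum(cycle_steps_s) + final_s


# Values taken from the full-length cDNA amplification step.
reaction_volume_ul = 24.5 + 25.5   # first-strand cDNA + amplification mix
total_s = program_seconds(
    initial_s=3 * 60,              # 98 degC for 3 min
    cycles=15,
    cycle_steps_s=[15, 20, 6 * 60],  # 98 degC 15 s, 65 degC 20 s, 72 degC 6 min
    final_s=5 * 60,                # 72 degC for 5 min
)

print(reaction_volume_ul)          # total reaction volume in microlitres
print(total_s / 60)                # programmed hold time in minutes
```

The 6-minute extension dominates the run time, which is consistent with the goal of amplifying full-length cDNA rather than short amplicons.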
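4. Construction of the platelet nucleic acid library
Based on the quantification of the above cDNA amplification product, use the TCPseq platelet library construction kit (Lifeint): take 5 ng of the platelet cDNA amplification product for fragmentation, amplify for 10 cycles, and use VAHTS™ DNA Clean Beads to size-select the amplification product, obtaining a platelet nucleic acid library of about 450 bp.
Example 3: Sequencing of the platelet nucleic acid library and acquisition of gene expression level data
An Illumina HiSeq X series sequencer was used for high-throughput sequencing with a PE150 strategy. The 30 sample tags described in step 3 of Example 2 were used to split the raw sequencer output. Trimmomatic was used for quality control, STAR was used for alignment and annotation against the reference genome version GRCh37.75, and featureCounts was used to count gene expression; shell tools such as awk, grep, and sort were used to format the data. The final data format is 57735 genes and their corresponding expression levels.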
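Splitting the raw data by sample tag, as described above, might look like the following sketch. It rests on a loud assumption that the source does not state: that the part of the read passed in starts with the 3-nt sample tag immediately followed by the 10-nt single-molecule tag, mirroring the order of those elements in the capture probe. The function names and the subject labels are hypothetical, and real data would be read from FASTQ files rather than a list of strings.

```python
SAMPLE_TAG_LEN = 3   # sample tag length from Example 1
UMI_LEN = 10         # single-molecule tag length from Example 1


def split_read(read_seq):
    """Split a read into (sample tag, single-molecule tag, remainder).

    Assumes the read starts with the sample tag followed by the
    single-molecule tag; this layout is an illustrative assumption.
    """
    prefix_len = SAMPLE_TAG_LEN + UMI_LEN
    if len(read_seq) < prefix_len:
        raise ValueError("read shorter than the tag prefix")
    sample_tag = read_seq[:SAMPLE_TAG_LEN]
    umi = read_seq[SAMPLE_TAG_LEN:prefix_len]
    return sample_tag, umi, read_seq[prefix_len:]


def demultiplex(reads, tag_to_sample):
    """Group reads by subject; reads with unknown tags are set aside."""
    by_sample = {sample: [] for sample in tag_to_sample.values()}
    unassigned = []
    for read_seq in reads:
        sample_tag, umi, rest = split_read(read_seq)
        sample = tag_to_sample.get(sample_tag)
        if sample is None:
            unassigned.append(read_seq)
        else:
            by_sample[sample].append((umi, rest))
    return by_sample, unassigned


# Hypothetical tag assignments and reads.
tag_to_sample = {"AAA": "subject_01", "CCC": "subject_02"}
reads = [
    "AAA" + "ACGTACGTAC" + "TTTTGGGG",
    "CCC" + "GGGGGGGGGG" + "TTTTCCCC",
    "GGG" + "ACACACACAC" + "TTTTAAAA",   # tag not in use
]
by_sample, unassigned = demultiplex(reads, tag_to_sample)
print(by_sample)
print(unassigned)
```

Keeping unassigned reads separate rather than discarding them silently makes it easier to spot tag errors or cross-contamination.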
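Example 4 Analysis of platelet gene expression levels
Using the platelet RNA sequencing method above in combination with a machine learning algorithm, and taking the two classes lung cancer and healthy as an example, 30 test samples were analyzed by the following steps: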
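1. Establishing the learning sample library
Use the MATLAB module bioma.data.DataMatrix to generate a 57735 × 864 data matrix, Cancer_healthy (57735 is the number of genes; 864 is the number of samples, comprising 440 healthy and 424 lung cancer cases);
2. Establishing the test sample library
Use the MATLAB module bioma.data.DataMatrix to generate a 57735 × 30 data matrix, Test_sample (57735 is the number of genes; 30 is the number of samples, comprising 15 healthy and 15 lung cancer cases);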
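The layout of these matrices (genes as rows, samples as columns) can be illustrated with a small sketch. The text builds them with MATLAB's bioma.data.DataMatrix; the Python version below is only an analogy using plain lists, and the gene and sample names are invented.

```python
def build_matrix(sample_counts):
    """Build a genes x samples matrix from {sample_id: {gene_id: count}}.

    Returns (genes, samples, matrix); a gene missing from a sample is
    filled with 0.
    """
    samples = sorted(sample_counts)
    genes = sorted({g for counts in sample_counts.values() for g in counts})
    matrix = [[sample_counts[s].get(g, 0) for s in samples] for g in genes]
    return genes, samples, matrix


# Hypothetical per-sample gene counts, for illustration only.
counts = {
    "S2": {"geneA": 7, "geneB": 3},
    "S1": {"geneA": 5, "geneC": 1},
}
genes, samples, matrix = build_matrix(counts)
print(genes)    # row labels
print(samples)  # column labels
print(matrix)
```

At the scale in the text, the learning matrix would have 57735 rows and 864 columns, and the test matrix 57735 rows and 30 columns.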
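3. Selection of the differential gene matrix
Call the Bioinformatics Toolbox in MATLAB to identify the genes differentially expressed between the two sample types in the data matrix Cancer_healthy. Selecting these genes gives an 864 × 4721 matrix (864 learning samples, 4721 differential genes) and a 4721 × 1 matrix, cancer_healthy_m1 (4721 differential genes).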
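The text does not say which statistical test or cutoff the Bioinformatics Toolbox call applies. As one possible illustration, the sketch below ranks genes by Welch's t statistic between the two groups and keeps those above an arbitrary threshold; the test, the threshold and the example data are all assumptions.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two samples (unequal variances allowed)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return 0.0 if se == 0 else (mean(a) - mean(b)) / se


def select_genes(matrix, genes, labels, threshold):
    """Return genes whose |t| between 'healthy' and 'cancer' columns
    reaches the threshold. Matrix rows are genes, columns are samples."""
    selected = []
    for gene, row in zip(genes, matrix):
        healthy = [v for v, lab in zip(row, labels) if lab == "healthy"]
        cancer = [v for v, lab in zip(row, labels) if lab == "cancer"]
        if abs(welch_t(healthy, cancer)) >= threshold:
            selected.append(gene)
    return selected


# Hypothetical expression values, for illustration only.
labels = ["healthy", "healthy", "healthy", "cancer", "cancer", "cancer"]
genes = ["g1", "g2"]
matrix = [
    [10, 12, 11, 30, 31, 29],  # clearly different between groups
    [5, 6, 5, 5, 6, 6],        # essentially unchanged
]
differential = select_genes(matrix, genes, labels, threshold=3.0)
print(differential)
```

In practice a count-aware method and multiple-testing correction would normally replace the bare threshold.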
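4. Data formatting
For the differential genes matched by the differential gene matrix cancer_healthy_m1, standardize the gene expression level data of the Cancer_healthy and Test_sample matrices and apply principal component analysis (PCA). Then apply linear discriminant analysis (LDA) to the resulting data to reduce it to an 864 × 500 reduced learning-sample matrix and a 30 × 500 reduced test-sample matrix (864 is the number of learning samples, 30 is the number of test samples, and 500 is the number of dimensions).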
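The standardization step can be sketched as a per-gene z-score. The text does not specify how the standardization parameters are obtained; the sketch assumes they are estimated on the learning samples only and then applied unchanged to the test samples, a common way to keep the two matrices on the same scale. The PCA and LDA steps are not shown.

```python
from statistics import mean, pstdev


def fit_standardizer(train_rows):
    """Per-gene (mean, std) from a samples x genes matrix.

    A zero standard deviation is replaced by 1.0 to avoid division by zero.
    """
    columns = list(zip(*train_rows))
    return [(mean(col), pstdev(col) or 1.0) for col in columns]


def standardize(rows, params):
    """Apply previously fitted per-gene (mean, std) to a samples x genes matrix."""
    return [[(v - m) / s for v, (m, s) in zip(row, params)] for row in rows]


# Hypothetical values: two learning samples and one test sample, two genes.
train = [[1.0, 10.0], [3.0, 10.0]]
test = [[5.0, 12.0]]

params = fit_standardizer(train)
train_z = standardize(train, params)
test_z = standardize(test, params)
print(train_z)
print(test_z)
```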
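5. Classification with a Gaussian process classifier
Call the gp (Gaussian process regression) toolbox in MATLAB to build a mathematical model from the formatted reduced learning-sample matrix and reduced test-sample matrix, and classify each sample according to the predicted class probability X.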
Table 1 X values for 15 healthy subjects and 15 lung cancer patients
Sample ID    Group          Probability (X)
XJP3-918     Healthy        0.7365
LHP1-919     Healthy        0.6875
CXX3-911     Healthy        0.7120
CYM3-903     Healthy        0.6354
FLP3-915     Healthy        0.5986
LMQ-910      Healthy        0.6852
XMH-912      Healthy        0.6179
LYB-905      Healthy        0.6628
QZ-911       Healthy        0.6741
DKM2-909     Healthy        0.7584
LWP1-915     Healthy        0.5358
ZSS3-916     Healthy        0.6703
ZZY3-917     Healthy        0.6852
ZGL1-920     Healthy        0.6900
YGD2-921     Healthy        0.7387
LZM-924      Lung cancer    0.3948
LYQ-902      Lung cancer    0.4489
LTH-903      Lung cancer    0.4625
LXD-906      Lung cancer    0.4820
HGR-922      Lung cancer    0.4536
CZG-920      Lung cancer    0.3725
JJD-904      Lung cancer    0.4437
JYS-923      Lung cancer    0.4832
WWS1-912     Lung cancer    0.4986
LXL-902      Lung cancer    0.5306
WQS-901      Lung cancer    0.4474
LZQ-901      Lung cancer    0.4821
ZYG-914      Lung cancer    0.4896
ZCY-904      Lung cancer    0.3830
XQY-907      Lung cancer    0.4801
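A sample with X greater than 0.5 is called healthy. Calling samples by their X value agreed with the actual grouping for 29 of the 30 samples (only lung cancer sample LXL-902, X = 0.5306, was called healthy), giving an accuracy of 96.67%.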
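The decision rule and the reported accuracy can be reproduced directly from the X values in Table 1. The sketch below applies the X > 0.5 threshold to the 30 tabulated values; only the variable and function names are illustrative.

```python
THRESHOLD = 0.5  # X greater than 0.5 is called healthy

# (sample ID, actual group, probability X), copied from Table 1.
table1 = [
    ("XJP3-918", "healthy", 0.7365), ("LHP1-919", "healthy", 0.6875),
    ("CXX3-911", "healthy", 0.7120), ("CYM3-903", "healthy", 0.6354),
    ("FLP3-915", "healthy", 0.5986), ("LMQ-910", "healthy", 0.6852),
    ("XMH-912", "healthy", 0.6179), ("LYB-905", "healthy", 0.6628),
    ("QZ-911", "healthy", 0.6741), ("DKM2-909", "healthy", 0.7584),
    ("LWP1-915", "healthy", 0.5358), ("ZSS3-916", "healthy", 0.6703),
    ("ZZY3-917", "healthy", 0.6852), ("ZGL1-920", "healthy", 0.6900),
    ("YGD2-921", "healthy", 0.7387),
    ("LZM-924", "lung cancer", 0.3948), ("LYQ-902", "lung cancer", 0.4489),
    ("LTH-903", "lung cancer", 0.4625), ("LXD-906", "lung cancer", 0.4820),
    ("HGR-922", "lung cancer", 0.4536), ("CZG-920", "lung cancer", 0.3725),
    ("JJD-904", "lung cancer", 0.4437), ("JYS-923", "lung cancer", 0.4832),
    ("WWS1-912", "lung cancer", 0.4986), ("LXL-902", "lung cancer", 0.5306),
    ("WQS-901", "lung cancer", 0.4474), ("LZQ-901", "lung cancer", 0.4821),
    ("ZYG-914", "lung cancer", 0.4896), ("ZCY-904", "lung cancer", 0.3830),
    ("XQY-907", "lung cancer", 0.4801),
]


def call(x):
    """Classify a sample from its predicted probability X."""
    return "healthy" if x > THRESHOLD else "lung cancer"


correct = sum(call(x) == group for _, group, x in table1)
accuracy = correct / len(table1)
misclassified = [sid for sid, group, x in table1 if call(x) != group]

print(f"{correct}/{len(table1)} correct, accuracy {accuracy:.2%}")
print("misclassified:", misclassified)
```

This shows that the single discordant call is a lung cancer sample whose X lies just above the threshold.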
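Although embodiments of the present invention have been shown and described above, it should be understood that these embodiments are exemplary and are not to be construed as limiting the invention; a person of ordinary skill in the art may vary, modify, replace and alter the embodiments within the scope of the invention without departing from its principles and spirit.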

Claims (10)

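  1. A nucleic acid capture probe, characterized in that, starting from the 5' end, the nucleic acid capture probe comprises, in order, a 5'-terminal biotin modification, an amplification primer sequence P1, a sequencing adapter sequence P5, a sample tag sequence, a single-molecule tag sequence, and a poly-thymine Oligo(dT) sequence.
  2. The nucleic acid capture probe of claim 1, characterized in that the amplification primer sequence P1 is as shown in SEQ ID NO: 1, the sequencing adapter sequence P5 is as shown in SEQ ID NO: 2, the sample tag sequence consists of 3 to 4 nucleotides, the single-molecule tag sequence consists of 10 nucleotides, and the poly-thymine Oligo(dT) sequence consists of 20 T bases.
  3. A kit, characterized in that it contains the nucleic acid capture probe of claim 1 or 2.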
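  4. A method for constructing a platelet nucleic acid library, characterized in that the method comprises:
    collecting whole blood;
    isolating ultra-pure platelets;
    micro-scale amplification of platelet RNA: performing micro-scale amplification with the nucleic acid capture probe of claim 1 or 2, or with the nucleic acid capture probe in the kit of claim 3, to obtain an amplification product of platelet full-length cDNA;
    constructing the platelet nucleic acid library.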
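  5. The method for constructing a platelet nucleic acid library of claim 4, characterized in that the collecting of whole blood comprises collecting venous blood in a vacuum blood collection tube containing an anticoagulant and, after collection, gently inverting the tube several times so that the anticoagulant is thoroughly mixed with the whole blood.
  6. The method for constructing a platelet nucleic acid library of claim 4, characterized in that the isolating of ultra-pure platelets uses centrifugation such that the leukocyte contamination rate of the resulting ultra-pure platelets is below 0.0001%; preferably, a two-step centrifugation method is used; more preferably, dual immunomagnetic beads are used between the two centrifugation steps to remove leukocytes and erythrocytes.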
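  7. The method for constructing a platelet nucleic acid library of claim 4, characterized in that the micro-scale amplification of platelet RNA comprises: using the RNA of the ultra-pure platelets as template and the nucleic acid capture probe of claim 1 or 2, or the nucleic acid capture probe in the kit of claim 3, as primer, synthesizing with a reverse transcriptase a first-strand cDNA complementary to the platelet RNA, and using the template-switching activity of the reverse transcriptase to add an amplification primer sequence P1, as shown in SEQ ID NO: 1, to the 3' end of the first-strand cDNA; then, using the synthesized first-strand cDNA complementary to the platelet RNA as template and the amplification primer sequence P2 shown in SEQ ID NO: 4 as primer, performing multiple rounds of amplification followed by purification to obtain the amplification product of platelet full-length cDNA; preferably, first-strand cDNAs from multiple different samples may be pooled and amplified in the same reaction system to obtain amplification products of platelet full-length cDNA from different sources.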
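  8. The method for constructing a platelet nucleic acid library of claim 4, characterized in that the constructing of the platelet nucleic acid library comprises: fragmenting and adapter-tagging the obtained amplification product of platelet full-length cDNA using a transposase and a sequencing adapter; PCR-amplifying the fragmented product with sequencing primers to enrich the 5' ends of the cDNA; and size-selecting and purifying the amplified product with AMPure XP Beads to obtain a platelet nucleic acid library carrying molecular tags at the 5' end; preferably, the sequence of the sequencing adapter is as shown in SEQ ID NO: 4, and the sequences of the sequencing primers are as shown in SEQ ID NO: 5 and SEQ ID NO: 6.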
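  9. A method for obtaining gene expression level data, characterized in that, after a platelet nucleic acid library is constructed by the method of any one of claims 4-8, the fragments of the platelet nucleic acid library are subjected to high-throughput sequencing; the sequencing data are demultiplexed by sample tag so that platelet nucleic acid data from the same source are grouped together; and the sequencing data of each sample undergo bioinformatic analysis comprising quality control, alignment to a reference genome, and calculation of gene expression levels, to obtain the gene expression level data of the sample.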
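  10. A method for analyzing platelet gene expression levels, characterized in that the gene expression level data of the samples obtained according to claim 9 are analyzed by the following steps:
    establishing the learning sample library: using the MATLAB module bioma.data.DataMatrix to generate an n × m1 data matrix, Cancer_healthy;
    establishing the test sample library: using the MATLAB module bioma.data.DataMatrix to generate an n × m2 data matrix, Test_sample;
    selecting the differential gene matrix: calling the Bioinformatics Toolbox in MATLAB to identify the genes differentially expressed between the two sample types in the data matrix Cancer_healthy, and selecting these genes to obtain an m1 × k matrix and a k × 1 matrix, cancer_healthy_k1;
    data formatting: for the differential genes matched by the differential gene matrix cancer_healthy_k1, standardizing the gene expression level data of the Cancer_healthy and Test_sample matrices and applying principal component analysis (PCA), then applying linear discriminant analysis (LDA) to the resulting data to reduce it to a reduced learning-sample matrix m1 × w and a reduced test-sample matrix m2 × w;
    classification with a Gaussian process classifier: calling the gp toolbox in MATLAB to build a mathematical model from the formatted reduced learning-sample matrix m1 × w and reduced test-sample matrix m2 × w, and classifying according to the predicted class probability X;
    wherein n is the number of genes; m1 is the number of samples, consisting of m1 healthy and lung cancer cases; m2 is the number of samples, consisting of m2 healthy and lung cancer cases; k is the number of differential genes; and w is the number of dimensions.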
PCT/CN2019/096097 2018-07-17 2019-07-16 Method and kit for constructing a platelet nucleic acid library for genetic testing WO2020015621A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810782077.7A CN108949909A (zh) 2018-07-17 2018-07-17 Method and kit for constructing a platelet nucleic acid library for genetic testing
CN201810782077.7 2018-07-17

Publications (1)

Publication Number Publication Date
WO2020015621A1 true WO2020015621A1 (zh) 2020-01-23

Family

ID=64481415

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/096097 WO2020015621A1 (zh) 2018-07-17 2019-07-16 Method and kit for constructing a platelet nucleic acid library for genetic testing

Country Status (2)

Country Link
CN (1) CN108949909A (zh)
WO (1) WO2020015621A1 (zh)

Also Published As

Publication number Publication date
CN108949909A (zh) 2018-12-07

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19837936

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19837936

Country of ref document: EP

Kind code of ref document: A1