WO2018148903A1 - Auxiliary diagnostic method for urinary system tumors - Google Patents

Auxiliary diagnostic method for urinary system tumors

Info

Publication number
WO2018148903A1
WO2018148903A1 PCT/CN2017/073778 CN2017073778W WO2018148903A1 WO 2018148903 A1 WO2018148903 A1 WO 2018148903A1 CN 2017073778 W CN2017073778 W CN 2017073778W WO 2018148903 A1 WO2018148903 A1 WO 2018148903A1
Authority
WO
WIPO (PCT)
Prior art keywords
window
sample
genome
sequencing
urinary system
Prior art date
Application number
PCT/CN2017/073778
Other languages
English (en)
French (fr)
Inventor
高芳芳
薄世平
梁覃斯
任军
Original Assignee
上海亿康医学检验所有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海亿康医学检验所有限公司
Priority to PCT/CN2017/073778
Publication of WO2018148903A1

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00: ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids

Definitions

  • the present invention relates to the field of medicine, and in particular to an auxiliary diagnostic method for urinary system tumors.
  • liquid biopsy can capture tumor cells or DNA that have entered the blood and can therefore be used for tumor diagnosis; it is non-invasive, and samples can be drawn repeatedly for testing.
  • CTC circulating tumor cells
  • ctDNA circulating tumor DNA
  • exosomes
  • CTC testing is the earliest applied liquid biopsy technique.
  • the CTC count can be used to assess prognosis and detect recurrence; single-cell sequencing of CTCs can guide tumor medication, track the dynamic changes of a cancer, and allow the treatment plan to be adjusted in time; live CTCs isolated from blood can be further cultured and used to construct tumor research models.
  • CTC detection is technically demanding; few suppliers on the market can provide complete CTC detection technology and services, and each supplier's technology differs.
  • exosomes fall between the two: they are more abundant than CTCs and easier to enrich; in form, the secreted vesicles effectively protect nucleic acids and overcome the problem that ctDNA degrades easily in blood.
  • the information carried by exosomes is diverse, and both their proteins and nucleic acids can be used in analyses for early diagnosis, recurrence monitoring, and drug-resistance monitoring of cancer.
  • exosome biopsy, however, currently remains largely at the laboratory research level.
  • a method of assisting diagnosis of a urinary system tumor comprising the steps of:
  • step (iii) aligning the genomic sequence obtained in step (ii) with a reference genome to obtain positional information of the genomic sequence on the reference genome;
  • step (v) performing a Z-test on each window b of step (iv) to calculate the Z value of each window b;
  • step (vi) calculating a Whole Genomic Abnormality Score (WGAS) based on the Z value obtained in step (v);
  • step (ii) the sample to be tested is directly subjected to Malbac-L amplification and sequencing without extracting DNA therein, thereby obtaining a genomic sequence of the sample.
  • step (ii) the DNA in the sample to be tested can be extracted, subjected to Malbac-L amplification, and sequenced, thereby obtaining a genomic sequence of the sample.
  • the reference genome may be continuous or discontinuous.
  • the reference genome comprises a whole genome.
  • the reference genome refers to the full length of all chromosomes of the species (eg, human), the full length of a single or multiple chromosomes, a portion of a single or multiple chromosomes, or a combination thereof.
  • the reference genome covers 50% or more of the whole genome, preferably 60% or more, more preferably 70% or more, more preferably 80% or more, optimally 95% or more.
  • the sample is from an individual to be detected.
  • the individual to be detected is a human or a non-human mammal.
  • the sample is a solid sample or a liquid sample.
  • the sample comprises a body fluid sample.
  • the sample is selected from the group consisting of blood, plasma, interstitial fluid, lymph, cerebrospinal fluid, urine, saliva, aqueous humor, semen, gastrointestinal secretions, or a combination thereof.
  • the sample is selected from the group consisting of blood, urine, or a combination thereof.
  • the sample is a sample of a tissue selected from the group consisting of bladder, kidney, urethra, ureter, or a combination thereof.
  • the sample is selected from the group consisting of free circulating tumor cells (CTC), extracellular free DNA (cfDNA), exosomes, or a combination thereof.
  • the sample contains cells derived from the urinary system or nucleic acid components of the cells.
  • the cells comprise normal cells, cancer cells, or a combination thereof.
  • the urinary system tumor is selected from the group consisting of bladder cancer, kidney cancer, urethral cancer, cancer of the renal pelvis and ureter, or a combination thereof.
  • the sequencing is selected from the group consisting of single-end sequencing, paired-end sequencing, or a combination thereof.
  • step (iv) further comprises the step of correcting the copy number of each window b and calculating the corrected copy number of each window b.
  • the correction method is selected from the group consisting of Loess correction, a weighting method, a residual method, or a combination thereof.
  • the number of sequences falling into each window b, their base distribution, and the base distribution of the reference genome are counted based on the positional information of the genomic sequences on the reference genome.
  • the copy number of each window b is corrected based on the sequences and base content of each window b.
  • the Z value of each window b is calculated using the following formula: $Z_i = (x_i - \mu_i)/\sigma_i$, where
  • i is any positive integer from 1 to M;
  • M is the total number of windows into which the reference genome is divided, where M is a positive integer ≥ 50, preferably 50 ≤ M ≤ 10^5, more preferably 100 ≤ M ≤ 10^5, optimally 200 ≤ M ≤ 10^5;
  • x_i is the copy-number value of the test sample detected in the i-th window b_i;
  • b_i is the i-th window.
  • the normal control sample refers to a homogeneous sample of a normal person of the same species.
  • the genome-wide disorder score is calculated using the following formula:
  • m_b is the window ranked at m%
  • p_b is the window ranked at p%
  • m is 30-98, preferably 40-97, more preferably 60-96, still more preferably 80-95, optimally 95
  • p is 80-100, preferably 85-100, more preferably 90-100, optimally 100
  • p − m ≥ 2 (preferably ≥ 5, more preferably ≥ 10, more preferably ≥ 15, optimally ≥ 20).
  • before the genome-wide disorder score is calculated, the following steps are included:
  • before step (v), the following steps are further included:
  • step (iv1) calculating the coefficient of variation CV_i of each window b in the normal control samples from the copy number of each window b in step (iv);
  • the coefficient of variation CV_i is calculated using the following formula: $CV_i = \sigma_i/\mu_i$, where
  • μ_i is the arithmetic mean of the copy number of the normal control samples at window b_i, calculated as $\mu_i = \frac{1}{N}\sum_{j=1}^{N} X_j$;
  • N is the total number of normal control samples, where N is a positive integer ≥ 30, preferably 30 ≤ N ≤ 10^8, more preferably 50 ≤ N ≤ 10^7, optimally 100 ≤ N ≤ 10^4;
  • X_j is the copy-number value detected for the j-th normal control sample at window b_i;
  • σ_i is the standard deviation of the copy number of the normal control samples at window b_i and is calculated by the following formula:
  • N, j, X_j, μ_i and σ_i are as defined above.
  • a urinary system auxiliary diagnostic apparatus comprising:
  • Malbac-L amplification unit (device or module);
  • a sequencing unit (device or module);
  • a genome-wide disorder score unit (device or module); wherein the genome-wide disorder score unit (device or module) is used to perform the tasks of steps (iii)-(vi) in the first aspect of the invention and to output the resulting genome-wide disorder score.
  • the device further comprises a sample pretreatment unit (device or module).
  • the pretreatment unit (device or module) is used for precipitation treatment, and/or lysis treatment of the sample to be tested.
  • the sample to be tested is a cell sample.
  • the sequencing unit (device or module) comprises a second generation sequencer and/or a third generation sequencer.
  • a urinary system genetic testing method comprising:
  • step (iii) aligning the genomic sequence obtained in step (ii) with a reference genome to obtain positional information of the genomic sequence on the reference genome;
  • step (v) performing a Z-test on each window b of step (iv) to calculate the Z value of each window b;
  • step (vi) calculating a Whole Genomic Abnormality Score (WGAS) based on the Z value obtained in step (v);
  • step (vii) using the genome-wide disorder score (WGAS) obtained in step (vi) as the urinary system genetic test result.
  • the method is non-therapeutic and non-diagnostic.
  • Figure 1 shows a schematic of the rapid non-invasive tumor detection method of the present invention.
  • Figure 2 shows the concordance of chromosome copy-number detection between tissue samples and urine samples from bladder cancer patients.
  • Figure 3 shows the urine-sample disorder scores of bladder cancer patients, normal subjects, and patients with non-tumor urinary tract lesions.
  • the present inventors have for the first time established an auxiliary diagnosis and/or prognosis evaluation method that improves the sensitivity and versatility of urinary system tumor detection; specifically, the sample to be tested is amplified using the Malbac-L amplification method,
  • and auxiliary diagnosis and/or prognosis evaluation of the urinary system tumor is performed based on the value of the Whole Genome Disorder Score (WGAS).
  • WGAS Whole Genome Disorder Score
  • CNV Copy Number Variations
  • WGAS Whole Genomic Abnormality Score
  • Z-score, also known as the standard score, is the difference between a value and the mean, divided by the standard deviation. Expressed as: Z score = (x − μ)/σ
  • x is a specific value
  • μ is the arithmetic mean
  • σ is the standard deviation
  • the Z value represents the distance between the original value and the reference mean, measured in units of standard deviation.
  • partial response refers to a reduction of ≥ 30% in the sum of the maximum diameters of the target lesions, maintained for at least 4 weeks.
  • progressive disease refers to an increase of ≥ 20% in the sum of the maximum diameters of the target lesions, or the appearance of a new lesion.
  • the mutation site is not particularly limited and may be a known site, or may be a site identified in the future related to a tumor, preferably bladder cancer.
  • the reference genome in the case of a human, may be a whole genome or a partial genome. Also, the reference genome may be continuous or discontinuous.
  • when the reference genome is a partial genome, its total coverage (F) is 50% or more of the whole genome, preferably 60% or more, more preferably 70% or more, more preferably 80% or more, optimally 95% or more, where the total coverage (F) is the percentage of the whole genome that the reference genome represents.
  • the reference genome is a whole genome.
  • the reference genome is the full length of all chromosomes of the species (eg, human), the full length of a single or multiple chromosomes, a portion of a single or multiple chromosomes, or a combination thereof.
  • the Malbac-L amplification method is divided into a pre-amplification stage and an amplification stage; in the pre-amplification stage,
  • the 5' end of the primer has a fixed sequence
  • the middle is a random sequence of a certain length, such as B, D, H, V, or a combination thereof
  • the 3' end has specific sequences of different lengths (such as one or more of GGG, CCC, TTT, AAA, TGGG, GTTT, TNTNG, or GTGG).
  • at lower temperatures the primer can bind to the template relatively uniformly.
  • at the start of amplification, semi-amplicons of varying lengths are produced.
  • after several cycles, the two ends of the product carry the fixed base sequence and its complementary sequence, respectively, forming a full amplicon.
  • the fixed base sequence of the full amplicon and its complement can form a hairpin structure that prevents further amplification.
  • in the amplification stage, a primer mixture is added whose 3' end is complementary to the fixed sequence of the pre-amplification stage and whose 5' end matches the bases required by the sequencing platform, and the full amplicons generated in the pre-amplification stage are amplified in large quantities at this stage.
  • the amplified product can be directly subjected to sequencing after being recovered. (See Figure 1)
  • sequencing can be performed using conventional sequencing techniques and platforms.
  • the sequencing platform is not particularly limited, and second-generation sequencing platforms include (but are not limited to): Illumina's GA, GAII, GAIIx, HiSeq1000/2000/2500/3000/4000, X Ten, X Five, NextSeq500/550, MiSeq, MiSeqDx, MiSeq FGx, MiniSeq; SOLiD of Applied Biosystems; 454FLX of Roche; Ion Torrent, Ion PGM, Ion Proton I/II of Thermo Fisher Scientific (Life Technologies); BGISEQ1000, BGISEQ500, BGISEQ100 of BGI; BioelectronSeq 4000 of CapitalBio Group; DA8600 of Sun Yat-sen University Daan Gene Co., Ltd.; NextSeq CN500 of Berry Genomics; BIGIS of Zhongke Zixin, a subsidiary of Zixin Pharmaceutical; HYK-PSTAR-IIA of HYK Gene.
  • third-generation single-molecule sequencing platforms include (but are not limited to): the HeliScope system from Helicos BioSciences, the SMRT system from Pacific Biosciences, and GridION and MinION from Oxford Nanopore Technologies.
  • the sequencing type can be single-end sequencing or paired-end sequencing.
  • the sequencing length can be any length of 30 bp or more, such as 30 bp, 40 bp, 50 bp, 100 bp, or 300 bp, and the sequencing depth can be any multiple of the genome of 0.01 or more, such as 0.01, 0.02, 0.1, 1, 5, 10, or 30 times.
  • Illumina's HiSeq2500 high-throughput sequencing platform is preferred, and the sequencing type is single-end sequencing, the sequencing length is 41 bp, and the sequencing data amount is 5M.
  • data processing generally includes the following steps:
  • the method further includes: the type of the sample to be tested is a body fluid, and the body fluid may be blood, tissue interstitial fluid (referred to as tissue fluid or intercellular fluid), lymph fluid, cerebrospinal fluid, urine, saliva,
  • the detection target is DNA contained in body fluid, and the DNA is specifically present in free circulating tumor cells (CTC), extracellular free DNA (cfDNA), exosomes, and the like.
  • methods for extracting DNA from the sample to be tested include (but are not limited to) column extraction and magnetic bead extraction. A library is constructed from the sample and sequenced on a high-throughput sequencing platform.
  • the method further comprises: removing adapters and low-quality data from the sequencing results and aligning the reads to the reference genome.
  • the reference genome can be the whole genome, any chromosome, or part of a chromosome.
  • the reference genome is typically a generally accepted sequence; for the human genome, for example, it can be hg18 (NCBI36), hg19 (GRCh37), or hg38 (GRCh38) from NCBI or UCSC, or any chromosome or part of a chromosome.
  • any free or commercial alignment software can be used, such as BWA (Burrows-Wheeler Alignment tool), SOAPaligner/soap2 (Short Oligonucleotide Analysis Package), or Bowtie/Bowtie2.
  • the method further comprises: dividing the reference genome into windows of a certain length; depending on the amount of sequencing data, the window lengths may be the same or different integers in the range of 100 bp to 3,000,000 bp (3 Mb).
  • the number of windows can be any integer in the range of 1,000-30,000,000. Based on the positions of the sequences on the genome, the number of sequences falling into each window, their base distribution, and the base distribution of the reference genome are counted.
  • the copy number of each window is corrected according to the sequence of each window and the base GC content.
  • the correction methods include, but are not limited to, Loess correction, and the corrected copy number of each window is calculated.
  • step (d) specifically: N normal human samples (N is a natural number not less than 30) are processed under the same extraction, library construction, and sequencing conditions, repeating steps (a)-(c) above, to serve as the reference data set. For each window b_i, there are N normal copy values.
  • the arithmetic mean μ_i is calculated as: $\mu_i = \frac{1}{N}\sum_{j=1}^{N} X_j$
  • X_1, X_2, X_3, ... X_j are the copy values of the normal samples.
  • x_i is the copy value detected in window b_i.
  • the method further comprises: highly repetitive regions, such as regions near centromeres, telomeres, satellites, and heterochromatin, occur throughout the whole genome, chromosomes, chromosome fragments, or genes.
  • the highly repetitive regions are removed first to eliminate their effect on the disorder score calculation.
  • the method of removal includes (but is not limited to):
  • removing regions of the genome that high-throughput sequencing cannot detect, such as centromeres, telomeres, satellites, and heterochromatin, and removing regions of length L adjacent to the centromeres, telomeres, satellites, and heterochromatin, where L can be any length less than 3 Mb; or
  • μ_i is the arithmetic mean of the copy number of the normal control samples
  • σ_i is the standard deviation of the copy number of the normal control samples
  • the CV values are sorted in ascending order and the windows with the largest n% of CV values are removed, where n can be any value greater than 0 and less than or equal to 5.
  • step (e) specifically including the calculation of the genome-wide disorder degree score (WGAS):
  • the detection range of the disorder score is first determined, including but not limited to any range from 1 Mb up to the genome length (e.g., about 3 Gb for the human genome): the whole genome, a specific chromosome, a specific chromosome fragment, or a specific gene.
  • the Z values of the windows remaining after the effect of repetitive sequences has been removed are taken.
  • the absolute values of the Z values are sorted in ascending order and distributed evenly over the range 0%-100%, where the minimum absolute Z value is assigned 0% and the maximum absolute Z value is assigned 100%.
  • WGAS Whole Genome Disorder Score
  • m_b is the window ranked at m%
  • p_b is the window ranked at p%.
  • WGAS is a score calculated from copy-number abnormalities of chromosomes or chromosome fragments in the sample genome; its detection range includes but is not limited to the whole genome, specific chromosomes, chromosome fragments, and specific genes.
  • a method of assisted diagnosis and/or prognosis evaluation of a urinary system tumor comprising the steps of:
  • step (iii) aligning the genomic sequence obtained in step (ii) with a reference genome to obtain positional information of the genomic sequence on the reference genome;
  • step (v) performing a Z-test on each window b of step (iv) to calculate the Z value of each window b;
  • step (vi) calculating a Whole Genomic Abnormality Score (WGAS) based on the Z value obtained in step (v);
  • the present invention aims to reduce the operational steps of tumor detection and diagnosis, improve the throughput of non-invasive tumor detection and diagnosis, reduce the detection cost, and improve the sensitivity of detection and diagnosis.
  • the gene copy number detection method of the present invention omits the DNA extraction process, simplifying the operation steps compared with existing second-generation sequencing techniques; and because the present invention can detect gene copy number at the single-cell level, it enables detection of samples with low starting amounts.
  • the amplification product obtained by the Malbac-L amplification method of the present invention can only be derived from the original template, so that the constructed library can fully reflect the change of the gene copy number in the sample, and the detection sensitivity is higher.
  • the present invention combines the Malbac-L amplification technique with the genome disorder degree score (WGAS) for the first time, and can effectively and accurately perform auxiliary diagnosis or prognosis evaluation of urinary system tumors.
  • WGAS genome disorder degree score
  • Example 1 Detection of chromosomal aneuploidy in tissue samples and urine samples of patients with bladder cancer
  • Urine samples are used in the present invention, as follows:
  • the tissue sample genomic DNA extraction method is column extraction
  • the kit is a universal column genomic DNA extraction kit
  • the extracted genomic DNA is quantified using Qubit.
  • Linear amplification reagents include: primer mixture 1 (including: 5'-GAGGTGTGATGGADDDDDGGG-3' (SEQ ID NO.: 1), 5'-GAGGTGTGATGGADDDDDTTT-3' (SEQ ID NO.: 2)), dNTPs, a DNA polymerase with thermostability and strand-displacement activity, and a linear amplification reaction buffer.
  • the first-round amplification product from 2.3 is subjected to a second, exponential amplification
  • the exponential amplification reagents include: primer mixture 2 (5'-CCATCTCATCCCTGCGTGTCTCCGACTCAGCTAAGGTAACGATGAGGTGTGATGGA-3' (SEQ ID NO.: 3); 5'-CCACTACGCCTCCGCTTTCCTCTCTATGGGCAGTCGGTGATGAGGTGTGATGGA-3' (SEQ ID NO.: 4)), dNTPs, a DNA polymerase with thermostability and strand-displacement activity, and an exponential amplification reaction buffer.
  • the library construction was completed after the above steps were completed, and the library was purified and stored at -20 °C.
  • the library concentration was measured by qPCR, the dilution factor of the library was calculated by the formula, and sequencing clusters were generated by bridge PCR to form the sequencing template.
  • the sequencing template was sequenced on a sequencing-by-synthesis platform to obtain the base sequence of each DNA fragment.
  • dilution factor = library concentration (nM) × 1000 / loading concentration.
  • the base sequences of the sequenced DNA fragments were mapped to the human reference genome, and chromosome copy-number information was obtained by comparison with a reference set composed of a large number of normal samples.
  • the chromosome copy-number information of tissue samples was compared with that of urine samples.
  • the second-generation sequencing results showed that, in sample A, both the conventional detection method on the tissue sample (A1 in Figure 2) and the rapid non-invasive tumor detection method (A2 in Figure 2) detected multiple chromosomal abnormalities;
  • in sample B, neither the conventional detection method on the tissue sample (B1 in Figure 2) nor the rapid non-invasive tumor detection method (B2 in Figure 2) showed obvious chromosomal abnormalities, suggesting that the chromosomes were normal.
  • Example 2 Urine sample genome-wide disorder score (WGAS)
  • the collected samples were lysed, subjected to the first (linear) amplification and the second (exponential) amplification, and sequenced on the sequencing platform, following the same procedure as for the urine samples in Example 1.
  • the genomic sequences of the sequenced samples were aligned to the reference genome to obtain the positions of the sequences on the reference genome.
  • the reference genome was divided into windows of a certain length, a Z-test was performed on the copy number of each window, and the genome-wide disorder score (WGAS) was calculated from the Z value of each window.
  • the genome-wide disorder scores of the samples are shown in Figure 3.
  • the results show that the method of the present invention can effectively distinguish bladder cancer patients from non-bladder-cancer patients, further confirming the effectiveness of the non-invasive detection method of the present invention for the auxiliary diagnosis of bladder cancer.

Abstract

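An auxiliary diagnostic method for urinary system tumors, in which the sample to be tested is amplified using the Malbac-L amplification method, and auxiliary diagnosis and/or prognosis evaluation of the urinary system tumor is performed based on the value of the genome-wide disorder score (WGAS).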

Description

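Auxiliary diagnostic method for urinary system tumors
Technical Field
The present invention relates to the field of medicine, and in particular to an auxiliary diagnostic method for urinary system tumors.
Background Art
Traditional tumor diagnostic methods include imaging, surgical pathology, and biopsy, but these tests fall short in certain respects:
1. They overlook the heterogeneity of tumor lesions. It is now increasingly recognized that a tumor is itself a complex composition: tumor cells, stromal cells, the tumor extracellular matrix (ECM), and even immune cells all take part in tumor development. If traditional diagnosis and treatment target only the tumor cells, they are bound to run into serious difficulties;
2. They overlook tumor metastasis. Imaging can locate the primary tumor and the metastatic lesions, but how tumor cells travel from the primary site to the metastatic site is still poorly understood, and there is no good means of blocking this process.
Liquid biopsy can capture tumor cells or DNA that have entered the blood and can therefore serve as a tumor diagnostic method. It is non-invasive, and samples can be drawn repeatedly for testing.
Liquid biopsy technology currently has three main branches worldwide: circulating tumor cells (CTC), circulating tumor DNA (ctDNA), and exosomes.
CTC testing was the earliest liquid biopsy technique to be applied clinically. CTC counts can be used to assess prognosis and detect recurrence; single-cell sequencing of CTCs can guide tumor medication, track the dynamic changes of a cancer, and allow the treatment plan to be adjusted in time; live CTCs isolated from blood can also be cultured further and used to build tumor research models. However, because of the particular nature of CTCs, namely their rarity, heterogeneity, and structural complexity, CTC testing is technically demanding. Few suppliers on the market offer complete CTC testing technology and services, and each supplier's technology differs.
Compared with CTC testing, the research history of ctDNA testing has been tortuous. Free DNA fragments (cfDNA) were detected in normal human blood as early as 1948; then, in 1973, DNA levels in the blood of patients with disease were found to be higher than in healthy people, which meant that simple analysis of DNA in blood could be used for preliminary disease screening. However, it was not until 2013 that researchers developed highly sensitive genetic testing techniques that made it possible to detect mutations in trace amounts of DNA in blood, and only then did liquid biopsy based on genetic testing become a reality.
Owing to technical limitations, however, ctDNA applications remain at the preliminary stage of supplementing tissue samples for targeted gene testing. Applications such as early warning and postoperative evaluation based on circulating tumor DNA require large amounts of supporting clinical data and are constrained by the stability of the detection technology, so no mature product has yet entered the clinical market.
Exosomes fall between the two: they are more abundant than CTCs and easier to enrich, and, in form, the secreted vesicles effectively protect nucleic acids, overcoming the problem that ctDNA degrades easily in blood. Exosomes carry diverse information, and both their proteins and their nucleic acids can be used in analyses for early diagnosis, recurrence monitoring, and drug-resistance monitoring of cancer. At present, however, exosome biopsy largely remains at the laboratory research stage.
Therefore, there is an urgent need in the art to develop a method for efficient and accurate auxiliary diagnosis and/or prognosis evaluation of tumors, especially urinary system tumors.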
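Summary of the Invention
The object of the present invention is to provide a method for efficient and accurate auxiliary diagnosis and/or prognosis evaluation of tumors, especially urinary system tumors.
In a first aspect of the present invention, an auxiliary diagnostic method for urinary system tumors is provided, the method comprising the steps of:
(i) providing a sample to be tested;
(ii) subjecting the sample to be tested to Malbac-L amplification and sequencing to obtain the genomic sequence of the sample;
(iii) aligning the genomic sequence obtained in step (ii) with a reference genome to obtain positional information of the genomic sequence on the reference genome;
(iv) dividing the reference genome into M region fragments, each region fragment being one window b, and calculating the copy number of each window b;
(v) performing a Z-test on each window b of step (iv) to calculate the Z value of each window b;
(vi) calculating the genome-wide disorder score (WGAS, Whole genomic abnormality score) from the Z values obtained in step (v); and
(vii) performing auxiliary diagnosis and/or prognosis evaluation of the urinary system tumor based on the genome-wide disorder score (WGAS).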
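In another preferred embodiment, in step (ii), the sample to be tested is subjected directly to Malbac-L amplification and sequencing, without extracting the DNA in it, to obtain the genomic sequence of the sample.
In another preferred embodiment, in step (ii), the DNA in the sample to be tested may be extracted and then subjected to Malbac-L amplification and sequencing to obtain the genomic sequence of the sample.
In another preferred embodiment, the reference genome may be continuous or discontinuous.
In another preferred embodiment, the reference genome comprises the whole genome.
In another preferred embodiment, the reference genome refers to the full length of all chromosomes of the species (e.g., human), the full length of one or more chromosomes, a portion of one or more chromosomes, or a combination thereof.
In another preferred embodiment, the reference genome covers 50% or more of the whole genome, preferably 60% or more, more preferably 70% or more, more preferably 80% or more, optimally 95% or more.
In another preferred embodiment, the sample is from an individual to be tested.
In another preferred embodiment, the individual to be tested is a human or a non-human mammal.
In another preferred embodiment, the sample is a solid sample or a liquid sample.
In another preferred embodiment, the sample comprises a body fluid sample.
In another preferred embodiment, the sample is selected from the group consisting of blood, plasma, interstitial fluid, lymph, cerebrospinal fluid, urine, saliva, aqueous humor, semen, gastrointestinal secretions, or a combination thereof.
In another preferred embodiment, the sample is selected from the group consisting of blood, urine, or a combination thereof.
In another preferred embodiment, the sample is a sample of a tissue selected from: bladder, kidney, urethra, ureter, or a combination thereof.
In another preferred embodiment, the sample is selected from the group consisting of free circulating tumor cells (CTC), extracellular free DNA (cfDNA), exosomes, or a combination thereof.
In another preferred embodiment, the sample contains cells derived from the urinary system or nucleic acid components of such cells.
In another preferred embodiment, the cells comprise normal cells, cancer cells, or a combination thereof.
In another preferred embodiment, the urinary system tumor is selected from the group consisting of bladder cancer, kidney cancer, urethral cancer, cancer of the renal pelvis and ureter, or a combination thereof.
In another preferred embodiment, the specific Malbac-L amplification method is described in Chinese patent application No. CN201610264059.0.
In another preferred embodiment, the sequencing is selected from the group consisting of single-end sequencing, paired-end sequencing, or a combination thereof.
In another preferred embodiment, step (iv) further comprises the step of correcting the copy number of each window b and calculating the corrected copy number of each window b.
In another preferred embodiment, the correction method is selected from the group consisting of Loess correction, a weighting method, a residual method, or a combination thereof.
In another preferred embodiment, based on the positional information of the genomic sequences on the reference genome, the number of sequences falling into each window b, their base distribution, and the base distribution of the reference genome are counted.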
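The counting step can be illustrated with a minimal sketch. It assumes fixed-length windows on a single chromosome and read start positions that an aligner has already produced; the function names `count_reads_per_window` and `gc_fraction` and the toy data are hypothetical and do not come from the application.

```python
from collections import Counter


def count_reads_per_window(read_starts, chrom_length, window_size):
    """Count aligned reads per fixed-length window on one chromosome.

    read_starts: 0-based start positions of aligned reads (hypothetical input).
    Returns a list with one read count per window.
    """
    n_windows = (chrom_length + window_size - 1) // window_size  # ceiling division
    counts = Counter(
        pos // window_size for pos in read_starts if 0 <= pos < chrom_length
    )
    return [counts.get(i, 0) for i in range(n_windows)]


def gc_fraction(sequence):
    """Fraction of G/C among the A/C/G/T bases of a window (N is ignored)."""
    seq = sequence.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    return (seq.count("G") + seq.count("C")) / acgt if acgt else 0.0


# Toy example: five reads on a 1,000 bp "chromosome" split into 100 bp windows.
counts = count_reads_per_window([5, 120, 130, 250, 999], chrom_length=1000, window_size=100)
```

In practice the read positions would come from an alignment file, and the window length would be chosen according to the amount of sequencing data.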
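In another preferred embodiment, the copy number of each window b is corrected according to the sequences and base content of each window b.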
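To make the idea of a base-content correction concrete, the sketch below rescales window counts by GC content. It is not the Loess correction named in the text: as a simpler stand-in, it groups windows into GC bins and scales each count by the ratio of the overall median to the median of its bin. The function name, the bin width, and the toy values are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def gc_bin_median_correction(counts, gc_fractions, bin_width=0.05):
    """Scale each window count by overall_median / median of its GC bin.

    A binned-median stand-in for Loess correction (assumption, not the
    application's procedure).
    """
    overall = median(counts)
    bins = defaultdict(list)
    for count, gc in zip(counts, gc_fractions):
        bins[int(gc / bin_width)].append(count)
    bin_medians = {key: median(values) for key, values in bins.items()}

    corrected = []
    for count, gc in zip(counts, gc_fractions):
        bin_median = bin_medians[int(gc / bin_width)]
        corrected.append(count * overall / bin_median if bin_median > 0 else 0.0)
    return corrected


# Toy example: high-GC windows are over-represented by a factor of two.
corrected = gc_bin_median_correction([100, 110, 200, 220], [0.32, 0.33, 0.52, 0.53])
```

After correction, the low-GC and high-GC windows in the toy data sit on the same scale, which is the effect a GC correction is meant to achieve before windows are compared with normal controls.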
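In another preferred embodiment, the Z value of each window b is calculated using the following formula:
$$Z_i = \frac{x_i - \mu_i}{\sigma_i}$$
where i is any positive integer from 1 to M; M is the total number of windows into which the reference genome is divided, where M is a positive integer ≥ 50, preferably 50 ≤ M ≤ 10^5, more preferably 100 ≤ M ≤ 10^5, optimally 200 ≤ M ≤ 10^5; x_i is the copy-number value of the sample to be tested detected in the i-th window b_i; and b_i is the i-th window.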
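A minimal sketch of the per-window Z calculation follows, taking Z_i = (x_i − μ_i)/σ_i in line with the Z-score definition Z score = (x − μ)/σ given in the Terms section, with μ_i and σ_i computed from normal control samples. The function name and the toy data are hypothetical, and the use of the population standard deviation (division by N rather than N − 1) is an assumption, because the formula image for σ_i is not available in this text.

```python
from statistics import mean, pstdev


def window_z_values(test_counts, control_counts):
    """Compute Z_i = (x_i - mu_i) / sigma_i for each window i.

    test_counts[i]       : copy-number value of the test sample in window i.
    control_counts[j][i] : copy-number value of normal control j in window i.
    """
    z_values = []
    for i, x in enumerate(test_counts):
        reference = [sample[i] for sample in control_counts]
        mu = mean(reference)
        sigma = pstdev(reference)  # assumption: population SD (divide by N)
        z_values.append((x - mu) / sigma if sigma > 0 else 0.0)
    return z_values


# Toy example with 3 controls and 2 windows (the text calls for N >= 30 controls).
controls = [[10, 10], [12, 8], [8, 12]]
z = window_z_values([15, 10], controls)
```

In the toy data the first window lies about three standard deviations above the control mean, while the second window matches it exactly.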
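In another preferred embodiment, the normal control sample refers to a sample of the same type from a normal individual of the same species.
In another preferred embodiment, the genome-wide disorder score is calculated using the following formula:
[Formula image PCTCN2017073778-appb-000002: WGAS formula]
where m_b is the window ranked at m%, p_b is the window ranked at p%, m is 30-98, preferably 40-97, more preferably 60-96, still more preferably 80-95, optimally 95, p is 80-100, preferably 85-100, more preferably 90-100, optimally 100, and p − m ≥ 2 (preferably ≥ 5, more preferably ≥ 10, more preferably ≥ 15, optimally ≥ 20).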
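The WGAS formula itself survives only as an image reference, so the sketch below is a guess at its general shape and not the application's formula. It follows the ranking described in the text (absolute Z values sorted in ascending order and spread evenly over 0%-100%) and then, as a loudly stated assumption, averages the absolute Z values of the windows ranked between m% and p%. The function name `wgas_sketch` and the choice of the mean as the aggregate are hypothetical.

```python
def wgas_sketch(z_values, m=95.0, p=100.0):
    """Illustrative disorder score: mean |Z| of windows ranked from m% to p%.

    Assumption: the mean is used as the aggregate; the actual WGAS formula
    is not recoverable from this text.
    """
    abs_z = sorted(abs(z) for z in z_values)
    n = len(abs_z)
    if n == 0:
        raise ValueError("at least one Z value is required")
    if n == 1:
        return abs_z[0]
    # Rank k (0-based) maps to 100 * k / (n - 1) percent: min -> 0%, max -> 100%.
    selected = [v for k, v in enumerate(abs_z) if m <= 100.0 * k / (n - 1) <= p]
    if not selected:
        raise ValueError("no window falls between m% and p%")
    return sum(selected) / len(selected)


# Toy example: 101 windows with Z values -50..50.
score = wgas_sketch(list(range(-50, 51)), m=95.0, p=100.0)
```

With the optimal values stated in the text (m = 95, p = 100), only the windows with the most extreme Z values contribute, so a sample with widespread copy-number abnormality would score higher than a sample whose windows all resemble the normal controls.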
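In another preferred embodiment, the following steps are included before the genome-wide disorder score is calculated:
(a) based on the sequence features of the reference genome, removing regions of the genome that high-throughput sequencing cannot detect, such as centromeres, telomeres, satellites, and heterochromatin, and removing regions of length L adjacent to the centromeres, telomeres, satellites, and heterochromatin, where L is any length less than 3 Mb; or
(b) based on the copy-number characteristics of the samples, removing regions of the genome that high-throughput sequencing cannot detect, such as centromeres, telomeres, satellites, and heterochromatin.
In another preferred embodiment, the following steps are further included before step (v):
(iv1) calculating the coefficient of variation CV_i of each window b in the normal control samples from the copy number of each window b in step (iv); and
(iv2) sorting the CV_i values in ascending order and removing the windows with the largest n% of CV_i values, where n is any value greater than 0 and less than or equal to 5, preferably n = 1, 2, 2.5, 3, 3.1, 4, 4.2, or 5.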
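In another preferred embodiment, the coefficient of variation CV_i is calculated using the following formula:
$$CV_i = \frac{\sigma_i}{\mu_i}$$
where μ_i is the arithmetic mean of the copy number of the normal control samples at window b_i, calculated using the following formula:
$$\mu_i = \frac{1}{N}\sum_{j=1}^{N} X_j$$
where j is any positive integer from 1 to N; N is the total number of normal control samples, where N is a positive integer ≥ 30, preferably 30 ≤ N ≤ 10^8, more preferably 50 ≤ N ≤ 10^7, optimally 100 ≤ N ≤ 10^4; X_j is the copy-number value detected for the j-th normal control sample at window b_i;
σ_i is the standard deviation of the copy number of the normal control samples at window b_i, calculated using the following formula:
[Formula image PCTCN2017073778-appb-000005: standard deviation σ_i]
where N, j, X_j, μ_i and σ_i are as defined above.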
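The window-filtering step (iv1)-(iv2) can be sketched as follows, taking the coefficient of variation in its standard form CV_i = σ_i/μ_i. The function name `filter_windows_by_cv` and the toy data are hypothetical, and the population standard deviation is again an assumption, since the formula image for σ_i is not reproduced in this text.

```python
from statistics import mean, pstdev


def filter_windows_by_cv(control_counts, n_percent=5.0):
    """Return indices of windows kept after dropping the n% with the largest CV.

    control_counts[j][i] : copy-number value of normal control j in window i.
    """
    n_windows = len(control_counts[0])
    cvs = []
    for i in range(n_windows):
        reference = [sample[i] for sample in control_counts]
        mu = mean(reference)
        # Assumption: population SD; windows with zero mean get an infinite CV.
        cvs.append(pstdev(reference) / mu if mu > 0 else float("inf"))

    order = sorted(range(n_windows), key=lambda i: cvs[i])  # ascending CV
    n_drop = int(n_windows * n_percent / 100.0)
    return sorted(order[: n_windows - n_drop])


# Toy example: 20 windows, 3 controls (the text calls for N >= 30); window 7 is noisy.
controls = [[10] * 20 for _ in range(3)]
controls[0][7], controls[2][7] = 5, 15
kept = filter_windows_by_cv(controls, n_percent=5.0)
```

With n = 5 and 20 windows, exactly one window is dropped, and it is the one whose control values vary the most.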
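In a second aspect of the present invention, an auxiliary diagnostic apparatus for the urinary system is provided, comprising:
a Malbac-L amplification unit (device or module);
a sequencing unit (device or module); and
a genome-wide disorder score unit (device or module), wherein the genome-wide disorder score unit (device or module) is used to perform the tasks of steps (iii)-(vi) of the first aspect of the present invention and to output the resulting genome-wide disorder score.
In another preferred embodiment, the apparatus further comprises a sample pretreatment unit (device or module).
In another preferred embodiment, the pretreatment unit (device or module) is used to subject the sample to be tested to precipitation treatment and/or lysis treatment.
In another preferred embodiment, the sample to be tested is a cell sample.
In another preferred embodiment, the sequencing unit (device or module) comprises a second-generation sequencer and/or a third-generation sequencer.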
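In a third aspect of the present invention, a urinary system genetic testing method is provided, comprising:
(i) providing a sample to be tested;
(ii) subjecting the sample to be tested to Malbac-L amplification and sequencing to obtain the genomic sequence of the sample;
(iii) aligning the genomic sequence obtained in step (ii) with a reference genome to obtain positional information of the genomic sequence on the reference genome;
(iv) dividing the reference genome into M region fragments, each region fragment being one window b, and calculating the copy number of each window b;
(v) performing a Z-test on each window b of step (iv) to calculate the Z value of each window b;
(vi) calculating the genome-wide disorder score (WGAS, Whole genomic abnormality score) from the Z values obtained in step (v); and
(vii) using the genome-wide disorder score (WGAS) obtained in step (vi) as the urinary system genetic test result.
In another preferred embodiment, the method is non-therapeutic and non-diagnostic.
It should be understood that, within the scope of the present invention, the technical features described above and those specifically described below (e.g., in the examples) may be combined with one another to form new or preferred technical solutions. For brevity, these are not enumerated here.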
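Brief Description of the Drawings
Figure 1 shows a schematic of the rapid non-invasive tumor detection method of the present invention.
Figure 2 shows the concordance of chromosome copy-number detection between tissue samples and urine samples from bladder cancer patients.
Figure 3 shows the urine-sample disorder scores of bladder cancer patients, normal subjects, and patients with non-tumor urinary tract lesions.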
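Detailed Description
Through extensive and intensive research, the present inventors have for the first time established an effective auxiliary diagnosis and/or prognosis evaluation method that improves the sensitivity and versatility of urinary system tumor detection. Specifically, the sample to be tested is amplified using the Malbac-L amplification method, and auxiliary diagnosis and/or prognosis evaluation of the urinary system tumor is performed based on the value of the genome-wide disorder score (WGAS). On this basis, the inventors completed the present invention.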
术语
如本文所用,术语“拷贝数变异(Copy Number Variations,CNV)”是指样本基因组染色体或染色体片段拷贝数异常,包括但不限于染色体非整倍体、缺失、重复,大于1000bp碱基的微缺失、微重复。
如本文所用,术语“全基因组混乱度值(Whole Genomic Abnormality Score, WGAS)”是根据样本基因组染色体或染色体片段拷贝数异常计算得到的分值,分值检测范围包括但不限于全基因组、特定的染色体、染色体片段、特定基因。
如本文所用,术语“Z值(Z-score)”也叫标准分值(standard score),是一个数值与平均数的差再除以标准差的过程。用公式表示为:
Z score=(x-μ)/σ
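where x is a specific value, μ is the arithmetic mean, and σ is the standard deviation; the Z value represents the distance between the raw value and the reference mean, measured in units of standard deviation.
As used herein, the term "partial response (PR)" means that the sum of the longest diameters of the target lesions decreases by ≥ 30% and the decrease is maintained for at least 4 weeks.
As used herein, the term "progressive disease (PD)" means that the sum of the longest diameters of the target lesions increases by ≥ 20%, or that new lesions appear.
As used herein, the terms "system" and "apparatus" have the same meaning.
In the present invention, the mutation site is not particularly limited; it may be a known site or a site identified in the future as associated with tumors (preferably bladder cancer).
As used herein, the terms "unit", "device", and "module" are used interchangeably.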
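Reference genome
In the present invention, taking humans as an example, the reference genome may be the whole genome or a partial genome, and it may be contiguous or non-contiguous. When the reference genome is a partial genome, its total coverage (F) is at least 50% of the whole genome, preferably at least 60%, more preferably at least 70%, still more preferably at least 80%, and most preferably at least 95%, where the total coverage (F) is the percentage of the whole genome represented by the reference genome.
In a preferred embodiment, the reference genome is the whole genome.
In a preferred embodiment, the reference genome is the full length of all chromosomes of the species (e.g., human), the full length of one or more chromosomes, a portion of one or more chromosomes, or a combination thereof.
Malbac-L amplification method
In the present invention, the specific procedure for Malbac-L amplification is described in the patent application with application number 201610264059.0.
Briefly, the Malbac-L amplification method is divided into a pre-amplification stage and an amplification stage. In the pre-amplification stage, the primer carries a fixed sequence at its 5' end, a random sequence of a certain length in the middle (such as B, D, H, V, or a combination thereof), and a specific sequence of varying length at its 3' end (such as one or more of GGG, CCC, TTT, AAA, TGGG, GTTT, TNTNG, or GTGG). At a relatively low temperature this primer binds fairly uniformly to the template. Early in amplification, semi-amplicons of varying length are produced; after several cycles, the products carry the fixed sequence at one end and its complement at the other, forming full amplicons. The fixed sequence and its complement in a full amplicon can form a hairpin structure that prevents further amplification. In the amplification stage, a primer mixture is added whose 3' ends are complementary to the fixed sequence from the pre-amplification stage and whose 5' ends match the bases required by the sequencing platform, so the full amplicons produced during pre-amplification are amplified in large quantity. After recovery, the amplification product can be loaded directly for sequencing (see Figure 1).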
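Sequencing
In the present invention, sequencing may be performed with conventional sequencing technologies and platforms. The sequencing platform is not particularly limited. Second-generation sequencing platforms include (but are not limited to): Illumina's GA, GAII, GAIIx, HiSeq1000/2000/2500/3000/4000, X Ten, X Five, NextSeq500/550, MiSeq, MiSeqDx, MiSeq FGx, and MiniSeq; Applied Biosystems' SOLiD; Roche's 454FLX; Thermo Fisher Scientific (Life Technologies)' Ion Torrent, Ion PGM, and Ion Proton I/II; BGI's BGISEQ1000, BGISEQ500, and BGISEQ100; CapitalBio's BioelectronSeq 4000; the DA8600 from Da An Gene Co., Ltd. of Sun Yat-sen University; Berry Genomics' NextSeq CN500; the BIGIS from Zhongke Zixin, a subsidiary of Zixin Pharmaceutical; and HYK Gene's HYK-PSTAR-IIA.
Third-generation single-molecule sequencing platforms include (but are not limited to): Helicos BioSciences' HeliScope system, Pacific Biosciences' SMRT system, and Oxford Nanopore Technologies' GridION and MinION. The sequencing type may be single-end or paired-end; the read length may be any length of at least 30 bp, such as 30 bp, 40 bp, 50 bp, 100 bp, or 300 bp; and the sequencing depth may be any multiple of the genome of at least 0.01, such as 0.01, 0.02, 0.1, 1, 5, 10, or 30-fold.
In the present invention, Illumina's HiSeq2500 high-throughput sequencing platform is preferred, with single-end sequencing, a read length of 41 bp, and a sequencing data volume of 5 M.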
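The relationship between data volume, read length, and sequencing depth can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes that "5 M" means 5 million reads, that the genome is about 3 Gb, and that windows are 1 Mb long; the function names are hypothetical.

```python
def sequencing_depth(n_reads: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Average depth = total sequenced bases / genome size."""
    return n_reads * read_length_bp / genome_size_bp


def mean_reads_per_window(n_reads: int, genome_size_bp: int, window_bp: int) -> float:
    """Expected reads per window if reads were spread evenly over the genome."""
    n_windows = genome_size_bp / window_bp
    return n_reads / n_windows


# Assumed inputs: 5 million single-end reads of 41 bp on a ~3 Gb genome.
depth = sequencing_depth(5_000_000, 41, 3_000_000_000)
# Assumed window length of 1 Mb (the text allows 100 bp to 3 Mb).
per_window = mean_reads_per_window(5_000_000, 3_000_000_000, 1_000_000)
print(f"depth ~ {depth:.3f}x, reads per 1 Mb window ~ {per_window:.0f}")
```

Under these assumptions the depth falls inside the "at least 0.01-fold" range stated above, and each window still collects enough reads for counting.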
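Data processing
In the present invention, data processing generally includes the following steps:
(a) extracting nucleic acid from the test sample and sequencing it to obtain genome sequences;
(b) aligning the genome sequences of the sample to a reference genome to obtain the positions of the sequences on the reference genome;
(c) dividing the reference genome into windows of a certain length and calculating the copy number of each window b;
(d) performing a Z-test on each window b and calculating the Z value of each window; and
(e) calculating the whole genomic abnormality score (WGAS).
In step (a), specifically: the test sample is a body fluid, which may be blood, interstitial fluid (also called tissue fluid or intercellular fluid), lymph, cerebrospinal fluid, urine, or saliva. The detection target is the DNA contained in the body fluid, which is found in circulating tumor cells (CTC), cell-free DNA (cfDNA), exosomes, and the like. Methods for extracting DNA from the test sample include (but are not limited to) column-based extraction and magnetic-bead extraction. A library is constructed from the sample and sequenced on a high-throughput sequencing platform.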
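In step (b), specifically: adapters and low-quality data are removed from the sequencing results, and the remaining reads are aligned to the reference genome. The reference genome may be the whole genome, any chromosome, or part of a chromosome. It is usually a publicly recognized, established sequence; for human, for example, hg18 (NCBI36), hg19 (GRCh37), or hg38 (GRCh38) from NCBI or UCSC, or any single chromosome or part of a chromosome. Any free or commercial alignment software may be used, such as BWA (Burrows-Wheeler Alignment tool), SOAPaligner/soap2 (Short Oligonucleotide Analysis Package), or Bowtie/Bowtie2. Aligning the reads to the reference genome yields their positions on the genome. Reads that align uniquely may be selected and reads that align to multiple locations discarded, which eliminates the error that repetitive sequences introduce into the copy-number calculation.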
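As an illustration of selecting uniquely aligned reads, the sketch below filters alignment records in SAM text format. It is not the applicant's implementation: the function name and the mapping-quality threshold are hypothetical, and using MAPQ as a proxy for uniqueness is an assumption whose meaning differs between aligners.

```python
def unique_positions(sam_lines, min_mapq=30):
    """Return (chromosome, 1-based position) for reads treated as uniquely aligned.

    Assumption: a read counts as unique if it is mapped, is a primary
    alignment, and has MAPQ >= min_mapq (hypothetical threshold).
    """
    unmapped, secondary, supplementary = 0x4, 0x100, 0x800
    kept = []
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag, chrom = int(fields[1]), fields[2]
        pos, mapq = int(fields[3]), int(fields[4])
        if flag & (unmapped | secondary | supplementary):
            continue
        if mapq < min_mapq:
            continue
        kept.append((chrom, pos))
    return kept
```

The returned positions are what the windowing step needs; the read sequences themselves are no longer required once positions are known.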
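In step (c), specifically: the genome is divided into windows of a certain length. Depending on the amount of sequencing data, the window lengths may be the same or different integers in the range 100 bp to 3,000,000 bp (3 M). The number of windows may be any integer in the range 1,000 to 30,000,000. From the positions of the sequenced reads on the genome, the number of reads falling in each window, their base composition, and the base composition of the reference genome are tallied. The copy number of each window is corrected according to its reads and GC content; correction methods include but are not limited to Loess correction. The corrected copy number of each window is then calculated.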
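The windowing and GC correction described above can be sketched as follows. This is a simplified illustration, not the Loess correction named in the text: it swaps in a GC-stratified median normalization, assumes fixed-length windows, and assumes a diploid baseline of 2 copies. All function and parameter names are hypothetical.

```python
from collections import Counter, defaultdict
from statistics import median


def count_reads(positions, window_bp):
    """Count reads per fixed-length window; positions are (chrom, 1-based pos)."""
    counts = Counter()
    for chrom, pos in positions:
        counts[(chrom, (pos - 1) // window_bp)] += 1
    return counts


def gc_stratified_copy_number(counts, gc_fraction, baseline=2.0):
    """Estimate copy number per window relative to windows of similar GC content.

    counts: {window: read count}; gc_fraction: {window: GC fraction in 0..1}.
    Windows are grouped into 5%-wide GC strata (a stand-in for Loess), and
    each count is scaled by the median count of its stratum.
    """
    def stratum(window):
        return int(round(gc_fraction[window] * 100)) // 5

    strata = defaultdict(list)
    for window, n in counts.items():
        strata[stratum(window)].append(n)

    copy_number = {}
    for window, n in counts.items():
        reference = median(strata[stratum(window)])
        copy_number[window] = baseline * n / reference if reference > 0 else 0.0
    return copy_number
```

A smoother such as Loess fits a continuous curve of read count against GC content instead of using discrete strata; the stratified median is used here only because it needs nothing beyond the standard library.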
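In step (d), specifically: samples from N normal individuals (N being a natural number not less than 30) are taken and processed under the same extraction, library construction, and sequencing conditions, repeating steps (a)-(c) above, to serve as the reference data set. Each window bi therefore has N corresponding normal copy-number values.
The arithmetic mean μi of the copy numbers of the normal control samples is calculated as: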
$$\mu_i=\frac{1}{N}\sum_{j=1}^{N}X_j$$
The standard deviation σi of the copy numbers of the normal control samples is calculated as:
$$\sigma_i=\sqrt{\frac{1}{N}\sum_{j=1}^{N}\left(X_j-\mu_i\right)^2}$$
X1, X2, X3, ..., XN are the copy-number values of the normal samples.
The Z value of each window bi of the test sample is calculated as:
$$Z_i=\frac{x_i-\mu_i}{\sigma_i}$$
xi is the copy number measured in window bi for the test sample.
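The per-window Z calculation can be sketched in a few lines. This is a minimal illustration, assuming the population standard deviation (divisor N); the original formula image is not available, so the divisor is an assumption, and the function name is hypothetical. The text requires N ≥ 30 reference samples; the usage comment uses fewer only for brevity.

```python
from statistics import mean, pstdev


def window_z_scores(sample, reference):
    """Z value of each window of a test sample against normal controls.

    sample: copy numbers per window for the test sample.
    reference: list of normal samples, each a list of copy numbers per window.
    Returns None for a window whose reference standard deviation is zero.
    """
    z_values = []
    for i, x in enumerate(sample):
        column = [normal[i] for normal in reference]  # X_1..X_N for window b_i
        mu = mean(column)
        sigma = pstdev(column)  # assumption: divisor N rather than N - 1
        z_values.append((x - mu) / sigma if sigma > 0 else None)
    return z_values


# Example: window_z_scores([3.0, 2.0], [[2.0, 2.0], [2.2, 1.8], [1.8, 2.2]])
# gives a large positive Z for the first window and about 0 for the second.
```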
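In step (e), specifically: highly repetitive regions exist across the whole genome and around particular chromosomes, chromosomal segments, or genes, such as pericentromeric, telomeric, satellite, and heterochromatic regions. These highly repetitive regions are removed first to eliminate their effect on the abnormality score calculation.
In a preferred embodiment, the removal methods include (but are not limited to):
a. Removal based on sequence features of the reference genome
Regions of the genome that high-throughput sequencing cannot resolve, such as centromeres, telomeres, satellites, and heterochromatin, are removed, along with regions of length L flanking the centromeres, telomeres, satellites, and heterochromatin, where L may be any length less than 3 M; or
b. Removal based on copy-number features of normal samples
For each window bi, the coefficient of variation (CV) of the normal control samples in that window, CVi, is calculated as: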
$$CV_i=\frac{\sigma_i}{\mu_i}$$
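μi is the arithmetic mean of the copy numbers of the normal control samples, and σi is the standard deviation of the copy numbers of the normal control samples.
The CV values are sorted in ascending order and the n% of windows with the largest CV are removed, where n may be any value greater than 0 and less than or equal to 5.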
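Both removal methods can be sketched briefly. The code below is an illustration under stated assumptions, not the applicant's implementation: intervals are half-open, the masked-region coordinates would come from an annotation file that the text does not name, the number of dropped windows is rounded down, and the population standard deviation is used. All names are hypothetical.

```python
from statistics import mean, pstdev


def overlaps_masked(start, end, masked_regions, flank):
    """Method a: True if window [start, end) touches a masked region or its flank.

    masked_regions: (start, end) pairs on the same chromosome, e.g. a
    centromere; flank corresponds to the length L in the text.
    """
    return any(start < r_end + flank and r_start - flank < end
               for r_start, r_end in masked_regions)


def windows_to_keep_by_cv(reference, n_percent):
    """Method b: indices of windows kept after dropping the top n% by CV.

    reference: list of normal samples, each a list of copy numbers per window.
    """
    n_windows = len(reference[0])
    cv = []
    for i in range(n_windows):
        column = [normal[i] for normal in reference]
        mu = mean(column)
        cv.append(pstdev(column) / mu if mu > 0 else float("inf"))
    ranked = sorted(range(n_windows), key=lambda i: cv[i])  # ascending CV
    n_drop = int(n_windows * n_percent / 100)  # assumption: round down
    return sorted(ranked[:n_windows - n_drop])
```

Windows with a high CV vary strongly even among normal samples, so removing them keeps noisy regions from inflating the score.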
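In step (e), the whole genomic abnormality score (WGAS) is specifically calculated as follows:
First, the scoring range is determined. It includes but is not limited to the whole genome, a specific chromosome, a specific chromosomal segment, or a specific gene, and may be any size from 1 M up to the genome length (about 3 G for the human genome). Within the scoring range, the absolute values of the Z values of the windows that remain after repeat-affected windows are removed are taken and sorted in ascending order, and the sorted absolute Z values are distributed evenly over the range 0%-100%, with the smallest absolute Z value assigned to 0% and the largest assigned to 100%. The cumulative sum of the absolute Z values of the windows in the range from m% to p% is then calculated, where m is 30-98, preferably 40-97, more preferably 60-96, most preferably 80-95, optimally 95; p is 80-100, preferably 85-100, more preferably 90-100, most preferably 100; and p-m ≥ 2 (preferably ≥ 5, more preferably ≥ 10, still more preferably ≥ 15, most preferably ≥ 20). This cumulative sum is the whole genomic abnormality score (WGAS), calculated as: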
$$\mathrm{WGAS}=\sum_{b=\mathrm{mb}}^{\mathrm{pb}}\left|Z_b\right|$$
mb is the window ranked at m% and pb is the window ranked at p%. The WGAS value is used to assess the tumor burden in the body fluid.
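The ranking-and-summing procedure can be sketched as below. This is one reading of the description, offered as an assumption: with K windows, the window at ascending rank k (counting from 0) is placed at percentile 100·k/(K−1), and windows whose percentile lies in [m, p] are summed. The function name and defaults (m = 95, p = 100, the optimal values given in the text) are illustrative.

```python
def wgas(z_values, m=95.0, p=100.0):
    """Sum of |Z| over windows ranked between the m-th and p-th percentile.

    z_values: Z values of the windows remaining after masking.
    Assumption: ranks are spread evenly so the smallest |Z| sits at 0%
    and the largest at 100%.
    """
    ranked = sorted(abs(z) for z in z_values)
    k = len(ranked)
    if k < 2:
        raise ValueError("need at least two windows to assign percentiles")
    total = 0.0
    for rank, value in enumerate(ranked):
        percentile = 100.0 * rank / (k - 1)
        if m <= percentile <= p:
            total += value
    return total
```

Because only the upper tail of |Z| is summed, a sample with a few strongly aberrant windows scores much higher than a sample whose windows all sit near the reference mean.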
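Whole genomic abnormality score (WGAS)
A score calculated from copy-number abnormalities of chromosomes or chromosomal segments across the sample genome; the scope of scoring includes but is not limited to the whole genome, specific chromosomes, chromosomal segments, and specific genes.
Method for auxiliary diagnosis and/or prognostic evaluation of urinary system tumors
The present invention also provides a method for the auxiliary diagnosis and/or prognostic evaluation of urinary system tumors, the method comprising the steps of:
(i) providing a test sample;
(ii) subjecting the test sample to Malbac-L amplification and sequencing to obtain the genome sequences of the sample;
(iii) aligning the genome sequences obtained in step (ii) to a reference genome to obtain the positions of the genome sequences on the reference genome;
(iv) dividing the reference genome into M region segments, each region segment being a window b, and calculating the copy number of each window b;
(v) performing a Z-test on each window b of step (iv) to calculate the Z value of each window b;
(vi) calculating the whole genomic abnormality score (WGAS) from the Z values obtained in step (v); and
(vii) performing auxiliary diagnosis and/or prognostic evaluation of urinary system tumors on the basis of the whole genomic abnormality score (WGAS).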
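Auxiliary diagnostic apparatus for the urinary system
The present invention also provides an auxiliary diagnostic apparatus for the urinary system, comprising:
a Malbac-L amplification unit (device or module);
a sequencing unit (device or module); and
a whole genomic abnormality scoring unit (device or module), wherein the whole genomic abnormality scoring unit (device or module) performs steps (iii)-(vi) of the first aspect of the invention and outputs the resulting whole genomic abnormality score.
Genetic testing method for the urinary system
The present invention also provides a genetic testing method for the urinary system, comprising the steps of:
(i) providing a test sample;
(ii) subjecting the test sample to Malbac-L amplification and sequencing to obtain the genome sequences of the sample;
(iii) aligning the genome sequences obtained in step (ii) to a reference genome to obtain the positions of the genome sequences on the reference genome;
(iv) dividing the reference genome into M region segments, each region segment being a window b, and calculating the copy number of each window b;
(v) performing a Z-test on each window b of step (iv) to calculate the Z value of each window b;
(vi) calculating the whole genomic abnormality score (WGAS) from the Z values obtained in step (v); and
(vii) taking the whole genomic abnormality score (WGAS) obtained in step (vi) as the urinary system genetic test result.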
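The main advantages of the present invention include:
(i) The invention aims to reduce the number of operating steps in tumor detection and diagnosis, raise the throughput of non-invasive tumor detection and diagnosis, lower the cost of testing, and improve sensitivity.
(ii) The gene copy-number detection method of the invention omits the DNA extraction step, which simplifies the workflow relative to existing second-generation sequencing techniques; and because the invention enables gene copy-number detection at the single-cell level, samples with a low starting input can be tested.
(iii) The amplification products obtained with the Malbac-L amplification method can only derive from the original template, so the constructed library more fully reflects copy-number changes in the sample and detection sensitivity is higher.
(iv) The invention is the first to combine Malbac-L amplification with the whole genomic abnormality score (WGAS), enabling effective and accurate auxiliary diagnosis or prognostic evaluation of urinary system tumors.
The invention is further described below with reference to specific examples. It should be understood that these examples serve only to illustrate the invention and do not limit its scope. Experimental methods in the following examples for which detailed conditions are not given generally follow conventional conditions, such as those described in Sambrook et al., Molecular Cloning: A Laboratory Manual (New York: Cold Spring Harbor Laboratory Press, 1989), or the conditions recommended by the manufacturer. Unless otherwise stated, percentages and parts are by weight.
Unless otherwise specified, the materials used in the examples are commercially available products.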
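Example 1: Detection of chromosomal aneuploidy in tissue samples and urine samples from bladder cancer patients
Libraries were constructed separately from tissue samples and urine samples of bladder cancer patients, then sequenced and analyzed, and the sequencing results were compared. Tissue libraries were built by conventional fragmentation after genomic DNA extraction. Urine samples were processed by the method of the present invention, as follows:
1. Tissue samples:
1.1 Tissue gDNA extraction: in this example, genomic DNA was extracted from tissue samples by column-based extraction using a universal column-based genomic DNA extraction kit, and the extracted genomic DNA was quantified with Qubit.
1.2 Library construction: 500 ng of genomic DNA was sheared to an average fragment length of 200 bp with a Covaris DNA shearing instrument. The library was constructed with the NGS Fast DNA Library Prep Set for Illumina, a rapid DNA library preparation kit for second-generation sequencing, and was quantified by qPCR after purification and recovery.
1.3 Sequencing: semiconductor sequencing on a DA8600 sequencer.
2. Urine samples
2.1 Collecting the urine sediment
Urine samples of 10 ml were collected from normal individuals and from bladder cancer patients at a hospital, with midstream first-morning urine preferred. The urine was centrifuged at 500 rpm and 4 °C for 10 min, and the sediment was collected, washed twice with 200 µl of 1×PBS, and finally resuspended in 100 µl of 1×PBS.
2.2 Lysis of the urine sediment
5 µl of the resuspended urine sediment obtained in 2.1 was mixed with 5 µl of lysis buffer (40 mM Tris-Cl at pH 7.4, 1 mM EDTA, 15 mM KCl, and 3% Triton X-100) and lysed enzymatically by adding proteinase K, according to the following program:
2.3 First, linear amplification of the lysate from 2.2
The linear amplification reagents include: primer mixture 1 (comprising 5'-GAGGTGTGATGGADDDDDGGG-3' (SEQ ID NO.: 1) and 5'-GAGGTGTGATGGADDDDDTTT-3' (SEQ ID NO.: 2)), dNTPs, a thermostable DNA polymerase with strand-displacement activity, and linear amplification reaction buffer.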
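The letter D in these primers is an IUPAC degenerate base code (A, G, or T), so each primer name stands for a pool of sequences. The sketch below, with hypothetical function names, counts and enumerates the concrete sequences a degenerate primer represents; it illustrates the notation only and says nothing about how the primers are synthesized.

```python
from itertools import product

# Standard IUPAC nucleotide codes used in the primer descriptions.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "B": "CGT",   # not A
    "D": "AGT",   # not C
    "H": "ACT",   # not G
    "V": "ACG",   # not T
    "N": "ACGT",  # any base
}


def count_variants(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    total = 1
    for code in primer:
        total *= len(IUPAC[code])
    return total


def expand(primer: str) -> list:
    """Enumerate every concrete sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[code] for code in primer))]


seq1 = "GAGGTGTGATGGADDDDDGGG"  # SEQ ID NO.: 1 from the text above
print(count_variants(seq1))     # five D positions with three choices each
```

Each of the five D positions admits three bases, so SEQ ID NO.: 1 covers 3^5 = 243 sequences, all sharing the fixed 5' portion and the GGG 3' end.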
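Linear amplification program:
[Thermal cycling program table: image PCTCN2017073778-appb-000012]
followed by a low-temperature hold.
2.4 Second, exponential amplification of the first-round amplification product from 2.3
The exponential amplification reagents include: primer mixture 2 (5'-CCATCTCATCCCTGCGTGTCTCCGACTCAGCTAAGGTAACGATGAGGTGTGATGGA-3' (SEQ ID NO.: 3); 5'-CCACTACGCCTCCGCTTTCCTCTCTATGGGCAGTCGGTGATGAGGTGTGATGGA-3' (SEQ ID NO.: 4)), dNTPs, a thermostable DNA polymerase with strand-displacement activity, and exponential amplification reaction buffer.
Exponential amplification thermal cycling program:
[Thermal cycling program table: image PCTCN2017073778-appb-000013]
followed by a low-temperature hold.
Completing the above steps completes library construction; the purified library is stored at -20 °C.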
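2.5 Sequencing
Library concentration was measured by qPCR and the library dilution factor was calculated by formula. Sequencing clusters were generated by bridge PCR to form sequencing templates, and the templates were sequenced on a sequencing-by-synthesis platform to obtain the base sequence of each DNA fragment.
The library dilution factor is calculated as: dilution factor = pooled library concentration (nM) × 1000 / loading concentration.
2.6 Data analysis
The base sequences of the sequenced DNA fragments were mapped to the human reference genome, and chromosome copy-number information was obtained by comparison with a reference set built from a large number of normal samples.
The chromosome copy-number information of the tissue samples and urine samples was then compared.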
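The second-generation sequencing data showed that, in sample A, both the conventional tissue-based method (A1 in Figure 2) and the rapid non-invasive tumor detection method (A2 in Figure 2) detected abnormalities in multiple chromosomes, whereas in sample B neither the conventional tissue-based method (B1 in Figure 2) nor the rapid non-invasive tumor detection method (B2 in Figure 2) showed obvious chromosomal abnormalities, indicating normal chromosomes.
These results show that the conventional tissue-based method and the rapid non-invasive tumor detection method give essentially concordant results for patients with urinary system tumors (especially bladder cancer).
Example 2: Whole genomic abnormality score (WGAS) of urine samples
Urine samples of 10 ml each were collected from bladder cancer patients, normal individuals, and patients with non-neoplastic urinary tract lesions, with midstream first-morning urine preferred. The urine was centrifuged at 500 rpm and 4 °C for 10 min, and the sediment was collected, washed twice with 200 µl of 1×PBS, and finally resuspended in 100 µl of 1×PBS.
The collected samples were lysed, subjected to the first (linear) amplification and the second (exponential) amplification, and sequenced on a sequencing-by-synthesis platform, following the urine sample procedure of Example 1.
The genome sequences of the sequenced samples were aligned to the reference genome to obtain the positions of the sequences on the reference genome. The reference genome was divided into windows of a certain length, a Z-test was performed on the copy number of each window, and the whole genomic abnormality score (WGAS) was calculated from the Z value of each window. The score of each sample is shown in Figure 3.
The results show that the method of the present invention can effectively distinguish samples from bladder cancer patients from samples from non-bladder-cancer patients, further confirming the effectiveness of the non-invasive detection method of the invention as an auxiliary diagnostic for bladder cancer.
All documents mentioned in the present invention are incorporated by reference in this application as if each were individually incorporated by reference. It should further be understood that, after reading the above teachings, those skilled in the art may make various changes or modifications to the invention, and such equivalents likewise fall within the scope defined by the claims appended to this application.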

Claims (10)

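  1. An auxiliary diagnostic method for urinary system tumors, characterized in that the method comprises the steps of:
    (i) providing a test sample;
    (ii) subjecting the test sample to Malbac-L amplification and sequencing to obtain the genome sequences of the sample;
    (iii) aligning the genome sequences obtained in step (ii) to a reference genome to obtain the positions of the genome sequences on the reference genome;
    (iv) dividing the reference genome into M region segments, each region segment being a window b, and calculating the copy number of each window b;
    (v) performing a Z-test on each window b of step (iv) to calculate the Z value of each window b;
    (vi) calculating the whole genomic abnormality score (WGAS) from the Z values obtained in step (v); and
    (vii) performing auxiliary diagnosis and/or prognostic evaluation of urinary system tumors on the basis of the whole genomic abnormality score (WGAS).
  2. The method of claim 1, characterized in that the sample is selected from the group consisting of: blood, plasma, interstitial fluid, lymph, cerebrospinal fluid, urine, saliva, aqueous humor, semen, gastrointestinal secretions, or a combination thereof.
  3. The method of claim 1, characterized in that step (iv) further comprises correcting the copy number of each window b and calculating the corrected copy number of each window b.
  4. The method of claim 1, characterized in that the whole genomic abnormality score is calculated with the following formula: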
    $$\mathrm{WGAS}=\sum_{b=\mathrm{mb}}^{\mathrm{pb}}\left|Z_b\right|$$
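    where mb is the window ranked at m% and pb is the window ranked at p%; m is 30-98, preferably 40-97, more preferably 60-96, most preferably 80-95, optimally 95; p is 80-100, preferably 85-100, more preferably 90-100, most preferably 100; and p-m ≥ 2 (preferably ≥ 5, more preferably ≥ 10, still more preferably ≥ 15, most preferably ≥ 20).
  5. The method of claim 1, characterized in that the following steps are performed before step (v):
    (iv1) from the copy number of each window b obtained in step (iv), calculating the coefficient of variation CVi of each window b across the normal control samples; and
    (iv2) sorting the CVi values in ascending order and removing the n% of windows with the largest CVi, where n is any value greater than 0 and less than or equal to 5, preferably n = 1, 2, 2.5, 3, 3.1, 4, 4.2, or 5.
  6. An auxiliary diagnostic apparatus for the urinary system, characterized in that it comprises:
    a Malbac-L amplification unit;
    a sequencing unit; and
    a whole genomic abnormality scoring unit, wherein the whole genomic abnormality scoring unit performs steps (iii)-(vi) of claim 1 and outputs the resulting whole genomic abnormality score.
  7. The apparatus of claim 6, characterized in that the apparatus further comprises a sample pretreatment unit.
  8. The apparatus of claim 7, characterized in that the sample pretreatment unit is used to subject the test sample to sedimentation and/or lysis.
  9. The apparatus of claim 6, characterized in that the sequencing unit comprises a second-generation sequencer and/or a third-generation sequencer.
  10. A genetic testing method for the urinary system, characterized in that it comprises:
    (i) providing a test sample;
    (ii) subjecting the test sample to Malbac-L amplification and sequencing to obtain the genome sequences of the sample;
    (iii) aligning the genome sequences obtained in step (ii) to a reference genome to obtain the positions of the genome sequences on the reference genome;
    (iv) dividing the reference genome into M region segments, each region segment being a window b, and calculating the copy number of each window b;
    (v) performing a Z-test on each window b of step (iv) to calculate the Z value of each window b;
    (vi) calculating the whole genomic abnormality score (WGAS) from the Z values obtained in step (v); and
    (vii) taking the whole genomic abnormality score (WGAS) obtained in step (vi) as the urinary system genetic test result.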
PCT/CN2017/073778 2017-02-16 2017-02-16 Auxiliary diagnostic method for urinary system tumors WO2018148903A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/073778 WO2018148903A1 (zh) 2017-02-16 2017-02-16 Auxiliary diagnostic method for urinary system tumors

Publications (1)

Publication Number Publication Date
WO2018148903A1 true WO2018148903A1 (zh) 2018-08-23

Family

ID=63169130

Country Status (1)

Country Link
WO (1) WO2018148903A1 (zh)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012059563A1 (fr) * 2010-11-03 2012-05-10 Vivatech Procede d'analyse genomique
CN104004817A (zh) * 2013-02-22 2014-08-27 哈佛大学 利用极体或胚胎的单细胞基因组测序来选择试管婴儿的胚胎
WO2014130589A1 (en) * 2013-02-20 2014-08-28 Bionano Genomics, Inc. Characterization of molecules in nanofluidics
US20140336075A1 (en) * 2011-12-17 2014-11-13 Bgi Diagnosis Co., Ltd. Method and system for determinining whether genome is abnormal
CN105385755A (zh) * 2015-11-05 2016-03-09 上海序康医疗科技有限公司 一种利用多重pcr技术进行snp-单体型分析的方法
CN105543339A (zh) * 2015-11-18 2016-05-04 上海序康医疗科技有限公司 一种同时完成基因位点、染色体及连锁分析的方法
CN105925675A (zh) * 2016-04-26 2016-09-07 序康医疗科技(苏州)有限公司 扩增dna的方法
CN106367512A (zh) * 2016-09-22 2017-02-01 上海序康医疗科技有限公司 一种鉴定样本中肿瘤负荷的方法和系统

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17897069

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17897069

Country of ref document: EP

Kind code of ref document: A1