WO2013037118A1 - Prostate cancer biomarkers, therapeutic targets and uses thereof - Google Patents

Info

Publication number
WO2013037118A1
WO2013037118A1 PCT/CN2011/079709 CN2011079709W WO2013037118A1 WO 2013037118 A1 WO2013037118 A1 WO 2013037118A1 CN 2011079709 W CN2011079709 W CN 2011079709W WO 2013037118 A1 WO2013037118 A1 WO 2013037118A1
Authority
WO
WIPO (PCT)
Prior art keywords
prostate cancer
gene
primers
fusion
long
Application number
PCT/CN2011/079709
Other languages
French (fr)
Chinese (zh)
Inventor
孙颖浩
彭智宇
任善成
易康
毛建华
张纪斌
Original Assignee
上海长海医院
深圳华大基因科技有限公司
Application filed by 上海长海医院, 深圳华大基因科技有限公司
Priority to CN201180073445.7A (patent CN103797120B)
Priority to PCT/CN2011/079709 (patent WO2013037118A1)
Publication of WO2013037118A1

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53 Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/574 Immunoassay; Biospecific binding assay; Materials therefor for cancer
    • G01N33/57407 Specifically defined cancers
    • G01N33/57434 Specifically defined cancers of prostate

Definitions

  • The invention relates to the field of cancer, in particular prostate cancer.
  • The present invention relates to the use of next-generation sequencing techniques to find biomarkers for the diagnosis, prognosis and prediction of therapeutic response, and drug targets for the effective treatment of prostate cancer, particularly biomarkers for prostate cancer.
  • RNA-Seq, a transcriptome sequencing technique, is used to analyze the transcriptomes of prostate cancer tissues and adjacent normal tissues, revealing a complete transcriptional map of prostate cancer in Chinese patients.
  • In developed countries, prostate cancer remains the cancer with the highest incidence, and it ranks second among cancer-related deaths in men. The incidence of prostate cancer is increasing worldwide, but it varies widely among countries and races. The highest incidence is in Western countries, such as the United States; the lowest is in East Asian countries, such as China, and this difference may be partly due to genetic differences between ethnic groups.
  • Prostate cancer is a heterogeneous disease. Tumors vary greatly in their evolution and biological behavior (such as tumor dormancy, local growth, distant spread, response to treatment, and recurrence). Therefore, patients with the same histopathologic grade and stage, the same Gleason score and the same treatment regimen may have very different clinical outcomes and histories of tumor progression. In some patients the tumor is dormant and confined to the prostate, and the patient can survive for more than 10 years, while other patients die from distant metastasis 2-3 years after diagnosis. Evidence suggests that the heterogeneity in the clinical behavior of prostate cancer is caused by differences in the underlying molecular mechanisms during tumor progression.
  • NGS: next-generation sequencing
  • NGS data allow genomes to be analyzed from multiple perspectives, such as mutations, transcription, structural variation, and post-transcriptional regulation (such as methylation).
  • RNA-Seq: transcriptome sequencing technology
  • USP9Y-TTTY15, CTAGE5-KHDRBS3, RAD50-PDLIM4 and SDK1-AMACR: four high-frequency fusion genes not previously reported in the literature; dozens of other fusion genes were also found (see Table 1 below).
  • Fusion genes that are expressed in cancer tissue but not in adjacent or normal tissue are highly specific prostate cancer markers. They can be detected by real-time PCR in blood and urine, and by FISH in prostate biopsy tissue and postoperative tissue, for early diagnosis, molecular typing and prognosis of patients with prostate cancer; fusion genes can also be used as targets for targeted therapy.
  • Long-chain non-coding RNAs were found to be involved in transcriptional regulation.
  • 23 long-chain non-coding RNAs are significantly associated with hundreds of genes across the whole genome, while most other long-chain non-coding RNAs are related to only a few genes or to none at all. This suggests that long-chain non-coding RNA may have functions other than transcriptional regulation, such as regulation at the post-transcriptional level.
  • Almost all long-chain non-coding RNAs are positively correlated with gene expression, suggesting that these long-chain non-coding RNAs may promote gene expression.
  • Four long-chain non-coding RNAs were selected (two known: DD3 and MALAT1; two newly found: FR257520 and FR348383), and their expression levels were examined by qRT-PCR in two groups of prostate specimens. The first group was 40 pairs of prostate cancer tissues and their matched adjacent normal tissues, and the second group was 15 normal human prostate tissues and 15 prostate cancer tissues. There is a strong correlation between the qRT-PCR and RNA-Seq results. Consistent with the RNA-Seq results, PCA3, MALAT1 and FR348383 were overexpressed in most prostate cancer specimens, while FR257520 expression was decreased. The PCA3 overexpression is consistent with earlier reports proposing it as a new diagnostic marker, but we are the first to find that MALAT1, FR257520 and FR348383 differ significantly between prostate cancer and normal prostate.
  • Prostate cancer mutation spectrum: we found an average of 1725 point mutations in prostate cancer tissue. However, only a small fraction (on average 1.5%) is located in the coding regions of genes. Interestingly, some point mutations are located in long-chain non-coding RNA. The vast majority of mutations (91.7%) are T:A to C:G substitutions. A plausible explanation is that these point mutations arise from RNA editing, which converts adenosine to inosine; inosine is read as guanosine during translation. This results in a change at a particular RNA nucleotide.
  • The validation rates of the RNA-Seq mutations were 96.7% (cDNA level) and 90% (genome level), respectively.
  • DNA is extracted from prostate biopsy tissue or postoperative tissue, amplified by PCR and sent for sequencing to detect the presence of SNPs and point mutations. The results are used for molecular prophylaxis, as drug treatment targets, and to judge the prognosis of prostate cancer patients.
  • The 194 mutations in 183 genes provided by the present invention are shown in Table 3, and the 30 preferred gene mutations are shown in Table 8. Table 3. Prostate cancer-specific gene mutations
  • AS: alternative splicing
  • CDK11B 984 chr1 1637645-1637775 1633563-1633726 1633563-1633699
  • TRPT1 83707 chr11 63748591-63748765 63748843-63749018 63748849-63749018
  • Figure 1 Flow chart of systemic tumor transcriptome analysis.
  • Figure 2 Schematic diagram of the fusion gene.
  • Figure 2c is a schematic diagram of the CTAGE5-KHDRBS3 fusion gene; the 23rd exon of CTAGE5 is fused to the 8th exon of KHDRBS3;
  • Figure 2d is a schematic diagram of the TMPRSS2-ERG fusion gene; the first exon of TMPRSS2 is fused to the fourth exon of ERG;
  • Figure 2e shows the frequency of occurrence of the five fusion genes.
  • Figure 3 Schematic diagram of the fusion gene.
  • Figure 3a is a schematic diagram of the USP9Y-TTTY15 fusion; the third exon of USP9Y is fused to the fourth exon of TTTY15;
  • Figure 3b shows the RT-PCR result for USP9Y-TTTY15.
  • Figure 4 Schematic diagram of the fusion gene.
  • Figure 4a shows the RT-PCR and Sanger sequencing results for the RAD50-PDLIM4 fusion gene;
  • Figure 4b shows the RT-PCR and Sanger sequencing results for the SDK1-AMACR fusion gene.
  • Figure 5c shows the differential expression of the long-chain non-coding RNAs DD3, MALAT1, FR0257520 and FR0348383 in 40 pairs of cancer and adjacent tissues;
  • Figure 5d shows the difference in expression of the long-chain non-coding RNAs DD3, MALAT1, FR0257520 and FR0348383 between prostate cancer and benign prostatic hyperplasia.
  • The invention provides a biological marker for prostate cancer, comprising one or more of the fusion genes shown in Table 1, the long-chain non-coding RNAs shown in Table 2, the gene mutations shown in Table 3, and the alternative splicing events shown in Table 4.
  • The biological marker of the present invention can further be used as an early diagnostic marker for prostate cancer, a marker for judging the effectiveness of drug treatment, or a patient prognosis marker.
  • The fusion gene comprises one or more of the 83 fusion genes of Table 6, preferably including the 35 fusion genes underlined in Table 6.
  • One or more of these fusion genes are particularly preferred.
  • The fusion gene comprises one or more of USP9Y-TTTY15, CTAGE5-KHDRBS3, RAD50-PDLIM4 and SDK1-AMACR; preferably, these fusion genes are amplified using the primers described in Table 5.
  • The long-chain non-coding RNA comprises one or more of DD3, MALAT1, FR0257520 and FR0348383; preferably, these long-chain non-coding RNAs are amplified using the primers described in Table 7.
  • The gene mutation comprises one or more of the 30 gene mutations shown in Table 8; preferably, these 30 gene mutations are amplified using the primers described in Table 9.
  • The alternative splicing comprises that of PSA or AMACR; preferably, the alternatively spliced PSA or AMACR is amplified using the primers described in Table 10.
  • Another aspect of the present invention provides the use of the biological marker for diagnosing prostate cancer or as a target of drugs for treating prostate cancer, in particular its use as an early diagnostic marker for prostate cancer, a marker for judging the effectiveness of drug treatment, or a patient prognosis marker.
  • The invention further provides a primer for amplifying the biological marker, or a probe for the biological marker, for use in preparing a reagent for diagnosing prostate cancer.
  • The primer can be used to specifically amplify the biological marker, and the probe binds specifically to the biological marker, thereby indicating the presence of the biological marker.
  • A primer for amplifying the biological marker is provided, wherein the primer preferably comprises: the primers described in Table 5, used to amplify the fusion genes USP9Y-TTTY15, CTAGE5-KHDRBS3, RAD50-PDLIM4 and SDK1-AMACR; the primers shown in Table 7, used to amplify the long-chain non-coding RNAs DD3, MALAT1, FR0257520 and FR0348383; the primers shown in Table 9, used to amplify the 30 gene mutations shown in Table 8; and the primers shown in Table 10, used to amplify the alternatively spliced PSA or AMACR.
  • the use of the primers described in Table 5 for the preparation of a medicament for the diagnosis of prostate cancer is provided.
  • the use of the primers shown in Table 7 for the preparation of a medicament for diagnosing prostate cancer is provided.
  • the use of the primers shown in Table 9 for the preparation of a medicament for diagnosing prostate cancer is provided.
  • The 14 pairs of prostate cancer tissues and adjacent normal tissues used for RNA-Seq were taken from Shanghai Changhai Hospital. Of the 54 pairs of samples used for gene fusion verification, 23 pairs came from Shanghai Changhai Hospital, 17 pairs from Jiangsu Provincial Hospital, and 14 pairs from the Third Affiliated Hospital of Sun Yat-sen University. A group of 40 pairs of prostate cancer and adjacent tissues used for validating alternative splicing and long-chain non-coding RNA was taken from Shanghai Changhai Hospital. Another group of 15 tumor samples and 15 BPH (benign prostatic hyperplasia) samples used for long-chain non-coding RNA validation was taken from Jiangsu Provincial Hospital and Shanghai Changhai Hospital. The RNA-Seq protocol and its follow-up experiments were approved by the ethics committees of the three hospitals. All patients gave written informed consent and authorized us to use their samples.
  • Cancer tissues and adjacent normal tissues were subjected to HE staining (hematoxylin-eosin staining) and examined by the study pathologist to ensure that the cancer cell density of the selected cancer tissue exceeded 80% and that the adjacent normal tissue contained no cancer tissue. All pathological samples were reviewed by a second pathologist. Where their conclusions differed, the two pathologists discussed the case jointly to reach a conclusion.
  • Oligo(dT) magnetic beads were used to isolate poly(A) mRNA from total RNA.
  • The purified mRNA was fragmented with fragmentation buffer. Using these short fragments as templates, the first strand of cDNA was synthesized with random hexamer primers.
  • The second strand of cDNA was synthesized using buffer, dNTPs, RNase H and DNA polymerase I.
  • The short double-stranded cDNA fragments were purified using the QIAquick PCR extraction kit (Qiagen) and eluted with EB buffer, followed by end repair and addition of an "A" base. The short fragments were then ligated to the Illumina sequencing adaptors.
  • DNA of the target fragment size was purified by gel excision for PCR amplification.
  • The amplified library was sequenced on an Illumina HiSeq 2000.
  • The cDNA library was constructed using Illumina's mRNA-Seq 8-Sample Prep Kit (Cat. No.: RS-100-0801). The specific protocol was as follows: oligo(dT) magnetic beads were used to isolate poly(A) mRNA from total RNA. The purified mRNA was fragmented with fragmentation buffer. Using these short fragments as templates, the first strand of cDNA was synthesized with random hexamer primers. The second strand of cDNA was synthesized using buffer, dNTPs, RNase H and DNA polymerase I.
  • Short double-stranded cDNA fragments were purified using the QIAquick PCR extraction kit (Qiagen) and eluted with EB buffer, followed by end repair and addition of an "A" base. The short fragments were then ligated to the Illumina sequencing adaptors.
  • DNA of the target fragment size was purified by gel excision for PCR amplification.
  • The quality of the cDNA library was checked using an Agilent 2100 Bioanalyzer and a StepOnePlus real-time quantitative PCR instrument (acceptance criteria: PCR amplification product size of 322 ± 20 bp, with a short fragment size of 200 ± 20 bp, and a library molar concentration of not less than 1.3 nM), and the amplified library was sequenced on an Illumina HiSeq 2000.
  • The images generated by the sequencer are converted to base calls by the control software supplied with the sequencer.
  • The raw sequences are stored in FASTQ format. Low-quality ("dirty") reads are removed before the data are analyzed. We use three criteria to remove dirty reads:
  • The reads were mapped to the human genome and transcriptome.
  • The cleaned reads were aligned to the genome and the transcriptome, respectively, using the Short Oligonucleotide Analysis Package aligner SOAP2 (Li R, Yu C, Li Y, Lam TW, Yiu SM, et al. (2009) SOAP2: an improved ultrafast tool for short read alignment. Bioinformatics 25: 1966-1967). No more than three mismatches were allowed per read.
  • Reads that can be mapped to a particular gene are used to calculate expression levels.
  • The level of gene expression is measured in RPKM: the number of reads from a gene per kilobase of gene length per million reads. The formula is as follows: RPKM = (10^6 × C) / (N × L / 10^3)
  • C is the number of reads mapped to the selected gene
  • N is the total number of reads mapped to all genes
  • L is the total exon length of the selected gene, in bases.
  • The RPKM method eliminates the effects of differences in gene length and sequencing depth on the calculation of gene expression. RPKM values can therefore be used directly to compare gene expression between samples.
  • the shorter segment length is not shorter than 8 bp
  • TMPRSS2-ERG was verified by RT-PCR and sequencing. We validated the gene fusions obtained by RNA-Seq at the transcript level. We designed gene fusion-specific PCR primers. After PCR and agarose gel electrophoresis, all RT-PCR amplified fragments were gel-purified (Qiagen QIAquick Gel Extraction kit) and Sanger sequenced. In this way, we verified five fusion genes: TMPRSS2-ERG, USP9Y-TTTY15, SDK1-AMACR, CTAGE5-KHDRBS3 and RAD50-PDLIM4. The four fusion genes other than TMPRSS2-ERG were newly discovered by the inventors.
  • the four newly discovered fusion genes are:
  • the uppercase letters indicate the sequence of the first gene, and the lowercase letters indicate the sequence of the second gene.
  • the amplification primers for these five fusion genes are shown in Table 5 below.
  • The PCR conditions were: 95 °C for 10 seconds; 60 °C for 30 seconds; 90 seconds; 38-43 cycles.
  • PCR products were purified using the PCR Cleanup Kit 50-prep (AXYGEN, Cat No. AP-PCR-50, Lot No. KB10101204-G). The PCR products were then run on a 2% agarose gel, and the bands were recovered using the DNA Gel Extraction Kit 50-prep (AXYGEN, Cat No. AP-GX-50, Lot No. KE10101204-G).
  • Electrophoresis images of the fusion genes are shown in Figure 2d (TMPRSS2-ERG and CTAGE5-KHDRBS3), Figures 3a and 3b (USP9Y-TTTY15), Figure 4a (RAD50-PDLIM4) and Figure 4b (SDK1-AMACR).
  • RNA from all samples was first extracted and reverse transcribed into cDNA.
  • the RT-PCR primers were identical to the validation primers described above.
  • the cDNA of the sequenced sample was used as a positive control.
  • Prostate cancer gene fusion map: transcriptome sequencing was first used to detect gene fusions in prostate cancer. Using paired-end reads, we found a total of 84 gene fusions. In addition to the well-known TMPRSS2-ERG gene fusion, we found 83 new gene fusions that were not reported in previous studies of Caucasian populations. 35 new gene fusions and one previously known gene fusion were found only in prostate cancer tissues and not in matched normal tissues (see the underlined fusion genes). Some fusion genes were also expressed in the normal tissues adjacent to the cancer (see the entries in bold); their biological significance is currently unknown. The following four fusion genes were found in both cancer and adjacent tissue.
  • KLK2 KLK3 chr19 chr19 56072113 56055040 fwd,fwd. A gene fusion expressed only in cancer is defined as a tumor-specific gene fusion.
  • The number of gene fusions per cancer tissue sample ranged from 1 to 6.
  • The 83 new gene fusions are shown in Table 6, with the 35 new tumor-specific gene fusions underlined.
  • USP9Y-TTTY15 located on the Y chromosome. USP9Y encodes a protein similar to a ubiquitin-specific protease, while TTTY15 is a non-coding RNA. The deletion or mutation of the USP9Y gene is associated with male infertility. However, previous studies have not revealed that these two genes are involved in tumorigenesis.
  • long-chain non-coding RNA and genes were used to calculate the correlation coefficient R.
  • qRT-PCR validation of long-chain non-coding RNA: we performed qRT-PCR on an Applied Biosystems StepOnePlus using Power SYBR Green Master Mix reagent, with GAPDH primers as the internal reference. As described above, one group of 40 pairs of prostate cancer and adjacent tissues was taken from Shanghai Changhai Hospital, and another group of 15 tumor samples and 15 BPH samples was taken from Jiangsu Provincial Hospital and Shanghai Changhai Hospital for long-chain non-coding RNA validation.
  • PCA3 (also known as DD3), MALAT1 and FR0348383 were overexpressed, while FR0257520 expression decreased (Fig. 5).
  • The PCA3 overexpression is consistent with earlier reports proposing it as a new diagnostic marker, but we are the first to find that MALAT1 is frequently overexpressed in prostate cancer.
  • The invention provides 137 long-chain non-coding RNAs for diagnosis, for judging patient prognosis and drug response, and as therapeutic targets; see Table 2.
  • Example 4. Discovery and validation of single nucleotide polymorphisms and point mutations
  • Table 8. The 30 mutations that have been verified. The rightmost columns give the template used (cDNA and DNA); S indicates success and F indicates failure.
  • the present invention provides 183 mutations which can be used as diagnostic markers, prognostic judgments, drug efficacy judgments, and therapeutic targets. See Table 3 for details.
  • Alternative splicing was verified by RT-PCR.
  • The PCR conditions are: seconds; 60 °C for 30 seconds; 72 °C for 90 seconds; 33-36 cycles.
  • The primers for the two genes are as follows:
  • The invention provides the tumor-specific alternative splicing events shown in Table 4, which can be used as diagnostic markers in blood, urine and tissue, as well as for prognosis and treatment.
  • the marker can also be used as a target for cancer treatment.
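
The RPKM expression measure defined in the list above (C reads mapped to a gene, N reads mapped to all genes, L total exon length in bases) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the inventors' pipeline: the function name `rpkm` and the example read counts are hypothetical, and the formula used is the standard RPKM definition, 10^6 × C / (N × L / 10^3).

```python
def rpkm(gene_reads: int, total_reads: int, exon_length_bp: int) -> float:
    """Reads per kilobase of exon per million mapped reads.

    gene_reads      -- C, reads mapped to the selected gene
    total_reads     -- N, reads mapped to all genes
    exon_length_bp  -- L, total exon length of the gene in bases
    """
    if total_reads <= 0 or exon_length_bp <= 0:
        raise ValueError("total_reads and exon_length_bp must be positive")
    # 10^6 * C / (N * L / 10^3) simplifies to 10^9 * C / (N * L)
    return gene_reads * 1e9 / (total_reads * exon_length_bp)


# Hypothetical numbers: 500 reads on a 2 kb gene in a 20-million-read library.
example = rpkm(500, 20_000_000, 2_000)

# The same read count on a gene twice as long gives half the RPKM,
# which is the length normalization the text describes.
longer_gene = rpkm(500, 20_000_000, 4_000)
```

Because both the library size and the gene length appear in the denominator, values computed this way are comparable across genes of different lengths and across samples sequenced to different depths.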

Abstract

A group of prostate cancer biomarkers are provided, wherein the biomarkers comprise fusion genes, long chain non-coding RNAs, gene mutation and selective spliceosomes. The uses of these biomarkers as reagents for diagnosing prostate cancer or as targets of the drugs for treating prostate cancer are also provided.

Description

Prostate cancer biomarkers, therapeutic targets and uses thereof

Technical Field
The present invention relates to the field of cancer, in particular prostate cancer. It also relates to the use of next-generation sequencing to find biomarkers for diagnosis, prognosis and prediction of treatment response, and drug targets for the effective treatment of prostate cancer, in particular biomarkers for prostate cancer. Specifically, the invention uses RNA-Seq (transcriptome sequencing) to analyze the transcriptomes of prostate cancer tissue and adjacent normal tissue, revealing a complete transcriptional map of prostate cancer in Chinese patients.

Background Art
In developed countries, prostate cancer remains the cancer with the highest incidence and ranks second among cancer-related deaths in men. The incidence of prostate cancer is rising worldwide, but it varies widely between countries and races. Incidence is highest in Western countries such as the United States and lowest in East Asian countries such as China; this difference may be caused in part by genetic differences between ethnic groups. In addition, prostate cancer is a heterogeneous disease. Tumors differ greatly in their evolution and biological behavior (such as tumor dormancy, local growth, distant spread, response to treatment, and recurrence). Patients with the same histopathological grade and stage, the same Gleason score and the same treatment regimen may therefore have entirely different clinical outcomes and histories of tumor progression. In some patients the tumor is dormant and confined to the prostate, and they can survive for more than 10 years, whereas other patients die of distant metastasis 2-3 years after diagnosis. The evidence indicates that the heterogeneity in the clinical behavior of prostate cancer is caused by differences in the underlying molecular mechanisms during tumor progression.
Over the past decade or so, DNA and RNA microarray technologies have been widely used to analyze biological mechanisms. They have given us new insight into the pathogenesis of prostate cancer and provided a basis for finding biomarkers for diagnosis, prognosis and prediction of treatment response. Although genomic prognostic tests for prostate cancer comparable to Oncotype DX and MammaPrint for breast cancer are still very few, some of the molecular alterations discovered in prostate cancer are being applied in clinical practice. Taylor et al. (Taylor BS, et al. (2010) Integrative genomic profiling of human prostate cancer. Cancer Cell 18(1):11-22.) found, through integrative genomic analysis of prostate cancer, that copy-number changes in certain genes may distinguish progressive tumors from dormant tumors, a finding of great significance. Nevertheless, new biomarkers are still urgently needed to detect prostate cancer more accurately and to improve the prediction of tumor progression and treatment outcome.
It should be noted that although microarray-based research has contributed greatly to our understanding of the development and progression of human tumors, the technology has major limitations, such as its inability to detect changes in genome structure or base mutations.

Summary of the Invention
In the past few years, the rapid development of next-generation sequencing (NGS) has overcome these shortcomings. NGS allows us to analyze entire tumor genomes and transcriptomes with unprecedented resolution and throughput. NGS data allow the genome to be analyzed from multiple perspectives, such as mutations, transcription, structural variation, and post-transcriptional regulation (such as methylation). In addition, continuous improvements in NGS technology have enabled scientists to sequence the genomes of the major tumor types.
To date, almost all studies of genome-level and transcriptome-level changes in prostate cancer have been carried out in Caucasian populations, and very few in East Asian populations. In this study, we used RNA-Seq (transcriptome sequencing) to analyze the transcriptomes of 14 pairs of prostate cancer tissues and adjacent normal tissues. We analyzed all types of transcript to reveal a complete transcriptional map of prostate cancer in Chinese patients. We found many variant forms, including exon skipping, intron retention, alternative 5' and 3' splice sites, gene fusions, point mutations and long-chain non-coding RNAs, all of which may play a role in the development and progression of prostate cancer. Our study sets out the complex landscape of genomic changes in prostate cancer, confirms the heterogeneity of prostate cancer, and advances our understanding of prostate cancer in Chinese patients.
1. Discovery and validation of novel fusion genes in prostate cancer
(1). RNA-Seq (transcriptome sequencing) was performed on 14 pairs of prostate cancer and adjacent tissues from Shanghai Changhai Hospital. It identified four high-frequency fusion genes not previously reported in the literature (USP9Y-TTTY15, CTAGE5-KHDRBS3, RAD50-PDLIM4 and SDK1-AMACR) as well as dozens of other fusion genes; see Table 1 below.
Table 1. Novel fusion genes in prostate cancer
5' gene | 3' gene | 5' chromosome ID | 3' chromosome ID | 5' position | 3' position | Strand (fwd/rev) | Paired-end reads | Fusion gene reads
NCOA7 | CRBN | chr6 | chr3 | 126178243 | 3172965 | fwd,rev | 1 | 1
SLC25A33 | RYK | chr1 | chr3 | 9536450 | 135396716 | fwd,rev | 1 | 1
TBC1D22A | ITPK1 | chr22 | chr14 | 45811758 | 92530095 | fwd,rev | 1 | 1
EMB | ATG10 | chr5 | chr5 | 49772631 | 81390070 | rev,fwd | 4 | 1
FBXO25 | H19 | chr8 | chr11 | 403150 | 1973600 | fwd,rev | 3 | 1
KDM5D | CYorf15A | chrY | chrY | 20364436 | 20208484 | rev,fwd | 1 | 1
USP9Y | TTTY15 | chrY | chrY | 13330870 | 13307836 | fwd,fwd | 1 | 13
HPN | RPS2 | chr19 | chr16 | 40248089 | 1952272 | fwd,rev | 1 | 1
TMPRSS2 | ERG | chr21 | chr21 | 41801878 | 38739414 | rev,rev | 25 | 1
ARFIP1 | DOCK9 | chr4 | chr13 | 153970328 | 98250733 | fwd,rev | 1 | 1
STAT3 | PDE8A | chr17 | chr15 | 37793823 | 83427791 | rev,fwd | 3 | 1
PHF17 | SNHG8 | chr4 | chr4 | 129972415 | 119419992 | fwd,fwd | 31 | 34
FBXO28 | CAPN2 | chr1 | chr1 | 222368721 | 221998425 | fwd,fwd | 2 | 1
SDK1 | AMACR | chr7 | chr5 | 4085742 | 34041761 | fwd,rev | 1 | 1
IKZF2 | MFF | chr2 | chr2 | 213720677 | 227905379 | rev,fwd | 1 | 1
CAMTA1 | INSR | chr1 | chr19 | 6807857 | 7218907 | fwd,rev | 5 | 5
UPF3A | CDC16 | chr13 | chr13 | 114075369 | 114025698 | fwd,fwd | 108 | 1
DYRK1A | CMTM4 | chr21 | chr16 | 37714556 | 65208762 | fwd,rev | 2 | 2
CTAGE5 | KHDRBS3 | chr14 | chr8 | 38887932 | 136726484 | fwd,fwd | 1 | 1
RAD50 | PDLIM4 | chr5 | chr5 | 131972987 | 131626201 | fwd,fwd | 9 | 8
WWOX | IGF1 | chr16 | chr12 | 76978346 | 101320474 | fwd,rev | 2 | 2
SNRNP70 | CAMK2B | chr19 | chr7 | 54299804 | 44227025 | fwd,rev | 1 | 1
C20orf94 | SYTL4 | chr20 | chrX | 10386879 | 99846538 | fwd,rev | 1 | 1
PHF10 | OCIAD1 | chr6 | chr4 | 169859844 | 48553987 | rev,fwd | 1 | 1
AQR | MARK3 | chr15 | chr14 | 32973156 | 103027867 | rev,fwd | 2 | 2
DDX39 | PAFAH1B1 | chr19 | chr17 | 14391082 | 2488143 | rev,fwd | 1 | 1
COL6A3 | BRE | chr2 | chr2 | 237970109 | 28374709 | rev,fwd | 1 | 1
ZC3H6 | LRP1B | chr2 | chr2 | 112795913 | 142284440 | fwd,rev | 7 | 6
LRP1B | ZC3H6 | chr2 | chr2 | 142284318 | 112773897 | rev,fwd | 2 | 1
TMPRSS2 | ERG | chr21 | chr21 | 41801878 | 38739414 | rev,rev | 34 | 1
APLP2 | MBOAT7 | chr11 | chr19 | 129515588 | 59369937 | fwd,rev | 2 | 2
MCF2L | SH3KBP1 | chr13 | chrX | 112747676 | 19497227 | fwd,rev | 1 | 1
RPL31 | ODF2L | chr2 | chr1 | 100988965 | 86587149 | fwd,rev | 4 | 4
FBLN1 | LTBP2 | chr22 | chr14 | 44315916 | 74044551 | fwd,rev | 4 | 3
TAX1BP1 | JAZF1 | chr7 | chr7 | 27764277 | 27998125 | fwd,rev | 1 | 1
(2). We validated these fusion genes in 54 pairs of prostate cancer and adjacent non-cancerous tissues. We designed fusion-specific PCR primers. After PCR and agarose gel electrophoresis, all RT-PCR amplicons were excised from the gel, purified (Qiagen QIAquick Gel Extraction Kit) and Sanger sequenced. The 4 novel fusion genes we validated were expressed specifically in cancer tissue and at relatively high frequency (see Figures 2-4). These fusion genes have not been reported previously, and their relatively high frequency in this study suggests that they play an important role in the development of prostate cancer in Chinese men; this is expected to be clarified in follow-up studies.
(3). Prospects for clinical application: Fusion genes that are expressed in cancer tissue but not in adjacent or normal tissue are highly specific prostate cancer markers. They can be detected in blood and urine by real-time PCR, and in prostate biopsy and post-operative tissue by FISH, for early diagnosis, molecular typing and prognosis assessment of prostate cancer patients. The fusion genes can also serve as targets for targeted therapy.
2. Discovery of differentially expressed long non-coding RNAs
Transcriptional landscape of long non-coding RNAs in prostate cancer. A growing body of evidence indicates that long non-coding RNAs act in many aspects of cell biology, which suggests a role in disease etiology, including tumorigenesis. To date, no study has examined global changes in long non-coding RNA transcription in tumors. We therefore first analyzed the global transcription profile of long non-coding RNAs in prostate cancer tissues and their matched adjacent normal tissues, and found that on average 1599 known long non-coding RNAs were expressed in each specimen. We then compared long non-coding RNA expression levels between prostate cancer tissue and matched adjacent normal tissue, and found that on average 406 long non-coding RNAs were differentially expressed between the two (fold change >= 2; false discovery rate, FDR <= 0.001). Of these, 137 showed consistent up- or down-regulation in 50% of the prostate cancers.
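The two-stage selection described above (a per-pair call at fold change >= 2 and FDR <= 0.001, followed by a recurrence requirement of 50% of tumors) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the input layout (one tumor/normal expression ratio and one FDR value per tissue pair) and the tie handling are hypothetical and are not taken from the patent.

```python
from typing import Dict, List, Tuple

def call_direction(ratio: float, fdr: float,
                   min_fold: float = 2.0, max_fdr: float = 0.001) -> str:
    """Classify one tumor/normal pair as 'up', 'down' or 'none'.

    ratio is tumor expression divided by matched normal expression
    (an assumed input format); a two-fold decrease is ratio <= 0.5.
    """
    if ratio <= 0 or fdr > max_fdr:
        return "none"
    if ratio >= min_fold:
        return "up"
    if ratio <= 1.0 / min_fold:
        return "down"
    return "none"

def recurrent_lncrnas(calls: Dict[str, List[Tuple[float, float]]],
                      min_fraction: float = 0.5) -> Dict[str, str]:
    """Keep lncRNAs changed in the same direction in >= min_fraction of pairs."""
    result: Dict[str, str] = {}
    for name, per_pair in calls.items():
        if not per_pair:
            continue
        directions = [call_direction(r, q) for r, q in per_pair]
        up = directions.count("up") / len(directions)
        down = directions.count("down") / len(directions)
        # Assumption: an exact tie between up and down is treated as inconsistent.
        if up >= min_fraction and up > down:
            result[name] = "up"
        elif down >= min_fraction and down > up:
            result[name] = "down"
    return result

example = {
    "lncA": [(4.0, 1e-5), (3.0, 1e-4), (1.1, 0.5), (2.5, 0.0005)],
    "lncB": [(0.2, 1e-6), (0.4, 1e-4), (0.9, 0.2)],
    "lncC": [(2.5, 0.01), (1.0, 0.9)],
}
print(recurrent_lncrnas(example))
```

In this toy input, `lncC` is dropped because its only two-fold change does not meet the FDR cutoff.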
Because most long non-coding RNAs have been found to be involved in transcriptional regulation, we studied how changes in long non-coding RNA expression affect gene expression in prostate cancer. We analyzed the correlation between each long non-coding RNA and the expression of all genes. Using an absolute correlation coefficient greater than 0.85 and a false discovery rate below 0.01 as cutoffs, we identified genes highly correlated with long non-coding RNAs. Interestingly, 23 long non-coding RNAs were significantly correlated with hundreds of genes across the genome, while most of the others were correlated with only a few genes or with none at all. This suggests that long non-coding RNAs may have functions beyond transcriptional regulation, such as regulation at the post-transcriptional level. Unexpectedly, with the exception of two long non-coding RNAs, almost all long non-coding RNAs were positively correlated with gene expression, suggesting that these long non-coding RNAs may promote gene expression.
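A minimal sketch of this kind of correlation screen is given below. The patent does not state which correlation coefficient, significance test or multiple-testing correction was used, so the sketch assumes Pearson correlation and a Benjamini-Hochberg adjustment, and it takes the per-gene p-values as an input computed elsewhere. All function and variable names are hypothetical.

```python
import math
from typing import Dict, List

def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length expression vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("vectors must have equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant vector carries no correlation signal
    return sxy / math.sqrt(sxx * syy)

def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def correlated_genes(lnc_expr: List[float],
                     gene_expr: Dict[str, List[float]],
                     pvalues: Dict[str, float],
                     min_abs_r: float = 0.85,
                     max_fdr: float = 0.01) -> Dict[str, float]:
    """Genes with |r| > min_abs_r and adjusted p-value < max_fdr."""
    names = sorted(gene_expr)
    qvalues = benjamini_hochberg([pvalues[g] for g in names])
    hits: Dict[str, float] = {}
    for gene, q in zip(names, qvalues):
        r = pearson(lnc_expr, gene_expr[gene])
        if abs(r) > min_abs_r and q < max_fdr:
            hits[gene] = r
    return hits
```

The sign of each returned coefficient distinguishes positively from negatively correlated genes, which is the distinction the paragraph above draws.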
To investigate the relationship between long non-coding RNAs and prostate cancer, we selected 4 long non-coding RNAs (two known: DD3, also called PCA3, and MALAT1; two newly identified: FR257520 and FR348383) and measured their expression by qRT-PCR in two sets of prostate specimens. The first set comprised 40 pairs of prostate cancer tissue and matched adjacent normal tissue; the second comprised 15 normal human prostate tissues and 15 prostate cancer tissues. The qRT-PCR and RNA-Seq results were strongly correlated. Consistent with the RNA-Seq results, PCA3, MALAT1 and FR348383 were overexpressed in most prostate cancer specimens, whereas FR257520 expression was decreased. The PCA3 overexpression agrees with earlier studies that proposed it as a new diagnostic marker, but this is the first finding that MALAT1, FR257520 and FR348383 are expressed at clearly different levels in prostate cancer and normal prostate.
Prospects for clinical application: Long non-coding RNAs can be detected in blood and urine by real-time PCR for early diagnosis and molecular typing of prostate cancer patients; they can also serve as targets for targeted therapy and be used to assess prognosis. Our results indicate that 137 long non-coding RNAs can serve as biomarkers; see Table 2.
Table 2. 137 long non-coding RNAs
Long non-coding RNA  GenBank accession number  Sequence length
FR0020363 AK057593 2677
FR0407739 DQ650707 296
FR0282990 AK126514 3800
FR0037254 AF019382 1423
FR0091442 AK124134 1749
FR0029181 AK092342 2695
FR0255273 U90917 1318
FR0407452 DQ668386 387
FR0094304 AK023371 2455
FR0006046 AK096065 2352
FR0156595 AF147314 393
FR0072345 AY314975 1730
FR0317352 U92981 1429
FR0105105 AF086469 294
FR0357736 BC036881 1841
FR0205443 AF147384 445
FR0087663 L20494 434
FR0248245 AK123449 1877
FR0093344 XR_000150 4730
FR0085797 BC028229 2341
FR0030275 AK094210 1936
FR0077061 BK001418 8352
FR0065198 AF308293 567
[Table 2, continued: the following rows appear only as rotated page images in the source (Figure imgf000008_0001, imgf000008_0002, imgf000009_0001 to imgf000009_0004) and are not reproduced here.]
FR0259075 AY927516 301
FR0388685 AK124913 1928
FR0014408 BC038432 1823
FR0407670 BC096064 347
FR0278359 AK123944 3186
FR0105083 AY236157 5139
FR0062389 Z70702 206
FR0113821 AF252279 32359
FR0105049 AF086098 603
FR0333733 AX772993 408
FR0337126 420
FR0230133 AK093002 2211
FR0510076 BX647603 2037
FR0086895 AY927486 29
FR0289833 AY927522 724
FR0292467 BC013821 1767
FR0291542 AK098218 2072
FR0147870 AF041081 6306
FR0072520 AF103908 5426
FR0140676 BC021130,BC048192 1486
FR0384272 AK123493 2175
FR0142848 AL360187 779
FR0130594 AF086212 693
FR0379020 AY927602 828
FR0123825 BC008577 1885
FR0118423 CR613504 2640
FR0379059 BC032043 773
FR0224481 Y12017 200
FR0402396 BC105298 245
FR0291113 AY927590 757
FR0407651 BC134347 320
FR0232833 AL137398 1921
3. Detection of single nucleotide polymorphisms and point mutations
We used SOAPsnp (Li RQ, Li YR, Fang XD, Yang HM, Wang J, et al. (2009) SNP detection for massively parallel whole-genome resequencing. Genome Research 19: 1124-1132.) to detect single nucleotide polymorphisms (SNPs). Mutations were validated by Sanger sequencing. We reduced the false-positive rate of SNP detection through the following steps: removing SNPs with a consensus quality below 20, SNPs located within 5 bp of a splice donor site, and SNPs supported by no more than 2 reads. To identify novel SNPs, we further filtered the calls against six published SNP databases (YH, 1000 Genomes, Yoruba, Korean, Watson and NCBI dbSNP).
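The filtering steps listed above can be sketched as a small post-processing function. This is not SOAPsnp code and does not use its output format; the `SnpCall` record, its field names and the representation of the known-SNP databases as a set of (chromosome, position) pairs are all hypothetical. The sketch also assumes that "within 5 bp" includes a distance of exactly 5.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple

@dataclass(frozen=True)
class SnpCall:
    chrom: str
    pos: int
    consensus_quality: int      # consensus quality score of the call
    supporting_reads: int       # number of reads supporting the variant
    dist_to_splice_donor: int   # distance in bp to the nearest splice donor site

def passes_filters(call: SnpCall) -> bool:
    """Apply the three false-positive filters described in the text."""
    if call.consensus_quality < 20:
        return False
    if call.dist_to_splice_donor <= 5:   # assumption: 5 bp itself counts as "within"
        return False
    if call.supporting_reads <= 2:       # "no more than 2" supporting reads
        return False
    return True

def novel_snps(calls: Iterable[SnpCall],
               known_sites: Set[Tuple[str, int]]) -> List[SnpCall]:
    """Keep calls that pass the filters and are absent from the known-SNP set."""
    return [c for c in calls
            if passes_filters(c) and (c.chrom, c.pos) not in known_sites]
```

Here `known_sites` stands in for the union of the six databases named above; how those databases were actually queried is not described in the source.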
Mutation spectrum of prostate cancer. On average we found 1725 point mutations per prostate cancer tissue. However, only a small fraction (1.5% on average) lay in gene coding regions. Interestingly, some point mutations lay in long non-coding RNAs. The great majority of mutations (91.7%) were T:A to C:G changes. A plausible explanation is that these changes arise from RNA editing, which converts adenosine to inosine; inosine is read as guanosine during translation, producing a change at specific RNA nucleotides.
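A mutation spectrum of this kind is usually tallied by collapsing each substitution and its reverse complement into one strand-symmetric class, so that A>G and T>C both count as T:A to C:G. The sketch below illustrates that bookkeeping; the patent does not describe how its spectrum was computed, and the function names and the "REF>ALT" input format are assumptions.

```python
from collections import Counter
from typing import Dict, Iterable

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def substitution_class(ref: str, alt: str) -> str:
    """Map a single-base change to its strand-symmetric class, e.g. 'T:A>C:G'."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    if ref in ("A", "G"):
        # Report from the pyrimidine strand so each class has one canonical name.
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}:{COMPLEMENT[ref]}>{alt}:{COMPLEMENT[alt]}"

def mutation_spectrum(changes: Iterable[str]) -> Dict[str, float]:
    """Fraction of each substitution class among changes written as 'A>G'."""
    counts = Counter(substitution_class(*c.split(">")) for c in changes)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}

print(mutation_spectrum(["A>G", "T>C", "G>A", "A>G"]))
```

An A-to-I editing event read as A>G on the transcript strand falls into the T:A>C:G class under this convention, which is why that class is the one to inspect when RNA editing is suspected.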
A total of 309 point mutations were found in the coding regions of 290 genes: 115 silent, 181 missense and 13 nonsense. None of these mutations was found in more than one tumor, which suggests that there are no hotspot mutations in these prostate cancer samples. However, 3 samples carried mutations at different positions in the UTP14C gene, and 2 samples carried mutations at different positions in each of 4 genes (CBARA1, FRG1, NAMPT and ZNF195). We confirmed 30 mutations by genomic PCR, RT-PCR and Sanger sequencing: 27 at the genomic level and 29 at the cDNA level.
We also identified 183 mutated genes, most of them mutated at low frequency. This is consistent with the 138 genes reported by Taylor et al. (Taylor BS, et al. (2010) Integrative genomic profiling of human prostate cancer. Cancer Cell 18(1):11-22.). Validation of mutations in 30 genes showed that the accuracy of mutation detection by RNA-Seq was 96.7% at the cDNA level and 90% at the genomic level.
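The two accuracy figures follow directly from the validation counts given earlier (30 mutations tested, 29 confirmed at the cDNA level and 27 at the genomic level). A short check, with a hypothetical helper name:

```python
def validation_rate(confirmed: int, tested: int) -> float:
    """Percentage of tested mutations that were confirmed."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return 100 * confirmed / tested

cdna_rate = round(validation_rate(29, 30), 1)     # cDNA-level confirmation
genomic_rate = round(validation_rate(27, 30), 1)  # genomic-level confirmation
print(cdna_rate, genomic_rate)
```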
One sample had a mutation in the KLK3 gene. Surprisingly, none of the samples had P53 or PTEN mutations, although these are the two genes most strongly associated with prostate cancer in the COSMIC database. Although most of the mutated genes have not previously been reported in prostate cancer, 118 of them have been found mutated in other tumors, suggesting that mutations in these genes may also contribute to prostate cancer.
Prospects for clinical application: DNA is extracted from prostate biopsy or post-operative tissue, amplified by PCR and sequenced to detect SNPs and point mutations, for molecular typing of prostate cancer patients, as drug treatment targets, and to assess prognosis. The 194 mutations in 183 genes provided by the present invention are listed in Table 3; the 30 preferred gene mutations are listed in Table 8.
Table 3. Prostate cancer-specific gene mutations
Chromosome  Mutation position  Mutated gene  Nucleotide change  Mutation  Codon change  Amino acid change
chr12 47511314 DDX23 G->R G->A GCG->GTG Ala->Val
chr1 36530890 THRAP3 A->R A->G ATA->GTA Ile->Val
chr2 44311087 PPM1B A->W A->T CAG->CTG Gln->Leu
chr8 41597667 AGPAT6 G->S G->C AGC->ACC Ser->Thr
chr9 118289531 ASTN2 C->M C->A AGC->ATC Ser->Ile
chrX 132497912 GPC3 C->M C->A AGC->ATC Ser->Ile
chr11 9834816 SBF2 C->M C->A GAG->TAG Glu->STOP
chr13 19106051 MPHOSPH8 G->S G->C GGA->CGA Gly->Arg
chr19 16868057 CPAMD8 C->M C->A GTG->TTG Val->Leu
chr4 191115602 FRG1 G->R G->A GGG->GAG Gly->Glu
chr20 35580977 BLCAP T->Y T->C CAG->CGG Gln->Arg
chr20 60947531 DPH3B G->R G->A TGT->TAT Cys->Tyr
chr2 74538237 INO80B C->M C->A GCC->GAC Ala->Asp
chr13 51501090 UTP14C T->K T->G ATT->AGT Ile->Ser
chr1 13978120 PRDM2 G->R G->A GGG->AGG Gly->Arg
chr4 426550 ZNF721 G->S G->C TCC->TGC Ser->Cys
chr5 73967166 ENC1 C->Y C->T GGA->AGA Gly->Arg
chr8 117928959 RAD21 G->K G->T GAC->GAA Asp->Glu
chr19 7415179 ARHGEF18 A->W A->T AGC->TGC Ser->Cys
chr9 139630460 C9orf37 A->R A->G TAT->CAT Tyr->His
chr12 14831504 WBP11 C->M C->A AGT->ATT Ser->Ile
chr20 13643531 ESF1 T->K T->G AAA->ACA Lys->Thr
chr13 47517856 NUDT15 C->Y C->T CGT->TGT Arg->Cys
chr15 91346437 CHD2 G->R G->A ATG->ATA Met->Ile
chr19 56071940 KLK2 T->W T->A TTC->TAC Phe->Tyr
chr2 130629259 SMPD4 A->M A->C TTT->GTT Phe->Val
[Table 3, continued: the remaining rows appear only as rotated page images in the source (Figure imgf000013_0001, imgf000013_0002 and the following pages) and are not reproduced here.]
J9S<-0Jd 00丄<-000 v<-o a<-o a丄 3S SS00 S9S dsy<-nio ovo<-ovo o<-o s<-o O I/M  J9S<-0Jd 00丄<-000 v<-o a<-o a丄 3S SS00 S9S dsy<-nio ovo<-ovo o<-o s<-o O I/M
6JV<-J9S G0V<-00V o<-o s<-o sxg丄  6JV<-J9S G0V<-00V o<-o s<-o sxg丄
chr22  40362161  XRCC6  T->W  T->A  AAT->AAA  Asn->Lys
chr2  160609801  PLA2R1  A->M  A->C  TGG->GGG  Trp->Gly
chr4  89837613  NAP1L5  C->Y  C->T  GAA->AAA  Glu->Lys
chr7  105691186  NAMPT  G->R  G->A  GCG->GTG  Ala->Val
4. Detection of alternative splicing
Alternative splicing (AS) is a common phenomenon in eukaryotic cells. It allows a gene to be transcribed into different mRNA products, which may in turn be translated into different protein isoforms.
(1) We used SpliceMap to find splice junctions and then applied different methods to detect the different types of alternative splicing, including exon skipping, intron retention, and alternative 5' and 3' splice sites. We first identified all alternative splicing events in the transcriptomes of the 28 specimens. We then selected the events present only in cancer tissue samples and absent from their matched adjacent tissue. This yielded thousands of alternative splicing events, from which a set of highly reliable differential splicing events was selected on the basis of non-redundant reads. Intron retention in the KLK3 gene (also known as PSA) was found in more than half of the prostate cancer samples and may give rise to a new protein sequence. Both the alternatively spliced transcripts and the resulting proteins may serve as new biomarkers for the diagnosis of prostate cancer. Exon skipping in the AMACR gene was found in a subset of prostate cancer samples. Both splicing events were verified by RT-PCR in the sequenced samples. We also tested another 40 pairs of samples by RT-PCR: PSA intron retention was present in the great majority of cancer tissue samples and almost absent from adjacent tissue, whereas only 9 of the 40 cancer tissue samples showed AMACR exon skipping. PSA is one of the few biomarkers routinely used for diagnosis, but current PSA-based screening has limited accuracy. The newly discovered PSA intron retention may help improve the sensitivity and specificity of PSA testing.
(2) Prospects for clinical application: alternative splicing events can be detected in blood or urine by real-time PCR or ELISA for the early diagnosis and molecular subtyping of prostate cancer patients. They may also serve as targets for targeted therapy and be used to assess patient prognosis.
Table 4. Alternative splicing events of four types: 3' splice site variation, 5' splice site variation, exon skipping and intron retention.
3' splice site variation
Gene name  Gene ID  Chromosome  Constitutive exon  3' exon  Alternative 3' exon
CDK11B  984  chr1  1637645-1637775  1633563-1633726  1633563-1633699
SLC25A27  9481  chr6  46746822-46746924  46752079-46753886  46752343-46753886
SLC4A7  9497  chr3  27399648-27399745  27389218-27393340  27389218-27395851
SCP2  6342  chr1  53253198-53253303  53266238-53266331  53266229-53266331
HSF4  3299  chr16  65758524-65758626  65758865-65759003  65758879-65759003
SYTL1  84958  chr1  27547045-27547090  27548194-27548230  27548158-27548230
PSMA3  5684  chr14  57794396-57794469  57797440-57797491  57797419-57797491
RIC8A  60626  chr11  202416-202511  202597-202759  202615-202759
ATF3  467  chr1  210858092-210858199  210859011-210860739  210859323-210860739
NUPR1  26471  chr16  28457618-28457996  28456828-28457031  28456828-28456977
SDF4  51150  chr1  1143701-1143876  1142151-1143047  1142151-1142931
WRNIP1  56897  chr6  2713924-2714115  2715428-2715594  2715353-2715594
PHF20L1  51105  chr8  133905431-133905487  133906462-133906769  133906691-133906769
SLC25A36  55186  chr3  142175248-142175537  142177795-142181475  142177792-142181475
JMJD1C  221037  chr10  64623111-64623238  64622704-64622863  64622704-64622917
5' splice site variation
Gene name  Gene ID  Chromosome  Constitutive exon  5' exon  Alternative 5' exon
TRPT1  83707  chr11  63748591-63748765  63748843-63749018  63748849-63749018
RPS14  6208  chr5  149807341-149807491  149809248-149809512  149809459-149809512
KLF6  1316  chr10  3812298-3812421  3813959-3814406  3813833-3814406
NACA  4666  chr12  55404503-55404600  55405013-55405333  55405314-55405333
Exon skipping
Gene name  Gene ID  Chromosome  Constitutive exon  Included exon  Constitutive exon
TXNL1  9352  chr18  52442517-52442690  52444391-52444495  52444590-52444686
MYBPC1  4604  chr12  100598252-100598438  100602291-100602333  100603491-100603789
MRPL52  122704  chr14  22369236-22369303  22370013-22370086  22372467-22372531
C14orf2  9556  chr14  103451156-10345…  103457030-1034572…  103457560-103457656
PPFIA2  …
[Table 4, continued: these pages are reproduced as images in the source (Figure imgf000019_0001, Figure imgf000020_0001, Figure imgf000021_0001); their rows are not recoverable from this text.]
CSNK1A1  1452  chr5  148869710-148871549
KLK12  43849  chr19  56224281-56224409
FOS  2353  chr14  74815580-74816332
C7orf63  79846  chr7  89735581-89738775
ATXN2L  11273  chr16  28755313-28755549
SERPINE1  5054  chr7  100557246-100558393
SERINC5  256987  chr5  79490542-79497961
GADD45G  10912  chr9  91410616-91410703
CYR61  3491  chr1  85820896-85821010
NR4A1  3164  chr12  50736211-50736544
NR4A2  4929  chr2  156892773-156893161
NR4A1  3164  chr12  50737582-50738738
HSP90AA1  3320  chr14  101620655-101620770
EIF4A2  1974  chr3  187988366-187989607
NAP1L1  4673  chr12  74729298-74729802
TSPAN1  10103  chr1  46423143-46423218
HMG20B  10362  chr19  3524799-3525380
FOS  2353  chr14  74817124-74817238
IL32  9235  chr16  3059112-3059282
SERHL  94009  chr22  41237878-41238067
C7orf63  79846  chr7  89735635-89738775
NR4A1  3164  chr12  50736697-50737107
RBM6  10180  chr3  50073985-50074398
NOMO2  283820  chr16  18418868-18419198
To understand the molecular genetic alterations in prostate cancer described above, we compared the tumors carrying gene fusions, point mutations, differential expression or tumor-specific differential splicing against the dysregulated signaling pathways described by Taylor. Following the literature, we defined genes overexpressed in tumors and known oncogenes as activated genes, and genes down-regulated in tumors and known tumor suppressor genes as inactivated genes. We calculated the frequency of each activated and inactivated gene across the 14 specimens. A tumor was considered altered in a signaling pathway if one or more genes in that pathway carried a point mutation, a gene fusion, differential expression or tumor-specific alternative splicing. We found that three very common signaling pathways (AR, Ras-PI3K-AKT and RB) are altered in prostate cancer.
Like many other tumors, prostate cancer is a genetic disease caused by the accumulation of a series of genetic alterations. More detailed genetic characterization will therefore help in understanding the disease and promote the development of new, individualized targeted therapies. In addition, the incidence and clinical prognosis of prostate cancer differ greatly among ethnic groups, particularly between Caucasian and Asian populations. However, while the genetic profile of prostate cancer in Caucasians has been studied in depth, very few such studies exist for Asian populations. In this study we addressed both issues by performing RNA-Seq on 14 pairs of cancer tissue and matched adjacent normal tissue. This is also the first study to reveal multiple aspects of the prostate cancer transcriptome simultaneously, including gene fusions, alternative splicing, the expression of viral transcripts and long non-coding RNAs, and somatic mutations. Across these aspects, we found that the transcriptomes of different prostate cancer patients are highly heterogeneous. An integrated analysis of these genetic alterations found that the signaling pathways associated with prostate cancer development in Chinese patients are similar to those in Caucasians. These findings open new possibilities for studying the pathogenesis of prostate cancer in Chinese patients and suggest possible approaches to treating prostate cancer.
Brief Description of the Drawings
Figure 1. Flow chart of the systematic tumor transcriptome analysis.
Figure 2. Schematic diagrams of fusion genes. Figure 2c shows the CTAGE5-KHDRBS3 fusion gene, in which exon 23 of CTAGE5 is fused to exon 8 of KHDRBS3; Figure 2d shows the TMPRSS2-ERG fusion gene, in which exon 1 of TMPRSS2 is fused to exon 4 of ERG; Figure 2e shows the frequency of occurrence of the five fusion genes.
Figure 3. Schematic diagram of a fusion gene. Figure 3a shows the USP9Y-TTTY15 fusion, in which exon 3 of USP9Y is fused to exon 4 of TTTY15; Figure 3b shows the RT-PCR results for USP9Y-TTTY15.
Figure 4. Schematic diagrams of fusion genes. Figure 4a shows the RT-PCR and Sanger sequencing results for the RAD50-PDLIM4 fusion gene; Figure 4b shows the RT-PCR and Sanger sequencing results for the SDK1-AMACR fusion gene.
Figure 5. Differential expression of long non-coding RNAs. Figure 5c shows the differential expression of the long non-coding RNAs DD3, MALAT1, FR0257520 and FR0348383 in 40 pairs of cancer and adjacent tissue; Figure 5d shows the differential expression of the same four long non-coding RNAs in prostate cancer and benign prostatic hyperplasia tissue.
Detailed Description
Embodiments of the present invention are described in detail below with reference to examples. Those skilled in the art will understand that the following examples serve only to illustrate the invention and should not be regarded as limiting its scope.
Unless otherwise defined, the scientific and technical terms used herein have the meanings commonly understood by those skilled in the art. For a better understanding of the invention, definitions of the following terms are provided.
Common steps for discovering fusion genes, long non-coding RNAs, mutations and alternative splicing: collect prostate cancer patient samples -> prepare frozen sections of cancer tissue and adjacent tissue and have pathologists check them for quality -> prepare cDNA libraries -> RNA-Seq -> map the sequencing reads to the genome and transcriptome -> normalize gene and long non-coding RNA expression levels, then identify differentially expressed long non-coding RNAs, alternative splicing events, and tumor-specific mutations and fusion genes.
In one aspect, the present invention provides biomarkers for prostate cancer, comprising one or more of the fusion genes shown in Table 1, the long non-coding RNAs shown in Table 2, the gene mutations shown in Table 3 and the alternative splicing events shown in Table 4.
The biomarkers of the present invention can further be used as early diagnostic markers for prostate cancer, markers for judging the efficacy of drug treatment, or patient prognosis markers.
In a specific embodiment of the invention, the fusion genes of said biomarkers include one or more of the 83 fusion genes in Table 6, preferably one or more of the 35 fusion genes underlined in Table 6.
In a specific embodiment of the invention, the fusion genes of said biomarkers include one or more of USP9Y-TTTY15, CTAGE5-KHDRBS3, RAD50-PDLIM4 and SDK1-AMACR; preferably, these fusion genes are amplified with the primers listed in Table 5.
In a specific embodiment of the invention, the long non-coding RNAs of said biomarkers include one or more of DD3, MALAT1, FR0257520 and FR0348383; preferably, these long non-coding RNAs are amplified with the primers listed in Table 7.
In a specific embodiment of the invention, the gene mutations of said biomarkers include one or more of the 30 gene mutations shown in Table 8; preferably, these 30 gene mutations are amplified with the primers listed in Table 9.
In a specific embodiment of the invention, the alternative splicing events of said biomarkers include those of PSA or AMACR; preferably, the alternatively spliced PSA or AMACR is amplified with the primers listed in Table 10.
In another aspect, the present invention provides the use of said biomarkers as targets of reagents for diagnosing prostate cancer or of drugs for treating prostate cancer, in particular their use as early diagnostic markers for prostate cancer, markers for judging the efficacy of drug treatment, or patient prognosis markers.
In a further aspect, the present invention provides the use of primers for amplifying said biomarkers, or of probes for said biomarkers, in the preparation of reagents for diagnosing prostate cancer. The primers can be used to amplify the biomarkers specifically, and the probes bind specifically to the biomarkers, thereby indicating their presence.
In a specific embodiment of the invention, primers for amplifying said biomarkers are provided. The primers preferably include the primers listed in Table 5, for amplifying the fusion genes USP9Y-TTTY15, CTAGE5-KHDRBS3, RAD50-PDLIM4 and SDK1-AMACR; the primers listed in Table 7, for amplifying the long non-coding RNAs DD3, MALAT1, FR0257520 and FR0348383; the primers listed in Table 9, for amplifying the 30 gene mutations shown in Table 8; and the primers listed in Table 10, for amplifying the alternatively spliced PSA or AMACR.
In a specific embodiment of the invention, the use of the primers listed in Table 5 in the preparation of a reagent for diagnosing prostate cancer is provided.
In a specific embodiment of the invention, the use of the primers listed in Table 7 in the preparation of a reagent for diagnosing prostate cancer is provided.
In a specific embodiment of the invention, the use of the primers listed in Table 9 in the preparation of a reagent for diagnosing prostate cancer is provided.
In a specific embodiment of the invention, the use of the primers listed in Table 10 in the preparation of a reagent for diagnosing prostate cancer is provided.
Examples
Example 1. Differential gene expression analysis
1. Collection of prostate cancer patient samples
Patients and samples.
Fourteen pairs of prostate cancer tissue and adjacent normal tissue used for RNA-Seq were obtained from Shanghai Changhai Hospital. Of the 54 pairs of samples used to validate gene fusions, 23 pairs came from Shanghai Changhai Hospital, 17 pairs from Jiangsu Provincial Hospital and 14 pairs from the Third Affiliated Hospital of Sun Yat-sen University. A set of 40 pairs of prostate cancer and adjacent tissue used to validate alternative splicing and long non-coding RNAs was obtained from Shanghai Changhai Hospital. A further set of 15 tumor samples and 15 BPH (benign prostatic hyperplasia) samples used to validate long non-coding RNAs was obtained from Jiangsu Provincial Hospital and Shanghai Changhai Hospital, respectively. The RNA-Seq protocol and the follow-up experiments were approved by the ethics committees of the three hospitals. All patients gave written informed consent authorizing the use of their samples.
2. Quality assurance of frozen sections of cancer and adjacent tissue by pathologist review
Pathological examination
Frozen sections of cancer tissue and adjacent normal tissue were stained with HE (hematoxylin-eosin) and examined by the study pathologist to ensure that the cancer tissue density of the selected cancer samples exceeded 80% and that the adjacent normal tissue contained no cancer tissue. All pathology samples were reviewed by a second pathologist. Where the two conclusions disagreed, the two pathologists discussed the case together to reach a conclusion.
3. Preparation of cDNA libraries and RNA-Seq
Oligo(dT) magnetic beads were used to isolate poly(A) mRNA from total RNA. The purified mRNA was fragmented with fragmentation buffer. Using these short fragments as templates, first-strand cDNA was synthesized with random hexamer primers. Second-strand cDNA was synthesized using buffer, dNTPs, RNase H and DNA polymerase I. The short double-stranded cDNA fragments were purified with the QIAquick PCR extraction kit and eluted in EB buffer for end repair and addition of "A". The short fragments were then ligated to Illumina sequencing adaptors. DNA of the target fragment size was gel-purified for PCR amplification. The amplified libraries were sequenced on an Illumina HiSeq™ 2000.
The cDNA libraries were constructed with the mRNA-Seq 8-Sample Prep Kit from Illumina (catalog no. RS-100-0801), following this procedure. Oligo(dT) magnetic beads were used to isolate poly(A) mRNA from total RNA. The purified mRNA was fragmented with fragmentation buffer. Using these short fragments as templates, first-strand cDNA was synthesized with random hexamer primers. Second-strand cDNA was synthesized using buffer, dNTPs, RNase H and DNA polymerase I. The short double-stranded cDNA fragments were purified with the QIAquick PCR extraction kit (Qiagen) and eluted in EB buffer for end repair and addition of "A". The short fragments were then ligated to Illumina sequencing adaptors. DNA of the target fragment size was gel-purified for PCR amplification. The quality of each cDNA library was checked on an Agilent 2100 Bioanalyzer and a StepOnePlus real-time quantitative PCR system (acceptance criteria: PCR product size of 322 ± 20 bp, of which the insert is 200 ± 20 bp, and a library molar concentration of at least 1.3 nM), after which the amplified library was sequenced on an Illumina HiSeq™ 2000.
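The acceptance criteria for a library can be written as a simple check. The following Python sketch is illustrative only: the function and argument names are hypothetical, and the thresholds are those quoted above (PCR product 322 ± 20 bp, insert 200 ± 20 bp, concentration of at least 1.3 nM).

```python
def library_passes_qc(product_bp, insert_bp, conc_nm):
    """Return True if a cDNA library meets the stated acceptance criteria.

    product_bp -- size of the PCR amplification product, in bp
    insert_bp  -- size of the inserted cDNA fragment, in bp
    conc_nm    -- molar concentration of the library, in nM
    (argument names are illustrative, not taken from the source)
    """
    product_ok = abs(product_bp - 322) <= 20   # 322 +/- 20 bp
    insert_ok = abs(insert_bp - 200) <= 20     # 200 +/- 20 bp
    conc_ok = conc_nm >= 1.3                   # at least 1.3 nM
    return product_ok and insert_ok and conc_ok
```

A library would proceed to sequencing only when all three conditions hold.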
4. Data analysis
Raw read filtering
Images generated by the sequencer were converted to sequences by base calling with the sequencer's control software. The raw sequences were stored in FASTQ format. Dirty reads were removed before data analysis, using three criteria:
1) remove dirty reads;
2) remove reads in which more than 2% of the bases are "N";
3) remove low-quality reads, in which more than 50% of the bases have QA ≤ 15.
All subsequent analyses are based on the clean reads.
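Criteria 2 and 3 are fully specified and can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the filtering software used in the study: the function name and parameters are hypothetical, quality values are assumed to be already decoded to integers, and criterion 1 is left out because the text does not say how it is applied.

```python
def is_clean_read(seq, quals, max_n_frac=0.02, low_q=15, max_low_frac=0.5):
    """Apply criteria 2 and 3 to a single read.

    seq   -- base calls as a string
    quals -- per-base quality scores as integers, same length as seq
    """
    if not seq or len(seq) != len(quals):
        return False
    # Criterion 2: reject reads with more than 2% "N" bases.
    n_frac = seq.upper().count("N") / len(seq)
    if n_frac > max_n_frac:
        return False
    # Criterion 3: reject reads in which more than 50% of bases have quality <= 15.
    low_frac = sum(q <= low_q for q in quals) / len(quals)
    if low_frac > max_low_frac:
        return False
    return True
```

Both thresholds are treated as strict ("more than"), so a read with exactly 2% "N" bases or exactly 50% low-quality bases is kept.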
Mapping reads to the human genome and transcriptome.
The genome and transcriptome reference sequences were downloaded from the UCSC website (hg18 version). We used SOAP2 (Short Oligonucleotide Analysis Package (SOAP) aligner; Li R, Yu C, Li Y, Lam TW, Yiu SM, et al. (2009) SOAP2: an improved ultrafast tool for short read alignment. Bioinformatics 25: 1966-1967) to align the clean reads to the genome and to the transcriptome. No more than 3 mismatches were allowed per read.
Normalization of gene and long non-coding RNA expression levels.
Reads that could be mapped to a specific gene were used to calculate its expression level. The expression level of a gene is expressed as RPKM, the number of reads from that gene per kilobase of length per million reads:
$$\mathrm{RPKM} = \frac{10^{6}\,C}{N L / 10^{3}}$$
where C is the number of reads mapped to the selected gene, N is the total number of reads mapped to all genes, and L is the total exon length of the selected gene in bases. For genes with more than one alternative transcript, the longest transcript was used to calculate RPKM. The RPKM method removes the effect of differences in gene length and in the amount of sequencing on the calculated expression. RPKM values can therefore be used directly to compare gene expression between samples.
We calculated the expression levels of non-coding RNAs in the same way.
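As a worked illustration of the RPKM definition (reads per kilobase of exon per million mapped reads), here is a minimal Python sketch. The function name and the example numbers are made up for illustration and are not data from the study.

```python
def rpkm(gene_reads, total_reads, exon_length_bp):
    """Reads per kilobase of exon per million mapped reads.

    gene_reads     -- C, reads mapped to the selected gene
    total_reads    -- N, reads mapped to all genes
    exon_length_bp -- L, total exon length of the gene in bases
    """
    if total_reads <= 0 or exon_length_bp <= 0:
        raise ValueError("total_reads and exon_length_bp must be positive")
    return (1e6 * gene_reads) / (total_reads * exon_length_bp / 1e3)


# Hypothetical example: 500 reads on a gene with 2,000 bp of exons,
# in a sample with 20 million mapped reads.
example = rpkm(500, 20_000_000, 2000)  # 12.5
```

Doubling either the gene length or the sequencing depth while keeping the read count fixed halves the value, which is the normalization the text describes.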
5. Analysis of differentially expressed genes
Following "The significance of digital gene expression profiles" (Audic S & Claverie JM (1997) The significance of digital gene expression profiles. Genome Res 7(10):986-995), we used a false discovery rate ≤ 0.001 and a fold change ≥ 2 as the criteria for identifying genes differentially expressed between the 14 pairs of prostate cancer tissue and matched adjacent normal tissue. Each sample yielded on average 66,432,064 reads and 5.98 Gb of sequenced nucleotides. Using SOAP2, we mapped 84.4% of the reads to the human genome (UCSC hg18 version). By comparing the transcriptome sequences of cancer tissue and matched adjacent normal tissue, we found gene fusions, differentially expressed long non-coding RNAs, alternative splicing events and differentially expressed genes in every prostate cancer specimen. In addition, we found on average 1,725 point mutations per cancer tissue sample. These results reveal substantial heterogeneity in prostate cancer and indicate that signaling pathways and molecular mechanisms play a role in its development.
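The two selection thresholds can be sketched as a filter on per-gene results. This is a hedged sketch, not the study's pipeline: the function name is hypothetical, the FDR is assumed to have been computed elsewhere, the fold change is assumed to count in either direction (up or down in tumor), and the small `floor` that guards against division by zero is an assumption introduced here.

```python
def is_differentially_expressed(rpkm_tumor, rpkm_normal, fdr,
                                max_fdr=0.001, min_fold=2.0, floor=0.01):
    """Return True if a gene meets FDR <= 0.001 and fold change >= 2.

    floor is an assumed lower bound on RPKM so that genes with zero
    expression in one sample do not cause a division by zero.
    """
    tumor = max(rpkm_tumor, floor)
    normal = max(rpkm_normal, floor)
    # Fold change in either direction (assumption: up- and down-regulation both count).
    fold = max(tumor / normal, normal / tumor)
    return fdr <= max_fdr and fold >= min_fold
```

For example, a gene at RPKM 10 in tumor and 4 in adjacent tissue with FDR 0.0005 passes, while the same gene with FDR 0.01 does not.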
Example 2. Discovery and validation of novel fusion genes in prostate cancer
When aligning the short RNA reads to the reference genome, we found that some reads could be matched to the genome only after being split into two segments. Such reads had to meet the following conditions:
a) the shorter segment is at least 8 bp long;
b) whatever the position of the intron (5' to 3', plus or minus strand), the alignment of the two segments allows at most one mismatch and no gaps.
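Conditions a) and b) can be expressed as a small check. This is a sketch under stated assumptions: the mismatch limit is applied to each segment separately (the text does not say whether it is per segment or per read), an ungapped alignment is modelled as a position-by-position comparison of equal-length strings, and the function names are hypothetical.

```python
def mismatches(read_segment: str, ref_segment: str) -> int:
    """Count mismatches in an ungapped alignment of two equal-length sequences."""
    if len(read_segment) != len(ref_segment):
        raise ValueError("an ungapped alignment needs equal-length sequences")
    return sum(a != b for a, b in zip(read_segment.upper(), ref_segment.upper()))


def is_valid_split_read(seg1: str, ref1: str, seg2: str, ref2: str,
                        min_len: int = 8, max_mismatch: int = 1) -> bool:
    """True if the shorter segment is >= min_len and each segment has <= max_mismatch."""
    if min(len(seg1), len(seg2)) < min_len:
        return False
    return (mismatches(seg1, ref1) <= max_mismatch
            and mismatches(seg2, ref2) <= max_mismatch)
```

A read split into a 10-base and an 8-base segment with a single mismatch in one segment passes; the same read with a 7-base segment does not.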
Validation of gene fusions by RT-PCR and sequencing. We validated the gene fusions detected by RNA-Seq at the transcript level. We designed fusion-specific PCR primers. After PCR and agarose gel electrophoresis, all RT-PCR amplicons were excised from the gel, recovered (Qiagen QIAquick Gel Extraction kit) and Sanger sequenced. In this way we validated five fusion genes: TMPRSS2-ERG, USP9Y-TTTY15, SDK1-AMACR, CTAGE5-KHDRBS3 and RAD50-PDLIM4. The four fusions other than TMPRSS2-ERG were newly discovered by the inventors.
The four newly discovered fusion genes are:
>39a fwd chrY 155 39b fwd chrY
USP9Y-TTTY15
GATAACTACATAAAGAGACAAAAAAAAGAAAAAAGA
GCAAAGATCTGTGCTGTGTCAAGTATGACAGCCATCACT
CATGGCTCTCCAGTAGGAGGGAACGACAGCCAGGGCCA
GGTTCTTGATGGCCAGTCTCAGCATCTCTTCCAACAGAA
CCAGgaatcaaacttgacgtatggagccaagaaagcccttggaaaaactggcctcatat
tttgtgtacacagtccctgtacagggtttctgacctgtg
[Figure imgf000031_0001: further fusion sequences, including RAD50-PDLIM4, printed as a rotated page image; not recoverable from the extracted text]
SDK1-AMACR [header partly illegible in the source]
GATATGAGACTCATGAGACAAGATATTGATACACAGAAG
gtccatgctggcagcaaggctgcattggctgccctgtgcccaggagacctgatccaggccat
caatggtgagagcacagagctcatgacacacctggaggcacagaaccgcatcaagggctg
ccacgatcacctcacactgtctgtgagcag
[The lowercase segment contains the reverse complement of the RAD50-PDLIM4 reverse primer listed in Table 5, so this sequence may belong to RAD50-PDLIM4 rather than to SDK1-AMACR.]
Uppercase letters denote the sequence of the first gene and lowercase letters the sequence of the second gene.
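Given this convention, the fusion junction can be read off a sequence as the position where the case changes. A minimal sketch, assuming exactly one uppercase part followed by one lowercase part; the function name is illustrative.

```python
def split_fusion_sequence(seq: str):
    """Return (first_gene_part, second_gene_part, junction_index).

    Whitespace (line wraps in the printed sequence) is ignored.
    """
    cleaned = "".join(seq.split())
    # Index of the first lowercase base, i.e. the start of the second gene.
    junction = next((i for i, ch in enumerate(cleaned) if ch.islower()), None)
    if junction in (None, 0):
        raise ValueError("expected an uppercase part followed by a lowercase part")
    first, second = cleaned[:junction], cleaned[junction:]
    if not second.islower():
        raise ValueError("uppercase letters found after the junction")
    return first, second, junction


if __name__ == "__main__":
    # Toy sequence, not taken from the document.
    print(split_fusion_sequence("ACGT acgg"))
```

Sequences with more than one case change are rejected rather than guessed at, which flags entries that may contain extraction damage.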
The amplification primers for the five fusion genes are listed in Table 5.
Table 5. Amplification primers for the five fusion genes
Fusion gene       Forward primer              Reverse primer
TMPRSS2-ERG       AGTAGGCGCGAGCTAAGCAG        GTCCATAGTCGCTGGAGGAG
USP9Y-TTTY15      CTGTGTCAAGTATGACAGCCATC     CTGTGTCAAGTATGACAGCCATC [sic]
CTAGE5-KHDRBS3    TGCTGAAAATGAAGCCACTG        GGACTGGTGGAGATTGGCTA
RAD50-PDLIM4      ACTAAGTGAATGCGAGAAACACAA    ACAGACAGTGTGAGGTGATCGT
SDK1-AMACR        ACCTGGTCATTTCCAACATCAG      CAAAGCCAAATAGTTGATATCGTG
The PCR conditions were: 95 °C for 10 s; 60 °C for 30 s; [temperature illegible] for 90 s; 38-43 cycles.
PCR products were purified with the PCR Cleanup Kit 50-prep (AXYGEN, Cat No. AP-PCR-50, Lot No. KB10101204-G) and run on a 2% agarose gel; the bands were recovered with the DNA Gel Extraction Kit 50-prep (AXYGEN, Cat No. AP-GX-50, Lot No. KE10101204-G).
Electrophoresis images of the fusion genes are shown in Figure 2d (TMPRSS2-ERG and CTAGE5-KHDRBS3), Figures 3a and 3b (USP9Y-TTTY15), Figure 4a (RAD50-PDLIM4) and Figure 4b (SDK1-AMACR).
Screening for recurrent gene fusions. After validating the gene fusions by RT-PCR, we tested each of the four fusion genes above in a further 54 sample pairs. RNA was first extracted from all samples and reverse transcribed into cDNA. The RT-PCR primers were the same as the validation primers above. cDNA from the sequenced samples served as the positive control.
The gene fusion landscape of prostate cancer. Transcriptome sequencing was first used to detect gene fusions in prostate cancer. Using paired-end reads, we found 84 gene fusions in total. Apart from the well-known TMPRSS2-ERG fusion, 83 were new and had not been reported in earlier studies of Caucasian populations. Thirty-five new fusions and one previously known fusion were found only in prostate cancer tissue and not in paired adjacent normal tissue (underlined). Other fusions were expressed in adjacent normal tissue (shown in bold), and their biological significance is not yet clear. The following four fusions were found in both cancer and adjacent normal tissue.
5' gene   3' gene   5' chromosome   3' chromosome   5' position   3' position   Strand (fwd/rev)
CTSS      CTSK      chr1            chr1            148996958     149045116     rev, rev
KLK3      KLK12     chr19           chr19           56053663      56224525      fwd, rev
KLK2      KLK3      chr19           chr19           56072076      56055040      fwd, fwd
KLK2      KLK3      chr19           chr19           56072113      56055040      fwd, fwd
Gene fusions expressed only in cancer are defined as tumor-specific gene fusions. The number of gene fusions per cancer tissue sample ranged from 1 to 6. The 83 new gene fusions are listed in Table 6, in which the 35 new tumor-specific fusions are underlined.
Table 6. The 83 new gene fusions
5' gene     GenBank accession no.   3' gene     GenBank accession no.
ANXA2       302                     PRODH       5625
APLP2       334                     MBOAT7      79143
AQR         9716                    MARK3       4140
ARFIP1      27236                   DOCK9       23348
ARG2        384                     VTI1B       10490
BUB3        9184                    PRKDC       5591
C1orf57     428588                  NVL         4931
C20orf94    128710                  SYTL4       94121
CACNA1D     776                     AMACR       23600
[Figures imgf000034_0001 and imgf000034_0002: the rows between CACNA1D-AMACR and RPL31-ODF2L are printed as rotated page images; not recoverable from the extracted text]
RPL31       6160                    ODF2L       57489
SDK1        221935                  AMACR       23600
SGMS1       259230                  ADD3        120
SLC25A33    84275                   RYK         6259
SNRNP70     6625                    CAMK2B      816
STAT3       6774                    PDE8A       5151
TAX1BP1     8887                    JAZF1       221895
TBC1D22A    25771                   ITPK1       3705
TJP1        7082                    NUS1        116150
TPM2        7169                    MYL6        4637
TSPAN9      10867                   TFIP11      24144
UBAP2L      9898                    C1orf43     25912
UPF3A       65110                   CDC16       8881
USP53       54532                   NR3C2       4306
USP9Y       8287                    TTTY15      64595
UTRN        7402                    ARHGAP18    93663
VAPB        9217                    ATPBD4      89978
WWOX        51741                   IGF1        3479
ZC3H4       23211                   LPPR2       64748
ZC3H6       376940                  LRP1B       53353
ZER1        10444                   GLIPR2      152007
ZNF252      286101                  PSMD4       5710
ZNF532      55205                   UBA3        9039
ZNF557      79230                   WIF1        11197
The most frequent gene fusions were TMPRSS2-ERG and USP9Y-TTTY15, each found in 3 of the 14 sequenced prostate cancer tissue samples. USP9Y-TTTY15 lies on the Y chromosome. USP9Y encodes a protein resembling a ubiquitin-specific protease, whereas TTTY15 is a non-coding RNA. Deletion or mutation of USP9Y is associated with male infertility, but no earlier study has linked either gene to tumorigenesis. In the RNA-Seq data, USP9Y-TTTY15, formed by fusion of exon 3 of USP9Y to exon 3 of TTTY15, occurred at the same frequency as TMPRSS2-ERG (3/14 = 21.4%). By RT-PCR, however, 19 of the 54 prostate cancer tissues carried USP9Y-TTTY15. This fusion has not been reported before. Its high frequency in both the sequenced and the validation samples suggests that it plays an important role in the development of prostate cancer in Chinese men, which remains to be clarified in follow-up studies. Notably, the open reading frame (ORF) prediction tool Six-Frame Translation found no apparent ORF in the fusion transcript, suggesting that it may be a non-coding RNA. The fusion may therefore cause loss of USP9Y function and produce a new non-coding fusion transcript.
In these 54 pairs of prostate cancer samples we also validated the other three gene fusions (CTAGE5-KHDRBS3, SDK1-AMACR and RAD50-PDLIM4), which occurred at frequencies of 37%, 20% and 33.3%, respectively.
Example 3. Discovery and validation of long non-coding RNAs in prostate cancer
(1) We downloaded the ncRNA database from http://www.ncrna.org/frnadb/download, removed ncRNAs shorter than 200 nt, zRNA and non-human RNA, and obtained 2,981 long non-coding RNAs. We then used this database to calculate long non-coding RNA expression levels. The criteria for differential expression of long non-coding RNAs between paired cancer and adjacent specimens were FDR <= 0.001 and fold change >= 2. Long non-coding RNAs that were consistently up- or down-regulated in more than 50% of samples were selected for supervised clustering (hierarchical clustering of gene and long non-coding RNA expression profiles with Cluster 3.0). We then analysed the correlation between long non-coding RNAs and genes: for the long non-coding RNAs consistently up- or down-regulated in more than 50% of prostate cancer samples, we analysed their correlation with all genes detected in prostate cancer tissue. The expression levels (RPKM) of long non-coding RNAs and genes were used to calculate the correlation coefficient R.
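The correlation step can be illustrated with a short sketch. It assumes that R is the Pearson correlation coefficient computed across samples from the RPKM values of one long non-coding RNA and one gene; the text does not name the coefficient, and the function name and sample values are illustrative.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of expression values."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant expression")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    lncrna_rpkm = [1.0, 2.0, 3.0, 4.0]   # hypothetical RPKM in four samples
    gene_rpkm = [2.0, 4.0, 6.0, 8.0]
    print(pearson_r(lncrna_rpkm, gene_rpkm))
```

Repeating this for every gene gives, for each long non-coding RNA, the number of genes whose expression is significantly correlated with it.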
(2) qRT-PCR validation of long non-coding RNAs. We performed qRT-PCR with Power SYBR Green Master Mix on an Applied Biosystems StepOnePlus instrument, with GAPDH primers as the internal reference. One set of 40 paired prostate cancer and adjacent tissues, as described above, came from Shanghai Changhai Hospital. A second set of 15 tumor samples and 15 BPH samples came from Jiangsu Provincial Hospital and Shanghai Changhai Hospital, respectively. Both sets were used to validate the long non-coding RNAs. A standard two-step PCR program was used: Stage 1, initial denaturation (1 rep; 95 °C for 30 s); Stage 2, PCR (40 reps; 95 °C for 5 s, 60 °C for 34 s); followed by a dissociation stage.
The primers designed for the four long non-coding RNAs are listed in Table 7:
Table 7. Primers for the four long non-coding RNAs
             Forward primer              Reverse primer
DD3          GGTGGGAAGGACCTGATGATAG      GGGCGAGGCTCATCGAT
MALAT1       CTTCCCTAGGGGATTTCAGG        GCCCACAGGAACAAGTCCTA
FR0257520    CTTCACAAAGCTGAATTAATGTGG    GTTTTTCTTTCTTTTTGGAGGTCA
FR0348383    TAAACCTCCTTATCACATGCAGAA    GGACACCGTAGATTCTAGGACACT
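For the qRT-PCR measurements with GAPDH as internal reference, relative expression is commonly computed with the 2^-ΔΔCt method. The document does not state which calculation was used, so the sketch below is an assumption; the function name and the Ct values are hypothetical.

```python
from statistics import mean


def relative_fold_change(ct_target_cancer, ct_ref_cancer,
                         ct_target_normal, ct_ref_normal):
    """Fold change of a target in cancer versus normal tissue by 2^-ddCt.

    Each argument is a list of Ct values from replicate wells;
    the reference gene would be GAPDH in this setting.
    """
    delta_cancer = mean(ct_target_cancer) - mean(ct_ref_cancer)
    delta_normal = mean(ct_target_normal) - mean(ct_ref_normal)
    return 2 ** -(delta_cancer - delta_normal)


if __name__ == "__main__":
    # Hypothetical duplicate wells: the target crosses threshold 3 cycles earlier in cancer.
    print(relative_fold_change([22.0, 22.0], [18.0, 18.0],
                               [25.0, 25.0], [18.0, 18.0]))
```

A value above 1 indicates higher expression in cancer tissue, and a value below 1 indicates lower expression.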
All experiments were run in duplicate or triplicate wells, and the results are plotted as mean fold change relative to GAPDH (Figure 5). We found 137 long non-coding RNAs that were consistently up- or down-regulated in 50% of prostate cancers. Analysing the correlation of each long non-coding RNA with the expression of all genes, we found that 23 long non-coding RNAs correlated significantly with hundreds of genes across the genome, whereas most of the others correlated with only a few genes or with none.
Analysis of results
In the 40 pairs of prostate cancer and adjacent tissues, and in the 15 normal human prostate tissues and 15 prostate cancer tissues, we confirmed that PCA3 (also known as DD3), MALAT1 and FR0348383 were overexpressed in most prostate cancer specimens, whereas FR0257520 expression was reduced (Figure 5). The PCA3 overexpression agrees with earlier studies that proposed it as a new diagnostic marker, but this is the first finding that MALAT1 is frequently overexpressed in prostate cancer.
The invention provides 137 long non-coding RNAs that can be used for diagnosis, for judging patient prognosis and drug response, and as therapeutic targets; see Table 2.
Example 4. Discovery and validation of single nucleotide polymorphisms and point mutations
(1) We used SOAPsnp to detect single nucleotide polymorphisms. This resequencing software assembles a consensus sequence for the newly sequenced individual by aligning the sequencing reads to a known reference sequence. Single nucleotide polymorphisms are then found by comparing the consensus sequence with the reference sequence.
(2) We validated the candidate base variants found by RNA-Seq using RT-PCR combined with Sanger sequencing. The PCR conditions were: 95 °C for 10 s; 60 °C for 30 s; 72 °C for 90 s; 38-43 cycles. The samples were the 14 pairs of prostate cancer and adjacent tissues from Shanghai Changhai Hospital. We randomly selected 30 protein-coding mutations for validation. Of these, 27 were present only in cancer tissue (in both cDNA and DNA) and absent from adjacent normal tissue (in both cDNA and DNA). Two variants were found only in cancer tissue cDNA and not in normal tissue cDNA. One variant was found in neither cancer tissue nor adjacent normal tissue.
Table 8. The 30 mutations tested. The rightmost column gives the validation result and the templates used (cDNA and DNA); S denotes success and F denotes failure.
Chr     Coordinate    Genotype   Base change   Codon change   Amino acid change   Gene          Validation (template)
chr12   47511314      G->R       G->A          GCG->GTG       Ala->Val            DDX23         F (cDNA-, DNA-)
chr1    36530890      A->R       A->G          ATA->GTA       Ile->Val            [illegible]   S (DNA/cDNA)
chr10   114177056     A->M       A->C          AAA->CAA       Lys->Gln            ACSL5         S (DNA/cDNA)
chr12   6441548       G->R       G->A          CGC->CAC       Arg->His            TAPBPL        S (DNA/cDNA)
chr13   40032096      G->R       G->A          GCA->GTA       Ala->Val            FOXO1         S (DNA/cDNA)
chr15   55603021      G->R       G->A          GAG->AAG       Glu->Lys            CGNL1         S (DNA/cDNA)
chr16   71388363      C->M       C->A          GAA->TAA       Glu->STOP           ZFHX3         cDNA-S, DNA-F
chr17   74307400      G->S       G->C          TCT->TGT       Ser->Cys            USP36         S (DNA/cDNA)
chr19   4360692       C->Y       C->T          CCG->CTG       Pro->Leu            CHAF1A        S (DNA/cDNA)
chr1    10444156      C->Y       C->T          CGA->CAA       Arg->Gln            DFFA          S (DNA/cDNA)
chr1    154907009     C->S       C->G          GCC->CCC       Ala->Pro            NES           S (DNA/cDNA)
chr20   17556245      C->M       C->A          CAG->CAT       Gln->His            RRBP1         S (DNA/cDNA)
chr20   [illegible]   G->R       G->A          TGT->TAT       Cys->Tyr            CST7          S (DNA/cDNA)
chr3    135021238     A->R       A->G          AGA->GGA       Arg->Gly            SRPRB         S (DNA/cDNA)
chr8    61818110      G->K       G->T          GGC->GTC       Gly->Val            CHD7          S (DNA/cDNA)
chr8    22482646      C->Y       C->T          CCG->CTG       Pro->Leu            SORBS3        S (DNA/cDNA)
chr1    149190131     C->S       C->G          TTC->TTG       Phe->Leu            SETDB1        S (DNA/cDNA)
chr9    33913909      G->R       G->A          CAC->TAC       His->Tyr            UBAP2         S (DNA/cDNA)
chr9    99297935      G->K       G->T          TGG->TGT       Trp->Cys            TDRD7         S (DNA/cDNA)
chr8    117928959     G->K       G->T          GAC->GAA       Asp->Glu            RAD21         S (DNA/cDNA)
chr1    [...]35851    T->K       T->G          [illegible]    [illegible]         [illegible]   [illegible]
chr6    82516802      C->M       C->A          GAT->TAT       Asp->Tyr            FAM46A        S (DNA/cDNA)
chr11   94242244      G->R       G->A          GTC->ATC       Val->Ile            AMOTL1        S (DNA/cDNA)
chr12   22109451      G->R       G->A          CGA->CAA       Arg->Gln            CMAS          S (DNA/cDNA)
chr17   77587208      C->Y       C->T          GCC->ACC       Ala->Thr            DCXR          S (DNA/cDNA)
chr17   76834796      C->Y       C->T          GAT->AAT       Asp->Asn            SLC38A10      S (DNA/cDNA)
chr1    54948358      C->S       C->G          AGC->AGG       Ser->Arg            C1orf175      S (DNA/cDNA)
chr21   46376814      G->R       G->A          GCC->ACC       Ala->Thr            COL6A2        S (DNA/cDNA)
chr3    122893679     G->R       G->A          CGT->TGT       Arg->Cys            GOLGB1        S (DNA/cDNA)
chr6    112600531     G->R       G->A          ACA->ATA       Thr->Ile            LAMA4         S (DNA/cDNA)
[Figures imgf000038_0001 and imgf000039_0001: parts of two rows are printed as images in the source]
Table 9. Primers used for the 30 mutations
Primer name              Sequence (5' to 3')         Number of bases
114177056-DNA-Forward    CTTTACCCTTTCACTGCATCAAC     23
114177056-DNA-Reverse    TTTTAATCCATTTTCTCACAAGCA    24
6441548-DNA-Forward      CAACTTCCTGTCTTCACTTCCTCT    24
6441548-DNA-Reverse      CATGTGGCATATTTACCAATGTC     23
40032096-DNA-Forward     ACTTGTACAGGTGTCTTCACTTGG    24
40032096-DNA-Reverse     AAGGAGTTGCTGACTTCTGACTCT    24
55603021-DNA-Forward     ATCTTCTTCCTCATCACGGATTTA    24
55603021-DNA-Reverse     CTACTTCCTCTTTCCTCCTCCAG     23
71388363-DNA-Forward     TATACTGGATGACCAACTCAAAGC    24
71388363-DNA-Reverse     AGAACCAACTCTCTATAGCCCAGA    24
74307400-DNA-Forward     GTTGAGATTCCTCTTCCCATTCT     23
74307400-DNA-Reverse     ATAATTTAAGGTGTGCGATTGCTT    24
4360692-DNA-Forward      ACATCTTGGCTGTGAGACCAC       21
4360692-DNA-Reverse      CTCACTCTGCCACAAAACACCT      22
[Figures imgf000040_0001 to imgf000040_0003: the primer rows for the remaining mutation sites, up to 77587208-DNA-Forward, are printed as a rotated page image; not reliably recoverable from the extracted text]
77587208-DNA-Reverse     AAGACTATGCTGAACCGAATCC      22
76834796-DNA-Forward     CACCTCCTTCCCAGGTTTTT        20
76834796-DNA-Reverse     CTTTGGACCCTGTCCTCAGA        20
54948358-DNA-Forward     GATAACTTGAGACATGACCCAGAA    24
54948358-DNA-Reverse     AACAATCAAGATGGAGAGGTAAGC    24
46376814-DNA-Forward     AGAGCTGTCCTTCGTGTTCCT       21
46376814-DNA-Reverse     CCGCTTAGCACCATGGAC          18
122893679-DNA-Forward    CTTTTGCTGAATGTTTTCCTTTTT    24
122893679-DNA-Reverse    GCAAGAGGCTGATATTCAAAATTC    24
112600531-DNA-Forward    tatCAACAGCCCCTTCTTGG        20
112600531-DNA-Reverse    ATGAGACCCGCACTCTGTTT        20
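The base counts in Table 9 can be checked mechanically, which is useful when a table has been re-keyed from a damaged source. A minimal sketch using three rows copied from the table; the function name is illustrative.

```python
def check_primer_lengths(rows):
    """Return the names of primers whose sequence length differs from the stated count.

    rows: iterable of (name, sequence, stated_number_of_bases)
    """
    return [name for name, seq, stated in rows if len(seq) != stated]


if __name__ == "__main__":
    table_rows = [
        ("114177056-DNA-Forward", "CTTTACCCTTTCACTGCATCAAC", 23),
        ("46376814-DNA-Reverse", "CCGCTTAGCACCATGGAC", 18),
        ("112600531-DNA-Forward", "tatCAACAGCCCCTTCTTGG", 20),
    ]
    print(check_primer_lengths(table_rows))
```

An empty list means every stated count matches its sequence.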
(3) None of the samples carried P53 or PTEN mutations, although these two genes are the ones most strongly associated with prostate cancer in the COSMIC database. Most of the mutated genes had not previously been reported in prostate cancer, but 118 of them have been found mutated in other tumors, suggesting that mutations in these genes may also contribute to prostate cancer.
The invention provides 183 mutations that can serve as diagnostic markers, as markers of prognosis and drug efficacy, and as therapeutic targets; see Table 3.
Example 5. Discovery and validation of alternative splicing
Our method for detecting alternative splicing has two main steps:
1) We used SOAPsplice 1.1 to map the reads to the human reference sequence and then located splice sites from the alignments of junction reads (reads that correspond to two or more separate segments of the reference sequence, separated by introns). We kept the SOAPsplice default parameters wherever possible: up to 3 mismatches for reads aligned in full, and only 1 mismatch per segment for reads aligned in segments.
2) Based on the mechanisms of alternative splicing, we used the splice sites and the alignment results to detect the four basic types of alternative splicing: exon skipping, alternative 5' splice sites, alternative 3' splice sites and intron retention.
After identifying the four types of alternative splicing, we selected the events present in cancer tissue but absent from adjacent normal tissue. For each cancer tissue specimen we counted the junction reads supporting the corresponding junction for three event types (exon skipping, alternative 5' splice site and alternative 3' splice site), and computed the mean depth of the retained intron for intron retention events. Because each type of event is very numerous, we took the 0.99 percentile to obtain high-confidence events and drew Circos plots to reveal common patterns. Sample 1T, for example, had 2,047 alternative 3' splice sites, supported by between 1 and 609 junction reads, with a 0.99 percentile of 69. We therefore kept the alternative 3' splice sites with more than 69 junction reads. We also removed events that were present in adjacent normal tissue as well. This left a set of high-confidence, cancer-specific alternative splicing events for each sample.
RT-PCR validation of alternative splicing. We extracted total RNA from frozen cancer and adjacent tissues and reverse transcribed 5 µg of RNA into cDNA (Qiagen QuantiTect Reverse Transcription kit). We validated the alternative splicing events by RT-PCR in 40 pairs of cancer and adjacent normal tissues.
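The percentile filter described in the paragraph above can be sketched as follows. The text does not say how the percentile is computed, so the nearest-rank definition used here is an assumption, and the function names and site identifiers are hypothetical.

```python
def percentile_nearest_rank(values, percent: int):
    """Nearest-rank percentile: the smallest value with at least percent% of data <= it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    ordered = sorted(values)
    # Ceiling of percent * n / 100 in integer arithmetic, to avoid float rounding.
    rank = max(1, -(-percent * len(ordered) // 100))
    return ordered[rank - 1]


def high_confidence_events(junction_support: dict, percent: int = 99) -> dict:
    """Keep events whose supporting junction read count exceeds the percentile."""
    threshold = percentile_nearest_rank(list(junction_support.values()), percent)
    return {site: n for site, n in junction_support.items() if n > threshold}


if __name__ == "__main__":
    # Hypothetical sample with 100 splice sites supported by 1..100 junction reads.
    support = {f"site{i}": i for i in range(1, 101)}
    print(high_confidence_events(support))
```

For sample 1T in the text, the threshold would be 69, and only sites with more than 69 junction reads would be retained.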
The PCR conditions were: […] seconds; 60 °C for 30 seconds; 72 °C for 90 seconds; 33-36 cycles. The primers for two genes in particular are as follows:
Table 10. Amplification primers for PSA and AMACR alternative splicing
Alternative splicing   Forward primer             Reverse primer
PSA                    CCAAGTTCATGCTGTGTGCT       TGCCTAGTAACCGTGTGCTG
AMACR                  GGGAAAATCCAAGGCTTATTTATG   AAGTCGTATAGAAAGGTGCTCCAC
The invention provides the tumor-specific alternative splicing events shown in Table 4. These events can serve as diagnostic markers in blood, urine and tissue, as markers for assessing prognosis and treatment response, and as targets for tumor therapy.
Intron retention in the KLK3 gene (also known as PSA) was found in more than half of the prostate cancer samples, and exon skipping in the AMACR gene was found in a subset of them. Both events were validated by RT-PCR in the sequenced cohort. We also validated them by RT-PCR in 40 paired samples (40 samples from Changhai Hospital): the great majority of cancer tissue samples showed PSA intron retention, whereas almost none of the adjacent tissues did. Only 9 of the 40 cancer tissue samples showed AMACR exon skipping.
Although specific embodiments of the invention have been described in detail, those skilled in the art will understand that, in light of all the teachings disclosed, various modifications and substitutions may be made to those details, and that all such changes fall within the scope of protection of the invention. The full scope of the invention is given by the appended claims and any equivalents thereof.

Claims

1. A biomarker for prostate cancer, comprising one or more of the fusion genes shown in Table 1, the long non-coding RNAs shown in Table 2, the gene mutations shown in Table 3, and the alternative splicing events shown in Table 4.
2. The biomarker of claim 1, which can be used as an early diagnostic marker for prostate cancer, a marker for assessing the efficacy of drug treatment, or a patient prognosis marker.
3. The biomarker of claim 1 or 2, wherein the fusion genes comprise one or more of the 83 fusion genes in Table 6, preferably one or more of the 35 fusion genes underlined in Table 6.
4. The biomarker of claim 1 or 2, wherein the fusion genes comprise one or more of USP9Y-TTTY15, CTAGE5-KHDRBS3, RAD50-PDLIM4 and SDK1-AMACR; preferably, the fusion genes USP9Y-TTTY15, CTAGE5-KHDRBS3, RAD50-PDLIM4 and SDK1-AMACR are amplified with the primers set out in Table 5.
5. The biomarker of claim 1 or 2, wherein the long non-coding RNAs comprise one or more of DD3, MALAT1, FR0257520 and FR0348383; preferably, the long non-coding RNAs DD3, MALAT1, FR0257520 and FR0348383 are amplified with the primers set out in Table 7.
6. The biomarker of claim 1 or 2, wherein the gene mutations comprise one or more of the 30 gene mutations shown in Table 8; preferably, the 30 gene mutations shown in Table 8 are amplified with the primers set out in Table 9.
7. The biomarker of claim 1 or 2, wherein the alternative splicing comprises that of PSA or AMACR; preferably, the PSA or AMACR alternative splicing event is amplified with the primers set out in Table 10.
8. Use of the biomarker of any one of claims 1-7 as a target of a reagent for diagnosing prostate cancer or of a drug for treating prostate cancer, in particular as an early diagnostic marker for prostate cancer, a marker for assessing the efficacy of drug treatment, or a patient prognosis marker.
9. Use of primers for amplifying the biomarker of any one of claims 1-7, or of probes for said biomarker, in the preparation of a reagent for diagnosing prostate cancer.
10. The use of claim 9, wherein the primers comprise: the primers set out in Table 5, for the fusion genes USP9Y-TTTY15, CTAGE5-KHDRBS3, RAD50-PDLIM4 and SDK1-AMACR; the primers shown in Table 7, for amplifying the long non-coding RNAs DD3, MALAT1, FR0257520 and FR0348383; the primers shown in Table 9, for amplifying the 30 gene mutations shown in Table 8; and the primers shown in Table 10, for amplifying the PSA or AMACR alternative splicing event.
11. Use of the primers set out in Table 5 in the preparation of a reagent for diagnosing prostate cancer.
12. Use of the primers shown in Table 7 in the preparation of a reagent for diagnosing prostate cancer.
13. Use of the primers shown in Table 9 in the preparation of a reagent for diagnosing prostate cancer.
14. Use of the primers shown in Table 10 in the preparation of a reagent for diagnosing prostate cancer.
PCT/CN2011/079709 2011-09-16 2011-09-16 Prostate cancer biomarkers, therapeutic targets and uses thereof WO2013037118A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201180073445.7A CN103797120B (en) 2011-09-16 2011-09-16 Prostate cancer biomarkers, therapeutic targets and uses thereof
PCT/CN2011/079709 WO2013037118A1 (en) 2011-09-16 2011-09-16 Prostate cancer biomarkers, therapeutic targets and uses thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2011/079709 WO2013037118A1 (en) 2011-09-16 2011-09-16 Prostate cancer biomarkers, therapeutic targets and uses thereof

Publications (1)

Publication Number Publication Date
WO2013037118A1 true WO2013037118A1 (en) 2013-03-21

Family

ID=47882537

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2011/079709 WO2013037118A1 (en) 2011-09-16 2011-09-16 Prostate cancer biomarkers, therapeutic targets and uses thereof

Country Status (2)

Country Link
CN (1) CN103797120B (en)
WO (1) WO2013037118A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101018874A (en) * 2004-08-13 2007-08-15 千年药品公司 Genes, compositions, kits, and methods for identification, assessment, prevention, and therapy of prostate cancer
CN101027099A (en) * 2004-02-16 2007-08-29 蛋白质系统股份公司 Diagnostic marker for cancer
WO2008096375A2 (en) * 2007-02-07 2008-08-14 Decode Genetics Ehf. Genetic variants contributing to risk of prostate cancer
CN101675341A (en) * 2006-12-19 2010-03-17 萨里大学 Cancer biomarkers
WO2010037735A1 (en) * 2008-10-01 2010-04-08 Noviogendix Research B.V. Molecular markers in prostate cancer

Also Published As

Publication number Publication date
CN103797120A (en) 2014-05-14
CN103797120B (en) 2017-04-12

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 11872461; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 11872461; Country of ref document: EP; Kind code of ref document: A1)