WO2018001295A1 - Molecular marker, reference gene, and application and test kit thereof, and method for constructing testing model - Google Patents

Info

Publication number
WO2018001295A1
Authority
WO
WIPO (PCT)
Prior art keywords
value
breast cancer
risk
internal reference
recurrence
Prior art date
Application number
PCT/CN2017/090740
Other languages
French (fr)
Chinese (zh)
Inventor
郭弘妍
孙义民
王亚辉
谢展
邢婉丽
程京
邓涛
张治位
Original Assignee
博奥生物集团有限公司
北京博奥医学检验所有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 博奥生物集团有限公司, 北京博奥医学检验所有限公司 filed Critical 博奥生物集团有限公司
Priority to JP2018568674A priority Critical patent/JP2019527544A/en
Priority to SG11201811263WA priority patent/SG11201811263WA/en
Publication of WO2018001295A1 publication Critical patent/WO2018001295A1/en

Classifications

    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/11DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids

Definitions

  • The invention relates to the field of biotechnology, in particular to molecular markers, internal reference genes and their applications, a detection kit, and a method for constructing a detection model.
  • Breast cancer is a highly heterogeneous tumor with many prognostic factors. Breast cancer patients with the same clinical stage, histological grade and hormone receptor expression who receive the same treatment plan may still have different prognoses. How to accurately determine the prognosis of breast cancer patients and formulate corresponding individualized treatment programs, so as to avoid the harm and burden caused by over-treatment and improper treatment, is an urgent clinical problem.
  • The present invention provides molecular markers and their applications, detection kits, and methods for constructing detection models.
  • In prognosis evaluation of breast cancer, the kit performs better than clinical pathological evaluation. To some extent it can reduce the over-treatment and improper treatment caused by pathological diagnosis errors, it meets the need for individualized and precise treatment of breast cancer patients, and it further improves the technical methods for predicting breast cancer prognosis in China.
  • the present invention provides the following technical solutions:
  • the invention provides genetic compositions, including the molecular markers MAPT and/or MS4A1.
  • the present invention provides a genetic composition consisting of the molecular markers BCL2, PGR, SCUBE2, ESR1, MKi67, CCNB1, MYBL2, GRB7, ERBB2, MMP11, CD68, BAG1, MAPT and MS4A1.
  • the genetic composition further comprises the internal reference genes ACTB, GAPDH, GUSB, NUP214, VCAN.
  • the invention also provides the use of the genetic composition for the detection of a 3-10 year postoperative recurrence and/or mortality risk prediction for breast cancer.
  • In this application, the risk of recurrence and/or death 3-10 years after breast cancer surgery is assessed as follows: total RNA is obtained from the sample to be tested, and cDNA is obtained by reverse transcription.
  • The Ct values of the molecular markers and the internal reference genes are obtained by real-time PCR. The Ct values of the internal reference genes are averaged to give the mean Ct value (Ct') of the internal reference gene combination, and Ct' is then subtracted from the Ct value of each molecular marker to normalize it, giving ΔCt.
  • The ΔCt values, together with the subject's age, pT value and LN value, are analyzed with the 3-10 year recurrence or death risk prediction model built by the random forest algorithm, and the result is obtained. Here the pT value is the pathological stage and the LN value is the number of lymph node metastases.
  • the value obtained by the analysis is compared with a threshold value, and the result is obtained, the threshold value being 5.
  • the value obtained by the analysis ⁇ 5 is a good prognosis, and the value obtained by the analysis ⁇ 5 is a poor prognosis.
  • In this application, the 3-10 year breast cancer recurrence or death risk assessment model is constructed as follows: the ΔCt values of the molecular markers of the samples and the age, pT value and LN value of the subjects are assembled into a mathematical matrix; 1/2 of the samples are randomly selected as the training set and 1/2 as the validation set; and the random forest algorithm establishes a prediction model with 10,000 decision trees. Random sampling is performed ⁇ 1000 times in total, establishing ⁇ 1000 prediction models. From these ⁇ 1000 prediction models, the ⁇ 39 preferred models with the highest coincidence rate with the follow-up information are selected as sub-models of the final model, and the median of the ⁇ 39 sub-models is used as the final prognostic risk prediction value.
  • A random forest is composed of many decision trees. Because each decision tree is constructed with double randomness, over both attributes and samples, it is also called a random decision tree. In a random forest there is no correlation between the decision trees. When test data enter the random forest, each decision tree classifies them, and the class chosen by the most decision trees is the final result; that is, the decision trees "vote". In other words, a random forest is a classifier containing multiple decision trees, and its output category is the mode of the categories output by the individual trees.
  • The risk threshold is 5.
  • the test sample in the present invention is an FFPE sample of a newly diagnosed early or mid-term ER or PR positive breast cancer patient.
  • the present invention also provides a primer set for amplifying the gene composition, the sequence of which is shown in SEQ ID No. 1 to SEQ ID No. 28.
  • the present invention also provides a probe set for amplifying the gene composition, the sequence of which is shown in SEQ ID No. 29 to SEQ ID No. 42.
  • the present invention also provides a primer set for amplifying an internal reference gene of the gene composition, as shown in SEQ ID No. 43 to SEQ ID No. 47.
  • the present invention also provides a probe set for amplifying an internal reference gene of the gene composition, as shown in SEQ ID No. 48 to SEQ ID No. 52.
  • the invention also provides a test kit for predicting the risk of recurrence and/or mortality of breast cancer after 3-10 years, including the primer set and/or the probe set and reagents commonly used in the kit.
  • The invention also provides a method for constructing a risk assessment model for recurrence or death 3-10 years after breast cancer surgery: the ΔCt values of the molecular markers of the samples and the age, pT value and LN value of the subjects are assembled into a mathematical matrix; 1/2 of the samples are randomly selected as the training set and 1/2 as the validation set; and a prediction model with 10,000 decision trees is established by the random forest algorithm. Random sampling is performed ⁇ 1000 times in total, establishing ⁇ 1000 prediction models. From these ⁇ 1000 prediction models, the ⁇ 39 preferred models with the highest coincidence rate with the follow-up information are selected as sub-models of the final model, and the median of the ⁇ 39 sub-models is used as the final prognostic risk prediction value.
  • The invention also provides a method for evaluating the risk of recurrence or death 3-10 years after breast cancer surgery: total RNA is obtained from the sample to be tested; cDNA is obtained by reverse transcription; and the Ct values of the molecular markers and the internal reference genes are obtained by fluorescence quantitative PCR. The Ct values of the internal reference genes are averaged to give the mean Ct value (Ct') of the internal reference gene combination, and Ct' is then subtracted from the Ct value of each molecular marker to normalize it, giving ΔCt. The ΔCt values and the age, pT value and LN value of the subject are analyzed with the 3-10 year postoperative recurrence or death risk prediction model built by the random forest algorithm, giving the 3-10 year recurrence or death risk value. Based on the risk threshold (a risk threshold of 5), the prognosis is predicted to be good or poor.
  • the sample to be tested in the present invention is an FFPE sample of a newly diagnosed early or mid-stage ER or PR positive breast cancer patient.
  • The technical solution of the present invention includes: (1) selecting, through literature and database research, 192 breast cancer related candidate genes (not limited to those related to breast cancer prognosis, and including internal reference genes) and customizing a TLDA gene expression detection chip (Applied Biosystems); (2) systematically collecting complete demographic data, clinical data and follow-up data (time of recurrence and metastasis, survival time), selecting FFPE samples from newly diagnosed, untreated early- and mid-stage ER- or PR-positive breast cancer patients, detecting the 192 genes with the customized TLDA chip, and screening for molecular markers related to breast cancer prognosis; (3) screening the candidate molecular markers and reference genes in independent samples and constructing the model with the random forest algorithm.
  • the present invention provides a prognostic evaluation gene detection system for recurrence or death 3 to 10 years after surgery in a newly diagnosed early or mid-stage ER or PR positive breast cancer patient.
  • FFPE: Formalin-Fixed and Paraffin-Embedded
  • The expression Ct values of the 14 molecular markers and 5 internal reference genes for breast cancer prognosis are detected by PCR.
  • A predictive model that takes the Ct values together with the subject's age, pT value and LN number determines whether a patient with ER- or PR-positive early-stage breast cancer has a good or poor prognosis with respect to recurrence or death 3-10 years after surgery. Compared with the follow-up information, the system achieved an accuracy of 70%. Apart from patient age, pT stage and LN number, it does not rely on other clinicopathological information.
  • For newly diagnosed breast cancer patients at low risk of recurrence or death within 3-10 years, the kit provided by the invention has a prediction accuracy of 81.1%, against a pathological prediction accuracy of 71.9%. For newly diagnosed breast cancer patients at high risk of recurrence or death within 3-10 years, the kit has a prediction accuracy of 54.4%, close to the corresponding pathological prediction accuracy of 56.8%.
  • the kit has a concordance rate of 70%. Except for patient age, pT stage, and LN number, there is no need to rely on other clinical pathological information.
  • the detection system and the kit are superior to the clinical pathological prediction results in the prognosis evaluation performance of breast cancer, and can reduce the excessive treatment and improper treatment caused by pathological diagnosis errors to meet the individualized precision treatment of breast cancer patients.
  • Figure 1 shows the results of correlation analysis between the internal reference gene and the test gene
  • Figure 2 shows the establishment of a risk assessment model for 3-10 years of recurrence or death in breast cancer.
  • The invention discloses molecular markers, internal reference genes and their applications, a detection kit and a method for constructing the detection model. Those skilled in the art can draw on the contents herein and adjust the process parameters appropriately. All such alternatives and modifications are obvious to those skilled in the art and are considered to be included in the present invention.
  • The method and application of the present invention have been described through preferred embodiments. The method and application described herein may obviously be modified, or appropriately altered and combined, to implement and apply the technique of the present invention without departing from its scope.
  • Presence or absence of LN metastasis, and the number of LN metastases
  • Total RNA is subjected to reverse transcription reaction to obtain a cDNA sample
  • Differences in the expression of internal reference genes and other genes between 26 good-prognosis and 26 poor-prognosis breast cancer samples were determined, and the internal reference genes and differentially expressed genes were selected.
  • Candidate molecular markers were verified in a large number of samples by reverse-transcription fluorescence quantitative PCR.
  • The final screen yielded 14 genes and 5 internal reference genes for evaluating breast cancer prognosis (BCL2, PGR, SCUBE2, ESR1, MKi67, CCNB1, MYBL2, GRB7, ERBB2, MMP11, CD68, BAG1, MAPT, MS4A1; ACTB, GAPDH, GUSB, NUP214, VCAN).
  • Diagnostic kits include primers for these genes, probes, and other conventional reagents for qRT-PCR.
  • The kit further comprises a predictive model in which the expression levels of BCL2, PGR, SCUBE2, ESR1, MKi67, CCNB1, MYBL2, GRB7, ERBB2, MMP11, CD68, BAG1, MAPT and MS4A1 are measured with the mean value of ACTB, GAPDH, GUSB, NUP214 and VCAN as the internal reference. These expression levels are combined with the clinical information of age, pT and LN of the breast cancer patient to evaluate comprehensively the risk of recurrence or death 3-10 years after surgery and to classify the prognosis as good or poor.
  • the random forest algorithm in the machine learning method was used to evaluate the risk of recurrence or death after 3-10 years of postoperative detection, and a gene detection model for breast cancer prognosis evaluation was established.
  • The present invention provides prognostic evaluation genes for breast cancer in China. At present, similar foreign products were developed on European and American populations, and gene expression differs between ethnic groups. In the present invention 19 genes were screened, of which MAPT and MS4A1 are genes found, in Chinese female breast cancer patients, to be related to the assessment of recurrence or death 3-10 years after surgery. Although these genes have been reported to be associated with breast cancer, no report directly relating them to breast cancer prognosis has been found.
  • the present invention establishes a new internal reference gene combination different from other inventions and products, and the gene combination is less affected by the RNA quality in the FFPE sample, so that the detection result of the molecular marker is more reliable.
  • A predictive model built with the random forest algorithm is used for comprehensive analysis. The model predicts the risk of recurrence or death 3-10 years after surgery for newly diagnosed early- and mid-stage ER+ or PR+ breast cancer.
  • The present invention provides a prognostic gene detection system for evaluating recurrence or death 3-10 years after surgery in untreated stage I and stage II ER- or PR-positive breast cancer patients. Compared with the follow-up information, the system achieved an accuracy of 70%. Apart from patient age, pT stage and LN number, it does not rely on other clinicopathological information.
  • the materials and reagents used in the molecular markers, internal reference genes and their applications, detection kits, and detection methods for the detection models provided by the present invention are all commercially available.
  • TLDA: TaqMan Low Density Array
  • Example 2 TLDA chip screening for molecular markers and internal reference genes
  • RNA extraction from FFPE samples: four 20 μm sections or eight 10 μm sections were taken per sample, and RNA was extracted according to the instructions of the High Pure FFPET RNA Isolation Kit (Roche). The extracted RNA was quantified with a NanoDrop-2000, and downstream reverse transcription was performed after quality control.
  • Reverse transcription of RNA to obtain cDNA samples: 1 μg of total RNA was reverse transcribed according to the instructions of the VILO™ Master Mix kit (Invitrogen).
  • 192 breast cancer candidate genes were detected by TLDA detection technology.
  • Gene expression differences between 26 good-prognosis and 26 poor-prognosis breast cancer samples were measured, and differentially expressed genes were screened.
  • Internal reference genes were screened with algorithms based on geNorm, BestKeeper, NormFinder and delta Ct, while also considering the biological function of the less variable genes and their relationship with tumors, to select candidate internal reference genes. The correlation between the mean Ct of every combination of candidate internal reference genes and the mean Ct of all 192 genes was then calculated, and the most highly correlated combination was taken as the internal reference genes: ACTB, GAPDH, GUSB, NUP214 and VCAN.
  • Candidate gene screening criteria: (1) overall analysis: good prognosis versus poor prognosis, the difference between the two groups is 2 times or less, and the proportion of cases with Ct ⁇ 35 is 50%; (2) stratified analysis: in the group without lymph node metastasis, good prognosis versus poor prognosis, the difference between the two groups is more than 2 times and the statistical difference is ⁇ 0.05; (3) the difference between the two groups is not significant, but the gene has been reported in connection with breast cancer prognosis and the proportion of cases with Ct ⁇ 35 reaches 90%.
  • The genes satisfying the above criteria include: BCL2, PGR, SCUBE2, ESR1, MKi67, CCNB1, MYBL2, GRB7, ERBB2, MMP11, CD68, BAG1, MAPT and MS4A1.
  • RNA extraction from 289 FFPE samples: four 20 μm sections or eight 10 μm sections were taken per sample, and RNA was extracted according to the instructions of the High Pure FFPET RNA Isolation Kit (Roche). The extracted RNA was quantified with a NanoDrop-2000, and downstream reverse transcription was performed after quality control.
  • Table 4 housekeeping gene qRT-PCR primer sequence
  • Table 5 housekeeping gene qRT-PCR probe sequence
  • Example 4 Breast cancer prognosis 3-10 years recurrence or death risk prediction model establishment
  • the random forest algorithm in the machine learning method was used to evaluate the risk of recurrence or death after 3-10 years of postoperative detection, and a gene detection model for breast cancer prognosis evaluation was established.
  • A random forest is composed of many decision trees. Because each decision tree is constructed with double randomness, over both attributes and samples, it is also called a random decision tree. In a random forest there is no correlation between the decision trees. When test data enter the random forest, each decision tree classifies them, and the class chosen by the most decision trees is the final result; that is, the decision trees "vote". In other words, a random forest is a classifier that contains multiple decision trees, and its output category is the mode of the categories output by the individual trees.
  • pT staging was stage 1 or 2. The patients underwent surgery between 2004 and 2008 and were followed up from 2011 to 2015, giving a follow-up period of 3-10 years.
  • Total RNA was extracted from the above 19 FFPE samples with the High Pure FFPET RNA Isolation Kit (Roche). After quality control, the RNA was reverse transcribed to obtain cDNA samples. The cDNA products were subjected to qRT-PCR to detect the internal reference genes ACTB, GAPDH, GUSB, NUP214 and VCAN, and the genes BCL2, PGR, SCUBE2, ESR1, MKi67, CCNB1, MYBL2, GRB7, ERBB2, MMP11, CD68, BAG1, MAPT and MS4A1.
  • the detection system used FFPE samples of 289 newly diagnosed breast cancer patients with known clinical follow-up data collected by Tianjin Medical University Cancer Hospital and Henan Cancer Hospital. Five internal reference genes and 14 molecular markers were detected.
  • For newly diagnosed breast cancer patients at low risk of recurrence or death within 3-10 years, the kit provided by the invention has an accuracy of 81.1%, against a pathological detection accuracy of 71.8%. For newly diagnosed breast cancer patients at high risk of recurrence or death within 3-10 years, the kit has an accuracy of 54.4%, close to the corresponding pathological detection accuracy of 56.8%.
  • the kit has a concordance rate of 70%. Except for patient age, pT stage, and LN number, there is no need to rely on other clinical pathological information.
  • The detection system and the kit outperform clinical pathological diagnosis in prognosis evaluation of breast cancer. To a certain extent they can reduce the over-treatment and improper treatment caused by pathological diagnosis errors, meet the need for individualized precision treatment of breast cancer patients, and further improve the technical methods for predicting breast cancer prognosis in China.
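The model-construction procedure described in the entries above (a random 1/2 training and 1/2 validation split, repeated sampling, selection of the models that agree best with follow-up, and the median of the selected sub-models as the final value) can be sketched in Python as follows. This is only an outline under stated assumptions: `train_model` is a hypothetical placeholder for fitting a 10,000-tree random forest, the defaults 1000 and 39 reuse the figures in the text without the comparison signs that were lost, and scoring agreement by comparing each model output with the threshold of 5 is an assumption rather than something the source specifies.

```python
import random
from statistics import median


def build_ensemble(rows, labels, train_model, n_rounds=1000, n_keep=39,
                   threshold=5, seed=0):
    """Repeat a random half/half split and keep the best-agreeing models.

    rows        : list of feature rows (delta-Ct values, age, pT, LN).
    labels      : list of bools, assumed True for recurrence or death.
    train_model : hypothetical callable (train_rows, train_labels) -> model,
                  where model(row) returns a numeric risk value. It stands in
                  for the random forest fit, which is not reproduced here.
    """
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    scored = []
    for _ in range(n_rounds):
        rng.shuffle(indices)
        half = len(indices) // 2
        train, valid = indices[:half], indices[half:]
        model = train_model([rows[i] for i in train],
                            [labels[i] for i in train])
        # Coincidence rate: share of validation cases whose predicted class
        # (risk value at or above the threshold, an assumption) matches follow-up.
        hits = sum((model(rows[i]) >= threshold) == labels[i] for i in valid)
        scored.append((hits / len(valid), model))
    # Sort on the score only, so the models themselves are never compared.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [model for _, model in scored[:n_keep]]


def predict_risk(submodels, row):
    """Final risk value: the median of the sub-model outputs."""
    return median(model(row) for model in submodels)
```

With a real random forest library, `train_model` would wrap the fitting call and return its prediction function; everything else in the sketch is independent of the learner.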

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Organic Chemistry (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biotechnology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Physics & Mathematics (AREA)
  • Microbiology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Plant Pathology (AREA)
  • Analytical Chemistry (AREA)
  • Immunology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

Provided are a molecular marker, a reference gene, an application and test kit thereof, and a method for constructing a testing model. Compared against follow-up information, the accuracy of the provided test kit in predicting the risk of recurrence or death 3-10 years after surgery in patients initially diagnosed with ER- or PR-positive breast cancer is 70%, and the accuracy in the low-risk group and the high-risk group is 81.1% and 54.4%, respectively. The corresponding accuracies of the FFPE pathology prediction are 71.9% and 56.8%, respectively. The risk prediction model supporting the test kit requires only the Ct values of the molecular markers, the patient's age, the pT stage, and the LN quantity, and does not rely on other clinical pathology information; the model provides a cancer prognosis assessment that is better than the pathology prediction alone and reduces, to a certain extent, the occurrence of improper treatment caused by erroneous pathology prediction, thus further improving the technical method for cancer prognosis assessment.
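As a minimal sketch of how the model inputs named above could be prepared, the Python below averages the Ct values of the five internal reference genes into Ct', subtracts Ct' from each marker's Ct to obtain ΔCt, and appends age, pT stage and LN count. The gene lists and the threshold of 5 come from this document; the function names, the dictionary layout of the Ct input, and the choice to treat a value exactly equal to 5 as a poor prognosis are assumptions made only for illustration.

```python
from statistics import mean

# Gene lists as given in this document.
MARKERS = ["BCL2", "PGR", "SCUBE2", "ESR1", "MKi67", "CCNB1", "MYBL2",
           "GRB7", "ERBB2", "MMP11", "CD68", "BAG1", "MAPT", "MS4A1"]
REFERENCES = ["ACTB", "GAPDH", "GUSB", "NUP214", "VCAN"]
RISK_THRESHOLD = 5  # threshold stated in this document


def delta_ct(ct):
    """Return {marker: Ct_marker - Ct'}, where Ct' is the mean reference Ct."""
    ct_ref = mean(ct[gene] for gene in REFERENCES)
    return {gene: ct[gene] - ct_ref for gene in MARKERS}


def feature_row(ct, age, pt_stage, ln_count):
    """Model input: 14 delta-Ct values followed by age, pT stage and LN count."""
    normalized = delta_ct(ct)
    return [normalized[gene] for gene in MARKERS] + [age, pt_stage, ln_count]


def classify(risk_value, threshold=RISK_THRESHOLD):
    """Map a risk value to a prognosis label.

    Assumption: a value exactly at the threshold counts as poor prognosis;
    the source text does not preserve how equality is handled.
    """
    return "good prognosis" if risk_value < threshold else "poor prognosis"
```

The row returned by `feature_row` has 17 entries, one per marker plus the three clinical variables, which is the only information the abstract says the model needs.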

Description

Molecular marker, internal reference gene and application thereof, detection kit, and method for constructing a detection model
This application claims priority to Chinese patent application No. 201610509983.0, filed with the Chinese Patent Office on June 30, 2016 and entitled "Molecular Markers, Internal Reference Genes and Their Applications, Detection Kits, and Detection Model Construction Methods", the entire contents of which are incorporated herein by reference.
Technical field
The invention relates to the field of biotechnology, in particular to molecular markers, internal reference genes and their applications, a detection kit, and a method for constructing a detection model.
Background
Breast cancer is one of the main threats to the life and health of women worldwide. The US cancer statistics released by the American Cancer Society in 2013 show that breast cancer ranks first in incidence and second in mortality among cancers in women. The latest data from the US National Cancer Center show 232,340 new cases of breast cancer and 39,620 deaths among American women in 2013. In the United States, on average one in every eight women develops breast cancer. Although China is a country with a low incidence of breast cancer, its incidence and mortality have risen markedly in recent years. Of the 1.3 million breast cancer patients newly diagnosed worldwide each year, about 15% are from China. Statistics from the China Breast Cancer Network show that new breast cancer cases in China increase by 3%-4% each year, 1%-2% above the world level, and that breast cancer ranks first in incidence among tumors in women. Technologies for breast cancer prevention, diagnosis, prognosis and individualized treatment urgently need to be developed.
Breast cancer is a highly heterogeneous tumor with many prognostic factors. Breast cancer patients with the same clinical stage, histological grade and hormone receptor expression who receive the same treatment plan may still have different prognoses. How to accurately determine the prognosis of breast cancer patients and formulate corresponding individualized treatment programs, so as to avoid the harm and burden caused by over-treatment and improper treatment, is an urgent clinical problem.
With the rapid development of molecular biology, it has become possible to discover and detect breast cancer prognosis-related genes using molecular biology methods such as polymerase chain reaction (PCR), probe hybridization and gene chips. In 2002, Van't Veer et al. screened 117 breast cancer cases with DNA chip technology and found 70 genes related to breast cancer prognosis. In 2004, American scientists used RT-PCR to validate 675 breast cancer samples and obtained 21 prognosis-related genes; based on this research, Genomic Health developed the breast cancer prognosis product
Figure PCTCN2017090740-appb-000001
It is currently the only breast cancer prognostic test product recommended by the NCCN guidelines, ASCO clinical guidelines and StGallen's clinical consensus. In addition, Yasuto Naoi et al. used DNA chip technology to study cancer tissue samples from patients with ER-positive and node-negative breast cancer in the Japanese population, and found 95 prognostic-related genes. The Torsten O. Nielsen team found that 50 gene combinations provide more breast cancer prognostic predictions than clinical factors and immunohistochemical staining, and use FFPE samples instead of fresh or frozen samples for testing. Detect the sample range. Breast cancer prognosis related products from 2002 to 2013
Figure PCTCN2017090740-appb-000002
Mammaprint, ProsIgna TM and MapQuant Dx TM have successively obtained FDA and CE certification. However, these products are currently based on research and development in Europe and the United States. These products are not only expensive after entering China, but whether the genes and their detection models are applicable to the Chinese population has yet to be verified. Therefore, it is of great significance to develop a cost-effective prognostic detection technology for Chinese breast cancer.
Summary of the Invention
In view of this, the present invention provides molecular markers and their use, a detection kit, and a method for constructing a detection model. In breast cancer prognosis evaluation the kit outperforms clinicopathological evaluation. It can to some extent reduce over-treatment and improper treatment caused by errors in pathological diagnosis, meets the need for individualized precision treatment of breast cancer patients, and further improves the technical methods available in China for predicting breast cancer prognosis.
To achieve the above object, the present invention provides the following technical solutions:
The present invention provides a gene composition comprising the molecular markers MAPT and/or MS4A1.
The present invention provides a gene composition consisting of the molecular markers BCL2, PGR, SCUBE2, ESR1, MKi67, CCNB1, MYBL2, GRB7, ERBB2, MMP11, CD68, BAG1, MAPT and MS4A1.
In some specific embodiments of the present invention, the gene composition further comprises the internal reference genes ACTB, GAPDH, GUSB, NUP214 and VCAN.
The present invention also provides use of the gene composition in the preparation of a detection device for predicting the risk of recurrence and/or death 3-10 years after breast cancer surgery.
In some specific embodiments of the present invention, the assessment of the 3-10 year risk of recurrence and/or death in breast cancer prognosis is carried out as follows: total RNA of the sample to be tested is obtained and reverse transcribed into cDNA; the Ct values of the molecular markers and the internal reference genes are obtained by fluorescence quantitative PCR; the Ct values of the internal reference genes are averaged to give the mean Ct value of the internal reference gene combination (Ct'); the Ct value of each molecular marker is then normalized by subtracting Ct' from it to give ΔCt; and the ΔCt values, together with the subject's age, pT value and LN value, are analyzed by the prediction model for the risk of recurrence or death 3-10 years after breast cancer surgery, constructed with the random forest algorithm, to obtain the result. Here the pT value is the pathological stage and the LN value is the number of lymph node metastases. The value obtained by the analysis is compared with a threshold of 5: a value ≥5 indicates a good prognosis, and a value <5 indicates a poor prognosis.
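The normalization and thresholding steps described above can be sketched in a few lines of Python. This is a minimal illustration only: the function names `delta_ct` and `classify` and the dictionary layout of the Ct values are assumptions made for the example, and the random forest model that produces the risk value is not shown.

```python
from statistics import mean

# Gene lists as named in the text.
REFERENCE_GENES = ["ACTB", "GAPDH", "GUSB", "NUP214", "VCAN"]
MARKER_GENES = ["BCL2", "PGR", "SCUBE2", "ESR1", "MKi67", "CCNB1", "MYBL2",
                "GRB7", "ERBB2", "MMP11", "CD68", "BAG1", "MAPT", "MS4A1"]
RISK_THRESHOLD = 5  # threshold stated in the text


def delta_ct(ct_values):
    """Return {marker: Ct - Ct'}, where Ct' is the mean Ct of the reference genes.

    `ct_values` is a hypothetical dict mapping gene name -> measured Ct.
    """
    ct_ref = mean(ct_values[g] for g in REFERENCE_GENES)
    return {g: ct_values[g] - ct_ref for g in MARKER_GENES}


def classify(risk_value, threshold=RISK_THRESHOLD):
    """Map a model output to the label used in the text (>= threshold is good)."""
    return "good prognosis" if risk_value >= threshold else "poor prognosis"
```

In this sketch the 14 ΔCt values would be combined with age, pT value and LN value to form the input row of the prediction model.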
In some specific embodiments of the present invention, the model for assessing the 3-10 year risk of recurrence or death in breast cancer prognosis is constructed as follows: a mathematical matrix is built from the ΔCt values of the molecular markers of the samples and the subjects' age, pT value and LN value; 1/2 of the samples are randomly selected as the training set and 1/2 as the validation set; a prediction model containing 10,000 decision trees is built with the random forest algorithm; random sampling is performed ≥1000 times to build ≥1000 prediction models; from these, the ≥39 preferred models with the highest agreement with the follow-up information are selected as sub-models of the final model; and the median of the ≥39 sub-model outputs is used as the final prognostic risk prediction value.
A random forest consists of many decision trees. Each tree is built with double randomness, over both attributes and samples, and is therefore also called a random decision tree. The decision trees in a random forest are not correlated with one another. When test data enter the random forest, each decision tree classifies them, and the class chosen by the most trees is taken as the final result, i.e. the result of a "vote" among the trees. In other words, a random forest is a classifier containing multiple decision trees whose output class is the mode of the classes output by the individual trees. In the present invention, we optimized the traditional random forest algorithm: the samples are randomly sampled 1000 times to build 1000 models, the 39 preferred models with the highest accuracy and specificity are selected from the 1000 as sub-models of the final model, and the median of the 39 sub-model outputs is used as the final prediction result.
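The two aggregation rules contrasted in this paragraph, the majority vote of a conventional random forest and the median over selected sub-models, can be illustrated as follows. This is a sketch under stated assumptions: `majority_vote` and `ensemble_risk` are hypothetical helper names, and the inputs stand in for the outputs of trees or sub-models that are trained elsewhere.

```python
from collections import Counter
from statistics import median


def majority_vote(tree_labels):
    """Conventional random forest output: the most common class among the trees."""
    return Counter(tree_labels).most_common(1)[0][0]


def ensemble_risk(submodel_scores):
    """Median of the sub-model outputs, taken as the final risk value."""
    return median(submodel_scores)
```

Taking the median rather than the mean makes the final value less sensitive to a few sub-models that give extreme outputs, which is one plausible reason for the choice; the text itself does not state the rationale.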
The age, pT stage and number of LN metastases of an untreated, newly diagnosed patient with early or intermediate-stage ER- or PR-positive breast cancer, together with the Ct values of the 14 molecular markers BCL2, PGR, SCUBE2, ESR1, MKi67, CCNB1, MYBL2, GRB7, ERBB2, MMP11, CD68, BAG1, MAPT and MS4A1 and the 5 housekeeping genes ACTB, GAPDH, GUSB, NUP214 and VCAN, are input into the 39 prediction models for analysis. The analysis yields the 3-10 year risk value for recurrence or death, and the prognosis is predicted as good or poor according to the risk threshold (threshold of 5).
In some specific embodiments of the present invention, the sample to be tested is an FFPE sample from an untreated, newly diagnosed patient with early or intermediate-stage ER- or PR-positive breast cancer.
The present invention also provides a primer set for amplifying the gene composition, with the sequences shown in SEQ ID No. 1 to SEQ ID No. 28.
The present invention also provides a probe set for amplification of the gene composition, with the sequences shown in SEQ ID No. 29 to SEQ ID No. 42.
The present invention also provides a primer set for amplifying the internal reference genes of the gene composition, as shown in SEQ ID No. 43 to SEQ ID No. 47.
The present invention also provides a probe set for amplification of the internal reference genes of the gene composition, as shown in SEQ ID No. 48 to SEQ ID No. 52.
The present invention also provides a detection kit for predicting the risk of recurrence and/or death 3-10 years after breast cancer surgery, comprising the primer set and/or the probe set and reagents commonly used in kits.
The present invention also provides a method for constructing a model for assessing the 3-10 year risk of recurrence or death in breast cancer prognosis: a mathematical matrix is built from the ΔCt values of the molecular markers of the samples and the subjects' age, pT value and LN value; 1/2 of the samples are randomly selected as the training set and 1/2 as the validation set; a prediction model containing 10,000 decision trees is built with the random forest algorithm; random sampling is performed ≥1000 times to build ≥1000 prediction models; from these, the ≥39 preferred models with the highest agreement with the follow-up information are selected as sub-models of the final model; and the median of the ≥39 sub-model outputs is used as the final prognostic risk prediction value.
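The resampling and selection steps of this construction method can be sketched as below. The sketch covers only the bookkeeping: `half_split` and `select_submodels` are hypothetical names, the candidates are assumed to arrive as (agreement rate, model) pairs, and the training of each 10,000-tree random forest is deliberately left out.

```python
import random


def half_split(n_samples, rng):
    """Randomly split sample indices into two halves (training, validation)."""
    idx = list(range(n_samples))
    rng.shuffle(idx)
    half = n_samples // 2
    return idx[:half], idx[half:]


def select_submodels(candidates, k=39):
    """Keep the k candidates that agree best with the follow-up information.

    `candidates` is a list of (agreement_rate, model) pairs, one per random split.
    """
    ranked = sorted(candidates, key=lambda pair: pair[0], reverse=True)
    return [model for _, model in ranked[:k]]
```

A driver loop would call `half_split` at least 1000 times with a seeded `random.Random`, train one forest per split, score it on the validation half, and pass the resulting pairs to `select_submodels`.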
The present invention also provides a method for assessing the 3-10 year risk of recurrence or death in breast cancer prognosis: total RNA of the sample to be tested is obtained and reverse transcribed into cDNA; the Ct values of the molecular markers and the internal reference genes are obtained by fluorescence quantitative PCR; the Ct values of the internal reference genes are averaged to give the mean Ct value of the internal reference gene combination (Ct'); the Ct value of each molecular marker is then normalized by subtracting Ct' from it to give ΔCt; and the ΔCt values, together with the subject's age, pT value and LN value, are analyzed by the prediction model for the risk of recurrence or death 3-10 years after breast cancer surgery, constructed with the random forest algorithm. The result is the 3-10 year risk value for recurrence or death, and the prognosis is predicted as good or poor according to the risk threshold (risk threshold of 5).
The sample to be tested in the present invention is an FFPE sample from an untreated, newly diagnosed patient with early or intermediate-stage ER- or PR-positive breast cancer.
The technical solution of the present invention comprises: (1) based on literature and database research, 192 breast cancer-related candidate genes were selected (not limited to genes related to breast cancer prognosis, and including internal reference genes), and a custom TLDA gene expression detection chip (Applied Biosystems) was ordered; (2) complete demographic, clinical and follow-up data (time to recurrence or metastasis, survival time) were systematically collected, FFPE samples from untreated, newly diagnosed patients with early or intermediate-stage ER- or PR-positive breast cancer were selected, the 192 genes were measured on the custom TLDA chip, and internal reference genes and molecular markers related to breast cancer prognosis were screened; (3) the candidate molecular markers and internal reference genes obtained were validated in independent samples, a model predicting the risk of recurrence or death 3-10 years after surgery was built with the random forest algorithm, and the agreement between the prediction model and the follow-up results was evaluated; (4) further validation was carried out with independent clinical samples: FFPE samples from 19 newly diagnosed patients with ER- or PR-positive early or intermediate-stage breast cancer and known clinical follow-up data were used to evaluate the agreement between the test results and the follow-up results.
The present invention provides a prognostic gene detection system for assessing recurrence or death 3-10 years after surgery in untreated, newly diagnosed patients with early or intermediate-stage ER- or PR-positive breast cancer. Total RNA is extracted from formalin-fixed, paraffin-embedded (FFPE) breast cancer tissue samples and reverse transcribed with universal primers, and the expression Ct values of the 14 breast cancer prognosis-related molecular markers and 5 internal reference genes are measured by PCR. The Ct values, together with the subject's age, pT value and LN number, are fed into the random-forest-based model predicting the risk of recurrence or death 3-10 years after surgery in ER- or PR-positive early or intermediate-stage breast cancer patients, and the prognosis is classified as good or poor. Compared with follow-up information, the system achieves an accuracy of 70% and, apart from patient age, pT stage and LN number, does not depend on other clinicopathological information.
The kit provided by the present invention has a prediction accuracy of 81.1% for newly diagnosed breast cancer patients with a low 3-10 year risk of recurrence or death, compared with 71.9% for pathological prediction. For newly diagnosed patients with a high 3-10 year risk of recurrence or death, its prediction accuracy (sensitivity) is 54.4%, close to the 56.8% of the corresponding pathological prediction. Compared with clinical follow-up information, the kit achieves an agreement rate of 70% and, apart from patient age, pT stage and LN number, does not depend on other clinicopathological information.
In breast cancer prognosis evaluation the detection system and kit outperform clinicopathological prediction. They can to some extent reduce over-treatment and improper treatment caused by errors in pathological diagnosis, meet the need for individualized precision treatment of breast cancer patients, and further improve the technical methods available in China for predicting breast cancer prognosis.
Brief Description of the Drawings
To illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below.
Figure 1 shows the results of the correlation analysis between the internal reference genes and the tested genes;
Figure 2 shows the establishment of the model for assessing the 3-10 year risk of recurrence or death in breast cancer prognosis.
Detailed Description
The present invention discloses molecular markers, internal reference genes and their use, a detection kit, and a method for constructing a detection model. Those skilled in the art can draw on the content herein and suitably adjust the process parameters to implement it. It should be noted in particular that all similar substitutions and modifications will be apparent to those skilled in the art and are deemed to be included in the present invention. The methods and uses of the present invention have been described through preferred embodiments, and those concerned can clearly modify or suitably change and combine the methods and uses described herein, without departing from the content, spirit and scope of the present invention, to implement and apply the technology of the present invention.
The technical solution of the present invention comprises: (1) based on literature and database research, 192 breast cancer-related candidate genes were selected (not limited to genes related to breast cancer prognosis, and including internal reference genes), and a custom TLDA gene expression detection chip (Applied Biosystems) was ordered; (2) complete demographic, clinical and follow-up data (time to recurrence or metastasis, survival time) were systematically collected, FFPE samples from untreated, newly diagnosed patients with early or intermediate-stage ER- or PR-positive breast cancer were selected, the 192 genes were measured on the custom TLDA chip, and internal reference genes and molecular markers related to breast cancer prognosis were screened; (3) the candidate molecular markers and internal reference genes obtained were validated in independent samples, a model predicting the risk of recurrence or death 3-10 years after surgery was built with the random forest algorithm, and the agreement between the prediction model and the follow-up results was evaluated; (4) further validation was carried out with independent clinical samples: FFPE samples from 19 newly diagnosed patients with ER- or PR-positive early or intermediate-stage breast cancer and known clinical follow-up data were used to evaluate the agreement between the test results and the follow-up results.
1. Selection of study samples
(1) Untreated, newly diagnosed patients with early or intermediate-stage breast cancer;
(2) Untreated patients with early or intermediate-stage ER- and/or PR-positive breast cancer;
(3) With and/or without LN metastasis, with the number of LN metastases recorded;
(4) Accurate and detailed follow-up information available.
A total of 339 samples meeting these criteria were used in this study.
2. Total RNA extraction from FFPE samples
Total RNA was extracted from FFPE samples with the High Pure FFPET RNA Isolation Kit (Roche). Concentrations were 25-400 ng/μL, with OD260/280 in the range 1.8-2.0 and OD260/230 between 1.5 and 2.0.
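If the ranges quoted above are treated as acceptance limits for an RNA extract, a quality check could look like the sketch below. Note the assumption: the text reports these ranges for the extracted RNA but does not state that they were used as pass/fail criteria, and `rna_passes_qc` is a hypothetical helper name.

```python
def rna_passes_qc(conc_ng_per_ul, od260_280, od260_230):
    """Check an RNA extract against the ranges quoted in the text.

    Assumes (not stated in the source) that the ranges act as inclusive limits.
    """
    return (25 <= conc_ng_per_ul <= 400
            and 1.8 <= od260_280 <= 2.0
            and 1.5 <= od260_230 <= 2.0)
```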
3. TLDA (Applied Biosystems) chip detection
The following experiments were performed on 26 pairs of good-prognosis and poor-prognosis samples with known clinical follow-up information.
(1) Total RNA was reverse transcribed to obtain cDNA samples;
(2) The cDNA products were assayed on the TLDA chip;
(3) The data were analyzed and processed to obtain candidate molecular markers and internal reference genes.
4. Real-time quantitative RT-PCR (qRT-PCR) method
The candidate molecular markers were validated in 289 samples with known clinical follow-up information.
(1) Total RNA was reverse transcribed to obtain cDNA samples;
(2) The cDNA products were assayed by RT-PCR;
(3) The data were analyzed and processed.
5. Preparation of the diagnostic kit
Using the custom TLDA chip assay, differences in internal reference gene and gene expression between 26 good-prognosis and 26 poor-prognosis breast cancer samples were determined, and internal reference genes and differentially expressed genes were screened. The candidate molecular markers were validated in a large sample set by reverse transcription fluorescence quantitative PCR. The 14 genes related to breast cancer prognosis and the 5 internal reference genes finally selected make up the diagnostic kit (BCL2, PGR, SCUBE2, ESR1, MKi67, CCNB1, MYBL2, GRB7, ERBB2, MMP11, CD68, BAG1, MAPT, MS4A1; ACTB, GAPDH, GUSB, NUP214, VCAN). The diagnostic kit includes primers and probes for these genes and other conventional qRT-PCR reagents. The kit also includes a prediction model in which the expression levels of BCL2, PGR, SCUBE2, ESR1, MKi67, CCNB1, MYBL2, GRB7, ERBB2, MMP11, CD68, BAG1, MAPT and MS4A1 are measured with the mean of ACTB, GAPDH, GUSB, NUP214 and VCAN as the reference, and clinical information such as the patient's age, pT and LN is combined to assess the risk of recurrence or death 3-10 years after surgery and to classify the prognosis as good or poor.
6. Establishment of the risk assessment model
(1) Establishment of the model predicting the risk of recurrence or death 3-10 years after breast cancer surgery.
The random forest algorithm, a machine learning method, was used to evaluate the risk value of recurrence or death 3-10 years after surgery for the tested samples and to establish the gene detection model for breast cancer prognosis evaluation. A random forest consists of many decision trees. Each tree is built with double randomness, over both attributes and samples, and is therefore also called a random decision tree. The decision trees in a random forest are not correlated with one another. When test data enter the random forest, each decision tree classifies them, and the class chosen by the most trees is taken as the final result, i.e. the result of a "vote" among the trees. In other words, a random forest is a classifier containing multiple decision trees whose output class is the mode of the classes output by the individual trees. In the present invention, we optimized the traditional random forest algorithm: the samples are randomly sampled 1000 times to build 1000 models, the 39 preferred models with the highest accuracy are selected from the 1000 as sub-models of the final model, and the median of the 39 sub-model outputs is used as the final prediction result.
The present invention is further described as follows:
In the first phase of the study, TLDA was used to measure 192 breast cancer-related candidate genes in 26 good-prognosis and 26 poor-prognosis breast cancer samples, and differentially expressed genes were screened. Expression levels are expressed as 2^(-ΔCt), where ΔCt = Ct(sample) - Ct(reference), with the selected internal reference genes used as the reference for normalization when calculating relative expression. The internal reference genes were screened as follows: candidate internal reference genes were selected with four stability-based algorithms (geNorm, BestKeeper, NormFinder and delta Ct), taking into account the biological function of the low-variability genes and their relationship to tumors; for every combination of candidate internal reference genes, the correlation between the combination's mean Ct and the mean Ct of the 192 genes was calculated, and the combination with the highest correlation was taken as the internal reference genes: ACTB, GAPDH, GUSB, NUP214 and VCAN. Candidate genes were screened by the following criteria: (1) overall analysis: the fold difference between the good- and poor-prognosis groups reaches 2 or is below 0.5, and at least 50% of cases have Ct<35; (2) stratified analysis: in the group without lymph node metastasis, the fold difference between the good- and poor-prognosis groups is more than 2, with a statistical difference <0.05; (3) the fold difference between the two groups is not significant in the overall analysis, but the gene has been reported in the breast cancer prognosis literature and at least 90% of cases have Ct<35. The genes meeting these criteria are: BCL2, PGR, SCUBE2, ESR1, MKi67, CCNB1, MYBL2, GRB7, ERBB2, MMP11, CD68, BAG1, MAPT and MS4A1.
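The relative expression formula and screening criterion (1) can be made concrete with a short sketch. Two points are assumptions rather than statements from the text: the group fold difference is computed here as the ratio of mean relative expression between groups, and the function names and argument layout are hypothetical.

```python
from statistics import mean


def relative_expression(ct_sample, ct_reference):
    """2^(-dCt) with dCt = Ct(sample) - Ct(reference)."""
    return 2 ** -(ct_sample - ct_reference)


def passes_overall_criterion(good_dct, poor_dct, raw_ct, min_detected=0.5):
    """Criterion (1): fold difference >= 2 or <= 0.5, and enough cases with Ct < 35.

    good_dct / poor_dct: dCt values of one gene in each prognosis group.
    raw_ct: raw Ct values of that gene across all cases.
    """
    fold = mean(2 ** -d for d in good_dct) / mean(2 ** -d for d in poor_dct)
    detected = sum(ct < 35 for ct in raw_ct) / len(raw_ct)
    return (fold >= 2 or fold <= 0.5) and detected >= min_detected
```

For example, a marker measured at Ct 27 against a reference mean of 25 has ΔCt = 2 and a relative expression of 2^(-2) = 0.25.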
First, the present invention provides breast cancer prognosis genes characteristic of Chinese women. Comparable products abroad were all developed on European and American populations, and gene expression differs between ethnic groups. Nineteen genes were screened in the present invention; among them, MAPT and MS4A1 are genes associated with the assessment of recurrence or death 3-10 years after surgery that were identified in Chinese female breast cancer patients. Although these genes have been reported to be associated with breast cancer, no direct report linking them to breast cancer prognosis has been found. Second, the present invention establishes a new combination of internal reference genes that differs from those of other inventions and products; this combination is little affected by RNA quality in FFPE samples, which makes the measurement of the molecular markers more reliable. Third, a prediction model based on the random forest algorithm is used for the combined analysis; the model predicts the risk of recurrence or death 3-10 years after surgery for newly diagnosed patients with early or intermediate-stage ER+ or PR+ breast cancer.
In summary, the present invention provides a prognostic gene detection system for assessing recurrence or death 3-10 years after surgery in untreated stage I and stage II patients with ER- or PR-positive breast cancer. Compared with follow-up information, the system achieves an accuracy of 70% and, apart from patient age, pT stage and LN number, does not depend on other clinicopathological information.
The raw materials and reagents used in the molecular markers, internal reference genes and their use, detection kit, and method for constructing a detection model provided by the present invention are all commercially available.
The present invention is further illustrated below with reference to the examples:
Example 1 Sample collection and organization of sample data
FFPE samples from untreated, newly diagnosed breast cancer patients were used, and complete clinical follow-up data were systematically collected. After reviewing the sample data, the inventors selected 341 samples meeting the following criteria as experimental samples for TLDA (TaqMan Low Density Array) chip detection and the subsequent series of qRT-PCR validations:
(1) Untreated, newly diagnosed patients with early or intermediate-stage breast cancer;
(2) Untreated patients with early or intermediate-stage ER- and/or PR-positive breast cancer;
(3) With and/or without LN metastasis, with the number of LN metastases recorded;
(4) Accurate and detailed follow-up information available.
Example 2 Screening of molecular markers and internal reference genes with the TLDA chip
TLDA chip detection was performed on 26 good-prognosis and 26 poor-prognosis breast cancer samples meeting the above conditions, and the corresponding results were obtained. The specific steps were:
(1) RNA extraction from FFPE samples: for each sample, four 20 μm sections or eight 10 μm sections were taken, and RNA was extracted according to the instructions of the High Pure FFPET RNA Isolation Kit (Roche). The extracted RNA was quantified and quality-checked on a NanoDrop-2000 before downstream reverse transcription.
(2) Reverse transcription of total RNA to obtain cDNA samples: 1 μg of total RNA was reverse transcribed according to the instructions of the
Figure PCTCN2017090740-appb-000003
VILO™ Master Mix kit (Invitrogen).
(3) TLDA detection of the cDNA samples: the cDNA product above was thoroughly mixed with
Figure PCTCN2017090740-appb-000004
Universal PCR Master Mix and then assayed on an ABI 7900 fluorescence quantitative PCR instrument according to the standard TLDA procedure.
(4) Data analysis and processing:
In the first phase of the study, TLDA was used to measure 192 breast cancer-related candidate genes in 26 good-prognosis and 26 poor-prognosis breast cancer samples, and differentially expressed genes were screened. Expression levels are expressed as 2^(-ΔCt), where ΔCt = Ct(sample) - Ct(reference), with the selected internal reference genes used as the reference for normalization when calculating relative expression. The internal reference genes were screened as follows: candidate internal reference genes were selected with four stability-based algorithms (geNorm, BestKeeper, NormFinder and delta Ct), taking into account the biological function of the low-variability genes and their relationship to tumors; for every combination of candidate internal reference genes, the correlation between the combination's mean Ct and the mean Ct of the 192 genes was calculated, and the combination with the highest correlation was taken as the internal reference genes: ACTB, GAPDH, GUSB, NUP214 and VCAN. Candidate genes were screened by the following criteria: (1) overall analysis: the fold difference between the good- and poor-prognosis groups reaches 2 or is below 0.5, and at least 50% of cases have Ct<35; (2) stratified analysis: in the group without lymph node metastasis, the fold difference between the good- and poor-prognosis groups is more than 2, with a statistical difference <0.05; (3) the fold difference between the two groups is not significant in the overall analysis, but the gene has been reported in the breast cancer prognosis literature and at least 90% of cases have Ct<35. The genes meeting these criteria are: BCL2, PGR, SCUBE2, ESR1, MKi67, CCNB1, MYBL2, GRB7, ERBB2, MMP11, CD68, BAG1, MAPT and MS4A1. The functions of these 14 genes are listed in Table 1 below.
Table 1 Gene function analysis
No.   Gene name   Functional relevance
1     BCL2        Estrogen related
2     PGR         Estrogen related
3     SCUBE2      Estrogen related
4     ESR1        Estrogen related
5     MKi67       Proliferation related
6     CCNB1       Proliferation related
7     MYBL2       Proliferation related
8     GRB7        Her-2 related
9     ERBB2       Her-2 related
10    MMP11       Invasion related
11    CD68        Cluster of differentiation 68
12    BAG1        BCL2-associated athanogene 1 (anti-apoptotic)
13    MAPT        Microtubule-associated protein tau
14    MS4A1       Membrane-spanning 4-domains subfamily A member 1
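The normalization and screening criterion (1) from the data analysis step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the reference Ct is taken as the mean Ct of the five internal reference genes (as the claims describe), the fold difference is taken as the ratio of the group means of 2^(-ΔCt) (the text does not say how it was computed), and the function names and Ct values are hypothetical.

```python
from statistics import mean

REFERENCE_GENES = ["ACTB", "GAPDH", "GUSB", "NUP214", "VCAN"]

def delta_ct(sample, gene):
    """ΔCt = Ct(gene) minus the mean Ct of the internal reference genes."""
    reference_ct = mean(sample[g] for g in REFERENCE_GENES)
    return sample[gene] - reference_ct

def relative_expression(sample, gene):
    """Relative expression level, 2^(-ΔCt)."""
    return 2.0 ** (-delta_ct(sample, gene))

def passes_criterion_1(good, poor, gene, ct_cutoff=35.0, min_detected=0.5):
    """Criterion (1): fold difference >= 2 or <= 0.5, and >= 50% of cases with Ct < 35.

    Assumption: fold difference = mean expression (good) / mean expression (poor).
    """
    fold = (mean(relative_expression(s, gene) for s in good)
            / mean(relative_expression(s, gene) for s in poor))
    cases = good + poor
    detected = sum(s[gene] < ct_cutoff for s in cases) / len(cases)
    return (fold >= 2.0 or fold <= 0.5) and detected >= min_detected

# Hypothetical Ct values, for illustration only.
good_sample = {"ACTB": 20.0, "GAPDH": 20.0, "GUSB": 20.0,
               "NUP214": 20.0, "VCAN": 20.0, "BCL2": 22.0}
poor_sample = {**good_sample, "BCL2": 24.0}

print(delta_ct(good_sample, "BCL2"))             # 2.0
print(relative_expression(good_sample, "BCL2"))  # 0.25
print(passes_criterion_1([good_sample], [poor_sample], "BCL2"))  # True
```

In this toy case the good-prognosis sample expresses BCL2 four times as strongly as the poor-prognosis sample, and both Ct values are below 35, so the gene would pass criterion (1).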
Example 3 Large-sample qRT-PCR validation of the molecular markers
The 14 molecular markers and 5 internal reference genes screened by TLDA were: BCL2, PGR, SCUBE2, ESR1, MKi67, CCNB1, MYBL2, GRB7, ERBB2, MMP11, CD68, BAG1, MAPT, MS4A1; and ACTB, GAPDH, GUSB, NUP214, VCAN. Single-tube qRT-PCR validation was performed on 289 FFPE samples that met the sample collection requirements above and had complete clinical follow-up information.
(1) RNA extraction from the 289 FFPE samples: for each sample, either 4 sections of 20 μm or 8 sections of 10 μm were taken, and RNA was extracted according to the instructions of the High Pure FFPET RNA Isolation Kit (Roche). The extracted RNA was quantified and quality-checked on a NanoDrop-2000 before the downstream reverse transcription.
(2) Reverse transcription of RNA from the 289 FFPE samples into cDNA: 1 μg of total RNA was reverse transcribed according to the instructions of the
Figure PCTCN2017090740-appb-000005
VILO™ Master Mix kit (Invitrogen).
(3) qPCR detection of the cDNA products of the 289 FFPE samples: for each sample, the cDNA product, the probes and primers, and the
Figure PCTCN2017090740-appb-000006
Universal Master Mix II were mixed, and the assay was run on an ABI 7900 fluorescence quantitative PCR instrument. The qPCR primer and probe sequences are shown in Tables 2 to 5.
Table 2 qRT-PCR primer sequences
Figure PCTCN2017090740-appb-000007
Figure PCTCN2017090740-appb-000008
Table 3 qRT-PCR probe sequences
Figure PCTCN2017090740-appb-000009
Table 4 qRT-PCR primer sequences for the housekeeping genes
Figure PCTCN2017090740-appb-000010
Table 5 qRT-PCR probe sequences for the housekeeping genes
Figure PCTCN2017090740-appb-000011
Example 4 Establishment of a model for predicting the risk of recurrence or death 3-10 years after breast cancer surgery
The random forest algorithm, a machine learning method, was used to evaluate each test sample's risk of recurrence or death 3-10 years after surgery and to build a gene-based model for evaluating breast cancer prognosis. A random forest consists of many decision trees. Each tree is built with double randomness, over both attributes and samples, which is why the trees are also called random decision trees. The decision trees in a random forest are independent of one another. When test data enter the forest, every tree classifies them, and the class chosen by the most trees is taken as the final result; that is, the trees "vote". In other words, a random forest is a classifier containing multiple decision trees whose output class is the mode of the classes output by the individual trees. In the present invention we optimized the traditional random forest algorithm: the samples were randomly sampled 1000 times to build 1000 models, the 39 models with the highest accuracy were selected from these 1000 as the sub-models of the final model, and the median of the 39 sub-models' outputs was used as the final prediction.
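The resample, select and aggregate scheme described above can be sketched as follows. This is a minimal illustration, not the inventors' implementation: to stay within the Python standard library, a simple nearest-centroid scorer stands in for the random forest (a real model would be, for example, a random forest with many trees), the half-and-half split is stratified by class as an added assumption, and all function names and the demonstration data are hypothetical.

```python
import random
from statistics import mean, median

def fit_centroid_scorer(train):
    """Stand-in for a random forest: risk from relative distance to class centroids."""
    centroids = {}
    for label in (0, 1):  # 0 = good prognosis, 1 = poor prognosis
        rows = [x for x, y in train if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def risk(x):
        d0 = sum((a - b) ** 2 for a, b in zip(x, centroids[0])) ** 0.5
        d1 = sum((a - b) ** 2 for a, b in zip(x, centroids[1])) ** 0.5
        return 0.5 if d0 + d1 == 0 else d0 / (d0 + d1)  # near 1 = close to class 1

    return risk

def accuracy(model, data, threshold=0.5):
    """Fraction of samples whose thresholded risk matches the known outcome."""
    return mean(int((model(x) >= threshold) == bool(y)) for x, y in data)

def stratified_halves(data, rng):
    """Randomly split each class in half: one half for training, one for validation."""
    train, valid = [], []
    for label in (0, 1):
        rows = [row for row in data if row[1] == label]
        rng.shuffle(rows)
        half = len(rows) // 2
        train += rows[:half]
        valid += rows[half:]
    return train, valid

def build_ensemble(data, fit, n_rounds=1000, n_keep=39, seed=0):
    """Train one model per random split and keep the n_keep most accurate ones."""
    rng = random.Random(seed)
    scored = []
    for _ in range(n_rounds):
        train, valid = stratified_halves(data, rng)
        model = fit(train)
        scored.append((accuracy(model, valid), model))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [model for _, model in scored[:n_keep]]

def predict_risk(ensemble, x):
    """Final prediction: median of the sub-models' risk values."""
    return median(model(x) for model in ensemble)

# Hypothetical two-feature data: 10 good-prognosis and 10 poor-prognosis samples.
data = ([([i * 0.1, 0.0], 0) for i in range(10)]
        + [([5.0 + i * 0.1, 1.0], 1) for i in range(10)])
ensemble = build_ensemble(data, fit_centroid_scorer, n_rounds=100, n_keep=39)
print(len(ensemble))                       # 39
print(predict_risk(ensemble, [0.0, 0.0]))  # below 0.5 (low risk)
print(predict_risk(ensemble, [6.0, 1.0]))  # above 0.5 (high risk)
```

Taking the median rather than the mean of the sub-model outputs makes the final value less sensitive to a few sub-models that happen to score an individual sample very differently from the rest.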
Example 5 Further validation with independent clinical samples
FFPE samples were obtained from 19 ER- or PR-positive early- and middle-stage breast cancer patients with known clinical follow-up data. The pT stage was 1 or 2; surgery took place between 2004 and 2008, and follow-up ran until 2011 to 2015, giving follow-up times of 3-10 years or more.
Total RNA was extracted from the 19 FFPE samples with the High Pure FFPET RNA Isolation Kit (Roche). After passing quality control, the RNA was reverse transcribed to cDNA, and the cDNA was subjected to qRT-PCR to detect the internal reference genes ACTB, GAPDH, GUSB, NUP214 and VCAN and the genes BCL2, PGR, SCUBE2, ESR1, MKi67, CCNB1, MYBL2, GRB7, ERBB2, MMP11, CD68, BAG1, MAPT and MS4A1. The Ct values of these genes were entered into the random-forest-based model for assessing the risk of recurrence or death 3-10 years after breast cancer surgery to obtain a 3-10-year recurrence or death risk value, and each sample was predicted as good prognosis or poor prognosis according to the risk threshold. The predictions agreed with the known follow-up information in 73.6% of cases; detailed results are given in Table 6:
Table 6 Prognosis evaluation results for the 19 breast cancer patients
Figure PCTCN2017090740-appb-000012
Figure PCTCN2017090740-appb-000013
Table 7
Figure PCTCN2017090740-appb-000014
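The comparison between predicted and observed prognosis in this example amounts to thresholding each risk value and counting agreements. A minimal sketch follows, assuming that a risk value at or above the threshold means poor prognosis; the threshold, the risk values and the function names are hypothetical, since the text does not state them.

```python
def classify(risk_value, threshold):
    """Assumed rule: risk at or above the threshold is predicted as poor prognosis."""
    return "poor" if risk_value >= threshold else "good"

def concordance_rate(predicted, observed):
    """Fraction of samples whose predicted prognosis matches the follow-up outcome."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(predicted)

# Hypothetical risk values and follow-up outcomes, for illustration only.
risk_values = [0.2, 0.8, 0.4, 0.1]
observed = ["good", "poor", "poor", "good"]
predicted = [classify(r, threshold=0.5) for r in risk_values]

print(predicted)                              # ['good', 'poor', 'good', 'good']
print(concordance_rate(predicted, observed))  # 0.75
```

By the same arithmetic, 14 agreeing predictions out of 19 samples would give 14/19, or about 73.7%, which is in line with the reported concordance; the actual per-patient results are those in Table 6.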
Example 6
At present, decisions on the clinical treatment of breast cancer and on treatment plans ultimately depend on the results of pathological examination, which are also the most important objective basis for judging prognosis. This detection system used FFPE samples from 289 newly diagnosed breast cancer patients with known clinical follow-up data, collected at Tianjin Medical University Cancer Hospital and Henan Cancer Hospital, and tested the 5 internal reference genes and the 14 molecular markers in each sample.
The test results are shown in Table 8.
Table 8
Figure PCTCN2017090740-appb-000015
For newly diagnosed breast cancer patients with a low 3-10-year recurrence or death risk value, the kit provided by the invention achieved an accuracy of 81.1%, compared with 71.8% for pathological testing. For newly diagnosed patients with a high 3-10-year recurrence or death risk value, its accuracy (sensitivity) was 54.4%, close to the 56.8% of the corresponding pathological testing. Compared with the clinical follow-up information, the kit reached a concordance rate of 70%, and apart from patient age, pT stage and LN count it does not rely on any other clinicopathological information.
In breast cancer prognosis evaluation, the detection system and kit outperform clinical pathological diagnosis. To some extent they can reduce the overtreatment and inappropriate treatment caused by errors in pathological diagnosis, meet breast cancer patients' need for individualized precision treatment, and further improve the technical methods available in China for predicting breast cancer prognosis.
The molecular markers provided by the present invention, their application, the detection kit and the method for constructing the detection model have been described in detail above. Specific examples are used herein to explain the principles and embodiments of the invention, and the description of the examples is only intended to aid understanding of the method of the invention and its core idea. It should be noted that those skilled in the art may make improvements and modifications to the invention without departing from its principles, and such improvements and modifications also fall within the scope of protection of the claims of the invention.
Figure PCTCN2017090740-appb-000016
Figure PCTCN2017090740-appb-000017
Figure PCTCN2017090740-appb-000018
Figure PCTCN2017090740-appb-000019
Figure PCTCN2017090740-appb-000020
Figure PCTCN2017090740-appb-000021
Figure PCTCN2017090740-appb-000022
Figure PCTCN2017090740-appb-000023
Figure PCTCN2017090740-appb-000024
Figure PCTCN2017090740-appb-000025
Figure PCTCN2017090740-appb-000026
Figure PCTCN2017090740-appb-000027
Figure PCTCN2017090740-appb-000028

Claims (15)

  1. A gene composition, characterized by comprising the molecular markers MAPT and/or MS4A1.
  2. A gene composition, characterized by consisting of the molecular markers BCL2, PGR, SCUBE2, ESR1, MKi67, CCNB1, MYBL2, GRB7, ERBB2, MMP11, CD68, BAG1, MAPT and MS4A1.
  3. The gene composition according to claim 2, characterized by further comprising the internal reference genes ACTB, GAPDH, GUSB, NUP214 and VCAN.
  4. Use of the gene composition according to claim 1, 2 or 3 in the preparation of a detection device for predicting the risk of recurrence and/or death 3-10 years after breast cancer surgery.
  5. The use according to claim 4, characterized in that the prediction of the risk of recurrence and/or death 3-10 years after breast cancer surgery is specifically: total RNA of the sample to be tested is obtained; cDNA is obtained by reverse transcription; the Ct values of the molecular markers and of the internal reference genes are obtained by fluorescence quantitative PCR; the Ct values of the internal reference genes are averaged to give the mean Ct value of the internal reference gene combination (Ct'); the Ct' value is subtracted from the Ct value of each molecular marker for normalization, giving ΔCt; and the ΔCt values, together with the subject's age, pT value and LN value, are analyzed by the model for predicting the risk of recurrence or death 3-10 years after breast cancer surgery constructed with the random forest algorithm, to obtain the result.
  6. The use according to claim 5, characterized in that the model for predicting the risk of recurrence or death 3-10 years after breast cancer surgery is constructed as follows: a mathematical matrix is built from the ΔCt values of the molecular markers of the samples to be tested and the subjects' age, pT value and LN value; 1/2 is randomly selected as the training set and 1/2 as the validation set; a prediction model containing 10000 decision trees is built with the random forest algorithm; random sampling is performed ≥1000 times in total to build ≥1000 prediction models; from the ≥1000 prediction models, the ≥39 models with the highest concordance with the follow-up information are selected as the sub-models of the final model; and the median of the ≥39 sub-models is used as the final prognostic risk prediction value.
  7. The use according to claim 5 or 6, characterized in that the sample to be tested is an FFPE sample from an untreated, newly diagnosed patient with early- or middle-stage ER- or PR-positive breast cancer.
  8. A primer set for amplifying the gene composition according to claim 1 or 2, characterized in that its sequences are as shown in SEQ ID No.1 to SEQ ID No.28.
  9. A probe set for amplifying the gene composition according to claim 1 or 2, characterized in that its sequences are as shown in SEQ ID No.29 to SEQ ID No.42.
  10. A primer set for amplifying the internal reference genes of the gene composition according to claim 3, characterized in that its sequences are as shown in SEQ ID No.43 to SEQ ID No.47.
  11. A probe set for amplifying the internal reference genes of the gene composition according to claim 3, characterized in that its sequences are as shown in SEQ ID No.48 to SEQ ID No.52.
  12. A detection kit for predicting the risk of recurrence and/or death 3-10 years after breast cancer surgery, characterized by comprising the primer set according to claim 8 and/or 10 and/or the probe set according to claim 9 and/or 11, together with reagents commonly used in kits.
  13. A method for constructing a model for predicting the risk of recurrence or death 3-10 years after breast cancer surgery, characterized in that a mathematical matrix is built from the ΔCt values of the molecular markers of the samples to be tested and the subjects' age, pT value and LN value; 1/2 is randomly selected as the training set and 1/2 as the validation set; a prediction model containing 10000 decision trees is built with the random forest algorithm; random sampling is performed ≥1000 times in total to build ≥1000 prediction models; from the ≥1000 prediction models, the ≥39 models with the highest concordance with the follow-up information are selected as the sub-models of the final model; and the median of the ≥39 sub-models is used as the final prognostic risk prediction value.
  14. A method for detecting the risk of recurrence or death 3-10 years after breast cancer surgery, characterized in that total RNA of the sample to be tested is obtained; cDNA is obtained by reverse transcription; the Ct values of the molecular markers and of the internal reference genes are obtained by fluorescence quantitative PCR; the Ct values of the internal reference genes are averaged to give the mean Ct value of the internal reference gene combination (Ct'); the Ct' value is subtracted from the Ct value of each molecular marker for normalization, giving ΔCt; and the ΔCt values, together with the subject's age, pT value and LN value, are analyzed by the model for predicting the risk of recurrence or death 3-10 years after breast cancer surgery constructed with the random forest algorithm, to obtain the result.
  15. The construction method according to claim 13 or the detection method according to claim 14, characterized in that the sample to be tested is an FFPE sample from an untreated, newly diagnosed patient with early- or middle-stage ER- or PR-positive breast cancer.
PCT/CN2017/090740 2016-06-30 2017-06-29 Molecular marker, reference gene, and application and test kit thereof, and method for constructing testing model WO2018001295A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2018568674A JP2019527544A (en) 2016-06-30 2017-06-29 Molecular marker, reference gene, and application thereof, detection kit, and detection model construction method
SG11201811263WA SG11201811263WA (en) 2016-06-30 2017-06-29 Molecular marker, reference gene, and application and test kit thereof, and method for constructing testing model

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610509983.0 2016-06-30
CN201610509983.0A CN107574243B (en) 2016-06-30 2016-06-30 Molecular marker, reference gene and application thereof, detection kit and construction method of detection model

Publications (1)

Publication Number Publication Date
WO2018001295A1 true WO2018001295A1 (en) 2018-01-04

Family

ID=60785942

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/090740 WO2018001295A1 (en) 2016-06-30 2017-06-29 Molecular marker, reference gene, and application and test kit thereof, and method for constructing testing model

Country Status (4)

Country Link
JP (1) JP2019527544A (en)
CN (1) CN107574243B (en)
SG (1) SG11201811263WA (en)
WO (1) WO2018001295A1 (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108841962B (en) * 2018-08-01 2021-11-19 博奥生物集团有限公司 Non-small cell lung cancer detection kit and application thereof
CN109801680B (en) * 2018-12-03 2023-02-28 广州中医药大学(广州中医药研究院) Tumor metastasis and recurrence prediction method and system based on TCGA database
CN110923317A (en) * 2019-11-27 2020-03-27 福建省立医院 Method for breast cancer prognosis prediction and primer group thereof
CN110942808A (en) * 2019-12-10 2020-03-31 山东大学 Prognosis prediction method and prediction system based on gene big data
CN111440869A (en) * 2020-03-16 2020-07-24 武汉百药联科科技有限公司 DNA methylation marker for predicting primary breast cancer occurrence risk and screening method and application thereof
KR102565378B1 (en) * 2020-03-23 2023-08-10 단국대학교 산학협력단 Biomarker for predicting the status of breast cancer hormone receptors
CN112725444A (en) * 2020-12-30 2021-04-30 杭州联川基因诊断技术有限公司 Primer, probe, kit and detection method for detecting PGR gene expression
CN112646864A (en) * 2020-12-30 2021-04-13 杭州联川基因诊断技术有限公司 Primer, probe, kit and detection method for detecting ESR1 gene expression
CN112927795B (en) * 2021-02-23 2022-09-23 山东大学 Breast cancer prediction system based on bagging algorithm
CN113846149A (en) * 2021-09-28 2021-12-28 领航基因科技(杭州)有限公司 Digital PCR real-time analysis method of micropore array chip

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101173313A (en) * 2006-09-19 2008-05-07 天津医科大学附属肿瘤医院 Mammary cancer diversion and prognosis molecule parting gene group, gene chip producing and using method
CN101195825A (en) * 2007-12-10 2008-06-11 上海华冠生物芯片有限公司 Gene for prognosis of breast cancer and uses thereof
WO2008079269A2 (en) * 2006-12-19 2008-07-03 Genego, Inc. Novel methods for functional analysis of high-throughput experimental data and gene groups identified therfrom
CN101921858A (en) * 2010-08-23 2010-12-22 广州益善生物技术有限公司 Liquid phase chip for detecting breast cancer prognosis-related gene mRNA expression level
CN101965190A (en) * 2005-04-04 2011-02-02 维里德克斯有限责任公司 Laser microdissection and microarray analysis of breast tumors reveal estrogen receptor related genes and pathways
CN104263815A (en) * 2014-08-25 2015-01-07 复旦大学附属肿瘤医院 A group of genes used for prognosis of hormone receptor-positive breast cancer and applications thereof
WO2015135035A2 (en) * 2014-03-11 2015-09-17 The Council Of The Queensland Institute Of Medical Research Determining cancer agressiveness, prognosis and responsiveness to treatment

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
ATE467679T1 (en) * 2003-12-23 2010-05-15 Santaris Pharma As OLIGOMERIC COMPOUNDS FOR MODULATING BCL-2
US20110166838A1 (en) * 2008-06-16 2011-07-07 Sividon Diagnostics Algorithms for outcome prediction in patients with node-positive chemotherapy-treated breast cancer
WO2015035377A1 (en) * 2013-09-09 2015-03-12 British Columbia Cancer Agency Branch Methods and kits for predicting outcome and methods and kits for treating breast cancer with radiation therapy
CN104004844A (en) * 2014-05-28 2014-08-27 杭州美中疾病基因研究院有限公司 Kit for jointly detecting breast cancer 21 genes and preparation method of kit

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110493235A (en) * 2019-08-23 2019-11-22 四川长虹电器股份有限公司 A kind of mobile terminal from malicious software synchronization detection method based on network flow characteristic
CN111500724A (en) * 2020-04-28 2020-08-07 王强 Primer group and probe combination for simultaneously detecting six genes of breast cancer
CN111500724B (en) * 2020-04-28 2023-11-21 启程医学科技(山东)有限公司 Primer set and probe combination for simultaneous detection of breast cancer six genes
CN114373511A (en) * 2022-03-15 2022-04-19 南方医科大学南方医院 Intestinal cancer model based on 5hmC molecular marker detection and intestinal cancer model construction method
CN114373511B (en) * 2022-03-15 2022-08-30 南方医科大学南方医院 Intestinal cancer model based on 5hmC molecular marker detection and intestinal cancer model construction method

Also Published As

Publication number Publication date
CN107574243A (en) 2018-01-12
CN107574243B (en) 2021-06-29
SG11201811263WA (en) 2019-01-30
JP2019527544A (en) 2019-10-03

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17819292

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2018568674

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17819292

Country of ref document: EP

Kind code of ref document: A1