CN115612731A - Molecular marker for auxiliary diagnosis of cancer - Google Patents
- Publication number
- CN115612731A; CN202110789125.7A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C12Q1/6886—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
- G16B40/20—Supervised data analysis
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/112—Disease subtyping, staging or classification
- C12Q2600/154—Methylation markers
Abstract
The invention discloses a molecular marker for assisting in the diagnosis of cancer. The invention provides the use of the methylated TUBB1 gene as a marker in the preparation of a product, where the product is used for at least one of the following: auxiliary diagnosis of cancer or prediction of cancer risk; assisting in distinguishing benign nodules from cancer; assisting in distinguishing different subtypes of cancer; assisting in distinguishing different stages of cancer; assisting in distinguishing between different cancers; determining whether a test substance hinders or promotes the occurrence of cancer. The cancer may be lung cancer or breast cancer. The inventors found that the TUBB1 gene is hypomethylated in the blood of lung cancer and breast cancer patients. This finding has scientific significance and clinical application value for improving the early diagnosis and treatment of lung cancer and breast cancer and for reducing mortality.
Description
Technical Field
The invention relates to the field of medicine, in particular to a molecular marker for assisting in diagnosing cancer.
Background
Lung cancer is a malignant tumor arising in the epithelium of the bronchial mucosa. Its morbidity and mortality have risen in recent decades, and it is the cancer with the highest morbidity and mortality worldwide. Despite recent advances in diagnostic methods, surgical techniques, and chemotherapeutic drugs, the overall 5-year survival rate of lung cancer patients is only 16%, mainly because most patients already have metastases when they first seek treatment and have therefore lost the chance of radical surgery. Studies show that the prognosis of lung cancer is directly related to stage: the 5-year survival rate is 83% for stage I, 53% for stage II, 26% for stage III, and 6% for stage IV. Early diagnosis and early treatment are therefore key to reducing mortality in lung cancer patients.
The main methods currently used to diagnose lung cancer are as follows. (1) Imaging methods, such as chest X-ray and low-dose helical CT. Early-stage lung cancer is difficult to detect by chest X-ray. Low-dose helical CT can find small nodules in the lung, but its false positive rate is as high as 96.4%, which places an unnecessary psychological burden on the person examined. Because of the radiation involved, neither chest X-ray nor low-dose helical CT is suitable for frequent use. In addition, imaging results often depend on the equipment, on the physician's experience in reading the images, and on the time available for reading them. (2) Cytological methods, such as sputum cytology, bronchoscopic brushing or biopsy, and bronchoalveolar lavage fluid cytology. Sputum cytology and bronchoscopic brushing or biopsy have low sensitivity for peripheral lung cancer. Bronchoscopic brushing or biopsy and bronchoalveolar lavage fluid cytology are also cumbersome to perform and uncomfortable for the person examined. (3) Commonly used serum tumor markers, such as carcinoembryonic antigen (CEA), carbohydrate antigens (CA 125/153/199), cytokeratin 19 fragment antigen (CYFRA 21-1), and neuron-specific enolase (NSE). The sensitivity of these serum tumor markers for lung cancer is limited, typically 30%-40%, and is even lower for stage I tumors. Their tumor specificity is also limited, because they are affected by many benign conditions such as benign tumors, inflammation, and degenerative diseases. At present, tumor markers are mainly used for screening malignant tumors and for monitoring the effect of tumor treatment. An efficient and specific technique for the early diagnosis of lung cancer is therefore needed.
Low-dose helical CT screening of the chest is currently the internationally accepted most effective method for detecting pulmonary nodules. However, although low-dose helical CT is highly sensitive and finds a large number of nodules, it has difficulty distinguishing benign nodules from malignant ones. Fewer than 4% of the nodules found are malignant. At present, clinically distinguishing benign from malignant pulmonary nodules requires long-term follow-up with repeated CT examinations, or invasive examinations such as biopsy sampling of the nodule (including transthoracic fine-needle biopsy, bronchoscopic biopsy, and thoracoscopic or open-chest lung biopsy). CT-guided or ultrasound-guided transthoracic needle biopsy has relatively high sensitivity, but its diagnostic yield is lower for nodules <2 cm, with a missed-diagnosis rate of 30-70% and a relatively high incidence of pneumothorax and bleeding. Bronchoscopic needle biopsy has a relatively low complication rate, but its diagnostic yield for peripheral nodules is limited: only 34% for nodules ≤2 cm and 63% for nodules >2 cm. Surgical resection has a high diagnostic yield and treats the nodule directly, but it temporarily reduces the patient's lung function, and if the nodule is benign the patient has undergone an unnecessary operation, which amounts to over-treatment. New in vitro diagnostic molecular markers are therefore urgently needed to help identify pulmonary nodules and to minimize unnecessary punctures or surgery while reducing the missed-diagnosis rate.
Breast cancer is a malignant tumor caused by the uncontrolled proliferation of mammary epithelial cells. It is one of the most common malignant tumors in women worldwide, and its incidence ranks first among malignant tumors in women. The survival rate of breast cancer depends on the type and stage of the tumor. The 5-year survival rate for early-stage breast cancer is typically higher than 60%, but for advanced breast cancer it drops to 40-60%, and for metastatic breast cancer it is typically about 15%. Raising the early detection rate of breast cancer is therefore necessary for effective subsequent diagnosis and treatment. At present, clinical early screening and diagnosis of breast cancer rely mainly on imaging and pathology. Among imaging methods, B-mode ultrasound involves no radiation, but, owing to the way ultrasound images are formed, it has poor resolution for small lesions with subtle echo changes and is prone to missed diagnoses. Molybdenum-target mammography is a low-dose breast X-ray technique that clearly shows each layer of breast tissue structure. However, it has a relatively high false positive rate, so a needle biopsy of the breast is needed for a more accurate judgment, and it exposes the patient to ionizing radiation. Breast magnetic resonance imaging uses magnetic fields and radio waves to examine breast tissue and generate internal images, and is mainly suitable for screening groups at high risk of breast cancer.
Pathological diagnosis mainly involves breast biopsy, in which diseased tissue is sampled for pathological examination. However, many patients are reluctant to undergo biopsy because it is invasive. In addition, some commonly used tumor markers, such as cancer antigen 15-3, cancer antigen 27.29, carcinoembryonic antigen, cancer antigen 125, and circulating tumor cells, are used in diagnosing breast cancer, but their specificity and sensitivity need to be improved, and they are generally used in combination with imaging. More sensitive and specific molecular markers for early breast cancer are therefore urgently needed.
DNA methylation is an important chemical modification of genes that affects the regulation of gene transcription and nuclear structure. Changes in DNA methylation are early and accompanying events in cancer development; in tumor tissue they appear mainly as hypermethylation of tumor suppressor genes and hypomethylation of proto-oncogenes. However, there have been relatively few reports on the correlation between DNA methylation in blood and tumor occurrence and development. Blood is easy to collect and DNA methylation is stable, so a tumor-specific blood DNA methylation marker, if one can be found, would have great clinical application value. Exploring and developing a blood DNA methylation diagnostic technique suited to clinical testing needs therefore has important clinical application value and social significance for improving the early diagnosis and treatment of lung cancer and reducing mortality.
Disclosure of Invention
The invention aims to provide a tubulin beta 1 (tubulin, beta 1 class VI; TUBB1) molecular marker for assisting in the diagnosis of cancer.
In a first aspect, the invention claims the use of a methylated TUBB1 gene as a marker in the preparation of a product. The use of the product is at least one of the following:
(1) Auxiliary diagnosis of cancer or prediction of cancer risk;
(2) Assisting in distinguishing benign nodules from cancer;
(3) Assisting in distinguishing different subtypes of cancer;
(4) Assisting in distinguishing different stages of cancer;
(5) Auxiliary diagnosis of lung cancer or prediction of lung cancer risk;
(6) Assisting in distinguishing benign nodules of the lung from lung cancer;
(7) Assisting in distinguishing different subtypes of lung cancer;
(8) Assisting in distinguishing different stages of lung cancer;
(9) Auxiliary diagnosis of breast cancer or prediction of breast cancer risk;
(10) Assisting in distinguishing different stages of breast cancer;
(11) Assisting in distinguishing lung cancer from breast cancer;
(12) Determining whether a test substance hinders or promotes the development of cancer.
Further, the auxiliary diagnosis of cancer in (1) may be embodied as at least one of the following: assisting in distinguishing cancer patients from cancer-free controls (understood as individuals who neither have nor have ever had cancer, have no reported benign nodules of the lung or breast, and whose routine blood indicators are within the reference range); assisting in distinguishing between different cancers.
Further, the benign nodules in (2) are the benign nodules corresponding to the cancer in (2); for example, benign nodules of the lung correspond to lung cancer.
Further, the different subtypes of cancer in (3) may be pathological types, such as histological types.
Further, the different stages of the cancer in (4) may be clinical stages or TNM stages.
In a specific embodiment of the present invention, the diagnosis assistance system of lung cancer in (5) is specifically embodied as at least one of: the kit can be used for assisting in distinguishing lung cancer patients from non-cancer controls, lung adenocarcinoma patients from non-cancer controls, squamous lung carcinoma patients from non-cancer controls, small cell lung cancer patients from non-cancer controls, stage I lung cancer patients from non-cancer controls, stage II-III lung cancer patients from non-cancer controls, lymph node infiltration-free lung cancer patients from non-cancer controls, and lymph node infiltration-free lung cancer patients from non-cancer controls. Wherein the cancer-free control is understood as having no cancer present and ever and no reported benign nodules of the lung or breast and blood routine indicators within the reference range.
In a specific embodiment of the present invention, the auxiliary tool in (6) is specifically used for distinguishing benign nodules and lung cancer in at least one of the following forms: the kit can assist in distinguishing lung cancer and benign nodules of the lung, lung adenocarcinoma and benign nodules of the lung, squamous lung cancer and benign nodules of the lung, small cell lung cancer and benign nodules of the lung, stage I lung cancer and benign nodules of the lung, stage II-III lung cancer and benign nodules of the lung, non-lymph node infiltrated lung cancer and benign nodules of the lung, and lymph node infiltrated lung cancer and benign nodules of the lung.
In a specific embodiment of the present invention, the auxiliary differentiation of different subtypes of lung cancer in (7) is embodied as follows: can help to distinguish any two of lung adenocarcinoma, lung squamous carcinoma and small cell lung cancer.
In a specific embodiment of the present invention, the assisting in distinguishing different stages of lung cancer in (8) is embodied by at least one of the following: can help to distinguish any two of T1 stage lung cancer, T2 stage lung cancer and T3 lung cancer; can help to distinguish the lung cancer without lymph node infiltration from the lung cancer with lymph node infiltration; can help to distinguish any two of clinical stage I lung cancer, clinical stage II lung cancer and clinical stage III lung cancer.
In a specific embodiment of the present invention, the auxiliary diagnosis of breast cancer in (9) is embodied as assisting in distinguishing breast cancer patients from cancer-free female controls. Here, a cancer-free female control is understood as a woman with no current or previous cancer, no reported benign nodules of the lung or breast, and routine blood indicators within the reference range.
In the above (1) to (12), the cancer may be a cancer capable of causing a reduction in the methylation level of TUBB1 gene in the body, such as lung cancer, breast cancer, and the like.
In a second aspect, the invention claims the use of a substance for detecting the methylation level of the TUBB1 gene in the preparation of a product. The use of the product may be at least one of the foregoing (1) - (12).
In a third aspect, the invention claims the use of a substance for detecting the methylation level of the TUBB1 gene and a medium describing the mathematical modeling method and/or the method of use for the preparation of a product. The use of the product may be at least one of the above (1) to (12).
The mathematical model may be obtained according to a method comprising the steps of:
(A1) Detecting the TUBB1 gene methylation levels of n1 type A samples and n2 type B samples, respectively (training set);
(A2) Taking the TUBB1 gene methylation level data of all samples obtained in step (A1), establishing a mathematical model by binary logistic regression according to the classification of type A and type B, and determining a threshold for classification.
The use method of the mathematical model comprises the following steps:
(B1) Detecting the methylation level of the TUBB1 gene of the sample to be detected;
(B2) Substituting the TUBB1 gene methylation level data of the sample to be detected, which is obtained in the step (B1), into the mathematical model to obtain a detection index; and then comparing the detection index with the threshold value, and determining whether the type of the sample to be detected is the type A or the type B according to the comparison result.
In a specific embodiment of the present invention, the threshold is set to 0.5. A detection index greater than 0.5 is classified as one type, a detection index less than 0.5 is classified as the other type, and a detection index equal to 0.5 is regarded as an indeterminate gray zone. Type A and type B are the two corresponding classes; which group is type A and which group is type B is determined by the specific mathematical model, with no fixed convention.
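As an illustration only, the model establishment of steps (A1)-(A2) and the classification of steps (B1)-(B2) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the fitting routine is plain gradient descent rather than the statistical software used in the examples, the methylation values are invented toy data, and the function names `fit_logistic` and `classify` are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def detection_index(weights, row):
    # weights[0] is the constant b0; weights[1:] are b1..bn
    return sigmoid(weights[0] + sum(b * x for b, x in zip(weights[1:], row)))

def fit_logistic(rows, labels, lr=0.5, epochs=3000):
    """Binary logistic regression fitted by batch gradient descent.
    rows: methylation levels (0..1) per sample; labels: 1 = type A, 0 = type B."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for row, label in zip(rows, labels):
            err = detection_index(weights, row) - label
            grad[0] += err
            for j, x in enumerate(row, start=1):
                grad[j] += err * x
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights

def classify(weights, row, threshold=0.5):
    y = detection_index(weights, row)
    if y > threshold:
        return "type A"
    if y < threshold:
        return "type B"
    return "indeterminate"  # gray zone: index equals the threshold

# Invented toy training set (one CpG site per sample), not data from the examples
type_a = [[0.20], [0.25], [0.30]]
type_b = [[0.70], [0.75], [0.80]]
weights = fit_logistic(type_a + type_b, [1] * 3 + [0] * 3)
print(classify(weights, [0.22]), classify(weights, [0.78]))
```

Which group is coded as 1 is arbitrary here, consistent with the statement that the assignment of type A and type B follows the specific model.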
In practical applications, the threshold may also be determined according to the maximum Youden index (specifically, it may be the value corresponding to the maximum Youden index). A detection index greater than the threshold is classified as one type, a detection index less than the threshold is classified as the other type, and a detection index equal to the threshold is regarded as an indeterminate gray zone. Type A and type B are the two corresponding classes; which group is type A and which group is type B is determined by the specific mathematical model, with no fixed convention.
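For illustration, a threshold corresponding to the maximum Youden index (J = sensitivity + specificity - 1) could be located as sketched below. The detection indices are invented, the function name `youden_threshold` is hypothetical, and the use of midpoints between neighbouring observed values as candidate cut-offs is an assumption of this sketch, not something stated in the text.

```python
def youden_threshold(scores_a, scores_b):
    """Return (threshold, J) maximising J = sensitivity + specificity - 1.
    Type A is called when the detection index exceeds the threshold."""
    values = sorted(set(scores_a) | set(scores_b))
    # Candidate cut-offs: midpoints between neighbouring observed values
    candidates = [(lo + hi) / 2 for lo, hi in zip(values, values[1:])]
    best_threshold, best_j = None, float("-inf")
    for t in candidates:
        sensitivity = sum(s > t for s in scores_a) / len(scores_a)
        specificity = sum(s < t for s in scores_b) / len(scores_b)
        j = sensitivity + specificity - 1
        if j > best_j:
            best_threshold, best_j = t, j
    return best_threshold, best_j

# Invented detection indices for illustration
scores_a = [0.90, 0.80, 0.70, 0.60, 0.55]
scores_b = [0.65, 0.30, 0.20, 0.10]
threshold, j = youden_threshold(scores_a, scores_b)
print(round(threshold, 3), round(j, 2))
```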
The type A sample and the type B sample may be any one of:
(C1) Lung cancer samples and cancer-free controls;
(C2) Lung cancer samples and benign nodule samples of lung;
(C3) Samples of different subtypes of lung cancer;
(C4) Samples of different stages of lung cancer;
(C5) Lung cancer samples and breast cancer samples;
(C6) Breast cancer samples and cancer-free female controls;
(C7) Breast cancer samples at different stages.
In a fourth aspect, the invention claims the use of the medium described in the third aspect above, which records the mathematical model establishing method and/or using method, in the preparation of a product. The use of the product may be at least one of the foregoing (1)-(12).
In a fifth aspect, the invention claims a kit.
The claimed kit of the invention comprises a substance for detecting the methylation level of the TUBB1 gene. The use of the kit may be at least one of the foregoing (1) to (12).
Further, the kit may also contain the medium described in the third aspect or the fourth aspect above, which records the mathematical model establishing method and/or using method.
In a sixth aspect, the invention claims a system.
The claimed system of the present invention comprises:
(D1) Reagents and/or instruments for detecting methylation levels of TUBB1 gene;
(D2) An apparatus comprising unit X and unit Y.
The unit X is used for establishing a mathematical model and comprises a data acquisition module, a data analysis processing module and a model output module.
The data acquisition module is configured to acquire the TUBB1 gene methylation level data of n1 type A samples and n2 type B samples detected with (D1).
The data analysis processing module is configured to receive the TUBB1 gene methylation level data of the n1 type A samples and the n2 type B samples sent by the data acquisition module, establish a mathematical model from these data by binary logistic regression according to the classification of type A and type B, and determine a threshold for classification.
The model output module is configured to receive the mathematical model established by the data analysis processing module and output the mathematical model.
The unit Y is used for determining the type of the sample to be detected and comprises a data input module, a data operation module, a data comparison module and a conclusion output module.
The data input module is configured to input the TUBB1 gene methylation level data of the subject to be tested detected with (D1).
The data operation module is configured to receive TUBB1 gene methylation level data of the subject sent by the data input module and substitute the TUBB1 gene methylation level data of the subject into the mathematical model established by the data analysis processing module in the unit X, and calculate a detection index.
The data comparison module is configured to receive the detection index calculated by the data operation module and compare the detection index with the threshold determined in the data analysis processing module in the unit X.
The conclusion output module is configured to receive the comparison result from the data comparison module and output the conclusion that the type of the sample to be tested is the type A or the type B according to the comparison result.
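The division of labour between unit X and unit Y can be sketched as two small classes. This is only an architectural sketch under stated assumptions: the class and method names are hypothetical, and `toy_fit` is a stand-in for the binary logistic regression step so that the example stays short.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# A model maps the methylation levels of one sample to a detection index
Model = Callable[[Sequence[float]], float]

@dataclass
class UnitX:
    fit: Callable[[list, list], Model]  # hypothetical model-fitting routine
    threshold: float = 0.5

    def build(self, type_a_rows, type_b_rows):
        rows = list(type_a_rows) + list(type_b_rows)              # data acquisition module
        labels = [1] * len(type_a_rows) + [0] * len(type_b_rows)
        model = self.fit(rows, labels)                            # data analysis processing module
        return model, self.threshold                              # model output module

@dataclass
class UnitY:
    model: Model
    threshold: float

    def conclude(self, row):
        index = self.model(row)       # data operation module
        if index > self.threshold:    # data comparison module
            return "type A"           # conclusion output module
        if index < self.threshold:
            return "type B"
        return "indeterminate"

def toy_fit(rows, labels):
    # Stand-in for logistic regression: lower mean methylation gives a higher index
    return lambda row: 1.0 - sum(row) / len(row)

model, threshold = UnitX(fit=toy_fit).build([[0.2, 0.3]], [[0.8, 0.9]])
unit_y = UnitY(model, threshold)
print(unit_y.conclude([0.2, 0.3]), unit_y.conclude([0.8, 0.9]))
```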
The type A sample and the type B sample may be any one of:
(C1) Lung cancer samples and cancer-free controls;
(C2) Lung cancer samples and benign nodule samples of lung;
(C3) Samples of different subtypes of lung cancer;
(C4) Samples of different stages of lung cancer;
(C5) Lung cancer samples and breast cancer samples;
(C6) Breast cancer samples and cancer-free female controls;
(C7) Samples of different stages of breast cancer.
Wherein n1 and n2 can both be positive integers of more than 50.
In a specific embodiment of the present invention, the threshold is set to 0.5. A detection index greater than 0.5 is classified as one type, a detection index less than 0.5 is classified as the other type, and a detection index equal to 0.5 is regarded as an indeterminate gray zone. Type A and type B are the two corresponding classes; which group is type A and which group is type B is determined by the specific mathematical model, with no fixed convention.
In practical applications, the threshold may also be determined according to the maximum Youden index (specifically, it may be the value corresponding to the maximum Youden index). A detection index greater than the threshold is classified as one type, a detection index less than the threshold is classified as the other type, and a detection index equal to the threshold is regarded as an indeterminate gray zone. Type A and type B are the two corresponding classes; which group is type A and which group is type B is determined by the specific mathematical model, with no fixed convention.
In each of the foregoing aspects, the methylation level of the TUBB1 gene may be the methylation level of all or some of the CpG sites in the fragments of the TUBB1 gene shown in (e1)-(e4) below. Methylation of the TUBB1 gene may be methylation of all or some of the CpG sites in the fragments of the TUBB1 gene shown in (e1)-(e4) below.
(e1) A DNA fragment shown in SEQ ID No.1 or a DNA fragment with more than 80% of identity with the DNA fragment;
(e2) A DNA fragment shown in SEQ ID No.2 or a DNA fragment with more than 80% of identity with the DNA fragment;
(e3) A DNA fragment shown in SEQ ID No.3 or a DNA fragment with more than 80% of identity with the DNA fragment;
(e4) The DNA fragment shown in SEQ ID No.4 or the DNA fragment with more than 80 percent of identity with the DNA fragment.
Further, the "whole or partial CpG sites" are any one or more CpG sites in 4 DNA fragments shown in SEQ ID No.1 to SEQ ID No.4 in the TUBB1 gene. The upper limit of the "multiple CpG sites" described herein is all CpG sites in 4 DNA fragments shown in SEQ ID No.1 to SEQ ID No.4 in the TUBB1 gene. All CpG sites in the DNA fragment shown in SEQ ID No.1 are shown in Table 1, and all CpG sites in the DNA fragment shown in SEQ ID No.2 are shown in Table 2, and all CpG sites in the DNA fragment shown in SEQ ID No.3 are shown in Table 3, and all CpG sites in the DNA fragment shown in SEQ ID No.4 are shown in Table 4.
Or, the "whole or partial CpG sites" are all CpG sites in the DNA fragment shown in SEQ ID No.1 (Table 1) and all CpG sites in the DNA fragment shown in SEQ ID No.4 (Table 4).
Or, the "whole or partial CpG sites" may be all, or any 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1, of the CpG sites in the DNA fragment shown in SEQ ID No.1 in the TUBB1 gene.
Or, the "all or part of the CpG sites" may be all or any 9 or any 8 or any 7 or any 6 or any 5 or any 4 or any 3 or any 2 or any 1 of the following 10 CpG sites in the DNA fragment shown in SEQ ID No. 1:
(f1) The CpG site at positions 384-385 from the 5' end of the DNA fragment shown in SEQ ID No.1 (TUBB1_A_11);
(f2) The CpG site at positions 451-452 from the 5' end of the DNA fragment shown in SEQ ID No.1 (TUBB1_A_12);
(f3) The CpG site at positions 460-461 from the 5' end of the DNA fragment shown in SEQ ID No.1 (TUBB1_A_13);
(f4) The CpG site at positions 489-490 from the 5' end of the DNA fragment shown in SEQ ID No.1 (TUBB1_A_14);
(f5) The CpG site at positions 542-543 from the 5' end of the DNA fragment shown in SEQ ID No.1 (TUBB1_A_15);
(f6) The CpG site at positions 566-567 from the 5' end of the DNA fragment shown in SEQ ID No.1 (TUBB1_A_16);
(f7) The CpG site at positions 604-605 from the 5' end of the DNA fragment shown in SEQ ID No.1 (TUBB1_A_17);
(f8) The CpG sites at positions 673-674 and 681-682 from the 5' end of the DNA fragment shown in SEQ ID No.1 (TUBB1_A_18.19);
(f9) The CpG sites at positions 725-726 and 727-728 from the 5' end of the DNA fragment shown in SEQ ID No.1 (TUBB1_A_20.21);
(f10) The CpG sites at positions 747-748 and 754-755 from the 5' end of the DNA fragment shown in SEQ ID No.1 (TUBB1_A_22.23).
In a specific embodiment of the invention, when DNA methylation analysis is performed by time-of-flight mass spectrometry, several CpG sites located on one methylated fragment give peak patterns that cannot be distinguished from one another (the indistinguishable sites are listed in Table 6). Such adjacent CpG sites are therefore treated as one methylation site when analyzing methylation levels and when constructing and using the related mathematical models. This is the case with (f8), (f9) and (f10) described above.
In each of the above aspects, the substance for detecting the methylation level of the TUBB1 gene may comprise (or be) a primer combination for amplifying a full-length or partial fragment of the TUBB1 gene. The reagents for detecting the methylation level of the TUBB1 gene may comprise (or be) a primer combination for amplifying a full-length or partial fragment of the TUBB1 gene; the apparatus for detecting the methylation level of TUBB1 gene may be a time-of-flight mass spectrometer. Of course, the reagents for detecting the methylation level of TUBB1 gene may also comprise other conventional reagents for performing time-of-flight mass spectrometry.
Further, the partial fragment may be at least one of:
(g1) The DNA fragment shown in SEQ ID No.1 or the DNA fragment contained in the DNA fragment;
(g2) A DNA fragment shown as SEQ ID No.2 or a DNA fragment contained therein;
(g3) A DNA fragment shown in SEQ ID No.3 or a DNA fragment contained in the DNA fragment;
(g4) The DNA fragment shown in SEQ ID No.4 or the DNA fragment contained in the DNA fragment;
(g5) A DNA fragment having an identity of 80% or more to the DNA fragment represented by SEQ ID No.1 or a DNA fragment contained therein;
(g6) A DNA fragment having an identity of 80% or more to the DNA fragment represented by SEQ ID No.2 or a DNA fragment contained therein;
(g7) A DNA fragment having an identity of 80% or more to the DNA fragment represented by SEQ ID No.3 or a DNA fragment contained therein;
(g8) A DNA fragment having an identity of 80% or more with the DNA fragment represented by SEQ ID No.4 or a DNA fragment contained therein.
In the present invention, the primer combination may specifically be primer pair A and/or primer pair B and/or primer pair C and/or primer pair D.
Primer pair A consists of primer A1 and primer A2; primer A1 is the single-stranded DNA shown in SEQ ID No.5 or in nucleotides 11-35 of SEQ ID No.5; primer A2 is the single-stranded DNA shown in SEQ ID No.6 or in nucleotides 32-56 of SEQ ID No.6.
Primer pair B consists of primer B1 and primer B2; primer B1 is the single-stranded DNA shown in SEQ ID No.7 or in nucleotides 11-35 of SEQ ID No.7; primer B2 is the single-stranded DNA shown in SEQ ID No.8 or in nucleotides 32-56 of SEQ ID No.8.
Primer pair C consists of primer C1 and primer C2; primer C1 is the single-stranded DNA shown in SEQ ID No.9 or in nucleotides 11-35 of SEQ ID No.9; primer C2 is the single-stranded DNA shown in SEQ ID No.10 or in nucleotides 32-56 of SEQ ID No.10.
Primer pair D consists of primer D1 and primer D2; primer D1 is the single-stranded DNA shown in SEQ ID No.11 or in nucleotides 11-35 of SEQ ID No.11; primer D2 is the single-stranded DNA shown in SEQ ID No.12 or in nucleotides 32-56 of SEQ ID No.12.
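The nucleotide ranges above are 1-based and inclusive, so the shorter form of each primer can be obtained by simple slicing. The sketch below uses placeholder sequences, not the real SEQ ID No.5 or SEQ ID No.6, and the helper name `segment` is hypothetical.

```python
def segment(seq, start, end):
    """Return nucleotides start..end of seq (1-based, inclusive)."""
    return seq[start - 1:end]

# Placeholder sequences only: lower-case marks the leading part, upper-case the rest
forward_full = "t" * 10 + "A" * 25   # 35 nt in total, stands in for a forward primer
reverse_full = "t" * 31 + "C" * 25   # 56 nt in total, stands in for a reverse primer

primer_1 = segment(forward_full, 11, 35)   # nucleotides 11-35
primer_2 = segment(reverse_full, 32, 56)   # nucleotides 32-56
print(len(primer_1), len(primer_2))
```

Both slices are 25 nucleotides long, since 35 - 11 + 1 = 56 - 32 + 1 = 25.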
In addition, the invention also claims a method for distinguishing the sample to be detected as the type A sample or the type B sample. The method may comprise the steps of:
(A) The mathematical model may be established according to a method comprising the steps of:
(A1) Detecting the TUBB1 gene methylation levels of n1 type A samples and n2 type B samples, respectively (training set);
(A2) Taking the TUBB1 gene methylation level data of all samples obtained in step (A1), establishing a mathematical model by binary logistic regression according to the classification of type A and type B, and determining a threshold for classification.
Wherein n1 and n2 in (A1) are both positive integers of 50 or more.
(B) Whether the sample to be tested is a type a sample or a type B sample can be determined according to a method comprising the following steps:
(B1) Detecting the methylation level of the TUBB1 gene of the sample to be detected;
(B2) Substituting the TUBB1 gene methylation level data of the sample to be detected, which is obtained in the step (B1), into the mathematical model to obtain a detection index; and then comparing the detection index with the threshold value, and determining whether the type of the sample to be detected is the type A or the type B according to the comparison result.
In a specific embodiment of the present invention, the threshold is set to 0.5. A detection index greater than 0.5 is classified as one type, a detection index less than 0.5 is classified as the other type, and a detection index equal to 0.5 is regarded as an indeterminate gray zone. Type A and type B are the two corresponding classes; which group is type A and which group is type B is determined by the specific mathematical model, with no fixed convention.
In practical applications, the threshold may also be determined according to the maximum Youden index (specifically, it may be the value corresponding to the maximum Youden index). A detection index greater than the threshold is classified as one type, a detection index less than the threshold is classified as the other type, and a detection index equal to the threshold is regarded as an indeterminate gray zone. Type A and type B are the two corresponding classes; which group is type A and which group is type B is determined by the specific mathematical model, with no fixed convention.
The type A sample and the type B sample may be any one of:
(C1) Lung cancer samples and cancer-free controls;
(C2) Lung cancer samples and benign nodule samples of lung;
(C3) Samples of different subtypes of lung cancer;
(C4) Samples of different stages of lung cancer;
(C5) Lung cancer samples and breast cancer samples;
(C6) Breast cancer samples and cancer-free female controls;
(C7) Breast cancer samples at different stages.
Any of the above mathematical models may be changed in practical application according to the detection method of DNA methylation and the fitting manner, and needs to be determined according to a specific mathematical model without convention.
In the embodiment of the present invention, the model is specifically log(y/(1-y)) = b0 + b1x1 + b2x2 + b3x3 + … + bnxn, where y is the dependent variable, i.e., the detection index obtained after the methylation values of one or more methylation sites of the sample to be tested are substituted into the model; b0 is a constant; x1 to xn are the independent variables, i.e., the methylation values of one or more methylation sites of the sample to be tested (each a value between 0 and 1); and b1 to bn are the weights assigned by the model to the methylation value of each site.
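Solving the model equation for y gives y = 1/(1 + exp(-(b0 + b1x1 + … + bnxn))), which is how a detection index between 0 and 1 is obtained from the methylation values. The sketch below assumes that "log" denotes the natural logarithm, as is usual for logistic regression; the coefficients and methylation values are invented, and the function name is hypothetical.

```python
import math

def detection_index(b0, weights, methylation):
    """Invert log(y / (1 - y)) = b0 + sum(b_i * x_i) to obtain y."""
    z = b0 + sum(b * x for b, x in zip(weights, methylation))
    return 1.0 / (1.0 + math.exp(-z))

# Invented coefficients and methylation values, for illustration only
b0, weights = 0.3, [1.5, -0.7]
x = [0.4, 0.9]
y = detection_index(b0, weights, x)

# Substituting y back into the left-hand side recovers the linear predictor
linear_predictor = b0 + sum(b * v for b, v in zip(weights, x))
print(round(y, 4), round(math.log(y / (1 - y)), 4), round(linear_predictor, 4))
```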
In the embodiment of the present invention, known parameters such as age, sex and white blood cell count may be added as appropriate when the model is established, to improve the discrimination efficiency. One specific model established in the embodiment of the present invention is a model for assisting in distinguishing benign nodules of the lung from lung cancer, and the model is specifically: log(y/(1-y)) = 0.105 + 2.062*TUBB1_A_12 - 1.634*TUBB1_A_13 - 0.243*TUBB1_A_14 + 0.029*TUBB1_A_15 + 1.288*TUBB1_A_16 + 0.886*TUBB1_A_17 - 1.052*TUBB1_A_18.19 + 0.422*TUBB1_A_20.21 - 1.961*TUBB1_A_22.23 [sign illegible] 0.027*age - 0.733*gender (male is assigned 1, female is assigned 0) [sign illegible] 0.018*white blood cell count. TUBB1_A_11 is the methylation level of the CpG site at positions 384-385 from the 5' end of the DNA fragment shown in SEQ ID No.1; TUBB1_A_12 is the methylation level of the CpG site at positions 451-452 from the 5' end of the DNA fragment shown in SEQ ID No.1; TUBB1_A_13 is the methylation level of the CpG site at positions 460-461 from the 5' end of the DNA fragment shown in SEQ ID No.1; TUBB1_A_14 is the methylation level of the CpG site at positions 489-490 from the 5' end of the DNA fragment shown in SEQ ID No.1; TUBB1_A_15 is the methylation level of the CpG site at positions 542-543 from the 5' end of the DNA fragment shown in SEQ ID No.1; TUBB1_A_16 is the methylation level of the CpG site at positions 566-567 from the 5' end of the DNA fragment shown in SEQ ID No.1; TUBB1_A_17 is the methylation level of the CpG site at positions 604-605 from the 5' end of the DNA fragment shown in SEQ ID No.1; TUBB1_A_18.19 is the methylation level of the CpG sites at positions 673-674 and 681-682 from the 5' end of the DNA fragment shown in SEQ ID No.1; TUBB1_A_20.21 is the methylation level of the CpG sites at positions 725-726 and 727-728 from the 5' end of the DNA fragment shown in SEQ ID No.1; TUBB1_A_22.23 is the methylation level of the CpG sites at positions 747-748 and 754-755 from the 5' end of the DNA fragment shown in SEQ ID No.1. The threshold of the model is 0.5. Patients whose detection index calculated by the model is greater than 0.5 are classified as candidate lung cancer patients, and patients whose detection index is less than 0.5 are classified as candidate patients with benign nodules of the lung.
In each of the above aspects, the detecting the methylation level of the TUBB1 gene is detecting the methylation level of the TUBB1 gene in blood.
In the above aspects, when the type a specimen and the type B specimen are specimens of different subtypes of lung cancer in (C3), the type a specimen and the type B specimen may be specifically any two of a lung adenocarcinoma specimen, a lung squamous carcinoma specimen, and a small cell lung cancer specimen.
In the above aspects, when the type A specimen and the type B specimen are the specimens of different stages of lung cancer in (C4), the type A specimen and the type B specimen may specifically be any two of a clinical stage I lung cancer specimen, a clinical stage II lung cancer specimen and a clinical stage III lung cancer specimen.
In the above aspects, when the type a sample and the type B sample are different stages of breast cancer in (C7), the type a sample and the type B sample may specifically be any two of a T1 stage breast cancer sample, a T2 stage breast cancer sample, and a T3 stage breast cancer sample, or a non-lymph node-infiltrating breast cancer sample and a lymph node-infiltrating breast cancer sample, or any two of a clinical stage I breast cancer sample, a clinical stage II breast cancer sample, and a clinical stage III breast cancer sample.
Any of the TUBB1 genes described above may specifically include the GenBank accession number NM_030773.4.
The present invention reveals hypomethylation of the TUBB1 gene in the blood of lung cancer patients and breast cancer patients. Experiments prove that, using blood as the sample, cancer (lung cancer and breast cancer) patients can be distinguished from cancer-free controls, benign nodules of the lung can be distinguished from lung cancer, different subtypes and different stages of lung cancer can be distinguished, lung cancer can be distinguished from breast cancer, and different stages of breast cancer can be distinguished. The invention has important scientific significance and clinical application value for improving the early diagnosis and treatment of lung cancer and breast cancer and for reducing mortality.
Drawings
Fig. 1 is a diagram of a mathematical model.
Fig. 2 is an illustration of a mathematical model.
Detailed Description
The present invention is described in further detail below with reference to specific embodiments, and the examples are given only for illustrating the present invention and not for limiting the scope of the present invention. The examples provided below serve as a guide for further modifications by a person skilled in the art and do not constitute a limitation of the invention in any way.
The experimental procedures in the following examples, unless otherwise indicated, are conventional and are carried out according to the techniques or conditions described in the literature in the field or according to the instructions of the products. Materials, reagents and the like used in the following examples are commercially available unless otherwise specified.
The quantitative tests of the tubulin beta 1 class VI (TUBB1) gene in the following examples were performed in triplicate, and the results were averaged.
Example 1 Primer design for detection of methylation sites of the TUBB1 gene
Through extensive sequence and functional analysis, four fragments in the TUBB1 gene (TUBB1_A fragment, TUBB1_B fragment, TUBB1_C fragment and TUBB1_D fragment) were selected for analysis of methylation level and its association with cancer.
The TUBB1_A fragment (SEQ ID No.1) is located at chr20:57582659-57583460 of the hg19 reference genome, antisense strand.
The TUBB1_B fragment (SEQ ID No.2) is located at chr20:57592710-57593377 of the hg19 reference genome, sense strand.
The TUBB1_C fragment (SEQ ID No.3) is located at chr20:57593847-57594651 of the hg19 reference genome, sense strand.
The TUBB1_D fragment (SEQ ID No.4) is located at chr20:57598823-57599623 of the hg19 reference genome, sense strand.
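As a quick consistency check, the length of each fragment can be derived from its genomic coordinates. The sketch assumes that the coordinates are 1-based and inclusive at both ends, which the text does not state explicitly; the helper names are hypothetical.

```python
def parse_region(region):
    """Split 'chr20:57582659-57583460' into (chromosome, start, end)."""
    chrom, span = region.split(":")
    start, end = (int(v) for v in span.split("-"))
    return chrom, start, end

def region_length(region):
    _, start, end = parse_region(region)
    return end - start + 1  # assumes inclusive coordinates

fragments = {
    "TUBB1_A": "chr20:57582659-57583460",
    "TUBB1_B": "chr20:57592710-57593377",
    "TUBB1_C": "chr20:57593847-57594651",
    "TUBB1_D": "chr20:57598823-57599623",
}
lengths = {name: region_length(region) for name, region in fragments.items()}
print(lengths)
```

Under that assumption the fragments are 802, 668, 805 and 801 bp long, which is compatible with the highest CpG positions listed for the fragments (for example 642-643 in the TUBB1_B fragment and 779-780 in the TUBB1_C fragment).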
CpG site information in the TUBB1_A fragment is shown in Table 1.
CpG site information in the TUBB1_B fragment is shown in Table 2.
CpG site information in the TUBB1_C fragment is shown in Table 3.
CpG site information in the TUBB1_D fragment is shown in Table 4.
TABLE 1 CpG site information in TUBB1_A fragment
TABLE 2 CpG site information in TUBB1_B fragment
CpG site | Position of CpG site in sequence |
TUBB1_B_1 | Positions 26-27 from the 5' end of SEQ ID No.2 |
TUBB1_B_2 | Positions 35-36 from the 5' end of SEQ ID No.2 |
TUBB1_B_3 | Positions 71-72 from the 5' end of SEQ ID No.2 |
TUBB1_B_4 | Positions 230-231 from the 5' end of SEQ ID No.2 |
TUBB1_B_5 | Positions 263-264 from the 5' end of SEQ ID No.2 |
TUBB1_B_6 | Positions 305-306 from the 5' end of SEQ ID No.2 |
TUBB1_B_7 | Positions 312-313 from the 5' end of SEQ ID No.2 |
TUBB1_B_8 | Positions 346-347 from the 5' end of SEQ ID No.2 |
TUBB1_B_9 | Positions 369-370 from the 5' end of SEQ ID No.2 |
TUBB1_B_10 | Positions 417-418 from the 5' end of SEQ ID No.2 |
TUBB1_B_11 | Positions 442-443 from the 5' end of SEQ ID No.2 |
TUBB1_B_12 | Positions 463-464 from the 5' end of SEQ ID No.2 |
TUBB1_B_13 | Positions 479-480 from the 5' end of SEQ ID No.2 |
TUBB1_B_14 | Positions 540-541 from the 5' end of SEQ ID No.2 |
TUBB1_B_15 | Positions 639-640 from the 5' end of SEQ ID No.2 |
TUBB1_B_16 | Positions 642-643 from the 5' end of SEQ ID No.2 |
TABLE 3 CpG site information in TUBB1_C fragment
CpG site | Position of CpG site in sequence |
TUBB1_C_1 | Positions 26-27 from the 5' end of SEQ ID No.3 |
TUBB1_C_2 | Positions 135-136 from the 5' end of SEQ ID No.3 |
TUBB1_C_3 | Positions 159-160 from the 5' end of SEQ ID No.3 |
TUBB1_C_4 | Positions 189-190 from the 5' end of SEQ ID No.3 |
TUBB1_C_5 | Positions 205-206 from the 5' end of SEQ ID No.3 |
TUBB1_C_6 | Positions 236-237 from the 5' end of SEQ ID No.3 |
TUBB1_C_7 | Positions 243-244 from the 5' end of SEQ ID No.3 |
TUBB1_C_8 | Positions 278-279 from the 5' end of SEQ ID No.3 |
TUBB1_C_9 | Positions 354-355 from the 5' end of SEQ ID No.3 |
TUBB1_C_10 | Positions 421-422 from the 5' end of SEQ ID No.3 |
TUBB1_C_11 | Positions 553-554 from the 5' end of SEQ ID No.3 |
TUBB1_C_12 | Positions 667-668 from the 5' end of SEQ ID No.3 |
TUBB1_C_13 | Positions 699-700 from the 5' end of SEQ ID No.3 |
TUBB1_C_14 | Positions 735-736 from the 5' end of SEQ ID No.3 |
TUBB1_C_15 | Positions 779-780 from the 5' end of SEQ ID No.3 |
TABLE 4 CpG site information in TUBB1_D fragment
Specific PCR primers were designed for the four fragments (TUBB1_A fragment, TUBB1_B fragment, TUBB1_C fragment and TUBB1_D fragment), as shown in Table 5. SEQ ID No.5, SEQ ID No.7, SEQ ID No.9 and SEQ ID No.11 are forward primers, and SEQ ID No.6, SEQ ID No.8, SEQ ID No.10 and SEQ ID No.12 are reverse primers. In SEQ ID No.5, SEQ ID No.7, SEQ ID No.9 and SEQ ID No.11, positions 1-10 from the 5' end are a non-specific tag and positions 11-35 are the specific primer sequence; in SEQ ID No.6, SEQ ID No.8, SEQ ID No.10 and SEQ ID No.12, positions 1-31 from the 5' end are a non-specific tag and positions 32-56 are the specific primer sequence. The primer sequences contain no SNPs or CpG sites.
TABLE 5 TUBB1 methylation primer sequences
Example 2 detection of methylation of TUBB1 Gene and analysis of the results
1. Research sample
With the patients' informed consent, samples were collected from a total of 722 lung cancer patients, 152 patients with benign nodules of the lung, 227 breast cancer patients and 945 cancer-free controls (cancer-free controls being individuals with no current or previous cancer, no reported pulmonary nodules, and routine blood indicators within the reference range).
All patient samples were collected preoperatively, and the diagnoses were confirmed by both imaging and pathology.
The subtypes of lung cancer and breast cancer were determined by histopathology.
Lung cancer and breast cancer were staged according to the AJCC 8th edition staging system.
722 lung cancer patients were classified by type: 619 cases of lung adenocarcinoma, 42 cases of lung squamous carcinoma, 49 cases of small cell lung carcinoma, and 12 other cases.
722 lung cancer patients were classified according to stage: 649 cases in stage I, 41 cases in stage II and 32 cases in stage III.
722 lung cancer patients were classified by lung cancer tumor size (T): T1, T2 and T3 are 603 and 36 respectively.
722 lung cancer patients were classified by the presence or absence of lymph node infiltration (N): 688 cases of lung cancer without lymph node infiltration and 34 cases of lung cancer with lymph node infiltration.
227 breast cancer patients were classified according to type: 34 cases of ductal carcinoma in situ of breast, 165 cases of invasive ductal carcinoma and 28 cases of invasive lobular carcinoma.
227 breast cancer patients were divided by stage: 198 cases in stage I, 20 cases in stage II and 9 cases in stage III.
227 breast cancer patients were classified by breast cancer tumor size (T): T1, T2, T3, T1, T27.
227 breast cancer patients were classified by the presence or absence of lymph node infiltration (N): 201 cases of breast cancer without lymph node infiltration and 26 cases of breast cancer with lymph node infiltration.
The median ages of the cancer-free population, patients with benign nodules of the lung, lung cancer patients and breast cancer patients were 56, 57, 58 and 56 years, respectively. The male-to-female ratio in each of the three groups of the cancer-free population, patients with benign nodules of the lung and lung cancer patients was about 1:1, and all breast cancer patients were women.
2. Methylation detection
1. Total DNA of the blood sample was extracted.
2. The total DNA of the blood sample prepared in step 1 was treated with bisulfite (see the instructions of the Qiagen DNA methylation kit). After bisulfite treatment, unmethylated cytosine (C) is converted to uracil (U), while methylated cytosine remains unchanged; that is, after bisulfite treatment the C base of an original CpG site either remains C (if methylated) or is converted to U (if unmethylated).
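The effect of bisulfite treatment on a sequence can be illustrated with a short simulation. The sequence and the methylated position below are invented, and the function name `bisulfite_convert` is hypothetical; the sketch only reproduces the rule stated above (unmethylated C becomes U, methylated C stays C).

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite treatment of a single DNA strand.
    methylated_positions: 1-based positions of methylated cytosines."""
    methylated = set(methylated_positions)
    converted = []
    for pos, base in enumerate(seq.upper(), start=1):
        if base == "C" and pos not in methylated:
            converted.append("U")   # unmethylated C is converted to U
        else:
            converted.append(base)  # methylated C and all other bases are unchanged
    return "".join(converted)

# Invented 8-nt example: cytosines at positions 2, 5 and 7; only position 2 is methylated
print(bisulfite_convert("ACGTCGCA", [2]))   # ACGTUGUA
print(bisulfite_convert("ACGTCGCA", []))    # AUGTUGUA
```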
3. Using the bisulfite-treated DNA from step 2 as the template, PCR amplification was performed with DNA polymerase and the 4 pairs of specific primers in Table 5, in the reaction system required for a conventional PCR reaction. All 4 primer pairs used the same conventional PCR system and were amplified according to the following program.
The PCR reaction program is: 95 ℃,4min → (95 ℃,20s → 56 ℃,30s → 72 ℃,2 min) 45 cycles → 72 ℃,5min → 4 ℃,1h.
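The program above can be written down as a small data structure, from which the nominal run time follows by arithmetic. The step labels (denaturation, annealing, extension) are conventional PCR terms added here as an assumption; ramp times between temperatures are ignored.

```python
# (label, temperature in degrees C, duration in seconds); labels are assumed, not from the text
initial_step = ("initial denaturation", 95, 4 * 60)
cycle_steps = [
    ("denaturation", 95, 20),
    ("annealing", 56, 30),
    ("extension", 72, 2 * 60),
]
n_cycles = 45
final_step = ("final extension", 72, 5 * 60)
hold_step = ("hold", 4, 60 * 60)

cycle_seconds = sum(duration for _, _, duration in cycle_steps)
active_seconds = initial_step[2] + n_cycles * cycle_seconds + final_step[2]
total_seconds = active_seconds + hold_step[2]

print(cycle_seconds)        # 170 s per cycle
print(active_seconds / 60)  # 136.5 min before the 4 degree hold
print(total_seconds / 60)   # 196.5 min including the 1 h hold
```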
4. The amplification product from step 3 was subjected to DNA methylation analysis by time-of-flight mass spectrometry, specifically as follows:
(1) To 5 μl of the PCR product was added 2 μl of shrimp alkaline phosphatase (SAP) solution (0.3 ml SAP [2.5 U] + 1.7 ml H2O), followed by incubation in a PCR apparatus according to the following program: 37 ℃, 20 min → 85 ℃, 5 min → 4 ℃, 5 min;
(2) Taking 2 μl of the SAP-treated product obtained in step (1), adding it to 5 μl of T-Cleavage reaction system according to the instructions, and incubating at 37 ℃ for 3 h;
(3) Adding 19 μl of deionized water to the product obtained in step (2), and then incubating with 6 μg of Resin on a rotary shaker for 1 h for deionization;
(4) Centrifuging at 2000 rpm for 5 min at room temperature, and spotting a small volume of the supernatant onto a 384 SpectroCHIP with a Nanodispenser robotic arm;
(5) Performing time-of-flight mass spectrometry; the data obtained were collected with the SpectroACQUIRE v3.3.1.3 software and visualized with the MassArray EpiTyper v1.2 software.
The reagents used for time-of-flight mass spectrometry were all from a kit (T-Cleavage MassCLEAVE Reagent Auto Kit, cat. no. 10129A); the detection instrument was the MassARRAY® Analyzer Chip Prep Module 384 (model 41243); the data analysis software was the software supplied with the instrument.
5. The data obtained in step 4 were analyzed.
Statistical analysis of the data was performed by SPSS Statistics 23.0.
Nonparametric tests were used for comparative analysis between the two groups.
The ability of combinations of multiple CpG sites to discriminate between sample groups was evaluated by logistic regression and receiver operating characteristic (ROC) curve analysis.
All statistical tests were two-sided, and P values <0.05 were considered statistically significant.
By mass spectrometry experiments, a total of 80 distinguishable peak patterns of methylated fragments were obtained. The SpectroACQUIRE v3.3.1.3 software automatically calculates the peak area for each sample to obtain the methylation level at each CpG site according to the formula "methylation level = peak area of methylated fragment/(peak area of unmethylated fragment + peak area of methylated fragment)".
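The bisulfite conversion rule in step 2 and the methylation level formula above can be sketched in Python as follows. This is a minimal illustration only: the function names, the 0-based set of methylated positions and the example peak areas are hypothetical and are not taken from the kit instructions or from the SpectroACQUIRE software.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Simulate bisulfite treatment of a single DNA strand.

    Unmethylated C becomes U; a C whose 0-based index is listed in
    methylated_positions (a hypothetical representation) stays C.
    """
    converted = []
    for index, base in enumerate(seq.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("U")
        else:
            converted.append(base)
    return "".join(converted)


def methylation_level(methylated_area, unmethylated_area):
    """methylation level = methylated area / (unmethylated area + methylated area)."""
    total = methylated_area + unmethylated_area
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return methylated_area / total


# Hypothetical example: only the C at index 2 (inside a CpG) is methylated.
print(bisulfite_convert("ACCGTCA", methylated_positions={2}))

# Hypothetical peak areas for one CpG site in one sample.
print(methylation_level(methylated_area=23.0, unmethylated_area=77.0))
```

The result of `methylation_level` always lies between 0 and 1, which matches the range of the methylation values used later as model inputs.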
3. Analysis of results
1. TUBB1 gene methylation levels in the blood of cancer-free controls, benign nodule patients and lung cancer patients
The methylation levels of all CpG sites in the TUBB1 gene were analyzed using blood from 722 lung cancer patients, 152 patients with benign lung nodules and 945 cancer-free controls as the study material (table 6). The results showed that the median methylation level of all CpG sites in the TUBB1 gene was 0.23 in the cancer-free control group (IQR = 0.16-0.37), 0.18 in the benign nodule group (IQR = 0.11-0.32) and 0.19 in the lung cancer group (IQR = 0.14-0.34).
2. The level of TUBB1 gene methylation in blood can distinguish between non-cancer controls and lung cancer patients
By comparing the TUBB1 gene methylation levels of 722 lung cancer patients and 945 cancer-free controls, it was found that the methylation levels of all CpG sites in the TUBB1 gene were significantly lower in lung cancer patients than in cancer-free controls (p < 0.05, table 7). In addition, the methylation levels of all CpG sites of the TUBB1 gene in each lung cancer subtype (lung adenocarcinoma, lung squamous carcinoma, small cell lung carcinoma) differed significantly from those of the cancer-free controls, as did the methylation levels at each stage of lung cancer (clinical stage I and stage II-III). Likewise, lung cancer patients without lymph node infiltration and lung cancer patients with lymph node infiltration each differed significantly from the cancer-free controls (p < 0.05). Therefore, the methylation level of the TUBB1 gene can be used for the clinical diagnosis of lung cancer, especially early diagnosis.
3. TUBB1 gene methylation levels in blood can distinguish lung benign nodule patients from lung cancer patients
By comparing the TUBB1 gene methylation levels of 722 lung cancer patients and 152 benign nodule patients, it was found that the methylation levels of all CpG sites of the TUBB1 gene were significantly lower in benign nodule patients than in lung cancer patients (p < 0.05, table 8). In addition, lung cancer patients grouped by subtype (lung adenocarcinoma, lung squamous carcinoma, small cell lung carcinoma), by clinical stage (stage I or II-III) and by the presence or absence of lymph node infiltration each differed significantly from benign nodule patients in the methylation levels of all CpG sites in the TUBB1 gene. Therefore, the methylation level of the TUBB1 gene can be used to distinguish lung cancer patients from benign nodule patients, and is a marker of considerable potential value.
4. Differentiation of different subtypes or stages of lung cancer by the methylation level of the TUBB1 gene in blood
Through comparative analysis of the TUBB1 gene methylation levels of lung cancer patients of different subtypes (lung adenocarcinoma, lung squamous carcinoma and small cell lung cancer) and different stages, it was found that the methylation levels of all CpG sites in the TUBB1 gene differed significantly (p < 0.05, table 9) between different lung cancer subtypes (lung adenocarcinoma, lung squamous carcinoma and small cell lung cancer patients), different tumor sizes (T1, T2 and T3), different stages (clinical stage I, stage II and stage III), and between patients with and without lymph node infiltration. Thus, the methylation level of the TUBB1 gene can be used to differentiate between different subtypes or stages of lung cancer.
5. The level of TUBB1 methylation in blood can be used to diagnose breast cancer
The differences in the methylation level of CpG sites in the TUBB1 gene between breast cancer patients and cancer-free female controls were analyzed using blood from 227 breast cancer patients and 472 cancer-free female control samples as the study material (table 10). The results showed that the median methylation level of all CpG sites of interest in breast cancer patients was 0.20 (IQR = 0.15-0.34), the median methylation level of the non-cancer female control group was 0.23 (IQR = 0.16-0.37), and the methylation level of all CpG sites in breast cancer patients was significantly lower than that of the non-cancer female control (p < 0.05). In addition, methylation levels of all CpG sites in TUBB1 gene were significantly different in different stages of breast cancer (clinical stage I, II-III), with or without lymph node infiltration and different tumor sizes (T1, T2 and T3), respectively (p <0.05, table 11). Thus, the methylation level of TUBB1 gene can be used for clinical diagnosis of breast cancer.
6. TUBB1 methylation levels in blood can distinguish breast cancer patients from lung cancer patients
The difference in methylation level in TUBB1 gene in blood of the breast cancer patient and the lung cancer patient was analyzed using blood of 227 breast cancer patients and 722 lung cancer patients as a study material (table 12). The results indicate that the median methylation level of all CpG sites of interest in breast cancer patients is 0.20 (IQR = 0.15-0.34), the median methylation level of lung cancer patients is 0.19 (IQR = 0.14-0.34), and the methylation level of all CpG sites in breast cancer patients is significantly higher than that of lung cancer patients (p < 0.05). Thus, the methylation level of TUBB1 gene can be used to differentiate breast and lung cancer patients.
7. Establishment of mathematical model for assisting cancer diagnosis
The mathematical model established by the invention can be used for achieving the following purposes:
(1) Differentiating lung cancer patients from non-cancer controls;
(2) Distinguishing lung cancer patients from lung benign nodule patients;
(3) Differentiating breast cancer patients from non-cancerous female controls;
(4) Distinguishing breast cancer patients from lung cancer patients;
(5) Distinguishing lung cancer subtypes;
(6) Differentiating the stage of lung cancer;
(7) Differentiating breast cancer stages.
The mathematical model is established as follows:
(A) Data source: the methylation levels of the target CpG sites (one, or a combination of more than one, of the sites in tables 1-4) in ex vivo blood samples from the 722 lung cancer patients, 152 lung benign nodule patients, 227 breast cancer patients and 945 cancer-free controls listed in step one (detection method as in step two).
Known parameters such as age, sex and white blood cell count can be added to the data as needed to improve discrimination.
(B) Model building
Data from any two different types of patients are selected as required to form the training set (for example: cancer-free controls and lung cancer patients; cancer-free female controls and breast cancer patients; lung benign nodule patients and lung cancer patients; lung cancer patients and breast cancer patients; lung adenocarcinoma and lung squamous carcinoma patients; lung adenocarcinoma and small cell lung cancer patients; lung squamous carcinoma and small cell lung cancer patients; stage I and stage II lung cancer patients; stage I and stage III lung cancer patients; stage II and stage III lung cancer patients; stage I and stage II breast cancer patients; stage I and stage III breast cancer patients; stage II and stage III breast cancer patients; T1 and T2 breast cancer patients; T1 and T3 breast cancer patients; T2 and T3 breast cancer patients; breast cancer patients without and with lymph node infiltration). A mathematical model is then established by binary logistic regression using statistical software such as SAS, R or SPSS. The threshold is either the value corresponding to the maximum Youden index calculated from the model, or is set directly to 0.5. The detection index obtained by testing the sample and substituting the results into the model is compared with the threshold: a sample with a detection index greater than the threshold is assigned to one class (class B), a sample with a detection index less than the threshold is assigned to the other class (class A), and a detection index equal to the threshold is treated as an indeterminate gray zone.
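Choosing the threshold that corresponds to the maximum Youden index (J = sensitivity + specificity - 1) can be sketched in Python as below. This is a minimal sketch under stated assumptions: the function name is hypothetical, class A is labeled 0 and class B is labeled 1, and the candidate cut-offs are taken as midpoints between neighboring distinct detection indices, which is one common choice and not necessarily what SAS, R or SPSS do internally.

```python
def youden_threshold(scores, labels):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    scores: detection indices produced by the model.
    labels: 0 for class A, 1 for class B (assumed coding).
    A sample is called class B when its score exceeds the threshold.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = sum(1 for y in labels if y == 0)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    distinct = sorted(set(scores))
    if len(distinct) < 2:
        raise ValueError("at least two distinct scores are required")
    # Candidate cut-offs: midpoints between neighboring distinct scores (assumption).
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    best_threshold, best_j = None, float("-inf")
    for t in candidates:
        true_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s > t)
        true_neg = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        j = true_pos / n_pos + true_neg / n_neg - 1.0
        if j > best_j:
            best_threshold, best_j = t, j
    return best_threshold, best_j


# Hypothetical training-set detection indices and known classes.
scores = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 1, 1]
threshold, j = youden_threshold(scores, labels)
print(threshold, j)
```

With perfectly separated classes, as in this toy input, the selected cut-off falls between the two groups and J reaches its maximum of 1; on real data J is lower and the selected threshold need not be 0.5.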
To predict which class a new sample belongs to, the methylation level of one or more CpG sites on the TUBB1 gene of the sample is first measured by a DNA methylation assay. The methylation data are then substituted into the mathematical model (if known parameters such as age, sex and white blood cell count were included when the model was built, the corresponding values for the sample are also substituted into the model formula at this step) to calculate the detection index of the sample. The detection index is then compared with the threshold, and the class of the sample is determined from the result of the comparison.
An example is as follows: as shown in FIG. 1, the methylation level data of a single CpG site, or of a combination of multiple CpG sites, of the TUBB1 gene in the training set are used to establish a mathematical model for distinguishing class A from class B by binary logistic regression in statistical software such as SAS, R or SPSS. The mathematical model here is a binary logistic regression model, specifically: log(y/(1-y)) = b0 + b1x1 + b2x2 + b3x3 + … + bnxn, where y is the dependent variable, i.e. the detection index obtained after the methylation values of one or more methylation sites of the sample to be tested are substituted into the model; b0 is a constant; x1 to xn are the independent variables, i.e. the methylation values of one or more methylation sites of the sample (each a value between 0 and 1); and b1 to bn are the weights assigned by the model to the methylation value of each site. In a specific application, the model is fitted to the methylation levels (x1 to xn) of one or more DNA methylation sites of the training-set samples and their known classification (class A or class B, assigned y = 0 and y = 1 respectively), which determines the constant b0 and the weights b1 to bn; the detection index corresponding to the maximum Youden index calculated from the model (0.5 in this example) is used as the classification threshold. A sample whose detection index (y value), obtained by testing the sample and substituting the results into the model, is greater than 0.5 is classified as class B; a sample with a value less than 0.5 is classified as class A; and a value equal to 0.5 is treated as an indeterminate gray zone.
Here class A and class B are the two classes of a binary grouping (which group is class A and which is class B is determined by the specific mathematical model and is not fixed here), such as cancer-free controls and lung cancer patients, cancer-free female controls and breast cancer patients, lung benign nodule patients and lung cancer patients, lung cancer patients and breast cancer patients, lung adenocarcinoma and lung squamous carcinoma patients, lung adenocarcinoma and small cell lung cancer patients, lung squamous carcinoma and small cell lung cancer patients, stage I and stage II lung cancer patients, stage I and stage III lung cancer patients, stage II and stage III lung cancer patients, stage I and stage II breast cancer patients, stage I and stage III breast cancer patients, stage II and stage III breast cancer patients, T1 and T2 breast cancer patients, T1 and T3 breast cancer patients, T2 and T3 breast cancer patients, and breast cancer patients without and with lymph node infiltration. To predict which class a subject's sample belongs to, blood is first collected from the subject and DNA is extracted from it. After bisulfite conversion of the extracted DNA, the methylation level of a single CpG site, or of a combination of multiple CpG sites, of the subject's TUBB1 gene is measured by a DNA methylation assay, and the measured methylation data are then substituted into the above mathematical model.
If the detection index calculated by substituting the methylation level data of one or more CpG sites of the subject's TUBB1 gene into the above mathematical model is greater than the threshold, the subject is assigned to the class whose detection index is greater than 0.5 in the training set (class B); if the detection index is less than the threshold, the subject is assigned to the class whose detection index is less than 0.5 in the training set (class A); if the detection index is equal to the threshold, it cannot be determined whether the subject belongs to class A or class B.
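The calculation of the detection index from the binary logistic regression formula and the three-way decision rule can be sketched in Python as follows. The function names and the example weights are hypothetical, and the sketch assumes that "log" in the model denotes the natural logarithm, as is usual for logistic regression.

```python
import math


def detection_index(methylation_values, weights, intercept):
    """Solve log(y / (1 - y)) = b0 + b1*x1 + ... + bn*xn for y."""
    if len(methylation_values) != len(weights):
        raise ValueError("one weight is required per methylation value")
    linear = intercept + sum(b * x for b, x in zip(weights, methylation_values))
    return 1.0 / (1.0 + math.exp(-linear))


def classify(index, threshold=0.5):
    """Class B above the threshold, class A below it, gray zone when equal."""
    if index > threshold:
        return "B"
    if index < threshold:
        return "A"
    return "gray zone"


# Hypothetical two-site model: b0 = 0.1, b1 = 1.5, b2 = -0.5.
y = detection_index([0.20, 0.35], weights=[1.5, -0.5], intercept=0.1)
print(y, classify(y))
```

Because the detection index is a floating-point number, exact equality with the threshold will be rare in practice; a deployed implementation might define the gray zone as a small interval around the threshold, which is a design choice not specified in the text.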
An example is as follows: FIG. 2 is a schematic diagram illustrating the use of the methylation of the preferred CpG sites of TUBB1_A (TUBB1_A_11, TUBB1_A_12, TUBB1_A_13, TUBB1_A_14, TUBB1_A_15, TUBB1_A_16, TUBB1_A_17, TUBB1_A_18.19, TUBB1_A_20.21, TUBB1_A_22.23) and mathematical modeling to discriminate benign from malignant lung nodules. The methylation level data of the combination of 10 distinguishable preferred CpG sites measured in a training set of lung cancer patients and lung benign nodule patients (here, 722 lung cancer patients and 152 lung benign nodule patients), together with the age, sex (male assigned 1, female assigned 0) and white blood cell count of the patients, were used to establish a mathematical model for distinguishing lung cancer patients from lung benign nodule patients by binary logistic regression in R. The mathematical model here is a binary logistic regression model, from which the constant b0 and the weights b1 to bn of the individual methylation sites were determined; in this example the model is: log(y/(1-y)) = 0.105 + 2.062 × TUBB1_A_11 - 1.131 × TUBB1_A_13 - 0.243 × TUBB1_A_14 + 0.029 × TUBB1_A_15 + 1.288 × TUBB1_A_16 + 0.886 × TUBB1_A_17 - 1.052 × TUBB1_A_18.19 + 0.422 × TUBB1_A_20.21 - 1.961 × TUBB1_A_22.23 + 0.027 × age - 0.733 × sex (male assigned 1, female assigned 0) + 0.018 × white blood cell count, where y is the dependent variable, i.e. the detection index obtained by substituting the methylation values of the 10 distinguishable methylation sites of the sample to be tested, together with age, sex and white blood cell count, into the model.
With 0.5 set as the threshold, the methylation levels of the 10 distinguishable CpG sites of the sample to be tested (TUBB1_A_11, TUBB1_A_12, TUBB1_A_13, TUBB1_A_14, TUBB1_A_15, TUBB1_A_16, TUBB1_A_17, TUBB1_A_18.19, TUBB1_A_20.21 and TUBB1_A_22.23) are measured and substituted into the model together with the age, sex and white blood cell count of the sample. A sample whose detection index (y value) is greater than 0.5 is classified as a lung cancer patient, a sample whose detection index is less than 0.5 is classified as a lung benign nodule patient, and a sample whose detection index equals 0.5 is not classified as either. The area under the curve (AUC) of this model was calculated to be 0.69 (table 16). As an example of how a subject is assessed, as shown in FIG. 2, blood was collected from two subjects (A and B), DNA was extracted and converted with bisulfite, and the methylation levels of the same 10 distinguishable CpG sites of each subject were measured by a DNA methylation assay. The measured methylation level data were then substituted into the above mathematical model together with the age, sex and white blood cell count of the subject.
The value calculated by the mathematical model for the first subject (A) was 0.84, which is greater than 0.5, so subject A was judged to be a lung cancer patient (consistent with the clinical diagnosis). The value calculated for the second subject (B) was 0.18, which is less than 0.5, so subject B was judged to be a lung benign nodule patient (consistent with the clinical diagnosis).
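The lung cancer versus benign nodule model above can be evaluated for a single subject as sketched below. The coefficients are read from the formula as given, taking each coefficient as multiplying the variable that follows it; the formula as given contains no term for TUBB1_A_12, so none is included here. The subject's values are hypothetical, the unit of the white blood cell count is not stated in this section, and the natural logarithm is assumed.

```python
import math

INTERCEPT = 0.105
COEFFICIENTS = {
    "TUBB1_A_11": 2.062,
    "TUBB1_A_13": -1.131,
    "TUBB1_A_14": -0.243,
    "TUBB1_A_15": 0.029,
    "TUBB1_A_16": 1.288,
    "TUBB1_A_17": 0.886,
    "TUBB1_A_18.19": -1.052,
    "TUBB1_A_20.21": 0.422,
    "TUBB1_A_22.23": -1.961,
    "age": 0.027,
    "sex": -0.733,  # male = 1, female = 0
    "white_blood_cell_count": 0.018,
}


def lung_nodule_index(sample):
    """Detection index y for one subject; sample maps variable name to value."""
    linear = INTERCEPT + sum(c * sample[name] for name, c in COEFFICIENTS.items())
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical subject: every methylation value set to 0.20, male, 60 years old.
subject = {name: 0.20 for name in COEFFICIENTS if name.startswith("TUBB1_A_")}
subject.update({"age": 60, "sex": 1, "white_blood_cell_count": 6.0})

y = lung_nodule_index(subject)
if y > 0.5:
    print(y, "classified as lung cancer")
elif y < 0.5:
    print(y, "classified as lung benign nodule")
else:
    print(y, "undetermined")
```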
(C) Evaluation of model Effect
According to the above method, mathematical models were established for distinguishing, respectively, cancer-free controls and lung cancer patients, cancer-free female controls and breast cancer patients, lung benign nodule patients and lung cancer patients, lung cancer patients and breast cancer patients, lung adenocarcinoma and lung squamous carcinoma patients, lung adenocarcinoma and small cell lung cancer patients, lung squamous carcinoma and small cell lung cancer patients, stage I and stage II lung cancer patients, stage I and stage III lung cancer patients, stage II and stage III lung cancer patients, stage I and stage II breast cancer patients, stage I and stage III breast cancer patients, stage II and stage III breast cancer patients, T1 and T2 breast cancer patients, T1 and T3 breast cancer patients, T2 and T3 breast cancer patients, and breast cancer patients without and with lymph node infiltration, and their effectiveness was evaluated by receiver operating characteristic (ROC) curves. The larger the area under the curve (AUC) obtained from the ROC curve, the better the discrimination of the model and the more effective the molecular marker. The evaluation results of the mathematical models constructed with different CpG sites are shown in tables 13, 14 and 15. In tables 13, 14 and 15, "1 CpG site" denotes any single CpG site in the TUBB1_A amplified fragment, "2 CpG sites" denotes any combination of 2 CpG sites in TUBB1_A, "3 CpG sites" denotes any combination of 3 CpG sites in TUBB1_A, and so on. The values in the tables are the ranges of the results over the different site combinations (i.e., the result for any combination of CpG sites falls within the range).
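The area under the ROC curve used to evaluate each model can be computed directly from detection indices and known classes, as in the sketch below. It uses the rank-based (Mann-Whitney) formulation of AUC, which is equivalent to the area under the empirical ROC curve; the function name, the 0/1 label coding and the example data are assumptions for illustration, not the SPSS procedure used in the study.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a class B sample scores above a class A sample.

    scores: detection indices; labels: 0 for class A, 1 for class B (assumed coding).
    Ties between a class A and a class B score count as one half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical detection indices for two class A and two class B samples.
print(roc_auc([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1]))
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination, which is why a larger AUC is read as a more effective marker.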
The above results show that the ability of the TUBB1 gene to discriminate each pair of groups (cancer-free controls and lung cancer patients, cancer-free female controls and breast cancer patients, lung benign nodule patients and lung cancer patients, lung cancer patients and breast cancer patients, lung adenocarcinoma and lung squamous carcinoma patients, lung adenocarcinoma and small cell lung cancer patients, lung squamous carcinoma and small cell lung cancer patients, stage I and stage II lung cancer patients, stage I and stage III lung cancer patients, stage II and stage III lung cancer patients, stage I and stage II breast cancer patients, stage I and stage III breast cancer patients, stage II and stage III breast cancer patients, T1 and T2 breast cancer patients, T1 and T3 breast cancer patients, T2 and T3 breast cancer patients, and breast cancer patients without and with lymph node infiltration) increases as the number of sites increases.
In addition, among the CpG sites shown in tables 1 to 4, a combination of a few superior sites can discriminate better than a combination of a larger number of inferior sites. For example, the combination of the 10 distinguishable optimal sites TUBB1_A_11, TUBB1_A_12, TUBB1_A_13, TUBB1_A_14, TUBB1_A_15, TUBB1_A_16, TUBB1_A_17, TUBB1_A_18.19, TUBB1_A_20.21 and TUBB1_A_22.23 shown in table 16, table 17 and table 18 is the preferred combination among any 10 distinguishable sites in TUBB1_A.
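The site combinations for which models are built and compared can be enumerated as in the following sketch. It lists combinations of the 10 preferred TUBB1_A sites only; the function name is hypothetical, and fitting and ROC evaluation of a model for each combination are not shown.

```python
from itertools import combinations

PREFERRED_SITES = [
    "TUBB1_A_11", "TUBB1_A_12", "TUBB1_A_13", "TUBB1_A_14", "TUBB1_A_15",
    "TUBB1_A_16", "TUBB1_A_17", "TUBB1_A_18.19", "TUBB1_A_20.21", "TUBB1_A_22.23",
]


def site_combinations(k, sites=PREFERRED_SITES):
    """All combinations of k sites, each a tuple of site names."""
    return list(combinations(sites, k))


# Number of candidate models for each combination size.
for k in range(1, len(PREFERRED_SITES) + 1):
    print(k, len(site_combinations(k)))
```

Each tuple returned by `site_combinations` would serve as the list of independent variables for one logistic regression model, and the range of AUC values across the tuples of a given size corresponds to one reported range in the tables.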
In summary, the methylation levels of the CpG sites on the TUBB1 gene and their various combinations; the CpG sites on the TUBB1_A fragment and their various combinations; TUBB1_A_11, TUBB1_A_12, TUBB1_A_13, TUBB1_A_14, TUBB1_A_15, TUBB1_A_16, TUBB1_A_17, TUBB1_A_18.19, TUBB1_A_20.21 and TUBB1_A_22.23 on the TUBB1_A fragment and their various combinations; the CpG sites on the TUBB1_C fragment and their various combinations; the CpG sites on the TUBB1_D fragment and their various combinations; and the CpG sites on TUBB1_A, TUBB1_B, TUBB1_C and TUBB1_D and their various combinations all have discriminatory ability for cancer-free controls and lung cancer patients, cancer-free female controls and breast cancer patients, lung benign nodule patients and lung cancer patients, lung cancer patients and breast cancer patients, lung adenocarcinoma and lung squamous carcinoma patients, lung adenocarcinoma and small cell lung cancer patients, lung squamous carcinoma and small cell lung cancer patients, stage I and stage II lung cancer patients, stage I and stage III lung cancer patients, stage II and stage III lung cancer patients, stage I and stage II breast cancer patients, stage I and stage III breast cancer patients, stage II and stage III breast cancer patients, T1 and T2 breast cancer patients, T1 and T3 breast cancer patients, T2 and T3 breast cancer patients, and breast cancer patients without and with lymph node infiltration.
TABLE 6 comparison of methylation levels of cancer-free controls, benign nodules, and lung cancer
TABLE 7 comparison of methylation level differences between cancer-free controls and Lung cancer
TABLE 8 comparison of methylation level differences between benign nodules and Lung cancer
TABLE 9 comparison of methylation level differences between different subtypes and stages of lung cancer
TABLE 10 comparison of methylation level differences between breast cancer and control women without cancer
TABLE 11 comparison of methylation level differences between stages of breast cancer
TABLE 12 comparison of methylation level differences between Lung cancer and Breast cancer
TABLE 13 CpG sites of TUBB1_A and combinations thereof for differentiating between lung cancer and cancer-free controls, lung cancer and benign nodules, breast cancer and cancer-free female controls, and lung cancer and breast cancer
TABLE 14 CpG sites of TUBB1_A and combinations thereof for differentiating different stages of lung cancer
TABLE 15 CpG sites of TUBB1_A and combinations thereof for differentiating different stages of breast cancer
TABLE 16 Optimal CpG sites of TUBB1_A and combinations thereof for differentiating between lung cancer and cancer-free controls, lung cancer and benign nodules, breast cancer and cancer-free female controls, and lung cancer and breast cancer
TABLE 17 Optimal CpG sites of TUBB1_A and combinations thereof for differentiating between different subtypes and stages of lung cancer
TABLE 18 Optimal CpG sites of TUBB1_A and combinations thereof for differentiating different stages of breast cancer
The present invention has been described in detail above. It will be apparent to those skilled in the art that the invention can be practiced over a wide range of equivalent parameters, concentrations and conditions without departing from the spirit and scope of the invention and without undue experimentation. While the invention has been described with reference to specific embodiments, it will be appreciated that the invention can be further modified. In general, this application is intended to cover any variations, uses or adaptations of the invention that follow the principles of the invention, including such departures from the present disclosure as come within known or customary practice in the art to which the invention pertains. Some of the essential features may be applied within the scope of the appended claims.
<110> Nanjing Tengten Biotechnology Co., ltd
<120> molecular marker for auxiliary diagnosis of cancer
<130> GNCLN211518
<160> 12
<170> PatentIn version 3.5
<210> 1
<211> 802
<212> DNA
<213> Artificial sequence
<400> 1
gatggacact aatggtttcc atgagcgact tgccaaacaa ataagagaca catcagtagg 60
tagagccccg gggccacact ttgcctcagt gaccactttt tggggaacaa ggactgaaac 120
ttctgggctg acgaagcagc tctccagcct tgctctccac tcggacagtc atgcggggat 180
tccatggcca cctcagcgct tccgggaatg gtcatggaag cttctggaag tcaggaggca 240
gccactgtga cttccccttg cccacgtggc acgcttggaa tgtggtgagt gccactgagt 300
atggagagag tcaggcaagc tcatctgtgg gccctgtgcc aagggccccc agcaggggcc 360
tgtcaggtcg cagcccagaa tgccgggccc tgttcttacc agagaagaag gccatggtgt 420
gggcccaagg gccatgacaa acagaggggc cgcagggagc gagaagccct cccccagtta 480
caaaaccacg tcctgggggg ccacttctgc ttttggttcc tcatttgact aagaagagtt 540
tcgttagcag aaaacctttt caaggcgtct ttggaagtca cattggataa ctcctgatgc 600
ctgcgccaag tggaatcttc ctttggggca ttttctagaa ggatccctcc ttccttctct 660
gcaaaaggga accgtccccc cgaaagggcc tgggcctttg ggaaaagggg tttagaaagc 720
caggcgcggg gacccttctt tgggggcggt ggccgcatcc caggatccct tcctagggga 780
ggcctgggcc cagatgtgag gc 802
<210> 2
<211> 668
<212> DNA
<213> Artificial sequence
<400> 2
ggctccaatg cttcaaccaa ggtcacgggt ctggcgagag gcaggggagg gaaggacctc 60
cagggcttct cgctctgagt tcagtagtct tgcctgtcct gagtccctgt cctgagagat 120
ggtgtccttg agctgctcca gagctaggag tgctctcttc ctaagtaagc cagagttcca 180
agagtgcagg atctggccac atctacctcc ttcattccct tgtctgcagc gcccaaggtg 240
agcccagcct atgacaggtc cccgtccata gctgttcaaa gaatgggggc aacctcaggg 300
tgcccggcag acgttaaaga gaggcctcaa gaggtggttg ctcaccggat ggtggttgtc 360
cacatcctcg gggcttcagg agtgaggcct tgccctgtac ctgcccacct gcacctcgac 420
agctcaacag gcacaaggca gcgtccactg gctctcccct tccgagccag ggctctcacg 480
ccttctcagg gagtcccttc cccatcttcc ttggaagctt ccaccttctg tcctccaggc 540
ggtggggaac tgccaccctc ctgggcaagg atatggagct ggtgtttggc ctggatgccc 600
ttgcctggtt ggaggtaata aagcctaggg ctggctttcg acgatgatag aaacttggtg 660
gcagggga 668
<210> 3
<211> 805
<212> DNA
<213> Artificial sequence
<400> 3
ccagacacca tgagctgaga tgggacgaca gagcaggggc tgggaagggt gctggggcac 60
agcccagccc tggcatggat agggcagggg tccaagagac ctgcaggaca ctgcagagta 120
acttgcatac agggcgagag agggcatgtg ggggagggcg ggcagacagt gcctgatgca 180
ggccatcccg ggacctggga cctgcgaggc ccagttgtgg agcctcctgc tgtggcgggg 240
cacggggctt ctggggacca aagtctggaa aaggggccgg gaagatggac agggaaagcc 300
cttggaggtg gtctggttac ctgggtgagc taggagccac agtcatcata ccacggtcac 360
tagggccagc atggtcacct agaagcctgc aaacagtgcc agcctccagg ccctgtgccc 420
cgaggtggcc tttaatcttg cactcactct ctaggaaatg atggggcagt attctgtgtt 480
gagggaggaa aaacactccc ttccaaaagc atgacaggca gaaagcagag aagggccagg 540
actggctgag ggcggggagc tgggcctctg gggtggacac acccttggtc acattgtgag 600
ggtagcttgg ttggccagtc ccaccactgc agtgaccaca gttgtgttgg gctcacacca 660
gtgaaccgaa gctctggatt ctgagagtct gaggattccg tgaagatctc agacttgggc 720
tcagagcaag gatgcgtgaa attgtccata ttcagattgg ccagtgtggc aaccagatcg 780
gagccaaggt aagtaatgtc tggtt 805
<210> 4
<211> 801
<212> DNA
<213> Artificial sequence
<400> 4
agaatgtcct agaggtggtg aggcacgaga gtgagagctg tgactgcctg cagggcttcc 60
agatcgtcca ctccctgggc gggggcacag gctccgggat gggcactctg ctcatgaaca 120
agattagaga ggagtacccg gaccggatca tgaattcctt cagcgtcatg ccttctccca 180
aggtgtcgga cactgtggtg gagccctaca acgcggttct gtctatccac cagctgattg 240
agaatgcaga tgcctgtttc tgcattgaca atgaggccct ctatgacatc tgcttccgta 300
ccctgaagct gacgacaccc acctatgggg atctcaacca cctagtgtcc ttgaccatga 360
gcggcataac cacctccctc cggttcccgg gtcagctcaa cgcagacctg cgcaagctgg 420
cggtgaacat ggtccccttc ccccgcctgc acttctttat gcccggcttt gccccactca 480
cggcccaggg cagccagcag taccgagccc tctccgtggc cgagctcacc cagcagatgt 540
tcgatgcccg caataccatg gctgcctgtg acctccgccg tggccgctac ctcacagtgg 600
cctgcatttt ccggggcaag atgtccacca aggaagtgga ccagcaactg ctctccgtgc 660
agaccaggaa cagcagctgc tttgtggagt ggattcccaa caacgtcaag gtggctgtct 720
gcgacatccc gccccggggg ctgagcatgg ccgccacctt cattggcaac aacacggcca 780
tccaagagat ctttaatagg g 801
<210> 5
<211> 35
<212> DNA
<213> Artificial sequence
<400> 5
aggaagagag gatggatatt aatggttttt atgag 35
<210> 6
<211> 56
<212> DNA
<213> Artificial sequence
<400> 6
cagtaatacg actcactata gggagaaggc tacctcacat ctaaacccaa acctcc 56
<210> 7
<211> 35
<212> DNA
<213> Artificial sequence
<400> 7
aggaagagag ggttttaatg ttttaattaa ggtta 35
<210> 8
<211> 56
<212> DNA
<213> Artificial sequence
<400> 8
cagtaatacg actcactata gggagaaggc ttcccctacc accaaatttc tatcat 56
<210> 9
<211> 35
<212> DNA
<213> Artificial sequence
<400> 9
aggaagagag ttagatatta tgagttgaga tggga 35
<210> 10
<211> 56
<212> DNA
<213> Artificial sequence
<400> 10
cagtaatacg actcactata gggagaaggc taaccaaaca ttacttacct taactc 56
<210> 11
<211> 35
<212> DNA
<213> Artificial sequence
<400> 11
aggaagagag agaatgtttt agaggtggtg aggta 35
<210> 12
<211> 56
<212> DNA
<213> Artificial sequence
<400> 12
cagtaatacg actcactata gggagaaggc tccctattaa aaatctctta aataac 56
Claims (10)
1. Use of the methylated TUBB1 gene as a marker in the preparation of a product; the use of the product is at least one of the following:
(1) Auxiliary diagnosis of cancer or prediction of cancer risk;
(2) Assisting in distinguishing benign nodules from cancer;
(3) Assisting in distinguishing different subtypes of cancer;
(4) Assisting in distinguishing different stages of cancer;
(5) Auxiliary diagnosis of lung cancer or prediction of lung cancer risk;
(6) Assisting in distinguishing benign nodules of the lung from lung cancer;
(7) Assisting in distinguishing different subtypes of lung cancer;
(8) Assisting in distinguishing different stages of lung cancer;
(9) Auxiliary diagnosis of breast cancer or prediction of breast cancer risk;
(10) Assisting in distinguishing different stages of breast cancer;
(11) Assisting in distinguishing lung cancer from breast cancer;
(12) Determining whether a test substance has an inhibitory or promoting effect on the development of cancer.
2. Use of a substance for detecting the methylation level of the TUBB1 gene in the preparation of a product; the use of the product is at least one of the following:
(1) Auxiliary diagnosis of cancer or prediction of cancer risk;
(2) Assisting in distinguishing benign nodules from cancer;
(3) Assisting in distinguishing different subtypes of cancer;
(4) Assisting in distinguishing different stages of cancer;
(5) Auxiliary diagnosis of lung cancer or prediction of lung cancer risk;
(6) Assisting in distinguishing benign lung nodules from lung cancer;
(7) Assisting in distinguishing different subtypes of lung cancer;
(8) Assisting in distinguishing different stages of lung cancer;
(9) Auxiliary diagnosis of breast cancer or prediction of breast cancer risk;
(10) Assisting in distinguishing different stages of breast cancer;
(11) Assisting in distinguishing lung cancer from breast cancer;
(12) Determining whether a test substance hinders or promotes the development of cancer.
3. Use of a substance for detecting the methylation level of the TUBB1 gene, together with a medium carrying a method for building a mathematical model and/or a method for using the mathematical model, in the preparation of a product; the use of the product is at least one of the following:
(1) Auxiliary diagnosis of cancer or prediction of cancer risk;
(2) Assisting in distinguishing benign nodules from cancer;
(3) Assisting in distinguishing different subtypes of cancer;
(4) Assisting in distinguishing different stages of cancer;
(5) Auxiliary diagnosis of lung cancer or prediction of lung cancer risk;
(6) Assisting in distinguishing benign lung nodules from lung cancer;
(7) Assisting in distinguishing different subtypes of lung cancer;
(8) Assisting in distinguishing different stages of lung cancer;
(9) Auxiliary diagnosis of breast cancer or prediction of breast cancer risk;
(10) Assisting in distinguishing different stages of breast cancer;
(11) Assisting in distinguishing lung cancer from breast cancer;
(12) Determining whether a test substance hinders or promotes the development of cancer;
the mathematical model is obtained by a method comprising the following steps:
(A1) Detecting the TUBB1 gene methylation level of each of n1 type A samples and n2 type B samples;
(A2) Using the TUBB1 gene methylation level data of all samples obtained in step (A1), building a mathematical model by binary logistic regression according to the type A / type B classification, and determining a threshold for classification;
the method for using the mathematical model comprises the following steps:
(B1) Detecting the TUBB1 gene methylation level of the sample to be tested;
(B2) Substituting the TUBB1 gene methylation level data of the sample to be tested, obtained in step (B1), into the mathematical model to obtain a detection index; then comparing the detection index with the threshold, and determining from the comparison whether the sample to be tested is type A or type B;
the type A samples and the type B samples are any one of the following:
(C1) Lung cancer samples and cancer-free controls;
(C2) Lung cancer samples and benign lung nodule samples;
(C3) Samples of different subtypes of lung cancer;
(C4) Samples of different stages of lung cancer;
(C5) Lung cancer samples and breast cancer samples;
(C6) Breast cancer samples and cancer-free female controls;
(C7) Samples of different stages of breast cancer.
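Steps (A1)-(A2) and (B1)-(B2) can be sketched in code. The following is a minimal illustration in plain Python, not the applicant's implementation. The methylation values are synthetic toy numbers, the function names (`fit_logistic`, `choose_threshold`, `classify`) are hypothetical, batch gradient descent is only one of several ways to fit a logistic regression, and picking the threshold by maximizing Youden's J is an assumption, since the claim does not say how the threshold is determined.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def detection_index(model, levels):
    """Step (B2): substitute methylation levels into the fitted model."""
    weights, intercept = model
    return sigmoid(intercept + sum(w * x for w, x in zip(weights, levels)))


def fit_logistic(rows, labels, lr=0.5, epochs=5000):
    """Step (A2): binary logistic regression by batch gradient descent.

    rows   -- one list of CpG methylation levels per sample
    labels -- 1 for type A, 0 for type B
    """
    n, d = len(rows), len(rows[0])
    weights, intercept = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for levels, label in zip(rows, labels):
            err = detection_index((weights, intercept), levels) - label
            grad_b += err
            for j, x in enumerate(levels):
                grad_w[j] += err * x
        intercept -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, intercept


def choose_threshold(scores, labels):
    """Assumed rule: the cut-off that maximizes sensitivity + specificity - 1."""
    ranked = sorted(set(scores))
    # Candidate cut-offs are midpoints between neighbouring scores.
    candidates = [(a + b) / 2 for a, b in zip(ranked, ranked[1:])] or ranked
    pos = sum(labels)
    neg = len(labels) - pos
    best_t, best_j = candidates[0], -2.0
    for t in candidates:
        sens = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= t) / pos
        spec = sum(1 for s, l in zip(scores, labels) if l == 0 and s < t) / neg
        if sens + spec - 1.0 > best_j:
            best_t, best_j = t, sens + spec - 1.0
    return best_t


def classify(model, threshold, levels):
    """Step (B2): compare the detection index with the threshold."""
    return "A" if detection_index(model, levels) >= threshold else "B"


# Synthetic toy data (two CpG sites per sample), for illustration only.
TYPE_A = [[0.30, 0.35], [0.28, 0.40], [0.33, 0.31], [0.25, 0.38]]
TYPE_B = [[0.60, 0.66], [0.58, 0.70], [0.65, 0.61], [0.55, 0.72]]
rows = TYPE_A + TYPE_B
labels = [1] * len(TYPE_A) + [0] * len(TYPE_B)

model = fit_logistic(rows, labels)
threshold = choose_threshold([detection_index(model, r) for r in rows], labels)
print(classify(model, threshold, [0.20, 0.25]))
```

In practice the pair (type A, type B) would be one of (C1)-(C7), and any further covariates would simply be added as extra columns of `rows`; that extension is not shown here.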
4. Use of a medium carrying a method for building a mathematical model and/or a method for using the mathematical model in the preparation of a product; the use of the product is at least one of the following:
(1) Auxiliary diagnosis of cancer or prediction of cancer risk;
(2) Assisting in distinguishing benign nodules from cancer;
(3) Assisting in distinguishing different subtypes of cancer;
(4) Assisting in distinguishing different stages of cancer;
(5) Auxiliary diagnosis of lung cancer or prediction of lung cancer risk;
(6) Assisting in distinguishing benign lung nodules from lung cancer;
(7) Assisting in distinguishing different subtypes of lung cancer;
(8) Assisting in distinguishing different stages of lung cancer;
(9) Auxiliary diagnosis of breast cancer or prediction of breast cancer risk;
(10) Assisting in distinguishing different stages of breast cancer;
(11) Assisting in distinguishing lung cancer from breast cancer;
(12) Determining whether a test substance hinders or promotes the development of cancer;
the mathematical model is obtained by a method comprising the following steps:
(A1) Detecting the TUBB1 gene methylation level of each of n1 type A samples and n2 type B samples;
(A2) Using the TUBB1 gene methylation level data of all samples obtained in step (A1), building a mathematical model by binary logistic regression according to the type A / type B classification, and determining a threshold for classification;
the method for using the mathematical model comprises the following steps:
(B1) Detecting the TUBB1 gene methylation level of the sample to be tested;
(B2) Substituting the TUBB1 gene methylation level data of the sample to be tested, obtained in step (B1), into the mathematical model to obtain a detection index; then comparing the detection index with the threshold, and determining from the comparison whether the sample to be tested is type A or type B;
the type A samples and the type B samples are any one of the following:
(C1) Lung cancer samples and cancer-free controls;
(C2) Lung cancer samples and benign lung nodule samples;
(C3) Samples of different subtypes of lung cancer;
(C4) Samples of different stages of lung cancer;
(C5) Lung cancer samples and breast cancer samples;
(C6) Breast cancer samples and cancer-free female controls;
(C7) Samples of different stages of breast cancer.
5. A kit comprising a substance for detecting the methylation level of the TUBB1 gene; the use of the kit is at least one of the following:
(1) Auxiliary diagnosis of cancer or prediction of cancer risk;
(2) Assisting in distinguishing benign nodules from cancer;
(3) Assisting in distinguishing different subtypes of cancer;
(4) Assisting in distinguishing different stages of cancer;
(5) Auxiliary diagnosis of lung cancer or prediction of lung cancer risk;
(6) Assisting in distinguishing benign lung nodules from lung cancer;
(7) Assisting in distinguishing different subtypes of lung cancer;
(8) Assisting in distinguishing different stages of lung cancer;
(9) Auxiliary diagnosis of breast cancer or prediction of breast cancer risk;
(10) Assisting in distinguishing different stages of breast cancer;
(11) Assisting in distinguishing lung cancer from breast cancer;
(12) Determining whether a test substance hinders or promotes the development of cancer.
6. The kit of claim 5, wherein: the kit further comprises the medium carrying the method for building the mathematical model and/or the method for using the mathematical model as recited in claim 3 or 4.
7. A system, comprising:
(D1) Reagents and/or instruments for detecting the methylation level of the TUBB1 gene;
(D2) A device comprising a unit X and a unit Y;
the unit X is used for building a mathematical model and comprises a data acquisition module, a data analysis processing module and a model output module;
the data acquisition module is configured to acquire the TUBB1 gene methylation level data of n1 type A samples and n2 type B samples detected by (D1);
the data analysis processing module is configured to receive the TUBB1 gene methylation level data of the n1 type A samples and the n2 type B samples from the data acquisition module, build a mathematical model by binary logistic regression according to the type A / type B classification, and determine a threshold for classification;
the model output module is configured to receive the mathematical model built by the data analysis processing module and output the mathematical model;
the unit Y is used for determining the type of a sample to be tested and comprises a data input module, a data operation module, a data comparison module and a conclusion output module;
the data input module is configured to input the TUBB1 gene methylation level data of the sample to be tested detected by (D1);
the data operation module is configured to receive the TUBB1 gene methylation level data of the sample to be tested from the data input module, substitute the data into the mathematical model built by the data analysis processing module in the unit X, and calculate a detection index;
the data comparison module is configured to receive the detection index calculated by the data operation module and compare it with the threshold determined by the data analysis processing module in the unit X;
the conclusion output module is configured to receive the comparison result from the data comparison module and output, according to the comparison result, the conclusion that the sample to be tested is type A or type B;
the type A samples and the type B samples are any one of the following:
(C1) Lung cancer samples and cancer-free controls;
(C2) Lung cancer samples and benign lung nodule samples;
(C3) Samples of different subtypes of lung cancer;
(C4) Samples of different stages of lung cancer;
(C5) Lung cancer samples and breast cancer samples;
(C6) Breast cancer samples and cancer-free female controls;
(C7) Samples of different stages of breast cancer.
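The four modules of unit Y map naturally onto four small functions that pass their result to the next. The sketch below is only one possible software layout, under stated assumptions: the class and method names are hypothetical, the coefficients and threshold in `Model` are made-up placeholders standing in for whatever unit X would output, and methylation levels are assumed to be fractions between 0 and 1.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Model:
    """What unit X hands over: logistic coefficients plus the threshold."""
    intercept: float
    weights: Sequence[float]
    threshold: float


class UnitY:
    """Determines the type of a sample to be tested from a given model."""

    def __init__(self, model: Model):
        self.model = model

    def data_input(self, raw_levels):
        """Data input module: accept and validate methylation levels."""
        levels = [float(v) for v in raw_levels]
        if len(levels) != len(self.model.weights):
            raise ValueError("expected one methylation level per model weight")
        if any(v < 0.0 or v > 1.0 for v in levels):
            raise ValueError("methylation levels are assumed to lie in [0, 1]")
        return levels

    def data_operation(self, levels):
        """Data operation module: compute the detection index."""
        z = self.model.intercept + sum(
            w * x for w, x in zip(self.model.weights, levels)
        )
        return 1.0 / (1.0 + math.exp(-z))

    def data_comparison(self, index):
        """Data comparison module: compare the index with the threshold."""
        return index >= self.model.threshold

    def conclusion_output(self, at_or_above_threshold):
        """Conclusion output module: report the sample type."""
        return "type A" if at_or_above_threshold else "type B"

    def run(self, raw_levels):
        levels = self.data_input(raw_levels)
        index = self.data_operation(levels)
        return self.conclusion_output(self.data_comparison(index))


# Placeholder model; real coefficients would come from unit X.
unit_y = UnitY(Model(intercept=4.0, weights=(-5.0, -3.0), threshold=0.5))
print(unit_y.run([0.30, 0.35]))
print(unit_y.run([0.60, 0.70]))
```

Keeping the model as an immutable value object means unit X and unit Y share nothing except that object, which mirrors the claim's separation between building the model and applying it.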
8. The use or kit or system of any one of claims 1-7, wherein: the methylation level of the TUBB1 gene is the methylation level of all or part of the CpG sites in the following fragments (e1)-(e4) of the TUBB1 gene;
the methylated TUBB1 gene is the TUBB1 gene methylated at all or part of the CpG sites in the following fragments (e1)-(e4);
(e1) The DNA fragment shown in SEQ ID No.1 or a DNA fragment having 80% or more identity with it;
(e2) The DNA fragment shown in SEQ ID No.2 or a DNA fragment having 80% or more identity with it;
(e3) The DNA fragment shown in SEQ ID No.3 or a DNA fragment having 80% or more identity with it;
(e4) The DNA fragment shown in SEQ ID No.4 or a DNA fragment having 80% or more identity with it.
9. The use or kit or system of claim 8, wherein: the "all or part of the CpG sites" is any one or more CpG sites in the 4 DNA fragments shown in SEQ ID No.1 to SEQ ID No.4 of the TUBB1 gene;
or
the "all or part of the CpG sites" is all CpG sites in the DNA fragment shown in SEQ ID No.1 and all CpG sites in the DNA fragment shown in SEQ ID No.4;
the "all or part of the CpG sites" is all, or any 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1, of the CpG sites in the DNA fragment shown in SEQ ID No.1 of the TUBB1 gene;
or
the "all or part of the CpG sites" is all, or any 9, 8, 7, 6, 5, 4, 3, 2 or 1, of the following 10 CpG sites in the DNA fragment shown in SEQ ID No.1:
(f1) The CpG site at positions 384-385 from the 5' end of the DNA fragment shown in SEQ ID No.1;
(f2) The CpG site at positions 451-452 from the 5' end of the DNA fragment shown in SEQ ID No.1;
(f3) The CpG site at positions 460-461 from the 5' end of the DNA fragment shown in SEQ ID No.1;
(f4) The CpG site at positions 489-490 from the 5' end of the DNA fragment shown in SEQ ID No.1;
(f5) The CpG site at positions 542-543 from the 5' end of the DNA fragment shown in SEQ ID No.1;
(f6) The CpG site at positions 566-567 from the 5' end of the DNA fragment shown in SEQ ID No.1;
(f7) The CpG site at positions 604-605 from the 5' end of the DNA fragment shown in SEQ ID No.1;
(f8) The CpG sites at positions 673-674 and 681-682 from the 5' end of the DNA fragment shown in SEQ ID No.1;
(f9) The CpG sites at positions 725-726 and 727-728 from the 5' end of the DNA fragment shown in SEQ ID No.1;
(f10) The CpG sites at positions 747-748 and 754-755 from the 5' end of the DNA fragment shown in SEQ ID No.1.
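The site labels (f1)-(f10) use 1-based positions counted from the 5' end, each CpG occupying two consecutive positions (a C followed by a G). A small helper, sketched here with hypothetical function names, shows how such coordinates can be listed or checked for any sequence. SEQ ID No.1 is not reproduced in this part of the document, so the example uses the first 60 nucleotides of the 801-nt sequence that immediately precedes SEQ ID No.5 in the listing, plus a toy string.

```python
def clean(seq):
    """Drop the spaces and line breaks used in sequence listings."""
    return "".join(seq.split()).lower()


def cpg_sites(seq):
    """Return 1-based (C position, G position) pairs for every CpG in seq."""
    s = clean(seq)
    return [(i + 1, i + 2) for i in range(len(s) - 1) if s[i:i + 2] == "cg"]


def is_cpg(seq, c_position):
    """True if a CpG starts at the given 1-based position from the 5' end."""
    s = clean(seq)
    return s[c_position - 1:c_position + 1] == "cg"


# First 60 nt of the sequence preceding SEQ ID No.5 in the listing.
FRAGMENT = "agaatgtcct agaggtggtg aggcacgaga gtgagagctg tgactgcctg cagggcttcc"

print(cpg_sites("acgtcg"))   # toy string: CpGs at positions 2-3 and 5-6
print(cpg_sites(FRAGMENT))
```

With the full SEQ ID No.1 in hand, `is_cpg(seq1, 384)` would be the check corresponding to site (f1); that call is shown only as an intended usage, since the sequence itself is not available here.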
10. The use or kit or system according to any one of claims 1 to 9, wherein: the substance for detecting the methylation level of the TUBB1 gene comprises a primer combination for amplifying the full length or a partial fragment of the TUBB1 gene;
the reagents for detecting the methylation level of the TUBB1 gene comprise a primer combination for amplifying the full length or a partial fragment of the TUBB1 gene;
further, the partial fragment is at least one fragment selected from the following:
(g1) The DNA fragment shown in SEQ ID No.1 or a DNA fragment contained therein;
(g2) The DNA fragment shown in SEQ ID No.2 or a DNA fragment contained therein;
(g3) The DNA fragment shown in SEQ ID No.3 or a DNA fragment contained therein;
(g4) The DNA fragment shown in SEQ ID No.4 or a DNA fragment contained therein;
(g5) A DNA fragment having 80% or more identity with the DNA fragment shown in SEQ ID No.1 or with a DNA fragment contained therein;
(g6) A DNA fragment having 80% or more identity with the DNA fragment shown in SEQ ID No.2 or with a DNA fragment contained therein;
(g7) A DNA fragment having 80% or more identity with the DNA fragment shown in SEQ ID No.3 or with a DNA fragment contained therein;
(g8) A DNA fragment having 80% or more identity with the DNA fragment shown in SEQ ID No.4 or with a DNA fragment contained therein;
further, the primer combination is primer pair A and/or primer pair B and/or primer pair C and/or primer pair D;
the primer pair A consists of a primer A1 and a primer A2; the primer A1 is the single-stranded DNA shown in SEQ ID No.5 or in nucleotides 11-35 of SEQ ID No.5; the primer A2 is the single-stranded DNA shown in SEQ ID No.6 or in nucleotides 32-56 of SEQ ID No.6;
the primer pair B consists of a primer B1 and a primer B2; the primer B1 is the single-stranded DNA shown in SEQ ID No.7 or in nucleotides 11-35 of SEQ ID No.7; the primer B2 is the single-stranded DNA shown in SEQ ID No.8 or in nucleotides 32-56 of SEQ ID No.8;
the primer pair C consists of a primer C1 and a primer C2; the primer C1 is the single-stranded DNA shown in SEQ ID No.9 or in nucleotides 11-35 of SEQ ID No.9; the primer C2 is the single-stranded DNA shown in SEQ ID No.10 or in nucleotides 32-56 of SEQ ID No.10;
the primer pair D consists of a primer D1 and a primer D2; the primer D1 is the single-stranded DNA shown in SEQ ID No.11 or in nucleotides 11-35 of SEQ ID No.11; the primer D2 is the single-stranded DNA shown in SEQ ID No.12 or in nucleotides 32-56 of SEQ ID No.12.
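Each primer is claimed in two forms: the full listed sequence, or a sub-range of it (nucleotides 11-35 for the forward primers, 32-56 for the reverse primers). The sketch below derives both forms from SEQ ID No.5 to No.12 as given in the sequence listing; the dictionary layout and function names are illustrative assumptions. It also reports the prefix shared by all forward primers and by all reverse primers, which suggests that the sub-range is the target-specific portion and the prefix a common tag, although the text here does not say so.

```python
# SEQ ID No.5-12 copied from the sequence listing (spaces are removed below).
LISTING = {
    5: "aggaagagag gatggatatt aatggttttt atgag",
    6: "cagtaatacg actcactata gggagaaggc tacctcacat ctaaacccaa acctcc",
    7: "aggaagagag ggttttaatg ttttaattaa ggtta",
    8: "cagtaatacg actcactata gggagaaggc ttcccctacc accaaatttc tatcat",
    9: "aggaagagag ttagatatta tgagttgaga tggga",
    10: "cagtaatacg actcactata gggagaaggc taaccaaaca ttacttacct taactc",
    11: "aggaagagag agaatgtttt agaggtggtg aggta",
    12: "cagtaatacg actcactata gggagaaggc tccctattaa aaatctctta aataac",
}
PRIMER_PAIRS = {"A": (5, 6), "B": (7, 8), "C": (9, 10), "D": (11, 12)}


def clean(seq):
    """Drop the grouping spaces used in the sequence listing."""
    return "".join(seq.split()).lower()


def nucleotides(seq, first, last):
    """Return nucleotides first..last (1-based, inclusive) of seq."""
    return clean(seq)[first - 1:last]


def primer_forms(pair):
    """Both claimed forms of the forward and reverse primer of a pair."""
    fwd_id, rev_id = PRIMER_PAIRS[pair]
    return {
        "forward_full": clean(LISTING[fwd_id]),
        "forward_11_35": nucleotides(LISTING[fwd_id], 11, 35),
        "reverse_full": clean(LISTING[rev_id]),
        "reverse_32_56": nucleotides(LISTING[rev_id], 32, 56),
    }


def common_prefix(seqs):
    """Longest prefix shared by every sequence in seqs."""
    prefix = seqs[0]
    for s in seqs[1:]:
        while not s.startswith(prefix):
            prefix = prefix[:-1]
    return prefix


forward_tag = common_prefix([clean(LISTING[f]) for f, _ in PRIMER_PAIRS.values()])
reverse_tag = common_prefix([clean(LISTING[r]) for _, r in PRIMER_PAIRS.values()])

for name in PRIMER_PAIRS:
    forms = primer_forms(name)
    print(name, forms["forward_11_35"], forms["reverse_32_56"])
print(len(forward_tag), len(reverse_tag))
```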
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110789125.7A CN115612731A (en) | 2021-07-13 | 2021-07-13 | Molecular marker for auxiliary diagnosis of cancer |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110789125.7A CN115612731A (en) | 2021-07-13 | 2021-07-13 | Molecular marker for auxiliary diagnosis of cancer |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115612731A (en) | 2023-01-17 |
Family
ID=84855589
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110789125.7A Pending CN115612731A (en) | 2021-07-13 | 2021-07-13 | Molecular marker for auxiliary diagnosis of cancer |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115612731A (en) |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Leygo et al. | DNA methylation as a noninvasive epigenetic biomarker for the detection of cancer | |
CN111910004B (en) | Application of cfDNA in noninvasive diagnosis of early breast cancer | |
CN111863250B (en) | Combined diagnosis model and system for early breast cancer | |
CN116790752A (en) | Molecular marker for early screening and early diagnosing lung cancer | |
CN113136428B (en) | Application of methylation marker in auxiliary diagnosis of cancer | |
Yan et al. | A review on cancer of unknown primary origin: the role of molecular biomarkers in the identification of unknown primary origin | |
CN113215252B (en) | Methylation markers for aiding in the diagnosis of cancer | |
CN113355412B (en) | Methylation markers and kits for aiding in the diagnosis of cancer | |
CN114480630A (en) | Methylation marker for auxiliary diagnosis of cancer | |
CN115612731A (en) | Molecular marker for auxiliary diagnosis of cancer | |
CN113355413B (en) | Application of molecular marker and kit in auxiliary diagnosis of cancer | |
CN113215251B (en) | Methylation marker for assisting diagnosis of cancer | |
CN115701454A (en) | Molecular marker and kit for auxiliary diagnosis of cancer | |
CN113122630B (en) | Calbindin methylation markers for use in aiding diagnosis of cancer | |
CN115612735A (en) | Potential molecular marker for auxiliary diagnosis of cancer | |
JP2018139537A (en) | Method of data acquisition of possibility of lymph node metastasis of esophageal cancer | |
CN115612732A (en) | Marker for auxiliary diagnosis of cancer and kit thereof | |
CN115701453A (en) | Molecular marker and kit for auxiliary diagnosis of cancer | |
CN118028461A (en) | Application of protein gene in auxiliary diagnosis of cancer | |
CN117568473A (en) | Methylation molecular marker for auxiliary diagnosis of cancer | |
CN114507731B (en) | Methylation marker and kit for assisting cancer diagnosis | |
CN117568470A (en) | Molecular marker and kit for auxiliary diagnosis of cancer | |
CN117568471A (en) | Protein gene methylation as a molecular marker for aiding in the diagnosis of cancer | |
CN113215250B (en) | Use of methylation level of genes in aiding diagnosis of cancer | |
CN117604094A (en) | Methylation marker and application of kit in auxiliary diagnosis of cancer |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
CB02 | Change of applicant information | Country or region after: China. Address after: 200072, 3rd to 4th floors, Building 10, No. 351 Yuexiu Road, Hongkou District, Shanghai. Applicant after: Tengchen Biotechnology (Shanghai) Co.,Ltd. Address before: 210032 building 02, life science and technology Island, No. 11, Yaogu Avenue, Jiangbei new area, Nanjing, Jiangsu Province. Applicant before: Nanjing Tengchen Biological Technology Co.,Ltd. Country or region before: China. |