CN115701454A - Molecular marker and kit for auxiliary diagnosis of cancer - Google Patents

Molecular marker and kit for auxiliary diagnosis of cancer

Publication number
CN115701454A
Authority
CN
China
Prior art keywords
cancer
lung cancer
seq
folr3
lung
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110882053.0A
Other languages
Chinese (zh)
Inventor
狄飞飞
张筝
张晶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Tengchen Biological Technology Co ltd
Original Assignee
Nanjing Tengchen Biological Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Tengchen Biological Technology Co ltd
Priority to CN202110882053.0A
Publication of CN115701454A

Landscapes

  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses a molecular marker and a kit for auxiliary diagnosis of cancer. The invention provides the use of the methylated FOLR3 gene as a marker in the preparation of a product, where the use of the product is at least one of the following: auxiliary diagnosis of cancer or prediction of cancer risk; assisting in distinguishing benign nodules from cancer; assisting in distinguishing different subtypes of cancer; assisting in distinguishing different stages of cancer; assisting in distinguishing between different cancers; and determining whether a test substance hinders or promotes the occurrence of cancer. The cancer may be lung cancer or breast cancer. The research underlying the invention found that the FOLR3 gene is hypomethylated in the blood of patients with lung cancer and breast cancer. The invention has important scientific significance and clinical application value for improving the early diagnosis and treatment of lung cancer and breast cancer and reducing mortality.

Description

Molecular marker and kit for auxiliary diagnosis of cancer
Technical Field
The invention relates to the field of medicine, in particular to a molecular marker and a kit for auxiliary diagnosis of cancer.
Background
Breast cancer is a malignant tumor caused by uncontrolled proliferation of mammary epithelial cells. It is one of the most common malignant tumors in women worldwide, and its incidence ranks first among malignant tumors in women. The survival rate of breast cancer depends on the classification and stage of the tumor. The 5-year survival rate for early-stage breast cancer is typically higher than 60%, but for advanced breast cancer it drops to 40-60%, and for metastatic breast cancer it is typically about 15%. It is therefore necessary to improve the early detection rate of breast cancer so that effective diagnosis and treatment can follow. At present, early screening and diagnosis of breast cancer in clinical practice rely mainly on imaging and pathology. Among imaging methods, B-mode ultrasound involves no radiation, but because of the limits of ultrasonic imaging it has poor resolution for small lesions with inconspicuous echo changes, which are easily missed. Molybdenum-target mammography images the breast with low-dose X-rays and can clearly display the structure of each tissue layer of the breast, but it has a relatively high false positive rate, requires a breast puncture for a more accurate judgment, and exposes the patient to ionizing radiation. Breast magnetic resonance imaging uses magnetic fields and radio waves to examine breast tissue and generate internal images, and is mainly suitable for screening groups at high risk of breast cancer.
Pathological diagnosis relies mainly on breast biopsy, in which diseased tissue is removed for pathological examination; however, because biopsy is invasive, patients are often reluctant to undergo it. In addition, some common tumor markers, such as cancer antigen 15-3, cancer antigen 125, cancer antigen 27.29, carcinoembryonic antigen and circulating tumor cells, are used in diagnosing breast cancer, but their specificity and sensitivity need improvement, and they are generally used in combination with imaging. More sensitive and specific molecular markers for early breast cancer are therefore urgently needed.
Lung cancer is a malignant tumor arising in the epithelium of the bronchial mucosa. Its morbidity and mortality have been rising in recent decades, and it is the cancer with the highest morbidity and mortality worldwide. Despite recent advances in diagnostic methods, surgical techniques and chemotherapeutic drugs, the overall 5-year survival rate for lung cancer patients is only 16%, mainly because most patients already have metastases when they seek treatment and have therefore lost the chance of radical surgery. Research shows that the prognosis of lung cancer is directly related to stage: the 5-year survival rate is 83% for stage I, 53% for stage II, 26% for stage III and 6% for stage IV. The key to reducing mortality in lung cancer patients is therefore early diagnosis and early treatment.
At present, the main methods for diagnosing lung cancer are as follows. 1. Imaging methods, such as chest X-ray and low-dose helical CT. Early-stage lung cancer is difficult to detect by chest X-ray. Low-dose helical CT can find small nodules in the lung, but its false positive rate is as high as 96.4%, which places an unnecessary psychological burden on the person examined. Because of the radiation involved, neither chest X-ray nor low-dose helical CT is suitable for frequent use. In addition, imaging results are often affected by the equipment, by the physician's experience in reading films, and by the time available for reading them. 2. Cytological methods, such as sputum cytology, bronchoscopic brushing or biopsy, and bronchoalveolar lavage fluid cytology. Sputum cytology and bronchoscopic brushing or biopsy have low sensitivity for peripheral lung cancer. Bronchoscopic brushing or biopsy and bronchoalveolar lavage fluid cytology are also cumbersome to perform and uncomfortable for the person examined. 3. Serum tumor markers in common use, such as carcinoembryonic antigen (CEA), cytokeratin 19 fragment antigen (CYFRA 21-1), carbohydrate antigens (CA 125/153/199) and neuron-specific enolase (NSE). These serum tumor markers have limited sensitivity for lung cancer, typically 30-40%, and even lower for stage I tumors. Their tumor specificity is also limited, because they are affected by many benign conditions such as benign tumors, inflammation and degenerative diseases. At present, tumor markers are mainly used for screening malignant tumors and for re-examining the effect of tumor treatment. An efficient and specific technique for early diagnosis of lung cancer is therefore needed.
Currently, the internationally accepted most effective method for diagnosing pulmonary nodules is low-dose helical CT screening of the chest. However, although low-dose helical CT is highly sensitive and finds a large number of nodules, it has difficulty distinguishing benign nodules from malignant ones. Among the nodules found, fewer than 4% are malignant. At present, clinical identification of benign and malignant pulmonary nodules requires long-term follow-up with repeated CT examinations, or invasive sampling of the nodule by transthoracic fine-needle biopsy, bronchoscopic biopsy, or thoracoscopic or open-chest lung biopsy. CT-guided or ultrasound-guided transthoracic needle biopsy has relatively high sensitivity, but for nodules smaller than 2 cm its diagnostic yield is lower, with a missed diagnosis rate of 30-70%, and the incidence of pneumothorax and bleeding is relatively high. Bronchoscopic needle aspiration biopsy has a relatively low complication rate, but its diagnostic yield for peripheral nodules is limited: only 34% for nodules of 2 cm or less, and 63% for nodules larger than 2 cm. Surgical resection has a high diagnostic yield and treats the nodule directly, but it temporarily reduces the patient's lung function, and if the nodule is benign the patient has undergone unnecessary surgery, which amounts to over-treatment. New in vitro diagnostic molecular markers are therefore urgently needed to assist in identifying pulmonary nodules, so as to reduce the rate of missed diagnosis while minimizing unnecessary punctures or surgery.
DNA methylation is an important chemical modification of genes that affects the regulation of gene transcription and nuclear structure. Alterations in DNA methylation are both early and accompanying events in cancer progression, and in tumor tissue they appear mainly as hypermethylation of tumor suppressor genes and hypomethylation of proto-oncogenes. However, there have been relatively few reports on the correlation between DNA methylation in blood and the occurrence and development of tumors. Because blood is easy to collect and DNA methylation is stable, a tumor-specific blood DNA methylation marker, if found, would have great clinical application value. Exploring and developing a blood DNA methylation diagnostic technique suited to clinical testing therefore has important clinical application value and social significance for improving the early diagnosis and treatment of breast cancer and lung cancer and reducing mortality.
Disclosure of Invention
The invention aims to provide a folate receptor gamma (FOLR3) molecular marker and a kit for auxiliary diagnosis of cancer.
In a first aspect, the present invention claims the use of the methylated FOLR3 gene as a marker in the preparation of a product. The use of the product is at least one of the following:
(1) Auxiliary diagnosis of cancer or prediction of cancer risk;
(2) Assisting in distinguishing benign nodules from cancer;
(3) Assisting in distinguishing different subtypes of cancer;
(4) Assisting in distinguishing different stages of cancer;
(5) Auxiliary diagnosis of lung cancer or prediction of lung cancer risk;
(6) Assisting in distinguishing benign nodules of the lung from lung cancer;
(7) Assisting in distinguishing different subtypes of lung cancer;
(8) Assisting in distinguishing different stages of lung cancer;
(9) Auxiliary diagnosis of breast cancer or prediction of breast cancer risk;
(10) Assisting in distinguishing different stages of breast cancer;
(11) Assisting in distinguishing lung cancer from breast cancer;
(12) Determining whether a test substance hinders or promotes the occurrence of cancer.
Further, the auxiliary diagnosis of cancer in (1) may be embodied as at least one of the following: assisting in distinguishing cancer patients from cancer-free controls (understood as individuals with no current or past cancer, no reported benign nodules of the lung or breast, and routine blood indices within the reference range); assisting in distinguishing between different cancers.
Further, the benign nodules in (2) are the benign nodules corresponding to the cancer in (2); for example, benign nodules of the lung correspond to lung cancer.
Further, the different subtypes of cancer in (3) may be pathological types, such as histological types.
Further, the different stages of the cancer in (4) may be clinical stages or TNM stages.
In a specific embodiment of the present invention, the auxiliary diagnosis of lung cancer in (5) is embodied as at least one of the following: assisting in distinguishing lung cancer patients from cancer-free controls, lung adenocarcinoma patients from cancer-free controls, lung squamous cell carcinoma patients from cancer-free controls, small cell lung cancer patients from cancer-free controls, stage I lung cancer patients from cancer-free controls, stage II-III lung cancer patients from cancer-free controls, lung cancer patients without lymph node infiltration from cancer-free controls, and lung cancer patients with lymph node infiltration from cancer-free controls. A cancer-free control is understood as an individual with no current or past cancer, no reported benign nodules of the lung or breast, and routine blood indices within the reference range.
In a specific embodiment of the present invention, assisting in distinguishing benign nodules of the lung from lung cancer in (6) is embodied as at least one of the following: assisting in distinguishing lung cancer from benign nodules of the lung, lung adenocarcinoma from benign nodules of the lung, lung squamous cell carcinoma from benign nodules of the lung, small cell lung cancer from benign nodules of the lung, stage I lung cancer from benign nodules of the lung, stage II-III lung cancer from benign nodules of the lung, lung cancer without lymph node infiltration from benign nodules of the lung, and lung cancer with lymph node infiltration from benign nodules of the lung.
In a specific embodiment of the present invention, assisting in distinguishing different subtypes of lung cancer in (7) is embodied as assisting in distinguishing any two of lung adenocarcinoma, lung squamous cell carcinoma and small cell lung cancer.
In a specific embodiment of the present invention, assisting in distinguishing different stages of lung cancer in (8) is embodied as at least one of the following: assisting in distinguishing any two of T1-stage, T2-stage and T3-stage lung cancer; assisting in distinguishing lung cancer without lymph node infiltration from lung cancer with lymph node infiltration; assisting in distinguishing any two of clinical stage I, clinical stage II and clinical stage III lung cancer.
In a specific embodiment of the present invention, the auxiliary diagnosis of breast cancer in (9) is embodied as assisting in distinguishing breast cancer patients from cancer-free female controls. A cancer-free female control is understood as a woman with no current or past cancer, no reported benign nodules of the lung or breast, and routine blood indices within the reference range.
In a specific embodiment of the present invention, assisting in distinguishing different stages of breast cancer in (10) is embodied as at least one of the following: assisting in distinguishing any two of T1-stage, T2-stage and T3-stage breast cancer; assisting in distinguishing breast cancer without lymph node infiltration from breast cancer with lymph node infiltration; assisting in distinguishing any two of clinical stage I, clinical stage II and clinical stage III breast cancer.
In the above (1) to (12), the cancer may be a cancer which can cause a reduction in the methylation level of the FOLR3 gene in the body, such as lung cancer, breast cancer, and the like.
In a second aspect, the invention claims the use of a substance for detecting the methylation level of the FOLR3 gene for the preparation of a product. The use of the product may be at least one of the above (1) to (12).
In a third aspect, the invention claims the use of a substance for detecting the methylation level of the FOLR3 gene, together with a medium recording the method for building the mathematical model and/or the method for using it, in the preparation of a product. The use of the product may be at least one of the above (1) to (12).
The mathematical model may be obtained according to a method comprising the steps of:
(A1) Detecting the FOLR3 gene methylation levels of n1 type A samples and n2 type B samples, respectively (the training set);
(A2) Using the FOLR3 gene methylation level data of all samples obtained in step (A1), establishing a mathematical model by binary logistic regression according to the type A / type B classification, and determining a threshold for classification.
Wherein n1 and n2 in (A1) can both be positive integers of 50 or more.
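Steps (A1) and (A2) can be illustrated with a minimal sketch. It is not the fitting procedure of the invention: the synthetic methylation values, the use of plain batch gradient descent, the coding of type A as 1 and type B as 0, and all function names (`fit_logistic`, `detection_index`) are assumptions made for illustration only.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(samples, labels, lr=1.0, epochs=3000):
    """Fit b0..bn of log(y/(1-y)) = b0 + b1*x1 + ... + bn*xn by batch gradient descent."""
    b = [0.0] * (len(samples[0]) + 1)  # b[0] is the constant term b0
    m = len(samples)
    for _ in range(epochs):
        grad = [0.0] * len(b)
        for x, t in zip(samples, labels):
            err = sigmoid(b[0] + sum(w * v for w, v in zip(b[1:], x))) - t
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        b = [w - lr * g / m for w, g in zip(b, grad)]
    return b


def detection_index(b, x):
    # Detection index y for one sample with methylation values x.
    return sigmoid(b[0] + sum(w * v for w, v in zip(b[1:], x)))


def clip01(v):
    # Methylation levels are values between 0 and 1.
    return min(1.0, max(0.0, v))


# Synthetic training set (illustrative, not patent data): n1 = n2 = 50,
# two CpG sites per sample, type A slightly hypomethylated relative to type B.
random.seed(0)
type_a = [[clip01(random.gauss(0.40, 0.05)), clip01(random.gauss(0.50, 0.05))] for _ in range(50)]
type_b = [[clip01(random.gauss(0.60, 0.05)), clip01(random.gauss(0.55, 0.05))] for _ in range(50)]
samples = type_a + type_b
labels = [1] * 50 + [0] * 50  # which group is coded 1 is a modelling choice

coefs = fit_logistic(samples, labels)
accuracy = sum((detection_index(coefs, x) > 0.5) == (t == 1)
               for x, t in zip(samples, labels)) / len(samples)
print("coefficients:", [round(c, 3) for c in coefs])
print("training accuracy at threshold 0.5:", accuracy)
```

In practice an established statistics package would normally be used for the regression; the hand-written fit is shown only to keep the sketch self-contained.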
The use method of the mathematical model can comprise the following steps:
(B1) Detecting the FOLR3 gene methylation level of a sample to be detected;
(B2) Substituting the FOLR3 gene methylation level data of the sample to be detected, which are obtained in the step (B1), into the mathematical model to obtain a detection index; and then comparing the detection index with the threshold value, and determining whether the type of the sample to be detected is the type A or the type B according to the comparison result.
In a specific embodiment of the present invention, the threshold is set to 0.5. A detection index greater than 0.5 is classified as one type, a detection index less than 0.5 is classified as the other type, and a detection index equal to 0.5 falls in an indeterminate gray zone. Type A and type B are the two corresponding classes; which group is designated type A and which type B is determined by the specific mathematical model, and no fixed convention is required.
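As a rough illustration of steps (B1) and (B2) with the 0.5 threshold, the sketch below computes a detection index from an already fitted model and applies the three-way rule. The coefficients are hypothetical, and the assignment of "type A" to the side above the threshold is an assumption; the text leaves that assignment to the specific model.

```python
import math


def detection_index(coefficients, x):
    """coefficients = [b0, b1, ..., bn]; x = methylation values of the sample."""
    z = coefficients[0] + sum(b * v for b, v in zip(coefficients[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def classify(index, threshold=0.5, above="type A", below="type B"):
    # Greater than the threshold: one type; less: the other; equal: gray zone.
    if index > threshold:
        return above
    if index < threshold:
        return below
    return "indeterminate"


coefficients = [1.5, -3.0]  # hypothetical b0 and b1 for a single CpG site
for value in (0.30, 0.50, 0.70):
    index = detection_index(coefficients, [value])
    print(value, round(index, 3), classify(index))
```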
In practical applications, the threshold may also be determined according to the maximum Youden index (specifically, it may be the value corresponding to the maximum Youden index). A detection index greater than the threshold is classified as one type, a detection index less than the threshold is classified as the other type, and a detection index equal to the threshold falls in an indeterminate gray zone. Type A and type B are the two corresponding classes; which group is designated type A and which type B is determined by the specific mathematical model, and no fixed convention is required.
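The Youden index is J = sensitivity + specificity - 1, and the threshold is the cutoff at which J is largest. The following is a minimal sketch under stated assumptions: candidate cutoffs are taken as midpoints between neighboring distinct detection indices (so no training score sits exactly on a cutoff), the class coded 1 is the one expected above the threshold, and the scores and the function name `youden_threshold` are made up for illustration.

```python
def youden_threshold(scores, labels):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    labels: 1 for the class expected above the threshold, 0 for the other class.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    distinct = sorted(set(scores))
    candidates = [(a + b) / 2.0 for a, b in zip(distinct, distinct[1:])]
    best_t, best_j = None, -1.0
    for t in candidates:
        true_pos = sum(1 for s, l in zip(scores, labels) if s > t and l == 1)
        true_neg = sum(1 for s, l in zip(scores, labels) if s < t and l == 0)
        j = true_pos / n_pos + true_neg / n_neg - 1.0
        if j > best_j:  # on ties, the lowest cutoff is kept
            best_t, best_j = t, j
    return best_t, best_j


# Hypothetical detection indices from a training set and their true classes.
scores = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90]
labels = [0, 0, 0, 0, 1, 0, 1, 1, 1]
threshold, j = youden_threshold(scores, labels)
print(threshold, j)
```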
The type A sample and the type B sample are any one of the following:
(C1) Lung cancer samples and cancer-free controls;
(C2) Lung cancer samples and benign nodule samples of lung;
(C3) Samples of different subtypes of lung cancer;
(C4) Samples of different stages of lung cancer;
(C5) Lung cancer samples and breast cancer samples;
(C6) Breast cancer samples and cancer-free female controls;
(C7) Samples of different stages of breast cancer.
In a fourth aspect, the invention claims the use of the "medium recording the method for building the mathematical model and/or the method for using it" described in the third aspect above in the preparation of a product. The use of the product may be at least one of the above (1) to (12).
In a fifth aspect, the invention claims a kit.
The claimed kit of the invention comprises a substance for detecting the methylation level of the FOLR3 gene; the use of the kit may be at least one of the above (1) to (12).
Further, the kit may also comprise the "medium recording the method for building the mathematical model and/or the method for using it" described in the third aspect or the fourth aspect.
In a sixth aspect, the invention claims a system.
The claimed system of the present invention comprises:
(D1) Reagents and/or instruments for detecting the methylation level of the FOLR3 gene;
(D2) An apparatus comprising unit X and unit Y.
The unit X is used for establishing a mathematical model and comprises a data acquisition module, a data analysis processing module and a model output module.
The data acquisition module is configured to acquire the FOLR3 gene methylation level data of the n1 type A samples and the n2 type B samples detected by (D1).
The data analysis processing module is configured to receive the FOLR3 gene methylation level data of the n1 type A samples and the n2 type B samples sent by the data acquisition module, establish a mathematical model by binary logistic regression according to the type A / type B classification, and determine a threshold for classification.
The model output module is configured to receive and output the mathematical model sent by the data analysis processing module.
The unit Y is used for determining the type of the sample to be detected and comprises a data input module, a data operation module, a data comparison module and a conclusion output module.
The data input module is configured to input the FOLR3 gene methylation level data of the subject detected by (D1).
The data operation module is configured to receive the FOLR3 gene methylation level data of the person to be tested sent by the data input module, substitute the FOLR3 gene methylation level data of the person to be tested into the mathematical model established by the data analysis processing module in the unit X, and calculate to obtain a detection index.
The data comparison module is configured to receive the detection index sent by the data operation module and compare the detection index with the threshold determined by the data analysis processing module in the unit X.
The conclusion output module is configured to receive the comparison result sent by the data comparison module and output the conclusion that the type of the sample to be tested is the type A or the type B according to the comparison result.
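The division of labor between unit X and unit Y can be pictured with a small structural sketch. This is only one possible arrangement: the class and method names, the `Model` container, the injected `fit` callable standing in for any binary logistic regression routine, and the coding of type A as the class above the threshold are all assumptions, not features recited in the text.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Model:
    coefficients: List[float]  # [b0, b1, ..., bn]
    threshold: float


class UnitX:
    """Model building: data acquisition, data analysis processing, model output."""

    def __init__(self, fit: Callable[[list, list], List[float]], threshold: float = 0.5):
        self.fit = fit  # any binary logistic regression fitter (assumed interface)
        self.threshold = threshold

    def acquire(self, type_a, type_b):
        # Data acquisition module: pool both groups; type A coded 1, type B coded 0.
        samples = list(type_a) + list(type_b)
        labels = [1] * len(type_a) + [0] * len(type_b)
        return samples, labels

    def build(self, type_a, type_b) -> Model:
        # Data analysis processing module fits the model; model output module returns it.
        samples, labels = self.acquire(type_a, type_b)
        return Model(self.fit(samples, labels), self.threshold)


class UnitY:
    """Sample typing: data input, data operation, data comparison, conclusion output."""

    def __init__(self, model: Model):
        self.model = model

    def detection_index(self, x: Sequence[float]) -> float:
        # Data operation module: substitute the methylation values into the model.
        b = self.model.coefficients
        z = b[0] + sum(w * v for w, v in zip(b[1:], x))
        return 1.0 / (1.0 + math.exp(-z))

    def conclude(self, x: Sequence[float]) -> str:
        # Data comparison and conclusion output modules.
        index = self.detection_index(x)
        if index > self.model.threshold:
            return "type A"
        if index < self.model.threshold:
            return "type B"
        return "indeterminate"


def stub_fit(samples, labels):
    # Placeholder fitter returning fixed, hypothetical coefficients.
    return [2.0, -4.0]


model = UnitX(stub_fit).build([[0.3], [0.4]], [[0.6], [0.7]])
unit_y = UnitY(model)
print(unit_y.conclude([0.35]), unit_y.conclude([0.65]))
```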
The type A sample and the type B sample are any one of the following:
(C1) Lung cancer samples and cancer-free controls;
(C2) Lung cancer samples and benign nodule samples of the lung;
(C3) Samples of different subtypes of lung cancer;
(C4) Samples of different stages of lung cancer;
(C5) Lung cancer samples and breast cancer samples;
(C6) Breast cancer samples and cancer-free female controls;
(C7) Breast cancer samples at different stages.
Wherein n1 and n2 may both be positive integers of 50 or more.
In a specific embodiment of the present invention, the threshold is set to 0.5. A detection index greater than 0.5 is classified as one type, a detection index less than 0.5 is classified as the other type, and a detection index equal to 0.5 falls in an indeterminate gray zone. Type A and type B are the two corresponding classes; which group is designated type A and which type B is determined by the specific mathematical model, and no fixed convention is required.
In practical applications, the threshold may also be determined according to the maximum Youden index (specifically, it may be the value corresponding to the maximum Youden index). A detection index greater than the threshold is classified as one type, a detection index less than the threshold is classified as the other type, and a detection index equal to the threshold falls in an indeterminate gray zone. Type A and type B are the two corresponding classes; which group is designated type A and which type B is determined by the specific mathematical model, and no fixed convention is required.
In the foregoing aspects, the methylation level of the FOLR3 gene may be the methylation level of all or part of the CpG sites in the fragments of the FOLR3 gene shown in (e1) to (e3) below. The methylated FOLR3 gene may be the FOLR3 gene methylated at all or part of the CpG sites in the fragments shown in (e1) to (e3) below.
(e1) A DNA fragment shown in SEQ ID No.1 or a DNA fragment having 80% or more identity with it;
(e2) A DNA fragment shown in SEQ ID No.2 or a DNA fragment having 80% or more identity with it;
(e3) A DNA fragment shown in SEQ ID No.3 or a DNA fragment having 80% or more identity with it.
Further, the "all or part of the CpG sites" may be any one or more CpG sites in the 3 DNA fragments of the FOLR3 gene shown in SEQ ID No.1 to SEQ ID No.3. The upper limit of "more CpG sites" is all CpG sites in the 3 DNA fragments shown in SEQ ID No.1 to SEQ ID No.3. All CpG sites in the DNA fragment shown in SEQ ID No.1 are listed in Table 1, all CpG sites in the DNA fragment shown in SEQ ID No.2 are listed in Table 2, and all CpG sites in the DNA fragment shown in SEQ ID No.3 are listed in Table 3.
Alternatively, the "all or part of the CpG sites" may be all CpG sites in the DNA fragment shown in SEQ ID No.1 (see Table 1) and all CpG sites in the DNA fragment shown in SEQ ID No.2 (see Table 2).
Alternatively, the "all or part of the CpG sites" may be all CpG sites in the DNA fragment shown in SEQ ID No.1 (see Table 1) and all CpG sites in the DNA fragment shown in SEQ ID No.3 (see Table 3).
Alternatively, the "all or part of the CpG sites" may be all CpG sites in the DNA fragment shown in SEQ ID No.2 (see Table 2) and all CpG sites in the DNA fragment shown in SEQ ID No.3 (see Table 3).
Alternatively, the "all or part of the CpG sites" may be all CpG sites in the DNA fragment shown in SEQ ID No.1 (see Table 1), all CpG sites in the DNA fragment shown in SEQ ID No.2 (see Table 2) and all CpG sites in the DNA fragment shown in SEQ ID No.3 (see Table 3).
Alternatively, the "all or part of the CpG sites" may be all, or any 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1, of the CpG sites in the DNA fragment shown in SEQ ID No.1.
Alternatively, the "all or part of the CpG sites" may be all, or any 4, 3, 2 or 1, of the following 5 CpG sites in the DNA fragment shown in SEQ ID No.1:
(f1) The CpG site at positions 40-41 from the 5' end of the DNA fragment shown in SEQ ID No.1 (FOLR3_A_1);
(f2) The CpG site at positions 95-96 from the 5' end of the DNA fragment shown in SEQ ID No.1 (FOLR3_A_2);
(f3) The CpG site at positions 152-153 from the 5' end of the DNA fragment shown in SEQ ID No.1 (FOLR3_A_3);
(f4) The CpG site at positions 157-158 from the 5' end of the DNA fragment shown in SEQ ID No.1 (FOLR3_A_4);
(f5) The CpG site at positions 361-362 from the 5' end of the DNA fragment shown in SEQ ID No.1 (FOLR3_A_5).
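The position notation used in (f1) to (f5) counts nucleotides from the 5' end starting at 1, with each CpG site occupying two consecutive positions (the C and the G). A minimal sketch of how such positions can be listed for any sequence is given below; the toy sequence is invented for illustration and is not SEQ ID No.1, and the function name `cpg_sites` is hypothetical.

```python
def cpg_sites(sequence):
    """Return 1-based (C position, G position) pairs for every CpG in a 5'->3' sequence."""
    s = sequence.upper()
    return [(i + 1, i + 2) for i in range(len(s) - 1) if s[i:i + 2] == "CG"]


toy_fragment = "ATCGGACGTTCGA"  # invented example, not a sequence from the patent
for c_pos, g_pos in cpg_sites(toy_fragment):
    print(f"CpG site at positions {c_pos}-{g_pos} from the 5' end")
```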
In a specific embodiment of the present invention, when DNA methylation analysis is performed by time-of-flight mass spectrometry, several adjacent CpG sites that lie on the same fragment and whose peaks cannot be distinguished are treated as a single methylation site (the indistinguishable sites are listed in Table 5); the methylation level analysis and the construction and use of the related mathematical models are carried out on that basis.
In the above aspects, the substance for detecting the methylation level of the FOLR3 gene may comprise (or be) a primer combination for amplifying a full-length or partial fragment of the FOLR3 gene. The reagent for detecting the methylation level of the FOLR3 gene can comprise (or is) a primer combination for amplifying the full-length or partial fragment of the FOLR3 gene; the instrument for detecting the methylation level of the FOLR3 gene can be a time-of-flight mass spectrometer. Of course, the reagent for detecting the methylation level of the FOLR3 gene can also comprise other conventional reagents used for performing time-of-flight mass spectrometry.
Further, the partial fragment may be at least one of:
(g1) The DNA fragment shown in SEQ ID No.1 or the DNA fragment contained in the DNA fragment;
(g2) A DNA fragment shown in SEQ ID No.2 or a DNA fragment contained in the DNA fragment;
(g3) A DNA fragment shown as SEQ ID No.3 or a DNA fragment contained therein;
(g4) A DNA fragment having 80% or more identity to the DNA fragment shown in SEQ ID No.1 or to a DNA fragment contained therein;
(g5) A DNA fragment having 80% or more identity to the DNA fragment shown in SEQ ID No.2 or to a DNA fragment contained therein;
(g6) A DNA fragment having 80% or more identity to the DNA fragment shown in SEQ ID No.3 or to a DNA fragment contained therein.
Further, the primer combination is a primer pair A and/or a primer pair B and/or a primer pair C.
Primer pair A consists of primer A1 and primer A2; primer A1 is the single-stranded DNA shown in SEQ ID No.4 or in nucleotides 11-35 of SEQ ID No.4; primer A2 is the single-stranded DNA shown in SEQ ID No.5 or in nucleotides 32-56 of SEQ ID No.5.
Primer pair B consists of primer B1 and primer B2; primer B1 is the single-stranded DNA shown in SEQ ID No.6 or in nucleotides 11-35 of SEQ ID No.6; primer B2 is the single-stranded DNA shown in SEQ ID No.7 or in nucleotides 32-56 of SEQ ID No.7.
Primer pair C consists of primer C1 and primer C2; primer C1 is the single-stranded DNA shown in SEQ ID No.8 or in nucleotides 11-35 of SEQ ID No.8; primer C2 is the single-stranded DNA shown in SEQ ID No.9 or in nucleotides 32-56 of SEQ ID No.9.
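Each primer is thus defined either by a full sequence or by a sub-range of it (nucleotides 11-35 or 32-56, counted from 1 and inclusive at both ends). A short sketch of that range convention follows; the placeholder sequence is invented and is not any of SEQ ID No.4 to SEQ ID No.9, and the helper name `nucleotides` is hypothetical.

```python
def nucleotides(sequence, first, last):
    """Return nucleotides first..last of a sequence (1-based, inclusive)."""
    if not 1 <= first <= last <= len(sequence):
        raise ValueError("range lies outside the sequence")
    return sequence[first - 1:last]


# Invented 35-nt placeholder: a 10-nt leading segment followed by a 25-nt core.
leading = "A" * 10
core = "CGTA" * 6 + "C"
full_primer = leading + core

print(nucleotides(full_primer, 11, 35))       # the 25-nt core
print(len(nucleotides(full_primer, 11, 35)))  # its length
```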
In addition, the invention also claims a method for determining whether a sample to be tested is a type A sample or a type B sample. The method may comprise the following steps:
(A) The mathematical model is established according to a method comprising the following steps:
(A1) Detecting the FOLR3 gene methylation levels of n1 type A samples and n2 type B samples, respectively (the training set);
(A2) Using the FOLR3 gene methylation level data of all samples obtained in step (A1), establishing a mathematical model by binary logistic regression according to the type A / type B classification, and determining a threshold for classification.
Wherein n1 and n2 in (A1) are both positive integers of 50 or more.
(B) Whether the sample to be tested is a type A sample or a type B sample can be determined according to a method comprising the following steps:
(B1) Detecting the FOLR3 gene methylation level of the sample to be detected;
(B2) Substituting the FOLR3 gene methylation level data of the sample to be detected, which are obtained in the step (B1), into the mathematical model to obtain a detection index; and then comparing the detection index with the threshold value, and determining whether the type of the sample to be detected is the type A or the type B according to the comparison result.
In a specific embodiment of the present invention, the threshold is set to 0.5. A detection index greater than 0.5 is assigned to one class, a detection index less than 0.5 is assigned to the other class, and a detection index equal to 0.5 is treated as an indeterminate gray zone. Type A and type B are the two classes of a binary classification; which group is designated type A and which type B depends on the specific mathematical model and is not fixed here.
In practical applications, the threshold may also be determined according to the maximum Youden index (specifically, it may be the value corresponding to the maximum Youden index). A detection index greater than the threshold is assigned to one class, a detection index less than the threshold is assigned to the other class, and a detection index equal to the threshold is treated as an indeterminate gray zone. As above, which group is designated type A and which type B depends on the specific mathematical model and is not fixed here.
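The threshold selection described above, choosing the cut-off that maximizes the index J = sensitivity + specificity - 1 (Youden's index), can be sketched as follows. This is a minimal illustration, not the procedure of the embodiments: the function name `youden_threshold`, the convention that label 1 marks the class expected to score high, and the toy detection indices are all assumptions.

```python
def youden_threshold(scores, labels):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    scores: detection indices; labels: 1 for the class expected to score
    high, 0 for the other class (an assumed coding). A sample is called
    positive when its score is strictly greater than the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_threshold, best_j = None, -1.0
    for candidate in sorted(set(scores)):
        true_pos = sum(1 for s, l in zip(scores, labels) if l == 1 and s > candidate)
        true_neg = sum(1 for s, l in zip(scores, labels) if l == 0 and s <= candidate)
        j = true_pos / positives + true_neg / negatives - 1.0
        if j > best_j:  # keep the first candidate reaching the maximum
            best_threshold, best_j = candidate, j
    return best_threshold, best_j


# Hypothetical detection indices, for illustration only.
scores = [0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.80, 0.90]
labels = [0, 0, 0, 1, 1, 0, 1, 1]
threshold, j = youden_threshold(scores, labels)
```

Candidate thresholds are taken from the observed scores themselves; other conventions (for example midpoints between adjacent scores) are equally plausible and would give slightly different cut-offs.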
The type A sample and the type B sample may be any one of the following pairs:
(C1) Lung cancer samples and no cancer controls;
(C2) Lung cancer samples and benign nodule samples of the lung;
(C3) Samples of different subtypes of lung cancer;
(C4) Samples of different stages of lung cancer;
(C5) Lung cancer samples and breast cancer samples;
(C6) Breast cancer samples and cancer-free female controls;
(C7) Breast cancer samples at different stages.
In practical application, any of the above mathematical models may vary with the DNA methylation detection method and the fitting procedure used; the model is therefore determined case by case and is not fixed here.
In the embodiment of the present invention, the model is specifically log(y/(1-y)) = b0 + b1x1 + b2x2 + b3x3 + ... + bnxn, where y is the dependent variable, i.e., the detection index obtained after the methylation values of one or more methylation sites of the sample to be tested are substituted into the model; b0 is a constant; x1 to xn are the independent variables, i.e., the methylation values of one or more methylation sites of the sample to be tested (each a value between 0 and 1); and b1 to bn are the weights assigned by the model to the methylation value of each site.
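Solving the model equation for y gives y = 1/(1 + exp(-(b0 + b1x1 + ... + bnxn))), so the detection index always lies between 0 and 1. A minimal sketch of that calculation follows; it assumes that "log" denotes the natural logarithm, as is usual for logistic regression, and the function name `detection_index` and the example numbers are illustrative only.

```python
import math


def detection_index(intercept, weights, values):
    """Solve log(y / (1 - y)) = b0 + sum(b_i * x_i) for y."""
    if len(weights) != len(values):
        raise ValueError("weights and values must have the same length")
    linear = intercept + sum(b * x for b, x in zip(weights, values))
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical constant, weights and methylation values (not from the examples).
y = detection_index(-1.0, [2.0, -0.5], [0.4, 0.6])
```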
In the embodiment of the present invention, known parameters such as age, sex and white blood cell count may be added as appropriate when establishing the model, to improve discrimination. One particular model established in embodiments of the present invention is a model for assisting in distinguishing lung benign nodules from lung cancer, specifically: log(y/(1-y)) = -0.64 + 1.775 × FOLR3_A_1 - 1.504 × FOLR3_A_2 + 1.31 × FOLR3_A_3 + 0.31 × FOLR3_A_4 - 1.29 × FOLR3_A_5 + 0.027 × age + 0.734 × sex (male assigned 1, female assigned 0) + 0.013 × white blood cell count. FOLR3_A_1 is the methylation level of the CpG site at positions 40-41 from the 5' end of the DNA fragment shown in SEQ ID No.1; FOLR3_A_2 is the methylation level of the CpG site at positions 95-96 from the 5' end of the DNA fragment shown in SEQ ID No.1; FOLR3_A_3 is the methylation level of the CpG site at positions 152-153 from the 5' end of the DNA fragment shown in SEQ ID No.1; FOLR3_A_4 is the methylation level of the CpG site at positions 157-158 from the 5' end of the DNA fragment shown in SEQ ID No.1; FOLR3_A_5 is the methylation level of the CpG site at positions 361-362 from the 5' end of the DNA fragment shown in SEQ ID No.1. The threshold of the model is 0.5. Patients whose detection index calculated by the model is greater than 0.5 are classified as candidate lung cancer patients, and patients whose detection index is less than 0.5 are classified as candidate lung benign nodule patients.
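As a worked illustration, the benign nodule versus lung cancer model above can be evaluated for one subject as sketched below. The coefficients are those stated in the text, with sex coded as male = 1 and female = 0; the dictionary keys, the function name `nodule_model_index`, the subject's values and the unit of the white blood cell count are assumptions, since the text does not specify them. "log" is assumed to be the natural logarithm.

```python
import math

# Coefficients as stated in the text; the key names are illustrative.
INTERCEPT = -0.64
COEFFICIENTS = {
    "FOLR3_A_1": 1.775,
    "FOLR3_A_2": -1.504,
    "FOLR3_A_3": 1.31,
    "FOLR3_A_4": 0.31,
    "FOLR3_A_5": -1.29,
    "age": 0.027,
    "sex": 0.734,  # male = 1, female = 0
    "white_blood_cell_count": 0.013,  # unit not given in the text
}


def nodule_model_index(subject):
    """Return the detection index y for one subject (dict keyed as above)."""
    linear = INTERCEPT + sum(COEFFICIENTS[k] * subject[k] for k in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical subject; these values are not data from the examples.
subject = {
    "FOLR3_A_1": 0.30, "FOLR3_A_2": 0.40, "FOLR3_A_3": 0.25,
    "FOLR3_A_4": 0.35, "FOLR3_A_5": 0.20,
    "age": 60, "sex": 1, "white_blood_cell_count": 6.0,
}
index = nodule_model_index(subject)
is_candidate_lung_cancer = index > 0.5
```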
In the above aspects, the detecting the methylation level of the FOLR3 gene is detecting the methylation level of the FOLR3 gene in blood.
In the above aspects, when the type A sample and the type B sample are samples of different subtypes of lung cancer as in (C3), they may specifically be any two of a lung adenocarcinoma sample, a lung squamous carcinoma sample and a small cell lung cancer sample.
In the above aspects, when the type A sample and the type B sample are samples of different stages of lung cancer as in (C4), they may specifically be any two of a clinical stage I lung cancer sample, a clinical stage II lung cancer sample and a clinical stage III lung cancer sample.
In the above aspects, when the type A sample and the type B sample are samples of different stages of breast cancer as in (C7), they may specifically be any two of a T1 stage breast cancer sample, a T2 stage breast cancer sample and a T3 stage breast cancer sample; or a breast cancer sample without lymph node infiltration and a breast cancer sample with lymph node infiltration; or any two of a clinical stage I breast cancer sample, a clinical stage II breast cancer sample and a clinical stage III breast cancer sample.
Any of the above FOLR3 genes may specifically include Genbank accession numbers: NM_000804.4 and NM_001318045.2.
The invention discloses hypomethylation of the FOLR3 gene in the blood of lung cancer patients and breast cancer patients. Experiments show that, with blood as the sample, cancer patients (lung cancer and breast cancer) can be distinguished from cancer-free controls, lung benign nodules can be distinguished from lung cancer, different subtypes and different stages of lung cancer can be distinguished, different stages of breast cancer can be distinguished, and lung cancer can be distinguished from breast cancer. The invention has important scientific significance and clinical application value for improving the early diagnosis and treatment of lung cancer and breast cancer and for reducing mortality.
Drawings
Fig. 1 is a schematic diagram of a mathematical model.
Fig. 2 is an illustration of a mathematical model.
Detailed Description
The present invention is described in further detail below with reference to specific embodiments, which are given for the purpose of illustration only and are not intended to limit the scope of the invention. The examples provided below serve as a guide for further modifications by a person skilled in the art and do not constitute a limitation of the invention in any way.
The experimental procedures in the following examples, unless otherwise indicated, are conventional and are carried out according to the techniques or conditions described in the literature in the field or according to the instructions of the products. Materials, reagents and the like used in the following examples are commercially available unless otherwise specified.
Quantitative tests of the folate receptor 3 (FOLR3) gene in the following examples were performed in triplicate, and the results were averaged.
Example 1 primer design for detecting methylation site of FOLR3 Gene
Through extensive sequence and functional analyses, three fragments of the FOLR3 gene (the FOLR3_A, FOLR3_B and FOLR3_C fragments) were selected for analysis of methylation level and its relationship with cancer.
The FOLR3_A fragment (SEQ ID No.1) is located at chr11:71846694-71847320 of the hg19 reference genome, sense strand.
The FOLR3_B fragment (SEQ ID No.2) is located at chr11:71847990-71848596 of the hg19 reference genome, sense strand.
The FOLR3_C fragment (SEQ ID No.3) is located at chr11:71849684-71850330 of the hg19 reference genome, sense strand.
The CpG site information for the FOLR3_A fragment is shown in Table 1.
The CpG site information for the FOLR3_B fragment is shown in Table 2.
The CpG site information for the FOLR3_C fragment is shown in Table 3.
TABLE 1 CpG site information in the FOLR3_A fragment
CpG site      Position of the CpG site in the sequence
FOLR3_A_1     Positions 40-41 from the 5' end of SEQ ID No.1
FOLR3_A_2     Positions 95-96 from the 5' end of SEQ ID No.1
FOLR3_A_3     Positions 152-153 from the 5' end of SEQ ID No.1
FOLR3_A_4     Positions 157-158 from the 5' end of SEQ ID No.1
FOLR3_A_5     Positions 361-362 from the 5' end of SEQ ID No.1
FOLR3_A_6     Positions 385-386 from the 5' end of SEQ ID No.1
FOLR3_A_7     Positions 387-388 from the 5' end of SEQ ID No.1
FOLR3_A_8     Positions 397-398 from the 5' end of SEQ ID No.1
FOLR3_A_9     Positions 422-423 from the 5' end of SEQ ID No.1
FOLR3_A_10    Positions 452-453 from the 5' end of SEQ ID No.1
FOLR3_A_11    Positions 458-459 from the 5' end of SEQ ID No.1
FOLR3_A_12    Positions 527-528 from the 5' end of SEQ ID No.1
FOLR3_A_13    Positions 567-568 from the 5' end of SEQ ID No.1
FOLR3_A_14    Positions 592-593 from the 5' end of SEQ ID No.1
TABLE 2 CpG site information in the FOLR3_B fragment
CpG site      Position of the CpG site in the sequence
FOLR3_B_1     Positions 53-54 from the 5' end of SEQ ID No.2
FOLR3_B_2     Positions 85-86 from the 5' end of SEQ ID No.2
FOLR3_B_3     Positions 95-96 from the 5' end of SEQ ID No.2
FOLR3_B_4     Positions 181-182 from the 5' end of SEQ ID No.2
FOLR3_B_5     Positions 387-388 from the 5' end of SEQ ID No.2
FOLR3_B_6     Positions 407-408 from the 5' end of SEQ ID No.2
FOLR3_B_7     Positions 578-579 from the 5' end of SEQ ID No.2
TABLE 3 CpG site information in the FOLR3_C fragment
(Table provided as images in the original: Figure BDA0003192406100000091 and Figure BDA0003192406100000101.)
Specific PCR primers were designed for the three fragments (the FOLR3_A, FOLR3_B and FOLR3_C fragments), as shown in Table 4. SEQ ID No.4, SEQ ID No.6 and SEQ ID No.8 are forward primers, and SEQ ID No.5, SEQ ID No.7 and SEQ ID No.9 are reverse primers. In SEQ ID No.4, SEQ ID No.6 and SEQ ID No.8, positions 1 to 10 from the 5' end are a non-specific tag and positions 11 to 35 are the specific primer sequence; in SEQ ID No.5, SEQ ID No.7 and SEQ ID No.9, positions 1 to 31 from the 5' end are a non-specific tag and positions 32 to 56 are the specific primer sequence. The primer sequences contain no SNP or CpG sites.
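The positions given above are 1-based and inclusive, so the gene-specific portion of each tagged primer is 25 nucleotides long. A minimal sketch of extracting that portion follows; the function name `specific_region` and the placeholder sequences are hypothetical and are not the sequences of SEQ ID No.4 to SEQ ID No.9.

```python
# 1-based inclusive positions of the gene-specific portion, as stated in the text.
SPECIFIC_POSITIONS = {"forward": (11, 35), "reverse": (32, 56)}


def specific_region(primer, direction):
    """Return the gene-specific part of a tagged primer, dropping the 5' tag."""
    start, end = SPECIFIC_POSITIONS[direction]
    if len(primer) < end:
        raise ValueError("primer is shorter than the stated specific region")
    return primer[start - 1:end]


# Placeholder sequences only: a poly-T tag followed by a made-up 25-nt repeat.
forward = "T" * 10 + "ACGTA" * 5
reverse = "T" * 31 + "GATCA" * 5
forward_specific = specific_region(forward, "forward")
reverse_specific = specific_region(reverse, "reverse")
```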
TABLE 4 FOLR3 methylation primer sequences
(Table provided as an image in the original: Figure BDA0003192406100000102.)
Example 2 detection of FOLR3 Gene methylation and analysis of the results
1. Research sample
With the informed consent of the participants, blood samples were collected from a total of 722 lung cancer patients, 152 patients with lung benign nodules, 227 breast cancer patients and 945 cancer-free controls (cancer-free controls being individuals with no past or current cancer, no reported pulmonary nodules, and routine blood indicators within the reference range).
All patient samples were collected before surgery, and all diagnoses were confirmed by imaging and pathology.
The lung cancer and breast cancer subtypes are judged according to histopathology.
The staging of lung cancer and breast cancer takes AJCC 8 th edition staging system as a judgment standard.
The 722 lung cancer patients were classified by type: 619 cases of lung adenocarcinoma, 42 cases of lung squamous carcinoma, 49 cases of small cell lung cancer, and 12 other cases.
The 722 lung cancer patients were classified by stage: 649 cases in stage I, 41 cases in stage II and 32 cases in stage III.
The 722 lung cancer patients were classified by lung cancer tumor size (T): T1, T2 and T3 are 603 and 36 respectively.
The 722 lung cancer patients were classified by the presence or absence of lymph node infiltration (N): 688 cases of lung cancer without lymph node infiltration and 34 cases of lung cancer with lymph node infiltration.
227 breast cancer patients were classified according to type: 34 cases of ductal carcinoma in situ of the breast, 165 cases of invasive ductal carcinoma and 28 cases of invasive lobular carcinoma.
227 breast cancer patients were divided by stage: 198 cases in stage I, 20 cases in stage II and 9 cases in stage III.
The 227 breast cancer patients were classified by breast cancer tumor size (T): T1, T2, T3 and 189.
The 227 breast cancer patients were classified by the presence or absence of lymph node infiltration (N): 201 cases of breast cancer without lymph node infiltration and 26 cases of breast cancer with lymph node infiltration.
The median ages of the cancer-free population, lung benign nodule patients, lung cancer patients and breast cancer patients were 56, 57, 58 and 56 years, respectively. The male-to-female ratio in each of the three groups of cancer-free controls, lung benign nodule patients and lung cancer patients was about 1:1, and all breast cancer patients were women.
2. Methylation detection
1. Total DNA of the blood sample was extracted.
2. The total DNA of the blood sample prepared in step 1 was treated with bisulfite (see the instructions of the Qiagen DNA methylation kit). After bisulfite treatment, unmethylated cytosine (C) is converted to uracil (U), while methylated cytosine remains unchanged; that is, the C base of an original CpG site either remains C (if methylated) or is converted to U (if unmethylated).
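The effect of bisulfite treatment on a single strand can be sketched as a simple string transformation. This is an illustration of the conversion rule only, not of the kit chemistry; the function name `bisulfite_convert`, the 0-based indexing of methylated positions and the example sequence are assumptions.

```python
def bisulfite_convert(sequence, methylated_positions=()):
    """Convert unmethylated C to U; C at a methylated (0-based) index stays C."""
    methylated = set(methylated_positions)
    return "".join(
        "U" if base == "C" and i not in methylated else base
        for i, base in enumerate(sequence.upper())
    )


# Hypothetical 8-nt strand with CpG sites at indices 1-2 and 5-6.
strand = "ACGTCCGA"
converted_methylated = bisulfite_convert(strand, methylated_positions={1})
converted_unmethylated = bisulfite_convert(strand)
```

In the subsequent PCR, U is amplified as T, which is what makes methylated and unmethylated templates distinguishable by mass.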
3. Using the bisulfite-treated DNA from step 2 as the template, PCR amplification was performed with DNA polymerase and each of the 3 specific primer pairs in Table 4, in a reaction system as required for a conventional PCR reaction. The same conventional PCR system and the following program were used for all 3 primer pairs.
The PCR program is: 95 ℃, 4 min → (95 ℃, 20 s → 56 ℃, 30 s → 72 ℃, 2 min) × 45 cycles → 72 ℃, 5 min → 4 ℃, 1 h.
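For planning purposes, the programmed duration of the PCR program above can be added up as sketched below. The calculation counts only the stated hold times; ramping between temperatures is ignored, so the actual run time on an instrument would be longer. Variable names are illustrative.

```python
# Programmed hold times in seconds, taken from the PCR program in the text.
initial_denaturation = 4 * 60
cycle = 20 + 30 + 2 * 60          # 95 C, 56 C and 72 C steps of one cycle
cycles = 45
final_extension = 5 * 60
final_hold = 60 * 60              # 4 C for 1 h

amplification_seconds = initial_denaturation + cycles * cycle + final_extension
total_seconds = amplification_seconds + final_hold

amplification_minutes = amplification_seconds / 60   # 136.5
total_minutes = total_seconds / 60                   # 196.5
```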
4. The amplification product from step 3 was subjected to DNA methylation analysis by time-of-flight mass spectrometry, as follows:
(1) To 5 μl of the PCR product, 2 μl of shrimp alkaline phosphatase (SAP) solution (0.3 ml SAP [2.5 U] + 1.7 ml H2O) was added, followed by incubation in a PCR instrument according to the following program: 37 ℃, 20 min → 85 ℃, 5 min → 4 ℃, 5 min.
(2) An aliquot of the SAP-treated product from step (1) was taken, added to 5 μl of T-Cleavage reaction mix according to the instructions, and incubated at 37 ℃ for 3 h.
(3) To the product of step (2), 19 μl of deionized water was added, and the mixture was deionized with 6 μg of Resin on a rotary shaker for 1 h.
(4) The mixture was centrifuged at 2000 rpm for 5 min at room temperature, and a small volume of the supernatant was loaded onto a 384 SpectroCHIP by a Nanodispenser robot.
(5) Performing time-of-flight mass spectrometry; the data obtained were collected with the SpectroACQUIRE v3.3.1.3 software and visualized with the MassArray EpiTyper v1.2 software.
The reagents used for time-of-flight mass spectrometry detection are all from a kit (T-Cleavage MassCLEAVE Reagent Auto Kit, cat # 10129A); the detection instrument used for time-of-flight mass spectrometry detection is the (name given as image Figure BDA0003192406100000111 in the original) Analyzer Chip Prep Module 384, model 41243; the data analysis software is the software supplied with the instrument.
5. The data obtained in step 4 were analyzed.
Statistical analysis of the data was performed by SPSS Statistics 23.0.
The nonparametric test was used for comparative analysis between the two groups.
The ability of combinations of multiple CpG sites to discriminate between sample groups was evaluated by logistic regression and receiver operating characteristic (ROC) curves.
All statistical tests were two-sided, with p values <0.05 considered statistically significant.
Through mass spectrometry experiments, a total of 31 distinguishable peak patterns of methylated fragments were obtained. The methylation level of each sample at each CpG site can be automatically obtained by calculating the peak area according to the formula "methylation level = peak area of methylated fragment/(peak area of unmethylated fragment + peak area of methylated fragment)" using SpectroACQUIRE v3.3.1.3 software.
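The peak-area formula quoted above can be written as a small function. This is a sketch of the stated ratio only; the function name `methylation_level` and the example peak areas are hypothetical.

```python
def methylation_level(methylated_area, unmethylated_area):
    """methylation level = methylated area / (unmethylated area + methylated area)."""
    total = methylated_area + unmethylated_area
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return methylated_area / total


# Hypothetical peak areas in arbitrary units.
level = methylation_level(methylated_area=30.0, unmethylated_area=70.0)
```

The result is always between 0 (fully unmethylated) and 1 (fully methylated), which matches the range of the independent variables used in the models.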
3. Analysis of results
1. FOLR3 gene methylation levels in blood of non-cancer controls, benign nodules, and lung cancer
The methylation levels of all CpG sites in the FOLR3 gene were analyzed using blood from the 722 lung cancer patients, 152 lung benign nodule patients and 945 cancer-free controls as the study material (Table 5). The results showed that the median methylation level of all CpG sites in the FOLR3 gene was 0.36 in the cancer-free control group (IQR = 0.19-0.54), 0.29 in the lung benign nodule patients (IQR = 0.12-0.47), and 0.30 in the lung cancer patients (IQR = 0.12-0.48).
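Group summaries of the kind reported above (median and interquartile range) can be computed as sketched below with the Python standard library. The quartile convention used by the statistical software in the examples is not stated, so the "inclusive" method here is an assumption, and the methylation values are hypothetical.

```python
import statistics


def median_and_iqr(levels):
    """Return (median, (Q1, Q3)) using the inclusive quartile method."""
    q1, q2, q3 = statistics.quantiles(levels, n=4, method="inclusive")
    return q2, (q1, q3)


# Hypothetical methylation levels for one group, for illustration only.
levels = [0.1, 0.2, 0.3, 0.4, 0.5]
median, (q1, q3) = median_and_iqr(levels)
```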
2. The methylation level of FOLR3 gene in blood can distinguish non-cancer control patients from lung cancer patients
By comparing the FOLR3 gene methylation levels of the 722 lung cancer patients and 945 cancer-free controls, the methylation levels of all CpG sites in the FOLR3 gene of lung cancer patients were found to be significantly lower than those of the cancer-free controls (p < 0.05, Table 6). In addition, the methylation levels of all CpG sites of the FOLR3 gene in each lung cancer subtype (lung adenocarcinoma, lung squamous carcinoma and small cell lung cancer) differed significantly from those of the cancer-free controls (p < 0.05, Table 6), as did those in each stage of lung cancer (clinical stage I and stages II-III) (p < 0.05, Table 6). The methylation levels of lung cancer patients without lymph node infiltration and of those with lymph node infiltration also each differed significantly from those of the cancer-free controls (p < 0.05, Table 6). Therefore, the methylation level of the FOLR3 gene can be used for clinical diagnosis of lung cancer, and in particular for early diagnosis of lung cancer.
3. The methylation level of the FOLR3 gene in blood can distinguish benign nodules of lung from lung cancer patients
By comparing the FOLR3 gene methylation levels of the 722 lung cancer patients and 152 lung benign nodule patients, the methylation levels of all CpG sites of the FOLR3 gene in the benign nodule patients were found to be significantly lower than those of the lung cancer patients (p < 0.05, Table 7). In addition, the methylation levels of all CpG sites in the FOLR3 gene of lung cancer patients with different subtypes (lung adenocarcinoma, lung squamous carcinoma, small cell lung cancer), different clinical stages (stage I or stages II-III) and with or without lymph node infiltration each differed significantly from those of the benign nodule patients (p < 0.05, Table 7). Therefore, the methylation level of the FOLR3 gene can be applied to distinguishing lung cancer patients from benign nodule patients and is a highly promising marker.
4. Differentiation of different subtypes or stages of lung cancer by FOLR3 gene methylation level in blood
By comparing the FOLR3 gene methylation levels of lung cancer patients of different subtypes (lung adenocarcinoma, lung squamous carcinoma and small cell lung cancer) and different stages, the methylation levels of all CpG sites in the FOLR3 gene were found to differ significantly among lung cancer subtypes (lung adenocarcinoma, lung squamous carcinoma and small cell lung cancer), tumor sizes (T1, T2 and T3), clinical stages (stage I, stage II and stage III) and lymph node infiltration status (p < 0.05, Table 8). Thus, the methylation level of the FOLR3 gene can be used to differentiate between different subtypes or stages of lung cancer.
5. The FOLR3 methylation level in blood can be used for diagnosing breast cancer
The differences in the methylation level of CpG sites in the FOLR3 gene between breast cancer patients and cancer-free female controls were analyzed using blood from 227 breast cancer patients and 472 cancer-free female control samples as the study material (table 9). The results showed that the median methylation level of all CpG sites of interest in breast cancer patients was 0.33 (IQR = 0.18-0.50), the median methylation level of the no-cancer control group was 0.36 (IQR = 0.19-0.54), and the methylation level of all CpG sites in breast cancer patients was significantly lower than that of the no-cancer control (p < 0.05). In addition, methylation levels of all CpG sites in FOLR3 gene were significantly different in different stages of breast cancer (clinical stage I, stage II, stage III), presence or absence of lymph node infiltration, and different sizes of tumor (T1, T2, and T3), respectively (p <0.05, table 10). Therefore, the methylation level of the FOLR3 gene can be used for clinical diagnosis of breast cancer.
6. The FOLR3 methylation level in blood can distinguish breast cancer patients from lung cancer patients
The difference in methylation level in the FOLR3 gene in blood of breast cancer patients and lung cancer patients was analyzed using blood of 227 breast cancer patients and 722 lung cancer patients as a study material (table 11). The results indicate that the median methylation level of all CpG sites of interest in breast cancer patients is 0.33 (IQR = 0.18-0.50), the median methylation level of lung cancer patients is 0.30 (IQR = 0.12-0.48), and the methylation level of all CpG sites in breast cancer patients is significantly higher than that of lung cancer patients (p < 0.05). Thus, the methylation level of FOLR3 gene can be used to differentiate breast and lung cancer patients.
7. Establishment of mathematical model for assisting cancer diagnosis
The mathematical model established by the invention can be used for achieving the following purposes:
(1) Differentiating lung cancer patients from non-cancer controls;
(2) Distinguishing lung cancer patients from lung benign nodule patients;
(3) Differentiating breast cancer patients from non-cancerous female controls;
(4) Differentiating breast cancer patients from lung cancer patients;
(5) Distinguishing lung cancer subtypes;
(6) Differentiating the stage of lung cancer;
(7) Differentiating the stage of breast cancer.
The mathematical model is established as follows:
(A) Data source: the methylation levels of the target CpG sites (one or a combination of several of those in Tables 1-3) in ex vivo blood samples of the 722 lung cancer patients, 152 lung benign nodule patients, 227 breast cancer patients and 945 cancer-free controls listed in step one (detection method as in step two).
Known parameters such as age, sex and white blood cell count may be added to the data as needed to improve discrimination.
(B) Model building
As needed, data from any two different types of subjects are selected as the training set for establishing a model (for example, cancer-free controls and lung cancer patients, cancer-free female controls and breast cancer patients, lung benign nodule patients and lung cancer patients, lung cancer patients and breast cancer patients, lung adenocarcinoma and lung squamous carcinoma patients, lung adenocarcinoma and small cell lung cancer patients, lung squamous carcinoma and small cell lung cancer patients, stage I and stage II lung cancer patients, stage I and stage III lung cancer patients, stage II and stage III lung cancer patients, stage I and stage II breast cancer patients, stage I and stage III breast cancer patients, stage II and stage III breast cancer patients, T1 and T2 breast cancer patients, T1 and T3 breast cancer patients, T2 and T3 breast cancer patients, breast cancer patients without and with lymph node infiltration), and a mathematical model is established by binary logistic regression. The value corresponding to the maximum Youden index calculated from the mathematical model is taken as the threshold, or 0.5 is set directly as the threshold. A sample to be tested whose detection index, obtained by testing and substitution into the model, is greater than the threshold is assigned to one class (class B), one whose detection index is less than the threshold is assigned to the other class (class A), and a detection index equal to the threshold is treated as an indeterminate gray zone.
When a new sample to be detected is predicted to judge which type the sample belongs to, firstly, the methylation level of one or more CpG sites on the FOLR3 gene of the sample to be detected is detected by a DNA methylation determination method, then the data of the methylation level are substituted into the mathematical model (if known parameters such as age, sex, white cell count and the like are included in the model construction, the step simultaneously substitutes the specific numerical value of the corresponding parameter of the sample to be detected into the model formula), the detection index corresponding to the sample to be detected is obtained by calculation, then the detection index corresponding to the sample to be detected is compared with the threshold value, and the sample to be detected belongs to which type the sample to be detected is determined according to the comparison result.
Example: as shown in FIG. 1, the methylation level data of a single CpG site or a combination of multiple CpG sites of the FOLR3 gene in the training set are processed with statistical software such as SAS, R or SPSS, and a mathematical model for distinguishing class A from class B is established by binary logistic regression. The mathematical model here is a binary logistic regression model, specifically: log(y/(1-y)) = b0 + b1x1 + b2x2 + b3x3 + ... + bnxn, where y is the dependent variable, i.e., the detection index obtained after the methylation values of one or more methylation sites of the sample to be tested are substituted into the model; b0 is a constant; x1 to xn are the independent variables, i.e., the methylation values of one or more methylation sites of the sample to be tested (each a value between 0 and 1); and b1 to bn are the weights assigned by the model to the methylation value of each site. In a specific application, the mathematical model is established from the methylation levels (x1 to xn) of one or more DNA methylation sites measured in the training-set samples and their known classification (class A or class B, with y assigned 0 and 1, respectively), thereby determining the constant b0 and the weights b1 to bn of the methylation sites, and the detection index corresponding to the maximum Youden index calculated by the mathematical model (0.5 in this example) is used as the classification threshold. A sample to be tested whose detection index (y value), obtained by testing and substitution into the model, is greater than 0.5 is classified as class B, one whose detection index is less than 0.5 is classified as class A, and a detection index equal to 0.5 is treated as an indeterminate gray zone.
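The fitting step, determining b0 and b1 to bn from a training set with class A coded as 0 and class B coded as 1, can be sketched in plain Python as below. The examples use statistical software such as SAS, R or SPSS; the batch gradient ascent here is a swapped-in, simplified stand-in for those packages' fitting routines, and the function name `fit_logistic`, the learning rate, the number of epochs and the toy data are all assumptions.

```python
import math


def fit_logistic(rows, labels, learning_rate=0.5, epochs=5000):
    """Fit b0 and b1..bn by batch gradient ascent on the log-likelihood."""
    n_features = len(rows[0])
    intercept, weights = 0.0, [0.0] * n_features
    for _ in range(epochs):
        grad0, grad = 0.0, [0.0] * n_features
        for x, target in zip(rows, labels):
            linear = intercept + sum(w * v for w, v in zip(weights, x))
            error = target - 1.0 / (1.0 + math.exp(-linear))
            grad0 += error
            for i, v in enumerate(x):
                grad[i] += error * v
        intercept += learning_rate * grad0 / len(rows)
        weights = [w + learning_rate * g / len(rows) for w, g in zip(weights, grad)]
    return intercept, weights


def predict(intercept, weights, x):
    """Detection index y for one sample under the fitted model."""
    linear = intercept + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical single-site training set: class A = 0, class B = 1.
rows = [[0.20], [0.25], [0.30], [0.45], [0.35], [0.50], [0.55], [0.60]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
intercept, weights = fit_logistic(rows, labels)
```

With real data a dedicated statistics package would normally be preferred, since it also reports standard errors and convergence diagnostics.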
Here, class A and class B are the two classes of a binary classification (which group is class A and which is class B is determined by the specific mathematical model and is not fixed here), such as cancer-free controls and lung cancer patients, cancer-free female controls and breast cancer patients, lung benign nodule patients and lung cancer patients, lung cancer patients and breast cancer patients, lung adenocarcinoma and lung squamous carcinoma patients, lung adenocarcinoma and small cell lung cancer patients, lung squamous carcinoma and small cell lung cancer patients, stage I and stage II lung cancer patients, stage I and stage III lung cancer patients, stage II and stage III lung cancer patients, stage I and stage II breast cancer patients, stage I and stage III breast cancer patients, stage II and stage III breast cancer patients, T1 and T2 breast cancer patients, T1 and T3 breast cancer patients, T2 and T3 breast cancer patients, breast cancer patients without and with lymph node infiltration. When a sample from a subject is to be assigned to a class, blood of the subject is first collected and DNA is extracted from it. After the extracted DNA is bisulfite-converted, the methylation level of a single CpG site or of a combination of CpG sites of the FOLR3 gene of the subject is detected by a DNA methylation assay, and the methylation data obtained are then substituted into the above mathematical model.
If the detection index calculated by substituting the methylation level data of one or more CpG sites of the FOLR3 gene of the subject into the mathematical model is greater than the threshold, the subject is assigned to the class whose detection indices in the training set are greater than 0.5 (class B); if the detection index is less than the threshold, the subject is assigned to the class whose detection indices in the training set are less than 0.5 (class A); if the detection index is equal to the threshold, it cannot be determined whether the subject belongs to class A or class B.
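The three-way decision rule described above can be sketched as follows. The function name `classify` and the returned labels are illustrative; which real-world group corresponds to class A or class B depends on the specific model.

```python
def classify(detection_index, threshold=0.5):
    """Return 'B' above the threshold, 'A' below it, 'indeterminate' at it."""
    if detection_index > threshold:
        return "B"
    if detection_index < threshold:
        return "A"
    return "indeterminate"


# Hypothetical detection indices, for illustration only.
results = [classify(value) for value in (0.82, 0.31, 0.5)]
```

Because the gray zone is defined by exact equality, it will rarely be hit with continuous detection indices; a practical implementation might instead use a small band around the threshold, which would be a design choice beyond what the text specifies.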
Example: FIG. 2 illustrates the use of the methylation of the preferred CpG sites of FOLR3_A (FOLR3_A_1, FOLR3_A_2, FOLR3_A_3, FOLR3_A_4, FOLR3_A_5) and of mathematical modeling for discriminating benign from malignant lung nodules. The methylation levels of the combination of 5 distinguishable preferred CpG sites measured in a training set of lung cancer patients and lung benign nodule patients (here: 722 lung cancer patients and 152 lung benign nodule patients), together with the age, sex (male assigned 1, female assigned 0) and white blood cell count of the patients, were used in R software to establish, by binary logistic regression, a mathematical model for distinguishing lung cancer patients from lung benign nodule patients. The mathematical model here is a binary logistic regression model, from which the constant b0 and the weights b1 to bn are determined; in this example it is specifically: log(y/(1-y)) = -0.64 + 1.775 × FOLR3_A_1 - 1.504 × FOLR3_A_2 + 1.31 × FOLR3_A_3 + 0.31 × FOLR3_A_4 - 1.29 × FOLR3_A_5 + 0.027 × age + 0.734 × sex (male assigned 1, female assigned 0) + 0.013 × white blood cell count, where y is the dependent variable, i.e., the detection index obtained by substituting the methylation values of the 5 distinguishable methylation sites, the age, the sex and the white blood cell count of the sample to be tested into the model.
With 0.5 set as the threshold, the methylation levels of the 5 distinguishable CpG sites FOLR3_A_1, FOLR3_A_2, FOLR3_A_3, FOLR3_A_4 and FOLR3_A_5 of a sample to be tested are measured and substituted into the model together with the age, sex and white blood cell count of the subject. If the resulting detection index (y value) is greater than 0.5, the subject is classified as a lung cancer patient; if it is less than 0.5, the subject is classified as a lung benign nodule patient; if it is equal to 0.5, it cannot be determined whether the subject is a lung cancer patient or a lung benign nodule patient. The area under the curve (AUC) of this model was calculated to be 0.71 (Table 15). As a specific example of determining the class of a subject, as shown in FIG. 2, blood was collected from two subjects (A and B), DNA was extracted and bisulfite-converted, and the methylation levels of the 5 distinguishable CpG sites FOLR3_A_1, FOLR3_A_2, FOLR3_A_3, FOLR3_A_4 and FOLR3_A_5 of each subject were measured by a DNA methylation assay. The measured methylation level data were then substituted into the above mathematical model together with the age, sex and white blood cell count of the subject. The value calculated for subject A was 0.71, which is greater than 0.5, so subject A was judged to be a lung cancer patient (consistent with the clinical diagnosis); the value calculated for subject B was 0.19, which is less than 0.5, so subject B was judged to be a lung benign nodule patient (consistent with the clinical diagnosis).
(C) Evaluation of model Effect
According to the above method, mathematical models were established for distinguishing, respectively: cancer-free controls and lung cancer patients; cancer-free female controls and breast cancer patients; benign lung nodule patients and lung cancer patients; lung cancer patients and breast cancer patients; lung adenocarcinoma and lung squamous cell carcinoma patients; lung adenocarcinoma and small cell lung cancer patients; lung squamous cell carcinoma and small cell lung cancer patients; stage I and stage II lung cancer patients; stage I and stage III lung cancer patients; stage II and stage III lung cancer patients; stage I and stage II breast cancer patients; stage I and stage III breast cancer patients; stage II and stage III breast cancer patients; T1 and T2 breast cancer patients; T1 and T3 breast cancer patients; T2 and T3 breast cancer patients; and breast cancer patients without and with lymph node infiltration. The validity of each model was evaluated by its receiver operating characteristic (ROC) curve. The larger the area under the curve (AUC) obtained from the ROC curve, the better the discrimination of the model and the more effective the molecular marker. The results of the evaluation after constructing the mathematical models with different CpG sites are shown in Table 12, Table 13 and Table 14. In Tables 12, 13 and 14, "1 CpG site" represents any single CpG site in the FOLR3_A amplified fragment, "2 CpG sites" represents any combination of 2 CpG sites in FOLR3_A, "3 CpG sites" represents any combination of 3 CpG sites in FOLR3_A, and so on. The values in the tables are ranges over the results of the different site combinations (i.e., the result for any combination of CpG sites falls within the stated range).
The above results show that the ability of the FOLR3 gene to discriminate between the groups in each pair (cancer-free controls and lung cancer patients, cancer-free female controls and breast cancer patients, benign lung nodule patients and lung cancer patients, lung cancer patients and breast cancer patients, lung adenocarcinoma and lung squamous cell carcinoma patients, lung adenocarcinoma and small cell lung cancer patients, lung squamous cell carcinoma and small cell lung cancer patients, stage I and stage II lung cancer patients, stage I and stage III lung cancer patients, stage II and stage III lung cancer patients, stage I and stage II breast cancer patients, stage I and stage III breast cancer patients, stage II and stage III breast cancer patients, T1 and T2 breast cancer patients, T1 and T3 breast cancer patients, T2 and T3 breast cancer patients, and breast cancer patients without and with lymph node infiltration) increases as the number of sites increases.
In addition, among the CpG sites shown in Tables 1 to 3, a combination of a few well-performing sites can discriminate better than a combination of a larger number of poorly performing sites. For example, the combination of the 5 distinguishable preferred sites FOLR3_A_1, FOLR3_A_2, FOLR3_A_3, FOLR3_A_4 and FOLR3_A_5, shown in Table 15, Table 16 and Table 17, is the preferred combination among all combinations of 5 distinguishable sites in FOLR3_A.
In summary, the methylation levels of the CpG sites on the FOLR3 gene and their various combinations, the CpG sites on the FOLR3_A fragment and their various combinations, the CpG sites on the FOLR3_B fragment and their various combinations, the CpG sites on the FOLR3_C fragment and their various combinations, and the CpG sites on FOLR3_A, FOLR3_B and FOLR3_C and their various combinations all have the ability to discriminate between cancer-free controls and lung cancer patients, cancer-free female controls and breast cancer patients, benign lung nodule patients and lung cancer patients, lung cancer patients and breast cancer patients, lung adenocarcinoma and lung squamous cell carcinoma patients, lung adenocarcinoma and small cell lung cancer patients, lung squamous cell carcinoma and small cell lung cancer patients, stage I and stage II lung cancer patients, stage I and stage III lung cancer patients, stage II and stage III lung cancer patients, stage I and stage II breast cancer patients, stage I and stage III breast cancer patients, stage II and stage III breast cancer patients, T1 and T2 breast cancer patients, T1 and T3 breast cancer patients, T2 and T3 breast cancer patients, and breast cancer patients without and with lymph node infiltration.
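The ROC-based evaluation described in part (C) can be illustrated with a short Python sketch. It computes the AUC through its rank interpretation, namely the probability that a randomly chosen sample of the positive class receives a higher detection index than a randomly chosen sample of the negative class, with ties counted as one half. The function name and the example scores are hypothetical and are not data from the tables; the source does not state which software routine was used to compute its AUC values.

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    scores_pos: detection indices of samples from the positive class
                (for example, lung cancer patients).
    scores_neg: detection indices of samples from the negative class
                (for example, benign lung nodule patients).
    """
    if not scores_pos or not scores_neg:
        raise ValueError("both classes need at least one sample")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0  # positive sample ranked above negative sample
            elif p == n:
                wins += 0.5  # a tie counts as half a win
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical detection indices, for illustration only.
cancer_scores = [0.71, 0.40]
benign_scores = [0.19, 0.60]
print(roc_auc(cancer_scores, benign_scores))  # 3 of 4 pairs ranked correctly -> 0.75
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation, which matches the statement that a larger AUC indicates a more effective marker.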
TABLE 5 comparison of methylation levels of cancer-free controls, benign nodules, and lung cancer
Figure BDA0003192406100000151
Figure BDA0003192406100000161
TABLE 6 comparison of methylation level differences between cancer-free controls and Lung cancer
Figure BDA0003192406100000162
Figure BDA0003192406100000171
TABLE 7 comparison of methylation level differences between benign nodules and Lung cancer
Figure BDA0003192406100000172
Figure BDA0003192406100000181
TABLE 8 comparison of methylation level differences for different subtypes or stages of lung cancer
Figure BDA0003192406100000182
TABLE 9 comparison of methylation level differences between cancer-free controls and breast cancer
Figure BDA0003192406100000191
TABLE 10 comparison of methylation level differences between stages of breast cancer
Figure BDA0003192406100000192
Figure BDA0003192406100000201
TABLE 11 comparison of methylation level differences between Lung and Breast cancers
Figure BDA0003192406100000202
Figure BDA0003192406100000211
TABLE 12 CpG sites of FOLR3_A and combinations thereof for differentiating lung cancer and cancer-free controls, lung cancer and benign nodules, breast cancer and cancer-free controls, and lung cancer and breast cancer
Figure BDA0003192406100000212
TABLE 13 CpG sites of FOLR3_A and combinations thereof for differentiating different subtypes and stages of lung cancer
Figure BDA0003192406100000221
TABLE 14 CpG sites of FOLR3_A and combinations thereof for differentiating different stages of breast cancer
Figure BDA0003192406100000222
Figure BDA0003192406100000231
TABLE 15 Optimal CpG sites of FOLR3_A and combinations thereof for differentiating lung cancer and cancer-free controls, lung cancer and benign nodules, breast cancer and cancer-free controls, and lung cancer and breast cancer
Figure BDA0003192406100000232
TABLE 16 Optimal CpG sites of FOLR3_A and combinations thereof for differentiating different subtypes and stages of lung cancer
Figure BDA0003192406100000233
Figure BDA0003192406100000241
TABLE 17 Optimal CpG sites of FOLR3_A and combinations thereof for differentiating different stages of breast cancer
Figure BDA0003192406100000242
The present invention has been described in detail above. It will be apparent to those skilled in the art that the invention can be practiced in a wide range of equivalent parameters, concentrations, and conditions without departing from the spirit and scope of the invention and without undue experimentation. While the invention has been described with reference to specific embodiments, it will be appreciated that the invention can be further modified. In general, this application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. The use of some of the essential features is possible within the scope of the claims attached below.
Sequence listing
<110> Nanjing Tengchen Biological Technology Co., Ltd.
<120> molecular marker and kit for auxiliary diagnosis of cancer
<130> GNCLN211875
<160> 9
<170> SIPOSequenceListing 1.0
<210> 1
<211> 627
<212> DNA
<213> Artificial sequence
<400> 1
aaaacactgg agaaatccaa gaggggaagt ccacaagggc ggtggctccc tacaaggtca 60
cagagcaagc tggtgtcaga gcctggacct acagcgctgt tggtggaggt cctgcctcca 120
ggtaggggaa gggctccctc tcacctctac acgcagcgca tttcttggct cagctgccct 180
gtaggggatg cagggtgggg acagcagaga tctgggcctg ggagggagag agtacacaat 240
cacatggctg ttgcccctgt ctcaggcctt gtctacctct gactgtggct ctctggcagg 300
aatagatgga catggcctgg cagatgatgc agctgctgct tctggctttg gtgactgctg 360
cggggagtgc ccagcccagg agtgcgcggg ccaggacgga cctgctcaat gtctgcatga 420
acgccaagca ccacaagaca cagcccagcc ccgaggacga gctgtatggc caggtgaggg 480
cagcctggtg taggacagca tgcacacagg tcagagggtg atggcacgag caatggcagg 540
tccagtgtgg tcagaaccaa gggtgccgct gctgacaagg aaggggaggg gcggccaggg 600
ccaccatgcc acaggtaagg ccactga 627
<210> 2
<211> 607
<212> DNA
<213> Artificial sequence
<400> 2
tcttgcagga agaccaagag ttggaacatc caaggaaaag caagtgtgaa gtcgggctgg 60
cagggaagca tgttctgtgt cagccggcac tgggcgtggg ccagggtgtg ggaggtgggt 120
aggtctggct cccctcccat ggatttccct attgtttctc ctgggtgctc aggcctgtca 180
cgcctctgcc atcacttgac cctaggtgca agggttcagc ccagaaattt tatgcaattg 240
attcatgatt tctcaggttt tctgagtcct ggcctagagt gacttcccaa gaaaaaactc 300
caccatttct gcttgtctta cctgccttgt atttaccttt ctaggattgc cttttccaca 360
tttagtcaag tctaggttca gacccacgtg caggctatag ctccttcgtt ctccaccact 420
ctcaggatct atctagagtc tccccacctg gacctccaga ccctgggaga gccagaccag 480
ccccttgacc tccaccctcc cccacaacct gggccaggtt cctctcctcc ctgtcctcag 540
ttataatttt tttttttttt aatttgaggc agagtttcgc tcttgttgcc caggctggaa 600
tgcaatg 607
<210> 3
<211> 647
<212> DNA
<213> Artificial sequence
<400> 3
tggcatgtgc ctgaattcca gctcctcggg aggctgaggt gggaggatta tttgagccca 60
ggatgttgag gctgcagtga gctatgatca caccactgcg ctccagcctg gtcaacagag 120
caagaccctg tctcaaataa ataaatgaat aaatagggcg gaaagcacca atattgtaat 180
tgcctccgtc cccaggtggg agctcctcaa gggccctccc caggaagtgt tcctctggat 240
gacctacctg gggcagagga gccagaatat ggaggagatg gctgtggtgg ggagagactt 300
agtcctgtgt cttccccacc cagtgcagtc cctggaagaa gaatgcctgc tgcacggcca 360
gcaccagcca ggagctgcac aaggacacct cccgcctgta caactttaac tgggatcact 420
gtggtaagat ggaacccacc tgcaagcgcc actttatcca ggacagctgt ctctgagtgc 480
tcacccaacc tggggccctg gatccggcag gtatgagtgc tgttcccaca aacattaacc 540
tcagcagagg gcggagcctg ccagttgctg gcagggaggg cttggtccag gaattcgggt 600
ctgagggtgg tggacgccct gccccctccc acagctctgg tcccctt 647
<210> 4
<211> 35
<212> DNA
<213> Artificial sequence
<400> 4
aggaagagag aaaatattgg agaaatttaa gaggg 35
<210> 5
<211> 56
<212> DNA
<213> Artificial sequence
<400> 5
cagtaatacg actcactata gggagaaggc ttcaataacc ttacctataa cataat 56
<210> 6
<211> 35
<212> DNA
<213> Artificial sequence
<400> 6
aggaagagag ttttgtagga agattaagag ttgga 35
<210> 7
<211> 56
<212> DNA
<213> Artificial sequence
<400> 7
cagtaatacg actcactata gggagaaggc tcattacatt ccaacctaaa caacaa 56
<210> 8
<211> 35
<212> DNA
<213> Artificial sequence
<400> 8
aggaagagag tggtatgtgt ttgaatttta gtttt 35
<210> 9
<211> 56
<212> DNA
<213> Artificial sequence
<400> 9
cagtaatacg actcactata gggagaaggc taaaaaaacc aaaactataa aaaaaa 56

Claims (10)

1. Use of a methylated FOLR3 gene as a marker in the preparation of a product; the use of the product is at least one of the following:
(1) Assisting in diagnosing cancer or predicting cancer risk;
(2) Assisting in distinguishing benign nodules from cancer;
(3) Assisting in distinguishing different subtypes of cancer;
(4) Assisting in distinguishing different stages of cancer;
(5) Assisting in diagnosing lung cancer or predicting lung cancer risk;
(6) Assisting in distinguishing benign lung nodules from lung cancer;
(7) Assisting in distinguishing different subtypes of lung cancer;
(8) Assisting in distinguishing different stages of lung cancer;
(9) Assisting in diagnosing breast cancer or predicting breast cancer risk;
(10) Assisting in distinguishing different stages of breast cancer;
(11) Assisting in distinguishing lung cancer from breast cancer;
(12) Determining whether a test substance hinders or promotes the occurrence of the cancer.
2. Use of a substance for detecting the methylation level of the FOLR3 gene in the preparation of a product; the use of the product is at least one of the following:
(1) Assisting in diagnosing cancer or predicting cancer risk;
(2) Assisting in distinguishing benign nodules from cancer;
(3) Assisting in distinguishing different subtypes of cancer;
(4) Assisting in distinguishing different stages of cancer;
(5) Assisting in diagnosing lung cancer or predicting lung cancer risk;
(6) Assisting in distinguishing benign lung nodules from lung cancer;
(7) Assisting in distinguishing different subtypes of lung cancer;
(8) Assisting in distinguishing different stages of lung cancer;
(9) Assisting in diagnosing breast cancer or predicting breast cancer risk;
(10) Assisting in distinguishing different stages of breast cancer;
(11) Assisting in distinguishing lung cancer from breast cancer;
(12) Determining whether a test substance hinders or promotes the occurrence of the cancer.
3. Use of a substance for detecting the methylation level of the FOLR3 gene and of a medium recording a method for establishing a mathematical model and/or a method for using the mathematical model, in the preparation of a product; the use of the product is at least one of the following:
(1) Assisting in diagnosing cancer or predicting cancer risk;
(2) Assisting in distinguishing benign nodules from cancer;
(3) Assisting in distinguishing different subtypes of cancer;
(4) Assisting in distinguishing different stages of cancer;
(5) Assisting in diagnosing lung cancer or predicting lung cancer risk;
(6) Assisting in distinguishing benign lung nodules from lung cancer;
(7) Assisting in distinguishing different subtypes of lung cancer;
(8) Assisting in distinguishing different stages of lung cancer;
(9) Assisting in diagnosing breast cancer or predicting breast cancer risk;
(10) Assisting in distinguishing different stages of breast cancer;
(11) Assisting in distinguishing lung cancer from breast cancer;
(12) Determining whether a test substance hinders or promotes the occurrence of the cancer;
the mathematical model is obtained according to a method comprising the following steps:
(A1) Detecting the FOLR3 gene methylation levels of n1 type A samples and n2 type B samples, respectively;
(A2) Using the FOLR3 gene methylation level data of all samples obtained in step (A1), establishing a mathematical model by binary logistic regression according to the classification into type A and type B, and determining a threshold for classification;
the method for using the mathematical model comprises the following steps:
(B1) Detecting the FOLR3 gene methylation level of a sample to be tested;
(B2) Substituting the FOLR3 gene methylation level data of the sample to be tested obtained in step (B1) into the mathematical model to obtain a detection index; then comparing the detection index with the threshold, and determining from the comparison result whether the sample to be tested is of type A or type B;
the type A samples and the type B samples are any one of the following:
(C1) Lung cancer samples and cancer-free controls;
(C2) Lung cancer samples and benign lung nodule samples;
(C3) Samples of different subtypes of lung cancer;
(C4) Samples of different stages of lung cancer;
(C5) Lung cancer samples and breast cancer samples;
(C6) Breast cancer samples and cancer-free female controls;
(C7) Samples of different stages of breast cancer.
4. Use of a medium recording a method for establishing a mathematical model and/or a method for using the mathematical model, in the preparation of a product; the use of the product is at least one of the following:
(1) Assisting in diagnosing cancer or predicting cancer risk;
(2) Assisting in distinguishing benign nodules from cancer;
(3) Assisting in distinguishing different subtypes of cancer;
(4) Assisting in distinguishing different stages of cancer;
(5) Assisting in diagnosing lung cancer or predicting lung cancer risk;
(6) Assisting in distinguishing benign lung nodules from lung cancer;
(7) Assisting in distinguishing different subtypes of lung cancer;
(8) Assisting in distinguishing different stages of lung cancer;
(9) Assisting in diagnosing breast cancer or predicting breast cancer risk;
(10) Assisting in distinguishing different stages of breast cancer;
(11) Assisting in distinguishing lung cancer from breast cancer;
(12) Determining whether a test substance hinders or promotes the occurrence of the cancer;
the mathematical model is obtained according to a method comprising the following steps:
(A1) Detecting the FOLR3 gene methylation levels of n1 type A samples and n2 type B samples, respectively;
(A2) Using the FOLR3 gene methylation level data of all samples obtained in step (A1), establishing a mathematical model by binary logistic regression according to the classification into type A and type B, and determining a threshold for classification;
the method for using the mathematical model comprises the following steps:
(B1) Detecting the FOLR3 gene methylation level of a sample to be tested;
(B2) Substituting the FOLR3 gene methylation level data of the sample to be tested obtained in step (B1) into the mathematical model to obtain a detection index; then comparing the detection index with the threshold, and determining from the comparison result whether the sample to be tested is of type A or type B;
the type A samples and the type B samples are any one of the following:
(C1) Lung cancer samples and cancer-free controls;
(C2) Lung cancer samples and benign lung nodule samples;
(C3) Samples of different subtypes of lung cancer;
(C4) Samples of different stages of lung cancer;
(C5) Lung cancer samples and breast cancer samples;
(C6) Breast cancer samples and cancer-free female controls;
(C7) Samples of different stages of breast cancer.
5. A kit comprising a substance for detecting the methylation level of the FOLR3 gene; the kit is used for at least one of the following purposes:
(1) Assisting in diagnosing cancer or predicting cancer risk;
(2) Assisting in distinguishing benign nodules from cancer;
(3) Assisting in distinguishing different subtypes of cancer;
(4) Assisting in distinguishing different stages of cancer;
(5) Assisting in diagnosing lung cancer or predicting lung cancer risk;
(6) Assisting in distinguishing benign lung nodules from lung cancer;
(7) Assisting in distinguishing different subtypes of lung cancer;
(8) Assisting in distinguishing different stages of lung cancer;
(9) Assisting in diagnosing breast cancer or predicting breast cancer risk;
(10) Assisting in distinguishing different stages of breast cancer;
(11) Assisting in distinguishing lung cancer from breast cancer;
(12) Determining whether a test substance hinders or promotes the occurrence of the cancer.
6. The kit of claim 5, wherein: the kit further comprises a medium as described in claim 3 or 4 in which a mathematical model building method and/or a method of use is described.
7. A system, comprising:
(D1) Reagents and/or instruments for detecting the methylation level of the FOLR3 gene;
(D2) A device comprising a unit X and a unit Y;
the unit X is used for establishing a mathematical model and comprises a data acquisition module, a data analysis processing module and a model output module;
the data acquisition module is configured to acquire the FOLR3 gene methylation level data of n1 type A samples and n2 type B samples detected by (D1);
the data analysis processing module is configured to receive the FOLR3 gene methylation level data of the n1 type A samples and n2 type B samples sent by the data acquisition module, establish a mathematical model by binary logistic regression according to the classification into type A and type B, and determine a threshold for classification;
the model output module is configured to receive and output the mathematical model sent by the data analysis processing module;
the unit Y is used for determining the type of a sample to be detected and comprises a data input module, a data operation module, a data comparison module and a conclusion output module;
the data input module is configured to input (D1) detected FOLR3 gene methylation level data of the subject;
the data operation module is configured to receive the FOLR3 gene methylation level data of the person to be tested sent by the data input module, substitute the FOLR3 gene methylation level data of the person to be tested into the mathematical model established by the data analysis processing module in the unit X, and calculate to obtain a detection index;
the data comparison module is configured to receive the detection index sent by the data operation module and compare the detection index with the threshold determined by the data analysis processing module in the unit X;
the conclusion output module is configured to receive the comparison result sent by the data comparison module and output the conclusion that the type of the sample to be tested is the type A or the type B according to the comparison result;
the type A sample and the type B sample are any one of the following:
(C1) Lung cancer samples and cancer-free controls;
(C2) Lung cancer samples and benign nodule samples of lung;
(C3) Samples of different subtypes of lung cancer;
(C4) Samples of different stages of lung cancer;
(C5) Lung cancer samples and breast cancer samples;
(C6) Breast cancer samples and cancer-free female controls;
(C7) Breast cancer samples at different stages.
8. The use or kit or system according to any one of claims 1 to 7, wherein: the methylation level of the FOLR3 gene is the methylation level of all or part of CpG sites in the following fragments (e 1) to (e 3) in the FOLR3 gene;
the methylated FOLR3 gene is methylated at all or part of CpG sites in the following fragments (e 1) to (e 3) in the FOLR3 gene;
(e1) A DNA fragment shown in SEQ ID No.1 or a DNA fragment with more than 80% of identity with the DNA fragment;
(e2) A DNA fragment shown in SEQ ID No.2 or a DNA fragment with more than 80% of identity with the DNA fragment;
(e3) A DNA fragment shown in SEQ ID No.3 or a DNA fragment with more than 80% of identity with the DNA fragment.
9. The use or kit or system of claim 8, wherein: the 'all or part of CpG sites' is any one or more CpG sites in 3 DNA fragments shown in SEQ ID No.1 to SEQ ID No.3 in the FOLR3 gene;
or
The 'all or part of CpG sites' are all CpG sites in the DNA segment shown in SEQ ID No.1 and all CpG sites in the DNA segment shown in SEQ ID No. 2;
or
The 'all or part of CpG sites' are all CpG sites in the DNA segment shown in SEQ ID No.1 and all CpG sites in the DNA segment shown in SEQ ID No. 3;
or
The 'all or part of CpG sites' are all CpG sites in the DNA segment shown in SEQ ID No.2 and all CpG sites in the DNA segment shown in SEQ ID No. 3;
or
The 'all or part of CpG sites' are all CpG sites in the DNA fragment shown in SEQ ID No.1, all CpG sites in the DNA fragment shown in SEQ ID No.2 and all CpG sites in the DNA fragment shown in SEQ ID No. 3;
or
The 'all or part of CpG sites' are all, or any 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1, of the CpG sites in the DNA fragment shown in SEQ ID No.1;
or
The 'all or part of CpG sites' are all, or any 4, 3, 2 or 1, of the following 5 CpG sites in the DNA fragment shown in SEQ ID No.1:
(f1) The CpG site at positions 40-41 from the 5' end of the DNA fragment shown in SEQ ID No.1;
(f2) The CpG site at positions 95-96 from the 5' end of the DNA fragment shown in SEQ ID No.1;
(f3) The CpG site at positions 152-153 from the 5' end of the DNA fragment shown in SEQ ID No.1;
(f4) The CpG site at positions 157-158 from the 5' end of the DNA fragment shown in SEQ ID No.1;
(f5) The CpG site at positions 361-362 from the 5' end of the DNA fragment shown in SEQ ID No.1.
10. The use or kit or system according to any one of claims 1 to 9, wherein: the substance for detecting the methylation level of the FOLR3 gene comprises a primer combination for amplifying the full-length or partial fragment of the FOLR3 gene;
the reagent for detecting the methylation level of the FOLR3 gene comprises a primer combination for amplifying the full-length or partial fragment of the FOLR3 gene;
further, the partial fragment is at least one fragment selected from the following fragments:
(g1) The DNA fragment shown in SEQ ID No.1 or the DNA fragment contained in the DNA fragment;
(g2) A DNA fragment shown in SEQ ID No.2 or a DNA fragment contained in the DNA fragment;
(g3) A DNA fragment shown in SEQ ID No.3 or a DNA fragment contained in the DNA fragment;
(g4) A DNA fragment having an identity of 80% or more to the DNA fragment represented by SEQ ID No.1 or a DNA fragment contained therein;
(g5) A DNA fragment having an identity of 80% or more to the DNA fragment represented by SEQ ID No.2 or a DNA fragment contained therein;
(g6) A DNA fragment having an identity of 80% or more to the DNA fragment represented by SEQ ID No.3 or a DNA fragment contained therein;
further, the primer combination is a primer pair A and/or a primer pair B and/or a primer pair C;
the primer pair A is a primer pair consisting of a primer A1 and a primer A2; the primer A1 is the single-stranded DNA shown in SEQ ID No.4 or in nucleotides 11-35 of SEQ ID No.4; the primer A2 is the single-stranded DNA shown in SEQ ID No.5 or in nucleotides 32-56 of SEQ ID No.5;
the primer pair B is a primer pair consisting of a primer B1 and a primer B2; the primer B1 is the single-stranded DNA shown in SEQ ID No.6 or in nucleotides 11-35 of SEQ ID No.6; the primer B2 is the single-stranded DNA shown in SEQ ID No.7 or in nucleotides 32-56 of SEQ ID No.7;
the primer pair C is a primer pair consisting of a primer C1 and a primer C2; the primer C1 is the single-stranded DNA shown in SEQ ID No.8 or in nucleotides 11-35 of SEQ ID No.8; the primer C2 is the single-stranded DNA shown in SEQ ID No.9 or in nucleotides 32-56 of SEQ ID No.9.
CN202110882053.0A 2021-08-02 2021-08-02 Molecular marker and kit for auxiliary diagnosis of cancer Pending CN115701454A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110882053.0A CN115701454A (en) 2021-08-02 2021-08-02 Molecular marker and kit for auxiliary diagnosis of cancer

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110882053.0A CN115701454A (en) 2021-08-02 2021-08-02 Molecular marker and kit for auxiliary diagnosis of cancer

Publications (1)

Publication Number Publication Date
CN115701454A true CN115701454A (en) 2023-02-10

Family

ID=85142502

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110882053.0A Pending CN115701454A (en) 2021-08-02 2021-08-02 Molecular marker and kit for auxiliary diagnosis of cancer

Country Status (1)

Country Link
CN (1) CN115701454A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Country or region after: China

Address after: 200072, 3rd to 4th floors, Building 10, No. 351 Yuexiu Road, Hongkou District, Shanghai

Applicant after: Tengchen Biotechnology (Shanghai) Co.,Ltd.

Address before: 210032 building 02, life science and technology Island, No. 11, Yaogu Avenue, Jiangbei new area, Nanjing, Jiangsu Province

Applicant before: Nanjing Tengchen Biological Technology Co.,Ltd.

Country or region before: China
