CN117690493A - Data processing device for assisting in distinguishing benign and malignant thyroid tumors - Google Patents

Data processing device for assisting in distinguishing benign and malignant thyroid tumors

Info

Publication number
CN117690493A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311679586.4A
Other languages
Chinese (zh)
Inventor
王俊
狄飞飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tengchen Biotechnology Shanghai Co ltd
Original Assignee
Tengchen Biotechnology Shanghai Co ltd

Classifications

    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • G06F18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • G06F18/27 Regression, e.g. linear or logistic regression
    • G16B20/30 Detection of binding sites or motifs
    • G16B25/20 Polymerase chain reaction [PCR]; Primer or probe design; Probe optimisation
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B40/10 Signal processing, e.g. from mass spectrometry [MS] or from PCR
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • C12Q2600/112 Disease subtyping, staging or classification
    • C12Q2600/154 Methylation markers

Abstract

The invention discloses a data processing device for assisting in distinguishing benign from malignant thyroid tumors. The claimed device assists in distinguishing type A samples from type B samples and comprises a unit X for establishing a mathematical model (provided with a data acquisition module, a data analysis processing module and a model output module) and a unit Y for determining the type of a sample to be tested (provided with a data input module, a data operation module, a data comparison module and a conclusion output module); the type A and type B samples are benign thyroid tumors and malignant thyroid tumors. The invention discloses for the first time that the KATNAL2 gene is hypomethylated in tissue from patients with malignant thyroid tumors compared with benign thyroid tumors. This finding has scientific significance and clinical value for improving the early diagnosis and treatment of thyroid cancer, reducing thyroid cancer mortality and guiding the choice of a reasonable clinical treatment plan.

Description

Data processing device for assisting in distinguishing benign and malignant thyroid tumors
Technical Field
The invention relates to the field of medical informatics, in particular to a data processing device for assisting in distinguishing benign and malignant thyroid tumors.
Background
Thyroid cancer is the most common malignancy of the endocrine system and includes papillary thyroid cancer, follicular thyroid cancer, undifferentiated thyroid cancer and medullary carcinoma. Papillary thyroid cancer (PTC) is the most common type, accounting for more than 90% of all thyroid malignancies [Xing, Mingzhao; Haugen, Bryan R; Schlumberger, Martin (2013). Progress in molecular-based management of differentiated thyroid cancer. The Lancet, 381(9871), 1058-1069]. Statistically, the prevalence of thyroid nodules in adults is about 5-10%; it is highest in people over 60, where it reaches 50-70% [Guth S, Theune U, Aberle J, et al. Very high prevalence of thyroid nodules detected by high frequency (13 MHz) ultrasound examination. Eur J Clin Invest 2009;39:699-706]. Imaging is a common method of thyroid diagnosis, but most imaging methods depend on the physician's experience and are therefore subject to error, and imaging causes some radiation damage to the human body. Fine needle aspiration biopsy is also a common clinical technique for diagnosing thyroid cancer; it evaluates whether a nodule is benign or malignant from the cytological morphology of the aspirate. Because the cytological features of benign and malignant thyroid tumors often overlap, about 10-30% of fine needle aspirations yield indeterminate cytology results [Cibas ES, Ali SZ. The 2017 Bethesda System for Reporting Thyroid Cytopathology. Thyroid. 2017;27(11):1341-6]. Indeterminate aspiration results lead to overtreatment or missed diagnosis in about 60% of patients [Stewart R, Leang YJ, Bhatt CR, Grodski S, Serpell J, Lee JC. Quantifying the differences in surgical management of patients with definitive and indeterminate thyroid nodule cytology. Eur J Surg Oncol. 2020;46(2):252-7].
This not only increases the economic and physical burden on patients but also consumes substantial public health resources, imposing a significant financial cost on the healthcare system. Reliable identification of benign and malignant thyroid tumors therefore helps clinicians adopt more accurate treatment plans and has important clinical and public health significance.
Epigenetics is a mode of gene expression regulation that does not involve changes in the DNA sequence but is heritable and can be passed to the next generation [Nicoglou A, Merlin F. Epigenetics: A way to bridge the gap between biological fields. Stud Hist Philos Biol Biomed Sci. 2017;66:73-82]. DNA methylation is one of the important modes of epigenetic regulation: under the action of DNA methyltransferases, a methyl group is covalently bonded to the carbon-5 position of cytosine in a genomic CpG dinucleotide [Bird A. Perceptions of epigenetics. Nature 2007;447:396-398]. Numerous studies have shown that DNA methylation can change chromatin structure, DNA conformation, DNA stability and the way DNA interacts with proteins, thereby controlling gene expression [Moore LD, Le T, Fan G. DNA methylation and its basic function. Neuropsychopharmacology. 2013;38:23-38].
DNA methylation markers are currently the best molecular markers for early in vitro diagnosis of tumors. The thyroid cancer diagnostic markers now in clinical use have limited sensitivity and specificity, and markers for early diagnosis in particular are lacking, so more sensitive and specific early molecular markers urgently need to be discovered.
Disclosure of Invention
The invention provides a data processing device for assisting in distinguishing benign and malignant thyroid tumors.
In a first aspect, the invention claims a data processing device for assisting in distinguishing type A samples from type B samples.
The type A samples and the type B samples are any one of the following:
(C1) Thyroid benign tumor and thyroid malignant tumor;
(C2) Thyroid benign tumors and thyroid malignant tumors of different subtypes;
(C3) Thyroid benign tumors and thyroid malignant tumors of different stages;
(C4) Thyroid malignancy of different subtypes;
(C5) Thyroid malignancy in different stages;
the data processing device claimed in the present invention comprises a unit X and a unit Y;
the unit X is used for establishing a mathematical model and comprises a data acquisition module, a data analysis processing module and a model output module;
the data acquisition module is configured to acquire KATNAL2 gene methylation level data of n1 type A samples and n2 type B samples;
the data analysis processing module is configured to receive the KATNAL2 gene methylation level data of the n1 type A samples and the n2 type B samples from the data acquisition module, establish a mathematical model by binary logistic regression according to the type A / type B classification, and determine a threshold for the classification decision;
wherein n1 and n2 may be positive integers above 50, for example above 100;
the model output module is configured to receive the mathematical model established by the data analysis processing module and output the mathematical model;
the unit Y is used for determining the type of the sample to be tested and comprises a data input module, a data operation module, a data comparison module and a conclusion output module;
the data input module is configured to input KATNAL2 gene methylation level data of a subject;
the data operation module is configured to receive the subject's KATNAL2 gene methylation level data from the data input module and substitute it into the mathematical model established by the data analysis processing module in the unit X, thereby calculating a detection index;
the data comparison module is configured to receive the detection index calculated by the data operation module and compare it with the threshold determined by the data analysis processing module in the unit X;
the conclusion output module is configured to receive the comparison result from the data comparison module and, according to that result, output a conclusion as to whether the sample to be tested is type A or type B.
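The division of labour between unit X and unit Y can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the methylation levels are made-up values on a 0-1 scale, a single KATNAL2 methylation level per sample is the only predictor, the toy groups are far smaller than the n1 and n2 the text calls for, and the names `fit_logistic` and `detection_index` as well as the gradient-descent settings are hypothetical. Type B (label 1) is taken here to be the malignant group, which the abstract describes as hypomethylated.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(levels, labels, lr=0.5, epochs=5000):
    """Unit X (sketch): fit P(type B | x) = sigmoid(b0 + b1 * x) by gradient descent."""
    b0 = b1 = 0.0
    n = len(levels)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(levels, labels):
            err = sigmoid(b0 + b1 * x) - y  # derivative of the log-loss w.r.t. the linear term
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1

def detection_index(model, level):
    """Unit Y (sketch): substitute a subject's methylation level into the fitted model."""
    b0, b1 = model
    return sigmoid(b0 + b1 * level)

# Hypothetical training data: label 0 = type A (benign), label 1 = type B (malignant).
type_a_levels = [0.80, 0.75, 0.70, 0.85, 0.65]
type_b_levels = [0.30, 0.35, 0.40, 0.25, 0.45]
levels = type_a_levels + type_b_levels
labels = [0] * len(type_a_levels) + [1] * len(type_b_levels)

model = fit_logistic(levels, labels)
print(detection_index(model, 0.20))  # low methylation: index above 0.5
print(detection_index(model, 0.80))  # high methylation: index below 0.5
```

A production model would normally be fitted with an established statistics package and validated on held-out samples; the hand-written optimiser is used here only to keep the sketch self-contained.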
In a specific embodiment of the present invention, the threshold is set to 0.5. A detection index greater than 0.5 is classified as one type, a detection index less than 0.5 is classified as the other type, and a detection index equal to 0.5 falls in an indeterminate gray zone. Type A and type B are the two corresponding classes; which group is designated type A and which type B is not fixed by convention but is determined by the specific mathematical model.
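The comparison and conclusion steps described above amount to a three-way decision, sketched below. The function name `classify`, the label strings and the `tol` parameter are hypothetical; which type lies above the threshold is passed in as an argument because the text leaves that to the specific model.

```python
def classify(index, threshold=0.5, above="type B", below="type A", tol=0.0):
    """Compare a detection index with the threshold and return a conclusion.

    An index equal to the threshold (within tol, an assumed tolerance that
    defaults to exact equality) is reported as indeterminate.
    """
    if abs(index - threshold) <= tol:
        return "indeterminate"
    return above if index > threshold else below

print(classify(0.73))  # type B
print(classify(0.50))  # indeterminate
print(classify(0.12))  # type A
```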
In practical applications, the threshold may also be determined from the maximum Youden index (specifically, it may be the value corresponding to the maximum Youden index). A detection index greater than the threshold is classified as one type, a detection index less than the threshold is classified as the other type, and a detection index equal to the threshold falls in an indeterminate gray zone. Type A and type B are the two corresponding classes; which group is designated type A and which type B is not fixed by convention but is determined by the specific mathematical model.
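The Youden index is J = sensitivity + specificity - 1, and the threshold is the cut-off at which J is largest. The sketch below scans every observed detection index as a candidate cut-off; the scores and labels are made-up illustration values, the function name `youden_threshold` is hypothetical, and ties are resolved in favour of the lowest cut-off, which is an arbitrary choice of this sketch.

```python
def youden_threshold(scores, labels):
    """Return (threshold, J) maximising J = sensitivity + specificity - 1.

    labels: 1 = positive class, 0 = negative class.
    A score strictly greater than the threshold counts as a positive call.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s > t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s <= t and y == 0)
        j = tp / pos + tn / neg - 1.0
        if j > best_j:  # strict comparison keeps the lowest cut-off on ties
            best_t, best_j = t, j
    return best_t, best_j

# Hypothetical detection indices with their known classes.
scores = [0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.80, 0.90]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
threshold, j = youden_threshold(scores, labels)
print(threshold, j)  # 0.35 0.75
```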
Further, the different subtypes in (C2) and (C4) may be pathological types, such as histological types.
Further, the different stages in (C3) and (C5) may be clinical stages.
In a specific embodiment of the present invention, the benign thyroid tumors and malignant thyroid tumors of different subtypes in (C2) may specifically be any of the following: benign thyroid tumor and papillary thyroid carcinoma, benign thyroid tumor and follicular thyroid carcinoma, benign thyroid tumor and medullary thyroid carcinoma, benign thyroid tumor and undifferentiated thyroid carcinoma.
In a specific embodiment of the present invention, the benign thyroid tumors and malignant thyroid tumors of different stages in (C3) may specifically be any of the following: benign thyroid tumor and stage I malignant thyroid tumor, benign thyroid tumor and stage II malignant thyroid tumor, benign thyroid tumor and stage III malignant thyroid tumor, benign thyroid tumor and stage IV malignant thyroid tumor.
In a specific embodiment of the present invention, the malignant thyroid tumors of different subtypes in (C4) may specifically be any of the following: papillary and follicular thyroid carcinoma, papillary and medullary thyroid carcinoma, papillary and undifferentiated thyroid carcinoma, follicular and medullary thyroid carcinoma, follicular and undifferentiated thyroid carcinoma, medullary and undifferentiated thyroid carcinoma.
In a specific embodiment of the present invention, the malignant thyroid tumors of different stages in (C5) may specifically be any of the following: stage I and stage II malignant thyroid tumors, stage I and stage III malignant thyroid tumors, stage I and stage IV malignant thyroid tumors, stage II and stage III malignant thyroid tumors, stage II and stage IV malignant thyroid tumors, stage III and stage IV malignant thyroid tumors.
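The pairings listed for (C4) and (C5) are simply all unordered pairs drawn from four subtypes and four stages, six of each. A short sketch makes the enumeration explicit; the list names are hypothetical, and each pair would correspond to one type A / type B model.

```python
from itertools import combinations

SUBTYPES = ["papillary", "follicular", "medullary", "undifferentiated"]
STAGES = ["I", "II", "III", "IV"]

# Every unordered pair defines one two-class comparison.
subtype_pairs = list(combinations(SUBTYPES, 2))
stage_pairs = list(combinations(STAGES, 2))

for a, b in subtype_pairs:
    print(f"{a} vs {b} thyroid carcinoma")
for a, b in stage_pairs:
    print(f"stage {a} vs stage {b} malignant thyroid tumor")
```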
In a second aspect, the invention claims a system.
The claimed system includes:
(D1) Reagents and/or instrumentation for detecting the methylation level of KATNAL2 gene;
(D2) A device comprising a unit X and a unit Y;
the unit X is used for establishing a mathematical model and comprises a data acquisition module, a data analysis processing module and a model output module;
the data acquisition module is configured to acquire the KATNAL2 gene methylation level data, as detected by (D1), of n1 type A samples and n2 type B samples;
the data analysis processing module is configured to receive the KATNAL2 gene methylation level data of the n1 type A samples and the n2 type B samples from the data acquisition module, establish a mathematical model by binary logistic regression according to the type A / type B classification, and determine a threshold for the classification decision;
wherein n1 and n2 may be positive integers above 50, for example above 100;
the model output module is configured to receive the mathematical model established by the data analysis processing module and output the mathematical model;
the unit Y is used for determining the type of the sample to be tested and comprises a data input module, a data operation module, a data comparison module and a conclusion output module;
the data input module is configured to input the KATNAL2 gene methylation level data of a subject as detected by (D1);
the data operation module is configured to receive the subject's KATNAL2 gene methylation level data from the data input module and substitute it into the mathematical model established by the data analysis processing module in the unit X, thereby calculating a detection index;
the data comparison module is configured to receive the detection index calculated by the data operation module and compare it with the threshold determined by the data analysis processing module in the unit X;
the conclusion output module is configured to receive the comparison result from the data comparison module and, according to that result, output a conclusion as to whether the sample to be tested is type A or type B;
the type A samples and the type B samples are any one of the following:
(C1) Thyroid benign tumor and thyroid malignant tumor;
(C2) Thyroid benign tumors and thyroid malignant tumors of different subtypes;
(C3) Thyroid benign tumors and thyroid malignant tumors of different stages;
(C4) Thyroid malignancy of different subtypes;
(C5) Thyroid malignancy in different stages.
Wherein the specific meanings of (C1) - (C5) are as described in the first aspect.
In a specific embodiment of the present invention, the threshold is set to 0.5. A detection index greater than 0.5 is classified as one type, a detection index less than 0.5 is classified as the other type, and a detection index equal to 0.5 falls in an indeterminate gray zone. Type A and type B are the two corresponding classes; which group is designated type A and which type B is not fixed by convention but is determined by the specific mathematical model.
In practical applications, the threshold may also be determined from the maximum Youden index (specifically, it may be the value corresponding to the maximum Youden index). A detection index greater than the threshold is classified as one type, a detection index less than the threshold is classified as the other type, and a detection index equal to the threshold falls in an indeterminate gray zone. Type A and type B are the two corresponding classes; which group is designated type A and which type B is not fixed by convention but is determined by the specific mathematical model.
In a third aspect, the invention claims a computer-readable storage medium.
The computer-readable storage medium claimed in the present invention stores a computer program for performing the steps of:
collecting KATNAL2 gene methylation level data of n1 type A samples and n2 type B samples;
establishing a mathematical model by binary logistic regression from the KATNAL2 gene methylation level data of the n1 type A samples and the n2 type B samples according to the type A / type B classification, and determining a threshold for the classification decision;
inputting KATNAL2 gene methylation level data of a subject;
substituting the subject's KATNAL2 gene methylation level data into the mathematical model and calculating a detection index;
comparing the detection index with the threshold to obtain a comparison result;
outputting, according to the comparison result, a conclusion as to whether the sample to be tested is type A or type B;
the type A samples and the type B samples are any one of the following:
(C1) Thyroid benign tumor and thyroid malignant tumor;
(C2) Thyroid benign tumors and thyroid malignant tumors of different subtypes;
(C3) Thyroid benign tumors and thyroid malignant tumors of different stages;
(C4) Thyroid malignancy of different subtypes;
(C5) Thyroid malignancy in different stages.
Wherein the specific meanings of (C1) - (C5) are as described in the first aspect.
Wherein, n1 and n2 can be positive integers above 50, such as above 100.
In a specific embodiment of the present invention, the threshold is set to 0.5. A detection index greater than 0.5 is classified as one type, a detection index less than 0.5 is classified as the other type, and a detection index equal to 0.5 falls in an indeterminate gray zone. Type A and type B are the two corresponding classes; which group is designated type A and which type B is not fixed by convention but is determined by the specific mathematical model.
In practical applications, the threshold may also be determined from the maximum Youden index (specifically, it may be the value corresponding to the maximum Youden index). A detection index greater than the threshold is classified as one type, a detection index less than the threshold is classified as the other type, and a detection index equal to the threshold falls in an indeterminate gray zone. Type A and type B are the two corresponding classes; which group is designated type A and which type B is not fixed by convention but is determined by the specific mathematical model.
In the data processing device of the first aspect, the system of the second aspect or the computer-readable storage medium of the third aspect, the methylation level of the KATNAL2 gene may be the methylation level of all or some of the CpG sites in the KATNAL2 gene fragments shown in (E1)-(E4) below:
(E1) A DNA fragment shown in SEQ ID No.1 or a DNA fragment having 80% or more identity thereto;
(E2) A DNA fragment shown in SEQ ID No.2 or a DNA fragment having 80% or more identity thereto;
(E3) A DNA fragment shown in SEQ ID No.3 or a DNA fragment having 80% or more identity thereto;
(E4) A DNA fragment shown in SEQ ID No.4 or a DNA fragment having 80% or more identity thereto.
Further, the all or some CpG sites are any one of the following:
(F1) Any one or more CpG sites in 4 DNA fragments shown as SEQ ID No.1, SEQ ID No.2, SEQ ID No.3 and SEQ ID No.4 in KATNAL2 gene;
(F2) All CpG sites on the DNA fragment shown in SEQ ID No.1 and all CpG sites on the DNA fragment shown in SEQ ID No.2 in the KATNAL2 gene;
(F3) All CpG sites on the DNA fragment shown in SEQ ID No.1 and all CpG sites on the DNA fragment shown in SEQ ID No.3 in the KATNAL2 gene;
(F4) All CpG sites on the DNA fragment shown in SEQ ID No.2 and all CpG sites on the DNA fragment shown in SEQ ID No.3 in the KATNAL2 gene;
(F5) All CpG sites on the DNA fragment shown in SEQ ID No.1, all CpG sites on the DNA fragment shown in SEQ ID No.2 and all CpG sites on the DNA fragment shown in SEQ ID No.3 in the KATNAL2 gene;
(F6) All CpG sites on the DNA fragment shown in SEQ ID No.2 in the KATNAL2 gene, or any 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 of those CpG sites;
(F7) All, or any 4, 3, 2 or 1, of the following CpG sites on the DNA fragment shown in SEQ ID No.2 in the KATNAL2 gene:
item 1: the CpG site at positions 54-55 from the 5' end of the DNA fragment shown in SEQ ID No.2;
item 2: the CpG sites at positions 144-145 and 148-149 from the 5' end of the DNA fragment shown in SEQ ID No.2;
item 3: the CpG site at positions 165-166 from the 5' end of the DNA fragment shown in SEQ ID No.2;
item 4: the CpG sites at positions 178-179 and 188-189 from the 5' end of the DNA fragment shown in SEQ ID No.2;
item 5: the CpG sites at positions 210-211 and 214-215 from the 5' end of the DNA fragment shown in SEQ ID No.2;
item 6: the CpG site at positions 221-222 from the 5' end of the DNA fragment shown in SEQ ID No.2;
item 7: the CpG site at positions 316-317 from the 5' end of the DNA fragment shown in SEQ ID No.2.
In specific embodiments of the invention, when DNA methylation is analyzed by time-of-flight mass spectrometry, several CpG sites may lie on the same methylation fragment and give peaks that cannot be distinguished from one another (the indistinguishable sites are listed in Table 6). Such adjacent sites are treated as a single methylation site when the methylation level is analyzed and when the related mathematical models are constructed and used.
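One way to hold the (F7) items in software is a mapping from item number to the 1-based CpG positions it covers, with a helper that checks those positions against a sequence. The sketch below is illustrative only: the positions in `F7_ITEMS` are copied from items 1-7 above, but the sequence of SEQ ID No.2 is not reproduced here, so the check is run on a short placeholder string, and the function names are hypothetical.

```python
# CpG units from (F7): item number -> 1-based (C, G) positions on SEQ ID No.2.
# Items covering two sites are measured as a single unit.
F7_ITEMS = {
    1: [(54, 55)],
    2: [(144, 145), (148, 149)],
    3: [(165, 166)],
    4: [(178, 179), (188, 189)],
    5: [(210, 211), (214, 215)],
    6: [(221, 222)],
    7: [(316, 317)],
}

def cpg_sites(seq):
    """Return 1-based (C position, G position) pairs for every CpG in seq, read 5' to 3'."""
    s = seq.upper()
    return [(i + 1, i + 2) for i in range(len(s) - 1) if s[i:i + 2] == "CG"]

def missing_sites(seq, items):
    """List (item, site) pairs whose positions are not a CpG dinucleotide in seq."""
    found = set(cpg_sites(seq))
    return [(k, site) for k, sites in items.items() for site in sites if site not in found]

toy_seq = "ATCGGCGTACG"  # placeholder, NOT SEQ ID No.2
toy_items = {1: [(3, 4)], 2: [(6, 7), (10, 11)]}
print(cpg_sites(toy_seq))                 # [(3, 4), (6, 7), (10, 11)]
print(missing_sites(toy_seq, toy_items))  # []
```

Running `missing_sites` with the real SEQ ID No.2 and `F7_ITEMS` would be a quick consistency check on the listed positions.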
In the system of the second aspect, the reagent for detecting the methylation level of the KATNAL2 gene comprises (or is) a primer combination for amplifying the full length or a partial fragment of the KATNAL2 gene, and the instrument may be a time-of-flight mass spectrometer. The reagents for detecting the methylation level of the KATNAL2 gene may of course also include other conventional reagents for time-of-flight mass spectrometry.
Further, the partial fragment is at least one of the following fragments:
(G1) A DNA fragment shown in SEQ ID No.1 or a DNA fragment comprising the same;
(G2) A DNA fragment shown in SEQ ID No.2 or a DNA fragment comprising the same;
(G3) A DNA fragment shown in SEQ ID No.3 or a DNA fragment comprising the same;
(G4) A DNA fragment shown in SEQ ID No.4 or a DNA fragment comprising the same;
(G5) A DNA fragment having an identity of 80% or more to the DNA fragment shown in SEQ ID No.1 or a DNA fragment comprising the same;
(G6) A DNA fragment having an identity of 80% or more to the DNA fragment shown in SEQ ID No.2 or a DNA fragment comprising the same;
(G7) A DNA fragment having an identity of 80% or more to the DNA fragment shown in SEQ ID No.3 or a DNA fragment comprising the same;
(G8) A DNA fragment having an identity of 80% or more to the DNA fragment shown in SEQ ID No.4 or a DNA fragment comprising the same.
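The "80% or more identity" criterion can be illustrated with a simple position-by-position comparison. The text does not specify how identity is computed, so this sketch makes a strong simplifying assumption: the two sequences are already aligned and of equal length, with no gaps. Real comparisons would normally go through a sequence alignment tool first. The function name and both sequences are placeholders.

```python
def percent_identity(a, b):
    """Percentage of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)

reference = "ACGTACGTAC"  # placeholder, not one of SEQ ID No.1-4
candidate = "ACGTACGTTT"  # differs from the reference at the last two positions
identity = percent_identity(reference, candidate)
print(identity, identity >= 80.0)  # 80.0 True
```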
Still further, the primer combination is primer pair A and/or primer pair B and/or primer pair C and/or primer pair D;
the primer pair A consists of a primer A1 and a primer A2; the primer A1 is the single-stranded DNA shown in SEQ ID No.5 or in nucleotides 11-35 of SEQ ID No.5; the primer A2 is the single-stranded DNA shown in SEQ ID No.6 or in nucleotides 32-56 of SEQ ID No.6;
the primer pair B consists of a primer B1 and a primer B2; the primer B1 is the single-stranded DNA shown in SEQ ID No.7 or in nucleotides 11-35 of SEQ ID No.7; the primer B2 is the single-stranded DNA shown in SEQ ID No.8 or in nucleotides 32-56 of SEQ ID No.8;
the primer pair C consists of a primer C1 and a primer C2; the primer C1 is the single-stranded DNA shown in SEQ ID No.9 or in nucleotides 11-35 of SEQ ID No.9; the primer C2 is the single-stranded DNA shown in SEQ ID No.10 or in nucleotides 32-56 of SEQ ID No.10;
the primer pair D consists of a primer D1 and a primer D2; the primer D1 is the single-stranded DNA shown in SEQ ID No.11 or in nucleotides 11-35 of SEQ ID No.11; the primer D2 is the single-stranded DNA shown in SEQ ID No.12 or in nucleotides 32-56 of SEQ ID No.12.
In a fourth aspect, the invention claims the use of a methylated KATNAL2 gene as a marker in the preparation of a product; the application of the product is at least one of the following:
(1) Distinguishing or assisting in distinguishing between benign thyroid tumors and malignant thyroid tumors;
(2) Distinguishing or assisting in distinguishing between benign thyroid tumors and different subtypes of thyroid malignancy;
(3) Distinguishing or assisting in distinguishing benign thyroid tumors from different stages of thyroid malignancy;
(4) Distinguishing or assisting in distinguishing different subtypes of thyroid malignancy;
(5) Distinguishing or assisting in distinguishing different stages of thyroid malignancy.
Further, the different subtypes described in (2) and (4) may be pathological typing, such as histological typing.
Further, the different stages in (3) and (5) may be clinical stages.
In a specific embodiment of the present invention, the differentiation or assistance in differentiating between benign thyroid tumors and thyroid malignant tumors of different subtypes described in (2) may be specifically any of the following: distinguishing or aiding in distinguishing benign thyroid tumors from papillary thyroid cancers, distinguishing or aiding in distinguishing benign thyroid tumors from follicular thyroid cancers, distinguishing or aiding in distinguishing benign thyroid tumors from medullary thyroid cancers, distinguishing or aiding in distinguishing benign thyroid tumors from undifferentiated thyroid cancers.
In a specific embodiment of the present invention, the distinguishing or assisting in distinguishing between benign thyroid tumor and different staged thyroid malignant tumor described in (3) may be specifically any of the following: distinguishing or assisting in distinguishing benign thyroid tumor and thyroid malignant tumor of stage I, distinguishing or assisting in distinguishing benign thyroid tumor and thyroid malignant tumor of stage II, distinguishing or assisting in distinguishing benign thyroid tumor and thyroid malignant tumor of stage III, distinguishing or assisting in distinguishing benign thyroid tumor and thyroid malignant tumor of stage IV.
In a specific embodiment of the present invention, the distinguishing or assisting in distinguishing different subtypes of thyroid malignancy described in (4) may specifically be any of the following: distinguishing or aiding in distinguishing papillary thyroid cancer from follicular thyroid cancer, distinguishing or aiding in distinguishing papillary thyroid cancer from medullary thyroid cancer, distinguishing or aiding in distinguishing follicular thyroid cancer from undifferentiated thyroid cancer, distinguishing or aiding in distinguishing medullary thyroid cancer from undifferentiated thyroid cancer.
In a specific embodiment of the present invention, the distinguishing or assisting in distinguishing the different stages of thyroid malignancy in (5) may specifically be any of the following: distinguishing or assisting in distinguishing between stage I thyroid malignancy and stage II thyroid malignancy, distinguishing or assisting in distinguishing between stage I thyroid malignancy and stage III thyroid malignancy, distinguishing or assisting in distinguishing between stage I thyroid malignancy and stage IV thyroid malignancy, distinguishing or assisting in distinguishing between stage II thyroid malignancy and stage III thyroid malignancy, distinguishing or assisting in distinguishing between stage II thyroid malignancy and stage IV thyroid malignancy, distinguishing or assisting in distinguishing between stage III thyroid malignancy and stage IV thyroid malignancy.
In a fifth aspect, the invention claims the use of a substance for detecting the methylation level of KATNAL2 gene for the preparation of a product; the application of the product is at least one of the following:
(1) Distinguishing or assisting in distinguishing between benign thyroid tumors and malignant thyroid tumors;
(2) Distinguishing or assisting in distinguishing between benign thyroid tumors and different subtypes of thyroid malignancy;
(3) Distinguishing or assisting in distinguishing benign thyroid tumors from different stages of thyroid malignancy;
(4) Distinguishing or assisting in distinguishing different subtypes of thyroid malignancy;
(5) Distinguishing or assisting in distinguishing different stages of thyroid malignancy.
Wherein the specific meanings of (1) - (5) are as described in the fourth aspect.
In a sixth aspect, the invention claims the use of a substance for detecting the methylation level of KATNAL2 gene and a medium storing mathematical modeling methods and/or usage methods for the preparation of a product; the application of the product is at least one of the following:
(1) Distinguishing or assisting in distinguishing between benign thyroid tumors and malignant thyroid tumors;
(2) Distinguishing or assisting in distinguishing between benign thyroid tumors and different subtypes of thyroid malignancy;
(3) Distinguishing or assisting in distinguishing benign thyroid tumors from different stages of thyroid malignancy;
(4) Distinguishing or assisting in distinguishing different subtypes of thyroid malignancy;
(5) Distinguishing or assisting in distinguishing different stages of thyroid malignancy.
Wherein the specific meanings of (1) - (5) are as described in the fourth aspect.
The mathematical model is obtained according to a method comprising the following steps:
(A1) Detecting KATNAL2 gene methylation levels of n1 type A samples and n2 type B samples, respectively;
(A2) Taking KATNAL2 gene methylation level data of all samples obtained in the step (A1), establishing a mathematical model according to a classification mode of A type and B type by a two-classification logistic regression method, and determining a threshold value of classification judgment;
wherein, n1 and n2 can be positive integers more than 10, such as more than 100.
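As a non-limiting illustration, steps (A1)-(A2) may be sketched in Python as follows. The sketch fits the coefficients of a two-class logistic regression by plain gradient descent, which is an assumption made for self-containment and is not the fitting routine of any particular statistics package; the function names, learning rate, epoch count and toy methylation values are all hypothetical.

```python
import math

def fit_logistic(X, labels, lr=0.5, epochs=5000):
    """Fit b0..bn of ln(y/(1-y)) = b0 + b1*x1 + ... + bn*xn by gradient descent.

    X      : list of samples, each a list of methylation values in [0, 1]
    labels : 1 for one class, 0 for the other (which is type A is model-specific)
    """
    n_features = len(X[0])
    b = [0.0] * (n_features + 1)  # b[0] is the constant b0
    m = len(X)
    for _ in range(epochs):
        grad = [0.0] * (n_features + 1)
        for x, label in zip(X, labels):
            z = b[0] + sum(bi * xi for bi, xi in zip(b[1:], x))
            y = 1.0 / (1.0 + math.exp(-z))
            err = y - label  # gradient of the log-loss with respect to z
            grad[0] += err
            for j, xi in enumerate(x):
                grad[j + 1] += err * xi
        b = [bi - lr * g / m for bi, g in zip(b, grad)]
    return b

def detection_index(b, x):
    """Detection index y for one sample x under coefficients b."""
    z = b[0] + sum(bi * xi for bi, xi in zip(b[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical training set: one CpG unit, label 1 has the lower methylation.
X = [[0.20], [0.25], [0.30], [0.35], [0.60], [0.65], [0.70], [0.80]]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
b = fit_logistic(X, labels)
```

In this toy example the fitted weight of the single site is negative, so a lower methylation value yields a higher detection index; in practice n1 and n2 would be far larger than the four samples per class used here.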
The mathematical model using method comprises the following steps:
(B1) Detecting the methylation level of KATNAL2 genes of a sample to be detected;
(B2) Substituting the KATNAL2 gene methylation level data of the sample to be detected obtained in the step (B1) into the mathematical model to obtain a detection index; then comparing the detection index with a threshold value, and determining whether the type of the sample to be detected is A type or B type according to a comparison result;
The type A sample and the type B sample are any one of the following:
(C1) Thyroid benign tumor and thyroid malignant tumor;
(C2) Thyroid benign tumors and thyroid malignant tumors of different subtypes;
(C3) Thyroid benign tumors and thyroid malignant tumors of different stages;
(C4) Thyroid malignancy of different subtypes;
(C5) Thyroid malignancy in different stages.
Wherein the specific meanings of (C1) - (C5) are as described in the first aspect.
In a specific embodiment of the present invention, the threshold is set to 0.5. A detection index greater than 0.5 is classified as one type, a detection index less than 0.5 is classified as the other type, and a detection index equal to 0.5 falls in an indeterminate gray zone. Type A and type B are the two corresponding classes of a two-group classification; which group is type A and which is type B is determined by the specific mathematical model, and no fixed convention is required.
In practical applications, the threshold may also be determined from the maximum Youden index (specifically, it may be the value corresponding to the maximum Youden index). A detection index greater than the threshold is classified as one type, a detection index less than the threshold is classified as the other type, and a detection index equal to the threshold falls in an indeterminate gray zone. Type A and type B are the two corresponding classes of a two-group classification; which group is type A and which is type B is determined by the specific mathematical model, and no fixed convention is required.
In a seventh aspect, the present invention claims the use of a medium storing a mathematical model building method and/or a use method as described in the sixth aspect above for the preparation of a product; the application of the product is at least one of the following:
(1) Distinguishing or assisting in distinguishing between benign thyroid tumors and malignant thyroid tumors;
(2) Distinguishing or assisting in distinguishing between benign thyroid tumors and different subtypes of thyroid malignancy;
(3) Distinguishing or assisting in distinguishing benign thyroid tumors from different stages of thyroid malignancy;
(4) Distinguishing or assisting in distinguishing different subtypes of thyroid malignancy;
(5) Distinguishing or assisting in distinguishing different stages of thyroid malignancy.
Wherein the specific meanings of (1) - (5) are as described in the fourth aspect.
In the above-described related aspects, the methylation level of the KATNAL2 gene may be the methylation level of all or part of CpG sites in the fragments of the KATNAL2 gene as shown in (E1) - (E4) below;
in the above-mentioned related aspects, the methylated KATNAL2 gene is regarded as methylation of all or part of CpG sites in the fragments shown in (E1) to (E4) below in the KATNAL2 gene;
(E1) A DNA fragment shown in SEQ ID No.1 or a DNA fragment having 80% or more identity thereto;
(E2) A DNA fragment shown in SEQ ID No.2 or a DNA fragment having 80% or more identity thereto;
(E3) A DNA fragment shown in SEQ ID No.3 or a DNA fragment having 80% or more identity thereto;
(E4) A DNA fragment shown in SEQ ID No.4 or a DNA fragment having 80% or more identity thereto;
Further, the whole or part of the CpG sites may be any of the following:
(F1) Any one or more CpG sites in 4 DNA fragments shown as SEQ ID No.1, SEQ ID No.2, SEQ ID No.3 and SEQ ID No.4 in KATNAL2 gene;
(F2) All CpG sites on the DNA fragment shown in SEQ ID No.1 and all CpG sites on the DNA fragment shown in SEQ ID No.2 in the KATNAL2 gene;
(F3) All CpG sites on the DNA fragment shown in SEQ ID No.1 and all CpG sites on the DNA fragment shown in SEQ ID No.3 in the KATNAL2 gene;
(F4) All CpG sites on the DNA fragment shown in SEQ ID No.2 and all CpG sites on the DNA fragment shown in SEQ ID No.3 in the KATNAL2 gene;
(F5) All CpG sites on the DNA fragment shown in SEQ ID No.1, all CpG sites on the DNA fragment shown in SEQ ID No.2 and all CpG sites on the DNA fragment shown in SEQ ID No.3 in the KATNAL2 gene;
(F6) All CpG sites in the DNA fragment shown in SEQ ID No.2 in the KATNAL2 gene or any 15 or any 14 or any 13 or any 12 or any 11 or any 10 or any 9 or any 8 or any 7 or any 6 or any 5 or any 4 or any 3 or any 2 or any 1 CpG sites;
(F7) All or any 4 or any 3 or any 2 or any 1 of the following CpG sites on the DNA fragment shown in SEQ ID No.2 in the KATNAL2 gene:
item 1: the CpG site at positions 54-55 from the 5' end of the DNA fragment shown in SEQ ID No.2;
item 2: the CpG sites at positions 144-145 and 148-149 from the 5' end of the DNA fragment shown in SEQ ID No.2;
item 3: the CpG site at positions 165-166 from the 5' end of the DNA fragment shown in SEQ ID No.2;
item 4: the CpG sites at positions 178-179 and 188-189 from the 5' end of the DNA fragment shown in SEQ ID No.2;
item 5: the CpG sites at positions 210-211 and 214-215 from the 5' end of the DNA fragment shown in SEQ ID No.2;
item 6: the CpG site at positions 221-222 from the 5' end of the DNA fragment shown in SEQ ID No.2;
item 7: the CpG site at positions 316-317 from the 5' end of the DNA fragment shown in SEQ ID No.2.
In the above related aspects, the substance for detecting the methylation level of KATNAL2 gene comprises (or is) a primer combination for amplifying a full or partial fragment of KATNAL2 gene;
further, the partial fragment is at least one fragment of:
(G1) A DNA fragment shown in SEQ ID No.1 or a DNA fragment comprising the same;
(G2) A DNA fragment shown in SEQ ID No.2 or a DNA fragment comprising the same;
(G3) A DNA fragment shown in SEQ ID No.3 or a DNA fragment comprising the same;
(G4) A DNA fragment shown in SEQ ID No.4 or a DNA fragment comprising the same;
(G5) A DNA fragment having an identity of 80% or more to the DNA fragment shown in SEQ ID No.1 or a DNA fragment comprising the same;
(G6) A DNA fragment having an identity of 80% or more to the DNA fragment shown in SEQ ID No.2 or a DNA fragment comprising the same;
(G7) A DNA fragment having an identity of 80% or more to the DNA fragment shown in SEQ ID No.3 or a DNA fragment comprising the same;
(G8) A DNA fragment having an identity of 80% or more to the DNA fragment shown in SEQ ID No.4 or a DNA fragment comprising the same;
Still further, the primer combination is primer pair A and/or primer pair B and/or primer pair C and/or primer pair D;
the primer pair A is a primer pair consisting of a primer A1 and a primer A2; the primer A1 is the single-stranded DNA shown in SEQ ID No.5 or in nucleotides 11-35 of SEQ ID No.5; the primer A2 is the single-stranded DNA shown in SEQ ID No.6 or in nucleotides 32-56 of SEQ ID No.6;
the primer pair B is a primer pair consisting of a primer B1 and a primer B2; the primer B1 is the single-stranded DNA shown in SEQ ID No.7 or in nucleotides 11-35 of SEQ ID No.7; the primer B2 is the single-stranded DNA shown in SEQ ID No.8 or in nucleotides 32-56 of SEQ ID No.8;
the primer pair C is a primer pair consisting of a primer C1 and a primer C2; the primer C1 is the single-stranded DNA shown in SEQ ID No.9 or in nucleotides 11-35 of SEQ ID No.9; the primer C2 is the single-stranded DNA shown in SEQ ID No.10 or in nucleotides 32-56 of SEQ ID No.10;
the primer pair D is a primer pair consisting of a primer D1 and a primer D2; the primer D1 is the single-stranded DNA shown in SEQ ID No.11 or in nucleotides 11-35 of SEQ ID No.11; the primer D2 is the single-stranded DNA shown in SEQ ID No.12 or in nucleotides 32-56 of SEQ ID No.12.
In addition, the invention also discloses a method for distinguishing whether the sample to be detected is an A type sample or a B type sample. The method may comprise the steps of:
(A) The mathematical model may be built as a method comprising the steps of:
(A1) Detecting KATNAL2 gene methylation levels of n1 type A samples and n2 type B samples, respectively (training set);
(A2) Taking KATNAL2 gene methylation level data of all samples obtained in the step (A1), establishing a mathematical model according to classification modes of A type and B type by a two-classification logistic regression method, and determining a threshold value of classification judgment.
Wherein n1 and n2 in (A1) are positive integers of 10 or more, such as 100 or more.
(B) The sample to be tested may be determined to be a type A sample or a type B sample according to a method comprising the following steps:
(B1) Detecting the methylation level of the KATNAL2 gene of the sample to be detected;
(B2) Substituting the KATNAL2 gene methylation level data of the sample to be detected obtained in the step (B1) into the mathematical model to obtain a detection index; and then comparing the detection index with a threshold value, and determining whether the type of the sample to be detected is A type or B type according to the comparison result.
In a specific embodiment of the present invention, the threshold is set to 0.5. A detection index greater than 0.5 is classified as one type, a detection index less than 0.5 is classified as the other type, and a detection index equal to 0.5 falls in an indeterminate gray zone. Type A and type B are the two corresponding classes of a two-group classification; which group is type A and which is type B is determined by the specific mathematical model, and no fixed convention is required.
In practical applications, the threshold may also be determined from the maximum Youden index (specifically, it may be the value corresponding to the maximum Youden index). A detection index greater than the threshold is classified as one type, a detection index less than the threshold is classified as the other type, and a detection index equal to the threshold falls in an indeterminate gray zone. Type A and type B are the two corresponding classes of a two-group classification; which group is type A and which is type B is determined by the specific mathematical model, and no fixed convention is required.
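By way of illustration only, a threshold chosen by the Youden index (J = sensitivity + specificity - 1) may be computed as in the following Python sketch. The function name, the scores and the labels are hypothetical, and for the purpose of computing J a score equal to the candidate threshold is counted with the lower class, which is a simplification of the gray-zone rule described above.

```python
def youden_threshold(scores, labels):
    """Return (threshold, J) maximising J = sensitivity + specificity - 1.

    scores : detection indices of the training samples
    labels : 1 for the class expected to score high, 0 for the other class
    A sample is called positive when its score is strictly above the threshold.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):  # every observed score is a candidate cut-off
        sensitivity = sum(s > t for s in pos) / len(pos)
        specificity = sum(s <= t for s in neg) / len(neg)
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j

# Hypothetical detection indices and true classes.
scores = [0.10, 0.30, 0.35, 0.40, 0.55, 0.70, 0.80, 0.90]
labels = [0, 0, 1, 0, 1, 1, 1, 1]
threshold, j = youden_threshold(scores, labels)
```

For these toy values the maximum is reached at a threshold of 0.40, where specificity is 1 and sensitivity is 0.8.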
The type A sample and the type B sample may be any one of the foregoing (C1) - (C5).
In practical applications, any of the above mathematical models may vary with the DNA methylation detection method and the fitting method used; the model is determined case by case, and no fixed convention is required.
In the embodiment of the invention, the model is specifically ln(y/(1-y)) = b0 + b1x1 + b2x2 + b3x3 + … + bnxn, where y is the dependent variable, i.e. the detection index obtained after the methylation values of one or more methylation sites of the sample to be tested are substituted into the model; b0 is a constant; x1 to xn are the independent variables, i.e. the methylation values of one or more methylation sites of the sample to be tested (each a value between 0 and 1); and b1 to bn are the weights that the model assigns to the methylation value of each site.
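For clarity, the model can be solved for the detection index y. The rearrangement below is standard algebra on the logit form stated above and introduces no symbols beyond b0, bi and xi:

```latex
\ln\frac{y}{1-y} = b_0 + \sum_{i=1}^{n} b_i x_i
\quad\Longleftrightarrow\quad
y = \frac{1}{1 + e^{-\left(b_0 + \sum_{i=1}^{n} b_i x_i\right)}}
```

It follows that y always lies strictly between 0 and 1, and that y is greater than 0.5 exactly when b0 + b1x1 + … + bnxn is greater than 0.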
One specific model established in the embodiment of the invention is a model for distinguishing or assisting in distinguishing thyroid benign tumor from thyroid malignant tumor, and the model is specifically as follows: ln(y/(1-y)) = 1.649 - 1.304 × KATNAL2_B_5 + 1.946 × KATNAL2_B_6.7 + 1.706 × KATNAL2_B_8 - 3.649 × KATNAL2_B_9.10 - 3.727 × KATNAL2_B_11.12 - 2.189 × KATNAL2_B_13 + 1.272 × KATNAL2_B_14. KATNAL2_B_5 is the methylation level of the CpG site at positions 54-55 from the 5' end of the DNA fragment shown in SEQ ID No.2; KATNAL2_B_6.7 is the methylation level of the CpG sites at positions 144-145 and 148-149 from the 5' end of the DNA fragment shown in SEQ ID No.2; KATNAL2_B_8 is the methylation level of the CpG site at positions 165-166 from the 5' end of the DNA fragment shown in SEQ ID No.2; KATNAL2_B_9.10 is the methylation level of the CpG sites at positions 178-179 and 188-189 from the 5' end of the DNA fragment shown in SEQ ID No.2; KATNAL2_B_11.12 is the methylation level of the CpG sites at positions 210-211 and 214-215 from the 5' end of the DNA fragment shown in SEQ ID No.2; KATNAL2_B_13 is the methylation level of the CpG site at positions 221-222 from the 5' end of the DNA fragment shown in SEQ ID No.2; KATNAL2_B_14 is the methylation level of the CpG site at positions 316-317 from the 5' end of the DNA fragment shown in SEQ ID No.2. The threshold of the model is 0.5. Patients whose detection index calculated by the model is greater than 0.5 are, or are candidates for being, thyroid malignant tumor patients, and patients whose detection index is less than 0.5 are, or are candidates for being, thyroid benign tumor patients.
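The specific model above may be evaluated as in the following Python sketch. The constant and the seven weights are taken from the model as stated; the function name and the two sets of input methylation values are hypothetical and serve only to exercise the formula.

```python
import math

# Constant and weights of the benign/malignant model stated above.
INTERCEPT = 1.649
WEIGHTS = {
    "KATNAL2_B_5": -1.304,
    "KATNAL2_B_6.7": 1.946,
    "KATNAL2_B_8": 1.706,
    "KATNAL2_B_9.10": -3.649,
    "KATNAL2_B_11.12": -3.727,
    "KATNAL2_B_13": -2.189,
    "KATNAL2_B_14": 1.272,
}

def linear_predictor(levels):
    """Right-hand side of ln(y/(1-y)) for a dict of methylation levels (0..1)."""
    return INTERCEPT + sum(w * levels[site] for site, w in WEIGHTS.items())

def detection_index(levels):
    """Detection index y obtained by inverting the logit."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(levels)))

# Hypothetical extreme inputs: fully unmethylated and fully methylated units.
all_low = {site: 0.0 for site in WEIGHTS}
all_high = {site: 1.0 for site in WEIGHTS}
index_low = detection_index(all_low)    # above 0.5: malignant side of the threshold
index_high = detection_index(all_high)  # below 0.5: benign side of the threshold
```

With every unit at 0 the linear predictor equals the constant 1.649, and with every unit at 1 it equals -4.296, so the two extremes fall on opposite sides of the 0.5 threshold.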
In the above aspects, the detecting the methylation level of the KATNAL2 gene is detecting the methylation level of the KATNAL2 gene in a tumor tissue sample.
In the present invention, the methylation level of the methylation sites on the DNA fragments shown in SEQ ID Nos. 1, 2, 3 and 4 in the KATNAL2 gene in thyroid malignant tumor tissue is significantly lower than that of thyroid benign tumor.
In the present invention, the methylation levels of the methylation sites on the DNA fragments shown in SEQ ID Nos. 1, 2, 3 and 4 in the KATNAL2 gene become progressively lower in the tumor tissue of thyroid cancers with different clinical characteristics, such as papillary carcinoma, follicular carcinoma, medullary carcinoma and undifferentiated carcinoma.
In the present invention, the methylation level of the methylation sites on the DNA fragments shown in SEQ ID Nos. 1, 2, 3 and 4 in the KATNAL2 gene in the tissue becomes lower as the stage of thyroid malignancy increases.
The KATNAL2 gene described in any of the above is described in particular under the following GenBank accession numbers: NM_001387690.1 (GI: 1914794500), transcript variant 1; NM_031303.3 (GI: 1220516027), transcript variant 2; NM_001353899.1 (GI: 1220516019), transcript variant 3; NM_001353900.1 (GI: 1220516021), transcript variant 4; NM_001353901.1 (GI: 1220516023), transcript variant 5; NM_001353902.1 (GI: 1220516025), transcript variant 6; NM_001353903.1 (GI: 1220516028), transcript variant 7; NM_001353904.1 (GI: 1220516030), transcript variant 8; NM_001353905.1 (GI: 1220516032), transcript variant 9; NM_001353906.1 (GI: 1220516034), transcript variant 10; NM_001353907.1 (GI: 1220516036), transcript variant 11; NM_001353908.1 (GI: 1220516038), transcript variant 12; NM_001353909.1 (GI: 1220516040), transcript variant 13; NM_001367621.1 (GI: 1533911145), transcript variant 14.
The invention demonstrates that KATNAL2 methylation in biopsy samples can serve as a potential marker for the differential diagnosis of thyroid benign tumors and thyroid malignant tumors, and of thyroid malignant tumors of different subtypes or different stages. The invention has important scientific significance and clinical application value for identifying thyroid benign tumors and thyroid malignant tumors, and thyroid malignant tumors of different subtypes or different stages, and for guiding the formulation of reasonable clinical treatment plans.
Drawings
FIG. 1 is a computer flow diagram of an implementation of the present invention that assists in distinguishing between type A samples and type B samples.
Fig. 2 is a schematic diagram of a mathematical model.
Fig. 3 is an illustration of a mathematical model of benign and malignant thyroid tumors.
Detailed Description
The following detailed description of the invention is provided in connection with the accompanying drawings that are presented to illustrate the invention and not to limit the scope thereof. The examples provided below are intended as guidelines for further modifications by one of ordinary skill in the art and are not to be construed as limiting the invention in any way.
The experimental methods in the following examples, unless otherwise specified, are conventional methods, and are carried out according to techniques or conditions described in the literature in the field or according to the product specifications. Materials, reagents and the like used in the examples described below are commercially available unless otherwise specified.
FIG. 1 is a computer flow diagram of an implementation of the present invention that assists in distinguishing between type A samples and type B samples.
In step S1, KATNAL2 gene methylation level data of n1 type A samples and n2 type B samples are collected;
in step S2, a mathematical model is built from the KATNAL2 gene methylation level data of the n1 type A samples and the n2 type B samples of step S1 by a two-class logistic regression method according to the classification into type A and type B, and a threshold for classification is determined;
in step S3, KATNAL2 gene methylation level data of a subject are input;
in step S4, the KATNAL2 gene methylation level data of the subject from step S3 are substituted into the mathematical model of step S2, and a detection index is calculated;
in step S5, the detection index of S4 is compared with the threshold of S2 to obtain a comparison result;
in step S6, a conclusion as to whether the sample to be tested is of type A or type B is output according to the comparison result of S5.
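Steps S5 and S6 may, for example, be sketched in Python as follows. The function name and the class labels are hypothetical; which class lies above the threshold depends on the specific model, so both labels are left as parameters.

```python
def classify(detection_index, threshold=0.5, above="type A", below="type B"):
    """Compare the detection index with the threshold (S5) and report a class (S6).

    An index exactly equal to the threshold falls in the indeterminate gray zone.
    """
    if detection_index > threshold:
        return above
    if detection_index < threshold:
        return below
    return "indeterminate"

# Hypothetical detection indices from step S4.
results = [classify(v) for v in (0.84, 0.50, 0.01)]
```

For the benign/malignant model, `above` would be set to thyroid malignant tumor and `below` to thyroid benign tumor.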
Example 1 primer design for detection of methylation site of KATNAL2 Gene
CpG sites on four fragments of the KATNAL2 gene (KATNAL2_A fragment, KATNAL2_B fragment, KATNAL2_C fragment and KATNAL2_D fragment) were selected to analyze the correlation between methylation level and thyroid malignancy.
The KATNAL2_A fragment (SEQ ID No. 1) is located in the hg19 reference genome chr18:44525801-44526509, antisense strand.
The KATNAL2_B fragment (SEQ ID No. 2) is located in the hg19 reference genome chr18:44526519-44526921, antisense strand.
The KATNAL2_C fragment (SEQ ID No. 3) is located in the hg19 reference genome chr18:44526920-44527572, antisense strand.
The KATNAL2_D fragment (SEQ ID No. 4) is located in the hg19 reference genome chr18:44527828-44528538, antisense strand.
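As a consistency check, the fragment lengths implied by the hg19 coordinates above may be computed as in the following Python sketch. It assumes that the coordinates are 1-based and inclusive, which the text does not state; the dictionary and function names are hypothetical. The last CpG positions are those listed in Tables 1, 2 and 4.

```python
# hg19 chr18 coordinates of the four fragments, as given above.
FRAGMENTS = {
    "KATNAL2_A": (44525801, 44526509),
    "KATNAL2_B": (44526519, 44526921),
    "KATNAL2_C": (44526920, 44527572),
    "KATNAL2_D": (44527828, 44528538),
}

def fragment_length(name):
    """Length in bp, assuming 1-based inclusive coordinates."""
    start, end = FRAGMENTS[name]
    return end - start + 1

lengths = {name: fragment_length(name) for name in FRAGMENTS}

# Second base of the last CpG listed for each fragment in Tables 1, 2 and 4.
LAST_CPG_END = {"KATNAL2_A": 684, "KATNAL2_B": 378, "KATNAL2_D": 686}
fits = {name: pos <= lengths[name] for name, pos in LAST_CPG_END.items()}
```

Under this assumption the fragments are 709, 403, 653 and 711 bp long, and every tabulated CpG position falls within its fragment.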
The site information in the KATNAL2_A fragment is shown in Table 1.
The site information in the KATNAL2_B fragment is shown in Table 2.
The site information in the KATNAL2_C fragment is shown in Table 3.
The site information in the KATNAL2_D fragment is shown in Table 4.
Table 1 CpG site information in the KATNAL2_A fragment
| CpG site | Position of the CpG site in the sequence |
| --- | --- |
| KATNAL2_A_1 | Positions 26-27 from the 5' end of SEQ ID No.1 |
| KATNAL2_A_2 | Positions 43-44 from the 5' end of SEQ ID No.1 |
| KATNAL2_A_3 | Positions 79-80 from the 5' end of SEQ ID No.1 |
| KATNAL2_A_4 | Positions 84-85 from the 5' end of SEQ ID No.1 |
| KATNAL2_A_5 | Positions 173-174 from the 5' end of SEQ ID No.1 |
| KATNAL2_A_6 | Positions 286-287 from the 5' end of SEQ ID No.1 |
| KATNAL2_A_7 | Positions 302-303 from the 5' end of SEQ ID No.1 |
| KATNAL2_A_8 | Positions 419-420 from the 5' end of SEQ ID No.1 |
| KATNAL2_A_9 | Positions 512-513 from the 5' end of SEQ ID No.1 |
| KATNAL2_A_10 | Positions 656-657 from the 5' end of SEQ ID No.1 |
| KATNAL2_A_11 | Positions 683-684 from the 5' end of SEQ ID No.1 |
Table 2 CpG site information in the KATNAL2_B fragment
| CpG site | Position of the CpG site in the sequence |
| --- | --- |
| KATNAL2_B_1 | Positions 26-27 from the 5' end of SEQ ID No.2 |
| KATNAL2_B_2 | Positions 36-37 from the 5' end of SEQ ID No.2 |
| KATNAL2_B_3 | Positions 43-44 from the 5' end of SEQ ID No.2 |
| KATNAL2_B_4 | Positions 45-46 from the 5' end of SEQ ID No.2 |
| KATNAL2_B_5 | Positions 54-55 from the 5' end of SEQ ID No.2 |
| KATNAL2_B_6 | Positions 144-145 from the 5' end of SEQ ID No.2 |
| KATNAL2_B_7 | Positions 148-149 from the 5' end of SEQ ID No.2 |
| KATNAL2_B_8 | Positions 165-166 from the 5' end of SEQ ID No.2 |
| KATNAL2_B_9 | Positions 178-179 from the 5' end of SEQ ID No.2 |
| KATNAL2_B_10 | Positions 188-189 from the 5' end of SEQ ID No.2 |
| KATNAL2_B_11 | Positions 210-211 from the 5' end of SEQ ID No.2 |
| KATNAL2_B_12 | Positions 214-215 from the 5' end of SEQ ID No.2 |
| KATNAL2_B_13 | Positions 221-222 from the 5' end of SEQ ID No.2 |
| KATNAL2_B_14 | Positions 316-317 from the 5' end of SEQ ID No.2 |
| KATNAL2_B_15 | Positions 345-346 from the 5' end of SEQ ID No.2 |
| KATNAL2_B_16 | Positions 377-378 from the 5' end of SEQ ID No.2 |
Table 3 CpG site information in the KATNAL2_C fragment
Table 4 CpG site information in the KATNAL2_D fragment
| CpG site | Position of the CpG site in the sequence |
| --- | --- |
| KATNAL2_D_1 | Positions 26-27 from the 5' end of SEQ ID No.4 |
| KATNAL2_D_2 | Positions 74-75 from the 5' end of SEQ ID No.4 |
| KATNAL2_D_3 | Positions 108-109 from the 5' end of SEQ ID No.4 |
| KATNAL2_D_4 | Positions 129-130 from the 5' end of SEQ ID No.4 |
| KATNAL2_D_5 | Positions 164-165 from the 5' end of SEQ ID No.4 |
| KATNAL2_D_6 | Positions 244-245 from the 5' end of SEQ ID No.4 |
| KATNAL2_D_7 | Positions 316-317 from the 5' end of SEQ ID No.4 |
| KATNAL2_D_8 | Positions 340-341 from the 5' end of SEQ ID No.4 |
| KATNAL2_D_9 | Positions 372-373 from the 5' end of SEQ ID No.4 |
| KATNAL2_D_10 | Positions 426-427 from the 5' end of SEQ ID No.4 |
| KATNAL2_D_11 | Positions 502-503 from the 5' end of SEQ ID No.4 |
| KATNAL2_D_12 | Positions 508-509 from the 5' end of SEQ ID No.4 |
| KATNAL2_D_13 | Positions 525-526 from the 5' end of SEQ ID No.4 |
| KATNAL2_D_14 | Positions 534-535 from the 5' end of SEQ ID No.4 |
| KATNAL2_D_15 | Positions 567-568 from the 5' end of SEQ ID No.4 |
| KATNAL2_D_16 | Positions 643-644 from the 5' end of SEQ ID No.4 |
| KATNAL2_D_17 | Positions 685-686 from the 5' end of SEQ ID No.4 |
Specific PCR primers were designed for the four fragments (KATNAL2_A fragment, KATNAL2_B fragment, KATNAL2_C fragment and KATNAL2_D fragment), as shown in Table 5. SEQ ID No.5, SEQ ID No.7, SEQ ID No.9 and SEQ ID No.11 are forward primers; SEQ ID No.6, SEQ ID No.8, SEQ ID No.10 and SEQ ID No.12 are reverse primers. In SEQ ID No.5, SEQ ID No.7, SEQ ID No.9 and SEQ ID No.11, positions 1-10 from the 5' end are a non-specific tag and positions 11-35 are the specific primer sequence; in SEQ ID No.6, SEQ ID No.8, SEQ ID No.10 and SEQ ID No.12, positions 1-31 from the 5' end are a non-specific tag and positions 32-56 are the specific primer sequence. The primer sequences contain no SNPs or CpG sites.
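The primer layout described above (a tag followed by a 25-nucleotide specific sequence) may be illustrated by the following Python sketch, which converts the 1-based positions into string slices. The function name and the placeholder sequences are hypothetical and are not the sequences of Table 5.

```python
def specific_part(primer, direction):
    """Return the specific primer sequence, dropping the 5' non-specific tag.

    forward primers: positions 11-35 (1-based) follow a 10-nt tag
    reverse primers: positions 32-56 (1-based) follow a 31-nt tag
    """
    if direction == "forward":
        return primer[10:35]
    if direction == "reverse":
        return primer[31:56]
    raise ValueError("direction must be 'forward' or 'reverse'")

# Placeholder primers: 'N' marks the tag, 'A'/'T' mark the specific part.
forward = "N" * 10 + "A" * 25
reverse = "N" * 31 + "T" * 25
fwd_specific = specific_part(forward, "forward")
rev_specific = specific_part(reverse, "reverse")
```

Both slices are 25 nucleotides long, matching the position ranges 11-35 and 32-56.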
Table 5 KATNAL2 methylation primer sequences
Example 2 KATNAL2 gene methylation detection and analysis of results (training set)
1. Study sample
A total of 380 thyroid benign tumor tissues and 598 thyroid malignant tumor tissues were collected with the informed consent of the patients. Thyroid cancer was staged according to the eighth edition of the American Joint Committee on Cancer (AJCC) staging system. By pathological type, thyroid malignant tumors fall into four major classes: papillary thyroid carcinoma, follicular thyroid carcinoma, medullary thyroid carcinoma and undifferentiated thyroid carcinoma. The 598 thyroid malignant tumor patients collected here included 380 with papillary thyroid carcinoma, 138 with follicular thyroid carcinoma, 44 with medullary thyroid carcinoma and 36 with undifferentiated thyroid carcinoma. By pathological stage, the 598 thyroid malignant tumor patients comprised 470 stage I, 68 stage II, 24 stage III and 36 stage IV patients.
2. Methylation detection
1. Total DNA in tumor tissue is extracted.
2. The total DNA of the tissue samples prepared in step 1 was treated with bisulfite (see the instructions of the Qiagen DNA methylation kit). After bisulfite treatment, unmethylated cytosines (C) in the original CpG sites are converted to uracil (U), while methylated cytosines remain unchanged.
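The effect of bisulfite treatment on a sequence can be illustrated in silico by the following Python sketch. The function name, the example sequence and the methylated position are hypothetical; the rule applied is the one stated above, namely that unmethylated C becomes U and methylated C is unchanged.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite conversion of one DNA strand.

    seq                  : uppercase DNA string
    methylated_positions : set of 1-based positions of methylated cytosines
    Unmethylated C is written as U; methylated C is left as C.
    """
    converted = []
    for position, base in enumerate(seq, start=1):
        if base == "C" and position not in methylated_positions:
            converted.append("U")
        else:
            converted.append(base)
    return "".join(converted)

# Hypothetical 8-nt strand with a methylated C at position 2 only.
converted = bisulfite_convert("ACGTCCGA", {2})
```

During the subsequent PCR the uracils are copied as thymines, so methylated and unmethylated CpG sites yield products that differ in sequence and therefore in mass.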
3. Using the bisulfite-treated DNA from step 2 as the template, PCR amplification was performed with DNA polymerase and the 4 specific primer pairs in Table 5. All primers were amplified in a conventional standard PCR reaction system according to the following program.
The PCR program was: 95 ℃, 4 min → (95 ℃, 20 s → 56 ℃, 30 s → 72 ℃, 2 min) × 45 cycles → 72 ℃, 5 min → 4 ℃, 1 h.
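For planning purposes, the nominal run time of the PCR program may be tallied as in the following Python sketch. The variable names are hypothetical, and the total ignores temperature ramping between steps, which depends on the instrument.

```python
# PCR program as (temperature in degrees C, duration in seconds).
INITIAL_DENATURATION = (95, 4 * 60)
CYCLE_STEPS = [(95, 20), (56, 30), (72, 2 * 60)]
CYCLES = 45
FINAL_EXTENSION = (72, 5 * 60)
HOLD = (4, 60 * 60)

seconds_per_cycle = sum(duration for _, duration in CYCLE_STEPS)
amplification_seconds = (
    INITIAL_DENATURATION[1] + CYCLES * seconds_per_cycle + FINAL_EXTENSION[1]
)
total_seconds = amplification_seconds + HOLD[1]  # including the 4 degree hold
```

Each cycle takes 170 s, so the amplification steps amount to 8190 s (136.5 min), or 11790 s (196.5 min) including the 1 h hold at 4 ℃.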
4. The amplified product of step 3 was subjected to DNA methylation analysis by time-of-flight mass spectrometry, as follows:
(1) To 5 μl of the PCR product, add 2 μl of shrimp alkaline phosphatase (SAP) solution (0.3 μl SAP [0.5 U] + 1.7 μl H2O), then incubate in a PCR instrument according to the following program: 37 ℃, 20 min → 85 ℃, 5 min → 4 ℃, 5 min;
(2) Take 2 μl of the SAP-treated product obtained in step (1), add it to a 5 μl T-Cleavage reaction system according to the kit instructions, and incubate at 37 ℃ for 3 h;
(3) Add 19 μl of deionized water to the product of step (2), then deionize with 6 μg of resin on a rotary shaker for 1 h;
(4) Centrifuge at 2000 rpm at room temperature for 5 min, and load a small volume of the supernatant onto a 384-format SpectroCHIP with the Nanodispenser robotic arm;
(5) Perform time-of-flight mass spectrometry; the data were collected with SpectroACQUIRE v3.3.1.3 software and visualized with MassARRAY EpiTYPER v1.2.1.2 software.
The reagents used for time-of-flight mass spectrometry were all from a kit (T-Cleavage MassCLEAVE Reagent Auto Kit, cat# 10129A); the instrument used was the MassARRAY® Analyzer Chip Prep Module 384 (model: 41243); the data analysis software was the software supplied with the instrument.
5. And (5) analyzing the data obtained in the step (4).
Statistical analysis of the data was performed by SPSS Statistics 23.0.
Non-parametric tests were used for comparative analysis between the two groups.
The ability of combinations of multiple CpG sites to discriminate between sample groups was evaluated by logistic regression and receiver operating characteristic (ROC) curve analysis.
All statistical tests were two-sided, with p-values < 0.05 considered statistically significant.
The mass spectrometry experiments yielded a total of 60 distinguishable peak patterns. The methylation level at each CpG site of each sample was obtained automatically by SpectroACQUIRE v3.3.1.3 software from the peak areas according to the formula "methylation level = peak area of methylated fragments / (peak area of unmethylated fragments + peak area of methylated fragments)".
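The peak-area formula above can be illustrated with a minimal Python sketch. The function name and the example peak areas are hypothetical and are not taken from the instrument software; the sketch only restates the ratio given in the text.

```python
def methylation_level(methylated_area: float, unmethylated_area: float) -> float:
    """Return methylated area / (unmethylated area + methylated area)."""
    total = methylated_area + unmethylated_area
    if total <= 0:
        # No signal at all: the ratio is undefined.
        raise ValueError("total peak area must be positive")
    return methylated_area / total


# Hypothetical peak areas for one CpG unit of one sample.
level = methylation_level(methylated_area=30.0, unmethylated_area=70.0)
print(round(level, 2))
```

The result always lies between 0 and 1, which matches the value range required later for the independent variables of the logistic regression model.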
4. Statistical analysis
Methylation levels were expressed as medians, and differences in methylation level between two or more groups were compared using nonparametric tests. The differential diagnostic value of single CpG sites and of combinations of multiple CpG sites was assessed by receiver operating characteristic (ROC) curves. A two-sided P < 0.05 was considered statistically significant, and all data were analyzed with SPSS 25.0.
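The text does not name the specific nonparametric test. As a sketch only, and assuming a Mann-Whitney U test for a two-group comparison, the calculation could look as follows in Python. The normal approximation used here omits tie and continuity corrections, so its p-values may differ slightly from those reported by SPSS; the function names are illustrative.

```python
import math


def rank_with_ties(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Two-sided Mann-Whitney U test using a plain normal approximation."""
    n1, n2 = len(group_a), len(group_b)
    ranks = rank_with_ties(list(group_a) + list(group_b))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u1, p_value


# Hypothetical methylation levels for two small groups.
u, p = mann_whitney_u([0.1, 0.2, 0.3], [0.4, 0.5, 0.6])
print(u, p < 0.05)
```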
5. Analysis of results
1. Analysis of the methylation level of the KATNAL2 gene in thyroid benign tumors, thyroid malignant tumors, and different subtypes and stages of thyroid malignant tumors
The methylation level of all CpG sites in the KATNAL2 gene was analyzed using tissue samples of 380 thyroid benign tumors and 598 thyroid malignant tumors as study material. The median KATNAL2 gene methylation level was 0.47 (IQR = 0.37-0.77) for thyroid benign tumors and 0.33 (IQR = 0.24-0.63) for thyroid malignant tumors. By subtype, the median was 0.32 (IQR = 0.23-0.62) for papillary thyroid carcinoma, 0.31 (IQR = 0.22-0.61) for follicular thyroid carcinoma, 0.28 (IQR = 0.19-0.59) for medullary thyroid carcinoma, and 0.25 (IQR = 0.14-0.56) for undifferentiated thyroid carcinoma. By stage, the median was 0.38 (IQR = 0.28-0.70) for stage I, 0.32 (IQR = 0.22-0.61) for stage II, 0.30 (IQR = 0.17-0.57) for stage III, and 0.27 (IQR = 0.12-0.52) for stage IV thyroid malignant tumors. Comparative analysis of these groups showed that the methylation levels of all CpG sites in the KATNAL2 gene were significantly higher in thyroid benign tumors than in thyroid malignant tumors (Table 6), and that the difference from benign tumors became progressively more pronounced across papillary, follicular, medullary, and undifferentiated thyroid carcinoma (Table 6). In addition, as the stage of thyroid malignancy advanced, the methylation level of the methylation sites on the DNA fragments shown in SEQ ID Nos. 1, 2, 3 and 4 of the KATNAL2 gene in tissue decreased, and the difference from thyroid benign tumors became increasingly pronounced (Table 6).
Table 6. Methylation levels of the KATNAL2 gene in thyroid benign tumors and in thyroid malignant tumors, their subtypes and stages
Note: CpG sites in the table are all distinguishable CpG sites.
2. Methylation level of KATNAL2 gene in tumor tissue can distinguish thyroid benign tumor from thyroid malignant tumor of different subtypes
Comparative analysis of KATNAL2 methylation levels in 380 cases of thyroid benign tumor and 598 cases of thyroid malignant tumor showed that the methylation levels of the KATNAL2_A, KATNAL2_B, KATNAL2_C and KATNAL2_D fragments in patients with thyroid malignant tumor, papillary thyroid carcinoma, follicular thyroid carcinoma, medullary thyroid carcinoma and undifferentiated thyroid carcinoma were significantly lower than the methylation levels of the corresponding fragments in patients with thyroid benign tumor (P < 0.05). The specific results are shown in Table 7.
Table 7. Differences in KATNAL2 gene methylation level between thyroid benign tumors and thyroid malignant tumors of different subtypes
Note: CpG sites in the table are all distinguishable CpG sites.
3. Methylation level of KATNAL2 gene in tumor tissue can distinguish between different subtypes of thyroid malignancy
As a result of comparative analysis of the methylation levels of KATNAL2 in the cases of thyroid malignant tumors of different types (380 cases of papillary thyroid cancer, 138 cases of follicular thyroid cancer, 44 cases of medullary thyroid cancer and 36 cases of undifferentiated thyroid cancer), it was found that there was a significant difference (P < 0.05) between the methylation levels of KATNAL2 gene in patients with papillary thyroid cancer, follicular thyroid cancer, medullary thyroid cancer and undifferentiated thyroid cancer. The specific results are shown in Table 8.
Table 8. Differences in KATNAL2 gene methylation level between subtypes of thyroid malignant tumors
Note: CpG sites in the table are all distinguishable CpG sites.
4. Methylation level of KATNAL2 gene in tumor tissue can distinguish thyroid benign tumor from thyroid malignant tumor of different stages
Comparative analysis of KATNAL2 methylation levels in 380 cases of thyroid benign tumor and in thyroid malignant tumor patients of different stages (470 stage I, 68 stage II, 24 stage III and 36 stage IV patients) showed that the methylation levels of the KATNAL2_A, KATNAL2_B, KATNAL2_C and KATNAL2_D fragments in stage I, II, III and IV thyroid cancer patients were significantly lower than the methylation levels of the corresponding fragments in patients with thyroid benign tumor (P < 0.05). The specific results are shown in Table 9.
Table 9. Differences in KATNAL2 gene methylation level between thyroid benign tumors and thyroid malignant tumors of different stages
Note: CpG sites in the table are all distinguishable CpG sites.
5. Methylation level of KATNAL2 gene in tumor tissue can distinguish thyroid malignant tumor of different stages
As a result of comparative analysis of the methylation levels of KATNAL2 in thyroid malignancy patients of different stages (470 cases of stage I patients, 68 cases of stage II patients, 24 cases of stage III patients and 36 cases of stage IV patients), it was found that there was a significant difference (P < 0.05) between the methylation levels of KATNAL2 gene in stage I thyroid malignancy, stage II thyroid malignancy, stage III thyroid malignancy and stage IV thyroid malignancy patients. The specific results are shown in Table 10.
Table 10. Differences in KATNAL2 gene methylation level between different stages of thyroid malignant tumors
Note: CpG sites in the table are all distinguishable CpG sites.
6. Establishment of mathematical model for KATNAL2 gene methylation to aid in cancer diagnosis
The mathematical model established by the invention can be used for achieving the following purposes:
(1) Distinguishing thyroid malignant tumor patients from thyroid benign tumor;
(2) Distinguishing between benign thyroid tumors and thyroid malignant tumors of different subtypes;
(3) Distinguishing thyroid benign tumor from thyroid malignant tumor of different stages;
(4) Distinguishing different subtypes of thyroid malignancy;
(5) Different stages of thyroid malignancy are distinguished.
The mathematical model is established as follows:
(A) Data sources: the methylation levels of the target CpG sites (one or more combinations from Tables 1 to 4) in the tissue samples listed in step one, namely 380 cases of thyroid benign tumor and 598 cases of thyroid malignant tumor (380 papillary thyroid carcinomas, 138 follicular thyroid carcinomas, 44 medullary thyroid carcinomas and 36 undifferentiated thyroid carcinomas), detected by the same method as in step two.
(B) Model building
The data of any two different types of patients are selected as the training set as required, for example: thyroid benign tumor versus thyroid malignant tumor patients; thyroid benign tumor versus papillary, follicular, medullary or undifferentiated thyroid carcinoma patients; any two of the four subtypes (papillary, follicular, medullary and undifferentiated thyroid carcinoma); thyroid benign tumor versus stage I, II, III or IV thyroid malignant tumor patients; and any two of stages I to IV of thyroid malignant tumor. A mathematical model is then established from the training set by binary logistic regression. Either the value corresponding to the maximum Youden index calculated from the model is used as the threshold, or the threshold is set directly to 0.5. The sample to be tested is assayed and its data substituted into the model to obtain a detection index: a detection index greater than the threshold is classified into one class (class B), a detection index less than the threshold is classified into the other class (class A), and a detection index equal to the threshold is treated as an indeterminate gray zone.
To predict which class a new sample belongs to, the methylation levels of one or more CpG sites on the KATNAL2 gene of the sample are first measured by a DNA methylation assay. The methylation data are then substituted into the mathematical model to calculate the detection index of the sample, the detection index is compared with the threshold, and the class of the sample is determined from the comparison.
Example: as shown in Fig. 2, the methylation level data of a single CpG site or a combination of multiple CpG sites of the KATNAL2 gene in the training set are used to establish a mathematical model for distinguishing class A from class B by binary logistic regression in statistical software such as SAS, R or SPSS. The mathematical model here is a binary logistic regression model, specifically: ln(y/(1-y)) = b0 + b1·x1 + b2·x2 + b3·x3 + … + bn·xn, where y is the dependent variable, i.e., the detection index obtained by substituting the methylation levels of one or more methylation sites of the sample to be tested into the model; b0 is a constant; x1 to xn are the independent variables, i.e., the methylation levels of one or more methylation sites of the sample to be tested (each a value between 0 and 1); and b1 to bn are the weights assigned by the model to each methylation site. In a specific application, the model is fitted to the methylation levels (x1 to xn) of one or more DNA methylation sites of the training-set samples and their known classification (class A or class B, with y assigned 0 and 1 respectively), which determines the constant b0 and the weights b1 to bn. Either the value corresponding to the maximum Youden index calculated from the model is used as the threshold, or the threshold is set directly to 0.5. After the sample to be tested is assayed and its data substituted into the model, a detection index (y value) greater than the threshold is classified as class B, a y value less than the threshold is classified as class A, and a y value equal to the threshold is treated as an indeterminate gray zone.
Here class A and class B are the two corresponding classes of the binary grouping; which group is class A and which is class B depends on the specific mathematical model and is not fixed here. To predict the class of a subject's sample, a biopsy sample (i.e., tumor tissue) is first collected from the subject and DNA is extracted from it. After bisulfite conversion of the extracted DNA, the methylation level of a single CpG site or of multiple CpG sites of the subject's KATNAL2 gene is measured by a DNA methylation assay, and the methylation data are substituted into the mathematical model. If the calculated value, i.e., the detection index, is greater than the threshold, the subject is assigned to the class whose detection index in the training set is greater than the threshold (class B); if the detection index is less than the threshold, the subject is assigned to the class whose detection index in the training set is less than the threshold (class A); if the detection index is equal to the threshold, it cannot be determined whether the subject is class A or class B.
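The threshold rule described above (maximum Youden index, or a fixed 0.5, with an indeterminate gray zone at equality) can be sketched in Python as follows. This is an illustrative sketch rather than the procedure used in the statistical software: the function names, the convention of scanning observed scores as candidate cutoffs, and the example scores are all assumptions.

```python
def youden_threshold(scores, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    labels: 1 for class B, 0 for class A. A sample counts as a class B call
    when its score is strictly greater than the cutoff.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):  # candidate cutoffs: observed scores
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s > cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s <= cutoff)
        j = tp / positives + tn / negatives - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


def classify(detection_index, threshold=0.5):
    """Class B above the threshold, class A below, gray zone at equality."""
    if detection_index > threshold:
        return "B"
    if detection_index < threshold:
        return "A"
    return "indeterminate"


# Hypothetical detection indices (y values) for six training samples.
scores = [0.10, 0.20, 0.35, 0.60, 0.80, 0.90]
labels = [0, 0, 0, 1, 1, 1]
print(youden_threshold(scores, labels))
print(classify(0.82), classify(0.37), classify(0.5))
```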
Example: as shown in Fig. 3, the use of the mathematical model to discriminate thyroid benign from thyroid malignant tissue is illustrated with the methylation of 7 distinguishable CpG sites of KATNAL2_B (KATNAL2_B_5, KATNAL2_B_6.7, KATNAL2_B_8, KATNAL2_B_9.10, KATNAL2_B_11.12, KATNAL2_B_13, KATNAL2_B_14). The methylation level data of these 7 distinguishable CpG sites of KATNAL2_B, measured in a training set of thyroid benign tumor and thyroid malignant tumor patients (here: 380 thyroid benign tumor and 598 thyroid malignant tumor patients), were used to establish a mathematical model for identifying thyroid malignant tumor patients by binary logistic regression in SPSS or R. The mathematical model here is a binary logistic regression model, from which the constant b0 and the weights b1 to bn of the individual methylation sites are determined. In this case the model is: ln(y/(1-y)) = 1.649 - 1.304 × KATNAL2_B_5 + 1.946 × KATNAL2_B_6.7 + 1.706 × KATNAL2_B_8 - 3.649 × KATNAL2_B_9.10 - 3.727 × KATNAL2_B_11.12 - 2.189 × KATNAL2_B_13 + 1.272 × KATNAL2_B_14, where y is the dependent variable, i.e., the detection index obtained after substituting the methylation levels of the above 7 distinguishable CpG sites of KATNAL2_B of the sample to be tested into the model. KATNAL2_B_6 and KATNAL2_B_7 are located in the same fragment, as are KATNAL2_B_9 and KATNAL2_B_10, and KATNAL2_B_11 and KATNAL2_B_12; KATNAL2_B_6.7, KATNAL2_B_9.10 and KATNAL2_B_11.12 therefore each represent the average methylation level of the corresponding pair of sites.
With the threshold set to 0.5, the measured methylation levels of the 7 distinguishable CpG sites of KATNAL2_B of the sample to be tested are substituted into the model. If the resulting detection index (y value) is less than the threshold, the sample is classified as a thyroid benign tumor patient; if it is greater than the threshold, the sample is classified as a thyroid malignant tumor patient; if it is equal to the threshold, it cannot be determined whether the sample is from a thyroid benign or a thyroid malignant tumor patient. The area under the curve (AUC) of this model was 0.79 (Table 15). A specific example of judging subjects is as follows: biopsies (i.e., tumor tissue) were collected from two subjects (A and B), DNA was extracted and converted by bisulfite, and the methylation levels of the 7 distinguishable CpG sites KATNAL2_B_5, KATNAL2_B_6.7, KATNAL2_B_8, KATNAL2_B_9.10, KATNAL2_B_11.12, KATNAL2_B_13 and KATNAL2_B_14 were measured by a DNA methylation assay. The measured methylation data were then substituted into the mathematical model. For subject A, the calculated value was 0.82, which is greater than 0.5, so subject A was judged to be a thyroid malignant tumor patient (consistent with the clinical diagnosis). For subject B, the calculated value was 0.37, which is less than 0.5, so subject B was judged to be a thyroid benign tumor patient (consistent with the clinical diagnosis).
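To make the arithmetic of the 7-site model concrete, the following Python sketch evaluates the regression formula given above and applies the 0.5 threshold. The intercept and weights are copied from the formula in the text; the function names and the input methylation values are hypothetical, since the source does not report the measured values for subjects A and B.

```python
import math

# Constant b0 and weights b1..b7 as stated in the model formula.
INTERCEPT = 1.649
WEIGHTS = {
    "KATNAL2_B_5": -1.304,
    "KATNAL2_B_6.7": 1.946,
    "KATNAL2_B_8": 1.706,
    "KATNAL2_B_9.10": -3.649,
    "KATNAL2_B_11.12": -3.727,
    "KATNAL2_B_13": -2.189,
    "KATNAL2_B_14": 1.272,
}


def detection_index(levels):
    """Solve ln(y/(1-y)) = b0 + sum(bi * xi) for y."""
    logit = INTERCEPT + sum(w * levels[site] for site, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-logit))


def call(levels, threshold=0.5):
    """Malignant above the threshold, benign below, gray zone at equality."""
    y = detection_index(levels)
    if y > threshold:
        return "malignant"
    if y < threshold:
        return "benign"
    return "indeterminate"


# Hypothetical sample: every site measured at a methylation level of 0.60.
sample = {site: 0.60 for site in WEIGHTS}
print(round(detection_index(sample), 3), call(sample))
```

Because the weights sum to a negative value, uniformly higher methylation across the seven sites pushes the detection index down, which agrees with the observation that benign tumors show higher KATNAL2 methylation than malignant ones.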
(C) Model Effect evaluation
According to the above method, mathematical models were established for distinguishing each of the following pairs: thyroid benign tumor versus thyroid malignant tumor patients; thyroid benign tumor versus papillary, follicular, medullary or undifferentiated thyroid carcinoma patients; any two of the four subtypes (papillary, follicular, medullary and undifferentiated thyroid carcinoma); thyroid benign tumor versus stage I, II, III or IV thyroid malignant tumor patients; and any two of stages I to IV of thyroid malignant tumor. The effectiveness of each model was evaluated by receiver operating characteristic (ROC) curves. The larger the area under the ROC curve (AUC), the better the discrimination of the model and the more effective the molecular marker. The evaluation results for mathematical models constructed from different CpG sites are shown in Tables 11, 12, 13 and 14. In Tables 11, 12, 13 and 14, "1 CpG site" denotes any single CpG site in the amplified fragment of KATNAL2_B, "2 CpG sites" denotes a combination of any 2 CpG sites in the amplified fragment of KATNAL2_B, "3 CpG sites" denotes a combination of any 3 CpG sites in the amplified fragment of KATNAL2_B, and so on.
The values in the tables are the ranges of the evaluation results over the different site combinations (i.e., the result for any combination of that number of CpG sites falls within the range).
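For readers who want to see how an AUC is obtained from detection indices, here is a minimal Python sketch that uses the rank-based (pairwise comparison) definition of the area under the ROC curve. It is an illustration under stated assumptions, not the SPSS procedure used in the study; the function name and the example data are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a class B sample scores above a class A sample.

    labels: 1 for class B (e.g. malignant), 0 for class A (e.g. benign).
    Ties between a class B score and a class A score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical detection indices for two benign and two malignant samples.
print(roc_auc([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1]))
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation, which is why a larger AUC indicates a more effective marker.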
The above results show that the ability of KATNAL2 gene methylation to discriminate each pair of groups (thyroid benign tumor versus thyroid malignant tumor patients; thyroid benign tumor versus papillary, follicular, medullary or undifferentiated thyroid carcinoma patients; any two of the four subtypes; thyroid benign tumor versus stage I, II, III or IV thyroid malignant tumor patients; and any two of stages I to IV of thyroid malignant tumor) increases with the number of KATNAL2 gene methylation sites included in the model.
In addition, among the CpG sites shown in Tables 1 to 4, a combination of a few preferred sites can discriminate better than a combination of a larger number of non-preferred sites. For example, the combination of the 7 distinguishable CpG sites KATNAL2_B_5, KATNAL2_B_6.7, KATNAL2_B_8, KATNAL2_B_9.10, KATNAL2_B_11.12, KATNAL2_B_13 and KATNAL2_B_14 shown in Tables 15, 16, 17 and 18 is the preferred combination among all 7-site combinations of KATNAL2_B.
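The tables report ranges over "any combination" of a given number of CpG sites. As a sketch of how such combinations can be enumerated, the Python snippet below lists all k-site subsets of a site pool. Using the seven named KATNAL2_B sites as the pool is an assumption made for illustration only; the amplified fragment may contain further distinguishable sites that are not listed in this section.

```python
from itertools import combinations

# Illustrative pool: the seven distinguishable sites named in the text.
SITES = [
    "KATNAL2_B_5",
    "KATNAL2_B_6.7",
    "KATNAL2_B_8",
    "KATNAL2_B_9.10",
    "KATNAL2_B_11.12",
    "KATNAL2_B_13",
    "KATNAL2_B_14",
]


def site_combinations(k):
    """Return every unordered combination of k sites from the pool."""
    return list(combinations(SITES, k))


# Number of candidate models for each combination size.
counts = {k: len(site_combinations(k)) for k in range(1, len(SITES) + 1)}
print(counts)
```

Each combination would then be fitted and scored separately, and the minimum and maximum AUC over the combinations of a given size give the range reported for that size.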
Table 11. CpG sites of KATNAL2_B and combinations thereof for distinguishing thyroid benign tumors from thyroid malignant tumors of different subtypes
Note: CpG sites in the table are all distinguishable CpG sites.
Table 12. CpG sites of KATNAL2_B and combinations thereof for distinguishing thyroid benign tumors from thyroid malignant tumors of different stages
Note: CpG sites in the table are all distinguishable CpG sites.
Table 13. CpG sites of KATNAL2_B and combinations thereof for distinguishing patients with different subtypes of thyroid malignant tumors
Note: CpG sites in the table are all distinguishable CpG sites.
Table 14. CpG sites of KATNAL2_B and combinations thereof for distinguishing patients with different stages of thyroid malignant tumors
Note: CpG sites in the table are all distinguishable CpG sites.
Table 15. Optimal CpG sites of KATNAL2_B and combinations thereof for distinguishing thyroid benign tumors from thyroid malignant tumors of different subtypes
Note: CpG sites in the table are all distinguishable CpG sites.
Table 16. Optimal CpG sites of KATNAL2_B and combinations thereof for distinguishing thyroid benign tumors from thyroid malignant tumors of different stages
Note: CpG sites in the table are all distinguishable CpG sites.
Table 17. Optimal CpG sites of KATNAL2_B and combinations thereof for distinguishing between different subtypes of thyroid malignant tumors
Note: CpG sites in the table are all distinguishable CpG sites.
Table 18. Optimal CpG sites of KATNAL2_B and combinations thereof for distinguishing between different stages of thyroid malignant tumors
Note: CpG sites in the table are all distinguishable CpG sites.
Example 3. Validation of the KATNAL2 methylation markers in a validation cohort
1. A further group of subjects was selected as the validation set (these samples are entirely distinct from the training set in Example 2). The samples collected were as follows:
190 thyroid benign tumor tissues and 300 thyroid malignant tumor tissues. Thyroid cancer stage was determined according to the eighth edition of the American Joint Committee on Cancer (AJCC) staging system. By pathological type, thyroid malignancy comprises four major classes: papillary thyroid carcinoma, follicular thyroid carcinoma, medullary carcinoma, and undifferentiated carcinoma. The 300 thyroid malignant tumor patients collected here included 188 papillary thyroid carcinomas, 70 follicular thyroid carcinomas, 22 medullary carcinomas, and 20 undifferentiated carcinomas. By pathological stage, the 300 thyroid malignant tumor patients included 200 stage I patients, 40 stage II patients, 20 stage III patients and 40 stage IV patients.
2. Methylation detection
Methylation detection and data analysis methods were the same as in example 2.
3. Analysis of results
1. Analysis of the methylation level of the KATNAL2 gene in thyroid benign tumors, thyroid malignant tumors, and different subtypes and stages of thyroid malignant tumors
As shown in Table 19, the median KATNAL2 gene methylation level was 0.47 (IQR = 0.38-0.77) for thyroid benign tumors and 0.33 (IQR = 0.25-0.63) for thyroid malignant tumors. By subtype, the median was 0.32 (IQR = 0.23-0.63) for papillary thyroid carcinoma, 0.31 (IQR = 0.23-0.62) for follicular thyroid carcinoma, 0.28 (IQR = 0.19-0.58) for medullary thyroid carcinoma, and 0.25 (IQR = 0.14-0.56) for undifferentiated thyroid carcinoma. By stage, the median was 0.38 (IQR = 0.28-0.71) for stage I, 0.32 (IQR = 0.22-0.62) for stage II, 0.30 (IQR = 0.16-0.56) for stage III, and 0.27 (IQR = 0.11-0.51) for stage IV thyroid malignant tumors. Comparing the KATNAL2 gene methylation levels of thyroid benign tumors with those of thyroid malignant tumors, their subtypes and their stages showed that the methylation levels of all CpG sites in the KATNAL2 gene were significantly higher in thyroid benign tumors than in thyroid malignant tumors, in each subtype of thyroid malignant tumor, and in each stage of thyroid malignant tumor. The methylation level of the KATNAL2 gene can therefore be used to distinguish thyroid benign tumors from thyroid malignant tumors, from the different subtypes of thyroid malignant tumor, and from the different stages of thyroid malignant tumor.
Table 19. Methylation levels of the KATNAL2 gene in thyroid benign tumors and in thyroid malignant tumors, their subtypes and stages
Note: CpG sites in the table are all distinguishable CpG sites.
2. Methylation level of KATNAL2 gene in tissue can distinguish benign thyroid tumor from different subtype/different stage thyroid malignant tumor
Comparative analysis of KATNAL2 methylation levels in thyroid benign tumor cases and thyroid malignant tumor cases showed that the methylation levels of the KATNAL2_A, KATNAL2_B, KATNAL2_C and KATNAL2_D fragments in patients with thyroid malignant tumor, papillary thyroid carcinoma, follicular thyroid carcinoma, medullary carcinoma and undifferentiated carcinoma were significantly lower than the methylation levels of the corresponding fragments in patients with thyroid benign tumor (P < 0.05). The methylation levels of the KATNAL2_A, KATNAL2_B, KATNAL2_C and KATNAL2_D fragments in patients with stage I, stage II, stage III and stage IV thyroid cancer were likewise significantly lower than the methylation levels of the corresponding fragments in patients with thyroid benign tumor (P < 0.05). The specific results are shown in Table 20.
Table 20. Differences in KATNAL2 gene methylation level between thyroid benign tumors and thyroid malignant tumors, their different subtypes and different stages
Note: CpG sites in the table are all distinguishable CpG sites.
3. Methylation level of KATNAL2 gene in tissue can distinguish between different subtypes of thyroid malignancy
As a result of comparative analysis of the methylation level of KATNAL2 between cases of thyroid malignant tumors (papillary thyroid carcinoma, follicular thyroid carcinoma, medullary carcinoma and undifferentiated carcinoma) of different types, it was found that there was a significant difference (P < 0.05) between the methylation levels of KATNAL2 genes in patients with papillary thyroid carcinoma, follicular thyroid carcinoma, medullary carcinoma and undifferentiated carcinoma. The specific results are shown in Table 21.
Table 21. Differences in KATNAL2 gene methylation level between subtypes of thyroid malignant tumors
Note: CpG sites in the table are all distinguishable CpG sites.
4. Methylation level of KATNAL2 gene in tissue can distinguish thyroid malignancy of different stages
As a result of comparative analysis of the methylation levels of KATNAL2 in thyroid malignancy patients of different stages (stage I, stage II, stage III and stage IV), it was found that there was a significant difference (P < 0.05) in the methylation levels of KATNAL2 gene in thyroid malignancy of stage I, stage II, stage III and stage IV. The specific results are shown in Table 22.
Table 22. Differences in KATNAL2 gene methylation level between different stages of thyroid malignant tumors
Note: CpG sites in the table are all distinguishable CpG sites.
5. Model verification
The methylation data of all methylation sites in the validation set samples were substituted into each of the models built on the training set of Example 2 for validation. As shown in Tables 23 to 26, testing on the validation set confirmed that the ability of KATNAL2 gene methylation to discriminate each of the different sample types of the present invention increases with the number of methylation sites on the KATNAL2 gene.
Further, as can be seen from Tables 27 to 30, the combination of the 7 distinguishable CpG sites KATNAL2_B_5, KATNAL2_B_6.7, KATNAL2_B_8, KATNAL2_B_9.10, KATNAL2_B_11.12, KATNAL2_B_13 and KATNAL2_B_14 discriminates better than combinations of a larger number of non-preferred sites.
Table 23. CpG sites of KATNAL2_B and combinations thereof for distinguishing thyroid benign tumors from thyroid malignant tumors of different subtypes
Note: CpG sites in the table are all distinguishable CpG sites.
Table 24. CpG sites of KATNAL2_B and combinations thereof for distinguishing thyroid benign tumors from thyroid malignant tumors of different stages
Note: CpG sites in the table are all distinguishable CpG sites.
Table 25. CpG sites of KATNAL2_B and combinations thereof for distinguishing patients with different subtypes of thyroid malignant tumors
Note: CpG sites in the table are all distinguishable CpG sites.
Table 26. CpG sites of KATNAL2_B and combinations thereof for distinguishing patients with different stages of thyroid malignant tumors
Note: CpG sites in the table are all distinguishable CpG sites.
Table 27. Optimal CpG sites of KATNAL2_B and combinations thereof for distinguishing thyroid benign tumors from thyroid malignant tumors of different subtypes
Note: CpG sites in the table are all distinguishable CpG sites.
Table 28. Optimal CpG sites of KATNAL2_B and combinations thereof for distinguishing thyroid benign tumors from thyroid malignant tumors of different stages
Note: CpG sites in the table are all distinguishable CpG sites.
Table 29. Optimal CpG sites of KATNAL2_B and combinations thereof for distinguishing between different subtypes of thyroid malignant tumors
Note: CpG sites in the table are all distinguishable CpG sites.
Table 30. Optimal CpG sites of KATNAL2_B and combinations thereof for distinguishing between different stages of thyroid malignant tumors
Note: CpG sites in the table are all distinguishable CpG sites.
When the methylation data for all sites in the validation-set samples were substituted into each model built on the training set in Example 2, the results given by the models were consistent with the clinical diagnoses.
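The validation step described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the procedure of the invention: the detection indices, the labels and the threshold of 0.5 are made-up placeholder values, and the two summary measures used here (area under the ROC curve and simple agreement with the clinical diagnosis) are assumptions about how discrimination might be summarized.

```python
def auc(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    (type B, type A) pairs in which the type B sample scores higher.
    Ties count as half a win."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def agreement(scores, labels, threshold):
    """Fraction of samples whose model call matches the clinical label."""
    calls = [1 if s >= threshold else 0 for s in scores]
    return sum(c == l for c, l in zip(calls, labels)) / len(labels)


# Hypothetical validation set: detection indices produced by a trained
# model, and clinical labels (1 = type B, 0 = type A). Placeholder data.
val_scores = [0.91, 0.78, 0.66, 0.35, 0.42, 0.12, 0.08, 0.55]
val_labels = [1, 1, 1, 1, 0, 0, 0, 0]

print(auc(val_scores, val_labels))             # 0.875
print(agreement(val_scores, val_labels, 0.5))  # 0.75
```

Repeating this calculation for models built on different numbers of CpG sites would give one way to compare site combinations on the same validation set.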
In summary, the CpG sites on the KATNAL2 gene and combinations thereof; the CpG sites on the KATNAL2_A fragment and combinations thereof; the CpG sites on the KATNAL2_B fragment and combinations thereof; the KATNAL2_B_5, KATNAL2_B_6.7, KATNAL2_B_8, KATNAL2_B_9.10, KATNAL2_B_11.12, KATNAL2_B_13 and KATNAL2_B_14 sites and combinations thereof; the CpG sites on the KATNAL2_C fragment and combinations thereof; the CpG sites on the KATNAL2_D fragment and combinations thereof; and the methylation levels of the CpG sites on KATNAL2_A, KATNAL2_B, KATNAL2_C and KATNAL2_D and various combinations thereof all have the ability to discriminate between the following pairs of patient groups: thyroid benign tumor and thyroid malignant tumor; thyroid benign tumor and thyroid papillary carcinoma; thyroid benign tumor and thyroid follicular carcinoma; thyroid benign tumor and thyroid undifferentiated carcinoma; thyroid papillary carcinoma and thyroid follicular carcinoma; thyroid papillary carcinoma and thyroid medullary carcinoma; thyroid papillary carcinoma and thyroid undifferentiated carcinoma; thyroid follicular carcinoma and thyroid medullary carcinoma; thyroid follicular carcinoma and thyroid undifferentiated carcinoma; thyroid medullary carcinoma and thyroid undifferentiated carcinoma; thyroid benign tumor and stage I thyroid malignant tumor; thyroid benign tumor and stage II thyroid malignant tumor; thyroid benign tumor and stage III thyroid malignant tumor; thyroid benign tumor and stage IV thyroid malignant tumor; stage I thyroid malignant tumor and stage IV thyroid malignant tumor; stage II thyroid malignant tumor and stage III thyroid malignant tumor; stage II thyroid malignant tumor and stage IV thyroid malignant tumor; and stage III thyroid malignant tumor and stage IV thyroid malignant tumor.
The present invention is described in detail above. It will be apparent to those skilled in the art that the present invention can be practiced in a wide range of equivalent parameters, concentrations, and conditions without departing from the spirit and scope of the invention and without undue experimentation. While the invention has been described with respect to specific embodiments, it will be appreciated that the invention may be further modified. In general, this application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. The application of some of the basic features may be done in accordance with the scope of the claims that follow.

Claims (10)

1. A data processing apparatus for assisting in distinguishing between a type A sample and a type B sample, the type A sample and the type B sample being any one of:
(C1) Thyroid benign tumor and thyroid malignant tumor;
(C2) Thyroid benign tumors and thyroid malignant tumors of different subtypes;
(C3) Thyroid benign tumors and thyroid malignant tumors of different stages;
(C4) Thyroid malignancy of different subtypes;
(C5) Thyroid malignancy in different stages;
characterized in that: the data processing device comprises a unit X and a unit Y;
the unit X is used for establishing a mathematical model and comprises a data acquisition module, a data analysis processing module and a model output module;
the data acquisition module is configured to acquire KATNAL2 gene methylation level data of n1 type A samples and n2 type B samples;
the data analysis processing module is configured to receive the KATNAL2 gene methylation level data of the n1 type A samples and the n2 type B samples from the data acquisition module, establish a mathematical model by binary logistic regression according to the classification into type A and type B, and determine a threshold value for the classification decision;
the model output module is configured to receive the mathematical model established by the data analysis processing module and output the mathematical model;
the unit Y is used for determining the type of the sample to be tested and comprises a data input module, a data operation module, a data comparison module and a conclusion output module;
the data input module is configured to input KATNAL2 gene methylation level data of a subject;
the data operation module is configured to receive the KATNAL2 gene methylation level data of the subject from the data input module and substitute it into the mathematical model established by the data analysis processing module in the unit X, so as to calculate a detection index;
the data comparison module is configured to receive the detection index calculated by the data operation module and compare the detection index with the threshold value determined by the data analysis processing module in the unit X;
the conclusion output module is configured to receive the comparison result from the data comparison module and output, according to the comparison result, a conclusion as to whether the sample to be tested is type A or type B.
2. A system, comprising:
(D1) Reagents and/or instrumentation for detecting the methylation level of KATNAL2 gene;
(D2) A device comprising a unit X and a unit Y;
the unit X is used for establishing a mathematical model and comprises a data acquisition module, a data analysis processing module and a model output module;
the data acquisition module is configured to acquire KATNAL2 gene methylation level data of n1 type A samples and n2 type B samples detected by (D1);
the data analysis processing module is configured to receive the KATNAL2 gene methylation level data of the n1 type A samples and the n2 type B samples from the data acquisition module, establish a mathematical model by binary logistic regression according to the classification into type A and type B, and determine a threshold value for the classification decision;
the model output module is configured to receive the mathematical model established by the data analysis processing module and output the mathematical model;
the unit Y is used for determining the type of the sample to be tested and comprises a data input module, a data operation module, a data comparison module and a conclusion output module;
the data input module is configured to input KATNAL2 gene methylation level data of a subject detected by (D1);
the data operation module is configured to receive the KATNAL2 gene methylation level data of the subject from the data input module and substitute it into the mathematical model established by the data analysis processing module in the unit X, so as to calculate a detection index;
the data comparison module is configured to receive the detection index calculated by the data operation module and compare the detection index with the threshold value determined by the data analysis processing module in the unit X;
the conclusion output module is configured to receive the comparison result from the data comparison module and output, according to the comparison result, a conclusion as to whether the sample to be tested is type A or type B;
the type A sample and the type B sample are any one of the following:
(C1) Thyroid benign tumor and thyroid malignant tumor;
(C2) Thyroid benign tumors and thyroid malignant tumors of different subtypes;
(C3) Thyroid benign tumors and thyroid malignant tumors of different stages;
(C4) Thyroid malignancy of different subtypes;
(C5) Thyroid malignancy in different stages.
3. A computer-readable storage medium, characterized by: the computer readable storage medium stores a computer program for performing the steps of:
collecting KATNAL2 gene methylation level data of n1 type A samples and n2 type B samples;
establishing a mathematical model by binary logistic regression from the KATNAL2 gene methylation level data of the n1 type A samples and the n2 type B samples according to the classification into type A and type B, and determining a threshold value for the classification decision;
inputting KATNAL2 gene methylation level data of a subject;
substituting the KATNAL2 gene methylation level data of the subject into the mathematical model, and calculating a detection index;
comparing the detection index with the threshold value to obtain a comparison result;
outputting, according to the comparison result, a conclusion as to whether the sample to be tested is type A or type B;
the type A sample and the type B sample are any one of the following:
(C1) Thyroid benign tumor and thyroid malignant tumor;
(C2) Thyroid benign tumors and thyroid malignant tumors of different subtypes;
(C3) Thyroid benign tumors and thyroid malignant tumors of different stages;
(C4) Thyroid malignancy of different subtypes;
(C5) Thyroid malignancy in different stages.
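The steps recited in claim 3 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed program: the methylation values are made-up placeholders, the helper names (`fit_logistic`, `detection_index`, `choose_threshold`, `classify`) are hypothetical, the model is fitted by plain gradient descent, and the threshold is chosen by maximizing Youden's index on the training data, which is one common choice that the claim itself does not specify.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=3000):
    """Binary logistic regression fitted by batch gradient descent.
    X: one row per sample, one column per CpG site (methylation level).
    y: 0 for type A, 1 for type B."""
    n, k = len(X), len(X[0])
    w, b = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(k):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return w, b


def detection_index(model, x):
    """Predicted probability of type B for one sample."""
    w, b = model
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def choose_threshold(model, X, y):
    """Assumed rule: threshold maximizing sensitivity - false positive rate."""
    scores = [detection_index(model, x) for x in X]
    n_pos = sum(y)
    n_neg = len(y) - n_pos
    best_t, best_j = 0.5, -1.0
    for t in sorted(set(scores)):
        sens = sum(1 for s, yi in zip(scores, y) if yi == 1 and s >= t) / n_pos
        fpr = sum(1 for s, yi in zip(scores, y) if yi == 0 and s >= t) / n_neg
        if sens - fpr > best_j:
            best_t, best_j = t, sens - fpr
    return best_t


def classify(model, threshold, x):
    """Compare the detection index with the threshold and report the type."""
    return "B" if detection_index(model, x) >= threshold else "A"


# Placeholder training data: two CpG sites, 4 type A and 4 type B samples.
X_train = [[0.82, 0.79], [0.85, 0.81], [0.78, 0.76], [0.80, 0.84],
           [0.45, 0.40], [0.50, 0.42], [0.38, 0.47], [0.42, 0.36]]
y_train = [0, 0, 0, 0, 1, 1, 1, 1]

model = fit_logistic(X_train, y_train)
threshold = choose_threshold(model, X_train, y_train)
print(classify(model, threshold, [0.40, 0.38]))
print(classify(model, threshold, [0.83, 0.80]))
```

In practice a statistics package would normally be used for the fit; the hand-written version is kept here only so the sketch has no dependencies.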
4. The data processing apparatus of claim 1 or the system of claim 2 or the computer readable storage medium of claim 3, wherein:
the methylation level of the KATNAL2 gene is the methylation level of all or part of the CpG sites in the fragments shown in the following (E1)-(E4) in the KATNAL2 gene;
(E1) A DNA fragment shown in SEQ ID No.1 or a DNA fragment having 80% or more identity thereto;
(E2) A DNA fragment shown in SEQ ID No.2 or a DNA fragment having 80% or more identity thereto;
(E3) A DNA fragment shown in SEQ ID No.3 or a DNA fragment having 80% or more identity thereto;
(E4) A DNA fragment shown in SEQ ID No.4 or a DNA fragment having 80% or more identity thereto;
further, the all or part of the CpG sites is any one of the following:
(F1) Any one or more CpG sites in 4 DNA fragments shown as SEQ ID No.1, SEQ ID No.2, SEQ ID No.3 and SEQ ID No.4 in KATNAL2 gene;
(F2) All CpG sites on the DNA fragment shown in SEQ ID No.1 and all CpG sites on the DNA fragment shown in SEQ ID No.2 in the KATNAL2 gene;
(F3) All CpG sites on the DNA fragment shown in SEQ ID No.1 and all CpG sites on the DNA fragment shown in SEQ ID No.3 in the KATNAL2 gene;
(F4) All CpG sites on the DNA fragment shown in SEQ ID No.2 and all CpG sites on the DNA fragment shown in SEQ ID No.3 in the KATNAL2 gene;
(F5) All CpG sites on the DNA fragment shown in SEQ ID No.1, all CpG sites on the DNA fragment shown in SEQ ID No.2 and all CpG sites on the DNA fragment shown in SEQ ID No.3 in the KATNAL2 gene;
(F6) All CpG sites in the DNA fragment shown in SEQ ID No.2 in the KATNAL2 gene or any 15 or any 14 or any 13 or any 12 or any 11 or any 10 or any 9 or any 8 or any 7 or any 6 or any 5 or any 4 or any 3 or any 2 or any 1 CpG sites;
(F7) All or any 4 or any 3 or any 2 or any 1 of the following CpG sites on the DNA fragment shown in SEQ ID No.2 of the KATNAL2 gene:
Item 1: the CpG site at positions 54-55 from the 5' end of the DNA fragment shown in SEQ ID No.2;
Item 2: the CpG sites at positions 144-145 and 148-149 from the 5' end of the DNA fragment shown in SEQ ID No.2;
Item 3: the CpG site at positions 165-166 from the 5' end of the DNA fragment shown in SEQ ID No.2;
Item 4: the CpG sites at positions 178-179 and 188-189 from the 5' end of the DNA fragment shown in SEQ ID No.2;
Item 5: the CpG sites at positions 210-211 and 214-215 from the 5' end of the DNA fragment shown in SEQ ID No.2;
Item 6: the CpG site at positions 221-222 from the 5' end of the DNA fragment shown in SEQ ID No.2;
Item 7: the CpG site at positions 316-317 from the 5' end of the DNA fragment shown in SEQ ID No.2.
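The position-based site definitions in items 1 to 7 can be checked against a sequence with a few lines of Python. The sketch below is illustrative only: the positions are copied from the items above (1-based, counted from the 5' end), while the sequence is a synthetic placeholder built for the example, not SEQ ID No.2, and the function names are hypothetical.

```python
# 1-based start positions of the CpG dinucleotides named in items 1 to 7.
CLAIMED_SITES = {
    "Item 1": [54],
    "Item 2": [144, 148],
    "Item 3": [165],
    "Item 4": [178, 188],
    "Item 5": [210, 214],
    "Item 6": [221],
    "Item 7": [316],
}


def find_cpg_starts(seq):
    """Return the 1-based position of the C in every CG dinucleotide."""
    s = seq.upper()
    return [i + 1 for i in range(len(s) - 1) if s[i:i + 2] == "CG"]


def check_items(seq, items=CLAIMED_SITES):
    """For each item, report whether every listed position is a CpG in seq."""
    found = set(find_cpg_starts(seq))
    return {name: all(p in found for p in starts)
            for name, starts in items.items()}


def make_placeholder(length=320):
    """Synthetic stand-in sequence: all A, with CG placed at the listed
    positions. It is NOT the real fragment and exists only for the demo."""
    bases = ["A"] * length
    for starts in CLAIMED_SITES.values():
        for p in starts:
            bases[p - 1], bases[p] = "C", "G"
    return "".join(bases)


placeholder = make_placeholder()
print(find_cpg_starts(placeholder))
print(check_items(placeholder))
```

Run against the actual fragment, the same check would confirm that the coordinates in a site table and the sequence listing refer to the same dinucleotides.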
5. The system according to claim 2 or 4, characterized in that: the reagent for detecting the methylation level of the KATNAL2 gene comprises a primer combination for amplifying a full or partial fragment of the KATNAL2 gene;
further, the partial fragment is at least one fragment of:
(G1) A DNA fragment shown in SEQ ID No.1 or a DNA fragment comprising the same;
(G2) A DNA fragment shown in SEQ ID No.2 or a DNA fragment comprising the same;
(G3) A DNA fragment shown in SEQ ID No.3 or a DNA fragment comprising the same;
(G4) A DNA fragment shown in SEQ ID No.4 or a DNA fragment comprising the same;
(G5) A DNA fragment having an identity of 80% or more to the DNA fragment shown in SEQ ID No.1 or a DNA fragment comprising the same;
(G6) A DNA fragment having an identity of 80% or more to the DNA fragment shown in SEQ ID No.2 or a DNA fragment comprising the same;
(G7) A DNA fragment having an identity of 80% or more to the DNA fragment shown in SEQ ID No.3 or a DNA fragment comprising the same;
(G8) A DNA fragment having an identity of 80% or more to the DNA fragment shown in SEQ ID No.4 or a DNA fragment comprising the same;
still further, the primer combination is primer pair A and/or primer pair B and/or primer pair C and/or primer pair D;
the primer pair A consists of a primer A1 and a primer A2; the primer A1 is the single-stranded DNA shown in SEQ ID No.5 or in nucleotides 11-35 of SEQ ID No.5; the primer A2 is the single-stranded DNA shown in SEQ ID No.6 or in nucleotides 32-56 of SEQ ID No.6;
the primer pair B consists of a primer B1 and a primer B2; the primer B1 is the single-stranded DNA shown in SEQ ID No.7 or in nucleotides 11-35 of SEQ ID No.7; the primer B2 is the single-stranded DNA shown in SEQ ID No.8 or in nucleotides 32-56 of SEQ ID No.8;
the primer pair C consists of a primer C1 and a primer C2; the primer C1 is the single-stranded DNA shown in SEQ ID No.9 or in nucleotides 11-35 of SEQ ID No.9; the primer C2 is the single-stranded DNA shown in SEQ ID No.10 or in nucleotides 32-56 of SEQ ID No.10;
the primer pair D consists of a primer D1 and a primer D2; the primer D1 is the single-stranded DNA shown in SEQ ID No.11 or in nucleotides 11-35 of SEQ ID No.11; the primer D2 is the single-stranded DNA shown in SEQ ID No.12 or in nucleotides 32-56 of SEQ ID No.12.
6. Use of the methylated KATNAL2 gene as a marker in the preparation of a product; the use of the product is at least one of the following:
(1) Distinguishing or assisting in distinguishing between benign thyroid tumors and malignant thyroid tumors;
(2) Distinguishing or assisting in distinguishing between benign thyroid tumors and different subtypes of thyroid malignancy;
(3) Distinguishing or assisting in distinguishing benign thyroid tumors from different stages of thyroid malignancy;
(4) Distinguishing or assisting in distinguishing different subtypes of thyroid malignancy;
(5) Distinguishing or assisting in distinguishing different stages of thyroid malignancy.
7. Use of a substance for detecting the methylation level of the KATNAL2 gene in the preparation of a product; the use of the product is at least one of the following:
(1) Distinguishing or assisting in distinguishing between benign thyroid tumors and malignant thyroid tumors;
(2) Distinguishing or assisting in distinguishing between benign thyroid tumors and different subtypes of thyroid malignancy;
(3) Distinguishing or assisting in distinguishing benign thyroid tumors from different stages of thyroid malignancy;
(4) Distinguishing or assisting in distinguishing different subtypes of thyroid malignancy;
(5) Distinguishing or assisting in distinguishing different stages of thyroid malignancy.
8. Use of a substance for detecting the methylation level of the KATNAL2 gene and a medium storing a mathematical model building method and/or a mathematical model use method in the preparation of a product; the use of the product is at least one of the following:
(1) Distinguishing or assisting in distinguishing between benign thyroid tumors and malignant thyroid tumors;
(2) Distinguishing or assisting in distinguishing between benign thyroid tumors and different subtypes of thyroid malignancy;
(3) Distinguishing or assisting in distinguishing benign thyroid tumors from different stages of thyroid malignancy;
(4) Distinguishing or assisting in distinguishing different subtypes of thyroid malignancy;
(5) Distinguishing or assisting in distinguishing different stages of thyroid malignancy;
the mathematical model is obtained according to a method comprising the following steps:
(A1) Detecting the KATNAL2 gene methylation levels of n1 type A samples and n2 type B samples, respectively;
(A2) Taking the KATNAL2 gene methylation level data of all samples obtained in step (A1), establishing a mathematical model by binary logistic regression according to the classification into type A and type B, and determining a threshold value for the classification decision;
the method of using the mathematical model comprises the following steps:
(B1) Detecting the KATNAL2 gene methylation level of a sample to be tested;
(B2) Substituting the KATNAL2 gene methylation level data of the sample to be tested obtained in step (B1) into the mathematical model to obtain a detection index; then comparing the detection index with the threshold value, and determining from the comparison result whether the sample to be tested is type A or type B;
the type A sample and the type B sample are any one of the following:
(C1) Thyroid benign tumor and thyroid malignant tumor;
(C2) Thyroid benign tumors and thyroid malignant tumors of different subtypes;
(C3) Thyroid benign tumors and thyroid malignant tumors of different stages;
(C4) Thyroid malignancy of different subtypes;
(C5) Thyroid malignancy in different stages.
9. Use of a medium storing a mathematical model building method and/or a mathematical model use method in the preparation of a product; the use of the product is at least one of the following:
(1) Distinguishing or assisting in distinguishing between benign thyroid tumors and malignant thyroid tumors;
(2) Distinguishing or assisting in distinguishing between benign thyroid tumors and different subtypes of thyroid malignancy;
(3) Distinguishing or assisting in distinguishing benign thyroid tumors from different stages of thyroid malignancy;
(4) Distinguishing or assisting in distinguishing different subtypes of thyroid malignancy;
(5) Distinguishing or assisting in distinguishing different stages of thyroid malignancy;
the mathematical model is obtained according to a method comprising the following steps:
(A1) Detecting the KATNAL2 gene methylation levels of n1 type A samples and n2 type B samples, respectively;
(A2) Taking the KATNAL2 gene methylation level data of all samples obtained in step (A1), establishing a mathematical model by binary logistic regression according to the classification into type A and type B, and determining a threshold value for the classification decision;
the method of using the mathematical model comprises the following steps:
(B1) Detecting the KATNAL2 gene methylation level of a sample to be tested;
(B2) Substituting the KATNAL2 gene methylation level data of the sample to be tested obtained in step (B1) into the mathematical model to obtain a detection index; then comparing the detection index with the threshold value, and determining from the comparison result whether the sample to be tested is type A or type B;
the type A sample and the type B sample are any one of the following:
(C1) Thyroid benign tumor and thyroid malignant tumor;
(C2) Thyroid benign tumors and thyroid malignant tumors of different subtypes;
(C3) Thyroid benign tumors and thyroid malignant tumors of different stages;
(C4) Thyroid malignancy of different subtypes;
(C5) Thyroid malignancy in different stages.
10. The use according to any one of claims 6-9, characterized in that: the methylation level of the KATNAL2 gene is the methylation level of all or part of the CpG sites in the fragments shown in the following (E1)-(E4) in the KATNAL2 gene;
the methylated KATNAL2 gene is the KATNAL2 gene methylated at all or part of the CpG sites in the fragments shown in the following (E1)-(E4);
(E1) A DNA fragment shown in SEQ ID No.1 or a DNA fragment having 80% or more identity thereto;
(E2) A DNA fragment shown in SEQ ID No.2 or a DNA fragment having 80% or more identity thereto;
(E3) A DNA fragment shown in SEQ ID No.3 or a DNA fragment having 80% or more identity thereto;
(E4) A DNA fragment shown in SEQ ID No.4 or a DNA fragment having 80% or more identity thereto;
further, the all or part of the CpG sites is any one of the following:
(F1) Any one or more CpG sites in 4 DNA fragments shown as SEQ ID No.1, SEQ ID No.2, SEQ ID No.3 and SEQ ID No.4 in KATNAL2 gene;
(F2) All CpG sites on the DNA fragment shown in SEQ ID No.1 and all CpG sites on the DNA fragment shown in SEQ ID No.2 in the KATNAL2 gene;
(F3) All CpG sites on the DNA fragment shown in SEQ ID No.1 and all CpG sites on the DNA fragment shown in SEQ ID No.3 in the KATNAL2 gene;
(F4) All CpG sites on the DNA fragment shown in SEQ ID No.2 and all CpG sites on the DNA fragment shown in SEQ ID No.3 in the KATNAL2 gene;
(F5) All CpG sites on the DNA fragment shown in SEQ ID No.1, all CpG sites on the DNA fragment shown in SEQ ID No.2 and all CpG sites on the DNA fragment shown in SEQ ID No.3 in the KATNAL2 gene;
(F6) All CpG sites in the DNA fragment shown in SEQ ID No.2 in the KATNAL2 gene or any 15 or any 14 or any 13 or any 12 or any 11 or any 10 or any 9 or any 8 or any 7 or any 6 or any 5 or any 4 or any 3 or any 2 or any 1 CpG sites;
(F7) All or any 4 or any 3 or any 2 or any 1 of the following CpG sites on the DNA fragment shown in SEQ ID No.2 of the KATNAL2 gene:
Item 1: the CpG site at positions 54-55 from the 5' end of the DNA fragment shown in SEQ ID No.2;
Item 2: the CpG sites at positions 144-145 and 148-149 from the 5' end of the DNA fragment shown in SEQ ID No.2;
Item 3: the CpG site at positions 165-166 from the 5' end of the DNA fragment shown in SEQ ID No.2;
Item 4: the CpG sites at positions 178-179 and 188-189 from the 5' end of the DNA fragment shown in SEQ ID No.2;
Item 5: the CpG sites at positions 210-211 and 214-215 from the 5' end of the DNA fragment shown in SEQ ID No.2;
Item 6: the CpG site at positions 221-222 from the 5' end of the DNA fragment shown in SEQ ID No.2;
Item 7: the CpG site at positions 316-317 from the 5' end of the DNA fragment shown in SEQ ID No.2;
or
The substance for detecting the methylation level of the KATNAL2 gene comprises a primer combination for amplifying a full or partial fragment of the KATNAL2 gene;
further, the partial fragment is at least one fragment of:
(G1) A DNA fragment shown in SEQ ID No.1 or a DNA fragment comprising the same;
(G2) A DNA fragment shown in SEQ ID No.2 or a DNA fragment comprising the same;
(G3) A DNA fragment shown in SEQ ID No.3 or a DNA fragment comprising the same;
(G4) A DNA fragment shown in SEQ ID No.4 or a DNA fragment comprising the same;
(G5) A DNA fragment having an identity of 80% or more to the DNA fragment shown in SEQ ID No.1 or a DNA fragment comprising the same;
(G6) A DNA fragment having an identity of 80% or more to the DNA fragment shown in SEQ ID No.2 or a DNA fragment comprising the same;
(G7) A DNA fragment having an identity of 80% or more to the DNA fragment shown in SEQ ID No.3 or a DNA fragment comprising the same;
(G8) A DNA fragment having an identity of 80% or more to the DNA fragment shown in SEQ ID No.4 or a DNA fragment comprising the same;
still further, the primer combination is primer pair A and/or primer pair B and/or primer pair C and/or primer pair D;
the primer pair A consists of a primer A1 and a primer A2; the primer A1 is the single-stranded DNA shown in SEQ ID No.5 or in nucleotides 11-35 of SEQ ID No.5; the primer A2 is the single-stranded DNA shown in SEQ ID No.6 or in nucleotides 32-56 of SEQ ID No.6;
the primer pair B consists of a primer B1 and a primer B2; the primer B1 is the single-stranded DNA shown in SEQ ID No.7 or in nucleotides 11-35 of SEQ ID No.7; the primer B2 is the single-stranded DNA shown in SEQ ID No.8 or in nucleotides 32-56 of SEQ ID No.8;
the primer pair C consists of a primer C1 and a primer C2; the primer C1 is the single-stranded DNA shown in SEQ ID No.9 or in nucleotides 11-35 of SEQ ID No.9; the primer C2 is the single-stranded DNA shown in SEQ ID No.10 or in nucleotides 32-56 of SEQ ID No.10;
the primer pair D consists of a primer D1 and a primer D2; the primer D1 is the single-stranded DNA shown in SEQ ID No.11 or in nucleotides 11-35 of SEQ ID No.11; the primer D2 is the single-stranded DNA shown in SEQ ID No.12 or in nucleotides 32-56 of SEQ ID No.12.
CN202311679586.4A 2022-12-09 2023-12-08 Data processing device for assisting in distinguishing benign and malignant thyroid tumors Pending CN117690493A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211576528.4A CN115786522A (en) 2022-12-09 2022-12-09 Methylation molecular marker for diagnosing thyroid cancer
CN2022115765284 2022-12-09

Publications (1)

Publication Number Publication Date
CN117690493A true CN117690493A (en) 2024-03-12

Family

ID=85418112

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202211576528.4A Pending CN115786522A (en) 2022-12-09 2022-12-09 Methylation molecular marker for diagnosing thyroid cancer
CN202311679586.4A Pending CN117690493A (en) 2022-12-09 2023-12-08 Data processing device for assisting in distinguishing benign and malignant thyroid tumors

Country Status (1)

Country Link
CN (2) CN115786522A (en)

Also Published As

Publication number Publication date
CN115786522A (en) 2023-03-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination