CN117344011A

CN117344011A - Methylation biomarker for diagnosing gastric cancer, kit and application

Info

Publication number: CN117344011A
Application number: CN202210785067.5A
Authority: CN
Inventors: 马威锋; 许林浩; 肖芳; 陈思宇; 胡倩; 王军; 陈志伟
Original assignee: AnchorDx Medical Co Ltd
Current assignee: AnchorDx Medical Co Ltd
Priority date: 2022-06-29
Filing date: 2022-06-29
Publication date: 2024-01-05

Abstract

The invention discloses a methylation biomarker for diagnosing gastric cancer, together with a kit and uses thereof. The methylation biomarker comprises any one or any combination of the 65 differential methylation regions listed in Table 6, for example chr10:90343168-90343288, chr5:92907913-92908009 and chr15:96895487-96895582. The invention evaluates the degree of differential methylation from the combined signals of a plurality of methylation regions, which overcomes the insufficient signal of a single DNA methylation region and improves the sensitivity and specificity of detection, thereby providing a more effective auxiliary test for applications such as gastric cancer detection. Each methylation region provided by the invention also shows good sensitivity and specificity when used alone as a gastric cancer detection marker.

Description

Methylation biomarker for diagnosing gastric cancer, kit and application

Technical Field

The invention belongs to the field of biotechnology and relates to a methylation biomarker for diagnosing gastric cancer, a kit and uses thereof, and more particularly to a methylation marker based on plasma cfDNA for assisting the clinical detection of gastric cancer.

Background

Gastric cancer (gastric carcinoma, GC) is one of the most common tumors of the digestive tract and seriously threatens human life and health. Early diagnosis and early treatment are important for improving the efficacy of gastric cancer treatment and reducing mortality. At present, the common screening methods for stomach tumors are mainly the following: 1) Gastroscopy: a special tube is passed from the oral cavity through the esophagus into the gastric cavity so that changes of the gastric mucosa at each site, such as congestion, edema, atrophy, ulcer, bleeding, inflammation and tumor, can be observed directly with the naked eye. When gastroscopy finds a suspicious lesion in the esophagus, stomach or duodenum, the biopsy forceps at the front end of the gastroscope can be used to take living tissue for pathological examination, so that the nature of the lesion can be determined and early gastric cancer can be found in time. 2) Pepsinogen (PG) assay: PG is the inactive precursor of pepsin. According to its biochemical and immunological characteristics, PG can be divided into two subtypes, PG I and PG II. PG I is secreted mainly by the chief cells and mucous neck cells of the glands of the gastric body and fundus, while PG II is secreted by the pyloric glands of the antrum and by Brunner's glands of the proximal duodenum. PG is a good indicator of the exocrine function of the gastric antral mucosa and has been called a "serological biopsy"; however, this method has limited detection sensitivity and specificity. 3) X-ray barium meal examination: the subject drinks a barium agent that is opaque to X-rays, so that the agent coats the surface of the gastric mucosa. The barium is clearly displayed on the X-ray film, which shows the shape of the esophagus and stomach and indirectly reflects whether the gastric mucosa has lesions. When a tumor is present, a filling defect appears at the site covered by the barium.

However, the above methods have the following problems: the sensitivity and specificity of tumor markers are generally not high and are strongly affected by other physiological or individual conditions, while invasive tests harm the patient's body and reduce patient compliance. These shortcomings limit the clinical value of these methods in early screening for gastric cancer. For example, imaging methods involve some radiation exposure and depend on the equipment, and double-contrast (air-barium) X-ray radiography depends heavily on operator experience, so the diagnosis rate for early gastric cancer is low. Gastroscopy is the gold standard for early detection of gastric cancer, but it is invasive and patient compliance is low. Gastroscopy also carries complications, the more serious of which include cardiopulmonary events, severe bleeding, perforation and infection. The pharyngeal irritation and abdominal distension felt during gastroscopy inevitably deter some patients, and gastroscopy is not suitable for seriously ill patients, such as those with massive bleeding, shock, heart failure, unconsciousness or senile frailty.

In view of the current demand for early diagnosis and early screening of cancers, blood-based multi-cancer and even pan-cancer testing is a key means of prevention and diagnosis. Compared with traditional methods, blood-based liquid biopsy for gastric cancer has the following advantages: 1) Noninvasive: no invasive examination such as endoscopy is needed, and patient compliance is high; 2) Convenient: only a simple blood draw is required, no large instruments or equipment are needed, and the test is easy to roll out across the population; 3) Good performance: compared with other tumor marker tests, the overall sensitivity and specificity of molecular detection are relatively high; 4) Objective: the result does not depend on the subjective judgment of an individual practitioner and is easy to read; 5) Multi-cancer capability: joint detection of multiple cancer types can be achieved, which avoids an excessive accumulation of false negative or false positive results and relieves some medical anxiety.

Liquid biopsy projects for early screening of gastric cancer are also under development, such as the blood methylation and qPCR based platform offered by Boerchen (Beijing) technologies, inc. The sensitivity of that company's product for detecting gastric cancer is 61.76% and its specificity is 85.07%, which is better than conventional detection methods in general. In clinical application, however, higher sensitivity and specificity reduce misdiagnosis and missed diagnosis, so further improvement of detection performance is necessary to assist gastric cancer detection.

The applicant adopts a dedicated signal detection platform based on blood methylation signals that combines multiplex PCR (multiplex polymerase chain reaction, mPCR) with next generation sequencing (NGS). The methylation signals of the DNA molecules are amplified first and then analyzed by NGS, which improves the signal-to-noise ratio of the methylation signals. Compared with a qPCR platform, the applicant's platform can obtain the specific methylation pattern of each DNA molecule at single-base resolution, which provides more information for the subsequent bioinformatic analysis, and the final results show better sensitivity and specificity. Markers and marker combinations with high sensitivity and specificity for diagnosing gastric cancer are thereby obtained to assist the clinical detection and diagnosis of gastric cancer.

Disclosure of Invention

Problems to be solved by the invention

In view of the problems of existing clinical gastric cancer detection, such as invasiveness, poor patient compliance, low sensitivity and low specificity, the invention develops a marker for diagnosing gastric cancer, in particular a blood-based methylation marker. The marker can be detected with an mPCR+NGS platform or method, which further enhances the signal difference of the marker.

Solution for solving the problem

In a first aspect of the invention, a methylation biomarker for diagnosing gastric cancer or aiding its clinical detection is provided. Specifically, the invention provides a methylation biomarker for diagnosing gastric cancer, wherein the methylation biomarker comprises any one or any combination of the following differential methylation regions:

chr8:70984561-70984656, chr4:13536950-13537044, chr8:99961207-99961337, chr10:110225862-110225977, chr4:13526663-13526744, chr13:46960844-46960969, chr2:213401681-213401784, chr4:154713804-154713923, chr5:92907913-92908009, chr7:96651472-96651564, chr4:13524667-13524790, chr11:125037244-125037331, chr4:144620968-144621093, chr17:66596160-66596269, chr8:26722886-26723016, chr8:143532011-143532108, chr8:24770917-24771026, chr7:139333352-139333459, chr6:40996087-40996182, chr5:172673057-172673162, chr1:47899550-47899648, chr7:27253040-27253164, chr6:133561659-133561777, chr8:99962607-99962690, chr5:140871302-140871422, chr8:10588845-10588971, chr8:132053717-132053827, chr10:28030375-28030474, chr7:150038594-150038715, chr13:36049329-36049419, chr15:96895487-96895582, chr12:115109503-115109616, chr8:132054559-132054681, chr8:54164381-54164463, chr4:13532565-13532660, chr11:61544752-61544867, chr9:91605729-91605817, chr17:46673962-46674085, chr14:103512082-103512178, chr8:132052701-132052859, chr5:134376436-134376563, chr7:1094829-1094913, chr14:99697596-99697701, chr15:96887526-96887616, chr7:27196279-27196357, chr3:138665164-138665243, chr10:90343168-90343288, chr2:154335349-154335456, chr2:176972076-176972174, chr5:112073351-112073434, chr6:31830674-31830762, chr2:177053257-177053351, chr22:25961206-25961295, chr10:85955208-85955302, chr16:82661134-82661236, chr20:2781337-2781445, chr5:475141-475226, chr15:48010834-48010923, chr2:176994614-176994699, chr9:23822006-23822117, chr9:126771376-126771460, chr2:200328933-200329042, chr4:111544276-111544375, chr3:152553579-152553708, chr5:32714407-32714525.

In some embodiments, the differential methylation region comprises at least one of chr10:90343168-90343288, chr5:92907913-92908009 and chr15:96895487-96895582.

In some specific embodiments, the differential methylation region further comprises at least one of chr5:112073351-112073434, chr4:13524667-13524790;

optionally, the differential methylation region further comprises at least one of chr4:13526663-13526744, chr7:27253040-27253164, chr8:132054559-132054681, chr8:99962607-99962690, chr6:133561659-133561777;

optionally, the differential methylation region further comprises at least one of chr8:10588845-10588971, chr17:66596160-66596269, chr8:132053717-132053827, chr14:99697596-99697701, chr9:91605729-91605817, chr5:140871302-140871422, chr5:134376436-134376563, chr1:47899550-47899648, chr8:143532011-143532108, chr9:126771376-126771460;

optionally, the differential methylation region further comprises at least one of chr17:46673962-46674085, chr2:176994614-176994699, chr7:1094829-1094913, chr4:144620968-144621093, chr13:46960844-46960969, chr2:213401681-213401784, chr7:139333352-139333459, chr15:96887526-96887616, chr5:32714407-32714525, chr20:2781337-2781445, chr4:13532565-13532660, chr14:103512082-103512178, chr12:115109503-115109616, chr7:96651472-96651564, chr10:85955208-85955302, chr2:177053257-177053351, chr22:25961206-25961295, chr5:112073351-112073434, chr6:31830674-31830762, chr7:27196279-27196357, chr11:61544752-61544867, chr8:99961207-99961337, chr7:150038594-150038715, chr6:40996087-40996182, chr4:13536950-13537044, chr13:36049329-36049419, chr3:152553579-152553708, chr5:172673057-172673162, chr10:110225862-110225977, chr15:48010834-48010923.

In other specific embodiments, the differential methylation region further comprises at least one of chr9:126771376-126771460, chr6:133561659-133561777;

optionally, the differential methylation region further comprises at least one of chr8:10588845-10588971, chr5:112073351-112073434, chr8:70984561-70984656, chr7:27253040-27253164, chr7:27196279-27196357;

optionally, the differential methylation region further comprises at least one of chr11:125037244-125037331, chr5:140871302-140871422, chr14:103512082-103512178, chr2:213401681-213401784, chr20:2781337-2781445, chr12:115109503-115109616, chr17:46673962-46674085, chr7:150038594-150038715, chr8:132053717-132053827, chr5:32714407-32714525;

optionally, the differential methylation region further comprises at least one of chr5:134376436-134376563, chr8:132054559-132054681, chr3:138665164-138665243, chr4:144620968-144621093, chr8:99961207-99961337, chr13:36049329-36049419, chr7:139333352-139333459, chr10:28030375-28030474, chr4:13526663-13526744, chr8:24770917-24771026, chr4:13536950-13537044, chr3:152553579-152553708, chr2:177053257-177053351, chr10:85955208-85955302, chr5:112073351-112073434, chr22:25961206-25961295, chr9:91605729-91605817, chr8:143532011-143532108, chr2:176972076-176972174, chr10:110225862-110225977, chr8:99962607-99962690, chr6:40996087-40996182, chr11:61544752-61544867, chr8:132052701-132052859, chr15:96887526-96887616, chr9:23822006-23822117, chr5:172673057-172673162, chr4:13532565-13532660, chr1:47899550-47899648, chr17:66596160-66596269.

In some more specific embodiments, the differential methylation region comprises all 65 of the differential methylation regions listed above, from chr8:70984561-70984656 through chr5:32714407-32714525.

In some embodiments, the gastric cancer is selected from stage I, II, III, or IV gastric cancer.

In some embodiments, the gastric cancer is gastric cancer from a subject, the subject being a mammal; preferably, the mammal is a human.

In some embodiments, a difference in the methylation level of the methylation biomarker in the test sample relative to the methylation level of the methylation biomarker in a sample of a subject not having gastric cancer indicates the presence of gastric cancer in the subject to which the test sample corresponds.

In some embodiments, the differential methylation region is a differential methylation region present in cfDNA.
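The regions of the first aspect are written in the form chr:start-end. As an illustration only, the following Python sketch shows how such strings could be parsed and grouped by chromosome for downstream analysis. The `Region` type and the `parse_region` and `group_by_chromosome` helpers are hypothetical names introduced here, not part of the invention. The genome build and the coordinate convention are not stated in this section, so the sketch reports only the raw coordinate difference (end minus start).

```python
import re
from collections import defaultdict
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int
    end: int

    @property
    def span(self) -> int:
        # Raw coordinate difference; whether the end is inclusive is not stated in the text.
        return self.end - self.start


_REGION_RE = re.compile(r"^(chr[0-9XYM]+):\s*(\d+)-(\d+)$")


def parse_region(text: str) -> Region:
    """Parse a 'chr:start-end' string; tolerates a stray space after the colon."""
    match = _REGION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a region string: {text!r}")
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if start >= end:
        raise ValueError(f"start must be smaller than end: {text!r}")
    return Region(chrom, start, end)


def group_by_chromosome(regions):
    """Group regions by chromosome name, sorted by start coordinate."""
    grouped = defaultdict(list)
    for region in regions:
        grouped[region.chrom].append(region)
    return {chrom: sorted(items, key=lambda r: r.start) for chrom, items in grouped.items()}


# Three regions named in the first aspect.
core = [parse_region(s) for s in (
    "chr10:90343168-90343288",
    "chr5:92907913-92908009",
    "chr15: 96895487-96895582",
)]
by_chrom = group_by_chromosome(core)
```

Keeping the regions as structured tuples rather than strings makes it straightforward to look up per-region methylation values later.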

In a second aspect of the present invention, there is provided any one of the following (i) to (ii):

(i) Use of a methylation biomarker as described in the first aspect of the invention in the manufacture of a reagent or kit for diagnosing gastric cancer;

(ii) Use of a reagent for determining the methylation level of a methylation biomarker as described in the first aspect of the invention in the manufacture of a reagent or kit for diagnosing gastric cancer.

In a third aspect of the invention, there is provided a kit for diagnosing gastric cancer, wherein the kit comprises reagents for detecting the methylation level of a methylation biomarker according to the first aspect of the invention in a sample to be tested.

In some embodiments, the kit detects the methylation level of the methylation biomarker described in the first aspect of the invention using one or more of polymerase chain reaction techniques, in situ hybridization techniques, enzymatic mutation detection techniques, chemical cleavage mismatch techniques, mass spectrometry techniques, and genetic sequencing techniques.

In some specific embodiments, the reagents in the kit comprise primers and/or probes; wherein the primer amplifies a sequence comprising a methylation biomarker as described in the first aspect of the invention; the probe hybridizes at least in part to a sequence of a methylation biomarker described in the first aspect of the invention.

In some more specific embodiments, the reagent comprises at least one primer pair selected from the group consisting of:

primers shown as SEQ ID NO. 1-2; primers shown as SEQ ID NO. 3-4; primers shown as SEQ ID NO. 5-6; primers shown as SEQ ID NO. 7-8; primers shown as SEQ ID NO. 9-10; primers shown as SEQ ID NO. 11-12; primers shown as SEQ ID NO. 13-14; primers shown as SEQ ID NO. 15-16; primers shown as SEQ ID NO. 17-18; primers shown as SEQ ID NO. 19-20; primers shown as SEQ ID NO. 21-22; primers shown as SEQ ID NO. 23-24; primers shown as SEQ ID NO. 25-26; primers shown as SEQ ID NO. 27-28; primers shown as SEQ ID NO. 29-30; primers shown as SEQ ID NO. 31-32; primers shown as SEQ ID NO. 33-34; primers shown as SEQ ID NO. 35-36; primers shown as SEQ ID NO. 37-38; primers shown as SEQ ID NO. 39-40; primers shown as SEQ ID NO. 41-42; primers shown as SEQ ID NO. 43-44; primers shown as SEQ ID NO. 45-46; primers shown as SEQ ID NO. 47-48; primers shown as SEQ ID NO. 49-50; primers shown as SEQ ID NO. 51-52; primers shown as SEQ ID NO. 53-54; primers shown as SEQ ID NO. 55-56; primers shown as SEQ ID NO. 57-58; primers shown as SEQ ID NO. 59-60; primers shown as SEQ ID NO. 61-62; primers shown as SEQ ID NO. 63-64; primers shown as SEQ ID NO. 65-66; primers shown as SEQ ID NO. 67-68; primers shown as SEQ ID NO. 69-70; primers shown as SEQ ID NO. 71-72; primers shown as SEQ ID NO. 73-74; primers shown as SEQ ID NO. 75-76; primers shown as SEQ ID NO. 77-78; primers shown as SEQ ID NO. 79-80; primers shown as SEQ ID NO. 81-82; primers shown as SEQ ID NO. 83-84; primers shown as SEQ ID NO. 85-86; primers shown as SEQ ID NO. 87-88; primers shown as SEQ ID NO. 89-90; primers shown as SEQ ID NO. 91-92; primers shown as SEQ ID NO. 93-94; primers shown as SEQ ID NO. 95-96; primers shown as SEQ ID NO. 97-98; primers shown as SEQ ID NO. 99-100; primers shown as SEQ ID NO. 101-102; primers shown as SEQ ID NO. 103-104; primers shown as SEQ ID NO. 105-106; primers shown as SEQ ID NO. 107-108; primers shown as SEQ ID NO. 109-110; primers shown as SEQ ID NO. 111-112; primers shown as SEQ ID NO. 113-114; primers shown as SEQ ID NO. 115-116; primers shown as SEQ ID NO. 117-118; primers shown as SEQ ID NO. 119-120; primers shown as SEQ ID NO. 121-122; primers shown as SEQ ID NO. 123-124; primers shown as SEQ ID NO. 125-126; primers shown as SEQ ID NO. 127-128; primers shown as SEQ ID NO. 129-130.

In some embodiments, the sample to be tested is selected from one or more of tissue, whole blood, plasma, serum, pleural effusion, ascites, amniotic fluid, saliva, bone marrow, exfoliated urine cells, urine sediment and urine supernatant; preferably, the sample to be tested is whole blood, plasma or serum.

In a fourth aspect of the present invention, there is provided a system for diagnosing gastric cancer, wherein the system comprises a detection device, a computing device and an output device;

the detection device comprises a sampler for collecting a sample from a subject and a detector for detecting the methylation level of the above-mentioned methylation biomarker in the sample;

the computing device includes a memory having a computer program stored therein and a processor configured to execute the computer program stored in the memory to perform the following discrimination:

when the methylation level of the methylation biomarker described in the first aspect of the invention in the sample differs from the methylation level of the methylation biomarker measured in a sample of a subject not suffering from gastric cancer, the presence of gastric cancer in the subject to which the sample corresponds is determined.
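The discrimination performed by the computing device can be illustrated with a minimal Python sketch. This section does not specify how a "difference" in methylation level is quantified, so the z-score rule, the `z_cutoff` value and the function name `differs_from_reference` are assumptions made purely for illustration, and the methylation fractions are made-up numbers.

```python
from statistics import mean, stdev


def differs_from_reference(sample_level, reference_levels, z_cutoff=3.0):
    """Return True when the sample level lies more than z_cutoff reference
    standard deviations away from the reference mean (hypothetical rule)."""
    mu = mean(reference_levels)
    sigma = stdev(reference_levels)
    if sigma == 0:
        return sample_level != mu
    return abs(sample_level - mu) / sigma > z_cutoff


# Made-up methylation fractions (0..1) for one region in samples from
# subjects without gastric cancer.
reference = [0.02, 0.03, 0.01, 0.04, 0.02, 0.03]

flag_high = differs_from_reference(0.35, reference)    # far above the reference range
flag_normal = differs_from_reference(0.03, reference)  # within the reference range
```

In practice a system of this kind would more likely combine several regions in a trained model than apply a single-region cutoff.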

In a fifth aspect of the invention there is provided the use of at least one set of primers selected from the group consisting of the following combinations of primers for detecting the degree of methylation of a methylation biomarker according to the first aspect of the invention in the manufacture of a reagent or kit for diagnosing gastric cancer:

Advantageous effects of the invention

Compared with the prior art, the invention has the following beneficial effects:

The inventors found that 65 DNA methylation regions, including chr10:90343168-90343288, chr15:96895487-96895582, chr20:2781337-2781445, chr4:13526663-13526744, chr5:92907913-92908009, chr5:112073351-112073434, chr5:134376436-134376563, chr5:140871302-140871422, chr6:133561659-133561777, chr7:27253040-27253164, chr8:10588845-10588971, chr8:24770917-24771026, chr8:99962607-99962690, chr8:132053717-132053827 and chr8:132054559-132054681, can be detected in different combinations and used to build a prediction model, for example by random forest or logistic regression. This was found to greatly improve the accuracy of detecting gastric cancer lesions, so the regions can serve as markers for assisting gastric cancer detection. The invention evaluates the degree of differential methylation from the combined signals of a plurality of methylation regions, which overcomes the insufficient signal of a single DNA methylation region and improves the sensitivity and specificity of detection, thereby providing a more effective auxiliary test for applications such as gastric cancer detection. Each methylation region provided by the invention also shows good sensitivity and specificity when used alone as a gastric cancer detection marker. The invention also adds samples of benign lesions and of several other high-incidence cancers to the control group, so that gastric cancer can be distinguished from benign gastric lesions and from patients with other, non-gastric tumors, thereby ensuring the specificity of gastric cancer detection.

By combining its biomarker screening strategy with multiplex methylation-specific amplification and NGS sequencing, the invention achieves several main technical effects. Compared with capture-based methods, the amplicon method has great advantages in overall cost and turnaround time. Because the selected biomarkers are specifically amplified in multiplex methylation-specific amplification, the signal-to-noise ratio is improved, so a relatively small panel can achieve an effect that would require a larger panel with a capture-based approach. This greatly reduces sequencing cost, improves the signal-to-noise ratio of the analysis, and makes the overall data analysis and model building more stable and easier to generalize.
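The text names random forest and logistic regression as ways of building a prediction model from several region signals. The following Python sketch illustrates the general idea with a plain logistic regression fitted by batch gradient descent. It is not the applicant's model: the training data are synthetic, the three features stand in for three arbitrary regions, and the function names, learning rate and epoch count are assumptions chosen for the example.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Predicted probability of the positive (cancer) class for one sample."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Synthetic toy data: 3 regions per sample; values are invented methylation fractions.
rng = random.Random(0)
healthy = [[rng.uniform(0.0, 0.2) for _ in range(3)] for _ in range(40)]
cancer = [[rng.uniform(0.3, 0.8) for _ in range(3)] for _ in range(40)]
X = healthy + cancer
y = [0] * 40 + [1] * 40

w, b = fit_logistic(X, y)
score_low = predict_proba(w, b, [0.05, 0.10, 0.05])
score_high = predict_proba(w, b, [0.60, 0.70, 0.50])
```

Combining regions in one score is what lets a weak signal in any single region be compensated by the others, which is the benefit the paragraph above describes.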

Drawings

Fig. 1: Schematic heat map clustering of 77 healthy human samples and 77 gastric cancer samples. In Fig. 1, the vertical axis represents the 65 differential methylation region (marker) features, and the horizontal axis represents all 154 samples (tumor (gastric cancer) samples and healthy human samples). The two groups of samples were clustered using the methylation degree features of the 65 markers.

Fig. 2: Schematic representation of the ROC curve for the 5-marker set in Example 2.

Fig. 3: Schematic representation of the ROC curve for the 10-marker set in Example 2.

Fig. 4: Schematic representation of the ROC curve for the 20-marker set in Example 2.

Fig. 5: Schematic representation of the ROC curve for the 50-marker set in Example 2.

Fig. 6: Schematic representation of the ROC curve for the 65-marker set containing all markers in Example 2.

Fig. 7: Schematic representation of the ROC curve for the 5-marker set in Example 4.

Fig. 8: Schematic representation of the ROC curve for the 10-marker set in Example 4.

Fig. 9: Schematic representation of the ROC curve for the 20-marker set in Example 4.

Fig. 10: Schematic representation of the ROC curve for the 50-marker set in Example 4.

Fig. 11: Schematic representation of the ROC curve for the 65-marker set containing all markers in Example 4.

Fig. 12: Schematic representation of the ROC curve for the 3-marker set in Example 5.
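An ROC curve of the kind shown in Figs. 2 to 12 plots the true positive rate (sensitivity) against the false positive rate (1 minus specificity) as the decision threshold is varied. The following Python sketch shows how such a curve and its area could be computed from model scores. The scores and labels are invented for illustration and are not data from the examples, and the function names are hypothetical.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs,
    one per distinct score threshold, starting from (0, 0)."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Invented model scores; label 1 = gastric cancer, 0 = control.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

curve = roc_points(scores, labels)
area = auc(curve)
```

Each point on the curve corresponds to one sensitivity and specificity pair, so a single reported pair such as those quoted in the Background is one operating point on such a curve.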

Detailed Description

The following describes the present invention in detail. The following description of the technical features is based on the representative embodiments and specific examples of the present invention, but the present invention is not limited to these embodiments and specific examples. It should be noted that:

In the present specification, a numerical range expressed as "numerical value A to numerical value B" means a range that includes the endpoint values A and B.

In the present specification, the use of "substantially" or "essentially" means that the standard deviation from the theoretical model or theoretical data is within 5%, preferably 3%, more preferably 1%.

In the present specification, "may" covers both the case where a certain process is performed and the case where it is not performed.

In this specification, "optional" or "optionally" means that the subsequently described event or circumstance may or may not occur, and that the description includes instances where the event occurs and instances where it does not.

Reference throughout this specification to "some specific/preferred embodiments," "other specific/preferred embodiments," "an embodiment," and so forth, means that a particular element (e.g., feature, structure, property, and/or characteristic) described in connection with the embodiment is included in at least one embodiment described herein, and may or may not be present in other embodiments. In addition, it is to be understood that the elements may be combined in any suitable manner in the various embodiments.

The terms "comprising" and "having", and any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, apparatus, article or device that comprises a list of steps or modules is not limited to the listed steps or modules, but may optionally include additional steps or modules that are not listed or that are inherent to such a process, method, apparatus, article or device.

In the present invention, the term "plurality" means two or more. "And/or" describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" may mean that A exists alone, that A and B exist together, or that B exists alone. The character "/" generally indicates an "or" relationship between the objects before and after it.

In the present specification, the term "cancer" (also referred to as carcinoma) generally refers to any type of malignant neoplasm, i.e. any morphological and/or physiological change (based on genetic reprogramming) of a target cell that shows or has a tendency to develop a characteristic of cancer as compared to an unaffected (healthy) wild-type control cell. Examples of such changes may relate to cell size and shape (becoming larger or smaller), cell proliferation (increasing cell number), cell differentiation (changing physiological state), apoptosis (programmed cell death) or cell survival.

In the present specification, the term "gastric cancer" is used in the broadest sense and refers to all cancers that begin in the stomach. It includes the following subtypes that begin in the stomach: adenocarcinomas, lymphomas, gastrointestinal stromal tumors (GIST), carcinoid tumors, squamous cell carcinomas, small cell carcinomas and leiomyosarcomas. It also includes the following stages (as defined by the corresponding TNM classification in brackets): stage 0 (Tis, N0, M0), stage IA (T1, N0, M0), stage IB (T1, N1, M0; or T2, N0, M0), stage IIA (T1, N2, M0; T2, N1, M0; or T3, N0, M0), stage IIB (T1, N3, M0; T2, N2, M0; T3, N1, M0; or T4a, N0, M0), stage IIIA (T2, N3, M0; T3, N2, M0; or T4a, N1, M0), stage IIIB (T3, N3, M0; T4a, N2, M0; or T4b, N0 or N1, M0), stage IIIC (T4a, N3, M0; or T4b, N2 or N3, M0), and stage IV (any T, any N, M1).

The stages of gastric cancer include, but are not strictly limited to, the following:

Stage I: in stage I, cancer has formed in the mucosa (innermost layer) of the stomach wall. Stage I is divided into stage IA and stage IB according to the location to which the cancer has spread.

Stage IA: the cancer may have spread into the submucosa (the layer of tissue next to the mucosa) of the stomach wall.

Stage IB: the cancer may have spread to the submucosa of the stomach wall (the tissue layer next to the mucosa) and is found in 1 or 2 lymph nodes near the tumor; or has spread to the muscle layer of the stomach wall.

Stage II: stage II gastric cancer is divided into stage IIA and stage IIB according to the location where the cancer has spread.

Stage IIA: the cancer has spread to the subserosa of the stomach wall (the tissue layer next to the serosa); or has spread to the muscle layer of the stomach wall and is found in 1 or 2 lymph nodes near the tumor; or may have spread to the submucosa of the stomach wall (the tissue layer next to the mucosa) and is found in 3 to 6 lymph nodes near the tumor.

Stage IIB: the cancer has spread to the serosa (outermost layer) of the stomach wall; or has spread to the subserosa of the stomach wall (the tissue layer next to the serosa) and is found in 1 or 2 lymph nodes near the tumor; or has spread to the muscle layer of the stomach wall and is found in 3 to 6 lymph nodes near the tumor; or may have spread to the submucosa of the stomach wall (the tissue layer next to the mucosa) and is found in 7 or more lymph nodes near the tumor.

Stage III: stage III stomach cancer is divided into stage IIIA, stage IIIB, and stage IIIC according to the location where the cancer has spread.

Stage IIIA: the cancer has spread to the serosa (outermost layer) of the stomach wall and is found in 1 or 2 lymph nodes near the tumor; or has spread to the subserosa of the stomach wall (the tissue layer next to the serosa) and is found in 3 to 6 lymph nodes near the tumor; or has spread to the muscle layer of the stomach wall and is found in 7 or more lymph nodes near the tumor.

Stage IIIB: the cancer has spread to adjacent organs such as the spleen, transverse colon, liver, diaphragm, pancreas, kidney, adrenal gland or small intestine, and may be found in 1 or 2 lymph nodes near the tumor; or has spread to the serosa (outermost layer) of the stomach wall and is found in 3 to 6 lymph nodes near the tumor; or has spread to the subserosa of the stomach wall (the tissue layer next to the serosa) and is found in 7 or more lymph nodes near the tumor.

Stage IIIC: the cancer has spread to adjacent organs such as the spleen, transverse colon, liver, diaphragm, pancreas, kidney, adrenal gland or small intestine, and may be found in 3 or more lymph nodes near the tumor; or has spread to the serosa (outermost layer) of the stomach wall and is found in 7 or more lymph nodes near the tumor.

Stage IV: in stage IV, the cancer has spread to distant sites of the body.
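The stage descriptions above can be read as a lookup from depth of invasion and number of involved lymph nodes to a stage. The following Python sketch encodes that reading as a table. The depth labels, the `node_category` bins and the `stage` function are hypothetical names introduced for illustration, and treating spread to adjacent organs with no involved nodes as stage IIIB follows the "T4b, N0 or N1" grouping given with the TNM classification above. It is a sketch of the descriptions in this text, not a clinical staging tool.

```python
# Rows: deepest layer reached. Columns: number of involved nearby lymph nodes.
STAGE_TABLE = {
    "submucosa":       {"0": "IA",   "1-2": "IB",   "3-6": "IIA",  "7+": "IIB"},
    "muscle":          {"0": "IB",   "1-2": "IIA",  "3-6": "IIB",  "7+": "IIIA"},
    "subserosa":       {"0": "IIA",  "1-2": "IIB",  "3-6": "IIIA", "7+": "IIIB"},
    "serosa":          {"0": "IIB",  "1-2": "IIIA", "3-6": "IIIB", "7+": "IIIC"},
    "adjacent_organs": {"0": "IIIB", "1-2": "IIIB", "3-6": "IIIC", "7+": "IIIC"},
}


def node_category(count):
    """Bin a lymph node count into the ranges used in the stage descriptions."""
    if count < 0:
        raise ValueError("node count cannot be negative")
    if count == 0:
        return "0"
    if count <= 2:
        return "1-2"
    if count <= 6:
        return "3-6"
    return "7+"


def stage(depth, involved_nodes, distant_spread=False):
    """Return the stage label for a depth of invasion and node count."""
    if distant_spread:
        return "IV"  # spread to distant sites overrides depth and node count
    return STAGE_TABLE[depth][node_category(involved_nodes)]


examples = [
    stage("submucosa", 0),
    stage("muscle", 8),
    stage("serosa", 1),
    stage("adjacent_organs", 4),
    stage("serosa", 0, distant_spread=True),
]
```

Laid out as a grid, the table makes the pattern in the prose visible: moving one layer deeper or one node category higher generally advances the stage by one step.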

In this specification, the term "sample" refers to any substance, including biological samples, that may contain a target molecule for which analysis is desired. As used herein, "sample" or "biological sample" refers to any sample obtained from a living or viral (or prion) source or other macromolecular and biomolecular source, and includes any cell type or tissue of a subject from which nucleic acids, proteins, and/or other macromolecules may be obtained. The sample or biological sample may be a sample obtained directly from a biological source or a sample that is processed. Samples or biological samples include, but are not limited to, body fluids (e.g., whole blood, plasma, serum, cerebral spinal fluid, synovial fluid, urine, sweat, semen, stool, sputum, tears, mucus, amniotic fluid, or the like), exudates, bone marrow samples, ascites, pelvic rinse, pleural fluid, spinal fluid, lymph fluid, eye fluid, extracts of nasal, laryngeal or genital swabs, cell suspensions of digestive tissue, or extracts of fecal matter, and tissue and organ samples from humans, animals (e.g., non-human mammals) and plants, and processed samples derived therefrom.

In this specification, the term "subject" may be a mammal or a cell, tissue, organ or part of said mammal. In the present invention, mammal means any kind of mammal, preferably a human (including a human, a human subject or a human patient). Subjects and mammals include, but are not limited to, farm animals, sports animals, pets, primates, horses, dogs, cats, and rodents such as mice and rats.

In this specification, diagnosis includes detection or identification of a disease state or condition in a subject, determining the likelihood that a subject will have a given disease or condition, determining the likelihood that a subject with a disease or condition will respond to treatment, determining the prognosis (or the likely progression or regression thereof) of a subject with a disease or condition, and determining the effect of treatment on a subject with a disease or condition.

The terms "complementary" and "complementarity" refer to nucleotides (e.g., 1 nucleotide) or polynucleotides (e.g., sequences of nucleotides) related by the base pairing rules. For example, the sequence 5'-A-G-T-3' is complementary to the sequence 3'-T-C-A-5'. Complementarity may be "partial", in which only some of the nucleobases are matched according to the base pairing rules. Alternatively, there may be "complete" or "total" complementarity between the nucleic acids. The degree of complementarity between nucleic acid strands affects the efficiency and strength of hybridization between them. This is particularly important in amplification reactions and detection methods that rely on binding between nucleic acids.
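The base pairing rules described above can be illustrated with a short sketch. It assumes standard Watson-Crick pairing for DNA, strands of equal length, and no gaps; the function names are hypothetical and chosen only for illustration.

```python
# Watson-Crick base pairing rules for DNA (assumption: A/T and G/C pairs only).
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq_5to3):
    """Return the fully complementary strand, also written 5'->3'."""
    return "".join(PAIR[base] for base in reversed(seq_5to3.upper()))


def complementarity(a_5to3, b_5to3):
    """Fraction of positions that pair when b is aligned antiparallel to a.

    1.0 corresponds to "complete" complementarity; values between 0 and 1
    correspond to "partial" complementarity.
    """
    if not a_5to3 or len(a_5to3) != len(b_5to3):
        raise ValueError("this sketch compares non-empty strands of equal length")
    b_3to5 = b_5to3.upper()[::-1]  # read strand b in the 3'->5' direction
    matches = sum(PAIR[x] == y for x, y in zip(a_5to3.upper(), b_3to5))
    return matches / len(a_5to3)
```

With the example from the text, 5'-A-G-T-3' pairs completely with 3'-T-C-A-5', which is written 5'-A-C-T-3' when read in the conventional direction.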

The term "polymerase chain reaction" ("PCR") refers to a method for amplifying a target sequence, the method consisting of the following steps: a large excess of two oligonucleotide primers is introduced into a DNA mixture containing the desired target sequence, followed by a precise sequence of thermal cycling in the presence of a DNA polymerase. The two primers are complementary to their respective strands of the double-stranded target sequence. For amplification, the mixture is denatured and the primers are then annealed to their complementary sequences within the target molecule. After annealing, the primers are extended with a polymerase to form a new pair of complementary strands. The steps of denaturation, primer annealing, and polymerase extension can be repeated many times (i.e., denaturation, annealing, and extension constitute one "cycle"; there can be many "cycles") to obtain a high concentration of an amplified fragment of the desired target sequence. The length of the amplified fragment is determined by the positions of the primers relative to each other and is therefore a controllable parameter. Because of the repetitive nature of the method, it is referred to as the "polymerase chain reaction". Because the desired amplified fragment of the target sequence becomes the predominant sequence (in terms of concentration) in the mixture, it is said to be "PCR amplified" and is referred to as a "PCR product" or an "amplicon".
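Two points in the definition above lend themselves to a small worked sketch: the amplicon length follows from the primer positions, and repeated cycles multiply the copy number. The sketch assumes 1-based inclusive template coordinates and an idealized per-cycle efficiency parameter; neither is specified in this description, and the function names are hypothetical.

```python
def amplicon_length(forward_primer_start, reverse_primer_end):
    """Length of the product spanning both primers.

    Assumption: 1-based inclusive coordinates on the template, measured from
    the 5' end of the forward primer to the 5' end of the reverse primer.
    """
    if reverse_primer_end < forward_primer_start:
        raise ValueError("reverse primer must lie downstream of forward primer")
    return reverse_primer_end - forward_primer_start + 1


def copies_after(cycles, initial_copies=1.0, efficiency=1.0):
    """Idealized copy number after a number of cycles.

    efficiency = 1.0 models perfect doubling per cycle; real reactions
    plateau, so this is an upper-bound illustration only.
    """
    return initial_copies * (1.0 + efficiency) ** cycles
```

Under perfect doubling, 30 cycles starting from a single template give 2^30 copies, which is why the amplified fragment comes to dominate the mixture.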

In this specification, the term "amplifiable nucleic acid" refers to a nucleic acid that can be amplified by any amplification method. It is contemplated that an "amplifiable nucleic acid" will typically comprise a "sample template".

In this specification, the term "sample template" refers to a nucleic acid derived from a sample that is analyzed for the presence of a "target". In contrast, "background template" refers to nucleic acids other than the sample template, which may or may not be present in the sample. Background template is usually inadvertent. It may be the result of carryover, or it may be due to the presence of nucleic acid contaminants that purification was intended to remove from the sample. For example, nucleic acids from an organism other than the nucleic acid to be detected may be present as background in the test sample.

In the present specification, the term "primer" refers to an oligonucleotide, whether occurring naturally (as in a purified restriction digest) or produced synthetically, that is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product complementary to a nucleic acid strand is induced (e.g., in the presence of nucleotides and an inducing agent such as a DNA polymerase, and at a suitable temperature and pH). The primer is preferably single-stranded for maximum efficiency of amplification, but may also be double-stranded. If double-stranded, the primer is first treated to separate its strands before being used to prepare extension products. Preferably, the primer is an oligodeoxyribonucleotide. The primer must be long enough to prime the synthesis of extension products in the presence of the inducing agent. The exact length of the primer will depend on many factors, including temperature, source of primer, and the method used.

In the present specification, the term "probe" refers to an oligonucleotide (e.g., a nucleotide sequence) that is naturally occurring in a purified restriction digest or that is produced synthetically, recombinantly, or by PCR amplification, and that is capable of hybridizing to another target oligonucleotide. Probes may be single-stranded or double-stranded. Probes can be used for detection, identification, and isolation of specific gene sequences (e.g., a "capture probe"). It is contemplated that in some embodiments, any probe used in the present invention may be labeled with any "reporter" such that it is detectable in any detection system.

In this specification, "amplification" generally refers to the process of producing multiple copies of a desired sequence. "multiple copies" means at least two copies. "copy" does not necessarily mean perfect sequence complementarity or identity with the template sequence. For example, copies may include nucleotide analogs such as deoxyinosine, intentional sequence alterations (e.g., introduced by primers that include sequences that hybridize to the template but are not complementary), and/or sequence errors that occur during amplification.

In the present specification, "sequence determination" and the like include determination of information about the nucleotide base sequence of a nucleic acid. Such information may include identification or determination of partial or complete sequence information of the nucleic acid. The sequence information may be determined with varying degrees of statistical reliability or confidence. In one aspect, the term includes determining the identity and order of a plurality of consecutive nucleotides in a nucleic acid.

In the present specification, "DNA methylation" or "methylation" refers to the process in an organism by which a methyl group is transferred to a specific base under the catalysis of a DNA methyltransferase, using S-adenosylmethionine (SAM) as the methyl donor. In mammals, methylation occurs predominantly at the C5 position of cytosine residues. In the human genome, much of the DNA methylation occurs at the cytosine of CpG dinucleotides, where C is cytosine, G is guanine, and p is the phosphate group. DNA methylation may also occur at cytosines in sequence contexts such as CHG and CHH, where H is adenine, cytosine or thymine. DNA methylation may also occur on bases other than cytosine, for example as N6-methyladenine. Furthermore, DNA methylation may also take the form of 5-hydroxymethylcytosine and the like. In most cases, methylation of DNA is induced by methylation modifications on the opposite DNA strand or at other CpG sites near the DNA base.

In the present specification, "DNA methylation site", "methylation site" refers to single or multiple base positions where DNA methylation modification is possible. Such as CpG sites, CHG sites or CHH sites. In some cases, the DNA methylation site is equivalent to a CpG site.
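The CpG, CHG and CHH site types named above can be told apart from the two bases that follow a cytosine. The following is a minimal sketch that scans a single strand written 5'->3'; the function name and the "unknown" label for cytosines too close to the end of the sequence are illustrative assumptions.

```python
def cytosine_contexts(seq):
    """Return (0-based position, context) for every cytosine on the given strand.

    Contexts: "CpG" (C followed by G), "CHG" (C, H, G) and "CHH" (C, H, H),
    where H is A, C or T. Cytosines whose context cannot be resolved
    (sequence end or ambiguous bases) are labelled "unknown".
    """
    seq = seq.upper()
    h_bases = {"A", "C", "T"}
    contexts = []
    for i, base in enumerate(seq):
        if base != "C":
            continue
        following = seq[i + 1:i + 3]
        if len(following) >= 1 and following[0] == "G":
            context = "CpG"
        elif len(following) == 2 and following[0] in h_bases and following[1] == "G":
            context = "CHG"
        elif len(following) == 2 and following[0] in h_bases and following[1] in h_bases:
            context = "CHH"
        else:
            context = "unknown"
        contexts.append((i, context))
    return contexts
```

Only one strand is examined here; a complete survey of a double-stranded region would also scan the reverse complement.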

In this specification, "methylated nucleotide" or "methylated nucleotide base" refers to the presence of a methyl moiety on a nucleotide base, where the methyl moiety is not present in a recognized typical nucleotide base. For example, cytosine does not contain a methyl moiety on its pyrimidine ring, but 5-methylcytosine contains a methyl moiety at the 5-position of its pyrimidine ring. Thus, cytosine is not a methylated nucleotide and 5-methylcytosine is a methylated nucleotide. In another example, thymine contains a methyl moiety at the 5-position of its pyrimidine ring; however, because thymine is a typical nucleotide base of DNA, it is not regarded as a methylated nucleotide under this definition.

In the present specification, "methylation level", "methylation state", "methylation profile" and "methylation status" of a nucleic acid molecule refer to the presence or absence of one or more methylated nucleotide bases in the nucleic acid molecule. For example, a nucleic acid molecule comprising a methylated cytosine is considered methylated (e.g., the methylation state of the nucleic acid molecule is methylated). Nucleic acid molecules that do not contain any methylated nucleotides are considered unmethylated.

In this specification, methylation status may optionally be represented or indicated by a "methylation value" (e.g., representing methylation frequency, fraction, proportion, percentage, etc.). Methylation values can be generated, for example, by quantifying the amount of intact nucleic acid present after restriction digestion with a methylation dependent restriction enzyme, or by comparing amplification spectra after a bisulfite reaction, or by comparing the sequences of bisulfite treated and untreated nucleic acid. Thus, a value such as a methylation value represents methylation status and thus can be used as a quantitative indicator of methylation status in multiple copies of a locus.

As used herein, "methylation frequency" or "percent methylation (%)" refers to the number of instances in which a molecule or locus is methylated relative to the number of instances in which the molecule or locus is unmethylated. For example, in some embodiments, the percent methylation refers to the percentage of methylated cytosines, expressed as a β value, i.e., β = number of methylated cytosines / (number of methylated cytosines + number of unmethylated cytosines).
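The β value defined above is a simple ratio of counts. A minimal sketch follows; the function name is hypothetical, and raising an error when no cytosines were observed is an assumption made here, since the text does not say how that case is handled.

```python
def beta_value(methylated, unmethylated):
    """beta = methylated / (methylated + unmethylated), in the range 0 to 1."""
    if methylated < 0 or unmethylated < 0:
        raise ValueError("counts must be non-negative")
    total = methylated + unmethylated
    if total == 0:
        raise ValueError("no cytosines observed; beta is undefined")
    return methylated / total
```

For example, 30 methylated and 70 unmethylated cytosines give β = 0.3, i.e., 30% methylation.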

Thus, methylation state describes the state of methylation of a nucleic acid (e.g., a genomic sequence). Furthermore, methylation state refers to a property of a nucleic acid fragment that is associated with methylation at a particular genomic locus. Such characteristics include, but are not limited to: whether any cytosine (C) residue within this DNA sequence is methylated, the position of the methylated C residue, the frequency or percentage of methylated C throughout any particular region of the nucleic acid, and allelic differences in methylation due to, for example, differences in the allelic sources. The terms "methylation state", "methylation profile" and "methylation status" also refer to the relative, absolute concentration or pattern of methylated C or unmethylated C throughout any particular region of a nucleic acid in a biological sample. For example, if cytosine (C) residues within a nucleic acid sequence are methylated, they may be referred to as "hypermethylated" or have "increased methylation", while if cytosine (C) residues within a DNA sequence are unmethylated, they may be referred to as "hypomethylated" or have "decreased methylation". Likewise, if a cytosine (C) residue within a nucleic acid sequence is methylated compared to another nucleic acid sequence (e.g., from a different region or from a different individual, etc.), the sequence is considered to be hypermethylated or to have increased methylation compared to the other nucleic acid sequence. Alternatively, if a cytosine (C) residue within a DNA sequence is unmethylated compared to another nucleic acid sequence (e.g., from a different region or from a different individual, etc.), the sequence is considered hypomethylated or has reduced methylation compared to the other nucleic acid sequence. In addition, the term "methylation pattern" as used herein refers to the collective sites of methylated and unmethylated nucleotides at a region of a nucleic acid. 
Two nucleic acids may have the same or similar methylation frequency or methylation percentage but different methylation patterns when the number of methylated and unmethylated nucleotides is the same or similar throughout the region but the positions of the methylated and unmethylated nucleotides differ. When sequences differ in the degree (e.g., one has increased or decreased methylation relative to the other), frequency, or pattern of methylation, the sequences are said to be "differentially methylated", to have "methylation differences", or to have "different methylation states". The term "differential methylation" refers to the difference between the level or pattern of nucleic acid methylation in a cancer-positive sample and that in a cancer-negative sample. It may also refer to differences in levels or patterns between patients whose cancer recurs after surgery and patients whose cancer does not recur. Differential methylation and specific levels or patterns of DNA methylation are diagnostic and predictive biomarkers, e.g., once the correct cut-off value or predictive property has been defined.
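The distinction drawn above between methylation frequency and methylation pattern can be made concrete with a toy encoding. In the sketch below, each molecule is written as a string with one character per methylation site, "M" for methylated and "U" for unmethylated; this encoding and the function names are assumptions made only for illustration.

```python
def methylation_frequency(pattern):
    """Fraction of sites marked 'M' in a pattern string of 'M' and 'U'."""
    if not pattern or set(pattern) - {"M", "U"}:
        raise ValueError("pattern must be a non-empty string of 'M' and 'U'")
    return pattern.count("M") / len(pattern)


def same_frequency_different_pattern(a, b):
    """True when two molecules share a frequency but differ in site positions."""
    return methylation_frequency(a) == methylation_frequency(b) and a != b
```

The patterns "MMUU" and "UMUM" both have a frequency of 0.5, yet they are different methylation patterns because the methylated positions differ.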

As used herein, the term "differential methylation region" (Differentially Methylated Region, DMR) refers to a DNA region comprising one or more differential methylation sites. DMR comprising a greater number or frequency of methylation sites under selected conditions of interest, such as cancer status, may be referred to as hypermethylated DMR. DMR comprising a lesser number or frequency of methylation sites under selected conditions of interest, such as cancer status, may be referred to as hypomethylated DMR. DMR as a gastric cancer methylation biomarker may be referred to as gastric cancer DMR. In some cases, the DMR may be a single nucleotide that is a methylation site. In the context of the present specification, the "differential methylation region" may also be simply referred to as "methylation region", and in the present specification, the "differential methylation region" may be used as a marker for diagnosing gastric cancer, and therefore, depending on the context, the "differential methylation region" may also be simply referred to as "marker" or "biomarker".

In this specification, the terms "sequencing", "high throughput sequencing", or "next generation sequencing" include sequence determination using methods that determine a number (typically thousands to billions) of nucleic acid sequences in a substantially parallel manner, i.e., methods in which DNA templates are prepared for sequencing not one at a time but in a batch process, and in which many sequences are preferably read in parallel, or using an ultra-high throughput serial process that can itself be parallelized. Such methods include, but are not limited to, pyrosequencing (e.g., as commercialized by 454 Life Sciences, Inc., Branford, CT); sequencing by ligation (e.g., as commercialized in the SOLiD™ technology, Life Technologies, Inc., Carlsbad, CA); sequencing by synthesis using modified nucleotides (e.g., as commercialized in the TruSeq™ and HiSeq™ technologies by Illumina, Inc., San Diego, CA; HeliScope™ by Helicos Biosciences Corporation, Cambridge, MA; and PacBio RS by Pacific Biosciences of California, Inc., Menlo Park, CA); sequencing by ion detection technologies (e.g., Ion Torrent™ technology, Life Technologies, Carlsbad, CA); DNA nanoball sequencing (Complete Genomics, Inc., Mountain View, CA); nanopore-based sequencing technologies (e.g., as developed by Oxford Nanopore Technologies, Ltd., Oxford, UK); and the like.

In this specification, the term "bisulfite reagent" refers to a reagent that in some embodiments comprises bisulfite, disulfite, hydrogen sulfite, or a combination thereof. DNA treated with the bisulfite reagent has its unmethylated cytosine nucleotides converted to uracil, while methylated cytosines and other bases remain unchanged, thus allowing discrimination between methylated and unmethylated cytosines in, for example, CpG dinucleotide sequences.
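The conversion rule described above can be simulated on a single strand. The sketch below assumes complete conversion and takes the methylated positions as a given set of 0-based indices; the function names are hypothetical. The second helper reflects the well-known fact that uracil is read as thymine after PCR amplification and sequencing.

```python
def bisulfite_convert(seq, methylated_positions):
    """Convert unmethylated C to U; methylated C and all other bases are kept.

    methylated_positions: set of 0-based indices of methylated cytosines.
    Assumption: conversion is complete (no unconverted unmethylated C).
    """
    return "".join(
        "U" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq.upper())
    )


def as_sequenced(converted):
    """After amplification, uracil is read as thymine."""
    return converted.replace("U", "T")
```

In the sequenced read, a C that remains at a cytosine position therefore indicates methylation, and a T indicates an unmethylated cytosine.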

In the present specification, the term "extracellular DNA" or its synonyms "cfDNA (circulating free DNA)", "circulating DNA" and "free circulating DNA" refer to DNA that is not contained within intact cells in the body fluid serving as the sample, or from which the sample is derived, but circulates freely in that body fluid. Extracellular DNA is typically fragmented genomic DNA.

The term "AUC" as used herein is an abbreviation for "area under the curve". Specifically, it refers to the area under the Receiver Operating Characteristic (ROC) curve. The ROC curve is a plot of the true positive rate against the false positive rate for the different possible cut points of a diagnostic test. It shows the trade-off between sensitivity and specificity depending on the cut point chosen (any increase in sensitivity will be accompanied by a decrease in specificity). The area under the ROC curve (AUC) is a measure of the accuracy of the diagnostic test (the larger the area the better; the optimum is 1; a random test will have an ROC curve lying on the diagonal with an area of 0.5; see J. P. Egan (1975), Signal Detection Theory and ROC Analysis, Academic Press, New York).
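The AUC described above can be computed without drawing the curve, using the standard equivalence between the area under the ROC curve and the probability that a randomly chosen positive sample scores higher than a randomly chosen negative sample (ties counted as one half). The sketch below uses that rank-based formulation; the function name and the example scores are hypothetical and are not data from this specification.

```python
def roc_auc(scores_positive, scores_negative):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts, over all (positive, negative) pairs, how often the positive
    sample has the higher score; a tie contributes 0.5.
    """
    if not scores_positive or not scores_negative:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))
```

A perfectly separating test gives 1.0, and a test whose scores carry no information gives about 0.5, matching the diagonal described in the text.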

The following describes the technical scheme of the present invention in detail.

< methylation biomarker >

In some aspects of the invention, there is provided a methylation biomarker for gastric cancer diagnosis, wherein the methylation biomarker comprises any one or any combination of the differential methylation regions selected from the group consisting of:

In some embodiments, the differential methylation region comprises at least one of chr10:90343168-90343288, chr5:92907913-92908009 and chr15:96895487-96895582.

In some specific embodiments, the differential methylation region comprises one of chr10:90343168-90343288, chr5:92907913-92908009, chr15:96895487-96895582. For example, the differentially methylated regions include chr10:90343168-90343288, chr5:92907913-92908009, or chr15:96895487-96895582.

In some specific embodiments, the differential methylation region comprises two of chr10:90343168-90343288, chr5:92907913-92908009, chr15:96895487-96895582. For example, the differential methylation regions include chr10:90343168-90343288 and chr5:92907913-92908009; chr10:90343168-90343288 and chr15:96895487-96895582; or chr5:92907913-92908009 and chr15:96895487-96895582.

In some specific embodiments, the differential methylation region comprises all three of chr10:90343168-90343288, chr5:92907913-92908009 and chr15:96895487-96895582.
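The one-region, two-region and three-region embodiments above are the non-empty combinations of the three named regions. A short sketch that enumerates them is given below; the region strings are taken from the text, while the constant and function names are hypothetical.

```python
from itertools import combinations

# The three regions named in the embodiments above.
CORE_REGIONS = (
    "chr10:90343168-90343288",
    "chr5:92907913-92908009",
    "chr15:96895487-96895582",
)


def core_panels(regions=CORE_REGIONS):
    """Every non-empty combination of the given regions, smallest first."""
    panels = []
    for size in range(1, len(regions) + 1):
        panels.extend(combinations(regions, size))
    return panels


if __name__ == "__main__":
    for panel in core_panels():
        print(len(panel), ", ".join(panel))
```

This yields three single-region panels, three two-region panels and one three-region panel, seven in total.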

In some exemplary embodiments, the differential methylation regions include chr10:90343168-90343288, chr15:96895487-96895582, chr20:2781337-2781445, chr4:13526663-13526744, chr5:92907913-92908009, chr5:112073351-112073434, chr5:134376436-134376563, chr5:140871302-140871422, chr6:133561659-133561777, chr7:27253040-27253164, chr8:10588845-10588971, chr8:24770917-24771026, chr8:99962607-99962690, chr8:132053717-132053827, chr8:132054559-132054681.

Further, in some embodiments, in addition to at least one of chr10:90343168-90343288, chr5:92907913-92908009 and chr15:96895487-96895582, the differential methylation region further comprises at least one of chr5:112073351-112073434 and chr4:13524667-13524790. Illustratively, the differential methylation regions include chr5:112073351-112073434, chr5:92907913-92908009, chr10:90343168-90343288, chr15:96895487-96895582, chr4:13524667-13524790.

Still further, in some specific embodiments, the differential methylation region further comprises at least one of chr4:13526663-13526744, chr7:27253040-27253164, chr8:132054559-132054681, chr8:99962607-99962690, chr6:133561659-133561777. Illustratively, the differentially methylated regions include chr15:96895487-96895582, chr10:90343168-90343288, chr5:92907913-92908009, chr5:112073351-112073434, chr4:13524667-13524790, chr4:13526663-13526744, chr7:27253040-27253164, chr8:132054559-132054681, chr8:99962607-99962690, chr6:133561659-133561777.

Still further, in some specific embodiments, the differential methylation region further comprises at least one of chr8:10588845-10588971, chr17:66596160-66596269, chr8:132053717-132053827, chr14:99697596-99697701, chr9:91605729-91605817, chr5:140871302-140871422, chr5:134376436-134376563, chr1:47899550-47899648, chr8:143532011-143532108, chr9:126771376-126771460. Illustratively, the differential methylation regions include chr15:96895487-96895582, chr10:90343168-90343288, chr5:92907913-92908009, chr5:112073351-112073434, chr4:13524667-13524790, chr4:13526663-13526744, chr7:27253040-27253164, chr8:132054559-132054681, chr8:99962607-99962690, chr6:133561659-133561777, chr8:10588845-10588971, chr17:66596160-66596269, chr8:132053717-132053827, chr14:99697596-99697701, chr9:91605729-91605817, chr5:140871302-140871422, chr5:134376436-134376563, chr1:47899550-47899648, chr8:143532011-143532108, chr9:126771376-126771460.

Still further, in some specific embodiments, the differential methylation region further comprises at least one of chr17:46673962-46674085, chr2:176994614-176994699, chr7:1094829-1094913, chr4:144620968-144621093, chr13:46960844-46960969, chr2:213401681-213401784, chr7:139333352-139333459, chr15:96887526-96887616, chr5:32714407-32714525, chr20:2781337-2781445, chr4:13532565-13532660, chr14:103512082-103512178, chr12:115109503-115109616, chr7:96651472-96651564, chr10:85955208-85955302, chr2:177053257-177053351, chr22:25961206-25961295, chr5:112073351-112073434, chr6:31830674-31830762, chr7:27196279-27196357, chr11:61544752-61544867, chr8:99961207-99961337, chr7:150038594-150038715, chr6:40996087-40996182, chr4:13536950-13537044, chr13:36049329-36049419, chr3:152553579-152553708, chr5:172673057-172673162, chr10:110225862-110225977, and chr15:48010834-48010923. Illustratively, the differential methylation regions include the regions of the preceding illustrative embodiment together with all of the regions listed above.

Further, in other specific embodiments, the differential methylation region further comprises at least one of chr9:126771376-126771460, chr6:133561659-133561777. Illustratively, the regions of differential methylation include chr10:90343168-90343288, chr5:92907913-92908009, chr9:126771376-126771460, chr15:96895487-96895582, chr6:133561659-133561777.

Still further, in other specific embodiments, the differential methylation region further comprises at least one of chr8:10588845-10588971, chr5:112073351-112073434, chr8:70984561-70984656, chr7:27253040-27253164, chr7:27196279-27196357. Illustratively, the differentially methylated regions include chr10:90343168-90343288, chr5:92907913-92908009, chr9:126771376-126771460, chr15:96895487-96895582, chr6:133561659-133561777, chr8:10588845-10588971, chr5:112073351-112073434, chr8:70984561-70984656, chr7:27253040-27253164, chr7:27196279-27196357.

Still further, in other specific embodiments, the differential methylation region further comprises at least one of chr11:125037244-125037331, chr5:140871302-140871422, chr14:103512082-103512178, chr2:213401681-213401784, chr20:2781337-2781445, chr12:115109503-115109616, chr17:46673962-46674085, chr7:150038594-150038715, chr8:132053717-132053827, chr5:32714407-32714525. Illustratively, the differential methylation regions include chr10:90343168-90343288, chr5:92907913-92908009, chr9:126771376-126771460, chr15:96895487-96895582, chr6:133561659-133561777, chr8:10588845-10588971, chr5:112073351-112073434, chr8:70984561-70984656, chr7:27253040-27253164, chr7:27196279-27196357, chr11:125037244-125037331, chr5:140871302-140871422, chr14:103512082-103512178, chr2:213401681-213401784, chr20:2781337-2781445, chr12:115109503-115109616, chr17:46673962-46674085, chr7:150038594-150038715, chr8:132053717-132053827, chr5:32714407-32714525.

Still further, in other specific embodiments, the differential methylation region further comprises at least one of chr5:134376436-134376563, chr8:132054559-132054681, chr3:138665164-138665243, chr4:144620968-144621093, chr8:99961207-99961337, chr13:36049329-36049419, chr7:139333352-139333459, chr10:28030375-28030474, chr4:13526663-13526744, chr8:24770917-24771026, chr4:13536950-13537044, chr3:152553579-152553708, chr2:177053257-177053351, chr10:85955208-85955302, chr5:112073351-112073434, chr22:25961206-25961295, chr9:91605729-91605817, chr8:143532011-143532108, chr2:176972076-176972174, chr10:110225862-110225977, chr8:99962607-99962690, chr6:40996087-40996182, chr11:61544752-61544867, chr8:132052701-132052859, chr15:96887526-96887616, chr9:23822006-23822117, chr5:172673057-172673162, chr4:13532565-13532660, chr1:47899550-47899648, and chr17:66596160-66596269. Illustratively, the differential methylation regions include the regions of the preceding illustrative embodiment together with all of the regions listed above.

In some particularly specific embodiments, the differential methylation region comprises chr8: chr4: chr8: the composition comprises chr10, chr4, chr13, chr2, chr4, chr5, chr7, chr4, chr11, chr4, chr17, chr8, chr5, chr6, chr5, chr1, chr7, chr2, chr8, chr5, chr8, chr4, chr11, chr9, chr17, chr14, chr8, chr5, r7, chr14, chr7, chr3, r10, r2, chr5, chr15, chr8, chr11, chr14, chr8, chr10, chr5, chr10, chr5, and chr10 chr2:200328933-200329042, chr4:111544276-111544375, chr3:152553579-152553708, chr5:32714407-32714525.

It will be appreciated by those skilled in the art that the methylation biomarkers can also comprise contiguous fragments covering at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the full-length sequence of each of the differential methylation regions described above. In some embodiments, the contiguous fragment comprises some or all of the (differential) methylation sites in the original differential methylation region. The methylation biomarker embodiments described above apply equally to contiguous fragments of the differential methylation regions. In some embodiments, a contiguous fragment of one differential methylation region can also be combined with another differential methylation region to construct the above-described embodiments of the methylation biomarker.
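The percentage thresholds above translate into minimum fragment lengths once the length of a region is known. The following sketch parses a region string of the form used in this description and computes that minimum. It assumes that the coordinates are 1-based and inclusive, which this section does not state, and all function names are hypothetical.

```python
import re


def parse_region(region):
    """Split 'chrN:start-end' into (chromosome, start, end)."""
    match = re.fullmatch(r"(chr[0-9A-Za-z]+):(\d+)-(\d+)", region.strip())
    if match is None:
        raise ValueError("expected a region such as 'chr10:90343168-90343288'")
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if end < start:
        raise ValueError("region end precedes region start")
    return chrom, start, end


def region_length(region):
    """Length in bases, ASSUMING 1-based inclusive coordinates."""
    _, start, end = parse_region(region)
    return end - start + 1


def min_fragment_length(region, percent):
    """Shortest contiguous fragment covering at least `percent` of the region."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range (0, 100]")
    length = region_length(region)
    return -(-length * percent // 100)  # integer ceiling of length * percent / 100
```

Under the inclusive-coordinate assumption, chr10:90343168-90343288 spans 121 bases, so a fragment covering at least 55% of it would be at least 67 bases long.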

In the present invention, there is no limitation on the type and stage of gastric cancer. In some embodiments of the invention, the gastric cancer is selected from adenocarcinoma, lymphoma, gastrointestinal stromal tumor, carcinoid tumor, squamous cell carcinoma, small cell carcinoma, or leiomyosarcoma. In some embodiments of the invention, the gastric cancer is selected from stage I, II, III or IV gastric cancer. In another specific embodiment, the gastric cancer may be any of the TNM stages of gastric cancer. In some embodiments of the invention, the gastric cancer is accompanied by lymph node metastasis or is not accompanied by lymph node metastasis. In some embodiments of the invention, the gastric cancer may be poorly, moderately or highly differentiated gastric cancer. In some embodiments of the invention, the gastric cancer may be a diffuse, mixed or intestinal type of gastric cancer in the Lauren typing.

In some embodiments of the invention, the gastric cancer is gastric cancer from a subject, the subject being a mammal; preferably, the mammal is a human.

In some embodiments of the invention, a difference in the methylation level of the methylation biomarker in a test sample relative to the methylation level of the methylation biomarker in a sample from a subject not having gastric cancer is indicative of the presence of gastric cancer in the subject to which the test sample corresponds. In some specific embodiments, the subject not having gastric cancer may be a healthy subject. In other specific embodiments, the subject not having gastric cancer may be a subject having benign lesions of the stomach. In other specific embodiments, the subject not having gastric cancer may be a subject having other cancers.

In some embodiments of the invention, the sample to be tested is selected from one or more of tissue (e.g., a tissue slice), whole blood, plasma, serum, pleural effusion, ascites, amniotic fluid, saliva, bone marrow, urine shed cells, urinary sediment, urine supernatant. In some preferred embodiments of the invention, the sample to be tested is tissue, whole blood, plasma or serum.

In some more preferred embodiments of the invention, the sample to be tested is whole blood, plasma or serum. Due to the convenience of sampling and other aspects, the diagnosis provided by the invention comprises early screening, detection or auxiliary detection of gastric cancer. In some specific embodiments of the invention, the differential methylation region is a differential methylation region present in cfDNA. In some more specific embodiments, the differential methylation region is a differential methylation region in cfDNA present in whole blood, plasma, or serum samples.

Methods that can be used to detect the methylation level of genomic DNA or free DNA in a sample are known to those of skill in the art and include, but are not limited to, one or more of polymerase chain reaction techniques, in situ hybridization techniques, enzymatic mutation detection techniques, chemical cleavage mismatch techniques, mass spectrometry techniques, and gene sequencing techniques.

In some specific embodiments, the polymerase chain reaction techniques include, but are not limited to, RT-PCR, immuno-PCR, nested PCR, fluorescent PCR, in situ PCR, membrane-bound PCR, anchored PCR, asymmetric PCR, long-distance PCR, touchdown PCR, gradient PCR, multiplex PCR, and the like; the gene sequencing techniques include, but are not limited to, amplicon sequencing, reduced genome methylation sequencing, whole genome methylation sequencing (Whole Genome Bisulfite Sequencing, WGBS), DNA enrichment sequencing, pyrosequencing, and bisulfite conversion sequencing; the mass spectrometry techniques include, but are not limited to, GC-MS, LC-MS, MALDI-TOF MS, FT-MS, ICP-MS, and SIMS detection techniques; the in situ hybridization techniques include, but are not limited to, chip-based detection platforms such as the 450K and 850K methylation detection techniques.

In some more specific embodiments, the method for detecting the methylation level of genomic DNA or free DNA in a sample includes, but is not limited to, at least one of fluorescent quantitative PCR (qPCR), methylation Specific PCR (MSP), digital PCR (ddPCR), DNA methylation chip, amplicon sequencing, targeted DNA methylation sequencing, whole genome methylation sequencing, DNA methylation mass spectrometry (MassArray).

In some preferred embodiments, the method for detecting the methylation level of genomic DNA or free DNA in a sample comprises multiplex PCR combined with amplicon sequencing. In these embodiments, the methylation signal of the DNA molecules is amplified before sequencing, which improves the signal-to-noise ratio of the methylation signal and yields the methylation pattern of specific DNA molecules at single-base resolution, providing more information for subsequent bioinformatic analysis; the final results show better sensitivity and specificity.
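To illustrate what single-base resolution makes possible, the sketch below derives a per-CpG methylation fraction from bisulfite-converted amplicon reads. It is a simplified illustration and not the analysis pipeline of this specification: it assumes that the reads are already aligned to the reference amplicon without gaps, are on the same strand, have the same length as the reference, and were fully converted. The function name and the toy sequences are hypothetical.

```python
def per_cpg_beta(reference, reads):
    """Map each CpG position (0-based, on the reference) to its methylated fraction.

    At a reference CpG cytosine, a read showing C is counted as methylated and
    a read showing T as unmethylated; any other base is ignored as an error.
    """
    ref = reference.upper()
    for read in reads:
        if len(read) != len(ref):
            raise ValueError("this sketch needs gapless, full-length reads")
    cpg_positions = [i for i in range(len(ref) - 1) if ref[i:i + 2] == "CG"]
    result = {}
    for pos in cpg_positions:
        methylated = sum(1 for read in reads if read[pos].upper() == "C")
        unmethylated = sum(1 for read in reads if read[pos].upper() == "T")
        if methylated + unmethylated > 0:
            result[pos] = methylated / (methylated + unmethylated)
    return result
```

Because each read keeps the state of every CpG it covers, the same input could also be used to tabulate per-molecule methylation patterns rather than only per-site fractions.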

< use of methylation biomarker >

In some aspects of the invention, there is provided the use of a methylation biomarker as described above in the manufacture of a reagent or kit for diagnosing gastric cancer.

In other aspects of the invention, there is provided the use of a reagent for determining the methylation level of a methylation biomarker as described above in the preparation of a reagent or kit for diagnosing gastric cancer.

In other aspects of the invention, there is also provided the use of a reagent for determining the methylation level of a methylation biomarker as described above in the preparation of a kit for predicting gastric cancer, detecting gastric cancer, classifying gastric cancer, monitoring gastric cancer treatment, prognosis of gastric cancer or other relevant indicators for assessing gastric cancer.

< kit for diagnosing gastric cancer >

In some aspects of the invention, a kit for diagnosing gastric cancer is provided, wherein the kit comprises reagents for detecting the methylation level of the above-described methylation biomarkers in a test sample.

In some embodiments, the kit detects the methylation level of the methylation biomarker described above using one or more of polymerase chain reaction techniques, in situ hybridization techniques, enzymatic mutation detection techniques, chemical cleavage mismatch techniques, mass spectrometry techniques, and genetic sequencing techniques.

In some specific embodiments, the polymerase chain reaction techniques include, but are not limited to, RT-PCR, immuno-PCR, nested PCR, fluorescent PCR, in situ PCR, membrane-bound PCR, anchored PCR, asymmetric PCR, long-distance PCR, touchdown PCR, gradient PCR, multiplex PCR, and the like; the gene sequencing techniques include, but are not limited to, amplicon sequencing, reduced-representation genome methylation sequencing, whole-genome methylation sequencing, DNA enrichment sequencing, pyrosequencing, and bisulfite conversion sequencing; the mass spectrometry techniques include, but are not limited to, GC-MS, LC-MS, MALDI-TOF MS, FT-MS, ICP-MS, and SIMS detection techniques; the in situ hybridization techniques include, but are not limited to, chip-based detection platforms such as the 450K and 850K methylation detection techniques.

In some more specific embodiments, the detection methods employed by the kit include, but are not limited to, at least one of fluorescent quantitative PCR, methylation-specific PCR, digital PCR, DNA methylation chip, amplicon sequencing, targeted DNA methylation sequencing, whole-genome methylation sequencing, and DNA methylation mass spectrometry. In some preferred embodiments, the detection method employed by the kit comprises multiplex PCR combined with amplicon sequencing.

In some embodiments of the invention, the sample to be tested is selected from one or more of tissue (e.g., a tissue slice), whole blood, plasma, serum, pleural effusion, ascites, amniotic fluid, saliva, bone marrow, urine shed cells, urinary sediment, urine supernatant. In some preferred embodiments of the invention, the sample to be tested is tissue, whole blood, plasma or serum. In some more preferred embodiments of the invention, the sample to be tested is whole blood, plasma or serum. Due to the convenience of sampling and other aspects, the kit for diagnosing gastric cancer provided by the invention can be a kit for early screening, detection or auxiliary detection of gastric cancer.

In some embodiments of the invention, the test sample is from a subject, the subject being a mammal; preferably, the mammal is a human.

In some specific embodiments, the reagents in the kit further comprise primers and/or probes, wherein the primers amplify a sequence comprising the above-described methylation biomarker, and the probes hybridize at least in part to a sequence of the above-described methylation biomarker.

Still further, the reagent comprises at least one primer pair selected from the group consisting of:

< use of primer combination >

In some aspects of the invention, there is provided the use of at least one set of primers selected from the group consisting of the following combinations of primers for detecting the degree of methylation of a methylation biomarker as described above in the preparation of a reagent or kit for diagnosing gastric cancer:

primers shown as SEQ ID NO. 1-2; primers shown as SEQ ID NO. 3-4; primers shown as SEQ ID NO. 5-6; primers shown as SEQ ID NO. 7-8; primers shown as SEQ ID NO. 9-10; primers shown as SEQ ID NO. 11-12; primers shown as SEQ ID NO. 13-14; primers shown as SEQ ID NO. 15-16; primers shown as SEQ ID NO. 17-18; primers shown as SEQ ID NO. 19-20; primers shown as SEQ ID NO. 21-22; primers shown as SEQ ID NO. 23-24; primers shown as SEQ ID NO. 25-26; primers shown as SEQ ID NO. 27-28; primers shown as SEQ ID NO. 29-30; primers shown as SEQ ID NO. 31-32; primers shown as SEQ ID NO. 33-34; primers shown as SEQ ID NO. 35-36; primers shown as SEQ ID NO. 37-38; primers shown as SEQ ID NO. 39-40; primers shown as SEQ ID NO. 41-42; primers shown as SEQ ID NO. 43-44; primers shown as SEQ ID NO. 45-46; primers shown as SEQ ID NO. 47-48; primers shown as SEQ ID NO. 49-50; primers shown as SEQ ID NO. 51-52; primers shown as SEQ ID NO. 53-54; primers shown as SEQ ID NO. 55-56; primers shown as SEQ ID NO. 57-58; primers shown as SEQ ID NO. 59-60; primers shown as SEQ ID NO. 61-62; primers shown as SEQ ID NO. 63-64; primers shown as SEQ ID NO. 65-66; primers shown as SEQ ID NO. 67-68; primers shown as SEQ ID NO. 69-70; primers shown as SEQ ID NO. 71-72; primers shown as SEQ ID NO. 73-74; primers shown as SEQ ID NO. 75-76; primers shown as SEQ ID NO. 77-78; primers shown as SEQ ID NO. 79-80; primers shown as SEQ ID NO. 81-82; primers shown as SEQ ID NO. 83-84; primers shown as SEQ ID NO. 85-86; primers shown as SEQ ID NO. 87-88; primers shown as SEQ ID NO. 89-90; primers shown as SEQ ID NO. 91-92; primers shown as SEQ ID NO. 93-94; primers shown as SEQ ID NO. 95-96; primers shown as SEQ ID NO. 97-98; primers shown as SEQ ID NO. 99-100; primers shown as SEQ ID NO. 101-102; primers shown as SEQ ID NO. 103-104; primers shown as SEQ ID NO. 105-106; primers shown as SEQ ID NO. 107-108; primers shown as SEQ ID NO. 109-110; primers shown as SEQ ID NO. 111-112; primers shown as SEQ ID NO. 113-114; primers shown as SEQ ID NO. 115-116; primers shown as SEQ ID NO. 117-118; primers shown as SEQ ID NO. 119-120; primers shown as SEQ ID NO. 121-122; primers shown as SEQ ID NO. 123-124; primers shown as SEQ ID NO. 125-126; primers shown as SEQ ID NO. 127-128; primers shown as SEQ ID NO. 129-130.

< System for diagnosing gastric cancer >

Some aspects of the present invention provide a system for diagnosing gastric cancer, wherein the system comprises a detection device, a computing device, and an output device;

the presence of gastric cancer in a subject corresponding to the sample is determined if the methylation level of the methylation biomarker in the sample is different from the methylation level of the methylation biomarker measured in a sample from a subject not having gastric cancer.

In some specific embodiments, the output device is configured to output a detection result of the detection device and/or a discrimination result of the computing device, where the output device includes at least one of a display, a printer, and an audio output device; the computing device comprises at least one of a computer host, a central processing unit and a network server.

< method of diagnosing gastric cancer >

In some aspects of the invention, there is provided a method of diagnosing gastric cancer, comprising the steps of:

obtaining a sample to be tested of a subject;

extracting genomic DNA and/or free DNA of the sample to be detected;

detecting the methylation level of the above-described methylation biomarker in the DNA;

judging whether the gastric cancer exists in the subject.

In some embodiments of the invention, the sample to be tested is selected from one or more of tissue, whole blood, plasma, serum, pleural effusion, ascites, amniotic fluid, saliva, bone marrow, urine shed cells, urinary sediment, urine supernatant. In some preferred embodiments of the invention, the sample to be tested is tissue, whole blood, plasma or serum. In some more preferred embodiments of the invention, the sample to be tested is whole blood, plasma or serum.

In some specific embodiments, methods for detecting the methylation level of the above-described methylation biomarkers in the DNA include, but are not limited to: methylation-specific PCR, bisulfite PCR sequencing, real-time quantitative methylation-specific PCR, and the like; reduced-representation genome methylation sequencing, whole-genome methylation sequencing, DNA enrichment sequencing, pyrosequencing, bisulfite conversion sequencing, and the like; detection techniques based on platforms such as mass spectrometry; and chip-based detection platforms such as the 450K and 850K methylation detection techniques. In some preferred embodiments, the method of detecting the methylation level of the above-described methylation biomarkers in the DNA comprises multiplex PCR combined with amplicon sequencing.

In some embodiments of the present invention, there is provided a method for the diagnosis, early screening, detection, or auxiliary detection of gastric cancer, comprising the following steps:

obtaining a sample to be tested of a subject, wherein the sample to be tested is a whole blood, serum or plasma sample;

extracting free DNA of a sample to be detected;

performing bisulfite conversion on the free DNA;

performing multiplex PCR reaction and product purification on the free DNA converted by the bisulfite;

library construction and sequencing are carried out on the purified product;

the methylation biomarkers described above are analyzed for methylation level to determine whether gastric cancer is present in a subject.

The present invention is further illustrated by the following examples and test examples, which are not intended to be limiting. Specific materials and sources thereof used in embodiments of the present invention are provided below. However, it should be understood that these are merely exemplary and are not intended to limit the present invention, as materials that are the same as or similar to the type, model, quality, nature, or function of the reagents and instruments described below may be used in the practice of the present invention. The experimental methods used in the following examples and test examples are conventional methods unless otherwise specified. Materials, reagents and the like used in the following examples and test examples are commercially available unless otherwise specified.

Examples

Example 1

This example shows that the 65 methylation regions can be used to assist early screening and detection of gastric cancer. The detection method specifically comprises the following steps:

1. Sample information

1. A total of 312 participants were enrolled, including 145 gastric cancer patients, 30 patients with other cancers, 31 patients with benign lesions of the stomach, and 106 healthy persons.

2. Of the 175 tumor patients, 83 were men, 75 were women, and 17 had unknown sex information; ages ranged from 27 to 76, with an average age of 52.8. By pathological staging, 34 patients were stage I, 31 were stage II, 65 were stage III, 28 were stage IV, and 17 had unknown stage. Of the patients with benign lesions of the stomach, 19 were men and 12 were women, with sex information unknown for 17; ages ranged from 29 to 68, with an average age of 52.8; the benign lesions included granulomas, polyps, gastritis, and the like. The healthy persons included 36 men and 70 women; ages ranged from 40 to 73, with an average age of 56.8.

2. Library building process and sequencing method

1. Extraction of plasma cell-free DNA (cfDNA) and methylation library construction

1.1 Extraction of plasma cfDNA

Plasma cfDNA was extracted from gastric cancer blood samples strictly according to the protocol of the MagMAX™ Cell-Free DNA Isolation Kit from Thermo Fisher Scientific.

1.2 Bisulfite conversion of plasma cfDNA

The extracted cfDNA (10 ng) was subjected to bisulfite conversion, in which unmethylated cytosines in the cfDNA are deaminated to uracil while methylated cytosines remain unchanged, yielding bisulfite-converted cfDNA. The conversion was performed strictly according to the instructions of the EZ-96 DNA Methylation-Direct MagPrep kit from Zymo Research.

1.3 multiplex PCR reactions and product purification

1.3.1 Multiplex PCR reaction: the converted cfDNA described above was subjected to multiplex PCR to enrich the methylation signals. The specific sequences of the primers used are shown in Table 7, and the corresponding primer pair combinations were selected from Table 7 according to the marker combination. After each primer was quantified, the primers were mixed in equimolar amounts and added according to the following reaction system recipe.

Reaction system recipe (50 μL)

Component	Volume (μL)	Final concentration
2× KAPA2G Fast Multiplex Mix	25	1×
Primer mix	5	50 nM
Bisulfite-converted DNA obtained in step 1.2	Variable	10 ng
H₂O (nuclease-free)	To 50 μL

1) According to the instructions of the KAPA2G Fast Multiplex PCR Kit from Roche KAPA Biosystems, the 2× KAPA2G Fast Multiplex Mix and the other reaction components were thawed at room temperature and, once completely thawed, mixed by inversion.

2) The total reaction volume was calculated from the number of reactions, and a master mix was prepared with 10% extra volume of each component except the template DNA. The mixture was gently mixed by pipetting and the liquid was collected at the bottom by brief centrifugation.

3) The master mix was dispensed into 200 μL PCR tubes, ensuring accurate and consistent volumes during dispensing.

4) The template was added to the PCR tubes dispensed in step 3), the tubes were mixed by inversion, and the liquid was collected at the bottom by brief centrifugation.

5) The reaction program was started on the PCR instrument in advance; once the block had heated to the pre-denaturation temperature, the pause key was pressed, the PCR tubes were placed in the heating block, and the resume key was pressed to run the reaction.

PCR procedure:

6) After the reaction was completed, if the subsequent magnetic bead purification was not performed immediately, the reaction product could be stored overnight in a -20 ℃ refrigerator (or at 4 ℃ in the PCR instrument).
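The master mix calculation in step 2) can be sketched as follows. This is a minimal illustration, not part of the described protocol: the function name `master_mix_volumes` is hypothetical, and it is assumed that only the two fixed-volume components of the 50 μL recipe go into the master mix, with template DNA and water added per tube because their volumes depend on the template concentration.

```python
def master_mix_volumes(n_reactions, per_reaction_ul, overage=0.10):
    """Total volume (in uL) of each shared component for n reactions.

    per_reaction_ul maps component name -> volume per reaction in uL.
    overage is the extra fraction prepared to cover pipetting loss (10% here).
    """
    if n_reactions <= 0:
        raise ValueError("n_reactions must be positive")
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in per_reaction_ul.items()}


# Fixed-volume components taken from the 50 uL recipe above.
# Assumption: template DNA and water are excluded and added per tube.
shared = {"2x KAPA2G Fast Multiplex Mix": 25.0, "Primer mix": 5.0}

mix = master_mix_volumes(8, shared)
for name, volume in mix.items():
    print(f"{name}: {volume} uL")
```

For 8 reactions the scaling factor is 8 × 1.1 = 8.8, so 220 μL of the 2× mix and 44 μL of the primer mix would be prepared.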

1.3.2 Multiplex PCR product purification

1) Hieff NGS Smarter DNA Clean Beads (YEASEN, Cat# 12600ES56) were taken out of the refrigerator and equilibrated at room temperature for at least 30 min. 80% ethanol was prepared.

2) The beads were vortexed or inverted thoroughly to ensure adequate mixing.

3) After the multiplex reaction was completed, the product was taken out and centrifuged briefly. 50 μL of EPC water was added to the 50 μL of multiplex PCR product.

4) 80 μL of Smarter DNA Clean Beads (0.8× ratio) was pipetted into a 1.5 mL centrifuge tube.

5) The 100 μL of solution from step 3) was pipetted into the 1.5 mL centrifuge tube from step 4), vortexed thoroughly, and incubated at room temperature for 5 min (if beads clung to the tube wall, they were flicked to the bottom by hand).

6) The centrifuge tube from the previous step was centrifuged briefly and placed on a magnetic rack; after the solution cleared (about 5 min), the supernatant was carefully removed.

7) With the tube kept on the magnetic rack throughout, the beads were rinsed with 200 μL of freshly prepared 80% ethanol and incubated at room temperature for 30 sec, and the supernatant was then carefully removed.

8) Step 7) was repeated, for a total of 2 rinses.

9) The 1.5 mL centrifuge tube was centrifuged briefly (30 sec) and placed on the magnetic rack until the liquid cleared; the residual liquid was aspirated, and the tube was left open to air-dry until the bead surface was no longer glossy.

10) The 1.5 mL centrifuge tube was removed from the magnetic rack, 55 μL of ddH₂O was added, and the tube was vortexed thoroughly and left to stand at room temperature for 5 min.

11) The 1.5 mL centrifuge tube was centrifuged briefly and left to stand on the magnetic rack; after the solution cleared (about 5 min), 53 μL of supernatant was carefully transferred to a 200 μL PCR tube without touching the beads.

12) 2 μL of the DNA was used for Qubit quantification.
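The bead volume used in the purification above follows from the bead-to-sample ratio. A minimal sketch of that arithmetic is given below; the helper name `bead_volume_ul` is hypothetical and serves only to illustrate the calculation.

```python
def bead_volume_ul(sample_volume_ul, ratio):
    """Bead volume (uL) needed for a given bead-to-sample ratio."""
    if sample_volume_ul <= 0 or ratio <= 0:
        raise ValueError("sample volume and ratio must be positive")
    return sample_volume_ul * ratio


# 50 uL of multiplex PCR product diluted with 50 uL of water gives 100 uL of sample.
sample_ul = 50 + 50
beads_ul = bead_volume_ul(sample_ul, 0.8)
print(f"{beads_ul:.0f} uL of beads for {sample_ul} uL of sample at 0.8x")
```

The same helper could be reused for other ratios, provided the sample volume is measured after any dilution step.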

1.4 End repair

1) Following the instructions of the NEB kit, the reagents were thawed on ice, mixed, and centrifuged, and a reaction mixture was prepared in a PCR tube according to the following reaction system:

2) The total reaction volume was calculated from the number of reactions, and a master mix was prepared with 10% extra volume of each component except the template DNA and mixed by pipetting.

3) The master mix was dispensed into 200 μL PCR tubes, template DNA was added, and the contents were mixed by pipetting and centrifuged briefly (thorough mixing is required; small bubbles do not affect the experimental results).

4) The reaction was carried out in a PCR apparatus as follows:

Temperature	Time
20 ℃	30 min
65 ℃	30 min
4 ℃	Hold

5) After the reaction was completed, the sample tube was removed, centrifuged briefly, and the next reaction (uninterrupted) was performed.

1.5 Adaptor ligation

1) Whether the NEBNext adaptor needed to be diluted was determined from the DNA input, with reference to the standard provided by NEB:

2) Following the instructions of the NEB kit, the reagents were thawed on ice, mixed, and centrifuged, and a reaction mixture was prepared in a PCR tube according to the following reaction system:

End Prep Reaction Mixture	60 μL
NEBNext Ultra II Ligation Master Mix	30 μL
NEBNext Ligation Enhancer	1 μL
NEBNext Adaptor for Illumina	2.5 μL
Total volume	93.5 μL

3) The total reaction volume was calculated from the number of reactions, and a master mix was prepared with 10% extra volume of each component except the End Prep Reaction Mixture and the NEBNext Adaptor for Illumina, and mixed by pipetting.

4) The required volume of NEBNext Adaptor for Illumina was calculated from the number of reactions and, where dilution was required according to the dilution table referenced in step 1), the adaptor was diluted with Tris buffer (pH 8.0).

5) 2 μL of the adaptor from step 4) was added to the End Prep Reaction Mixture, followed by 31 μL of the master mix from step 3); the contents were mixed by pipetting and centrifuged briefly.

6) The tube was placed in a PCR instrument and incubated at 20 ℃ for 15 min (with the heated lid of the PCR instrument turned off).

7) After the previous step was completed, the product was taken out and centrifuged briefly, 3 μL of USER™ Enzyme (NEB #M5505) was added, and the contents were mixed by pipetting and centrifuged briefly.

8) The tube was placed in a PCR instrument and incubated at 37 ℃ for 15 min (the heated lid temperature of the PCR instrument must be at least 47 ℃).

9) After the reaction was completed, the product could proceed to the next reaction or be stored at -20 ℃ overnight.

1.6 Library fragment size selection

1) AMPure XP beads (Beckman Coulter, Inc., #A63881) were taken out of the refrigerator and equilibrated at room temperature for 30 min. 80% ethanol was prepared.

2) The beads were vortexed or inverted thoroughly to ensure adequate mixing.

3) 50 μL of XP beads was pipetted into a 1.5 mL centrifuge tube, the product of step 9) of section 1.5 was added, and the mixture was mixed thoroughly and incubated at room temperature for 5 min.

4) The centrifuge tube from the previous step was centrifuged briefly and placed on a magnetic rack; after the solution cleared (about 5 min), the supernatant was carefully transferred to a new 1.5 mL centrifuge tube.

5) 25 μL of XP beads was added to the supernatant, vortexed thoroughly, and incubated at room temperature for 5 min.

7) With the tube kept on the magnetic rack throughout, the beads were rinsed with 200 μL of freshly prepared 80% ethanol and incubated at room temperature for 30 sec, and the supernatant was then carefully removed.

8) Step 7) was repeated, for a total of 2 rinses.

9) The 1.5 mL centrifuge tube was centrifuged briefly (30 sec) and placed on the magnetic rack until the liquid cleared; the residual liquid was aspirated, and the tube was left open to air-dry until the bead surface was no longer glossy.

10) The 1.5 mL centrifuge tube was removed from the magnetic rack, 17 μL of elution buffer (10 mM Tris-HCl) was added, and the tube was vortexed thoroughly and left to stand at room temperature for 5 min.

11) The 1.5 mL centrifuge tube was centrifuged briefly and left to stand on the magnetic rack; after the solution cleared (about 5 min), 15 μL of supernatant was carefully transferred to a 200 μL PCR tube without touching the beads.

12) The product of step 11) was used for the subsequent library enrichment or stored at -20 ℃.

1.7 Library enrichment and purification

1) NEBNext Multiplex Oligos for Illumina (Dual Index Primers, NEB #E7600) and NEBNext Ultra II Q5 Master Mix were thawed and mixed, and the reaction mixture was prepared in PCR tubes according to the following reaction system:

2) Each reaction mixture was mixed by pipetting and centrifuged briefly.

3) The PCR reaction was performed according to the following procedure:

4) After the reaction of the previous step was completed, the PCR tube was taken out and centrifuged briefly before proceeding to the magnetic bead purification below.

5) AMPure XP beads (Beckman Coulter, Inc., #A63881) were taken out of the refrigerator and equilibrated at room temperature for 30 min. 80% ethanol was prepared.

6) The beads were vortexed or inverted thoroughly to ensure adequate mixing.

7) 45 μL of XP beads was pipetted into a 1.5 mL centrifuge tube, the product of step 4) was added, and the mixture was vortexed thoroughly and incubated at room temperature for 5 min.

8) The centrifuge tube from the previous step was centrifuged briefly and placed on a magnetic rack; after the solution cleared (about 5 min), the supernatant was carefully removed.

9) With the tube kept on the magnetic rack throughout, the beads were rinsed with 200 μL of freshly prepared 80% ethanol and incubated at room temperature for 30 sec, and the supernatant was then carefully removed.

10) Step 9) was repeated, for a total of 2 rinses.

11) The 1.5 mL centrifuge tube was centrifuged briefly (30 sec) and placed on the magnetic rack until the liquid cleared; the residual liquid was aspirated, and the tube was left open to air-dry until the bead surface was no longer glossy.

12) The 1.5 mL centrifuge tube was removed from the magnetic rack, 33 μL of 0.1× TE (Thermo Fisher) was added, and the tube was vortexed thoroughly and left to stand at room temperature for 5 min.

13) The 1.5 mL centrifuge tube was centrifuged briefly and left to stand on the magnetic rack; after the solution cleared (about 5 min), 31 μL of supernatant was carefully transferred to a 1.5 mL low-bind tube (Solarbio) without touching the beads.

14) The product (library) of the previous step could be stored at -20 ℃ or used for the subsequent sequencing run.

2. Sequencing

Paired-end 150 bp sequencing of the methylation-specific amplicons was performed on an Illumina NovaSeq sequencer to obtain the sequencing results.

3. Analysis of the sequencing output data:

3.1 Raw data processing

The raw sequencer data were processed by conventional bioinformatics analysis. First, low-quality reads (low QC score, too short, too many N bases, and the like) were filtered out with fastp; the adapters at both ends of the reads, consensus sequences, and polyA/T tails were then removed to obtain the filtered, cleaned sequences to be aligned (called clean reads). The clean reads were aligned to the corresponding positions of hg19 with Bismark to obtain all amplicon read data (BAM files) for each sample.
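The filtering and alignment steps can be sketched as two command lines assembled in Python. This is only an illustration under stated assumptions: the source does not give the parameters actually used, so the flag selection, the file names, and the helper names `build_fastp_cmd` and `build_bismark_cmd` are all hypothetical, and the hg19 folder is assumed to have been prepared beforehand with `bismark_genome_preparation`.

```python
def build_fastp_cmd(r1, r2, out1, out2):
    """Assemble a paired-end fastp call (flag choice is an assumption)."""
    return [
        "fastp",
        "-i", r1, "-I", r2,          # raw read 1 / read 2
        "-o", out1, "-O", out2,      # cleaned read 1 / read 2
        "--detect_adapter_for_pe",   # auto-detect adapters for paired-end data
        "--trim_poly_x",             # trim poly-X tails (covers polyA/T)
    ]


def build_bismark_cmd(genome_dir, r1, r2, out_dir):
    """Assemble a paired-end Bismark alignment call against a prepared genome."""
    return ["bismark", "--genome", genome_dir, "-1", r1, "-2", r2, "-o", out_dir]


# Hypothetical file names, for illustration only.
fastp_cmd = build_fastp_cmd("s1_R1.fq.gz", "s1_R2.fq.gz",
                            "s1_clean_R1.fq.gz", "s1_clean_R2.fq.gz")
bismark_cmd = build_bismark_cmd("hg19_bismark/", "s1_clean_R1.fq.gz",
                                "s1_clean_R2.fq.gz", "aligned/")

# The commands are only printed here; with both tools installed they could be
# executed with subprocess.run(cmd, check=True).
print(" ".join(fastp_cmd))
print(" ".join(bismark_cmd))
```

Building the commands as lists rather than shell strings avoids quoting problems with sample names when they are later passed to `subprocess.run`.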

3.2 Extraction and normalization of the methylation reads of the 65 methylation regions

The BAM files were counted and analyzed against the 65 methylation regions (amplicon intervals) to obtain the methylation reads for each methylation region. The readcounts of the 65 methylation regions were then normalized against the number of sequencing reads (readcounts) aligned to 2 internal reference markers (typically selected from regions of housekeeping genes) to correct for differences in library size, batch-to-batch differences, and the like. The corrected readcount of each methylation region (which may also be referred to as the amplicon number) is used to assess the degree of methylation of that region and also serves as a feature for the subsequent modeling and prediction. In short, the corrected readcount of each methylation region equals its readcount before correction divided by the readcount of the internal reference marker; if there are multiple internal reference markers, it is divided by the geometric mean of the readcounts of those internal reference markers.
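The normalization rule just described can be illustrated with a short sketch. The readcount values below are invented for the example, and the function names are hypothetical; only the rule itself (division by the geometric mean of the internal reference readcounts) comes from the text.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive numbers."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs at least one value, all positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def normalize_readcounts(region_counts, reference_counts):
    """Divide each region's readcount by the geometric mean of the
    internal reference readcounts of the same sample."""
    ref = geometric_mean(reference_counts.values())
    return {region: count / ref for region, count in region_counts.items()}


# Hypothetical readcounts for one sample (not data from this document).
regions = {"chr10:90343168-90343288": 200, "chr5:92907913-92908009": 50}
references = {"SYT10": 400, "EPHA3": 100}  # geometric mean = 200

normalized = normalize_readcounts(regions, references)
for region, value in normalized.items():
    print(region, round(value, 4))
```

With a single internal reference marker the geometric mean reduces to that marker's readcount, so the same function covers both cases described in the text.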

The 2 internal reference markers used were:

Chromosome	Start	End	Gene	Strand
chr12	33560170	33560264	SYT10	-
chr3	89259052	89259151	EPHA3	+

Wherein "+" indicates the positive (forward) strand of DNA, and "-" indicates the negative (reverse) strand of DNA.

4. Modeling prediction classification of gastric cancer and healthy people

154 samples (77 gastric cancer samples and 77 healthy persons) were randomly selected from the total of 231 samples as model training data. A clustered heat map of these 154 samples was first plotted (see FIG. 1). The degree of methylation in FIG. 1 is represented by the log value of the readcount of each methylation region obtained in step 3.2 above. As FIG. 1 shows, even unsupervised clustering on the features of the 65 methylation regions separates healthy persons from tumor patients.

The 154 samples were randomly split 100 times at a ratio of 7:3. In each split, a random forest model was built on the training set and used to calculate a risk score (probability) for each sample in the test set. Comparing the risk scores with the clinical standard diagnosis gave the sensitivity (SE), specificity (SP), area under the curve (AUC), accuracy (ACC), negative predictive value (NPV), positive predictive value (PPV), and the like of the methylation combination; the AUC plots of the 100 models are shown in FIG. 2. The average risk score of each marker over the 100 runs was calculated and the 5 markers with the highest average risk scores were taken. The predictive performance (AUC) of each of these markers is shown in Table 1 below, from which it can be seen that these markers have good predictive performance.
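The repeated 7:3 splitting can be sketched as follows. This is a minimal illustration of the resampling scheme only: model fitting is omitted, the function name `stratified_splits` is hypothetical, and splitting within each class (stratification) and the fixed random seed are assumptions, since the text does not state how the splits were drawn.

```python
import random


def stratified_splits(labels, n_repeats=100, train_fraction=0.7, seed=0):
    """Yield (train_indices, test_indices) for repeated random splits,
    drawing the train fraction separately within each class."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    for _ in range(n_repeats):
        train, test = [], []
        for indices in by_class.values():
            shuffled = indices[:]
            rng.shuffle(shuffled)
            cut = round(len(shuffled) * train_fraction)
            train.extend(shuffled[:cut])
            test.extend(shuffled[cut:])
        yield sorted(train), sorted(test)


# 77 gastric cancer samples (label 1) and 77 healthy persons (label 0).
labels = [1] * 77 + [0] * 77
splits = list(stratified_splits(labels))

train_idx, test_idx = splits[0]
print(len(splits), len(train_idx), len(test_idx))
```

Under these assumptions each split places 54 samples per class in the training set and 23 per class in the test set; a classifier would be fitted on `train_idx` and scored on `test_idx` in every repeat.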

Table 1: Predictive performance (AUC) of the 5 markers with the highest average risk score in the 154 samples

Region	AUC	fixSP_SP	fixSP_SE
chr5:112073351-112073434	0.748186878	0.9	0.337662338
chr15:96895487-96895582	0.744813628	0.9	0.402597403
chr5:92907913-92908009	0.737055153	0.9	0.454545455
chr10:90343168-90343288	0.724236802	0.9	0.25974026
chr4:13524667-13524790	0.692528251	0.9	0.285714286

In the table above, fixSP_SP: specificity (SP) fixed at 0.9; fixSP_SE: sensitivity (SE) when the specificity (SP) is 0.9.
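How a sensitivity at fixed specificity might be read off a set of risk scores is sketched below. The scores are invented for the example, the function name is hypothetical, and the conventions (a sample is called positive when its score is at or above the threshold, and the lowest threshold reaching the target specificity is used) are assumptions, since the text does not specify them.

```python
def sensitivity_at_fixed_specificity(case_scores, control_scores, target_sp=0.9):
    """Best sensitivity among thresholds whose specificity is at least target_sp.

    A sample is called positive when its score is >= the threshold.
    Returns 0.0 when only a threshold above every score reaches the target.
    """
    best_se = 0.0
    for threshold in sorted(set(case_scores) | set(control_scores)):
        sp = sum(s < threshold for s in control_scores) / len(control_scores)
        if sp >= target_sp:
            se = sum(s >= threshold for s in case_scores) / len(case_scores)
            best_se = max(best_se, se)
    return best_se


# Hypothetical risk scores (not data from this document).
controls = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.80]
cases = [0.30, 0.50, 0.60, 0.70, 0.90]

fix_sp_se = sensitivity_at_fixed_specificity(cases, controls, target_sp=0.9)
print(fix_sp_se)
```

In this toy example the threshold 0.50 leaves 9 of 10 controls below it (SP = 0.9) and calls 4 of 5 cases positive.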

Example 2

For the 154 samples (77 gastric cancer samples and 77 healthy persons) in Example 1, the same 100 splits as in Example 1 were used with combinations of 5, 10, 20, 50, and 65 markers, respectively, and modeled with a random forest, wherein:

the 5marker group includes: chr5:112073351-112073434, chr5:92907913-92908009, chr10:90343168-90343288, chr15:96895487-96895582, chr4:13524667-13524790. The 10-marker, 20-marker, and 50-marker combinations were formed by randomly adding markers from the 65 differentially methylated regions to these 5 markers, specifically as follows:

the 10marker group includes: chr15:96895487-96895582, chr10:90343168-90343288, chr5:92907913-92908009, chr5:112073351-112073434, chr4:13524667-13524790, chr4:13526663-13526744, chr7:27253040-27253164, chr8:132054559-132054681, chr8:99962607-99962690, chr6:133561659-133561777;

the 20marker group includes: chr15:96895487-96895582, chr10:90343168-90343288, chr5:92907913-92908009, chr5:112073351-112073434, chr4:13524667-13524790, chr4:13526663-13526744, chr7:27253040-27253164, chr8:132054559-132054681, chr8:99962607-99962690, chr6:133561659-133561777, chr8:10588845-10588971, chr17:66596160-66596269, chr8:132053717-132053827, chr14:99697596-99697701, chr9:91605729-91605817, chr5:140871302-140871422, chr5:134376436-134376563, chr1:47899550-47899648, chr8:143532011-143532108, chr9:126771376-126771460;

The 50marker group includes: chr15, chr10, chr5, chr4, chr7, chr8, chr6, chr8, chr17, chr8, chr14, chr9, chr5, chr1, chr8, chr9, chr17, chr2, chr7, chr4, chr13, chr2, chr7, chr15, chr5, chr20, chr4, chr14, chr12, chr7, chr10, chr2, chr22, chr5, chr6, chr7, chr11, chr8, chr7, chr6, chr4, chr13, chr3, chr5, chr10, chr 15.

The 65marker group includes: chr15: chr10: the composition is prepared from chr5, chr4, chr15, chr5, chr20, chr4, chr6, chr8, chr17, chr8, chr14, chr9, chr5, chr1, chr8, chr9, chr17, chr2, chr7, chr4, chr13, chr2, chr7, chr15, chr5, chr20, chr4, chr14, chr12, chr7, chr10, chr2, chr22, chr5, chr6, chr7, r11, chr8, chr7, chr6, r4, r13, r3, r5, r10, r4, chr10, chr4, chr8, chr10, chr4, chr16, chr4, chr8, chr4, chr16, chr10, chr8, and chr4 chr2, 200328933-200329042.

For these marker combinations, the modeled average AUC, sensitivity (SE), specificity (SP), accuracy (ACC), positive predictive value (PPV), and negative predictive value (NPV) are shown in Table 2 below. The corresponding receiver operating characteristic (ROC) curves are shown in FIGS. 2-6. These results were obtained by performing multiplex PCR as described above with the primers provided in Table 7, followed by analysis of the sequencing data. The data show that combinations of different numbers of methylation regions can be used as markers to assist gastric cancer detection.

Table 2: combinations of methylated regions can be used as markers to aid gastric cancer detection

Marker set	Average AUC	Average SE	Average SP	Average ACC	Average PPV	Average NPV
5marker group	0.75088	0.70725	0.80977	0.75826	0.81428	0.75074
10marker group	0.77559	0.70313	0.85371	0.77832	0.84696	0.75454
20marker group	0.7953	0.7298	0.85585	0.79265	0.84669	0.77135
50marker group	0.82995	0.75077	0.88807	0.81903	0.88323	0.79116
65marker group	0.82985	0.75047	0.89141	0.82078	0.8869	0.79132

Wherein AUC is area under ROC curve, SE is sensitivity, SP is specificity, ACC is accuracy, PPV is positive predictive value, NPV is negative predictive value.
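The definitions above can be made concrete with a short sketch. The counts and scores are invented for illustration, and the function names are hypothetical; the AUC is computed here through its rank interpretation (the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half), which equals the area under the ROC curve.

```python
def classification_metrics(tp, fp, tn, fn):
    """SE, SP, ACC, PPV and NPV from confusion-matrix counts."""
    return {
        "SE": tp / (tp + fn),                    # true positives among cases
        "SP": tn / (tn + fp),                    # true negatives among controls
        "ACC": (tp + tn) / (tp + fp + tn + fn),  # all correct calls
        "PPV": tp / (tp + fp),                   # cases among positive calls
        "NPV": tn / (tn + fn),                   # controls among negative calls
    }


def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    wins = 0.0
    for c in case_scores:
        for h in control_scores:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical test-set outcome: 23 cases and 23 controls.
metrics = classification_metrics(tp=17, fp=3, tn=20, fn=6)
example_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])

print({name: round(value, 4) for name, value in metrics.items()})
print(round(example_auc, 4))
```

Averaging these quantities over the repeated splits would give values of the kind reported in Table 2.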

Example 3

Forty samples (20 healthy persons and 20 tumor patients) were selected from Example 1, with the two groups matched for age and sex to exclude the influence of these factors on the experiment, especially any differential influence on the methylation signals. The 40 samples were then split by 5-fold cross-validation, repeated 10 times, for a total of 100 splits. In each split, a random forest model was built on the training set and used to calculate the risk score of each sample in the test set; comparing the risk scores with the clinical standard diagnosis gave the sensitivity (SE), specificity (SP), area under the curve (AUC), accuracy (ACC), negative predictive value (NPV), positive predictive value (PPV), and the like of the methylation combination. The AUC plots of the 100 models are shown in FIG. 7. The average risk score of each marker over the 100 runs was calculated and the markers with the highest average risk scores were obtained. The predictive performance (AUC) of each of these markers is shown in Table 3 below, from which it can be seen that these markers have good predictive performance.

Table 3: predictive performance AUC of 5 markers with highest mean risk score over 40 samples

Region(s)	AUC	youden_SP	youden_SE	fixSP_SP	fixSP_SE
chr10:90343168-90343288	0.84	0.9	0.8	0.9	0.8
chr5:92907913-92908009	0.8425	0.9	0.75	0.9	0.75
chr9:126771376-126771460	0.78625	0.9	0.75	0.9	0.75
chr15:96895487-96895582	0.7775	0.75	0.75	0.9	0.55
chr6:133561659-133561777	0.7425	0.85	0.75	0.9	0.2

Wherein, AUC is the area enclosed by the ROC curve and the coordinate axes; its value lies between 0 and 1, and the larger the value, the better the classification performance of the model.

youden_SP: specificity (SP) at the optimal classification point, specifically the SP when the value of SP + SE is at its maximum;

youden_SE: sensitivity (SE) at the optimal classification point, specifically the SE when the value of SP + SE is at its maximum;

fixSP_SP: specificity (SP) fixed at 0.9; fixSP_SE: sensitivity (SE) when the specificity (SP) is 0.9.
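The column definitions of Table 3 can be illustrated with a short Python sketch that computes the AUC, the cut-off maximising SP + SE, and the sensitivity at a specificity of at least 0.9 from per-sample risk scores. All function names, the "score >= threshold means positive" convention and the toy scores are assumptions for illustration; the patent does not specify how these quantities were computed.

```python
def split_scores(scores, labels):
    """Separate scores of positives (label 1) and negatives (label 0)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return pos, neg


def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos, neg = split_scores(scores, labels)
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def roc_points(scores, labels):
    """Return (threshold, SE, SP) for every distinct score used as a cut-off."""
    pos, neg = split_scores(scores, labels)
    points = []
    for thr in sorted(set(scores), reverse=True):
        se = sum(s >= thr for s in pos) / len(pos)  # positives called positive
        sp = sum(s < thr for s in neg) / len(neg)   # negatives called negative
        points.append((thr, se, sp))
    return points


def youden_point(points):
    """Cut-off with maximal SE + SP (equivalently maximal Youden index SE + SP - 1)."""
    return max(points, key=lambda t: t[1] + t[2])


def best_at_fixed_specificity(points, min_sp=0.9):
    """Cut-off with the highest SE among those whose SP is at least min_sp."""
    eligible = [p for p in points if p[2] >= min_sp]
    return max(eligible, key=lambda t: t[1]) if eligible else None


# Toy risk scores for illustration only (hypothetical, not patent data).
scores = [0.9, 0.8, 0.6, 0.35, 0.7, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
points = roc_points(scores, labels)
```

Here `youden_point(points)` corresponds to the youden_SE and youden_SP columns, and `best_at_fixed_specificity(points)` to the fixSP_SE and fixSP_SP columns.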

Example 4

For the 40 samples of Example 3 (20 gastric cancer samples and 20 samples from healthy persons), the same 100 cuts as in Example 3 were used, and Random Forest models were built with combinations of 5, 10, 20, 50 and 65 markers, respectively, wherein:

the 5marker group includes: chr10:90343168-90343288, chr5:92907913-92908009, chr9:126771376-126771460, chr15:96895487-96895582, chr6:133561659-133561777. The 10-, 20- and 50-marker combinations were obtained by randomly adding markers from the 65 differential methylation regions to these 5 markers, specifically as follows:

The 10marker group includes: chr10:90343168-90343288, chr5:92907913-92908009, chr9:126771376-126771460, chr15:96895487-96895582, chr6:133561659-133561777, chr8:10588845-10588971, chr5:112073351-112073434, chr8:70984561-70984656, chr7:27253040-27253164, chr7:27196279-27196357;

the 20marker group includes: chr10:90343168-90343288, chr5:92907913-92908009, chr9:126771376-126771460, chr15:96895487-96895582, chr6:133561659-133561777, chr8:10588845-10588971, chr5:112073351-112073434, chr8:70984561-70984656, chr7:27253040-27253164, chr7:27196279-27196357, chr11:125037244-125037331, chr5:140871302-140871422, chr14:103512082-103512178, chr2:213401681-213401784, chr20:2781337-2781445, chr12:115109503-115109616, chr17:46673962-46674085, chr7:150038594-150038715, chr8:132053717-132053827, chr5:32714407-32714525;

the 50marker group includes: chr10, chr5, chr9, chr15, chr6, chr8, chr5, chr8, chr7, chr11, chr5, chr14, chr2, chr20, chr12, chr17, chr7, chr8, chr5, chr8, chr3, chr4, chr8, chr13, chr7, chr10, chr4, chr8, chr4, chr3, chr2, chr10, chr5, chr22, chr9, chr8, chr2, chr10, chr8, chr6, chr11, chr8, chr15, chr9, chr5, chr1, chr 17.

The 65marker group includes: chr10: chr5: chr9: the chr15, chr6, chr8, chr5, chr8, chr7, chr11, chr5, chr14, chr2, chr20, chr12, chr17, chr7, chr8, chr5, chr8, chr4, chr8, chr4, chr3, chr2, chr10, chr5, chr22, chr9, chr8, chr10, chr8, r6, chr11, chr8, chr15, r9, r5, r4, r1, r17, r7, r4, chr2, chr10, chr8, chr10, chr8, chr4, chr8, chr10, chr8, chr4, chr22, chr8, chr9, chr8, and chr8 chr16:82661134-82661236.
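The way the larger groups are built, by randomly adding regions to the 5 base markers so that each group contains the previous one, can be sketched as follows. This is a minimal illustration: the function name and the random seed are hypothetical, the pool is limited to the 15 additional regions listed for the 20marker group, and a fresh random draw will not reproduce the specific groups listed above.

```python
import random

# The 5 base markers, as listed for the 5marker group.
BASE_PANEL = [
    "chr10:90343168-90343288",
    "chr5:92907913-92908009",
    "chr9:126771376-126771460",
    "chr15:96895487-96895582",
    "chr6:133561659-133561777",
]

# Additional regions taken from the 20marker group (a subset of the 65 regions).
EXTRA_POOL = [
    "chr8:10588845-10588971", "chr5:112073351-112073434", "chr8:70984561-70984656",
    "chr7:27253040-27253164", "chr7:27196279-27196357", "chr11:125037244-125037331",
    "chr5:140871302-140871422", "chr14:103512082-103512178", "chr2:213401681-213401784",
    "chr20:2781337-2781445", "chr12:115109503-115109616", "chr17:46673962-46674085",
    "chr7:150038594-150038715", "chr8:132053717-132053827", "chr5:32714407-32714525",
]


def nested_panels(base, pool, sizes, seed=0):
    """Build panels of the requested sizes, each one extending the base panel
    with randomly ordered regions from the pool (seed is an assumed parameter)."""
    rng = random.Random(seed)
    extras = [region for region in pool if region not in base]
    rng.shuffle(extras)
    ordered = list(base) + extras
    panels = {}
    for size in sorted(sizes):
        if size < len(base) or size > len(ordered):
            raise ValueError(f"panel size {size} is outside the available range")
        panels[size] = ordered[:size]  # a prefix, so smaller panels are nested in larger ones
    return panels


panels = nested_panels(BASE_PANEL, EXTRA_POOL, sizes=(5, 10, 20))
```

Taking prefixes of one shuffled ordering guarantees that the 10-marker panel is contained in the 20-marker panel, which matches the nesting visible in the listed 10marker and 20marker groups.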

With these marker combinations, the average AUC, sensitivity (SE), specificity (SP), accuracy (ACC), positive predictive value (PPV) and negative predictive value (NPV) of the models are shown in Table 4 below. The corresponding receiver operating characteristic (ROC) curves are shown in FIGS. 7-11. These results were obtained by multiplex PCR, performed as described above with the primers provided in Table 7, followed by sequencing and analysis of the resulting data. The data indicate that combinations of different numbers of methylated regions can serve as markers to aid gastric cancer detection.

Table 4: combinations of methylated regions can be used as markers to aid gastric cancer detection

Marker set	Average AUC	Average SE	Average SP	Average ACC	Average PPV	Average NPV
5marker group	0.7883	0.8525	0.775	0.81375	0.83114	0.91414
10marker group	0.84567	0.9125	0.795	0.85375	0.85424	0.93069
20marker group	0.86417	0.905	0.84	0.8725	0.87821	0.92753
50marker group	0.84413	0.885	0.8325	0.85875	0.87396	0.9092
65marker group	0.84945	0.8825	0.8475	0.865	0.88462	0.90853
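The repeated cross-validation cuts used in Examples 3 and 4 can be illustrated with a standard-library Python sketch that generates class-balanced train/test splits. It covers only the splitting step; the Random Forest modeling is not shown. The function name, the seed and the default parameters are assumptions: with 5 folds and 10 repeats the generator yields 50 train/test pairs, so `n_repeats` would need to be adjusted to match whatever schedule produced the 100 cuts reported in the text.

```python
import random


def repeated_stratified_kfold(labels, n_splits=5, n_repeats=10, seed=0):
    """Yield (train_idx, test_idx) pairs; every repeat reshuffles the samples
    and distributes each class evenly over the folds."""
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    for _ in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        for members in by_class.values():
            shuffled = members[:]
            rng.shuffle(shuffled)
            # Deal the shuffled class members round-robin into the folds.
            for pos, idx in enumerate(shuffled):
                folds[pos % n_splits].append(idx)
        for k in range(n_splits):
            test = sorted(folds[k])
            train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
            yield train, test


# 20 gastric cancer samples (1) and 20 healthy samples (0), as in Example 3.
labels = [1] * 20 + [0] * 20
splits = list(repeated_stratified_kfold(labels))
```

With 40 balanced samples, each test fold holds 8 samples (4 per class) and each training set holds the remaining 32.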

Example 5

For the 154 samples of Example 1 (77 gastric cancer samples and 77 samples from healthy persons), the same 100 cuts as in Example 1 were used, and Random Forest models were built with the 3 core markers and any combination thereof, wherein:

the 3marker group includes: chr10:90343168-90343288, chr5:92907913-92908009, chr15:96895487-96895582.
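Since Example 5 evaluates the 3 core markers and any combination thereof, the set of candidate combinations can be enumerated explicitly: 3 markers give 2^3 - 1 = 7 non-empty subsets. The sketch below is illustrative only, and the function name is hypothetical.

```python
from itertools import combinations

# The 3 core markers, as listed for the 3marker group.
CORE_MARKERS = [
    "chr10:90343168-90343288",
    "chr5:92907913-92908009",
    "chr15:96895487-96895582",
]


def all_marker_combinations(markers):
    """Return every non-empty combination of the given markers, smallest first."""
    return [
        combo
        for size in range(1, len(markers) + 1)
        for combo in combinations(markers, size)
    ]


combos = all_marker_combinations(CORE_MARKERS)
```

Each tuple in `combos` could then serve as the feature set for one round of modeling.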

With these marker combinations, the average AUC, sensitivity (SE), specificity (SP), accuracy (ACC), positive predictive value (PPV) and negative predictive value (NPV) of the models are shown in Table 5 below. The receiver operating characteristic (ROC) curve corresponding to the 3 markers is shown in FIG. 12. These results were obtained by multiplex PCR, performed as described above with the primers provided in Table 7, followed by sequencing and analysis of the resulting data. The data indicate that combinations of different numbers of methylated regions can serve as markers to aid gastric cancer detection.

Table 5:

Table 6:

Wherein, intronic denotes an intron, intergenic denotes an intergenic sequence, exonic denotes an exon, upstream denotes an upstream fragment, downstream denotes a downstream fragment, and UTR5 denotes the 5' untranslated region (5' UTR).

Table 7:

In the above table, "+" indicates the positive (forward) strand of DNA; "-" indicates the negative (reverse) strand of DNA.

Sequence listing

<110> Guangzhou AnchorDx Medical Co., Ltd.

<120> methylation biomarker for diagnosing gastric cancer, kit and use thereof

<130> 2253424IP

<160> 130

<170> PatentIn version 3.5

<210> 1

<211> 24

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 1 Forward primer

<400> 1

aaaacgaaaa ccgaaatccc gaat 24

<210> 2

<211> 24

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 1 reverse primer

<400> 2

tttggcgaag gttatagggt tagg 24

<210> 3

<211> 29

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 2 Forward primer

<400> 3

ttcgaaatag ttgatcgttt aggtttagg 29

<210> 4

<211> 24

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 2 reverse primer

<400> 4

aaattactac cccgcaaatc tccc 24

<210> 5

<211> 28

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 3 Forward primer

<400> 5

atttcgtttt attcgtcgtt gtagttta 28

<210> 6

<211> 29

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 3 reverse primer

<400> 6

aattcatata cataatctat accgcgcta 29

<210> 7

<211> 26

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 4 Forward primer

<400> 7

gtgcgggtta gattattgga tagttg 26

<210> 8

<211> 30

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 4 reverse primer

<400> 8

ctaaaccgaa aattctaaat ttcaacgaaa 30

<210> 9

<211> 23

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 5 Forward primer

<400> 9

ttttaagtcg ttgggtcgtc gtt 23

<210> 10

<211> 26

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 5 reverse primer

<400> 10

ctaaaacact tcaatctacg ccgaat 26

<210> 11

<211> 28

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 6 Forward primer

<400> 11

atatttacga ttttggttgt gttgagga 28

<210> 12

<211> 27

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 6 reverse primer

<400> 12

cacttttacg caaaacgatt ctcaaaa 27

<210> 13

<211> 24

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 7 Forward primer

<400> 13

taagttcgtg ttggtgtgat ttcg 24

<210> 14

<211> 26

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 7 reverse primer

<400> 14

cgtttatact aaccgtttac accctc 26

<210> 15

<211> 24

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 8 Forward primer

<400> 15

aacctcattc cttcgccttt aacc 24

<210> 16

<211> 21

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 8 reverse primer

<400> 16

gtgtacggtc gcgggtatag g 21

<210> 17

<211> 25

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 9 Forward primer

<400> 17

tatatattta tacgggcgcg tgttt 25

<210> 18

<211> 27

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 9 reverse primer

<400> 18

atcatcaaat aaccgacgaa aacaaaa 27

<210> 19

<211> 27

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 10 Forward primer

<400> 19

agtgtatggt agcgtcgtat ttatttg 27

<210> 20

<211> 23

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 10 reverse primer

<400> 20

aaactcaata cctcgcctta ccg 23

<210> 21

<211> 27

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 11 Forward primer

<400> 21

ccaactaaat aaacgatcac ctcccta 27

<210> 22

<211> 23

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 11 reverse primer

<400> 22

agcgaatgga tggtgtttgg ttt 23

<210> 23

<211> 19

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 12 Forward primer

<400> 23

gagtggtttc ggcggttcg 19

<210> 24

<211> 25

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 12 reverse primer

<400> 24

aaactctcac gaattccaaa cctac 25

<210> 25

<211> 26

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 13 Forward primer

<400> 25

cggattagta tattcgcgga tttagc 26

<210> 26

<211> 27

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 13 reverse primer

<400> 26

ctccgtacta aaatacgcta tcaacac 27

<210> 27

<211> 25

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 14 Forward primer

<400> 27

ttcgttgtgc ggttgtaata gaaat 25

<210> 28

<211> 26

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 14 reverse primer

<400> 28

cttaaaatat attccccacc cgaacc 26

<210> 29

<211> 30

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 15 Forward primer

<400> 29

gggaagattt agtattttgt atatgattcg 30

<210> 30

<211> 21

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 15 reverse primer

<400> 30

acctataacg ctacgctacc c 21

<210> 31

<211> 25

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 16 Forward primer

<400> 31

cgcacaaaac tcgaaacaaa caaaa 25

<210> 32

<211> 30

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 16 reverse primer

<400> 32

ttattgttaa tgttaggttt ataggcgagg 30

<210> 33

<211> 22

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 17 Forward primer

<400> 33

gttgtttgaa cggcgggata gt 22

<210> 34

<211> 22

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 17 reverse primer

<400> 34

accgaacatt cgccttcttt cc 22

<210> 35

<211> 30

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 18 Forward primer

<400> 35

gtccgattaa aacgattcta ttctacaaac 30

<210> 36

<211> 24

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 18 reverse primer

<400> 36

ttcgtaatga tgtagacggg gttg 24

<210> 37

<211> 20

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 19 Forward primer

<400> 37

cctccgaccc tacccgctaa 20

<210> 38

<211> 22

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 19 reverse primer

<400> 38

gcggtagttt aggtttcgag cg 22

<210> 39

<211> 28

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 20 Forward primer

<400> 39

tttgttatag gttagttgtg agcgaatt 28

<210> 40

<211> 28

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 20 reverse primer

<400> 40

cgacaaacac ttcctaaaat ccaaacta 28

<210> 41

<211> 24

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 21 Forward primer

<400> 41

tttgagtagt cgaagattcg ggtg 24

<210> 42

<211> 28

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 21 reverse primer

<400> 42

ctctaaacct aactacaact ccaaaacg 28

<210> 43

<211> 22

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 22 Forward primer

<400> 43

tcccgctatc caaaactctt cg 22

<210> 44

<211> 22

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 22 reverse primer

<400> 44

ggtttaggga gtcggtttag cg 22

<210> 45

<211> 28

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 23 Forward primer

<400> 45

cgtagaaggt ttagatacga aatgttat 28

<210> 46

<211> 23

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 23 reverse primer

<400> 46

atataaacat ccgtcccaaa cgc 23

<210> 47

<211> 24

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 24 Forward primer

<400> 47

aggttaagaa aggttcgttt agcg 24

<210> 48

<211> 24

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 24 reverse primer

<400> 48

cccgcgaact ttatctacac ttaa 24

<210> 49

<211> 21

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 25 Forward primer

<400> 49

gggcggtatt ggagaagtcg t 21

<210> 50

<211> 24

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 25 reverse primer

<400> 50

cgcaaacgat ccaataatcc cgaa 24

<210> 51

<211> 19

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 26 Forward primer

<400> 51

aaaatcgctc ccgccctct 19

<210> 52

<211> 22

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 26 reverse primer

<400> 52

tttaggttgt acgtgggttc gg 22

<210> 53

<211> 21

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 27 Forward primer

<400> 53

ggatttaggg aagcgtcgtc g 21

<210> 54

<211> 25

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 27 reverse primer

<400> 54

ccgcgaatat ctaaccactc taaca 25

<210> 55

<211> 24

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 28 Forward primer

<400> 55

cgccatatct tacaaaacct accg 24

<210> 56

<211> 22

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 28 reverse primer

<400> 56

cgcgtcgttt attagggtcg tt 22

<210> 57

<211> 22

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 29 Forward primer

<400> 57

aaacgttcct cattccgcct ac 22

<210> 58

<211> 23

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 29 reverse primer

<400> 58

attttagttg cgggacggtt agg 23

<210> 59

<211> 22

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 30 Forward primer

<400> 59

accgccgaca ctacaaacaa aa 22

<210> 60

<211> 21

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 30 reverse primer

<400> 60

ttcgagagtc ggattgggac g 21

<210> 61

<211> 22

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 31 Forward primer

<400> 61

cgggaggagg tataggacgt ag 22

<210> 62

<211> 22

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 31 reverse primer

<400> 62

cgcgaaaacc aaactcgaac aa 22

<210> 63

<211> 19

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 32 Forward primer

<400> 63

cgagggagtt cggggtttt 19

<210> 64

<211> 24

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 32 reverse primer

<400> 64

atccaattca aactaccgta cact 24

<210> 65

<211> 28

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 33 Forward primer

<400> 65

cgaaatacaa aatctacaaa acaaacgc 28

<210> 66

<211> 25

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 33 reverse primer

<400> 66

tcgggtttta ttggttgtag tcgta 25

<210> 67

<211> 19

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 34 Forward primer

<400> 67

ctccaacccc gcaaactcg 19

<210> 68

<211> 22

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 34 reverse primer

<400> 68

tttggggaag gttgtatggt cg 22

<210> 69

<211> 30

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 35 Forward primer

<400> 69

cctatataaa taataccctt atcgctacct 30

<210> 70

<211> 26

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 35 reverse primer

<400> 70

ggttcgaatt tagtttgttt ggtttt 26

<210> 71

<211> 30

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 36 Forward primer

<400> 71

attcgaaaac gacaacgata tattataaca 30

<210> 72

<211> 27

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 36 reverse primer

<400> 72

gtgtataatt agcgttttat tcggtcg 27

<210> 73

<211> 27

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 37 Forward primer

<400> 73

gagagtgttg ttttcgtttg gagttta 27

<210> 74

<211> 20

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 37 reverse primer

<400> 74

gaatcccctc cgacgcctac 20

<210> 75

<211> 25

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 38 Forward primer

<400> 75

tctccaactc caacgtctaa taacg 25

<210> 76

<211> 21

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 38 reverse primer

<400> 76

ggttttgggt tcggtgatcg t 21

<210> 77

<211> 26

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 39 Forward primer

<400> 77

tgatttcgtg tagagttgag agttgg 26

<210> 78

<211> 30

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 39 reverse primer

<400> 78

ctcacgttat ccttaaaact cgtaacaata 30

<210> 79

<211> 23

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 40 Forward primer

<400> 79

cgttgtagga gttttgcgtt agg 23

<210> 80

<211> 23

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 40 reverse primer

<400> 80

acaaataacc actaccgcta cgt 23

<210> 81

<211> 30

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 41 Forward primer

<400> 81

caaatcgaac tataacctac ctttctcaaa 30

<210> 82

<211> 21

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 41 reverse primer

<400> 82

ggttcggttg tcgcgttgtt a 21

<210> 83

<211> 28

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 42 Forward primer

<400> 83

tttggaattc ggttgtaaat agtaggag 28

<210> 84

<211> 26

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 42 reverse primer

<400> 84

tctaataaat tccccgcaaa acctct 26

<210> 85

<211> 24

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 43 Forward primer

<400> 85

ctaaatcccc accttactcc aacg 24

<210> 86

<211> 30

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 43 reverse primer

<400> 86

atgttagtgt tagttgttag gtaatagacg 30

<210> 87

<211> 23

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 44 Forward primer

<400> 87

aacgaatact tctctccgca acc 23

<210> 88

<211> 19

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 44 reverse primer

<400> 88

cgggtagggg tacggtgtt 19

<210> 89

<211> 22

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 45 Forward primer

<400> 89

cggagcgaga ttatttgtga gg 22

<210> 90

<211> 18

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 45 reverse primer

<400> 90

gaaactcgcc gcaacacg 18

<210> 91

<211> 27

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 46 Forward primer

<400> 91

tttaacgagt gttttattaa ggtgtcg 27

<210> 92

<211> 21

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 46 reverse primer

<400> 92

aaaccgaatc caacgtccaa t 21

<210> 93

<211> 22

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 47 Forward primer

<400> 93

aacgttagtt ttgttgcgag gc 22

<210> 94

<211> 26

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 47 reverse primer

<400> 94

aaaaccgaaa ctatctatct tctcgc 26

<210> 95

<211> 22

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 48 Forward primer

<400> 95

gggttggatt agttcggtcg tg 22

<210> 96

<211> 18

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 48 reverse primer

<400> 96

cgcaacgcga cgatacga 18

<210> 97

<211> 29

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 49 Forward primer

<400> 97

cgaaatcata aacgacttta acgaatacg 29

<210> 98

<211> 29

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 49 reverse primer

<400> 98

tgttagcgaa gttagacggg gttatatag 29

<210> 99

<211> 21

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 50 Forward primer

<400> 99

cgcactccgc aataaaacac a 21

<210> 100

<211> 23

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 50 reverse primer

<400> 100

agggtatatt ttcgaggggt acg 23

<210> 101

<211> 24

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 51 Forward primer

<400> 101

ggttaatcgg aagggtaagt ttcg 24

<210> 102

<211> 24

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 51 reverse primer

<400> 102

cgcttcccga actctaatta atct 24

<210> 103

<211> 29

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 52 Forward primer

<400> 103

ttaactaact taactacgaa acaaacgcg 29

<210> 104

<211> 19

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 52 reverse primer

<400> 104

ggttacgtgg cgttggtcg 19

<210> 105

<211> 27

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 53 Forward primer

<400> 105

cgatgtttta ttttagagat tggaacg 27

<210> 106

<211> 19

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 53 reverse primer

<400> 106

taaacctccg ctaccccga 19

<210> 107

<211> 35

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 54 Forward primer

<400> 107

cctttaccta aatatccgtc tattctatat tcttt 35

<210> 108

<211> 24

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 54 reverse primer

<400> 108

gttggtgttg tcgatttcgt tgtc 24

<210> 109

<211> 18

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 55 Forward primer

<400> 109

gaaccgcgat tttcccca 18

<210> 110

<211> 21

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 55 reverse primer

<400> 110

aagtcggtgg tgtttgttcg g 21

<210> 111

<211> 26

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 56 Forward primer

<400> 111

aatcgacgat cctaaaactt aaaacg 26

<210> 112

<211> 25

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 56 reverse primer

<400> 112

cgtttatagt tcgttgttcg tttcg 25

<210> 113

<211> 21

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 57 Forward primer

<400> 113

ttttgcgagg ggattatcgt t 21

<210> 114

<211> 21

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 57 reverse primer

<400> 114

ctccgaacga aaccctaaac c 21

<210> 115

<211> 26

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 58 Forward primer

<400> 115

tgtttttcga ttggaaatgt tttacg 26

<210> 116

<211> 25

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 58 reverse primer

<400> 116

aaacgtaaca catctcaaca ccgaa 25

<210> 117

<211> 23

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 59 Forward primer

<400> 117

aataactaaa cgcacccgaa caa 23

<210> 118

<211> 24

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 59 reverse primer

<400> 118

tttagagtcg gggtttgtaa atcg 24

<210> 119

<211> 31

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 60 Forward primer

<400> 119

cgaaatttac aaacttacac ctaaaatcaa c 31

<210> 120

<211> 27

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 60 reverse primer

<400> 120

tttcggattt tcgagtttta ttaggtt 27

<210> 121

<211> 28

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 61 Forward primer

<400> 121

aaaacgaata actttactaa tctccccc 28

<210> 122

<211> 22

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 61 reverse primer

<400> 122

tgtttgttta gtttcggcgg at 22

<210> 123

<211> 19

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 62 Forward primer

<400> 123

cggatgtggt ttcgggatc 19

<210> 124

<211> 22

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 62 reverse primer

<400> 124

gacgactcga actcaattct cc 22

<210> 125

<211> 24

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 63 Forward primer

<400> 125

gcggagttac gttagtcgta ttcg 24

<210> 126

<211> 23

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 63 reverse primer

<400> 126

tcctaatacc gaaacgtcac tcc 23

<210> 127

<211> 23

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 64 Forward primer

<400> 127

ttggttaagg cgtatttgaa cga 23

<210> 128

<211> 26

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 64 reverse primer

<400> 128

aatactataa ccgactatcc ccaacg 26

<210> 129

<211> 18

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 65 Forward primer

<400> 129

gaaacgcacc aatcccca 18

<210> 130

<211> 25

<212> DNA

<213> artificial sequence

<220>

<223> differential methylation region 65 reverse primer

<400> 130

tgtgagatta gaggagggta gatcg 25
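The entries of the sequence listing above follow a regular tag layout (<210> sequence number, <211> declared length, <400> residues), so the declared lengths can be checked mechanically. The following Python sketch is an illustrative consistency check only; the function name is hypothetical, and the sample text reproduces just the first two entries of the listing.

```python
import re


def parse_sequence_listing(text):
    """Return {sequence number: (declared length, residues)} from tagged listing text."""
    entries = {}
    seq_id = declared = None
    in_residues = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        tag = re.match(r"<(\d+)>\s*(.*)", line)
        if tag:
            code, value = tag.group(1), tag.group(2)
            if code == "210":
                seq_id = int(value)
            elif code == "211":
                declared = int(value)
            # Residue lines follow the <400> tag until the next tag appears.
            in_residues = code == "400"
        elif in_residues:
            # Drop spaces and the trailing running count, keeping only bases.
            bases = re.sub(r"[^acgt]", "", line.lower())
            previous = entries.get(seq_id, (declared, ""))[1]
            entries[seq_id] = (declared, previous + bases)
    return entries


SAMPLE = """
<210> 1
<211> 24
<212> DNA
<400> 1
aaaacgaaaa ccgaaatccc gaat 24
<210> 2
<211> 24
<212> DNA
<400> 2
tttggcgaag gttatagggt tagg 24
"""

entries = parse_sequence_listing(SAMPLE)
# Sequence numbers whose residue count disagrees with the declared <211> length.
mismatches = [sid for sid, (length, seq) in entries.items() if length != len(seq)]
```

An empty `mismatches` list means every parsed entry has as many residues as its <211> line declares.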

Claims

1. A methylation biomarker for diagnosing gastric cancer, wherein the methylation biomarker comprises any one or any combination of the differentially methylated regions selected from the group consisting of:

2. The methylation biomarker for diagnosing gastric cancer according to claim 1, wherein the differential methylation region comprises at least one of chr10:90343168-90343288, chr5:92907913-92908009 and chr15:96895487-96895582.

3. The methylation biomarker for diagnosing gastric cancer according to claim 2, wherein the differential methylation region further comprises at least one of chr5:112073351-112073434 and chr4:13524667-13524790.

4. The methylation biomarker for diagnosing gastric cancer according to claim 2, wherein the differential methylation region further comprises at least one of chr9:126771376-126771460 and chr6:133561659-133561777.

5. The methylation biomarker for diagnosing gastric cancer according to any one of claims 1 to 4, wherein the differential methylation region comprises chr8: chr4: chr8: the composition comprises chr10, chr4, chr13, chr2, chr4, chr5, chr7, chr4, chr11, chr4, chr17, chr8, chr7, chr6, chr5, chr1, chr7, chr2, chr8, chr5, chr8, chr10, chr7, chr13, chr15, chr12, chr8, chr11, chr4, chr11, chr9, chr17, chr14, chr8, chr5, chr7, chr14, chr15, chr7, chr3, chr10, chr2, chr5, chr15, chr8, chr10, chr2, chr5, chr22, chr10, chr5, chr15, chr14, chr8, chr5, chr14, chr5, and chr14 chr2:176994614-176994699, chr9:23822006-23822117, chr9:126771376-126771460, chr2:200328933-200329042, chr4:111544276-111544375, chr3:15255 32714407-32714525-152553708, chr5:32714407-32714525.

6. The methylation biomarker for diagnosing gastric cancer according to any one of claims 1 to 5, wherein the gastric cancer is selected from stage I, stage II, stage III or stage IV gastric cancer; and/or,

the gastric cancer is gastric cancer from a subject, and the subject is a mammal; preferably, the mammal is a human; and/or,

a difference in the methylation level of the methylation biomarker in the test sample relative to the methylation level of the methylation biomarker in a sample of a subject not having gastric cancer indicates the presence of gastric cancer in the subject to which the test sample corresponds.

7. The methylation biomarker for diagnosing gastric cancer according to any one of claims 1 to 6, wherein the differential methylation region is a differential methylation region present in cfDNA.

8. Any one of the following (i) to (ii):

(i) Use of a methylation biomarker of any of claims 1 to 7 in the manufacture of a reagent or kit for diagnosing gastric cancer;

(ii) Use of a reagent for determining the methylation level of a methylation biomarker according to any of claims 1 to 7 in the manufacture of a reagent or kit for diagnosing gastric cancer.

9. A kit for diagnosing gastric cancer, wherein the kit comprises reagents for detecting the methylation level of the methylation biomarker of any of claims 1 to 7 in a test sample;

optionally, the kit detects the methylation level of the methylation biomarker of any of claims 1 to 7 using one or more of polymerase chain reaction techniques, in situ hybridization techniques, enzymatic mutation detection techniques, chemical cleavage of mismatch techniques, mass spectrometry techniques, and gene sequencing techniques;

further, the reagents in the kit comprise primers and/or probes; wherein,

the primer amplifies a sequence comprising the methylation biomarker of any of claims 1 to 7;

the probe hybridizes at least in part to a sequence of the methylation biomarker of any of claims 1 to 7.

10. The kit for diagnosing gastric cancer according to claim 9, wherein the sample to be tested is selected from one or more of tissue, whole blood, plasma, serum, pleural effusion, ascites, amniotic fluid, saliva, bone marrow, exfoliated urine cells, urinary sediment and urine supernatant;

preferably, the sample to be tested is whole blood, plasma or serum.

11. A system for diagnosing gastric cancer, wherein the system comprises a detection device, a computing device, and an output device;

wherein, when the methylation level of the methylation biomarker of any one of claims 1 to 7 in the sample differs from the methylation level of the methylation biomarker measured in a sample of a subject not having gastric cancer, the presence of gastric cancer in the subject to which the sample corresponds is determined.

12. Use of at least one set of primers selected from the group consisting of the following combinations for detecting the degree of methylation of the methylation biomarker of any of claims 1 to 7 in the manufacture of a reagent or kit for diagnosing gastric cancer: