CN112980949B - SNP marker for identifying nasopharyngeal carcinoma high-risk group, kit and application thereof - Google Patents

SNP marker for identifying nasopharyngeal carcinoma high-risk group, kit and application thereof

Info

Publication number
CN112980949B
Authority
CN
China
Prior art keywords
seq
sequence
type
homozygous
nasopharyngeal carcinoma
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011501223.8A
Other languages
Chinese (zh)
Other versions
CN112980949A (en
Inventor
贾卫华
何永巧
王曈旻
曾益新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Sun Yat Sen University Cancer Center
Original Assignee
Sun Yat Sen University
Sun Yat Sen University Cancer Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University and Sun Yat Sen University Cancer Center
Priority to CN202011501223.8A
Publication of CN112980949A
Application granted
Publication of CN112980949B
Legal status: Active
Anticipated expiration

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/20 Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/156 Polymorphic or mutational markers
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A50/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
    • Y02A50/30 Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change

Landscapes

  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Organic Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Pathology (AREA)
  • Wood Science & Technology (AREA)
  • Immunology (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Public Health (AREA)
  • Molecular Biology (AREA)
  • Zoology (AREA)
  • Epidemiology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Hospice & Palliative Care (AREA)
  • Primary Health Care (AREA)
  • Oncology (AREA)
  • Microbiology (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Biochemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses an SNP marker for identifying groups at high risk of nasopharyngeal carcinoma, a kit, and applications thereof. The marker is any one or more of rs2106123, rs31489, rs3131875, rs1611163, rs9357092, rs9261506, rs2251830, rs2596506, rs2844484, rs9268644, rs6475604, rs1867277, rs9507124 or rs226241. The kit and the risk scoring model built on the marker can improve the positive predictive value of EBV antibody screening, effectively identify and enrich groups at high risk of nasopharyngeal carcinoma, and guide the choice of an individualized starting age for screening in different risk groups. This enables individualized, precise screening, triages the screening population to some extent, further improves the efficiency of nasopharyngeal carcinoma screening, and gives early risk warning to people who have not yet developed nasopharyngeal carcinoma.

Description

SNP marker for identifying nasopharyngeal carcinoma high risk group, kit and application thereof
Technical Field
The present invention belongs to the technical field of genetic engineering and tumor medicine. More particularly, it relates to an SNP marker for identifying groups at high risk of nasopharyngeal carcinoma, a kit, and applications thereof.
Background
Nasopharyngeal carcinoma is a malignant tumor that originates from the nasopharyngeal epithelium and most often arises on the roof and lateral walls of the nasopharyngeal cavity. Its incidence varies markedly by region: the global incidence is very low, with an age-standardized rate of about 0.4 per 100,000, but the disease is unusually common in southern China, where the incidence can reach 20-50 per 100,000. According to estimates from the International Agency for Research on Cancer (IARC), there were about 129,000 new cases of nasopharyngeal carcinoma worldwide in 2018, of which about 60,600 occurred in China, accounting for about 47.7% of the global total. With the continuous development of treatment methods, outcomes for nasopharyngeal carcinoma have improved greatly, but they differ widely by stage: the 5-year survival rate of early-stage patients can reach 90%, whereas that of late-stage patients is below 50%. Because early symptoms are non-specific and the primary lesion is often hidden, about 80% of patients are already at a middle or late stage when diagnosed, so treatment outcomes are poor and the fatality rate is high. Early diagnosis and treatment are therefore of great significance for the prevention and control of nasopharyngeal carcinoma.
Numerous studies have shown that the onset of nasopharyngeal carcinoma is closely associated with Epstein-Barr virus (EBV) infection. The technical scheme for tumor screening, early diagnosis and early treatment established by the Ministry of Health of China recommends plasma EBV antibody screening for people aged 30-59 in areas with a high incidence of nasopharyngeal carcinoma, followed by nasopharyngoscopy or MRI for EBV antibody-positive individuals and confirmation by pathological biopsy. This screening scheme has been implemented in high-incidence areas (such as Zhongshan City, Sihui City, and Wuzhou City in Guangxi). Preliminary results show that screening based on plasma EBV-IgA antibodies markedly improves the early diagnosis rate of nasopharyngeal carcinoma (79.0% in screened individuals versus 22.4% in non-screened individuals) and reduces nasopharyngeal carcinoma mortality (1.8/100,000 in screened individuals versus 8.3/100,000 in non-screened individuals).
However, the existing screening strategies based on Epstein-Barr virus antibodies still have limitations, specifically: (1) The positive predictive value (PPV) of anti-EBV IgA testing is only about 4%, meaning that more than 95% of EBV antibody-positive individuals are false positives, so the efficiency of EBV antibody-based screening needs further improvement; (2) A rise in plasma EBV antibody is a pathophysiological change that accompanies the onset and progression of nasopharyngeal carcinoma, whereas plasma EBV antibody is normal in healthy, disease-free people, so EBV antibody cannot provide early warning for people who have not yet developed the disease; (3) The currently recommended approach is blanket screening of everyone aged 30-59, and an individualized precision screening scheme is lacking.
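The relationship between a PPV of about 4% and a false-positive share above 95% can be checked with simple arithmetic. The following is a minimal sketch; the function name and the counts in the usage note are hypothetical and are not taken from the screening data.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """PPV = confirmed cases among test-positives / all test-positives."""
    total_positives = true_positives + false_positives
    if total_positives == 0:
        raise ValueError("PPV is undefined when nobody tests positive")
    return true_positives / total_positives


def false_positive_share(true_positives: int, false_positives: int) -> float:
    """Share of test-positive individuals who do not have the disease (1 - PPV)."""
    return 1.0 - positive_predictive_value(true_positives, false_positives)
```

With hypothetical counts of 4 confirmed cases among 100 antibody-positive individuals, `positive_predictive_value(4, 96)` gives 0.04 and the false-positive share is 0.96, which matches the "about 4%" and "more than 95%" figures quoted above.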
Recent large-scale population studies have shown that a polygenic risk score (PRS) strategy based on single nucleotide polymorphisms (SNPs) can effectively identify populations at high risk of complex diseases (Torkamani A, Wineinger NE, Topol EJ. The personal and clinical utility of polygenic risk scores. Nat Rev Genet. 2018;19(9):581-590). The onset and progression of nasopharyngeal carcinoma are closely related to genetic factors, and its distinctive ethnic distribution and familial aggregation indicate that genetic factors play an important role in its occurrence. However, the genetic variants reported so far explain only a small part of the heritability of the disease, and many unknown variants remain to be found. Therefore, given the high heritability of nasopharyngeal carcinoma and the limitations of existing EBV antibody screening, developing a method and a kit based on a polygenic genetic risk score, through comprehensive and systematic study of the genetic susceptibility loci of nasopharyngeal carcinoma, is of great significance for further improving the screening strategy in high-incidence areas.
Disclosure of Invention
The invention aims to solve the technical problem of overcoming the defects and shortcomings of the existing nasopharyngeal carcinoma screening mode and provides an SNP marker for identifying high risk groups of nasopharyngeal carcinoma, a kit and application thereof.
The invention aims to provide an SNP marker for identifying high-risk groups of nasopharyngeal carcinoma.
Another object of the present invention is to provide a primer for detecting the marker.
The invention also aims to provide application of the marker in preparing a kit for identifying high risk groups of nasopharyngeal carcinoma.
The invention also aims to provide application of the primer in preparing a kit for identifying high risk groups of nasopharyngeal carcinoma.
The invention also aims to provide a kit for identifying the high risk group of nasopharyngeal carcinoma.
It is another object of the present invention to provide a risk scoring model for identifying a high risk group for nasopharyngeal carcinoma based on said markers.
The invention further aims to provide application of the risk scoring model in improving the positive predictive value of EBV-IgA antibody screening.
It is another object of the present invention to provide the use of the risk scoring model in guiding the individualized starting age for nasopharyngeal carcinoma screening.
The above purpose of the invention is realized by the following technical scheme:
The invention provides an SNP marker for identifying groups at high risk of nasopharyngeal carcinoma. The marker is any one or more of rs2106123, rs31489, rs3131875, rs1611163, rs9357092, rs9261506, rs2251830, rs2596506, rs2844484, rs9268644, rs6475604, rs1867277, rs9507124 or rs226241; the nucleotide sequences are shown in SEQ ID No: 43 to 56.
The invention also provides a primer for detecting the SNP marker.
Preferably, the primers comprise PCR amplification primers and single-base extension primers, with sequences as follows:
rs3131875: PCR amplification primers, SEQ ID No: 1-2; single-base extension primer, SEQ ID No: 3;
rs9268644: PCR amplification primers, SEQ ID No: 4-5; single-base extension primer, SEQ ID No: 6;
rs6475604: PCR amplification primers, SEQ ID No: 7-8; single-base extension primer, SEQ ID No: 9;
rs2844484: PCR amplification primers, SEQ ID No: 10-11; single-base extension primer, SEQ ID No: 12;
rs1867277: PCR amplification primers, SEQ ID No: 13-14; single-base extension primer, SEQ ID No: 15;
rs31489: PCR amplification primers, SEQ ID No: 16-17; single-base extension primer, SEQ ID No: 18;
rs2596506: PCR amplification primers, SEQ ID No: 19-20; single-base extension primer, SEQ ID No: 21;
rs226241: PCR amplification primers, SEQ ID No: 22-23; single-base extension primer, SEQ ID No: 24;
rs9261506: PCR amplification primers, SEQ ID No: 25-26; single-base extension primer, SEQ ID No: 27;
rs2251830: PCR amplification primers, SEQ ID No: 28-29; single-base extension primer, SEQ ID No: 30;
rs9507124: PCR amplification primers, SEQ ID No: 31-32; single-base extension primer, SEQ ID No: 33;
rs9357092: PCR amplification primers, SEQ ID No: 34-35; single-base extension primer, SEQ ID No: 36;
rs2106123: PCR amplification primers, SEQ ID No: 37-38; single-base extension primer, SEQ ID No: 39;
rs1611163: PCR amplification primers, SEQ ID No: 40-41; single-base extension primer, SEQ ID No: 42.
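The primer assignments above follow a regular pattern: each SNP takes three consecutive SEQ ID numbers, two for the PCR amplification primers and one for the single-base extension primer. The lookup below is a minimal sketch of that pattern; the names `PRIMER_ORDER` and `primer_seq_ids` are hypothetical, and the sketch only returns SEQ ID numbers, not the sequences themselves.

```python
# SNPs in the order in which their primers are listed (SEQ ID No: 1-42).
PRIMER_ORDER = [
    "rs3131875", "rs9268644", "rs6475604", "rs2844484", "rs1867277",
    "rs31489", "rs2596506", "rs226241", "rs9261506", "rs2251830",
    "rs9507124", "rs9357092", "rs2106123", "rs1611163",
]


def primer_seq_ids(rsid: str) -> dict:
    """Return the SEQ ID numbers of the primers listed for one SNP."""
    if rsid not in PRIMER_ORDER:
        raise KeyError(f"{rsid} is not one of the 14 markers")
    base = 3 * PRIMER_ORDER.index(rsid)
    return {"pcr": (base + 1, base + 2), "extension": base + 3}
```

For example, `primer_seq_ids("rs1867277")` returns `{"pcr": (13, 14), "extension": 15}`, in agreement with the list.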
Based on a multi-center, large-sample genome-wide association study, the invention comprehensively and systematically identifies SNPs associated with nasopharyngeal carcinoma and discovers new genetic susceptibility loci. A PRS model is then developed through large-scale data analysis and externally validated in an independent population. The kit helps stratify disease risk more effectively, improves the positive predictive value of EBV antibody testing, raises the efficiency of the existing screening scheme, and guides the individualized starting age for nasopharyngeal carcinoma screening. It provides an important reference for optimizing the nasopharyngeal carcinoma screening strategy in China, can effectively promote the prevention and treatment of nasopharyngeal carcinoma in China, and opens a new path for individualized prevention, chemical intervention, drug screening and the like. Therefore, the following applications should be within the scope of the present invention:
the marker is applied to the preparation of a kit for identifying high risk groups of nasopharyngeal carcinoma.
The primer is applied to the preparation of a kit for identifying high risk groups of nasopharyngeal carcinoma.
The invention also provides a kit for identifying groups at high risk of nasopharyngeal carcinoma, which contains components for detecting the markers (rs2106123, rs31489, rs3131875, rs1611163, rs9357092, rs9261506, rs2251830, rs2596506, rs2844484, rs9268644, rs6475604, rs1867277, rs9507124 and rs226241) in DNA; the DNA is derived from peripheral blood, saliva or an oropharyngeal swab.
Preferably, the components comprise primers having the sequences shown in SEQ ID No: 1 to 42.
More preferably, the components further comprise Taq DNA polymerase, dNTP mixed solution, diluent and buffer.
Specifically, the steps of constructing the kit for identifying the high risk group of nasopharyngeal carcinoma according to the present invention are as follows:
1. establishing a unified specimen library and database: blood, saliva or oropharyngeal swab samples that meet the standard are collected under standard operating procedures, and complete demographic and clinical data are collected systematically;
2. discovering and identifying new genetic susceptibility loci for nasopharyngeal carcinoma in a two-stage, multi-center genome-wide association study: in the discovery stage, a genome-wide SNP chip is used to search for SNP loci associated with nasopharyngeal carcinoma, and the significantly associated SNP loci that are screened out are then validated in an independent validation population;
3. establishing and validating a polygenic genetic risk score model for nasopharyngeal carcinoma: based on the newly discovered and successfully validated SNP loci, a polygenic genetic risk score model based on 14 SNP loci is constructed by statistical analysis in the discovery-stage population, and the model is then externally validated in an independent population to evaluate its stability;
4. developing a polygenic genetic risk prediction kit for nasopharyngeal carcinoma: the kit comprises PCR amplification primers and single-base extension primers (sequences shown in SEQ ID No: 1-42) for detecting 14 SNP loci of the human genome (rs2106123, rs31489, rs3131875, rs1611163, rs9357092, rs9261506, rs2251830, rs2596506, rs2844484, rs9268644, rs6475604, rs1867277, rs9507124 and rs226241), and further comprises Taq DNA polymerase, dNTP mixed solution, diluent and buffer.
In addition, the invention provides a risk scoring model, based on the markers, for identifying groups at high risk of nasopharyngeal carcinoma. After the markers are detected in DNA, the risk score is calculated with the following formula:
risk score = (0.18 × rs2106123) + (0.26 × rs31489) + (0.21 × rs3131875) + (0.50 × rs1611163) + (0.23 × rs9357092) + (0.17 × rs9261506) + (0.27 × rs2251830) + (0.35 × rs2596506) + (0.14 × rs2844484) + (0.46 × rs9268644) + (0.30 × rs6475604) + (0.29 × rs1867277) + (0.14 × rs9507124) + (0.28 × rs226241);
where each SNP is scored as follows: homozygous protective genotype = 0, heterozygous genotype = 1, and homozygous risk genotype = 2.
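The scoring rule above can be sketched in Python. The weights are copied from the formula; the function name, the dictionary-based input and the input validation are illustrative assumptions and are not part of the disclosed method.

```python
# Per-SNP weights copied from the risk score formula.
WEIGHTS = {
    "rs2106123": 0.18, "rs31489": 0.26, "rs3131875": 0.21, "rs1611163": 0.50,
    "rs9357092": 0.23, "rs9261506": 0.17, "rs2251830": 0.27, "rs2596506": 0.35,
    "rs2844484": 0.14, "rs9268644": 0.46, "rs6475604": 0.30, "rs1867277": 0.29,
    "rs9507124": 0.14, "rs226241": 0.28,
}


def polygenic_risk_score(genotypes: dict) -> float:
    """Weighted sum of genotype codes for the 14 markers.

    genotypes maps each rsID to 0 (homozygous protective),
    1 (heterozygous) or 2 (homozygous risk).
    """
    missing = sorted(set(WEIGHTS) - set(genotypes))
    if missing:
        raise ValueError(f"missing genotypes for: {', '.join(missing)}")
    for rsid in WEIGHTS:
        if genotypes[rsid] not in (0, 1, 2):
            raise ValueError(f"{rsid}: genotype code must be 0, 1 or 2")
    return sum(weight * genotypes[rsid] for rsid, weight in WEIGHTS.items())
```

Under this coding the score ranges from 0 (homozygous protective at all 14 loci) to twice the sum of the weights, 7.56 (homozygous risk at all 14 loci).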
Based on genotyping information at these loci, the method calculates an individual's genetic risk score for nasopharyngeal carcinoma and the cumulative absolute risk of developing nasopharyngeal carcinoma over the next 10 years or over a lifetime, and thereby guides the age at which screening should start. In a prospective nasopharyngeal carcinoma screening cohort, for individuals classified as high risk by EBV antibody testing, combining the individual genetic risk score improves the positive predictive value of the EBV antibody and the efficiency of nasopharyngeal carcinoma screening. Therefore, the following applications should also be within the scope of the present invention:
use of the marker or the risk scoring model for improving the positive predictive value of EBV-IgA antibody screening.
Use of the marker or the risk scoring model for guiding the individualized starting age of nasopharyngeal carcinoma screening.
Compared with the prior art, the invention has the following beneficial effects:
(1) The invention is based on what is currently the largest genome-wide association study of nasopharyngeal carcinoma internationally and adopts a high-quality study design. It is the first to validate a polygenic genetic risk prediction model in multi-center independent populations, and it uses a prospective cohort, the study design with the highest level of epidemiological evidence, to explore the application of the model in nasopharyngeal carcinoma screening, thereby providing a high-quality epidemiological data basis for optimizing the prevention and control strategy for nasopharyngeal carcinoma in China.
(2) The SNP-based polygenic genetic risk score model for nasopharyngeal carcinoma is a novel molecular biomarker. Unlike traditional plasma-derived molecular biomarkers, the polygenic genetic risk score is based on an individual's genetic information, needs to be measured only once in a lifetime, and can be measured at any age. It therefore has the advantages of lifelong stability, ease of detection and low cost.
(3) Screening schemes based on traditional molecular markers can improve the early diagnosis rate of nasopharyngeal carcinoma and reduce its mortality. A screening scheme based on the polygenic genetic risk score model can do the same and can also give early risk warning. If primary etiological prevention, such as lifestyle intervention, is applied to high-risk groups, it is expected to reduce the incidence of nasopharyngeal carcinoma at its source, which traditional molecular markers cannot achieve.
(4) The greatest advantage of the polygenic genetic risk score model is that it further improves the efficiency of nasopharyngeal carcinoma screening: it raises the positive predictive value of conventional EBV antibody-based screening, reduces the false-positive rate of screening, and increases participant compliance. The model can effectively identify and enrich groups at high risk of nasopharyngeal carcinoma, triage the screening population to some extent, guide when screening should start in different risk groups, and enable individualized, precise screening.
Drawings
FIG. 1 shows the dose-response relationship between the polygenic genetic risk score and the risk of developing nasopharyngeal carcinoma. Panel (A) shows, in the GWAS sample used to build the PRS model, the risk of nasopharyngeal carcinoma (adjusted for sex and age) for individuals in each PRS decile compared with those whose PRS lies in the lowest 10% of the population; black squares represent OR values and vertical lines the 95% confidence intervals; the upper dotted line marks the OR for the group with PRS ≥ 90% and the lower dotted line the OR for the group with PRS < 10%. Panel (B) shows the risk of nasopharyngeal carcinoma (adjusted for sex and age) for individuals in each PRS decile compared with those whose PRS lies in the lowest 10% of the population, in the independent sample used to validate the PRS model.
FIG. 2 shows the results of applying the polygenic genetic risk score in nasopharyngeal carcinoma screening. Panel (A) shows the positive predictive value of the EBV-IgA antibody for individuals in different polygenic genetic risk score groups in the screening cohort; the horizontal dotted line represents the mean positive predictive value of the EBV-IgA antibody in the whole population (4.11%), and the abscissa gives the sample grouping by PRS level, e.g. <20 denotes samples with PRS < 20% and >60 denotes samples with PRS > 60%. Panel (B) shows the estimated cumulative risk of nasopharyngeal carcinoma based on the polygenic genetic risk score, i.e. the cumulative absolute risk of developing nasopharyngeal carcinoma from age 20 to a given age (x axis). Panel (C) shows the estimated 10-year risk of nasopharyngeal carcinoma based on the polygenic genetic risk score, i.e. the cumulative absolute risk of developing nasopharyngeal carcinoma within the next 10 years at a given age (x axis). Panel (D) shows the individualized starting age for nasopharyngeal carcinoma screening recommended according to PRS; the red and green solid lines represent the suggested starting ages for male and female populations at different PRS levels, the horizontal dashed line represents the starting age (30 years) recommended by current screening guidelines, and the three vertical dotted lines correspond to polygenic genetic risk score levels at 10%, 50% and 90% of the population.
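The decile comparison described for FIG. 1 can be illustrated with a minimal sketch. It is not the analysis behind the figure: the figure reports odds ratios adjusted for sex and age, whereas this sketch computes a crude, unadjusted odds ratio, and the function names and the choice to derive cut points from the supplied scores are assumptions.

```python
from bisect import bisect_right
import statistics


def decile_cutpoints(scores):
    """Nine cut points that split the supplied PRS values into ten groups."""
    return statistics.quantiles(scores, n=10)


def decile_of(score, cutpoints):
    """Decile index from 1 (lowest 10%) to 10 (highest 10%)."""
    return bisect_right(cutpoints, score) + 1


def crude_odds_ratio(cases, controls, ref_cases, ref_controls):
    """Unadjusted OR of one decile against the reference (lowest) decile."""
    if 0 in (controls, ref_cases, ref_controls):
        raise ValueError("odds ratio is undefined with an empty cell")
    return (cases / controls) / (ref_cases / ref_controls)
```

In use, each individual's score would be mapped to a decile with `decile_of`, cases and controls would be counted per decile, and each decile would be compared with decile 1.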
Detailed Description
The present invention is further illustrated by the following specific examples, which are not intended to limit the invention in any way. The reagents, methods and apparatus employed in the present invention are conventional in the art, except as otherwise indicated.
Unless otherwise indicated, reagents and materials used in the following examples are commercially available.
Example 1 Whole genome Association analysis to identify novel genetic susceptibility sites for nasopharyngeal carcinoma
In this example, a genome-wide association study was carried out in a population from a region with a high incidence of nasopharyngeal carcinoma. SNP sites significantly associated with the risk of nasopharyngeal carcinoma were first screened out and then externally validated in an independent population, and the SNP sites associated with nasopharyngeal carcinoma risk at P < 5×10^-8 were finally identified. The specific experimental method is as follows:
1. establishing unified standard specimen library and database
For 6828 nasopharyngeal carcinoma patients and 10473 healthy controls, complete demographic and clinical data were collected systematically, blood, saliva or oropharyngeal swab samples that met the standard were collected under standard operating procedures, and a unified specimen library and database were established.
2. Genetic locus typing of DNA
(1) DNA extraction from peripheral blood, saliva or oropharyngeal swabs
The operation is carried out according to conventional methods. DNA is extracted from the sample with the Qiagen DNA Midi Kit (100) or a similar product. The DNA is quantified with a spectrophotometer and quality-checked by agarose gel electrophoresis; the genomic DNA band is usually no smaller than 20 kb. DNA that passes quality control is adjusted to a concentration of 50 ng/μl, transferred to 384-well plates, and stored at -80 °C until use.
(2) Whole genome association study analysis based on discovery stage population
In the discovery phase, 4506 nasopharyngeal carcinoma patients and 5384 healthy controls were genotyped genome-wide using several genotyping arrays (Illumina Infinium Global Screening Array-24, Human610-Quad BeadChip and Infinium Asian Screening Array). To increase genotype density, we performed genotype imputation on each data set separately and then merged the data. For non-MHC regions, we used SHAPEIT (v2.12) for haplotype phasing and imputed with IMPUTE2, using 1000 Genomes Phase III as the reference panel. For the MHC region, we applied SNP2HLA for imputation, using a Han Chinese panel of 10,689 healthy individuals provided by the Beijing Genomics Institute (BGI) as the reference. After standardized quality control, 4,835,244 SNP sites were included in the association analysis, and SNP sites possibly associated with the onset of nasopharyngeal carcinoma were selected.
(3) Independent population validation of initially screened SNP sites
The SNP sites initially screened in the discovery phase were further validated in an independent population (2322 nasopharyngeal carcinoma patients and 5089 healthy controls) using the TaqMan (Applied Biosystems) genotyping platform.
Specific amplification primers and specific probe sequences were designed for the initially screened SNP sites.
Reactions were run in 384-well plates with 5 μl per well, comprising: 0.3 μl of a mixture of forward primer, reverse primer and typing probe; 0.2 μl of dNTP mixture (10 mM); 0.4 μl of MgCl2 (25 mM); 1 μl of reaction buffer; 0.1 μl of reaction enzyme; 2 μl of double-distilled water; and 1 μl of test DNA.
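As a quick arithmetic check, the per-well volumes listed above add up to the stated 5 μl. The sketch below also scales the shared components to a master mix for a plate; the 10% pipetting overage, the exclusion of template DNA from the master mix, and all names are assumptions for illustration rather than part of the described protocol.

```python
# Per-well volumes in microlitres, as listed for the 5 ul reaction.
REACTION_UL = {
    "primer/probe mixture": 0.3,
    "dNTP mixture (10 mM)": 0.2,
    "MgCl2 (25 mM)": 0.4,
    "reaction buffer": 1.0,
    "reaction enzyme": 0.1,
    "double-distilled water": 2.0,
    "test DNA": 1.0,
}


def total_volume(mix: dict) -> float:
    """Sum of per-well volumes, rounded to suppress float noise."""
    return round(sum(mix.values()), 6)


def master_mix(mix: dict, wells: int, overage: float = 0.10,
               exclude: tuple = ("test DNA",)) -> dict:
    """Scale shared components for `wells` reactions (overage is an assumption)."""
    factor = wells * (1.0 + overage)
    return {name: round(vol * factor, 3)
            for name, vol in mix.items() if name not in exclude}
```

`total_volume(REACTION_UL)` evaluates to 5.0, and `master_mix(REACTION_UL, 384)` gives the volume of each shared reagent for a full plate under the assumed overage.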
The amplification program was as follows: 60 ℃ for 30 s, 1 cycle; 95 ℃ for 10 min, 1 cycle; 45 cycles of 95 ℃ for 15 s and 60 ℃ for 1 min; 60 ℃ for 30 s, 1 cycle.
The instrument used was an ABI 7900 PCR instrument. The detection results were exported using QuantStudio Real-Time PCR Software v1.3.
3. Statistical analysis to discover new genetic susceptibility sites
By meta-analysis of the discovery-stage and validation-stage populations, 2 new SNPs significantly associated with the risk of nasopharyngeal carcinoma were identified in the non-HLA region: rs1867277 [9q22.33, upstream of FOXE1: OR = 0.75, 95% CI = 0.68-0.82, P = 9.18 × 10⁻¹¹] and rs226241 [17q12, LASP1/RPL23: OR = 1.43, 95% CI = 1.27-1.62, P = 9.84 × 10⁻⁹].
For the HLA region, stepwise conditional regression analysis identified 6 new SNPs significantly associated with the risk of nasopharyngeal carcinoma: rs3131875 [ZFP57/HLA-F: OR = 1.7, 95% CI = 1.78-2.18, P = 1.6 × 10⁻³⁹], rs1611163 [upstream of HLA-G: OR = 0.54, 95% CI = 0.49-0.60, P = 1.39 × 10⁻³²], rs9357092 [ZNR1ASP: OR = 2.04, 95% CI = 1.85-2.25, P = 8.74 × 10⁻⁴⁸], rs2596506 [downstream of HLA-B: OR = 0.54, 95% CI = 0.49-0.59, P = 4.67 × 10⁻³⁷], rs2844484 [NFKBIL1/LTA: OR = 0.64, 95% CI = 0.59-0.70, P = 5.46 × 10⁻²⁴], and rs9268644 [HLA-DRA: OR = 0.65, 95% CI = 0.58-0.73, P = 1.61 × 10⁻¹⁴].
Example 2 Establishment of a polygenic genetic risk score model and a detection kit for nasopharyngeal carcinoma
In this embodiment, the 8 new SNPs obtained in Example 1 (rs1867277, rs226241, rs3131875, rs1611163, rs9357092, rs2596506, rs2844484, rs9268644) are combined with 6 other SNPs associated with the onset of nasopharyngeal carcinoma (rs2106123, rs31489, rs9261506, rs2251830, rs6475604, rs9507124), and a nasopharyngeal carcinoma polygenic genetic risk score model and a detection kit are established through statistical analysis. The specific experimental methods are as follows:
1. Establishment and verification of the nasopharyngeal carcinoma polygenic genetic risk score model
The GWAS identified 8 novel SNPs associated with the risk of nasopharyngeal carcinoma and verified 6 further SNPs associated with the onset of nasopharyngeal carcinoma. To study how well a composite index formed from these 14 SNPs performs for early diagnosis, a mathematical model was constructed in the discovery-stage population, taking into account both the direction (positive or negative) and the strength of the association between each SNP site and the onset of nasopharyngeal carcinoma (the degree of association with nasopharyngeal carcinoma of the 14 SNPs used to construct the PRS model is shown in Table 1).
Specifically, we scored the three genotypes of each SNP as homozygous protective = 0, heterozygous = 1, and homozygous risk = 2, and determined a risk score for each subject using the regression coefficient of each SNP in the multifactorial model as its weight (the genotypes and regression coefficients of the 14 SNPs used to construct the PRS model are shown in Table 2). The risk score is calculated as follows: risk score = (0.18 × rs2106123) + (0.26 × rs31489) + (0.21 × rs3131875) + (0.50 × rs1611163) + (0.23 × rs9357092) + (0.17 × rs9261506) + (0.27 × rs2251830) + (0.35 × rs2596506) + (0.14 × rs2844484) + (0.46 × rs9268644) + (0.30 × rs6475604) + (0.29 × rs1867277) + (0.14 × rs9507124) + (0.28 × rs226241). The resulting risk score coefficients and cut-off values are applied directly to the samples of the genome-wide association study. Taking rs2106123 as an example: 0.18 is the regression coefficient of rs2106123; when the genotype is the homozygous protective type GG, rs2106123 = 0; when the genotype is the heterozygous type GA, rs2106123 = 1; when the genotype is the homozygous risk type AA, rs2106123 = 2. The genotype at each SNP site is determined by the instrument, and the total score of each individual is calculated from the genotyping results of all SNP sites.
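The scoring rule above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the study: the weights are copied from the risk score formula, the risk alleles are taken from the homozygous risk genotypes listed in claim 7, and the function name `polygenic_risk_score` and the genotype input format (two-letter strings such as "GA") are assumptions made for the example.

```python
# Regression coefficients (weights) copied from the risk score formula.
WEIGHTS = {
    "rs2106123": 0.18, "rs31489": 0.26, "rs3131875": 0.21, "rs1611163": 0.50,
    "rs9357092": 0.23, "rs9261506": 0.17, "rs2251830": 0.27, "rs2596506": 0.35,
    "rs2844484": 0.14, "rs9268644": 0.46, "rs6475604": 0.30, "rs1867277": 0.29,
    "rs9507124": 0.14, "rs226241": 0.28,
}

# Risk allele per SNP, taken from the homozygous risk genotypes in claim 7.
RISK_ALLELE = {
    "rs2106123": "A", "rs31489": "C", "rs3131875": "C", "rs1611163": "G",
    "rs9357092": "A", "rs9261506": "C", "rs2251830": "A", "rs2596506": "G",
    "rs2844484": "G", "rs9268644": "C", "rs6475604": "C", "rs1867277": "G",
    "rs9507124": "C", "rs226241": "G",
}


def polygenic_risk_score(genotypes):
    """Return the weighted risk score for a dict of SNP id -> genotype string.

    Counting risk alleles gives the 0/1/2 coding from the text:
    homozygous protective = 0, heterozygous = 1, homozygous risk = 2.
    A missing SNP raises KeyError rather than being silently scored as 0.
    """
    score = 0.0
    for snp, weight in WEIGHTS.items():
        dosage = genotypes[snp].upper().count(RISK_ALLELE[snp])
        score += weight * dosage
    return round(score, 2)


# Hypothetical subject who is homozygous risk at every site (illustrative only).
all_risk = {snp: allele * 2 for snp, allele in RISK_ALLELE.items()}
max_score = polygenic_risk_score(all_risk)
```

Under this coding the score ranges from 0 (homozygous protective at all 14 sites) to twice the sum of the weights.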
TABLE 1 correlation degree of 14 SNPs used for constructing PRS model with nasopharyngeal carcinoma
Figure BDA0002843543040000091
Figure BDA0002843543040000101
TABLE 2 genotypes and regression coefficients of 14 SNPs used to construct PRS models
Figure BDA0002843543040000102
2. Development of a kit for detecting the nasopharyngeal carcinoma polygenic genetic risk score
2.1 Design and synthesis of PCR amplification primers and single-base extension primers for the SNP sites
The sequences of the 14 SNP sites significantly associated with the onset of nasopharyngeal carcinoma and used to construct the PRS are shown in Table 3, and the PCR amplification primers (PCRP) and single-base extension primers (UEP) for the SNP sites to be detected are shown in Table 4.
TABLE 3 Sequences of the 14 SNP sites significantly associated with the onset of nasopharyngeal carcinoma and used to construct the PRS
Figure BDA0002843543040000111
Figure BDA0002843543040000121
Figure BDA0002843543040000131
Note: in the sequence table, R represents A/G, M represents A/C, Y represents C/T, and S represents C/G.
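The ambiguity codes in the note can be used to locate the SNP position within a flanking sequence. The sketch below is illustrative only: the helper name `find_snp` is hypothetical, and the excerpt is a 20-base fragment taken from around the polymorphic position of SEQ ID No: 43 in the sequence listing.

```python
# Ambiguity codes used in the sequence listing, as given in the note above.
IUPAC = {"R": ("A", "G"), "M": ("A", "C"), "Y": ("C", "T"), "S": ("C", "G")}


def find_snp(sequence):
    """Return (0-based index, alleles) of the first ambiguity code, or None.

    Spaces (the 10-base grouping used in the listing) are ignored.
    """
    seq = sequence.replace(" ", "").upper()
    for index, base in enumerate(seq):
        if base in IUPAC:
            return index, IUPAC[base]
    return None


# 20-base excerpt spanning the polymorphic site of SEQ ID No: 43.
excerpt = "attttatatt rtctaaagtg"
result = find_snp(excerpt)
```

For this excerpt the code "r" marks an A/G site, which agrees with the GG, GA and AA genotypes given for rs2106123.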
TABLE 4 PCR amplification primers and single-base extension primers for the SNP sites to be detected
Figure BDA0002843543040000132
Figure BDA0002843543040000141
Figure BDA0002843543040000151
2.2 construction of the kit
The kit comprises the PCR amplification primers and single-base extension primers for the 14 SNP sites significantly associated with nasopharyngeal carcinoma, and further comprises Taq DNA polymerase, dNTP mixture, diluent, buffer and the like. The kit is used as follows:
2.2.1 PCR amplification
PCR amplification was performed in 384-well plates using multiplex PCR, with a total volume of 5. Mu.l per reaction system.
(1) A new 1.5ml EP tube was used to prepare a PCR master mix solution, specifically:
Figure BDA0002843543040000152
(2) Using a 24-channel pipette set to 4 μl, the PCR master mix was added to each well of a 384-well plate. This 384-well plate is the PCR reaction plate.
(3) The prepared 384-well DNA sample plate was taken out and, using a 24-channel pipette set to 1 μl, template DNA was added to the PCR reaction plate. Each 5 μl PCR reaction system contained 20 to 50 ng of template DNA, 0.5 U of HotStar Taq, 0.5 pmol of each amplification primer, and 0.1 μl of 25 mM dNTPs.
(4) The PCR reaction conditions were set on a 384-well-compatible PCR instrument as follows: 94 ℃ for 4 minutes; 45 cycles of 94 ℃ for 20 seconds, 56 ℃ for 30 seconds, and 72 ℃ for 1 minute; 72 ℃ for 3 minutes; hold at 4 ℃. The 384-well PCR reaction plate was placed on the PCR instrument and the PCR reaction was started.
2.2.2 alkaline phosphatase treatment of PCR products
(1) After the PCR reaction was completed, the PCR product was treated with SAP (shrimp alkaline phosphatase) to remove free dNTPs from the system.
(2) The alkaline phosphatase reaction mix (SAP Mix) was prepared as follows:
Figure BDA0002843543040000161
(3) Using a 24-channel pipette set to 2 μl, SAP Mix was added to the 384-well PCR reaction plate. For each alkaline phosphatase-treated well the total reaction volume was 7 μl: 5 μl of PCR product and 2 μl of SAP Mix (SAP 0.5 U, buffer 0.17 μl).
(4) The 384-well plate was placed on a 384-well-compatible PCR instrument, and the reaction conditions were set as follows: 37 ℃ for 40 minutes; 85 ℃ for 5 minutes; hold at 4 ℃. The PCR instrument was then started for the alkaline phosphatase treatment.
2.2.3 Single base extension
(1) After the alkaline phosphatase treatment was completed, the single-base extension reaction was carried out in a total volume of 9 μl.
(2) The single-base extension reaction mix (EXTEND Mix) was prepared as follows:
Figure BDA0002843543040000162
(3) Using a 24-channel pipette set to 2 μl, EXTEND Mix was added to the 384-well reaction plate. For each reaction well, the single-base extension reaction system contained 7 μl of SAP-treated PCR product and 2 μl of EXTEND Mix (containing 0.94 μl of extension primer mix, 0.041 μl of iPLEX enzyme, and 0.2 μl of extension mix).
(4) The 384-well plate was placed on a 384-well-compatible PCR instrument, and the reaction conditions were set as follows:
I. 94 ℃, 30 seconds;
II. 94 ℃, 5 seconds;
III. 52 ℃, 5 seconds;
IV. 80 ℃, 5 seconds;
V. return to step III, 4 more times;
VI. return to step II, 39 more times;
VII. 72 ℃, 3 minutes;
VIII. hold at 4 ℃.
The PCR instrument was then started to perform the single-base extension reaction.
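The nested loop in this program is easier to see when it is written out. The Python sketch below expands the steps under the assumption that the two "return" steps mean 4 and 39 additional repeats, that is, 5 inner passes within each of 40 outer cycles; this reading, the function name `expand_program`, and the step labels are assumptions made for illustration. Ramp times and the final 4 ℃ hold are not modelled.

```python
def expand_program(inner_repeats=5, outer_repeats=40):
    """Expand the nested single-base extension program into a flat step list.

    Each step is a (temperature label, hold time in seconds) tuple.
    The defaults assume 4 and 39 *additional* repeats, as discussed above.
    """
    steps = [("94C", 30)]                # step I: initial denaturation
    for _ in range(outer_repeats):
        steps.append(("94C", 5))         # step II: denaturation
        for _ in range(inner_repeats):
            steps.append(("52C", 5))     # step III: annealing
            steps.append(("80C", 5))     # step IV: extension
    steps.append(("72C", 180))           # step VII: final extension
    return steps


program = expand_program()

# Number of annealing/extension passes across the whole run.
anneal_extend_cycles = sum(1 for temp, _ in program if temp == "52C")

# Total programmed hold time in seconds, excluding ramping and the 4 C hold.
hold_seconds = sum(seconds for _, seconds in program)
```

Under these assumptions the program runs 200 annealing/extension passes in total.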
2.2.4 resin purification
(1) Spread the Clean Resin evenly onto a 6 mg resin plate;
(2) Add 16 μl of water to each well containing extension product;
(3) Pour the dried resin into the extension product plate, seal the plate with film, and rotate it vertically at low speed for 30 minutes so that the resin is in full contact with the reaction mixture;
(4) Centrifuge so that the resin settles to the bottom of the wells.
2.2.5 chip spotting
The MassARRAY Nanodispenser RS1000 spotter was started and the resin purified extension product was transferred to 384-well SpectroCHIP (Sequenom) chips.
2.2.6 Mass spectrometric detection
The spotted SpectroCHIP chip was analyzed by MALDI-TOF (matrix-assisted laser desorption/ionization time-of-flight mass spectrometry), and the results were genotyped and exported using TYPER 4.0 software (Sequenom).
Example 3 Effective identification of individuals at high risk of nasopharyngeal carcinoma based on the nasopharyngeal carcinoma polygenic genetic risk score
Using the polygenic genetic risk score model and the kit constructed in Example 2, polygenic risk score detection can be performed on a population.
The distribution of the polygenic genetic risk score in the population can be obtained from the individual scores, as detailed in Table 2.
TABLE 2 Score distribution of the polygenic genetic risk score
Figure BDA0002843543040000171
Figure BDA0002843543040000181
The dose-response relationship between the polygenic genetic risk score and the risk of nasopharyngeal carcinoma is shown in FIG. 1. PRS effectively identifies the population at high risk of nasopharyngeal carcinoma: the risk increases progressively with increasing PRS, showing a significant dose-response relationship. As shown in panel (A) of FIG. 1, in the population used to establish the PRS model (4506 nasopharyngeal carcinoma patients and 5308 healthy controls), individuals in the highest 10% of PRS had 7.11 times the risk of nasopharyngeal carcinoma of individuals in the lowest 10% of PRS (95% CI = 5.69-8.88). As shown in panel (B) of FIG. 1, in the PRS model validation stage (945 nasopharyngeal carcinoma patients and 1236 healthy controls), the risk of nasopharyngeal carcinoma likewise increased progressively with increasing PRS, with a significant dose-response relationship.
These results demonstrate that the polygenic genetic risk score can predict well the risk of disease in the population at high risk of nasopharyngeal carcinoma.
Example 4 Improving the positive predictive value of EBV-IgA antibody screening based on the polygenic genetic risk score
We evaluated the value of PRS in a prospective nasopharyngeal carcinoma screening cohort. A total of 29413 participants were initially screened for plasma VCA-IgA and EBNA1-IgA antibodies, and high-risk subjects underwent further clinical examination with pathological confirmation. After a median follow-up of 7.33 years, a total of 1756 individuals were classified as high risk by EBV antibodies, of whom 1434 had specimens available for PRS scoring. Among these 1434 EBV-antibody high-risk individuals, 59 new nasopharyngeal carcinoma patients were found by screening, giving a positive predictive value (PPV) for the EBV antibodies of 4.11% (59/1434).
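The positive predictive value quoted above is simply the number of confirmed cases divided by the number of screen-positive individuals. A minimal Python check, using the 1434 antibody high-risk individuals with available specimens as the denominator (the function name is illustrative):

```python
def positive_predictive_value(confirmed_cases, screen_positives):
    """PPV = confirmed cases among screen-positives / all screen-positives."""
    if screen_positives <= 0:
        raise ValueError("screen_positives must be a positive count")
    return confirmed_cases / screen_positives


# 59 new cases among 1434 EBV-antibody high-risk individuals with specimens.
ppv = positive_predictive_value(59, 1434)
ppv_percent = round(100 * ppv, 2)
```

The same function can be applied within each PRS group once the group-level case and screen-positive counts are known.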
PRS detection was carried out on the 1434 antibody high-risk individuals in the nasopharyngeal carcinoma screening cohort. The results of applying the polygenic genetic risk score to nasopharyngeal carcinoma screening are shown in FIG. 2, and the EBV-IgA antibody positive predictive value for individuals in different polygenic genetic risk score groups of the screening cohort is shown in FIG. 2 (A). The PPV of the EBV antibodies increased with increasing PRS: for individuals in the lowest 20% of PRS the PPV was 2.18%, whereas for individuals in the highest 10%, highest 5% and highest 1% of PRS the PPV was 7.64%, 8.11% and 18.18%, respectively, all higher than the overall EBV-antibody PPV of 4.11%. These results indicate that PRS can improve the positive predictive value of EBV antibody screening, thereby improving the efficiency of nasopharyngeal carcinoma screening.
Example 5 Individualized recommendation of the screening starting age based on the polygenic genetic risk score
Based on the estimated OR of each SNP included in the PRS model, together with the risk allele and allele frequency of each SNP, the absolute risk of nasopharyngeal carcinoma was estimated for the population of southern China. The age-specific incidence of nasopharyngeal carcinoma in men and women was taken from IARC's Cancer Incidence in Five Continents (CI5), Volume XI (http://ci5.iarc.fr/CI5-XI/Pages/age-specific-curves_sel.aspx).
We calculated that the average lifetime cumulative risk of nasopharyngeal carcinoma is 2.79% in men and 0.85% in women. Taking genetic susceptibility to nasopharyngeal carcinoma into account, the corresponding cumulative risks for individuals in the lowest 1% and highest 1% of PRS were 0.38% and 8.23% in men and 0.11% and 2.51% in women, respectively, indicating that the lifetime cumulative risk of nasopharyngeal carcinoma differs by about 21-fold between individuals in the highest 1% and the lowest 1% of PRS. Similarly, we calculated the mean cumulative risk of nasopharyngeal carcinoma over the next 10 years to be 0.30% for a 30-year-old man and 0.09% for a 30-year-old woman; for individuals in the lowest 1% and highest 1% of PRS, however, the corresponding 10-year absolute risks were 0.04% and 1.00% in men and 0.01% and 0.29% in women, indicating that the absolute risk differs by about 25-fold between the lowest 1% and highest 1% PRS groups (the lifetime cumulative risk of nasopharyngeal carcinoma estimated from the polygenic genetic risk score is shown in FIG. 2 (B), and the cumulative risk over the next 10 years estimated from the polygenic genetic risk score is shown in FIG. 2 (C)).
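The fold differences quoted above follow directly from the ratio of the highest-1% to the lowest-1% risk. The short Python sketch below recomputes them from the percentages given in the text; the variable and function names are illustrative.

```python
# (lowest 1% PRS, highest 1% PRS) risks in percent, as quoted in the text.
LIFETIME_RISK = {"male": (0.38, 8.23), "female": (0.11, 2.51)}
TEN_YEAR_RISK = {"male": (0.04, 1.00), "female": (0.01, 0.29)}


def fold_difference(low, high):
    """Ratio of the risk in the highest PRS group to that in the lowest."""
    return round(high / low, 1)


lifetime_fold = {sex: fold_difference(*pair) for sex, pair in LIFETIME_RISK.items()}
ten_year_fold = {sex: fold_difference(*pair) for sex, pair in TEN_YEAR_RISK.items()}
```

The values for men (about 21.7-fold lifetime and 25-fold over 10 years) correspond to the approximate figures quoted in the text; the values for women computed from the rounded percentages are somewhat higher.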
At present, the technical scheme for early diagnosis and early treatment of cancer recommended by the Ministry of Health of China advises that nasopharyngeal carcinoma screening in high-incidence areas begin at age 30. We therefore took the risk of nasopharyngeal carcinoma over the next 10 years for a 30-year-old individual (mean: 0.20%) as the screening threshold and estimated the individualized recommended screening starting age for individuals at different PRS levels. For men with PRS > 90%, the recommended screening starting age is 23 years, whereas for men with PRS ≤ 10% it is 41 years, a difference of 18 years. For women with PRS > 90%, the screening starting age is 30 years, while for about half of women (PRS < 50%) the screening risk threshold is not reached at any age, that is, nasopharyngeal carcinoma screening is not recommended for these women (the recommended screening starting ages by PRS are shown in FIG. 2 (D)).
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.
Sequence listing
<110> Sun Yat-sen University
Sun Yat-sen University Cancer Center (Sun Yat-sen University Affiliated Cancer Hospital, Sun Yat-sen University Cancer Institute)
<120> SNP marker for identifying nasopharyngeal carcinoma high risk group, kit and application thereof
<141> 2020-12-17
<160> 56
<170> SIPOSequenceListing 1.0
<210> 1
<211> 30
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 1
acgttggatg caaagtaatg ggtgtttgtc 30
<210> 2
<211> 30
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 2
acgttggatg ccccccacat catatatttg 30
<210> 3
<211> 24
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 3
atcatatatt tgactcactt taga 24
<210> 4
<211> 30
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 4
acgttggatg tcgcattcca cctgtttacg 30
<210> 5
<211> 30
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 5
acgttggatg tacactttca gcctggtgac 30
<210> 6
<211> 19
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 6
gaccttgtct caaaaaaga 19
<210> 7
<211> 30
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 7
acgttggatg catgtttgct tccccttctg 30
<210> 8
<211> 31
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 8
acgttggatg gactgggtaa tttataaagg c 31
<210> 9
<211> 24
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 9
aaagagaggt ttagttgact caca 24
<210> 10
<211> 30
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 10
acgttggatg acactaaatc caatcacagc 30
<210> 11
<211> 30
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 11
acgttggatg cataagaact ccaaggaccc 30
<210> 12
<211> 22
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 12
gaaacaccag tataaaaaaa ct 22
<210> 13
<211> 30
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 13
acgttggatg ccaggcacgt ttacagtaag 30
<210> 14
<211> 30
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 14
acgttggatg ccaccctcat cctattatgc 30
<210> 15
<211> 20
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 15
cactgggctt tccacttggc 20
<210> 16
<211> 30
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 16
acgttggatg tccctcacca caaactgaag 30
<210> 17
<211> 30
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 17
acgttggatg acccccagtt tggcttgtag 30
<210> 18
<211> 15
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 18
gcttgtagca ggacc 15
<210> 19
<211> 30
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 19
acgttggatg ggaaaaagca gatctatggg 30
<210> 20
<211> 30
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 20
acgttggatg tgttctttat cccttctccc 30
<210> 21
<211> 15
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 21
tctccctcca acctt 15
<210> 22
<211> 30
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 22
acgttggatg ctagaaacag ggttgggttg 30
<210> 23
<211> 30
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 23
acgttggatg actcaacccc aactgtaacc 30
<210> 24
<211> 19
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 24
ccaactgtaa ccctaacct 19
<210> 25
<211> 30
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 25
acgttggatg agtagatgca agcggtgaag 30
<210> 26
<211> 30
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 26
acgttggatg ggctggtgtg gaactgtttg 30
<210> 27
<211> 20
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 27
gggtgaactg tttgtgcgag 20
<210> 28
<211> 30
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 28
acgttggatg ttcagtgtga ttccctcctg 30
<210> 29
<211> 31
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 29
acgttggatg catgctataa atcaagaaga c 31
<210> 30
<211> 21
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 30
caatcaagaa gacattcccc a 21
<210> 31
<211> 30
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 31
acgttggatg ttagtcagcc cacaaatccc 30
<210> 32
<211> 30
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 32
acgttggatg caaagtgttg ccaacaaagg 30
<210> 33
<211> 18
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 33
ccaacaaagg agatgaga 18
<210> 34
<211> 30
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 34
acgttggatg agacagaggc tcgggagtga 30
<210> 35
<211> 29
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 35
acgttggatg ttcagccgga gaccagagt 29
<210> 36
<211> 19
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 36
ccccagtcca gtcccggtc 19
<210> 37
<211> 30
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 37
acgttggatg cattcaccct cttgaggtgg 30
<210> 38
<211> 30
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 38
acgttggatg gtcttcctta gaccgtgaac 30
<210> 39
<211> 22
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 39
ctaccccaag agctgttttc cc 22
<210> 40
<211> 30
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 40
acgttggatg gagatcacgc cattgcattc 30
<210> 41
<211> 30
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 41
acgttggatg gctccttcaa acatcaacag 30
<210> 42
<211> 20
<212> DNA
<213> Artificial sequences (synthetic sequences)
<400> 42
ggttgagacg gagtttccca 20
<210> 43
<211> 181
<212> DNA
<213> human (Homo sapiens)
<400> 43
gtacaaagta atgggtgttt gtctttatca catgtgattt ctctaattaa attttatatt 60
rtctaaagtg agtcaaatat atgatgtggg gggaagtggg agggcaccac aacattttaa 120
ctctgctttc taacaagaga ataggatgta agaggaagaa caagtgggcc acatttgaaa 180
g 181
<210> 44
<211> 171
<212> DNA
<213> human (Homo sapiens)
<400> 44
cgtccatact acagccccat ggaaaaacct cgcattccac ctgtttacgg ttacatgagt 60
tcttcttcct ctttaaaagt mtcttttttg agacaaggtc tcgctgtcac caggctgaaa 120
gtgtaggggt gcaatcacag ctcactgcat cctcaacctc ctgagctcaa g 171
<210> 45
<211> 201
<212> DNA
<213> human (Homo sapiens)
<400> 45
ccatgtgaag aatgacatgt ttgcttcccc ttctgccatg attgtaagtt tcctggggca 60
gcctcctcag ccatgcacaa ytgtgagtca actaaacctc ttgcctttat aaattaccca 120
gtctcaggta tttctttata gcagtgtgag aacagactaa tacaacttct aactgataga 180
gtaatgttga cataacagtt t 201
<210> 46
<211> 211
<212> DNA
<213> human (Homo sapiens)
<400> 46
gtacactgaa aaaatggtta gaaggtaaat tatatagtat gcatgtttta ccacaattta 60
caaaaaatat atcaacacta aatccaatca cagctctcat cragtttttt tatactggtg 120
tttcaacaag cacattgccg ctgtggaggg gaggggtcct tggagttctt atgccaccat 180
gttctttggt gtcacttctc agcacaactt t 211
<210> 47
<211> 240
<212> DNA
<213> human (Homo sapiens)
<400> 47
tgccagcagc tgtccaatag gaataggacg cctgggactg ggtatgttca caatggcggc 60
tccatcttcc cttctctttg ccaggcacgt ttacagtaag gagcagacaa catgtcaccr 120
gccaagtgga aagcccattt gcataatagg atgagggtgg ggtgaacagc cttccacacg 180
cactatgtaa atatcacacc tggtacaacc aacctgtggg ccctacataa atcagacacc 240
<210> 48
<211> 261
<212> DNA
<213> human (Homo sapiens)
<400> 48
aatagtgtga caggtattat gtggtctcga cagaaagtat aacaaattgt ggtttggtgg 60
agttcttccc tcaccacaaa ctgaagtaag tcaaatttgg tttagagggt caaaactgag 120
ttgtgtattg atgaatagca mggtcctgct acaagccaaa ctgggggtgg gggtgggggt 180
gggggaggaa gaatattttc tggcaagcat taacaagtta tatttctggg ctttaattat 240
tctttctgga aaattagtaa a 261
<210> 49
<211> 200
<212> DNA
<213> human (Homo sapiens)
<400> 49
ttcattttat ctgactcaat tcatatacca ttctggaaaa agcagatcta tggggacaga 60
aaacaggtta atggttatcr aaggttggag ggagaaggga taaagaacat tcaaagggaa 120
atttttagga tgaaggaact gttgtagatg gtactagggt agtggatata tgattctata 180
catttttcaa aatccaggct 200
<210> 50
<211> 223
<212> DNA
<213> human (Homo sapiens)
<400> 50
tcaaaacctg aattgggatt taataccaac atcaacccta acccaaattt aacctcaacc 60
caaatcacaa ctcaaactca accccaactg taaccctaac ctyaaatcta aacacatccc 120
aattaataac cccctaaata aaacttctcc tctaccccaa cccaaccctg tttctagggc 180
taatcttgaa accagtttac caccactcct aacactaaac tta 223
<210> 51
<211> 201
<212> DNA
<213> human (Homo sapiens)
<400> 51
acaacttgac tgtgagggca gcaagtgagt agatgcaagc ggtgaaggct gtttgtgctt 60
ttcaggacat gaactcttgt rctcgcacaa acagttccac accagcctgc caccttcttc 120
agcgagactc atgagcgaca tccatgatgc ccatttatta cttcccactc ctatgacttt 180
tttatttcgt ctctgctggg a 201
<210> 52
<211> 201
<212> DNA
<213> human (Homo sapiens)
<400> 52
tgaattttta aattttgttt aaatcctatt cagtgtgatt ccctcctgct gcaggctgga 60
ggctgggaga cagagggaga mtggggaatg tcttcttgat ttatagcatg ttttctagtt 120
aagaaaatac tcaagataaa tatatttatt tataacaatt ttcacatgaa agactttatt 180
caaaaatatg tgcaagaaaa a 201
<210> 53
<211> 261
<212> DNA
<213> human (Homo sapiens)
<400> 53
catgaagaaa agtgtgggct tgctgtcttt tcatcaaaac tcgtagaatt tggctgactg 60
cctggtttat cccaggacat tagtcagccc acaaatcccg catttgttta ttcagtccag 120
agcaagtgaa tactgccatt ytctcatctc ctttgttggc aacactttgt taacctgaat 180
tgggttctca acataaaatg aagtgactaa tcttttgggg ggttccccct cccaggttct 240
tgtatgagca actaaatcta c 261
<210> 54
<211> 201
<212> DNA
<213> human (Homo sapiens)
<400> 54
gatcacttcc ggacccctcg accgcccggc accagcgcgc aagggaccct tcagccggag 60
accagagtcc agtcccggtc rcgaggccac cgccgctgcc cgcctcgaga agcaccacgc 120
gggctgagcc gtcggctagc gggtcactcc cgagcctctg tctgcaccgc gccagcccca 180
gaccacggac gctgagcctc c 201
<210> 55
<211> 201
<212> DNA
<213> human (Homo sapiens)
<400> 55
tcaaagttca ggctctatca ctcctttttg tatagggagc attcaccctc ttgaggtggc 60
atgtgagggc ctgcaccgtg ygggaaaaca gctcttgggt tcacggtcta aggaagacca 120
ggggatgtgg ccttaaccaa tggattggga tcgttttgtc tttaaacttg ggtgtctagc 180
cacttcctag ggggtttact t 201
<210> 56
<211> 200
<212> DNA
<213> human (Homo sapiens)
<400> 56
caggagaatc gcgtgaaccc gggaggcgga ggttgtggtg agccgagatc acgccattgc 60
attccagcct gggcaacaas tgggaaactc cgtctcaaaa aaaaaaaaaa aagaagaata 120
ctttcctaat ggaaacctgt tgatgtttga aggagccagg ggaagaggaa gtgattgaag 180
atgctggaga aaatggagac 200

Claims (7)

1. An SNP marker combination for identifying a population at high risk of nasopharyngeal carcinoma, characterized in that the SNP marker combination consists of nucleic acid fragments having the nucleotide sequences shown in SEQ ID Nos: 43 to 56, wherein the SNP sites of the nucleic acid fragments of SEQ ID Nos: 43 to 56 are, respectively: rs2106123, rs31489, rs3131875, rs1611163, rs9357092, rs9261506, rs2251830, rs2596506, rs2844484, rs9268644, rs6475604, rs1867277, rs9507124 and rs226241.
2. A primer combination for detecting the SNP marker combination according to claim 1, wherein the primer combination consists of PCR amplification primers and single-base extension primers for detecting each SNP site according to claim 1;
the sequences of the PCR amplification primers for genotyping SNP site rs2106123 are shown in SEQ ID Nos: 1-2, and the sequence of the single-base extension primer is shown in SEQ ID No: 3;
the sequences of the PCR amplification primers for genotyping SNP site rs31489 are shown in SEQ ID Nos: 4-5, and the sequence of the single-base extension primer is shown in SEQ ID No: 6;
the sequences of the PCR amplification primers for genotyping SNP site rs1611163 are shown in SEQ ID Nos: 7-8, and the sequence of the single-base extension primer is shown in SEQ ID No: 9;
the sequences of the PCR amplification primers for genotyping SNP site rs9357092 are shown in SEQ ID Nos: 10-11, and the sequence of the single-base extension primer is shown in SEQ ID No: 12;
the sequences of the PCR amplification primers for genotyping SNP site rs2596506 are shown in SEQ ID Nos: 13-14, and the sequence of the single-base extension primer is shown in SEQ ID No: 15;
the sequences of the PCR amplification primers for genotyping SNP site rs9268644 are shown in SEQ ID Nos: 16-17, and the sequence of the single-base extension primer is shown in SEQ ID No: 18;
the sequences of the PCR amplification primers for genotyping SNP site rs3131875 are shown in SEQ ID Nos: 19-20, and the sequence of the single-base extension primer is shown in SEQ ID No: 21;
the sequences of the PCR amplification primers for genotyping SNP site rs2844484 are shown in SEQ ID Nos: 22-23, and the sequence of the single-base extension primer is shown in SEQ ID No: 24;
the sequences of the PCR amplification primers for genotyping SNP site rs9261506 are shown in SEQ ID Nos: 25-26, and the sequence of the single-base extension primer is shown in SEQ ID No: 27;
the sequences of the PCR amplification primers for genotyping SNP site rs2251830 are shown in SEQ ID Nos: 28-29, and the sequence of the single-base extension primer is shown in SEQ ID No: 30;
the sequences of the PCR amplification primers for genotyping SNP site rs6475604 are shown in SEQ ID Nos: 31-32, and the sequence of the single-base extension primer is shown in SEQ ID No: 33;
the sequences of the PCR amplification primers for genotyping SNP site rs1867277 are shown in SEQ ID Nos: 34-35, and the sequence of the single-base extension primer is shown in SEQ ID No: 36;
the sequences of the PCR amplification primers for genotyping SNP site rs9507124 are shown in SEQ ID Nos: 37-38, and the sequence of the single-base extension primer is shown in SEQ ID No: 39;
the sequences of the PCR amplification primers for genotyping SNP site rs226241 are shown in SEQ ID Nos: 40-41, and the sequence of the single-base extension primer is shown in SEQ ID No: 42.
3. Use of a reagent for detecting the SNP marker combination according to claim 1 in the preparation of a kit for identifying a population at high risk of nasopharyngeal carcinoma.
4. The use according to claim 3, characterized in that the reagent is the primer combination according to claim 2.
5. A kit for identifying a population at high risk of nasopharyngeal carcinoma, comprising the primer combination of claim 2, wherein said primer combination is used for detecting and genotyping the SNP marker combination of claim 1 in DNA; the DNA is derived from peripheral blood, saliva or oropharyngeal swabs.
6. The kit of claim 5, further comprising Taq DNA polymerase, dNTP mixture, diluent and buffer.
7. Use of a reagent for detecting and typing the SNP marker combination according to claim 1 for constructing a risk scoring model for identifying a high risk group of nasopharyngeal carcinoma, wherein the risk scoring model is a model for calculating a risk score by a risk score formula after detecting the SNP marker combination according to claim 1 in DNA;
the risk score = (0.18 × rs2106123) + (0.26 × rs31489) + (0.21 × rs3131875) + (0.50 × rs1611163) + (0.23 × rs9357092) + (0.17 × rs9261506) + (0.27 × rs2251830) + (0.35 × rs2596506) + (0.14 × rs2844484) + (0.46 × rs9268644) + (0.30 × rs6475604) + (0.29 × rs1867277) + (0.14 × rs9507124) + (0.28 × rs226241);
wherein the score for each SNP site is: homozygous protective = "0", heterozygous = "1", homozygous dangerous = "2"; the rs2106123 homozygous protective type is GG, the heterozygous type is GA/AG, and the homozygous dangerous type is AA; the rs31489 homozygous protective type is AA, the heterozygous type is CA/AC, and the homozygous dangerous type is CC; the rs3131875 homozygous protective type is TT, the heterozygous type is CT/TC, and the homozygous dangerous type is CC; the rs1611163 homozygous protective type is TT, the heterozygous type is GT/TG, and the homozygous dangerous type is GG; the rs9357092 homozygous protective type is CC, the heterozygous type is CA/AC, and the homozygous dangerous type is AA; the rs9261506 homozygous protection type is AA, the heterozygous type is CA/AC, and the homozygous dangerous type is CC; the rs2251830 homozygous protective type is CC, the heterozygous type is CA/AC, and the homozygous dangerous type is AA; the rs2596506 homozygous protective type is TT, the heterozygous type is GT/TG, and the homozygous dangerous type is GG; the rs2844484 homozygous protection type is AA, the heterozygous type is GA/AG, and the homozygous dangerous type is GG; the rs9268644 homozygous protective type is AA, the heterozygous type is CA/AC, and the homozygous dangerous type is CC; the rs6475604 homozygous protective type is TT, the heterozygous type is CT/TC, and the homozygous dangerous type is CC; the rs1867277 homozygous protective type is AA, the heterozygous type is GA/AG, the homozygous dangerous type is GG; the rs9507124 homozygous protective type is TT, the heterozygous type is TC/CT, and the homozygous dangerous type is CC; the rs226241 homozygous protective type is CC, the heterozygous type is CG/GC, and the homozygous dangerous type is GG.
CN202011501223.8A 2020-12-17 2020-12-17 SNP marker for identifying nasopharyngeal carcinoma high-risk group, kit and application thereof Active CN112980949B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011501223.8A CN112980949B (en) 2020-12-17 2020-12-17 SNP marker for identifying nasopharyngeal carcinoma high-risk group, kit and application thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011501223.8A CN112980949B (en) 2020-12-17 2020-12-17 SNP marker for identifying nasopharyngeal carcinoma high-risk group, kit and application thereof

Publications (2)

Publication Number Publication Date
CN112980949A (en) 2021-06-18
CN112980949B (en) 2023-03-28

Family

ID=76345083

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011501223.8A Active CN112980949B (en) 2020-12-17 2020-12-17 SNP marker for identifying nasopharyngeal carcinoma high-risk group, kit and application thereof

Country Status (1)

Country Link
CN (1) CN112980949B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113637742B (en) * 2021-09-29 2023-12-01 成都二十三魔方生物科技有限公司 High myopia gene detection kit, and high myopia genetic risk assessment system and method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104630353A (en) * 2015-01-20 2015-05-20 中山大学肿瘤防治中心 Kit applied to nasopharynx cancer diagnosis, prognosis and treatment effect evaluation
CN106755318A (en) * 2016-11-24 2017-05-31 深圳市核子基因科技有限公司 Kit for detecting nasopharyngeal carcinoma susceptibility and SNP markers thereof

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101298628A (en) * 2007-04-30 2008-11-05 中国人民解放军军事医学科学院放射与辐射医学研究所 Polymorphism of MDM2 gene related to occurrence of nasopharyngeal carcinoma and lymphatic metastasis and detection method thereof
CN101956014B (en) * 2010-09-30 2014-05-28 中山大学 Kit for detecting 7 genetic markers of peripheral blood in early diagnosis of nasopharyngeal carcinoma
GB201113887D0 (en) * 2011-08-12 2011-09-28 Univ Hong Kong The Gene markers
CN103045743A (en) * 2012-12-28 2013-04-17 中山大学肿瘤防治中心 Kit for detecting susceptibility gene SNP locus of nasopharynx cancer

Also Published As

Publication number Publication date
CN112980949A (en) 2021-06-18

Similar Documents

Publication Publication Date Title
TWI541507B (en) Methods for assessing liver pathologies
KR20190132558A (en) Using cell-free DNA fragment size to determine copy number variations
CN101679971A (en) Method for determining the risk of glaucoma progression
Oldt III et al. Molecular genetic analysis of placental site trophoblastic tumors and epithelioid trophoblastic tumors confirms their trophoblastic origin
CN107254531B (en) Genetic biomarker for auxiliary diagnosis of early colorectal cancer and application thereof
CN111676283B (en) Application of mitochondrial DNA single nucleotide polymorphism related to occurrence of high altitude pulmonary edema
CN110527719B (en) Method for establishing early screening scale for gestational diabetes risk assessment
CN111560428A (en) Application of substance for detecting single nucleotide polymorphism of mitochondrial DNA rs3937033
CN112980949B (en) SNP marker for identifying nasopharyngeal carcinoma high-risk group, kit and application thereof
CN106939334A (en) Method for detecting fetal DNA content in maternal plasma DNA
KR101206028B1 (en) Method for diagnosing a breast cancer using a breast cancer specific polymorphic sequence, polynucleotide specific to a breast cancer and microarray immobilized with the polynucleotide
CN109457031B (en) BRCA2 gene g.32338309A>G mutant and application thereof in breast cancer auxiliary diagnosis
JP2016516449A (en) Method for determination of fetal DNA fraction in maternal blood using HLA marker
CN114381517B (en) Application of SNP rs12569857 polymorphism detection in preparation of reagent kit for screening population susceptible to high altitude pulmonary edema
CN103502469A (en) Ankylosing spondylitis susceptibility and single nucleotide polymorphism detection method, kit and use thereof
KR101100437B1 (en) A polynucleotide associated with a colon cancer comprising single nucleotide polymorphism, microarray and diagnostic kit comprising the same and method for diagnosing a colon cancer using the polynucleotide
WO2018129888A1 (en) Primary biliary cholangitis-associated interleukin 21 receptor and application thereof
CN114592056A (en) 22q11 micro-deletion and/or micro-repetition detection primer group, primer probe composition, kit and application thereof
WO2018129887A1 (en) Primary biliary cholangitis-associated interleukin 21 and application thereof
JP2008545387A (en) Method and configuration for cardiovascular disease diagnosis
CN110331204B (en) Kit for breast cancer risk assessment and application thereof
WO2017106365A1 (en) Methods for measuring mutation load
RU2804110C1 (en) Set of oligonucleotide primers and probes for determining alleles of the rs55986091 polymorphism and method for its use
WO2018129886A1 (en) Primary biliary cholangitis-associated interleukin 16 and application thereof
KR20190134121A (en) Association of RNF213 single nucleotide polymorphism with the risk of Moyamoya disease in a Korean population

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant