CN115612744A - Human papilloma virus typing and related gene methylation integrated detection model and construction method thereof

Publication number
CN115612744A
Authority
CN
China
Prior art keywords
methylation
gene
hpv
typing
cervical cancer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211598172.4A
Other languages
Chinese (zh)
Inventor
陈汶
伍建
王海丽
韩路
姬晓雯
刘娜
王亚坤
陈思淼
戴钰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Mygenostics Co ltd
Cancer Hospital and Institute of CAMS and PUMC
Original Assignee
Beijing Mygenostics Co ltd
Cancer Hospital and Institute of CAMS and PUMC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Mygenostics Co ltd, Cancer Hospital and Institute of CAMS and PUMC
Priority to CN202211598172.4A
Priority to CN202310623058.0A (CN116463422A)
Publication of CN115612744A
Legal status: Pending

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00 ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G16B25/20 Polymerase chain reaction [PCR]; Primer or probe design; Probe optimisation
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/10 Sequence alignment; Homology search
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/154 Methylation markers

Abstract

The invention relates to the field of molecular biology detection, in particular to an integrated detection model for Human Papilloma Virus (HPV) typing and cervical cancer-related gene methylation, and a construction method thereof. The method comprises the following steps: extracting the cervical cell DNA to be tested and carrying out methylation (bisulfite) treatment; designing probes; performing high-throughput sequencing after probe-based targeted enrichment; establishing an integrated HPV typing and methylation detection model for bioinformatics analysis; aligning reads to determine HPV types; extracting all CpG sites and, from these, the CpG sites of cervical cancer-related genes; calculating the methylation value of each gene region of each HPV type; and detecting methylation of cervical cancer-related genes. The invention aims to achieve HPV typing and cervical cancer-related gene methylation detection from one sample in one assay, to assess individual cervical cancer risk from multiple data types, and to avoid the over-diagnosis caused by a single test.

Description

Human papilloma virus typing and related gene methylation integrated detection model and construction method thereof
Technical Field
The invention relates to the technical field of molecular biology analysis and detection, in particular to a human papilloma virus typing and related gene methylation integrated detection model and a construction method thereof.
Background
Current screening for cervical cancer includes cervical cytology and Human Papillomavirus (HPV) testing. Cytological examination includes Pap smears and TCT liquid-based cytology, and its diagnostic accuracy is strongly influenced by the subjective judgment of the physician. Compared with cytology, HPV DNA testing can type and quantify high-risk and low-risk HPV types, but most HPV infections are transient and do not progress to cervical precancerous lesions or cervical cancer, so a positive result can place an unnecessary psychological burden on women. Relying solely on HPV screening can therefore result in over-diagnosis and over-treatment.
Disclosure of Invention
The invention aims to provide an integrated detection model for human papilloma virus typing and related gene methylation, and a construction method thereof, to overcome the shortcomings of the existing techniques. The invention adopts the following technical scheme:
Gene methylation refers to the enzyme-catalyzed, selective addition of a methyl group to the cytosine (C) of CpG dinucleotides in DNA to form 5-methylcytosine. CpG dinucleotides are often located near the transcriptional control regions of genes, and their methylation can change chromatin structure, DNA conformation, DNA stability and so on, thereby regulating gene transcription and expression. Aberrant gene methylation is one of the most common epigenetic changes in tumorigenesis; it manifests as a decrease in global genomic methylation (affecting oncogenes) and abnormally elevated local methylation of CpG islands (affecting tumor suppressor genes). Patients at risk of lesion progression can therefore be triaged by detecting HPV genome methylation and methylation of human cervical cancer-related genes.
The invention provides a human papilloma virus typing and related gene methylation integrated detection model and a construction method thereof, wherein the method comprises the following steps:
extracting the cervical cell DNA to be tested, fragmenting the DNA, and performing end repair and adapter ligation;
carrying out methylation (bisulfite) treatment on the product, followed by PCR enrichment and quality inspection;
designing probes according to the cervical cancer related genes and the full-length sequences of 18 HPV genomes;
the 18 HPV genomes comprise HPV6, HPV11, HPV16, HPV18, HPV31, HPV33, HPV35, HPV39, HPV45, HPV51, HPV52, HPV56, HPV58, HPV59, HPV66, HPV68a, HPV69 and HPV82;
enriching the nucleic acids of the target regions with the probes and performing high-throughput sequencing;
establishing a human papilloma virus typing and related gene methylation integrated detection model, performing bioinformatics analysis, and aligning reads for HPV typing;
extracting, from the alignment of the sequences against the 18 HPV types, the reads contained in each gene of each type, calculating the gene methylation score, counting the gene coverage depth, and performing classification prediction on the sample using the methylation score and the gene coverage depth as indicators of the HPV gene methylation level;
detecting methylation of cervical cancer related genes.
Further, aligning reads for HPV typing comprises: aligning the preprocessed data to the reference genomes of the 18 HPV types using bismark to obtain the detailed sequence alignment results, calculating the coverage, and retaining data with coverage greater than 40% to obtain the basic typing information.
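The coverage-based typing rule can be sketched as follows. This is a minimal illustration only: the function name `call_hpv_types` and the example coverage values are hypothetical, and coverage is assumed to be expressed as a fraction of the reference genome (0 to 1).

```python
def call_hpv_types(coverage_by_type, min_coverage=0.40):
    """Return the HPV types whose reference-genome coverage exceeds the cutoff.

    coverage_by_type maps an HPV type name to the fraction (0..1) of its
    reference genome covered by aligned reads. The 40% cutoff comes from the
    text; the strict ">" comparison follows "greater than 40%".
    """
    return sorted(t for t, cov in coverage_by_type.items() if cov > min_coverage)


# Toy values, not data from the patent.
example = {"HPV16": 0.97, "HPV18": 0.12, "HPV52": 0.41, "HPV58": 0.40}
print(call_hpv_types(example))
```

A type sitting exactly at 40% is not reported under this reading of the rule.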
Further, the cervical cancer related genes include: ADCYAP1, AJAP1, ANKRD18CP, APC, ASTN1, CADM1, CCNA2, CDH1, CDH13, CDH6, CDKN2A, DAPK1, DLX1, EPB41L3, FAM19A4, FANCI, FHIT, GATA4, GFRA1, HS3ST2, JAM3, KCNIP4, LHX8, LMX1A, MAL, MIR124-1, MIR124-2, MIR124-3, PAX1, PCDHA13, PCDHA4, POU4F3, RARB, RNASEH2A, RUBCNL, RXFP3, SLIT2, SOX1, SOX17, ST6GALNAC5, TERT, TIMP3, WIF1, ZNF671, and ZSCAN1.
Further, designing the probes comprises:
obtaining the methylation regions of the cervical cancer related genes and of the full-length sequences of the 18 HPV genomes and extracting the reference sequences; designing simulated bisulfite-treated hypermethylated and hypomethylated sequences from the forward and reverse strands of the reference sequences; then, using these sequences as templates, taking the 120 bp sequence starting at the first base as a probe, shifting downstream by n bases and taking the next 120 bp sequence as a probe, and repeating until the last 120 bp sequence is reached.
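One way to picture the simulated bisulfite-treated templates is the sketch below. It assumes the usual in-silico model (unmethylated C reads as T; in the hypermethylated case only C in a CpG context is protected). The function names are hypothetical and the source does not give its own implementation.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper- or lower-case DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def bisulfite_convert(seq, methylated):
    """Simulate bisulfite treatment of one strand.

    methylated=True  -> hypermethylated model: C followed by G is kept,
                        every other C becomes T.
    methylated=False -> hypomethylated model: every C becomes T.
    A C at the very end of the sequence has no following base and is
    therefore treated as non-CpG (an edge-case simplification).
    """
    seq = seq.upper()
    if methylated:
        return re.sub(r"C(?!G)", "T", seq)
    return seq.replace("C", "T")


def simulated_templates(reference):
    """Four templates per reference: forward/reverse x hyper/hypo."""
    reverse = reverse_complement(reference)
    return {
        "forward_hyper": bisulfite_convert(reference, True),
        "forward_hypo": bisulfite_convert(reference, False),
        "reverse_hyper": bisulfite_convert(reverse, True),
        "reverse_hypo": bisulfite_convert(reverse, False),
    }


templates = simulated_templates("ACGTCCGA")  # toy sequence
```

Each of the four templates would then be tiled into 120 bp probes as described in the text.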
Performing classification prediction on the sample using the methylation score and the gene coverage depth as indicators of the HPV gene methylation level comprises:
multiplying the methylation value of each gene region of each HPV type by the gene coverage depth to obtain a judgment value; if the judgment value is greater than or equal to 1.04, the sample is judged to be CIN2+; if the judgment value is less than 1.04, the sample is judged to be ≤CIN1.
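The decision rule reduces to a product and a comparison, sketched below. The function name is hypothetical, and the text does not state the unit or normalisation of the coverage depth, so the inputs here are toy numbers.

```python
CUTOFF = 1.04  # threshold stated in the text


def classify_by_judgment_value(methylation_score, coverage_depth, cutoff=CUTOFF):
    """Return (judgment value, label) for one HPV gene region.

    judgment value = methylation score x gene coverage depth.
    ">= cutoff" maps to CIN2+, anything lower to <=CIN1.
    """
    judgment = methylation_score * coverage_depth
    label = "CIN2+" if judgment >= cutoff else "<=CIN1"
    return judgment, label


print(classify_by_judgment_value(0.5, 4.0))   # toy input above the cutoff
print(classify_by_judgment_value(0.25, 4.0))  # toy input below the cutoff
```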
Further, the calculated gene methylation score is calculated by the following formula:
Figure 317196DEST_PATH_IMAGE001
wherein MHL is the methylation score, l is the number of CpG sites contained in the cervical cancer related gene, and MHi is the proportion of runs of i consecutive CpG sites that are completely methylated.
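Because the formula itself is only available as an image, the sketch below assumes the commonly used methylated haplotype load form, MHL = Σ i·P(MH_i) / Σ i, with weights w_i = i. The weights, the function name and the read encoding ('C' for a methylated CpG, 'T' for an unmethylated one) are assumptions, not taken from the text.

```python
def methylation_haplotype_load(haplotypes, min_len=1):
    """Weighted fraction of fully methylated runs of consecutive CpG sites.

    haplotypes: list of strings, one per read, with one character per
                consecutive CpG site ('C' methylated, 'T' unmethylated).
    min_len:    smallest run length i included in the sum (assumed default 1;
                the text uses 4 for the human cervical cancer related genes).
    """
    max_len = max((len(h) for h in haplotypes), default=0)
    numerator = 0.0
    denominator = 0.0
    for i in range(min_len, max_len + 1):
        total = 0
        fully_methylated = 0
        for h in haplotypes:
            # Every window of i consecutive CpG sites on this read.
            for start in range(len(h) - i + 1):
                total += 1
                if h[start:start + i] == "C" * i:
                    fully_methylated += 1
        if total:
            numerator += i * (fully_methylated / total)  # w_i * P(MH_i)
            denominator += i                             # sum of w_i
    return numerator / denominator if denominator else 0.0
```

Under this assumed form, a region covered only by fully methylated reads scores 1, a fully unmethylated region scores 0, and longer methylated runs contribute more than isolated methylated sites.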
Further, the methylation score is positively correlated with the degree of cervical lesions.
Further, detecting methylation of the cervical cancer related genes comprises: aligning reads to the human cervical cancer related genome to generate a bam file; removing duplicates and extracting all CpG sites; extracting the CpG sites of the cervical cancer related genes from all CpG sites; extracting the reads contained in each high-density, high-association region; calculating the methylation value within each region; and performing classification prediction on the sample using the methylation value as the indicator of the methylation level of the cervical cancer related genes.
In another aspect, the present invention provides a human papillomavirus typing and methylation integrated detection model, which comprises:
the cervical cell DNA extraction module is used for extracting the cervical cell DNA to be tested, fragmenting the DNA, and performing end repair and adapter ligation;
the sequencing module is used for carrying out methylation (bisulfite) treatment on the product, followed by PCR enrichment and quality inspection;
the probe design module is used for designing probes according to the cervical cancer related genes and the full-length sequences of the 18 HPV genomes; the 18 HPV genomes comprise HPV6, HPV11, HPV16, HPV18, HPV31, HPV33, HPV35, HPV39, HPV45, HPV51, HPV52, HPV56, HPV58, HPV59, HPV66, HPV68a, HPV69 and HPV82;
the enrichment sequencing module is used for enriching the nucleic acids of the target regions with the probes and performing high-throughput sequencing;
the bioinformatics analysis module is used for performing bioinformatics analysis and aligning reads for HPV typing;
aligning reads to the human cervical cancer related genome to generate a bam file, removing duplicates and extracting all CpG sites, extracting the CpG sites of the cervical cancer related genes from all CpG sites, calculating the methylation score of each gene region of each HPV type, and performing classification prediction on the sample using the methylation score and the gene coverage depth of each gene region of each HPV type as indicators of the HPV gene methylation level;
detecting methylation of cervical cancer related genes.
Further, aligning reads for HPV typing comprises aligning the preprocessed data to the 18 HPV reference genome sequences using bismark to obtain the detailed sequence alignment results, calculating the coverage, and retaining data with coverage greater than 40% to obtain the basic typing information;
performing classification prediction on the sample using the methylation score and the gene coverage depth as indicators of the HPV gene methylation level comprises:
multiplying the methylation value of each gene region of each HPV type by the gene coverage depth to obtain a judgment value; if the judgment value is greater than or equal to 1.04, the sample is judged to be CIN2+; if the judgment value is less than 1.04, the sample is judged to be ≤CIN1.
Further, detecting methylation of the cervical cancer related genes comprises: aligning reads to the cervical cancer related genome to generate a bam file; removing duplicates and extracting all CpG sites; extracting the CpG sites of the cervical cancer related genes from all CpG sites; extracting the reads contained in each high-density, high-association region; calculating the methylation score within each region; and performing classification prediction on the sample using the methylation score as the indicator of the methylation level of the cervical cancer related genes.
The invention has the beneficial effects that: the invention can simultaneously detect HPV typing and evaluate HPV methylation and human cervical carcinoma related gene methylation degree by a targeted sequencing method, has high capture efficiency and better capture stability and uniformity, reduces the sequencing cost and avoids the over-diagnosis problem of single HPV detection.
Drawings
FIG. 1 is a graph of AUC values for HPV methylation of example 3;
FIG. 2 is a graph showing the relationship between MHL value and cervical lesion degree in example 3.
Detailed Description
The present invention will be described in further detail with reference to specific examples.
It should be noted that these examples are only for illustrating the present invention, and not for limiting the present invention, and the simple modification of the method based on the idea of the present invention is within the protection scope of the present invention.
Example 1 establishment of HPV genomic typing/methylation and human cervical carcinoma-associated genomic methylation detection method
1. Probe design
(1) Screening of methylation marker genes in the human cervical cancer-related genome
Methylation regions with clear cervical cancer-related methylation characteristics were screened, and the reference sequence (reference genome version hg19) of each region was taken. Repeated sequences were identified with the RepeatMasker software and the sequences of repeat regions were removed, giving the gene sequences listed in Table 1.
TABLE 1 Gene List
ADCYAP1 CDH6 GFRA1 MIR124-3 SLIT2
AJAP1 CDKN2A HS3ST2 PAX1 SOX1
ANKRD18CP DAPK1 JAM3 PCDHA13 SOX17
APC DLX1 KCNIP4 PCDHA4 ST6GALNAC5
ASTN1 EPB41L3 LHX8 POU4F3 TERT
CADM1 FAM19A4 LMX1A RARB TIMP3
CCNA2 FANCI MAL RNASEH2A WIF1
CDH1 FHIT MIR124-1 RUBCNL ZNF671
CDH13 GATA4 MIR124-2 RXFP3 ZSCAN1
(2) Probe design
For the above 45 genes and the methylated regions of the full-length sequences of the 18 HPV genomes (HPV6, 11, 16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, 66, 68a, 69 and 82), the CpG sites and the 75 bp upstream and downstream of each site were taken and the reference sequences were extracted (reference genome version hg19). Simulated bisulfite-treated hypermethylated and hypomethylated sequences were designed from the forward and reverse strands of the reference sequences. Using these sequences as templates, the 120 bp sequence starting at the first base was taken as a probe, the window was shifted downstream by n bases and the next 120 bp sequence was taken as a probe, and so on until the last 120 bp. The value of n varies with the GC content of the exons in each region; the smaller n is, the denser the probe design and the higher the capture uniformity.
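The sliding-window tiling can be sketched as below. The function name and the default step are hypothetical (the text does not give concrete values of n), and always adding a final window that ends at the last base is an interpretation of "until the last 120 bp".

```python
def tile_probes(template, probe_len=120, step=60):
    """Tile a template into fixed-length probes with a sliding window.

    Returns a list of (start, probe_sequence) with 0-based start positions.
    A final window anchored at the end of the template is added when the
    regular steps do not land on it, so the last 120 bp are always covered.
    """
    if len(template) < probe_len:
        return []
    last_start = len(template) - probe_len
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)
    return [(s, template[s:s + probe_len]) for s in starts]


toy_template = "ACGT" * 75  # 300 bp toy sequence
probes = tile_probes(toy_template, probe_len=120, step=100)
print([start for start, _ in probes])
```

A smaller `step` gives more overlapping probes per region, which matches the statement that a smaller n means a denser design.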
2. Cervical cell DNA extraction (DNA extraction)
Extracting cervical cell DNA by any one of the following methods:
Extraction of DNA from exfoliated cervical cells: genomic DNA was extracted from 300 μL of cells stored in preservation solution according to the instructions of the universal column genome extraction kit (Kangji, CWY 004), and the concentration was measured with Qubit.
Extraction of genomic DNA from paraffin-embedded tissue or paraffin sections: a 10 μm thick paraffin block sample or 8-10 paraffin sections were taken, genomic DNA was extracted according to the instructions of the GeneRead DNA FFPE Kit (Qiagen, 180134), and the concentration was measured with Qubit.
Extraction of genomic DNA from fresh tissue: 25 mg of fresh tissue was taken, genomic DNA was extracted according to the instructions of the universal column genome extraction kit (Kangji, CWY 004), and the concentration was measured with Qubit.
3. Genomic DNA methylation library construction
(1) Design and synthesis of methylated anti-contamination adapters
The methylated anti-contamination adapter consists of a universal sequence and an anti-contamination tag; the tag comprises 8-10 bases (random sequence), and every cytosine (C) in the adapter is methylation-modified. The synthesized adapter dry powder was centrifuged in a high-speed centrifuge at 12000 rpm, and 10 mM Tris-HCl (pH 8.0) was added to dilute it to 100 μM; 50 μL each of adapter 1 and adapter 2 at 100 μM were annealed before use.
(2) Genomic DNA methylation library construction
A. Genomic DNA fragmentation
DNA obtained from the cells or tissues in step 2 was fragmented to about 200 bp using a Covaris ultrasonicator (Covaris, S220) according to the relevant instructions, with the following parameters: Peak Incident Power 175 W, Duty Factor 10%, Cycles per Burst 200, Treatment Time 180 s.
B. End repair and adapter ligation
Lambda DNA was added to the fragmented DNA at a ratio of 2000:1, and end repair and adapter ligation were then performed using the Rapid Max DNA Lib Prep Kit for Illumina (Abclonal, RK 20217). The adapter used is the methylation-modified adapter synthesized in step (1).
C. Library methylation processing
The product of the previous step was subjected to methylation (bisulfite) treatment using the EpiTect Plus DNA Bisulfite Kit (Qiagen, 59124) according to the instructions.
D. PCR enrichment
The bisulfite-treated product of the previous step was amplified using KAPA HiFi high-fidelity methylation library amplification reagent (KAPA, KK 2802) and purified using the Magbead DNA purification kit (Kangwa, CW 2508M).
E. Library quality inspection
The concentration of the product of the previous step was measured with Qubit.
4. Library targeted enrichment
The library prepared from the genomic DNA of the sample cells or tissue to be tested was subjected to targeted capture with the designed probes, using the reagents of Example 1 and the method of Example 2 of patent 201810580442.6.
5. Bioinformatics analysis
The target-captured library obtained in step 4 was sequenced on a next-generation high-throughput sequencing platform such as NextSeq 500, X Ten or NovaSeq to obtain raw sequencing data, and the following analysis was performed.
(1) Raw data quality control
A. Base recognition
Using the official Illumina software bcl2fastq (version 2.15.0.4), the binary BCL files output by the Illumina sequencer were converted and demultiplexed by sample index sequence into per-sample, readable fastq files.
B. Data quality control
Sequencing adapters were removed using cutadapt (version 1.16) and low-quality bases were deleted to generate clean reads. The cutadapt (version 1.16) parameters were (-q 10,10 --nextseq-trim=10 -a ATCTCGTATGCCGTCTTCTGCTTG -A AGATCGGAAGCGGTCGTGTAGGGAAAGAGTGTAGTAGATCTCGGTGCGTCGCCGTATCATT), and the sequence length is less than 80.
(2) HPV typing analysis
The preprocessed data were aligned to the reference genomes of the 18 HPV types using bismark (-N 1 -p 6 -L 30 --most_valid_alignments 3) to obtain the detailed sequence alignment results (bam file). The information in the bam file was converted into a bed file, and a coverage statistics file was calculated (sample.
(3) Data comparison
Clean reads were aligned to the human cervical cancer related hg19 genome and the HPV genomes using the dedicated methylation alignment software Bismark (v0.17.0), with the alignment parameters (--un --genome_folder -N 1 -p 8 -L 30 --most_valid_alignments 3 -B --samtools_path -o --path_to_bowtie -1 -2), generating a bam file.
Duplicates were removed from the aligned bam file using the deduplicate_bismark module of Bismark.
CpG site information was extracted from the deduplicated bam file using the bismark_methylation_extractor module of Bismark, with the parameters (-p --no_overlap -align 4 -align_r 2 --samtools_path --bedGraph --buffer_size 20G --cytosine_report --genome_folder -o ./ --multicore 10 bam).
The preprocessed data were aligned to the reference genomes of the 18 HPV types using bismark to obtain the detailed sequence alignment results, the coverage was calculated, and data with coverage greater than 40% were retained to obtain the basic typing information.
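As an illustration of the coverage calculation, the sketch below computes breadth of coverage from BED-style alignment intervals. It assumes that "coverage" means the fraction of reference bases covered by at least one read, which the text does not state explicitly; the function name and the toy numbers are hypothetical.

```python
def breadth_of_coverage(intervals, genome_length):
    """Fraction of a reference genome covered by at least one interval.

    intervals: iterable of (start, end) pairs, 0-based and half-open
               (BED convention), all on the same reference.
    """
    covered = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        # Clip to the reference and skip empty intervals.
        start, end = max(start, 0), min(end, genome_length)
        if start >= end:
            continue
        if current_end is None or start > current_end:
            # Disjoint from the running block: close it and open a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent: extend the running block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / genome_length


# Toy intervals on a 1000 bp toy reference, not patent data.
print(breadth_of_coverage([(0, 100), (50, 150), (300, 400)], 1000))
```

The resulting fraction is what would be compared against the 40% cutoff for each HPV type.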
(4) CpG site extraction
The bismark_methylation_extractor module extracts the information for all CpG sites, whereas the targets are the CpG sites of the cervical cancer related genes, so the target CpG site information must be extracted from the full set. This was done using the in-house script s_extract_from_bed.
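The in-house extraction script is not reproduced in the text. The following is a hypothetical stand-in that shows the idea: keep only the CpG records that fall inside target regions. It assumes 1-based CpG positions (as in a cytosine report) and 0-based half-open BED regions; the record layout and function name are illustrative.

```python
from collections import defaultdict


def extract_target_cpgs(cpg_sites, target_regions):
    """Keep the CpG records that fall inside any target region.

    cpg_sites:      iterable of tuples whose first two fields are
                    (chrom, 1-based position); further fields are kept as-is.
    target_regions: iterable of (chrom, start, end), 0-based half-open.
    """
    regions = defaultdict(list)
    for chrom, start, end in target_regions:
        regions[chrom].append((start, end))

    kept = []
    for site in cpg_sites:
        chrom, pos = site[0], site[1]
        zero_based = pos - 1  # convert to BED coordinates
        if any(s <= zero_based < e for s, e in regions.get(chrom, ())):
            kept.append(site)
    return kept


# Toy records: (chrom, position, methylated count, unmethylated count).
sites = [("chr20", 101, 8, 2), ("chr20", 201, 1, 9), ("HPV16", 50, 5, 5)]
targets = [("chr20", 100, 150), ("HPV16", 0, 50)]
print(extract_target_cpgs(sites, targets))
```

The linear scan over regions is adequate for a panel of a few dozen genes; an interval index would be the natural replacement for larger target sets.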
(5) A methylation score was calculated for each gene region for each type.
The methylation score is calculated as
Figure 615103DEST_PATH_IMAGE002
Wherein l is the number of CpG sites of the HPV gene, MHi is the proportion of i consecutive CpG sites that are completely methylated, and P is the proportion of completely methylated fragments among the fragments spanning i consecutive CpG sites.
(6) Predictive analysis of methylation levels
The reads contained in each gene of each type were extracted from the alignment of the sequences against the 18 HPV types, the gene methylation score (MHL) was calculated, the gene coverage depth was counted, and the sample was classified using the MHL value and the gene coverage depth as indicators of the HPV gene methylation level.
Reads were aligned to the human cervical cancer related genome to generate a bam file, duplicates were removed and all CpG sites were extracted, the CpG sites of the cervical cancer related genes were extracted from all CpG sites, the reads contained in each high-density, high-association region were extracted, the methylation score (MHL) within each region was calculated, and the sample was classified using the MHL value as the indicator of the methylation level of the cervical cancer related genes.
The methylation detection targets for the human cervical cancer related genes are the CpG sites within the probe design regions, and detection comprises the following steps:
dividing CpG high-density regions, namely regions in which the distance between every pair of adjacent CpG sites is less than 100 bp;
screening a high-association site set within each CpG high-density region;
two sites are considered methylation-associated on a read when both are methylated or both are unmethylated. The degree of methylation association of two sites is defined as the ratio of the number of reads supporting simultaneous methylation or simultaneous non-methylation of the two sites to the total number of reads covering both sites. A high-association site set is defined as a set in which every site has an association degree of at least 0.8 with at least one of the other sites. This rule is used to further divide the sites into groups (called site sets) and to filter out interfering sites;
calculating a methylation score for each site set.
The methylation score is calculated as
Figure 467522DEST_PATH_IMAGE003
Wherein l is the number of CpG sites contained in the cervical cancer related gene, MHi is the proportion of i consecutive CpG sites that are completely methylated, and P is the proportion of completely methylated fragments among the fragments spanning i consecutive CpG sites; to distinguish cervicitis/CIN1 from CIN2/CIN3/cervical cancer more effectively, i starts from 4 when the methylation score is calculated.
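The region partitioning and site screening steps above can be sketched as follows. The read representation (a dict from CpG position to methylation state) and all function names are hypothetical, and the denominator of the association degree is taken to be the reads covering both sites, which is one reading of the definition.

```python
def split_dense_regions(positions, max_gap=100):
    """Group CpG positions so that adjacent sites in a group are < max_gap apart."""
    groups = []
    for pos in sorted(positions):
        if groups and pos - groups[-1][-1] < max_gap:
            groups[-1].append(pos)
        else:
            groups.append([pos])
    return groups


def association_degree(reads, site_a, site_b):
    """Concordant reads / reads covering both sites.

    reads: list of dicts mapping CpG position -> True (methylated) or False.
    A read is concordant when both sites have the same state.
    """
    covering = [r for r in reads if site_a in r and site_b in r]
    if not covering:
        return 0.0
    concordant = sum(1 for r in covering if r[site_a] == r[site_b])
    return concordant / len(covering)


def high_association_sites(reads, sites, min_degree=0.8):
    """Keep sites associated (>= min_degree) with at least one other site."""
    return [
        s for s in sites
        if any(association_degree(reads, s, t) >= min_degree
               for t in sites if t != s)
    ]


# Toy reads over three CpG sites, not patent data.
toy_reads = [
    {10: True, 50: True, 140: False},
    {10: False, 50: False, 140: False},
    {10: True, 50: True, 140: True},
    {10: True, 50: True, 140: False},
    {10: False, 50: False, 140: True},
]
print(split_dense_regions([10, 50, 140, 300, 350]))
print(high_association_sites(toy_reads, [10, 50, 140]))
```

In the toy data, sites 10 and 50 always agree and form a site set, while site 140 agrees with them on only 2 of 5 reads and is filtered out as an interfering site.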
Example 2 HPV typing detection
With the informed consent of the patients, exfoliated cervical cell samples were collected from 6 patients to be tested, and HPV types were detected by the method of Example 1. The results are shown in Table 2: the first column is the sample number; the second column is the virus reference ID; the third column is the average sequencing depth; the fourth column is the number of reads covering the target region; the fifth column is the coverage of the target region; the sixth column is the proportion of target-region bases with a sequencing depth of at least 20×; the seventh column is the HPV type and the corresponding genes covered by the sequenced reads. The data show that the method covers the HPV genome well: the average depth can reach 10,000× (with 20× coverage reaching 100%), common HPV genotypes are covered, and the results are consistent with the fluorescent quantitative detection results.
Table 2 test results of cervical exfoliated cell samples of patients to be tested
Figure 520054DEST_PATH_IMAGE004
Figure 912858DEST_PATH_IMAGE005
EXAMPLE 3 HPV methylation detection
The methylation level of HPV genes can participate in the integration of viral genes into host cells, regulate oncogene expression, viral proliferation and immune escape, and promote the development of cervical cancer. With the informed consent of the patients, 80 exfoliated cervical cell samples with different degrees of lesion (≤CIN1 and CIN2+) were tested using the kit and method of Example 1; the results are shown in Table 3. The CpG sites of the HPV genome were extracted, a methylation value (MHL) was calculated for each site set, and the MHL value was multiplied by the average depth of the gene region to obtain a judgment value. The threshold was determined to be 1.04 using Youden's J statistic (sensitivity 79.59%, specificity 67.74%, AUC 0.74; FIG. 1). The judgment value is positively correlated with the degree of cervical lesion (FIG. 2): if the judgment value is greater than or equal to 1.04, the sample is judged to be CIN2+; if it is less than 1.04, the sample is judged to be ≤CIN1.
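Threshold selection with Youden's J statistic (J = sensitivity + specificity - 1) can be sketched as below. The function name and the toy scores are hypothetical and do not reproduce the 80-sample data set; the rule "score >= threshold means positive" mirrors the decision rule in the text.

```python
def youden_threshold(scores, labels):
    """Pick the cutoff that maximises Youden's J.

    scores: judgment values, one per sample.
    labels: True for CIN2+ (positive), False for <=CIN1 (negative).
    Returns (threshold, J, sensitivity, specificity); ties keep the
    lowest threshold.
    """
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes are needed to compute Youden's J")

    best = None
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= threshold)
        tn = sum(1 for s, y in zip(scores, labels) if not y and s < threshold)
        sensitivity = tp / positives
        specificity = tn / negatives
        j = sensitivity + specificity - 1
        if best is None or j > best[1]:
            best = (threshold, j, sensitivity, specificity)
    return best


# Toy data for illustration only.
toy_scores = [0.2, 0.5, 0.9, 1.1, 1.5, 2.0]
toy_labels = [False, False, True, False, True, True]
result = youden_threshold(toy_scores, toy_labels)
print(result)
```

Only observed scores are tried as candidate cutoffs, which is sufficient because J can change only at those values.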
TABLE 3 detection results of samples of exfoliated cervical cells with different lesions associated with HPV infection in 80 cases
Figure 449625DEST_PATH_IMAGE006
Example 4 Integrated detection of HPV typing, HPV DNA methylation and methylation of human cervical carcinoma-associated genes
After patient consent, 50 cases of cervical exfoliated cells to be detected were subjected to detection of HPV typing, HPV DNA methylation and methylation of human cervical cancer-related genes by the method of example 1, and the results are shown in Table 4: the second column is HPV typing results; third and fourth columns for HPV DNA methylation detection; the fifth column is a judgment result of human cervical carcinoma related genome DNA methylation; the sixth column is the result of histopathology.
TABLE 4 50 examples of integrated test results of exfoliated cervical cells to be tested
Figure 771147DEST_PATH_IMAGE007
Figure 606248DEST_PATH_IMAGE008
Finally, it is noted that the above-mentioned embodiments illustrate rather than limit the invention, and that, while the invention has been described with reference to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A method for constructing an integrated detection model for human papillomavirus typing and related gene methylation, characterized by comprising the following steps:
extracting DNA from the cervical cells to be tested, fragmenting the DNA, and performing end repair and adapter ligation;
performing methylation treatment, followed by PCR enrichment and quality inspection;
designing probes based on the cervical cancer-related genes and the full-length sequences of 18 HPV genomes;
the 18 HPV genomes comprise HPV6, HPV11, HPV16, HPV18, HPV31, HPV33, HPV35, HPV39, HPV45, HPV51, HPV52, HPV56, HPV58, HPV59, HPV66, HPV68a, HPV69 and HPV82;
performing targeted enrichment of the nucleic acids in the target regions with the probes, followed by high-throughput sequencing;
establishing the integrated detection model for human papillomavirus typing and related gene methylation, performing bioinformatics analysis, and aligning reads for HPV typing;
extracting, from the alignment of the sequences against the 18 HPV types, the reads contained in each gene of each type, calculating the gene methylation score, counting the gene coverage depth, and classifying the sample using the methylation score and the gene coverage depth as indicators of the HPV gene methylation level;
detecting methylation of the cervical cancer-related genes.
2. The method for constructing the integrated detection model for human papillomavirus typing and related gene methylation according to claim 1, wherein aligning reads for HPV typing comprises: aligning the preprocessed data to the 18 HPV reference genome sequences using Bismark to obtain detailed alignment results, calculating the coverage, and retaining data with coverage greater than 40% to obtain basic typing information.
3. The method for constructing the human papillomavirus typing and related gene methylation integrated detection model according to claim 1, wherein the cervical cancer related genes comprise: ADCYAP1, AJAP1, ANKRD18CP, APC, ASTN1, CADM1, CCNA2, CDH1, CDH13, CDH6, CDKN2A, DAPK1, DLX1, EPB41L3, FAM19A4, FANCI, FHIT, GATA4, GFRA1, HS3ST2, JAM3, KCNIP4, LHX8, LMX1A, MAL, MIR124-1, MIR124-2, MIR124-3, PAX1, PCDHA13, PCDHA4, POU4F3, RARB, RNASACN 2A, RUBCNL, RXFP3, SLIT2, SOX1, SOX17, ST6GAL 5, TERT, TIMP3, WIF1, ZNF671, and ZSS 1.
4. The method for constructing the human papillomavirus typing and related gene methylation integrated detection model according to claim 1, wherein the designing of the probe comprises:
obtaining the methylation regions of the cervical cancer-related genes and of the full-length sequences of the 18 HPV genomes, extracting reference sequences, designing simulated hypermethylated and hypomethylated bisulfite-treated sequences from the forward and reverse strands of each reference sequence, and then, using these sequences as templates, taking the 120 bp sequence starting at the first base as a probe, shifting n bases downstream and taking the next 120 bp sequence as a probe, and so on up to the last 120 bp sequence.
5. The method for constructing the integrated detection model for human papillomavirus typing and related gene methylation according to claim 1, wherein classifying the sample using the methylation score and the gene coverage depth as indicators of the HPV gene methylation level comprises:
multiplying the methylation value of each gene region of each HPV type by the gene coverage depth to obtain a judgment value, and judging the sample as CIN2+ if the judgment value is greater than or equal to 1.04, or as ≤CIN1 if the judgment value is less than 1.04.
6. The method for constructing the integrated detection model for human papillomavirus typing and related gene methylation according to claim 1, wherein the gene methylation score is calculated by the following formula:
[Formula provided as an image in the original publication.]
wherein MHL is the methylation score, l is the number of CpG sites contained in the cervical cancer-related gene, MHi denotes i consecutive CpG sites that are completely methylated, and P is the proportion of such completely methylated runs of i consecutive CpG sites.
7. The method for constructing the integrated detection model for human papillomavirus typing and related gene methylation according to claim 1, wherein detecting methylation of the cervical cancer-related genes comprises: aligning reads to the human cervical cancer-related genome to generate a BAM file, removing duplicates and extracting all CpG sites, selecting from them the CpG sites of the cervical cancer-related genes, extracting the reads contained in each high-density, highly associated region, calculating the methylation score within each region, and classifying the sample using the methylation score as the indicator of the methylation level of the cervical cancer-related genes.
8. An integrated detection model for human papillomavirus typing and related gene methylation, characterized by comprising:
a cervical cell DNA extraction module, used for extracting DNA from the cervical cells to be tested, fragmenting the DNA, and performing end repair and adapter ligation;
a sequencing module, used for performing methylation treatment, followed by PCR enrichment and quality inspection;
a probe design module, used for designing probes based on the cervical cancer-related genes and the full-length sequences of 18 HPV genomes; the 18 HPV genomes comprise HPV6, HPV11, HPV16, HPV18, HPV31, HPV33, HPV35, HPV39, HPV45, HPV51, HPV52, HPV56, HPV58, HPV59, HPV66, HPV68a, HPV69 and HPV82;
an enrichment sequencing module, used for targeted enrichment of the nucleic acids in the target regions with the probes, followed by high-throughput sequencing;
a bioinformatics analysis module, used for performing bioinformatics analysis and aligning reads for HPV typing;
extracting, from the alignment of the sequences against the 18 HPV types, the reads contained in each gene of each type, calculating the gene methylation score, counting the gene coverage depth, and classifying the sample using the methylation score and the gene coverage depth as indicators of the HPV gene methylation level;
detecting methylation of the cervical cancer-related genes.
9. The integrated detection model for human papillomavirus typing and related gene methylation according to claim 8, wherein aligning reads for HPV typing comprises aligning the preprocessed data to the 18 HPV reference genome sequences using Bismark to obtain detailed alignment results, calculating the coverage, and retaining data with coverage greater than 40% to obtain basic typing information;
classifying the sample using the methylation score and the gene coverage depth as indicators of the HPV gene methylation level comprises:
multiplying the methylation value of each gene region of each HPV type by the gene coverage depth to obtain a judgment value, and judging the sample as CIN2+ if the judgment value is greater than or equal to 1.04, or as ≤CIN1 if the judgment value is less than 1.04.
10. The integrated detection model for human papillomavirus typing and related gene methylation according to claim 8, wherein detecting methylation of the cervical cancer-related genes comprises: aligning reads to the human cervical cancer-related genome to generate a BAM file, removing duplicates and extracting all CpG sites, selecting from them the CpG sites of the cervical cancer-related genes, extracting the reads contained in each high-density, highly associated region, calculating the methylation value within each region, and classifying the sample using the methylation value as the indicator of the methylation level of the cervical cancer-related genes.
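The probe design step recited in claim 4 (simulated bisulfite-treated templates from both strands, tiled into 120 bp probes with a shift of n bases) can be sketched as below. This is an illustrative sketch only. It assumes that "hypermethylated" means every cytosine in a CpG context is protected from conversion, that "hypomethylated" means every cytosine is converted, and that a final probe is anchored at the end of the template when the step does not land there exactly. All function names are hypothetical.

```python
from typing import List

PROBE_LEN = 120  # probe length stated in the claims

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse strand of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def bisulfite_convert(seq: str, methylated: bool) -> str:
    """Simulate bisulfite treatment: unprotected C becomes T.

    methylated=True keeps C in CpG context (assumed hypermethylated template);
    methylated=False converts every C (assumed hypomethylated template).
    """
    seq = seq.upper()
    out = []
    for i, base in enumerate(seq):
        in_cpg = base == "C" and i + 1 < len(seq) and seq[i + 1] == "G"
        if base == "C" and not (methylated and in_cpg):
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


def tile(template: str, step: int, probe_len: int = PROBE_LEN) -> List[str]:
    """Cut probe_len windows starting at base 1 and moving `step` bases each time."""
    if step <= 0:
        raise ValueError("step must be positive")
    last_start = len(template) - probe_len
    if last_start < 0:
        return []  # template shorter than one probe
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # assumption: always include the last window
    return [template[s:s + probe_len] for s in starts]


def design_probes(reference: str, step: int) -> List[str]:
    """Probes from both strands, each in hyper- and hypomethylated form."""
    probes: List[str] = []
    for strand in (reference, reverse_complement(reference)):
        for methylated in (True, False):
            probes.extend(tile(bisulfite_convert(strand, methylated), step))
    return probes


if __name__ == "__main__":
    toy_reference = "ACGT" * 50  # 200 bp toy sequence, not a real HPV region
    print(len(design_probes(toy_reference, step=40)))
```

A smaller step n gives denser tiling and more probes per region; the claims leave n unspecified, so it stays a parameter here.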
CN202211598172.4A 2022-12-14 2022-12-14 Human papilloma virus typing and related gene methylation integrated detection model and construction method thereof Pending CN115612744A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202211598172.4A CN115612744A (en) 2022-12-14 2022-12-14 Human papilloma virus typing and related gene methylation integrated detection model and construction method thereof
CN202310623058.0A CN116463422A (en) 2022-12-14 2022-12-14 Human papilloma virus typing and cervical cancer related gene methylation integrated detection model and construction method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211598172.4A CN115612744A (en) 2022-12-14 2022-12-14 Human papilloma virus typing and related gene methylation integrated detection model and construction method thereof

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202310623058.0A Division CN116463422A (en) 2022-12-14 2022-12-14 Human papilloma virus typing and cervical cancer related gene methylation integrated detection model and construction method thereof

Publications (1)

Publication Number Publication Date
CN115612744A true CN115612744A (en) 2023-01-17

Family

ID=84879608

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202310623058.0A Pending CN116463422A (en) 2022-12-14 2022-12-14 Human papilloma virus typing and cervical cancer related gene methylation integrated detection model and construction method thereof
CN202211598172.4A Pending CN115612744A (en) 2022-12-14 2022-12-14 Human papilloma virus typing and related gene methylation integrated detection model and construction method thereof

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN202310623058.0A Pending CN116463422A (en) 2022-12-14 2022-12-14 Human papilloma virus typing and cervical cancer related gene methylation integrated detection model and construction method thereof

Country Status (1)

Country Link
CN (2) CN116463422A (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116646010B (en) * 2023-07-27 2024-03-29 深圳赛陆医疗科技有限公司 Human virus detection method and device, equipment and storage medium
CN117265096A (en) * 2023-09-07 2023-12-22 北京美康基因科学股份有限公司 Kit for cervical high-grade lesions and early detection of cervical cancer

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220064742A1 (en) * 2018-11-30 2022-03-03 Genomic Vision Association between integration of viral as hpv or hiv genomes and the severity and/or clinical outcome of disorders as hpv associated cervical lesions or aids pathology
CN113423843A (en) * 2018-12-04 2021-09-21 香港精准医学技术有限公司 DNA methylation biomarkers for early detection of cervical cancer
CN114068009A (en) * 2020-08-07 2022-02-18 四川医枢科技股份有限公司 Cervical cancer and vulvar cancer clinical decision, teaching and scientific research auxiliary support method and system
CN113943817A (en) * 2021-12-20 2022-01-18 北京迈基诺基因科技股份有限公司 Cervical cancer canceration level evaluation model and construction method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王海英;: "宫颈病变程度与人乳头瘤病毒16型甲基化水平的关系研究" *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115797356A (en) * 2023-02-09 2023-03-14 山东第一医科大学附属省立医院(山东省立医院) Nuclear magnetic resonance tumor region extraction method
CN117316289A (en) * 2023-09-06 2023-12-29 复旦大学附属华山医院 Methylation sequencing typing method and system for central nervous system tumor
CN117316289B (en) * 2023-09-06 2024-04-26 复旦大学附属华山医院 Methylation sequencing typing method and system for central nervous system tumor
CN117701718A (en) * 2024-02-04 2024-03-15 湖南宏雅基因技术有限公司 Gene methylation marker for diagnosing cervical cancer, primer pair and application thereof
CN117701718B (en) * 2024-02-04 2024-05-07 湖南宏雅基因技术有限公司 Gene methylation marker for diagnosing cervical cancer, primer pair and application thereof

Also Published As

Publication number Publication date
CN116463422A (en) 2023-07-21

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20230117