CN110527714A - Method for detecting the integration site of HPV in a host genome - Google Patents

Method for detecting the integration site of HPV in a host genome

Info

Publication number
CN110527714A
Authority
CN
China
Prior art keywords
hpv
sequence
host genome
integration site
genome
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910840742.8A
Other languages
Chinese (zh)
Other versions
CN110527714B (en)
Inventor
孟博
田埂
王伟伟
董如一
杨文娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Meta Code Gene Technology (beijing) Ltd By Share Ltd
Original Assignee
Meta Code Gene Technology (beijing) Ltd By Share Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Meta Code Gene Technology (beijing) Ltd By Share Ltd filed Critical Meta Code Gene Technology (beijing) Ltd By Share Ltd
Priority to CN201910840742.8A priority Critical patent/CN110527714B/en
Publication of CN110527714A publication Critical patent/CN110527714A/en
Application granted granted Critical
Publication of CN110527714B publication Critical patent/CN110527714B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869 Methods for sequencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/70 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving virus or bacteriophage
    • C12Q1/701 Specific hybridization probes
    • C12Q1/708 Specific hybridization probes for papilloma

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Health & Medical Sciences (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Engineering & Computer Science (AREA)
  • Immunology (AREA)
  • Microbiology (AREA)
  • Biotechnology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Analytical Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Virology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The present invention discloses a method for detecting the integration site of HPV in a host genome, comprising the following steps: extracting host genomic DNA from a biological sample from a subject and fragmenting it into fragments with a main peak of 800 bp-4 kb; performing a first detection on the resulting fragments with a probe set to obtain a target sequence; pretreating the obtained target sequence to improve detection sensitivity; and performing a second detection on the target sequence by electrical-signal-based sequencing without PCR amplification, thereby confirming the integration site of HPV in the host genome. The method of the invention allows the state in which HPV is present in host cells to be analyzed and judged quickly and in a timely manner, has significant advantages for clinical application, and can determine HPV integration information more effectively without error-correction analysis.

Description

Method for detecting the integration site of HPV in a host genome
Technical field
The present invention relates to the detection of viral integration sites, and more particularly to a method for detecting the integration site of HPV in a host genome.
Background art
Integration of HPV into the human genome is an important event leading to the development of cancer. Understanding of this process is still limited, but a growing body of research indicates that HPV integration is a potential molecular marker for molecular diagnosis and individualized treatment, as well as a very valuable prognostic indicator. At present, HPV integration is mostly studied by probe capture based on second-generation sequencing. For example, CN107739761A discloses a high-throughput sequencing method for HPV typing and integration detection. That method selects the genes of the currently known HPV subtypes and combines them with second-generation high-throughput sequencing to detect more comprehensively the HPV types infecting a patient, overcoming the low accuracy, high false-positive rate, poor reproducibility and high missed-diagnosis rate of traditional detection methods. In molecular diagnostics, the most direct and specific technology is gene sequencing, and compared with the classical Sanger sequencing method that is still widely used, second-generation high-throughput sequencing offers higher throughput, faster sequencing, higher accuracy, lower cost and richer information. With the help of second-generation high-throughput sequencing, that method can accurately type high-risk and low-risk HPV and detect whether integration into the human genome has occurred, so that the tested individual can be assessed accurately and individually, the risk of lesions can be anticipated, and tumors can thereby be prevented. Although capture technology based on second-generation sequencing is relatively mature, it relies entirely on probe capture or target amplification of short DNA fragments. Second-generation reads are short and the sensitivity is excessively high, which easily leads to false-positive results; moreover, second-generation methods detect only the breakpoint position, so the integrated structure of long sequences is difficult to identify.
Recently, third-generation sequencing has increasingly shown its strengths of long reads and rapid pathogen detection, but it is still rarely used to detect viral integration sites. The error rate of third-generation sequencing is relatively high, with a generally accepted figure of 10-15% at present, which calls the credibility of the results into question. Analysis of third-generation sequencing data therefore usually needs an added error-correction step to improve the accuracy and reliability of the sequencing result. Many error-correction methods exist, such as correcting third-generation results against second-generation results, or repeatedly self-correcting third-generation results to reduce the error rate. However, differences between the results of correction software and the limited ability of users to choose among them significantly restrict their application.
Summary of the invention
To solve at least some of the technical problems in the prior art, the present invention provides a method for detecting the integration site of HPV in a host genome. Specifically, the present invention includes the following.
A first aspect of the present invention provides a method for detecting the integration site of HPV in a host genome, comprising the following steps:
(1) extracting host genomic DNA from a biological sample from a subject, and fragmenting the host genomic DNA into fragments with a main peak of 800 bp-4 kb;
(2) performing a first detection on the fragments obtained in step (1) with a probe set to obtain a target sequence, wherein the probe set consists of a plurality of probes, each probe comprises a sequence capable of selectively hybridizing with a specific region of the HPV genome, and the target sequence comprises a sequence derived from an HPV gene and a sequence derived from the host genome;
(3) pretreating the target sequence obtained in step (2) to obtain a pretreated target sequence, so as to improve detection sensitivity; and
(4) performing a second detection on the target sequence by electrical-signal-based sequencing without PCR amplification, thereby confirming the integration site of HPV in the host genome.
Preferably, in the method for detecting the integration site of HPV in a host genome according to the present invention, the biological sample is cervical exfoliated cells or cervical cancer tissue.
Preferably, in the method for detecting the integration site of HPV in a host genome according to the present invention, each probe in the probe set is designed against a contiguous region of the HPV whole genome.
Preferably, in the method for detecting the integration site of HPV in a host genome according to the present invention, the pretreatment in step (3) comprises enriching the target sequence.
Preferably, in the method for detecting the integration site of HPV in a host genome according to the present invention, the pretreatment in step (3) comprises adding a first adapter to both ends of the target sequence and then performing PCR amplification with a primer that selectively hybridizes with the first adapter.
Preferably, in the method for detecting the integration site of HPV in a host genome according to the present invention, the second detection in step (4) comprises adding A to the ends of the pretreated target sequence, then ligating a barcode, followed by nanopore sequencing.
Preferably, in the method for detecting the integration site of HPV in a host genome according to the present invention, the depth of the nanopore sequencing is 5000× to 10000×.
A second aspect of the present invention provides another method for detecting the integration site of HPV in a host genome, comprising the following steps:
(1) extracting host genomic DNA from a biological sample from a subject, and fragmenting the host genomic DNA into fragments with a main peak of 800 bp-4 kb;
(2) ligating a third adapter to one end of the fragments, and amplifying with a primer set consisting of a plurality of specific primers together with a universal primer that selectively hybridizes with the third adapter to obtain a target sequence, wherein the specific primers are each designed to selectively hybridize with contiguous sequences in the HPV genome, so that the primer set covers the entire HPV genome sequence; and
(3) detecting the target sequence by electrical-signal-based sequencing without PCR amplification, thereby confirming the integration site of HPV in the host genome.
A third aspect of the present invention provides another method for detecting the integration site of HPV in a host genome, comprising the following steps:
(1) extracting host genomic DNA from a biological sample from a subject, and fragmenting the host genomic DNA into fragments with a main peak of 800 bp-4 kb;
(2) ligating a third adapter to one end of the fragments, and amplifying with a primer set consisting of a plurality of specific primers together with a universal primer that selectively hybridizes with the third adapter to obtain a target sequence, wherein the specific primers are each designed to selectively hybridize with the gene sequence flanking one known HPV integration site; and
(3) detecting the target sequence by electrical-signal-based sequencing without PCR amplification, thereby confirming the integration site of HPV in the host genome.
Further, in the method for detecting the integration site of HPV in a host genome according to the present invention, confirming the integration site of HPV in the host genome further comprises a data analysis step, which comprises: performing quality control on the raw sequencing data and removing noise data to obtain the data to be analyzed; aligning the data to be analyzed against the HPV genome sequence and the host genome sequence separately to obtain results containing the alignment quality score and the positions of the aligned sequences; calculating the score and identity of each alignment according to the UCSC formula, selecting the best HPV alignment and the best host alignment, and combining the two results into one output list; retaining the sequencing reads in the list that align to both the HPV genome sequence and the host genome sequence and comparing the positions of the two alignments, wherein a read is judged to be an HPV integration sequence when the two aligned segments are separated on the read by a gap or an overlap of 10 bases; and merging identical breakpoint sequences and counting them, wherein a breakpoint is judged to be a true HPV integration site when 5 or more sequences are detected with the same breakpoint.
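The junction test and the read-support threshold described above can be sketched as follows. This is a minimal illustration and not the patent's implementation: the `Hit` record, the helper names, and the reading of "10 bases" as "at most 10 bases of gap or overlap" are assumptions made for the example.

```python
from collections import Counter
from typing import NamedTuple, Optional

MAX_JUNCTION_SLACK = 10   # assumption: gap or overlap of at most 10 bases
MIN_SUPPORTING_READS = 5  # from the text: 5 or more reads share the breakpoint


class Hit(NamedTuple):
    """Best alignment of one read to one reference (hypothetical record)."""
    q_start: int    # 0-based start of the aligned span on the read
    q_end: int      # exclusive end of the aligned span on the read
    ref_name: str   # chromosome name or HPV type
    ref_start: int  # 0-based start on the reference
    ref_end: int    # exclusive end on the reference
    strand: str     # "+" or "-"


def _edge(hit: Hit, at_read_end: bool) -> int:
    """Reference coordinate of the alignment edge that faces the junction."""
    forward = hit.strand == "+"
    return hit.ref_end if forward == at_read_end else hit.ref_start


def junction(hpv: Hit, host: Hit) -> Optional[tuple]:
    """Return (host_name, host_pos, hpv_name, hpv_pos) for a chimeric read."""
    hpv_first = hpv.q_start <= host.q_start
    first, second = (hpv, host) if hpv_first else (host, hpv)
    distance = second.q_start - first.q_end  # > 0 is a gap, < 0 an overlap
    if abs(distance) > MAX_JUNCTION_SLACK:
        return None
    hpv_pos = _edge(hpv, at_read_end=hpv_first)
    host_pos = _edge(host, at_read_end=not hpv_first)
    return (host.ref_name, host_pos, hpv.ref_name, hpv_pos)


def call_sites(read_hits) -> dict:
    """Count identical breakpoints and keep those with enough supporting reads."""
    counts = Counter()
    for hpv, host in read_hits:
        breakpoint_key = junction(hpv, host)
        if breakpoint_key is not None:
            counts[breakpoint_key] += 1
    return {bp: n for bp, n in counts.items() if n >= MIN_SUPPORTING_READS}
```

Each element of `read_hits` pairs the best HPV hit with the best host hit for one read; reads whose two segments are far apart on the read are discarded before counting.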
The present invention is the first to use probes or primers to perform a preliminary detection on fragmented host genomic DNA and then apply an optimized library construction and analysis strategy to the target sequences obtained from that preliminary detection. It thereby establishes a complete method, from experimental procedure to accurate HPV breakpoint analysis, in which a bioinformatics workflow that requires no error correction can quickly and accurately identify the integration sites of HPV in the human genome, providing a new strategy to meet research and clinical needs.
Brief description of the drawings
Fig. 1 shows the primers and sequence information for the exemplary verified breakpoints in the method of the invention.
Fig. 2 is a schematic diagram of the PCR amplification results for new breakpoints in the method of the invention. The figure shows 1% agarose gel electrophoresis of the DNA products obtained by PCR amplification with primers designed against the human and HPV sequences at the specific integration sites detected by the method of the invention. The letters A-O denote 13 different breakpoint positions. The figure shows that every primer pair successfully amplified the integration site sequence obtained by sequencing with this method.
Fig. 3 is a schematic diagram of the Sanger sequencing results for the breakpoints in the method of the invention. The letters A-O denote 13 different breakpoint positions. The results shown are first-generation Sanger sequencing results for the amplification products of Fig. 2. After alignment with BLAST on the NCBI website, these sequences were shown to contain partial human and HPV sequences, at positions consistent with the sequencing results of the present method. The full-length sequence information is given in Fig. 1.
Fig. 4 shows the position information of the 13 breakpoints verified by the Sanger method.
Fig. 5 shows the 17 known breakpoints that were detected.
Fig. 6 is a Venn diagram comparing the numbers of HPV breakpoints identified by different methods. Nanopore denotes the results identified by the method of the invention, NGS denotes the results identified by second-generation sequencing, and Paper denotes published known breakpoints.
Specific embodiment
Various exemplary embodiments of the present invention are now described in detail. This detailed description should not be regarded as limiting the invention, but as a more detailed description of certain aspects, features and embodiments of the invention.
It should be understood that the terms used herein are only for describing particular embodiments and are not intended to limit the invention. In addition, for numerical ranges in the present invention, it should be understood that the upper and lower limits of the range and every intermediate value between them are specifically disclosed. Every smaller range between any stated value or intermediate value within a stated range and any other stated value or intermediate value within that range is also included in the invention. The upper and lower limits of these smaller ranges may independently be included in or excluded from the range.
Unless otherwise stated, all technical and scientific terms used herein have the same meaning as commonly understood by a person of ordinary skill in the art to which the invention belongs. Although only preferred methods and materials are described herein, any methods and materials similar or equivalent to those described herein may also be used in the practice or testing of the invention. All documents mentioned in this specification are incorporated by reference to disclose and describe the methods and/or materials in connection with which the documents are cited. In case of conflict with any incorporated document, the content of this specification prevails.
At present, the integration site of a virus in a genome is mostly determined by Sanger sequencing and second-generation sequencing. On the one hand, second-generation sequencing has certain advantages, but its read length makes it difficult to detect the integration structure of long viral sequences, and its reliability is low. On the other hand, nanopore sequencing requires repeated self-correction to reduce its error rate. In view of these two points, the inventors found through research that by enriching HPV integration site sequences with multiple sets of designed probes and optimizing the library construction, sequencing and analysis strategy, a complete method can be established from experimental procedure to HPV breakpoint analysis. With a bioinformatics workflow that requires no error correction, the method can quickly and accurately identify the integration sites of HPV in the human genome and provides a new strategy to meet research and clinical needs.
The method of the invention generally comprises the following steps: (I) extracting genomic DNA and obtaining fragmented DNA; (II) obtaining the target sequence; (III) confirming the integration site of HPV in the host genome. Each step is described in detail below.
Step (I):
Step (I) of the invention is the step of extracting genomic DNA and obtaining fragmented DNA. In an exemplary embodiment, it comprises extracting host genomic DNA from a biological sample from a subject and fragmenting it into fragments with a main peak of 800 bp-4 kb, i.e. step (1).
The subject of the invention is generally a mammal, preferably a human. The biological sample is generally a cervix-related sample, including cervical exfoliated cells or cervical cancer tissue. The host genome is the genome of the subject, generally the human genome.
The genome may be fragmented by any known means. Such means can be found in known textbooks, for example publications such as Molecular Cloning: A Laboratory Manual, fourth edition (Cold Spring Harbor). The fragments obtained must have a main peak of 800 bp-4 kb, preferably 1 kb-4 kb, more preferably 1.5 kb-3.5 kb, for example 2.5 kb.
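As an illustration of the size criteria above, the following sketch estimates the main peak of a set of fragment lengths as the centre of the most populated length bin and reports which of the stated ranges contain it. It is a hypothetical helper, not part of the patent: in practice the main peak is read from a gel or fragment analyzer trace, and the 100 bp bin size is an arbitrary assumption.

```python
from collections import Counter

# Ranges stated in the text, in base pairs.
RANGES = [
    ("required", 800, 4000),
    ("preferred", 1000, 4000),
    ("more preferred", 1500, 3500),
]


def main_peak(lengths_bp, bin_size=100):
    """Return the centre of the most populated length bin (the 'main peak')."""
    bins = Counter(length // bin_size for length in lengths_bp)
    # Ties are broken in favour of the shorter bin.
    top_bin, _ = max(bins.items(), key=lambda kv: (kv[1], -kv[0]))
    return top_bin * bin_size + bin_size // 2


def classify_peak(peak_bp):
    """List the named ranges that contain the given main peak."""
    return [name for name, low, high in RANGES if low <= peak_bp <= high]
```

For example, `classify_peak(main_peak(lengths))` returns an empty list when the fragmentation falls outside the required 800 bp-4 kb window.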
Step (II):
Step (II) of the invention is the step of obtaining the target sequence from the resulting fragments. The target sequence is a sequence comprising both a sequence derived from an HPV gene and a sequence derived from the host genome. The sequence derived from the HPV gene may be located at the 5' end of the target sequence or at its 3' end; there is no particular limitation. Likewise, the sequence derived from the host genome may be located at the 5' end of the target sequence or at its 3' end.
In certain embodiments, step (II) of the invention comprises the following steps:
(2) performing a first detection on the resulting fragments with a probe set to obtain the target sequence, wherein the probe set consists of a plurality of probes and each probe comprises a sequence capable of selectively hybridizing with a specific region of the HPV genome; and (3) pretreating the obtained target sequence to improve detection sensitivity.
The purpose of the first detection step is to use a specific probe set to obtain host genome fragments that contain HPV genes, i.e. target sequences containing an integration site. The probe set of the invention is a specially designed combination of multiple probes. Preferably, the probes in the probe set target different regions of the HPV whole genome, and these regions are contiguous, so that the whole HPV genome is covered. Preferably, the probe set covers the HPV whole genome at 2×. Here, the HPV whole genome includes the gene regions encoding the important oncogenic protein factors, and comprises the early gene region, the late gene region and the long control region.
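A contiguous 2× tiling of this kind can be sketched as follows. The probe length of 120 nt, the treatment of the HPV genome as circular, and the illustrative genome length of 7906 bp are assumptions made for the example; the patent does not specify them.

```python
def tile_probes(genome_len, probe_len=120, coverage=2):
    """Return (start, end) probe intervals tiled around a circular genome.

    Probes start every probe_len // coverage bases; an end coordinate larger
    than genome_len means the probe wraps around the origin.
    """
    step = probe_len // coverage
    if step <= 0:
        raise ValueError("probe_len must be at least as large as coverage")
    return [(start, start + probe_len) for start in range(0, genome_len, step)]


def depth_per_base(genome_len, probes):
    """Count how many probes cover each base of the circular genome."""
    depth = [0] * genome_len
    for start, end in probes:
        for pos in range(start, end):
            depth[pos % genome_len] += 1
    return depth
```

With `probe_len=120` and `coverage=2`, consecutive probes overlap by 60 bases, so every base is covered by at least two probes; the design can be checked with `min(depth_per_base(genome_len, probes))`.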
In step (3), i.e. pretreating the obtained target sequence to improve detection sensitivity, the pretreatment may include a process of enriching the target sequence. The pretreatment may also include adding a first adapter to each end of the target sequence and then performing PCR amplification with a primer that selectively hybridizes with at least part of the first adapter. The first adapter is generally 6-50 nt long, preferably 10-30 nt, more preferably 10-20 nt. The sequence of the first adapter is not particularly limited as long as it can be distinguished from sequences in the host genome and the HPV genome. The first adapter sequence may include a known sequence and a random sequence. For example, the known sequence may be 6 to 13 consecutive bases and the random sequence may be 1 to 13 consecutive bases. For example, the first adapter sequence may be TATGGGCAGTCGT. Adapters known in the art may be used as the first adapter. In one exemplary embodiment, the first adapter is preferably an adapter suitable for A-T ligation libraries. More preferably, blocker sequences such as the Integrated DNA Technologies Universal Blocker-TS Mix and Human Cot-1 DNA, which blocks repetitive genomic sequences, are used; the adapter is ligated to both ends of the host DNA fragments, and PCR amplification is then performed with a primer that selectively hybridizes with the first adapter. The size and abundance of the enriched PCR product are checked on an Agilent High Sensitivity DNA 2100 chip.
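The two adapter requirements above (a length of 6-50 nt and a sequence distinguishable from the host and HPV genomes) can be screened with a simple check such as the one below. It is a simplified sketch: it only looks for exact occurrences of the adapter or its reverse complement in the supplied reference strings, whereas a real design would also consider near matches, and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def adapter_is_usable(adapter, references, min_len=6, max_len=50):
    """Check adapter length and absence from every reference, on both strands."""
    adapter = adapter.upper()
    if not min_len <= len(adapter) <= max_len:
        return False
    if set(adapter) - set("ACGT"):
        return False  # only unambiguous bases are handled in this sketch
    rc = revcomp(adapter)
    for ref in references:
        ref = ref.upper()
        if adapter in ref or rc in ref:
            return False
    return True
```

For instance, the example adapter TATGGGCAGTCGT from the text passes the length check (13 nt) and would be rejected only if it, or its reverse complement ACGACTGCCCATA, occurred in one of the reference sequences passed in.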
In certain embodiments, step (II) of the invention comprises obtaining the target sequence with a combination of specific primers and a universal primer. There are generally multiple specific primers, which together form a primer set. There is generally one universal primer, which is used in combination with the primer set.
The specific primers of the invention are each designed to selectively hybridize with a specific region of the HPV genome, and the specific regions corresponding to the multiple specific primers form a contiguous region or sequence, so that the primer set covers the entire HPV genome sequence.
To introduce the universal primer, the invention requires a third adapter to be ligated to one end of the fragments. The third adapter is generally 6-50 nt long, preferably 10-30 nt, more preferably 10-20 nt. The sequence of the third adapter is not particularly limited as long as it can be distinguished from sequences in the host genome and the HPV genome. The third adapter sequence may include a known sequence and a random sequence. For example, the known sequence may be 6 to 13 consecutive bases and the random sequence may be 1 to 13 consecutive bases. For example, the third adapter sequence may be TATGGGCAGTCGT. Adapters known in the art may be used as the third adapter. The third adapter may be the same as or different from the first adapter described above. It should be noted that, unlike the first adapter, the third adapter may be ligated to only one end of a fragment and not to both ends.
In certain embodiments, each specific primer of the invention is designed to selectively hybridize with the gene sequence flanking one known HPV integration site, so that multiple specific primers can selectively hybridize with the gene sequences flanking multiple known HPV integration sites and information on specific integration sites can be obtained specifically. Known integration sites generally refer to frequently occurring integration hotspots. For example, non-homologous recombination is frequent at integration hotspots, which occur in regions including fragile sites, translocation break sites and transcriptionally active sites.
Step (III):
Step (III) of the invention is confirming the integration site of HPV in the host genome. Specifically, it may comprise performing a second detection on the target sequence by electrical-signal-based sequencing without PCR amplification, thereby confirming the integration site of HPV in the host genome, i.e. step (4). In an exemplary scheme, step (4) comprises adding A to the ends of the target sequence obtained after the first detection, then sequentially ligating a barcode and a second adapter, and performing nanopore sequencing. It should be noted that the sequencing process here does not include a PCR step. In an exemplary embodiment, the sequencing depth is 10000×, so as to achieve deep detection of HPV integration site sequences. Specifically, the DNA ends of the sample obtained from the first detection are treated with a NEBNext end repair/dA-tailing module kit; preferably, a reagent such as NEBNext End Repair/dA-tailed is used for DNA end repair and dA-tail addition. After purification, native barcodes are ligated, and after further purification the second adapter is added to prepare adapter-ligated DNA ends. Preferably, the second adapter is the sequencing adapter provided in the Ligation Sequencing Kit of Oxford Nanopore Technologies. After the purified barcoded DNA product has been ligated to the second adapter, it is purified with magnetic beads and then subjected to nanopore sequencing. The magnetic bead purification is carried out by known methods; in an exemplary embodiment, AMPure XP magnetic beads are used to purify the product. The native barcodes and the second adapter may be synthesized by known methods or purchased as known commercial products, for example PCR-free barcode adapters or a barcode module kit. Preferably, the native barcodes and barcode adapters provided in the Ligation Sequencing Kit of Oxford Nanopore Technologies are used, which suit the PCR-free nanopore sequencer and the subsequent bioinformatics workflow of the method of the invention.
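To give a feel for what a depth of 5000× to 10000× implies, the sketch below converts a target depth into an approximate number of on-target reads. The assumptions are not from the patent: depth is taken as aligned bases divided by the length of the HPV reference, the HPV genome is taken as roughly 7.9 kb (7906 bp is used as an illustrative value), and each read is assumed to contribute a fixed mean aligned length.

```python
import math


def mean_depth(total_aligned_bases, target_len):
    """Mean sequencing depth: aligned bases divided by target length."""
    return total_aligned_bases / target_len


def reads_needed(target_depth, target_len, mean_aligned_len):
    """Approximate number of on-target reads needed to reach target_depth."""
    return math.ceil(target_depth * target_len / mean_aligned_len)
```

Under these assumptions, `reads_needed(10000, 7906, 2500)` gives the number of reads with 2.5 kb of aligned sequence each that would be needed for 10000× over a 7906 bp target; off-target reads and uneven coverage would raise the real requirement.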
Step (III) of the invention further includes a data processing or analysis step. Nanopore sequencing results are known to have a relatively high error rate, currently 10%-15%. At present, errors are corrected with analysis software, but the outputs of different software differ, users have limited ability to choose among them, and such software is rarely applied to studies of viral recombination. The bioinformatics workflow of the invention, based on the criteria for identifying integration sequences, combines two sequence alignments and optimizes the breakpoint screening parameters, which significantly improves the speed and accuracy of the analysis and yields accurate, verifiable viral integration site information. Specifically, the bioinformatics workflow of the invention comprises the following. Quality control is performed on the raw sequencing data and noise data are removed to obtain the data to be analyzed. The quality control procedure is not particularly limited; in an exemplary embodiment, it is carried out with the EPI2ME Agent barcoding analysis workflow of Oxford Nanopore Technologies with a minimum qscore of 7. The data to be analyzed are aligned against the HPV genome sequence and the host genome sequence separately to obtain results containing the alignment quality score and the positions of the aligned sequences; preferably, the alignment command is blat -stepSize=5 -repMatch=2253 -minScore=20 -minIdentity=0 -noHead reference_database input.fa output.psl. The score and identity of each alignment are calculated according to the UCSC formula, the best HPV alignment and the best host alignment are selected, and the two results are combined into one output list. Sequencing reads in the list that align to both the HPV genome sequence and the host genome sequence are retained and the positions of the two alignments are compared; a read is judged to be an HPV integration sequence when the two aligned segments are separated on the read by a gap or an overlap of 10 bases. Identical breakpoint sequences are merged and counted, and a breakpoint is judged to be a true HPV integration site when 5 or more sequences are detected with the same breakpoint. The breakpoint positions on the HPV and human genomes in these sequences are functionally annotated on the genome with the annovar software. It should be noted that although the invention uses HPV integration site detection as the basis of the analysis, the workflow is also suitable for detecting the integration sites of other DNA-integrating viruses in a host.
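The scoring step can be sketched as below for nucleotide alignments. The formulas follow the score and percent-identity calculations published in the UCSC Genome Browser FAQ as best they can be reproduced here, and should be checked against that FAQ before use; the `Psl` record holds only the PSL columns the formulas need, and `best_hit` is a hypothetical helper for picking the best alignment per reference.

```python
import math
from typing import NamedTuple


class Psl(NamedTuple):
    """Subset of PSL columns used for scoring (nucleotide alignments only)."""
    match: int
    mis_match: int
    rep_match: int
    q_num_insert: int
    t_num_insert: int
    q_start: int
    q_end: int
    t_start: int
    t_end: int


def psl_score(p: Psl) -> int:
    """Alignment score: matches plus half of repeat matches, minus penalties."""
    return (p.match + (p.rep_match >> 1)
            - p.mis_match - p.q_num_insert - p.t_num_insert)


def psl_percent_identity(p: Psl) -> float:
    """Percent identity derived from the 'milli-bad' measure (non-mRNA case)."""
    q_ali = p.q_end - p.q_start
    t_ali = p.t_end - p.t_start
    milli_bad = 0
    if min(q_ali, t_ali) > 0:
        size_dif = abs(q_ali - t_ali)
        insert_factor = p.q_num_insert + p.t_num_insert
        total = p.match + p.rep_match + p.mis_match
        if total != 0:
            # Note: Python rounds halves to even, unlike C's round().
            penalty = p.mis_match + insert_factor + round(3 * math.log(1 + size_dif))
            milli_bad = int(1000 * penalty / total)
    return 100.0 - milli_bad * 0.1


def best_hit(hits):
    """Pick the highest-scoring alignment, or None if there are no hits."""
    return max(hits, key=psl_score, default=None)
```

Running `best_hit` once on a read's HPV alignments and once on its host alignments yields the pair of best hits that the workflow then combines and compares by position.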
Further, using the optimized library construction workflow and analysis method described above, the invention accurately identified breakpoint positions in multiple HPV integration-positive samples and verified them by the Sanger method. The results show that this strategy for identifying HPV integration is highly effective.
Embodiment
The present embodiment is used for exemplary illustration the method for the present invention.
One, sample information
The fresh cancerous tissue of an example IIIb Cervix Squamous Cell cancer is selected to carry out HPV integration detection as sample.
II. Experimental procedure
1. Genomic DNA extraction
Genomic DNA was extracted by the salting-out method and quantified with both a Nanodrop 2000 and Qubit; fragment size was checked on a 1.0% agarose gel. The genomic DNA yield was greater than 3 μg and the fragments were larger than 5 kb.
2. DNA fragmentation
Samples were sheared in microTUBE-50 tubes at a shearing volume of 50 μl with the following parameters: Peak Incident Power (W): 75; Duty Factor: 5%; Cycles per Burst: 200. The genomic DNA was sheared into fragments of varying length with a main peak at 800 bp and checked on a 2.0% agarose gel; qualified samples proceeded to the hybridization capture and enrichment steps.
3. HPV probe hybridization
The reference genome sequences for the target region were the HPV genome sequences from the NCBI website. The target capture region was the whole HPV genome, and probes were designed contiguously across the whole genome. DNA probes were designed and synthesized from the HPV whole genome (Integrated DNA Technologies), and hybridization capture of the viral sequences was performed following the instructions for xGen hybridization capture of DNA libraries (Integrated DNA Technologies); the target sequences obtained were enriched by PCR. The size and abundance of the enriched PCR product were checked on an Agilent High Sensitivity DNA 2100 chip, and samples meeting the DNA quantity and quality requirements were used in the next step.
4. Other methods for amplifying target fragments
Samples whose starting DNA amount could not meet the capture requirement were enriched by template amplification. The specific strategies were: a. the DNA was sheared to the target fragment length, an adapter was ligated to one end, and amplification was performed with multiple primers covering the HPV genome sequence together with a universal primer for the single-end adapter; the amplification product was purified and used for library construction in the next step; b. primers were designed for the high-frequency integration hotspot positions reported in databases so that the breakpoint sequence falls within the amplified region; the amplification product was purified and quality-checked, and products meeting the requirements proceeded to library construction.
5. Sequencing library construction
5.1 DNA end repair, A-tailing, and purification
The reaction system is shown in Table 1. The prepared reaction mixture was mixed by shaking; reaction conditions: 20 °C, 30 min; 65 °C, 30 min; hold at 4 °C. After the reaction, the product was purified with 60 μl (1×) AMPure XP magnetic beads, and the recovered DNA was dissolved in 25 μl nuclease-free water.
Table 1: End-repair reaction system
5.2 Barcode ligation
The barcode ligation reaction system is shown in Table 2. The prepared reaction mixture was mixed by shaking and incubated at 25 °C for 15 min. The product was purified with 50 μl (1×) AMPure XP magnetic beads, and the recovered DNA was dissolved in 26 μl nuclease-free water.
Table 2: Barcode ligation reaction system
5.3 Adapter ligation
The adapter ligation reaction system is shown in Table 3. The prepared reaction mixture was mixed by shaking and incubated at room temperature for 10 min. The product was purified with 40 μl (0.4×) AMPure XP magnetic beads and then washed twice with 140 μl adapter bead binding buffer (ABB); the recovered DNA was dissolved in 15 μl elution buffer (ELB). The product was quantified with Qubit.
Table 3: Adapter ligation reaction system
6. Sequencing run
Sequencing was performed on a nanopore sequencer (Oxford Nanopore Technologies MinION).
6.1 Preparing the chip priming solution
The chip priming solution is shown in Table 4. A sequencing chip was prepared and its priming port opened; air bubbles at the inner wall were removed first, then 800 μl of the mixture in Table 4 was added and left to stand for 5 min.
Table 4: Chip priming solution
6.2 Preparing and loading the pre-sequencing solution
The loading system is shown in Table 5. With the chip priming port kept open, the sample port was opened; 200 μl of chip priming solution was first injected slowly into the priming port, and then 75 μl of pre-sequencing solution was added dropwise to the sample port. Finally, the sample port and the priming port were closed and sequencing was started, so that the average sequencing depth of the target region reached 10000× or more.
Table 5: Loading system
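The depth target in step 6.2 can be checked once reads have been aligned. The following is a minimal sketch, assuming that average depth is taken as the total number of aligned bases divided by the length of the target region; the function name and the numbers in the usage note are hypothetical and do not come from the patent.

```python
def mean_target_depth(aligned_lengths, target_length):
    """Average depth = total aligned bases / target region length.

    aligned_lengths: iterable of aligned base counts, one per read
                     (hypothetical input, e.g. derived from alignment output).
    target_length:   length of the capture target in bases.
    """
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    return sum(aligned_lengths) / target_length


def meets_depth_target(aligned_lengths, target_length, min_depth=10000):
    """True when the average depth reaches the 10000x goal stated in the text."""
    return mean_target_depth(aligned_lengths, target_length) >= min_depth
```

For instance, 20,000 reads that each contribute 4,000 aligned bases over a hypothetical 8,000-base target give an average depth of 10000×.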
III. Data analysis
7.1 Data quality control
Quality control of the raw FASTQ files from the sequencer followed the EPI2ME Agent Barcoding analysis workflow of Oxford Nanopore Technologies, with a minimum qscore of 7.
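The qscore threshold can be illustrated with a small filter. This is a sketch only and is not the EPI2ME workflow itself: it assumes Phred+33 quality strings and assumes the read-level qscore is obtained by averaging per-base error probabilities and converting back to the Phred scale, a convention commonly used for nanopore reads. The function names are hypothetical.

```python
import math


def mean_qscore(quality_string, phred_offset=33):
    """Read-level quality: average the per-base error probabilities,
    then convert the mean probability back to a Phred score."""
    if not quality_string:
        return 0.0
    error_probs = [10 ** (-(ord(ch) - phred_offset) / 10) for ch in quality_string]
    return -10 * math.log10(sum(error_probs) / len(error_probs))


def filter_reads(records, min_qscore=7.0):
    """Keep (name, sequence, quality) tuples whose mean qscore is >= min_qscore."""
    return [rec for rec in records if mean_qscore(rec[2]) >= min_qscore]
```

Averaging error probabilities rather than raw Phred values keeps a few very low-quality bases from being hidden by many high-quality ones.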
7.2 Sequence alignment
The clean reads were aligned against the HPV and human genome sequences using the Blat analysis workflow, yielding results that contain alignment quality scores, aligned-sequence position information, and so on. The alignment parameters were blat -stepSize=5 -repMatch=2253 -minScore=20 -minIdentity=0 -noHead refence_database input.fa output.psl.
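With -noHead, the BLAT output is a headerless PSL file of 21 tab-separated columns per alignment. The sketch below reads the columns needed by the later steps; the class and function names are hypothetical, and only a subset of the columns is kept.

```python
from typing import NamedTuple


class PslHit(NamedTuple):
    """Subset of the 21 PSL columns used for integration-site analysis."""
    matches: int
    mismatches: int
    rep_matches: int
    q_num_insert: int
    t_num_insert: int
    strand: str
    q_name: str
    q_size: int
    q_start: int
    q_end: int
    t_name: str
    t_size: int
    t_start: int
    t_end: int


def parse_psl_line(line):
    """Parse one headerless PSL line (tab-separated, 21 columns)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 21:
        raise ValueError(f"expected 21 PSL columns, got {len(fields)}")
    return PslHit(
        matches=int(fields[0]), mismatches=int(fields[1]), rep_matches=int(fields[2]),
        q_num_insert=int(fields[4]), t_num_insert=int(fields[6]),
        strand=fields[8], q_name=fields[9], q_size=int(fields[10]),
        q_start=int(fields[11]), q_end=int(fields[12]),
        t_name=fields[13], t_size=int(fields[14]),
        t_start=int(fields[15]), t_end=int(fields[16]),
    )


def read_psl(path):
    """Read every non-empty line of a headerless PSL file."""
    with open(path) as handle:
        return [parse_psl_line(line) for line in handle if line.strip()]
```

Running the same parser on the HPV alignment output and on the human alignment output gives two hit lists that can be joined on q_name, the read identifier.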
7.3 Output of score and identity results
The score and identity of each alignment were calculated according to the UCSC formula, the best HPV alignment and the best human alignment were selected, and the two results were combined into one output list.
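The text does not reproduce the UCSC formula. The sketch below follows the score and percent-identity calculation that UCSC documents for nucleotide PSL alignments (the pslScore and pslCalcMilliBad routines), written from memory of that documentation and so best treated as an approximation to check against the UCSC source. The is_mrna switch and its default are assumptions about how the formula would be applied here.

```python
import math


def psl_score(matches, mismatches, rep_matches, q_num_insert, t_num_insert):
    """UCSC-style score for a nucleotide alignment (size multiplier of 1)."""
    return matches + (rep_matches >> 1) - mismatches - q_num_insert - t_num_insert


def psl_percent_identity(matches, mismatches, rep_matches,
                         q_num_insert, t_num_insert,
                         q_start, q_end, t_start, t_end, is_mrna=True):
    """UCSC-style percent identity: 100 - milliBad * 0.1."""
    q_ali_size = q_end - q_start
    t_ali_size = t_end - t_start
    if min(q_ali_size, t_ali_size) <= 0:
        return 100.0
    size_dif = q_ali_size - t_ali_size
    if size_dif < 0:
        size_dif = 0 if is_mrna else -size_dif
    insert_factor = q_num_insert + (0 if is_mrna else t_num_insert)
    total = matches + rep_matches + mismatches
    milli_bad = 0
    if total != 0:
        # floor(x + 0.5) mirrors C round() for the non-negative values used here
        size_penalty = int(math.floor(3 * math.log(1 + size_dif) + 0.5))
        milli_bad = (1000 * (mismatches + insert_factor + size_penalty)) // total
    return 100.0 - milli_bad * 0.1
```

Selecting the best HPV hit and the best human hit for a read then reduces to taking the maximum of psl_score over that read's hits against each reference.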
7.4 Determination of HPV integration sequence positions
Reads in the list that aligned to both the HPV sequence and the human genome sequence were retained, and the positions of the HPV and human alignments within each read were compared. When the two aligned segments were separated by, or overlapped by, no more than 10 bases in the read, the read was considered an HPV integration sequence. Identical breakpoints were merged and the number of breakpoint sequences was counted; a breakpoint detected in 5 or more sequences was judged a correct HPV integration site.
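The junction test and the breakpoint count in step 7.4 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: each read is assumed to carry one best HPV hit and one best human hit; read coordinates are taken as forward-strand PSL qStart/qEnd values; for a minus-strand hit the reference coordinate at the junction is taken from the opposite end of the hit; and breakpoints are merged only when their coordinates are identical. All names are hypothetical.

```python
from collections import Counter
from typing import NamedTuple


class Hit(NamedTuple):
    q_start: int   # start of the aligned segment within the read
    q_end: int     # end of the aligned segment within the read
    t_name: str    # reference name (HPV type or chromosome)
    t_start: int
    t_end: int
    strand: str    # "+" or "-"


def junction_gap(hpv, host):
    """Positive value = gap between the two segments in the read; negative = overlap."""
    left, right = sorted((hpv, host), key=lambda h: h.q_start)
    return right.q_start - left.q_end


def is_integration_read(hpv, host, max_distance=10):
    """True when the segments are separated or overlap by at most max_distance bases."""
    return abs(junction_gap(hpv, host)) <= max_distance


def _junction_coordinate(hit, is_left):
    """Reference coordinate of the hit at the junction side of the read."""
    if is_left:
        return hit.t_end if hit.strand == "+" else hit.t_start
    return hit.t_start if hit.strand == "+" else hit.t_end


def breakpoint_key(hpv, host):
    """(HPV name, HPV position, host name, host position) for one read."""
    hpv_is_left = hpv.q_start <= host.q_start
    return (hpv.t_name, _junction_coordinate(hpv, hpv_is_left),
            host.t_name, _junction_coordinate(host, not hpv_is_left))


def call_integration_sites(read_hits, max_distance=10, min_reads=5):
    """read_hits maps read name -> (hpv_hit, host_hit); returns breakpoint -> read count
    for breakpoints supported by at least min_reads integration reads."""
    counts = Counter(
        breakpoint_key(hpv, host)
        for hpv, host in read_hits.values()
        if is_integration_read(hpv, host, max_distance)
    )
    return {bp: n for bp, n in counts.items() if n >= min_reads}
```

Because nanopore reads carry a substantial error rate, a practical pipeline might also merge breakpoints that differ by a few bases; the text specifies only the merging of identical breakpoints, so the sketch does the same.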
7.5 Functional annotation
The breakpoint positions on the HPV and human genomes in these sequences were functionally annotated on the genome with the annovar software.
8. Result verification
The detection rate of the invention was measured on known integration sites, and the detected sites were verified by second-generation sequencing and by Sanger sequencing to calculate the verification rate.
IV. Experimental results
With the method of the invention, 17 of 19 known integration sites were detected, a detection rate of 89.5%. The method also detected 39 other, previously unknown integration sites. Of these, 15 were verified by second-generation sequencing, and another 13 sites were sampled for Sanger verification; all 13 were confirmed as correct integration sites, a verification rate of 100% for the sampled sites. The primer information for the 13 integration sites is shown in Fig. 1, the PCR amplification results in Fig. 2, and the Sanger sequencing breakpoint results in Figs. 3 and 4. The 17 known breakpoints that were detected are shown in Fig. 5. In summary, applying the analysis workflow to one sample known to be positive for HPV integration found 59 integration sites in total, of which 45 were verified by other methods and the remaining 14 were not verified, for an overall site verification rate of 76.3%. A Venn diagram comparing the numbers of HPV breakpoints identified by the different methods is shown in Fig. 6.
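The reported percentages can be reproduced from the counts given above. The short check below uses only numbers stated in the text; the helper name is hypothetical.

```python
def percent(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


detection_rate = percent(17, 19)      # known sites detected / known sites
verified_sites = 17 + 15 + 13         # known + NGS-verified + Sanger-verified
verification_rate = percent(verified_sites, 59)
```

Here detection_rate evaluates to 89.5 and verification_rate to 76.3, matching the figures in the text, with verified_sites equal to the 45 verified sites reported.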
Although the invention has been described with reference to exemplary embodiments, it should be understood that the invention is not limited to the disclosed exemplary embodiments. Various adjustments or changes may be made to the exemplary embodiments described herein without departing from the scope or spirit of the invention. The scope of the claims should be accorded the broadest interpretation so as to cover all modifications and equivalent structures and functions.

Claims (10)

1. A method for detecting an integration site of HPV in a host genome, characterized by comprising the following steps:
(1) extracting host genomic DNA from a biological sample from a subject, and fragmenting the host genomic DNA into fragments with a main peak of 800 bp-4 kb;
(2) performing a first detection on the fragments obtained in step (1) using a probe set to obtain a target sequence, wherein the probe set consists of a plurality of probes, each probe comprises a sequence capable of selectively hybridizing to a specific region of the HPV genome, and the target sequence comprises a sequence from an HPV gene and a sequence from the host genome;
(3) pretreating the target sequence obtained in step (2) to obtain a pretreated target sequence, so as to improve detection sensitivity; and
(4) performing a second detection on the target sequence by electrical-signal-based sequencing without PCR amplification, to confirm the integration site of HPV in the host genome.
2. The method for detecting an integration site of HPV in a host genome according to claim 1, characterized in that the biological sample is cervical exfoliated cells or cervical cancer tissue.
3. The method for detecting an integration site of HPV in a host genome according to claim 1, characterized in that each probe in the probe set is designed against contiguous regions of the HPV whole genome.
4. The method for detecting an integration site of HPV in a host genome according to claim 1, characterized in that the pretreatment in step (3) comprises enriching the target sequence.
5. The method for detecting an integration site of HPV in a host genome according to claim 1, characterized in that the pretreatment in step (3) comprises adding a first adapter to both ends of the target sequence and then performing PCR amplification using primers that selectively hybridize to the first adapter.
6. The method for detecting an integration site of HPV in a host genome according to claim 1, characterized in that the second detection in step (4) comprises adding an A to the ends of the pretreated target sequence, then ligating a barcode and a second adapter in sequence, followed by nanopore sequencing.
7. The method for detecting an integration site of HPV in a host genome according to claim 6, characterized in that the depth of the nanopore sequencing is 5000× to 10000×.
8. A method for detecting an integration site of HPV in a host genome, characterized by comprising the following steps:
(1) extracting host genomic DNA from a biological sample from a subject, and fragmenting the host genomic DNA into fragments with a main peak of 800 bp-4 kb;
(2) ligating a third adapter to one end of the fragments, and amplifying with a primer set consisting of a plurality of specific primers together with a universal primer that selectively hybridizes to the third adapter to obtain a target sequence, wherein the specific primers are each designed to selectively hybridize to contiguous sequences in the HPV genome, so that the primer set covers the entire HPV genome sequence; and
(3) detecting the target sequence by electrical-signal-based sequencing without PCR amplification, to confirm the integration site of HPV in the host genome.
9. A method for detecting an integration site of HPV in a host genome, characterized by comprising the following steps:
(1) extracting host genomic DNA from a biological sample from a subject, and fragmenting the host genomic DNA into fragments with a main peak of 800 bp-4 kb;
(2) ligating a third adapter to one end of the fragments, and amplifying with a primer set consisting of a plurality of specific primers together with a universal primer that selectively hybridizes to the third adapter to obtain a target sequence, wherein the specific primers are each designed to selectively hybridize to a gene sequence on one side of a known HPV integration site; and
(3) detecting the target sequence by electrical-signal-based sequencing without PCR amplification, to confirm the integration site of HPV in the host genome.
10. The method for detecting an integration site of HPV in a host genome according to any one of claims 1-9, characterized in that confirming the integration site of HPV in the host genome further comprises a data analysis step comprising:
performing quality control on the raw sequencing data and removing noise data to obtain the data to be analyzed;
aligning the data to be analyzed against the HPV genome sequence and the host genome sequence separately to obtain results containing alignment quality scores and aligned-sequence position information;
calculating the score and identity of each alignment according to the UCSC formula, selecting the best HPV alignment and the best host alignment, and combining the two results into one output list;
retaining the sequencing reads in the list that align to both the HPV genome sequence and the host genome sequence and comparing the positions of the two alignments, a read being judged an HPV integration sequence when the two aligned segments are separated by, or overlap by, 10 bases in the read;
merging identical breakpoints and counting the number of breakpoint sequences, a breakpoint being judged a correct HPV integration site when 5 or more sequences are detected with that same breakpoint.
CN201910840742.8A 2019-09-06 2019-09-06 Method for detecting integration site of HPV in host genome Active CN110527714B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910840742.8A CN110527714B (en) 2019-09-06 2019-09-06 Method for detecting integration site of HPV in host genome

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910840742.8A CN110527714B (en) 2019-09-06 2019-09-06 Method for detecting integration site of HPV in host genome

Publications (2)

Publication Number Publication Date
CN110527714A true CN110527714A (en) 2019-12-03
CN110527714B CN110527714B (en) 2023-03-28

Family

ID=68667418

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910840742.8A Active CN110527714B (en) 2019-09-06 2019-09-06 Method for detecting integration site of HPV in host genome

Country Status (1)

Country Link
CN (1) CN110527714B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110957008A (en) * 2020-02-26 2020-04-03 广州市金域转化医学研究院有限公司 Method and device for detecting human genome virus integration site
CN111020019A (en) * 2020-03-06 2020-04-17 元码基因科技(北京)股份有限公司 Method for gene fusion detection based on nanopore technology
CN113096735A (en) * 2021-03-01 2021-07-09 重庆医科大学 System and method for analyzing HBV DNA integration event from in vitro serum

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103320522A (en) * 2013-07-16 2013-09-25 深圳华大基因研究院 Method and system for determining HPV integration site in genome from human cervical carcinoma sample
CN110093454A (en) * 2019-04-24 2019-08-06 南京格致医学检验有限公司 The amplification sequencing approach analyzed for a variety of HPV Classification Identifications and genome conformity

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YING LIU ET AL: "Genome-wide profiling of the human papillomavirus DNA integration in cervical intraepithelial neoplasia and normal cervical epithelium by HPV capture technology", 《SCIENTIFIC REPORTS》 *
陈丹等: "基因测序技术及其临床应用", 《中华临床实验室管理电子杂志》 *

Also Published As

Publication number Publication date
CN110527714B (en) 2023-03-28

Similar Documents

Publication Publication Date Title
CN112094915B (en) Sarcoma fusion gene and/or mutation joint detection primer group and kit
CN107541791A (en) Construction method, kit and the application in plasma DNA DNA methylation assay library
CN110527714A (en) For detecting HPV in the method for the integration site of host genome
CN102965428A (en) Kit for testing and identifying genetic cardiac hypertrophy related gene mutation
EP4023795A1 (en) Method for detecting mutation and methylation of tumor specific gene in ctdna
CN111235316A (en) Primer probe for identifying novel coronavirus and application of primer probe in triple fluorescence RPA
CN105567681B (en) A kind of method and label connector based on the noninvasive biopsy virus of high-throughput gene sequencing
WO2019076018A1 (en) Method for constructing amplicon library for detecting low-frequency mutation of target gene
CN104789672B (en) A kind of bar code magnetic bead liquid-phase chip detection kit of thalassemia gene
CN114317762B (en) Three-marker composition for detecting early liver cancer and kit thereof
CN108085395A (en) Primer sets, kit and the method for cervical carcinoma polygenes DNA methylation assay based on high-flux sequence
CN112322736A (en) Reagent combination for detecting liver cancer, kit and application thereof
CN107475403A (en) The analysis method of the method for detection Circulating tumor DNA, kit and its sequencing result from peripheral blood dissociative DNA
CN111073961A (en) High-throughput detection method for gene rare mutation
CN112280865B (en) Reagent combination for detecting liver cancer, kit and application thereof
CN110453012B (en) Universal primer, probe and detection method for detecting 24 genotypes of African swine fever virus by using RAA fluorescence method
CN107345253A (en) Lung cancer clinical medication genetic test standard items and its application
CN112501293A (en) Reagent combination for detecting liver cancer, kit and application thereof
CN107988372A (en) A kind of kit and its detection method for detecting susceptibility gene of colorectal cancer mutation
CN112094916B (en) Plasma free DNA lung cancer gene joint detection kit
CN111349719A (en) Specific primer for detecting novel coronavirus and rapid detection method
CN115786459B (en) Method for detecting tiny residual disease of solid tumor by high-throughput sequencing
CN109609635A (en) The probe library that polygenes is enriched with and the detection method with the treatment-related multiple genes of kinds of tumors
CN108456726A (en) Spinal muscular atrophy genetic test probe, primer and kit
CN109234357A (en) It is a kind of for detecting whether target gene occurs the method for fusion mutation, primer combination, kit and its application

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant