CN106845154A - Device for detecting copy number variation in FFPE samples - Google Patents

Device for detecting copy number variation in FFPE samples

Info

Publication number
CN106845154A
Authority
CN
China
Prior art keywords
module
window
sample
sequencing
ffpe samples
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710067086.3A
Other languages
Chinese (zh)
Other versions
CN106845154B (en)
Inventor
荆瑞琳
张萌萌
董永芳
王旺
李雪峰
玄兆伶
李大为
梁峻彬
陈重建
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Annoroad Genetic Technology (Beijing) Co., Ltd.
Annuo uni-data (Yiwu) Medical Inspection Co. Ltd.
Zhejiang Annuo uni-data Biotechnology Co. Ltd.
Original Assignee
ANNOROAD GENETIC TECHNOLOGY (BEIJING) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ANNOROAD GENETIC TECHNOLOGY (BEIJING) Co Ltd
Publication of CN106845154A
Application granted
Publication of CN106845154B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids

Abstract

The invention relates to a device for detecting copy number variation in FFPE samples that has high detection sensitivity. The device comprises a sequencing data acquisition module, a sequence alignment module, a preliminary data processing module, a normalization module, a background library screening module, a data fluctuation elimination module, a GC correction module, and an output module.

Description

Device for detecting copy number variation in FFPE samples
Technical field
The invention belongs to the field of molecular biological detection, and specifically relates to a device and a method for detecting copy number variation in FFPE samples.
Background art
Tissue specimens prepared by the formalin-fixed, paraffin-embedded (FFPE) method are called formalin-fixed paraffin-embedded tissue samples, or FFPE samples for short. FFPE samples can be stored for long periods; in particular, a large number of tumor tissue sections are preserved in FFPE form. FFPE samples are commonly used in clinical pathology, oncogene testing, and medical research, and provide a valuable resource for elucidating disease mechanisms, finding therapeutic targets, and indicating prognosis.
Copy number variation (CNV) of genes is a clinically important class of structural variation that is associated with the prognosis of many tumors and with sensitivity to targeted drugs. Reliable CNV detection results provide an important basis for clinical medication and disease assessment. The CNV detection techniques currently used in the clinic are mostly experimental methods based on PCR or immunohistochemistry (such as FISH and IHC). A single test with such methods covers only one gene, and the sensitivity of the results is relatively low.
CNV detection on next-generation sequencing (NGS) platforms can report CNV results for multiple genes in a single run while maintaining detection performance. Traditional NGS-based CNV detection techniques were mostly developed on whole-genome sequencing platforms. As NGS technology has advanced, high-depth sequencing based on target region capture has gradually shown advantages in clinical testing.
However, because whole-genome sequencing data and target-region capture sequencing data differ fundamentally, the traditional NGS-based CNV detection methods are not suited to capture sequencing data: their accuracy is hard to guarantee and their sensitivity needs improvement. The problem is particularly evident in FFPE samples. DNA in FFPE samples is more severely fragmented, which affects target gene capture and NGS sequencing and ultimately degrades key technical indicators such as the effective depth of the target region. Making use of the low-depth sequencing data produced by low-quality FFPE samples has therefore become a major technical challenge.
Summary of the invention
In view of the above deficiencies of the prior art, an object of the invention is to provide a detection device and a detection method with higher sensitivity for CNV detection in FFPE samples.
The inventors studied the above technical problem intensively and found that, in CNV detection for FFPE samples, whether the data are denoised appropriately and whether a suitable background library is used directly affect the result, and that this effect is especially pronounced in capture sequencing. More thorough and appropriate noise reduction, together with a dynamic background library, improves the sensitivity of CNV detection in FFPE samples; the invention was completed on this basis.
That is, the invention includes:
A device for detecting copy number variation in FFPE samples (a copy number variation may occur in a gene region or in a non-gene region), comprising:
A sequencing data acquisition module, for obtaining capture sequencing data from the FFPE sample to be tested and sequencing data from healthy population samples, the healthy population samples being samples from multiple healthy (normal) individuals;
A sequence alignment module, connected to the sequencing data acquisition module, for aligning the sequencing data obtained by the acquisition module against a reference genome sequence to obtain an alignment result (including, for example, the chromosome and coordinates of every short sequence that aligns to the reference genome and how the short sequence matches the reference), and for calculating from the alignment result the depth value of each site (that is, each site on the reference genome; in capture sequencing some sites may have a depth value of 0);
A preliminary data processing module, connected to the sequence alignment module, for dividing the target region (100k~100M; the whole genome or a region of interest) into overlapping windows (overlap 10 to 70%) of a given length (50 to 1000 bp), removing the extreme depth values (maxima and minima) of the sites in each window and calculating the mean or median depth, and calculating the GC content of the reference genome sequence in each window;
A normalization module, connected to the preliminary data processing module, for normalizing the mean or median depth in each window obtained by that module and calculating the Z value in each window for the FFPE sample to be tested and for the healthy population samples;
A background library screening module, connected to the normalization module, for selecting n healthy samples (each healthy sample corresponding to one healthy individual) according to the Z values of the FFPE sample to be tested and of the healthy population samples, to obtain a background library sample set of n healthy samples, and then building an m-row, n-column matrix $X_{m \times n}$ from the Z values of the n healthy samples in the m windows;
A data fluctuation elimination module, connected to the background library screening module, for eliminating the inherent data fluctuations introduced by capture sequencing;
A GC correction module, connected to the data fluctuation elimination module, for performing GC correction according to the GC content in each window;
An output module, connected to the GC correction module, for outputting the CNV detection result (including, for example, a figure showing the CNV detection result and a negative/positive call for the CNV).
The sequencing data acquisition module of the device obtains sequencing data produced by sequencing the DNA in the FFPE sample to be tested with a second-generation sequencing method. Mainstream second-generation platforms generally use sequencing by synthesis (SBS). Before sequencing, a sequencing library must be constructed from the nucleic acid (DNA or RNA) sample. The basic procedure is as follows: the ends of the fragmented DNA are repaired, an "A" base is added to the 3' end of each repaired fragment, the fragments are ligated to DNA adapters containing the sequencing primer binding site, and the library is finally amplified by PCR. The specific second-generation sequencing method is not particularly limited; any such method known to those skilled in the art may be used.
Preferably, the sequencing data are obtained by a capture sequencing method.
The target genes of the capture sequencing can differ according to the target disease. The target disease can be, for example, a solid cancer (such as gastric cancer, breast cancer, colorectal cancer, or lung cancer).
Specifically, for example, when the target disease is breast cancer, the target genes can be the EGFR, ERBB2, FGFR1, KIT, PIK3CA and/or PTEN genes; when the target disease is colorectal cancer, the target genes can be, for example, the EGFR, ERBB2, FGFR2, KRAS, MET, and PTEN genes; when the target disease is gastric cancer, the target genes can be the EGFR, ERBB2, FGFR1, FGFR2, KRAS, MET, PIK3CA and/or PTEN genes; when the target disease is lung cancer, the target genes can be the ALK, BRAF, EGFR, ERBB2, FGFR1, KRAS, MET, PIK3CA and/or PTEN genes.
Preferably, the preliminary data processing module divides the windows using a sliding window method.
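The windowing described for the preliminary data processing module can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the half-open coordinate convention, and the choice to trim one value at each end are assumptions made for the example.

```python
from statistics import median


def sliding_windows(start, end, size, overlap):
    """Yield half-open (w_start, w_end) windows of `size` bp.

    `overlap` is the fraction of a window shared with its neighbour
    (the text gives 10 to 70%). Coordinates are an assumed convention.
    """
    step = max(1, int(round(size * (1.0 - overlap))))
    pos = start
    while pos < end:
        yield pos, min(pos + size, end)
        if pos + size >= end:
            break
        pos += step


def trimmed_median_depth(depths, n_trim=1):
    """Median depth after dropping the n_trim smallest and largest values.

    n_trim=1 is a hypothetical setting; the text only says that the
    extreme values (maxima and minima) are removed.
    """
    ordered = sorted(depths)
    if len(ordered) > 2 * n_trim:
        ordered = ordered[n_trim:len(ordered) - n_trim]
    return median(ordered)


def gc_content(seq):
    """Fraction of G/C among the called (A/C/G/T) bases of a reference sequence."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    return (seq.count("G") + seq.count("C")) / called if called else 0.0
```

With a 100 bp window and 50% overlap, a 250 bp region is covered by four windows that each start 50 bp after the previous one.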
Preferably, the normalization module calculates the Z value in each window of the sample to be tested according to formula (1) below, where $Z_i$ denotes the Z value of the i-th window:
$$Z_i = \operatorname{trimScale}(Z_i, Z_i) \qquad (1)$$
Preferably, formula (2) is defined as:
Definition
Wherein, chr represents chromosome, and St represents biological specimen to be checked, SNRepresent healthy population sample;
The context vault screening module is filtered out so that the d according to FFPE samples to be checked and the Z values of healthy population sample It is worth n minimum Healthy People sample, the context vault sample set S after being screened1,S2,S3,…,Sn(N and n are natural number and n < N).
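The screening step can be illustrated with a short sketch. Formula (2) for d is not legible in this text, so the example assumes a plain Euclidean distance between the Z-value vectors of the test sample and each healthy sample; the patent's actual d may differ (for instance, it may be computed per chromosome). All function names are hypothetical.

```python
import math


def select_background(test_z, healthy_z, n):
    """Return the ids of the n healthy samples closest to the test sample.

    test_z:    list of Z values for the sample to be tested (one per window)
    healthy_z: dict mapping sample id -> list of Z values (same windows)
    The Euclidean distance below is an ASSUMED stand-in for formula (2).
    """
    if not 0 < n < len(healthy_z):
        raise ValueError("n must satisfy 0 < n < N")

    def distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    ranked = sorted(healthy_z, key=lambda sid: distance(test_z, healthy_z[sid]))
    return ranked[:n]


def build_matrix(healthy_z, selected):
    """Build the m x n matrix X: rows are windows, columns are selected samples."""
    m = len(healthy_z[selected[0]])
    return [[healthy_z[sid][i] for sid in selected] for i in range(m)]
```

Because the selection depends on the test sample, each test sample gets its own background library, which matches the "dynamic background library" idea in the summary.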
Preferably, the data fluctuation elimination module performs singular value decomposition on the background library matrix $X_{m \times n}$ to obtain an m-row, r-column factor matrix $U_{m \times r}$, where r is the number of factors, then takes the k factors with the largest contribution rates (that is, the top k factors; k is generally 4 to 10) and performs LOESS regression to obtain the residual $Z_p$.
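A rough sketch of the factor-removal idea follows, assuming NumPy is available. It takes the k leading left-singular vectors of the background matrix as the factors and removes their contribution from the test sample. Note the substitution: an ordinary least-squares fit is used here in place of the LOESS regression named in the text, purely to keep the example short. The function name is hypothetical.

```python
import numpy as np


def remove_background_factors(X, z, k):
    """Return the residual of z after removing the k leading SVD factors of X.

    X: (m, n) background matrix of Z values (m windows, n healthy samples)
    z: (m,) Z values of the sample to be tested
    ASSUMPTION: least squares replaces the LOESS regression of the text.
    """
    X = np.asarray(X, dtype=float)
    z = np.asarray(z, dtype=float)
    # U has shape (m, r) with r = min(m, n); columns are ordered by singular value.
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    if not 1 <= k <= U.shape[1]:
        raise ValueError("k must be between 1 and the number of factors r")
    factors = U[:, :k]
    coef, *_ = np.linalg.lstsq(factors, z, rcond=None)
    return z - factors @ coef
```

The intent is that systematic patterns shared by the healthy samples are subtracted from the test sample, so that what remains is closer to sample-specific signal.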
Preferably, the GC correction module performs GC correction on $Z_p$ by LOESS regression against the GC content in each window to obtain the residual $Z_{pg}$.
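The GC correction can be sketched with a small hand-written LOESS (locally weighted linear regression with a tricube kernel). This is a simplified illustration without the robustness iterations found in full LOESS implementations; the span `frac=0.5` and the function names are assumptions, not values from the patent.

```python
import math


def loess_fit(x, y, frac=0.5):
    """Fit a local weighted line around each x value and return the fitted y."""
    n = len(x)
    span = max(2, int(math.ceil(frac * n)))  # number of neighbours per local fit
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[span - 1] or 1e-12  # bandwidth: distance to the span-th neighbour
        # Tricube weights; points at or beyond the bandwidth get weight 0.
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def gc_correct(z_p, gc, frac=0.5):
    """Return Z_pg: the residual of Z_p after removing the LOESS trend over GC."""
    trend = loess_fit(gc, z_p, frac)
    return [z - t for z, t in zip(z_p, trend)]
```

If the Z values depend on GC content alone, the residuals are close to zero; whatever remains after correction is the signal passed on to the output module.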
Preferably, the device for detecting copy number variation in FFPE samples further comprises:
A data quality control module, connected to the sequencing data acquisition module and the sequence alignment module, for quality checking the sequencing data obtained by the sequencing data acquisition module. Quality control includes, but is not limited to, removing low-quality short sequences, removing short sequences with high N content, removing adapter-related short sequences, and finally compiling the relevant quality control metrics.
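The three filters named for the quality control module can be sketched as below. The thresholds (`min_mean_q`, `max_n_frac`) and the read representation are hypothetical; the patent does not state cut-off values. The adapter string is passed in by the caller.

```python
def passes_qc(seq, quals, adapter, min_mean_q=20, max_n_frac=0.1):
    """Return True if a read survives the three filters.

    seq:     read bases as a string
    quals:   per-base Phred quality scores (list of ints)
    adapter: adapter sequence to screen for
    min_mean_q and max_n_frac are ASSUMED thresholds for illustration.
    """
    if not seq or len(seq) != len(quals):
        return False
    if sum(quals) / len(quals) < min_mean_q:      # low mean quality
        return False
    bases = seq.upper()
    if bases.count("N") / len(bases) > max_n_frac:  # too many uncalled bases
        return False
    if adapter and adapter.upper() in bases:       # adapter contamination
        return False
    return True


def filter_reads(reads, adapter):
    """Split reads into kept reads and a small dict of QC counts."""
    kept = [r for r in reads if passes_qc(r[0], r[1], adapter)]
    return kept, {"total": len(reads), "kept": len(kept)}
```

Here each read is a `(sequence, qualities)` pair; a real pipeline would read these from FASTQ records.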
In addition, the invention includes:
A method for detecting copy number variation in FFPE samples (a copy number variation may occur in a gene region or in a non-gene region), comprising:
A sequencing data acquisition step: obtaining capture sequencing data from the FFPE sample to be tested and sequencing data from healthy population samples, the healthy population samples being samples from multiple healthy individuals;
A sequence alignment step: aligning the sequencing data obtained in the acquisition step against a reference genome sequence to obtain an alignment result (including, for example, the chromosome and coordinates of every short sequence that aligns to the reference genome and how the short sequence matches the reference), and calculating from the alignment result the depth value of each site (that is, each site on the reference genome; in capture sequencing some sites may have a depth value of 0);
A preliminary data processing step: dividing the target region (100k~100M; the whole genome or a region of interest) into overlapping windows (overlap 10 to 70%) of a given length (50 to 1000 bp), removing the extreme depth values (maxima and minima) of the sites in each window and calculating the mean or median depth, and calculating the GC content of the reference genome sequence in each window;
A normalization step: normalizing the mean or median depth in each window obtained in the preliminary data processing step and calculating the Z value in each window for the FFPE sample to be tested and for the healthy population samples;
A background library screening step: selecting n healthy samples (each background library sample corresponding to one healthy individual) according to the Z values of the FFPE sample to be tested and of the healthy population samples, to obtain a background library sample set, and then building an m-row, n-column matrix $X_{m \times n}$ from the Z values of the n healthy samples in the m windows;
A data fluctuation elimination step: eliminating the inherent data fluctuations introduced by capture sequencing;
A GC correction step: performing GC correction according to the GC content in each window; and
An output step: outputting the CNV detection result (including, for example, a figure showing the CNV detection result and a negative/positive call for the CNV).
The sequencing data acquisition step of the method obtains sequencing data produced by sequencing the DNA in the FFPE sample to be tested with a second-generation sequencing method. Mainstream second-generation platforms generally use sequencing by synthesis (SBS). Before sequencing, a sequencing library must be constructed from the nucleic acid (DNA or RNA) sample. The basic procedure is as follows: the ends of the fragmented DNA are repaired, an "A" base is added to the 3' end of each repaired fragment, the fragments are ligated to DNA adapters containing the sequencing primer binding site, and the library is finally amplified by PCR. The specific second-generation sequencing method is not particularly limited; any such method known to those skilled in the art may be used.
Preferably, the sequencing data are obtained by a capture sequencing method.
The target genes of the capture sequencing can differ according to the target disease. The target disease can be, for example, a solid cancer (such as gastric cancer, breast cancer, colorectal cancer, or lung cancer).
Specifically, for example, when the target disease is breast cancer, the target genes can be the EGFR, ERBB2, FGFR1, KIT, PIK3CA and/or PTEN genes; when the target disease is colorectal cancer, the target genes can be, for example, the EGFR, ERBB2, FGFR2, KRAS, MET, and PTEN genes; when the target disease is gastric cancer, the target genes can be the EGFR, ERBB2, FGFR1, FGFR2, KRAS, MET, PIK3CA and/or PTEN genes; when the target disease is lung cancer, the target genes can be the ALK, BRAF, EGFR, ERBB2, FGFR1, KRAS, MET, PIK3CA and/or PTEN genes.
Preferably, the preliminary data processing step divides the windows using a sliding window method.
Preferably, the normalization step calculates the Z value in each window of the sample to be tested according to formula (1) below, where $Z_i$ denotes the Z value of the i-th window:
$$Z_i = \operatorname{trimScale}(Z_i, Z_i) \qquad (1)$$
Preferably, formula (2) is defined as:
Definition
where chr denotes a chromosome, $S_T$ denotes the FFPE sample to be tested, and $S_N$ denotes the healthy population samples;
According to the Z values of the FFPE sample to be tested and of the healthy population samples, the background library screening step selects the n healthy samples with the smallest d values, giving the screened background library sample set $S_1, S_2, S_3, \dots, S_n$ (N and n are natural numbers and n < N).
Preferably, the data fluctuation elimination step performs singular value decomposition on the background library matrix $X_{m \times n}$ to obtain an m-row, r-column factor matrix $U_{m \times r}$, where r is the number of factors, then takes the k factors with the largest contribution rates (that is, the top k factors; k is generally 4 to 10) and performs LOESS regression to obtain the residual $Z_p$.
Preferably, the GC correction step performs GC correction on $Z_p$ by LOESS regression against the GC content in each window to obtain the residual $Z_{pg}$.
Preferably, the copy number variation detection method further comprises:
A data quality control step: quality checking the sequencing data obtained in the sequencing data acquisition step. Quality control includes, but is not limited to, removing low-quality short sequences, removing short sequences with high N content, removing adapter-related short sequences, and finally compiling the relevant quality control metrics.
Preferred embodiments of each of the above steps are as described above.
According to the invention, a detection device and a detection method with higher sensitivity for CNV detection in FFPE samples are provided.
Brief description of the drawings
Fig. 1 is a schematic diagram of the device of the invention for detecting copy number variation in FFPE samples.
Fig. 2 shows the CNV detection results for multiple breast cancer genes in Example 1.
Detailed description of the invention
The scientific and technical terms in this specification have the meanings commonly understood by those skilled in the art; in case of conflict, the definitions in this specification prevail.
Definitions
Reference genome: the complete haploid sequence carried by a cell or organism, including the full set of genes and intervening sequences.
Alignment: generally sequence alignment, the process of arranging two or more sequences according to certain rules in order to determine their similarity and hence their homology.
Depth value: for a site on the genome, the number of short sequences that cover the site according to the alignment result.
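The depth definition above can be made concrete with a short sketch. It assumes alignments have already been reduced to `(chromosome, start, end)` tuples with 0-based, half-open coordinates; that representation and the function name are assumptions for the example, and real alignment records (for instance, gapped or clipped reads) need more care.

```python
def site_depths(alignments, chrom, region_start, region_end):
    """Return the depth at every site of [region_start, region_end) on `chrom`.

    alignments: iterable of (chrom, start, end), 0-based half-open intervals.
    Uses a difference array: +1 where a read starts, -1 where it ends,
    then a running sum gives the number of reads covering each site.
    """
    diff = [0] * (region_end - region_start + 1)
    for read_chrom, start, end in alignments:
        if read_chrom != chrom:
            continue
        start = max(start, region_start)
        end = min(end, region_end)
        if start >= end:
            continue  # read lies outside the region
        diff[start - region_start] += 1
        diff[end - region_start] -= 1
    depths, running = [], 0
    for delta in diff[:-1]:
        running += delta
        depths.append(running)
    return depths
```

Sites not covered by any read get depth 0, which is the situation the text notes for capture sequencing.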
Window (sliding window): generally a region of fixed length on the genome.
Background library: a sample library consisting of multiple (generally ≥ 20) healthy samples.
Capture sequencing: the process of capturing DNA fragments from specific regions (regions of interest) of the genome with pre-designed probes and then performing NGS sequencing on the captured fragments.
NGS (high-throughput sequencing): high-throughput sequencing, also called "next-generation" sequencing technology, is characterized by sequencing hundreds of thousands to millions of DNA molecules in parallel in a single run and by generally short read lengths.
Normalization (Z value):
trimScale(w, v): let w be a value to be normalized and v a data set.
a. Remove a given percentage of the data at the upper and lower ends of v to obtain $v'$.
b. Calculate the mean $\mu$ and standard deviation $\sigma$ of $v'$.
c. Calculate $(w - \mu)/\sigma$ as the final result.
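Steps a to c translate directly into a few lines of code. The sketch below is illustrative: the 5% trimming fraction is a hypothetical default (the text says only "a given percentage"), and the sample standard deviation is an assumption, since the text does not say whether the sample or population form is meant.

```python
from statistics import mean, stdev


def trim_scale(w, v, trim_fraction=0.05):
    """Z value of w against data set v after trimming both tails of v.

    trim_fraction is an ASSUMED default; stdev() is the sample standard
    deviation, which is also an assumption.
    """
    ordered = sorted(v)
    k = int(len(ordered) * trim_fraction)
    trimmed = ordered[k:len(ordered) - k] if k > 0 else ordered  # step a
    mu = mean(trimmed)                                            # step b
    sigma = stdev(trimmed)
    if sigma == 0:
        raise ValueError("trimmed data set has zero variance")
    return (w - mu) / sigma                                       # step c
```

Trimming keeps a single extreme window (for example, one with very high depth) from inflating the standard deviation and flattening every Z value.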
SVD (singular value decomposition): SVD is an important matrix factorization in linear algebra and generalizes the unitary diagonalization of normal matrices in matrix analysis. It has important applications in signal processing, statistics, and other fields. Its effect here is to map the data set into a low-dimensional space. The characteristic values of the data set (singular values in SVD) are ordered by importance; dimensionality reduction consists of discarding the unimportant singular vectors, and the space spanned by the remaining vectors is the reduced space.
Examples
The invention is described more specifically through the following examples. It should be understood that the examples described here are intended to explain the invention, not to limit it.
Example 1
The device of the invention for detecting copy number variation in FFPE samples was used to detect CNVs in a tissue FFPE sample from a female breast cancer patient.
1.1 Extracting DNA from the FFPE sample
DNA was extracted with the GeneRead DNA FFPE Kit (QIAGEN) according to the kit handbook to obtain the FFPE sample DNA.
1.2 Sample fragmentation
Fragmentation was performed on a Bioruptor instrument with 30 cycles of 30 s ON/30 s OFF, breaking the FFPE sample DNA into fragments of about 200 bp to obtain the fragmented DNA.
1.3 End repair (End Repair)
(1) Take the required reagents out of the kit stored at -20 °C in advance; the amounts for a single sample are given in Table 1.
Table 1
(2) End repair reaction: after adding the DNA sample, place the 1.5 mL centrifuge tube in a Thermomixer and incubate at 20 °C for 30 minutes. After the reaction, recover and purify the DNA in the reaction system with 1.8× nucleic acid purification magnetic beads and dissolve it in 32 μL EB.
1.4 Adding "A" to the ends (A-Tailing)
(1) Take the required reagents out of the kit stored at -20 °C in advance; the amounts for a single sample are given in Table 2:
Table 2
(2) A-tailing reaction: add the 32 μL of DNA purified and recovered in the previous step to a 1.5 mL centrifuge tube and incubate in a Thermomixer at 37 °C for 30 minutes. Recover and purify the DNA in the reaction system with 1.8× nucleic acid purification magnetic beads and dissolve it in 18 μL EB.
1.5 Adapter ligation (Adapter Ligation)
(1) Take the required reagents out of the kit stored at -20 °C in advance; the amounts for a single sample are given in Table 3:
Table 3
(2) Adapter ligation reaction: add the 18 μL of DNA purified and recovered in the previous step to the sample tube and incubate in a Thermomixer at 20 °C for 15 minutes. Recover and purify the DNA in the reaction system with 1.8× nucleic acid purification magnetic beads and dissolve it in 30 μL EB.
1.6 PCR reaction
(1) Take the required reagents out of the kit stored at -20 °C and prepare the PCR reaction system in a 2 mL PCR tube:
Table 4
(2) Set the PCR program. The program for the PCR reaction is as follows:
After the reaction, promptly remove the sample and store it in a 4 °C refrigerator, and exit or shut down the instrument as required.
(3) Recover and purify the DNA in the reaction system with 0.9× nucleic acid purification magnetic beads and dissolve the purified library in 20 μL ddH2O. Quantify the library with Qubit and submit it for analysis on the Agilent 2100.
1.7 Hybridization of the library with the breast cancer target region capture chip
(1) In this experiment, the buffer that provides the ionic environment for the hybrid capture reaction, and the wash and rinse solutions used to elute physically adsorbed or non-specifically hybridized material, were obtained commercially.
(2) Preparing the hybridization library: thaw the DNA library to be hybridized on ice and take a total mass of 1 μg (this DNA library is called the sample library in the subsequent steps).
(3) Preparing the Ann primer pool: mix 1000 pmol each of the index primer In1 corresponding to the sample library index (100 μM) and the universal primer (1000 μM) (this mixture is called the Ann primer pool in the subsequent steps).
(4) Preparing the hybridization sample: add 5 μL COT DNA (Human Cot-1 DNA, Life Technologies, 1 mg/mL), 1 μg of sample library, and the Ann primer pool to a 1.5 mL EP tube. Seal the prepared EP tube with sealing film and place the tube containing the sample library pool/COT DNA/Ann primer pool in a vacuum device until completely dry.
(5) Dissolving the hybridization sample: add the following to the dried sample library pool/COT DNA/Ann primer pool:
7.5 μL 2× hybridization buffer
3 μL hybridization component A
(6) Mix the above mixture thoroughly and denature it for 10 minutes on a heating block preheated to 95 °C.
(7) Transfer the mixture to a 0.2 mL flat-cap PCR tube containing 4.5 μL of capture chip. Vortex thoroughly for 3 seconds and place the hybridization mixture on a 47 °C heating block for 16 hours. The heated lid of the heating block must be set to 57 °C. After hybridization, the product is eluted and recovered.
(8) Dilute the 10× wash solutions (I, II and III), the 10× rinse solution, and the 2.5× magnetic bead wash solution to 1× working solutions.
Table 5
(9) Preheat the following reagents on a 47 °C heating block:
400 μL 1× rinse solution
100 μL 1× wash solution I
1.8 Preparing the affinity adsorption magnetic beads
(1) Equilibrate the streptavidin magnetic beads (Dynabeads M-280 Streptavidin, hereinafter "magnetic beads") at room temperature for 30 minutes, then vortex the beads thoroughly for 15 seconds to mix.
(2) Dispense 100 μL of magnetic beads into a 1.5 mL centrifuge tube and place the tube on a magnetic rack. After about 5 minutes, carefully aspirate and discard the supernatant, add 1× bead wash solution at twice the initial bead volume, and vortex for 10 seconds to mix. Return the tube to the magnetic rack to adsorb the beads. When the solution is clear, aspirate and discard the supernatant. Repeat this step for a total of two washes.
(3) After washing, aspirate and discard the bead wash solution, resuspend the beads by vortexing in one initial bead volume of 1× bead wash solution, and transfer them to a 0.2 mL PCR tube. Place the PCR tube on the magnetic rack, and after the beads are adsorbed and the solution clears, aspirate and discard the supernatant.
1.9 Binding of DNA to the affinity adsorption magnetic beads and rinsing
(1) Transfer the hybridized sample library to the 0.2 mL PCR tube containing the affinity adsorption magnetic beads and vortex to mix.
(2) Place the 0.2 mL PCR tube on a 47 °C heating block for 45 minutes, vortexing once every 15 minutes, so that the DNA binds to the beads.
(3) After the 45-minute incubation, add 100 μL of 1× wash solution I preheated to 47 °C to the 15 μL of captured DNA sample. Vortex for 10 seconds to mix. Transfer the entire contents of the 0.2 mL PCR tube to a 1.5 mL centrifuge tube. Place the 1.5 mL centrifuge tube on the magnetic rack to adsorb the beads and discard the supernatant.
(4) Remove the 1.5 mL centrifuge tube from the magnetic rack and add 200 μL of 1× rinse solution preheated to 47 °C. Pipette up and down 10 times to mix (work quickly so that the reagent and sample temperature does not fall below 47 °C). After mixing, place the sample on a 47 °C heating block for 5 minutes. Repeat this step for a total of two washes with 47 °C 1× rinse solution. Place the 1.5 mL centrifuge tube on the magnetic rack to adsorb the beads and discard the supernatant.
(5) Add 200 μL of room-temperature 1× wash solution I to the 1.5 mL centrifuge tube and vortex for 2 minutes. Place the tube on the magnetic rack to adsorb the beads and discard the supernatant. Add 200 μL of room-temperature 1× wash solution II to the tube and vortex for 1 minute. Place the tube on the magnetic rack to adsorb the beads and discard the supernatant. Add 200 μL of room-temperature 1× wash solution III to the tube and vortex for 30 seconds. Place the tube on the magnetic rack to adsorb the beads and discard the supernatant.
(6) Remove the 1.5 mL centrifuge tube from the magnetic rack and add 45 μL of PCR-grade water to dissolve and elute the bead-captured sample.
1.10 PCR amplification of the captured DNA
(1) Prepare the post-capture PCR mix according to the table below and vortex to mix. Enrichment primer F and enrichment primer R were purchased from Invitrogen.
(2) The amplification program for the bead-bound DNA PCR is as follows:
(3) Recovery and purification of the hybrid capture DNA PCR product: recover and purify the DNA in the reaction system with nucleic acid purification magnetic beads at 0.9×, and dissolve the purified library in 30 μL ddH2O.
1.11 Library quantification
The library was analyzed on a 2100 Bioanalyzer (Agilent) or LabChip GX (Caliper) and by qPCR, and the library concentration was recorded.
1.12 Sequencing the library
The constructed library was sequenced on a NextSeq 550AR.
1.13 Data processing and analysis
The sequencing results from 1.12 were processed and analyzed using the FFPE sample copy number variation detection device of the invention.
The FFPE sample copy number variation detection device of Example 1 comprises the following modules.
Sequencing data acquisition module:
Used to obtain the sequencing data produced by capture sequencing of the breast cancer FFPE sample to be tested with a breast cancer target-region capture chip.
Data quality control module:
Performs quality control on the sequencing data: short reads with a low mean quality value, short reads with a high N content, and adapter-related short reads are filtered out, giving the filtered sequencing data C.
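The three filters named for the data quality control module can be illustrated with a minimal Python sketch. The patent text gives no thresholds and no adapter sequence, so `MIN_MEAN_QUAL`, `MAX_N_FRACTION`, and `ADAPTER` below are assumed values chosen only for illustration, and reads are modelled as simple (bases, qualities) pairs rather than parsed FASTQ records.

```python
from typing import List, Tuple

# Illustrative assumptions: none of these values appear in the patent text.
MIN_MEAN_QUAL = 20          # minimum acceptable mean Phred quality
MAX_N_FRACTION = 0.1        # maximum tolerated fraction of N bases
ADAPTER = "AGATCGGAAGAGC"   # example adapter prefix

Read = Tuple[str, List[int]]  # (bases, per-base Phred qualities)


def passes_filter(read: Read) -> bool:
    """Return True if the read survives all three filters."""
    seq, quals = read
    if not seq or len(seq) != len(quals):
        return False
    if sum(quals) / len(quals) < MIN_MEAN_QUAL:
        return False  # low mean quality
    if seq.upper().count("N") / len(seq) > MAX_N_FRACTION:
        return False  # too many undetermined bases
    if ADAPTER in seq.upper():
        return False  # adapter-related read
    return True


def filter_reads(reads: List[Read]) -> List[Read]:
    """Keep only passing reads; the result plays the role of data set C."""
    return [r for r in reads if passes_filter(r)]
```

In practice this step is usually delegated to a dedicated read-trimming tool; the sketch only shows the order and nature of the checks.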
Sequence alignment module:
The filtered sequencing data C are aligned as short reads to the human reference genome HG19, giving alignment result A. The depth value of each site on the genome is calculated from alignment result A, giving result D.
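Once reads are aligned, the per-site depth (result D) is simply the number of aligned reads covering each position. The sketch below assumes the alignments have already been reduced to half-open `(start, end)` intervals on one chromosome; the function name `per_site_depth` and this interval representation are illustrative and are not taken from the patent.

```python
from typing import List, Tuple


def per_site_depth(region_start: int, region_end: int,
                   alignments: List[Tuple[int, int]]) -> List[int]:
    """Depth at each position of [region_start, region_end).

    Each alignment is a half-open interval (start, end) on the same
    chromosome. A difference array keeps the cost linear in
    region length + number of reads.
    """
    length = region_end - region_start
    diff = [0] * (length + 1)
    for start, end in alignments:
        s = max(start, region_start) - region_start
        e = min(end, region_end) - region_start
        if s < e:               # read overlaps the region
            diff[s] += 1
            diff[e] -= 1
    depth, running = [], 0
    for delta in diff[:length]:
        running += delta
        depth.append(running)
    return depth
```

For example, three reads covering 98-102, 101-104, and 103-110 give depths `[1, 2, 1, 2, 1, 1]` over positions 100 to 105.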
Preliminary data processing module:
The cancer target region is divided into overlapping windows of a given length; the extreme depth values within each window are removed and the median depth is calculated, and the GC content of the reference genome sequence within each window is calculated, giving result X.
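The windowing step can be sketched as follows. The window size, step, and the fraction of extreme depth values trimmed from each end are not specified in the text, so `size`, `step`, and `trim_fraction` are assumed parameters, and the three helper names are hypothetical.

```python
from statistics import median
from typing import List, Tuple


def sliding_windows(start: int, end: int, size: int, step: int) -> List[Tuple[int, int]]:
    """Overlapping half-open windows covering [start, end); step < size gives overlap."""
    windows = []
    pos = start
    while pos < end:
        windows.append((pos, min(pos + size, end)))
        if pos + size >= end:
            break
        pos += step
    return windows


def trimmed_median(depths: List[int], trim_fraction: float = 0.1) -> float:
    """Median depth after dropping the lowest and highest trim_fraction of sites."""
    ordered = sorted(depths)
    k = int(len(ordered) * trim_fraction)
    kept = ordered[k:len(ordered) - k] if len(ordered) > 2 * k else ordered
    return median(kept)


def gc_content(seq: str) -> float:
    """Fraction of G/C among called (A/C/G/T) bases of the reference sequence."""
    seq = seq.upper()
    called = sum(seq.count(b) for b in "ACGT")
    return (seq.count("G") + seq.count("C")) / called if called else 0.0
```

Each window then contributes one (trimmed median depth, GC content) pair, which together correspond to result X.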
Normalization module:
Combining results X and D, the Z value in each window of the genomic DNA to be tested is calculated according to the formula Zi = trimScale(Zi, Zi).
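The text does not define trimScale. One plausible reading, used here purely as an assumption, is a robust standardization: the centre and spread are estimated after trimming extreme windows, and every window is then scaled by those estimates. The function `trim_scale` and its `trim_fraction` parameter are hypothetical names for this sketch.

```python
from statistics import mean, pstdev
from typing import List


def trim_scale(values: List[float], trim_fraction: float = 0.1) -> List[float]:
    """Standardize values using a trimmed mean and trimmed standard deviation.

    Assumed interpretation of 'trimScale'; the patent does not give
    its definition.
    """
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)
    core = ordered[k:len(ordered) - k] if len(ordered) > 2 * k else ordered
    center = mean(core)
    spread = pstdev(core)
    if spread == 0:
        # Degenerate case: no variation left after trimming.
        return [0.0 for _ in values]
    return [(v - center) / spread for v in values]
```

Trimming before estimating the scale keeps a strongly amplified window from inflating the standard deviation and thereby masking itself.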
Background library screening module:
Definition
chr denotes a chromosome, St denotes the sample to be tested, and Sn denotes a background library sample.
Based on the Z values of the genomic DNA to be tested and of the background library, the background library samples that minimize the d value are selected, giving the screened background library sample set S1, S2, S3, …, Sn.
The Z values of these n samples in the m windows are used to build the matrix X_{m×n}, which serves as the background library.
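The selection of the n closest reference samples and the construction of the m × n matrix can be sketched as below. The formula for d is not reproduced in this text, so the sketch substitutes a sum of squared differences between Z profiles as a stand-in distance; that choice, and the names `distance`, `select_background`, and `build_matrix`, are assumptions for illustration only.

```python
from typing import Dict, List


def distance(z_test: List[float], z_ref: List[float]) -> float:
    """Stand-in for d: sum of squared Z differences (assumed, not the patent's formula)."""
    return sum((a - b) ** 2 for a, b in zip(z_test, z_ref))


def select_background(z_test: List[float],
                      candidates: Dict[str, List[float]],
                      n: int) -> List[str]:
    """Names of the n candidate samples with the smallest distance to the test sample."""
    ranked = sorted(candidates, key=lambda name: distance(z_test, candidates[name]))
    return ranked[:n]


def build_matrix(candidates: Dict[str, List[float]],
                 chosen: List[str]) -> List[List[float]]:
    """m x n matrix: one row per window, one column per chosen sample."""
    m = len(candidates[chosen[0]])
    return [[candidates[name][i] for name in chosen] for i in range(m)]
```

With a test profile `[0, 1, 2]` and candidates A `[0, 1, 2.1]`, B `[5, 5, 5]`, C `[0, 1.2, 2]`, selecting n = 2 keeps A and C and discards the dissimilar sample B.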
Data fluctuation elimination module:
Singular value decomposition is performed on the background library matrix X_{m×n}, giving a factor matrix U_{m×n} of m rows and n columns, where n is the number of factors. LOESS regression is performed on the several factors with the largest contribution rates, giving the residual Z_p.
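The idea behind this module is that the leading left singular vectors of the background matrix capture systematic, sample-independent fluctuation, which can be regressed out of the test sample. The sketch below is a simplified stand-in in pure Python: it approximates the leading factors by power iteration rather than a full SVD, and it replaces the LOESS regression named in the text with an ordinary linear projection onto those factors. Both substitutions, and all function names, are assumptions made to keep the example short and self-contained.

```python
from math import sqrt
from typing import List

Vector = List[float]
Matrix = List[Vector]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def matvec(A: Matrix, v: Vector) -> Vector:
    return [dot(row, v) for row in A]


def top_left_factors(X: Matrix, k: int, iters: int = 200) -> List[Vector]:
    """Approximate the k leading left singular vectors of X (m x n)
    by power iteration with deflation on the Gram matrix X X^T."""
    m = len(X)
    A = [[dot(X[i], X[j]) for j in range(m)] for i in range(m)]
    factors: List[Vector] = []
    for f in range(k):
        v = [1.0 + 0.01 * (i + f) for i in range(m)]  # fixed start vector
        exhausted = False
        for _ in range(iters):
            w = matvec(A, v)
            norm = sqrt(dot(w, w))
            if norm < 1e-12:      # no variance left to explain
                exhausted = True
                break
            v = [x / norm for x in w]
        if exhausted:
            break
        lam = dot(v, matvec(A, v))
        factors.append(v)
        # Deflate: remove the component just found.
        A = [[A[i][j] - lam * v[i] * v[j] for j in range(m)] for i in range(m)]
    return factors


def remove_factors(z: Vector, factors: List[Vector]) -> Vector:
    """Residual of z after linear projection onto each factor
    (a linear substitute for the LOESS regression in the text)."""
    residual = list(z)
    for u in factors:
        coef = dot(u, residual)
        residual = [r - coef * ui for r, ui in zip(residual, u)]
    return residual
```

If every background sample follows the same window-to-window pattern, a test sample following that pattern leaves a residual near zero, while a window with a genuine gain stands out in the residual.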
GC correction module:
According to the GC content in the m windows, LOESS-based GC correction is applied to Z_p, giving the residual Z_pg.
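A LOESS-based GC correction fits a smooth curve of Z_p against window GC content and keeps the residual. The sketch below implements a basic local linear regression with tricube weights in pure Python; the smoothing span `frac` is an assumed parameter, the function names are hypothetical, and a production pipeline would more likely use an established LOWESS implementation with robustness iterations.

```python
from typing import List


def loess_fit(x: List[float], y: List[float], frac: float = 0.5) -> List[float]:
    """Local linear fit with tricube weights, evaluated at each x."""
    n = len(x)
    k = min(n, max(2, int(round(frac * n))))  # neighbours per local fit
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = max(dists[k - 1], 1e-12)          # local bandwidth
        sw = swx = swy = swxx = swxy = 0.0
        for xi, yi in zip(x, y):
            u = abs(xi - x0) / h
            w = (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0
            sw += w
            swx += w * xi
            swy += w * yi
            swxx += w * xi * xi
            swxy += w * xi * yi
        denom = sw * swxx - swx * swx
        if abs(denom) < 1e-12:
            fitted.append(swy / sw)           # fall back to weighted mean
        else:
            slope = (sw * swxy - swx * swy) / denom
            intercept = (swy - slope * swx) / sw
            fitted.append(intercept + slope * x0)
    return fitted


def gc_correct(z_p: List[float], gc: List[float], frac: float = 0.5) -> List[float]:
    """Z_pg: residual of Z_p after removing the GC-dependent trend."""
    trend = loess_fit(gc, z_p, frac)
    return [z - t for z, t in zip(z_p, trend)]
```

A purely GC-driven trend is removed almost entirely, whereas a window that deviates from its GC neighbours keeps a clearly positive residual.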
Output module:
The output module displays a plot of the CNV detection results.
The detection result is shown in Fig. 2; each dot in the figure is the Z_pg value of one window. Copy number gain was detected for two genes, PIK3CA and ERBB2.
1.14 Result verification
RNA was extracted from fresh primary tumour tissue of the same patient and reverse-transcribed, and qPCR was used to verify whether expression of the PIK3CA and ERBB2 genes was elevated; the verification result was consistent with the detection result in 1.13. The detection device of the invention can therefore successfully detect copy number variation in FFPE samples.
Industrial applicability
The FFPE sample CNV detection device and detection method of the invention can significantly improve the sensitivity of CNV detection.

Claims (8)

1. A device for detecting copy number variation in FFPE samples, comprising:
a sequencing data acquisition module for obtaining capture sequencing data from the FFPE sample to be tested and sequencing data from healthy population samples, the healthy population samples being samples from a plurality of healthy individuals;
a sequence alignment module, connected to the sequencing data acquisition module, for aligning the sequencing data obtained by the sequencing data acquisition module to a reference genome sequence to obtain an alignment result, and calculating the depth value of each site from the alignment result;
a preliminary data processing module, connected to the sequence alignment module, for dividing the target region into overlapping windows of a given length, removing the extreme per-site depth values within each window and calculating the mean or median depth, and calculating the GC content of the reference genome sequence within each window;
a normalization module, connected to the preliminary data processing module, for normalizing the mean or median depth in each window obtained by the preliminary data processing module, and calculating the Z value in each window for the FFPE sample to be tested and the healthy population samples;
a background library screening module, connected to the normalization module, for selecting n healthy-individual samples according to the Z values of the FFPE sample to be tested and the healthy population samples to obtain a background library sample set of n healthy-individual samples, and then building a matrix X_{m×n} of m rows and n columns from the Z values of the n healthy-individual samples in m windows;
a data fluctuation elimination module, connected to the background library screening module, for eliminating the inherent data fluctuation introduced by capture sequencing;
a GC correction module, connected to the data fluctuation elimination module, for performing GC correction according to the GC content in each window;
an output module, connected to the GC correction module, for outputting the CNV detection result.
2. The device according to claim 1, wherein the sequencing data are sequencing data obtained by a capture sequencing method.
3. The device according to claim 1, wherein the preliminary data processing module divides the windows using a sliding window method.
4. The device according to claim 1, wherein the normalization module calculates the Z value in each window of the biological sample to be tested according to the following formula (1), in which Zi denotes the Z value of the i-th window,
Zi = trimScale(Zi, Zi) ……(1).
5. The device according to claim 1, wherein formula (2) is defined as:
Definition
wherein chr denotes a chromosome, S_T denotes the sample to be tested, and S_N denotes a healthy population sample,
and the background library screening module, according to the Z values of the FFPE sample to be tested and the healthy population samples, selects the n healthy-individual samples that minimize the d value, giving the screened background library sample set S1, S2, S3, …, Sn.
6. The device according to claim 1, wherein the data fluctuation elimination module performs singular value decomposition on the background library matrix X_{m×n} to obtain a factor matrix U_{m×r} of m rows and r columns, where r is the number of factors, and then performs LOESS regression on the k factors with the largest contribution rates to obtain the residual Z_p.
7. The device according to claim 6, wherein the GC correction module applies LOESS-based GC correction to Z_p according to the GC content in each window to obtain the residual Z_pg.
8. The device according to claim 1, further comprising a data quality control module connected to the sequencing module and the sequence alignment module, for performing quality control on the sequencing data obtained by the sequencing module.
CN201710067086.3A 2016-12-29 2017-02-07 A device for FFPE sample copy number variation detection Active CN106845154B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2016112473931 2016-12-29
CN201611247393 2016-12-29

Publications (2)

Publication Number Publication Date
CN106845154A true CN106845154A (en) 2017-06-13
CN106845154B CN106845154B (en) 2022-04-08

Family

ID=59121511

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710067086.3A Active CN106845154B (en) 2016-12-29 2017-02-07 A device for FFPE sample copy number variation detection

Country Status (1)

Country Link
CN (1) CN106845154B (en)

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104662156A (en) * 2012-08-17 2015-05-27 美国陶氏益农公司 Use of a maize untranslated region for transgene expression in plants
CN105555968A (en) * 2013-05-24 2016-05-04 塞昆纳姆股份有限公司 Methods and processes for non-invasive assessment of genetic variations
CN105722994A (en) * 2013-06-17 2016-06-29 维里纳塔健康公司 Method for determining copy number variations in sex chromosomes
CN105814574A (en) * 2013-10-04 2016-07-27 塞昆纳姆股份有限公司 Methods and processes for non-invasive assessment of genetic variations
CN104133914A (en) * 2014-08-12 2014-11-05 厦门万基生物科技有限公司 Method for removing GC deviations introduced by high throughout sequencing and detecting chromosome copy number variation
CN104560697A (en) * 2015-01-26 2015-04-29 上海美吉生物医药科技有限公司 Detection device for instability of genome copy number
CN105574361A (en) * 2015-11-05 2016-05-11 上海序康医疗科技有限公司 Method for detecting variation of copy numbers of genomes
CN105483229A (en) * 2015-12-21 2016-04-13 广东腾飞基因科技有限公司 Method and system for detecting fetal chromosome aneuploidy
CN105760712A (en) * 2016-03-01 2016-07-13 西安电子科技大学 Copy number variation detection method based on next generation sequencing
CN106156543A (en) * 2016-06-22 2016-11-23 厦门艾德生物医药科技股份有限公司 A kind of tumor ctDNA information statistical method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
YUCHAO JIANG等: ""CODEX:a normalization and copy number variation detection method for whole exome sequencing"", 《NUCLEIC ACIDS RESEARCH》 *
刘佳森等: ""苏尼特羊拷贝数变异的基因组分布特征研究"", 《中国畜牧兽医》 *
李燕等: ""新一代测序的拷贝数变异检测算法研究与设计"", 《生物信息学》 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108733979A (en) * 2017-10-30 2018-11-02 成都凡迪医疗器械有限公司 G/C content calibration method, device and the computer readable storage medium of NIPT
CN109979535A (en) * 2017-12-28 2019-07-05 安诺优达基因科技(北京)有限公司 Science of heredity screening apparatus before a kind of embryo implantation
CN109979529A (en) * 2017-12-28 2019-07-05 安诺优达基因科技(北京)有限公司 CNV detection device
CN109979529B (en) * 2017-12-28 2021-01-08 北京安诺优达医学检验实验室有限公司 CNV detection device
CN109979535B (en) * 2017-12-28 2021-03-02 浙江安诺优达生物科技有限公司 Genetics screening device before embryo implantation
CN110797088A (en) * 2019-10-17 2020-02-14 南京医基云医疗数据研究院有限公司 Whole genome resequencing analysis and method for whole genome resequencing analysis
CN110797088B (en) * 2019-10-17 2020-09-15 南京医基云医疗数据研究院有限公司 Whole genome resequencing analysis and method for whole genome resequencing analysis
CN111477275A (en) * 2020-04-02 2020-07-31 上海之江生物科技股份有限公司 Method and device for identifying multi-copy area in microorganism target fragment and application

Also Published As

Publication number Publication date
CN106845154B (en) 2022-04-08

Legal Events

Date Code Title Description
PB01 Publication
TA01 Transfer of patent application right

Effective date of registration: 20171215

Address after: 100176 Beijing branch of Beijing economic and Technological Development Zone Street 88 Hospital No. 8 Building 2 unit 701 room

Applicant after: Annoroad Genetic Technology (Beijing) Co., Ltd.

Applicant after: Zhejiang Annuo uni-data Biotechnology Co. Ltd.

Applicant after: Annuo uni-data (Yiwu) Medical Inspection Co. Ltd.

Address before: 100176 Beijing branch of Daxing District economic and Technological Development Zone Street 88 Hospital No. 8 Building 2 unit 701 room

Applicant before: Annoroad Genetic Technology (Beijing) Co., Ltd.

SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 322000 1 building, No. 2 building, No. 10 standard building, Gaoxin Road, Chou Jiang Street, Yiwu, Zhejiang.

Applicant after: ZHEJIANG ANNOROAD BIO-TECHNOLOGY Co.,Ltd.

Applicant after: ANNOROAD GENE TECHNOLOGY (BEIJING) Co.,Ltd.

Applicant after: ANNOROAD (YIWU) MEDICAL INSPECTION CO.,LTD.

Address before: 100176 room 701, unit 2, building 8, courtyard 88, Kechuang 6th Street, Beijing Economic and Technological Development Zone, Beijing

Applicant before: ANNOROAD GENE TECHNOLOGY (BEIJING) Co.,Ltd.

Applicant before: ZHEJIANG ANNOROAD BIO-TECHNOLOGY Co.,Ltd.

Applicant before: ANNOROAD (YIWU) MEDICAL INSPECTION CO.,LTD.

GR01 Patent grant