A device for detecting gene fusions in FFPE samples
Technical field
The present invention relates to the field of gene fusion detection, and more particularly to a device and method for detecting gene fusions in FFPE samples.
Background art
Tissue specimens prepared by the formalin-fixed, paraffin-embedded (FFPE) method are called formalin-fixed paraffin-embedded tissue samples, or FFPE samples for short. FFPE samples can be stored for long periods and are commonly used in clinical pathology examination, oncogene testing and medical research, making them a reliable source of material for molecular biology research.
Worldwide, billions of tissue samples are stored in hospitals or tissue banks, and the vast majority of them are FFPE samples. FFPE samples represent a precious and widely available biomedical research material, and the enormous number of archived FFPE samples provides a valuable resource for retrospective studies, for elucidating disease mechanisms, for finding therapeutic targets and for indicating prognosis. In particular, a large number of tumor tissue sections are preserved in the form of FFPE samples.
Gene fusion is a clinically very important class of chromosomal structural variation and plays a key role in the occurrence and evolution of cancer. Accurate fusion detection results can provide a reference basis for targeted anti-cancer drug therapy and for prognosis evaluation.
The techniques conventionally used to detect fusion genes are mainly based on genetic methods, such as FISH. However, their relatively low resolution and throughput limit the application of such methods in the detection of complex epithelial cancers.
With the development of second-generation sequencing technology, a large number of methods for detecting fusion genes have emerged. In gene fusion detection methods, the confirmation of breakpoints directly affects the judgment of the detection result. CREST is one of the mainstream algorithms currently used to detect fusion genes. The algorithm performs two rounds of assembly in order to exclude false positives, so its main advantage is a low false positive rate. At the same time, because two rounds of assembly are required, it suffers from slow detection speed, high resource requirements and a dependence on assembly; moreover, the assembly result is affected by coverage and by insert fragment length. Because FFPE samples are highly degraded, their fragments are shorter and their coverage is lower, which considerably affects the length of the assembled fragments and thereby the fusion detection result. How to detect gene fusions in FFPE samples has therefore become a problem in urgent need of a solution in this field.
Summary of the invention
Technical problem to be solved by the invention
Because prior-art algorithms require two rounds of assembly and three rounds of alignment, they suffer from slow detection speed and high resource requirements. In addition, because the assembled sequences of FFPE samples are short and their coverage is low, the assembly of repetitive sequences carries a degree of uncertainty, which may lead to erroneous detection results.
In view of the above problems in the prior art, an object of the present invention is to provide a device and method for detecting gene fusions that offer fast detection, low resource requirements and high stability.
Compared with prior-art algorithms, the detection device of the present invention makes full use of the information in the raw sequencing fragments (reads) produced by PE sequencing, reduces the number of alignment rounds to only two, and requires no assembly, which improves the stability of detection.
That is, the present invention includes:
A device for detecting gene fusions in FFPE samples, comprising the following modules:
A sequencing data acquisition module, for obtaining the sequencing data of an FFPE sample; preferably, the sequencing data are obtained by paired-end sequencing (Paired-end Sequencing, PE sequencing);
An alignment module, connected to the sequencing data acquisition module, for aligning the obtained sequencing data to a reference sequence to obtain an alignment result. The alignment result includes the position information of each sequencing fragment (read) in the genome. The position information includes soft-clipping information and successful-alignment information. The part of a read that carries soft-clipping information is the soft-clipped portion of that read, and the part that carries successful-alignment information is its successfully aligned portion. Preferably, the module can use the bwa software to find the position of each read in the genome and generate a file in bam format; preferably, the bam file includes, for each read, the description information (qname), sequence information (seq), alignment position (POS), bit flag (flag), mapping quality value (MAPQ), compact alignment expression (Cigar) and template length (Tlen);
A re-alignment module, connected to the alignment module, for aligning the reads that carry soft-clipping information to the reference genome again to obtain a re-alignment result;
A true fusion breakpoint determination module, connected to the re-alignment module, for determining the fusion breakpoints of the reads; and
An output module, connected to the true fusion breakpoint determination module, for outputting the gene fusion detection result, for example the gene fusion breakpoint positions (e.g. left_pos, right_pos), the chromosome numbers (e.g. left_chr, right_chr) and the support (e.g. sup).
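By way of non-limiting illustration, the soft-clipping information used by the alignment module can be read from the Cigar field of each alignment record. The following is a minimal Python sketch under stated assumptions: the function name `soft_clip_lengths` is hypothetical and is not part of the invention, and the input is assumed to be a Cigar string as written by an aligner such as bwa.

```python
import re

# One CIGAR operation is a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def soft_clip_lengths(cigar):
    """Return (left_clip, right_clip): soft-clipped bases at each read end.

    Hypothetical helper for illustration only.
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips (H) may sit outside soft clips and are not in SEQ; ignore them.
    core = [(n, op) for n, op in ops if op != "H"]
    left = core[0][0] if core and core[0][1] == "S" else 0
    right = core[-1][0] if len(core) > 1 and core[-1][1] == "S" else 0
    return left, right
```

A read whose Cigar is 60M40S, for instance, has its first 60 bases aligned and its last 40 bases soft-clipped, so the junction between the two parts is a candidate breakpoint.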
Preferably, the re-alignment module can include, for example, the following submodules:
A length filtering submodule, connected to the alignment module, for filtering out, from the reads containing soft-clipping (soft-clipping) information, those reads whose length is less than a certain value; preferably, the certain value can be, for example, 15~30 bp, preferably 20~25 bp.
A breakpoint determination submodule, connected to the length filtering submodule, for taking, based on the result data of the length filtering submodule, the junction between the part of a read that carries soft-clipping information and the part that carries normal alignment information as the breakpoint;
A separation submodule, connected to the breakpoint determination submodule, for separating the part carrying soft-clipping information from the part carrying normal alignment information at the breakpoint, and for saving the sequence information of these two parts into two separate files (e.g. fastq files);
A re-alignment submodule, connected to the separation submodule, for aligning the two files in which the sequence information has been saved to the reference sequence again to obtain a re-alignment result; preferably, the re-alignment result includes the following information for each read: description information (qname), sequence information (seq), alignment position (POS), bit flag (flag), mapping quality value (MAPQ), compact alignment expression (Cigar) and template length (Tlen). Preferably, the two fastq files can be re-aligned using, for example, the bwa software to generate files in bam format that contain these fields for each read.
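For illustration only, the length filtering, breakpoint determination and separation submodules can be sketched in Python as follows. This is a minimal sketch, not the implementation of the invention: the function name `split_soft_clipped` and the parameter `min_clip` are hypothetical, the length threshold is assumed to apply to the soft-clipped part, and when both ends of a read are clipped only the longer clip is split off.

```python
import re

def split_soft_clipped(qname, seq, qual, cigar, min_clip=20):
    """Split a read at the junction of its soft-clipped and aligned parts.

    Returns (clipped_fastq, aligned_fastq) as FASTQ record strings, or None
    if the read has no soft clip of at least `min_clip` bases.
    """
    ops = [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]
    ops = [(n, op) for n, op in ops if op != "H"]  # hard clips are not in SEQ
    if not ops:
        return None
    left = ops[0][0] if ops[0][1] == "S" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    if max(left, right) < min_clip:
        return None  # length filter: clip too short to re-align reliably

    def fastq(s, q):
        # Both parts keep the same qname so they can be paired again later.
        return f"@{qname}\n{s}\n+\n{q}\n"

    if left >= right:
        cut = left  # breakpoint lies after the leading soft clip
        return fastq(seq[:cut], qual[:cut]), fastq(seq[cut:], qual[cut:])
    cut = len(seq) - right  # breakpoint lies before the trailing soft clip
    return fastq(seq[cut:], qual[cut:]), fastq(seq[:cut], qual[:cut])
```

Keeping the original qname on both records is what allows the later breakpoint information acquisition submodule to find the two halves of the same read after re-alignment.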
Preferably, the true fusion breakpoint determination module can include the following submodules:
A filtering submodule, connected to the re-alignment submodule, for filtering out, according to the bit flag (flag) value, the reads that failed to align (unmapped), as well as the reads with a low mapping quality value (MAPQ);
A breakpoint information acquisition submodule, connected to the filtering submodule, for finding the reads that have the same fragment description information (qname) and obtaining the breakpoint information; preferably, the breakpoint information includes: (1) left/right_chr, the chromosome number of the sequence on the left/right side of the breakpoint; (2) left/right_pos, the alignment position of the first base on the left/right side of the breakpoint; (3) left/right_seq, the base sequence on the left/right side of the breakpoint; (4) sup, the breakpoint support, i.e. the number of reads supporting the breakpoint.
A fusion breakpoint screening submodule, connected to the breakpoint information acquisition submodule, for screening fusion breakpoints from the breakpoint information;
A fusion breakpoint first merging submodule, connected to the fusion breakpoint screening submodule, for merging the fusion breakpoints that have identical breakpoint information into one true fusion breakpoint, and for taking the number of fusion breakpoints with identical breakpoint information as the support of that true fusion breakpoint. Here, identical breakpoint information means that left_chr, left_pos, right_chr and right_pos are all the same.
A fusion breakpoint re-merging submodule, connected to the fusion breakpoint first merging submodule, for merging the fusion breakpoints whose left_chr and right_chr are the same and whose right_pos or left_pos differ by no more than a certain value (e.g. 3 bp) into one true fusion breakpoint.
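The two merging submodules can be illustrated by the following minimal Python sketch. It is an illustration under stated assumptions rather than the implementation of the invention: the function names are hypothetical, each breakpoint is represented as a tuple (left_chr, left_pos, right_chr, right_pos), the re-merge requires both positions to agree within the tolerance (one possible reading of the text), and the supports of merged breakpoints are summed.

```python
from collections import Counter

def merge_identical(breakpoints):
    """First merge: identical breakpoints become one, with sup = their count."""
    counts = Counter(breakpoints)
    return [(*key, sup) for key, sup in sorted(counts.items())]

def merge_nearby(merged, tol=3):
    """Re-merge: same chromosome pair, positions within `tol` bp of each other.

    Breakpoints are visited from highest to lowest support, so each cluster
    keeps the coordinates of its best-supported member (an assumption).
    """
    clusters = []
    for lc, lp, rc, rp, sup in sorted(merged, key=lambda b: -b[4]):
        for c in clusters:
            same_chroms = (c[0], c[2]) == (lc, rc)
            if same_chroms and abs(c[1] - lp) <= tol and abs(c[3] - rp) <= tol:
                c[4] += sup  # assumed: supports of merged breakpoints add up
                break
        else:
            clusters.append([lc, lp, rc, rp, sup])
    return [tuple(c) for c in clusters]
```

The small tolerance absorbs the one- or two-base shifts that arise when bases at the junction match both fusion partners and the aligner assigns them to either side.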
Preferably, the breakpoint information includes:
left_chr: the chromosome number of the sequence on the left side of the breakpoint, i.e. the reference sequence number corresponding to read1.
left_pos: the alignment position of the first base on the left side of the breakpoint, i.e. the alignment position corresponding to read1 plus the sequence length of read1.
left_seq: the base sequence on the left side of the breakpoint.
right_chr: the chromosome number of the sequence on the right side of the breakpoint, i.e. the reference sequence number corresponding to read2.
right_pos: the alignment position of the first base on the right side of the breakpoint, i.e. the alignment position corresponding to read2 plus the sequence length of read2.
right_seq: the base sequence on the right side of the breakpoint.
sup: the breakpoint support, i.e. the number of reads supporting the breakpoint; the default is 1.
In addition, the breakpoint information can also include ort, which is determined from the alignment pattern recorded in the fragment description information of the read: "+" indicates that the soft clip occurs on the right side of the breakpoint in the clean read, and "-" indicates that the soft clip occurs on the left side of the breakpoint in the clean read.
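As a non-limiting illustration, the breakpoint information listed above can be held in a simple record type. The Python sketch below is an assumption-laden example: the class `Breakpoint`, the function `make_breakpoint` and the dictionary keys `chrom`, `pos` and `seq` are hypothetical names, and the positions are computed literally as stated above (alignment position plus sequence length).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Breakpoint:
    left_chr: str
    left_pos: int
    left_seq: str
    right_chr: str
    right_pos: int
    right_seq: str
    sup: int = 1    # support defaults to 1 for a single read
    ort: str = "+"  # "+": soft clip right of the breakpoint, "-": left

def make_breakpoint(read1, read2, ort="+"):
    """Build a Breakpoint from two re-aligned parts sharing one qname.

    `read1` and `read2` are dicts with hypothetical keys chrom, pos and seq.
    """
    return Breakpoint(
        left_chr=read1["chrom"],
        left_pos=read1["pos"] + len(read1["seq"]),   # position + length
        left_seq=read1["seq"],
        right_chr=read2["chrom"],
        right_pos=read2["pos"] + len(read2["seq"]),  # position + length
        right_seq=read2["seq"],
        ort=ort,
    )
```

Because the record is frozen and therefore hashable, identical breakpoints can later be counted directly, for example with a dictionary keyed on the four coordinate fields.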
Preferably, the fusion breakpoint screening submodule includes the following elements:
A breakpoint quality filtering element, for filtering out low-quality breakpoints: if there is a breakpoint A whose sup is greater than a certain value (e.g. 5), whose left_seq and right_seq both have a mapping quality value greater than a certain value (e.g. 30), and whose left_seq and right_seq both have a mismatch rate less than a certain value (e.g. 0.05) and/or whose ratio of breakpoint support to the sequencing depth at the position to the right or left of the breakpoint is greater than a certain value (e.g. 0.1), then breakpoint A is a fusion breakpoint.
An identical breakpoint merging element, for merging identical breakpoints: if there are breakpoints A and B such that left_chr in A equals right_chr in B, right_chr in A equals left_chr in B, left_pos in A equals right_pos in B, and right_pos in A equals left_pos in B, then breakpoints A and B are merged into one fusion breakpoint;
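The two screening elements can be sketched as follows, for illustration only. The sketch makes several assumptions that are not fixed by the text: the function names and the dictionary keys (`left_mapq`, `right_mapq`, `left_mismatch`, `right_mismatch`, `depth`) are hypothetical, the "and/or" condition is read as "low mismatch rate or sufficient support fraction", and the supports of merged reciprocal breakpoints are summed.

```python
def passes_quality(bp, min_sup=5, min_mapq=30, max_mismatch=0.05, min_fraction=0.1):
    """Return True if breakpoint `bp` (a dict) meets the example thresholds."""
    if bp["sup"] <= min_sup:
        return False
    if min(bp["left_mapq"], bp["right_mapq"]) <= min_mapq:
        return False
    low_mismatch = max(bp["left_mismatch"], bp["right_mismatch"]) < max_mismatch
    # Fraction of the local sequencing depth that supports the breakpoint.
    well_supported = bp["depth"] > 0 and bp["sup"] / bp["depth"] > min_fraction
    return low_mismatch or well_supported

def canonical_key(bp):
    """Order the two breakpoint ends so that A->B and B->A share one key."""
    a = (bp["left_chr"], bp["left_pos"])
    b = (bp["right_chr"], bp["right_pos"])
    return (a, b) if a <= b else (b, a)

def merge_reciprocal(breakpoints):
    """Merge breakpoints whose left and right sides are swapped."""
    merged = {}
    for bp in breakpoints:
        key = canonical_key(bp)
        if key in merged:
            merged[key]["sup"] += bp["sup"]  # assumed: supports add up
        else:
            merged[key] = dict(bp)  # copy, so the input is left unchanged
    return list(merged.values())
```

Whether the quality filter is applied before or after the reciprocal merge changes which breakpoints reach the support threshold; the text does not fix the order, so the sketch leaves that choice to the caller.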
Preferably, the fusion breakpoint re-merging submodule operates on the above fusion breakpoint information as follows: if right_pos in fusion breakpoint A and right_pos in fusion breakpoint B differ by less than a certain value (e.g. 5), and left_pos in fusion breakpoint A and left_pos in fusion breakpoint B differ by less than a certain value (e.g. 5), then fusion breakpoint A and fusion breakpoint B are merged into one true fusion breakpoint. The gene fusion (gene fusion) detection result is thereby finally obtained.
In addition, the present invention also includes:
A method for detecting gene fusions in FFPE samples, comprising the following steps:
A sequencing data acquisition step: obtaining the sequencing data of an FFPE sample; preferably, the sequencing data are obtained by paired-end sequencing (Paired-end Sequencing, PE sequencing);
An alignment step: aligning the obtained sequencing data to a reference sequence to obtain an alignment result. The alignment result includes the position information of each sequencing fragment (read) in the genome. The position information includes soft-clipping information and successful-alignment information. The part of a read that carries soft-clipping information is the soft-clipped portion of that read, and the part that carries successful-alignment information is its successfully aligned portion. Preferably, this step can use the bwa software to find the position of each read in the genome and generate a file in bam format; preferably, the bam file includes, for each read, the description information (qname), sequence information (seq), alignment position (POS), bit flag (flag), mapping quality value (MAPQ), compact alignment expression (Cigar) and template length (Tlen);
A re-alignment step: aligning the reads that carry soft-clipping information to the reference genome again to obtain a re-alignment result;
A true fusion breakpoint determination step: determining the fusion breakpoints of the reads; and
An output step: outputting the gene fusion detection result, for example the breakpoint positions (e.g. left_pos, right_pos), the chromosome numbers (e.g. left_chr, right_chr) and the support (e.g. sup).
Preferably, the re-alignment step can include, for example, the following sub-steps:
A length filtering sub-step: filtering out, from the reads containing soft-clipping (soft-clipping) information, those reads whose length is less than a certain value; preferably, the certain value can be, for example, 15~30 bp, preferably 20~25 bp.
A breakpoint determination sub-step: based on the result data of the length filtering sub-step, taking the junction between the part of a read that carries soft-clipping information and the part that carries normal alignment information as the breakpoint;
A separation sub-step: separating the part carrying soft-clipping information from the part carrying normal alignment information at the breakpoint, and saving the sequence information of these two parts into two separate files (e.g. fastq files);
A re-alignment sub-step: aligning the two files in which the sequence information has been saved to the reference sequence again to obtain a re-alignment result; preferably, the re-alignment result includes the following information for each read: description information (qname), sequence information (seq), alignment position (POS), bit flag (flag), mapping quality value (MAPQ), compact alignment expression (Cigar) and template length (Tlen). Preferably, the two fastq files can be re-aligned using, for example, the bwa software to generate files in bam format that contain these fields for each read.
Preferably, the true fusion breakpoint determination step can include the following sub-steps:
A filtering sub-step: filtering out, according to the bit flag (flag) value, the reads that failed to align (unmapped), as well as the reads with a low mapping quality value (MAPQ);
A breakpoint information acquisition sub-step: finding the reads that have the same fragment description information (qname) and obtaining the breakpoint information; preferably, the breakpoint information includes: (1) left/right_chr, the chromosome number of the sequence on the left/right side of the breakpoint; (2) left/right_pos, the alignment position of the first base on the left/right side of the breakpoint; (3) left/right_seq, the base sequence on the left/right side of the breakpoint; (4) sup, the breakpoint support, i.e. the number of reads supporting the breakpoint.
A fusion breakpoint screening sub-step: screening fusion breakpoints from the breakpoint information;
A fusion breakpoint first merging sub-step: merging the fusion breakpoints that have identical breakpoint information into one true fusion breakpoint, and taking the number of fusion breakpoints with identical breakpoint information as the support of that true fusion breakpoint. Here, identical breakpoint information means that left_chr, left_pos, right_chr and right_pos are all the same.
A fusion breakpoint re-merging sub-step: merging the fusion breakpoints whose left_chr and right_chr are the same and whose right_pos or left_pos differ by no more than a certain value (e.g. 3 bp) into one true fusion breakpoint.
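For illustration only, the filtering and breakpoint information acquisition sub-steps can be sketched in Python as below. The sketch relies on one fact of the SAM/BAM format, namely that bit 0x4 of the flag field marks an unmapped read; everything else is an assumption: the function names are hypothetical, each alignment record is represented as a dict with the keys `qname`, `flag` and `mapq`, and the MAPQ cut-off of 30 is only an example value.

```python
UNMAPPED = 0x4  # SAM flag bit: the read is unmapped

def keep_record(rec, min_mapq=30):
    """Keep a re-aligned record only if it is mapped with adequate MAPQ."""
    return not (rec["flag"] & UNMAPPED) and rec["mapq"] >= min_mapq

def pair_by_qname(aligned_parts, clipped_parts, min_mapq=30):
    """Pair the aligned part and the clipped part that share one qname.

    Returns a list of (aligned_record, clipped_record) tuples; a pair is
    reported only when both parts pass the filter.
    """
    aligned = {r["qname"]: r for r in aligned_parts if keep_record(r, min_mapq)}
    pairs = []
    for rec in clipped_parts:
        if keep_record(rec, min_mapq) and rec["qname"] in aligned:
            pairs.append((aligned[rec["qname"]], rec))
    return pairs
```

Each resulting pair describes one read whose two parts both aligned confidently, possibly to different chromosomes, which is the evidence from which a single breakpoint record is built.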
Preferably, the breakpoint information includes:
left_chr: the chromosome number of the sequence on the left side of the breakpoint, i.e. the reference sequence number corresponding to read1.
left_pos: the alignment position of the first base on the left side of the breakpoint, i.e. the alignment position corresponding to read1 plus the sequence length of read1.
left_seq: the base sequence on the left side of the breakpoint.
right_chr: the chromosome number of the sequence on the right side of the breakpoint, i.e. the reference sequence number corresponding to read2.
right_pos: the alignment position of the first base on the right side of the breakpoint, i.e. the alignment position corresponding to read2 plus the sequence length of read2.
right_seq: the base sequence on the right side of the breakpoint.
sup: the breakpoint support, i.e. the number of reads supporting the breakpoint; the default is 1.
Preferably, the fusion breakpoint screening sub-step includes the following steps:
If there is a breakpoint A whose sup is greater than a certain value (e.g. 5), whose left_seq and right_seq both have a mapping quality value greater than a certain value (e.g. 30), and whose left_seq and right_seq both have a mismatch rate less than a certain value (e.g. 0.05) and/or whose ratio of breakpoint support to the sequencing depth at the position to the right or left of the breakpoint is greater than a certain value (e.g. 0.1), then breakpoint A is determined to be a fusion breakpoint.
If there are breakpoints A and B such that left_chr in A equals right_chr in B, right_chr in A equals left_chr in B, left_pos in A equals right_pos in B, and right_pos in A equals left_pos in B, then breakpoints A and B are merged into one fusion breakpoint;
Preferably, the fusion breakpoint re-merging sub-step operates on the above fusion breakpoint information as follows: if right_pos in fusion breakpoint A and right_pos in fusion breakpoint B differ by less than a certain value (e.g. 5), and left_pos in fusion breakpoint A and left_pos in fusion breakpoint B differ by less than a certain value (e.g. 5), then fusion breakpoint A and fusion breakpoint B are merged into one true fusion breakpoint, whereby the gene fusion (gene fusion) detection result is finally obtained.
According to the present invention, a device and method for detecting gene fusions in FFPE samples can be provided that offer fast detection, low resource requirements and high stability. In the second and third alignment rounds of the existing algorithm, only one sequence is aligned at a time, which occupies system resources for a long time. Compared with the existing algorithm, the algorithm of the present invention makes full use of the advantages of PE sequencing and reduces the number of alignment rounds to only two. The first alignment filters out the fragments in which a fusion may occur (the reads containing soft-clipping information); the second alignment aligns all of these sequences at the same time, which improves the utilization of system resources. In addition, the algorithm of the present invention does not need to assemble sequences and therefore avoids the instability caused by assembly, thereby achieving gene fusion detection in FFPE samples.
Brief description of the drawings
Fig. 1 is a schematic diagram of the device for detecting gene fusions in FFPE samples of embodiment 1.
Fig. 2 is a schematic diagram of a prior-art device for detecting gene fusions.
Detailed description of the invention
The scientific and technical terms referred to in this specification have the same meanings as commonly understood by those skilled in the art; in case of conflict, the definitions in this specification prevail.
In general, the terms used in this specification have the following meanings.
Reference sequence (Refseq): the standard reference genome sequence of a species.
Fusion gene (Fusion gene): refers to the process in which all or part of the sequences of two genes fuse with each other into a new gene. It may be the result of chromosomal translocation, interstitial deletion or chromosomal inversion.
Reads: sequenced fragments of a genome or transcriptome.
PE sequencing: paired-end sequencing, a sequencing method.
read1/2: in the raw data of PE sequencing, read1 is the base sequence obtained in the first round of sequencing, and read2 is the base sequence obtained in the second round of sequencing.
bwa: an alignment software tool used to find the position of reads in the Refseq; it ultimately produces a file in bam format.
Adapter sequence: the adapter sequences on both sides of a DNA fragment during sequencing.
Breakpoint (breakpoint): the point at which the two gene sequences are joined to each other in a fusion gene.
soft-clipping reads: soft-clipped sequence fragments. After reads are aligned, if part of a read's sequence aligns to one position in the Refseq while the other part aligns to another position in the Refseq or cannot be aligned to the Refseq, the read is called a soft-clipping read.
flag: in a bam file, a value that describes information such as the alignment pattern and orientation of a sequence.
cigar: a compact alignment expression that represents the alignment result relative to the reference sequence using numbers followed by letters.
unmapped reads: reads that did not align to any position in the Refseq.
duplication: duplicate sequences, i.e. sequences produced by PCR amplification.
Fragment description information: qname, the description information of the aligned fragment (template).
Mismatch rate: during alignment, a certain number of differences between a read and the Refseq can be tolerated; the ratio of the number of differences to the read length is the mismatch rate.
Mapping quality value: represents the probability that a read is aligned to the wrong position; the higher the value, the lower this probability.
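The last two definitions can be made concrete with a short worked example in Python. The mismatch rate follows directly from the definition above. For the mapping quality value, the sketch uses the Phred scaling defined for the MAPQ field in the SAM/BAM format, MAPQ = -10·log10(probability that the mapping position is wrong); the function names are hypothetical and serve only this illustration.

```python
def mismatch_rate(n_mismatches, read_length):
    """Number of differences between read and reference per read base."""
    return n_mismatches / read_length

def mapq_to_error_probability(mapq):
    """Invert the Phred scaling MAPQ = -10 * log10(P(wrong position))."""
    return 10 ** (-mapq / 10)
```

For a 100 bp read with 3 mismatches the mismatch rate is 0.03, which is below the example threshold of 0.05, and a mapping quality value of 30 corresponds to a probability of about 0.001 that the read was placed at the wrong position.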
Embodiments
Embodiments are given below to describe the present invention more specifically, but the present invention is not limited to these embodiments.
Embodiment 1: The device of the present invention for detecting gene fusions in FFPE samples
The device of the present invention for detecting gene fusions in FFPE samples was used to detect the gene fusion status of a tissue FFPE sample from a female lung adenocarcinoma patient.
1.1 Extraction of DNA from the FFPE sample
DNA was extracted using the GeneRead DNA FFPE Kit (QIAGEN) according to the kit handbook, yielding the FFPE sample DNA.
1.2 Sample fragmentation
Fragmentation was carried out on a Bioruptor sonicator under the following conditions: 30 cycles of 30 s ON/30 s OFF. The FFPE sample DNA was sheared into fragments of about 200 bp, yielding the fragmented DNA.
1.3 End repair (End Repair)
(1) The required reagents were taken out in advance from the kit stored at -20 °C; the amounts for a single sample are given in Table 1.
Table 1
(2) End repair reaction: after the DNA sample was added, the 1.5 mL centrifuge tube was incubated at 20 °C for 30 minutes in a Thermomixer. After the reaction, the DNA in the reaction system was recovered and purified with 1.8× nucleic acid purification magnetic beads and dissolved in 32 μL EB.
1.4 Addition of "A" to the ends (A-Tailing)
(1) The required reagents were taken out in advance from the kit stored at -20 °C; the amounts for a single sample are given in Table 2:
Table 2
(2) A-tailing reaction: the 32 μL of DNA purified and recovered in the previous step was added to a 1.5 mL centrifuge tube and incubated at 37 °C for 30 minutes in a Thermomixer. The DNA in the reaction system was recovered and purified with 1.8× nucleic acid purification magnetic beads and dissolved in 18 μL EB.
1.5 Adapter ligation (Adapter Ligation)
(1) The required reagents were taken out in advance from the kit stored at -20 °C; the amounts for a single sample are given in Table 3:
Table 3
(2) Adapter ligation reaction: the 18 μL of DNA purified and recovered in the previous step was added to the sample tube and incubated at 20 °C for 15 minutes in a Thermomixer. The DNA in the reaction system was recovered and purified with 1.8× nucleic acid purification magnetic beads and dissolved in 30 μL EB.
1.6 PCR reaction
(1) The required reagents were taken out from the kit stored at -20 °C, and the PCR reaction system was prepared in a 2 mL PCR tube:
Table 4
(2) The PCR program was set; the program of the PCR reaction was as follows:
After the reaction, the samples were promptly removed and stored in a 4 °C refrigerator, and the instrument was exited or shut down as required.
(3) The DNA in the reaction system was recovered and purified with 0.9× nucleic acid purification magnetic beads, and the purified library was dissolved in 20 μL ddH2O. The library was quantified with Qubit and submitted for analysis on the Agilent 2100.
1.7 Hybridization of the library with the lung cancer target region capture chip
(1) In this experiment, the buffer that provides the ionic environment for the hybrid capture reaction, and the washing and rinsing solutions used to elute physically adsorbed or non-specifically hybridized material, were obtained commercially.
(2) Preparation of the hybridization library: the DNA library to be hybridized was thawed on ice and a total mass of 1 μg was taken (this DNA library is referred to as the sample library in the subsequent steps).
(3) Preparation of the Ann primer pool: 1000 pmol each of the tag primer In1 (100 μM) corresponding to the index of the sample library and the universal primer (1000 μM) were mixed (this mixture is referred to as the Ann primer pool in the subsequent steps).
(4) Preparation of the hybridization sample: 5 μL COT DNA (Human Cot-1 DNA, Life Technologies, 1 mg/mL), 1 μg sample library and the Ann primer pool were added to a 1.5 mL EP tube. The EP tube containing the prepared hybridization sample was sealed with sealing film, and the EP tube containing the sample library pool/COT DNA/Ann primer pool was placed in a vacuum device until completely dry.
(5) Dissolution of the hybridization sample: the following were added to the dry powder of sample library pool/COT DNA/Ann primer pool:
7.5 μL 2× hybridization buffer
3 μL hybridization component A
(6) After thorough mixing, the above mixture was denatured for 10 minutes on a heating block preheated to 95 °C.
(7) The above mixture was transferred to a 0.2 mL flat-cap PCR tube containing 4.5 μL of capture chip. After thorough vortexing for 3 seconds, the hybridization sample mixture was placed on a 47 °C heating block for 16 hours. The heated lid of the heating block must be set to 57 °C. After hybridization, the product must subsequently be washed and recovered.
(8) The 10× washing solutions (I, II and III), the 10× rinsing solution and the 2.5× magnetic bead washing solution were diluted to 1× working solutions.
Table 5
(9) The following reagents were preheated on a 47 °C heating block:
400 μL 1× rinsing solution
100 μL 1× washing solution I
1.8 Preparation of the affinity adsorption magnetic beads
(1) The streptavidin magnetic beads (Dynabeads M-280 Streptavidin, hereinafter referred to as magnetic beads) were equilibrated at room temperature for 30 minutes and then thoroughly mixed by vortexing for 15 seconds.
(2) 100 μL of magnetic beads was dispensed into a 1.5 mL centrifuge tube, and the tube containing the 100 μL of magnetic beads was placed on a magnetic rack. After about 5 minutes the supernatant was carefully aspirated and discarded, 1× magnetic bead washing solution was added at twice the initial bead volume, and the tube was vortexed for 10 seconds. The tube containing the beads was returned to the magnetic rack to adsorb the beads. Once the solution had cleared, the supernatant was aspirated and discarded. This step was repeated, for two washes in total.
(3) After washing, the magnetic bead washing solution was aspirated and discarded, and the beads were resuspended by vortexing in 1× magnetic bead washing solution at the initial bead volume and transferred to a 0.2 mL PCR tube. The PCR tube was placed on a magnetic rack, and once the beads had been adsorbed and the solution had cleared, the supernatant was aspirated and discarded.
1.9 Binding of the DNA to the affinity adsorption magnetic beads and rinsing
(1) The hybridized sample library was transferred to the 0.2 mL PCR tube containing the affinity adsorption magnetic beads and mixed by vortexing.
(2) The 0.2 mL PCR tube was placed on a 47 °C heating block for 45 minutes and vortexed once every 15 minutes so that the DNA bound to the magnetic beads.
(3) After the 45-minute incubation, 100 μL of 1× washing solution I preheated to 47 °C was added to the 15 μL of captured DNA sample and mixed by vortexing for 10 seconds. The entire contents of the 0.2 mL PCR tube were transferred to a 1.5 mL centrifuge tube. The 1.5 mL centrifuge tube was placed on a magnetic rack to adsorb the beads, and the supernatant was discarded.
(4) The 1.5 mL centrifuge tube was removed from the magnetic rack, and 200 μL of 1× rinsing solution preheated to 47 °C was added. The contents were mixed by pipetting up and down 10 times (this must be done quickly to prevent the temperature of the reagents and sample from falling below 47 °C). After mixing, the sample was placed on a 47 °C heating block for 5 minutes. This step was repeated, for two washes in total with 1× rinsing solution at 47 °C. The 1.5 mL centrifuge tube was placed on a magnetic rack to adsorb the beads, and the supernatant was discarded.
(5) 200 μL of room-temperature 1× washing solution I was added to the above 1.5 mL centrifuge tube and mixed by vortexing for 2 minutes. The tube was placed on a magnetic rack to adsorb the beads, and the supernatant was discarded. 200 μL of room-temperature 1× washing solution II was added to the above 1.5 mL centrifuge tube and mixed by vortexing for 1 minute. The tube was placed on a magnetic rack to adsorb the beads, and the supernatant was discarded. 200 μL of room-temperature 1× washing solution III was added to the above 1.5 mL centrifuge tube and mixed by vortexing for 30 seconds. The tube was placed on a magnetic rack to adsorb the beads, and the supernatant was discarded.
(6) The 1.5 mL centrifuge tube was removed from the magnetic rack, and 45 μL of PCR-grade water was added to dissolve and elute the sample captured on the magnetic beads.
1.10 PCR amplification of the captured DNA
(1) Prepare the post-capture PCR mix according to the table below, and vortex to mix after preparation. Enrichment primer F and enrichment primer R were purchased from Invitrogen Corp.
(2) The PCR amplification program for the bead-bound DNA is set as follows:
(3) Recovery and purification of the hybrid-capture PCR product: the DNA in the reaction system is recovered and purified with nucleic acid purification magnetic beads at a bead ratio of 0.9×, and the purified library is dissolved in 30 μL of ddH2O.
1.11 Library quantification
The library is analyzed on a 2100 Bioanalyzer (Agilent)/LabChip GX (Caliper) and by qPCR, and the library concentration is recorded.
1.12 Library sequencing
The constructed library is sequenced on a NextSeq 550AR (PE100).
1.13 Data processing and analysis
The sequencing results obtained in 1.12 are processed and analyzed using the device for detecting gene fusions in FFPE samples of the present invention.
The device for detecting gene fusions in FFPE samples of the present invention comprises:
Sequencing data acquisition module: used for obtaining the sequencing data produced by capture sequencing of the lung cancer FFPE sample to be tested with a lung cancer target-region capture chip.
Alignment module: connected with the sequencing data acquisition module and used for aligning the acquired sequencing data to a reference sequence to obtain an alignment result. The alignment result includes the position of each sequencing read in the reference sequence. The position information includes soft-clipping information and successful-alignment information. The part of a sequencing read that carries soft-clipping information is the soft-clipped part of that read, and the part that carries successful-alignment information is the successfully aligned part of that read. The module uses bwa software to locate the position of each sequencing read in the reference sequence and generates a bam format file. The bam file includes, for each sequencing read, the description information (qname), the sequence information (seq), the alignment position (POS), the bitwise flag (flag), the mapping quality value (MAPQ), the compact alignment description (Cigar) and the template length (Tlen).
Re-alignment module: connected with the alignment module and used for re-aligning the sequencing reads that carry soft-clipping information to the reference genome to obtain a re-alignment result.
The re-alignment module includes the following submodules:
Length filtering submodule: connected with the alignment module and used for filtering out, from the sequencing reads that contain soft-clipping (soft-clipping) information, the reads whose length is less than 20 bp.
Breakpoint determination submodule: connected with the length filtering submodule and used for taking, on the basis of the output of the length filtering submodule, the junction between the part of a sequencing read carrying soft-clipping information and the part carrying normal alignment information as the breakpoint.
Separation submodule: connected with the breakpoint determination submodule and used for separating the part carrying soft-clipping information from the part carrying normal alignment information at the breakpoint, and saving the sequence information of the two parts in two separate fastq files.
Re-alignment submodule: connected with the separation submodule and used for re-aligning the two files in which the sequence information has been saved to the reference sequence to obtain the re-alignment result. The two fastq files are re-aligned with bwa software to generate bam format files. The re-alignment result includes, for each sequencing read, the description information (qname), the sequence information (seq), the alignment position (POS), the bitwise flag (flag), the mapping quality value (MAPQ), the compact alignment description (Cigar) and the template length (Tlen).
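The length filtering, breakpoint determination and separation steps of the re-alignment module can be illustrated with a minimal Python sketch. It is not the implementation of the invention. The function names `split_read` and `fastq_record` are hypothetical, the 20 bp threshold is applied to the soft-clipped segment (one possible reading of the length filter), and reads soft-clipped at both ends are simply skipped.

```python
import re
from typing import Optional, Tuple

MIN_CLIP_LEN = 20  # threshold from the text; applying it to the clipped segment is an assumption
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def fastq_record(name: str, seq: str, qual: str) -> str:
    """Format one four-line FASTQ record."""
    return f"@{name}\n{seq}\n+\n{qual}\n"


def split_read(name: str, seq: str, qual: str, cigar: str,
               min_clip: int = MIN_CLIP_LEN) -> Optional[Tuple[str, str]]:
    """Split a soft-clipped read at its breakpoint.

    Returns (aligned_part, clipped_part) as FASTQ records, or None when the
    read has no soft clip, is clipped at both ends, or the clip is too short.
    """
    # Hard-clipped bases (H) are absent from SEQ, so they are ignored here.
    ops = [(int(n), op) for n, op in _CIGAR_OP.findall(cigar) if op != "H"]
    if not ops:
        return None
    lead = ops[0][0] if ops[0][1] == "S" else 0
    trail = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    if (lead > 0) == (trail > 0):
        return None  # not clipped, or clipped at both ends (not handled in this sketch)
    if max(lead, trail) < min_clip:
        return None  # length filter
    # The breakpoint is the junction between the clipped and the aligned part.
    cut = lead if lead else len(seq) - trail
    head = fastq_record(name, seq[:cut], qual[:cut])
    tail = fastq_record(name, seq[cut:], qual[cut:])
    return (tail, head) if lead else (head, tail)
```

In use, the first element of each returned pair would be appended to one fastq file and the second element to the other, so that the two files can be re-aligned separately.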
True fusion breakpoint judgment module: connected with the re-alignment module and used for determining the fusion breakpoints of the sequencing reads.
The true fusion breakpoint judgment module includes the following submodules:
Filtering submodule: connected with the re-alignment submodule and used for filtering out, according to the bitwise flag (flag) value, the sequencing reads that failed to align (unmapped), as well as the sequencing reads with a low mapping quality value (MAPQ);
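As a rough illustration of this filter, the following Python sketch works on SAM-format text lines (the text form of a bam record). The function name `keep_alignment` and the default MAPQ cutoff are assumptions, since the text only says that low-MAPQ reads are removed; the unmapped bit 0x4 and the column order come from the SAM format.

```python
UNMAPPED_FLAG = 0x4   # SAM flag bit meaning "segment unmapped"
MIN_MAPQ = 30         # hypothetical cutoff; the text only says "low" MAPQ


def keep_alignment(sam_line: str, min_mapq: int = MIN_MAPQ) -> bool:
    """Return True if a SAM record is mapped and has MAPQ >= min_mapq."""
    if sam_line.startswith("@"):
        return False  # header line, not an alignment record
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])   # column 2: bitwise flag
    mapq = int(fields[4])   # column 5: mapping quality
    if flag & UNMAPPED_FLAG:
        return False
    return mapq >= min_mapq
```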
Breakpoint information acquisition submodule: connected with the filtering submodule and used for finding the sequencing reads that share the same read description information and obtaining the breakpoint information. The breakpoint information includes:
(1) left_chr: the chromosome number of the sequence on the left side of the breakpoint, that is, the reference sequence number corresponding to read1.
(2) left_pos: the alignment position of the first base on the left side of the breakpoint, that is, the alignment position corresponding to read1 plus the sequence length of read1.
(3) left_seq: the sequence of the bases on the left side of the breakpoint.
(4) right_chr: the chromosome number of the sequence on the right side of the breakpoint, that is, the reference sequence number corresponding to read2.
(5) right_pos: the alignment position of the first base on the right side of the breakpoint, that is, the alignment position corresponding to read2 plus the sequence length of read2.
(6) right_seq: the sequence of the bases on the right side of the breakpoint.
(7) sup: the breakpoint support, that is, the number of sequencing reads supporting the breakpoint; the default value is 1.
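The seven fields can be pictured as one record built from the two re-aligned segments of a read. The Python sketch below follows the definitions above literally; the `Segment` container and the function `make_breakpoint` are hypothetical, and strand orientation and 0-based versus 1-based coordinates are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical container for one re-aligned part of a read."""
    qname: str   # read description information
    chrom: str   # reference sequence the segment aligned to
    pos: int     # alignment position (POS)
    seq: str     # sequence of the segment


@dataclass
class Breakpoint:
    left_chr: str
    left_pos: int
    left_seq: str
    right_chr: str
    right_pos: int
    right_seq: str
    sup: int = 1   # default support, as stated in the text


def make_breakpoint(read1: Segment, read2: Segment) -> Breakpoint:
    """Combine two segments with the same qname into one breakpoint record."""
    if read1.qname != read2.qname:
        raise ValueError("segments must come from the same sequencing read")
    return Breakpoint(
        left_chr=read1.chrom,
        left_pos=read1.pos + len(read1.seq),    # alignment position plus length of read1
        left_seq=read1.seq,
        right_chr=read2.chrom,
        right_pos=read2.pos + len(read2.seq),   # alignment position plus length of read2
        right_seq=read2.seq,
    )
```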
Fusion breakpoint screening submodule: connected with the breakpoint information acquisition submodule and used for screening fusion breakpoints from the breakpoint information.
The fusion breakpoint screening submodule includes the following elements:
Breakpoint quality filtering element: used for filtering out low-quality breakpoints. If, for a breakpoint A, sup is greater than 5, the mapping quality values of left_seq and right_seq are both greater than 30, and the mismatch rates are both less than 0.05, then breakpoint A is judged to be a fusion breakpoint.
Identical breakpoint merging element: used for merging identical breakpoints. If, for two breakpoints A and B, left_chr in A equals right_chr in B, right_chr in A equals left_chr in B, left_pos in A equals right_pos in B, and right_pos in A equals left_pos in B, then A and B are two forms of the same breakpoint, and breakpoint A and breakpoint B are merged into one fusion breakpoint.
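The two elements of the screening submodule can be sketched in Python as follows. The thresholds (sup greater than 5, mapping quality greater than 30, mismatch rate less than 0.05) and the mirror-image comparison are taken from the text. The record layout, the function names, and the choice to add the supports of two merged breakpoints are assumptions made only for this illustration.

```python
from typing import List, NamedTuple


class Breakpoint(NamedTuple):
    """Hypothetical record layout for one candidate breakpoint."""
    left_chr: str
    left_pos: int
    right_chr: str
    right_pos: int
    sup: int
    left_mapq: int
    right_mapq: int
    left_mismatch: float
    right_mismatch: float


def passes_quality(bp: Breakpoint) -> bool:
    """Apply the three quality criteria given in the text."""
    return (bp.sup > 5
            and bp.left_mapq > 30 and bp.right_mapq > 30
            and bp.left_mismatch < 0.05 and bp.right_mismatch < 0.05)


def is_same_breakpoint(a: Breakpoint, b: Breakpoint) -> bool:
    """True if b is the mirror-image form of a (left and right sides swapped)."""
    return (a.left_chr == b.right_chr and a.right_chr == b.left_chr
            and a.left_pos == b.right_pos and a.right_pos == b.left_pos)


def merge_mirrored(bps: List[Breakpoint]) -> List[Breakpoint]:
    """Merge mirror-image breakpoints, keeping the first form encountered."""
    merged: List[Breakpoint] = []
    for bp in bps:
        for i, kept in enumerate(merged):
            if is_same_breakpoint(kept, bp):
                # Adding the supports is an assumption; the text only says "merged".
                merged[i] = kept._replace(sup=kept.sup + bp.sup)
                break
        else:
            merged.append(bp)
    return merged
```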
First fusion breakpoint merging submodule: connected with the fusion breakpoint screening submodule and used for merging breakpoints with identical breakpoint information (left_chr, left_pos, right_chr and right_pos all the same) into one true fusion breakpoint, and taking the number of breakpoints with identical breakpoint information as the support of the true fusion breakpoint.
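This first merge amounts to counting how many breakpoint records share the same four coordinates. A minimal Python sketch, assuming each breakpoint is represented as a (left_chr, left_pos, right_chr, right_pos) tuple and using the hypothetical function name `merge_identical`:

```python
from collections import Counter
from typing import List, Tuple

Key = Tuple[str, int, str, int]  # (left_chr, left_pos, right_chr, right_pos)


def merge_identical(breakpoints: List[Key]) -> List[Tuple[Key, int]]:
    """Collapse identical breakpoints; the count becomes the support."""
    counts = Counter(breakpoints)
    # Counter keeps first-seen order, so the output order follows the input.
    return list(counts.items())
```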
Second fusion breakpoint merging submodule: connected with the first fusion breakpoint merging submodule and used for merging fusion breakpoints that have the same left_chr and right_chr, but whose right_pos or left_pos differ by no more than 5 bp, into one true fusion breakpoint. On the basis of the above fusion breakpoint information, if the difference between right_pos in fusion breakpoint A and right_pos in fusion breakpoint B is less than 5, and the difference between left_pos in fusion breakpoint A and left_pos in fusion breakpoint B is less than 5, then fusion breakpoint A and fusion breakpoint B are merged into one gene fusion breakpoint (gene fusion), thereby giving the final gene fusion detection result.
Output module: connected with the true fusion breakpoint judgment module and used for outputting the gene fusion detection result.
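The coordinate-tolerant merge performed before output, in which breakpoints on the same pair of chromosomes whose left_pos and right_pos each differ by less than 5 are combined, can be sketched in Python as below. The tuple layout, the function name `merge_nearby`, the choice of the best-supported breakpoint as the representative, and the summing of supports are all assumptions for illustration.

```python
from typing import List, Tuple

# (left_chr, left_pos, right_chr, right_pos, sup)
Fusion = Tuple[str, int, str, int, int]


def merge_nearby(bps: List[Fusion], tol: int = 5) -> List[Fusion]:
    """Greedily merge breakpoints whose positions differ by less than tol."""
    merged: List[Fusion] = []
    # Visit well-supported breakpoints first so they become the representatives.
    for bp in sorted(bps, key=lambda b: -b[4]):
        for i, rep in enumerate(merged):
            same_chroms = rep[0] == bp[0] and rep[2] == bp[2]
            close = abs(rep[1] - bp[1]) < tol and abs(rep[3] - bp[3]) < tol
            if same_chroms and close:
                # Keep the representative's coordinates, add the supports.
                merged[i] = rep[:4] + (rep[4] + bp[4],)
                break
        else:
            merged.append(bp)
    return merged
```

Because the comparison is made against the representative rather than against every member of a cluster, a chain of breakpoints each 4 bp apart is not collapsed into a single call in this sketch.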
The detection results are shown in the table below.
1.14 Result verification
The remaining FFPE sample from the same patient was verified by qPCR to detect whether an EML4-ALK fusion had occurred. RNA was first extracted from the remaining FFPE sample; for the specific steps, refer to the protocol of the Qiagen FFPE RNA extraction kit (MagMAX™ FFPE DNA/RNA Ultra Kit). The result shows that EML4 is fused with ALK, which is consistent with the detection result in 1.13. The detection device of the present invention can therefore successfully detect gene fusions in FFPE samples.
Industrial applicability
According to the present invention, a device and method for detecting gene fusions in FFPE samples are provided that offer fast detection, low resource requirements and high stability.