CN109712672A - Method, apparatus, storage medium and processor for detecting gene rearrangement - Google Patents
- Publication number
- CN109712672A (application number CN201811643484.6A)
- Authority
- CN
- China
- Prior art keywords
- sequence
- breakpoint
- compared
- candidate point
- candidate
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
The present invention provides a method, an apparatus, a storage medium and a processor for detecting gene rearrangement. The method comprises: obtaining the sequences to be aligned of a sample to be tested; aligning the sequences to be aligned against a reference genome to obtain abnormally aligned sequences, which include sequences aligned at abnormal positions, sequences aligned in abnormal orientations, and sequences that do not align to the reference genome; determining the positions of candidate breakpoints from the alignment positions and alignment orientations of the abnormally aligned sequences on the reference genome; and assembling those sequences to be aligned that support the candidate breakpoint positions, retaining the breakpoints in the assembly result whose sequence information is consistent with the candidate breakpoint positions, and recording them as the breakpoints of the gene rearrangement. The application solves the problem that the prior art has difficulty detecting the breakpoint positions at which a gene rearrangement occurs.
Description
Technical field
The present invention relates to the field of gene mutation detection, and in particular to a method, an apparatus, a storage medium and a processor for detecting gene rearrangement.
Background art
The prior art generally detects gene rearrangement by RT-nested PCR, in which specific probes are prepared from a known target gene sequence and used to detect the rearrangement. A nested PCR reaction comprises two rounds of PCR amplification. This reduces the likelihood of amplifying multiple target sites (because few sequences are complementary to both sets of primers) and so increases the sensitivity of detection; and because two primer pairs must pair with the template, the reliability of detection also increases. Since the second set of primers lies inside the first-round PCR product, and a non-target fragment is very unlikely to contain binding sites for both sets of primers, the second set of primers cannot amplify non-target fragments. Nested amplification therefore ensures that the second-round PCR product is almost or entirely free of contamination from the nonspecific amplification that weakly specific primer pairing would cause.
However, detecting gene rearrangement by RT-nested PCR has the following disadvantages: 1) the structure of the gene rearrangement cannot be determined accurately; 2) the method is limited by its primers and probes and cannot detect unknown rearrangements; 3) the detailed sequence of the junction region where the rearranged gene is broken and rejoined cannot be obtained.
Therefore, the existing detection method needs to be improved.
Summary of the invention
The main object of the present invention is to provide a method, an apparatus, a storage medium and a processor for detecting gene rearrangement, so as to solve the problem that the prior art has difficulty detecting the breakpoint positions at which a gene rearrangement occurs.
To achieve the above object, according to one aspect of the present invention, a method for detecting gene rearrangement is provided. The method comprises: obtaining the sequences to be aligned of a sample to be tested; aligning the sequences to be aligned against a reference genome to obtain abnormally aligned sequences, which include sequences aligned at abnormal positions, sequences aligned in abnormal orientations, and sequences that do not align to the reference genome; determining the positions of candidate breakpoints from the alignment positions and alignment orientations of the abnormally aligned sequences on the reference genome; and assembling those sequences to be aligned that support the candidate breakpoint positions, retaining the breakpoints in the assembly result whose sequence information is consistent with the candidate breakpoint positions, and recording them as the breakpoints of the gene rearrangement.
Further, determining the positions of candidate breakpoints from the alignment positions and alignment orientations of the abnormally aligned sequences on the reference genome comprises: splitting the abnormally aligned sequences and realigning the segments to the reference genome, and determining the positions of the candidate breakpoints from the alignment positions and alignment orientations of the split sequences on the reference genome.
Further, determining the positions of candidate breakpoints from the alignment positions and alignment orientations of the abnormally aligned sequences on the reference genome comprises: splitting the abnormally aligned sequences and realigning the segments to the reference genome; recording the sequences that span a potential breakpoint by at least a first length on both sides as first marker sequences, and the sequences that span both sides of a potential breakpoint but by less than a second length as second marker sequences; simulating, from the position of the potential breakpoint on the first marker sequences, a breakpoint reference sequence in which the gene rearrangement has occurred; aligning the sequences to be aligned against the breakpoint reference sequence, and marking the sequences that align to the breakpoint reference sequence and span the breakpoint on it as breakpoint candidate sequences supporting the breakpoint; and determining the position of the breakpoint on the breakpoint candidate sequences as the position of the candidate breakpoint.
Further, determining the position of the breakpoint on the breakpoint candidate sequences as the position of the candidate breakpoint comprises: correcting the breakpoint candidate sequences according to sequencing quality and the number of supporting sequences to obtain corrected candidate breakpoint sequences; and determining the position of the breakpoint in the corrected candidate breakpoint sequences as the position of the candidate breakpoint.
Further, determining the position of the breakpoint on the breakpoint candidate sequences as the position of the candidate breakpoint comprises: filtering out false-positive breakpoint sequences from the breakpoint candidate sequences according to the first marker sequences and second marker sequences that support the breakpoint on the breakpoint reference sequence, and the paired sequences among the sequences to be aligned that support spanning the breakpoint on the breakpoint reference sequence, to obtain filtered candidate breakpoint sequences; and determining the position of the breakpoint in the filtered candidate breakpoint sequences as the position of the candidate breakpoint.
Further, assembling the sequences to be aligned that support the candidate breakpoint positions comprises: assembling the first marker sequences and second marker sequences that support the breakpoint on the breakpoint reference sequence, together with the paired sequences among the sequences to be aligned that support spanning the breakpoint on the breakpoint reference sequence; and retaining the breakpoints in the assembly result whose sequence information is consistent with the candidate breakpoint positions, recorded as the breakpoints of the gene rearrangement.
Further, obtaining the sequences to be aligned of the sample to be tested comprises: constructing a sequencing library of the sample to be tested; performing high-throughput sequencing on the sequencing library to obtain sequencing data; and preprocessing the sequencing data to obtain the sequences to be aligned of the sample to be tested.
Further, the sequencing library is a hybrid capture library, preferably obtained with the capture probes of SEQ ID NO:1 to SEQ ID NO:36.
Further, after the breakpoint of the gene rearrangement is obtained, the method further comprises a step of quantifying the rearranged gene. The quantification step comprises: counting, according to the sequence information of the breakpoint of the gene rearrangement, the number of sequences to be aligned that support the breakpoint, recorded as the marker sequence number; and dividing the marker sequence number by the sequence number of a reference gene, the resulting ratio being the expression abundance of the rearranged gene relative to the reference gene.
To achieve the above object, according to a second aspect of the present invention, an apparatus for detecting gene rearrangement is provided. The apparatus is used to store or run modules, or the modules are components of the apparatus. The modules are software modules, of which there are one or more, and the software modules are used to execute any of the above methods for detecting gene rearrangement.
According to a third aspect of the present invention, a storage medium is provided. The storage medium comprises a stored program, wherein the program executes any of the above methods for detecting gene rearrangement.
According to a fourth aspect of the present invention, a processor is provided. The processor is used to run a program, wherein the program, when run, executes any of the above methods for detecting gene rearrangement.
With the technical solution of the present invention, the position at which a gene rearrangement occurs is detected from high-throughput sequencing data. The sequences to be aligned that align abnormally against the reference genome are used to determine the candidate breakpoint positions of the rearrangement, and the reliable candidate breakpoint positions are then further verified with sequences assembled from the sequences to be aligned. The breakpoint position of the gene rearrangement can thus be detected accurately, and the sequence information at the breakpoint is accurately known as well, which provides a basis for further verifying the breakpoint position by conventional PCR. The method of the present application can therefore detect both known and unknown rearrangements, and can accurately detect the specific position at which a rearrangement occurs together with the corresponding sequence information. The method uses NGS sequencing data directly and was developed from statistics and algorithms, so it adds no extra experimental testing cost. In addition, the method has high detection accuracy and low cost, and is suitable for detecting structural rearrangements of low-abundance genes.
Brief description of the drawings
The accompanying drawings, which form a part of this application, are provided for further understanding of the present invention. The illustrative embodiments of the present invention and their descriptions are used to explain the present invention and do not unduly limit it. In the drawings:
Fig. 1 shows a simplified flowchart of detecting the breakpoint position of a gene rearrangement in a preferred embodiment according to the present invention;
Fig. 2 shows a detailed flowchart of detecting the breakpoint position of a gene rearrangement in another preferred embodiment according to the present invention; and
Fig. 3 and Fig. 4 show the sequencing results obtained when the breakpoint position detected by the method of Embodiment 1 of the present invention was verified by first-generation sequencing of PCR products, where Fig. 3 shows the sequencing result for the forward primer and Fig. 4 shows the sequencing result for the reverse primer.
Detailed description of the embodiments
It should be noted that, provided they do not conflict, the embodiments of the present application and the features in those embodiments may be combined with one another. The present invention is described in detail below with reference to the embodiments.
As mentioned in the background art, when the prior art detects a rearranged gene it can only determine that a rearrangement has occurred and cannot accurately determine the position at which it occurred. To improve on this, a typical embodiment of the present application provides a method for detecting gene rearrangement. The method comprises: obtaining the sequences to be aligned of a sample to be tested; aligning the sequences to be aligned against a reference genome to obtain abnormally aligned sequences, which include sequences aligned at abnormal positions, sequences aligned in abnormal orientations, and sequences that do not align to the reference genome; determining the positions of candidate breakpoints from the alignment positions and alignment orientations of the abnormally aligned sequences on the reference genome; and assembling those sequences to be aligned that support the candidate breakpoint positions, retaining the breakpoints in the assembly result whose sequence information is consistent with the candidate breakpoint positions, and recording them as the breakpoints of the gene rearrangement.
The above method for detecting gene rearrangement detects the position at which a gene rearrangement occurs from high-throughput sequencing data. The sequences to be aligned that align abnormally against the reference genome are used to determine the candidate breakpoint positions of the rearrangement, and the reliable candidate breakpoint positions are then further verified with sequences assembled from the sequences to be aligned. The breakpoint position of the gene rearrangement can thus be detected accurately, and the sequence information at the breakpoint is accurately known as well, which provides a basis for further verifying the breakpoint position by conventional PCR. The method of the present application can therefore detect both known and unknown rearrangements, and can accurately detect the specific position at which a rearrangement occurs together with the corresponding sequence information. The method uses NGS sequencing data directly and was developed from statistics and algorithms, so it adds no extra experimental testing cost. In addition, the method has high detection accuracy and low cost, and is suitable for detecting structural rearrangements of low-abundance genes.
It should be noted that the sequences to be aligned of the sample to be tested may be produced by processing the raw sequencing data of the sample, or may be existing, ready-made sequences that can be used directly for alignment. The above method adds a step that verifies the candidate breakpoint positions against assembled sequences, which makes the breakpoint positions more accurate.
Among the sequences to be aligned, some align normally to the reference genome, while others cannot be aligned directly to the reference genome because a gene rearrangement has occurred; the latter are called abnormally aligned sequences. Abnormally aligned sequences include sequences aligned at abnormal positions (for example, forward tandem repeats), sequences aligned in abnormal orientations (for example, inverted tandem repeats), and sequences that do not align to the reference genome (for example, inserted or deleted sequences). From the alignment positions and alignment orientations of these abnormally aligned sequences on the reference genome, the potential breakpoint positions can be determined by existing methods. For example, when sequences align to abnormal positions on the same chromosome and an inversion has occurred between them, the potential breakpoint position is determined from the abnormal alignment orientation. Alternatively, when sequences align to positions on different chromosomes, a translocation has occurred, and the potential breakpoint position can likewise be determined from the alignment orientation.
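The three classes of abnormally aligned sequences described above can be sketched for paired-end reads as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Alignment` record, its field names, and the `max_insert` distance threshold are hypothetical simplifications of what an alignment file would provide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alignment:
    """Hypothetical, simplified view of one read and its mate (not a real BAM record)."""
    chrom: Optional[str]       # None if the read did not align
    pos: int                   # 1-based leftmost aligned position
    forward: bool              # True if aligned to the forward strand
    mate_chrom: Optional[str]  # None if the mate did not align
    mate_pos: int
    mate_forward: bool


def classify(aln: Alignment, max_insert: int = 1000) -> str:
    """Assign a read pair to one of the abnormal classes, or 'normal'.

    max_insert is an assumed upper bound on the normal distance between mates.
    """
    if aln.chrom is None or aln.mate_chrom is None:
        return "unaligned"
    if aln.chrom != aln.mate_chrom:
        return "abnormal_position"     # mates on different chromosomes
    if aln.forward == aln.mate_forward:
        return "abnormal_orientation"  # mates on the same strand
    if abs(aln.mate_pos - aln.pos) > max_insert:
        return "abnormal_position"     # mates too far apart on one chromosome
    return "normal"
```

A production pipeline would read these fields from the aligner output and would also need to handle cases this sketch ignores, such as outward-facing pairs.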
In certain preferred embodiments, determining the positions of candidate breakpoints from the alignment positions and alignment orientations of the abnormally aligned sequences on the reference genome comprises: splitting the abnormally aligned sequences and realigning the segments to the reference genome, and determining the positions of the candidate breakpoints from the alignment positions and alignment orientations of the split sequences on the reference genome.
Specifically, existing software for split-sequence alignment includes bwa, hisat2 and STAR. These tools use a relatively relaxed alignment strategy and align each segment of a split sequence to its possible positions on the reference genome, so that the final alignment position and alignment orientation can be determined.
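As an illustration of how a split alignment points to a potential breakpoint, the sketch below reads the clipped ends of a SAM-style CIGAR string and reports the reference coordinates next to each clip. The function name and the choice to report the first or last aligned base are assumptions made for this example; the patent does not specify how the aligner output is parsed.

```python
import re
from typing import List

# SAM CIGAR operations: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that consume reference bases according to the SAM specification.
REF_CONSUMING = set("MDN=X")


def clip_breakpoints(pos: int, cigar: str) -> List[int]:
    """Return reference coordinates adjacent to clipped read ends.

    pos is the 1-based leftmost aligned position of the read.
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    breakpoints = []
    if ops and ops[0][1] in "SH":
        # Left end clipped: the breakpoint lies at the first aligned base.
        breakpoints.append(pos)
    ref_len = sum(n for n, op in ops if op in REF_CONSUMING)
    if len(ops) > 1 and ops[-1][1] in "SH":
        # Right end clipped: the breakpoint lies at the last aligned base.
        breakpoints.append(pos + ref_len - 1)
    return breakpoints
```

For a read aligned at position 1000 with CIGAR `60M40S`, the last 40 bases did not align there, so the potential breakpoint sits at the last aligned reference base.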
In some preferred embodiments, determining the positions of candidate breakpoints from the alignment positions and alignment orientations of the abnormally aligned sequences on the reference genome comprises: splitting the abnormally aligned sequences and realigning the segments to the reference genome; recording the sequences that span a potential breakpoint by at least a first length on both sides as first marker sequences, and the sequences that span both sides of a potential breakpoint but by less than a second length as second marker sequences; simulating, from the position of the potential breakpoint on the first marker sequences, a breakpoint reference sequence in which the gene rearrangement has occurred; aligning the sequences to be aligned against the breakpoint reference sequence, and marking the sequences that align to the breakpoint reference sequence and span the breakpoint on it as breakpoint candidate sequences supporting the breakpoint; and determining the position of the breakpoint on the breakpoint candidate sequences as the position of the candidate breakpoint.
Paired-end sequencing data contain sequencing reads in both directions. Considering a single read, if the read is split into two or three segments that are realigned to the reference genome, and the segments align to different positions or in different orientations on the reference genome, the potential breakpoint position of the gene rearrangement can be inferred from the position at which the read was split. Distinguishing first marker sequences from second marker sequences, using them to build simulated breakpoint reference sequences, and then realigning helps to obtain more potential breakpoint-spanning sequences, as well as normally aligned paired sequences that support spanning the breakpoint. The sequences that support the breakpoint position on the breakpoint reference sequence are then applied to the candidate breakpoint sequences, so that the screened candidate breakpoints are relatively accurate. The first length, by which a first marker sequence spans both sides of the potential breakpoint, can reasonably be set to 20 to 25 bp depending on sequence length. For the second marker sequences, the second length can reasonably be set to 10 to 20 bp depending on sequence length.
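The marker classification and the simulated breakpoint reference can be sketched as below. This is one hedged reading of the text, not the patented implementation: the default thresholds (20 bp for both lengths, each within the stated ranges), the `unclassified` label for flanks that fall between the two lengths, the `flank` size, and the omission of strand handling (no reverse-complementing for inversions) are all assumptions of this example.

```python
from typing import Tuple


def classify_marker(left_flank: int, right_flank: int,
                    first_len: int = 20, second_len: int = 20) -> str:
    """Classify a read by how many bases it covers on each side of a breakpoint."""
    shorter = min(left_flank, right_flank)
    if shorter <= 0:
        return "not_spanning"   # the read does not cross the breakpoint
    if shorter >= first_len:
        return "first_marker"   # well anchored on both sides
    if shorter < second_len:
        return "second_marker"  # crosses the breakpoint with a short anchor
    return "unclassified"       # only reachable when second_len < first_len


def simulate_breakpoint_reference(left_ref: str, left_bp: int,
                                  right_ref: str, right_bp: int,
                                  flank: int = 100) -> Tuple[str, int]:
    """Join the sequence upstream of one breakpoint to the sequence downstream of another.

    left_bp is the 1-based last base kept from left_ref; right_bp is the 1-based
    first base kept from right_ref. Returns the simulated sequence and the
    0-based offset of the junction within it.
    """
    left = left_ref[max(0, left_bp - flank):left_bp]
    right = right_ref[right_bp - 1:right_bp - 1 + flank]
    return left + right, len(left)
```

The sequences to be aligned would then be realigned against the returned sequence, and reads covering the returned junction offset would be counted as breakpoint candidate sequences.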
To further improve the accuracy of the breakpoint positions, the above candidate breakpoints can additionally be corrected and filtered for false positives according to the sequencing depth of the sample's sequencing data and the sequencing strategy, so that the more reliable breakpoint positions are retained.
In certain preferred embodiments, determining the position of the breakpoint on the breakpoint candidate sequences as the position of the candidate breakpoint comprises: correcting the breakpoint candidate sequences according to sequencing quality and the number of supporting sequences to obtain corrected candidate breakpoint sequences; and determining the position of the breakpoint in the corrected candidate breakpoint sequences as the position of the candidate breakpoint.
Specifically, for example, when the mean sequencing depth reaches 1000×, breakpoint base correction can be performed if the sequences spanning the breakpoint reach 2% or more of the mean depth, that is, 20× or more. The breakpoint is corrected according to the alignment position relationship at the breakpoint of the simulated reference sequence and the base quality of the aligned sequences. Breakpoints supported by fewer than 20× spanning sequences have a high false-positive rate and are usually removed.
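The depth threshold in this example is simple arithmetic (2% of 1000× is 20×), sketched below together with one possible base-correction rule. The quality-weighted vote in `consensus_base` is an assumption of this example; the text says only that correction uses alignment position and base quality, without giving the rule.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def min_supporting_depth(mean_depth: float, fraction: float = 0.02) -> float:
    """Minimum breakpoint-spanning depth, as a fraction of the mean depth."""
    return mean_depth * fraction


def keep_for_correction(spanning_reads: int, mean_depth: float,
                        fraction: float = 0.02) -> bool:
    """True if enough reads span the breakpoint to attempt base correction."""
    return spanning_reads >= min_supporting_depth(mean_depth, fraction)


def consensus_base(observations: Iterable[Tuple[str, int]]) -> str:
    """Pick the base with the highest summed Phred quality (assumed rule).

    observations holds one (base, quality) pair per read covering the position.
    """
    weight = defaultdict(int)
    for base, quality in observations:
        weight[base] += quality
    return max(weight, key=weight.get)
```

With a mean depth of 1000×, a breakpoint spanned by 25 reads would be kept for correction, while one spanned by 12 reads would be treated as a likely false positive.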
In certain preferred embodiments, determining the position of the breakpoint on the breakpoint candidate sequences as the position of the candidate breakpoint comprises: filtering out false-positive breakpoint sequences from the breakpoint candidate sequences according to the first marker sequences and second marker sequences that support the breakpoint on the breakpoint reference sequence, and the paired sequences among the sequences to be aligned that support spanning the breakpoint on the breakpoint reference sequence, to obtain filtered candidate breakpoint sequences; and determining the position of the breakpoint in the filtered candidate breakpoint sequences as the position of the candidate breakpoint.
Specifically, for example, breakpoints are retained for which the number of first marker sequences is greater than 10 and the number of paired sequences among the sequences to be aligned that support spanning the breakpoint on the breakpoint reference sequence is greater than 50. These specific values may of course be adjusted for different sequencing samples and are given here only as examples.
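The example thresholds above translate directly into a filter. In this minimal sketch the `BreakpointSupport` record and its field names are hypothetical; only the two cut-offs (more than 10 first marker sequences, more than 50 supporting paired sequences) come from the text, and both are exposed as parameters because the text says they should be tuned per sample.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BreakpointSupport:
    """Hypothetical per-breakpoint support counts."""
    name: str
    first_marker_reads: int
    second_marker_reads: int
    paired_reads: int


def filter_candidates(candidates: List[BreakpointSupport],
                      min_first: int = 10,
                      min_pairs: int = 50) -> List[BreakpointSupport]:
    """Keep breakpoints whose support strictly exceeds both thresholds."""
    return [c for c in candidates
            if c.first_marker_reads > min_first and c.paired_reads > min_pairs]
```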
In certain preferred embodiments, assembling the sequences to be aligned that support the candidate breakpoint positions comprises: assembling the first marker sequences and second marker sequences that support the breakpoint on the breakpoint reference sequence, together with the paired sequences among the sequences to be aligned that support spanning the breakpoint on the breakpoint reference sequence; and retaining the breakpoints in the assembly result whose sequence information is consistent with the candidate breakpoint positions, recorded as the breakpoints of the gene rearrangement.
The first marker sequences, the second marker sequences and the paired sequences supporting the breakpoint are assembled together, and the sequence produced by this de novo assembly is used to verify the candidate breakpoint position once more, so that the breakpoint position of the gene rearrangement that is finally determined is more accurate.
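The verification step can be illustrated with a consistency check between assembled contigs and a candidate junction. The sketch below does not perform the assembly itself (a dedicated assembler would produce the contigs); it only tests whether the bases around the candidate junction, or their reverse complement, occur in any contig. The function names and the window size `k` are assumptions of this example.

```python
from typing import Iterable

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def junction_of(bp_reference: str, junction_offset: int, k: int = 10) -> str:
    """Take up to k bases on each side of the junction in the breakpoint reference."""
    return bp_reference[max(0, junction_offset - k):junction_offset + k]


def confirmed_by_assembly(contigs: Iterable[str], bp_reference: str,
                          junction_offset: int, k: int = 10) -> bool:
    """True if any contig contains the junction sequence on either strand."""
    junction = junction_of(bp_reference, junction_offset, k)
    junction_rc = reverse_complement(junction)
    return any(junction in contig or junction_rc in contig for contig in contigs)
```

An exact substring match is the strictest form of "consistent sequence information"; a real pipeline might instead align contigs to the breakpoint reference and tolerate mismatches.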
As stated above, the sequences to be aligned of the sample to be tested may be existing sequences that can be used directly for alignment, or may be obtained by processing the raw data produced by sequencing. In certain preferred embodiments, obtaining the sequences to be aligned of the sample to be tested comprises: constructing a sequencing library of the sample to be tested; performing high-throughput sequencing on the sequencing library to obtain sequencing data; and preprocessing the sequencing data to obtain the sequences to be aligned of the sample to be tested.
In certain preferred embodiments, the sequencing library is a hybrid capture library, preferably obtained with the capture probes of SEQ ID NO:1 to SEQ ID NO:36. With a hybrid capture library, gene rearrangement detection can be performed on the sequencing data of a target gene. The capture probes of SEQ ID NO:1 to SEQ ID NO:36 can capture all exon sequences of the MLL gene, and can therefore be used to detect the positions of exon rearrangements in that gene and the corresponding sequence information.
The above method of the present application can accurately detect the breakpoint position of a rearrangement in a target gene. Depending on the research purpose, the sequences to be aligned of the sample to be tested can also be used to measure the expression level of the detected mutant gene. In certain preferred embodiments, after the breakpoint of the gene rearrangement is obtained, the above method further comprises a step of quantifying the rearranged gene. The quantification step comprises: counting, according to the sequence information of the breakpoint of the gene rearrangement, the number of sequences to be aligned that support the breakpoint, recorded as the marker sequence number; and dividing the marker sequence number by the sequence number of a reference gene, the resulting ratio being the expression abundance of the rearranged gene relative to the reference gene. Measuring the expression level of a rearranged gene reflects the expression of that gene under a specific condition or treatment state; measuring it under a series of different conditions or states then reflects the differences in its expression. The reference gene can be chosen reasonably according to actual needs; for example, when the gene being detected is the MLL gene, the ABL1 gene can usually be chosen as the reference gene.
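The quantification step is a single ratio, sketched below. The function and parameter names are hypothetical; how the supporting sequences and the reference-gene sequences are counted is left to the caller, since the text does not fix a counting procedure.

```python
def relative_abundance(marker_sequence_number: int,
                       reference_gene_sequence_number: int) -> float:
    """Expression abundance of the rearranged gene relative to the reference gene.

    marker_sequence_number: sequences supporting the rearrangement breakpoint.
    reference_gene_sequence_number: sequences assigned to the reference gene
    (for example ABL1 when the rearranged gene is MLL).
    """
    if reference_gene_sequence_number <= 0:
        raise ValueError("reference gene sequence number must be positive")
    return marker_sequence_number / reference_gene_sequence_number
```

For instance, 150 breakpoint-supporting sequences against 3000 reference-gene sequences gives a relative abundance of 0.05.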
It should be noted that, for simplicity of description, each of the foregoing method embodiments is presented as a series of combined actions. Those skilled in the art should understand, however, that the present invention is not limited by the order of actions described, because according to the present invention some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and that the actions involved are not necessarily required by the present invention.
From the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. On this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied as a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk or an optical disc) and includes instructions that cause a computing device or a processor to execute the methods described in the embodiments of the present invention.
In a second typical embodiment of the present application, an apparatus for detecting gene rearrangement is provided. The apparatus is used to store or run modules, or the modules are components of the apparatus. The modules are software modules, of which there are one or more, and the software modules are used to execute any of the above methods. The apparatus can not only detect the breakpoint position of a gene rearrangement more accurately, but can also obtain the sequence information corresponding to the breakpoint position, which makes it convenient to measure the relative expression level from that sequence information. The apparatus is practical and widely applicable: any mutant gene exhibiting gene rearrangement can be detected with it.
Preferably, the above apparatus comprises an acquisition module, an alignment module, a candidate module and an assembly determination module. The acquisition module is configured to obtain the sequences to be aligned of the sample to be tested. The alignment module is configured to align the sequences to be aligned against the reference genome to obtain abnormally aligned sequences, which include sequences aligned at abnormal positions, sequences aligned in abnormal orientations, and sequences that do not align to the reference genome. The candidate module is configured to determine the positions of candidate breakpoints from the alignment positions and alignment orientations of the abnormally aligned sequences on the reference genome. The assembly determination module is configured to assemble the sequences to be aligned that support the candidate breakpoint positions and to retain the breakpoints in the assembly result whose sequence information is consistent with the candidate breakpoint positions, recorded as the breakpoints of the gene rearrangement.
In a preferred embodiment, the above candidate module comprises a split alignment module and a candidate determination module. The split alignment module is configured to split the abnormally aligned sequences and realign the segments to the reference genome. The candidate determination module is configured to determine the positions of the candidate breakpoints from the alignment positions and alignment orientations of the split sequences on the reference genome.
In a preferred embodiment, the above candidate module comprises a split marking module, a simulation module, an alignment marking module and a candidate breakpoint module. The split marking module is configured to split the abnormally aligned sequences and realign the segments to the reference genome, recording the sequences that span a potential breakpoint by at least a first length on both sides as first marker sequences, and the sequences that span both sides of a potential breakpoint but by less than a second length as second marker sequences. The simulation module is configured to simulate, from the position of the potential breakpoint on the first marker sequences, a breakpoint reference sequence in which the gene rearrangement has occurred. The alignment marking module is configured to align the sequences to be aligned against the breakpoint reference sequence and to mark the sequences that align to the breakpoint reference sequence and span the breakpoint on it as breakpoint candidate sequences supporting the breakpoint. The candidate breakpoint module is configured to determine the position of the breakpoint on the breakpoint candidate sequences as the position of the candidate breakpoint.
In a preferred embodiment, the candidate breakpoint module comprises a breakpoint correction module and a correction determination module. The breakpoint correction module is configured to correct the breakpoint candidate sequences according to sequencing quality and the number of supporting sequences to obtain corrected candidate breakpoint sequences. The correction determination module is configured to determine the position of the breakpoint in the corrected candidate breakpoint sequences as the position of the candidate breakpoint.
In a preferred embodiment, the candidate breakpoint module comprises a breakpoint filtering module and a filtering determination module. The breakpoint filtering module is configured to filter out false-positive breakpoint sequences from the breakpoint candidate sequences according to the first marker sequences and second marker sequences that support the breakpoint on the breakpoint reference sequence, and the paired sequences among the sequences to be aligned that support spanning the breakpoint on the breakpoint reference sequence, to obtain filtered candidate breakpoint sequences. The filtering determination module is configured to determine the position of the breakpoint in the filtered candidate breakpoint sequences as the position of the candidate breakpoint.
In a preferred embodiment, the assembly determining module includes an assembly submodule and a retention module. The assembly submodule is used to assemble the first flag sequences and second flag sequences that support the breakpoint on the breakpoint reference sequence together with the paired sequences, among the sequences to be compared, that support spanning the breakpoint on the breakpoint reference sequence; the retention module is used to retain the breakpoints in the assembly result that are consistent with the sequence information at the position of the candidate breakpoint, denoted as the breakpoints of the gene rearrangement.
In a preferred embodiment, the obtaining module includes a construction module, a sequencing module and a preprocessing module. The construction module is used to construct the sequencing library of the sample to be tested; the sequencing module is used to perform high-throughput sequencing on the sequencing library, obtaining sequencing data; the preprocessing module is used to preprocess the sequencing data, obtaining the sequences to be compared of the sample to be tested.
In a preferred embodiment, the above sequencing library is a hybrid capture library, preferably one obtained with the capture probes of SEQ ID NO: 1 to SEQ ID NO: 36.
In a preferred embodiment, the above apparatus further includes a quantification module for quantifying the rearranged gene. The quantification module includes a counting module and an expression calculation module. The counting module is used to count, according to the sequence information of the breakpoint of the gene rearrangement, the number of sequences among the sequences to be compared that support the breakpoint of the gene rearrangement, denoted as the marker sequence number; the expression calculation module is used to divide the marker sequence number by the number of sequences of the internal reference gene, the resulting ratio being the expression abundance of the rearranged gene relative to the internal reference gene.
In a third typical embodiment of the application, a storage medium is provided. The storage medium includes a stored program, wherein the program, when run, executes any of the above methods for detecting gene rearrangement.
In a fourth typical embodiment of the application, a processor is provided. The processor is used to run a program, wherein the program, when run, executes any of the above methods for detecting gene rearrangement.
The above storage medium, processor and apparatus can be used by a computer to execute the above method for detecting gene rearrangement and to output the corresponding detection results. These products detect gene rearrangement without adding any experimental or sequencing cost, and the apparatus has a low detection cost and high accuracy.
In the several embodiments provided herein, it should be understood that the disclosed technical content may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, units or modules, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computing device (which may be a personal computer, a server, a network device, etc.) to execute all or some of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), a removable hard disk, a magnetic disk or an optical disc.
The beneficial effects of the application are further illustrated below with reference to specific embodiments.
Embodiment 1: Method for detecting MLL-PTD gene rearrangement
1. Samples and data
1) Collect the patient's bone marrow or peripheral blood and store it in a collection tube.
2) Extract nucleic acid from the sample; store the remaining sample at -80 °C.
3) Construct the sequencing library and enrich the target region by hybrid capture (the hybrid capture probes for the MLL gene correspond to the 36 exons of the gene; the specific sequences are listed in Table 1 below).
4) Sequence the captured library on the sequencer.
Table 1:
2. Preprocessing of sequencing data
1) Data quality control
Low-quality sequences are removed: sequences containing 5 or more N bases are discarded, as are sequences in which the average sequencing quality over 40 consecutive nucleotides is below Q20.
2) Alignment to the MLL gene sequence
The high-quality sequences that pass quality control are aligned to the reference sequence with hisat2 for further analysis.
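The quality-control rule in step 1) can be sketched as a per-read filter. This is a minimal illustration in Python, not the implementation used in the application. It assumes Phred+33 quality strings and reads the second rule as "reject the read if any window of 40 consecutive nucleotides averages below Q20". The function name, the parameter names and the handling of reads shorter than 40 nt are assumptions.

```python
def passes_quality_control(seq, qual, max_n=5, window=40, min_mean_q=20, offset=33):
    """Return True if a read survives both filters described in the text."""
    # Rule 1: discard reads containing 5 or more N bases.
    if seq.upper().count("N") >= max_n:
        return False
    # Decode the quality string (Phred+33 is an assumption).
    scores = [ord(ch) - offset for ch in qual]
    if not scores:
        return False
    # Assumption: reads shorter than the window are judged on their overall mean.
    if len(scores) < window:
        return sum(scores) >= min_mean_q * len(scores)
    # Rule 2: slide a 40-nt window and reject if any window averages below Q20.
    window_sum = sum(scores[:window])
    if window_sum < min_mean_q * window:
        return False
    for i in range(window, len(scores)):
        window_sum += scores[i] - scores[i - window]
        if window_sum < min_mean_q * window:
            return False
    return True
```

Keeping a running window sum makes the check linear in the read length instead of recomputing each 40-nt mean from scratch.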
3. Identification of MLL-PTD
1) Principle and theoretical basis:
MLL-PTD alters the MLL gene (36 exons in total) at the molecular level: the order in which the exons are joined changes, and the rearrangement usually occurs between exon2 and exon11.
2) Identification of MLL-PTD breakpoints:
The paired sequences are aligned first. For sequence pairs whose aligned positions are abnormal relative to each other, the structural variation between the pair is identified. At the same time, the abnormally aligned sequences are cut and realigned to the possible positions with a looser alignment method, which determines the final alignment position and alignment direction; the breakpoint position is calculated from the alignment positions of the cut sequences. As shown in Fig. 1, among the sequences that span the breakpoint, those that span the first length on both sides of the breakpoint are taken as first flag sequences, and those that span the second length are taken as second flag sequences. A breakpoint reference sequence in which the PTD has occurred is simulated from the breakpoint positions of the flag sequences, and the sequences are aligned again; only candidate breakpoints that align well and have sequences spanning the breakpoint are retained.
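The simulation of a breakpoint reference and the test for a sequence spanning the breakpoint can be sketched as follows. This is a minimal illustration in Python, not the implementation used in the application: the function names, the `flank` and `min_flank` parameters and the coordinate convention (0-based, half-open) are assumptions.

```python
def build_breakpoint_reference(upstream_exon, downstream_exon, flank):
    """Join the end of the upstream exon to the start of the downstream exon.

    Returns the simulated reference and the 0-based junction position,
    i.e. the index of the first base taken from the downstream exon.
    """
    if flank <= 0:
        raise ValueError("flank must be positive")
    left = upstream_exon[-flank:]      # last `flank` bases (or the whole exon if shorter)
    right = downstream_exon[:flank]    # first `flank` bases
    return left + right, len(left)


def spans_junction(aln_start, aln_length, junction, min_flank):
    """True if the alignment covers at least `min_flank` bases on each side."""
    aln_end = aln_start + aln_length   # half-open end coordinate
    return (junction - aln_start) >= min_flank and (aln_end - junction) >= min_flank


# Toy sequences for illustration only.
reference, junction = build_breakpoint_reference("AAAACCCC", "GGGGTTTT", flank=4)
print(reference, junction)
print(spans_junction(1, 6, junction, min_flank=3))
```

Calling `spans_junction` with two different `min_flank` values is one way to separate sequences that cover a longer stretch on both sides from those that only just cross the junction, which is how the first and second flag sequences are distinguished under the assumptions above.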
3) Breakpoint correction
Because the sequences bordering a breakpoint are similar, or because of mutations or sequencing errors, the breakpoint is corrected according to alignment score, sequencing quality and the number of supporting sequences, as shown in Fig. 2, and the best predicted breakpoint sequence is given as the candidate breakpoint.
4) Multi-factor filtering of false positives
Candidate breakpoints are further filtered for false positives according to the first flag sequences and second flag sequences that support the breakpoint and the paired sequences that support spanning the breakpoint. Then, as shown in Fig. 2, all sequences supporting the breakpoint are assembled, and the breakpoints whose assembly result is consistent with the breakpoint sequence information are retained, giving reliable MLL-PTD structural information.
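The two checks in step 4) can be sketched in Python as a support filter followed by a consistency test against the assembly. The text gives no thresholds and does not say how consistency is measured, so everything specific here is an assumption: the `Candidate` record, the default thresholds, and the rule that a candidate is consistent when its junction sequence (or its reverse complement) occurs in an assembled contig.

```python
from dataclasses import dataclass

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse-complement a DNA string, preserving letter case."""
    return seq.translate(_COMPLEMENT)[::-1]


@dataclass
class Candidate:
    junction: str        # sequence straddling the predicted breakpoint
    first_flag: int      # number of supporting first flag sequences
    second_flag: int     # number of supporting second flag sequences
    spanning_pairs: int  # number of read pairs spanning the breakpoint


def has_enough_support(c, min_first_flag=2, min_total=5):
    """Hypothetical thresholds; the source does not state any."""
    total = c.first_flag + c.second_flag + c.spanning_pairs
    return c.first_flag >= min_first_flag and total >= min_total


def consistent_with_assembly(c, contigs):
    """True if the junction sequence is found in any contig, on either strand."""
    forward = c.junction.upper()
    reverse = reverse_complement(forward)
    return any(forward in contig.upper() or reverse in contig.upper() for contig in contigs)


def confirm_breakpoints(candidates, contigs):
    """Keep candidates that pass the support filter and match the assembly."""
    return [c for c in candidates
            if has_enough_support(c) and consistent_with_assembly(c, contigs)]
```

In practice the contigs would come from an assembler run on the supporting sequences; here they are simply passed in as strings.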
4. Quantification of MLL-PTD
The marker sequence number of MLL-PTD is divided by the sequencing depth of the internal reference gene ABL1 to obtain the abundance ratio relative to the ABL1 gene.
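The ratio in this step is a single division. A minimal sketch in Python, assuming the marker sequence number and the ABL1 sequencing depth are already available as numbers; the function name and the example values are hypothetical and are not taken from the samples reported here.

```python
def relative_abundance(marker_sequence_number, reference_depth):
    """Return marker sequence number / reference-gene depth (e.g. ABL1)."""
    if reference_depth <= 0:
        raise ValueError("reference depth must be positive")
    return marker_sequence_number / reference_depth


# Made-up values for illustration only.
ratio = relative_abundance(50, 1000.0)
print(f"{ratio:.2%}")  # prints 5.00%
```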
Specifically, 122 samples were tested with the method of the application shown in Fig. 2, and MLL-PTD variation was detected in 10 samples. The specific results are reported in Table 2 and Table 3 below.
Table 2:
Table 3:
Sample number | SEQ ID NO: | Fusion sequence * |
A | 37 | AGAGGTCTCTGATGAGTCACTTTCTTGACC@cttttcttttggtttttgttttacagggat |
B | 38 | AGAGGTCTCTGATGAGTCACTTTCTTGACC@cttttcttttggtttttgttttacagggat |
C | 39 | ATCTGAGCCAAAACCTAAGAATTGCTCATC@cttttcttttggtttttgttttacagggat |
D | 40 | CATCTTCTGAGCCAGCAATTGATGACTTGT@cttttcttttggtttttgttttacagggat |
E | 41 | ATCTGAGCCAAAACCTAAGAATTGCTCATC@cttttcttttggtttttgttttacagggat |
F | 42 | ATCTGAGCCAAAACCTAAGAATTGCTCATC@cttttcttttggtttttgttttacagggat |
G | 43 | ATCTGAGCCAAAACCTAAGAATTGCTCATC@cttaaagtccactctgatcctgtggactcc |
H | 44 | ATCTGAGCCAAAACCTAAGAATTGCTCATC@ctgattctggtggtggaggctgctttttct |
I | 45 | ATCTGAGCCAAAACCTAAGAATTGCTCATC@cttttcttttggtttttgttttacagggat |
J | 46 | ATCTGAGCCAAAACCTAAGAATTGCTCATC@cttttcttttggtttttgttttacagggat |
K | 47 | CATCTTCTGAGCCAGCAATTGATGACTTGT@cttttcttttggtttttgttttacagggat |
* The fusion sequences in Table 3 are reverse-complement sequences. For example, for sample A (exon8->exon4), the lowercase letters represent the exon8 sequence and the uppercase letters represent the exon4 sequence.
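The footnote's convention can be checked against the sequence listing. The sketch below, in Python, splits the sample C fusion sequence at `@`, reverse-complements each part, and compares the results with the end of SEQ ID NO: 8 and the start of SEQ ID NO: 2 as given in the sequence listing. It assumes, consistent with Table 4, that these two entries are exon 8 and exon 2; the helper names are illustrative.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse-complement a DNA string, preserving letter case."""
    return seq.translate(_COMPLEMENT)[::-1]


def forward_junction(fusion):
    """Split 'UPPER@lower' and return (upstream, downstream) in forward orientation.

    Reverse-complementing the whole fusion swaps the two parts, so the
    lowercase part becomes the upstream side of the junction.
    """
    right_rc, left_rc = fusion.split("@")
    return reverse_complement(left_rc), reverse_complement(right_rc)


# SEQ ID NO: 2 and SEQ ID NO: 8, copied from the sequence listing in 10-base chunks.
SEQ_ID_2 = ("gatgagcaat" "tcttaggttt" "tggctcagat" "gaagaagtca"
            "gagtgcgaag" "tcccacaagg" "tctccttcag")
SEQ_ID_8 = ("gtccagagca" "gagcaaacag" "aaaaaagtgg" "ctccccgccc"
            "aagtatccct" "gtaaaacaaa" "aaccaaaaga" "aaag")

# Fusion sequence of sample C (SEQ ID NO: 39) from Table 3.
SAMPLE_C = "ATCTGAGCCAAAACCTAAGAATTGCTCATC@cttttcttttggtttttgttttacagggat"

upstream, downstream = forward_junction(SAMPLE_C)
print(SEQ_ID_8.endswith(upstream.lower()))      # junction starts at the end of SEQ ID NO: 8?
print(SEQ_ID_2.startswith(downstream.lower()))  # and continues at the start of SEQ ID NO: 2?
```

If both checks print `True`, the junction in forward orientation joins the end of exon 8 to the start of exon 2, which matches the exon8->exon2 structure listed for sample C in Table 4.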
2. Sample C was selected, and the detected MLL-PTD breakpoint structure was verified by Sanger sequencing.
The information for the sample verified by PCR is given in Table 4 below.
Table 4:
Sample number | MLL-PTD structure | Exon A | Exon B | Marker sequence number | Ratio |
C | exon8->exon2 | exon8 | exon2 | 231 | 17.12% |
The sequence obtained by verification is:
ATCTGAGCCAAAACCTAAGAATTGCTCATC@cttttcttttggtttttgttttacagggat
(i.e. SEQ ID NO:39).
3. A breakpoint template sequence was generated according to the breakpoint position, and primers were designed within 300 bp before and after the breakpoint for PCR amplification.
4. The PCR product was of the expected size, with a single bright band; the PCR product was Sanger sequenced, and the sequencing chromatogram was clean.
5. The breakpoint structure can be found in the Sanger sequencing result, and the bases flanking the breakpoint are fully consistent with the breakpoint sequence identified by the method of the application (the forward primer sequencing result is shown in Fig. 3 and the reverse-complement sequencing result in Fig. 4).
From the above description it can be seen that the above embodiments of the present invention achieve the following technical effects: the method of the application can detect both known and unknown rearrangements, and can accurately determine the specific position at which the rearrangement occurs and the corresponding sequence information. It is widely applicable and is suitable for detecting rearrangements in any gene.
The method uses NGS sequencing data directly and is developed on the basis of statistics and algorithms, so it adds no experimental or testing cost. In addition, the method has high detection accuracy and low cost, and is suitable for detecting structural rearrangements of low-abundance genes.
From the above description of the embodiments, those skilled in the art can clearly understand that the application can be implemented by means of software plus the necessary general hardware platform. Based on this understanding, the technical solution of the application in essence, or the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product may be stored in a storage medium such as ROM/RAM, a magnetic disk or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute the methods described in the embodiments of the application or in certain parts of the embodiments.
The embodiments in this specification are described in a progressive manner. For the same or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. The system embodiments in particular are described briefly because they are substantially similar to the method embodiments; for the relevant details, see the description of the method embodiments.
The application can be used in numerous general-purpose or special-purpose computing system environments or configurations, for example: personal computers, server computers, handheld or portable devices, laptop devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, and distributed computing environments that include any of the above systems or devices.
Obviously, those skilled in the art should understand that the modules or steps of the present invention described above may be implemented with a general-purpose computing device. They may be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they may be implemented as program code executable by a computing device, so that they may be stored in a storage device and executed by the computing device; alternatively, they may each be made into an individual integrated circuit module, or several of the modules or steps may be made into a single integrated circuit module. The present invention is therefore not limited to any specific combination of hardware and software.
The above are only preferred embodiments of the present invention and are not intended to limit it. For those skilled in the art, the present invention may have various modifications and changes. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.
Sequence listing
<110>Beijing You Xun Laboratory of medical test Co., Ltd
<120> Method, apparatus, storage medium and processor for detecting gene rearrangement
<130> PN102308YXYX
<160> 47
<170> SIPOSequenceListing 1.0
<210> 1
<211> 455
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 1
ctgcttcact tcacggggcg aacatggcgc acagctgtcg gtggcgcttc cccgcccgac 60
ccgggaccac cgggggcggc ggcggcgggg ggcgccgggg cctagggggc gccccgcggc 120
aacgcgtccc ggccctgctg cttccccccg ggcccccggt cggcggtggc ggccccgggg 180
cgcccccctc ccccccggct gtggcggccg cggcggcggc ggcgggaagc agcggggctg 240
gggttccagg gggagcggcc gccgcctcag cagcctcctc gtcgtccgcc tcgtcttcgt 300
cttcgtcatc gtcctcagcc tcttcagggc cggccctgct ccgggtgggc ccgggcttcg 360
acgcggcgct gcaggtctcg gccgccatcg gcaccaacct gcgccggttc cgggccgtgt 420
ttggggagag cggcggggga ggcggcagcg gagag 455
<210> 2
<211> 70
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 2
gatgagcaat tcttaggttt tggctcagat gaagaagtca gagtgcgaag tcccacaagg 60
tctccttcag 70
<210> 3
<211> 2654
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 3
ttaaaactag tcctcgaaaa cctcgtggga gacctagaag tggctctgac cgaaattcag 60
ctatcctctc agatccatct gtgttttccc ctctaaataa atcagagacc aaatctggag 120
ataagatcaa gaagaaagat tctaaaagta tagaaaagaa gagaggaaga cctcccacct 180
tccctggagt aaaaatcaaa ataacacatg gaaaggacat ttcagagtta ccaaagggaa 240
acaaagaaga tagcctgaaa aaaattaaaa ggacaccttc tgctacgttt cagcaagcca 300
caaagattaa aaaattaaga gcaggtaaac tctctcctct caagtctaag tttaagacag 360
ggaagcttca aataggaagg aagggggtac aaattgtacg acggagagga aggcctccat 420
caacagaaag gataaagacc ccttcgggtc tcctcattaa ttctgaactg gaaaagcccc 480
agaaagtccg gaaagacaag gaaggaacac ctccacttac aaaagaagat aagacagttg 540
tcagacaaag ccctcgaagg attaagccag ttaggattat tccttcttca aaaaggacag 600
atgcaaccat tgctaagcaa ctcttacaga gggcaaaaaa gggggctcaa aagaaaattg 660
aaaaagaagc agctcagctg cagggaagaa aggtgaagac acaggtcaaa aatattcgac 720
agttcatcat gcctgttgtc agtgctatct cctcgcggat cattaagacc cctcggcggt 780
ttatagagga tgaggattat gaccctccaa ttaaaattgc ccgattagag tctacaccga 840
atagtagatt cagtgccccg tcctgtggat cttctgaaaa atcaagtgca gcttctcagc 900
actcctctca aatgtcttca gactcctctc gatctagtag ccccagtgtt gatacctcca 960
cagactctca ggcttctgag gagattcagg tacttcctga ggagcggagc gatacccctg 1020
aagttcatcc tccactgccc atttcccagt ccccagaaaa tgagagtaat gataggagaa 1080
gcagaaggta ttcagtgtcg gagagaagtt ttggatctag aacgacgaaa aaattatcaa 1140
ctctacaaag tgccccccag cagcagacct cctcgtctcc acctccacct ctgctgactc 1200
caccgccacc actgcagcca gcctccagta tctctgacca cacaccttgg cttatgcctc 1260
caacaatccc cttagcatca ccatttttgc ctgcttccac tgctcctatg caagggaagc 1320
gaaaatctat tttgcgagaa ccgacattta ggtggacttc tttaaagcat tctaggtcag 1380
agccacaata cttttcctca gcaaagtatg ccaaagaagg tcttattcgc aaaccaatat 1440
ttgataattt ccgaccccct ccactaactc ccgaggacgt tggctttgca tctggttttt 1500
ctgcatctgg taccgctgct tcagcccgat tgttttcgcc actccattct ggaacaaggt 1560
ttgatatgca caaaaggagc cctcttctga gagctccaag atttactcca agtgaggctc 1620
actctagaat atttgagtct gtaaccttgc ctagtaatcg aacttctgct ggaacatctt 1680
cttcaggagt atccaataga aaaaggaaaa gaaaagtgtt tagtcctatt cgatctgaac 1740
caagatctcc ttctcactcc atgaggacaa gaagtggaag gcttagtagt tctgagctct 1800
cacctctcac ccccccgtct tctgtctctt cctcgttaag catttctgtt agtcctcttg 1860
ccactagtgc cttaaaccca acttttactt ttccttctca ttccctgact cagtctgggg 1920
aatctgcaga gaaaaatcag agaccaagga agcagactag tgctccggca gagccatttt 1980
catcaagtag tcctactcct ctcttccctt ggtttacccc aggctctcag actgaaagag 2040
ggagaaataa agacaaggcc cccgaggagc tgtccaaaga tcgagatgct gacaagagcg 2100
tggagaagga caagagtaga gagagagacc gggagagaga aaaggagaat aagcgggagt 2160
caaggaaaga gaaaaggaaa aagggatcag aaattcagag tagttctgct ttgtatcctg 2220
tgggtagggt ttccaaagag aaggttgttg gtgaagatgt tgccacttca tcttctgcca 2280
aaaaagcaac agggcggaag aagtcttcat cacatgattc tgggactgat attacttctg 2340
tgactcttgg ggatacaaca gctgtcaaaa ccaaaatact tataaagaaa gggagaggaa 2400
atctggaaaa aaccaacttg gacctcggcc caactgcccc atccctggag aaggagaaaa 2460
ccctctgcct ttccactcct tcatctagca ctgttaaaca ttccacttcc tccataggct 2520
ccatgttggc tcaggcagac aagcttccaa tgactgacaa gagggttgcc agcctcctaa 2580
aaaaggccaa agctcagctc tgcaagattg agaagagtaa gagtcttaaa caaaccgacc 2640
agcccaaagc acag 2654
<210> 4
<211> 178
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 4
ggtcaagaaa gtgactcatc agagacctct gtgcgaggac cccggattaa acatgtctgc 60
agaagagcag ctgttgccct tggccgaaaa cgagctgtgt ttcctgatga catgcccacc 120
ctgagtgcct taccatggga agaacgagaa aagattttgt cttccatggg gaatgatg 178
<210> 5
<211> 235
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 5
acaagtcatc aattgctggc tcagaagatg ctgaacctct tgctccaccc atcaaaccaa 60
ttaaacctgt cactagaaac aaggcacccc aggaacctcc agtaaagaaa ggacgtcgat 120
cgaggcggtg tgggcagtgt cccggctgcc aggtgcctga ggactgtggt gtttgtacta 180
attgcttaga taagcccaag tttggtggtc gcaatataaa gaagcagtgc tgcaa 235
<210> 6
<211> 65
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 6
gatgagaaaa tgtcagaatc tacaatggat gccttccaaa gcctacctgc agaagcaagc 60
taaag 65
<210> 7
<211> 378
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 7
ctgtgaaaaa gaaagagaaa aagtctaaga ccagtgaaaa gaaagacagc aaagagagca 60
gtgttgtgaa gaacgtggtg gactctagtc agaaacctac cccatcagca agagaggatc 120
ctgccccaaa gaaaagcagt agtgagcctc ctccacgaaa gcccgtcgag gaaaagagtg 180
aagaagggaa tgtctcggcc cctgggcctg aatccaaaca ggccaccact ccagcttcca 240
ggaagtcaag caagcaggtc tcccagccag cactggtcat cccgcctcag ccacctacta 300
caggaccgcc aagaaaagaa gttcccaaaa ccactcctag tgagcccaag aaaaagcagc 360
ctccaccacc agaatcag 378
<210> 8
<211> 74
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 8
gtccagagca gagcaaacag aaaaaagtgg ctccccgccc aagtatccct gtaaaacaaa 60
aaccaaaaga aaag 74
<210> 9
<211> 132
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 9
gaaaaaccac ctccggtcaa taagcaggag aatgcaggca ctttgaacat cctcagcact 60
ctctccaatg gcaatagttc taagcaaaaa attccagcag atggagtcca caggatcaga 120
gtggacttta ag 132
<210> 10
<211> 114
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 10
gaggattgtg aagcagaaaa tgtgtgggag atgggaggct taggaatctt gacttctgtt 60
cctataacac ccagggtggt ttgctttctc tgtgccagta gtgggcatgt agag 114
<210> 11
<211> 147
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 11
tttgtgtatt gccaagtctg ttgtgagccc ttccacaagt tttgtttaga ggagaacgag 60
cgccctctgg aggaccagct ggaaaattgg tgttgtcgtc gttgcaaatt ctgtcacgtt 120
tgtggaaggc aacatcaggc tacaaag 147
<210> 12
<211> 96
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 12
cagctgctgg agtgtaataa gtgccgaaac agctatcacc ctgagtgcct gggaccaaac 60
taccccacca aacccacaaa gaagaagaaa gtctgg 96
<210> 13
<211> 121
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 13
atctgtacca agtgtgttcg ctgtaagagc tgtggatcca caactccagg caaagggtgg 60
gatgcacagt ggtctcatga tttctcactg tgtcatgatt gcgccaagct ctttgctaaa 120
g 121
<210> 14
<211> 123
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 14
gaaacttctg ccctctctgt gacaaatgtt atgatgatga tgactatgag agtaagatga 60
tgcaatgtgg aaagtgtgat cgctgggtcc attccaaatg tgagaatctt tcaggtacag 120
aag 123
<210> 15
<211> 185
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 15
atgagatgta tgagattcta tctaatctgc cagaaagtgt ggcctacact tgtgtgaact 60
gtactgagcg gcaccctgca gagtggcgac tggcccttga aaaagagctg cagatttctc 120
tgaagcaagt tctgacagct ttgttgaatt ctcggactac cagccatttg ctacgctacc 180
ggcag 185
<210> 16
<211> 174
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 16
gctgccaagc ctccagactt aaatcccgag acagaggaga gtataccttc ccgcagctcc 60
cccgaaggac ctgatccacc agttcttact gaggtcagca aacaggatga tcagcagcct 120
ttagatctag aaggagtcaa gaggaagatg gaccaaggga attacacatc tgtg 174
<210> 17
<211> 111
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 17
ttggagttca gtgatgatat tgtgaagatc attcaagcag ccattaattc agatggagga 60
cagccagaaa ttaaaaaagc caacagcatg gtcaagtcct tcttcattcg g 111
<210> 18
<211> 74
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 18
caaatggaac gtgtttttcc atggttcagt gtcaaaaagt ccaggttttg ggagccaaat 60
aaagtatcaa gcaa 74
<210> 19
<211> 194
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 19
cagtgggatg ttaccaaacg cagtgcttcc accttcactt gaccataatt atgctcagtg 60
gcaggagcga gaggaaaaca gccacactga gcagcctcct ttaatgaaga aaatcattcc 120
agctcccaaa cccaaaggtc ctggagaacc agactcacca actcctctgc atcctcctac 180
accaccaatt ttga 194
<210> 20
<211> 107
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 20
gtactgatag gagtcgagaa gacagtccag agctgaaccc acccccaggc atagaagaca 60
atagacagtg tgcgttatgt ttgacttatg gtgatgacag tgctaat 107
<210> 21
<211> 138
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 21
gatgctggtc gtttactata tattggccaa aatgagtgga cacatgtaaa ttgtgctttg 60
tggtcagcgg aagtgtttga agatgatgac ggatcactaa agaatgtgca tatggctgtg 120
atcaggggca agcagctg 138
<210> 22
<211> 159
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 22
agatgtgaat tctgccaaaa gccaggagcc accgtgggtt gctgtctcac atcctgcacc 60
agcaactatc acttcatgtg ttcccgagcc aagaactgtg tctttctgga tgataaaaaa 120
gtatattgcc aacgacatcg ggatttgatc aaaggcgaa 159
<210> 23
<211> 118
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 23
gtggttcctg agaatggatt tgaagttttc agaagagtgt ttgtggactt tgaaggaatc 60
agcttgagaa ggaagtttct caatggcttg gaaccagaaa atatccacat gatgattg 118
<210> 24
<211> 79
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 24
ggtctatgac aatcgactgc ttaggaattc taaatgatct ctccgactgt gaagataagc 60
tctttcctat tggatatca 79
<210> 25
<211> 161
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 25
gtgttccagg gtatactgga gcaccacaga tgctcgcaag cgctgtgtat atacatgcaa 60
gatagtggag tgccgtcctc cagtcgtaga gccggatatc aacagcactg ttgaacatga 120
tgaaaacagg accattgccc atagtccaac atcttttaca g 161
<210> 26
<211> 186
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 26
aaagttcatc aaaagagagt caaaacacag ctgaaattat aagtcctcca tcaccagacc 60
gacctcctca ttcacaaacc tctggctcct gttattatca tgtcatctca aaggtcccca 120
ggattcgaac acccagttat tctccaacac agagatcccc tggctgtcga ccgttgcctt 180
ctgcag 186
<210> 27
<211> 4249
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 27
gaagtcctac cccaaccact catgaaatag tcacagtagg tgatccttta ctctcctctg 60
gacttcgaag cattggctcc aggcgtcaca gtacctcttc cttatcaccc cagcggtcca 120
aactccggat aatgtctcca atgagaactg ggaatactta ctctaggaat aatgtttcct 180
cagtctccac caccgggacc gctactgatc ttgaatcaag tgccaaagta gttgatcatg 240
tcttagggcc actgaattca agtactagtt tagggcaaaa cacttccacc tcttcaaatt 300
tgcaaaggac agtggttact gtaggcaata aaaacagtca cttggatgga tcttcatctt 360
cagaaatgaa gcagtccagt gcttcagact tggtgtccaa gagctcctct ttaaagggag 420
agaagaccaa agtgctgagt tccaagagct cagagggatc tgcacataat gtggcttacc 480
ctggaattcc taaactggcc ccacaggttc ataacacaac atctagagaa ctgaatgtta 540
gtaaaatcgg ctcctttgct gaaccctctt cagtgtcgtt ttcttctaaa gaggccctct 600
ccttcccaca cctccatttg agagggcaaa ggaatgatcg agaccaacac acagattcta 660
cccaatcagc aaactcctct ccagatgaag atactgaagt caaaaccttg aagctatctg 720
gaatgagcaa cagatcatcc attatcaacg aacatatggg atctagttcc agagatagga 780
gacagaaagg gaaaaaatcc tgtaaagaaa ctttcaaaga aaagcattcc agtaaatctt 840
ttttggaacc tggtcaggtg acaactggtg aggaaggaaa cttgaagcca gagtttatgg 900
atgaggtttt gactcctgag tatatgggcc aacgaccatg taacaatgtt tcttctgata 960
agattggtga taaaggcctt tctatgccag gagtccccaa agctccaccc atgcaagtag 1020
aaggatctgc caaggaatta caggcaccac ggaaacgcac agtcaaagtg acactgacac 1080
ctctaaaaat ggaaaatgag agtcaatcca aaaatgccct gaaagaaagt agtcctgctt 1140
cccctttgca aatagagtca acatctccca cagaaccaat ttcagcctct gaaaatccag 1200
gagatggtcc agtggcccaa ccaagcccca ataatacctc atgccaggat tctcaaagta 1260
acaactatca gaatcttcca gtacaggaca gaaacctaat gcttccagat ggccccaaac 1320
ctcaggagga tggctctttt aaaaggaggt atccccgtcg cagtgcccgt gcacgttcta 1380
acatgttttt tgggcttacc ccactctatg gagtaagatc ctatggtgaa gaagacattc 1440
cattctacag cagctcaact gggaagaagc gaggcaagag atcagctgaa ggacaggtgg 1500
atggggccga tgacttaagc acttcagatg aagacgactt atactattac aacttcacta 1560
gaacagtgat ttcttcaggt ggagaggaac gactggcatc ccataattta tttcgggagg 1620
aggaacagtg tgatcttcca aaaatctcac agttggatgg tgttgatgat gggacagaga 1680
gtgatactag tgtcacagcc acaacaagga aaagcagcca gattccaaaa agaaatggta 1740
aagaaaatgg aacagagaac ttaaagattg atagacctga agatgctggg gagaaagaac 1800
atgtcactaa gagttctgtt ggccacaaaa atgagccaaa gatggataac tgccattctg 1860
taagcagagt taaaacacag ggacaagatt ccttggaagc tcagctcagc tcattggagt 1920
caagccgcag agtccacaca agtaccccct ccgacaaaaa tttactggac acctataata 1980
ctgagctcct gaaatcagat tcagacaata acaacagtga tgactgtggg aatatcctgc 2040
cttcagacat tatggacttt gtactaaaga atactccatc catgcaggct ttgggtgaga 2100
gcccagagtc atcttcatca gaactcctga atcttggtga aggattgggt cttgacagta 2160
atcgtgaaaa agacatgggt ctttttgaag tattttctca gcagctgcct acaacagaac 2220
ctgtggatag tagtgtctct tcctctatct cagcagagga acagtttgag ttgcctctag 2280
agctaccatc tgatctgtct gtcttgacca cccggagtcc cactgtcccc agccagaatc 2340
ccagtagact agctgttatc tcagactcag gggagaagag agtaaccatc acagaaaaat 2400
ctgtagcctc ctctgaaagt gacccagcac tgctgagccc aggagtagat ccaactcctg 2460
aaggccacat gactcctgat cattttatcc aaggacacat ggatgcagac cacatctcta 2520
gccctccttg tggttcagta gagcaaggtc atggcaacaa tcaggattta actaggaaca 2580
gtagcacccc tggccttcag gtacctgttt ccccaactgt tcccatccag aaccagaagt 2640
atgtgcccaa ttctactgat agtcctggcc cgtctcagat ttccaatgca gctgtccaga 2700
ccactccacc ccacctgaag ccagccactg agaaactcat agttgttaac cagaacatgc 2760
agccacttta tgttctccaa actcttccaa atggagtgac ccaaaaaatc caattgacct 2820
cttctgttag ttctacaccc agtgtgatgg agacaaatac ttcagtattg ggacccatgg 2880
gaggtggtct cacccttacc acaggactaa atccaagctt gccaacttct caatctttgt 2940
tcccttctgc tagcaaagga ttgctaccca tgtctcatca ccagcactta cattccttcc 3000
ctgcagctac tcaaagtagt ttcccaccaa acatcagcaa tcctccttca ggcctgctta 3060
ttggggttca gcctcctccg gatccccaac ttttggtttc agaatccagc cagaggacag 3120
acctcagtac cacagtagcc actccatcct ctggactcaa gaaaagaccc atatctcgtc 3180
tacagacccg aaagaataaa aaacttgctc cctctagtac cccttcaaac attgcccctt 3240
ctgatgtggt ttctaatatg acattgatta acttcacacc ctcccagctt cctaatcatc 3300
caagtctgtt agatttgggg tcacttaata cttcatctca ccgaactgtc cccaacatca 3360
taaaaagatc taaatctagc atcatgtatt ttgaaccggc acccctgtta ccacagagtg 3420
tgggaggaac tgctgccaca gcggcaggca catcaacaat aagccaggat actagccacc 3480
tcacatcagg gtctgtgtct ggcttggcat ccagttcctc tgtcttgaat gttgtatcca 3540
tgcaaactac cacaacccct acaagtagtg cgtcagttcc aggacacgtc accttaacca 3600
acccaaggtt gcttggtacc ccagatattg gctcaataag caatctttta atcaaagcta 3660
gccagcagag cctggggatt caggaccagc ctgtggcttt accgccaagt tcaggaatgt 3720
ttccacaact ggggacatca cagaccccct ctactgctgc aataacagcg gcatctagca 3780
tctgtgtgct cccctccact cagactacgg gcataacagc cgcttcacct tctggggaag 3840
cagacgaaca ctatcagctt cagcatgtga accagctcct tgccagcaaa actgggattc 3900
attcttccca gcgtgatctt gattctgctt cagggcccca ggtatccaac tttacccaga 3960
cggtagacgc tcctaatagc atgggactgg agcagaacaa ggctttatcc tcagctgtgc 4020
aagccagccc cacctctcct gggggttctc catcctctcc atcttctgga cagcggtcag 4080
caagcccttc agtgccgggt cccactaaac ccaaaccaaa aaccaaacgg tttcagctgc 4140
ctctagacaa agggaatggc aagaagcaca aagtttccca tttgcggacc agttcttctg 4200
aagcacacat tccagaccaa gaaacgacat ccctgacctc aggcacagg 4249
<210> 28
<211> 81
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 28
gactccagga gcagaggctg agcagcagga tacagctagc gtggagcagt cctcccagaa 60
ggagtgtggg caacctgcag g 81
<210> 29
<211> 65
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 29
gcaagtcgct gttcttccgg aagttcaggt gacccaaaat ccagcaaatg aacaagaaag 60
tgcag 65
<210> 30
<211> 171
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 30
aacctaaaac agtggaagaa gaggaaagta atttcagctc cccactgatg ctttggcttc 60
agcaagaaca aaagcggaag gaaagcatta ctgagaaaaa acccaagaaa ggacttgttt 120
ttgaaatttc cagtgatgat ggctttcaga tctgtgcaga aagtattgaa g 171
<210> 31
<211> 75
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 31
atgcctggaa gtcattgaca gataaagtcc aggaagctcg atcaaatgcc cgcctaaagc 60
agctctcatt tgcag 75
<210> 32
<211> 175
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 32
gtgttaacgg tttgaggatg ctggggattc tccatgatgc agttgtgttc ctcattgagc 60
agctgtctgg tgccaagcac tgtcgaaatt acaaattccg tttccacaag ccagaggagg 120
ccaatgaacc ccccttgaac cctcacggct cagccagggc tgaagtccac ctcag 175
<210> 33
<211> 108
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 33
gaagtcagca tttgacatgt ttaacttcct ggcttctaaa catcgtcagc ctcctgaata 60
caaccccaat gatgaagaag aggaggaggt acagctgaag tcagctcg 108
<210> 34
<211> 84
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 34
gagggcaact agcatggatc tgccaatgcc catgcgcttc cggcacttaa aaaagacttc 60
taaggaggca gttggtgtct acag 84
<210> 35
<211> 130
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 35
gtctcccatc catggccggg gtcttttctg taagagaaac attgatgcag gtgagatggt 60
gattgagtat gccggcaacg tcatccgctc catccagact gacaagcggg aaaagtatta 120
cgacagcaag 130
<210> 36
<211> 4928
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 36
ggcattggtt gctatatgtt ccgaattgat gactcagagg tagtggatgc caccatgcat 60
ggaaatgctg cacgcttcat caatcactcg tgtgagccta actgctattc tcgggtcatc 120
aatattgatg ggcagaagca cattgtcatc tttgccatgc gtaagatcta ccgaggagag 180
gaactcactt acgactataa gttccccatt gaggatgcca gcaacaagct gccctgcaac 240
tgtggcgcca agaaatgccg gaagttccta aactaaagct gctcttctcc cccagtgttg 300
gagtgcaagg aggcggggcc atccaaagca acgctgaagg ccttttccag cagctgggag 360
ctcccggatt gcgtggcaca gctgaggggc ctctgtgatg gctgagctct cttatgtcct 420
atactcacat cagacatgtg atcatagtcc cagagacaga gttgaggtct cgaagaaaag 480
atccatgatc ggctttctcc tggggcccct ccaattgttt actgttagaa agtgggaatg 540
gggtccctag cagacttgcc tggaaggagc ctattataga gggttggtta tgttgggaga 600
ttgggcctga atttctccac agaaataagt tgccatcctc aggttggccc tttcccaagc 660
actgtaagtg agtgggtcag gcaaagcccc aaatggaggg ttggttagat tcctgacagt 720
ttgccagcca ggccccacct acagcgtctg tcgaacaaac agaggtctgg tggttttccc 780
tactatcctc ccactcgaga gttcacttct ggttgggaga caggattcct agcacctccg 840
gtgtcaaaag gctgtcatgg ggttgtgcca attaattacc aaacattgag cctgcaggct 900
ttgagtggga gtgttgcccc caggagcctt atctcagcca attacctttc ttgacagtag 960
gagcggcttc cctctcccat tccctcttca ctcccttttc ttcctttccc ctgtcttcat 1020
gccactgctt tcccatgctt ctttcgggtt gtaggggaga ctgactgcct gctcaaggac 1080
actccctgct gggcatagga tgtgcctgca aaaagttccc tgagcctgta agcactccag 1140
gtggggaagt ggacaggagc cattggtcat aaccagacag aatttggaaa cattttcata 1200
aagctccatg gagagtttta aagaaacata tgtagcatga ttttgtagga gaggaaaaag 1260
attatttaaa taggatttaa atcatgcaac aacgagagta tcacagccag gatgaccctt 1320
gggtcccatt cctaagacat ggttacttta ttttcccctt gttaagacat aggaagactt 1380
aatttttaaa cggtcagtgt ccagttgaag gcagaacact aatcagattt caaggcccac 1440
aacttgggga ctagaccacc ttatgttgag ggaactctgc cacctgcgtg caacccacag 1500
ctaaagtaaa ttcaatgaca ctactgccct gattactcct taggatgtgg tcaaaacagc 1560
atcaaatgtt tcttctcttc ctttccccaa gacagagtcc tgaacctgtt aaattaagtc 1620
attggatttt actctgttct gtttacagtt tactatttaa ggttttataa atgtaaatat 1680
attttgtata tttttctatg agaagcactt catagggaga agcacttatg acaaggctat 1740
tttttaaacc gcggtattat cctaatttaa aagaagatcg gtttttaata attttttatt 1800
ttcataggat gaagttagag aaaatattca gctgtacaca caaagtctgg tttttcctgc 1860
ccaacttccc cctggaaggt gtactttttg ttgtttaatg tgtagcttgt ttgtgccctg 1920
ttgacataaa tgtttcctgg gtttgctctt tgacaataaa tggagaagga aggtcaccca 1980
actccattgg gccactcccc tccttcccct attgaagctc ctcaaaaggc tacagtaata 2040
tcttgataca acagattctc ttctttcccg cctctctcct ttccggcgca acttccagag 2100
tggtgggaga cggcaatctt tacatttccc tcatctttct tacttcagag ttagcaaaca 2160
acaagttgaa tggcaacttg acatttttgc atcaccatct gcctcatagg ccactctttc 2220
ctttccctct gcccaccaag tcctcatatc tgcagagaac ccattgatca ccttgtgccc 2280
tcttttgggg cagcctgttg aaactgaagc acagtctgac cactcacgat aaagcagatt 2340
tttctctgcc tctgccacaa ggtttcagag tagtgtagtc caagtagagg gtggggcacc 2400
cttttctcgc cgcaagaagc ccattcctat ggaagtctag caaagcaata cgactcagcc 2460
cagcactctc tgccccagga ctcatggctc tgctgtgcct tccatcctgg gctcccttct 2520
ctcctgtgac cttaagaact ttgtctggtg gctttgctgg aacattgtca ctgttttcac 2580
tgtcatgcag ggagcccagc actgtggcca ggatggcaga gacttccttg tcatcatgga 2640
gaagtgccag caggggactg ggaaaagcac tctacccaga cctcacctcc cttcctcctt 2700
ttgcccatga acaagatgca gtggccctag gggttccact agtgtctgct ttcctttatt 2760
attgcactgt gtgaggtttt tttgtaaatc cttgtattcc tatttttttt aaagaaaaaa 2820
aaaaaacctt aagctgcatt tgttactgaa atgattaatg cactgatggg tcctgaattc 2880
accttgagaa agacccaaag gccagtcagg gggtgggggg aactcagcta aatagaccta 2940
gttactgccc tgctaggcca tgctgtactg tgagcccctc ctcactctct accaacccta 3000
aaccctgagg acaggggagg aacccacagc ttccttctcc tgccagctgc agatggtttg 3060
ccttgccttt ccacccccta attgtcaacc acaaaaatga gaaattcctc ttctagctca 3120
gccttgagtc cattgccaaa ttttcagcac acctgccagc aacttggggg aataagcgaa 3180
ggtttcccta caagagggaa agaaggcaaa aacggcacag ctatctccaa acacatctga 3240
gttcatttca aaagtgacca agggaatctc cgcacaaaag tgcagattga ggaattgtga 3300
tgggtcattc ccaagaatcc cccaaggggc atcccaaatc cctgaggagt aacagctgca 3360
aacctggtca gttctcagtg agagccagct cacttatagc tttgctgcta gaacctgttg 3420
tggctgcatt tcctggtggc cagtgacaac tgtgtaacca gaatagctgc atggcgctga 3480
ccctttggcc ggaacttggt ctcttggctc cctccttggc cacccaccac ctctcgcaca 3540
gcccctctgt ttttacacca ataacaagaa ttaaggggga agccctggca gctatacgtt 3600
ttcaaccaga ctcctttgcc gggacccagc ccgccaccct gctcgcctcc gtcaaacccc 3660
cggccaatgc agtgagcacc atgtagctcc cttgatttaa aaaaaataaa aaataaaaaa 3720
aaaaggaaaa aaaaatacaa cacacacaca aaaataaaaa aaatattcta atgaatgtat 3780
ctttctaaag gactgacgtt caatcaaata tctgaaaata ctaaaggtca aaaccttgtc 3840
agatgttaac ttctaagttc ggtttgggat tttttttttt taatagaaat caagttgttt 3900
ttgtttttaa ggaaaagcgg gtcattgcaa agggctgggt gtaattttat gtttcatttc 3960
cttcatttta aagcaataca aggttatgga gcagatggtt ttgtgccgaa tcatgaatac 4020
tagtcaagtc acacactctg gaaacttgca actttttgtt tgttttggtt ttcaaataaa 4080
tataaatatg atatatatag gaactaatat agtaatgcac catgtaacaa agcctagttc 4140
agtccatggc ttttaattct cttaacacta tagataagga ttgtgttaca gttgctagta 4200
gcggcaggaa gatgtcaggc tcactttcct ctgattcccg aaatgggggg aacctctaac 4260
cataaaggaa tggtagaaca gtccattcct cggatcagag aaaaatgcag acatggtgtc 4320
acctggattt ttttctgccc atgaatgttg ccagtcagta cctgtcctcc ttgtttctct 4380
atttttggtt atgaatgttg gggttaccac ctgcatttag gggaaaattg tgttctgtgc 4440
tttcctggta tcttgttccg aggtactcta gttctgtctt tcaaccaaga aaatagaatt 4500
gtggtgtttc ttttattgaa cttttaacag tctctttagt aaatacaggt agttgaataa 4560
ttgtttcaag agctcaacag atgacaagct tcttttctag aaataagaca ttttttgaca 4620
actttatcat gtataacaga tctgtttttt ttccttgtgt tcttccaagc ttctggttag 4680
agaaaaagag aaaaaaaaaa aaggaaaatg tgtctaaagt ccatcagtgt taactccctg 4740
tgacagggat gaaggaaaat actttaatag ttcaaaaaat aataatgctg aaagctctct 4800
acgaaagact gaatgtaaaa gtaaaaagtg tacatagttg taaaaaaaag gagtttttaa 4860
acatgtttat tttctatgca ctttttttta tttaagtgat agtttaatta ataaacatgt 4920
caagttta 4928
<210> 37
<211> 60
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 37
agaggtctct gatgagtcac tttcttgacc cttttctttt ggtttttgtt ttacagggat 60
<210> 38
<211> 60
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 38
agaggtctct gatgagtcac tttcttgacc cttttctttt ggtttttgtt ttacagggat 60
<210> 39
<211> 60
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 39
atctgagcca aaacctaaga attgctcatc cttttctttt ggtttttgtt ttacagggat 60
<210> 40
<211> 60
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 40
catcttctga gccagcaatt gatgacttgt cttttctttt ggtttttgtt ttacagggat 60
<210> 41
<211> 60
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 41
atctgagcca aaacctaaga attgctcatc cttttctttt ggtttttgtt ttacagggat 60
<210> 42
<211> 60
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 42
atctgagcca aaacctaaga attgctcatc cttttctttt ggtttttgtt ttacagggat 60
<210> 43
<211> 60
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 43
atctgagcca aaacctaaga attgctcatc cttaaagtcc actctgatcc tgtggactcc 60
<210> 44
<211> 60
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 44
atctgagcca aaacctaaga attgctcatc ctgattctgg tggtggaggc tgctttttct 60
<210> 45
<211> 60
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 45
atctgagcca aaacctaaga attgctcatc cttttctttt ggtttttgtt ttacagggat 60
<210> 46
<211> 60
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 46
atctgagcca aaacctaaga attgctcatc cttttctttt ggtttttgtt ttacagggat 60
<210> 47
<211> 60
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 47
catcttctga gccagcaatt gatgacttgt cttttctttt ggtttttgtt ttacagggat 60
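The entries above follow the numeric-tag layout commonly used in sequence listings (<210> sequence identifier, <211> declared length, <400> residues). As a minimal sketch, assuming that layout and assuming each residue line ends with a running position count, the following Python helper checks that the number of residues in an entry matches its <211> value. The function names `parse_st25` and `length_mismatches` are hypothetical and are not part of the patent.

```python
import re


def parse_st25(text):
    """Parse <210>/<211>/<400> records into {seq_id: {"declared": int, "seq": str}}."""
    records = {}
    current = None
    in_residues = False
    for raw in text.splitlines():
        line = raw.strip()
        tag_match = re.match(r"<(\d{3})>\s*(.*)", line)
        if tag_match:
            tag, value = tag_match.group(1), tag_match.group(2).strip()
            # Residue lines follow only the <400> tag.
            in_residues = tag == "400"
            if tag == "210":
                current = {"declared": None, "seq": ""}
                records[int(value)] = current
            elif tag == "211" and current is not None:
                current["declared"] = int(value)
        elif in_residues and current is not None and line:
            # Drop the spaces between 10-base groups and the running position count.
            current["seq"] += re.sub(r"[^a-z]", "", line.lower())
    return records


def length_mismatches(records):
    """Return the SEQ ID numbers whose residue count differs from <211>."""
    return [seq_id for seq_id, rec in sorted(records.items())
            if rec["declared"] != len(rec["seq"])]


# SEQ ID NO: 31 copied from the listing above.
SAMPLE = """<210> 31
<211> 75
<212> DNA
<213>homo sapiens (Homo sapiens)
<400> 31
atgcctggaa gtcattgaca gataaagtcc aggaagctcg atcaaatgcc cgcctaaagc 60
agctctcatt tgcag 75
"""

records = parse_st25(SAMPLE)
print(len(records[31]["seq"]), length_mismatches(records))
```

Running the same check over the full listing would flag any entry whose residues were truncated or duplicated during text extraction.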
Claims (12)
1. A method for detecting gene rearrangement, characterized in that the method comprises:
obtaining sequences to be aligned from a sample to be tested;
aligning the sequences to be aligned against a reference genome to obtain abnormally aligned sequences, the abnormally aligned sequences including sequences aligned at abnormal positions, sequences aligned in abnormal orientations, and sequences that do not align to the reference genome;
determining the positions of candidate breakpoints according to the alignment positions and alignment orientations of the abnormally aligned sequences on the reference genome;
assembling the sequences, among the sequences to be aligned, that support the positions of the candidate breakpoints, and retaining the breakpoints in the assembly result whose sequence information is consistent with that at the positions of the candidate breakpoints, these being recorded as the breakpoints of the gene rearrangement.
2. The method according to claim 1, characterized in that determining the positions of the candidate breakpoints according to the alignment positions and alignment orientations of the abnormally aligned sequences on the reference genome comprises:
cutting the abnormally aligned sequences and realigning the cut sequences against the reference genome, and determining the positions of the candidate breakpoints according to the alignment positions and alignment orientations of the cut abnormally aligned sequences on the reference genome.
3. The method according to claim 2, characterized in that determining the positions of the candidate breakpoints according to the alignment positions and alignment orientations of the abnormally aligned sequences on the reference genome comprises:
cutting the abnormally aligned sequences and realigning the cut sequences against the reference genome, obtaining sequences that span a first length on both sides of a potential breakpoint simultaneously, recorded as first marker sequences, and sequences that span both sides of the potential breakpoint simultaneously but over a length less than a second length, recorded as second marker sequences;
simulating, according to the position of the potential breakpoint on the first marker sequences, a breakpoint reference sequence in which the gene rearrangement has occurred;
aligning the sequences to be aligned against the breakpoint reference sequence, and marking the sequences that align to the breakpoint reference sequence and span the breakpoint on it, these being recorded as breakpoint candidate sequences supporting the breakpoint;
determining the position of the breakpoint on the breakpoint candidate sequences as the position of the candidate breakpoint.
4. The method according to claim 3, characterized in that determining the position of the breakpoint on the breakpoint candidate sequences as the position of the candidate breakpoint comprises:
correcting the breakpoint candidate sequences according to sequencing quality and the number of supporting sequences to obtain corrected candidate breakpoint sequences;
determining the position of the breakpoint in the corrected candidate breakpoint sequences as the position of the candidate breakpoint.
5. The method according to claim 3, characterized in that determining the position of the breakpoint on the breakpoint candidate sequences as the position of the candidate breakpoint comprises:
filtering false-positive breakpoint sequences out of the breakpoint candidate sequences according to the first marker sequences, the second marker sequences, and the paired sequences, among the sequences to be aligned, that support the breakpoint on the breakpoint reference sequence by spanning it, to obtain filtered candidate breakpoint sequences;
determining the position of the breakpoint in the filtered candidate breakpoint sequences as the position of the candidate breakpoint.
6. The method according to claim 3, characterized in that assembling the sequences, among the sequences to be aligned, that support the positions of the candidate breakpoints comprises:
assembling the first marker sequences, the second marker sequences, and the paired sequences, among the sequences to be aligned, that support the breakpoint on the breakpoint reference sequence by spanning it, and retaining the breakpoints in the assembly result whose sequence information is consistent with that at the positions of the candidate breakpoints, these being recorded as the breakpoints of the gene rearrangement.
7. The method according to any one of claims 1 to 6, characterized in that obtaining the sequences to be aligned from the sample to be tested comprises:
constructing a sequencing library of the sample to be tested;
performing high-throughput sequencing on the sequencing library to obtain sequencing data;
preprocessing the sequencing data to obtain the sequences to be aligned of the sample to be tested.
8. The method according to claim 7, characterized in that the sequencing library is a hybrid capture library, the hybrid capture library preferably being obtained using the capture probes of SEQ ID NO: 1 to SEQ ID NO: 36.
9. The method according to any one of claims 1 to 6, characterized in that, after the breakpoints of the gene rearrangement are obtained, the method further comprises a step of quantifying the rearranged gene, the quantification step comprising:
counting, according to the sequence information of the breakpoints of the gene rearrangement, the number of sequences among the sequences to be aligned that support the breakpoints of the gene rearrangement, recorded as the marker sequence count;
dividing the marker sequence count by the sequence count of a reference gene, the resulting ratio being the gene abundance of the rearranged gene relative to the reference gene.
10. A device for detecting gene rearrangement, characterized in that the device is configured to store or run a module, or the module is a component of the device; wherein the module is a software module, there are one or more software modules, and the software module is configured to execute the method for detecting gene rearrangement according to any one of claims 1 to 9.
11. A storage medium, characterized in that the storage medium includes a stored program, wherein the program executes the method for detecting gene rearrangement according to any one of claims 1 to 9.
12. A processor, characterized in that the processor is configured to run a program, wherein the program, when run, executes the method for detecting gene rearrangement according to any one of claims 1 to 9.
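To make two of the claimed steps concrete, the following Python sketch illustrates (a) sorting read pairs into the three abnormal-alignment classes named in claim 1 and (b) the ratio described in claim 9. It is an illustration under stated assumptions, not the patented implementation: the `PairedAlignment` record, the `max_insert` threshold of 1000 bp, and the forward/reverse orientation convention for a normal pair are all hypothetical choices that the claims do not specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PairedAlignment:
    """Hypothetical, simplified alignment record for one read pair."""
    chrom1: Optional[str]  # None means the read did not align to the reference
    pos1: int
    forward1: bool
    chrom2: Optional[str]
    pos2: int
    forward2: bool


def classify_pair(pair, max_insert=1000):
    """Return 'unaligned', 'abnormal_position', 'abnormal_orientation' or 'normal'."""
    if pair.chrom1 is None or pair.chrom2 is None:
        return "unaligned"
    # Assumption: mates on different chromosomes, or farther apart than
    # max_insert, count as aligned at abnormal positions.
    if pair.chrom1 != pair.chrom2 or abs(pair.pos2 - pair.pos1) > max_insert:
        return "abnormal_position"
    # Assumption: in a normal pair the leftmost read is forward and the
    # rightmost read is reverse.
    if pair.pos1 <= pair.pos2:
        left_forward, right_forward = pair.forward1, pair.forward2
    else:
        left_forward, right_forward = pair.forward2, pair.forward1
    if not (left_forward and not right_forward):
        return "abnormal_orientation"
    return "normal"


def relative_abundance(marker_count, reference_count):
    """Marker sequence count divided by the reference gene's sequence count."""
    if reference_count <= 0:
        raise ValueError("reference_count must be positive")
    return marker_count / reference_count


# Example: a pair whose mates map to different chromosomes.
pair = PairedAlignment("chr2", 100, True, "chr5", 400, False)
print(classify_pair(pair), relative_abundance(25, 500))
```

Only pairs classified as anything other than 'normal' would feed the candidate-breakpoint step; the later cutting, realignment, and assembly steps of claims 2 to 6 are not covered by this sketch.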
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811643484.6A CN109712672B (en) | 2018-12-29 | 2018-12-29 | Method, device, storage medium and processor for detecting gene rearrangement |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109712672A (en) | 2019-05-03 |
CN109712672B (en) | 2021-05-25 |
Family ID: 66260266
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811643484.6A Active CN109712672B (en) | 2018-12-29 | 2018-12-29 | Method, device, storage medium and processor for detecting gene rearrangement |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109712672B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110942807A (en) * | 2019-11-20 | 2020-03-31 | 北京橡鑫生物科技有限公司 | Method and apparatus for detecting gene rearrangement |
CN111081318A (en) * | 2019-12-06 | 2020-04-28 | 人和未来生物科技(长沙)有限公司 | Fusion gene detection method, system and medium |
CN111524548A (en) * | 2020-07-03 | 2020-08-11 | 至本医疗科技(上海)有限公司 | Method, computing device, and computer storage medium for detecting IGH reordering |
CN114694753A (en) * | 2022-03-18 | 2022-07-01 | 深圳华大医学检验实验室 | Nucleic acid sequence comparison method, device, equipment and readable storage medium |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104298892A (en) * | 2014-09-18 | 2015-01-21 | 天津诺禾致源生物信息科技有限公司 | Detection device and method for gene fusion |
CN104794371A (en) * | 2015-04-29 | 2015-07-22 | 深圳华大基因研究院 | Method and device for detecting insertion polymorphism of retrotransposon |
CN105339506A (en) * | 2013-03-15 | 2016-02-17 | 基因组影像公司 | Methods for the detection of breakpoints in rearranged genomic sequences |
US20170199961A1 (en) * | 2015-12-16 | 2017-07-13 | Gritstone Oncology, Inc. | Neoantigen Identification, Manufacture, and Use |
CN106951732A (en) * | 2010-05-25 | 2017-07-14 | 加利福尼亚大学董事会 | BAMBAM:The parallel comparative analysis of high-flux sequence data |
CN107480472A (en) * | 2017-07-21 | 2017-12-15 | 广州漫瑞生物信息技术有限公司 | The detection method and device of a kind of Gene Fusion |
CN108256295A (en) * | 2016-12-29 | 2018-07-06 | 安诺优达基因科技(北京)有限公司 | A kind of device for being used to detect Gene Fusion |
CN108830044A (en) * | 2018-06-05 | 2018-11-16 | 上海鲸舟基因科技有限公司 | For detecting the detection method and device of cancer sample Gene Fusion |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110942807A (en) * | 2019-11-20 | 2020-03-31 | 北京橡鑫生物科技有限公司 | Method and apparatus for detecting gene rearrangement |
CN111081318A (en) * | 2019-12-06 | 2020-04-28 | 人和未来生物科技(长沙)有限公司 | Fusion gene detection method, system and medium |
CN111524548A (en) * | 2020-07-03 | 2020-08-11 | 至本医疗科技(上海)有限公司 | Method, computing device, and computer storage medium for detecting IGH reordering |
CN111524548B (en) * | 2020-07-03 | 2020-10-23 | 至本医疗科技(上海)有限公司 | Method, computing device, and computer storage medium for detecting IGH reordering |
CN114694753A (en) * | 2022-03-18 | 2022-07-01 | 深圳华大医学检验实验室 | Nucleic acid sequence comparison method, device, equipment and readable storage medium |
CN114694753B (en) * | 2022-03-18 | 2023-04-07 | 深圳华大医学检验实验室 | Nucleic acid sequence comparison method, device, equipment and readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN109712672B (en) | 2021-05-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102770558B (en) | The analysis of Fetal genome is carried out by maternal biological sample | |
CN110870016B (en) | Verification method and system for sequence variant exhalations | |
CN108491689B (en) | Tumour neoantigen identification method based on transcript profile | |
Cosart et al. | Exome-wide DNA capture and next generation sequencing in domestic and wild species | |
CN109712672A (en) | Detect method, apparatus, storage medium and the processor of gene rearrangement | |
Tigano et al. | Chromosome-level assembly of the Atlantic silverside genome reveals extreme levels of sequence diversity and structural genetic variation | |
CN111534602A (en) | Method for analyzing human blood type and genotype based on high-throughput sequencing and application thereof | |
Meusnier et al. | Polymerase chain reaction–single strand conformation polymorphism analyses of nuclear and chloroplast DNA provide evidence for recombination, multiple introductions and nascent speciation in the Caulerpa taxifolia complex | |
CN109584957B (en) | Detection kit for capturing α thalassemia related gene copy number | |
CN107460254A (en) | A kind of method based on pig LINE1 transposons insertion polymorphism research and development New molecular marker | |
CN107312861A (en) | A kind of B ALL patients prognosis risk assessment label | |
CN108474028A (en) | Differentiate and distinguish the system and method for genetic material | |
Smirnova et al. | The use of non-functional clonotypes as a natural calibrator for quantitative bias correction in adaptive immune receptor repertoire profiling | |
US20030194711A1 (en) | System and method for analyzing gene expression data | |
CN109949866A (en) | Detection method, device, computer equipment and the storage medium of pathogen operational group | |
CN111276189B (en) | Chromosome balance translocation detection and analysis system based on NGS and application thereof | |
CN116515955B (en) | Multi-gene targeting typing method | |
KR101815529B1 (en) | Human Haplotyping System And Method | |
CN112442528B (en) | LOXHD1 gene mutant and application thereof | |
CN105779463B (en) | VPS13B gene mutation body and its application | |
WO2003050748A2 (en) | Genetic analysis of gene expression in heterosis | |
Sung et al. | Reduced-Cost Genotyping by Resequencing in Peanut Breeding Programs Using Tecan Allegro Targeted Resequencing V2 | |
US20190373871A1 (en) | Method for assaying genetic variants | |
CN114875161A (en) | Molecular marker related to low-temperature tolerance of chicken, primer combination and corresponding breeding method | |
CN117198399A (en) | Microsatellite locus, system and kit for predicting MSI state |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||