CN109712672A - Method, apparatus, storage medium and processor for detecting gene rearrangement

Info

Publication number: CN109712672A
Application number: CN201811643484.6A
Authority: CN
Inventors: 王彬安; 刘洋洋; 李富威; 王建伟; 伍启熹; 刘倩; 刘珂弟; 唐宇
Original assignee: Beijing You Xun Medical Laboratory Laboratory Co Ltd
Current assignee: Beijing You Xun Medical Laboratory Laboratory Co Ltd
Priority date: 2018-12-29
Filing date: 2018-12-29
Publication date: 2019-05-03
Anticipated expiration: 2038-12-29
Also published as: CN109712672B

Abstract

The present invention provides a method, an apparatus, a storage medium and a processor for detecting gene rearrangement. The method comprises: obtaining sequences to be aligned from a sample to be tested; aligning the sequences to be aligned to a reference genome to obtain abnormally aligned sequences, which include sequences aligned to abnormal positions, sequences aligned in abnormal orientations, and sequences that fail to align to the reference genome; determining the positions of candidate breakpoints according to the alignment positions and alignment orientations of the abnormally aligned sequences on the reference genome; and assembling those sequences to be aligned that support the candidate breakpoint positions, retaining the breakpoints in the assembly result whose sequence information is consistent with the candidate breakpoint positions, and recording them as the breakpoints of the gene rearrangement. The application solves the problem that the prior art has difficulty detecting the breakpoint positions at which gene rearrangement occurs.

Description

Method, apparatus, storage medium and processor for detecting gene rearrangement

Technical field

The present invention relates to the field of genetic variation detection, and in particular to a method, an apparatus, a storage medium and a processor for detecting gene rearrangement.

Background

The prior art generally uses RT-nested PCR to detect gene rearrangement. Specific probes are prepared based on a known target gene sequence and used to detect the rearrangement. A nested PCR reaction involves two rounds of PCR amplification, which reduces the likelihood of amplifying multiple target sites (because few templates are complementary to both sets of primers) and increases the sensitivity of detection; in addition, two pairs of PCR primers must pair with the template, which increases the reliability of detection. Because the second set of primers lies within the first-round PCR product, and a non-target fragment is very unlikely to contain binding sites for both sets of primers, the second set of primers is unable to amplify non-target fragments. Nested PCR therefore ensures that the second-round PCR product is almost or completely free of contamination from non-specific amplification caused by weakly specific primer pairing.

However, detecting gene rearrangement by RT-nested PCR has the following disadvantages: 1) the structure of the gene rearrangement cannot be determined accurately; 2) detection is limited by the primers and probes, so unknown rearrangement types cannot be detected; 3) detailed sequence information for the break-and-join region of the rearranged gene cannot be obtained.

Therefore, the existing detection methods need to be improved.

Summary of the invention

The main purpose of the present invention is to provide a method, an apparatus, a storage medium and a processor for detecting gene rearrangement, so as to solve the problem that the prior art has difficulty detecting the breakpoint positions at which gene rearrangement occurs.

To achieve the above purpose, according to one aspect of the present invention, a method for detecting gene rearrangement is provided. The method comprises: obtaining sequences to be aligned from a sample to be tested; aligning the sequences to be aligned to a reference genome to obtain abnormally aligned sequences, which include sequences aligned to abnormal positions, sequences aligned in abnormal orientations, and sequences that fail to align to the reference genome; determining the positions of candidate breakpoints according to the alignment positions and alignment orientations of the abnormally aligned sequences on the reference genome; and assembling those sequences to be aligned that support the candidate breakpoint positions, retaining the breakpoints in the assembly result whose sequence information is consistent with the candidate breakpoint positions, and recording them as the breakpoints of the gene rearrangement.

Further, determining the positions of candidate breakpoints according to the alignment positions and alignment orientations of the abnormally aligned sequences on the reference genome comprises: splitting the abnormally aligned sequences and realigning them to the reference genome, and determining the candidate breakpoint positions according to the alignment positions and alignment orientations of the split abnormally aligned sequences on the reference genome.

Further, determining the positions of candidate breakpoints according to the alignment positions and alignment orientations of the abnormally aligned sequences on the reference genome comprises: splitting the abnormally aligned sequences and realigning them to the reference genome; obtaining sequences that span a first length on both sides of a potential breakpoint, recorded as first flag sequences, and sequences that span both sides of the potential breakpoint but with a length less than a second length, recorded as second flag sequences; simulating, according to the position of the potential breakpoint on the first flag sequences, a breakpoint reference sequence in which the gene rearrangement has occurred; aligning the sequences to be aligned to the breakpoint reference sequence, and marking the sequences that align to the breakpoint reference sequence and span the breakpoint on it as breakpoint candidate sequences supporting the breakpoint; and determining the position of the breakpoint on the breakpoint candidate sequences as the candidate breakpoint position.

Further, determining the position of the breakpoint on the breakpoint candidate sequences as the candidate breakpoint position comprises: correcting the breakpoint candidate sequences according to sequencing quality and the number of supporting sequences to obtain corrected candidate breakpoint sequences; and determining the position of the breakpoint in the corrected candidate breakpoint sequences as the candidate breakpoint position.

Further, determining the position of the breakpoint on the breakpoint candidate sequences as the candidate breakpoint position comprises: filtering out false-positive breakpoint sequences from the breakpoint candidate sequences according to the first flag sequences and second flag sequences that support the breakpoint on the breakpoint reference sequence and the paired sequences, among the sequences to be aligned, that support spanning the breakpoint on the breakpoint reference sequence, to obtain filtered candidate breakpoint sequences; and determining the position of the breakpoint in the filtered candidate breakpoint sequences as the candidate breakpoint position.

Further, assembling those sequences to be aligned that support the candidate breakpoint positions comprises: performing assembly using the first flag sequences and second flag sequences that support the breakpoint on the breakpoint reference sequence and the paired sequences, among the sequences to be aligned, that support spanning the breakpoint on the breakpoint reference sequence; retaining the breakpoints in the assembly result whose sequence information is consistent with the candidate breakpoint positions; and recording them as the breakpoints of the gene rearrangement.

Further, obtaining the sequences to be aligned of the sample to be tested comprises: constructing a sequencing library of the sample to be tested; performing high-throughput sequencing on the sequencing library to obtain sequencing data; and preprocessing the sequencing data to obtain the sequences to be aligned of the sample to be tested.

Further, the sequencing library is a hybrid capture library, preferably obtained using the capture probes of SEQ ID NO:1 to SEQ ID NO:36.

Further, after the breakpoints of the gene rearrangement are obtained, the method further comprises a step of quantifying the rearranged gene. The quantification step comprises: counting, according to the sequence information of the breakpoints of the gene rearrangement, the number of sequences among the sequences to be aligned that support the breakpoints of the gene rearrangement, recorded as the marker sequence number; and dividing the marker sequence number by the number of sequences of an internal reference gene, the resulting ratio being the abundance of the rearranged gene relative to the internal reference gene.

To achieve the above purpose, according to a second aspect of the present invention, an apparatus for detecting gene rearrangement is provided. The apparatus is used to store or run modules, or the modules are components of the apparatus; the modules are one or more software modules, and the software modules are used to execute any of the above methods for detecting gene rearrangement.

According to a third aspect of the present invention, a storage medium is provided. The storage medium includes a stored program, and the program executes any of the above methods for detecting gene rearrangement.

According to a fourth aspect of the present invention, a processor is provided. The processor is used to run a program, and the program, when run, executes any of the above methods for detecting gene rearrangement.

With the technical solution of the present invention, the position of a gene rearrangement is detected from high-throughput sequencing data. The sequences among the sequences to be aligned that align abnormally to the reference genome are used to determine the candidate breakpoint positions of the rearrangement, and the sequence assembled from the sequences to be aligned is then used to verify which candidate breakpoint positions are reliable. The breakpoint positions of the gene rearrangement can thus be detected accurately and, correspondingly, the sequence information at the breakpoint positions is also known accurately, which provides a basis for further verifying the breakpoint positions by conventional PCR. The method of the present application can therefore not only detect known or unknown rearrangement types, but also accurately detect the specific position at which the rearrangement occurs and the corresponding sequence information. The method uses NGS sequencing data directly and is developed on the basis of statistics and algorithms, so it adds no extra experimental testing cost. In addition, the method has high detection accuracy and low cost, and is suitable for detecting structural rearrangements of low-abundance genes.

Brief description of the drawings

The accompanying drawings, which form a part of this application, are provided for further understanding of the present invention. The illustrative embodiments of the present invention and their descriptions are used to explain the present invention and do not unduly limit it. In the drawings:

Fig. 1 is a simplified flow diagram of detecting the breakpoint positions of a gene rearrangement in a preferred embodiment of the present invention;

Fig. 2 is a detailed flow diagram of detecting the breakpoint positions of a gene rearrangement in another preferred embodiment of the present invention; and

Fig. 3 and Fig. 4 show the sequencing results obtained when the breakpoint position detected by the method of Embodiment 1 of the present invention was verified by first-generation PCR sequencing, where Fig. 3 shows the sequencing result for the forward primer and Fig. 4 shows the sequencing result for the reverse primer.

Detailed description of the embodiments

It should be noted that, in the absence of conflict, the embodiments of the present application and the features in the embodiments can be combined with one another. The present invention is described in detail below with reference to the embodiments.

As mentioned in the Background, when the prior art detects a rearranged gene it can only determine that a rearrangement has occurred and cannot accurately determine the position at which the rearrangement occurs. To improve this situation, a typical embodiment of the present application provides a method for detecting gene rearrangement. The method comprises: obtaining sequences to be aligned from a sample to be tested; aligning the sequences to be aligned to a reference genome to obtain abnormally aligned sequences, which include sequences aligned to abnormal positions, sequences aligned in abnormal orientations, and sequences that fail to align to the reference genome; determining the positions of candidate breakpoints according to the alignment positions and alignment orientations of the abnormally aligned sequences on the reference genome; and assembling those sequences to be aligned that support the candidate breakpoint positions, retaining the breakpoints in the assembly result whose sequence information is consistent with the candidate breakpoint positions, and recording them as the breakpoints of the gene rearrangement.

The above method for detecting gene rearrangement detects the position of a gene rearrangement from high-throughput sequencing data. The sequences among the sequences to be aligned that align abnormally to the reference genome are used to determine the candidate breakpoint positions of the rearrangement, and the sequence assembled from the sequences to be aligned is then used to verify which candidate breakpoint positions are reliable. The breakpoint positions of the gene rearrangement can thus be detected accurately and, correspondingly, the sequence information at the breakpoint positions is also known accurately, which provides a basis for further verifying the breakpoint positions by conventional PCR. The method of the present application can therefore not only detect known or unknown rearrangement types, but also accurately detect the specific position at which the rearrangement occurs and the corresponding sequence information. The method uses NGS sequencing data directly and is developed on the basis of statistics and algorithms, so it adds no extra experimental testing cost. In addition, the method has high detection accuracy and low cost, and is suitable for detecting structural rearrangements of low-abundance genes.

It should be noted that the sequences to be aligned of the sample to be tested may be obtained by processing the raw sequencing data of the sample, or may be existing, ready-made sequences that can be used directly for alignment. By additionally verifying the candidate breakpoint positions against the assembled sequences, the above method makes the breakpoint positions more accurate.

Among the sequences to be aligned, some align to the reference genome, while others cannot be aligned directly to the reference genome because a gene rearrangement has occurred; the latter are referred to as abnormally aligned sequences. The abnormally aligned sequences include sequences aligned to abnormal positions (such as forward tandem repeats), sequences aligned in abnormal orientations (such as inverted tandem repeats), and sequences that fail to align to the reference genome (such as inserted or deleted sequences). Potential breakpoint positions can be determined from the alignment positions and alignment orientations of these abnormally aligned sequences on the reference genome using existing methods. For example, when sequences align to abnormal positions on the same chromosome and an inversion has occurred between them, the potential breakpoint position is determined from the abnormal alignment orientation. Alternatively, when sequences align to positions on different chromosomes, indicating a translocation, the potential breakpoint position is likewise determined from the alignment orientation.
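The three categories of abnormally aligned sequences described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Alignment` record, the `classify_pair` function, the use of read pairs, and the `max_insert` threshold of 1000 bp are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alignment:
    """Hypothetical, minimal record for one aligned read pair."""
    chrom: Optional[str]        # None if the read failed to align
    pos: int
    strand: str                 # '+' or '-'
    mate_chrom: Optional[str]   # None if the mate failed to align
    mate_pos: int
    mate_strand: str


def classify_pair(aln: Alignment, max_insert: int = 1000) -> str:
    """Label a read pair as normal or as one of three abnormal categories.

    max_insert is an assumed cutoff, not a value taken from the patent.
    """
    if aln.chrom is None or aln.mate_chrom is None:
        return "unaligned"
    if aln.chrom != aln.mate_chrom:
        # Mates on different chromosomes: possible translocation.
        return "abnormal_position"
    if abs(aln.mate_pos - aln.pos) > max_insert:
        # Mates too far apart on the same chromosome.
        return "abnormal_position"
    if aln.strand == aln.mate_strand:
        # Mates on the same strand: possible inversion.
        return "abnormal_orientation"
    return "normal"
```

In practice the alignment records would come from an aligner's output; the sketch only shows how position and orientation together separate the categories.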

In certain preferred embodiments, determining the positions of candidate breakpoints according to the alignment positions and alignment orientations of the abnormally aligned sequences on the reference genome comprises: splitting the abnormally aligned sequences and realigning them to the reference genome, and determining the candidate breakpoint positions according to the alignment positions and alignment orientations of the split abnormally aligned sequences on the reference genome.

Specifically, existing software for split alignment includes bwa, hisat2 and STAR. These tools use a relatively permissive alignment method that aligns each split segment to its possible positions on the reference genome, so that the final alignment position and alignment orientation can be determined.

In some preferred embodiments, determining the positions of candidate breakpoints according to the alignment positions and alignment orientations of the abnormally aligned sequences on the reference genome comprises: splitting the abnormally aligned sequences and realigning them to the reference genome; obtaining sequences that span a first length on both sides of a potential breakpoint, recorded as first flag sequences, and sequences that span both sides of the potential breakpoint but with a length less than a second length, recorded as second flag sequences; simulating, according to the position of the potential breakpoint on the first flag sequences, a breakpoint reference sequence in which the gene rearrangement has occurred; aligning the sequences to be aligned to the breakpoint reference sequence, and marking the sequences that align to the breakpoint reference sequence and span the breakpoint on it as breakpoint candidate sequences supporting the breakpoint; and determining the position of the breakpoint on the breakpoint candidate sequences as the candidate breakpoint position.

Paired-end sequencing data contain sequences read in two directions. Considering a single read, if it is split into two or three segments that are realigned to the reference genome, and each segment aligns to a different position or orientation on the reference genome, the potential breakpoint position of the gene rearrangement can be inferred from the position of the split. Dividing these sequences into first flag sequences and second flag sequences, constructing a simulated breakpoint reference sequence from them, and aligning again helps recover more sequences that potentially span the breakpoint, as well as normally aligned paired sequences that support spanning the breakpoint. The sequences that support the breakpoint position on the breakpoint reference sequence are then taken as candidate breakpoint sequences, so that the screened candidate breakpoints are relatively accurate. The first length that a first flag sequence spans on both sides of the potential breakpoint can reasonably be set to 20 to 25 bp, depending on the sequence length. Likewise, the second length that bounds a second flag sequence can reasonably be set to 10 to 20 bp, depending on the sequence length.
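The flag-sequence division and the simulated breakpoint reference sequence can be sketched as follows. This is an illustrative reading of the paragraph above, not the patented implementation: the function names, the decision to compare the shorter flank against the thresholds, the default of 20 bp for both lengths, and the `flank` parameter are assumptions made here.

```python
from typing import Tuple


def classify_split_read(left_len: int, right_len: int,
                        first_len: int = 20, second_len: int = 20) -> str:
    """Classify a split read by its shorter flank around the potential breakpoint.

    Assumed interpretation: a first flag sequence covers at least first_len
    bases on both sides; a second flag sequence spans both sides but its
    shorter flank is below second_len.
    """
    shorter = min(left_len, right_len)
    if shorter <= 0:
        return "not_spanning"
    if shorter >= first_len:
        return "first_flag"
    if shorter < second_len:
        return "second_flag"
    return "unclassified"


def simulate_breakpoint_reference(left_ref: str, right_ref: str,
                                  flank: int = 100) -> Tuple[str, int]:
    """Join the reference bases upstream and downstream of the two breakpoint sides.

    Returns the simulated sequence and the index of the junction within it.
    flank must be positive.
    """
    if flank <= 0:
        raise ValueError("flank must be positive")
    left = left_ref[-flank:]    # last `flank` bases before the left break
    right = right_ref[:flank]   # first `flank` bases after the right break
    return left + right, len(left)
```

All sequences to be aligned would then be aligned against the simulated sequence, and those crossing the returned junction index would be kept as breakpoint candidate sequences.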

To further improve the accuracy of the breakpoint positions, the candidate breakpoints can additionally be corrected and filtered for false positives according to the sequencing depth and sequencing strategy of the sample's sequencing data, so as to retain the breakpoint positions that are more likely to be real.

In certain preferred embodiments, determining the position of the breakpoint on the breakpoint candidate sequences as the candidate breakpoint position comprises: correcting the breakpoint candidate sequences according to sequencing quality and the number of supporting sequences to obtain corrected candidate breakpoint sequences; and determining the position of the breakpoint in the corrected candidate breakpoint sequences as the candidate breakpoint position.

Specifically, for example, when the mean sequencing depth reaches 1000× and the sequences spanning a breakpoint amount to at least 2% of the mean depth, that is, 20× or more, breakpoint base correction can be performed using the position of each alignment relative to the breakpoint on the simulated reference sequence and the base quality of the aligned bases. Breakpoints supported by fewer than 20× spanning sequences have a higher false-positive rate and are usually removed.
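The depth rule in this example (2% of a 1000× mean depth gives a 20× threshold) can be written out as a short Python sketch. The function names are hypothetical, and the quality-weighted vote in `consensus_base` is only one assumed way to use base quality for correction; the patent does not specify the correction procedure.

```python
from collections import defaultdict
from typing import Sequence


def support_threshold(mean_depth: float, percent: float = 2.0) -> float:
    """Minimum number of breakpoint-spanning sequences, as a percentage of mean depth."""
    return mean_depth * percent / 100


def passes_depth_rule(spanning_reads: int, mean_depth: float,
                      percent: float = 2.0) -> bool:
    """True if the breakpoint has enough spanning support to be corrected and kept."""
    return spanning_reads >= support_threshold(mean_depth, percent)


def consensus_base(bases: Sequence[str], quals: Sequence[int]) -> str:
    """Pick the base with the highest summed quality (assumed correction scheme)."""
    weights = defaultdict(int)
    for base, qual in zip(bases, quals):
        weights[base] += qual
    return max(weights, key=weights.get)
```

With a mean depth of 1000×, `passes_depth_rule` keeps a breakpoint supported by 20 or more spanning sequences and rejects one supported by 19.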

In certain preferred embodiments, determining the position of the breakpoint on the breakpoint candidate sequences as the candidate breakpoint position comprises: filtering out false-positive breakpoint sequences from the breakpoint candidate sequences according to the first flag sequences and second flag sequences that support the breakpoint on the breakpoint reference sequence and the paired sequences, among the sequences to be aligned, that support spanning the breakpoint on the breakpoint reference sequence, to obtain filtered candidate breakpoint sequences; and determining the position of the breakpoint in the filtered candidate breakpoint sequences as the candidate breakpoint position.

Specifically, for example, breakpoints are retained when the number of first flag sequences is greater than 10 and the number of paired sequences, among the sequences to be aligned, that support spanning the breakpoint on the breakpoint reference sequence is greater than 50. The specific values can of course be adjusted appropriately for different sequencing samples; the values given here are merely illustrative.
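The illustrative thresholds above translate into a simple filter. The sketch below assumes a hypothetical `BreakpointSupport` record that holds the per-breakpoint counts; the second flag sequence count is carried along but not thresholded, since the example gives no cutoff for it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BreakpointSupport:
    """Hypothetical per-breakpoint support counts."""
    name: str
    first_flag_reads: int
    second_flag_reads: int   # recorded, but no cutoff is given in the text
    spanning_pairs: int


def filter_candidates(candidates: List[BreakpointSupport],
                      min_first_flag: int = 10,
                      min_pairs: int = 50) -> List[BreakpointSupport]:
    """Keep breakpoints whose support strictly exceeds both example thresholds."""
    return [
        c for c in candidates
        if c.first_flag_reads > min_first_flag and c.spanning_pairs > min_pairs
    ]
```

Both thresholds are exposed as parameters because the text notes that they should be adjusted for different sequencing samples.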

In certain preferred embodiments, assembling those sequences to be aligned that support the candidate breakpoint positions comprises: performing assembly using the first flag sequences and second flag sequences that support the breakpoint on the breakpoint reference sequence and the paired sequences, among the sequences to be aligned, that support spanning the breakpoint on the breakpoint reference sequence; retaining the breakpoints in the assembly result whose sequence information is consistent with the candidate breakpoint positions; and recording them as the breakpoints of the gene rearrangement.

Sequence assembly is performed using the first flag sequences, the second flag sequences and the paired sequences that support the breakpoint. The sequence formed by this de novo assembly is used to verify the candidate breakpoint positions again, so that the finally determined breakpoint positions of the gene rearrangement are more accurate.
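The verification idea can be illustrated with a toy example: assemble the supporting sequences and check that the junction of the candidate breakpoint appears in the result. The greedy suffix-prefix merging below is a deliberately simple stand-in chosen for this sketch; the patent does not name an assembly algorithm, and a real pipeline would use a dedicated assembler. All function names and parameters are assumptions.

```python
from typing import List


def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of a equal to a prefix of b (at least min_len), else 0."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: List[str], min_len: int = 4) -> List[str]:
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]


def junction_confirmed(contigs: List[str], bp_ref: str,
                       bp_index: int, flank: int = 5) -> bool:
    """True if any contig contains the bases flanking the candidate junction."""
    junction = bp_ref[bp_index - flank: bp_index + flank]
    return any(junction in c for c in contigs)
```

A candidate breakpoint would be retained only when `junction_confirmed` returns `True` for the contigs assembled from its supporting sequences. Note that this toy assembler does not merge a sequence that is fully contained within another.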

As stated above, the sequences to be aligned of the sample to be tested in the present application may be existing sequences that can be used directly for alignment, or may be sequences to be aligned that are obtained by processing the raw data produced by sequencing. In certain preferred embodiments, obtaining the sequences to be aligned of the sample to be tested comprises: constructing a sequencing library of the sample to be tested; performing high-throughput sequencing on the sequencing library to obtain sequencing data; and preprocessing the sequencing data to obtain the sequences to be aligned of the sample to be tested.

In certain preferred embodiments, the sequencing library is a hybrid capture library, preferably obtained using the capture probes of SEQ ID NO:1 to SEQ ID NO:36. Using a hybrid capture library allows gene rearrangement detection to be performed on the sequencing data of a target gene. The capture probes of SEQ ID NO:1 to SEQ ID NO:36 can capture the full exon sequence of the MLL gene, and can therefore be used to detect the positions of exon rearrangements in that gene and the corresponding sequence information.

The above method of the present application can accurately detect the breakpoint positions of a rearrangement in a target gene. Depending on the research purpose, the sequences to be aligned of the sample to be tested can also be used to detect the expression level of the detected mutant gene. In certain preferred embodiments, after the breakpoints of the gene rearrangement are obtained, the method further comprises a step of quantifying the rearranged gene. The quantification step comprises: counting, according to the sequence information of the breakpoints of the gene rearrangement, the number of sequences among the sequences to be aligned that support the breakpoints of the gene rearrangement, recorded as the marker sequence number; and dividing the marker sequence number by the number of sequences of an internal reference gene, the resulting ratio being the abundance of the rearranged gene relative to the internal reference gene. Detecting the expression level of a rearranged gene reflects the expression of that gene under particular conditions or in a particular treatment state; in turn, detecting the expression level of the gene under a series of different conditions or states reveals differences in its expression. The internal reference gene can be selected reasonably according to actual needs. For example, when the gene being detected is the MLL gene, the ABL1 gene can usually be chosen as the internal reference gene.
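The quantification step reduces to a single ratio. A minimal Python sketch follows; the function name and the error handling for a zero reference count are assumptions made here.

```python
def relative_abundance(marker_sequence_number: int,
                       reference_gene_sequence_number: int) -> float:
    """Abundance of the rearranged gene relative to the internal reference gene.

    marker_sequence_number: sequences supporting the rearrangement breakpoint.
    reference_gene_sequence_number: sequences assigned to the internal
    reference gene (for example ABL1 when the detected gene is MLL).
    """
    if reference_gene_sequence_number <= 0:
        raise ValueError("reference gene sequence number must be positive")
    return marker_sequence_number / reference_gene_sequence_number
```

For example, 50 breakpoint-supporting sequences against 200 reference gene sequences gives a relative abundance of 0.25.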

It should be noted that, for simplicity of description, each of the foregoing method embodiments is described as a series of combined actions. Those skilled in the art should understand, however, that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions involved are not necessarily required by the present invention.

From the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform, or by hardware alone, although in many cases the former is the preferred implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied as a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk or an optical disc) and includes instructions that cause a computing device or a processor to execute the methods described in the embodiments of the present invention.

In a second typical embodiment of the present application, an apparatus for detecting gene rearrangement is provided. The apparatus is used to store or run modules, or the modules are components of the apparatus; the modules are one or more software modules, and the software modules are used to execute any of the above methods. The apparatus can not only detect the breakpoint positions of a gene rearrangement more accurately, but also obtain the sequence information corresponding to the breakpoint positions, which in turn makes it convenient to detect the relative expression level from that sequence information. Its practicality and scope of application are broad: any mutant gene that exhibits gene rearrangement can be detected using the apparatus.

Preferably, the apparatus includes an obtaining module, an alignment module, a candidate module and an assembly determining module. The obtaining module is used to obtain the sequences to be aligned of the sample to be tested. The alignment module is used to align the sequences to be aligned to a reference genome to obtain abnormally aligned sequences, which include sequences aligned to abnormal positions, sequences aligned in abnormal orientations, and sequences that fail to align to the reference genome. The candidate module is used to determine the positions of candidate breakpoints according to the alignment positions and alignment orientations of the abnormally aligned sequences on the reference genome. The assembly determining module is used to assemble those sequences to be aligned that support the candidate breakpoint positions, retain the breakpoints in the assembly result whose sequence information is consistent with the candidate breakpoint positions, and record them as the breakpoints of the gene rearrangement.

In a preferred embodiment, the candidate module includes a split alignment module and a candidate determining module. The split alignment module is used to split the abnormally aligned sequences and realign them to the reference genome. The candidate determining module is used to determine the candidate breakpoint positions according to the alignment positions and alignment orientations of the split abnormally aligned sequences on the reference genome.

In a preferred embodiment, the candidate module includes a split marking module, a simulation module, an alignment marking module and a candidate breakpoint module. The split marking module is used to split the abnormally aligned sequences and realign them to the reference genome, and to obtain sequences that span a first length on both sides of a potential breakpoint, recorded as first flag sequences, and sequences that span both sides of the potential breakpoint but with a length less than a second length, recorded as second flag sequences. The simulation module is used to simulate, according to the position of the potential breakpoint on the first flag sequences, a breakpoint reference sequence in which the gene rearrangement has occurred. The alignment marking module is used to align the sequences to be aligned to the breakpoint reference sequence, and to mark the sequences that align to the breakpoint reference sequence and span the breakpoint on it as breakpoint candidate sequences supporting the breakpoint. The candidate breakpoint module is used to determine the position of the breakpoint on the breakpoint candidate sequences as the candidate breakpoint position.

In a preferred embodiment, the candidate breakpoint module includes a breakpoint correction module and a correction determining module. The breakpoint correction module is used to correct the breakpoint candidate sequences according to sequencing quality and the number of supporting sequences to obtain corrected candidate breakpoint sequences. The correction determining module is used to determine the position of the breakpoint in the corrected candidate breakpoint sequences as the candidate breakpoint position.

In a preferred embodiment, the candidate breakpoint module includes a breakpoint filtering module and a filtering determining module. The breakpoint filtering module is used to filter out false-positive breakpoint sequences from the breakpoint candidate sequences according to the first flag sequences and second flag sequences that support the breakpoint on the breakpoint reference sequence and the paired sequences, among the sequences to be aligned, that support spanning the breakpoint on the breakpoint reference sequence, to obtain filtered candidate breakpoint sequences. The filtering determining module is used to determine the position of the breakpoint in the filtered candidate breakpoint sequences as the candidate breakpoint position.

In a preferred embodiment, the assembly determining module includes an assembly submodule and a retention module. The assembly submodule is used to perform assembly using the first flag sequences and second flag sequences that support the breakpoint on the breakpoint reference sequence and the paired sequences, among the sequences to be aligned, that support spanning the breakpoint on the breakpoint reference sequence. The retention module is used to retain the breakpoints in the assembly result whose sequence information is consistent with the candidate breakpoint positions and record them as the breakpoints of the gene rearrangement.

In a preferred embodiment, the acquisition module comprises a construction module, a sequencing module and a preprocessing module. The construction module constructs the sequencing library of the sample to be tested; the sequencing module performs high-throughput sequencing on the sequencing library to obtain sequencing data; the preprocessing module preprocesses the sequencing data to obtain the sequences to be aligned of the sample to be tested.

In a preferred embodiment, the sequencing library is a hybrid capture library, preferably obtained with the capture probes of SEQ ID NO:1 to SEQ ID NO:36.

In a preferred embodiment, the device further comprises a quantification module for quantifying the rearranged gene. The quantification module comprises a statistics module and an expression calculation module. The statistics module counts, according to the sequence information of the breakpoint of the gene rearrangement, the number of sequences to be aligned that support the breakpoint of the gene rearrangement, recorded as the marker sequence number. The expression calculation module divides the marker sequence number by the sequence number of the internal reference gene; the resulting ratio is the expression abundance of the rearranged gene relative to the internal reference gene.

In a third typical embodiment of the application, a storage medium is provided. The storage medium includes a stored program, wherein the program executes any of the above methods for detecting gene rearrangement.

In a fourth typical embodiment of the application, a processor is provided. The processor is configured to run a program, wherein the program, when run, executes any of the above methods for detecting gene rearrangement.

The above storage medium, processor and device allow a computer to execute the above method for detecting gene rearrangement and to output the corresponding detection results. These products detect gene rearrangement without adding any extra experimental or sequencing cost, and the device has a low detection cost and high accuracy.

In the several embodiments provided in the application, it should be understood that the disclosed technical content may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection between units or modules through some interfaces, and may be electrical or take other forms.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.

If the integrated unit is implemented in the form of a software functional unit and is sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computing device (which may be a personal computer, a server, a network device or the like) to execute all or some of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), a removable hard disk, a magnetic disk or an optical disc.

The beneficial effects of the application are further illustrated below with reference to specific embodiments.

Embodiment 1: Method for detecting MLL-PTD gene rearrangement

1. Samples and data

1) Bone marrow or peripheral blood is drawn from the patient and stored in a collection tube.

2) Nucleic acid is extracted from the sample, and the remaining sample is stored at -80 °C.

3) A sequencing library is constructed, and the target region is enriched by hybrid capture (the hybrid capture probes for the MLL gene correspond to the 36 exons of the gene; the specific sequences are given in Table 1 below).

4) The captured library is loaded onto the sequencer and sequenced.

Table 1:

2. Preprocessing of sequencing data

1) Data quality control

Low-quality sequences are removed: sequences containing 5 or more N bases are removed, and sequences in which the average sequencing quality of 40 consecutive nucleotides is below Q20 are also removed.
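The two quality-control rules above can be sketched in Python as follows. This is a minimal illustration, not the implementation used in the application. It assumes Phred+33 encoded quality strings, interprets the second rule as "any window of 40 consecutive bases with a mean quality below 20", and judges reads shorter than 40 bases on their overall mean; the function names `mean_window_below` and `passes_qc` are hypothetical.

```python
def mean_window_below(quals, window=40, threshold=20):
    """Return True if any run of `window` consecutive bases has a mean quality below `threshold`."""
    if len(quals) < window:
        # Assumption: reads shorter than the window are judged on their overall mean.
        return bool(quals) and sum(quals) / len(quals) < threshold
    running = sum(quals[:window])
    if running / window < threshold:
        return True
    for i in range(window, len(quals)):
        # Slide the window one base to the right.
        running += quals[i] - quals[i - window]
        if running / window < threshold:
            return True
    return False


def passes_qc(seq, qual_string, max_n=4, window=40, threshold=20, offset=33):
    """Keep a read only if it has fewer than 5 N bases and no low-quality window."""
    if seq.upper().count("N") > max_n:
        return False
    quals = [ord(ch) - offset for ch in qual_string]  # Phred+33 assumed
    return not mean_window_below(quals, window, threshold)
```

A read is kept only when `passes_qc` returns `True`; the thresholds are exposed as parameters so that the cut-offs stated in the text can be adjusted.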

2) Alignment to the MLL gene sequence

The high-quality sequences that pass quality control are aligned to the reference sequence with hisat2 for further analysis.

3. Identification of MLL-PTD

1) Principle and theoretical basis:

MLL-PTD alters the MLL gene (36 exons in total) at the molecular level: the order in which the exons are joined changes, and the rearrangement usually occurs between exon2 and exon11.

2) Identification of MLL-PTD breakpoints:

The paired sequences are aligned first. For sequence pairs whose positional relationship is abnormal, the structural variation between the pair is identified from the positions to which the sequences align. At the same time, abnormally aligned sequences are split and re-aligned with a more permissive alignment method to determine their final alignment positions and alignment directions; the breakpoint position is calculated from the alignment positions of the split sequences. As shown in Fig. 1, among the sequences that span the breakpoint, those that span the breakpoint by the first length on both sides are taken as first flag sequences, and those that span both sides of the breakpoint but by less than the second length are taken as second flag sequences. A breakpoint reference sequence in which the PTD has occurred is simulated from the breakpoint position of the flag sequences, the sequences are aligned again, and only candidate breakpoints that align well and are supported by breakpoint-spanning sequences are retained.
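The step of deriving a breakpoint position from a split alignment and labelling the read as a first or second flag sequence can be sketched as below. This is an illustration under stated assumptions: alignments are taken to be SAM-style records with a 1-based leftmost position and a CIGAR string in which the unaligned part of the read is soft-clipped (`S`); the application does not state the first and second lengths, so the defaults of 20 bases are placeholders; and the function names are hypothetical.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def clip_breakpoint(pos, cigar):
    """Return (breakpoint, aligned_len, clipped_len) for a read soft-clipped on exactly one end.

    `pos` is the 1-based leftmost aligned reference position, as in SAM.
    Returns None when the read is clipped on neither end or on both ends.
    """
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]
    if not ops:
        return None
    left_clip = ops[0][0] if ops[0][1] == "S" else 0
    right_clip = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    if (left_clip > 0) == (right_clip > 0):
        return None
    ref_len = sum(n for n, op in ops if op in "MDN=X")   # reference bases covered
    aligned = sum(n for n, op in ops if op in "MI=X")    # read bases in the aligned part
    if right_clip:
        # Breakpoint is the last aligned reference base before the clipped tail.
        return pos + ref_len - 1, aligned, right_clip
    # Breakpoint is the first aligned reference base after the clipped head.
    return pos, aligned, left_clip


def classify_split_read(aligned_len, clipped_len, first_len=20, second_len=20):
    """Label a breakpoint-spanning read; the two length thresholds are placeholder values."""
    shorter = min(aligned_len, clipped_len)
    if shorter >= first_len:
        return "first"       # spans the breakpoint by the first length on both sides
    if 0 < shorter < second_len:
        return "second"      # spans the breakpoint, but by less than the second length
    return None
```

For example, a 100-base read aligned at position 1000 with CIGAR `60M40S` yields a breakpoint at reference position 1059 and is labelled a first flag sequence under the placeholder thresholds.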

3) Breakpoint correction

Because the sequences flanking a breakpoint may be similar to each other, or may contain mutations or sequencing errors, the breakpoint is corrected according to alignment score, sequencing quality and the number of supporting sequences, as shown in Fig. 2, and the best predicted breakpoint sequence is output as the candidate breakpoint.

4) Multi-factor filtering of false positives

For each candidate breakpoint, false positives are further filtered according to the first flag sequences and second flag sequences that support the breakpoint and the paired sequences that support the breakpoint by spanning it. Then, as shown in Fig. 2, all sequences that support the breakpoint are assembled, and the breakpoints for which the assembly result is consistent with the breakpoint sequence information are retained, giving reliable MLL-PTD structural information.
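The filtering and assembly-consistency check described above can be sketched as follows. The sketch makes several assumptions that are not stated in the application: the support thresholds (`min_first`, `min_total`) are arbitrary placeholders, the assembly itself is done by an external assembler whose contigs are passed in as plain strings, and "consistent" is interpreted as the predicted junction sequence occurring in a contig on either strand. All names are hypothetical.

```python
from dataclasses import dataclass

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


@dataclass
class CandidateBreakpoint:
    name: str
    junction: str            # predicted sequence across the breakpoint
    first_flag_reads: int    # reads labelled as first flag sequences
    second_flag_reads: int   # reads labelled as second flag sequences
    spanning_pairs: int      # read pairs that span the breakpoint


def has_enough_support(c, min_first=2, min_total=5):
    """Placeholder rule: require some first flag reads and a minimum total support."""
    total = c.first_flag_reads + c.second_flag_reads + c.spanning_pairs
    return c.first_flag_reads >= min_first and total >= min_total


def confirmed_by_assembly(c, contigs):
    """True if the predicted junction occurs in any assembled contig, on either strand."""
    junction = c.junction.upper()
    rc = reverse_complement(junction)
    return any(junction in contig.upper() or rc in contig.upper() for contig in contigs)


def retain_breakpoints(candidates, contigs, **thresholds):
    """Keep candidates that pass the support filter and agree with the assembly."""
    return [c for c in candidates
            if has_enough_support(c, **thresholds) and confirmed_by_assembly(c, contigs)]
```

Separating the support filter from the assembly check keeps each criterion adjustable on its own, which matches the two-stage description in the text.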

4. Quantification of MLL-PTD

The abundance relative to the ABL1 gene is obtained as the ratio of the MLL-PTD marker sequence number to the sequencing depth of the internal reference gene ABL1.

Specifically, 122 samples were tested according to the method of the application shown in Fig. 2, and MLL-PTD variation was detected in 10 of them; the detailed results are reported in Table 2 and Table 3 below.
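The ratio described above can be written as a one-line calculation. The sketch below is only an illustration; the function name is hypothetical, and the numbers in the usage note are made-up values, not data from the application.

```python
def relative_abundance(marker_reads, reference_depth):
    """Ratio of MLL-PTD marker reads to the sequencing depth of the reference gene (ABL1)."""
    if reference_depth <= 0:
        raise ValueError("reference depth must be positive")
    return marker_reads / reference_depth
```

With hypothetical inputs of 50 marker reads and an ABL1 depth of 400, `format(relative_abundance(50, 400), ".2%")` gives `12.50%`, the same percentage form as the Ratio column used in the results tables.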

Table 2:

Table 3:

Sample number	SEQ ID NO:	Fusion sequence *
A	37	AGAGGTCTCTGATGAGTCACTTTCTTGACC@cttttcttttggtttttgttttacagggat
B	38	AGAGGTCTCTGATGAGTCACTTTCTTGACC@cttttcttttggtttttgttttacagggat
C	39	ATCTGAGCCAAAACCTAAGAATTGCTCATC@cttttcttttggtttttgttttacagggat
D	40	CATCTTCTGAGCCAGCAATTGATGACTTGT@cttttcttttggtttttgttttacagggat
E	41	ATCTGAGCCAAAACCTAAGAATTGCTCATC@cttttcttttggtttttgttttacagggat
F	42	ATCTGAGCCAAAACCTAAGAATTGCTCATC@cttttcttttggtttttgttttacagggat
G	43	ATCTGAGCCAAAACCTAAGAATTGCTCATC@cttaaagtccactctgatcctgtggactcc
H	44	ATCTGAGCCAAAACCTAAGAATTGCTCATC@ctgattctggtggtggaggctgctttttct
I	45	ATCTGAGCCAAAACCTAAGAATTGCTCATC@cttttcttttggtttttgttttacagggat
J	46	ATCTGAGCCAAAACCTAAGAATTGCTCATC@cttttcttttggtttttgttttacagggat
K	47	CATCTTCTGAGCCAGCAATTGATGACTTGT@cttttcttttggtttttgttttacagggat

* The fusion sequences in Table 3 are reverse-complement sequences. For example, for sample A (exon8->exon4), the lowercase letters represent the exon8 sequence and the uppercase letters represent the exon4 sequence.
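The relationship between a simulated breakpoint reference and the reverse-complement notation of Table 3 can be checked with a short Python sketch. It joins the end of exon 8 (SEQ ID NO:8) to the start of exon 2 (SEQ ID NO:2), both copied from the sequence listing, and converts the junction to the reported orientation; the result reproduces the entry for sample C (SEQ ID NO:39), which Table 4 lists as exon8->exon2. The 30-base flank and the `@` breakpoint marker are read off the table entries, and the function names are hypothetical; this is not presented as the code used in the application.

```python
# Exon sequences copied from the sequence listing of this document.
EXON8 = ("gtccagagcagagcaaacagaaaaaagtggctccccgcccaagtatccct"
         "gtaaaacaaaaaccaaaagaaaag")            # SEQ ID NO:8, 74 bp
EXON2 = ("gatgagcaattcttaggttttggctcagatgaagaagtcagagtgcgaag"
         "tcccacaaggtctccttcag")                # SEQ ID NO:2, 70 bp

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string, preserving letter case."""
    return seq.translate(_COMPLEMENT)[::-1]


def simulate_junction(upstream_exon, downstream_exon, flank=30):
    """Forward-strand junction: end of the upstream exon, '@', start of the downstream exon.

    The upstream side is written in lowercase and the downstream side in
    uppercase, mirroring the case convention of the table footnote.
    """
    return upstream_exon[-flank:].lower() + "@" + downstream_exon[:flank].upper()


def as_reported(junction):
    """Reverse-complement a 'left@right' junction while keeping the '@' marker."""
    left, right = junction.split("@")
    return reverse_complement(right) + "@" + reverse_complement(left)


forward = simulate_junction(EXON8, EXON2)
reported = as_reported(forward)
```

Here `forward` is `atccctgtaaaacaaaaaccaaaagaaaag@GATGAGCAATTCTTAGGTTTTGGCTCAGAT` and `reported` is `ATCTGAGCCAAAACCTAAGAATTGCTCATC@cttttcttttggtttttgttttacagggat`, which is identical to the Table 3 entry for sample C.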

2. Sample C was selected, and the detected MLL-PTD breakpoint structure was verified by Sanger sequencing.

The information on the sample verified by PCR is given in Table 4 below.

Table 4:

Sample number	MLL-PTD structure	Exon A	Exon B	Marker sequence number	Ratio
C	exon8->exon2	exon8	exon2	231	17.12%

The sequence obtained by verification is:

ATCTGAGCCAAAACCTAAGAATTGCTCATC@cttttcttttggtttttgttttacagggat

(i.e. SEQ ID NO:39).

3. A breakpoint template sequence is generated according to the breakpoint position, and primers are designed within 300 bp upstream and downstream of the breakpoint for PCR amplification.

4. The PCR product was of the expected size and gave a single bright band; the PCR product was subjected to Sanger sequencing, and the sequencing chromatogram was clean.

5. The breakpoint structure can be found in the Sanger sequencing result, and the bases flanking the breakpoint are fully consistent with the breakpoint sequence identified by the method of the application (the forward-primer sequencing result is shown in Fig. 3 and the reverse-complement sequencing result in Fig. 4).

From the above description it can be seen that the above embodiments of the present invention achieve the following technical effects: the method of the application can detect both known and unknown rearrangements, and can accurately determine the specific position at which the rearrangement occurs and the corresponding sequence information. It is widely applicable and suitable for detecting rearrangements in any gene.

The method uses NGS sequencing data directly and is developed on the basis of statistics and algorithms, so it adds no extra experimental or testing cost. In addition, the method has high detection accuracy and low cost, and is suitable for detecting low-abundance structural rearrangements of genes.

From the above description of the embodiments, those skilled in the art can clearly understand that the application can be implemented by means of software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solution of the application in essence, or the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product may be stored in a storage medium, such as ROM/RAM, a magnetic disk or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute the methods described in the embodiments of the application or in certain parts of the embodiments.

The embodiments in this specification are described in a progressive manner. For the same or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the system embodiment is substantially similar to the method embodiment, it is described relatively briefly; for the relevant details, refer to the description of the method embodiment.

The application can be used in numerous general-purpose or special-purpose computing system environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, and distributed computing environments that include any of the above systems or devices.

Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented with a general-purpose computing device. They may be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they may be implemented with program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; alternatively, they may each be made into an individual integrated circuit module, or several of the modules or steps may be made into a single integrated circuit module. The present invention is thus not limited to any specific combination of hardware and software.

The above are only preferred embodiments of the present invention and are not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Sequence table

<110>Beijing You Xun Laboratory of medical test Co., Ltd

<120>method, apparatus, storage medium and the processor of gene rearrangement are detected.

<130> PN102308YXYX

<160> 47

<170> SIPOSequenceListing 1.0

<210> 1

<211> 455

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 1

ctgcttcact tcacggggcg aacatggcgc acagctgtcg gtggcgcttc cccgcccgac 60

ccgggaccac cgggggcggc ggcggcgggg ggcgccgggg cctagggggc gccccgcggc 120

aacgcgtccc ggccctgctg cttccccccg ggcccccggt cggcggtggc ggccccgggg 180

cgcccccctc ccccccggct gtggcggccg cggcggcggc ggcgggaagc agcggggctg 240

gggttccagg gggagcggcc gccgcctcag cagcctcctc gtcgtccgcc tcgtcttcgt 300

cttcgtcatc gtcctcagcc tcttcagggc cggccctgct ccgggtgggc ccgggcttcg 360

acgcggcgct gcaggtctcg gccgccatcg gcaccaacct gcgccggttc cgggccgtgt 420

ttggggagag cggcggggga ggcggcagcg gagag 455

<210> 2

<211> 70

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 2

gatgagcaat tcttaggttt tggctcagat gaagaagtca gagtgcgaag tcccacaagg 60

tctccttcag 70

<210> 3

<211> 2654

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 3

ttaaaactag tcctcgaaaa cctcgtggga gacctagaag tggctctgac cgaaattcag 60

ctatcctctc agatccatct gtgttttccc ctctaaataa atcagagacc aaatctggag 120

ataagatcaa gaagaaagat tctaaaagta tagaaaagaa gagaggaaga cctcccacct 180

tccctggagt aaaaatcaaa ataacacatg gaaaggacat ttcagagtta ccaaagggaa 240

acaaagaaga tagcctgaaa aaaattaaaa ggacaccttc tgctacgttt cagcaagcca 300

caaagattaa aaaattaaga gcaggtaaac tctctcctct caagtctaag tttaagacag 360

ggaagcttca aataggaagg aagggggtac aaattgtacg acggagagga aggcctccat 420

caacagaaag gataaagacc ccttcgggtc tcctcattaa ttctgaactg gaaaagcccc 480

agaaagtccg gaaagacaag gaaggaacac ctccacttac aaaagaagat aagacagttg 540

tcagacaaag ccctcgaagg attaagccag ttaggattat tccttcttca aaaaggacag 600

atgcaaccat tgctaagcaa ctcttacaga gggcaaaaaa gggggctcaa aagaaaattg 660

aaaaagaagc agctcagctg cagggaagaa aggtgaagac acaggtcaaa aatattcgac 720

agttcatcat gcctgttgtc agtgctatct cctcgcggat cattaagacc cctcggcggt 780

ttatagagga tgaggattat gaccctccaa ttaaaattgc ccgattagag tctacaccga 840

atagtagatt cagtgccccg tcctgtggat cttctgaaaa atcaagtgca gcttctcagc 900

actcctctca aatgtcttca gactcctctc gatctagtag ccccagtgtt gatacctcca 960

cagactctca ggcttctgag gagattcagg tacttcctga ggagcggagc gatacccctg 1020

aagttcatcc tccactgccc atttcccagt ccccagaaaa tgagagtaat gataggagaa 1080

gcagaaggta ttcagtgtcg gagagaagtt ttggatctag aacgacgaaa aaattatcaa 1140

ctctacaaag tgccccccag cagcagacct cctcgtctcc acctccacct ctgctgactc 1200

caccgccacc actgcagcca gcctccagta tctctgacca cacaccttgg cttatgcctc 1260

caacaatccc cttagcatca ccatttttgc ctgcttccac tgctcctatg caagggaagc 1320

gaaaatctat tttgcgagaa ccgacattta ggtggacttc tttaaagcat tctaggtcag 1380

agccacaata cttttcctca gcaaagtatg ccaaagaagg tcttattcgc aaaccaatat 1440

ttgataattt ccgaccccct ccactaactc ccgaggacgt tggctttgca tctggttttt 1500

ctgcatctgg taccgctgct tcagcccgat tgttttcgcc actccattct ggaacaaggt 1560

ttgatatgca caaaaggagc cctcttctga gagctccaag atttactcca agtgaggctc 1620

actctagaat atttgagtct gtaaccttgc ctagtaatcg aacttctgct ggaacatctt 1680

cttcaggagt atccaataga aaaaggaaaa gaaaagtgtt tagtcctatt cgatctgaac 1740

caagatctcc ttctcactcc atgaggacaa gaagtggaag gcttagtagt tctgagctct 1800

cacctctcac ccccccgtct tctgtctctt cctcgttaag catttctgtt agtcctcttg 1860

ccactagtgc cttaaaccca acttttactt ttccttctca ttccctgact cagtctgggg 1920

aatctgcaga gaaaaatcag agaccaagga agcagactag tgctccggca gagccatttt 1980

catcaagtag tcctactcct ctcttccctt ggtttacccc aggctctcag actgaaagag 2040

ggagaaataa agacaaggcc cccgaggagc tgtccaaaga tcgagatgct gacaagagcg 2100

tggagaagga caagagtaga gagagagacc gggagagaga aaaggagaat aagcgggagt 2160

caaggaaaga gaaaaggaaa aagggatcag aaattcagag tagttctgct ttgtatcctg 2220

tgggtagggt ttccaaagag aaggttgttg gtgaagatgt tgccacttca tcttctgcca 2280

aaaaagcaac agggcggaag aagtcttcat cacatgattc tgggactgat attacttctg 2340

tgactcttgg ggatacaaca gctgtcaaaa ccaaaatact tataaagaaa gggagaggaa 2400

atctggaaaa aaccaacttg gacctcggcc caactgcccc atccctggag aaggagaaaa 2460

ccctctgcct ttccactcct tcatctagca ctgttaaaca ttccacttcc tccataggct 2520

ccatgttggc tcaggcagac aagcttccaa tgactgacaa gagggttgcc agcctcctaa 2580

aaaaggccaa agctcagctc tgcaagattg agaagagtaa gagtcttaaa caaaccgacc 2640

agcccaaagc acag 2654

<210> 4

<211> 178

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 4

ggtcaagaaa gtgactcatc agagacctct gtgcgaggac cccggattaa acatgtctgc 60

agaagagcag ctgttgccct tggccgaaaa cgagctgtgt ttcctgatga catgcccacc 120

ctgagtgcct taccatggga agaacgagaa aagattttgt cttccatggg gaatgatg 178

<210> 5

<211> 235

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 5

acaagtcatc aattgctggc tcagaagatg ctgaacctct tgctccaccc atcaaaccaa 60

ttaaacctgt cactagaaac aaggcacccc aggaacctcc agtaaagaaa ggacgtcgat 120

cgaggcggtg tgggcagtgt cccggctgcc aggtgcctga ggactgtggt gtttgtacta 180

attgcttaga taagcccaag tttggtggtc gcaatataaa gaagcagtgc tgcaa 235

<210> 6

<211> 65

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 6

gatgagaaaa tgtcagaatc tacaatggat gccttccaaa gcctacctgc agaagcaagc 60

taaag 65

<210> 7

<211> 378

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 7

ctgtgaaaaa gaaagagaaa aagtctaaga ccagtgaaaa gaaagacagc aaagagagca 60

gtgttgtgaa gaacgtggtg gactctagtc agaaacctac cccatcagca agagaggatc 120

ctgccccaaa gaaaagcagt agtgagcctc ctccacgaaa gcccgtcgag gaaaagagtg 180

aagaagggaa tgtctcggcc cctgggcctg aatccaaaca ggccaccact ccagcttcca 240

ggaagtcaag caagcaggtc tcccagccag cactggtcat cccgcctcag ccacctacta 300

caggaccgcc aagaaaagaa gttcccaaaa ccactcctag tgagcccaag aaaaagcagc 360

ctccaccacc agaatcag 378

<210> 8

<211> 74

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 8

gtccagagca gagcaaacag aaaaaagtgg ctccccgccc aagtatccct gtaaaacaaa 60

aaccaaaaga aaag 74

<210> 9

<211> 132

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 9

gaaaaaccac ctccggtcaa taagcaggag aatgcaggca ctttgaacat cctcagcact 60

ctctccaatg gcaatagttc taagcaaaaa attccagcag atggagtcca caggatcaga 120

gtggacttta ag 132

<210> 10

<211> 114

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 10

gaggattgtg aagcagaaaa tgtgtgggag atgggaggct taggaatctt gacttctgtt 60

cctataacac ccagggtggt ttgctttctc tgtgccagta gtgggcatgt agag 114

<210> 11

<211> 147

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 11

tttgtgtatt gccaagtctg ttgtgagccc ttccacaagt tttgtttaga ggagaacgag 60

cgccctctgg aggaccagct ggaaaattgg tgttgtcgtc gttgcaaatt ctgtcacgtt 120

tgtggaaggc aacatcaggc tacaaag 147

<210> 12

<211> 96

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 12

cagctgctgg agtgtaataa gtgccgaaac agctatcacc ctgagtgcct gggaccaaac 60

taccccacca aacccacaaa gaagaagaaa gtctgg 96

<210> 13

<211> 121

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 13

atctgtacca agtgtgttcg ctgtaagagc tgtggatcca caactccagg caaagggtgg 60

gatgcacagt ggtctcatga tttctcactg tgtcatgatt gcgccaagct ctttgctaaa 120

g 121

<210> 14

<211> 123

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 14

gaaacttctg ccctctctgt gacaaatgtt atgatgatga tgactatgag agtaagatga 60

tgcaatgtgg aaagtgtgat cgctgggtcc attccaaatg tgagaatctt tcaggtacag 120

aag 123

<210> 15

<211> 185

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 15

atgagatgta tgagattcta tctaatctgc cagaaagtgt ggcctacact tgtgtgaact 60

gtactgagcg gcaccctgca gagtggcgac tggcccttga aaaagagctg cagatttctc 120

tgaagcaagt tctgacagct ttgttgaatt ctcggactac cagccatttg ctacgctacc 180

ggcag 185

<210> 16

<211> 174

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 16

gctgccaagc ctccagactt aaatcccgag acagaggaga gtataccttc ccgcagctcc 60

cccgaaggac ctgatccacc agttcttact gaggtcagca aacaggatga tcagcagcct 120

ttagatctag aaggagtcaa gaggaagatg gaccaaggga attacacatc tgtg 174

<210> 17

<211> 111

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 17

ttggagttca gtgatgatat tgtgaagatc attcaagcag ccattaattc agatggagga 60

cagccagaaa ttaaaaaagc caacagcatg gtcaagtcct tcttcattcg g 111

<210> 18

<211> 74

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 18

caaatggaac gtgtttttcc atggttcagt gtcaaaaagt ccaggttttg ggagccaaat 60

aaagtatcaa gcaa 74

<210> 19

<211> 194

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 19

cagtgggatg ttaccaaacg cagtgcttcc accttcactt gaccataatt atgctcagtg 60

gcaggagcga gaggaaaaca gccacactga gcagcctcct ttaatgaaga aaatcattcc 120

agctcccaaa cccaaaggtc ctggagaacc agactcacca actcctctgc atcctcctac 180

accaccaatt ttga 194

<210> 20

<211> 107

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 20

gtactgatag gagtcgagaa gacagtccag agctgaaccc acccccaggc atagaagaca 60

atagacagtg tgcgttatgt ttgacttatg gtgatgacag tgctaat 107

<210> 21

<211> 138

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 21

gatgctggtc gtttactata tattggccaa aatgagtgga cacatgtaaa ttgtgctttg 60

tggtcagcgg aagtgtttga agatgatgac ggatcactaa agaatgtgca tatggctgtg 120

atcaggggca agcagctg 138

<210> 22

<211> 159

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 22

agatgtgaat tctgccaaaa gccaggagcc accgtgggtt gctgtctcac atcctgcacc 60

agcaactatc acttcatgtg ttcccgagcc aagaactgtg tctttctgga tgataaaaaa 120

gtatattgcc aacgacatcg ggatttgatc aaaggcgaa 159

<210> 23

<211> 118

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 23

gtggttcctg agaatggatt tgaagttttc agaagagtgt ttgtggactt tgaaggaatc 60

agcttgagaa ggaagtttct caatggcttg gaaccagaaa atatccacat gatgattg 118

<210> 24

<211> 79

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 24

ggtctatgac aatcgactgc ttaggaattc taaatgatct ctccgactgt gaagataagc 60

tctttcctat tggatatca 79

<210> 25

<211> 161

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 25

gtgttccagg gtatactgga gcaccacaga tgctcgcaag cgctgtgtat atacatgcaa 60

gatagtggag tgccgtcctc cagtcgtaga gccggatatc aacagcactg ttgaacatga 120

tgaaaacagg accattgccc atagtccaac atcttttaca g 161

<210> 26

<211> 186

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 26

aaagttcatc aaaagagagt caaaacacag ctgaaattat aagtcctcca tcaccagacc 60

gacctcctca ttcacaaacc tctggctcct gttattatca tgtcatctca aaggtcccca 120

ggattcgaac acccagttat tctccaacac agagatcccc tggctgtcga ccgttgcctt 180

ctgcag 186

<210> 27

<211> 4249

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 27

gaagtcctac cccaaccact catgaaatag tcacagtagg tgatccttta ctctcctctg 60

gacttcgaag cattggctcc aggcgtcaca gtacctcttc cttatcaccc cagcggtcca 120

aactccggat aatgtctcca atgagaactg ggaatactta ctctaggaat aatgtttcct 180

cagtctccac caccgggacc gctactgatc ttgaatcaag tgccaaagta gttgatcatg 240

tcttagggcc actgaattca agtactagtt tagggcaaaa cacttccacc tcttcaaatt 300

tgcaaaggac agtggttact gtaggcaata aaaacagtca cttggatgga tcttcatctt 360

cagaaatgaa gcagtccagt gcttcagact tggtgtccaa gagctcctct ttaaagggag 420

agaagaccaa agtgctgagt tccaagagct cagagggatc tgcacataat gtggcttacc 480

ctggaattcc taaactggcc ccacaggttc ataacacaac atctagagaa ctgaatgtta 540

gtaaaatcgg ctcctttgct gaaccctctt cagtgtcgtt ttcttctaaa gaggccctct 600

ccttcccaca cctccatttg agagggcaaa ggaatgatcg agaccaacac acagattcta 660

cccaatcagc aaactcctct ccagatgaag atactgaagt caaaaccttg aagctatctg 720

gaatgagcaa cagatcatcc attatcaacg aacatatggg atctagttcc agagatagga 780

gacagaaagg gaaaaaatcc tgtaaagaaa ctttcaaaga aaagcattcc agtaaatctt 840

ttttggaacc tggtcaggtg acaactggtg aggaaggaaa cttgaagcca gagtttatgg 900

atgaggtttt gactcctgag tatatgggcc aacgaccatg taacaatgtt tcttctgata 960

agattggtga taaaggcctt tctatgccag gagtccccaa agctccaccc atgcaagtag 1020

aaggatctgc caaggaatta caggcaccac ggaaacgcac agtcaaagtg acactgacac 1080

ctctaaaaat ggaaaatgag agtcaatcca aaaatgccct gaaagaaagt agtcctgctt 1140

cccctttgca aatagagtca acatctccca cagaaccaat ttcagcctct gaaaatccag 1200

gagatggtcc agtggcccaa ccaagcccca ataatacctc atgccaggat tctcaaagta 1260

acaactatca gaatcttcca gtacaggaca gaaacctaat gcttccagat ggccccaaac 1320

ctcaggagga tggctctttt aaaaggaggt atccccgtcg cagtgcccgt gcacgttcta 1380

acatgttttt tgggcttacc ccactctatg gagtaagatc ctatggtgaa gaagacattc 1440

cattctacag cagctcaact gggaagaagc gaggcaagag atcagctgaa ggacaggtgg 1500

atggggccga tgacttaagc acttcagatg aagacgactt atactattac aacttcacta 1560

gaacagtgat ttcttcaggt ggagaggaac gactggcatc ccataattta tttcgggagg 1620

aggaacagtg tgatcttcca aaaatctcac agttggatgg tgttgatgat gggacagaga 1680

gtgatactag tgtcacagcc acaacaagga aaagcagcca gattccaaaa agaaatggta 1740

aagaaaatgg aacagagaac ttaaagattg atagacctga agatgctggg gagaaagaac 1800

atgtcactaa gagttctgtt ggccacaaaa atgagccaaa gatggataac tgccattctg 1860

taagcagagt taaaacacag ggacaagatt ccttggaagc tcagctcagc tcattggagt 1920

caagccgcag agtccacaca agtaccccct ccgacaaaaa tttactggac acctataata 1980

ctgagctcct gaaatcagat tcagacaata acaacagtga tgactgtggg aatatcctgc 2040

cttcagacat tatggacttt gtactaaaga atactccatc catgcaggct ttgggtgaga 2100

gcccagagtc atcttcatca gaactcctga atcttggtga aggattgggt cttgacagta 2160

atcgtgaaaa agacatgggt ctttttgaag tattttctca gcagctgcct acaacagaac 2220

ctgtggatag tagtgtctct tcctctatct cagcagagga acagtttgag ttgcctctag 2280

agctaccatc tgatctgtct gtcttgacca cccggagtcc cactgtcccc agccagaatc 2340

ccagtagact agctgttatc tcagactcag gggagaagag agtaaccatc acagaaaaat 2400

ctgtagcctc ctctgaaagt gacccagcac tgctgagccc aggagtagat ccaactcctg 2460

aaggccacat gactcctgat cattttatcc aaggacacat ggatgcagac cacatctcta 2520

gccctccttg tggttcagta gagcaaggtc atggcaacaa tcaggattta actaggaaca 2580

gtagcacccc tggccttcag gtacctgttt ccccaactgt tcccatccag aaccagaagt 2640

atgtgcccaa ttctactgat agtcctggcc cgtctcagat ttccaatgca gctgtccaga 2700

ccactccacc ccacctgaag ccagccactg agaaactcat agttgttaac cagaacatgc 2760

agccacttta tgttctccaa actcttccaa atggagtgac ccaaaaaatc caattgacct 2820

cttctgttag ttctacaccc agtgtgatgg agacaaatac ttcagtattg ggacccatgg 2880

gaggtggtct cacccttacc acaggactaa atccaagctt gccaacttct caatctttgt 2940

tcccttctgc tagcaaagga ttgctaccca tgtctcatca ccagcactta cattccttcc 3000

ctgcagctac tcaaagtagt ttcccaccaa acatcagcaa tcctccttca ggcctgctta 3060

ttggggttca gcctcctccg gatccccaac ttttggtttc agaatccagc cagaggacag 3120

acctcagtac cacagtagcc actccatcct ctggactcaa gaaaagaccc atatctcgtc 3180

tacagacccg aaagaataaa aaacttgctc cctctagtac cccttcaaac attgcccctt 3240

ctgatgtggt ttctaatatg acattgatta acttcacacc ctcccagctt cctaatcatc 3300

caagtctgtt agatttgggg tcacttaata cttcatctca ccgaactgtc cccaacatca 3360

taaaaagatc taaatctagc atcatgtatt ttgaaccggc acccctgtta ccacagagtg 3420

tgggaggaac tgctgccaca gcggcaggca catcaacaat aagccaggat actagccacc 3480

tcacatcagg gtctgtgtct ggcttggcat ccagttcctc tgtcttgaat gttgtatcca 3540

tgcaaactac cacaacccct acaagtagtg cgtcagttcc aggacacgtc accttaacca 3600

acccaaggtt gcttggtacc ccagatattg gctcaataag caatctttta atcaaagcta 3660

gccagcagag cctggggatt caggaccagc ctgtggcttt accgccaagt tcaggaatgt 3720

ttccacaact ggggacatca cagaccccct ctactgctgc aataacagcg gcatctagca 3780

tctgtgtgct cccctccact cagactacgg gcataacagc cgcttcacct tctggggaag 3840

cagacgaaca ctatcagctt cagcatgtga accagctcct tgccagcaaa actgggattc 3900

attcttccca gcgtgatctt gattctgctt cagggcccca ggtatccaac tttacccaga 3960

cggtagacgc tcctaatagc atgggactgg agcagaacaa ggctttatcc tcagctgtgc 4020

aagccagccc cacctctcct gggggttctc catcctctcc atcttctgga cagcggtcag 4080

caagcccttc agtgccgggt cccactaaac ccaaaccaaa aaccaaacgg tttcagctgc 4140

ctctagacaa agggaatggc aagaagcaca aagtttccca tttgcggacc agttcttctg 4200

aagcacacat tccagaccaa gaaacgacat ccctgacctc aggcacagg 4249

<210> 28

<211> 81

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 28

gactccagga gcagaggctg agcagcagga tacagctagc gtggagcagt cctcccagaa 60

ggagtgtggg caacctgcag g 81

<210> 29

<211> 65

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 29

gcaagtcgct gttcttccgg aagttcaggt gacccaaaat ccagcaaatg aacaagaaag 60

tgcag 65

<210> 30

<211> 171

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 30

aacctaaaac agtggaagaa gaggaaagta atttcagctc cccactgatg ctttggcttc 60

agcaagaaca aaagcggaag gaaagcatta ctgagaaaaa acccaagaaa ggacttgttt 120

ttgaaatttc cagtgatgat ggctttcaga tctgtgcaga aagtattgaa g 171

<210> 31

<211> 75

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 31

atgcctggaa gtcattgaca gataaagtcc aggaagctcg atcaaatgcc cgcctaaagc 60

agctctcatt tgcag 75

<210> 32

<211> 175

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 32

gtgttaacgg tttgaggatg ctggggattc tccatgatgc agttgtgttc ctcattgagc 60

agctgtctgg tgccaagcac tgtcgaaatt acaaattccg tttccacaag ccagaggagg 120

ccaatgaacc ccccttgaac cctcacggct cagccagggc tgaagtccac ctcag 175

<210> 33

<211> 108

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 33

gaagtcagca tttgacatgt ttaacttcct ggcttctaaa catcgtcagc ctcctgaata 60

caaccccaat gatgaagaag aggaggaggt acagctgaag tcagctcg 108

<210> 34

<211> 84

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 34

gagggcaact agcatggatc tgccaatgcc catgcgcttc cggcacttaa aaaagacttc 60

taaggaggca gttggtgtct acag 84

<210> 35

<211> 130

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 35

gtctcccatc catggccggg gtcttttctg taagagaaac attgatgcag gtgagatggt 60

gattgagtat gccggcaacg tcatccgctc catccagact gacaagcggg aaaagtatta 120

cgacagcaag 130

<210> 36

<211> 4928

<212> DNA

<213>homo sapiens (Homo sapiens)

<400> 36

ggcattggtt gctatatgtt ccgaattgat gactcagagg tagtggatgc caccatgcat 60

ggaaatgctg cacgcttcat caatcactcg tgtgagccta actgctattc tcgggtcatc 120

aatattgatg ggcagaagca cattgtcatc tttgccatgc gtaagatcta ccgaggagag 180

gaactcactt acgactataa gttccccatt gaggatgcca gcaacaagct gccctgcaac 240

tgtggcgcca agaaatgccg gaagttccta aactaaagct gctcttctcc cccagtgttg 300

gagtgcaagg aggcggggcc atccaaagca acgctgaagg ccttttccag cagctgggag 360

ctcccggatt gcgtggcaca gctgaggggc ctctgtgatg gctgagctct cttatgtcct 420

atactcacat cagacatgtg atcatagtcc cagagacaga gttgaggtct cgaagaaaag 480

atccatgatc ggctttctcc tggggcccct ccaattgttt actgttagaa agtgggaatg 540

gggtccctag cagacttgcc tggaaggagc ctattataga gggttggtta tgttgggaga 600

ttgggcctga atttctccac agaaataagt tgccatcctc aggttggccc tttcccaagc 660

actgtaagtg agtgggtcag gcaaagcccc aaatggaggg ttggttagat tcctgacagt 720

ttgccagcca ggccccacct acagcgtctg tcgaacaaac agaggtctgg tggttttccc 780

tactatcctc ccactcgaga gttcacttct ggttgggaga caggattcct agcacctccg 840

gtgtcaaaag gctgtcatgg ggttgtgcca attaattacc aaacattgag cctgcaggct 900

ttgagtggga gtgttgcccc caggagcctt atctcagcca attacctttc ttgacagtag 960

gagcggcttc cctctcccat tccctcttca ctcccttttc ttcctttccc ctgtcttcat 1020

gccactgctt tcccatgctt ctttcgggtt gtaggggaga ctgactgcct gctcaaggac 1080

actccctgct gggcatagga tgtgcctgca aaaagttccc tgagcctgta agcactccag 1140

gtggggaagt ggacaggagc cattggtcat aaccagacag aatttggaaa cattttcata 1200

aagctccatg gagagtttta aagaaacata tgtagcatga ttttgtagga gaggaaaaag 1260

attatttaaa taggatttaa atcatgcaac aacgagagta tcacagccag gatgaccctt 1320

gggtcccatt cctaagacat ggttacttta ttttcccctt gttaagacat aggaagactt 1380

aatttttaaa cggtcagtgt ccagttgaag gcagaacact aatcagattt caaggcccac 1440

aacttgggga ctagaccacc ttatgttgag ggaactctgc cacctgcgtg caacccacag 1500

ctaaagtaaa ttcaatgaca ctactgccct gattactcct taggatgtgg tcaaaacagc 1560

atcaaatgtt tcttctcttc ctttccccaa gacagagtcc tgaacctgtt aaattaagtc 1620

attggatttt actctgttct gtttacagtt tactatttaa ggttttataa atgtaaatat 1680

attttgtata tttttctatg agaagcactt catagggaga agcacttatg acaaggctat 1740

tttttaaacc gcggtattat cctaatttaa aagaagatcg gtttttaata attttttatt 1800

ttcataggat gaagttagag aaaatattca gctgtacaca caaagtctgg tttttcctgc 1860

ccaacttccc cctggaaggt gtactttttg ttgtttaatg tgtagcttgt ttgtgccctg 1920

ttgacataaa tgtttcctgg gtttgctctt tgacaataaa tggagaagga aggtcaccca 1980

actccattgg gccactcccc tccttcccct attgaagctc ctcaaaaggc tacagtaata 2040

tcttgataca acagattctc ttctttcccg cctctctcct ttccggcgca acttccagag 2100

tggtgggaga cggcaatctt tacatttccc tcatctttct tacttcagag ttagcaaaca 2160

acaagttgaa tggcaacttg acatttttgc atcaccatct gcctcatagg ccactctttc 2220

ctttccctct gcccaccaag tcctcatatc tgcagagaac ccattgatca ccttgtgccc 2280

tcttttgggg cagcctgttg aaactgaagc acagtctgac cactcacgat aaagcagatt 2340

tttctctgcc tctgccacaa ggtttcagag tagtgtagtc caagtagagg gtggggcacc 2400

cttttctcgc cgcaagaagc ccattcctat ggaagtctag caaagcaata cgactcagcc 2460

cagcactctc tgccccagga ctcatggctc tgctgtgcct tccatcctgg gctcccttct 2520

ctcctgtgac cttaagaact ttgtctggtg gctttgctgg aacattgtca ctgttttcac 2580

tgtcatgcag ggagcccagc actgtggcca ggatggcaga gacttccttg tcatcatgga 2640

gaagtgccag caggggactg ggaaaagcac tctacccaga cctcacctcc cttcctcctt 2700

ttgcccatga acaagatgca gtggccctag gggttccact agtgtctgct ttcctttatt 2760

attgcactgt gtgaggtttt tttgtaaatc cttgtattcc tatttttttt aaagaaaaaa 2820

aaaaaacctt aagctgcatt tgttactgaa atgattaatg cactgatggg tcctgaattc 2880

accttgagaa agacccaaag gccagtcagg gggtgggggg aactcagcta aatagaccta 2940

gttactgccc tgctaggcca tgctgtactg tgagcccctc ctcactctct accaacccta 3000

aaccctgagg acaggggagg aacccacagc ttccttctcc tgccagctgc agatggtttg 3060

ccttgccttt ccacccccta attgtcaacc acaaaaatga gaaattcctc ttctagctca 3120

gccttgagtc cattgccaaa ttttcagcac acctgccagc aacttggggg aataagcgaa 3180

ggtttcccta caagagggaa agaaggcaaa aacggcacag ctatctccaa acacatctga 3240

gttcatttca aaagtgacca agggaatctc cgcacaaaag tgcagattga ggaattgtga 3300

tgggtcattc ccaagaatcc cccaaggggc atcccaaatc cctgaggagt aacagctgca 3360

aacctggtca gttctcagtg agagccagct cacttatagc tttgctgcta gaacctgttg 3420

tggctgcatt tcctggtggc cagtgacaac tgtgtaacca gaatagctgc atggcgctga 3480

ccctttggcc ggaacttggt ctcttggctc cctccttggc cacccaccac ctctcgcaca 3540

gcccctctgt ttttacacca ataacaagaa ttaaggggga agccctggca gctatacgtt 3600

ttcaaccaga ctcctttgcc gggacccagc ccgccaccct gctcgcctcc gtcaaacccc 3660

cggccaatgc agtgagcacc atgtagctcc cttgatttaa aaaaaataaa aaataaaaaa 3720

aaaaggaaaa aaaaatacaa cacacacaca aaaataaaaa aaatattcta atgaatgtat 3780

ctttctaaag gactgacgtt caatcaaata tctgaaaata ctaaaggtca aaaccttgtc 3840

agatgttaac ttctaagttc ggtttgggat tttttttttt taatagaaat caagttgttt 3900

ttgtttttaa ggaaaagcgg gtcattgcaa agggctgggt gtaattttat gtttcatttc 3960

cttcatttta aagcaataca aggttatgga gcagatggtt ttgtgccgaa tcatgaatac 4020

tagtcaagtc acacactctg gaaacttgca actttttgtt tgttttggtt ttcaaataaa 4080

tataaatatg atatatatag gaactaatat agtaatgcac catgtaacaa agcctagttc 4140

agtccatggc ttttaattct cttaacacta tagataagga ttgtgttaca gttgctagta 4200

gcggcaggaa gatgtcaggc tcactttcct ctgattcccg aaatgggggg aacctctaac 4260

cataaaggaa tggtagaaca gtccattcct cggatcagag aaaaatgcag acatggtgtc 4320

acctggattt ttttctgccc atgaatgttg ccagtcagta cctgtcctcc ttgtttctct 4380

atttttggtt atgaatgttg gggttaccac ctgcatttag gggaaaattg tgttctgtgc 4440

tttcctggta tcttgttccg aggtactcta gttctgtctt tcaaccaaga aaatagaatt 4500

gtggtgtttc ttttattgaa cttttaacag tctctttagt aaatacaggt agttgaataa 4560

ttgtttcaag agctcaacag atgacaagct tcttttctag aaataagaca ttttttgaca 4620

actttatcat gtataacaga tctgtttttt ttccttgtgt tcttccaagc ttctggttag 4680

agaaaaagag aaaaaaaaaa aaggaaaatg tgtctaaagt ccatcagtgt taactccctg 4740

tgacagggat gaaggaaaat actttaatag ttcaaaaaat aataatgctg aaagctctct 4800

acgaaagact gaatgtaaaa gtaaaaagtg tacatagttg taaaaaaaag gagtttttaa 4860

acatgtttat tttctatgca ctttttttta tttaagtgat agtttaatta ataaacatgt 4920

caagttta 4928

<210> 37

<211> 60

<212> DNA

<213> Homo sapiens

<400> 37

agaggtctct gatgagtcac tttcttgacc cttttctttt ggtttttgtt ttacagggat 60

<210> 38

<211> 60

<212> DNA

<213> Homo sapiens

<400> 38

agaggtctct gatgagtcac tttcttgacc cttttctttt ggtttttgtt ttacagggat 60

<210> 39

<211> 60

<212> DNA

<213> Homo sapiens

<400> 39

atctgagcca aaacctaaga attgctcatc cttttctttt ggtttttgtt ttacagggat 60

<210> 40

<211> 60

<212> DNA

<213> Homo sapiens

<400> 40

catcttctga gccagcaatt gatgacttgt cttttctttt ggtttttgtt ttacagggat 60

<210> 41

<211> 60

<212> DNA

<213> Homo sapiens

<400> 41

atctgagcca aaacctaaga attgctcatc cttttctttt ggtttttgtt ttacagggat 60

<210> 42

<211> 60

<212> DNA

<213> Homo sapiens

<400> 42

atctgagcca aaacctaaga attgctcatc cttttctttt ggtttttgtt ttacagggat 60

<210> 43

<211> 60

<212> DNA

<213> Homo sapiens

<400> 43

atctgagcca aaacctaaga attgctcatc cttaaagtcc actctgatcc tgtggactcc 60

<210> 44

<211> 60

<212> DNA

<213> Homo sapiens

<400> 44

atctgagcca aaacctaaga attgctcatc ctgattctgg tggtggaggc tgctttttct 60

<210> 45

<211> 60

<212> DNA

<213> Homo sapiens

<400> 45

atctgagcca aaacctaaga attgctcatc cttttctttt ggtttttgtt ttacagggat 60

<210> 46

<211> 60

<212> DNA

<213> Homo sapiens

<400> 46

atctgagcca aaacctaaga attgctcatc cttttctttt ggtttttgtt ttacagggat 60

<210> 47

<211> 60

<212> DNA

<213> Homo sapiens

<400> 47

catcttctga gccagcaatt gatgacttgt cttttctttt ggtttttgtt ttacagggat 60

Claims

1. A method for detecting gene rearrangement, characterized in that the method comprises:

obtaining sequences to be compared from a sample to be tested;

aligning the sequences to be compared with a reference genome to obtain abnormally aligned sequences, the abnormally aligned sequences including sequences with an abnormal alignment position, sequences with an abnormal alignment direction, and sequences that do not align to the reference genome;

determining the positions of candidate breakpoints according to the alignment positions and alignment directions of the abnormally aligned sequences on the reference genome;

assembling those sequences among the sequences to be compared that support the positions of the candidate breakpoints, and retaining the breakpoints in the assembly result whose sequence information is consistent with the positions of the candidate breakpoints, these being recorded as the breakpoints of the gene rearrangement.
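As a non-limiting illustration of the alignment step of claim 1, the following Python sketch sorts paired alignments into the three abnormal categories named there. The `PairedAlignment` record, the `MAX_INSERT` threshold of 1000 bases, and the forward/reverse orientation convention are assumptions made for this example only and are not taken from the claims.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PairedAlignment:
    """Hypothetical, simplified record of how one read pair aligned."""
    chrom1: Optional[str]  # None means this read did not align
    pos1: int
    strand1: str           # '+' or '-'
    chrom2: Optional[str]
    pos2: int
    strand2: str


MAX_INSERT = 1000  # assumed upper bound on a normal insert size, in bases


def classify(pair: PairedAlignment) -> str:
    """Return 'unaligned', 'abnormal_position', 'abnormal_direction' or 'normal'."""
    if pair.chrom1 is None or pair.chrom2 is None:
        return "unaligned"
    if pair.chrom1 != pair.chrom2 or abs(pair.pos2 - pair.pos1) > MAX_INSERT:
        return "abnormal_position"
    # Assumed convention: the leftmost read is on '+', the rightmost on '-'.
    if pair.pos1 <= pair.pos2:
        left, right = pair.strand1, pair.strand2
    else:
        left, right = pair.strand2, pair.strand1
    if (left, right) != ("+", "-"):
        return "abnormal_direction"
    return "normal"


def abnormal_pairs(pairs: List[PairedAlignment]) -> List[PairedAlignment]:
    """Keep only the pairs that fall into one of the abnormal categories."""
    return [p for p in pairs if classify(p) != "normal"]
```

In practice the alignment records would come from an aligner's output rather than from hand-built objects; the sketch only shows how the three categories can be told apart.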

2. The method according to claim 1, characterized in that determining the positions of the candidate breakpoints according to the alignment positions and alignment directions of the abnormally aligned sequences on the reference genome comprises:

subjecting the abnormally aligned sequences to sequence cutting and then re-aligning them to the reference genome, and determining the positions of the candidate breakpoints according to the alignment positions and alignment directions of the cut abnormally aligned sequences on the reference genome.

3. The method according to claim 2, characterized in that determining the positions of the candidate breakpoints according to the alignment positions and alignment directions of the abnormally aligned sequences on the reference genome comprises:

subjecting the abnormally aligned sequences to sequence cutting and then re-aligning them to the reference genome; obtaining sequences that span a first length on both sides of a potential breakpoint simultaneously, recorded as first flag sequences; and taking sequences that span both sides of the potential breakpoint simultaneously but whose length is less than a second length as second flag sequences;

simulating breakpoint reference sequences in which the gene rearrangement has occurred, according to the positions of the potential breakpoints on the first flag sequences;

aligning the sequences to be compared with the breakpoint reference sequences, and marking the sequences that align to the breakpoint reference sequences and span the breakpoints on the breakpoint reference sequences, these being recorded as breakpoint candidate sequences supporting the breakpoints;

determining the positions of the breakpoints on the breakpoint candidate sequences as the positions of the candidate breakpoints.
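As a non-limiting illustration of the breakpoint reference sequence of claim 3, the following Python sketch joins two flanking sequences into a simulated reference and tests whether a read spans the junction. Exact substring matching stands in for a real aligner, and the function names and the `min_overhang` parameter are hypothetical choices made for this example.

```python
from typing import List, Tuple


def simulate_breakpoint_reference(left_flank: str, right_flank: str) -> Tuple[str, int]:
    """Join two flanks; the breakpoint lies at index len(left_flank)."""
    return left_flank + right_flank, len(left_flank)


def spanning_offsets(read: str, reference: str, breakpoint: int,
                     min_overhang: int = 5) -> List[int]:
    """Offsets of exact matches that cover at least `min_overhang`
    bases on each side of the breakpoint."""
    hits = []
    start = reference.find(read)
    while start != -1:
        end = start + len(read)
        if start <= breakpoint - min_overhang and end >= breakpoint + min_overhang:
            hits.append(start)
        start = reference.find(read, start + 1)
    return hits


# A read that crosses the junction is kept as a breakpoint candidate sequence.
reference, bp = simulate_breakpoint_reference("ACGTACGTAC", "TTGGCCAATT")
candidate_hits = spanning_offsets("CGTACTTGGC", reference, bp)
```

A read lying entirely on one side of the junction returns an empty list and is therefore not counted as supporting the breakpoint.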

4. The method according to claim 3, characterized in that determining the positions of the breakpoints on the breakpoint candidate sequences as the positions of the candidate breakpoints comprises:

correcting the breakpoint candidate sequences according to sequencing quality and the number of supporting sequences, to obtain corrected breakpoint candidate sequences;

determining the positions of the breakpoints on the corrected breakpoint candidate sequences as the positions of the candidate breakpoints.
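The claim does not specify how sequencing quality and the number of supporting sequences are combined. The following Python sketch shows one possible reading, given purely as an assumption: reads below a quality threshold are discarded, and the position reported by the most remaining reads is kept if it has enough support. The function name and both thresholds are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, Optional, Tuple


def correct_breakpoint(observations: Iterable[Tuple[int, int]],
                       min_support: int = 3,
                       min_quality: int = 20) -> Optional[int]:
    """`observations` holds one (position, quality) pair per supporting read.

    Returns the best-supported position among reads that pass the quality
    threshold, or None when no position reaches `min_support`.
    """
    support = defaultdict(int)
    for position, quality in observations:
        if quality >= min_quality:
            support[position] += 1
    if not support:
        return None
    # Ties are broken in favour of the smaller coordinate (an arbitrary choice).
    best = max(support, key=lambda p: (support[p], -p))
    return best if support[best] >= min_support else None
```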

5. The method according to claim 3, characterized in that determining the positions of the breakpoints on the breakpoint candidate sequences as the positions of the candidate breakpoints comprises:

filtering false-positive breakpoint sequences out of the breakpoint candidate sequences according to the first flag sequences and the second flag sequences that support the breakpoints on the breakpoint reference sequences, and the paired sequences among the sequences to be compared that support spanning the breakpoints on the breakpoint reference sequences, to obtain filtered breakpoint candidate sequences;

determining the positions of the breakpoints on the filtered breakpoint candidate sequences as the positions of the candidate breakpoints.

6. The method according to claim 3, characterized in that assembling those sequences among the sequences to be compared that support the positions of the candidate breakpoints comprises:

performing assembly according to the first flag sequences and the second flag sequences that support the breakpoints on the breakpoint reference sequences, and the paired sequences among the sequences to be compared that support spanning the breakpoints on the breakpoint reference sequences, and retaining the breakpoints in the assembly result whose sequence information is consistent with the positions of the candidate breakpoints, these being recorded as the breakpoints of the gene rearrangement.

7. The method according to any one of claims 1 to 6, characterized in that obtaining the sequences to be compared from the sample to be tested comprises:

constructing a sequencing library of the sample to be tested;

performing high-throughput sequencing on the sequencing library to obtain sequencing data;

preprocessing the sequencing data to obtain the sequences to be compared of the sample to be tested.

8. The method according to claim 7, characterized in that the sequencing library is a hybrid capture library; preferably, the hybrid capture library is obtained using the capture probes of SEQ ID NO: 1 to SEQ ID NO: 36.

9. The method according to any one of claims 1 to 6, characterized in that, after the breakpoints of the gene rearrangement are obtained, the method further comprises a step of quantifying the rearranged gene, the quantification step comprising:

counting, according to the sequence information of the breakpoints of the gene rearrangement, the number of sequences among the sequences to be compared that support the breakpoints of the gene rearrangement, recorded as the marker sequence number;

dividing the marker sequence number by the sequence number of a reference gene, the resulting ratio being the expression abundance of the rearranged gene relative to the reference gene.
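The quantification step of claim 9 reduces to a count followed by a division, which the following Python sketch illustrates. Counting support by exact substring match of a junction sequence is a simplifying assumption, and the function names are hypothetical.

```python
from typing import Iterable


def count_supporting(reads: Iterable[str], junction: str) -> int:
    """Marker sequence number: reads that contain the junction sequence."""
    return sum(1 for read in reads if junction in read)


def relative_abundance(marker_count: int, reference_count: int) -> float:
    """Ratio of the marker sequence number to the reference gene's sequence number."""
    if reference_count <= 0:
        raise ValueError("reference_count must be positive")
    return marker_count / reference_count


# Two of the three example reads contain the junction; 40 reference reads assumed.
marker = count_supporting(["AAACCCGGG", "TTTTTT", "ACCCGG"], "CCCGG")
abundance = relative_abundance(marker, 40)
```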

10. A device for detecting gene rearrangement, characterized in that the device is configured to store or run a module, or the module is a component of the device; wherein the module is a software module, there being one or more software modules, and the software module is configured to execute the method for detecting gene rearrangement according to any one of claims 1 to 9.

11. A storage medium, characterized in that the storage medium comprises a stored program, wherein the program executes the method for detecting gene rearrangement according to any one of claims 1 to 9.

12. A processor, characterized in that the processor is configured to run a program, wherein the program, when run, executes the method for detecting gene rearrangement according to any one of claims 1 to 9.