CN105986008A - CNV detection method and CNV detection apparatus - Google Patents
- Publication number
- CN105986008A (application number CN201510039685.5A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention provides a CNV detection method comprising the steps of: 1) acquiring a genome sequencing result of a target individual; 2) aligning the sequencing result to a reference sequence to obtain an alignment result, wherein the reference sequence comprises a plurality of windows; 3) on the basis of the alignment result, calculating an initial alignment ratio (comparison rate) for each window, which is the number of reads aligned to the window divided by the average number of reads aligned per window, wherein that average is the total number of reads aligned to all windows divided by the number of windows; 4) merging adjacent windows whose initial alignment ratios show no significant difference and defining each merged run of adjacent windows as a primary region, each remaining individual window also being a primary region; and 5) if the alignment ratio of a primary region is not equal to a predetermined alignment ratio, determining that a CNV exists in that primary region.
Description
Technical field
The present invention relates to the field of bioinformatics and, in particular, to a method and an apparatus for detecting CNV.
Background art
Single-cell sequencing uses next-generation sequencing to sequence the trace DNA of an individual cell. The technique has three main parts: single-cell isolation, extraction and amplification of the single-cell nucleic acid, and sequencing. As a revolutionary technology, single-cell sequencing has been widely applied in scientific research and biomedicine in recent years, for example: sequencing single tumor cells to reveal tumor heterogeneity at the single-cell level and to infer the evolution of a tumor; noninvasive prenatal diagnosis; assembling the genomes of microorganisms that cannot be cultured; obtaining genomes from trace amounts of cells (applicable to forensic science and the like); and preimplantation diagnosis of embryos. Single-cell sequencing solves the difficulty of obtaining the genome of trace cells and provides a new method for studying disease mechanisms and diagnostics.
Copy number variation (Copy Number Variants, CNV) plays an important role in single-cell research. CNV generally refers to the deletion or duplication of a chromosomal fragment larger than 1 kb. CNV is a genetic polymorphism that is widespread in animal and plant genomes; its mutation frequency is far higher than that of SNPs, and genome studies have shown that CNVs are associated with some human diseases, including complex diseases such as tumors, obesity, autism, autoimmune disease and systemic lupus erythematosus (SLE). In studies of tumor heterogeneity and evolution, single tumor cells are tested for CNVs, and comparing CNVs between single cells, and between single cells and the corresponding tissue, reveals tumor heterogeneity at the level of individual cells and provides a basis for inferring tumor evolution. Noninvasive prenatal diagnosis requires testing trace DNA for chromosomal aneuploidy (one kind of CNV), which causes Down syndrome (47, +21), Edwards syndrome (47, +18), Patau syndrome (47, +13) and the like. Preimplantation diagnosis and screening of embryos requires testing and analyzing a single germ cell or embryonic cell. Forensic evidence samples (trace amounts of blood, semen and so on) require analysis of trace cells. In general, the biomedical field now poses both a demand and a challenge for detecting large-fragment CNVs in trace cells and even in individual cells.
Most existing CNV detection methods, such as CNV-seq, PennCNV, CNAseg and Readdepth, are designed for tissue sequencing data. Single-cell sequencing data, especially low-depth sequencing data, have low genome coverage and high amplification bias, so the number of aligned short sequences fluctuates greatly between different regions of the genome, and these CNV detection methods are not well suited to detecting copy number variation in single cells.
Summary of the invention
The present invention aims to solve at least one of the above problems or to provide at least one commercial alternative.
According to one aspect, the present invention provides a method for detecting CNV, comprising the following steps: obtaining a genome sequencing result of a target individual, the sequencing result comprising a plurality of reads; aligning the sequencing result to a reference sequence to obtain an alignment result, the reference sequence comprising a plurality of windows and the alignment result comprising the number of reads aligned to each window; based on the alignment result, calculating an initial alignment ratio (comparison rate) for each window, where initial alignment ratio of a window = number of reads aligned to the window / average number of reads aligned per window, and average number of reads aligned per window = total number of reads aligned to all windows / number of windows; merging adjacent windows whose initial alignment ratios show no significant difference, each merged run of adjacent windows being defined as a primary region and each remaining individual window also being a primary region; and, when the alignment ratio of a primary region is not equal to a predetermined alignment ratio, determining that a CNV exists in that primary region. The alignment ratio of a primary region is the mean of the initial alignment ratios of the windows it contains. The predetermined alignment ratio is the window alignment ratio that occurs most frequently among all windows, where the alignment ratio of a window is the alignment ratio of the primary region in which it lies. In one embodiment of the invention, the genome is obtained from a single cell of the target individual: a sequencing library is built from the single-cell genome and sequenced to obtain the sequencing result. Optionally, building the sequencing library includes amplifying the genome by degenerate oligonucleotide-primed PCR, multiple displacement amplification and/or multiple annealing and looping-based amplification cycles, to obtain enough nucleic acid for library construction and/or for sequencing. Sequencing can use existing platforms, including but not limited to CG (Complete Genomics), Illumina/Solexa, Life Technologies ABI SOLiD and Roche 454; the library is prepared according to the selected platform, and single-end or paired-end sequencing may be chosen. The sequencing result thus obtained consists of a plurality of short sequences, each of which is called a read. The alignment can be performed with known alignment software, for example Bowtie, SOAP, BWA and/or TeraMap. In one embodiment of the invention, only reads that align to a unique position of the reference sequence are used to calculate the alignment ratio, which improves data accuracy and hence the accuracy of CNV detection.
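The ratio defined in this paragraph can be illustrated with a minimal Python sketch. The function name and the read counts are hypothetical and chosen only for illustration; the arithmetic follows the definition given above (reads aligned to a window divided by the mean number of reads aligned per window).

```python
from typing import List


def initial_alignment_ratios(read_counts: List[int]) -> List[float]:
    """For each window, divide its aligned-read count by the mean count per window."""
    if not read_counts:
        raise ValueError("at least one window is required")
    mean_count = sum(read_counts) / len(read_counts)  # total reads / number of windows
    if mean_count == 0:
        raise ValueError("no reads aligned to any window")
    return [count / mean_count for count in read_counts]


# Hypothetical counts of uniquely aligned reads in five windows.
counts = [90, 110, 100, 200, 100]
ratios = initial_alignment_ratios(counts)
print([round(r, 3) for r in ratios])
```

By construction the ratios average to 1, so a window with a ratio well above or below 1 has more or fewer aligned reads than a typical window.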
The windows can be determined in advance or determined at the time the target individual is tested. In one embodiment of the invention, the windows are predetermined. Determining the windows includes: aligning a short-sequence set, which comprises a plurality of short sequences, to the reference sequence; determining the start positions of the short sequences that align to the reference sequence; and delimiting windows on the reference sequence so that each window contains the same number of start positions, optionally with no overlap between windows. During alignment, depending on the alignment parameters, a short sequence is allowed at most m base mismatches, where m is preferably 1 or 2; if more than m bases in a short sequence are mismatched, the short sequence is regarded as unalignable to the reference sequence. The start position is the position on the reference sequence matched by the first base of each aligned short sequence; when the first bases of several short sequences align to the same reference position, that position is recorded only once, that is, it counts as one start position. Here, the first base of a short sequence, and hence the direction of the short sequence, is defined relative to the direction of the reference sequence: for example, the base of a short sequence that matches the foremost reference position (the position with the smallest coordinate) is called the first base of that short sequence. Because each window contains the same number of start positions, with no restriction on the number of sites in it that no read matches, the windows generally differ in size, which helps reduce the bias introduced by single-cell genome amplification. By the same design, the windows can instead be delimited so that each contains the same number of specific positions, a specific position being the reference position matched by the base at the same position of each aligned short sequence, for example the reference position matched by the last base of each aligned short sequence.
The short-sequence set may come from a simulated sequence set and/or from a sequencing result. The sequencing result may be sequencing data of human nucleic acid generated in-house or the published sequencing result of another party's nucleic acid sample; the nucleic acid may be genomic DNA or cell-free DNA. Preferably, the simulated sequences are distributed relatively evenly when aligned to the reference genome. In one embodiment of the invention, the simulated sequences are obtained as follows: starting from the base at one end of a chromosome of length Q in the reference sequence, copy P bases of the chromosome to obtain the first simulated sequence; move one base toward the other end of the chromosome and copy P bases to obtain the second simulated sequence; move two bases from the starting end toward the other end and copy P bases to obtain the third simulated sequence; and so on until the (Q-P+1)-th simulated sequence is obtained, whose last base coincides with the base at the other end of the chromosome. Here P is the length of a simulated sequence, and preferably P ≥ 10. In one embodiment of the invention, the windows do not overlap and the total number of windows is no more than 100,000. The window size can be adjusted according to the required CNV detection precision; for a human reference genome of fixed size, window size is inversely proportional to the number of windows. In this embodiment, the total number of windows is no fewer than 10,000 and no more than 100,000, with no overlap between windows, which allows accurate detection of CNVs of at least 1 kb as generally defined.
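The sliding construction of the simulated sequences can be sketched in Python as follows. The function name and the toy chromosome are hypothetical; the sketch only shows that a chromosome of length Q yields Q-P+1 simulated sequences of length P, each shifted by one base.

```python
from typing import Iterator


def simulated_reads(chromosome: str, read_length: int) -> Iterator[str]:
    """Yield every substring of `read_length` bases, shifting by one base each time."""
    if read_length < 1 or read_length > len(chromosome):
        raise ValueError("read_length must be between 1 and the chromosome length")
    for offset in range(len(chromosome) - read_length + 1):
        yield chromosome[offset:offset + read_length]


# Toy chromosome with Q = 10 and P = 4, giving Q - P + 1 = 7 simulated sequences.
reads = list(simulated_reads("ACGTACGTAC", 4))
print(len(reads), reads[0], reads[-1])
```

A real chromosome would be streamed rather than held as a list, and the text prefers P ≥ 10; the short values here only keep the example readable.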
In one embodiment of the invention, the target individual is a human. Humans are diploid, so the number of chromosome sets is 2. Preferably, the reference sequence is at least part of the corresponding human reference genome, for example HG19, which can be obtained from the NCBI database, or is the reference sequence formed by all the windows. In another embodiment of the invention, each base of the pseudoautosomal regions of the Y chromosome of the human reference genome is replaced with N, where N stands for any one of A, T, C and G, which helps avoid false positives when detecting CNVs in the pseudoautosomal regions of the sex chromosomes.
In one embodiment of the invention, before adjacent windows whose initial alignment ratios show no significant difference are merged, the initial alignment ratio of each window is GC-corrected using an alignment ratio-GC content relationship, giving a corrected alignment ratio for each window. This eliminates or reduces the influence of GC content on the sequencing result and on the alignment ratio. The corrected alignment ratio then replaces the initial alignment ratio of the window in all subsequent steps: for example, the alignment ratio of a primary region becomes the mean of the corrected alignment ratios of all windows it contains. When the predetermined alignment ratio is determined, the alignment ratio of each primary region is assigned to the windows it contains, so that all windows in the same primary region have the same alignment ratio, namely that of the primary region; the alignment ratios of all windows are then tallied to count how often each value occurs, and the value that occurs most often, that is, the most frequent window alignment ratio, is taken as the predetermined alignment ratio. The alignment ratio-GC content relationship can be established in advance from the sequencing data of control samples and stored for correcting the sequencing result of each sample to be tested, the control sample preferably being a tissue sample from the same species as the target individual; it can also be established from the genome sequencing result of the target sample at the time the target sample is tested. In one embodiment of the invention, the sequencing result of the target sample itself is used to establish the relationship, as follows: obtain sequencing data of the nucleic acid of at least one sample, the sequencing data consisting of a plurality of reads; align the sequencing data to a reference sequence to obtain an alignment result, the reference sequence comprising a plurality of windows and the alignment result comprising the number of reads aligned to each window; calculate the initial alignment ratio of each window, where initial alignment ratio of a window = number of reads aligned to the window / average number of reads aligned per window, and average number of reads aligned per window = total number of reads aligned to all windows / number of windows; and, from the multiple pairs of window initial alignment ratio and window GC content, establish the alignment ratio-GC content relationship by two-dimensional regression analysis. In one embodiment of the invention, the two-dimensional regression method used is locally weighted scatterplot smoothing (Lowess).
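To make the idea of GC correction concrete, here is a minimal Python sketch. It deliberately swaps in a simpler technique than the Lowess regression named above: windows are grouped into GC-content bins, the median ratio of each bin stands in for the expected ratio at that GC content, and each window's ratio is divided by that expected value. The function name, the bin width, the division step and the example numbers are all assumptions for illustration, not the procedure of the invention.

```python
from statistics import median
from typing import Dict, List


def gc_correct(ratios: List[float], gc: List[float], bin_width: float = 0.05) -> List[float]:
    """Divide each window's ratio by the median ratio of windows in the same GC bin.

    A binned median is used here as a simple stand-in for a Lowess fit.
    """
    if len(ratios) != len(gc):
        raise ValueError("ratios and gc must have the same length")
    bins: Dict[int, List[float]] = {}
    for ratio, gc_fraction in zip(ratios, gc):
        bins.setdefault(int(gc_fraction / bin_width), []).append(ratio)
    expected = {b: median(values) for b, values in bins.items()}
    corrected = []
    for ratio, gc_fraction in zip(ratios, gc):
        e = expected[int(gc_fraction / bin_width)]
        corrected.append(ratio / e if e > 0 else ratio)  # leave ratio unchanged if bin median is 0
    return corrected


# Hypothetical initial ratios and GC fractions for four windows.
initial = [0.8, 1.2, 1.5, 2.5]
gc_fraction = [0.41, 0.42, 0.61, 0.62]
corrected = gc_correct(initial, gc_fraction)
print([round(c, 3) for c in corrected])
```

After correction, the two high-GC windows no longer look uniformly elevated, which is the effect the GC correction step is meant to achieve; a Lowess fit would give a smooth curve instead of per-bin steps.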
Adjacent windows whose initial alignment ratios show no significant difference are adjacent windows whose initial alignment ratios do not differ substantially. For example, because the initial or corrected alignment ratios form a set of values fluctuating around 1, either 1 or 1 ± 1 × 10% can serve as the boundary for substantial difference; in the latter case, adjacent windows whose corrected alignment ratios are all below 0.90, all within 0.90 to 1.10, or all above 1.10 are windows without substantial difference. In one embodiment of the invention, merging adjacent windows whose initial alignment ratios show no significant difference means merging adjacent windows whose corrected alignment ratios are all greater than 1 or all less than 1. Further, to determine the size of a detected CNV and the exact positions (breakpoints) at which it occurs, the method also includes determining secondary regions within a primary region, including: (1) based on the formula for Z_ij [formula not reproduced in this text], calculating the difference between the alignment ratio of a subregion M of the primary region and that of all other windows in the primary region to obtain all Z_ij, and taking Z_c = max_{1≤i<j≤n} |Z_ij|; (2) comparing Z_c with a first critical value: when Z_c is greater than the first critical value, the corresponding subregion M is a secondary region, the secondary region is a CNV region, and the boundaries of the secondary region are the positions at which the CNV occurs; (3) removing the secondary region from the primary region, updating i, j and n, and repeating steps (1) and (2) until no Z_c exceeds the first critical value. Here i and j are the indices of windows in the primary region, n is the number of windows in the primary region, subregion M is the region from the (i+1)-th window to the j-th window of the primary region, R_i is the corrected alignment ratio of the i-th window in the primary region, the first critical value is the value in the Z_ij distribution that corresponds to a first predetermined probability, the first predetermined probability is ≥ 95%, 1 ≤ i < j ≤ n, S_i = R_1 + … + R_i, S_j = R_1 + … + R_j, and S_n = R_1 + … + R_n. The null hypothesis is that subregion M is a normal, unvaried region, under which Z_ij follows the standard normal distribution; the first predetermined probability and the first critical value correspond one to one, and general statistics textbooks contain tables of this correspondence for reference. In one embodiment of the invention, when Z_c falls in the rejection region, that is, Z_c is greater than the first critical value corresponding to a first predetermined probability of, for example, 99.9%, a small-probability event has occurred, the null hypothesis is rejected, and subregion M is a variant region. In this process, windows are first merged according to whether their corrected alignment ratios lie on the same side of 1 (all greater than 1 or all less than 1) to obtain large primary regions, and each primary region is then tested iteratively to determine the boundaries of the CNVs within it, that is, to determine the secondary regions. Secondary regions can thus be determined in several primary regions in parallel, which speeds up CNV detection. In one embodiment of the invention, the alignment ratio of a secondary region is the mean of the corrected alignment ratios of all windows it contains. In one embodiment of the invention, the method further includes determining the type of the CNV by comparing the alignment ratio of the secondary region with the predetermined alignment ratio: when the alignment ratio of the secondary region is greater than the predetermined alignment ratio, the secondary region is determined to be a copy number gain region; when it is less, the secondary region is determined to be a copy number loss region. In another embodiment of the invention, the copy number of the secondary region is calculated as: copy number of secondary region = alignment ratio of the secondary region / predetermined alignment ratio × number of chromosome sets of the target individual, where the alignment ratio of the secondary region is the mean of the corrected alignment ratios of all windows it contains.
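The formula for Z_ij does not survive in this text, so the following Python sketch rests on an assumption: the definitions of S_i, S_j and S_n resemble the statistic used in circular binary segmentation, which compares the mean ratio of windows i+1 to j with the mean ratio of the remaining windows and scales the difference by the standard error. The function name, the noise level `sigma`, the two-sided reading of the 99.9% probability and the example ratios are all hypothetical and are not taken from the source.

```python
from itertools import accumulate
from math import sqrt
from statistics import NormalDist
from typing import List, Tuple


def max_segment_z(ratios: List[float], sigma: float) -> Tuple[float, int, int]:
    """Return (Z_c, i, j) maximizing |Z_ij| over 1 <= i < j <= n (1-based indices).

    Assumed form: Z_ij = (mean of windows i+1..j - mean of the other windows)
                         / (sigma * sqrt(1/(j-i) + 1/(n-j+i))).
    """
    n = len(ratios)
    prefix = [0.0] + list(accumulate(ratios))  # prefix[k] = R_1 + ... + R_k
    best = (0.0, 0, 0)
    for i in range(1, n):
        for j in range(i + 1, n + 1):
            inside = j - i            # number of windows in subregion M
            outside = n - inside      # number of remaining windows (at least 1 since i >= 1)
            mean_in = (prefix[j] - prefix[i]) / inside
            mean_out = (prefix[n] - prefix[j] + prefix[i]) / outside
            z = (mean_in - mean_out) / (sigma * sqrt(1 / inside + 1 / outside))
            if abs(z) > best[0]:
                best = (abs(z), i, j)
    return best


# Hypothetical corrected ratios for ten windows; windows 4 to 6 are elevated.
ratios = [1.0, 1.0, 1.0, 1.5, 1.5, 1.5, 1.0, 1.0, 1.0, 1.0]
z_c, i, j = max_segment_z(ratios, sigma=0.1)

# Critical value for a predetermined probability of 99.9%, read here as two-sided.
critical = NormalDist().inv_cdf(1 - (1 - 0.999) / 2)
print(round(z_c, 3), (i, j), z_c > critical)
```

With these numbers the maximum falls at i = 3, j = 6, so subregion M covers windows 4 to 6 and its boundaries would be reported as the breakpoints. The exhaustive double loop is quadratic in the number of windows, which is acceptable within one primary region but would need optimization for long regions.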
"No significant difference" can also mean that the values do not differ significantly in the statistical sense. For example, a predetermined probability is set, usually not less than 95%, and a statistical test, such as a z-test or t-test, is applied to the corrected alignment ratios of several adjacent windows; if the corrected alignment ratios do not differ significantly (p > 0.05), they are considered to show no significant difference. In one embodiment of the invention, merging adjacent windows whose initial alignment ratios show no significant difference means merging adjacent windows whose corrected alignment ratios do not differ in a statistically significant way, as described below, so that each primary region obtained by merging is a CNV region. Specifically, the merging includes: (a) based on the formula for Z_xy [formula not reproduced in this text], calculating the difference between the alignment ratio of region N and that of all other windows to obtain all Z_xy, and taking Z_b = max_{1≤x<y≤w} |Z_xy|; (b) comparing Z_b with a critical value: when Z_b exceeds the critical value, the corresponding region N is a primary region; (c) removing the primary region, updating x, y and w, and repeating steps (a) and (b) until no Z_b exceeds the critical value. Here x and y are window indices, w is the total number of windows, region N is the region from the (x+1)-th window to the y-th window, R_x is the corrected alignment ratio of the x-th window, the critical value is the value in the Z_xy distribution that corresponds to a predetermined probability, the predetermined probability is ≥ 95%, 1 ≤ x < y ≤ w, S_x = R_1 + … + R_x, S_y = R_1 + … + R_y, and S_w = R_1 + … + R_w. Z_xy follows the standard normal distribution, and the predetermined probability and the critical value correspond one to one. In one embodiment of the invention, the null hypothesis is that region N is a normal, unvaried region; when Z_b falls in the rejection region, that is, Z_b exceeds the critical value corresponding to a predetermined probability of, for example, 99.9%, a small-probability event has occurred, the null hypothesis is rejected, and region N is a variant region. In this process, all windows are tested iteratively to determine the boundaries of the CNVs, and each primary region so determined is a CNV region. In one embodiment of the invention, the method further includes determining the type of the CNV by comparing the alignment ratio of the primary region with the predetermined alignment ratio: when the alignment ratio of the primary region is greater than the predetermined alignment ratio, the primary region is determined to be a copy number gain region; when it is less, the primary region is determined to be a copy number loss region. In another embodiment of the invention, the method further includes calculating the copy number of the primary region as: copy number of primary region = alignment ratio of the primary region / predetermined alignment ratio × number of chromosome sets of the target individual, where the alignment ratio of the primary region is the mean of the corrected alignment ratios of all windows it contains.
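The determination of the predetermined alignment ratio, the CNV type and the copy number can be sketched together in Python. The function names and example values are hypothetical, and rounding the ratios before tallying (so that nearly equal values count as the same value) is an assumption of this sketch; the copy number formula itself is the one stated above.

```python
from collections import Counter
from typing import List


def predetermined_ratio(window_ratios: List[float], digits: int = 2) -> float:
    """Most frequent per-window alignment ratio; rounding before tallying is an assumption."""
    tally = Counter(round(r, digits) for r in window_ratios)
    return tally.most_common(1)[0][0]


def cnv_type(region_ratio: float, baseline: float) -> str:
    """Classify a region by comparing its alignment ratio with the predetermined ratio."""
    if region_ratio > baseline:
        return "gain"
    if region_ratio < baseline:
        return "loss"
    return "normal"


def copy_number(region_ratio: float, baseline: float, chromosome_sets: int = 2) -> float:
    """Copy number = region ratio / predetermined ratio * number of chromosome sets."""
    return region_ratio / baseline * chromosome_sets


# Hypothetical per-window ratios after each region's ratio has been assigned to its windows.
per_window = [1.0] * 6 + [1.5] * 3 + [0.5]
baseline = predetermined_ratio(per_window)
print(baseline, cnv_type(1.5, baseline), copy_number(1.5, baseline))
print(cnv_type(0.5, baseline), copy_number(0.5, baseline))
```

For a diploid target individual, a region at 1.5 times the baseline corresponds to three copies and a region at half the baseline to one copy.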
The CNV detection method of the above aspect or of any of its embodiments overcomes shortcomings of existing CNV detection workflows, for example that existing methods use fixed-length windows, cannot adequately handle the bias and repetitive-sequence problems introduced by whole genome amplification in single-cell sequencing, and are not well suited to detecting CNVs in diploid single cells. The method is well suited to CNV detection from single-cell sequencing data, especially low-depth single-cell sequencing data, and is effective on single-cell or tissue sequencing data produced on different sequencing platforms with different amplification methods, so it is widely applicable. When different sequencing platforms use different whole genome amplification methods for single-cell sequencing, the method shows good sensitivity and specificity for CNV detection, particularly on Proton platform sequencing data based on multiple annealing and looping-based amplification cycles (MALBAC). The results of the method are also highly reproducible and reliable. Compared with existing CNV detection methods, the method uses windows of variable length, which helps keep the average number of short sequences aligned per window stable and avoids the influence of repetitive regions, making CNV detection more accurate.
According to another aspect, the present invention provides an apparatus for detecting CNV, which can carry out the CNV detection method of the above aspect or of any of its embodiments. The apparatus includes: a data input unit for receiving data; a data output unit for outputting data; a processor for executing a computer-executable program, execution of which includes carrying out the CNV detection method of the above aspect or of any of its embodiments; and a storage unit for storing data, including the computer-executable program. The computer-executable program can be stored in a storage medium, which may include read-only memory, random-access memory, a magnetic disk, an optical disc or the like. The invention also provides a computer-readable storage medium for storing a program for execution by a computer, execution of which includes carrying out the CNV detection method of the above aspect or of any of its embodiments. The advantages and technical features described above for the CNV detection method also apply to the apparatus and the computer-readable storage medium and are not repeated here.
Brief description of the drawings
The above and/or additional aspects and advantages of the invention will become apparent and easy to understand from the following description of embodiments taken together with the accompanying drawings, in which:
Fig. 1 is a density distribution plot of the ratio of each window after window merging in a specific embodiment of the invention;
Fig. 2 is a schematic of the CNV detection results for CG platform single-cell sequencing data based on MDA amplification in a specific embodiment of the invention;
Fig. 3 is a schematic of the CNV detection results for Proton platform single-cell sequencing data based on MDA amplification in a specific embodiment of the invention;
Fig. 4 is a schematic of the CNV detection results for Proton platform single-cell sequencing data based on MALBAC amplification in a specific embodiment of the invention.
Detailed description of the invention
The general steps of the method and the ways of obtaining the relevant information are described below.
1. First, download the sequence file chromFa.tar.gz of the hg19 reference genome from the UCSC website (http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/). Because the CNV detection method uses only short sequences that align to exactly one position in the reference genome, we replace the sequence of the pseudoautosomal regions on the Y chromosome with N. This modified version of the Y chromosome is used in all subsequent steps. The pseudoautosomal regions are the only places where exchange can occur between the X chromosome and the Y chromosome, which is also the origin of the name: exchange normally occurs between autosomes, whereas the X and Y chromosomes usually do not exchange, the pseudoautosomal regions being the exception. As a result, both males and females carry two copies of the genes in these regions, so these genes behave like autosomal genes rather than following the sex-linked inheritance pattern of the sex chromosomes, hence the name.
2. Determine the size of each detection window.
a) For the raw output data of the Proton sequencing platform, simulated single-end data can be used to divide the windows. Using the hg19 reference genome as the basis, simulate single-end short sequences: starting from the first base of each chromosome, take 50 bases as one read and generate an ID and quality values for it to produce FASTQ format. Then move back one base at a time, until the end of the short sequence is the last base of the chromosome. Use bowtie to align the simulated data to the reference genome, keep only the uniquely aligned short sequences (that is, remove short sequences that align to more than one position), and use samtools to convert the alignment result to BAM format.
The alignment parameters can be set to: bowtie -S -t -n 2 -e 70 -m 1 --best --strata. The same parameters can be used for the subsequent alignment of single-cell sequencing data. The parameters mean the following: -n 2 allows at most 2 mismatches in the high-fidelity (seed) region; -e 70 requires that the quality values at mismatched positions sum to no more than 70; --best sorts the alignments reported for each short sequence from highest to lowest match quality; --strata, used together with --best, reports only the alignments in the highest-quality stratum; and -m 1 reports only short sequences that have a single reportable alignment, discarding those that align to more than one position.
b) For the raw output data of the CG platform, the windows are divided using sequencing data of the DNA of one cell from a cell line of a normal person. The CG output data go through the standard analysis workflow, for example alignment to the reference genome with Teramap software, and the alignment result is then converted to BAM format.
For the data of either platform, repeated short sequences can finally be removed with samtools. Every position on the reference genome that is covered by the starting base of a short sequence is recorded, and these positions are divided into 10,000 to 100,000 windows; each window contains the same number of positions, but the interval length of the windows varies. The GC content of the reference genome sequence contained in each window is then calculated.
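The window division above can be sketched as follows, under stated assumptions: start positions are 0-based coordinates on a single chromosome, leftover positions that do not fill a whole window are dropped, and N bases are ignored when computing GC content. The function names are hypothetical.

```python
def make_windows(start_positions, n_windows):
    """Split covered start positions into windows holding equal numbers of positions.

    Returns (first_position, last_position) per window; the genomic length of
    each window therefore varies.
    """
    positions = sorted(start_positions)
    per_window = len(positions) // n_windows
    if per_window == 0:
        raise ValueError("more windows requested than covered positions")
    windows = []
    for i in range(n_windows):
        chunk = positions[i * per_window:(i + 1) * per_window]
        windows.append((chunk[0], chunk[-1]))
    return windows


def gc_content(sequence):
    """Fraction of G and C among the A, C, G and T bases of a sequence."""
    seq = sequence.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    return (seq.count("G") + seq.count("C")) / acgt if acgt else 0.0


# Toy reference: positions 4-9 are N and are never covered by a read start.
reference = "ACGTNNNNNNGCGCAT"
windows = make_windows([0, 1, 2, 3, 10, 11, 14, 15], n_windows=2)
gc = [gc_content(reference[first:last + 1]) for first, last in windows]
```

Both toy windows contain four start positions, but the first spans 4 bases and the second spans 6.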
3. Extract the DNA of an individual cell, perform whole genome amplification, then construct a library and sequence it on the machine to obtain off-machine data, and carry out the corresponding analysis and processing to obtain alignment results in BAM format.
a) Proton platform: the off-machine data (BAM format) are converted to FASTQ format with BEDTools. Trimmomatic is then used to trim short sequences longer than 50 bp from the 3' end down to the effective length (50 bp plus the length of the primer of the whole genome amplification method); for example, the primer of multiple annealing and looping-based amplification cycles (MALBAC) is 35 bp, so the effective length is 85 bp. Short sequences shorter than the effective length are filtered out at the same time. The trimmed short sequences are aligned to the reference genome with bowtie, the result is converted to a BAM file with samtools view and sorted, and repeated short sequences are removed.
b) CG platform: the analysis pipeline developed for the platform is used to align the off-machine short sequence data to the reference genome hg19. The alignment result is then converted to BAM format and sorted, and repeated short sequences are removed with samtools in single-end mode.
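The effective-length rule used for the Proton data can be illustrated with a small sketch. It is not the Trimmomatic invocation itself, which the description does not give; the function name is hypothetical, and conceptually the two operations correspond to cropping reads to a fixed length and discarding reads below a minimum length.

```python
def trim_to_effective_length(reads, primer_length, base_length=50):
    """Trim reads from the 3' end to the effective length and drop shorter reads.

    effective length = base_length + primer_length
    (for example 50 + 35 = 85 for a 35 bp MALBAC primer).
    """
    effective = base_length + primer_length
    return [read[:effective] for read in reads if len(read) >= effective]


# Toy reads of 100, 85 and 60 bases; only the first two reach the effective length.
reads = ["A" * 100, "C" * 85, "G" * 60]
trimmed = trim_to_effective_length(reads, primer_length=35)
```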
4. Count the short sequences aligned to each window and normalize the counts, i.e., calculate for each window the comparison rate (ratio) = number of short sequences aligned to the window / mean number of short sequences aligned per window over all windows.
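The normalization in this step amounts to dividing each window count by the mean count over all windows. A minimal sketch, with a hypothetical function name and toy counts:

```python
def comparison_rates(read_counts):
    """Return count / mean(count) for every window."""
    mean_count = sum(read_counts) / len(read_counts)
    return [count / mean_count for count in read_counts]


# Toy counts for four windows; the mean count is 100.
ratios = comparison_rates([80, 100, 120, 100])
```

A window with an average number of aligned short sequences therefore has a ratio of 1.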
5. Use the ratio-GC content relationship determined by the LOWESS algorithm to GC-correct the ratio obtained for each window after normalization, obtaining the corrected ratio.
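A simplified sketch of this correction is given below. It is a plain local linear regression with tricube weights, without the robustness iterations of full LOWESS, and it assumes that the correction divides each ratio by the value fitted at the window's GC content; the description does not state the exact form of the correction. All names are hypothetical.

```python
def lowess_fit(x, y, frac=0.6):
    """Locally weighted linear fit of y on x, evaluated at every x (simplified LOWESS)."""
    n = len(x)
    k = max(2, int(round(frac * n)))          # number of neighbours in each local fit
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12             # bandwidth: distance to the k-th neighbour
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]  # tricube weights
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def gc_correct(gc, ratio, frac=0.6):
    """Divide each ratio by the ratio expected at its GC content (assumed correction)."""
    return [r / f for r, f in zip(ratio, lowess_fit(gc, ratio, frac))]


# Toy data in which the ratio depends only on GC content.
gc = [0.30, 0.35, 0.40, 0.45, 0.50, 0.55, 0.60]
ratio = [0.5 + g for g in gc]
corrected = gc_correct(gc, ratio)
```

Because the toy ratios are fully explained by GC content, the corrected ratios are all approximately 1.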
6. For each sample, based on the corrected ratio values of all windows, the CBS segmentation software can be used to merge windows into non-overlapping regions (segments) and to calculate the ratio value of each region; this ratio value is assigned to every window in the region (segment). Specifically: (a) based on the formula defining Z_xy, calculate the difference between the comparison rate of region N and that of all other windows to obtain all Z_xy, where region N is the region from the (x+1)-th window to the y-th window and Z_xy follows the standard normal distribution, and take Z_b = max_{1≤x<y≤w} |Z_xy|; (b) compare Z_b with the critical value; when Z_b exceeds the critical value, the corresponding region N is a prospective merged-window region, i.e., region N is a region in which a CNV occurs; (c) remove the merged-window region, update x, y and w, and repeat steps (a) and (b) until no Z_b exceeds the critical value, i.e., the windows are divided iteratively until no further windows can be merged. Here, x and y are window indices, w is the total number of windows, R_x is the corrected comparison rate of the x-th window, the critical value is the value in the distribution of Z_xy corresponding to a predetermined probability, the predetermined probability is ≥ 95%, 1 ≤ x < y ≤ w, S_x = R_1 + … + R_x, S_y = R_1 + … + R_y, and S_w = R_1 + … + R_w. Z_xy follows the standard normal distribution, and each predetermined probability corresponds to exactly one critical value. The above process can be understood as follows: assume that region N is a normal region without variation; when Z_b falls in the rejection region, i.e., Z_b is greater than the critical value corresponding to 99.9%, a small-probability event has occurred, so the null hypothesis is rejected, i.e., region N is a variant region. The ratio of each merged region is the mean of the corrected ratios of the windows it contains; the ratio value of the merged region is then assigned to all windows it contains as the comparison rate of those windows.
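The formula for Z_xy is not reproduced in this text. One form consistent with the definitions of S_x, S_y and S_w above, and with the circular binary segmentation statistic of Olshen et al. listed in the non-patent citations, is the following; it is given here as an assumption, not as the formula of the original publication:

$$Z_{xy} = \left[\frac{1}{y-x} + \frac{1}{w-y+x}\right]^{-1/2}\left[\frac{S_y - S_x}{y-x} - \frac{S_w - S_y + S_x}{w-y+x}\right]$$

Under that assumption, step (a) can be sketched as an exhaustive search over all pairs (x, y). The statistic is left unscaled, so reading it as standard normal presumes that the corrected comparison rates have unit variance. The function name is hypothetical.

```python
import math
from itertools import accumulate


def max_cbs_statistic(rates):
    """Return (Z_b, x, y) maximizing |Z_xy| over 1 <= x < y <= w.

    Region N consists of windows x+1 .. y (1-based); s[k] = R_1 + ... + R_k.
    """
    w = len(rates)
    s = [0.0] + list(accumulate(rates))
    best = (0.0, None, None)
    for x in range(1, w):
        for y in range(x + 1, w + 1):
            inside = y - x                     # number of windows in region N
            outside = w - inside               # number of all other windows
            diff = (s[y] - s[x]) / inside - (s[w] - s[y] + s[x]) / outside
            z = diff / math.sqrt(1 / inside + 1 / outside)
            if abs(z) > best[0]:
                best = (abs(z), x, y)
    return best


# Toy corrected rates: windows 4-6 are elevated relative to the rest.
z_b, x, y = max_cbs_statistic([1, 1, 1, 2, 2, 2, 1, 1, 1, 1])
```

For the toy data the maximum is reached at x = 3, y = 6, i.e., region N is windows 4 to 6. In steps (b) and (c), Z_b would then be compared with the critical value and the search repeated on the remaining windows.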
Then a density curve of the ratio of all windows is plotted, as shown in Figure 1. For a near-diploid cell, or a cell in which diploid is the mode among all ploidies, the ratio value at the highest peak of the density plot is the ratio value corresponding to a copy number of 2 for that cell.
7. Divide the ratio of each region by the ratio corresponding to a copy number of 2 and multiply by 2 to obtain the copy number of each region.
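The peak search and the copy number conversion can be sketched as follows. A fixed-width histogram is used here as a stand-in for the density curve, which is an assumption of this sketch; the bin width and the function names are hypothetical.

```python
from collections import Counter


def ratio_at_density_peak(window_ratios, bin_width=0.05):
    """Return the centre of the most populated histogram bin of the window ratios."""
    bins = Counter(round(r / bin_width) for r in window_ratios)
    peak_bin, _ = max(bins.items(), key=lambda item: item[1])
    return peak_bin * bin_width


def copy_number(region_ratio, ratio_for_two_copies, ploidy=2):
    """copy number = region ratio / ratio corresponding to copy number 2 * 2."""
    return region_ratio / ratio_for_two_copies * ploidy


# Toy window ratios after segmentation: most windows sit at ratio 1.0.
window_ratios = [1.0] * 6 + [1.5] * 3 + [0.5] * 2
peak = ratio_at_density_peak(window_ratios)
copies = [copy_number(r, peak) for r in (1.5, 1.0, 0.5)]
```

With the toy data the peak lies at a ratio of about 1.0, so regions with ratios 1.5, 1.0 and 0.5 map to copy numbers of about 3, 2 and 1.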
8. Calculate the sensitivity and specificity of CNV detection: sensitivity = LT/LC and specificity = LT/L, where L is the total length of the CNVs (≥ 1 Mb) found by single-cell sequencing, LC is the total length of the CNVs (≥ 1 Mb) found by tissue sequencing, and LT is the total length of the CNVs (≥ 1 Mb) found by both single-cell sequencing and tissue sequencing.
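A minimal sketch of these two measures is given below. It assumes that CNVs are given as (chromosome, start, end) intervals with half-open coordinates, that the intervals within each call set do not overlap one another, and that LT is taken as the overlap length between the two filtered call sets; the description does not spell out how the common length is computed. The names are hypothetical.

```python
MIN_SIZE = 1_000_000  # only CNVs of at least 1 Mb are counted


def keep_large(cnvs, min_size=MIN_SIZE):
    """Keep intervals whose length is at least min_size."""
    return [(c, s, e) for c, s, e in cnvs if e - s >= min_size]


def total_length(cnvs):
    return sum(e - s for _, s, e in cnvs)


def shared_length(cnvs_a, cnvs_b):
    """Total overlap length between two interval sets on matching chromosomes."""
    shared = 0
    for chrom_a, s_a, e_a in cnvs_a:
        for chrom_b, s_b, e_b in cnvs_b:
            if chrom_a == chrom_b:
                shared += max(0, min(e_a, e_b) - max(s_a, s_b))
    return shared


def sensitivity_specificity(single_cell_cnvs, tissue_cnvs):
    """Return (LT / LC, LT / L) as defined in the description."""
    single_cell = keep_large(single_cell_cnvs)
    tissue = keep_large(tissue_cnvs)
    lt = shared_length(single_cell, tissue)
    return lt / total_length(tissue), lt / total_length(single_cell)


# Toy call sets; the 0.5 Mb tissue call on chr3 is below the size threshold.
single_cell = [("chr1", 0, 4_000_000), ("chr2", 10_000_000, 12_000_000)]
tissue = [("chr1", 1_000_000, 5_000_000), ("chr3", 0, 500_000)]
sensitivity, specificity = sensitivity_specificity(single_cell, tissue)
```

In the toy example L = 6 Mb, LC = 4 Mb and LT = 3 Mb, giving a sensitivity of 0.75 and a specificity of 0.5. The sketch does not guard against empty call sets.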
The detection method and detection results according to the present invention are described in detail below with reference to specific individual samples. The examples shown below are only intended to explain the present invention and should not be construed as limiting it. In the description of the present invention, "primary", "secondary" and the like are used only for convenience of reference or description and should not be understood as indicating an order or relative importance; unless otherwise stated, "multiple" means two or more.
Unless otherwise specified, the reagents, sequences (adapters, tags and primers), software and instruments involved in the following examples that are not specifically described are conventional commercial products or publicly available, for example the library construction kits for the HiSeq 2000 sequencing platform purchased from Illumina.
Embodiment 1: Test of the CNV detection method on low-depth sequencing data from the CG platform based on MDA amplification
With the vigorous development of high-throughput sequencing, second-generation sequencing, represented by Complete Genomics (CG), Illumina Solexa and Roche 454, and the third-generation sequencing technologies (i.e., single-molecule sequencing technologies), including the Helicos Genetic Analysis System, single-molecule real-time sequencing (SMRT) and nanopore single-molecule sequencing, have become important tools for single-cell omics research. The CG platform, a second-generation sequencing technology focused on the human genome, can sequence the whole human genome completely and accurately, has a high sequencing throughput, and is highly regarded in the industry. The CG platform mainly comprises three parts: the sequencing platform, high-throughput process automation, and a complete data management solution. Its sequencing platform includes DNA nanoball arrays (DNB™ arrays) and combinatorial probe-anchor ligation sequencing (cPAL™); the use of these two technologies greatly reduces reagent consumption and shortens imaging time. We therefore first used off-machine data from the CG platform to test and verify the CNV detection method of the present invention.
Three single cells were isolated from the glioblastoma tissue of a patient (the tissue sample was provided by Beijing Tiantan Hospital). The DNA of each single cell was extracted and amplified by MDA whole genome amplification, libraries were constructed, and single-cell low-depth sequencing was carried out on the CG platform. Finally, single-cell CNV detection and analysis were performed according to the method of the present invention. To verify the CNV detection performance of the method, we also extracted DNA from the tissue, constructed a library, performed whole genome sequencing on the CG platform, and obtained the tissue CNV result with the standard CG analysis pipeline. The CNVs detected in the 3 single-cell samples (P1-T2-SC#) and the tissue sample (P1-T2) are shown in Figure 2. The thick black lines parallel to the X axis represent the copy number of each region: a value greater than 2 indicates a copy number gain in the region, a value less than 2 indicates a copy number loss, and a value equal to 2 indicates a normal copy number. The ratio value of each window is shown as a scatter point.
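The reading of the figure can be expressed as a small rule. Because a computed copy number is generally not an exact integer, this sketch rounds it to the nearest integer before comparing it with 2; the rounding and the function name are assumptions and are not stated in the description.

```python
def classify_region(copy_number, normal_copy_number=2):
    """Label a region as gain, loss or normal from its (rounded) copy number."""
    rounded = round(copy_number)
    if rounded > normal_copy_number:
        return "gain"
    if rounded < normal_copy_number:
        return "loss"
    return "normal"


# Toy copy numbers for three regions.
labels = [classify_region(value) for value in (3.1, 0.9, 2.2)]
```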
The sensitivity and specificity of the CNV detection method were further evaluated, with sensitivity = LT/LC and specificity = LT/L. The estimated average sensitivity of the 5 samples is 91.01% and the specificity is 74.47%; the results are shown in Table 1.
Table 1
Then the repeatability of the CNV detection method based on MDA amplification on the CG sequencing platform was assessed; the repeatability between samples was found to be higher than 0.7. See Table 2 for details.
Table 2
The statistical results for sensitivity, specificity and repeatability show that the results of the CNV analysis and detection pipeline of the present invention are valid and that the pipeline is feasible on the CG sequencing platform.
Embodiment 2: Test of the CNV detection method on single-cell low-depth sequencing from the Proton platform based on MDA amplification
Most existing single-cell sequencing data are produced on the Illumina sequencing platform. Although Illumina sequencers have a high throughput, the run time is long and the sequencing cost is high, which can limit the rapid development of single-cell CNV detection and analysis. Some studies do not require high sequencing throughput but, by contrast, place greater demands on sequencing time and cost; in such cases the Proton sequencing platform is a better choice. The Proton platform runs fast, with a sequencing cycle of only a few hours, and its sequencing cost is low, so it is better suited to deployment in hospitals or third-party testing institutions, shortening detection time and reducing cost and thereby improving detection efficiency. However, single-cell CNV detection based on Proton sequencing has rarely been reported.
Five cells (MDA-2_BGC#) of the human gastric adenocarcinoma cell line BGC823, from Peking University Cancer Hospital, were taken, libraries were constructed after multiple displacement amplification (MDA), and single-cell low-depth sequencing was performed on the Proton platform. At the same time, DNA was extracted from cells of the human gastric adenocarcinoma cell line BGC823 (BGC), a conventional library was constructed, and sequencing was performed on the Proton platform. CNV detection and analysis were then carried out according to our scheme. The CNVs detected in the 5 single-cell samples of BGC823 and the tissue sample (BGC) are shown in Figure 3. A copy number (thick black line parallel to the X axis) greater than 2 indicates a copy number gain in the region, less than 2 a copy number loss, and equal to 2 a normal copy number; the ratio values of the windows in the three kinds of copy number regions are shown as scatter points in different shades of gray.
Large-fragment CNVs were detected on multiple chromosomes across the whole genome, consistent with the CNV detection result of the cell population, which demonstrates the validity of the method of the present invention for detecting CNVs.
Then, based on the CNV detection results of the five single cells and the cell population, we further evaluated the sensitivity and specificity of the CNV detection method, with sensitivity = LT/LC and specificity = LT/L. The estimated average sensitivity of the 5 single-cell samples is 85.86% and the specificity is 81.18%; the results are shown in Table 3.
Table 3
The repeatability of the CNV detection method based on MDA amplification on the Proton sequencing platform was assessed; the repeatability between samples was found to be higher than 0.7. See Table 4 for details.
Table 4
The statistical results for sensitivity, specificity and repeatability show that the analysis results of the CNV detection pipeline of the present invention are valid and that the pipeline is feasible for sequencing data obtained by MDA amplification on the Proton sequencing platform.
Embodiment 3: Test of the CNV detection method on single-cell low-depth sequencing from the Proton platform based on MALBAC amplification
Five cells of the human gastric adenocarcinoma cell line BGC823 were taken and amplified by the MALBAC whole genome amplification method, conventional libraries were constructed, and single-cell low-depth sequencing was performed on the Proton platform. At the same time, DNA was extracted from a cell sample of the human gastric adenocarcinoma cell line BGC823 (BGC) and sequenced on the Proton platform. The resulting off-machine data were analyzed for CNVs according to our scheme, and CNVs were found in the five samples of BGC823. The results are shown in Figure 4, where the abscissa represents the chromosomes, the right-hand ordinate lists the 5 single-cell samples and the cell population sample, and the left-hand ordinate is the copy number. The thick black lines in the figure represent the value calculated for each region: a value greater than 2 indicates a copy number gain in the region, less than 2 a copy number loss, and equal to 2 a normal copy number. In the three kinds of copy number regions, the ratio values of the windows are shown as scatter points in different shades of gray.
The sensitivity and specificity of the CNV detection method of the present invention were further evaluated, with sensitivity = LT/LC and specificity = LT/L. The estimated average sensitivity of the 5 samples is 84.72% and the specificity is 85.18%; the results are shown in Table 5.
Table 5
The repeatability of the CNV detection method based on MALBAC amplification on the Proton sequencing platform was assessed; the repeatability between samples was found to be higher than 0.92. See Table 6 for details.
Table 6
The statistical results for sensitivity, specificity and repeatability show that the analysis results of the CNV detection pipeline of the present invention are valid and that the pipeline is feasible for sequencing data obtained by MALBAC amplification on the Proton sequencing platform.
Claims (11)
1. A method for detecting CNV, characterized by comprising the following steps:
obtaining a genome sequencing result of a target individual, the sequencing result comprising multiple reads;
aligning the sequencing result with a reference sequence to obtain an alignment result, wherein the reference sequence comprises multiple windows and the alignment result comprises the number of reads aligned to each window;
based on the alignment result, calculating the initial comparison rate of each window, wherein the initial comparison rate of a window = the number of reads aligned to the window / the mean number of reads aligned per window, and the mean number of reads aligned per window = the total number of reads aligned to all windows / the number of windows;
merging multiple adjacent windows whose initial comparison rates show no significant difference, the merged adjacent windows being defined as a primary region, and each remaining individual window also being called a primary region;
if the comparison rate of a primary region is not equal to a predetermined comparison rate, determining that a CNV exists in the primary region, wherein
the comparison rate of a primary region is the mean of the initial comparison rates of the windows contained in the primary region,
the predetermined comparison rate is the comparison rate that occurs most frequently among the comparison rates of all windows, and the comparison rate of a window is the comparison rate of the primary region in which it is located.
2. The method of claim 1, characterized in that the genome is obtained from a single cell of the target individual;
optionally, the sequencing result is obtained by constructing a genome sequencing library of the cell and sequencing the sequencing library;
optionally, constructing the sequencing library comprises subjecting the genome to degenerate oligonucleotide primed PCR, multiple displacement amplification and/or multiple annealing and looping-based amplification cycles.
3. The method of claim 1, characterized in that the reference sequence is a human reference genome;
optionally, each base of the pseudoautosomal region of the Y chromosome of the human reference genome is replaced with N, where N represents any one of A, T, C and G.
4. The method of claim 1, characterized in that determining the windows comprises:
aligning a short sequence set with the reference sequence and determining the start positions on the reference sequence of the aligned short sequences, wherein the short sequence set comprises multiple short sequences and is derived from a simulated sequence set and/or a sequencing result;
delimiting windows on the reference sequence such that each window contains an equal number of the start positions;
optionally, obtaining the simulated sequences of the simulated sequence set comprises:
starting from the base at one end of a chromosome of length Q of the reference sequence, copying P bases of the chromosome to obtain the first simulated sequence,
moving one base toward the other end of the chromosome and copying P bases of the chromosome to obtain the second simulated sequence,
moving two bases toward the other end of the chromosome and copying P bases of the chromosome to obtain the third simulated sequence,
and so on, until the (Q-P+1)-th simulated sequence is obtained, the terminal base of the (Q-P+1)-th simulated sequence coinciding with the base at the other end of the chromosome, wherein
P is the length of a simulated sequence, P ≥ 10;
optionally, the windows do not overlap one another;
optionally, the total number of windows is not more than 100,000.
5. The method of claim 1, characterized in that, before merging the multiple adjacent windows whose initial comparison rates show no significant difference, the initial comparison rate of each window is GC-corrected using a comparison rate-GC content relationship to obtain the corrected comparison rate of each window,
and the initial comparison rate of each window is replaced with the corrected comparison rate of that window.
6. The method of claim 5, characterized in that establishing the comparison rate-GC content relationship comprises:
obtaining sequencing data of the nucleic acid of at least one sample, the sequencing data consisting of multiple reads;
aligning the sequencing data with a reference sequence to obtain an alignment result, wherein the reference sequence comprises multiple windows and the alignment result comprises the number of reads aligned to each window;
calculating the initial comparison rate of each window, wherein the initial comparison rate of a window = the number of reads aligned to the window / the mean number of reads aligned per window, and the mean number of reads aligned per window = the total number of reads aligned to all windows / the number of windows;
based on the initial comparison rates of multiple windows and the GC contents of those windows, establishing the comparison rate-GC content relationship by two-dimensional regression analysis;
optionally, the two-dimensional regression analysis is locally weighted scatterplot smoothing.
7. The method of claim 5, characterized in that merging the multiple adjacent windows whose initial comparison rates show no significant difference means merging adjacent windows that satisfy the following condition:
the difference between their corrected comparison rates is not statistically significant.
8. The method of claim 7, characterized in that merging the multiple adjacent windows whose initial comparison rates show no significant difference comprises:
(a) based on the formula defining Z_xy, calculating the difference between the comparison rate of region N and that of all other windows to obtain all Z_xy, and taking Z_b = max_{1≤x<y≤w} |Z_xy|,
(b) comparing Z_b with a critical value; when Z_b exceeds the critical value, the corresponding region N is a primary region,
(c) removing the primary region, updating x, y and w, and repeating steps (a) and (b) until no Z_b exceeds the critical value, wherein
x and y are window indices,
w is the total number of windows,
region N is the region from the (x+1)-th window to the y-th window,
R_x is the corrected comparison rate of the x-th window,
the critical value is the value in the distribution of Z_xy corresponding to a predetermined probability, the predetermined probability being ≥ 95%,
1 ≤ x < y ≤ w,
S_x = R_1 + ... + R_x,
S_y = R_1 + ... + R_y,
S_w = R_1 + ... + R_w.
9. The method of claim 8, characterized by further comprising:
determining the type of the CNV by comparing the comparison rate of the primary region with the predetermined comparison rate, including:
when the comparison rate of the primary region is greater than the predetermined comparison rate, determining that the primary region is a copy number gain region,
when the comparison rate of the primary region is less than the predetermined comparison rate, determining that the primary region is a copy number loss region.
10. The method of any one of claims 7-9, characterized by further comprising:
calculating the copy number of the primary region using the following formula:
copy number of a primary region = comparison rate of the primary region / predetermined comparison rate × number of chromosome sets of the target individual,
wherein the comparison rate of the primary region is the mean of the corrected comparison rates of all windows it contains.
11. A device for detecting CNV, characterized by comprising:
a data input unit for receiving data;
a data output unit for outputting data;
a processor for executing an executable program, wherein executing the executable program comprises carrying out the method of any one of claims 1-10;
and
a storage unit for storing data, including the executable program.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510039685.5A CN105986008A (en) | 2015-01-27 | 2015-01-27 | CNV detection method and CNV detection apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105986008A true CN105986008A (en) | 2016-10-05 |
Family
ID=57034401
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510039685.5A Pending CN105986008A (en) | 2015-01-27 | 2015-01-27 | CNV detection method and CNV detection apparatus |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105986008A (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2013097062A1 (en) * | 2011-12-31 | 2013-07-04 | 深圳华大基因健康科技有限公司 | Method for detecting genetic variation |
-
2015
- 2015-01-27 CN CN201510039685.5A patent/CN105986008A/en active Pending
Non-Patent Citations (2)
Title |
---|
ADAM B. OLSHEN ET AL.: "Circular binary segmentation for the analysis of array-based DNA copy number data", 《BIOSTATISTICS》 * |
SEUNGTAI YOON ET AL.: "Sensitive and accurate detection of copy number variants using read depth of coverage", 《GENOME RESEARCH》 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105986008A (en) | CNV detection method and CNV detection apparatus | |
Aref-Eshghi et al. | Evaluation of DNA methylation episignatures for diagnosis and phenotype correlations in 42 Mendelian neurodevelopmental disorders | |
Zhao et al. | Detection of fetal subchromosomal abnormalities by sequencing circulating cell-free DNA from maternal plasma | |
Neverov et al. | Massively parallel sequencing for monitoring genetic consistency and quality control of live viral vaccines | |
Zien et al. | Centralization: a new method for the normalization of gene expression data | |
CN109880910A (en) | Detection site combination, detection method, detection kit and system for tumor mutation burden | |
CN106834490A (en) | Method for identifying embryo balanced translocation breakpoints and balanced translocation carrier status | |
Larsson et al. | Comparative microarray analysis | |
KR20200093438A (en) | Method and system for determining somatic mutant clonability | |
Galitsyna et al. | Single-cell Hi-C data analysis: safety in numbers | |
CN106367512A (en) | Method and system for identifying tumor loads in samples | |
CN108256289A (en) | Method for genome copy number variation based on target region capture sequencing | |
CN108256292A (en) | Copy number variation detection device | |
CN107267613A (en) | Sequencing data processing system and SMN gene detection system | |
CN113674803A (en) | Detection method of copy number variation and application thereof | |
Talevich et al. | CNVkit-RNA: copy number inference from RNA-sequencing data | |
CN106795551B (en) | CNV analysis method and detection device for single cell chromosome | |
Ramazzotti et al. | Longitudinal cancer evolution from single cells | |
CN113862351A (en) | Kit and method for identifying extracellular RNA biomarkers in body fluid sample | |
Schiffman et al. | Defining ancestry, heritability and plasticity of cellular phenotypes in somatic evolution | |
McVean et al. | Scanning the human genome for signals of selection | |
CN105177130B (en) | Marker for assessing the occurrence of immune reconstitution inflammatory syndrome in AIDS patients | |
Jeng et al. | Gene expression analysis of combined RNA-seq experiments using a receiver operating characteristic calibrated procedure | |
Wyllie et al. | M. tuberculosis microvariation is common and is associated with transmission: analysis of three years prospective universal sequencing in England | |
KR102361615B1 (en) | Method for drug repositioning based on drug responding gene expression features |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20161005 |