Method and system for identifying tumor load in a sample
Technical field
The present invention relates to the field of biotechnology, and in particular to a method and system for identifying tumor load in a sample.
Background art
In biomedical research and clinical practice, the tumor cells of cancer patients often carry a large number of genomic copy number variations. Copy number variations may be present in tumor tissue and in body fluids (such as blood, interstitial fluid, lymph, cerebrospinal fluid, urine, and saliva); within body fluids they are found specifically in circulating tumor cells (CTCs), cell-free DNA (cfDNA), exosomes, and the like. The state of genomic copy number variation in body fluids is an important indicator of tumor load, and identifying tumor load can be applied to early tumor screening, diagnosis, disease monitoring, prognosis, and treatment.
The main methods currently used to detect copy number variation in tumor genomes are comparative genomic hybridization (CGH), real-time fluorescence quantitative PCR (RTFQ PCR), fluorescence in situ hybridization (FISH), and multiplex ligation-dependent probe amplification (MLPA).
However, CGH has low resolution (megabase level), low throughput, and high cost. Fluorescence quantitative PCR likewise has low throughput and high cost, and can measure only one copy number variation at a time. FISH targets only specific positions, has low resolution, and has unstable probe hybridization efficiency. MLPA is complicated to operate, has low throughput, high cost, and small coverage, and easily causes PCR contamination. Beyond these technical shortcomings, most of these techniques examine only specific regions of the genome, whereas tumors are highly heterogeneous, so one or a few specific sites cannot effectively and comprehensively evaluate tumor load in body fluids.
Therefore, there is an urgent need in the art for a method and apparatus that can evaluate tumor load in body fluids more effectively and comprehensively and improve the sensitivity and generality of tumor detection.
Summary of the invention
The present invention provides a method and apparatus that can evaluate tumor load in body fluids more effectively and comprehensively and improve the sensitivity and generality of tumor detection.
A first aspect of the present invention provides a non-diagnostic method for identifying tumor load in a sample, comprising the steps of:
(i) providing a sample to be tested;
(ii) sequencing the sample to be tested, thereby obtaining genome sequences of the sample;
(iii) aligning the genome sequences obtained in step (ii) with a reference genome, thereby obtaining positional information of the genome sequences in the reference genome;
(iv) dividing the reference genome into m region segments, each region segment being a window b, and calculating the copy number of each window b;
(v) performing a z-test on each window b of step (iv), thereby calculating the z value of each window b; and
(vi) calculating a genomic abnormality score (GAS) from the z values obtained in step (v), and identifying the tumor load in the sample to be tested based on the value of the GAS.
In another preferred embodiment, the reference genome may be continuous or discontinuous.
In another preferred embodiment, the reference genome includes the whole genome.
In another preferred embodiment, the reference genome is the full length of all chromosomes of the species (such as human), the full length of one or more chromosomes, part of one or more chromosomes, or a combination thereof.
In another preferred embodiment, the reference genome covers more than 50% of the whole genome, preferably more than 60%, more preferably more than 70%, still more preferably more than 80%, and most preferably more than 95%.
In another preferred embodiment, the sample is derived from an individual to be tested.
In another preferred embodiment, the individual to be tested is a human or a non-human mammal.
In another preferred embodiment, the sample is a solid sample or a liquid sample.
In another preferred embodiment, the sample includes a body fluid sample.
In another preferred embodiment, the sample is selected from the group consisting of blood, plasma, interstitial fluid, lymph, cerebrospinal fluid, urine, saliva, aqueous humor, semen, and combinations thereof.
In another preferred embodiment, the sample is selected from the group consisting of circulating tumor cells (CTCs), cell-free DNA (cfDNA), exosomes, and combinations thereof.
In another preferred embodiment, the sequencing is selected from the group consisting of single-end sequencing, paired-end sequencing, and combinations thereof.
In another preferred embodiment, step (iv) further includes correcting the copy number of each window b and calculating the corrected copy number of each window b.
In another preferred embodiment, the correction method is selected from the group consisting of loess correction, the weighting method, the residual method, and combinations thereof.
In another preferred embodiment, based on the positional information of the genome sequences in the reference genome, the number of sequences falling in each window b, their base distribution, and the base distribution of the reference genome are counted.
In another preferred embodiment, the copy number of each window b is corrected according to the sequence count and base content of that window.
In another preferred embodiment, the z value of each window b is calculated with the following formula:
$$z_i = \frac{x_i - \mu_i}{\sigma_i}$$
where i is any positive integer from 1 to m; m is the total number of windows into which the reference genome is divided, m being a positive integer ≥ 50, preferably 50 ≤ m ≤ 10^5, more preferably 100 ≤ m ≤ 10^5, most preferably 200 ≤ m ≤ 10^5; x_i is the copy number value detected for the sample to be tested in the i-th window b_i; b_i is the i-th window; and μ_i is the arithmetic mean of the copy numbers of the normal control samples in window b_i, calculated with the following formula:
$$\mu_i = \frac{1}{n}\sum_{j=1}^{n} x_j$$
where j is any positive integer from 1 to n; n is the total number of normal control samples, n being a positive integer ≥ 30, preferably 30 ≤ n ≤ 10^8, more preferably 50 ≤ n ≤ 10^7, most preferably 100 ≤ n ≤ 10^4; x_j is the copy number value detected for the j-th normal control sample in window b_i; and σ_i is the standard deviation of the copy numbers of the normal control samples in window b_i, that is, the standard deviation of the values x_j (j = 1, ..., n) about μ_i, where n, j, x_j and μ_i are as defined above.
In another preferred embodiment, the normal control samples are samples of the same type from normal individuals of the same species.
In another preferred embodiment, the genomic abnormality score is calculated with the following formula:
$$\mathrm{GAS} = \sum_{b = m_b}^{p_b} \lvert z_b \rvert$$
where the windows are ranked by the absolute value of their z values, m_b is the window ranked at the m-th percentile, and p_b is the window ranked at the p-th percentile; m is 30-98, preferably 40-97, more preferably 60-96, still more preferably 80-95, most preferably 95; p is 80-100, preferably 85-100, more preferably 90-100, most preferably 100; and p − m ≥ 2 (preferably ≥ 5, more preferably ≥ 10, still more preferably ≥ 15, most preferably ≥ 20).
In another preferred embodiment, the following step is carried out before the genomic abnormality score is calculated:
(a) according to sequence features of the reference genome, removing regions of the genome that high-throughput sequencing cannot detect, such as centromeres, telomeres, satellites, and heterochromatin, and removing regions of length l adjacent to centromeres, telomeres, satellites, and heterochromatin, where l is any length less than 3 Mb; or
(b) according to copy number features of the sample, removing regions of the genome that high-throughput sequencing cannot detect, such as centromeres, telomeres, satellites, and heterochromatin.
In another preferred embodiment, the method further comprises the following steps before step (v):
(iv1) based on the copy number of each window b from step (iv), calculating the coefficient of variation cv_i of each window b in the normal control samples; and
(iv2) sorting the cv_i values from smallest to largest and removing the n% of windows with the largest values, where n is any value greater than 0 and less than or equal to 5, preferably n = 1, 2, 2.5, 3, 3.1, 4, 4.2 or 5.
In another preferred embodiment, the coefficient of variation cv_i is calculated with the following formula:
$$cv_i = \frac{\sigma_i}{\mu_i}$$
where μ_i is the arithmetic mean of the copy numbers of the normal control samples, calculated with the following formula:
$$\mu_i = \frac{1}{n}\sum_{j=1}^{n} x_j$$
and σ_i is the standard deviation of the copy numbers of the normal control samples, that is, the standard deviation of the values x_j about μ_i; n, j, x_j, μ_i and σ_i are as defined above.
A second aspect of the present invention provides a system (equipment) for identifying tumor load in a sample, comprising:
a sequencing unit, which performs nucleic acid sequencing on a sample to be tested, thereby obtaining genome sequences of the sample;
an alignment unit, connected to the sequencing unit, which aligns the obtained genome sequences of the sample with a reference genome, thereby obtaining positional information of the genome sequences in the reference genome;
a calculation and test unit, connected to the alignment unit, which calculates the copy number of each window b of the reference genome and performs a z-test on each window, thereby calculating the z value of each window b; and
an identification unit, connected to the calculation and test unit, which calculates a genomic abnormality score (GAS) from the obtained z values and identifies the tumor load in the sample based on the value of the GAS.
In another preferred embodiment, the system further includes a correction unit, connected to the calculation and test unit, which corrects the copy number of each window b of the reference genome, thereby calculating the corrected copy number of each window b.
In another preferred embodiment, before the z-test is performed on each window b, the calculation and test unit may calculate the coefficient of variation cv_i of each window b from the copy number of each window b, sort the cv_i values from smallest to largest, and remove the n% of windows with the largest values, where n is any value greater than 0 and less than or equal to 5, preferably n = 1, 2, 2.5, 3, 3.1, 4, 4.2 or 5.
It should be understood that, within the scope of the present invention, the technical features described above and the technical features specifically described below (for example, in the examples) may be combined with one another to form new or preferred technical solutions. For reasons of space, these combinations are not listed one by one.
Brief description of the drawings
Fig. 1 shows a flow chart of the analysis method for identifying tumor load in body fluid.
Fig. 2 shows the tumor load detection results for a patient across different clinical medication cycles.
Fig. 3 shows the whole-genome copy number variation of samples S1 to S7 and the corresponding GAS values.
Detailed description of the embodiments
Through extensive and in-depth research, the present inventors have for the first time established an effective method for identifying tumor load in a sample that improves the sensitivity and generality of tumor detection. Specifically, a genomic abnormality score (GAS) is calculated, and the tumor load in the sample is identified based on its value.
In addition, the present invention provides a system (equipment) for identifying tumor load in a sample, the system (equipment) including a sequencing unit, an alignment unit, a calculation and test unit, and an identification unit. In a preferred embodiment of the present invention, it also includes a correction unit. The present invention was completed on this basis.
Terms
As used herein, the term "copy number variation (CNV)" refers to an abnormal copy number of chromosomes or chromosome segments in the sample genome, including but not limited to chromosomal aneuploidy, deletions, duplications, and microdeletions and microduplications of more than 1000 bp.
As used herein, the term "genomic abnormality score (GAS)", also rendered as genome disorder degree, refers to a score calculated from abnormalities in chromosome or chromosome segment copy number in the sample genome. The detection range of the score includes, but is not limited to, the whole genome, a specific chromosome, a chromosome segment, or a specific gene.
As used herein, the term "z value (z-score)", also called the standard score, is the difference between a value and the mean, divided by the standard deviation. It is expressed as:
$$z = \frac{x - \mu}{\sigma}$$
where x is a specific value, μ is the arithmetic mean, and σ is the standard deviation. The z value represents the distance between the raw value and the reference mean, measured in units of standard deviation.
As used herein, the term "partial response (PR)" means that the sum of the longest diameters of the target lesions decreases by ≥ 30%, maintained for at least 4 weeks.
As used herein, the term "progressive disease (PD)" means that the sum of the longest diameters of the target lesions increases by ≥ 20%, or that new lesions appear.
As used herein, the terms "system" and "equipment" are used interchangeably.
Reference genome
In the present invention, taking human as an example, the reference genome may be the whole genome or part of the genome. The reference genome may be continuous or discontinuous. When the reference genome is part of the genome, its total coverage (f) is more than 50% of the whole genome, preferably more than 60%, more preferably more than 70%, still more preferably more than 80%, and most preferably more than 95%, where total coverage (f) is the percentage of the whole genome accounted for by the reference genome.
In a preferred embodiment, the reference genome is the whole genome.
In a preferred embodiment, the reference genome is the full length of all chromosomes of the species (such as human), the full length of one or more chromosomes, part of one or more chromosomes, or a combination thereof.
Tumor load
In the present invention, "tumor load" refers to the degree of harm a tumor does to the body, for example the size of the tumor, how active it is, whether it has metastasized, and how dangerous tumors at different sites are to the body. Indicators for evaluating tumor load include, but are not limited to, tumor size, tumor marker levels, clinical symptoms (wheezing and breathlessness, pain, etc.), related complications (superior vena cava syndrome, etc.), and the degree of wasting (anemia, hypoproteinemia, etc.).
Sequencing
In the present invention, conventional sequencing technologies and platforms may be used. The sequencing platform is not particularly limited. Second-generation sequencing platforms include, but are not limited to: GA, GAII, GAIIx, HiSeq 1000/2000/2500/3000/4000, X Ten, X Five, NextSeq 500/550, MiSeq, MiSeqDx, MiSeq FGx, and MiniSeq from Illumina; SOLiD from Applied Biosystems; 454 FLX from Roche; Ion Torrent, Ion PGM, and Ion Proton I/II from Thermo Fisher Scientific (Life Technologies); BGISEQ-1000, BGISEQ-500, and BGISEQ-100 from BGI; BioelectronSeq 4000 from CapitalBio; DA8600 from Daan Gene Co., Ltd. of Sun Yat-sen University; NextSeq CN500 from Berry Genomics; BIGIS from Zhongke Zixin, a subsidiary of Zixin Pharmaceutical; and HYK-PSTAR-IIA from HYK Gene.
Third-generation single-molecule sequencing platforms include, but are not limited to: the HeliScope system from Helicos BioSciences, the SMRT system from Pacific Biosciences, and GridION and MinION from Oxford Nanopore Technologies. Sequencing may be single-end or paired-end. The read length may be any length of 30 bp or more, such as 30 bp, 40 bp, 50 bp, 100 bp, or 300 bp. The sequencing depth may be any multiple of the genome of 0.01 or more, such as 0.01, 0.02, 0.1, 1, 5, 10, or 30 times.
In the present invention, the preferred configuration is the Illumina HiSeq 2500 high-throughput sequencing platform with single-end sequencing, a read length of 41 bp, and a sequencing data volume of 5 M.
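As a rough sanity check on the preferred configuration, the following sketch estimates average genome-wide depth from read count and read length. It assumes that "5 M" means about five million reads and that the human genome is about 3.1 Gb; both figures, and the function name `sequencing_depth`, are illustrative assumptions rather than values stated in this text.

```python
def sequencing_depth(read_count: int, read_length_bp: int, genome_size_bp: float) -> float:
    """Average depth = total sequenced bases / genome size."""
    return read_count * read_length_bp / genome_size_bp


# Assumed inputs: 5 million single-end reads of 41 bp on a ~3.1 Gb genome.
depth = sequencing_depth(5_000_000, 41, 3.1e9)
print(f"approximate depth: {depth:.3f}x")
```

Under these assumptions the depth comes to roughly 0.07 times the genome, which sits above the 0.01 minimum given above.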
Data processing
In the present invention, data processing generally includes the following steps:
(a) extracting nucleic acid from the sample to be tested and sequencing it to obtain genome sequences;
(b) aligning the genome sequences of the sample to a reference genome to obtain the position of each sequence in the reference genome;
(c) dividing the reference genome into windows of a certain length and calculating the copy number of each window b;
(d) performing a z-test on each window b and calculating the z value of each window; and
(e) calculating the genomic abnormality score (GAS).
In step (a), specifically: the sample to be tested is a body fluid, which may be blood, interstitial fluid (also called tissue fluid or intercellular fluid), lymph, cerebrospinal fluid, urine, or saliva. The detection target is the DNA contained in the body fluid, which is found specifically in circulating tumor cells (CTCs), cell-free DNA (cfDNA), exosomes, and the like. Methods for extracting DNA from the sample to be tested include, but are not limited to, column extraction and magnetic bead extraction. A library is constructed from the sample, and the sample is sequenced on a high-throughput sequencing platform.
In step (b), specifically: adapters and low-quality data are removed from the sequencing results, which are then aligned to the reference genome. The reference genome may be the whole genome, any chromosome, or part of a chromosome. A publicly recognized, established sequence is generally chosen as the reference genome; for human, this may be hg18 (NCBI36), hg19 (GRCh37), or hg38 (GRCh38) from NCBI or UCSC, or any one chromosome or part of a chromosome. Any free or commercial alignment software may be used, such as BWA (Burrows-Wheeler Alignment tool), SOAPaligner/SOAP2 (Short Oligonucleotide Analysis Package), or Bowtie/Bowtie2. Aligning the sequences to the reference genome gives the position of each sequence on the genome. Sequences that align uniquely to the genome may be selected and sequences that align to multiple places removed, which eliminates the error that repetitive sequences introduce into the copy number calculation.
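The selection of uniquely aligned sequences could be sketched as follows, working directly on SAM-format text. The text does not say how uniqueness is decided; this sketch assumes that a mapping quality (MAPQ) of at least 1 is an acceptable proxy, since aligners such as BWA typically report MAPQ 0 for reads that fit equally well in several places. The function name `unique_read_positions` and the threshold are hypothetical.

```python
from typing import Iterable, Iterator, Tuple


def unique_read_positions(sam_lines: Iterable[str], min_mapq: int = 1) -> Iterator[Tuple[str, int]]:
    """Yield (chromosome, 1-based position) for reads treated as uniquely aligned."""
    for line in sam_lines:
        if line.startswith("@"):        # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag, chrom = int(fields[1]), fields[2]
        pos, mapq = int(fields[3]), int(fields[4])
        if flag & 0x4:                  # read is unmapped
            continue
        if flag & (0x100 | 0x800):      # secondary or supplementary alignment
            continue
        if mapq < min_mapq:             # assumed proxy for a multi-mapped read
            continue
        yield chrom, pos


# Three toy records: one confident hit, one MAPQ-0 hit, one unmapped read.
example = [
    "@SQ\tSN:chr1\tLN:249250621",
    "r1\t0\tchr1\t10500\t37\t41M\t*\t0\t0\t*\t*",
    "r2\t0\tchr1\t20500\t0\t41M\t*\t0\t0\t*\t*",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]
kept = list(unique_read_positions(example))
print(kept)
```

Only the first read survives the filter here. A stricter or aligner-specific criterion (for example an aligner tag that marks unique hits) could replace the MAPQ test without changing the rest of the pipeline.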
In step (c), specifically: the genome is divided into windows of a certain length. Depending on the amount of data sequenced, the window lengths may be identical or different integers in the range 100 bp to 3,000,000 bp (3 Mb). The number of windows may be any integer in the range 1,000 to 30,000,000. Based on the positions of the sequenced reads on the genome, the number of sequences falling in each window, their base distribution, and the base distribution of the reference genome are counted. The copy number of each window is corrected according to the sequence count and base GC content of that window, using correction methods that include but are not limited to loess correction, and the corrected copy number of each window is calculated.
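The counting and correction in step (c) might look like the sketch below. It is not a loess implementation: for brevity it substitutes a simpler GC-stratum median normalization, in which each window is rescaled so that the median count of windows with the same rounded GC percentage matches the overall median. The function names and the one-percent GC strata are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def count_reads_per_window(positions, window_size):
    """positions: iterable of (chrom, 1-based pos). Returns {(chrom, window_index): read count}."""
    counts = defaultdict(int)
    for chrom, pos in positions:
        counts[(chrom, (pos - 1) // window_size)] += 1
    return dict(counts)


def gc_correct(counts, gc_fraction):
    """Rescale counts so every GC stratum has the same median (a stand-in for loess)."""
    overall = median(counts.values())
    strata = defaultdict(list)
    for window, count in counts.items():
        strata[round(gc_fraction[window] * 100)].append(count)
    corrected = {}
    for window, count in counts.items():
        stratum_median = median(strata[round(gc_fraction[window] * 100)])
        corrected[window] = count * overall / stratum_median if stratum_median > 0 else 0.0
    return corrected


# Toy read positions binned into 200 kb windows.
window_counts = count_reads_per_window(
    [("chr1", 1), ("chr1", 150_000), ("chr1", 200_000), ("chr1", 200_001), ("chr1", 450_000)],
    window_size=200_000,
)

# Toy GC correction: the high-GC windows are over-represented before correction.
toy_counts = {"w1": 10, "w2": 20, "w3": 30, "w4": 60}
toy_gc = {"w1": 0.40, "w2": 0.40, "w3": 0.50, "w4": 0.50}
toy_corrected = gc_correct(toy_counts, toy_gc)
print(window_counts, toy_corrected)
```

After correction the two GC strata in the toy data have matching values, which is the effect a loess fit of count against GC content would also aim for, only with a smooth curve instead of discrete strata.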
In step (d), specifically: samples are taken from n normal individuals (n being a natural number not less than 30) and, under the same extraction, library construction, and sequencing conditions, steps (a) to (c) above are repeated to form the reference data set. Each window b_i thus has n corresponding normal copy number values.
The arithmetic mean μ_i of the copy numbers of the normal control samples is calculated as:
$$\mu_i = \frac{1}{n}\sum_{j=1}^{n} x_j$$
The standard deviation σ_i of the copy numbers of the normal control samples is calculated as the standard deviation of x_1, x_2, x_3, ..., x_n about μ_i, where x_1, x_2, x_3, ..., x_n are the copy number values of the normal samples in window b_i.
The z value of each window b_i of the sample to be tested is calculated as:
$$z_i = \frac{x_i - \mu_i}{\sigma_i}$$
where x_i is the copy number value detected in window b_i for the sample to be tested.
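A minimal sketch of step (d), assuming the corrected copy numbers are held as plain lists with one value per window. The text does not show whether the standard deviation divides by n or by n − 1; the sketch uses the population form (`statistics.pstdev`, dividing by n) as an assumption. The function names are hypothetical, and the three-sample reference set is a toy stand-in for the n ≥ 30 normal samples the method calls for.

```python
from statistics import mean, pstdev


def reference_stats(normal_copy_numbers):
    """normal_copy_numbers: one list per normal sample, one value per window.
    Returns a list of (mu_i, sigma_i), one pair per window."""
    n_windows = len(normal_copy_numbers[0])
    stats = []
    for i in range(n_windows):
        column = [sample[i] for sample in normal_copy_numbers]
        stats.append((mean(column), pstdev(column)))  # pstdev divides by n (assumption)
    return stats


def window_z_scores(test_copy_numbers, stats):
    """z_i = (x_i - mu_i) / sigma_i for each window of the test sample."""
    # Windows with sigma_i == 0 would need to be excluded before this step.
    return [(x - mu) / sigma for x, (mu, sigma) in zip(test_copy_numbers, stats)]


normals = [[1.0, 2.0], [3.0, 2.5], [2.0, 1.5]]   # 3 toy samples, 2 windows
stats = reference_stats(normals)
z = window_z_scores([2.0, 3.0], stats)
print(z)
```

In the toy data the first window of the test sample equals the reference mean, so its z value is 0, while the second window lies about 2.45 standard deviations above the mean.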
In step (e), specifically: across the whole genome, and around certain chromosomes, chromosome segments, or genes, there are highly repetitive regions, such as regions near centromeres, telomeres, satellites, and heterochromatin. These highly repetitive regions are removed first to eliminate their effect on the score calculation.
In a preferred embodiment, the removal methods include, but are not limited to:
A. Removal according to sequence features of the reference genome
Regions of the genome that high-throughput sequencing cannot detect, such as centromeres, telomeres, satellites, and heterochromatin, are removed, together with regions of length l adjacent to centromeres, telomeres, satellites, and heterochromatin, where l may be any length less than 3 Mb; or
B. Removal according to copy number features of the normal samples
For each window b_i, the coefficient of variation cv_i of the normal control samples in that window is calculated as:
$$cv_i = \frac{\sigma_i}{\mu_i}$$
where μ_i is the arithmetic mean of the copy numbers of the normal control samples and σ_i is their standard deviation.
The cv_i values are sorted from smallest to largest and the n% of windows with the largest values are removed, where n may be any value greater than 0 and less than or equal to 5.
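Removal method B can be illustrated with the short sketch below, which ranks windows by coefficient of variation and drops the largest few percent. The function name `keep_low_cv_windows`, the rounding down of the number of windows removed, and the treatment of windows with a non-positive mean are all assumptions for illustration.

```python
def keep_low_cv_windows(stats, remove_percent=5.0):
    """stats: list of (mu_i, sigma_i) per window from the normal controls.
    Returns the sorted indices of windows kept after dropping the largest-CV windows."""
    cv = [
        (sigma / mu if mu > 0 else float("inf"), index)   # cv_i = sigma_i / mu_i
        for index, (mu, sigma) in enumerate(stats)
    ]
    cv.sort()                                             # smallest CV first
    n_remove = int(len(cv) * remove_percent / 100)        # rounding down is an assumption
    kept = cv[: len(cv) - n_remove]
    return sorted(index for _, index in kept)


# 20 toy windows with identical means and steadily increasing spread.
toy_stats = [(1.0, 0.01 * (i + 1)) for i in range(20)]
kept_windows = keep_low_cv_windows(toy_stats, remove_percent=5.0)
print(kept_windows)
```

With 20 windows and a 5% cut, exactly one window (the noisiest) is dropped. The kept indices would then be used to select which z values enter the score.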
In step (e), the genomic abnormality score (GAS) is specifically calculated as follows:
First the detection range of the score is determined. The detection range includes, but is not limited to, any range from 1 Mb up to the genome length (about 3 Gb for human), such as the whole genome, a specific chromosome, a specific chromosome segment, or a specific gene. Within the detection range, the absolute value is taken of the z value of each window that remains after windows affected by repetitive sequences have been removed. The absolute z values are sorted from smallest to largest and distributed evenly over the range 0% to 100%, with the minimum assigned to 0% and the maximum assigned to 100%. The cumulative sum of the absolute z values of the windows lying between the m-th and p-th percentiles is then calculated, where m is 30-98, preferably 40-97, more preferably 60-96, still more preferably 80-95, most preferably 95; p is 80-100, preferably 85-100, more preferably 90-100, most preferably 100; and p − m ≥ 2 (preferably ≥ 5, more preferably ≥ 10, still more preferably ≥ 15, most preferably ≥ 20). This cumulative sum is the genomic abnormality score (GAS):
$$\mathrm{GAS} = \sum_{b = m_b}^{p_b} \lvert z_b \rvert$$
where m_b is the window ranked at the m-th percentile and p_b is the window ranked at the p-th percentile. The GAS value is used to identify tumor load in the body fluid.
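The score calculation described above could be sketched as follows. The percentile of each window is taken as its rank spread evenly from 0% to 100%, and both bounds are treated as inclusive; the inclusive bounds and the function name `genomic_abnormality_score` are assumptions, since the text does not spell out edge handling.

```python
def genomic_abnormality_score(z_values, m_percent=95.0, p_percent=100.0):
    """Sum of |z| over windows ranked between the m-th and p-th percentiles."""
    abs_sorted = sorted(abs(z) for z in z_values)
    last = len(abs_sorted) - 1
    if last < 1:
        raise ValueError("need at least two windows")
    total = 0.0
    for rank, value in enumerate(abs_sorted):
        percentile = 100.0 * rank / last      # minimum -> 0%, maximum -> 100%
        if m_percent <= percentile <= p_percent:
            total += value
    return total


# 21 toy z values from -10 to 10; the two most extreme are -10 and 10.
toy_z = list(range(-10, 11))
gas_top5 = genomic_abnormality_score(toy_z, 95.0, 100.0)
gas_top10 = genomic_abnormality_score(toy_z, 90.0, 100.0)
print(gas_top5, gas_top10)
```

Because only the upper tail of the ranked absolute z values is summed, windows with ordinary variation contribute nothing, while a sample with many strongly deviating windows accumulates a larger score.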
Method for identifying tumor load in a sample
The present invention provides an effective method for identifying tumor load in a sample that improves the sensitivity and generality of tumor detection, comprising the steps of:
(i) providing a sample to be tested;
(ii) sequencing the sample to be tested, thereby obtaining genome sequences of the sample;
(iii) aligning the genome sequences obtained in step (ii) with a reference genome, thereby obtaining positional information of the genome sequences in the reference genome;
(iv) dividing the reference genome into m region segments, each region segment being a window b, and calculating the copy number of each window b;
(v) performing a z-test on each window b of step (iv), thereby calculating the z value of each window b; and
(vi) calculating a genomic abnormality score (GAS) from the z values obtained in step (v), and identifying the tumor load in the sample to be tested based on the value of the GAS.
In a preferred embodiment of the present invention, the method comprises the steps of:
(a) extracting nucleic acid from the sample genome and sequencing it to obtain genome sequences;
(b) aligning the sequences to a reference genome to obtain the position of each sequence on the genome;
(c) dividing the reference genome into windows b of a certain length and calculating the copy number of each window b; and
(d) performing a z-test on each window b, calculating the z value of each window b, and calculating the genomic abnormality score (GAS), thereby identifying the tumor load in the sample based on the value of the GAS.
System (equipment) for identifying tumor load in a sample
The present invention also provides a system (equipment) for identifying tumor load in a sample, comprising:
a sequencing unit, which performs nucleic acid sequencing on a sample to be tested, thereby obtaining genome sequences of the sample;
an alignment unit, connected to the sequencing unit, which aligns the obtained genome sequences of the sample with a reference genome, thereby obtaining positional information of the genome sequences in the reference genome;
a calculation and test unit, connected to the alignment unit, which calculates the copy number of each window b of the reference genome and performs a z-test on each window, thereby calculating the z value of each window b; and
an identification unit, connected to the calculation and test unit, which calculates a genomic abnormality score (GAS) from the obtained z values and identifies the tumor load in the sample based on the value of the GAS.
In a preferred embodiment, the system further includes a correction unit, connected to the calculation and test unit, which corrects the copy number of each window b of the reference genome, thereby calculating the corrected copy number of each window b.
The main advantages of the present invention include:
(1) The present invention establishes, for the first time, a method and system for identifying tumor load in a sample; the method and system can identify tumor load in a sample accurately and effectively.
(2) The method and system of the present invention can improve the sensitivity and generality of tumor detection.
(3) The method and system of the present invention can reduce the pain that sampling causes tumor patients during testing, enabling non-invasive detection.
(4) The method and system of the present invention can effectively test patients from whom samples cannot be taken by conventional detection methods.
(5) The method and system of the present invention can test tumor patients in real time and monitor the efficacy of medication, providing guidance for physicians on medication and treatment.
The present invention is further described below with reference to specific examples. It should be understood that these examples are intended only to illustrate the present invention and not to limit its scope. Experimental methods for which detailed conditions are not given in the following examples generally follow conventional conditions, such as those described in Sambrook et al., Molecular Cloning: A Laboratory Manual (New York: Cold Spring Harbor Laboratory Press, 1989), or the conditions recommended by the manufacturer. Unless otherwise indicated, percentages and parts are by weight.
Unless otherwise specified, the materials used in the examples are commercially available products.
Example 1
The present invention has been applied to 15 cases with good results. To make the use and effect of the present invention easier to understand, a practical example is described in detail below. The overall workflow is shown in Fig. 1, and the detailed procedure is as follows:
1. Nucleic acid extraction and sequencing of the sample genome
In this example, the test samples came from a gastric cancer patient; cell-free DNA (cfDNA) and leukocytes were extracted from blood. Nucleic acid was extracted with the CW2603 nucleic acid extraction kit from Kangwei Century Biotechnology Co., Ltd., following the product manual provided by that company.
Libraries were constructed with the CW2185 library construction kit from Kangwei Century Biotechnology Co., Ltd. and then sequenced. Sequencing used the Illumina HiSeq 2500 high-throughput sequencing platform, operated according to the instructions provided by Illumina. Sequencing was single-end, with a read length of 41 bp and a sequencing data volume of 5 M.
2. Aligning the sequences to the reference genome to obtain the position of each sequence on the genome
Adapters and low-quality data were removed from the sequencing results, which were then aligned to the reference genome. The reference genome was the UCSC human genome hg19 (GRCh37), and the alignment software was BWA (Burrows-Wheeler Alignment tool) with default parameters. Aligning the sequences to the reference genome gave the position of each sequence on the genome, and sequences that aligned uniquely to the genome were selected.
3. Dividing the reference genome into windows of a certain length and calculating the copy number of each window
The genome was divided into 15489 windows b (regions), each 200 kb long. Based on the positions of the sequences on the genome, the number of sequences falling in each window b, their base distribution, and the base distribution of the reference genome were counted. The copy number of each window b was corrected according to the sequence count and base GC content of that window, using loess as the correction method, and the corrected copy number of each window b was calculated.
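The figure of 15489 windows can be reproduced with the arithmetic below, which tiles each chromosome separately into 200 kb windows and lets the last window of each chromosome be shorter. The chromosome lengths are the hg19 values for chr1 to chr22, X, and Y as quoted from the UCSC assembly and should be checked against the actual reference files; the per-chromosome tiling itself is an inference from the matching total, not something this text states.

```python
# hg19 chromosome lengths in bp (chr1-22, X, Y); verify against the reference in use.
HG19_CHROM_LENGTHS = {
    "chr1": 249_250_621, "chr2": 243_199_373, "chr3": 198_022_430, "chr4": 191_154_276,
    "chr5": 180_915_260, "chr6": 171_115_067, "chr7": 159_138_663, "chr8": 146_364_022,
    "chr9": 141_213_431, "chr10": 135_534_747, "chr11": 135_006_516, "chr12": 133_851_895,
    "chr13": 115_169_878, "chr14": 107_349_540, "chr15": 102_531_392, "chr16": 90_354_753,
    "chr17": 81_195_210, "chr18": 78_077_248, "chr19": 59_128_983, "chr20": 63_025_520,
    "chr21": 48_129_895, "chr22": 51_304_566, "chrX": 155_270_560, "chrY": 59_373_566,
}


def window_count(chrom_lengths, window_size):
    """Number of fixed-size windows when each chromosome is tiled on its own
    (the final window of a chromosome may be shorter than window_size)."""
    return sum((length + window_size - 1) // window_size for length in chrom_lengths.values())


total_windows = window_count(HG19_CHROM_LENGTHS, 200_000)
print(total_windows)  # 15489
```

Tiling per chromosome, rather than across the concatenated genome, keeps every window within a single chromosome, which matters when copy number changes are later interpreted chromosome by chromosome.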
4. Calculating the CV value of each window
Samples were taken from 100 normal individuals and, under the same extraction, library construction, and sequencing conditions, steps 1, 2, and 3 above were repeated to obtain normal control sample data as the reference data set, from which the CV value of each window b_i was calculated.
Each window b_i corresponds to n normal copy number values (n = 100 in this example).
The arithmetic mean μ_i of the copy numbers of the normal control samples is calculated as:
$$\mu_i = \frac{1}{n}\sum_{j=1}^{n} x_j$$
The standard deviation σ_i of the copy numbers of the normal control samples is calculated as the standard deviation of x_1, x_2, x_3, ..., x_n about μ_i, where x_1, x_2, x_3, ..., x_n are the copy number values of the normal samples.
The CV value of each window b_i is calculated as:
$$cv_i = \frac{\sigma_i}{\mu_i}$$
5. Performing a z-test on each window and calculating the z value of each window
The z value of each window b_i of the sample to be tested is calculated as:
$$z_i = \frac{x_i - \mu_i}{\sigma_i}$$
where x_i is the copy number value detected in window b_i, μ_i is the arithmetic mean of the copy numbers of the normal control samples, and σ_i is the standard deviation of the copy numbers of the normal control samples, both calculated as in step 4.
6. Calculating the genomic abnormality score (GAS)
In this example, the windows were sorted by CV from smallest to largest, and the 5% of windows with the largest CV were removed and excluded from the score calculation that follows. The detection range of the score was the whole genome. The absolute values of the z values were taken and sorted from smallest to largest, and the cumulative sum of the absolute z values of the windows from the m-th to the p-th percentile was calculated; this cumulative sum is the genomic abnormality score (GAS):
$$\mathrm{GAS} = \sum_{b = m_b}^{p_b} \lvert z_b \rvert$$
where m_b is the window ranked at the m-th percentile and p_b is the window ranked at the p-th percentile, with m = 95 and p = 100. The GAS value was used to identify tumor load in the body fluid.
7. Test results
More than ten samples were tested. One typical case is described below.
The test results are shown in Table 1, Fig. 2, and Fig. 3.
Table 1. Tumor load detection results of Example 1 for a gastric cancer patient over the course of clinical medication
The results show that before clinical medication, when the patient was diagnosed with gastric cancer, the cfDNA copy number was severely abnormal (Fig. 3, S1) and the whole-genome GAS was 999.84, indicating a relatively heavy tumor load in the blood.
With medication, the cfDNA copy number was normal by the fourth cycle and the whole-genome GAS was 728.80, close to the value of 729.86 for normal leukocytes.
Using the same method as in this example, the whole-genome GAS was calculated for the 100 normal individuals described above; the normal range was 722.87-739.89, with an arithmetic mean of 733.22. The whole-genome GAS values for the fourth medication cycle and for the leukocytes in this example fall within the normal range, indicating a very small tumor load in the blood, which is consistent with the clinical efficacy evaluation of PR (partial response).
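The comparison against the normal range can be expressed as a small check. The range and the three GAS values are taken from this example; the function name `within_normal_range` and the inclusive bounds are assumptions, and a range derived from 100 normal samples is only a reference interval, not a validated diagnostic cutoff.

```python
NORMAL_GAS_RANGE = (722.87, 739.89)   # range observed in the 100 normal samples of this example


def within_normal_range(gas, normal_range=NORMAL_GAS_RANGE):
    """True if the GAS lies inside the reference interval (bounds inclusive by assumption)."""
    low, high = normal_range
    return low <= gas <= high


observations = {
    "cfDNA before medication (S1)": 999.84,
    "cfDNA at medication cycle 4": 728.80,
    "leukocytes": 729.86,
}
flags = {label: within_normal_range(value) for label, value in observations.items()}
for label, is_normal in flags.items():
    print(f"{label}: {'within' if is_normal else 'outside'} normal range")
```

The pre-medication cfDNA value falls outside the interval, while the cycle-4 cfDNA and the leukocyte values fall inside it, matching the interpretation given above.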
With further medication, the tumor developed drug resistance, the cfDNA copy number abnormalities became severe again, and the whole-genome GAS rose, indicating that the tumor load in the blood had become heavy. By the seventh medication cycle the whole-genome GAS was at its highest, which is consistent with the clinical efficacy evaluation of PD (progressive disease).
The results show that the genomic abnormality score can effectively identify tumor load in body fluids.
All documents mentioned in the present invention are incorporated by reference in this application, as if each document were individually incorporated by reference. In addition, it should be understood that, after reading the above teachings of the present invention, those skilled in the art may make various changes or modifications to the present invention, and such equivalent forms likewise fall within the scope defined by the claims appended to this application.