CN109402241A - Method for identifying and analyzing ancient DNA samples - Google Patents
Method for identifying and analyzing ancient DNA samples
- Publication number
- CN109402241A (application number CN201710667605.XA)
- Authority
- CN
- China
- Prior art keywords
- dna
- read
- chromosome
- measured
- dna sample
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
Abstract
The invention discloses a method for identifying and analyzing ancient DNA samples, including a method for obtaining the DNA information of a DNA sample to be tested. The method comprises the following steps: constructing a library from the DNA sample to be tested and sequencing it to obtain sequencing data; filtering the sequencing data; and aligning the filtered sequencing data to obtain an alignment result, where the alignment result contains the DNA information of the DNA sample to be tested and the alignment allows at most 4 mismatched bases. With this method, the DNA information of an ancient DNA sample can be obtained effectively from library construction and sequencing. The information is accurate and reliable, and can be used for genome analysis of the ancient DNA, such as variant detection, ancient DNA authentication, sex determination, and assessment of the modern human DNA contamination rate.
Description
Technical field
The present invention relates to the field of biological sequencing technology, and in particular to a method for identifying and analyzing ancient DNA samples.
Background art
Ancient biological samples are essential to studying the evolutionary history of modern populations. Research on ancient human genomes has shown that the genetic makeup of modern humans does not consist solely of African ancestry: after leaving Africa, modern humans exchanged genes with Neanderthals and Denisovans, which overturned the previous understanding of modern human evolutionary history. Research on ancient genomes also plays an irreplaceable role in studying natural selection and disease in modern populations, especially humans. For example, the high-altitude adaptation gene of Tibetans has been shown to result from introgression from the Denisovan genome. As a genetic resource that cannot be reproduced, ancient biological samples greatly advance research on the evolution, selection, and disease of modern populations, and cannot be substituted.
Research on ancient genetics has reached the genome level. China is rich in paleontological resources: it has abundant animal and plant fossils and subfossils, and ancient human samples continue to be unearthed. One of the biggest bottlenecks limiting the development of ancient human genome research in China is the lack of a summary of methods for ancient DNA processing and information analysis.
Thus, current methods for identifying and analyzing ancient DNA samples still need improvement.
Summary of the invention
The present invention aims to solve at least one of the technical problems in the prior art. To this end, one object of the present invention is to construct a standard information analysis pipeline for ancient DNA based on Illumina second-generation sequencing data, and to provide a set of methods for ancient human genome analysis.
It should be noted that the present invention was completed on the basis of the following findings and work of the inventors:
The inventors carried out a series of theoretical studies and experimental explorations on methods for ancient DNA processing and information analysis, and found that:
1. Ancient DNA is highly fragmented, so no fragmentation step is needed when constructing the DNA library; library construction can proceed directly after DNA extraction.
2. For ancient DNA, long fragments should not be selected for sequencing, and the read length should be kept within 100 bp. Because the average length of ancient DNA is about 50-70 bp, a read length above 100 bp would on the one hand introduce a large amount of adapter contamination and on the other hand waste a large amount of data.
3. For ancient DNA, the most important step after the raw Fastq data come off the sequencer is to filter the data according to the characteristics of Illumina data and the sequence characteristics of ancient DNA, in order to remove low-quality sequences and exogenous contaminating DNA sequences as far as possible. Data filtering mainly covers 4 aspects: filtering adapters, filtering low-quality bases with quality value Q≤10, filtering N regions (regions that cannot be identified), and removing reads shorter than 30 bp or longer than 99 bp. Reads shorter than 30 bp cause more misalignments in the subsequent alignment. Because ancient DNA is highly fragmented, with an average length of generally 50-70 bp, reads that are too long (longer than 99 bp) probably come from modern DNA contamination; therefore, to retain ancient DNA as far as possible, these reads should be deleted. This step is extremely important: if reads longer than 99 bp are not deleted, the accuracy of subsequent species identification is affected. This is also a major difference from species identification of modern biological samples.
4. To be compatible with most downstream analyses of alignment results, the raw data, after quality control, are aligned with both SoapAligner and BWA. The data aligned with SoapAligner produce alignment results in Soap format; the data aligned with BWA produce alignment results in SAM format. Considering that deamination of ancient DNA causes many mutations, at most 4 mismatched bases are allowed during alignment. The alignment results are thus accurate and reliable, which benefits subsequent analysis.
5. Variant detection is performed on the aligned data with both SoapSnp and GATK, mainly detecting single nucleotide variants. When SoapSnp is used for variant detection, results are output in cns format, i.e. all sites are output. This benefits subsequent analysis.
6. Ancient DNA authentication: authenticating ancient DNA is the most basic prerequisite for any subsequent customized information analysis. Drawing on the molecular characteristics of ancient DNA, the inventors propose authenticating ancient DNA based on at least one of the following 2 aspects:
(1) Based on the deamination mutation feature:
Deamination mutation feature of ancient DNA: during long-term preservation of ancient biological samples, double-stranded DNA suffers an important type of chemical damage, namely cytosine deamination. Deamination occurs mainly at the ends of DNA fragments, that is, at the 5' end and the 3' end. Deamination converts cytosine to uracil, which introduces C->T mutations during library construction and sequencing. Therefore, when ancient DNA undergoes second-generation sequencing, a large number of C->T and G->A mutations appear at the 5' and 3' ends of the reads. The inventors consider that this mutation pattern can serve as one piece of evidence for whether the obtained sequences are ancient DNA.
(2) Based on the DNA fragmentation feature:
Depurination (DNA fragmentation) feature: depurination is one of the most important chemical processes that break DNA strands during the preservation of ancient DNA; that is, a considerable part of ancient DNA fragmentation is caused by depurination. The inventors consider that, when ancient DNA fragments are aligned to a reference genome, depurination shows up as a greatly increased proportion of purines at the base immediately preceding the 5' end of the reads and, conversely, a greatly increased proportion of pyrimidines at the base immediately following the 3' end. Thus, the inventors consider that this breakage pattern of ancient DNA, like deamination, can serve as one piece of strong evidence for whether the sequences are ancient DNA.
7. The inventors also constructed a method for assessing exogenous DNA contamination in female ancient DNA samples by means of the Y chromosome. The method first obtains the Y-chromosome specific region (YUR, a region that is not homologous to any other chromosome and contains no repetitive sequence). The reads of the ancient DNA are then aligned to the YUR, and the expected value under the assumption that the sample is male is calculated from the YUR and the number of specifically aligned reads. Finally, the ratio between the number of reads actually aligned and the expected value is the contamination rate from males.
Accordingly, in a first aspect, the present invention provides a method for obtaining the DNA information of a DNA sample to be tested. According to an embodiment of the invention, the method comprises the following steps: constructing a library from the DNA sample to be tested and sequencing it to obtain sequencing data, where no DNA fragmentation step is performed during library construction and the sequencing read length does not exceed 100 bp; filtering the sequencing data to obtain filtered sequencing data; and aligning the filtered sequencing data to obtain an alignment result, where the alignment result contains the DNA information of the DNA sample to be tested. The filtering includes at least one of the following: (1) removing adapter sequences; (2) removing low-quality bases with quality value Q≤10, where the whole read is deleted when the low-quality bases account for 50% or more of the total bases of the read, and only the low-quality bases are cut off when they lie at the end of the read and account for no more than 50% of the read; (3) filtering N regions, where the read is removed when its N content exceeds 10%, and only the N regions at the two ends are cut off when N regions exist only at the two ends of the read; (4) removing reads shorter than 30 bp or longer than 99 bp. The alignment allows at most 4 mismatched bases.
It should be noted that, in the expression "filtering N regions, where the read is removed when its N content exceeds 10%, and only the N regions at the two ends are cut off when N regions exist only at the two ends of the read", "N region" refers to a region that cannot be identified, and "N content" refers to the proportion of bases that cannot be identified.
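The filtering rules (2) to (4) above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the inventors' implementation: the function name `filter_read` is hypothetical, adapters are assumed to have been removed already, and base qualities are assumed to be given as a list of integer Phred scores.

```python
from typing import List, Optional, Tuple

MIN_LEN, MAX_LEN = 30, 99   # length bounds from rule (4)
LOW_Q = 10                  # bases with Q <= 10 count as low quality


def filter_read(seq: str, quals: List[int]) -> Optional[Tuple[str, List[int]]]:
    """Apply rules (2)-(4) to one adapter-trimmed read.

    Returns the (possibly end-trimmed) read, or None if the read is discarded.
    """
    n = len(seq)
    if n == 0:
        return None
    # Rule (2): discard the read if low-quality bases make up 50% or more.
    low = sum(q <= LOW_Q for q in quals)
    if low * 2 >= n:
        return None
    # Rule (3): discard the read if more than 10% of its bases are N.
    if seq.upper().count("N") > 0.10 * n:
        return None
    # Otherwise cut off low-quality bases and Ns sitting at either end.
    start, end = 0, n
    while start < end and (quals[start] <= LOW_Q or seq[start] in "Nn"):
        start += 1
    while end > start and (quals[end - 1] <= LOW_Q or seq[end - 1] in "Nn"):
        end -= 1
    seq, quals = seq[start:end], quals[start:end]
    # Rule (4): keep only reads of 30-99 bp.
    if not (MIN_LEN <= len(seq) <= MAX_LEN):
        return None
    return seq, quals
```

In this sketch the length check runs last, so a read that becomes shorter than 30 bp after end trimming is also discarded; the text does not state the order of the checks, so this ordering is an assumption.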
According to an embodiment of the invention, this method can effectively obtain the DNA information of an ancient DNA sample to be tested from library construction and sequencing. The information is accurate and reliable, and can be used for genome analysis of the ancient DNA, such as variant detection, ancient DNA authentication, sex determination, and assessment of the modern human DNA contamination rate.
According to an embodiment of the invention, the alignment is performed with both SoapAligner and BWA. The alignment result is thus accurate and reliable.
According to some embodiments of the invention, alignment with SoapAligner generates alignment results in Soap format, and alignment with BWA generates alignment results in SAM format. This makes it convenient to merge the two sets of alignment results, and the final alignment result is highly reliable.
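As a hedged illustration of the "at most 4 mismatched bases" constraint, the sketch below checks SAM-format alignment records after the fact. It relies on the optional `NM:i:` tag defined in the SAM specification, which records the edit distance to the reference (mismatches plus inserted and deleted bases), so it is a conservative proxy for the mismatch count rather than the aligner setting used by the inventors. The function names are hypothetical.

```python
from typing import Optional

MAX_MISMATCH = 4  # limit stated in the text


def edit_distance_from_sam(line: str) -> Optional[int]:
    """Return the value of the NM:i: tag of one SAM record, or None if absent."""
    fields = line.rstrip("\n").split("\t")
    # The first 11 columns of a SAM record are mandatory; optional tags follow.
    for tag in fields[11:]:
        if tag.startswith("NM:i:"):
            return int(tag[5:])
    return None


def passes_mismatch_limit(line: str) -> bool:
    """True if the record carries an NM tag no larger than MAX_MISMATCH."""
    nm = edit_distance_from_sam(line)
    return nm is not None and nm <= MAX_MISMATCH
```

Records without an `NM` tag are rejected here because their mismatch count cannot be verified; that choice is an assumption of the sketch.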
In a second aspect, the present invention also provides a method for determining whether a DNA sample to be tested is ancient DNA. According to an embodiment of the invention, the method comprises the following steps: obtaining the DNA information of the DNA sample to be tested according to the method described above; performing variant detection based on the DNA information of the sample to determine its variant information; and determining, based on the variant information, whether the DNA sample to be tested is ancient DNA, where the presence of at least one of the following is an indication that the sample is ancient DNA: (1) the sequencing reads show the following deamination feature: relative to the reference genome, more than 10% C->T and G->A mutations occur at the 5' and 3' ends of the sequencing reads; (2) the sequencing reads show the following fragmentation feature: relative to the reference genome, the proportion of purines at the base immediately preceding the 5' end of the sequencing reads is significantly increased, and the proportion of pyrimidines at the base immediately following the 3' end is significantly increased. This method authenticates ancient DNA effectively, and the results are accurate, reliable, and reproducible.
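The deamination criterion in (1) can be illustrated with a small Python sketch that measures how often a reference base is read as another base at a given read position. It assumes ungapped alignments supplied as `(read, reference_segment)` string pairs of equal length, all on the forward strand; the function name `substitution_rate` and this input format are assumptions made for the example, not part of the described pipeline.

```python
from typing import Sequence, Tuple

DEAMINATION_THRESHOLD = 0.10  # "greater than 10%" in the text


def substitution_rate(pairs: Sequence[Tuple[str, str]],
                      ref_base: str, read_base: str, pos: int) -> float:
    """Fraction of reads showing read_base at position pos, among reads whose
    reference base at that position is ref_base.

    pos counts from the 5' end (0, 1, ...) or from the 3' end (-1, -2, ...).
    """
    total = hits = 0
    for read, ref in pairs:
        if len(read) != len(ref) or not -len(ref) <= pos < len(ref):
            continue  # skip pairs this simple model cannot handle
        if ref[pos] == ref_base:
            total += 1
            if read[pos] == read_base:
                hits += 1
    return hits / total if total else 0.0
```

For example, `substitution_rate(pairs, "C", "T", 0)` gives the C->T rate at the first base of the 5' end and `substitution_rate(pairs, "G", "A", -1)` gives the G->A rate at the last base of the 3' end; either value can then be compared with `DEAMINATION_THRESHOLD`.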
According to an embodiment of the invention, the variant detection is performed with both GATK and SoapSnp. The detection result is thus accurate and reliable.
According to some embodiments of the invention, when SoapSnp is used for variant detection, results are output in cns format. This facilitates subsequent analysis.
According to an embodiment of the invention, the variant information of the DNA sample to be tested includes single nucleotide variant information.
In a third aspect, the present invention also provides a method for determining the sex of the individual to which an ancient DNA sample belongs. According to an embodiment of the invention, the method comprises the following steps: obtaining the DNA information of the ancient DNA sample to be tested according to the method described above; determining, based on the DNA information of the sample, at least one of the following sex determination parameters: the ratio of the number of sequencing reads aligned to the X chromosome to the number aligned to the Y chromosome, the ratio of the number of sequencing reads aligned to the X chromosome to the number aligned to chromosome 8, the sequencing depth of each chromosome, and the heterozygote ratio of each chromosome; and determining, based on at least one of the sex determination parameters, the sex of the individual to which the ancient DNA sample belongs, where: (1) a ratio close to 9:1 between the number of sequencing reads aligned to the X chromosome and the number aligned to the Y chromosome indicates that the individual is male; a ratio close to 1:1 between the number of sequencing reads aligned to the X chromosome and the number aligned to chromosome 8 indicates that the individual is female; (2) a Y-chromosome sequencing depth close to that of the other chromosomes indicates that the individual is male; a Y-chromosome sequencing depth significantly lower than that of the other chromosomes indicates that the individual is female; (3) an X-chromosome heterozygote ratio significantly lower than that of the other chromosomes indicates that the individual is male; an X-chromosome heterozygote ratio not significantly lower than that of the other chromosomes indicates that the individual is female. This method effectively identifies the sex of the individual to which an ancient DNA sample belongs, and the results are accurate, reliable, and reproducible.
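Criterion (1) above can be sketched as a simple comparison of read counts. The text only says "close to" 9:1 and 1:1, so the relative tolerance `tol` below is an assumption chosen for illustration, as is the function name `infer_sex_from_counts`.

```python
def infer_sex_from_counts(x_reads: int, y_reads: int, chr8_reads: int,
                          tol: float = 0.25) -> str:
    """Classify a sample from read counts on chrX, chrY and chr8.

    tol is a hypothetical relative tolerance for "close to"; the source
    text does not give a numeric cutoff.
    """
    def close(value: float, target: float) -> bool:
        return abs(value - target) <= tol * target

    # X:Y read ratio close to 9:1 indicates a male individual.
    if y_reads > 0 and close(x_reads / y_reads, 9.0):
        return "male"
    # X:chr8 read ratio close to 1:1 indicates a female individual.
    if chr8_reads > 0 and close(x_reads / chr8_reads, 1.0):
        return "female"
    return "undetermined"
```

In practice this count-based call would be cross-checked against criteria (2) and (3), the per-chromosome sequencing depth and heterozygote ratio.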
In a fourth aspect, the present invention also provides a method for determining the male modern DNA contamination rate in a female ancient DNA sample. According to an embodiment of the invention, the method comprises the following steps:
obtaining the DNA information of the ancient DNA sample to be tested according to the method described above for obtaining the DNA information of a DNA sample to be tested;
assuming that the female ancient DNA sample derives from a male, determining, based on the DNA information of the ancient DNA sample to be tested, the expected proportion of sequencing reads aligned to the Y-chromosome specific region, where this expected proportion is calculated as:
R = (number of sequencing reads aligned to the Y-chromosome specific region / number of sequencing reads aligned to the genome) × 0.5; and
determining, based on the expected proportion of sequencing reads aligned to the Y-chromosome specific region, the Y-chromosome contamination rate of the ancient DNA sample to be tested, which is the male modern DNA contamination rate in the female ancient DNA sample,
where the Y-chromosome contamination rate of the ancient DNA sample to be tested is calculated as:
C = (y/R) × (1/n),
where C is the Y-chromosome contamination rate, y is the number of sequencing reads aligned to the Y-chromosome specific region, R is the expected proportion of sequencing reads aligned to the Y-chromosome specific region, and n is the total number of sequencing reads aligned to the genome.
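The two formulas above translate directly into code. The sketch below is a minimal transcription with hypothetical function names; it takes the read counts behind R (the counts under the male assumption) as inputs, since the text at this point does not spell out how those counts are produced.

```python
def expected_ratio(yur_reads: int, genome_reads: int) -> float:
    """R = (reads aligned to the Y-specific region / reads aligned to the genome) * 0.5.

    The counts passed in are those obtained under the assumption that the
    sample derives from a male; how they are obtained is outside this sketch.
    """
    return yur_reads / genome_reads * 0.5


def y_contamination_rate(y: int, n: int, r: float) -> float:
    """C = (y / R) * (1 / n).

    y: reads of the sample aligned to the Y-specific region
    n: total reads of the sample aligned to the genome
    r: expected proportion R
    """
    return (y / r) * (1.0 / n)
```

With illustrative numbers chosen only to exercise the arithmetic, R = 0.001, y = 50 and n = 1,000,000 give C = (50/0.001) × (1/1,000,000) = 0.05, i.e. a 5% male contamination rate.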
According to an embodiment of the invention, this method can effectively determine the male modern DNA contamination rate in a female ancient DNA sample. The method is reproducible, and its results are accurate and reliable.
According to an embodiment of the invention, the Y-chromosome specific region is obtained as follows: the Y-chromosome sequence of the human reference genome is split into a set of artificial reads of about 30 bp; the set of artificial reads is aligned to the part of the human reference genome that does not include the Y chromosome, to obtain the artificial reads that align; among all artificial reads that align, only those with 3 or more mismatched bases are retained, and artificial reads that contain repetitive sequence regions are then removed; the remaining artificial reads constitute the Y-chromosome specific region.
According to some specific examples of the invention, the human reference genome is Hg19.
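The construction of the Y-chromosome specific region can be sketched as two steps: tiling the Y sequence into artificial reads, then selecting tiles from alignment and repeat annotations. The sketch assumes non-overlapping 30 bp tiles (the text only says "about 30 bp") and assumes that the alignment step has already produced, for each aligned tile, its mismatch count against the non-Y genome; the function names and data structures are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def tile_sequence(seq: str, size: int = 30) -> List[Tuple[int, str]]:
    """Split seq into consecutive non-overlapping artificial reads.

    Returns (start, read) pairs; a trailing piece shorter than size is dropped.
    """
    return [(i, seq[i:i + size]) for i in range(0, len(seq) - size + 1, size)]


def select_y_specific(tiles: List[Tuple[int, str]],
                      mismatches: Dict[int, int],
                      repeat_starts: Set[int],
                      min_mismatch: int = 3) -> List[int]:
    """Return start positions of tiles kept as Y-specific.

    mismatches maps the start of each tile that aligned to the non-Y genome
    to its mismatch count; repeat_starts holds starts of tiles that overlap
    repetitive sequence.
    """
    kept = []
    for start, _read in tiles:
        if start not in mismatches:
            continue  # the text only considers artificial reads that aligned
        if mismatches[start] >= min_mismatch and start not in repeat_starts:
            kept.append(start)
    return kept
```

The mismatch counts and repeat annotations would come from an aligner and a repeat annotation track respectively; neither tool is named in this part of the text.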
It should further be noted that, according to embodiments of the invention, the method of the invention has at least one of the following advantages:
1. The method of the invention for determining the sex of the individual to which an ancient DNA sample belongs provides sex determination for ancient biological samples, and can be extended to humans and all other animal species with sex chromosomes. This sex determination differs from that used for modern organisms; it is a dedicated identification method based on the characteristics of ancient DNA discovered and summarized by the inventors.
2. The method of the invention for determining the male modern DNA contamination rate in a female ancient DNA sample can detect the modern DNA contamination rate of ancient DNA sequencing data. This is significant for ancient DNA analysis, because subsequent analysis of ancient DNA is only possible once the modern DNA contamination rate has been accurately assessed.
Additional aspects and advantages of the invention will be set forth in part in the following description, and in part will become apparent from the description or be learned through practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the invention will become apparent and readily understood from the description of the embodiments in conjunction with the following figures, in which:
Fig. 1 is a flow chart of ancient human DNA information analysis according to an embodiment of the invention;
Fig. 2 shows the DNA damage pattern analysis of the ancient human bone sample in Example 1,
where
panel a shows the DNA fragmentation analysis result: the bases inside the grey box are bases of the ancient DNA fragment, and the bases outside the grey box are the sequence preceding the first base at the 5' end and the sequence following the last base at the 3' end of the ancient DNA fragment,
panels b and c show the deamination analysis results: the abscissa of both panels is the base position on the DNA fragment in the 5'-3' direction, where 0-25 in panel b denotes the first 25 bases at the 5' end and 25-0 in panel c denotes the last 25 bases at the 3' end of the DNA fragment; the ordinate is the percentage;
Fig. 3 shows, for each chromosome in Example 1, the heterozygous sites as a percentage of the total number of bases;
Fig. 4 compares the numbers of reads (i.e. sequencing reads) aligned to chromosome 8 and to the sex chromosomes in Example 1;
Fig. 5 shows the sequencing depth distribution of the hair sample in Example 1; the vertical axis is depth and the horizontal axis is chromosome.
Detailed description of the embodiments
The solution of the invention is explained below with reference to examples. Those skilled in the art will understand that the following examples only illustrate the invention and should not be taken as limiting its scope. Where specific techniques or conditions are not indicated in the examples, the techniques or conditions described in the literature in the art, or those given in the product instructions, are followed. Reagents or instruments for which no manufacturer is indicated are conventional products that are commercially available.
General method:
According to an embodiment of the invention, and with reference to Fig. 1, the standardized information analysis of an ancient DNA sample to be tested by the method of the invention generally comprises the following steps:
1. Obtaining the DNA information of the DNA sample to be tested
The specific steps are as follows:
constructing a library from the DNA sample to be tested and sequencing it to obtain sequencing data, where no DNA fragmentation step is performed during library construction and the sequencing read length does not exceed 100 bp;
filtering the sequencing data to obtain filtered sequencing data; and
aligning the filtered sequencing data with both SoapAligner and BWA to obtain an alignment result, where the alignment result contains the DNA information of the DNA sample to be tested,
where
the filtering includes at least one of the following:
(1) removing adapter sequences;
(2) removing low-quality bases with quality value Q≤10, where the whole read is deleted when the low-quality bases account for 50% or more of the total bases of the read, and only the low-quality bases are cut off when they lie at the end of the read and account for no more than 50% of the read;
(3) filtering N regions, where the read is removed when its N content exceeds 10%, and only the N regions at the two ends are cut off when N regions exist only at the two ends of the read;
(4) removing reads shorter than 30 bp or longer than 99 bp,
the alignment allows at most 4 mismatched bases,
alignment with SoapAligner generates alignment results in Soap format, and alignment with BWA generates alignment results in SAM format.
2. Determining whether the DNA sample to be tested is ancient DNA
The specific steps are as follows:
performing variant detection with both GATK and SoapSnp based on the DNA information of the DNA sample to be tested, to determine the variant information of the sample; and
determining, based on the variant information of the DNA sample to be tested, whether the sample is ancient DNA,
where the presence of at least one of the following is an indication that the DNA sample to be tested is ancient DNA:
(1) the sequencing reads show the following deamination feature: relative to the reference genome, more than 10% C->T and G->A mutations occur at the 5' and 3' ends of the sequencing reads;
(2) the sequencing reads show the following fragmentation feature: relative to the reference genome, the proportion of purines at the base immediately preceding the 5' end of the sequencing reads is significantly increased, and the proportion of pyrimidines at the base immediately following the 3' end is significantly increased,
when SoapSnp is used for the variant detection, results are output in cns format,
the variant information of the DNA sample to be tested includes single nucleotide variant information.
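The fragmentation feature in (2) can be illustrated by counting purines at the reference position immediately upstream of each read's 5' end. The sketch assumes forward-strand alignments given as 0-based start positions on a single reference string; the function name and input format are assumptions, and the baseline against which "significantly increased" is judged (for example the reference-wide purine fraction) is not specified in the text.

```python
from typing import Iterable

PURINES = frozenset("AG")


def purine_fraction_before_5prime(reference: str, starts: Iterable[int]) -> float:
    """Fraction of reads whose reference base just before the 5' end is A or G.

    starts are 0-based alignment start positions of forward-strand reads.
    Reads starting at position 0 have no preceding base and are skipped,
    as are positions where the reference base is not A, C, G or T.
    """
    bases = [reference[s - 1].upper() for s in starts if s >= 1]
    bases = [b for b in bases if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in PURINES for b in bases) / len(bases)
```

A mirrored function for the 3' end would look at the reference base just after each read's last aligned position and count pyrimidines (C or T) instead.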
3. Determining the sex of the individual to which the ancient DNA sample belongs
The specific steps are as follows:
determining, based on the DNA information of the DNA sample to be tested, at least one of the following sex determination parameters: the ratio of the number of sequencing reads aligned to the X chromosome to the number aligned to the Y chromosome, the ratio of the number of sequencing reads aligned to the X chromosome to the number aligned to chromosome 8, the sequencing depth of each chromosome, and the heterozygote ratio of each chromosome; and
determining, based on at least one of the sex determination parameters, the sex of the individual to which the ancient DNA sample to be tested belongs, where:
(1) a ratio close to 9:1 between the number of sequencing reads aligned to the X chromosome and the number aligned to the Y chromosome indicates that the individual is male; a ratio close to 1:1 between the number of sequencing reads aligned to the X chromosome and the number aligned to chromosome 8 indicates that the individual is female;
(2) a Y-chromosome sequencing depth close to that of the other chromosomes indicates that the individual is male; a Y-chromosome sequencing depth significantly lower than that of the other chromosomes indicates that the individual is female;
(3) an X-chromosome heterozygote ratio significantly lower than that of the other chromosomes indicates that the individual is male; an X-chromosome heterozygote ratio not significantly lower than that of the other chromosomes indicates that the individual is female.
4. Determining the male modern DNA contamination rate in a female ancient DNA sample
The specific steps are as follows:
assuming that the female ancient DNA sample derives from a male, determining, based on the DNA information of the ancient DNA sample to be tested, the expected proportion of sequencing reads aligned to the Y-chromosome specific region, where this expected proportion is calculated as:
R = (number of sequencing reads aligned to the Y-chromosome specific region / number of sequencing reads aligned to the genome) × 0.5; and
determining, based on the expected proportion of sequencing reads aligned to the Y-chromosome specific region, the Y-chromosome contamination rate of the ancient DNA sample to be tested, which is the male modern DNA contamination rate in the female ancient DNA sample,
where the Y-chromosome contamination rate of the ancient DNA sample to be tested is calculated as:
C = (y/R) × (1/n),
where C is the Y-chromosome contamination rate, y is the number of sequencing reads aligned to the Y-chromosome specific region, R is the expected proportion of sequencing reads aligned to the Y-chromosome specific region, and n is the total number of sequencing reads aligned to the genome.
The Y-chromosome specific region is obtained as follows: the Y-chromosome sequence of the human reference genome is split into a set of artificial reads of about 30 bp; the set of artificial reads is aligned to the part of the human reference genome that does not include the Y chromosome, to obtain the artificial reads that align; among all artificial reads that align, only those with 3 or more mismatched bases are retained, and artificial reads that contain repetitive sequence regions are then removed; the remaining artificial reads constitute the Y-chromosome specific region. The human reference genome is Hg19.
Example 1
Standardized information analysis was performed on the ancient DNA samples to be tested according to the method of the invention set out under "General method" above, as follows:
There were 2 ancient DNA samples to be tested: 1 ancient human bone sample and 1 ancient human hair sample. Both ancient biological samples were provided by the Institute of Vertebrate Paleontology and Paleoanthropology, Chinese Academy of Sciences, and date to about 3000-8000 years ago; 1 is an ancient human bone sample (Human_Bone) and 1 is an ancient human hair sample (Human_Hair) (see Table 1).
Detailed process is as follows:
One, acquisition of Illumina second-generation sequencing data
The present invention is based on Illumina second-generation sequencing data. The DNA extraction and library construction procedures for the 2 ancient samples are detailed in:
[1] N. Rohland, M. Hofreiter. Ancient DNA extraction from bones and teeth [J]. Nature Protocols, 2007, 2(7): 1756-1762. doi:10.1038/nprot.2007.247;
[2] M. T. Gansauge, M. Meyer. Single-stranded DNA library preparation for the sequencing of ancient or damaged DNA [J]. Nature Protocols, 2013, 8(3): 737-748. doi:10.1038/nprot.2013.038
which are incorporated herein by reference.
The ancient human samples were sequenced on an Illumina HiSeq 2000 using a PE50 (paired-end, 50 bp) strategy; the volume of raw Fastq data for each sample is given in Table 1. During library construction, the uracils produced by deamination were removed for the hair sample, whereas the bone sample received no uracil treatment. The final sequencing data volume of the bone and hair samples was 15 Gb.
Two, quality control of the raw Fastq data
In strict accordance with the data filtering method described in the technical solution, the raw Fastq data of the 2 ancient human bone and hair samples were stringently filtered. The specific criteria were as follows:
1) if a read contains adapter sequence, the adapter portion is cut off;
2) if bases with quality value Q≤10 account for 50% or more of the total bases of a read, the whole read is deleted; if the low-quality bases are at the end of the read and number no more than 50% of the whole read, only the low-quality portion is trimmed;
3) reads with an N content greater than 10% are removed; if N regions are present only at the two ends of a read, only the N regions at the ends are trimmed and the remaining bases are kept;
4) reads shorter than 30 bp and reads longer than 49 bp are removed.
The data volume after filtering is given in Table 2.
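The four filtering criteria can be sketched for a single read as follows. This is a minimal illustration, not the software actually used: the function name is hypothetical, the adapter is assumed to sit at the 3' end, "the end of the read" is taken to mean a trailing run of low-quality bases, and terminal Ns are trimmed before the 10% N test is applied.

```python
LOW_Q = 10  # bases with quality value Q <= 10 count as low quality


def filter_read(seq, quals, adapter=None, min_len=30, max_len=49):
    """Return the cleaned (seq, quals) pair, or None if the read is discarded."""
    # 1) cut off the adapter and everything after it (assumes a 3' adapter)
    if adapter:
        pos = seq.find(adapter)
        if pos != -1:
            seq, quals = seq[:pos], quals[:pos]
    if not seq:
        return None
    # 2) discard reads with >= 50% low-quality bases, else trim the 3' run
    low = sum(1 for q in quals if q <= LOW_Q)
    if low * 2 >= len(quals):
        return None
    end = len(seq)
    while end > 0 and quals[end - 1] <= LOW_Q:
        end -= 1
    seq, quals = seq[:end], quals[:end]
    # 3) trim Ns at both ends, then discard reads with > 10% N remaining
    start, end = 0, len(seq)
    while start < end and seq[start] == "N":
        start += 1
    while end > start and seq[end - 1] == "N":
        end -= 1
    seq, quals = seq[start:end], quals[start:end]
    if not seq or seq.count("N") > 0.10 * len(seq):
        return None
    # 4) keep only reads of 30-49 bp
    if not (min_len <= len(seq) <= max_len):
        return None
    return seq, quals


read = "ACGT" * 10
print(filter_read(read + "NN", [30] * 37 + [5] * 5))
```

The length bounds are parameters because the claims use 30 bp and 99 bp, while this embodiment uses 30 bp and 49 bp.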
Three, alignment analysis
After quality control of the raw data, the DNA data of the ancient human bone sample and the hair sample were aligned with SoapAligner and BWA, respectively; the reference genome version used was human Hg19.
The SoapAligner command parameters were as follows:
Hair sample:
soap -D hg19.fa.index -a Human_Hair.fq1.gz -b Human_Hair.fq2.gz -o Human_Hair.soap -2 Human_Hair.single -u Human_Hair.unmapped -n 5 -r 1 -l 30 -s 30 -v 2 -p 4 -m 0 -x 80
Bone sample:
soap -D hg19.fa.index -a Human_Bone.fq1.gz -b Human_Bone.fq2.gz -o Human_Bone.soap -2 Human_Bone.single -u Human_Bone.unmapped -n 5 -r 1 -l 30 -s 30 -v 4 -p 4 -m 0 -x 80
The BWA command parameters were as follows:
Hair sample:
bwa aln hg19.fa Human_Hair.fq1.gz -l 30 -k 2 -t 4 -q 15 -I > Human_Hair.fq1.sai
bwa aln hg19.fa Human_Hair.fq2.gz -l 30 -k 2 -t 4 -q 15 -I > Human_Hair.fq2.sai
bwa sampe -a 80 hg19.fa Human_Hair.fq1.sai Human_Hair.fq2.sai Human_Hair.fq1.gz Human_Hair.fq2.gz > Human_Hair.sam
Bone sample:
bwa aln hg19.fa Human_Bone.fq1.gz -l 30 -k 4 -t 4 -q 15 -I > Human_Bone.fq1.sai
bwa aln hg19.fa Human_Bone.fq2.gz -l 30 -k 4 -t 4 -q 15 -I > Human_Bone.fq2.sai
bwa sampe -a 80 hg19.fa Human_Bone.fq1.sai Human_Bone.fq2.sai Human_Bone.fq1.gz Human_Bone.fq2.gz > Human_Bone.sam
After alignment, the uniquely aligned reads were extracted from the alignment results, and low-quality alignments and unpaired alignments were filtered out; the remaining reads were used in the next analysis step. The results are detailed in Table 2 and Table 3. Only a very small fraction of the sequencing data from the ancient Chinese human bone sample aligned to the human genome (0.1%~0.2%), which is not enough to support downstream analyses such as variant detection. The information analysis of the bone sample was therefore limited to filtering, alignment and DNA damage analysis. The post-filtering alignment rate of the ancient Chinese human hair sample reached 10%, which is enough data to support subsequent analyses such as SNP calling. The inventors therefore carried out a more comprehensive information analysis of the sequencing results of this sample, including filtering, alignment, DNA damage analysis, depth and coverage analysis, SNP calling, sex determination and modern human contamination rate analysis.
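The post-alignment filter for the BWA output can be sketched as a test on one SAM record. This is a hedged illustration rather than the filter actually used: the function name and the MAPQ cut-off of 20 are assumptions, "unpaired" is read here as "not flagged as a proper pair", and unique alignment is detected through the `XT:A:U` tag that `bwa aln`/`sampe` writes.

```python
def keep_alignment(sam_line, min_mapq=20):
    """Return True if a SAM record is mapped, properly paired, of sufficient
    mapping quality and uniquely aligned (XT:A:U)."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    mapq = int(fields[4])
    if flag & 0x4:        # read is unmapped
        return False
    if not flag & 0x2:    # not aligned as a proper pair
        return False
    if mapq < min_mapq:   # low-quality alignment (threshold is an assumption)
        return False
    # optional tags start at the 12th column; XT:A:U marks a unique hit
    return "XT:A:U" in fields[11:]


record = "\t".join(["r1", "99", "chrX", "100", "37", "50M", "=", "150", "100",
                    "A" * 50, "I" * 50, "XT:A:U", "NM:i:1"])
print(keep_alignment(record))
```

Header lines (starting with `@`) should be skipped before calling the function.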
Four, variant detection
Variant detection was performed on the ancient human hair sample using both GATK and SoapSnp.
GATK:
Variant detection with GATK was carried out entirely according to the GATK workflow; see https://www.broadinstitute.org/gatk/ for details. The steps were: first, the sam file produced by the bwa alignment was reordered in karyotypic order; the sam-format alignment file was then converted to bam format; the entries in the bam file were sorted by physical position in ascending order; duplicate reads aligned to the same chromosomal position were marked; reads aligned to indel regions were realigned; base quality values were recalibrated, finally producing Human_Hair.bam and Human_Hair.metrics; lastly, variant detection was performed with UnifiedGenotyper. The specific parameters were as follows:
java -jar GenomeAnalysisTK.jar -glm SNP -l INFO -R hg19.fa -T UnifiedGenotyper -I Human_Hair.bam -D dbsnp_137.hg19.vcf -o Human_Hair.vcf -metrics Human_Hair.metrics -stand_call_conf 10 -stand_emit_conf 30
SoapSnp:
Variant detection with SoapSnp likewise first reordered the SoapAligner alignment results in karyotypic order, and then sorted them by physical position in ascending order within each chromosome. The specific parameters were as follows:
soapsnp -i Human_Hair.soap.gz -d hg19.fa -o Human_Hair.cns -r 0.0001 -t -u -L 49 -m -M Human_Hair.mat
Five, ancient DNA authentication
During construction of the single-stranded DNA library for the ancient human hair sample, the inventors used a specific enzyme, UDG, to remove uracil, so that C->T substitutions would not make subsequent analyses inaccurate. As a result, no clear DNA damage pattern can be seen in the results from this single-stranded library. The ancient human bone sample was not treated with UDG during library construction; therefore, the inventors used the ancient human bone sample for ancient DNA authentication.
The inventors used mapDamage to compute statistics on and plot the mismatch pattern and the fragmentation pattern of the sequencing results of the ancient human bone sample. The specific parameters used were as follows:
perl mapDamage-0.3.3.pl map -i Human_Bone.sam -d directory -r hg19.fa -c -t Hair -l 49
perl mapDamage-0.3.3.pl merge -d directory
perl mapDamage-0.3.3.pl plot -d directory
The results are shown in Fig. 2. In terms of fragmentation pattern, the proportion of purines at the 5' end increased significantly, while the proportion of pyrimidines decreased correspondingly; in terms of deamination pattern, a large number of C->T substitutions accumulated at the 5' end, and a correspondingly large number of G->A substitutions accumulated at the 3' end. Both the fragmentation pattern and the deamination signature are thus fully consistent with the characteristics of ancient DNA, so the inventors can conclude that the sequencing data obtained are ancient DNA.
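The deamination check can be illustrated with a small calculation of terminal substitution frequencies. This is a simplified sketch and not a substitute for mapDamage: it assumes ungapped read/reference pairs of equal length on the same strand, looks only at the first and last base, and borrows the 10% threshold from the claims; the function names are hypothetical.

```python
def terminal_damage(pairs):
    """Return (C->T frequency at the 5' end, G->A frequency at the 3' end).

    pairs: iterable of (read_seq, ref_seq) strings of equal length,
    ungapped and on the same strand (a simplifying assumption).
    """
    c_total = c_to_t = g_total = g_to_a = 0
    for read, ref in pairs:
        if ref[0] == "C":          # reference C at the 5' end of the read
            c_total += 1
            c_to_t += read[0] == "T"
        if ref[-1] == "G":         # reference G at the 3' end of the read
            g_total += 1
            g_to_a += read[-1] == "A"
    ct = c_to_t / c_total if c_total else 0.0
    ga = g_to_a / g_total if g_total else 0.0
    return ct, ga


def looks_ancient(pairs, threshold=0.10):
    """True when both terminal substitution frequencies exceed the threshold."""
    ct, ga = terminal_damage(pairs)
    return ct > threshold and ga > threshold


# Toy data: 2 of 10 reads carry C->T at the 5' end and G->A at the 3' end.
toy = [("TACA", "CACG")] * 2 + [("CACG", "CACG")] * 8
print(terminal_damage(toy), looks_ancient(toy))
```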
Six, sex determination
The inventors analyzed the sex of the ancient individual to whom the hair sample belonged from 3 aspects:
1: Hypothesis: if the ancient individual to whom the hair sample belonged is male, the proportion of heterozygous sites on the X chromosome will be far lower than on the other chromosomes.
Result: the proportion of heterozygous sites on the X chromosome was not significantly lower than on the other chromosomes (see Fig. 3). Conclusion: female.
2: Hypothesis: the effective length ratio of the X chromosome to the Y chromosome is 9:1, and that of the X chromosome to chromosome 8 is close to 1:1. If the individual is male, the ratio of the number of reads aligned to the X chromosome to the number aligned to the Y chromosome should be close to 9:1; if the individual is female, the ratio between the X chromosome and chromosome 8 should be close to 1:1.
Result: the ratio of the number of reads mapped to the X chromosome to the number mapped to chromosome 8 was close to 1:1, and the ratio of the X chromosome to the Y chromosome was 40:1, far greater than 9:1 (see Fig. 4). Conclusion: female.
3: Hypothesis: if the individual is male, the sequencing depth of the Y chromosome should be close to that of the other chromosomes.
Result: the sequencing depth of the Y chromosome was significantly lower than that of every region of the other chromosomes (see Fig. 5). Conclusion: female.
Taking the results of the three aspects together, the ancient individual to whom the hair sample belonged was female.
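The read-count criterion (aspect 2) can be sketched as a simple decision rule. The 9:1 and 1:1 reference ratios come from the text; the function name, the 30% relative tolerance and the example counts are assumptions made only for illustration.

```python
def sex_from_read_counts(x_reads, y_reads, chr8_reads, rel_tol=0.3):
    """Classify sex from reads aligned to chrX, chrY and chr8.

    male:   X:Y ratio close to 9:1
    female: X:Y ratio above 9:1 and X:chr8 ratio close to 1:1
    rel_tol is a hypothetical tolerance for "close to".
    """
    if chr8_reads <= 0:
        raise ValueError("chr8_reads must be positive")
    xy = x_reads / y_reads if y_reads else float("inf")
    x8 = x_reads / chr8_reads
    if abs(xy - 9.0) <= rel_tol * 9.0:
        return "male"
    if xy > 9.0 and abs(x8 - 1.0) <= rel_tol:
        return "female"
    return "undetermined"


# Hypothetical counts giving X:Y = 40:1 and X:chr8 close to 1:1.
print(sex_from_read_counts(x_reads=40_000, y_reads=1_000, chr8_reads=41_000))
```

In practice this rule would be combined with the heterozygosity and sequencing-depth criteria, as the embodiment does, rather than used alone.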
Seven, assessment of the modern human DNA contamination rate
Because the sequenced sample came from only 1 individual, the relationship of this ancient individual to other ancient humans and to modern humans is unknown, and the amount of sequencing data is small, segregating sites specific to the ancient individual could not be identified, and mtDNA and autosomal data could not be used to assess the modern human contamination rate. However, since the inventors determined the individual to be female, the contamination rate from modern males could be assessed. The basic principle is to align the obtained reads to the Y-chromosome unique region (YUR: regions that are not homologous to any other chromosome and are not repetitive sequence), then calculate from the number of YUR-specific reads the value expected if the individual were male; the ratio between the number of reads actually aligned and this expected value is the contamination rate from males.
The modern male contamination rate finally obtained was 1.72%~5.98%, which is high compared with the contamination rates reported in the literature for other ancient human DNA; because the amount of usable data obtained was low, the rate may be under- or overestimated to some extent. In subsequent analyses, reads that may derive from modern humans need to be thoroughly filtered out to ensure the reliability of the results.
Table 1. Sequencing data of the 2 ancient samples
Table 2. Filtering and alignment results for the ancient human hair sample
Table 3. Filtering and alignment results for the ancient human bone sample
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example" or "some examples" means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the invention. In this specification, such terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions and variations can be made to these embodiments without departing from the principle and spirit of the invention; the scope of the invention is defined by the claims and their equivalents.
Claims (10)
1. A method for obtaining DNA information of a DNA sample to be tested, characterized by comprising the following steps:
performing library construction and sequencing on the DNA sample to be tested to obtain sequencing data, wherein the library construction does not include a DNA fragmentation step, and the length of the sequencing reads does not exceed 100 bp;
filtering the sequencing data to obtain filtered sequencing data; and
aligning the filtered sequencing data to obtain an alignment result, the alignment result containing the DNA information of the DNA sample to be tested,
wherein
the filtering comprises at least one of the following:
(1) removing adapter sequences;
(2) removing low-quality bases with quality value Q≤10, wherein when the low-quality bases account for 50% or more of the total bases of a read, the whole read is deleted; when the low-quality bases are at the end of a read and number no more than 50% of the whole read, only the low-quality bases are trimmed;
(3) filtering N regions, wherein when the proportion of N in a read is greater than 10%, the read is removed; when N regions are present only at the two ends of a read, only the N regions at the two ends of the read are trimmed;
(4) removing reads shorter than 30 bp and reads longer than 99 bp,
and the alignment allows at most 4 mismatched bases.
2. The method according to claim 1, characterized in that the alignment is performed using both SoapAligner and BWA.
3. The method according to claim 2, characterized in that when the alignment is performed using SoapAligner, an alignment result in Soap format is generated; when the alignment is performed using BWA, an alignment result in sam format is generated.
4. A method for determining whether a DNA sample to be tested is ancient DNA, characterized by comprising the following steps:
obtaining DNA information of the DNA sample to be tested according to the method of any one of claims 1-3;
performing variant detection based on the DNA information of the DNA sample to be tested, to determine variant information of the DNA sample to be tested; and
determining, based on the variant information of the DNA sample to be tested, whether the DNA sample to be tested is ancient DNA,
wherein the presence of at least one of the following is an indication that the DNA sample to be tested is ancient DNA:
(1) the sequencing reads show the following deamination signature: relative to the reference genome, C->T and G->A substitutions at the 5' end and 3' end of the sequencing reads exceed 10%;
(2) the sequencing reads show the following fragmentation signature: relative to the reference genome, the proportion of purines at the base immediately preceding the 5' end of the sequencing reads is significantly increased, and the proportion of pyrimidines at the base immediately following the 3' end is significantly increased.
5. The method according to claim 4, characterized in that the variant detection is performed using both GATK and SoapSnp.
6. The method according to claim 5, characterized in that when the variant detection is performed using SoapSnp, a result in cns format is output.
7. The method according to claim 4, characterized in that the variant information of the DNA sample to be tested includes single-nucleotide variant information.
8. A method for determining the sex of the individual to whom an ancient DNA sample belongs, characterized by comprising the following steps:
obtaining DNA information of the ancient DNA sample to be tested according to the method of any one of claims 1-3;
determining, based on the DNA information of the DNA sample to be tested, at least one of the following sex determination parameters: the ratio of the number of sequencing reads aligned to the X chromosome to the number of sequencing reads aligned to the Y chromosome, the ratio of the number of sequencing reads aligned to the X chromosome to the number of sequencing reads aligned to chromosome 8, the sequencing depth of each chromosome, and the proportion of heterozygous sites on each chromosome; and
determining, based on at least one of the sex determination parameters, the sex of the individual to whom the ancient DNA sample to be tested belongs, wherein:
(1) a ratio close to 9:1 between the number of sequencing reads aligned to the X chromosome and the number aligned to the Y chromosome is an indication that the individual to whom the ancient DNA sample to be tested belongs is male; a ratio close to 1:1 between the number of sequencing reads aligned to the X chromosome and the number aligned to chromosome 8 is an indication that the individual is female;
(2) a sequencing depth of the Y chromosome close to that of the other chromosomes is an indication that the individual to whom the ancient DNA sample to be tested belongs is male; a sequencing depth of the Y chromosome significantly lower than that of the other chromosomes is an indication that the individual is female;
(3) a proportion of heterozygous sites on the X chromosome significantly lower than that on the other chromosomes is an indication that the individual to whom the ancient DNA sample to be tested belongs is male; a proportion of heterozygous sites on the X chromosome that is not significantly lower than that on the other chromosomes is an indication that the individual is female.
9. A method for determining the modern male DNA contamination rate in a female ancient DNA sample, characterized by comprising the following steps:
obtaining DNA information of the ancient DNA sample to be tested according to the method of any one of claims 1-3;
assuming that the female ancient DNA sample derives from a male, and determining, based on the DNA information of the ancient DNA sample to be tested, the expected proportion of sequencing reads aligned to the Y-chromosome-specific region, wherein this expected proportion is calculated as:
R = (number of sequencing reads aligned to the Y-chromosome-specific region / number of sequencing reads aligned to the genome) × 0.5;
and
determining, based on the expected proportion of sequencing reads aligned to the Y-chromosome-specific region, the Y-chromosome contamination rate of the ancient DNA sample to be tested, the Y-chromosome contamination rate being the modern male DNA contamination rate in the female ancient DNA sample,
wherein the Y-chromosome contamination rate of the ancient DNA sample to be tested is calculated as:
C = (y/R) × (1/n),
where C is the Y-chromosome contamination rate, y is the number of sequencing reads aligned to the Y-chromosome-specific region, R is the expected proportion of sequencing reads aligned to the Y-chromosome-specific region, and n is the total number of sequencing reads aligned to the genome.
10. The method according to claim 9, characterized in that the Y-chromosome-specific region is obtained as follows:
splitting the Y-chromosome sequence of the human reference genome into a set of artificial reads about 30 bp long;
aligning the artificial read set against the part of the human reference genome that excludes the Y chromosome, to obtain the artificial reads that align;
of all artificial reads that align, retaining only those with alignment mismatches of 3 or more bases, then removing the artificial reads that contain repetitive-sequence regions, all remaining artificial reads forming the Y-chromosome-specific region,
optionally, the human reference genome is Hg19.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710667605.XA CN109402241A (en) | 2017-08-07 | 2017-08-07 | Identification and the method for analyzing ancient DNA sample |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710667605.XA CN109402241A (en) | 2017-08-07 | 2017-08-07 | Identification and the method for analyzing ancient DNA sample |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109402241A true CN109402241A (en) | 2019-03-01 |
Family
ID=65453879
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710667605.XA Pending CN109402241A (en) | 2017-08-07 | 2017-08-07 | Identification and the method for analyzing ancient DNA sample |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109402241A (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2009099602A1 (en) * | 2008-02-04 | 2009-08-13 | Massachusetts Institute Of Technology | Selection of nucleic acids by solution hybridization to oligonucleotide baits |
WO2013127049A1 (en) * | 2012-02-27 | 2013-09-06 | 深圳华大基因科技有限公司 | Method and device for detecting microdeletion in chromosome sts area |
CN105358714A (en) * | 2013-05-04 | 2016-02-24 | 斯坦福大学托管董事会 | Enrichment of DNA sequencing libraries from samples containing small amounts of target DNA |
CN106661575A (en) * | 2014-10-14 | 2017-05-10 | 深圳华大基因科技有限公司 | Linker element and method of using same to construct sequencing library |
WO2016103473A1 (en) * | 2014-12-26 | 2016-06-30 | 株式会社日立ハイテクノロジーズ | Substrate for use in analysis of nucleic acid, flow cell for use in analysis of nucleic acid, and nucleic acid analysis device |
Non-Patent Citations (5)
Title |
---|
LUDOVIC ORLANDO et al.: "True single-molecule DNA sequencing of a pleistocene horse bone", GENOME RESEARCH * |
MICHAEL KNAPP et al.: "Next Generation Sequencing of Ancient DNA: Requirements, Strategies and Perspectives", GENES * |
MORTEN RASMUSSEN et al.: "Ancient human genome sequence of an extinct Palaeo-Eskimo", NATURE * |
TERENCE A. BROWN et al.: "The current and future applications of ancient DNA in Quaternary science", JOURNAL OF QUATERNARY SCIENCE * |
Gao Shan et al.: "R语言与Bioconductor生物信息学应用" (R Language and Bioconductor Applications in Bioinformatics), 30 January 2014, 天津科技翻译出版有限公司 (Tianjin Science and Technology Translation Publishing Co., Ltd.) * |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110273005A (en) * | 2019-05-25 | 2019-09-24 | 深圳市早知道科技有限公司 | A method of the similitude compared with ancients based on SNP parting |
CN110310699A (en) * | 2019-07-01 | 2019-10-08 | 江苏里下河地区农业科学研究所 | The analysis tool and application of target gene sequence are excavated based on whole genome sequence |
CN111370057A (en) * | 2019-07-31 | 2020-07-03 | 深圳思勤医疗科技有限公司 | Method for determining chromosome structure variation signal intensity and insert length distribution characteristics of sample and application |
CN111370057B (en) * | 2019-07-31 | 2021-03-30 | 深圳思勤医疗科技有限公司 | Method for determining chromosome structure variation signal intensity and insert length distribution characteristics of sample and application |
CN111370065A (en) * | 2020-03-26 | 2020-07-03 | 北京吉因加医学检验实验室有限公司 | Method and device for detecting cross-sample contamination rate of RNA |
CN113793641A (en) * | 2021-09-29 | 2021-12-14 | 苏州赛美科基因科技有限公司 | Method for rapidly judging sample gender from FASTQ file |
CN113793641B (en) * | 2021-09-29 | 2023-11-28 | 苏州赛美科基因科技有限公司 | Method for rapidly judging sample gender from FASTQ file |
CN115161403A (en) * | 2022-05-23 | 2022-10-11 | 哈尔滨工业大学(威海) | Method for judging species affiliation of ancient DNA sample |
Legal Events
Date | Code | Title | Description
---|---|---|---
 | PB01 | Publication |
 | SE01 | Entry into force of request for substantive examination |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20190301