CN104805190B - A method for determining the specificity, uniformity and stability of hybrid maize varieties

Publication number
CN104805190B
CN104805190B CN201510150506.5A CN201510150506A CN104805190B CN 104805190 B CN104805190 B CN 104805190B CN 201510150506 A CN201510150506 A CN 201510150506A CN 104805190 B CN104805190 B CN 104805190B
Authority
CN
China
Legal status
Active
Application number
CN201510150506.5A
Other languages
Chinese (zh)
Other versions
CN104805190A (en)
Inventor
张静
彭海
陈红
章伟雄
Current Assignee
Agriculture Ministry Technology Development Center
Jianghan University
Original Assignee
Agriculture Ministry Technology Development Center
Jianghan University

Abstract

The invention discloses a method for determining the specificity, uniformity and stability of a hybrid maize variety. The method comprises: obtaining variant sites; determining the test regions of the corn variety under test; building a database; determining the sampling amount, then sampling at random, pooling the samples and extracting the DNA of the pooled sample; preparing primers; amplifying the DNA of the pooled sample with the primers and using the amplification products to build a high-throughput sequencing library; sequencing the library at high throughput to obtain a set of sequencing fragments; analysing the sequencing fragments to obtain the genotypes of the corn variety under test and the hybrid strain genotypes; comparing them against the database to obtain the approximate variety, the variant sites and the variant site rate; identifying the hybrid strain varieties and calculating the hybrid strain rate; and judging the specificity, uniformity and stability of the corn variety under test from the variant sites, the variant site rate and the hybrid strain rate. The method judges the specificity, stability and uniformity of the corn variety under test accurately and completely, and testing is faster.

Description

A method for determining the specificity, uniformity and stability of hybrid maize varieties
Technical field
The present invention relates to the field of biotechnology, and in particular to a method for determining the specificity, uniformity and stability of hybrid maize varieties.
Background
As a specialised form of intellectual property, new plant varieties have become part of the core competitiveness of companies and even of nations. Granting new plant variety rights and resolving the related legal disputes both depend on DUS testing, that is, on a field planting test or an indoor molecular marker identification of the specificity (Distinctness), uniformity (Uniformity) and stability (Stability) of the corn variety under test. The field planting test proceeds as follows: the corn variety under test and its approximate varieties are planted in the field together; over growing seasons spanning two or more years, multiple traits are observed; the significance of the difference between the variety under test and the approximate varieties, i.e. the specificity, is judged from trait expression; and the proportion of hybrid strains (off-type plants) in the population, i.e. the uniformity and stability, is judged at the same time. Indoor molecular marker identification proceeds as follows: DNA is extracted plant by plant from each sample of the variety under test and of the approximate varieties; each test region of each sample is amplified separately by PCR (Polymerase Chain Reaction); each PCR product is analysed by electrophoresis or first-generation sequencing; the proportion of differing sites between the variety under test and the approximate varieties is obtained from the results; and the specificity of the variety under test is judged from that proportion.
The field planting test has a long cycle and a heavy workload, and traits are affected by the environment, which makes the judgment inaccurate. Indoor molecular marker identification requires each test region of each sample to be processed separately, so the workload is heavy; samples and test regions cannot be sampled in bulk and the hybrid strain rate cannot be calculated, so stability and uniformity cannot be tested. Both approaches share a further drawback: because of the heavy workload, approximate varieties cannot be selected objectively from the existing varieties and can only be supplied by the applicant for the variety right. An applicant motivated by commercial interest may supply approximate varieties that are not the true ones, which can lead to a variety right being granted in error.
Summary of the invention
To solve the problems of the prior art, embodiments of the invention provide a method for determining the specificity, uniformity and stability of hybrid maize varieties. The technical solution is as follows:
An embodiment of the invention provides a method for determining the specificity, uniformity and stability of a hybrid maize variety, the method comprising:
obtaining the variant sites between different corn varieties;
determining the test regions of the corn variety under test from the variant sites, the test regions including universal test regions that contain at least some of the variant sites;
building a database containing the genotypes of the different varieties in all test regions;
determining the sampling amount SN of the corn variety under test, then sampling at random, pooling the samples and extracting the DNA of the pooled sample. The sampling amount SN satisfies BINOM.INV(SN, M, 0.95)/SN ≤ 1.15*M, where BINOM.INV is the function of that name in Excel 2010 and M is the threshold selected for judging uniformity and stability. The condition means that even when the hybrid strain rate exceeds the threshold M by only 15%, the sampling amount guarantees, with 95% probability, that the stability and uniformity of the corn variety under test are judged correctly;
preparing primers for amplifying the test regions, the primers including universal test region primers;
amplifying the DNA of the pooled sample with the primers to obtain the amplification products of the test regions, the amplification products being used to build a high-throughput sequencing library;
sequencing the library at high throughput to obtain a set of sequencing fragments. The sequencing depth CF satisfies three conditions: BINOM.DIST(10, 10, BINOM.DIST(8, 20, BINOM.DIST(0, CF, 0.1%, TRUE), TRUE), FALSE) ≥ 99.9%; 1-BINOM.DIST(10000, 10000, 1-BINOM.DIST(8, 20, 1-BINOM.DIST(99.99%*CF, CF, 99.9989%, TRUE), TRUE), FALSE) ≤ 0.1%; and BINOM.DIST(10*(1-M)*CF, 10*CF, 1-110%*M, TRUE) ≥ 95.0%. Here CF is the sequencing depth, M is the threshold selected for judging uniformity and stability, and BINOM.DIST is the function of that name in Excel 2010. The conditions mean the following. First, when the hybrid strain rate is as low as 0.1%, there are 10 hybrid strain varieties, and each hybrid strain variety differs from the corn variety under test at only 20 sites on average, the depth CF gives a probability ≥ 99.9% of detecting all hybrid strain varieties. Second, when the database holds 10000 varieties and each hybrid strain variety differs from the corn variety under test at only 20 sites on average, the depth CF gives a probability ≤ 0.1% of falsely calling a hybrid strain variety. Third, when there are 10 hybrid strain varieties and the true hybrid strain rate exceeds the threshold M by only 10%, the depth CF gives a probability ≥ 95.0% that the conclusion on stability and uniformity is correct;
analysing the sequencing fragments to obtain the genotypes of the corn variety under test and the hybrid strain genotypes;
comparing the genotypes of the corn variety under test with the genotypes of the different varieties in the database to obtain the approximate variety, the variant sites and the variant site rate of the corn variety under test;
comparing the hybrid strain genotypes with the genotypes of the different varieties in the database to identify the hybrid strain varieties, and then calculating the hybrid strain rate;
judging the specificity, uniformity and stability of the corn variety under test from the variant sites, the variant site rate and the hybrid strain rate.
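The sampling-amount condition and the first sequencing-depth condition above can be checked numerically. The following is a minimal Python sketch, not the patent's implementation: `binom_pmf`, `binom_dist`, `binom_inv`, `sample_size_ok` and `detects_all_hybrid_varieties` are hypothetical names, the binomial functions are re-implemented here on the assumption that they mirror Excel's BINOM.DIST and BINOM.INV, and only the first of the three depth conditions is covered.

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), evaluated in log space."""
    if k < 0 or k > n:
        return 0.0
    if p <= 0.0:
        return 1.0 if k == 0 else 0.0
    if p >= 1.0:
        return 1.0 if k == n else 0.0
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)

def binom_dist(k, n, p, cumulative):
    """Stand-in for BINOM.DIST(k, n, p, cumulative); k is truncated to an integer."""
    k = math.floor(k)
    if cumulative:
        return min(1.0, sum(binom_pmf(i, n, p) for i in range(k + 1)))
    return binom_pmf(k, n, p)

def binom_inv(n, p, alpha):
    """Stand-in for BINOM.INV: smallest k with P(X <= k) >= alpha."""
    cdf = 0.0
    for k in range(n + 1):
        cdf += binom_pmf(k, n, p)
        if cdf >= alpha:
            return k
    return n

def sample_size_ok(sn, m):
    """Sampling condition: BINOM.INV(SN, M, 0.95)/SN <= 1.15*M."""
    return binom_inv(sn, m, 0.95) / sn <= 1.15 * m

def detects_all_hybrid_varieties(cf):
    """First depth condition, evaluated from the innermost BINOM.DIST outwards."""
    inner = binom_dist(0, cf, 0.001, True)    # BINOM.DIST(0, CF, 0.1%, TRUE)
    middle = binom_dist(8, 20, inner, True)   # BINOM.DIST(8, 20, inner, TRUE)
    outer = binom_dist(10, 10, middle, False) # BINOM.DIST(10, 10, middle, FALSE)
    return outer >= 0.999

# Example with an assumed threshold M = 5%.
print(sample_size_ok(10000, 0.05), sample_size_ok(100, 0.05))
print(detects_all_hybrid_varieties(10000), detects_all_hybrid_varieties(100))
```

Because BINOM.INV returns an integer, the sampling condition is not monotonic in SN, so a candidate SN is best checked directly rather than taken from the first value that happens to pass.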
Specifically, the test regions further include non-universal test regions, and the primers further include non-universal test region primers.
Further, the non-universal test region primers include a first primer and a second primer. The first primer comprises a first forward primer and a first reverse primer, and the second primer comprises a second forward primer and a second reverse primer. The first primer and the second primer are each used in a separate amplification, giving two amplification products of the non-universal test region; the two products are mixed in equal amounts and used to build the separately amplified high-throughput sequencing library;
the sequence shown in SEQ ID NO: 1 of the sequence listing is attached to the 5' end of the first forward primer, and the sequence shown in SEQ ID NO: 2 is attached to the 5' end of the first reverse primer;
the sequence shown in SEQ ID NO: 2 of the sequence listing is attached to the 5' end of the second forward primer, and the sequence shown in SEQ ID NO: 1 is attached to the 5' end of the second reverse primer.
Further, the method of judging the specificity, uniformity and stability of the corn variety under test from the variant sites, the variant site rate and the hybrid strain rate includes:
When the variant site rate ≥ SD, or a variant site is present in the non-universal test regions, the corn variety under test has specificity; when the variant site rate < SD and no variant site is present in the non-universal test regions, the corn variety under test does not have specificity, where SD is the threshold selected for judging specificity;
When the hybrid strain rate of the corn variety under test is ≤ M, the corn variety under test has uniformity and stability; when the hybrid strain rate is > M, the corn variety under test does not have uniformity and stability, where M is the threshold selected for judging uniformity and stability;
The hybrid strain rate is R = R1 + R2 - R3 - R4 + Rm, where:
R1 is the sum of the hybrid strain rates of the nuclear hybrid strain varieties, calculated from the hybrid strain nuclear genotypes. The hybrid strain rate of one nuclear hybrid strain variety is obtained by discarding the lowest 80% and the highest 10% of the frequencies of its specific hybrid strain nuclear genotypes and taking 2 times the mean of the remaining frequencies. Here n1 is the number of nuclear hybrid strain varieties; t1 is the number of specific hybrid strain nuclear genotypes of the i1-th nuclear hybrid strain variety; i1j1 denotes the j1-th specific hybrid strain nuclear genotype of the i1-th nuclear hybrid strain variety after those genotypes are sorted by frequency from low to high; and R1i1j1 is the frequency of that genotype;
R2 is the hybrid strain rate calculated from the hybrid strain nuclear genotypes other than those possessed by the nuclear hybrid strain varieties: after the lowest 80% and the highest 10% of their frequencies are discarded, R2 is 2 times the mean of the remaining values. Here t2 is the number of hybrid strain nuclear genotypes that are not possessed by the nuclear hybrid strain varieties and have frequency ≥ 0.17%; i2 denotes the i2-th such genotype after sorting by frequency from low to high; and R2i2 is its frequency;
R3 is the sum of the hybrid strain rates of the cytoplasmic hybrid strain varieties, calculated from the hybrid strain cytoplasmic genotypes. The hybrid strain rate of one cytoplasmic hybrid strain variety is obtained by discarding the lowest 80% and the highest 10% of the frequencies of its specific hybrid strain cytoplasmic genotypes and taking the mean of the remaining frequencies. Here n2 is the number of cytoplasmic hybrid strain varieties; R3i3 is the hybrid strain rate of the i3-th cytoplasmic hybrid strain variety; t3 is the number of specific hybrid strain cytoplasmic genotypes of the i3-th cytoplasmic hybrid strain variety; i3j3 denotes the j3-th specific hybrid strain cytoplasmic genotype of that variety after sorting by frequency from low to high; and R3i3j3 is the frequency of that genotype;
R4 is the hybrid strain rate calculated from the hybrid strain cytoplasmic genotypes other than those possessed by the cytoplasmic hybrid strain varieties: after the lowest 80% and the highest 10% of their frequencies are discarded, R4 is the mean of the remaining values. Here t4 is the number of hybrid strain cytoplasmic genotypes that are not possessed by the cytoplasmic hybrid strain varieties and have frequency ≥ 0.17%; i4 denotes the i4-th such genotype after sorting by frequency from low to high; and R4i4 is its frequency;
Rm is the hybrid strain rate caused by selfing of the female parent, and equals the mean, over the hybrid-specific test regions, of the difference between the frequency of the female-parent genotype and the frequency of the male-parent genotype. Here t5 is the number of hybrid-specific test regions; i5 denotes the i5-th hybrid-specific test region; Rmi5 is the frequency of the female-parent genotype in the i5-th hybrid-specific test region; and Rfi5 is the frequency of the male-parent genotype in the i5-th hybrid-specific test region;
Int() is the integer (rounding) function;
A nuclear hybrid strain variety is a hybrid strain variety identified using nuclear genotypes only, and a cytoplasmic hybrid strain variety is one identified using cytoplasmic genotypes only. A specific hybrid strain nuclear genotype is a hybrid strain nuclear genotype that belongs to only one nuclear hybrid strain variety; a specific hybrid strain cytoplasmic genotype is a hybrid strain cytoplasmic genotype that belongs to only one cytoplasmic hybrid strain variety. A hybrid strain nuclear genotype is a hybrid strain genotype that is a nuclear genotype, and a hybrid strain cytoplasmic genotype is a hybrid strain genotype that is a cytoplasmic genotype. In a hybrid-specific test region, the female-parent genotype differs from the male-parent genotype, and both differ from the genotypes of all nuclear hybrid strain varieties. The female-parent genotype is the genotype identical to that of the female parent of the corn variety under test, and the male-parent genotype is the genotype identical to that of the male parent of the corn variety under test;
A nuclear genotype is a genotype located on the nuclear genome; a cytoplasmic genotype is a genotype located on the cytoplasmic genome.
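The hybrid strain rate and the two judgment rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not the patent's formula: the exact index limits of the original equations are not reproduced in this text, so the cut points `int(0.8 * t)` and `int(0.1 * t)` are an assumed reading of "lowest 80% and highest 10%" combined with the Int() function, and all function and parameter names are hypothetical.

```python
def trimmed_mean(freqs):
    """Mean of the frequencies left after dropping the lowest 80% and highest 10%.

    The int() cut points are an assumption about how Int() is applied.
    """
    ordered = sorted(freqs)
    t = len(ordered)
    if t == 0:
        return 0.0
    kept = ordered[int(0.8 * t): t - int(0.1 * t)]
    return sum(kept) / len(kept)

def hybrid_strain_rate(nuclear_varieties, other_nuclear,
                       cyto_varieties, other_cyto, parent_freqs,
                       min_freq=0.0017):
    """R = R1 + R2 - R3 - R4 + Rm.

    nuclear_varieties / cyto_varieties: one list of genotype frequencies per
    hybrid strain variety. other_nuclear / other_cyto: frequencies of hybrid
    strain genotypes not possessed by those varieties. parent_freqs: list of
    (female-parent frequency, male-parent frequency) per hybrid-specific region.
    """
    r1 = sum(2 * trimmed_mean(f) for f in nuclear_varieties)
    r2 = 2 * trimmed_mean([f for f in other_nuclear if f >= min_freq])
    r3 = sum(trimmed_mean(f) for f in cyto_varieties)
    r4 = trimmed_mean([f for f in other_cyto if f >= min_freq])
    rm = (sum(fm - ff for fm, ff in parent_freqs) / len(parent_freqs)
          if parent_freqs else 0.0)
    return r1 + r2 - r3 - r4 + rm

def judge(variant_site_rate, non_universal_variant, r, sd, m):
    """Return (has_specificity, has_uniformity_and_stability)."""
    has_specificity = variant_site_rate >= sd or non_universal_variant
    return has_specificity, r <= m

# Example with made-up frequencies and assumed thresholds SD = M = 5%.
rate = hybrid_strain_rate([[i / 100 for i in range(1, 11)]], [], [], [], [])
print(rate, judge(0.02, False, rate, sd=0.05, m=0.05))
```

The thresholds SD and M are inputs here because the text leaves their values to the tester.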
Further, the method also includes estimating the probability that the conclusion on the uniformity and stability of the corn variety under test is correct. When the corn variety under test has uniformity and stability, the probability that the conclusion is correct is ≥ BINOM.DIST(M*SN, SN, R, TRUE)*BINOM.DIST(∑SeN*M, ∑SeN, R, TRUE); when the corn variety under test does not have uniformity and stability, the probability that the conclusion is correct is ≥ BINOM.DIST((1-M)*SN, SN, (1-R), TRUE)*BINOM.DIST(∑SeN*(1-M), ∑SeN, 1-R, TRUE). Here M is the threshold selected for judging uniformity and stability, and ∑SeN is the total number of sequencing fragments in the test regions whose genotype frequencies were used to calculate the hybrid strain rate R. BINOM.DIST(M*SN, SN, R, TRUE) is the probability that, with SN samples drawn from the corn variety under test, the hybrid strain rate actually sampled is below the threshold M; BINOM.DIST(∑SeN*M, ∑SeN, R, TRUE) is the probability that, with ∑SeN fragments sampled, the hybrid strain rate actually sampled is below the threshold M. BINOM.DIST((1-M)*SN, SN, (1-R), TRUE) and BINOM.DIST(∑SeN*(1-M), ∑SeN, 1-R, TRUE) are the corresponding probabilities that the hybrid strain rate actually sampled is above the threshold M. The frequency of a genotype is the proportion, within the set of sequencing fragments, of the fragments representing that genotype among all sequencing fragments of the test region in which the genotype lies.
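The two lower bounds above are products of cumulative binomial probabilities and can be evaluated directly. The sketch below is illustrative only: `binom_cdf` is a hypothetical stand-in for BINOM.DIST(..., TRUE) that assumes non-integer counts are truncated, and `uniformity_confidence` is a hypothetical helper name.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); stand-in for BINOM.DIST(k, n, p, TRUE)."""
    k = min(n, math.floor(k + 1e-9))  # truncate, tolerating float noise
    if k < 0:
        return 0.0
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 1.0 if k >= n else 0.0
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_pmf)
    return min(1.0, total)

def uniformity_confidence(m, sn, r, total_fragments, passes):
    """Lower bound on the probability that the uniformity/stability call is correct.

    m: threshold M; sn: sampling amount SN; r: hybrid strain rate R;
    total_fragments: the sum written as sum(SeN) in the text;
    passes: True if the variety was judged uniform and stable.
    """
    if passes:
        return (binom_cdf(m * sn, sn, r)
                * binom_cdf(total_fragments * m, total_fragments, r))
    return (binom_cdf((1 - m) * sn, sn, 1 - r)
            * binom_cdf(total_fragments * (1 - m), total_fragments, 1 - r))

# Example with made-up inputs: M = 5%, SN = 2000, R = 3%, 50000 fragments.
print(uniformity_confidence(0.05, 2000, 0.03, 50000, passes=True))
```

For very large fragment totals the plain summation becomes slow; a statistics library's binomial CDF would be the practical choice there.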
Further, when no variant site is present in the non-universal test regions: if the corn variety under test is judged to have specificity, the probability that the conclusion is correct is ≥ BINOM.DIST((1-SD)*TRN, TRN, 1-OD, TRUE); if the corn variety under test is judged not to have specificity, the probability that the conclusion is correct is ≥ BINOM.DIST(SD*TRN, TRN, OD, TRUE). Here TRN is the number of successfully detected test regions, OD is the variant site rate, and BINOM.DIST is the function of that name in Excel 2010. When the corn variety under test is judged to have specificity, the probability that the conclusion is correct is the probability that the variant site rate is greater than SD; when it is judged not to have specificity, it is the probability that the variant site rate is less than SD. The successfully detected test regions are obtained by analysing the set of sequencing fragments.
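As a rough illustration, the specificity bounds above can be computed in the same way. This is a sketch under assumptions: `binom_cdf` stands in for BINOM.DIST(..., TRUE) with truncation of non-integer counts, `specificity_confidence` is a hypothetical name, and the example inputs are invented.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); stand-in for BINOM.DIST(k, n, p, TRUE)."""
    k = min(n, math.floor(k + 1e-9))
    if k < 0:
        return 0.0
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 1.0 if k >= n else 0.0
    total = 0.0
    for i in range(k + 1):
        total += math.exp(math.lgamma(n + 1) - math.lgamma(i + 1)
                          - math.lgamma(n - i + 1)
                          + i * math.log(p) + (n - i) * math.log1p(-p))
    return min(1.0, total)

def specificity_confidence(sd, trn, od, has_specificity):
    """Lower bound on the probability that the specificity call is correct.

    sd: threshold SD; trn: number of successfully detected test regions;
    od: observed variant site rate.
    """
    if has_specificity:
        return binom_cdf((1 - sd) * trn, trn, 1 - od)
    return binom_cdf(sd * trn, trn, od)

# Example with made-up inputs: SD = 5%, 5000 regions, variant site rate 8%.
print(specificity_confidence(0.05, 5000, 0.08, has_specificity=True))
```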
Specifically, the hybrid strain varieties are obtained as follows: a hybrid strain variety is a variety present in the database for which the number of test regions where its potential hybrid strain genotype is identical to a hybrid strain genotype accounts for ≥ 60% of the total number of test regions in which that variety has a potential hybrid strain genotype; a hybrid strain genotype is a potential hybrid strain genotype with frequency ≥ 0.02%;
Between a potential hybrid strain genotype and every genotype of the corn variety under test, the number of differing bases is ≥ 2, or the differing bases include an insertion or deletion of non-consecutive bases.
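The two criteria above can be sketched as simple filters. The data layout is an assumption made for illustration (dictionaries keyed by test region; genotypes as equal-length strings), the function names are hypothetical, and the insertion/deletion branch of the second criterion is deliberately left out.

```python
def differing_bases(g1, g2):
    """Number of positions at which two equal-length genotypes differ."""
    return sum(x != y for x, y in zip(g1, g2))

def is_potential_hybrid_genotype(genotype, test_variety_genotypes):
    """Substitution part of the criterion: >= 2 differing bases from every
    genotype of the variety under test (indels are not handled here)."""
    return all(differing_bases(genotype, g) >= 2 for g in test_variety_genotypes)

def is_hybrid_strain_variety(candidate, observed, min_freq=0.0002, min_share=0.60):
    """candidate: region -> potential hybrid strain genotype of a database variety.
    observed: region -> {genotype: frequency} found in the pooled sample.

    The variety is called a hybrid strain variety when its genotype is seen at
    frequency >= 0.02% in at least 60% of its candidate regions.
    """
    if not candidate:
        return False
    matches = sum(1 for region, genotype in candidate.items()
                  if observed.get(region, {}).get(genotype, 0.0) >= min_freq)
    return matches / len(candidate) >= min_share

# Made-up example: the candidate variety matches in 2 of its 3 regions.
candidate = {"r1": "TCG", "r2": "AAT", "r3": "GGC"}
observed = {"r1": {"TCG": 0.004}, "r2": {"AAT": 0.003}, "r3": {"GGA": 0.002}}
print(is_hybrid_strain_variety(candidate, observed))
```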
Specifically, the universal test regions are determined from the variant sites as follows: the discrimination $D = 1 - \frac{\sum_{i=1}^{k} b_i (b_i - 1)}{a(a-1)}$ is calculated, where a is the total number of varieties detected in the variation window region, $b_i$ is the number of varieties having the i-th genotype in the variation window region with $b_i > 1$, and k is the number of genotypes containing more than one variety. A variation window region is the detection window centred on a single-nucleotide variant site and extended on each side of that site by half of the sequencing length;
The universal test regions are the 8000 variation windows with the highest discrimination on the nuclear genome and the 100 variation windows with the highest discrimination on the cytoplasmic genome.
The technical solution provided by the embodiments of the invention has the following beneficial effects: through high-throughput sequencing and multi-site amplification, the method achieves large-sample sampling of the corn variety under test and large-sample sampling of the test regions of each individual; combined with the hybrid strain genotypes and the hybrid strain rate, it judges the specificity, stability and uniformity of the corn variety under test accurately and completely, and testing is faster and can be completed within 10 days.
Embodiments
To make the object, technical solution and advantages of the invention clearer, embodiments of the invention are described in further detail below.
Embodiment 1: determining the specificity, uniformity and stability of corn variety "G95/1102"
The corn variety under test in this embodiment is "G95/1102", a cross of the corn varieties "1102" and "G95"; all of these are publicly known varieties. The method of determining the specificity, uniformity and stability of this corn variety comprises the following steps.
1. Obtain the variant sites between different corn varieties.
Variant sites between different corn varieties can be obtained from published literature, but results obtained that way are fragmentary. In this embodiment, the genome sequences of different corn varieties are aligned to the genome sequence of a reference corn variety, yielding a large number of variant sites between different corn varieties.
Further, the method for obtaining the genome sequence of different corn varieties is as follows:
The genome sequences of the different corn varieties in this embodiment come from two sources. The first is the high-throughput genome sequencing of 103 corn varieties by Chia et al.; the reference is: Chia JM et al. Maize HapMap2 identifies extant variation from a genome in flux. Nat Genet. 2012, 44(7): 803-7. The genome sequences of the 103 corn varieties are deposited in the NCBI Short Read Archive (http://www.ncbi.nlm.nih.gov/sra) under accession number SRA051245. The second is high-throughput sequencing of "G95", "1102" and the hybrid "height relies 145", carried out by the method given in the above article by Chia et al. In total, this embodiment obtained the high-throughput genome sequences of 106 corn varieties.
Further, variant sites are obtained using the genome sequence of different cultivars.
Specifically, because the sequencing depth of these 106 corn varieties is low, only single-nucleotide variant (SNP) sites can be identified. With a sufficiently high sequencing depth, other variant types such as repeat-number variation could also be identified; here their reliability is low, so they were not identified. The Sanger alignment software (version 0.4) was used to align the high-throughput genome sequences of the 106 corn varieties to the "B73" maize nuclear reference genome (version: AGPv1; download address: http://www.ncbi.nlm.nih.gov) and to the cytoplasmic reference genome. The cytoplasmic reference genome comprises the mitochondrial reference genome and the chloroplast reference genome, whose accession numbers at NCBI (National Center for Biotechnology Information) are NC_007982.1 and NC_001666.2 respectively. During alignment the insert size was set to 500 bp and all other parameters were left at their default values. The SNP sites of each corn variety were identified with the Ssaha Pileup package (version 0.5). An SNP site is defined as a base pair with a definite difference, a single-base insertion or a single-base deletion. A base pair with a definite difference excludes base pairs whose difference is uncertain, that is, pairs involving degenerate bases. For example, R represents A or G, so A and R may or may not differ; the difference between A and R is uncertain and the pair is not counted as an SNP. The SNP sites in this embodiment therefore exclude such uncertain base pairs. Under this definition, 53855606 SNP sites in total were obtained among the 106 corn varieties, of which 9005 lie on the cytoplasmic genome and the rest on the nuclear genome. A genotype hereafter refers to the combination of the SNP sites in a test region; a nuclear genotype is a genotype located on the nuclear genome, and a cytoplasmic genotype is a genotype located on the cytoplasmic genome. For example, the 8th test region in Table 1 lies on the nuclear genome, so its genotype is a nuclear genotype; the region has 7 SNP sites, and its genotype is the combination of those 7 SNP sites.
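The rule that excludes uncertain base pairs can be illustrated with the standard IUPAC nucleotide codes. The sketch below is only an illustration of that rule, with a hypothetical function name: two base calls count as definitely different when the sets of bases they can represent do not overlap.

```python
# Standard IUPAC nucleotide codes and the bases each one can represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def definitely_different(base1, base2):
    """True only if the two calls cannot represent the same base."""
    return set(IUPAC[base1.upper()]).isdisjoint(IUPAC[base2.upper()])

print(definitely_different("A", "G"))  # definite difference
print(definitely_different("A", "R"))  # R may be A, so the difference is uncertain
```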
2. Determine the test regions of the corn variety under test from the variant sites. The test regions include universal test regions, and at least some of the variant sites lie within the universal test regions. The method includes:
Determine universal test region
A universal test region is a region of high discrimination on the cytoplasmic genome, or a region of high discrimination and even distribution on the nuclear genome. The discrimination is $D = 1 - \frac{\sum_{i=1}^{k} b_i (b_i - 1)}{a(a-1)}$, where a is the total number of varieties detected in the variation window region, $b_i$ is the number of varieties having the i-th genotype in the variation window region with $b_i > 1$, and k is the number of genotypes containing more than one variety. A variation window region is the detection window centred on a single-nucleotide variant site and extended on each side of that site by half of the sequencing length. The discrimination is derived as follows. The number of pairwise combinations among all varieties is $\binom{a}{2} = \frac{a(a-1)}{2}$. Combinations of different varieties that share the same genotype cannot be distinguished, and there are $\sum_{i=1}^{k} \binom{b_i}{2} = \sum_{i=1}^{k} \frac{b_i (b_i - 1)}{2}$ of them. The proportion of variety combinations that cannot be distinguished is therefore $\frac{\sum_{i=1}^{k} b_i (b_i - 1)}{a(a-1)}$, and the proportion that can be distinguished, i.e. the discrimination, is $D = 1 - \frac{\sum_{i=1}^{k} b_i (b_i - 1)}{a(a-1)}$. The greater the discrimination, the better a variation window region distinguishes different varieties and the more effective it is for DUS testing. If the variation window regions are unevenly distributed on the nuclear genome, some regions will be adjacent and inherited together by linkage, so their information tends to overlap. The principle for selecting universal test regions on the nuclear genome is therefore high discrimination with evenly distributed SNP sites. The cytoplasmic genome has no linkage inheritance problem, so only regions of high discrimination need to be selected there.
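The discrimination can be computed directly from the genotype counts. The following Python sketch uses hypothetical function names; the figures in the example (a = 102 varieties, genotype groups of 5, 11 and 76 varieties) are the ones given in this embodiment for the 1st variation window region.

```python
from collections import Counter

def discrimination(a, group_sizes):
    """D = 1 - sum(b_i * (b_i - 1)) / (a * (a - 1)).

    a: number of varieties detected in the window.
    group_sizes: number of varieties sharing each genotype (singletons add 0).
    """
    indistinguishable = sum(b * (b - 1) for b in group_sizes)
    return 1 - indistinguishable / (a * (a - 1))

def discrimination_from_genotypes(genotypes):
    """Same value, starting from one genotype string per detected variety."""
    counts = Counter(genotypes)
    return discrimination(len(genotypes), counts.values())

print(round(discrimination(102, [5, 11, 76]), 2))  # 0.43
```

Genotypes carried by a single variety contribute nothing to the sum, which is why only groups with more than one variety need to be listed.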
In this embodiment, high-throughput sequencing is performed on a Proton high-throughput sequencer, which can sequence test regions up to 200 bp long; to obtain the maximum amount of information, the longest test region in this embodiment is also 200 bp. A variant site in this embodiment therefore refers to a whole test region, which may contain several SNP sites. A genotype hereafter refers to the combination of the SNP sites in a test region; a nuclear genotype is a genotype located on the nuclear genome, and a cytoplasmic genotype is a genotype located on the cytoplasmic genome. For example, the 1st test region in Table 1 lies on the nuclear genome, so its genotype is a nuclear genotype; the region has 3 SNP sites, and its genotype is the combination of those 3 SNP sites.
First, each SNP site obtained is taken as a centre and extended by 99 bp to the left and 100 bp to the right, forming a 200 bp variation window. The 53855606 SNP sites obtained give 53855606 variation windows, and the discrimination $D = 1 - \frac{\sum_{i=1}^{k} b_i (b_i - 1)}{a(a-1)}$ of each variation window region is calculated. For example, in the 1st variation window region, a = 102 varieties were detected in total, with k = 3 genotypes (CCA, TCA and TCG) whose variety counts are b1 = 5, b2 = 11 and b3 = 76, so $D = 1 - \frac{5 \times 4 + 11 \times 10 + 76 \times 75}{102 \times 101} \approx 43\%$. This means that the 1st variation window region distinguishes 43% of the variety combinations among the 102 varieties; the other 57% cannot be distinguished and require more variation windows. In the same way, the discrimination of all 53855606 variation windows is calculated, and the 8000 variation windows with the highest discrimination on the nuclear genome and the 100 variation windows with the highest discrimination on the cytoplasmic genome are chosen. The 8000 variation windows on the nuclear genome are then checked one by one for the distance between each variation window and the next; if the distance does not exceed 500K (1K = 1000 bases), the window of the pair with the lower discrimination is discarded and the check is repeated, until the distance between every pair of adjacent variation windows exceeds 500K. The 500K criterion was chosen because the maize genome is about 2300M in size (1M = 1,000,000 bases); with the 2400 universal test regions finally selected on the nuclear genome, the average distance between universal test regions would be about 1M, but because some specific regions such as centromeres have few variant sites, the distance should be less than 1M. Through this process, 5030 variation windows on the nuclear genome were selected; together with the 100 variation windows with the highest discrimination on the cytoplasmic genome, a total of 5130 variation windows serve as the selected universal test regions. The choice of the 200 variation windows with the highest discrimination is an empirical value, and the number can be adjusted as circumstances require.
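The window construction and the distance-based thinning described above can be sketched as follows. This is a simplified illustration, not the embodiment's actual pipeline: windows are represented as hypothetical `(chromosome, position, discrimination)` tuples, and the thinning restarts from the beginning after every removal, which is easy to read but not efficient for thousands of windows.

```python
def make_window(snp_position):
    """200 bp window: 99 bp to the left and 100 bp to the right of the SNP."""
    return snp_position - 99, snp_position + 100

def thin_windows(windows, min_gap=500_000):
    """Drop the less discriminating window of any adjacent pair on the same
    chromosome whose distance does not exceed min_gap; repeat until none is left.

    windows: list of (chromosome, position, discrimination) tuples.
    """
    kept = sorted(windows)  # order by chromosome, then position
    changed = True
    while changed:
        changed = False
        for i in range(len(kept) - 1):
            (c1, p1, d1), (c2, p2, d2) = kept[i], kept[i + 1]
            if c1 == c2 and p2 - p1 <= min_gap:
                del kept[i if d1 < d2 else i + 1]
                changed = True
                break
    return kept

# Made-up example: the first two windows on chromosome "1" are too close.
example = [("1", 100_000, 0.40), ("1", 300_000, 0.45),
           ("1", 1_200_000, 0.30), ("2", 200_000, 0.20)]
print(make_window(1_000))
print(thin_windows(example))
```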
The test regions may also include non-universal test regions, which are determined as follows:
Determine the non-universal test regions
Non-universal test regions are the non-universal sites that need to be detected for particular varieties. DUS testing needs to detect the non-universal sites produced by site-directed modification. Site-directed modification, such as backcross breeding and transgenic breeding, is a technique commonly used in modern breeding, and a variety obtained by site-directed modification can also become a new variety because of its specificity. Based on the specificity criteria of new variety protection, a non-universal test region should not be included in the universal test regions and should be a site known to control a qualitative trait. In the present embodiment, because the corn variety to be tested was not obtained by site-directed modification, no non-universal sites need to be detected, and therefore there are no non-universal test regions.
3rd, the primers that amplify the test regions are prepared; the primers include universal test region primers, as follows:
Prepare the universal test region primers, which apply to all varieties, specifically:
The universal test regions are detected by multiplex PCR, in which multiple PCR primers are added to a single PCR reaction so that multiple sites in the genome are amplified simultaneously. The key to this technique is designing and synthesizing the multiplex PCR primers. The present embodiment uses the multiplex PCR technology provided by Thermo Fisher Scientific (USA), which supports up to 12000-plex PCR primers.
The primers are obtained as follows: log in to the Thermo Fisher Scientific online multiplex PCR primer design page https://ampliseq.com/protected/help/pipelineDetails.action and submit the relevant information as required. In the present embodiment, "DNA Hotspot designs (single-pool)" is selected for the "Application type" option. If multi-pool is selected, the multiplex PCR must be carried out in several tubes, which increases cost, whereas single-pool primers need only one multiplex PCR, which saves cost. The drawback is that primer design may fail for some universal test regions, but because many candidate universal test regions are available on the genome, discarding some of them does not affect the result. The nuclear reference genome and the cytoplasmic reference genome of the corn variety to be tested are merged into one file; after "Custom" is selected under the "Select the genome you wish to use" option, the merged file is uploaded as the reference genome for designing the multiplex PCR primers. "Standard DNA" is selected for the DNA type option. Under the Add Hotspot option, the position information of the SNP sites in the universal test regions to be designed is added, including the chromosome, the SNP start site and the SNP end site; some examples are shown in Table 1. Finally, the "Submit targets" button is clicked to submit the request and obtain the designed multiplex PCR primers. In the present embodiment, 2506 pairs of multiplex PCR primers were designed and successfully verified from all 5130 universal test regions, for amplifying the corresponding 2506 universal test regions. The multiplex PCR primers are verified with the same method provided by the invention: genomic DNA is extracted from the leaves of a single corn plant and amplified with the designed multiplex PCR primers, followed by library construction, high-throughput sequencing and analysis of the sequencing fragment group. The primers of the following test regions are removed: test regions with fewer than 1000 sequencing fragments, or test regions in which a hybrid strain genotype is present. The primers that remain are the successfully verified multiplex PCR primers. Because the genomic DNA comes from the leaves of a single corn plant, no hybrid strain variety can be present; any hybrid strain genotype is therefore a PCR or sequencing bias error caused by the special structure of the test region, and removing these test regions avoids such systematic errors. The successfully verified multiplex PCR primers are mixed by the company and supplied to the customer in liquid form. The 2506 universal test regions for which multiplex PCR primers were successfully designed are the universal test regions finally used for testing the corn variety to be tested; each variety in the constructed database also covers these 2506 universal test regions, of which 34 are located in the cytoplasmic genome and the remaining 2472 in the nuclear genome.
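The primer verification rule described above can be written as a small filter. This is only a sketch: the dictionary layout and the function name are hypothetical, while the threshold of 1000 sequencing fragments and the exclusion of regions showing a hybrid strain genotype in a single plant come from the text.

```python
MIN_FRAGMENTS = 1000  # regions with fewer sequencing fragments are rejected


def verified_regions(fragment_counts, hybrid_genotypes):
    """Return the test regions whose primers pass verification.

    fragment_counts: dict mapping region id -> number of sequencing fragments
        obtained from the DNA of a single plant.
    hybrid_genotypes: dict mapping region id -> set of hybrid strain genotypes
        called in that region (expected to be empty for a single plant).
    """
    passed = []
    for region, n_fragments in fragment_counts.items():
        if n_fragments < MIN_FRAGMENTS:
            continue  # too few fragments
        if hybrid_genotypes.get(region):
            continue  # systematic PCR or sequencing bias in this region
        passed.append(region)
    return passed
```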
It should be noted that the number of universal test regions is required to be ≥900, for the following reason: with fewer than 900 regions, the probability that a misjudged hybrid strain variety exists would exceed 1%; the method for deriving this threshold is shown in Table 2. Because detection may fail in some test regions, the number of test regions is generally ≥1000.
The test region primers may also include non-universal test region primers, which are specific to the corn variety to be tested, as follows:
Prepare the non-universal test region primers
The primers of a non-universal test region include a first primer and a second primer. The first primer comprises a first forward primer and a first reverse primer, and the second primer comprises a second forward primer and a second reverse primer. The first primer and the second primer are each used in a separate amplification, giving two amplified products of the non-universal test region, which are mixed in equal amounts to build the separately amplified high-throughput sequencing library. The 5' end of the first forward primer is linked to sequence 1 shown in SEQ ID NO: 1 of the sequence table, and the 5' end of the first reverse primer is linked to sequence 2 shown in SEQ ID NO: 2 of the sequence table; the 5' end of the second forward primer is linked to sequence 2 shown in SEQ ID NO: 2 of the sequence table, and the 5' end of the second reverse primer is linked to sequence 1 shown in SEQ ID NO: 1 of the sequence table.
The non-universal test region primers are designed as follows. In the first step, a PCR forward primer and reverse primer that amplify the non-universal test region are designed by an ordinary PCR primer design method, subject to the requirements that the amplified length does not exceed 200 bp and that all SNP sites of the non-universal test region are included. In the second step, SEQ ID NO: 1 and SEQ ID NO: 2 of the sequence table are linked to the 5' ends of the designed forward primer and reverse primer respectively, giving the forward primer and the reverse primer of the first primer. In the third step, SEQ ID NO: 2 and SEQ ID NO: 1 of the sequence table are linked to the 5' ends of the designed forward primer and reverse primer respectively, giving the forward primer and the reverse primer of the second primer. SEQ ID NO: 1 and SEQ ID NO: 2 of the sequence table are the adapter sequences used in high-throughput sequencing, so the PCR products carry the high-throughput sequencing adapters and can be mixed directly with the sequencing library built from the amplified products of the universal test regions and sequenced together, without tedious library construction steps such as fragmentation and adapter ligation, which improves efficiency and reduces cost. The two primer pairs, which differ only in their adapters, are made so that the non-universal test region is sequenced from both ends simultaneously.
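The second and third design steps amount to prefixing the two adapters in swapped order. A minimal sketch follows; the adapter arguments stand for SEQ ID NO: 1 and SEQ ID NO: 2, whose actual sequences are not reproduced in this section, and the function name is hypothetical.

```python
def fusion_primer_pairs(forward, reverse, adapter_1, adapter_2):
    """Build the first and second primer pairs for one non-universal region.

    forward, reverse: region-specific primers from the first design step.
    adapter_1, adapter_2: placeholders for SEQ ID NO: 1 and SEQ ID NO: 2.
    Adapters are attached at the 5' end, i.e. in front of the primer string
    when sequences are written 5' to 3'.
    """
    first_pair = (adapter_1 + forward, adapter_2 + reverse)
    second_pair = (adapter_2 + forward, adapter_1 + reverse)
    return first_pair, second_pair
```

Because the two pairs differ only in which adapter sits on which side, the same region is read from both ends once the two products are pooled.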
Because the corn variety to be tested in the present embodiment has no non-universal test regions, there are no non-universal test region primers.
4th, the method for the database of genotype in all test zones of the structure comprising different cultivars is as follows:
The database of genotype in all test zones of the structure comprising different cultivars, specifically, in corn product to be measured Kind test zone on, obtain different cultivars to genotype that should be on test zone and composition data storehouse.This example obtains 2506 universal test region primers and 0 non-universal test zone primer, amplification regions corresponding to them are jade to be measured The test zone of rice kind.The number of the genotype of 2506 test zones of the structure comprising 106 kinds and its SNP positional information According to storehouse, partial results are shown in Table 1.
Table 1 is database variety and genetype and its position, Maize Genotypes to be measured, hybrid strain genotype and its frequency Certain embodiments
"/" represents that the test zone is heterozygous genotypes in table 1, the different genotype of "/" both front and back be present;Except ATGC Outside, other letters represent degeneracy base.If genotype is made up of degeneracy base N entirely, claim corresponding test zone genotype and SNP numbers According to missing, when the genotype or SNP of missing are compared with any genotype or SNP, make indifference processing.It can be provided by the present invention The method Test database kind of detection Maize Genotypes to be measured and the genotype of completion missing.
The present embodiment does not list all database content completely as space is limited, only lists wherein 5 kinds The information of 10 test zones.Equally limited based on length, also have some areas also only to list part in the present embodiment related real Example, remaining unlisted data can be according to the method completion of the present embodiment.
5th, after the sampling size SN of the corn variety to be tested is determined, samples are taken at random and mixed, and the DNA of the mixed sample is extracted, as follows:
Calculate the sampling size of the corn variety to be tested
The sampling size SN should satisfy the condition BINOM.INV(SN, M, 0.95)/SN ≤ 1.15*M, where BINOM.INV is the function of Excel 2010, used as defined in Excel 2010; it returns the smallest integer for which the cumulative binomial distribution is greater than or equal to the criterion value. The condition means that even if the hybrid strain rate exceeds the threshold M by only 15%, the sampling size allows the stability and uniformity of the corn variety to be tested to be judged correctly with 95% probability. The value of M is set according to conditions such as the crop species, the variety type and specific requirements. The guideline "New variety of plant specificity, uniformity and stability test guide - corn", issued by the New Variety Protection Office of the Ministry of Agriculture, stipulates that a 3% population standard is used for corn hybrids; therefore 3% is used as the value of M in the present embodiment. Increasing SN step by step and evaluating the above formula shows that BINOM.INV(SN, 3%, 0.95)/SN ≤ 1.15*3% holds when SN ≥ 4000. Therefore the sampling size of the sample to be tested in the present embodiment should be ≥4000.
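The sampling-size condition can be checked outside Excel. The sketch below re-implements the behaviour of BINOM.INV with the Python standard library (smallest k whose cumulative binomial probability reaches the criterion); the function names are hypothetical, and the code only tests a given SN rather than asserting the exact crossover point.

```python
import math


def binom_pmf(k, n, p):
    """Binomial probability P(X = k), computed in log space for large n."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)


def binom_inv(n, p, alpha):
    """Smallest k with P(X <= k) >= alpha, mirroring Excel's BINOM.INV."""
    cumulative = 0.0
    for k in range(n + 1):
        cumulative += binom_pmf(k, n, p)
        if cumulative >= alpha:
            return k
    return n


def sampling_size_ok(sn, m=0.03, confidence=0.95, margin=1.15):
    """Check BINOM.INV(SN, M, 0.95) / SN <= 1.15 * M for one sampling size."""
    return binom_inv(sn, m, confidence) / sn <= margin * m


if __name__ == "__main__":
    for sn in (1000, 2000, 4000, 6000):
        print(sn, sampling_size_ok(sn))
```

A small SN such as 1000 fails the check because the 95% quantile of the observed hybrid strain rate is still far above 1.15*M, which is why SN has to be raised into the thousands.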
Sample at random, mix, and extract the DNA of the mixed sample
In the present embodiment, 6000 seeds were germinated, and 5000 sprouts of roughly equal size were selected at random, mixed and placed in a mortar; liquid nitrogen was added to the mortar and the sprouts were ground thoroughly into powder. The DNA of the mixed sample of the corn variety to be tested was extracted with the plant genomic DNA extraction kit (catalog No. DP305) produced by Tiangen Biotech (Beijing) Co., Ltd., following the operation manual of the kit. The DNA obtained was quantified with the dsDNA HS Assay Kit (catalog No. Q32852) produced by Invitrogen (USA) according to its instructions, and the quantified DNA of the corn variety to be tested was diluted to 10.00 ng/μL.
6th, the DNA of the mixed sample is amplified with the primers to obtain the amplified products of the test regions, and the amplified products are used to build the high-throughput sequencing library, wherein the primers include the universal test region primers and the high-throughput sequencing library includes the high-throughput sequencing library of the universal test regions. The specific method is as follows:
The high-throughput sequencing library comprises the high-throughput sequencing library of the universal test regions and the high-throughput sequencing library of the non-universal test regions. In the present embodiment, there is no high-throughput sequencing library of non-universal test regions; therefore the high-throughput sequencing library of all test regions is the high-throughput sequencing library of the universal test regions.
The high-throughput sequencing library of the universal test regions is built as follows:
The universal test regions are amplified by multiplex PCR with library construction kit 2.0 (produced by Thermo Fisher Scientific, USA, catalog No. 4475345), and the amplified products are used to build the high-throughput sequencing library. The kit contains the following reagents: 5× Ion AmpliSeq™ HiFi Mix, FuPa reagent, transfer reagent, sequencing adapter solution and DNA ligase. The library is built according to the operation manual of the kit, "Ion AmpliSeq™ Library Preparation" (publication number: MAN0006735, version: A.0). The 2506 universal test regions are amplified by multiplex PCR in the following amplification system: 5× Ion AmpliSeq™ HiFi Mix 4 μL, universal test region primer mix 4 μL, 10 ng of the prepared DNA of the corn variety to be tested, and nuclease-free water 11 μL. The multiplex PCR program is: 99 °C for 2 minutes; (99 °C for 15 seconds; 60 °C for 4 minutes) × 25 cycles; hold at 10 °C. The excess primers in the multiplex PCR product are digested with FuPa reagent and the product is then phosphorylated, as follows: 2 μL of FuPa reagent is added to the multiplex PCR product and mixed, and the reaction is run in a PCR instrument with the following program: 50 °C for 10 minutes; 55 °C for 10 minutes; 60 °C for 10 minutes; hold at 10 °C. This gives mixture a, a solution containing the phosphorylated amplified products. The sequencing adapters are then ligated to the phosphorylated amplified products, as follows: 4 μL of transfer reagent, 2 μL of sequencing adapter solution and 2 μL of DNA ligase are added to mixture a and mixed, and the reaction is run in a PCR instrument with the following program: 22 °C for 30 minutes; 72 °C for 10 minutes; hold at 10 °C. This gives mixture b. Mixture b is purified by standard ethanol precipitation and dissolved in 10 μL of nuclease-free water. Its mass concentration is measured with the dsDNA HS Assay Kit (catalog No. Q32852) produced by Invitrogen (USA) according to its instructions, and the purified mixture b is then diluted to 15 ng/ml, giving the high-throughput sequencing library of the universal test regions at a concentration of about 100 pM.
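The statement that 15 ng/ml corresponds to about 100 pM can be cross-checked with the usual mass-to-molarity conversion for double-stranded DNA. The sketch assumes an average mass of 660 g/mol per base pair, which is a common approximation and not a figure from this text; the function names are hypothetical.

```python
GRAMS_PER_MOL_PER_BP = 660.0  # assumed average mass of one dsDNA base pair


def picomolar(conc_ng_per_ml, mean_length_bp):
    """Convert a dsDNA mass concentration (ng/ml) to molarity (pM)."""
    grams_per_litre = conc_ng_per_ml * 1e-6
    mol_per_litre = grams_per_litre / (GRAMS_PER_MOL_PER_BP * mean_length_bp)
    return mol_per_litre * 1e12


def implied_length_bp(conc_ng_per_ml, conc_pm):
    """Mean fragment length at which the two concentrations agree."""
    return conc_ng_per_ml * 1e6 / (GRAMS_PER_MOL_PER_BP * conc_pm)
```

Under this assumption the two figures agree for library fragments averaging somewhat over 200 bp, adapters included, which is the size range expected for test regions of at most 200 bp.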
The high-throughput sequencing library of the non-universal test regions is built as follows:
Using the DNA of the variety to be tested as template, separate PCR amplifications are carried out with the first primer and the second primer of the non-universal test region prepared as described above, and the amplified products are mixed in equal amounts to obtain the high-throughput sequencing library of the non-universal test regions. The operation follows "Ion Amplicon Library Preparation (Fusion Method)" (publication number: 4468326), roughly as follows. The forward primer and the reverse primer of the first primer are each dissolved in water to a concentration of 10 μM and mixed in equal volumes to obtain the first primer solution. The PCR reaction system is prepared as follows: first primer solution 1 μL, 30 ng of DNA of the corn variety to be tested, and PCR high-fidelity mix (produced by Invitrogen, USA, catalog No. 12532016) 45 μL. After mixing, the reaction is run in a PCR instrument with the following program: 94 °C for 3 minutes; (94 °C for 30 seconds; 58 °C for 30 seconds; 68 °C for 1 minute) × 40 cycles; hold at 4 °C. The PCR product is purified by standard ethanol precipitation and dissolved in 10 μL of water. Its molar concentration is determined with the DNA 1000 kit (catalog No. 5067-1504) on the bioanalyzer (model 2100) produced by Agilent (USA) according to the kit instructions, and the product is diluted to 200 pM; this is the amplified product of the first primer. The amplified product of the second primer, at a concentration of 200 pM, is obtained in the same way. The amplified products of the first primer and the second primer are mixed in equal volumes to obtain the high-throughput sequencing library of the non-universal test regions at a concentration of 100 pM. In the present embodiment, because there are no non-universal test regions, no high-throughput sequencing library of non-universal test regions needs to be built.
Obtain the high-throughput sequencing library of all test regions
The high-throughput sequencing library of the universal test regions and the high-throughput sequencing library of the non-universal test regions, at equal molar concentrations, are mixed in the ratio of the number of universal test regions to the number of non-universal test regions; the resulting mixture is the high-throughput sequencing library of all test regions. In the present embodiment, because there is no high-throughput sequencing library of non-universal test regions, the high-throughput sequencing library of all test regions is the high-throughput sequencing library of the universal test regions at a concentration of 100 pM.
7th, high-throughput sequencing is performed on the high-throughput sequencing library to obtain the sequencing fragment group, as follows:
Principle for determining the high-throughput sequencing depth CF: the depth CF must satisfy the following three conditions: (1) BINOM.DIST(10, 10, BINOM.DIST(8, 20, BINOM.DIST(0, CF, 0.1%, TRUE), TRUE), FALSE) ≥ 99.9%; (2) 1-BINOM.DIST(10000, 10000, 1-BINOM.DIST(8, 20, 1-BINOM.DIST(99.99%*CF, CF, 99.9989%, TRUE), TRUE), FALSE) ≤ 0.1%; and (3) BINOM.DIST(10*(1-M)*CF, 10*CF, 1-110%*M, TRUE) ≥ 95.0%. Here CF is the depth of high-throughput sequencing, that is, the average number of times each test region is covered; M is the threshold selected for judging uniformity and stability; and BINOM.DIST is the function of Excel 2010, used as defined in Excel 2010, which returns a binomial distribution probability. The three conditions mean, respectively: (1) when the hybrid strain rate is as low as 0.1%, there are as many as 10 hybrid strain varieties, and a hybrid strain variety differs from the corn variety to be tested at only 20 sites on average, the sequencing depth ensures that the probability of detecting all hybrid strain varieties is ≥99.9%; (2) when the database contains as many as 10000 varieties and a hybrid strain variety differs from the corn variety to be tested at only 20 sites on average, the sequencing depth ensures that the probability that a misjudged hybrid strain variety exists is ≤0.1%; (3) when there are as many as 10 hybrid strain varieties and the true hybrid strain rate exceeds the selected threshold M by only 10%, the sequencing depth ensures that the probability of a correct conclusion on stability and uniformity is ≥95.0%. These conditions are very strict, so the actual performance is better than the above thresholds. The method for deriving the above probabilities is shown in Table 2.
Table 2 shows the calculation methods for the probabilities used in the present embodiment
Table 2 is an Excel 2010 data sheet; its functions, cells and so on are as defined in Excel 2010. The cell containing "threshold selected for judging uniformity and stability (M)" is cell B2, and the other cells are numbered by the rules of Excel 2010 with B2 as reference; for example, the cell containing "hybrid strain rate (R)" lies 4 rows below and 1 column to the right of B2, so its number is C6. The other cells are numbered in the same way.
The high-throughput sequencing depth of the present embodiment is determined as follows: M=3% is substituted into the three formulas above and the sequencing depth CF is increased step by step; when CF reaches 2237, all three conditions hold. Therefore the sequencing depth of the present embodiment is set to ≥2237-fold.
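The first of the three depth conditions can be evaluated without Excel. The sketch below covers that first condition only and mirrors its nesting: the inner term is the probability that a hybrid strain present at 0.1% yields no fragment at a site, the middle term is the probability that at most 8 of 20 difference sites are missed, and the outer term requires this for all 10 hybrid strain varieties. The function and parameter names are hypothetical, and the other two conditions are not reproduced.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), like BINOM.DIST(k, n, p, TRUE)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


def detect_all_probability(cf, hybrid_rate=0.001, sites=20,
                           max_missed=8, n_varieties=10):
    """Probability that all hybrid strain varieties are detected at depth cf."""
    # BINOM.DIST(0, CF, 0.1%, TRUE): no fragment of the hybrid strain at a site
    p_site_missed = (1 - hybrid_rate) ** cf
    # BINOM.DIST(8, 20, ..., TRUE): at most 8 of the 20 difference sites missed
    p_variety_found = binom_cdf(max_missed, sites, p_site_missed)
    # BINOM.DIST(10, 10, ..., FALSE): all 10 hybrid strain varieties found
    return p_variety_found ** n_varieties


if __name__ == "__main__":
    for cf in (500, 1000, 2000, 3000, 5000):
        print(cf, detect_all_probability(cf) >= 0.999)
```

Scanning CF upward with this function shows how the condition switches from failing at shallow depth to holding at greater depth, which is the step-by-step search the text describes.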
Perform high-throughput sequencing with the high-throughput sequencing library
Using the high-throughput sequencing library of all test regions obtained above and the kit Ion PI Template OT2 200 Kit v2 (produced by Invitrogen, USA, catalog No. 4485146), ePCR (emulsion PCR, emulsion polymerase chain reaction) amplification is carried out before sequencing, following the operation manual of the kit. Using the ePCR product and the kit Ion PI Sequencing 200 Kit v2 (produced by Invitrogen, USA, catalog No. 4485149), high-throughput sequencing is carried out on the Proton second-generation high-throughput sequencer, following the operation manual of the kit. In the present embodiment, the sequencing throughput is set so that the test regions are covered 30000-fold on average.
Pre-process the high-throughput sequencing results
First, it is determined whether the quality of the high-throughput sequencing data is ≥Q20. If it is <Q20 (which is rare), high-throughput sequencing is repeated as described above until the data quality reaches the Q20 standard; the Q20 standard satisfies the requirement in Table 2 that the "probability of sequencing a specific wrong base" be ≤0.33%. The sequencing fragments that meet the quality requirement are aligned to all 2506 test regions; fragments that fail to align and fragments with incomplete genotype detection are removed, and all remaining sequencing fragments are called the sequencing fragment group. A sequencing fragment with incomplete genotype detection is one that does not cover all the SNP sites listed under "positions of the SNP in the reference genome" in Table 1; the cause is that the fragment is too short. Fragments that fail to align are mostly non-specific amplification products.
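The link between the Q20 standard and the 0.33% figure follows from the Phred definition of quality scores. The sketch assumes that a sequencing error is equally likely to produce each of the three other bases, which appears to be the assumption behind the 1%/3 used in this document; the function names are hypothetical.

```python
def phred_to_error(q):
    """Per-base error probability for Phred quality q: 10 ** (-q / 10)."""
    return 10 ** (-q / 10)


def specific_wrong_base_probability(q):
    """Probability of reading one particular wrong base at quality q.

    Assumes the error is spread evenly over the three alternative bases.
    """
    return phred_to_error(q) / 3
```

At Q20 the per-base error probability is 1%, so the probability of one specific wrong base is about a third of a percent.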
8th, the sequencing fragment group is analysed to obtain the genotypes of the corn variety to be tested and the hybrid strain genotypes, as follows:
The sequencing fragment group is aligned to all test regions and the number of sequencing fragments in each test region is counted; test regions with ≤1000 sequencing fragments are removed, and the remaining test regions are the successfully detected test regions. In the present embodiment, 2465 successfully detected test regions were obtained. A fragment aligned to a test region is called a sequencing fragment of that test region, and the bases extracted from a sequencing fragment at the positions listed under "positions of the SNP in the reference genome" in Table 1 are called the genotype of that sequencing fragment. The frequency of a genotype is the proportion, within the sequencing fragment group, of the sequencing fragments representing that genotype among all sequencing fragments of the test region in which the genotype lies. Genotypes with a frequency ≥30% are called genotypes of the corn variety to be tested. In general, hybrid strains make up no more than 10% of the sample drawn and sequencing errors no more than 1%, so the two together do not exceed 11%. At a homozygous site, therefore, the corn variety to be tested has only one genotype, whose frequency should be more than 89%, and at a heterozygous site it has 2 genotypes, each with a frequency that should be more than 44.5%. Requiring a frequency ≥30% for the genotypes of the corn variety to be tested therefore excludes interference from sequencing errors and from hybrid strains mixed into the corn variety to be tested. A hybrid strain genotype is a potential hybrid strain genotype with a frequency ≥0.02%, where a potential hybrid strain genotype is one that differs from all genotypes of the corn variety to be tested by ≥2 bases, or whose differing bases include an insertion or deletion of discontinuous bases. The principle for determining hybrid strain genotypes is as follows: in high-throughput sequencing, insertion or deletion errors are extremely rare, and the probability that sequencing errors produce 2 specified differing bases is as low as (1%/3)^2=0.0011%; since the frequency of a hybrid strain genotype is required to be ≥0.02%, under these constraints the probability that sequencing errors produce a given hybrid strain genotype is only 0.0001% even at a sequencing depth of 30000 (the calculation is shown in Table 2). The 0.02% frequency meets the strictest current DUS testing standard, that is, detecting as few as 2 hybrid strains among 10,000 seeds. If the number of differing bases were set to 1, every test region would produce erroneous hybrid strain genotypes (the calculation is shown in Table 2); if it were set to ≥3, the number of hybrid strain genotypes would drop sharply, making it difficult to calculate the hybrid strain rate R accurately. A threshold of ≥2 differing bases is therefore optimal.
For example, in the sequencing fragment group, the 1st test region has 29506 sequencing fragments in total, with the genotypes TCG, AAG, AAT, AAA, ...; the numbers of sequencing fragments representing these genotypes are 28768, 304, 2, 1, ... respectively, and the frequencies of these genotypes are 28768/29506=97.50%, 304/29506=1.03%, 2/29506=0.007%, 1/29506=0.003%, .... By the definitions of the genotype of the corn variety to be tested and the hybrid strain genotype, TCG is the genotype of the corn variety to be tested in the 1st test region. AAG differs from all genotypes of the corn variety to be tested by ≥2 bases and is therefore a potential hybrid strain genotype; because its frequency of 1.03% is ≥0.02%, it is a hybrid strain genotype with a frequency of 1.03%. The other genotypes have frequencies <0.02% and are genotypes produced by sequencing errors. In a hybrid-specific test region, the female parent genotype differs from the male parent genotype, and both the female parent genotype and the male parent genotype differ from the genotypes of all nuclear hybrid strain varieties; the female parent genotype is the genotype of the corn variety to be tested that is identical to the genotype of the female parent, and the male parent genotype is the genotype of the corn variety to be tested that is identical to the genotype of the male parent. In the 1st test region, the female parent genotype TCG is identical to the male parent genotype TCG; therefore the 1st test region is not a hybrid-specific test region. A hybrid strain nuclear genotype is a hybrid strain genotype that is a nuclear genotype, and a hybrid strain cytoplasmic genotype is a hybrid strain genotype that is a cytoplasmic genotype. By this definition, the hybrid strain genotype AAG of the 1st test region is a hybrid strain nuclear genotype. In the same way, all genotypes of the corn variety to be tested, the hybrid-specific test regions, and the hybrid strain genotypes and their frequencies were determined for the 2465 successfully detected test regions, and each hybrid strain genotype obtained was classified as a hybrid strain nuclear genotype or a hybrid strain cytoplasmic genotype. The results show that in the present embodiment there are 44 hybrid-specific test regions and 61 hybrid strain genotypes in total, and the frequency of each was calculated.
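The genotype calling rules for one test region can be sketched as follows, using the counts of the worked example. The sketch handles only genotypes of equal length, so the insertion or deletion rule is not covered; the function names are hypothetical, and the total is passed separately because the example lists only the first four genotypes.

```python
def base_differences(g1, g2):
    """Number of positions at which two equal-length genotypes differ."""
    return sum(a != b for a, b in zip(g1, g2))


def call_genotypes(counts, total, variety_min=0.30, hybrid_min=0.0002,
                   min_diff=2):
    """Call variety genotypes and hybrid strain genotypes in one region.

    counts: dict mapping genotype -> number of sequencing fragments.
    total: total number of sequencing fragments in the region.
    Returns (variety_genotypes, {hybrid_strain_genotype: frequency}).
    """
    freqs = {g: n / total for g, n in counts.items()}
    variety = [g for g, f in freqs.items() if f >= variety_min]
    hybrid = {
        g: f for g, f in freqs.items()
        if g not in variety
        and f >= hybrid_min
        and all(base_differences(g, v) >= min_diff for v in variety)
    }
    return variety, hybrid


if __name__ == "__main__":
    region_1 = {"TCG": 28768, "AAG": 304, "AAT": 2, "AAA": 1}
    variety, hybrid = call_genotypes(region_1, total=29506)
    print(variety, {g: f"{f:.2%}" for g, f in hybrid.items()})
```

For the 1st test region this keeps TCG as the variety genotype and AAG as the only hybrid strain genotype, while AAT and AAA fall below the 0.02% frequency threshold.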
The standard sample detection method of the present embodiment is briefly introduced as follows. One seed is taken from the corn variety to be tested, sown and grown into a seedling, and genomic DNA is extracted from the leaves of the seedling by the same method as for the corn variety to be tested; this DNA is called the standard sample of the corn variety to be tested. The high-throughput sequencing library of the standard sample is built and sequenced in parallel with, at the same time as, and by the same method as that of the corn variety to be tested. Genotypes with a frequency ≥30% are called standard sample genotypes; standard sample hybrid strain genotypes have a frequency ≥0.02% and differ from the standard sample genotypes by ≥2 bases, or their differing bases include an insertion or deletion of discontinuous bases. The standard sample genotypes and standard sample hybrid strain genotypes of each successfully detected test region are obtained by the same method as for the corn variety to be tested. If the test regions in which the standard sample genotype is identical to the genotype of the corn variety to be tested account for more than 90% of the test regions successfully detected in both the standard sample and the corn variety to be tested, the standard sample is correct; otherwise, another seed is taken from the corn variety to be tested and the above process is repeated until a correct standard sample is obtained. The hybrid strain genotypes of the correct standard sample are compared with the hybrid strain genotypes of the corn variety to be tested in the corresponding test regions to find the identical hybrid strain genotypes; these identical hybrid strain genotypes are removed from the corn variety to be tested, and the remaining, correct hybrid strain genotypes of the corn variety to be tested are retained for subsequent analysis. This measure eliminates hybrid strain genotypes caused by systematic selection errors, which are mainly selective PCR mis-amplification caused by special structures of the gene sequence. It should be noted that when the database contains many varieties and broadly represents the genotypes of different varieties, a hybrid strain genotype can instead be required to be identical to some genotype of a database variety, which serves the same function as the standard sample; in that case the standard sample need not be tested, which reduces the workload. In the present embodiment, because no such hybrid strain genotype was detected, the problem of removing erroneous hybrid strain genotypes does not arise.
9th, the genotypes of the corn variety to be tested are compared with the genotypes of the different varieties in the database to obtain the approximate variety, the variant sites and the variant site rate, as follows:
If, in a test region, neither the corn variety to be tested nor a database variety has a missing genotype, the test region is called a shared test region of the corn variety to be tested and that database variety. In a shared test region, if the genotypes of the corn variety to be tested and the database variety are not completely identical, the test region is called a difference site of the corn variety to be tested and that database variety, and the corresponding genotypes are each other's differential genotypes; difference site rate = number of difference sites / number of shared test regions. The database variety with the lowest difference site rate is called the approximate variety of the corn variety to be tested, the corresponding difference sites are called variant sites, and variant site rate = number of variant sites / number of shared test regions.
In the present embodiment, the corn variety to be tested and the 1st variety in the database, "G95", have 2465 shared test regions. In the 1st shared test region, the genotypes of the corn variety to be tested and of "G95" are both TCG, so the two are identical; therefore the 1st shared test region is not a difference site of the corn variety to be tested and "G95", and TCG is not a differential genotype of the corn variety to be tested and "G95". In the same way, the genotypes of the corn variety to be tested and "G95" were compared in all shared test regions, and 47 difference sites were found, giving a difference site rate of 47/2465=1.91%. In the same way, the difference site rates between the corn variety to be tested and all 106 varieties in the database were obtained; the variety with the lowest difference site rate is "height relies 145", with 47 difference sites and a difference site rate of 1.91%. Therefore "height relies 145" is the approximate variety of the corn variety to be tested, and the variant site rate of the corn variety to be tested is 1.91%.
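The comparison against the database can be sketched as follows. The data layout (dictionaries mapping test region to genotype string) and the function names are hypothetical; a genotype made up only of N is treated as missing, as defined for Table 1, and heterozygous genotypes are compared without regard to the order of the two alleles, which is an assumption.

```python
def is_missing(genotype):
    """A genotype consisting only of the degenerate base N is missing."""
    return set(genotype.replace("/", "")) <= {"N"}


def difference_site_rate(test, variety):
    """Return (difference site rate, list of difference sites)."""
    shared = [r for r in test
              if r in variety
              and not is_missing(test[r]) and not is_missing(variety[r])]
    if not shared:
        raise ValueError("no shared test regions")
    diff = [r for r in shared
            if frozenset(test[r].split("/")) != frozenset(variety[r].split("/"))]
    return len(diff) / len(shared), diff


def approximate_variety(test, database):
    """Return the database variety with the lowest difference site rate."""
    rates = {name: difference_site_rate(test, gts)[0]
             for name, gts in database.items()}
    best = min(rates, key=rates.get)
    return best, rates[best]
```

With the figures of this embodiment, 47 difference sites over 2465 shared test regions give 47/2465, which rounds to the 1.91% reported.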
Tenth, the hybrid strain genotypes are compared with the genotypes of the different kinds in the database; after the hybrid strain kinds are obtained, the hybrid strain rate is calculated. The method is as follows:
Obtaining hybrid strain kinds: a hybrid strain kind is a kind that is present in the database and for which the number of test zones in which a potential hybrid strain genotype of the kind is identical to a hybrid strain genotype accounts for ≥60% of the total number of test zones in which the kind has potential hybrid strain genotypes. Here, a potential hybrid strain genotype is a genotype of the kind that differs from every genotype of the corn variety to be measured by ≥2 bases, or whose differing bases involve an insertion or deletion of non-contiguous bases. Hybrid strain kinds are divided into nucleus hybrid strain kinds and cytoplasm hybrid strain kinds: a nucleus hybrid strain kind is a hybrid strain kind obtained by calculation using only karyogene types (nuclear genotypes), and a cytoplasm hybrid strain kind is a hybrid strain kind obtained by calculation using only matter genotypes (cytoplasmic genotypes). For example, suppose the genotypes of a database kind are AA, AA, AA/TT, AA/TT, AA/TT, AA/TT and AA, and the corresponding genotypes of the corn variety to be measured are AA, AA/TT, TT, AA, TT/CC, GG/CC and -A; the corresponding potential hybrid strain genotypes are then: none, none, AA, TT, AA, AA/TT and AA. Homozygous kinds generally have no heterozygous genotypes, although a very small number of sites may be heterozygous; in addition, hybrid strains are mostly hybrid varieties, in which heterozygous sites are common, so the various possible situations are listed here. The parameter 60% ensures that the probability of detecting all hybrid strain kinds is 100% and the probability of misjudging a hybrid strain kind is 0%; the method for determining this parameter value is shown in Table 2.
In the present embodiment, the 1st kind in the database is G95, whose genotype in the 1st test zone is TCG; it has no genotype in common with the hybrid strain genotype AAG. The test zones of the other nuclear genes of G95 are then checked in turn for genotypes identical to hybrid strain genotypes. The result is that the number of test zones in which G95 and the hybrid strain genotypes share a genotype is 0, so this number accounts for 0% (<60%) of the total number of test zones in which G95 has potential hybrid strain genotypes. Therefore, G95 is not a nucleus hybrid strain kind. By the same method, using the test zones of cytoplasmic genes, G95 is judged not to be a cytoplasm hybrid strain kind either. By the same method, the other kinds in the database are judged in turn. The results show that only BKN017:MZ is a nucleus hybrid strain kind, and all 61 hybrid strain genotypes are present in BKN017:MZ. In the present embodiment, there is no cytoplasm hybrid strain kind.
Principle of calculating the hybrid strain rate R
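The 60% rule used above to decide whether a database kind is a hybrid strain kind can be illustrated with a short Python sketch. It assumes that the potential hybrid strain genotypes of the kind and the observed hybrid strain genotypes have already been derived per test zone; the function names, data layout and toy values are hypothetical.

```python
from typing import Dict, Set


def matching_ratio(potential: Dict[str, Set[str]], observed: Dict[str, Set[str]]) -> float:
    """Share of the kind's test zones with potential hybrid strain genotypes in which
    at least one potential genotype equals an observed hybrid strain genotype."""
    zones = [z for z, genotypes in potential.items() if genotypes]
    if not zones:
        return 0.0
    hits = sum(1 for z in zones if potential[z] & observed.get(z, set()))
    return hits / len(zones)


def is_hybrid_strain_kind(potential: Dict[str, Set[str]],
                          observed: Dict[str, Set[str]],
                          threshold: float = 0.60) -> bool:
    """Apply the >= 60% criterion from the description."""
    return matching_ratio(potential, observed) >= threshold


# Hypothetical toy data: zone -> set of genotypes.
observed = {"z1": {"AA"}, "z2": {"TT"}, "z3": {"TT"}, "z5": {"GG"}}
kind_x = {"z1": {"AA"}, "z2": {"TT"}, "z3": {"AA", "TT"}, "z4": set(), "z5": {"CC"}}
kind_y = {"z1": {"CC"}, "z2": {"GG"}}

print(matching_ratio(kind_x, observed), is_hybrid_strain_kind(kind_x, observed))
print(matching_ratio(kind_y, observed), is_hybrid_strain_kind(kind_y, observed))
```

Here `kind_y` behaves like G95 in the embodiment: no test zone matches, the ratio is 0%, and the kind is rejected.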
Hybrid strain rate R = R1 + R2 - R3 - R4 + Rm, wherein the terms are defined as follows.

$$R1=\sum_{i1=1}^{n1}\left(2\times\frac{\sum_{j1=\mathrm{Int}(0.8\times t1)+1}^{t1-\mathrm{Int}(0.1\times t1)}R1_{i1j1}}{t1-\mathrm{Int}(0.1\times t1)-\mathrm{Int}(0.8\times t1)}\right)$$

wherein n1 is the number of nucleus hybrid strain kinds; t1 is the number of all special hybrid strain karyogene types of the i1-th nucleus hybrid strain kind; i1j1 denotes the j1-th special hybrid strain karyogene type after all special hybrid strain karyogene types of the i1-th nucleus hybrid strain kind are sorted by frequency from low to high; and $R1_{i1j1}$ is the frequency of the i1j1-th special hybrid strain karyogene type. R1 is the sum of the hybrid strain rates of the nucleus hybrid strain kinds calculated from hybrid strain karyogene types. The hybrid strain rate of a nucleus hybrid strain kind is 2 times the average of the frequencies of the special hybrid strain karyogene types that remain after those with the lowest 80% and the highest 10% of frequencies in that kind are removed.

$$R2=2\times\frac{\sum_{i2=\mathrm{Int}(0.8\times t2)+1}^{t2-\mathrm{Int}(0.1\times t2)}R2_{i2}}{t2-\mathrm{Int}(0.1\times t2)-\mathrm{Int}(0.8\times t2)}$$

wherein t2 is the number of hybrid strain karyogene types, other than those possessed by nucleus hybrid strain kinds, whose frequency is ≥0.17%; i2 denotes the i2-th hybrid strain karyogene type after all hybrid strain karyogene types other than those possessed by nucleus hybrid strain kinds are sorted by frequency from low to high; and $R2_{i2}$ is the frequency of the i2-th hybrid strain karyogene type. R2 is the hybrid strain rate calculated from the hybrid strain karyogene types other than those possessed by nucleus hybrid strain kinds: it is 2 times the average of the values that remain after the lowest 80% and the highest 10% of these frequencies are removed.

$$R3=\sum_{i3=1}^{n2}R3_{i3},\qquad R3_{i3}=\frac{\sum_{j3=\mathrm{Int}(0.8\times t3)+1}^{t3-\mathrm{Int}(0.1\times t3)}R3_{i3j3}}{t3-\mathrm{Int}(0.1\times t3)-\mathrm{Int}(0.8\times t3)}$$

wherein n2 is the number of cytoplasm hybrid strain kinds; $R3_{i3}$ is the hybrid strain rate of the i3-th cytoplasm hybrid strain kind; t3 is the number of all special hybrid strain matter genotypes of the i3-th cytoplasm hybrid strain kind; i3j3 denotes the j3-th special hybrid strain matter genotype after all special hybrid strain matter genotypes of the i3-th cytoplasm hybrid strain kind are sorted by frequency from low to high; and $R3_{i3j3}$ is the frequency of the i3j3-th special hybrid strain matter genotype. R3 is the sum of the hybrid strain rates of the cytoplasm hybrid strain kinds calculated from hybrid strain matter genotypes. The hybrid strain rate of a cytoplasm hybrid strain kind is the average of the frequencies of the special hybrid strain matter genotypes that remain after those with the lowest 80% and the highest 10% of frequencies in that kind are removed.

$$R4=\frac{\sum_{i4=\mathrm{Int}(0.8\times t4)+1}^{t4-\mathrm{Int}(0.1\times t4)}R4_{i4}}{t4-\mathrm{Int}(0.1\times t4)-\mathrm{Int}(0.8\times t4)}$$

wherein t4 is the number of hybrid strain matter genotypes, other than those possessed by cytoplasm hybrid strain kinds, whose frequency is ≥0.17%; i4 denotes the i4-th hybrid strain matter genotype after all hybrid strain matter genotypes other than those possessed by cytoplasm hybrid strain kinds are sorted by frequency from low to high; and $R4_{i4}$ is the frequency of the i4-th hybrid strain matter genotype. R4 is the hybrid strain rate calculated from the hybrid strain matter genotypes other than those possessed by cytoplasm hybrid strain kinds: it is the average of the values that remain after the lowest 80% and the highest 10% of these frequencies are removed.

$$Rm=\frac{1}{t5}\sum_{i5=1}^{t5}\left(Rm_{i5}-Rf_{i5}\right)$$

wherein t5 is the number of special test zones of the hybrid; i5 denotes the i5-th special test zone of the hybrid; $Rm_{i5}$ is the frequency of the female parent genotype in the i5-th special test zone of the hybrid; and $Rf_{i5}$ is the frequency of the male parent genotype in the i5-th special test zone of the hybrid. Rm is the hybrid strain rate of maternal selfing: it is the average, over the special test zones of the hybrid, of the difference between the frequency of the female parent genotype and the frequency of the male parent genotype.

Int() is the rounding function, which returns the integer part of the number in parentheses.
The hybrid strains in the corn variety to be measured come from maternal selfing, flyings pollination mixing and mechanical admixture during propagation, of which maternal selfing and flyings pollination mixing are the main sources. Maternal selfing means that, in hybrid seed production, the female parent, being a sterile line, should not produce seed by selfing, but because fertility is partially restored in the female parent, selfed seed is produced and forms hybrid strains. Flyings pollination mixing means that pollen of a hybrid strain kind is carried by wind or the like to the corn variety to be measured and pollinates it, forming hybrid seed. Flyings pollination cannot introduce cytoplasm, so it can only give rise to hybrid strain karyogene types, and its hybrid strain rate is 2 times the hybrid strain karyogene type frequency. Mechanical admixture means that seeds of a hybrid strain kind are mixed directly into the corn variety to be measured, introducing nucleus and cytoplasm together and therefore forming hybrid strain karyogene types and hybrid strain matter genotypes at the same time; its hybrid strain rate should be the frequency of the hybrid strain matter genotype. In the calculation formula of the hybrid strain rate R, R1+R2 overestimates the hybrid strain rate of mechanical admixture by a factor of 1 and needs to be corrected; after correction it is R1+R2-R3-R4. Distinguishing mechanical admixture from flyings pollination mixing is a technical difficulty, which the present invention solves.
In the calculation formula of the hybrid strain rate R, the hybrid strain rate of a nucleus hybrid strain kind is always 2 × the hybrid strain karyogene type frequency, for the following reason: in diploid or allopolyploid plants, a test zone of the nuclear genome is present in 2 copies, so the hybrid strain rate is 2 times the corresponding hybrid strain karyogene type frequency. If a test zone of the nuclear genome with N copies must be selected, the coefficient should be adjusted to N. If the copy number is uncertain, N = 2 is used; if this is wrong, such test zones are excluded when R is calculated, through the removal of the lowest 80% of values.
In the calculation formula of the hybrid strain rate R, only the 10% of hybrid strain genotype frequency values that lie between the lowest 80% and the highest 10% are used. The principle is as follows: the different hybrid strain genotypes of one hybrid strain kind are all determined by the hybrid strain rate of that kind, so their frequencies have the same expected value, and the differences between the frequencies are caused by errors in PCR amplification and high-throughput sequencing. The definition of hybrid strain genotype and the standard sample of the corn variety to be measured already eliminate most of these erroneous values, so removing the highest 10% is enough to discard the very small number of test zones that deviate from the true hybrid strain rate. The lowest 80% is removed while only the highest 10% is removed for the following reasons: (1) the largest source of error is sequencing error, and the hybrid strain genotype frequencies caused by sequencing error are very low; (2) among the frequencies of hybrid strain genotypes other than those possessed by hybrid strain kinds, high values are more likely to belong to hybrid strain genotypes common to different hybrid strains, and so represent the true hybrid strain rate.
In the calculation formulas of R2 and R4, the frequency of a hybrid strain genotype is required to be ≥0.17%. The principle is as follows: when the number of kinds in the database and the number of detection sites reach 10000, an average of 149 hybrid strain genotypes will be misjudged; when the hybrid strain genotype frequency is set to ≥0.17%, the probability that no hybrid strain genotype is misjudged is ≥99.98% (the derivation is shown in Table 2), so that the values of R2 and R4 can be calculated accurately. A database with 10000 kinds and 10000 detection sites is already the practical limit, so the threshold of ≥0.17% for the hybrid strain genotype frequency is suitable for all situations. The introduction of R2 and R4 allows the present invention to calculate the hybrid strain rate R even when the database contains 0 kinds, that is, without database support. In particular, if all hybrid strain genotypes of hybrid strain kind A are also possessed by hybrid strain kind B or other hybrid strain kinds, hybrid strain kind A has no special hybrid strain genotype. In that case, when the hybrid strain rate R is calculated, the hybrid strain rates of hybrid strain kind A and hybrid strain kind B are not calculated separately; instead, the hybrid strain rate of a combined hybrid strain kind AB is calculated. The hybrid strain genotypes of hybrid strain kind AB are the hybrid strain genotypes common to hybrid strain kind A and hybrid strain kind B.
The calculation formula of the hybrid strain rate R is a general formula. In practice, the corn variety to be measured is usually mixed with only one hybrid strain kind. Because hybrid seed production areas are large and the process is standardized, flyings pollination and mechanical admixture are both unlikely, and most hybrid strains are formed by maternal selfing.
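The trimming rule described above (drop the lowest 80% and the highest 10% of frequencies, then average what remains) can be sketched in Python as follows. The function name and its parameters are assumptions made for illustration; the ten frequencies are those listed for hybrid strain kind A in the hypothetical example of Table 3, expressed in percent.

```python
from typing import Sequence


def trimmed_mean(freqs: Sequence[float], low_pct: int = 80, high_pct: int = 10) -> float:
    """Average of the values left after dropping the lowest low_pct percent and the
    highest high_pct percent, with counts truncated to integers as Int() does."""
    values = sorted(freqs)
    t = len(values)
    drop_low = t * low_pct // 100    # Int(0.8 * t) for the default setting
    drop_high = t * high_pct // 100  # Int(0.1 * t) for the default setting
    kept = values[drop_low:t - drop_high]
    if not kept:
        raise ValueError("no values left after trimming")
    return sum(kept) / len(kept)


# Frequencies (in %) of the 10 special hybrid strain karyogene types of kind A.
kind_a = [0.10, 1.20, 0.10, 0.10, 0.02, 0.10, 0.10, 0.10, 0.10, 0.10]

# For nuclear genotypes the hybrid strain rate is 2 x the trimmed mean.
rate_a = 2 * trimmed_mean(kind_a)
print(f"{rate_a:.2f}%")
```

With 10 values, only the 9th value in ascending order survives the trimming, so a single outlier such as 1.20% or 0.02% does not affect the result.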
Hypothetical example of calculating the hybrid strain rate R
Table 3 gives a hypothetical calculation example to illustrate the calculation of the hybrid strain rate R more clearly.
Table 3. A hypothetical example of calculating the hybrid strain rate R
In Table 3, there are two nucleus hybrid strain kinds, A and B, so n1 = 2, and only one cytoplasm hybrid strain kind, C, so n2 = 1. By the definition of special hybrid strain karyogene type, the special hybrid strain karyogene types of hybrid strain kind A are the hybrid strain karyogene types numbered 1-10: AA, TT, TCC, GG, AC, TTC, TCCC, GGC, ACC and AG, so t1 = 10. Their frequencies are 0.10%, 1.20%, 0.10%, 0.10%, 0.02%, 0.10%, 0.10%, 0.10%, 0.10% and 0.10%, respectively. After these 10 frequencies are sorted from low to high, $R1_{1,1}$ = 0.02%, $R1_{1,2}$ = 0.02%, $R1_{1,3}$ to $R1_{1,9}$ are each 0.10%, and $R1_{1,10}$ = 1.20%. The values used run from j1 = Int(0.8 × t1) + 1 = Int(0.8 × 10) + 1 = 9 to j1 = t1 - Int(0.1 × t1) = 10 - Int(0.1 × 10) = 9, that is, only $R1_{1,9}$ = 0.10%, so the hybrid strain rate of nucleus hybrid strain kind A is $R1_1$ = 2 × 0.10% = 0.20%. In the same way, the hybrid strain rate of nucleus hybrid strain kind B is $R1_2$ = 0.40%. Thus R1 = $R1_1$ + $R1_2$ = 0.60%. In a similar manner, R2 = 0.02%, the hybrid strain rate of the cytoplasm hybrid strain kind gives R3 = 0.10%, and R4 = 0.04%. In the 1st special test zone of the hybrid, Rmi5 = 52.36% and Rfi5 = 46.34%, so the maternal selfing rate calculated from the 1st special test zone of the hybrid is 52.36% - 46.34% = 6.02%. By the same method, the maternal selfing rates calculated from the other special test zones of the hybrid are 3.94%, 6.06%, 6.22% and 7.54%, so the final maternal selfing rate in this hypothetical example is Rm = (6.02% + 3.94% + 6.06% + 6.22% + 7.54%)/5 = 5.96%. Therefore, in this hypothetical example, the hybrid strain rate R = R1 + R2 - R3 - R4 + Rm = 0.60% + 0.02% - 0.10% - 0.04% + 5.96% = 6.44%.
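The arithmetic of this hypothetical example can be checked with a few lines of Python. The function name is an assumption for illustration; all numbers are the percentages stated in the example.

```python
from typing import Sequence


def hybrid_strain_rate(r1: float, r2: float, r3: float, r4: float,
                       selfing_diffs: Sequence[float]) -> float:
    """R = R1 + R2 - R3 - R4 + Rm, where Rm is the mean per-zone difference between
    the female parent and male parent genotype frequencies."""
    rm = sum(selfing_diffs) / len(selfing_diffs)
    return r1 + r2 - r3 - r4 + rm


# Per-zone maternal selfing rates (in %); the first equals 52.36% - 46.34%.
diffs = [6.02, 3.94, 6.06, 6.22, 7.54]
rm = sum(diffs) / len(diffs)
r = hybrid_strain_rate(0.60, 0.02, 0.10, 0.04, diffs)
print(f"Rm = {rm:.2f}%, R = {r:.2f}%")
```

The subtraction of R3 and R4 is what removes the double counting of mechanical admixture that R1 + R2 alone would introduce.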
With reference to the above hypothetical example, the hybrid strain rate R of the present embodiment is calculated as follows. In the present embodiment, there is a nucleus hybrid strain kind, BKN017:MZ, and no cytoplasm hybrid strain kind; apart from the hybrid strain genotypes possessed by the hybrid strain kind, there is no hybrid strain genotype with a frequency greater than 0.17%. Therefore, R2, R3 and R4 are all 0, and R = R1 + Rm. Because there is only one hybrid strain kind, all 61 hybrid strain genotypes are special hybrid strain genotypes; their frequencies are 1.03%, 1.02%, ..., respectively. After the highest 10% and the lowest 80% are removed, the average of the remaining values is calculated and multiplied by 2, giving R1 = 2.07%. In the 1st special test zone of the hybrid, Rmi5 = 48.88% and Rfi5 = 48.84%, so the maternal selfing rate calculated from the 1st test zone is 48.88% - 48.84% = 0.04%. By the same method, the maternal selfing rates in all 44 special test zones of the hybrid are calculated as 0.04%, 0.05%, 0.03%, ...; by the definition of Rm, their average is taken, giving Rm = 0.04% in the present embodiment. Therefore, in the present embodiment, R = R1 + Rm = 2.07% + 0.04% = 2.11%.
Eleventh, the specificity, uniformity and stability of the corn variety to be measured are judged using the variant sites, the variant sites rate and the hybrid strain rate. The method is as follows:
Here, SD is the threshold selected for judging specificity, and M is the threshold selected for judging uniformity and stability. The specificity, uniformity and stability of the corn variety to be measured are judged as follows: when the variant sites rate ≥ SD, or variant sites exist in a non-universal test zone, the corn variety to be measured has specificity; when the variant sites rate < SD and no variant site exists in a non-universal test zone, the corn variety to be measured does not have specificity. When the hybrid strain rate of the corn variety to be measured is ≤ M, the corn variety to be measured has uniformity and stability; when the hybrid strain rate is > M, it does not have uniformity and stability. The values of M and SD are determined manually according to factors such as breeding level, the required stringency and marker characteristics. In the present embodiment, SD is set to 1%.
In the present embodiment, the variant sites rate is 1.91% > SD = 1%, so the corn variety to be measured is judged to have specificity; the hybrid strain rate of the corn variety to be measured is 2.11% < M = 3%, so it is judged to have uniformity and stability.
Further, after the specificity, uniformity and stability of the corn variety to be measured have been judged, the accuracy of the judgement is estimated. The method is as follows:
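The two decision rules can be written as a minimal Python sketch. The function and parameter names are hypothetical; the thresholds and rates are those of the present embodiment.

```python
def has_specificity(variant_sites_rate: float, sd: float,
                    variant_in_non_universal_zone: bool = False) -> bool:
    """Specificity holds if the variant sites rate reaches SD, or if a variant
    site exists in a non-universal test zone."""
    return variant_sites_rate >= sd or variant_in_non_universal_zone


def has_uniformity_and_stability(hybrid_strain_rate: float, m: float) -> bool:
    """Uniformity and stability hold if the hybrid strain rate does not exceed M."""
    return hybrid_strain_rate <= m


# Values of the present embodiment: variant sites rate 1.91%, SD 1%, R 2.11%, M 3%.
print(has_specificity(0.0191, sd=0.01))
print(has_uniformity_and_stability(0.0211, m=0.03))
```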
Calculation of specificity accuracy: when no variant site exists in a non-universal test zone, if the corn variety to be measured is judged to have specificity, the probability that the conclusion is correct is ≥ BINOM.DIST((1-SD)*TRN, TRN, 1-OD, TRUE); if the corn variety to be measured is judged not to have specificity, the probability that the conclusion is correct is ≥ BINOM.DIST(SD*TRN, TRN, OD, TRUE). Here, TRN is the number of successfully detected test zones, OD is the variant sites rate, and BINOM.DIST is the function in Excel 2010, used as defined in Excel 2010; it returns a binomial distribution probability. What these probabilities actually calculate is: when the variety is judged to have specificity, the probability that the variant sites rate is greater than SD; when the variety is judged not to have specificity, the probability that the variant sites rate is less than SD. The successfully detected test zones are obtained by analysing the sequencing fragment group.
In the present embodiment, the specificity of the corn variety to be measured is judged using the variant sites rate, and the variety is judged to have specificity. Therefore, the probability that the conclusion is correct is ≥ BINOM.DIST((1-SD)*TRN, TRN, 1-OD, TRUE) = BINOM.DIST((1-1%)*2465, 2465, 1-1.91%, TRUE) = 99.99%. The probability that the specificity conclusion of the present embodiment is correct is thus very high.
Calculation of uniformity and stability accuracy
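For readers without Excel, the cumulative form of BINOM.DIST can be approximated with the Python standard library. The sketch below is an illustrative stand-in, not the Excel implementation: `binom_cdf` and `specificity_accuracy` are hypothetical helper names, and the first argument is truncated to an integer, as Excel does for the number of successes.

```python
import math


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p) with 0 < p < 1, summed in log space."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    log_n_fact = math.lgamma(n + 1)
    total = 0.0
    for i in range(k + 1):
        log_pmf = (log_n_fact - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * log_p + (n - i) * log_q)
        total += math.exp(log_pmf)
    return min(total, 1.0)


def specificity_accuracy(sd: float, trn: int, od: float, has_specificity: bool) -> float:
    """Lower bound on the probability that the specificity conclusion is correct."""
    if has_specificity:
        return binom_cdf(math.floor((1 - sd) * trn), trn, 1 - od)
    return binom_cdf(math.floor(sd * trn), trn, od)


# Values of the present embodiment: SD = 1%, TRN = 2465, OD = 1.91%.
accuracy = specificity_accuracy(sd=0.01, trn=2465, od=0.0191, has_specificity=True)
print(f"{accuracy:.4%}")
```

The log-space summation is used only to avoid overflow in the binomial coefficients; any statistics library offering a binomial cumulative distribution function could be substituted.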
The probability that the conclusion on the uniformity and stability of the corn variety to be measured is correct is as follows. When the corn variety to be measured is judged to have uniformity and stability, the probability that the conclusion is correct is ≥ BINOM.DIST(M*SN, SN, R, TRUE) * BINOM.DIST(∑SeN*M, ∑SeN, R, TRUE); when it is judged not to have uniformity and stability, the probability that the conclusion is correct is ≥ BINOM.DIST((1-M)*SN, SN, 1-R, TRUE) * BINOM.DIST(∑SeN*(1-M), ∑SeN, 1-R, TRUE). Here, M is the threshold selected for judging uniformity and stability, SN is the amount of sampling, and ∑SeN is the total number of sequencing fragments of the test zones whose genotype frequencies are used to calculate the hybrid strain rate R, that is, the test zones that remain after the lowest 80% and the highest 10% of values are removed. BINOM.DIST(M*SN, SN, R, TRUE) is the probability that, when SN samplings are made from the corn variety to be measured, the hybrid strain rate R actually drawn is less than the threshold M; BINOM.DIST(∑SeN*M, ∑SeN, R, TRUE) is the probability that, when ∑SeN samplings are made, the hybrid strain rate R actually drawn is less than the threshold M. Likewise, BINOM.DIST((1-M)*SN, SN, 1-R, TRUE) and BINOM.DIST(∑SeN*(1-M), ∑SeN, 1-R, TRUE) are the probabilities that the hybrid strain rate R actually drawn is greater than the threshold M when SN and ∑SeN samplings are made, respectively.
The accuracy of judging uniformity and stability depends entirely on the accuracy of the hybrid strain rate, which in turn depends on the accuracy of three steps: first, sampling the corn variety to be measured; second, detecting the hybrid strain kinds in the drawn sample; third, calculating the hybrid strain rate from the detected hybrid strain kinds. The accuracy of judging the uniformity and stability of the corn variety to be measured is therefore the product of the accuracies of these three steps. Because, even under the most stringent conditions, the present invention keeps the accuracy of detecting hybrid strain kinds above 99.9%, and in practice mostly close to 100%, the accuracy of judging uniformity and stability can be estimated as the product of the accuracies of the first and third steps, which are the values of the first and second functions in the above formulas, respectively. The first function treats the SN sampled individuals as SN samplings; each sequencing fragment used to calculate the hybrid strain rate is in effect also a single sampling of the corn variety to be measured, which is why the second function uses ∑SeN samplings.
In the present embodiment, the sites used for the hybrid strain rate R are the 44 special test zones of the hybrid and the test zones of the 61 special hybrid strain genotypes, with a total sequencing amount of 3513478; that is, the 5000 samples drawn were in effect sampled a further 3513478 times, and with so large an amount of sampling the error is quite small. In the present embodiment, the corn variety to be measured is judged to have uniformity and stability, so the probability that this conclusion is correct is ≥ BINOM.DIST(M*SN, SN, R, TRUE) * BINOM.DIST(∑SeN*M, ∑SeN, R, TRUE) = BINOM.DIST(3%*5000, 5000, 2.11%, TRUE) * BINOM.DIST(3513478*3%, 3513478, 2.11%, TRUE) = 100.00%. The judgement of the present embodiment on the uniformity and stability of the corn variety to be measured is thus very accurate.
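The product of the two binomial probabilities for the present embodiment can be reproduced approximately in Python. As before, `binom_cdf` is an illustrative stand-in for the cumulative form of BINOM.DIST, and `uniformity_accuracy` is a hypothetical helper name; the small constant added before truncation only guards against floating-point rounding of products such as 3% × 5000.

```python
import math


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p) with 0 < p < 1, summed in log space."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    log_n_fact = math.lgamma(n + 1)
    total = 0.0
    for i in range(k + 1):
        total += math.exp(log_n_fact - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                          + i * log_p + (n - i) * log_q)
    return min(total, 1.0)


def uniformity_accuracy(m: float, sn: int, sum_sen: int, r: float) -> float:
    """Accuracy estimate when the variety is judged to HAVE uniformity and stability:
    product of the sampling term (SN) and the sequencing-fragment term (sum of SeN)."""
    sampling = binom_cdf(math.floor(m * sn + 1e-9), sn, r)
    fragments = binom_cdf(math.floor(m * sum_sen + 1e-9), sum_sen, r)
    return sampling * fragments


# Values of the present embodiment: M = 3%, SN = 5000, sum of SeN = 3513478, R = 2.11%.
acc = uniformity_accuracy(m=0.03, sn=5000, sum_sen=3513478, r=0.0211)
print(f"{acc:.2%}")
```

In this sketch the sampling term (SN = 5000) is the limiting factor; the fragment term is indistinguishable from 1 because ∑SeN is so large.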
Result verification
The corn variety to be measured and its approximate kind were planted and observed according to the method in 《New variety of plant specificity, uniformity and stability test guide-corn》, and significant differences between the corn variety to be measured and the approximate kind were found in several test characters. 《New variety of plant specificity, uniformity and stability test guide-corn》 stipulates that when a variety shows an obvious and reproducible difference from the approximate kind in at least one character, the corn variety under application can be judged to have specificity. Therefore, the corn variety to be measured is judged to have specificity. In the experiment, 40 plants each of the variety to be measured and the approximate kind were planted (20 plants per plot, 2 replicates in total), and 1 off-type plant was found. 《New variety of plant specificity, uniformity and stability test guide-corn》 stipulates that when the sample size is 40 plants, at most 3 off-type plants are allowed, so the variety to be measured is judged to have uniformity. 《New variety of plant specificity, uniformity and stability test guide-corn》 also stipulates that if a variety has uniformity, it can be considered to have stability. The variety to be measured is therefore judged to have stability as well. The above experiment shows that the judgement of the specificity, stability and uniformity of the variety to be measured in the present embodiment is correct.
Through high-throughput sequencing and multi-site amplification, the embodiment of the present invention achieves large-sample sampling of the corn variety to be measured and large-sample sampling of test zones among individuals within the variety. Combined with means such as the definition of hybrid strain genotype, the definition of cytoplasm hybrid strain kind and the hybrid strain rate calculation formula, it achieves the goal of judging the specificity, stability and uniformity of the corn variety to be measured accurately, quickly and completely, a technical effect that no existing DUS test method achieves. Existing molecular DUS detection techniques such as chips detect only fixed test zones and cannot flexibly select non-universal test zones case by case. The present invention detects PCR products, so primers can be designed flexibly case by case and non-universal test zones can be detected easily. Taking embodiment one of the present invention as an example, a sampling amount of 5000 individuals is too much work for traditional DUS test techniques to complete. For example, in a field DUS test, sampling 5000 corn plants requires planting more than 2 mu for 2 years, and more than 70 characters must be surveyed on every plant every year. In the widely used SSR molecular DUS test, 5000 DNA extractions, 5000*2465 PCRs and 5000*2465 PCR product detections would be needed (assuming that, as in the present embodiment, 2465 universal test regions are detected). Because the workload is excessive, existing molecular DUS tests do not test stability and uniformity. Field DUS tests do test uniformity and stability, but their sample size is always below 1000 plants, whereas the present embodiment sampled 5000 corn plants, so its accuracy is clearly higher. The present embodiment can increase the amount of sampling because all 5000 samples are mixed and processed as one sample; compared with the field DUS test, the workload is reduced to 1/5000. Further, all 2465 universal test regions require only one mixed amplification and one high-throughput sequencing run; compared with the SSR molecular DUS test, the workload is reduced to 1/(5000*2465). The present invention thus achieves large-sample, multi-site detection with a greatly reduced workload, making the DUS test both accurate and simple. In addition, in embodiment one of the present invention, the genotypes of the database kinds are base compositions, which are highly standardized: the same kind tested by the method of the present invention under different experimental conditions yields the same genotype, so the DUS test need not be repeated under different conditions. The embodiment of the present invention can therefore compare directly with the genotypes of the database kinds and objectively select the approximate kind of the corn variety to be measured. Existing DUS test techniques are not standardized: the corn variety to be measured and the approximate kind must be tested side by side at the same time to obtain a reliable conclusion, and, to reduce the workload, the approximate kind has to be supplied by the variety right applicant; if the approximate kind is wrong, an erroneous grant with legal consequences may result.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit the invention. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall be included in the scope of protection of the present invention.
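The sampling amount of 5000 used in this embodiment can be checked against the condition stated in claim 1, BINOM.INV(SN, M, 0.95)/SN ≤ 1.15*M. The Python sketch below is an illustration under assumptions: `binom_inv` is a hypothetical stand-in for Excel's BINOM.INV (the smallest count whose cumulative binomial probability reaches the criterion), and `sampling_amount_ok` is a helper name introduced for the example.

```python
import math


def binom_inv(n: int, p: float, alpha: float) -> int:
    """Smallest k such that P(X <= k) >= alpha for X ~ Binomial(n, p), 0 < p < 1."""
    log_p, log_q = math.log(p), math.log1p(-p)
    log_n_fact = math.lgamma(n + 1)
    cumulative = 0.0
    for k in range(n + 1):
        cumulative += math.exp(log_n_fact - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                               + k * log_p + (n - k) * log_q)
        if cumulative >= alpha:
            return k
    return n


def sampling_amount_ok(sn: int, m: float, alpha: float = 0.95, margin: float = 1.15) -> bool:
    """Check the condition BINOM.INV(SN, M, alpha) / SN <= margin * M."""
    return binom_inv(sn, m, alpha) / sn <= margin * m


# M = 3% as in the present embodiment.
print(sampling_amount_ok(5000, 0.03))  # sampling amount used in the embodiment
print(sampling_amount_ok(100, 0.03))   # a much smaller, hypothetical sampling amount
```

A larger SN narrows the spread of the sampled hybrid strain rate around M, which is why a small sampling amount fails the condition while 5000 satisfies it.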

Claims (7)

1. A method for determining the specificity, uniformity and stability of a hybrid maize variety, characterized in that the method comprises:
Obtain the variant sites between different corn varieties;
Determine the test zones of the corn variety to be measured from the variant sites, the test zones including universal test regions, at least some of the variant sites being contained in the universal test regions; the discrimination of each variation window zone is calculated from a, bi and k, wherein a is the total number of kinds detected in the variation window zone, bi is the number of kinds having the i-th genotype in the variation window zone, with bi>1, and k is the number of genotypes that contain more than one kind; the variation window zone is the detection window centred on each single nucleotide variation site and extending 1/2 of the sequencing length to each side of the single nucleotide variation site; the universal test regions are the 8000 variation windows with the greatest discrimination on the nuclear genome and the 100 variation windows with the greatest discrimination in the cytoplasmic genome;
Build a database containing the genotypes of the different kinds in all test zones;
After determining the amount of sampling SN of the corn variety to be measured, randomly sample, mix, and extract the DNA of the mixed sample, the amount of sampling SN satisfying the following condition: BINOM.INV(SN, M, 0.95)/SN ≤ 1.15*M, wherein BINOM.INV is the function in Excel 2010 and M is the threshold selected for judging the uniformity and stability; the meaning of the condition satisfied by the amount of sampling SN is: even if the hybrid strain rate exceeds the threshold M by only 15%, the amount of sampling allows the stability and uniformity of the corn variety to be measured to be judged correctly with 95% probability;
Prepare primers for amplifying the test zones, the primers including universal test region primers;
Amplify the DNA of the mixed sample using the primers to obtain amplification products of the test zones, the amplification products being used to build a high-throughput sequencing library;
Carry out high-throughput sequencing on the high-throughput sequencing library to obtain a sequencing fragment group, the depth CF of the high-throughput sequencing satisfying the following conditions: BINOM.DIST(10, 10, BINOM.DIST(8, 20, BINOM.DIST(0, CF, 0.1%, TRUE), TRUE), FALSE) ≥ 99.9%, 1-BINOM.DIST(10000, 10000, 1-BINOM.DIST(8, 20, 1-BINOM.DIST(99.99%*CF, CF, 99.9989%, TRUE), TRUE), FALSE) ≤ 0.1%, and BINOM.DIST(10*(1-M)*CF, 10*CF, 1-110%*M, TRUE) ≥ 95.0%, wherein CF is the depth of the high-throughput sequencing, M is the threshold selected for judging the uniformity and stability, and BINOM.DIST is the function in Excel 2010; the meaning of the conditions satisfied by the depth CF is: under the condition that the hybrid strain rate is as low as 0.1%, there are 10 hybrid strain kinds, and there are on average only 20 difference sites between a hybrid strain kind and the corn variety to be measured, the probability, determined by the depth CF, of detecting all the hybrid strain kinds is ≥99.9%; under the condition that the database contains 10000 kinds and there are on average only 20 difference sites between a hybrid strain kind and the corn variety to be measured, the probability, determined by the depth CF, of misjudging a hybrid strain kind is ≤0.1%; and when there are 10 hybrid strain kinds and the true hybrid strain rate exceeds the threshold selected when judging specificity by only 10%, the probability, determined by the depth CF, that the conclusion on stability and uniformity is correct is ≥95.0%;
The sequencing fragment group is analyzed, obtains Maize Genotypes to be measured and hybrid strain genotype;
By the Maize Genotypes to be measured compared with the genotype of the different cultivars in the database, described in acquisition Approximate kind, variant sites and the variant sites rate of corn variety to be measured;
By the hybrid strain genotype compared with the genotype of the different cultivars in the database, after obtaining hybrid strain kind, Calculate hybrid strain rate;
Using the variant sites, the variant sites rate and the hybrid strain rate, the corn variety specificity to be measured, one are judged Cause property and stability.
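The three depth conditions in the sequencing step can be checked numerically. The following is a minimal Python sketch and not part of the claimed method: `binom_dist` is a hypothetical re-implementation of Excel's BINOM.DIST (with number_s and trials truncated to integers, as Excel does), and the names `condition_1`, `condition_2`, `condition_3` and `depth_ok` are illustrative assumptions.

```python
import math


def binom_dist(number_s, trials, probability_s, cumulative):
    """Hypothetical stand-in for Excel's BINOM.DIST, evaluated in log space."""
    n = int(math.floor(trials))                 # Excel truncates trials
    k = min(int(math.floor(number_s)), n)       # Excel truncates number_s
    p = probability_s
    if k < 0:
        return 0.0

    def pmf(i):
        if p <= 0.0:
            return 1.0 if i == 0 else 0.0
        if p >= 1.0:
            return 1.0 if i == n else 0.0
        log_c = math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
        return math.exp(log_c + i * math.log(p) + (n - i) * math.log1p(-p))

    if not cumulative:
        return pmf(k)
    return min(1.0, sum(pmf(i) for i in range(k + 1)))


def condition_1(cf):
    """Probability of detecting all 10 hybrid strain varieties (must be >= 99.9%)."""
    missed_site = binom_dist(0, cf, 0.001, True)
    one_variety = binom_dist(8, 20, missed_site, True)
    return binom_dist(10, 10, one_variety, False)


def condition_2(cf):
    """Probability of misjudging a hybrid strain variety (must be <= 0.1%)."""
    inner = binom_dist(0.9999 * cf, cf, 0.999989, True)
    one_variety = binom_dist(8, 20, 1 - inner, True)
    return 1 - binom_dist(10000, 10000, 1 - one_variety, False)


def condition_3(cf, m):
    """Probability of a correct uniformity/stability conclusion (must be >= 95.0%)."""
    return binom_dist(10 * (1 - m) * cf, 10 * cf, 1 - 1.1 * m, True)


def depth_ok(cf, m):
    """True when depth cf satisfies all three conditions for threshold m."""
    return (condition_1(cf) >= 0.999
            and condition_2(cf) <= 0.001
            and condition_3(cf, m) >= 0.95)
```

Under these assumptions, a call such as `depth_ok(5000, 0.05)` evaluates a candidate depth of 5000 against a threshold M of 5%; the depth and threshold values here are illustrative only and are not taken from the patent.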
2. The method according to claim 1, characterized in that the test zones further include non-universal test zones, and the primers further include non-universal test zone primers.
3. The method according to claim 2, characterized in that the non-universal test zone primers include a first primer and a second primer, the first primer includes a first forward primer and a first reverse primer, and the second primer includes a second forward primer and a second reverse primer; the first primer and the second primer are each used in a separate amplification to obtain two amplified products of the non-universal test zone, and the two amplified products of the non-universal test zone are mixed in equal amounts and used to build the separately amplified high-throughput sequencing library;
The 5' end of the first forward primer is linked to sequence 1 as shown in SEQ ID NO: 1 of the sequence listing, and the 5' end of the first reverse primer is linked to sequence 2 as shown in SEQ ID NO: 2 of the sequence listing;
The 5' end of the second forward primer is linked to sequence 2 as shown in SEQ ID NO: 2 of the sequence listing, and the 5' end of the second reverse primer is linked to sequence 1 as shown in SEQ ID NO: 1 of the sequence listing.
4. The method according to claim 2, characterized in that judging the specificity, uniformity and stability of the corn variety to be measured using the variant sites, the variant site rate and the hybrid strain rate comprises:
When the variant site rate ≥ SD, or a variant site is present in the non-universal test zone, the corn variety to be measured has specificity; when the variant site rate < SD and no variant site is present in the non-universal test zone, the corn variety to be measured does not have specificity, wherein SD is the threshold selected when judging specificity;
When the hybrid strain rate of the corn variety to be measured ≤ M, the corn variety to be measured has uniformity and stability; when the hybrid strain rate of the corn variety to be measured > M, the corn variety to be measured does not have uniformity and stability, wherein M is the threshold selected when judging the uniformity and stability;
The hybrid strain rate R = R1 + R2 - R3 - R4 + Rm, wherein the terms are defined as follows:
For R1, n1 is the number of nuclear hybrid strain varieties; t1 is the number of all special hybrid strain nuclear genotypes of the i1-th nuclear hybrid strain variety; i1j1 denotes the j1-th special hybrid strain nuclear genotype after all special hybrid strain nuclear genotypes of the i1-th nuclear hybrid strain variety are sorted by frequency from low to high; and R1i1j1 is the frequency of the i1j1-th special hybrid strain nuclear genotype. R1 is the sum of the hybrid strain rates of the nuclear hybrid strain varieties calculated from the hybrid strain nuclear genotypes, where the hybrid strain rate of a nuclear hybrid strain variety is 2 times the average frequency of the special hybrid strain nuclear genotypes that remain after the frequencies of the lowest 80% and the highest 10% of the special hybrid strain nuclear genotypes of that variety are removed;
For R2, t2 is the number of hybrid strain nuclear genotypes that are not possessed by any nuclear hybrid strain variety and have a frequency ≥ 0.17%; i2 denotes the i2-th hybrid strain nuclear genotype after all hybrid strain nuclear genotypes not possessed by any nuclear hybrid strain variety are sorted by frequency from low to high; and R2i2 is the frequency of the i2-th such hybrid strain nuclear genotype. R2 is the hybrid strain rate calculated from the hybrid strain nuclear genotypes not possessed by any nuclear hybrid strain variety: after the lowest 80% and the highest 10% of the frequencies of these genotypes are removed, R2 is 2 times the average of the remaining values;
For R3, n2 is the number of cytoplasmic hybrid strain varieties; R3i3 is the hybrid strain rate of the i3-th cytoplasmic hybrid strain variety; t3 is the number of all special hybrid strain cytoplasmic genotypes of the i3-th cytoplasmic hybrid strain variety; i3j3 denotes the j3-th special hybrid strain cytoplasmic genotype after all special hybrid strain cytoplasmic genotypes of the i3-th cytoplasmic hybrid strain variety are sorted by frequency from low to high; and R3i3j3 is the frequency of the i3j3-th special hybrid strain cytoplasmic genotype. R3 is the sum of the hybrid strain rates of the cytoplasmic hybrid strain varieties calculated from the hybrid strain cytoplasmic genotypes, where the hybrid strain rate of a cytoplasmic hybrid strain variety is the average frequency of the special hybrid strain cytoplasmic genotypes that remain after the frequencies of the lowest 80% and the highest 10% of the special hybrid strain cytoplasmic genotypes of that variety are removed;
For R4, t4 is the number of hybrid strain cytoplasmic genotypes that are not possessed by any cytoplasmic hybrid strain variety and have a frequency ≥ 0.17%; i4 denotes the i4-th hybrid strain cytoplasmic genotype after all hybrid strain cytoplasmic genotypes not possessed by any cytoplasmic hybrid strain variety are sorted by frequency from low to high; and R4i4 is the frequency of the i4-th such hybrid strain cytoplasmic genotype. R4 is the hybrid strain rate calculated from the hybrid strain cytoplasmic genotypes not possessed by any cytoplasmic hybrid strain variety: after the lowest 80% and the highest 10% of the frequencies of these genotypes are removed, R4 is the average of the remaining values;
For Rm, t5 is the number of hybrid-specific test zones; i5 denotes the i5-th hybrid-specific test zone; Rmi5 is the frequency of the maternal genotype in the i5-th hybrid-specific test zone; and Rfi5 is the frequency of the paternal genotype in the i5-th hybrid-specific test zone. Rm is the hybrid strain rate caused by maternal selfing, and is the average, over the hybrid-specific test zones, of the difference between the frequency of the maternal genotype and the frequency of the paternal genotype;
Int() is the integer (rounding) function;
The nuclear hybrid strain variety refers to a hybrid strain variety obtained by calculation using only nuclear genotypes, and the cytoplasmic hybrid strain variety refers to a hybrid strain variety obtained by calculation using only cytoplasmic genotypes; the special hybrid strain nuclear genotypes refer to all hybrid strain nuclear genotypes possessed by only one nuclear hybrid strain variety; the special hybrid strain cytoplasmic genotypes refer to all hybrid strain cytoplasmic genotypes possessed by only one cytoplasmic hybrid strain variety; the hybrid strain nuclear genotype refers to a hybrid strain genotype that is a nuclear genotype; the hybrid strain cytoplasmic genotype refers to a hybrid strain genotype that is a cytoplasmic genotype; in a hybrid-specific test zone, the maternal genotype differs from the paternal genotype, the maternal genotype differs from the genotypes of all nuclear hybrid strain varieties, and the paternal genotype also differs from the genotypes of all nuclear hybrid strain varieties; the maternal genotype is the genotype in the corn variety to be measured that is identical to the genotype of the female parent; the paternal genotype is the genotype in the corn variety to be measured that is identical to the genotype of the male parent;
The nuclear genotype (rendered "karyogene type" in the machine translation) refers to a genotype located on the nuclear genome; the cytoplasmic genotype (rendered "matter genotype") refers to a genotype located on the cytoplasmic genome.
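The calculation of the hybrid strain rate R described in claim 4 can be sketched in Python as follows. This is an illustration under stated assumptions, not the patent's own formulas, which are not reproduced in the text: the numbers of values to drop are assumed to be Int(0.8*t) from the bottom and Int(0.1*t) from the top, and the function names and input layout (`trimmed_mean`, `hybrid_strain_rate`, lists of frequencies per variety) are hypothetical.

```python
def trimmed_mean(frequencies):
    """Mean of what remains after dropping the lowest 80% and highest 10%.

    Assumption: the counts dropped are Int(0.8 * t) and Int(0.1 * t).
    """
    ordered = sorted(frequencies)
    t = len(ordered)
    if t == 0:
        return 0.0
    low = (8 * t) // 10        # Int(0.8 * t), computed with exact integers
    high = t - t // 10         # drop Int(0.1 * t) values from the top
    kept = ordered[low:high]   # never empty for t >= 1
    return sum(kept) / len(kept)


def hybrid_strain_rate(nuclear_varieties, residual_nuclear,
                       cyto_varieties, residual_cyto, parent_pairs,
                       min_freq=0.0017):
    """R = R1 + R2 - R3 - R4 + Rm.

    nuclear_varieties / cyto_varieties: one list of special genotype
    frequencies per hybrid strain variety.
    residual_*: frequencies of genotypes not assigned to any variety.
    parent_pairs: (maternal frequency, paternal frequency) per
    hybrid-specific test zone.
    """
    r1 = sum(2 * trimmed_mean(f) for f in nuclear_varieties)
    r2 = 2 * trimmed_mean([f for f in residual_nuclear if f >= min_freq])
    r3 = sum(trimmed_mean(f) for f in cyto_varieties)
    r4 = trimmed_mean([f for f in residual_cyto if f >= min_freq])
    if parent_pairs:
        rm = sum(mat - pat for mat, pat in parent_pairs) / len(parent_pairs)
    else:
        rm = 0.0
    return r1 + r2 - r3 - r4 + rm


def is_uniform_and_stable(r, m):
    """The variety has uniformity and stability when R <= M."""
    return r <= m


# Illustrative, made-up frequencies.
r = hybrid_strain_rate(
    nuclear_varieties=[[0.001 * i for i in range(1, 11)]],
    residual_nuclear=[0.002] * 5,
    cyto_varieties=[[0.001] * 5],
    residual_cyto=[],
    parent_pairs=[(0.51, 0.49), (0.52, 0.48)],
)
```

The nuclear terms are doubled and the cytoplasmic terms are not, following the wording of the claim; the input frequencies in the example are invented for demonstration.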
5. The method according to claim 4, characterized in that the method further comprises determining, in the following manner, the probability that the conclusion on the uniformity and stability of the corn variety to be measured is correct: when the corn variety to be measured has uniformity and stability, the probability that the conclusion is correct ≥ BINOM.DIST(M*SN, SN, R, TRUE) * BINOM.DIST(∑SeN*M, ∑SeN, R, TRUE); when the corn variety to be measured does not have uniformity and stability, the probability that the conclusion is correct ≥ BINOM.DIST((1-M)*SN, SN, (1-R), TRUE) * BINOM.DIST(∑SeN*(1-M), ∑SeN, 1-R, TRUE); wherein M is the threshold selected when judging the uniformity and stability, and ∑SeN is the total number of sequencing fragments of the test zones containing all the genotypes whose frequencies are used to calculate the hybrid strain rate R; BINOM.DIST(M*SN, SN, R, TRUE) is the probability that, when SN samples are drawn from the corn variety to be measured, the hybrid strain rate R actually sampled is less than the threshold M; BINOM.DIST(∑SeN*M, ∑SeN, R, TRUE) is the probability that, when ∑SeN samples are drawn from the corn variety to be measured, the hybrid strain rate R actually sampled is less than the threshold M; BINOM.DIST((1-M)*SN, SN, (1-R), TRUE) is the probability that, when SN samples are drawn from the corn variety to be measured, the hybrid strain rate R actually sampled is greater than the threshold M; BINOM.DIST(∑SeN*(1-M), ∑SeN, 1-R, TRUE) is the probability that, when ∑SeN samples are drawn from the corn variety to be measured, the hybrid strain rate R actually sampled is greater than the threshold M; the frequency of a genotype refers to the ratio, within the sequencing fragment group, of the number of sequencing fragments representing that genotype to the total number of sequencing fragments of the test zone in which the genotype is located.
6. The method according to claim 5, characterized in that, when no variant site is present in the non-universal test zone: if the corn variety to be measured is judged to have specificity, the probability that the conclusion is correct ≥ BINOM.DIST((1-SD)*TRN, TRN, 1-OD, TRUE); if the corn variety to be measured is judged not to have specificity, the probability that the conclusion is correct ≥ BINOM.DIST(SD*TRN, TRN, OD, TRUE); wherein TRN is the number of successfully detected test zones, OD is the variant site rate, and BINOM.DIST is a function in Excel 2010; the probability that the conclusion is correct is, when the corn variety to be measured is judged to have specificity, the probability that the variant site rate is greater than SD, and, when the corn variety to be measured is judged not to have specificity, the probability that the variant site rate is less than SD; the successfully detected test zones are obtained by analyzing the sequencing fragment group.
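The lower bounds on the probability of a correct conclusion in claims 5 and 6 are direct BINOM.DIST expressions, so they can be transcribed into Python. The sketch below is illustrative only: `binom_dist` is a hypothetical stand-in for Excel's BINOM.DIST, and the function and parameter names (`uniformity_confidence`, `specificity_confidence`, `sum_sen`) are assumptions introduced here.

```python
import math


def binom_dist(number_s, trials, probability_s, cumulative):
    """Hypothetical stand-in for Excel's BINOM.DIST (arguments truncated)."""
    n = int(math.floor(trials))
    k = min(int(math.floor(number_s)), n)
    p = probability_s
    if k < 0:
        return 0.0

    def pmf(i):
        if p <= 0.0:
            return 1.0 if i == 0 else 0.0
        if p >= 1.0:
            return 1.0 if i == n else 0.0
        log_c = math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
        return math.exp(log_c + i * math.log(p) + (n - i) * math.log1p(-p))

    if not cumulative:
        return pmf(k)
    return min(1.0, sum(pmf(i) for i in range(k + 1)))


def uniformity_confidence(r, m, sn, sum_sen, has_uniformity):
    """Lower bound from claim 5 for the uniformity/stability conclusion.

    r: hybrid strain rate R; m: threshold M; sn: sampling amount SN;
    sum_sen: total sequencing fragments (the sum of SeN).
    """
    if has_uniformity:
        return (binom_dist(m * sn, sn, r, True)
                * binom_dist(sum_sen * m, sum_sen, r, True))
    return (binom_dist((1 - m) * sn, sn, 1 - r, True)
            * binom_dist(sum_sen * (1 - m), sum_sen, 1 - r, True))


def specificity_confidence(od, sd, trn, has_specificity):
    """Lower bound from claim 6 for the specificity conclusion.

    od: variant site rate OD; sd: threshold SD;
    trn: number of successfully detected test zones TRN.
    """
    if has_specificity:
        return binom_dist((1 - sd) * trn, trn, 1 - od, True)
    return binom_dist(sd * trn, trn, od, True)
```

Because Excel truncates number_s to an integer, products such as M*SN that land just below a whole number in floating point are truncated downward here as well; rounding the product before the call is one possible refinement.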
7. The method according to claim 1, characterized in that the hybrid strain variety is obtained as follows: the hybrid strain variety is a variety present in the database, and the number of test zones in which a potential hybrid strain genotype of the hybrid strain variety is identical to a hybrid strain genotype accounts for ≥ 60% of the total number of test zones in which the hybrid strain variety has a potential hybrid strain genotype; the hybrid strain genotype refers to a potential hybrid strain genotype with a frequency ≥ 0.02%;
The potential hybrid strain genotype differs from all genotypes of the corn variety to be measured by ≥ 2 distinguishing bases, or the distinguishing bases contain insertions or deletions of discontinuous bases.
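The screening rule of claim 7 can be illustrated with a short Python sketch. It is a simplification under stated assumptions: genotypes are modelled as equal-length strings, only the "≥ 2 distinguishing bases" criterion is implemented (the insertion/deletion criterion is not modelled), and the data layout and names (`database`, `candidate`, `observed`, `find_hybrid_strain_varieties`) are hypothetical.

```python
def base_differences(a, b):
    """Count differing positions; None if lengths differ (indels not modelled)."""
    if len(a) != len(b):
        return None
    return sum(x != y for x, y in zip(a, b))


def is_potential(genotype, candidate_genotypes):
    """True if genotype differs from every candidate genotype by >= 2 bases."""
    diffs = [base_differences(genotype, c) for c in candidate_genotypes]
    return bool(diffs) and all(d is not None and d >= 2 for d in diffs)


def find_hybrid_strain_varieties(database, candidate, observed,
                                 min_freq=0.0002, min_share=0.6):
    """Return database varieties that qualify as hybrid strain varieties.

    database:  {variety: {zone: genotype}}
    candidate: {zone: [genotypes of the corn variety to be measured]}
    observed:  {zone: {genotype: frequency in the sequencing fragment group}}
    """
    result = []
    for name, genotypes in database.items():
        # Zones where this variety carries a potential hybrid strain genotype.
        potential = [(zone, g) for zone, g in genotypes.items()
                     if zone in candidate and is_potential(g, candidate[zone])]
        if not potential:
            continue
        # Zones where that genotype was actually observed at >= 0.02%.
        matched = sum(1 for zone, g in potential
                      if observed.get(zone, {}).get(g, 0.0) >= min_freq)
        if matched / len(potential) >= min_share:
            result.append(name)
    return result


# Illustrative, made-up data.
candidate = {"r1": ["AAAA"], "r2": ["CCCC"], "r3": ["GGGG"]}
database = {
    "V": {"r1": "AATT", "r2": "CCGG", "r3": "GGGG"},
    "W": {"r1": "AATT", "r2": "CCCC", "r3": "GGGG"},
}
observed = {"r1": {"AATT": 0.001}, "r2": {"CCGG": 0.0001}}
hybrids = find_hybrid_strain_varieties(database, candidate, observed)
```

In this made-up example, variety V has potential hybrid strain genotypes in two zones but only one is observed above 0.02% (a share of 50%), whereas variety W has one such zone and it is observed, so only W passes the 60% rule.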
CN201510150506.5A 2015-03-31 2015-03-31 A kind of method of the specificity for determining hybrid maize variety, uniformity and stability Active CN104805190B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510150506.5A CN104805190B (en) 2015-03-31 2015-03-31 A kind of method of the specificity for determining hybrid maize variety, uniformity and stability

Publications (2)

Publication Number Publication Date
CN104805190A CN104805190A (en) 2015-07-29
CN104805190B true CN104805190B (en) 2017-12-01

Family

ID=53690386

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510150506.5A Active CN104805190B (en) 2015-03-31 2015-03-31 A kind of method of the specificity for determining hybrid maize variety, uniformity and stability

Country Status (1)

Country Link
CN (1) CN104805190B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105740897B (en) * 2016-01-29 2019-03-22 山东省农业科学院作物研究所 Approximate method for screening varieties in corn specific test based on phenotypic character

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104328507A (en) * 2014-10-11 2015-02-04 中国水稻研究所 SNP chip used for identifying rice variety, preparation method and application

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104278101A (en) * 2014-10-21 2015-01-14 江汉大学 Specific ISSR-PCR primer group for identifying brasenia schreberi variety, molecular specific marker and method for identifying brasenia schreberi variety

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Evaluation of the use of high-density SNP genotyping to implement UPOV Model 2 for DUS testing in barley;Huw Jones等;《Theor Appl Gene》;20121212;第126卷(第4期);第901-911页 *

Also Published As

Publication number Publication date
CN104805190A (en) 2015-07-29

Similar Documents

Publication Publication Date Title
CN104846076B (en) A method of specificity, consistency and the stability of measurement cross-bred rape new varieties
CN104830975A (en) Novel method for testing corn parent source authenticity and proportion
CN104805190B (en) A kind of method of the specificity for determining hybrid maize variety, uniformity and stability
CN104805191B (en) A kind of method of the specificity for testing pure lines corn variety, uniformity and stability
CN104805187B (en) A kind of method of the specificity for testing pure lines new soybean varieties, uniformity and stability
CN104805182B (en) A kind of method for the specificity, uniformity and stability for determining new hybrid rice varieties
CN104805189B (en) A kind of method of the specificity for determining hybrid plant new varieties, uniformity and stability
CN105624298A (en) Method for detecting genetically modified components of rape
CN104805184B (en) A kind of method of the specificity for testing pure lines new rice variety, uniformity and stability
CN105567830A (en) Method for detecting transgenic ingredients of plant
US7811766B2 (en) Genetic identification and validation of Echinacea species
CN105586418A (en) Detection method of transgenic components in rapeseed oil
CN104846077B (en) A method of specificity, consistency and the stability of test pure lines new rape variety
CN104805186B (en) A kind of method for testing corn variety substance derived relation
CN105567833A (en) Detection method for soybean transgenic ingredients
CN105586411A (en) Method for detecting paddy rice transgenic ingredients
CN104805185B (en) A kind of method of test plants kind substance derived relation
CN104805188B (en) A kind of method for testing soybean varieties substance derived relation
CN104805183A (en) Method for testing distinctness, uniformity and stability of pure-line plant new variety
CN104805195A (en) Novel method for testing rice parental source authenticity and proportion of rice parental source
CN117757979B (en) Primer group, kit and identification method for identifying soybean varieties
CN109161605A (en) The development and application of the SNP marker of rice blast resistant gene Pi1
CN104805193A (en) Method for testing substantive derivation relation of rice varieties
CN105586413A (en) Detection method of transgenic components in potatoes
CN105586410B (en) Detection method of drug resistance genes of soil microorganisms

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by sipo to initiate substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant