CN118086534A - Eriocheir sinensis DNA fingerprint and construction method and application thereof - Google Patents


Info

Publication number
CN118086534A
Authority
CN
China
Prior art keywords
eriocheir sinensis
dna
snp
dna fingerprint
sequencing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410435641.3A
Other languages
Chinese (zh)
Inventor
尹绍武
蒋苏
张凯
宁先会
郭鑫萍
张聪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Normal University
Original Assignee
Nanjing Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Normal University filed Critical Nanjing Normal University
Priority to CN202410435641.3A priority Critical patent/CN118086534A/en
Publication of CN118086534A publication Critical patent/CN118086534A/en
Pending legal-status Critical Current


Classifications

    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A40/00 Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
    • Y02A40/80 Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in fisheries management
    • Y02A40/81 Aquaculture, e.g. of fish

Landscapes

  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses an Eriocheir sinensis DNA fingerprint, together with a construction method and an application thereof. Forty-six Eriocheir sinensis samples from the Yangtze River and Liaohe River water areas were resequenced and their variant sites screened, yielding 78 highly specific single nucleotide polymorphism (SNP) sites that can be used to identify Eriocheir sinensis germplasm resources; the DNA fingerprint is constructed from these sites. The fingerprint can rapidly and accurately distinguish Eriocheir sinensis of the Yangtze River water area from those of the Liaohe River water area. The invention is the first to construct a DNA fingerprint for this species from SNP site information, and it provides a more effective and more accurate method for Eriocheir sinensis variety identification, determination of population origin, molecular-assisted breeding, and related work.

Description

Eriocheir sinensis DNA fingerprint and construction method and application thereof
Technical Field
The invention belongs to the field of molecular biology, and particularly relates to an Eriocheir sinensis DNA fingerprint and a construction method and application thereof.
Background
Eriocheir sinensis belongs to the order Decapoda, family Varunidae, genus Eriocheir. Also known as the river crab or hairy crab, it is mainly distributed in water systems such as the Yangtze River and the Liaohe River in China and is an economically important aquaculture species. Wild Eriocheir sinensis resources have declined because of overfishing, deterioration of the water environment, habitat destruction, and similar factors. As a result, disorderly and indiscriminate introduction of seed stock across different water systems has become more frequent, which inevitably degrades river crab germplasm resources, lowers yield and quality, and severely restricts the green development of the river crab industry. In addition, as the scale of Eriocheir sinensis farming in China continues to expand, problems with inconsistent seed-industry protection measures and farming management are increasingly prominent. River crabs also escape easily during farming and transport, which further aggravates the mixing of germplasm resources. A method for accurately identifying different varieties of Eriocheir sinensis is therefore urgently needed, and is particularly important for subsequent germplasm resource identification and genetic diversity research.
A DNA fingerprint constructed from highly polymorphic SNP sites is strongly polymorphic, is not easily affected by environmental conditions or subjective factors, and identifies the genetic information of different varieties more accurately. DNA fingerprints have broad application prospects in germplasm resource identification and genetic diversity studies, but have not yet been applied to germplasm resource identification in Eriocheir sinensis.
The prior art for identifying Eriocheir sinensis varieties has the following main problems: 1. Existing variety identification methods mainly use molecular markers such as RFLP, ISSR, and SSR, which require a long experimental cycle and have low accuracy. 2. Methods that identify Eriocheir sinensis germplasm resources from morphological characteristics, such as those of the cephalothoracic carapace, are strongly affected by human factors and environmental conditions. 3. Methods that distinguish Eriocheir sinensis of different geographical origins by the composition and content of nutritional components such as fatty acids cannot exclude the influence of feed composition, farming environment, and similar factors on those components, and cannot identify germplasm resources at the genetic level.
Disclosure of Invention
Object of the invention: in view of the problems in the prior art, the invention provides an Eriocheir sinensis DNA fingerprint that can be used to identify the germplasm resources of Eriocheir sinensis from different water areas rapidly, efficiently, and accurately. The invention provides a DNA fingerprint for identifying Eriocheir sinensis for the first time, and the fingerprint is highly accurate.
The invention also provides a construction method and an application of the Eriocheir sinensis DNA fingerprint.
Technical scheme: to achieve the above object, the Eriocheir sinensis DNA fingerprint comprises 78 SNP sites, wherein the SNP sites and specific nucleotides are as follows:
No. Chromosome Position Nucleotide No. Chromosome Position Nucleotide
1 Chr1 99217 T/C 40 Chr32 2328181 A/T
2 Chr1 38720448 G/A 41 Chr33 161249 C/A
3 Chr2 405837 G/A 42 Chr34 1666626 A/G
4 Chr2 25617819 G/A 43 Chr35 567310 T/G
5 Chr3 70473 A/G 44 Chr36 672423 G/A
6 Chr3 25107852 T/C 45 Chr37 424371 A/C
7 Chr4 1558939 C/T 46 Chr38 323667 A/G
8 Chr4 26621963 C/T 47 Chr39 135012 A/T
9 Chr5 584213 G/A 48 Chr40 53631 C/A
10 Chr5 25646473 T/C 49 Chr41 131332 A/T
11 Chr6 215261 G/C 50 Chr42 193854 C/T
12 Chr6 25243217 T/C 51 Chr43 2957 C/T
13 Chr7 128231 C/T 52 Chr44 374945 A/G
14 Chr7 25130734 A/G 53 Chr45 841188 T/C
15 Chr8 499625 G/A 54 Chr46 142856 C/T
16 Chr9 486990 A/G 55 Chr47 162792 G/A
17 Chr9 25518156 G/A 56 Chr48 61003 G/A
18 Chr10 292307 A/G 57 Chr49 243340 T/G
19 Chr11 793246 A/T 58 Chr50 166519 C/T
20 Chr12 145168 C/G 59 Chr51 23060 T/A
21 Chr13 1008227 T/G 60 Chr52 55932 G/A
22 Chr14 54272 A/G 61 Chr53 2239866 T/A
23 Chr15 179693 A/G 62 Chr54 14560 A/G
24 Chr16 2699367 G/A 63 Chr55 23007 G/A
25 Chr17 73954 C/G 64 Chr56 189487 G/A
26 Chr18 192820 T/C 65 Chr57 254078 C/T
27 Chr19 116958 G/C 66 Chr58 65370 T/A
28 Chr20 282617 A/T 67 Chr59 157051 A/G
29 Chr21 419905 T/G 68 Chr60 86713 C/T
30 Chr22 769751 C/T 69 Chr61 587207 A/G
31 Chr23 11879 G/T 70 Chr62 10220 G/C
32 Chr24 22105 A/C 71 Chr63 390533 C/T
33 Chr25 4578 T/G 72 Chr64 278579 T/A
34 Chr26 118662 C/T 73 Chr65 546629 C/A
35 Chr27 688832 C/T 74 Chr66 108002 G/C
36 Chr28 45286 G/A 75 Chr67 114412 A/G
37 Chr29 1000395 A/G 76 Chr68 15155 T/C
38 Chr30 72370 C/T 77 Chr69 1028650 A/G
39 Chr31 41525 G/A 78 Chr70 962197 A/G
The Eriocheir sinensis DNA fingerprint comprises a DNA fingerprint for the Yangtze River water area and a DNA fingerprint for the Liaohe River water area, each comprising the 78 SNP sites. The SNP sites and specific nucleotides of the Eriocheir sinensis DNA fingerprint for the Yangtze River water area are as follows:
The SNP sites and specific nucleotides of the Eriocheir sinensis DNA fingerprint for the Liaohe River water area are as follows:
The construction method of the Eriocheir sinensis DNA fingerprint comprises the following steps:
(1) Collecting Eriocheir sinensis from the Yangtze River and Liaohe River water areas;
(2) Extracting DNA from Eriocheir sinensis muscle tissue;
(3) Resequencing on the sequencer the DNA fragments that pass quality control;
(4) Filtering the obtained sequencing data, detecting contamination of the sequencing data, and evaluating the quality of the sequencing data;
(5) Aligning the reads of each sample to a reference genome with Sentieon software and performing variant detection, so as to construct an Eriocheir sinensis-specific SNP site database;
(6) Further filtering according to missing rate, MAF value, single copy, site heterozygosity, and depth, and screening highly polymorphic SNP sites for constructing the DNA fingerprint.
The specific steps for resequencing the quality-controlled DNA fragments in step (3) are as follows: the DNA sample is fragmented enzymatically, sequencing adapters are joined to the fragmented DNA with a ligase, the ligation product is amplified by PCR, and the PCR product is size-selected with magnetic beads; the linear library is then denatured into single strands and circularized to form a single-stranded circular library, which undergoes rolling circle replication to form DNA nanoballs (DNBs); finally, the DNBs are loaded onto the sequencing chip with the MGIDL-T7 loading device and resequenced on the sequencer by combinatorial probe-anchor synthesis.
In step (4), filtering the sequencing data, detecting contamination, and evaluating quality comprise: a summary of the sequencing data and quality indices, the base content of the sequencing data, and contamination detection of the sequencing data. The reference genome in step (5) is the whole genome of Eriocheir sinensis (TXID95602) at NCBI. The variant detection in step (5) produces a gVCF for each sample; joint calling is then performed with Sentieon, analysing the gVCFs of all samples together to obtain the variant results for each individual. The SNP sites obtained from the joint analysis are subjected to preliminary filtering (SNP hard-filter criteria: QD < 2.0 || FS > 60.0 || MQ < 40.0 || SOR > 3.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0) to obtain the Eriocheir sinensis-specific SNP site database.
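The hard-filter expression above reads as "discard a site if any one of the six conditions holds". A minimal Python sketch of that logic follows, assuming the annotations are available as a standard semicolon-separated VCF INFO string. The function names and the example INFO strings are hypothetical, and treating a missing annotation as "does not trigger the condition" is an assumption of this sketch, not something the patent states.

```python
# Thresholds copied from the hard-filter expression in the text.
# A site FAILS when any one of these conditions is true.
HARD_FILTERS = {
    "QD": lambda v: v < 2.0,
    "FS": lambda v: v > 60.0,
    "MQ": lambda v: v < 40.0,
    "SOR": lambda v: v > 3.0,
    "MQRankSum": lambda v: v < -12.5,
    "ReadPosRankSum": lambda v: v < -8.0,
}


def parse_info(info_field):
    """Parse a VCF INFO string such as 'QD=25.1;FS=1.2' into a dict of floats."""
    values = {}
    for item in info_field.split(";"):
        if "=" not in item:
            continue  # flag-type entries carry no value
        key, raw = item.split("=", 1)
        try:
            values[key] = float(raw)
        except ValueError:
            pass  # non-numeric or multi-valued annotations are ignored here
    return values


def failed_filters(info_field):
    """Return the names of the hard-filter conditions that a site triggers."""
    values = parse_info(info_field)
    # Assumption: an annotation that is absent does not trigger its condition.
    return [name for name, fails in HARD_FILTERS.items()
            if name in values and fails(values[name])]


def passes_hard_filter(info_field):
    """True when none of the six conditions is triggered."""
    return not failed_filters(info_field)


# Hypothetical INFO strings, invented for illustration only.
good = "QD=25.1;FS=1.2;MQ=60.0;SOR=0.8;MQRankSum=0.1;ReadPosRankSum=0.3"
bad = "QD=1.5;FS=75.0;MQ=60.0;SOR=0.8"
print(passes_hard_filter(good), failed_filters(bad))
```

In practice this step would normally be run by the variant-calling toolkit itself; the sketch only makes the OR-combination of thresholds explicit.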
The screening criteria in step (6) are: a missing rate of 0; an MAF value ≥ 0.1; copy number analysis of the 100 bp sequences upstream and downstream of each site, retaining only sites whose flanking sequences are unique in the genome; and further filtering for site heterozygosity < 0.15 and site depth > 10.
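The per-site statistics named in these criteria can be sketched in Python as below. This is an illustration under stated assumptions: genotypes are given as VCF-style strings for a biallelic site, depth is averaged over samples, and the function names and toy data are hypothetical. The flanking-sequence uniqueness check requires alignment against the genome and is not covered here.

```python
def site_stats(genotypes, depths):
    """Summarise one biallelic SNP site across samples.

    genotypes: VCF-style strings such as '0/0', '0/1', '1/1', or './.' (missing)
    depths:    per-sample read depths, in the same order as genotypes
    """
    split = [g.replace("|", "/").split("/") for g in genotypes]
    called = [alleles for alleles in split if "." not in alleles]

    missing_rate = 1 - len(called) / len(genotypes)

    # Minor allele frequency: the smaller of the two allele frequencies.
    n_alleles = 2 * len(called)
    alt_count = sum(alleles.count("1") for alleles in called)
    alt_freq = alt_count / n_alleles if n_alleles else 0.0
    maf = min(alt_freq, 1 - alt_freq)

    # Heterozygosity: fraction of called samples carrying two different alleles.
    het_count = sum(1 for alleles in called if len(set(alleles)) > 1)
    het_rate = het_count / len(called) if called else 0.0

    mean_depth = sum(depths) / len(depths)
    return {"missing_rate": missing_rate, "maf": maf,
            "het_rate": het_rate, "mean_depth": mean_depth}


def keep_site(stats):
    """Apply the thresholds from the text: missing rate 0, MAF >= 0.1,
    heterozygosity < 0.15, depth > 10."""
    return (stats["missing_rate"] == 0
            and stats["maf"] >= 0.1
            and stats["het_rate"] < 0.15
            and stats["mean_depth"] > 10)


# Toy data for ten samples, invented for illustration only.
genos = ["0/0"] * 6 + ["1/1"] * 3 + ["0/1"]
depths = [20] * 10
stats = site_stats(genos, depths)
print(stats, keep_site(stats))
```

A single missing call is enough to reject a site under these criteria, which is why the missing-rate test is an exact comparison with zero.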
The Eriocheir sinensis DNA fingerprint is applied to the identification of germplasm resources of Eriocheir sinensis from different water areas.
The application process comprises the following steps:
(1) Extracting DNA from the Eriocheir sinensis sample to be tested and resequencing it;
(2) Performing quality control on the sequencing data, performing preliminary filtering according to the SNP hard-filter criteria, further filtering according to missing rate, MAF value, single copy, site heterozygosity, and depth, and finally selecting sites evenly distributed along the chromosomes at intervals of more than 10 Mb, so as to obtain SNP sites capable of completely separating all samples;
(3) Using Plink software on the screened highly polymorphic SNP sites to calculate genetic distances among samples and construct a phylogenetic tree;
(4) Comparing the screened SNP sites with the DNA fingerprints of Eriocheir sinensis from the different water areas; when the coincidence rate is ≥ 95%, the sample to be tested is determined to belong to an Eriocheir sinensis population; cluster analysis is then carried out on the genotypes of the highly polymorphic SNP sites, and the Eriocheir sinensis of unknown origin is assigned to the Liaohe River water area population or the Yangtze River water area population according to its clustering in the phylogenetic tree.
Preferably, the Eriocheir sinensis sample to be tested in step (1) is muscle tissue, and the Eriocheir sinensis populations are randomly caught in the Yangtze River water area and the Liaohe River water area respectively.
In steps (3) and (4), the genetic distances among samples are calculated to compare their degree of relatedness, and population cluster analysis is then used to identify each sample as belonging to the Yangtze River water area population or the Liaohe River water area population.
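The patent does not define how the coincidence rate in step (4) is computed. One plausible reading, sketched below in Python, counts a locus as coincident when every allele called in the sample is one of the two nucleotides listed for that locus in the fingerprint. That reading, the function names, and the sample calls are assumptions made for illustration; only the four locus entries are taken from the fingerprint table.

```python
# First four loci of the 78-site fingerprint table: (chromosome, position) -> nucleotides.
FINGERPRINT = {
    ("Chr1", 99217): "T/C",
    ("Chr1", 38720448): "G/A",
    ("Chr2", 405837): "G/A",
    ("Chr2", 25617819): "G/A",
}


def coincidence_rate(sample_calls, fingerprint=FINGERPRINT):
    """Fraction of fingerprint loci at which the sample's alleles agree.

    Assumed definition: a locus agrees when the sample has a call there and
    all of its alleles belong to the nucleotide pair listed in the fingerprint.
    """
    matches = 0
    for locus, alleles in fingerprint.items():
        allowed = set(alleles.split("/"))
        call = sample_calls.get(locus)
        if call is not None and set(call.split("/")) <= allowed:
            matches += 1
    return matches / len(fingerprint)


def meets_threshold(sample_calls, threshold=0.95):
    """Apply the >= 95% coincidence-rate rule from step (4)."""
    return coincidence_rate(sample_calls) >= threshold


# Hypothetical genotype calls for one sample; the last locus carries an
# allele (T) that is not listed in the fingerprint.
sample = {
    ("Chr1", 99217): "T/T",
    ("Chr1", 38720448): "G/A",
    ("Chr2", 405837): "A/A",
    ("Chr2", 25617819): "G/T",
}
print(coincidence_rate(sample), meets_threshold(sample))
```

With all 78 loci loaded, a sample could disagree at no more than three loci and still reach the 95% threshold under this definition.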
Eriocheir sinensis is mainly distributed in the Yangtze River and Liaohe River water areas, whose abundant river crab germplasm resources can provide high-quality breeding material for the subsequent cultivation of new varieties. The DNA fingerprint is therefore constructed from the specific SNP site combinations of Eriocheir sinensis in these two water areas. The invention provides a brand-new Eriocheir sinensis DNA fingerprint, constructed by resequencing 46 Eriocheir sinensis samples from the Yangtze River and Liaohe River water areas and screening their sites to obtain 78 highly specific SNP sites that can be used for Eriocheir sinensis germplasm resource identification. With this fingerprint, Eriocheir sinensis of the Yangtze River water area and of the Liaohe River water area can be identified rapidly and accurately. By constructing the DNA fingerprint from SNP site information, the invention provides a more effective and more accurate method for Eriocheir sinensis variety identification, determination of population origin, molecular-assisted breeding, and related work.
The invention is the first to construct an Eriocheir sinensis DNA fingerprint from highly polymorphic SNP sites. It uses SNP sites to identify Eriocheir sinensis from different water areas at the genetic level, and avoids the errors that various factors introduce when water areas are identified from indicators such as morphological characteristics and nutritional components. Because SNP sites are polymorphic, widely distributed, and highly stable, the identification results of the invention are more accurate than those of DNA fingerprints constructed with other molecular markers. With the DNA fingerprint constructed by the invention, Eriocheir sinensis of unknown origin can be assigned to the Liaohe River water area or the Yangtze River water area by genotype cluster analysis of only the 78 highly polymorphic SNPs screened by the invention, without testing all SNP sites. Comparing the features of different molecular markers: first-generation markers, represented by RFLP, are costly, involve many experimental steps, take a long time, and have poor marker stability. Second-generation microsatellite markers, if they cannot be retrieved directly from a DNA database, require sequencing first and primer design afterwards, so their development cost is high. The SNP molecular markers used by the invention to construct the DNA fingerprint are abundant in the genomes of all organisms, have a low mutation rate, and are inexpensive to obtain.
Beneficial effects: compared with the prior art, the invention has the following notable advantages:
1. The invention provides an identification method in which specific SNP molecular markers are obtained by whole-genome resequencing and used to construct an Eriocheir sinensis DNA fingerprint.
2. SNPs, the third-generation molecular markers, are numerous, widely and evenly distributed across the genome, highly stable, and suitable for rapid, large-scale screening. Constructing the DNA fingerprint from SNP molecular markers overcomes the long turnaround time, high cost, and low throughput of germplasm resource identification with first- and second-generation molecular markers such as RFLP, ISSR, and SSR.
3. The fingerprint is drawn from the highly polymorphic SNPs obtained by whole-genome resequencing of Eriocheir sinensis, and the genetic distances between samples are calculated, so that different varieties of Eriocheir sinensis are accurately identified at the genetic level. This avoids the influence of subjective factors, environmental conditions, and the like on classification results when varieties are identified from morphological characteristics or nutritional component content.
4. By converting the genotypes of the highly polymorphic SNP sites, an SNP molecular ID can be produced for Eriocheir sinensis; for variety identification, it is then only necessary to carry out cluster analysis on the genotypes of the SNP sites corresponding to the DNA fingerprint in order to classify varieties rapidly.
Drawings
FIG. 1 is a base quality distribution diagram of the sample sequencing data;
FIG. 2 is a base content distribution diagram of the sample sequencing data;
FIG. 3 is a distribution diagram of the 78 SNPs on the chromosomes;
FIG. 4 is a phylogenetic tree of the 46 Eriocheir sinensis samples constructed using the 78 SNPs;
FIG. 5 is a cluster analysis diagram of the genotypes of 30 Eriocheir sinensis at the 78 SNP loci.
Detailed Description
For a better understanding of the present invention, the technical solution of the invention is described clearly and completely below with reference to the accompanying drawings and embodiments. Obviously, the embodiments described are only some of the embodiments of the invention; all other embodiments obtained by a person skilled in the art without inventive effort fall within the scope of protection of the invention.
Example 1
Construction of the Eriocheir sinensis DNA fingerprint
1. A total of 46 Eriocheir sinensis samples were randomly collected from the Yangtze River and Liaohe River water areas: 28 from the Yangtze River water area, labeled CJ-1 to CJ-28, and 18 from the Liaohe River water area, labeled LH-1 to LH-18.
2. Muscle tissue was taken from each sample, and DNA was extracted and subjected to quality control.
DNA was extracted from the sample tissue by the magnetic bead method, and the concentration and integrity of each DNA sample were checked. Samples showing a single, clear DNA band without smearing under a gel imaging system were retained. In this embodiment, the DNA from all 46 Eriocheir sinensis muscle tissue samples passed quality control and could proceed to resequencing.
3. The DNA fragments that passed quality control were resequenced on the sequencer.
The DNA sample was fragmented enzymatically, sequencing adapters were joined to the fragmented DNA with a ligase, the ligation product was amplified by PCR, and the PCR product was size-selected with magnetic beads. The linear library was then denatured into single strands and circularized to form a single-stranded circular library, which underwent rolling circle replication to form DNA nanoballs (DNBs). Finally, the DNBs were loaded onto the sequencing chip with the MGIDL-T7 loading device and resequenced on the sequencer by combinatorial probe-anchor synthesis.
4. Preprocessing and quality control of the resequencing data.
(1) Summary of sequencing data and quality indices. The raw sequencing data were filtered and their quality assessed. To ensure SNP accuracy, preliminary hard filtering was applied (SNP hard-filter criteria: QD < 2.0 || FS > 60.0 || MQ < 40.0 || SOR > 3.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0). After the low-quality data had been filtered out, the quality of the sequencing data was assessed. The base sequencing quality value reflects the sequencing error rate and corresponds to the Phred value (Qphred): a Phred score of 20 means a base-call accuracy of 99% (Q20), and a Phred score of 30 means a base-call accuracy of 99.9% (Q30). Quality assessment of the clean data of all samples found Clean Q20 above 97.08% and Clean Q30 above 91.63%. To show the stability of sequencing quality over the course of the run, the base position within the clean reads was plotted on the abscissa and the average sequencing quality value at each position on the ordinate, giving the sample sequencing quality distribution diagram (FIG. 1). As FIG. 1 shows, the average quality value is greater than 30, indicating stable sequencing quality.
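The relationship between Phred score and base-call accuracy quoted above follows from the standard definition Q = -10 log10(p), where p is the error probability. The Python sketch below works through that arithmetic and computes Q20 and Q30 fractions for a quality string. It assumes the common Phred+33 FASTQ encoding, and the function names and the example quality string are hypothetical.

```python
def phred_to_error(q):
    """Error probability implied by a Phred score: p = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def decode_quality(qual_string, offset=33):
    """Convert a FASTQ quality string to Phred scores (assumes Phred+33)."""
    return [ord(ch) - offset for ch in qual_string]


def fraction_at_least(scores, threshold):
    """Fraction of bases whose Phred score is >= threshold (e.g. Q20, Q30)."""
    return sum(1 for s in scores if s >= threshold) / len(scores)


# Q20 corresponds to 99% accuracy and Q30 to 99.9% accuracy.
for q in (20, 30):
    print(q, f"{1 - phred_to_error(q):.3%}")

# Hypothetical quality string: 'I' = Q40, '5' = Q20, '?' = Q30.
scores = decode_quality("II5?")
print(scores, fraction_at_least(scores, 20), fraction_at_least(scores, 30))
```

The Clean Q20 and Clean Q30 percentages reported for the samples are this kind of fraction, taken over all bases in the filtered reads.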
(2) Base content of the sequencing data. The base position within the clean reads was plotted on the abscissa and the proportion of A, T, C, G, and N bases at each position on the ordinate, giving the base content distribution diagram (FIG. 2). Given the principle of complementary base pairing and the randomness of sequencing, the proportions of A and T, and of G and C, should normally be equal in every sequencing cycle. However, because of random-primer amplification bias and similar effects, the first ten bases of each read fluctuate considerably before the proportions stabilize.
(3) Contamination detection of the sequencing data. For each sample, 10,000 sequences were randomly selected from the fastq file and compared against the NCBI NT database with blastn to assess contamination (Table 1). The comparison shows no obvious contamination from other species in any sample.
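The random selection of 10,000 reads can be done in a single pass over a FASTQ file with reservoir sampling, which avoids holding the whole file in memory. The patent does not say how the reads were drawn, so the Python sketch below is one possible approach. It assumes four-line FASTQ records, and the function names and the small in-memory demo file are hypothetical; the resulting FASTA text would then be passed to blastn.

```python
import io
import random


def read_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header.rstrip("\n")[1:], seq, qual


def sample_reads(handle, n=10_000, seed=0):
    """Reservoir sampling: keep a uniform random sample of n reads in one pass."""
    rng = random.Random(seed)
    reservoir = []
    for i, record in enumerate(read_fastq(handle)):
        if i < n:
            reservoir.append(record)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                reservoir[j] = record
    return reservoir


def to_fasta(records):
    """Format sampled reads as FASTA text, suitable as a blastn query."""
    return "".join(f">{name}\n{seq}\n" for name, seq, _ in records)


# Demo on an in-memory FASTQ with 50 invented reads, sampling 10 of them.
demo = io.StringIO("".join(f"@read{i}\nACGT\n+\nIIII\n" for i in range(50)))
picked = sample_reads(demo, n=10)
fasta_text = to_fasta(picked)
print(len(picked))
```

Fixing the seed makes the subsample reproducible, which helps when the contamination check has to be repeated.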
TABLE 1 sequencing data contamination detection results
5. A gVCF was obtained for each sample, joint calling was performed with Sentieon, and the gVCFs of all samples were analysed together to obtain the variant results for each individual. The data alignment indices are shown in Table 2. To ensure SNP accuracy, the SNP sites obtained from the joint analysis were subjected to preliminary filtering (the SNP hard-filter criteria) to obtain the Eriocheir sinensis-specific SNP site database. The whole genome of Eriocheir sinensis (TXID95602) at NCBI was used as the reference genome. Clean reads were aligned to the reference genome to assess the quality of the samples, library construction, sequencing, and the reference sequence.
TABLE 2 Data alignment indices
6. Screening of fingerprint loci.
In the variant detection analysis, the hard-filtered VCF of the 46 Eriocheir sinensis samples contained 61,760,064 SNP sites in total. Filtering for a site missing rate of 0 and an MAF value ≥ 0.1 left 3,743,841 SNP sites. Copy number analysis of the 100 bp sequences upstream and downstream of each site, retaining only sites whose flanking sequences are unique in the genome, left 2,808,092 SNP sites. Retaining sites with heterozygosity < 0.15 and average depth > 10 left 20,145 SNPs. Finally, sites were selected to be evenly distributed along the chromosomes at intervals > 10 Mb, giving 78 SNP sites (see Table 3). The distribution of the 78 SNPs on the chromosomes is shown in FIG. 3, and all samples can be completely separated on the basis of these 78 sites.
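The final thinning step is described only as selecting sites at intervals > 10 Mb, evenly distributed along the chromosomes. One simple way to enforce a minimum spacing is a greedy pass along each chromosome, sketched below in Python. This is an assumed procedure, not the patent's; it would generally keep more sites than the 78 reported, so the authors presumably applied further selection. The function name is hypothetical, and the positions 5,000,000 and 600,000 are invented, while the other three are taken from Table 3.

```python
from collections import defaultdict


def thin_by_spacing(sites, min_gap=10_000_000):
    """Greedy thinning: per chromosome, keep the first site and then every
    site lying more than min_gap base pairs beyond the last kept one.

    sites: iterable of (chromosome, position) tuples
    """
    by_chrom = defaultdict(list)
    for chrom, pos in sites:
        by_chrom[chrom].append(pos)

    kept = []
    for chrom, positions in by_chrom.items():
        last = None
        for pos in sorted(positions):
            if last is None or pos - last > min_gap:
                kept.append((chrom, pos))
                last = pos
    return kept


# Three positions from Table 3 plus two invented candidates that lie too
# close to an already selected site.
candidates = [
    ("Chr1", 99217),
    ("Chr1", 5_000_000),   # invented; within 10 Mb of Chr1:99217
    ("Chr1", 38720448),
    ("Chr8", 499625),
    ("Chr8", 600_000),     # invented; within 10 Mb of Chr8:499625
]
print(thin_by_spacing(candidates))
```

Chromosomes that contribute only one site to Table 3 are consistent with this spacing rule whenever their remaining candidates lie within 10 Mb of the selected site.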
TABLE 3 DNA fingerprint of Eriocheir sinensis
7. Construction of the phylogenetic tree.
Based on the 78 SNP sites in Table 3, genetic distances between samples were calculated with Plink software and a phylogenetic tree was constructed (FIG. 4). Genetic distance indicates how different the samples are from one another, and clustering the samples by water area according to their relatedness in the phylogenetic tree allows Eriocheir sinensis from different water areas to be identified intuitively.
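To make "genetic distance between samples" concrete, the Python sketch below computes a pairwise allele-sharing distance: the mean per-locus difference in alternate-allele count, scaled to the range 0 to 1. It is an illustration of one common distance measure and is not a reproduction of Plink's calculation. Genotypes are assumed to be coded as 0, 1, or 2 alternate alleles with None for missing calls, and the four samples and their genotypes are invented.

```python
def allele_sharing_distance(a, b):
    """Mean allele difference between two samples over loci called in both.

    a, b: lists of alternate-allele counts (0, 1, 2) or None for missing.
    Returns a value from 0 (identical) to 1 (opposite homozygotes everywhere).
    """
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        raise ValueError("no loci called in both samples")
    return sum(abs(x - y) for x, y in pairs) / (2 * len(pairs))


def distance_matrix(samples):
    """Pairwise distances as a dict keyed by (sample_name, sample_name)."""
    names = list(samples)
    return {(p, q): allele_sharing_distance(samples[p], samples[q])
            for p in names for q in names}


# Invented genotypes at four loci, for illustration only.
samples = {
    "CJ-1": [0, 0, 2, 1],
    "CJ-2": [0, 0, 2, 2],
    "LH-1": [2, 2, 0, 0],
    "LH-2": [2, 1, 0, 0],
}
dm = distance_matrix(samples)
print(dm[("CJ-1", "CJ-2")], dm[("CJ-1", "LH-1")])
```

A distance matrix of this kind is the usual input to a tree-building method; samples from the same water area are expected to show small distances to each other and larger distances to the other group.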
8. Construction of the DNA fingerprints of Eriocheir sinensis from the different water areas.
The DNA fingerprints of Eriocheir sinensis from the Yangtze River and Liaohe River water areas were constructed from the 78 specific SNP sites. The results are shown in Tables 4 and 5.
TABLE 4 DNA fingerprint of Eriocheir sinensis in the Yangtze River water area
TABLE 5 DNA fingerprint of Eriocheir sinensis in the Liaohe River water area
In this embodiment, 78 specific SNP sites were finally selected, and the genotypes at these 78 SNP sites were used to compile the Eriocheir sinensis DNA fingerprint and the DNA fingerprints for the Yangtze River and Liaohe River water areas.
Example 2
To verify the accuracy of the Eriocheir sinensis DNA fingerprint, the germplasm resources of 30 randomly selected Eriocheir sinensis were identified using the 78 specific SNP sites from Example 1.
Twenty Eriocheir sinensis were randomly collected from the Yangtze River water area and labeled A-1 to A-20; ten were randomly collected from the Liaohe River water area and labeled B-1 to B-10. DNA was extracted from muscle tissue and, after passing quality control, sequenced as in Example 1. The 78 highly polymorphic SNP sites at the same positions were compared with the Eriocheir sinensis DNA fingerprint, and the coincidence rate reached 99.49%. Cluster analysis of the genotypes at the 78 SNP sites identifies the Eriocheir sinensis populations of the different water areas. The clustering result is shown in FIG. 5: A-1 to A-20 form one branch and B-1 to B-10 form another, so the Eriocheir sinensis populations of the two major water areas were successfully identified. The Eriocheir sinensis DNA fingerprint therefore identifies the populations of the Yangtze River and Liaohe River water areas effectively and with high accuracy.
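The patent does not name the clustering algorithm behind FIG. 5. As a hedged illustration of how genotypes can fall into two branches, the Python sketch below runs a small average-linkage agglomerative clustering until two clusters remain. The algorithm choice, the distance measure, the function names, and the five toy samples with their genotypes are all assumptions made for this example.

```python
from itertools import combinations


def distance(a, b):
    """Mean allele difference between two genotype vectors coded 0/1/2."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (2 * len(a))


def average_linkage(c1, c2, genotypes):
    """Average pairwise distance between the members of two clusters."""
    total = sum(distance(genotypes[p], genotypes[q]) for p in c1 for q in c2)
    return total / (len(c1) * len(c2))


def cluster(genotypes, n_clusters=2):
    """Merge the closest pair of clusters until n_clusters remain."""
    clusters = [[name] for name in genotypes]
    while len(clusters) > n_clusters:
        # combinations yields index pairs with i < j, so deleting j is safe.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_linkage(clusters[ij[0]], clusters[ij[1]], genotypes),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Invented genotypes at four loci, for illustration only.
toy = {
    "A-1": [0, 0, 2, 1],
    "A-2": [0, 0, 2, 2],
    "A-3": [0, 1, 2, 2],
    "B-1": [2, 2, 0, 0],
    "B-2": [2, 1, 0, 0],
}
groups = cluster(toy)
print(groups)
```

In the toy data the A samples and B samples separate cleanly because their genotypes differ at nearly every locus, which mirrors the two-branch pattern described for FIG. 5.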

Claims (10)

1. An Eriocheir sinensis DNA fingerprint, characterized by comprising 78 SNP sites, wherein the SNP sites and specific nucleotides are as follows:
No. Chromosome Position Nucleotide No. Chromosome Position Nucleotide
1 Chr1 99217 T/C 40 Chr32 2328181 A/T
2 Chr1 38720448 G/A 41 Chr33 161249 C/A
3 Chr2 405837 G/A 42 Chr34 1666626 A/G
4 Chr2 25617819 G/A 43 Chr35 567310 T/G
5 Chr3 70473 A/G 44 Chr36 672423 G/A
6 Chr3 25107852 T/C 45 Chr37 424371 A/C
7 Chr4 1558939 C/T 46 Chr38 323667 A/G
8 Chr4 26621963 C/T 47 Chr39 135012 A/T
9 Chr5 584213 G/A 48 Chr40 53631 C/A
10 Chr5 25646473 T/C 49 Chr41 131332 A/T
11 Chr6 215261 G/C 50 Chr42 193854 C/T
12 Chr6 25243217 T/C 51 Chr43 2957 C/T
13 Chr7 128231 C/T 52 Chr44 374945 A/G
14 Chr7 25130734 A/G 53 Chr45 841188 T/C
15 Chr8 499625 G/A 54 Chr46 142856 C/T
16 Chr9 486990 A/G 55 Chr47 162792 G/A
17 Chr9 25518156 G/A 56 Chr48 61003 G/A
18 Chr10 292307 A/G 57 Chr49 243340 T/G
19 Chr11 793246 A/T 58 Chr50 166519 C/T
20 Chr12 145168 C/G 59 Chr51 23060 T/A
21 Chr13 1008227 T/G 60 Chr52 55932 G/A
22 Chr14 54272 A/G 61 Chr53 2239866 T/A
23 Chr15 179693 A/G 62 Chr54 14560 A/G
24 Chr16 2699367 G/A 63 Chr55 23007 G/A
25 Chr17 73954 C/G 64 Chr56 189487 G/A
26 Chr18 192820 T/C 65 Chr57 254078 C/T
27 Chr19 116958 G/C 66 Chr58 65370 T/A
28 Chr20 282617 A/T 67 Chr59 157051 A/G
29 Chr21 419905 T/G 68 Chr60 86713 C/T
30 Chr22 769751 C/T 69 Chr61 587207 A/G
31 Chr23 11879 G/T 70 Chr62 10220 G/C
32 Chr24 22105 A/C 71 Chr63 390533 C/T
33 Chr25 4578 T/G 72 Chr64 278579 T/A
34 Chr26 118662 C/T 73 Chr65 546629 C/A
35 Chr27 688832 C/T 74 Chr66 108002 G/C
36 Chr28 45286 G/A 75 Chr67 114412 A/G
37 Chr29 1000395 A/G 76 Chr68 15155 T/C
38 Chr30 72370 C/T 77 Chr69 1028650 A/G
39 Chr31 41525 G/A 78 Chr70 962197 A/G
2. The eriocheir sinensis DNA fingerprint is characterized by preferably comprising a eriocheir sinensis DNA fingerprint in the Yangtze river and a eriocheir sinensis DNA fingerprint in the Liaohe river, wherein the two DNA fingerprints comprise 78 SNP sites, and the SNP sites and specific nucleotides of the eriocheir sinensis DNA fingerprint in the Yangtze river are as follows:
The SNP locus and specific nucleotide of the Eriocheir sinensis DNA fingerprint in the Liaoning river water area are as follows:
3. A method for constructing a eriocheir sinensis DNA fingerprint according to claim 1 or 2, comprising the steps of:
(1) Collecting Eriocheir sinensis in Yangtze river and Liaohe water;
(2) Extracting DNA of eriocheir sinensis muscle tissues;
(3) Carrying out resequencing on the DNA fragments with qualified quality control on a machine;
(4) Filtering the obtained sequencing data, detecting pollution of the sequencing data and evaluating the quality of the sequencing data;
(5) Adopting Sentieon software to compare reads of each sample to a reference genome and performing mutation detection to construct a specific SNP locus database of Eriocheir sinensis;
(6) And further filtering according to the deficiency rate, MAF value, single copy, site heterozygosity and depth, and screening high polymorphism SNP sites for constructing DNA fingerprint.
4. The method for constructing the eriocheir sinensis DNA fingerprint spectrum according to claim 3, wherein the specific step of mechanically re-sequencing the quality-controlled qualified DNA fragment in the step (3) is as follows: the DNA sample is subjected to enzyme slicing and segmentation, a sequencing joint and segmented DNA are linked together by using ligase, PCR amplification is carried out on the connection product, the PCR product is subjected to fragment screening by using magnetic beads, then, the linear library is denatured into single chains, cyclization is carried out to form a single-chain annular library, DNA Nanospheres (DNB) are formed through rolling circle replication, finally, DNB is loaded into a sequencing chip by using loading equipment MGIDL-T7, and re-sequencing is carried out by combining a probe-anchored polymerization technology.
5. The method for constructing the eriocheir sinensis DNA fingerprint according to claim 3, wherein the filtering, the sequencing data pollution detection and the sequencing data quality evaluation of the sequencing data obtained in the step (4) comprise the summarization of sequencing data and quality indexes, the base content of the sequencing data and the sequencing data pollution detection.
6. The method for constructing the Eriocheir sinensis DNA fingerprint according to claim 3, wherein the reference genome in step (5) is the Eriocheir sinensis whole genome at NCBI (TXID95602).
7. The method for constructing the Eriocheir sinensis DNA fingerprint according to claim 3, wherein the variant detection in step (5) yields a gVCF for each sample; joint calling is performed with Sentieon, in which the gVCFs of all samples are analyzed jointly to obtain the variant results of each individual; and the SNP loci obtained from the joint analysis are preliminarily filtered (SNP hard-filter standard: QD < 2.0 || FS > 60.0 || MQ < 40.0 || SOR > 3.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0) to obtain the database of SNP loci specific to Eriocheir sinensis.
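The hard-filter standard in claim 7 can be read as a per-site predicate. The following is a minimal Python sketch under two assumptions that the claim does not state: the conditions are combined with logical OR (a site is removed when any one of them holds), and an annotation that is absent for a site is skipped rather than counted as a failure. The function name `fails_hard_filter` and the `info` dictionary are hypothetical and are not part of Sentieon or any other tool.

```python
# Thresholds copied from the hard-filter expression in claim 7.
# Assumption: a site is removed if ANY single condition is true.
HARD_FILTERS = {
    "QD": lambda v: v < 2.0,
    "FS": lambda v: v > 60.0,
    "MQ": lambda v: v < 40.0,
    "SOR": lambda v: v > 3.0,
    "MQRankSum": lambda v: v < -12.5,
    "ReadPosRankSum": lambda v: v < -8.0,
}


def fails_hard_filter(info):
    """Return the names of the annotations that trigger the hard filter.

    `info` is a hypothetical dict of annotation values for one SNP site.
    Annotations missing from `info` are skipped (an assumption), so an
    empty result means the site is kept.
    """
    failed = []
    for name, is_bad in HARD_FILTERS.items():
        value = info.get(name)
        if value is not None and is_bad(value):
            failed.append(name)
    return failed


# Illustrative values only, not data from the patent.
good_site = {"QD": 25.0, "FS": 1.2, "MQ": 60.0, "SOR": 0.8,
             "MQRankSum": 0.1, "ReadPosRankSum": 0.3}
bad_site = {"QD": 1.5, "FS": 75.0, "MQ": 60.0}

print(fails_hard_filter(good_site))
print(fails_hard_filter(bad_site))
```

Returning the list of triggered annotations, rather than a bare boolean, makes it easier to see which threshold removes most sites when tuning a pipeline.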
8. The method for constructing the Eriocheir sinensis DNA fingerprint according to claim 3, wherein the screening standard in step (6) is: missing rate of 0; MAF value ≥ 0.1; sequences 100 bp upstream and downstream of each site are extracted for copy number analysis, and only sites whose flanking sequences are unique in the genome are retained; further filtering is performed with site heterozygosity < 0.15 and site depth > 10.
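The numeric thresholds in claim 8 can be illustrated with a small per-site check. This Python sketch rests on several assumptions not given in the claim: genotypes are coded as alternate-allele counts (0, 1, 2) with `None` for a missing call, sites are biallelic, "site depth" is taken as the mean read depth across samples, and the heterozygosity and depth values are treated as conditions a retained site must satisfy. The single-copy check on the 100 bp flanking sequences requires alignment against the genome and is not covered here. All function names are hypothetical.

```python
def site_stats(genotypes, depths):
    """Summarise one biallelic SNP site.

    genotypes: alt-allele count per sample (0, 1 or 2), None if missing.
    depths: read depth per sample at this site.
    """
    called = [g for g in genotypes if g is not None]
    missing_rate = (len(genotypes) - len(called)) / len(genotypes)
    if called:
        alt_freq = sum(called) / (2 * len(called))
        maf = min(alt_freq, 1 - alt_freq)          # minor allele frequency
        het = sum(1 for g in called if g == 1) / len(called)
    else:
        maf = het = 0.0
    mean_depth = sum(depths) / len(depths)
    return {"missing_rate": missing_rate, "maf": maf,
            "het": het, "mean_depth": mean_depth}


def passes_site_filter(genotypes, depths):
    """Apply the claim 8 thresholds (single-copy check not included)."""
    s = site_stats(genotypes, depths)
    return (s["missing_rate"] == 0
            and s["maf"] >= 0.1
            and s["het"] < 0.15
            and s["mean_depth"] > 10)


# Illustrative values only, not data from the patent.
genos = [0, 0, 2, 2, 0, 0, 0, 2, 0, 0]
print(passes_site_filter(genos, [15] * 10))
```

In practice these statistics would be computed by a variant-processing tool over the whole VCF; the sketch only makes the order and direction of the comparisons explicit.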
9. Use of the Eriocheir sinensis DNA fingerprint for identifying Eriocheir sinensis germplasm resources from different water areas.
10. The application according to claim 9, wherein the application proceeds as follows:
(1) Extracting DNA from the Eriocheir sinensis sample to be tested and resequencing it;
(2) Performing quality control on the sequencing data, preliminarily filtering according to the SNP hard-filter standard, further filtering according to missing rate, MAF value, single-copy status, site heterozygosity, and depth, and finally selecting sites evenly distributed along the chromosomes at intervals of more than 10 Mb, to obtain SNP sites capable of completely separating all samples;
(3) Calculating genetic distances among samples with PLINK software for the screened highly polymorphic SNP loci and constructing a phylogenetic tree;
(4) Comparing the screened SNP loci with the DNA fingerprints of Eriocheir sinensis from different water areas, and determining that the sample to be tested belongs to an Eriocheir sinensis population when the coincidence rate is ≥ 95%; then performing cluster analysis on the genotypes of the highly polymorphic SNP loci, and judging from the clustering in the phylogenetic tree whether the Eriocheir sinensis from the unknown water area belongs to the Liaohe River population or the Yangtze River population.
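Two operations in the steps above can be sketched in Python: spacing the selected sites more than 10 Mb apart on each chromosome, and computing a coincidence rate against a reference fingerprint. The claim does not specify how either is computed, so both functions are assumptions: `thin_sites` uses a simple greedy pass over position-sorted sites, and `coincidence_rate` is taken to be the fraction of shared sites with identical genotypes. The function names, the `(chromosome, position)` keys, and the genotype strings are hypothetical.

```python
MIN_GAP_BP = 10_000_000  # "intervals of more than 10 Mb" from step (2)


def thin_sites(sites, min_gap=MIN_GAP_BP):
    """Greedy thinning of (chromosome, position) tuples.

    A site is kept only if it lies more than `min_gap` bp beyond the
    last site kept on the same chromosome.
    """
    kept = []
    last_pos = {}
    for chrom, pos in sorted(sites):
        if chrom not in last_pos or pos - last_pos[chrom] > min_gap:
            kept.append((chrom, pos))
            last_pos[chrom] = pos
    return kept


def coincidence_rate(sample, fingerprint):
    """Fraction of shared sites at which the two genotypes are identical.

    Both arguments map a site key to a genotype string.
    """
    shared = [site for site in sample if site in fingerprint]
    if not shared:
        raise ValueError("sample and fingerprint share no sites")
    matches = sum(1 for site in shared if sample[site] == fingerprint[site])
    return matches / len(shared)


# Illustrative values only, not data from the patent.
sites = [("chr1", 1_000_000), ("chr1", 5_000_000),
         ("chr1", 12_000_000), ("chr2", 3_000_000)]
print(thin_sites(sites))

sample = {("chr1", 1): "AA", ("chr1", 2): "AG", ("chr2", 3): "GG", ("chr2", 4): "CT"}
reference = {("chr1", 1): "AA", ("chr1", 2): "AG", ("chr2", 3): "GG", ("chr2", 4): "CC"}
rate = coincidence_rate(sample, reference)
print(rate, rate >= 0.95)
```

Greedy thinning keeps the first site on each chromosome and therefore depends on sort order; a real pipeline might instead choose the most informative site within each window.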
CN202410435641.3A 2024-04-11 2024-04-11 Eriocheir sinensis DNA fingerprint and construction method and application thereof Pending CN118086534A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410435641.3A CN118086534A (en) 2024-04-11 2024-04-11 Eriocheir sinensis DNA fingerprint and construction method and application thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410435641.3A CN118086534A (en) 2024-04-11 2024-04-11 Eriocheir sinensis DNA fingerprint and construction method and application thereof

Publications (1)

Publication Number Publication Date
CN118086534A (en) 2024-05-28

Family

ID=91153338

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410435641.3A Pending CN118086534A (en) 2024-04-11 2024-04-11 Eriocheir sinensis DNA fingerprint and construction method and application thereof

Country Status (1)

Country Link
CN (1) CN118086534A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination