CN115786479A - Police performance detection method for police dog - Google Patents
- Publication number
- CN115786479A (CN202211491466.7A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention relates to the selection and breeding of German Shepherd dogs, and in particular to a method for detecting the police performance of police dogs, characterized in that a gene sequencing method is used to determine whether a German Shepherd dog carries the CGNL1 and/or KIF5C genes. For the breeding field, the method is simple, feasible and reliable, and is suitable for large-scale adoption.
Description
Technical Field
The invention relates to the breeding of German Shepherd dogs, and in particular to a method for detecting the police performance of police dogs.
Background
Today, the unique functions of police dogs still cannot be replaced by people or instruments, and police dogs are widely used by police forces around the world. Police dog technology is an important component of public security science and technology and an important means by which public security organs fight and prevent crime. With their fierce appearance and strong deterrent force, police dogs continue to play a unique role in handling mass emergencies. In such incidents police dog technology is fully applied and various tactics are used flexibly, and it has become an important means for public security departments to fight and prevent crime and to maintain social security and stability.
German Shepherd dogs are agile and well suited to active working environments. They are often deployed for police, guard, search-and-rescue and military tasks, and also work as guide dogs for the blind. Not all German Shepherd dogs are qualified to be police dogs; a qualified police dog must have one desirable trait: courage. Traditional breeding creates genetic variation in various ways and then selects and evaluates the offspring. It is based on canine phenotype, requires breeders with abundant practical experience, involves a degree of blindness and unpredictability when screening for police dogs with good courage, and has a long, time-consuming and labor-intensive breeding cycle. A method for rapidly and reliably screening police dogs with good courage is therefore a technical problem in urgent need of a solution.
Disclosure of Invention
Technical problem to be solved: the invention provides a police performance detection method for police dogs that is rapid and reliable, overcomes the blindness and unpredictability of traditional breeding methods, and greatly saves time and cost.
Technical scheme
A police performance detection method for police dogs, characterized in that a gene sequencing method is used to determine whether a German Shepherd dog carries the CGNL1 and/or KIF5C genes.
The CGNL1 is Gene30799; the KIF5C is Gene22425.
The gene sequencing method is a first-generation, second-generation or later-generation sequencing method.
The invention can detect the presence of CGNL1 and/or KIF5C by PCR.
Advantageous effects
SLAF-seq was used to perform a genome-wide association analysis of the courage trait of police dogs in order to find SNP sites associated with the trait, and 24 SNP sites related to courage were finally identified. Based on the genes within 100 kb upstream and downstream of the SNP positions, 5 SNP sites are associated with genes. These 5 SNPs are located within or near the following 14 genes:
LOC10659757; TRIM2; MND1; LOC106559787; LOC111093549; LCO111093557;
LOC102151701; CGNL1; MYZAP; KIF5C; LYPD6B; RUBCN; MUC20; MUC4.
Statistical analysis of the GO functions of the 14 genes in the associated intervals found 2 genes whose functions may be related to courage. Gene22425 (KIF5C) is involved in ATP binding activity (GO:0005524) and in the production of kinetin complexes (GO:0005871). Gene30799 (CGNL1) is involved in motor activity (GO:0003774) and may have an effect on courage. These data only indicate possible effects; direct evidence that the functions are necessarily correlated with courage is lacking. ATP binding is associated with numerous reactions, such as heat generation. Kinetin is one of the plant-hormone cytokinins that promote cell division, and it takes its name from its ability to induce cell division; the motor activity function indicates only motor-related activity.
Courage, meaning not fearing difficulty or danger, is a complex psychological trait influenced by many factors. Identifying which genes are associated with courage, particularly in dogs, has long been a problem in the art.
The inventors performed sequencing analysis of the CGNL1 gene or the KIF5C gene of German Shepherd dogs, that is, detected whether the CGNL1 or KIF5C gene is mutated. The dogs were assessed against the examination items detailed in the police dog breeding rules issued by the police department. Where the gene was mutated, the corresponding dog scored excellent in the courage assessment, which further demonstrates that the gene mutation is related to canine courage. Therefore, in the breeding field, it is only necessary to determine the variation of these genes in a dog to infer whether its courage meets the requirements.
Drawings
FIG. 1: schematic flow chart of the SLAF experiment;
FIG. 2: WGS population evolution information analysis flow chart;
FIG. 3: phylogenetic tree of the samples;
FIG. 4: sample clustering results for each K value of Admixture;
FIG. 5: cross-validation error rate for each K value of Admixture;
FIG. 6: principal component analysis results;
FIG. 7: linkage disequilibrium analysis results;
FIG. 8: Manhattan plot for the trait 1 association analysis;
FIG. 9: Q-Q plot;
FIG. 10: verification results of police dog screening by KIF5C detection;
FIG. 11: verification results of police dog screening by KIF5C detection;
FIG. 12: verification results of police dog screening by CGNL1 detection.
Detailed Description
Example 1
1. Introduction to WGS technology
Whole genome sequencing (WGS) is a next-generation sequencing technology for rapid, low-cost determination of the complete genome sequence of an organism. SNP information is obtained through analysis and used for genotyping. It is a rapid, simple and low-cost genotyping method that is suitable for marker screening in large samples and can be used for molecular marker development, ultra-high-density map construction, population genetic analysis, population GWAS analysis and similar applications.
The information analysis includes: 1) sequencing data quality evaluation; 2) sample alignment; 3) population SNP detection and annotation; 4) population genetic structure analysis; 5) principal component analysis; 6) GWAS analysis.
2. Screening procedures
2.1 basic information on the Material
The subjects were 110 German Shepherd dogs from the Nanjing police dog institute of the public security department. The dogs were 0.5 to 1 year old, weighed 20 to 30 kg, were in good physical condition, had completed vaccination and deworming, were fed a complete pelleted feed, and received routine training and husbandry. Each dog was scored in a courage police-performance assessment. From each dog, 5 mL of fresh venous blood was collected into an anticoagulant tube, labeled, and stored in a refrigerator at -20 °C.
2.2 methods
2.2.1 Experimental design
TABLE 1 WGS detection groups
2.2.2 Experimental procedures
The genomic DNA of each sample that passed quality inspection was digested according to the selected optimal enzyme digestion scheme. An A was added to the 3' end of the resulting digested fragments (SLAF tags), Dual-index sequencing adapters were ligated, and the products were PCR-amplified, purified, pooled and gel-cut to select the target fragments. After the library passed quality inspection, sequencing was performed on an Illumina HiSeq. To evaluate the accuracy of the enzyme digestion experiment, rice (Nipponbare) was sequenced as a control. The experimental procedure is shown in FIG. 1 (SLAF experimental flow chart).
2.2.3 information analysis flow
The specific steps are shown in the WGS population evolution information analysis flow chart of FIG. 2.
3. Information analysis results
The information analysis mainly comprises the following steps: (1) sequencing data quality control: quality control is performed on the raw data from the sequencer to obtain clean data; (2) alignment analysis: the clean data are aligned to the reference genome; (3) SNP detection: population SNP detection and annotation are performed on the alignment results; (4) population genetic structure analysis: including phylogenetic tree construction (Tree) and principal component analysis (PCA); (5) association analysis between SNPs and traits.
3.1 sequencing data statistics
DNA libraries that passed quality inspection were sequenced on the HiSeq™ platform, yielding raw data (raw reads), which were stored in FASTQ format (file name: *.fq). The raw sequencing data contain adapter information, low-quality bases and undetermined bases (denoted N), which can seriously interfere with subsequent information analysis. To ensure analysis quality, this interfering information must be removed before analysis; the data that remain are the valid data, called clean data or clean reads. The raw data were filtered as follows:
(1) Read pairs containing adapter sequence are filtered out;
(2) When the N content of either single-end read exceeds 10% of the read length, the read pair is removed;
(3) When the number of low-quality bases (Q ≤ 5) in either single-end read exceeds 50% of the read length, the read pair is removed.
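The three filtering rules above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline actually used: the function names, the Phred+33 quality encoding, and the example adapter string are assumptions introduced here.

```python
def n_fraction(seq: str) -> float:
    """Fraction of undetermined bases (N) in a read."""
    return seq.upper().count("N") / len(seq) if seq else 1.0


def low_quality_fraction(qual: str, threshold: int = 5, offset: int = 33) -> float:
    """Fraction of bases with Phred score <= threshold (Phred+33 encoding assumed)."""
    if not qual:
        return 1.0
    low = sum(1 for ch in qual if ord(ch) - offset <= threshold)
    return low / len(qual)


def keep_pair(read1, read2, adapters=("AGATCGGAAGAGC",)):
    """Return True if a read pair passes all three filters.

    Each read is a (sequence, quality_string) tuple. The default adapter
    string is only an illustrative placeholder (assumption).
    """
    for seq, qual in (read1, read2):
        if any(a in seq for a in adapters):      # rule 1: adapter sequence present
            return False
        if n_fraction(seq) > 0.10:               # rule 2: N content above 10%
            return False
        if low_quality_fraction(qual) > 0.50:    # rule 3: over 50% of bases with Q <= 5
            return False
    return True


# Hypothetical reads: 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
good = ("ACGTACGTAC", "IIIIIIIIII")
many_n = ("NNACGTACGT", "IIIIIIIIII")
low_q = ("ACGTACGTAC", "######IIII")
```

A pair is discarded as soon as either mate fails any rule, which matches the "read pair is removed" wording of the criteria.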
Strict filtering of the sequencing data yields high-quality clean data. Statistics were compiled for the sequencing data of all samples, including data yield, sequencing error rate, Q20, Q30 and GC content. Part of the sequencing data statistics is shown in Table 2.
TABLE 2 Sequencing data yield statistics
Note: Sample: sample name; Raw base: number of bases in the raw data; Clean base: number of high-quality bases remaining after filtering the raw data; Error rate: sequencing error rate; Q20, Q30: percentage of bases with a Phred score greater than 20 or 30, respectively, among all bases; GC content (%): GC content.
In addition, the clean data were compared with the NCBI nucleotide database to assess whether DNA contamination from other sources was present. The statistics show that sequencing quality is high (Q20 ≥ 95%, Q30 ≥ 90%) and that the GC distribution of the samples is normal. In summary, the data volume meets the contract requirements, and library construction and sequencing were successful.
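As an illustration of how the statistics in Table 2 relate to the reads, the following sketch computes Q20, Q30, GC content and mean error rate from (sequence, quality) pairs. It assumes Phred+33 encoding and counts a base toward Q20/Q30 when its score is at least 20/30, which is a common convention and an assumption here; the example reads are hypothetical.

```python
def sequencing_stats(reads, offset=33):
    """Summarise base quality and GC content for (sequence, quality) pairs."""
    total = q20 = q30 = gc = 0
    error_sum = 0.0
    for seq, qual in reads:
        for base, ch in zip(seq.upper(), qual):
            q = ord(ch) - offset              # Phred score of this base
            total += 1
            q20 += q >= 20
            q30 += q >= 30
            gc += base in "GC"
            error_sum += 10 ** (-q / 10)      # Phred definition: P(error) = 10^(-Q/10)
    return {
        "q20_percent": 100 * q20 / total,
        "q30_percent": 100 * q30 / total,
        "gc_percent": 100 * gc / total,
        "error_rate": error_sum / total,
    }


# Hypothetical reads: 'I' encodes Q40 and '5' encodes Q20.
example_reads = [("GGCC", "IIII"), ("ATAT", "5555")]
stats = sequencing_stats(example_reads)
```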
3.2 comparative analysis of samples
The sequencing data were aligned to the pseudo-reference genome. The alignment rate reflects the similarity between a sample and the reference genome, while sequencing depth and coverage directly reflect the uniformity of the sequencing data and its homology with the reference sequence. The valid sequencing data were aligned to the reference sequence with BWA [1] (version 0.7.17-r1198; parameters: mem -t 4 -k 32 -M, where -t is the number of threads, -k is the minimum seed length, and -M marks shorter split hits as secondary), and the alignment rate and coverage between each sample and the reference sequence were computed from the alignment results (see Table 3). The alignment results were then sorted with SAMtools [2] (version 0.1.19-44428cd; command: sort).
TABLE 3 statistics of alignment and depth of coverage (parts)
Note: Total reads: the number of all reads used for alignment;
Mapping reads: the number of clean reads aligned to the reference genome;
Mapping rate: the proportion of mapped reads among the total clean reads;
Average depth: average sequencing depth, i.e. the total number of bases aligned to the reference genome divided by the genome size;
Coverage1X: the percentage of sites in the reference genome covered by at least 1 base;
Coverage4X: the percentage of sites in the reference genome covered by at least 4 bases.
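The definitions in the note translate directly into arithmetic. The sketch below assumes a per-site depth list covering the whole reference (for a real genome this would come from the sorted alignment file); the function name and the toy numbers are hypothetical.

```python
def alignment_summary(total_reads, mapped_reads, depths):
    """Compute the Table 3 metrics from read counts and per-site depths.

    depths holds one integer per reference position (toy input, assumption).
    """
    genome_size = len(depths)
    return {
        "mapping_rate": mapped_reads / total_reads,
        "average_depth": sum(depths) / genome_size,
        "coverage_1x": sum(d >= 1 for d in depths) / genome_size,
        "coverage_4x": sum(d >= 4 for d in depths) / genome_size,
    }


# Hypothetical 10-site "genome".
summary = alignment_summary(1000, 950, [0, 1, 4, 5, 10, 0, 3, 8, 2, 7])
```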
3.3 mutation detection assay
A SNP (single nucleotide polymorphism) is a DNA sequence polymorphism caused by variation of a single nucleotide at the genome level, including single-base transitions and transversions. After all samples were aligned with BWA, population SNPs were detected with SAMtools and related software. Polymorphic sites in the population were detected with a Bayesian model, and the resulting SNPs underwent basic filtering to obtain high-quality SNPs according to the following criteria:
(1) Q20 quality control (i.e., SNPs with a sequencing error rate greater than 1% are filtered out);
(2) The support number (coverage depth) of a SNP is within the range of sample number × 20;
(3) SNPs with a minor allele frequency (MAF) of less than 0.05 are filtered out;
(4) SNPs are retained only if their missing rate across the whole population is less than 0.5 (the filtered results are used for subsequent analysis).
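Criteria (3) and (4) can be illustrated with a small filter over genotype calls. This sketch assumes genotypes are coded as the count of the alternate allele (0, 1 or 2) with None for a missing call; that coding and the function name are assumptions made for the example, and criteria (1) and (2) are not modelled.

```python
def snp_passes(genotypes, min_maf=0.05, max_missing=0.5):
    """Return True if a SNP passes the MAF and missing-rate filters.

    genotypes: list of 0/1/2 alternate-allele counts, None = missing call.
    """
    called = [g for g in genotypes if g is not None]
    missing_rate = 1 - len(called) / len(genotypes)
    if not called or missing_rate >= max_missing:   # keep only missing rate < 0.5
        return False
    alt_freq = sum(called) / (2 * len(called))      # diploid: two alleles per dog
    maf = min(alt_freq, 1 - alt_freq)
    return maf >= min_maf                           # drop SNPs with MAF < 0.05


# Hypothetical SNPs.
common = [0, 1, 2, 0, None, 1]
monomorphic = [0] * 10
mostly_missing = [None, None, None, 1]
rare = [1] + [0] * 19
```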
3.4 population genetic diversity analysis
Genetic diversity analysis was performed on each population. Based on the genetic differences among populations, an evolutionary tree was constructed with the corresponding mathematical methods and principal component analysis was carried out, to reveal the genetic diversity, genetic distance and degree of genetic differentiation among the populations.
3.4.1 construction of phylogenetic Tree
A phylogenetic tree (also called an evolutionary tree) is a branching diagram that describes the order of evolution among populations and represents their evolutionary relationships. Relationships can be inferred from similarities or differences in the physical or genetic characteristics of the populations. We constructed the evolutionary tree using the neighbor-joining method.
A phylogenetic tree was constructed from the SNP locus information of the different populations to display the genetic distance of each population visually and to provide a reference for selecting populations that are genetically more distant. The results are shown in FIG. 3, which presents the lineage tree (NJ tree) of the entire population; combined with the geographic distribution of each sample, it can be used to understand the lineage evolution pattern of the population.
Software: TreeBeST (1.9.2, http://treesoft.sourceforge.net/)
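To make the neighbor-joining idea concrete, the sketch below builds a pairwise distance matrix from SNP genotypes and finds the pair of samples that neighbor joining would merge first, using the standard Q criterion Q(i,j) = (n - 2)·d(i,j) - Σd(i,·) - Σd(j,·). It is not the TreeBeST implementation: the distance measure (mean allele-dosage difference), the sample names and the genotypes are assumptions for illustration only.

```python
def p_distance(g1, g2):
    """Mean allele-dosage difference between two samples, scaled to 0..1."""
    pairs = [(a, b) for a, b in zip(g1, g2) if a is not None and b is not None]
    return sum(abs(a - b) for a, b in pairs) / (2 * len(pairs))


def distance_matrix(samples):
    """Return sample names and the symmetric pairwise distance matrix."""
    names = list(samples)
    matrix = [[p_distance(samples[a], samples[b]) for b in names] for a in names]
    return names, matrix


def nj_first_join(names, d):
    """Return the pair with the minimum Q value (first neighbor-joining merge)."""
    n = len(names)
    row_sums = [sum(row) for row in d]
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * d[i][j] - row_sums[i] - row_sums[j]
            if best is None or q < best[0]:
                best = (q, names[i], names[j])
    return best[1], best[2]


# Hypothetical genotypes (0/1/2 alternate-allele counts) for four dogs.
dogs = {
    "A": [0, 0, 1, 2],
    "B": [0, 0, 1, 1],
    "C": [2, 2, 0, 0],
    "D": [2, 1, 0, 1],
}
names, dist = distance_matrix(dogs)
first_pair = nj_first_join(names, dist)
```

A full tree is obtained by repeating the step: the chosen pair is replaced by a new internal node, distances to that node are recomputed, and the process continues until only two nodes remain.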
3.4.2 Evolutionary tree analysis
For the genetic evolution analysis, filtering by completeness > 0.5 and minor genotype frequency > 0.05 yielded a total of 137937 highly consistent population SNPs for the subsequent evolution-related analyses. The differences between sample groups are large, and the completeness of the SNP sites is not high.
3.4.3 Phylogenetic analysis
A phylogenetic tree expresses the evolutionary relationships between species: organisms are arranged on a branching tree-shaped chart according to how closely they are related, giving a concise picture of the evolutionary process and relationships. Based on the SNPs, a population evolutionary tree of the samples was constructed with MEGA5 using the neighbor-joining algorithm, as shown in FIG. 3. Note: each branch in the figure is one sample.
3.4.4 genetic Structure analysis
Population genetic structure analysis provides information on the ancestral origin and composition of individuals and is an important tool for analyzing genetic relationships. Based on the SNPs, the population structure of the samples was analyzed with the Admixture software, with clustering performed under the assumption that the number of clusters (K value) is 2 to 10. The clustering results were cross-validated, and the optimal number of clusters was determined from the minimum of the cross-validation error rate. The clustering for K values of 2 to 10 is shown in FIG. 4, and the cross-validation error rate for each K value in FIG. 5.
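Selecting the optimal K from the cross-validation errors amounts to taking the minimum. The sketch below parses log lines of the form "CV error (K=3): 0.55000"; that line format is assumed here, and the error values are hypothetical and are not the results of FIG. 5.

```python
import re

CV_LINE = re.compile(r"CV error \(K=(\d+)\): ([0-9.]+)")


def parse_cv_errors(log_text):
    """Collect {K: cross-validation error} from log text (format assumed)."""
    return {int(k): float(err) for k, err in CV_LINE.findall(log_text)}


def best_k(cv_errors):
    """Return the K with the lowest cross-validation error."""
    return min(cv_errors, key=cv_errors.get)


# Hypothetical log excerpt.
log = """CV error (K=2): 0.60000
CV error (K=3): 0.55000
CV error (K=4): 0.58000"""
errors = parse_cv_errors(log)
optimal_k = best_k(errors)
```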
3.4.5 principal component analysis
Principal component analysis (PCA) is a purely mathematical method that uses a linear transformation to extract a small number of important variables from many correlated variables. PCA is applied in many disciplines; in genetics it is mainly used for cluster analysis: based on the degree of SNP difference between individual genomes, it clusters individuals into subgroups according to their trait characteristics and is used for cross-validation with other methods.
Principal component analysis was performed on each population to obtain its clustering, further verifying the differentiation among populations found in the evolutionary tree analysis. The results are shown in FIG. 6.
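The following sketch shows the core of PCA on a genotype matrix using only the standard library: SNP columns are centered, the sample-by-sample Gram matrix is formed, and power iteration recovers its leading eigenvector, whose entries are the first-principal-component scores up to scaling. The function names and the toy genotypes are assumptions; production analyses normally rely on dedicated software.

```python
import math
import random


def center_columns(matrix):
    """Subtract each SNP's mean genotype from its column."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[x - m for x, m in zip(row, means)] for row in matrix]


def first_pc_scores(genotypes, iterations=200, seed=0):
    """Unit-norm first-PC scores per sample, obtained by power iteration."""
    x = center_columns(genotypes)
    n = len(x)
    # Gram matrix: similarity between every pair of samples.
    gram = [[sum(a * b for a, b in zip(x[i], x[j])) for j in range(n)]
            for i in range(n)]
    rng = random.Random(seed)
    v = [rng.random() - 0.5 for _ in range(n)]
    for _ in range(iterations):
        w = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0:          # no genetic variation at all
            return [0.0] * n
        v = [c / norm for c in w]
    return v


# Hypothetical genotypes: rows are dogs, columns are SNPs (0/1/2 coding).
toy = [
    [0, 0, 2, 2],
    [0, 1, 2, 2],
    [2, 2, 0, 0],
    [2, 2, 0, 1],
]
scores = first_pc_scores(toy)
```

In this toy matrix the first two dogs and the last two dogs receive scores of opposite sign, which is the kind of separation a PCA plot such as FIG. 6 visualises.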
3.4.6 Linkage disequilibrium (LD)
The linkage disequilibrium decay (LD decay) pattern is an important element of population genetics research and the basis of association analysis. Articles on genetic evolution and GWAS generally present an LD decay plot. This analysis was performed with PopLDdecay (v3.40), and the LD decay curve of each group is shown in FIG. 7.
An LD decay plot shows how the average LD coefficient between molecular markers on the genome decreases as the distance between markers increases. The general calculation is to compute the LD coefficient between every pair of markers on the genome, group the coefficients by the distance between markers, and then calculate the average LD coefficient between markers at each distance. The abscissa is the physical distance (kb) and the ordinate is the LD coefficient (r^2).
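The calculation principle described above (pairwise LD coefficient, grouping by distance, averaging per group) can be sketched as follows. Here r^2 is approximated by the squared Pearson correlation of unphased genotype dosages, which is a simplifying assumption and not necessarily how PopLDdecay computes it; positions, bin size and genotypes are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype-dosage vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:      # monomorphic marker: r^2 undefined, treated as 0
        return 0.0
    return cov * cov / (vx * vy)


def ld_decay(positions, genotypes, bin_size=1000):
    """Average r^2 of all marker pairs, grouped by physical distance bin."""
    bins = {}
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            distance = abs(positions[j] - positions[i])
            bins.setdefault(distance // bin_size, []).append(
                r_squared(genotypes[i], genotypes[j]))
    return {b * bin_size: sum(v) / len(v) for b, v in sorted(bins.items())}


# Hypothetical markers on one chromosome: positions in bp, dosages per dog.
positions = [100, 600, 5100]
genotypes = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 2],
    [2, 0, 1, 1, 2, 0],
]
decay = ld_decay(positions, genotypes)
```

Plotting the dictionary keys (distance) against the values (mean r^2) gives a decay curve of the kind shown in FIG. 7.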
3.5 Genome-wide association study (GWAS)
A genome-wide association study (GWAS) detects genetic variation (polymorphisms) across the whole genome in many individuals, determines their genotypes, and performs association analysis with phenotypic data for the trait of interest to identify candidate genes or genomic regions related to the target trait.
3.5.1 Genome-wide association analysis of complex traits using GEMMA
GEMMA (Genome-wide Efficient Mixed Model Association) is GWAS software based on a linear mixed model. Compared with other software based on linear mixed models, it has the following advantages: (1) fast: it is much faster than other algorithms (EMMA and FaST-LMM); (2) accurate: EMMAX and GAPIT both fix the variance components estimated under the null model to improve speed, which is less accurate than GEMMA; (3) convenient: data in PLINK binary format can be used directly without complicated format conversion; (4) fully featured: single-marker GWAS, multi-marker GWAS and multi-trait GWAS analyses can all be performed. The results are shown in Table 4, FIG. 8 and FIG. 9. Note: in the Manhattan plot, the height of a locus on the Y axis, -log10(p-value), corresponds to the degree of association between the SNP locus and the phenotypic trait or disease; the stronger the association (i.e., the lower the p-value), the higher the point.
TABLE 4 Whole genome Association analysis results for each trait Table (parts)
The file contains 14 columns of results, with the following meanings:
chr: number of the chromosome on which the SNP is located; ps: physical position of the SNP; n_miss: number of individuals with a missing genotype at the SNP; allele1: minor allele; allele0: major allele; af: frequency of the SNP; beta: SNP effect size; se: standard error of the beta estimate; l_remle: REML estimate of lambda for the SNP effect; p_wald: P value from the Wald test; p_lrt: P value from the likelihood ratio test; p_score: P value from the score test.
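As an illustration of how such a result table feeds the Manhattan and Q-Q plots, the sketch below reads a tab-separated table containing the chr, ps and p_wald columns, converts each P value to -log10(p), and pairs the sorted observed values with expected quantiles. The three rows are invented for the example, and the expected-quantile formula (i + 0.5)/n is one common plotting convention, assumed here.

```python
import csv
import io
import math


def load_assoc(text):
    """Parse a tab-separated association table and add -log10(p_wald)."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text), delimiter="\t"):
        p = float(rec["p_wald"])
        rows.append({
            "chr": rec["chr"],
            "ps": int(rec["ps"]),
            "p_wald": p,
            "neg_log10_p": -math.log10(p),   # Y value in the Manhattan plot
        })
    return rows


def qq_points(rows):
    """Pair expected and observed -log10(p) values for a Q-Q plot."""
    observed = sorted((r["neg_log10_p"] for r in rows), reverse=True)
    n = len(observed)
    expected = [-math.log10((i + 0.5) / n) for i in range(n)]
    return list(zip(expected, observed))


# Hypothetical three-SNP table (values invented for illustration).
table = "\n".join([
    "\t".join(["chr", "ps", "p_wald"]),
    "\t".join(["1", "1000", "1e-6"]),
    "\t".join(["1", "2000", "0.01"]),
    "\t".join(["2", "3000", "0.5"]),
])
assoc = load_assoc(table)
points = qq_points(assoc)
top_snp = max(assoc, key=lambda r: r["neg_log10_p"])
```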
Note: the height of the locus-log 10 (p-value) of the gene in the manhattan map on the Y axis corresponds to the degree of association of the SNP locus with a phenotypic trait or a disease, and the stronger the association (i.e., the lower the p-value), the higher the association. Generally, SNPs around strongly associated sites will also show similar signal intensities due to Linkage Disequilibrium (LD) relationships, and peak positions are also generally of real interest for research. Q-Q plot is mainly used to estimate the difference between observed and predicted values of quantitative traits. 3.5.2 site annotation Using ANNOVAR software
The SNP sites were annotated using the annotation software ANNOVAR. The result file variant_function.xls records the genes and positions of all variants and, for variants in exon regions, annotates in detail the variant function, type, amino acid change, etc.
TABLE 5 statistical table of ANNOVAR annotation results
4. Results and discussion
The invention uses SLAF-seq to carry out a genome-wide association analysis of the courage trait in police dogs in order to find SNP sites associated with that trait. In total, 24 SNP sites associated with courage were identified. Based on the genes lying within 100 kb upstream and downstream of each SNP position, 5 SNP sites were found to be associated with genes. These 5 SNPs are located inside or near the following 14 genes:
LOC10659757; TRIM2; MND1; LOC106559787; LOC111093549; LOC111093557;
LOC102151701; CGNL1; MYZAP; KIF5C; LYPD6B; RUBCN; MUC20; MUC4.
It is predicted that these 14 genes may be related to the courage trait; see Table 6.
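The 100 kb window lookup described above can be sketched as a simple interval-overlap test. This is a minimal sketch under stated assumptions: the gene names and coordinates below are invented placeholders, and `genes_near` is a hypothetical helper, not the annotation procedure actually used in the analysis.

```python
WINDOW = 100_000  # 100 kb flank on each side of the SNP, as stated in the text

# Hypothetical gene intervals: (name, chromosome, start, end). Coordinates are invented.
GENES = [
    ("GENE_A", "1", 500_000, 520_000),
    ("GENE_B", "1", 700_000, 750_000),
    ("GENE_C", "2", 100_000, 150_000),
]

def genes_near(chrom, pos, genes=GENES, window=WINDOW):
    """Return names of genes on `chrom` overlapping [pos - window, pos + window]."""
    lo, hi = pos - window, pos + window
    return [name for name, g_chr, start, end in genes
            if g_chr == chrom and start <= hi and end >= lo]

print(genes_near("1", 600_000))  # SNP lying between GENE_A and GENE_B
print(genes_near("2", 400_000))  # SNP with no gene inside the window
```

A single SNP can fall within the window of several genes, which is consistent with 5 SNP sites mapping to 14 genes.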
TABLE 6 SNP-associated Gene information
Statistical analysis of the GO functions of the 14 genes in the associated interval found 2 genes whose functions may be related to courage. Gene22425 (KIF5C) is involved in ATP binding (GO:0005524) and in the kinesin complex (GO:0005871). Gene30799 (CGNL1) is involved in motor activity (GO:0003774) and may directly influence courage.
According to the KEGG enrichment analysis, only two functions are significantly enriched: the ribosome function and the dopaminergic synapse function. The literature indicates that dopamine has multiple effects: (1) it excites the heart, directly stimulating β1 receptors and promoting the release of noradrenaline (NA) from nerve endings; (2) it promotes vasoconstriction, stimulating α receptors and constricting blood vessels of the skin, mucous membranes, and bone; (3) it promotes renal function, increasing renal blood flow and the renal filtration rate and directly inhibiting tubular reabsorption of Na+, which leads to sodium excretion and diuresis. The enriched functions of the associated genes are closely related to the trait.
Guided by the KEGG enrichment result, the dopaminergic synapse function (KO04728) in the KEGG pathway was analyzed; the corresponding gene, Gene22425 (KIF5C), further indicates that this gene may control the courage trait.
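The text does not say which statistical test underlies the enrichment analysis. A common choice for pathway enrichment is a one-sided hypergeometric test, and the sketch below shows that calculation under this assumption. The function name `hypergeom_enrichment_p` and all counts in the example call are invented for illustration and are not results of the analysis.

```python
from math import comb

def hypergeom_enrichment_p(n_background, n_in_pathway, n_selected, n_hits):
    """P(X >= n_hits) when drawing n_selected genes without replacement
    from n_background genes, of which n_in_pathway belong to the pathway."""
    total = comb(n_background, n_selected)
    upper = min(n_in_pathway, n_selected)
    tail = sum(
        comb(n_in_pathway, k) * comb(n_background - n_in_pathway, n_selected - k)
        for k in range(n_hits, upper + 1)
    )
    return tail / total

# Invented counts: 20,000 background genes, 130 in the pathway,
# 14 candidate genes, 2 of which fall in the pathway.
p = hypergeom_enrichment_p(20_000, 130, 14, 2)
print(f"enrichment p-value: {p:.4g}")
```

A small p-value means the candidate gene set contains more pathway members than would be expected by chance from the background.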
5. Conclusion
Based on the statistical analysis of the GO functions of the 14 genes in the associated interval, SNP variants of 2 genes, CGNL1 and KIF5C, are related to the courage of German shepherd dogs and may directly influence it.
Example 2 KIF5C detection to screen qualified police dogs with good courage
Using a sequencing method, 10 German shepherd dogs were tested for KIF5C, and 8 police dogs carrying a KIF5C gene variation were screened out. The 8 dogs were then assessed by conventional means, namely 4 months of identification, tracking, and public security and precaution training, and all performed well in the industry assessment, which shows that the method is reliable.
The results are shown in FIG. 10, which demonstrates the feasibility and reliability of the method of the present invention.
Example 3 KIF5C detection to screen qualified police dogs with good courage
Using a sequencing method, 22 German shepherd dogs were tested for KIF5C, and 18 police dogs carrying a KIF5C gene variation were screened out. The 18 dogs were then assessed by conventional means, namely 4 months of identification, tracking, and public security and precaution training, and all performed well in the industry assessment, which shows that the method is reliable.
The results are shown in FIG. 11, which demonstrates that the method of the present invention is feasible and reliable.
Example 4 CGNL1 detection to screen qualified police dogs
As in Example 2, a sequencing method was used to test 68 German shepherd dogs for CGNL1, and 56 dogs carrying a CGNL1 gene variation were screened out. These dogs were then assessed by conventional means according to the police dog breeding rules issued by the police department, with the following assessment criteria: on a leash, the dog can weave through a group of more than 6 people and tolerates being led and touched by strangers; off the leash, it is not frightened by gunfire or firecrackers set off 50 meters away, nor by vehicles and pedestrians; and it adapts well to the environment. All 56 dogs were rated excellent, which shows that the method is reliable.
The results of CGNL1 detection to screen qualified police dogs with good courage are shown in FIG. 12, which demonstrates that the method is feasible and reliable.
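The counts reported in Examples 3 and 4 can be summarized with a little arithmetic. The sketch below is only a worked illustration: the `summarize` helper is hypothetical, and the two rates it prints are derived quantities that the examples do not state themselves.

```python
# Counts as reported in Examples 3 and 4:
# (dogs sequenced, variant carriers, carriers rated good or excellent)
EXAMPLES = {
    "Example 3 (KIF5C)": (22, 18, 18),
    "Example 4 (CGNL1)": (68, 56, 56),
}

def summarize(tested, carriers, passed):
    """Return (carrier rate among tested dogs, pass rate among carriers)."""
    return carriers / tested, passed / carriers

for name, counts in EXAMPLES.items():
    carrier_rate, pass_rate = summarize(*counts)
    print(f"{name}: carrier rate {carrier_rate:.1%}, "
          f"pass rate among carriers {pass_rate:.1%}")
```

The examples report assessment outcomes only for carrier dogs, so quantities that need the non-carrier outcomes, such as sensitivity or specificity, cannot be computed from these counts.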
Claims (2)
1. A police performance detection method for police dogs, characterized in that gene sequencing is used to determine whether a German shepherd dog carries a variant CGNL1 and/or KIF5C gene.
2. The method of claim 1, wherein the CGNL1 is Gene30799 and the KIF5C is Gene22425.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211491466.7A CN115786479A (en) | 2022-11-25 | 2022-11-25 | Police performance detection method for police dog |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115786479A (en) | 2023-03-14 |
Family
ID=85441549
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211491466.7A Pending CN115786479A (en) | 2022-11-25 | 2022-11-25 | Police performance detection method for police dog |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115786479A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103993091A (en) * | 2014-05-30 | 2014-08-20 | 公安部警犬技术学校 | Method for precisely determining smell sensitivity of German shepherd on DNA level |
CN113151508A (en) * | 2021-05-25 | 2021-07-23 | 云南大学 | Biomarkers, kits and methods for identifying dogs having compliant behavior |
CN113699255A (en) * | 2020-09-02 | 2021-11-26 | 北京中科昆朋生物技术有限公司 | Biomarker, kit and method for identifying dogs with aggressive behavior |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20230314 |