CN112086127B - Group genetic difference comparison method based on mutation function - Google Patents
- Publication number
- CN112086127B
- Authority
- CN
- China
- Prior art keywords
- gene
- mutation
- site
- population
- certain
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/30—Detection of binding sites or motifs
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
Abstract
The invention discloses a method for comparing genetic variation between populations with respect to functionally important mutations. Mutation function is first evaluated with prediction software or with an experimental functional verification test to obtain a set of functional mutations, and these functional mutations are given a higher weight value. Genotype information for each individual in a population is then obtained with next-generation sequencing or with a genotyping technology such as an SNP chip, and the gene frequency at each polymorphic site in the population is calculated. The method raises the unit of genetic comparison from a single polymorphic site to a single gene, so that the degree of difference across all functional mutations of a gene can be compared between populations. This makes it possible to predict differences in gene-related phenotypes between populations and to guide genetic research on those genes in different populations.
Description
Technical Field
The invention relates to a method for comparing population genetic differences based on mutation function. The method calculates the between-population difference in mutation frequency at mutation sites that potentially affect gene function, and uses that difference to evaluate how gene function differs between populations. It belongs to the technical field of bioinformatics.
Background
As next-generation sequencing has become widespread, genetic data are being generated in ever larger quantities. In this big-data context, more and more research compares genetic material between populations. At present, genes are compared between populations by directly comparing single mutation sites. However, the sequence of a gene often contains thousands of base pairs, and the mutation frequency difference at one or a few sites is not enough to describe how the gene as a whole differs between populations.
Moreover, because not every mutation site in a gene affects the gene's function, it is not appropriate to treat all mutation sites alike. In genetic studies, determining how genes differ between populations helps to screen for genes that differ significantly between them. Selectively carrying out functional studies on the sites of a gene that differ most can then help geneticists find population-specific functional mutation sites more quickly and accurately.
In summary, establishing a high-throughput, simple and specific means of evaluating gene differences between populations can improve the screening efficiency of genetic research and identify the genes that differ most between populations, so that these genes can be studied first in the candidate populations.
Disclosure of Invention
The invention aims to close the gap between existing comparison methods and what population comparison requires. It provides a method that evaluates differences in gene function between populations by calculating the between-population difference in mutation frequency at mutation sites that potentially affect gene function.
To achieve this purpose, the invention provides the following technical schemes.
The first technical scheme of the invention is as follows:
A method for comparing the difference score of each gene in different populations, comprising the following steps:
(1) Assign a weight value, such as 1, to each mutation site judged to be functional, and a weight value of 0 to each mutation site judged not to significantly affect gene function;
(2) Detect the genetic information of each individual and record the individual's genotype information; individuals with poor typing quality or failed typing are excluded from the subsequent analysis;
(3) From the typing results of population a, count the frequency F_a of a mutation site on a single gene, and record the genotype frequency of the same site in the population b to be compared as F_b;
(4) Repeat the statistics of step (3) for all mutation sites of all genes; taking the gene as the unit, calculate the differences F_a - F_b and sort them from largest to smallest to obtain G_ij, where i is the number of a gene and j is the number of a site on that gene, so that G_i1 is the maximum value of F_a - F_b;
(5) For the sites among all G_ij whose weight value is 1, calculate the value S_{iG=1}, the mutation frequency sum of all important mutation sites on gene i, where G is the weight value, j_1 is the number of a G = 1 mutation site on gene i (unlike the number j, j_1 numbers only the G = 1 mutation sites), and n_1 is the number of all G = 1 mutation sites on gene i:
[formula not reproduced in this text]
(6) For the sites among all G_ij whose weight value is 0, calculate the value S_{iG=0}, the sum of the number of all non-important mutation sites on gene i, where G is the weight value and n_0 is the number of all G = 0 mutation sites on gene i:
[formula not reproduced in this text]
(7) Calculate the mutation score S_ij of each mutation site on gene i, where i is the number of a gene, j is the number of a site on that gene, j_1 is the number of a G = 1 mutation site on gene i (unlike j, j_1 numbers only the G = 1 mutation sites), q_1 is the number of the nearest G = 1 mutation site before site j on gene i (unlike j, q_1 numbers only the G = 1 mutation sites), and q_0 is the number of the nearest G = 0 mutation site before site j on gene i (unlike j, q_0 numbers only the G = 0 mutation sites):
[formula not reproduced in this text]
(8) Calculate the score S_i of the gene, where i is the number of a gene, j is the number of a site on that gene, and n is the number of all mutation sites on gene i:
[formula not reproduced in this text]
A positive S_i indicates that the functional mutations of the gene affect it to a greater extent in population a than in population b; a negative S_i indicates that the functional mutations of the gene affect it to a smaller extent in population a than in population b;
(9) If the typing results of population a (mutation site frequency F_a) are to be compared with another population c (mutation site frequency F_c), calculate F_a - F_c in step (4) instead and obtain G_ij; steps (5) to (8) are unchanged. Programming the formulas on a computer allows the gene difference values between many populations to be compared in batches.
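Steps (1) to (4) and the counts used in steps (5) to (7) can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function name, the record keys (`gene`, `site`, `weight`, `f_a`, `f_b`) and the output layout are hypothetical, and the running counts are taken to include site j itself, which is one reading of the definitions of q_1 and q_0. The sketch stops at the ranking and the counts and does not compute the sums and scores of steps (5) to (8).

```python
from collections import defaultdict


def rank_sites_by_gene(sites):
    """Rank the sites of each gene by F_a - F_b, largest difference first.

    `sites` is an iterable of dicts with the hypothetical keys
    "gene", "site", "weight" (1 or 0), "f_a" and "f_b".
    """
    by_gene = defaultdict(list)
    for s in sites:
        by_gene[s["gene"]].append(
            {"site": s["site"], "weight": s["weight"], "g": s["f_a"] - s["f_b"]}
        )

    ranked = {}
    for gene, rows in by_gene.items():
        # Step (4): sort so that j = 1 carries the largest F_a - F_b.
        rows.sort(key=lambda r: r["g"], reverse=True)
        q1 = q0 = 0
        for j, row in enumerate(rows, start=1):
            # Running counts of weight-1 and weight-0 sites up to and
            # including site j (assumed reading of q_1 and q_0).
            if row["weight"] == 1:
                q1 += 1
            else:
                q0 += 1
            row.update(j=j, q1=q1, q0=q0)
        # After the loop the running counts equal the totals n_1 and n_0.
        ranked[gene] = {"sites": rows, "n1": q1, "n0": q0}
    return ranked
```

Comparing population a with a third population c only requires passing F_c in place of F_b, which mirrors step (9).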
The second technical scheme of the invention is as follows:
A population genetic difference comparison method based on mutation function, comprising the following steps:
(I) Evaluate the importance of each mutation site and whether it may affect the function of the gene.
(II) Detect the genetic information of each individual and record the individual's genotype information.
(III) Compile the genotype frequencies of the individual samples of the different populations. Where the genetic data of different populations were obtained by different sequencing methods, the gene loci that were not detected must be marked.
(IV) Taking the gene as the unit, calculate the difference at each site between the two populations and sort the sites by their mutation frequency difference.
(V) Combine the functional information of each mutation with the mutation frequency difference at its site to calculate the difference score of each gene between populations, so that the degree of difference in functional mutations on each gene can be compared between populations.
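The frequency statistics of step (III) can be illustrated with a short Python sketch. It assumes diploid, biallelic sites with each genotype coded as the number of mutant alleles (0, 1 or 2) and `None` for a failed call; this coding and the function name are illustrative assumptions, not something the text specifies.

```python
def mutation_frequency(genotypes):
    """Mutant-allele frequency at one site in one population.

    `genotypes` holds one entry per individual: 0, 1 or 2 mutant alleles,
    or None when typing failed (assumed coding, diploid and biallelic).
    Returns None when no individual was successfully typed, so that an
    undetected site stays distinguishable from a frequency of 0.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    # Each successfully typed individual contributes two alleles.
    return sum(called) / (2 * len(called))
```

For example, `mutation_frequency([0, 1, 2, None, 1])` ignores the failed call and returns 0.5, while a site with no successful calls returns `None` and can be marked as undetected.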
Preferably, the methods used in step (I) to evaluate the importance of a mutation site and whether it may affect the function of the gene include, but are not limited to, molecular biology experiments, animal models, existing gene association studies and functional prediction software.
Preferably, the methods used in step (II) to detect the genetic information of an individual include, but are not limited to, next-generation sequencing and genotyping technologies such as the Mass-Array nucleic acid mass spectrometry system and SNP chips.
Preferably, the undetected gene loci in step (III) are those that arise when the coverage of detected sites differs between the two populations; only the common mutations that can be detected in both populations are considered.
Preferably, when the gene is taken as the unit in step (IV), gene ranges are delimited according to the current version of the genome. Reference databases include, but are not limited to, public databases such as NCBI and Ensembl. The delimitation of gene intervals may change with version updates.
Preferably, the calculation in step (V) that combines the functional information of each mutation with the mutation frequency difference at its site is carried out as set out in claim 1.
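The restriction to commonly detected sites described for step (III) can be sketched as below. The dictionaries mapping a site identifier to a frequency, the use of `None` to mark an undetected site, and the function name are all hypothetical choices made for illustration.

```python
def common_site_differences(freq_a, freq_b):
    """Return F_a - F_b for the sites detected in both populations.

    `freq_a` and `freq_b` map a site identifier to a mutation frequency,
    or to None when the site was not detected in that population
    (hypothetical layout). Sites missing or undetected in either
    population are dropped rather than treated as frequency 0.
    """
    shared = [
        site
        for site in freq_a
        if site in freq_b
        and freq_a[site] is not None
        and freq_b[site] is not None
    ]
    return {site: freq_a[site] - freq_b[site] for site in shared}
```

Dropping such sites, instead of filling in 0, avoids creating an artificial frequency difference where one platform simply did not cover the site.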
The invention provides a formula design for the high-throughput calculation of gene scores across populations. Calculation software can be developed from these formulas and used as a tool for comparing the degree of genetic difference in population genetics. At present, the degree of genetic difference between populations is mostly derived by comparing mutation frequency differences at single mutation sites, and simply averaging over all mutations is not appropriate. The invention addresses this by providing an analysis method that compares the degree of population genetic difference with the gene as the unit, offering a new approach to comparing the genetic characteristics of populations.
Drawings
FIG. 1 is a flow chart of the method of the invention as carried out in practice.
Detailed description of the preferred embodiments
The invention relates to a method for comparing population genetic differences based on mutation function. The method compares the between-population difference in mutation frequency at mutation sites that potentially affect gene function, and uses that difference to evaluate how gene function differs between populations. It belongs to the technical field of bioinformatics.
Taking as an example the comparison of the gene ABCA1 between population A and population B (or population C), the population genetic difference comparison method based on mutation function comprises the following steps:
(I) Use the prediction software SIFT and PROVEAN to evaluate the importance of the mutation sites on the ABCA1 gene and whether they may affect the function of the gene, obtaining a damage score for each mutation.
(II) Obtain the exon-region genetic information of population B (PoB) and population C (PoC) from the exome database ExAC and record their genotype information.
(III) Compile the genotype frequencies of the PoB and PoC populations. Because exome sequencing rarely covers introns and intergenic regions, and because the functional impact of synonymous mutations is difficult to predict with SIFT and PROVEAN, only missense mutations are analyzed in this example; synonymous mutations are all marked and excluded from the analysis.
(IV) Classify the function of the missense mutations on the ABCA1 gene according to the cutoff values predicted by SIFT and PROVEAN (SIFT deleteriousness cutoff: 0.05; PROVEAN deleteriousness cutoff: -2.5).
(V) Assign a weight value of 1 to each mutation site judged to be functional by SIFT or PROVEAN, and a weight value of 0 to each mutation site judged not to significantly affect gene function.
(VI) Detect the genetic information of the population under test and record its genotype information. Individuals with poor typing quality or failed typing are excluded from the subsequent analysis. In this example, the mutation frequencies of population A (PoA) in the exome database ExAC are used in place of the quality-controlled mutation frequencies of a population under test.
(VII) Record the frequency of a mutation site on the ABCA1 gene in the typing results of population A (PoA) as F_a, and the genotype frequency of the same site in the population B (PoB) to be compared as F_b.
(VIII) Repeat the calculation of step (VII) for all mutation sites on the ABCA1 gene. If more genes need to be calculated, repeat steps (VII) and (VIII) for each gene in turn. Calculate the differences F_a - F_b and sort them from largest to smallest to obtain G_ij, where i is the number of a gene (in this example the ABCA1 gene is number 1) and j is the number of a site on the ABCA1 gene. In this example, the missense mutation of the ABCA1 gene with the greatest difference in mutation frequency between the PoA and PoB populations (chromosome 9, physical position 107588033, base C before mutation, base T after mutation) is site number 1.
(IX) For the sites with a weight value of 1 on the ABCA1 gene, calculate the value S_{1G=1}, the mutation frequency sum of all important mutation sites, where j_1 is the number of a G = 1 mutation site on the ABCA1 gene (unlike the number j, j_1 numbers only the G = 1 mutation sites). In this example n_1, the number of all G = 1 mutation sites on the ABCA1 gene, that is, the number of all potentially deleterious mutations, is 713:
[formula not reproduced in this text]
(X) At the same time, for the sites with a weight value of 0 on the ABCA1 gene (gene number 1), calculate the value S_{1G=0}, the sum of the number of all non-important mutation sites. In this example n_0, the number of all G = 0 mutation sites on the ABCA1 gene, that is, the number of all mutations predicted to be harmless, is 607:
[formula not reproduced in this text]
(XI) Calculate the mutation score S_1j of each mutation site on the ABCA1 gene. In S_1j, 1 is the number of the ABCA1 gene and j is the number of a site on the ABCA1 gene; j_1 is the number of a G = 1 mutation site on the ABCA1 gene (unlike j, j_1 numbers only the G = 1 mutation sites); q_1 is the number of the nearest G = 1 mutation site before site j on the ABCA1 gene (unlike j, q_1 numbers only the G = 1 mutation sites); and q_0 is the number of the nearest G = 0 mutation site before site j on the ABCA1 gene (unlike j, q_0 numbers only the G = 0 mutation sites). Taking the missense mutation at j = 10 (chromosome 9, physical position 107593982, base T before mutation, base C after mutation) as an example: of the 10 missense mutations up to and including the 10th, 2 are important missense mutations, so q_1 = 2, and 8 are non-important missense mutations, so q_0 = 8:
[formula not reproduced in this text]
(XII) Calculate the score S_1 of the ABCA1 gene, where 1 is the number of the ABCA1 gene, j is the number of a site on the gene, and n is the number of all mutation sites on the ABCA1 gene, which is 1320 in this example:
[formula not reproduced in this text]
A positive S_1 indicates that the functional mutations of the gene affect it to a greater extent in the PoA population than in the PoB population; a negative S_1 indicates that the functional mutations of the gene affect it to a smaller extent in the PoA population than in the PoB population. In this example S_1 is negative, indicating that the ABCA1 gene is affected by functional mutations to a smaller degree in the PoA population than in the PoB population.
(XIII) If the typing results (PoA population, mutation site frequency F_a) need to be compared with a second comparison population (PoC population, mutation site frequency F_c), calculate F_a - F_c in step (VIII) instead and obtain a new G_ij; steps (IX) to (XII) are unchanged. Programming the formulas on a computer allows the gene difference values between many populations to be compared in batches.
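The weighting rule of steps (IV) and (V) can be sketched in Python as follows. The cutoff values 0.05 and -2.5 come from the text. The direction of each comparison is an assumption based on the usual convention of the two tools (scores at or below the cutoff are read as deleterious), and the function and constant names are hypothetical.

```python
SIFT_CUTOFF = 0.05      # cutoff quoted in step (IV)
PROVEAN_CUTOFF = -2.5   # cutoff quoted in step (IV)


def functional_weight(sift_score, provean_score):
    """Return weight 1 if either tool flags the mutation, else 0.

    Assumes that a score at or below the cutoff means "deleterious" for
    both tools. A missing score (None) is treated as "not flagged",
    which is also an assumption of this sketch.
    """
    flagged_by_sift = sift_score is not None and sift_score <= SIFT_CUTOFF
    flagged_by_provean = (
        provean_score is not None and provean_score <= PROVEAN_CUTOFF
    )
    # Step (V): judged functional by SIFT or PROVEAN gives weight 1.
    return 1 if (flagged_by_sift or flagged_by_provean) else 0
```

Because the rule is a logical OR, a missense mutation with a benign SIFT score of 0.5 still receives weight 1 if its PROVEAN score is -3.0.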
The above example is only a specific embodiment of the invention. The invention is obviously not limited to this embodiment, and modifications of the formulas also fall within the scope of protection of the invention.
Claims (6)
1. A method for comparing the difference score of each gene in different populations, characterized by comprising the following steps:
(1) Assign a weight value of 1 to each mutation site judged to be functional, and a weight value of 0 to each mutation site judged not to significantly affect gene function;
(2) Detect the genetic information of each individual and record the individual's genotype information; individuals with poor typing quality or failed typing are excluded from the subsequent analysis;
(3) From the typing results of population a, count the frequency F_a of a mutation site on a single gene, and record the genotype frequency of the same site in the population b to be compared as F_b;
(4) Repeat the statistics of step (3) for all mutation sites of all genes; taking the gene as the unit, calculate the differences F_a - F_b and sort them from largest to smallest to obtain G_ij, where i is the number of a gene and j is the number of a site on that gene, so that G_i1 is the maximum value of F_a - F_b;
(5) For the sites among all G_ij whose weight value is 1, calculate the value S_{iG=1}, the mutation frequency sum of all important mutation sites on gene i, where G is the weight value, j_1 is the number of a G = 1 mutation site on gene i (unlike the number j, j_1 numbers only the G = 1 mutation sites), and n_1 is the number of all G = 1 mutation sites on gene i:
[formula not reproduced in this text]
(6) For the sites among all G_ij whose weight value is 0, calculate the value S_{iG=0}, the sum of the number of all non-important mutation sites on gene i, where G is the weight value and n_0 is the number of all G = 0 mutation sites on gene i:
[formula not reproduced in this text]
(7) Calculate the mutation score S_ij of each mutation site on gene i, where i is the number of a gene, j is the number of a site on that gene, j_1 is the number of a G = 1 mutation site on gene i (unlike j, j_1 numbers only the G = 1 mutation sites), q_1 is the number of the nearest G = 1 mutation site before site j on gene i (unlike j, q_1 numbers only the G = 1 mutation sites), and q_0 is the number of the nearest G = 0 mutation site before site j on gene i (unlike j, q_0 numbers only the G = 0 mutation sites):
[formula not reproduced in this text]
(8) Calculate the score S_i of the gene, where i is the number of a gene, j is the number of a site on that gene, and n is the number of all mutation sites on gene i:
[formula not reproduced in this text]
A positive S_i indicates that the functional mutations of the gene affect it to a greater extent in population a than in population b; a negative S_i indicates that the functional mutations of the gene affect it to a smaller extent in population a than in population b;
(9) If the typing results of population a (mutation site frequency F_a) are to be compared with another population c (mutation site frequency F_c), calculate F_a - F_c in step (4) instead and obtain G_ij; steps (5) to (8) are unchanged; the formulas are programmed on a computer so that the gene difference values between many populations can be calculated in batches.
2. A method for comparing population genetic differences based on mutation function, characterized by comprising the following steps:
(I) Evaluate the importance of each mutation site and whether it may affect the function of the gene;
(II) Detect the genetic information of each individual and record the individual's genotype information;
(III) Compile the genotype frequencies of the individual samples of the different populations; where the genetic data of different populations were obtained by different sequencing methods, mark the gene loci that were not detected;
(IV) Taking the gene as the unit, calculate the difference at each site between the two populations and sort the sites by their mutation frequency difference;
(V) Combine the functional information of each mutation with the mutation frequency difference at its site to calculate the difference score of each gene between populations, so that the degree of difference in functional mutations on each gene can be compared between populations in batches, the calculation method specifically comprising the following steps:
(1) Assign a weight value of 1 to each mutation site judged to be functional, and a weight value of 0 to each mutation site judged not to significantly affect gene function;
(2) Detect the genetic information of each individual and record the individual's genotype information; individuals with poor typing quality or failed typing are excluded from the subsequent analysis;
(3) From the typing results of population a, count the frequency F_a of a mutation site on a single gene, and record the genotype frequency of the same site in the population b to be compared as F_b;
(4) Repeat the statistics of step (3) for all mutation sites of all genes; taking the gene as the unit, calculate the differences F_a - F_b and sort them from largest to smallest to obtain G_ij, where i is the number of a gene and j is the number of a site on that gene, so that G_i1 is the maximum value of F_a - F_b;
(5) For the sites among all G_ij whose weight value is 1, calculate the value S_{iG=1}, the mutation frequency sum of all important mutation sites on gene i, where G is the weight value, j_1 is the number of a G = 1 mutation site on gene i (unlike the number j, j_1 numbers only the G = 1 mutation sites), and n_1 is the number of all G = 1 mutation sites on gene i:
[formula not reproduced in this text]
(6) For the sites among all G_ij whose weight value is 0, calculate the value S_{iG=0}, the sum of the number of all non-important mutation sites on gene i, where G is the weight value and n_0 is the number of all G = 0 mutation sites on gene i:
[formula not reproduced in this text]
(7) Calculate the mutation score S_ij of each mutation site on gene i, where i is the number of a gene, j is the number of a site on that gene, j_1 is the number of a G = 1 mutation site on gene i (unlike j, j_1 numbers only the G = 1 mutation sites), q_1 is the number of the nearest G = 1 mutation site before site j on gene i (unlike j, q_1 numbers only the G = 1 mutation sites), and q_0 is the number of the nearest G = 0 mutation site before site j on gene i (unlike j, q_0 numbers only the G = 0 mutation sites):
[formula not reproduced in this text]
(8) Calculate the score S_i of the gene, where i is the number of a gene, j is the number of a site on that gene, and n is the number of all mutation sites on gene i:
[formula not reproduced in this text]
A positive S_i indicates that the functional mutations of the gene affect it to a greater extent in population a than in population b; a negative S_i indicates that the functional mutations of the gene affect it to a smaller extent in population a than in population b;
(9) If the typing results of population a (mutation site frequency F_a) are to be compared with another population c (mutation site frequency F_c), calculate F_a - F_c in step (4) instead and obtain G_ij; steps (5) to (8) are unchanged; the formulas are programmed on a computer so that the gene difference values between many populations can be calculated in batches.
3. The method of claim 2, wherein the methods for evaluating the importance of a mutation site and whether it may affect the function of the gene include, but are not limited to: molecular biology experiments, animal models, existing gene association studies or functional prediction software.
4. The method of claim 2, wherein the methods for detecting the genetic information of an individual include, but are not limited to, next-generation sequencing or a genotyping technology, the genotyping technology including, but not limited to, a Mass-Array nucleic acid mass spectrometry typing system or an SNP chip.
5. The method of claim 2, wherein undetected gene loci arise when the coverage of detected sites differs between the two populations, for example when one population is typed by next-generation sequencing with wide coverage while the other is typed with a Mass-Array nucleic acid mass spectrometry system that can detect only part of the mutation sites; the mutation frequency of an undetected site cannot be taken to be 0 and cannot be compared directly, and such sites should be marked and removed.
6. The method of claim 2, wherein, when the gene is taken as the unit, gene ranges are delimited according to the current version of the genome, and the reference databases include, but are not limited to, the public databases NCBI and Ensembl.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010979785.7A CN112086127B (en) | 2020-09-17 | 2020-09-17 | Group genetic difference comparison method based on mutation function |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010979785.7A CN112086127B (en) | 2020-09-17 | 2020-09-17 | Group genetic difference comparison method based on mutation function |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112086127A (en) | 2020-12-15 |
CN112086127B (en) | 2023-03-10 |
Family
ID=73736844
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010979785.7A Active CN112086127B (en) | 2020-09-17 | 2020-09-17 | Group genetic difference comparison method based on mutation function |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112086127B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117423382B (en) * | 2023-10-21 | 2024-05-10 | 云准医药科技(广州)有限公司 | Single-cell barcode identity recognition method based on SNP polymorphism |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106021994A (en) * | 2016-05-13 | 2016-10-12 | 万康源(天津)基因科技有限公司 | Tumor mutation site screening and mutual exclusion gene mining method |
CN107229841A (en) * | 2017-05-24 | 2017-10-03 | 重庆金域医学检验所有限公司 | A kind of genetic mutation appraisal procedure and system |
CN109493917A (en) * | 2018-09-02 | 2019-03-19 | 上海市儿童医院 | A kind of evil component level calculation method of gene mutation harmfulness predicted value |
CN110931081A (en) * | 2019-11-28 | 2020-03-27 | 广州基迪奥生物科技有限公司 | Biological information analysis method for human monogenic genetic disease detection |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160314245A1 (en) * | 2014-06-17 | 2016-10-27 | Genepeeks, Inc. | Device, system and method for assessing risk of variant-specific gene dysfunction |
Also Published As
Publication number | Publication date |
---|---|
CN112086127A (en) | 2020-12-15 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||