CN111489788A - Deep association kernel learning technology for explaining the genetic relationships of complex diseases - Google Patents

Info

Publication number
CN111489788A
Authority
CN
China
Prior art keywords
module
deep
causal
pathway
regression
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010229815.2A
Other languages
Chinese (zh)
Other versions
CN111489788B (en)
Inventor
邓岳
鲍峰
王勃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University
Priority to CN202010229815.2A
Publication of CN111489788A
Application granted
Publication of CN111489788B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/20 Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/40 Population genetics; Linkage disequilibrium
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/20 Supervised data analysis
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Theoretical Computer Science (AREA)
  • Medical Informatics (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Biology (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Genetics & Genomics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Analytical Chemistry (AREA)
  • Chemical & Material Sciences (AREA)
  • Biomedical Technology (AREA)
  • Physiology (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • Ecology (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioethics (AREA)
  • Epidemiology (AREA)
  • Public Health (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses a deep association kernel learning technique for explaining the genetic relationships of complex diseases, comprising a pathway grouping module, a gene coding module, a convolutional layer module, a kernel machine regression module, and a gradient backpropagation algorithm module. The pathway grouping module groups variants that belong to the same biological pathway; the gene coding module one-hot encodes the alleles in each group of variants; the convolutional layer module identifies causal regions in the alleles and encodes causal loci as deep features; the kernel machine regression module performs regression by a deep learning method, extracts pathway-level features, and computes their statistical significance; and the gradient backpropagation algorithm module optimizes and updates the deep network parameters and feeds them back to the convolutional layer module. The invention overcomes limitations of conventional techniques, enables the detection of complex associations between genes, and enhances the interpretability of GWAS.

Description

Deep association kernel learning technology for explaining the genetic relationships of complex diseases
Technical Field
The invention relates to the technical field of genetic engineering, and in particular to a deep association kernel learning technique for explaining the genetic relationships of complex diseases.
Background
Genetic mutations cause complex diseases in many different ways, and a comprehensive determination of genetic causality can provide valuable insight into the development and treatment of disease. However, existing genome-wide association study (GWAS) methods are typically based on linear assumptions and simple disease models, which limits their generality in discovering complex causal relationships. Meanwhile, with the development of deep learning, deep learning methods are commonly used in genomics as a "black box" tool for problems that conventional techniques cannot solve, but the principles underlying their behaviour remain unexplained.
Because existing GWAS methods rely on linear assumptions and simple disease models, they work well only when a variant has a strong, direct association with the phenotype. In addition, some existing techniques rely on preset genetic models to encode genotypes manually; in practice, however, the genetic effect underlying a disease is unknown and is difficult to model in advance. A genetic-model-free method is therefore needed to model the intrinsic relationship between genes and phenotypes.
Therefore, how to provide a deep association kernel learning technique for explaining the genetic relationships of complex diseases is a problem that those skilled in the art urgently need to solve.
Disclosure of Invention
In view of the above, the present invention provides a deep association kernel learning technique for explaining the genetic relationships of complex diseases.
To achieve this purpose, the invention adopts the following technical solution:
a deep association kernel learning technique for explaining the genetic relationships of complex diseases, comprising: a pathway grouping module and, connected to it in sequence, a gene coding module, a convolutional layer module, a kernel machine regression module, and a gradient backpropagation algorithm module;
the pathway grouping module groups variants in the same biological pathway;
the gene coding module one-hot encodes the alleles in each group of variants;
the convolutional layer module identifies causal regions in the alleles and encodes causal loci as deep features;
the kernel machine regression module performs regression by a deep learning method, extracts pathway-level features, and computes their statistical significance;
and the gradient backpropagation algorithm module optimizes and updates the deep network parameters and feeds them back to the convolutional layer module.
Preferably, the pathway grouping module strengthens the combined effect of the multiple SNPs in a pathway by grouping them into a pathway-level gene group.
Preferably, the genotypes of an SNP include the major homozygote, heterozygote, and minor homozygote genotypes.
Preferably, the convolutional layer module identifies causal regions by increasing the difference in convolution output between causal and non-causal regions and by increasing the similarity between samples carrying causal alleles, which improves the detectability of causal pathways.
Preferably, the kernel machine regression module performs regression by a deep learning method to determine the relationship between genetic causality and complex diseases.
Preferably, the kernel machine regression module includes regression coefficients for environmental factors and genotype features.
Preferably, the gradient backpropagation algorithm module optimizes the deep network parameters using TensorFlow.
Preferably, the kernel machine regression module performs statistical association tests within the SKAT (sequence kernel association test) framework.
Preferably, the gradient backpropagation algorithm module performs backpropagation through multi-instance learning and optimizes all parameters of the SKAT framework end to end.
Compared with the prior art, the invention discloses a deep association kernel learning (DAK) technique for explaining the genetic relationships of complex diseases, which uses the capability of deep learning to automatically infer complex, nonlinear, and diverse causal loci from pathway-level gene sequences. The invention also attempts to open the "black box" of the deep learning method in DAK: besides greatly improving association detection performance, analysis of the model further enhances the interpretability of deep learning models for GWAS and deepens the understanding of the principles of deep learning. By designing a genetic-model-free method that models the intrinsic relationship between genes and phenotypes, the invention overcomes limitations of conventional techniques, enables the detection of complex associations between genes, and enhances the interpretability of GWAS.
Drawings
To illustrate the embodiments of the invention or the prior-art solutions more clearly, the drawings used in describing them are briefly introduced below. The drawings described below are merely embodiments of the invention; a person skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic structural diagram of the invention.
Fig. 2 is a bar chart comparing the error rates of seven methods.
Fig. 3 is the DAK training curve.
Fig. 4a is a bar chart of the genotype effects under the five genetic models.
Fig. 4b is a chart of the power of each method under the five genetic models for different genotypes.
Fig. 4c is a line chart of the power of each method under the five genetic models for different genotypes with 5,000 samples.
Fig. 4d is a line chart of the power of each method under the five genetic models for different genotypes with 3,000 samples.
Fig. 5a is a chart of the power of each method under the five genetic models under polygenic effects.
Fig. 5b is a line chart of the power of each method under the five genetic models under polygenic effects with 5,000 samples.
Fig. 5c is a line chart of the power of each method under the five genetic models under polygenic effects with 3,000 samples.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention discloses a deep association kernel learning technique for explaining the genetic relationships of complex diseases, comprising: a pathway grouping module 1 and, connected to it in sequence, a gene coding module 2, a convolutional layer module 3, a kernel machine regression module 4, and a gradient backpropagation algorithm module 5;
the pathway grouping module 1 groups variants in the same biological pathway;
the gene coding module 2 one-hot encodes the alleles in each group of variants;
the convolutional layer module 3 identifies causal regions in the alleles and encodes causal loci as deep features;
the kernel machine regression module 4 performs regression by a deep learning method, extracts pathway-level features, and computes their statistical significance;
the gradient backpropagation algorithm module 5 optimizes and updates the deep network parameters and feeds them back to the convolutional layer module 3.
To further optimize the above technical solution, the pathway grouping module 1 strengthens the combined effect of the multiple SNPs in a pathway by grouping them into a pathway-level gene group.
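As a rough illustration of the grouping step, the following Python sketch collects SNPs by the pathway they are annotated to. The SNP and pathway identifiers, the `snp_to_pathways` annotation table, and the choice to let one SNP belong to several pathways are all assumptions made for the example; the patent does not specify the annotation source or data layout.

```python
from collections import defaultdict

# Hypothetical SNP-to-pathway annotation; all identifiers are placeholders.
snp_to_pathways = {
    "snp1": ["pathway_A"],
    "snp2": ["pathway_A", "pathway_B"],  # assumed: a SNP may sit in several pathways
    "snp3": ["pathway_B"],
}


def group_snps_by_pathway(annotation):
    """Return {pathway: sorted list of the SNP ids annotated to it}."""
    groups = defaultdict(list)
    for snp, pathways in annotation.items():
        for pathway in pathways:
            groups[pathway].append(snp)
    return {name: sorted(snps) for name, snps in groups.items()}


pathway_groups = group_snps_by_pathway(snp_to_pathways)
```

Each resulting group is what the later modules would treat as one pathway-level unit.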
To further optimize the above technical solution, the genotypes of an SNP include the major homozygote, heterozygote, and minor homozygote genotypes.
To further optimize the above technical solution, the convolutional layer module 3 identifies causal regions by increasing the difference in convolution output between causal and non-causal regions and by increasing the similarity between samples carrying causal alleles.
To further optimize the above technical solution, the kernel machine regression module 4 performs regression by a deep learning method to determine the relationship between genetic causality and complex diseases.
To further optimize the above technical solution, the kernel machine regression module 4 includes regression coefficients for environmental factors and genotype features.
To further optimize the above technical solution, the gradient backpropagation algorithm module 5 optimizes the deep network parameters using TensorFlow.
To further optimize the above technical solution, the kernel machine regression module 4 performs statistical association tests within the SKAT (sequence kernel association test) framework.
To further optimize the above technical solution, the gradient backpropagation algorithm module 5 performs backpropagation through multi-instance learning and optimizes all parameters of the SKAT framework end to end.
In each simulation experiment, datasets are simulated under either the null hypothesis (no causal relationship) or the alternative hypothesis (the disease is caused by different genetic associations). The performance of the different methods is evaluated over 100 replicates using the type I error rate (under the null hypothesis) and the empirical power (under the alternative hypothesis).
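Both evaluation metrics reduce to the same computation: the fraction of replicates in which a method rejects the null hypothesis. The sketch below shows this under assumptions, since the patent states neither the significance threshold nor any p-values; `alpha=0.05` and the example p-value lists are illustrative only.

```python
def rejection_rate(p_values, alpha=0.05):
    """Fraction of replicates whose p-value falls below alpha.

    On null-hypothesis replicates this estimates the type I error rate;
    on alternative-hypothesis replicates it estimates the empirical power.
    The threshold alpha=0.05 is an assumed, conventional choice.
    """
    if not p_values:
        raise ValueError("at least one replicate is required")
    return sum(1 for p in p_values if p < alpha) / len(p_values)


# Made-up p-values for four replicates of each kind (not results from the patent).
null_replicates = [0.62, 0.03, 0.48, 0.91]
alternative_replicates = [0.001, 0.2, 0.04, 0.01]

type_one_error = rejection_rate(null_replicates)
empirical_power = rejection_rate(alternative_replicates)
```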
Type I errors are considered first. When no causal loci are present in any pathway (the null hypothesis), all methods show low error rates, as shown in Fig. 2, and changing the sample size has little impact on the results. Fig. 3 shows the training curve, in which DAK converges within a few iterations.
Next, the disease in the dataset is assumed to be caused by a single common variant. To represent the different ways in which genes cause disease, the alleles at the causal locus are assumed to act under five different genetic models: 1) the additive model, in which the minor homozygous genotype has twice the effect of the heterozygous genotype; 2) the dominant model, in which the heterozygous and minor homozygous genotypes have the same magnitude of effect; 3) the multiplicative model, in which each minor allele multiplies the disease risk; 4) the recessive model, in which only the minor homozygous genotype has an effect; and 5) the heterozygous model, in which only the heterozygous genotype has an effect, as shown in Fig. 4a. Under the most widely used additive disease model, all methods identify pathways containing disease loci with reasonable accuracy, as shown in Fig. 4b. However, the power of all comparison methods drops sharply when the underlying genetic model changes, whereas the DAK technique of the invention maintains reliable performance and the best power under all conditions. In particular, under the relatively difficult recessive model, the accuracy of all comparison methods falls substantially and is far below that of DAK. As also shown in Fig. 4b, when the sample size is increased to 5,000 the power of all methods increases, and DAK remains the best.
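The five genetic models can be written as functions of the number of minor alleles carried at the causal locus. The sketch below is one possible parameterization on a relative-risk scale; the patent describes the models only in words, so the formulas, the function name `relative_risk`, and the effect size `b` are assumptions made for illustration.

```python
MODELS = ("additive", "dominant", "multiplicative", "recessive", "heterozygous")


def relative_risk(model, g, b):
    """Relative risk for g minor alleles (0, 1 or 2) with effect size b.

    Assumed parameterization; a value of 1.0 means no effect.
    """
    if g not in (0, 1, 2):
        raise ValueError("g must be 0, 1 or 2")
    if model == "additive":        # minor homozygote has twice the heterozygote effect
        return 1.0 + g * b
    if model == "dominant":        # one or two minor alleles act alike
        return 1.0 + b if g >= 1 else 1.0
    if model == "multiplicative":  # each minor allele multiplies the risk
        return (1.0 + b) ** g
    if model == "recessive":       # only the minor homozygote is affected
        return 1.0 + b if g == 2 else 1.0
    if model == "heterozygous":    # only the heterozygote is affected
        return 1.0 + b if g == 1 else 1.0
    raise ValueError(f"unknown model: {model}")


# Risk profile over genotypes (0, 1, 2 minor alleles) for an example effect size.
risk_table = {m: [relative_risk(m, g, 0.5) for g in (0, 1, 2)] for m in MODELS}
```

A method that hard-codes the additive profile sees only part of the signal under the recessive or heterozygous profiles, which is the situation the comparison in this paragraph describes.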
Because of their low frequency, rare variants (minor allele frequency < 0.5%) are difficult to find in GWAS. To examine this case, a rare-variant dataset of 5,000 samples is simulated in which the disease is caused by a single rare variant under each of the five genetic models. As shown in Fig. 4c, DAK achieves higher power than the other methods under the recessive and multiplicative genetic models. Moreover, as shown in Fig. 4d, even with only 3,000 samples DAK can find causal rare variants with a power of about 0.8, a task that is difficult for several of the other methods. These experiments illustrate the advantages of the disclosed technique in detecting disease-associated genes.
Most diseases, however, result from the joint action of multiple genes, and identifying combined and mixed effect signals from causal variants in multiple genes has been very difficult. To simulate combined effects, three causal common variants are randomly assigned and phenotypes are generated under the five genetic models; the method of the invention and the other methods are then tested and compared. All methods perform considerably worse than in the single-variant experiments above. Nevertheless, in all multi-variant experiments DAK remains far superior to the other methods and has the most stable performance, as shown in Fig. 5a. The advantage of DAK is even more pronounced when the causal loci are rare variants, as shown in Figs. 5b and 5c.
DAK was also applied to four real datasets covering cancer and mental disease, further demonstrating that DAK can find previously undetected but meaningful pathways. In particular, an interesting causal relationship between schizophrenia and the dilated cardiomyopathy pathway was discovered.
The DAK structure uses the following mathematical model.
For the i-th individual among N samples in total, $y_i$ denotes the phenotype (e.g., disease or control);
Figure BDA0002428935190000061
is an adjustment vector consisting of K covariates (e.g., gender, population stratification, and bias). The genotype of each SNP falls into one of three categories: major homozygote, heterozygote, and minor homozygote. It is therefore natural to represent the genotype of each SNP as a one-hot vector whose non-zero entry indicates the specific genotype. All $l^{(p)}$ SNPs on the p-th pathway of individual i are combined to obtain the corresponding pathway-level genotype matrix
Figure BDA0002428935190000062
After pathway assembly, a total of P pathways is obtained for every experimental sample.
A convolutional layer $\mathrm{Conv}(\cdot \mid \Theta_c)$ with M convolution operators converts each
Figure BDA0002428935190000063
Figure BDA0002428935190000064
where
Figure BDA0002428935190000071
denotes the j-th convolution operator with parameters
Figure BDA00024289351900000710
and $[\cdot]$ denotes the max-pooling operator.
Figure BDA0002428935190000072
denotes all learnable parameters of the convolutional layer.
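To make the convolution and max-pooling step concrete, the sketch below slides M filters along the SNP axis of a 3 x l one-hot genotype matrix and keeps the largest response of each filter. It is a single-layer illustration under assumptions: the filter width, the hand-picked filter weights, and the use of global max pooling are not taken from the patent, where the filters are learned and the exact formula appears only as an image.

```python
def conv_max_pool(matrix, filt):
    """Slide a 3 x w filter along the SNP axis and keep the maximum response."""
    width = len(filt[0])
    n_snps = len(matrix[0])
    responses = []
    for start in range(n_snps - width + 1):
        total = 0.0
        for row in range(3):
            for offset in range(width):
                total += filt[row][offset] * matrix[row][start + offset]
        responses.append(total)
    return max(responses)


def conv_layer(matrix, filters):
    """Apply M filters and return an M-dimensional feature vector."""
    return [conv_max_pool(matrix, f) for f in filters]


# One-hot matrix for genotypes [0, 2, 1, 0]; rows are major homozygote,
# heterozygote, minor homozygote.
example_matrix = [[1, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0]]

# Two illustrative filters of width 2 (in DAK these would be learned).
example_filters = [
    [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0]],  # minor homozygote followed by heterozygote
    [[1.0, 1.0], [0.0, 0.0], [0.0, 0.0]],  # two adjacent major homozygotes
]

features = conv_layer(example_matrix, example_filters)
```

The first filter matches the SNP pair at positions 2 and 3 exactly, so its pooled response is the largest possible for that filter; this is the sense in which a filter can localize a region regardless of where it sits in the pathway.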
From the output obtained after h convolutional layers, a kernel representation of the p-th pathway of the i-th individual is obtained:
Figure BDA0002428935190000073
where $k(\cdot,\cdot)$ is a kernel function and N is the number of samples.
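One way to picture the kernel representation is as row i of an N x N similarity matrix computed from the convolutional features of all N samples for a given pathway. The sketch below assumes a Gaussian (RBF) kernel with `gamma=0.5`; the patent only states that $k(\cdot,\cdot)$ is a kernel function, so the kernel choice and the feature values are illustrative assumptions.

```python
import math


def rbf_kernel(u, v, gamma=0.5):
    """Gaussian (RBF) kernel; an assumed choice, the patent does not name one."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def kernel_matrix(features, kernel=rbf_kernel):
    """N x N similarity matrix; row i is the kernel representation of sample i."""
    return [[kernel(fi, fj) for fj in features] for fi in features]


# Illustrative convolutional features of three samples for one pathway.
pathway_features = [[2.0, 1.0], [2.0, 1.0], [0.0, 1.0]]
similarity = kernel_matrix(pathway_features)
```

Samples 0 and 1 have identical features and therefore similarity 1.0, while sample 2 is less similar to both.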
The pathway-level kernel regression function is defined as:
Figure BDA0002428935190000074
where $\omega = \{\alpha\}$ denotes the learnable regression coefficients for the environmental factors and the genotype features. For individual i, the P pathways yield
Figure BDA0002428935190000075
Labels (disease or non-disease) are available only at the individual level, not at the level of each single pathway. A multi-instance learning loss is therefore used, and the individual-level label of sample i is defined as:
Figure BDA0002428935190000076
This multi-instance loss has a natural interpretation in GWAS: a sample is considered a patient if at least one of its pathways is associated with the disease. The training loss function is defined as:
Figure BDA0002428935190000077
The loss function is optimized in TensorFlow in mini-batches.
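The multi-instance idea can be sketched as follows: each individual is a bag of pathway-level predictions, and the individual-level prediction is the largest of them, matching the rule that one disease-associated pathway is enough. Taking the maximum and scoring it with binary cross-entropy are assumptions for this sketch, since the label and loss formulas appear only as images; the example uses plain Python rather than TensorFlow and performs no gradient update.

```python
import math


def individual_prediction(pathway_probs):
    """Individual-level disease probability: the largest pathway-level one (assumed)."""
    return max(pathway_probs)


def mil_cross_entropy(pathway_probs_per_sample, labels, eps=1e-12):
    """Mean binary cross-entropy on individual-level predictions (assumed loss)."""
    total = 0.0
    for probs, y in zip(pathway_probs_per_sample, labels):
        # Clamp to avoid log(0).
        p = min(max(individual_prediction(probs), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


# Two individuals with two pathways each; labels: 1 = disease, 0 = control.
batch_probs = [[0.1, 0.9], [0.2, 0.1]]
batch_labels = [1, 0]
loss = mil_cross_entropy(batch_probs, batch_labels)
```

In a framework with automatic differentiation, the gradient of such a loss flows only through the maximizing pathway of each individual, which is how individual-level labels can train pathway-level predictors.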
After training, score tests are performed to quantify the statistical significance of each pathway, using the same method as in SKAT [12]. For each pathway P, the score statistic is computed from the kernel similarity matrix
Figure BDA0002428935190000078
as:
$Q_P = (L - Y)^{T} \kappa^{(P)} (L - Y)$
where
Figure BDA0002428935190000079
is the predicted disease state across pathway P in N samples. As described in the SKAT paper, $Q_P$ is compared with a $\chi^2$ distribution to obtain a P value.
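The quadratic form for the score statistic is straightforward to compute once the predictions, observed labels, and kernel similarity matrix are available. The sketch below evaluates only the statistic; turning it into a P value requires the reference distribution described in the SKAT literature and is deliberately left out. The function name and the example numbers are illustrative.

```python
def score_statistic(predicted, observed, kernel):
    """Compute Q = (L - Y)^T K (L - Y) for one pathway.

    predicted: list of N predicted disease states (L)
    observed:  list of N observed labels (Y)
    kernel:    N x N kernel similarity matrix (K) as a list of rows
    """
    residual = [l - y for l, y in zip(predicted, observed)]
    weighted = [sum(k_ij * r_j for k_ij, r_j in zip(row, residual)) for row in kernel]
    return sum(r_i * w_i for r_i, w_i in zip(residual, weighted))


# Two samples with an illustrative kernel matrix.
example_q = score_statistic(
    predicted=[0.5, 0.5],
    observed=[1, 0],
    kernel=[[1.0, 0.5], [0.5, 1.0]],
)
```

With an identity kernel the statistic reduces to the sum of squared residuals; off-diagonal similarity changes how the residuals of similar samples combine.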
The deep association kernel learning (DAK) technique disclosed herein enables automatic causal genotype encoding for GWAS at the pathway level. DAK can detect both common and rare variants with complex genetic effects that existing methods cannot detect. The "black box" of the deep learning model is explained, and the reasons why it greatly improves association detection performance are discussed. When applied to real-world GWAS data, the DAK analysis finds potential causal pathways that may be explained by other biological studies. The invention models the intrinsic relationship between genes and phenotypes, overcomes limitations of conventional techniques, enables the detection of complex associations between genes, and enhances the interpretability of GWAS.
The embodiments in this description are presented progressively: each embodiment focuses on its differences from the others, and the parts they have in common can be understood by cross-reference. Because the device disclosed in an embodiment corresponds to the method disclosed in that embodiment, its description is brief, and the relevant details can be found in the description of the method.
The foregoing description of the disclosed embodiments enables any person skilled in the art to make or use the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. The invention is therefore not limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (9)

1. A deep association kernel learning technique for explaining the genetic relationships of complex diseases, comprising: a pathway grouping module (1) and, connected to it in sequence, a gene coding module (2), a convolutional layer module (3), a kernel machine regression module (4), and a gradient backpropagation algorithm module (5);
the pathway grouping module (1) groups variants in the same biological pathway;
the gene coding module (2) one-hot encodes the alleles in each group of variants;
the convolutional layer module (3) identifies causal regions in the alleles and encodes causal loci as deep features;
the kernel machine regression module (4) performs regression by a deep learning method, extracts pathway-level features, and computes their statistical significance;
the gradient backpropagation algorithm module (5) optimizes and updates the deep network parameters and feeds them back to the convolutional layer module (3).
2. The deep association kernel learning technique for explaining the genetic relationships of complex diseases according to claim 1, wherein the pathway grouping module (1) strengthens the combined effect of the multiple SNPs in a pathway by grouping them into a pathway-level gene group.
3. The deep association kernel learning technique according to claim 2, wherein the genotypes of the SNPs comprise the major homozygote, heterozygote, and minor homozygote genotypes.
4. The deep association kernel learning technique for explaining the genetic relationships of complex diseases according to claim 1, wherein the convolutional layer module (3) identifies causal regions by increasing the difference in convolution output between causal and non-causal regions and by increasing the similarity between samples carrying causal alleles.
5. The deep association kernel learning technique for explaining the genetic relationships of complex diseases according to claim 1, wherein the kernel machine regression module (4) performs regression by a deep learning method to determine the relationship between genetic causality and complex diseases.
6. The deep association kernel learning technique for explaining the genetic relationships of complex diseases according to claim 1, wherein the kernel machine regression module (4) includes regression coefficients for environmental factors and genotype features.
7. The deep association kernel learning technique for explaining the genetic relationships of complex diseases according to claim 1, wherein the gradient backpropagation algorithm module (5) optimizes the deep network parameters using TensorFlow.
8. The deep association kernel learning technique for explaining the genetic relationships of complex diseases according to claim 1, wherein the kernel machine regression module (4) performs statistical association tests within the SKAT framework.
9. The deep association kernel learning technique for explaining the genetic relationships of complex diseases according to claim 1, wherein the gradient backpropagation algorithm module (5) performs backpropagation through multi-instance learning and optimizes all parameters of the SKAT framework end to end.
CN202010229815.2A 2020-03-27 2020-03-27 Deep association kernel learning system for explaining genetic relationship of complex diseases Active CN111489788B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010229815.2A CN111489788B (en) 2020-03-27 2020-03-27 Deep association kernel learning system for explaining genetic relationship of complex diseases

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010229815.2A CN111489788B (en) 2020-03-27 2020-03-27 Deep association kernel learning system for explaining genetic relationship of complex diseases

Publications (2)

Publication Number Publication Date
CN111489788A (en) 2020-08-04
CN111489788B CN111489788B (en) 2022-05-20

Family

ID=71812518

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010229815.2A Active CN111489788B (en) 2020-03-27 2020-03-27 Deep association kernel learning system for explaining genetic relationship of complex diseases

Country Status (1)

Country Link
CN (1) CN111489788B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101370946A (en) * 2005-10-21 2009-02-18 基因信息股份有限公司 Method and apparatus for correlating levels of biomarker products with disease
CN101845501A (en) * 2010-05-18 2010-09-29 孟涛 Comprehensive genetic analysis method of susceptibility of complex diseases
CN107025386A (en) * 2017-03-22 2017-08-08 杭州电子科技大学 A kind of method that gene association analysis is carried out based on deep learning algorithm
CN107341366A (en) * 2017-07-19 2017-11-10 西安交通大学 A kind of method that complex disease susceptibility loci is predicted using machine learning
US20180163261A1 (en) * 2015-02-26 2018-06-14 Asuragen, Inc. Methods and apparatuses for improving mutation assessment accuracy
US20190228838A1 (en) * 2016-09-26 2019-07-25 Mcmaster University Tuning of Associations For Predictive Gene Scoring

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
FENG BAO 等: "Explaining the Genetic Causality for Complex Phenotype via Deep Association Kernel Learning", 《PATTERNS》 *

Also Published As

Publication number Publication date
CN111489788B (en) 2022-05-20

Similar Documents

Publication Publication Date Title
Torada et al. ImaGene: a convolutional neural network to quantify natural selection from genomic data
Lloyd The structure and confirmation of evolutionary theory
Yeaman et al. Quantifying how constraints limit the diversity of viable routes to adaptation
Flagel et al. The unreasonable effectiveness of convolutional neural networks in population genetic inference
Good et al. Genetic diversity in the interference selection limit
Sanchez et al. Deep learning for population size history inference: Design, comparison and combination with approximate Bayesian computation
Banks et al. A review of particle swarm optimization. Part II: hybridisation, combinatorial, multicriteria and constrained optimization, and indicative applications
Knowles ParEGO: A hybrid algorithm with on-line landscape approximation for expensive multiobjective optimization problems
Uppu et al. A review on methods for detecting SNP interactions in high-dimensional genomic data
Tangherloni et al. Biochemical parameter estimation vs. benchmark functions: A comparative study of optimization performance and representation design
Fusco How many processes are responsible for phenotypic evolution?
Rees et al. Evolving integral projection models: evolutionary demography meets eco‐evolutionary dynamics
Mourad et al. A hierarchical Bayesian network approach for linkage disequilibrium modeling and data-dimensionality reduction prior to genome-wide association studies
Li et al. Two infill criteria driven surrogate-assisted multi-objective evolutionary algorithms for computationally expensive problems with medium dimensions
Sun et al. A fast EM algorithm for BayesA-like prediction of genomic breeding values
Isidro y Sánchez et al. Training set optimization for sparse phenotyping in genomic selection: A conceptual overview
De Jode et al. Ten years of demographic modelling of divergence and speciation in the sea
Aguirre-Liguori et al. Evaluation of the minimum sampling design for population genomic and microsatellite studies: An analysis based on wild maize
Fortes‐Lima et al. Complex genetic admixture histories reconstructed with Approximate Bayesian Computation
Motsinger-Reif et al. Grammatical evolution decision trees for detecting gene-gene interactions
Shirk et al. The effect of gene flow from unsampled demes in landscape genetic analysis
Raynaud et al. Performance and limitations of linkage-disequilibrium-based methods for inferring the genomic landscape of recombination and detecting hotspots: a simulation study
Montesinos‐Lopez et al. Application of a Poisson deep neural network model for the prediction of count data in genome‐based prediction
Rees et al. Why so variable: can genetic variance in flowering thresholds be maintained by fluctuating selection?
Huang et al. Harnessing deep learning for population genetic inference

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant