CA3215520A1 - Efficient voxelization for deep learning - Google Patents
Efficient voxelization for deep learning
- Publication number
- CA3215520A1
- Authority
- CA
- Canada
- Prior art keywords
- amino acid
- voxel
- atoms
- computer
- atom
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B15/00—ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
- G16B40/20—Supervised data analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Data Mining & Analysis (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Medical Informatics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Artificial Intelligence (AREA)
- Software Systems (AREA)
- Bioinformatics & Computational Biology (AREA)
- Biotechnology (AREA)
- Evolutionary Biology (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Chemical & Material Sciences (AREA)
- Analytical Chemistry (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Genetics & Genomics (AREA)
- Databases & Information Systems (AREA)
- Bioethics (AREA)
- Public Health (AREA)
- Crystallography & Structural Chemistry (AREA)
- Epidemiology (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Investigating Or Analysing Biological Materials (AREA)
- Image Processing (AREA)
- Image Generation (AREA)
- Magnetic Resonance Imaging Apparatus (AREA)
Abstract
The technology disclosed efficiently determines which atoms in a protein are nearest to voxels in a grid. The atoms have three-dimensional (3D) atom coordinates, and the voxels have 3D voxel coordinates. The technology disclosed generates an atom-to-voxels mapping that maps each of the atoms to a containing voxel, selected by matching the 3D atom coordinates of a particular atom of the protein to the 3D voxel coordinates in the grid. The technology disclosed generates a voxel-to-atoms mapping that maps each of the voxels to a subset of the atoms. The subset of the atoms mapped to a particular voxel in the grid includes those atoms in the protein that are mapped to the particular voxel by the atom-to-voxels mapping. The technology disclosed uses the voxel-to-atoms mapping to determine, for each of the voxels, a nearest atom in the protein.
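The two mappings described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function names, the cubic voxels with a fixed `voxel_size`, the use of voxel centers for distances, and the `radius` neighborhood search are all assumptions introduced here for illustration.

```python
from collections import defaultdict
from itertools import product
import math


def atom_to_voxel(atoms, origin, voxel_size, grid_shape):
    """Map each atom index to the index of the voxel containing it.

    Atoms that fall outside the grid map to None (an assumption of this sketch).
    """
    mapping = {}
    for i, coords in enumerate(atoms):
        idx = tuple(math.floor((c - o) / voxel_size) for c, o in zip(coords, origin))
        inside = all(0 <= k < n for k, n in zip(idx, grid_shape))
        mapping[i] = idx if inside else None
    return mapping


def voxel_to_atoms(a2v):
    """Invert the atom-to-voxel mapping: voxel index -> list of atom indices."""
    v2a = defaultdict(list)
    for atom, voxel in a2v.items():
        if voxel is not None:
            v2a[voxel].append(atom)
    return v2a


def nearest_atom_per_voxel(atoms, origin, voxel_size, grid_shape, radius=1):
    """For each voxel, find the nearest atom among atoms binned near that voxel.

    Only atoms in voxels within `radius` cells are examined (hypothetical
    parameter), so far fewer than all atom-voxel pairs are compared.
    Returns voxel index -> (atom index or None, distance to voxel center).
    """
    v2a = voxel_to_atoms(atom_to_voxel(atoms, origin, voxel_size, grid_shape))
    result = {}
    for voxel in product(*(range(n) for n in grid_shape)):
        center = tuple(o + (k + 0.5) * voxel_size for o, k in zip(origin, voxel))
        best, best_d = None, math.inf
        for offset in product(range(-radius, radius + 1), repeat=3):
            neighbor = tuple(k + d for k, d in zip(voxel, offset))
            for atom in v2a.get(neighbor, ()):
                d = math.dist(center, atoms[atom])
                if d < best_d:
                    best, best_d = atom, d
        result[voxel] = (best, best_d)
    return result


if __name__ == "__main__":
    atoms = [(0.2, 0.5, 0.5), (2.5, 0.5, 0.5)]
    print(nearest_atom_per_voxel(atoms, (0.0, 0.0, 0.0), 1.0, (3, 1, 1)))
```

Binning atoms first means each voxel inspects only the atoms stored under nearby voxel indices rather than every atom in the protein. In this sketch, a voxel with no atom within `radius` cells gets `None`, and an atom farther away than that is never considered.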
Applications Claiming Priority (9)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202163175495P | 2021-04-15 | 2021-04-15 | |
US63/175,495 | 2021-04-15 | ||
US202163175767P | 2021-04-16 | 2021-04-16 | |
US63/175,767 | 2021-04-16 | ||
US17/703,935 US20220336056A1 (en) | 2021-04-15 | 2022-03-24 | Multi-channel protein voxelization to predict variant pathogenicity using deep convolutional neural networks |
US17/703,958 | 2022-03-24 | ||
US17/703,958 US20220336057A1 (en) | 2021-04-15 | 2022-03-24 | Efficient voxelization for deep learning |
US17/703,935 | 2022-03-24 | ||
PCT/US2022/024918 WO2022221593A1 (fr) | 2021-04-15 | 2022-04-14 | Efficient voxelization for deep learning
Publications (1)
Publication Number | Publication Date |
---|---|
CA3215520A1 (fr) | 2022-10-20 |
Family
ID=81448684
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA3215520A Pending CA3215520A1 (fr) | 2021-04-15 | 2022-04-14 | Efficient voxelization for deep learning |
CA3215514A Pending CA3215514A1 (fr) | 2021-04-15 | 2022-04-14 | Multi-channel protein voxelization to predict variant pathogenicity using deep convolutional neural networks |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA3215514A Pending CA3215514A1 (fr) | 2021-04-15 | 2022-04-14 | Multi-channel protein voxelization to predict variant pathogenicity using deep convolutional neural networks |
Country Status (9)
Country | Link |
---|---|
EP (2) | EP4323991A1 (fr) |
JP (2) | JP2024513995A (fr) |
KR (2) | KR20230170680A (fr) |
AU (2) | AU2022259667A1 (fr) |
BR (2) | BR112023021266A2 (fr) |
CA (2) | CA3215520A1 (fr) |
IL (2) | IL307661A (fr) |
MX (2) | MX2023012227A (fr) |
WO (2) | WO2022221591A1 (fr) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116153404B (zh) * | 2023-02-28 | 2023-08-15 | Chengdu University of Information Technology | Single-cell ATAC-seq data analysis method |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2018350891B9 (en) * | 2017-10-16 | 2022-05-19 | Illumina, Inc. | Deep learning-based techniques for training deep convolutional neural networks |
WO2019084559A1 (fr) * | 2017-10-27 | 2019-05-02 | Apostle, Inc. | Predicting cancer-related pathogenic impact of somatic mutations using deep learning-based methods |
CN110245685B (zh) * | 2019-05-15 | 2022-03-25 | Tsinghua University | Method, system, and storage medium for predicting the pathogenicity of genomic single-site variants |
-
2022
- 2022-04-14 JP JP2023563033A patent/JP2024513995A/ja active Pending
- 2022-04-14 AU AU2022259667A patent/AU2022259667A1/en active Pending
- 2022-04-14 IL IL307661A patent/IL307661A/en unknown
- 2022-04-14 EP EP22726207.8A patent/EP4323991A1/fr active Pending
- 2022-04-14 BR BR112023021266A patent/BR112023021266A2/pt unknown
- 2022-04-14 BR BR112023021343A patent/BR112023021343A2/pt unknown
- 2022-04-14 MX MX2023012227A patent/MX2023012227A/es unknown
- 2022-04-14 WO PCT/US2022/024916 patent/WO2022221591A1/fr active Application Filing
- 2022-04-14 KR KR1020237034825A patent/KR20230170680A/ko unknown
- 2022-04-14 JP JP2023563036A patent/JP2024514894A/ja active Pending
- 2022-04-14 KR KR1020237034824A patent/KR20230170679A/ko unknown
- 2022-04-14 WO PCT/US2022/024918 patent/WO2022221593A1/fr active Application Filing
- 2022-04-14 IL IL307667A patent/IL307667A/en unknown
- 2022-04-14 CA CA3215520A patent/CA3215520A1/fr active Pending
- 2022-04-14 AU AU2022258691A patent/AU2022258691A1/en active Pending
- 2022-04-14 EP EP22720250.4A patent/EP4323989A1/fr active Pending
- 2022-04-14 MX MX2023012226A patent/MX2023012226A/es unknown
- 2022-04-14 CA CA3215514A patent/CA3215514A1/fr active Pending
Also Published As
Publication number | Publication date |
---|---|
WO2022221593A1 (fr) | 2022-10-20 |
EP4323991A1 (fr) | 2024-02-21 |
CA3215514A1 (fr) | 2022-10-20 |
MX2023012226A (es) | 2024-01-08 |
KR20230170679A (ko) | 2023-12-19 |
AU2022258691A1 (en) | 2023-10-26 |
AU2022259667A1 (en) | 2023-10-26 |
BR112023021266A2 (pt) | 2023-12-12 |
KR20230170680A (ko) | 2023-12-19 |
BR112023021343A2 (pt) | 2023-12-19 |
EP4323989A1 (fr) | 2024-02-21 |
JP2024513995A (ja) | 2024-03-27 |
IL307667A (en) | 2023-12-01 |
WO2022221591A1 (fr) | 2022-10-20 |
MX2023012227A (es) | 2024-01-08 |
JP2024514894A (ja) | 2024-04-03 |
IL307661A (en) | 2023-12-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20230045003A1 (en) | Deep learning-based use of protein contact maps for variant pathogenicity prediction | |
US20220336057A1 (en) | Efficient voxelization for deep learning | |
US20230108241A1 (en) | Predicting variant pathogenicity from evolutionary conservation using three-dimensional (3d) protein structure voxels | |
US11515010B2 (en) | Deep convolutional neural networks to predict variant pathogenicity using three-dimensional (3D) protein structures | |
CA3215520A1 (fr) | Efficient voxelization for deep learning | |
CA3215462A1 (fr) | Deep convolutional neural networks to predict variant pathogenicity using three-dimensional (3D) protein structures | |
US20230047347A1 (en) | Deep neural network-based variant pathogenicity prediction | |
US20230343413A1 (en) | Protein structure-based protein language models | |
EP4413575A1 (fr) | Combined and transfer learning of a variant pathogenicity predictor using gapped and non-gapped protein samples | |
WO2023059750A1 (fr) | Combined and transfer learning of a variant pathogenicity predictor using gapped and non-gapped protein samples | |
JP2024538478A (ja) | Combined and transfer learning of a variant pathogenicity predictor using gapped and non-gapped protein samples | |
JP2024538477A (ja) | Protein structure-based protein language models | |
JP2024538475A (ja) | Predicting variant pathogenicity from evolutionary conservation using three-dimensional (3D) protein structure voxels | |
CN117178327A (zh) | Multi-channel protein voxelization to predict variant pathogenicity using deep convolutional neural networks | |
EP4381507A1 (fr) | Transfer learning-based use of protein contact maps for variant pathogenicity prediction | |
CN117581302A (zh) | Combined and transfer learning of a variant pathogenicity predictor using gapped and non-gapped protein samples | |