KR20230171930A - Deep convolutional neural networks to predict variant pathogenicity using three-dimensional (3D) protein structures - Google Patents
Deep convolutional neural networks to predict variant pathogenicity using three-dimensional (3D) protein structures
- Publication number
- KR20230171930A
- Authority
- KR
- South Korea
- Prior art keywords
- amino acid
- voxel
- channels
- distance
- voxels
- Prior art date
Links
- 108090000623 proteins and genes Proteins 0.000 title claims abstract description 136
- 102000004169 proteins and genes Human genes 0.000 title claims abstract description 130
- 230000007918 pathogenicity Effects 0.000 title claims abstract description 103
- 238000013527 convolutional neural network Methods 0.000 title abstract description 43
- 150000001413 amino acids Chemical class 0.000 claims abstract description 682
- 108700028369 Alleles Proteins 0.000 claims abstract description 137
- 238000012545 processing Methods 0.000 claims abstract description 35
- 125000004429 atom Chemical group 0.000 claims description 411
- 238000000034 method Methods 0.000 claims description 310
- 230000008569 process Effects 0.000 claims description 63
- 229910052799 carbon Inorganic materials 0.000 claims description 56
- 230000015654 memory Effects 0.000 claims description 24
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 claims description 14
- 238000005516 engineering process Methods 0.000 abstract description 50
- 230000000295 complement effect Effects 0.000 abstract 1
- 235000001014 amino acid Nutrition 0.000 description 638
- 125000003275 alpha amino acid group Chemical group 0.000 description 92
- 235000018102 proteins Nutrition 0.000 description 90
- 210000004027 cell Anatomy 0.000 description 54
- 238000013528 artificial neural network Methods 0.000 description 39
- 239000002773 nucleotide Substances 0.000 description 37
- 235000004279 alanine Nutrition 0.000 description 36
- 125000003729 nucleotide group Chemical group 0.000 description 36
- 241000894007 species Species 0.000 description 36
- 238000013507 mapping Methods 0.000 description 32
- 230000001717 pathogenic effect Effects 0.000 description 31
- 230000027455 binding Effects 0.000 description 29
- 238000012549 training Methods 0.000 description 26
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 24
- 238000010200 validation analysis Methods 0.000 description 21
- 230000006870 function Effects 0.000 description 16
- 230000004048 modification Effects 0.000 description 14
- 238000012986 modification Methods 0.000 description 14
- 238000003860 storage Methods 0.000 description 14
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 13
- 230000004913 activation Effects 0.000 description 13
- 238000004364 calculation method Methods 0.000 description 13
- 230000002068 genetic effect Effects 0.000 description 13
- 238000010801 machine learning Methods 0.000 description 13
- 238000010586 diagram Methods 0.000 description 12
- 108090000765 processed proteins & peptides Proteins 0.000 description 12
- 230000000306 recurrent effect Effects 0.000 description 12
- -1 alanine amino acid Chemical class 0.000 description 11
- 241000288906 Primates Species 0.000 description 10
- 238000004458 analytical method Methods 0.000 description 10
- 238000004422 calculation algorithm Methods 0.000 description 10
- 108091036078 conserved sequence Proteins 0.000 description 10
- 238000010606 normalization Methods 0.000 description 10
- 238000013135 deep learning Methods 0.000 description 9
- 125000004435 hydrogen atom Chemical class [H]* 0.000 description 9
- 230000035772 mutation Effects 0.000 description 9
- 239000004475 Arginine Substances 0.000 description 8
- IJGRMHOSHXDMSA-UHFFFAOYSA-N Atomic nitrogen Chemical compound N#N IJGRMHOSHXDMSA-UHFFFAOYSA-N 0.000 description 8
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 8
- 239000004471 Glycine Substances 0.000 description 7
- 108091028043 Nucleic acid sequence Proteins 0.000 description 7
- 238000013459 approach Methods 0.000 description 7
- 238000002887 multiple sequence alignment Methods 0.000 description 7
- 102000005701 Calcium-Binding Proteins Human genes 0.000 description 6
- 108010045403 Calcium-Binding Proteins Proteins 0.000 description 6
- 230000004568 DNA-binding Effects 0.000 description 6
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 6
- 108091023040 Transcription factor Proteins 0.000 description 6
- 102000040945 Transcription factor Human genes 0.000 description 6
- HCHKCACWOHOZIP-UHFFFAOYSA-N Zinc Chemical compound [Zn] HCHKCACWOHOZIP-UHFFFAOYSA-N 0.000 description 6
- 230000000694 effects Effects 0.000 description 6
- 230000013595 glycosylation Effects 0.000 description 6
- 238000006206 glycosylation reaction Methods 0.000 description 6
- 239000003999 initiator Substances 0.000 description 6
- 230000029226 lipidation Effects 0.000 description 6
- 239000011159 matrix material Substances 0.000 description 6
- 229910052751 metal Inorganic materials 0.000 description 6
- 239000002184 metal Substances 0.000 description 6
- 229930182817 methionine Natural products 0.000 description 6
- 238000002703 mutagenesis Methods 0.000 description 6
- 231100000350 mutagenesis Toxicity 0.000 description 6
- 229910052725 zinc Inorganic materials 0.000 description 6
- 239000011701 zinc Substances 0.000 description 6
- 108010077544 Chromatin Proteins 0.000 description 5
- 102000053602 DNA Human genes 0.000 description 5
- 108020004414 DNA Proteins 0.000 description 5
- 210000003483 chromatin Anatomy 0.000 description 5
- 238000007477 logistic regression Methods 0.000 description 5
- 229910052757 nitrogen Inorganic materials 0.000 description 5
- 125000004433 nitrogen atom Chemical group N* 0.000 description 5
- 125000004430 oxygen atom Chemical group O* 0.000 description 5
- 238000011176 pooling Methods 0.000 description 5
- 101150072950 BRCA1 gene Proteins 0.000 description 4
- QVGXLLKOCUKJST-UHFFFAOYSA-N atomic oxygen Chemical compound [O] QVGXLLKOCUKJST-UHFFFAOYSA-N 0.000 description 4
- 238000004132 cross linking Methods 0.000 description 4
- 201000010099 disease Diseases 0.000 description 4
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 4
- 230000014509 gene expression Effects 0.000 description 4
- 229910052739 hydrogen Inorganic materials 0.000 description 4
- 239000001257 hydrogen Substances 0.000 description 4
- 229910052760 oxygen Inorganic materials 0.000 description 4
- 239000001301 oxygen Substances 0.000 description 4
- 230000000007 visual effect Effects 0.000 description 4
- 108090000144 Human Proteins Proteins 0.000 description 3
- 102000003839 Human Proteins Human genes 0.000 description 3
- 238000013473 artificial intelligence Methods 0.000 description 3
- 235000003704 aspartic acid Nutrition 0.000 description 3
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 3
- 125000004432 carbon atom Chemical group C* 0.000 description 3
- 230000001186 cumulative effect Effects 0.000 description 3
- 230000007246 mechanism Effects 0.000 description 3
- 230000009466 transformation Effects 0.000 description 3
- 238000013519 translation Methods 0.000 description 3
- 102000036365 BRCA1 Human genes 0.000 description 2
- 108700020463 BRCA1 Proteins 0.000 description 2
- 108700040618 BRCA1 Genes Proteins 0.000 description 2
- 230000007067 DNA methylation Effects 0.000 description 2
- 241000282412 Homo Species 0.000 description 2
- 101001066268 Homo sapiens Erythroid transcription factor Proteins 0.000 description 2
- 101000891113 Homo sapiens T-cell acute lymphocytic leukemia protein 1 Proteins 0.000 description 2
- 108091092195 Intron Proteins 0.000 description 2
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 2
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 2
- 241000124008 Mammalia Species 0.000 description 2
- 108700011259 MicroRNAs Proteins 0.000 description 2
- 206010028980 Neoplasm Diseases 0.000 description 2
- 101000702553 Schistosoma mansoni Antigen Sm21.7 Proteins 0.000 description 2
- 101000714192 Schistosoma mansoni Tegument antigen Proteins 0.000 description 2
- 102100040365 T-cell acute lymphocytic leukemia protein 1 Human genes 0.000 description 2
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 2
- 239000002253 acid Substances 0.000 description 2
- 230000002776 aggregation Effects 0.000 description 2
- 238000004220 aggregation Methods 0.000 description 2
- 208000029560 autism spectrum disease Diseases 0.000 description 2
- 230000008901 benefit Effects 0.000 description 2
- 230000008859 change Effects 0.000 description 2
- 238000010367 cloning Methods 0.000 description 2
- 238000013136 deep learning model Methods 0.000 description 2
- 230000001419 dependent effect Effects 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 239000003623 enhancer Substances 0.000 description 2
- 238000009499 grossing Methods 0.000 description 2
- 230000010354 integration Effects 0.000 description 2
- 238000002372 labelling Methods 0.000 description 2
- 230000014759 maintenance of location Effects 0.000 description 2
- 239000002679 microRNA Substances 0.000 description 2
- 210000002569 neuron Anatomy 0.000 description 2
- 238000005457 optimization Methods 0.000 description 2
- 230000008520 organization Effects 0.000 description 2
- 230000002093 peripheral effect Effects 0.000 description 2
- 238000004321 preservation Methods 0.000 description 2
- 230000010076 replication Effects 0.000 description 2
- 238000010845 search algorithm Methods 0.000 description 2
- 230000026676 system process Effects 0.000 description 2
- 229920002803 thermoplastic polyurethane Polymers 0.000 description 2
- 238000000844 transformation Methods 0.000 description 2
- 239000004474 valine Substances 0.000 description 2
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 1
- ORILYTVJVMAKLC-UHFFFAOYSA-N Adamantane Natural products C1C(C2)CC3CC1CC2C3 ORILYTVJVMAKLC-UHFFFAOYSA-N 0.000 description 1
- 240000001436 Antirrhinum majus Species 0.000 description 1
- 238000012935 Averaging Methods 0.000 description 1
- OYPRJOBELJOOCE-UHFFFAOYSA-N Calcium Chemical compound [Ca] OYPRJOBELJOOCE-UHFFFAOYSA-N 0.000 description 1
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 description 1
- 108020004705 Codon Proteins 0.000 description 1
- 208000037170 Delayed Emergence from Anesthesia Diseases 0.000 description 1
- 208000035976 Developmental Disabilities Diseases 0.000 description 1
- 208000012239 Developmental disease Diseases 0.000 description 1
- 102100031690 Erythroid transcription factor Human genes 0.000 description 1
- 108700024394 Exon Proteins 0.000 description 1
- 241000288105 Grus Species 0.000 description 1
- 108020005004 Guide RNA Proteins 0.000 description 1
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 1
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 1
- 238000004510 Lennard-Jones potential Methods 0.000 description 1
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 1
- BDUHCSBCVGXTJM-IZLXSDGUSA-N Nutlin-3 Chemical compound CC(C)OC1=CC(OC)=CC=C1C1=N[C@H](C=2C=CC(Cl)=CC=2)[C@H](C=2C=CC(Cl)=CC=2)N1C(=O)N1CC(=O)NCC1 BDUHCSBCVGXTJM-IZLXSDGUSA-N 0.000 description 1
- 108700026244 Open Reading Frames Proteins 0.000 description 1
- 108010076504 Protein Sorting Signals Proteins 0.000 description 1
- 108091081024 Start codon Proteins 0.000 description 1
- NINIDFKCEFEMDL-UHFFFAOYSA-N Sulfur Chemical compound [S] NINIDFKCEFEMDL-UHFFFAOYSA-N 0.000 description 1
- 101150080074 TP53 gene Proteins 0.000 description 1
- 241000251539 Vertebrata <Metazoa> Species 0.000 description 1
- 230000003044 adaptive effect Effects 0.000 description 1
- 230000000996 additive effect Effects 0.000 description 1
- 238000003491 array Methods 0.000 description 1
- 230000031018 biological processes and functions Effects 0.000 description 1
- 239000000090 biomarker Substances 0.000 description 1
- 230000015572 biosynthetic process Effects 0.000 description 1
- 229910052791 calcium Inorganic materials 0.000 description 1
- 239000011575 calcium Substances 0.000 description 1
- 150000001721 carbon Chemical group 0.000 description 1
- 230000001364 causal effect Effects 0.000 description 1
- 238000002487 chromatin immunoprecipitation Methods 0.000 description 1
- 239000003086 colorant Substances 0.000 description 1
- 230000006835 compression Effects 0.000 description 1
- 238000007906 compression Methods 0.000 description 1
- 238000000205 computational method Methods 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 238000013434 data augmentation Methods 0.000 description 1
- 230000002939 deleterious effect Effects 0.000 description 1
- 238000012217 deletion Methods 0.000 description 1
- 230000037430 deletion Effects 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 230000001627 detrimental effect Effects 0.000 description 1
- 238000003745 diagnosis Methods 0.000 description 1
- 238000009826 distribution Methods 0.000 description 1
- 238000004880 explosion Methods 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 238000007667 floating Methods 0.000 description 1
- 238000002073 fluorescence micrograph Methods 0.000 description 1
- 238000012268 genome sequencing Methods 0.000 description 1
- 230000036433 growing body Effects 0.000 description 1
- 230000035876 healing Effects 0.000 description 1
- 238000000126 in silico method Methods 0.000 description 1
- 238000003780 insertion Methods 0.000 description 1
- 230000037431 insertion Effects 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- PNDPGZBMCMUPRI-UHFFFAOYSA-N iodine Chemical compound II PNDPGZBMCMUPRI-UHFFFAOYSA-N 0.000 description 1
- 238000013140 knowledge distillation Methods 0.000 description 1
- 239000004973 liquid crystal related substance Substances 0.000 description 1
- 230000004807 localization Effects 0.000 description 1
- 230000003211 malignant effect Effects 0.000 description 1
- 238000004519 manufacturing process Methods 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 108020004999 messenger RNA Proteins 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 210000000056 organ Anatomy 0.000 description 1
- 108700025694 p53 Genes Proteins 0.000 description 1
- 230000002085 persistent effect Effects 0.000 description 1
- COLNVLDHVKWLRT-UHFFFAOYSA-N phenylalanine Natural products OC(=O)C(N)CC1=CC=CC=C1 COLNVLDHVKWLRT-UHFFFAOYSA-N 0.000 description 1
- 230000004481 post-translational protein modification Effects 0.000 description 1
- 238000007781 pre-processing Methods 0.000 description 1
- 238000012913 prioritisation Methods 0.000 description 1
- 238000003672 processing method Methods 0.000 description 1
- 108020001580 protein domains Proteins 0.000 description 1
- 230000004853 protein function Effects 0.000 description 1
- 230000006916 protein interaction Effects 0.000 description 1
- 238000013139 quantization Methods 0.000 description 1
- 230000001105 regulatory effect Effects 0.000 description 1
- 230000002787 reinforcement Effects 0.000 description 1
- 230000004044 response Effects 0.000 description 1
- 239000000523 sample Substances 0.000 description 1
- 238000000926 separation method Methods 0.000 description 1
- 238000002864 sequence alignment Methods 0.000 description 1
- 238000012163 sequencing technique Methods 0.000 description 1
- 230000006403 short-term memory Effects 0.000 description 1
- 238000002741 site-directed mutagenesis Methods 0.000 description 1
- 238000013517 stratification Methods 0.000 description 1
- 239000011593 sulfur Substances 0.000 description 1
- 229910052717 sulfur Inorganic materials 0.000 description 1
- 230000001360 synchronised effect Effects 0.000 description 1
- 238000003786 synthesis reaction Methods 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 230000002103 transcriptional effect Effects 0.000 description 1
- 238000013526 transfer learning Methods 0.000 description 1
- 230000001131 transforming effect Effects 0.000 description 1
- 230000014616 translation Effects 0.000 description 1
Classifications
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
- G16B40/20—Supervised data analysis
- G16B15/00—ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
Landscapes
- Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Medical Informatics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Biophysics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Biotechnology (AREA)
- General Health & Medical Sciences (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Chemical & Material Sciences (AREA)
- Databases & Information Systems (AREA)
- Bioethics (AREA)
- Genetics & Genomics (AREA)
- Epidemiology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Analytical Chemistry (AREA)
- Crystallography & Structural Chemistry (AREA)
- Evolutionary Computation (AREA)
- Public Health (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Investigating Or Analysing Biological Materials (AREA)
- Peptides Or Proteins (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Applications Claiming Priority (13)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202163175495P | 2021-04-15 | 2021-04-15 | |
US17/232,056 | 2021-04-15 | ||
US63/175,495 | 2021-04-15 | ||
US17/232,056 US20220336054A1 (en) | 2021-04-15 | 2021-04-15 | Deep Convolutional Neural Networks to Predict Variant Pathogenicity using Three-Dimensional (3D) Protein Structures |
US202163175767P | 2021-04-16 | 2021-04-16 | |
US63/175,767 | 2021-04-16 | ||
US17/468,411 US11515010B2 (en) | 2021-04-15 | 2021-09-07 | Deep convolutional neural networks to predict variant pathogenicity using three-dimensional (3D) protein structures |
US17/468,411 | 2021-09-07 | ||
US17/703,958 US20220336057A1 (en) | 2021-04-15 | 2022-03-24 | Efficient voxelization for deep learning |
US17/703,958 | 2022-03-24 | ||
US17/703,935 | 2022-03-24 | ||
US17/703,935 US20220336056A1 (en) | 2021-04-15 | 2022-03-24 | Multi-channel protein voxelization to predict variant pathogenicity using deep convolutional neural networks |
PCT/US2022/024913 WO2022221589A1 (fr) | 2021-04-15 | 2022-04-14 | Deep convolutional neural networks to predict variant pathogenicity using three-dimensional (3D) protein structures |
Publications (1)
Publication Number | Publication Date |
---|---|
KR20230171930A (ko) | 2023-12-21 |
Family
ID=81580106
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020237034175A KR20230171930A (ko) | 2021-04-15 | 2022-04-14 | Deep convolutional neural networks to predict variant pathogenicity using three-dimensional (3D) protein structures |
Country Status (8)
Country | Link |
---|---|
EP (1) | EP4323990A1 (fr) |
JP (1) | JP2024513994A (fr) |
KR (1) | KR20230171930A (fr) |
AU (1) | AU2022256491A1 (fr) |
BR (1) | BR112023021302A2 (fr) |
CA (1) | CA3215462A1 (fr) |
IL (1) | IL307671A (fr) |
WO (2) | WO2022221587A1 (fr) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
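CN116153435B (zh) * | 2023-04-21 | 2023-08-11 | Qilu Hospital of Shandong University (山东大学齐鲁医院) | Polypeptide prediction method and system based on coloring and three-dimensional structure |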
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10423861B2 (en) * | 2017-10-16 | 2019-09-24 | Illumina, Inc. | Deep learning-based techniques for training deep convolutional neural networks |
WO2019084559A1 (fr) * | 2017-10-27 | 2019-05-02 | Apostle, Inc. | Prediction of cancer-related pathogenic impact of somatic mutations using deep learning-based methods |
CN110245685B (zh) * | 2019-05-15 | 2022-03-25 | Tsinghua University (清华大学) | Method, system, and storage medium for predicting the pathogenicity of genomic single-site variants |
2022
- 2022-04-14 WO PCT/US2022/024911 patent/WO2022221587A1/fr active Application Filing
- 2022-04-14 EP EP22721220.6A patent/EP4323990A1/fr active Pending
- 2022-04-14 AU AU2022256491A patent/AU2022256491A1/en active Pending
- 2022-04-14 CA CA3215462A patent/CA3215462A1/fr active Pending
- 2022-04-14 WO PCT/US2022/024913 patent/WO2022221589A1/fr active Application Filing
- 2022-04-14 KR KR1020237034175A patent/KR20230171930A/ko unknown
- 2022-04-14 JP JP2023563032A patent/JP2024513994A/ja active Pending
- 2022-04-14 BR BR112023021302A patent/BR112023021302A2/pt unknown
- 2022-04-14 IL IL307671A patent/IL307671A/en unknown
Also Published As
Publication number | Publication date |
---|---|
BR112023021302A2 (pt) | 2023-12-19 |
IL307671A (en) | 2023-12-01 |
CA3215462A1 (fr) | 2022-10-20 |
WO2022221589A1 (fr) | 2022-10-20 |
JP2024513994A (ja) | 2024-03-27 |
AU2022256491A1 (en) | 2023-10-26 |
WO2022221587A1 (fr) | 2022-10-20 |
EP4323990A1 (fr) | 2024-02-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2023014912A1 (fr) | Transfer learning-based use of protein contact maps for variant pathogenicity prediction | |
US20220336056A1 (en) | Multi-channel protein voxelization to predict variant pathogenicity using deep convolutional neural networks | |
US20230108368A1 (en) | Combined and transfer learning of a variant pathogenicity predictor using gapped and non-gapped protein samples | |
KR20230171930A (ko) | Deep convolutional neural networks to predict variant pathogenicity using three-dimensional (3D) protein structures | |
US11515010B2 (en) | Deep convolutional neural networks to predict variant pathogenicity using three-dimensional (3D) protein structures | |
KR20230170679A (ko) | Efficient voxelization for deep learning | |
US20230045003A1 (en) | Deep learning-based use of protein contact maps for variant pathogenicity prediction | |
US20230047347A1 (en) | Deep neural network-based variant pathogenicity prediction | |
US20230343413A1 (en) | Protein structure-based protein language models | |
KR20240082269A (ko) | Variant pathogenicity prediction from evolutionary conservation using three-dimensional (3D) protein structure voxels | |
KR20240088641A (ko) | Combined and transfer learning of a variant pathogenicity predictor using gapped and non-gapped protein samples | |
CN117178326A (zh) | Deep convolutional neural networks to predict variant pathogenicity using three-dimensional (3D) protein structures | |
WO2023059750A1 (fr) | Combined and transfer learning of a variant pathogenicity predictor using gapped and non-gapped protein samples | |
US20240112751A1 (en) | Copy number variation (cnv) breakpoint detection | |
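KR20240041877A (ko) | Transfer learning-based use of protein contact maps for variant pathogenicity prediction | |
CN117581302A (zh) | Combined and transfer learning of a variant pathogenicity predictor using gapped and non-gapped protein samples | |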