CA3215520A1 - Efficient voxelization for deep learning - Google Patents

Efficient voxelization for deep learning

Info

Publication number
CA3215520A1
Authority
CA
Canada
Prior art keywords
amino acid
voxel
atoms
computer
atom
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CA3215520A
Other languages
English (en)
Inventor
Tobias HAMP
Hong Gao
Kai-How FARH
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Illumina Cambridge Ltd
Illumina Inc
Original Assignee
Illumina Cambridge Ltd
Illumina Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US17/703,935 (US20220336056A1)
Application filed by Illumina Cambridge Ltd, Illumina Inc
Publication of CA3215520A1
Legal status: Pending

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B15/00 ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/20 Supervised data analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Genetics & Genomics (AREA)
  • Databases & Information Systems (AREA)
  • Bioethics (AREA)
  • Public Health (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Epidemiology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Image Processing (AREA)
  • Image Generation (AREA)
  • Magnetic Resonance Imaging Apparatus (AREA)

Abstract

The technology disclosed relates to efficiently determining which atoms in a protein are nearest to voxels in a grid. The atoms have three-dimensional (3D) atom coordinates, and the voxels have 3D voxel coordinates. The technology disclosed generates an atom-to-voxels mapping that maps, to each of the atoms, a containing voxel selected by matching the 3D atom coordinates of a particular atom of the protein to the 3D voxel coordinates in the grid. The technology disclosed generates a voxel-to-atoms mapping that maps, to each of the voxels, a subset of the atoms. The subset of atoms mapped to a particular voxel in the grid includes those atoms in the protein that are mapped to that voxel by the atom-to-voxels mapping. The technology disclosed uses the voxel-to-atoms mapping to determine, for each of the voxels, a nearest atom in the protein.
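The two mappings in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the cubic, axis-aligned grid, the names `containing_voxel`, `build_mappings`, and `nearest_atom_per_voxel`, and the choice to measure distance to the voxel center are all assumptions made for illustration. The sketch only considers atoms that fall inside a voxel, so empty voxels get no nearest atom; a fuller method might also search neighboring voxels.

```python
import math
from collections import defaultdict

def containing_voxel(atom, origin, voxel_size, grid_shape):
    """Return the (i, j, k) index of the voxel containing `atom`, or None if outside the grid."""
    idx = tuple(math.floor((a - o) / voxel_size) for a, o in zip(atom, origin))
    if all(0 <= i < n for i, n in zip(idx, grid_shape)):
        return idx
    return None

def build_mappings(atoms, origin, voxel_size, grid_shape):
    """Build the atom-to-voxel mapping and its inverse, the voxel-to-atoms mapping."""
    atom_to_voxel = {}
    voxel_to_atoms = defaultdict(list)
    for atom_id, atom in enumerate(atoms):
        voxel = containing_voxel(atom, origin, voxel_size, grid_shape)
        if voxel is not None:
            atom_to_voxel[atom_id] = voxel
            voxel_to_atoms[voxel].append(atom_id)
    return atom_to_voxel, dict(voxel_to_atoms)

def nearest_atom_per_voxel(atoms, voxel_to_atoms, origin, voxel_size):
    """For each non-empty voxel, pick the mapped atom closest to the voxel center (assumed metric)."""
    nearest = {}
    for voxel, atom_ids in voxel_to_atoms.items():
        center = tuple(o + (i + 0.5) * voxel_size for i, o in zip(voxel, origin))
        nearest[voxel] = min(atom_ids, key=lambda a: math.dist(atoms[a], center))
    return nearest

# Hypothetical toy data: three atoms in a 2 x 2 x 2 grid of unit voxels.
atoms = [(0.2, 0.2, 0.2), (0.9, 0.9, 0.9), (1.5, 0.5, 0.5)]
origin, voxel_size, grid_shape = (0.0, 0.0, 0.0), 1.0, (2, 2, 2)

atom_to_voxel, voxel_to_atoms = build_mappings(atoms, origin, voxel_size, grid_shape)
nearest = nearest_atom_per_voxel(atoms, voxel_to_atoms, origin, voxel_size)
print(atom_to_voxel)
print(voxel_to_atoms)
print(nearest)
```

Because each atom is binned once and each voxel then inspects only its own short list, the sketch avoids comparing every voxel against every atom, which is the efficiency the abstract points to.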
CA3215520A 2021-04-15 2022-04-14 Efficient voxelization for deep learning Pending CA3215520A1 (fr)

Applications Claiming Priority (9)

Application Number Priority Date Filing Date Title
US202163175495P 2021-04-15 2021-04-15
US63/175,495 2021-04-15
US202163175767P 2021-04-16 2021-04-16
US63/175,767 2021-04-16
US17/703,935 US20220336056A1 (en) 2021-04-15 2022-03-24 Multi-channel protein voxelization to predict variant pathogenicity using deep convolutional neural networks
US17/703,958 2022-03-24
US17/703,958 US20220336057A1 (en) 2021-04-15 2022-03-24 Efficient voxelization for deep learning
US17/703,935 2022-03-24
PCT/US2022/024918 WO2022221593A1 (fr) 2021-04-15 2022-04-14 Efficient voxelization for deep learning

Publications (1)

Publication Number Publication Date
CA3215520A1 (fr) 2022-10-20

Family

ID=81448684

Family Applications (2)

Application Number Title Priority Date Filing Date
CA3215520A Pending CA3215520A1 (fr) 2021-04-15 2022-04-14 Efficient voxelization for deep learning
CA3215514A Pending CA3215514A1 (fr) 2021-04-15 2022-04-14 Multi-channel protein voxelization to predict variant pathogenicity using deep convolutional neural networks

Family Applications After (1)

Application Number Title Priority Date Filing Date
CA3215514A Pending CA3215514A1 (fr) 2021-04-15 2022-04-14 Multi-channel protein voxelization to predict variant pathogenicity using deep convolutional neural networks

Country Status (9)

Country Link
EP (2) EP4323991A1 (fr)
JP (2) JP2024513995A (fr)
KR (2) KR20230170680A (fr)
AU (2) AU2022259667A1 (fr)
BR (2) BR112023021266A2 (fr)
CA (2) CA3215520A1 (fr)
IL (2) IL307661A (fr)
MX (2) MX2023012227A (fr)
WO (2) WO2022221591A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116153404B (zh) * 2023-02-28 2023-08-15 Chengdu University of Information Technology (成都信息工程大学) Single-cell ATAC-seq data analysis method

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2018350891B9 (en) * 2017-10-16 2022-05-19 Illumina, Inc. Deep learning-based techniques for training deep convolutional neural networks
WO2019084559A1 (fr) * 2017-10-27 2019-05-02 Apostle, Inc. Predicting cancer-related pathogenic impact of somatic mutations using deep learning-based methods
CN110245685B (zh) * 2019-05-15 2022-03-25 Tsinghua University (清华大学) Method, system, and storage medium for predicting the pathogenicity of genomic single-site variants

Also Published As

Publication number Publication date
WO2022221593A1 (fr) 2022-10-20
EP4323991A1 (fr) 2024-02-21
CA3215514A1 (fr) 2022-10-20
MX2023012226A (es) 2024-01-08
KR20230170679A (ko) 2023-12-19
AU2022258691A1 (en) 2023-10-26
AU2022259667A1 (en) 2023-10-26
BR112023021266A2 (pt) 2023-12-12
KR20230170680A (ko) 2023-12-19
BR112023021343A2 (pt) 2023-12-19
EP4323989A1 (fr) 2024-02-21
JP2024513995A (ja) 2024-03-27
IL307667A (en) 2023-12-01
WO2022221591A1 (fr) 2022-10-20
MX2023012227A (es) 2024-01-08
JP2024514894A (ja) 2024-04-03
IL307661A (en) 2023-12-01

Similar Documents

Publication Publication Date Title
US20230045003A1 (en) Deep learning-based use of protein contact maps for variant pathogenicity prediction
US20220336057A1 (en) Efficient voxelization for deep learning
US20230108241A1 (en) Predicting variant pathogenicity from evolutionary conservation using three-dimensional (3d) protein structure voxels
US11515010B2 (en) Deep convolutional neural networks to predict variant pathogenicity using three-dimensional (3D) protein structures
CA3215520A1 (fr) Efficient voxelization for deep learning
CA3215462A1 (fr) Deep convolutional neural networks to predict variant pathogenicity using three-dimensional (3D) protein structures
US20230047347A1 (en) Deep neural network-based variant pathogenicity prediction
US20230343413A1 (en) Protein structure-based protein language models
EP4413575A1 (fr) Combined learning and transfer learning of a variant pathogenicity predictor using gapped and non-gapped protein samples
WO2023059750A1 (fr) Combined learning and transfer learning of a variant pathogenicity predictor using gapped and non-gapped protein samples
JP2024538478A (ja) Combined learning and transfer learning of a variant pathogenicity predictor using gapped and non-gapped protein samples
JP2024538477A (ja) Protein structure-based protein language models
JP2024538475A (ja) Predicting variant pathogenicity from evolutionary conservation using three-dimensional (3D) protein structure voxels
CN117178327A (zh) Multi-channel protein voxelization to predict variant pathogenicity using deep convolutional neural networks
EP4381507A1 (fr) Transfer learning-based use of protein contact maps for variant pathogenicity prediction
CN117581302A (zh) Combined learning and transfer learning of a variant pathogenicity predictor using gapped and non-gapped protein samples