KR20240088641A - Combined and transfer learning of a variant pathogenicity predictor using gapped and non-gapped protein samples - Google Patents

Combined and transfer learning of a variant pathogenicity predictor using gapped and non-gapped protein samples

Info

Publication number
KR20240088641A
Authority
KR
South Korea
Prior art keywords
amino acid
gap
pathogenicity
protein
computer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
KR1020237045483A
Other languages
English (en)
Korean (ko)
Inventor
Tobias Hamp
Hong Gao
Kai-How Farh
Original Assignee
Illumina, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US17/533,091 (US11538555B1)
Application filed by Illumina, Inc.
Priority claimed from PCT/US2022/045823 (WO2023059750A1)
Publication of KR20240088641A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/20 Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/20 Supervised data analysis
    • G16B5/00 ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
    • G16B5/20 Probabilistic models
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/20 Heterogeneous data integration

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Medical Informatics (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Bioethics (AREA)
  • Databases & Information Systems (AREA)
  • Chemical & Material Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Analytical Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Epidemiology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Public Health (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Probability & Statistics with Applications (AREA)
  • Physiology (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
KR1020237045483A 2021-10-06 2022-10-05 Combined and transfer learning of a variant pathogenicity predictor using gapped and non-gapped protein samples Pending KR20240088641A (ko)

Applications Claiming Priority (13)

Application Number Priority Date Filing Date Title
US202163253122P 2021-10-06 2021-10-06
US63/253,122 2021-10-06
US202163281592P 2021-11-19 2021-11-19
US202163281579P 2021-11-19 2021-11-19
US63/281,579 2021-11-19
US63/281,592 2021-11-19
US17/533,091 2021-11-22
US17/533,091 US11538555B1 (en) 2021-10-06 2021-11-22 Protein structure-based protein language models
US17/953,286 2022-09-26
US17/953,293 2022-09-26
US17/953,293 US20230108368A1 (en) 2021-10-06 2022-09-26 Combined and transfer learning of a variant pathogenicity predictor using gapped and non-gapped protein samples
US17/953,286 US20230108241A1 (en) 2021-10-06 2022-09-26 Predicting variant pathogenicity from evolutionary conservation using three-dimensional (3d) protein structure voxels
PCT/US2022/045823 WO2023059750A1 (en) 2021-10-06 2022-10-05 Combined and transfer learning of a variant pathogenicity predictor using gapped and non-gapped protein samples

Publications (1)

Publication Number Publication Date
KR20240088641A (ko) 2024-06-20

Family

ID=89808344

Family Applications (3)

Application Number Title Priority Date Filing Date
KR1020237045483A Pending KR20240088641A (ko) 2021-10-06 2022-10-05 Combined and transfer learning of a variant pathogenicity predictor using gapped and non-gapped protein samples
KR1020237045482A Pending KR20240082270A (ko) 2021-10-06 2022-10-05 Protein structure-based protein language models
KR1020237045389A Pending KR20240082269A (ko) 2021-10-06 2022-10-05 Predicting variant pathogenicity from evolutionary conservation using three-dimensional (3D) protein structure voxels

Family Applications After (2)

Application Number Title Priority Date Filing Date
KR1020237045482A Pending KR20240082270A (ko) 2021-10-06 2022-10-05 Protein structure-based protein language models
KR1020237045389A Pending KR20240082269A (ko) 2021-10-06 2022-10-05 Predicting variant pathogenicity from evolutionary conservation using three-dimensional (3D) protein structure voxels

Country Status (4)

Country Link
EP (3) EP4413577A1
JP (3) JP2024538478A
KR (3) KR20240088641A
CN (2) CN117546242A

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
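CN117178327A (zh) * 2021-04-15 2023-12-05 Illumina, Inc. Multi-channel protein voxelization to predict variant pathogenicity using deep convolutional neural networks
CN118629516B (zh) * 2024-05-17 2025-09-16 Anhui Agricultural University Neuropeptide prediction method and system based on multimodal features and a Siamese network
CN119560009B (zh) * 2025-01-22 2025-06-24 Zhejiang University of Technology System and method for predicting associations between protein post-translational modifications and diseases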

Also Published As

Publication number Publication date
JP2024538478A (ja) 2024-10-23
EP4413575A1 (en) 2024-08-14
KR20240082270A (ko) 2024-06-10
KR20240082269A (ko) 2024-06-10
EP4413577A1 (en) 2024-08-14
EP4413576A1 (en) 2024-08-14
CN117642824A (zh) 2024-03-01
JP2024538477A (ja) 2024-10-23
CN117546242A (zh) 2024-02-09
JP2024538475A (ja) 2024-10-23

Similar Documents

Publication Publication Date Title
US12444482B2 (en) Multi-channel protein voxelization to predict variant pathogenicity using deep convolutional neural networks
US11515010B2 (en) Deep convolutional neural networks to predict variant pathogenicity using three-dimensional (3D) protein structures
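KR20240088641A (ko) Combined and transfer learning of a variant pathogenicity predictor using gapped and non-gapped protein samples
KR20230170680A (ko) Multi-channel protein voxelization to predict variant pathogenicity using deep convolutional neural networks
JP7755105B2 (ja) Deep convolutional neural networks to predict variant pathogenicity using three-dimensional (3D) protein structures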
US20230343413A1 (en) Protein structure-based protein language models
US20230108368A1 (en) Combined and transfer learning of a variant pathogenicity predictor using gapped and non-gapped protein samples
US20230047347A1 (en) Deep neural network-based variant pathogenicity prediction
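CN117581302A (zh) Combined and transfer learning of a variant pathogenicity predictor using gapped and non-gapped protein samples
CN117178327A (zh) Multi-channel protein voxelization to predict variant pathogenicity using deep convolutional neural networks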
WO2023059750A1 (en) Combined and transfer learning of a variant pathogenicity predictor using gapped and non-gapped protein samples

Legal Events

Date Code Title Description
PA0105 International application

Patent event date: 20231229

Patent event code: PA01051R01D

Comment text: International Patent Application

PG1501 Laying open of application