CL2022000397A1 - Systems and methods for protein prediction

Systems and methods for protein prediction

Info

Publication number
CL2022000397A1
Authority
CL
Chile
Prior art keywords
input
machine learning
learning model
methods
proteins
Prior art date
Application number
CL2022000397A
Inventor
Leonardo Alvarez
Roberto Ibañez
Patricio Alegre
Pedro Retamal
Simón Correa
Romualdo Paz
Javier Caceres-Delpiano
Cynthia Sanhueza
Juan Jiménez
Original Assignee
Geaenzymes Co
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Geaenzymes Co
Publication of CL2022000397A1

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B15/00: ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
    • G16B15/20: Protein or domain folding
    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K1/00: General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00: ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Organic Chemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • Theoretical Computer Science (AREA)
  • Biochemistry (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Genetics & Genomics (AREA)
  • Medicinal Chemistry (AREA)
  • Molecular Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Artificial Intelligence (AREA)
  • Bioethics (AREA)
  • Analytical Chemistry (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Epidemiology (AREA)
  • Evolutionary Computation (AREA)
  • Public Health (AREA)
  • Software Systems (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Image Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

Embodiments of the invention include systems and methods that enable the identification of candidate proteins having desired characteristics of a target protein. An exemplary method comprises receiving first and second input proteins. The method further comprises applying a first machine learning model to the first and second input proteins to generate corresponding fragments. The method further comprises applying a second machine learning model to the fragments, wherein applying the second machine learning model comprises generating an encoded representation in a multidimensional space for each of the fragments. The method also comprises generating a similarity score between the fragments of the first input and the second input. The method then comprises generating a hierarchical ranking of similarity between the first and second inputs according to the similarity score, and selecting the candidate proteins based on the hierarchical ranking.
CL2022000397A 2019-08-23 2022-02-17 Systems and methods for protein prediction CL2022000397A1 (es)
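
The sequence of steps in the abstract (fragment the inputs, encode each fragment, score similarity, rank, select) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the method of the patent: the abstract does not disclose the two machine learning models, so a fixed sliding window stands in for the first model, a 20-dimensional amino acid frequency vector stands in for the second model's encoded representation, and cosine similarity stands in for the similarity score. The function names, window sizes, and example sequences are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def fragment(sequence, size=8, step=4):
    """Stand-in for the first model: cut a sequence into overlapping windows."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    if len(sequence) <= size:
        return [sequence]
    return [sequence[i:i + size] for i in range(0, len(sequence) - size + 1, step)]


def encode(frag):
    """Stand-in for the second model: map a fragment to a 20-dimensional
    vector of amino acid frequencies (a toy 'multidimensional space')."""
    counts = Counter(frag)
    return [counts.get(aa, 0) / len(frag) for aa in AMINO_ACIDS]


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_score(seq_a, seq_b):
    """Average, over the fragments of seq_a, of the best cosine match
    among the fragments of seq_b (an assumed scoring rule)."""
    enc_a = [encode(f) for f in fragment(seq_a)]
    enc_b = [encode(f) for f in fragment(seq_b)]
    best = [max(cosine(u, v) for v in enc_b) for u in enc_a]
    return sum(best) / len(best)


def rank_candidates(target, candidates, top_k=2):
    """Rank candidate sequences by similarity to the target, highest first,
    and keep the top_k as the selected candidates."""
    scored = [(name, similarity_score(target, seq)) for name, seq in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Hypothetical sequences, chosen only to exercise the code.
    target = "MKTAYIAKQRQISFVKSHFSRQ"
    candidates = {
        "same": "MKTAYIAKQRQISFVKSHFSRQ",
        "variant": "MKTAYLAKQRQLSFVKSHYSRQ",
        "other": "GGGGGGGGGGGGGGGG",
    }
    for name, score in rank_candidates(target, candidates, top_k=3):
        print(f"{name}: {score:.3f}")
```

In a real system the two stand-ins would be replaced by trained models; the surrounding flow of scoring fragment pairs and sorting candidates would stay the same shape.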

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US201962891202P 2019-08-23 2019-08-23

Publications (1)

Publication Number Publication Date
CL2022000397A1 (es) 2022-09-30

Family

ID=74684318

Family Applications (1)

Application Number Priority Date Filing Date Title
CL2022000397A CL2022000397A1 (es) 2019-08-23 2022-02-17 Systems and methods for protein prediction

Country Status (5)

Country Link
US (1) US20220375539A1 (es)
EP (1) EP4018020A4 (es)
CL (1) CL2022000397A1 (es)
IL (1) IL290612A (es)
WO (1) WO2021041199A1 (es)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210174909A1 (en) * 2019-12-10 2021-06-10 Homodeus, Inc. Generative machine learning models for predicting functional protein sequences
US20210249105A1 (en) * 2020-02-06 2021-08-12 Salesforce.Com, Inc. Systems and methods for language modeling of protein engineering
US20220165359A1 (en) 2020-11-23 2022-05-26 Peptilogics, Inc. Generating anti-infective design spaces for selecting drug candidates
US11512345B1 (en) 2021-05-07 2022-11-29 Peptilogics, Inc. Methods and apparatuses for generating peptides by synthesizing a portion of a design space to identify peptides having non-canonical amino acids
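CN114678061A (zh) * 2022-02-09 2022-06-28 浙江大学杭州国际科创中心 Protein conformation-aware representation learning method based on a pre-trained language model
CN115050429A (zh) * 2022-05-17 2022-09-13 慧壹科技(上海)有限公司 PROTAC target molecule generation method, computer system, and storage medium
CN115497555B (zh) * 2022-08-16 2024-01-05 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Multi-species protein function prediction method, apparatus, device, and storage medium
WO2024076641A1 (en) * 2022-10-06 2024-04-11 Just-Evotec Biologics, Inc. Machine learning architecture to generate protein sequences
CN116130004B (zh) * 2023-01-06 2024-05-24 成都侣康科技有限公司 Method and system for the identification and processing of antimicrobial peptides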

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2931892B1 (en) * 2012-12-12 2018-09-12 The Broad Institute, Inc. Methods, models, systems, and apparatus for identifying target sequences for cas enzymes or crispr-cas systems for target sequences and conveying results thereof
US20150019232A1 (en) * 2013-07-10 2015-01-15 International Business Machines Corporation Identifying target patients for new drugs by mining real-world evidence
US9373059B1 (en) * 2014-05-05 2016-06-21 Atomwise Inc. Systems and methods for applying a convolutional network to spatial data
WO2015173803A2 (en) * 2014-05-11 2015-11-19 Ofek - Eshkolot Research And Development Ltd A system and method for generating detection of hidden relatedness between proteins via a protein connectivity network
EP3821433B1 (en) * 2018-09-21 2024-06-05 DeepMind Technologies Limited Iterative protein structure prediction using gradients of quality scores

Also Published As

Publication number Publication date
EP4018020A1 (en) 2022-06-29
WO2021041199A1 (en) 2021-03-04
EP4018020A4 (en) 2023-09-13
US20220375539A1 (en) 2022-11-24
IL290612A (en) 2022-04-01

Similar Documents

Publication Publication Date Title
CL2022000397A1 (es) Systems and methods for protein prediction
CO2017009672A2 (es) Determination of the motion information derivation mode in video coding
MX2020006803A (es) Decoding strategies for protein identification
AR107349A1 (es) Hybrid intra prediction
CO2019009920A2 (es) Method and apparatus for the compact representation of bioinformatics data using multiple genomic descriptors
WO2018183263A3 (en) Correcting error in a first classifier by evaluating classifier output in parallel
BR112019004335A2 (pt) Similarity search using polysemous codes
MX2020004145A (es) Deoxyribonuclease (DNase) variants
MX2016004674A (es) System and method for determining the execution sequence of a plurality of tasks
BR112017010222A2 (pt) Discriminating ambiguous expressions to enhance user experience
BR112018013550A2 (pt) Entity identification using a deep learning model
BR112018001230A2 (pt) Transfer learning in neural networks
BR112018077322A2 (pt) Systems and methods for identifying matching content
JP2016224994A5 (es)
CL2020003275A1 (es) Method and apparatus for inter prediction based on merge mode
AU2017408800A1 (en) Method and system of mining information, electronic device and readable storable medium
BR112018076406A2 (pt) Systems and methods for an image atlas
PH12018501123A1 (en) Information generation method and apparatus, information acquisition method and apparatus, information processing method and apparatus, and payment method and client
SV2016005288A (es) Methods and apparatus for coordinating system selection among a set of nodes
CL2021000671A1 (es) Image signal encoding/decoding method and device for the same
MX2020001411A (es) System, method, and computer program product for determining the category alignment of an account
BR112019000188A2 (pt) Computer-implemented method, non-transitory computer-readable medium, and computer-implemented system
MX2020007346A (es) Network section configuration method, first network element, and second network element
MX2022004644A (es) Improved search engine using joint learning for multi-label classification
BR112017014399A2 (pt) Multi-party encryption cube processing apparatuses, methods, and systems