WO2023065220A1 - A deep-learning-based method for predicting the water solubility of chemical molecules - Google Patents

A deep-learning-based method for predicting the water solubility of chemical molecules Download PDF

Info

Publication number
WO2023065220A1
Authority
WO
WIPO (PCT)
Prior art keywords
deep learning
chemical
model
learning model
smiles
Prior art date
Application number
PCT/CN2021/125323
Other languages
English (en)
French (fr)
Inventor
袁曙光
侯园园
王世玉
陈显翀
Original Assignee
深圳阿尔法分子科技有限责任公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳阿尔法分子科技有限责任公司 filed Critical 深圳阿尔法分子科技有限责任公司
Priority to PCT/CN2021/125323 priority Critical patent/WO2023065220A1/zh
Publication of WO2023065220A1 publication Critical patent/WO2023065220A1/zh


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/08 Learning methods
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C 20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C 20/30 Prediction of properties of chemical compounds, compositions or mixtures
    • G16C 20/70 Machine learning, data mining or chemometrics

Definitions

  • the present invention relates to the technical field of molecular water-solubility analysis, and more specifically, to a method for predicting chemical molecule-related water-solubility based on deep learning.
  • the purpose of the present invention is to overcome the defects of the above-mentioned prior art and provide a method for predicting the water solubility of chemical molecules based on deep learning.
  • a method for predicting the water solubility of chemical molecules based on deep learning includes the following steps:
  • Constructing a deep learning model, wherein the deep learning model is built on a bidirectional time-series model and an attention mechanism for learning the correspondence between chemical molecular structure sequences and water-solubility properties;
  • the deep learning model is trained with the goal of minimizing the set loss function.
  • the training process uses character sequence codes representing chemical molecular structures as input, and uses chemical molecule-related water-soluble attribute information as output.
  • a method for predicting chemical molecule-related water solubility includes the following steps:
  • the character sequence code is input into the trained deep learning model obtained according to the first aspect of the present invention, and the water-soluble property information related to the chemical molecule is obtained.
  • the present invention has the advantage of providing a data-driven end-to-end deep learning model (BCSA) and applying it to the prediction process of molecular water solubility.
  • the model provided by the present invention is simple and does not rely on additional auxiliary knowledge, and can also be used to predict other physicochemical and ADMET properties.
  • Fig. 1 is a schematic diagram of the architecture of an end-to-end deep learning model according to an embodiment of the present invention
  • Fig. 2 is a schematic diagram of changes in R2 during the training process of a verification set and a test set according to an embodiment of the present invention
  • Fig. 3 is a scatter diagram of prediction effects of four different models according to an embodiment of the present invention.
  • Fig. 4 is a scatter diagram of prediction results on a test set according to an embodiment of the present invention.
  • the deep learning-based chemical molecule-related water solubility prediction method generally includes the pre-training process of the deep learning model and the actual prediction process.
  • the pre-training process includes the following steps: constructing a deep learning model, which is constructed based on a two-way time series prediction model and an attention mechanism for learning the correspondence between chemical molecular structure sequences and water-soluble properties;
  • the goal of training the deep learning model is to minimize the loss function.
  • the training process takes the character sequence code representing the structure of the chemical molecule as input, and takes the information about the water-soluble properties of the chemical molecule as the output.
  • the bidirectional time-series model can be a bidirectional long short-term memory network (BILSTM), a bidirectional gated recurrent unit (BIGRU), or the like.
  • the character sequence characterizing the chemical molecular structure can be in the SMILES format or other character sequences.
  • SMILES is a specification for clearly describing the molecular structure with ASCII character strings. For clarity, the BILSTM model and SMILES are taken as examples below.
  • in the present invention, a BCSA model architecture is built on top of BILSTM, channel attention, and spatial attention, using SMILES {Weininger, 1988 #86} molecular representations.
  • to address the non-uniqueness of SMILES molecular representations, SMILES augmentation is used to enlarge the data, yielding more effective labeled data as model input; the mean over each molecule's augmented copies is taken as the final prediction, giving the model stronger generalization.
  • then, for the same dataset, several commonly used graph neural network models are compared against the present invention to explore the performance advantages of the provided model under different molecular representations.
  • the dataset used was derived from the work of Cui ⁇ Cui, 2020 #69 ⁇ et al. 2020, containing 9943 non-redundant compounds.
  • Molecules are presented in SMILES (Simplified Molecular-Input Line-Entry System) format. This symbol format is characterized by a single line of text and a sequence of atoms and covalent bonds. From the perspective of formal language theory, both atoms and covalent bonds are regarded as symbolic markers, and a SMILES string is just a sequence of symbols. This representation has been used to predict biochemical properties, and to encode SMILES, the present invention tokenizes them using regular expressions from ⁇ Schwaller, 2018 #64 ⁇ , and the tokens are separated by spaces.
  • the processing result is for example: "c1c(C)c ccc 1".
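The tokenization step described above can be sketched as follows. The regular expression is the commonly cited pattern from the Schwaller et al. line of work; the exact pattern used by the patent is not reproduced in the text, so treat this as an illustrative assumption rather than the patented implementation:

```python
import re

# Assumed SMILES tokenizer in the spirit of Schwaller et al.; bracket atoms,
# two-letter elements (Br, Cl), ring-closure digits, and bond symbols are
# each matched as single tokens.
SMILES_REGEX = re.compile(
    r"(\[[^\]]+\]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p|\(|\)|\."
    r"|=|#|-|\+|\\|/|:|~|@|\?|>|\*|\$|%[0-9]{2}|[0-9])"
)

def tokenize_smiles(smiles: str) -> str:
    """Split a SMILES string into tokens separated by spaces."""
    tokens = SMILES_REGEX.findall(smiles)
    # Sanity check: tokenization must not lose any characters.
    assert "".join(tokens) == smiles, "tokenization lost characters"
    return " ".join(tokens)

print(tokenize_smiles("c1c(C)cccc1"))  # -> c 1 c ( C ) c c c c 1
```

The space-separated output matches the style of the example result quoted in the text.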
  • a method similar to word2vec is used for embedding input.
  • the dataset is augmented with SMILES enumeration to extend the dataset, and the SMILES string is padded with "padding" to a fixed length of 150 characters. Excess text beyond this length is simply discarded.
  • the dataset is randomly split into training set (80%), validation set (10%) and test set (10%).
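The fixed-length padding and the 80/10/10 random split described above can be sketched as below. The pad token name and the shuffle seed are illustrative assumptions; the patent only fixes the length (150) and the split ratios:

```python
import random

def pad_tokens(tokens, max_len=150, pad="<pad>"):
    """Pad a token sequence to a fixed length of 150; excess is discarded,
    mirroring the patent's truncation of over-length SMILES."""
    tokens = list(tokens)[:max_len]
    return tokens + [pad] * (max_len - len(tokens))

def split_dataset(items, seed=0):
    """Random 80/10/10 train/validation/test split."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(0.8 * len(items))
    n_val = int(0.1 * len(items))
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])
```

SMILES enumeration itself (generating alternative strings for the same molecule) is typically done with RDKit's randomized canonicalization, which is omitted here to keep the sketch dependency-free.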
  • the main body of the deep learning model includes BILSTM, channel attention module and spatial attention module, which are used to learn the correspondence between chemical molecular structure sequences and water solubility properties.
  • BILSTM is mainly used to capture the sequence information of SMILES.
  • the present invention exploits the strength of RNN (recurrent neural network) models in natural language processing at handling long-distance relations within sequences, and obtains context information for SMILES sequences in batch mode using BILSTM, a special variant of the LSTM model.
  • BILSTM is composed of an LSTM that processes the sequence forward and an LSTM that processes it backward, which allows it to use features from both the past and the future.
  • BILSTM takes the SMILES sequence encoding X = {x_1, x_2, ..., x_T} as input. Each time step t outputs a forward hidden state h_t(fwd) and a backward hidden state h_t(bwd); the output of the BILSTM hidden layer at time t is the concatenation of the two states, h_t = [h_t(fwd) ; h_t(bwd)]. The process can be summarized as C = f(W_e x_i, h_{t-1}), where f is a multi-layer BILSTM and W_e is the learned weight of the embedding vector, simplified as C = {h_1, h_2, ..., h_T}.
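The forward/backward concatenation h_t = [h_t(fwd); h_t(bwd)] can be illustrated with a minimal NumPy BiLSTM. This is a didactic sketch, not the patent's pytorch implementation; the gate layout assumed here is the conventional [input, forget, cell, output] ordering:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step; gates ordered [input, forget, cell, output]."""
    H = h.shape[0]
    z = W @ x + U @ h + b
    i, f = sigmoid(z[:H]), sigmoid(z[H:2 * H])
    g, o = np.tanh(z[2 * H:3 * H]), sigmoid(z[3 * H:])
    c_new = f * c + i * g
    return o * np.tanh(c_new), c_new

def run_lstm(xs, W, U, b, H):
    """Run a unidirectional LSTM over the sequence, returning all h_t."""
    h, c = np.zeros(H), np.zeros(H)
    states = []
    for x in xs:
        h, c = lstm_step(x, h, c, W, U, b)
        states.append(h)
    return states

def bilstm(xs, fwd, bwd, H):
    """h_t = [h_t(fwd) ; h_t(bwd)]: concatenate the forward pass with the
    backward pass (run on the reversed sequence, then re-aligned)."""
    hs_f = run_lstm(xs, *fwd, H)
    hs_b = run_lstm(xs[::-1], *bwd, H)[::-1]
    return [np.concatenate([hf, hb]) for hf, hb in zip(hs_f, hs_b)]
```

Note that each half of the concatenated state comes from an independently parameterized LSTM, which is why the output dimension doubles.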
  • the embodiment of the present invention embeds an optimized CBAM (Convolution Block Attention Module) mechanism into the forward-propagating sequence neural network model. It comprises two sub-modules, one denoted the Channel Attention map (M_c) and the other the Spatial Attention map (M_s), which capture salient information along the channel and spatial axes respectively.
  • the Channel Attention Module mainly focuses on what the SMILES character content is.
  • the spatial information of the BILSTM output matrix is first aggregated through average-pooling and max-pooling operations, yielding two different spatial context descriptors C_avg and C_max, which represent the average-pooled and max-pooled output information respectively; the two descriptors are each fed into a shared 2-layer MLP network, and the output vector of the Channel Attention is finally obtained by summation.
  • the whole process is formalized as: M_c(C) = MLP(AvgPool1d(C)) + MLP(MaxPool1d(C)) = W_1(σ(W_0(C_avg))) + W_1(σ(W_0(C_max)))
  • to reduce network overhead, σ here uses, for example, the relu activation function; W_0 and W_1 are the learned weights of the first and second layers of the shared MLP (Multilayer Perceptron) model.
  • the Spatial Attention Module mainly focuses on the positional information of the SMILES character sequence.
  • in one embodiment it is implemented with a two-layer one-dimensional convolutional network with a kernel size of 7, formalized as: M_s(C) = Conv1d_{7,1}(σ(Conv1d_{7,16}(C)))
  • σ denotes the relu activation function, and Conv1d_{7,x} denotes a 1-dimensional convolutional layer with a kernel size of 7 and x filters.
  • O denotes the hidden-state mapping vector obtained after aggregating the attention-weighted states through the Avg-pooling operation.
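The channel and spatial attention equations above can be sketched in NumPy as follows. The placement of the sigmoid gates and the 'same' padding of the convolutions follow the standard CBAM formulation and are assumptions; biases are omitted, and the patent's pytorch modules may differ in detail:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(C, W0, W1):
    """M_c = sigmoid(MLP(avg-pool(C)) + MLP(max-pool(C))).
    C: (T, d) BILSTM output; W0, W1: shared two-layer MLP weights."""
    c_avg, c_max = C.mean(axis=0), C.max(axis=0)   # descriptors, shape (d,)
    mlp = lambda v: W1 @ relu(W0 @ v)
    return sigmoid(mlp(c_avg) + mlp(c_max))        # channel gate, shape (d,)

def conv1d_same(C, K):
    """'same'-padded 1-D convolution; C: (T, d_in), K: (k, d_in, d_out)."""
    k = K.shape[0]
    pad = k // 2
    Cp = np.pad(C, ((pad, pad), (0, 0)))
    return np.stack([np.einsum("kd,kdo->o", Cp[t:t + k], K)
                     for t in range(C.shape[0])])

def spatial_attention(C, K0, K1):
    """M_s = sigmoid(Conv1d_{7,1}(relu(Conv1d_{7,16}(C))))."""
    return sigmoid(conv1d_same(relu(conv1d_same(C, K0)), K1))  # (T, 1)

def cbam(C, W0, W1, K0, K1):
    """Channel gate, then spatial gate, then average pooling to O."""
    Cc = C * channel_attention(C, W0, W1)    # channel-refined features
    Cs = Cc * spatial_attention(Cc, K0, K1)  # spatially weighted features
    return Cs.mean(axis=0)                   # O: pooled hidden vector, (d,)
```

Kernel shapes (7, d, 16) and (7, 16, 1) correspond to the "kernel size 7, 16 then 1 filters" stated in the text.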
  • the last part of the regression task feeds the attended vector O into a two-layer fully connected network to predict the final property value.
  • relu, which is commonly used in deep learning research, can serve as the intermediate activation function, and dropout can be used to mitigate overfitting.
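A minimal sketch of the two-layer fully connected regression head just described; the hidden size and the explicit dropout mask are illustrative assumptions:

```python
import numpy as np

def regression_head(o, W1, b1, W2, b2, drop_mask=None):
    """Two fully connected layers mapping the attended vector O to the final
    predicted property value; relu in between, optional dropout mask."""
    h = np.maximum(W1 @ o + b1, 0.0)   # hidden layer with relu activation
    if drop_mask is not None:          # dropout: zero out units (training only)
        h = h * drop_mask
    return float(W2 @ h + b2)          # linear output: one scalar property
```

At inference time the dropout mask is simply omitted, matching the usual train/eval distinction.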
  • during training, MSE (mean squared error) is used as the loss function: MSE = (1/N) Σ_i (ŷ_i − y_i)², where N denotes the size of the training data, ŷ_i denotes the predicted value, and y_i denotes the experimental true value.
  • Bayesian optimization {Bergstra, 2011 #92} is used to explore the best choice of hyperparameters, taking 1 − R² = Σ_i (ŷ_i − y_i)² / Σ_i (y_i − ȳ)² as the objective acquisition function to minimize, where ŷ_i denotes the predicted value, y_i denotes the true value, and ȳ denotes the mean of the experimental true values.
  • during the optimization, the TPE (Tree-structured Parzen Estimator) algorithm builds a probability model from past results.
  • the best training hyperparameters are selected by the best prediction performance on the validation set, as shown in Table 1.
  • the model is then further trained on the enumeration (augmented) training set for another 30 epochs in order to improve the final accuracy.
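The search objective, 1 − R², can be written directly as below; a TPE-driven search (e.g. with the hyperopt library) would minimize this quantity on the validation set over the hyperparameter space, which is omitted here:

```python
import numpy as np

def tuning_objective(y_hat, y):
    """1 - R^2 = sum((y_hat_i - y_i)^2) / sum((y_i - y_bar)^2): the quantity
    the Bayesian hyperparameter search minimizes on the validation set."""
    y_hat, y = np.asarray(y_hat, float), np.asarray(y, float)
    return float(np.sum((y_hat - y) ** 2) / np.sum((y - y.mean()) ** 2))
```

A perfect model scores 0; a model no better than predicting the mean scores 1.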
  • the framework of the model is implemented in pytorch, and all computation and model training are performed on a Linux server (openSUSE) with an Intel(R) Xeon(R) Platinum 8173M CPU @ 2.00GHz and an Nvidia GeForce RTX 2080 Ti graphics card with 11 GB of memory.
  • the provided model is evaluated using four performance metrics commonly used in regression tasks: the coefficient of determination R-Squared (R²), the Spearman coefficient, RMSE, and MAE.
  • the R² and Spearman coefficient measures indicate how well the model fits the data: the closer the result is to 1, the better the fit, and vice versa.
  • the RMSE and MAE error measures quantify the difference between the predicted value and the true value: the closer the result is to 0, the better the prediction, and vice versa.
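The four metrics can be computed as below. The Spearman implementation ignores ties, which suffices for illustration; a production version would use scipy.stats.spearmanr:

```python
import numpy as np

def rmse(y_hat, y):
    """Root mean squared error; closer to 0 is better."""
    y_hat, y = np.asarray(y_hat, float), np.asarray(y, float)
    return float(np.sqrt(np.mean((y_hat - y) ** 2)))

def mae(y_hat, y):
    """Mean absolute error; closer to 0 is better."""
    y_hat, y = np.asarray(y_hat, float), np.asarray(y, float)
    return float(np.mean(np.abs(y_hat - y)))

def r_squared(y_hat, y):
    """R^2 = 1 - sum((y_hat - y)^2) / sum((y - y_bar)^2); closer to 1 is better."""
    y_hat, y = np.asarray(y_hat, float), np.asarray(y, float)
    return float(1.0 - np.sum((y_hat - y) ** 2) / np.sum((y - y.mean()) ** 2))

def spearman(y_hat, y):
    """Spearman rank correlation: Pearson correlation of the ranks
    (no tie handling in this sketch)."""
    rank = lambda v: np.argsort(np.argsort(np.asarray(v))).astype(float)
    return float(np.corrcoef(rank(y_hat), rank(y))[0, 1])
```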
  • the purpose of the present invention is to develop a deep learning model from molecular SMILES sequence self-encoding, exploring the effect of deep neural networks based on SMILES molecular sequence descriptors on predicting molecular solubility.
  • the original dataset includes 7955 training, 996 validation, and 995 test molecules.
  • using the best hyperparameters of Table 1, a BILSTM model was built, and the BCSA model was built on top of it.
  • Figure 2 shows the trend of the fitting metric R2 on the validation and test sets over 400 training epochs (curve smoothing = 0.8). It is clear from the figure that the model of the present invention fits and generalizes better than the BILSTM model on both the validation and test sets.
  • since the model of the present invention is based on SMILES molecular sequence encodings, and a given molecule admits many different SMILES strings, i.e., multiple sequence encodings, data augmentation is both feasible and necessary.
  • the SMILES augmentation technique is further used to enlarge the original split datasets, training BCSA models with 20-fold (each molecule represented by 20 SMILES) and 40-fold (each molecule represented by 40 SMILES) augmentation.
  • structurally simple molecules may produce duplicate SMILES, so duplicates are removed before training the BCSA model.
  • the resulting train/validation/test sets contain (134454:19881:16834) and (239260:30042:39800) augmented samples respectively.
  • the model with the best validation-set R2 during training was selected, and the mean over each test-set molecule's augmented copies was used as the final prediction, to measure the model's ability to extract molecular sequence information.
  • the results are shown in Table 2.
  • the validation results show that augmentation significantly improves the stability and generalization of the model; our model achieved the best results on the SMILES40 dataset, which indicates that the augmented model better attends to the different sequence encodings of a molecule.
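Averaging each molecule's predictions over its augmented SMILES copies, as described above, reduces to a simple group-by mean:

```python
from collections import defaultdict

def average_over_augmentations(mol_ids, preds):
    """Each molecule appears once per augmented SMILES copy; its final
    predicted value is the mean over all of its copies."""
    acc = defaultdict(list)
    for m, p in zip(mol_ids, preds):
        acc[m].append(p)
    return {m: sum(v) / len(v) for m, v in acc.items()}
```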
  • Table 2 Statistics of prediction results of training set and test set
  • the molecular-augmentation-based SEBSCA model of the present invention achieves the best molecular-solubility prediction and predicts well across data ranges, showing that the model of the present invention has a clear competitive advantage.
  • the logP dataset is still based on the Cui ⁇ Cui, 2020 #69 ⁇ et al. dataset.
  • good results are achieved on the test dataset, with an R2 of 0.99 and an RMSE of 0.29; the scatter plot shows that the data in every range achieve a good fit.
  • the present invention proposes an end-to-end deep learning model framework based on molecular augmentation, using an LSTM fused with an attention mechanism. It exploits the sequence-processing strengths of long short-term memory networks, adds improved channel attention and spatial attention modules to extract the information in SMILES sequences most relevant to water-solubility prediction, and uses Bayesian optimization. The provided model is thus simple, does not depend on additional auxiliary knowledge (such as the complex spatial structure of molecules), and can also be used to predict other physicochemical and ADMET properties (absorption, distribution, metabolism, excretion, and toxicity).
  • the present invention can be a system, method and/or computer program product.
  • a computer program product may include a computer readable storage medium having computer readable program instructions thereon for causing a processor to implement various aspects of the present invention.
  • a computer readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device.
  • a computer readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • Computer-readable storage media include: portable computer diskettes, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), compact disc read-only memory (CD-ROM), digital versatile discs (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing.
  • computer-readable storage media are not to be construed as transient signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., pulses of light through fiber optic cables), or transmitted electrical signals.
  • Computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or downloaded to an external computer or external storage device over a network, such as the Internet, a local area network, a wide area network, and/or a wireless network.
  • the network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
  • a network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within the respective computing/processing device.
  • Computer program instructions for carrying out operations of the present invention may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, C++, and Python, and conventional procedural programming languages such as the "C" language or similar programming languages.
  • computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
  • the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • in some embodiments, an electronic circuit, such as a programmable logic circuit, field-programmable gate array (FPGA), or programmable logic array (PLA), can be personalized with state information of the computer-readable program instructions, and this electronic circuit can execute the computer-readable program instructions to implement various aspects of the present invention.
  • These computer-readable program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor, create an apparatus for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • these computer-readable program instructions may also be stored in a computer-readable storage medium and cause computers, programmable data processing devices, and/or other devices to work in a specific way, so that the computer-readable medium storing the instructions comprises an article of manufacture including instructions that implement aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • each block in a flowchart or block diagram may represent a module, a program segment, or a portion of instructions, which comprises one or more executable instructions for implementing the specified logical function.
  • the functions noted in a block may occur out of the order noted in the figures; for example, two blocks shown in succession may in fact be executed substantially concurrently, or sometimes in the reverse order, depending on the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks, can be implemented by a dedicated hardware-based system that performs the specified functions or acts, or by a combination of dedicated hardware and computer instructions. It is well known to those skilled in the art that implementation in hardware, implementation in software, and implementation in a combination of software and hardware are all equivalent.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Chemical & Material Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention discloses a deep-learning-based method for predicting the water solubility of chemical molecules. The method comprises: constructing a deep learning model, wherein the deep learning model is built on a bidirectional time-series model and an attention mechanism and learns the correspondence between chemical molecular structure sequences and water-solubility properties; and training the deep learning model with the goal of minimizing a set loss function, taking character-sequence encodings characterizing chemical molecular structures as input and water-solubility property information of the molecules as output. The deep learning model trained by the present invention can accurately predict water solubility and other related properties.

Description

A deep-learning-based method for predicting the water solubility of chemical molecules
Technical Field
The present invention relates to the technical field of molecular water-solubility analysis, and more specifically, to a deep-learning-based method for predicting the water solubility of chemical molecules.
Background Art
In recent years, deep learning has been successfully applied to object detection and image segmentation, providing a useful tool for processing large amounts of data and making useful predictions in scientific fields. However, applying deep learning frameworks to molecular property prediction remains a challenging research problem. Driven by the emergence of new experimental techniques and the marked growth of available compound-activity and biomedical data, the application of deep learning in drug discovery has been further promoted, for example in predicting molecular interactions in pharmaceutical drug-design pipelines, exploring drug-target interaction prediction, exploring chemical synthesis and retrosynthetic routes, and predicting chemical properties.
It is foreseeable that deep learning will participate even more in drug discovery in the future. Throughout the history of drug discovery, water solubility, an important physicochemical molecular property, has been studied intensively for many years. Various representations of chemical information and deep learning architectures have been applied to the solubility-prediction problem. Different models depend on the choice of representation; the most common combinations include molecular fingerprints with fully connected neural networks, SMILES representations with recurrent neural networks, and molecular graphs with graph neural networks. In existing water-solubility prediction architectures, training-set sizes range from 100 to 10000. Because the datasets used differ, reported performance varies widely, and many challenges remain, such as dataset noise and the complex spatial structure of molecules.
In summary, building a stable and robust deep learning model that performs well on molecular water-solubility prediction, thereby saving time and cost in drug development, remains a problem well worth studying.
Summary of the Invention
The purpose of the present invention is to overcome the defects of the above prior art and provide a deep-learning-based method for predicting the water solubility of chemical molecules.
According to a first aspect of the present invention, a deep-learning-based method for predicting the water solubility of chemical molecules is provided. The method comprises the following steps:
constructing a deep learning model, wherein the deep learning model is built on a bidirectional time-series model and an attention mechanism and learns the correspondence between chemical molecular structure sequences and water-solubility properties;
training the deep learning model with the goal of minimizing a set loss function, taking character-sequence encodings characterizing chemical molecular structures as input and water-solubility property information of the molecules as output.
According to a second aspect of the present invention, a method for predicting the water solubility of chemical molecules is provided. The method comprises the following steps:
obtaining a character-sequence encoding characterizing the structure of the chemical molecule to be tested;
inputting the character-sequence encoding into the trained deep learning model obtained according to the above first aspect of the present invention, to obtain the water-solubility property information of the chemical molecule.
Compared with the prior art, the advantage of the present invention is that it provides a data-driven end-to-end deep learning model (BCSA) and applies it to the prediction of molecular water solubility. The provided model is simple, does not rely on additional auxiliary knowledge, and can also be used to predict other physicochemical and ADMET properties.
Other features and advantages of the present invention will become clear from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the present invention and, together with the description, serve to explain the principles of the invention.
Fig. 1 is a schematic diagram of the architecture of an end-to-end deep learning model according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the change of R2 on the validation and test sets during training according to an embodiment of the present invention;
Fig. 3 is a scatter plot of the prediction performance of four different models according to an embodiment of the present invention;
Fig. 4 is a scatter plot of prediction results on the test set according to an embodiment of the present invention.
Detailed Description of the Embodiments
Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that, unless otherwise specified, the relative arrangement of components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the present invention.
The following description of at least one exemplary embodiment is merely illustrative and in no way limits the present invention or its application or use.
Techniques, methods, and devices known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate, such techniques, methods, and devices should be regarded as part of the specification.
In all examples shown and discussed here, any specific value should be interpreted as merely exemplary rather than limiting; other examples of the exemplary embodiments may therefore have different values.
It should be noted that similar reference numerals and letters denote similar items in the following figures; once an item is defined in one figure, it need not be discussed further in subsequent figures.
In brief, the deep-learning-based method for predicting the water solubility of chemical molecules provided by the present invention comprises, overall, a pre-training process of the deep learning model and an actual prediction process. The pre-training process includes the following steps: constructing a deep learning model, built on a bidirectional time-series model and an attention mechanism, for learning the correspondence between chemical molecular structure sequences and water-solubility properties; and training the deep learning model with the goal of minimizing a set loss function, with character-sequence encodings characterizing molecular structures as input and water-solubility property information as output. The bidirectional time-series model may be a bidirectional long short-term memory network (BILSTM), a bidirectional gated recurrent unit (BIGRU), or the like. The character sequence characterizing the molecular structure may be in SMILES format or another format; SMILES is a specification for unambiguously describing molecular structure with ASCII strings. For clarity, the BILSTM model and SMILES are used as examples below.
In the present invention, a BCSA model architecture is built on top of BILSTM, channel attention, and spatial attention, using SMILES {Weininger, 1988 #86} molecular representations. To address the non-uniqueness of SMILES molecular representations, SMILES augmentation is used to enlarge the data, yielding more effective labeled data as model input; the mean over each molecule's augmented copies is taken as the final prediction, giving the model stronger generalization. Then, for the same dataset, several commonly used graph neural network models are compared against the present invention to explore the performance advantages of the provided model under different molecular representations.
The data preprocessing, model architecture, and evaluation results are described in detail below.
1. Representation and preprocessing of the molecular dataset
In one embodiment, the dataset is derived from the 2020 work of Cui {Cui, 2020 #69} et al. and contains 9943 non-redundant compounds. Molecules are given in SMILES (Simplified Molecular-Input Line-Entry System) format. This notation is characterized by a single line of text encoding a sequence of atoms and covalent bonds. From the viewpoint of formal language theory, both atoms and covalent bonds are treated as symbolic tokens, and a SMILES string is simply a sequence of symbols. This representation has been used to predict biochemical properties. To encode SMILES, the present invention tokenizes them using the regular expression from {Schwaller, 2018 #64}, with tokens separated by spaces; a processing result is, for example, "c1c(C)c ccc 1". Next, a word2vec-like method is used to embed the input. In addition, the dataset is expanded by SMILES enumeration, and SMILES strings are padded to a fixed length of 150 characters; any excess beyond this length is simply discarded. Finally, the dataset is randomly split into a training set (80%), a validation set (10%), and a test set (10%).
2. Deep learning model architecture
As shown in Fig. 1, the main body of the deep learning model comprises the BILSTM, a channel attention module, and a spatial attention module, which learn the correspondence between chemical molecular structure sequences and water-solubility properties.
The BILSTM mainly captures the sequence information of SMILES. The present invention exploits the strength of RNN (recurrent neural network) models in natural language processing at handling long-distance relations within sequences, and obtains context information for SMILES sequences in batch mode using BILSTM, a special variant of the LSTM model. A BILSTM is composed of one LSTM that processes the sequence forward and one that processes it backward, so it can use features from both the past and the future. The BILSTM takes the SMILES sequence encoding X = {x_1, x_2, ..., x_T} as input; each time step t outputs a forward hidden state h_t(fwd) and a backward hidden state h_t(bwd), and the output of the BILSTM hidden layer at time t is the concatenation of the two states, which can be expressed as:
h_t = [h_t(fwd) ; h_t(bwd)]    (1)
Further, the BILSTM process can be summarized as:
C = f(W_e x_i, h_{t-1})    (2)
where f denotes a multi-layer BILSTM and W_e is the learned weight of the embedding vector, written in simplified form as:
C = {h_1, h_2, ..., h_T}    (3)
For the attention mechanism, the embodiment of the present invention embeds an optimized CBAM (Convolution Block Attention Module) mechanism into the forward-propagating sequence neural network model. It comprises two sub-modules, one denoted the Channel Attention map (M_c) and the other the Spatial Attention map (M_s), which capture salient information along the channel and spatial axes respectively. The whole attention output process can be expressed as:
C' = M_s(M_c(C) ⊗ C) ⊗ (M_c(C) ⊗ C)    (4)
where ⊗ denotes element-wise multiplication, σ (the sigmoid activation function) is applied within M_c and M_s to produce the attention maps, and C' is the final output.
Specifically, the Channel Attention Module mainly focuses on what the SMILES character content is. For example, the spatial information of the BILSTM output matrix is first aggregated through average-pooling and max-pooling operations, giving two different spatial context descriptors C_avg and C_max, which represent the average-pooled and max-pooled output information respectively; the two descriptors are each fed into a shared 2-layer MLP network, and the output vector of the Channel Attention is finally obtained by summation. The whole process is formalized as:
M_c(C) = MLP(AvgPool1d(C)) + MLP(MaxPool1d(C)) = W_1(σ(W_0(C_avg))) + W_1(σ(W_0(C_max)))    (5)
To reduce network overhead, σ here uses, for example, the relu activation function; W_0 and W_1 are the learned weights of the first and second layers of the shared MLP (Multilayer Perceptron) model.
The Spatial Attention Module mainly focuses on the positional information of the SMILES character sequence. In one embodiment, it is implemented with a two-layer one-dimensional convolutional network with a kernel size of 7, formalized as:
M_s(C) = Conv1d_{7,1}(σ(Conv1d_{7,16}(C)))    (6)
where σ denotes the relu activation function, and Conv1d_{7,x} denotes a 1-dimensional convolutional layer with a kernel size of 7 and x filters. The whole attention network module is finally expressed as:
O = AvgPool1d(C')    (7)
where C' is the attention output of equation (4), ⊗ denotes element-wise multiplication, and O denotes the hidden-state mapping vector obtained after aggregating the attention-weighted states through the Avg-pooling operation.
In the present invention, the last part of the regression task feeds the vector O into a two-layer fully connected network to predict the final property value. For example, relu, commonly used in deep learning research, can serve as the intermediate activation function, and dropout can be used to mitigate overfitting. During training, MSE (mean squared error) is used as the loss function for model training, expressed as:
MSE = (1/N) Σ_{i=1..N} (ŷ_i − y_i)²    (8)
where N denotes the size of the training data, ŷ_i denotes the predicted value, and y_i denotes the experimental true value.
3. Selection of hyperparameters
Many parameters in the provided model affect training and architecture, and the model performs differently under different parameter settings. In one embodiment, Bayesian optimization {Bergstra, 2011 #92} is used to explore the best choice of hyperparameters, taking
1 − R² = Σ_i (ŷ_i − y_i)² / Σ_i (y_i − ȳ)²
as the objective acquisition function to minimize, where ŷ_i denotes the predicted value, y_i denotes the true value, and ȳ denotes the mean of the experimental true values. During the optimization, the TPE (Tree-structured Parzen Estimator) algorithm builds a probability model from past results. Training on the training set generated 100 models in total, each trained for 60 epochs, with an early-stopping strategy (patience = 20) added to speed up training. The best training hyperparameters, selected by the best prediction performance on the validation set, are shown in Table 1. Finally, the model is further trained on the enumeration training set for another 30 epochs in the expectation of improving the final accuracy.
Table 1: Hyperparameter search space and optimal hyperparameters
[Table 1 appears as an image in the original document.]
The framework of the model is implemented in pytorch, and all computation and model training are performed on a Linux server (openSUSE) with an Intel(R) Xeon(R) Platinum 8173M CPU @ 2.00GHz and an Nvidia GeForce RTX 2080 Ti graphics card with 11 GB of memory.
4. Evaluation criteria
In one embodiment, four performance metrics commonly used in regression tasks are used to evaluate the provided model: the coefficient of determination R-Squared (R²), the Spearman coefficient, RMSE, and MAE. The R² and Spearman coefficient measures indicate how well the model fits the data: the closer the result is to 1, the better the fit, and vice versa. The RMSE and MAE error measures quantify the difference between the predicted value and the true value: the closer the result is to 0, the better the prediction, and vice versa.
5. Validation results for water solubility
The purpose of the present invention is to develop a deep learning model from molecular SMILES sequence self-encoding, exploring the effect of deep neural networks based on SMILES molecular sequence descriptors on predicting molecular solubility. For example, the original dataset includes 7955 training, 996 validation, and 995 test molecules. Using the best hyperparameters of Table 1, a BILSTM model was built, and the BCSA model was built on top of it. Fig. 2 shows the trend of the fitting metric R2 on the validation and test sets over 400 training epochs (curve smoothing = 0.8). It is clear from the figure that the model of the present invention fits and generalizes better than the BILSTM model on both the validation and test sets.
In deep learning, more samples generally yield better training results and stronger generalization. Since the model of the present invention is based on SMILES molecular sequence encodings, and a given molecule admits many different SMILES strings, i.e., multiple sequence encodings, data augmentation is both feasible and necessary. Preferably, the SMILES augmentation technique is further applied to the original split datasets, training BCSA models with 20-fold (each molecule represented by 20 SMILES) and 40-fold (each molecule represented by 40 SMILES) augmentation; structurally simple molecules may produce duplicate SMILES. To avoid affecting the training results, duplicates were removed, giving final train/validation/test sets of (134454:19881:16834) and (239260:30042:39800) augmented samples respectively. In the experiments, the model with the best validation-set R2 during training was selected, and the mean over each test-set molecule's augmented copies was used as the final prediction to measure the model's ability to extract molecular sequence information; the results are shown in Table 2. The validation results show that augmentation significantly improves the stability and generalization of the model, and our model achieved the best results on the SMILES40 dataset, indicating that the augmented model better attends to the different sequence encodings of a molecule. With further molecular augmentation the model's accuracy improves, reaching R2 = 0.83-0.88 and RMSE = 0.79-0.95 on the test set. Compared with the deeper-net model that Cui previously developed on this dataset from molecular fingerprints (R2 = 0.72-0.79, RMSE = 0.988-1.151), the present invention shows better predictive performance.
Table 2: Statistics of prediction results on the training and test sets
[Table 2 appears as an image in the original document.]
To better demonstrate the competitiveness of the model of the present invention, a series of graph-neural-network baselines was further built: GCN {Kipf, 2016 #3}, MPNN {Gilmer, 2017 #50}, and AttentiveFP {Pérez Santín, 2021 #53}, to examine the influence of augmentation-based sequence descriptors versus molecular-graph descriptors on solubility prediction. All of these models were implemented with DGL-LifeSci, the life-science Python package released by the DGL team. Fig. 3 shows scatter plots of predicted versus experimental solubility values on a common test set for the different models. The figure shows that the molecular-augmentation-based SEBSCA model of the present invention achieves the best molecular-solubility prediction and predicts well across data ranges; the model of the present invention thus has a clear competitive advantage.
6. Prediction of other related properties
In the experiments, the BCSA (SMILES40) model was also used to predict the oil-water partition coefficients logP and logD (pH = 7.4). The logP dataset is again based on the dataset of Cui {Cui, 2020 #69} et al. As the left panel of Fig. 4 shows, good results were achieved on the test dataset, with an R2 of 0.99 and an RMSE of 0.29; the scatter plot shows a good fit across all data ranges. In addition, the logD (pH = 7.4) training dataset comes from Wang et al. and was randomly split 8:1:1. The training data were obtained using 40x SMILES Enumeration, finally giving a 40x dataset in the ratio 31290:3858:4031 (train:validation:test). The average prediction for each molecule was taken as the final result. As the right panel of Fig. 4 shows, the test-set R2 is 0.93 with an RMSE of 0.36. Compared with the reported Wang SVM model (test set R2 = 0.89, RMSE = 0.56; training set R2 = 0.92, RMSE = 0.51), the test-set predictions of the provided model even exceed the training-set performance of Wang {Wang, 2015 #97}. The present invention thus also shows better performance on oil-water-related predictions and can provide reliable and robust predictions.
In summary, since accurately predicting water solubility is a challenging task in drug discovery, the present invention proposes an end-to-end deep learning model framework based on molecular augmentation that fuses an LSTM with an attention mechanism. The model exploits the sequence-processing strengths of long short-term memory networks, adds improved channel attention and spatial attention modules to extract the information in SMILES sequences most relevant to water-solubility prediction, and uses Bayesian optimization, making the provided model simple and independent of additional auxiliary knowledge (such as the complex spatial structure of molecules); it can also be used to predict other physicochemical and ADMET properties (absorption, distribution, metabolism, excretion, and toxicity).
本发明可以是系统、方法和/或计算机程序产品。计算机程序产品可以包括计算机可读存储介质,其上载有用于使处理器实现本发明的各个方面的计算机可读程序指令。
计算机可读存储介质可以是可以保持和存储由指令执行设备使用的指令的有形设备。计算机可读存储介质例如可以是但不限于电存储设备、磁存储设备、光存储设备、电磁存储设备、半导体存储设备或者上述的任 意合适的组合。计算机可读存储介质的更具体的例子(非穷举的列表)包括:便携式计算机盘、硬盘、随机存取存储器(RAM)、只读存储器(ROM)、可擦式可编程只读存储器(EPROM或闪存)、静态随机存取存储器(SRAM)、便携式压缩盘只读存储器(CD-ROM)、数字多功能盘(DVD)、记忆棒、软盘、机械编码设备、例如其上存储有指令的打孔卡或凹槽内凸起结构、以及上述的任意合适的组合。这里所使用的计算机可读存储介质不被解释为瞬时信号本身,诸如无线电波或者其他自由传播的电磁波、通过波导或其他传输媒介传播的电磁波(例如,通过光纤电缆的光脉冲)、或者通过电线传输的电信号。
这里所描述的计算机可读程序指令可以从计算机可读存储介质下载到各个计算/处理设备,或者通过网络、例如因特网、局域网、广域网和/或无线网下载到外部计算机或外部存储设备。网络可以包括铜传输电缆、光纤传输、无线传输、路由器、防火墙、交换机、网关计算机和/或边缘服务器。每个计算/处理设备中的网络适配卡或者网络接口从网络接收计算机可读程序指令,并转发该计算机可读程序指令,以供存储在各个计算/处理设备中的计算机可读存储介质中。
Computer program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, C++, or Python, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In scenarios involving a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, for example, programmable logic circuitry, a field-programmable gate array (FPGA), or a programmable logic array (PLA), may be personalized by utilizing state information of the computer-readable program instructions, and the electronic circuitry may execute the computer-readable program instructions in order to implement aspects of the present invention.
Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowchart and/or block diagram. These computer-readable program instructions may also be stored in a computer-readable storage medium, where the instructions cause a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the functions/acts specified in one or more blocks of the flowchart and/or block diagram.
The computer-readable program instructions may also be loaded onto a computer, another programmable data processing apparatus, or another device to cause a series of operational steps to be performed on the computer, other programmable apparatus, or other device to produce a computer-implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in one or more blocks of the flowchart and/or block diagram.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may in fact be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending on the functionality involved. It will also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by special-purpose hardware-based systems that perform the specified functions or acts, or by combinations of special-purpose hardware and computer instructions. It is well known to those skilled in the art that implementation in hardware, implementation in software, and implementation in a combination of software and hardware are all equivalent.
Embodiments of the present invention have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein. The scope of the present invention is defined by the appended claims.

Claims (10)

  1. A deep-learning-based method for predicting water-solubility-related properties of chemical molecules, comprising the following steps:
    constructing a deep learning model, wherein the deep learning model is built on a bidirectional time-series prediction model and an attention mechanism and is used to learn the correspondence between chemical molecular structure sequences and water-solubility properties;
    training the deep learning model with the objective of minimizing a set loss function, the training process taking character-sequence encodings that represent chemical molecular structures as input and water-solubility-related property information of the chemical molecules as output.
  2. The method according to claim 1, wherein the deep learning model is a bidirectional long short-term memory network in which a channel attention module and a spatial attention module are embedded in the forward propagation to capture information along the different channels and along the spatial axis, respectively.
  3. The method according to claim 2, wherein the character-sequence encoding representing the chemical molecular structure is a SMILES sequence encoding; for the bidirectional long short-term memory network, the SMILES sequence encoding is used as input, denoted
    $x = (x_1, x_2, \ldots, x_n)$
    at each time step t the network outputs a forward hidden state
    $\overrightarrow{h_t}$
    and a backward hidden state
    $\overleftarrow{h_t}$
    and the output of the hidden layer of the bidirectional long short-term memory network at time t is the concatenation of the two states, denoted
    $h_t = [\overrightarrow{h_t}; \overleftarrow{h_t}]$
    the processing of the bidirectional long short-term memory network is expressed as:
    $C = f(W_e x_i, h_{t-1})$
    where f denotes a multi-layer bidirectional long short-term memory network and $W_e$ is the learned weight of the embedding vectors.
  4. The method according to claim 3, wherein the channel attention module is used to characterize the SMILES character content and performs the following steps:
    aggregating the spatial information of the output matrix of the bidirectional long short-term memory network by an average pooling operation and a max pooling operation to obtain two different spatial context descriptors $C_{avg}$ and $C_{max}$;
    feeding the two descriptors $C_{avg}$ and $C_{max}$ separately into a shared multi-layer perceptron and obtaining the output vector of the channel attention by summation;
    wherein $C_{avg}$ and $C_{max}$ denote the average-pooled information and the max-pooled information, respectively.
  5. The method according to claim 4, wherein the shared multi-layer perceptron is a two-layer shared perceptron, and the processing of the channel attention module is expressed as:
    $M_c(C) = \mathrm{MLP}(\mathrm{AvgPool1d}(C)) + \mathrm{MLP}(\mathrm{MaxPool1d}(C)) = W_1(\sigma(W_0(C_{avg}))) + W_1(\sigma(W_0(C_{max})))$
    where $\sigma$ denotes the ReLU activation function, and $W_0$ and $W_1$ are the learned weights of the first and second layers of the shared multi-layer perceptron, respectively.
  6. The method according to claim 5, wherein the spatial attention module is used to characterize the sequence-information part of the SMILES string and is implemented with two one-dimensional convolution layers with kernel size 7, expressed as:
    $M_s(C) = \mathrm{Conv1d}_{7,1}(\sigma(\mathrm{Conv1d}_{7,16}(C)))$
    where $\sigma$ denotes the ReLU activation function and $\mathrm{Conv1d}_{7,x}$ denotes a one-dimensional convolution layer with kernel size 7 and x filters; the overall attention mechanism is expressed as:
    $O = M_s(M_c(C) \odot C) \odot (M_c(C) \odot C)$
    where
    $\odot$
    denotes element-wise multiplication.
  7. The method according to claim 6, wherein the obtained vector O is fed into a two-layer fully connected layer to predict the corresponding water-solubility-related property value of the chemical molecule.
  8. The method according to claim 1, wherein the loss function is set as:
    $L = \frac{1}{N}\sum_{i=1}^{N}(\hat{y}_i - y_i)^2$
    where N denotes the size of the training data,
    $\hat{y}_i$
    denotes the predicted value, and $y_i$ denotes the labeled true value.
  9. A method for predicting water-solubility-related properties of a chemical molecule, comprising the following steps:
    obtaining a character-sequence encoding representing the structure of the chemical molecule to be tested;
    inputting the character-sequence encoding into a trained deep learning model obtained according to the method of any one of claims 1 to 8 to obtain the water-solubility-related property information of the chemical molecule.
  10. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 8 or claim 9.
PCT/CN2021/125323 2021-10-21 2021-10-21 A deep-learning-based method for predicting water-solubility-related properties of chemical molecules WO2023065220A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/125323 WO2023065220A1 (zh) 2021-10-21 2021-10-21 A deep-learning-based method for predicting water-solubility-related properties of chemical molecules

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/125323 WO2023065220A1 (zh) 2021-10-21 2021-10-21 A deep-learning-based method for predicting water-solubility-related properties of chemical molecules

Publications (1)

Publication Number Publication Date
WO2023065220A1 true WO2023065220A1 (zh) 2023-04-27

Family

ID=86058692

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/125323 WO2023065220A1 (zh) 2021-10-21 2021-10-21 A deep-learning-based method for predicting water-solubility-related properties of chemical molecules

Country Status (1)

Country Link
WO (1) WO2023065220A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116756881A (zh) * 2023-08-21 2023-09-15 人工智能与数字经济广东省实验室(广州) Bearing remaining useful life prediction method, apparatus, and storage medium
CN117351860A (zh) * 2023-12-04 2024-01-05 深圳市伟创高科电子有限公司 Digital-tube-based instrument display method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109741797A (zh) * 2018-12-10 2019-05-10 中国药科大学 Method for predicting the water-solubility grade of small-molecule compounds using deep learning techniques
US20200176087A1 (en) * 2018-12-03 2020-06-04 Battelle Memorial Institute Method for simultaneous characterization and expansion of reference libraries for small molecule identification
CN111640471A (zh) * 2020-05-27 2020-09-08 牛张明 Method and system for predicting the activity of small drug molecules based on a bidirectional long short-term memory model
CN111710375A (zh) * 2020-05-13 2020-09-25 中国科学院计算机网络信息中心 Molecular property prediction method and system
CN113241128A (zh) * 2021-04-29 2021-08-10 天津大学 Molecular property prediction method based on an attention neural network model with molecular spatial position encoding

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200176087A1 (en) * 2018-12-03 2020-06-04 Battelle Memorial Institute Method for simultaneous characterization and expansion of reference libraries for small molecule identification
CN109741797A (zh) * 2018-12-10 2019-05-10 中国药科大学 Method for predicting the water-solubility grade of small-molecule compounds using deep learning techniques
CN111710375A (zh) * 2020-05-13 2020-09-25 中国科学院计算机网络信息中心 Molecular property prediction method and system
CN111640471A (zh) * 2020-05-27 2020-09-08 牛张明 Method and system for predicting the activity of small drug molecules based on a bidirectional long short-term memory model
CN113241128A (zh) * 2021-04-29 2021-08-10 天津大学 Molecular property prediction method based on an attention neural network model with molecular spatial position encoding

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SANGHYUN WOO; JONGCHAN PARK; JOON-YOUNG LEE; IN SO KWEON: "CBAM: Convolutional Block Attention Module", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 17 July 2018 (2018-07-17), 201 Olin Library Cornell University Ithaca, NY 14853 , XP081113447 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116756881A (zh) * 2023-08-21 2023-09-15 人工智能与数字经济广东省实验室(广州) Bearing remaining useful life prediction method, apparatus, and storage medium
CN116756881B (zh) * 2023-08-21 2024-01-05 人工智能与数字经济广东省实验室(广州) Bearing remaining useful life prediction method, apparatus, and storage medium
CN117351860A (zh) * 2023-12-04 2024-01-05 深圳市伟创高科电子有限公司 Digital-tube-based instrument display method
CN117351860B (zh) * 2023-12-04 2024-02-13 深圳市伟创高科电子有限公司 Digital-tube-based instrument display method

Similar Documents

Publication Publication Date Title
US11120801B2 (en) Generating dialogue responses utilizing an independent context-dependent additive recurrent neural network
Mohankumar et al. Towards transparent and explainable attention models
US11436414B2 (en) Device and text representation method applied to sentence embedding
US10769532B2 (en) Network rating prediction engine
US20190303535A1 (en) Interpretable bio-medical link prediction using deep neural representation
Wang et al. Research on Healthy Anomaly Detection Model Based on Deep Learning from Multiple Time‐Series Physiological Signals
CN109766557B (zh) 一种情感分析方法、装置、存储介质及终端设备
US20090210218A1 (en) Deep Neural Networks and Methods for Using Same
WO2023065220A1 (zh) A deep-learning-based method for predicting water-solubility-related properties of chemical molecules
US20210303970A1 (en) Processing data using multiple neural networks
US20230075100A1 (en) Adversarial autoencoder architecture for methods of graph to sequence models
CN114830133A (zh) 利用多个正例的监督对比学习
US20150227849A1 (en) Method and System for Invariant Pattern Recognition
WO2021089012A1 (zh) 图网络模型的节点分类方法、装置及终端设备
US20240013059A1 (en) Extreme Language Model Compression with Optimal Sub-Words and Shared Projections
US11645500B2 (en) Method and system for enhancing training data and improving performance for neural network models
US20200327450A1 (en) Addressing a loss-metric mismatch with adaptive loss alignment
EP4120137A1 (en) System and method for molecular property prediction using edge conditioned identity mapping convolution neural network
WO2021181313A1 (en) Edge message passing neural network
US20230087667A1 (en) Canonicalization of data within open knowledge graphs
CN114093435A (zh) A deep-learning-based method for predicting water-solubility-related properties of chemical molecules
WO2021012263A1 (en) Systems and methods for end-to-end deep reinforcement learning based coreference resolution
Chatterjee et al. Class-biased sarcasm detection using BiLSTM variational autoencoder-based synthetic oversampling
Gultchin et al. Operationalizing complex causes: A pragmatic view of mediation
Haaralahti Utilization of local large language models for business applications

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21960970

Country of ref document: EP

Kind code of ref document: A1