CN111798935A - Universal compound structure-property correlation prediction method based on neural network - Google Patents
Universal compound structure-property correlation prediction method based on neural network
- Publication number
- CN111798935A (application CN201910280668.9A)
- Authority
- CN
- China
- Prior art keywords
- neural network
- prediction method
- molecular
- layer
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/30—Prediction of properties of chemical compounds, compositions or mixtures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/70—Machine learning, data mining or chemometrics
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computing Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Chemical & Material Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Biomedical Technology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Crystallography & Structural Chemistry (AREA)
- Molecular Biology (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a universal neural-network-based method for predicting compound structure-property correlations, comprising the following steps: step 1, transforming molecular descriptors into feature-vector form to construct a data set; step 2, dividing the data set into a training set and a test set, feeding the training set into a fully convolutional neural network for training, and determining the parameters of the fully convolutional model; step 3, transforming the molecular descriptors to be predicted into feature-vector form, and predicting with the fully convolutional model. The method achieves comparatively high accuracy in solubility prediction.
Description
Technical Field
The invention relates to a construction method of a prediction model, in particular to a universal compound structure-property correlation prediction method based on a neural network.
Background
Solubility is an essential property of compounds, particularly small-molecule compounds. Under the same conditions and in the same solvent, different compounds generally have different solubilities because of differences in their structure and spatial arrangement. Determining solubility plays an important role in chemical processing, process preparation, and the study of how chemical substances migrate in pharmaceuticals and in the environment.
However, because the number of compounds is very large and different solutions require different storage and measurement conditions, it is impractical to measure the solubility of every compound experimentally. There is therefore an urgent need for a universal, accurate, reliable and fast way to estimate solubility from existing data.
Such methods are collectively referred to as quantitative structure-property relationship prediction (hereinafter QSPR). QSPR is currently the most recent approach to calculating and predicting solubility: a fitted model is built from the quantitative relationship between computed molecular structure parameters (molecular descriptors) of a compound and a specific property (such as solubility), and the model is then used for prediction. QSPR research is generally divided into three steps:
(1) calculating a molecular descriptor;
(2) establishing a prediction model;
(3) analyzing the prediction accuracy.
Disclosure of Invention
The invention aims to provide a universal compound structure-property correlation prediction method based on a neural network, which has higher accuracy in the prediction of solubility.
In order to achieve the above purpose, the solution of the invention is:
A universal neural-network-based method for predicting compound structure-property correlations comprises the following steps:
step 1, transforming molecular descriptors into feature-vector form to construct a data set;
step 2, dividing the data set into a training set and a test set, feeding the training set into a fully convolutional neural network for training, and determining the parameters of the fully convolutional model;
step 3, transforming the molecular descriptors to be predicted into feature-vector form, and predicting with the fully convolutional model.
Step 1 is specifically as follows: the molecular descriptor consists of 2 parts, a molecular fingerprint computed from the molecular structural formula and a set of general descriptors, and the 2 parts are concatenated to form a feature vector.
In step 2, the architecture of the fully convolutional neural network is as follows: the first layer is an input layer, followed by several convolutional layers; finally, a fully convolutional network with a convolution kernel size of [1,1] serves as the classification network, and the mean value of each output feature map represents the category that the map stands for.
In step 2, a back propagation algorithm or a gradient descent algorithm is used for training on the data set.
In step 2, the training set and the test set are divided as follows: 90% of the data set is taken as the training set and 10% as the test set.
With this scheme, the descriptors are general in their design, calculation and combination: various description methods can be used interchangeably and combined. Because the prediction model adopts a deep-learning convolutional neural network, it can learn from and predict on inputs of variable length, which gives the proposed model broad generality.
The invention has the following characteristics:
(1) the most common molecular structure notation at present, SMILES, may be of unlimited length, so the method is universal;
(2) descriptor data can be freely added, provided the same descriptors are added for every entry of a given data set, so the method is universal;
(3) descriptor features are extracted automatically by the convolutional neural network and trained jointly with the labels, so the method is simple;
(4) the convolutional neural network is carefully designed and can reach very high accuracy within a very short training period, so it is highly practical;
(5) the method is applied to QSPR for the first time, and its accuracy reaches an internationally advanced level.
Drawings
FIG. 1 is a schematic diagram of the structure of building molecular descriptors;
FIG. 2 is a schematic diagram of a fully connected layer classifier;
FIG. 3 is a diagram of a similar molecule structure;
FIG. 4 is a schematic diagram of a full convolution neural network architecture;
FIG. 5 is a schematic diagram of a full convolution neural network structure with parameters.
Detailed Description
1. Molecular descriptors
Molecular descriptors are the basis of QSPR. A molecular descriptor is a property or measure of a molecule that can be expressed numerically in one or more respects so that a computer can interpret and analyze it. A descriptor can be a direct numerical representation of a physicochemical property of the molecule, or it can be computed from several data indices by a particular algorithm. The former includes physicochemical indices of the compound such as boiling point and melting point; the latter relates more to quantities such as the outer energy of the molecule and the distribution of outer electronic charge between bonds.
There are many ways to calculate molecular descriptors. At present, different software packages can together compute nearly six thousand physicochemical parameters covering the characteristic properties and structural features of compounds.
RDKit is free, open-source cheminformatics and machine learning software that provides C++ and Python APIs. It includes built-in methods for calculating molecular descriptors, and the calculation converts a molecular structural formula into a vector of 279 features.
As shown in FIG. 1, the molecular descriptor in the model consists of 2 parts, the molecular fingerprint computed from the molecular structural formula and the general descriptors; the two parts are then joined by a concatenation operation to form the feature vector that represents the molecule.
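The concatenation step can be illustrated with a minimal sketch. The function name `build_feature_vector` and the toy values are hypothetical; in practice the fingerprint bits and general descriptors would come from a descriptor toolkit such as RDKit, which is not called here.

```python
from typing import List, Sequence


def build_feature_vector(fingerprint: Sequence[int],
                         descriptors: Sequence[float]) -> List[float]:
    """Concatenate a binary molecular fingerprint with general descriptors."""
    if any(bit not in (0, 1) for bit in fingerprint):
        raise ValueError("fingerprint must contain only 0/1 bits")
    # Part 1: fingerprint bits; part 2: general descriptors.
    return [float(bit) for bit in fingerprint] + [float(d) for d in descriptors]


# Hypothetical toy values, not taken from any real molecule.
toy_fingerprint = [1, 0, 0, 1, 1, 0, 1, 0]
toy_descriptors = [46.07, -0.0014, 20.23]
feature_vector = build_feature_vector(toy_fingerprint, toy_descriptors)
print(len(feature_vector))
```

Keeping the two parts in a fixed order means every molecule in a data set maps to a vector with the same layout, which is what the later layers rely on.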
2. Full convolution neural network model
2.1 Fully connected neural networks
A typical neural network is generally built from basic modules such as convolutional layers, activation layers, and fully connected layers. The convolutional layer is responsible for feature extraction; the convolution kernel is its key component and essentially extracts features from signals at each level. The kernels together form a convolutional layer, which is connected through the neurons of its kernels to the preceding convolutional layer and to the following convolutional or fully connected layer. The features of the previous layer are convolved with learnable kernels and passed through an activation function to produce the output feature maps; each output map may combine the convolutions of several input feature maps.
$$x_j^{l} = F\Big(\sum_{i} x_i^{l-1} * k_{ij}^{l} + b_j^{l}\Big) \tag{1}$$
In formula (1), $x_i^{l-1}$ is the output of the convolution kernels of the previous convolutional layer; it is convolved with the kernel $k_{ij}^{l}$ of the current convolutional layer, the products are summed, and the bias $b_j^{l}$ is added. The function $F$ is generally called the activation function and is typically a tanh or ReLU function. The resulting structure is shown in FIG. 2.
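As a rough illustration of the layer computation just described (convolve each input feature map with its kernel, sum, add a bias, apply the activation), the following sketch works on one-dimensional feature maps. The function names and toy numbers are assumptions made for this example; the kernel is slid without flipping, as most deep learning frameworks do.

```python
import math
from typing import Callable, List, Sequence


def conv1d_valid(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Slide the kernel over the signal (no padding, stride 1, no flipping)."""
    width = len(kernel)
    return [sum(signal[p + q] * kernel[q] for q in range(width))
            for p in range(len(signal) - width + 1)]


def conv_layer_output(input_maps: Sequence[Sequence[float]],
                      kernels: Sequence[Sequence[float]],
                      bias: float,
                      activation: Callable[[float], float] = math.tanh) -> List[float]:
    """One output feature map: activation(sum_i conv(x_i, k_i) + bias)."""
    per_map = [conv1d_valid(x, k) for x, k in zip(input_maps, kernels)]
    summed = [sum(values) for values in zip(*per_map)]
    return [activation(v + bias) for v in summed]


def relu(v: float) -> float:
    return max(0.0, v)


# Two hypothetical input feature maps and one kernel per map.
maps = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]]
kernels = [[1.0, -1.0], [0.5, 0.5]]
out = conv_layer_output(maps, kernels, bias=1.0, activation=relu)
print(out)
```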
The fully connected layer classifies the data: the features extracted by the previous layer are flattened into a one-dimensional vector, which is connected to a number of fully connected units chosen from the designer's experience, and the result is computed by matrix multiplication. The final classification is computed with a softmax or sigmoid activation function, and the class is output according to the resulting probabilities.
In addition, to help train the neural network, downsampling layers and regularization layers are generally inserted between convolutional layers to process the activations, which improves the fit, reduces overfitting, and speeds up training.
2.2 full convolution neural network architecture
In a traditional neural network, classification is done by the final fully connected layer, which maps the features extracted by the convolutional layers to a specific label space. The advantage is that a softmax or sigmoid function can be applied directly after a matrix multiplication. However, this introduces a large number of redundant parameters and makes poor reuse of the data. Another significant drawback of the fully connected layer is that when the input is flattened into a one-dimensional vector, the structural relationships within the data are lost. Moreover, because a fully connected layer is computed as a matrix multiplication, every input vector must have the same dimension.
A conventional convolutional neural network obtains rotation invariance in image recognition through spatial pooling. In chemistry, however, a change in the position of a structure can change the properties of a molecule considerably: as shown in FIG. 3, molecules that differ only slightly in structure can have quite different properties.
This work uses a new neural network based entirely on convolution to classify molecular description feature vectors. Starting from convolutional feature extraction, it obtains the feature maps with the strongest response, aggregates the features of each map by global pooling, and takes the strongest resulting feature as the classification result. As shown in FIG. 4 and FIG. 5, the specific architecture is as follows: the first layer is an input layer that receives the molecular fingerprint representation; it is followed by several convolutional layers; finally, a fully convolutional network with a convolution kernel size of [1,1] serves as the classification network, and the mean value of each output feature map represents the category that the map stands for.
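The classification head described above can be sketched as a kernel-size-1 convolution followed by global average pooling. This is only an illustrative sketch under assumed shapes: a feature map is a list of positions, each holding one value per channel, and the weights below are hypothetical rather than trained. It also shows why the input length does not need to be fixed.

```python
from typing import List, Sequence, Tuple


def pointwise_conv(feature_map: Sequence[Sequence[float]],
                   weights: Sequence[Sequence[float]],
                   biases: Sequence[float]) -> List[List[float]]:
    """Kernel-size-1 convolution: the same linear map applied at every position."""
    return [[sum(w * v for w, v in zip(row, position)) + b
             for row, b in zip(weights, biases)]
            for position in feature_map]


def global_average_pool(class_maps: Sequence[Sequence[float]]) -> List[float]:
    """Average each class channel over all positions."""
    length = len(class_maps)
    return [sum(position[c] for position in class_maps) / length
            for c in range(len(class_maps[0]))]


def classify(feature_map, weights, biases) -> Tuple[int, List[float]]:
    """Return the index of the strongest class and the per-class mean scores."""
    scores = global_average_pool(pointwise_conv(feature_map, weights, biases))
    return max(range(len(scores)), key=scores.__getitem__), scores


# Hypothetical head with 3 classes over 2 input channels.
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
biases = [0.0, 0.0, 0.0]
short_input = [[2.0, 1.0], [4.0, 1.0]]                       # 2 positions
long_input = [[0.0, 3.0], [1.0, 3.0], [0.0, 2.0], [1.0, 4.0], [0.5, 3.0]]  # 5 positions
print(classify(short_input, weights, biases)[0])
print(classify(long_input, weights, biases)[0])
```

Because the averaging runs over however many positions are present, the same weights handle both the 2-position and the 5-position input, which a flattened fully connected layer could not do.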
2.3 training of full convolution neural networks
Training a convolutional neural network amounts to searching a hypothesized data space for a set of optimal parameters that minimizes the objective (loss) function. In theory the space is infinite and so is the number of candidate solutions, so an optimal set cannot be specified by hand.
Common neural network training methods are mainly the back propagation (BP) algorithm and the gradient descent (GD) algorithm, and the fully convolutional neural network here is trained the same way. After forward propagation of the input data, the loss function computes the error between the predicted values and the actual values; the error is propagated backwards layer by layer, the partial derivative of the error with respect to each convolution kernel value is computed, and the weights and biases are updated from these partial derivatives.
Here, conv denotes the convolution operation, k the parameters of the convolution kernel, and x the output of the previous convolution kernel; the current convolution kernel rotated by 180 degrees is used as the weight multiplier.
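For orientation, a common textbook form of these updates is sketched below. The notation is an assumption made for this sketch and is not taken from the original formulas: $u^{l}$ is the pre-activation of layer $l$, $\delta^{l}$ the error signal, $E$ the loss, $F$ the activation function and $\eta$ the learning rate. The exact padding and flipping conventions depend on how conv is defined.

```latex
% Forward pass of layer l (assumed notation)
u^{l} = \mathrm{conv}\big(x^{l-1}, k^{l}\big) + b^{l}, \qquad x^{l} = F\big(u^{l}\big)

% Error signal propagated from layer l back to layer l-1,
% using the kernel rotated by 180 degrees
\delta^{l} = \frac{\partial E}{\partial u^{l}}, \qquad
\delta^{l-1} = \mathrm{conv}\big(\delta^{l}, \mathrm{rot180}(k^{l})\big) \odot F'\big(u^{l-1}\big)

% Gradients and gradient-descent update with learning rate \eta
\frac{\partial E}{\partial k^{l}} = \mathrm{conv}\big(x^{l-1}, \delta^{l}\big), \qquad
\frac{\partial E}{\partial b^{l}} = \sum \delta^{l}, \qquad
k^{l} \leftarrow k^{l} - \eta \, \frac{\partial E}{\partial k^{l}}, \qquad
b^{l} \leftarrow b^{l} - \eta \, \frac{\partial E}{\partial b^{l}}
```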
2.4 Discussion of some details of the full convolution neural network
The neural network used here is a fully convolutional neural network. Current research identifies 3 main factors affecting a convolutional neural network: the number of convolutional layers, the number of convolution kernels, and the organization of the network. In practice, ResNet, from Microsoft Research, adds a layer's input to its residual output, which addresses the vanishing of gradients as the number of convolutional layers grows. Google's Inception and its subsequent versions design a network with a good local topology: several convolution or pooling operations are applied to the input image in parallel and their outputs are concatenated into a very deep feature map, which greatly increases the number of convolution kernels without greatly increasing the number of parameters.
The fully convolutional neural network proposed here changes the organization of the conventional neural network by replacing the final fully connected classification layer with convolutional layers, which is why it is called "fully convolutional".
3. QSPR model training based on full convolution neural network
3.1 data Structure and data conversion
SMILES (Simplified Molecular Input Line Entry System) is a specification for describing a molecular structure unambiguously with an ASCII character string. SMILES was developed by Arthur Weininger and David Weininger in the late 1980s and was later modified and extended by others, notably Daylight Chemical Information Systems Inc.
TABLE 1
Table 1 shows SMILES strings and the corresponding compound structures as rendered by software. Different SMILES strings correspond to different molecular structures, and the corresponding descriptors can be computed from a SMILES string by software (here RDKit), as shown in Table 2.
TABLE 2
For a single SMILES string, the generated molecular descriptor is an array of shape [1,200], which stands in for the molecule and serves as the input data of the model. Solubility itself is a continuous numerical quantity, so it is manually divided into classes and one-hot encoded; for simplicity, 10 classes are used here.
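The labeling step can be sketched as follows. The text does not say how the 10 classes are delimited, so equal-width bins over a fixed range are an assumption of this example, as are the function name and the range of -10 to 0.

```python
from typing import List


def solubility_to_one_hot(value: float, low: float, high: float,
                          num_classes: int = 10) -> List[int]:
    """Map a continuous solubility value to a one-hot class vector.

    Assumes equal-width bins over [low, high]; out-of-range values are clipped.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    clipped = min(max(value, low), high)
    index = int((clipped - low) / (high - low) * num_classes)
    index = min(index, num_classes - 1)  # the upper bound falls in the last bin
    one_hot = [0] * num_classes
    one_hot[index] = 1
    return one_hot


# Hypothetical log-solubility value and range.
label = solubility_to_one_hot(-3.2, low=-10.0, high=0.0)
print(label.index(1))
```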
3.2 specific design and parameters of the full convolution model
To address weak features during training, the network uses multiple dense connections: during feature extraction by the convolutional layers, a direct connection is established between any two layers, so that the input of each layer is the union of the outputs of all preceding layers. All feature information extracted by a layer is likewise passed on to the following layers, up to the final global convolutional layer. The global convolutional layer maps the extracted features spatially, mapping the most significant features onto different spatial levels to determine the corresponding category. After sufficient extraction and mapping, each input value is finally assigned to a specific target interval.
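The dense connection pattern, in which each layer receives the outputs of all earlier layers, can be sketched in a few lines. The toy layer below is a hypothetical stand-in for a convolutional layer and is used only to make the data flow visible.

```python
from typing import Callable, List, Sequence

Layer = Callable[[List[float]], List[float]]


def dense_block(inputs: Sequence[float], layers: Sequence[Layer]) -> List[float]:
    """Run layers so that each one sees the concatenation of all earlier outputs."""
    collected = [list(inputs)]
    for layer in layers:
        concatenated = [v for features in collected for v in features]
        collected.append(layer(concatenated))
    # The block output is again the concatenation of everything produced so far.
    return [v for features in collected for v in features]


def make_toy_layer(scale: float) -> Layer:
    """Hypothetical stand-in for a convolutional layer: emits one scaled sum."""
    return lambda features: [scale * sum(features)]


block_output = dense_block([1.0, 2.0], [make_toy_layer(1.0), make_toy_layer(2.0)])
print(block_output)
```

In this toy run the second layer receives the original input together with the first layer's output, which is the property the dense connections are meant to provide.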
3.3 Full convolution model QSPR experimental results and analysis
3.3.1 Experimental data set
There are 3 data sets used herein, in order:
1) Abraham octanol solubility dataset
2) Delaney water solubility dataset
3) Tox21 toxicity data set
Abraham and Delaney have 283 and 1144 records, respectively; each record gives the structure as a SMILES string and the solubility as a calculated logarithmic value.
Tox21 comes from the Tox21 program of the National Institutes of Health Chemical Genomics Center (NCGC) in Rockville, Maryland, USA, from which 12 groups of data were selected (nr-ahr, nr-ar, nr-ar-lbd, nr-aromatase, nr-er, nr-er-lbd, nr-ppar-gamma, sr-are, sr-atad5, sr-hse, sr-mmp, sr-p53). Each group contains approximately 8000 records.
All data sets were divided into a training set and a test set, containing 90% and 10% of the records, respectively.
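A 90/10 division of this kind can be sketched as below. The text does not state whether records were shuffled before splitting, so the shuffle and the fixed seed are assumptions of this example, and `split_dataset` is a hypothetical helper name.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_dataset(records: Sequence[T], train_fraction: float = 0.9,
                  seed: int = 0) -> Tuple[List[T], List[T]]:
    """Shuffle the records reproducibly and split them into train and test parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Record indices stand in for the 1144 Delaney entries.
train_set, test_set = split_dataset(range(1144))
print(len(train_set), len(test_set))
```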
3.3.2 Experimental results and analysis
The fully convolutional neural network model was implemented with the TensorFlow library as the basic framework. The main hardware consisted of two NVIDIA 1070 graphics cards used as graphics processors (GPUs). The batch size was set to 50, the number of training iterations was not limited, the learning rate was 0.0001, and the optimizer was the stochastic gradient descent optimizer (GradientDescentOptimizer).
In each training iteration, the parameters of all convolutional layers take part in the computation and are updated; all parameters are convolution filter parameters. The model computes the accuracy on the training set and the test set at the same time, and training stops when the accuracy on the training set exceeds 0.9999. The validation results are shown in the following table:
TABLE 3
Dataset | SVM | Logistic regression | Full convolution model |
---|---|---|---|
Abraham | 0.38 | 0.17 | 0.79+ |
Tox21 | 0.76 | 0.21 | 0.9999+ |
Delaney | 0.51 | 0.15 | 0.92+ |
As Table 3 shows, the fully convolutional model proposed here achieves good accuracy on every data set. On the Tox21 data set in particular, the average over the 12 validation results essentially reaches 100% accuracy, which is very important for toxicity screening. The results on the Abraham and Delaney data sets are less favorable than on Tox21, most likely because those data sets are too small to cover all the required training points.
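The stopping rule used in these experiments (train without a fixed iteration count until training accuracy exceeds 0.9999, while tracking test accuracy) can be sketched as a small loop. The callables `train_step` and `evaluate` are placeholders for the real model code, and `max_steps` is a hypothetical safety cap that the original setup does not use.

```python
from typing import Callable, Optional, Tuple


def train_until_threshold(train_step: Callable[[], None],
                          evaluate: Callable[[], Tuple[float, float]],
                          threshold: float = 0.9999,
                          max_steps: Optional[int] = None) -> Tuple[int, float, float]:
    """Repeat training steps until training accuracy exceeds the threshold."""
    steps = 0
    train_acc, test_acc = evaluate()
    while train_acc <= threshold and (max_steps is None or steps < max_steps):
        train_step()          # one parameter update on a mini-batch
        steps += 1
        train_acc, test_acc = evaluate()
    return steps, train_acc, test_acc


# Toy stand-in for a model: every step halves the remaining training error.
state = {"train_acc": 0.5}


def toy_train_step() -> None:
    state["train_acc"] = 1.0 - (1.0 - state["train_acc"]) * 0.5


def toy_evaluate() -> Tuple[float, float]:
    return state["train_acc"], state["train_acc"] * 0.9


steps, final_train_acc, final_test_acc = train_until_threshold(toy_train_step, toy_evaluate)
print(steps, final_train_acc)
```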
4. Conclusion
Experiments were carried out on different molecular activity data sets using a fully convolutional neural network. They show that, compared with traditional machine learning tools, the fully convolutional neural network achieves the best accuracy on small data sets without a noticeable reduction in training speed.
The above embodiments are only for illustrating the technical idea of the present invention, and the protection scope of the present invention is not limited thereby, and any modifications made on the basis of the technical scheme according to the technical idea of the present invention fall within the protection scope of the present invention.
Claims (5)
1. A universal compound structure-property correlation prediction method based on a neural network is characterized by comprising the following steps:
step 1, transforming a molecular descriptor into a characteristic vector form to construct a data set;
step 2, dividing the data set into a training set and a testing set, sending the training set into a fully convolutional neural network for training, and determining parameters of a fully convolutional model;
step 3, transforming the molecular descriptors to be predicted into a characteristic vector form, and predicting according to the fully convolutional model.
2. The prediction method of claim 1, wherein: the specific content of the step 1 is as follows: the molecular descriptor consists of 2 parts, namely the molecular fingerprint calculated from the molecular structural formula and the general descriptor, and the 2 parts are connected to form a characteristic vector.
3. The prediction method of claim 1, wherein: in step 2, the architecture of the fully convolutional neural network is as follows: the first layer is an input layer, followed by a plurality of convolution layers; finally, a full convolution network with the convolution kernel size of [1,1] is used as a classification network, and the mean value of each layer is used for representing the category represented by the layer.
4. The prediction method of claim 1, wherein: in the step 2, a back propagation algorithm or a gradient descent algorithm is adopted to train the data set.
5. The prediction method of claim 1, wherein: in step 2, the training set and the test set are divided in the following manner: 90% of the data set was taken as the training set and 10% of the data set was taken as the test set.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910280668.9A CN111798935A (en) | 2019-04-09 | 2019-04-09 | Universal compound structure-property correlation prediction method based on neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910280668.9A CN111798935A (en) | 2019-04-09 | 2019-04-09 | Universal compound structure-property correlation prediction method based on neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111798935A (en) | 2020-10-20 |
Family
ID=72805083
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910280668.9A Pending CN111798935A (en) | 2019-04-09 | 2019-04-09 | Universal compound structure-property correlation prediction method based on neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111798935A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112634993A (en) * | 2020-12-30 | 2021-04-09 | 中国科学院生态环境研究中心 | Prediction model and screening method for activation activity of estrogen receptor of chemicals |
CN113362905A (en) * | 2021-06-08 | 2021-09-07 | 浙江大学 | Asymmetric catalytic reaction enantioselectivity prediction method based on deep learning |
CN113380346A (en) * | 2021-06-08 | 2021-09-10 | 河南大学 | Coupling reaction yield intelligent prediction method based on attention convolution neural network |
CN113380337A (en) * | 2021-06-08 | 2021-09-10 | 浙江大学 | Organic fluorescent small molecule optical property prediction method based on deep neural network |
CN113674807A (en) * | 2021-08-10 | 2021-11-19 | 南京工业大学 | Molecular screening method based on deep learning technology qualitative and quantitative model |
CN113903409A (en) * | 2021-12-08 | 2022-01-07 | 北京晶泰科技有限公司 | Molecular data processing method, model construction and prediction method and related device |
CN115062181A (en) * | 2022-08-16 | 2022-09-16 | 苏州创腾软件有限公司 | Polymer glass transition temperature prediction method based on convolutional neural network |
Citations (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030028330A1 (en) * | 2001-07-13 | 2003-02-06 | Ailan Cheng | System and method for aqueous solubility prediction |
CN101329699A (en) * | 2008-07-31 | 2008-12-24 | 四川大学 | Method for predicting medicament molecule pharmacokinetic property and toxicity based on supporting vector machine |
CN101339180A (en) * | 2008-08-14 | 2009-01-07 | 南京工业大学 | Organic compound explosive characteristic prediction method based on support vector machine |
CN101477597A (en) * | 2009-01-15 | 2009-07-08 | 浙江大学 | Natural product active ingredient computation and recognition method based compound characteristic |
KR20120085153A (en) * | 2011-10-05 | 2012-07-31 | 주식회사 켐에쎈 | Multiple linear regression-artificial neural network hybrid model predicting lower flammability limit volume percent of organic compound |
KR20120085148A (en) * | 2011-10-05 | 2012-07-31 | 주식회사 켐에쎈 | Multiple linear regression-artificial neural network model predicting standard state enthalpy of formation of pure organic compound |
KR20120085147A (en) * | 2011-10-05 | 2012-07-31 | 주식회사 켐에쎈 | Multiple linear regression-artificial neural network hybrid model predicting solubility index of organic compound |
KR20120085144A (en) * | 2011-10-05 | 2012-07-31 | 주식회사 켐에쎈 | Multiple linear regression-artificial neural network hybrid model predicting water solubility of pure organic compound |
WO2012177108A2 (en) * | 2011-10-04 | 2012-12-27 | 주식회사 켐에쎈 | Model, method and system for predicting, processing and servicing online physicochemical and thermodynamic properties of pure compound |
CN102980972A (en) * | 2012-11-06 | 2013-03-20 | 南京工业大学 | Method for determining hot dangerousness of self-reactive chemical substance |
CN103235901A (en) * | 2013-05-07 | 2013-08-07 | 黑龙江中医药大学 | Method for determining phase transition boundary of microemulsion drug carrier of tanshinone IIA |
CN104376221A (en) * | 2014-11-21 | 2015-02-25 | 环境保护部南京环境科学研究所 | Method for predicating skin permeability coefficients of organic chemicals |
US20150134315A1 (en) * | 2013-09-27 | 2015-05-14 | Codexis, Inc. | Structure based predictive modeling |
CN106777986A (en) * | 2016-12-19 | 2017-05-31 | 南京邮电大学 | Ligand molecular fingerprint generation method based on depth Hash in drug screening |
CN106777922A (en) * | 2016-11-30 | 2017-05-31 | 华东理工大学 | A kind of CTA hydrofinishings production process agent model modeling method |
US20170161635A1 (en) * | 2015-12-02 | 2017-06-08 | Preferred Networks, Inc. | Generative machine learning systems for drug design |
CN106874688A (en) * | 2017-03-01 | 2017-06-20 | 中国药科大学 | Intelligent lead compound based on convolutional neural networks finds method |
CN107239803A (en) * | 2017-07-21 | 2017-10-10 | 国家海洋局第海洋研究所 | Utilize the sediment automatic classification method of deep learning neutral net |
CN107563496A (en) * | 2017-08-07 | 2018-01-09 | 北京工业大学 | A kind of deep learning mode identification method of vectorial core convolutional neural networks |
US20180012124A1 (en) * | 2016-07-05 | 2018-01-11 | International Business Machines Corporation | Neural network for chemical compounds |
CN107871054A (en) * | 2016-09-23 | 2018-04-03 | 中国石油天然气股份有限公司 | Oil refining atmospheric and vacuum distillation unit nertralizer Selection Method and nertralizer composition based on AHP Field Using Fuzzy Comprehensive Assessments |
CN107886491A (en) * | 2017-11-27 | 2018-04-06 | 深圳市唯特视科技有限公司 | A kind of image combining method based on pixel arest neighbors |
US10146914B1 (en) * | 2018-03-01 | 2018-12-04 | Recursion Pharmaceuticals, Inc. | Systems and methods for evaluating whether perturbations discriminate an on target effect |
CN108985001A (en) * | 2017-06-05 | 2018-12-11 | 欧阳德方 | A kind of pharmaceutical preparation prediction technique |
CN109033738A (en) * | 2018-07-09 | 2018-12-18 | 湖南大学 | A kind of pharmaceutical activity prediction technique based on deep learning |
CN109376798A (en) * | 2018-11-23 | 2019-02-22 | 东南大学 | A kind of classification method based on convolutional neural networks titanium dioxide lattice phase |
CN109461475A (en) * | 2018-10-26 | 2019-03-12 | 中国科学技术大学 | Molecular attribute prediction method based on artificial neural network |
US20190080057A1 (en) * | 2017-09-12 | 2019-03-14 | Michael Stanley | Toxicity or adverse effect of a substance predicting automated system and method of training thereof |
WO2019050902A1 (en) * | 2017-09-05 | 2019-03-14 | Adaptive Phage Therapeutics, Inc. | Methods to determine the sensitivity profile of a bacterial strain to a therapeutic composition |
-
2019
- 2019-04-09 CN CN201910280668.9A patent/CN111798935A/en active Pending
Patent Citations (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030028330A1 (en) * | 2001-07-13 | 2003-02-06 | Ailan Cheng | System and method for aqueous solubility prediction |
CN101329699A (en) * | 2008-07-31 | 2008-12-24 | 四川大学 | Method for predicting pharmacokinetic properties and toxicity of drug molecules based on a support vector machine |
CN101339180A (en) * | 2008-08-14 | 2009-01-07 | 南京工业大学 | Organic compound explosive characteristic prediction method based on support vector machine |
CN101477597A (en) * | 2009-01-15 | 2009-07-08 | 浙江大学 | Computational identification method for active ingredients of natural products based on compound characteristics |
WO2012177108A2 (en) * | 2011-10-04 | 2012-12-27 | 주식회사 켐에쎈 | Model, method and system for predicting, processing and servicing online physicochemical and thermodynamic properties of pure compound |
KR20120085153A (en) * | 2011-10-05 | 2012-07-31 | 주식회사 켐에쎈 | Multiple linear regression-artificial neural network hybrid model predicting lower flammability limit volume percent of organic compound |
KR20120085147A (en) * | 2011-10-05 | 2012-07-31 | 주식회사 켐에쎈 | Multiple linear regression-artificial neural network hybrid model predicting solubility index of organic compound |
KR20120085144A (en) * | 2011-10-05 | 2012-07-31 | 주식회사 켐에쎈 | Multiple linear regression-artificial neural network hybrid model predicting water solubility of pure organic compound |
KR20120085148A (en) * | 2011-10-05 | 2012-07-31 | 주식회사 켐에쎈 | Multiple linear regression-artificial neural network model predicting standard state enthalpy of formation of pure organic compound |
CN102980972A (en) * | 2012-11-06 | 2013-03-20 | 南京工业大学 | Method for determining the thermal hazard of self-reactive chemical substances |
CN103235901A (en) * | 2013-05-07 | 2013-08-07 | 黑龙江中医药大学 | Method for determining phase transition boundary of microemulsion drug carrier of tanshinone IIA |
US20150134315A1 (en) * | 2013-09-27 | 2015-05-14 | Codexis, Inc. | Structure based predictive modeling |
CN104376221A (en) * | 2014-11-21 | 2015-02-25 | 环境保护部南京环境科学研究所 | Method for predicting skin permeability coefficients of organic chemicals |
US20170161635A1 (en) * | 2015-12-02 | 2017-06-08 | Preferred Networks, Inc. | Generative machine learning systems for drug design |
US20180012124A1 (en) * | 2016-07-05 | 2018-01-11 | International Business Machines Corporation | Neural network for chemical compounds |
CN107871054A (en) * | 2016-09-23 | 2018-04-03 | 中国石油天然气股份有限公司 | Neutralizer selection method for oil refining atmospheric and vacuum distillation units, and neutralizer composition, based on the AHP fuzzy comprehensive evaluation method |
CN106777922A (en) * | 2016-11-30 | 2017-05-31 | 华东理工大学 | A surrogate model modeling method for the CTA hydrorefining production process |
CN106777986A (en) * | 2016-12-19 | 2017-05-31 | 南京邮电大学 | Ligand molecular fingerprint generation method based on deep hashing in drug screening |
CN106874688A (en) * | 2017-03-01 | 2017-06-20 | 中国药科大学 | Intelligent lead compound discovery method based on convolutional neural networks |
CN108985001A (en) * | 2017-06-05 | 2018-12-11 | 欧阳德方 | A pharmaceutical preparation prediction method |
CN107239803A (en) * | 2017-07-21 | 2017-10-10 | 国家海洋局第海洋研究所 | Automatic sediment classification method using a deep learning neural network |
CN107563496A (en) * | 2017-08-07 | 2018-01-09 | 北京工业大学 | A deep learning pattern recognition method based on a vector-kernel convolutional neural network |
WO2019050902A1 (en) * | 2017-09-05 | 2019-03-14 | Adaptive Phage Therapeutics, Inc. | Methods to determine the sensitivity profile of a bacterial strain to a therapeutic composition |
US20190080057A1 (en) * | 2017-09-12 | 2019-03-14 | Michael Stanley | Toxicity or adverse effect of a substance predicting automated system and method of training thereof |
CN107886491A (en) * | 2017-11-27 | 2018-04-06 | 深圳市唯特视科技有限公司 | An image synthesis method based on pixel nearest neighbors |
US10146914B1 (en) * | 2018-03-01 | 2018-12-04 | Recursion Pharmaceuticals, Inc. | Systems and methods for evaluating whether perturbations discriminate an on target effect |
CN109033738A (en) * | 2018-07-09 | 2018-12-18 | 湖南大学 | A pharmaceutical activity prediction method based on deep learning |
CN109461475A (en) * | 2018-10-26 | 2019-03-12 | 中国科学技术大学 | Molecular attribute prediction method based on artificial neural network |
CN109376798A (en) * | 2018-11-23 | 2019-02-22 | 东南大学 | A classification method for titanium dioxide lattice phases based on convolutional neural networks |
Non-Patent Citations (1)
Title |
---|
DRUGAL: "基于神经网络的溶解度预测" [Solubility prediction based on neural networks], Retrieved from the Internet <URL:https://blog.csdn.net/u012325865/article/details/82725777> * |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112634993A (en) * | 2020-12-30 | 2021-04-09 | 中国科学院生态环境研究中心 | Prediction model and screening method for activation activity of estrogen receptor of chemicals |
CN113362905A (en) * | 2021-06-08 | 2021-09-07 | 浙江大学 | Asymmetric catalytic reaction enantioselectivity prediction method based on deep learning |
CN113380346A (en) * | 2021-06-08 | 2021-09-10 | 河南大学 | Coupling reaction yield intelligent prediction method based on attention convolution neural network |
CN113380337A (en) * | 2021-06-08 | 2021-09-10 | 浙江大学 | Organic fluorescent small molecule optical property prediction method based on deep neural network |
CN113362905B (en) * | 2021-06-08 | 2022-07-08 | 浙江大学 | Asymmetric catalytic reaction enantioselectivity prediction method based on deep learning |
CN113674807A (en) * | 2021-08-10 | 2021-11-19 | 南京工业大学 | Molecular screening method based on qualitative and quantitative deep learning models |
CN113903409A (en) * | 2021-12-08 | 2022-01-07 | 北京晶泰科技有限公司 | Molecular data processing method, model construction and prediction method and related device |
CN113903409B (en) * | 2021-12-08 | 2023-07-07 | 北京晶泰科技有限公司 | Molecular data processing method, model construction and prediction method and related devices |
CN115062181A (en) * | 2022-08-16 | 2022-09-16 | 苏州创腾软件有限公司 | Polymer glass transition temperature prediction method based on convolutional neural network |
Similar Documents
Publication | Title |
---|---|
CN111798935A (en) | Universal compound structure-property correlation prediction method based on neural network |
CN113707235B (en) | Small-molecule drug property prediction method, device and equipment based on self-supervised learning |
WO2022083624A1 (en) | Model acquisition method and device |
CN110532417B (en) | Image retrieval method and device based on deep hashing, and terminal equipment |
CN110163258A (en) | A zero-shot learning method and system based on a semantic attribute attention reassignment mechanism |
Guo et al. | A centroid-based gene selection method for microarray data classification |
CN107577605A (en) | A feature clustering selection method for software defect prediction |
CN112199532B (en) | Zero-shot image retrieval method and device based on hash coding and a graph attention mechanism |
EP4322031A1 (en) | Recommendation method, recommendation model training method, and related product |
EP4273754A1 (en) | Neural network training method and related device |
Feng et al. | Dual-graph convolutional network based on band attention and sparse constraint for hyperspectral band selection |
Günen et al. | Analyzing the contribution of training algorithms on deep neural networks for hyperspectral image classification |
CN116417093A (en) | Drug-target interaction prediction method combining Transformer and graph neural network |
Kong et al. | Deep PLS: A lightweight deep learning model for interpretable and efficient data analytics |
Liao et al. | Class-wise graph embedding-based active learning for hyperspectral image classification |
CN110348287A (en) | An unsupervised feature selection method and device based on a dictionary and a sample similarity graph |
Cui et al. | Dual-triple attention network for hyperspectral image classification using limited training samples |
CN116580848A (en) | Method for analyzing cancer multi-omics data based on a multi-head attention mechanism |
CN107480441B (en) | Modeling method and system for pediatric septic shock prognosis prediction |
CN116798652A (en) | Anticancer drug response prediction method based on multi-task learning |
CN112086144A (en) | Molecule generation method, molecule generation device, electronic device, and storage medium |
CN113257357B (en) | Protein residue contact map prediction method |
Kothari et al. | Potato leaf disease detection using deep learning |
CN113724195B (en) | Quantitative analysis model for proteins based on immunofluorescence images, and method for establishing it |
CN115861902A (en) | Unsupervised action migration and discovery methods, systems, devices, and media |
Legal Events
Code | Title |
---|---|
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |